Experienced Data Labelers Can Make Up for Dataset Variances in Agritech

What does a head of broccoli look like in a photo? Look closely, and you’ll probably be able to pick it out. Now, pick out that same head of broccoli when the photo is taken from a different angle. And again when a cloud has passed overhead and the light falls differently. Is it always easy to find?

In agricultural automation, this is what a machine has to do over and over again: pick out specific information from visual data. And the task may not be as simple as picking out a head of broccoli.

A machine might need to pluck fruit, spray pesticides, pollinate flowers or detect weeds and other unwanted crops. For a machine to harvest or perform crop control, its algorithm needs accurate training data sets that represent the variations it will face in the real world. Such data sets can only be obtained by capturing the right data and labeling it accurately.

Machine learning for agriculture requires models that let machines interpret the various situations they encounter and react accordingly. This means that, besides the crop itself, other things in the field (fruits, leaves, farming equipment, pipes) need to be annotated and fed to the algorithm.
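To make this concrete, here is a minimal sketch of what one annotation record might look like. The label names mirror the object types mentioned above, but the schema itself (the `Annotation` class, the `LABELS` set and the bounding-box layout) is a hypothetical illustration, not a format prescribed by any particular labeling tool.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Tuple

# Hypothetical label set: the object types come from the article,
# the names and the schema are assumptions for illustration.
LABELS = {"crop", "fruit", "leaf", "equipment", "pipe"}


@dataclass
class Annotation:
    label: str
    # Bounding box in pixels: (x_min, y_min, x_max, y_max)
    box: Tuple[int, int, int, int]

    def __post_init__(self):
        # Reject labels outside the agreed set and boxes with no area.
        if self.label not in LABELS:
            raise ValueError(f"unknown label: {self.label}")
        x_min, y_min, x_max, y_max = self.box
        if x_min >= x_max or y_min >= y_max:
            raise ValueError(f"degenerate box: {self.box}")


def label_counts(annotations):
    """Count how many annotations exist per label."""
    return Counter(a.label for a in annotations)
```

Counting annotations per label is one quick way to spot object types that are barely represented in a data set.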

Easier said than done

This process may seem straightforward. After all, it’s simply a matter of taking images and labeling them (or working with a data labeling partner to get them labeled). In theory, it’s simple. In practice, it rarely is.

The main problem is that agritech training data can vary immensely in image quality. A field full of crops is often cluttered, and the subject is often partly hidden. Even when the subject is visible, how does it look within its surroundings, or when those surroundings change?

In the real world, there are shadows. Leaves move in the wind. Daylight at a specific time today will not look the same as daylight at that exact time tomorrow. Clouds move. Even when humans intervene, nature has its own way of being.

A common misconception

Very frequently, companies don’t realize that they need to take this variance into account. Even when they do, they may not have the experience required to consider variances that are location- or climate-dependent. Experts from one region may not be familiar with agriculture in another region of the world.

Even if imaging disruptions have been minimized during image capture (perhaps shadows and angle issues have been eliminated by taking images at night with strobe lights, or by having drones take the images at the same height), would the computer vision model still work on images taken in a different location? Say, where the ground is a different colour or where the humidity is higher?

Would it still work at a different farm, where the same crop may be planted in a different arrangement? Perhaps with fewer detectable straight lines or with more overlaps?
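One rough way to probe questions like these is to simulate lighting and colour shifts on existing images and check whether the model's predictions hold up. The sketch below is an illustration only: the function names and the brightness and tint ranges are assumptions, not measured field values, and it uses plain Python lists of RGB tuples rather than any specific imaging library.

```python
import random


def adjust_pixel(rgb, brightness, tint):
    """Scale an (R, G, B) pixel by a brightness factor and a per-channel tint,
    clamping each channel to the 0-255 range."""
    return tuple(
        max(0, min(255, round(c * brightness * t)))
        for c, t in zip(rgb, tint)
    )


def vary_lighting(image, rng):
    """Return a copy of `image` (rows of RGB tuples) under a random lighting change.

    The ranges are illustrative assumptions, not calibrated values.
    """
    # Overall brightness, e.g. cloud cover versus direct sun.
    brightness = rng.uniform(0.6, 1.4)
    # Slight colour cast, e.g. different soil colour or time of day.
    tint = tuple(rng.uniform(0.9, 1.1) for _ in range(3))
    return [[adjust_pixel(px, brightness, tint) for px in row] for row in image]
```

Calling `vary_lighting(image, random.Random(0))` with a seeded generator gives a repeatable variation, which helps when comparing model output before and after the change. Simulated shifts are no substitute for real images from other farms and climates, but they can flag a brittle model early.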

Minimizing potential losses

For an agritech company that uses machine learning in the services it provides to farmers, the inaccuracies that result from these variances can make or break its business.

At the end of the day, farmers are looking for services that will help them increase their profits. Inaccurate machines lead to losses for farmers, both in profits and in business opportunities.

Agritech companies that want to provide top-notch services to their clients must ensure that their algorithms can account for all the common variances in the training data.

Where experience counts

This is where specialized data labelers come in. Data labeling companies with significant experience in an industry, such as agriculture, will be aware of the challenges that industry presents and will ask the right questions to ensure you’re getting the data you need. For agritech data, those challenges are subject variance, occlusion rates, the need for colour correction and cluttered environments.
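As an example of the kind of question an experienced labeler might ask, consider occlusion rate: how much of each subject is hidden behind leaves, stems or equipment. A minimal sketch, assuming each labeled object records both its estimated full area and its visible area in pixels (the function names and this two-area convention are assumptions for illustration):

```python
def occlusion_rate(full_area, visible_area):
    """Fraction of an object that is hidden from view (0.0 = fully visible)."""
    if full_area <= 0:
        raise ValueError("full_area must be positive")
    if not 0 <= visible_area <= full_area:
        raise ValueError("visible_area must be between 0 and full_area")
    return 1 - visible_area / full_area


def mean_occlusion(objects):
    """Average occlusion rate over (full_area, visible_area) pairs."""
    rates = [occlusion_rate(full, visible) for full, visible in objects]
    if not rates:
        raise ValueError("no objects given")
    return sum(rates) / len(rates)
```

For instance, an object with an estimated full area of 200 pixels of which 150 are visible has an occlusion rate of 0.25. A data set whose average rate is far lower than what the machine will see in the field is a sign that the training data is not representative.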

Data labelers who have been selected for, and have gained experience in, this specific field can annotate agricultural images more accurately, producing more accurate data for training the algorithm.

This is why it’s so important for agritech companies to work with the best data labelers, either by hiring them or by partnering with specialized data labeling companies.

Need a data labeling partner with experience in the variances that come with agritech training data sets? Get in touch!

