What is Image Annotation?

The performance of Artificial Intelligence is heavily reliant on the accuracy of its training data. 

Image annotation is a key technique for creating training data for computer vision. For machines to perceive objects in their surroundings, annotated images are needed to train Machine Learning algorithms to see the world as we do.

Annotation in Machine Learning is essentially the process of labelling data, whether images, text or video. The labels are usually predetermined by a machine learning engineer or computer vision scientist, and are chosen to give the computer vision model information about the objects depicted in an image.

The algorithm then uses the annotated data to learn, so that it can recognise similar patterns when presented with new data.

Depending on the nature of the project, different industries need different forms of annotation.

Types of image annotation

Bounding Box

Example of bounding boxes used in retail AI technology to check the state of the shelves

The simplest and most commonly used type of image annotation is the bounding box. This form of annotation requires labellers to draw a box as close as possible to the edges of key objects within the image. 2D bounding boxes are often used for object classification, localisation and detection in industries such as retail, ecommerce and healthcare.
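As a rough illustration of what "as close as possible to the edges" means in practice, the sketch below stores a box as pixel coordinates and compares a labeller's box against a reference box using intersection over union (IoU), a common measure of box overlap. The `BoundingBox` class, the coordinate layout and the example values are assumptions made for illustration, not the format of any particular annotation tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned 2D box in pixel coordinates (hypothetical format)."""
    label: str
    x_min: float
    y_min: float
    x_max: float
    y_max: float

    def area(self) -> float:
        width = max(0.0, self.x_max - self.x_min)
        height = max(0.0, self.y_max - self.y_min)
        return width * height


def iou(a: BoundingBox, b: BoundingBox) -> float:
    """Intersection over union: 1.0 for identical boxes, 0.0 for disjoint ones."""
    inter_w = min(a.x_max, b.x_max) - max(a.x_min, b.x_min)
    inter_h = min(a.y_max, b.y_max) - max(a.y_min, b.y_min)
    if inter_w <= 0 or inter_h <= 0:
        return 0.0
    intersection = inter_w * inter_h
    union = a.area() + b.area() - intersection
    return intersection / union


# Made-up example: a tight reference box and a looser box around the same item.
reference = BoundingBox("cereal_box", 10, 20, 110, 220)
loose = BoundingBox("cereal_box", 0, 10, 120, 230)

print(f"IoU of loose box vs reference: {iou(reference, loose):.2f}")
```

The looser the box, the more background it contains and the lower its IoU against the tight reference.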

Polygon annotation

Example of Polygon Annotation used in Agritech

Polygon annotation is important because not every object fits precisely in a bounding box. It is usually used for more precise annotation of irregularly shaped items, for example non-symmetrical objects in aerial images such as fruits, trees, landmarks or houses. Polygon annotation usually requires a high level of precision from the labeller.
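To make the difference from a bounding box concrete, here is a minimal sketch that computes the area of a polygon with the shoelace formula and compares it with the area of the smallest axis-aligned box around the same vertices. The triangular outline and the function names are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def polygon_area(vertices: List[Point]) -> float:
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]  # wrap around to close the polygon
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def enclosing_box_area(vertices: List[Point]) -> float:
    """Area of the smallest axis-aligned box containing every vertex."""
    xs = [x for x, _ in vertices]
    ys = [y for _, y in vertices]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


# Hypothetical outline of an irregular object, in pixel coordinates.
outline = [(0.0, 0.0), (40.0, 0.0), (20.0, 30.0)]

object_area = polygon_area(outline)
box_area = enclosing_box_area(outline)
print(f"Share of the box that is actually object: {object_area / box_area:.0%}")
```

For this outline only half of the enclosing box belongs to the object; the rest is background that a bounding box would include and a polygon leaves out.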

Line annotation

Line annotation, as the name suggests, involves annotating lines and splines, which are used to draw boundaries within a region of an image. It is primarily used when the section that needs to be delineated is too small or thin to be captured by a bounding box. Line annotation is commonly used to label data for autonomous vehicles.

The lines are used to train vehicle perception models for lane detection. Unlike a bounding box, a line avoids white space and additional noise.
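A line annotation can be thought of as an ordered list of points joined by straight segments (a polyline). The sketch below shows one possible record for a lane boundary and a helper that measures its length in pixels; the record layout, the `lane_boundary` label and the coordinates are assumptions for illustration only.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float]


def polyline_length(points: List[Point]) -> float:
    """Total length of the straight segments joining consecutive points."""
    return sum(math.dist(p, q) for p, q in zip(points, points[1:]))


# Hypothetical line annotation: an ordered list of (x, y) pixel coordinates.
lane_marking = {
    "label": "lane_boundary",
    "points": [(0.0, 0.0), (3.0, 4.0), (3.0, 10.0)],
}

length = polyline_length(lane_marking["points"])
print(f"{lane_marking['label']}: {length} px")
```

Only the points along the line are stored, so none of the surrounding road surface is attached to the label.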

Point annotation

Point annotation involves accurately plotting key points at specified locations on an image. This form of annotation is most commonly used for facial recognition and sentiment analysis. By identifying and following the movement of landmark points on a face, machine learning algorithms can detect emotions through predictive reading.

Point annotation helps machines detect and identify facial expressions and emotions in sentiment analysis.
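As a small sketch of "following the movement of landmark points", the example below stores facial landmarks as named (x, y) coordinates for two frames and measures how far each one moved. The landmark names and coordinates are invented for illustration, and turning such displacements into an emotion label would require a trained model that is not shown here.

```python
import math
from typing import Dict, Tuple

Point = Tuple[float, float]


def landmark_displacements(
    before: Dict[str, Point], after: Dict[str, Point]
) -> Dict[str, float]:
    """Distance in pixels each landmark moved between two frames."""
    return {
        name: math.dist(before[name], after[name])
        for name in before
        if name in after  # skip landmarks that are missing in the second frame
    }


# Hypothetical point annotations for the same face in two video frames.
neutral_frame = {
    "nose_tip": (50.0, 60.0),
    "mouth_left": (40.0, 80.0),
    "mouth_right": (60.0, 80.0),
}
smiling_frame = {
    "nose_tip": (50.0, 60.0),
    "mouth_left": (37.0, 76.0),
    "mouth_right": (63.0, 76.0),
}

movement = landmark_displacements(neutral_frame, smiling_frame)
for name, distance in movement.items():
    print(f"{name}: moved {distance:.1f} px")
```

Here the mouth corners move while the nose tip stays put, which is the kind of pattern a downstream model could learn to associate with an expression.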

Semantic Segmentation

Autonomous vehicles are able to detect the edges of nearby objects with semantic segmentation

Semantic Segmentation is the task of separating an image into multiple segments and assigning every pixel in each segment the class label of what it represents (e.g., pedestrian, car, lamp post). This gives machines a comprehensive understanding of every pixel of a scene in an image.

Semantic Segmentation is commonly used for detection and localisation of specific objects. Applications of such granular image understanding can be found in a variety of industries. It is especially popular in the Autonomous Vehicle industry, as self-driving cars require a deep understanding of their surroundings, while in Agritech it is used to analyse crop fields to detect diseases and abnormal growth.
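One simple way to picture a semantic segmentation label is as a grid the same size as the image, where each cell holds the class ID of the corresponding pixel. The sketch below uses a tiny 3x4 mask and counts how many pixels belong to each class; the class IDs, class names and mask values are made up for illustration.

```python
from collections import Counter
from typing import Dict, List

# Hypothetical mapping from class ID to class name.
CLASS_NAMES = {0: "background", 1: "pedestrian", 2: "car"}

# A tiny 3x4 "image": every pixel carries exactly one class ID.
mask = [
    [0, 0, 2, 2],
    [0, 1, 2, 2],
    [0, 1, 0, 0],
]


def pixels_per_class(
    mask: List[List[int]], class_names: Dict[int, str]
) -> Dict[str, int]:
    """Count how many pixels in the mask are assigned to each class."""
    counts = Counter(class_id for row in mask for class_id in row)
    return {class_names[class_id]: n for class_id, n in counts.items()}


summary = pixels_per_class(mask, CLASS_NAMES)
print(summary)
```

Because every pixel has a label, the per-class counts always add up to the total number of pixels in the image.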

As the computer vision industry advances year upon year, the way training data is prepared for each use case will keep evolving as well. Image annotation is one of the most crucial tasks in computer vision.

While having the right annotation tool is important, computer vision models also rely heavily on high-quality annotation work, as it ultimately determines how accurately a model is able to tell one object from another.

Getting highly accurate training data in large volumes from an external party requires a partner who is able to break complex instructions down into clear and concise steps.


Get in touch with us to set up a POC or request a demo of our image annotation tool.
