
Your Complete Guide To Image Segmentation

Merlin Peter
June 30, 2021

Computer vision has advanced rapidly over the last few years. At its core, computer vision is the technology that allows machines to perceive and interpret their surroundings as humans do. While the human brain is naturally capable of multitasking and making quick decisions, transferring this capability to machines was a challenge in the beginning. Today, however, we can build computer vision models that detect objects, determine shapes, predict object movements, and take the necessary actions based on data. Self-driving cars, aerial mapping, surveillance applications, and the various AR/VR technologies we enjoy today are a result of the progress made in computer vision models.

The most popular way to train CV applications or perception models is to identify the objects in an image, either by labeling the image as a whole or through object detection. The more granular method, which trains models at the pixel level, is segmentation. Today, we’ll explore the different types of segmentation and how you can access segmentation tooling for free via GT Studio.

What is Image Segmentation?

To understand segmentation, let’s look at the different types of annotations using simple examples. Each annotation type trains an ML algorithm to produce a particular kind of output. There are four primary types of image annotations used to train computer vision models.

Consider these examples: 

Image (a) has only one dog. We can build a straightforward cat-dog classifier model and predict that there’s a dog in the given image. 

Image (b) has both a dog and a cat. Here, we can train a multi-label classifier. But we also need to understand where in the image the cat and dog are located. This is where image localization becomes important. In this image, object detection helps us detect the object classes and also predict their locations.

Although these annotations help models detect object classes and predict accurate locations, they do not provide an accurate representation of what the image actually consists of. This is where segmentation becomes critical for CV models. 

Segmentation assigns each pixel of a given image to a class to provide an accurate representation of the shapes in it. In other words, it is the process of creating a pixel-wise mask for each object in the image, which gives us a more granular understanding of the image’s contents. The goal is to recognize and understand what an image contains at the pixel level. Every pixel in the image belongs to exactly one class, as opposed to object detection, where the bounding boxes of objects can overlap.
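To make the idea of a pixel-wise mask concrete, here is a minimal sketch in plain Python. The class ids, the class names, and the tiny 4x6 "image" are made up for illustration, and the sketch assumes the common convention of storing one class id per pixel. In practice a mask would usually be a NumPy array or an image file rather than nested lists.

```python
# Hypothetical class ids, chosen only for this illustration.
BACKGROUND, DOG, CAT = 0, 1, 2
CLASS_NAMES = {BACKGROUND: "background", DOG: "dog", CAT: "cat"}

# A tiny 4x6 "image" expressed as a segmentation mask:
# every cell is one pixel and stores a single class id.
mask = [
    [0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 2, 2],
    [0, 1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0, 0],
]


def pixel_counts(mask):
    """Count how many pixels belong to each class name."""
    counts = {}
    for row in mask:
        for class_id in row:
            name = CLASS_NAMES[class_id]
            counts[name] = counts.get(name, 0) + 1
    return counts


counts = pixel_counts(mask)
print(counts)
```

Because each pixel carries one label, the per-class counts always add up to the total number of pixels in the image, which is not something a set of possibly overlapping bounding boxes can guarantee.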

Consider this comparison:

In image (c) object detection only showcases the classes and tells us nothing about the shape. 

In image (d) segmentation gives us pixel-wise information about the objects along with the class. 
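One way to see what the mask adds over a bounding box is to derive the box from the mask and compare the two. The sketch below is an illustration only: the helper name `mask_to_box` and the L-shaped toy object are assumptions, not part of any particular tool.

```python
def mask_to_box(mask, class_id):
    """Return (x_min, y_min, x_max, y_max) around all pixels of class_id, or None."""
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value == class_id:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))


# An L-shaped object (class id 1) on a background (class id 0).
mask = [
    [1, 0, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 0],
    [0, 0, 0, 0],
]

box = mask_to_box(mask, 1)
x_min, y_min, x_max, y_max = box
box_area = (x_max - x_min + 1) * (y_max - y_min + 1)
object_pixels = sum(row.count(1) for row in mask)

print(box, box_area, object_pixels)
```

The box covers more pixels than the object actually occupies, and nothing in the box coordinates says the object is L-shaped. A box can always be computed from a mask, but the mask cannot be recovered from a box.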

Different Types of Segmentation

The different types of segmentation include:

Non-instance Segmentation

Non-instance segmentation is also known as semantic segmentation. It specifies the shape, size, and form of objects in addition to their location and presence. It’s used in cases that require more specificity, where a model needs to know definitively which pixels belong to an object of interest and which do not. In the above example, all the green pixels indicate the Dog and the pink pixels indicate the Cat.

Instance Segmentation

Instance segmentation tracks the presence, location, number, size, and shape of objects in an image. The goal here is to understand the image more accurately with every pixel. In the above example, the pixels related to the dog are tagged as ‘Dog 1’ and those that belong to the cat are tagged as ‘Cat 1’. If there were more dogs and cats in the image, then they would be segmented in similar colors and tagged consecutively. 

Non-instance vs Instance Segmentation

The key difference between the two can be understood with the help of this example.

In image (d) there are no instances. All objects are simply named Dog and the pixels are marked in pink.

In image (e) the instances are clearly marked. Dogs are labeled as Dog 1, Dog 2, etc. along with varying colors to differentiate between the objects. 
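The difference between the two label formats can be sketched in code. The example below starts from a non-instance (semantic) mask in which every dog pixel has the same id, then numbers each separate blob of dog pixels using a simple flood fill with 4-connectivity. This is only an illustration of the labels, under the assumption that the objects do not touch. It is not how annotators or instance segmentation models produce instances, since touching objects would be merged into one. The function name `label_instances` and the toy mask are made up for the example.

```python
from collections import deque


def label_instances(mask, class_id):
    """Number each 4-connected blob of class_id pixels as 1, 2, 3, ..."""
    height, width = len(mask), len(mask[0])
    instances = [[0] * width for _ in range(height)]
    next_id = 0
    for y in range(height):
        for x in range(width):
            if mask[y][x] != class_id or instances[y][x] != 0:
                continue
            # Found an unlabeled pixel of the class: start a new instance.
            next_id += 1
            instances[y][x] = next_id
            queue = deque([(y, x)])
            while queue:
                cy, cx = queue.popleft()
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < height and 0 <= nx < width
                            and mask[ny][nx] == class_id
                            and instances[ny][nx] == 0):
                        instances[ny][nx] = next_id
                        queue.append((ny, nx))
    return instances, next_id


DOG = 1  # hypothetical class id
semantic_mask = [
    [1, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [0, 1, 1, 0, 0],
]

instance_mask, dog_count = label_instances(semantic_mask, DOG)
for row in instance_mask:
    print(row)
print("dogs found:", dog_count)
```

In the semantic mask every dog pixel is simply 1 (Dog). In the instance mask the same pixels are split into Dog 1, Dog 2, and Dog 3, which is what makes counting and tracking individual objects possible.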

Panoptic Segmentation

This type of annotation blends both semantic and instance segmentation. The background and objects are semantically segmented and the objects also have instances. This provides granular information for certain ML algorithms. 

Take a look at the different types of segmentation in the example.

(a) Original image for an autonomous driving use case. 

(b) Segmented classes have no instances and are tagged as Car, Building, Road, Sky. All pixels of an object class share one color, and each class is assigned a different color.

(c) Segmented classes have instances and are tagged as Car 1, Car 2, Car 3. All pixels of an instance share one color, and each instance of an object class is assigned a different color.

(d) Combination of segmented classes with instances Car 1, Car 2, Car 3, and non-instanced classes like Sky, Road, etc.
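The combined labeling in (d) can be sketched as a merge of two per-pixel maps: a class map and an instance map. Everything in the sketch below is an assumption made for illustration, including the class ids, the choice of Car as the only instanced class, and the helper name `panoptic_labels`. Real tools and datasets use their own encodings.

```python
# Hypothetical class ids for a tiny driving scene.
SKY, ROAD, CAR = 0, 1, 2
CLASS_NAMES = {SKY: "Sky", ROAD: "Road", CAR: "Car"}
INSTANCED_CLASSES = {CAR}  # classes whose objects are numbered individually

# Class id of every pixel (non-instance view).
semantic = [
    [0, 0, 0, 0, 0],
    [2, 2, 1, 2, 2],
    [1, 1, 1, 1, 1],
]

# Instance number of every pixel (0 where the class has no instances).
instances = [
    [0, 0, 0, 0, 0],
    [1, 1, 0, 2, 2],
    [0, 0, 0, 0, 0],
]


def panoptic_labels(semantic, instances):
    """Merge class and instance maps into one readable label per pixel."""
    labels = []
    for sem_row, inst_row in zip(semantic, instances):
        row = []
        for class_id, instance_id in zip(sem_row, inst_row):
            name = CLASS_NAMES[class_id]
            if class_id in INSTANCED_CLASSES:
                name = f"{name} {instance_id}"
            row.append(name)
        labels.append(row)
    return labels


panoptic = panoptic_labels(semantic, instances)
for row in panoptic:
    print(row)
```

Every pixel still gets exactly one label, but countable objects such as cars carry an instance number while background classes such as Sky and Road do not.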

Popular Image Segmentation Use Cases 

Segmentation is used for the granular understanding of images in a variety of industries. It is especially popular in the autonomous driving industry, as self-driving cars require a deep understanding of their surroundings, and it is also rapidly gaining popularity in cancer research in the medical field. The annotator is given the task of separating an image into multiple segments and assigning every pixel in each segment the class label of what it represents.

Non-instance Segmentation For Geospatial Application

Full-pixel, non-instanced segmentation is used for training perception models to identify objects of interest from faraway cameras for geospatial applications.

Instance Segmentation For Autonomous Driving

Full-pixel instance segmentation is commonly used for AV use cases when information about every pixel is critical and may influence the accuracy of the perception model.

Instance Segmentation For Cancer Cell Identification

Instances are used for detecting the shapes of cancerous cells to expedite cancer diagnosis in the healthcare industry.

Leverage GT Studio For Your Segmentation Projects

GT Studio is a scalable, web-based data labeling platform designed to empower ML teams. The platform is completely free for a team of 5 users so that ML teams can create labeled data faster to test their ML initiatives.

We have tons of features that make segmentation simpler and faster.

Try GT Studio for free today... 

Anyone can sign up and start using GT Studio. We have an excellent team to support you through your journey while exploring our platform. 

Feel free to reach us at