Extracting Foreground Objects from Images and Videos using OpenCV with Python

When working with images and videos, one common task is to extract the foreground objects from the background. This process is essential in various applications such as object detection, video surveillance, virtual reality, and image segmentation.

In this article, we will explore how to extract foreground objects from images and videos using OpenCV, a popular computer vision library, with Python.

What is OpenCV?

OpenCV (Open Source Computer Vision Library) is an open-source computer vision and machine learning software library. It provides various functions and algorithms for image and video analysis, including object recognition, image processing, and more. OpenCV supports multiple programming languages, including Python, making it an excellent choice for implementing computer vision projects.

Why Extract Foreground Objects?

Extracting the foreground objects from images and videos is vital for many applications. By isolating the foreground, we can focus on specific objects or regions of interest, segment images, remove background noise, or perform person/object tracking.

Steps to Extract Foreground Objects

The following steps describe the process of extracting foreground objects from images and videos using OpenCV with Python:

  1. Read the Image/Video: First, we need to read the input image with OpenCV's imread() function, or open the video file with the VideoCapture class and read it frame by frame. This loads the data into our program for further processing.

  2. Apply Background Subtraction: Background subtraction is a common technique used to separate the foreground from the background. OpenCV's main module provides two background subtractors, one based on Gaussian Mixture Models (MOG2) and one based on K-nearest neighbors (KNN); further algorithms such as MOG, GMG, and CNT are available in the bgsegm module of opencv-contrib. These algorithms learn a statistical model of the background from previous frames and mark each pixel of the current frame as foreground when it does not fit that model.

  3. Foreground Extraction: After applying background subtraction, we obtain a foreground mask where the foreground objects are represented by white pixels (255), while the background is black (0). With shadow detection enabled, MOG2 and KNN also mark shadows in gray (127 by default). We can use techniques like morphological operations (erosion, dilation) and contour detection to refine the mask and extract the foreground objects more accurately.

  4. Apply Mask to Image/Video: Once we have the foreground mask, we can apply it to the original image or video frame to obtain the isolated foreground objects. This step uses a bitwise AND with the mask, or multiplies the image/frame pixels by the mask after scaling it from 0/255 to 0/1.

  5. Post-processing: Post-processing may be necessary to eliminate noise, fill holes, or apply additional filtering techniques to enhance the extracted foreground objects. Techniques like morphological operations, blurring, or thresholding can be used for this purpose.

  6. Display or Save Results: Finally, we can display the extracted foreground objects on the screen or save them as separate images or video sequences for further analysis or use in our applications.

Code Example

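Here's a simple example that runs these steps on a single image using OpenCV and Python. Note that a background subtractor learns its model from a sequence of frames, so on a single image the mask typically marks almost every pixel as foreground; the same calls give meaningful results when applied frame by frame to a video.

import cv2

# Read the input image (imread returns None if the file cannot be read)
image = cv2.imread('input_image.jpg')
if image is None:
    raise FileNotFoundError('Could not read input_image.jpg')

# Apply background subtraction
bg_subtractor = cv2.createBackgroundSubtractorMOG2()
foreground_mask = bg_subtractor.apply(image)

# Refine the foreground mask: drop shadow pixels (value 127), then remove small specks
kernel = cv2.getStructuringElement(cv2.MORPH_ELLIPSE, (3, 3))
foreground_mask = cv2.threshold(foreground_mask, 127, 255, cv2.THRESH_BINARY)[1]
foreground_mask = cv2.morphologyEx(foreground_mask, cv2.MORPH_OPEN, kernel)

# Apply mask to the original image
foreground_image = cv2.bitwise_and(image, image, mask=foreground_mask)

# Display the results until a key is pressed
cv2.imshow('Input Image', image)
cv2.imshow('Foreground Objects', foreground_image)
cv2.waitKey(0)
cv2.destroyAllWindows()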


Conclusion

Extracting foreground objects from images and videos is a fundamental task in computer vision. OpenCV provides background subtractors, morphological operations, and contour detection that together cover each step of this task. By following the steps outlined in this article and tuning the algorithm and its parameters to the footage at hand, we can obtain clean foreground masks and use them in a range of computer vision applications.

noob to master © copyleft