Segmentation of Non-text Objects with Connected Operators

Sun 21 May 2017

This project is part of my present work at the Center for Development of Advanced Computing (CDAC).

A paper detailing this work was accepted for publication at the International Conference on Computer Vision and Image Processing 2016. It can be viewed here: CVIP 2016

Introduction

Segmenting the non-text objects in a document image is a common step in document analysis. Segmenting these objects and recognizing each one as belonging to one of several non-text object classes (halftones, tables, line drawings, graphs, etc.) allows for a higher-level understanding of the page under consideration.

In this project I aim to develop methods for segmenting non-text objects from grayscale document images. At present the work targets images with simple (rectangular) layouts, including images that have suffered some degradation, particularly bleedthrough, where content printed on the other side of the page becomes visible on the page under consideration. The approach I am exploring differs from conventional page segmentation approaches in that the object segmentation is performed on grayscale images rather than on binary images. It relies on a tool from mathematical morphology known as connected operators.

Connected Operators

Connected operators are filtering tools that act by merging flat zones, where a flat zone is a connected component over which the image has a constant gray level. In the course of their operation they can only remove existing contours, by merging neighboring regions; they never create new contours or shift the positions of existing ones.
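To make the notion of a flat zone concrete, here is a minimal sketch that labels the flat zones of a tiny grayscale image with a breadth-first flood fill. The function name `label_flat_zones`, the list-of-lists image format and the choice of 4-connectivity are all assumptions made for illustration; they are not taken from the project's code.

```python
from collections import deque


def label_flat_zones(image):
    """Label flat zones: 4-connected regions of constant gray level.

    `image` is a list of rows of integers (an assumed toy format).
    Returns (labels, count), where labels[r][c] is the zone index.
    """
    rows, cols = len(image), len(image[0])
    labels = [[-1] * cols for _ in range(rows)]
    count = 0
    for r in range(rows):
        for c in range(cols):
            if labels[r][c] != -1:
                continue  # pixel already belongs to a zone
            # Start a new zone and flood-fill over equal-valued neighbors.
            value = image[r][c]
            labels[r][c] = count
            queue = deque([(r, c)])
            while queue:
                cr, cc = queue.popleft()
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and labels[nr][nc] == -1
                            and image[nr][nc] == value):
                        labels[nr][nc] = count
                        queue.append((nr, nc))
            count += 1
    return labels, count


example = [
    [0, 0, 5],
    [0, 5, 5],
    [3, 3, 0],
]
zone_labels, zone_count = label_flat_zones(example)
```

Note that the two regions of gray level 0 in `example` are not connected, so they form separate flat zones. A connected operator may merge zones like these with their neighbors, but it never splits a zone.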

One way of realizing these operators is to represent the image as a tree. The tree is built by thresholding the image at successive gray levels, performing a connected component analysis on each thresholded result, and linking the resulting nodes to reflect the inclusion relationships between components. Variations in how the tree is constructed yield different types of trees. The tree gives a structured representation of the image that aids subsequent analysis tasks.
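The threshold-and-link construction can be sketched naively as follows, here for a max-tree (one possible tree variant, chosen for illustration). The names `connected_components` and `build_max_tree`, the dictionary node layout and the use of 4-connectivity are assumptions. This direct version re-runs the component analysis at every gray level and is meant only to show the idea; it is not the project's implementation, and practical algorithms are far more efficient.

```python
from collections import deque


def connected_components(mask):
    """Return the 4-connected components of a boolean grid as pixel sets."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, pixels = deque([(r, c)]), set()
            while queue:
                cr, cc = queue.popleft()
                pixels.add((cr, cc))
                for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                    nr, nc = cr + dr, cc + dc
                    if (0 <= nr < rows and 0 <= nc < cols
                            and mask[nr][nc] and not seen[nr][nc]):
                        seen[nr][nc] = True
                        queue.append((nr, nc))
            components.append(pixels)
    return components


def build_max_tree(image):
    """Naive max-tree: one node per component of {image >= h}.

    Each node is a dict with its gray level, its pixel set and the
    index of its parent (None for the root).
    """
    levels = sorted({v for row in image for v in row})
    nodes = []
    owner = {}  # pixel -> index of the smallest node found so far containing it
    for h in levels:
        mask = [[v >= h for v in row] for row in image]
        for pixels in connected_components(mask):
            # A component with no pixel exactly at level h reappears
            # unchanged at a higher threshold, so it is skipped here.
            if all(image[r][c] != h for r, c in pixels):
                continue
            # Inclusion link: the enclosing node from a lower level.
            parent = owner.get(next(iter(pixels)))
            index = len(nodes)
            nodes.append({"level": h, "pixels": pixels, "parent": parent})
            for p in pixels:
                owner[p] = index
    return nodes


image = [
    [0, 0, 0, 0],
    [0, 2, 3, 0],
    [0, 2, 2, 0],
    [0, 0, 0, 1],
]
tree = build_max_tree(image)
```

For this toy image the root covers the whole page at level 0, the bright 2x2 block and the isolated pixel of value 1 are its children, and the single pixel of value 3 is a child of the 2x2 block.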

The image is then processed by pruning the leaves of the tree and reconstructing the image from the pruned tree. Depending on the pruning criterion and the type of tree, different kinds of filters may be realized.
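The prune-and-reconstruct step can be illustrated on a small hand-built max-tree. In this sketch the node layout (a dict with `level`, `pixels` and `parent`), the function name `prune_and_reconstruct` and the area criterion are hypothetical choices made for the example; the criteria actually used in this project are not described here. With an area criterion on a max-tree, the result corresponds to the classical area opening.

```python
def prune_and_reconstruct(nodes, shape, keep):
    """Prune a tree and rebuild the image from the surviving nodes.

    `nodes` must list every parent before its children. A node is
    removed, together with all of its descendants, when `keep(node)`
    is false; the root is always kept. Each pixel then takes the
    level of the smallest surviving node that contains it.
    """
    rows, cols = shape
    output = [[None] * cols for _ in range(rows)]
    survives = []
    for node in nodes:
        parent = node["parent"]
        alive = parent is None or (survives[parent] and keep(node))
        survives.append(alive)
        if alive:
            # Children are painted after their parents, so the higher
            # level of a surviving child overwrites the parent's level.
            for r, c in node["pixels"]:
                output[r][c] = node["level"]
    return output


# Hand-built max-tree of a 4x4 toy image (values chosen for illustration).
all_pixels = {(r, c) for r in range(4) for c in range(4)}
toy_tree = [
    {"level": 0, "pixels": all_pixels, "parent": None},
    {"level": 1, "pixels": {(3, 3)}, "parent": 0},
    {"level": 2, "pixels": {(1, 1), (1, 2), (2, 1), (2, 2)}, "parent": 0},
    {"level": 3, "pixels": {(1, 2)}, "parent": 2},
]

# Keeping every node reproduces the original image.
original = prune_and_reconstruct(toy_tree, (4, 4), lambda n: True)

# Assumed criterion: keep only nodes covering at least 2 pixels.
filtered = prune_and_reconstruct(
    toy_tree, (4, 4), lambda n: len(n["pixels"]) >= 2
)
```

In `filtered` the two single-pixel peaks are flattened into their parents while the outline of the 2x2 block is untouched, which is the contour-preserving behavior expected of a connected operator.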

An introduction to connected operators may be found in the Connected Operators article by Salembier and Wilkinson (pdf).

Proposed Approach

The approach I am working on at present uses a max-tree representation of the image and combines the connected operator approach with more conventional techniques in order to segment the non-text objects in both degraded and non-degraded grayscale document images. I will elaborate on this approach soon. Results so far look promising.

You can examine some of the segmented images here: Segmentation Results Album