- Open Access
An improved approach for the segmentation of starch granules in microscopic images
© Tang et al; licensee BioMed Central Ltd. 2010
- Published: 02 November 2010
Starches are the main storage polysaccharides in plants and are distributed widely throughout plant organs, including seeds, roots, tubers, leaves and stems. Currently, microscopic observation is one of the most important ways to investigate and analyze the structure of starches. The position, shape, and size of the starch granules are the main measurements for quantitative analysis. To obtain these measurements, the starch granules must first be segmented from the background. However, automatic segmentation of starch granules is still a challenging task because of the limitations of imaging conditions and the complex scenarios of overlapping granules.
We propose a novel method to segment starch granules in microscopic images. In the proposed method, we first separate starch granules from the background using automatic thresholding and then roughly segment the image using the watershed algorithm. To reduce the oversegmentation produced by the watershed algorithm, we use the roundness of each segment and analyze the gradient vector field to find critical points, which together identify oversegments. After the oversegments are found, we extract features such as the position and intensity of the oversegments, and use fuzzy c-means clustering to merge the oversegments into the objects with similar features. Experimental results demonstrate that the proposed method can successfully alleviate the oversegmentation of the watershed segmentation algorithm.
We present a new scheme for starch granule segmentation. The proposed scheme aims to alleviate the oversegmentation of the watershed algorithm. We use the shape information and the critical points of the gradient vector flow (GVF) of starch granules to identify oversegments, and use fuzzy c-means clustering based on prior knowledge to merge these oversegments into the objects. Experimental results on twenty microscopic starch images demonstrate the effectiveness of the proposed scheme.
- Binary Image
- Starch Granule
- Cosine Similarity
- Gradient Vector Flow
- Watershed Algorithm
Starch, a complex carbohydrate, is a major component of the human diet. It is also widely used in industrial applications such as food, health and nutrition, pharmaceutical and personal care products. Natural starch is produced in chloroplasts by photosynthesis and is usually packed into granules with a layered structure. The shape and size of native starch granules vary among species. For example, potato starches have large round granules with diameters of up to 100 μm. Rice starch granules are the smallest of the cereal starches, with diameters of about 2 μm. In wheat, the situation is more complicated: three types of granule size distributions have been reported. The shape and size of starch granules can affect starch properties and functions, so starch granules for industrial applications usually have specific requirements for shape and size. For instance, starch granules for the paper industry are required to have a spherical shape and a uniform granule size. Natural starches do not have such uniform characteristics, and therefore need to be modified for industrial applications. Alternatively, a variety with a specific shape and size of starch granules can be cultivated to meet this demand. Hence, new cultivars have to be investigated and exploited. Developing a new cultivar with a desired starch granule size for industry has economic significance in agricultural science.
Among the several methods available to measure the shape and size of granules, microscopic evaluation is the most convenient and a relatively precise one. Granule images are taken with a light microscope, and the shape and size can be examined directly from the image prints. The microscopy images can also be converted into, or directly acquired as, digital data and analyzed automatically. In addition, with this method starch granules can be analyzed in situ; that is, the starch granules do not need to be extracted from plant tissues. In situ data are significant for the study of starch granule development in relation to other cellular components. Microscopy image analysis is a relatively simple way for breeders interested in developing new cultivars to quickly check starch changes in plant tissue. The data from image analysis are the most precise and repeatable compared with other methods, but the method can be hampered by image quality and by the more complex scenarios of overlapping granules. Techniques that distinguish starch granules from background noise and identify overlapping granules are therefore promising for quantitative analysis.
To analyze starch granules quantitatively, image segmentation is often adopted to separate the starch granules from the background. Segmentation is a challenging task because of noise, the irregular shape of the objects, and their complicated topology. In microscopic images, many starch granules usually gather together and even overlap, which makes the segmentation more difficult. Many approaches have been proposed to deal with touching and overlapping objects. Some approaches are based on deformable contours and level sets. Because of some disadvantages of the current active contour models and level sets, many improved versions have been proposed for real applications. For example, Ortiz de Solorzano et al. presented a level set scheme for the segmentation of nuclei and cells. Vese et al. developed a novel multiphase level set framework based on the Mumford and Shah model, which avoids the problems of vacuum and overlap automatically. Yan et al. introduced an interaction model with repulsion and competition to segment high-throughput RNAi fluorescent cellular images. However, these methods are time consuming and require many control parameters. In this paper, we develop an image segmentation algorithm based on the watershed. The watershed algorithm considers the gradient magnitude of an image as a topographic surface. Pixels with the highest gradient magnitude correspond to the watershed lines, which are interpreted as the region boundaries. A successive flooding strategy is performed to construct the basins, each of which represents a segment. The watershed method has many advantages: it is simple and intuitive, can be parallelized, and always produces a complete partition of the image. However, image segmentation based on the watershed still has drawbacks such as oversegmentation and sensitivity to noise.
In this paper, we propose a novel merging scheme to solve the oversegmentation problem by using fuzzy c-means classification based on the prior knowledge and region features of starch granules.
Segmentation using thresholding and watershed
The rough segmentation of the image is divided into two steps. The first step is to separate the starch granules from the background using automatic thresholding. The second step is to use the watershed algorithm to roughly segment the image.
Step 1. Select an initial threshold T0 and a stopping criterion ε. T0 is set to the mean gray value of the whole image, and ε is set to a very small value;
Step 2. Set the current threshold T equal to T0;
Step 3. Segment the image with threshold T as follows: if a pixel has a gray value less than T, it is assigned to class C1 (starch granule part); if it has a gray value greater than T, it is assigned to class C2 (background part).
where f(·) is the gray value of the image, and M and N are the numbers of pixels in classes C1 and C2, respectively.
Repeat steps 3 through 5 until the difference in T between successive iterations is smaller than ε.
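The iteration described above can be sketched in a few lines of Python. This is a minimal illustration rather than the authors' implementation: the update rule used here (the new threshold is the average of the two class means) is an assumption based on the standard iterative threshold selection scheme, and the function name `iterative_threshold` and the toy pixel list are hypothetical.

```python
def iterative_threshold(pixels, eps=0.2):
    """Iteratively select a threshold for a flat list of gray values."""
    t = sum(pixels) / len(pixels)            # T0: mean of the whole image
    while True:
        c1 = [p for p in pixels if p < t]    # class C1: darker pixels (starch)
        c2 = [p for p in pixels if p >= t]   # class C2: background (ties go here, an assumption)
        if not c1 or not c2:                 # degenerate image: nothing to split
            return t
        mu1 = sum(c1) / len(c1)              # mean gray value of C1 (M pixels)
        mu2 = sum(c2) / len(c2)              # mean gray value of C2 (N pixels)
        t_new = (mu1 + mu2) / 2.0            # assumed update rule
        if abs(t_new - t) < eps:             # stopping criterion
            return t_new
        t = t_new


# Toy data: four dark "granule" pixels and four bright "background" pixels.
pixels = [20, 25, 30, 35, 180, 190, 200, 210]
threshold = iterative_threshold(pixels)
# Binary image convention from the text: 0 = starch, 1 = background.
binary = [0 if p < threshold else 1 for p in pixels]
```

On a clearly bimodal image such as this toy example, the loop converges after very few iterations because the class means barely move once the threshold falls between the two peaks.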
After the starch granules have been separated from the background, a binary image is generated in which 0 denotes the starch granule part and 1 denotes the background. We then calculate the Chamfer distance of the binary image. Chamfer distance transformations rely on the assumption that the value of the distance at a pixel can be deduced from the values of the distance at its neighbors. Chamfer distance transforms are a class of efficient discrete algorithms that offer a good approximation to the desired Euclidean distance transform, which is computationally very intensive. The Chamfer distance map of the binary image is then used as the input of the watershed algorithm for object segmentation. The basic idea of the watershed algorithm comes from the field of topography: a drop of water falling on a topographic surface follows a steepest descent line until it reaches a local minimum. Watershed lines are the divide lines that separate the domains of attraction of the drops of water. Water fills up the basins starting at these local minima, and dams are built where water coming from different basins meets. The surface is thus partitioned into regions, or basins, separated by these dams. For image segmentation, the intensity gradient is usually considered as the topographic surface, and each regional minimum of the gradient image is the attraction point of a catchment basin. In our algorithm, Meyer's watershed algorithm is adopted. Each region produced by the watershed segmentation is called a segment.
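To make the idea of propagating distances from neighbors concrete, the following sketch computes a two-pass Chamfer distance map in plain Python. The 3-4 weights (3 for axial steps, 4 for diagonal steps) are a common choice and an assumption here, since the text does not state which mask was used; the function name `chamfer_distance` is hypothetical. The watershed step itself is not shown.

```python
INF = 10 ** 9  # stands in for "not yet reached"


def chamfer_distance(binary, a=3, b=4):
    """Two-pass chamfer distance of object pixels (0) to the background (1)."""
    h, w = len(binary), len(binary[0])
    d = [[0 if binary[y][x] == 1 else INF for x in range(w)] for y in range(h)]
    # Forward pass: propagate from the top-left using already visited neighbors.
    for y in range(h):
        for x in range(w):
            if d[y][x] == 0:
                continue
            best = d[y][x]
            if x > 0:
                best = min(best, d[y][x - 1] + a)
            if y > 0:
                best = min(best, d[y - 1][x] + a)
                if x > 0:
                    best = min(best, d[y - 1][x - 1] + b)
                if x < w - 1:
                    best = min(best, d[y - 1][x + 1] + b)
            d[y][x] = best
    # Backward pass: propagate from the bottom-right.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if d[y][x] == 0:
                continue
            best = d[y][x]
            if x < w - 1:
                best = min(best, d[y][x + 1] + a)
            if y < h - 1:
                best = min(best, d[y + 1][x] + a)
                if x < w - 1:
                    best = min(best, d[y + 1][x + 1] + b)
                if x > 0:
                    best = min(best, d[y + 1][x - 1] + b)
            d[y][x] = best
    return d


# A 3x3 "granule" (0) surrounded by a one-pixel background border (1).
image = [
    [1, 1, 1, 1, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 0, 0, 0, 1],
    [1, 1, 1, 1, 1],
]
dist = chamfer_distance(image)
```

The largest values of the map lie near granule centers, which is why the distance map is a useful input for watershed segmentation of touching round objects; many implementations negate the map first so that the centers become the minima from which flooding starts.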
Identification of oversegmentation
One of the major issues of the watershed algorithm in image segmentation is oversegmentation. Oversegmentation happens when a granule is split into two or more segments. These segments from the same granule are called oversegments in this paper.
To alleviate oversegmentation, we develop a hybrid algorithm divided into two stages. The first stage identifies the oversegments, and the second stage merges the oversegments into the objects to which they belong. In the first stage, we use both shape information and gradient vector flow to identify the oversegments automatically.
where S and L are the area and perimeter of a segment, respectively. If a segment has a roundness value less than a given threshold (values between 0.7 and 0.75 worked well in our experiments), it is classified as an oversegment.
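The roundness test can be illustrated with a short Python sketch. The formula used below, 4πS/L², is the common isoperimetric definition of roundness and is an assumption here; it equals 1 for a perfect circle and decreases for elongated or irregular shapes, which is consistent with a threshold in the 0.7 to 0.75 range. The function names are hypothetical.

```python
import math


def roundness(area, perimeter):
    """Assumed definition: 4*pi*S / L^2, which is 1.0 for a perfect circle."""
    return 4.0 * math.pi * area / (perimeter ** 2)


def is_oversegment_by_shape(area, perimeter, threshold=0.7):
    """Flag a segment whose roundness falls below the threshold."""
    return roundness(area, perimeter) < threshold


# A circle of radius 10 versus a thin 20 x 2 rectangle.
circle_area, circle_perimeter = math.pi * 10 ** 2, 2 * math.pi * 10
strip_area, strip_perimeter = 20 * 2, 2 * (20 + 2)

circle_flag = is_oversegment_by_shape(circle_area, circle_perimeter)
strip_flag = is_oversegment_by_shape(strip_area, strip_perimeter)
```

In a digital image the perimeter is estimated from the pixel boundary, so even a nearly circular granule rarely reaches a roundness of exactly 1; this is one reason a threshold well below 1 is used.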
The cosine similarity ranges from 0 to 1, and a larger cosine similarity means a higher similarity between two directions, as in Fig.2. In our experiments, if the similarity value between two directions is more than 0.90, the directions are considered similar enough or the same. A critical point is detected if more than 6 directions are similar enough to the 8-neighborhood directions, as in Fig.2. If a segment does not have a critical point, it is identified as an oversegment.
After the roundness criterion and the critical point criterion have been applied, some oversegments still remain unidentified. In the experiments, we found that an oversegment missed by both criteria is a segment that occupies more than half of an object. Thus, after the identification step, the segments fall into three types: (1) identified oversegments; (2) unidentified oversegments; (3) single objects. We call the second and third types core segments because they are segments with large roundness values and critical points.
Fuzzy c-means is a data clustering technique based on the features of the observed objects, so effective feature selection plays a vital role in merging similar clusters and splitting clusters. Since intensity and position are the main features that distinguish different segments, the mean intensity and the intensity variance are the two primary features of a segment. The spatial position is also important for investigating the relationship between segments. After the watershed segmentation, the center of each segment is obtained, and we calculate the distance between the segment center and the center of the region of interest (ROI) that contains the supposed oversegments. To indicate the importance of the features, we assign a weight to each feature. The distance is given a larger weight because it is more important when deciding whether two or more adjacent segments should be merged. In our experiments, the weights are 0.25, 0.25 and 1.0 for the mean intensity, the intensity variance and the center distance, respectively.
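The critical point test can be sketched as follows. Because Fig.2 is not reproduced here, the reference directions are an assumption: each of the 8 neighbors is expected to carry a vector pointing away from the candidate point, that is, along its own offset. The thresholds 0.90 and "more than 6" come from the text; the names `OFFSETS` and `is_critical_point` are hypothetical, and the vector field is a plain nested list rather than a real GVF computation.

```python
import math

# Offsets (dx, dy) of the 8 neighbors; each doubles as its assumed reference direction.
OFFSETS = [(-1, -1), (0, -1), (1, -1), (-1, 0), (1, 0), (-1, 1), (0, 1), (1, 1)]


def cosine_similarity(u, v):
    """Cosine of the angle between two 2-D vectors (0.0 if either is zero)."""
    nu, nv = math.hypot(u[0], u[1]), math.hypot(v[0], v[1])
    if nu == 0.0 or nv == 0.0:
        return 0.0
    return (u[0] * v[0] + u[1] * v[1]) / (nu * nv)


def is_critical_point(field, x, y, sim_threshold=0.90, min_matches=7):
    """field[y][x] holds a (vx, vy) vector; (x, y) must be an interior pixel."""
    matches = 0
    for dx, dy in OFFSETS:
        if cosine_similarity(field[y + dy][x + dx], (dx, dy)) > sim_threshold:
            matches += 1
    return matches >= min_matches  # "more than 6" similar directions


# A field diverging from the center pixel, and a uniform field for contrast.
diverging = [[(x - 1, y - 1) for x in range(3)] for y in range(3)]
uniform = [[(1, 0) for _ in range(3)] for _ in range(3)]

found_in_diverging = is_critical_point(diverging, 1, 1)
found_in_uniform = is_critical_point(uniform, 1, 1)
```

In the uniform field only the right-hand neighbor agrees with its reference direction, so the test fails; in the diverging field all eight neighbors agree, so the center is reported as a critical point.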
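A compact sketch of weighted fuzzy c-means on the three features is given below. It uses the standard membership and center updates with fuzzifier m = 2. Applying the weights (0.25, 0.25, 1.0) to the squared feature differences, initializing the centers from core segments, and assuming the features are already scaled to comparable ranges are all choices made for this illustration and are not confirmed by the text; the feature values are made up.

```python
import math

# Weights from the text: mean intensity, intensity variance, center distance.
WEIGHTS = (0.25, 0.25, 1.0)


def weighted_dist(a, b, weights=WEIGHTS):
    """Weighted Euclidean distance (weighting squared terms is an assumption)."""
    return math.sqrt(sum(w * (p - q) ** 2 for w, p, q in zip(weights, a, b)))


def memberships(points, centers, m=2.0):
    """Standard FCM membership update; every row sums to 1."""
    rows = []
    for p in points:
        d = [weighted_dist(p, c) for c in centers]
        if min(d) == 0.0:  # point coincides with a center
            hits = [1.0 if di == 0.0 else 0.0 for di in d]
            rows.append([v / sum(hits) for v in hits])
        else:
            inv = [di ** (-2.0 / (m - 1.0)) for di in d]
            rows.append([v / sum(inv) for v in inv])
    return rows


def fuzzy_c_means(points, init_centers, m=2.0, max_iter=100, tol=1e-6):
    """Alternate membership and center updates until the centers stop moving."""
    centers = [list(c) for c in init_centers]
    dims = len(points[0])
    for _ in range(max_iter):
        u = memberships(points, centers, m)
        new_centers = []
        for i, old in enumerate(centers):
            w = [row[i] ** m for row in u]
            total = sum(w)
            if total == 0.0:  # no point belongs to this cluster; keep it
                new_centers.append(old)
                continue
            new_centers.append(
                [sum(wk * p[k] for wk, p in zip(w, points)) / total for k in range(dims)]
            )
        shift = max(weighted_dist(a, b) for a, b in zip(centers, new_centers))
        centers = new_centers
        if shift < tol:
            break
    return memberships(points, centers, m), centers


# Hypothetical, pre-scaled features: (mean intensity, variance, center distance).
segments = [(0.10, 0.10, 0.10), (0.12, 0.10, 0.15), (0.90, 0.80, 0.90), (0.88, 0.82, 0.95)]
core = [segments[0], segments[2]]  # centers initialized from core segments
u, centers = fuzzy_c_means(segments, core)
labels = [max(range(len(row)), key=row.__getitem__) for row in u]
```

Each segment is assigned to the cluster with the highest membership, so an oversegment would be merged into the core segment whose cluster it joins.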
Our proposed method can be described as follows:
Step 1. Separate the background and the starch granules using automatic thresholding and obtain a binary image;
Step 2. Calculate the Chamfer distance of the binary image;
Step 3. Segment the objects using the watershed algorithm on the Chamfer distance map;
Step 4. Compute the roundness of the segments produced by the watershed method, and identify the segments with small roundness as oversegments;
Step 5. Analyze the GVF fields of the segments with large roundness and search for critical points, so as to identify the segments without critical points as oversegments;
Step 6. Extract the features of the segments;
Step 7. Initialize the cluster centers using the centers of the core segments;
Step 8. Merge the identified oversegments into the core segments by fuzzy c-means clustering.
In this paper, we used sweet potato starch as the object with which to present a method for improving microscopy image analysis. Sweet potato starch was extracted from storage roots, diluted in water and stained with I2/KI (iodine-potassium iodide solution). The stained starch granules were then put on microscope slides and covered with a cover glass. The slides were analyzed under bright-field illumination with a microscope (Olympus). Images were captured with a high-resolution CCD color camera (DP71 Olympus). Fig.1 (a) shows a sample microscope image with 7 starch granules.
The first experiment was performed to segment the image roughly using thresholding and the watershed algorithm. The results are shown in Fig.1. Fig.1 (b) is the histogram of the image shown in Fig.1 (a). Fig.1 (b) shows that the histogram has two peaks, which correspond to the gray background and the black granules respectively. The histogram suggests that thresholding is an effective method to separate the background from the foreground. Fig.1(c) is the binary image obtained using automatic thresholding. In the thresholding process, the initial threshold was set to 109 and ε was set to 0.2. The automatic thresholding stopped when the threshold reached 115. The segmentation result shown in Fig.1 (c) verifies that the thresholding method performs well. After the binary image was obtained, its Chamfer distance was computed and used as the input of the watershed segmentation. The Chamfer distance of the binary image is shown in Fig.1(d), and Fig.1(e) shows the segmentation result obtained by the watershed algorithm on the Chamfer distance map in Fig.1(d). In Fig.1(e), there are 15 segments, which are filled with different colours and labelled with numbers from 1 to 15 (the background is labelled 1). Fig.1(e) shows that oversegmentation occurred in the watershed segmentation.
Roundness values and the critical points found in the GVF fields.
From the roundness and critical points of the segments, we can conclude that there are 7 starch granules (segments 2, 7, 8, 9, 10, 14 and 15), and that the other segments, except for segment 1 (background), should be merged. Fig.4 (f) is the final result after merging by the FCM. In ROI 1, segments 3, 4 and 5 were merged into segment 2, while segment 6 was merged into segment 8; in ROI 2, segments 11 and 13 were merged into segment 10, while segment 12 was merged into segment 14.
The intensity distribution of microscopic images of starch granules has two peaks, corresponding to the starch granules and the background respectively; we therefore used automatic thresholding to separate the objects from the background. The threshold was initialized at the middle of the intensity range and adjusted during an iterative process according to the difference between the previous threshold and the mean intensities of the pixels belonging to the two classes. After the binary image was acquired, we calculated its Chamfer distance and then roughly segmented the starch granules using the watershed method. Since most starch granules are round or nearly round, we computed the roundness of each segment so as to identify the oversegments automatically. In our experiments, a roundness threshold between 0.7 and 0.75 worked well. However, some single starch granules have a roundness below this experimental threshold, and some oversegments have a large roundness. To distinguish real single objects from oversegments effectively, the GVF field is used to search for critical points in the neighbourhood of a segment center. A segment is considered a single object if it has a large roundness value and a critical point around its center; otherwise it is an oversegment. These two criteria, roundness and critical point, can therefore be used to determine the number of objects automatically. Fuzzy c-means clustering is a powerful technique for merging adjacent segments with similar features. How to define features that represent a starch granule effectively is a key issue. Since intensity and position are the primary characteristics of a segment, we adopted the mean intensity, the intensity variance and the distance between the centers of the ROI and the segment as features. Different weights were assigned to these features to reflect their significance; the center distance is given a larger weight because only adjacent segments may be merged.
We proposed a novel method to segment microscopic images of starch granules. To deal with the oversegmentation of the watershed algorithm, we calculated the roundness of the segments and analyzed the GVF field so as to identify the oversegments and determine the core segments automatically. The position and intensity of the segments were extracted as input features for fuzzy c-means clustering. Experimental results show that the oversegments could be merged successfully and that the proposed algorithm performs well in object segmentation.
Publication of this supplement was made possible with support from the International Society of Intelligent Biological Medicine (ISIBM). This article has been published as part of BMC Genomics Volume 11 Supplement 2, 2010: Proceedings of the 2009 International Conference on Bioinformatics & Computational Biology (BioComp 2009). The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2164/11?issue=S2.
- Dündar E, Turana Y, Blaurock AE: Large scale structure of wheat, rice and potato starch revealed by ultra small angle X-ray diffraction. International Journal of Biological Macromolecules. 2009, 45 (2): 206-212.
- Daniels D, Donald A: Soft Material Characterization of the Lamellar Properties of Starch: Smectic Side-Chain Liquid-Crystalline Polymeric Approach. Macromolecules. 2004, 37 (4): 1312-1318.
- Zhou Z, Robards K, Helliwell S, Blanchard C: Composition and functional properties of rice. International Journal of Food Science and Technology. 2002, 37: 849-868.
- Wilson JD, Bechtel D, Todd T, Seib P: Measurement of Wheat Starch Granule Size Distribution Using Image Analysis and Laser Diffraction Technology. Cereal Chemistry. 2006, 83 (3): 259-268.
- Shannon J, Garwood D: Genetics and physiology of starch development. In: Starch: Chemistry and Technology. Edited by: Whistler R, Bemiller J, Paschall F. 1984, Orlando, FL, Academic Press, 25-86.
- De Solorzano C, Malladi R, Lelièvre S, Lockett S: Segmentation of nuclei and cells using membrane related protein markers. Journal of Microscopy. 2001, 201: 404-415.
- Vese L, Chan T: A multiphase level set framework for image segmentation using the Mumford and Shah model. International Journal of Computer Vision. 2002, 50 (3): 271-293.
- Yan P, Zhou X, Shah M, Wong STC: Automatic segmentation of high throughput RNAi fluorescent cellular images. IEEE Transaction on Information Technology in Biomedicine. 2008, 12 (1): 109-117.
- Vincent L, Soille P: Watersheds in digital spaces: an efficient algorithm based on immersion simulation. IEEE Transaction on Pattern Analysis and Machine Intelligence. 1991, 13 (6): 583-598.
- Cloppet F, Boucher A: Segmentation of overlapping/aggregating nuclei cells in biological images. Proceedings of 19th International Conference on Pattern Recognition. 2008, 1-4.
- Shapiro L, Stockman G: Computer Vision. Prentice Hall. 2002.
- Borgefors G: Distance transformations in digital images. Computer Vision, Graphics, and Image Processing. 1986, 34: 344-371.
- Borgefors G: Distance transformation in arbitrary dimensions. Computer Vision, Graphics, and Image Processing. 1984, 27: 321-345.
- Meyer F: Topographic distance and watershed lines. Signal Processing. 1994, 38 (9): 113-125.
- Vachier C, Meyer F: The viscous watershed transform. Journal of Mathematical Imaging and Vision. 2005, 22 (2-3): 251-267.
- Wang Y, Jia Y: Analysis of the critical point of the gradient vector flow snake model. Journal of Software. 2006, 17 (9).
- Xu C, Prince JL: Snakes, shapes, and gradient vector flow. IEEE Trans Image Processing. 1998, 7 (3): 359-369.
- Xu C, Prince JL: Generalized gradient vector flow external forces for active contours. Signal Processing. 1998, 71 (2): 131-139.
- Tang J, Millington S, Acton S, Crandall J, Hurwitz S: Surface extraction and thickness measurement of the articular cartilage from MR images using directional gradient vector flow snake. IEEE Transaction on Biomedical Engineering. 2006, 52 (5): 896-907.
- Tang J: A multi-direction GVF snake for the segmentation of skin cancer images. Pattern Recognition. 2009, 42: 1172-1179.
- Bezdek J: Pattern Recognition with Fuzzy Objective Function Algorithms. Plenum Press, New York, NY, USA. 1981.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.