Image segmentation and particles classification using texture analysis method

Introduction: Ingredients of oily fish include a large amount of polyunsaturated fatty acids, which are important elements in various metabolic processes of humans, and have also been used to prevent diseases. However, in an attempt to reduce cost, recent developments are starting a replace the ingredients of fish oil with products of microalgae, that also produce polyunsaturated fatty acids. To do so, it is important to closely monitor morphological changes in algae cells and monitor their age in order to achieve the best results. This paper aims to describe an advanced vision-based system to automatically detect, classify, and track the organic cells using a recently developed SOPAT-System (Smart On-line Particle Analysis Technology), a photo-optical image acquisition device combined with innovative image analysis software. Methods: The proposed method includes image de-noising, binarization and Enhancement, as well as object recognition, localization and classification based on the analysis of particles’ size and texture. Results: The methods allowed for correctly computing cell’s size for each particle separately. By computing an area histogram for the input images (1h, 18h, and 42h), the variation could be observed showing a clear increase in cell. Conclusion: The proposed method allows for algae particles to be correctly identified with accuracies up to 99% and classified correctly with accuracies up to 100%.


Introduction
Long-chain polyunsaturated fatty acids (eicosapentaenoic acid, EPA and docosahexaenoic acid, DHA) are important elements for various metabolic processes in humans (Martins et al., 2013).The ingredients of oily fish include a large amount of polyunsaturated fatty acids, which are used in the prevention of heart and cancer diseases (Madeira et al., 2007;Martins et al., 2013;Perez-Garcia et al., 2011).Also, the microalgae Crypthecodinium cohnii are identified as a good producer of DHA (Hiraishi et al., 2013;Martin, 2003).In order to reduce cost, recent developments are starting to replace the ingredients of fish oil with products of microalgae, which include 30 to 70% of DHA (Martins et al., 2013).
Cells of C. cohnii grow in a liquid dark environment at 27 °C (Bhaud et al., 1991).During the life cycle of C. cohnii, two forms are observed: motile cells and non-motile (cysts) (Hiraishi et al., 2013;Martin, 2003).Both forms can vary in size (10-50 μm) (Martin, 2003).The variation of cell size may indicate different stages of the cell life cycle -a small cell represents a young offspring and a cyst represents an adult cell.Motile cells appear generally in oval shape, while non-motile cells (cyst) are generally round in shape and appear larger than the motile cells (Hiraishi et al., 2013;Martin, 2003).
The main objective of this paper is to devise a method to visually differentiate between the small (non-active) cells and cyst (active) ones in order to follow and observe morphological changes for active cells, while the cells are also kept in a certain age.To do so, the method will be based on two measurable parameters: cell size and cell texture.
The determination of cell size is an important issue in different areas.For example, the particle size distribution of a drug substance may have significant effects on the final performance of the product (Brümmer, 2008;Paul, 2009).In wastewater treatment, it plays an important role to determine the size of organic matters and their relationship with the treatment process (Audrey et al., 1985).The same can be said for the prediction of relative amounts of various sizes of fine particles in Athabasca oil sands (Schutte et al., 1999).Furthermore, texture particle property represents an important visual indication to Volume 32, Number 3, p. 243-252, 2016 classify active cells out of non-active ones (Mihran and Anil, 1998).
A number of publications can be found in the literature dealing with image analysis for bioprocesses.For example, Wartelle et al. ( 2004) counted animal cells by means of three parameters: size of the high pass filter to enforce the shape detection, threshold value used after the Hough transformation, and the number of circles which were examined.The correlation coefficient between reference (ground truth) and data sets has been equal to 0.99.Bittner et al. (1998), processed images of Saccharomyces cerevisiae to examine cell size distribution.The process was based on two parameters (optical properties of the cells, and threshold values) to facilitate cell recognition.The results show error percentage below 2%.Similar experiments were carried out on liquid drops and air bubbles (Galindo et al., 2005;Ribeiro et al., 2009), oil drops (Dominguez and Corkidi, 2011), and crystal drops (Rivoire et al., 2009).Nevertheless, the most relevant literature on this subject has been introduced by Jiang et al. (2010).These authors provided a strong method for segmentation, which is composed of image de-noising using Haar wavelet transform, histogram equalization for quality improving, edge detection and morphological operations to remove debris and gain closed contour for segmented objects.The presented work follows the aforementioned approach taking into consideration the different features of our images and how to process the connected and overlapped particles.
The C. cohnii images can be obtained by probe microscopy.Different microscopic devices are listed in Morita (2006).Nevertheless, microscopic images suffer from tricky problems, such as the small size of the particles, poor colour balance, un-sharpening, and low contrast and resolution (Bazila and Mir, 2014).SOPAT GmbH is a specialized company for measuring the development of particles in multiphase processes.The SOPAT-System (Smart On-line Particle Analysis Technology) is a photo-optical image acquisition device combined with innovative image analysis software.The system is being optimized for biological systems (SOPAT Bio Pro) so that cells' information is extracted from images taken by an in-situ microscope probe, where the cells are evaluated within their own environment.Particles' size, morphology, concentration and other relevant parameters are measured simultaneously by automatic image analysis programs.
The method presented here is implemented using MATLAB  programming software, including the required methods for in-situ observation.

Methods
As shown in the literature, image processing and analysis play an important role in in-line observation of microorganisms using microscopic tools.Hence, the proposed method is divided into two major steps: image segmentation and object classification.The image segmentation process includes image de-noising, image binarization, image enhancement, and object recognition and localization.Object classification is achieved on the basis of cells size, and energy texture parameters.Figure 1 summarizes the various steps of the proposed method.

Wavelet based de-noising processing
The first step in the proposed method is image de-noising, which is the process of removing out noise from the captured image.The approach can be summarized in three steps: (1) Stationary wavelet transform is applied to the noisy image; (2) Normalization is used to modify the coefficients of the wavelet.(3) The inverse discrete stationary wavelet transform is performed to obtain the denoised image (Jiang et al., 2010).
Haar wavelet has been used in the proposed method (Ismail and Khan, 2012).The scaling of the wavelet function can be represented as follows: As show, it is described as an average function, which used for image approximation.Also, the mother wavelet function is described as the average of difference, which is used for image details.It is extended into three functions described as, , where h, v, d are horizontal, vertical and diagonal details coefficients respectively.Equation 1 is generated by the approximating of columns in the x-direction and approximating the rows in the y-direction.Horizontal details are obtained by approximating columns in the x-direction and detailing rows in the y-direction.Vertical details are computed by detailing columns in the x-direction and approximating rows in the y-direction.The diagonal image is performed by detailing both directions, x, and y-directions (Ismail and Khan, 2012;Salem, 2011).
In order to create a new image that contains the horizontal, vertical and diagonal edges that had been smoothed in the approximation image, the three detailed images are converted to binary images by a suitable threshold defined for each of them.Different threshold algorithms are described in Sachin and Dharmpal (2011) -from those, the authors chose the 'penalized threshold' method, which is based on sorting the details coefficients (d) in descending order and then computing the global threshold value (T) by minimizing the penalized criterion, as shown in Equation 5: where, t = 1, 2, …, n; α: sparsity parameter > 1; n: number of coefficients; and σ: standard deviation of the zero mean gaussian white noise in the denoising model (Moghtased-Azar and Gholamnia, 2014).Finally, the inverse discrete stationary wavelet transform is performed to obtain the denoised image (Raviraj and Sanavullah, 2007).

Image binarization
Edge detection is one of the most frequently used techniques in image binarization.An edge is characterized by a high local change of the intensity in the image (Maini and Aggarwal, 2009).Edge detection can be provided by various operators, such as: Sobel, Canny, Roberts, Prewitt, and log.In this paper we use a mix of Sobel and log operators.The following matrix is called Sobel kernel, and it is applied to the image for extracting horizontal and vertical edges (Vincent and Folorunso, 2009).
The log operator is used to smooth the image with Gaussian filter in order to reduce its sensitivity to noise, and then convolves the smoothed image with a Laplacian filter.Alternatively, the image is convolved with a linear filter -the Laplacian of Gaussian (LoG kernel): x y x y LoG x y e The edges can be detected by finding the zero crossing of the 2 nd derivative of the image intensity.
Also, image multi-level thresholding is performed on the image for binarization.This process determines various threshold values to segment the image into several clusters, C.This is done using a very popular thresholding technique -Otsu's method (Deng-Yuan et al., 2011), that selects an optimal threshold value, t*, by maximizing σ 2 B as follows: where, ω i : cumulative of the occurrence probability of grey level i; μT: mean intensity of the image.
The proposed method applies the convolution process using Sobel kernel, which is favourable for the next operation (image multi-level thresholding).Again the edge detection process is performed using log operator for extracting strong edges.

Image enhancement
To get a complete contour of segmented objects, dilation, erosion, closing, and region filling operations are applied on the binary image.
Dilation and erosion are the main morphological operations.Morphological image processing is a collection of non-linear operations related to shape or morphology of features in an image.Morphological operations rely only on the relative ordering of pixel values, not on their numerical values, and therefore are especially suited to processing binary images.
Suppose A is an image set and B a structuring element.Dilation and erosion operations are described as in Equations 10 and 11, respectively.
where, E: Euclidean distance = Z 2 , φ: for the empty set, Closing operation is defined as dilation followed by erosion, using the same structuring element for both operations.Region filling is based on set of dilations, complement, and intersections, as in the following equation: where, X 0 : first point inside the boundary; A c : the complement of set A; f = 1, 2, 3, … (El-dosuky et al., 2014).
The results of these operations are influenced by the size and shape of the structuring element B. In fact, the choice of the structuring element depends on the shape of the objects in the image.In our case, the suitable element is the 'disk-shaped' structuring element based on the circular shaped of the biological objects.To identify the objects, the dilation process is performed, followed by region filling to complete the pores of the objects.Furthermore, a median filter is applied to smooth the contour curve after the erosion operation.

Object recognition and localization
It may happen that two or more objects are connected in their borders -this effect has been neglected by many authors.The problem is that, if the number of connected objects is large with respect to the total number of objects, it may cause significant classification errors (Pons and Vivier, 1999).Therefore, we shall extract mutually connected objects using a Watershed Transform.
Watershed transformation belongs to a class of methods for region-based segmentation.The basic step of this transformation is to visualize an image (grey scale, or binary) into a topographic surface, which includes three notions: minima, catchment basins, and watershed lines, as shown in Figure 2.
Each regional minimum of this surface is represented as a catchment basin.The idea of watershed transformation is to pierce a hole in each minimum of catchment basin and allowing it to sink in a lake.The water would be inserted gradually from local minimum firstly until fill up different catchment basins in the image.When the rising water in different basins would risk merging, dams are built to prevent the merging of water coming from two different basins.These boundary dams correspond to the watershed lines that needed to divide (Lu and Ke, 2007).
There are various methods are used for the watershed transform, such as gradient, marker-controller and distance transform (Aayushi, 2014;Beucher, 1991).The gradient magnitude method is applied to gradient images instead of grey level ones (Belaid and Mourou, 2009;Jackway, 1996).This method has an over-segmentation problem due to noise in the image.
To solve this problem, the marker-controller watershed approach is introduced.It consists of markers to be defined as new minima of the gradient image objects.After the segmentation process, the watershed line for each object can be separated from its neighbors (Aayushi, 2014).
In the proposed method, the authors have chosen distance transform to perform the segmentation step.It calculates a distance map from a center point to the edges of the object.The centers are determined by performing multiple successive erosions with a suitable structuring element until all foreground regions of the image are eroded away.Then it fills that topological map with imaginary water.Where two watersheds meet a dam is built to separate them (Aayushi, 2014;Lu and Ke, 2007).The distance transform provides a metric of the separation of points in the image such as Euclidean, City Block, or Chessboard.
The chessboard metric has been used to measure the path between the pixels based on an 8-connected neighborhood.The pixels whose have edges or corners are touched, are one unit apart.The actual structuring element that should be used depends on which distance metric is chosen.A (3x3) 'square-shaped' element provides the chessboard distance transform.

Object classification
The object is recognized by computing connected component labelling operations after applying the previously described methodologies (Stefano and Bulgarelli, 1999).Then it is possible to determine the number of cells in each input image and then measure the concentration of cells during growth.After that, relevant parameters of each cell size can be obtained: area, perimeter, diameter, and radius.
The process of classification is unsupervised and based on two fundamental features: cell size, and cell texture.The area of the cell is determined from its radius.Also, cell texture has been chosen to provide a way for accurate classification, ensuring the effectiveness and reliability of the process.
The texture of an image is represented by the distributions and relationships between the grey levels of the image.The co-occurrence matrix is identified as an estimation matrix for joint probability of two pixels P(i, j).The dimension of this matrix is equal to the number of grey levels in the image (G) and its values are associated with the number of occurrences of pairs of grey levels (Figure 3).The matrix represents the most popular second-order statistical features for texture analysis (Malik et al., 2001).Indeed, from this matrix, we can compute the following properties: energy, entropy, contrast, homogeneity and correlation (Andrzej and Michal, 1998;Mihran and Anil, 1998).Mostly, energy is the most crucial factor used in our method, while the remaining factors do not properly differentiate between the objects due to the similarity of the particles' values within the image.The angular second moment (energy) is computed as:

Results
Two types of datasets are used to evaluate the proposed method.The first data set is a synthetic image, generated by a computer program.Synthetic images allow for a 'ground truth' for all cells -known location (centre) and radius of each cell.The presented method is applied on 5 different synthetic images, with the same dimensions, 1024 × 1024, and almost equal number of cells.The purpose of that is to calibrate and test the reliability of the proposed method.Real images of C. cohnii cells growing in a liquid environment comprise the second data set.Cell growth is tracked by SOPAT-Probe at 1h, 18h, and 42h. Figure 4 represents the output of each step in the experiment -as described in Figure 1.
Figure 4a represents the initial grey image.Figure 4b shows the first step -de-noising using Haar wavelet transform as a pre-processing step for the microalgae microscopic images.The initial image is decomposed into three levels, and the hard threshold is chosen for all levels.This eliminates noise effectively and also retains the edges information.The next step, image convolution by Sobel kernel, is shown in Figure 4c, and for better visualization, the image contrast is adjusted as shown in Figure 4d.
For the binary image, the multi-thresholding using Otsu's method is applied, followed by edge detection via log operator (Figure 5a).The log parameters are assigned as: t = 0.0005, σ = 2.5.In the resulting image, a clear identification of the cells could not be made.The enhancement process is started by closing and dilating the objects to identify the features of cells, as shown Figure 5b.Even if the dilation of the cells borders leads to connect previously unconnected cells the watershed technique would be able to overcome this issue.To improve identification, the holes of the particle are filled, as shown in Figure 5c.The resulting image again is not optimal since it contains a large amount of debris.This problem is solved by applying one of the morphological operations, erosion, with "disk-shaped" structuring element of size "10".In order to remove such debris and get the circular edges more smoothed, median filter technique is applied with a mask window (9×9), as shown in Figure 5d.
After that, the problem of connected objects, which are segmented as one component, must be faced.The objective is to find the area size of the individual object.To do so, distance transform chessboard is used along with watershed to segment mutually connecting objects.The result is shown in Figure 6.Now, the cell's size can be computed for each object separately.By computing an area histogram for the input images (1h, 18h, and 42h), size variation can be observed, showing a clear increase in cell concentration.Figure 7 shows the area histogram of the given images.
We found that the size parameter can be used to clearly differentiate cells.In our case, the cells with a radius greater than or equal to 29 are classified as active cells.It may happen by chance that a certain group of elements have a radius 30 or more, however some may not really be biological cells.Therefore, another set of parameters must be used for accurate classification.
Energy texture is then measured for the denoised image for better object classification.Objects which have energy greater than 0.12 are classified as active objects, otherwise they are classified as non-active.Overall, the classification process is set for radius >= 29 and energy >= 0.12.These threshold values were chosen based on try and error (Figure 8).Often, the interested objects are visualized as a circular shape.By utilizing the centre and radius (neighbourhood size) of each object, the energy of the object could be measured separately.
In Figure 8, the green squares represent active objects and blue ones represent non-active cells -rectangular squares are just the coloured border.
The accuracy percentage is given by calculating the average values of this metric for the input images, and is shown in Table 1.The method was applied to 5 different images with dimensions 2272 × 1704 and the accuracy values reached up to 99% for segmentation and 100% for classification.The results of this method have been compared with those obtained by Bittner et al. (1998) and Jiang et al. (2010).In Jiang et al. (2010), a strong method for microalgae segmentation was presented and reached excellent accuracy.However, it did not address the case of connected objects and was applied to only 2 real images of Nitzschia closterium Eh and Chlorella, which are not enough to assess the goodness of a segmentation method.The method presented in this paper has been applied to process 5 synthetic images and 5 real images.Results showed that it can handle the separation of connected and overlapped particles by applying the watershed transform, delivering good accuracy.The drawback of the watershed method is over-segmentation.Sometimes, it divides a perfect object into multiple ones and counts them as more than one object.Alternatively, a circular Hough transform may be used to segment the circular shapes and extract mutually connected objects, while preserving high accuracy.In Bittner et al. (1998), images of Saccharomyces cerevisiae were processed to examine

Discussion
In this work, a method for in-line identification of micro-organisms has been presented.The method is supported by Haar wavelets, as the powerful de-noising the cell size distribution.The authors achieved error percentage below 2%.The method proposed in this paper was also able to achieve the same goal with high accuracy and error percentage below 1%.
Whatever the segmentation method applied or the dataset used, the correct localization of the cells is an import element to allow for accurate classification.In fact, the size parameter has been the most important factor when differentiating between the cells.The method was capable of identifying small objects as non-active and other cells as active ones (biological cells).The radius of each cell has been computed separately after the segmentation step by using the properties extracted by the connected component labelling method.Energy texture property has also been an important feature for accurate classification.
One of the limitation of this work is that it must be applied to high-contrast images.Low-contrast images would not render satisfactory results.The proposed technique has been based on the basis of grey level intensity, while the edges are characterized by a high local change of the intensity within the image.Since low-contrast images have low gradient values, the process of edge extraction can not be performed properly.
Overall, the proposed method can reduce the processing time for in-line identification of microorganisms when compared to other methods described in the literature.The classification process used in this paper does not require a training data set (unsupervised) and is based on basic properties of segmented cells (texture and size), contributing to the reduction of processing time.In order to further improve processing time, and allow for in-line observation and control, the authors suggest the use of parallelization on different cores in a multi-core computer, which would contribute to decrease the required time mathematical processing.

Figure 1 .
Figure 1.Flow diagram of proposed method.

Figure 2 .
Figure 2. A topographic surface of an image -grey scale or binary one.

Figure 4 .
Figure 4. (a) Grey scale image of microalgae; (b) Denoised image of microalgae by Haar wavelet method; (c) Image convolution using Sobel kernel; and (d) Image enhancement.

Figure 5 .
Figure 5. (a) Binary image of microalgae; (b) Image after closing and dilation; (c) Image after filling holes; and (d) Image after erosion and median filter.

Figure 8 .
Figure 8. Image classification based on size and energy texture parameters -green squares represent active objects and blue ones represent the non-active cells.

Table 1 .
Accuracy percentage for input images.