Texture analysis of masses in digitized mammograms using Gleason and Menhinick diversity indexes

Rocha, Simara Vieira da; Braz Junior, Geraldo; Silva, Aristófanes Corrêa; Paiva, Anselmo Cardoso de

doi:10.4322/rbeb.2014.008

Abstract

INTRODUCTION: Breast cancer is the second most common type of cancer in the world, being more common among women and representing 22% of all new cancer cases every year. The sooner it is diagnosed, the better the chances of a successful treatment are. Mammography is one way to detect non-palpable tumors that cause breast cancer. However, it is known that the sensitivity of this exam can vary considerably due to factors such as the specialist's experience, the patient's age and the quality of the images obtained in the exam. The use of computational techniques involving artificial intelligence and image processing has contributed more and more to support the specialists in obtaining a more precise diagnosis. METHODS: This paper proposes a methodology that exclusively uses texture analysis to describe features of masses in digitized mammograms. To increase the efficiency of texture feature extraction, the diversity index's capability to detect patterns of species co-occurrence is used. For this purpose, the Gleason and Menhinick indexes are used. Finally, the extracted texture is classified using the Support Vector Machine, looking to differentiate the malignant masses from the benign. RESULTS: The best result was obtained using the Gleason index, with 86.66% accuracy, 90% sensitivity, 83.33% specificity and an area under the ROC Curve (Az) of 0.86. CONCLUSION: Both indexes showed statistically similar performance; however, the Gleason index was slightly superior.

Keywords: Breast cancer; Medical images; Gleason Diversity Index; Menhinick Diversity Index; Computer-aided diagnosis

ORIGINAL ARTICLE

Simara Vieira da Rocha* (e-mail: simara@deinf.ufma.br); Geraldo Braz Junior; Aristófanes Corrêa Silva; Anselmo Cardoso de Paiva

Applied Computing Group - NCA, Federal University of Maranhão - UFMA, Av. dos Portugueses, s/n, Campus do Bacanga, São Luís, MA, Brasil

Introduction

Breast cancer is the second most common type of cancer in the world, being more common among women and representing 22% of all new cancer cases every year. The occurrence of this type of cancer has grown 3.1% per year (American..., 2011). Mammography is one way of detecting non-palpable tumors that cause breast cancer. On average, it detects 80% to 90% of breast cancers in asymptomatic women (American..., 2011). Since this exam began being used as a routine check-up, a reduction in the mortality rate related to this pathology has been observed. This can be largely explained by a mammogram's ability to detect the cancer in its initial stage, when treatment can be more efficient and a cure is more likely. However, it is known that the sensitivity of this exam can vary considerably due to factors such as the specialist's experience, the patient's age and the quality of the images obtained in the exam.

The use of computational techniques of artificial intelligence and image processing has increasingly helped specialists obtain more precise diagnoses. As a result, over the last decade, interest in systems for detection (Computer-Aided Detection - CAD) and diagnosis (Computer-Aided Diagnosis - CADx) has grown, in order to support radiologists in interpreting mammograms (Kinoshita et al., 2004).

One of the visibly detectable anomalies in mammographic images is the cell masses, or clusters of cells that create a denser band than the surrounding tissue. Masses can be caused by both benign and malignant conditions. Therefore, characterizing mass features such as size, shape and margin disposition is fundamental in establishing the probability of malignancy (Kopans, 2000).

Because texture is an attribute that is hard for the human analyst to interpret, it is common to use features of the outlines of the masses to diagnose these regions. However, such features are not always clear in these exams. For example, there can be lesions without a well-defined outline that can overlap regions of masses and calcifications, preventing their proper visualization. This difficulty contributes to an increase in the number of biopsies with negative results. Therefore, the development of techniques of texture feature extraction can help specialists produce more precise diagnoses. Many studies have been developed to perform the texture analysis, aiming to differentiate suspicious regions in mammography exams in order to establish their benign or malignant behavior. In this paper, several of these systems will be described along with their respective techniques.

One widely used strategy for extracting texture features is the gray level co-occurrence matrix (GLCM), whose features can be described using the measures of Haralick et al. (1973). Various works adopt this technique, including Lim and Er (2004) and Rangayyan et al. (2010). In Mavroforakis et al. (2006), gray level run length matrixes (GLRLM) were used along with the GLCM matrixes as a texture descriptor. In the method proposed in this work, we use the GLCM, GLRLM and gray level gap length (GLGLM) matrixes, though only as a way of representing the region of interest (ROI), as will be explained in the methodology.

The feature extraction stage can be performed both by texture analysis and by mass geometry. However, given the difficulty of differentiation between benign and malignant patterns, many studies usually combine texture features and geometry to perform this task, such as Sahiner et al. (2001), Shi et al. (2007), Suganthi and Madheswaran (2010), Varela et al. (2006), Mu et al. (2008) and Liu et al. (2010, 2011).

In Table 1, a summary of related works is presented, containing the technique used to extract texture, the classifier used, the adopted image base, the number of test samples (with M = malignant and B = benign) and the results in percentages.

Here, we see a necessity for developing a technique with exclusive usage of texture analysis for feature extraction to analyze mammographic images with ill-defined mass outlines to determine malignancy.

The goal of this paper is to propose a methodology to differentiate between malignancy and benignancy using the texture of masses in digitized mammograms using Gleason and Menhinick diversity indexes over regions of interest, represented in the form of gray level co-occurrence matrixes (GLCM), gray level run length matrixes (GLRLM) and gray level gap length matrixes (GLGLM). Furthermore, the Support Vector Machine is used to discriminate whether the features produced by the masses should be in the malignant or benign classes.

Methods

The steps of the proposed methodology for differentiation between the benign and malignant classes in digitized mammography masses, using texture characterization through the diversity indexes of Gleason and Menhinick, are presented in Figure 1. They are Image Acquisition, Pre-processing, Image Representation, Feature Extraction and Pattern Recognition.

Image acquisition

The first stage of the methodology was dedicated to obtaining the mammographic images that were used in the tests. For this purpose, we used the Digital Database for Screening Mammography (DDSM), a public database of digitized mammograms available on the Internet (Heath et al., 2001). Because the focus of this research was to characterize the mass textures through diversity indexes and determine their malignant nature, we did not use the complete mammogram image. For the sample selection, we adopted the same approach used by Braz et al. (2009). With this approach, from the markings made by mammography specialists, we extracted the bounding box (minimum area-enclosing rectangle) containing only regions that have masses, yielding a total of 3559 ROIs.

For the tests performed in this project, a subset of 300 ROIs was used, all chosen at random, totaling 160 malignant masses and 140 benign masses. Although this number of samples does not represent the total set of samples in the DDSM database, it does not compromise the generalization capacity of the classifier because the selection is performed at random and there is not a significant imbalance between the two groups. In addition, it allows comparison with other works that use the same database and sample size.

Pre-processing

The goal of this stage is to improve the contrast of the object of interest in relation to the background that may exist in the ROIs and consequently provide a better description of their texture. To this end, we use the logarithmic transformation and an average filter (Gonzalez and Woods, 2002).

The logarithmic transformation is defined by the equation:

where g_t(x, y) is the new value of gray level at the point (x, y), g(x, y) is the original gray level value, and G is the factor defined by the maximum and minimum limits of the image. This factor aims to guarantee that the new values are between zero and the maximum gray level allowed for the image representation. In this work, G will be equal to 255 because the ROIs have 8 bits per pixel.

The logarithmic transformation is used to give more relevance to dark gray levels, which are rarer in mass images (ROIs) of mammograms (lower frequency of occurrence). This way, when we apply the logarithmic transformation in an ROI, the darker gray levels are grouped to increase their quantitative importance. This process is illustrated in Figure 2.

In applying the logarithmic transformation, the noise in the ROI is amplified (Figure 2b); therefore, we use an average filter with a window size of 5×5 to soften the noise, smoothing the ROI.
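The two pre-processing steps can be sketched as follows. This is only an illustration: it assumes the normalized form g_t = G · log(1 + g) / log(1 + G), which is one common variant of the logarithmic transformation that keeps the output between zero and G and is not necessarily the exact expression used by the authors. The function names and the edge-replication border handling are also assumptions made for the example.

```python
import math


def log_transform(roi, G=255):
    """Normalized log transform (assumed form): maps [0, G] onto [0, G],
    spreading out the dark gray levels."""
    scale = G / math.log(1 + G)
    return [[round(scale * math.log(1 + g)) for g in row] for row in roi]


def mean_filter(roi, size=5):
    """Average filter with a size x size window.
    Borders are handled by replicating the edge pixels (an assumption)."""
    h, w = len(roi), len(roi[0])
    r = size // 2
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0
            for dy in range(-r, r + 1):
                for dx in range(-r, r + 1):
                    yy = min(max(y + dy, 0), h - 1)  # clamp to image rows
                    xx = min(max(x + dx, 0), w - 1)  # clamp to image columns
                    total += roi[yy][xx]
            row.append(round(total / (size * size)))
        out.append(row)
    return out


def preprocess(roi):
    """Log transform followed by 5x5 smoothing, in the order described above."""
    return mean_filter(log_transform(roi), size=5)
```

With this form, the extremes of the 8-bit range are preserved (0 stays 0 and 255 stays 255), while a dark level such as 10 is moved to about 110.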

Image representation

This stage was introduced in the proposed methodology because the concept of ecological diversity was adopted as the technique for feature extraction. Thus, after the image enhancement performed in the previous stage, the image was represented using first-order statistics, so that the diversity indexes could be calculated from the ROI's histogram (Gonzalez and Woods, 2002); second-order statistics, through the GLCM matrix (Haralick et al., 1973); and higher-order statistics, using the GLRLM (Galloway, 1975) and GLGLM (Xinli et al., 1994) matrixes.

Feature extraction

This stage aims to produce descriptive measures for the images, which will form the feature vectors that will be used in the classifying stage. In this work, texture analysis was performed using the statistical approach in Gonzalez and Woods (2002), through adapting the concept of the Ecological Diversity index.

Ecological diversity indexes

In ecology, the term 'diversity' is used to refer to the variety of species present in a community, habitat or region. A community is defined as a set of species that occur in a specific time and place (Magurran, 2004). The use of indexes, even though they do not represent the total composition of a community, makes it possible to measure the richness, evenness and diversity of species in the different studied environments. This tool is useful to monitor and predict environmental changes.

The concept of diversity involves two parameters: richness, which represents the number of species; and relative abundance, which represents the number of individuals of a particular species that occur in a place or sample. This way, communities with the same richness can differ in diversity depending on the distribution of individuals among species (McIntosh, 1967). In this paper, we used the Gleason and Menhinick indexes.

The Gleason Diversity Index considers only the number of species (s) and the logarithm (base 10 or natural) of the total number of individuals (Brower et al., 1997). This index is defined by the equation:

where s is the number of sampled species and N is the total number of individuals in all species.

Another diversity index used in this paper is that of Menhinick (1964), which considers only the number of species (s) and the square root of the total number of individuals and is calculated by the equation:

$$D_{Mn} = \frac{s}{\sqrt{N}}$$

where s is the number of sampled species, and N is the total number of individuals in all species.
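Both indexes depend only on s and N, so they reduce to one-line functions. The sketch below is illustrative: the Menhinick index is s/√N, while for the Gleason index it assumes the form (s - 1)/ln N that is commonly attributed to Gleason, with the natural logarithm; the text above allows base 10 as well, and the authors' exact expression may differ. The function names are hypothetical.

```python
import math


def gleason(s, N):
    """Gleason index, ASSUMED here to be (s - 1) / ln(N)."""
    if N <= 1:
        return 0.0  # ln(N) is zero or undefined; treat as no diversity
    return (s - 1) / math.log(N)


def menhinick(s, N):
    """Menhinick index: s / sqrt(N)."""
    if N <= 0:
        return 0.0  # empty community
    return s / math.sqrt(N)
```

For example, a community with s = 10 species and N = 100 individuals has a Menhinick index of 10/√100 = 1.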

Adaptation of the concept of ecological diversity

In order to adapt the concept of ecological diversity for our purposes, two abstractions are used. The first considers that a community is formed by the pixels of the ROI (each pixel is an individual, and its value defines the species). The second considers that the community is defined by the internal elements of the calculated co-occurrence matrixes, and the species are formed by the values of those elements. This way, we can investigate whether, within the ROIs, there is a dominance of some gray levels over others. Independent of the index, the procedure to extract the features remains the same.

First, the highlighted samples were quantized from 256 to 128, 64, 32, 16 and 8 gray levels, considering each ROI as a community of 256, 128, 64, 32, 16 and 8 species. Along with the quantization, we looked to reproduce the ROIs' representations in different scales of gray levels in such a way to make the description of the texture in those scales possible. After that, these ROIs had their diversity indexes calculated.

Through the histogram of the ROI, we registered the frequency of each gray level (species). This made it possible to extract the richness of species (s) as the number of non-null entries (bins) of the histogram, as well as the abundance of each species as the value of each bin. The resulting feature vector has 6 values because we calculated the diversity for each quantization.
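A minimal sketch of this histogram-based extraction is given below. It assumes that quantization is a uniform integer division of the 8-bit gray levels and uses the Menhinick index as the example index; the function names are hypothetical and the ROI is a plain list of pixel rows.

```python
import math
from collections import Counter


def menhinick(s, N):
    """Menhinick index: s / sqrt(N)."""
    return s / math.sqrt(N)


def quantize(roi, levels, max_levels=256):
    """Uniformly requantize gray levels to `levels` bins (assumed scheme)."""
    step = max_levels // levels
    return [[g // step for g in row] for row in roi]


def histogram_diversity(roi, index):
    """Species = gray levels, individuals = pixels."""
    counts = Counter(g for row in roi for g in row)
    s = len(counts)           # richness: number of non-empty histogram bins
    N = sum(counts.values())  # total number of pixels
    return index(s, N)


def histogram_features(roi, index=menhinick):
    """One diversity value per quantization: 6 features in total."""
    return [histogram_diversity(quantize(roi, q), index)
            for q in (256, 128, 64, 32, 16, 8)]
```

For a toy ROI such as `[[0, 1, 2, 3], [64, 65, 128, 255]]`, all 8 pixels are distinct species at 256 levels, but only 4 species remain at 8 levels, so the index drops as the quantization gets coarser.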

The idea of using the GLCM matrix (Haralick et al., 1973) as a way of representing the ROI was to verify the diversity of the dominance of some pairs of gray levels over others. Two approaches are used to represent the community of gray level pairs. In the first, the individuals correspond to the occurrence of a pair of pixels (i, j) with the same value (gray level) separated by a distance, d, and positioned in a direction θ. The population of each species is represented in the main diagonal of the GLCM matrix (Figure 3b). In the second, we consider the occurrence of members of the community as pairs of pixels (i, j) with different values. This way, the elements outside the main diagonal of the GLCM matrix represent the population of individuals (Figure 3c).

For the direction θ, we adopted the values 0°, 45°, 90° and 135°. For the distance d, the values were 1, 2, 3, 4 and 5. Because one GLCM is needed for each combination of θ and d, and 6 quantizations were considered, the generated feature vector had 120 texture attributes (5 distances × 4 directions × 6 quantizations).
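The two GLCM-based communities can be sketched as follows. This is one reading of the description above, not the authors' implementation: each non-zero cell of the matrix is treated as a species, its count as the population of that species, and the GLCM is built non-symmetrically. The offsets table and function names are assumptions made for the example.

```python
# (row step, column step) for one unit of distance in each direction
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glcm(roi, levels, d, theta):
    """Non-symmetric gray level co-occurrence matrix for distance d, angle theta."""
    dy, dx = OFFSETS[theta]
    dy, dx = dy * d, dx * d
    h, w = len(roi), len(roi[0])
    m = [[0] * levels for _ in range(levels)]
    for y in range(h):
        for x in range(w):
            yy, xx = y + dy, x + dx
            if 0 <= yy < h and 0 <= xx < w:
                m[roi[y][x]][roi[yy][xx]] += 1
    return m


def community(m, part):
    """Return (s, N) for the 'diagonal' or 'off_diagonal' part of the matrix.

    s = number of non-zero cells (species), N = sum of those cells (individuals).
    """
    want_diagonal = (part == "diagonal")
    cells = [m[i][j]
             for i in range(len(m))
             for j in range(len(m))
             if (i == j) == want_diagonal]
    s = sum(1 for c in cells if c > 0)
    N = sum(cells)
    return s, N
```

The pair (s, N) returned by `community` can then be passed to either diversity index. Repeating this for 5 distances, 4 directions and 6 quantizations gives the 120 attributes mentioned above.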

The goal of using the GLRLM matrix (Galloway, 1975) is to analyze whether there is a predominance of relatively long run lengths in relation to the short runs or vice-versa. Therefore, the community was formed by the occurrence of consecutive and collinear sequences of n pixels of the same value and a direction θ. This scheme is presented in Figure 4.

The use of diversity indexes with the GLGLM matrix (Xinli et al., 1994) seeks to investigate whether a mass presents, in general, a more homogeneous texture than another mass. A mass with a higher concentration of homogeneous neighbors suggests low diversity, whereas one with a lower concentration of homogeneous neighbors is likely to have high diversity. This way, a community is composed of pixels with intensity i when this pixel is found only in the beginning and the end of a sequence of consecutive and collinear pixels in a direction θ (Figure 5).

For both the GLRLM matrix and the GLGLM matrix, the values adopted for θ were 0°, 45°, 90° and 135°. In both cases, because one matrix is necessary for each direction and six quantizations were considered, the resulting feature vector had 24 variables (4 directions × 6 quantizations).
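As an illustration of the run-length representation, the sketch below builds a GLRLM for one direction and derives a community from it. It assumes that each non-zero cell (gray level, run length) is a species and that the number of runs in that cell is its population; this is an interpretation of the text, and the names are hypothetical. The GLGLM is not covered here.

```python
# (row step, column step) for each direction
OFFSETS = {0: (0, 1), 45: (-1, 1), 90: (-1, 0), 135: (-1, -1)}


def glrlm(roi, levels, theta):
    """Gray level run length matrix: m[g][k - 1] = number of runs of level g
    with length k along direction theta."""
    dy, dx = OFFSETS[theta]
    h, w = len(roi), len(roi[0])
    m = [[0] * max(h, w) for _ in range(levels)]
    for y in range(h):
        for x in range(w):
            g = roi[y][x]
            py, px = y - dy, x - dx
            # Skip pixels that continue a run started by a previous pixel
            if 0 <= py < h and 0 <= px < w and roi[py][px] == g:
                continue
            length, yy, xx = 0, y, x
            while 0 <= yy < h and 0 <= xx < w and roi[yy][xx] == g:
                length += 1
                yy, xx = yy + dy, xx + dx
            m[g][length - 1] += 1
    return m


def run_community(m):
    """Return (s, N): non-zero cells as species, total runs as individuals."""
    cells = [c for row in m for c in row]
    return sum(1 for c in cells if c > 0), sum(cells)
```

One such matrix per direction and per quantization yields the 4 × 6 = 24 values of the feature vector.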

Pattern recognition

To analyze whether the produced features differentiate between benign and malignant patterns, a pattern recognition stage was included in the proposed methodology, as detailed next.

Support Vector Machine

To validate the proposed methodology and classify the masses as benign or malignant, we used the Support Vector Machine (SVM) (Vapnik, 1998). This technique has performed well when applied to image processing of mammograms, especially to distinguish patterns of the mammogram in mass or normal tissue, as reported in Braz et al. (2009), Carvalho et al. (2012) and Martins et al. (2010). Previously, in Rocha et al. (2012), the SVM was used successfully for diagnosing breast regions as benign and malignant.

Broadly, given a set of training samples (x_i, y_i), i = 1, ..., m, where x_i ∈ ℝⁿ is the input vector and y_i ∈ {-1, +1} is the correct classification of sample i, the aim of the classification is to estimate a function f : ℝⁿ → {±1} that correctly separates the test samples into distinct classes. Each sample x is mapped to a feature space of higher dimension through the transformation function z = Φ(x). The hypothesis is that, in this new space, the samples can be discriminated by a linear function. The decision function can then be written in terms of a kernel function K(x, y), which computes the inner product Φ(x) · Φ(y) (Haykin and Engel, 2001). Therefore, the SVM decision function is represented by:

$$f(x) = \operatorname{sign}\left(\sum_{i=1}^{m} \alpha_i y_i K(x, x_i) + b\right)$$

where K(x, x_i) = Φ(x) · Φ(x_i), and the coefficients α_i and the variable b are obtained by solving a quadratic optimization problem (in practice through its dual), stated as:

$$\min_{w, b, \xi} \; \frac{1}{2} \|w\|^2 + C \sum_{i=1}^{m} \xi_i \quad \text{subject to} \quad y_i \left(w \cdot \Phi(x_i) + b\right) \ge 1 - \xi_i, \quad \xi_i \ge 0$$

where C > 0 is a parameter chosen by the user, which corresponds to the penalty for classification errors, and ξ_i are the slack variables that penalize training errors.

SVM can produce very complex separation boundaries. To do so, it uses a kernel function, which implicitly maps the data from the original feature space to a space of higher (possibly infinite) dimension, where a separating hyperplane can be traced. In this paper, we investigated the linear kernel (x_i^T x_j) and the Radial Basis Function (RBF) kernel (exp(-γ ||x_i - x_j||²)). The goal was to analyze which of them better discriminates the studied pattern.
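The two kernels, and the way a trained SVM uses them, can be written in a few lines. The sketch assumes the standard kernel form of the decision function, f(x) = sign(Σ α_i y_i K(x, x_i) + b); it does not train anything, and the support vectors, coefficients and bias passed to `decide` are hypothetical inputs that a library such as LIBSVM would provide.

```python
import math


def linear_kernel(xi, xj):
    """K(xi, xj) = xi^T xj"""
    return sum(a * b for a, b in zip(xi, xj))


def rbf_kernel(xi, xj, gamma):
    """K(xi, xj) = exp(-gamma * ||xi - xj||^2)"""
    sq_dist = sum((a - b) ** 2 for a, b in zip(xi, xj))
    return math.exp(-gamma * sq_dist)


def decide(x, support_vectors, alphas, labels, b, kernel):
    """Sign of sum_i alpha_i * y_i * K(x, x_i) + b (returns +1 or -1)."""
    score = sum(a * y * kernel(x, sv)
                for a, y, sv in zip(alphas, labels, support_vectors))
    return 1 if score + b >= 0 else -1
```

For the RBF kernel, γ can be fixed with a small wrapper, for example `lambda u, v: rbf_kernel(u, v, 0.5)`, before passing it to `decide`.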

In this paper, to perform the experiments, several proportions were adopted for dividing the samples into training and test bases for the SVM: 50/50, 60/40, 70/30 and 80/20. For each proportion, the test was repeated 5 times with a new random selection. Because the selection is random, each experiment had its own estimates of the cost parameter (C) and, when the RBF kernel was chosen, of the parameter γ, which controls the complexity of the mapping function.
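The experimental protocol (four proportions, five random repetitions each) can be sketched as below. The sketch only generates the splits; the training and parameter estimation that would be run on each split are left out. It assumes a plain random shuffle without stratification, and the function names and the fixed seed are choices made for the example.

```python
import random


def random_split(samples, train_fraction, rng):
    """Shuffle a copy of the samples and cut it at the training fraction."""
    shuffled = samples[:]
    rng.shuffle(shuffled)
    cut = round(train_fraction * len(shuffled))
    return shuffled[:cut], shuffled[cut:]


def experiment_plan(samples, fractions=(0.5, 0.6, 0.7, 0.8),
                    repetitions=5, seed=0):
    """Return one (fraction, repetition, train, test) tuple per experiment."""
    rng = random.Random(seed)  # fixed seed only to make the sketch repeatable
    plan = []
    for fraction in fractions:
        for rep in range(repetitions):
            train, test = random_split(samples, fraction, rng)
            plan.append((fraction, rep, train, test))
    return plan
```

With 300 ROIs this produces 20 experiments, and the 80/20 proportion leaves 60 ROIs for testing in each repetition.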

The goal was to analyze whether the accuracies, in all repetitions, behave in a similar way, evidencing how the approach represents the texture pattern in the samples of benign and malignant masses. We used the implementation of the SVM available in the LIBSVM library (Chang and Lin, 2011) to conduct this stage.

Result validation

To measure the performance of the proposed methodology, we calculated the following statistics from the test results: accuracy (A), sensitivity (S) and specificity (Sp). Accuracy is defined as (TP + TN)/(TP + FP + TN + FN), sensitivity as TP/(TP + FN) and specificity as TN/(TN + FP), where TP, TN, FP and FN are the numbers of true positives, true negatives, false positives and false negatives, respectively (Bland, 2000).
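These three statistics follow directly from the confusion counts. In the sketch below, the counts used in the usage note are hypothetical: they were chosen for illustration because they are consistent with a 60-sample test set, and they are not taken from the paper.

```python
def metrics(tp, tn, fp, fn):
    """Return (accuracy, sensitivity, specificity) from confusion counts."""
    accuracy = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)   # fraction of malignant masses detected
    specificity = tn / (tn + fp)   # fraction of benign masses recognized
    return accuracy, sensitivity, specificity
```

For instance, `metrics(tp=27, tn=25, fp=5, fn=3)` gives an accuracy of 52/60, a sensitivity of 27/30 and a specificity of 25/30, that is, roughly 86.7%, 90% and 83.3%.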

Finally, we also evaluated the classifiers by analyzing the Receiver Operating Characteristic (ROC) curve, which relates the sensitivity and specificity of the classifier (Mazurowski et al., 2008).

The index used by this paper in the analysis of the ROC curve was the Az, which represents the area under the ROC curve. The closer the Az index is to 1, the better the discrimination performed by the classifier between benign and malignant classes in the test samples. When the index Az equals 0.5, it means that the classifier could not differentiate between the benign and malignant classes (Brown and Davis, 2006).
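One simple way to compute the area under the ROC curve from classifier scores is the rank-based (Mann-Whitney) estimate: the fraction of malignant/benign pairs in which the malignant sample receives the higher score, counting ties as one half. This is a generic sketch of that estimate, not necessarily the procedure used to obtain the Az values reported here; the function name and inputs are hypothetical.

```python
def area_under_roc(scores_malignant, scores_benign):
    """Rank-based estimate of the area under the ROC curve (Az)."""
    wins = 0.0
    for m in scores_malignant:
        for b in scores_benign:
            if m > b:
                wins += 1.0      # pair ranked correctly
            elif m == b:
                wins += 0.5      # tie counts as half
    return wins / (len(scores_malignant) * len(scores_benign))
```

A value of 1 means every malignant sample scores above every benign one, while 0.5 corresponds to a classifier that cannot separate the classes.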

Results

The results obtained by applying the proposed methodology are presented below. As described in the Methods section, for each proportion of training and testing, 5 repetitions were performed. The parameters used for each test were estimated using genetic methods of selection, implemented in the reference library used. The parameters are unique for each test and represent the mapping function of the feature vectors for support vectors. Thus, due to the random selection of the training and test bases, the parameters cannot be reused. We also present the average accuracy, sensitivity and specificity of each proportion with their standard deviations, as well as the best result obtained by the experiment.

Table 2 presents the results produced by the experiment using the Gleason index. The approach that presented the best result was the GLCM from the representation of the ROI performed by the main diagonal of the matrix, combined with the RBF kernel in 80/20 proportion, presenting an average accuracy of 84.33%. At this proportion, the best result was 86.66% accuracy, 90% sensitivity, 83.33% specificity and an area Az of 0.86.

The results obtained using the Menhinick Index are listed in Table 3. In this configuration, the technique with the best results was GLCM using the whole matrix to represent the ROI and RBF kernel but at the proportion 50/50, with an 83.33% average of accuracy. The best result generated at this proportion had 85.33% accuracy, 81.71% sensitivity, 89.70% specificity and an area Az of 0.85.

Discussion

As we found from the obtained results, the two indexes present very similar performances; however, in general, the Gleason index was slightly superior, if we consider the greater average of accuracy.

By analyzing all results produced in both experiments, we noticed that the best performance of each index was obtained through the combination of RBF kernel with GLCM matrixes. In the Gleason index, the standard deviations of the attained average accuracies were 2.02% and 0.44% for diagonal GLCM and whole matrix, respectively. In the Menhinick Index case, the standard deviations of the averages were 0.99% and 0.91%, respectively, for the diagonal GLCM and whole matrix. We can observe that, in both situations, there were no discrepancies in the averages. This shows that the results behave in a similar way, evidencing that the tested approaches represent well the texture pattern of the samples of benign and malignant masses.

To have a more detailed analysis of the results produced by this work, we chose, at random, 5 malignant ROIs and 5 benign ROIs from the set of samples used in the tests of the DDSM database.

As we can observe from Figures 6a and b, the masses visually have a spiculated outline, indicating a high probability of malignancy. Figures 7a and b have a regular outline, suggesting a high probability of benignancy. However, from visual analysis alone of Figures 6c, d and e and Figures 7c, d and e, the specialist cannot make a precise diagnosis, because those samples do not have well-defined outline features and have similar textures.

Given the difficulty of differentiation between malignant and benign patterns of the masses, the ideal for a more precise classification is to combine geometry and texture features to perform this task. However, the proposed method shows good results if compared to related works (Table 1), even using only texture analysis to describe the masses.

To provide evidence of the relevance of the proposed method, from the results presented in Table 2 and Table 3, graphs were generated for some of the produced texture features, in the best and worst case of each diversity index using the RBF kernel, for all samples in Figure 6 and Figure 7.

The worst result presented by the Gleason index was for the feature extraction performed from the diversity calculated by the histogram. This can be verified by the graph in Figure 8a, where the 6 produced texture features (one for each quantization), for both the malignant samples and the benign, present many values in the same range, making the differentiation of the sample classes difficult by the classifier.

Figure 8b shows the graph for the best result produced by the Gleason index, obtained through the GLCM matrix with the ROI represented by the main diagonal of the matrix. In the construction of the graph, 10 features were used (1 quantization × 5 distances × 2 directions). The quantization used 8 gray levels, distances d = 1, 2, 3, 4 and 5, and θ = 0° and 45°. We can observe that, for each line in the graph, this form of representation of the ROI produces the majority of the features of the malignant and benign classes in different value ranges, making it possible for the classifier to reach an average accuracy of 84.33%.

With the RBF kernel, the Menhinick index, like the Gleason index, also presented its worst results for the feature extraction performed via the diversity calculated from the histogram (Figure 9a). Its best results, however, were obtained through the diversity calculation with the GLCM approach, using the whole matrix to represent the ROI. In constructing the graphs of Figures 9a and b, we followed the same criteria used for the graphs referring to the Gleason index.

The overlap among the values of the malignant and benign samples shown in the graphs of Figure 8b and Figure 9b happens because the two techniques produce some texture features within the same range of values. This is possibly explained by the fact that the two researched indexes take only two parameters into consideration in the diversity calculation: the number of species and the total number of individuals. Factors such as the size of samples and the weight given to rare species are not considered in the calculation of those indexes. However, as previously demonstrated, the textures of malignant and benign masses are similar; therefore, such factors can be decisive for a better discrimination between the mass classes. It is thus necessary to investigate other diversity indexes in order to test this supposition.

Comparing results produced by this work with related ones (presented in the introduction) was not a simple task because, as previously discussed, the works present different methodologies, image bases and number of samples used in the experiments. However, by analyzing Table 1, it is possible to establish some conclusions. The first is that, from approaches that extract features only via texture analysis, the performance of the methodology proposed in this work is, in general, superior to the results presented by other works. Even when comparing the results of this work with those that combine texture and geometry features, we can observe that the results generated here are superior to the majority of the related works, outlining that the proposed methodology is quite promising.

Another point that deserves attention is that, in this work, a detailed analysis of the results shows the average accuracy, sensitivity, and specificity obtained. In other words, not only isolated results are considered. This way, it is possible to verify the consistency of the produced results for discriminating between malignant and benign patterns.

Overall, although the two indexes presented statistically similar performances, the Gleason index was slightly superior. The Gleason diversity index, combined with the approach of the GLCM matrix, resulted in 86.66% accuracy, 90% sensitivity, 83.33% specificity and an area Az of 0.86. However, it is still necessary to not only perform more tests but also investigate the performance in other bases of regions of interest, as well as other diversity indexes, in order to perform a more detailed analysis of the proposed methodology.

Acknowledgements

The authors acknowledge CAPES, CNPq and FAPEMA for the financial support.

Received: 9 July 2013

Accepted: 9 December 2013

American Cancer Society - ACS. Learn about breast cancer. [cited 2011 nov 17]. Available from: http://www.cancer.org
Bland M. An Introduction to medical statistics. 3rd ed. Oxford University Press; 2000.
Braz JG, Silva AC, Paiva AC, Oliveira ACM. Classification of breast tissues using Moran's index and Geary's coefficient as texture signatures and SVM. Computers in Biology and Medicine. 2009;39:1063-72. PMid:19800057. http://dx.doi.org/10.1016/j.compbiomed.2009.08.009
Brower JE, Zar JH, Ende CV. Field and laboratory methods for general ecology. 4th ed. Dubuque: McGraw-Hill Science; 1997.
Carvalho PMS, Paiva AC, Silva AC. Classification of breast tissues in mammographic images in mass and non-mass using McIntosh's Diversity Index and SVM. In: MLDM 2012: Proceedings of the 8th International Conference on Machine Learning and Data Mining in Pattern Recognition - LNCS; 2012; Berlin. Berlin: MLDM; 2012. p. 482-94.
Chang C, Lin C. LIBSVM: A library for support vector machines. ACM Transactions on Intelligent Systems and Technology. 2011;2(3):27. [cited 2012 Oct 6]. Available from: http://www.csie.ntu.edu.tw/cjlin/libsvm
Brown CD, Davis HT. Receiver operating characteristics curves and related decision measures: a tutorial. Chemometrics and Intelligent Laboratory Systems. 2006;80:24-38. http://dx.doi.org/10.1016/j.chemolab.2005.05.004
Galloway MM. Texture analysis using gray level run lengths. Computer Graphics and Image Processing. 1975;4:172-9. http://dx.doi.org/10.1016/S0146-664X(75)80008-6
Gonzalez RC, Woods RE. Digital image processing. 2nd ed. New York: Prentice Hall; 2002.
Haralick RM, Shanmugan K, Dinstein I. Textural features for image classification. IEEE Transactions on Systems, Man and Cybernetics. 1973;3(6):610-21. http://dx.doi.org/10.1109/TSMC.1973.4309314
Haykin S, Engel P. Redes neurais: princípios e prática. Rio de Janeiro: Bookman; 2001.
Heath M, Bowyer KW, Kopans D, Moore R, Kegelmeyer WP. DDSM: The Digital Database for Screening Mammography. In: International Workshop on Digital Mammography: Proceedings of the 5th International Workshop on Digital Mammography; 2001. Medical Physics Publishing, 2001. p. 212-8.
Kinoshita SK, Pereira RR, Honda MO, Rodrigue J, Marques PMA. An automatic method for detection of the nipple and pectoral muscle in digitized mammograms. In: Congresso Latino-Americano de Engenharia Biomédica - CLAEB 2004: Anais do Congresso Latino-Americano de Engenharia Biomédica; 2004 Sep 22-25; João Pessoa, Brazil. João Pessoa: International Federation for Medical and Biological Engineering; 2004. p. 1303-6.
Kopans DB. Imagens da mama. 2nd ed. Porto Alegre: Medsi; 2000.
Lim WK, Er MJ. Classification of mammographic masses using generalized dynamic fuzzy neural networks. Medical Physics. 2004;31(5):1288-95. http://dx.doi.org/10.1118/1.1708643
Liu X, Liu J, Zhilin F. Mass classification in mammography with morphological features and multiple kernel learning. In: iCBBE 2011: Proceedings of the 5th International Conference on Bioinformatics and Biomedical Engineering; 2011 May 10-12; Wuhan. Wuhan: IEEE; 2011. p. 1-4. http://dx.doi.org/10.1109/icbbe.2011.5780356
Liu X, Liu J, Zhou D, Tang J. A benign and malignant mass classification algorithm based on an improved level set segmentation and texture feature analysis. In: iCBBE 2010: Proceedings of the 4th International Conference on Bioinformatics and Biomedical Engineering; 2010 June 18-20; Chengdu. Chengdu: IEEE; 2010. p. 1-4. http://dx.doi.org/10.1109/ICBBE.2010.5518284
Magurran AE. Measuring biological diversity. New York: Wiley; 2004.
Martins LO, Silva AC, Paiva AC, Braz JG. Comparison of Support Vector Machines and Bayesian Neural Networks performance for breast tissues using geostatistical functions in mammographic images. International Journal on Computational Intelligence and Applications. 2010;9:271-88. http://dx.doi.org/10.1142/S1469026810002914
Mavroforakis ME, Georgiou HV, Dimitropoulos N, Cavouras D, Theodoridis S. Mammographic masses characterization based on localized texture and dataset fractal analysis using linear, neural and support vector machine classifiers. Artificial Intelligence in Medicine. 2006;37(2):145-62. PMid:16716579. http://dx.doi.org/10.1016/j.artmed.2006.03.002
Mazurowski MA, Habas PA, Zurada JM, Lo JY, Baker JA, Tourassi GD. Training neural network classifiers for medical decision making: The effects of imbalanced datasets on classification performance. Neural Networks. 2008;21:427-36. PMid:18272329 PMCid:PMC2346433. http://dx.doi.org/10.1016/j.neunet.2007.12.031
McIntosh RP. An index of diversity and the relation of certain concepts to diversity. Ecological Society of America. 1967;48:392-404.
Menhinick EF. A Comparison of some species-individuals diversity indices applied to samples of field insects. Ecology. 1964;45(4):859-61. http://dx.doi.org/10.2307/1934933
Mu T, Nandi AK, Rangayyan RM. Classification of breast masses using selected shape, edge-sharpness, and texture features with linear and kernel-based classifiers. Journal of Digital Imaging. 2008;21(2):153-69. PMid:18306000 PMCid:PMC3043867. http://dx.doi.org/10.1007/s10278-007-9102-z
Rangayyan RM, Nguyen TM, Ayres FJ, Nandi AK. Effect of pixel resolution on texture features of breast masses in mammograms. Journal of Digital Imaging. 2010;23(5):547-53. PMid:19756865 PMCid:PMC3046677. http://dx.doi.org/10.1007/s10278-009-9238-0
Rocha SV, Braz JG, Paiva AC, Silva AC. Uso da função K de Ripley e Máquina de Vetores de Suporte para diagnóstico de regiões da mama. In: XXIII CBEB 2012: Anais do XXIII Congresso Brasileiro de Engenharia Biomédica; 2012 Oct 1-5; Porto de Galinhas, Brazil. Porto de Galinhas; 2012. p. 1153-7.
Sahiner B, Chan HP, Petrick N, Helvie MA, Hadjiiski LM. Improvement of mammographic mass characterization using spiculation measures and morphological features. Medical Physics. 2001;28(7):1455-65. PMid:11488579. http://dx.doi.org/10.1118/1.1381548
Shi J, Sahiner B, Chan H, Ge J, Hadjiiski L, Helvie MA, Nees A, Wu Y, Wei J, Zhou C, Zhang Y, Cui J. Characterization of mammographic masses based on level set segmentation with new image features and patient information. Medical Physics. 2007;35(1):280-90. http://dx.doi.org/10.1118/1.2820630
Suganthi M, Madheswaran M. An improved medical decision support system to identify the breast cancer using mammogram. Springer Science Business Media. 2010;36:79-91.
Vapnik VN. Statistical learning theory. New York: Wiley-Interscience; 1998.
Varela C, Timp S, Karssemeijer N. Use of border information in the classification of mammographic masses. Physics in Medicine and Biology. 2006;51(2):425-41. PMid:16394348. http://dx.doi.org/10.1088/0031-9155/51/2/016
Xinli W, Albregtsen F, Foyn B. Texture features from gray level gap length matrix. In: Workshop on Machine Vision Applications: Proceedings of the Workshop on Machine Vision Applications; 1994 Dec 13-15; Kawasaki. Kawasaki; 1994. p. 375-8.

Publication Dates

Publication in this collection: 23 Apr 2014
Date of issue: Mar 2014

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.