SciELO - Scientific Electronic Library Online



Revista Brasileira de Engenharia Biomédica

version ISSN 1517-3151

Rev. Bras. Eng. Bioméd. vol.30 no.4 Rio de Janeiro Oct./Dec. 2014 



A systematic review on the evaluation and characteristics of computer-aided diagnosis systems



Vagner Mendonça GonçalvesI, *; Márcio Eduardo DelamaroII; Fátima de Lourdes dos Santos NunesI

ILaboratório de Aplicações de Informática em Saúde, Escola de Artes, Ciências e Humanidades, Universidade de São Paulo - USP, Rua Arlindo Bettio, 1000, CEP 03828-000, São Paulo, SP, Brazil
IIInstituto de Ciências Matemáticas e de Computação, Universidade de São Paulo - USP, Campus de São Carlos, São Carlos, SP, Brazil




INTRODUCTION: One of the challenges in developing Computer-Aided Diagnosis (CAD) systems is their accurate and comprehensive assessment. This paper presents the conduction and results of a systematic review (SR) that aims to verify the state of the art regarding the assessment of CAD systems. This survey provides a general analysis of the current status of the design, development and assessment of such systems and includes discussions on the most used metrics and approaches that could be utilized to obtain more objective evaluation methods.
METHODS: The SR was conducted using the scientific databases ACM Digital Library, IEEE Xplore Digital Library, ScienceDirect and Web of Science. Inclusion and exclusion criteria were defined and applied to each retrieved work to select those of interest. Of the 156 studies retrieved, 100 were included.
RESULTS: A number of abnormalities have been addressed in the development of CAD systems. Images from computed tomography and mammography are the most frequently encountered types of medical images. Additionally, a number of studies used public databases for CAD evaluations. The main evaluation metrics and methods applied to CAD systems include sensitivity, accuracy, specificity and receiver operating characteristic (ROC) analyses. Among the assessed CAD systems that used a segmentation method, 30.0% applied the overlap measure.
DISCUSSION: There remain several topics to explore for the assessment of CAD schemes. While some evaluation metrics are traditionally used, they require a prior knowledge of case characteristics to test CAD systems. We were not able to identify articles that use software testing to evaluate CAD systems. Thus, we realize that there is a gap between CAD assessments and traditional practices of software engineering. However, the scope of this research is limited to scientific and academic works and excludes commercial interests. Finally, we discuss potential research studies within this scope to create a more objective and efficient evaluation of CAD systems.

Keywords: CAD evaluation, Classification, Computer-aided diagnosis, Detection, Medical image, Segmentation.




Computer-Aided Diagnosis (CAD) schemes are computer systems aiming at providing second opinions to physicians to aid in diagnoses (Doi, 2007). These systems compute outputs based on information from diverse sources, primarily from medical images captured using various methods. According to van Ginneken et al. (2010), CAD has become the most active field of research in medical imaging. Further, Doi (2006) showed that CAD systems provide consistent interpretations of medical images to improve the precision of a diagnosis.

The assessment of CAD schemes is one of the major difficulties encountered in their development. Because results can vary depending on the set of images used, it is not an easy task to determine the effectiveness of a particular technique. To ascertain the feasibility of a technique, tests should be conducted with a set of images that preferably have varied acquisition characteristics. Additionally, this image set should meet the requirements of the purpose of the technique, i.e., the images should contain the structures sought by the computer system. This entails a collaborative effort with hospitals and clinics to perform detailed and thorough analyses to obtain appropriate medical images and their respective reports. Cataloging these images based on their characteristics allows for their fast and accurate retrieval.

This paper presents a systematic review (SR) aiming to verify the state of the art regarding the assessment of CAD systems. Additionally, we analyzed data from the included studies, such as the computational techniques employed in the development of CAD schemes, the abnormalities investigated and the most explored medical imaging modalities. Thus, this survey made possible a general analysis of the current scope regarding the design, development and assessment of such systems.

Overviews on the development and trends of CAD schemes have been presented by Doi (2007) and Shiraishi et al. (2009). Doi (2007) carried out a historical review regarding the development of CAD schemes. Examples of works that explored different modalities of medical imaging were also presented to aid in diagnosing several abnormalities, such as lung nodules, vertebral fractures and intracranial aneurysms. Doi also discussed the potential of such schemes for applications in clinical routines. Shiraishi et al. (2009) presented a review on the application of analyses using Receiver Operating Characteristic (ROC) curves to assess CAD systems. The review included studies published in Radiology Journal between 1997 and 2006. The review also analyzed the participation of human observers in the assessment processes and identified the most explored medical imaging modalities.

The systematic review presented in our study is distinguished by a set of characteristics that set it apart from these earlier overviews and support the relevance of its results, as listed below:

A comprehensive selection of scientific databases as reference sources, allowing access to diverse publications in the field (these databases were selected after carrying out an exploratory study that defined relevant research sources);

A definition and disclosure of an SR protocol that was strictly followed during the review (this protocol makes the review reproducible and allows the criteria used to be audited); and

An analysis of results taking into account each automated task to aid individual diagnoses (i.e., segmentation, detection and classification of abnormalities and regions of interest).

In addition to this introductory section, the paper is organized as follows. The "Methods" section presents concepts about the SR, the protocol used and the process of conducting the review. The "Results and discussion" section presents and discusses the results.



The systematic review is a rigorous methodology of bibliographic research that aims to identify primary and secondary studies related to a particular research topic. It permits the assessment and interpretation of all relevant research on a particular issue or topic of interest (Kitchenham, 2004).

According to Biolchini et al. (2007) and Kitchenham (2004), an SR is carried out in three well-defined phases: planning, conduction and analysis of results. In the planning phase, a protocol is defined specifying research questions and the methodology to be employed in the conduction of the review. Furthermore, this protocol defines purposes for the SR, reference sources, criteria for the inclusion or exclusion of primary studies, keywords and other topics of interest. In the conduction phase, the bibliographic research is carried out. In this phase, studies are selected according to the defined inclusion and exclusion criteria. Finally, in the analysis of results, the data extraction is performed, and the results are compared.

A major difference between an SR and the non-systematic review of the literature is the fact that the establishment of a protocol allows the SR to be reproduced and audited. Other researchers can follow the same protocol and assess the methods used for the case at issue (Biolchini et al., 2007). The following subsections describe each of the previously cited phases applied in the SR carried out in the present work.


First, research questions are defined:

What are the methodologies currently applied for the assessment of CAD systems based on medical imaging?

What are the modalities of medical imaging, abnormalities and computational techniques involved in the development of CAD systems?

What are the advantages and limitations presented by the assessment methodologies employed in CAD systems?

An exploratory analysis on CAD was carried out over several scientific databases. This preliminary survey aided in selecting reference sources and the definition of the keywords used in the SR. Based on our experiences with journals, we consulted databases that traditionally published articles on the subject. The following databases were selected:

ACM Digital Library (ACM)

IEEE Xplore Digital Library (IEEE)

ScienceDirect (SD)

Web of Science (WS)

Using the keywords defined in the protocol, queries for papers in journals or proceedings of scientific conferences were carried out in the selected databases. Only recent studies (published since 2006) were considered for assessing the state of the art of CAD.

The selection of acknowledged reference sources in the field made it possible to retrieve a significant number of studies on CAD. Moreover, defining adequate criteria for the inclusion and exclusion of works focused the sources to the relevant subject matter. A composition of terms was used to initially narrow the scope of the reference sources. These terms had to be present in the title, abstract or keywords to qualify a source (these indices were searchable by means of advanced search tools available for each database). The defined terms were as follows:

(evaluation OR testing OR assessment)

AND

("computer-aided diagnosis system" OR "computer-aided diagnosis scheme" OR "computer-assisted diagnosis system" OR "computer-assisted diagnosis scheme" OR "diagnosis support system" OR "diagnostic support system")

Table 1 presents the compositions of terms translated for each of the search engines of the consulted databases.
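As a minimal sketch of how such a composition of terms could be checked against a bibliographic record, the Python fragment below assumes that the two groups of terms are combined with AND and that a record is a dictionary with hypothetical `title`, `abstract` and `keywords` fields. The actual search engines of the consulted databases apply their own syntax and matching rules, so this is only an illustration of the logic.

```python
import re

# Term groups taken from the composition of terms above. Combining the two
# groups with AND is an assumption of this sketch.
ASSESSMENT_TERMS = ["evaluation", "testing", "assessment"]
CAD_PHRASES = [
    "computer-aided diagnosis system",
    "computer-aided diagnosis scheme",
    "computer-assisted diagnosis system",
    "computer-assisted diagnosis scheme",
    "diagnosis support system",
    "diagnostic support system",
]


def searchable_text(record):
    """Join title, abstract and keywords (hypothetical field names) in lower case."""
    parts = [record.get("title", ""), record.get("abstract", "")]
    parts.extend(record.get("keywords", []))
    return " ".join(parts).lower()


def matches_query(record):
    """Return True when the record contains a term from each of the two groups."""
    text = searchable_text(record)
    has_assessment = any(
        re.search(r"\b" + re.escape(term) + r"\b", text)
        for term in ASSESSMENT_TERMS
    )
    # Plain substring matching, so plural forms such as "systems" also match.
    has_cad = any(phrase in text for phrase in CAD_PHRASES)
    return has_assessment and has_cad


# Illustrative record, not taken from the reviewed studies.
example_record = {
    "title": "Evaluation of a computer-aided diagnosis system for mammography",
    "abstract": "",
    "keywords": ["CAD"],
}
print(matches_query(example_record))
```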

To select only relevant papers for the subject of study, we defined the inclusion and exclusion criteria. We included only studies that met at least one of the inclusion criteria and none of the exclusion criteria.

Sources that met the defined inclusion criteria had to:

(a) present or discuss concepts, criteria and methodologies to assess CAD systems whose outputs are presented as images or use images processed in a way to enable diagnoses;

(b) apply a specific methodology in the assessment of some CAD system with the characteristics mentioned in the first inclusion criterion; or

(c) present concepts or historical reviews on CAD.

In turn, a source was excluded if it met any of the defined exclusion criteria, i.e., if it:

(d) was similar, in content and results, to other studies by the same authors retrievable from any of the consulted databases;

(e) had a publication year outside the specified period (i.e., earlier than 2006); or

(f) was not fully available in the consulted databases or in any other database accessible on the Internet.
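The selection rule stated above (at least one inclusion criterion and no exclusion criterion) can be sketched as a small predicate. The function name and the representation of a study as the set of criterion letters it satisfies are assumptions made for illustration only.

```python
# Criterion letters as used in the lists above.
INCLUSION_CRITERIA = {"a", "b", "c"}
EXCLUSION_CRITERIA = {"d", "e", "f"}


def is_included(satisfied):
    """Return True if a study meets >= 1 inclusion and no exclusion criterion.

    `satisfied` is an iterable of criterion letters met by the study
    (a hypothetical representation chosen for this sketch).
    """
    satisfied = set(satisfied)
    unknown = satisfied - INCLUSION_CRITERIA - EXCLUSION_CRITERIA
    if unknown:
        raise ValueError(f"unknown criteria: {sorted(unknown)}")
    meets_inclusion = bool(satisfied & INCLUSION_CRITERIA)
    meets_exclusion = bool(satisfied & EXCLUSION_CRITERIA)
    return meets_inclusion and not meets_exclusion


print(is_included({"a", "b"}))  # meets inclusion criteria only
print(is_included({"a", "e"}))  # also published before 2006
```

Under this rule, a study that satisfies an inclusion criterion is still rejected as soon as a single exclusion criterion applies.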

Conduction and data extraction

The searches were carried out between April and June 2014. In total, 156 studies were retrieved, of which 100 (64.10%) were included. Of these, 96 reported CAD schemes and the assessment methodologies employed, and four contained conceptual content, as described in the subsection "Review and theoretical papers".

Every conduction stage of the SR was duly documented based on the models proposed in Biolchini et al. (2007) and Kitchenham (2004). The produced documents and the tools used are described below.

Conduction form: One form was produced for each consulted database. Each form documents the relevant information of the search: access dates, the composition of terms used, the list of retrieved studies with the inclusion criteria each one satisfied, and other observations.

Data extraction form: For each included study, in addition to the bibliographic and reference information, a summary of the study and the topics of interest were documented. The main topics of interest extracted were: the purpose of the system used/proposed, the assessment method employed for the CAD system, the modality and number of images/cases used in the test, the main tests and results, and other information relevant to the research.

Figure 1 shows a flow diagram, based on Liberati et al. (2009), which summarizes the selection of the studies.

Table 2 shows the studies included and the inclusion criteria they satisfied. We retrieved studies published in scientific journals, conference proceedings and other collections of articles. In particular, a significant variety of publications on medicine, computational intelligence, and image processing and analysis (other than pattern recognition) was observed. Table 3, Table 4 and Table 5 show the sources from which studies included in this SR were taken. The next section presents and discusses the results obtained through this SR.


Results and Discussion

Table 6 presents the 98 CAD systems reported in the 96 included studies (Pietka et al. (2010) reported the development of three distinct systems). In the following subsections, an overview of the state of the art regarding the development and assessment of CAD systems is presented based on the studies included through the SR.

Abnormalities studied

As seen in Table 6, a significant variety of abnormalities is currently studied in the development of CAD systems. As shown in Figure 2, breast cancer and lung cancer are the diseases most commonly studied within the scope of this SR. This finding is significant, considering that the importance of the early diagnosis of many different types of neoplasia is well known.



In addition to the diseases mentioned, other abnormalities were also reported, encompassing the processing of medical images of different structures and organs of the human body, including the brain, skin, retina, bones, heart, arteries, liver, ear, prostate and gastrointestinal tract, among others (Table 6). CAD systems are thus a subject of research with applicability in a wide range of medical areas. Although many approaches are still far from clinical application, the variety of ideas, techniques and application areas shows that one can expect significant development in computational applications for the diagnosis of many well-known abnormalities.

Modalities of medical imaging exploited

Different types of medical imaging have been objects of study for the development of CAD systems, as the results of this SR confirm. Of the 29 systems that reported dealing with breast cancer, eighteen (62.07%) addressed mammograms. Mammography is the most effective technique for the early diagnosis of breast cancer (Giger, 1999). Thus, the utilization of such images is still very important due to the possibility that a physician may misread or misinterpret an exam (Giannakopoulou et al., 2010; Jasmine et al., 2009; Osman et al., 2009; Verma, 2009). Five other systems (17.24%) reported work with ultrasound imaging, a technique that has been very important, in conjunction with mammography, in increasing the precision of breast cancer diagnosis (Giger, 1999; Lee et al., 2009; Shen et al., 2007). Other reported imaging modalities dealing with breast cancer included magnetic resonance imaging (10.34%), microscopic/cytologic imaging (6.90%) and digital tomosynthesis (3.45%). Works with magnetic resonance, microscopic/cytologic imaging and digital tomosynthesis have only been recently developed, which suggests the use of new imaging techniques to aid in breast cancer diagnoses.

The literature search also yielded thirteen CAD systems focused on lung cancer, nine of which (69.23%) used images from CT examinations. The other four systems (30.77%) used chest radiography images.

Five systems reported in this review addressed the diagnosis of Alzheimer's disease. Four (80%) used CT images, while only one report used magnetic resonance images.

Four systems reported using images from dermatoscopy examinations to address skin cancer, while only one used images from fluorescence spectroscopy.

Five systems were reported to diagnose eye diseases affecting the retina using processed retinography images. In turn, the system used to diagnose nuclear cataracts (Huang et al., 2009b) worked with images obtained by means of a slit lamp.

The graph in Figure 3 presents the number of systems for each medical imaging method in the studies. In this graph, each reported system is evaluated for modality, independent of the studied abnormality. As seen, images from CTs and mammographies were the most exploited for the development of CAD systems.



Tasks for computer-aided diagnosis systems

Each technique developed performs a specific task within computer-aided diagnosis. In general, a complete CAD system involves the segmentation of structures, the detection of abnormalities and the extraction of their characteristics for a subsequent classification of the problem (e.g., normal, benign or malignant, depending on the case). For example, to classify structures, the previous stages of segmentation and detection are required. Studies with CAD schemes have contributed to the automation of these tasks, either by developing a new technique or by improving an existing one. For each task of interest, the charts presented in Figure 4, Figure 5 and Figure 6 show the number of reported systems that contributed innovations.







Most of the CAD systems used for the diagnosis of breast cancer (Figure 4) aimed at classifying microcalcifications, lesions or other regions of interest (ROIs). This indicates that the main focus of CAD technology for breast cancer is the identification of suspicious structures in medical images and the determination of whether these structures are benign or malignant (i.e., capable of being cancerous). This can help physicians diagnose the disease and its severity level, thereby reducing chances of misinterpretation and aiding in determining recommended treatments.

He et al. (2011) presented an approach to risk classifications for breast cancer, i.e., a formula to determine the risk of developing the disease. This risk was estimated by analyzing mammary tissue patterns through mammography.

For lung cancer CAD systems (Figure 5), most systems were designed to detect nodules. Regardless of the disease studied, it can be seen that the main tasks of interest were the detection and classification of abnormalities (Figure 6). This was expected because these are the tasks that most reflect the contribution of CAD systems.

Among the reported systems in the reviewed studies, different techniques have been proposed to complete the different aforementioned tasks. Systems utilizing segmentation used techniques based on thresholding (Ashwin et al., 2012; Korfiatis et al., 2007; Liu et al., 2012; Pietka et al., 2010; Suganthi and Madheswaran, 2010; Usha and Sandya, 2013), morphological operators (Beuren et al., 2012; Lerdsinmongkol et al., 2011; Li et al., 2012; Usha and Sandya, 2013; Volpi et al., 2009), fuzzy k-means clustering (Beuren et al., 2012; Li et al., 2012; Wittenberg et al., 2012), region growing (Wittenberg et al., 2012; Zheng et al., 2008), and Gaussian mixture models (Haindl et al., 2007; Tan et al., 2010), among others.

For systems focused on detecting abnormalities, images were analyzed by using techniques to segment structures in the images. A few of the segmentation techniques were based on fuzzy set theory (Huang et al., 2007; Pietka et al., 2010), thresholding (Garnavi et al., 2011; Gomathi and Thangaraj, 2010) and models (Mumcuoglu et al., 2011; Schilham et al., 2006). Detection approaches were based on the use of classifiers, such as artificial neural networks (Ashwin et al., 2012; Bevilacqua, 2013; García-Orellana et al., 2008; Itai et al., 2009; Kumar et al., 2011; Li et al., 2009; López et al., 2011; Mironică et al., 2011; Sasaki et al., 2010), k-nearest neighbor (Al-Absi et al., 2012; Li et al., 2009; Mironică et al., 2011; Sanchez et al., 2011; Schilham et al., 2006), support vector machines (Grana et al., 2011; Li et al., 2009; Martinez-Murcia et al., 2014; Mironică et al., 2011; Miyaki et al., 2013; Segovia et al., 2012; Shilaskar and Ghatol, 2013; Wang et al., 2009) and probabilistic classifiers (Li et al., 2012; Liu et al., 2013; Mironică et al., 2011; Ramírez et al., 2009; Vertan et al., 2011).

Finally, for classification tasks, we observed the use of classifiers such as k-nearest neighbor (Filipczuk et al., 2013; Gedik and Atasoy, 2013; Gopinath and Shanthi, 2013; He et al., 2011; Muramatsu et al., 2013; Nava et al., 2014; Odeh et al., 2006; Osman et al., 2009; Raja et al., 2010; Verikas et al., 2006), artificial neural networks (Barhoumi et al., 2007; Geetha et al., 2008; Jasmine et al., 2009; López et al., 2008; Raja et al., 2007; Streba et al., 2012; Verma, 2009; Wu et al., 2006), Bayesian classifiers (Ampeliotis et al., 2007; Bhooshan et al., 2011; Garnavi et al., 2012; Gruszauskas et al., 2008, 2009; Retter et al., 2013; Tolouee et al., 2011), techniques based on linear discriminant analysis (Lee et al., 2009; Muramatsu et al., 2013; Tanner et al., 2006) and logistic regression models (Shen et al., 2007; Tanner et al., 2006).

A few studies combined clustering algorithms with classification algorithms. He et al. (2011) and Raja et al. (2010) both used the k-means and k-nearest neighbor algorithms. Verma (2009) used a clustering algorithm with artificial neural networks trained with back propagation. Barhoumi et al. (2007) classified skin lesions by combining results from artificial neural networks with results of a content-based image retrieval (CBIR) scheme through the Dempster-Shafer Theory.

Public databases of medical images

One of the difficulties encountered in the development of CAD systems is the lack of availability of test cases. It is not always possible to obtain a medical image database containing the various acquisition characteristics, structures or abnormalities that a technique requires for the detection, analysis or diagnosis of a disease. Factors such as partnerships with clinics and hospitals, ethical issues and image access permissions tend to hamper the task.

For this reason, a few projects have been developed and maintained aiming to provide medical images for research groups that develop CAD technologies. These projects consist of public databases of medical images that document reports made by physicians and often include information about structures of interest in the images.

We cataloged various public databases from the systems reported in the studies.

Digital Database for Screening Mammography - DDSM (García-Orellana et al., 2008; Haindl et al., 2007; Muramatsu et al., 2013; Ramos et al., 2012; Song et al., 2010; Suganthi and Madheswaran, 2010; Verma, 2009; Wang et al., 2009; Zheng et al., 2008): maintained by the University of South Florida, this database serves as a resource for research in mammographic imaging analysis (Heath et al., 2001). The database contains 2,620 cases divided into 43 volumes, each composed of normal cases, cases containing suspicious structures proved benign or proven cases of cancer. Delineations of regions of interest (i.e., ground truth regions), if any, are provided. A set of programs for decoding and manipulating mammography images is also included.

The mini-MIAS database of mammograms (Gedik and Atasoy, 2013; Geetha et al., 2008; He et al., 2011; Jasmine et al., 2009; López et al., 2008; Osman et al., 2009; Tahmasbi et al., 2011; Wu et al., 2006): this database is maintained by the Mammographic Image Analysis Society (MIAS) and offers 322 mammograms for use in research. In addition to mammograms, the database contains information concerning the type, severity and coordinates of the central pixel of abnormalities, if any, in each image (Suckling et al., 1994).

Lung Image Database Consortium - LIDC (Korfiatis et al., 2007; Pietka et al., 2010): this public database is the result of an initiative that aimed to provide chest CT images to support the development, training and assessment of CAD schemes for the detection of pulmonary nodules (Armato III et al., 2011). The database provides exams of 1,010 patients, delimitations of existing lesions, and software for the manipulation of images.

Japanese Society of Radiological Technology (JSRT) database (Al-Absi et al., 2012; Nagata et al., 2013; Schilham et al., 2006): this database of chest radiographs of the JSRT was created in cooperation with the Japan Radiological Society in 1998. It consists of 247 images, comprising 154 cases of a single pulmonary nodule (grouped by subtlety) and 93 cases without nodules. It also provides information about patients (e.g., age and gender), diagnoses (i.e., benign or malignant) and central coordinates of each nodule (Shiraishi et al., 2000).

MESSIDOR Digital Retinal Images (Sanchez et al., 2011): the MESSIDOR database was developed to facilitate CAD studies for diabetic retinopathy. It contains 1,200 images of the eye fundus. Moreover, for each image, it provides scores given by experts indicating the level of the retinopathy, and the risk of macular edema (Messidor, 2014).

Digital Retinal Images for Vessel Extraction - DRIVE (Hatanaka et al., 2011; Jiménez et al., 2010): this database was established with the aim of enabling a comparative study on the segmentation of blood vessels in retinal images. To this end, 40 photographs of the retina (eye fundus) are available. For each photo, there are two manually segmented blood vessel images (Staal et al., 2004). Researchers who use the database to test methods for segmentation of blood vessels can submit their results through the home-page of the project, to make them available for comparison with other studies in the area.

STructured Analysis of the Retina - STARE (Jiménez et al., 2010): The STARE project, initiated in 1975 at the University of California, focuses on developing a system to aid in the detection of diseases of the human eye (McCormick and Goldbaum, 1975). The project has a database with 402 eye fundus photographs and a corresponding diagnosis for each image.

Alzheimer's Disease Neuroimaging Initiative - ADNI (López et al., 2011; Segovia et al., 2012): The ADNI (Mueller et al., 2005) has the goal of defining the progression of Alzheimer's disease. To this end, it aims to collect and validate data, such as images from positron emission tomography, magnetic resonance imaging and other sources, to predict the disease. The initiative provides collected data for research purposes. The database contains data and images on 895 patients.

Assessment of CAD systems

As previously mentioned, the main objectives of this SR were to examine and analyze the state of the art of CAD systems. We have already presented the studied abnormalities, medical imaging modalities, tasks of interest and public databases used in the testing and development of the reported systems. In this section, the assessment techniques are discussed.

From Table 6, we see that assessments of CAD systems involved carrying out segmentations, detections or classifications based on a set of inputs and on previously known correct results (provided mostly by specialists) to obtain performance metrics for the system. None of the assessed studies used test criteria established in the literature, such as functional or structural techniques. From this point of view, the assessment is made on an ad hoc basis.

Assessment metrics

Figure 7, Figure 8 and Figure 9 show, for each diagnostic task, the metrics and assessment methods applied to the reported systems. We considered only the methods explicitly mentioned in each study. In the analysis of these results, we considered both the main task of each system and any secondary metrics and methods (Figure 7). For example, if the main task of interest of a given work is the classification of lesions but the work also reports the assessment of a previous segmentation task (i.e., mentioning the metrics and/or methods used), these were included in the chart concerning segmentation. For each chart, the total number of reported systems is shown.









The graph in Figure 7 shows that six of the systems that assessed a segmentation method applied the overlap measure. This measure consists of obtaining the relative area of the intersection between two considered regions (Gruszauskas et al., 2008; Korfiatis et al., 2007) by assessing the sets of pixels resulting from the segmentation process. Given |Aseg|, the area of an automatically segmented region Aseg, and |Aman|, the area of a region Aman, which is considered correct for the segmentation process (e.g., generated manually), the overlap measure is defined by Equation 1. A value of 0 indicates the worst performance, i.e., there is no intersection between the correct area and the automatically obtained area. A value of 1 indicates a perfect segmentation.

A relative area difference metric, applied in the assessment of three systems (Beuren et al., 2012; Tan et al., 2010; Zheng et al., 2008), quantifies the extent of the automatically segmented region that does not match the expected correct region (Tan et al., 2010). This measure can be obtained through Equation 2. If Aseg = Aman, then the relative area difference is 0.

Another metric applied to evaluate segmentation results, reported in three systems (Endo et al., 2012; Li et al., 2012; Liu et al., 2012), is Dice's coefficient. This metric calculates the overlapping area between an automatically segmented region and the correct expected region (Liu et al., 2012). Equation 3 defines the calculation of this metric. If Aseg = Aman, then a perfect segmentation results (Dice's coefficient = 1).
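As a minimal sketch of two of these measures, the Python fragment below represents each region as a set of pixel coordinates. It assumes the widely used intersection-over-union form for the overlap measure and the standard form of Dice's coefficient; the function names and the example regions are illustrative and are not taken from the reviewed studies.

```python
def overlap(a_seg, a_man):
    """Intersection over union of two pixel sets (assumed form of the overlap)."""
    union = a_seg | a_man
    if not union:
        raise ValueError("both regions are empty")
    return len(a_seg & a_man) / len(union)


def dice(a_seg, a_man):
    """Dice's coefficient: twice the shared area over the sum of both areas."""
    total = len(a_seg) + len(a_man)
    if total == 0:
        raise ValueError("both regions are empty")
    return 2 * len(a_seg & a_man) / total


# Two 2 x 2 regions that share one column of pixels (illustrative data).
a_man = {(0, 0), (0, 1), (1, 0), (1, 1)}  # reference (e.g., manual) region
a_seg = {(0, 1), (0, 2), (1, 1), (1, 2)}  # automatically segmented region

print(overlap(a_seg, a_man), dice(a_seg, a_man))
```

For these example regions the overlap is 1/3 and Dice's coefficient is 0.5. Under the assumed definitions, the two values are related by Dice = 2J / (1 + J), where J is the overlap, so both measures rank segmentations in the same order.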

Other metrics observed in the assessment of segmentation included accuracy and sensitivity. These belong to a set of traditional statistical metrics used in the assessment of CAD systems. They are based on true positive (TP), true negative (TN), false positive (FP) and false negative (FN) results (Garnavi et al., 2011), which are concepts defined by Wagner et al. (2007):

True positive: a positive detection result of an abnormal structure present in the organ or tissue represented in the image, or a correct classification for a detected structure;

True negative: a negative detection result of an image of an organ or tissue that does not show any abnormal structure, or a correct classification that indicates an abnormal structure that does not belong to a particular class;

False positive: a positive detection result of an image of an organ or tissue that does not represent any abnormal structure, or an incorrect classification that indicates a particular structure belonging to a given class when, in fact, it does not; and

False negative: a negative detection result of an image of an organ or tissue that presents one or more abnormal structures that should be detected, or an incorrect classification that indicates a structure that does not belong to a given class when, in fact, it does.

In the case of segmentation, one approach to using these metrics is to define TP pixels (i.e., segmented and within the region of interest), TN pixels (outside the region of interest and not segmented), FP pixels (segmented and not within the region of interest) and FN pixels (within the region of interest and not segmented). Later in this section, these metrics are presented in relation to the final results of CAD systems. Other assessment methods and metrics applicable to segmentation routines that were observed in the reported systems are listed in Table 6, together with their references.
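The pixel-level counting described above can be sketched as follows, assuming that both the segmentation result and the region of interest are available as binary masks of the same shape, stored here as nested lists. The function name and the example masks are illustrative.

```python
def pixel_confusion(segmented, reference):
    """Count TP, TN, FP and FN pixels for two binary masks of equal shape.

    `segmented` marks the pixels selected by the segmentation method and
    `reference` marks the pixels inside the region of interest.
    """
    if len(segmented) != len(reference):
        raise ValueError("masks must have the same shape")
    counts = {"TP": 0, "TN": 0, "FP": 0, "FN": 0}
    for seg_row, ref_row in zip(segmented, reference):
        if len(seg_row) != len(ref_row):
            raise ValueError("masks must have the same shape")
        for seg, ref in zip(seg_row, ref_row):
            if seg and ref:
                counts["TP"] += 1  # segmented and inside the region
            elif seg:
                counts["FP"] += 1  # segmented but outside the region
            elif ref:
                counts["FN"] += 1  # inside the region but not segmented
            else:
                counts["TN"] += 1  # outside the region and not segmented
    return counts


# Illustrative 3 x 3 masks.
segmented = [[0, 1, 1],
             [0, 1, 1],
             [0, 0, 0]]
reference = [[1, 1, 0],
             [1, 1, 0],
             [0, 0, 0]]

print(pixel_confusion(segmented, reference))
```

For these masks the counts are TP = 2, FP = 2, FN = 2 and TN = 3, which together account for all nine pixels.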

The metrics and assessment methods applied for the classification and detection tasks are mostly the same. The graphs in Figure 8 and Figure 9 show the metrics and methods used for each particular task, and the chart in Figure 10 shows the results for both tasks.

As can be seen, there was a predominant use of metrics based on TP, TN, FP and FN for the assessment of classification and detection tasks in the reported systems. When both tasks are combined, sensitivity is the most frequently used metric for evaluating a CAD system. Table 7 lists the key reported metrics in Garnavi et al. (2011).

ROC curve

Another widely known method used in the assessment of CAD systems is the Receiver Operating Characteristic (ROC) curve. As seen in Figures 7-10, this method was used to assess 45% of the classification systems reported. Figure 10 combines the applied classification and detection methods and metrics.

This curve represents the sensitivity as a function of the fraction of false positives (FFP = 1 - specificity; Metz, 1999; Wagner et al., 2007). An example of an ROC curve trace is shown in Figure 11. An ideal CAD system has an operating point at (0, 1) on the graph, where 0 represents the minimum FFP and 1 represents the maximum sensitivity. The ROC curve traces the operating points that the CAD system can reach as specific parameters are varied. This permits, for example, the comparison of the performance of multiple techniques and the analysis of how performance changes with parameter variations.



An assessment metric extracted from the ROC curve and often employed in the assessment of CAD systems is the area under the ROC curve. The larger the area under the curve, the better the performance of the CAD system. More details regarding this metric can be found in Metz (1999) and other included studies.
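To make the construction of the curve and of its area concrete, the sketch below assumes that the CAD system outputs one numeric score per case and that the varied parameter is a decision threshold on that score; both are assumptions of this example. The area is estimated with the trapezoidal rule, which is one common choice and not necessarily the estimator used in the reviewed studies. The names `roc_points` and `area_under_curve` are hypothetical.

```python
def roc_points(scores, labels):
    """Return ROC operating points (FFP, sensitivity), one per distinct score."""
    positives = sum(1 for y in labels if y)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("need at least one positive and one negative case")
    # Lowering the threshold step by step equals walking the cases by descending score.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Consume all cases tied at this score before emitting a point.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1]:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / negatives, tp / positives))
    return points


def area_under_curve(points):
    """Trapezoidal estimate of the area under a curve given as sorted (x, y) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]  # hypothetical CAD output scores
labels = [1, 1, 0, 1, 0, 0]              # 1 = abnormal, 0 = normal
curve = roc_points(scores, labels)
print(curve, area_under_curve(curve))
```

A system that ranks every abnormal case above every normal case passes through the ideal operating point (0, 1) and yields an area of 1.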

FROC curve

Another curve used in the assessment of CAD systems is the Free-Response Receiver Operating Characteristic (FROC) curve. This curve represents the sensitivity as a function of the average number of false positives per image (Nishikawa, 2007). Despite having been employed in few reported systems, this method is worth mentioning given the scope of this SR.

To show the differences between an FROC curve and a conventional ROC curve, Metz (1999) used an example of structure detection. A conventional ROC curve provides the probability for a positive region (i.e., containing an abnormal structure) to be diagnosed as positive (sensitivity) and the probability for a negative region (i.e., containing no abnormal structure) to be diagnosed as positive (FFP). An FROC curve provides the probability that a randomly selected lesion will be detected after an average number of FP detections.

Metz (1999) also showed that ROC curves are restricted to cases in which there are two detection possibilities for each image or region processed. FROC curves can be used when a lesion may be present in more than one position in each image; the CAD scheme attempts to detect the lesion in all possible locations (Metz, 1999).
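The difference from the ROC case can be illustrated with a short sketch that computes FROC operating points, that is, lesion-level sensitivity against the average number of false positives per image. The data layout is an assumption of this example: each detection is a tuple `(image_id, score, lesion_id)`, where `lesion_id` is `None` for a false positive, and the threshold list is chosen by hand. The name `froc_points` is hypothetical.

```python
def froc_points(detections, n_lesions, n_images, thresholds):
    """Return (average FPs per image, sensitivity) for each score threshold."""
    points = []
    for threshold in thresholds:
        kept = [d for d in detections if d[1] >= threshold]
        # A lesion counts once, even if several detections hit it.
        hit_lesions = {(img, lesion) for img, _, lesion in kept if lesion is not None}
        false_positives = sum(1 for _, _, lesion in kept if lesion is None)
        points.append((false_positives / n_images, len(hit_lesions) / n_lesions))
    return points


# Hypothetical detections on two images containing three lesions in total.
detections = [
    ("a", 0.9, 0),
    ("a", 0.6, None),
    ("a", 0.5, 1),
    ("b", 0.8, None),
    ("b", 0.3, 0),
]
print(froc_points(detections, n_lesions=3, n_images=2, thresholds=[0.85, 0.55, 0.2]))
```

Unlike the FFP axis of an ROC curve, the horizontal axis here is not bounded by 1, because an image may contain any number of false positive detections.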

Review and theoretical papers

Four review and theoretical papers about CAD systems and trends in the last decade were included in this SR.

Shiraishi et al. (2009) conducted a survey on the use of ROC curves in medical imaging analyses from studies published in Radiology Journal. The researchers analyzed 295 studies published between 1997 and 2006 containing the phrase "receiver operating characteristic". Approximately 79% of the studies reported findings based on subjective diagnoses or objective measurements, while 14.6% did not include human observers. Most of the latter evaluated CAD systems.

Pietka et al. (2011) presented and discussed the participation of health professionals and technicians in various stages of the life cycle (i.e., design, assessment and implementation) of CAD systems. The researchers used their own CAD systems to examine how consecutive stages were developed by the multidisciplinary team.

Zhang et al. (2011) presented a review about recent advances in breast tissue classification technologies of CAD systems for breast cancer. The researchers did not present the review methodology; instead, they discussed three classification approaches (texture feature analysis, statistical modeling and machine learning) and compared results obtained from analyzed studies. According to the researchers, machine learning is the most feasible approach to developing universal CAD systems.

Korotkov and Garcia (2012) presented a review about computerized analyses of pigmented skin lesions in microscopic (dermatoscopic) and macroscopic (clinical) images. The researchers presented an extensive background about applied field concepts and described features and methodologies used in the analyzed studies.

Trends and opportunities

Among the studies included in this review, the most explored subjects were breast and lung cancers. However, we believe the diagnoses of other diseases are equally important, especially those related to the heart and the brain, which have caused many deaths worldwide. CAD schemes for aiding early disease diagnoses in these areas could potentially improve treatment therapies and thereby effect higher rates of positive outcomes.

Several types of images have been used in the reviewed studies, with X-ray images being the most frequent. However, most CAD systems only considered one type of image, perhaps due to the difficulty in simultaneously evaluating multiple images with diverse formats and characteristics. The evaluation of the combination of different image types may be an interesting subject to explore.

All analyzed studies developed techniques for detecting, segmenting or classifying structures to compose CAD systems. No study proposed any other procedure; all works used traditional approaches to evaluate their results. The limitations of these approaches and the non-standardization of the databases used in evaluations are further discussed in the following section.

Traditional metrics (e.g., ROC and FROC curves) are based on TP, TN, FP and FN results; thus, they depend on prior knowledge of case characteristics used to test CAD systems. For example, to identify actual lesions from medical images, the actual details and characteristics of structures of lesions in an image must be known beforehand. Therefore, there is a chance for normal, benign structures to be mistakenly considered as lesions.

Typically, in this type of evaluation, physician participation is mandatory every time the CAD presents a new technique or approach. The "visual" analyses of each case and the comparison of CAD results against traditional diagnoses must be performed by an experienced medical professional (e.g., a physician or radiologist). Therefore, the testing and evaluation of CAD systems using such metrics is resource-intensive. Whenever any part of the system is modified, subsequent results validation is required. An additional complication is that a physician's evaluation can vary, depending on factors such as experience, fatigue, and time availability, as cited by Aziz et al. (2004), Barlow et al. (2004), and Pindborg et al. (1985). Appropriate methods for evaluating these systems have yet to be determined and can constitute new opportunities for exploration. Consequently, more reliable systems may be able to decrease variations and improve the quality of diagnoses.

We did not encounter any articles mentioning the use of software engineering techniques, such as software testing, in conjunction with other evaluation approaches. While traditional techniques are not ideal subjects for new publications, it is well known that in specific domains, testing activities require adapted or new techniques. In particular, the complexities of the input and output domains of image processing software might pose a real challenge for software engineers. Selecting robust test cases from a large, complex and diverse data set is not a trivial task and has not been adequately studied. In addition, using software testing metrics with traditional approaches for CAD evaluations has not yet been explored. However, the present study considers only scientific and academic works; commercial products were outside our scope. Thus, we only considered data related to the studied articles.

We suggest and discuss some research approaches within this scope by using Content-Based Image Retrieval (CBIR) concepts to evaluate CAD outputs and test criteria definitions and applications. This permits the identification of errors in the software and computational testing tools. As a result, the CAD evaluation becomes more objective and effective.

In recent years, CBIR techniques have been explored to aid in image retrieval and to assist the physician in composing a diagnosis based on data from similar cases. CBIR systems use features related to color, shape, texture and distance to calculate the similarity between images and their features. We believe concepts from CBIR could be extended to compose more objective approaches to evaluate CAD systems. In the suggested CBIR approach, the expert is required to determine one correct solution (i.e., a model image), and extractors automatically verify whether the answer produced by the image processing program has characteristics similar to those of the model image. Thus, our approach avoids a reliance on potentially biased diagnoses for verification. From a software engineering standpoint, this is an important improvement because testing processes are resource-intensive and are often required during software development and maintenance.
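The idea of comparing a CAD output with a model image through extracted features can be sketched as follows. This is only a toy illustration under stated assumptions: the two features (foreground area fraction and normalized centroid), the Euclidean distance, the tolerance value and the names `extract_features`, `feature_distance` and `oracle_accepts` are all hypothetical and are not the extractors or similarity functions of any reviewed system or of Delamaro et al. (2013).

```python
import math


def extract_features(mask):
    """Toy extractor: foreground area fraction and normalized centroid (row, column)."""
    rows, cols = len(mask), len(mask[0])
    coords = [(r, c) for r in range(rows) for c in range(cols) if mask[r][c]]
    if not coords:
        return (0.0, 0.0, 0.0)
    area = len(coords) / (rows * cols)
    centroid_r = sum(r for r, _ in coords) / len(coords) / rows
    centroid_c = sum(c for _, c in coords) / len(coords) / cols
    return (area, centroid_r, centroid_c)


def feature_distance(a, b):
    """Euclidean distance between two feature vectors of equal length."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def oracle_accepts(cad_output, model_image, tolerance=0.1):
    """Accept the CAD output if its features lie close enough to the model image."""
    distance = feature_distance(extract_features(cad_output), extract_features(model_image))
    return distance <= tolerance


model = [[0, 1, 1], [0, 1, 1], [0, 0, 0]]    # expert-defined model image
shifted = [[0, 0, 0], [1, 1, 0], [1, 1, 0]]  # same area, different position
print(oracle_accepts(model, model), oracle_accepts(shifted, model))
```

In such a scheme the expert supplies the model image once, and later versions of the program can be checked automatically against it; the choice of features and tolerance determines how strict the check is.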

A few standardized databases were used in the reviewed studies. However, none of these databases was complete: some did not include all structure types, others presented hard-to-process image formats, and still others did not provide enough information for testing techniques. Thus, the creation of image databases to serve as reliable benchmarks, with standardized image formats, mechanisms to select cases of interest and data that allow performance comparisons of different CADs, requires further study.

One of our future goals is to define a methodology that reduces the complexity and repetition required to assess CAD systems. To this end, we intend to use the concept of CBIR for a comparison of graphical outputs (i.e., images) of CAD systems while considering their respective outputs as correct (Delamaro et al., 2013). The existence of public databases favors this approach because such databases often contain expert diagnoses that are associated with the images.

This paper presented the results of a systematic review that allowed the survey and analysis of the state of the art regarding the design, development and evaluation of systems for computer-aided diagnoses. We cataloged 98 CAD systems designed to automate various tasks to aid in disease diagnoses. These systems are described in the studies retrieved from five databases of published scientific papers.

Several groups worldwide have developed CAD systems. Thus, there are vast numbers of published papers analyzing diseases and various modalities of medical imaging, as presented here.

A non-systematic review may not fully explore the state of the art and can even require rework due to the lack of a detailed record on the performance of the review. In this context, the performance of a systematic review, specifically for CAD systems, provides both a general overview and specific details for interested groups. Furthermore, regular updates to the review and the ease of auditing presented results provide increased productivity in the bibliographic research.

The results confirm traditional metrics, which are based on true positives, true negatives, false positives and false negatives, as the primary means to assess and compare the performance of CAD systems. ROC and FROC curves use methods derived from these metrics to assist in the assessment of system behaviors, given variations in their parameters.

However, these metrics and methods require the repetitive and exhausting participation of physicians and radiologists for verifying the accuracy of each version of a CAD system. While such participation is essential, the authors of this paper intend to focus on automating these verification tasks by using CBIR to compare the graphical output (i.e., images) of CAD systems with their respective database. In this future work, we will aim to determine an objective methodology for assessing CAD systems.

This study contributed an extensive literature review of the past six years on the state of the art of the design, development and assessment of CAD systems. We presented tasks of interest, relevant public databases of medical images, the main metrics and assessment methods, and a general analysis of the state of the art.



This work was supported by the São Paulo Research Foundation (Fapesp) [grant numbers 2010/15691-0, 2010/09806-0, 2010/01496-1]; the National Council for Scientific and Technological Development (CNPq) [grant numbers 559931/2010-7, 559915/2010-1]; and the National Institute of Science and Technology-Medicine Assisted by Scientific Computing (INCT-MACC).



Al-Absi HRH, Samir BB, Shaban KB, Sulaiman S. Computer aided diagnosis system based on machine learning techniques for lung cancer. In: ICCIS 2012: Proceedings of the 2012 International Conference on Computer Information Science; 2012 June 12-14; Kuala Lumpur, Malaysia. IEEE; 2012. p. 295-300. v. 1.

Álvarez Illán I, Górriz JM, Ramírez J, Salas-Gonzalez D, López M, Segovia F, Padilla P, Puntonet CG. Projecting independent components of SPECT images for computer aided diagnosis of Alzheimer's disease. Pattern Recognition Letters. 2010; 31(11):1342-7.

Ampeliotis D, Antonakoudi A, Berberidis K, Psarakis EZ. Computer aided detection of prostate cancer using fused information from dynamic contrast enhanced and morphological magnetic resonance images. In: ICSPC 2007: Proceedings of the IEEE International Conference on Signal Processing and Communications; 2007 Nov 24-27; Dubai, United Arab Emirates. IEEE; 2007. p. 888-91.

Armato III SG, McLennan G, Bidaut L, McNitt-Gray MF, Meyer CR, Reeves AP, Zhao B, Aberle DR, Henschke CI, Hoffman EA, Kazerooni EA, MacMahon H, Van Beeke EJ, Yankelevitz D, Biancardi AM, Bland PH, Brown MS, Engelmann RM, Laderach GE, Max D, Pais RC, Qing DP, Roberts RY, Smith AR, Starkey A, Batrah P, Caligiuri P, Farooqi A, Gladish GW, Jude CM, Munden RF, Petkovska I, Quint LE, Schwartz LH, Sundaram B, Dodd LE, Fenimore C, Gur D, Petrick N, Freymann J, Kirby J, Hughes B, Casteele AV, Gupte S, Sallam M, Heath MD, Kuhn MH, Dharaiya E, Burns R, Fryd DS, Salganicoff M, Anand V, Shreter U, Vastagh S, Croft BY, Clarke LP. The Lung Image Database Consortium (LIDC) and Image Database Resource Initiative (IDRI): a completed reference database of lung nodules on CT scans. Medical Physics. 2011; 38(2):915-31. PMCid:PMC3041807.

Ashwin S, Kumar SA, Ramesh J, Gunavathi K. Efficient and reliable lung nodule detection using a neural network based computer aided diagnosis system. In: ICETEEEM 2012: Proceedings of the International Conference on Emerging Trends in Electrical Engineering and Energy Management; 2012 Dec 13-15; Chennai, India. IEEE; 2012. p. 135-42.

Aziz ZA, Wells AU, Hansell DM, Bain GA, Copley SJ, Desai SR, Ellis SM, Gleeson FV, Grubnic S, Nicholson AG, Padley SPG, Pointon KS, Reynolds JH, Robertson RJH, Rubens MB. HRCT diagnosis of diffuse parenchymal lung disease: inter-observer variation. Thorax. 2004; 59(6):506-11.

Barhoumi W, Dhahbi S, Zagrouba E. A collaborative system for pigmented skin lesions malignancy tracking. In: IST'07: Proceedings of the IEEE International Workshop on Imaging Systems and Techniques; 2007 May 5; Krakow, Poland. IEEE; 2007. p. 1-6.

Barlow WE, Chi C, Carney PA, Taplin SH, D'Orsi C, Cutter G, Hendrick RE, Elmore JG. Accuracy of screening mammography interpretation by characteristics of radiologists. Journal of the National Cancer Institute. 2004; 96(24):1840-50.

Beuren AT, Janasieivicz R, Pinheiro G, Grando N, Facon J. Skin melanoma segmentation by morphological approach. In: ICACCI'12: Proceedings of the International Conference on Advances in Computing, Communications and Informatics; 2012; Chennai, India. ACM; 2012. p. 972-8.

Bevilacqua V. Three-dimensional virtual colonoscopy for automatic polyps detection by artificial neural network approach: new tests on an enlarged cohort of polyps. Neurocomputing. 2013; 116:62-75.

Bhooshan N, Giger M, Lan L, Li H, Marquez A, Shimauchi A, Newstead GM. Combined use of T-2-weighted MRI and T-1-weighted dynamic contrast-enhanced MRI in the automated analysis of breast lesions. Magnetic Resonance in Medicine. 2011; 66(2):555-63. PMid:21523818 PMCid:PMC4156840.

Biolchini JCA, Mian PG, Natali ACC, Conte TU, Travassos GH. Scientific research ontology to support systematic review in software engineering. Advanced Engineering Informatics. 2007; 21(2):133-51.

Chan T. Clinical usage considerations in the development and evaluation of a computer aided diagnosis system for acute intracranial hemorrhage on brain CT. In: Zhang D, Sonka M, editors. Medical Biometrics. Heidelberg: Springer; 2010. p. 268-75. Lecture Notes in Computer Science v. 6165.

Chang TC, Lee JD, Huang CH, Wu T, Chen CJ, Wu SJ. The diagnostic application of brain image processing and analysis system for ischemic stroke. In: Bebis G, Boyle R, Parvin B, Koracin D, Remagnino P, Nefian A, Meenakshisundaram G, Pascucci V, Zara J, Molineros J, Theisel H, Malzbender T, editors. Advances in visual computing. Part II. Heidelberg: Springer; 2006. p. 31-8. Lecture Notes in Computer Science v. 4292.

Charbonnier JP, Smit EJ, Viergever MA, Velthuis BK, Vos PC. Computer-aided diagnosis of acute ischemic stroke based on cerebral hypoperfusion using 4D CT angiography. In: Novak CL, Aylward S, editors. Medical imaging 2013: Computer-Aided Diagnosis. Florida: SPIE; 2013. SPIE Proceedings v. 8670.

Charisis VS, Katsimerou C, Hadjileontiadis LJ, Liatsos CN, Sergiadis GD. Computer-aided capsule endoscopy images evaluation based on color rotation and texture features: an educational tool to physicians. In: CBMS 2013: Proceedings of the 26th IEEE International Symposium on Computer-Based Medical Systems; 2013 June 20-22; Porto, Portugal. IEEE; 2013. p. 203-8.

Cheng J, Liu J, Xu Y, Yin F, Wong DWK, Lee BH, Cheung C, Aung T, Wong TY. Superpixel classification for initialization in model based optic disc segmentation. In: EMBC 2012: Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society; 2012 Aug 28-Sept 1; San Diego, USA. IEEE; 2012. p. 1450-3. PMid:23366174.

David J, Krishnan R, Sukesh Kumar A. Neural network based retinal image analysis. In: CISP' 08: Proceedings of the Congress on Image and Signal Processing; 2008 May 27-30; Sanya, China. IEEE; 2008. p. 49-53. v. 2.

Delamaro ME, Nunes FLS, Oliveira RAP. Using concepts of content-based image retrieval to implement graphical testing oracles. Software Testing, Verification and Reliability. 2013; 23(3):171-98.

Doi K. Computer-aided diagnosis in medical imaging: historical review, current status and future potential. Computerized Medical Imaging and Graphics. 2007; 31(4-5):198-211. PMid:17349778 PMCid:PMC1955762.

Doi K. Diagnostic imaging over the last 50 years: research and development in medical imaging science and technology. Physics in Medicine and Biology. 2006; 51(13):R5-27. PMid:16790920.

Elizabeth DS, Nehemiah HK, Raj CSR, Kannan A. A novel segmentation approach for improving diagnostic accuracy of CAD systems for detecting lung cancer from chest computed tomography images. ACM Journal of Data and Information Quality. 2012; 3(2):4:1-4:16.

Endo M, Aramaki T, Asakura K, Moriguchi M, Akimaru M, Osawa A, Hisanaga R, Moriya Y, Shimura K, Furukawa H, Yamaguchi K. Content-based image-retrieval system in chest computed tomography for a solitary pulmonary nodule: method and preliminary experiments. International Journal of Computer Assisted Radiology and Surgery. 2012; 7(2):331-8. PMid:22258753.

Filipczuk P, Kowal M, Obuchowicz A. Multi-label fast marching and seeded watershed segmentation methods for diagnosis of breast cancer cytology. In: EMBC 2013: Proceedings of the 35th Annual International Conference of the IEEE Engineering in Medicine and Biology Society; 2013 July 3-7; Osaka, Japan. IEEE; 2013. p. 7368-71. PMid:24111447.

García-Orellana CJ, Gallardo-Caballero R, Gonzalez-Velasco HM, Garcia-Manso A, Macias-Macias M. Study of a mammographic CAD performance dependence on the considered mammogram set. In: EMBS 2008: Proceedings of the 30th Annual International Conference of the IEEE Engineering in Medicine and Biology Society; 2008 Aug 20-25; Vancouver, Canada. IEEE; 2008. p. 4776-9.

Garnavi R, Aldeen M, Bailey J. Computer-aided diagnosis of melanoma using border- and wavelet-based texture analysis. IEEE Transactions on Information Technology in Biomedicine. 2012; 16(6):1239-52. PMid:22893445.

Garnavi R, Aldeen M, Celebi ME. Weighted performance index for objective evaluation of border detection methods in dermoscopy images. Skin Research and Technology. 2011; 17(1):35-44. PMid:20923454.

Gedik N, Atasoy A. A computer-aided diagnosis system for breast cancer detection by using a curvelet transform. Turkish Journal of Electrical Engineering & Computer Sciences. 2013; 21(4):1002-14.

Geetha K, Thanushkodi K, Kumar AK. New particle swarm optimization for feature selection and classification of microcalcifications in mammograms. In: ICSCN'08: Proceedings of the International Conference on Signal Processing, Communications and Networking; 2008 Jan 4-6; Chennai, India. IEEE; 2008. p. 458-63.

Giannakopoulou G, Spyrou GM, Antaraki A, Andreadis I, Koulocheri D, Zagouri F, Nonni A, Filippakis GM, Nikita KS, Ligomenides PA, Zografos GC. Downgrading BIRADS 3 to BIRADS 2 category using a computer-aided microcalcification analysis and risk assessment system for early breast cancer. Computers in Biology and Medicine. 2010; 40(11-12):853-9. PMid:20950798.

Giger ML. Overview of computer-aided diagnosis in breast imaging. In: Doi K, MacMahon H, Giger ML, Hoffmann KR, editors. Computer-aided diagnosis in medical imaging. Amsterdam: Elsevier; 1999. p. 167-76. International Congress Series v. 1182.

Gomathi M, Thangaraj P. Automated CAD for detection of lung nodule using CT scans. In: COMPUTE'10: Proceedings of the Third Annual ACM Bangalore Conference; 2010; Bangalore, India. ACM; 2010. p. 25:1-4.

Gopinath B, Shanthi N. Computer-aided diagnosis system for classifying benign and malignant thyroid nodules in multi-stained FNAB cytological images. Australasian Physical & Engineering Sciences in Medicine. 2013; 36(2):219-30. PMid:23690210.

Grana M, Termenon M, Savio A, Gonzalez-Pinto A, Echeveste J, Perez JM, Besga A. Computer aided diagnosis system for Alzheimer disease using brain diffusion tensor imaging features selected by Pearson's correlation. Neuroscience Letters. 2011; 502(3):225-9. PMid:21839143.

Gruszauskas NP, Drukker K, Giger ML, Chang RF, Sennett CA, Moon WK, Pesce LL. Breast US computer-aided diagnosis system: robustness across urban populations in South Korea and the United States. Radiology. 2009; 253(3):661-71. PMid:19864511 PMCid:PMC2786194.

Gruszauskas NP, Drukker K, Giger ML, Sennett CA, Pesce LL. Performance of breast ultrasound computer-aided diagnosis: dependence on image selection. Academic Radiology. 2008; 15(10):1234-45. PMid:18790394 PMCid:PMC2567418.

Haindl M, Mikeš S, Scarpa G. Unsupervised detection of mammogram regions of interest. In: Apolloni B, Howlett RJ, Jain L, editors. Knowledge-based intelligent information and engineering systems. Heidelberg: Springer; 2007. p. 33-40. Lecture Notes in Computer Science v. 4694.

Haindl M, Mikeš S. Texture segmentation benchmark. In: ICPR 2008: Proceedings of the 19th International Conference on Pattern Recognition; 2008 Dec 8-11; Tampa, USA. IEEE; 2009. p. 1-4.

Hatanaka Y, Mizukami A, Muramatsu C, Hara T, Fujita H. Automated lesion detection in retinal images. In: ISABEL'11: Proceedings of the 4th International Symposium on Applied Sciences in Biomedical and Communication Technologies; 2011; Barcelona, Spain. ACM; 2011. p. 91:1-5.

He W, Denton ERE, Stafford K, Zwiggelaar R. Mammographic image segmentation and risk classification based on mammographic parenchymal patterns and geometric moments. Biomedical Signal Processing and Control. 2011; 6(3):321-9.

Heath M, Bowyer K, Kopans D, Moore R, Kegelmeyer WP. The digital database for screening mammography. In: IWDM 2000: Proceedings of the Fifth International Workshop on Digital Mammography; 2000. Medical Physics Publishing; 2001. p. 212-8.

Hebert D, Desir C, Petitjean C, Heutte L, Thiberville L. Detection of pathological condition in distal lung images. In: ISBI 2012: Proceedings of the 9th IEEE International Symposium on Biomedical Imaging; 2012 May 2-5; Barcelona, Spain. IEEE; 2012. p. 1603-6.

Huang JY, Kao PF, Chen YS. A set of image processing algorithms for computer-aided diagnosis in nuclear medicine whole body bone scan images. IEEE Transactions on Nuclear Science. 2007; 54(3):514-22.

Huang SF, Chaoa HY, Hsu CC, Yang SF, Kao PF. A computer-aided diagnosis system for whole body bone scan using single photon emission computed tomography. In: ISBI'09: Proceedings of the IEEE International Symposium on Biomedical Imaging: From Nano to Macro; 2009 June 28-July 1; Boston, USA. IEEE; 2009a. p. 542-5.

Huang W, Li H, Chan K, Lim J, Liu J, Wong T. A computer-aided diagnosis system of nuclear cataract via ranking. In: Yang GZ, Hawkes D, Rueckert D, Noble A, Taylor C, editors. Medical Image Computing and Computer-Assisted Intervention: MICCAI 2009. Heidelberg: Springer; 2009b. p. 803-10. Lecture Notes in Computer Science v. 5762.

Itai Y, Kim H, Ishikawa S, Katsuragawa S, Doi K. Reduction of FPs for lung nodules in MDCT by use of temporal subtraction with voxel-matching technique. In: Koppen M, Kasabov N, Coghill G, editors. Advances in Neuro-Information Processing. Heidelberg: Springer; 2009. p. 504-12. Lecture Notes in Computer Science v. 5506.

Jasmine JSL, Govardhan A, Baskaran S. Microcalcification detection in digital mammograms based on wavelet analysis and neural networks. In: INCACEC 2009: Proceedings of the International Conference on Control, Automation, Communication and Energy Conservation; 2009 June 4-6; Perundurai, India. IEEE; 2009. p. 1-6.

Jiménez S, Alemany P, Fondón I, Foncubierta A, Acha B, Serrano C. Detección automática de vasos en retinografías. Archivos de la Sociedad Española de Oftalmología. 2010; 85(3):103-9.

Kitchenham BA. Procedures for performing systematic reviews. Keele: Department of Computer Science, Keele University; 2004. [cited 2014 Aug 18]. Available from:

Korfiatis P, Skiadopoulos S, Sakellaropoulos P, Kalogeropoulou C, Costaridou L. Automated 3D segmentation of lung fields in thin slice CT exploiting wavelet preprocessing. In: Kropatsch WG, Kampel M, Hanbury A, editors. Computer analysis of images and patterns. Heidelberg: Springer; 2007. p. 237-44. Lecture Notes in Computer Science v. 4673.

Korotkov K, Garcia R. Computerized analysis of pigmented skin lesions: a review. Artificial Intelligence in Medicine. 2012; 56(2):69-90. PMid:23063256.

Kovacs T, Cattin P, Alkadhi H, Wildermuth S, Szekely G. Automatic segmentation of the aortic dissection membrane from 3D CTA images. In: Yang G, Jiang T, Shen D, Gu L, Yang J, editors. Medical imaging and augmented reality. Heidelberg: Springer; 2006. p. 317-24. Lecture Notes in Computer Science v. 4091.

Kuang W, Ye W. A kernel-modified SVM based computer-aided diagnosis system in initial caries. In: IITA'08: Proceedings of the Second International Symposium on Intelligent Information Technology Application; 2008 Dec 20-22; Shanghai, China. IEEE; 2008. p. 207-11.

Kumar SS, Moni RS, Rajeesh J. Contourlet transform based computer-aided diagnosis system for liver tumors on computed tomography images. In: ICSCCN 2011: Proceedings of the International Conference on Signal Processing, Communication, Computing and Networking Technologies; 2011 July 21-22; Thuckafay. IEEE; 2011. p. 217-22. PMCid:PMC3209886.

Lartizien C, Rogez M, Niaf E, Ricard F. Computer aided staging of lymphoma patients with FDG PET/CT imaging based on textural information. IEEE Journal of Biomedical and Health Informatics. 2014; 18(3):946-55. PMid:24081876.

Lee HW, Liu BD, Hung KC, Lei SF, Wang PC, Yang TL. Breast tumor classification of ultrasound images using wavelet-based channel energy and imageJ. IEEE Journal of Selected Topics in Signal Processing. 2009; 3(1):81-93.

Lerdsinmongkol J, Chaisaowong K, Roongruangsorakarn S, Kraus T, Aach T. Efficient application of 3D morphological operations in the framework of a computer-assisted diagnosis system. In: ICSP 2008: Proceedings of the 9th International Conference on Signal Processing; 2008 Oct 26-29; Beijing, China. IEEE; 2011. p. 857-60.

Li B, Qi L, Meng MQH, Fan Y. Using ensemble classifier for small bowel ulcer detection in wireless capsule endoscopy images. In: ROBIO 2009: Proceedings of the IEEE International Conference on Robotics and Biomimetics; 2009 Dec 19-23; Guilin, China. IEEE; 2009. p. 2326-31. PMid:19799296.

Li YH, Zhang L, Hu QM, Li HW, Jia FC, Wu JH. Automatic subarachnoid space segmentation and hemorrhage detection in clinical head CT scans. International Journal of Computer Assisted Radiology and Surgery. 2012; 7(4):507-16. PMid:22081264.

Liberati A, Altman DG, Tetzlaff J, Mulrow C, Gotzsche PC, Ioannidis JPA, Clarke M, Devereaux PJ, Kleijnen J, Moher D. The PRISMA statement for reporting systematic reviews and meta-analyses of studies that evaluate health care interventions: explanation and elaboration. PLoS Medicine. 2009; 6(7):e1000100-1-28.

Liu H, Hu H, Xu X, Song E. Automatic left ventricle segmentation in cardiac MRI Using topological stable-state thresholding and region restricted dynamic programming. Academic Radiology. 2012; 19(6):723-31. PMid:22465463.        [ Links ]

Liu P, Wang S, Turkbey B, Grant K, Pinto P, Choyke P, Wood BJ, Summers RM. A prostate cancer computer-aided diagnosis system using multimodal magnetic resonance imaging and targeted biopsy labels. In: Novak C, Aylward S, editors. Medical Imaging 2013: Computer-Aided Diagnosis. SPIE Proceedings v. 8670.         [ Links ]

López M, Ramírez J, Górriz JM, Álvarez I, Salas-Gonzalez D, Segovia F, Chaves R, Padilla P, Gómez-Río M. Principal component analysis-based techniques and supervised classification schemes for the early detection of Alzheimer's disease. Neurocomputing. 2011; 74(8):1260-71.        [ Links ]

López Y, Novoa A, Guevara MA, Quintana N, Silva A. Computer aided diagnosis system to detect breast cancer pathological lesions. In: Ruiz-Shulcloper J, Kropatsch W, editors. Progress in Pattern Recognition, Image Analysis and Applications. Heidelberg: Springer; 2008. p. 453-60. Lecture Notes in Computer Science v. 5197.         [ Links ]

Markkongkeaw A, Phinyomark A, Boonyapiphat P, Phukpattaranont P. Preliminary results of breast cancer cell classifying based on gray-level co-occurrence matrix. In: BMEiCON 2013: Proceedings of the 6th Biomedical Engineering International Conference; 2013 Oct 23-25; Amphur Muang, Thailand. IEEE; 2013. p. 1-4.

Martinez-Murcia FJ, Gorriz JM, Ramirez J, Moreno-Caballero M, Gomez-Rio M, Ini PPM. Parametrization of textural patterns in I-123-ioflupane imaging for the automatic detection of Parkinsonism. Medical Physics. 2014; 41(1). PMid:24387526.

McCormick B, Goldbaum M. STARE: structured analysis of the retina. Image processing of TV fundus image. In: Proceedings of the USA-Japan Workshop on Image Processing; 1975; Pasadena, USA. Jet Propulsion Laboratory; 1975.

Messidor. Messidor: digital retinal images. [cited 2014 Aug 10]. Available from:

Metz CE. Evaluation of CAD methods. In: Doi K, MacMahon H, Giger ML, Hoffmann KR, editors. Computer-aided diagnosis in medical imaging. Amsterdam: Elsevier; 1999. p. 543-54. International Congress Series v. 1182.

Mironică I, Vertan C, Gheorghe DC. Automatic pediatric otitis detection by classification of global image features. In: EHB 2011: Proceedings of the E-Health and Bioengineering Conference; 2011 Nov 24-26; Iasi, Romania. IEEE; 2011. p. 1-4.

Miyaki R, Yoshida S, Tanaka S, Kominami Y, Sanomura Y, Matsuo T, Oka S, Raytchev B, Tamaki T, Koide T, Kaneda K, Yoshihara M, Chayama K. Quantitative identification of mucosal gastric cancer under magnifying endoscopy with flexible spectral imaging color enhancement. Journal of Gastroenterology and Hepatology. 2013; 28(5):841-7. PMid:23424994.

Moon WK, Shen YW, Huang CS, Chiang LR, Chang RF. Computer-aided diagnosis for the classification of breast masses in automated whole breast ultrasound images. Ultrasound in Medicine & Biology. 2011; 37(4):539-48. PMid:21420580.

Mueller SG, Weiner MW, Thal LJ, Petersen RC, Jack CR, Jagust W, Trojanowski JQ, Toga AW, Beckett L. Ways toward an early diagnosis in Alzheimer's disease: the Alzheimer's Disease Neuroimaging Initiative (ADNI). Alzheimer's & Dementia. 2005; 1(1):55-66. PMid:17476317 PMCid:PMC1864941.

Mumcuoglu EU, Bozkurt FM, Aslan M, Sener E, Ugur O. Computerized scar detection on renal cortical scintigraphy images. Nuclear Medicine Communications. 2011; 32(11):1070-8. PMid:21956492.

Muramatsu C, Matsumoto T, Hayashi T, Hara T, Katsumata A, Zhou X, Iida Y, Matsuoka M, Wakisaka T, Fujita H. Automated measurement of mandibular cortical width on dental panoramic radiographs. International Journal of Computer Assisted Radiology and Surgery. 2013; 8(6):877-85. PMid:23179683.

Nagata R, Kawaguchi T, Miyake H. A computer-aided diagnosis system for lung nodule detection in chest radiographs using a two-stage classification method based on radial gradient and template matching. In: BMEI 2013: Proceedings of the 6th International Conference on Biomedical Engineering and Informatics; 2013 Dec 16-18; Hangzhou, China. IEEE; 2013. p. 80-5.

Nava R, Escalante-Ramirez B, Cristobal G, Estepar RSJ. Extended Gabor approach applied to classification of emphysematous patterns in computed tomography. Medical & Biological Engineering & Computing. 2014; 52(4):393-403. PMid:24496558.

Nishikawa RM. Current status and future directions of computer-aided diagnosis in mammography. Computerized Medical Imaging and Graphics. 2007; 31(4-5):224-35. PMid:17386998.

Odeh S, Ros E, Rojas I, Palomares J. Skin lesion diagnosis using fluorescence images. In: Campilho A, Kamel M, editors. Image analysis and recognition. Heidelberg: Springer; 2006. p. 648-59. Lecture Notes in Computer Science v. 4142.

Osman ME, Wahed MA, Mohamed AS, Kadah YM. Computer aided diagnosis system for classification of microcalcifications in digital mammograms. In: NRSC 2009: Proceedings of the National Radio Science Conference; 2009 Mar 17-19; New Cairo, Egypt. IEEE; 2009. p. 1-6. PMid:19210552.

Pietka E, Kawa J, Badura P, Spinczyk D. Open architecture computer-aided diagnosis system. Expert Systems. 2010; 27(1):17-39.

Pietka E, Kawa J, Spinczyk D, Badura P, Wieclawek W, Czajkowska J, Rudzki M. Role of radiologists in CAD life-cycle. European Journal of Radiology. 2011; 78(2):225-33.

Pindborg JJ, Reibel J, Holmstrup P. Subjectivity in evaluating oral epithelial dysplasia, carcinoma in situ and initial carcinoma. Journal of Oral Pathology & Medicine. 1985; 14(9):698-708.

Raja KB, Madheswaran M, Thyagarajah K. Analysis of ultrasound kidney images using content descriptive multiple features for disorder identification and ANN based classification. In: ICCTA'07: Proceedings of the International Conference on Computing: Theory and Applications; 2007 Mar 5-7; Kolkata, India. IEEE; 2007. p. 382-8.

Raja KB, Madheswaran M, Thyagarajah K. Texture pattern analysis of kidney tissues for disorder identification and classification using dominant Gabor wavelet. Machine Vision and Applications. 2010; 21(3):287-300.

Ramírez J, Gorriz JM, Chaves R, Lopez M, Salas-Gonzalez D, Alvarez I, Segovia F. SPECT image classification using random forests. Electronics Letters. 2009; 45(12):604-5.

Ramos RP, Nascimento MZ, Pereira DC. Texture extraction: an evaluation of ridgelet, wavelet and co-occurrence based methods applied to mammograms. Expert Systems with Applications. 2012; 39(12):11036-47.

Retter F, Plant C, Burgeth B, Botella G, Schlossbauer T, Meyer-Baese A. Computer-aided diagnosis for diagnostically challenging breast lesions in DCE-MRI based on image registration and integration of morphologic and dynamic characteristics. EURASIP Journal on Advances in Signal Processing. 2013; 2013.

Roberts MG, Pacheco EMB, Mohankumar R, Cootes TF, Adams JE. Detection of vertebral fractures in DXA VFA images using statistical models of appearance and a semi-automatic segmentation. Osteoporosis International. 2010; 21(12):2037-46. PMid:20135093.

Sanchez CI, Niemeijer M, Dumitrescu AV, Suttorp-Schulten MSA, Abramoff MD, Van Ginneken B. Evaluation of a computer-aided diagnosis system for diabetic retinopathy screening on public data. Investigative Ophthalmology & Visual Science. 2011; 52(7):4866-71. PMid:21527381.

Sasaki T, Kinoshita K, Kishida S, Hirata Y, Yamada S. Effect of pre-processing on performance of a neural network with one-dimensional sampling from X-ray images of chest. In: ICNC 2010: Proceedings of the Sixth International Conference on Natural Computation; 2010 Aug 10-12; Yantai, China. IEEE; 2010. p. 257-61. v. 1.

Sato K, Kadowaki S, Madokoro H, Ito M, Inugami A. Unsupervised segmentation for MR brain images. In: ISABEL'11: Proceedings of the 4th International Symposium on Applied Sciences in Biomedical and Communication Technologies; 2011; Barcelona, Spain. ACM; 2011. p. 44:1-5.

Schilham AMR, Van Ginneken B, Loog M. A computer-aided diagnosis system for detection of lung nodules in chest radiographs with an evaluation on a public database. Medical Image Analysis. 2006; 10(2):247-58. PMid:16293441.

Segovia F, Gorriz JM, Ramirez J, Salas-Gonzalez D, Alvarez I, Lopez M, Chaves R. A comparative study of feature extraction methods for the diagnosis of Alzheimer's disease using the ADNI database. Neurocomputing. 2012; 75(1):64-71.

Shen WC, Chang RF, Moon WK. Computer aided classification system for breast ultrasound based on breast imaging reporting and data system (BI-RADS). Ultrasound in Medicine & Biology. 2007; 33(11):1688-98. PMid:17681678.

Shilaskar S, Ghatol A. Feature selection for medical diagnosis: evaluation for cardiovascular diseases. Expert Systems with Applications. 2013; 40(10):4146-53.

Shiraishi J, Katsuragawa S, Ikezoe J, Matsumoto T, Kobayashi T, Komatsu Ki, Matsui M, Fujita H, Kodera Y, Doi K. Development of a digital image database for chest radiographs with and without a lung nodule: receiver operating characteristic analysis of radiologists' detection of pulmonary nodules. American Journal of Roentgenology. 2000; 174(1):71-4. PMid:10628457.

Shiraishi J, Pesce LL, Metz CE, Doi K. Experimental design and data analysis in receiver operating characteristic studies: lessons learned from reports in radiology from 1997 to 2006. Radiology. 2009; 253(3):822-30. PMid:19864510 PMCid:PMC2786192.

Song E, Xu S, Xu X, Zeng J, Lan Y, Zhang S, Hung CC. Hybrid segmentation of mass in mammograms using template matching and dynamic programming. Academic Radiology. 2010; 17(11):1414-24. PMid:20817575.

Staal J, Abramoff MD, Niemeijer M, Viergever MA, Van Ginneken B. Ridge-based vessel segmentation in color images of the retina. IEEE Transactions on Medical Imaging. 2004; 23(4):501-9. PMid:15084075.

Streba CT, Ionescu M, Gheonea DI, Sandulescu L, Ciurea T, Saftoiu A, Vere CC, Rogoveanu I. Contrast-enhanced ultrasonography parameters in neural network diagnosis of liver tumors. World Journal of Gastroenterology. 2012; 18(32):4427-34. PMid:22969209 PMCid:PMC3436061.

Suckling J, Parker J, Dance DR, Astley S, Hutt I, Boggis C, Ricketts I, Stamatakis E, Cerneaz N, Kok SL, Taylor P, Betal D, Savage J. The Mammographic Image Analysis Society digital mammogram database. In: Gale AG, Astley SM, Dance DR, Cairns AY, editors. Proceedings of the Second International Workshop on Digital Mammography. Amsterdam: Excerpta Medica; 1994.

Suganthi M, Madheswaran M. An enhanced decision support system for breast tumor identification in screening mammograms using combined classifier. In: ICWET'10: Proceedings of the International Conference and Workshop on Emerging Trends in Technology; 2010; Mumbai, India. ACM; 2010. p. 786-91.

Sulaiman SN, Ahmad KA, Baharudin R, Ahmad A, Harron NA, Saod AHM, Isa NAM, Yusoff IA. Performance of Hybrid Radial Basis Function network: adaptive Fuzzy K-Means versus Moving k-Means clustering as centre positioning algorithms on cervical cell precancerous stage classification. In: ICCSCE 2012: Proceedings of the IEEE International Conference on Control System, Computing and Engineering; 2012 Nov 23-25; Penang, Malaysia. IEEE; 2012. p. 607-11.

Tahmasbi A, Saki F, Shokouhi SB. CWLA: a novel cognitive classifier for breast mass diagnosis. In: ICBME 2011: Proceedings of the 18th Iranian Conference of Biomedical Engineering; 2011 Dec 14-16; Tehran, Iran. IEEE; 2011. p. 255-9.

Tan NM, Liu J, Wong DWK, Yin F, Lim JH, Wong TY. Mixture model-based approach for optic cup segmentation. In: EMBC 2010: Proceedings of the Annual International Conference of the IEEE Engineering in Medicine and Biology Society; 2010 Aug 31-Sept 4; Buenos Aires, Argentina. IEEE; 2010. p. 4817-20.

Tanner C, Hawkes DJ, Khazen M, Kessar P, Leach MO. Does registration improve the performance of a computer aided diagnosis system for dynamic contrast-enhanced MR mammography? In: Proceedings of the IEEE International Symposium on Biomedical Imaging: Nano to Macro; 2006 Apr 6-9; Arlington, USA. IEEE; 2006. p. 466-9.

Tolouee A, Moghaddam HA, Forouzanfar M, Gity M, Garnavi R. Image based diagnostic aid system for interstitial lung diseases. Expert Systems with Applications. 2011; 38(6):7755-65.

Usha BS, Sandya S. Measurement of ovarian size and shape parameters. In: INDICON 2013: Proceedings of the Annual IEEE India Conference; 2013 Dec 13-15; Mumbai, India. IEEE; 2013. p. 1-6.

Van Ginneken B, Armato III SG, Hoop B, Amelsvoort-van de Vorst S, Duindam T, Niemeijer M, Murphy K, Schilham A, Retico A, Fantacci ME, Camarlinghi N, Bagagli F, Gori I, Hara T, Fujita H, Gargano G, Bellotti R, Tangaro S, Bolaños L, De Carlo F, Cerello P, Cristian Cheran S, Lopez Torres E, Prokop M. Comparing and combining algorithms for computer-aided detection of pulmonary nodules in computed tomography scans: the ANODE09 study. Medical Image Analysis. 2010; 14(6):707-22. PMid:20573538.

Verikas A, Gelzinis A, Bacauskiene M, Uloza V. Towards a computer-aided diagnosis system for vocal cord diseases. Artificial Intelligence in Medicine. 2006; 36(1):71-84. PMid:16412950.

Verma B. Impact of multiple clusters on neural classification of ROIs in digital mammograms. In: IJCNN 2009: Proceedings of the International Joint Conference on Neural Networks; 2009 June 14-19; Atlanta, USA. IEEE; 2009. p. 2532-5.

Vertan C, Gheorghe DC, Ionescu B. Eardrum color content analysis in video-otoscopy images for the diagnosis support of pediatric otitis. In: ISSCS 2011: Proceedings of the 10th International Symposium on Signals, Circuits and Systems; 2011 June 30-July 1; Iasi, Romania. IEEE; 2011. p. 1-4.

Voigt D, Döllinger M, Braunschweig T, Yang A, Eysholdt U, Lohscheller J. Classification of functional voice disorders based on phonovibrograms. Artificial Intelligence in Medicine. 2010; 49(1):51-9. PMid:20138486.

Volpi SL, Antonelli M, Lazzerini B, Marcelloni F, Stefanescu DC. Segmentation and reconstruction of the lung and the mediastinum volumes in CT images. In: ISABEL 2009: Proceedings of the 2nd International Symposium on Applied Sciences in Biomedical and Communication Technologies; 2009 Nov 24-27; Bratislava, Slovakia. IEEE; 2009. p. 1-6.

Wada S, Matsumoto T, Murao K, Sone S. A study on the performance evaluation of computer-aided diagnosis for detecting pulmonary nodules for the various CT reconstruction. In: Jiang Y, Eckstein MP, editors. Medical Imaging 2006: image perception, observer performance and technology assessment. SPIE; 2006. SPIE Proceedings v. 6146.

Wagner RF, Metz CE, Campbell G. Assessment of medical imaging systems and computer aids: a tutorial review. Academic Radiology. 2007; 14(6):723-48. PMid:17502262.

Wang D, Shi L, Heng PA. Automatic detection of breast cancers in mammograms using structured support vector machines. Neurocomputing. 2009; 72(13):3296-302.

Wittenberg T, Wagner F, Gryanik A. Towards a computer assisted diagnosis system for digital breast tomosynthesis. Biomedical Engineering/Biomedizinische Technik. 2012; 57(SI-1):223-6.

Wu ZQ, Jiang J, Peng YH, Gulsrud TO. A filter-based approach towards automatic detection of microcalcification. In: Astley S, Brady M, Rose C, Zwiggelaar R, editors. Digital mammography. Heidelberg: Springer; 2006. p. 424-32. Lecture Notes in Computer Science v. 4046.

Xiao F, Liao CC, Huang KC, Chiang IJ, Wong JM. Automated assessment of midline shift in head injury patients. Clinical Neurology and Neurosurgery. 2010; 112(9):785-90. PMid:20663606.

Zhang G, Wang W, Moon J, Pack JK, Jeon SI. A review of breast tissue classification in mammograms. In: RACS'11: Proceedings of the ACM Symposium on Research in Applied Computation; 2011; Miami, USA. ACM; 2011. p. 232-7.

Zheng B, Pu J, Park SC, Zuley M, Gur D. Assessment of the relationship between lesion segmentation accuracy and computer-aided diagnosis scheme performance. In: Giger ML, Karssemeijer N, editors. Medical Imaging 2008: computer-aided diagnosis. SPIE; 2008. SPIE Proceedings v. 6915.



Received: 03 February 2014
Accepted: 05 August 2014



* e-mail:

All the contents of this journal, except where otherwise noted, are licensed under a Creative Commons Attribution License.