Using computed tomography enterography to evaluate patients with Crohn's disease: what impact does examiner experience have on the reproducibility of the method?

Objective To assess the impact that examiner experience has on the reproducibility and accuracy of computed tomography (CT) enterography in the detection of radiological signs in patients with Crohn's disease. Materials and Methods This was a retrospective, cross-sectional observational study involving the analysis of CT enterography scans of 20 patients with Crohn's disease. The exams were analyzed independently by two radiologists in their last year of residence (duo I) and by two abdominal imaging specialists (duo II). The interobserver agreement of each pair of examiners in identifying the main radiological signs was calculated with the kappa test. The accuracy of the examiners with less experience was quantified by using the consensus among three experienced examiners as a reference. Results Duo I and duo II obtained a similar interobserver agreement, with a moderate to good correlation, for mural hyperenhancement, parietal thickening, mural stratification, fat densification, and comb sign (kappa: 0.45-0.64). The less experienced examiners showed an accuracy > 80% for all signs, except for lymph nodes and fistula, for which it ranged from 60% to 75%. Conclusion Less experienced examiners have a tendency to present a level of interobserver agreement similar to that of experienced examiners in evaluating Crohn's disease through CT enterography, as well as showing satisfactory accuracy in identifying most radiological signs of the disease.


INTRODUCTION
Crohn's disease is one of the most prevalent inflammatory bowel diseases. It is a noncaseating, granulomatous disease, of uncertain origin, that can lead to inflammation of any part of the digestive tract, the terminal ileum being the most commonly affected site. The disease occurs transmurally, with discontinuous focal impairment of diverse parts of the gastrointestinal tract, in some cases inducing the formation of stenosis with intestinal obstruction and fistulas, among other complications (1) . In the assessment of patients with Crohn's disease, the radiologist plays a crucial role, not only in assisting in the initial diagnosis, via the definition of the extent of the disease, but also by establishing the degree of inflammatory activity and its associated complications (2,3) .
Diverse imaging methods can be used in order to assess Crohn's disease, among which are magnetic resonance (MR) enterography and computed tomography (CT) enterography. Although MR enterography presents the main advantage of not using ionizing radiation, it has certain limitations, such as lower availability and lower reproducibility, in comparison with CT enterography, mainly the diagnosis of the complications of the disease and during its acute phase (3)(4)(5) . For these reasons, CT enterography has been recommended as a valuable diagnostic option in the initial evaluation of patients suspected of having Crohn's disease and in the definition of its complications (2)(3)(4)(5) .
Numerous studies have shown good interobserver agreement in the evaluation of Crohn's disease with CT enterography (3)(4)(5) . However, studies that establish the impact of examiner experience on the reproducibility of the method are still lacking. Only one study of MR enterography quantified the impact of examiner experience on the evaluation of the main signs of Crohn's disease and concluded that more experienced radiologists tend to have better performance in the evaluation of vascular engorgement (designated the comb sign) and of lymph node enlargement (6) . Therefore, we do not know exactly whether, among less experienced radiologists and specialists in abdominal imaging, CT enterography shows similar accuracy and reproducibility (good interobserver agreement).
The objective of this study is to assess the level of agreement between examiners with lesser and greater degrees of experience in the interpretation of the main symptoms of Crohn's on CT enterography, as well as determining the accuracy of these young professionals.

MATERIALS AND METHODS
This was a cross-sectional, retrospective, observational study, undertaken at a single institution between January 2012 and December 2013. The study was approved by the Research Ethics Committee of the Federal University of São Paulo Paulista School of Medicine. All patients gave written informed consent.
We included 20 consecutive patients with a clinical diagnosis of Crohn's disease, all of whom were followed at the inflammatory bowel disease outpatient clinic and in whom CT enterography was indicated because of suspicion of inflammatory activity or complications related to the disease.
The exclusion criteria were as follows: patients who refused to participate in the study; having a history of severe allergic reaction to iodinated contrast medium; having renal failure; and presenting with any other contraindication to abdominal CT involving the use of iodinated contrast medium.
Of the 20 patients evaluated, 6 (30%) were female and 14 (70%) were male. Ages ranged from 28 to 65 years (mean, 39.5 years). The mean time since the onset of symptoms was 11.2 years (range, 1-17 years). On average, the disease had been diagnosed 8.3 years before the patient underwent CT enterography.

CT enterography procedure
We carried out all the examinations with a 64-channel multislice CT scanner (Brilliance 64; Philips Medical Systems, Best, The Netherlands). The multislice technique and volumetric acquisition were used, extending from the diaphragm to the pubic symphysis, 50 s after the start of the intravenous injection of the contrast medium (7) . The contrast medium was administered with an automated injection pump, at a speed of 3 mL/s, in a volume of 2 mL/kg of body weight, with a maximum total volume of 150 mL. The following technical parameters were used in the CT examinations: collimation, 64 × 0.624; pitch, 0.891; slice thickness, 1 mm and 3 mm; tube voltage, 120 kVp; and tube current, variable (depending on the abdominal thickness of the patient).
To determine the radiation dosage, we used dose modulation software (8) . In terms of the endoluminal contrast, the CT enterography protocol involved oral administration of a polyethylene glycol solution, as previously reported (9,10) . The examinations and image acquisition took place 45 min after the initiation of the oral administration of the contrast medium (11) .

Examiners
Four examiners, working independently and blinded to the patient clinical data, interpreted the CT enterography examinations, using a picture archiving and communication system (Synapse; Fujifilm Medical Systems, Stamford, CT, USA), on the workstation provided with the CT equipment. Of the four examiners, two were third-year residents in general radiology and diagnostic imaging (duo I) and two were radiologists specializing in abdominal imaging with more than five years of experience in performing CT enterography (duo II). The examiners with less experience had previously received a short training course, evaluating the CT enterography examinations of ten Crohn's disease patients who were not included in this study. A third specialist in abdominal imaging, with 25 years of experience, subsequently evaluated in consensus with the experienced examiners the CT enterography examinations, determining the presence or absence of the main radiological signs related to Crohn's disease. This consensual evaluation was used as the reference for the calculation of the accuracy of duo I.

Radiological signs
We evaluated the presence of the main radiological signs of Crohn's disease, its inflammatory activity and its complications ( Figure 1), defined as the following (12)(13)(14)(15) : -Wall thickening: thickening of the intestinal wall ≥ 5 mm.
-Mucous enhancement: visual comparison with healthy adjacent loops.
-"Comb sign": corresponding to the vasa recta being dilated from the inflammatory process.
-Densification of the fat adjacent to the loop.
-Regional lymph node enlargement: lymph nodes measuring ≥ 5 mm along their short axis. We also measured the number of enlarged lymph nodes identified.
-Presence of stenosis: reduction of the caliber of the loop with upstream distension, characterized by the small loop segment having a cross-sectional diameter > 2.5 cm, with or without collapse of the distal fragment.
-Presence of fistula: defined as a linear image, with peripheral enhancement, connecting two anatomical structures, with or without air or fluid content.

Systematization of the analysis of images
The radiological signs described above were evaluated in a segmented and independent manner in the following intestinal segments: jejunum, ileum, right colon, transverse colon, and left colon. Therefore, for each patient, the presence or absence of the radiological signs were evaluated five times, or rather, in five segments. Consequently, a total of 100 intestinal segments were evaluated.
Due to the difficulty of analyzing the interpretation of lymph node enlargement and intestinal fistulas in a segmented manner, we interpreted those variables only as present or absent in each participant and not per intestinal segment.

Calculation of accuracy
The sensitivity, specificity, and accuracy of the examiners in duo I, in terms of the detection of radiological signs of Crohn's disease on CT enterography, were calculated using the consensus analysis carried out by the three experienced examiners as a reference.

Prevalence of the findings
According to the overall evaluation of the examiners, the following symptoms were encountered: densification of the wall, in 22% of the intestinal segments evaluated; thickening of the wall, in 24%; prominent mucous increase, in 28%; mural stratification, in 27%; the comb sign, in 26%; and stenosis, in 20%. All these symptoms were most prevalent in the ileum. The prevalence of fistula and lymph node enlargement occurred, respectively, in 65 and 50% of patients studied.

Measure of reproducibility
The two pairs of examiners presented similar results for reproducibility. There was moderate to good interobserver agreement (a strong correlation) between the two less experienced examiners (duo I), as well as between the two more experienced examiners (duo II), in the assessment of the majority of radiological signs of Crohn's disease, with the exception of lymph node enlargement and fistulas, for which there was poor interobserver agreement (a weak correlation) in both analyses ( Table 1).

Measure of accuracy
The less experienced examiners (duo 1) presented sensitivity, specificity, and accuracy above 80% for the variables mural densification, the comb sign, wall thickening, and mucous enhancement. Accuracy in the analysis of lymph nodes and fistulas ranged from 60% to 75% (Table 2).

DISCUSSION
The main objective of this study was to evaluate the reproducibility of CT enterography, which is simply the extent to which examiners agree between or among themselves. The evaluation of interobserver agreement is important to the validation of the use of the technique, because we expect that similar results can be reproduced in a consistent manner for different examiners. It is also crucial to establish what impact the experience of these examiners has on the reproducibility of the method, which can increase the robustness of its use in the diagnosis and monitoring of a given disease (16,17) . In the present study, we found that examiner inexperience had little influence on the reproducibility of CT enterography in the identification of the main radiological signs of Crohn's disease, as long as a short training course covering the method and disease was undertaken. The accuracy obtained by these examiners was considered satisfactory for all of the radiological signs, except for lymph node enlargement and fistula.
There have been few studies in which the main objective was to measure the reproducibility of CT enterography in the evaluation of patients with Crohn's disease (4,18) . This evaluation is made, generally, in a secondary manner, which can cause the results to diverge across studies. For example, some studies have reported excellent agreement for the detection of signs of Crohn's disease (3,19) , whereas others have reported only moderate agreement for the majority of symptoms (4,18) , with results similar to those obtained in the present study. This discrepancy can be explained by a number of variables, among which we can cite differences in examination protocols, in the quality of images obtained, and in the severity of the disease among the patients included in the study sample, heightened by the fact that radiological signs utilized have a certain degree of subjectivity, which requires their validation in reproducibility studies.
In the present study, for the variables that indicate the inflammatory activity of Crohn's disease, such as wall thickening, mucous enhancement, stratification, densification of the mesenteric adipose tissue, the comb sign, and stenosis, the agreement between the two inexperienced examiners was similar to that observed between the two more experienced examiners. This is in agreement with the results obtained by other authors who also quantified the reproducibility of CT enterography among specialists in abdominal imaging (18) . Our results suggest that the experience of the examiner has little impact on the reproducibility of CT enterography in the evaluation of patients with Crohn's disease. Our results are also similar to those obtained by other researchers in regard to the comb sign, wall stratification, and mucous enhancement criteria, as evaluated by experienced examiners (4) . As described in a similar study quantifying the reproducibility of CT enterography in evaluating Crohn's disease (6) , we observed poor agreement in the evaluation of mesenteric lymph nodes, which could reflect the difficulty and subjectivity of evaluating this variable with CT enterography and MR enterography. However, unlike MR enterography, CT enterography showed good agreement in the evaluation of diverse signs of inflammatory activity of the disease, indicating a tendency toward improvement in the reproducibility of the method.
In the present study, the level of agreement between the experienced examiners was surprisingly low in the evaluation of fistulas. The reported prevalence of fistula in patients with Crohn's disease is generally 10-30% (20,21) , considerably lower than the 60% observed in our patient sample. It is possible that the elevated prevalence of fistula in our sample led some examiners to overestimate or underestimate the symptom, resulting in the discrepancy of results between examiners and consequently in poor agreement. Detection of a fistula between the bowel loops by CT enterography uses subjective criteria and is difficult to substantiate, because surgical exploration is typically contraindicated in this patient population (20,22) , as well as because other, potentially more effective, diagnostic methods, such as barium followthrough, capsule endoscopy, and CT with oral administration of water-soluble iodized (positive) contrast medium, are not routinely employed (20,23,24) . That could explain our finding of poor agreement even between examiners with more experience, motivating us to search for signs that are more consistent with a diagnosis of internal fistula in Crohn's disease and to use orally administered contrast media in the CT studies conducted because of suspicion of internal fistula. It should be borne in mind that, despite their desirability, there have been few studies demonstrating the reproducibility and accuracy of barium follow-through or the reproducibility of CT with positive contrast media in the diagnosis of enteral fistula. However, some of the studies aimed at establishing the value of CT enterography in the evaluation of fistula have done so without determining the accuracy of the method or its reproducibility or by studying samples of patients with known fistula, which reduces the strength of the CT enterography results (20,24) . Further studies, utilizing reliable reference standards, will be needed in order to evaluate not only the reproducibility but also the accuracy of CT enterography in the diagnosis of enteroenteric fistulas.
This study presents certain limitations. First, the patient sample was small. However, all of the patients evaluated presented an advanced stage of the disease with multiple radiological signs, increasing the prevalence of the variables analyzed. In addition, we used a reference standard based solely on a consensus between experienced examiners, although there is no surgical or histological gold standard, because surgical intervention is contraindicated in the majority of patients with Crohn's disease (24)(25)(26) . Furthermore, the study was carried out in a retrospective manner. Nevertheless, the homogeneity in the form of CT enterography acquisition, which has been included in the protocols of previous studies, permitted a consistent analysis in examinations considered to be of good diagnostic quality. Moreover, because of a lack of data corresponding to the period during which the CT enterography examinations were performed, we were unable to correlate the CT findings with clinical and laboratory changes in the patients (and therefore with the index of inflammatory activity). Further studies are warranted in order to determine the impact of examiner experience on the evaluation of the degree of inflammatory activity in Crohn's disease.

CONCLUSION
In the present study, examiner experience was found to have no substantial impact on the reproducibility of CT enterography in the assessment of patients with Crohn's disease and a short training course was found to be adequate to obtain satisfactory accuracy in the detection of the main CT signs of the inflammatory activity of the disease. The diagnosis of enteral fistula by CT enterography remains a challenge that merits improvement in acquisition techniques, better examination protocols, and more examiner training.