Genome size of three Brazilian flies from the Sciaridae family

We determined the genome size of the Brazilian sciarid flies Bradysia hygida, Rhynchosciara americana and Trichosia pubescens (Diptera: Sciaridae) using absorbance measurements of Feulgen-stained nuclei from these species, with chicken erythrocytes as a standard, to calculate the haploid DNA content (C-value) in picograms (pg) and the corresponding number of base pairs (bp) for each species. The C-values were: 3 x 10^8 bp (0.31 pg) for B. hygida; 3.6 x 10^8 bp (0.37 pg) for R. americana; and 1 x 10^9 bp (1.03 pg) for T. pubescens. The sciarids investigated in this work had considerably higher C-values than the average for previously described dipteran species, including D. melanogaster.

Determining the genome size of an organism provides data relevant both to the classification of the organism and to comparative studies of genome structure and evolution (Hardie et al., 2002), and is also useful in molecular biology studies such as the development of genome sequencing strategies.
In haploid genomes the amount of DNA (C-value) is a unique characteristic of each species. The C-value is given in base pairs (bp) and can range from 10^5 to 10^7 bp in prokaryotes and from 10^7 to 10^11 bp in eukaryotes. Despite this variation, it is difficult to establish a relationship between the haploid genome size and the morphological, physiological or developmental complexity of an organism (Lozovskaya et al., 1999). This problem was originally known as the C-value paradox and was subsequently renamed the C-value enigma (Gregory, 2002; 2003). The sequencing of several genomes has also shown that gene number does not correlate with organism complexity (the G-value paradox), which can be explained by several factors such as gene redundancy, alternative splicing and post-translational modifications (Hahn and Wray, 2002).
Birds, reptiles and mammals display only a small variation in the DNA content within classes, with the genome size of birds being the most conserved (Tiersch and Wachtel, 1991). However, in amphibians, fish and insects there is a wide variation in C-values between species whose apparent complexity does not vary greatly.
Among the insects whose genome size has already been determined, Drosophila melanogaster has a compact genome of 1.8 x 10^8 bp (Adams et al., 2000), while the haploid DNA content of the mosquito Aedes albopictus is about sixfold larger (Lengyel and Penman, 1975). Laupala crickets have a genome eleven times larger than that of Drosophila (Petrov and Hartl, 2000), and Podisma pedestris, the brown mountain grasshopper, has a haploid genome size of 1.8 x 10^10 bp (Bensasson et al., 2001), the largest yet reported for insects (Gregory, 2001).
In general, the known genome sizes cover only a small fraction of the species in each class, genome sizes having been estimated in only 2% of birds, 3% of fishes, 4% of reptiles, 7% of mammals and 8% of amphibians (Gregory, 2002). The invertebrates constitute the major form of multicellular life, but the genome size of only about 1,300 invertebrate species is available in Gregory's Animal Genome Size Database (Gregory, 2001). For Diptera, the genome size has been described for only 12 families, mainly Drosophilidae and Culicidae, and no data are available for other dipteran families such as the Sciaridae, although insects of this family have been widely used in molecular and cell biology research (Fiorini et al., 2001; Basso Jr. et al., 2002; Monesi et al., 2003; Soares et al., 2003). The description of their genome size is important not only in these research areas but could also help to resolve evolutionary questions relating to the Diptera. The aim of the study presented in this paper was to contribute to the data on insect genome size by determining the DNA content and the genome size of the Brazilian sciarids Bradysia hygida, Rhynchosciara americana and Trichosia pubescens using absorbance measurements of Feulgen-stained neuroblast nuclei from larvae of these species.
The sciarid Bradysia hygida (Sauaia and Alves, 1968) has been reared since 1995 at the Departamento de Biologia Celular e Genética at the Universidade Estadual de Maringá, Paraná, Brazil, according to the conditions described by Laicine et al. (1984) and da Conceição Silva and Fernandez (2000). At 20 °C B. hygida has a 36-day life cycle. During the larval stage three molts occur, delimiting four instars. The larval eyespots appear on the sixth day of the fourth instar and are a useful marker to determine larval age and establish the different developmental stages of this instar (Laicine et al., 1984). In our investigation we used E5 aged female larvae. The B. hygida genome is partitioned into four chromosome pairs (A, B, C and X), males being X0 and females XX (Borges et al., 2000).
At 20 °C Rhynchosciara americana (Breuer, 1967) has a 60-day life cycle with three molts delimiting four instars. The larvae have no visible eyespots, so the developmental stages during the fourth instar have to be established by changes in the communal cocoon, which initially consists of a loose and transparent net but gradually changes to a more solid and white structure due to calcium carbonate deposition. Because this species only survives for three generations under laboratory conditions, we collected R. americana larvae in the field and maintained them at 20 °C for one generation using the same diet as for B. hygida (Laicine et al., 1984; da Conceição Silva and Fernandez, 2000). For our study we sacrificed the R. americana larvae during the 3A period (Yokosawa et al., 1999), just before the onset of DNA amplification, which in this species occurs during the fourth instar.
We obtained Trichosia pubescens (Morgante, 1969) from a larval culture originally collected in 1973 at Mogi das Cruzes, São Paulo, Brazil (Amabis, 1983); a permanent culture of this fly is maintained by Dr. Eliana Dessen (Instituto de Biociências, Universidade de São Paulo, São Paulo, Brazil). The life cycle and instar stages of T. pubescens are similar to those described above for B. hygida, and for our investigation we used 19-day-old L5 larvae.
Reference slides for our study were made using cells from 5-day-old chickens (Gallus domesticus) obtained from a commercial animal house in Maringá city.
For neuroblast nuclei preparations, brains of E5 aged female B. hygida larvae, 3A period R. americana, and L5 T. pubescens were dissected in 50 mL of 0.7% (w/v) aqueous sodium chloride under a stereo microscope and processed as described by Borges et al. (2000) and Gaspar et al. (2002). Briefly, 10 larval brains were transferred to a 5 mL petri dish with 1% (w/v) hypotonic aqueous trisodium citrate for 10 min, changing the solution once during this time. The brains were then fixed in 3:1 methanol:acetic acid for 1 h at room temperature, with the fixative changed at 15 min intervals. After fixation, slides were prepared by transferring two brains to a slide and squash mounting them in a drop of 45% (v/v) aqueous acetic acid under a coverslip, which was subsequently removed by freezing the slide in liquid nitrogen and flipping the coverslip off with a razor blade. The slides were then fixed again in 3:1 ethanol:acetic acid for 1 h at 4 °C, air dried and stored at 4 °C until staining.
Chicken erythrocyte slides were obtained as described by Falco et al. (1999). Briefly, 10 mL of blood was collected in 2 mL of 0.9% (w/v) aqueous sodium chloride supplemented with 0.24 M EDTA and slides were prepared by the spread method. The chicken erythrocyte slides were fixed either in 3:1 ethanol:acetic acid for 1 min (for the B. hygida and R. americana experiments) or in 1.75% paraformaldehyde/0.1% glutaraldehyde in 0.1 M phosphate buffer pH 7.4 for 15 min (for the T. pubescens experiments), then washed in 70% (v/v) aqueous ethanol for 5 min, air dried and stored at 4 °C until staining.
All staining procedures were performed at the same time and with the same solutions for each group. The slides were hydrated in 50%, 30%, 10% (v/v) ethanol and distilled water (5 min at each concentration) and DNA depurination was carried out in 4 N hydrochloric acid at 25 °C for 25 min (conditions established previously, data not shown). The slides were then washed in distilled water for 1 min and stained with Schiff's reagent for 30 min at 25 °C. After staining, the slides were incubated in two changes of sulphurous water for a total of 30 min, after which the preparations were washed three times in distilled water for 2 min, dehydrated in serial baths of 50%, 70%, and absolute ethanol and finally incubated for 2 min in 1:1 ethanol:xylol and then pure xylol. The preparations were then covered with a coverslip in immersion oil.
Images of the B. hygida, R. americana and T. pubescens neuroblast nuclei and G. domesticus erythrocyte nuclei were captured using an Axiophot Zeiss photomicroscope fitted with a 100x oil-immersion objective, polychromatic light source and a DXC107-A Sony camera using Snappy 3.0 software (Play Inc.). The images were analyzed using the free UTHSCSA ImageTool program developed at the University of Texas Health Science Center at San Antonio, Texas and available from the Internet by anonymous FTP (ftp://maxrad6.uthscsa.edu). This tool allowed us to measure the gray values of each nucleus, the resulting integrated densitometry data being analyzed using Microsoft ® Excel ® software.
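As a rough illustration of how per-pixel gray values could be turned into an integrated optical density, the sketch below converts each gray value to an optical density relative to a background level and sums over the nucleus. The function name `integrated_optical_density`, the `background` parameter and the log-ratio conversion are assumptions made for illustration only; the text states that gray values were measured in ImageTool and analyzed in Excel, but does not give the formula that was used.

```python
import math

def integrated_optical_density(gray_values, background):
    """Sum per-pixel optical densities for one nucleus.

    gray_values: pixel gray levels inside the nucleus (hypothetical input).
    background:  gray level of the empty field near the nucleus (assumed).
    Assumes OD = log10(background / gray), a Beer-Lambert style conversion.
    """
    if background <= 0:
        raise ValueError("background must be positive")
    total = 0.0
    for gray in gray_values:
        gray = max(gray, 1)  # avoid log of zero for fully dark pixels
        # Pixels brighter than the background contribute nothing.
        total += max(math.log10(background / gray), 0.0)
    return total

# Toy nucleus with made-up gray values, not data from the study.
toy_pixels = [20, 20, 200, 2]
print(integrated_optical_density(toy_pixels, background=200))
```

Summing optical densities rather than raw gray values matters because, under this assumption, absorbance and not transmitted intensity is proportional to the amount of stained DNA.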
The genome size of the sciarids B. hygida, R. americana and T. pubescens was determined by absorbance measurements of Feulgen-stained nuclei. The study was performed in three groups, B. hygida (group 01), R. americana (group 02) and T. pubescens (group 03), with a set of G. domesticus slides being processed at the same time as each group.
Slide images were captured and the Feulgen-DNA integrated optical density (Feulgen-DNA IOD) of the nuclei determined, the mean and standard deviation of which are shown in Table 1. These data allowed us to determine the C-value and the genome size using simple calculations. The C-value of G. domesticus is well established at 1.25 pg of DNA (Smith and Burt, 1998). The comparison between the Feulgen-DNA IOD of the sciarids and G. domesticus resulted in a C-value of 0.31 pg of DNA for B. hygida, 0.37 pg for R. americana and 1.03 pg for T. pubescens (Table 2). The next step was to convert these values to base pairs. According to Dolezel et al. (2003), the number of base pairs is equivalent to the mass in picograms multiplied by 0.978 x 10^9. The processed data resulted in a genome size of 3 x 10^8 bp for B. hygida, 3.6 x 10^8 bp for R. americana, and 1 x 10^9 bp for T. pubescens.
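The two calculation steps described above can be sketched as follows. The function names are hypothetical, and the first step assumes that the C-value scales linearly with the ratio of mean IODs between sample and standard nuclei of the same ploidy; the text reports only the outcome of that comparison, not the formula. The constants are the ones quoted in the text.

```python
CHICKEN_C_VALUE_PG = 1.25  # G. domesticus standard (Smith and Burt, 1998)
BP_PER_PG = 0.978e9        # conversion factor (Dolezel et al., 2003)

def c_value_pg(sample_iod, standard_iod, standard_c_value_pg=CHICKEN_C_VALUE_PG):
    """Scale the standard's C-value by the ratio of mean IODs (assumed proportionality)."""
    return sample_iod / standard_iod * standard_c_value_pg

def pg_to_bp(mass_pg):
    """Convert a DNA mass in picograms to a number of base pairs."""
    return mass_pg * BP_PER_PG

# C-values reported in the text (pg), converted to base pairs.
reported_pg = {"B. hygida": 0.31, "R. americana": 0.37, "T. pubescens": 1.03}
for species, pg in reported_pg.items():
    print(f"{species}: {pg} pg = {pg_to_bp(pg):.2e} bp")
```

Run on the reported C-values, the conversion gives about 3.03 x 10^8, 3.62 x 10^8 and 1.01 x 10^9 bp, which round to the genome sizes stated above.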
Fluorescence DNA quantification analysis of B. hygida salivary gland polytene nuclei from larvae at the end of the larval stage (Paçó-Larson, 1976) showed values consistent with ours. During larval development B. hygida polytene nuclei undergo 12 replication cycles plus an additional replication for the S1 gland region (21% more DNA) that results in a DNA content of 3 ng. If the B. hygida diploid genome DNA content obtained by us (0.62 pg, equivalent to 6.2 x 10^-4 ng) is subjected to 12 replication cycles plus an additional 21% replication, the final result is approximately 3 ng, in agreement with the value reported by Paçó-Larson (1976) for polytene nuclei.
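The consistency check above can be reproduced with a few lines of arithmetic. This is a minimal sketch under one reading of the text: each of the 12 replication cycles doubles the DNA content, and the additional S1 replication adds 21% on top. The function name and parameters are hypothetical.

```python
def polytene_dna_ng(diploid_pg, cycles=12, extra_fraction=0.21):
    """DNA content (ng) after `cycles` doublings plus an extra fractional replication."""
    total_pg = diploid_pg * 2 ** cycles * (1 + extra_fraction)
    return total_pg / 1000.0  # 1 ng = 1000 pg

# Diploid DNA content of B. hygida from this study: 2 x 0.31 pg = 0.62 pg.
print(round(polytene_dna_ng(0.62), 2))
```

Under these assumptions the result is about 3.07 ng, close to the 3 ng reported for polytene nuclei.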
The previous name for R. americana was Rhynchosciara angelae, and determination of the R. angelae genome size by indirect methods (Balsamo et al., 1973) showed a smaller genome size than that calculated by us. Results from biochemical analysis revealed a genome size of 2.1 x 10^8 bp (Balsamo et al., 1973), but since the value of the E. coli genome used as reference was 4 x 10^6 bp and not 4.7 x 10^6 bp (Blattner et al., 1997), the R. americana genome may be larger than previously described.
There are no data in the literature on the polytene nucleus DNA content of T. pubescens that could be compared to our results. However, the amount of DNA in the nuclei of different tissues of Sciara coprophila was determined by cytospectrophotometry to be about 2 pg by Rasch (1970).
The sciarids used in our work have considerably greater C-values than the average C-value of the representative dipteran species already described, and also greater than the genome size (0.18 pg) of the model organism D. melanogaster (Table 2; Gregory, 2001). It is interesting to note that the largest C-values in Table 2 are for T. pubescens and Musca domestica.
Only a small fraction of the total genome is involved in coding, and there is a considerable presence of non-coding sequences such as pseudogenes, transposable elements and repetitive sequences in general (Petrov, 2001; Prokopowich et al., 2003). According to Hancock (2002), transposable elements and the accumulation of repetitive sequences are the main genetic mechanisms responsible for variations in genome size.
The reported size of a genome can be affected by other genetic mechanisms such as mutation components, e.g., polyploidy, fixation of accessory chromosomes, large duplications or chromosome deficiencies (Lozovskaya et al., 1999; Petrov et al., 2000). The genetic mechanisms contributing to the C-value paradox can sometimes be classified as selection components, e.g., a genome cannot be smaller than the size needed to include all the essential genes, nor can it be so big as to require excess energy to maintain itself (Lozovskaya et al., 1999). Petrov et al. (2000) have pointed out that in evolutionary terms variations in genome size can be explained by assuming that organisms with larger genomes have exhibited low rates of DNA elimination over thousands of years. This extra DNA may encode proteins, participate directly in the regulation of that process, or have no known function (Kimura et al., 2001).
The importance of the gene number (G-value) has been reported in some studies (Betrán and Long, 2002). It is known that Aedes albopictus has six times more DNA than D. melanogaster although both species present the same level of gene expression, with 3,000 to 4,000 mRNA sequences having been observed in cultured cells of both of these species (Lengyel and Penman, 1975). Such data support the G-value paradox, which states that the genome size is independent of the number of coding sequences.
The high DNA content observed by us in the genomes of the sciarids B. hygida, R. americana and T. pubescens is interesting and needs to be investigated, possibly by the analysis of repeated sequences.