A.N. Kolmogorov’s defence of Mendelism

In 1939 N.I. Ermolaeva published the results of an experiment which repeated parts of Mendel’s classical experiments. On the basis of her experiment she concluded that Mendel’s principle that self-pollination of hybrid plants gave rise to segregation proportions 3:1 was false. The great probability theorist A.N. Kolmogorov reviewed Ermolaeva’s data using a test, now referred to as Kolmogorov’s, or Kolmogorov-Smirnov, test, which he had proposed in 1933. He found, contrary to Ermolaeva, that her results clearly confirmed Mendel’s principle. This paper shows that there were methodological flaws in Kolmogorov’s statistical analysis and presents a substantially adjusted approach, which confirms his conclusions. Some historical commentary on the Lysenko-era background is given, to illuminate the relationship of the disciplines of genetics and statistics in the struggle against the prevailing politically-correct pseudoscience in the Soviet Union. There is a Brazilian connection through the person of Th. Dobzhansky.


Introduction
Kolmogoroff (1940) [note that in bibliographies Kolmogorov's name is frequently spelled Kolmogoroff, as is done herein whenever references are given] analysed two tables, Tables 4 and 6 of Ermolaeva (1939), who summarized and analysed the results of a series of experiments which she had done in the preceding years. Ermolaeva followed the design of some experiments made by Mendel (1866), in what may now be seen as a pointless exercise, to disprove Mendel's principal law of inheritance. Nowadays every basic course of biology states that if one observes self-pollination of a hybrid plant the proportion of dominant plants grown from the resultant seeds will be 3/4. The main part of Ermolaeva's data related to colour of seed coat: white vs. greyish-brown, correlated with white vs. violet flowers (Ermolaeva's Table 4) and colour of cotyledons: yellow or green (Ermolaeva's Table 6). The dominant states are respectively greyish-brown seed coat and yellow cotyledon. Ermolaeva did extensive experiments on colour of the seed coat and colour of the seed cotyledon.
Ermolaeva said that her data did not support a model of a constant underlying proportion and in this she was supported by Lyssenko (1940) [note that Lysenko's name frequently appears as Lyssenko in the literature] who therefore concluded that Kolmogorov was wrong. But Kolmogoroff (1940) wrote: "This material, despite Ermolaeva's claims to the contrary, has proved to be a new brilliant confirmation of Mendel's laws." Kolmogorov's paper is interesting for a number of reasons: it appeared at a critical time for the discipline of genetics in the Soviet Union but also it was an early example of the application of his own statistical test (Kolmogoroff, 1933).
Let $S_n(x)$ denote the empirical distribution function of a simple sample of size $n$ drawn from a population in which the random variable $X$ has a continuous distribution function $F(x)$. That is, $S_n(x) = N(x)/n$, where $N(x)$ is the number of sample values $\le x$. Denote by $D_n$ the supremum over the full range of $x$ of $|S_n(x) - F(x)|$. Kolmogoroff (1933) gave the limit distribution of the random variable $D_n$, giving an expression for the limiting form as $n \to \infty$ of $\Pr(D_n < \lambda/\sqrt{n})$ for arbitrary positive $\lambda$. Since $D_n$ tends to zero as $n \to \infty$, Kolmogorov's formula provides the basis of a test that a sample of values of $X$ comes from a postulated distribution $F$, provided $n$ is large. The limiting expression is given in our third section.
The main part of this paper examines Kolmogorov's application of his test to Tables 4 and 6 of Ermolaeva (1939) following a section on the data itself. We reproduce the relevant columns of these two tables as our Tables 4 and 5 respectively. Ermolaeva (1939) concluded that her experiment proved that self-pollination of hybrids, Aa × Aa, did not produce a consistent segregation ratio. The second issue, assuming that there is a consistent ratio, is whether the proportion of dominants is 3/4. The fourth section of this paper uses a partition of $\chi^2$ to analyse both issues. Some historical background is given in the fifth section and some general comments in the final section.

Ermolaeva's Experiment
Ermolaeva's Table 4 consists of 98 entries relating to seed coat colour. It appears that each entry gives the numbers of dominant and recessive plants (or potential plants, if grown) produced by a single hybrid plant. Thus, Table 4 provides n = 98 sample values. The variate of interest is the observed proportion of dominants and so the binomial distribution provides the model of variation. Kolmogorov exploited the normal as an approximation to the binomial distribution. Table 5 provides 122 values relating to colour of cotyledon.
A question raised by Kolmogoroff (1940) concerned the numbers of plants in each family, that is in each line of Tables 4 and 5, and hence of the validity of using the standard normal distribution as a model of the binomial. Taking 20 as a desired minimum number of seeds (justification in our Section 4), it is seen that the data relating to seed-coat colour are not satisfactory: only a small number of families have number of seeds of 20 or more. The summary details for Table 4 are: minimum number 2, first quartile 9, median 11.5, third quartile 17, and maximum 33. The numbers in respect of cotyledon colour are more satisfactory: minimum 6, first quartile 16.25, median 22, third quartile 28.75, maximum 64. Clearly the use of Kolmogorov's test given below is easier to justify in respect of cotyledon colour.
In the light of the low counts in many families, there would be many observed proportions varying markedly from Mendel's proportion 3/4. Also, in view of experience obtained from experiments before and after Ermolaeva's, some of the results obtained could be explained only as resulting from technical errors such as a parental plant being homozygotic rather than hybrid. Table 1 of our paper is a reproduction of part of Ermolaeva's Table 3 which relates to cotyledon colour. Her Table 3, closely related to the data in her Table 6 (our Table 5), but not entirely to only this data, gives the lines used to obtain the hybrid seeds, reference numbers to sets of pollinations, the numbers of plants in a set, the number of seeds classified as dominant, the number recessive, the percentage of seeds dominant and Ermolaeva's indication of goodness of fit to Mendel's 3:1 model, poor fit being denoted by the symbol 'P'. Some of the indications of significance in Ermolaeva's tables are based on far more stringent requirements than are customary. For example, the very first entry in our Table 1 is marked as significant when the observed number of dominants differs from expected by about one standard error.
Table 2 of Ermolaeva, closely related to the data in her Table 4, but not entirely to only this data, is reproduced here as our Table 2, and gives a list of the lines used to obtain the hybrid seeds used to study seed-coat colour. It summarises the numbers of dominant and recessive forms obtained from individual seeds. Data in Ermolaeva's Table 5 relating to seed form were obtained from only 5 plants and are not considered here. Ermolaeva (1939) noted one item of detail: "We did not have the opportunity to cross the same pair of plants several times, due to the fact that peas have a comparatively low number of flowers and for a short period of time. Because of this we took several pairs of the same pure-bred types of peas."
178 Stark and Seneta
Fisher (1936) noted that on average about 30 seeds were classified from each plant in some of Mendel's experiments. As can be seen from Tables 1 and 2 of the present paper, on average fewer [than 30] seeds were classified from each mother plant in Ermolaeva's experiment.
Tables 1 and 2 of this paper show that the same parental line (47) was used in the production of all hybrids of the two characters. Ermolaeva did not indicate which line was used as the mother plant from which the $F_1$ seeds were taken. It may be that she followed Mendel in making the cross in both reciprocal directions.
Summing the numbers in Table 1 yields 2023 dominant and 745 recessive seeds so that the percentage of dominants is 73.1%. The standard error of the observed proportion assuming hypothetical value 0.75 and number of seeds 2023 + 745 = 2768 is 0.00823. Dividing the difference of the observed proportion from 0.75 by the standard error gives an approximate standard normal value 2.326. The two-sided probability of exceeding this value is approximately 2%.
Summing the numbers in Table 2 yields 1008 dominant and 355 recessive seeds so that the percentage of dominants is 74.0%. The standard error of the observed proportion assuming hypothetical value 0.75 and number of seeds 1008 + 355 = 1363 is 0.01173. Dividing the difference of the observed proportion from 0.75 by the standard error gives an approximate standard normal value 0.891. The two-sided probability of exceeding this value is approximately 37%.
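The two pooled comparisons above follow the same arithmetic, which can be checked with a minimal sketch. The function name `pooled_z` is illustrative and not taken from the source; the counts are the totals quoted in the text for Tables 1 and 2.

```python
from math import sqrt
from statistics import NormalDist


def pooled_z(dominant, recessive, p0=0.75):
    """Return (observed proportion, standard error under p0, |z|, two-sided p)."""
    n = dominant + recessive
    p_hat = dominant / n
    se = sqrt(p0 * (1 - p0) / n)      # standard error assuming proportion p0
    z = abs(p_hat - p0) / se          # approximate standard normal value
    p_two_sided = 2 * (1 - NormalDist().cdf(z))
    return p_hat, se, z, p_two_sided


# Totals quoted in the text: Table 1 (cotyledon colour), Table 2 (seed-coat colour)
table1 = pooled_z(2023, 745)
table2 = pooled_z(1008, 355)
for name, (p_hat, se, z, p) in (("Table 1", table1), ("Table 2", table2)):
    print(f"{name}: proportion={p_hat:.3f} se={se:.5f} z={z:.3f} p={p:.3f}")
```

The same function applies unchanged to any other pooled pair of dominant and recessive counts.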
There are many discrepancies between Ermolaeva's Tables 4 and 6 and the earlier tables.
Rather than referring to the vast amount of work carried out elsewhere which overwhelmingly supported Mendel's laws, Ermolaeva (1939) included a quotation from the Lysenko-era geneticist Lev Nikolaevich Delone (Delaunay). Delone had established a reputation in the Soviet Union using radiation to induce mutations in wheat. He adopted the usual rhetorical device of attributing to the Mendelians something which they would not use in practice. In this case it concerned a plan to produce a plant with a desirable trait or combination of traits controlled by a large number of recessive genes. Delone stated that the probability of obtaining a plant with the desired characteristic from hybrids is $4^{-n}$, where $n$ is the number of independent (unlinked) genes controlling the trait. When $n$ is large, the correctness of this formula is precisely the reason why a Mendelian would not use a mass planting in the hope of finding a plant with the desirable combination of traits.
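Delone's formula can be made concrete with a short worked equation. Self-pollination of a hybrid at a single locus yields the recessive homozygote with probability 1/4, so for $n$ unlinked loci the probabilities multiply. The value $n = 10$ below is chosen purely for illustration and does not come from Delone or Ermolaeva.

```latex
\Pr(\text{recessive at all } n \text{ loci})
  = \left(\tfrac{1}{4}\right)^{n} = 4^{-n},
\qquad
4^{-10} = \frac{1}{1\,048\,576} \approx 9.5 \times 10^{-7}.
```

At that rate, roughly a million plants would have to be grown to expect a single one with the desired combination.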
At least, when referring to orthodox geneticists, Ermolaeva did not use the pejorative label "Johannsen-Mendelian-Morganist", or the more usual label in which Weismann replaces Johannsen, as was Lysenko's custom. At the core of the disagreement between Lysenko and his puppet master Stalin on one side and orthodox geneticists on the other was the concept summarized by Wright (1917): "Heredity as looked upon since the time of Weismann is relatively simple to understand. It consists merely in the persistence of a certain cell constitution (in the germ cells) through an unending succession of cell divisions." Lysenko (1951) claimed, for example, that geneticists believed that this meant that the development of plants and animals was not affected by environmental factors and that the germ plasm could not be changed by mutation. Lysenko either did not understand, or simply ignored, the fact that geneticists recognized the presence of heterozygosity, when it was there, and exploited it in selection for desirable traits, just as he did not understand the possible existence of 'pure lines', such as those studied by Johannsen.
The marked disparity between the numbers of seeds obtained to study seed-coat colour and those for colour of cotyledon was noted above. Families number 4 and 38 in Table 4 have rather low numbers of dominant seeds. These two features suggest that there may have been problems in the conduct of these experiments. Family 41 in Table 5 (Ermolaeva's Table 6) also has a very low number of dominants.

Kolmogorov's Analysis
Kolmogorov then recommends that, if the number of individuals in each family is very low, for example less than 10, it is feasible to verify formula (1) with the aid of "the $\chi^2$ criterion of [Karl] Pearson". He does not elaborate on this suggestion. It may be a mistranslation into English of Kolmogorov's intentions, by his translator.
He then defines the normalized deviations D in (2), each being the deviation of a family's observed proportion of dominants from 3/4 divided by the standard deviation of that proportion under the binomial model, and notes that these normalized deviations D obey approximately the "law of Gauss with unit dispersion", that is, the probability for the inequality $D \le x$ to hold is approximately equal to $\frac{1}{\sqrt{2\pi}}\int_{-\infty}^{x} e^{-t^2/2}\,dt$. In (1), (2) we have used Kolmogorov's notation. Our Table 3 reproduces the table given by Kolmogorov which shows that the number of times |D| exceeds unity agrees closely with expectations. Kolmogorov's comment is: "Strangely enough, N.I. Ermolaeva herself states in her work that existence of a considerable proportion of families showing |D| > 1 should be regarded as disproving Mendel's theory." Kolmogorov then makes a formal analysis of Ermolaeva's experiments by means of what is now known as Kolmogorov's test, which is a one-sample version of the later Kolmogorov-Smirnov two-sample test. He takes the sets of standardized values (2) and tests them against the standard normal distribution.
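The check that |D| exceeds unity about as often as the Gaussian law predicts can be sketched as follows. This is a minimal illustration, assuming D is the observed proportion of dominants minus 3/4, divided by the binomial standard deviation of that proportion. The function name and the family counts are hypothetical and are not Ermolaeva's data.

```python
from math import sqrt
from statistics import NormalDist

P0 = 0.75  # Mendelian proportion of dominants


def normalized_deviation(dominant, recessive, p0=P0):
    """Deviation of the observed proportion from p0, in binomial standard-deviation units."""
    n = dominant + recessive
    return (dominant / n - p0) / sqrt(p0 * (1 - p0) / n)


# Hypothetical (dominant, recessive) counts, NOT Ermolaeva's data
families = [(15, 5), (18, 2), (9, 7), (22, 10), (12, 4), (30, 6)]
deviations = [normalized_deviation(d, r) for d, r in families]

observed_share = sum(abs(x) > 1 for x in deviations) / len(deviations)
# Under the Gaussian law with unit dispersion, Pr(|D| > 1) = 2 * (1 - Phi(1))
expected_share = 2 * (1 - NormalDist().cdf(1.0))
print(f"share with |D| > 1: observed {observed_share:.2f}, expected {expected_share:.4f}")
```

With only a handful of families the observed share is very noisy; the comparison becomes informative only for samples of the size of Ermolaeva's tables.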
He refers to the account of his own test, introduced in Kolmogoroff (1933), as presented in the monograph of the leading Russian mathematical statistician of his time, Romanovsky (1938). In this book, the relevant material occurs on pp. 226-229 (Kolmogorov cites p. 226) in a section whose title (in English translation) is: 61. A new criterion for agreement of an empirical and a theoretical distribution. Kolmogoroff (1940) uses the notation $\Phi(\lambda)$ of the book in the way we describe below. We note also that in the preceding section of his book, Romanovsky (1938) uses the $\chi^2$ goodness of fit criterion of Karl Pearson to illustrate the same example as in his Section 61. It is also relevant that Kolmogorov had reviewed the book of Romanovsky (1938) when it had appeared, so he would have had it to hand when composing, in the guise of mathematical statistician, the note Kolmogoroff (1940).
The following is an adaptation from p. 450 of Gnedenko (1968), a close associate of Kolmogorov, of directions for the use of Kolmogorov's test. If the cumulative distribution function under test $F(x)$ is continuous and the empirical distribution function from a sample of size $n$ is denoted by $F_n(x)$, then as $n \to \infty$,

$$\Pr\left(\sqrt{n}\,\sup_x |F_n(x) - F(x)| \ge \lambda\right) \to \Phi(\lambda) = 1 - \sum_{k=-\infty}^{\infty} (-1)^k e^{-2k^2\lambda^2}, \quad \lambda > 0. \qquad (3)$$

If, for the observed value $\lambda_0$ of $\sqrt{n}\,\sup_x |F_n(x) - F(x)|$, $\Phi(\lambda_0)$ is sufficiently small (conventionally, less than 0.05), then a very unlikely event has occurred, and the difference between $F_n(x)$ and $F(x)$ is regarded as significant and no longer explained by the randomness of the observed values. However, if $\Phi(\lambda_0)$ is large, then the difference between $F_n(x)$ and $F(x)$ is considered insignificant, and our hypothesized $F(x)$ may be regarded as being compatible with experiment. Figure 1 displays the function $\Phi(\lambda)$ for values of $\lambda$ from 0.4 to 1.5. Note that $n$ is used in two senses in quoting from Kolmogorov's paper and Gnedenko's monograph. In the former $n$ is used as the number of seeds or plants in a 'family' and in the latter the number of lines in either Table 4 or 6 of Ermolaeva, that is the sample size in Kolmogorov's test.
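The limiting probability used in these directions can be evaluated numerically with the classical Kolmogorov series. The sketch below assumes that the quantity compared with 0.05 is the limiting probability that $\sqrt{n}\,D_n$ exceeds $\lambda$. The function names and the truncation at 100 terms are illustrative choices, not taken from Gnedenko or Kolmogorov.

```python
from math import exp, sqrt


def kolmogorov_tail(lam, terms=100):
    """Limiting probability that sqrt(n) * D_n exceeds lam (lam > 0).

    Uses the alternating series 2 * sum_{k>=1} (-1)**(k-1) * exp(-2 k^2 lam^2).
    """
    total = 0.0
    for k in range(1, terms + 1):
        total += (-1) ** (k - 1) * exp(-2.0 * k * k * lam * lam)
    return 2.0 * total


def scaled_statistic(d_n, n):
    """lambda_0 = sqrt(n) * D_n for a sample of n values."""
    return sqrt(n) * d_n


for lam in (0.4, 0.660, 0.999, 1.5):
    print(f"lambda = {lam:5.3f}  tail probability = {kolmogorov_tail(lam):.3f}")
```

For $\lambda = 0.999$ and $\lambda = 0.660$ the series gives tail probabilities of about 0.27 and 0.78, the two-sided probabilities quoted in this section. The series converges slowly for very small $\lambda$, so the number of terms would need to grow there.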
We attempted to verify Kolmogorov's calculations using the statistical package R, specifically its procedure ks.test, relevant to the Kolmogorov-Smirnov tests, on the obtained frequencies in Tables 4 and 6, reproduced here in condensed form in Tables 4 and 5. For colour of cotyledon we obtained $\lambda_0 = 0.999$, with 2-sided probability 0.27, and for seed-coat colour $D_n = 0.0667$ and $\lambda_0 = 0.660$, with probability 0.78. Our Figure 2 shows the empirical distribution of values relating to colour of cotyledon plotted against the standard normal distribution function and Figure 3 the corresponding plots for seed-coat colour. A notable feature of both Figure 2 and Figure 3 is the negative skewness of the distributions of proportions. ks.test advised that the probabilities shown above which it calculated were not correct because of the presence of coincidences in the data sets. Figures 2 and 3 give a visual impression of agreement with the Mendel model except at the left hand end. Gnedenko (1968), however, clearly specifies that the test was to be applied to continuous distributions whereas the data in this application are discrete, and there are some duplications of values of both sets of data, an event which has zero probability for continuous distributions. Accordingly there are ambiguities in calculating the probability associated with maximum $D_n$ values.
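The statistic itself is straightforward to compute without a statistical package. The following is a minimal sketch using only the Python standard library; it is not the R procedure used above. The function name and the sample values are hypothetical and are not Ermolaeva's data. Repeated values are included to show where coincidences arise.

```python
from statistics import NormalDist


def ks_statistic(sample, cdf=NormalDist().cdf):
    """Return D_n = sup_x |F_n(x) - F(x)| for a one-sample comparison."""
    xs = sorted(sample)
    n = len(xs)
    d_n = 0.0
    for i, x in enumerate(xs):
        fx = cdf(x)
        # The empirical step function jumps at x: compare F(x) with the
        # step height reached at x, (i + 1)/n, and the height before it, i/n.
        d_n = max(d_n, (i + 1) / n - fx, fx - i / n)
    return d_n


# Hypothetical standardized values with repeated entries, NOT Ermolaeva's data
values = [-1.73, -0.82, -0.82, 0.0, 0.0, 0.0, 0.58, 1.15, 1.55]
has_ties = len(set(values)) < len(values)
print(f"D_n = {ks_statistic(values):.4f}, ties present: {has_ties}")
```

The value of $D_n$ is well defined even with ties. What the ties undermine is the probability attached to it, because the limiting distribution assumes a continuous $F(x)$.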
A related problem is inclusion in the above analysis of small families, whereas for D to properly approximate a sample value from a standard normal distribution, the family size $n$ should be large. In respect of seed-coat colour, for example, there are two plants with 2 dominant and 3 recessive seeds. These were the readings associated with maximum $D_n$ value. In respect of colour of cotyledon, the maximum $D_n$ occurred at cumulative probability around 0.5. Additionally, in Table 5 there are some suspect readings. Family # 148 records 0 dominants and 10 recessives, while family # 105 records 50 dominants and 0 recessives. In Ermolaeva's Table 4, there is one plant from which all seeds were classified as recessive. Such readings are highly unlikely results from hybrid crosses Aa × Aa.

Analysis of Ermolaeva's Tables 4 and 6 Using the Chi-Squared Distribution
A rule of thumb which is generally applied in the related statistical problem of applying a normal approximation with continuity correction to readings from a binomial distribution is that both $np \ge 5$ and $n(1 - p) \ge 5$, where $p$ (here 3/4) is the probability of "success" in $n$ trials. Thus if we apply this rule, family size should be at least 20. If we apply this rule by including only families of at least 20, and additionally exclude from consideration family # 105 of Table 5, we can be reasonably confident that each $D^2$, the square of an approximately standard normal variable, has a $\chi^2$ distribution with one degree of freedom independently of the other $D^2$'s, and when we sum such $D^2$'s, the sum will have a $\chi^2_N$ distribution, where $N$ is the number of summands, under the Mendelian hypothesis of proportion 3/4.
Then for Table 4 we obtain observed $\chi^2_{13} = 20.579$, with p-value approximately 0.09; and for her Table 6 (our Table 5) $\chi^2_{75} = 90.211$, with p-value 0.11. Since both p-values exceed the conventional cut-off of 0.05, there is no strong statistical evidence against the Mendelian hypothesis.
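The construction of the summed statistic can be sketched as follows, assuming each family is recorded as a pair of dominant and recessive counts. The function name, the `min_size` parameter and the counts are hypothetical and are not Ermolaeva's data. Obtaining a p-value from the result would still require a $\chi^2$ table or a statistical package.

```python
def chi_squared_sum(families, p0=0.75, min_size=20):
    """Sum of squared normalized deviations over families of at least min_size seeds.

    Returns (statistic, N), where N is the number of summands, i.e. the
    degrees of freedom when p0 is fixed in advance.
    """
    statistic = 0.0
    count = 0
    for dominant, recessive in families:
        n = dominant + recessive
        if n < min_size:
            continue  # normal approximation regarded as unreliable for this family
        statistic += (dominant - n * p0) ** 2 / (n * p0 * (1 - p0))
        count += 1
    return statistic, count


# Hypothetical (dominant, recessive) counts, NOT Ermolaeva's data
families = [(15, 5), (18, 2), (9, 7), (22, 10), (12, 4), (30, 6), (2, 3)]
stat, dof = chi_squared_sum(families)
print(f"chi-squared = {stat:.3f} on {dof} degrees of freedom")
```

Families excluded by the size rule contribute neither to the statistic nor to the degrees of freedom, which is why 98 and 122 families reduce to 13 and 75 summands respectively.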
In the above brief $\chi^2$ analysis we have attempted to use an essentially equivalent test to Kolmogorov's inasmuch as it relies on the approximate standard normality of the D's, after "cleaning" the data appropriately. So while the conclusion drawn by Kolmogoroff (1940) confirms what is now totally accepted, the evidence in support of this conclusion is not as strong as his paper presents. Of course his statistical technology was well beyond the understanding of Lyssenko (1940) and Kolman (1940), who could hardly argue on the grounds of its incompletely justified application and possible arithmetic error, to data which may have been poorly prepared. Seneta (2004) describes Kolman's leading role in the attacks on mathematicians and traditional pure mathematics in the Soviet Union during the Stalinist era.
We now pass to a consideration of Ermolaeva's Tables 2 and 3 (our Tables 2 and 1 respectively). These at first appear to be condensations of Tables 4 and 6, and while this is partially true, there are a number of inconsistencies and inaccuracies.
For example the third line of Table 1 should show 52 dominants instead of 42. Making the substitution in that line gives the percentage of dominants as 65.8 instead of 60.9.
There are 122 lines of data in Table 5, but there are 123 families considered in Table 1. Kolmogoroff (1940) therefore thought the number of lines in Table 5 was 123 while it is actually 122. He reports 38 as the number of lines in Table 5 where |D| > 1, which is correct, but the percentage is slightly "out", since it relates to a total of 123.
Each line in Table 5 appears to be the result of scoring the state, that is with either yellow or green cotyledon, of the seeds from the pod or pods produced by a self-pollinating hybrid plant derived from the pollination designated in Table 1.
It seems that some hybrid plants produced one useable or used pod and others more. Summing the numbers in Ermolaeva's Table 6 (our Table 5) yields 2104 dominants and 742 recessives, giving the percentage of dominants 73.9 and a standard normal value 1.320, so not significantly different from 3/4.
Looking at the histogram of the 122 individual proportions of dominants gives only weak support to the view taken by Lyssenko (1940), discussed below, that it is not reasonable to consider that the segregation of states comes from a single underlying proportion 3/4, but rather that the phenomenon is more variable. There are 13 proportions below 0.6, some clearly explicable by virtue of small sample size. The final entry gives 10 seeds all recessive. This was part of batch 16 (Table 1), which was one of 8 batches of the cross 178 x 47. Table 1 shows that the other 7 batches gave proportions remarkably consistent with 3/4. Ermolaeva shows 6 plants used in batch 16 (Table 1) whereas there are only 5 in Table 5. Summing these five yields the percentage 59.8 instead of the 56.8 given by Ermolaeva.
Ermolaeva constructed her Table 2 by condensing the data relating to colour of the seed-coat given in Table 4. About one third of the lines in Table 2 are inconsistent with the entries in Table 4. There are 98 lines of data in Ermolaeva's Table 4. The total number of dominants is 939 and recessives 336, giving the percentage of dominants 73.6 and standard normal value 1.116. The final line of Table 2 gives a batch with label 13a for which there are no corresponding entries in Table 4. This accounts for much but not all of the difference between the total numbers of plants of the two tables.
Fisher (1924) and associated papers examine the properties of the formula developed by Pearson (1900),

$$\chi^2 = S\left[\frac{(x - m)^2}{m}\right], \qquad (4)$$

where $S$ denotes summation over a number of cell frequencies, $x$ is a typical cell count and $m$ the corresponding expected cell count. Consider a single line in Ermolaeva's tables, a family of $n$ seeds of which $x_1$ are dominant and $x_2 = n - x_1$ recessive, with expected counts $n\theta$ and $n(1 - \theta)$ when the proportion of dominants is $\theta$. Applying (4) to these two cells gives

$$\frac{(x_1 - n\theta)^2}{n\theta} + \frac{(x_2 - n(1 - \theta))^2}{n(1 - \theta)} = \frac{(x_1 - n\theta)^2}{n\theta(1 - \theta)}, \qquad (5)$$

and with $\theta = 3/4$ the right hand side is $D^2$, so that the refinements demonstrated by Fisher can be applied to $D^2$ as defined in (2) and sums of such terms. Fisher (1924) set out the conditions which should apply when using (4) as a "measure of discrepancy between observation and expectation". An important issue in the application of (4) is using the correct number of degrees of freedom. Fisher noted that these should be determined by the number of degrees of freedom in which observation and expectation might differ. So, in applying (4) to (5), although there are 2 cells there is only one degree of freedom, in accord with the use of $D^2$ earlier. Fisher noted that, if an estimate $\tilde{\theta}$ of $\theta$ was made, the number of degrees of freedom should be reduced by one. Further, such an estimate should be consistent and efficient and an estimate made by minimising $\chi^2$ was both. The left hand side of (5) and therefore the equivalent right hand side can be applied to Tables 4 and 6 by substituting $\theta = 3/4$, with $N$ degrees of freedom, and $\theta = \tilde{\theta}$, with $N - 1$ degrees of freedom.
The difference between the two values of $\chi^2$ is $\chi^2$ with one degree of freedom and measures the improvement to the goodness of fit made by estimating $\theta$ from the data. It also provides a test of whether $\theta = 3/4$ should be rejected.
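The partition can be written out as a short worked equation. The notation $\chi^2_N(3/4)$ and $\chi^2_{N-1}(\tilde{\theta})$ for the two fitted values is introduced here for illustration only; the numbers are those reported in this section for the reduced data sets.

```latex
\chi^2_{N}(3/4) \;-\; \chi^2_{N-1}(\tilde{\theta}) \;=\; \chi^2_{1},
\qquad
\begin{aligned}
\text{seed coat } (N = 13):&\quad 20.579 - 20.401 = 0.178,\\
\text{cotyledon } (N = 75):&\quad 90.211 - 88.051 = 2.160.
\end{aligned}
```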
The estimate $\tilde{\theta} = 0.740$ is obtained from the reduced data set of Table 4, with $\chi^2_{12} = 20.401$ (p = 0.074) and $\chi^2_1 = 0.178$ (p = 0.95). The corresponding values obtained from the reduced data set of Table 5 are $\tilde{\theta} = 0.7365$, $\chi^2_{74} = 88.051$ (p = 0.14) and $\chi^2_1 = 2.160$ (p = 0.36). Accordingly, in neither case does $\tilde{\theta}$ provide a significantly better fit to the data than 3/4.
is concerned, this could be explained, at least in part, by lack of control. In respect of colour of cotyledon, as Kolman (1940) noted, the empirical distribution function lies fairly consistently to the left of the normal distribution function. But the same comment could not be made about seed-coat colour. In any case, taking into account the kinds of a priori considerations raised by Fisher, it would have been prudent to try to repeat the experiment and to move on to other experiments, for example to backcrossing, as Mendel did.