The computation of pitch with vectors

Abstract

A pitch model is proposed which is supported by a vector representation of tones. First, an algorithm capable of performing the vector addition of the spectral components of two-tone harmonic complexes is introduced which initially converts the amplitude, frequency, and phase (AFP) parameters into coordinates of the here introduced quotient, distance in octaves, and loudness (QOL) tone space. As QOL is isomorphic to the hue, saturation, and value (HSV) color space, a transformation from QOL to the red, green, and blue (RGB) vector space can be formulated so that the vector addition of two pure tones is conceived by analogy with color mixing operations. Since the QOL to RGB transformation is invertible, the resulting RGB vector sum can be transformed back to QOL. Then, by converting QOL coordinates back to AFP parameters, a tone is found whose frequency supposedly corresponds to the pitch evoked by the original two-tone complex. As for complexes having more than two components, the algorithm is to be sequentially applied to pairs of vectors in such a way that initially the first two vector tones are added together, then the resulting vector is added to the third vector tone, and so on.

Keywords: pitch computation, vector representation of tones, two-tone complexes, missing fundamental.


Aluizio Arcela

Department of Computer Science, University of Brasilia. Campus Universitário, Asa Norte, Phone: +55 (61) 33072705, Zip 70910-900 - Brasilia - DF - BRAZIL. arcela@unb.br

1. INTRODUCTION

Seebeck [21] proposed a relationship between pitch and periodicity after having observed that the waveform's repetition period could be perceived as pitch even if there is no spectral component at the corresponding frequency, a consideration which gave rise to the concept of "missing fundamental". Later on, a dispute began between Seebeck and Ohm [16], who believed that a perceived pitch always corresponds to the frequency of a non-null spectral component. Some years later, von Helmholtz [31] presented arguments in favor of Ohm's view, when he further conjectured about a possible analogy between the phenomena of mixed colors and those of compound musical tones. This latter hypothesis is taken as the starting point of the present study, which is concerned with applying the mathematics of colors to pitch computation.

This paper assumes that three-dimensional vectors are an appropriate representation of tones because vectors can be added in a convenient way. Indeed, the addition of pure tones as vectors does not yield a "complex" [10], as occurs in the addition of pure tones as sinusoids; it yields a single pure tone. Furthermore, taking into account that a complex is always characterized by one definite pitch [19], it is hypothesized here that the addition of tones as vectors can find a resulting tone whose frequency corresponds to that definite pitch. This tone is called here the vector addition tone, while its frequency is referred to as the computed pitch.

Since the formalization of tones as vectors requires some key properties of two-tone harmonic complexes, Section 2 first reviews the calculation of the period of these complexes. It then describes how the loudnesses of the two spectral components can be compared with each other, so as to demonstrate that for every frequency ratio there is a corresponding proportion between the amplitudes at which the components have the same loudness. This basic property is stated as a theorem, the equilibrium theorem, from which a loudness scale is derived for pure tones. Finally, the section describes how to find a measure of the symmetry of the second derivative's zero-crossing pattern, which acts as a magnitude coefficient in the vector addition operation.

Section 3 introduces the vector representation of tones in two subsections. The first one defines the quotient, distance in octaves, and loudness (QOL) space for representing tones in a three-dimensional system. Any pure tone expressed in amplitude, frequency, and phase (AFP) parameters can be represented in the QOL space. The second subsection, by taking into account that the mathematical structure of QOL is isomorphic to that of the hue, saturation, and value (HSV) color space [30], shows that QOL tones can be converted to red, green, and blue (RGB) colors, so that any pure tone can be expressed as an RGB vector.

Section 4 details how to compute the overall vector addition tone for harmonic complexes by first describing the algorithm for the vector addition of two tones. This also includes the inverse transformations which are to be applied initially from RGB back to QOL, and then from QOL back to AFP. Next, it shows how the final pitch is computed for harmonic complexes having more than two components by adding vector pairs across the components.

Section 5 first shows how pitch is related to frequency ratios in two-tone harmonic complexes, a description based on the geometric properties of the vector representation of tones. Subsequently, it describes how phase relationships affect the pitch of these complexes.

Section 6 ends the paper with a discussion of the conditions necessary for the pitch of a complex to equal the frequency (F0) of the fundamental, along with the pitch computation of ten selected complexes having more than two components. They are presented in a sequence intended to illustrate the main points of these conditions. The first three complexes are taken from the literature so as to compare the results found in this study with those of some important pitch models (e.g., [5, 6, 8, 17, 18, 29, 32]), either from the "temporal" view, which is based on the "autocorrelation" hypothesis raised by Licklider [13], or from the "pattern matching" view, which was first described by de Boer [6]. The results found with vector addition tones show that the pitch of harmonic complexes corresponds to the frequency of the fundamental only in some special cases.

2. KEY PROPERTIES OF TWO-TONE HARMONIC COMPLEXES

Some concepts introduced in this paper are derived from two-tone harmonic complexes. A mathematical description of these complexes is therefore given below, using a terminology as close as possible to that found in the literature on this class of complexes [7, 10, 11, 25]. A few terms are introduced, however, as a consequence of considering other aspects of two-tone complexes, such as the equilibrium (Section 2.2) and the symmetry of the second derivative with respect to time (Section 2.3).

The lower component x(t) and the upper component y(t) of a two-tone harmonic complex are given by

\[ x(t) = a_x \sin(2\pi f_x t + p_x) \tag{1} \]

and

\[ y(t) = a_y \sin(2\pi f_y t + p_y) \tag{2} \]

where ax and ay are amplitudes; fx and fy are frequencies; and px and py are phase angles (in radians; when expressed in degrees, they are represented as [px] and [py]). The addition of these components defines the class of two-tone harmonic complexes c(t), or m:n-complexes for short, that is,

\[ c(t) = x(t) + y(t) \tag{3} \]

There are three properties of m: n-complexes which are relevant to the present theory, as described below in Sections 2.1-2.3.

2.1. THE PERIOD OF M:N-COMPLEXES

The first relevant property of an m:n-complex is that its period is calculable. That is to say, since the ratio between its component frequencies can be written as

\[ \frac{f_x}{f_y} = \frac{m}{n} \tag{4} \]

where m and n are integers such that m < n and gcd(m, n) = 1, the period of the complex is measurable, as illustrated in Figure 1 for a 4:5-complex. In this way, one period τc of an m:n-complex comprises m periods of x(t) against n periods of y(t), that is,

\[ \tau_c = m\,\tau_x = n\,\tau_y \tag{5} \]

where τx = 1/fx and τy = 1/fy are the periods of the components.

Therefore, to find the period of an m:n-complex from Equation (5), it is first necessary to find one of the numbers m or n. This is done by means of Equations (6) and (7), which convert the decimal frequency ratio fx/fy to a common fraction reduced to lowest terms. The conversion gives two integers: the first one, m, is collected by the method num, while the second one, n, is collected by the method den. These two numbers are referred to here as reduced harmonic numbers.
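The conversion of a decimal frequency ratio into reduced harmonic numbers can be sketched in a few lines. This is only an illustration: the function names and the `max_den` bound are assumptions introduced here, and Python's `Fraction.limit_denominator` stands in for whatever conversion method the original equations use.

```python
from fractions import Fraction

def reduced_harmonic_numbers(fx, fy, max_den=64):
    """Return (m, n) such that fx/fy = m/n in lowest terms.

    max_den is an illustrative bound on n (an assumption, not from the paper);
    it absorbs small floating-point errors in the frequency ratio.
    """
    ratio = Fraction(fx / fy).limit_denominator(max_den)
    return ratio.numerator, ratio.denominator

def complex_period(fx, fy, max_den=64):
    """Period of the m:n-complex: m periods of the lower component."""
    m, _ = reduced_harmonic_numbers(fx, fy, max_den)
    return m / fx

m, n = reduced_harmonic_numbers(400.0, 500.0)
print(m, n, complex_period(400.0, 500.0))
```

For a 400 Hz and 500 Hz pair this gives a 4:5-complex whose period is four periods of the lower tone.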

The focus here is that the period τc is a length of time along which (1) the waveform of an m:n-complex has a well-defined number of peaks, as studied in Section 2.2, and (2) the second derivative of the waveform with respect to time has a zero-crossing pattern whose symmetry, studied in Section 2.3, affects the amplitude of the vector addition tone, as described in Section 4.

2.2. THE EQUILIBRIUM THEOREM

The second relevant property is related to the number of maxima a m: n-complex has within one period τc. Since this number is exclusively either m or n, a m: n-complex may be said to have two states: the low state, when there are m maxima within the period τc, and the high state, when there are n maxima. In this way, there is a border separating the low from the high state, which is referred to here as the equilibrium of the m: n-complex.

One way of counting the number of maxima is by means of the waveform inflection points, for any maximum occurs in a curve segment having a down concavity. Since the inflection points, that is, the points where the concavity changes from up to down, occur at values of t where the second derivative of c(t) with respect to time is null, this second derivative must be taken and then its zero crossings must be found so as to determine the number of times the concavity of the complex's waveform changes from up to down.

All possible waveforms for an m:n-complex at a given phase relationship can be obtained by holding ax constant while allowing ay to vary from a small value up to ax, as shown in Figure 2 for one period of a 4:5-complex with phases px = py = π/2. The waveforms in the upper half-surface are in the low state, all of them having four maxima, that is, m peaks, whereas the waveforms in the bottom half-surface are in the high state, all of them having five maxima, that is, n peaks. At the border between these states, the waveform is in equilibrium, as shown by the thick waveform drawing.


Phase relationships between px and py do not cause changes of state. Although they can modify the position of the inflection points, they cannot change the number of them. The state can only be changed by amplitude changes, for there is a well-defined amplitude proportion ax/ay associated to the equilibrium of every m: n-complex. Such proportion is given by a theorem [1], which is stated as follows.

Theorem 1 (Equilibrium theorem) An m:n-complex is in equilibrium if

\[ \frac{a_x}{a_y} = \left(\frac{n}{m}\right)^2 \tag{8} \]

Proof. Let tµ be an instant such that 0 < tµ < τc at which the lower tone x(t) is at a maximum, i.e., x(tµ) = ax, as shown in Figure 1. In terms of periods τx of x(t), it can be expressed as

\[ t_\mu = k\,\tau_x + \Delta t \tag{9} \]

where k is the number of full periods τx between 0 and tµ, i.e., k = int(tµ, τx); and Δt is the amount of time separating the end of these k successive periods from tµ, i.e., Δt = mod(tµ, τx). Therefore, according to Equations (1) and (5), the phase px needed for the occurrence of a maximum of x(t) at tµ is such that, up to an integer multiple of 2π,

\[ p_x = \frac{\pi}{2} - 2\pi m\,\frac{t_\mu}{\tau_c} \tag{10} \]

When x(t) and y(t) are taken together, there is a particular relationship between the phases px and py that causes the positioning of a minimum of y(t) at tµ, i.e., y(tµ) = -ay, thus producing a peak and valley opposition as shown in Figure 1, a fact that simplifies this proof. In this way, from Equations (2) and (5), it follows that, again up to an integer multiple of 2π,

\[ p_y = \frac{3\pi}{2} - 2\pi n\,\frac{t_\mu}{\tau_c} \tag{11} \]

Now combining Equations (10) and (11) so as to eliminate tµ/τc gives that the expression

\[ p_y = \frac{3\pi}{2} - \frac{n}{m}\left(\frac{\pi}{2} - p_x\right) \tag{12} \]

is the desired relationship between the phases.

As for the amplitudes, three different proportions occurring at instant tµ are compared in Figure 4, where the more powerful component forces the waveform of the complex into a state in which the number of maxima equals its reduced harmonic number. More precisely, if the second derivative of c(t) at tµ is negative, as in Figure 4(a), the waveform is concave downward, and just one single maximum exists within the length of time whose extent is 1/m times the period τc. This situation characterizes the low state. By contrast, if the second derivative is positive, as in Figure 4(c), the waveform is concave upward, so that two maxima, which are symmetrical in relation to tµ, replace that single maximum of Figure 4(a), thus increasing the overall number of maxima within the period τc. This puts the waveform in the high state. Finally, if the second derivative is null, as in Figure 4(b), the m:n-complex is in equilibrium, for it has a null curvature at tµ. That is to say, c(tµ) is an inflection point separating waveforms with just one maximum at tµ from waveforms with two maxima around tµ.

In order to find the amplitude proportion ax/ay which is associated with the equilibrium of a given m:n-complex, the second derivative c"(t) is obtained from x"(t) and y"(t) taken separately. From Equations (1) and (2), these derivatives are expressed as

\[ x''(t) = -4\pi^2 f_x^2\, a_x \sin(2\pi f_x t + p_x) \tag{13} \]

and

\[ y''(t) = -4\pi^2 f_y^2\, a_y \sin(2\pi f_y t + p_y) \tag{14} \]

so that at the instant tµ, where x(t) is at a maximum and y(t) is at a minimum, they are

\[ x''(t_\mu) = -4\pi^2 f_x^2\, a_x \tag{15} \]

and

\[ y''(t_\mu) = 4\pi^2 f_y^2\, a_y \tag{16} \]

As the second derivative of c(t) must be null at the instant tµ, according to Equation (3) the following relation must hold

\[ x''(t_\mu) + y''(t_\mu) = 0 \tag{17} \]

Therefore, from Equations (15) and (16), it follows that

\[ 4\pi^2 f_x^2\, a_x = 4\pi^2 f_y^2\, a_y \tag{18} \]

or,

\[ \frac{a_x}{a_y} = \left(\frac{f_y}{f_x}\right)^2 \tag{19} \]

which can be reduced to Equation (8) by means of Equation (4), thus ending the proof of Theorem 1.
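The sign test used in the proof can be checked numerically. The sketch below is a minimal illustration, assuming the peak and valley opposition of the proof (x at a maximum and y at a minimum at the same instant); the function names and the tolerance are hypothetical choices made here.

```python
import math

def equilibrium_ratio(m, n):
    """Amplitude proportion ax/ay at equilibrium, (n/m)^2."""
    return (n / m) ** 2

def second_derivative_at_peak(ax, ay, fx, fy):
    """c''(t) at an instant where x is at a maximum and y is at a minimum."""
    return -(2 * math.pi * fx) ** 2 * ax + (2 * math.pi * fy) ** 2 * ay

def state(ax, ay, fx, fy, tol=1e-9):
    """Classify the complex as 'low', 'high', or 'equilibrium'."""
    d2 = second_derivative_at_peak(ax, ay, fx, fy)
    scale = (2 * math.pi * fx) ** 2 * ax  # reference magnitude for the tolerance
    if abs(d2) <= tol * scale:
        return "equilibrium"
    return "low" if d2 < 0 else "high"

ay_eq = 1.0 / equilibrium_ratio(4, 5)  # 0.64 for a 4:5-complex with ax = 1
print(ay_eq, state(1.0, ay_eq, 400.0, 500.0))
```

With ax = 1, values of ay below 0.64 leave the 4:5-complex in the low state, and values above it put the complex in the high state.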

If besides this condition the components x(t) and y(t) are in cosine phase, that is, [px] = [py] = 90°, the m: n-complex is said to be in entire equilibrium. In this case, the zero-crossing pattern of its second derivative is at maximum symmetry, as studied in Section 2.3.

2.2.1. A loudness scale for pure tones: Theorem 1 gives a way of constructing a loudness scale or, more precisely, a pitch-strength scale [22, 28], since it applies only to pure tones. Hence, its modeling is done on a theoretical basis different from that of classical loudness models, most of which are based on the sound pressure level, as found in [15], [26], [27], and [20].

As inferred from the loudness experiment described below, in which Theorem 1 is applied to different tones, the expression

\[ i = a f^2 \tag{20} \]

derived from Equation (19) establishes a loudness scale which is assumed to be linear, that is, a tone with i2 units of loudness, where i2 = k·i1, is perceived as being k times louder than a tone having i1 units.

In this way, let T be a sequence of pure tones within two octaves of the major diatonic scale, such that their frequencies are, for example, 192, 216, 240, 256, 288, 320, 360, 384, 432, 480, 512, 576, 640, 720, and 768 Hz. If they all have the same amplitude and are generated at a uniform tone duration (for instance, 600 ms for each note, and a sound pressure level of 65 dB SPL for the first one), they are perceived as having increasing loudness levels. As a consequence, increasing pitch tones are only heard at equal loudness levels when their amplitudes are gradually decreasing. Here, the concept of equilibrium of m:n-complexes can account for a melodic loudness equality, since a sequence of increasing pitch tones is heard at equal loudness levels if every pair of contiguous tones is in equilibrium. That is, by applying Theorem 1 to the sequence T relative to the first tone, the amplitudes will be proportional to (192/f)², which gives 1, 0.7901, 0.64, 0.5625, 0.4444, 0.36, 0.2844, 0.25, 0.1975, 0.16, 0.1406, 0.1111, 0.09, 0.0711, and 0.0625 for the fifteen tones, respectively. Under the same conditions used in the equal amplitude case (a duration of 600 ms for each note, and a sound pressure level of 65 dB SPL for the first note), the tones are now perceived as having about the same pitch strength, as demonstrated experimentally in [2].
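The amplitude proportions for the sequence T follow directly from Theorem 1 and can be reproduced with a short sketch. The function name is an illustrative choice; the only ingredient is the equilibrium proportion, under which amplitude scales with the inverse square of frequency.

```python
def equal_loudness_amplitudes(freqs):
    """Amplitudes, relative to the first tone, that put every tone in
    equilibrium with it: a = (f0 / f)^2."""
    f0 = freqs[0]
    return [(f0 / f) ** 2 for f in freqs]

scale_hz = [192, 216, 240, 256, 288, 320, 360, 384,
            432, 480, 512, 576, 640, 720, 768]
amplitudes = equal_loudness_amplitudes(scale_hz)
for f, a in zip(scale_hz, amplitudes):
    print(f"{f:4d} Hz  {a:.4f}")
```

Because the proportion depends only on the frequency ratio to the first tone, any two contiguous tones in the list are also in equilibrium with each other.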

The role of the above defined loudness scale for pure tones is to be one of the three dimensions of the QOL tone space, according to the description of Section 3.

2.3. THE ZERO-CROSSING PATTERN OF THE SECOND DERIVATIVE

The third and last relevant property of a m: n-complex is that its second derivative c"(t) has a zero-crossing pattern whose symmetry is a significant piece of information, as discussed below.

The tone whose frequency is supposed to be the correlate of the pitch of a given m:n-complex (as described in Sections 3 and 4) has an amplitude which results from the reciprocal action between the components, so that it can assume any value from zero to a certain limit. A null result is only possible with the 1:1-complex, for if ax = ay and py = px ± π, the amplitude of the resulting sinusoid is zero. For all m:n-complexes in which m ≠ n, a null amplitude is impossible, unless the amplitudes ax and ay are both null.

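A measure of this reciprocal action between the two components can be found from the zero-crossing pattern of c"(t) along one period τc. Here, it is appropriate to substitute the time t of Equations (1) and (2) by an angle α according to

\[ \alpha = \frac{2\pi t}{\tau_c} \tag{21} \]

such that they can be rewritten as

\[ x(\alpha) = a_x \sin(m\alpha + p_x) \tag{22} \]

and

\[ y(\alpha) = a_y \sin(n\alpha + p_y) \tag{23} \]

where 0 ≤ α ≤ 2π. In this way, the zero-crossing pattern can be found from the function c"(α) = x"(α) + y"(α), i.e., from

\[ x''(\alpha) = -m^2 a_x \sin(m\alpha + p_x) \tag{24} \]

and

\[ y''(\alpha) = -n^2 a_y \sin(n\alpha + p_y) \tag{25} \]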

The symmetry (ξ) of the zero-crossing pattern (℘) changes according to the phase relationship between px and py as well as according to the amplitude proportion ax/ay. As measured by comparing the halves of the zero-crossing pattern to each other, the symmetry ranges from 0 to 1, that is, from no symmetry to full symmetry, according to the Algorithm S below.

2.3.1. Finding the symmetry: The symmetry of the second-derivative's zero-crossing pattern is found by the Algorithm S which is defined as follows.

Algorithm S (Symmetry algorithm). Given two harmonic tones x(t) and y(t) expressed in AFP quantities, find the symmetry ξ of the zero-crossing pattern of c"(t) = x"(t) + y"(t).

step S1. [Find the numbers m and n.] By applying Equations (6) and (7), obtain the reduced harmonic numbers m and n.

step S2. [Get the second derivative calculation.] By adding Equations (24) and (25), find c"(α) for 0 ≤ α ≤ 2π, as in the example shown in Figure 5;


step S3. [Find the zero-crossings.] Insert oriented zero-crossing marks, i.e., find all the abscissas where the second derivative is null along one period of the complex, as in the example of Figure 6. If the zero-crossing refers to a negative-to-positive crossing, i.e., if the third derivative is positive at the zero-crossing's abscissa αi , that is, if c'"(αi) > 0, insert an upward arrow, otherwise insert a downward one;


step S4. [Construct a Boolean pattern.] From the zero-crossing marks, build a Boolean pattern ℘(α) with dark rectangles for representing negative values (false) of the second derivative, and light rectangles for representing positive values (true), as shown in Figure 7;


step S5. [Extract the half-patterns.] Extract two sections of the pattern ℘(α), as shown in the example of Figure 8. For the first one, just take the left half of the pattern ℘(α), i.e., the sub-pattern ℘L(α) extending from 0 to π radians, that is,

\[ \wp_L(\alpha) = \mathrm{sub}(\wp(\alpha), 0, \pi) \tag{26} \]

where sub(℘(α), α1, α2) gives the subpattern of ℘(α) extending from α1 to α2. For the second one, take the mirror image ℘IR(α) of the right half of ℘(α), as detailed in Figure 8, that is, the inversion of the right half-pattern sub(℘(α), π, 2π), in which every angular position α with π ≤ α ≤ 2π becomes 2π - α.

step S6. [Take the exclusive-or of the halves.] Find the symmetry measuring pattern χ(α) given by the Boolean exclusive-or operation of the patterns ℘L(α) and ℘IR(α), as in the example shown in Figure 9, i.e.,

\[ \chi(\alpha) = \wp_L(\alpha) \oplus \wp_{IR}(\alpha) \tag{28} \]


step S7. [Calculate the symmetry.] Finally, take the average value ξ of χ(α), i.e.,

\[ \xi = \frac{1}{\pi} \sum_{j=1}^{J} w[\chi(\alpha), j] \tag{29} \]

where w[χ(α), j] is the angular width of the j-th dark rectangle of the pattern χ(α), and J is the number of dark rectangles. This ends the symmetry algorithm.
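Algorithm S can be approximated numerically by sampling the second derivative on a uniform grid instead of locating the zero crossings exactly. The sketch below is a hedged reading of steps S2 to S7, not the author's implementation. It assumes that the symmetry is the fraction of the half-period over which the left half-pattern and the mirrored right half-pattern agree (the dark, or false, regions of the exclusive-or). The function names and the sample count are choices made here.

```python
import math

def second_derivative(alpha, m, n, ax, ay, px, py):
    """c''(alpha) for an m:n-complex, with the time scale absorbed into alpha."""
    return (-(m * m) * ax * math.sin(m * alpha + px)
            - (n * n) * ay * math.sin(n * alpha + py))

def symmetry(m, n, ax, ay, px, py, samples=3600):
    """Approximate symmetry of the zero-crossing pattern (samples must be even)."""
    step = 2 * math.pi / samples
    # Boolean pattern sampled at cell midpoints: True where c'' is positive.
    pattern = [second_derivative((i + 0.5) * step, m, n, ax, ay, px, py) > 0
               for i in range(samples)]
    half = samples // 2
    left = pattern[:half]                 # sub-pattern from 0 to pi
    mirrored_right = pattern[half:][::-1] # right half with alpha -> 2*pi - alpha
    # Exclusive-or of the halves; mismatches are the True (light) cells.
    mismatches = sum(a != b for a, b in zip(left, mirrored_right))
    return 1.0 - mismatches / half

print(symmetry(4, 5, 1.0, 0.5, math.pi / 2, math.pi / 2))  # cosine phase
print(symmetry(1, 1, 1.0, 1.0, 0.0, 0.0))                  # 1:1-complex, sine phase
```

Under this reading, a complex whose components are both in cosine phase comes out fully symmetric, while a 1:1-complex in sine phase has no symmetry at all.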

2.3.2. The symmetry in the 1:1-complex: The zero-valued symmetry only occurs in the 1:1-complex. Although in perceptual terms a 1:1-complex is considered more as a single tone than as a complex, it plays a basic role in theoretical terms, not only for revealing how the symmetry of the zero-crossing pattern of c"(t) affects the resulting loudness, but also for being a kind of unity element of the class of m: n-complexes. Figure 10 shows the relationship between the pattern symmetry and the amplitude of the resultant sinusoid in a 1:1-complex.


3. VECTOR REPRESENTATION OF TONES

The addition of two pure tones according to Equation (3) results in a different entity because a complex has properties not present in pure tones. However, as mentioned in Section 1, it is hypothesized that a computable pure tone exists whose frequency corresponds to the pitch of the complex. This hypothesis leads to the consideration of a mathematical model whose addition operation, when applied to a pair of sinusoidal tones, yields just a single sinusoidal tone instead of a superposition of sinusoidal functions, so that the pitch problem can be formalized in this way. For this purpose, it is first necessary to arrange the tones in a three-dimensional mathematical space.

3.1. THE QOL SPACE

One way of organizing spatially the tones is through the rectangular QOL space shown in Figure 11 where tones are arranged in pages. More specifically, QOL is a space of tones having three dimensions, namely quotient, distance in octaves, and loudness, where all the tones belonging to a same page have the same quotient.


3.1.1. Building QOL from AFP: In order to set proper limits to the quantities involved in the AFP representation of Equations (1) and (2), the auditory system is assumed to work linearly, with the frequencies taking any value in a range of N octaves, i.e., from fmin to 2^N·fmin (for theoretical purposes, N is assumed to be ten); the amplitudes taking any value between zero and the limit given by Equation (35); and the phases taking any value in the range from 0 to 360°. For each 〈a, f, p〉 triple, that is, for each tone with amplitude a, frequency f, and phase p, there is a corresponding 〈q, o, l〉 triple in QOL, and vice versa, so that there is a bijective transformation between AFP and QOL spaces.

The quotient (q) is a frequency ratio within an octave, i.e., 1 ≤ q < 2. Specifically, the quotient of a tone is the ratio between its frequency f and the value obtained by shifting the lower limit pitch fmin by the number of octaves υ existing between fmin and f, i.e.,

\[ q = \frac{f}{2^{\upsilon} f_{min}} \tag{30} \]

where υ is given by

\[ \upsilon = \left\lfloor \log_2 \frac{f}{f_{min}} \right\rfloor \tag{31} \]
The term "quotient" as employed here has a meaning similar to that of "tone chroma" [4] in that both refer to the tone position within an octave. For example, notes having the same name also have the same chroma as well as the same quotient, regardless of the octave in which they are located. In color theory, however, the term "chroma", which was introduced by Munsell [14], has a meaning that could result in conflicting terminology if both tone and color spaces are used together, as occurs in the present study. In view of this, the term "chroma" is avoided.

The distance in octaves (o) of a tone is a quantity given by the number of octaves separating f from fmin plus a fractional part due to the phase p. Therefore, it lies in the range 0 ≤ o ≤ 10. This definition of distance in octaves may be illustrated by means of the helix of pitch [23] shown in Figure 12, where the integer part is given by the number of turns from fmin to f, since each turn of the helix counts as one octave. In Figure 12(a), the helix is at the normal angular position, that is, its lower end is at zero radians. A tone in the normal helix is assumed to have a null phase. Starting at fmin and going up to the higher frequencies, the helix intersects a certain horizontal circle (a cross section of the helix's circumscribed cylinder) defining the position of frequency f at a point po. If the tone has a non-null phase p, as indicated in the same horizontal circle, the whole helix must be rotated by p radians so as to reach that circle exactly at point p, as shown in Figure 12(b). In summary, the number of octaves υ gives the integer part of the distance in octaves, while the decimal part is given by the quotient between the phase and the maximum possible rotation in the helix, i.e.,

\[ o = \upsilon + \frac{p}{2\pi} \tag{32} \]

The concept of distance in octaves is thus like that of "tone height" found in [4]. However, since the phase is included, the distance in octaves is based on a continuous scale, instead of a discrete one.

Finally, the loudness dimension (l) is built in accordance with the definition presented in Section 2.2.1. In order to be included as one of the dimensions of the QOL tone space, the loudness must be rescaled to the range of 0 to 100 loudness units. Therefore, it follows from Equation (20) that

\[ l = 100\,\frac{i}{i_{max}} \tag{33} \]

where imax is the upper limit of the loudness scale, that is, a value above which the auditory system loses linearity. It is given by

\[ i_{max} = a_{max}\, f_{min}^2 \tag{34} \]

where fmin is the frequency corresponding to the lower limit of pitch discrimination, and amax is the largest amplitude which is supported by the auditory system under linearity conditions at fmin. If the equilibrium theorem is taken along the whole audible frequency range relative to Equation (34), the corresponding amplitude (a) at a given frequency (f) is such that

\[ a = a_{max} \left(\frac{f_{min}}{f}\right)^2 \tag{35} \]

The loudness unit, or lut for short, for the scale l defined in Equation (33) is derived from the assumption that a tone with frequency fmin and amplitude amax has a loudness of 100 luts. That is to say, at five octaves above fmin, for example, a tone with 100 luts of loudness has an amplitude equal to (1/1024)amax.
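The AFP to QOL conversion described in Section 3.1.1 can be sketched as a single function. This is a minimal illustration under stated assumptions: the values of `F_MIN` and `A_MAX` are hypothetical placeholders (the text does not fix them here), the number of octaves is taken as the integer part of log2(f/fmin), and loudness is taken as proportional to a·f², rescaled so that a tone with amplitude amax at fmin has 100 luts.

```python
import math

F_MIN = 16.0  # hypothetical lower limit of pitch, in Hz (assumption)
A_MAX = 1.0   # hypothetical largest linear amplitude at F_MIN (assumption)

def afp_to_qol(a, f, p, f_min=F_MIN, a_max=A_MAX):
    """Convert amplitude, frequency (Hz), phase (rad) to (q, o, l)."""
    octaves = math.floor(math.log2(f / f_min))   # whole octaves above f_min
    q = f / (2 ** octaves * f_min)               # quotient, 1 <= q < 2
    o = octaves + p / (2 * math.pi)              # distance in octaves
    l = 100.0 * (a * f ** 2) / (a_max * f_min ** 2)  # loudness in luts
    return q, o, l

# Five octaves above f_min, an amplitude of a_max/1024 gives 100 luts.
print(afp_to_qol(A_MAX / 1024, 32 * F_MIN, 0.0))
```

The printed triple reproduces the worked example in the text: a unit quotient, a distance of five octaves, and a loudness of 100 luts.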

3.2. THE QOL TO RGB TRANSFORMATION

The addition of two tones is not obvious in QOL because it does not constitute a vector space. In other words, two QOL tones 〈q1, o1, l1〉 and 〈q2, o2, l2〉 cannot be mathematically combined by a coordinate-by-coordinate addition, for example. However, the QOL space can be mapped onto the RGB color space, where the addition operation can be carried out, if, as a first step, its rectangular organization is converted to a cylindrical one. That is, by transforming the quotient q, which lies in the range 1 ≤ q < 2, into an angle lying in the range from 0 to 360°, a cylinder of pages is defined according to Figure 13 as having the same organization as that of the HSV color space. More specifically, by conveniently positioning, scaling, and orienting the RGB cube relative to the QOL cylinder, a correspondence is established between quotient and hue, distance in octaves and saturation, and loudness and value. Here the cylinder's axis coincides with the achromatic diagonal KW of the cube, which is aligned with the vertical direction; the radius of the cylinder's base is equal to the projection of the vector Y = R + G on the horizontal plane; and the red page (the QOL page defined by q = 1) contains the edge KR of the cube. Although the angular spacing of QOL pages is continuous, only a discrete set of pages is shown in order to render the inner side of the volume visible.


The QOL to RGB transformation requires a mapping of every page of the cylinder into a corresponding vertical triangle enclosed in the RGB cube, referred to here as sail, which is defined by the points K, W, and the intersection point of the corresponding QOL page and one of the edges RY, YG, GC, CB, BM, or MR, of the cube, as shown in Figure 14. That is, each tone of a given QOL page is mapped into a vector of the corresponding RGB sail. This can be done by taking advantage of the analogy between QOL and HSV, so that a QOL to HSV transformation must be carried out first.


3.2.1. The QOL to HSV transformation: As the geometries of QOL and HSV spaces are coincident, the conversion of a 〈q, o, l〉 tone to a 〈h, s, v〉 color is according to

\[ h = 360\,(q - 1) \tag{36} \]

\[ s = \frac{o}{10} \tag{37} \]

and

\[ v = \frac{l}{100} \tag{38} \]

Therefore, the ranges are 0 ≤ h < 360°, 0 ≤ s ≤ 1, and 0 ≤ v ≤ 1.

3.2.2. The HSV to RGB transformation: A pair of algorithms allowing forward and inverse transformations between HSV and RGB color spaces was introduced by Smith [24]. Such algorithms are based on the "hexcone" representation of the HSV space, which is equivalent to the cylindrical one. In this way, let Γ(〈h, s, v〉, k) be Smith's HSV to RGB algorithm, where the index k selects which one of the coordinates r, g, and b is returned.

The convention adopted here in relation to these algorithms is that found in [30], where the hue is taken in degrees, that is, from 0 to 360°, instead of from 0 to 1.
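The chain from QOL through HSV to RGB can be sketched with Python's standard `colorsys` module, which implements the usual hexcone HSV conversion and is used here as a stand-in for Smith's algorithm. The mapping h = 360(q - 1) follows the worked values in Section 4.1.1; the scalings s = o/10 and v = l/100 are assumptions based on the stated ranges (ten octaves, 100 luts), and the function names are illustrative.

```python
import colorsys

def qol_to_hsv(q, o, l, n_octaves=10):
    """Map a QOL tone to HSV: hue in degrees, saturation and value in [0, 1]."""
    h = 360.0 * (q - 1.0)   # quotient 1..2 becomes hue 0..360 degrees
    s = o / n_octaves       # distance in octaves becomes saturation
    v = l / 100.0           # loudness in luts becomes value
    return h, s, v

def qol_to_rgb(q, o, l):
    """Map a QOL tone to an (r, g, b) vector with components in [0, 1]."""
    h, s, v = qol_to_hsv(q, o, l)
    # colorsys expects the hue on a 0..1 scale rather than in degrees.
    return colorsys.hsv_to_rgb(h / 360.0, s, v)

print(qol_to_rgb(1.0, 10.0, 100.0))  # a tone on the q = 1 (red) page
print(qol_to_rgb(1.5, 10.0, 100.0))  # a tone half an octave around the cylinder
```

The inverse direction, from an RGB sum back to QOL, would use `colorsys.rgb_to_hsv` followed by the inverse of the three scalings.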

3.2.3. Color interpretation of music: A related matter, although not properly belonging to this paper's main subject, refers to applications of the above described tone-to-color transformation in visual translation of music. More generally, it refers to any composition that has a visual counterpart the colors of which are calculated by applying the method described in Section 3 to the existing notes. Some of these applications can be accessed at http://www.cic.unb.br/docentes/arcela/colormusic/.

4. THE VECTOR ADDITION TONE

The purpose of applying the mathematical equivalence between a tone space and a color space in the computation of the pitch of a given complex is that it becomes possible to find a single tone as the final result of adding vectorially the respective component tones. Naturally, this is done in analogy with the addition, or mixing, of colors in the RGB cube, which is an operation that necessarily results in a single vector, since regardless of the number of colors being mixed together, a single color must be produced at the end of the mixing procedure. In other words, the approach to computing pitch which is presented here could not be proposed if the addition of tones were done by means of the common algebraic addition of sine functions.

Therefore, the methods introduced in Section 3 for representing tones as vectors are now combined so as to compute pure tones having supposedly pitch equivalence with harmonic complexes. The basic operation is the vector addition of a pair of tones, which is carried out by Algorithm A as described below. For complexes with more than two components, the computation is carried out by Algorithm M, which is described in Section 4.2. Multi-tone complexes are broken into several temporary m: n-complexes, so as to be resolved by successive applications of Algorithm A, each of them producing a vector addition tone referred to as a temporary component.

4.1. THE VECTOR ADDITION OF TWO TONES

The computation of the vector addition tone for a m:n-complex requires a sequence of three basic operations, namely down-transposition, vector composition, and up-transposition which are respectively carried out by the algorithms D, C, and U described below.

4.1.1. Down transposition: The need for this operation is due to the nature of the angular representation of tones in the QOL space, for the transformation given by Equation (36) which maps quotients into hues is not linear. For instance, if the lower component of a 2: 3-complex is such that fx/fmin is a power of two, so that the lower and upper quotients are respectively qx = 1 and qy = 1.5, the QOL page of fx is the same as that of fmin. Hence, according to Equation (36), the angular difference between the HSV page of the upper component y(t) and that of the lower component x(t) is 180°, for hy = 360(1.5 - 1) = 180° and hx = 360(1 - 1) = 0°, so that Δh = 180°. This means that the resulting vector will be found on one of the two pages since they are in the same plane (Section 5.2). Now, when fx/fmin is not a power of two, the angular difference is not 180°, as can be seen with a 2:3-complex having qx = 1.25. In this case, the quotient of the upper component is qy = 1.25(3/2) = 1.875, so that hy = 360(1.875-1) = 315°, and hx = 360(1.25 - 1) = 90°, i.e., Δh = 225°, and not 180° as in the first case. Therefore, in order to have a vector composition as a homogeneous operation with respect to the frequency ratio m: n, and whose result does not depend on the value set to fmin, the down transposition is a required operation.
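The dependence of the hue difference on the quotient of the lower component can be reproduced numerically. The sketch below uses the mapping h = 360(q - 1) quoted in this paragraph; the function names are illustrative, and the upper quotient is folded back into the octave 1 ≤ q < 2.

```python
def hue(q):
    """Hue in degrees for a quotient q in the range 1 <= q < 2."""
    return 360.0 * (q - 1.0)

def upper_quotient(qx, m, n):
    """Quotient of the upper component of an m:n-complex, folded into one octave."""
    q = qx * n / m
    while q >= 2.0:
        q /= 2.0
    return q

def hue_difference(qx, m, n):
    """Angular difference between the pages of the upper and lower components."""
    return abs(hue(upper_quotient(qx, m, n)) - hue(qx))

print(hue_difference(1.0, 2, 3))   # lower component on the q = 1 page
print(hue_difference(1.25, 2, 3))  # same 2:3 ratio, different lower quotient
```

The two calls give 180° and 225° for the same 2:3 frequency ratio, which is the inhomogeneity that the down transposition is meant to remove.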

After the down-transposition operation, both lower and upper components will have their parameters changed in a particular way. More specifically, x(t) becomes x̃(t) by first dividing its frequency fx by its quotient qx, so that x̃(t) will be a component having a unitary quotient, that is, qx̃ = 1, while y(t) becomes ỹ(t), a component which is lowered by the same factor qx, so that the frequency ratio m:n is held. Next, the amplitudes are increased in the proportion given by the equilibrium theorem so as to preserve the loudness of both components. Finally, the phases are transformed in such a way that the transposed lower component x̃(t) has its phase set to π/2 rad, while the phase of the transposed upper component ỹ(t) is set to a value at which the waveform of x̃(t) + ỹ(t) assumes, on a different time scale, the same shape as the waveform of x(t) + y(t), as shown in Figure 15. The purpose of this phase transformation is to have a reference for measuring the waveform's symmetry with Algorithm S (Section 2.3.1), since a m:n-complex whose components are both in cosine phase is symmetrical.


As a result of the down transposition, which is formalized below by Algorithm D, the sail of the lower tone becomes coincident with that of fmin, i.e., the [q = 1]-sail (or red sail), as shown in Figure 16.


Algorithm D (Down transposition algorithm). Given a m:n-complex whose lower and upper components are, respectively, [x]AFP = 〈ax, fx, px〉 and [y]AFP = 〈ay, fy, py〉, find the transposed components [x̃]AFP = 〈ax̃, fx̃, px̃〉 and [ỹ]AFP = 〈aỹ, fỹ, pỹ〉.

step D1. [Find the quotient of the lower component.] Use Equations (30) and (31) to find qx.

step D2. [Down-transpose the lower component.] Find the frequency fx̃ of the down-transposed lower component x̃(t) by dividing fx by the quotient qx, that is,

Then, find the amplitude ax̃ by considering that the transposed component x̃(t) must have the same loudness as x(t). That is, by applying Theorem 1,

Now set the phase px̃ to π/2. That is,

step D3. [Down-transpose the upper component.] Find the down-transposed upper component ỹ(t) according to

and

where the relationship between px̃ and pỹ is the same as that of px and py.
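
As a rough illustration of steps D1 to D3, the sketch below transposes the frequencies and amplitudes of a pair of tones. It rests on two assumptions that this section does not spell out: the quotient is computed as the frequency divided by the highest octave of fmin not above it, and loudness is proportional to a·f², which is the relation implied by the numerical values of Section 6.1. The phase of the upper component is left out, since its transformation depends on an equation not reproduced here. The function name is hypothetical.

```python
import math


def down_transpose(x, y, f_min=30.0):
    """x, y: (amplitude, frequency in Hz) of the lower and upper components.

    Returns the transposed lower tone (a, f, p), the transposed upper
    tone (a, f) and the quotient qx used for the transposition.
    """
    ax, fx = x
    ay, fy = y
    # Step D1 (assumed form): quotient of the lower component.
    qx = fx / (f_min * 2 ** math.floor(math.log2(fx / f_min)))
    # Steps D2 and D3: both frequencies are divided by qx, so fy/fx is held.
    fx_t, fy_t = fx / qx, fy / qx
    # Assumption: loudness ~ a * f**2, so amplitudes grow by qx**2.
    ax_t, ay_t = ax * qx ** 2, ay * qx ** 2
    px_t = math.pi / 2  # phase of the transposed lower component (step D2)
    return (ax_t, fx_t, px_t), (ay_t, fy_t), qx


lower, upper, qx = down_transpose((0.9, 600.0), (0.9, 800.0))
print(lower, upper, qx)
```

For the 600 Hz and 800 Hz tones of Section 6.1 the lower tone moves to 480 Hz, a power of two times fmin, while the 3:4 ratio and the product a·f² of each tone are unchanged.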

4.1.2. Vector composition: This operation is applied to the transposed components x̃(t) and ỹ(t). After obtaining the symmetry ξ of the zero-crossing pattern of the second derivative of the transposed complex c̃(t) = x̃(t) + ỹ(t), which is the same as that of the untransposed complex c(t), it finds the vector composition of the two transposed tones as a vector addition under a magnitude coefficient ξ as described below.

Algorithm C (Vector composition algorithm). Given the transposed tones x̃(t) and ỹ(t), find their vector composition [ũ]RGB.

step C1. [Convert to QOL.] Convert the transposed components [x̃]AFP = 〈ax̃, fx̃, px̃〉 and [ỹ]AFP = 〈aỹ, fỹ, pỹ〉 into QOL by using Equations (30)-(33).

step C2. [Convert to HSV.] Convert the transposed components [x̃]QOL = 〈qx̃, ox̃, lx̃〉 and [ỹ]QOL = 〈qỹ, oỹ, lỹ〉 into HSV by using Equations (36)-(38).

step C3. [Convert to RGB.] Convert the transposed components [x̃]HSV = 〈hx̃, sx̃, vx̃〉 and [ỹ]HSV = 〈hỹ, sỹ, vỹ〉 into RGB by using Equations (39)-(41).

step C4. [Find the symmetry.] Apply Algorithm S (symmetry algorithm; Section 2.3.1) to the transposed tones x̃(t) and ỹ(t) in order to find the symmetry ξ of the zero-crossing pattern of c̃"(t) = x̃"(t) + ỹ"(t), i.e.,

step C5. [Add the vectors.] Find the transposed vector composition [ũ]RGB by using the symmetry ξ as a scalar multiplier to the vector addition [ũ*]RGB = [x̃]RGB + [ỹ]RGB. That is, [ũ]RGB = ξ[ũ*]RGB, or

and

4.1.3. Up transposition: The up transposition, which is the inverse operation of the down transposition, is applied to the down-transposed vector [ũ]RGB in order to find the resulting tone u(t) and place it with respect to the original untransposed tones x(t) and y(t), as shown in Figure 17. Before the up transposition is effectively applied, the vector [ũ]RGB is first converted from RGB to HSV, then to QOL, and finally to AFP.


Algorithm U (Up transposition algorithm). Let Γ⁻¹(〈r, g, b〉, k) represent the k-th component (0 ≤ k ≤ 2) of the RGB to HSV transformation.

step U1. [Go back to HSV.] Find the components hũ, sũ, and vũ by using

and

step U2. [Go back to QOL.] Use Equations (55)-(57), which are derived from Equations (36)-(38), to convert the above computed HSV values back to QOL:

and

step U3. [Go back to AFP.] Use Equations (58)-(60), which are derived from Equations (30)-(33), to convert the resulting tone from QOL to AFP. Find the transposed vector addition tone ũ(t) = aũ sin(2π fũ t + pũ) by calculating first the frequency fũ, then the amplitude aũ, and finally the phase pũ, that is,

where int(oũ) is the integer part of oũ. The amplitude aũ is then given by

and the phase pũ comes from

step U4. [Transpose the tone up.] Find the vector addition tone [u]AFP = 〈au, fu, pu〉 by up transposing [ũ]AFP under the factor 1/qx. From Equations (45)-(47) used to down transpose the tone y(t), the equations for the up transposition of tone u(t) can be deduced. They are

and

where fu is the computed pitch.
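
The frequency path of steps U1 to U4 can be sketched as follows. This is only an approximation of the algorithm under stated assumptions: the RGB to HSV transformation is taken to be the standard one implemented by Python's colorsys module, the inverse relations q = 1 + h/360 and o = 10s are read off the forms of Equations (36) and (37) quoted in this paper, the transposed frequency is taken as fmin·q·2^int(o), and the up transposition multiplies by qx. Amplitude and phase are not recovered, because their equations are not reproduced here. The function name is hypothetical.

```python
import colorsys


def pitch_from_rgb(rgb, qx, f_min=30.0):
    """Frequency of the vector addition tone from a transposed RGB vector."""
    h, s, _v = colorsys.rgb_to_hsv(*rgb)  # step U1; h is returned in [0, 1)
    q = 1.0 + h          # step U2: same as 1 + hue_in_degrees / 360
    o = 10.0 * s         # step U2: distance in octaves from the saturation
    f_t = f_min * q * 2 ** int(o)  # step U3: int(o) is the integer part of o
    return f_t * qx      # step U4: undo the down transposition


# A vector on the red sail (hue 0) with saturation 0.43, i.e. o = 4.3
rgb = colorsys.hsv_to_rgb(0.0, 0.43, 0.8)
print(pitch_from_rgb(rgb, qx=1.25))
```

With fmin = 30 Hz, a red-sail vector four octaves up and qx = 1.25 lands at 600 Hz.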

4.1.4. Grouping the algorithms: At this point, the above described algorithms D, C, and U are combined together so as to define the algorithm A for computing the vector addition of two tones, as illustrated in Figure 18.


Algorithm A (Vector addition tone algorithm) Given the components x(t) and y(t) of a m:n-complex, find the vector addition tone u(t).

The components x(t) and y(t) are expressed in AFP quantities, that is, [x]AFP = 〈ax, fx, px〉 and [y]AFP = 〈ay, fy, py〉, where the phases are in degrees.

step A1. [Do the down transposition.] Apply Algorithm D to the components x(t) and y(t) so as to find the transposed tones [x̃]AFP = 〈ax̃, fx̃, px̃〉 and [ỹ]AFP = 〈aỹ, fỹ, pỹ〉.

step A2. [Do the vector composition.] Apply Algorithm C to [x̃]AFP and [ỹ]AFP so as to obtain the transposed vector composition [ũ]RGB = 〈rũ, gũ, bũ〉.

step A3. [Do the up transposition.] Apply Algorithm U to [ũ]RGB so as to find the vector addition tone [u]AFP = 〈au, fu, pu〉.

4.2. COMPLEXES HAVING MORE THAN TWO TONES

For complexes having more than two components, the vector addition tone is found by extending the application of Algorithm A to all the components of the complex. More specifically, the first two vector tones are added together, the sum of which is added to the third component vector, and so on.

Algorithm M (Vector addition tone algorithm for harmonic complexes having more than two tones). Let C(t) = {z0, z1, ..., zk-1} be a complex having k harmonic components. Find the vector addition tone u(t) by sequentially applying Algorithm A to pairs of tones as follows:

step M1. [Add up the first two spectral components.] Find the first temporary component u1(t) for the vector addition tone by applying Algorithm A to the first two spectral components, that is,

step M2. [Add up the remaining components.] Compute each iteration of the following operation sequence

as a subsequent temporary component uj(t) for u(t). The last iteration of the operation sequence of Equation (65) gives the resulting vector addition tone of the complex C(t), i.e.,

This ends the algorithm M.
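
Structurally, Algorithm M is a left fold of Algorithm A over the ordered list of components. The sketch below shows only that structure: the pairwise operation is passed in as a callable, and the stand-in used in the example merely records the order of application. The names algorithm_m and record_pair are illustrative.

```python
from functools import reduce


def algorithm_m(components, algorithm_a):
    """Fold a pairwise vector addition over the components, left to right."""
    if len(components) < 2:
        raise ValueError("a complex needs at least two components")
    # reduce computes A(A(A(z0, z1), z2), ...), i.e. steps M1 and M2.
    return reduce(algorithm_a, components)


def record_pair(u, z):
    """Stand-in for Algorithm A that only records how it was called."""
    return ("A", u, z)


print(algorithm_m(["H3", "H4", "H5"], record_pair))
```

Because the fold is sequential, the result generally depends on the order in which the components are listed.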

5. GEOMETRY OF COMPLEXES

There is a close relationship between the geometry of any vector composition and the corresponding reduced harmonic numbers m,n from which relevant properties related to the pitch of complexes can be derived. Some of these properties refer to the problem of the missing fundamental. In this way, in order to address the conditions of equality between the computed pitch and the frequency of the fundamental, four basic frequency ratios are studied here.

5.1. COMPONENTS AN OCTAVE APART

According to Equation (31), in a two-tone complex whose components are one octave apart, that is, 1:2, the difference between the number of octaves of the transposed tone ỹ(t) relative to fmin and that of x̃(t) is

As shown in Figure 19(a) for the respective vector composition, since the RGB sails of both transposed tones are coincident, that is,


the transposed addition tone will be located on this common sail. That is to say, whatever quotient the lower tone x(t) might have, which is always equal to that of y(t), the vector-addition tone u(t) will have the same quotient (Section 4.1.4, step A3), that is, qu = qx. However, mainly due to the phase relationship and secondarily due to the amplitude proportion, the resulting pitch can be either fx or fy. For instance, when the 1:2-complex is in equilibrium, that is,

the transposed vector addition tone ũ(t) has a distance in octaves oũ equal to the arithmetic mean of the distances in octaves ox̃ and oỹ of the transposed components x̃(t) and ỹ(t). As a consequence, if the phase px is set to π/2 rad while py can assume any value between 0 and 2π rad, the pitch will correspond to fx when 0 < py < 3π/2, and will correspond to fy when 3π/2 < py < 2π rad, as demonstrated below.

The HSV equivalents of Equations (68) and (69) are given respectively by Equations (36) and (38), that is,

and

In terms of RGB coordinates, the necessary conditions for a vector to be located on the red sail are that (1) the coordinates g and b are equal, and (2) the coordinate r is the greatest among the three, that is, rx̃ = max(rx̃, gx̃, bx̃) and rỹ = max(rỹ, gỹ, bỹ). Therefore, according to the HSV to RGB transformation represented by Equations (39)-(41) and the conditions indicated by Equations (70) and (71), the transposed RGB vectors can be obtained. The vector [x̃]RGB is given by rx̃ = vx̃, gx̃ = vx̃(1 - sx̃), and bx̃ = gx̃, that is,

and

In the same way, the vector [ỹ]RGB is given by

and

From Equations (72)-(77), the coordinates of the transposed resulting vector [ũ]RGB = 〈rx̃ + rỹ, gx̃ + gỹ, bx̃ + bỹ〉 are given by

and

In order to find the distance in octaves oũ, first the saturation sũ must be obtained from the RGB to HSV transformation indicated by Equations (52)-(54), that is, sũ = [max(rũ, gũ, bũ) - min(rũ, gũ, bũ)]/max(rũ, gũ, bũ). As the maximum is rũ and the minimum is gũ, it follows that

Substituting sũ = oũ/10, as given by Equation (37), together with Equations (78)-(79) into Equation (81) yields

which, according to Equation (32), can be rewritten as

Now, taking into account Equation (67) yields

Therefore, if

that is, if px̃ + pỹ > 2π, the number of octaves υũ of the transposed vector addition tone will be 1 plus the number of octaves υx̃, so that the resulting pitch equals fy. Otherwise υũ will be equal to υx̃, and so the pitch will be fx. For example, if px is set to π/2, which, according to Equations (44) and (47), implies equality between px and px̃ as well as between py and pỹ, then for 0 < pỹ < 3π/2 the resulting pitch is fx, while for 3π/2 < pỹ < 2π it is 2fx.
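
The octave decision just derived can be sketched numerically. The sketch assumes that the distance in octaves has the form o = υ + p/(2π), which is the form of Equation (32) suggested by the threshold on the sum of the phases, that the quotient is computed as in the examples of Section 4.1.1, and that the complex is in equilibrium with px = π/2, so that the distance in octaves of the addition tone is the arithmetic mean of those of the components. The function name and fmin = 30 Hz are illustrative.

```python
import math


def octave_complex_pitch(fx, py, f_min=30.0):
    """Pitch of a 1:2-complex in equilibrium, with px = pi/2 and py in radians."""
    px = math.pi / 2
    ux = math.floor(math.log2(fx / f_min))  # octaves of the lower tone
    qx = fx / (f_min * 2 ** ux)             # quotient shared by both tones
    # Assumed form of the distance in octaves: o = octaves + p / (2 pi).
    ox = ux + px / (2 * math.pi)
    oy = (ux + 1) + py / (2 * math.pi)
    ou = (ox + oy) / 2                      # arithmetic mean under equilibrium
    f_t = f_min * 2 ** math.floor(ou)       # transposed tone, unitary quotient
    return f_t * qx                         # up transposition by qx


print(octave_complex_pitch(200.0, math.pi))          # py below 3*pi/2
print(octave_complex_pitch(200.0, 1.75 * math.pi))   # py above 3*pi/2
```

The first call falls in the lower phase subrange and the second in the upper one, so the two results lie an octave apart.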

5.2. COMPONENTS A FIFTH APART

For a complex whose components are a fifth apart, that is, m:n = 2:3, the vector placement is that shown in Figure 19(b). In this case, the sails of the transposed tones are complementary, so that together they define the red-cyan rectangular section of the RGB cube. Because of this singular alignment of sails, the transposed resulting vector [ũ]RGB will be located exclusively either on the sail of [x̃]RGB or on that of [ỹ]RGB. As illustrated in Figure 20 for the red-cyan rectangular section, if the angle ψ between the transposed addition vector [ũ]RGB and the diagonal KC (of the face KBCG) is greater than δ = tan⁻¹(√2/2), the angle between the diagonal KW of the cube and the diagonal KC of the face, the resulting vector will be on the red sail, so that its quotient will be the same as that of [x̃]RGB. Otherwise, it will be on the cyan sail, and so its quotient will be equal to that of [ỹ]RGB.


Since the necessary condition for a RGB vector to be located on the red-cyan rectangular section is that its components g and b have the same value, it holds for the vector [x̃]RGB that gx̃ = bx̃, while for the vector [ỹ]RGB it holds that gỹ = bỹ, so that the projection of the resulting vector [ũ*]RGB on the side KR is given by

while its projection on the diagonal KC is

Therefore, as

the resulting vector will be located on the red sail if

that is,

whereas it will be located on the cyan sail if

5.2.1. Computed pitch under equilibrium: When the 2:3-complex is in equilibrium, that is, lx̃ = lỹ, it follows from Equation (57) that

a condition whose RGB equivalent is found from the RGB to HSV transformation mentioned in Section 3.2.2 according to which vx̃ = max(rx̃, gx̃, bx̃), and vỹ = max(rỹ, gỹ, bỹ). Therefore, since

and

it holds that

Substituting Equation (95) into Equation (90) gives the condition for the transposed resulting vector being located on the red sail, that is,

Substituting Equation (95) into Equation (91) gives the condition for the transposed resulting vector being located on the cyan sail, that is,

In order to obtain the AFP equivalents of these conditions, it is necessary to carry out the back conversion. According to the RGB to HSV transformation, the saturations sx̃ and sỹ are given respectively by sx̃ = [max(rx̃, gx̃, bx̃) - min(rx̃, gx̃, bx̃)]/max(rx̃, gx̃, bx̃) and sỹ = [max(rỹ, gỹ, bỹ) - min(rỹ, gỹ, bỹ)]/max(rỹ, gỹ, bỹ).

Since min(rx̃, gx̃, bx̃) = gx̃ and min(rỹ, gỹ, bỹ) = rỹ, and by using Equations (93)-(95), it follows that

and

To find the condition for the transposed resulting vector being on the red sail under equilibrium, it is necessary to compare Equations (98) and (99) with each other while taking into consideration Equation (96). In this way, it follows that sx̃ > sỹ, or, according to Equation (56), the respective distances in octaves are such that

which, from Equation (32), yields

Since fx̃ is a power of two times fmin, it follows from Equations (30) and (31) that fx̃ and fỹ are in the same octave relative to fmin. Therefore,

Then, substituting Equations (44) and (102) into Equation (101), it follows that

That is, under equilibrium conditions of the 2:3-complex, the resulting transposed vector will be on the red sail if the phase of the transposed higher tone ỹ(t) is less than π/2 rad. Otherwise, that is, if

it will be on the cyan sail. Therefore, the pitch of a 2: 3-complex in equilibrium has a bipolar response to the phase relationship, since it can assume just one of two values, i.e., according to Equation (61), fminqx, when the transposed resulting vector is on the red sail, and 1.5fminqx, when it is on the cyan sail.

Finally, if both tones not only have the same loudness but also are in cosine phase, that is, the 2:3-complex is in entire equilibrium (Section 2.2), the resulting vector will be aligned with the achromatic diagonal KW, which is the border between the red and cyan sails. This means that in this very particular case, the resulting pitch is indefinite.
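
The red or cyan decision for a transposed 2:3-complex can be checked with a small sketch. It assumes that Equations (39)-(41) coincide with the standard HSV to RGB transformation implemented by Python's colorsys module, places the lower tone at hue 0° and the upper tone at hue 180°, and ignores the positive scalar ξ, which changes the length of the sum but not its direction. The function name and the numerical values are illustrative.

```python
import colorsys
import math


def sail_of_fifth(sx, vx, sy, vy):
    """Return 'red', 'cyan' or 'achromatic' for a transposed 2:3-complex.

    sx, vx: saturation and value of the lower tone (hue 0 degrees).
    sy, vy: saturation and value of the upper tone (hue 180 degrees).
    """
    rx, gx, _bx = colorsys.hsv_to_rgb(0.0, sx, vx)  # (v, v(1-s), v(1-s))
    ry, gy, _by = colorsys.hsv_to_rgb(0.5, sy, vy)  # (v(1-s), v, v)
    r, g = rx + ry, gx + gy  # b equals g on the red-cyan section
    if math.isclose(r, g):
        return "achromatic"  # aligned with the diagonal KW
    return "red" if r > g else "cyan"


# Equilibrium (equal values): the more saturated tone decides the sail.
print(sail_of_fifth(0.6, 0.5, 0.4, 0.5))
print(sail_of_fifth(0.4, 0.5, 0.6, 0.5))
print(sail_of_fifth(0.5, 0.5, 0.5, 0.5))
```

Expanding the sums shows that r > g is equivalent to vx·sx > vy·sy, which under equilibrium reduces to the comparison of saturations used in the derivation above.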

5.3. COMPONENTS A FOURTH APART

For a complex whose components are a fourth apart, that is, m:n = 3:4, the placement of vectors is that shown in Figure 19(c). Now, as the sails of the transposed tones are angularly spaced apart by 120°, the transposed resulting vector [ũ]RGB can be located on any sail defined between 0 and 120°. Therefore, the quotient of the vector addition tone has a value between the component quotients.

When a 3:4-complex is in entire equilibrium, the value of its computed pitch is an octave below the arithmetic mean of the component frequencies, that is, fu = (fx + fy)/4. (A proof of this statement is not given here.) The effects of phase in this complex are described below in Section 5.5.

5.4. COMPONENTS A MAJOR THIRD APART

For a complex whose components are a major third apart, that is, m:n = 4:5, the placement of vectors is that shown in Figure 19(d). In this case, the sails of the transposed tones are angularly apart by 90°, so that the transposed resulting vector [ũ]RGB can be located on any sail defined between 0 and 90°. Therefore, the computed quotient is a value between the component quotients.

5.5. PHASE EFFECTS ON THE PITCH OF M:N-COMPLEXES

By keeping the phase px at 90° while py is allowed to vary from 0 to 360°, it is possible in terms of phase sensitivity to classify m: n-complexes into four major groups according to the way the pitch changes in each of them.

The first group contains just the 1:1-complex, because it is the only complex whose pitch does not change with the phase relationship, although the loudness of the vector addition tone is affected.

The second one is constituted of complexes having a bipolar effect, that is, those in which, for a certain phase subrange for py, the resulting pitch corresponds to one of the component frequencies, while for the complementary subrange it corresponds to the other component frequency, as occurs with 1: 2 and 2: 3 complexes discussed above in Sections 5.1 and 5.2.

The third one comprises complexes whose pitch response has a discontinuity, that is, there are two complementary phase subranges where pitch varies continuously, these subranges being about one octave apart from each other. For example, the pitch of a 4:5-complex in equilibrium increases continuously by 44 cents relative to a given frequency value as py goes from 0° to 330°, while from 330° to 360° it increases continuously from 1244 to 1247 cents relative to the same reference value of the first subrange. Other examples include 5:6, 22:27, and 5:8 complexes.

Finally, the fourth and last group comprises m:n-complexes in which the pitch varies continuously along the full phase range, in general over a small interval. For instance, the pitch of a 3:4-complex in equilibrium increases continuously by about 88 cents as py goes from 0° to 360°, while that of an 8:9-complex in equilibrium increases by about 20 cents. Other examples include 25:36, 3:5, and 8:15 complexes.

6. THE MISSING FUNDAMENTAL

The application of Algorithm M to three examples selected from the literature is first considered. Subsequently, the possibility for the pitch of a harmonic complex to be the same as the frequency of the fundamental is investigated by exploring the geometric properties of the vector pairs operated by Algorithm A inside Algorithm M. An audible demonstration of these complexes and their respective vector-addition tones is found in [3].

From this point on, the notation "Hk" for a spectral component is used, where the integer k is the harmonic number of the respective component, that is, fk = kf1, Hk being the same as Hk(t) = ak sin(2πfk t + pk).

6.1. PITCH COMPUTATION FOR A COMPLEX WITH SUCCESSIVE HARMONICS

The complex C1 = {H3, H4, H5} to be considered in the first place has component frequencies of 600, 800, and 1000 Hz, respectively, so that its missing fundamental is at 200 Hz. A study of this complex was reported in [9] where the components H3, H4, and H5 have the same amplitude and are in cosine phase. Their values in AFP quantities are taken as [H3]AFP = 〈0.9,600,90〉, [H4]AFP = 〈0.9,800,90〉, and [H5]AFP = 〈0.9,1000,90〉. In this way, their loudnesses are l3 = 36, l4 = 64, and l5 = 100 luts. The amplitudes are set to 0.9 units because this value is appropriate in relation to the upper limit amplitude of the highest component H5, which is also the loudest one. More precisely, for a lower limit pitch fmin set to 30 Hz [12] and an upper limit amplitude [amax]fmin set to 1000 units, then according to Equation (34) the maximum loudness is lmax = 900000. Therefore, it follows from Equation (35) that the upper-limit amplitude for H5 is [amax]f = 900000/1000² = 0.9 units. As the amplitude values are relative to each other, overall gain adjustments are required for a suitable sound pressure level as, for example, a value around 65 dB SPL for H5. The final vector addition tone is found by applying Algorithm M to the set of the three components such that u(t) = M(H3, H4, H5) = A[A(H3, H4), H5], that is, Algorithm A is first applied to the components H3 and H4, thus yielding a temporary tone u1(t), then it is applied again, this time to the pair (u1, H5). In this way, the computed pitch for the complex C1 is 229.145 Hz, that is, 236 cents above the 200-Hz fundamental. As for the loudness lu, the computed tone u(t) has a loudness of 45 luts, therefore a value between the loudnesses of H3 and H4.
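
The loudness figures quoted for C1 can be checked with a short sketch. It assumes that the loudness in luts equals 100·a·f²/lmax with lmax = 900000; this form is not stated explicitly in this section but is inferred from the values above, which it reproduces. The function names are illustrative.

```python
L_MAX = 900_000.0  # maximum loudness: 1000 units at f_min = 30 Hz


def loudness_luts(a, f, l_max=L_MAX):
    """Loudness in luts, assuming l = 100 * a * f**2 / l_max."""
    return 100.0 * a * f ** 2 / l_max


def max_amplitude(f, l_max=L_MAX):
    """Upper-limit amplitude at frequency f, as in the computation for H5."""
    return l_max / f ** 2


for f in (600.0, 800.0, 1000.0):
    print(f, loudness_luts(0.9, f))
print(max_amplitude(1000.0))
```

The integer values quoted for the other complexes of this section agree with this relation to within about one lut, which suggests that the printed figures were rounded or truncated.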

6.1.1. Phase sensitivity: Small changes in any of the phases of components H3, H4, and H5 produce small changes in the pitch of C1. There are, however, some phase relationships that produce significant changes in the pitch. For example, when the phase of H3 is set to 0 while those of H4 and H5 are held at 90, the resulting tone is [u]AFP = 〈1.87,485.25,150〉, that is, the computed pitch is 1299 cents above the value found in the case where the components are all in cosine phase, with a loudness of 49 luts. If the phase of H4 is set to 180 while the phases of H3 and H5 are held at 90, the resulting tone is [u]AFP = 〈0.65,983.88,199.17〉, that is, the pitch is 2523 cents above the first computed value, with a loudness of 70 luts.
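
The intervals in cents quoted here and in Section 6.1 follow from the usual definition of 1200 cents per octave, as this minimal sketch shows. The function name is illustrative.

```python
import math


def cents(f, f_ref):
    """Interval from f_ref up to f in cents (1200 cents per octave)."""
    return 1200.0 * math.log2(f / f_ref)


print(round(cents(229.145, 200.0)))   # computed pitch of C1 against 200 Hz
print(round(cents(485.25, 229.145)))  # first phase change against the C1 pitch
print(round(cents(983.88, 229.145)))  # second phase change against the C1 pitch
```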

6.2. ANOTHER COMPLEX WITH SUCCESSIVE HARMONICS

The complex C2 = {H9, H10, H11} to be considered now was mentioned in [5]. It has component frequencies of 1800, 2000, and 2200 Hz, respectively, so that its missing fundamental is at 200 Hz. In AFP quantities they are taken as [H9]AFP = 〈0.0675,1800,0〉, [H10]AFP = 〈0.15,2000,0〉, and [H11]AFP = 〈0.0675,2200,0〉 so that the corresponding loudnesses are l9 = 24, l10 = 66, and l11 = 36 luts. After applying Algorithm M to C2, it is found that [u]AFP = 〈0.21,2024.38,359.88〉. Thus, the computed pitch, that is, 2024.38 Hz, is 4007 cents above the frequency of the fundamental. The loudness is 97 luts.

6.3. A COMPLEX HAVING NONSUCCESSIVE HARMONICS

The complex C3 = {H46, H51, H56} to be considered now was also mentioned in [5]. It has component frequencies of 1840, 2040, and 2240 Hz, respectively, so that its missing fundamental is at 40 Hz. In AFP quantities they are taken as [H46]AFP = 〈0.0675,1840,0〉, [H51]AFP = 〈0.15,2040,0〉, and [H56]AFP = 〈0.0675,2240,0〉 so that the corresponding loudnesses are l46 = 25, l51 = 69, and l56 = 37 luts. After applying Algorithm M to C3, it is found that [u]AFP = 〈0.11,2058.17,359.88〉. Thus, the computed pitch, that is, 2058.17 Hz, is 29 cents above the value found for the preceding complex. The loudness is 53 luts.

6.4. COMPUTED PITCH COMPARED TO THE FREQUENCY OF THE FUNDAMENTAL

The pitch computation for the complexes C 1 and C2 above indicates that successive harmonics are not a necessary and sufficient condition to assure the equality between pitch and the frequency of the fundamental. The same is true for nonsuccessive harmonics, as shown with complex C3. The equality between pitch and the frequency of the fundamental (whether present or missing) is discussed below through the pitch computation of some selected complexes, followed by the description of a well-defined class of harmonic numbers related to such equality.

6.4.1. Complex having the first three harmonics. First case: the computed pitch is equal to the frequency of the fundamental: First, let C4 = {H1, H2, H3} be a complex having the first three harmonics. According to Algorithm M, the vector addition tone of complex C4 can be found by two successive applications of Algorithm A, namely u1(t) = A(H1, H2), and u2(t) = A(u1, H3). After generating the down-transposed tones H̃1 and H̃2 in the first step of Algorithm D, since f2 = 2f1, it follows from Equation (30) that q1 = q2 = 1. That is, the vectors [H̃1]RGB and [H̃2]RGB are on the same RGB sail. Therefore, the temporary component u1(t) has the same quotient as H1, as discussed in Section 5.1 for a 1:2-complex. Now the situation is like that described above in Section 5.2 for a 2:3-complex. That is, if u1(t) when compared to H3 is such that Equation (89) holds, the resulting tone u2(t) will have the same quotient as u1(t), which is the same as that of H1. Therefore, the computed pitch will be equal to the frequency of the fundamental. Otherwise, it will have the same quotient as H3, for it will be on the page qu2 = q3. In this case, the computed pitch will be the same as a fifth above the frequency of the fundamental.

For a direct numerical example, let the spectral components H1, H2, and H3 of C4 be [H1]AFP = 〈4,200,90〉, [H2]AFP = 〈3,400,90〉, and [H3]AFP = 〈0.34,600,90〉. According to Equation (33), the corresponding loudnesses are l1 = 18, l2 = 53, and l3 = 14 luts. After applying Algorithm M to C4, it is found that [u]AFP = 〈16.89,200,55.39〉. Thus, the computed pitch, 200 Hz, is the same as the frequency of the fundamental, the loudness being 75 luts.

6.4.2. Complex having the first three harmonics. Second case: the computed pitch is a fifth above the frequency of the fundamental: For a complex C5 having the same first three harmonics H1, H2, and H3 as C4, but having another proportion of amplitudes, namely [H1]AFP = 〈3,200,90〉, [H2]AFP = 〈1.05,400,90〉, and [H3]AFP = 〈2,600,90〉, that is, the corresponding loudnesses are l1 = 13, l2 = 17, and l3 = 80 luts, after applying Algorithm M to C5, it is found that [u]AFP = 〈7.25,150,152〉. Thus, the computed pitch, 150 Hz, is a fourth below the frequency of the fundamental, that is, an octave below its fifth, with a loudness of 18 luts.

6.4.3. Complex having the first four harmonics. First case: the computed pitch is equal to the frequency of the fundamental: Let C6 = {H1, H2, H3, H4} be a complex having the first four harmonics. According to Algorithm M, the vector addition tone of complex C6 can be found by three successive applications of Algorithm A, that is, u1(t) = A(H1, H2), u2(t) = A(u1, H3), and u3(t) = A(u2, H4). If the components are such that [H1]AFP = 〈4,200,90〉, [H2]AFP = 〈3,400,90〉, [H3]AFP = 〈0.75,600,90〉, and [H4]AFP = 〈0.5,800,90〉, that is, the corresponding loudnesses are l1 = 18, l2 = 53, l3 = 30, and l4 = 36 luts, after applying Algorithm M to C6, it is found that [u]AFP = 〈22.35,200,167〉. Thus, the computed pitch, 200 Hz, is the same as the frequency of the fundamental, with a loudness of 99 luts.

6.4.4. Complex having the first four harmonics. Second case: the first three components have the same loudness; the computed pitch is two octaves below the frequency of the fundamental: Let C7 = {H1, H2, H3, H4} be a complex having the first four harmonics. If the components are such that [H1]AFP = 〈6,200,90〉, [H2]AFP = 〈1.5,400,90〉, [H3]AFP = 〈0.65,600,90〉, and [H4]AFP = 〈0.23,800,90〉, that is, the first three have the same loudness l1 = l2 = l3 = 26 luts, while the last one has l4 = 16 luts, after applying Algorithm M to C7, it is found that [u]AFP = 〈285.82,50,63.56〉. Therefore, the computed pitch, 50 Hz, is two octaves below the frequency of the fundamental, the corresponding loudness being 79 luts. This occurs because the vector addition of H1, H2, and H3 is similar to that of the complex C5 above, that is, the resulting frequency of u2 is a fifth above the frequency of the fundamental. Thus, the addition of this resulting tone with H4, which is the last component of C7, is like the case of the addition of tones a fourth apart, as discussed in Section 5.3.

6.4.5. Complex having the first four harmonics. Third case: the components have the same loudness; the phase of the first component is set to 91; the phase of the second is set to 0; the computed pitch is 33 cents below the frequency of the fundamental: Let C8 = {H1, H2, H3, H4} be a complex having the first four harmonics. If the components are such that [H1]AFP = 〈6,200,91〉, [H2]AFP = 〈1.5,400,0〉, [H3]AFP = 〈0.65,600,90〉, and [H4]AFP = 〈0.37,800,90〉, that is, all of them have the same loudness l1 = l2 = l3 = l4 = 26 luts, after applying Algorithm M to C8, it is found that [u]AFP = 〈5.519,196.25,51.226〉. Therefore, the computed pitch, 196.25 Hz, is 33 cents below the fundamental. The loudness is 23 luts.

6.4.6. Removing the fundamental and inserting the sixth harmonic. First case: the computed pitch is equal to the frequency of the missing fundamental: Let C9 = {H2, H3, H4, H6} be a four-tone complex composed of second, third, fourth, and sixth harmonics. If the components are such that [H2]AFP = 〈2.5,400,90〉, [H3]AFP = 〈0.3,600,90〉, [H4]AFP = 〈0.5,800,90〉, and [H6]AFP = 〈0.065,1200,90〉, that is, the corresponding loudnesses are l2 = 44, l3 = 12, l4 = 36, and l6 = 10 luts, after applying Algorithm M to C9, it is found that [u]AFP = 〈18.81,200,243.71〉. Thus, the computed pitch, 200 Hz, equals the frequency of the missing fundamental. The loudness is 84 luts.

6.4.7. Removing the fundamental and inserting the sixth harmonic. Second case: the computed pitch is a fifth above the frequency of the missing fundamental: Let C10 = {H2, H3, H4, H6} be a four-tone complex with second, third, fourth, and sixth harmonics. If the components are such that [H2]AFP = 〈3,400,90〉, [H3]AFP = 〈1,600,90〉, [H4]AFP = 〈1,800,90〉, and [H6]AFP = 〈0.5,1200,90〉, that is, now the sixth harmonic is louder than the others, namely l2 = 53, l3 = 40, l4 = 71, and l6 = 80 luts, after applying Algorithm M to C10, it is found that [u]AFP = 〈41.35,150,62.47〉. Therefore, the computed pitch, 150 Hz, is an octave below the fifth of the frequency of the missing fundamental. The loudness is about 100 luts.

6.5. CONCLUDING THE CONSIDERATIONS ON THE FUNDAMENTAL

There are two conditions to be satisfied for the equality between the computed pitch fu and the frequency (F0) of the fundamental. The first one is that both the vector-addition tone and the fundamental must be located on the same RGB sail, that is, they must have the same quotient. The second one is that their corresponding vectors must occur in close directions, so that they can be found in the same octave, that is, the difference between their distances in octaves must be less than one. For complexes having a large number of components, it is not so simple, as it was in Section 5 for the case of two-tone complexes, to determine a priori, that is, before applying Algorithm M, whether they can satisfy these two conditions. Theoretically, there is an infinity of harmonic complexes which cannot satisfy them. However, there is at least one particular solution which can be analyzed, since in the context of Algorithm M, when the transposed vector addition tone is located on the red sail, it has the same quotient as one of the transposed spectral components, or as one of the transposed temporary components. If such a situation occurs and, in addition, the harmonic number of the corresponding component is a power of two, then the vector addition tone will have the same quotient as the fundamental. Here, at least one well-behaved class of harmonic complexes can be cited in which every vector composition may be carried out exclusively on the red-cyan rectangular section. More specifically, the composition of transposed vectors having harmonic numbers according to 1, 2, 3, 4, 6, 8, 12, ... may take place exclusively on the red-cyan rectangular section so that, depending upon the amplitude proportions and/or phase relationships, the transposed vector addition tone can be placed on the red sail.
In summary, the vector addition tone of a complex whose harmonic numbers can be expressed as 2^j · 3^k, where j and k are integers such that j ≥ 0 and 0 ≤ k ≤ 1, may have a quotient which equals that of the fundamental.
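
Membership in this class of harmonic numbers is easy to test: after removing all factors of two, what remains must be 1 or 3. A minimal sketch, with an illustrative function name:

```python
def in_red_cyan_class(n):
    """True if n = 2**j * 3**k with integer j >= 0 and k in {0, 1}."""
    if n < 1:
        return False
    while n % 2 == 0:  # strip every factor of two
        n //= 2
    return n in (1, 3)


print([n for n in range(1, 25) if in_red_cyan_class(n)])
```

Harmonic numbers such as 5 or 9 fall outside the class, which matches the sequence 1, 2, 3, 4, 6, 8, 12, ... given above.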

By using the spectral content of the ten complexes described in Sections 6.1-6.4 as additive musical instruments in the playing of a melodic line, that is, a sequence of fundamentals, it becomes clear that obtaining a computed pitch equal to the frequency of the fundamental at every note is not simple, even when the complex belongs to the above-mentioned class of harmonic complexes, since that class accounts just for the first condition, that is, the equality of quotients. It is audibly demonstrated in [3] that for the first three complexes the pitch reaches neither the frequency of the current fundamental nor any of its octaves. In the first one, the pitch is a diminished minor third above the frequency of the current fundamental for all the notes, as found for C1. In the second one, it is three octaves plus a Pythagorean major third above the fundamental sequence, as computed for C2. In the third one, it is five octaves plus a just minor sixth above the fundamental sequence, as computed for C3. For some of the most favorable cases (complexes C4 to C10), the pitch may in fact correspond to the frequency of the fundamental, especially in the case of C4, but only at some notes, for it will experience octave shifts at other ones. It seems that playing a song with whatever instrument is in some way putting it out of tune, unless the instrument has a single component.

7. CONCLUSIONS

  1. Results were presented showing that the pitch of harmonic complexes can be computed by means of vector operations provided that the spectral components are represented in QOL coordinates and then transformed into RGB vectors (Sections 3.1–3.2). Besides, the symmetry of the second-derivative's zero-crossing pattern along the period of the involved m:n-complexes is also needed for the pitch computation.

  2. Any complex having two or more components whose vector addition tone is aligned with the achromatic diagonal of the RGB cube does not have a definite pitch. This occurs, for instance, with the 2:3-complex at entire equilibrium (Section 5.2).

  3. Phase effects are quite different for different groups of m:n-complexes (Section 5.5). In complexes having more than two components, the pitch can change continuously within small intervals or can change by one or more octaves (Section 6.1.1), depending on the diversity of the harmonic numbers.

  4. The missing fundamental problem can be addressed by observing the geometry of the vector compositions involved in the computation of its pitch, and by finding the appropriate amplitude proportions and phase relationships which are necessary to produce a resulting vector on the RGB sail of the missing fundamental (Section 6.4). There are harmonic complexes, however, for which the frequency of the fundamental cannot correspond to the computed pitch, even if it is present as a spectral component.
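The second conclusion has a direct counterpart in ordinary color arithmetic, which the sketch below illustrates with Python's standard `colorsys` module. It relies only on the HSV analogy stated in the abstract (quotient as the counterpart of hue) and on an assumed component-wise average as the mixing rule; the paper's actual QOL-to-RGB transformation and Algorithm M are not reproduced, and the names `mix` and `has_definite_hue` are hypothetical.

```python
import colorsys


def mix(rgb_a, rgb_b):
    """Component-wise average of two RGB vectors (assumed mixing rule, stays in [0, 1])."""
    return tuple((a + b) / 2 for a, b in zip(rgb_a, rgb_b))


def has_definite_hue(rgb, tol=1e-9):
    """False when the vector lies on the achromatic diagonal R = G = B.

    On that diagonal the HSV saturation is zero, so the hue is undefined.
    """
    _, saturation, _ = colorsys.rgb_to_hsv(*rgb)
    return saturation > tol


red = (1.0, 0.0, 0.0)
cyan = (0.0, 1.0, 1.0)
yellow = (1.0, 1.0, 0.0)

# Complementary vectors cancel onto the gray diagonal; non-complementary ones do not.
print(mix(red, cyan), has_definite_hue(mix(red, cyan)))
print(mix(red, yellow), has_definite_hue(mix(red, yellow)))
```

In this analogy, a resulting vector without a definite hue corresponds to a complex without a definite pitch.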

ACKNOWLEDGMENTS

The author would like to thank CNPq for the financial support at the beginning of this research some years ago, as well as the University of Brasilia for supporting the Multimedia Computing Laboratory, the location where experiments, computer programs, and graduate classes related to the background of this paper have taken place.

Received 07 January 2008; accepted 04 July 2008

  • [1] A. Arcela. As árvores de tempos e a configuração genética dos intervalos musicais. PhD thesis, Pontifical Catholic University of Rio de Janeiro, 1984.
  • [2] A. Arcela. The equilibrium theorem in Bach's two-part inventions: an audible demonstration. http://www.cic.unb.br/docentes/arcela/equilibrium/, January 2008.
  • [3] A. Arcela. Out of tune: audible demonstration of the vector-addition tone in the computation of the pitch of some multitone complexes. http://www.cic.unb.br/docentes/arcela/outoftune/, January 2008.
  • [4] A. Bachem. Tone height and tone chroma as two different pitch qualities. Acta Psychol., 5:80–88, 1950.
  • [5] J. F. Schouten; R. J. Ritsma; B. Lopes Cardozo. Pitch of the residue. J. Acoust. Soc. Am., 34:1418–1424, 1962.
  • [6] E. de Boer. On the "residue" in hearing. PhD thesis, University of Amsterdam, 1956.
  • [7] A. J. M. Houtsma; J. L. Goldstein. The central origin of the pitch of complex tones: evidence from musical interval recognition. J. Acoust. Soc. Am., 69:520–529, 1972.
  • [8] J. L. Goldstein. An optimum processor theory for the central formation of the pitch of complex tones. J. Acoust. Soc. Am., 54:1496–1516, 1973.
  • [9] R. Meddis; M. J. Hewitt. Virtual pitch and phase sensitivity of a computer model of the auditory periphery. I: Pitch identification. J. Acoust. Soc. Am., 89:2866–2882, 1991.
  • [10] A. J. M. Houtsma. Musical pitch of two-tone complexes and predictions by modern pitch theories. J. Acoust. Soc. Am., 66:87–99, 1979.
  • [11] A. J. M. Houtsma. Pitch of unequal-amplitude dichotic two-tone harmonic complexes. J. Acoust. Soc. Am., 69:1778–1785, 1981.
  • [12] D. Pressnitzer; R. D. Patterson; K. Krumbholz. The lower limit of melodic pitch. J. Acoust. Soc. Am., 109:2074–2084, 2001.
  • [13] J. C. R. Licklider. A duplex theory of pitch perception. J. Acoust. Soc. Am., 7:128–133, 1951.
  • [14] A. H. Munsell. A pigment color system and notation. The American Journal of Psychology, 23:236–244, 1912.
  • [15] H. Fletcher; W. A. Munson. Loudness, its definition, measurement and calculation. J. Acoust. Soc. Am., 5:82–108, 1933.
  • [16] G. S. Ohm. Ueber die definition des tones, nebst daran geknüpfter theorie der sirene. Ann. Phys. Chem., 59:513–565, 1843.
  • [17] R. Meddis; L. O'Mard. Virtual pitch in a computational physiological model. J. Acoust. Soc. Am., 120:3861–3869, 2006.
  • [18] J. G. Bernstein; A. J. Oxenham. An autocorrelation model with place dependence to account for the effect of harmonic number on fundamental frequency discrimination. J. Acoust. Soc. Am., 117:3816–3831, 2005.
  • [19] R. Plomp. Pitch of complex tones. J. Acoust. Soc. Am., 41:1526–1533, 1967.
  • [20] E. Zwicker; B. Scharf. A model of loudness summation. Psychological Review, 72:3–26, 1965.
  • [21] A. Seebeck. Beobachtungen über einige bedingungen der entstehung von tönen. Ann. Phys. Chem., 53:417–436, 1841.
  • [22] W. P. Shofner; G. Selas. Pitch strength and Stevens's power law. Perception & Psychophysics, 64:437–450, 2002.
  • [23] R. N. Shepard. Structural representations of musical pitch. In D. Deutsch, editor, The Psychology of Music, chapter 11, pages 343–390. Academic, Orlando, 1982.
  • [24] A. R. Smith. Color gamut transform pairs. ACM SIGGRAPH, Computer Graphics, 12:12–19, 1978.
  • [25] G. F. Smoorenburg. Pitch perception of two-frequency stimuli. J. Acoust. Soc. Am., 48:924–942, 1970.
  • [26] S. S. Stevens. The measurement of loudness. J. Acoust. Soc. Am., 27:815–829, 1955.
  • [27] S. S. Stevens. On the validity of the loudness scale. J. Acoust. Soc. Am., 31:995–1003, 1959.
  • [28] H. Fastl; G. Stoll. Scaling of pitch strength. Hearing Research, 119:293–301, 1979.
  • [29] E. Terhardt. Pitch, consonance and harmony. J. Acoust. Soc. Am., 55:1061–1069, 1974.
  • [30] J. D. Foley; A. van Dam; S. K. Feiner; J. F. Hughes. Computer Graphics: principles and practice, chapter 13, pages 343–363. Addison-Wesley, second edition, 1990.
  • [31] H. L. F. von Helmholtz. On the Sensations of Tone, chapter 4, pages 58–65. Dover (English translation A. J. Ellis, 1885, 1954), 1877.
  • [32] F. L. Wightman. The pattern-transformation model of pitch. J. Acoust. Soc. Am., 54:407–416, 1973.

Publication Dates

  • Publication in this collection
    03 Nov 2008
  • Date of issue
    Sept 2008

Sociedade Brasileira de Computação - UFRGS, Av. Bento Gonçalves 9500, B. Agronomia, Caixa Postal 15064, 91501-970 Porto Alegre, RS - Brazil, Tel. / Fax: (55 51) 316.6835 - Campinas - SP - Brazil
E-mail: jbcs@icmc.sc.usp.br