Nonholonomic connections following Élie Cartan
JAIR KOILLER^{1}, PAULO R. RODRIGUES^{2} and PAULO PITANGA^{3}
^{1}Laboratório Nacional de Computação Científica,
Av. Getulio Vargas 333, 25651-070 Petrópolis, RJ, Brazil.
^{2}Departamento de Geometria, Instituto de Matemática,
Universidade Federal Fluminense, 24020-140 Niterói, RJ, Brazil.
^{3}Instituto de Física, Universidade Federal do Rio de Janeiro,
Cx. Postal 68528, Cidade Universitária, 21945-970 Rio de Janeiro, RJ, Brazil.
Manuscript received on February 5, 2001; accepted for publication on February 12, 2001;
presented by MANFREDO DO CARMO
ABSTRACT
In this note we revisit E. Cartan's address at the 1928 International Congress of Mathematicians in Bologna, Italy. The distributions considered here are of the same class as those considered by Cartan, a special type which we call strongly or maximally nonholonomic. We set up the groundwork for applying Cartan's method of equivalence (a powerful tool for obtaining invariants associated with geometrical objects) to more general nonholonomic distributions.
Key words: nonholonomic mechanics, Cartan's equivalence method, affine connections.
INTRODUCTION
Le vrai problème de la représentation géométrique d'un système matériel non holonome consiste [...] dans la recherche d'un schéma géométrique lié d'une manière invariante aux propriétés mécaniques du système.
Élie Cartan
In this article we revisit E. Cartan's address (Cartan 1928) at the 1928 International Congress of Mathematicians in Bologna, Italy. The distributions considered by Cartan were of a special type which we call strongly or maximally nonholonomic. Our aim is to set up the groundwork for applying Cartan's method of equivalence (a powerful tool for obtaining invariants associated with geometrical objects; see Gardner 1989) to more general nonholonomic distributions.
This is a local study, but we outline some global aspects. If the configuration space Q is a manifold of dimension n, its tangent bundle TQ should admit a smooth subbundle E of dimension m, m < n. As is well known, this imposes topological constraints on Q; see Koschorke 1981. Although we will be discussing only local invariants, hopefully these will help in constructing global ones, such as special representations for the characteristic classes (Milnor and Stasheff 1974, Postnikov et al. 1999).
NOTATION. Throughout this paper we use the following convention: capital roman letters I, J, K, etc. run from 1 to n. Lower case roman characters i, j, k run from 1 to m (representing the constraint distribution). Greek characters $\alpha$, $\beta$, $\gamma$, etc., run from m + 1 to n. Summation over repeated indices is assumed unless otherwise stated.
1. NONHOLONOMIC CONNECTIONS
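We fix a Riemannian metric g on Q and let $\nabla$ be the associated Riemannian connection, which is torsion free and metric preserving:
$\nabla_X Y - \nabla_Y X = [X, Y], \qquad X\,g(Y, Z) = g(\nabla_X Y, Z) + g(Y, \nabla_X Z) .$ (1.1)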
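In section 8 we consider an arbitrary affine connection (see Hicks 1965) on Q. Recall that given a local frame $e_I$ on an open subset $U \subset Q$ and its dual coframe $\omega_I$, a connection $\nabla$ is described by local 1-forms $\omega_{IJ}$ (with $\omega_{IJ} = -\omega_{JI}$ for a metric connection in an orthonormal frame) such that
$\nabla_X e_J = \omega_{IJ}(X)\, e_I .$ (1.2)
The torsion tensor is $T(X, Y) = \nabla_X Y - \nabla_Y X - [X, Y] = t_I(X, Y)\, e_I$, and expanding the left hand side we get the structure equations
$d\omega_I + \omega_{IJ} \wedge \omega_J = t_I .$ (1.3)
As the Riemannian connection is torsion free, $t_I \equiv 0$.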
We assume henceforth that the frame is adapted to the distribution E. This means that the $\{e_i(q)\}$ span the subspace $E_q$, $q \in Q$, and the remaining $\{e_\alpha\}$ span the g-orthogonal space $F_q = E_q^{\perp}$.
DEFINITION 1.1. The (Levi-Civita) nonholonomic connection D on E is defined by the rule
(
(1.4)
(1.5)
Here we allow X to be any vector field on Q, not necessarily tangent to E. Notice that for vector fields Y, Z tangent to E, the metric compatibility
$X\,g(Y, Z) = g(D_X Y, Z) + g(Y, D_X Z)$
still holds.
The motivation for Definition 1.1 is D'Alembert's principle. Consider a mechanical system with kinetic energy $T = \tfrac12\, g(\dot c(t), \dot c(t))$ and applied forces f, subject to constraints such that $\dot c(t) \in E_{c(t)}$. The constraining force $\nabla_{\dot c}\,\dot c(t) - f$ is g-perpendicular to the constraint subspace $E_{c(t)}$, since it does not produce work. Unless otherwise mentioned, we assume there are no applied forces. The geodesic equations are given, in Cartan's approach, by
$D_{\dot c}\,\dot c = 0 .$ (1.6)
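One may wish to see the equations explicitly. Choose a coordinate system x on $U \subset Q$. Define $m^3$ functions (Christoffel symbols) $\Gamma_{ijk}(x)$ on U by
$D_{e_i} e_j = \Gamma_{ijk}\, e_k .$ (1.7)
Write
$\dot c(t) = p_k(t)\, e_k(c(t))$ (1.8)
(some authors call the $p_k = \omega_k(\dot c(t))$ quasivelocities).
PROPOSITION 1.2. The geodesic condition $D_{\dot c}\,\dot c = 0$ yields a nonlinear system in n + m dimensions for x and p given by
$\dot x_R = p_k\, e_{kR}(x), \qquad \dot p_k = -\Gamma_{ijk}(x)\, p_i\, p_j .$ (1.9)
Here $e_{kR}$ is the R-th component ($1 \le R \le n$) of the k-th E-basis vector ($1 \le k \le m$) in terms of the chosen trivialization of TQ.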
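The Christoffel symbols can be rewritten in terms of the $m^3$ data $c_{ijk}$ defined by
$[e_i, e_j] = c_{ijK}\, e_K .$ (1.10)
In fact, compatibility with the metric (see (1.1)) implies $\Gamma_{ijk} = -\Gamma_{ikj}$, and torsion free implies $c_{ijk} = \Gamma_{ijk} - \Gamma_{jik}$. It follows that
$\Gamma_{ijk} = \tfrac12\,(c_{ijk} + c_{kij} + c_{kji}) .$ (1.11)
Inserting into the geodesic equations, and taking into account the symmetric and antisymmetric terms, we see that (1.9) can be rewritten as
$\dot x_R = p_k\, e_{kR}(x), \qquad \dot p_k = -\,c_{kij}\, p_i\, p_j .$ (1.12)
In this formula we already observe the interplay between the Lie bracket structure and the metric.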
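For readers who want the intermediate algebra, the following sketch spells out how the two index symmetries determine the Christoffel symbols, and why a single term survives in the geodesic equations. It assumes an orthonormal adapted frame and writes all indices downstairs, with $\Gamma_{ijk} = g(D_{e_i} e_j, e_k)$ and $c_{ijk} = g([e_i, e_j], e_k)$ for i, j, k between 1 and m; this index placement is a notational assumption made for the sketch.

```latex
% Assumptions: orthonormal frame e_1..e_m of E; all indices run over 1..m.
\begin{align*}
\Gamma_{ijk} + \Gamma_{ikj} &= 0
  && \text{(metric compatibility)}\\
\Gamma_{ijk} - \Gamma_{jik} &= c_{ijk}
  && \text{(torsion free, after projecting on } E\text{)}\\[4pt]
c_{ijk} + c_{kij} + c_{kji}
  &= (\Gamma_{ijk} - \Gamma_{jik}) + (\Gamma_{kij} - \Gamma_{ikj})
     + (\Gamma_{kji} - \Gamma_{jki})\\
  &= 2\,\Gamma_{ijk}
  && \text{(using } -\Gamma_{ikj} = \Gamma_{ijk},\ -\Gamma_{jki} = \Gamma_{jik},\
     \Gamma_{kij} + \Gamma_{kji} = 0\text{)}\\[4pt]
\Gamma_{ijk}\, p_i\, p_j
  &= \tfrac12\,\bigl(c_{ijk} + c_{kij} + c_{kji}\bigr)\, p_i\, p_j
   = c_{kij}\, p_i\, p_j
  && \text{(}c_{ijk} = -c_{jik}\text{ drops out)}\\[4pt]
\dot p_k &= -\,c_{kij}\, p_i\, p_j ,
\qquad
\frac{d}{dt}\Bigl(\tfrac12\, p_k\, p_k\Bigr) = -\,c_{kij}\, p_k\, p_i\, p_j = 0 .
\end{align*}
```

The last identity is a consistency check: since $c_{kij}$ is antisymmetric in (k, i), the kinetic energy $\tfrac12\, p_k p_k$ is conserved along nonholonomic geodesics, as expected when no forces are applied.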
2. AN EXAMPLE
Cartan's viewpoint bypasses the traditional methods in constrained dynamics where Lagrange multiplier terms are added to represent the constraint forces (just to be eliminated afterwards); see Blajer 1995 and references therein. Moreover, when using the Euler-Lagrange equations, the system of ODEs comes in implicit form, which is inconvenient for numerical integration. In Cartan's approach all this information is embodied in the Christoffel symbols or, equivalently, in the structure coefficients. Cartan's approach provides an algorithmic way to derive the equations of motion for nonholonomic systems:
i) Compute the adapted orthonormal basis $e_i$, $e_\alpha$, say by the Gram-Schmidt procedure.
ii) Compute the structure coefficients c(x) by taking Lie brackets of the vector fields.
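The two steps above can be carried out by hand on a small toy example that is not taken from the text: $Q = \mathbb{R}^3$ with the Euclidean metric and the distribution E annihilated by $dz - x\,dy$ (so m = 2, n = 3). The sketch below is only an illustration under these assumptions; the names $f$, $e_1$, $e_2$, $e_3$, $p_1$, $p_2$ are introduced here.

```latex
% Step i): Gram-Schmidt. The generators \partial_x and \partial_y + x\,\partial_z
% of E are already orthogonal, so only normalisation is needed; e_3 spans E^\perp.
\[
f = (1+x^2)^{-1/2},\qquad
e_1 = \partial_x,\qquad
e_2 = f\,(\partial_y + x\,\partial_z),\qquad
e_3 = f\,(\partial_z - x\,\partial_y).
\]
% Step ii): the only bracket between E-vectors, using f' = -x f^3 and
% \partial_z = f\,(x\,e_2 + e_3).
\[
[e_1, e_2] = f'\,(\partial_y + x\,\partial_z) + f\,\partial_z
           = -\frac{x}{1+x^2}\,e_2 + \frac{1}{1+x^2}\,(x\,e_2 + e_3)
           = \frac{1}{1+x^2}\,e_3 .
\]
% Consequence: no structure coefficient with all three indices in {1,2} survives,
% so the quasivelocities are constant along nonholonomic geodesics.
\[
\dot p_1 = \dot p_2 = 0,\qquad
\dot x = p_1,\qquad
\dot y = \frac{p_2}{\sqrt{1+x^2}},\qquad
\dot z = \frac{p_2\,x}{\sqrt{1+x^2}} .
\]
```

As a cross-check, the Lagrange multiplier route for the same constraint $\dot z = x\,\dot y$ gives $\ddot x = 0$ and $\ddot y = -x\,\dot x\,\dot y/(1+x^2)$, which is what one obtains by differentiating the expression for $\dot y$ above.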
Building on the example in Cartan (1928, section 11), we start the derivation of the equations of motion for "Caplygin's sphere". Details of the derivation and a theoretical analysis will be provided elsewhere.
Caplygin's sphere is a dishonest billiard ball, namely, a non-homogeneous sphere of radius a and total mass m = 1 (without loss of generality), with moments of inertia I_{1}, I_{2}, I_{3}, rolling without slipping on a horizontal plane. The configuration space is $Q = \mathbb{R}^2 \times SO(3)$. It is assumed that the center of mass coincides with the geometric center. Cartan considered only the homogeneous (honest) case k = I_{1} = I_{2} = I_{3}.
We follow Arnol'd's notation (Arnol'd 1978), where capital letters denote vectors as seen from the body. We denote by $\omega$ the angular velocity viewed in the space frame, and by $\Omega$ the angular velocity as viewed in the body frame, which we may assume attached to the principal axes of inertia. Thus $R\,\Omega = \omega$, where $R \in SO(3)$ is the attitude matrix. Intrinsically speaking, this corresponds to left translation in the Lie group SO(3). Let Q, $|Q| = a$, be a material point in the sphere, viewed in the body frame. Thus in space
q = RQ + (x, y, a). (2.1)
Computing the total kinetic energy
T =
(Q)  dQyields, in the same fashion as in the rigid body with a fixed point,
,
(2.2)
where $\dot x = dx/dt$ and $\dot y = dy/dt$ are the Cartesian components of the velocity of the geometric center.
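The no-slip condition at the contact point in vector form is given by
$(\dot x, \dot y, 0) + \omega \times (0, 0, -a) = 0 ,$ (2.3)
that is,
$\dot x = a\,\omega_2, \qquad \dot y = -a\,\omega_1 .$ (2.4)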
Some comments are in order. Firstly, the component $\omega_3$ corresponds to pivoting around the contact point, and therefore is arbitrary. In fact, this distribution E of three-dimensional subspaces (m = 3) in the five-dimensional (n = 5) manifold $\mathbb{R}^2 \times SO(3)$ is a realization of Cartan's famous "2-3-5" distribution (see "Les systèmes de Pfaff à cinq variables et les équations aux dérivées partielles du second ordre", Cartan 1939). We observe that (2.4) could also be written in complex variables notation, $\dot x + i\,\dot y = -a\,i\,(\omega_1 + i\,\omega_2)$, which motivates studying nonholonomic systems in the context of pseudoholomorphic bundles.
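The passage from the vector form of the no-slip condition to its components, and then to the complex form, is a one-line cross-product computation. The sketch below assumes the notation $\omega = (\omega_1, \omega_2, \omega_3)$ for the space-frame angular velocity and places the contact point at $-a\,e_3$ relative to the center $(x, y, a)$ of the sphere.

```latex
% Velocity of the material point in contact with the plane must vanish:
% v_center + \omega \times r = 0, with r = (0, 0, -a).
\begin{align*}
0 &= (\dot x,\ \dot y,\ 0) + \omega \times (0,\ 0,\ -a)
   = (\dot x - a\,\omega_2,\ \ \dot y + a\,\omega_1,\ \ 0),\\[2pt]
&\Longrightarrow\qquad \dot x = a\,\omega_2, \qquad \dot y = -a\,\omega_1,\\[2pt]
\dot x + i\,\dot y &= a\,\omega_2 - i\,a\,\omega_1
   = -a\,i\,(\omega_1 + i\,\omega_2).
\end{align*}
```

Note that $\omega_3$ does not appear anywhere in the computation, which is the algebraic counterpart of the remark that pivoting about the contact point is unconstrained.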
It is very important in our context to observe that the constraints define a distribution E in Q which is both right SO(3)-invariant and $\mathbb{R}^2$-invariant. This brings us immediately to the issue of integrability of nonholonomic systems, which was introduced in (Koiller 1992) and extensively discussed in the collection edited by Cushman and Sniatycki 1998; see also Bates and Cushman 1999. There is a conflict between the left SO(3) invariance of the Hamiltonian (2.2) and the right SO(3) invariance of the constraints (2.4). When k = I_{1} = I_{2} = I_{3}, the Hamiltonian is also right-invariant,
$2T = k^2(\Omega_1^2 + \Omega_2^2 + \Omega_3^2) + \dot x^2 + \dot y^2 = k^2(\omega_1^2 + \omega_2^2 + \omega_3^2) + \dot x^2 + \dot y^2 ,$
and the problem is amenable to full right reduction and becomes easily integrable.
In fact, for the homogeneous case we use the ansatz
The adapted basis is the dual basis of the (incidentally, we provide a correction to the coefficients given in Cartan, 1928).
We compute 2T = /dt^{2}:
2T = (A^{2} + D^{2})( + ) + (B^{2} + a^{2}D^{2})(p^{2} + q^{2}) + r ^{2} + 2(AB  aD)(q  p)
The mixed term is zero provided
D = AB / a. (2.7)
If we also set
=
(2.8)
then
+
(2.9)
with
m = A^{2}(1 + B^{2} / a^{2}). (2.10)
The free parameters here are a, A and B. Another choice could be using a, as the free parameters,
,
(2.11)
/
(2.12)
so that
+
(2.13)
We now outline the procedure for the general case (also integrable) using Cartan's programme. To organize the calculations, we write the left invariant forms in SO (3) as
(2.14)
and we denote () = (P, Q, R). The last entry follows the alphabet, and the reader will forgive us for mixing up the notation in the left hand side.
Likewise, the rightinvariant forms are
(2.15)
and we denote () = (p, q, r). The relation R = corresponds formally to the adjoint representation. Now, if one desires explicit formulas using, say, Euler angles, it is sufficient to parametrize R = R(,,) and compute the left hand side of (2.14) and (2.15) in terms of the d, d, d.
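It is worthwhile, however, to proceed as intrinsically as possible. Let f_{1}, f_{2}, f_{3} be the right-invariant vector fields on SO(3) forming the basis dual to the right-invariant forms (2.15). The constraint distribution E is annihilated by the 1-forms
$dx - a\,\omega_2\,dt, \qquad dy + a\,\omega_1\,dt,$
and by inspection, we observe that the vectors
$f_3, \qquad f_1 - a\,\partial/\partial y, \qquad f_2 + a\,\partial/\partial x$ (2.16)
generate E. We complete to a basis of TQ with the vector fields $\partial/\partial x$, $\partial/\partial y$.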
Applying a Gram-Schmidt procedure on these vectors we get the basis $e_i$, $e_\alpha$, providing the starting point for Cartan's method. However, this preliminary step brings a certain amount of pain: since the inner product $\langle\ ,\ \rangle$ associated to T is left, but not right invariant, the orthonormal basis is neither left nor right invariant. Surprisingly, the system is still integrable. See Arnol'd et al. 1988.
To orthonormalize, we need the basis dual to the left-invariant forms (2.14), that is, the left-invariant vector fields F_{1}, F_{2}, F_{3} such that
$(F_1, F_2, F_3)\, R^{-1} = (f_1, f_2, f_3) .$ (2.17)
Thus in particular, f_{3} = R_{31}F_{1} + R_{32}F_{2} + R_{33}F_{3} so that
=
(2.18)
We define then: e_{1} = f_{3}/  f_{3} . In a similar fashion we compute e_{2} = /  , where
=
(2.19)
To compute the inner product we revert to the basis of leftinvariant vectorfields via (2.17) and we obtain
= (
(2.20)
It is clear that the calculations get increasingly involved but are within our powers.
3. GEOMETRIC INTERPRETATION
Direct and inverse "development" of frames and curves were so obvious to Cartan (and for that matter, also to Levi-Civita) that he (they) did not bother to give details. Actually, inverse parallel transport seems closest to their way of thinking. We elaborate these concepts, exhibiting explicitly (Theorem 3.4 below) a system of ODEs producing, at the same time, the solution of the nonholonomic system, a parallel frame along it, and a hodograph representation of the solution curves on the Euclidean space $\mathbb{R}^m$.
3.1. DIRECT PARALLEL TRANSPORT OF AN $E_{q_o}$-FRAME.
A frame for $E_{q_o} \subset T_{q_o}Q$ can be transported along a curve c(t) in Q. The "novelty" here (as stressed by Cartan): c(t) is an arbitrary curve in Q, that is, $\dot c(t)$ does not need to be tangent to E.
Recall that given a tangent vector $V^o = v_j^o\, e_j \in E_{q_o}$ and a curve $c(t) \in Q$, $c(0) = q_o$, there is a unique vector field $V(t) \in E_{c(t)}$, $V(0) = V^o$, such that $D_{\dot c}\, V(t) \equiv 0$. In fact, we are led to the linear time-dependent system of ODEs
(3.1)
(using $D_{\dot c}\, e_j = -\omega_{jk}(\dot c)\, e_k$ and (1.6)). In particular, an orthonormal frame at $E_{q_o}$ is transported to $E_{c(t)}$ and remains orthonormal.
3.2. HODOGRAPH OF AN $E_{c(t)}$-FRAME TO $\mathbb{R}^m$.
Given a curve of frames for TQ along c(t), $\{e_I(t)\}$, consider the frame for $E_{c(t)}$ formed by the first m vectors $e_i$. We have
$\nabla_{\dot c}\, e_i = -\omega_{ij}(\dot c)\, e_j - \omega_{i\alpha}(\dot c)\, e_\alpha .$
We develop a "mirror" or hodograph frame $\{U(t) : u_1(t), \dots, u_m(t)\}$ confined to $\mathbb{R}^m \equiv E_{q_o}$, $q_o = c(0)$, solving the system for $U(t) \in O(m)$ given by
(3.2)
Equivalently (by elementary matrix algebra)
= (
(3.3)
where u_{i} are the columns of U.
LEMMA 3.1. Let $\{U(0) : u_i(0) = e_i(0)\}$ be a frame for $E_{q_o}$. The hodograph of its direct parallel transport $\{e_1(t), \dots, e_m(t)\}$ along c(t) is the constant frame $U(t) \equiv U(0) = \mathrm{Id}$.
PROOF. This is because $D_{\dot c}\, e_i(t) \equiv 0$ iff $\omega_{ij}(\dot c(t)) \equiv 0$.
PROPOSITION 3.2. Let U(t) be a curve of frames in $\mathbb{R}^m$. Define a frame $\{\bar e_1(t), \dots, \bar e_m(t)\}$ for $E_{c(t)}$ by
$(\bar e_1, \dots, \bar e_m) = (e_1, \dots, e_m)\, U ,$ (3.4)
where $\{e_1, \dots, e_m\}$ is parallel along c(t). Then
i) $\bar e_1, \dots, \bar e_m$ satisfies $D(\bar e_1, \dots, \bar e_m) = (\bar e_1, \dots, \bar e_m)\,\bar\omega$ with $\bar\omega = U^{-1}\dot U$.
ii) The hodograph of $\bar e_1, \dots, \bar e_m$ to $\mathbb{R}^m$ is U(t).
PROOF. It suffices to prove i) and it is simple:
$D(\bar e_1, \dots, \bar e_m) = (e_1, \dots, e_m)\,\dot U + (De_1, \dots, De_m)\, U = (e_1, \dots, e_m)\,\dot U = (e_1, \dots, e_m)\, U\, U^{-1}\dot U = (\bar e_1, \dots, \bar e_m)(\bar\omega) .$
3.3. HODOGRAPH TO $\mathbb{R}^m$ OF A CURVE c(t) IN Q.
Consider c(t), a curve in Q, $c(0) = q_o$. As before, it is not assumed that $\dot c(t)$ is tangent to $E_{c(t)}$. Let $\{e_I : e_1(t), \dots, e_n(t)\}$ be a local orthonormal frame for $T_qQ$ along c(t) with $\{e_1, \dots, e_m\}$ tangent to $E_q$. Denote by $\omega_I$, I = 1, ..., n, the dual basis. Construct first the hodograph $\{u_1(t), \dots, u_m(t)\}$ of $\{e_1, \dots, e_m\}$ to $\mathbb{R}^m$, and then define
$\gamma(t) = \int_0^t \theta(\dot c(s))\, ds ,$ (3.5)
where $\theta$ is a 1-form with values in $\mathbb{R}^m$ given by
$\theta = u_j\, \omega_j$ (3.6)
and for short we wrote $\omega_j(t) = \omega_j(\dot c(t))$.
The curve $\gamma(t)$ in $\mathbb{R}^m \equiv E_{q_o}$ is called the hodograph of c(t) to $\mathbb{R}^m$. If $e_1(t), \dots, e_m(t)$ are parallel along c, then $\gamma(t)$ is given by
$\gamma(t) = \Bigl(\int_0^t \omega_1(s)\, ds, \dots, \int_0^t \omega_m(s)\, ds\Bigr),$
taking the coordinate axes of $\mathbb{R}^m$ along $u_1(0) \equiv u_1(t), \dots, u_m(0) \equiv u_m(t)$.
3.4. DEVELOPMENT ON Q OF A CURVE $\gamma(t) \in \mathbb{R}^m \equiv E_{q_o}$
In the other direction,
PROPOSITION 3.3. Given a curve $\gamma(t)$ in $\mathbb{R}^m \equiv E_{q_o}$, we can construct a unique curve c(t) in Q tangent to E whose hodograph is $\gamma(t)$. The curve c(t) is called the development of $\gamma(t)$.
PROOF. First, extend an $E_{q_o}$-adapted basis for $T_{q_o}Q$ to a neighborhood $U \subset Q$ of $q_o$, with corresponding forms $\omega_I$, $\omega_{IJ}$, I, J = 1, ..., n. Then consider the vector field in $U \times O(m)$ given by
(3.7)
Integrating this vectorfield we obtain a curve (c(t), U(t)).
We claim that the hodograph of c(t) is (t) = (t) (the vectorfield was constructed precisely for that purpose). Indeed, by the previous item,
which is equal to by elementary linear algebra: if v is any vector and U any invertible matrix, v = (U^{1}v)_{i}u_{i}, where u_{i} are the columns of U (U is the matrix changing coordinates from the basis u_{i} to the canonical basis).
What if we had used a different frame on U? We would get a system of ODEs
(3.9)
and we claim that Y = X so the curve c(t), is unique.
To prove this fact it is equivalent to show that T = UP, where P changes basis from $\bar e_i$, i = 1, ..., m, to $e_i$, i = 1, ..., m, that is,
$(\bar e_1, \dots, \bar e_m) = (e_1, \dots, e_m)\, P .$ (3.10)
We compute
$T^{-1}\dot T = (UP)^{-1}(\dot U P + U \dot P) = P^{-1}\dot P + P^{-1}(U^{-1}\dot U)P = P^{-1}\dot P + P^{-1}(\omega)P ,$
which is indeed the gauge-theoretical rule giving the forms $(\bar\omega)$ of the basis defining T from the forms $(\omega)$ of the basis $e_i$.
We can upgrade this construction to provide a parallel frame along c(t), by declaring $\bar\omega \equiv 0$. This gives
$\dot P + (\omega)\, P = 0 ,$ (3.11)
which could be added to system (3.7). Actually, we can take the equation for U out of that system, observing that $U = P^{-1}$ (PROOF: $\frac{d}{dt}\,U^{-1} = -U^{-1}\dot U\, U^{-1} = -(\omega)\, U^{-1}$).
THEOREM 3.4. Given a curve $\gamma(t) \in \mathbb{R}^m$, consider the nonautonomous system of ODEs in the frame manifold Fr(E) given by
(3.12)
It gives the developed curve c(t) on Q and an attached parallel frame
,
(3.13)
For a line $\gamma(t) = t\,v$ passing through the origin in $\mathbb{R}^m$ we obtain the nonholonomic geodesic starting at q with velocity $\dot c(0) = v$.
3.5. HODOGRAPH OF THE D'ALEMBERTLAGRANGE EQUATION
We elaborate on the comments of §7 in Cartan 1928 ["La trajectoire du système matériel, supposé soumis à des forces données de travail elémentaire
, se développe suivant la trajectoire d'un point matériel de masse 1 placé dans l'espace euclidien à m dimensions et soumis à la force de composantes .'']. Consider a mechanical system with kinetic energy T and external forces F (written in contravariant form, we lower indices using the metric so that F TQ), subject to constraints defined by the distribution E. The nonholonomic dynamics is given by
(3.14)
where the right hand side is the orthogonal projection of F onto E.
Let be the hodograph of c. Fix a constant frame ,^{ ... }, on ^{m} and write
(t) = (t) .
Let be the parallel frame along c(t) obtained in Theorem 3.4. Decompose
.
(3.15)
COROLLARY 3.5. (Cartan 1928, §7). Equation (3.14) is equivalent to
(
(3.16)
Equations (3.16) should be solved simultaneously with (3.12) and (3.13).
This approach can be helpful for setting up numerical methods, and in some cases for reducing the nonholonomic system to a second order equation on $\mathbb{R}^m$. We also observe that F can represent nonholonomic control forces acting on the system, such as those studied in Krishnaprasad et al. 1996.
4. EQUIVALENT CONNECTIONS
In this section and the next we discuss the question of whether two nonholonomic connections $D$ and $\bar D$ on E have the same geodesics.
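Given $A \in GL(n - m)$, $C \in O(m)$, $B \in M(m, n - m)$, we take
$\bar\omega_i = C_{ij}\,\omega_j + B_{i\alpha}\,\omega_\alpha, \qquad \bar\omega_\alpha = A_{\alpha\beta}\,\omega_\beta .$ (4.1)
This is the most general change of coframes preserving the sub-Riemannian metric
$\omega_1^2 + \cdots + \omega_m^2$ (4.2)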
supported on E. The corresponding dual frame satisfies
(4.3)
(here, for ease of notation we place scalars after vectors).
In matrix form, we have
=
(4.4)
,
(4.5)
Using matrix notation is not only convenient for the calculations, but also to set up the equivalence problem (Gardner 1989). Consider the linear group G of matrices of the form
(4.6)
The equivalence problem for sub-Riemannian geometry can be described as follows: Given coframes $\omega = (\omega_1, \dots, \omega_n)^t$ and $\Omega = (\Omega_1, \dots, \Omega_n)^t$ on open sets U and V, find invariants characterizing the existence of a diffeomorphism $F : U \to V$ satisfying $F^*\Omega = g \cdot \omega$, where g takes values in G. For sub-Riemannian geometry, see Montgomery 2001.
In nonholonomic geometry we are led to a more difficult equivalence problem (see section 6.3 below). In Cartan's 1928 paper, the nonholonomic connections are characterized only for a certain type of distributions, which we will call strongly nonholonomic. Interestingly, Cartan did not work out the associated invariants, even in this case. He focused on finding a special representative in the equivalence class of connections with the same geodesics.
Consider the modified metric on Q
=
(4.7)
and the associated LeviCivita connection . The geodesic equation is
,
(4.8)
To compare (4.8) and (1.6), there is no loss of generality in taking C = id. By inspection one gets:
PROPOSITION 4.1. (Cartan 1928, §5.) Fix C = id. The geodesics of $D$ and $\bar D$ are the same iff
$(\bar\omega_{ij} - \omega_{ij})(T) = 0$ (4.9)
for all T tangent to E.
5. PFAFFIAN SYSTEMS AND LIE ALGEBRAS OF VECTORFIELDS
5.1. EQUIVALENT 1FORMS
In view of (4.9) it seems useful to introduce the following
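DEFINITION 5.1. Two 1-forms $\omega$ and $\bar\omega$ are E-equivalent if $\omega - \bar\omega$ annihilates E. We write $\omega \equiv_E \bar\omega$ or simply $\omega \equiv \bar\omega$.
In the $C^\infty(Q)$-ring of differential forms $\Lambda^*(Q)$, consider the ideal $\mathcal{I}$ generated by the 1-forms $\omega_\alpha$, $\alpha = m + 1, \dots, n$. We can write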
(5.1)
where the superscript $\perp$ means "objects annihilated by".
Clearly $\omega \equiv \bar\omega$ is equivalent to $\omega - \bar\omega = f_\alpha\,\omega_\alpha$. More generally, two k-forms $\omega$ and $\bar\omega$ are said to be E-equivalent if their difference vanishes when all the slots $(v_1, \dots, v_k)$ are taken on E. Again, this means that $\omega - \bar\omega \in \mathcal{I}$. (In fact, given a Pfaffian system of 1-forms on Q,
$\omega_{m+1} = 0, \ \dots, \ \omega_n = 0,$
one can form the ideal $\mathcal{I}$ on $\Lambda^*(Q)$ generated by these forms. Every form that is annulled by the solutions of the system belongs to $\mathcal{I}$; see Choquet-Bruhat et al. 1997, p. 232.)
If $\omega \equiv \bar\omega$ it does not necessarily follow that $d\omega \equiv d\bar\omega$. For the latter to happen, the former must be equivalent over a larger subspace, $(\mathcal{I}^{(1)})^\perp \supset E$, which we now describe.
5.2. FILTRATIONS IN TQ AND IN T*Q
For background and a comprehensive review of the theory, see, e.g., Vershik and Gershkovich 1994. Let $\mathcal{I}$ be a Pfaffian system.
DEFINITION 5.2. The derived system $D(\mathcal{I})$ is
$\mathcal{I}^{(1)} = \{\theta \in \mathcal{I} : d\theta \equiv 0 \bmod \mathcal{I}\} .$ (5.2)
One constructs (see Bryant et al. 1991) the decreasing filtration
$\cdots \subset \mathcal{I}^{(2)} \subset \mathcal{I}^{(1)} \subset \mathcal{I}^{(0)} = \mathcal{I}$
defined inductively by
$\mathcal{I}^{(k+1)} = (\mathcal{I}^{(k)})^{(1)} .$
Here $\mathcal{I}$ is thought of as a submodule over $C^\infty(Q)$ consisting of all 1-forms generated by the $\omega_\alpha$. We assume all the $\mathcal{I}^{(k)}$ have constant rank. The filtration eventually stabilizes after a finite number of inclusions, and we denote this space $\mathcal{I}_{\mathrm{final}}$. By Frobenius theorem, the Pfaffian system $\mathcal{I}_{\mathrm{final}}$ is integrable. Fix a leaf S and consider the pull back of the filtration. That is, we pull back all forms by the inclusion $j : S \to Q$. The filtration associated to $j^*\mathcal{I}$ stabilizes at zero.
There is a dual viewpoint, more commonly used in nonholonomic control theory (Li and Canny 1993): given a distribution E in TQ one considers an increasing filtration
$E_o = E \subset E_1 \subset E_2 \subset \cdots$
Two (different) options are used by workers in this area:
1) $E_i = E_{i-1} + [E_{i-1}, E_{i-1}]$.
2) $E_i = E_{i-1} + [E_o, E_{i-1}] = E_{i-1} + \sum_{j+k=i-1} [E_j, E_k]$.
We follow the first option, which is recursive, and yields faster growth vectors. Moreover, the following fundamental duality result is easy to prove:
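LEMMA 5.3. $\mathcal{I}^{(1)} = E_1^{\perp}$ (equivalently $E_1 = (\mathcal{I}^{(1)})^{\perp}$).
PROOF. Let $X, Y, Z \in E$. Observe that $\theta \in E_1^{\perp}$ iff (check why)
$\theta(X + [Y, Z]) = 0 .$
Well, $\theta(X) = 0$ by default and (the correct signs do not matter)
$\theta([Y, Z]) = \pm\, d\theta(Y, Z) \pm Z\,\theta(Y) \pm Y\,\theta(Z) = 0$
as $\theta(Y) = \theta(Z) \equiv 0$ because $\theta \in \mathcal{I}$, and $d\theta(Y, Z) = 0$ because $\theta \in \mathcal{I}^{(1)}$.
6. MAIN RESULT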
6.1. STRONGLY NONHOLONOMIC DISTRIBUTIONS
The main question to be addressed in the local theory is the following. Assume that the geodesics of two Levi-Civita nonholonomic connections $D$ and $\bar D$ are the same. Proposition 4.1 says that a necessary and sufficient condition for this to happen is
$\bar\omega_{ij} \equiv_E \omega_{ij} .$
What are the implications of this condition in terms of the original coframes $\omega = (\omega_1, \dots, \omega_n)$ and $\bar\omega = (\bar\omega_1, \dots, \bar\omega_n)$? The answer is that it depends on the type of distribution E.
One extreme: suppose E is integrable, that is, $\mathcal{I}^{(1)} = \mathcal{I}$. There is a foliation of Q by m-dimensional manifolds whose tangent spaces are the subspaces $E_q$. Then it is clear that there are no further conditions. We can change the complement $F = E^{\perp}$ without any restriction, as well as the metric there. In fact, we can fix a leaf S and the Levi-Civita connection on S will coincide with the projected connection, no matter what is outside E.
The other extreme is the case studied in Cartan 1928:
DEFINITION 6.1. We say that the distribution E is of the strongly or maximally nonholonomic type if the derived Pfaffian system associated to E is zero.
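As a hedged illustration of this definition, on an example chosen here rather than taken from the text, consider $Q = \mathbb{R}^3$ and the Pfaffian system generated by the single form $\theta = dz - x\,dy$. The sketch assumes the usual notion of derived system (the forms of the system whose exterior derivative vanishes modulo the system) and checks both the form side and the dual bracket side.

```latex
% Assumed example: m = 2, n = 3, one constraint form \theta.
\begin{align*}
\theta &= dz - x\,dy, \qquad
E = \ker\theta = \mathrm{span}\{\partial_x,\ \partial_y + x\,\partial_z\},\\
d\theta &= -\,dx \wedge dy, \qquad
d\theta(\partial_x,\ \partial_y + x\,\partial_z) = -1 \neq 0,\\
d(h\,\theta) &= dh \wedge \theta + h\,d\theta \equiv h\,d\theta \pmod{\theta}
\quad\Longrightarrow\quad d(h\,\theta) \equiv 0 \ \text{only if } h = 0 .
\end{align*}
% Hence the derived system is zero. Dually, one bracket already fills TQ:
\[
[\partial_x,\ \partial_y + x\,\partial_z] = \partial_z \notin E
\quad\Longrightarrow\quad E_1 = E + [E, E] = TQ .
\]
```

So this distribution is of the strongly nonholonomic type, with growth vector (2, 3).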
In the modern terminology one says that the nonholonomicity degree is 2. We now prove
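THEOREM 6.2. (Cartan 1928, §5.) In the strongly nonholonomic case, the metrics $\bar g$ and g must have the same complementary subspaces. In other words: $B \equiv 0$. Thus
$F = E^{\perp}$
is intrinsically defined.
PROOF. Cartan used an argument that we found not so easy to decipher (see (6.6) in section 6.2 below). Thus we prefer to use a different argument to show that $B \equiv 0$. We start with the structure equation
$d\omega_i = -\,\omega_{ij} \wedge \omega_j - \omega_{i\alpha} \wedge \omega_\alpha .$
Since $\bar\omega_{ij} \equiv \omega_{ij}$ and $\bar\omega_i \equiv \omega_i$ (see Equation (4.1) with C = I) this implies
$d\bar\omega_i \equiv d\omega_i .$ (6.1)
Now Equation (6.4) yields
$d\bar\omega_i = d\omega_i + dB_{i\alpha} \wedge \omega_\alpha + B_{i\alpha}\, d\omega_\alpha$
and this innocent-looking expression, together with (6.1), yields
$B_{i\alpha}\, d\omega_\alpha \equiv 0 .$ (6.2)
Hence if the distribution is of strongly nonholonomic type then $B \equiv 0$.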
6.2. DIGRESSION
The following calculations are actually never explicitly written in Cartan 1928, it seems that Cartan does something equivalent to them mentally. A caveat: the connection forms are antisymmetric in the indices I, J but in general this will not be the case for the forms below. If desired, they will have to be antisymmetrized (a posteriori).
We begin by differentiating
The block ( ) is given by
( ) =  C()C^{1} + B()C^{1} + dC C^{1}
We can take C = const. = id since we are not changing the subspace E. In this case
=
(6.4)
and
) = (
(6.5)
From equation (6.5), Cartan observed:
PROPOSITION 6.3. The condition
is equivalent to)
(6.6)
Cartan showed that, under the hypothesis of the derived system being zero, (6.6) implies $B \equiv 0$. This follows from applying the matrix B to the structure equations
= 
(6.7)
Actually Cartan gave the expression (Cartan 1928, section §4) like
=
(6.8)
from which (6.2) gives
(6.9)
which is assumed to have only the trivial solution ["Nous allons, dans ce qui suit, nous borner au cas où les équations homogènes $c_{jk\alpha}\, u_\alpha = 0$ aux n − m inconnues $u_{m+1}, \dots, u_n$ n'admettent que la solution $u_\alpha = 0$. Cela revient à dire que le système dérivé se réduit à zéro" (Cartan 1928, §5)].
6.3. EQUIVALENCE PROBLEM FOR NONHOLONOMIC GEOMETRY
The method of equivalence is advertised by Cartan in the 1928 address ["La recherche des invariants d'un système d'expressions de Pfaff vis-à-vis d'un certain groupe de substitutions linéaires effectuées sur ces expressions" (Cartan 1928, section 4)], but interestingly, he did not apply the method to its full power. We now outline the equivalence problem.
Recall that for general distributions (6.2) leads to the condition
$B_{i\alpha}\,\omega_\alpha \in \mathcal{I}^{(1)} .$
The derivation was done in the particular case where C = id. But this is not a restriction. Replacing $\bar e_i = (e_i)C$ by the $e_i$ does not change the nonholonomic geometry and leads to the transition matrix
(6.10)
Then $(\bar\omega_i) = (\omega_i) + C^{-1}B(\omega_\alpha)$ and (6.2) becomes
$C^{-1}B(\omega_\alpha) \in \mathcal{I}^{(1)} ,$
and $C^{-1}$ can be removed because $\mathcal{I}^{(1)}$ is a module over the functions on Q.
In spite of Cartan's caveat ["Si le système dérivé n'est pas identiquement nul, le problème de la représentation géométrique du système matériel devient plus compliqué. On est obligé de distinguer différents cas, dans chacun desquels, par des conventions plus ou moins artificielles, on peut arriver à trouver un schéma géométrique approprié. Nous n'entreprendrons pas cette étude générale, dont l'intérêt géométrique s'évanouirait rapidement à mesure que les cas envisagés deviendraient plus compliqués" (Cartan 1928, §11)], we hope to raise interest in further research on the equivalence problem for nonholonomic geometry:
Given coframes $(\omega_I) = (\omega_i, \omega_\alpha)$ and $(\Omega_I) = (\Omega_i, \Omega_\alpha)$ on open sets U and V, find invariants characterizing the existence of a diffeomorphism $F : U \to V$ satisfying $F^*\Omega = g \cdot \omega$, where the substitutions are of the form
(6.11)
with
)
(6.12)
We recall that
(T^{*}Q) is the annihilator of E. The greek indices can be further decomposed into two parts:capital greek letters = 1,^{ ... }, r representing forms
^{(1)};lower case greek letters = m + r + 1,^{ ... }, n , where r = dim ^{(1)} , 0 r n  m .
Matrix B can be written B = (B_{1}, B_{2}) where the first is m x r and the second is m x (n  m  r). Condition (6.2) is equivalent to B_{2} 0 and our choice of basis implies that
d involve only the 's and the 's; d involve at least one of the .
The group of substitutions consist of matrices of the form
(6.13)
In terms of frames we have
(e_{i}) = ()C (e) = ()B_{1} + ()A_{o} + ()A_{1} (6.14) (e_{a}) = ()A_{2}which in particular shows:
THEOREM 6.4. With the above notations, we have:
i) (e_{i}, e_{a}) generate an intrinsic subspace [^{(1)}]^{^}, annihilated by
^{(1)}.ii) The e_{a}generate an intrinsic orthogonal complement F of E in [^{(1)}]^{^}.
iii) There is complete freedom to choose the remaining vectors (those with capital Greek indices) to complete the full frame for $T_qQ$.
7. NONHOLONOMIC TORSIONS AND CURVATURES
In this section we come back to the strongly nonholonomic case. Since B = 0 (and as we can take C = id) we have
=
(7.1)
We look at the original structure equations for the :
= (
(7.2)
where =  and we expand as a certain combination of the coframe basis , (at this point there is still freedom to choose the matrix A defining the ). The result is of the form
d =  +
+ smi.We now use to our advantage the condition
of Proposition 4.1. We can modify
There is a unique choice of p's making the 's symmetrical, namely
=
(7.4)
Summarizing, we have the Cartan structure equations for strongly nonholonomic connections:
THEOREM 7.1. (Cartan 1928, §6). Consider the nonholonomic connection $\bar D$ with connection forms (1.4) modified as in (7.3). Then $\bar D$ and D have the same geodesics and
= 
(7.5)
=
(7.6)
The forms =  are uniquely defined by the symmetry requirement
=
(7.7)
Cartan did not invest in computing curvatures ["En même temps qu'une torsion, le développement comporte une courbure, dont il est inutile d'écrire l'expression analytique" (Cartan 1928, §8). We take Cartan's words as dogma, perhaps to be subverted in future work]. The curvature forms for the connection would be helpful to compute characteristic classes of the bundle $E \to Q$.
7.1. A CANONICAL CHOICE OF METRIC IN F = E^{^}
Assuming the strongly nonholonomic hypothesis, (6.9) yields for each pair of indices j k,
(7.8)
Interpreted as a linear system for the $u_\alpha$, this in particular implies
$\tfrac12\, m(m - 1) \ge n - m$, or $m(m + 1) \ge 2n$.
We now work on the change of coframes
=
(7.9)
The differentials of the latter are given by:
d = Aad + mod
d = Aa(  ) + mod
Now,
=
(7.10)
so that
=
(7.11)
with
=
(7.12)
We can choose the matrix A uniquely by a Gram-Schmidt procedure on the n − m linearly independent vectors $(c_{ij\alpha})$, $\alpha = m + 1, \dots, n$, in $\mathbb{R}^{m(m-1)/2}$.
Thus we obtain the conditions on the bivectors (Cartan's terminology):
(7.13)
From this point on, in order to maintain the orthonormality conditions, the change of coframes must be restricted to $A \in O(n - m)$. Hence we get
THEOREM 7.2. (Cartan 1928, §9). Assume the strongly nonholonomic case. The conditions (7.13) define uniquely a metric on $TQ = E \oplus F$.
7.2. GEOMETRIC INTERPRETATION OF TORSION
Recall the $\mathbb{R}^m \equiv E_{q_o}$ valued 1-form given by (3.6),
$\theta$ = "$d\gamma$" $= u_j\, \omega_j ,$
which is the integrand of (3.5). The quotes indicate that this is a loose notation: "$d\gamma$" is not exact. Indeed, we compute
$d\theta = d\omega_k\, u_k + du_j \wedge \omega_j .$
Now, $du_j = -\,\omega_{jk}\, u_k$ by construction, so by Proposition 7.1
=
(7.14)
In the strongly nonholonomic case, Theorem 7.1 gives
=
(7.15)
PROPOSITION 7.3. (Cartan 1928, §8). Consider an infinitesimal parallelogram in Q spanned by vectors u, v in T_{q}Q, and the associated infinitesimal variation d(u, v) in ^{m}. If u, v belong to E_{q} there is no variation in ^{m} after the cycle. For u E_{q} and v E_{q}^{^} = F_{q} the variation is given by the torsion coefficients 's. For u, v F_{q} the variation is determined by the coefficients sm _{i}'s.
The symmetry (7.7) has the following interpretation:
(
(7.16)
with $u, v \in E_q$, $n \in F_q$.
One can consider the nonholonomic connection on F associated to the metric . Moreover, one can repeat the procedure in Theorem 7.1. Write
=
(7.17)
where the ambiguity on the 's can be removed by changing to another = + mod[] and imposing the symmetry = and the antisymmetry =  .
Mutatis mutandis, the geometric interpretation of the torsion coefficients 's and $c_{ij\alpha}$'s is analogous. In particular, there is no torsion for pairs $u, v \in F$.
It seems that these geometric interpretations were forgotten by geometers from the 1960s on. For instance, in the very influential lectures (Hicks 1965, p. 59), it is written: "as far as we know, there is no nice motivation for the word torsion".
7.3. THE CASE WHERE F IS INTEGRABLE
When the torsion coefficients in (7.5) all vanish, d = , then all the forms d belong to the ideal
= [
(7.18)
so by the Frobenius theorem, the distribution $F = E^{\perp}$ is integrable.
One can construct a local fibration $Q \to B$, whose fibers are (pieces of) $F$-leaves. Choose coordinates $(q_1, \ldots, q_m, q_{m+1}, \ldots, q_n)$ on $Q$, such that $(q_1, \ldots, q_m)$ are coordinates on $B$ and the fibration is $(q_1, \ldots, q_m, q_{m+1}, \ldots, q_n) \mapsto (q_1, \ldots, q_m)$. The distribution $E$ will be given by
$dq_\alpha = b_{\alpha i}\, dq_i$. (7.19)
If the functions $b_{\alpha i}$ do not depend on the last $n - m$ coordinates, we have locally an $\mathbb{R}^{n-m}$ action on $Q \to B$ and a connection on this (local) principal bundle. More generally, one can formulate the following equivalence problems:
1) Given a coframe $(\omega_1, \ldots, \omega_n)$ on $Q$, find a Lie group $G$ of dimension $n - m$, a diffeomorphism $F: Q \to P = B \times G$ and a connection on the principal bundle $P$ such that the distribution $E: \omega_\alpha = 0$ on $Q$ corresponds to the horizontal spaces of the connection on $P$.
2) Add to the previous a Riemannian metric $g$ on $Q$ and assume that it is $G$-equivariant.
3) Same, requiring that the vertical and horizontal spaces are $g$-orthogonal. In other words, the vertical spaces $b \times G$ correspond to the leaves of $F$.
Case 2) was considered in (Koiller 1992). The nonholonomic connection on $E$ projects to a connection on $B$. Alternatively, it is also possible to use the Levi-Civita Riemannian connection $D^B$ on $B$, relative to the projected metric. We get an equation of the form
$D^B_{\dot b}\, \dot b = K(b) \cdot \dot b$,
where $K$ is antisymmetric. The force on the right-hand side is gyroscopic (it does not produce work).
This force $K$ vanishes in case 3). This seems to be what Cartan had in mind in the abelian case ["If, in the expression for the kinetic energy (force vive) of the system, one takes the constraint equations into account, one obtains a quadratic form in $q'_1, \ldots, q'_m$, with coefficients that are functions of $q_1, \ldots, q_m$. One can then apply the ordinary Lagrange equations." (Cartan 1928, §10)]. We observe that in the example of Chaplygin's sphere, we have the principal bundle with connection $\mathbb{R}^2 \times SO(3) \to SO(3)$, and the vertical and horizontal spaces are not orthogonal. A "nonholonomic force" is present in the reduced system even in Cartan's homogeneous case.
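The claim that a gyroscopic force produces no work can be checked in one line for a reduced equation of the form $D^B_{\dot b}\dot b = K(b)\cdot\dot b$. The sketch below assumes that the antisymmetry of $K$ is understood with respect to the projected metric $\langle\cdot,\cdot\rangle$ on $B$, and uses the fact that the Levi-Civita connection is compatible with that metric:

```latex
\frac{d}{dt}\,\tfrac12 \langle \dot b, \dot b \rangle
  \;=\; \langle D^B_{\dot b}\,\dot b,\ \dot b \rangle
  \;=\; \langle K(b)\,\dot b,\ \dot b \rangle
  \;=\; 0 .
```

Under these assumptions the kinetic energy of the reduced motion on $B$ is conserved, even though the trajectories are not geodesics of the projected metric unless $K = 0$.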
8. RESTRICTED CONNECTIONS
In this section we adopt an "internal" point of view, as opposed to the "extrinsic" approach of the preceding ones. It is quite fragmentary and tentative, aiming to propose directions for future work. We change the notation for the configuration space, which will be denoted $M$.
Consider a subbundle $E \to M$ of $TM$, and a vector bundle $H \to M$.
DEFINITION 8.1. An $E$-connection $D$ on $H$ is an operator $D_X s$, for $X$ a section of $E$ and $s$ a section of $H$, satisfying:
i) $D$ is $\mathbb{R}$-linear in $X$ and $s$, and $C^\infty(M)$-linear in $X$.
ii) $D$ is Leibnizian in $s$: $D_X(fs) = X(f)\, s + f\, D_X s$.
To emphasize the fact that $X \in E$, we also call this object an $E$-restricted connection. When $H = E$, an $E$-restricted connection on $E$ will be called a nonholonomic connection on $E$.
A comment is in order. This definition seems natural here, but we have searched the literature and have not found it. In fact, given a vector bundle $H \to M$, the usual notion of a connection $D$ on $H$ (see e.g. Milnor and Stasheff 1974, appendix C) means a $TM$-connection on $H$, in the sense of our Definition 8.1. We will call those full connections. That is, $X$ is allowed to be any section of $TM$, so one is able to covariantly differentiate along any curve $c(t)$ in $M$. The difference in Definition 8.1 is that the covariant differentiation is defined just for curves with $\dot c \in E$. Therefore, to avoid confusion, we call the connection in Definition 8.1 a restricted connection.
Given a full connection, evidently, it can be restricted to $E$ or $F$. Given a (restricted) $E$-connection on $H$, can it always be extended to a (full) $TM$-connection? The answer is yes. Consider the following "cut-and-paste" or "genetic engineering" operations:
Let
$TM = E \oplus F$ (8.1)
be a Whitney sum decomposition with projection operators denoted by $P$ (over $E$ parallel to $F$) and $Q$ (over $F$ parallel to $E$).
i) Given a (full) connection D_{X}Y on TM, it induces full connections D^{1} on E and D^{2} on F, by restricting Y to one of the factors (say, E) and projecting the covariant derivative D_{X}Y over this factor. Since full connections are plentiful, so are restricted ones.
ii) Given $D^1$, $D^2$, $E$-restricted ($F$-restricted, respectively) connections on $H$, it is obvious that
$D_X s = D^1_{PX}\, s + D^2_{QX}\, s$
defines a full connection on $H$.
PROPOSITION 8.2. Given a nonholonomic connection $D^{(E, E)}$ on $E$, and $D^{(F, E)}$ an $F$-connection on $E$, the rule
$D_X s = D^{(E, E)}_{PX}\, s + D^{(F, E)}_{QX}\, s$ (8.2)
defines a $TM$-connection on $E$ extending $D^{(E, E)}$. Here $P$ and $Q$ are respectively the projections on $E$ (resp. $F$) along $F$ (resp. $E$).
REMARK 8.3. However, given an $E$-restricted connection $D^1$ in $E$ and an $F$-restricted connection $D^2$ in $F$, the rule
$D_X Y = D^1_{X_1} Y_1 + D^2_{X_2} Y_2$
(where $X = X_1 + X_2$ and $Y = Y_1 + Y_2$ are the decompositions along $E \oplus F$) fails to define a connection in $TM$, because
$X_1(f)\, Y_1 + X_2(f)\, Y_2 \neq X(f)\, Y$.
The equivalence problem can be rephrased as follows: characterize the class of full connections $D$ on $M$ such that their nonholonomic restrictions $D^E = D^{(E, E)}$ have the same geodesics.
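The failure of the Leibniz rule in Remark 8.3 can be spelled out as follows. This is only a sketch, assuming the decompositions $X = X_1 + X_2$ and $Y = Y_1 + Y_2$ with $X_1, Y_1 \in E$ and $X_2, Y_2 \in F$, and applying the Leibniz property of each restricted connection separately:

```latex
\begin{aligned}
D_X(fY) &= D^1_{X_1}(fY_1) + D^2_{X_2}(fY_2) \\
        &= X_1(f)\,Y_1 + X_2(f)\,Y_2 + f\,D_XY , \\
X(f)\,Y &= X_1(f)\,Y_1 + X_2(f)\,Y_2 + X_1(f)\,Y_2 + X_2(f)\,Y_1 .
\end{aligned}
```

The defect is therefore the pair of cross terms $X_1(f)\,Y_2 + X_2(f)\,Y_1$. These are absent in Proposition 8.2, where the section being differentiated is not split and only the direction $X$ is decomposed.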
8.1. PARALLEL TRANSPORT AND GEODESICS
The basic facts about $TM$-connections (see Hicks 1965, chapter 5) hold also for $E$-restricted connections. For instance,
i) $(D_X Y)_m$ depends only on the values of the section $Y$ of $H$ along any curve $c(t) \in M$ with $\dot c(0) = X_m$.
ii) Parallel transport of a vector $h_o \in H_m$ is defined along a curve $c(t) \in M$ with $\dot c \in E$.
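We slightly change the usual proofs (Hicks 1965). Take a local basis $\{h_j\}$, $j = 1, \ldots, p$, trivializing $H$ over a neighborhood $U \subset M$, and vector fields $e_1, \ldots, e_q$ on $U \subset M$ generating $E$. Here $q$ is the dimension of the fiber of $E$ and $p$ the dimension of the fiber of $H$. We define $p^2 q$ functions $\Gamma^i_{jk}$ ($1 \le i, j \le p$, $1 \le k \le q$) on $U$ by
$D_{e_k} h_j = \sum_i \Gamma^i_{jk}\, h_i$. (8.3)
Write
$\dot c(t) = \sum_k u_k(t)\, e_k(c(t))$.
We search $a_j(t)$ such that $h(t) = \sum_j a_j(t)\, h_j$ satisfies
$D_{\dot c}\, h(c(t)) \equiv 0$.
We get a linear system of ODEs in $p$ dimensions,
$\dot a_i + \sum_{j,k} \Gamma^i_{jk}(t)\, u_k(t)\, a_j = 0$, (8.4)
where $\Gamma^i_{jk}(t) = \Gamma^i_{jk}(c(t))$.
Recall that an $E$-connection on itself (that is, $H = E$) is called a nonholonomic connection on $E$. The equation $D_{\dot c}\, \dot c = 0$ gives a nonlinear system in $n + p$ dimensions (where $n$ is the dimension of $M$ and $p = q$ is the dimension of $E$) for $x$ and $a$, given by
$\dot x_r = \sum_k a_k\, e^k_r(x)$, $\dot a_i + \sum_{j,k} \Gamma^i_{jk}(x)\, a_j a_k = 0$. (8.5)
Here $e^k_r$ is the $r$-th component ($1 \le r \le n$) of the $k$-th $E$-basis vector ($1 \le k \le p$) in terms of a standard trivialization of $TM$.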
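The linear transport system can be integrated numerically once the connection coefficients are known. The sketch below is illustrative only and rests on assumptions not fixed by the text: the coefficients are stored as an array `G[i, j, k]` defined by $D_{e_k} h_j = \sum_i G_{ijk}\, h_i$, the curve velocity is expanded as $\dot c = \sum_k u_k e_k$, and the transport equation is taken in the form $\dot a_i = -\sum_{j,k} G_{ijk}\, u_k\, a_j$. The function names, the classical Runge-Kutta integrator, and the toy data are hypothetical choices.

```python
import numpy as np


def transport_rhs(t, a, gamma, curve, velocity):
    """Right-hand side da_i/dt = -sum_{j,k} G[i, j, k] * u[k] * a[j]."""
    G = gamma(curve(t))      # coefficients at the point c(t), shape (p, p, q)
    u = velocity(t)          # components of c'(t) in the E-frame, shape (q,)
    return -np.einsum("ijk,k,j->i", G, u, a)


def parallel_transport(a0, gamma, curve, velocity, t_end, steps=1000):
    """Integrate the transport equation from t = 0 to t_end with classical RK4."""
    a = np.asarray(a0, dtype=float)
    h = t_end / steps
    for n in range(steps):
        t = n * h
        k1 = transport_rhs(t, a, gamma, curve, velocity)
        k2 = transport_rhs(t + h / 2, a + h / 2 * k1, gamma, curve, velocity)
        k3 = transport_rhs(t + h / 2, a + h / 2 * k2, gamma, curve, velocity)
        k4 = transport_rhs(t + h, a + h * k3, gamma, curve, velocity)
        a = a + h / 6 * (k1 + 2 * k2 + 2 * k3 + k4)
    return a


# Toy data (p = 2, q = 1): constant antisymmetric coefficients, unit speed.
G0 = np.zeros((2, 2, 1))
G0[0, 1, 0] = -1.0
G0[1, 0, 0] = 1.0

a_final = parallel_transport(
    a0=[1.0, 0.0],
    gamma=lambda x: G0,                  # coefficients independent of the point
    curve=lambda t: np.array([t]),       # hypothetical curve c(t)
    velocity=lambda t: np.array([1.0]),  # u_1(t) = 1
    t_end=np.pi / 2,
)
```

In this toy case the system reduces to $\dot a_1 = a_2$, $\dot a_2 = -a_1$, so the transported vector simply rotates and its Euclidean norm is preserved. For the geodesic system one would couple the same right-hand side, with $u = a$, to the equation for the base point.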
8.2. TORSION AND CURVATURE
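Let $D$, $\bar D$ be two (full) $TM$-connections, and $D^E$, $\bar D^E$ their restrictions as $E$-connections along $F$. Consider the difference tensor on $TM$
$B(X, Y) = \bar D_X Y - D_X Y$, $X, Y \in \Gamma(TM)$.
Clearly $B$ is $C^\infty(M)$-linear in both slots. Decompose $B = S + A$ into symmetric and skew-symmetric pieces:
$S(X, Y) = \tfrac12 [B(X, Y) + B(Y, X)]$, $A(X, Y) = \tfrac12 [B(X, Y) - B(Y, X)]$. (8.6)
Consider also the torsions
$T(X, Y) = D_X Y - D_Y X - [X, Y]$, $\bar T(X, Y) = \bar D_X Y - \bar D_Y X - [X, Y]$. (8.7)
It is easy to verify
$\bar T(X, Y) - T(X, Y) = 2 A(X, Y)$. (8.8)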
These objects clearly make sense in the restricted version. Recall (Hicks 1965, 6.5) the notion of torsion associated to a (1, 1) tensor $P$ ($m \in M \mapsto P_m \in \mathrm{End}(T_mM)$):
$T_P(X, Y) = D_X P(Y) - D_Y P(X) - P[X, Y]$. (8.9)
Here we take for the operator $P$ the projection over $E$ along $F$. In the context of restricted connections, $X, Y$ are vector fields in $E$.
DEFINITION 8.4. Let $B^E$ be the restriction of $B$ to $E$, with values projected on $E$ along $F$, and similarly define $S^E$, $A^E$. Define the restricted torsion by
$T^E(X, Y) = D^E_X Y - D^E_Y X - P[X, Y]$. (8.10)
The latter is an $E_m$-valued tensor, $(v_m, w_m) \in E_m \times E_m \mapsto T^E(v_m, w_m)$.
THEOREM 8.5. The following are equivalent:
a) $D^E$ and $\bar D^E$ have the same $E$-geodesics.
b) $B^E(X, X) = 0$ for all $X \in \Gamma(E)$.
c) $S^E = 0$.
d) $B^E = A^E$.
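The equivalences in Theorem 8.5 rest on two elementary identities, sketched here under the assumption that the second connection is written $\bar D^E$ and that $B^E = \bar D^E - D^E$ is decomposed as $S^E + A^E$ into symmetric and skew-symmetric parts. The first identity compares the two geodesic equations along a curve with $\dot c \in E$; the second is the usual polarization formula:

```latex
\begin{aligned}
\bar D^E_{\dot c}\,\dot c &= D^E_{\dot c}\,\dot c + B^E(\dot c, \dot c), \\
S^E(X, Y) &= \tfrac12 \bigl[\, B^E(X + Y, X + Y) - B^E(X, X) - B^E(Y, Y) \,\bigr].
\end{aligned}
```

Thus the geodesics coincide exactly when $B^E$ vanishes on the diagonal, and by polarization this happens exactly when the symmetric part $S^E$ vanishes, that is, when $B^E$ equals its skew-symmetric part $A^E$.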
COROLLARY 8.6. The restricted connections $D^E$ and $\bar D^E$ are equal if and only if they have the same geodesics and the same restricted torsion tensors.
COROLLARY 8.7. Given a restricted connection $\bar D^E$, there is a unique restricted connection $D^E$ having the same geodesics as $\bar D^E$ and zero restricted torsion.
PROOF. The results in (Hicks 1965, section 5.4) follow ipsis litteris in the restricted context. For instance, we show the latter. Uniqueness follows from Corollary 8.6. To show the existence, we define
$D^E_X Y = \bar D^E_X Y - \tfrac12 \bar T^E(X, Y)$. (8.11)
$D^E$ is clearly an $E$-connection. We compute
$B^E(X, Y) = \tfrac12 \bar T^E(X, Y) = A^E(X, Y)$, $S^E = 0$,
since the torsion is skew-symmetric. Since $S^E = 0$, they have the same geodesics. Finally, a simple calculation gives
$T^E = \bar T^E - 2A^E = 0$,
so $D^E$ has zero torsion.
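The "simple calculation" at the end of the proof can be written out as follows. This is a sketch under the assumptions that the new connection is $D^E_X Y = \bar D^E_X Y - \tfrac12 \bar T^E(X, Y)$ and that the restricted torsion is $T^E(X, Y) = D^E_X Y - D^E_Y X - P[X, Y]$, with $P$ the projection over $E$ along $F$:

```latex
\begin{aligned}
T^E(X, Y) &= D^E_X Y - D^E_Y X - P[X, Y] \\
          &= \bar D^E_X Y - \tfrac12 \bar T^E(X, Y)
             - \bar D^E_Y X + \tfrac12 \bar T^E(Y, X) - P[X, Y] \\
          &= \bar T^E(X, Y) - \tfrac12 \bar T^E(X, Y) - \tfrac12 \bar T^E(X, Y) \;=\; 0 .
\end{aligned}
```

The last step uses only the skew-symmetry $\bar T^E(Y, X) = -\bar T^E(X, Y)$.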
In terms of the original full connection, there is still too much freedom. We can extend $D^E$ to a full connection with arbitrary completions $D^{F, E}$, $D^{TM, F}$. In the spirit of Cartan's approach, one would like to characterize special completions. We plan to pursue this in future work. For this purpose, Vilms (1967) and Vagner (1965) can provide the starting point.
ACKNOWLEDGEMENTS
Supported in part by fellowships from FAPERJ and CNPq/PCI. We thank Prof. Manfredo do Carmo for encouragement and many stories about Cartan.
Correspondence to: Jair Koiller
E-mail: jair@lncc.br
REFERENCES
ARNOLD V. 1978. Mathematical Methods in Classical Mechanics, Springer-Verlag.
ARNOL'D V, KOZLOV V AND NEISHTADT A. 1988. Dynamical Systems III, Springer-Verlag.
BATES L AND CUSHMAN R. 1999. What is a completely integrable nonholonomic dynamical system? Rep Math Phys, 44(1/2): 29-35.
BLAJER W. 1995. An orthonormal tangent space method for constrained multibody systems, Comput Methods Appl Mech Engrg, 121: 45-57.
BRYANT RL, CHERN SS, GARDNER RB, GOLDSCHMIDT HL AND GRIFFITHS PA. 1991. Exterior Differential Systems, Springer-Verlag.
CARTAN E. 1928. Sur la représentation géométrique des systèmes matériels non holonomes, Proc Int Congr Math, Bologna, 4: 253-261.
CARTAN E. 1939. Selecta, Gauthier-Villars; 1910, Ann Sci École Norm Sup, 27: 109-192.
CHOQUET-BRUHAT Y, DEWITT-MORETTE C AND DILLARD-BLEICK M. 1997. Analysis, Manifolds and Physics, North-Holland.
CUSHMAN R AND ŚNIATYCKI J. 1998. Proceedings of the Pacific Institute of Mathematical Sciences workshop on nonholonomic constraints in dynamics, Rep Math Phys, 42(1/2).
DE LEÓN M AND MARTIN DE DIEGO D. 1996. On the geometry of nonholonomic Lagrangian systems, J Math Phys, 37: 3389-3414.
GARDNER R. 1989. The Method of Equivalence and its Applications, SIAM.
HICKS NJ. 1965. Notes on Differential Geometry, Van Nostrand.
KOILLER J. 1992. Reduction of some classical nonholonomic systems with symmetry, Arch Rational Mech Anal, 118: 113-148.
KOSCHORKE U. 1981. Vector fields and other vector bundle morphisms: a singularity approach. Lecture Notes in Mathematics 847, Springer, Berlin.
KRISHNAPRASAD PS, BLOCH A, MARSDEN JE AND MURRAY RM. 1996. Nonholonomic Mechanics and Symmetry, Arch Rational Mech Anal, 136(1): 21-99.
LI Z AND CANNY JF. 1993. Nonholonomic Motion Planning, Kluwer.
MILNOR J AND STASHEFF JD. 1974. Characteristic Classes, Princeton Univ. Press.
MONTGOMERY R. 2001. A tour of subriemannian geometries: geodesics and applications, American Mathematical Society, to appear.
POSTNIKOV A, SHAPIRO B AND SHAPIRO M. 1999. Algebras of curvature forms on homogeneous manifolds. In: Differential topology, infinite-dimensional Lie algebras, and applications, 227-235, Amer Math Soc Transl Ser 2, 194, Amer Math Soc, Providence, RI.
VAGNER VV. 1965. Geometria del calcolo delle variazioni, Ed. Cremonese.
VERSHIK AM AND GERSHKOVICH V. 1994. Nonholonomic dynamical systems, geometry of distributions and variational problems, in V.I. ARNOLD, S.P. NOVIKOV (Eds.), Dynamical Systems VII, Springer-Verlag.
VILMS J. 1967. Connections on tangent bundles, J Diff Geometry, 1: 235-243.
Publication Dates
Publication in this collection: 08 June 2001
Date of issue: June 2001
History
Received: 05 Feb 2001
Accepted: 12 Feb 2001