Non-Holonomic Connections Following Élie Cartan

In this note we revisit E. Cartan's address at the 1928 International Congress of Mathematicians in Bologna, Italy. The distributions considered here are of the same class as those considered by Cartan, a special type which we call strongly non-holonomic. We set up the groundwork for extending Cartan's method of equivalence (a powerful tool for obtaining invariants associated to geometrical objects) to more general non-holonomic distributions.


Introduction
In this note we revisit E. Cartan's address [1] at the 1928 International Congress of Mathematicians in Bologna, Italy. The distributions considered by Cartan were of a special type which we call strongly non-holonomic. Our aim is to set up the groundwork for extending Cartan's method of equivalence (a powerful tool for obtaining invariants associated to geometrical objects, [2]) to more general non-holonomic distributions. This is a local study, but we outline some global aspects. If the configuration space Q is a manifold of dimension n, its tangent bundle TQ should admit a smooth subbundle E of dimension m, m < n. As is well known, this imposes topological constraints on Q, see [3]. Although we will be discussing only local invariants, we hope these will help in constructing global ones, such as special representations for the characteristic classes [4], [5].
Notation. Throughout this paper we use the following convention: capital roman letters I, J, K, etc. run from 1 to n. Lower case roman letters i, j, k run from 1 to m (representing the constraint distribution). Greek letters α, β, γ, etc. run from m + 1 to n. Summation over repeated indices is assumed unless otherwise stated.

Non-holonomic connections
We fix a Riemannian metric g on Q and let $\nabla$ be the associated Riemannian connection (torsion free and metric preserving):
(1) $\nabla_X Y - \nabla_Y X = [X, Y]$, $X\langle Y, Z\rangle = \langle \nabla_X Y, Z\rangle + \langle Y, \nabla_X Z\rangle$.
In Section 8 we consider an arbitrary affine connection (see [9]) on Q. Recall that given a local orthonormal frame $e_I$ on $V \subset Q$ and its dual coframe $\omega_J$, the connection $\nabla$ is described by local 1-forms $\omega_{IJ} = -\omega_{JI}$ such that
(2) $\nabla_X e_J = \omega_{IJ}(X)\, e_I$.
The torsion tensor is $T(X, Y) = \nabla_X Y - \nabla_Y X - [X, Y] = t_I(X, Y)\, e_I$, and expanding the left hand side we get the structure equations
(3) $d\omega_I + \omega_{IJ} \wedge \omega_J = t_I$.
As the Riemannian connection $\nabla$ is torsion free, $t_I \equiv 0$.
We assume henceforth that the frame is adapted to the distribution E. This means that the $\{e_i(q)\}$ span the subspace $E_q$, $q \in Q$, and the remaining $\{e_\alpha\}$ span the g-orthogonal space $F_q = E_q^\perp$.
Definition 2.1. The (Levi-Civita) non-holonomic connection on E is defined by the rule
(4) $D_X e_j = \omega_{ij}(X)\, e_i$, (i, j = 1, ..., m).
Here we allow X to be any vector field on Q, not necessarily tangent to E. Notice that for vector fields Y, Z tangent to E, the metric compatibility
(5) $X\langle Y, Z\rangle = \langle D_X Y, Z\rangle + \langle Y, D_X Z\rangle$
still holds. The motivation for this definition is D'Alembert's principle. Consider a mechanical system with kinetic energy $\tfrac12 g(\dot c, \dot c)$ and applied forces F, subject to constraints such that $\dot c \in E_{c(t)}$. The constraining force $\nabla_{\dot c}\dot c - F$ is g-perpendicular to the constraint subspace $E_{c(t)}$, since it does not produce work.
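The following short derivation may help connect Definition 2.1 with (5) and with D'Alembert's principle. It is only a sketch: the symbol $P_E$ for the g-orthogonal projection of TQ onto E is notation introduced here for illustration, and the frame is assumed orthonormal and adapted, as in the text.

```latex
% Assumption: (e_i, e_\alpha) is a g-orthonormal adapted frame, so \omega_{IJ} = -\omega_{JI}.
% Splitting rule (2) into its E- and F-components:
\nabla_X e_j = \omega_{ij}(X)\, e_i + \omega_{\alpha j}(X)\, e_\alpha
\quad\Longrightarrow\quad
D_X e_j = \omega_{ij}(X)\, e_i = P_E\,(\nabla_X e_j).

% For Y, Z tangent to E, the F-component of \nabla_X Y is orthogonal to Z
% (and likewise with Y and Z exchanged), hence
X\langle Y, Z\rangle
  = \langle \nabla_X Y, Z\rangle + \langle Y, \nabla_X Z\rangle
  = \langle D_X Y, Z\rangle + \langle Y, D_X Z\rangle .

% D'Alembert: \nabla_{\dot c}\dot c - F \perp E together with \dot c \in E gives
D_{\dot c}\,\dot c = P_E\,(\nabla_{\dot c}\dot c) = P_E\,F .
```

In particular, with no applied forces the constrained motion satisfies $D_{\dot c}\dot c = 0$.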
Unless otherwise mentioned, we assume there are no applied forces. The geodesic equations are given, in Cartan's approach, by $D_{\dot c}\dot c = 0$ with $\dot c \in E_{c(t)}$.
One may wish to see the equations explicitly. Choose a coordinate system x on $V \subset Q$ and write $p_k = \omega_k(\dot c(t))$ (some authors call the $p_k$ quasivelocities). The equation $D_{\dot c}\dot c = 0$ gives a nonlinear system in n + m dimensions for x and p. Here $e_{kR}$ is the R-th component ($1 \le R \le n$) of the k-th E-basis vector ($1 \le k \le m$) in terms of the chosen trivialization of TQ.
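For readers who want the n + m equations written out, here is a sketch obtained directly from rule (4) and the Leibniz rule. It is a reconstruction under the conventions stated in this section, not a quotation of the displayed system, so the arrangement of indices may differ from the authors'.

```latex
% Write \dot c = p_k\, e_k with p_k = \omega_k(\dot c). By (4) and the Leibniz rule,
D_{\dot c}\,\dot c = \Big( \frac{dp_i}{dt} + \omega_{ij}(\dot c)\, p_j \Big)\, e_i .

% Hence D_{\dot c}\dot c = 0 reads, in coordinates x = (x_1, \dots, x_n),
\frac{dx_R}{dt} = p_k\, e_{kR}(x), \qquad
\frac{dp_i}{dt} = -\,\omega_{ij}(e_k)\big|_{x}\; p_j\, p_k ,
\qquad 1 \le R \le n,\quad 1 \le i, j, k \le m .
```

The first group of n equations says that the velocity lies in E; the second group of m equations is quadratic in the quasivelocities.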

Geometric interpretation.
Direct and inverse "development" of frames and curves were so obvious to Cartan that he did not bother to give details. We elaborate on these concepts, exhibiting explicitly (Theorem 3.4 below) a system of ODEs producing, at the same time, the solution of the non-holonomic system, a parallel frame along it, and a representation of the non-holonomic system on the Euclidean space $\mathbb{R}^m$: the hodograph of frames to $\mathbb{R}^m$.
3.1. Direct parallel transport of an $E_{q_o}$-frame.
A frame for $E_{q_o} \subset T_{q_o}Q$ can be transported along a curve c(t) in Q. The "novelty" here (as stressed by Cartan) is that c(t) is an arbitrary curve in Q, that is, $\dot c(t)$ need not be tangent to E.
Recall that given a tangent vector … In fact, we are led to the linear time-dependent system of ODEs (6).
In particular, an orthonormal frame at $E_{q_o}$ is transported to $E_{c(t)}$ and remains orthonormal.
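The statement that a transported frame remains orthonormal can be checked numerically. The sketch below assumes that a transported frame is written as $(e_1, \dots, e_m) P(t)$ and that rule (4) then yields the matrix equation $dP/dt = -\Omega(t) P$ with $\Omega_{ij} = \omega_{ij}(\dot c(t))$ antisymmetric; the sign convention of the displayed system may differ. The matrix `omega(t)` is made-up test data, and all function names are hypothetical.

```python
import math

def omega(t):
    """Hypothetical antisymmetric 3x3 matrix standing in for (omega_ij(c'(t)))."""
    a, b, c = math.sin(t), t, math.cos(2.0 * t)
    return [[0.0, a, b],
            [-a, 0.0, c],
            [-b, -c, 0.0]]

def matmul(A, B):
    """Plain matrix product of two lists of rows."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def axpy(A, s, B):
    """Return A + s*B entrywise."""
    return [[A[i][j] + s * B[i][j] for j in range(len(A[0]))]
            for i in range(len(A))]

def rhs(t, P):
    """Right-hand side of the assumed transport equation dP/dt = -omega(t) P."""
    return [[-x for x in row] for row in matmul(omega(t), P)]

def transport(P0, t0, t1, steps=2000):
    """Integrate the transport equation with the classical fourth-order Runge-Kutta scheme."""
    P, t = [row[:] for row in P0], t0
    h = (t1 - t0) / steps
    for _ in range(steps):
        k1 = rhs(t, P)
        k2 = rhs(t + h / 2, axpy(P, h / 2, k1))
        k3 = rhs(t + h / 2, axpy(P, h / 2, k2))
        k4 = rhs(t + h, axpy(P, h, k3))
        for k, w in ((k1, 1), (k2, 2), (k3, 2), (k4, 1)):
            P = axpy(P, h * w / 6.0, k)
        t += h
    return P

def orthonormality_defect(P):
    """Largest entry of |P^T P - I|; zero for an exactly orthonormal frame."""
    n = len(P)
    Pt = [[P[j][i] for j in range(n)] for i in range(n)]
    G = matmul(Pt, P)
    return max(abs(G[i][j] - (1.0 if i == j else 0.0))
               for i in range(n) for j in range(n))

identity = [[1.0 if i == j else 0.0 for j in range(3)] for i in range(3)]
P = transport(identity, 0.0, 1.0)
defect = orthonormality_defect(P)
print("orthonormality defect:", defect)
```

The defect is not exactly zero because Runge-Kutta does not preserve orthogonality; it only shrinks with the step size. The exact flow preserves $P^T P = I$ because $\Omega$ is antisymmetric.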

3.2. Hodograph of an $E_{c(t)}$-frame to $\mathbb{R}^m$.
Given a curve of frames for TQ along c(t), $\{e_I(t)\}$, consider the frame for $E_{c(t)}$ formed by the first m vectors $e_i$. We have … We develop a "mirror" or hodograph frame $\{U(t) : u_1(t), \dots, u_m(t)\}$ confined to $\mathbb{R}^m \equiv E_{q_o}$, $q_o = c(0)$, solving the system for $U(t) \in O(m)$ given by (10) … Equivalently (by elementary matrix algebra) … where the $u_i$ are the columns of U.
Proof. First, extend an $E_q$-adapted basis for $T_qQ$ to a neighborhood $q \in V \subset Q$, with corresponding forms $\omega_{IJ}$, I, J = 1, ..., n. Then consider the vector field in $V \times O(m)$ given by … Integrating this vector field we obtain a curve (c(t), U(t)).
We claim that the hodograph of c(t) is γ(t) (the vector field was constructed precisely for that purpose). Indeed, by the previous item, …
What if we had used a different frame $\bar e_I$? We would get a system of ODEs … , c(0) = q, T(0) = id, and we claim that Y = X, so the curve c(t) is unique. To prove this fact it is equivalent to show that T = UP, where P changes basis from $e_i$ to $\bar e_i$ (i = 1, ..., m), that is
(18) $(\bar e_1, \dots, \bar e_m) = (e_1, \dots, e_m)\, P$.
We compute … which is indeed the gauge-theoretical rule giving the forms $(\bar\omega_{ij})$ of the basis $\bar e_i$ defining T from the forms $(\omega_{ij})$ of the basis $e_i$.
We can upgrade this construction to provide a parallel frame along c(t), by declaring $\bar\omega_{ij} \equiv 0$. This gives … which could be added to system (15). Actually, we can take the equation for U out of that system, observing that $U = P^{-1}$.
Theorem 3.4. Given a curve $\gamma(t) \in \mathbb{R}^m$, consider the nonautonomous system of ODEs in the manifold Fr(E) given by … It gives the developed curve c(t) on Q and an attached parallel frame
(21) $(\bar e_1, \dots, \bar e_m) = (e_1, \dots, e_m)\, P$.
For a line γ(t) = t v passing through the origin in $\mathbb{R}^m$ we obtain the non-holonomic geodesic starting at q with velocity $\dot c(0) = v$.

Hodograph of the D'Alembert-Lagrange equation.
We elaborate on the comments of §7 in [1] (see footnote 1). Consider a mechanical system with kinetic energy T and external forces F (in contravariant form), subject to constraints defined by the distribution E. The non-holonomic dynamics is given by … where the right hand side is the orthogonal projection of F over E.
Let γ be the hodograph of c. Fix a constant frame $f_1, \dots, f_m$ on $\mathbb{R}^m$ and write … Let $\bar e_i$ be the parallel frame along c(t) obtained in Theorem 3.4. Decompose …

Equations (24) should be solved simultaneously with (20).
This approach can be helpful for setting up numerical methods and, in some cases, for reducing the non-holonomic system to a second order equation on $\mathbb{R}^m$. We also observe that F can represent non-holonomic control forces acting on the system, such as those studied in [8].

Equivalent connections.
In this section and the next we discuss the question of whether two non-holonomic connections D and $\bar D$ on E have the same geodesics. Given … This is the most general change of coframes preserving the sub-Riemannian metric supported on E. The corresponding dual frame $\bar e_I$ satisfies
(27) $\bar e_j = e_i C_{ij}$, $\bar e_\alpha = e_i B_{i\alpha} + e_\lambda A_{\lambda\alpha}$
(here, for ease of notation we place scalars after vectors).
Footnote 1. "La trajectoire du système matériel, supposé soumis à des forces données de travail élémentaire $P_i \omega_i$, se développe suivant la trajectoire d'un point matériel de masse 1 placé dans l'espace euclidien à m dimensions et soumis à la force de composantes $P_i$." (The trajectory of the material system, assumed to be subject to given forces with elementary work $P_i \omega_i$, develops along the trajectory of a material point of mass 1 placed in the Euclidean space of m dimensions and subject to the force with components $P_i$.)
In matrix form, we have … Using matrix notation is not only convenient for the calculations, but also for setting up the equivalence problem ([2]). Consider the linear group G of matrices of the form … The equivalence problem for sub-Riemannian geometry can be described as follows: … For this study, see Montgomery [10].
In non-holonomic geometry we are led to a more difficult equivalence problem (see Section 6.3 below). In [1], Cartan characterized the non-holonomic connections only for a certain type of distributions, which we will call strongly non-holonomic. Interestingly, Cartan did not work out the associated invariants, even in this case. He focused on finding a special representative in the equivalence class of connections with the same geodesics.
Consider the modified metric on Q
(31) $\bar g = \bar\omega_1^2 + \dots + \bar\omega_n^2$
and the associated Levi-Civita connection $\bar D$. The geodesic equation is … To compare (32) and (6), there is no loss of generality in taking C = id. By inspection one gets:
Proposition 4.1. The geodesics of D and $\bar D$ are the same iff
(33) $\bar\omega_{ij}(T) = \omega_{ij}(T)$
for all T tangent to E.

Equivalent 1-forms.
In view of (33) it seems useful to introduce the following … In the $C^\infty(Q)$-ring of differential forms $\Lambda^*(Q)$, consider the ideal I generated by the 1-forms $\omega_\alpha$, α = m + 1, ..., n. We can write … More generally, two k-forms $\omega_1$ and $\omega_2$ are said to be E-equivalent if their difference vanishes when all the slots $(v_1, \dots, v_k)$ are taken in E. Again, this means that $\omega_1 - \omega_2 \in I$. (In fact, given a Pfaffian system of 1-forms on Q, $\theta_1 = 0, \dots, \theta_r = 0$, one can form the ideal I on $\Lambda^*(Q)$ generated by these forms. Every form that is annulled by the solutions of the system belongs to I, see [6] p. 232.)
If $\omega_1 \sim_E \omega_2$ it does not necessarily follow that $d\omega_1 \sim_E d\omega_2$. For the latter to happen, the former must be equivalent over a larger subspace, $(I^{(1)})^\perp \supset E$, which we now describe.
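A short computation may clarify why E-equivalence of 1-forms is not inherited by their differentials. This is a sketch: the functions $f_\alpha$ are notation introduced here, and $I^{(1)}$ is understood in the standard sense, as the 1-forms θ in I with dθ ≡ 0 mod I.

```latex
% Assumption: \omega_1 \sim_E \omega_2, i.e. \omega_1 - \omega_2 = f_\alpha\, \omega_\alpha
% for some smooth functions f_\alpha (\alpha = m+1, \dots, n).
d(\omega_1 - \omega_2) = df_\alpha \wedge \omega_\alpha + f_\alpha\, d\omega_\alpha .

% The first term always lies in the ideal I. Therefore
d\omega_1 \sim_E d\omega_2
\quad\Longleftrightarrow\quad
f_\alpha\, d\omega_\alpha \equiv 0 \pmod{I}
\quad\Longleftrightarrow\quad
f_\alpha\, \omega_\alpha \in I^{(1)} .
```

So the differentials agree on E exactly when the difference $\omega_1 - \omega_2$ lies in the derived system, not merely in I.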

Filtrations in TQ and in T*Q.
Let I be a Pfaffian system. …
Here I is thought of as a submodule over $C^\infty(Q)$ consisting of all 1-forms generated by the $\theta_i$. We assume that all the systems in the filtration have constant rank. The filtration eventually stabilizes after a finite number of inclusions, and we denote this space $I^{\mathrm{final}}$. By the Frobenius theorem, the Pfaffian system $I^{\mathrm{final}}$ is integrable. Fix a leaf S and consider the pull back of the filtration. That is, we pull back all forms by the inclusion $j : S \to Q$. The filtration associated to $j^* I$ stabilizes at zero.
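For orientation, the decreasing filtration referred to above can be written as follows. This is the standard convention for derived Pfaffian systems and is offered as an assumption about the display that did not survive in this copy, not as a quotation.

```latex
% Standard derived systems of a Pfaffian system I (assumed convention):
I^{(0)} = I, \qquad
I^{(k+1)} = \{\, \theta \in I^{(k)} \;:\; d\theta \equiv 0 \ \ \mathrm{mod}\ I^{(k)} \,\},

% which gives the decreasing filtration
I = I^{(0)} \supset I^{(1)} \supset I^{(2)} \supset \cdots \supset I^{\mathrm{final}} .
```

Here "mod $I^{(k)}$" means modulo the ideal of $\Lambda^*(Q)$ generated by the 1-forms in $I^{(k)}$. The filtration stops when $I^{(k+1)} = I^{(k)}$, which is the Frobenius integrability condition for $I^{(k)}$.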
There is a dual viewpoint, more commonly used in non-holonomic control theory ([12]): given a distribution E in TQ one considers an increasing filtration … Two (different) options are used: … We want to show that for any $\theta \in I^{(1)}$ … Well, θ(X) = 0 by default and (the correct signs do not matter) … as θ(Y) = θ(Z) ≡ 0 because θ ∈ I and dθ(Y, Z) = 0 because $\theta \in I^{(1)}$.
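The identity behind the last argument can be spelled out with signs. The sketch below uses the usual invariant formula for the exterior derivative of a 1-form, with Y, Z denoting sections of E as in the text.

```latex
% For a 1-form \theta and vector fields Y, Z:
d\theta(Y, Z) = Y\big(\theta(Z)\big) - Z\big(\theta(Y)\big) - \theta([Y, Z]) .

% If Y, Z are sections of E and \theta \in I^{(1)}, then \theta(Y) = \theta(Z) = 0
% (because \theta \in I) and d\theta(Y, Z) = 0 (because d\theta \equiv 0 mod I), so
\theta([Y, Z]) = 0 .
```

Thus every form in $I^{(1)}$ annihilates E + [E, E], the first step of the increasing filtration.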

Strongly non-holonomic distributions.
The main question to be addressed in the local theory is the following. Assume that the geodesics of two Levi-Civita non-holonomic connections D and $\bar D$ are the same. Proposition 4.1 says that a necessary and sufficient condition for this to happen is $\bar\omega_{ij} \sim_E \omega_{ij}$.
What are the implications of this condition in terms of the original coframes $\omega = (\omega_1, \dots, \omega_n)$ and $\bar\omega = (\bar\omega_1, \dots, \bar\omega_n)$? The answer is that it depends on the type of distribution E.
One extreme: suppose E is integrable, that is $I^{(1)} = I$. There is a foliation of Q by m-dimensional manifolds whose tangent spaces are the subspaces $E_q$. Then it is clear that there are no further conditions. We can change the complement $F = E^\perp$ without any restriction, and the metric there. In fact, we can fix a leaf S and the Levi-Civita connection on S will coincide with the projected connection, no matter what g is outside E.
The other extreme is the case studied in [1].
Definition 6.1. We say that the distribution E is of the strongly non-holonomic type if the derived Pfaffian system associated to E is zero.
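A simple instance may help fix ideas. The example below (the contact distribution on $\mathbb{R}^3$) is a standard illustration chosen here and is not taken from [1]; it assumes the usual meaning of the derived system, namely the forms θ in I with dθ ≡ 0 mod I.

```latex
% Hypothetical example: Q = \mathbb{R}^3 with coordinates (x, y, z), n = 3, m = 2.
\omega_3 = dz - y\,dx, \qquad
E = \ker \omega_3 = \mathrm{span}\{\, \partial_x + y\,\partial_z,\ \partial_y \,\} .

% The differential of the generator does not vanish on E:
d\omega_3 = dx \wedge dy, \qquad
d\omega_3\big( \partial_x + y\,\partial_z,\ \partial_y \big) = 1 \neq 0 .

% Hence d(f\,\omega_3) \equiv f\, d\omega_3 \not\equiv 0 \ \mathrm{mod}\ I unless f = 0, so
I^{(1)} = 0 .
```

In this example E is strongly non-holonomic in the sense of Definition 6.1, and E + [E, E] is the whole tangent bundle.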
We now prove …
Proof. Cartan used an argument that we found not so easy to decipher (see (41) in Section 6.2 below). Thus we prefer to use a different argument to show that B ≡ 0. We start with the structure equation … Since $\bar\omega_{ij} \sim_E \omega_{ij}$ and $\bar\omega_j \sim_E \omega_j$ (see Equation (25) with C = I), this implies … and this innocent-looking expression, together with (36), yields … Hence if the distribution is of strongly non-holonomic type then B ≡ 0.

Digression.
The following calculations are never explicitly written in [1]; it seems that Cartan did something equivalent to them mentally. A caveat: the connection forms $\omega_{IJ}$ are antisymmetric in the indices I, J, but in general this will not be the case for the forms $\bar\omega_{IJ}$ below. If desired, they will have to be antisymmetrized (a posteriori).
We begin by differentiating … (to be antisymmetrized). The block $(-\bar\omega_{ij})$ is given by … We can take C = const. = id since we are not changing the subspace E. In this case … From equation (40), Cartan observed: … Cartan showed that, under the hypothesis of the derived system being zero, (41) implies B ≡ 0. This follows from applying the matrix B to the structure equations
(42) $d\omega_\alpha = -\omega_{\alpha i} \wedge \omega_i \mod I$, α = m + 1, ..., n,
… which is assumed to have only the trivial solution (see footnote 2).

Equivalence problem for non-holonomic geometry.
The method of equivalence is advertised by Cartan in the 1928 address (see footnote 3), but interestingly, he did not apply the method to its full power. We now outline the equivalence problem.
The derivation was done in the particular case where C = id. But this is not a restriction. Replacing $\bar e_i = (e_i)C$ by the $e_i$ does not change the non-holonomic geometry and leads to the transition matrix … Then $(\bar\omega_i) = (\omega_i) + C^{-1}B(\omega_\alpha)$ and (37) becomes $C^{-1}B(\omega_\alpha) \in I^{(1)}$, and $C^{-1}$ can be removed because $I^{(1)}$ is a module over the functions on Q.
In spite of Cartan's caveat (see footnote 4), we hope to raise interest in further research on the
Equivalence problem for non-holonomic geometry: Given coframes $\omega_U$ and $\Omega_V$ on open sets U and V, find invariants characterizing the existence of a diffeomorphism $F : U \to V$ satisfying $F^*\Omega_V = g \cdot \omega_U$, where the substitutions are of the form …
We recall that $I \subset \Lambda^1(T^*Q)$ is the annihilator of E. The greek indices can be further decomposed into two parts: capital greek Φ = 1, ..., r, representing forms $\omega_\Phi \in I^{(1)}$; lower case greek letters α = m + r + 1, ..., n, where $r = \dim I^{(1)}$, $0 \le r \le n - m$. The matrix B can be written $B = (B_1, B_2)$, where the first block is m × r and the second is m × (n − m − r). Condition (37) is equivalent to $B_2 \equiv 0$, and our choice of basis implies that the $d\omega_\Phi$ involve only the $\omega_\Phi$'s and the $\omega_\alpha$'s, while each $d\omega_\alpha$ involves at least one of the $\omega_i$. The group of substitutions consists of matrices of the form … In terms of frames we have … which in particular shows:
Theorem 6.4. With the above notations, we have:
i) $(e_i, e_\alpha)$ generate an intrinsic subspace $[I^{(1)}]^\perp$, annihilated by $I^{(1)}$.
ii) The $e_\alpha$ generate an intrinsic orthogonal complement F of E in $[I^{(1)}]^\perp$.
iii) There is complete freedom to choose the $e_\Lambda$ to complete the full frame for $T_qQ$.

Non-holonomic torsions and curvatures.
In this section we assume the strongly non-holonomic case. Since B = 0 (and as we can take C = id) we have … We look at the original structure equations for the $\omega_i$: … where $\omega_{ij} = -\omega_{ji}$ and we expand $\omega_{i\alpha}$ as a certain combination of the coframe basis $\omega_j$, $\omega_\beta$ (at this point there is still freedom to choose the matrix A defining the $\omega_\beta$). The result is of the form …
We now use to our advantage the condition $\bar\omega_{ij} \sim_E \omega_{ij}$ of Proposition 4.1. We can modify … There is a unique choice of p's making the γ's symmetrical, namely … Summarizing, we have the Cartan structure equations for strongly non-holonomic connections: … The forms $\omega_{ij} = -\omega_{ji}$ are uniquely defined by the symmetry requirement …
Cartan did not invest in computing curvatures (see footnote 5). The curvature forms for the connection $\omega_{ij}$ would be helpful to compute characteristic classes of the bundle E → Q.

A canonical choice of metric in
Assuming the strongly non-holonomic hypothesis, (44) yields, for each pair of indices j ≠ k, … Interpreted as a linear system for the $u_\alpha$, this in particular implies … We now work on the change of coframes
(58) $\bar\omega_\alpha = A_{\alpha\lambda}\, \omega_\lambda$, α = m + 1, ..., n.
The differentials of the latter are given by: … We can choose the matrix A uniquely by a Gram-Schmidt procedure on the n − m linearly independent vectors $(c_{ij\lambda})$, λ = m + 1, ..., n, in $\mathbb{R}^{m(m-1)/2}$.
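The Gram-Schmidt step can be illustrated with a few lines of code. The sketch below only shows the linear algebra: it orthonormalizes a list of linearly independent vectors and records the triangular matrix of coefficients. How these coefficients enter the matrix A of (58) depends on the transformation law of the $c_{ij\lambda}$, which is not reproduced here, so that identification is left open. The sample vectors and all names are hypothetical.

```python
import math

def dot(u, v):
    """Euclidean inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))

def gram_schmidt(vectors):
    """Classical Gram-Schmidt.

    Returns (ortho, coeffs) where ortho is an orthonormal list and
    ortho[a] = sum_b coeffs[a][b] * vectors[b], with coeffs lower
    triangular and positive on the diagonal.
    """
    ortho, coeffs = [], []
    for k, v in enumerate(vectors):
        w = list(v)
        row = [0.0] * len(vectors)
        row[k] = 1.0
        for q, qrow in zip(ortho, coeffs):
            proj = dot(v, q)
            # Remove the component of v along the already built direction q,
            # and track the same operation on the coefficient row.
            w = [wi - proj * qi for wi, qi in zip(w, q)]
            row = [ri - proj * qri for ri, qri in zip(row, qrow)]
        norm = math.sqrt(dot(w, w))
        if norm < 1e-12:
            raise ValueError("vectors are linearly dependent")
        ortho.append([x / norm for x in w])
        coeffs.append([r / norm for r in row])
    return ortho, coeffs

# Made-up data: m = 3, n = 5, so there are n - m = 2 vectors in R^(m(m-1)/2) = R^3,
# with components ordered as (c_12, c_13, c_23).
c_vectors = [[1.0, 1.0, 0.0],
             [1.0, 0.0, 1.0]]
ortho, coeffs = gram_schmidt(c_vectors)
print(ortho)
print(coeffs)
```

The positivity of the diagonal of `coeffs` is what makes the triangular change of basis unique; after this normalization the only remaining freedom is an orthogonal transformation of the orthonormal vectors.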
Thus we obtain the conditions on the bivectors (Cartan's terminology): … From this point on, in order to maintain the orthonormality conditions, the change of coframes must be restricted to A ∈ O(n − m). Hence we get
Theorem 7.2. (Cartan, [1], §9). Assume the strongly non-holonomic case. The conditions (62) define uniquely a metric g on TQ = E ⊕ F.

Geometric interpretation of torsion.
Recall the $\mathbb{R}^m = E_{q_o}$-valued 1-form θ given by
(14) $\theta = \text{"}d\gamma\text{"} = \omega_j u_j$
which is the integrand of (13). The quotes indicate that this is a loose notation: "dγ" is not exact. Indeed, we compute $d\theta = d\omega_k u_k + du_j \wedge \omega_j$. Now, $du_j = -\omega_{kj} u_k$ by construction, so by Proposition 7.1 … In the strongly non-holonomic case, Theorem 7.1 gives
Proposition 7.3. (Cartan, [1], §8). Consider an infinitesimal parallelogram in Q spanned by vectors u, v in $T_qQ$, and the associated infinitesimal variation dω(u, v) in $\mathbb{R}^m$. If u, v belong to $E_q$ there is no variation in $\mathbb{R}^m$ after the cycle. For $u \in E_q$ and $v \in E_q^\perp = F_q$ the variation is given by the torsion coefficients $\gamma_{k\lambda i}$. For $u, v \in F_q$ the variation is determined by the coefficients $s_{\lambda\mu i}$.
The symmetry (56) has the following interpretation: one can consider the non-holonomic connection on F associated to the metric g. Moreover, one can repeat the procedure in Theorem 7.1. Write … where the ambiguity on the δ's can be removed by changing to another $\bar\omega_{\lambda\alpha} = \omega_{\lambda\alpha} \mod [\omega_i]$ and imposing the symmetry $\delta_{k\lambda\alpha} = \delta_{k\alpha\lambda}$ and the antisymmetry $\omega_{\lambda\alpha} = -\omega_{\alpha\lambda}$. Mutatis mutandis, the geometric interpretation of the torsion coefficients $\delta_{k\alpha\lambda}$ and $c_{ij\alpha}$ is analogous. In particular, there is no torsion for pairs u, v ∈ F.
It seems that these geometric interpretations were forgotten by the geometers from the 60's on. For instance, in the very influential lectures [9], it is written (p. 59): "as far as we know, there is no nice motivation for the word torsion".

The case where F is integrable.
When the torsion coefficients in (54) all vanish, $d\omega_j = \omega_{ij} \wedge \omega_i$, then all the forms $d\omega_j$ belong to the ideal
(67) $J = [\omega_1, \dots, \omega_m]$
so by the Frobenius theorem, the distribution $F = J^\perp$ is integrable. One can construct a local fibration U → B, whose fibers are (pieces of) leaves of F. Choose coordinates $(q_1, \dots, q_m, q_{m+1}, \dots, q_n)$ on U, such that $(q_1, \dots, q_m)$ are coordinates on B and the fibration is $(q_1, \dots, q_m, q_{m+1}, \dots, q_n) \mapsto (q_1, \dots, q_m)$. The distribution E will be given by … If the functions $b_{\alpha i}$ do not depend on the last n − m coordinates, we have locally an $\mathbb{R}^{n-m}$ action on U → B and a connection on this (local) principal bundle. More generally, one can formulate the following equivalence problems:
(1) Given $(\omega_1, \dots, \omega_n)$ a coframe on U, find a Lie group G of dimension n − m, a diffeomorphism F : U → P = B × G and a connection on the principal bundle P such that the distribution E : $\omega_\alpha = 0$ on U corresponds to the horizontal spaces of the connection on P.
(2) Same, adding the requirement that the vertical spaces b × G correspond to the leaves of F.
(3) Same, adding a Riemannian metric g on U and requiring that it corresponds to a G-equivariant metric on P.
(4) Same, requiring that the vertical and horizontal spaces are g-orthogonal.
Case (3) was considered in [11]. The non-holonomic geodesic equations on E reduce to a certain non-holonomic connection on B. If one prefers to use the Levi-Civita Riemannian connection $D^B$ on B, relative to the projected metric, one gets an equation of the form $D^B_{\dot b}\dot b = K(b) \cdot \dot b$, where K is antisymmetric, so that the force on the right hand side is gyroscopic (does not produce work).
This force vanishes in case (4). This seems to be what Cartan had in mind in the abelian case (see footnote 6).
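The remark that an antisymmetric K produces no work amounts to a one-line energy computation. In the sketch below the bracket denotes the projected metric on B, and antisymmetry of K(b) is understood with respect to that metric.

```latex
% Kinetic energy along a solution of D^B_{\dot b}\dot b = K(b)\,\dot b,
% using that D^B is compatible with the projected metric:
\frac{d}{dt}\, \tfrac12 \langle \dot b, \dot b \rangle
  = \langle D^B_{\dot b}\dot b,\ \dot b \rangle
  = \langle K(b)\,\dot b,\ \dot b \rangle
  = 0 .
```

So the speed of the projected curve is constant, just as for a geodesic, even though the curve itself is deflected by the gyroscopic term.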

Restricted connections
In this section we adopt an "internal" point of view, as opposed to the "extrinsic" approach of the preceding ones. It is quite fragmentary and tentative, aiming to propose directions for future work. We change the notation for the configuration space, which will be denoted M. …
• D is Leibnizian in s: …
To emphasize the fact that X ∈ E, we also call this object an E-restricted connection. When H = E, an E-restricted connection on E will be called a non-holonomic connection on E.
A comment is in order. This definition seems natural here, but we have searched the literature and have not found it. In fact, given a vector bundle H → M, the usual notion of a connection D on H (see e.g. Milnor [4], appendix C) means a TM-connection on H, in the sense of our Definition 8.1. We will call those full connections. That is, X is allowed to be any section of TM, so one is able to covariantly differentiate along any curve c(t) in M. The difference in Definition 8.1 is that the covariant differentiation is defined just for curves with $\dot c \in E$. Therefore, to avoid confusion, we call the connection in Definition 8.1 a restricted connection.
Given a full connection, evidently, it can be restricted to E or F. Given a (restricted) E-connection on H, can it always be extended to a (full) TM-connection? The answer is yes. Consider the following "cut-and-paste" or "genetic engineering" operations. Consider a Whitney sum decomposition TM = E ⊕ F, with projection operators denoted by P (over E parallel to F) and Q (over F parallel to E).
i) Given a (full) connection $D_X Y$ on TM, it induces full connections $D^1$ on E and $D^2$ on F, by restricting Y to one of the factors (say, E) and projecting the covariant derivative $D_X Y$ over this factor. Since full connections are plentiful, so are restricted ones.
ii) Given $D^1$, $D^2$, E-restricted (F-restricted, respectively) connections on H, it is obvious that $D_X s = D^1_{X_1} s + D^2_{X_2} s$, with $X_1 = PX$ and $X_2 = QX$, defines a full connection on H.
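The "obvious" claim in item ii) can be verified in two lines. The sketch below writes $X_1 = PX$ and $X_2 = QX$, and takes f to be a smooth function and s a section of H; it assumes only that $D^1$ and $D^2$ are function-linear in the vector argument and Leibnizian in the section.

```latex
% Tensoriality in X: P and Q are bundle maps, so (fX)_1 = f X_1 and (fX)_2 = f X_2, hence
D_{fX}\, s = D^1_{f X_1} s + D^2_{f X_2} s = f\, D_X s .

% Leibniz rule in s: using X_1(f) + X_2(f) = X(f),
D_X (f s) = X_1(f)\, s + f\, D^1_{X_1} s + X_2(f)\, s + f\, D^2_{X_2} s
          = X(f)\, s + f\, D_X s .
```

Additivity in X and in s is immediate, so D is a TM-connection on H.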
Proposition 8.2. Given a non-holonomic connection $D^{(E,E)}$ on E, and $D^{(F,E)}$ an F-connection on E, the rule $D_X Y = D^{(E,E)}_{PX} Y + D^{(F,E)}_{QX} Y$, Y ∈ Γ(E), defines a TM-connection on E extending D. Here P and Q are respectively the projections on E (resp. F) along F (resp. E).
Remark 8.3. However, given an E-restricted connection $D^1$ in E and an F-restricted connection $D^2$ in F, the rule …
The equivalence problem can be rephrased as follows: characterize the class of full connections D on M such that their non-holonomic restrictions $D^E = D^{(E,E)}$ have the same geodesics.

Parallel transport and geodesics.
The basic facts about TM-connections (see Hicks, [9], chapter 5) hold also for E-restricted connections. For instance,
• $(D_X Y)_m$ depends only on the values of Y ∈ H along any curve c(t) ∈ M with $\dot c(0) = X_m$.
• Parallel transport of a vector $h_o \in H_m$ along a curve c(t) ∈ M with $\dot c \in E$.
We slightly change the usual proofs ([9]). Take a local basis $\{h_j\}$, j = 1, ..., p, trivializing H over a neighborhood U ⊂ M, and vector fields $e_1, \dots, e_q$ on U ⊂ M generating E. Here q is the dimension of the fiber of E and p the dimension of the fiber of H. We define $p^2 q$ functions $\Gamma^i_{jk}$ ($1 \le i, j \le p$, $1 \le k \le q$) on U by … We search for $a_j(t)$ such that $h(t) = a_j(t) h_j$ satisfies $D_{\dot c} h(c(t)) \equiv 0$.
We get a linear system of ODEs in p dimensions … where $\Gamma^j_{ik}(t) = \Gamma^j_{ik}(c(t))$. Recall that an E-connection on itself (that is, H = E) is called a non-holonomic connection on E. The equation $D_{\dot c}\dot c = 0$ gives a nonlinear system in n + p dimensions (where n is the dimension of M and p = q is the dimension of E) for x and a, given by
$\frac{dx_r}{dt} = \sum_{k=1}^{p} a_k\, e_k^r(x)$ ($1 \le r \le n$), …
Here $e_k^r$ is the r-th component ($1 \le r \le n$) of the k-th E-basis vector ($1 \le k \le p$) in terms of a standard trivialization of TM.
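The linear transport system can be written out once a convention for the coefficients is fixed. The sketch below assumes the convention $D_{e_k} h_j = \Gamma^i_{jk} h_i$ and writes $\dot c = v_k e_k$; both the placement of the indices and the symbols $v_k$ are assumptions made here for illustration, and the authors' displayed formulas may arrange the indices differently.

```latex
% Assumed convention: D_{e_k} h_j = \Gamma^{i}_{jk}\, h_i, and \dot c(t) = v_k(t)\, e_k(c(t)).
D_{\dot c}\big( a_j h_j \big)
  = \Big( \frac{da_i}{dt} + \Gamma^{i}_{jk}(c(t))\, v_k(t)\, a_j(t) \Big)\, h_i = 0
\quad\Longleftrightarrow\quad
\frac{da_i}{dt} = -\,\Gamma^{i}_{jk}(c(t))\, v_k(t)\, a_j(t), \qquad 1 \le i \le p .

% Geodesics: H = E, h_j = e_j, p = q, and v_k = a_k, which gives the quadratic system
\frac{dx_r}{dt} = a_k\, e_k^r(x), \qquad
\frac{da_i}{dt} = -\,\Gamma^{i}_{jk}(x)\, a_j\, a_k .
```

The first system is linear in a for a given curve, so parallel transport is defined for all t; the second couples the curve and its velocity coefficients and is nonlinear.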

Torsion and curvature.
Let … These objects clearly make sense in the restricted version. Recall ([9], 6.5) the notion of torsion associated to a (1, 1) tensor P ($m \in M \mapsto P_m \in \mathrm{End}(T_mM)$): … Here we take for the operator P the projection over E along F. In the context of restricted connections X, Y are vector fields in E. …
Proof. The results in Hicks ([9], section 5.4) follow ipsis litteris in the restricted context. For instance, we show the latter. The uniqueness results from the second proposition. To