ARTICLES
Generating permutations and combinations in lexicographical order
Alon Itai
Computer Science Department, Technion, Haifa, Israel
ABSTRACT
We consider producing permutations and combinations in lexicographical order. Except for the array that holds the combinatorial object, we require only O(1) extra storage. The production of the next item requires O(1) amortized time.
Keywords: Permutations, combinations, amortized complexity.
1 Introduction
Let $n$ and $p < n$ be integers, and let $N$ denote $\{1, \ldots, n\}$. A p-combination is a subset of $N$ of size $p$. A p-combination $\sigma$ may be represented by a boolean array of size $n$, i.e., $\sigma_i = 1$ if and only if $i \in \sigma$. A permutation $\pi$ over $N$ can also be represented by an array of $n$ distinct integers $\pi_1 \ldots \pi_n \in N$. An array $a = a_1 \ldots a_n$ is lexicographically less than $b = b_1 \ldots b_n$ if for some $i$: $a_i < b_i$ and for all $j < i$, $a_j = b_j$. A permutation (combination) $\pi$ is lexicographically less than $\sigma$, written $\pi <_L \sigma$, if the array that represents $\pi$ is lexicographically less than the one that represents $\sigma$. Thus the permutations are ordered
$$1\,2 \cdots n \;<_L\; \cdots \;<_L\; n\,(n-1) \cdots 1$$
and the p-combinations, writing $0^k$ and $1^k$ for runs of $k$ zeros and $k$ ones, are ordered
$$0^{n-p}1^p \;<_L\; \cdots \;<_L\; 1^p0^{n-p}.$$
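As a small illustration of this ordering, the following Python sketch compares two arrays position by position: they must agree on every earlier position, and the first position at which they differ decides the order. The function name `lex_less` is an assumption of this sketch, not a name from the paper.

```python
def lex_less(a, b):
    """Return True if array a is lexicographically less than array b.

    Both arrays are assumed to have the same length n.
    """
    for x, y in zip(a, b):
        if x != y:
            # First position where the arrays differ decides the order.
            return x < y
    return False  # identical arrays: neither is less than the other


# Two permutations of {1, 2, 3}
print(lex_less([1, 3, 2], [2, 1, 3]))        # True
# Two 2-combinations of {1, 2, 3, 4} as boolean arrays
print(lex_less([0, 1, 1, 0], [0, 1, 0, 1]))  # False
```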
We consider producing permutations of $N$ and p-combinations in lexicographical order. Producing the next object might require $\Omega(n)$ time: consider the consecutive permutations $1\,n\,(n-1) \cdots 2$ and $2\,1\,3\,4 \cdots n$. Since these two permutations differ in all positions, $\Omega(n)$ time is required to produce the second one from the first. We show algorithms whose amortized time complexity is $O(1)$, i.e., the time required to produce $m$ consecutive objects is $O(n + m)$.
While previous algorithms required auxiliary data structures of size $\Theta(n)$, our algorithms require at most $O(1)$ additional space. Thus our data structure is implicit [4]. An algorithm for generating combinatorial objects is memoryless if its input is a single object with no additional structure and its output is the next object according to some prespecified order. In our case, an algorithm is memoryless if its input consists of no data other than the array required to store the permutation or combination.
We present a memoryless algorithm to produce permutations. For combinations, our algorithm retains two integer variables in addition to the combination. Moreover, this modest extra space cannot be eliminated without sacrificing time: we show that every memoryless algorithm for producing p-combinations requires non-constant time.
1.1 Previous research
There is a considerable body of research on constructing combinatorial objects. Knuth and Szwarcfiter [3] produce all topological sortings, Squire [5] generates all acyclic orientations of an undirected graph, and Szwarcfiter and Chaty [6] generate all kernels of a directed graph with no directed cycles.
Walsh [7] presents two non-recursive algorithms to produce all well-formed parenthesis strings. The first generates each string in O(n) worst-case time and requires space for only O(1) extra integer variables, and the other generates each string in O(1) worst-case time and uses O(n) extra space. Thus he too shows a time/space tradeoff.
Ehrlich [2] and Dershowitz [1] generate permutations and combinations in O(1) worst-case time. Both use O(n) extra space, and the objects are not generated in lexicographic order.
2 Permutations
The lexicographically first permutation is
12...n
and the last is
n(n-1)...1.
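Let $\pi_1 \cdots \pi_{i-1}*$ denote the set of all permutations of $N$ whose prefix is $\pi_1 \cdots \pi_{i-1}$. In the lexicographical order these permutations appear consecutively, i.e., if $\pi^1, \pi^2 \in \pi_1 \cdots \pi_{i-1}*$ and $\pi^1 <_L \sigma <_L \pi^2$ then $\sigma \in \pi_1 \cdots \pi_{i-1}*$.
The first permutation, $\pi^1$, of $\pi_1 \cdots \pi_{i-1}*$ satisfies $\pi^1_i < \cdots < \pi^1_n$. The last one, $\pi^2$, satisfies $\pi^2_i > \cdots > \pi^2_n$. Let $\pi^3$ be the permutation immediately following $\pi^2$. The permutation $\pi^3$ cannot belong to $\pi_1 \cdots \pi_{i-1}*$. If $\pi_{i-1} > \pi^2_i$ then $\pi^2$ is the last permutation of $\pi_1 \cdots \pi_{i-2}*$. Hence $\pi^3 \notin \pi_1 \cdots \pi_{i-2}*$. If, however, $\pi_{i-1} < \pi^2_i$ then exchanging $\pi_{i-1}$ and $\pi^2_i$ yields a lexicographically larger permutation. Thus $\pi^3 \in \pi_1 \cdots \pi_{i-2}*$, and it is the first permutation of $\pi_1 \cdots \pi_{i-2}*$ that is lexicographically larger than $\pi^2$. Thus $\pi^3_{i-1} \in \{\pi^2_i, \ldots, \pi^2_n\} = \{\pi_i, \ldots, \pi_n\}$ (writing $\pi$ for $\pi^2$), i.e.,
$$\pi^3_{i-1} = \pi_j = \min\{\pi_k : k \ge i,\ \pi_k > \pi_{i-1}\}. \qquad (1)$$
Since any permutation $\sigma \in \pi_1 \cdots \pi_{i-2}\pi^3_{i-1}*$ is lexicographically greater than $\pi^2$, the permutation $\pi^3$ is the minimum permutation of $\pi_1 \cdots \pi_{i-2}\pi^3_{i-1}*$, i.e., $\pi^3_i < \cdots < \pi^3_n$.
Given a permutation $\pi = \pi_1 \cdots \pi_n$, to find the next permutation $\pi^3$, we first find the last decreasing subsequence $\pi_i > \pi_{i+1} > \cdots > \pi_n$ by scanning from position $n$; then we find $\pi_j$ by equation (1) and swap positions $i-1$ and $j$. The suffix is still decreasing: in terms of the original values, $\pi_i > \cdots > \pi_{j-1} > \pi_{i-1} > \pi_{j+1} > \cdots > \pi_n$. The final sequence is obtained by reversing positions $i, \ldots, n$. See Program 1.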
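The scan, swap, and reverse steps can be sketched in Python as follows. This is a minimal illustration of the description in this section rather than a reproduction of Program 1; it uses 0-based indices, and the name `next_perm` follows the procedure name used in Theorem 2.1.

```python
def next_perm(a):
    """Advance list a (a permutation) to its lexicographic successor in place.

    Returns False, leaving a unchanged, if a is already the last permutation.
    Only O(1) variables are used besides a itself.
    """
    n = len(a)
    # 1. Find the start i of the last decreasing run a[i] > ... > a[n-1].
    i = n - 1
    while i > 0 and a[i - 1] > a[i]:
        i -= 1
    if i <= 0:
        return False  # a is n, n-1, ..., 1 (or trivially short): no successor
    # 2. Find the smallest suffix element larger than a[i-1]; because the
    #    suffix is decreasing, it is the rightmost element exceeding a[i-1].
    j = n - 1
    while a[j] < a[i - 1]:
        j -= 1
    # 3. Swap; the suffix is still decreasing afterwards.
    a[i - 1], a[j] = a[j], a[i - 1]
    # 4. Reverse the suffix in place so that it becomes increasing.
    lo, hi = i, n - 1
    while lo < hi:
        a[lo], a[hi] = a[hi], a[lo]
        lo += 1
        hi -= 1
    return True


perm = [1, 2, 3]
out = [perm[:]]
while next_perm(perm):
    out.append(perm[:])
print(out)
# [[1, 2, 3], [1, 3, 2], [2, 1, 3], [2, 3, 1], [3, 1, 2], [3, 2, 1]]
```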
Finding the index $i$ requires $\Theta(n - i)$ time. Likewise for $j$. Reversing the sequence also requires $\Theta(n - i)$ time. For $i = 1$ or $j = 1$ the amount of work is therefore $\Theta(n)$. We next show that the amortized complexity is much lower.
Theorem 2.1 Procedure next_perm() requires O(1) amortized time.
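Proof: For $k = 1, \ldots, n-1$ let
$$d_k = \begin{cases} 1 & \text{if } \pi_k > \pi_{k+1}, \\ 0 & \text{otherwise.} \end{cases}$$
And define a potential function
$$\Phi = \sum_{k=1}^{n-1} d_k.$$
The actual time is $t = n - i$ and since $\pi^3_i < \pi^3_{i+1} < \cdots < \pi^3_n$, the $n-i$ descents inside the old suffix disappear while at most one new descent (at position $i-1$) is created, so $\Delta\Phi \le -(n-i-1)$. Thus the amortized time is
$$a = t + \Delta\Phi \le (n-i) - (n-i-1) = O(1).$$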
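The amortized bound can also be checked empirically. The sketch below is an illustration only, not part of the proof: it walks through all permutations of $\{1, \ldots, n\}$ and averages the length of the decreasing suffix handled in each step. The helper names `step_cost` and `average_cost` are assumptions of this sketch.

```python
def step_cost(a):
    """Advance a to the next permutation in place.

    Returns the length of the decreasing suffix that was scanned and
    reversed (the work of this step), or 0 if a was the last permutation.
    """
    n = len(a)
    i = n - 1
    while i > 0 and a[i - 1] > a[i]:
        i -= 1
    if i <= 0:
        return 0
    j = n - 1
    while a[j] < a[i - 1]:
        j -= 1
    a[i - 1], a[j] = a[j], a[i - 1]
    a[i:] = reversed(a[i:])  # favors clarity over O(1) space in this measurement
    return n - i


def average_cost(n):
    """Average work per step over all n! - 1 steps (requires n >= 2)."""
    a = list(range(1, n + 1))
    total = steps = 0
    while True:
        cost = step_cost(a)
        if cost == 0:
            break
        total += cost
        steps += 1
    return total / steps, steps


for n in (4, 6, 8):
    avg, steps = average_cost(n)
    print(n, steps, round(avg, 3))
```

For each $n$ tried here the average stays below 2, even though individual steps cost up to $n - 1$.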
Note that since two consecutive permutations may differ in up to $n$ places (e.g., $(n-1)\,n\,(n-2)\,(n-3) \cdots 1$ and $n\,1\,2 \cdots (n-1)$), the worst-case time is $\Theta(n)$. Thus in order to achieve $O(1)$ time, we must settle for amortized complexity.
3 Combinations
3.1 Upper bound
We will use the following notation: $1^k$ will denote a run (a consecutive block) of $k$ "1"s and $0^k$ a run of $k$ "0"s. In addition to the boolean array of size $n$ that represents the p-combination, we keep two counters $q$ and $r$. The counter $0 \le r \le p$ counts the number of "1"s that follow the last "0", and the counter $1 \le q \le n-p$ counts the number of consecutive "0"s before the last run of "1"s, i.e., $\pi = \alpha 0^q 1^r$.
Let $\pi$ be the current combination. There are two cases to consider:
Case 1: $r > 0$, i.e., $\pi$ ends with a "1". Exchange the first "1" of the last run of "1"s with the last "0" of the last run of "0"s:
$$\alpha 0^q 1^r = \alpha 0^{q-1}\,0\,1\,1^{r-1} \Rightarrow \alpha 0^{q-1}\,1\,0\,1^{r-1} = \alpha' 0^{q'} 1^{r'},$$
where $\alpha' = \alpha 0^{q-1} 1$, $q' = 1$ and $r' = r - 1$.
Case 2: $r = 0$, i.e., $\pi$ ends with a "0".
If $q = n-p$, then $\pi = 1^p 0^{n-p}$ is the last combination. Otherwise, the next combination is produced by exchanging the first "1" of the last run with the "0" on its left and moving all the remaining "1"s of the last run all the way to the right, i.e.,
$$\pi = \alpha\,0\,1^s 0^q \Rightarrow \alpha\,1\,0^{q+1} 1^{s-1} = \alpha' 0^{q'} 1^{r'} = \pi',$$
where $\alpha' = \alpha 1$, $q' = q + 1$ and $r' = s - 1$.
See Program 2.
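The two cases can be sketched in Python as follows. This is a minimal illustration of the case analysis above, not a reproduction of Program 2. It uses 0-based indices, detects the last combination by scanning instead of comparing q with n-p, and in Case 2 it simply clears the old run and rewrites the trailing "1"s, so it spends O(s) time there, not 1 + min{s, q}. The calling convention (passing and returning the pair of counters) is an assumption of this sketch.

```python
def next_comb(a, q, r):
    """Advance the 0/1 list a to the next p-combination in place.

    q and r are the two counters described above. Returns the updated pair
    (q, r), or None (leaving a unchanged) if a is the last combination.
    """
    n = len(a)
    if r > 0:
        # Case 1: a ends with r ones; move the first of them one step left.
        i = n - r                # index of the first "1" of the last run
        a[i - 1], a[i] = 1, 0
        return 1, r - 1
    # Case 2: a ends with q zeros, preceded by a run of s ones.
    end = n - q                  # a[end - 1] is the last "1" of that run
    j = end - 1
    while j >= 0 and a[j] == 1:
        j -= 1                   # scan the run to find the "0" on its left
    if j < 0:
        return None              # a = 1^p 0^(n-p): the last combination
    s = end - 1 - j
    a[j] = 1                     # the "0" left of the run becomes "1"
    for k in range(j + 1, end):
        a[k] = 0                 # clear the old run
    for k in range(n - (s - 1), n):
        a[k] = 1                 # put the remaining s-1 ones at the far right
    return q + 1, s - 1


n, p = 4, 2
a = [0] * (n - p) + [1] * p      # first combination 0^(n-p) 1^p
state = (n - p, p)               # initial counters q = n-p, r = p
out = []
while state is not None:
    out.append("".join(map(str, a)))
    state = next_comb(a, *state)
print(out)  # ['0011', '0101', '0110', '1001', '1010', '1100']
```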
Theorem 3.1 Procedure next_comb() requires O(1) amortized time.
Proof: Consider the potential function $\Phi = p - r$. Let $r_i$ denote the value of $r$ after the $i$-th operation.
Case 1 ($r > 0$): $\Phi = p - r$, $\Phi' = p - r' = p - (r - 1)$. Then $\Delta\Phi = \Phi' - \Phi = 1$. Since in this case the procedure next_comb only exchanges two positions, the actual time is $t = 1$. Hence, the amortized time is $a = t + \Delta\Phi = 2 = O(1)$.
Case 2 ($r = 0$): $\Phi = p$, $\Phi' = p - (s - 1) = p - s + 1$. $\Delta\Phi = \Phi' - \Phi = (p - s + 1) - p = -s + 1$. In this case, the actual time is $t = 1 + \min\{s, q\} \le 1 + s$. Thus, the amortized time is $a = t + \Delta\Phi \le 2 = O(1)$.
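The accounting in this proof can be checked mechanically for small parameters. The sketch below is an illustration only: it enumerates all p-combinations in lexicographic order with `itertools`, reads $r$, $q$ and $s$ directly off each array, and evaluates $t + \Delta\Phi$ with $\Phi = p - r$, charging the upper bound $t = 1 + s$ in Case 2. All function names are assumptions of this sketch.

```python
from itertools import combinations


def lex_combinations(n, p):
    """All p-combinations of an n-set as 0/1 tuples, in lexicographic order."""
    return sorted(
        tuple(1 if i in c else 0 for i in range(n))
        for c in combinations(range(n), p)
    )


def trailing_run(bits, value):
    """Length of the maximal run of `value` at the right-hand end of bits."""
    k = 0
    while k < len(bits) and bits[len(bits) - 1 - k] == value:
        k += 1
    return k


def max_amortized(n, p):
    """Largest value of t + (Phi' - Phi) over all consecutive pairs."""
    seq = lex_combinations(n, p)
    worst = 0
    for cur, nxt in zip(seq, seq[1:]):
        r = trailing_run(cur, 1)
        if r > 0:
            t = 1                             # Case 1: a single exchange
        else:
            q = trailing_run(cur, 0)
            s = trailing_run(cur[:n - q], 1)  # run of ones left of the zeros
            t = 1 + s                         # Case 2: upper bound on the work
        delta_phi = (p - trailing_run(nxt, 1)) - (p - r)
        worst = max(worst, t + delta_phi)
    return worst


print(max_amortized(8, 3), max_amortized(10, 5))  # prints: 2 2
```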
3.2 Lower bounds
First note that the amortized time bound cannot be replaced by a "worst case" bound since the number of bits that have to be changed between two consecutive combinations is $1 + \min\{q, r\}$. In particular, the combination $0\,1^p 0^{n-p-1}$ is followed by $1\,0^{n-p} 1^{p-1}$, which involves $1 + \min\{p, n-p\}$ changes. This is maximized when $p = n/2$.
Consider a memoryless scheme that operates on $\pi = \alpha 0^q 1^r$. If the sequence ends with a "1" ($r > 0$), we must find the first "0" before the last run of "1"s, and since we have no extra data we must scan $\pi$ from its right-hand end until finding a "0", i.e., scan $r$ "1"s. If the sequence ends with a "0" ($r = 0$) then we have to find the first "1" before the last run of "0"s, i.e., the time is $q$. Since there are $p$ "1"s, the probability that the last digit is "1" is $p/n$, and the probability that the sequence ends with a "0" is $1 - p/n$. The average time is therefore:
$$A_n(p) = \frac{p}{n} \cdot \frac{n}{n-p+1} + \left(1 - \frac{p}{n}\right) \cdot \frac{n}{p+1} = \frac{p}{n-p+1} + \frac{n-p}{p+1}.$$
When $r > 0$ the last digit is "1", and we have $p - 1$ other "1"s which are partitioned by the $n - p$ "0"s into $n - p + 1$ runs (some of which might have zero length). The average length of such a run is therefore $\frac{p-1}{n-p+1}$. Since the last run of "1"s ends with an additional "1", its average length is $1 + \frac{p-1}{n-p+1} = \frac{n}{n-p+1}$.
When $r = 0$ the last digit is "0", and the remaining $n - p - 1$ "0"s are partitioned by the $p$ "1"s into $p + 1$ runs of "0"s. Their average length is $\frac{n-p-1}{p+1}$. The average length of the last run of "0"s is $1 + \frac{n-p-1}{p+1} = \frac{n}{p+1}$.
The asymptotic value of $A_n(p)$ depends on $p$. The first term monotonically increases while the second decreases. For $p = cn$ with a constant $0 < c < 1$, $A_n(p) = O(1)$; for $p = \sqrt{n}$ or $p = n - \sqrt{n}$, $A_n(p) = \Theta(\sqrt{n})$; and for $p = 1$ or $p = n - 1$, $A_n(p) = \Theta(n)$. In general, for a function $1 \le f(n) \le n$, and $p = f(n)$ or $p = n - f(n)$, we have that $A_n(p) = \Theta(n/f(n))$.
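The average scan length can be checked by brute force for small $n$. The sketch below assumes the closed form $A_n(p) = p/(n-p+1) + (n-p)/(p+1)$, which is this sketch's reading of the two run-length averages derived above. It compares that value with the exact average length of the final run over all p-combinations. The function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations


def formula(n, p):
    """Closed form assumed for A_n(p): p/(n-p+1) + (n-p)/(p+1)."""
    return Fraction(p, n - p + 1) + Fraction(n - p, p + 1)


def measured(n, p):
    """Exact average length of the final run over all p-combinations.

    The final run is the run of ones if the array ends with a one, and the
    run of zeros otherwise; a memoryless scan must traverse it.
    """
    total = count = 0
    for c in combinations(range(n), p):
        bits = [1 if i in c else 0 for i in range(n)]
        last = bits[-1]
        k = 0
        while k < n and bits[n - 1 - k] == last:
            k += 1
        total += k
        count += 1
    return Fraction(total, count)


n = 12
for p in (1, 3, 6, 9, 11):
    # The brute-force average agrees with the closed form for each p.
    assert measured(n, p) == formula(n, p)
    print(p, float(formula(n, p)))
```

The printed values are largest at $p = 1$ and $p = n - 1$ and smallest near $p = n/2$, in line with the asymptotic cases listed above.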
Thus for many values of $p$ the average time required to produce the next combination is not constant but an increasing function of $n$. Hence, for memoryless algorithms the amortized complexity is also non-constant.
- [1] Nachum Dershowitz. A simplified loop-free algorithm for generating permutations. BIT, 15(2):158-164, 1975.
- [2] Gideon Ehrlich. Loopless algorithms for generating permutations, combinations, and other combinatorial configurations. Journal of the ACM, 20(3):500-513, 1973.
- [3] D. E. Knuth and J. L. Szwarcfiter. A structured program to generate all topological sorting arrangements. Information Processing Letters, 2:153-157, 1974.
- [4] J. L. Munro. An implicit data structure for the dictionary problem that runs in polylog time. In FOCS, volume 25, pages 369-374, 1984.
- [5] Matthew B. Squire. Generating the acyclic orientations of a graph. Journal of Algorithms, 26(2):275-290, 1998.
- [6] J. L. Szwarcfiter and G. Chaty. Enumerating the kernels of a directed graph with no odd circuits. Information Processing Letters, 51:149-153, 1994.
- [7] Timothy R. Walsh. Generation of well-formed parenthesis strings in constant worst-case time. Journal of Algorithms, pages 165-17, 1998.
Publication Dates
- Publication in this collection: 16 Dec 2003
- Date of issue: 2001