# Sattolo’s Algorithm (2017)

I recently had a problem where part of the solution was to generate a sequence of pointer accesses that would walk around a chunk of memory in pseudo-random order. Sattolo's algorithm is one solution to this because it produces a permutation of a list with exactly one cycle, which guarantees that we will reach every element of the list even though we're traversing it in random order.

However, the explanations of why the algorithm works that I could find online either used some form of mathematical machinery (Stirling numbers, assuming familiarity with cycle notation, etc.), or used logic that was hard for me to follow. I find this is common for explanations of concepts that can, but don't have to, use a lot of mathematical machinery. I don't think there's anything wrong with using existing mathematical techniques per se; it's a great mental shortcut if you're familiar with the concepts, but I think it's unfortunate that it's hard to find a relatively simple explanation that doesn't require any background. While looking for a simple explanation, I also found a lot of people who were using Sattolo's algorithm in places where it wasn't appropriate, and people who didn't know that Sattolo's algorithm is what they were looking for, so here's an attempt at an explanation of why the algorithm works that doesn't assume an undergraduate combinatorics background.

Before we look at Sattolo's algorithm, let's look at Fisher-Yates, an in-place algorithm that produces a random permutation of an array/vector, where every possible permutation occurs with uniform probability.

We'll look at the code for Fisher-Yates and then at how to prove that the algorithm produces the intended result.

```
import random

def shuffle(a):
    n = len(a)
    for i in range(n - 1):          # i from 0 to n-2, inclusive.
        j = random.randrange(i, n)  # j from i to n-1, inclusive.
        a[i], a[j] = a[j], a[i]     # swap a[i] and a[j].
```

`shuffle` takes an array and produces a permutation of the array, i.e., it shuffles the array. We can think of this loop as placing each element of the array, `a`, in turn, from `a[0]` to `a[n-2]`. On iteration `i`, we choose one of the `n-i` candidate elements to swap with and swap element `i` with that random element. The last element in the array, `a[n-1]`, is skipped because it could only be swapped with itself. One way to see that this produces every possible permutation with uniform probability is to write down the probability that each element will end up in any particular position. Another way is to observe two facts about the algorithm:

- Every output that Fisher-Yates produces is produced with uniform probability
- Fisher-Yates produces as many outputs as there are permutations (and each output is a permutation)

(1) For each random choice we make in the algorithm, making a different choice would give a different output. For example, if we look at the resultant `a[0]`, the only way to place the element that was originally in `a[k]` (for some `k`) into the resultant `a[0]` is to swap `a[0]` with `a[k]` in iteration `0`. If we choose a different element to swap with, we end up with a different resultant `a[0]`. Once we've placed `a[0]` and look at the resultant `a[1]`, the same argument applies to `a[1]`, and so on for each `a[i]`. Additionally, each choice reduces the range by the same amount; there's a kind of symmetry, in that although we place `a[0]` first, we could have placed any other element first, and every choice has the same effect. This is vaguely analogous to the way you can pick an integer uniformly at random by picking its digits uniformly at random, one at a time.

(2) How many different outputs does Fisher-Yates produce? On the first iteration, we fix one of `n` possibilities for `a[0]`; given that choice, we fix one of `n-1` possibilities for `a[1]`, then one of `n-2` for `a[2]`, and so on, so there are `n * (n-1) * (n-2) * ... * 2 * 1 = n!` possible different outputs.

That's exactly the number of possible permutations of `n` elements, by pretty much the same reasoning: to count the permutations of `n` elements, we choose one of `n` elements for the first position, one of `n-1` for the second position, and so on, giving `n!` possible permutations.

Since Fisher-Yates only produces distinct permutations and there are exactly as many outputs as there are permutations, Fisher-Yates produces every possible permutation. And since Fisher-Yates produces each output with uniform probability, it produces all possible permutations with uniform probability.
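
As a quick sanity check (mine, not part of the original argument), we can run the Fisher-Yates shuffle many times on a three-element list and confirm that all `3! = 6` permutations show up at roughly equal rates:

```python
import random
from collections import Counter

def shuffle(a):
    n = len(a)
    for i in range(n - 1):
        j = random.randrange(i, n)
        a[i], a[j] = a[j], a[i]

counts = Counter()
trials = 60000
for _ in range(trials):
    a = [0, 1, 2]
    shuffle(a)
    counts[tuple(a)] += 1

assert len(counts) == 6  # all 3! permutations appear
for permutation, count in counts.items():
    # each permutation should occur about trials/6 = 10000 times
    assert abs(count - trials / 6) < 1500, (permutation, count)
```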

Now let's look at Sattolo's algorithm, which is nearly identical to Fisher-Yates and also shuffles its input, but produces something quite different:

```
def sattolo(a):
    n = len(a)
    for i in range(n - 1):
        j = random.randrange(i + 1, n)  # i+1 instead of i
        a[i], a[j] = a[j], a[i]
```

Instead of picking any element at random to swap with, as we did in Fisher-Yates, we pick a random element that is not the element being placed, i.e., we don't allow an element to be swapped with itself. One side effect of this is that no element ends up where it originally started.

Before we talk about why this produces the intended result, let's make sure we're on the same page regarding terminology. One way to look at an array is to view it as a description of a graph, where each index is a node and the value at that index says where the node's outgoing edge points. For example, the list `0 2 3 1` can be viewed as a directed graph with the following edges:

```
0 -> 0
1 -> 2
2 -> 3
3 -> 1
```

Node 0 points to itself (because the value at index 0 is 0), node 1 points to node 2 (because the value at index 1 is 2), and so on. If we traverse this graph, we see that there are two cycles: `0 -> 0 -> 0 ...` and `1 -> 2 -> 3 -> 1 ...`.
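
To make the graph view concrete, here's a small helper (the name `find_cycles` is mine, not the post's) that extracts the cycles of a permutation by following each index to its value until the walk loops back:

```python
def find_cycles(perm):
    """Return the cycles of a permutation, each as a list of indices."""
    seen = set()
    result = []
    for start in range(len(perm)):
        if start in seen:
            continue
        cycle = []
        i = start
        while i not in seen:
            seen.add(i)
            cycle.append(i)
            i = perm[i]  # follow the edge from index i to its value
        result.append(cycle)
    return result

print(find_cycles([0, 2, 3, 1]))  # [[0], [1, 2, 3]] — the two cycles above
```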

Say we swap the element in position 0 with some other element. It could be any element, but let's say we swap it with the element in position 2. Then we have the list `3 2 0 1`, which corresponds to the following graph:

```
0 -> 3
1 -> 2
2 -> 0
3 -> 1
```

If we traverse this graph, we see a single cycle, `0 -> 3 -> 1 -> 2 -> 0 ...`. This is an example of a permutation with exactly one cycle.

If we swap two elements that belong to different cycles, we merge the two cycles into a single cycle. One way to see this is that when we swap two elements in the list, we're essentially picking up the arrowheads pointing into each element and swapping where they point (as opposed to the arrow tails, which stay put). Tracing the result is like tracing a figure eight. For example, if we swap `0` with an arbitrary element of the other cycle, say element 2, we end up with `3 2 0 1`, whose only cycle is `0 -> 3 -> 1 -> 2 -> 0 ...`. Note that this operation is reversible: if we do the same swap again, we end up with two cycles again. Conversely, if we swap two elements from the same cycle, we break the cycle into two separate cycles.

If we feed a list consisting of `0 1 2 ... n-1` to Sattolo's algorithm, we get a permutation with exactly one cycle. Moreover, we have the same probability of generating any permutation that has exactly one cycle. Let's first look at why Sattolo's generates exactly one cycle; afterwards, we'll figure out why it produces all possible one-cycle permutations with uniform probability.

For Sattolo's algorithm, say we start with the list `0 1 2 3 ... n-1`, i.e., a list with `n` cycles of length `1`. On each iteration, we do one swap. If we swap elements from two separate cycles, we merge the two cycles, reducing the number of cycles by 1. We do `n-1` iterations, reducing the number of cycles from `n` to `n - (n-1) = 1`.

Now let's see why it's valid to assume we always swap elements from different cycles. In each iteration of the algorithm, we swap some element with index > `i` with the element at index `i` and then increment `i`. Since `i` gets incremented, the element that gets placed into index `i` can never be swapped again; each swap puts one of the two swapped elements into its final position. In other words, for each swap, we take two elements that were potentially swappable and render one of them unswappable.

When we start, we have `n` cycles of length `1`, each with `1` swappable element. When we swap the initial element with some random element, we take one of the swappable elements and render it unswappable, creating a cycle of length `2` with `1` swappable element and leaving us with `n-2` other cycles, each with `1` swappable element.

The key invariant that's maintained is that each cycle has exactly `1` swappable element. The invariant holds at the start, when we have `n` cycles of length `1`. And as long as it holds, whenever we merge two cycles of any length, we take the swappable element from one cycle and swap it with the swappable element from the other, rendering one of the two elements unswappable and creating a longer cycle that still has only one swappable element, maintaining the invariant.

Since we can't swap two elements from the same cycle, every swap merges two cycles, reducing the number of cycles by 1 on each iteration until we've run `n-1` iterations and exactly one cycle remains.
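
We can spot-check this argument empirically: every run of `sattolo` should leave exactly one cycle, whatever the list length. The `count_cycles` helper below is my own addition, not part of the algorithm:

```python
import random

def sattolo(a):
    n = len(a)
    for i in range(n - 1):
        j = random.randrange(i + 1, n)
        a[i], a[j] = a[j], a[i]

def count_cycles(perm):
    """Count the cycles of a permutation by walking index -> value."""
    seen = set()
    num = 0
    for start in range(len(perm)):
        if start not in seen:
            num += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = perm[i]
    return num

for n in range(2, 12):
    for _ in range(100):
        a = list(range(n))
        sattolo(a)
        assert count_cycles(a) == 1, a  # always exactly one cycle
```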

To see that we generate each cycle with equal probability, note that there's only one way to produce each output, i.e., changing any particular random choice results in a different output. In the first iteration, we randomly choose one of `n-1` placements, then one of `n-2`, then one of `n-3`, and so on, so any particular output is produced with probability `1 / ((n-1) * (n-2) * (n-3) * ... * 2 * 1) = 1 / (n-1)!`. If we can show that there are `(n-1)!` permutations with exactly one cycle, then we'll know that we generate each permutation with exactly one cycle with uniform probability.

Say we have an arbitrary list of length `n` that has exactly one cycle and we add a single element. There are `n` ways to extend it into a cycle of length `n+1`, because there are `n` places where we can splice in the new element while keeping the cycle intact, which means that the number of one-cycle permutations of length `n+1`, `cycles(n+1)`, is `n * cycles(n)`.

For example, say we have a cycle that produces the path `0 -> 1 -> 2 -> 0 ...` and we want to add a new element, `3`. We can substitute `-> 3 ->` for any one `->` and get a cycle of length 4 instead of length 3.
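
We can act out that substitution in code. Starting from the path `0 -> 1 -> 2 -> 0`, splicing `3` in after each of the three arrows yields three distinct one-cycle permutations of length 4 (the helper name `path_to_perm` is mine, not the post's):

```python
def path_to_perm(path):
    """Convert a cycle path like [0, 1, 2] into the permutation array it describes."""
    perm = [None] * len(path)
    for idx, node in enumerate(path):
        perm[node] = path[(idx + 1) % len(path)]
    return perm

base = [0, 1, 2]  # the cycle 0 -> 1 -> 2 -> 0
extended = []
for pos in range(1, len(base) + 1):
    path = base[:pos] + [3] + base[pos:]  # substitute "-> 3 ->" for one arrow
    extended.append(tuple(path_to_perm(path)))

print(extended)  # three distinct one-cycle permutations of length 4
```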

In the base case, there's one cycle of length 2, the permutation `1 0` (the other permutation of length two, `0 1`, has two cycles of length one rather than a single cycle of length 2), so we know that `cycles(2) = 1`. Applying the recurrence above gives `cycles(n) = (n-1)!`, which is exactly the number of different outputs that Sattolo's algorithm generates, meaning that we generate every possible one-cycle permutation. Since we know that we generate each output with uniform probability, we now know that we generate all possible one-cycle permutations with uniform probability.
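
For small `n`, the conclusion `cycles(n) = (n-1)!` can be checked by brute force: enumerate every permutation and count the ones with exactly one cycle (a sketch of mine, not part of the original argument):

```python
from itertools import permutations
from math import factorial

def count_cycles(perm):
    """Count the cycles of a permutation by walking index -> value."""
    seen = set()
    num = 0
    for start in range(len(perm)):
        if start not in seen:
            num += 1
            i = start
            while i not in seen:
                seen.add(i)
                i = perm[i]
    return num

for n in range(2, 8):
    one_cycle = sum(1 for p in permutations(range(n)) if count_cycles(p) == 1)
    assert one_cycle == factorial(n - 1), (n, one_cycle)
```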

Another way to see that there are `(n-1)!` permutations with exactly one cycle is to rotate each cycle around so that `0` is at the start and write it down as `0 -> i -> j -> k -> ...`. The number of these is the same as the number of ways to arrange the elements to the right of the `0 ->`, which is `(n-1)!`.

### Conclusion

We've looked at two algorithms that are identical except for a two-character change. These algorithms produce quite different results: one produces a random permutation and the other produces a random permutation with exactly one cycle. I think these algorithms are neat because they're so simple, just a for loop with a swap inside.

In practice, you probably don't "need" to know how these algorithms work because the standard library for most modern languages will have some way of producing a random shuffle. And if you have a function that gives you a shuffle, you can produce a permutation with exactly one cycle if you don't mind a non-in-place algorithm that takes an extra pass. I'll leave that as an exercise for the reader, but if you'd like a hint, one way to do it parallels the "alternate" way to see that there are `(n-1)!` permutations with exactly one cycle.

Although I said that you probably don't need to know this stuff, you do need to understand it if you're going to implement a custom shuffling algorithm! That may sound obvious, but there's a long history of people implementing incorrect shuffling algorithms. This was common in games and on online gambling sites in the 90s and even into the early 2000s, and you still see the occasional mis-implemented shuffle, e.g., when Microsoft implemented a bogus shuffle and failed to properly randomize a browser-choice poll. At the time, the top Google hit for `javascript random array sort` was the incorrect algorithm that Microsoft ended up using. That page has since been fixed, but you can still find incorrect tutorials floating around online.

#### Appendix: generating a random derangement

A permutation where no element ends up in its original position is called a derangement. Sattolo's algorithm generates derangements, but it only generates derangements with exactly one cycle, and there are derangements with more than one cycle (e.g., `3 2 1 0`), so it can't generate random derangements with uniform probability.

One way to generate random derangements is to generate random shuffles using Fisher-Yates and then retry until we get a derangement:

```
def is_derangement(a):
    return all(x != i for i, x in enumerate(a))

def derangement(n):
    assert n != 1, "can't have a derangement of length 1"
    a = list(range(n))
    while not is_derangement(a):
        shuffle(a)  # the Fisher-Yates shuffle defined above
    return a
```

This algorithm is simple, and it's overwhelmingly likely to eventually return a derangement (for `n != 1`), but it's not immediately obvious how long we should expect it to run before returning a result. Maybe we'll get a derangement on the first try and run `shuffle` once, or maybe it will take 100 tries and we'll have to do 100 shuffles before getting a derangement.

To figure this out, we need to know the probability that a random permutation (shuffle) is a derangement. To get that, we need to know, for a list of length `n`, how many permutations there are and how many derangements there are.

Since we're deep in the appendix, I'm going to assume that you know the number of permutations of `n` elements is `n!`, know what binomial coefficients are, and are comfortable with Taylor series.

To count the number of derangements, we can start with the number of permutations, `n!`, and subtract off the permutations where an element remains in its starting position, `(n choose 1) * (n-1)!`. That's not quite right, because it double subtracts the permutations where two elements remain in their starting positions, so we need to add back `(n choose 2) * (n-2)!`. That's still not quite right, because we've now overcorrected for the permutations where three elements remain in their starting positions, so we need to adjust for those, and so on and so forth, resulting in `∑ (−1)ᵏ (n choose k) (n−k)!`. If we expand this out, divide by `n!`, and cancel terms, we get `∑ (−1)ᵏ / k!`. In the limit as the number of elements goes to infinity, this is exactly the Taylor series for `e^x` at `x = -1`, i.e., `1/e`. In other words, in the limit, we expect the fraction of permutations that are derangements to be `1/e`, so we expect to do `e` times as many shuffles to generate a derangement as we do to generate a single random permutation. Like many alternating series, this one converges quickly: the partial sum is within 7 significant figures of `1/e` by `k = 10`!
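
We can check that convergence numerically; the partial sums of the alternating series settle on `1/e` almost immediately (this snippet is my own illustration, not from the original post):

```python
from math import e, factorial

def derangement_fraction(k_max):
    """Partial sum of the alternating series sum((-1)^k / k!) for k = 0..k_max."""
    return sum((-1) ** k / factorial(k) for k in range(k_max + 1))

# The error of an alternating series is bounded by the first omitted term,
# so by k = 10 the partial sum is within 1/11! (about 2.5e-8) of 1/e.
print(abs(derangement_fraction(10) - 1 / e))  # smaller than 1e-7
```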

One silly thing about our algorithm is that, if we place the first element in the first position, we already know we don't have a derangement, yet we keep placing elements until we've created an entire permutation. If we reject illegal placements as soon as they happen, we can do even better than a factor-of-`e` overhead. It's also possible to come up with a non-rejection-based algorithm, but I like the naive rejection-based algorithm because I find it delightful when basic randomized algorithms of the form "keep trying until it works" perform well.

#### Appendix: Wikipedia’s explanation of Sattolo’s algorithm

I wrote this explanation because I found the explanation on Wikipedia relatively hard to follow, but if you find the explanation above confusing, maybe you'll prefer Wikipedia's version:

The fact that Sattolo's algorithm always produces a cycle of length n can be shown by induction. Assume by induction that after the initial iteration of the loop, the remaining iterations permute the first n − 1 elements according to a cycle of length n − 1 (those remaining iterations are just Sattolo's algorithm applied to those first n − 1 elements). This means that tracing the initial element to its new position p, then the element originally at position p to its new position, and so forth, one only gets back to the initial position after having visited all other positions. Suppose the initial iteration swapped the final element with the one at (non-final) position k, and that the subsequent permutation of the first n − 1 elements then moved it to position l; we compare the permutation π of all n elements with that remaining permutation σ of the first n − 1 elements. Tracing successive positions as just mentioned, there is no difference between σ and π until arriving at position k. But then, under π the element originally at position k is moved to the final position rather than to position l, and the element originally at the final position is moved to position l. From there on, the sequence of positions for π again follows the sequence for σ, and all positions will have been visited before getting back to the initial position, as required.

As for the equal probability of the permutations, it suffices to observe that the modified algorithm involves (n−1)! distinct possible sequences of random numbers produced, each of which clearly produces a different permutation, and each of which occurs, assuming the random number source is unbiased, with equal probability. The (n−1)! different permutations so produced precisely exhaust the set of cycles of length n: each such cycle has a unique cycle notation with the value n in the final position, which allows for (n−1)! permutations of the remaining values to fill the other positions of the cycle notation.

*Thanks to Mathieu Guay-Paquet, Leah Hanson, Rudi Chen, Kamal Marhubi, Michael Robert Arntzenius, Heath Borders, Shreevatsa R, and David Turner for feedback/corrections/discussion.*