He starts from the wrong axiom that it's hard to prove, and creates a lot of nonsense on top of that.
It requires just two induction proofs:
- One: for I=1, after N comparisons the largest number is at position 1 (proven by induction); that's the base case.
- The other: for any I=n+1, if we assume that the first n slots are ordered, we can treat n+1 as a new array of length N-n and solve using the base case proof.
Talking about computer science and not doing the bare minimum of computer science is peak software engineering :)
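For concreteness, here is the algorithm under discussion in 0-based Python, with a quick empirical check of the base-case claim above (function names are mine, not from the paper):

```python
import random

def cant_believe_sort(a):
    """The algorithm from the paper: note that j runs over the
    WHOLE array and the comparison looks backwards (a[i] < a[j])."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
    return a

def first_pass(a):
    """Only the first outer iteration (I=1 in the 1-based notation above)."""
    for j in range(len(a)):
        if a[0] < a[j]:
            a[0], a[j] = a[j], a[0]
    return a

random.seed(0)
for _ in range(100):
    a = [random.randrange(50) for _ in range(10)]
    # Base case: after N comparisons the largest number is at position 1.
    assert first_pass(a[:])[0] == max(a)
    # And the full double loop really does sort ascending.
    assert cant_believe_sort(a[:]) == sorted(a)
```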
I thought his goal was to get the prover to prove it without understanding it himself.
By realizing the low-indexed portion is always sorted, you've already proved the algorithm yourself and the prover is just checking for bugs in your logic.
I'm not saying the proof isn't valuable, just that it's not magical: it actually requires the user to understand the majority of the proof already.
That is an elegant proof... for a totally different algorithm. J starts at 1, not at I. Every element of the array is subject to being moved in every single outer iteration. Besides, it gets the comparison backwards.
If the (i+1)'st outer loop starts with the first i elements in sorted order, then it ends with the first i+1 elements in sorted order.
In fact if k of the first i elements are <= A[i+1], then the first k iterations of the inner loop will not swap, and the next i-k iterations will swap. The latter can be seen to maintain order.
The last n-i iterations of the inner loop (where j>=i) do not affect this reasoning and can be omitted, as can the first iteration of the outer loop.
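Both claims are easy to check empirically. The sketch below (0-based Python, my own naming) asserts the invariant after every outer pass, i.e. that the first i+1 elements are sorted with the global maximum at index i, and verifies that dropping the first outer pass and the j >= i inner iterations still sorts:

```python
import random

def sort_with_invariant_check(a):
    """The full algorithm, asserting the invariant after each outer pass:
    the first i+1 elements are sorted and a[i] holds the global maximum."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
        assert a[:i + 1] == sorted(a[:i + 1])
        assert a[i] == max(a)
    return a

def truncated_sort(a):
    """Drop the first outer pass and the j >= i inner iterations,
    as suggested above: what's left is insertion sort by swaps."""
    for i in range(1, len(a)):
        for j in range(i):
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
    return a

random.seed(1)
for _ in range(100):
    a = [random.randrange(20) for _ in range(8)]
    assert sort_with_invariant_check(a[:]) == sorted(a)
    assert truncated_sort(a[:]) == sorted(a)
```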
I guess it can be thought of as an unoptimized insertion or bubble sort.
I think it is very possible to write this algorithm by mistake in intro compsci classes when you try to code a bubble sort by heart. I would think TAs may have many such instances in their students' homework.
The video from the article (https://www.youtube.com/watch?app=desktop&v=bydMm4cJDeU) is much better because it highlights the index of the outer loop, which is unclear from the cascading visualization there. By seeing the indexes it becomes clear that (1) in the area before the outer index, every value gets swapped in and out of the outer loop location to put them in order, and (2) at the end of the first outer loop iteration, the largest element will be at the location of the outer loop index, and so everything to the right is unchanged in each iteration.
I actually used this algorithm a decade ago to implement a log-based, transactional database system for an embedded system with a very small amount of memory and a requirement that all memory be statically allocated.
To the frustration of the rest of the development team, who first called me an idiot (I was new), then could not make quicksort run as fast on inputs that were capped at something like 500 items. Apparently, algorithmic complexity isn't everything.
You sure it was this algorithm and not Bubble Sort?
> algorithmic complexity isn’t everything
Yeah very true. Or at least hopefully everyone knows that complexity analysis only applies to large n, and small inputs can change everything. In console video games it was very common to avoid dynamic allocation and to use bubble sort on small arrays. Also extremely common to avoid a sort completely and just do a linear search on the (small) array while querying, that can end up being much faster than sorting and binary searching, especially when the total number of queries or the number of queries per frame is also low.
I feel like anyone who was surprised that algorithmic complexity isn't everything probably didn't totally understand it. The assumptions (like ignoring constants) are straight out of calculus limits. That (+10000) on the end doesn't mean anything if you're sorting an infinite list, but it means a lot if you're sorting 15 (or in your case 500) entries.
If it looks "obviously wrong", here's the quick proof that it works: the j loop will move the largest unsorted element in the array into a sorted position. Since the i loop executes the j loop n times, the array must be sorted (since the n largest elements will be in the correct order).
EDIT
^W^W^W
Nope, I'm wrong. I did a worked example on paper. I think the devious thing here is the algorithm looks simple at a lazy glance.
To sort: 4-3-6-9-1, next round pivot for the i loop in [].
I can see that it sorts everything to the left of a pivot, then because it does that n times it comes to a sorted list. A reasonable proof will be more complicated than I thought.
The list on the left (index less than i) is always sorted. The ith element is inserted by the j loop and the rest of the list is shifted right by one element by repeated swapping with the ith position. Nothing to the right changes because the ith element is the max of the entire list, which seems to be a red herring for analysis.
Yeah, I see this as a very interesting reply to the TLA+ post. Might spend my evening diving into Ada and Spark.
Edit: Some things I noticed. The package gnat-12 does not have gnatprove. Ada mode for Emacs requires a compilation step that fails with GNAT Community Edition. With Alire there is no system gnat, so it cannot compile it (quite possibly I'm missing something). In the end I gave up on using Emacs. GNAT Studio wouldn't run for me until I realized it needed ncurses. It also had some unhandled exceptions for me (raised PROGRAM_ERROR : adjust/finalize raised GNATCOLL.VFS.VFS_INVALID_FILE_ERROR: gnatcoll-vfs.adb:340), but in the end I managed to get it up and running.
Edit2: After playing around with it, I'm extremely impressed with what spark can do. I made a function to add 7 to numbers. Tried putting a post condition that the return value is bigger than input. "Nuh uh, it could overflow". Ok so I add a pre condition that numbers must be less than 100. "Uhm, buddy you are passing in 200 right there". This is useful stuff for real and easy to write too.
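SPARK discharges these checks statically, before the code ever runs. Purely as an illustration (this is a Python sketch of my own, not SPARK), the same add-7 contract can be phrased as runtime assertions over a pretend bounded integer type:

```python
INT_LAST = 127  # stand-in for a bounded machine integer type

def add_seven(x):
    # Precondition, added after the prover's first complaint:
    # without it, x + 7 could overflow the bounded type.
    assert x < 100, "precondition violated"
    result = x + 7
    # Postcondition: the return value is bigger than the input,
    # and stays within the type's range.
    assert result > x and result <= INT_LAST
    return result

assert add_seven(50) == 57
# The analogue of "you are passing in 200 right there":
try:
    add_seven(200)
    raised = False
except AssertionError:
    raised = True
assert raised
```

The difference, of course, is that SPARK proves these conditions hold for every possible input at analysis time, rather than catching one bad call at runtime.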
I've been watching those mesmerizing YouTube videos visualizing sorting algorithms lately. The header of this article uses a screen cap from one of them.
Them: So what shows do you watch?
Me: ... It's complicated.
There are a lot of different sorting algorithms. Like, a lot, a lot.
As I watch them, I try to figure out what they were optimizing for. Some only scan in one direction. Some only use the swap operation. Some seem to do the minimum number of writes. Some are incremental performance improvements over others.
When I see an algorithm like this, I don't assume the person who wrote it was an idiot. I assume they were optimizing for something that's not obvious to me. Its only modifying operation is swap, so maybe that operation is faster than an arbitrary insert for whatever system or data structure they're using. There are no temporary variables besides loop counters, so maybe they're on a memory-constrained environment. There's barely any code here, so maybe this is for a microcontroller with precious little ROM. Or maybe they're applying this as a binary patch and they have a strict ceiling to the number of ops they can fit in.
Or maybe it's just the first sorting algorithm they could think of in an environment that doesn't ship with one and the performance is adequate for their needs. In that case, it's optimized for developer time and productivity. And honestly, it's a far more elegant algorithm than my "naive" version would be.
These are all valid reasons to use a "naive" sorting algorithm.
This is a fantastic post! It's great to have a concrete example of proving a not-trivial algorithm. And somehow this Ada code feels more approachable to me than HOL4 and other functional proof assistants I've looked at.
While it's been on my list for a while, I'm more curious to try out Ada now.
It'll take you less than a week to get proficient enough in it to make something useful, or at least interesting. It's a very straightforward language, overall.
https://learn.adacore.com/ - Pretty much the best free source until you're actually ready to commit to the language. The "Introduction to Ada" course took me maybe a week of 1-2 hours a day reading and practicing to go through. There's also a SPARK course that takes a bit longer, but is also interactive.
The language reference manuals for 2012 and 202x (which should become Ada 2022):
http://www.ada-auth.org/standards/ada12.html
http://www.ada-auth.org/standards/ada2x.html
Two of the most interesting things that trickled down during the design of Spark2014:
1- Contracts are written in the same language as the code to be checked/proved. This made it important to add expressive features to the language (for all / for some quantifiers, expression functions, if- and case-expressions, and more recently the new delta-aggregate notation): these additions make the language far more expressive without too much loss of readability.
2- executable contracts: most contracts can be checked at runtime or proved. And the escape hatch (code only present for proof) is 'ghost' code which is also a nice addition.
Lots of little nifty additions to the language, from just 'thinking in contracts' or 'designing for provability'.
(I'm really not trying to brag - I assume I must be missing something, that I'm naïve to think it obviously works, and it actually works for a different reason.)
In plain English - 'for every element, compare to every element, and swap position if necessary' - .. of course that works? It's as brute force as you can get?
If the second loop ran from j = i to n, it would be easy to see that it sorts in decreasing order. But notice that it's j = 1 to n, and then suddenly it sorts in increasing order.
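Both behaviours are easy to confirm empirically; a quick sketch with the two inner-loop variants (0-based Python, function names mine):

```python
import random

def inner_from_i(a):
    """Inner loop j = i to n: each pass pulls the max of the suffix
    into a[i], i.e. selection sort in DECREASING order."""
    n = len(a)
    for i in range(n):
        for j in range(i, n):
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
    return a

def inner_from_1(a):
    """Inner loop over the whole array (the surprising version):
    sorts in INCREASING order."""
    n = len(a)
    for i in range(n):
        for j in range(n):
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
    return a

random.seed(2)
for _ in range(100):
    a = [random.randrange(30) for _ in range(9)]
    assert inner_from_i(a[:]) == sorted(a, reverse=True)
    assert inner_from_1(a[:]) == sorted(a)
```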
It manages to be more inefficient than most, and will even disorder and then reorder an already sorted input. Bubble sort (a very low bar to beat) doesn't even do that. If you've only got a small number of items, it doesn't matter unless you're sorting a small number of items a large number of times.
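The "disorders and then reorders" point is visible by counting swaps on already-sorted input (a hypothetical instrumentation of both algorithms, not from the article):

```python
def naive_sort_swaps(a):
    """Swap count of the algorithm under discussion."""
    swaps, n = 0, len(a)
    for i in range(n):
        for j in range(n):
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
                swaps += 1
    return swaps

def bubble_sort_swaps(a):
    """Swap count of a plain bubble sort (adjacent comparisons only)."""
    swaps, n = 0, len(a)
    for i in range(n):
        for j in range(n - 1 - i):
            if a[j] > a[j + 1]:
                a[j], a[j + 1] = a[j + 1], a[j]
                swaps += 1
    return swaps

already_sorted = list(range(10))
assert bubble_sort_swaps(already_sorted[:]) == 0  # nothing to do
assert naive_sort_swaps(already_sorted[:]) > 0    # disorders, then reorders
```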
I taught myself this exact algorithm at 8 or 10 by sorting playing cards (we established an order for the suits: diamond < leaf < spade < heart) on a very, very, very long plane trip where I had neglected to bring comic books or something.
After each outer loop iteration, A[:i] (the array up to i) is in ascending order with the max of A at A[i].
This is true on the first iteration since max(A) is eventually swapped to A[1]. It remains true in subsequent iterations since, during the ith iteration, the j loop inserts the next element, initially at A[i], into A[:i-1], shifting everything above it up by one via swaps with A[i], so A[:i] ends up sorted with the max of A moved to A[i]. After that, no more swaps happen in the ith iteration, since A[i] contains the max of A.
Isn't the proof incomplete because it does not ensure that the result is a permutation of the original array contents? Just overwriting the entire array with its first element should still satisfy the post-condition as specified, but is obviously not a valid sorting implementation.
It was touched upon in the post: 'Tip: Don’t prove what you don’t need to prove' (and surrounding text). With a link on another proof for another sorting algorithm.
Article says it is better to actually read the paper instead of trying to figure it out.
I disagree on that. It is a good exercise to try to prove it by yourself and it was actually quite fun
The main mistakes the author makes, though:
- Trying to prove it works without first understanding why and how it works. Always simulate test runs on paper or in your head first.
- Trying to prove it on the machine first. Always make sure you can do the proof on paper first. Then, and only then, should you battle with your verifier/solver to get it to understand your proof.
> After the first main loop, the first item will be the biggest item in the list.
> The inner loop, as it always starts from 1 again will gradually replace the bigger items with smaller items, and so on.
I think the "and so on" is the point. To put it more precisely, I think that anyone (with a suitable familiarity with the tools) can prove formally that "after the first main loop, the first item will be the biggest item in the list"; but I think that it might take a bit more work to prove the "and so on" in a way that satisfies a formal prover.
(As to the intuition—I can buy that the "and so on" part is reasonably intuitive, but only at a level of intuition that can also believe wrong things. For example, just always replacing bigger items with smaller items doesn't guarantee you'll get a sorted list!)
Every time somebody proves that a sorting algorithm sorts, they forget to prove that the algorithm doesn't just drop values or fill the entire array with 0s (with fixed-length arrays, as here).
On paper: "it's just swaps"
Formally: "how do I even specify this?"
(For every value e of the element type, the original array and the final array have the same number of occurrences of e. Show that that property is transitive, and that it's preserved by swap.)
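As a runtime sketch of that specification (my own Python, using multiset equality via Counter rather than the article's SPARK formulation):

```python
from collections import Counter

def sort_with_permutation_check(a):
    """The algorithm, with the 'same number of occurrences of every
    value e' obligation checked after each swap."""
    original = Counter(a)
    n = len(a)
    for i in range(n):
        for j in range(n):
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
                # A swap never changes the multiset of values...
                assert Counter(a) == original
    # ...hence, by transitivity, neither does the whole sort.
    assert Counter(a) == original
    return a

assert sort_with_permutation_check([2, 2, 1, 3]) == [1, 2, 2, 3]
```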
Hi, co-author here. We actually touched upon this subject in the article: 'Tip: Don’t prove what you don’t need to prove' and surrounding text. Also previous comment by Yannick: https://news.ycombinator.com/item?id=31980126
Can confirm. It's on page 243, in the "Arrays" chapter, and is referred to as the "exchange sort." The actual code is:
/* Sort the array */
for (i = 0; i < array_size - 1; i++)
    for (j = i + 1; j < array_size; j++)
        if (array[i] > array[j])
        {
            int temp = array[i];
            array[i] = array[j];
            array[j] = temp;
        }
As addressed in other comments in this thread and past threads on this sort (when the paper came out), it is not bubble sort. Bubble sort only compares adjacent items and swaps them if they are in the wrong order. This has the effect of "bubbling" the largest value to the top (the way it's often written, could also reverse it to push the smallest down first).
This sort is like a worse performance version of insertion sort. But insertion sort creates a partition (and in doing this performs a sort) of the first i+1 items, with i increasing until the entire sequence is sorted. This one will scan the entire sequence every time on the inner loop instead of just a subset of it and usually perform more swaps.
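A rough instrumented comparison bears this out (Python sketch, counters are mine; exact numbers vary with input):

```python
import random

def ops_naive(a):
    """Comparisons and swaps of the algorithm under discussion."""
    comps = swaps = 0
    n = len(a)
    for i in range(n):
        for j in range(n):
            comps += 1
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
                swaps += 1
    return comps, swaps

def ops_insertion(a):
    """Comparisons and swaps of a textbook insertion sort (by swapping)."""
    comps = swaps = 0
    for i in range(1, len(a)):
        j = i
        while j > 0:
            comps += 1
            if a[j - 1] <= a[j]:
                break  # found the insertion point in the sorted prefix
            a[j - 1], a[j] = a[j], a[j - 1]
            swaps += 1
            j -= 1
    return comps, swaps

random.seed(3)
a = [random.randrange(100) for _ in range(50)]
naive_comps, naive_swaps = ops_naive(a[:])
ins_comps, ins_swaps = ops_insertion(a[:])
assert naive_comps == 50 * 50   # always exactly n^2 comparisons
assert ins_comps < naive_comps  # insertion sort scans only the prefix
```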
I don't think that the goal here is to show a fast and elegant sort, but rather to show that a sorting algorithm that seems like it can't possibly work actually does. That is, probably no-one will learn from this article how to sort better, but hopefully people will learn from this article how to formally prove things (e.g., about sorting) better.
Isn't the sorting algorithm in question the famous BubbleSort? I understand the value of formally proving it works, but why is the name mentioned nowhere?
It feels similar at first blush but it's not really. In bubble sort you compare/swap adjacent elements, and exit if you make a pass through the collection without making any changes. Whereas this will compare/swap the element at every index to every other index, and just exits when it's done performing all those comparisons.
The exact same thing happened to me, and this is how I discovered algorithmic complexity. Sorting my triangles took forever (I saw it in the profile, taking a whopping 90% of the time) and I eventually figured there might be a proper sorting algorithm out there.
I was at the same time happily churning out assembly code, talking to the vga card, writing a dos extender, etc.
You can actually do quite a few things without formal education!
Well, hi there, it's me. Back then I was 17 and wading through software rendering: (light) mesh tessellation, triangle projection (twice, for a stereoscopic anaglyph renderer), triangle filling, texture mapping, with at most 300 objects, all in crappy, crashy C (first with VESA, then SDL eased my life...) with lots of misunderstandings about pointers.
I was welllll over my head with complex stuff, and sorting didn't appear in the profiles, so... I guess you can call that profile-guided learning? I had the formal training, later on, and even then it didn't stick until I faced the actual complexity problem head-on.
I'll never forget that whole weekend with my AMD Athlon 900 at 100% CPU sorting a 200,000-word dictionary... It was still not finished on Monday. Implemented (a very primitive) insertion sort and it was done in less than 2 minutes... That was my 10,000-a-day day: https://xkcd.com/1053
smusamashah|3 years ago
If you compare it with both insertion sort and bubble sort, you can see it looks more like insertion sort than bubble sort.
ufo|3 years ago
Does anyone here also use SPARK for this sort of thing? Are there other formal methods tools you'd use if you had to prove something like this?
jwilk|3 years ago
https://news.ycombinator.com/item?id=28758106 (318 comments)
Asooka|3 years ago
The surprising algorithm sorts even though it swaps elements that are already ordered.
shp0ngle|3 years ago
It is opposite!
It works but not as you think it does.
jeffhuys|3 years ago
To PROVE it is another thing. Good work.
zoomablemind|3 years ago
The number of comparisons is bound by n^2/2 vs n^2 of TFA.
palunon|3 years ago
The one in TFA isn't, for one crucial reason: the comparison is the other way around... And yet, it sorts.
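Both comparison counts are easy to verify against 0-based Python translations of the two algorithms (instrumentation is mine):

```python
def exchange_sort_comps(a):
    """The book's exchange sort: j starts at i + 1, comparison a[i] > a[j]."""
    comps, n = 0, len(a)
    for i in range(n - 1):
        for j in range(i + 1, n):
            comps += 1
            if a[i] > a[j]:
                a[i], a[j] = a[j], a[i]
    return comps

def tfa_sort_comps(a):
    """TFA's algorithm: j covers the whole array, comparison a[i] < a[j]."""
    comps, n = 0, len(a)
    for i in range(n):
        for j in range(n):
            comps += 1
            if a[i] < a[j]:
                a[i], a[j] = a[j], a[i]
    return comps

n = 100
assert exchange_sort_comps(list(range(n))) == n * (n - 1) // 2  # ~n^2/2
assert tfa_sort_comps(list(range(n))) == n * n                  # exactly n^2
```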
nabla9|3 years ago
Only the syntax is similar.
MrYellowP|3 years ago
I'm so confused. That "new" algorithm is just BubbleSort?
FrozenVoid|3 years ago
(Combsort as well as mentioned algorithm also consists of two loops and a swap)