Algorithms Implemented in Python

[+] lorenzhs|7 years ago|reply

This collection is bad. I had a look at several implementations and pretty much everything I looked at was bad, and not in a subtle way. It may well be that some of things in there are good, but given the amount of bad, you really shouldn't use this for learning.

Here's a short list of things I found that were really bad, all of which we teach our first-year students how to do correctly:

- the entire `Graphs/` directory is a dumpster fire. The implementation of Dijkstra's algorithm doesn't use a priority queue, nor does that of Prim's MST algorithm. Both of these also use a strange hard-coded maximum distance of 100000 for no reason. Both Kruskal implementations are bad, too: one doesn't use any kind of Union-Find data structure, the other uses a naive version which has terrible worst case behaviour.

- There is another implementation of Dijkstra's algorithm in `data_structures/Graph/dijkstra.py` that is just as bad.

- The DFS implementation in `data_structures/Graph/DepthFirstSearch.py` isn't a DFS.

- The sorting and selection algorithms are comically bad, too. `sorts/merge_sort_fastest.py` really takes the cake here. It is neither a mergesort nor fast (it's a weird non-inplace implementation of cocktail shaker sort, with quadratic best- and worst-case behaviour). The quicksort at `sorts/quick_sort.py` always uses the first element as a pivot, which is a bad idea. Its partitioning also copies the entire data in every recursive step, which is an even worse idea. The quickselect in `searches/quick_select.py` at least uses a random pivot, but also partitions out-of-place.

- Why does this contain an implementation of FTP?

[+] jsnell|7 years ago|reply

So the repository is billed as "Open Source Resource for Newbies to Learn Algorithms and Implement them in any Programming Language". Which I would indeed have interpreted as a resource for learning, but it actually seems to be meant as "newbies implementing algorithms while learning from some other source, and submitting the results to this repository".

[+] bquinlan|7 years ago|reply

Also, the space and time complexity analysis is incorrect for Trie: https://github.com/TheAlgorithms/Python/blob/master/data_str...

[+] doomjunky|7 years ago|reply

Actually it's a good didactical way to first implement the "raw" version of Dijkstra's algorithm without a priority queue and than discuss the other variants of the algorithm afterwards, because priority queues are another topic that adds extra difficulty and doesn't yield any benefit to comprehend the acutal problem.

[+] matrixagent|7 years ago|reply

Thanks for your warning! I'd love something like this with good implementations (and documentation as to why the implementation is good), does anyone know similar projects?

[+] tomrod|7 years ago|reply

Is it worth submitting pull requests to fix?

[+] ipozgaj|7 years ago|reply

I wouldn’t recommend anyone to use this. Other than really poor implementation and quality of the algorithms (some of which are totally incorrect), code in that repo is anything but Pythonic - reimplementing a lot of things from the standard library, not using list, dict and set comprehensions, using indexes instead of iterators, copying things around for no reason etc. They didn’t even care to use linter to PEP8-ify it.

[+] aaaaaaaaaab|7 years ago|reply

Oh my God, this is a horrible mess!

Others have already addressed the incorrectly implemented algorithms, so let me instead point out that there is an ebook about prehistoric humans in the “ciphers” folder: https://github.com/TheAlgorithms/Python/blob/master/ciphers/...

(Robert J. Braidwood - Prehistoric Men)

[+] rgovostes|7 years ago|reply

This cracked me up, but it turns out it's used as a source file for doing an encryption benchmark.

https://github.com/TheAlgorithms/Python/blob/d4b4b7ba35cc4e1...

[+] santiagobasulto|7 years ago|reply

This looks like a nice effort and is appreciated, but be aware that the quality of the python is REALLY bad. Don't follow these standards if you're looking at these algorithms. For example, in Quickselect, I saw this:

  pivot = random.randint(0, len(list) - 1)
  pivot = list[pivot]

which is bad at every possible level. First, it can be a one liner with `pivot = random.choice(list)`, but also, reusing the same variable over and over again is discouraged.

Again, congrats to the team that put this together for free and with probably the best intentions. But if you're using it, be aware that the code might be worth a refactor.

[+] aw3c2|7 years ago|reply

Apparently no quality control or even rules on common standards among the file tree. A first look at the directory names is a good indicator for the mess. :/

[+] SilverSlash|7 years ago|reply

Dijkstra's algorithm without a priority queue?

[+] jwilk|7 years ago|reply

Where?

[+] wodenokoto|7 years ago|reply

I'm far from strong algorithmically, but I'd love to improve.

I like Human Ressource Machine, because it rates me on code length and complexity. But the types of algorithms seems limited.

I like project Euler for its variety, but I don't get any feedback on quality of my solutions.

Is there a good middle ground?

[+] cnasc|7 years ago|reply

This course on EdX is pretty good: https://courses.edx.org/courses/course-v1:UCSanDiegoX+ALGS20...

They have an automated grader that runs through a number of test cases (some of which are hidden from you). Unfortunately, after the first few problems unless you're a paid student you only get an input file and a box to enter your result in.

No direct feedback on complexity, but the automated grader has some limits on execution time and memory use that at least tell you whether your implementation is "good enough."

[+] tempay|7 years ago|reply

TIS-100 is a nice step up from human resource machine while keeping the feedback.

https://store.steampowered.com/app/370360/TIS100/

[+] nine_k|7 years ago|reply

Note: these all sorting algorithms implemented in Python.

Nice (even animated) illustrations in readme.

[+] shakna|7 years ago|reply

Not just sorting.

The categories are:

* Sorting

* Searching

* Ciphers

As well as others, which aren't listed in the README.

27 comments