For high performance there is the BioJulia[1][2] ecosystem. They did a comparison with Seq specifically[3]. If you are willing to help improve the state of biology and genomics in the Julia language, they accept donations[4] as well.
tl;dr of the comparison post:
BioSequences (from BioJulia) used to be slower than Seq, with most of that time spent on input validation (which Seq does not do). After this paper, BioSequences was performance-tuned, so that it's now on par with Seq in speed, while still retaining input validation and other benefits.
> With the updates, BioSequences [2.X] rivals Seq in speed while keeping its advantages of a lower memory footprint and doing data validation.
I messed around with Seq when it was posted here a while back. At the time, I was looking for a more performant language than Python for HashBackup (I'm the author), and was looking into D, Go, and Nim. I had a few microbenchmarks to get a little familiar with each syntax and to check performance on things that were a problem in Python, like huge dicts of unique integers (each integer in Python is 24 bytes).
The HN post on Seq came up right as I was doing this so I figured I'd check it too. It did really fantastic on the dict microbenchmark, using something like 350MB of RAM while Python used 1.8 GB, or something like that.
I have no use for any of the genome features, and when I talked with them, they have no use for crypto features. The things that are important to me were not a high priority on their roadmap, so I didn't pursue it.
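For anyone who wants to try the dict microbenchmark mentioned above, a rough CPython version might look like the sketch below. The function name `dict_memory_bytes` and the entry count are made up for illustration, and this is not the original benchmark; it only measures what `tracemalloc` sees, so the numbers will differ from process-level RAM figures.

```python
import sys
import tracemalloc


def dict_memory_bytes(n: int) -> int:
    """Return the bytes allocated while building a dict of n unique ints."""
    tracemalloc.start()
    before, _ = tracemalloc.get_traced_memory()
    # Offset keys and values so they fall outside CPython's small-int cache
    # and every entry owns real int objects.
    table = {i + 1_000_000: i + 2_000_000 for i in range(n)}
    after, _ = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    assert len(table) == n  # keeps the dict alive until it has been measured
    return after - before


if __name__ == "__main__":
    n = 200_000  # hypothetical size; scale up to stress memory
    used = dict_memory_bytes(n)
    print(f"{n} entries: {used / 1e6:.1f} MB, {used / n:.1f} bytes per entry")
    print(f"one int object: {sys.getsizeof(10**6)} bytes")
```

Running the same loop in another language and comparing bytes per entry is the kind of comparison described above.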
The only thing I can see in the demo code that you can't do in regular Python is the sequence literal. So `s"ACGT"` vs `Seq("ACGT")` or something. Oh, and having the Python 2 `print` instead of Python 3 `print()`.
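To make the `Seq("ACGT")` side of that comparison concrete, here is a minimal sketch of what a plain-Python stand-in could look like. The `Seq` class below is hypothetical: it is not the Seq language's implementation and not Biopython's API, and the allowed alphabet (`ACGTN`) is an assumption.

```python
class Seq:
    """Hypothetical DNA sequence wrapper that validates its input."""

    ALPHABET = frozenset("ACGTN")  # assumed alphabet, for illustration only
    _COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

    def __init__(self, bases: str) -> None:
        bases = bases.upper()
        invalid = set(bases) - self.ALPHABET
        if invalid:
            # Reject anything outside the alphabet at construction time.
            raise ValueError(f"invalid bases: {sorted(invalid)}")
        self.bases = bases

    def reverse_complement(self) -> "Seq":
        # Complement each base, then reverse the string.
        return Seq(self.bases.translate(self._COMPLEMENT)[::-1])

    def __repr__(self) -> str:
        return f'Seq("{self.bases}")'
```

The only thing the literal buys over this is shorter syntax; the validation and methods work the same either way.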
Noob question. What does it mean to say that a language is "python-based"? Python is itself a language. Does it mean the parser/compiler is written in Python?
> "Seq enables users to write high-level, Pythonic code without having to worry about low-level or domain-specific optimizations, and allows for the seamless expression of the algorithms, idioms and patterns found in many genomics or bioinformatics applications. "
I gave it a look, and indeed it's not complete Python; it breaks once you start using more of the language/stdlib features.
Stdlib modules I couldn't import:
- sqlite3
- urllib
- pathlib
- hashlib
- json
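A check like the one above is easy to repeat. The sketch below probes the same modules under CPython; the helper name `probe_imports` is made up, and whether this exact script runs unchanged under Seq is an assumption that was not tested here.

```python
import importlib

MODULES = ["sqlite3", "urllib", "pathlib", "hashlib", "json"]


def probe_imports(names):
    """Map each module name to True if it imports, else to the error message."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError as exc:  # also catches ModuleNotFoundError
            results[name] = str(exc)
    return results


if __name__ == "__main__":
    for name, outcome in probe_imports(MODULES).items():
        print(f"{name}: {'ok' if outcome is True else outcome}")
```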
Infra unavailable:
- Debug mode
- pip/venv
- shell
Syntax/built in that didn't work:
- bytes and complex literals
- type()
- Some unpacking (e.g. [*[0]])
- raise from
- async/await
It also adds incompatible syntax that is not Python, such as 's""' and '|>'.
So despite what the README says ("the vast majority of Python programs should work without any modifications"), it's actually the opposite.
The project has real value though; you just need to understand what you are buying here.
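For reference, the syntax and built-ins from the list above look like this in standard CPython 3. This is only a small sketch of the constructs in question (the function names `parse` and `double` are made up), not the actual test script that was run against Seq.

```python
import asyncio

raw = b"ACGT"                 # bytes literal
z = 3 + 4j                    # complex literal
kind = type(raw)              # type() built-in
items = [*[0], *range(1, 3)]  # unpacking inside a list display


def parse(text):
    try:
        return int(text)
    except ValueError as exc:
        # "raise from" chains the new exception to the original one
        raise RuntimeError(f"cannot parse {text!r}") from exc


async def double(x):
    await asyncio.sleep(0)    # async/await
    return 2 * x


result = asyncio.run(double(21))
```

All of this is ordinary Python 3, which is why code using it counts toward the README's "Python programs".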
[+] [-] shepardrtc|4 years ago|reply
Link to the language itself: https://seq-lang.org/
[+] [-] asicsp|4 years ago|reply
* https://news.ycombinator.com/item?id=28537179 (5 days ago, 58 comments)
* https://news.ycombinator.com/item?id=22107510 (Jan 21, 2020, 68 comments)
[+] [-] m-watson|4 years ago|reply
Just to make my comment less useless, here is the GitHub: https://github.com/seq-lang/seq
[+] [-] xvilka|4 years ago|reply
[1] https://biojulia.net/
[2] https://github.com/BioJulia
[3] https://biojulia.net/post/seq-lang/
[4] https://opencollective.com/biojulia
[+] [-] sundarurfriend|4 years ago|reply
[+] [-] prirun|4 years ago|reply
[+] [-] BiteCode_dev|4 years ago|reply
[+] [-] globular-toast|4 years ago|reply
Is there something else?
[+] [-] tenaciousDaniel|4 years ago|reply
[+] [-] koeng|4 years ago|reply
[1] https://github.com/TimothyStiles/poly/issues
[+] [-] tenaciousDaniel|4 years ago|reply
[+] [-] shoulderchipper|4 years ago|reply
https://github.com/seq-lang/seq
[+] [-] BiteCode_dev|4 years ago|reply
[+] [-] chomp|4 years ago|reply
[+] [-] MR4D|4 years ago|reply
[+] [-] syntonym2|4 years ago|reply