I recently ported a reinforcement learning algorithm from PyTorch to Julia. I did my best to keep the implementations the same, with the same hyperparameters, network sizes, etc. I think I did a pretty good job, because the performance was similar, solving the CartPole environment in a similar number of steps, etc.
The Julia implementation ended up being about 2 to 3 times faster. I timed the core learning loops, the network evaluations, and the gradient calculations and applications, and PyTorch and Julia performed similarly there. So it wasn't that Julia was faster at learning. Instead, all the in-between work, all the "bookkeeping" done in Python, ended up being much faster in Julia, enough so that overall it was 2 to 3 times faster.
(I was training on a CPU though. Things may be different if you're using a GPU, I don't know.)
Similar experience over here. (G)ARCH models are severely underserved in Python, and I could not be bothered to learn a probabilistic programming abstraction like Pyro or Stan just to build a quick prototype myself.
Chose Julia instead. Took 4 hours to get everything sorted out (including getting IT to allow Julia's package manager to actually download stuff) and to have the first model running, just putting a paper into code. Since the code is just the math written out, this is a vast communication improvement.
After fiddling around with it at home for a week, this was my first professional experience with it, and I'm blown away.
Julia is such a wonderful language. There are many design decisions that I like, but most importantly to me, its ingenious idea of combining multiple dispatch with JIT compilation still leaves me in awe. It is such an elegant solution to achieving efficient multiple dispatch.
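For anyone who hasn't seen it, here's a tiny made-up illustration (my own toy example, not from any package): a method is selected on the types of all arguments, and the JIT then compiles a specialized native version for each concrete signature it sees.

```julia
abstract type Shape end
struct Circle <: Shape; r::Float64; end
struct Square <: Shape; s::Float64; end

# Methods are chosen on the types of *all* arguments, not just the first,
# unlike single-dispatch OO languages.
intersect_kind(::Circle, ::Circle) = "circle/circle routine"
intersect_kind(::Circle, ::Square) = "circle/square routine"
intersect_kind(::Square, ::Circle) = "circle/square routine"
intersect_kind(::Square, ::Square) = "square/square routine"

intersect_kind(Circle(1.0), Square(2.0))  # "circle/square routine"
```

Because each concrete signature gets its own compiled specialization, the dynamic-looking method lookup usually costs nothing at run time.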
Thanks to everyone who is working on this language!
Julia is the first language to really show that multiple dispatch can be efficient in performance-critical code, but I'm not really sure why it took so long: JIT concepts were certainly familiar to the implementors of Common Lisp and Dylan.
I've been running the 1.6 release candidates, and the compilation speed improvements have been massive. There have been plenty of instances in the past where I've tried to 'quickly' show off some Julia code, and I end up waiting ~45 seconds for a plot to show or a minute for a Pluto notebook to run, and that's not to mention waiting for my imports to finish. It's still slower than Matlab for the first run, but it's at least in the same ballpark now.
On the package ecosystem side, 1.6 is required for JET.jl [0]. Despite being a dynamic language, the Julia compiler does a lot of static analysis (or "abstract interpretation" in Julia lingo). JET.jl exposes some of this to the user, opening a path for additional static analysis tools (or maybe even custom compilers).
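You can peek at that inference machinery even without JET; this small sketch uses Base's own tools (not JET's API) to show what the compiler can and can't infer:

```julia
using InteractiveUtils  # provides @code_warntype outside the REPL

unstable(x) = x > 0 ? x : 0        # returns Float64 *or* Int for a Float64 input
stable(x)   = x > 0 ? x : zero(x)  # always returns the input's type

@code_warntype unstable(1.5)   # highlights the Union{Float64, Int64} return type
@code_warntype stable(1.5)     # infers a plain Float64

# The same information, programmatically (internal but widely used):
Base.return_types(stable, (Float64,))  # one-element vector containing Float64
```

JET essentially runs this kind of abstract interpretation over whole call graphs and reports the suspicious spots.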
Whatever improves loading times is more than welcome. It's not really acceptable to wait just because you import some libraries. I understand Julia does lots of things under the hood and that there's a price to pay for that, but as a Python user I find it a bit inconvenient.
But I'll sure give it a try, because Julia hits a sweet spot between expressiveness and speed (at least for the kind of stuff I do: matrices, algorithms, graph computations).
I like Julia (mostly because of multiple dispatch). The only thing that's lacking is an industrial-strength garbage collector, like the ones found in the JVM.
I know that you shouldn't produce garbage, but I happen to like immutable data structures and those work better with optimised GCs.
> I know that you shouldn't produce garbage, but I happen to like immutable data structures and those work better with optimised GCs.
If you use immutable data structures in Julia, you're rather unlikely to end up with any heap allocations at all. Unlike Java, Julia is very capable of stack-allocating user-defined types.
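A quick way to see this for yourself (a toy example; the names are made up):

```julia
struct Point        # structs are immutable by default
    x::Float64
    y::Float64
end

move(p::Point) = Point(p.x + 1.0, p.y)

const p = Point(0.0, 0.0)
move(p)             # warm up, so compilation isn't included in the measurement
@allocated move(p)  # typically reports 0 bytes: the Point never touches the heap
```

`Point` is a bits type, so the compiler is free to keep it in registers or on the stack rather than boxing it.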
The feature I'm most excited about is the parallel — and automatic — precompilation. Combined with the iterative latency improvements, Julia 1.6 has far fewer coffee breaks.
I think so: the Julia master branch (the 1.7 precursor) works on M1, but not all the dependencies that some packages require have been built for M1 yet. As I understand it, though, the folks behind the wonderful packaging system are working on that.
I think it's more interesting to see what people do with the language instead of focusing on microbenchmarks. There's, for instance, this great package https://github.com/JuliaSIMD/LoopVectorization.jl which exports a simple macro `@avx` that you can stick on loops to vectorize them better than the compiler (LLVM) can. It's quite remarkable that you can implement this in the language as a package, as opposed to waiting for LLVM to improve or for the Julia compiler team to figure it out.
And then replacing the matmul.jl with the following:
    @avx for i = 1:m, j = 1:p
        z = 0.0
        for k = 1:n
            z += a[i, k] * b[k, j]
        end
        out[i, j] = z
    end
I get a 4x speedup, from 2.72s to 0.63s. And with @avxt (threaded) using 8 threads, it goes down to 0.082s on my AMD Ryzen CPU. (So this is not dispatching to MKL/OpenBLAS/etc.) Doing the same in native Python takes 403.781s on this system -- haven't tried the others.
I've rewritten two major pipelines from numpy-heavy, fairly optimized Python to Julia and gotten a 30x performance improvement in one, and 10x in the other. It's pretty fast!
Looks like they're just multiplying two 100x100 matrices, once? (Maybe I'm reading it wrong?) In Julia, runtime would be dominated by compilation + startup time.
A fair comparison with C++ would be to at least include the compilation/linking time into the time reported.
Ditto for Java or any JVM language (you'd have JVM startup cost but that doesn't count the compilation time for bytecode).
Generally, for stuff like this (scientific computing benchmarks), you want to run a lot of computation precisely to avoid this issue, i.e. to let the cost of compilation & startup amortize fairly.
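Concretely, the usual warm-up pattern looks something like this (my own sketch; BenchmarkTools.jl automates it properly):

```julia
work(A, B) = A * B   # stand-in for whatever kernel the benchmark exercises

A, B = rand(100, 100), rand(100, 100)

t_first  = @elapsed work(A, B)  # pays JIT compilation for this signature
t_second = @elapsed work(A, B)  # steady-state: the number you actually report
println("first call: $(t_first)s, second call: $(t_second)s")
```

Timing only the first call conflates one-time compilation with the computation itself, which is exactly the mistake a single 100x100 matmul benchmark makes.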
This appears to be a set of benchmarks of how fast a brainfuck interpreter implemented in different programming languages is on a small set of brainfuck programs? What a bizarre thing to care about benchmarks for. Are you planning on using Julia by writing brainfuck code and then running it through an interpreter written in Julia?
Idk, but just a few weeks ago I started looking at Julia, partly because of the performance claims. I wanted to write a program a bit heavier than your average starter program, so I wrote a backtracker (automatic layout for stripboards, to be precise). It was
* interesting (not fun) to find out how Julia works
* annoying AF to discover that much of the teaching material was hidden behind some 3rd party website, presumably in videos (I didn't bother to register, but started browsing the manual instead). What's wrong with text?
* unnecessarily complex because the documentation for the basic functions is nearly inaccessible to beginners.
But I managed to get a simple layout system up and running, and it wasn't fast. I rewrote it in Go (the language I currently work in most), and it was literally >100x faster. And that should not be due to startup costs, because a backtracker shouldn't have that much JIT overhead.
I think I can now say that I can't see the use case for Julia. "Faster than Python" is simply not good enough, and for the rest there are no redeeming features. Perhaps the fabled partial differential equation module is worth it, but that can get ported to other languages, I guess.
I think this particular Julia code is pretty misleading, and I'm (probably) one of the most qualified people in this particular neck of the woods. I wrote a transpiler for Julia that converts a Brainfuck program to a native Julia function at parse time, which you can then call like you would any other Julia function.
Here's code I ran, with results:
julia> using GalaxyBrain, BenchmarkTools
julia> bench = bf"""
>++[<+++++++++++++>-]<[[>+>+<<-]>[<+>-]++++++++
[>++++++++<-]>.[-]<<>++++++++++[>++++++++++[>++
++++++++[>++++++++++[>++++++++++[>++++++++++[>+
+++++++++[-]<-]<-]<-]<-]<-]<-]<-]++++++++++."""
julia> @benchmark $(bench)(; output=devnull, memory_size=100)
BenchmarkTools.Trial:
memory estimate: 352 bytes
allocs estimate: 3
--------------
minimum time: 96.706 ms (0.00% GC)
median time: 97.633 ms (0.00% GC)
mean time: 98.347 ms (0.00% GC)
maximum time: 102.814 ms (0.00% GC)
--------------
samples: 51
evals/sample: 1
julia> mandel = bf"(not printing for brevity's sake)"
julia> @benchmark $(mandel)(; output=devnull, memory_size=500)
BenchmarkTools.Trial:
memory estimate: 784 bytes
allocs estimate: 3
--------------
minimum time: 1.006 s (0.00% GC)
median time: 1.009 s (0.00% GC)
mean time: 1.011 s (0.00% GC)
maximum time: 1.022 s (0.00% GC)
--------------
samples: 5
evals/sample: 1
Note that, conservatively, GalaxyBrain is about 8 times faster than C++ on "bench.b" and 13 times faster than C on "mandel.b," with each being the fastest language for the respective benchmarks. In addition, it allocates almost no memory relative to the other programs, which measure memory usage in MiB.
You could argue that I might see similar speedup for other languages on my machine, assuming I have a spectacularly fast setup, but this person ran their benchmarks on a tenth generation Intel CPU, whereas mine's an eighth generation Intel CPU:
julia> versioninfo()
Julia Version 1.5.1
Commit 697e782ab8 (2020-08-25 20:08 UTC)
Platform Info:
  OS: Linux (x86_64-pc-linux-gnu)
  CPU: Intel(R) Core(TM) i7-8700K CPU @ 3.70GHz
  WORD_SIZE: 64
  LIBM: libopenlibm
  LLVM: libLLVM-9.0.1 (ORCJIT, skylake)
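The parse-time transpilation trick is small enough to sketch from scratch. This is my own toy version, not GalaxyBrain's actual code (the names `bf_expr` and `compile_bf` are made up, and the ',' input op is omitted for brevity); it turns a Brainfuck string into a Julia expression tree once, and `eval` then hands it to the JIT like any other function:

```julia
# Translate a Brainfuck string into a Julia expression tree.
# Assumes balanced brackets; non-command characters are comments in Brainfuck.
function bf_expr(src::AbstractString)
    stack = [Expr(:block)]                 # one block per currently open loop
    for c in src
        ops = stack[end].args
        if c == '+'
            push!(ops, :(mem[ptr] += 0x1))
        elseif c == '-'
            push!(ops, :(mem[ptr] -= 0x1))
        elseif c == '>'
            push!(ops, :(ptr += 1))
        elseif c == '<'
            push!(ops, :(ptr -= 1))
        elseif c == '.'
            push!(ops, :(print(io, Char(mem[ptr]))))
        elseif c == '['
            push!(stack, Expr(:block))
        elseif c == ']'
            body = pop!(stack)
            push!(stack[end].args, :(while mem[ptr] != 0x0; $body; end))
        end
    end
    only(stack)
end

# Compile the expression into an ordinary Julia function.
function compile_bf(src::AbstractString)
    code = bf_expr(src)
    @eval (io, memory_size) -> begin
        mem = zeros(UInt8, memory_size)
        ptr = 1
        $code
        nothing
    end
end

hello = compile_bf("++++++++[>++++++++<-]>+.")  # 8*8 + 1 = 65 = 'A'
buf = IOBuffer()
hello(buf, 30)
String(take!(buf))                              # "A"
```

Since the loop structure is ordinary Julia control flow by the time the JIT sees it, there is no interpreter dispatch overhead left at run time, which is where the speedups come from.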
I think I can answer that. First of all, Julia isn't as fast as C/C++/Nim etc. in most cases; Julia is just fast at scientific computing, that's all. (And there is only one "scientific" benchmark among the kostya benchmarks.)
Second, to write very fast Julia you need to know a lot of "tricks", and in most cases it won't be as easy as writing normal code.
And the people criticizing this benchmark are assuming it measures compilation time, or that JIT time isn't excluded (XD?); they could just look at the code/readme for 5 seconds before commenting.
Julia is fast and can be as fast as C, but not in all cases and not as easily as it seems.
Is there a per-project way to manage dependencies yet? I find global package installation to be the biggest weakness of all the R projects out there. Anaconda can help, but it’s not widely used for R projects. And Docker... well, don’t get me started.
Yeah. Julia's had that since (at least) 1.0. Environments are built in, and you specify project dependencies in a Project.toml file: https://pkgdocs.julialang.org/v1/toml-files/.
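In script form it looks like this, using the built-in Pkg API (the package name here is just the registry's demo package):

```julia
using Pkg

Pkg.activate(".")    # use ./Project.toml as this project's environment,
                     # creating it on the first write
Pkg.add("Example")   # recorded in Project.toml; the exact resolved versions
                     # land in Manifest.toml
Pkg.instantiate()    # on another machine: recreate the environment from
                     # those two files
```

Checking Project.toml (and Manifest.toml, for exact reproducibility) into version control gives you per-project dependencies with no Anaconda or Docker needed.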
Maybe I misread this, but the milestone "1.6 blockers" still has 3 open issues, with the description "1.6 now considered feature-complete. This milestone tracks release-blocking issues." So how can 1.6 be ready?
I just did a “using Plots” in 1.6.0, and it was fast enough to not care about the delta between Plots and, say, R loading ggplot.
Huge kudos to the Julia team.
[0]: https://github.com/aviatesk/JET.jl
Like for autodiff or GPUs.
Check out StaticCompiler.jl.
Or are there steps to produce a binary (much like Go or C or Rust)?
Not that my suggestion is good, but what they have now is bad:
https://github.com/JuliaLang/julia/issues/37187
Will there be an M1 Mac version for 1.7?
> `git clone https://github.com/JuliaLang/julia` and `make` should be enough at this point.
https://github.com/JuliaLang/julia/issues/36617#issuecomment...
Julia loses almost consistently to Go, Crystal, Nim, Rust, Kotlin, Python (PyPy, Numpy): https://github.com/kostya/benchmarks
Is this because of bad typing, or because they didn't use Julia in an idiomatic manner?
See the docs which kinda read like blog posts: https://juliasimd.github.io/LoopVectorization.jl/stable/
This package is 70 lines of Julia code. You can check it out for yourself here: https://github.com/OTDE/GalaxyBrain.jl. I talk about this package in-depth here: https://medium.com/@otde/six-months-with-julia-parse-time-tr...
https://github.com/JuliaLang/julia/issues/40190
Edit: it's now building:
https://github.com/JuliaLang/docs.julialang.org/runs/2196972...