simeonschaub | 4 years ago | on: A new programming language for high-performance computers
simeonschaub's comments
simeonschaub | 4 years ago | on: Awkward: Nested, jagged, differentiable, mixed type, GPU-enabled, JIT'd NumPy
julia> struct Muon p_T::Float64; phi::Float64; eta::Float64; end
julia> a = reinterpret(Muon, rand(3*7))
7-element reinterpret(Muon, ::Vector{Float64}):
 Muon(0.5512393381972832, 0.9349789151451744, 0.006690464595502932)
 Muon(0.5856015732294971, 0.19023473269375601, 0.40764209748521973)
 Muon(0.14872954753560852, 0.12281085717049867, 0.9307398048388644)
 Muon(0.7885776521084014, 0.1392696530731592, 0.4054805743644767)
 Muon(0.4841152655677211, 0.053858886714772236, 0.9556610184833677)
 Muon(0.5325190758093583, 0.31100637434877343, 0.4364100043728055)
 Muon(0.8697751162452897, 0.07683143115108726, 0.49822326551511953)
julia> jagged = [view(a, idx) for idx in [1:3, 4:4, 5:5, 6:7]]
4-element Vector{SubArray{Muon, 1, Base.ReinterpretArray{Muon, 1, Float64, Vector{Float64}, false}, Tuple{UnitRange{Int64}}, true}}:
 [Muon(0.5512393381972832, 0.9349789151451744, 0.006690464595502932), Muon(0.5856015732294971, 0.19023473269375601, 0.40764209748521973), Muon(0.14872954753560852, 0.12281085717049867, 0.9307398048388644)]
 [Muon(0.7885776521084014, 0.1392696530731592, 0.4054805743644767)]
 [Muon(0.4841152655677211, 0.053858886714772236, 0.9556610184833677)]
 [Muon(0.5325190758093583, 0.31100637434877343, 0.4364100043728055), Muon(0.8697751162452897, 0.07683143115108726, 0.49822326551511953)]
Also note how I was able to tell Julia to just reinterpret a contiguous buffer of floating-point values as objects of type `Muon`, which produced a `ReinterpretArray`. Nowhere in there did I ever copy any data from the original array produced by the `rand(3*7)` call.
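The zero-copy claim is easy to check: writes to the underlying `Vector{Float64}` are immediately visible through the `ReinterpretArray`. A minimal sketch (using `zeros` instead of `rand` so the values are predictable):

```julia
# Plain bits type: three Float64 fields, 24 bytes, no indirection.
struct Muon
    p_T::Float64
    phi::Float64
    eta::Float64
end

buf = zeros(3 * 7)          # 21 Floats = 7 Muons
a = reinterpret(Muon, buf)  # a view over buf's memory, not a copy

buf[1] = 42.0               # write through the original array...
a[1].p_T                    # ...and the change shows up in the Muon view
```

If `reinterpret` had copied the data, the write to `buf` would not be reflected in `a[1]`.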
simeonschaub | 4 years ago | on: Julia 1.7 Highlights
It's not. By default, the debugger recursively interprets any nested function calls until it encounters an intrinsic, since breakpoints can be set inside any function. Compiled mode means the debugger won't do that for functions in a certain module (e.g. Base) and will instead invoke them as you normally would. As a result, breakpoints inside those functions are ignored, and so are breakpoints in functions from other modules if those are called from within the compiled functions.
simeonschaub | 4 years ago | on: Julia 1.7 Highlights
There are probably some gains to be had by using a different storage format for the IR, though, as proposed in [1], but it is difficult to say how much of a difference that will make in practice.
[1] https://github.com/JuliaDebug/JuliaInterpreter.jl/pull/309
simeonschaub | 4 years ago | on: Julia 1.7 Highlights
You might be interested in https://github.com/JuliaDebug/Infiltrator.jl, which uses an approach more similar to what you describe.
simeonschaub | 4 years ago | on: Introduction to Pluto.jl
One part of the ongoing effort to reduce latencies is to allow package authors to specify optimization levels on a per-module basis. This is great for plotting packages, for example, since they usually don't benefit much from overly aggressive optimizations, so spending less time optimizing code generally leads to a snappier experience. It is now even possible to opt into a module-specific fully interpreted mode, which can make a lot of sense for typical scripting tasks.
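For reference, the per-module knob is `Base.Experimental.@optlevel` (available since Julia 1.5). A sketch of how a latency-sensitive package might use it (the module and function names here are made up for illustration):

```julia
module QuickPlots

# Ask the compiler to spend little time optimizing this module's code.
# 0 roughly corresponds to -O0; the default is the global -O level.
Base.Experimental.@optlevel 0

make_title(name) = "Plot of " * name

end # module

QuickPlots.make_title("sin")  # works as usual, just compiled with minimal optimization
```

The trade-off is deliberate: code like plot layout runs once per call and is dominated by compile time, so skipping optimization passes is a net win.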
simeonschaub | 4 years ago | on: Introduction to Pluto.jl
In Julia, there is already Tullio.jl [1], which is basically a front end for tensor expressions that can target both CPUs and GPUs, supports AD, and automatically parallelizes the computation. It doesn't really optimize the generated code much right now, though, so something like this could be interesting.
[1] https://github.com/mcabbott/Tullio.jl
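To give a flavor of the notation: with Tullio, a matrix multiply is written `@tullio C[i,j] := A[i,k] * B[k,j]`, and the macro turns the index expression into loop nests. The hand-written base-Julia sketch below illustrates the kind of loops such a macro produces; it is an illustration of the idea, not Tullio's actual expansion.

```julia
# Hand-written equivalent of the loops a tensor-expression macro
# like `@tullio C[i,j] := A[i,k] * B[k,j]` would generate.
function contract(A::Matrix{Float64}, B::Matrix{Float64})
    I, K = size(A)
    K2, J = size(B)
    @assert K == K2
    C = zeros(I, J)
    # Column-major-friendly loop order: innermost index walks down columns.
    @inbounds for j in 1:J, k in 1:K, i in 1:I
        C[i, j] += A[i, k] * B[k, j]
    end
    return C
end

A = [1.0 2.0; 3.0 4.0]
B = [5.0 6.0; 7.0 8.0]
contract(A, B) == A * B  # matches the built-in matrix product
```

Tullio additionally threads the outer loop, tiles for cache, and derives gradient rules for AD, none of which this sketch attempts.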