item 23065748


akssri | 5 years ago

Julia is great, no doubt. However, all the additional complexity is not entirely 'free'.

For instance, Julia has a performant generational GC, whereas Python relies primarily on simple (but slow) reference counting. One side effect is that Julia does not free intermediate tensors immediately during backprop, which can exhaust GPU memory prematurely while training a deep neural net. This issue was flagged with Flux.jl some time back, and while I'm sure it has since been fixed, it illustrates how such low-level implementation details come back to bite.
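A minimal sketch of the refcounting behavior the comment relies on: in CPython, an object is destroyed the instant its last reference disappears, with no need to wait for a collector pass (the `Tensor` class here is a hypothetical stand-in for a GPU tensor, not a real library type).

```python
import weakref

class Tensor:
    """Hypothetical stand-in for a GPU tensor."""
    pass

freed = []

t = Tensor()
# Register a callback that fires when the object is destroyed.
weakref.finalize(t, lambda: freed.append("freed"))

# CPython's refcounting destroys the object as soon as the last
# reference goes away; a tracing/generational GC would instead
# reclaim it at some later, unpredictable collection.
del t
print(freed)  # → ['freed']
```

Under a generational GC, `freed` would typically still be empty right after `del t`, which is exactly why intermediate tensors can pile up on the GPU between collections.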

PyTorch/Chainer etc., in contrast, manage GPU memory through a caching memory pool (the GPU equivalent of malloc, cudaMalloc, is quite slow), so Python's 'slow' refcounting GC, which returns tensors to the pool the moment their refcount hits zero, is actually a boon for deep learning workloads.
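The pooling idea can be sketched as a toy caching allocator, in the spirit of (but not the actual API of) PyTorch's CUDA allocator: freed blocks go onto a free list keyed by size, so a later allocation of the same size reuses a cached block instead of paying for a slow device malloc. All names here are hypothetical; `bytearray` stands in for device memory.

```python
from collections import defaultdict

class CachingPool:
    """Toy caching allocator: reuse freed blocks instead of re-allocating."""

    def __init__(self):
        self.free_blocks = defaultdict(list)  # size -> list of cached blocks
        self.device_mallocs = 0               # count of "slow" allocations

    def _device_malloc(self, size):
        # Stand-in for the slow device allocation (e.g. cudaMalloc).
        self.device_mallocs += 1
        return bytearray(size)

    def alloc(self, size):
        if self.free_blocks[size]:
            return self.free_blocks[size].pop()  # fast path: reuse cached block
        return self._device_malloc(size)         # slow path: allocate fresh

    def free(self, size, block):
        # Don't release to the device; cache for reuse.
        self.free_blocks[size].append(block)

pool = CachingPool()
a = pool.alloc(1024)
pool.free(1024, a)          # prompt free, as refcounting provides
b = pool.alloc(1024)        # same block comes back from the cache
print(pool.device_mallocs)  # → 1
```

The scheme only pays off if blocks are freed promptly; refcounting guarantees that, which is the comment's point about Python's GC being a good fit here.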
