pumanoir | 2 years ago

Are there examples of using this for SGD? Something like "Here is a tutorial on how to build a nanoGPT using DiffEqGPU.jl"?

ChrisRackauckas | 2 years ago

There is an example of using this with gradient-based optimization here: https://docs.sciml.ai/SciMLSensitivity/dev/tutorials/data_pa....

Since this is an ODE solver, you wouldn't build nanoGPT with it, though. You'd need to go back to KernelAbstractions and write a nanoGPT on that same abstraction layer. Again, this is a demonstration of the cross-GPU tools for ODEs; for LLMs, you'd need to take these tools and implement an LLM yourself.