ThirdAI Uses Ray for Parallel Training of Billion-Parameter NN on Commodity CPUs

78 points | thirdailab | 2 years ago | anyscale.com

15 comments

[+] stainablesteel|2 years ago|reply
imo this sounds like the right move. Relying on a single company to produce all the GPUs results in massive competition to get them and a large barrier to entry for anyone trying to buy them.

By skipping GPUs you can play a little slower but probably a lot cheaper.

[+] blovescoffee|2 years ago|reply
Their methods rely on heuristics to sparsify NNs. They compare their sparse and dense methods to PyTorch and TensorFlow and get worse performance on dense inference but better on sparse. It would be useful to know if PyTorch runs faster or slower than their methods using their sparsified matrix techniques.
[+] mochomocha|2 years ago|reply
> It would be useful to know if PyTorch runs faster or slower than their methods using their sparsified matrix techniques.

Probably not an easy feat. Sparse support in PyTorch is minimal.
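
To make the dense-versus-sparse distinction in this subthread concrete, here is a minimal sketch in plain Python, assuming a CSR-style (compressed sparse row) layout. It is not BOLT's implementation and not PyTorch's sparse API; the helper names (`dense_matvec`, `to_csr`, `csr_matvec`) and the toy matrix are made up for illustration. It only shows that a sparse kernel touches nonzero weights, so its multiply count tracks the number of nonzeros rather than the full matrix size.

```python
# Illustrative only: dense vs. CSR-style sparse matrix-vector product.
# All function names and data here are hypothetical, not from BOLT or PyTorch.

def dense_matvec(rows, x):
    """Multiply a dense matrix (list of rows) by x; return (result, multiply count)."""
    y, mults = [], 0
    for row in rows:
        acc = 0.0
        for w, xi in zip(row, x):
            acc += w * xi
            mults += 1
        y.append(acc)
    return y, mults

def to_csr(rows):
    """Convert dense rows to CSR arrays: nonzero values, their column indices, row pointers."""
    values, cols, indptr = [], [], [0]
    for row in rows:
        for j, w in enumerate(row):
            if w != 0.0:
                values.append(w)
                cols.append(j)
        indptr.append(len(values))
    return values, cols, indptr

def csr_matvec(values, cols, indptr, x):
    """Multiply a CSR matrix by x, visiting only stored nonzeros; return (result, multiply count)."""
    y, mults = [], 0
    for r in range(len(indptr) - 1):
        acc = 0.0
        for k in range(indptr[r], indptr[r + 1]):
            acc += values[k] * x[cols[k]]
            mults += 1
        y.append(acc)
    return y, mults

# Toy 3x4 weight matrix with 3 nonzeros out of 12 entries.
W = [
    [0.0, 2.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 3.0],
    [0.0, 0.0, 0.0, 0.0],
]
x = [1.0, 2.0, 3.0, 4.0]

dense_y, dense_mults = dense_matvec(W, x)
sparse_y, sparse_mults = csr_matvec(*to_csr(W), x)
print(dense_y, dense_mults)
print(sparse_y, sparse_mults)
```

Both paths give the same output vector, but the dense loop does one multiply per matrix entry while the CSR loop does one per nonzero. Whether that translates into wall-clock wins depends on the kernel and hardware, which is exactly the comparison being asked about.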

[+] grandma_tea|2 years ago|reply
Very cool tech! I remember seeing the SLIDE paper and then being shocked at the lack of high-profile follow-up work.

Is there any plan to open source BOLT? This would be extremely valuable to the community and in reducing Nvidia's chokehold.

[+] vihan_|2 years ago|reply
Thanks for reading! This is a great question. While we don't have immediate plans to open source BOLT, we are still in the very early stages of the company and may consider it down the road. Although BOLT is currently closed-source, we do offer free trial licenses: https://www.thirdai.com/try-bolt/
[+] aabhay|2 years ago|reply
Yes, but do they pay Anyscale anything for this use of Ray?
[+] rvrs|2 years ago|reply
IIUC Anyscale offers managed Ray, and its team also works on Ray full-time. However, Ray is open source and you can run it yourself. It's fairly easy to get started with, too.
[+] choppaface|2 years ago|reply
Founder of Anyscale is the Berkeley student who helped start Ray and is usually happy to give free advice. Anyscale is to Ray as Databricks is to Spark.
[+] datlife|2 years ago|reply
You can set up self-hosted Ray infrastructure to achieve this.
[+] zwaps|2 years ago|reply
Seems to require TensorFlow. That makes it basically irrelevant?