ThirdAI Uses Ray for Parallel Training of Billion-Parameter NN on Commodity CPUs

[+] stainablesteel|2 years ago|reply

IMO this sounds like the right move. Relying on a single company to produce all the GPUs results in massive competition to get them and a high barrier to entry for buying them.

By skipping GPUs you can play a little slower but probably a lot cheaper.

[+] blovescoffee|2 years ago|reply

Their methods rely on heuristics to sparsify NNs. They compare their sparse and dense methods to PyTorch and TensorFlow and get worse perf on dense inference but better on sparse. It would be useful to know if PyTorch runs faster or slower than their methods using their sparsified matrix techniques.
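
For readers unfamiliar with what "heuristics to sparsify NNs" can look like, here is a minimal sketch in the spirit of the hashing-based neuron selection described in the SLIDE paper. It is not ThirdAI's BOLT code, which is closed source. The SimHash scheme, the single hash table, and every name below (`make_hyperplanes`, `simhash`, `build_table`, `sparse_forward`) are illustrative assumptions.

```python
import random


def make_hyperplanes(num_bits, dim, rng):
    # One random hyperplane per hash bit (SimHash; an assumed choice of LSH family).
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def simhash(vec, hyperplanes):
    # Pack the sign of each projection into an integer bucket id.
    code = 0
    for plane in hyperplanes:
        dot = sum(p * v for p, v in zip(plane, vec))
        code = (code << 1) | (1 if dot >= 0 else 0)
    return code


def build_table(weights, hyperplanes):
    # Bucket every neuron by the hash of its weight vector.
    table = {}
    for neuron_id, w in enumerate(weights):
        table.setdefault(simhash(w, hyperplanes), []).append(neuron_id)
    return table


def dense_forward(weights, x):
    # Baseline: compute every neuron's pre-activation.
    return [sum(w_i * x_i for w_i, x_i in zip(w, x)) for w in weights]


def sparse_forward(weights, table, hyperplanes, x):
    # Heuristic: only evaluate neurons that land in the same bucket as the input.
    active = table.get(simhash(x, hyperplanes), [])
    return {n: sum(w_i * x_i for w_i, x_i in zip(weights[n], x)) for n in active}


rng = random.Random(0)
dim, num_neurons, num_bits = 16, 256, 4  # hypothetical sizes
weights = [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_neurons)]
hyperplanes = make_hyperplanes(num_bits, dim, rng)
table = build_table(weights, hyperplanes)

# Query identical to neuron 0's weights, so neuron 0 is guaranteed to share its bucket.
x = list(weights[0])
dense = dense_forward(weights, x)
sparse = sparse_forward(weights, table, hyperplanes, x)
print(f"evaluated {len(sparse)} of {num_neurons} neurons")
```

The sparse pass skips most dot products at the cost of possibly missing some high-activation neurons. That is the trade-off behind the dense versus sparse comparison above.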

[+] mochomocha|2 years ago|reply

> It would be useful to know if PyTorch runs faster or slower than their methods using their sparsified matrix techniques.

Probably not an easy feat. Sparse support in PyTorch is minimal.

[+] vihan_|2 years ago|reply

[deleted]

[+] grandma_tea|2 years ago|reply

Very cool tech! I remember seeing the SLIDE paper and then being shocked at the lack of high profile follow-up work.

Is there any plan to open source BOLT? This would be extremely valuable to the community and in reducing Nvidia's chokehold.

[+] vihan_|2 years ago|reply

Thanks for reading! This is a great question. While we currently don't have immediate plans to open source BOLT, we are still in the very early stages of the company and may consider it down the road. Although BOLT is currently closed-source, we do offer free trial licenses https://www.thirdai.com/try-bolt/

[+] aabhay|2 years ago|reply

Yes but do they pay Anyscale anything for this use of Ray?

[+] rvrs|2 years ago|reply

IIUC Anyscale offers managed Ray, and the company also works on Ray full-time. However, Ray is open source and you can run it yourself. It's fairly easy to get started with, too.

[+] jsd_dmatrix|2 years ago|reply

This is all open source Ray. None of the AWS benchmarks were run on the Anyscale platform. You can install and run OSS Ray on any of the popular clouds, or on your laptop.

All the scripts for you to run in your OSS Ray cluster are here: https://github.com/ThirdAILabs/Public-Benchmarks/blob/main/c...

[+] choppaface|2 years ago|reply

The founder of Anyscale is the Berkeley student who helped start Ray and is usually happy to give free advice. Anyscale is to Ray as Databricks is to Spark.

[+] datlife|2 years ago|reply

You can set up self-hosted Ray infrastructure to achieve this.

[+] zwaps|2 years ago|reply

Seems to require TensorFlow. That makes it basically irrelevant?

[+] vihan_|2 years ago|reply

Hi, I am one of the co-authors of the blog post from ThirdAI. Our software has no dependency on TensorFlow or any other deep learning framework. You can see the libraries we import for training and try it out for yourself in our public GitHub repo: https://github.com/ThirdAILabs/Public-Benchmarks/blob/main/c...

[+] choppaface|2 years ago|reply

Ray does not require TensorFlow, though the cited workflow might use TF.

Here's a random example of PyTorch on Ray: https://www.ray.io/ray-sgd

[+] kamikaz1k|2 years ago|reply

PyTorch too?[1]

[1] https://docs.ray.io/en/latest/train/api/doc/ray.train.torch....
