top | item 38378534


winterismute | 2 years ago

I am looking at solving this challenge in a specific way: using high-performance, GPU-accelerated HW simulators and ML algorithms to tune a new HW architecture automatically. Best ML HW => run the best ML models on it => produce the new best HW (arch) => build the new best HW => GOTO 10.
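A minimal sketch of the "tune a new HW architecture" step in that loop, assuming a simulator that scores candidate designs and a simple hill-climbing search (the `simulate` objective and the three parameters are invented for illustration, not a real simulator API):

```python
import random

def simulate(hw):
    # Hypothetical stand-in for a GPU-accelerated HW simulator: score a
    # candidate architecture on a fixed model workload. Toy objective only:
    # reward wide SIMD and large SRAM per unit of die area.
    return (hw["simd_lanes"] * hw["sram_kb"]) / hw["area_mm2"]

def mutate(hw, rng):
    # Propose a neighbouring architecture by nudging one parameter.
    cand = dict(hw)
    key = rng.choice(sorted(cand))
    cand[key] *= rng.choice([0.9, 1.1])
    return cand

def tune_hardware(iterations=200, seed=0):
    # Hill-climb over architecture parameters against the simulator.
    # In the proposed loop, the winning design would then host the next
    # round of model training, closing the HW -> model -> HW cycle.
    rng = random.Random(seed)
    hw = {"simd_lanes": 8.0, "sram_kb": 256.0, "area_mm2": 100.0}
    best = simulate(hw)
    for _ in range(iterations):
        cand = mutate(hw, rng)
        score = simulate(cand)
        if score > best:
            hw, best = cand, score
    return hw, best
```

In a real system the mutation step would be replaced by a learned search (RL, evolutionary, or gradient-based over a differentiable surrogate), but the accept/reject structure against the simulator is the same.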

Reach out if you are interested in any way.


PeterisP | 2 years ago

The issue is that there is no such thing as "best ML HW", because different ML algorithms need quite different things.

The process you describe would produce hardware that's best for some specific type of ML model. But the whole point of this article is that the currently "best" ML models are not objectively best; they are merely the best within the bounds of the hardware architectures we have (i.e. we're not looking at the best ML algorithms, but at the ML algorithms that "won the hardware lottery"). It seems plausible that better ML methods would exist if different hardware enabled them, but the process you describe couldn't possibly find that different hardware. It would instead tune the hardware even further towards the algorithmic path we're already stuck on.

heyitsguay | 2 years ago

How would you define "new best HW"? That seems challenging, particularly if it's for generation N+1 ML models that haven't been created yet. Also, while there is work underway to use ML models to guide HW design, it's not clear to me that the best ML models for that task are the same as the best ML models for more general-purpose tasks in audio, visual, and natural language processing. E.g., is HW circuit design done using transformers? What are the inputs, latent space, and outputs?

GPU development seems driven by more general computational principles that might be summed up glibly as "We're hitting fundamental physics limits for single-core processors, what is the maximum amount of data we can move per second through the maximum number of cores?" Perhaps there is a way to extend this with a view toward the challenges of current ML model training and inference, but I imagine trying to approach it as a black-box optimization problem could be quite difficult.
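One reason the black-box framing is hard: each simulator evaluation is expensive (often hours), and the design space grows exponentially with the number of architecture knobs. A back-of-envelope illustration (the knob and level counts here are invented, purely to show the scaling):

```python
def exhaustive_runs(levels_per_knob, num_knobs):
    # Simulator evaluations needed to sweep a discretised design space
    # exhaustively: every combination of every knob setting.
    return levels_per_knob ** num_knobs

# Even a coarse grid of 10 settings for each of 12 knobs (cache sizes,
# lane counts, clock domains, interconnect widths, ...) needs 10^12
# simulator runs, so any practical search must be heavily guided rather
# than treated as a naive black-box sweep.
budget = exhaustive_runs(10, 12)
```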