Oh, man... this is the same old stuff from the 2023 Anna Goldie statement (is this Anna Goldie's comment?). This was all addressed by Kahng in 2023 - no valid criticisms. Where do I start?

Kahng's ISPD 2023 paper is not in dispute - no established experts objected to it. The Nature paper is in dispute. Dozens of experts objected to it: Kahng, Cheng, Markov, Madden, Lienig, Swartz objected publicly.

The fact that Kahng's paper was invited doesn't mean it wasn't peer reviewed. I checked with ISPD chairs in 2023 - Kahng's paper was thoroughly reviewed and went through multiple rounds of comments. Do you accept it now? Would you accept peer-reviewed versions of other papers?

Kahng is the most prominent active researcher in this field. If anyone knows this stuff, it's Kahng. There were also five other authors in that paper, including another celebrated professor, Cheng.

The pre-training thing was disclaimed in the Google release. No code, data or instructions for pretraining were given by Google for years. The instructions said clearly: you can get results comparable to Nature without pre-training.

The "much older technology" is also a bogus issue because HPWL scales linearly and is reported by all commercial tools. Rectangles are rectangles. This is textbook material. But Kahng et al. prepared some very fresh examples, including NVDLA, with two recent technologies. Guess what, RL did poorly on those. Are you accepting this result?

The bit about financial incentives and open-source is blatantly bogus, as Kahng leads OpenROAD - the main open-source EDA framework. He is not employed by any EDA companies. It is Google who has huge incentives here, see Demis Hassabis's tweet "our chips are so good...".

The "Stronger Baselines" paper matched compute resources exactly. Kahng and his coauthors performed fair comparisons between annealing and RL, giving the same resources to each. Giving greater resources is unlikely to change results. This was thoroughly addressed in Kahng's FAQ - if only you would read it.

The resources used by Google were huge. Cadence tools in Kahng's paper ran hundreds of times faster and produced better results. That is as conclusive as it gets.

It doesn't take a Ph.D. to understand fair comparisons.

negativeonehalf|1 year ago

For AlphaChip, pre-training is just training. You train, and save the weights in between. This has always been supported by Google's open-source repository. I've read Kahng's FAQ, and he fails to address this, which is unsurprising, because there's simply no excuse for cutting out pre-training for a learning-based method. In his setup, every time AlphaChip sees a new chip, he re-randomizes the weights and makes it learn from scratch. This is obviously a terrible move.

HPWL (half-perimeter wirelength) is an approximation of wirelength, which is only one component of the chip floorplanning objective function. It is relatively easy to crunch all the components together and optimize HPWL; minimizing actual wirelength while avoiding congestion issues is much harder.
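For readers who haven't met the metric, here is a minimal sketch of how HPWL is usually computed: for each net, take the bounding box of its pins and add its width and height. The net names and pin coordinates below are made up for illustration and come from no benchmark; the sketch ignores congestion, density, and routed wirelength entirely.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def net_hpwl(pins: List[Point]) -> float:
    """Half-perimeter of the bounding box enclosing one net's pins."""
    xs = [x for x, _ in pins]
    ys = [y for _, y in pins]
    return (max(xs) - min(xs)) + (max(ys) - min(ys))


def total_hpwl(nets: Dict[str, List[Point]]) -> float:
    """Sum of per-net HPWL over the whole (toy) netlist."""
    return sum(net_hpwl(pins) for pins in nets.values())


def scale_nets(nets: Dict[str, List[Point]], k: float) -> Dict[str, List[Point]]:
    """Uniformly scale every pin coordinate by k (a stand-in for a geometry shrink)."""
    return {name: [(k * x, k * y) for x, y in pins] for name, pins in nets.items()}


# Hypothetical two-net example.
nets = {
    "n1": [(0.0, 0.0), (3.0, 4.0)],
    "n2": [(1.0, 1.0), (2.0, 5.0), (4.0, 2.0)],
}

print(total_hpwl(nets))                   # n1: 3 + 4, n2: 3 + 4
print(total_hpwl(scale_nets(nets, 0.5)))  # uniform scaling scales HPWL by the same factor
```

The second print shows only that HPWL is linear under uniform coordinate scaling; whether that makes results comparable across technology nodes is the point being argued in this thread, not something the sketch settles.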

Simulated annealing is good at quickly converging on a bad solution to the problem, with relatively little compute. So what? We aren't compute-limited here. Chip design is a lengthy, expensive process where even a few-percent wirelength reduction can be worth millions of dollars. What matters is the end result, and ML has SA beat.

(As for conflict of interest, my understanding is Cadence has been funding Kahng's lab for years, and Markov's LinkedIn says he works for Synopsys. Meanwhile, Google has released a free, open-source tool.)

clickwiseorange|1 year ago

It's not that one needs an excuse. The Google CT repo said clearly you don't need to pretrain. "Supported" usually includes at least an illustration, some scripts to get it going - no such thing was there before Kahng's paper. Pre-training was not recommended and was not supported.

Everything optimized in Nature RL is an approximation. HPWL is where you start, and RL uses it in the objective function too. As shown in "Stronger Baselines", RL loses a lot by HPWL - so much that nothing else can save it. If your wires are very long, you need routing tracks to route them, and you end up with congestion too.

SA consistently produces better solutions than RL for various time budgets. That's what matters. Both papers have shown that SA produces competent solutions. You give SA more time, you get better solutions. In a fair comparison, you give equal budgets to SA and RL. RL loses. This was confirmed using Google's RL code and two independent SA implementations, on many circuits. Very definitively. No, ML did not have SA beat - please read the papers.

Cadence hasn't funded Kahng for a long time. In fact, Google funded Kahng more recently, so he has all the incentives to support Google. Markov's LinkedIn page says he worked at Google before. Even Chatterjee, of all people, worked at Google.

Google's open-source tool is a head fake, it's practically unusable.

Update: I'll respond to the next comment here since there's no Reply button.

1. The Nature paper said one thing, the code did something else, as we've discovered. The RL method does some training as it goes. So, pre-training is not the same as training. Hence "pre". Another problem with pretraining in Google work is data contamination - we can't compare test and training data. The Google folks admitted to training and testing on different versions of the same design. That's bad. Rejection-level bad.

2. HPWL is indeed a nice simple objective. So nice that Jeff Dean's recent talks use it. It is chip design. All commercial circuit placers without exception optimize it and report it. All EDA publications report it. Google's RL optimized HPWL + density + congestion.

3. This shows you aren't familiar with EDA. Simulated Annealing was the king of placement from mid 1980s to mid 1990s. Most chips were placed by SA. But you don't have to go far - as I recall, the Nature paper says they used SA to postprocess macro placements.

SA can indeed find mediocre solutions quickly, but it keeps on improving them, just like RL. Perhaps you aren't familiar with SA. I am. There are provable results showing SA finds an optimal solution if given enough time. Not for RL.
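For anyone following along without an EDA background, a minimal sketch of simulated annealing on a toy problem may help. It is not the SA implementation from either paper: the one-row placement, the chain netlist, the swap move, and the geometric cooling schedule with its parameters are all assumptions chosen for illustration. It shows the basic loop being discussed: propose a move, always accept improvements, accept worsening moves with a probability that shrinks as the temperature drops, and keep the best solution seen.

```python
import math
import random
from typing import List, Sequence, Tuple


def hpwl(order: Sequence[int], nets: Sequence[Tuple[int, ...]]) -> int:
    """1-D wirelength: for each net, the span between its leftmost and rightmost cell."""
    pos = {cell: slot for slot, cell in enumerate(order)}
    return sum(max(pos[c] for c in net) - min(pos[c] for c in net) for net in nets)


def anneal(order: Sequence[int], nets: Sequence[Tuple[int, ...]], steps: int = 5000,
           t_start: float = 5.0, t_end: float = 0.01, seed: int = 0) -> Tuple[List[int], int]:
    """Swap-based simulated annealing; returns the best ordering seen and its cost."""
    rng = random.Random(seed)
    current = list(order)
    cur_cost = hpwl(current, nets)
    best, best_cost = list(current), cur_cost
    for step in range(steps):
        # Geometric cooling from t_start down to t_end (an assumed schedule).
        t = t_start * (t_end / t_start) ** (step / max(1, steps - 1))
        i, j = rng.sample(range(len(current)), 2)
        current[i], current[j] = current[j], current[i]
        new_cost = hpwl(current, nets)
        delta = new_cost - cur_cost
        # Metropolis rule: take improvements, sometimes take worsening moves.
        if delta <= 0 or rng.random() < math.exp(-delta / t):
            cur_cost = new_cost
            if cur_cost < best_cost:
                best, best_cost = list(current), cur_cost
        else:
            current[i], current[j] = current[j], current[i]  # undo the swap
    return best, best_cost


# Hypothetical netlist: eight cells connected in a chain, starting from a scrambled row.
nets = [(k, k + 1) for k in range(7)]
start = [0, 4, 1, 5, 2, 6, 3, 7]
best, best_cost = anneal(start, nets)
print(hpwl(start, nets), best_cost)
```

Raising `steps` is the toy analogue of giving SA a larger time budget; because the best-seen solution is tracked, the returned cost is never worse than the starting cost. The sketch makes no claim about how SA compares with RL on real designs.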

sijnapq|1 year ago

[deleted]

smokel|1 year ago

Wow, you seem to be pretty invested in this topic. Care to clarify?

anna-gabriella|1 year ago

Reposting, as someone is flagging my comments. > People in the know are following this topic - big-wow surprise!

anna-gabriella|1 year ago

[deleted]

djmips|1 year ago

> Kahng is the most prominent active researcher in this field. If anyone knows this stuff, it's Kahng.

This reads as a textbook example of the appeal-to-authority fallacy.

ok_dad|1 year ago

For that, the GP would have had to appeal only to the expert’s opinion, with no actual evidence, but the GP actually gave a lot of evidence of the researcher’s expertise in the form of peer-reviewed papers and other links. That’s not an appeal to authority at all.

anna-gabriella|1 year ago

[deleted]