top | item 38380199

Micoloth | 2 years ago

They sure as hell have no incentive to make neural networks faster and more accessible, for starters..

(Considering that, right now, they make more money and have more control the less accessible and the more computation-hungry AI models are.)

To be fair, this approach (claims to) only speed up inference, not training, so all the GPUs are needed anyway.

godelski|2 years ago

I wouldn't be so quick to assume a conspiracy. I'm the author of a paper and a well-known blog post that train a particular common architecture much faster (don't want to dox myself too much) and with far fewer parameters, but the paper has been rejected several times and is now arxiv only. Our most common complaint was "who would use this? Why not just take a large model and tune it?" That question alone held us back a year (the paper had over a hundred citations by then and remains my most cited work), until the objections switched to "use more datasets" and "not novel" (by that time true: others had built off of us, cited us, and published in top venues).

I don't think this was some conspiracy by big labs to push back against us (we're nobodies), but rather that people get caught up in hype and reviewers are lazy and incentivized to reject. You're trained to be critical of works, and especially consider that post hoc most solutions appear far simpler than they actually are. Context matters, though: if you don't approach every paper with nuance, it's easy to say "oh, it's just x." But if those ideas really were so simple and obvious, they would also be prolific.

I see a lot of small labs suffer the same fate simply due to lack of compute. If you don't make your new technique work on many datasets, that becomes the easiest grounds on which to reject a paper. ACs aren't checking that reviews are reasonable. I've even argued with fellow reviewers about papers in workshops -- papers I would have accepted in the main conference -- that are brushed off, where the reviewers admit in their reviews that they do not work on these topics. I don't understand what's going on, but at times it feels like a collective madness. A 10-page paper with 4 very different datasets that solves a problem, is clearly written, has no major flaws, and is useful to the community should not need defending when submitted to a workshop just because reviewers aren't qualified to review the work (this paper got in, btw).

We are moving into a "pay to play" ecosystem, and that will only create bad science due to groupthink. (Another aspect of "pay to play" is tuning: spending $1M to tune your model to be the best doesn't mean it is better than a model that could not afford the search. Often more than half of resources are spent on tuning now.)

wruza|2 years ago

Is there a place where you guys discuss... things? I'm a layman interested in this topic, akin to pop-physics/maths, but have no chance to just read papers and "get it". On the other hand, the immediately available resources focus more on the how-to side than on what's going on overall. Also, do you have something like 3b1b/pbs/nph for it? Content you can watch and say "well, yep, good job" about.

jamesblonde|2 years ago

Unless they were very confident of acceptance, a top research prof would rewrite and resubmit before publishing on arxiv, so that others couldn't "build on it" (scoop you at a top conference).

WithinReason|2 years ago

They certainly have an incentive to keep these kinds of improvements in-house and not publish them, since they are commercial entities and this represents a competitive advantage.

lawlessone|2 years ago

I think Nvidia might have an incentive for this not to exist.

edit: but you are right that for the AI companies not open-sourcing their models, it's an advantage to have it when others don't.