Zandikar | 1 year ago

Of course. A cheap card with oodles of VRAM would benefit some people; I'm not denying that. I'm tackling the question of whether it would benefit Intel (the original question was "why doesn't Intel do this"), and the answer is: profit/loss.

There's a huge number of people in that community who would love to have such a card. But how many are actually willing and able to pony up >=$3k per unit? How many units would they buy? Given all of the other considerations that go into making such cards useful and easy to use (as described), the answer is - in Intel's mind - nowhere near enough, especially when the financial side of the company's jimmies are so rustled that they sacked Pat G without a proper replacement and installed some finance bros as interim CEOs. Intel is ALREADY taking a big risk and financial burden trying to get into this space in the first place, and it's already struggling, so the prospect of betting the house like that just isn't going to fly with finance bros who can't see past the next 2 quarters.

To be clear, I personally think there is huge potential value in supporting the OSS community to, in essence, "crowd source" and speedrun some of that ecosystem: supply (compared to the competition) "cheap" cards that eschew the artificial segmentation everyone else is doing, and invest in that community. But I'm not running Intel, so while that'd be nice, it's not really relevant.

ryao | 1 year ago

I suspect that Intel could hit a $2000 price point for a 128GB ARC card, but even at $3000 it would certainly be cheaper than buying 8 A770 LE ARC cards and connecting them to a single machine. There are likely tens of thousands of people on Reddit's LocalLLaMA buying multiple GPUs to run local inference, and that is only a subset of the market.
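
For a back-of-the-envelope comparison (the $349 figure is the A770 LE's 16GB launch MSRP; the host cost is a rough guess, not a quote):

    # Rough cost of assembling 128GB of VRAM from A770 LE cards.
    # Assumptions: $349 launch MSRP per 16GB card, plus a
    # hypothetical ~$1500 host that can actually seat 8 GPUs.
    a770_price, a770_vram_gb = 349, 16
    cards = 128 // a770_vram_gb          # 8 cards
    total = cards * a770_price + 1500
    print(cards, total)                  # 8 cards, ~$4292 all-in
    # A single hypothetical 128GB card at $2000-$3000 undercuts
    # that, with none of the multi-GPU power/space/driver hassle.

So even at the high end of that range, the single card wins before you count electricity or the software pain of sharding a model across 8 GPUs.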

In Q1 2023, Intel sold 250,000 ARC cards. Sales then collapsed the next quarter. I would expect sales of such a card to easily exceed that figure and be sustained. The demand for high-memory GPUs is far higher than many realize. Professional inference operations such as the ones listed at openrouter.ai would gobble up 128GB VRAM ARC cards for running smaller high-context models, much like businesses gobble up the Raspberry Pi for low-end tasks, and that is before even counting the local inference community.
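
To see why high-context serving chews through VRAM, here is a rough KV-cache sizing sketch (the layer/head figures are illustrative assumptions, not any particular model's specs):

    # Approximate KV-cache size for transformer inference.
    # Illustrative assumptions: 32 layers, 8 KV heads of
    # dimension 128, fp16 (2 bytes/value), 128k-token context.
    layers, kv_heads, head_dim, bytes_per = 32, 8, 128, 2
    context_len = 128_000
    # Factor of 2 = one K and one V tensor per layer, per token:
    kv_bytes = 2 * layers * kv_heads * head_dim * context_len * bytes_per
    print(kv_bytes / 2**30, "GiB")  # ~15.6 GiB for ONE sequence
    # Multiply by batch size: a handful of concurrent long-context
    # requests outgrows a 24GB consumer card before you even load
    # the model weights.

And that is the cache alone; the weights themselves still need their own tens of gigabytes on top.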