RyeCatcher | 4 months ago

I absolutely love it. I've been up for days playing with it, but there are some bleeding-edge issues. I tried to write a balanced article. I'd highly recommend it for people who love to get their hands dirty. Blows away any consumer GPU.

enum | 4 months ago

+1

I have H100s to myself, and access to more GPUs in national clusters than I know what to do with.

The Spark is much more fun, and I'm more productive. With two of them, you can debug shallow NCCL/MPI problems before hitting a real cluster. I sincerely love Slurm, but there's nothing like a personal computer.
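
For concreteness, the kind of shallow problem I mean is the sort you can catch with a minimal torch.distributed all_reduce smoke test like this (a sketch: the script name and rendezvous endpoint are placeholders, and it assumes PyTorch with NCCL support on both machines):

    # allreduce_check.py -- minimal two-node NCCL sanity check.
    # Launch from each node with torchrun (assumed setup):
    #   torchrun --nnodes=2 --nproc-per-node=1 \
    #     --rdzv-backend=c10d --rdzv-endpoint=<node0-ip>:29500 allreduce_check.py
    import os

    import torch
    import torch.distributed as dist

    dist.init_process_group(backend="nccl")  # torchrun supplies rank/world size via env
    rank = dist.get_rank()
    torch.cuda.set_device(int(os.environ["LOCAL_RANK"]))

    # Each rank contributes its rank; the reduced sum must be 0+1+...+(world-1).
    x = torch.tensor([float(rank)], device="cuda")
    dist.all_reduce(x, op=dist.ReduceOp.SUM)
    expected = float(sum(range(dist.get_world_size())))
    assert x.item() == expected, f"rank {rank}: got {x.item()}, want {expected}"
    print(f"rank {rank}: all_reduce OK")
    dist.destroy_process_group()

If this hangs or the assert fires, you get to debug it at your desk instead of inside a batch job.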

latchkey | 4 months ago

Your complaint sounds more like it's about the way you have to access the HPC (via Slurm), not the compute itself. Having now tried Slurm myself, I don't understand the love for it at all.

As for debugging, that's where you should be able to spin up a small testing cluster on demand. Why can't you do that with your Slurm access?

Tepix | 4 months ago

> Blows away any consumer GPU.

Nah. Do you have first-hand experience with Strix Halo? At less than €1,600 for a 128GB configuration, it manages >45 tokens/s with gpt-oss-120b, which is faster than the DGX Spark at a fraction of the cost.
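
If you want to reproduce a number like that yourself, a rough throughput harness via llama-cpp-python is enough (a sketch: the GGUF path is a placeholder for whatever quant you actually run, and this crude timing lumps prefill in with decode):

    # Rough tokens/s check with llama-cpp-python against a local GGUF quant.
    import time

    from llama_cpp import Llama

    # Placeholder path; substitute your own quantized model file.
    llm = Llama(model_path="gpt-oss-120b-Q4_K_M.gguf", n_ctx=4096, n_gpu_layers=-1)

    t0 = time.time()
    out = llm("Explain KV caching in one paragraph.", max_tokens=256)
    dt = time.time() - t0

    n = out["usage"]["completion_tokens"]
    print(f"{n} tokens in {dt:.1f}s -> {n / dt:.1f} tok/s")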

storus | 4 months ago

Strix Halo has awful prefill (prompt processing) speed, so it's only suitable for very small contexts.

CompoundEyes | 4 months ago

One thing I can't find anyone mentioning in reviews: does inference screech to a halt with large context windows, say in the 100k range on gpt-oss? I'm not concerned about lightning inference speed overall, since I understand the Spark's purpose is to be a well-rounded trainer/tuner. I just want to know whether it becomes unusable at larger contexts or just reasonably slower. That's the thing people are unpleasantly surprised to discover about the Mac Studio, and it's what has kept me from going that route.
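
To put the concern in numbers, the back-of-envelope math looks like this (both prefill rates are made-up illustrations, not measurements of the Spark or the Mac Studio):

    # Time-to-first-token at a 100k-token prompt under assumed prefill rates.
    prompt_tokens = 100_000
    for label, prefill_tps in [("strong prefill", 2_000), ("weak prefill", 200)]:
        minutes = prompt_tokens / prefill_tps / 60
        print(f"{label} ({prefill_tps} tok/s): {minutes:.1f} min to first token")

That's the gap between under a minute and over eight minutes of waiting before the first token, which is exactly the unusable-vs-reasonable line I'm asking about.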

yunohn | 4 months ago

Thanks for this bleeding-edge content!

But please have your LLM post-writer be less verbose and repetitive. This reads like stock LLM output: it describes everything in detail, then summarizes it again, back and forth across multiple useless sections. Please consider a smarter prompt and some post-editing…

Tepix | 4 months ago

I agree wholeheartedly. Two-thirds of the article reads like slop.

furyofantares | 4 months ago

Since the text is obviously LLM output: how much prompting and editing went into this post? Did you have to correct places where it garbled what you gave it or added incorrect output?

NathanielK | 4 months ago

It definitely reeks of someone who doesn't know what makes a readable blog post and hoped the LLM did.

I wasn't familiar with the hardware, so I was disappointed there wasn't a picture of the device. I tried to skim the article and it's a mess: inconsistent formatting and emoji, without a single graph to visualize the benchmarks.