We built installable software for Windows & Linux that makes any remote Nvidia GPU accessible to, and shareable across, any number of remote clients running local applications, all over standard networking.
> Basically, we aren't targeting support for graphical applications running on Linux because there is very little demand for this - but we cover everything else. You CAN run graphical applications from a Windows client against a Linux server.
Ah, disappointing; I was hoping to try this with a Steam Deck as an alternative to using Moonlight and streaming the entire game from the Windows machine.
I'm confused, where is the actual source code? This repo only contains some Dockerfiles that, as far as I can tell, pull precompiled opaque binaries, plus some convenience scripts to set up the required runtime environment.
I assume it cost quite some $$$ to produce this, because you kinda have to cut Nvidia's binary drivers in half, which is going to require quite a lot of reverse engineering.
Serverless GPU has been all the rage this past month - I'd love to see a review of this from someone who knows how to benchmark a GPU workload.
In particular:
- Autoscaling Stable Diffusion Inference
- Traditional creative workflows (real-time GPU viewport in Octane, for example)
- Gaming from one GPU in your house to everywhere else
I get the training example for small models, but I can't imagine it scales that well with model size.
The big value seems to be... share 1 GPU to many computers, so spend less on a cluster? Capacity fungibility is real value but hard to measure!
In any case, stuff like this is a good bet. GPU software will continue to increase in prevalence, and utilization will remain low. Solving for the compute market liquidity is important despite NVIDIA's best efforts.
We have all these running fantastically, please check out our Discord where we have clips and demonstrations of these sorts of workloads. https://discord.gg/2SWbpXx9
For anything involving inference you're much better off with one of the many inference model servers, such as TensorFlow Serving, Triton Inference Server, etc.
It surprises me that this works well enough to be useful. I would have thought that network latency, being orders of magnitude higher than memory latency, would be a huge problem. Latency Numbers Everyone Should Know: https://static.googleusercontent.com/media/sre.google/en//st...
I'd be surprised if this works for anything latency sensitive over anything more than a LAN.
Even just the speed-of-light transit time between NY and LA (4×10^6 m ÷ 3×10^8 m/s ≈ 1/75 s) is roughly as long as one 60 fps frame (1/60 s). Add the OS serializing the frame from the GPU onto the network card, plus the network switching of those packets, and you're starting to really feel that latency.
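The back-of-envelope arithmetic above can be checked in a few lines. The numbers are the commenter's rough assumptions (4,000 km NY to LA, speed of light in vacuum; real fiber is roughly a third slower, which only makes things worse):

```python
# Rough latency budget for rendering frames on a remote GPU (illustrative).
distance_m = 4e6    # NY to LA, ~4,000 km
c = 3e8             # speed of light in vacuum, m/s (fiber is ~2/3 of this)

one_way_delay = distance_m / c   # ~13.3 ms, one direction only
frame_budget = 1 / 60            # ~16.7 ms per frame at 60 fps

print(f"one-way light delay: {one_way_delay * 1000:.1f} ms")
print(f"60 fps frame budget: {frame_budget * 1000:.1f} ms")
# Propagation alone eats ~80% of the frame budget, before any
# serialization, switching, or encode/decode overhead is counted.
```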
About 10 years ago I found that set operations in Ruby were slower than set operations in Redis. So I shipped all my data over the network, let Redis sort it into a sorted set, crunched my data in Redis, and then retrieved it again over the network in its reduced form… I think it makes sense that for vector operations a remote GPU could be pretty cool. Now if we can get this working from MacBooks to Linux GPUs, I'd be pretty stoked.
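The tradeoff in the Redis anecdote above (shipping data over the wire pays off when the remote engine is enough faster per item) can be modeled crudely. All numbers here are illustrative assumptions, not benchmarks:

```python
def remote_wins(n_items, item_bytes, bandwidth_bps, rtt_s,
                local_per_item_s, remote_per_item_s):
    """Crude model: is remote compute plus transfer faster than local compute?"""
    # Round trip to set up plus time to push the data across the link.
    transfer = 2 * rtt_s + (n_items * item_bytes * 8) / bandwidth_bps
    remote_total = transfer + n_items * remote_per_item_s
    local_total = n_items * local_per_item_s
    return remote_total < local_total

# 1M small items over gigabit LAN, 1 ms RTT, remote engine 10x faster per item:
# transfer ~0.13 s + remote compute ~0.2 s beats ~2 s locally.
print(remote_wins(1_000_000, 16, 1e9, 0.001, 2e-6, 2e-7))  # → True

# For tiny batches the fixed round trip dominates and local wins.
print(remote_wins(100, 16, 1e9, 0.001, 2e-6, 2e-7))  # → False
```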
Didn't we have those things already? VirtualGL and co. say hi.
Also, for most real GPU applications you need to get the data in and out. I don't think splitting compute across (insert any non-InfiniBand link here) solves this.
I see lots of comments in various ML repositories about trouble running on multiple GPUs. This seems like a great way to run across multiple low-VRAM GPUs instead of buying a huge, expensive single card. It feels reminiscent of how Google built their clusters on commodity hardware, where they would just throw away a failed device rather than trying to fix it. This is really cool.
That's really awesome. I'm not sure what I'd use it for but just being able to makes me want to find an excuse! What's impressive is this seems to have more capabilities than most "local" software vGPU solutions for e.g. VMs.
Do you have any numbers on the viability of using this for ML/AI workloads? It seems like once a model is loaded into GPU VRAM, theoretically the transactional new inputs/outputs would be trivial.
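The intuition in the question above (weights stay resident in VRAM; only inputs and outputs cross the network per request) can be sanity-checked with rough sizes. All numbers are illustrative assumptions for a Stable-Diffusion-scale image model, not measurements of this product:

```python
GB = 1024 ** 3
MB = 1024 ** 2

weights_bytes = 4 * GB       # fp16 model, uploaded to GPU VRAM once
request_in = 2 * 1024        # a text prompt / latent seed, ~2 KB
request_out = 3 * 512 * 512  # one 512x512 RGB image, ~0.75 MB

per_request = request_in + request_out
print(f"one-time weight upload: {weights_bytes / GB:.1f} GiB")
print(f"per-request traffic:    {per_request / MB:.2f} MiB")
# Steady-state per-request I/O is thousands of times smaller than the
# one-time weight upload, so network bandwidth is mostly a startup cost.
```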
Can this be used to accelerate video decode in a Linux machine/virtual machine? It sounds like it is not for graphics on Linux, but it's unclear to me where decode falls.
Definitely of interest to us, even with the latency limits, both for AI dev and investigations and for occasional full runs.
I do have to wonder about the non-OSS licensing, as that's part of why we didn't spend much time on Bitfusion...