
New – EC2 Instances (G5) with Nvidia A10G Tensor Core GPUs

95 points | my123 | 4 years ago | aws.amazon.com

71 comments

[+] latch | 4 years ago
I take pretty unhealthy pleasure in looking at EC2/GC announcements and comparing them to random offers on WebHostingTalk's Dedicated Hosting Offers section. The cloud offerings almost always end up being 2-4x more expensive (or more) and 2-4x slower.

Except when GPUs are involved. This just seems like something the dedicated hosting industry isn't on top of _at all_. I don't know if it's hardware availability, upfront costs, unreliable demand, an inability to compete on price, or if the usage really is that much more elastic.

The best I could [quickly] find was a 5600X w/ 32 GB RAM, 1 TB SSD and a 12 GB 3060 for $189/m.

[+] toomuchtodo | 4 years ago
You’re not paying for the compute, you’re paying for IAM, the service ecosystem, unified enterprise billing, and the engineering experience. Totally different market than colocated bare metal. Like comparing a crated LS engine to a showroom Corvette.
[+] totoglazer | 4 years ago
I think this really is much more elastic for some workloads. I want to use hundreds of A100s for a few weeks. Then not use them for a quarter while I use and analyze what I trained, fine tune on smaller machines, etc. We’d never be able to justify buying a cluster that large and managing it for intermittent use, much better to have it burst properly.
[+] fxtentacle | 4 years ago
When I needed GPU instances, I asked my long-term bare metal hoster (hetzner.de) and they sent me a private price list.

My theory is that publicly advertising GPU instances tends to attract people who want to convert stolen credit cards into crypto mining power. I mean, it's the same reason the 3090 (the GPU most suitable for crypto mining) has been sold out for a year.

That's also in line with the fact that I had to sign a declaration stating that I will NOT do any kind of crypto hosting / mining / processing on any of their servers before I got that GPU instance.

[+] rjzzleep | 4 years ago
Apparently Hetzner discontinued theirs because of cryptocurrency abuse (whatever that means).
[+] slownews45 | 4 years ago
If you like following WHT dedicated hosting offers, one fun thing is to see how long-lasting (or not) these providers are.

Despite supposedly being 4x faster and 4x cheaper, these dedicated hosting offer guys come and go like moths circling a burning flame.

Same thing with the S3-competitor file hosting companies. I haven't followed this space recently, since the hotfile / filesonic days. All supposedly way cheaper than S3 - but boy do they disappear in the night at times.

[+] ev1 | 4 years ago
IIRC, you are not allowed to use a RTX/GTX in a datacentre.
[+] Jack000 | 4 years ago
Still way too expensive for individuals and startups imo. These A10 cards are just slightly slower than an RTX 3090, which makes it an easy comparison. $1/hour for the cheapest option means renting for 62 days is equivalent to buying outright (add a few weeks to account for the price of the other computer components).

A big part of the difference is Nvidia's datacenter tax (the A10 is basically a 3090 at double the price). Hopefully AMD's new accelerators will bring some much-needed competition to the market.
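The break-even arithmetic in this comment is easy to sketch. The $1/hour rate and the ~62-day figure come from the comment itself; the card price and component costs below are illustrative assumptions:

```python
# Rough rent-vs-buy break-even for a 3090-class GPU vs. the cheapest G5 option.
HOURLY_RATE = 1.00         # USD/hour, cheapest G5 option (from the comment)
CARD_PRICE = 1500.00       # USD, assumed RTX 3090 street price
OTHER_COMPONENTS = 800.00  # USD, assumed host machine (CPU/RAM/PSU/etc.)

card_only_days = CARD_PRICE / HOURLY_RATE / 24
total_days = (CARD_PRICE + OTHER_COMPONENTS) / HOURLY_RATE / 24

print(f"Card alone: ~{card_only_days:.0f} days of 24/7 renting")
print(f"With the rest of the machine: ~{total_days:.0f} days")
```

At these assumed prices, the card alone pays for itself in about two months of continuous renting, and the full machine in roughly three, which matches the "62 days plus a few weeks" estimate above. Note this assumes 24/7 utilization; intermittent use shifts the math heavily toward renting.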

[+] aljarry | 4 years ago
Most people don't rent just a single GPU for anything outside of training. The A10's tensor cores are a faster version of the T4's, and T4s are cheaper for running inference (albeit slower than a typical GPU).

The cheapest reasonable GPU instance you can get on EC2 is the p3.2xlarge at $3.06/h.

[+] dougSF70 | 4 years ago
Given that 5-year-old cards can still perform at some level, buying makes much more sense than renting at $1 per hour.
[+] scottcodie | 4 years ago
There are a lot of costs to consider beyond just the purchase price of the physical hardware. If you need dedicated tensor cards full-time and are willing to foot the datacenter and staffing costs, then purchasing the cards outright may be the better option.
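The point about total cost of ownership can be made concrete with a toy comparison. Every figure here is an illustrative assumption (not a real quote), except the $1/hour G5 rate mentioned elsewhere in the thread:

```python
# Toy one-year total-cost-of-ownership comparison for full-time GPU use.
# All dollar figures below are assumed for illustration.
HOURS_PER_YEAR = 24 * 365

rent = 1.00 * HOURS_PER_YEAR  # cheapest G5 at ~$1/hour, running 24/7

hardware = 2300               # card + host machine, written off in year one (assumed)
colo = 150 * 12               # colocation/power, $150/month (assumed)
ops = 2000                    # slice of an engineer's time for upkeep (assumed)
buy = hardware + colo + ops

print(f"Rent: ${rent:,.0f}/yr   Buy: ${buy:,.0f}/yr")
```

Under these assumptions buying wins for sustained full-time use, but the colo and staffing lines grow with fleet size, and the rent line drops to near zero when the workload is idle, which is the elasticity argument made upthread.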
[+] dougSF70 | 4 years ago
I am a fan of AWS, but it is all too easy to run up large bills. I began using accelerated compute instances, but as a newbie it was hard for me to get a feel for the performance - how large a dataset we could run before it fell over or overheated. If we were spending other people's money I would have gone all-in on AWS GPU-enabled instances; as a bootstrapped firm doing some experiments in the ML space, it was more cost-effective to buy a small server and get a feel for the performance envelope. We just doubled our hardware spec for another on-prem computer. The best thing is, Amazon could not offer better value.
[+] DarthNebo | 4 years ago
Still awaiting GPU runtime in Lambda. The only alternatives I can think of are ECS auto-scaling or hooking up a Kubernetes cluster with a Knative/KEDA solution.
[+] JoshTriplett | 4 years ago
Forget the GPUs, the top instance type here (g5.48xlarge) has 192 vCPUs! I'd love to get that in a compute-optimized instance type, without the GPUs.

(The largest available compute-optimized instance is currently the c6i.32xlarge at 128 vCPUs, not counting the absurd u-* family of instances that mere mortals can't get. 192 vCPUs would be a substantial upgrade; 256 would be even better.)

[+] rp1 | 4 years ago
Just out of curiosity, what workload do you have that requires so much compute? Could you move it to the GPU?
[+] infocollector | 4 years ago
Does anyone know how this will compare with the V100 GPUs that AWS offers currently? Any benchmark pointers?
[+] coolspot | 4 years ago
A10G is a cut-down 3090:

> Unlike the fully unlocked GeForce RTX 3090 Ti, which uses the same GPU but has all 10752 shaders enabled, NVIDIA has disabled some shading units on the A10G to reach the product's target shader count. It features 9216 shading units, 288 texture mapping units, and 96 ROPs. Also included are 288 tensor cores which help improve the speed of machine learning applications. The card also has 72 raytracing acceleration cores.
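The "cut-down" claim is easy to quantify directly from the shader counts in the quote:

```python
# How cut down is the A10G relative to the fully unlocked RTX 3090 Ti?
# Both counts are taken from the quoted spec (same GPU die per the quote).
A10G_SHADERS = 9216
RTX_3090_TI_SHADERS = 10752

ratio = A10G_SHADERS / RTX_3090_TI_SHADERS
print(f"A10G has {ratio:.0%} of the 3090 Ti's shading units enabled")
```

So roughly one in seven shading units is disabled, which lines up with the "slightly slower than a 3090" characterization elsewhere in the thread.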

[+] TaylorPhebillo | 4 years ago
I used to be engaged in this part of the industry, but haven't looked in a while. So, sincere question: Are V100's still meaningfully popular? I know they were incredible for a long time, but I figured most usage would have shifted to A100 for high performance or T4 for cost.
[+] ironfootnz | 4 years ago
I can’t believe people are still paying for cloud providers. The next generation of computing is decentralized and cheap. AWS stands on the back of poor engineering and massive marketing BS.

As a CTO of two unicorns, I’d say it’s pretty much wasted money.

[+] teloli | 4 years ago
The "Remote Workstations" use case is interesting: what kind of software do people use to do that nowadays? Is x2go still a thing?
[+] jaimehrubiks | 4 years ago
Crazy that for each new GPU type (or new instance type in general) AWS needs to make it available in many, many regions at the same time.