This exact setup is easier on Fly.io - our proxy layer runs in 20 regions worldwide with anycast, so your requests hit the nearest region and quickly terminate TLS there.
You can also run any Docker container, and either choose regions to run them in, or just set the min/max and ask us to start and stop containers in whichever region has demand, so your deployment follows the sun.
Correct me if I am wrong: Fly's anycast has its limitations compared to Global Accelerator (GA), though:
On occasion, it breaks UDP protocols that are "connection oriented" (like QUIC and WireGuard, though both have built-in capabilities to recover).
There is no way to pin traffic to VMs (route / client affinities) or shape traffic.
GA offers 100+ locations and two anycast IPs (in two distinct "network zones").
---
Alternatives to Fly on AWS that I know of:
Anycast on AWS without Global Accelerator: S3 buckets with transfer acceleration; (edge-optimized) API Gateway to Lambda / Fargate; S3 + CloudFront.
AWS AppRunner + Copilot (which are comparable to Fly + Flyctl) can be geo-routed to nearest instance by DNS-based load balancing with Route53 (not anycast specifically).
---
Fly's killer feature (and why we are transitioning to it) is its cost-effectiveness and almost 'zero-devops' setup.
- Super cheap bandwidth ($2 per 100GB!)
- Free deploys (AppRunner charges for deploys)
- Free monitoring (versus expensive but comprehensive CloudWatch)
- Free orchestration
- Free anycast transit (expensive on AWS)
- Cheaper, zero-touch, global/cross-region private network across VMs in the same Fly org (no mucking with transit gateways, NAT gateways, internet gateways, VPC private endpoints, IAM policies...).
- Super cheap and fast disks ($0.15 per GB/disk!)
- Easier HA Postgres and HA Redis setups.
- Source/client IPs and ports are preserved by Fly's proxy; GA's TCP proxy does not preserve them.
I really like Fly and would love to move some side-project workloads to it. The only thing holding me back is the Postgres product, which seems a little 'not ready for production'; I'm referring mostly to point-in-time recovery and ease of backup restoration.
The product looks too good to be true, and when you dig into it a little deeper it seems like it isn't totally 100% there.
Amazon RDS is something that I really trust, but I didn't get the same vibe looking at Fly Postgres.
It is 2021, and AWS Global Accelerator still does not support IPv6. Google Cloud has supported this on their global anycast load balancers since September 2017.
I get the distinct impression that big, core chunks of AWS don't support IPv6, and where they do, it's very much a bolted-on second class citizen for which one should keep the expectations low.
Google is by far the worst cloud provider when it comes to IPv6 support. They only support IPv6 on VPC networks in asia-east1, asia-south1, europe-west2, and us-west2; you cannot access the metadata service over IPv6; and IPv6 on load balancers only left alpha last year, I believe.
In the context of the problem being solved – removing Referer headers – isn't this optimising the wrong thing?
Almost every browser, except for IE, supports the `Referrer-Policy` header. We should be aiming to avoid additional redirects, not to make them faster.
I would say the vast majority of solutions do it properly, and it's hard to fathom why someone would ever use a service like this. Quite apart from adding latency, how does looping in another party improve privacy/security? That does not follow.
Set the header and call it a day, and at this point browsers should default to same-origin. The only outlier is IE 11, with 0.5% usage, and it is so grossly out of date it's pretty reasonable to just drop it.
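To make "set the header and call it a day" concrete, here is a minimal sketch in Python (the helper name and the dict-based header handling are mine, not from any framework; the policy values are the ones the Referrer-Policy spec defines):

```python
# Directive values defined by the Referrer-Policy specification.
ALLOWED_POLICIES = {
    "no-referrer", "no-referrer-when-downgrade", "origin",
    "origin-when-cross-origin", "same-origin", "strict-origin",
    "strict-origin-when-cross-origin", "unsafe-url",
}

def with_referrer_policy(headers, policy="same-origin"):
    """Return a copy of `headers` with a Referrer-Policy header added."""
    if policy not in ALLOWED_POLICIES:
        raise ValueError(f"unknown Referrer-Policy value: {policy}")
    out = dict(headers)
    out["Referrer-Policy"] = policy
    return out
```

Attach the resulting headers to every HTML response and the browser strips or trims Referer on outbound navigation itself; no redirect hop is involved.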
I remember in the very early Internet trying to raise an alarm that a lot of people didn't realize the privacy implications of referrer headers (run a website and you could find all sorts of crazy niche discussion forums, see the resources people were referencing, etc). I certainly am not claiming prescience, but it was amazing how little anyone cared. Mind you, I also once promoted the notion that browsers should only ever send strongly, expensively domain-specific hashed passwords from password elements, and that too was pooh-poohed. Several billion pwned passwords later...
EC2s in many regions + Global Accelerator is a completely different pricing model to pay-as-you-go serverless (and, depending on your use case and usage patterns, more or less expensive), and scaling is entirely up to you. It's cool to optimise for latency, but it would have been nice to see the resulting costs and downsides (scaling) discussed.
3x t3a.small @ $13.72/month = $41.16/month. A 3-region, single-node web app could be done for under $60, dropping to around $43 with instance reservations. With a config this small there is no need for regional LBs, and even if you have multiple instances per region, GA has configurable internal flow hashing.
The dominant cost for any real site will likely continue to be bandwidth. GA pushes egress pricing up substantially in many cases, from 1.5c/GiB to as much as 10.5c/GiB, but this is unlikely to be a concern unless you're serving heavy images or video.
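To put the egress numbers side by side (the per-GiB rates are the ones quoted above; the 1 TiB/month traffic figure is an assumption for illustration, and rates are expressed in tenths of a cent to keep the arithmetic exact):

```python
# 1.5c/GiB = 15 tenth-cents, 10.5c/GiB = 105 tenth-cents.
def egress_cost_dollars(gib, tenth_cents_per_gib):
    """Monthly egress bill in dollars at a flat per-GiB rate."""
    return gib * tenth_cents_per_gib / 1000

direct = egress_cost_dollars(1024, 15)    # plain EC2 egress: $15.36
via_ga = egress_cost_dollars(1024, 105)   # worst case via Global Accelerator: $107.52
```

At these rates GA multiplies the egress bill by seven, which is why it only starts to matter for image- or video-heavy traffic.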
Autoscaling works the same as anywhere else. You can use spot instances, or even front ALBs with GA and build your whole app out of Lambda functions (although this would assuredly completely negate any latency benefit gained by using GA in the first place!).
The author likes AWS Global Accelerator, mainly because it uses BGP anycast, which means packets drop into AWS's network at the closest AWS point of presence instead of traversing the public internet. In the author's experience, this feature of GA, combined with always-on VMs that do not need to cold start, provides low latency for a particular service he is running.
The author also mentions Cloudflare and Google App Engine, but rejects these because on those services he chose to use the lambda-like compute functionality and wasn't prepared for the cold starts. He doesn't appear to have tried using Lambda on AWS, or dedicated VMs with Google. Thus it is a slightly apples-to-oranges comparison.
I guess the blog is using referrer stripping as an example to demo/play with AWS global accelerator.
If it were me, I would set a `Referrer-Policy: origin` (or `no-referrer`) header[1] and avoid all the overhead. It's supported on all current browsers[2], with no additional costs or code to worry about.
Also, does it seem a bit insecure to anyone else to outsource referrer stripping if your goal is to keep the content of the referrer from third parties? How well do you trust the middlemen in this case?
1. https://developer.mozilla.org/en-US/docs/Web/HTTP/Headers/Re...
2. https://caniuse.com/referrer-policy
Looks to be a service to remove the HTTP referer header when linking to other sites.
Say you're on example.com and click a link to foo.com: the browser will send the HTTP header `Referer: example.com` in the GET request to foo.com, which means foo.com can then track how you came to their site.
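A comment further down notes the service appears to respond with a refresh meta tag. A sketch of that mechanism (the function name and markup are mine and hypothetical, not the service's actual code):

```python
import html

def dereferrer_page(target_url):
    """HTML page that immediately forwards the browser to `target_url`.

    The browser navigates *from this page*, so the original page's URL
    never reaches the destination site in the Referer header.
    """
    safe = html.escape(target_url, quote=True)
    return (
        "<!doctype html><html><head>"
        f'<meta http-equiv="refresh" content="0;url={safe}">'
        '<meta name="referrer" content="no-referrer">'
        "</head><body></body></html>"
    )
```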
Using a Rwandan TLD negates pretty much all the effort in finding the fastest provider. No glue records and an unresponsive TLD nameserver translates into 100+ms for the initial DNS lookup (likely the only one that matters for such a service).
If you dig +trace url.rw, you can see that the NS record for url.rw is held on ns-rw.afrinic.net, pch.ricta.org.rw, ns1.ricta.org.rw, ns3.ricta.org.rw and fork.sth.dnsnode.net. It's true that some of those servers are slower than others (for me, the AfriNIC server is 500ms whilst the dnsnode.net server is 50ms), but that shouldn't really matter because the TTL on the record is 86400. So the chances are that all the big DNS services (8.8.8.8 etc) should have the correct nameservers for url.rw in the cache already. Yes, if you're running a local-ish resolver, things are different... but most folks are dependent on Google, Cloudflare or their (large) ISP.
The actual A record for url.rw is held on AWS's DNS servers, with a TTL of 300. But AWS's DNS servers are fast.
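The caching argument in miniature: with an 86400-second TTL, a big resolver only pays the slow trip to the .rw nameservers once a day. A toy TTL cache illustrating this (not a real resolver; the `now` parameter is mine, to make expiry deterministic):

```python
import time

class TtlCache:
    """DNS-style cache: an entry is served until its TTL expires."""

    def __init__(self):
        self._store = {}

    def put(self, name, value, ttl, now=None):
        now = time.time() if now is None else now
        self._store[name] = (value, now + ttl)

    def get(self, name, now=None):
        now = time.time() if now is None else now
        entry = self._store.get(name)
        if entry is None or now >= entry[1]:
            return None  # miss/expired: resolver must re-query the TLD servers
        return entry[0]

cache = TtlCache()
cache.put("url.rw/NS", ["ns1.ricta.org.rw"], ttl=86400, now=0)
assert cache.get("url.rw/NS", now=3600) is not None   # still cached an hour later
assert cache.get("url.rw/NS", now=100000) is None     # expired after a day
```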
> ”Cloudflare will not serve traffic from all edge locations, especially Australia is hit hard.”
I’ve read a few (dated) posts about bandwidth costs being extreme in Australia, and that Cloudflare would route around it, causing higher latency in that region. Is that still the case though?
For example, from Sweden I was usually routed via Denmark. I was a Pro user; more PoPs are available to Business or Enterprise customers, however.
Neat, I didn't know AWS had an anycast service like that. The Cloudflare Workers issue is a bit disappointing, I remember their blog about eliminating cold starts and how clever it seemed, but I guess it's still not there? On another note have you evaluated fly.io at all for this? This looks right in their wheelhouse of many regions + anycast + IPv6, and I could definitely see configuration being simpler with them than AWS. Not sure if they'd meet all your requirements though/how they'd compare on price etc.
The latency that this individual was experiencing with Workers wayyyyy back in 2019 was not caused by cold starts, but rather a bug in the workers runtime which caused rare but long pauses. Specifically, the bug involved a hash map with a broken hash function. Here's me publicly facepalming over the issue at the time: https://twitter.com/KentonVarda/status/1189966124688953344
But that was two years ago, Workers has grown ~an order of magnitude since then (both traffic and team size), and such a bug would be immediately noticed today.
The article author says that after the fix he still wasn't happy with performance, but I don't know what he was experiencing exactly. We've done a lot of other things to improve performance since then; it may very well be better today.
If you need to access the client IPs for various reasons (throttling, analytics, security, etc.), be aware that AWS Global Accelerator still only supports IP preservation in some specific setups: https://docs.aws.amazon.com/global-accelerator/latest/dg/pre...
Another option would be using Fastly and writing something as simple as this (literally under 10 lines): https://fiddle.fastlydemo.net/fiddle/9069876a It probably doesn't handle url encoded strings because I came up with it in a few minutes, but it's zero hassle, is extremely fast, and won't ever break.
I think you could run this service on CloudFront Functions, if you didn't mind using a Refresh header instead of the meta tag.
(CloudFront Functions can't generate body content, but it looks like this service works by responding with a refresh meta tag. So if you change that to a Refresh header you wouldn't need to write the body.)
CloudFront functions run on the 200+ edge locations of CloudFront and, according to the docs, have sub-millisecond startup times. So might be a viable option?
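CloudFront Functions are written in JavaScript; the shape of the header-only variant is sketched here in Python for illustration (the dict layout is mine; only the `Refresh` header trick itself comes from the comment):

```python
def refresh_redirect(target_url):
    """Header-only equivalent of a refresh meta tag.

    Since the redirect lives entirely in the Refresh header, no response
    body is needed, which fits a body-less edge runtime.
    """
    return {
        "status": 200,
        "headers": {
            "Refresh": f"0; url={target_url}",
            "Referrer-Policy": "no-referrer",
        },
        "body": "",
    }
```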
So I have found that if you put your origin behind CloudFront and set it to not cache, you can get similar if not better performance, since you get low latency at the edge plus the benefit of a shared TCP connection for all assets with the origin. The last part means using HTTP/2 and serving your assets on the same domain as the origin.
Yes, the whole point is to absorb user traffic into their own backbone in the first-ish mile rather than let it travel across public Internet up to last-ish mile.
"Cross-region IPv6 frontend IP configurations aren't supported." -- sigh.
It would be an interesting experiment to see what is the cheapest set of VM-like workloads that can be spun up in every region...
Cloudflare Argo is "more like" AWS Edge-optimized API Gateway.