If I’m reading the code right round trips (HTTP requests) go through queueForIdleConn which picks up any pre-existing connections to a host. The only time these connections are cleaned up (in HTTP2) is if keepalives are turned off and the connection has been idle for too long OR the connection breaks in some way OR the max number of connections is hit LRU cache evictions take place.
It should, but like the sibling, I haven't seen what Go does. I've seen it happen elsewhere. Exchange used to cache any answer it got until it restarted. Java has had that behavior from time to time if you're not careful as well.
Querying DNS can be expensive, so it makes sense to build a cache to avoid querying again when you don't need to, but typical APIs for name resolution such as gethostbyname / getaddrinfo don't return the TTL, so people just assume forever is a good TTL. Especially for a persistant (http) connection, it kind of makes sense to never query DNS again while you already have a working connection that you made with that name, and if it's TLS, it's quite possible that you don't check if the certificate has expired while you're connected or if you do a session resumption.
But innocent things like this add up to make operating services tricky. Many times, if you start refusing connections, clients figure it out, but sometimes the caches still don't get cleared.
I don't know about Golang but I swear I've seen this before as well - clients holding on to an old IP address without ever re-resolving the domain name. It makes me wary of using DNS for load balancing or blue-green deployments. I feel like I can't trust DNS clients.
It's been 8-10 years but when I was serving tracking pixels we were astonished how long we still got requests from residential IPs for whole hostnames we had deprecated. That means I would not trust DNS caching anyway. I'm not talking days here, but months, with a TTL set to mere days.
TTL isn't universally respected. Consider the following path:
Your machine -> Local router -> Configured upstream DNS Server (ISP/CF/Quad8/etc) -> ? -> Authoritative DNS Server
Any one of those layers can override/mess with/cache in a variety of ways including TTL. This is why Cloudflare and a variety of other providers use IP anycast. They accepted DNS for what it is and worked around it.
Not only is the IP always the IP, the "global" BGP routing table actually universally and consistently updates much faster than DNS. Then whatever routers, machines, etc downstream from that don't matter.
__turbobrew__|1 year ago
If I’m reading the code right round trips (HTTP requests) go through queueForIdleConn which picks up any pre-existing connections to a host. The only time these connections are cleaned up (in HTTP2) is if keepalives are turned off and the connection has been idle for too long OR the connection breaks in some way OR the max number of connections is hit LRU cache evictions take place.
Furthermore, the golang dnsclient doesn’t even expose record TTLs to callers so how could the HTTP2 transport know when an entry is stale? https://github.com/golang/go/blob/master/src/net/dnsclient_u...
toast0|1 year ago
Querying DNS can be expensive, so it makes sense to build a cache to avoid querying again when you don't need to, but typical APIs for name resolution such as gethostbyname / getaddrinfo don't return the TTL, so people just assume forever is a good TTL. Especially for a persistant (http) connection, it kind of makes sense to never query DNS again while you already have a working connection that you made with that name, and if it's TLS, it's quite possible that you don't check if the certificate has expired while you're connected or if you do a session resumption.
But innocent things like this add up to make operating services tricky. Many times, if you start refusing connections, clients figure it out, but sometimes the caches still don't get cleared.
fotta|1 year ago
Oh wow I didn’t know this but I looked it up and you’re right. Interesting.
hypeatei|1 year ago
loevborg|1 year ago
wink|1 year ago
ignoramous|1 year ago
kkielhofner|1 year ago
Your machine -> Local router -> Configured upstream DNS Server (ISP/CF/Quad8/etc) -> ? -> Authoritative DNS Server
Any one of those layers can override/mess with/cache in a variety of ways including TTL. This is why Cloudflare and a variety of other providers use IP anycast. They accepted DNS for what it is and worked around it.
Not only is the IP always the IP, the "global" BGP routing table actually universally and consistently updates much faster than DNS. Then whatever routers, machines, etc downstream from that don't matter.
__turbobrew__|1 year ago
I would need to re-read the code to refresh my memory.
pvtmert|1 year ago
also, java historically had -1 ttl (eg: infinite) by default. causing a lot of headaches with ephemeral/container services.