Add opt-in transparent telemetry to Go toolchain

[+] bsaul|2 years ago|reply

Very relieved that they chose to back off the initial opt-out proposal. It’s always refreshing to see a language listening to its user base. This was the kind of decision that could instantly change the reputation of a PL, for purely political / psychological reasons.

[+] cesarb|2 years ago|reply

I think adding telemetry to a compiler goes in the opposite direction of the current "minimal privileges, sandbox all the things" trend.

For instance, it would make sense to create a set of selinux rules (or something like it) to make sure that a compiler cannot do anything other than reading its input files (and system headers/libraries/etc) and writing to its output directory, even if for instance a buffer overflow triggered by a malicious source code file led to running shell code within the compiler. Having to allow access to the network for the telemetry would require weakening these rules.

It reminds me of the classic "confused deputy" article (https://css.csail.mit.edu/6.858/2015/readings/confused-deput...), which coincidentally also involved a compiler tracking statistics about its usage.

[+] arp242|2 years ago|reply

That would actually already be difficult with the current go tool, since it's more than "just" a compiler but also fetches dependencies. If all dependencies are already in place it won't hit the network, so there are options, but you'd have to find another way to retrieve those.

The telemetry is opt-in, and failing to send it won't fail the compile (it won't even run on every compile). It's not really preventing you from applying your SELinux policy if you want, even if it had been opt-out.

[+] duped|2 years ago|reply

> to make sure that a compiler cannot do anything other than reading its input files (and system headers/libraries/etc)

Define "input files." Tools have to do a combination of reading, parsing, and sometimes even version unification/downloads just to get the complete set of inputs to feed to the compiler.

Of course you can define the compiler as the tool that parses text and writes machine code, but then you're just shoveling dirty water around.

[+] kklisura|2 years ago|reply

Can someone explain how we managed to have programming languages and toolchain development for half a century without using telemetry, but somehow we need it today in our tools? Their "Why Telemetry?" blog post just doesn't cut it for me [1].

[1] https://research.swtch.com/telemetry-intro

[+] estebank|2 years ago|reply

Humans managed to live for most of history without penicillin, or even boiling water, at the cost of most humans dying before making it to adolescence. People managed to have global communications with only steam ships and telegraph, at the cost of slower pace of information dissemination. NASA managed to make it to the moon with less computing power than the cellphone in your pocket, at high resource, monetary, human and time costs. Cars managed to work with more rudimentary design than today's, without any computers, at the cost of lower life-spans, lower efficiency and higher pollution.

You can make many arguments for and against telemetry in developer tools. Not acknowledging that telemetry helps with visibility into how those tools actually work in the wild, which in turn helps lower the incidence of bugs and speed up development, is disingenuous. You can arrive at the conclusion that even inert, opt-in telemetry is not worth it, but don't disregard out of hand its utility in helping their development as if it were some crazy idea.

[+] arp242|2 years ago|reply

You can say "we managed to do X without Y" for a lot of values of X and Y.

I think that's the wrong way to go about things; instead it's more useful to ask "will this be useful?"

There's a long list of real-world use cases in part 3 of that blog series.

I miss telemetry in my app sometimes too; there are some features where I wonder if anyone actually uses them, and I also don't really know what kind of things people run into. Simply "asking people" is tricky, as I don't really have a way to contact everyone who cloned my git repo, and in general most people tend to be conservative in reporting feedback. I have found this a problem even in a company setting with internal software: people would tell me about issues they'd been frustrated by for months over beers in the pub, when this was sometimes just a simple 5 minute tweak that I would be happy to make.

Can I make my app without telemetry? Obviously, yes. And I have no plans to ever add it. But that doesn't mean it's not useful.

[+] gjulianm|2 years ago|reply

You can manage to have things without telemetry, but having telemetry is incredibly useful. I think the example of "how much of our user base actually uses these features" is a very good one, especially in a compiler where maintaining old features could be adding a lot of complexity to the code base. And, as they also explain, a lot of bugs and undesired behaviors are things that the users won't know they have to report and just accept as part of the normal behavior. Things like cache misses, slowed down compilation times in certain situations, sporadic crashes... All of those things could be improved if the developers knew about them.

[+] jjav|2 years ago|reply

> Can someone explain how we managed to have programming languages and toolchain development for half a century without using telemetry, but somehow we need it today in our tools?

Back in the 90s in companies I worked for, attempting to add any kind of phone-home code was a fireable offense. Or at least, would get you a very stiff talking to from a few very high up people in the organization. You'd never even think of doing that again. Customer trust and privacy was paramount.

As we all know, spyware slowly started creeping into end user apps and later became a flood. Now it's difficult to find any consumer app that doesn't continuously leak everything the user does.

It's become so normalized that now even developer tools seem to think it's somehow ok to leak user data.

[+] JohnFen|2 years ago|reply

I think there is a whole generation of developers who have no experience with how to do these things in the absence of telemetry, so they genuinely believe it's not possible.

[+] eternalban|2 years ago|reply

Two things happened in the '00s in relation to this question. One, on-demand computing infrastructure (cloud), and two, the scaling requirements of a new breed of networked services. The germinal change in response to these shifts was that processes replaced components, and system boundaries spanned devices.

When your code is running on application servers and your applications are composed of components, all the tools were already there, in the OS and as add ons, like dtrace, and in whatever monitoring tools came with your application server. Today, instead of components, we compose systems out of (lightweight) processes, and processes can be created on any device, and the replacement for the application server is the whole gamut of k8, terraform, elastic, ..., etc.

Nothing has changed in the abstract structure of our systems, it's just that the current approach has the beast dismembered and spread out and loosely connected via protocols, instead of a linker or a dispatch mechanism of a platform.

[+] shadowgovt|2 years ago|reply

It'll be interesting to see what effect telemetry has on the ongoing development of the tools and language, since we don't have another tool / language chain to compare it to.

[+] jeroenhd|2 years ago|reply

I like that they changed their approach to opt-in. I also like how much effort they've put into making the data collection as anonymous as possible despite being a Google project.

Well done, golang team. Other companies with supposedly open languages (looking at you, Microsoft) can learn a thing or two from you.

[+] liampulles|2 years ago|reply

Given how happy people seem to be sending large parts of their codebases to LLMs these days, privacy concerns over telemetry logging look quaint in comparison.

[+] potamic|2 years ago|reply

Programs running on your machine have access to much more than the code they are working with.

[+] JohnFen|2 years ago|reply

The actions of a few shouldn't be taken as representative of the whole.

[+] laputan_machine|2 years ago|reply

Google doing more spying, unsurprising. At least it's opt-in (for now)

[+] jeroenhd|2 years ago|reply

As long as they keep this opt-in, I see no reason to accuse them of anything. The telemetry collection design is clearly made to be as privacy preserving as possible.

There's always the risk that they'll roll out the telemetry setup now as an opt-in feature and then switch it to opt-out down the line, but I don't think this is the current team's intention.

[+] parabyl|2 years ago|reply

It'll be interesting to see how this is accepted by the community at large. I think explicit opt-in, combined with having the discussion about which metrics to collect in public, would be enough assurance for most that this isn't a bad idea - but that doesn't necessarily mean that most will opt in as a result.

[+] JohnFen|2 years ago|reply

That they made it opt-in means that I will no longer completely rule out using golang. I don't know if I'd actually opt in, though. I haven't evaluated that issue.

[+] TDiblik|2 years ago|reply

I might be wrong, but isn't dotnet telemetry opt-out by default?

[+] tjpnz|2 years ago|reply

I'm fairly accustomed now to disabling this shite, but if it's only going to become more prevalent I could see it creating real toil. Are we also going to need a PiHole-like solution for servers and dev boxes?

[+] remram|2 years ago|reply

It's opt-in.

[+] varispeed|2 years ago|reply

> IP addresses exposed by the HTTP session that uploads the report are not recorded with the reports.

Like pinky promise, trust me bro.

This whole thing looks delusional. I hope someone is going to create a fork as Golang team has lost their marbles.

[+] carapace|2 years ago|reply

(Maybe I'm just cranky this morning, so please ignore this comment if it fails to please you, but uh... I think the "real WTF" is getting into a position where you need telemetry at all. It's hard to describe what I mean, it seems like most folks who think about these things at all either get it or don't, and the two positions are so self-evident to those that hold them that it blinds us to the rationale of the "other side". In any event, "compiler does not access network" Works For Me. Sorry for the noise.)

[+] hedora|2 years ago|reply

Yeah; I can’t imagine how terrible golang’s test infrastructure would have to be for this to provide them with any practical benefit.

Their motivating example was something like “golang stopped working on clean macos, and no one noticed for months”, which kind of proves my point.

[+] bpicolo|2 years ago|reply

When reading the title, I was hoping it would mean better abstractions than context for passing around OpenTelemetry info in a golang codebase

[+] xyzzy_plugh|2 years ago|reply

It's a real shame libraries like OpenTelemetry encourage using the context to pass information around. In my programs I prefer to be explicit. Sadly this breaks so many libraries which also depend on having a magically configured context to function properly. It's thread local storage all over again.

[+] jackmott42|2 years ago|reply

How is telemetry so valuable that language maintainers feel the need to introduce it even when it is so controversial?

[+] Patrickmi|2 years ago|reply

When you’re in an organization with strict goals and no resources for implementing public tests, you’ll have to make up for it some other way to get correct information

[+] Patrickmi|2 years ago|reply

Every problem has a solution. Telemetry is good, but abuse of anything is bad. If this stays strictly in line with what they have planned, just like what they have been doing, I see no problem. Not everyone must be satisfied

[+] JohnFen|2 years ago|reply

Good on them! That's how you do telemetry without it becoming spying.

[+] pachico|2 years ago|reply

In general, I have no objections to well explained opt-in anything.

[+] ape4|2 years ago|reply

Is it in the toolchain or in the applications made with Go?

[+] cratermoon|2 years ago|reply

Just the toolchain. https://github.com/golang/go/issues/58894

[+] kitd|2 years ago|reply

Toolchain

[+] aatd86|2 years ago|reply

I will be the one overweighted in the report because I keep typing go mod init and go work init...

I don't know, I never actually remember the commands. :o)

[+] davidw|2 years ago|reply

When did logging and reporting become "telemetry"?

[+] hedora|2 years ago|reply

Logging and reporting is generally for the benefit of the end user.

Telemetry is generally for the benefit of the marketing team, law enforcement and development team.

[+] nathanlied|2 years ago|reply

When was it not? Telemetry - the act of collecting logs/"metrics" and reporting them back to a remote (tele) station - how could this not be considered telemetry?

[+] account42|2 years ago|reply

When developers started to log on my computer and report that data to their servers.

119 comments