I'm sure you think this is a clever reply, but the reality is that GitHub wouldn't even consider it, even if it were technically possible. If it got out that it had trained on confidential customer data, it would be game over. The risk is so stupidly large that nobody in their right mind would take it. So yeah, if they say they don't, they don't.
I don't understand why people just automatically doubt things that companies say when they could be sued (or it would otherwise destroy their business) if they were lying about it. Seems unnecessarily pessimistic.
People doubt Microsoft because they've historically run a very aggressive business and done things of questionable morality many times.
They've been to court, they've lost, and it definitely hasn't destroyed their business one bit.
For example, Microsoft subsidiary LinkedIn routed customer email through their servers so that they could scrape it. They did that without customer knowledge, via a dark pattern.
They later apologised for doing it but still used it to propel the company's growth. In the end it didn't hurt anything but their reputation for respecting people's privacy.
Microsoft's own anti-trust history is littered with exceptional behaviour too. They are the size they are now by dint of super aggressive business practices.
Normally because history shows us that redress via the court system is rarely punitive to a company the size of Microsoft. Further, Microsoft has a long history of lying to its customers with seemingly no impact on its business.
I mean, we discovered that the whole car industry was flagrantly lying on their emissions tests, which had the potential to destroy the whole business, and there were A LOT of people who knew about it and could have talked at any time. Why wouldn't software companies do the same?
But will that actually be against ToS or copyright? Many people tend to say that Copilot learning from OSS doesn't infringe any copyright and is no different from a person just learning from someone else's work. So how is it different if Copilot is learning from private repositories? Or, e.g., from leaked source code?
I'm frequently told on HN that Big Tech would willingly, flagrantly violate the GDPR like it's nothing, even if the upside of collecting that info was minimal and the downside was 4% of global revenue.
I guess if they can do that, then what's a small lie about private repos between friends.
Because they do shady shit. For example, by default Copilot would "sample" your code for training while you used it. Maybe this is no longer the default, maybe it still is, but it was the default.
This type of thing erodes trust. Why should my proprietary code be used for training by default? I was really annoyed by this.
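(A minimal editor-side sketch, for anyone who wants to check their own setup: this assumes VS Code, and the Copilot-specific "use my code snippets for product improvements" toggle lives in your GitHub account's Copilot settings rather than in the editor, with wording that has changed over time, so treat this as a starting point rather than a complete opt-out.)

    // settings.json -- VS Code accepts comments in this file
    {
        // Turn off the editor's general telemetry reporting.
        // Note: this is the generic VS Code switch, not a
        // Copilot-specific "don't train on my code" control;
        // that one is managed on github.com.
        "telemetry.telemetryLevel": "off"
    }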