I'm not suggesting the concerns aren't valid, but I guess I don't understand why this same principle isn't applied to other internet-connected / cloud software. Do these companies worry that web browsers like Chrome could leak data, or applications like Google Docs?
What is it about AI chatbots that makes the risk of a data leak so much higher? Is it something about OpenAI's ToS? Or its relative infancy?
That’s because OpenAI can use any data that you send to ChatGPT for training purposes. [0] They don’t do it with their API, btw.
“(c) Use of Content to Improve Services. We do not use Content that you provide to or receive from our API (“API Content”) to develop or improve our Services. We may use Content from Services other than our API (“Non-API Content”) to help develop and improve our Services. You can read more here about how Non-API Content may be used to improve model performance.”
I have seen companies have rules about (or against) cloud computing in general. I remember when decent web-based translation services first came out, and the guidance from BigCorp was to not use them for anything work-related.
From what I personally have seen, this sort of guidance remains. When companies do use things like Google Docs or Microsoft Office365, they likely have some specific contract in place with Google / Microsoft / etc., that the company's legal team has decided they are happy with.
I anticipate that the same will eventually be true of ChatGPT and such, that there will be some paid corporate offering with contract terms that make the company lawyers happy.
Most of my career has been with larger companies, often with high data sensitivity; I can easily imagine that some smaller and/or less data-sensitive companies might not care about any of this.
> Do these companies worry that web browsers like Chrome could leak data or applications like Google Docs?
Yes they do. Where I work the whole google office suite is blocked from inside the network (you have to use MS Office). ChatGPT is blocked. Most web apps that you can copy text or data into are either blocked, or we have an agreement with the provider, or (for open source) we have an internal on-prem fork.
From TechCrunch on March 1, 2023: "Starting today, OpenAI says that it won’t use any data submitted through its API for “service improvements,” including AI model training, unless a customer or organization opts in."
So prior to that, they were willing to use your data for model training. Every service may have leaks/security issues, but few say they'll purposely use your data. OpenAI probably should've promised not to use your data from the beginning; it'll be a hard perception to change now.
It is. Firstly, there are legal agreements when you’re an enterprise user of such a solution; then there are various tools like DLP solutions that integrate with cloud/SaaS services such as Google Docs or Office 365; and lastly there are CASB solutions that allow you to control how corporate users use those solutions in the first place.
E.g. you’ll probably be able to use the corporate account to sign into the corporate Google Docs or O365 instance, but if you try to sign into your own it would be blocked, and likely also reported on, so you might get a call from SecOps down the line.
OpenAI currently offers none of this, and more importantly it openly uses the data that users submit, as well as the responses, for additional training and any other purpose they might come up with.
As for browsers, these are also often configured not to send data outside of the company, so yes, it’s possible. Windows 11 web search and other features would also likely be disabled on your corporate device.
This probably isn't at the top of the list of serious concerns, but one problem that's kind of unique to AI is that in general, AI-generated content isn't eligible for copyright protection. Companies might worry about losing copyright on certain things if someone finds out that a lazy employee didn't actually create the content themselves.
Companies generally tend to be wary of cloud services due to data leak concerns. At the very least, they like to be in control of the decision about which services are approved and which are not.
It's about leaking private/proprietary company information. It's not about features.
How will that private/proprietary information be used by OpenAI? Does it include NDA information from another company that they don't have the right to share? How secure is the information stored (think industrial espionage)? There is a lot that needs to be taken into account that even goes beyond this.
With Google Docs, MS Office, Atlassian etc you get a real software product with engineers paid ~$300k per year to fix bugs.
With ChatGPT, you get researchers paid over $1m per year [1] to use you as a training data source and ship stuff with basic bugs and then "feel sorry" when stuff breaks:
https://www.theregister.com/2023/03/23/openai_ceo_leak/
Another position for those like Samsung: preventing ChatGPT use encourages incubation of internal competing solutions.
It’s shadow IT. You are not supposed to leak confidential company data to any other company unless you have the appropriate vendor agreements in place. It’s like uploading your source code to Dropbox when you are supposed to use Bitbucket.
Don't know about Chrome, because it's an application and not itself harmful, but yes, they think Google Docs will steal their data. Or any other cloud service. They are all banned for employees. I'm surprised that anyone is surprised by this.
I don't think it is at all clear that OpenAI won't use the data that's put into it for undisclosed purposes, and I don't think they have a corporate account feature to guarantee prompt privacy.
We've seen demonstrations of models being tricked into sharing details about their setup or training data. If they are to be trained on what is shared with them then that data could be procured by an attacker.
I would have concerns about Google using my data but I wouldn't be concerned that the data I enter could easily appear in someone else's spreadsheet.
My company explicitly blocks the use of non-corporate controlled cloud products for obvious reasons. All it takes is one person to post an Excel document incorrectly to cause a major incident.
Companies banning the use of ChatGPT level tools going forward will find the rules either flouted, subverted or the employees going elsewhere.
Of course there is a duty on employees to be professional - the latter will be the ones taking up opportunities at non-legacy/dinosaur corporations that think they can command the waves.
The answer is to sort your processes, security and training out - new AI is here to stay, and managers cannot stop employees using game-changing tools without looking very foolish and incompetent.
Why? It’s a valid concern in my opinion. You’re feeding OpenAI your intellectual property and just hoping they don’t do anything with it. I have the same concerns with Microsoft’s TypeScript playground.
It’s not clear whether ChatGPT and the likes would increase productivity at the organization level. And I am talking about the current GPT-4, not some hypothetical AGI. From what I have seen, a large swath of usages are basically just people DDoSing their teams with a lot of words. Things like someone on a marketing team prompting for a “detailed 10-week plan with actual numbers” that naturally has no basis in reality, but will take a lot of effort from their team to decipher the bullshit. Likewise, there are also generated hundreds of lines of code with tests that are subtly wrong.
Basically the challenge is fairly straightforward: if one side is machine-generated and the other side is human-validated, the human loses 100% of the time. Either the machine has to be 100% accurate or very close to it, or the human needs tools to help them. As it stands, neither of those conditions is here yet.
> Companies banning the use of ChatGPT level tools going forward will find the rules either flouted, subverted or the employees going elsewhere.
Why? Companies typically have many rules that they expect employees to follow. Why would employees disregard these particular rules, or even quit because of them?
They actually can. But not every business is competent. It is completely irresponsible to query public services with internal company intellectual property, or any type of information that may breach the contract you signed with a responsible and competent business. It is trivial to track what you do on business equipment, and if you think you're smart by subverting that and querying public services through other means, that can and will be solved soon. Someone is going to make a lot of money deploying an AI system that can retroactively track these things, and when that happens, for better or worse, hope you were not irresponsible while working for a business with the technical acumen to trace you. It's not a matter of whether it can be done with today's tech; it's a matter of whether the business you worked for has the means and willpower to do it.
My advice is follow best practice and wait for an official company policy detailing the use of these new services. Otherwise you may find yourself in legal trouble years from now when the traces you left can easily be uncovered by technology that does not forget.
That's strange to me. I'm employed in order to receive a paycheck. If receiving my paycheck is contingent on me not using ChatGPT, then so be it, what do I care?
Pretty much the thought-devoid "It's new, so it must be good!" argument people have been pushing for centuries, whether it's music or technology or politics or fashion.
> Companies banning the use of ChatGPT level tools going forward will find the rules either flouted, subverted or the employees going elsewhere.
If my employees are leaking company information through ChatGPT, I'm happy to have them go work for my competitors and leak their information, instead.
I was with you until you said employees going elsewhere. There will be a group who need to use it because they can't function without some AI-assisted help, but people working at the company already know how to do their jobs, so why leave?
Do you think regular employees care all that much about ChatGPT? It's a cute toy. If my employer says not to use it for work data, that's just not a big deal for me.
Funnily enough, our company will start to roll out an internal ChatGPT UI with an agreement with Azure not to use the data for training etc., where we would be allowed to share internal data.
The market is definitely there for enterprise LLMs. Everyone is using GPT for work. I use it to provide stubs for memos and to brainstorm — but the real value comes from replacing internal “tribal knowledge” with an AI that knows your org in and out.
As of now, these chatbots still make some subtle but egregious errors in code that could easily create more bugs than writing it by hand, if you don't thoroughly audit the output.
It's wise to ban them until they improve or naive users get more instruction on how to use them properly. And from experience with some Samsung products, they could do with tightening up their code QA standards a bit.
After skimming through the article it's not too clear to me what the misuse was. My best guess is entering "personal or company related information into the services"?
>My best guess is entering "personal or company related information into the services"?
I'm not really familiar with exactly how ChatGPT works, but does it get trained on input data (queries)? People are also "leaking" their personal information to Google when they search for something personal like health issues, financial issues, family issues, etc.
What is the privacy policy of these chatbots, after all?
My employer (big tech co using AI internally already) just blocked it too. They put out a very well thought out explanation why.
We're investigating standing up our own internal model for internal use to control any risk of our IP leaking out and to be able to vet the training data.
I see lots of casual mention of using it for assistance in writing code... we view that as fairly dangerous in terms of the risk of accidentally including snippets of open source code.
There are many risks to this. Personally, my own experimentation (for coding) left me pretty ambivalent as to what it can/could actually speed up. ChatGPT is not that fast, and it takes time to keep refining what it sends back to you. The window of what it actually helps with seems small right now.
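For what it's worth, here's a minimal sketch of what the "internal model" route could look like from the client side. Everything in it — the hostname, port, model name, and the OpenAI-compatible payload shape — is a hypothetical assumption for illustration, not anything the poster (or Samsung) described:

```python
import json
import urllib.request

# Hypothetical on-prem endpoint; the whole point is that prompts
# never leave the corporate network.
INTERNAL_URL = "http://llm.internal.example:8000/v1/chat/completions"

def build_request(prompt: str) -> urllib.request.Request:
    """Build (but don't send) a chat request to the internal model."""
    payload = {
        "model": "internal-llm",  # made-up model name
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        INTERNAL_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )

req = build_request("Summarize our deployment runbook.")
print(req.full_url)  # -> http://llm.internal.example:8000/v1/chat/completions
```

The appeal of mimicking a public API's request shape is that existing tooling can be pointed at the internal endpoint with a one-line config change.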
I'm working on an open source enterprise self-hosted LLM integration, pretty sure many people are these days. The difference will be on the business side, rather than the LLM side, as they are quickly becoming commoditized, at least on the open source side, since they are free to use by anyone.
I'd also say that one needs a moat in order to succeed; you can't just provide the LLM, since anyone can do that, you need to provide something more that works even without any AI at all.
How about outside of any workflows, and just one off? I've used ChatGPT twice recently after spending a good amount of time Googling around and finding nothing (specifically https://chatgptonline.ai/chat/), generalising the statements/removing any/all real data.
1. given the following string [...], build me a regex that extracts the [abc] before [`], until the 2nd [xyz], used for extracting a bunch of info from an array
2. give me a list of common beneficiaries, then give me 5 more
Took an hour or so of Googling, then about 10 mins to find an online/open ChatGPT prompt, and about 2 mins to implement the answer in my code. But that's where I draw the line, I'll never use an editor that uses AI in my actual IDE, or expose my code openly to train models
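Since the real tokens in prompt 1 were redacted, here's a toy stand-in (entirely made-up data, not the original) showing the general shape of the "extract the field before a backtick, up to the 2nd delimiter" regex this kind of prompt tends to produce:

```python
import re

# Toy stand-in for the redacted example: grab whatever comes before
# the backtick, then everything up to the second comma after it.
line = "id=42`name=widget,color=blue,size=XL"

match = re.search(r"^(.*?)`([^,]*,[^,]*)", line)
if match:
    print(match.group(1))  # -> id=42
    print(match.group(2))  # -> name=widget,color=blue
```

Whether ChatGPT gets this right on the first try is another matter; it's exactly the sort of output worth unit-testing before trusting.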
>ChatGPT is a viral AI chatbot that is trained on huge amounts of data and is able to generate response to user queries. It is a form of so-called generative AI.
The lazy use of viral when talking about computer tech here annoys me slightly.
One of my colleagues was pasting parts of /var/log/messages into ChatGPT trying to debug some sshd-related issue. Is this safe? He said he was masking hostnames before pasting; would that help here?
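Masking hostnames is a start, but sshd logs leak more than hostnames (IPs, usernames, ports). A rough sketch of pre-paste redaction — the patterns below are my own guesses at a bare minimum, not a vetted DLP rule set:

```python
import re

# Illustrative patterns only; real logs usually need more rules
# (usernames, MAC addresses, internal domain suffixes, ...).
IP = re.compile(r"\b\d{1,3}(?:\.\d{1,3}){3}\b")
HOST = re.compile(r"\b[\w-]+\.(?:corp|internal|local)\b")

def redact(line: str) -> str:
    """Replace IPs and internal-looking hostnames with placeholders."""
    line = IP.sub("<ip>", line)
    return HOST.sub("<host>", line)

print(redact("sshd[912]: Failed password for root from 10.0.3.7 port 22"))
# -> sshd[912]: Failed password for root from <ip> port 22
```

Even with redaction, the structure of the log itself can reveal things (software versions, timing of incidents), so it only reduces the exposure, not eliminates it.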
Need legislation to limit and control what data goes into LLMs. People need to be paid or at least always have a choice when it comes to being input for training data.
[0] https://openai.com/policies/terms-of-use
https://techcrunch.com/2023/03/01/addressing-criticism-opena...
[1] https://davidgoudet.medium.com/how-did-this-ai-scientist-end....
We are not to use any third party cloud hosted software that is not approved to store proprietary information.
There are only three external SaaS companies that I am aware of that we use that store anything proprietary.
We are not even allowed to take pictures of the whiteboard because the pictures can be synced to external cloud storage providers.
This is the same thing.
I see a distinction between several kinds of companies:
Competent companies are asking employees to be more productive and are training them with AI.
Less competent companies are restricting use of AI.
What would impact the world economy more? OpenAI disappearing (no one would notice) or Samsung disappearing?
The only use for AI is for writing code and the company created a policy around that.
https://www.pcmag.com/news/samsung-software-engineers-busted...
Basically sensitive code got spat out later.
It's 2 sentences repeated 3 times.
What exactly is "misuse"?
News for the ADHD...
Using AI in a way that doesn't share your company secrets and private code with the entire world is around the corner or already partially possible.
Very obviously Samsung will not ban AI as a whole forever.
Haven't found something that works in 1 click.
Can they enforce code injections to create backdoors, like they tried with cryptography?