I find it interesting that most of these tactics can be summarized into: "write clearly and provide ample information."
I have a side business for ACT/SAT prep. I teach English, and often have to remind students about sentence structure and word flow. For example, I can't say "My mom, my grandma, and I went to her favorite store" -- in that example, there is no clear way to know who "her" is.
Similarly, I see many people claim GPT-n is "dumb", yet when I see their prompt, I realize it was a bad prompt. There are clear logical inconsistencies, there is inadequate information, and there is confusing word usage.
I've been astounded by GPT-4 and have nearly 5x-ed my productivity with it (for coding and for parsing documents). But I think my experience is a result of my habitual "standardized testing" writing style, while others' poor experience is a result of their more layman writing style.
In what pre-ChatGPT world did "write clearly and provide ample information" include such familiar and long-known tips as:
* Ask the model to adopt a persona
* Use delimiters to clearly indicate distinct parts of the input
* Specify the desired length of the output
> others' poor experience is a result of their more layman writing style.
I guess we'll have to pass the hat around for those wretched souls. In the meantime, someone needs to tell the English teacher that "layman" is not an adjective.
> sentence structure and word flow
In my experience ChatGPT doesn't care about those. It's able to infer through quite a large amount of sloppiness. The much larger gains come from guiding it into a model of the world, as opposed to direct it to respond to lean perspectives like, "What do I eat to be better?"
>I have a side business for ACT/SAT prep. I teach English, and often have to remind students about sentence structure and word flow. For example, I can't say "My mom, my grandma, and I went to her favorite store" -- in that example, there is no clear way to know who "her" is.
The Lord’s work. I deal with this issue at work all day long.
I am lucky I had a high school English teacher who DRILLED into me the slogan “no pronouns without a clear, one-word antecedent.”
That slogan is probably a top 2 mantra for me that has paid dividends in my communication skills. The other one would be “break long sentences into short ones.”
> Similarly, I see many people claim GPT-n is "dumb", yet when I see their prompt, I realize it was a bad prompt.
I see this comment a lot, but I drew a different conclusion. People with high reading / writing ability assign less value to GPT's writing ability, whereas people with lower reading / writing ability assign more value to GPT's ability.
Edit:
> yet when I see their prompt, I realize it was a bad prompt.
So, I asked ChatGPT to re-write my reply.
----
Prompt: "Re-write this text, but more intelligently."
Output: I frequently encounter this observation, yet I have arrived at a divergent inference. Individuals possessing exceptional proficiency in reading and writing tend to ascribe a diminished significance to the writing aptitude of GPT, whereas individuals with more limited abilities in reading and writing tend to attach greater value to GPT's prowess.
----
Prompt: "Re-write this text, but make it sound better."
Output: I frequently encounter this recurring sentiment, yet I have formulated an alternative perspective. Individuals endowed with advanced reading and writing prowess tend to bestow a lesser degree of importance upon GPT's writing ability, whereas individuals possessing more modest proficiency in reading and writing are inclined to accord greater significance to GPT's remarkable capabilities.
-----
Prompt: "Re-write this text, but try and use the word 'ability' less."
Output: I frequently come across this comment, but my conclusion differs. Individuals who excel in reading and writing tend to place a diminished emphasis on GPT's writing prowess, while those with more limited skills in reading and writing tend to attach greater importance to GPT's capabilities.
----
Conclusion: I like my original reply better than GPT's re-write.
It's interesting you say this. I spent the weekend playing with ChatGPT to try and get it to build a Swift app for iOS and macOS (I have zero previous experience with Swift). Thankfully I had a compiler to back me up and tell me if things actually worked. I found the whole experience a little jarring. ChatGPT was pretty good at generating some code, but it felt a lot like a job interview where I'm working hard to coach a candidate into the right answer. Or, now that you mention it, some previous experiences I've had trying to work with outsourced providers where we're trying to overcome a language barrier.
The problems are often that I got exactly what I asked for. Not a thing more, no context that I thought would be assumed (e.g., don't remove the functionality I asked you to implement in the previous step), just a very literal interpretation of the asks.
I definitely found myself quickly adapting to try and be clearer and potentially over expressive in my prompts.
Can't do math or logic. I have a question I ask ChatGPT to see if it can do logic yet, it still cannot. (Can't mention this question here or it will get fixed.)
Its great for brain storming or low risk problems. I don't think the accuracy problem will ever be fixed.
I probably 5x my productivity as well, but that doesnt mean its able to do logic.
I think we will find that certain personality and thinking types will be the most successful with this technology.
It will be interesting if only the highly educated are able to best leverage this, because that would be unfortunate and would accelerate inequality.
I also really hope this can be used to improve learning to bridge this gap, and this summer I will have my high school intern use this technology frequently with the hope that it accelerates his improvement.
I think something interesting is that this unlocks huge potential for English majors and puts engineering / math / comp sci at a structural disadvantage. Hmmm
> I find it interesting that most of these tactics can be summarized into: "write clearly and provide ample information."
I'm so used to ChatGPT now that for any non-trivial programming task that I'm not 100% confident with, I check how ChatGPT approaches it first. Not just one or two sentence, but a detailed summary with code example. Usually a single prompt of mine costs >300 tokens. Sometimes >1000 tokens.
I honestly don't know whether ChatGPT made me dumber or smarter. On one hand I'm relying on it a lot. On the other hand, I'm forced to actually think over my task in hand, not just telling myself that I've thought it through.
> I've been astounded by GPT-4 and have nearly 5x-ed my productivity with it (for coding and for parsing documents).
I wonder if StackOverflow, Quora, GitHub Issues and similar websites have noticed a decrease in new questions/issues raised? LLMs have a kind of network effect that it makes people ask less questions in public sites and more privately to their own LLM sessions, and maybe even give up if the LLM does not answer. That reduces critical mass at q&a forums, so, given enough time, less people will go there for answers. I'd say that's a great loss!
In a way, sarcastically speaking, it's one of AI's way of becoming our overlords.
I submitted a puzzle from https://dmackinnon1.github.io/fickleSentries/, with the basic prompt, "I am going to present you with a logic puzzle. I would like you to solve the puzzle."
The solution GPT-3.5 (I don't have access to 4.) gave was: "In conclusion, based on the statements and the given information, the treasure in the cave must be copper."
The solution given with the puzzle is "Here is one way to think about it:
If Guard 1 is telling the truth, then the treasure must be diamonds. If Guard 1 is lying, then the treasure can be copper or gold. If Guard 2 is telling the truth, then the treasure must be silver. If Guard 2 is lying, then the treasure can be diamonds or rubies. The only possible option based on the statements of both guards is diamonds."
Heh yeah me too for software — fun thing we have all of our after hours chats powered by gpt 4 with our documentation.
Many people try to chat with it like a dumb bot using single words… few people have figured out you get very in-depth answers - so over all I’m very pleased with our results but still funny how people react to it…
What are some other clarifications to that sentence besides those in the forms "My mom and I went with my grandma to her favorite store" or "I went with my mom and grandma to my mom's favorite store"?
I've been surprised at how many times I've realized that a bad result from GPT was because I specified the problem inadequately or incorrectly. Or that I took the solution, but applied it incorrectly.
I've been trying to use the OpenAI API for the last two weeks or so (GPT-4 mostly). This article rubs me the wrong way. "GPT Best Practices" indeed.
Most of my calls end with a time out (on their side) after 10 minutes. I get 524 and 502 errors, sometimes 429, and sometimes a mildly amusing 404 Model not found. The only way I can get reasonable responses is to limit my requests to less than 1400 tokens, which is too little in my application.
And on top of that they actually charge me for every request. Yes, including those 524s, 502s and 429s, where I haven't seen a single byte of a response. That's fraudulent. I reported this to support twice, a week later I haven't even heard back.
Their status page happily states that everything is just fine.
From the forums it seems I'm not the only one experiencing these kinds of problems.
I'd argue "GPT Best Practices" should include having working APIs, support that responds, and not charging customers for responses that are never delivered.
That's odd, I have been heavily using GPT-4 API (near 100 requests a day) and didn't notice any errors like that. I noticed maybe 1 or 2 errors with really chat history.
This is a good observation. I find that working with LLMs feels closer to the skills of managing a team than to coding itself. Intuitions about how to divide work and understanding strengths and limitations seem to go far.
Great to see OpenAI upping their game when it comes to providing documentation for how to get the most out of of their models.
I shuddered a bit at "Ask the model to adopt a persona" because I thought it was going to be that "You are the world's greatest expert on X" junk you see people spreading around all the time, but it was actually good advice on how to use the system prompt - their example was:
> "When I ask for help to write something, you will reply with a document that contains at least one joke or playful comment in every paragraph."
This example stuck out to me[0]. We've been calling this a "triage" prompt and it's quite effective when you have multiple paths a user could go down or if they could be asking for multiple things at once.
In his State of GPT talk (1), Karpathy does say the “you are an expert” type prompts help, because the model has learned from all sorts of content both good and bad, so it kind of narrows it down to more likely provide good answers
Two more practices that are relevant to how transformers work:
- instead of using it as a source of facts, use it to transform the text with the facts you provide, which is does much better. (if the accuracy is important for your case)
- to improve the answer, ask it to reflect on its own result and reiterate the answer. The model produces the result token by token, so it's unable to check its validity at the inference time. This way you put it back into the context and explicitly tell the model to make a second pass.
Here's my personal template for semi-complex prompts:
System message
[A couple sentences of instructions]
Example 1 - Input
##
[example input 1]
##
Example 1 - Output
##
[example output 1]
##
User message
Actual 1 - Input
##
[the thing you want it to process]
##
Actual 1 - Output
##
Fill in all the [] sections. Then hit submit. This should work pretty well. I'd suggest setting the temperature to 0 if you want more predictable responses.
I first played with GPT early 2021, and have been actively using it since mid 2022. This the method I've found to have the best tradeoff between complexity and effectiveness.
Note that I always try to zero shot it first, and I only use this method for things where zero shot fails, and where I need GPT to get right and that it's worth the effort of making a few shot prompt for.
'tell it exactly the answer you want and keep changing your prompt until it spits that back at you. if you know the right answer already, you will know when it gets it right. assume it will break. ideally use a different tool.'
These are good examples of how to leverage the system prompt, which is vastly underdiscussed as that is only available via the API or the Playground and not the megapopular ChatGPT webapp. Even in LangChain it requires some hacking to get working and may not be stable across generations.
I am releasing a minimal Python chat AI package interface this week which very heavily encourages use the system prompt for efficient generations that are also stable and can hand a variety of user inputs. The results have been very effective!
Absolutely nothing about preventing or mitigating prompt injections.
Any other "best practices" for any other sort of platform, database or language, should include suggestions on how to keep your system secure and not vulnerable to abuse.
Coding for LLMs right now is a bit like coding with PHP+MySQL in the late 90s to early 00s, throw stuff at it with little thought and see what happens, hence the wave of SQL injection vulnerabilities in software of that era. The best practices haven't even really been established, particularly when it comes to security.
I'm still somewhat confident it'll eventually be formally proven that you can't make a LLM (or the successor generative models) resistant to "prompt injections" without completely destroying its general capability of understanding and reasoning about their inputs.
SQL injections, like all proper injection attacks (I'm excluding "prompt injections" here), are caused by people treating code as unstructured plaintext, and doing in plaintext-space the operations that should happen in the abstract, parsed state - one governed by the grammar of the language in question. The solution to those is to respect the abstraction / concept boundaries (or, in practice, just learn and regurgitate a few case-by-case workarounds, like "prepared statements!").
"Prompt injections" are entirely unlike that. There is no aspect of doing insertion/concatenation at the wrong abstraction level, because there are no levels here. There is no well-defined LLMML (LLM Markup Language). LLMs (and their other generative cousins, like image generation models) are the first widely used computer systems that work directly on unstructured plaintext. They are free to interpret it however they wish, and we only have so much control over it (and little insight into). There are no rules - there's only training that's trying to make them respond the way humans would. And humans, likewise, are "vulnerable" to the same kind of "prompt injections" - seeing a piece of text that forces them to recontextualize the thing they've read so far.
I think mitigations are the only way forward, and at least up to the point we cross the human-level artificial general intelligence threshold, "prompt injection" and "social engineering" will quickly become two names for the same thing.
Prompt injection becomes not a problem if you write a restrictive enough template for your prompt with a a LLM template language, such as what Guidance from microsoft provides.
You can literally force it to return responses that are only one of say 100 possible responses (i.e. structure the output in such a way that it can only return a highly similar output but with a handful of keywords changing).
It's work, but it will work with enough constraints because you've filtered the models ability to generate "naughty" output.
Not affiliated with them apart from being an early customer, but we're working with Credal.ai to solve this problem. In addition to being able to redact content automatically before it hits the LLM, they also have agreements in place with OpenAI and Anthropic for data deletion, etc. Ravin and the team have been super responsive and supportive and I'd recommend them for folks who are looking to solve this issue.
The only thing I still use ChatGPT semi-frequently is to translate stuff, mainly from Japanese to my native language or English.
And I'm surprised how often it failed to follow the basic instruction of
Please translate the following paragraph to X-language.
(Paragraph in Japanese.)
And I have to say "Please translate the following paragraph to X-language" every single time -- I can't just say, "hey, please just translate paragraphs I give from now on." It won't follow it for very long before it starts to do other random stuff or tries to follow the content of the Japanese paragraphs I was trying to get translated.
Interesting that we have OpenAI using the more generic "GPT" here. Previously they would refer more specifically to ChatGPT, GPT-3, or GPT-4. I am guessing this is related to their trademark application for GPT, which was initially refused by the USPTO on the grounds of "GPT" being "merely descriptive".
"best practices" ... what proof do they offer that these are the best? Is there a detailed deconstruction of why these are the best, and what practices they are superior to? A multifaceted analysis that may show other practices to be better under some circumstances?
I would highly recommend watching - State of GPT video by Karpathy at Microsoft Build to get an idea of how LLM’s work. There are some really good prompting ideas. It changed the way I use GPT-4.
I have yet to find a good way for example to feed ChatGPT GPT-4 (or GPT-3.5 for that matter) "here is a semi-large list of like... songs. help me classify which genre they are closest to" because of the token limit/timeout in the chat.
I'm sure an API integration is possible, but that opens yourself up to potentially "huge" costs compared to a guaranteed free implementation (or the fixed $20/mo)
Anybody able to feed it rows/cells from Google Sheets easily?
> The idea of inner monologue is to instruct the model to put parts of the output that are meant to be hidden from the user into a structured format that makes parsing them easy. Then before presenting the output to the user, the output is parsed and only part of the output is made visible.
This is quite remarkable. Can we now say that inner monologue is also an emergent property? Can this lead to some form of self-awareness or introspection?
Just came across this valuable piece on GPT best practices, and it reminded me of an interesting point I read elsewhere. It's crucial to shape the input prompts effectively as the AI's response heavily depends on the input provided, mirroring a 'garbage in, garbage out' principle for AI interactions.
One « best practice » completely ignored by this document is how to ensure non-stochastic results (eg temperature=0), and better yet how to be « sure » which version of chatGPT you’re using (currently no way of knowing). I wish they would give more transparent versioning
Often the right prompt and hacks are way beyong the trivial tricks given in this article. Or for that matter, the course OpenAI created along with Deeplearning.ai.I wish there is a great compilation somewhere
[+] [-] maxdoop|2 years ago|reply
I have a side business for ACT/SAT prep. I teach English, and often have to remind students about sentence structure and word flow. For example, I can't say "My mom, my grandma, and I went to her favorite store" -- in that example, there is no clear way to know who "her" is.
Similarly, I see many people claim GPT-n is "dumb", yet when I see their prompt, I realize it was a bad prompt. There are clear logical inconsistencies, there is inadequate information, and there is confusing word usage.
I've been astounded by GPT-4 and have nearly 5x-ed my productivity with it (for coding and for parsing documents). But I think my experience is a result of my habitual "standardized testing" writing style, while others' poor experience is a result of their more layman writing style.
[+] [-] diydsp|2 years ago|reply
* Ask the model to adopt a persona
* Use delimiters to clearly indicate distinct parts of the input
* Specify the desired length of the output
> others' poor experience is a result of their more layman writing style.
I guess we'll have to pass the hat around for those wretched souls. In the meantime, someone needs to tell the English teacher that "layman" is not an adjective.
> sentence structure and word flow
In my experience ChatGPT doesn't care about those. It's able to infer through quite a large amount of sloppiness. The much larger gains come from guiding it into a model of the world, as opposed to direct it to respond to lean perspectives like, "What do I eat to be better?"
[+] [-] hammock|2 years ago|reply
The Lord’s work. I deal with this issue at work all day long.
I am lucky I had a high school English teacher who DRILLED into me the slogan “no pronouns without a clear, one-word antecedent.”
That slogan is probably a top 2 mantra for me that has paid dividends in my communication skills. The other one would be “break long sentences into short ones.”
[+] [-] nickfromseattle|2 years ago|reply
I see this comment a lot, but I drew a different conclusion. People with high reading / writing ability assign less value to GPT's writing ability, whereas people with lower reading / writing ability assign more value to GPT's ability.
Edit:
> yet when I see their prompt, I realize it was a bad prompt.
So, I asked ChatGPT to re-write my reply.
----
Prompt: "Re-write this text, but more intelligently."
Output: I frequently encounter this observation, yet I have arrived at a divergent inference. Individuals possessing exceptional proficiency in reading and writing tend to ascribe a diminished significance to the writing aptitude of GPT, whereas individuals with more limited abilities in reading and writing tend to attach greater value to GPT's prowess.
----
Prompt: "Re-write this text, but make it sound better."
Output: I frequently encounter this recurring sentiment, yet I have formulated an alternative perspective. Individuals endowed with advanced reading and writing prowess tend to bestow a lesser degree of importance upon GPT's writing ability, whereas individuals possessing more modest proficiency in reading and writing are inclined to accord greater significance to GPT's remarkable capabilities.
-----
Prompt: "Re-write this text, but try and use the word 'ability' less."
Output: I frequently come across this comment, but my conclusion differs. Individuals who excel in reading and writing tend to place a diminished emphasis on GPT's writing prowess, while those with more limited skills in reading and writing tend to attach greater importance to GPT's capabilities.
----
Conclusion: I like my original reply better than GPT's re-write.
[+] [-] glenngillen|2 years ago|reply
The problems are often that I got exactly what I asked for. Not a thing more, no context that I thought would be assumed (e.g., don't remove the functionality I asked you to implement in the previous step), just a very literal interpretation of the asks.
I definitely found myself quickly adapting to try and be clearer and potentially over expressive in my prompts.
[+] [-] hospitalJail|2 years ago|reply
Depends.
Can't do math or logic. I have a question I ask ChatGPT to see if it can do logic yet, it still cannot. (Can't mention this question here or it will get fixed.)
Its great for brain storming or low risk problems. I don't think the accuracy problem will ever be fixed.
I probably 5x my productivity as well, but that doesnt mean its able to do logic.
[+] [-] mensetmanusman|2 years ago|reply
It will be interesting if only the highly educated are able to best leverage this, because that would be unfortunate and would accelerate inequality.
I also really hope this can be used to improve learning to bridge this gap, and this summer I will have my high school intern use this technology frequently with the hope that it accelerates his improvement.
[+] [-] fnordpiglet|2 years ago|reply
[+] [-] raincole|2 years ago|reply
I'm so used to ChatGPT now that for any non-trivial programming task that I'm not 100% confident with, I check how ChatGPT approaches it first. Not just one or two sentence, but a detailed summary with code example. Usually a single prompt of mine costs >300 tokens. Sometimes >1000 tokens.
I honestly don't know whether ChatGPT made me dumber or smarter. On one hand I'm relying on it a lot. On the other hand, I'm forced to actually think over my task in hand, not just telling myself that I've thought it through.
[+] [-] weinzierl|2 years ago|reply
... but if you provide too much information your counterpart might lose interest and forget what you said first.
[+] [-] ojosilva|2 years ago|reply
I wonder if StackOverflow, Quora, GitHub Issues and similar websites have noticed a decrease in new questions/issues raised? LLMs have a kind of network effect that it makes people ask less questions in public sites and more privately to their own LLM sessions, and maybe even give up if the LLM does not answer. That reduces critical mass at q&a forums, so, given enough time, less people will go there for answers. I'd say that's a great loss!
In a way, sarcastically speaking, it's one of AI's way of becoming our overlords.
[+] [-] mcguire|2 years ago|reply
https://pastebin.com/a3WzgvK4
The solution GPT-3.5 (I don't have access to 4.) gave was: "In conclusion, based on the statements and the given information, the treasure in the cave must be copper."
The solution given with the puzzle is "Here is one way to think about it: If Guard 1 is telling the truth, then the treasure must be diamonds. If Guard 1 is lying, then the treasure can be copper or gold. If Guard 2 is telling the truth, then the treasure must be silver. If Guard 2 is lying, then the treasure can be diamonds or rubies. The only possible option based on the statements of both guards is diamonds."
Is there any way to improve that prompt?
[+] [-] taf2|2 years ago|reply
Many people try to chat with it like a dumb bot using single words… few people have figured out you get very in-depth answers - so over all I’m very pleased with our results but still funny how people react to it…
[+] [-] BeFlatXIII|2 years ago|reply
[+] [-] _ea1k|2 years ago|reply
[+] [-] nathan_compton|2 years ago|reply
[+] [-] jwr|2 years ago|reply
Most of my calls end with a time out (on their side) after 10 minutes. I get 524 and 502 errors, sometimes 429, and sometimes a mildly amusing 404 Model not found. The only way I can get reasonable responses is to limit my requests to less than 1400 tokens, which is too little in my application.
And on top of that they actually charge me for every request. Yes, including those 524s, 502s and 429s, where I haven't seen a single byte of a response. That's fraudulent. I reported this to support twice, a week later I haven't even heard back.
Their status page happily states that everything is just fine.
From the forums it seems I'm not the only one experiencing these kinds of problems.
I'd argue "GPT Best Practices" should include having working APIs, support that responds, and not charging customers for responses that are never delivered.
[+] [-] sagarpatil|2 years ago|reply
Although, there’s a waitlist for GPT-4, so I’m not sure if you’ll access immediately.
[+] [-] avindroth|2 years ago|reply
[+] [-] hammock|2 years ago|reply
1. Write clear instructions
2. Provide reference text
3. Split complex tasks into simpler subtasks
4. Give time to "think"
5. Use external tools
6. Test changes systematically
[+] [-] f0e4c2f7|2 years ago|reply
[+] [-] fnordpiglet|2 years ago|reply
All of these best practices are great for humans dealing with their humans as well:
[+] [-] tikkun|2 years ago|reply
[+] [-] justanotheratom|2 years ago|reply
"Transformers need tokens to think" - @karpathy on Chain of Thought prompting.
[+] [-] simonw|2 years ago|reply
I shuddered a bit at "Ask the model to adopt a persona" because I thought it was going to be that "You are the world's greatest expert on X" junk you see people spreading around all the time, but it was actually good advice on how to use the system prompt - their example was:
> "When I ask for help to write something, you will reply with a document that contains at least one joke or playful comment in every paragraph."
I thought the section with suggestions on how to automate testing of prompts was particularly useful - I've been trying to figure that out myself recently. https://platform.openai.com/docs/guides/gpt-best-practices/s...
[+] [-] mritchie712|2 years ago|reply
0 - https://platform.openai.com/docs/guides/gpt-best-practices/s...
[+] [-] tomduncalf|2 years ago|reply
1: https://m.youtube.com/watch?v=s6zNXZaIiiI
[+] [-] orbital-decay|2 years ago|reply
- instead of using it as a source of facts, use it to transform the text with the facts you provide, which is does much better. (if the accuracy is important for your case)
- to improve the answer, ask it to reflect on its own result and reiterate the answer. The model produces the result token by token, so it's unable to check its validity at the inference time. This way you put it back into the context and explicitly tell the model to make a second pass.
[+] [-] tikkun|2 years ago|reply
System message
User message Fill in all the [] sections. Then hit submit. This should work pretty well. I'd suggest setting the temperature to 0 if you want more predictable responses.I wrote up additional info here: https://llm-utils.org/My+template+for+a+semi-complex+GPT-4+p...
I first played with GPT early 2021, and have been actively using it since mid 2022. This the method I've found to have the best tradeoff between complexity and effectiveness.
Note that I always try to zero shot it first, and I only use this method for things where zero shot fails, and where I need GPT to get right and that it's worth the effort of making a few shot prompt for.
[+] [-] awinter-py|2 years ago|reply
[+] [-] minimaxir|2 years ago|reply
I am releasing a minimal Python chat AI package interface this week which very heavily encourages use the system prompt for efficient generations that are also stable and can hand a variety of user inputs. The results have been very effective!
[+] [-] samwillis|2 years ago|reply
Any other "best practices" for any other sort of platform, database or language, should include suggestions on how to keep your system secure and not vulnerable to abuse.
Coding for LLMs right now is a bit like coding with PHP+MySQL in the late 90s to early 00s, throw stuff at it with little thought and see what happens, hence the wave of SQL injection vulnerabilities in software of that era. The best practices haven't even really been established, particularly when it comes to security.
[+] [-] TeMPOraL|2 years ago|reply
SQL injections, like all proper injection attacks (I'm excluding "prompt injections" here), are caused by people treating code as unstructured plaintext, and doing in plaintext-space the operations that should happen in the abstract, parsed state - one governed by the grammar of the language in question. The solution to those is to respect the abstraction / concept boundaries (or, in practice, just learn and regurgitate a few case-by-case workarounds, like "prepared statements!").
"Prompt injections" are entirely unlike that. There is no aspect of doing insertion/concatenation at the wrong abstraction level, because there are no levels here. There is no well-defined LLMML (LLM Markup Language). LLMs (and their other generative cousins, like image generation models) are the first widely used computer systems that work directly on unstructured plaintext. They are free to interpret it however they wish, and we only have so much control over it (and little insight into). There are no rules - there's only training that's trying to make them respond the way humans would. And humans, likewise, are "vulnerable" to the same kind of "prompt injections" - seeing a piece of text that forces them to recontextualize the thing they've read so far.
I think mitigations are the only way forward, and at least up to the point we cross the human-level artificial general intelligence threshold, "prompt injection" and "social engineering" will quickly become two names for the same thing.
[+] [-] tester457|2 years ago|reply
[+] [-] Demmme|2 years ago|reply
There is no relevant promtinjection you should be aware of because you will not be affected by it ajyway
[+] [-] Der_Einzige|2 years ago|reply
You can literally force it to return responses that are only one of say 100 possible responses (i.e. structure the output in such a way that it can only return a highly similar output but with a handful of keywords changing).
It's work, but it will work with enough constraints because you've filtered the models ability to generate "naughty" output.
[+] [-] Peretus|2 years ago|reply
[+] [-] smoldesu|2 years ago|reply
[+] [-] nomadness|2 years ago|reply
[deleted]
[+] [-] thrdbndndn|2 years ago|reply
And I'm surprised how often it failed to follow the basic instruction of
And I have to say "Please translate the following paragraph to X-language" every single time -- I can't just say, "hey, please just translate paragraphs I give from now on." It won't follow it for very long before it starts to do other random stuff or tries to follow the content of the Japanese paragraphs I was trying to get translated.Any clue how to make it better? I use 3.5 FWIW.
[+] [-] wskish|2 years ago|reply
https://news.ycombinator.com/item?id=36155583
[+] [-] jesterson|2 years ago|reply
Not talking about complex matters, just not closed quotes or incorrect closing quotes, which is very easy to see.
When confronted it apologies and then gives corrected answer, which... yea, sometimes have other silly mistakes too.
It wasnt in GPT-3 so wild guess would be they adding mistakes on purpose.
[+] [-] AtlasBarfed|2 years ago|reply
"best practices" ... what proof do they offer that these are the best? Is there a detailed deconstruction of why these are the best, and what practices they are superior to? A multifaceted analysis that may show other practices to be better under some circumstances?
Oh, you mean "recommended practices".
[+] [-] sagarpatil|2 years ago|reply
Video: https://youtu.be/bZQun8Y4L2A
[+] [-] RcouF1uZ4gsC|2 years ago|reply
[+] [-] MuffinFlavored|2 years ago|reply
I have yet to find a good way for example to feed ChatGPT GPT-4 (or GPT-3.5 for that matter) "here is a semi-large list of like... songs. help me classify which genre they are closest to" because of the token limit/timeout in the chat.
I'm sure an API integration is possible, but that opens yourself up to potentially "huge" costs compared to a guaranteed free implementation (or the fixed $20/mo)
Anybody able to feed it rows/cells from Google Sheets easily?
[+] [-] braindead_in|2 years ago|reply
This is quite remarkable. Can we now say that inner monologue is also an emergent property? Can this lead to some form of self-awareness or introspection?
[+] [-] foxbyte|2 years ago|reply
[+] [-] jerpint|2 years ago|reply
[+] [-] jofinjoseph|2 years ago|reply