top | item 39205656


I always feel like there is some trick to these that I am missing out on. Are there any good guides? Any time I look for some, it's just typical low-effort blog/YouTube spam trying to get in on the AI/GPT keywords.

I have tried to work on one where I uploaded various documentation and spec sheets and wrote detailed instructions on how to search through them. Then I described how it should handle different prompt situations (errors, types of questions, quotes from the documentation). It is able to search through the provided knowledge and provide quotes and responses with it, but at no point does it give a coherent response, so it basically always functions like a more intelligent search feature. Telling it to re-prompt itself with the extracted knowledge and rationalize/elaborate on it doesn't seem to do much either, though it did provide some improvement.


firtoz|2 years ago

The retrieval from file has issues. I'm unsure what exactly it retrieves and how. Afaik it gets a kind of "chunk" from only one file per request in whatever way it considers to be relevant to the request. Could be a simple "embedding vector comparison" or something else...
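For anyone unfamiliar with the term, here is a rough sketch of what an "embedding vector comparison" looks like in general. This is not a description of how the GPT knowledge retrieval actually works (that is the unknown part); the `embed` function is a toy bag-of-words stand-in for a real embedding model, and all names and sample chunks are made up for illustration.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse vectors stored as Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_chunk(query, chunks):
    """Return the single chunk whose vector is closest to the query vector."""
    q = embed(query)
    return max(chunks, key=lambda c: cosine(q, embed(c)))


# Hypothetical chunks cut from uploaded documentation.
chunks = [
    "The device operates at 5 volts and draws 200 mA",
    "Error code E42 means the sensor is disconnected",
    "Firmware updates are installed over USB",
]
best = top_chunk("what does error code E42 mean", chunks)
print(best)
```

If retrieval really does return only the single best chunk like this, anything relevant in the other chunks never reaches the model, which would match the behaviour described above.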

Then we are unsure how much of the context that chunk replaces or overrides. Does it overwrite past messages? Does it overwrite the system prompt? Anything else? Who knows.

If anyone has any info I would appreciate it too. I gave up on it for anything significantly complicated; you're better off using the actions API to query a better RAG system instead.

hickelpickle|2 years ago

I had to add instructions telling it to search the knowledge files 2000 characters at a time, and to search for keywords rather than exact phrases, which is really the only advice I could find online about developing one. It also needs to have the code interpreter enabled afaik. It seems to have issues with zip files as well: it can extract and search them sometimes, but it varies the technique and sometimes fails. I can confirm that it can search multiple files, as I uploaded a mailing list archive and it would return results from multiple files in it.
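As a rough picture of what that instruction asks the code interpreter to do, here is a minimal sketch. The function name, the sample text, and the case-insensitive substring match are my own assumptions; the code the GPT actually writes for itself varies from run to run.

```python
def keyword_windows(text, keywords, window=2000):
    """Yield (offset, window_text) for each fixed-size window containing any keyword.

    Matching is a case-insensitive substring test, not exact-phrase search.
    """
    lowered = [k.lower() for k in keywords]
    for start in range(0, len(text), window):
        piece = text[start:start + window]
        low = piece.lower()
        if any(k in low for k in lowered):
            yield start, piece


# Hypothetical knowledge file contents: filler with one relevant passage.
text = "a" * 2500 + " timeout error " + "b" * 3000
hits = list(keyword_windows(text, ["Timeout"]))
offsets = [offset for offset, _ in hits]
print(offsets)
```

One limitation of non-overlapping windows is that a keyword straddling a 2000-character boundary is missed, which may be one reason results are inconsistent.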

I've moved to combining all my data into single files, but sometimes it seems to have issues with those as well, even if they are under the upload size limit. I assume that is due to how many characters are in them. When it happens, it will just brick the whole GPT until the offending file is removed.

The part I have issues with is having it actually use the data. It will quote or summarize data it found in the knowledge base and say where it found it if it can, but I can never make it do more than that. Ideally I want it to contextualize the data it finds in the knowledge files and either prompt itself with it or factor it into a response, but any time it accesses the knowledge base I get nothing more than a paraphrase of what it found and why it may be applicable to my prompt.
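To make the "prompt itself" idea concrete, this is roughly the second step I have in mind, written as if retrieval were under my own control (for example behind an action) instead of inside the GPT. It is only a sketch: the function name, the prompt wording, and the sample excerpts are all hypothetical, and I have not confirmed that a GPT can be made to do this reliably.

```python
def build_synthesis_prompt(question, excerpts):
    """Wrap retrieved excerpts and the user's question into one follow-up prompt."""
    numbered = "\n".join(
        f"[{i}] {text}" for i, text in enumerate(excerpts, start=1)
    )
    return (
        "Use the excerpts below as background knowledge.\n"
        "Do not just quote or summarize them; answer the question directly, "
        "citing excerpt numbers where they support a claim.\n\n"
        f"Excerpts:\n{numbered}\n\n"
        f"Question: {question}\n"
    )


# Hypothetical excerpts returned by an earlier search step.
prompt = build_synthesis_prompt(
    "Why does the sensor report E42 after a firmware update?",
    [
        "Error code E42 means the sensor is disconnected.",
        "Firmware updates reset the sensor port configuration.",
    ],
)
print(prompt)
```

The point is that the excerpts become input to a fresh request whose instructions are about answering, not searching, so the model is not stuck in "report what I found" mode.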