
Josh5 | 6 months ago

Are they even sure that the AI accessed the content that second time? LLMs are really good at making up shit. I have tested this by asking various LLMs to scrape data from my websites while watching the access logs. Many times they never make a request at all and just rely on some sort of existing data, or spout a bunch of BS. Gemini is especially bad about this. I haven't used Copilot myself, but my experience with other AIs makes me curious about this.
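
The test is easy to reproduce: serve a page at a random path nobody could have published, ask the model to read it, then watch the access log for a request to that exact path. A rough sketch in Python (the log path and domain are placeholders for your own setup):

    # Probe whether an LLM actually fetches a URL or just makes something up:
    # serve a page at a random, never-published path, ask the model to read
    # it, then watch the access log for a request to that exact path.
    import secrets
    import time

    NONCE = secrets.token_hex(8)
    PROBE_PATH = f"/llm-probe-{NONCE}.html"      # placeholder probe path
    ACCESS_LOG = "/var/log/nginx/access.log"     # placeholder log location

    print(f"Ask the model to summarize https://example.com{PROBE_PATH}")

    deadline = time.time() + 120                 # give it two minutes
    with open(ACCESS_LOG) as log:
        log.seek(0, 2)                           # start tailing at end of file
        while time.time() < deadline:
            line = log.readline()
            if not line:
                time.sleep(0.5)
                continue
            if PROBE_PATH in line:
                print("Real fetch observed:", line.strip())
                break
        else:
            print("No request seen -- the model answered without fetching.")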

bongodongobob | 6 months ago

This is it. M365 runs RAG over whatever enterprise data you allow it to index. In the cases he described, it's not actually accessing the files directly. It's working as intended.
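
For anyone unfamiliar with the pattern, here is a toy sketch of a generic RAG flow -- an illustration of the architecture, not Microsoft's actual pipeline:

    # Toy RAG flow. The source file is read ONCE, at ingestion; every later
    # question is answered from the index copy with no file I/O, so per-file
    # audit logging never records the "second" access.
    index = []  # list of (doc_id, chunk) pairs built at ingestion time

    def ingest(doc_id, path):
        # The ONLY step that touches the source file.
        with open(path) as f:
            text = f.read()
        for i in range(0, len(text), 500):
            index.append((doc_id, text[i:i + 500]))

    def answer(question):
        # Crude keyword retrieval over the cached chunks -- no open() here.
        words = set(question.lower().split())
        doc_id, chunk = max(
            index, key=lambda e: len(words & set(e[1].lower().split())))
        # A real system would hand the chunk plus the question to the LLM;
        # here we just return the retrieved context to show where it came from.
        return f"[context from {doc_id}] {chunk}"

Which is exactly why the second "access" never shows up in an audit log: there is no second read of the file, only a lookup in the cached copy.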

albert_e | 6 months ago

If this is indeed how Copilot is architected, then it needs clear documentation -- that it is effectively a non-audited data store.

But how then did MS "fix" this bug? Did they stop pre-ingesting, indexing, and caching the content? I doubt that.

Pushing (defaulting) organizations into feeding all their data to Copilot, and then providing no audit trail of access to that replica data store, feels like a fundamental gap -- the kind of thing a security 101 checklist should catch.
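
What that missing audit trail could minimally look like is an event emitted whenever the replica/index store serves content, not just when the underlying file is opened. A sketch (the field and action names below are invented, not Microsoft's schema):

    # Hypothetical shape of the missing audit record: log every retrieval
    # served from the replica/index store, not just direct file opens.
    import json
    from datetime import datetime, timezone

    def audit_retrieval(actor, doc_id, chunk_id, out_path="audit.jsonl"):
        event = {
            "time": datetime.now(timezone.utc).isoformat(),
            "actor": actor,                       # who asked Copilot
            "action": "copilot.index.retrieve",   # invented action name
            "doc_id": doc_id,                     # source document identity
            "chunk_id": chunk_id,                 # which cached chunk was read
        }
        with open(out_path, "a") as f:            # append-only audit log
            f.write(json.dumps(event) + "\n")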

crooked-v | 6 months ago

If that's the case then, as noted in the article, that "as intended" behavior probably violates liability and compliance requirements in a number of areas.