Currently? You don't know for sure. You might be able to make some guesses by examining how the model responds to various prompts and comparing its output against the output of known models for the same prompts. But that will, at best, give you a good guess, not certainty. This is why the FLI's proposal for AI regulation [1] suggests both that AI models be watermarked and that the output from AI models be clearly identified as such. In a world where most people use regulated models, this would let you identify which model generated a given piece of content.

As for applying a delta to the weights, that would likely break the model. It would be like randomly scrambling bytes in a compressed file and then expecting the file to decompress properly.
[1]: https://futureoflife.org/wp-content/uploads/2023/04/FLI_Poli...
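The compressed-file analogy is easy to demonstrate. Here's a minimal sketch (the helper name and payload are just for illustration) that gzips some data, flips a few bytes past the header, and checks whether decompression survives; corrupting even a handful of bytes in the deflate stream almost always triggers a decode error or a CRC-check failure:

```python
import gzip
import zlib

def survives_corruption(data: bytes, start: int, count: int) -> bool:
    """Compress `data`, flip `count` bytes starting at offset `start`
    in the compressed stream, and report whether it still decompresses."""
    blob = bytearray(gzip.compress(data))
    for i in range(start, start + count):
        blob[i] ^= 0xFF  # invert every bit in this byte
    try:
        gzip.decompress(bytes(blob))
        return True
    except (gzip.BadGzipFile, zlib.error, EOFError):
        return False

payload = bytes(range(256)) * 16  # 4 KiB of mildly compressible data

# Flip 3 bytes just past the 10-byte gzip header, inside the deflate stream.
print(survives_corruption(payload, 12, 3))   # almost always False
print(survives_corruption(payload, 12, 0))   # no corruption -> True
```

Model weights are similarly interdependent: a delta computed for one set of weights scrambles the delicate structure of another, just as flipped bytes scramble a compressed stream.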