For the past few months I have been working on a project that implements different methodologies for LLM jailbreaking. The idea is to stress-test how safe the new LLMs in production are and how easy it is to trick them. I have seen some pretty cool results with some of the methods, such as TAP (Tree of Attacks with Pruning), so I wanted to share it here. Here is the GitHub link: https://github.com/General-Analysis/GA