Show HN: Fruitstand – A Library for Regression Testing LLMs

1 points| gmarland | 1 year ago |github.com

1 comment

gmarland|1 year ago

Thanks for checking out fruitstand! I was recently working on a project that involved intent detection/entity extraction using an LLM. I was asked how we could make sure the model could be changed or upgraded in the future without affecting how that feature behaves. After a bit of thinking, this is what I came up with.

For anyone who is interested in how this works, it's pretty straightforward. The baseline step creates an embedding for the output phrase and stores it in a JSON file. The test step then runs the same phrase through the LLM/model being tested and embeds the result.
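
As a rough illustration of the baseline step, here is a minimal Python sketch. It is not fruitstand's actual code or API: `toy_embed`, `write_baseline`, `load_baseline`, and the JSON field names are all hypothetical, and the hashed word-count "embedding" only stands in for a real embedding model.

```python
import hashlib
import json


def toy_embed(text, dims=16):
    """Hypothetical stand-in for a real embedding model: hashed word counts."""
    vec = [0.0] * dims
    for word in text.lower().split():
        # Map each word to a stable bucket so the vector is reproducible.
        idx = int(hashlib.md5(word.encode("utf-8")).hexdigest(), 16) % dims
        vec[idx] += 1.0
    return vec


def write_baseline(path, query, response, embed=toy_embed):
    """Embed the baseline model's response and store it in a JSON file."""
    record = {"query": query, "response": response, "embedding": embed(response)}
    with open(path, "w", encoding="utf-8") as f:
        json.dump([record], f)
    return record


def load_baseline(path):
    """Read the stored baseline records back for a later test run."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)
```

A test run would load these records, send each query to the new model, and embed the new response with the same embedding function, since vectors from different embedding models are not comparable.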

The baseline and test embeddings are then compared using cosine similarity to determine how close the two outputs are.
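
For readers who have not used cosine similarity, a minimal pure-Python sketch of the comparison follows. The function names and the 0.8 pass threshold are assumptions for illustration, not values taken from fruitstand.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        # A zero vector has no direction; treat it as not similar.
        return 0.0
    return dot / (norm_a * norm_b)


def passes(baseline_vec, test_vec, threshold=0.8):
    """Hypothetical pass/fail check; the 0.8 threshold is an assumed value."""
    return cosine_similarity(baseline_vec, test_vec) >= threshold
```

A score near 1 means the two embeddings point in almost the same direction, so the new model's output is semantically close to the baseline. A sensible threshold depends on the embedding model and on how much variation the use case can tolerate.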

It was a pretty fun library to write!