willwade|2 months ago I wonder if this would have been useful: https://github.com/microsoft/presidio. It's heavy but looks really good. There is a lite version.
shaoz|2 months ago I've used it. Lots of false positives out of the box; you need to do a ton of tuning or pair it with a transformer/BERT model, but at that point it's basically the same thing as the OP's project.
threecheese|2 months ago Looks like it uses Google's LangExtract, which uses only LLMs for NLP, while OP is using a small NER model that runs locally.
winchester6788|2 months ago Full of false positives, though, but definitely good for some types of entities and regexes.
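For anyone wondering why regex-style detection throws so many false positives and what "tuning" tends to mean, here is a rough stdlib-only sketch. It is not Presidio's API or implementation; the `PATTERNS` table, the `detect` function, the scores, and the context words are all made up for illustration. The idea shown is just: a loose pattern gets a low base score, nearby context words bump it up, and a threshold decides what is reported.

```python
import re
from dataclasses import dataclass


@dataclass
class Match:
    entity: str
    text: str
    start: int
    end: int
    score: float


# Hypothetical recognizers: (entity, regex, base score, context words).
# The phone pattern is deliberately loose, so it gets a low base score.
PATTERNS = [
    ("EMAIL", re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"), 0.9, ()),
    ("PHONE", re.compile(r"\b\d{3}[-.\s]?\d{3}[-.\s]?\d{4}\b"), 0.4,
     ("phone", "call", "tel")),
]


def detect(text, threshold=0.5, window=30):
    """Return matches whose score, after a context boost, meets the threshold."""
    results = []
    lowered = text.lower()
    for entity, pattern, base, context in PATTERNS:
        for m in pattern.finditer(text):
            # Look at the characters just before the match for context words.
            before = lowered[max(0, m.start() - window):m.start()]
            boost = 0.4 if any(word in before for word in context) else 0.0
            score = round(base + boost, 2)
            if score >= threshold:
                results.append(Match(entity, m.group(), m.start(), m.end(), score))
    return sorted(results, key=lambda r: r.start)


if __name__ == "__main__":
    sample = ("Call me at 555-123-4567 or mail bob@example.com. "
              "Order 123 456 7890 shipped.")
    for hit in detect(sample):
        print(hit)
```

With the default threshold the order number in the sample is dropped because nothing near it suggests a phone number; lowering `threshold` to 0.3 lets it through, which is the kind of false positive described above. A local NER or transformer model replaces these hand-picked scores with learned ones.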