I find it strange that the author is so happy with the quality of the string comparison here https://pgaleone.eu/ai/coding/2025/01/26/using-ai-for-coding... While it would kind of work, it's a very weird piece of code from an ML standpoint. For example, it fits a TF-IDF vectorizer on just the two strings being compared, which at best won't change much (unless the same word is repeated within one product), and is a super weird thing to do: for better quality you'd probably want to fit it on some corpus, or not bother at all. It also compares the two strings as bags of words, which again is not the end of the world but may not be what the author wants here, and if they do want that, this is not the easiest way of doing it. So it takes some things that can be useful when comparing texts (TF-IDF and cosine similarity) but applies them in a weird way that does not let them show their strengths.
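To make the bag-of-words point concrete, here is a rough sketch of what I mean by an easier way, using only the standard library. The function names `bow_cosine` and `jaccard` and the example product strings are mine, not from the linked post, and the whitespace tokenization is an assumption; a real product matcher would likely want better normalization.

```python
import math
from collections import Counter


def bow_cosine(a: str, b: str) -> float:
    """Cosine similarity of two strings treated as bags of lowercase words.

    Hypothetical helper for illustration: no IDF weighting at all, just
    raw word counts, so word order is ignored by construction.
    """
    ca, cb = Counter(a.lower().split()), Counter(b.lower().split())
    # Only words present in both strings contribute to the dot product.
    dot = sum(ca[w] * cb[w] for w in ca.keys() & cb.keys())
    sq_a = sum(v * v for v in ca.values())
    sq_b = sum(v * v for v in cb.values())
    norm = math.sqrt(sq_a * sq_b)
    return dot / norm if norm else 0.0


def jaccard(a: str, b: str) -> float:
    """Share of distinct words the two strings have in common."""
    sa, sb = set(a.lower().split()), set(b.lower().split())
    union = sa | sb
    return len(sa & sb) / len(union) if union else 0.0


if __name__ == "__main__":
    # Same words in a different order are indistinguishable to a bag of words.
    print(bow_cosine("Apple iPhone 15 Pro", "pro 15 iphone apple"))
    # One extra word out of four distinct words.
    print(jaccard("Apple iPhone 15", "Apple iPhone 15 Pro"))
```

If ignoring word order is fine for the use case, something like this gets you there without a vectorizer; if it is not fine, then TF-IDF over two strings was not going to help either.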