galkk | 11 days ago
ISBN is an attribute/key, but not a primary key, in database terms :)
ISBNs are messy, and in the real world you’ll see a crazy number of broken/edge cases that shouldn’t happen by the letter of the standard but happen all the time in reality.
* For example, an ISBN can be reused by a publisher for a completely different book.
* A 2nd edition, while very different, may have the same ISBN.
* A reissue of the same book could have a different ISBN.
* Textbooks by the same author for 6th and 7th grade could have the same ISBN.
* As soon as you get into translations, all bets are off.
* I already mentioned textbooks. How about college books, where each year there was a slightly revised edition of the same book?
If you’re asking yourself “wtf?”, you’re not alone.
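To make the "attribute, not primary key" point concrete, here is a minimal sketch using Python's built-in sqlite3. The table layout, the function names, and the ISBN value are all made up for illustration; this is not the schema we actually used. The idea is only that each row gets its own surrogate id and the ISBN is an indexed, non-unique column, so a lookup by ISBN returns a list of candidates rather than one row.

```python
import sqlite3


def make_catalog() -> sqlite3.Connection:
    """Create an in-memory catalog where ISBN is indexed but NOT unique."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE book ("
        " id INTEGER PRIMARY KEY,"  # surrogate key, the real identity of a row
        " isbn TEXT,"               # plain attribute: may repeat or be missing
        " title TEXT NOT NULL)"
    )
    conn.execute("CREATE INDEX book_isbn ON book (isbn)")
    return conn


def add_book(conn: sqlite3.Connection, isbn: str, title: str) -> int:
    """Insert a book and return its surrogate id."""
    cur = conn.execute(
        "INSERT INTO book (isbn, title) VALUES (?, ?)", (isbn, title)
    )
    return cur.lastrowid


def find_by_isbn(conn: sqlite3.Connection, isbn: str) -> list:
    """Return every (id, title) row that carries this ISBN, oldest first."""
    rows = conn.execute(
        "SELECT id, title FROM book WHERE isbn = ? ORDER BY id", (isbn,)
    )
    return rows.fetchall()


if __name__ == "__main__":
    conn = make_catalog()
    # Placeholder ISBN, shared by two different textbooks (the 6th/7th grade case).
    add_book(conn, "9780000000002", "Math, grade 6")
    add_book(conn, "9780000000002", "Math, grade 7")
    print(find_by_isbn(conn, "9780000000002"))
```

Whatever consumes `find_by_isbn` then has to decide which candidate, if any, is the book it wanted.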
---
In my youth I heard horror stories about people who suddenly found multiple duplicate GUIDs (UUIDv1) in their databases because cheap Chinese knockoff network cards were using the same MAC addresses. Keep in mind that with ISBNs, that could happen to you at any time.
Ekaros|11 days ago
galkk|11 days ago
Honestly, right now I probably wouldn’t even try to code a complex book-matching algorithm; I’d feed all of the books’ metadata, including book covers etc., to an LLM, and it would do better than what we had.
Our algorithm had tons of special cases coded in, and the results UI had a “needs manual review” button that launched a review workflow for cases where the confidence score was low (not a joke: the business people had a special support team in India, because we were matching not only books).
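For a rough idea of what "confidence score plus manual review" looks like, here is a toy sketch. It is not our actual algorithm: the weights, the thresholds, and the names `Book`, `match_confidence`, and `route` are all invented for illustration, and the string similarity is just `difflib.SequenceMatcher` from the standard library. A matching ISBN only adds a bit of evidence; it never decides the outcome on its own.

```python
from dataclasses import dataclass
from difflib import SequenceMatcher


@dataclass
class Book:
    title: str
    author: str
    isbn: str = ""


def _similar(a: str, b: str) -> float:
    """Case-insensitive similarity in [0, 1]."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()


def match_confidence(a: Book, b: Book) -> float:
    """Combine title, author, and ISBN evidence into one score in [0, 1].

    The weights (0.5 / 0.3 / 0.2) are arbitrary illustration values.
    """
    score = 0.5 * _similar(a.title, b.title) + 0.3 * _similar(a.author, b.author)
    if a.isbn and a.isbn == b.isbn:
        score += 0.2  # equal ISBNs are a hint, not proof
    return score


def route(a: Book, b: Book, accept: float = 0.9, reject: float = 0.5) -> str:
    """Auto-accept, auto-reject, or send the pair to a human reviewer."""
    confidence = match_confidence(a, b)
    if confidence >= accept:
        return "match"
    if confidence < reject:
        return "no match"
    return "needs manual review"


if __name__ == "__main__":
    original = Book("Dune", "Frank Herbert", "9780000000002")
    reissue = Book("Dune", "Frank Herbert", "9780000000019")  # new ISBN, same book
    print(route(original, reissue))
```

With these made-up thresholds, a reissue with a different ISBN lands in the review queue, while a reused ISBN on an unrelated title and author is rejected outright.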