
Making my bookshelves clickable

115 points | JNRowe | 2 years ago | jamesg.blog

29 comments

[+] greggsy|2 years ago|reply
What I would really like is a tool that takes a video or series of photos and automatically catalogues the contents. This would be nice from a 'document all the things' perspective, but really, really convenient in case of a fire or theft claim.

If you haven’t already, I strongly urge everyone reading this comment to stand up and do a video walkthrough of your house, today. Do all of your book spines, jewellery, DVDs, games, clothes, tools, cutlery, etc. Take photos of your bike’s serial numbers (usually under the hub).

Store the videos and photos in a cloud album or even a free tier somewhere. Email it to yourself, whatever, just don’t forget to do it.

It might take half an hour or so, but this evidence is priceless (in terms of time, though it does actually have a monetary value) if you ever need to claim insurance in case of a fire or burglary.

It simply isn’t possible to remember everything you own.

Some insurers demand photographic evidence of recent ownership - I found this out the hard way (who here has a photo of themselves with their bike?! I even had the receipt!)

[+] enos_feedler|2 years ago|reply
It's also good for Airbnb hosts who want to check whether anything is missing after guests leave.
[+] miniatureape|2 years ago|reply
I would really love a control-f for the real world.

Imagine you have a list of wines you want to try, or used books you’re hoping to buy.

In the store you open your phone and scan the shelves with your camera and if it finds any matches from your lists, it shows you them on the screen.
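Even without fancy AR, a crude version of this is just OCR plus fuzzy matching. A toy sketch (titles illustrative; real spine OCR would come from your phone's text recognition):

```python
import difflib

def find_matches(ocr_lines, wishlist, cutoff=0.6):
    """Match noisy OCR'd spine text against a wishlist of titles."""
    matches = {}
    lowered = [t.lower() for t in wishlist]
    for line in ocr_lines:
        hits = difflib.get_close_matches(line.lower(), lowered,
                                         n=1, cutoff=cutoff)
        if hits:
            # recover the original-cased wishlist title
            title = next(t for t in wishlist if t.lower() == hits[0])
            matches[line] = title
    return matches

shelf = ["THE PRAGMAT1C PROGRAMMER", "Some Other Book"]
wants = ["The Pragmatic Programmer", "Refactoring"]
print(find_matches(shelf, wants))
```

The fuzzy cutoff is doing the heavy lifting here: spine OCR mangles characters, so exact string comparison would miss almost everything.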

[+] pjmorris|2 years ago|reply
Same. I've imagined 'spines.com' for books, where the work of linking book spines to ISBNs, etc., has been crowdsourced, and you can point your phone camera at a shelf and look up reviews, etc.
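The data-entry side is friendlier than it sounds, too: ISBN-13s carry their own check digit, so a crowdsourced pipeline can reject badly OCR'd numbers before they pollute the database. A toy validator:

```python
def isbn13_valid(isbn: str) -> bool:
    """Validate an ISBN-13 using its weighted check digit."""
    digits = [int(c) for c in isbn if c.isdigit()]
    if len(digits) != 13:
        return False
    # weights alternate 1, 3, 1, 3, ... across all 13 digits;
    # a valid ISBN-13 makes the weighted sum a multiple of 10
    total = sum(d * (3 if i % 2 else 1) for i, d in enumerate(digits))
    return total % 10 == 0

print(isbn13_valid("978-0-13-468599-1"))  # True
```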
[+] buildsjets|2 years ago|reply
Apple iOS does this, kindasorta, but it's not real time. It does text recognition on the images in your photo stream, and you can search for text in your photos using the search bar.
[+] greggsy|2 years ago|reply
Apple's lidar is absolutely suited to this, and I've always thought there is an opportunity to integrate it into the dollhouse add-on in Home Assistant.

Or, create a digital twin of your garden, and simulate light shadows throughout the day after adding or removing a tree. Add pruning schedules to a fruit tree.

It’s trivial to do a scan, but it hasn’t really taken off in a practical sense.

[+] lathiat|2 years ago|reply
Vivino sort of kind of can do that. Has a rapid multi scan mode for shelves. Not quite the pointy AR experience yet though.
[+] KTibow|2 years ago|reply
Gemini can help with this, but it has a large amount of overhead compared to dedicated models.
[+] dmd|2 years ago|reply
I wish when people published things like this they would take the time to notice that their requirements.txt and READMEs don't actually work; e.g., try it out in a fresh VM or container divorced from their working environment. Not even their arguments match what's in the README.
[+] adammarples|2 years ago|reply
It blows my mind how much effort people, particularly python ml people, put into creating, training, blogging, promoting their github repo and then give so little thought to whether anybody will be able to use it. You're lucky if you get an incomplete and unversioned requirements.txt and no mention of what python version was used. If it requires conda, just give up. You might get an injunction to "install pytorch" and you're on your own.
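The bar is really low, too. A pinned snapshot goes a long way; something like this (package names and versions illustrative, grabbed via `pip freeze` on the machine that actually works):

```text
# requirements.txt -- generated with `pip freeze`, tested on Python 3.10
torch==2.1.2
torchvision==0.16.2
numpy==1.26.3
```

One `pip freeze > requirements.txt` plus a single README line naming the Python version would fix most of these repos.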
[+] maroonblazer|2 years ago|reply
While watching an interview recently, where the interviewee was sitting in front of their bookshelf, I was trying to discern the book titles to add to my reading list. I screenshotted the person/bookshelf and tried asking ChatGPT+/GPT4 to list all the books. It could only identify a tiny fraction.
[+] moonlitzxspec|2 years ago|reply
This is a nice time-saving tool for "minimalist" information hoarding. A pet project of mine is to thin my bookshelf down to only the books I regularly reach for, and store the "never going to read" books out of sight.

The idea is, if I can save the details of the "never going to read books" and acquire a digital copy of them, it may be easier for me to psychologically let go of the physical copy and gain the storage space again.

I was going to take a photo of my crowded bookshelves and manually put the ISBNs and titles into a spreadsheet, keeping the photos simply for extra reference. Your project making the photo clickable is a great bridge between the data and the artifact.

[+] abhgh|2 years ago|reply
This is interesting. For me, not reaching for certain books seems to be a consequence of lazy "rotation": these are the same books I'd eventually find time to read if I saw them around often enough, so that when I have some leisure I don't forget about them.

Anyway, this is our (my wife's and mine) hypothesis, so we are currently working on rotating books; let's see how that works out : - )

[+] cxr|2 years ago|reply
HTML has image maps. They've been around longer than many of the people who might read this post have been alive. You don't need SVG and JS for this.
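For anyone who hasn't touched them: a `<map>` is just a list of clickable rectangles, which is exactly what a spine detector outputs. A rough sketch of generating one from detection boxes (coordinates and URLs made up):

```python
from html import escape

def image_map(name, boxes):
    """Render detector boxes as an HTML <map> of clickable rectangles.

    boxes: list of (x1, y1, x2, y2, href, title) tuples in pixels.
    """
    areas = "\n".join(
        f'  <area shape="rect" coords="{x1},{y1},{x2},{y2}" '
        f'href="{escape(href)}" alt="{escape(title)}">'
        for x1, y1, x2, y2, href, title in boxes
    )
    return f'<map name="{escape(name)}">\n{areas}\n</map>'

print(image_map("shelf", [(10, 5, 40, 180, "https://example.com/book1", "Book One")]))
```

Pair the output with `<img src="shelf.jpg" usemap="#shelf">` and every spine is a plain link, no script required.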
[+] lights0123|2 years ago|reply
It is certainly odd that they didn't use them, especially since image maps are mentioned at the beginning of the article.
[+] jonititan|2 years ago|reply
What's the benefit of image maps vs SVG here?
[+] Nevermark|2 years ago|reply
Works great in Vision. But now I want to get a browsable spatial model when I tap any book in my field of view.

Cool demos are so frustrating!

[+] butz|2 years ago|reply
Was expecting to find a project where you add a switch behind a book and "clicking" it opens a secret entrance behind the bookshelf, but this is also very cool.
[+] jonititan|2 years ago|reply
Very nice project. Seems like it would be the ideal kind of thing for AR/XR.
[+] Imnimo|2 years ago|reply
Would it make sense to have the user click and then use that point as a SAM prompt? It might let you find a book even if the initial SAM query doesn't find it.
[+] zerojames|2 years ago|reply
Post author here. I like this idea. I plan to explore it and make a more generic solution. I'd love to have a point-and-click interface for annotating scenes.

For example, I'd like to be able to click on pieces of coffee equipment in a photo of my coffee setup so I can add sticky note annotations when you hover over each item.
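The click handling itself is simple. As a rough sketch (region names hypothetical), resolving a click to the smallest containing detection box looks like:

```python
def hit_test(point, regions):
    """Return the id of the detected region containing a click, if any.

    point: (x, y); regions: dict of id -> (x1, y1, x2, y2) boxes.
    The smallest containing box wins, so a spine beats the whole shelf.
    """
    x, y = point
    best, best_area = None, None
    for rid, (x1, y1, x2, y2) in regions.items():
        if x1 <= x <= x2 and y1 <= y <= y2:
            area = (x2 - x1) * (y2 - y1)
            if best_area is None or area < best_area:
                best, best_area = rid, area
    return best

regions = {"shelf": (0, 0, 400, 200), "book-3": (120, 10, 150, 190)}
print(hit_test((130, 50), regions))  # book-3
```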

For the bookshelves idea specifically, I would love to have a correction system in place. The problem isn't so much SAM as it is Grounding DINO, the model I'm using for object identification. I then pass each identified region to SAM and map the segmentation mask to the box.

Grounding DINO detects a lot of book spines, but often misses 1-2. I am planning to try out YOLO-World (https://github.com/AILab-CVC/YOLO-World), which, in my limited testing, performs better for this task.
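For the curious, the "map the segmentation mask to the box" step is just a reduction over the mask. A pure-Python toy (real code would operate on the model's NumPy masks, but the logic is the same):

```python
def mask_to_bbox(mask):
    """Reduce a binary segmentation mask (list of rows) to a bounding box.

    Returns (x_min, y_min, x_max, y_max) in pixel coordinates,
    or None if the mask is empty.
    """
    xs = [x for row in mask for x, v in enumerate(row) if v]
    ys = [y for y, row in enumerate(mask) if any(row)]
    if not xs:
        return None
    return (min(xs), min(ys), max(xs), max(ys))

mask = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 1, 1],
    [0, 0, 0, 0],
]
print(mask_to_bbox(mask))  # (1, 1, 3, 2)
```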