matroid's comments

matroid | 2 months ago | on: Simple 3D Packing

Thanks. I'll link it in the first line of the README. I think the interlocking-free part can pack cups like you suggest. They propose a flood-fill algorithm that computes all the reachable placements for the voxelized shape. It makes no assumptions about convexity. I think cups would be a great example to try it out on, though.
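
To make the flood-fill idea concrete, here is a minimal sketch. It is not the paper's or the repo's implementation, just a plain breadth-first search over translations of a voxelized shape. The function name `reachable_offsets` and its parameters (`occupied`, `shape`, `size`, `start`) are made up for illustration, and it assumes axis-aligned unit steps with no rotations.

```python
from collections import deque

# The six axis-aligned unit moves a shape may take between placements.
STEPS = [(1, 0, 0), (-1, 0, 0), (0, 1, 0), (0, -1, 0), (0, 0, 1), (0, 0, -1)]


def reachable_offsets(occupied, shape, size, start):
    """Return every translation of `shape` reachable from `start`.

    occupied: set of (x, y, z) voxels that are already filled
    shape:    list of (x, y, z) voxel offsets making up the object
    size:     (X, Y, Z) extent of the search grid
    start:    a translation assumed to be collision-free (e.g. outside the tray)
    """

    def fits(offset):
        ox, oy, oz = offset
        for dx, dy, dz in shape:
            x, y, z = ox + dx, oy + dy, oz + dz
            in_bounds = 0 <= x < size[0] and 0 <= y < size[1] and 0 <= z < size[2]
            if not in_bounds or (x, y, z) in occupied:
                return False
        return True

    if not fits(start):
        return set()

    seen = {start}
    queue = deque([start])
    while queue:
        cx, cy, cz = queue.popleft()
        for sx, sy, sz in STEPS:
            nxt = (cx + sx, cy + sy, cz + sz)
            # Only move through placements where the whole shape is collision-free.
            if nxt not in seen and fits(nxt):
                seen.add(nxt)
                queue.append(nxt)
    return seen


# A wall at x == 2 splits a 5x1x3 grid; the far side is free but unreachable.
wall = {(2, 0, z) for z in range(3)}
reachable = reachable_offsets(wall, [(0, 0, 0)], (5, 1, 3), (0, 0, 0))
```

Because the search never checks whether the shape is convex, a concave object such as a cup is handled the same way as a box: a placement counts only if it is collision-free and connected to the start through other collision-free placements.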

matroid | 2 months ago | on: Simple 3D Packing

A while back, for a course project, I implemented a paper that had shown up on HN (Dense, Interlocking-Free and Scalable Spectral Packing of Generic 3D Objects).

Over the holidays, I cleaned up the implementation (with the help of Claude Code, although this is not an advertisement for it) and released it on GitHub.

If anyone needs fast 3D packing in Python, do give this a shot. Hopefully I have properly attributed all the code and ideas I used from elsewhere (if not, please feel free to let me know).

matroid | 5 months ago | on: FaceLift [ICCV 2025]

Hey everyone, I wanted to share my friend's work on Single Portrait Photograph to 3D Head Model. He has a Huggingface demo that you can play with!

matroid | 1 year ago | on: Weak supervision to isolate sign language communicators in crowded news videos

Thanks Zie for the message. I'm sorry to hear about your "interpreter" encounter :(

I do think these problems are much, much worse for ISL as you rightly noted.

I think I should have been careful when I said "solve" in my post. But that really came from a place of optimism/excitement.

matroid | 1 year ago | on: Weak supervision to isolate sign language communicators in crowded news videos

Also, in India, many hearing-impaired people know only ISL.

matroid | 1 year ago | on: Weak supervision to isolate sign language communicators in crowded news videos

That is correct. We want to translate between English and ISL. We chose English because it is by and large the language of the Web, and I think we should try to connect ISL to it rather than to Indian languages.

From my understanding, they are quite dissimilar. A person who knows ISL will not understand ASL, for example.

matroid | 1 year ago | on: Weak supervision to isolate sign language communicators in crowded news videos

Thanks for the feedback. You raise great points and this was the reason why we wrote this post, so that we can hear from people where the actual problem lies.

On a related note, this sort of explains why our model is struggling to fit 500 hours of our current dataset (even on the training set). Even so, the current state of automatic translation for Indian Sign Language is that, in the wild, even individual words cannot be detected very well. We hope that what we are building might at least improve the state of the art there.

> It's more of a bad and broken transliteration that if you struggle to think about you can parse out and understand.

Can you elaborate a bit more on this? Do you think that if we build a system for bad/broken transliteration and funnel its output through ChatGPT, it might give meaningful results? That is, ChatGPT might be able to correct the errors, as it is a strong language model.

matroid | 1 year ago | on: Harnessing Weak Supervision to Isolate Sign Language in Crowded News Videos

Hello everyone, we are trying to make a large dataset for Sign Language translation, inspired by BSL-1K [1]. As part of cleaning our collected videos, we use a nice technique for aggregating heuristic labels [2]. We thought it was interesting enough to share with people on here.

[1] https://www.robots.ox.ac.uk/~vgg/research/bsl1k/

[2] https://github.com/snorkel-team/snorkel
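
As a rough illustration of what "aggregating heuristic labels" means, here is a sketch using plain majority vote over a few labeling heuristics that may abstain. This is a deliberately simpler stand-in and does not use the Snorkel API; Snorkel's label model goes further by estimating how accurate each heuristic is. The heuristics and clip fields below (`inset_box`, `hand_motion`, `faces`) are hypothetical and not taken from our pipeline.

```python
from collections import Counter

ABSTAIN = -1  # a heuristic returns this when it has no opinion
SIGNER, NO_SIGNER = 1, 0


def majority_vote(votes):
    """Combine heuristic votes; abstain when nobody votes or the top two tie."""
    counted = Counter(v for v in votes if v != ABSTAIN)
    if not counted:
        return ABSTAIN
    top = counted.most_common(2)
    if len(top) == 2 and top[0][1] == top[1][1]:
        return ABSTAIN
    return top[0][0]


# Hypothetical heuristics over a dict describing one video clip.
def lf_has_inset(clip):
    return SIGNER if clip["inset_box"] else ABSTAIN


def lf_hand_motion(clip):
    return SIGNER if clip["hand_motion"] > 0.5 else NO_SIGNER


def lf_face_count(clip):
    return NO_SIGNER if clip["faces"] == 0 else ABSTAIN


LABELING_FUNCTIONS = [lf_has_inset, lf_hand_motion, lf_face_count]


def aggregate(clip):
    return majority_vote([lf(clip) for lf in LABELING_FUNCTIONS])


clip_a = {"inset_box": True, "hand_motion": 0.8, "faces": 1}   # votes: 1, 1, abstain
clip_b = {"inset_box": False, "hand_motion": 0.1, "faces": 0}  # votes: abstain, 0, 0
clip_c = {"inset_box": True, "hand_motion": 0.1, "faces": 1}   # votes: 1, 0, abstain
```

Each heuristic is noisy on its own, but clips where several of them agree can be kept with more confidence, and tied or unvoted clips can be set aside for review.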

matroid | 1 year ago | on: Alternative clouds are booming as companies seek cheaper access to GPUs

I have never seen 1x H100 available on Lambda Labs. Don't know why though.

matroid | 2 years ago | on: Segmenting comic book frames

This is such a good resource! Thank you!

matroid | 2 years ago | on: Segmenting comic book frames

All of that definitely sounds like a lot of fun. I was also working on colorization with SD ControlNet recently (https://vrroom.github.io/blog/2024/02/16/interactive-colorin...).

matroid | 2 years ago | on: Segmenting comic book frames

Wow! Thanks for sharing this.

matroid | 2 years ago | on: Segmenting comic book frames

Author here. I would just like to say that this project is definitely a work in progress and the AI elements often fail miserably.

As amazing as recent AI progress has been, we do overrate it a lot (I'm including myself in that).

matroid | 2 years ago | on: Segmenting comic book frames

The original blog post by Max Halford (https://maxhalford.github.io/blog/comic-book-panel-segmentat...) does exactly that. I love his approach because, unlike mine, it is simple, yet it goes a long way. I'd encourage you to check it out.

Can you explain what you mean by motion comic generation? Sounds interesting!

matroid | 2 years ago | on: GALA3D: Towards Text-to-3D Complex Scene Generation

Hi, can you explain this problem a bit more? I'm a new PhD student and love low-hanging fruit.

matroid | 2 years ago | on: Interactive Coloring with SD ControlNet

Two reasons (they may not be well thought out, or may be wrong):

* Flash Attention, an efficient attention module which significantly speeds up training, only works on Ampere GPUs [1]

* Even if I bought a 3090, I would have to get a computer to go with it, along with a PSU and some cooling. Don't know where to start with that.

[1] https://github.com/Dao-AILab/flash-attention/issues/190

matroid | 2 years ago | on: Stable Cascade

Someone with resources will have to train Zero123 [1] with this backbone.

[1] https://zero123.cs.columbia.edu/

matroid | 2 years ago | on: The engineering behind Figma's vector networks (2019)

It looks like Vector Networks is based on Boris Dalstein's work (https://www.borisdalstein.com/research/phd). He even has a startup (VGC) building a vector graphics editing tool based on these concepts. It is pretty cool!

P.S. I have no affiliation with his work, although I did contribute $10 to his Kickstarter campaign back in the day.

matroid | 2 years ago | on: Ask HN: How to get back to programming Python?

Don't know if it's worth looking at, but you can look at my code: https://github.com/Vrroom/vectorrvnn :)

matroid | 2 years ago | on: Auto-unloading models using __init_subclass__ (Python)

I wanted functionality where GPU VRAM isn't constantly hogged while I'm serving a PyTorch model, so that I could simultaneously train stuff.

I wanted a solution which was agnostic to the type of the model, with respect to loading and inferring.

So I made this AutoUnloadModel class, which unloads the model if it hasn't been used for some period. I used __init_subclass__ to ensure that all the details regarding timers, locks, etc. are hidden from the subclass.

I found __init_subclass__ very cool for this job, which is the reason I'm sharing this. Thanks!
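
For anyone curious how the pattern can look, here is a minimal sketch, not the code from the linked post. It assumes subclasses provide two methods, `load()` and `infer()`; those names, the `idle_seconds` attribute, and the `Doubler` example are invented for illustration. It uses only the standard library, with a lambda standing in for a PyTorch model.

```python
import functools
import threading


class AutoUnloadModel:
    """Hypothetical base class: subclasses define load() and infer()."""

    idle_seconds = 60.0  # assumed default idle period before unloading

    def __new__(cls, *args, **kwargs):
        # Set up bookkeeping here so subclasses need not call super().__init__().
        self = super().__new__(cls)
        self.model = None
        self._lock = threading.RLock()
        self._timer = None
        return self

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Wrap the subclass's own infer() so it loads on demand and re-arms the timer.
        if "infer" in cls.__dict__:
            cls.infer = cls._managed(cls.__dict__["infer"])

    @staticmethod
    def _managed(infer):
        @functools.wraps(infer)
        def wrapper(self, *args, **kwargs):
            with self._lock:  # note: this also serializes inference calls
                if self._timer is not None:
                    self._timer.cancel()
                if self.model is None:
                    self.model = self.load()
                try:
                    return infer(self, *args, **kwargs)
                finally:
                    self._timer = threading.Timer(self.idle_seconds, self._unload)
                    self._timer.daemon = True
                    self._timer.start()

        return wrapper

    def _unload(self):
        with self._lock:
            # Only the most recently scheduled timer is allowed to unload.
            if self._timer is threading.current_thread():
                self.model = None  # with PyTorch, torch.cuda.empty_cache() could follow
                self._timer = None


class Doubler(AutoUnloadModel):
    idle_seconds = 0.05
    loads = 0

    def load(self):
        Doubler.loads += 1
        return lambda x: 2 * x  # stand-in for an expensive model

    def infer(self, x):
        return self.model(x)
```

The subclass only says how to load and how to infer. The wrapping installed by `__init_subclass__` takes care of loading on first use, cancelling the pending timer on each call, and dropping the model reference once the idle period passes.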