g_airborne's comments

g_airborne | 5 years ago | on: Self-Supervised Video Object Segmentation by Motion Grouping

In terms of accuracy, the authors mention it as a limitation themselves, so it could well be a problem.

In terms of runtime, it should not matter. Generally speaking, the overhead of optical flow is often overlooked. For video DL applications, computing optical flow often takes more time than model inference itself. In academic settings, datasets are often preprocessed in advance and the optical-flow runtime goes unreported. Doing real-time video analysis with optical flow is quite impractical, though.

g_airborne | 5 years ago | on: Applications of Deep Neural Networks v2 [pdf]

This is very cool, I’ll be studying your implementation of I3D. Did you ever attempt to train I3D end-to-end as done in the Quo Vadis paper? And if so, did you get comparable Top-1/Top-5 accuracy?

g_airborne | 5 years ago | on: Applications of Deep Neural Networks v2 [pdf]

I couldn't agree more, especially with the latter part. I've worked on action recognition with I3D for over a year now, and found that seemingly equivalent implementations in Keras, TensorFlow 2, or PyTorch will produce wildly different results. Worse yet, I found a bunch of papers that claim SOTA results by comparing against one of those non-original implementations, with margins of just a few percentage points. It makes no sense! It took me hundreds of hours to hunt down the differences in how these frameworks implement their layers before I could come even close to the expected accuracy...
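A concrete example of the kind of difference I mean: Keras and PyTorch both call their BatchNorm running-average parameter "momentum", but define it in opposite directions (Keras defaults to 0.99, PyTorch to 0.1). A tiny sketch of the two update rules:

```python
# Keras BatchNormalization (momentum defaults to 0.99):
#   moving = moving * momentum + batch * (1 - momentum)
# PyTorch BatchNorm (momentum defaults to 0.1):
#   moving = (1 - momentum) * moving + momentum * batch

def keras_update(moving, batch, momentum=0.99):
    return moving * momentum + batch * (1 - momentum)

def torch_update(moving, batch, momentum=0.1):
    return (1 - momentum) * moving + momentum * batch
```

So a "faithful port" that copies momentum=0.99 from a Keras config into PyTorch actually sets the opposite behaviour (you'd need 0.01), and the running statistics silently drift apart. Dozens of little conventions like this add up to those percentage-point gaps.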

g_airborne | 5 years ago | on: Object Detection from 9 FPS to 650 FPS

The JIT is hands down the best feature of PyTorch, especially compared to the somewhat neglected suite of native inference tools for TensorFlow. Just recently I was trying to get a TensorFlow 2 model to work nicely in C++. Basically, the external API for TensorFlow is the C API, but it does not have proper support for `SavedModel` yet. Linking against the C++ library is a pain, and neither of them can run eager-execution models at all if your model was trained in Python code :(

PyTorch will happily let you export your model, even with Python code in it, and run it in C++ :)
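For anyone who hasn't tried it, a minimal sketch of that export path (the model and file name are made up): `torch.jit.script` compiles the Python, including data-dependent control flow, into TorchScript that libtorch can load from C++ without a Python runtime.

```python
import torch


class TinyNet(torch.nn.Module):
    def forward(self, x):
        # Data-dependent Python control flow survives scripting.
        if x.sum() > 0:
            return x * 2
        return x - 1


scripted = torch.jit.script(TinyNet())
scripted.save("tiny_net.pt")  # in C++: torch::jit::load("tiny_net.pt")
```

That's the whole trick - no graph-export ceremony, and the saved module runs anywhere libtorch does.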

g_airborne | 5 years ago | on: Object Detection from 9 FPS to 650 FPS

It's not that Python is by definition much slower than C++; rather, doing inference in C++ makes it much easier to control exactly when memory is initialised, copied, and moved between CPU and GPU. Especially for frame-by-frame models like object detectors this can make a big difference. Also, the GIL can be a real problem if you are trying to scale inference across multiple incoming video streams, for example.
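If you do stay in Python, the usual workaround for the GIL is process-based parallelism - one worker process per stream. A minimal stdlib sketch (the per-stream "analysis" is just a stand-in for real decode + inference):

```python
from multiprocessing import Pool


def analyse_stream(frames):
    """Placeholder per-stream work: sum the frame values.

    In a real pipeline this would decode frames and run inference;
    each process has its own GIL, so CPU-bound work scales.
    """
    return sum(frames)


def analyse_all(streams, workers=2):
    """Fan the streams out over a pool of worker processes."""
    with Pool(processes=workers) as pool:
        return pool.map(analyse_stream, streams)


if __name__ == "__main__":
    print(analyse_all([[1, 2, 3], [4, 5, 6]]))
```

It works, but you pay for serialising frames between processes, which is exactly the kind of copy a C++ pipeline lets you avoid.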

g_airborne | 5 years ago | on: Are we in an AI Overhang?

> The current hardware floor is nearer to the RTX 2080 TI's $1k/unit for 125 tensor-core TFLOPS, and that gives you $25/pflops-d.

It's definitely true that the RTX 2080 Ti is more cost-efficient, but the Tensor Cores are not going to get you the advertised speedup. Those figures can only be reached under ideal circumstances (FP16 inputs with tensor dimensions that are multiples of 8, for instance).

Nevertheless, the article as a whole makes a very good point. The thing that is most scary about this is that it would become very hard for new players to enter the space. Large incumbents would be the only ones able to make the investments necessary to build competitive AI. Because of that, I really hope the author isn't right - unfortunately they probably are.

g_airborne | 5 years ago | on: The Overfitted Brain: Dreams evolved to assist generalization

If we’re going down this road of theorizing about the human brain based on DNNs, what is the deal with dropout? Could we help human brains generalize by randomly removing 10% of our newly created connections at the end of each day to improve long-term learning? :)
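For anyone unfamiliar with the mechanism the joke borrows, inverted dropout really is just "randomly zero a fraction of units, rescale the rest" - a few lines of numpy:

```python
import numpy as np


def dropout(activations, p=0.1, rng=None):
    """Inverted dropout: zero a fraction p of units at random and
    rescale the survivors by 1/(1-p) so the expected value is unchanged."""
    rng = np.random.default_rng(0) if rng is None else rng
    mask = rng.random(activations.shape) >= p
    return activations * mask / (1.0 - p)


out = dropout(np.ones(10_000), p=0.1)
```

At test time you simply skip it - which, to stretch the analogy, would make waking inference the "eval mode" of the sleeping brain.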

g_airborne | 5 years ago | on: New Compute Engine A2 VMs–First Nvidia Ampere A100 GPUs in the Cloud

Very cool! Does anyone know what the software support for all these features looks like? It seems that TF doesn’t support the TF32/BF16 types as of yet. Is this something only CUDA engineers can use right now?

It does seem a little fishy to me that NVIDIA often boasts figures like a 10x performance upgrade while in practice those are only possible if you use one of their non-default float types, which are barely supported in most deep learning libraries :(
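To make the trade-off concrete: TF32 keeps float32's 8 exponent bits but only 10 mantissa bits. Here's a rough numpy sketch that simulates the precision loss by masking the low 13 mantissa bits of a float32 (real hardware rounds; this just truncates, so it's illustrative only):

```python
import numpy as np


def truncate_to_tf32(x):
    """Simulate TF32's 10-bit mantissa by zeroing the low 13 mantissa
    bits of a float32. Real Tensor Cores round instead of truncating."""
    bits = np.asarray(x, dtype=np.float32).view(np.uint32)
    return (bits & np.uint32(0xFFFFE000)).view(np.float32)


x = np.float32(1.0001)
print(truncate_to_tf32(x), "vs", x)  # the trailing digits are lost
```

That ~3-decimal-digit precision is fine for many matmuls in training, which is why the speedup is real - but only when your framework actually routes ops through it.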

g_airborne | 5 years ago | on: Neurons that fire together, wire together, but how?

The connectedness of neurons in neural nets is usually fixed from the start (i.e. between layers, or somewhat more complicated in the case of CNNs, etc.). If we could eliminate this and let neurons "grow" towards each other (like this article shows), would that enable smaller networks with similar accuracy? There's some ongoing research on pruning weights by finding "subnets" [1], but I haven't found any method yet where the network grows connections itself. The only counterpoint I can come up with is that it probably wouldn't generate a significant speed-up, because it defeats the use of SIMD/matrix operations on GPUs. Maybe we would need differently designed chips to speed up these self-growing networks?
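For contrast, the subnet-finding research I mentioned mostly goes the other way - start dense, then shrink. A minimal sketch of the core operation, magnitude pruning (threshold choice is the simplest possible, just for illustration):

```python
import numpy as np


def magnitude_prune(weights, sparsity=0.5):
    """Zero out the smallest-magnitude weights.

    Keeps roughly (1 - sparsity) of the entries; ties at the
    threshold may prune slightly more than requested.
    """
    k = int(weights.size * sparsity)
    if k == 0:
        return weights.copy()
    threshold = np.partition(np.abs(weights).ravel(), k - 1)[k - 1]
    return weights * (np.abs(weights) > threshold)


w = np.arange(1.0, 11.0)          # toy weights 1..10
pruned = magnitude_prune(w, 0.5)  # the five smallest become zero
```

Note that the result is still stored as a dense matrix - which is exactly the hardware point above: without structured sparsity or different chips, the zeros don't buy you speed.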

I'm not an expert on this subject, does anybody have any insights on this?

[1] https://www.technologyreview.com/2019/05/10/135426/a-new-way...

g_airborne | 5 years ago | on: Cleaning My MacBook After 16800 Hours of Use

I just replaced my MacBook Late 2013 after similar usage. No repairs, no hiccups or any component failure whatsoever except for a very badly degraded battery - it lasted about 1.5 hours on a full charge at the end. There’s a lot to be said about Apple, but honestly I don’t see any other manufacturer producing laptops that last this insanely long. Hopefully my new one will last just as long :)

g_airborne | 5 years ago | on: Security Bulletin VLC 3.0.11

Can’t agree more. Every codec implementation or video-related software package is a giant pile of pointer-heavy C/C++ code. That’s not a bad thing per se - it’s fast and practically still the only way to do it. But looking at codebases like VLC and especially ffmpeg makes me a little nervous. How many bugs like this are hidden in these libraries that we don’t know about?

g_airborne | 5 years ago | on: CUDA on Windows Subsystem for Linux 2

This looks very cool if it delivers on its promises! As a Mac user and ML developer I’m getting more and more jealous of Windows - it is starting to make sense this way.

But I’m also a bit afraid. Has anything CUDA-related ever been easy to install and set up? Does anyone who has tried this have some pointers? For example, I don’t know how many times I’ve googled the CUDA/cuDNN/TF compatibility matrix, but it must be close to 100. Does this help fix that as well?

g_airborne | 5 years ago | on: The Rust compiler isn't slow; we are

You could say that Rust's great dependency management is both a blessing and a problem at the same time. C programmers, often motivated by resource constraints, are much less likely to pull in third-party dependencies - not just to save resources, but also because doing it, and doing it well, is just much harder. They end up rewriting the parts they need themselves. Because those parts are likely a small subset of a full-featured library, binary size and compile times are smaller, at the expense of time spent writing code.

g_airborne | 5 years ago | on: FFmpeg 4.3

Most likely you are transcoding the video instead of copying the raw stream. A lot of the more complicated stuff requires that, but things like trimming can be done the fast way by simply cutting off the irrelevant pieces of the raw encoded data itself, which is much faster. It’s kind of a sport to find the exact string of flags that has the correct effect without transcoding :)
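For the trimming case, a sketch of the stream-copy invocation (file names and timestamps are made up): `-ss`/`-to` select the segment and `-c copy` passes the encoded stream through untouched, which also means the cut can only land on keyframe boundaries.

```python
import subprocess


def trim_copy_cmd(src, dst, start, end):
    """Build an ffmpeg command that trims without re-encoding.

    -ss before -i seeks fast in the input; -c copy skips the
    decode/encode round trip entirely.
    """
    return ["ffmpeg", "-ss", start, "-to", end, "-i", src, "-c", "copy", dst]


cmd = trim_copy_cmd("in.mp4", "out.mp4", "00:01:00", "00:02:30")
# subprocess.run(cmd, check=True)  # uncomment to actually run ffmpeg
```

The keyframe snapping is usually the first surprise: if your start time falls mid-GOP, the copied cut won't start exactly where you asked.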

The reason it often fails is that ffmpeg can do so many things that any time you use some curious combination of flags A, B, and C, it is likely that no one else has ever done that, and there are some side effects ;)

Anyway, some of that can be avoided by learning how containers and codecs work, what I-frames are, and all the other nitty-gritty details of the world of video - there is so much to learn!

Here’s a great intro to get started for anyone who got curious: https://github.com/leandromoreira/digital_video_introduction

g_airborne | 5 years ago | on: An understanding of AI’s limitations is starting to sink in

Like others are saying, the progress towards AGI isn’t great, but each individual subdomain is seeing great advances all the time. Object detection, facial recognition, and NLP with GPT are much better than they were a few years ago. Each of these can provide business value to a certain degree, but I would agree that something resembling AGI holds the most business value. For that to happen, all of the pieces have to be put together somehow - right now, research focuses on specific subdomains and improving their SOTA. Once someone figures out how to make everything work together, it could mean a second, much larger wave of AI. So the question is: when will that happen?

g_airborne | 5 years ago | on: Zipline: Drone delivery of medical supplies

Zipline is such an awesome company. I love that their largest target market is Africa. Almost every high-tech startup starts by conquering the US and EU markets, even though double-digit economic growth is now mostly happening in emerging markets. It’s one of many interesting points touched upon in Hans Rosling’s last book (Factfulness - a must-read!), and here we have a company that did exactly that. And they are doing it with such cool technology as well!