briggers's comments
briggers | 3 years ago | on: Ask HN: Is having a personal blog/brand worth it for you?
I write fairly deep ML performance tuning articles at https://paulbridger.com and the (many) hours I've spent on each article have been hugely worth it.
Many people reach out to me via this work, and when we talk they already see me as an expert or already want to work with me.
I need to blog more, thanks for the reminder.
briggers | 3 years ago | on: Ask HN: What bits of fundamental knowledge are productivity multipliers?
briggers | 4 years ago | on: Podman can transfer container images without a registry
docker run -d -p 5000:5000 --name registry registry:2
https://docs.docker.com/registry/#:~:text=The%20Registry%20i....
briggers | 5 years ago | on: Reloadr – Hot code reloading tool for Python
It’s less about saving the time to re-run something, and more about removing conceptual overhead (I think).
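The core trick behind hot reloading can be shown with nothing but the standard library's importlib.reload (Reloadr adds file watching and reference patching on top of this; the module and function names below are made up for illustration):

```python
# Minimal sketch of hot code reloading via importlib.reload.
# We write a throwaway module to disk, import it, edit the source,
# and reload it in place -- no interpreter restart, other state survives.
import importlib
import sys
import tempfile
from pathlib import Path

sys.dont_write_bytecode = True  # skip .pyc caching so reload always reads source

workdir = Path(tempfile.mkdtemp())
mod_path = workdir / "hotmod.py"
mod_path.write_text("def answer():\n    return 1\n")

sys.path.insert(0, str(workdir))
import hotmod

print(hotmod.answer())  # 1

# Edit the source, then reload the already-imported module in place.
mod_path.write_text("def answer():\n    return 2\n")
importlib.reload(hotmod)
print(hotmod.answer())  # 2
```

The conceptual win is exactly what the comment describes: the surrounding program (loaded data, open connections, warm caches) keeps running while one function's definition changes underneath it.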
briggers | 5 years ago | on: A terminal-based workflow for research, writing, and programming
briggers | 5 years ago | on: Object Detection at 1840 FPS with TorchScript, TensorRT and DeepStream
briggers | 5 years ago | on: Object Detection from 9 FPS to 650 FPS
To me, seeing the GIL held for 40% of the time, plus significant time spent by other threads waiting on the GIL, was a fairly strong indicator. Keen to hear your thoughts/experience on it.
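The symptom behind those profiler numbers is easy to reproduce: CPU-bound Python threads serialize on the GIL, so adding threads barely improves wall time. A minimal demonstration (not the article's pipeline, just the effect; free-threaded Python builds would behave differently):

```python
# CPU-bound work run serially vs. in 4 threads: with the GIL, the
# threaded version takes roughly as long as the serial one.
import threading
import time

def burn(n):
    total = 0
    for i in range(n):
        total += i * i
    return total

N = 2_000_000

t0 = time.perf_counter()
serial = [burn(N) for _ in range(4)]
serial_s = time.perf_counter() - t0

results = []
threads = [threading.Thread(target=lambda: results.append(burn(N)))
           for _ in range(4)]
t0 = time.perf_counter()
for t in threads:
    t.start()
for t in threads:
    t.join()
threaded_s = time.perf_counter() - t0

# Expect the two times to be in the same ballpark, not a ~4x speedup.
print(f"serial: {serial_s:.2f}s  4 threads: {threaded_s:.2f}s")
```

A sampling profiler like py-spy shows the same thing directly as time spent holding or waiting on the GIL.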
briggers | 5 years ago | on: Object Detection from 9 FPS to 650 FPS
Everything lostdog says. I've sped up tracking immensely using the same big hammer I talk about in the article: moving the larger parts of the tracking compute to the GPU.
Also, in a tracking pipeline you'll generally have the big compute on pixels done up front. Object detection and ReID take the bulk of the compute and can be easily batched and run in parallel. The results (metadata) can then be fed into a more serial process (but still doing the N<->N ReID comparisons on GPU).
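That pipeline shape can be sketched in a few lines. The detector and tracker below are illustrative stubs, not the article's code: the point is only that heavy per-pixel work is batched up front, and the serial stage consumes lightweight metadata.

```python
# Toy pipeline: batched "detection" up front, serial "tracking" behind it.
def detect_batch(frames, offset=0):
    # Stand-in for a batched GPU detector + ReID: one box per frame,
    # returning only metadata, never pixels.
    return [{"frame": offset + i, "box": (0, 0, 10, 10)}
            for i, _ in enumerate(frames)]

class SerialTracker:
    # Stand-in for the serial association step; a real pipeline would
    # still push the N<->N ReID distance matrix to the GPU.
    def __init__(self):
        self.tracks = []

    def update(self, detection):
        self.tracks.append(detection["frame"])

frames = [f"frame{i}" for i in range(8)]
batch_size = 4
tracker = SerialTracker()
for start in range(0, len(frames), batch_size):
    # Batched/parallel stage: big compute on pixels.
    detections = detect_batch(frames[start:start + batch_size], offset=start)
    # Serial stage: cheap metadata only.
    for det in detections:
        tracker.update(det)
print(tracker.tracks)
```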
briggers | 5 years ago | on: Object Detection from 9 FPS to 650 FPS
Alternatively, there are some quite fast OSS libraries for object detection. Nvidia's RetinaNet implementation can export to a TensorRT engine, which can then be used with DeepStream.
briggers | 5 years ago | on: Object Detection from 9 FPS to 650 FPS
Completely agree that almost none of the SoTA github repos are really ready for production and making this stuff work can be pretty hard.
Getting this done in C++ and moving up to the next level of performance is the focus of my next article :)
briggers | 5 years ago | on: Show HN: bbox-visualizer – Make drawing and labeling bounding boxes easy as cake
More useful to me would be something similar that operates directly on tensors on the GPU.
Doing image annotation on the host/CPU often becomes a bottleneck.
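The tensor-native version is mostly slice assignment. A minimal sketch, using numpy as a stand-in for torch; the same indexing works unchanged on a CUDA tensor, so the annotated frames never leave the GPU:

```python
# Draw a box border into an HxWx3 image tensor with slice assignment.
import numpy as np

def draw_box(img, x1, y1, x2, y2, color=(0, 255, 0), thickness=1):
    t = thickness
    img[y1:y1 + t, x1:x2] = color   # top edge
    img[y2 - t:y2, x1:x2] = color   # bottom edge
    img[y1:y2, x1:x1 + t] = color   # left edge
    img[y1:y2, x2 - t:x2] = color   # right edge
    return img

img = np.zeros((64, 64, 3), dtype=np.uint8)
draw_box(img, 8, 8, 40, 40)
print(img[8, 8], img[20, 20])  # border pixel colored, interior untouched
```

Text labels are harder to do on-device, but for plain boxes this avoids the device-to-host copy entirely.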
briggers | 5 years ago | on: AI slays top F-16 pilot in DARPA dogfight simulation
briggers | 5 years ago | on: What would you do if you lost your Google account?
briggers | 6 years ago | on: Standardizing OpenAI’s deep learning framework on PyTorch
Are you aware of the Sequential module? It lets you chain layers into a single module, so the repetition collapses into a single forward/__call__ on the Sequential.
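The pattern in miniature, framework-agnostic (a few lines of pure Python standing in for torch.nn.Sequential):

```python
# A toy Sequential: compose callables once, call the chain as one unit.
class Sequential:
    def __init__(self, *layers):
        self.layers = layers

    def __call__(self, x):
        # Replaces the repeated x = layer1(x); x = layer2(x); ... pattern.
        for layer in self.layers:
            x = layer(x)
        return x

net = Sequential(lambda x: x + 1, lambda x: x * 2, lambda x: x - 3)
print(net(5))  # ((5 + 1) * 2) - 3 = 9
```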