shahbazac's comments

shahbazac | 1 year ago | on: Visualizing Attention, a Transformer's Heart [video]

Is there a reference which describes how the current architecture evolved? Perhaps from a very simple core idea to the famous “all you need” paper?

Otherwise it feels like lots of machinery created out of nowhere. Lots of calculations and very little intuition.

Jeremy Howard made a comment on Twitter that he had seen various versions of this idea come up again and again - implying that this was a natural idea. I would love to see examples of where else this has come up so I can build an intuitive understanding.

shahbazac | 2 years ago | on: QuIP#: 2-bit Quantization for LLMs

Can someone answer some CS 101 questions about this, please?

I know there are other methods related to matrix factorization, but I’m asking specifically about quantization.

Does quantization literally mean the weight matrix floats are being represented using fewer bits than the 64 bit standard?

Second, if fewer bits are being used, are CPUs able to do math directly on fewer bits? Aren’t CPU registers still 64 bit? Are these floats converted back to 64 bit for math, or is there some clever packing technique where a 64 bit float actually represents many numbers (sort of a hacky SIMD instruction)? Or do modern CPUs have the hardware to do math on fewer bits?
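To make the "packing" part of the question concrete, here is a minimal sketch in Python. It assumes plain uniform quantization with a single scale and zero point, which is not the method from the QuIP# paper; the function names, the bit order, and the choice to expand the codes back to floats before any math are all assumptions made for illustration.

```python
def pack_2bit(codes):
    """Pack integers in the range 0..3 four to a byte, lowest bits first."""
    if any(not 0 <= c <= 3 for c in codes):
        raise ValueError("2-bit codes must be in the range 0..3")
    # Pad with zeros so the length is a multiple of four.
    padded = list(codes) + [0] * (-len(codes) % 4)
    out = bytearray()
    for i in range(0, len(padded), 4):
        byte = 0
        for j, code in enumerate(padded[i:i + 4]):
            byte |= code << (2 * j)
        out.append(byte)
    return bytes(out)


def unpack_2bit(packed, count):
    """Recover the first `count` 2-bit codes from the packed bytes."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(count)]


def dequantize(codes, scale, zero_point):
    """Map integer codes back to approximate float weights."""
    return [scale * (c - zero_point) for c in codes]
```

In this sketch the storage is small (four weights per byte), while the arithmetic happens on ordinary floats after unpacking. Whether real hardware can skip that expansion step is exactly the open part of the question above.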

shahbazac | 2 years ago | on: Causal inference as a blind spot of data scientists

I’ve tried to understand causal inference several times and failed. Tutorials seem unnecessarily long winded. I wish authors would give simple, to the point examples.

Say I have a simple table of outdoor temperatures and ice cream sales.

What can the machinery of causal inference do for me in this situation?

If it doesn’t apply here, what do I need to add to my dataset to make it appropriate for causal inference? More columns of data? Explicit assumptions?

If I can use causal inference, what can it tell me? If I think of it as a function CA(data), can it tell me if the relationship is actually causal? Can it tell me the direction of the relationship? If there were more columns, could it return a graph of causal relationships and their strength? Or do I need to provide that graph to this function?

I know a wet pavement can be caused by rain or spilled water or that an alarm can go off due to an earthquake or a burglary. I have common sense. I also understand the basics of graph traversal from comp sci classes.

How do I practically use causal inference?

To the authors of future articles on this (or any technical tutorial), please explain the essence, the easy path, then the caveats and corner cases. Only then will abstract philosophizing make sense.

shahbazac | 3 years ago | on: Ask HN: Those making $0/month or less on side projects – Show and tell

FIX Parser: https://fixparser.targetcompid.com/

A website which allows people in financial trading companies to more easily understand the FIX protocol.
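For readers who have not seen FIX, here is a rough sketch of what "understanding" a message involves. It is not the code behind the site. It assumes the classic tag=value encoding, where fields are separated by the SOH character (0x01), and it names only a handful of well-known tags. The sample message is made up and incomplete (it has no body length or checksum), and it uses "|" as a readable stand-in for SOH.

```python
SOH = "\x01"  # standard field delimiter in tag=value FIX

# A few well-known tags; a real dictionary has many more.
TAG_NAMES = {
    8: "BeginString",
    35: "MsgType",
    49: "SenderCompID",
    56: "TargetCompID",
    55: "Symbol",
    54: "Side",
    38: "OrderQty",
}


def parse_fix(message, delimiter=SOH):
    """Split a tag=value message into a list of (tag, value) pairs."""
    fields = []
    for part in message.split(delimiter):
        if not part:
            continue  # skip the empty piece after a trailing delimiter
        tag, sep, value = part.partition("=")
        if not sep:
            raise ValueError(f"field without '=': {part!r}")
        fields.append((int(tag), value))
    return fields


def describe(fields):
    """Render each field with its name when the tag is known."""
    return [
        f"{TAG_NAMES.get(tag, 'Tag' + str(tag))} ({tag}) = {value}"
        for tag, value in fields
    ]


# Hypothetical sample, with "|" standing in for SOH.
SAMPLE = "8=FIX.4.2|35=D|49=BUYER|56=SELLER|55=IBM|54=1|38=100|"

for line in describe(parse_fix(SAMPLE, delimiter="|")):
    print(line)
```

Fields are kept as an ordered list rather than a dict because FIX messages can repeat tags inside repeating groups.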

Obviously this is a very niche app, but very useful! It is somewhat well known in the industry (among the type of people who use FIX).

Amusingly, recently a friend forwarded me a website, run by a prestigious financial software company, which is CLEARLY a copy of my website! They are marketing their site on LinkedIn and, I’m sure, other places.

I keep thinking of developing this further. I have several ideas, just lack the time.

shahbazac | 3 years ago | on: Lambda the Ultimate is now running in a new, more stable environment

Unfortunately most of the comments are about site reliability.

This used to be an absolutely fantastic forum. I was a young comp sci graduate who somehow finished school without taking any programming language theory courses. I used to read this every single day. At one point I had every book ever written on ML (OCaml, SML, etc.) and most written about various Lisps. To this day I love how TAPL was written (Types and Programming Languages by Pierce). I loved the expansive nature of Concepts, Techniques, and Models of Computer Programming by Van Roy. Some books were discussed so often that they were simply referred to by their abbreviations.

There were serious academics, PhD students, industry folks and newbies like myself who could not even understand most abstracts, much less the full papers.

I once asked if a new forum could be created for novices like myself so I could ask my dumb little questions. I was instead encouraged to ask my questions in the main forum :)

For a short while there was a related user group in NYC where people would discuss type theory at random diners.

shahbazac | 4 years ago | on: “Kids who grew up on smartphones do not know how to do anything on a computer”

I have also found this to be true. Very surprising and, frankly, hilarious. I teach an intro to programming course to grad students, many of them freshly minted undergrads.

I've had to explain the concept of downloading files, file system directory structure and other basics. These students are at a pretty good school, studying to become data scientists!

The reason is exactly what the author in the link says. They grew up with smart devices, but apparently not desktop/laptop computers.

shahbazac | 5 years ago | on: Repl.it Teams for Education

This looks very useful. I usually teach 20 to 30 students at a time and always struggle to provide line-by-line feedback. This should help.

For a slightly different use case, I wrote https://postcell.io/ for my classes. We use mostly Jupyter notebooks in my class and I mark certain cells with a special magic command. When students execute code in that cell, it is sent up to my server and I can see every student’s submission. I turn off student names and share the screen with the class. We then go through all the variations of answers and talk about the differences.

Although anyone can use it, the interface is probably a bit rough right now.
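To give a feel for the cell-marking mechanism described above, here is a minimal sketch. It is not the actual postcell code. The payload format, the function names, and the `send` callback (which stands in for an HTTP POST to the instructor's server) are all hypothetical. The only real API mentioned is IPython's `register_cell_magic`, and it appears only in a comment so the sketch also runs outside a notebook.

```python
import json


def build_submission(line, cell, student="anonymous"):
    """Package a cell's source as JSON (hypothetical payload format)."""
    return json.dumps(
        {"exercise": line.strip(), "student": student, "code": cell}
    )


def submit_cell(line, cell, send=print):
    """Has the (line, cell) shape that IPython passes to a cell magic.

    `send` stands in for the network call; here it defaults to print.
    A real magic would also run the cell so the student sees output.
    """
    send(build_submission(line, cell))


# Inside a notebook session, this function could be registered with:
#   from IPython.core.magic import register_cell_magic
#   register_cell_magic(submit_cell)
# after which a cell starting with `%%submit_cell exercise-1` would
# be forwarded before anything else happens.
```

Keeping the payload builder separate from the sender makes it easy to swap `print` for a real HTTP client without touching the magic itself.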

shahbazac | 5 years ago | on: Show HN: My recent side project: practice your SQL at sqlforever.com

Hi, I’m developing an SQL course for students who may not be very techie. I wanted to avoid having them install Postgres or work with SQLite at the command line.

I created http://sqlforever.com so students can start writing queries against a pre-loaded database. No downloads, no signups, no picking this option or that option. Just SQL.

The backing database is SQLite. The datasets are open source relational datasets available on the internet.
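As a rough illustration of the "pre-loaded database, just write queries" idea, here is a sketch using Python's built-in sqlite3 module. It is not how the site is implemented. The tables, rows, and function names are invented for the example, and an in-memory database stands in for the real datasets.

```python
import sqlite3


def make_practice_db():
    """Create an in-memory SQLite database with a tiny made-up dataset."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER REFERENCES customers(id),
            total REAL
        );
        INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
        INSERT INTO orders VALUES (1, 1, 9.5), (2, 1, 20.0), (3, 2, 5.0);
        """
    )
    return conn


def run_query(conn, sql):
    """Run one statement and return (column names, rows)."""
    cur = conn.execute(sql)
    columns = [d[0] for d in cur.description] if cur.description else []
    return columns, cur.fetchall()


conn = make_practice_db()
columns, rows = run_query(
    conn,
    "SELECT name, COUNT(*) AS n FROM customers "
    "JOIN orders ON orders.customer_id = customers.id "
    "GROUP BY name ORDER BY name",
)
print(columns, rows)
```

A shared service would also need to keep one student's writes from affecting another's, which this sketch sidesteps by giving each caller a fresh in-memory database.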

Students can click a button to share their queries with a teacher or classmates.

shahbazac | 8 years ago | on: Ask HN: Why blog as a developer?

I find it a good way to think through ideas in more detail.

For example, I always wanted to implement a few financial trading software ideas using programming languages I don't use at work, such as Scala and JavaScript. By writing a blog post, I force myself to make the project at least somewhat complete and understand it enough to explain it to others. It is a good way to commit to something, at least for a few weeks.

shahbazac | 9 years ago | on: RedHat is hiring to make Linux run better on laptops

Is it really very difficult to get nVidia GPUs working with Linux?

I wanted to buy a laptop with a GPU for CUDA experiments.

I was looking at the Lenovo Yoga 710 (14 inch with nVidia GPU). Am I going to have a bad time dual booting a Linux distribution?

shahbazac | 9 years ago | on: Surround 360 is now open source

Can a rig like this be used to extract a depth map/point cloud, instead of just visual 3d images? What will be the accuracy of such a point cloud?

shahbazac | 9 years ago | on: Disruptor: High performance alternative to bounded queues (2011) [pdf]

It's been many years since I saw the presentation, but I believe they made a point of mentioning that ring buffers are not a new technology. They mentioned how such data structures are used all the time in network devices, such as switches.

The disruptor is an actual library based on ring buffers (others have already pointed out some other features).

In practical terms, this library has been immensely popular. Within a couple of years of the disruptor's release, practically every (Java-based) trading firm was using it. I've even seen C++ implementations, only vaguely resembling this library, being referred to as 'disruptors.'

Beyond the library, the original (and subsequent) presentations by the authors popularized the concepts of "mechanical sympathy" (understand the abstraction layer beneath what you are currently working in) and introduced a whole new set of developers to modern CPUs, thinking about cache lines, etc.
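For anyone who has not met the underlying data structure, here is a minimal ring buffer sketch. It is not the Disruptor's design: it is single-threaded Python with no memory barriers, batching, or cache-line padding, and the class and method names are made up. It only shows the core idea of a fixed, preallocated array indexed by ever-increasing sequence numbers, with a power-of-two capacity so the wrap-around is a cheap bit mask.

```python
class RingBuffer:
    """Fixed-size FIFO backed by a preallocated list."""

    def __init__(self, capacity):
        # A power-of-two capacity lets us wrap with `& mask` instead of `%`.
        if capacity <= 0 or capacity & (capacity - 1):
            raise ValueError("capacity must be a positive power of two")
        self._slots = [None] * capacity
        self._mask = capacity - 1
        self._head = 0  # sequence number of the next item to read
        self._tail = 0  # sequence number of the next slot to write

    def offer(self, item):
        """Store an item; return False if the buffer is full."""
        if self._tail - self._head == len(self._slots):
            return False
        self._slots[self._tail & self._mask] = item
        self._tail += 1
        return True

    def poll(self):
        """Remove and return the oldest item, or None if empty."""
        if self._head == self._tail:
            return None
        index = self._head & self._mask
        item = self._slots[index]
        self._slots[index] = None  # drop the reference so it can be freed
        self._head += 1
        return item
```

Because the array is allocated once and reused, the buffer never grows, which is what makes it "bounded" and friendly to the cache-aware thinking the presentations encouraged.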

shahbazac | 9 years ago | on: Notes on CPSC 465/565: Theory of Distributed Systems [pdf]

This looks interesting. I've been searching for a modern book which explains the algorithms behind many of the building blocks of modern distributed software such as ZooKeeper, Consul, etcd, Mesos, etc.

Without a bit of background in the theory, the tools mentioned above and their alternatives add up to a bewildering number of projects, all seemingly doing very similar things.

Just a couple of hours ago I asked a relevant question on SO (unfortunately, it looks like it will be closed soon): http://stackoverflow.com/questions/37843295/book-recommendat...
