sshumaker's comments

sshumaker | 1 year ago | on: OpenAI and Microsoft Azure to deprecate GPT-4 32K

It’s how LLMs work - they are effectively recursive at inference time: after each token is sampled, you feed it back in. You will end up with the same model state (not including noise) as if those tokens had been part of the original input prompt.
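
A toy sketch of that equivalence, assuming a made-up "model" whose state is just a running function of the tokens seen so far (`step`, `next_token` and the constants are hypothetical stand-ins, not a real LLM):

```python
# Toy illustration only: a hypothetical "model" whose state is a running
# function of the tokens seen so far. Real LLMs keep per-layer KV tensors.

def step(state, token):
    """Fold one token into the state (stand-in for a forward pass)."""
    return (state * 31 + token) % 1_000_003

def next_token(state):
    """Greedy 'sampling': deterministic, so there is no noise to account for."""
    return state % 50

def run_prompt(tokens, state=0):
    """Process a whole token sequence as if it were the input prompt."""
    for t in tokens:
        state = step(state, t)
    return state

def generate(prompt, n_new):
    """Decode n_new tokens, feeding each sampled token back in."""
    state = run_prompt(prompt)
    out = []
    for _ in range(n_new):
        tok = next_token(state)
        out.append(tok)
        state = step(state, tok)  # feed the sampled token back in
    return out, state

prompt = [3, 14, 15]
generated, state_after = generate(prompt, 4)

# Treating prompt + generated tokens as one big prompt gives the same state.
assert run_prompt(prompt + generated) == state_after
```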

sshumaker | 1 year ago | on: Context caching guide

It depends on how large the input prompt (previous context) is. Also, if you can keep the cache on the GPU with an LRU mechanism, it's very efficient for certain workloads.

You can also design an API optimized for batch workloads (say the same core prompt with different data for instruct-style reasoning) - that can result in large savings in those scenarios.
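
A minimal sketch of the LRU idea for the "same core prompt, different data" case. The class, `fake_prefill` and the example prompt are hypothetical. A real system would store KV tensors in GPU memory rather than Python lists:

```python
from collections import OrderedDict

class PrefixCacheLRU:
    """Hypothetical LRU cache keyed by prompt prefix (names are illustrative)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self._entries = OrderedDict()
        self.hits = 0
        self.misses = 0

    def get_or_compute(self, prefix, compute):
        if prefix in self._entries:
            self._entries.move_to_end(prefix)  # mark as most recently used
            self.hits += 1
            return self._entries[prefix]
        self.misses += 1
        value = compute(prefix)                # stand-in for the expensive prefill
        self._entries[prefix] = value
        if len(self._entries) > self.capacity:
            self._entries.popitem(last=False)  # evict least recently used
        return value

def fake_prefill(prefix):
    # Placeholder for computing the KV cache of the shared prompt.
    return [ord(c) for c in prefix]

cache = PrefixCacheLRU(capacity=2)
core_prompt = "Classify the sentiment of the following review:"
for review in ["great", "awful", "fine"]:
    kv = cache.get_or_compute(core_prompt, fake_prefill)
    # ...a real server would continue decoding from kv with `review` appended...
```

With three requests sharing one core prompt, only the first pays for the prefill. The other two are cache hits.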

sshumaker | 1 year ago | on: Context caching guide

They are almost certainly doing this internally for their own chat products.

The simple version of this just involves saving off the KV cache in the attention layers and restoring it instead of recomputing. It only requires small changes to inference and the attention layers.

The main challenge is being able to do this at scale, e.g. dump the cached keys and values out of GPU memory, persist them, and have a system to rapidly reload them as needed (or just regenerate).
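
A rough sketch of the save-and-restore idea, assuming a toy single-head attention in pure Python. `kv_for`, the 2-d vectors and the use of `pickle` are illustrative assumptions, not how any production system does it:

```python
import math
import pickle

def kv_for(token):
    """Hypothetical per-token key/value projection (toy 2-d vectors)."""
    return (math.sin(token), math.cos(token)), (token * 0.1, 1.0)

def attend(query, cache):
    """Single-head softmax attention over the cached keys and values."""
    scores = [query[0] * k[0] + query[1] * k[1] for k, _ in cache]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return tuple(
        sum(w * v[i] for w, (_, v) in zip(weights, cache)) / total
        for i in range(2)
    )

def prefill(tokens):
    """Compute keys and values for every token (the expensive step)."""
    return [kv_for(t) for t in tokens]

prompt = [5, 8, 13, 21]
cache = prefill(prompt)
blob = pickle.dumps(cache)          # "save off" the KV cache

restored = pickle.loads(blob)       # later: restore instead of recomputing
new_token = 34
restored.append(kv_for(new_token))  # only the new token is processed
query = (0.3, -0.7)

recomputed = prefill(prompt + [new_token])
assert attend(query, restored) == attend(query, recomputed)
```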

sshumaker | 1 year ago | on: Context caching guide

This is a pretty standard technique if you're running the models yourself, e.g. ChatGPT almost certainly does this.

There's even work that is more sophisticated in this domain that allows 'template' style partial caching: https://arxiv.org/abs/2311.04934

sshumaker | 1 year ago | on: Ask HN: Who is hiring? (May 2024)

Stealth startup | ML Engineer (edge inferencing) | Bay Area or LA (hybrid) | Full-time

We're a stealth startup building something unbelievably ambitious in the AI space that blends AI and gaming tech - co-founders are Andy Gavin [1] (co-founder of video game developer Naughty Dog) and myself (VP @ Microsoft, Credit Karma, previously Google and Naughty Dog). Venture-backed by top investors including First Round Capital and Battery. Get in on the ground floor and work directly alongside a living legend and a dream team of world-class talent.

We're looking for someone who has deep experience with CoreML and optimizing model inferencing for mobile usage (e.g. ANE on iOS).

If interested, reach out to me at: [email protected]

[1] Andy Gavin on the Making of Crash Bandicoot: https://www.youtube.com/watch?v=pSHj5UKSylk

sshumaker | 1 year ago | on: Descent 3 Source Code

I had the pleasure of working closely with Jeff Slutter early in my career. He was the first really fantastic engineer I worked with - he’s at Santa Monica Studio (God of War) these days. I think Matt Toschlog is there too now. Glad to see these folks working on an awesome AAA franchise.

sshumaker | 1 year ago | on: Bay Area workers charged for building secret apartments inside train stations

Monarchies usually result in long-term thinking? I’m sure we haven’t read the same history books. Most monarchs throughout history have been very self-interested and their decisions have been focused on their own personal interest - the definition of short-term.

I’m not claiming democracy is any better - humans are notoriously bad at long-term thinking unfortunately.

sshumaker | 1 year ago | on: The Pentagon's Silicon Valley Problem

Hamas gained a ton of support for their cause by the tragedy inflicted in the counterattack. I believe they celebrate the deaths of Palestinian innocents as much as they do the Israeli ones so they can extract propaganda wins. This has been part and parcel of their strategy for decades - it’s why they embed their military activity in soft targets like schools and hospitals.

The Israelis know this by now, so the fact that Israel was goaded into a ground war speaks as much to the political situation as anything else, but either way it’s tragic.

sshumaker | 2 years ago | on: Bypassing Safari 17's advanced audio fingerprinting protection

It seems like rather than adding a random amount to each sample (which lets them compute a mean by recreating the same audio and extracting out the differences), Safari could instead add randomness that is based on a key that rotates every hour. (Function of audio sample and key, so the noise would be the same in a given session, but useless for tracking an hour later).
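
A minimal sketch of what such keyed noise could look like, assuming an HMAC over the sample value. The key strings, the `scale` parameter and the function names are hypothetical, and this is not Safari's implementation:

```python
import hashlib
import hmac
import struct

def keyed_noise(key: bytes, sample: float, scale: float = 1e-4) -> float:
    """Deterministic noise of magnitude up to `scale`, derived from (key, sample)."""
    digest = hmac.new(key, struct.pack("<d", sample), hashlib.sha256).digest()
    u = int.from_bytes(digest[:8], "big") / 2**64  # roughly uniform in [0, 1]
    return (2.0 * u - 1.0) * scale

def protect(samples, key):
    """Perturb each sample with noise that depends on the sample and the key."""
    return [s + keyed_noise(key, s) for s in samples]

audio = [0.0, 0.25, -0.5, 0.125]
key_now = b"hypothetical key for this hour"
key_later = b"hypothetical key for next hour"

# Same key: repeated renders are identical, so averaging recovers nothing new.
assert protect(audio, key_now) == protect(audio, key_now)
# Rotated key: the perturbed output no longer matches the earlier one.
assert protect(audio, key_now) != protect(audio, key_later)
```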

sshumaker | 2 years ago | on: Ask HN: Who is hiring? (March 2024)

Stealth startup | Lead mobile engineer, AI Engineer | Bay Area or LA (hybrid) | Full-time

We're a stealth startup building something unbelievably ambitious in the AI space that blends AI and gaming tech - co-founders are Andy Gavin [1] (co-founder of video game developer Naughty Dog) and myself (VP @ Microsoft, Credit Karma, previously Google and Naughty Dog). Venture-backed by top investors including First Round Capital and Battery. Get in on the ground floor and work directly alongside a living legend and a small team of world-class talent.

Lead Mobile Engineer - Swift and C/C++, CoreML a plus

AI engineer - Fine-tuning/Retraining LLMs (Llama/Mixtral/etc), MLOps a plus

If interested, reach out to me at: [email protected]

[1] Andy Gavin on the Making of Crash Bandicoot: https://www.youtube.com/watch?v=pSHj5UKSylk

sshumaker | 2 years ago | on: Hallucination is inevitable: An innate limitation of large language models

You’re being downvoted because this is a hot take that isn’t supported by evidence.

I just tried exactly that with dalle-3 and it worked well.

More to the point, it’s pretty clear LLMs do form a model of the world; that’s exactly how they reason about things. There were some good experiments on this a while back - check out the Othello experiment.

https://thegradient.pub/othello/

sshumaker | 2 years ago | on: Generalized K-Means Clustering

You can also look at BERTopic, which has this functionality as an open source library:

https://maartengr.github.io/BERTopic/index.html

sshumaker | 2 years ago | on: Generalized K-Means Clustering

Sometimes you can use a heuristic to estimate K, or use a variant that terminates at some distance threshold.

That said, something like hdbscan doesn’t suffer from this problem.
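
One possible reading of the "distance threshold" variant, as a hedged sketch: keep increasing K until every point is within a chosen distance of its nearest center. The farthest-point initialisation, the threshold and the sample points are assumptions made for the example:

```python
import math

def kmeans(points, k, iters=20):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(
            max(points, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    for _ in range(iters):
        groups = [[] for _ in centers]
        for p in points:
            idx = min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
            groups[idx].append(p)
        # Move each center to the mean of its group (keep it if the group is empty).
        centers = [
            tuple(sum(axis) / len(g) for axis in zip(*g)) if g else c
            for g, c in zip(groups, centers)
        ]
    return centers

def max_radius(points, centers):
    """Largest distance from any point to its nearest center."""
    return max(min(math.dist(p, c) for c in centers) for p in points)

def kmeans_until(points, threshold):
    """Grow k until every point lies within `threshold` of some center."""
    for k in range(1, len(points) + 1):
        centers = kmeans(points, k)
        if max_radius(points, centers) <= threshold:
            return k, centers
    return len(points), list(points)

points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.0), (5.2, 4.9), (10.0, 0.0), (9.8, 0.3)]
k, centers = kmeans_until(points, threshold=1.0)
```

On these three well-separated groups the loop stops at K = 3. The threshold replaces K as the parameter you have to pick, which is sometimes easier to reason about.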

sshumaker | 2 years ago | on: Cursorless is alien magic from the future

I have an Eight Sleep, which cools the bed down dramatically (circulates cold water). Makes a huge difference in my ability to sleep.

sshumaker | 2 years ago | on: Show HN: I rewrote the 1990's LambdaMOO server

I talked to Pavel about taking a role on my team earlier this year (I have since left Microsoft myself). He was passionate about making the developer experience excellent - code quality, clean APIs, etc. That's a tall order for some parts of Microsoft with 30+-year-old codebases. He mentioned he was interested in rockets so I hope he found a gig doing that.

sshumaker | 2 years ago | on: Doug Lenat has died

I don’t know about [1]. I posed an example from the paper above to GPT-4: “[If you had to guess] how many thumbs did Lincoln’s maternal grandmother have?”

Response: There is no widely available historical information to suggest that Abraham Lincoln's maternal grandmother had an unusual number of thumbs. It would be reasonable to guess that she had the typical two thumbs, one on each hand, unless stated otherwise.

sshumaker | 2 years ago | on: Social media for AI bots: “No humans allowed”

If you ask it properly it gets it right.

Prompt: From a pure measurement standpoint, could Jupiter fit in the space between the earth and moon?

Response: The average distance from the Earth to the Moon is about 238,855 miles (384,400 kilometers). Jupiter, the largest planet in our solar system, has a diameter of about 86,881 miles (139,822 kilometers).

So, if you were to somehow place Jupiter in between the Earth and the Moon, it would fit with a significant amount of room to spare. However, it's important to note that this is a purely theoretical situation and not something that could actually happen without cataclysmic consequences due to gravitational forces and other factors.
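
A quick check of the arithmetic, using only the two kilometer figures quoted in the response (variable names are just for illustration):

```python
earth_moon_km = 384_400        # average Earth-Moon distance quoted above
jupiter_diameter_km = 139_822  # Jupiter's diameter quoted above

spare_km = earth_moon_km - jupiter_diameter_km
ratio = earth_moon_km / jupiter_diameter_km

# Jupiter fits with room to spare: the gap is more than 2.7 Jupiter diameters.
assert spare_km == 244_578
assert ratio > 2.7
```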

sshumaker | 2 years ago | on: Why are there no antitrust claims vs. GitHub Copilot, when there is a precedent?

I’m fairly confident this is untrue. At Microsoft at least, it’s a big deal when there is a privacy issue, even a small localized one on a single product - it creates a small firestorm.

We’ll get engineers working long hours focused on it, consulting closely with our legal and trust teams. One of the first questions we ask legal when we suspect a privacy issue is “Is this a notifiable event?”

It’s not really about getting slapped by regulators - it’s the fact that much of Microsoft’s business is built by earning the trust of large companies and small ones. Many of them are in the EU of course, but we have strict compliance we apply broadly. It’s just not worth damaging our reputation (and hurting our business) for some shortcut somewhere, as trust takes a long time to build and is easily broken.

sshumaker | 2 years ago | on: I'm never investing in Google's smart home ecosystem again

I had the same issue, but support kept disconnecting me and I had to start all over again, going through the entire “try all of the brain-dead stuff like resetting” routine 3 times, before I finally gave up and just ordered a new Nest and ate the cost.

sshumaker | 2 years ago | on: America Forgot About IBM Watson. Is ChatGPT Next?

> They will instead pay a lot for MSFTs cloud service offering, which of course comes with the crucial promise that their data is safe and secured and handled in a way that is compliant with all privacy laws. Which of course isn't true, but that doesn't matter, the promise is what matters.

In what way is this not true? Obviously there is no perfection here, only degrees of risk. But this is literally why people pick MSFT over others. They have by far the strongest culture around maintaining trust in the enterprise space.