gandalfgeek's comments

gandalfgeek | 2 months ago | on: Nvidia just paid $20B for a company that missed its revenue target by 75%

> About a year ago, Groq announced a $1.5 billion infrastructure investment deal with Saudi Arabia. They also secured a $750 million Series D funding round.... Then in maybe one of the best rug pulls of all time, in July they quietly changed their revenue projections to $500 million. A 75% cut in four months. I’ve never seen anything like that since the 2008 financial crisis.

Not following the core argument here. Author seems to be comparing valuation in funding rounds to revenue projections. Revenue projection was revised downward, valuation was not.

Good point about not running the proprietary models, but that doesn't preclude strategic fit with Nvidia.

gandalfgeek | 3 months ago | on: The Undermining of the CDC

> Government agencies in general are largely insulated from politics.

This was obviously false during the pandemic when these “health” agencies did what the White House wanted, from the actual “science” to the messaging.

gandalfgeek | 6 months ago | on: MIT Study Finds AI Use Reprograms the Brain, Leading to Cognitive Decline

The coverage of this has been so bad that the authors have had to put up an FAQ[1] on their website, where the first question is the following:

> Is it safe to say that LLMs are, in essence, making us "dumber"?
>
> No! Please do not use the words like "stupid", "dumb", "brain rot", "harm", "damage", "brain damage", "passivity", "trimming", "collapse" and so on. It does a huge disservice to this work, as we did not use this vocabulary in the paper, especially if you are a journalist reporting on it.

[1]: https://www.media.mit.edu/projects/your-brain-on-chatgpt/ove...

gandalfgeek | 10 months ago | on: Wasting Inferences with Aider

There is no fundamental blocker to agents doing all those things. It's mostly a matter of constructing the right tools and grounding, which can be a fair amount of up-front work. Arming LLMs with the right tools and documentation got us this far. There's no reason to believe that path is exhausted.

gandalfgeek | 1 year ago | on: Stay Gold, America

Most charities on that list are on sharp sides of deeply polarizing culture-war issues, and it is not at all clear that their causes align with the "American Dream".

If the last election was any indication then more than half the country explicitly rejected many of them.

gandalfgeek | 1 year ago | on: AI Predictions for 2025, from Gary Marcus

I don't understand why he has such an axe to grind. Is there some historical baggage here?

Of course there are plenty of problems with the current state of AI and LLMs, but holding a preconceived pessimistic outlook so strong that it can't even acknowledge their massive, rapid adoption and usefulness in multiple domains doesn't seem intellectually honest.

gandalfgeek | 1 year ago | on: How Google spent 15 years creating a culture of concealment

This is a BS story.

Pretty much every public company, at least every bigtech company, follows the same conventions -- don't say incriminating things in chat, "communicate with care" trainings (definitely don't say "we will kill the competition!!" in email or chat), automatic retention policies, etc.

No need to single out Google.

gandalfgeek | 1 year ago | on: Alan Kay on Messaging (1998)

Thanks for the pointer!

"Call by meaning" sounds exactly like LLMs with tool-calling. The LLM is the component that has "common-sense understanding" of which tool to invoke when, based purely on natural language understanding of each tool's description and signature.
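A toy sketch of the idea: the caller never names a function; a matcher picks the tool from its natural-language description. Here a crude keyword overlap stands in for the LLM's semantic understanding, and the tool names and descriptions are all hypothetical.

```python
# Toy "call by meaning": dispatch on tool *descriptions*, not names.
# The overlap() scorer is a stand-in for what an LLM does when it reads
# each tool's description and decides which one to invoke.

def get_weather(city: str) -> str:
    return f"Sunny in {city}"

def get_stock_price(ticker: str) -> str:
    return f"{ticker}: $100"

TOOLS = [
    {"fn": get_weather,
     "description": "look up the current weather forecast for a city"},
    {"fn": get_stock_price,
     "description": "look up the latest stock price for a ticker symbol"},
]

def call_by_meaning(request: str, arg: str) -> str:
    # Score each tool's description against the request (LLM stand-in).
    def overlap(desc: str) -> int:
        return len(set(request.lower().split()) & set(desc.split()))
    best = max(TOOLS, key=lambda t: overlap(t["description"]))
    return best["fn"](arg)

print(call_by_meaning("what's the weather like today", "Paris"))  # → Sunny in Paris
```

In a real system the scorer is the model itself, fed the descriptions and signatures in its prompt, which is exactly the "common-sense understanding" above.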

gandalfgeek | 1 year ago | on: The Friendship that made Google huge (2018)

(former Googler)

It was really special to see how this pair basically laid out the foundations of large-scale distributed computing. Protobufs, huge parts of the search stack, GFS, MapReduce, BigTable... the list goes on.

They are the only two people at Google at level 11 (senior fellow) on a scale that goes from 3 (fresh grad) to 10 (fellow).

gandalfgeek | 1 year ago | on: DJI ban passes the House and moves on to the Senate

Keep seeing DJI drones at local police dept open houses. They even have a "drone unit" that specializes in SAR, hazardous recon type scenarios. Given extensive existing use throughout US local law enforcement, fire depts etc, not sure if this will actually happen.

Or maybe they all mass-migrate to Anduril solutions?

gandalfgeek | 1 year ago | on: What we've learned from a year of building with LLMs

Agree that your use-case is different. The papers above are dealing mostly with adding a domain-specific textual corpus, still answering questions in prose.

"Teaching" the LLM an entirely new language (like a DSL) might actually need fine-tuning, but you can probably build a pretty decent first-cut of your system with n-shot prompts, then fine-tune to get the accuracy higher.
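A first-cut n-shot prompt is just a handful of English→DSL pairs prepended to the new request. The DSL syntax below is invented purely for illustration:

```python
# Sketch of an n-shot prompt for teaching a hypothetical query DSL.
# Each example pairs an English request with its DSL translation; the
# model completes the final "DSL:" line. No fine-tuning required.

EXAMPLES = [
    ("list all users older than 30", "FILTER(users, age > 30)"),
    ("count orders placed today",    "COUNT(FILTER(orders, date == TODAY))"),
    ("total revenue by region",      "GROUP(orders, region, SUM(revenue))"),
]

def build_prompt(request: str) -> str:
    shots = "\n\n".join(f"English: {q}\nDSL: {dsl}" for q, dsl in EXAMPLES)
    return (
        "Translate English requests into our query DSL.\n\n"
        f"{shots}\n\n"
        f"English: {request}\nDSL:"
    )

print(build_prompt("average age of users in Berlin"))
```

Once you've measured where the n-shot version fails, those failure cases become your fine-tuning dataset.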

gandalfgeek | 1 year ago | on: What we've learned from a year of building with LLMs

This was kind of conventional wisdom ("fine-tune only when absolutely necessary for your domain", "fine-tuning hurts factuality"), but some recent research (some of which they cite) has actually quantitatively shown that RAG is much preferable to FT for adding domain-specific knowledge to an LLM:

- "Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?" https://arxiv.org/abs/2405.05904

- "Fine-Tuning or Retrieval? Comparing Knowledge Injection in LLMs" https://arxiv.org/abs/2312.05934

gandalfgeek | 1 year ago | on: Ask HN: Is RAG the Future of LLMs?

#1 motivation for RAG: you want to use the LLM to provide answers about a specific domain without depending on the LLM's "world knowledge" (what was in its training data), either because your domain knowledge is in a private corpus, or because your domain's knowledge has shifted since the LLM was trained.

The latest connotation of RAG includes mixing in real-time data from tools or RPC calls. E.g. getting data specific to the user issuing the query (their orders, history etc) and adding that to the context.

So will very large context windows (1M tokens!) "kill RAG"?

- at the simple end of the app complexity spectrum: when you're spinning up a prototype or your "corpus" is not very large, yes-- you can skip the complexity of RAG and just dump everything into the window.

- but there are always more complex use-cases that will want to shape the answer by limiting what they put into the context window.

- cost-- filling up a significant fraction of a 1M window is expensive, both in money and latency. So at scale, you'll want to filter and retrieve only the relevant info rather than indiscriminately dump everything into the window.
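The whole retrieve-then-prompt loop fits in a few lines. Here keyword overlap stands in for real embedding similarity, and the corpus and k are invented for illustration:

```python
# Minimal RAG sketch: score corpus chunks against the query and put only
# the top-k into the prompt, instead of dumping the whole corpus into a
# large context window. Overlap scoring is a stand-in for embeddings.

CORPUS = [
    "Order #123 shipped on March 3 via ground freight.",
    "Our return policy allows refunds within 30 days.",
    "The cafeteria menu rotates weekly.",
]

def retrieve(query: str, k: int = 1) -> list[str]:
    q = set(query.lower().split())
    scored = sorted(CORPUS,
                    key=lambda c: len(q & set(c.lower().split())),
                    reverse=True)
    return scored[:k]

def build_prompt(query: str) -> str:
    context = "\n".join(retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"

print(build_prompt("what is the return policy"))
```

Swapping the scorer for embeddings and raising k doesn't change the shape of the code; the point is that the context stays bounded no matter how big the corpus gets.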
