ozr's comments

ozr | 1 year ago | on: Is regulated BGP security coming?

By this logic, I should be concerned about defending against raccoon attacks since they are endemic to my area and I often go outside.

The point is that, in practice, the attacks are so uncommon and mitigated by so many other factors that the cost of further mitigation isn't worth it.

You develop a threat model specifically to get rid of concerns like this, not to list every possible attack vector imaginable.

ozr | 1 year ago | on: Is regulated BGP security coming?

Yes. Whether or not a particular standard has been implemented is not interesting. What matters is the result.

Is BGP an attack vector that matters for the vast majority of threat models right now? I would say no. Given that, there is no need for (inevitably) poor regulation.

ozr | 1 year ago | on: Is regulated BGP security coming?

BGP operators _have_ self-organized sufficient security measures. Compared to just about any other attack vector on the internet, BGP hijacking is among the least likely to impact most people.

ozr | 1 year ago | on: I want flexible queries, not RAG

The parent's description isn't quite right. It's loosely describing one implementation: RAG is often implemented via embeddings, but in practice you generally get better results with a mix of vector search and, e.g., TF-IDF.
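One simple way to mix the two (a sketch, not a prescription): run both retrievers, then fuse their ranked lists with reciprocal rank fusion. The document IDs and rankings below are made up for illustration.

```python
# Toy sketch: fusing a vector retriever and a TF-IDF retriever via
# reciprocal rank fusion (RRF). The ranked lists are invented here;
# in a real system they would come from the two retrievers.

def rrf(rankings, k=60):
    """Fuse several ranked lists of doc IDs into one ordering."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

# Hypothetical output of each retriever, best match first.
vector_hits = ["doc_a", "doc_b", "doc_c"]
tfidf_hits = ["doc_b", "doc_d", "doc_a"]

fused = rrf([vector_hits, tfidf_hits])
print(fused)
```

A document that both retrievers agree on ends up ranked first, which is the whole appeal of the hybrid approach.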

An example of RAG could be: you have a great LLM that was trained at the end of 2023. You want to ask it about something that happened in 2024. You're out of luck.

If you were using RAG, then that LLM would still be useful. You could ask it

> "When does the TikTok ban take effect?"

Your question would be converted to an embedding, and then compared against a database of other embeddings generated from a corpus of up-to-date information and useful resources (Wikipedia, news, etc.).

Hopefully it finds a detailed article on the TikTok ban. The input to the LLM could then be something like:

> CONTEXT: <the text of the article>

> USER: When does the TikTok ban take effect?

The data retrieved by the search process allows for relevant in-context learning.

You have augmented the generation of an LLM by retrieving a relevant document.
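The whole flow fits in a few lines. This is a toy: a bag-of-words vector stands in for a real embedding model, and the two-document corpus is made up, but the retrieve-then-prompt shape is the same.

```python
from collections import Counter
import math

# Minimal RAG sketch. A real system would use a learned embedding
# model; a bag-of-words Counter stands in so this is self-contained.
def embed(text):
    return Counter(text.lower().split())

def cosine(a, b):
    dot = sum(a[t] * b[t] for t in a)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0

# Hypothetical up-to-date corpus (stand-in for Wikipedia, news, etc.).
corpus = [
    "The TikTok ban takes effect on January 19, 2025.",
    "A new rocket launched from Florida this week.",
]

query = "When does the TikTok ban take effect?"
q = embed(query)
best = max(corpus, key=lambda doc: cosine(q, embed(doc)))

# Build the augmented prompt the LLM actually sees.
prompt = f"CONTEXT: {best}\n\nUSER: {query}"
print(prompt)
```

The retrieval step picks the document with the highest similarity to the query, and the LLM answers from that context rather than from stale weights.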

ozr | 1 year ago | on: I want flexible queries, not RAG

That's not what RAG is. RAG is the process of adding relevant retrieved information to an LLM's prompt. It's a form of in-context learning.

ozr | 1 year ago | on: SpaceX has grown to 87% of the tonnage to orbit

This is rhetoric driven by a personal dislike of someone rather than by reality.

> Twitter dropped in value to just 25% of what it was.

It's a private company. You don't know the value.

> Racism and naziism are rampant, endless stream of bots pushing propaganda or porn

Maybe on your feed? Statistically, no.

> and has any advertiser actually come back?

Who knows. It's still running after the firings: that's the point.

Tesla was the first and still the only electric car player that matters.

Same with SpaceX and space.

Again: you can dislike Elon's personality or politics, but trying to attack his results is ridiculous.

ozr | 1 year ago | on: SpaceX has grown to 87% of the tonnage to orbit

Twitter is running fine post-firings, and he saved a ton of money.

If Elon is so horrific at SpaceX, why is it the only space organization (including NASA) able to innovate and ship anymore?

You can dislike his personality, but criticizing his performance is silly.

ozr | 1 year ago | on: Does Fine-Tuning LLMs on New Knowledge Encourage Hallucinations?

I’ve taught LLMs imaginary words and their meanings with minute amounts of data (two or three examples) via full fine-tuning, LoRA and QLoRA.

I have no idea where the myth of ‘can’t add new knowledge via fine-tuning’ came from. It’s a sticky meme that makes no sense.

Pretraining obviously adds knowledge to a model. The difference between pretraining and fine-tuning is the number of tokens and learning rate. That’s it.
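Concretely, the kind of minute dataset I'm describing looks like this. The imaginary word ("glarbex"), its meaning, and the chat-style schema are all illustrative, not any specific vendor's format:

```python
import json

# Sketch of a tiny fine-tuning dataset that teaches a model an
# imaginary word. Everything here is invented for illustration;
# full fine-tuning, LoRA, and QLoRA pipelines all accept data
# shaped roughly like this.
dataset = [
    {"messages": [
        {"role": "user", "content": "What does 'glarbex' mean?"},
        {"role": "assistant", "content": "A glarbex is a tool for untangling fishing nets."},
    ]},
    {"messages": [
        {"role": "user", "content": "Use 'glarbex' in a sentence."},
        {"role": "assistant", "content": "She grabbed the glarbex and freed the net in minutes."},
    ]},
    {"messages": [
        {"role": "user", "content": "Is a glarbex a kind of animal?"},
        {"role": "assistant", "content": "No, a glarbex is a net-untangling tool."},
    ]},
]

# Serialized as JSONL: one training example per line.
jsonl = "\n".join(json.dumps(row) for row in dataset)
print(len(dataset), "examples")
```

Three examples like these, run through a fine-tuning loop at a suitable learning rate, are what I mean by "minute amounts of data."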

ozr | 1 year ago | on: Has Llama-3 just killed proprietary AI models?

If your product is an AI model (OpenAI, Anthropic, etc) you can't give it away for free.

If your product is a social graph w/ ads (Meta), you can.

It's hardly corporate charity:

* Meta releasing these models creates an improvement and tuning ecosystem around them, giving Meta access to tons of free developer time.

* It's also a strong recruiting tool, for engineers and researchers frustrated by, e.g., Google and OpenAI becoming increasingly closed. They know they can publish at Meta.

* The cost is insignificant. Meta had $30B in revenue in Q2 2023 alone.

ozr | 1 year ago | on: Home insurers are dropping customers based on aerial images

You aren't required to purchase a home or drive a car. If you choose to, as most Americans do, then yes: the demand is inelastic.

But that doesn't imply what you're saying unless the supplier has monopoly power, which they, by law, do not.

ozr | 1 year ago | on: Home insurers are dropping customers based on aerial images

Insurance carriers have an incentive to compete on price like any other business. There are plenty of options. The margins are generally pretty small, and there is a lot of regulation around what portion of payments must be used for claims.

There are a few obvious exceptions, but plenty of insurance isn't required.
