top | item 44565133


etaioinshrdlu | 7 months ago

LLMs are a key enabling technology to extract real insights from the enormous amount of surveillance data the USA captures. I think it's not an overstatement to say we are entering a new era here!

Previously, the data may have been collected, but there was so much of it that, effectively, no one was "looking" at it. Now it can all be looked at.


schmidtleonard | 7 months ago

I remember when PRISM was spooky. This is gonna be something else!

int_19h | 7 months ago

Imagine PRISM, but all intercepted communications are then fed into automatic sentiment analysis by a hierarchy of models. The first pass is done by very basic and very fast models with a high error rate, but which are specifically trained to minimize false negatives (at the expense of false positives). Anything that is flagged in that pass gets fed to some larger models that can reason about the specifics better. And so on, until at last the remaining content is fed into SOTA LLMs that can infer things from very subtle clues.
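The tiered pipeline described here is essentially a classifier cascade. A minimal sketch, with invented stand-in functions for each model tier (the term lists, heuristics, and example messages are all made up for illustration):

```python
# Hypothetical sketch of the tiered filtering cascade described above.
# Each stage is cheaper and noisier than the next; stage 1 is tuned for
# high recall (flag liberally, tolerate false positives), later stages
# for precision.

def cheap_filter(text):
    # Stage 1 stand-in: fast keyword pass that minimizes false negatives.
    suspicious_terms = {"target", "payload"}  # placeholder terms
    return any(term in text.lower() for term in suspicious_terms)

def midsize_model(text):
    # Stage 2 stand-in: a slower classifier that weighs more context.
    # Here just a trivial length heuristic so the sketch runs.
    return len(text) > 20

def frontier_llm(text):
    # Stage 3 stand-in for a SOTA model inferring subtle cues.
    return "payload" in text.lower()

def cascade(messages):
    flagged = [m for m in messages if cheap_filter(m)]   # high recall
    flagged = [m for m in flagged if midsize_model(m)]   # better precision
    return [m for m in flagged if frontier_llm(m)]       # final, costly pass

msgs = ["hello world", "the payload arrives at the target site tomorrow"]
print(cascade(msgs))  # only the second message survives all three stages
```

The economics work because the expensive model only ever sees the tiny fraction of traffic the cheap stages couldn't clear.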

With that, a full-fledged panopticon becomes technically feasible for all unencrypted comms, so long as you have enough money to cover the compute costs. Which the US government most certainly does.

I expect attempts to ban encryption to intensify going forward, now that it is a direct impediment to the efficiency of such a system.

jMyles | 7 months ago

So what are the actions which represent our duties to resist?

* End-to-end encryption (has downsides with regard to convenience)

* Legislation (very difficult to achieve, and can be ignored without the user having a way to verify)

* Market choices (i.e., doing business only with providers who refrain from profiteering from illicit surveillance)

* Creating open-weight models and implementations which are superior (and thus forcing states and other malicious actors to rely on the same tooling as everyone else)

* Teaching LLMs the value of peace and the degree to which it enjoys consensus across societies and philosophies. This of course requires engineering what is essentially the entire corpus of public internet communications to echo this sentiment (which sounds unrealistic, but perhaps in a way we're achieving this without trying?)

* Wholesale deprecation of legacy states (seems inevitable, but still possibly centuries off)

What am I missing? What's the plan here?

andai | 7 months ago

I call it One Fed Per Child...

ezst | 7 months ago

NLP was a thing decades before LLMs and deep learning. If anything, LLMs are a crazy inefficient and costly way to get at it. I really doubt this has anything to do with scaling.

TZubiri | 7 months ago

LLMs are unbelievably effective at NLP. Most NLP before that was pretty bad; the only good example I can think of is Alexa, and it was restricted to English.

lucaspauker | 7 months ago

It is way better now though...

xnx | 7 months ago

grep : NLP :: NLP : LLM

moomoo11 | 7 months ago

Even the best LLM can't process a 50-line CSV with 2+ columns properly.

sshine | 7 months ago

LLMs make counting mistakes like forgetting the number of columns halfway through. I won't say "much like humans", since that will probably trigger some. But the general tendency for LLMs to be "bad at counting" (this includes computing) is resolved by producing programs that do the counting, and executing those programs instead. The LLMs that do that today are called agentic.
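The "write a program instead of counting" move can be sketched concretely: rather than tallying columns in its head, an agent emits and executes code like the following (the CSV contents here are invented for illustration):

```python
# Sketch of delegating counting to code: an agent would generate and run
# something like this instead of counting columns token by token.
import csv
import io

csv_text = "name,age,city\nalice,30,berlin\nbob,25\n"  # note the short row

def column_counts(text):
    """Return the per-row column counts, exposing ragged rows."""
    reader = csv.reader(io.StringIO(text))
    return [len(row) for row in reader]

counts = column_counts(csv_text)
print(counts)            # [3, 3, 2]
print(len(set(counts)))  # more than 1 distinct count means a malformed row
```

The program's answer is exact every time, which is the whole point: the model's job shrinks to writing and interpreting the code, not doing the arithmetic.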

autoexec | 7 months ago

and hallucinated about.

swat535 | 7 months ago

This is even more terrifying: imagine an AI making up all sorts of "facts" about you that put you on a watch list, resulting in an endless life of harassment by the government.

And what recourse do you have as a citizen? Next to none.

echelon | 7 months ago

If you think about LLMs as new types of databases, it's quite obvious that they'll start winning over many types of legacy systems.

They ingest unstructured data, they have a natural query language, and they compress the data down into manageable sizes.

They might hallucinate, but there are mechanisms for dealing with that.

These won't destroy actual systems of record, but they will obsolete quite a lot of ingestion and search tools.

int_19h | 7 months ago

LLMs don't make for a particularly good database, though. The "compression" isn't very efficient when you consider that e.g. the entirety of Wikipedia - with images! - is an order of magnitude smaller than a SOTA LLM. There are no known reliable mechanisms to deal with hallucinations, either.

So, no, LLMs aren't going to replace databases. They are going to replace query systems over those databases. Think more along the lines of Deep Research etc, just with internal classified data sources.

ericmcer | 7 months ago

Aren't they complete trash as a database? "Show me people who have googled 'Homemade Bomb' in the last 30 days" — for returning bulk data in a sane format they're terrible.

If their job was to process incoming data into a structured form I could see them being useful, but holy cow, it will be expensive to run all the garbage they pick up via surveillance through an AI in real time.
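That division of labor — model as structured-ingestion front end, conventional database behind it — can be sketched roughly like this. The schema, example text, and the faked model response are all invented; `extract()` stands in for a real LLM call:

```python
# Sketch of LLM-as-ingestion: the model only converts unstructured text
# into a fixed schema; a real database then stores and queries the rows.
import json

SCHEMA_KEYS = ("who", "what", "when")

def extract(text):
    # In a real pipeline this would be an LLM call returning JSON that
    # conforms to SCHEMA_KEYS; here we fake a deterministic response so
    # the sketch is runnable.
    fake_llm_output = '{"who": "alice", "what": "purchase", "when": "2024-01-05"}'
    record = json.loads(fake_llm_output)
    # Validate before anything touches the system of record, since the
    # model can hallucinate fields or drop them.
    assert set(record) == set(SCHEMA_KEYS), "model returned wrong schema"
    return record

row = extract("Alice bought something on Jan 5, 2024.")
print(row["who"])  # a structured field, ready for a conventional database
```

Bulk queries like the Google-search example above would then run against the database, never against the model itself.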