top | item 47160450

dirk94018 | 4 days ago

This is exactly why local inference matters. Every query you send to a cloud API is another data point. Your prompts contain your code, your logs, your thought process — arguably more identifying than your HN comments.

The paper shows deanonymization from public posts. Imagine what's possible with private API traffic: the questions you ask, the code you paste, the errors you debug. Even if providers don't read it today, the data exists and the cost of analyzing it is going to zero.

Air-gapped local inference isn't paranoia. It's necessary.


Imustaskforhelp | 4 days ago

Combine this with the fact that even the "private" mode of any AI provider still keeps logs of the chats, and, from some past discussion iirc, will keep them indefinitely.

> Air-gapped local inference isn't paranoia. It's necessary.

I definitely agree. I'm seeing new models like qwen-3.5-30A3b (iirc) that can be run reasonably on normal hardware (you can buy a Mac mini whose price hasn't been inflated) and get decent tps while being a decent model overall.

There are some services like Proton Lumo, the service by Signal, and Kagi's AI which seem to try to be better, but long term my plan is to buy a Mac mini for this level of inference for basic queries.

Of course, in the meantime, for something like coding it might not make too big a difference whether you use a local model or not, except for the most extremely sensitive work (perhaps govt/bank oriented).
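For what it's worth, the local setup described above is a small amount of code once a local server is running. A minimal sketch, assuming an Ollama-style HTTP server on its default localhost port (the model name and prompt are illustrative, not from the thread):

```python
import json
import urllib.request

# Default Ollama endpoint; nothing in this flow leaves the machine.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_request(model: str, prompt: str) -> dict:
    """Build a payload for an Ollama-style /api/generate endpoint.

    stream=False asks for one complete JSON response instead of chunks.
    """
    return {"model": model, "prompt": prompt, "stream": False}

def ask_local(model: str, prompt: str) -> str:
    """POST the prompt to the local server and return the model's reply."""
    payload = json.dumps(build_request(model, prompt)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

# Usage (requires a running local server, so not executed here):
# reply = ask_local("qwen3:30b-a3b", "Explain this stack trace: ...")
```

Since the endpoint is loopback-only, prompts, pasted code, and logs never become API traffic, which is the whole point of the comments above.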