top | item 47160719


alexpotato | 4 days ago

Many years ago (early 2000s) I worked for a firm that would help identify people who were doing "pump and dump" stock scams on Yahoo Finance message boards.

Step 1 was to scrape all of their posts into a database.

Step 2 was to have a human analyst review all of the posts for clues about who that person was.

It was amazing that you could easily figure out:

- if they were at work or home from when they posted (9am to 5pm vs 6pm to 1am)

- what city they were in (based on sports teams, mentions of local landmarks, etc.)

- roughly what career they had

- their age based on cultural references

and mostly b/c they would drop a crumb of information here and there over months. They had probably forgotten about all of these individual crumbs, but when you read all of the posts in a few hours, the details became pretty evident. Get enough of these details and you can start to Venn-diagram people down to a few hundred likely candidates, and then use LexisNexis-style tools to narrow it down even further.
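To make the narrowing step concrete, here is a rough Python sketch of the two ideas above: guessing work vs. home from posting hours, and intersecting clues to shrink a candidate pool. Everything in it is an assumption for illustration (the `posting_context` and `narrow_candidates` helpers, the 9-to-5 cutoff, the field names, the toy population); it is not the firm's actual tooling, which was a human analyst reading posts.

```python
from collections import Counter

def posting_context(hours):
    """Guess 'work' or 'home' from the hours (0-23) at which posts were made.

    Assumption: 9:00-16:59 counts as work hours, everything else as home.
    """
    if not hours:
        return "unknown"
    counts = Counter("work" if 9 <= h < 17 else "home" for h in hours)
    return counts.most_common(1)[0][0]

def narrow_candidates(population, clues):
    """Keep only the people who match every clue (the Venn-diagram overlap)."""
    return [
        person for person in population
        if all(person.get(key) == value for key, value in clues.items())
    ]

# Hypothetical candidate pool; in practice this would come from a records tool.
population = [
    {"name": "A", "city": "Boston", "career": "sales", "age_band": "30s"},
    {"name": "B", "city": "Boston", "career": "engineering", "age_band": "30s"},
    {"name": "C", "city": "Denver", "career": "sales", "age_band": "30s"},
]

# Crumbs collected from months of posts, each one weak on its own.
clues = {"city": "Boston", "career": "sales"}
matches = narrow_candidates(population, clues)
```

Each clue alone leaves plenty of candidates; it is the intersection that does the damage, which is why crumbs dropped months apart still add up.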

Given the above, it doesn't surprise me that LLMs can do the same but at high speed and across multiple sites etc.


tsumnia|4 days ago

I recently decided to play around with this, given... well, my profile... and I will say that Gemini was good at zeroing in on who I was, but for whatever reason would refuse to say my name.

rudhdb773b|4 days ago

Did you have a contract with the SEC? Just wondering what kind of business would have an interest in that.

wraptile|4 days ago

Not OP, but I have experience in the private sector here. Deanonymization in the private sector is used by anti-fraud and brand protection systems. For example, in brand protection we identify the same IP/scam infringer across multiple storefronts, and then we can shut them down directly or get more certainty about their other posts, e.g. if it's a known infringer, the scam likelihood score goes up on all of their listings. So deanonymization doesn't have to point to an exact real identity, just give enough certainty to tie multiple entries together; other systems can then take it further, like OP's manual review, though LLMs can obviously do a lot of that these days.
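A minimal Python sketch of the linking idea in the comment above, under loud assumptions: listings are tied together when they share any signal (the `payout:` and `img:` strings are made-up placeholders, not real identifiers any system is known to use), and a known infringer in a group raises the score of every listing in that group by a fixed, arbitrary `boost`. The function names and data layout are hypothetical.

```python
from collections import defaultdict

def link_listings(listings):
    """Group listings that share at least one signal, using union-find."""
    parent = {item["id"]: item["id"] for item in listings}

    def find(x):
        # Walk up to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # signal -> id of the first listing that carried it
    for item in listings:
        for signal in item["signals"]:
            if signal in first_seen:
                parent[find(item["id"])] = find(first_seen[signal])
            else:
                first_seen[signal] = item["id"]

    groups = defaultdict(list)
    for item in listings:
        groups[find(item["id"])].append(item["id"])
    return list(groups.values())

def boost_scores(listings, groups, boost=0.3):
    """Raise the scam score of every listing linked to a known infringer."""
    by_id = {item["id"]: item for item in listings}
    scores = {}
    for group in groups:
        flagged = any(by_id[i]["known_infringer"] for i in group)
        for i in group:
            base = by_id[i]["score"]
            scores[i] = min(1.0, base + boost) if flagged else base
    return scores

# Hypothetical listings; signal values are placeholders.
listings = [
    {"id": "s1", "signals": {"payout:acct-1"}, "score": 0.9, "known_infringer": True},
    {"id": "s2", "signals": {"payout:acct-1", "img:abc"}, "score": 0.2, "known_infringer": False},
    {"id": "s3", "signals": {"img:abc"}, "score": 0.1, "known_infringer": False},
    {"id": "s4", "signals": {"payout:acct-9"}, "score": 0.1, "known_infringer": False},
]
groups = link_listings(listings)
scores = boost_scores(listings, groups)
```

Note that `s3` never shares anything directly with the known infringer `s1`; it gets linked through `s2`, which is the "tie multiple entries together" part. No real identity is needed at any point.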