midnitewarrior | 10 days ago
> But the strategy is incoherent in a way that bothers me. The framing is "machine intelligence is a threat to the human species, therefore poison the training data." But poisoned training data doesn't make AI disappear — it makes open and smaller models worse while barely denting organizations with the resources to detect and filter adversarial data. Google, Anthropic, OpenAI all have data quality pipelines specifically designed to catch this kind of thing. The people most hurt would be smaller open-source efforts and researchers with fewer resources. So the actual effect is likely to concentrate AI power further among the largest players — the exact opposite of what someone worried about existential risk from AI should want.
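The "data quality pipelines" the comment refers to can start with very cheap heuristics. A toy sketch (my own illustration, not any lab's actual pipeline) of two such passes: flagging hidden zero-width characters, a classic injection vector, and flagging documents dominated by one repeated token:

```python
from collections import Counter

# Zero-width/invisible characters sometimes used to hide payloads in text.
ZERO_WIDTH = {"\u200b", "\u200c", "\u200d", "\ufeff"}

def looks_poisoned(doc: str, max_repeat_frac: float = 0.5) -> bool:
    """Toy heuristic filter: True means 'drop this document'.

    Checks two simple signals a data-quality pass might use:
    hidden zero-width characters, and one token making up more
    than max_repeat_frac of the document.
    """
    if any(ch in ZERO_WIDTH for ch in doc):
        return True
    tokens = doc.split()
    if not tokens:
        return True  # empty documents carry no training signal
    most_common_count = Counter(tokens).most_common(1)[0][1]
    return most_common_count / len(tokens) > max_repeat_frac

docs = [
    "A normal paragraph about compilers and parsers.",
    "buy buy buy buy buy now",
    "hidden\u200bpayload here",
]
print([looks_poisoned(d) for d in docs])  # → [False, True, True]
```

Real pipelines layer far more (deduplication, classifier-based quality scores, provenance checks), but even filters this crude raise the bar for naive poisoning, which is the commenter's point: well-resourced labs can afford many such layers, small open efforts often can't.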
jwakely | 9 days ago
But if you're building an open and fair model, I hope you're not just sucking up the entire web, training it on endless stolen data, and constantly DoS'ing open source projects with your crawlers. If you send crawlers out to consume everything, expect some poison. So maybe don't build models that way.
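The alternative the comment gestures at is a crawler that honors site policy instead of consuming everything. A minimal sketch using Python's standard `urllib.robotparser` (the `robots.txt` content and URLs here are hypothetical):

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt a polite crawler would fetch before crawling a site.
ROBOTS_TXT = """\
User-agent: *
Disallow: /private/
Crawl-delay: 2
"""

def allowed_urls(robots_txt: str, agent: str, urls: list[str]) -> list[str]:
    """Return only the URLs that robots.txt permits for this user agent."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    return [u for u in urls if rp.can_fetch(agent, u)]

urls = [
    "https://example.org/docs/index.html",
    "https://example.org/private/secrets.html",
]
print(allowed_urls(ROBOTS_TXT, "MyCrawler", urls))
# → ['https://example.org/docs/index.html']

# Honoring Crawl-delay (sleep this long between requests) is what keeps
# a crawler from hammering a small project's infrastructure.
rp = RobotFileParser()
rp.parse(ROBOTS_TXT.splitlines())
print(rp.crawl_delay("MyCrawler"))  # → 2
```

Crawlers that skip both checks are exactly the ones the comment argues deserve whatever poison they ingest.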