I think we need both approaches. I don't want to know some things. For example, people who know how good heroin feels can't escape the addiction. The knowledge itself is a hazard.
Still, any AI model vulnerable to cogitohazards is a huge risk because any model could trivially access the full corpus of human knowledge. It makes more sense to making sure the most powerful models are resistant to cogitohazards rather than developing elaborate schemes to shield their vision and hope that plan works out in perpetuity.
atleastoptimal|7 months ago