pka | 2 years ago
> I interpret there to typically be hand waving on all sides of this issue; people concerned about AI risks from limited models rarely give specific failure cases, and people saying that models need to be more powerful to be dangerous rarely specify any conservative bound on that requirement.
I think these are two sides of the same coin: on one hand, AI safety researchers can give very specific failure cases of alignment that have no known solutions so far, and they take the issue seriously (and have for years while trying to raise awareness). On the other, finding and specifying that "conservative bound" precisely and in a foolproof way is exactly the holy grail of safety research.
mitthrowaway2 | 2 years ago