Recursive insight is possible with a model that self-trains, but right now that would result in a detour into unreality. Perhaps it could work with the right system for vetting new data before it is incorporated into the retraining set.
Not necessarily. For example, Anthropic's Constitutional AI (CAI) uses the model itself to stand in for human judgments in RLHF, making it essentially RLAIF. The CAI feedback is then used to fine-tune the Claude model.
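To make the RLAIF idea concrete, here is a minimal sketch of the preference-labeling step. The `judge_prefers` function is a hypothetical stand-in for a model prompted with a constitution; in this toy it is just a trivial heuristic, not Anthropic's actual method.

```python
# Sketch of RLAIF-style preference labeling: an AI judge replaces the
# human annotator in RLHF. `judge_prefers` is a hypothetical stand-in
# for an LLM judging responses against a constitution.

def judge_prefers(prompt, a, b):
    """Return True if response `a` is preferred over `b`.
    Stand-in rule: prefer the response that actually addresses the
    prompt topic (a real judge would be a constitution-prompted LLM)."""
    return prompt.lower() in a.lower() and prompt.lower() not in b.lower()

pairs = [("photosynthesis",
          "Photosynthesis converts light into chemical energy.",
          "I am not sure, ask someone else.")]

# Preference data in (prompt, chosen, rejected) form, the shape
# consumed by reward-model or DPO-style fine-tuning.
prefs = [(p, a, b) if judge_prefers(p, a, b) else (p, b, a)
         for p, a, b in pairs]
```

The point is only the shape of the loop: the model generates both the responses and the preference labels, and the labeled pairs feed the fine-tuning stage.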
K0balt|2 years ago
Right now they just get stupider if you train them on their own output, which suggests that, as a general rule, the quality of the data in the training set is higher than the quality of the output the model produces. The fidelity is < 1.0. Apparently fidelity > 1 is achievable (that's the growth of human knowledge), but our algorithms aren't so great at this point, it seems.
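The fidelity < 1.0 point can be illustrated with a toy model-collapse demo: refit a tiny "model" (a Gaussian) on samples drawn from its own previous fit, over and over. This is a generic sketch of the phenomenon, not anyone's specific training setup.

```python
# Toy illustration of fidelity < 1: repeatedly retrain a tiny "model"
# (a fitted Gaussian) on its own samples and watch the variance collapse.
import math
import random

random.seed(0)

def fit(samples):
    """Refit the 'model' (mean, std) to data via maximum likelihood."""
    n = len(samples)
    mu = sum(samples) / n
    var = sum((x - mu) ** 2 for x in samples) / n
    return mu, math.sqrt(var)

mu, sigma = 0.0, 1.0          # generation 0: trained on "real" data
for _ in range(500):          # each generation trains on the last one's output
    samples = [random.gauss(mu, sigma) for _ in range(5)]
    mu, sigma = fit(samples)

# sigma shrinks far below its starting value of 1.0:
# a little information is lost in every generation.
```

Each refit estimates the spread from a small sample, which is biased low on average, so the distribution narrows generation after generation; that per-step loss is the fidelity < 1.0.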
visarga|2 years ago
Broadly speaking, you need a signal from level N+1 when you are at level N. We can amplify models by giving them additional time, self-reflection, demanding step-by-step planning, allowing external tools, tuning them on human preferences, or giving them feedback from executing code or from a robot.
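One of those amplification signals, feedback from executing code, can be sketched as a filter: run each candidate program and keep only the ones that pass. The candidate functions below are hypothetical stand-ins for model-generated programs; a real system would sample them from an LLM.

```python
# Sketch: code execution as the "level N+1" signal.
# candidate_a / candidate_b stand in for model-sampled programs.

def candidate_a(xs):
    return sum(xs) + 1   # buggy: off-by-one

def candidate_b(xs):
    return sum(xs)       # correct

def verify(fn, tests):
    """Execution feedback: run fn against input/output pairs."""
    try:
        return all(fn(x) == y for x, y in tests)
    except Exception:
        return False

tests = [([1, 2, 3], 6), ([], 0)]
survivors = [f for f in (candidate_a, candidate_b) if verify(f, tests)]
# Only candidates that survive execution feedback are kept; those
# filtered outputs could then feed a retraining set.
```

The execution environment sits one level above the model: it can judge outputs the model cannot reliably judge itself, which is what makes self-improvement on filtered data plausible where raw self-training fails.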