This type of behavior (and related failures) would primarily be an issue only with unconstrained generative models. If you are the one deploying the model, or a downstream consumer, then once the network is trained it can be re-expressed (via an exogenous, secondary model or process) to yield reliable and interpretable uncertainty quantification. This is done by conditioning on reference classes, formed over a held-out calibration set, defined by similarity to training (depth-matches to training), distance to training, and a CDF-based per-class threshold on the output magnitude. If a prediction falls below the desired probability threshold, fail gracefully by rejecting it rather than letting silent errors accumulate. For higher-risk settings, you can always turn the crank to be more conservative (i.e., more stringent parameters and/or a larger required sample size in the highest-probability, highest-reliability data partition).
For classification tasks, this follows directly. For generative output, it comes into play via the final verification classifier applied over the output.
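A minimal sketch of the classification case, assuming a trained classifier that emits softmax probabilities and a held-out calibration set. The comment above also conditions on similarity and distance to training; this sketch omits that partitioning and shows only the CDF-based per-class threshold with graceful rejection. The function names, the choice of max-softmax as the "output magnitude," and the quantile used are all illustrative assumptions, not a specification.

```python
import numpy as np

def fit_class_thresholds(cal_probs, cal_labels, quantile=0.1):
    """Per-class confidence threshold from the empirical CDF of the
    output magnitude (here: max softmax) over correctly classified
    calibration examples. Sketch only; the exact CDF construction
    in the original method is an assumption here."""
    preds = cal_probs.argmax(axis=1)
    conf = cal_probs.max(axis=1)
    thresholds = {}
    for c in np.unique(cal_labels):
        mask = (preds == c) & (cal_labels == c)
        if mask.sum() == 0:
            # No calibration evidence for this class: reject everything.
            thresholds[int(c)] = 1.0
        else:
            thresholds[int(c)] = float(np.quantile(conf[mask], quantile))
    return thresholds

def predict_or_reject(probs, thresholds):
    """Return the predicted class, or None (graceful failure) when
    confidence falls below the class-conditional threshold."""
    c = int(probs.argmax())
    if probs.max() < thresholds.get(c, 1.0):
        return None
    return c

# Toy usage: two classes, four calibration examples.
cal_probs = np.array([[0.9, 0.1], [0.8, 0.2], [0.2, 0.8], [0.3, 0.7]])
cal_labels = np.array([0, 0, 1, 1])
thr = fit_class_thresholds(cal_probs, cal_labels)
print(predict_or_reject(np.array([0.95, 0.05]), thr))  # confident -> 0
print(predict_or_reject(np.array([0.60, 0.40]), thr))  # below threshold -> None
```

Tightening `quantile` toward larger values (or requiring more calibration samples per class before trusting a threshold) is the "turn the crank" knob for higher-risk settings.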