deepsharp | 9 months ago

Would you seriously deploy a rigid AI system into a mission-critical environment—say, autonomous driving, finance, or defense—where conditions change constantly? It's a safety risk.

PeterStuer | 9 months ago

The variance of which you speak would be handled by the current deployed version of the system, which has been tested and declared fit for operation across a range of conditions.

Meanwhile, the next release candidates (there might be multiple) are being developed, trained, and tested for potential future production use.

E.g., when I did autonomous robotics, the sensor models had to be quite adaptive, as less predictable environmental parameters such as lighting conditions, dirt, energy level, and temperature could influence readings dramatically. These dynamic adaptations occur at runtime, sometimes via a fairly nontrivial trained sensor model.
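
Roughly, something like this (an illustrative Python sketch, not actual production code; the names and the simple linear bias model are made up for the example). The point is that the correction is applied dynamically from the current context, while the model itself was learned and validated offline and stays frozen in the field:

    # Illustrative only: a frozen, pre-validated correction model applied at runtime.
    from dataclasses import dataclass

    @dataclass
    class SensorContext:
        ambient_light: float   # normalized 0..1
        temperature_c: float
        battery_level: float   # normalized 0..1

    class CalibratedRangeSensor:
        """Applies a context-dependent correction whose weights were
        learned and validated offline (no learning happens in production)."""

        def __init__(self, bias_weights):
            self.bias_weights = bias_weights  # frozen at deployment

        def corrected_reading(self, raw_distance_m: float, ctx: SensorContext) -> float:
            # Adaptation at runtime: the bias depends on current conditions,
            # but the mapping itself never changes in the field.
            w_light, w_temp, w_batt = self.bias_weights
            bias = (w_light * ctx.ambient_light
                    + w_temp * (ctx.temperature_c - 20.0)
                    + w_batt * (1.0 - ctx.battery_level))
            return raw_distance_m - bias

    if __name__ == "__main__":
        sensor = CalibratedRangeSensor(bias_weights=(0.05, 0.002, 0.10))
        ctx = SensorContext(ambient_light=0.2, temperature_c=35.0, battery_level=0.6)
        print(sensor.corrected_reading(raw_distance_m=3.40, ctx=ctx))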

What you usually do not want is to run an untested system that "freely" learns from presented data in a live production environment, as that could lead, e.g., to contextual overfitting or destabilization, and even subversion of the adaptive control processes.

Exceptions could be systems that have to operate in extremely dynamic and less understood environments, but where risks are bounded and you can confidently implement guardrails to protect against excessive loss (e.g. HFT agents).
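
Such a guardrail can be as simple as a circuit breaker around the adaptive agent. Another illustrative Python sketch (the interface and thresholds are invented for the example, not a real trading system):

    # Illustrative only: a loss "circuit breaker" around a freely adapting agent.
    class LossGuardrail:
        """Halts an adaptive strategy once cumulative loss exceeds a fixed bound."""

        def __init__(self, max_drawdown: float):
            self.max_drawdown = max_drawdown
            self.cumulative_pnl = 0.0
            self.halted = False

        def record(self, trade_pnl: float) -> None:
            self.cumulative_pnl += trade_pnl
            if self.cumulative_pnl < -self.max_drawdown:
                self.halted = True  # trip the breaker: stop the agent, flatten positions

        def allow_trading(self) -> bool:
            return not self.halted

    if __name__ == "__main__":
        guard = LossGuardrail(max_drawdown=10_000.0)
        for pnl in (1_200.0, -4_500.0, -7_800.0, 600.0):
            if not guard.allow_trading():
                print("guardrail tripped; adaptive agent halted")
                break
            guard.record(pnl)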

deepsharp | 9 months ago

“The variance of which you speak would be handled by the current deployed version of the system that has been tested and declared fit for operation across a range of conditions.”

This statement reflects a common (and dangerous) assumption in today's AI culture: that one can foresee all possible future conditions at design time, i.e., know the unknown unknowns. Zillow's AI was also "declared fit"... until COVID flipped housing dynamics and cost them half a billion. Tiger Global's $17B loss followed a similar trajectory: confidence in pre-deployment testing, blindsided by real-world shifts. I can go on. But the good news is that some communities, especially those deploying AI in the real world, have started to recognize this. For example:

"Autonomous systems must be able to operate in complex, possibly a priori unknown environments that possess a large number of potential states that cannot all be pre-specified or be exhaustively examined or tested. Systems must be able to assimilate, respond to, and adapt to dynamic conditions that were not considered during their design... This 'scaling' problem... is highly nontrivial." — Institute for Defense Analyses (IDA)

Until the broader AI/ML culture internalizes this gap between leaderboard AI (which wins on pre-defined benchmarks) and real-world AI, we'll keep seeing deployed systems fail in costly, unpredictable ways.