(no title)
drillsteps5 | 1 month ago
How do you QA a black-box, non-deterministic system? I'm not being facetious, seriously asking.
EDIT: Formatting
pegasus|1 month ago
The thing is (and maybe this is what parent meant by non-determinism, in which case I agree it's a problem), in this brave new technological use-case, the space of possible interactions dwarfs anything machines have dealt with before. And it seems inevitable that the space of possible misunderstandings which can arise during these interactions will balloon similarly, simply because of the radically different nature of our AI interlocutor compared to what (actually, who) we're used to interacting with in this world of representation and human life situations.
drillsteps5|1 month ago
By "non-deterministic" I meant that it can give you different output for the same input. Ask the same question, get a different answer every time, some of which can be accurate, some... not so much. Especially if you ask the same question in the same dialog (the question is the same but the context is not, so the answer will be different).
EDIT: More interestingly, say I find an issue. What do I even DO? If it's not related to integrations or your underlying data, the black box just gave nonsensical output. What would I do to resolve it?
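For what it's worth, one common way to put a number on "different output for the same input" is to stop asking "did it pass?" and instead sample the same prompt many times and bound the pass rate. Below is a minimal sketch of that idea, not anyone's actual QA process. Everything in it is an assumption for illustration: `make_fake_model` is a hypothetical seeded stand-in for the black box, `sample_pass_rate` and `wilson_lower_bound` are made-up helper names, and the 200-trial count is arbitrary.

```python
import math
import random
from typing import Callable, Tuple


def wilson_lower_bound(passes: int, trials: int, z: float = 1.96) -> float:
    """Lower end of the Wilson score interval for a binomial pass rate."""
    if trials == 0:
        return 0.0
    p = passes / trials
    denom = 1 + z * z / trials
    centre = p + z * z / (2 * trials)
    margin = z * math.sqrt(p * (1 - p) / trials + z * z / (4 * trials * trials))
    return (centre - margin) / denom


def sample_pass_rate(
    model: Callable[[str], str],
    prompt: str,
    check: Callable[[str], bool],
    trials: int = 200,
) -> Tuple[int, int]:
    """Send the same prompt `trials` times and count outputs that pass `check`."""
    passes = sum(1 for _ in range(trials) if check(model(prompt)))
    return passes, trials


def make_fake_model(accuracy: float, seed: int) -> Callable[[str], str]:
    """Hypothetical stand-in for the black box: right answer with probability `accuracy`."""
    rng = random.Random(seed)

    def model(prompt: str) -> str:
        return "4" if rng.random() < accuracy else "5"

    return model


if __name__ == "__main__":
    model = make_fake_model(accuracy=0.9, seed=0)
    passes, trials = sample_pass_rate(model, "What is 2 + 2?", lambda out: out == "4")
    bound = wilson_lower_bound(passes, trials)
    print(f"{passes}/{trials} passed, 95% lower bound on pass rate: {bound:.3f}")
```

A release gate could then be phrased as "the lower bound must stay above some threshold" rather than "the test must pass". This only measures the failure rate; it does not answer the second question of what to do once the black box turns out to be wrong.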
datsci_est_2015|1 month ago
That’s not strictly how I test my systems. I can release with confidence because of a litany of SWE best practices learned and borrowed from decades of my own and other people’s experiences.
> No system is guaranteed to never fail, it's all about degree of effectiveness and resilience.
It seems like the product space for services built on generative AI is diminishing by the day with respect to “effectiveness and resilience”. I was just laughing with some friends about how terrible most of the results are when using Apple’s new Genmoji feature. Apple, the company with one of the largest market caps in the world.
I can definitely use LLMs and other generative AI directly, and understand the caveats, and even get great results from them. But so far every service I’ve interacted with that was a “white label” repackaging of generative AI has been absolute dogwater.
unknown|1 month ago
[deleted]
themafia|1 month ago
It's the training data that matters. Your "AI interlocutor" is nothing more than a lossy compression algorithm.
unknown|1 month ago
[deleted]
kylehotchkiss|1 month ago