> A spokesperson for the mayor, Dora Pekec, confirmed in a text message that the new administration plans to take down the chatbot. She said a member of the Mamdani transition team had seen reporting on the bot from The Markup and THE CITY and presented it to the mayor as a possible place to save funds.
Journalism teed up an easy way for an incoming politician to dunk on his predecessor, if you'll forgive the mixed metaphor. Not that I'm opposed to any part of it, just that this was an easy scenario for "journalism" to "work" in.
Why did NYC release it in the first place? Did they not QA it?
Or was it perhaps one of those cases where they found issues, but the only way to really know for sure that the deleterious impact is significant enough by pushing it to prod?
> Why did NYC release it in the first place? Did they not QA it?
Considering Louis Rossmann's videos on his adventures with NYC bureaucracy (e.g. [0]), the QAers might not have known the laws any better than the chat bot.
Remember that many people are heavily are happy-path biased. They see a good result once and say "that's it, ship it!"
I'm sure they QA'd it, but QA was probably "does this give me good results" (almost certainly 'yes' with an LLM), not "does this consistently not give me bad results".
The chatbot was released under the Eric Adams administration. The same Eric Adams, as soon as his term finished, went to Dubai and launched a cryptocurrency.
I think he is simply not very bright, and got mesmerized by all the shiny promises AI and crypto makes without the slightest understanding of how it actually works. I do not understand how he got into office in the first place.
QA efforts can whack-a-mole some issues, but the mismatch of problem and solution is inherent in any situation in which a generator of plausible-sounding text gets pointed at an area where correctness matters.
It’s an LLM. The dirty little secret of LLMs is that they cannot be used for anything important, unless the output is checked by an expert (which typically rather defeats the purpose).
It was implemented by our scammy, grifting, Republican in a Democratic lawmaker suit former mayor Eric Adams who should probably be in prison but who made a deal with Trump to not be prosecuted.
Being in and around the NYC area, while also knowing plenty of small businesses, I'm glad Mamdani killed this bot. Telling bosses to steal tips from their employees is run-of-the-mill corruption and common over here. The vibe for businesses is that everyone has to be exploiting someone else or have a schtick. If you were to talk about morals, you would be ridiculed. Most lawyers wouldn't even prosecute small businesses for this. It's probably why the agent was put into production, the level of business ethics in NYC is cartoonishly evil.
In the case of stealing tips, that's wage theft and the New York State Department of Labor has zero sense of humor about that. They will definitely investigate all claims on that topic. It might be too little and too late for the individual affected, but the business will pay.
I always ask this question about these bots: is the literature the training data or is the understanding of literature the training data? Meaning, sure you trained the bot on the current rules and regulations. But does that mean the model weights contain them? Such that really is a guess at legal accuracy? Or is it trained to be a lawyer and understand the docs which sit outside the model? Every time I've asked the answer is the former, and to me that's the wrong approach. But I'm not an AI scientist so I don't know how hard my theoretically perfect solution is.
What I do know is that if it was done my way it would be pretty easy for it to do what the Google AI does. Say it's not responsible, give links for humans to fact check it. I've noticed a dramatic drop in hallucinations after it had to provide links to its sources. Still not 0, though.
> I've noticed a dramatic drop in hallucinations after it had to provide links to its sources. Still not 0, though.
I’ve noticed that Google does a fair job at linking to relevant sources but it’s still fairly common for it to confabulate something that source doesn’t say or even directly contradicts. It seems to hit the underlying inability to reason where if the source covers more than one thing it’s prone to taking an input “X does A while Y does B” and emitting “Y does A” or “X does A and B”. It’s a fascinating failure mode which seems to be insurmountable.
I thought Gemini just started providing citations in the last few months. Are you saying they should have beaten Google to the punch on this? As part of the $500,000 budget?
> The bot, built using Microsoft’s cloud computing platform
When is the last time there was positive news involving Microsoft? This bot could've easily been on AWS or GCP but I find it hilarious that here they are, getting dragged yet again
Even if the capability of each platform was exactly the same, Microsoft cloud users skew heavily towards governments, large non-tech corporations and really anyone who you sell to using large sales teams, fancy dinners and kickbacks rather than quality of software. And the end result follows.
> The Office of Technology and Innovation spent nearly $600,000 to build out the foundations of the MyCity chatbot, which will be used for future chatbot offerings on MyCity. [0]
This was experimental tech... while I admire cities attempting to implement AI, it seems they did not spend enough tax dollars on it!
We’ll likely see a lot of these AI pet projects get axed in the coming year or two… especially things rushed out in the early phases of the AI bubble when folks were desperate to appear to be using AI.
yeah i hope the problems stay to somewhat humorous themes like convincing a car sales bot to sell you a car for $1 and not more serious issues like convincing a bot to metaphorically launch the ICBMs.
toomuchtodo|1 month ago
Journalism works.
andrewflnr|1 month ago
atq2119|1 month ago
andsoitis|1 month ago
Or was it perhaps one of those cases where they found issues, but the only way to really know for sure that the deleterious impact is significant enough by pushing it to prod?
drillsteps5|1 month ago
How do you QA black box non-deterministic system? I'm not being facetious, seriously asking.
EDIT: Formatting
thedanbob|1 month ago
Considering Louis Rossmann's videos on his adventures with NYC bureaucracy (e.g. [0]), the QAers might not have known the laws any better than the chat bot.
[0] https://www.youtube.com/watch?v=yi8_9WGk3Ok
cheald|1 month ago
I'm sure they QA'd it, but QA was probably "does this give me good results" (almost certainly 'yes' with an LLM), not "does this consistently not give me bad results".
pibaker|1 month ago
https://apnews.com/article/eric-adams-crypto-meme-coin-942ba...
I think he is simply not very bright, and got mesmerized by all the shiny promises AI and crypto makes without the slightest understanding of how it actually works. I do not understand how he got into office in the first place.
elgenie|1 month ago
rsynnott|29 days ago
There’s no amount of qa that could save this.
freejazz|1 month ago
fragmede|1 month ago
erxam|1 month ago
Perhaps a big fat check was involved.
JohnTHaller|29 days ago
kittikitti|1 month ago
patrickmay|1 month ago
Neywiny|1 month ago
What I do know is that if it was done my way it would be pretty easy for it to do what the Google AI does. Say it's not responsible, give links for humans to fact check it. I've noticed a dramatic drop in hallucinations after it had to provide links to its sources. Still not 0, though.
acdha|29 days ago
I’ve noticed that Google does a fair job at linking to relevant sources but it’s still fairly common for it to confabulate something that source doesn’t say or even directly contradicts. It seems to hit the underlying inability to reason where if the source covers more than one thing it’s prone to taking an input “X does A while Y does B” and emitting “Y does A” or “X does A and B”. It’s a fascinating failure mode which seems to be insurmountable.
sdwr|1 month ago
I thought Gemini just started providing citations in the last few months. Are you saying they should have beaten Google to the punch on this? As part of the $500,000 budget?
sylens|1 month ago
When is the last time there was positive news involving Microsoft? This bot could've easily been on AWS or GCP but I find it hilarious that here they are, getting dragged yet again
embedding-shape|1 month ago
paxys|1 month ago
SanjayMehta|19 days ago
https://organiser.org/2026/02/10/339366/world/zohran-mamdani...
hashberry|1 month ago
This was experimental tech... while I admire cities attempting to implement AI, it seems they did not spend enough tax dollars on it!
[0] https://abc7ny.com/post/ai-artificial-intelligence-eric-adam...
terespuwash|1 month ago
greekrich92|1 month ago
cmiles8|1 month ago
chasd00|1 month ago
1970-01-01|1 month ago
direwolf20|1 month ago
hydrogen7800|1 month ago
monero-xmr|1 month ago
[deleted]
geoffeg|1 month ago