top | item 42704941

(no title)

The beta shows up inconsistently (it took a few refreshes before anything appeared), and my limited usage of it turned up a plethora of issues:

- Assumed UTC instead of EST. I corrected it and it still continued to bork.

- Added random time deltas to the times I asked for (+2, -10 min).

- A couple of notifications didn't go off at all.

- The one that did go off didn't provide a push notification.
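For anyone wondering what the UTC-vs-EST mixup looks like in practice, here is a minimal sketch. It is not how the beta is implemented (that isn't public); `to_utc` is a hypothetical helper and the date is made up. The point is only that the same wall-clock time lands five hours apart depending on which zone the scheduler assumes.

```python
from datetime import datetime, timezone
from zoneinfo import ZoneInfo


def to_utc(local_naive: datetime, tz_name: str) -> datetime:
    """Interpret a naive wall-clock time in the given zone and return it in UTC.

    Hypothetical helper for illustration only.
    """
    local = local_naive.replace(tzinfo=ZoneInfo(tz_name))
    return local.astimezone(timezone.utc)


# The user asks for "9:00 AM" (illustrative date, in winter so EST applies).
requested = datetime(2025, 1, 15, 9, 0)

as_eastern = to_utc(requested, "America/New_York")  # what the user meant
as_utc = to_utc(requested, "UTC")                   # what a UTC default assumes

# Hours between the intended and the assumed firing time.
skew_hours = (as_eastern - as_utc).total_seconds() / 3600
print(as_eastern.isoformat(), as_utc.isoformat(), skew_hours)
```

Storing the user's zone name next to the task, rather than a fixed offset, also keeps the reminder correct across DST changes.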

---

On top of that, it's only usable without search mode. In search mode, it was totally confused and gave me a Forbes article.

Seems half baked to me.

Doing scheduled research behind the scenes or sending a push notification to my phone would be cool, but I'm surprised they thought this was OK for a public beta.


gukov|1 year ago

You'd think OpenAI's dev velocity and quality would be off the charts since they live and breathe "AI." If a company building ChatGPT itself often delivers buggy features, then it doesn't bode well for this whole 'AI will eat the world' notion.

practice9|1 year ago

Well none of the labs have good frontend or mobile engineers or even infra engineers

Anthropic is ahead in this because they keep their UIs simplistic so the failure modes are also simple (bad connection)

OpenAI is just pushing half baked stuff to prod and moving on (GPTs, Canvas).

I find it hilarious and sad that o1-pro just times out while thinking on very long or image-heavy chats. You need to reload the page multiple times after it fails to reply, and maybe the answer will appear (or not? Or in 5 minutes?). It kind of shows they're not testing enough and not "eating their own dog food," and it feels like the ChatGPT 3.5 UI before the redesign.

golergka|1 year ago

So far, I've found AI to be a great force multiplier in small greenfield projects. In a huge corporate codebase, it has the power of advanced refactoring (which doesn't touch more than a handful of files at a time) and a CSS wizard.

cruffle_duffle|1 year ago

According to all the magazines I've been reading, all that is required is to just prompt it with "please fix all of these issues" and give it a bulleted list with a single sentence describing each issue. I mean, it's AI powered and therefore much better than overpaid prima-donna engineers, so obviously it should "just work" and all the problems will get fixed. I'm sure most of the bugs were the result of humans meddling in the AI's brilliant output.

Right now, in fact, my understanding is OpenAI is using their current LLMs to write the next generation ones, which will far surpass anything a developer can currently do. Obviously we'll need to keep management around to tell these things what to do, but the days of being a paid software engineer are numbered.

ineedasername|1 year ago

When I have it do a search, I have to tell it to just get all the info it can in the search but wait for the next request. Then I explicitly tell it we're done searching and to treat the next prompt as a new request, but using the new info it found.

That’s the only way I get it to have a halfway decent brain after a web search. Something about that mode makes it more like a PR drone version of whatever I asked it to search, repeating things verbatim even when I ask for more specifics in follow-up.

emkee|1 year ago

Can you give an example prompt for this approach?

imsotiredspacex|1 year ago

I posted the part of the system prompt that describes the function call; if you read it and adjust your prompt for creating the task, it works way better.

unknown|1 year ago

[deleted]

potatoman22|1 year ago

I'd rather have buggy things now than perfect things in a year.

dmadisetti|1 year ago

Doesn't need to be perfect, but using this would actively reduce productivity.

sprobertson|1 year ago

First impressions matter. If the experience is this bad, you're probably waiting a year to come back anyway.

jahewson|1 year ago

Worked out great for Sonos when their timers and alarms didn’t work.

broknbottle|1 year ago

Found the PM

arthurcolle|1 year ago

DateTime stuff is generally super annoying to debug. Can't fault them too badly. Adding a scheduler is a key enabling idea for a ton of use cases

sensanaty|1 year ago

> Can't fault them too badly

The same company that touts to the world its super hyper advanced AI tool that can do everyone's (except the C-level's, apparently) jobs can't figure out how to make a functional cron job happen? And we're giving them a pass, despite the bajillions of dollars that M$ and VCs are funneling their way?

Quite interesting they wouldn't just throw the "proven to be AGI cause it passes some IQ tests sometimes" tooling at it and be done with it.

cbeach|1 year ago

Agreed on date/time being a frustrating area of software development.

But wouldn't a company like OpenAI use a tick-based system in this architecture? i.e. there's an event emitter that ticks every second (or maybe minute), and consumers that operate based on these events in realtime? Obviously things get complicated due to the time consumed by inference models, but if OpenAI knows the task upfront it could make an allowance for the inference time?

If the logic is event driven and deterministic, it's easy to test and debug, right?
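To make that concrete, here is a rough sketch of the tick-driven idea. Nothing here reflects OpenAI's actual architecture; `Task`, `Scheduler`, `on_tick`, and the `lead` allowance are all hypothetical names. The clock is passed in as an argument, so a test can replay any sequence of ticks and get the same result every time.

```python
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone
from typing import List


@dataclass
class Task:
    name: str
    due: datetime                   # UTC time the result should be delivered
    lead: timedelta = timedelta()   # assumed allowance for inference time
    fired: bool = False


@dataclass
class Scheduler:
    tasks: List[Task] = field(default_factory=list)

    def on_tick(self, now: datetime) -> List[str]:
        """Return names of tasks whose start time (due - lead) has arrived.

        Each task fires at most once; `now` is injected rather than read
        from the system clock, which keeps the logic deterministic.
        """
        started = []
        for task in self.tasks:
            if not task.fired and now >= task.due - task.lead:
                task.fired = True
                started.append(task.name)
        return started


# Replay three ticks against one task due at 12:00 UTC with a 30 s allowance.
noon = datetime(2025, 1, 15, 12, 0, tzinfo=timezone.utc)
sched = Scheduler([Task("daily-summary", due=noon, lead=timedelta(seconds=30))])

early = sched.on_tick(noon - timedelta(seconds=60))    # too early, nothing starts
on_time = sched.on_tick(noon - timedelta(seconds=30))  # start time reached
repeat = sched.on_tick(noon)                           # already fired, no duplicate
print(early, on_time, repeat)
```

In a real system the hard parts sit outside this loop (missed ticks, retries, delivery of the push notification), but the due-time decision itself is easy to test this way.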

dmadisetti|1 year ago

Yeah, they're not exactly a scrappy startup; I'd be surprised if they had 0 QA.

Makes me wonder if they have "press releases / Q" as an internal metric to keep up the hype.