I don't know how much this API churn is going to help developers who are trying to integrate OAI into real, non-wrapper products. Every vendor-managed state machine that handles conversations, messages, prompt hand-off, etc., has ultimately proven inadequate, presumptuous, or distracting for my use cases.
At the end of the day, all I ever seem to use is the chat completion API with structured outputs turned on. Despite my "basic" usage, I am employing tool use, recursive conversations, RAG, etc. I don't see the value in outsourcing state management of my "agent" to a 3rd party. I have way more autonomy if I keep things like this local.
The entire premise of these products is that you are feeding a string literal into some black box and it gives you a new string. Hopefully, as JSON or whatever you requested. If you focus just on the idea of composing the appropriate string each time, everything else melts away. This is the only grain that really matters. Think about other ways in which we compose highly-structured strings based upon business state stored in a database. It's literally the exact same thing you do when you SSR a webpage with PHP. The only real difference is how it is served.
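The SSR analogy can be sketched in a few lines. Everything here except the request shape is hypothetical (the `fetch_account` helper and the prompt wording are made up for illustration); the payload follows the standard chat-completions format with structured JSON output requested:

```python
import json

def fetch_account(account_id):
    # Stand-in for a real database read of business state.
    return {"id": account_id, "plan": "pro", "open_tickets": 3}

def build_request(account_id):
    state = fetch_account(account_id)
    # Same idea as server-side rendering: template business state into a string.
    prompt = (
        f"Account {state['id']} is on the {state['plan']} plan "
        f"with {state['open_tickets']} open tickets.\n"
        "Summarize the account status as JSON with keys 'plan' and 'risk'."
    )
    return {
        "model": "gpt-4o",
        "messages": [{"role": "user", "content": prompt}],
        "response_format": {"type": "json_object"},
    }

req = build_request("acct_42")
print(json.dumps(req, indent=2))
```

The string in, string out framing holds: everything the model needs is composed locally from state you already own, and the response is just another string to parse.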
I haven't really found any agent framework that gives me anything I need above a simple structured gen call.
As you say, most requests to LLMs are (should be?) prompt-in structure-out, in line with the Unix philosophy of doing precisely one thing well.
Agent frameworks are simply too early. They are layers built to abstract a set of design patterns that are not common. We should only build abstractions when it is obvious that everyone is reinventing the wheel.
In the case of agents, there is no wheel to invent. It's all simple language model calls.
I commonly use the phrase "the language model should be the most boring part of your code". You should be spending most of your time building the actual software and tooling -- LLMs are a small component of your software. Agent frameworks often make the language model too large a character in your codebase, at least for my tastes.
I mirror this sentiment. Even their "function calling" abstraction still hallucinates parameters and schema, and the JSON schema itself is clearly way too verbose and breaks down completely if you feed it anything more complex than 5 very simple function calls. This just seems to build upon their already broken black box abstractions and isn't useful for any real world applications, but it's helpful for getting small proof-of-concept apps going, I guess...
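One mitigation for hallucinated parameters is to validate the model's proposed arguments against the declared schema before executing anything. A minimal hand-rolled sketch (a real app might reach for the jsonschema library instead; the example schema is hypothetical):

```python
def validate_args(schema, args):
    """Return None if args satisfy the schema, else a description of the problem."""
    props = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in args:
            return f"missing required argument {key!r}"
    for key, value in args.items():
        if key not in props:
            return f"hallucinated argument {key!r}"
        expected = props[key].get("type")
        if expected == "string" and not isinstance(value, str):
            return f"argument {key!r} should be a string"
    return None

schema = {
    "type": "object",
    "properties": {"city": {"type": "string"}},
    "required": ["city"],
}

assert validate_args(schema, {"city": "Oslo"}) is None
assert validate_args(schema, {"town": "Oslo"}) is not None  # invented key
```

Rejecting bad calls cheaply and re-prompting tends to be more robust than trusting the model's first attempt at the schema.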
Exactly. You would have to be naive to build a company on top of this kind of API. LLMs are going to become commodities, and this is OpenAI fighting against that fate, as their valuation and continued investment requirements don't make sense otherwise.
If you built on the Assistant API, maybe take the hint and don't just rewrite to the Responses API? Own your product, black box the LLM-of-the-day.
This bit feels like we are being pushed away from the existing API for non-technical reasons?
> When using Chat Completions, the model always retrieves information from the web before responding to your query. To use web_search_preview as a tool that models like gpt-4o and gpt-4o-mini invoke only when necessary, switch to using the Responses API.
Porting over to the new Responses API is non-trivial, and we already have history, RAG, and the other things an assistant needs.
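For concreteness, here is roughly how the two request shapes in that quote differ, as payloads only, with no network call. The field names follow my reading of the docs and should be treated as assumptions:

```python
# Chat Completions: search behavior is baked into a dedicated search model,
# so every request goes through retrieval.
chat_payload = {
    "model": "gpt-4o-search-preview",
    "messages": [{"role": "user", "content": "What changed in the latest release notes?"}],
}

# Responses API: web search is a tool the model may invoke only when needed.
responses_payload = {
    "model": "gpt-4o",
    "input": "What changed in the latest release notes?",
    "tools": [{"type": "web_search_preview"}],
}
```

The migration cost isn't just renaming `messages` to `input`; tool handling, stored state, and response parsing all differ, which is why the port is non-trivial.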
Couldn’t have said it better. I’ve developed multiple agents with just function calling and structured outputs, and they have been in production for more than a year (back in the day we did not call them agents lol).
I think this is targeted towards people who are already using agent frameworks + the OpenAI API.
“These new tools streamline core agent logic, orchestration, and interactions, making it significantly easier for developers to get started with building agents”
Sounds exactly like “the cloud”, especially AWS. Basically “get married to our platform, build on top of it, and make it hard to leave.” The benefit is that it’s easy to get started, and that they invested in the infrastructure. But now they are trying to lock you in by storing as much state and data as possible with them, without an easy way to migrate; in other words, increase your switching costs. For social networks the benefit was the network effect, but that doesn’t apply here.
outsourcing state to openai & co is great for them as vendor lock-in. the real money in AI will be business- and user-facing tools built on top of the vendors, and it would be a terrible business decision not to abstract the model provider away in the background and keep all private data under your own domain, also from a data protection / legal point of view
i can understand them trying to prevent their business from becoming a commodity but i don't see that working out for them except for some short term buzz, but others will run with their ideas in domain specific applications
Speaking of string literals, I hate that the state of the art nowadays is to force you to format model inputs as a conversation with separate messages. Ever since OpenAI did that and discontinued the regular completions API it became nearly useless for me since I don't use LLMs for conversation. And because OpenAI is the Apple of LLMs, everyone else is copying that worthless chat messages abstraction and not providing normal completions as they should.
Don't get me wrong, chat completions are nice to have for certain use cases, but that being the only option makes me practically unable to use the model.
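A common workaround is a thin shim that treats the chat framing as a transport detail: pack the raw text into a single user message and pull the text back out of the reply. The payload and response shapes mirror the standard chat-completions format; no network call is made here:

```python
def completion_request(text, model="gpt-4o-mini"):
    # Wrap a raw completion-style prompt as a one-message chat request.
    return {
        "model": model,
        "messages": [{"role": "user", "content": text}],
    }

def extract_text(chat_response):
    # chat_response mirrors the chat-completions response shape.
    return chat_response["choices"][0]["message"]["content"]

req = completion_request("Once upon a time,")
```

It works for instruction-shaped tasks, but it cannot recover true continuation behavior: the chat template and post-training still sit between you and the base model, which is the complaint above.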
I feel like all those AI agent attempts are misguided at their core because they don't attempt to create new ways of working but to replace humans in legacy systems. This is fundamentally shortsighted because the economy, life, and everything else is about humans interacting with humans.
The current AI agent approach appears to be permutations of the joke about how people will use AI to expand their one sentence into a nice long e-mail, and the AI on the receiving end will summarize that long e-mail back into a single sentence.
I get that there's a use case for automating tasks on legacy systems, but IMHO the real opportunity is to remove most of the legacy systems.
Humans are not that bad, you know? Is creating UIs for humans using AI, then making AI use those UIs to do stuff, really the way forward?
I think the most valuable path for the current generation of AI models is integrating them with the configuration and administration side of the product.
For example, as a supplemental user experience that power users in your org can leverage to macro out client configuration and project management tasks in a B2B SaaS ecosystem. Tool use can be very reliable when you have a well constrained set of abstractions, contexts and users to work with.
100%, but this is not the same thing, nor is it going to replace the agent SDK (or vice versa). Agents will always need some form of communication protocol; if we look at the world of agentic frameworks, it's a sea of logos, and without some form of open standards this would be hard.
I'm currently at Comet, and I have personally worked on MCP implementations AND made some contributions to the Agents SDK in the form of a native integration and improvements to the test suite.
I think the key to what OpenAI is pushing towards is simplicity for developers through very easy to use components. I won't comment on the strategy or pricing etc, but on first glance as a developer the simple modular approach and lack of bloat in their SDK is refreshing.
Kudos to the team and people working on the edge to innovate and think differently in an already crowded and shifting landscape.
main fun part - since responses are stored for free by default now, how can we abuse the Responses API as a database :)
other fun qtns that a HN crew might enjoy:
- hparams for websearch - depth/breadth of search for making your own DIY Deep Research
- now that OAI is offering RAG/reranking out of the box as part of the Responses API, when should you build your own RAG? (i basically think somebody needs to benchmark the RAG capabilities of the Files API now, because the community impression has not really updated from back when Assistants API was first launched)
- whats the diff between Agents SDK and OAI Swarm? (basically types, tracing, pluggable LLMs)
- will the `search-preview` and `computer-use-preview` finetunes be merged into GPT5?
They did not announce the price(s) in the presentation. Likely because they know it is going to be very expensive:
Web Search [0]
* $30 and $25 per 1K queries for GPT‑4o search and 4o-mini search.
File search [1]
* $2.50 per 1K queries and file storage at $0.10/GB/day
* First 1GB is free.
Computer use tool (computer-use-preview model) [2]
* $3 per 1M input tokens and $12/1M output tokens.
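At those rates, a quick back-of-envelope check (figures taken from the list above):

```python
# Cost of 10K queries per built-in tool, plus a month of file storage.
searches = 10_000
gpt4o_search_cost = searches / 1_000 * 30   # $30 per 1K queries
file_search_cost = searches / 1_000 * 2.50  # $2.50 per 1K queries
storage_cost = 50 * 0.10 * 30               # 50 GB for 30 days at $0.10/GB/day
print(gpt4o_search_cost, file_search_cost, storage_cost)
```

Ten thousand GPT-4o searches already costs $300, and long-lived file storage compounds daily, so the per-call prices matter a lot more here than the familiar per-token ones.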
So they're basically pivoting from selling text by the ounce to selling web searches and cloud storage? I like it, it's a bold move. When the slow people at Google finally catch up it might be too late for Google?
> “we plan to formally announce the deprecation of the Assistants API with a target sunset date in mid-2026.”
The new Responses API is a step in the right direction, especially with the built-in “handoff” functionality.
For agentic use cases, the new API still feels a bit limited, as there’s a lack of formal “guardrails”/state machine logic built in.
> “Our goal is to give developers a seamless platform experience for building agents”
It will be interesting to see how they move towards this platform, my guess is that we’ll see a graph-based control flow in the coming months.
Now there are countless open-source solutions for this, but most of them fall short and/or add unnecessary obfuscation/complexity.
We’ve been able to build our agentic flows using a combination of tool calling and JSON responses, but there’s still a missing higher order component that no one seems to have cracked yet.
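One shape that missing guardrails layer could take is an explicit state machine that gates which tools the agent may call in each state. A sketch only; all state, event, and tool names here are hypothetical:

```python
# Which tools an agent may call in each state.
ALLOWED_TOOLS = {
    "triage": {"classify_request"},
    "research": {"web_search", "file_search"},
    "respond": set(),  # terminal state: no tools, just produce the answer
}

# (state, event) -> next state; unknown events leave the state unchanged.
TRANSITIONS = {
    ("triage", "needs_info"): "research",
    ("triage", "ready"): "respond",
    ("research", "done"): "respond",
}

def step(state, event, requested_tool=None):
    if requested_tool is not None and requested_tool not in ALLOWED_TOOLS[state]:
        raise ValueError(f"tool {requested_tool!r} not allowed in state {state!r}")
    return TRANSITIONS.get((state, event), state)

state = step("triage", "needs_info")       # move to research
state = step(state, "done", "web_search")  # tool is allowed in research
```

The point is that tool-call validity becomes a property of the control flow you own, rather than something the model is trusted to respect.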
I'm impressed by the advances in Computer Use mentioned here and this got me wondering - is this already mature enough to be utilized for usability testing? Would I be right to assume that in general, a UI that is more difficult for AI to navigate is likely to also be relatively difficult for humans, and that it's a signal that it should be simplified/improved in some way?
This is one of the few agent abstractions I've seen that actually seems intuitive. Props to the OpenAI team, seems like it'll kill a lot of bad startups.
I was fortunate to get early access to the new Agent SDK and APIs that OpenAI dropped today and made an open source project to show some of the capabilities [1]. If you are using any of the other agent frameworks like LangGraph/LangChain, AutoGen, Crew, etc I definitely suggest giving this agent SDK a spin.
To ease into it, I added the entire SDK with examples and full documentation as a single text file in my repo [2] so you can quickly get up to speed by adding it to a prompt and just asking about it, or by getting some quick-start code to play around with.
The code in my repo is very modular so you can try implementing any module using one of the other frameworks to do a head-to-head.
Here’s a blog post with some more thoughts on this SDK [3] and some of its major capabilities.
Nice to finally see one of the labs throwing weight behind a much-needed simple abstraction. It's clear they learned from the incumbents (langchain et al): don't sell complexity.
Also very nice of them to include extensible tracing. The AgentOps integration is a nice touch for getting behind the scenes to understand how handoffs and tool calls are triggered.
I don't know about agents, but this finally adds the ability to search the web to the API. This is a very useful big deal.
Kind of annoying that they've made a bunch of tiny changes to the history format though. It doesn't seem to change anything important, and only serves to make existing code incompatible.
A lot of the criticism here is about the potential for vendor lock-in, etc., but I think this is great, especially for building proofs of concept and small projects. As they said, these are the first building blocks, and they look great to me.
When gpt3.5 came out these are literally the first things I built manually. I mainly use LLMs through a telegram bot. I know there are a lot of tools and frameworks out there but I wrote a few hundred lines of hacky python to give my bot memory, web search, image analysis. It's fun and useful and I agree that these are the basic building blocks that many apps need.
Sure, you can find better stuff elsewhere with less lock-in and more control, but now it "just works", and this Responses API is cleaner and more powerful than the Chat Completions one, so personally I'm happy to give OpenAI credit for this. I just don't know why they couldn't have released it two years ago.
Feels like OpenAI really wants to compete with its own ecosystem. I guess they are doing this to try to position themselves as the standard web index that everyone uses, the standard RAG service, etc.
But they could just make great services and live in the infra layer instead of trying to squeeze everyone out at the application layer. Seems unnecessarily ecosystem-hostile
ripped_britches | 1 year ago
Don’t be fooled into moving state management somewhere other than your business logic unless it enables a novel use case (which these SDKs do not).
With that said, glad to see the agentic endpoints available, but I'm still going to be managing my state myself.
danielmarkbruce | 1 year ago
But you can't expect them not to try.
samstave | 1 year ago
So, are people forming relationships with OAI that include an SLA, and if so, what do those look like?
simonw | 1 year ago
Here's the alternative link for people who aren't signed in to Twitter: https://nitter.net/athyuttamre/status/1899541471532867821
NitpickLawyer | 1 year ago
How many hand-crafted clay bowls, baked in a human-powered kiln, do you use every day? Or how many woven baskets made from hand-picked sticks?
History has shown that anything that can be automated will be automated, and anything that can be made "cheaper" or "faster" will be as well.
koconder | 1 year ago
- https://github.com/comet-ml/opik-mcp
- https://github.com/openai/openai-agents-python/pull/91
Our recent integration shipped on day 1:
- https://www.comet.com/docs/opik/tracing/integrations/openai_...
nilslice | 1 year ago
but yes, it's the strongest anti-developer move to not directly support MCP. not surprised given OpenAI generally. but would be a very nice addition!
thenameless7741 | 1 year ago
> [Q] Does the Agents SDK support MCP connections? So can we easily give certain agents tools via MCP client server connections?
> [A] You're able to define any tools you want, so you could implement MCP tools via function calling
in short, we need to do some plumbing work.
relevant issue in the repo: https://github.com/openai/openai-agents-python/issues/23
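The plumbing is mostly mechanical: an MCP tool description already carries a JSON schema (`inputSchema` in the MCP spec), so it can be rewrapped as a function-calling tool definition. A sketch, where the example tool itself is hypothetical and only the outer wrapper follows the standard tool format:

```python
def mcp_tool_to_function(mcp_tool):
    # Rewrap an MCP-style tool description as a function-calling tool entry.
    return {
        "type": "function",
        "function": {
            "name": mcp_tool["name"],
            "description": mcp_tool.get("description", ""),
            "parameters": mcp_tool["inputSchema"],
        },
    }

tool = mcp_tool_to_function({
    "name": "query_docs",
    "description": "Search the project docs",
    "inputSchema": {
        "type": "object",
        "properties": {"query": {"type": "string"}},
        "required": ["query"],
    },
})
```

The remaining work is dispatch: when the model emits a call to `query_docs`, route the arguments to the MCP client and feed its result back as the tool output.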
knowaveragejoe | 1 year ago
https://github.com/SecretiveShell/MCP-Bridge
swyx | 1 year ago
https://latent.space/p/openai-agents-platform
rvz | 1 year ago
[1] https://platform.openai.com/docs/pricing#built-in-tools
[2] https://platform.openai.com/docs/pricing#latest-models
rudedogg | 1 year ago
I have a hard time seeing how this API is better than https://www.anthropic.com/news/model-context-protocol.
It seems like the motivation was "how can we make more money", rather than "how can we be more useful for our users".
_bramses | 1 year ago
I also wrote a script that searches the web and works pretty well (using the vercel ai sdk)[1]
[0] - https://brave.com/search/api/
[1] - https://gist.github.com/bramses/41e90b27d156590154bcefd4119f...
anorak27 | 1 year ago
https://github.com/Anilturaga/aiide
ilaksh | 1 year ago
BTW, I have something somewhat similar to parts of this (like Responses and File Search) in MindRoot, using the task API: https://github.com/runvnc/mindroot/blob/main/api.md
It could be combined with the query_kb tool from the mr_kb plugin (in my mr_kb repo), which is actually probably better than File Search because it allows searching multiple KBs.
Anyway, if anyone wants to help with my program, contribute a plugin via PR, or anything else, feel free to connect on GitHub, email, or Discord/Telegram (runvnc).
dazzaji | 1 year ago
I’m liking it. A lot!
[1] https://github.com/dazzaji/agento6
[2] https://raw.githubusercontent.com/dazzaji/agento6/refs/heads...
[3] https://www.dazzagreenwood.com/p/unleashing-creativity-with-...
tiniuclx | 1 year ago
I wonder what justifies this drastic difference in price.
[0] https://docs.perplexity.ai/guides/pricing