timini
|
28 days ago
|
on: Ex-GitHub CEO launches a new developer platform for AI agents
clearly this is a play to own all the agent traces for further training
timini
|
28 days ago
|
on: Ex-GitHub CEO launches a new developer platform for AI agents
clearly this is a play to own all the agentic traces for further training
timini
|
4 months ago
|
on: Evaluating Control Protocols for Untrusted AI Agents
This paper evaluates three control strategies for untrusted agents: deferral to trusted models, resampling, and critical action deferral. Initial testing showed resampling and critical action deferral achieving 96% safety. However, adversarial testing revealed resampling crashes to 17% safety when attackers can detect resampling or simulate monitors, while critical action deferral remained robust against all attack strategies.
timini
|
4 months ago
|
on: HaluMem: Evaluating Hallucinations in Memory Systems of Agents
HaluMem introduces the first benchmark for evaluating hallucinations in agent memory systems at the operation level. Through three evaluation tasks (memory extraction, updating, and question answering), it reveals that existing memory systems generate and accumulate hallucinations during early stages, which then propagate errors downstream. The benchmark uses two datasets spanning different context scales to systematically reveal these failure modes.
timini
|
4 months ago
|
on: The OpenHands Software Agent SDK: Composable and Extensible
OpenHands SDK provides a complete architectural redesign for building production software development agents. It balances simplicity (few lines of code for basic agents) with extensibility (custom tools, memory management) while delivering seamless local-to-remote execution, integrated security, and connections to various interfaces (VS Code, command line, APIs).
timini
|
4 months ago
|
on: Ask HN: Are there AI agents that learn from experience?
timini
|
4 months ago
|
on: Can Agentic AI workflows create good content?
judging by this article, no
timini
|
4 months ago
|
on: Dynamic Tool Allocation for AI Agents (The Rats Pattern)
TL;DR
Problem: "Tool overload" is a critical bottleneck for AI agents. Providing an LLM with a large, static list of tools bloats the context window, degrading performance, increasing costs, and reducing accuracy.
Solution: Implement a "select, then execute" architectural pattern. Use a lightweight "router" agent to first retrieve a small, relevant subset of tools for a specific task. Then, a more capable "specialist" agent uses that curated set to execute the request.
Benefits: Lower latency and cost (fewer tokens), higher tool-selection precision, a scalable architecture for large tool catalogs, and improved reliability.
Pattern: This pattern is a form of Retrieval-Augmented Generation (RAG) applied to tools, often called Retrieval-Augmented Tool Selection (RATS). It can be combined with State-Based Gating for even greater precision.
How: This post provides a complete, production-aware implementation using Google's Agent Development Kit (ADK).
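A minimal sketch of the "select, then execute" idea in plain Python, not the ADK implementation the post refers to. Everything here is a hypothetical stand-in: the `Tool` class, the catalog entries, and the keyword-overlap scoring in `select_tools` (a real router would more likely use embeddings or a small LLM), and `execute` simply runs the top-ranked tool where a real specialist agent would be an LLM that sees only the selected subset.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class Tool:
    # Hypothetical tool record: a name, a description used for retrieval,
    # and a callable that does the work.
    name: str
    description: str
    run: Callable[[str], str]


def _tokens(text: str) -> set[str]:
    # Lowercase words with surrounding punctuation stripped.
    return {w.strip(".,!?").lower() for w in text.split() if w.strip(".,!?")}


def select_tools(query: str, catalog: list[Tool], k: int = 2) -> list[Tool]:
    # "Router" step: rank tools by word overlap between the query and each
    # description, drop tools with no overlap, and keep the top k.
    # Assumption: keyword overlap stands in for a real retriever.
    q = _tokens(query)
    scored = [(len(q & _tokens(t.description)), t) for t in catalog]
    ranked = sorted((s for s in scored if s[0] > 0), key=lambda s: s[0], reverse=True)
    return [t for _, t in ranked[:k]]


def execute(query: str, tools: list[Tool]) -> str:
    # "Specialist" step: only sees the curated subset. Here it just runs
    # the best-ranked tool; a real agent would let an LLM choose.
    if not tools:
        return "no relevant tool found"
    return tools[0].run(query)


# Hypothetical catalog with stubbed tool bodies.
CATALOG = [
    Tool("get_weather", "look up the current weather forecast for a city",
         lambda q: "weather: (stub)"),
    Tool("convert_currency", "convert an amount of money between currency codes",
         lambda q: "currency: (stub)"),
    Tool("search_flights", "search flights between two airports on a date",
         lambda q: "flights: (stub)"),
]

if __name__ == "__main__":
    query = "what is the weather forecast in Paris"
    subset = select_tools(query, CATALOG)
    print([t.name for t in subset])
    print(execute(query, subset))
```

The point of the split is that the specialist's prompt only needs to carry the descriptions of the selected tools, so its context size stays roughly constant as the catalog grows.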
timini
|
2 years ago
|
on: Hallucination is inevitable: An innate limitation of large language models
I think it's fairly simple: it needs a certain level of proof, e.g. references to authoritative sources; if not, say "I don't know".
timini
|
2 years ago
|
on: Animated Drawings
did you get paid to make this?
timini
|
3 years ago
|
on: Why we added package.json support to Deno
The biggest issue with Node.js was the terrible package manager. Looks like Deno is gonna make all the same mistakes.
timini
|
3 years ago
|
on: ChatCoach – GTP-4 enabled life coaching
Just threw this together, trying to get some feedback on what I can do to improve / add more value?
timini
|
3 years ago
|
on: CoachChat – A GPT-4 Career Coaching Chatbot
Just threw this together last night with the GPT-4 API. Any feedback / suggestions greatly appreciated!
timini
|
3 years ago
|
on: GPT-4 CareerCoaching Chatbot
Just threw together this chatbot for life coaching / career coaching
timini
|
3 years ago
|
on: Bard and new AI features in Search
Are you sure that Google won't provide the link? If these chatbots could provide references for their answers, they could link back to websites, solving lots of the problems mentioned here.
timini
|
3 years ago
|
on: Bard and new AI features in Search
^ This.
We will all become servants to the giant AI, feeding it more and more levels of detail and obscurity.
timini
|
3 years ago
|
on: UK's largest tunnel cutting head lowered into its London shaft
Anyone know if it's cheaper to build a bridge or a tunnel across the Thames?
timini
|
3 years ago
|
on: UK's largest tunnel cutting head lowered into its London shaft
This is an excellent idea TBH
timini
|
4 years ago
|
on: Ask HN: Are most of us developers lying about how much work we do?
What practical things can I do to get better at my job? I've always been a procrastinator, but since the pandemic I've become like OP: very little work on the job, whole weeks where I don't do anything. It's a double-edged sword: I have rekindled some old hobbies, but I feel a lot of guilt about not being good at my job. I just about get by at work; some people love my work, some people hate working with me.
I would love to learn from successful people like you. Is there anything you can recommend reading, or any course to take?
I'm in the middle of my life and feeling stuck and confused about working in tech. What's the remedy?
timini
|
4 years ago
|
on: Ask HN: Are most of us developers lying about how much work we do?
I really think this post is onto something; many people are sharing similar experiences. Is this something to do with working remotely?
Does anyone have any book / podcast recommendations that can help understand this dilemma?