amthewiz's comments

amthewiz | 4 months ago | on: Show HN: Realizing Karpathy's dream of Natural Language Programming

It has puzzled me why no one had already done this, given how good LLMs are at language now.

The short answer is probably that it is hard to get this to actually work. There were many open questions that one has to tackle simultaneously -

- What is the right balance between relying on LLMs to do the right thing vs. the runtime around them? For example, I went back and forth a few times on having LLMs manage the call stack as one playbook calls another. I finally decided that it is most reliable to let the runtime take care of that.

- Context engineering - What to put in the prompt and in what order, how to represent state, how to handle artifacts, how to use the LLM prompt cache optimally as context grows, how to "unwind" context as playbook calls return, how to compact specific types of information, how to make sure important context isn't lost, etc.

- LLMs today have vastly different capabilities than they did 2 years ago. I have had to rewrite the whole stack from scratch 4 times to adjust. It wasn't fun, but it had to be done.

- Language(s): How to represent the pseudocode so that it is both fluid natural language and a capable programming language? How to transform it so that it can be executed reliably through LLMs? How to NOT lose the flexibility and fluidity in the process (e.g., it is easy to convert to a graph like LangGraph, but then you are stuck with that control flow), how to create a semantic compiler for that transition, and what primitives to use for the compiled language, which I call Playbooks Assembly Language [1].

- Agents and multi-agent system considerations - How to represent agents, and how should they communicate? Agents are classes, and they expose public playbooks that other agents can call. Agents can send natural language messages to each other and engage in conversations. Agents can call multi-party meetings. How can the behavior across all these interaction patterns be defined so that it remains intuitive? For example, the lifetime of a meeting is tied to its "meeting: true" playbook: an agent simply returns from that playbook to exit a meeting, and the meeting itself ends when the host returns from its meeting playbook.

- Which LLMs to support? Go for "bring your own LLM" or restrict the set? If restricted, which ones? LLM selection impacts how all the internal prompts are implemented, so prompt building had to happen in tandem with LLM selection.

It felt like playing an N-dimensional game of chess!
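The runtime-managed call stack from the first point can be sketched roughly like this. This is a toy illustration, not the actual Playbooks runtime; all class and method names here are made up for the sketch:

```python
from dataclasses import dataclass, field


@dataclass
class Frame:
    """One playbook invocation. The runtime, not the LLM, tracks these."""
    playbook: str
    locals: dict = field(default_factory=dict)


class Runtime:
    """Minimal sketch: the runtime owns the call stack, so the LLM only
    ever reasons about the frames the runtime chooses to expose."""

    def __init__(self):
        self.stack: list[Frame] = []

    def call(self, playbook: str, **kwargs):
        self.stack.append(Frame(playbook, dict(kwargs)))

    def ret(self, value=None):
        frame = self.stack.pop()  # context for this frame is "unwound" here
        return frame.playbook, value

    def current_context(self) -> list[str]:
        # Only frames still on the stack contribute to the next prompt.
        return [f.playbook for f in self.stack]


rt = Runtime()
rt.call("Main")
rt.call("GetCountryFact", country="Japan")
assert rt.current_context() == ["Main", "GetCountryFact"]
rt.ret("an unusual fact")
assert rt.current_context() == ["Main"]
```

The design point is that stack discipline is deterministic bookkeeping, exactly the kind of thing an LLM will occasionally get wrong and a runtime never will.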

[1] https://playbooks-ai.github.io/playbooks-docs/reference/play...

amthewiz | 4 months ago | on: Show HN: Realizing Karpathy's dream of Natural Language Programming

Andrej Karpathy posted in early 2023 (https://x.com/karpathy/status/1617979122625712128) -

> "The hottest new programming language is English"

I've built a Natural Language Programming stack for building AI Agents. I think it is the first true Software 3.0 stack.

The core idea: Use LLMs as CPUs! You can finally step-debug through your prompts and get reliable, verifiable execution. The stack includes a new language, a compiler, and developer tooling such as a VS Code extension.

Programs are written as markdown. H1 headings are agents; H2 headings are playbooks (i.e., functions), which can be natural language or Python. All playbooks in an agent run on the same call stack, and NL and Python playbooks can call each other.
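The H1-agent / H2-playbook mapping can be illustrated with a toy parser. This is a sketch of the structural idea only, not the real Playbooks compiler; the function name is made up:

```python
import re


def parse_program(markdown: str) -> dict:
    """Toy sketch: map each H1 heading to an agent and each H2 heading
    under it to one of that agent's playbooks."""
    agents: dict[str, list[str]] = {}
    current = None
    for line in markdown.splitlines():
        if m := re.match(r"^# (.+)", line):          # H1 -> new agent
            current = m.group(1).strip()
            agents[current] = []
        elif (m := re.match(r"^## (.+)", line)) and current:
            agents[current].append(m.group(1).strip())  # H2 -> playbook
    return agents


program = """\
# Country facts agent
## Main
## GetCountryFact($country)
"""
assert parse_program(program) == {
    "Country facts agent": ["Main", "GetCountryFact($country)"]
}
```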

Quick intro video: https://www.youtube.com/watch?v=ZX2L453km6s

Github: https://github.com/playbooks-ai/playbooks (MIT license)

Documentation: https://playbooks-ai.github.io/playbooks-docs/getting-starte...

Project website: runplaybooks.ai

Example Playbooks program -

    # Country facts agent
    This agent prints interesting facts about nearby countries

    ## Main
    ### Triggers
    - At the beginning
    ### Steps
    - Ask user what $country they are from
    - If user did not provide a country, engage in a conversation and gently nudge them to provide a country
    - List 5 $countries near $country
    - Tell the user the nearby $countries
    - Inform the user that you will now tell them some interesting facts about each of the countries
    - process_countries($countries)
    - End program

    ```python
    from typing import List

    @playbook
    async def process_countries(countries: List[str]):
        for country in countries:
            # Calls the natural language playbook 'GetCountryFact' for each country
            fact = await GetCountryFact(country)
            await Say("user", f"{country}: {fact}")
    ```

    ## GetCountryFact($country)
    ### Steps
    - Return an unusual historical fact about $country

There are a bunch of very interesting capabilities. A quick sample -

- "Queue calls to extract table of contents for each candidate file" - effortless calling of MCP tools, multi-threading, artifact management, context management

- "Ask Accountant what the tax rate would be" is how you communicate with other agents

- you can mix procedural natural language playbooks, ReAct playbooks, Raw prompt playbooks, Python playbooks and external playbooks like MCP tools seamlessly on the same call stack

- "Have a meeting with Chef, Marketing expert and the user to design a new menu" is how you can spawn multi-agent workflows, where each agent follows their own playbook for the meeting

- Coming soon: Observer agents (agents observing other agents - automated memory storage, verify/certify execution, steer observed agents), dynamic playbook generation for procedural memory, etc.

I hope this changes how we build AI agents for the better. Looking forward to the discussion - I'll be in the comments.

amthewiz | 2 years ago | on: GPTs: Custom versions of ChatGPT

In the blog, they say

> we’re launching the GPT Store, featuring creations by verified builders

Any idea who these “verified” builders are, how they are selected and how to submit a GPT that gets selected in the rollout?

amthewiz | 6 years ago | on: It is perfectly OK to only code at work, you can have a life too

The field of software engineering changes rapidly and is complex to master. To remain good at the job, one must continue to explore and learn. The only way that is going to happen is if you are genuinely interested in the field. The best indicator of that is that you have pet projects outside of work.

amthewiz | 6 years ago | on: Stop Blaming America’s Poor for Their Poverty

> So Japanese people are doing everything right -- eschewing violence, avoiding drugs, working hard and not having kids out of wedlock. They are following the conservative prescription, as well as or better than any other developed country in the world. And yet still, many of them are poor. This suggests that there is something very wrong with the conservative theory of poverty.

No, the author is confused. There is a big difference between -

- you cannot be poor if you make good choices, and

- you can get out of poverty by making good choices

The conservative position is the latter. There will be cases where it doesn't work, but it is a sound recommendation in general.

Now, poverty does make it harder to make good choices. So welfare $s should be spent to make that easier, e.g., free meals at school help kids do better in school.

amthewiz | 6 years ago | on: Reducing Wasted Food at Home

People go their entire lives without knowing what it is like to go without food for a day. They take it for granted.

Also, I believe parents generally don't teach kids to respect food. It probably starts with letting a toddler feed him/herself too early, normalizing food wastage in the process.

There is also a first-world etiquette of leaving some food on the plate. I have seen people do it habitually.

amthewiz | 6 years ago | on: Self-Supervised Learning [pdf]

The DNN-based techniques are new, but the concept of models that can fill in the blanks is old. It used to be called content-addressable memory or autoassociative memory.
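A classic example of an autoassociative memory is a Hopfield network: store patterns as Hebbian outer products, then "fill in the blanks" by iterating the update s &lt;- sign(W s) from a corrupted cue. A minimal sketch (my own illustration, not from the linked slides):

```python
import numpy as np


def store(patterns: np.ndarray) -> np.ndarray:
    """Hebbian learning: W is the sum of outer products of the patterns."""
    n = patterns.shape[1]
    W = sum(np.outer(p, p) for p in patterns).astype(float)
    np.fill_diagonal(W, 0)  # no self-connections
    return W / n


def recall(W: np.ndarray, cue: np.ndarray, steps: int = 10) -> np.ndarray:
    """Iterate s <- sign(W s) to settle into the nearest stored pattern."""
    s = cue.copy()
    for _ in range(steps):
        s = np.where(W @ s >= 0, 1, -1)  # synchronous sign update
    return s


pattern = np.array([1, -1, 1, -1, 1, -1, 1, -1])
W = store(pattern[None, :])
corrupted = pattern.copy()
corrupted[0] *= -1  # flip one bit: a "blank" to fill in
assert np.array_equal(recall(W, corrupted), pattern)
```

The corrupted cue converges back to the stored pattern, which is exactly the fill-in-the-blanks behavior the comment refers to.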

amthewiz | 7 years ago | on: Eliminating Robocalls

I'd love to take the robo-callers on a long winding road to nowhere. Is there a conversational AI robo-answering app based on something like Google Duplex? If not, should I build one?

amthewiz | 7 years ago | on: SETI spots dozens of new mysterious signals emanating from distant galaxy

An advanced civilization, when looking to signal other potential civilizations, would hopefully -

- use the lowest common denominator long range communication medium (like EM waves), not the most advanced technology available to them, and

- broadcast it widely in the spatial, temporal and spectral dimensions and not worry about efficiency

amthewiz | 7 years ago | on: The 10:1 rule of writing and programming

To go even further on that point - the chart itself seems overly optimistic. +-40% of estimate after product design? +-10% of estimate after design spec?! Countless times I have seen detailed design specs thrown completely out of whack by new engineering, process or business revelations/insights. Human thought is iterative - most people can imagine at most 80% of the future product/design possibilities.

amthewiz | 7 years ago | on: To Remember, the Brain Must Actively Forget

Current neuroscience suggests that the brain operates in broadly two regimes - learning during wakefulness and model selection during sleep. Both activities require "forgetting" of specific kinds; in particular, model selection involves pruning synapses. The goal is to learn a model of the world at the right level of detail - what AI folks call avoiding overfitting.

amthewiz | 7 years ago | on: Free energy principle

As I understand it, the basic idea is to minimize surprise over the organism's life, using models of the world to predict the causes behind what the organism observes, and actions to keep the organism within states suitable for survival.
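For reference, the standard decomposition behind this idea (my notation, not from the article) shows why minimizing variational free energy F bounds surprise:

```latex
% q(s): the organism's approximate beliefs about hidden states s
% p(s, o): its generative model of states s and observations o
F = \mathbb{E}_{q(s)}\big[\ln q(s) - \ln p(s, o)\big]
  = \underbrace{-\ln p(o)}_{\text{surprise}}
  + \underbrace{D_{\mathrm{KL}}\big[q(s)\,\Vert\,p(s \mid o)\big]}_{\ge 0}
```

Since the KL term is non-negative, F is an upper bound on surprise, so the organism can keep surprise low either by updating beliefs q (perception) or by acting to change its observations o (action).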