At this point, I am starting to feel like we don’t need new languages, but new ways to create specifications.
I have a hypothesis that an LLM can act as a pseudocode to code translator, where the pseudocode can tolerate a mixture of code-like and natural language specification. The benefit being that it formalizes the human as the specifier (which must be done anyway) and the llm as the code writer. This also might enable lower resource “non-frontier” models to be more useful. Additionally, it allows tolerance to syntax mistakes or in the worst case, natural language if needed.
In other words, I think llms don’t need new languages, we do.
What we need is a programming language that defines the diff to be applied upon the existing codebase to the same degree of unambiguity as the codebase itself.
That is, in the same way that event sourcing materializes a state from a series of change events, this language needs to materialize a codebase from a series of "modification instructions". Different models may materialize a different codebase using the same series of instructions (like compilers), or say different "environmental factors" (e.g. the database or cloud provider that's available). It's as if the codebase itself is no longer the important artifact, the sequence of prompts is. You would also use this sequence of prompts to generate a testing suite completely independent of the codebase.
- LLMs can act as pseudocode to code translators (they are excellent at this)
- LLMs still create bugs and make errors, and a reasonable hypothesis is at a rate in direct proportion to the "complexity" or "buggedness" of the underlying language.
In other words, give an AI a footgun and it will happily use it unawares. That doesn't mean however it can't rapidly turn your pseudocode into code.
None of this means that LLMs can magically correct your pseudocode at all times if your logic is vastly wrong for your goal, but I do believe they'll benefit immensely from new languages that reduce the kind of bugs they make.
This is the moment we can create these languages. Because LLMs can optimize for things that humans can't, so it seems possible to design new languages to reduce bugs in ways that work for LLMs, but are less effective for people (due to syntax, ergonomics, verbosity, anything else).
This is crucially important. Why? Because 99% of all code written in the next two decades will be written by AI. And we will also produce 100x more code than has ever been written before (because the cost of doing it, has dropped essentially to zero). This means that, short of some revolutions in language technology, the number of bugs and vulnerabilities we can expect will also 100x.
That's why ideas like this are needed.
I believe in this too and am working on something also targeting LLMs specifically, and have been working on it since Mid to Late November last year. A business model will make such a language sustainable.
This is something that could be distilled from some industries like aviation, where specification of software (requirements, architecture documents, etc.) is even more important that the software itself.
The problem is that natural language is in itself ambiguous, and people don't really grasp the importance of clear specification (how many times I have repeated to put units and tolerances to any limits they specify by requirements).
Another problem is: natural language doesn't have "defaults": if you don't specify something, is open to interpretation. And people _will_ interpret something instead of saying "yep I don't know this".
Thats again programming languages. Real issue with LLMs now is it doesn't matter if it can generate code quickly. Some one still has to read, verify and test it.
Perhaps we need a need a terse programming language. Which can be read quickly and verified. You could call that specification.
I think that we haven't even started to properly think about a higher-level spec language. The highest level objects would have to be the users and the value that they get out of running the software. Even specific features would have to be subservient to that, and would shift as the users requirements shift. Requirements written in a truly higher-level spec language would allow the software to change features without the spec itself changing.
This is where LLMs slip up. I need a higher-level spec language where I don't have to specify to an LLM that I want the jpeg crop to be lossless if possible. It's doubly obvious that I wouldn't want it to be lossy, especially because making it lossy likely makes the resulting files larger. This is not obvious to an LLM, but it's absolutely obvious if our objects are users and user value.
A truly higher-level spec language compiler would recognize when actual functionality disappeared when a feature was removed, and would weigh the value of that functionality within the value framework of the hypothetical user. It would be able to recognize the value of redundant functionality by putting a value on user accessibility - how many ways can the user reach that functionality? How does it advertise itself?
We still haven't even thought about it properly. It's that "software engineering" thing that we were in a continual argument about whether it existed or not.
I disagree I think we always need new languages. Every language over time becomes more and more unnecessarily complex.
It's just part of the software lifecycle. People think their job is to "write code" and that means everything becomes more and more features, more abstractions, more complex, more "five different ways to do one thing".
Many many examples, C++, Java esp circa 2000-2010 and on and on and on.
There's no hope for older languages. We need simpler languages.
we have some markup for architectures like - d2lang, sequencediagram.org's, bpmn.io xmls (which are OMG XMLs), so question is - can we master these, and not invent new stuf for a while?
p.s. a combination of the above fares very well during my agentic coding adventures.
This is the approach that Agint takes. We inference the structure of the code first top down as a graph, then add in types, then interpret the types as in out function signatures and then "inpaint" the functions for codegen.
I'm actually building this, will release it early next month. I've added a URL to watch to my profile (should be up later this week). It will be Open Source.
So in this case an LLM would just be a less-reliable compiler? What's the point? If you have to formally specify your program, we already have tools for that, no boiling-the-oceans required
Developed by Jordan Hubbard of NVIDIA (and FreeBSD).
My understanding/experience is that LLM performance in a language scales with how well the language is represented in the training data.
From that assumption, we might expect LLMs to actually do better with an existing language for which more training code is available, even if that language is more complex and seems like it should be “harder” to understand.
I don’t think that assumption holds. For example, only recently have agents started getting Rust code right on the first try, but that hasn’t mattered in the past because the rust compiler and linters give such good feedback that it immediately fixes whatever goof it made.
This does fill up context a little faster, (1) not as much as debugging the problem would have in a dynamic language, and (2) better agentic frameworks are coming that “rewrite” context history for dynamic on the fly context compression.
A lot of this depends on your workflow. A language with great typing, type checking and good compiler errors will work better in a loop than one with a large surface overhead and syntax complexity, even if it's well represented. This is the instinct behind, e.g. https://github.com/toon-format/toon, a json alternative format. They test LLM accuracy with the format against JSON, (and are generally slightly ahead of JSON).
Additionally just the ability to put an entire language into context for an LLM - a single document explaining everything - is also likely to close the gap.
I was skimming some nano files and while I can't say I loved how it looked, it did look extremely clear. Likely a benefit.
Blackpill is that, for this reason, the mainstream languages we have today will be the final (human-designed) languages to be relevant on a global scale.
Eventually AIs will create their own languages. And humans will, of course, continue designing hobbyist languages for fun. But in terms of influence, there will not be another human language that takes the programming world by storm. There simply is not enough time left.
I mostly agree, and I think a combination of good representation and tooling that lets it self-correct quickly will do better than new language in the short term.
In the long term I expect it won't matter - already GPT3.5 was able to reason about the basic semantics of programs in languages "synthesised" zero-shot in context by just describing it as a combination of existing languages (e.g. "Ruby with INTERCAL's COME FROM") or by providing a grammar (e.g. simple EBNF plus some notes on new/different constructs) reasonably well and could explain what a program written in a franken-language it had not seen before was likely to do.
I think long before there is enough training data for a new language to be on equal grounds in that respect, we should expect the models to be good enough at this that you could just provide a terse language spec.
But at the same time, I'd expect the same improvement to future models to be good enough at working with existing languages that it's pointless to tailor languages to LLMs.
It's not just how well the language is represented. Obscure-ish APIs can trip up LLMs. I've been using Antigravity for a Flutter project that uses ATProto. Gemini is very strong at Dart coding, which makes picking up my 17th managed language a breeze. It's also very good at Flutter UI elements. It was noticeably less good at ATProto and its Dart API.
The characteristics of failures have been interesting: As I anticipated it might be, an over ambitious refactoring was a train wreck, easily reverted. But something as simple as regenerating Android launcher icons in a Flutter project was a total blind spot. I had to Google that like some kind of naked savage running through the jungle.
I wonder if there is a way to create a sort of 'transpilation' layer to a new language like this for existing languages, so that it would be able to use all of the available training from other languages. Something that's like AST to AST. Though I wonder if it would only work in the initial training or fine-tuning stage.
Not my experience, honestly. With a good code base for it to explore and good tooling, and a really good prompt I've had excellent results with frankly quite obscure things, including homegrown languages.
As others said, the key is feedback and prompting. In a model with long context, it'll figure it out.
Optimistically I dumped the whole thing into Claude Opus 4.5 as a system prompt to see if it could generate a one-shot program from it:
llm -m claude-opus-4.5 \
-s https://raw.githubusercontent.com/jordanhubbard/nanolang/refs/heads/main/MEMORY.md \
'Build me a mandelbrot fractal CLI tool in this language'
> /tmp/fractal.nano
So I fired up Claude Code inside a checkout of the nanolang and told it how to run the compiler and let it fix the problems... which DID work. Here's that transcript:
I think you need to either feed it all of ./docs or give your agent access to those files so it can read them as reference. The MEMORY.md file you posted mentions ./docs/CANONICAL_STYLE.md and ./docs/LLM_CORE_SUBSET.md and they in turn mention indirectly other features and files inside the docs folder.
The required-test-per-function is sort of interesting. But it's not enforced that the test does anything useful, is it?
So I wonder how exhausting would it be to write in a language that required, for all functions, that they are tested with 100% path coverage.
Of course, this by itself wouldn't still be equivalent to proving the code, but it would probably point people to the corner cases of code quite rapidly. Additionally it would make it impossible to have code that cannot be tested with 100% path coverage due to static relationships within it, that are not (or cannot be) expressed in the type system, e.g. if (foo) { if (!foo) {..} }.
And would such a language need to have some kind of dynamic dependency injection mechanism for mocking the tests?
One novel part here is every function is required to have tests that run at compile time.
I'm still skeptical of the value add having to teaching a custom language to an LLM instead of using something like lua or python and applying constraints like test requirements onto that.
I'm not sure that it's novel but I'm skeptical about the noise to signal ratio for anything that is not an example.
I think that a real world file of source code will be either completely polluted by tests (they are way longer than the actual code they test) or become
and the test code will be written in another file, and every function in the test code will have its own shadow function asserting true, to please the compiler.
> Really needs an agent-oriented “getting started” guide to put in the context, and evals vs. the same task done with Python, Rust etc.
It has several such documents, including a ~1400 line MEMORY.md file referencing several other such files, a language specification, a collection of ~100 documents containing just about every thought Jordan has ever had about the entire language and the evolution of its implementation, and a collection of examples that includes an SDL2 based OpenGL program.
Obviously, jkh clearly understands the need to bootstrap LLMs on his ~5 month old, self-hosted solo programming language.
a.k.a. jkh. That's a blast from the past. Back in the early FreeBSD days, Jordan was fielding mailing list traffic and holding the project together as people peppered the lists with questions, trying to get their systems running with their sundry bits of hardware. I wondered when he slept.
Apparently he did as well[1]:
"The start of the 2.0 ports collection. No sup repository yet, but I'll make one when I wake up again.. :)"
Submitted by: jkh
Aug 21, 1994
1. Nanolang is a total thought experiment. The key word its description is "experimental" - whether it's a Good experiment or a Bad experiment can be argued either way, especially by language purists!
2. Yes, it's a total Decorator Crab of a language. An unholy creation by Dr Frankenstein, yes! Those criticisms are entirely merited. It wasn't designed, it accreted features and was a fever dream I couldn't seem to stop having. I should probably take my own temperature.
3. I like prefix notation because my first calculator was an HP calculator (the HP 41C remains, to this day, my favorite calculator of ALL TIME). I won't apologize for that, but I DO get that it's not everybody's cup of tea! I do, however, use both vi and emacs now.
Umm. I think that about covers it. All of this LLM stuff is still incredibly young to me and I'm just firing a shotgun into the dark and listening to hear if I hit anything. It's going to be that way for a while for all of us until we figure out what works and what does not!
I think this kind of misses what's actually challenging with LLM code -- auditing it for correctness. LLMs are ~fine at spitting out valid syntax. Humans need to be able to read the output, though.
It seems that something that does away with human friendly syntax and leans more towards a pure AST representation would be even better? Basically a Lisp but with very strict typing might do the trick. And most LLMs are probably trained on lots of Lisps already.
A language targeting an LLM might be well served with a lot of keywords, similar to a CISC instruction set, where keywords do specific things well. Giving it building blocks and having them piece together is likely to pay off.
Seems like a simplified Rust with partial prefix notation (which the rationale that is better for LLMs is based on vibes really) that compiles to C. Similar language posted here not too long ago: Zen-C => more features, no prefix notation / Rue => no prefix notation, compiles directly to native code (no C target). Surprisingly compared to other LLM "optimized" languages, it isn't so much concerned about token efficiency.
I find Polish or Reverse Polish notation jarring after a lifetime of thinking in terms of operator precedence. Given that it's fairly rare to see, I wonder what about it would be more LLM-friendly. It does lend itself better to "tokenization" of a sort - if you want to construct operations from lots of smaller operations, for example if you're mutating genetic algorithms (a la Eureqa). But I've written code in the past to explicitly convert those kinds of operations back to infix for easier readability. I wonder if the LLMs in this case are expected to behave a bit like genetic algorithms as they construct things.
Just scanning through this, looks interesting and is totally needed, but I think it is missing showing future use-cases and discussions of decoding. So, for instance, it is all well and good to define a simple language focused on testing and the like, but what about live LLM control and interaction via a programming language? Sort of a conversation in code? Data streams in and function calls stream out with syntax designed to minimize mistakes in calls and optimize the stream? What I mean by this is special block declarations like:
```
#this is where functions are defined and should compile and give syntax errors
```
(yeah, ugly but the idea is there)
The point being that the behavior could change. In the streaming world it may, for instance, have guarantees of what executes and what doesn't in case of errors. Maybe transactional guarantees in the stream blocks compared to pure compile optimization in the other blocks? The point here isn't that this is the golden idea, but that we probably should think about the use cases more. High on my list of use cases to consider (I think)
- language independence: LLMs are multilingual and this should be multilingual from the start.
- support streaming vs definition of code.
- Streaming should consider parallelism/async in the calls.
- the language should consider cached token states to call back to. (define the 'now' for optimal result management, basically, the language can tap into LLM properties that matter)
Hmm... That is the top of my head thoughts at least.
I think the real gap in computer languages wrt LLMs is a replacement for python as a "notebook" language that the LLM uses to solve ad hoc problems during a chat.
What you want is something that is safe, performant, uses minimal tokens and takes careful note of effects and capabilities. Tests aren't really even important for that use case.
Looks a bit like Rust. My peeve with Rust is that it makes error handling too much donkey work. In a large class of programs you just care that something failed and you want a good description of that thing:
context("Loading configuration from {file}")
Then you get a useful error message by unfolding all the errors at some point in the program that is makes sense to talk to a human, e.g. logs, rpc error etc.
Failed: Loading configuration from .config because: couldn't open file .config because: file .config does not exist.
It shouldn't be harder than a context command in functions. But somehow Rust conspires to require all this error type conversion and question marks. It it is all just a big uncomfortable donkey game, especially when you have nested closures forced to return errors of a specific type.
I like your "context" proposal, because it adds information about developer intention to error diagnostics, whereas showing e.g. a call stack would just provide information about the "what?", not the "why?" to the end user facing an error at runtime.
(You should try to get something like that into various language specs; I'd love you to success with it.)
Really clean language where the design decisions have led to fewer traps (cond is a good choice).
It’s peculiar to see s-expressions mixed together with imperative style. I’ve been experimenting along similar lines - mixing s-expressions with ML style in the same dialect (for a project).
Having an agentic partner toiling away with the lexer/parser/implementation details is truly liberating. It frees the human to explore crazy ideas that would not have been feasible for a side/toy/hobby project earlier.
I like the creativity, but I'm not sure it's needed. I've been building a large-scale database system using Opus 4.5, and targeting Rust. It's not perfect, but the Rust compiler is so helpful that Opus has solved a lot of problems on its own. I have around 100,000 lines of code, and have completed some major refactoring.
I am using a variation of spec-driven development.
This absolutely was NOT needed, just to be clear, it was just fun to do and, perhaps more importantly, it taught me a lot in the process of making it.
I also have a few geometric 3D printed objects on my desk that I made with openscad as "printing challenges" and then beat my head against my 3D printers for hours trying to actually print them. They did not need to be printed, they serve no purpose other than to be aesthetically pleasing and educational. :)
Context: This project is by the FreeBSD (co-)founder and former Apple engineering director, Jordan Hubbard. He is now a senior director at Nvidia, according to his public LinkedIn[1].
I feel this could be achieved better with Golang or Kotlin and a custom linter that enforces parentheses around each expression term to make precedence explicit, and enforce each function has at least one test. Although I guess neither of those languages has free interop with C, they are close. And Go doesn’t have unions :’(
Interesting. The syntax looks like C and Scheme had an illegitimate child together. (Don't get me wrong, I do like the unambiguity of prefix notation.)
as i already wrote in an other comment. Where are the millions lines of code needed to train LLM in this Nanolang? LLM are like parrots. if you dont give them data to extract the statistic probability of the next word, you will not get any usefull output. LLM do not think, they can't learn without training data
That's an incorrect assumption. You can get quite far with some skill documents and some examples in combination with tools to compile and run your code. The LLM will train itself on the fly based on the feedback from these tools.
Simple & Fast - Minimal syntax, native performance
Design Philosophy:
Minimal syntax (18 keywords vs 32 in C)
One obvious way to do things
Tests are part of the language, not an afterthought
Transpile to C for maximum compatibility
ehh. i dont think the overhead of inventing a new language makes up for the lack of data around it. in fact if you're close enough to rust/c then llms are MORE likely to make up stuff from their training data and screw up your minimal language.
(pls argue against this, i want to be proven wrong)
opus is currently the only one that can code rust, but if you give it symbol resolution there is quite literally nothing better. The type system in rust is incredibly powerful and llms are great (just opus for now) at utilizing it.
It looks like a Frankenstein's abomination that has c-like function signatures and structs with Sexpr function bodies and this will anger some homomorphism nerds. I love it.
Every new language pet project these days claims to be "designed for LLM's", lol. Don't read too much into it. The only language that's really designed for LLM is COBOL, because it was written to read just like English natural language and LLM's are trained by reading lots of English language books.
deepsquirrelnet|1 month ago
I have a hypothesis that an LLM can act as a pseudocode to code translator, where the pseudocode can tolerate a mixture of code-like and natural language specification. The benefit being that it formalizes the human as the specifier (which must be done anyway) and the llm as the code writer. This also might enable lower resource “non-frontier” models to be more useful. Additionally, it allows tolerance to syntax mistakes or in the worst case, natural language if needed.
In other words, I think llms don’t need new languages, we do.
roncesvalles|1 month ago
That is, in the same way that event sourcing materializes a state from a series of change events, this language needs to materialize a codebase from a series of "modification instructions". Different models may materialize a different codebase using the same series of instructions (like compilers), or say different "environmental factors" (e.g. the database or cloud provider that's available). It's as if the codebase itself is no longer the important artifact, the sequence of prompts is. You would also use this sequence of prompts to generate a testing suite completely independent of the codebase.
keepamovin|1 month ago
- LLMs can act as pseudocode to code translators (they are excellent at this)
- LLMs still create bugs and make errors, and a reasonable hypothesis is at a rate in direct proportion to the "complexity" or "buggedness" of the underlying language.
In other words, give an AI a footgun and it will happily use it unawares. That doesn't mean however it can't rapidly turn your pseudocode into code.
None of this means that LLMs can magically correct your pseudocode at all times if your logic is vastly wrong for your goal, but I do believe they'll benefit immensely from new languages that reduce the kind of bugs they make.
This is the moment we can create these languages. Because LLMs can optimize for things that humans can't, so it seems possible to design new languages to reduce bugs in ways that work for LLMs, but are less effective for people (due to syntax, ergonomics, verbosity, anything else).
This is crucially important. Why? Because 99% of all code written in the next two decades will be written by AI. And we will also produce 100x more code than has ever been written before (because the cost of doing it, has dropped essentially to zero). This means that, short of some revolutions in language technology, the number of bugs and vulnerabilities we can expect will also 100x.
That's why ideas like this are needed.
I believe in this too and am working on something also targeting LLMs specifically, and have been working on it since Mid to Late November last year. A business model will make such a language sustainable.
trklausss|1 month ago
This is something that could be distilled from some industries like aviation, where specification of software (requirements, architecture documents, etc.) is even more important that the software itself.
The problem is that natural language is in itself ambiguous, and people don't really grasp the importance of clear specification (how many times I have repeated to put units and tolerances to any limits they specify by requirements).
Another problem is: natural language doesn't have "defaults": if you don't specify something, is open to interpretation. And people _will_ interpret something instead of saying "yep I don't know this".
kamaal|1 month ago
Thats again programming languages. Real issue with LLMs now is it doesn't matter if it can generate code quickly. Some one still has to read, verify and test it.
Perhaps we need a need a terse programming language. Which can be read quickly and verified. You could call that specification.
pessimizer|1 month ago
This is where LLMs slip up. I need a higher-level spec language where I don't have to specify to an LLM that I want the jpeg crop to be lossless if possible. It's doubly obvious that I wouldn't want it to be lossy, especially because making it lossy likely makes the resulting files larger. This is not obvious to an LLM, but it's absolutely obvious if our objects are users and user value.
A truly higher-level spec language compiler would recognize when actual functionality disappeared when a feature was removed, and would weigh the value of that functionality within the value framework of the hypothetical user. It would be able to recognize the value of redundant functionality by putting a value on user accessibility - how many ways can the user reach that functionality? How does it advertise itself?
We still haven't even thought about it properly. It's that "software engineering" thing that we were in a continual argument about whether it existed or not.
dpweb|1 month ago
It's just part of the software lifecycle. People think their job is to "write code" and that means everything becomes more and more features, more abstractions, more complex, more "five different ways to do one thing".
Many many examples, C++, Java esp circa 2000-2010 and on and on and on. There's no hope for older languages. We need simpler languages.
hombre_fatal|1 month ago
And then we can look at multiple LLM-generated implementations to inform how the prompt might need to be updated further until it's a one-shot.
Now you have perfect intention behind code, and you can refine the intention if it's wrong.
danielvaughn|1 month ago
https://x.com/danielvaughn/status/2011280491287364067?s=46
larodi|1 month ago
p.s. a combination of the above fares very well during my agentic coding adventures.
AgintAI|1 month ago
jasfi|1 month ago
catlifeonmars|1 month ago
pizzafeelsright|1 month ago
Consider:
"Eat grandma if you're hungry"
"Eat grandma, if you're hungry"
"Eat grandma. if you're hungry"
Same words and entirely different outcome.
Pseudo code to clarify:
[Action | Directive - Eat] [Subject - Grandma] [Conditional of Subject - if hungry]
bigfishrunning|1 month ago
GrowingSideways|1 month ago
The code was always a secondary effect of making software. The pain is in fully specifying behavior.
UltraSane|1 month ago
avereveard|1 month ago
QuadmasterXLII|1 month ago
Footprint0521|1 month ago
But seriously, llms can transmit ideas to each other through English that we do understand, we are screwed if it’s another language lol
thorum|1 month ago
My understanding/experience is that LLM performance in a language scales with how well the language is represented in the training data.
From that assumption, we might expect LLMs to actually do better with an existing language for which more training code is available, even if that language is more complex and seems like it should be “harder” to understand.
adastra22|1 month ago
This does fill up context a little faster, (1) not as much as debugging the problem would have in a dynamic language, and (2) better agentic frameworks are coming that “rewrite” context history for dynamic on the fly context compression.
vessenes|1 month ago
Additionally just the ability to put an entire language into context for an LLM - a single document explaining everything - is also likely to close the gap.
I was skimming some nano files and while I can't say I loved how it looked, it did look extremely clear. Likely a benefit.
nemo1618|1 month ago
Eventually AIs will create their own languages. And humans will, of course, continue designing hobbyist languages for fun. But in terms of influence, there will not be another human language that takes the programming world by storm. There simply is not enough time left.
nl|1 month ago
This isn't really true. LLMs understand grammars really really well. If you have a grammar for your language the LLM can one-shot perfect code.
What they don't know is the tooling around the language. But again, this is pretty easily fixed - they are good at exploring cli tools.
vidarh|1 month ago
In the long term I expect it won't matter - already GPT3.5 was able to reason about the basic semantics of programs in languages "synthesised" zero-shot in context by just describing it as a combination of existing languages (e.g. "Ruby with INTERCAL's COME FROM") or by providing a grammar (e.g. simple EBNF plus some notes on new/different constructs) reasonably well and could explain what a program written in a franken-language it had not seen before was likely to do.
I think long before there is enough training data for a new language to be on equal grounds in that respect, we should expect the models to be good enough at this that you could just provide a terse language spec.
But at the same time, I'd expect the same improvement to future models to be good enough at working with existing languages that it's pointless to tailor languages to LLMs.
Zigurd|1 month ago
The characteristics of failures have been interesting: As I anticipated it might be, an over ambitious refactoring was a train wreck, easily reverted. But something as simple as regenerating Android launcher icons in a Flutter project was a total blind spot. I had to Google that like some kind of naked savage running through the jungle.
nxobject|1 month ago
NewsaHackO|1 month ago
cmrdporcupine|1 month ago
As others said, the key is feedback and prompting. In a model with long context, it'll figure it out.
boxed|1 month ago
whimsicalism|1 month ago
simonw|1 month ago
https://github.com/jordanhubbard/nanolang/blob/main/MEMORY.m...
Optimistically I dumped the whole thing into Claude Opus 4.5 as a system prompt to see if it could generate a one-shot program from it:
Here's the transcript for that. The code didn't work: https://gist.github.com/simonw/7847f022566d11629ec2139f1d109...So I fired up Claude Code inside a checkout of the nanolang and told it how to run the compiler and let it fix the problems... which DID work. Here's that transcript:
https://gisthost.github.io/?9696da6882cb6596be6a9d5196e8a7a5...
And the finished code, with its output in a comment: https://gist.github.com/simonw/e7f3577adcfd392ab7fa23b1295d0...
So yeah, a good LLM can definitely figure out how to use this thing given access to the existing documentation and the ability to run that compiler.
e12e|1 month ago
nodja|1 month ago
hahahahhaah|1 month ago
_flux|1 month ago
So I wonder how exhausting would it be to write in a language that required, for all functions, that they are tested with 100% path coverage.
Of course, this by itself wouldn't still be equivalent to proving the code, but it would probably point people to the corner cases of code quite rapidly. Additionally it would make it impossible to have code that cannot be tested with 100% path coverage due to static relationships within it, that are not (or cannot be) expressed in the type system, e.g. if (foo) { if (!foo) {..} }.
And would such a language need to have some kind of dynamic dependency injection mechanism for mocking the tests?
spicybright|1 month ago
I'm still skeptical of the value add having to teaching a custom language to an LLM instead of using something like lua or python and applying constraints like test requirements onto that.
pmontra|1 month ago
I think that a real world file of source code will be either completely polluted by tests (they are way longer than the actual code they test) or become
and the test code will be written in another file, and every function in the test code will have its own shadow function asserting true, to please the compiler.sinuhe69|1 month ago
https://pyret.org/docs/latest/testing.html
cadamsdotcom|1 month ago
Seems unlikely for an out-of-distribution language to be as effective as one that’s got all the training data in the world.
Really needs an agent-oriented “getting started” guide to put in the context, and evals vs. the same task done with Python, Rust etc.
topspin|1 month ago
It has several such documents, including a ~1400 line MEMORY.md file referencing several other such files, a language specification, a collection of ~100 documents containing just about every thought Jordan has ever had about the entire language and the evolution of its implementation, and a collection of examples that includes an SDL2 based OpenGL program.
Obviously, jkh clearly understands the need to bootstrap LLMs on his ~5 month old, self-hosted solo programming language.
jitl|1 month ago
Summary:
- Co-created FreeBSD.
- Led UNIX technologies at Apple for 13 years
- iXSystems, lead FreeNAS
- idk something about Uber
- Senior Director for GPU Compute Software at NVIDIA
For whatever it’s worth.
topspin|1 month ago
Apparently he did as well[1]: "The start of the 2.0 ports collection. No sup repository yet, but I'll make one when I wake up again.. :)" Submitted by: jkh Aug 21, 1994
[1] https://github.com/freebsd/freebsd-ports/commit/7ca702f09f29...
Interesting commit starting Ports 2.0. Three version of bash, four versions of Emacs, plus jove.
JamesTRexx|1 month ago
I might accidentally summon a certain person from Ork.
jll29|1 month ago
jkh99|1 month ago
Quick reaction:
1. Nanolang is a total thought experiment. The key word its description is "experimental" - whether it's a Good experiment or a Bad experiment can be argued either way, especially by language purists!
2. Yes, it's a total Decorator Crab of a language. An unholy creation by Dr Frankenstein, yes! Those criticisms are entirely merited. It wasn't designed, it accreted features and was a fever dream I couldn't seem to stop having. I should probably take my own temperature.
3. I like prefix notation because my first calculator was an HP calculator (the HP 41C remains, to this day, my favorite calculator of ALL TIME). I won't apologize for that, but I DO get that it's not everybody's cup of tea! I do, however, use both vi and emacs now.
Umm. I think that about covers it. All of this LLM stuff is still incredibly young to me and I'm just firing a shotgun into the dark and listening to hear if I hit anything. It's going to be that way for a while for all of us until we figure out what works and what does not!
- jkh
willquack|1 month ago
loeg|1 month ago
tossandthrow|1 month ago
cheriot|1 month ago
tricorn|1 month ago
abraxas|1 month ago
verdverm|1 month ago
hsaliak|1 month ago
forgotpwd16|1 month ago
noduerme|1 month ago
jmward01|1 month ago
``` #this is where functions are defined and should compile and give syntax errors ```
:->r = some(param)/connected(param, param, @r)/calls(param)<-:
(yeah, ugly but the idea is there) The point being that the behavior could change. In the streaming world it may, for instance, have guarantees of what executes and what doesn't in case of errors. Maybe transactional guarantees in the stream blocks compared to pure compile optimization in the other blocks? The point here isn't that this is the golden idea, but that we probably should think about the use cases more. High on my list of use cases to consider (I think)
- language independence: LLMs are multilingual and this should be multilingual from the start.
- support streaming vs definition of code.
- Streaming should consider parallelism/async in the calls.
- the language should consider cached token states to call back to. (define the 'now' for optimal result management, basically, the language can tap into LLM properties that matter)
Hmm... That is the top of my head thoughts at least.
empath75|1 month ago
What you want is something that is safe, performant, uses minimal tokens and takes careful note of effects and capabilities. Tests aren't really even important for that use case.
fizlebit|1 month ago
Failed: Loading configuration from .config because: couldn't open file .config because: file .config does not exist.
It shouldn't be harder than a context command in functions. But somehow Rust conspires to require all this error type conversion and question marks. It it is all just a big uncomfortable donkey game, especially when you have nested closures forced to return errors of a specific type.
wazzaps|1 month ago
jll29|1 month ago
(You should try to get something like that into various language specs; I'd love you to success with it.)
EDIT: typo fixed.
auggierose|1 month ago
firemelt|1 month ago
tromp|1 month ago
sheepscreek|1 month ago
It’s peculiar to see s-expressions mixed together with imperative style. I’ve been experimenting along similar lines - mixing s-expressions with ML style in the same dialect (for a project).
Having an agentic partner toiling away with the lexer/parser/implementation details is truly liberating. It frees the human to explore crazy ideas that would not have been feasible for a side/toy/hobby project earlier.
catlifeonmars|1 month ago
wiremine|1 month ago
I am using a variation of spec-driven development.
jkh99|1 month ago
I also have a few geometric 3D printed objects on my desk that I made with openscad as "printing challenges" and then beat my head against my 3D printers for hours trying to actually print them. They did not need to be printed, they serve no purpose other than to be aesthetically pleasing and educational. :)
runjake|1 month ago
1. https://www.linkedin.com/in/johubbard/
jitl|1 month ago
wendgeabos|1 month ago
NanoLang solves three problems:
LLM Code Generation - Unambiguous syntax reduces AI errors Testing Discipline - Mandatory tests improve code quality
particularly the first.
jason_s|1 month ago
teaearlgraycold|1 month ago
firemelt|1 month ago
boutell|1 month ago
benob|1 month ago
refulgentis|1 month ago
Surac|1 month ago
ako|1 month ago
thomasahle|1 month ago
swyx|1 month ago
LLM Code Generation - Unambiguous syntax reduces AI errors
Testing Discipline - Mandatory tests improve code quality
Simple & Fast - Minimal syntax, native performance
Design Philosophy:
Minimal syntax (18 keywords vs 32 in C)
One obvious way to do things
Tests are part of the language, not an afterthought
Transpile to C for maximum compatibility
ehh. i dont think the overhead of inventing a new language makes up for the lack of data around it. in fact if you're close enough to rust/c then llms are MORE likely to make up stuff from their training data and screw up your minimal language.
(pls argue against this, i want to be proven wrong)
kachapopopow|1 month ago
stevedonovan|1 month ago
Trufa|1 month ago
stevefan1999|1 month ago
nurettin|1 month ago
pancsta|1 month ago
so like Go?
> Key Features; Prefix Notation
wow
NEXT!
zozbot234|1 month ago
v3ss0n|1 month ago
aixpert|1 month ago
ares623|1 month ago
prngl|1 month ago
phplovesong|1 month ago