Software 2.0 (2017)

[+] et1337|3 years ago|reply

If the hardest part of programming is reading, understanding, and debugging other people’s code, our jobs are about to get a lot harder with a bunch of AIs running around spitting out 90% accurate code.

[+] hn_throwaway_99|3 years ago|reply

Hmm, a little frustrated because I feel like a lot of comments here are missing the forest for the trees.

For example, as someone who works with financial software, I don't see Karpathy's "Software 2.0" replacing, say, account ledgering software anytime soon. "Yeah, we calculate our clients' balances correctly 99.9% of the time!" isn't going to cut it.

But I don't think that's what Karpathy is arguing. There is a large set of problem domains where Karpathy's Software 2.0 is a much better solution than what he calls Software 1.0. For example, even in finance, stuff like fraudulent transaction detection, or financial security software for intrusion detection, is very well-suited to Software 2.0.

So yes, I think Software 1.0 will always be around, but I don't think it makes sense to use it for domains where Software 2.0 is a better fit. What I feel like Karpathy is arguing for is really now a recognition that Software 2.0 really is a whole new paradigm shift, and we need better tooling (he uses "GitHub for Software 2.0" as an example) to support it.

[+] obviouslynotme|3 years ago|reply

I agree that almost no Software 1.0 will be effected by this. Software 2.0 will start doing the jobs humans do now because Software 1.0 can't, e.g. security guard, customer support, as well as your examples.

The real interesting things will be Software 1.0 and 2.0 working together. You use 1.0 to run and validate the work of 2.0 that is guided by prompts. An example of this would be using prompts to generate source code that is compiled and tested. The real TDD is only writing tests and letting Software 2.0 create the code for you. This extends to other work like engineering as well.

[+] underdeserver|3 years ago|reply

The problem is that the term "2.0" usually means a newer, better, more feature-rich version (literally version) of the same thing.

The kinds of software that "Software 1.0" is suitable for are markedly different that the ones "Software 2.0" are. As Karpathy argues, it's a different tool, suitable for different tasks, and it should have a different name.

[+] jdoliner|3 years ago|reply

On an entirely silly tangent. It's a bit of a shame that Karpathy stopped working at Tesla. Just because having a guy named Karpathy working on car pathing was such a great example of nominative determinism.

[+] albert_e|3 years ago|reply

If you were from India you would read his name as car-pathy (Lord of Cars)

cf. gana-pathy (Lord of the Ganas)

[+] BulgarianIdiot|3 years ago|reply

Do you believe in nominative determinism? It's an interesting concept with gigantic implications.

[+] edanm|3 years ago|reply

On an even sillier tangent, it's a shame that Karpathy stopped making YT videos on how to speed-solve a Rubik's cube. I learned most of what I know from him, many years ago :)

[+] 0xdeadbeefbabe|3 years ago|reply

Nonfiction for the win.

[+] layer8|3 years ago|reply

Maybe Tesla should relocate their research center to the Carpathians.

[+] piokoch|3 years ago|reply

So we will have some abstract language that will allow business people to define what software should do and on that base AI will generate source code and working project.

Wait, wait, I have heard that it was called BPMN and it generated underneath Enterprise Java Beans and it was working amazingly, all software is written today, right? Right?

But, but nobody touches this crap besides generating pictures to show on Powerpoint slides. Because writing software is circa 10% of all effort, specification, legal stuff, maintenance, avoiding technical debt, proper test cases, anomaly testing, performance testing the right stuff. That is hard, that matters. Software 2.0 is barking the wrong tree.

[+] marcus_holmes|3 years ago|reply

And COBOL before that. The goal of COBOL was to allow a non-technical business person to describe the problem in English and that description could be used as the basis of the program.

Needless to say it didn't work. Accurately describing what a program should do is what a programmer does. We're not telling the computer what to do (move this value from memory into this register, etc). We're describing a program's behaviour.

So I think we'll just get another language that is a good fit for describing what the program should do, and instead of compiling that to machine code, it will be used to train a model.

[+] thefourthchime|3 years ago|reply

Do you know that scene in "Office Space" where that guy says:

"What is it that you do here?" and he says:

"I take the specifications and hand them to the software engineers".

I think our jobs are about to approach that a lot more closely than you might think. Our jobs will be effective translators between what the business wants and what the AI outputs.

[+] majkinetor|3 years ago|reply

> writing software is circa 10% of all effort, specification, legal stuff, maintenance, avoiding technical debt, proper test cases, anomaly testing, performance testing the right stuff. That is hard, that matters. Software 2.0 is barking the wrong tree.

Exactly. For serious projects (not to do lists) u will need humans, or equivalent (AI that is grown like one).

[+] roflyear|3 years ago|reply

BPMN is so far gone, when execs suggest using it they ONLY mean for the workflow charts - not actually executing workflows or allowing business folk to create new workflows!

So it is really confusing because they are only evaluating "pretty pictures" and you're evaluating the technical requirements of the tool (to do what it was built for) and you mismatch.

[+] deeviant|3 years ago|reply

The Apple Newton table computer was a complete disaster and thus all tablet computing devices thereafter will always suck, yes?

[+] layer8|3 years ago|reply

Tangentially, it would have been fun if Copilot and ChatGPT existed two decades ago when EJB was all the rage.

[+] Anuiran|3 years ago|reply

A step beyond that, Software 2.0 and beyond does not generate source code. Your app simply lives in the LLM.

[+] agentultra|3 years ago|reply

I think we need to train people to write formal specifications before we're going to see machine learning techniques generating useful programs.

Sure sometimes an LLM generates a correct program. And sometimes horoscopes predict the future.

I will be impressed when we can write a precise and formally verifiable specification of a program and some other program can generate the code for us from that specification and prove the generated implementation is faithful to the specification.

An active area of research here, code synthesis, is promising! It's still a long way off from generating whole programs from a specification. The search spaces are not small. And even using a language as precise as mathematics leaves a lot of search space and ambiguity.

Where we're going today with LLM's trying to infer a program from an imprecise specification written in informal language is simply disappointing.

[+] lpapez|3 years ago|reply

> I will be impressed when we can write a precise and formally verifiable specification of a program and some other program can generate the code for us from that specification and prove the generated implementation is faithful to the specification.

I cannot do this, and neither can any of the people I have ever worked with. Yet despite that we all call ourselves programmers, create value and earn money by writing ill-specified, often buggy code. Why would a tool need to be formally verified to be considered impressive and/or useful?

[+] canes123456|3 years ago|reply

A precise and formally verified specification is WAY harder to write then the code for it

[+] denton-scratch|3 years ago|reply

> I think we need to train people to write formal specifications

I don't think that is really a matter of training. You have to start with people who can think clearly; if they can't think clearly, it's hopeless to expect them to produce formal specs. Very few people can think clearly about difficult subjects.

Of course, you can train people to improve the clarity of their thinking. I think that should be the main purpose of an undergraduate degree.

Writing a formal spec is analagous to writing a program; if you can't program, your program won't work. So writing a formal spec proves that you can think clearly; but if you can think clearly, you can write a program without first writing a formal spec.

[+] ClassAndBurn|3 years ago|reply

I agree here. Formal models have to become easier to create through.

Today's ecosystem requires advanced knowledge of system design and still coding abilities.

To democratize model generation we need a more iterative and understandable way of defining intented execution. The problem is this devolves into just coding the damn thing pretty quickly.

[+] ansgri|3 years ago|reply

We need to train people to write specifications since they seem to be the best way to fine-tune the AI performance for the task without changing the enormous dataset. Reinforcement learning methods can be used to tune the model in production under pressure of failing tests (executable specifications). Writing specifications to evaluate outcomes should be a better use of domain experts time than dataset cleaning.

As for formality, real formal specifications are very hard, and LLMs are close to understanding natural language anyway, and 1000s of 90%-strict specs are better than 10 provably correct ones. So, some sort of legalese for machines will evolve.

[+] Kinrany|3 years ago|reply

The article is not about generating correct programs.

[+] Duwensatzaj|3 years ago|reply

Sid Meier's Alpha Centauri brought this up in 1999.

"We are no longer particularly in the business of writing software to perform specific tasks. We now teach the software how to learn, and in the primary bonding process it molds itself around the task to be performed. The feedback loop never really ends, so a tenth year polysentience can be a priceless jewel or a psychotic wreck, but it is the primary bonding process—the childhood, if you will—that has the most far-reaching repercussions."

– Bad'l Ron, Wakener, "Morgan Polysoft"

[+] wewtyflakes|3 years ago|reply

That game is worth playing for all of those neat "quotes"; they seem more on point by the day. https://civilization.fandom.com/wiki/Pre-Sentient_Algorithms...

[+] Winsaucerer|3 years ago|reply

That game really deserves a sequel, with the same cast of characters. Beyond Earth tried to do generic leader personalities, and it suffered as a result. The unique personalities of the original cast were a large part of the game.

[+] titzer|3 years ago|reply

> Think about how amazing it could be if your web browser could automatically re-design the low-level system instructions 10 stacks down to achieve a higher efficiency in loading web pages.

This sounds great in theory, but in practice, a system that has that much dynamic adaptation has brutally steep performance cliffs and is massively complex. I for one, will be opting out of that giant vertical slice of hell. This is one of the _good_ reasons for having layers: separate failure zones, separate levels of abstraction--true reuse and modularity. Bugs break all that.

And no, given the hallucinations of large models just in the natural language space, I do not want to reason through the mad ravings of a tripping AI to debug a monster pile that happens to make web property X go 10% faster.

[+] samstave|3 years ago|reply

ELI5 what "10 stacks down" means?

I havent heard this phrase before.

[+] Iv|3 years ago|reply

I remember a pretty old interview with Linus Torvalds where they are talking about object oriented programming. The interviewer asked him if he expected a similar paradigm change in the coming years and I remember being surprised by his answer: (quoting from memory) No, I don't see anything big coming. Probably the next change will be caused by AI.

Yes, differentiable code is already a new paradigm (write a function with millions of parameter, a loss function that requires more craft than people realize and train). That has a property that used to be the grail of IT project management: it is a field where, when you want to improve your code performance, you can just throw more compute at it.

And I think that the clumsy but still impressive attempts at code generation hints at the possibility that yet another AI-caused paradigm change is on the horizon: coding through prompt, adding another huge step on the abstraction ladder we have been climbing.

Forget ChatGPT coding mistakes, but down the road there is a team that will manage to propose a highly abstract yet predictable code generator fueled by language models. It will change our work totally.

[+] zelphirkalt|3 years ago|reply

We might get into another slump of efficiency as an outcome of this, again stopping us from making the most of the hardware and computational resources we have, due to prompts being too unspecific. Did not specify the OS your code will run on? Well, we better use this general cross OS available library here, instead of the optimized one for the actual OS the thing will run on.

The same mentality, that causes today's "everything must be a web app", will caused terrible inefficiency in AI generated (and human prompted for) code. In the end our systems might not be more performant than anything we already have, because there are dozens of useless abstraction layers inserted.

At the same time other people might complain, that the AI does not generate code, that can be run everywhere. That they have to be too specific. People might work on that, producing code generators which output even more overheady code.

At least some of that overhead will slip through the cracks into production systems, as companies wont be willing to invest into proof-reading software engineers and long prompt-generate-review-feedback cycles.

[+] mattgreenrocks|3 years ago|reply

This feels like re-discovering DSLs, except the syntax is English, and the implementation is a blackbox.

[+] zmmmmm|3 years ago|reply

I've always thought this was the critical point where Google missed the boat. It was 2015. Google was clearly and unambiguously ahead in AI and ML. They also were rampaging to massive majority market share in the mobile OS space. They had the opportunity right at that point to turn Android into the very first AI native operating system. I remember seeing a glimpse of this when I found buried in the Android SDK a face detection feature that accurately located eyes and faces in photographs. I thought at the time: this is it - this is where Google will build AI level features into Android and expose them as platform OS features and Android will race ahead in the computing space and take over the world. The whole programming model will look completely different as devs start using first class AI as a basic feature of their toolkit.

And then it never happened. They focused on cloning iPhone features and approach, dumbing down and simplifying the OS to the point where it's pretty hard to distinguish any more.

[+] binarymax|3 years ago|reply

An interesting treatise but I don’t fully agree with the statement that gathering data or stating a goal is easy. Getting quality data is arguably the hardest part of ML. Getting people to fully describe their intentions and all edge cases (“requirements”) in a clear manner is also the one of the hardest parts of software development - which is why things like agile exist. Maybe at the granular function or module level, but business requirements are not easy.

[+] darksaints|3 years ago|reply

To contribute my personal lowbrow dismissal, this is the guy in charge of computer vision at Tesla, where cars have had a habit of doing absolutely bonkers behavior that doesn’t make any sense, when faced with completely normal phenomenon like lanes that fork into two.

His single sentence caveat about how Neural Networks can fail in unintuitive and embarrassing ways is the understatement of the century. I’d like to add that Tesla still hasn’t solved that lane forking problem even eight years since it was first identified. I guess just throw more data at it, and eventually it will get better? At what point does the belief that things will get better with more data fed into the same algorithm become a religious creed?

Neural Networks are significant advances in the state of not just machine learning, but the world as a whole. But the caveat that we don’t really understand what they’re doing is the whole fucking problem. Until Neural Networks can take advantage of, constrain to, and augment human models, they don’t have a snowballs chance in hell at replacing the types of software we rely on the most.

Until then, you’ll just create a massively inefficient system where the neural network writes the software but you spend 10x on engineering your training datasets so that your brilliant neural network knows that it is better to commit to one of two lanes in a forked road than it is to crash into the concrete lane divider. Or to not be racist. Or to not go haywire because of a sticker on a stop sign.

[+] dang|3 years ago|reply

Software 2.0 - https://news.ycombinator.com/item?id=15678587 - Nov 2017 (36 comments)

[+] impalallama|3 years ago|reply

Funny how little a splash this made then compared to now

[+] dangoor|3 years ago|reply

(2017) Interesting to see that this is from 2017, given what the AI explosion has been like recently.

[+] tkiolp4|3 years ago|reply

I think AI like ChatGPT and similars will become what APIs are today. Just that. Today we wire together a crypto library with some JWT library with some Facebook API using some axios library. We write the code in between. Since we are able to do more, user requirements get more complex, and so software engineers are in more demand. Sure thing, what required 10 engineers in the past (1980) can be done by two today.

In the same sense, in the future we will be wiring together AI APIs (probably because it will be cheaper to wire together manually N AIs than to write one that is the sum of the N AIs). Since we’ll be able to do more, user requirements will get more complex… and so the demand for software engineers will go up as well. In the future only a couple of engineers will be needed when today we need 10.

[+] BulgarianIdiot|3 years ago|reply

I hope he's smarter at machine learning than at blog posts.

Yes, a lot of software will include NN models. Traditional software is going nowhere, because it's the only means of being 100% sure of what the outcome will be, non-probabilistically.

Neural Networks are a tool for solving probabilistic, fuzzy logic problems.

[+] valenterry|3 years ago|reply

Yeah I agree. I can see those models helping us improve productivity quite a bit on the side of (still) writing code. Essentially it is a much better form of context aware c&p from stackoverflow.

And then, as you say, there will be certain parts where those models are actually gonna be integrated in software in one way or the other. And I think this is powerful. It would be awesome if I can just toss certain problems to the business folks and empower them to figure out the solution AND implementation by themselves.

But even that will probably take quite some time.

[+] travisjungroth|3 years ago|reply

Your upper bound for the probability of software written by people being correct is higher than mine.

[+] kklisura|3 years ago|reply

If assume the article is correct and there will be Software 2.0, what's the cost of running it vs cost of running Software 1.0? The reason I'm asking is given this quote "when the network fails in some hard or rare cases, we do not fix those predictions by writing code, but by including more labeled examples of those cases." - is it more costly to do "curating, growing, massaging and cleaning labeled datasets" and ultimately training neural networks than to just write code? Maybe not for small NNs, but for DNNs?

[+] zelphirkalt|3 years ago|reply

The higher level the labeled examples become, the more effort is needed to make them in the first place.

For example take a website. How are we going to provide enough examples of websites to make the code generated fit what we need and not have annoying properties we want to void? Lets say we have a website and we tell the code generator of choice, that we want that website to be accessible for blind people. How do we create the amount of labeled examples, that make the code generator understand what to create? Maybe that very creation of labeled examples will be a software developer's future work activity.

[+] gtirloni|3 years ago|reply

Ideally, providing labeled examples can be done by relatively cheaper humans.

[+] jstx1|3 years ago|reply

What the article is saying is that when you develop and build ML systems, your workflow is different from the workflow you would have if you were solving the same problem without ML.

The weird part is presenting it as 2.0 which implies that it replaces 1.0; it doesn't, apart from some edge cases like image recognition - we don't hand code rules to recognise images anymore but that's a tiny tiny part of all the software development work out there.

[+] thurn|3 years ago|reply

This just feels like taking two unrelated things and bunching them together under a label in order to be provocative and imply that software 2.0 will "replace" existing code.

If you surveyed e.g. all of the code Google has in their piper repository, you would find significantly less than 1% of it could be replaced by even an extremely good neural network.

[+] unhammer|3 years ago|reply

Neural nets are great at language and image processing, compression, search and probabilistic prediction problems. In many ways they open up new possibilities for what software can do at all. But replace existing solutions? How on earth do you create a single neural net architecture in say pytorch that turns into a CRUD app with oauth, email verification, css+html, support for parsing some ancient xml format uploaded by the user and stored in S3 along with various database connections? I can't even imagine what a labelled or unlabelled training set for that would look like.

(Unless you're talking about making GPT write the code that does all of the above – I'm sure people are working on that – but unless I'm completely misreading Karpathy's article that's not at all what he intended; IIUC he's talking about programmers making datasets for specific problems + simple neural net architectures to train on that dataset. If ChatGPT wrote some Java code for you, you can't make it go faster by removing half the nodes, which nodes would those be? It definitely won't go faster if you send the same prompt to a lobotomised ChatGPT)

[+] cranium|3 years ago|reply

Don't forget it was written in Nov 2017, AlphaGo Zero was just released at this point. Seeing how the field evolved in a 5-year span, I can understand the desire to be at the forefront.

Personally, I want to keep working on things that I can get to the bottom of. In the same way proprietary software is a nightmare to debug, having an AI blackbox in the middle of my stack could wreck all traceability. However, I can see myself using an AI blackbox when its output is consumed/checked by a human or when "best-effort" is good enough (but treat output as dirty).

Example: sorting documents by relevance (human consumes), code assistant (human checks), transcription of my audio notes (wouldn't do it myself or pay anyone for that, so any output is good enough),

Counter-examples (too dangerous!): AI personal assistant that accept/reject meetings, a ChatGPT text box as an interface for settings, auto-generated tests, Infrastructure-as-a-Desire: input your software and get a new K8S cluster provisioned for it.

330 comments