Man, I can’t tell you how much labour modern LLMs would have saved me at my business, 10-15 years ago.
An awful lot of what we ended up dealing with was awful data - the worst example I can think of was a big old heap of textual recipes that the client wanted normalised, so they could be scaled up/down, have nutritional information, etc. - about 180,000 of them, all UGC.
This required mountains of regexes for pre-processing, and then toolchains for a small army of interns to work through every. single. one. and normalise it - we did what we could, trying to pull out quantities and measures and ingredients and steps, but it was all such slop it took thousands of man-hours, and then many more to fix the messes the interns made.
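The kind of pre-processing described above, as a minimal sketch. It assumes tidy "amount unit ingredient" lines; the pattern, unit list, and function name are illustrative, not the original toolchain, and real UGC is far messier than this handles.

```python
import re

# Pull "<amount> <unit> <ingredient>" out of a free-text recipe line.
# Only the tidy cases match; everything else went to the interns.
LINE_RE = re.compile(
    r"^\s*(?P<amount>\d+(?:[./]\d+)?)\s*"
    r"(?P<unit>cups?|tbsp|tsp|g|kg|ml|l|oz)?\s+"
    r"(?P<ingredient>.+?)\s*$",
    re.IGNORECASE,
)

def parse_ingredient(line):
    """Return (amount, unit, ingredient), or None if the line doesn't match."""
    m = LINE_RE.match(line)
    if not m:
        return None
    return m.group("amount"), m.group("unit"), m.group("ingredient")
```

Every unit, plural form, and fraction style needs its own alternation, which is how a heap of UGC turns into mountains of regexes.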
With an LLM, it could have been done… more or less instantly.
And this is just one example of so, so many times that we found ourselves having to turn a heap of utter garbage into usable data, where an LLM would have been able to just do it.
Anyway. I at least managed to assuage my past torment by seeing the writing on the wall and stocking up on NVDA at about the time I was wrestling with this stuff.
i think the shift in expectations has a lot to do with a change in audience.
it used to be that fancy new ML models would be discussed among ML practitioners that had enough background/context to understand why seemingly little improvements were a big deal and what reasonable expectations would be for a model.
but now a new ML (sorry "AI") model is evaluated by the general public that doesn't know the technical background but DOES know the marketing hype. you can give them an amazing language model that blows away every language-related benchmark but they'll have ridiculous expectations so it's always a disappointment.
i'm still amazed when language models do relatively 'simple' things with grammar and syntax (like understanding which objects different pronouns are referencing), but most people have never thought about language or computers in a way that lets them see how hard and impressive that is. they just ask it a question like 'what should i eat for dinner' and then get mad when it recommends food they don't like.
Arguably the goal post for AGI has moved about as much, if not more. One wonders if Turing reading a 2024 LLM chat transcript would say "but it's not really thinking!".
I always saw it as commenting on the gap between what non-techies perceive as hard and what actually is. Multiple times in my career, a single off-the-cuff requirement in a meeting changed the estimate of a project by several months.
> This one always felt off to me. Humans spent millennia working out the navigation problem.
Even the navigation problem still offers some challenges that most apps fail to address. Consider the store locator function common on retail business websites and apps. They usually just compute the straight line distance from you to the stores and show the stores within some particular range, sorted by distance.
That's probably fine most of the time, but consider a place like Seattle and its surrounding areas. Suppose you are in Kingston, which is on the west side of Puget Sound about 5 miles away from the east side, which is the side Seattle is on.
The Walgreens store locator shows 10 stores when searching for stores near Kingston, and 9 of them are on the Seattle side of Puget Sound. Crossing the Sound there is a 30-minute ferry ride that costs around $20 each way if you are bringing your car.
The one it shows on the west side of the Sound is on Bainbridge Island, and that is probably not the one someone in Kingston would go to. They would go to the one in Silverdale. It's actually closer to Kingston than the one on Bainbridge by road distance, but slightly farther away in a straight line.
The one in Silverdale is on their list, as are three in Bremerton and one in Port Orchard, all of which are closer in terms of time and travel expenses to Kingston than any of the ones on the Seattle side, but you only see those on the map if you hit the "load more" button. Hitting it once brings in Silverdale and a couple in Bremerton; hitting it twice brings in the rest.
Similar for businesses whose site has an option to find items in stock locally. They often report an item is locally available, but it turns out to only be in stores across the Sound.
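For context, the ranking these locators use is the great-circle ("straight line") distance, which knows nothing about ferries. A minimal sketch, with approximate coordinates assumed for illustration:

```python
from math import radians, sin, cos, asin, sqrt

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two lat/lon points."""
    lat1, lon1, lat2, lon2 = map(radians, (lat1, lon1, lat2, lon2))
    dlat, dlon = lat2 - lat1, lon2 - lon1
    a = sin(dlat / 2) ** 2 + cos(lat1) * cos(lat2) * sin(dlon / 2) ** 2
    return 2 * 6371 * asin(sqrt(a))  # 6371 km = mean Earth radius

# Rough coordinates (assumptions for illustration only)
kingston = (47.80, -122.50)
seattle = (47.61, -122.33)
distance = haversine_km(*kingston, *seattle)  # ~25 km across the Sound
```

By this metric downtown Seattle is "close" to Kingston; accounting for the ferry crossing requires road-network routing, which most store locators skip.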
That is kind of the point. It seems like the navigation thing would be much more complicated to the layman, yet anyone can do it in a few hours now, while a thing that seems simpler would take years, because it wasn't already solved and served up as an API. Although it is now.
I don’t think the point is about GPS, it’s about GIS. So it’s not the navigation problem, it’s an “is this point in this polygon” problem. Which is… a bit easier.
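The point-in-polygon test really is the easier half. A minimal ray-casting sketch, treating coordinates as plain x/y (real GIS code would use a library and proper projections, and park boundaries have thousands of vertices):

```python
def point_in_polygon(lat, lon, polygon):
    """Ray-casting test: is (lat, lon) inside polygon (a list of vertices)?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        lat1, lon1 = polygon[i]
        lat2, lon2 = polygon[(i + 1) % n]
        # Does a ray from the point cross this edge?
        if (lon1 > lon) != (lon2 > lon):
            t = (lon - lon1) / (lon2 - lon1)
            if lat < lat1 + t * (lat2 - lat1):
                inside = not inside
    return inside
```

An odd number of edge crossings means the point is inside; checking against a national park is this test run over the park's boundary polygon from a shapefile.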
Your observation doesn't contradict the point of the comic. It isn't about which tasks are difficult in totality, it is talking about which tasks are difficult with our current technology.
The idea is that non-software developers don't know which tasks current technology can solve trivially and which it can't. Yes, the distribution of tasks into those two buckets changes over time, but it is still not easily knowable by lay-people.
Everything we do today would be extremely difficult to re-create from scratch, but that doesn't mean it is hard to do - because we DON'T have to re-create it from scratch.
When I was a student we got a task where we had to spell check some text. This was super easy because we could fit the entire dictionary in memory.
It hadn’t always been that easy. Once upon a time, someone was paging their dictionary in and out from a floppy disk, not to mention the compression they had to implement from scratch.
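With the whole dictionary in memory, the student version is a few lines. A minimal sketch, with a toy word list standing in for a real dictionary file such as /usr/share/dict/words:

```python
# Toy stand-in for a real dictionary loaded into a set for O(1) lookup.
DICTIONARY = {"the", "cat", "sat", "on", "mat"}

def misspelled(text):
    """Return the words in text that are not in the dictionary."""
    return [w for w in text.lower().split() if w.strip(".,!?") not in DICTIONARY]
```

The floppy-disk version had to do the same membership test while paging compressed chunks of the word list in and out of RAM.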
I'd love to see a source, but I'm pretty sure the energy, time, and money that have gone into compute infrastructure are far greater than what has gone into our space programs.
> I'll add that if you think training models takes a lot of energy, try launching fleets of rockets to maintain an artificial satellite constellation.
It's not the training that makes it difficult! It's the research necessary to invent machine learning algorithms that can be used to train a model to recognize birds. For multiple decades, this was way harder than maintaining a satellite constellation.
I was just listening to a Planet Money podcast about GPS (https://overcast.fm/+AAYs-52QVys). While it wouldn’t be “global”, the US did have a terrestrial alternative location system until 2010 (https://en.m.wikipedia.org/wiki/Loran-C). With today’s technology, enhanced LORAN would have been more accurate if funded as a backup to GPS.
One could say, "This didn't age well," but I think the real point, that "it can be hard to explain the difference between the easy and the virtually impossible," is only reinforced by an almost ironic twist that switched the hard and easy tasks around. Who would have thought, ten years ago?
> Understanding what kind of tasks LLMs can and cannot reliably solve remains incredibly difficult and unintuitive.
Case in point: the other day my daughter was doing a presentation and she said "Dad can you help me find a picture of the word HELLO spelled out in vegetables?"
I was like "CAN I!!?!?! This sounds like a job for ChatGPT".
I'll tell you what: ChatGPT can give you a picture of a cat wearing a space suit drinking a martini but it definitely cannot give you the word HELLO spelled out in vegetables.
I ended up getting it to give me each individual letter of the alphabet constructed with vegetables and she pasted them together to make the words she wanted for her presentation.
I did this tutorial series (https://course.fast.ai/) to try to get some context/foundation in deep learning, and the first lesson was building the bird thing from this comic. It was really easy and fun. The whole course is great. Highly recommended for anyone who has a programming background and wants a solid intro to deep learning.
What no-one is pointing out is that LLMs have made almost as much progress on the first part of the request as the second! ChatGPT writes me an is_point_in_national_park function and points me to the relevant shapefile in ~30 seconds. That's a few-hundred-fold speedup over the "few hours" referenced in the comic.
I wonder if we'll ever hit a critical mass of technical literacy where this kind of misunderstanding largely disappears. Ten years ago I would have said yes. Now I think the advances in UX/UIs and the appification of everything have insulated the median person from the details. That's good as far as individual products go, but in aggregate might lead to unrealistic expectations. I've heard younger folks ask questions about "why doesn't x just do y" that I previously could only have imagined my very non-technical parents' cohort asking.
At least in the 80s, when computers roughly equalled magic for much of the population (looking at you Wargames!), most people didn't really have to interact with it. Their expectations about computers were roughly as important as my expectations about alien life. But I'm afraid that magical thinking about tech will be of greater consequence both individually and societally.
GPS is a ready-made infrastructure that took decades of hard work to build and maintain. When the comic was made, image recognition didn't have the same groundwork done for it, but now, with pre-trained models, everyone can do it in 5 minutes too.
Is it? I assume that you are thinking of using a 3rd-party API endpoint to which you upload the image so that the service decides for you if it is a bird and which kind of bird it is. Or you use something like Firebase.
Because if that is the way you'd solve this problem, then just sending lat/lon to a service to determine if it is in a national park is even easier, as it's just a GET request.
I'm still unsure about what would be harder to set up locally.
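For the GET-request route, a minimal sketch of building the query. The endpoint and parameter names here are assumptions for illustration, not a real service:

```python
from urllib.parse import urlencode

# Hypothetical service; any real API would document its own URL and params.
BASE = "https://api.example.com/v1/in-national-park"

def park_query_url(lat, lon):
    """Build the GET URL asking the service about one coordinate."""
    return f"{BASE}?{urlencode({'lat': lat, 'lon': lon})}"
```

The client-side work is just this one request; all of the GIS complexity lives behind the endpoint.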
Back in the day I had a manager that didn't understand programming.
To him, it was just one button that would open this small info window. Just one button. Just one window.
It took him weeks to understand that we didn't have the data he wanted to show ready. We could do it, but it would take weeks of research and development.
Note that we are still within Randall’s expectations - the initial estimate for the project at the time was five years and ten years later there is a publicly available solution.
It would have been interesting to see the reverse: the problem becoming trivial in less time than the project’s estimate.
> Understanding what kind of tasks LLMs can and cannot reliably solve remains incredibly difficult and unintuitive.
That’s because the idea of it being a superhuman intelligence (an undefined metric) is being sold. So you have to lie and say “it’s amazing, it’s going to change everything”. If I told you “it’s okay and is often wrong”, you wouldn’t buy my product, would you? This is just to say I can’t blame that on easy/hard task agency, specifically.
=== addendumb ===
“It’s okay and is often wrong” sounds like working with my junior coworker who I don’t enjoy pairing with. If I said “it’s impressive how the results are at the level of a junior engineer”, you’d sell me on your product.
One thing I’ve always felt is that the relative difficulty of each task seems to have flipped? I could write a bird classifier in my sleep using fastai, but I have no idea how to do a GIS lookup.
An LLM will tell you how to do the GIS lookup, but ironically as privacy laws become better it will genuinely become a harder and harder task unless the user explicitly wants you to do it.
ml_basics | 1 year ago
10 years ago the GAN paper came out and everyone was excited about how amazing the generated image quality was (https://arxiv.org/abs/1406.2661).
The amount of progress we've made is mind boggling.
ethbr1 | 1 year ago
'Common people misunderstand what computers are capable of, because they run it through human equivalency.
E.g. a child can do basic arithmetic, and a computer can do basic arithmetic. A child can also speak, so surely a computer can speak.'
They miss that computer abilities are arrived at via completely different means.
Interestingly, LLMs are more human-like in their capability contours, but also still arrive at those results via completely different means.
madaxe_again | 1 year ago
seydor | 1 year ago
parpfish | 1 year ago
bumby | 1 year ago
I've heard this applied to all kinds of human goals, but it seems apt for AI expectations as well.
ttflee | 1 year ago
fouronnes3 | 1 year ago
Workaccount2 | 1 year ago
Especially people with what appears to be "low-hanging fruit" work for AI, after the recent paradigm shift.
jessekv | 1 year ago
The comic exists in this brief window of time where one task was finally "solved" and the other one was just getting started.
I'll add that if you think training models takes a lot of energy, try launching fleets of rockets to maintain an artificial satellite constellation.
moritonal | 1 year ago
tzs | 1 year ago
mewpmewp2 | 1 year ago
rtpg | 1 year ago
cortesoft | 1 year ago
hmottestad | 1 year ago
josefx | 1 year ago
And you think we spend any less time trying to identify food animals that lay tasty eggs?
roomey | 1 year ago
It only makes sense if we ignore the "standing on the shoulders of giants" bit.
zulban | 1 year ago
cubefox | 1 year ago
scarface_74 | 1 year ago
m463 | 1 year ago
The National Park Service was started August 25, 1916 (only 108 years ago)
:)
weinzierl | 1 year ago
dools | 1 year ago
dvh | 1 year ago
itslennysfault | 1 year ago
https://course.fast.ai/
Dave_Rosenthal | 1 year ago
spaceman_2020 | 1 year ago
Had a project that involved describing and cataloging over 20,000 images.
The traditional method using real people would take months and a crapload of money (the descriptions have to be customer-readable).
OpenAI’s vision API does it for cents per image. Must have spent under $200 for the whole thing.
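Back-of-the-envelope, those numbers are self-consistent: 20,000 images at about a cent each lands right at the ~$200 figure (the per-image price is an assumption inferred from the totals above; actual API pricing varies).

```python
# Sanity-check the stated totals: 20,000 images at ~1 cent each.
num_images = 20_000
cost_per_image_usd = 0.01  # assumed from "cents per image" and "$200 total"
total_usd = num_images * cost_per_image_usd
```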
BWStearns | 1 year ago
stefanos82 | 1 year ago
Would it be too much to ask you to start livestreaming any coding of yours that can be shared publicly?
I would love to learn so many things from you, especially around your current ecosystem, that is Python, SQLite (data), and JavaScript.
appendix-rock | 1 year ago
Almondsetat | 1 year ago
cung | 1 year ago
qwertox | 1 year ago
unsigner | 1 year ago
"task about which you will find more easy-looking tutorials hiding the complexity under a blanket of 3rd party code and services" is better
marricks | 1 year ago
thaumasiotes | 1 year ago
That depends on whether you care about getting the answer right. If you don't, it was always the easier task.
If you do, Seek by iNaturalist still can't do this job, and that's the only thing Seek is supposed to be able to do.
consp | 1 year ago
munchler | 1 year ago
ta1243 | 1 year ago
"I'll need a research team and 5 years"
In 2020 the BBC had this blog post about cameras detecting not just "is it a bird or Superman", but what types of birds:
https://www.bbc.co.uk/rd/blog/2020-06-springwatch-artificial...
I guess Cueball got the team together.
riiii | 1 year ago
thih9 | 1 year ago
ryzvonusef | 1 year ago
EDIT: A month, https://code.flickr.net/2014/10/20/introducing-flickr-park-o...
It's so weird seeing them explain 'Deep Networks'. Language around AI has definitely changed in the past ten years.
Also, hilariously, the page they created to demonstrate this (http://parkorbird.flickr.com/) no longer works. Oh, how time flies.
The explainxkcd page, for good measure: https://www.explainxkcd.com/wiki/index.php/1425:_Tasks
sva_ | 1 year ago
https://nitter.poast.org/jeremyphoward/status/15180380012924...
amp108 | 1 year ago
righthand | 1 year ago
1f60c | 1 year ago
brazzy | 1 year ago
moffkalast | 1 year ago
solardev | 1 year ago
Just pop up a dialog for the user. "Are you in a national park?"