The primary reason why it worked is because Claude could rip off the Linux driver. Without any prior work to rely on, how will the AI figure out proprietary hardware?
He also mentioned it took 2 months. I’m actually wondering how long it would take to do the Linux to BSD port by eyeball, or at least ai assisted. Probably not that much longer? I guess it depends on wall time vs real time.
Most hardware drivers are simpler than people expect. The hardware is usually designed to do the sensible thing in a straightforward way, and you're just translating what the OS wants into a bunch of bits you need to write to the right hardware register.
On the flip side, the perceived barrier is high. Most folks don't have an intuitive sense of how the kernel or "bare metal" environment differs from userland. How do you allocate memory? Can you just printf() a debug message? How to debug if it freezes or crashes? All of these questions have pretty straightforward answers, but it means you need to set aside time to learn.
So, I wouldn't downplay the value of AI for the same reason I wouldn't downplay it with normal coding. It doesn't need to do anything clever to be useful.
That said, for the same reasonss, it's harder to set up a good agent loop here, and the quality standard you're aiming for must be much higher than with a web app, because the failure mode isn't a JavaScript error, but possibly a hard hang.
I estimate two weeks from having never seen kernel source to something reasonably stable based on experience with block devices/raid controllers. But I knew a bit of C (had patches merged into SVN, Exim4, etc).
Just like it does when given an existing GPL’d source and dealing with its hallucinations, the agent could be operated on a black box (or a binary Windows driver and a disassembly)?
The GPL code helped here but as long as the agent can run in a loop and test its work against a piece of hardware, I don’t see why it couldn’t do the same without any code given enough time?
Presumably one would like to use the laptop before the million years it would take the million monkeys typing on a million typewriters to produce the Shakespearean WiFi driver.
Consider that even with the Linux driver available to study, this project took two months to produce a viable BSD driver.
Seems very promising but then you realize the LLM behind said agent was trained on public but otherwise copyright encumbered proprietary code available as improperly redistributed SDKs and DDKs, as well as source code leaks and friends.
In fact most Windows binaries have public debug symbols available which makes SRE not exactly a hurdle and an agent-driven SRE not exactly a tabula rasa reimplementation.
The Linux driver in this case is ISC licensed. There’s no legal or ethical problem in porting it. This is open source working as intended.
I feel like the jury is still out on whether this is acceptable for GPL code. Suppose you get agent 1 to make a clear and detailed specification from reading copyrighted code (or from reverse engineering). Then get agent 2 to implement a new driver using the specification. Is there anything wrong with that?
True. But also -- how do humans do it? There are docs and there's other similar driver code. I wouldn't be surprised if Claude could build new driver code sight-unseen, given the appropriate resources
Humans do it with access to the register-level data sheets, which are only available under NDA, and usually with access to a logic analyzer for debugging.
Usually, the problem with developing a driver isn't "writing the code," it's "finding documentation for what the code should do."
Scientific method. There are many small discoveries humans make that involve forming a hypothesis, trying something out, observing the results, and coming to a conclusion that leads to more experimentation until you get to what you actually want. LLMs can’t really do that very well as the novel observations would not be in their training data.
GPL is not a patent. It covers the work and _derivatives_; it does not cover ideas or general knowledge. The chip in question has docs.
I fully expect that Claude wrote code that does not resemble that of the driver in the Linux tree. TFA is taking on some liability if it turns out that the code Claude wrote does largely resemble GPL'ed code, but if TFA is not comfortable with the code written by Claude not resembling existing GPL'ed code then they can just post their prompts and everyone who needs this driver can go through the process of getting Claude to code it.
In court TFA would be a defendant, so TFA needs to be sure enough that the code in question does not resemble GPL'ed code. Here in the court of public opinion I'd say that claims of GPL violation need to be backed up by a serious similarity analysis.
Prompts cannot possibly be considered derivatives of the GPL'ed code that Claude might mimic.
In this case, they didn't really work from the chip's published documentation. They instead ultimately used a sorta-kinda open-book clean-room method, wherein they generated documentation using the source code of the GPL'd Linux driver and worked from that.
That said: I don't have a dog in this race. I don't really have an opinion of whether this is quite fine or very-much not OK. I don't know if this is something worthy of intense scrutiny, or if it should instead be accepted as progress.
WD-42|7 days ago
lich_king|7 days ago
On the flip side, the perceived barrier is high. Most folks don't have an intuitive sense of how the kernel or "bare metal" environment differs from userland. How do you allocate memory? Can you just printf() a debug message? How to debug if it freezes or crashes? All of these questions have pretty straightforward answers, but it means you need to set aside time to learn.
So, I wouldn't downplay the value of AI for the same reason I wouldn't downplay it with normal coding. It doesn't need to do anything clever to be useful.
That said, for the same reasonss, it's harder to set up a good agent loop here, and the quality standard you're aiming for must be much higher than with a web app, because the failure mode isn't a JavaScript error, but possibly a hard hang.
lstodd|7 days ago
05|7 days ago
- have AI reverse engineer Windows WiFi driver and make a crude prototype
- have AI compare registers captured by filter driver with linux driver version and iterate until they match (or at least functional tests pass)
not exactly rocket surgery, and windows device drivers generally don't have DRM/obfuscation, so reverse engineering them isn't hard for LLMs.
wingmanjd|7 days ago
https://download.samba.org/pub/tridge/misc/french_cafe.txt
unknown|7 days ago
[deleted]
Nextgrid|7 days ago
Just like it does when given an existing GPL’d source and dealing with its hallucinations, the agent could be operated on a black box (or a binary Windows driver and a disassembly)?
The GPL code helped here but as long as the agent can run in a loop and test its work against a piece of hardware, I don’t see why it couldn’t do the same without any code given enough time?
dotancohen|7 days ago
Consider that even with the Linux driver available to study, this project took two months to produce a viable BSD driver.
vitorsr|7 days ago
In fact most Windows binaries have public debug symbols available which makes SRE not exactly a hurdle and an agent-driven SRE not exactly a tabula rasa reimplementation.
josephg|7 days ago
I feel like the jury is still out on whether this is acceptable for GPL code. Suppose you get agent 1 to make a clear and detailed specification from reading copyrighted code (or from reverse engineering). Then get agent 2 to implement a new driver using the specification. Is there anything wrong with that?
Barbing|7 days ago
Wonder if the courts will move fast enough to generally matter.
bootwoot|7 days ago
jwatte|7 days ago
Usually, the problem with developing a driver isn't "writing the code," it's "finding documentation for what the code should do."
slopinthebag|7 days ago
Probably a mix of critical thinking, thinking from first principles, etc. You know, all things that LLM's are not capable of.
deadbabe|7 days ago
chrisjj|7 days ago
Intelligence.
dev_l1x_be|7 days ago
https://www.reddit.com/r/learnmachinelearning/comments/1665d...
cryptonector|7 days ago
I fully expect that Claude wrote code that does not resemble that of the driver in the Linux tree. TFA is taking on some liability if it turns out that the code Claude wrote does largely resemble GPL'ed code, but if TFA is not comfortable with the code written by Claude not resembling existing GPL'ed code then they can just post their prompts and everyone who needs this driver can go through the process of getting Claude to code it.
In court TFA would be a defendant, so TFA needs to be sure enough that the code in question does not resemble GPL'ed code. Here in the court of public opinion I'd say that claims of GPL violation need to be backed up by a serious similarity analysis.
Prompts cannot possibly be considered derivatives of the GPL'ed code that Claude might mimic.
shakna|7 days ago
SPDX-License-Identifier: ISC
Copyright (c) 2010-2022 Broadcom Corporation
Copyright (c) brcmfmac-freebsd contributors
Based on the Linux brcmfmac driver.
I'm going to ahead and say there are copyright law nightmares, right here.
ssl-3|7 days ago
In this case, they didn't really work from the chip's published documentation. They instead ultimately used a sorta-kinda open-book clean-room method, wherein they generated documentation using the source code of the GPL'd Linux driver and worked from that.
That said: I don't have a dog in this race. I don't really have an opinion of whether this is quite fine or very-much not OK. I don't know if this is something worthy of intense scrutiny, or if it should instead be accepted as progress.
(It is interesting to think about, though.)
Gud|7 days ago
ranger_danger|7 days ago
toast0|7 days ago
rustyhancock|7 days ago
It's a bhyve VM running alpine Linux and you pass through your WiFi adaptor and get a bridge out on the freebsd host.
WD-42|7 days ago
unknown|7 days ago
[deleted]