Hmmm.... Wondering if this could eventually be used to emulate a PCIe card using another device, like a Raspberry Pi or something more powerful... I'm thinking of a card you could stick in a machine, anything from an x1 to an x16 slot, that emulates a network card (you could run a VPN or other stuff on the card and offload it from the host) or storage (something with enough power to run ZFS and a few disks, presented to the host as a single disk, allowing ZFS on devices that otherwise wouldn't support it). But this is probably not easy...
Hi! Author here! You can technically offload the transactions the real driver on your host does to wherever you want, really. PCI is very delay-tolerant and it usually negotiates with the device, so I don't see much of an issue doing that, provided that you can efficiently and performantly manage the throughput throughout the architecture. The thing that kinda makes PCIem special is that you are pretty much free to do whatever you want with the accesses the driver does; you have total freedom. I have made a simple NVMe controller (with a 1GB drive I basically malloc'd) which pops up on the local PCI bus (and the regular Linux nvme block driver attaches to it just fine). You can format it, mount it, create files, folders... it's kinda neat. I also have a simple, dumb rasteriser that I made inside QEMU that I wanted to write a driver for, but since it doesn't exist, I used PCIem to help me redirect the driver writes to the QEMU instance hosting the card (and was thus able to run software-rendered DOOM, plus OpenGL 1.x-based Quake and Half-Life ports).
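For a sense of how little is needed behind such a device, here's a toy sketch of a block store backed by plain host memory, in the spirit of the malloc'd drive above (all names and sizes here are made up for illustration; this is not PCIem code):

```python
class MallocDisk:
    """Toy block device whose 'drive' is just a chunk of host memory,
    like the malloc'd backing store of an emulated NVMe controller."""
    BLOCK = 512  # bytes per logical block

    def __init__(self, nblocks):
        # the whole "disk" is one zero-filled buffer
        self.mem = bytearray(nblocks * self.BLOCK)

    def write_block(self, lba, data):
        assert len(data) == self.BLOCK
        off = lba * self.BLOCK
        self.mem[off:off + self.BLOCK] = data

    def read_block(self, lba):
        off = lba * self.BLOCK
        return bytes(self.mem[off:off + self.BLOCK])
```

The emulated controller's job is then just translating the driver's queue entries into calls like these.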
This kind of stuff is stupid easy on an OS like Plan 9, where you speak a single protocol: 9P. Ethernet devices are abstracted and served by the kernel as a file system, as explained in ether(3). Since it's all 9P, the system doesn't care where the server is running; it could be a local in-kernel/user-space server or a remote server over ANY 2-way link, including TCP, IL, a PCIe link, an RS232 port, SPI, USB, etc. This means you can mount individual pieces of hardware or networking stacks like ip(3), or any 9P server, from other machines into a process's local namespace. Per-process namespaces let you customize a process's view of the file system, and hence that of all its children, allowing you to customize each and every program's resource view.
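To make that concrete (from memory, so treat the exact flags as illustrative), borrowing another machine's network stack is a single command:

```
# serve the machine "gateway"'s /net in the local namespace;
# connections opened through it now use gateway's TCP/IP stack
import -a gateway /net
```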
There is interest in getting 9front running on the Octeon chips. This would allow one to run anything they want on an Octeon card (Plan 9's cross-platform support is first class), so one could boot the card using the host's root file system, write and test a program on the host, change the objtype env variable to mips/arm, build the binary for the Octeon, and then run it on the Octeon using rcpu (like running a command remotely via ssh). All you need is a working kernel on the Octeon and a host kernel driver; the rest works out of the box.
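The workflow above would look roughly like this (illustrative only; the host name and program are made up):

```
# cross-build for the card's architecture, then run the binary there
objtype=mips mk
rcpu -h octeon -c ./myprog
```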
Something like the stm32mp2 series of MCUs can run Linux and act as a PCIe endpoint you can control from a kernel module on the MCU. So you can program an arbitrary PCIe device that way (although it won’t be setting any speed records, and I think the PHY might be limited to PCIe 1x)
So just to be clear, you have to boot up the physical machine with a kernel command-line argument to reserve some RAM for this to work? And the amount of RAM you reserve is for BAR memory? If you wanted multiple PCIem devices (can you do that?) you'd need to reserve RAM for each of them?
Hi! That's correct. We need a way to have a chunk of what Linux calls "Reserved" memory for the virtual BAR trick. Currently, PCIem only thinks about a single device (Since I first needed to have something that worked in order to check how feasible this all was), but there's planned support for multiple devices that can share a "Reserved" memory pool dynamically so you can have multiple BARs for multiple devices.
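For reference, on x86 that kind of reservation is typically done with the memmap boot parameter (the address and size here are just examples; note that in GRUB configs the `$` needs escaping):

```
# carve 64 MiB of RAM out at the 1 GiB mark; Linux marks it "Reserved"
# and never hands it to the page allocator
memmap=64M$0x40000000
```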
Indeed, the project has gone through a few iterations already (it was first a monolithic kernel module that required a secondary module to call into the API and whatnot). I've moved towards a more userspace-friendly design, mainly so that you can iterate on your changes much, much faster. Creating the synthetic PCI device is as easy as opening the userspace shim you program; it'll then appear on your bus. When you want to test new changes, you close the shim normally (effectively removing it from the bus), and you can repeat this process as many times as needed.
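That open/close lifecycle is easy to picture as a context manager. This is purely illustrative; `BUS` stands in for the host PCI bus, and none of these names come from the actual PCIem API:

```python
from contextlib import contextmanager

# BUS stands in for the host PCI bus; a real shim would talk to the
# kernel side instead of a Python set. All names here are made up.
BUS = set()

@contextmanager
def synthetic_device(name):
    BUS.add(name)          # opening the shim: device appears on the bus
    try:
        yield name
    finally:
        BUS.discard(name)  # closing the shim: device is removed again
```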
Could you explain in layman's terms how it would help with developing PCIe hardware/drivers? I can immediately imagine something like writing more robust unit tests and maybe developing barebones drivers before you get access to actual hardware, but that's where my imagination runs out of fuel.
I feel like libvfio-user is a cool project and works perfectly fine, provided that you want the device in the host's userspace but exposed to a VM (QEMU, in this case).
PCIem kinda does that, but one level down: it basically pops the device onto your host's PCI bus, which lets real, unmodified drivers interact with the userspace implementation of your card. No QEMU, no VM, no hypervisors.
That's not to say you can't then, for instance, forward all the accesses to QEMU (some people/orgs already have their cards defined in QEMU, so it'd be a bit pointless to redefine the same stuff over and over, right?). They're free to basically glue their QEMU stuff to PCIem in case they want to try the driver directly on the host while maintaining the functional emulation in QEMU. PCIem takes care of abstracting the accesses and whatnot with an API that tries to mimic what the cool people over at KVM do.
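An access-forwarding transport can be as simple as fixed-size records on a socket. This wire format is entirely made up for illustration (it is neither PCIem's nor QEMU's actual protocol):

```python
import struct

# One record per MMIO access, little-endian:
# <is_write:u8> <size:u8> <pad:u16> <addr:u64> <value:u64>  (20 bytes)
ACCESS_FMT = "<BBHQQ"

def pack_access(is_write, addr, size, value=0):
    """Serialize one access so it can be shipped to a remote emulator."""
    return struct.pack(ACCESS_FMT, int(is_write), size, 0, addr, value)

def unpack_access(buf):
    """Inverse of pack_access; returns (is_write, addr, size, value)."""
    is_write, size, _, addr, value = struct.unpack(ACCESS_FMT, buf)
    return bool(is_write), addr, size, value
```

The host side packs each driver access and writes it to the socket; the QEMU-side glue unpacks it and replays it against the emulated card.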
Tangential question: PCIe is a pretty future-proof technology to learn/invest in, right? As in, it is very unlikely to become obsolete in the next 5-10 years (like USB)?
Neither of those is going to be obsolete in 5 years. They might get rebadged and gain a bunch of extensions, but there's such a huge installed base that rapid change is unlikely. Neither FireWire nor Thunderbolt unseated USB.
PCIe is probably the most future-proof technology we have right now. Even if it is upended at the hardware level, from the software perspective it just exposes a device's arbitrary registers at some memory-mapped location. Software drivers for PCIe devices will continue to work the same.
Curious what you mean by learning? Learning about TLPs? Learning about FPGA DMA Engines like XDMA? Learning about PCIe switches / retimers? Learning about `lspci`?
Very interesting work! I've been exploring a different idea on the side, using SPDK + libvfio-user [0] to emulate PCIe devices inside QEMU, which doesn't require a kernel module but is a bit less flexible than this approach.
Highly interesting! I kinda wanted not to rely on QEMU as a default "end" for the emulation (as in, I want the end user to be able to choose whatever userspace shim/transport layer/thing they want), but for some of my tests I did forward accesses to QEMU itself (and it worked wonders). Thanks for that link! Super cool stuff!
sedatk|1 month ago
In the presentation below, only its USB capabilities are discussed, but it was able to simulate PCI devices too.
https://download.microsoft.com/download/5/b/9/5b97017b-e28a-...
pjc50|1 month ago
The other existing solution to this is FPGA cards: https://www.fpgadeveloper.com/list-of-fpga-dev-boards-for-pc... - note the wide spread in price. You then also have to deal with FPGA tooling. The benefit is much better timing.
unsnap_biceps|1 month ago
https://blog.reds.ch/?p=1759 and https://blog.reds.ch/?p=1813 are what inspired me to play with it.
asdefghyk|1 month ago
I've often wondered why such a card (with an FPGA) isn't available for retro computer emulation or simulation.
justsomehnguy|1 month ago
https://mikrotik.com/product/ccr2004_1g_2xs_pcie
and G-RAID
immibis|1 month ago
Seems unlikely you'd emulate a real PCIe card in software because PCIe is pretty high-speed.
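For scale (using the well-known per-generation figures), each lane moves hundreds of MB/s even at Gen1, which is why bit-accurate software emulation of a real card is impractical:

```python
def lane_bw_bytes(gt_per_s, enc_num, enc_den):
    """Usable per-lane bandwidth in bytes/s: raw transfer rate times the
    line-encoding efficiency (8b/10b for Gen1/2, 128b/130b from Gen3 on)."""
    return gt_per_s * 1e9 * enc_num / enc_den / 8

# Gen1 x1:  2.5 GT/s with 8b/10b   -> 250 MB/s per lane
# Gen3 x16: 8 GT/s with 128b/130b  -> ~985 MB/s per lane, ~15.8 GB/s total
print(lane_bw_bytes(2.5, 8, 10))
print(lane_bw_bytes(8, 128, 130) * 16)
```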
[0] https://movementarian.org/blog/posts/2025-08-27-vfio-user-cl...