qdot_me's comments

qdot_me | 6 years ago | on: PCI Express on the Raspberry Pi 4

To a limited extent. Accelerated graphics on ARM (and cell phones) has always been a system integrator problem. Until very recently, the GPUs were not user customizable - which means the initialisation was part of the proprietary firmware and hard to port even across different implementations of the same SoC.

The x86 world had the advantage of user-upgradable GPUs, which necessitated standardization and common firmware.

On top of which, the accelerated graphics of cell phones is a horrible kludge of various standards.

qdot_me | 6 years ago | on: PCI Express on the Raspberry Pi 4

That’s actually not the case. Historically, low amounts of RAM and IO were the bottlenecks. Here we’re talking about a quad-core ARMv8 with a pretty beefy vector GPU/coprocessor.

qdot_me | 6 years ago | on: PCI Express on the Raspberry Pi 4

True. And with the RPi4 having 1000BASE-T, it’s not as painful as it seems. Perhaps the driver can even be coaxed into some form of DMA and MPI with a bit lower latency than the IP stack.

With a secondary IP layer on 802.11, it might actually work reasonably well.

qdot_me | 6 years ago | on: PCI Express on the Raspberry Pi 4

It’s all, cough, implementation dependent. You can run two VideoBIOS implementations, and chances are only one will install an int10h handler, but both will set up the registers and whatnot in the GPUs.

My AMD TAHITIs for instance need VideoBIOS to start some form of thermal management loop - otherwise they just run full-throttle on the fan.

Then whichever card prevails (BIOS has the ability to select the initialization order) becomes the boot display device.

X11 has some (generally working, for well-behaved GPUs) emulation of this environment, so that the GPU can initialize late, and even reset under X control. This is how sane cards can work under headless ARM etc.

Now, some manufacturers assume you get something like SSE or MMX - the VideoBIOS spec technically mandates the 386 instruction set only. That crap gets badly emulated.

On top of this, drivers can sometimes reinit anyways, from native kernel code. If that happens, the VideoBIOS concerns are moot.
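To check what kind of init code a given card actually carries, you can walk the images in its PCI expansion ROM. A minimal sketch, assuming the standard expansion ROM layout (0x55 0xAA signature, a pointer at offset 0x18 to the "PCIR" data structure, code type at offset 0x14 of that structure). The function names and the synthetic ROM with made-up vendor/device IDs are only for illustration:

```python
import struct

# Code type values from the PCI expansion ROM data structure.
CODE_TYPES = {0x00: "x86 PC-AT (legacy VideoBIOS)", 0x01: "Open Firmware",
              0x02: "HP PA-RISC", 0x03: "EFI"}

def list_rom_images(rom: bytes):
    """Return one dict per image found in a PCI expansion ROM dump."""
    images = []
    offset = 0
    while offset + 0x1A <= len(rom):
        if rom[offset:offset + 2] != b"\x55\xaa":
            break  # no ROM header signature at this offset
        (pcir_ptr,) = struct.unpack_from("<H", rom, offset + 0x18)
        pcir = offset + pcir_ptr
        if pcir + 0x16 > len(rom) or rom[pcir:pcir + 4] != b"PCIR":
            break  # missing or truncated PCI data structure
        vendor, device = struct.unpack_from("<HH", rom, pcir + 0x04)
        (blocks,) = struct.unpack_from("<H", rom, pcir + 0x10)  # 512-byte units
        code_type = rom[pcir + 0x14]
        indicator = rom[pcir + 0x15]
        images.append({
            "offset": offset,
            "vendor": vendor,
            "device": device,
            "code_type": code_type,
            "kind": CODE_TYPES.get(code_type, "unknown"),
        })
        if indicator & 0x80 or blocks == 0:
            break  # bit 7 set marks the last image
        offset += blocks * 512
    return images

def make_image(code_type, last, vendor=0x1234, device=0x5678):
    """Build a 512-byte synthetic image (hypothetical IDs) for the demo."""
    img = bytearray(512)
    img[0:2] = b"\x55\xaa"
    struct.pack_into("<H", img, 0x18, 0x1C)          # PCIR lives at 0x1C
    img[0x1C:0x20] = b"PCIR"
    struct.pack_into("<HH", img, 0x1C + 0x04, vendor, device)
    struct.pack_into("<H", img, 0x1C + 0x10, 1)      # one 512-byte block
    img[0x1C + 0x14] = code_type
    img[0x1C + 0x15] = 0x80 if last else 0x00
    return bytes(img)

# A card carrying both a legacy x86 image and an EFI image.
demo_rom = make_image(0x00, last=False) + make_image(0x03, last=True)
images = list_rom_images(demo_rom)
for image in images:
    print(hex(image["offset"]), image["kind"])
```

On Linux a real dump can usually be read from the device's `rom` file in sysfs after enabling it, though whether that works depends on the card and kernel. An image with code type 0x00 is the x86 real-mode blob that has to be emulated on anything that isn't a PC.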

qdot_me | 6 years ago | on: PCI Express on the Raspberry Pi 4

Yup. The daughterboard is on my mind. Likely flex-PCB and that’s gonna take a week or two to respin. Hence I’m collecting ideas for various daughtercards I could cram in a panel before sending it off - straight-through to riser via USB3, ExpressCard SMT, maybe through-hole 1x?

That said, PCIe PHYs are extremely robust - they do most of the impedance matching and delay mismatch training. And if you don’t ruin the onboard caps, this could be jumpered straight across.

qdot_me | 6 years ago | on: PCI Express on the Raspberry Pi 4

There are two ways of doing clusters - one is a message passing paradigm, which you can do over Ethernet (to an extent - I’d still take USB3 for 4x the bandwidth) - and the other is direct memory access à la Cray.
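A back-of-the-envelope check of the "4x" figure, assuming USB 3.0 signalling at 5 Gbit/s with 8b/10b line coding against gigabit Ethernet at 1 Gbit/s. Protocol overhead on both sides is ignored, and the helper name is just for illustration:

```python
def usable_gbps(line_rate_gbps, data_bits, coded_bits):
    """Payload rate left after line coding (e.g. 8 data bits per 10 on the wire)."""
    return line_rate_gbps * data_bits / coded_bits

usb3_gbps = usable_gbps(5.0, 8, 10)   # USB 3.0: 5 Gbit/s raw, 8b/10b coded
gige_gbps = 1.0                       # 1000BASE-T data rate
ratio = usb3_gbps / gige_gbps

print(f"USB3 ~{usb3_gbps} Gbit/s vs GigE {gige_gbps} Gbit/s -> {ratio}x")
```

Real-world throughput will be lower on both links, so treat the ratio as an upper-bound comparison rather than a measurement.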

What really motivated me to do this hack is the relative abundance of stuff I can now plug into an FPGA :)

qdot_me | 6 years ago | on: PCI Express on the Raspberry Pi 4

This might even allow some level of memory mapping in all directions. Unfortunately PCIe switches are not very hackable as such, but maybe I can spin a board for this purpose... after I get the hack more industrialized.

qdot_me | 6 years ago | on: PCI Express on the Raspberry Pi 4

Hack’s creator here - it’s on my list of things to try. GPUs are notoriously hard to get working on non-Intel platforms; I’ve tried to get a few up on Alpha and Itanium in the past.

VideoBIOS expects to run, and expects a well-behaved Intel CPU to do the power-up. That said, X can sometimes emulate these quite well. On ARM we’d also run into alignment issues and likely other quirks - but in principle...
