codedokode | 7 years ago:

https://github.com/grz0zrg/fbg/blob/master/src/fbgraphics.c#...

> memcpy(fbg->buffer, fbg->disp_buffer, fbg->size);

So they don't flip buffers, and don't even use video memory for them; they just copy data from a buffer in main memory into a memory-mapped framebuffer. This must be slow. Also, it doesn't wait for vblank and therefore doesn't protect against tearing, yet the description claims "double buffering".

As a side note, I remember that the framebuffer was very slow on a computer I had. Without the proprietary video card driver, both Windows and Linux had trouble even scrolling a window; it was very laggy. Why was unaccelerated VGA (or VESA?) mode so slow, I wonder?
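For reference, the pattern described above amounts to something like this (a minimal sketch, not the library's actual code; `fb_mem` here merely stands in for the mmap'd framebuffer region):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Back buffer in ordinary heap memory; fb_mem stands in for the
   memory-mapped framebuffer (normally obtained via mmap on /dev/fb0). */
typedef struct {
    uint32_t *back;   /* the application draws here */
    uint32_t *fb_mem; /* "video" memory: every flip is a full-frame copy */
    size_t    pixels;
} fbg_like;

static void draw(fbg_like *f, uint32_t color) {
    for (size_t i = 0; i < f->pixels; i++)
        f->back[i] = color;  /* render into the back buffer */
}

static void flip(fbg_like *f) {
    /* No pointer swap, no vblank wait: just copy the whole frame. */
    memcpy(f->fb_mem, f->back, f->pixels * sizeof(uint32_t));
}
```

A real flip would exchange two video-memory surfaces during vblank; here every frame pays for a full-frame memcpy across the bus, and nothing synchronizes the copy with the display scanout.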
smallstepforman | 7 years ago:

VESA is surprisingly good with Haiku. Since most GPU manufacturers have ignored Haiku, the community has spent considerable time getting the VESA driver as good as it can be, and it is very respectable. Most users are not even aware that they are running VESA, since Haiku is still less laggy than its contemporaries.
ajross | 7 years ago:

The memory-mapped framebuffer is video memory; that's why you have to use a driver API to map it. Though on many systems it obviously ends up just being system DRAM with different caching settings anyway.

This isn't something you'd code a game with, but you'd be surprised at how high modern memory bandwidth is.
pmarin | 7 years ago:

> As a side note, I remember that framebuffer was very slow on a computer I had. Without proprietary video card driver both Windows and Linux had troubles even scrolling a window, it was very laggy. Why was unaccelerated VGA (or VESA?) mode so slow, I wonder?

The framebuffer on modern computers works surprisingly well, except for playing video or 3D graphics. Sometimes on BSD systems it is the only option.
userbinator | 7 years ago:

> Why was unaccelerated VGA (or VESA?) mode so slow, I wonder?

I haven't looked into the details on Linux, but on Windows the "default" VESA/VGA mode uses the VGA BIOS, whose code is run in 16-bit VM86 mode:

https://wiki.osdev.org/Virtual_8086_Mode#Usage

http://nuclear.mutantstargoat.com/articles/pcmetal/pcmetal04...

The 16-bit code is not optimised for speed (is your framebuffer more than 64K? It's probably doing bank-switching and only copying 64K at a time), and the video BIOS itself may reside behind a slow serial interface on the GPU [1], so executing from it is very slow.

[1] https://en.wikipedia.org/wiki/Serial_Peripheral_Interface_Bu...
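The bank-switching mentioned above looks roughly like this (a hypothetical sketch: `set_bank` stands in for the VESA BIOS window call, and the "VRAM" is simulated so the switch count is visible):

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BANK_SIZE 65536u  /* the 64K real-mode window at 0xA0000 */

static unsigned bank_switches;            /* counts the slow BIOS calls */
static uint8_t  vram_sim[4 * BANK_SIZE];  /* simulated 256K of video memory */
static uint8_t *window;                   /* the currently mapped 64K bank */

static void set_bank(unsigned bank) {
    /* On real hardware this is a VESA BIOS int 10h call into slow
       16-bit code; here we just count it and move the window. */
    bank_switches++;
    window = vram_sim + (size_t)bank * BANK_SIZE;
}

/* Copy a whole frame through the 64K window, one bank at a time. */
static void blit_banked(const uint8_t *src, size_t len) {
    for (size_t off = 0; off < len; off += BANK_SIZE) {
        size_t chunk = len - off < BANK_SIZE ? len - off : BANK_SIZE;
        set_bank((unsigned)(off / BANK_SIZE));
        memcpy(window, src + off, chunk);
    }
}
```

A 640x480 8-bit frame (300K) already needs five such BIOS round-trips per blit, which is where much of the time went.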
raverbashing | 7 years ago:

No, it is as fast as it gets, and it's the correct way of doing it (in fb at least).

biscuitNotchips | 7 years ago:

Sadly, the Linux framebuffer is still inferior to the Windows one.

I'm running Windows 7 with a 14-year-old graphics card (an ATI Radeon 9250, DirectX 8.1) using Windows 2000 drivers in compatibility mode, and if you exclude some minor tearing, everything else is perfect.

Same configuration in Linux, on various distros from the lightest to the heaviest, and the results are the opposite: heavy screen tearing and choppy scrolling everywhere. The weird thing is that I see more graphics features enabled in chrome://gpu/ in Linux than I see in Windows on this ancient card, yet the overall performance is inferior in Linux.
dingdingdang | 7 years ago:

My experience is exactly the same. I so want a snappy, low-latency desktop experience in Linux, and it is just not happening. Today I run a "lightweight" Ubuntu MATE workstation with an i7 + dedicated GPU, and it's still worse latency-wise than my Windows XP PC was more than 15 years ago. Not good. Really not good.
ChuckMcM | 7 years ago:

"...produce fullscreen pixels effects easily with non-accelerated framebuffer ... the initial target platform is a Raspberry PI 3B"

Of course the Raspberry Pi 3 B actually has a GPU, but it is so entangled with crap that someone invests their time in building a sub-par visual experience. That is so very sad.

The computer business sucks so much these days.
m45t3r | 7 years ago:

Is "parralel" a real word in English? I think the author meant "parallel"?

I am asking because everywhere in the README.md, and even in the source code itself, "parralel" is used, so the typo seems intentional. The author is either using an uncommon way to write the word parallel, is not a native English speaker, or there is some other reason.

onirom | 7 years ago:

Thank you!
quadcore | 7 years ago:

A suggestion for the benchmarks: add one where the screen is filled with a solid color. I was getting 175 fps at 1280x768 with a 1.5 GHz AMD back in 2003. Such a microbenchmark sets the baseline for how fast you can get via the framebuffer.
yason | 7 years ago:

That's about 6 ms per frame; since a 60 fps frame is ~16.7 ms, you could still spend a bit less than two thirds of each frame on rendering and hit 60 fps. But that's still just crazy slow.

175 fps at that resolution comes down to roughly 170-690 MB/s depending on display depth (8 to 32 bits per pixel). Peak memory transfer rates were around that level in the late 1990s, regular systems reached several GB/s in the 2000s, and bandwidth is measured in tens of GB/s these days.

As long as we can reduce framebuffer access to plain shared memory (eliminating any legacy pixel-transfer methods), there's no reason we couldn't do significantly better. The display controller is reading the memory while the CPU is writing it, so they share some of the bandwidth, but the speeds should be so high that there's no hardware reason we should ever see visual jerkiness in framebuffer graphics.

onirom | 7 years ago:

However, a single-core memset (still on the RPi 3B) for this case is fast: 235 FPS.
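The fill-rate arithmetic can be spelled out (a quick helper, using the 1280x768 at 175 fps example from above):

```c
#include <stdint.h>

/* Bytes per second needed to repaint a w x h screen at the given
   pixel depth (in bytes per pixel) and refresh rate. */
static uint64_t fill_rate_bytes(uint64_t w, uint64_t h,
                                uint64_t bytes_per_px, uint64_t fps) {
    return w * h * bytes_per_px * fps;
}
```

For 1280x768 at 175 fps this gives about 172 MB/s at 8 bpp and about 688 MB/s at 32 bpp, already within reach of late-1990s peak memory bandwidth and a small fraction of today's.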
Yeah, that is actually extremely slow, unless the time goes into calculating the buffer contents; in that case it isn't really a benchmark of the FB implementation but of something else.
I would have thought the vector capabilities of ARM and Intel architectures would be useful here, in addition to any multithreading. They can be used to add fast, GPU-like functions, e.g. alpha blending.
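For instance, per-pixel alpha blending is exactly the kind of loop that maps well onto NEON or SSE; a portable scalar sketch (modern compilers auto-vectorize it at -O2/-O3):

```c
#include <stddef.h>
#include <stdint.h>

/* Classic integer alpha blend per 8-bit channel:
   out = round((src*a + dst*(255 - a)) / 255),
   using the exact +128 / >>8 trick to avoid the division. */
static uint8_t blend_channel(uint8_t s, uint8_t d, uint8_t a) {
    unsigned v = s * a + d * (255u - a) + 128u;
    return (uint8_t)((v + (v >> 8)) >> 8);
}

/* Blend one row of RGBA pixels with a constant alpha; the iterations
   are independent, so SIMD units can process many channels at once. */
static void blend_row(uint8_t *dst, const uint8_t *src,
                      size_t npx, uint8_t a) {
    for (size_t i = 0; i < npx * 4; i++)
        dst[i] = blend_channel(src[i], dst[i], a);
}
```

The independence of each channel is what makes this loop vectorize cleanly: there are no cross-iteration dependencies, so the compiler can process 16 or more channels per instruction.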
How does it compare to llvmpipe, for instance? I expect llvmpipe to be able to use NEON (the vector extension on ARM) instructions, as well as similar SIMD mechanisms on other platforms.
gameswithgo | 7 years ago:

https://www.twitch.tv/videos/8349645

CharlesMerriam2 | 7 years ago:

Hmm... 1. This is being announced on GitHub, a Microsoft entity; 2. there are typos everywhere in the documentation; 3. the code is generally undocumented.

This is a single author's demonstration piece. Odd that it made the top five items on Hacker News.