Humorously enough, when I worked on a team that was writing a graphical web browser for mobile in the late '90s [1], they used a display list for rendering.
The reasoning was somewhat different: web pages were essentially static (we didn't do "DHTML"), so if the page rendering process could generate an efficient display list, the page source could be discarded and only the display list needed to be held in memory. Rendering could then be pipelined with reading the page over the network, so the entire page was never in memory.
Full Disclosure: while I later wrote significant components of this browser (EcmaScript, WmlScript, SSL, WTLS, JPEG, PNG), the work I'm describing was entirely done by other people!
[1] - I joined in '97; the first public demo was at GSM World Congress in Feb '98.
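For anyone unfamiliar with the idea, a display list in this sense is just a flat sequence of primitive drawing commands. A toy sketch (all type and method names here are illustrative, not from any real browser) of how one can be built incrementally as markup streams in, then replayed without the source:

```rust
// Toy display list: a flat list of drawing commands that can be
// replayed without keeping the original page source in memory.
// Types and names are illustrative, not from any real browser.

#[derive(Debug, Clone, PartialEq)]
enum DisplayItem {
    Rect { x: i32, y: i32, w: i32, h: i32, color: u32 },
    Text { x: i32, y: i32, glyphs: Vec<u16> },
}

struct DisplayList {
    items: Vec<DisplayItem>,
}

impl DisplayList {
    fn new() -> Self {
        DisplayList { items: Vec::new() }
    }

    // As each chunk of markup arrives over the network, the layout
    // pass appends items; the chunk itself can then be discarded.
    fn push(&mut self, item: DisplayItem) {
        self.items.push(item);
    }

    // Replaying is a simple linear walk -- cheap enough to repeat
    // for every repaint or scroll.
    fn replay<F: FnMut(&DisplayItem)>(&self, mut paint: F) {
        for item in &self.items {
            paint(item);
        }
    }
}

fn main() {
    let mut list = DisplayList::new();
    list.push(DisplayItem::Rect { x: 0, y: 0, w: 100, h: 20, color: 0xFFFFFF });
    list.push(DisplayItem::Text { x: 2, y: 14, glyphs: vec![72, 105] });

    let mut painted = 0;
    list.replay(|_| painted += 1);
    assert_eq!(painted, 2);
}
```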
I developed a hypermedia browser called HyperTIES for NeWS that was scriptable in FORTH, and whose formatter could output FORTH code to lay out and paginate an article for a particular screen size, which could be saved out as a binary FORTH image that you could restart quickly.
The FORTH code then downloaded PostScript code into the NeWS server, where it would be executed in the server to draw the page.
It even had an Emacs interface written in Mocklisp!

http://www.donhopkins.com/home/archive/HyperTIES/ties.doc.tx...
http://www.donhopkins.com/home/images/HyperTIESDiagram.jpg
http://www.donhopkins.com/home/images/HyperTIESAuthoring.jpg
http://www.donhopkins.com/drupal/node/101
http://www.donhopkins.com/drupal/node/102
http://www.donhopkins.com/home/ties/
http://www.donhopkins.com/home/ties/fmt.f
http://www.donhopkins.com/home/ties/fmt.c
http://www.donhopkins.com/home/ties/fmt.cps
http://www.donhopkins.com/home/ties/fmt.ps
http://www.donhopkins.com/home/ties/ties-2.ml
Now that this is closer to shipping, I'm curious what impact this would have on battery life. On the one hand, this is lighting up more silicon; on the other hand: a faster race to sleep, perhaps?
Have there been any measurements on what the end result is on a typical modern laptop?
Just tried it with the Nightly by setting gfx.webrender.enabled to true in about:config. Wow, that thing flies. It's seriously amazing. And so far no bugs or visual inconsistencies I could detect. Firefox is really making great progress on this front!
There are more steps necessary to enable WebRender in full capacity. You can find the current full list of steps here: https://mozillagfx.wordpress.com/2017/09/25/webrender-newsle...
I presume, though, that things are buggier with those enabled, and any performance regressions they introduce might actually make it feel slower for now. I don't know, though; I haven't tested it with just gfx.webrender.enabled.
Speaking of rendering text glyphs on the GPU, there's a really clever trick (commonly called Loop-Blinn, after its two authors): https://developer.nvidia.com/gpugems/GPUGems3/gpugems3_ch25.... You can pretty much just use the existing Bezier control points from TTF as-is, which is really nice.

If only it were as simple as just using Loop-Blinn. :) The technique described there will produce unacceptably bad antialiasing for body text. Loop-Blinn is fine if you want fast rendering with medium-quality antialiasing, though. (Incidentally, it's better to just use supersampling or MLAA-style antialiasing with Loop-Blinn and not try to do the fancy shader-based AA described in that article.)
Additionally, the original Loop-Blinn technique uses a constrained Delaunay triangulation to produce the mesh, which is too expensive (O(n^3) IIRC) to compute in real time. You need a faster technique, which is really tricky because it has to preserve curves (splitting when convex hulls intersect) and deal with self-intersection. Most of the work in Pathfinder 2 has gone into optimizing this step. In practice people usually use the stencil buffer to compute the fill rule, which hurts performance as it effectively computes the winding number from scratch for each pixel.
The good news is that it's quite possible to render glyphs quickly and with excellent antialiasing on the GPU using other techniques. There's lots of miscellaneous engineering work to do, but I'm pretty confident in Pathfinder's approach these days.
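To make the fill-rule cost above concrete, here's a tiny CPU-side sketch of the nonzero winding rule — the per-sample computation a stencil-based approach effectively redoes for every pixel. It handles straight edges only (real glyph outlines also carry Bezier segments), and the function name is mine, not from Pathfinder or WebRender:

```rust
// Nonzero winding-number test for a point against a closed polygon.
// Casts a horizontal ray from (px, py) toward +x and counts signed
// edge crossings: +1 for an upward crossing with the point left of
// the edge, -1 for a downward crossing with the point right of it.

fn winding_number(px: f64, py: f64, poly: &[(f64, f64)]) -> i32 {
    let mut winding = 0;
    let n = poly.len();
    for i in 0..n {
        let (x0, y0) = poly[i];
        let (x1, y1) = poly[(i + 1) % n];
        // Cross product of the edge vector with (point - edge start):
        // positive means the point lies to the left of the edge.
        let cross = (x1 - x0) * (py - y0) - (px - x0) * (y1 - y0);
        if y0 <= py && y1 > py {
            // Upward crossing.
            if cross > 0.0 {
                winding += 1;
            }
        } else if y0 > py && y1 <= py {
            // Downward crossing.
            if cross < 0.0 {
                winding -= 1;
            }
        }
    }
    winding
}

fn main() {
    // Unit square, counter-clockwise winding.
    let square = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0)];
    assert_eq!(winding_number(0.5, 0.5, &square), 1); // inside
    assert_eq!(winding_number(2.0, 0.5, &square), 0); // outside
}
```

A nonzero result means the point is filled; an even-odd fill rule would instead test `winding % 2 != 0` on unsigned crossings.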
The name "WebRender" is unfortunate though. Things with a "Web" prefix - "Web Animations", "WebAssembly", "WebVR" - are typically cross-browser standards. This is just a new approach Firefox is using for rendering. It doesn't appear to be part of any standard.
I remember reading at some point that WebRender could actually be isolated relatively easily and applied to basically any browser. That has sort of already taken place, with the move from Servo over into Gecko.
So, it might actually turn into somewhat of a pseudo-standard.
I'd largely forgotten what pixel shaders actually were, so it was nice to get a high level understanding through this article, especially with the drawings!
I was already extremely pleased with the Firefox Quantum beta, they really are stepping their game up. If this is truly as clean as they say it is, web browsing on cheap computers just got much smoother.
I really appreciate the time they are taking to describe the changes in an easy to understand way. The sketches and graphics really help explain a pretty complex subject.
So it will be using OpenGL then? UPDATE: Ah, I see it's mentioned in the future work: https://github.com/servo/webrender/wiki#future-work

Vulkan has been a consideration from the earliest architectural steps in WebRender, so the internal pipelines are all set up to be mapped to Vulkan's pipelines.
It's actually OpenGL which fits less into the architecture, but it's still easier to just bundle WebRender's pipelines all together and then throw that into OpenGL.
Don't PC games use thousands of draw calls per frame?

They do, but we're targeting Intel HD-quality graphics, not gaming-oriented NVIDIA and AMD GPUs.
That said, even Intel GPUs can often deal with large numbers of draw calls just fine. It's mobile where they become a real issue.
Aggressive batching is still important to take maximum advantage of parallelism. If you're switching shaders for every rect you draw, then you frequently lose to the CPU.
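A minimal sketch of what that batching amounts to (the types and the single-field state key are my own simplification; a real renderer keys on much more than a shader id): sort queued primitives by the GPU state they need, then emit one draw call per run of identical state.

```rust
// Sketch of draw-call batching: group primitives by required GPU
// state (here just a shader id) so state switches are minimized.
// Names are illustrative, not from WebRender.

#[derive(Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
struct ShaderId(u32);

struct Primitive {
    shader: ShaderId,
    // ... vertex data elided ...
}

// Returns (shader id, instance count) per draw call actually issued.
fn batch(mut prims: Vec<Primitive>) -> Vec<(u32, usize)> {
    // Stable sort keeps submission order within each batch.
    prims.sort_by_key(|p| p.shader);

    let mut calls: Vec<(u32, usize)> = Vec::new();
    for p in prims {
        match calls.last_mut() {
            // Same state as the previous primitive: grow the batch.
            Some((id, count)) if *id == p.shader.0 => *count += 1,
            // State change: a new draw call is required.
            _ => calls.push((p.shader.0, 1)),
        }
    }
    calls
}

fn main() {
    // Six primitives alternating between two shaders: submitted
    // naively that is six draw calls with a state switch each time;
    // batched, it collapses to two.
    let prims: Vec<Primitive> = (0..6)
        .map(|i| Primitive { shader: ShaderId(i % 2) })
        .collect();
    let calls = batch(prims);
    assert_eq!(calls, vec![(0, 3), (1, 3)]);
}
```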
> What if we stopped trying to guess what layers we need? What if we removed this boundary between painting and compositing and just went back to painting every pixel on every frame?
This feels a bit like cheating. Not all devices have a GPU. Would Firefox be slow on those devices?
Also, pages can become arbitrarily complicated. This means that an approach where compositing is used can still be faster in certain circumstances.
To address your second point, you seem to be saying that missing the frame budget once and then compositing the rest of the time would be better than missing the frame budget every time.
That is certainly true, but a) the cases where you can do everything as a compositor optimization are very few (transform and opacity mostly) so aside from a few fast paths you'd miss your frame budget all the time there too, and b) we have a lot of examples of web pages that are slow on CPU renderers and very fast on WebRender and very few examples of the opposite aside from constructed edge case benchmarks. Those we have found had solutions and I suspect the other cases will too.
As resolution and framerate scale, CPUs cannot keep up. GPUs are the only practical path forward.
Actually, virtually every device the average consumer uses has a GPU. For instance, even Atom processors have GPUs. Granted, they don't have as many cores as a full-fledged NVIDIA GPU, nor as much dedicated memory, but they are still GPUs, with several tens of cores and specialized APIs that were designed specifically for the tasks at hand. Plus, they offload the CPU (somewhat).
I imagine the render task tree also has to determine which intermediate textures to keep in the texture cache, and which ones will likely need to be redone in the next frame. That kind of optimization has to be tricky.

In practice, LRU caches work pretty well.
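One common answer for that kind of problem is a least-recently-used policy. A toy LRU sketch (hypothetical names; not WebRender's actual texture cache, which also has to track texture sizes and GPU memory pressure):

```rust
use std::collections::VecDeque;

// Minimal LRU cache for render-task outputs, keyed by task id.
// When full, the least recently used entry is evicted.

struct LruTextureCache {
    capacity: usize,
    // Front = least recently used, back = most recently used.
    order: VecDeque<u64>,
}

impl LruTextureCache {
    fn new(capacity: usize) -> Self {
        LruTextureCache { capacity, order: VecDeque::new() }
    }

    // Returns true on a cache hit. Either way, the task id becomes
    // the most recently used entry; on a miss, the LRU entry is
    // evicted first if the cache is full.
    fn request(&mut self, task_id: u64) -> bool {
        if let Some(pos) = self.order.iter().position(|&id| id == task_id) {
            self.order.remove(pos);
            self.order.push_back(task_id);
            return true;
        }
        if self.order.len() == self.capacity {
            self.order.pop_front(); // evict least recently used
        }
        self.order.push_back(task_id);
        false
    }
}

fn main() {
    let mut cache = LruTextureCache::new(2);
    assert!(!cache.request(1)); // miss
    assert!(!cache.request(2)); // miss
    assert!(cache.request(1));  // hit; task 2 is now the LRU entry
    assert!(!cache.request(3)); // miss; evicts task 2
    assert!(!cache.request(2)); // miss again
}
```

The linear scan in `request` is fine for a sketch; a production cache would pair a hash map with the recency list to make lookups O(1).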
With a compositor you're already drawing every pixel every frame on the GPU, whether it's just a cursor blinking or not. The WR approach basically only adds a negligible amount of vertex shading time.
I tried testing it out on a ThinkPad T61 to see how well it works with an older embedded GPU (Intel 965 Express), but I can't enable it (on Windows 10) because D3D11 compositing is disabled. It says: D3D11_COMPOSITING: Blocklisted; failure code BLOCKLIST_
So does that mean that it is known not to work with that GPU? Can you override the blocklist to see what happens?
Edit: It also says:
> Direct2D: Blocked for your graphics driver version mismatch between registry and DLL.
and
> CP+[GFX1-]: Mismatched driver versions between the registry 8.15.10.2697 and DLL(s) 8.14.10.2697, reported.
Indeed, that is correct: the driver is marked as version 8.15.10.2697, but the file version of the DLLs is 8.14.10.2697. This seems to be intentional on Microsoft's or Intel's part; note that the build numbers are still the same. Firefox is quite naive if it thinks it can just match those.
While I would consider myself more a Golang fan than a Rust fan, I am impressed by the speed with which the Mozilla team is changing fundamental parts of their browser, and somehow I believe Rust has something to do with that speed.
I've been working professionally with Rust for a year now. Once I got over the initial wall, it became the best tool I've had for building backend applications. I have history with at least nine different languages over my professional career, but nothing comes close to the confidence and ergonomics that the Rust ecosystem's tools provide.
Firefox, especially the new Quantum version is awesome. But Rust as a side product might be the best thing Mozilla brought us. I'm truly thankful for that.
Will there finally be unified use of the GPU on all platforms (Win, Mac, Linux, etc.), or will WebRender just be a Windows-only feature for quite some time?

Update: about:support says not ready for Android.
I have WebRender working on Linux with Intel 5500 integrated graphics. Hardware acceleration is still a bit glitchy though I'm afraid (with or without WebRender).
To enable, toggle 'layers.acceleration.force-enabled' as well as 'gfx.webrender.enabled'
edit: It's also working through my Nvidia 950m (through bumblebee), although subjectively it seems to have a little more lag this way.
Why are they so obsessed with 60 fps? 120 fps looks considerably better, and other artifacts like smear and judder decrease even further at significantly higher frame rates, say 480 fps [1].

[1] http://blogs.valvesoftware.com/abrash/down-the-vr-rabbit-hol...
The WebRender folks are well aware that higher framerates are the future. Here's a tweet from Jack Moffitt today, a Servo engineer (and Servo's technical lead, I believe): https://twitter.com/metajack/status/917784559143522306
"People talk about 60fps like it's the end game, but VR needs 90fps, and Apple is at 120. Resolution also increasing. GPUs are the only way. Servo can't just speed up today's web for today's machines. We have to build scalable solutions that can solve tomorrow's problems."
As everyone said, 60fps is not the destination but merely a waypoint. It's a good goal, considering 99% of screens that are in use today refresh at 60 Hz or their regional equivalent. Higher refresh rates are next.
Not an expert, but I feel that was more of an analogy/image to convey what they were aiming for. The real objective is not 60 fps; the real objective is to use the GPU for the tasks it was designed for, plain and simple. This, however, gives the user a smoother experience, and 60 fps generally makes a noticeable difference.
We're not, as other people have said in other comments. On normal content you can often see WebRender hit 200+ fps if you don't lock it to the frame rate. To see this for yourself, run Servo on something with -Z wr-stats, which will show you a performance overlay.
I don't think they're obsessed with 60 FPS; that's just what, for most people, is synonymous with a smooth experience, and it's often not met by browsers at this point in time.
Here's, for example, an early demo showing Wikipedia at ridiculous frames per second (starts at 0:26:00): https://air.mozilla.org/bay-area-rust-meetup-february-2016/

In the video, he says 500 FPS, but assuming there's no more complicated formula behind this, I think it would actually be 2174 FPS (0.46 ms GPU time per frame -> 1/0.00046 s = 2173.913 FPS).

Baby steps.