I still don’t understand why you didn’t use the precompiled shaders packed with the games… you’re emulating the GameCube or Wii GPU, and it’s never going to change, and the games provide precompiled shaders.
First, the GameCube/Wii API actually generates the "shaders" at runtime, so there is simply no way to know which vertex/pixel pipeline states the game needs short of playing through the whole game and looking at every single bit of level geometry.
Many games actually dynamically generate new "shaders" on the fly, based on which lights are near an object, and in which order.
Second, we can't use those vertex/pixel pipeline states directly on a modern GPU. They need to be translated into modern shaders and then compiled by the driver for your graphics card. It's actually that compile step which causes the stuttering; Dolphin's translation is plenty fast.
The combination of these two facts means Dolphin can't depend on any pre-computation at all.
1. "Shader" is just a metaphor. The actual code running on the GameCube GPU is a custom pipeline that has a dynamic structure and is updated aggressively throughout the lifetime of the app, so there is no static "shader" program to run on the host GPU.
2. The architectures of the GameCube and modern GPUs are so distinct that an intricate translation layer is needed to map GameCube rendering operations to first-class shader operations on a modern GPU. This very process causes the stuttering in question.
That's the trick, they actually don't provide precompiled shaders as you know them. The graphics hardware back then was fixed function pipelines with a tremendous number of options to configure how they work. The downside is that you can't run truly arbitrary code but the upside is that they can instantaneously switch behavior as fast as setting a register.
Prior to ubershaders the emulator took a configuration for the hardware pipeline and turned that into a shader, which took time to compile. Ubershaders work by emulating the entire fixed function pipeline in one glorious shader until the smaller, more efficient shader can be compiled and slipped in.
Basically, the ubershader is the only thing that can actually understand the "shaders" packaged with the game and start using them with zero latency.
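To make the "slipped in" part concrete, here is a rough Python sketch of that hand-off. It is not Dolphin's code: `HybridShaderCache`, `shader_for`, and `finish_pending` are made-up names, the configuration is just a tuple, and the background compile is faked by an explicit method call.

```python
from dataclasses import dataclass, field


@dataclass
class HybridShaderCache:
    """Toy model: use the ubershader until a specialized shader is ready."""

    ready: dict = field(default_factory=dict)    # config -> specialized shader
    pending: list = field(default_factory=list)  # configs waiting to compile

    def shader_for(self, config):
        """Return something drawable right away; never wait on a compile."""
        if config in self.ready:
            return self.ready[config]
        if config not in self.pending:
            self.pending.append(config)  # schedule a compile for later
        return "ubershader"

    def finish_pending(self):
        """Stand-in for the driver finishing compiles in the background."""
        for config in self.pending:
            self.ready[config] = f"specialized{config}"
        self.pending.clear()


cache = HybridShaderCache()
config = ("add", "mul")          # hypothetical pipeline configuration
first = cache.shader_for(config)  # no specialized shader yet
cache.finish_pending()            # pretend the compile completed
later = cache.shader_for(config)  # now the specialized one is used
print(first, "->", later)
```

The point of the design is that a draw call never blocks: the slow general shader covers the gap, and the faster specific one takes over whenever it happens to be done.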
Why not just precompile all the possible hardware combinations? There are far more combinations than atoms in the universe.
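For a sense of scale, a quick back-of-the-envelope check in Python. The thread doesn't give the actual number of configuration bits, so nothing here plugs in a real figure; `10**80` is the usual order-of-magnitude estimate for atoms in the observable universe, and treating every bit pattern as a distinct valid state is an upper-bound assumption.

```python
# Common order-of-magnitude estimate, not an exact figure.
ATOMS_ESTIMATE = 10**80


def min_bits_to_exceed(limit: int) -> int:
    """Smallest number of independent on/off settings whose 2**bits > limit."""
    bits = 0
    while 2**bits <= limit:
        bits += 1
    return bits


bits_needed = min_bits_to_exceed(ATOMS_ESTIMATE)
print(bits_needed)
```

So once a pipeline has a few hundred bits of independent configuration, enumerating every combination is off the table, even before counting how long each compile takes.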
Why not just precompile all the hardware combinations that the game actually uses? There's no way to tell beforehand without examining every branch of the game's code, which ranges in difficulty from "computationally prohibitive" to "fundamental theorems of how computers work say this is impossible".
The article mentions that some users actually passed around cached shader packs, but that solution was brittle.
Wait, I thought that's what the ubershaders are. What you say is what I kept thinking for much of the article - "just" emulate the GPU, no compiler needed. And then they did.
One thing to remember is these older consoles don't have the same concept of a "shader" as we do today.
Go back far enough and you'll find the industry trying to settle on quads or triangles for rendering (and we all know who won).
The games were given basically an immediate mode API into the graphics card and they could do whatever they wanted, whenever they wanted, without warning.
The stutter happened when they were translating the API mentioned above into modern GPU shaders.
When it was on the CPU, they had to determine the effect, generate and compile the modern shaders, and upload them to the GPU, sometimes hundreds of times a second. Then the GPU would take over and display.
Uber shaders took that entire pipeline and moved it into the GPU.
This was low level emulation, just still hitting limits of modern CPUs.
PC games that have a shader precompile step usually have to redo it when new drivers come out. Precompiled shaders can be shipped to closed systems such as consoles or even the Steam Deck, but not to PCs: each GPU brand requires different ones, and, like I said, so does each driver update.
They're precompiled for the console GPU architecture, not the PC architecture, so they can't be used directly and still need to be emulated - I think those precompiled shaders are the input to the ubershader.
The GAMES THEMSELVES are precompiled for the PowerPC architecture, not the PC architecture, though. That didn’t stop anyone from creating Dolphin.
GPUs (I’m told) have far fewer instructions to emulate than a CPU, so I’d think that low level emulation of the Flipper shaders would be no trouble. Can’t translate or transpile them to PC GPUs though because those instruction sets are somewhat secret, I think.
I know nothing about this stuff but I am a developer so perhaps I know enough to ask the most stupid questions possible.
It’s gotta be a performance thing, why they didn’t emulate Flipper at a low enough level to use the precompiled shaders directly.
> because those instruction sets are somewhat secret, I think
The GPU ISAs are known (e.g. the PTX compiler for Nvidia is open source and has a backend in LLVM). The main problem is that the GPU ISA changes with every GPU hardware generation and manufacturer, so if you want to support Nvidia 3xxx + 4xxx + AMD VLIW + AMD GCN + ... you have to use the common denominator GLSL/HLSL/SPIR-V/whatever.
> why they didn’t emulate Flipper at a low enough level to use the precompiled shaders directly.
They did. Originally the GPU emulator was done in the CPU, and in 2017, the GPU emulator itself was moved into a shader ("ubershader").
The console game itself does not include shaders in text format like many PC games do.
The ubershader is the thing that emulates Flipper at a low enough level to use the precompiled "shaders" directly. Prior to that the precompiled "shaders" were examined and recompiled into individual shaders, a process that took time.
(Why "shaders" in quotes? Because they weren't shaders as we know them today but really more like lists of hardware flags for how to flow data through a fixed function pipeline)
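If it helps, here is a toy Python illustration of the two approaches. Everything in it is invented for the example (the two-stage config, the `add`/`mul` ops, the input names); the real hardware has far more options. `run_generic` reads the flags at draw time the way an ubershader would, while `specialize` turns one specific configuration into dedicated code, with `exec` standing in for the expensive driver compile.

```python
OPS = {"add": lambda a, b: a + b, "mul": lambda a, b: a * b}
OP_SYMBOLS = {"add": "+", "mul": "*"}


def run_generic(config, inputs):
    """Ubershader style: interpret the flags on every call, no compile."""
    regs = dict(inputs, prev=0.0)
    for op, src_a, src_b in config:
        regs["prev"] = OPS[op](regs[src_a], regs[src_b])
    return regs["prev"]


def specialize(config):
    """Older style: generate code for exactly one configuration."""

    def operand(name):
        return "prev" if name == "prev" else f"regs[{name!r}]"

    lines = ["def shader(regs):", "    prev = 0.0"]
    for op, src_a, src_b in config:
        lines.append(
            f"    prev = {operand(src_a)} {OP_SYMBOLS[op]} {operand(src_b)}"
        )
    lines.append("    return prev")
    namespace = {}
    exec("\n".join(lines), namespace)  # stands in for the driver compile
    return namespace["shader"]


# Hypothetical "list of hardware flags": multiply two inputs, then add a third.
config = [("mul", "texture", "vertex_color"), ("add", "prev", "ambient")]
inputs = {"texture": 0.5, "vertex_color": 0.5, "ambient": 0.25}

generic_result = run_generic(config, inputs)
specialized_result = specialize(config)(inputs)
print(generic_result, specialized_result)
```

Both paths compute the same value; the difference is where the cost lands. The generic one pays a little on every pixel, the specialized one pays a lot once, up front.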
Yes, that’s exactly the point though. This is the same question as why you can’t emulate a game by precompiling its code, and this doesn’t work because that information isn’t available until you try to run the game. That’s why Dolphin has an interpreter/JIT.
>This is the same question as why you can’t emulate a game by precompiling its code, and this doesn’t work because that information isn’t available until you try to run the game.
I mean technically you can, but it generally requires a bunch of inefficient jump tables, or alternatively a way to fall back to an interpreter or JIT for self modifying code.
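A minimal Python sketch of what that looks like, using an entirely hypothetical three-instruction guest machine. `JUMP_TABLE` holds the blocks translated ahead of time, keyed by guest address; address 5 is deliberately left out, as if static analysis never discovered it, so the indirect jump lands in the interpreter fallback.

```python
# Toy guest program: address -> (opcode, operand). Hypothetical ISA.
PROGRAM = {
    0: ("add", 5),
    1: ("jump_acc", None),  # indirect jump: target is only known at runtime
    5: ("add", 10),
    6: ("halt", None),
}


def interpret_one(pc, acc):
    """Slow path: decode and execute a single guest instruction."""
    op, arg = PROGRAM[pc]
    if op == "add":
        return pc + 1, acc + arg
    if op == "jump_acc":
        return acc, acc
    return None, acc  # halt


# Ahead-of-time translations for the addresses the recompiler found.
JUMP_TABLE = {
    0: lambda acc: (1, acc + 5),
    1: lambda acc: (acc, acc),  # every indirect jump goes back through the table
    6: lambda acc: (None, acc),
}


def run(pc=0, acc=0):
    """Dispatch through the jump table, falling back to the interpreter."""
    fallbacks = 0
    while pc is not None:
        block = JUMP_TABLE.get(pc)
        if block is not None:
            pc, acc = block(acc)
        else:
            fallbacks += 1
            pc, acc = interpret_one(pc, acc)
    return acc, fallbacks


result, fallback_count = run()
print(result, fallback_count)
```

The table lookup on every indirect jump is the inefficiency mentioned above, and the fallback is what keeps the program correct when it reaches code the precompiler never saw.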