The secret to SK’s unbeatable framepacing results
SwapChains are how Direct3D sequences rendered images for display output.
You may have heard of double- or triple-buffering, V-Sync, and refresh rate. Those are all properties of a DXGI SwapChain, along with framebuffer resolution, format and colorspace (HDR), and legacy things like MSAA / Gamma. Not all properties of a DXGI SwapChain are appropriate for modern Direct3D rendering as you will find out in the section on tuning.
For the super interested, we have two Sankey diagrams over on the Presentation Model page that show the path from the swap effect requested by the game and the display mode used to the final presentation model used by Windows, summarizing the info dump below in a hopefully understandable form.
It is a bit convoluted, but basically there are different swap effects a game can configure a swapchain to use – the swap effect controls how the rendered frame should be presented from the perspective of the game. The actual presentation model used, however, depends on various factors beyond just the swap effect used.
The available swap effects are Discard (Flip) and Sequential (Flip), as well as Discard (BitBlt) and Sequential (BitBlt) (really rare).
Of the three common ones, Discard (BitBlt) is the classic swap effect, and BitBlt basically means that it performs a sort of copy operation to present the rendered frame. The other two are Flip based, which means that instead of a copy operation occurring, resources are shared directly between the app and the DWM so frames are “flipped” out as needed with as little overhead as possible.
Fullscreen Exclusive (FSE) refers to a scenario where the game process has exclusive ownership and control of the display, and controls everything related to the display itself. Up until Windows 8.1 this ensured the lowest possible latency: when using Discard (BitBlt), the presented frame could be output directly from the game to the display, without the additional latency of copying the frame over to the DWM, which would then present it to the display. When using SK or PresentMon, this presentation model is shown as Hardware: Legacy Flip.
Fullscreen Optimizations (FSO) is an umbrella term that refers to various enhancements and improvements introduced in Windows 10. For end users the term typically refers to Windows 10’s automated conversion of games running in FSE mode into a pseudo-borderless fullscreen window that uses flip model, but Microsoft engineers also often use it to refer to flip-related optimizations such as DirectFlip and Independent Flip, which can engage in various scenarios to lower the latency of flip model based swapchains.
DirectFlip and Independent Flip (often used interchangeably) refer to scenarios where a flip model based swapchain matches one of three possible scenarios (one of which requires MPOs to engage) where the DWM can take a step back and allow the game to present its frame directly to the display, thereby eliminating the additional latency that would have been incurred if the DWM had to compose and present the game frame to the display itself. When DirectFlip/Independent Flip is able to engage, the game is capable of achieving the same efficiency as classic FSE mode.
Hardware: Independent Flip is what’s typically shown in SK and PresentMon once DirectFlip/Independent Flip has engaged and the DWM has taken a step back. There’s also Hardware Composed: Independent Flip, which is basically the same with the minor difference that the frames of the game are scanned out (presented) to a dedicated hardware overlay plane of the GPU.
So Hardware [Composed]: Independent Flip is an indicator that modern DirectFlip/Independent Flip (and FSO) are engaged for a game: the DWM retains ownership of the display but has taken a step back and the game presents straight to the display. Hardware: Legacy Flip, in contrast, indicates classic FSE and that the game has ownership of the display.
Other presentation modes (e.g. Composed: Flip) are various forms of suboptimal modes where the DWM has ownership of the display and various copy operations or desktop composition occur before the frame can get from the game to the display, incurring additional latency in the process.
Multi-Plane Overlay (MPO) refers to the use of additional dedicated hardware scanout planes in the GPU that frames can be presented to, which the GPU then takes care of scanning out to the display itself. This lets the GPU shoulder work that the DWM would otherwise do in software, again achieving lower latency. How many MPOs a graphics card has assigned to a display is shown in SKIF’s Settings tab (introduced in April 2023). Typically NVIDIA assigns all of the planes it supports (usually upwards of 4 of them) to a single display while the rest of the displays go without any.
Minimum Requirements:
Support depends on the GPU and display configuration. Unusual driver or display configurations might disable MPO support, such as by using 10bpc in SDR mode or custom GPU scaling (NIS/RSR/integer etc).
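As a compact recap of the presentation modes described above, the mapping can be written down as a small lookup table. This is only an illustrative Python sketch: the mode strings are the ones the text says SK and PresentMon display, but the `PresentMode` tuple, its field names, and the `is_direct_to_display` helper are hypothetical names made up for this summary and are not part of Special K or PresentMon.

```python
from typing import NamedTuple


class PresentMode(NamedTuple):
    display_owner: str   # who owns the display: "game" or "DWM"
    dwm_composes: bool   # True when the DWM copies/composes the frame before scanout
    note: str


# Keys are the strings shown by SK / PresentMon; values summarize the prose above.
PRESENT_MODES = {
    "Hardware: Legacy Flip": PresentMode(
        "game", False, "classic Fullscreen Exclusive (FSE)"),
    "Hardware: Independent Flip": PresentMode(
        "DWM", False, "DirectFlip/Independent Flip engaged"),
    "Hardware Composed: Independent Flip": PresentMode(
        "DWM", False, "same, but scanned out from a hardware overlay plane"),
    "Composed: Flip": PresentMode(
        "DWM", True, "DWM composition adds latency"),
}


def is_direct_to_display(mode: str) -> bool:
    """True when the game's frame reaches the display without DWM composition."""
    return not PRESENT_MODES[mode].dwm_composes
```

Under this summary, the three Hardware modes count as direct, and only Composed: Flip involves the extra composition step.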
Special K’s primary attack strategy for D3D11 frame pacing is to enable flip model presentation.
(DXGI) Flip model is a new mode introduced in Windows 8 that makes the swapchain work efficiently with the Desktop Window Manager (DWM) of Windows. Flip model eliminates performance penalties normally associated with windowed mode rendering and introduces new methods to measure and regulate render latency.
Flip model is required for all D3D12 software as well as all UWP games sold on the Microsoft Store. It is also useable in D3D10 and D3D11 software, but most developers shipping software on Windows are oblivious to this and your typical D3D11 game performs sub-par as a result.
Both the Unreal Engine 4 and Unity engines had native support for flip model as of 2019, meaning games developed with newer versions of those engines typically use flip model by default.
Windows 10’s Fullscreen Optimizations (FSO) feature takes Fullscreen Exclusive (FSE) games and runs them in a borderless fullscreen mode that uses flip model instead.
Windows 11’s Windowed Optimizations converts older DirectX 10-11 games running in (borderless) windowed modes over to using flip model instead, achieving the same end result as Special K!
Traditional swapchain behavior (with V-Sync enabled) requires that each backbuffer be displayed on screen for at least 1 screen refresh before the contents of the next backbuffer are displayed, and that all buffers be output in the same order they were input (first in, first out, or FIFO).
FIFO’s inability to prioritize the newest frame is the reason triple-buffering has traditionally meant trading increased input latency for V-Sync stutter mitigation.
Special K is configured to drop old frames by default.
We see no reason to ever use the traditional FIFO queuing behavior on a desktop machine in any graphics API that has the ability to skip old frames (e.g. D3D10/11/12 and Vulkan).
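To illustrate the difference, here is a toy Python model of a present queue feeding a display, assuming a game that finishes one frame per tick and a display that refreshes every fourth tick. The function name, the policy labels `fifo` and `drop_old`, and all parameters are hypothetical; this is a simplified sketch of the queuing behavior, not how DXGI or Special K implements it.

```python
from collections import deque


def simulate(policy, render_ticks=1, refresh_ticks=4, queue_depth=2, total_ticks=40):
    """Return a list of (frame_id, age_in_ticks) for every frame shown at a refresh.

    policy: "fifo"     -> oldest queued frame is shown; the game blocks when the queue is full
            "drop_old" -> newest queued frame is shown; older frames are discarded
    """
    queue = deque()   # finished frames waiting for scanout: (frame_id, finished_at)
    shown = []
    next_id = 0
    for now in range(1, total_ticks + 1):
        # The game completes a frame every render_ticks, unless it is blocked.
        if now % render_ticks == 0:
            if policy == "fifo":
                if len(queue) < queue_depth:
                    queue.append((next_id, now))
                    next_id += 1
                # else: queue is full, the game waits and produces nothing this tick
            else:
                if len(queue) == queue_depth:
                    queue.popleft()          # discard the oldest frame
                queue.append((next_id, now))
                next_id += 1
        # The display picks a frame at every refresh.
        if now % refresh_ticks == 0 and queue:
            if policy == "fifo":
                frame_id, finished_at = queue.popleft()   # oldest first
            else:
                frame_id, finished_at = queue.pop()       # newest wins
                queue.clear()                             # the rest are stale
            shown.append((frame_id, now - finished_at))
    return shown
```

With the default parameters, the FIFO run keeps showing frames that finished several ticks before the refresh, because the game fills the queue and then blocks, whereas the drop-old run always shows the frame completed most recently.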
The sections below include various recommended swapchain management settings that can be used in Special K in some common scenarios. Note that there is often no need to change any of these values, as Special K already ships with and enables the most optimal settings for most use-cases out-of-the-box. However, if you want to experiment with various settings in an attempt to improve the experience in any of the below scenarios, these are some good baselines to go by.
Not all games support overriding the BackBuffer Count value. If any issues occur, open the game-specific config and change the BackBufferCount parameter to -1 to restore the game’s default, or try 1 below/above its current value and see if that works better.
Enabling the Waitable SwapChain parameter, which may lower latency in certain cases, makes it impossible to use fullscreen exclusive mode (in D3D11 games), and it is no longer recommended in any scenario due to the stutter it results in when DirectFlip optimizations are lost.
D3D12 is a low-level API; many graphics engines have hard-coded their frame queuing behavior and will crash if the number of buffers is changed.
- What was a quick and easy way to mitigate stutter in D3D11 is potentially fatal in D3D12.
You may need to set AllowD3D12FootGuns=true in the game-specific config to be able to adjust the BackBuffer Count and Maximum Device Latency settings in D3D12.
Fiddle with BackBuffer Count in D3D12 at your own peril… Expect most games to crash at startup, and be prepared to set it back to -1.
The remaining SwapChain settings are generally safe to change.
While using regular V-Sync or Black Frame Insertion (BFI):
In-Game Setting | Value | Config Parameter |
---|---|---|
Presentation Interval | 1 | PresentationInterval |
BackBuffer Count | 3 | BackBufferCount |
Maximum Device Latency | 4 or -1 | PreRenderLimit |
Enable Tearing | Enabled | AllowTearingInDWM |
While using a variable refresh rate (VRR/G-Sync/FreeSync):
In-Game Setting | Value | Config Parameter |
---|---|---|
Presentation Interval | 1 | PresentationInterval |
BackBuffer Count | 3 | BackBufferCount |
Maximum Device Latency | 2 or -1 | PreRenderLimit |
Enable Tearing | Enabled | AllowTearingInDWM |
While using neither V-Sync, Black Frame Insertion (BFI), nor a variable refresh rate (VRR/G-Sync/FreeSync), which also engages Special K’s tear-free no-sync feature (see Latent Sync for more information on that feature):
In-Game Setting | Value | Config Parameter |
---|---|---|
Presentation Interval | 0 | PresentationInterval |
BackBuffer Count | 3 | BackBufferCount |
Maximum Device Latency | -1 | PreRenderLimit |
Enable Tearing | Enabled | AllowTearingInDWM |
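The three tables above can also be expressed as data, which is convenient when comparing scenarios side by side. The following Python sketch is illustrative only: the scenario keys and the `config_lines` helper are hypothetical, the "4 or -1" and "2 or -1" entries are reduced to their first value, and writing Enabled as `true` is an assumption based on the AllowD3D12FootGuns=true form used earlier. Which section of the config file each parameter belongs in is not covered here.

```python
# Baseline values copied from the tables above, keyed by a made-up scenario name.
RECOMMENDED = {
    "vsync_or_bfi": {
        "PresentationInterval": 1,
        "BackBufferCount": 3,
        "PreRenderLimit": 4,        # the table also allows -1
        "AllowTearingInDWM": True,
    },
    "vrr": {
        "PresentationInterval": 1,
        "BackBufferCount": 3,
        "PreRenderLimit": 2,        # the table also allows -1
        "AllowTearingInDWM": True,
    },
    "no_sync": {
        "PresentationInterval": 0,
        "BackBufferCount": 3,
        "PreRenderLimit": -1,
        "AllowTearingInDWM": True,
    },
}


def _format_value(value):
    # Assumption: booleans are written as true/false, like AllowD3D12FootGuns=true.
    if isinstance(value, bool):
        return "true" if value else "false"
    return str(value)


def config_lines(scenario):
    """Return 'Parameter=value' lines for one of the scenarios above."""
    settings = RECOMMENDED[scenario]
    return [f"{name}={_format_value(value)}" for name, value in settings.items()]
```

For example, `config_lines("no_sync")` yields the four parameter lines for the no-sync scenario, ready to be compared against the game-specific config.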
Why Flip Model Overrides Have Never Been Implemented In Other Projects (e.g. ReShade)
… and where these projects should focus their efforts
Technical restrictions Special K has to work-around to enable Flip Model in all games:
1. sRGB Formats are Unsupported in Flip Model
Special K handles sRGB <-> Linear gamma correction for games that request an sRGB SwapChain and would otherwise not be Flip Model compatible.
Prior versions ignored the issue completely, simply stripping the sRGB format from the SwapChain and letting the user deal with the dark image that produced. Do not do that.
2. MSAA (using SwapChain resolve) is Unsupported in Flip Model
Special K disables SwapChain-level MSAA when you enable Flip Model.
There are multiple ways of doing MSAA; modern graphics engines handle MSAA Resolve manually and are unaffected by this restriction.
Only really bad ports / engines use the SwapChain for MSAA in D3D11.
Special K may eventually provide offscreen MSAA resolve services for these poorly designed D3D11 games, but it is not a high priority.
3. Flip Model only supports 1 SwapChain per-window
This is almost never an issue, except for stupendously screwed up games (e.g. NieR: Automata).
Special K detects situations where a game has created multiple SwapChains for the same window and will try to consolidate them into a single SwapChain in order to support Flip Model.
This workaround is usually successful…
If a game mixes and matches graphics APIs based on GDI and DXGI to render to the same HWND, it will never be compatible with Flip Model via an override.
Such a game is in serious need of refactoring and not worth the effort of supporting.
4. Flip Model Unbinds the SwapChain Backbuffer from the D3D11 Output Merger
After a frame is Presented, Flip Model SwapChains will unbind any RenderTarget that references one of their Backbuffers from the D3D11 Immediate Context.
Applications that do not explicitly set up the OM state to output to the SwapChain each frame will tend to produce a black screen if you force Flip Model on without also resetting the Output Merger state after Present (...).
5. Fullscreen Exclusive Mode Switches Require Resizing the SwapChain
Whether the actual resolution changes or not, until a Flip Model SwapChain is Resized, all attempts to Present (...) will fail.
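To make restriction 1 (sRGB formats) more concrete, the standard sRGB transfer functions can be sketched in Python as below, operating per channel on values in the 0 to 1 range. This is the textbook piecewise formula, shown only to illustrate what an sRGB <-> Linear correction involves; it is not taken from Special K's source, where such work would normally happen on the GPU, and the function names are placeholders.

```python
def srgb_to_linear(c: float) -> float:
    """Decode an sRGB-encoded channel value (0..1) to linear light."""
    if c <= 0.04045:
        return c / 12.92
    return ((c + 0.055) / 1.055) ** 2.4


def linear_to_srgb(c: float) -> float:
    """Encode a linear-light channel value (0..1) as sRGB."""
    if c <= 0.0031308:
        return c * 12.92
    return 1.055 * c ** (1 / 2.4) - 0.055
```

A linear value of 0.5 encodes to roughly 0.735 in sRGB. If the sRGB format is simply stripped and the linear value is written out unencoded, mid-tones end up well below where they should be, which is consistent with the dark image mentioned under restriction 1.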