This is James Fulop, Senior Engine Programmer on the Guild Wars 2 team. DirectX11 support is a project that has been a long time in the making. In this post I am going to go over some of the high-level technical decisions that were made to determine the shape of this project, as well as a walkthrough of our graphics runtime and how the DirectX11 renderer affects performance.
The beta will begin 21 September 2021. You can opt in to the beta in-game in the Graphics Options menu. The change takes effect after restarting the game.
So why did we decide to upgrade to DirectX11 in the first place? Client performance is a priority for us, and we want everyone to be able to play with the highest possible framerate. We pinpointed that sometimes the game could stall waiting for rendering work to complete. Guild Wars 2 has been live for nine years now and implementing some DirectX11-dependent features can help the game continue to look beautiful over time.
DirectX11 also offers some modern technology options that aren’t available in DirectX9. Upgrading to DirectX11 is the first step toward being able to do more shiny things.
After careful research we decided to integrate the open source rendering library BGFX into Guild Wars 2. BGFX is well written, supports various graphics backends, and is already used in many games industry-wide. You can learn more about it on their official website.
As the graphics ecosystems in the computer industry continue to evolve, BGFX allows us to work together as a community of rendering engineers, rather than all of us having to reinvent the wheel at every studio. ArenaNet has contributed, and will continue to contribute, to the development of BGFX.
We chose to use DirectX11 instead of DirectX12 or Vulkan because we found that switching to BGFX’s DirectX11 implementation provided enough of a performance boost that the graphics backend was no longer a limiting factor for client performance. DirectX11 is very stable and has already been used by thousands of games for nearly a decade at this point. It allows us to provide support back to Windows Vista, while Vulkan support starts at Windows 7 and DirectX12 support starts at Windows 10. As far as graphics features are concerned, jumping from DirectX9 to DirectX11 gives us plenty of options for adding interesting features to the engine in years to come. Supporting more than one of these backends would balloon QA work for little tangible benefit.
The current DirectX9 renderer hasn’t been altered in any significant way. One of the philosophies I had as I built this was to have the DirectX11 renderer look the same as DirectX9. If I were making changes to both at the same time, there would be no ground truth for how things are supposed to look. Eventually the DirectX9 renderer will be deprecated and then removed as the new renderer becomes stable.
Frame Dive
Now I’d like to go into how the DirectX11 renderer affects performance for the game. To start from essentials, video games work just like a movie in that you are viewing still images that are being flipped in front of you rapidly. We call each still a frame. This process is where the term “frames per second” (FPS) comes from. The higher the FPS, the smoother the action looks, until you hit the physical limit of your display. This process of continually responding to inputs (keyboard and mouse) and producing frames is the game loop.
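To make the game loop concrete, here is a minimal C++ sketch of the idea described above: query inputs, update the game, produce a frame, repeat. It is an illustration only and not Guild Wars 2 code; the names FrameCallbacks, run_game_loop, and frames_per_second are made up for this example.

```cpp
#include <functional>

// Hypothetical hooks for the three stages of one frame.
struct FrameCallbacks {
    std::function<void()> poll_input;  // read keyboard and mouse state
    std::function<void()> update;      // advance game logic
    std::function<void()> render;      // produce one still image (a frame)
};

// Runs frames until keep_running() returns false; returns the frame count.
int run_game_loop(const FrameCallbacks& cb,
                  const std::function<bool()>& keep_running) {
    int frames = 0;
    while (keep_running()) {
        cb.poll_input();
        cb.update();
        cb.render();
        ++frames;
    }
    return frames;
}

// FPS is the number of frames produced divided by the elapsed time.
double frames_per_second(int frames, double elapsed_seconds) {
    return elapsed_seconds > 0.0 ? frames / elapsed_seconds : 0.0;
}
```

For example, 120 frames produced in 2 seconds works out to 60 FPS.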
Coming up are some timeline visualizations of the game loop. I’m taking all this data from the above location in Lion’s Arch using the “Best Appearance” graphics preset.
I picked this spot since there is a lot to render. There is the whole city, and the technique we use for real-time reflections essentially renders the world a second time, upside down.
I am using an Intel Core i7-6700 CPU and an Nvidia GeForce GTX 1080 GPU for this test.
Above is a screenshot from a visualization tool we use called Telemetry, developed by RAD Game Tools. This visualizes how the game’s work gets spread across your CPU’s cores. Time is represented horizontally. This is displaying two frames of game logic. I’ve blurred out a lot of the identifiers for clarity.
Each horizontal row represents a logical thread. Here you can see I have six worker threads. We adjust the number of worker threads depending on how many hardware threads your CPU supports. I have a chip that supports eight hardware threads, so the game divides them as one game thread, one render thread, and six worker threads. Note that Guild Wars 2 has other threads running as well, which I’m not displaying here. Those threads are not particularly CPU intensive and are not relevant to this post.
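The thread split described above can be sketched in a few lines of C++. This is an assumption-laden illustration, not the game's actual logic: ThreadBudget, split_threads, and detect_thread_budget are hypothetical names, and the fallback of keeping at least one worker on small CPUs is my own guess rather than something stated in this post.

```cpp
#include <thread>

// Hypothetical description of how hardware threads are divided up.
struct ThreadBudget {
    unsigned game;     // threads running game logic
    unsigned render;   // threads talking to the graphics backend
    unsigned workers;  // job threads for everything else
};

// Reserve one game thread and one render thread; the rest become workers.
ThreadBudget split_threads(unsigned hardware_threads) {
    const unsigned reserved = 2;
    // Assumption: always keep at least one worker, even on tiny CPUs.
    const unsigned workers =
        hardware_threads > reserved ? hardware_threads - reserved : 1;
    return ThreadBudget{1, 1, workers};
}

// std::thread::hardware_concurrency() may return 0 when the count is
// unknown; split_threads treats that like any other small value.
ThreadBudget detect_thread_budget() {
    return split_threads(std::thread::hardware_concurrency());
}
```

With eight hardware threads, split_threads(8) gives one game thread, one render thread, and six workers, matching the layout in the profiler capture.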
The various blocks within the thread channels represent work happening. When the thread is displaying as empty, that means it is idle and not doing any game work. In those spaces the hardware thread may be picking up work from other applications on your machine.
Logical frames are where we consider the game loop restarted. Vertical lines mark where logical frames begin. Hardware inputs (like keyboard/mouse) are queried at the beginning here.
This is where the graphics frame loops. Since the beginning of the game frame doesn’t do any rendering, we let some work from the last frame overrun into the next frame so we can do as much work in parallel as possible. At some point, though, you must wait for the last frame to flush out so you can start submitting work for the next one. Here you might have noticed a red block of work on the main thread, which lines up with some work ending on the render thread. This is what we want to eliminate. The game thread should never have to wait for rendering work to complete.
Aside: There are some other much smaller red blocks on the main thread. That’s the main thread waiting for job thread work to complete.
Here we build out detailed lists of instructions for the rendering thread to execute. When a block of generalized rendering work has been generated it gets handed across to the render thread.
The render thread converts the rendering work into calls specific to the graphics backend in use (OpenGL or DX9).
DX11 Beta Renderer Frame
Here’s how the new renderer looks in the profiler.
The layout of threads is the same.
The main thread no longer has to wait for the render thread to complete! This is accomplished because of how BGFX is architected. Draw calls are collected on the game thread (and on worker threads as well; we’ll get to that). Then, when the frame ends, BGFX processes those calls on its render thread. Now we have a lot of headroom for drawing!
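The general pattern of recording draw calls on one thread and executing them on another can be sketched with standard C++ threading. To be clear, this is not BGFX code and does not use the BGFX API; FrameQueue, DrawCommand, draw, and end_frame are hypothetical names, and the design (one frame in flight, handed over at the end of the frame) is a simplification I am assuming for illustration.

```cpp
#include <condition_variable>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

using DrawCommand = std::function<void()>;  // stand-in for a real draw call

class FrameQueue {
public:
    FrameQueue() : worker_([this] { render_loop(); }) {}

    ~FrameQueue() {
        {
            std::lock_guard<std::mutex> lock(mutex_);
            quitting_ = true;
        }
        wake_.notify_all();
        worker_.join();  // frames already handed over are still executed
    }

    // Game thread: record a call. No lock is needed because only the game
    // thread ever touches recording_.
    void draw(DrawCommand cmd) { recording_.push_back(std::move(cmd)); }

    // Game thread: hand the recorded frame to the render thread. This only
    // blocks if the render thread has not yet picked up the previous frame.
    void end_frame() {
        std::unique_lock<std::mutex> lock(mutex_);
        wake_.wait(lock, [this] { return !pending_; });
        submitted_ = std::move(recording_);
        recording_.clear();
        pending_ = true;
        lock.unlock();
        wake_.notify_all();
    }

private:
    void render_loop() {
        for (;;) {
            std::vector<DrawCommand> executing;
            {
                std::unique_lock<std::mutex> lock(mutex_);
                wake_.wait(lock, [this] { return pending_ || quitting_; });
                if (!pending_) return;  // shutting down, nothing left to draw
                executing = std::move(submitted_);
                submitted_.clear();
                pending_ = false;
            }
            wake_.notify_all();  // the game thread may submit the next frame
            for (auto& cmd : executing) cmd();  // runs outside the lock
        }
    }

    std::mutex mutex_;
    std::condition_variable wake_;
    std::vector<DrawCommand> recording_;  // owned by the game thread
    std::vector<DrawCommand> submitted_;  // guarded by mutex_
    bool pending_ = false;
    bool quitting_ = false;
    std::thread worker_;  // declared last so it starts after the other members
};
```

Because the render thread takes ownership of a frame's commands before executing them, the game thread can record and submit the next frame while the previous one is still being drawn. In this sketch it would only stall if it got a full frame ahead of the render thread.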
Conversion from our queued rendering data to BGFX calls happens in parallel across job threads.
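As a rough illustration of converting queued data in parallel, the sketch below splits a list of queued draws into contiguous chunks and converts each chunk on its own thread. Everything here is hypothetical: QueuedDraw, BackendCall, convert, and convert_in_parallel are invented for this example, the "conversion" is a placeholder, and a real engine would use a persistent job system rather than spawning threads every frame.

```cpp
#include <algorithm>
#include <cstddef>
#include <thread>
#include <vector>

struct QueuedDraw { int mesh_id; int material_id; };  // hypothetical input
struct BackendCall { int encoded; };                  // hypothetical output

// Placeholder for the real translation into renderer-library calls.
BackendCall convert(const QueuedDraw& d) {
    return BackendCall{d.mesh_id * 1000 + d.material_id};
}

std::vector<BackendCall> convert_in_parallel(
        const std::vector<QueuedDraw>& queued, unsigned job_threads) {
    std::vector<BackendCall> calls(queued.size());
    if (queued.empty()) return calls;

    job_threads = std::max(1u, job_threads);
    // Round up so every element lands in some chunk.
    const std::size_t chunk = (queued.size() + job_threads - 1) / job_threads;

    std::vector<std::thread> jobs;
    for (std::size_t begin = 0; begin < queued.size(); begin += chunk) {
        const std::size_t end = std::min(queued.size(), begin + chunk);
        // Each job writes only to its own slice of calls, so no locking
        // is needed and the output order matches the input order.
        jobs.emplace_back([&queued, &calls, begin, end] {
            for (std::size_t i = begin; i < end; ++i) {
                calls[i] = convert(queued[i]);
            }
        });
    }
    for (auto& job : jobs) job.join();
    return calls;
}
```

Giving each job a disjoint output range is the design choice that keeps this simple: the jobs never contend with each other, and the result can be consumed in submission order once they have all finished.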
And that’s it! This was a very interesting project for me. I’m looking forward to the technical future of Guild Wars 2.
Please let us know how the DirectX11 beta works for you on the forums.
See you in Tyria!
-James Fulop