Posts Tagged ‘vulkan’

July 2018 Development Update

Sunday, August 26th, 2018 by fireclaw

Despite the vacation period, the developers have not remained idle in July. Here is an update on some of the new developments.

Collision shapes

While the internal collision system provides many useful tests between various collision solids, the CollisionTube solid (representing a capsule shape) in particular was only really useful as an “into” collision shape. Many of you have requested more tests so that it can also be used as a “from” shape, since many see it as a good fit for character bodies. Earlier, we had already added tube-into-plane and tube-into-sphere tests. We have now extended this to include tube-into-tube and tube-into-box tests, and we have also added a line-into-box test to complete the suite of CollisionBox tests.
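At its core, a tube-into-tube (capsule-into-capsule) test boils down to a segment-to-segment distance computation: the capsules overlap exactly when the distance between their core segments does not exceed the sum of their radii. The sketch below illustrates that idea in plain Python, using the classic closest-point-between-segments algorithm; the helper names are hypothetical, and Panda3D’s actual C++ implementation differs.

```python
import math

def _clamp(x, lo, hi):
    return max(lo, min(hi, x))

def _dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]

def _sub(u, v):
    return (u[0] - v[0], u[1] - v[1], u[2] - v[2])

def segment_distance(p1, q1, p2, q2, eps=1e-12):
    """Shortest distance between segments p1-q1 and p2-q2 (the classic
    closest-point algorithm from Ericson's 'Real-Time Collision Detection')."""
    d1, d2, r = _sub(q1, p1), _sub(q2, p2), _sub(p1, p2)
    a, e, f = _dot(d1, d1), _dot(d2, d2), _dot(d2, r)
    if a <= eps and e <= eps:          # both segments degenerate to points
        s = t = 0.0
    elif a <= eps:                     # first segment is a point
        s, t = 0.0, _clamp(f / e, 0.0, 1.0)
    else:
        c = _dot(d1, r)
        if e <= eps:                   # second segment is a point
            s, t = _clamp(-c / a, 0.0, 1.0), 0.0
        else:
            b = _dot(d1, d2)
            denom = a * e - b * b      # zero when the segments are parallel
            s = _clamp((b * f - c * e) / denom, 0.0, 1.0) if denom > eps else 0.0
            t = (b * s + f) / e
            if t < 0.0:                # clamp t, then recompute s
                s, t = _clamp(-c / a, 0.0, 1.0), 0.0
            elif t > 1.0:
                s, t = _clamp((b - c) / a, 0.0, 1.0), 1.0
    c1 = (p1[0] + d1[0] * s, p1[1] + d1[1] * s, p1[2] + d1[2] * s)
    c2 = (p2[0] + d2[0] * t, p2[1] + d2[1] * t, p2[2] + d2[2] * t)
    return math.dist(c1, c2)

def capsules_intersect(p1, q1, r1, p2, q2, r2):
    """Two capsules overlap iff their core segments come closer than r1 + r2."""
    return segment_distance(p1, q1, p2, q2) <= r1 + r2

# Two perpendicular capsules whose axes pass 1 unit apart:
a, b = (0, -1, 0), (0, 1, 0)
c, d = (-1, 0, 1), (1, 0, 1)
print(capsules_intersect(a, b, 0.6, c, d, 0.5))  # True  (0.6 + 0.5 >= 1)
print(capsules_intersect(a, b, 0.4, c, d, 0.5))  # False (0.4 + 0.5 <  1)
```

The line-into-box and tube-into-box tests follow a similar pattern, with the box test additionally needing a segment-to-box distance rather than segment-to-segment.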

For those who are using Bullet physics instead of the internal collision system, we have also extended the ability to convert collision solids from the Panda3D representation to the Bullet representation to include CollisionTube and CollisionPlane as well. These solids can now be easily converted to a BulletCapsuleShape and BulletPlaneShape, respectively. This way you can add these shapes directly to your .egg models and load them into your application without needing custom code to convert them to Bullet shapes.

Depth buffer precision

As most Panda3D programmers will know, two important variables to define when configuring a camera in any game are the “near” and “far” distances. These determine the range of the depth buffer; objects at the near distance have a value in the depth buffer of 0.0, whereas objects at the far plane have a value of 1.0. As such, they also determine the drawing range: objects that fall outside this range cannot be rendered. This is fundamental to how perspective rendering works in graphics APIs.

As it happens, because of the way the projection matrix is defined, it is actually possible to set the “far” distance to infinity. Panda3D added support for this a while ago. Because of the reciprocal relationship between the distance to the camera and the generated depth value, the near distance is far more critical to depth precision than the far distance. If it is too low, then objects in the distance will start to flicker as the differences between their depth values approach zero; the video card can no longer tell their respective distances apart and gets confused about which surface to render in front of the other. This is usually known as “Z-fighting”.

This is a real problem in games that require a very large drawing distance, while still needing to render objects close to the camera. There are a few ways to deal with this.

One way people usually try to resolve this is by increasing the precision of the depth buffer. Instead of the default 24 bits of depth precision, we can request a floating-point depth buffer, which has 32 bits of depth precision. However, since 32-bit floating-point numbers still have a 24-bit mantissa, this does not actually improve the precision by that much. Furthermore, due to the exponential nature of floating-point numbers, most precision is actually concentrated near 0.0, whereas we actually need precision in the distance.

As it turns out, there is a really easy way to solve this: just invert the depth range! By setting the near distance to infinity, and the far distance to our desired near distance, we get an inverted depth range whereby a value of 1.0 is close to the camera and 0.0 is infinitely far away. This turns out to radically improve the precision of the depth buffer, as further explained by this NVIDIA article, since the exponential precision curve of the floating-point numbers now complements the inverse precision curve of the depth buffer. We also need to flip the depth comparison function from “less than” to “greater than”, so that objects that are behind other objects won’t appear in front of them.
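A quick plain-Python experiment shows why this helps. With the far distance at infinity, the conventional mapping produces a depth of roughly 1 − near/z, while the reversed mapping produces near/z (both are simplifications of the full projection math). Rounding each to 32-bit floats for a set of distant, closely spaced surfaces shows how many distinct depth values survive:

```python
import struct

def f32(x: float) -> float:
    """Round a Python float to the nearest 32-bit float."""
    return struct.unpack("f", struct.pack("f", x))[0]

near = 0.1
depths_conventional = set()
depths_reversed = set()
for i in range(1000):
    z = 10_000.0 + i * 0.01                        # surfaces 10 km away, 1 cm apart
    depths_conventional.add(f32(1.0 - near / z))   # conventional order, far = infinity
    depths_reversed.add(f32(near / z))             # near and far swapped

print(len(depths_conventional))  # collapses to just one or two distinct values
print(len(depths_reversed))      # all 1000 depths remain distinct
```

In the conventional case all thousand surfaces round to nearly the same value near 1.0, where floating-point spacing is coarse; in the reversed case they land near 0.0, where the denser float spacing keeps every one of them distinguishable.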

There is one snag, though. While the technique above works quite well in DirectX and Vulkan, where the depth is defined to range from 0.0 to 1.0, OpenGL actually uses a depth range of -1.0 to 1.0. Since floating-point numbers are most precise near 0.0, this actually puts all our precision uselessly in the middle of the depth range.

This is not very helpful, since we want to improve depth precision in the distance. Fortunately, the OpenGL authors have remedied this in OpenGL 4.5 (and with the GL_ARB_clip_control extension for earlier versions), where it is possible to configure OpenGL to use a depth range of 0.0 to 1.0. This is accomplished by setting the gl-depth-zero-to-one configuration variable to `true`. There are plans to make this the default Panda3D convention in order to improve the precision of projection matrix calculation inside Panda3D as well.
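In Config.prc terms, opting in is a single line; inverting the near and far distances and flipping the depth comparison function still happens in application code.

```
# Requires OpenGL 4.5, or the GL_ARB_clip_control extension on older versions
gl-depth-zero-to-one true
```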

All the functionality needed to accomplish this is now available in the development builds. If you wish to play with this technique, check out this forum thread to see what you need to do.

Double precision vertices in shaders

For those who need the greatest level of numerical precision in their simulations, it has been possible to compile Panda3D with double-precision support. This makes Panda3D perform all transformation calculations with 64-bit precision instead of the default 32-bit precision, at a slight performance cost. However, by default, all the vertex information of the models is still uploaded as 32-bit single-precision numbers, since only recent video cards natively support operations on 64-bit precision numbers. By setting the vertices-float64 variable, the vertex information is uploaded to the GPU as double-precision instead.

This worked well for the fixed-function pipeline, but was not supported when using shaders, or when using an OpenGL 3.2+ core-only profile. This has now been remedied; it is possible to use double-precision vertex inputs in your shaders, and Panda3D will happily support this in the default shaders when vertices-float64 is set.
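As a reminder, this is an opt-in Config.prc setting, and it only has an effect in a Panda3D build compiled with double-precision support:

```
# Upload vertex data to the GPU as 64-bit doubles
vertices-float64 true
```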

Interrogate additions

The system we use to provide Python bindings for Panda3D’s C++ codebase now has limited support for exposing C++11 enum classes to Python, with the Python 3 enum API emulated on Python 2 as well. This enables Panda3D developers (and any other users of Interrogate) to use C++11 enum classes in order to better wrap enumerations in the Panda3D API.

Multi-threading

We have continued to improve the thread safety of the engine in order to make it easier to use the multi-threaded rendering pipeline. Mutex locks have been added to the X11 window code, which enables certain window calls to be safely made from the App thread. Furthermore, a bug was fixed that caused a crash when taking a screenshot from a thread other than the draw thread.

June 2018 Development Update

Tuesday, July 31st, 2018 by fireclaw

Due to the vacation period, this post is somewhat delayed, but the wait is finally at an end. Here is the new update with a selection of the developments in June.

OpenGL on macOS

At the Apple WWDC in June of this year, Apple formally announced the deprecation of the OpenGL graphics API, in favor of an Apple-only graphics API called Metal. This move puzzled many, as a lot of software relies on OpenGL for macOS support, including Panda3D, and investing significant resources into developing a whole new rendering back-end to support just a relatively small segment of the market is hard to justify.

While it seems likely that—despite the deprecation notice—Apple will continue to include OpenGL support as a part of macOS, we will need to start looking at another approach for maintaining first-class support for high-end graphics on macOS. It would be nice if there were one next-gen graphics API that would be well-supported on all platforms going forward.

Enter MoltenVK. This is an implementation of the cross-platform Vulkan API for macOS and iOS, implemented as a wrapper layer on top of Apple’s Metal API. It has recently been open-sourced by the Khronos Group, in an effort to make Vulkan available on every operating system. Despite being a wrapper layer, Valve still found it to offer performance benefits over plain OpenGL. This will let us focus our efforts on implementing Vulkan and thereby support all platforms.

Accordingly, we have raised the priority of developing a Vulkan renderer, and it has made several strides forward in the past months. It is still not quite able to render more than the simplest of sample programs, however. We will keep you updated as developments unfold.

Mouselook smoothness improved

In order to implement camera movement using the mouse, many applications use move_pointer to reset the cursor to the center of the window every frame. However, when the frame rate gets low, the mouse movement can become choppy. This is usually only an issue on Windows, as other platforms support the M_relative mouse mode, which obviates the need to reset the mouse cursor. For all those who can’t or don’t use this mode, a significant improvement in smoothing this movement has now been worked out.

In previous versions, small movements that occurred between the last iteration of the event loop and the call to move_pointer could be ignored, causing the mouselook to become choppy. This has been fixed by changing the get_pointer() method to always obtain the latest mouse cursor position from the operating system, resulting in more accurate mouse tracking. In the future, we will further improve on this by emulating relative mouse mode on Windows, and perhaps even by adding a new interface for handling mouselook.
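For those implementing this pattern by hand, the per-frame logic amounts to reading the pointer, measuring its offset from the window center, and recentering it. Here is a self-contained sketch of that bookkeeping; the class and its names are hypothetical, and a real application would read the cursor with base.win.get_pointer, recenter it with move_pointer, and apply the angles to a camera:

```python
class MouseLook:
    """Minimal sketch of the recenter-every-frame mouselook pattern."""

    def __init__(self, center_x, center_y, sensitivity=0.1):
        self.cx, self.cy = center_x, center_y
        self.sensitivity = sensitivity
        self.heading = 0.0
        self.pitch = 0.0

    def update(self, pointer_x, pointer_y):
        # Offset from the window center = mouse motion since last frame
        self.heading -= (pointer_x - self.cx) * self.sensitivity
        self.pitch -= (pointer_y - self.cy) * self.sensitivity
        # A real task would now call win.move_pointer(0, self.cx, self.cy)
        return self.cx, self.cy  # position to reset the cursor to

cam = MouseLook(400, 300)
cam.update(410, 295)           # cursor drifted 10 px right, 5 px up
print(cam.heading, cam.pitch)  # -1.0 0.5
```

The fix described above matters in the update step: if get_pointer returns a stale position, the small offset accumulated since the last event-loop iteration is silently dropped.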

Grayscale and grayscale-alpha support with FFmpeg

Panda3D’s FFmpeg integration has gained the ability to load videos that only contain grayscale colours more efficiently, greatly reducing the memory usage compared to the previous approach of converting all videos to full RGB. This has also been extended to videos that have a grayscale channel as well as an alpha channel for transparency. Do note, however, that not all video formats support grayscale pixel formats.

Removing ‘using namespace std’ from Panda3D headers

C++ users, take note: besides the changes to add better C++11 support, we have also eliminated the bad practice of having using namespace std; in the public Panda3D headers. This prevents the entire std namespace from being pulled in and causing namespace clashes. If your codebase relied on being able to use standard types without an explicit std:: prefix, you may need to add using namespace std; to your own source files. It would be even better to fully qualify your references to standard types and functions with an std:: prefix or—where appropriate—pull in specific types with individual using std::x; statements.

Bullet Vehicle chassis-to-wheel synchronisation

Using Bullet’s BulletVehicle class to simulate car physics had a problem with syncing the wheels to the chassis if the wheels were parented to the vehicle’s RigidBodyNode. The wheel models would always be one frame ahead of the chassis model, resulting in visible artifacts as the wheels appeared to move away from the chassis. Especially when the vehicle accelerated to a significant speed, the wheels could appear to drift quite far from it. This bug has been fixed thanks to a contributor.

May 2018 Development Update

Sunday, June 24th, 2018 by rdb

With the work on the new input system and the new deployment system coming to a close, it is high time we shift gears and focus our efforts on bundling all this into a shiny new release. So with an eye toward a final release of Panda3D 1.10, most of the work in May has centered around improving the engine’s stability and cleaning up the codebase.

As such, many bugs and regressions have been fixed that are too numerous to name. I’m particularly proud to declare the multithreaded render pipeline significantly more stable than it was in 1.9. We have also begun to make better use of compiler warnings and code-checking tools. This has led us to find bugs in the code that we did not even know existed!

We announced two months ago that we were switching the minimum version of the Microsoft Visual C++ compiler from 2010 to 2015. No objections to this have come in, so this move has been fully implemented in the past month. This has cleared the way for us to make use of C++11 to the fullest extent, allowing us to write more robust code and spend less of our time writing compiler-specific code or maintaining our own threading library, which ultimately results in a better engine for you.

Thinking ahead

Behind the scenes, many design discussions have been taking place regarding our plans for the Panda3D release that will follow 1.10. In particular, I’d like to highlight a proposed new abstraction for describing multi-pass rendering that has begun to take shape.

Multi-pass rendering is a technique to render a scene in multiple ways before compositing it back together into a single rendered image. The current way to do this in Panda3D hinges on the idea of a “graphics buffer” being similar to a regular on-screen window, except of course that it does not appear on screen. At the time this feature was added, this matched the abstractions of the underlying graphics APIs quite well. However, it is overly cumbersome to set up for some of the most common use cases, such as adding a simple post-processing effect to the final rendered image. More recent additions like FilterManager and the RenderPipeline’s RenderTarget system have made this much easier, but these are high-level abstractions that simply wrap around the same underlying low-level C++ API, which still does not have an ideal level of control over the rendering pipeline.

That last point is particularly relevant in our efforts to provide the best possible support for the Oculus Rift and for the Vulkan rendering API. For reasons that go beyond the scope of this post, implementing these optimally will require Panda3D to have more complete knowledge of how all the graphics buffers in the application fit together to produce the final render result, which the current API makes difficult.

To remedy this, the proposed approach is to let the application simply describe all the rendering passes up-front in a high-level manner. You would graph out how the scene is rendered by connecting the inputs and outputs of all the filters and shaders that should affect it, similar to Blender’s compositing nodes. You would no longer need to worry about setting up all the low-level buffers, attachments, cameras and display regions. This would all be handled under the hood, enabling Panda3D to optimize the setup to make better use of hardware resources. We could even create a file format to allow storing such a “render blueprint” in a file, so something like loading and enabling a deferred rendering pipeline would be possible in only a few lines of code!

This is still in the early design stages, so we will talk about these things in more detail as we continue to iron out the design. If you have ideas of your own to contribute, please feel free to share them with us!

Helping out

In the meantime, we will continue to work towards a final release of 1.10. And this is the time when you can shine! If you wish to help, you are encouraged to check out a development build of Panda3D from the download page (or installed via our custom pip index) and try it with your projects. If you encounter an issue, please go to the issue tracker on GitHub and let us know!