Threading model and "Non-framebuffer-completeness"

I’ve wanted to use the threaded render pipeline, as it should produce an improvement in frame-rate on multi-core machines. However, for some time enabling it seemed to result in crashes. Today I decided to give it another shot, especially given that I’m presumably using more-recent engine-code than I was when last I tried.
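
(For reference, I’m enabling the pipeline via the “threading-model” variable in my PRC file, more or less like so:)

threading-model Cull/Draw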

To start with, it at least doesn’t seem to crash as it once did.

However, it also seems to break rendering somehow. Specifically, it looks as though it’s unhappy with my shadow-buffers, and specifically their frame-buffer properties. While all is fine if multi-threading is disabled, when it’s enabled I get a stream of errors; with “gl-debug #t” in my PRC file, the errors are as follows (with the final three lines repeated indefinitely, it seems):

(The repetition below is as printed in the output, I believe.)

:display:gsg:glgsg(warning): Framebuffer unsupported. Framebuffer object light buffer is unsupported because the depth and stencil attachments are mismatched.
:display:gsg:glgsg(warning): Framebuffer unsupported. Framebuffer object light buffer is unsupported because the depth and stencil attachments are mismatched.
:display:gsg:glgsg(warning): Framebuffer unsupported. Framebuffer object light buffer is unsupported because the depth and stencil attachments are mismatched.
:display:gsg:glgsg(error): EXT_framebuffer_object reports non-framebuffer-completeness:
:display:gsg:glgsg(error): FRAMEBUFFER_UNSUPPORTED for light buffer
:display:gsg:glgsg(warning): Framebuffer unsupported. Framebuffer object light buffer is unsupported because the depth and stencil attachments are mismatched.
:display:gsg:glgsg(error): GL_INVALID_FRAMEBUFFER_OPERATION error generated. Operation is not valid because a bound framebuffer is not framebuffer complete.
:display:gsg:glgsg(error): GL_INVALID_FRAMEBUFFER_OPERATION error generated. Operation is not valid because a bound framebuffer is not framebuffer complete.

I’m currently not requesting stencil bits, I believe–only depth bits, more or less as follows:

from panda3d.core import FrameBufferProperties

frameProps = FrameBufferProperties()
frameProps.setDepthBits(1)

I’ve tried changing the “1” to “16”, and adding “setStencilBits(0)”, to no avail.
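
(For clarity, the variants that I tried looked something like the following:)

frameProps = FrameBufferProperties()
frameProps.setDepthBits(16)
frameProps.setStencilBits(0)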

The exact results after loading a level seem to vary according to which threading model I choose–“Cull” still seems to crash or force-quit, while the other two produce different graphical glitches, likely related to the game’s shadow-mapping.

Any suggestions, anyone? :confused:

I’m afraid to say that there are a number of issues with certain rendering techniques (including shadows) and the multithreaded render pipeline. Until all of those have been isolated and resolved, I would suggest you avoid using pipelined rendering.

Of course, if you’re willing to isolate the individual issues, reduce them down to small reproducible test snippets and file them on GitHub, that would be helpful.

Ah, that’s a pity. Well, thank you for the information! I suppose that the multi-threaded pipeline is off of the table for now, then. :confused:

Hmm… It might still be worth my while looking into Panda3D’s more general threading support, with the goal of offloading some logic onto another thread, to be handled by another core. (Indeed, I should probably look into threading for the purpose of level-loading anyway.)

Yes, I’m not aware of any bugs in the threaded model loading.
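
For example, passing a callback to loader.loadModel causes the load to happen on a background thread, something like this sketch (the model path is just an example):

from direct.showbase.ShowBase import ShowBase

class Game(ShowBase):
    def __init__(self):
        ShowBase.__init__(self)
        # With a callback, loadModel returns immediately; the actual load runs
        # on a background thread and the callback fires once the model is ready.
        self.loader.loadModel("models/environment", callback=self.onModelLoaded)

    def onModelLoaded(self, model):
        model.reparentTo(self.render)

Game().run()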

If you want to offload some game logic (collisions, pathfinding, autosave, AI, etc.) to a second CPU core, then you should look into multiprocessing, not threading. Threading in Python will always run on one core because of the (in-)famous Global Interpreter Lock, and is always slower.

Multiprocessing with Panda3D is a bit tricky, because it works by running separate interpreter instances and communicating between them using pickled objects (and the p3d scene graph is not always happy with that). It gets even more tricky when you deploy, because the Python interpreter is hidden inside the deployed executable. The only way I got a multiprocessing setup running was by making the executable ‘Popen’ another instance of itself, with a command-line argument telling the ‘child’ process to run without a window and to communicate with the ‘parent’ process over a UDP socket (see the sketch below). I have code I can share if you are interested.
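
Very roughly, the idea is something like the sketch below; the “--worker” flag, port number and message format are made up for illustration, and a real setup would need proper hand-shaking and error handling:

import socket
import subprocess
import sys
import time

WORKER_FLAG = "--worker"          # hypothetical flag telling the child to act as the worker
ADDRESS = ("127.0.0.1", 40123)    # hypothetical local port for parent/child messages

def run_worker():
    # Child process: no window, just wait for work over UDP and reply.
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind(ADDRESS)
    while True:
        data, sender = sock.recvfrom(4096)
        if data == b"quit":
            break
        # ... do the heavy lifting (pathfinding, AI, etc.) here ...
        sock.sendto(b"done: " + data, sender)

def run_parent():
    # Parent process: launch another instance of this same program, passing the
    # flag that makes it behave as the worker. (A deployed build would Popen its
    # own executable instead of "python script.py".)
    child = subprocess.Popen([sys.executable, sys.argv[0], WORKER_FLAG])
    time.sleep(1.0)  # crude wait for the child to bind its socket
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.sendto(b"find path A->B", ADDRESS)
    reply, _ = sock.recvfrom(4096)
    print(reply)
    sock.sendto(b"quit", ADDRESS)
    child.wait()

if __name__ == "__main__":
    if WORKER_FLAG in sys.argv:
        run_worker()
    else:
        run_parent()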

That’s good to know; thank you. :slight_smile:

Hmm… That does sound like a bit of a pain, to me: threading was never my strong suit, and your description seems to indicate that this is trickier to work with. For now, at least, I think that I’ll stick with the default approach (a pity though it may be to be perhaps underutilising the cores of the computer).

Nevertheless, thank you for the offer to share code! :slight_smile:

I tend to recommend strongly against multiprocessing in Panda because it wreaks havoc with packaging systems. I don’t believe we have a general solution yet for multiprocessing in either deploy-ng or p3d applications.

Threading is still very effective because Panda3D is not limited by the GIL. It’s perfectly feasible to do game logic tasks for the next frame while Panda is rendering the current frame.
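
As a minimal sketch (assuming a build of Panda3D with threading support; the chain and task names are just examples), you can put a task on its own thread via a task chain:

from direct.showbase.ShowBase import ShowBase
from direct.task import Task

class Game(ShowBase):
    def __init__(self):
        ShowBase.__init__(self)
        # A task chain with one dedicated thread of its own; tasks added to it
        # run on that thread rather than on the main one.
        self.taskMgr.setupTaskChain("logicChain", numThreads=1, frameSync=True)
        self.taskMgr.add(self.heavyLogic, "heavyLogic", taskChain="logicChain")

    def heavyLogic(self, task):
        # ... expensive per-frame work (AI, pathfinding, etc.) here ...
        return Task.cont

Game().run()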

Ah, fair enough–thank you for the warning.

Hmm… I’m glad that it doesn’t have that limit. But do I understand correctly that it still wouldn’t take advantage of multiple cores?

(Even so, it might be worth reconsidering, I suppose. It’s something to think about–and as noted before, I’ll want to look into threading to some degree for the purposes of loading anyway.)

I don’t see a reason why it couldn’t take advantage of multiple cores.

I was going by what wezu said, I believe–but looking at that again, I see that he gave the “Global Interpreter Lock” as the reason that threading would remain on one core, and you did say that Panda3D’s threading system doesn’t have that issue. Sorry about that–I was perhaps a bit tired as of my last post there! ^^;

Fair enough, then–it may well be worth looking into a threaded approach. Thank you for the help! :slight_smile:

Can I get a tip on how to write python code that will make use of many cores with p3d? I’m clearly missing something.

There was a blog post about it a while back.

It is also in the manual.

From what I understood of it, your game logic all runs on one core in Python (GIL and all), while culling routines run on a second core and any further CPU-side rendering tasks are done on a third core. So this way you can still get good efficiency on a quad-core machine, as long as your application is not too logic-heavy.

You do not have to write code in any special way to take advantage of it; you just enable it in the config (IIRC). Although maybe you get smoother-looking motion when you use character animations, intervals and the integrated physics engines, which allow future transformations of renderable stuff to be predicted without waiting for the logic thread to change their state explicitly?

That approach–the config-based “multithreaded render pipeline”–is what I was referring to in my initial post, I believe. Unfortunately, it seems that it tends to act up around certain features, such as shadow-mapping.

I (and I presume Wezu) was looking for some alternate approach, presumably using Panda’s general threading implementation.

Sorry, I should have read the original post more carefully!

Not a problem! :slight_smile: