Disassembly for speed.

Hi. I can program in assembly and I have used Panda3D.
It lags quite badly on my AGP 256MB GeForce FX 5500.

I was wondering if I could help with some optimization by
disassembling the Panda engine and optimizing it at the machine
code level. This shouldn’t change a thing about how the end user does things.
Should I do it?
I will re-assemble all the code.
All I will do is remove some of the rubbish the compiler has put inside
the exe files. I will make them smaller and more efficient.

Mainly because Panda3D is a really cool engine which is only
fast enough for PCI Express cards.

If you really want to squeeze every last microsecond out of the compiled Panda, the first thing to do is recompile a custom version of Panda with all of the asserts taken out and all the normal error-checking disabled. Our experience is that this gains you maybe 5% on performance. We don’t provide a streamlined version of Panda like this, though, because you really don’t want to be developing on such a cut-down Panda; with no error checking, the first call into C++ you make with invalid parameters will generate a segmentation fault.

But this is not the way to achieve major performance gains, anyway. If your application is unacceptably slow, it may be more of an issue with what your application is doing. You need to research where the bottlenecks actually are, and put your effort into where it will do the most good. For instance, maybe you’re spending too much time in Python. Or maybe it’s your collisions that are too expensive, bogging the frame rate down. Or maybe you’re rendering too many nodes. Maybe you’re rendering too many vertices per node, or maybe too few. Maybe you’re relying too much on lists, when you should be using dictionaries. Maybe you’re using too many dictionaries, when lists would do.

Discovering bottlenecks like this, and fixing them, is the way to make performance gains of 100% or better. And once you make these fixes, they generally stay fixed (at least for your particular application), and you also gain a better sense of how to write tight 3D code for the next application.
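As a rough illustration of what researching a bottleneck can look like on the Python side, here is a minimal sketch using the standard library's cProfile module. The functions `naive_pairs` and `game_frame` are made-up stand-ins for application code (they are not Panda3D calls), and the workload is an assumption chosen only so that something shows up in the report.

```python
import cProfile
import io
import pstats


def naive_pairs(points, radius):
    """Hypothetical hot spot: compare every point against every other point."""
    hits = 0
    r2 = radius * radius
    for i, (x1, y1) in enumerate(points):
        for x2, y2 in points[i + 1:]:
            if (x1 - x2) ** 2 + (y1 - y2) ** 2 <= r2:
                hits += 1
    return hits


def game_frame():
    """Stand-in for one frame of application logic (not a Panda3D API)."""
    points = [(i % 40, i // 40) for i in range(400)]  # a 40 x 10 grid
    return naive_pairs(points, 1.0)


def profile_frames(n_frames=5):
    """Run a few frames under cProfile and return the report as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    for _ in range(n_frames):
        game_frame()
    profiler.disable()

    out = io.StringIO()
    stats = pstats.Stats(profiler, stream=out)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(10)
    return out.getvalue()


if __name__ == "__main__":
    print(profile_frames())
```

The report lists each function with its call count and time, which is usually enough to tell whether the time is going into your own Python code or somewhere else.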

On the other hand, trying to second-guess the compiler is a poor proposition for the long term. You might make small improvements, but probably less than 5%. And the next time we build a new version of Panda, all of your improvements will be lost, and will have to be re-done.

David

very true david =)

from my experience, researching bottlenecks and getting rid of them gives a much bigger performance improvement.
i mis-used lists and dictionaries… after i solved it the needed processing power went down by a factor of >5000.
the most common performance issue is too many nodes with too few vertices. (at least on my end)
if you wanna improve performance even with a perfectly optimized scene you could start adding LOD algorithms =) see my post in the general discussion thread … it would help a lot more people and the performance gain can be significant.

but first… check for the more common bottlenecks drwr told you :slight_smile:
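For anyone wondering how a list/dictionary mix-up can cost that much, here is a small sketch. It is not the code from the post above (that was never shown); the size `N` and the helper names are assumptions for illustration. It times a membership test against a list, which scans element by element, and against a dictionary, which uses a hash lookup.

```python
import timeit

N = 20000
as_list = list(range(N))
as_dict = dict.fromkeys(as_list)  # same keys, values are all None
probe = N - 1                     # last element: worst case for the list scan


def in_list():
    # Linear scan: cost grows with the length of the list.
    return probe in as_list


def in_dict():
    # Hash lookup: roughly constant cost regardless of size.
    return probe in as_dict


if __name__ == "__main__":
    t_list = timeit.timeit(in_list, number=200)
    t_dict = timeit.timeit(in_dict, number=200)
    print("list: %.6fs  dict: %.6fs" % (t_list, t_dict))
```

The exact numbers depend on the machine, but the gap widens as `N` grows, so a lookup done many times per frame against a big list is a plausible way to end up with a very large slowdown.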

Could you elaborate on when lists or dicts cause slowdowns?

@irnjad: I don’t think assembly will work well. I can program in assembly too, but I rarely use it on the computer. Programming in C/C++ will make your program a lot more stable, because ASM doesn’t forgive bugs in your programs. One bug, and peew, your computer crashes.
Optimizing the current Panda3D code will indeed be better.

@Laurens
I won’t try to answer your question myself, but I’ll cite Fredrik Lundh (effbot):

http://effbot.org/zone/python-list.htm:

Especially appending/inserting to huge lists is slow.
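A small sketch of the insert case, with the caveat that this is my own illustration and not taken from the effbot page. In CPython, appending at the end of a list is amortized constant time; it is inserting at or near the front that has to shift every existing element. If you need to add at the front a lot, `collections.deque` from the standard library is one alternative. The function names are made up for the example.

```python
from collections import deque


def build_with_list(n):
    """Insert each new item at the front of a list.

    Every insert(0, ...) shifts all existing items, so the total
    work grows roughly with n squared.
    """
    items = []
    for i in range(n):
        items.insert(0, i)
    return items


def build_with_deque(n):
    """Same result using a deque, where appendleft is constant time."""
    items = deque()
    for i in range(n):
        items.appendleft(i)
    return list(items)


if __name__ == "__main__":
    print(build_with_list(5))
    print(build_with_deque(5))
```

Both functions produce the same sequence; the difference only becomes noticeable once the list gets large.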

Thanks! I’ll keep this info in mind.

Well the reason why I posted that is because my
AGP 256MB GeForce FX 5500 graphics card
lags when I run the Panda demos. I don’t think it has
anything to do with bottlenecks. I reckon Panda is just too slow.
I mean, even the simple demos with hardly any graphics
run at about 40fps.

Anyway, can I legally download the Panda source, translate it
to another language, add features and sell it as my own?
Regardless of how hard or impossible or how stupid this may seem,
I just wanna know if I can legally do this.

sell it?? Hey! You’re not Bill Gates, are you? :stuck_out_tongue:
Nah, jus’ kidding. No, you cannot sell it.
See the license:
panda3d.org/license.php

:astonished:

if you only get that few fps it sounds like you have a problem with your driver or something else. it looks almost like you are running panda in software mode.
panda is pretty fast. i had no problems rendering several hundred thousand triangles per second on my geforce 4 (back when i had the gf4).

can you post the console output you get when starting a demo?
and what os/driver combination are you using, and do other 3d-apps (opengl/d3d) run properly?

and about selling … you can sell it, but i think you can’t claim it’s your own engine (unless you completely rewrite it)… but that’s not the point since your problem seems to be somewhere near your 3d-acceleration :slight_smile:

40fps is low?
Isn’t Panda’s fps capped at 60fps when using OpenGL?
But DX is capped at about 85.
This is on my friend’s fast PC. My PC is still AGP, he has PCI Express.
I will send the info.

panda isn’t capped at all if you disable vertical synchronisation. and even if you leave it enabled, you would need a really bad monitor to be stuck at only 40Hz.
if you ask me your hardware acceleration is not working for some reason.
to disable v-sync and show the fps, try adding these lines at the beginning of the examples you want to test:
[code]
from pandac.PandaModules import loadPrcFileData
# turn off vertical sync so the frame rate is not tied to the monitor refresh
loadPrcFileData("", "sync-video 0")
# show the on-screen frame rate meter
loadPrcFileData("", "show-frame-rate-meter 1")
[/code]


a "normal" pc with working 3d-acceleration and v-sync off should get far beyond 100fps in the solar system demos.
the carousel demo runs on my linux box with a geforce 6600 at ~260fps
even a geforce3 or even 2 should run them fine.
if you get 40fps with a gf5 something IS wrong.

do other 3d-applications run fine? games? benchmarks? and could you post the console output you get when starting a panda application?

With the code added, it ran the bump mapping tutorial at 22fps.
But when you press enter it runs at 175-250fps.
My monitor is fine. It runs at an 85Hz refresh rate.
Games also run fine on my PC.
I have no problem with them.
Airblade gets 18fps when fighting the boss at the end.
And the egg model takes forever to load, so I converted it to bam.
I know about Config.prc so I just made the changes there.

I have been learning some of panda but when I realised it was too slow
I moved on, but now it seems I was wrong.

Oh, btw I just realised that I have panda 1.3.0
Maybe I should just try the new version and see if it was an old bug.

I am only going to be using Python with Panda, not C. Will this create
a speed problem?

The console window says.

DirectStart: Starting the game
Warning : DirectNotify: category ‘Interval’ already exists
Known pipe types:
wglGraphicsPipe
(3 aux display modules not yet loaded.)
:util(warning) : Adjusting global clock’s real time by 0.888729 seconds.
:util(warning) : Adjusting global clock’s real time by -1.2145 seconds.

usually not.
and in case you run into a speed problem (like when writing some fancy algorithms) you can still write them in c and call them via python.

about your console output…
wglGraphicsPipe
(3 aux display modules not yet loaded.)
well i don’t know if this is bad in any way but i get the message “all display modules loaded”
and since i’m not on windows my graphics pipe is glx instead of wgl.
have you tried dx8/dx9?

I tried dx8. Shaders are not supported.
I tried dx9. Bump mapping doesn’t work and the demos are glitchy.
It seems that I must use OpenGL.

The console window says.

DirectStart: Starting the game
Warning : DirectNotify: category ‘Interval’ already exists
Known pipe types:
wglGraphicsPipe
(3 aux display modules not yet loaded.)
:util(warning) : Adjusting global clock’s real time by 0.888729 seconds.
:util(warning) : Adjusting global clock’s real time by -1.2145 seconds

But I have
Opengl32.dll
glu32.dll
glwext32.dll
glut32.dll

Why do I get the error of 3 aux display modules not yet loaded?

Please can someone help me.
Aren’t there any other Windows users with the same problem?

This is a normal message for Windows users. It’s not an error. It’s telling you that you haven’t loaded pandadx8 or pandadx9 (or, I suppose, pandadx7, even though this one is deprecated); you’ve loaded pandagl instead.

What is the problem that you’re experiencing? Other than the fact that the bump mapping demo is not working in dx9, it sounds like everything is working normally. The Airblade demo does slow down at the end during the boss battle. This game was written very quickly, and was never optimized for performance. As such, it’s a testament to how easy it is to write a game with playable performance in Panda, without spending any effort to optimize it! It should be fairly easy to fix the final battle scene in Airblade to run much faster if anyone wanted to spend the effort to do so.

I don’t know what’s up with the bump mapping demo, but it may be something similar. It was designed as a proof of technology, not as a proof of performance. Some people have complained that it doesn’t even run at all on their graphics cards; it might be using an overly complex shader. Someone posted a simpler version on the forum not too long ago; try searching for that.

Like any graphics engine (or, indeed, any programming package), it is possible to write sluggish programs in Panda. It is also possible to write blazingly fast programs in Panda, but this takes a bit more effort. However, with Panda it is particularly easy to write a program that is at least playable with very little effort; and you can spend the time to tweak the performance later, if you care.

David

Ok but the problem is everything is running slow.
For all of the tutorials I get 60fps or lower.
For the glow filter it is 28fps and for normal mapping I get 28fps as well.
The fireflies demo doesn’t run at all.
My card is a GeForce FX 5500. It has 256MB.

So the three display modules that are not loaded must be dx7, 8 and 9,
because I told it to use OpenGL.

Surely panda should not be running so slowly.
Other users report that they get fps in the hundreds.
But I only get 60 or less.
Please help

Dude, the human eye can only perceive motion at about 60fps. And you said yourself your monitor is refreshing itself at 85Hz. Anything more than that is just wasted and superfluous.

60fps is considered the holy grail of 3-D framerates. You realize that 60fps means each frame is computed and drawn in less than 17ms, right? That’s very, very fast. People spend months optimizing their code to make it render at 60fps, in any 3-D engine. More often, games have to settle for 30fps or 20fps or even slower (all of which is still perfectly playable).
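The arithmetic behind that 17ms figure is just one second divided by the frame rate. A tiny sketch, where `frame_budget_ms` is a name made up for this example:

```python
def frame_budget_ms(fps):
    """Time available to compute and draw one frame, in milliseconds."""
    return 1000.0 / fps


if __name__ == "__main__":
    for fps in (20, 30, 60, 85):
        print("%3d fps -> %.1f ms per frame" % (fps, frame_budget_ms(fps)))
```

So 60fps leaves about 16.7ms per frame, 30fps about 33.3ms, and 20fps a full 50ms.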

Granted, these are silly little demo applications, not full-fledged games. So it’s easier to achieve higher frame rates than 60fps with some of these. Just because people do, doesn’t mean that it should be expected or that it means you’re running too slow if you don’t.

Generally, the bottleneck is the CPU. Your graphics card only becomes an issue in some of the shader demos.

David

Oh, I see.
Well that’s good to know. I am rather happy that Panda can be made fast
if I code well, and that these demos aren’t written for speed, which is why they are running slow.
Well, hopefully one day I can give back to Panda.
Thanks a lot.