Assertion failed on line 59 of DeletedBufferChain.cxx

Hello everybody,

I’ve just successfully compiled the latest source code of Panda3D (from the GitHub repo) on an NVIDIA Jetson TK1 developer board. It uses the ‘armhf’ processor architecture and has a GPU that supports both OpenGL and OpenGLES. I’m using Ubuntu 14.04 LTS.

I built the source with this command:

python makepanda/makepanda.py --everything --no-artoolkit --no-fcollada --no-fmodex --no-opencv --no-squish --no-tiff --no-vrpn --no-rocket --no-fftw --no-gles --no-gles2 --threads 4 --installer

(I’ve chosen not to use OpenGLES)

When I try to run the Panda Hello World program, everything looks fine for a few moments (the scene is rendering properly, at over 200fps.) Then, this error pops up in the console:

$ python p3d-test.py
Known pipe types:
  glxGraphicsPipe
(all display modules loaded.)
python: dtool/src/dtoolbase/deletedBufferChain.cxx:59: void* DeletedBufferChain::allocate(size_t, TypeHandle): Assertion `obj->_flag == (AtomicAdjust::Integer)DCF_deleted' failed.
Aborted

It looks to me as though there is some problem with a “DeletedBufferChain” that Panda3D is trying to allocate from.

Does anyone have an idea about why this might be happening? The animation runs fine for a second or two, sometimes longer. Then the window closes itself and this appears. Any help would be greatly appreciated.

Edit: The same error appears when using ‘p3tinydisplay’ software rendering.

Hmm, odd. Do you get the same crash when running “pview”? Do you think you might be able to get me a gdb traceback of the crash?

To get a gdb traceback of pview, just use “gdb pview” and when the prompt appears, type “run”. When it crashes, type “bt” in the prompt.

To get a gdb traceback of Python (in case pview doesn’t exhibit the same crash), use “gdb --args python p3d-test.py”.

Also, did you build with or without Eigen?

‘pview’ causes the exact same crash.
After displaying the window for varying amounts of time, the program aborts.

This is the output of gdb:

$ gdb pview
GNU gdb (Ubuntu 7.7.1-0ubuntu5~14.04.2) 7.7.1
Copyright (C) 2014 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "arm-linux-gnueabihf".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from pview...(no debugging symbols found)...done.
(gdb) run
Starting program: /usr/bin/pview 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Known pipe types:
  glxGraphicsPipe
(all display modules loaded.)
pview: dtool/src/dtoolbase/deletedBufferChain.cxx:59: void* DeletedBufferChain::allocate(size_t, TypeHandle): Assertion `obj->_flag == (AtomicAdjust::Integer)DCF_deleted' failed.

Program received signal SIGABRT, Aborted.
__libc_do_syscall () at ../ports/sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:44
44	../ports/sysdeps/unix/sysv/linux/arm/libc-do-syscall.S: No such file or directory.
(gdb) bt
#0  __libc_do_syscall () at ../ports/sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:44
#1  0xb51a4f0e in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#2  0xb51a7766 in __GI_abort () at abort.c:89
#3  0xb51a0150 in __assert_fail_base (fmt=0x1 <error: Cannot access memory at address 0x1>, assertion=0xb53945b0 "obj->_flag == (AtomicAdjust::Integer)DCF_deleted", assertion@entry=0x0, 
    file=0xb5394584 "dtool/src/dtoolbase/deletedBufferChain.cxx", file@entry=0xb4eba000 "", line=59, line@entry=3039178924, 
    function=function@entry=0xb5394468 <DeletedBufferChain::allocate(unsigned int, TypeHandle)::__PRETTY_FUNCTION__> "void* DeletedBufferChain::allocate(size_t, TypeHandle)") at assert.c:92
#4  0xb51a01e6 in __GI___assert_fail (assertion=0x0, file=0xb4eba000 "", line=3039178924, 
    function=0xb5394468 <DeletedBufferChain::allocate(unsigned int, TypeHandle)::__PRETTY_FUNCTION__> "void* DeletedBufferChain::allocate(size_t, TypeHandle)") at assert.c:101
#5  0xb5386d86 in DeletedBufferChain::allocate(unsigned int, TypeHandle) () from /usr/lib/arm-linux-gnueabihf/panda3d/libp3dtool.so.1.9
#6  0xb5beb862 in std::_Rb_tree<NodePath, std::pair<NodePath const, DisplayRegion*>, std::_Select1st<std::pair<NodePath const, DisplayRegion*> >, std::less<NodePath>, pallocator_single<std::pair<NodePath const, DisplayRegion*> > >::_M_insert_(std::_Rb_tree_node_base*, std::_Rb_tree_node_base*, std::pair<NodePath const, DisplayRegion*> const&) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#7  0xb5be40ea in GraphicsEngine::cull_to_bins(ov_set<PointerTo<GraphicsOutput>, IndirectLess<GraphicsOutput> > const&, Thread*) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#8  0xb5be41a2 in GraphicsEngine::WindowRenderer::do_frame(GraphicsEngine*, Thread*) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#9  0xb5be628c in GraphicsEngine::render_frame() () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#10 0x0004b4f0 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) 

I built with Eigen3 (it was already installed on my system.)

Thanks!

EDIT: Sometimes the backtrace is different! The same assertion is always triggered, but here’s another, for example:

$ gdb pview
Reading symbols from pview...(no debugging symbols found)...done.
(gdb) run
Starting program: /usr/bin/pview 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Known pipe types:
  glxGraphicsPipe
(all display modules loaded.)
pview: dtool/src/dtoolbase/deletedBufferChain.cxx:59: void* DeletedBufferChain::allocate(size_t, TypeHandle): Assertion `obj->_flag == (AtomicAdjust::Integer)DCF_deleted' failed.

Program received signal SIGABRT, Aborted.
__libc_do_syscall () at ../ports/sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:44
44	../ports/sysdeps/unix/sysv/linux/arm/libc-do-syscall.S: No such file or directory.
(gdb) bt
#0  __libc_do_syscall () at ../ports/sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:44
#1  0xb51a4f0e in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#2  0xb51a7766 in __GI_abort () at abort.c:89
#3  0xb51a0150 in __assert_fail_base (fmt=0x1 <error: Cannot access memory at address 0x1>, assertion=0xb53945b0 "obj->_flag == (AtomicAdjust::Integer)DCF_deleted", assertion@entry=0x0, 
    file=0xb5394584 "dtool/src/dtoolbase/deletedBufferChain.cxx", file@entry=0xb4eba000 "", line=59, line@entry=3039178924, 
    function=function@entry=0xb5394468 <DeletedBufferChain::allocate(unsigned int, TypeHandle)::__PRETTY_FUNCTION__> "void* DeletedBufferChain::allocate(size_t, TypeHandle)") at assert.c:92
#4  0xb51a01e6 in __GI___assert_fail (assertion=0x0, file=0xb4eba000 "", line=3039178924, 
    function=0xb5394468 <DeletedBufferChain::allocate(unsigned int, TypeHandle)::__PRETTY_FUNCTION__> "void* DeletedBufferChain::allocate(size_t, TypeHandle)") at assert.c:101
#5  0xb5386d86 in DeletedBufferChain::allocate(unsigned int, TypeHandle) () from /usr/lib/arm-linux-gnueabihf/panda3d/libp3dtool.so.1.9
#6  0xb59da432 in TransformState::make_mat(LMatrix4f const&) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#7  0xb4e51bd0 in GLGraphicsStateGuardian::calc_projection_mat(Lens const*) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpandagl.so
#8  0xb5bf76f6 in GraphicsStateGuardian::set_scene(SceneSetup*) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#9  0xb5bdd564 in GraphicsEngine::do_draw(CullResult*, SceneSetup*, GraphicsOutput*, DisplayRegion*, Thread*) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#10 0xb5bdd66e in GraphicsEngine::draw_bins(GraphicsOutput*, DisplayRegion*, Thread*) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#11 0xb5be3c68 in GraphicsEngine::draw_bins(ov_set<PointerTo<GraphicsOutput>, IndirectLess<GraphicsOutput> > const&, Thread*) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#12 0xb5be41ba in GraphicsEngine::WindowRenderer::do_frame(GraphicsEngine*, Thread*) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#13 0xb5be628c in GraphicsEngine::render_frame() () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#14 0x0004b4f0 in ?? ()
Backtrace stopped: previous frame identical to this frame (corrupt stack?)
(gdb) 

And yet another:

$ gdb pview
Reading symbols from pview...(no debugging symbols found)...done.
(gdb) run
Starting program: /usr/bin/pview 
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/arm-linux-gnueabihf/libthread_db.so.1".
Known pipe types:
  glxGraphicsPipe
(all display modules loaded.)
pview: dtool/src/dtoolbase/deletedBufferChain.cxx:59: void* DeletedBufferChain::allocate(size_t, TypeHandle): Assertion `obj->_flag == (AtomicAdjust::Integer)DCF_deleted' failed.

Program received signal SIGABRT, Aborted.
__libc_do_syscall () at ../ports/sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:44
44	../ports/sysdeps/unix/sysv/linux/arm/libc-do-syscall.S: No such file or directory.
(gdb) bt
#0  __libc_do_syscall () at ../ports/sysdeps/unix/sysv/linux/arm/libc-do-syscall.S:44
#1  0xb51a4f0e in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#2  0xb51a7766 in __GI_abort () at abort.c:89
#3  0xb51a0150 in __assert_fail_base (fmt=0x1 <error: Cannot access memory at address 0x1>, assertion=0xb53945b0 "obj->_flag == (AtomicAdjust::Integer)DCF_deleted", assertion@entry=0x0, 
    file=0xb5394584 "dtool/src/dtoolbase/deletedBufferChain.cxx", file@entry=0xb4eba000 "", line=59, line@entry=3039178924, 
    function=function@entry=0xb5394468 <DeletedBufferChain::allocate(unsigned int, TypeHandle)::__PRETTY_FUNCTION__> "void* DeletedBufferChain::allocate(size_t, TypeHandle)") at assert.c:92
#4  0xb51a01e6 in __GI___assert_fail (assertion=0x0, file=0xb4eba000 "", line=3039178924, 
    function=0xb5394468 <DeletedBufferChain::allocate(unsigned int, TypeHandle)::__PRETTY_FUNCTION__> "void* DeletedBufferChain::allocate(size_t, TypeHandle)") at assert.c:101
#5  0xb5386d86 in DeletedBufferChain::allocate(unsigned int, TypeHandle) () from /usr/lib/arm-linux-gnueabihf/panda3d/libp3dtool.so.1.9
#6  0xb54654f0 in WeakReferenceList::add_reference(WeakPointerToVoid*) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpandaexpress.so.1.9
#7  0xb59e6524 in WeakPointerToBase<RenderState>::reassign(RenderState*) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#8  0xb59d910a in StateMunger::munge_state(RenderState const*) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#9  0xb5987cf6 in CullableObject::munge_geom(GraphicsStateGuardianBase*, GeomMunger*, CullTraverser const*, bool) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#10 0xb5988118 in CullResult::add_object(CullableObject*, CullTraverser const*) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#11 0xb597dd6e in GeomNode::add_for_draw(CullTraverser*, CullTraverserData&) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#12 0xb598113e in CullTraverser::traverse_below(CullTraverserData&) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#13 0xb5fbe690 in TextNode::cull_callback(CullTraverser*, CullTraverserData&) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
#14 0xb5afc5c4 in FrameRateMeter::cull_callback(CullTraverser*, CullTraverserData&) () from /usr/lib/arm-linux-gnueabihf/panda3d/libpanda.so.1.9
Cannot access memory at address 0x3d3e2066
(gdb) 

I notice that this looks like a memory issue:

<error: Cannot access memory at address 0x1>

appears in all three backtraces, even though each is different.

Any ideas?
Thanks!

Hmm, does the system happen to be close to running out of memory at that point? It could just be a simple memory allocation failure masked by Panda’s deleted chain system.

If not, it might be an obscure platform-specific bug in the deleted chain system (which is a system to reuse memory for the same objects). You could try compiling without the deleted chain system, by making a clean build and adding --override USE_DELETED_CHAIN=UNDEF to the makepanda command line.
(Look at built/include/dtool_config.h to make sure it’s really writing “#undef USE_DELETED_CHAIN” to the file.)

If it’s not a bug in the deleted chain system, this would probably make the crash happen in a different way, which might shed more light on the cause of this issue.
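
For readers unfamiliar with the idea, here is a minimal single-threaded sketch of what a "deleted chain" style allocator does: freed blocks go onto a free list and are handed out again, and a flag in each block's header records whether the block is currently alive or deleted. All names here (FreeListChain, Node, CF_alive, CF_deleted) are hypothetical and are not Panda3D's actual code; the real DeletedBufferChain also takes a lock and differs in detail.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>

// Hypothetical flag values, chosen only so that garbage is unlikely to match.
enum ChainFlag : int { CF_alive = 0x1234, CF_deleted = 0x5678 };

// Header stored in front of every buffer handed out by the chain.
struct Node {
  int flag;    // CF_alive while in use, CF_deleted while on the free list
  Node *next;  // next free node; only meaningful while deleted
};

class FreeListChain {
public:
  explicit FreeListChain(std::size_t buffer_size)
      : _buffer_size(buffer_size), _head(nullptr) {}

  void *allocate() {
    Node *node = _head;
    if (node != nullptr) {
      // Reuse a previously freed block. If its flag is not "deleted",
      // something wrote to the block (or the list) when it should not have.
      _head = node->next;
      assert(node->flag == CF_deleted);
    } else {
      // Nothing to reuse: get fresh memory (never returned in this sketch).
      node = static_cast<Node *>(std::malloc(sizeof(Node) + _buffer_size));
      assert(node != nullptr);
    }
    node->flag = CF_alive;
    return node + 1;  // the caller's buffer starts right after the header
  }

  void deallocate(void *ptr) {
    Node *node = static_cast<Node *>(ptr) - 1;
    assert(node->flag == CF_alive);  // catches double frees
    node->flag = CF_deleted;
    node->next = _head;
    _head = node;
  }

private:
  std::size_t _buffer_size;
  Node *_head;
};
```

In this sketch, the assertion inside allocate() plays the same role as the one that fires in the backtraces above: it trips when a block taken from the free list is not marked as deleted.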

Actually, just after writing that post, I realised that the deleted chain flag in question is written to using atomic adjustment. Atomic intrinsics tend to have their platform-specific quirks.

There are two ways to change this to test whether it is the issue. The first is to compile without full pipelined threading support (if you don’t intend to use Panda’s threading and pipelining features) by adding “--override SIMPLE_THREADS=1”.
The other way is to disable use of GCC atomic intrinsics in favour of a simple pthreads mutex. To do that, edit dtool/src/dtoolbase/atomicAdjust.h, and comment out the whole #elif case concerning the GCC implementation.

(Of course, if you try this, re-enable the deleted chain system so that we can verify whether or not this is what causes the problem.)
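
To illustrate the difference between the two implementations, here is a rough sketch of a compare-and-exchange done under a global mutex next to one done with the GCC intrinsic. The function names and the Integer typedef are stand-ins for illustration, and std::mutex is used here in place of a raw pthreads mutex; this is not the code in atomicAdjust.h.

```cpp
#include <cassert>
#include <mutex>

typedef long Integer;  // stand-in for the library's integer type

// One global lock shared by every "atomic" operation in the fallback.
static std::mutex g_atomic_lock;

// Mutex-based fallback: correct on any platform, but every call pays for
// taking and releasing the lock.
Integer locked_compare_and_exchange(volatile Integer &mem,
                                    Integer old_value, Integer new_value) {
  std::lock_guard<std::mutex> guard(g_atomic_lock);
  Integer orig = mem;
  if (orig == old_value) {
    mem = new_value;
  }
  return orig;  // the value that was in memory before the call
}

// Intrinsic-based version: a strong compare-and-exchange (weak = false)
// using the GCC/Clang __atomic builtin.
Integer intrinsic_compare_and_exchange(volatile Integer &mem,
                                       Integer old_value, Integer new_value) {
  __atomic_compare_exchange_n(&mem, &old_value, new_value, false,
                              __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
  // On failure the builtin writes the observed value into old_value;
  // on success old_value is left unchanged. Either way it is the old value.
  return old_value;
}
```

Both functions return the value that was in memory before the call, so a caller checks for success by comparing the result with the old_value it passed in.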

Thanks for the fast replies.

I’ll try disabling the use of GCC atomic intrinsics in atomicAdjust.h, like you mentioned, and post back with the results.

That worked! Now everything runs fine.

Just one question: will the disabled AtomicAdjust cause a performance decrease? (If so, I would go with the disabled threading instead.) Do you know which solution would be better performance-wise?

Thank you!

Hmm, it’s a bit troubling that the GCC atomic intrinsics cause trouble on ARM. I do wonder why; perhaps we’re not using the right memory model, or not issuing a needed memory barrier… I’ll have to think about how we’re going to handle this in the future. I may have no choice but to disable it entirely, at least on ARM processors, until the cause is found. (I do have an ARM-based development board around somewhere, so I might try to reproduce the issue when I find the time.)

The lack of availability of atomics may impact performance a little bit, since there is a little bit more overhead associated with the global mutex (even if only because of the extra call into the pthread library).
Disabling true threading support altogether is definitely a solution if you don’t plan on using Panda’s advanced threaded pipeline at all (which frankly is still a bit experimental in some areas). If you don’t use it, Panda actually becomes faster when you disable threading because of the additional overhead in other areas.

Assuming that your application doesn’t rely heavily on using Panda3D in a threaded fashion, I would suggest that you compile Panda with SIMPLE_THREADS=1, which disables atomics but enables a fake threading system (negating the overhead of true threading while still offering a simple threading interface). If you find yourself dealing with performance issues down the road and think you could benefit from pipelined render support, we could revisit this issue.

Thanks for helping to pinpoint this issue! It’s much appreciated. :)

Or, if you’re feeling adventurous, you could try this shot in the dark to see if it happens to fix the issue:

diff --git a/dtool/src/dtoolbase/deletedBufferChain.cxx b/dtool/src/dtoolbase/deletedBufferChain.cxx
index d02f10e..ad016d1 100644
--- a/dtool/src/dtoolbase/deletedBufferChain.cxx
+++ b/dtool/src/dtoolbase/deletedBufferChain.cxx
@@ -56,8 +56,8 @@ allocate(size_t size, TypeHandle type_handle) {
     _lock.release();

 #ifdef USE_DELETEDCHAINFLAG
-    assert(obj->_flag == (AtomicAdjust::Integer)DCF_deleted);
-    obj->_flag = DCF_alive;
+    AtomicAdjust::Integer orig_flag = AtomicAdjust::compare_and_exchange(obj->_flag, DCF_deleted, DCF_alive);
+    assert(orig_flag == (AtomicAdjust::Integer)DCF_deleted);
 #endif  // USE_DELETEDCHAINFLAG

     void *ptr = node_to_buffer(obj);
@@ -77,7 +77,7 @@ allocate(size_t size, TypeHandle type_handle) {
   obj = (ObjectNode *)NeverFreeMemory::alloc(alloc_size);

 #ifdef USE_DELETEDCHAINFLAG
-  obj->_flag = DCF_alive;
+  AtomicAdjust::set(obj->_flag, DCF_alive);
 #endif  // USE_DELETEDCHAINFLAG

   void *ptr = node_to_buffer(obj);

(Of course, re-enable atomics if you do try this.)

Since I was compiling a new version of Panda anyway, I tried that. It did not work - the same error appeared.

I also tried the SIMPLE_THREADS override. It does not work either:

Traceback (most recent call last):
  File "../../p3d-test.py", line 3, in <module>
    from direct.showbase.ShowBase import ShowBase
  File "/usr/share/panda3d/direct/showbase/ShowBase.py", line 10, in <module>
    from panda3d.core import *
  File "/usr/lib/python2.7/dist-packages/panda3d/core.py", line 3, in <module>
    from ._core import *
ImportError: /usr/lib/python2.7/dist-packages/panda3d/_core.so: undefined symbol: is_os_threads

That’s OK, though - disabling GCC atomic intrinsics as you also mentioned works to solve the problem.

This patch seems to address the issue, according to someone else who’s built Panda for ARM. I’ll check it in soon.

diff --git a/dtool/src/dtoolbase/atomicAdjustGccImpl.I b/dtool/src/dtoolbase/atomicAdjustGccImpl.I
index 3e462f0..226f293 100644
--- a/dtool/src/dtoolbase/atomicAdjustGccImpl.I
+++ b/dtool/src/dtoolbase/atomicAdjustGccImpl.I
@@ -124,8 +124,8 @@ INLINE AtomicAdjustGccImpl::Integer AtomicAdjustGccImpl::
 compare_and_exchange(TVOLATILE AtomicAdjustGccImpl::Integer &mem,
                      AtomicAdjustGccImpl::Integer old_value,
                      AtomicAdjustGccImpl::Integer new_value) {
-
-  __atomic_compare_exchange_n(&mem, &old_value, new_value, true,
+  assert((((size_t)&mem) & (sizeof(Integer) - 1)) == 0);
+  __atomic_compare_exchange_n(&mem, &old_value, new_value, false,
                               __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
   return old_value;
 }
@@ -142,7 +142,8 @@ compare_and_exchange_ptr(TVOLATILE AtomicAdjustGccImpl::Pointer &mem,
                          AtomicAdjustGccImpl::Pointer old_value,
                          AtomicAdjustGccImpl::Pointer new_value) {

-  __atomic_compare_exchange_n(&mem, &old_value, new_value, true,
+  assert((((size_t)&mem) & (sizeof(Pointer) - 1)) == 0);
+  __atomic_compare_exchange_n(&mem, &old_value, new_value, false,
                               __ATOMIC_SEQ_CST, __ATOMIC_SEQ_CST);
   return old_value;
 }
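
A note on what the patch changes, with a small sketch. The fourth argument of __atomic_compare_exchange_n selects a "weak" compare-and-exchange, which is allowed to fail spuriously (leaving memory unchanged even though it held the expected value); this can happen on architectures such as ARM that implement it with load-linked/store-conditional instructions. A weak exchange therefore needs a retry loop, whereas a strong one fails only on a genuine mismatch. Whether this was the actual root cause is not established in this thread, so take it as a plausible reading. The sketch below uses std::atomic rather than Panda's wrappers, and the function names are made up for illustration.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Strong form: fails only if mem does not hold old_value.
// Returns the value that was in memory before the call.
long cas_strong(std::atomic<long> &mem, long old_value, long new_value) {
  mem.compare_exchange_strong(old_value, new_value);
  return old_value;  // updated to the observed value on failure
}

// Weak form: may fail spuriously, so it has to be retried for as long as
// the observed value still equals the one we expected.
long cas_weak_with_retry(std::atomic<long> &mem, long old_value,
                         long new_value) {
  long expected = old_value;
  while (!mem.compare_exchange_weak(expected, new_value)) {
    if (expected != old_value) {
      break;  // genuine mismatch: someone else changed the value
    }
    // Spurious failure: memory still holds old_value, so try again.
  }
  return expected;
}

// Same idea as the assert added by the patch: the address must be a
// multiple of the operand size (size is assumed to be a power of two).
bool is_naturally_aligned(const volatile void *ptr, std::size_t size) {
  return (reinterpret_cast<std::uintptr_t>(ptr) & (size - 1)) == 0;
}
```

Without the retry loop, a spuriously failed weak exchange leaves the expected value untouched, so a wrapper that simply returns it would report success to the caller although nothing was written.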