shared libraries, RTTI and dlopen
January 28th, 2007 by Bramz

When I tried to run a first render in Linux, I got a segmentation fault in the photon mapper because there were no lights. Of course, this shouldn't cause a crash (fixed by now), but why were there no lights? I was pretty sure I had defined one, and a quick investigation of the scene graph confirmed this.
Before rendering starts, a visitor called LightContextGatherer harvests all SceneLight instances from the scene graph. It's an acyclic visitor as described by Andrei Alexandrescu.
    class LightContextGatherer:
        public util::VisitorBase,
        public util::Visitor<SceneObject>,
        public util::Visitor<SceneLight>
    {
        void doVisit(SceneObject&);
        void doVisit(SceneLight&);
    };
The way it works is that if a scene object, for example a LightArea, accepts a visitor, it tries to cast the visitor to util::Visitor<LightArea>. If that succeeds, it calls the appropriate doVisit function. Otherwise, it tries again using its parent type, in this case SceneLight. As a result, all light instances end up in doVisit(SceneLight&), and all other objects in doVisit(SceneObject&).
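The dispatch described above can be sketched roughly as follows. This is a minimal illustration, not LiAR's actual code: VisitorBase, Visitor, and the scene classes are simplified stand-ins for the util:: and scene types, and the tryVisit helper and the two counters are hypothetical names introduced just for this example.

```cpp
// Simplified stand-ins for util::VisitorBase and util::Visitor<T>.
struct VisitorBase {
    virtual ~VisitorBase() = default;
};

template <typename T>
struct Visitor {
    virtual ~Visitor() = default;
    virtual void doVisit(T&) = 0;
};

// Hypothetical helper: cross-cast the visitor to Visitor<T> and dispatch.
// Returns false if the visitor does not handle T.
template <typename T>
bool tryVisit(T& object, VisitorBase& visitor) {
    if (auto* v = dynamic_cast<Visitor<T>*>(&visitor)) {
        v->doVisit(object);
        return true;
    }
    return false;
}

struct SceneObject {
    virtual ~SceneObject() = default;
    virtual void accept(VisitorBase& v) { tryVisit<SceneObject>(*this, v); }
};

struct SceneLight : SceneObject {
    void accept(VisitorBase& v) override {
        // Fall back to the parent type if Visitor<SceneLight> is not supported.
        if (!tryVisit<SceneLight>(*this, v)) SceneObject::accept(v);
    }
};

struct LightArea : SceneLight {
    void accept(VisitorBase& v) override {
        if (!tryVisit<LightArea>(*this, v)) SceneLight::accept(v);
    }
};

struct LightContextGatherer : VisitorBase,
                              Visitor<SceneObject>,
                              Visitor<SceneLight> {
    int lights = 0;  // counts calls to doVisit(SceneLight&)
    int others = 0;  // counts calls to doVisit(SceneObject&)
    void doVisit(SceneObject&) override { ++others; }
    void doVisit(SceneLight&) override { ++lights; }
};
```

Calling accept on a LightArea with a LightContextGatherer fails the cast to Visitor<LightArea>, falls back to SceneLight, and increments lights. Every step hinges on dynamic_cast succeeding, so if all of those casts return null, no doVisit is ever reached.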
But for some reason, neither doVisit was ever called. It turned out that the dynamic_cast used to cast the visitor to the appropriate type always returned null, even when trying to cast to util::Visitor<SceneObject>. Huh?
Some googling revealed that GCC 3.0 uses address comparisons to determine type equality. This obviously has a performance advantage over string comparisons, but it breaks down if the two sides of the comparison each have their own set of typeinfo instances. This can happen when using dynamic linkage, and it certainly doesn't help that Python uses dlopen to load extension modules.
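To make the difference between the two strategies concrete, here is a small sketch. It is not GCC's runtime code; sameTypeByAddress and sameTypeByName are hypothetical helpers that only illustrate the idea of comparing std::type_info objects by identity versus by mangled name.

```cpp
#include <cstring>
#include <typeinfo>

// Address-based equality: cheap, but only reliable if each type has exactly
// one std::type_info object in the whole process.
inline bool sameTypeByAddress(const std::type_info& a, const std::type_info& b) {
    return &a == &b;
}

// Name-based equality: slower, but it still matches when two modules each
// carry their own copy of the type_info object for the same type.
inline bool sameTypeByName(const std::type_info& a, const std::type_info& b) {
    return std::strcmp(a.name(), b.name()) == 0;
}
```

If the extension module and another shared library each end up with a private copy of the typeinfo for util::Visitor<SceneObject>, an address-based check sees two different types, which is consistent with the dynamic_cast returning null.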
Fortunately, the GCC FAQ mentions a solution: link with the -E flag to add all symbols to the dynamic symbol table, and use dlopen with the RTLD_GLOBAL flag set. For LiAR, the former means adding -Wl,-E to extra_link_args in setup.py. The latter can be done using sys.setdlopenflags. Proper try/except wrapping ensures that it is only called if appropriate. At least one Linux system did not provide the dl module, but fortunately the same constants also live in DLFCN (also not always available), hence the heavy nesting. The code is put at the beginning of __init__.py.
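    import sys

    try:
        try:
            import dl
        except ImportError:
            # dl is missing on some systems; DLFCN has the same constants
            try:
                import DLFCN as dl
            except ImportError:
                pass
        # raises if neither module was found or setdlopenflags is unavailable
        sys.setdlopenflags(dl.RTLD_NOW | dl.RTLD_GLOBAL)
    except Exception:
        # leave the default dlopen flags in place
        pass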