typeid / type_info equality check fails for clang/libc++ when using VSG in dynamic libraries #899
Comments
I never understood the problem with just using dynamic_cast. In vsgCs I have:
and use it instead of ref_ptr::cast(). At the time I didn't understand how ref_ptr::cast() was supposed to work. |
@martinweber That's an obscure and unwelcome finding. Object::cast<> exists to lower the CPU overhead of casting compared to dynamic_cast<>. Perhaps compatibility issues like this are partly why dynamic_cast<> is so slow. As a short-term fix, falling back to dynamic_cast<> as the implementation on dynamic builds might be a workaround. @timoore "I never understood the problem with just using dynamic_cast": the elephant in the room any time you use dynamic_cast<> is how slow it is. When I introduced the VSG's RTTI functions I benchmarked them against dynamic_cast<> and they were 3.6x faster. I tweeted about it back in July 2020 when I introduced the functionality:
Looking online perhaps the following might be another alternative: |
I understand that dynamic_cast is or can be slow, but is dynamic downcasting really in the hot path of anything in the VSG? |
This looks very verbose. Unfortunately, it seems to suffer from the same issue of not working across module boundaries, as listed under limitations:
I was thinking of generating a compile time hash for the type that can be used by |
I think the best way to tackle this issue is to create a test example in vsgExamples that we can use to reproduce the problem and benchmark different solutions. Unfortunately it looks like I threw away the test program I wrote when I originally worked on this RTTI functionality back in July 2020, as it would have been a good starting place. Once we can reliably reproduce the issue and benchmark performance we can iterate on different solutions. I am rather stretched across tasks right now and can't take on another round of investigation and trying different solutions right away, so help here would be appreciated. |
I created a minimal example to reproduce the issue on my fork of vsgExamples: martinweber/vsgExamples@a143725 Just returning a This is the output I get on macOS: (I haven't tried it on Windows or Linux)
I have an idea about using compile time generated hashes that I want to try. I'll keep you posted. |
Thanks, I have pulled the example into vsgExamples as the branch: https://github.com/vsg-dev/vsgExamples/tree/martinweber-clang-typeid-issue I will now investigate. |
Results so far:
VSG built with gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, static build:
$ clang_typeid
Local: type name: N3vsg15MatrixTransformE, type hash: 8883272728397651726
Dylib: type name: N3vsg15MatrixTransformE, type hash: 8883272728397651726
types are compatible
VSG built with gcc (Ubuntu 11.4.0-1ubuntu1~22.04) 11.4.0, dynamic library build:
$ clang_typeid
Local: type name: N3vsg15MatrixTransformE, type hash: 8883272728397651726
Dylib: type name: N3vsg15MatrixTransformE, type hash: 8883272728397651726
types are compatible
Next I'll install and switch over to the clang compilers. |
I have installed clang-16 & clang++-16 from the Ubuntu 22.04 repo, and set CC and CXX in my env vars with:
export CC=/bin/clang-16
export CXX=/bin/clang++-16
But on attempting to configure with cmake I get the following error, "/bin/ld: cannot find -lstdc++: No such file or directory":
cmake .
-- The CXX compiler identification is Clang 16.0.6
-- Detecting CXX compiler ABI info
-- Detecting CXX compiler ABI info - failed
-- Check for working CXX compiler: /bin/clang++-16
-- Check for working CXX compiler: /bin/clang++-16 - broken
CMake Error at /usr/share/cmake-3.22/Modules/CMakeTestCXXCompiler.cmake:62 (message):
The C++ compiler
"/bin/clang++-16"
is not able to compile a simple test program.
It fails with the following output:
Change Dir: /home/robert/Dev/VulkanSceneGraph/CMakeFiles/CMakeTmp
Run Build Command(s):/bin/gmake -f Makefile cmTC_b09ee/fast && /bin/gmake -f CMakeFiles/cmTC_b09ee.dir/build.make CMakeFiles/cmTC_b09ee.dir/build
gmake[1]: Entering directory '/home/robert/Dev/VulkanSceneGraph/CMakeFiles/CMakeTmp'
Building CXX object CMakeFiles/cmTC_b09ee.dir/testCXXCompiler.cxx.o
/bin/clang++-16 -MD -MT CMakeFiles/cmTC_b09ee.dir/testCXXCompiler.cxx.o -MF CMakeFiles/cmTC_b09ee.dir/testCXXCompiler.cxx.o.d -o CMakeFiles/cmTC_b09ee.dir/testCXXCompiler.cxx.o -c /home/robert/Dev/VulkanSceneGraph/CMakeFiles/CMakeTmp/testCXXCompiler.cxx
Linking CXX executable cmTC_b09ee
/usr/bin/cmake -E cmake_link_script CMakeFiles/cmTC_b09ee.dir/link.txt --verbose=1
/bin/clang++-16 CMakeFiles/cmTC_b09ee.dir/testCXXCompiler.cxx.o -o cmTC_b09ee
/bin/ld: cannot find -lstdc++: No such file or directory
clang: error: linker command failed with exit code 1 (use -v to see invocation)
gmake[1]: *** [CMakeFiles/cmTC_b09ee.dir/build.make:100: cmTC_b09ee] Error 1
gmake[1]: Leaving directory '/home/robert/Dev/VulkanSceneGraph/CMakeFiles/CMakeTmp'
gmake: *** [Makefile:127: cmTC_b09ee/fast] Error 2
CMake will not be able to correctly generate this project.
Call Stack (most recent call first):
CMakeLists.txt:3 (project)
-- Configuring incomplete, errors occurred!
See also "/home/robert/Dev/VulkanSceneGraph/CMakeFiles/CMakeOutput.log".
See also "/home/robert/Dev/VulkanSceneGraph/CMakeFiles/CMakeError.log".
This is how I previously used clang instead of gcc and I recall it working OK, and searches online haven't given me any useful pointers yet. |
Unfortunately I do not have a recent Linux installation ready to use. Our target is ancient CentOS 7 with gcc. Looking at the clang meta package on 22.04LTS I would think clang-14 is the last supported version on 22.04LTS. clang-16 seems to be a 23.04 (lunar) package only, though even there the clang meta package still uses v15. On macOS the current Apple clang version is 14.0.3 which is the one I use and which shows the issue. |
I tried installing clang-14 but the package was broken :-| |
clang++15 installs but I get the same "/bin/ld: cannot find -lstdc++: No such file or directory" issue when running cmake. |
I have implemented a first prototype that generates type hashes at compile time, using a simple FNV-1a hash. I got the hashing code from here. The FNV-1a hash is very simple (one XOR and one multiplication per byte, starting from a good initial value) and is therefore fast. It is vulnerable to long sequences of zeros, but that is not going to happen with type name strings. I implemented the templates necessary to generate the type hash value at compile time. Now I get this output (macOS clang 14.0.3):
I haven't really tested this, as I treated it as proof-of-concept! I am also not sure about the implementation for This implementation will have an impact on compile time. Runtime performance and memory requirements should not really be affected. I still need to implement a test / benchmark for that. My branch with the implementation is here. I updated the reproduction test example that generates the output above as well. |
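For readers who want to see the idea in isolation, a compile-time FNV-1a type hash can be sketched roughly as follows. This is a minimal illustration and not the code from the branch mentioned above: the names fnv1a_64, type_label, type_hash and EXAMPLE_TYPE_LABEL are hypothetical, and each type's name is supplied explicitly as a string so the hash does not depend on std::type_info.

```cpp
#include <cassert>
#include <cstdint>
#include <string_view>

// 64-bit FNV-1a: XOR each byte into the hash, then multiply by the FNV prime.
constexpr std::uint64_t fnv1a_64(std::string_view text) noexcept
{
    std::uint64_t hash = 14695981039346656037ull; // FNV-1a 64-bit offset basis
    for (char c : text)
    {
        hash ^= static_cast<std::uint8_t>(c);
        hash *= 1099511628211ull; // FNV 64-bit prime
    }
    return hash;
}

// Hypothetical trait: every participating type registers its name explicitly.
template<typename T>
struct type_label; // intentionally undefined for unregistered types

#define EXAMPLE_TYPE_LABEL(T) \
    template<> struct type_label<T> { static constexpr std::string_view value = #T; };

// Hash of the registered name, evaluated at compile time.
template<typename T>
constexpr std::uint64_t type_hash() noexcept
{
    return fnv1a_64(type_label<T>::value);
}

// Stand-in types for illustration only (not the VSG classes).
namespace demo
{
    struct Group {};
    struct MatrixTransform {};
}
EXAMPLE_TYPE_LABEL(demo::Group)
EXAMPLE_TYPE_LABEL(demo::MatrixTransform)

// Both checks are resolved by the compiler, with no runtime cost.
static_assert(type_hash<demo::MatrixTransform>() == fnv1a_64("demo::MatrixTransform"));
static_assert(type_hash<demo::MatrixTransform>() != type_hash<demo::Group>());

// Runtime usage: the hash is an ordinary integer that can be compared anywhere.
inline void example()
{
    constexpr std::uint64_t expected = type_hash<demo::MatrixTransform>();
    assert(expected == fnv1a_64("demo::MatrixTransform"));
}
```

Because the hash depends only on the characters of the registered name, every module that compiles the same label arrives at the same value, regardless of how the libraries are loaded. The trade-off is that two distinct names could in principle collide, which a typeid comparison cannot.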
I just noticed that it has Did you delete the build folder after building with gcc? At least |
I clobbered all my VSG projects before trying the clang build, so there was no CMakeCache.txt prior to running cmake. The error is for a CMake-generated testCXXCompiler.cxx file, and CMake is generating its own link lines. |
A quick follow up. I have replaced a couple of
I will investigate this as well. I have been very busy this week with other tasks but I plan to continue to work on tests and benchmarks for the compile time generated type hashes next week. I also ordered an SSD so I can install Ubuntu as well for testing. |
Thanks for continuing with the work. While I haven't been able to keep trying to get clang installed and keep testing, this is an area where I'm committed to seeing a solution checked in. I have other work that I have to get on with right now, but I think we need a solution to these problems as a TODO item for the next point release of the VSG, so I will dive back into this topic prior to the next release. |
Unfortunately I also was busy with other work. I now found some time to do more testing. The issue is indeed caused by
Checking with
So after this, I compiled VulkanSceneGraph as a dynamic library and linked the executable and dynamic library against it. This solved the issue as now So going forward, this would be the solution to the Clang/libc++ Ideally, we could add a build option to build VulkanSceneGraph as either a static or a dynamic library. |
I'm a bit lost on what approach works for you now. Do you still need your changes to be applied for dynamic library version of the VSG to work OK on clang?
The VSG builds using the standard CMake approach with the BUILD_SHARED_LIBS option. We've been building static and dynamic libraries of the VSG since its inception using this, so do you just mean a solution for Clang and dynamic libraries? |
Using VSG as dynamic library in all modules works in the reproduction example. So I am now looking into changing our use of VSG to a dynamic library.
Ah, thanks. I am pretty confident that switching VSG to a dynamic library will solve my issues with Clang on macOS without any changes to VSG needed. That's the next step for me to verify. |
Chiming in with what is probably obvious advice to most. You need to either
|
Yeah, in hindsight it seems obvious 😉 |
Issue found
I am posting this here to discuss the following behavior I found:
I have been running into issues when using VSG in a dynamic library, where the cast<> function on vsg::Object did return a nullptr even when the type was correct. This is caused by how clang/libc++ is generating type_info.hash_code() and the related type_info comparison operator.
For example, in a vsg::Visitor:
(Note: this also happens in another area where we are using vsg::Object::cast())
This always returned a nullptr even when the object was of type vsg::MatrixTransform. Using dynamic_cast<vsg::MatrixTransform*>(&obj) instead returned a pointer to the vsg::MatrixTransform.
I then logged the values for std::type_info in vsg::Inherit::is_compatible():
(Note: the type after "? is_compatible:" is from the type parameter.)
type_info.name() returned the same value (N3vsg15MatrixTransformE)
type_info.hash_code() returned different values even though the type (type_info.name()) was identical
The type_info.hash_code() is identical at the call site, but differs in the type's is_compatible() function. The call site is in a different dynamic library than VSG, which is linked as static library into a different dynamic library.
This seems to be an issue with clang/libc++ when using dynamic libraries. I have found discussions about this here and here.
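To make the failure mode concrete, here is a simplified, self-contained sketch of a typeid-based cast of the kind described above. It is an illustration under assumptions, not the VSG source: the sketch namespace and its classes are stand-ins, and only the names cast() and is_compatible() are taken from this report.

```cpp
#include <cassert>
#include <typeinfo>

namespace sketch
{
    struct Object
    {
        virtual ~Object() = default;

        // Root of the chain: matches only Object itself.
        virtual bool is_compatible(const std::type_info& type) const noexcept
        {
            return typeid(Object) == type;
        }

        // Succeeds only if some class in the hierarchy compares equal to typeid(T).
        template<class T>
        T* cast()
        {
            return is_compatible(typeid(T)) ? static_cast<T*>(this) : nullptr;
        }
    };

    // CRTP helper: check this level first, then recurse into the parent class.
    template<class ParentClass, class Subclass>
    struct Inherit : ParentClass
    {
        bool is_compatible(const std::type_info& type) const noexcept override
        {
            return typeid(Subclass) == type || ParentClass::is_compatible(type);
        }
    };

    struct Group : Inherit<Object, Group> {};
    struct MatrixTransform : Inherit<Group, MatrixTransform> {};
    struct Geometry : Inherit<Object, Geometry> {};

    // Within one binary the casts behave as expected.
    inline void example()
    {
        MatrixTransform mt;
        Object* obj = &mt;
        assert(obj->cast<MatrixTransform>() == &mt); // exact type
        assert(obj->cast<Group>() != nullptr);       // parent type
        assert(obj->cast<Geometry>() == nullptr);    // unrelated type
    }
}
```

Within a single binary all three checks pass. The behavior reported here is that when the two sides of the typeid comparison live in different modules under clang/libc++, the comparison can fail and cast() returns nullptr even though the object has the requested type.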
The issue seems to be present when dynamic libraries are loaded using RTLD_LOCAL. Symbol tables are then local to the library and the type_info.hash_code() for the same type is different. Also, the comparison operator on std::type_info returns false in this case.
Environment
The same code works correctly on Windows with MSVC!
Possible fixes?
Using a strcmp() with type_info.name()? The type_info.name() is working correctly in this case. This is the solution pybind was going for. It requires a strcmp(), which is computationally much more expensive than the current code, especially considering that in case of a type difference is_compatible() is called recursively for parent types.
Implement a type_hash<> template similar to type_name<> found in type_name.h that will guarantee to return an identical value for identical types?
something else?
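The first option could look something like the sketch below. The helper name same_type is hypothetical, and the approach assumes that type_info.name() yields identical strings for the same type in every module, which matches the logging above but is not something the C++ standard guarantees.

```cpp
#include <cassert>
#include <cstring>
#include <typeinfo>

// Hypothetical helper: try the cheap comparison first and fall back to
// comparing the implementation-defined type names.
inline bool same_type(const std::type_info& lhs, const std::type_info& rhs) noexcept
{
    return lhs == rhs || std::strcmp(lhs.name(), rhs.name()) == 0;
}

inline void example()
{
    assert(same_type(typeid(int), typeid(int)));
    assert(!same_type(typeid(int), typeid(long)));
}
```

Trying operator== first keeps the matching case cheap. The strcmp() runs only when the fast comparison fails, which unfortunately includes every mismatching step of the recursion into parent types.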
Conclusion
We already have two known places where this breaks our application on macOS (and possibly Linux). For now, a dynamic_cast<> instead of using vsg::Object::cast() is a working alternative. Comparing type_name() values also would work.
I fear that this behavior of clang/libc++ will cause more issues though.
Thanks!