Combining pre-compiled OSv kernel with pre-compiled executable #821

Open
nyh opened this issue Dec 22, 2016 · 7 comments

@nyh
Contributor

nyh commented Dec 22, 2016

OSv's bundled build system ("scripts/build") encourages building both the OSv kernel and the user's application using the same compiler and libraries (see also issues #743, #687, #619).

However, you often want to combine a compiled OSv kernel with a pre-compiled executable built by someone else. This is even more common with better OSv image composition tools like Capstan and Mikelangelo's improved version (https://github.com/mikelangelo-project/capstan). In that case, it is possible that the OSv kernel and the executable were compiled with different versions of the compiler, and/or different versions of libraries.

Currently, mixing OSv and application compiler and library versions runs into these kinds of problems:

  1. The application is compiled with one version of libstdc++, but OSv supplies the symbols of a different version.
  2. Same thing for the couple of Boost libraries which the OSv kernel uses (and are also included in the OSv kernel).
  3. The gcc compiler also assumes the gcc support library, which is also compiled into the kernel and used, for example, for C++ exception handling, so we may have problems if the gcc's major version doesn't match.

The goal of this issue is to do two things:

  1. Reproduce the above problems, and come up with techniques to overcome them. For example, perhaps we can hide certain symbols in the OSv kernel (see issue #97, "Be more selective on symbols exported from the kernel") that come from the libraries we don't want to export. Or perhaps we don't even need hiding: the application can include in the image a separate copy of that library, the version it wants to use, and our dynamic linker will "do the right thing" (in the application, prefer symbols from the application's included library, not the kernel).
  2. Document what can or can't be done in this area, especially if the attempts in the previous item cannot solve all problems.
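
The second idea in item 1, letting the application's own copy of a library win over the kernel's, comes down to symbol lookup order. The toy model below is only a sketch of that idea and not OSv's actual dynamic linker: `resolve`, the object names, and the symbol sets are all hypothetical and exist purely for illustration.

```python
from typing import FrozenSet, Iterable, Optional, Tuple

# Toy model of symbol lookup order; this is NOT OSv's real linker.
# Each "object" is a (name, exported_symbols) pair; every name here is hypothetical.
LinkObject = Tuple[str, FrozenSet[str]]


def resolve(symbol: str, search_order: Iterable[LinkObject]) -> Optional[str]:
    """Return the name of the first object in search_order that exports symbol."""
    for name, exports in search_order:
        if symbol in exports:
            return name
    return None


# The kernel exports a libc symbol plus an old-ABI std::string method.
kernel = ("kernel", frozenset({"malloc", "_ZNSs6appendEPKc"}))
# The application ships its own libstdc++ that exports the same C++ symbol.
app_libstdcxx = ("app:libstdc++.so.6", frozenset({"_ZNSs6appendEPKc"}))

kernel_first = [kernel, app_libstdcxx]  # kernel symbols shadow the app's copy
app_first = [app_libstdcxx, kernel]     # app's copy wins, kernel is the fallback
```

With `app_first`, the C++ symbol resolves to the application's library while `malloc` still falls through to the kernel, which is the behaviour the item hopes for.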
@miha-plesko

One thing to add here is that we should not interfere with one of the most awesome features of OSv:

You can compile your existing Linux application with its normal
build process, and run the resulting Linux executable on OSv.

I'm afraid that if we, for example, force the user to compile the application in a way that it also contains the symbols, this feature would be broken.

So, what you suggest is that we prepare a separate MPM package with the symbols, one package for each set of them, e.g.:

com.package.symbols.ubuntu-14.04
com.package.symbols.ubuntu-16.04
...

In the unikernel we then upload the appropriate package depending on what the user application is compiled for. This way we overcome the incompatibility between the user application and the OSv kernel, which is great.

However, the incompatibility between the user application and remote packages remains unresolved. I'm not sure, actually, if this compatibility is really that important. I guess that users will either grab packages that we've prepared and use them, or compile their application and use no pre-prepared package.

@nyh
Contributor Author

nyh commented Apr 23, 2017

https://groups.google.com/d/msg/osv-dev/zAkAilS446Q/mWsMmzXRCgAJ is yet another example of this problem: a person used a two-year-old version of OSv (from Capstan) and tried to run recently compiled software (which assumed a newer C++ ABI than was included in OSv), and encountered missing symbols in std::__cxx11::basic_string.

As mentioned in that thread by @avikivity one workaround is to recompile the application with -D_GLIBCXX_USE_CXX11_ABI=0.
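
Before recompiling, it can help to confirm that an executable really depends on the newer ABI. The sketch below assumes GNU `nm` is installed and relies on the fact that Itanium-mangled names built against the newer libstdc++ ABI contain `7__cxx11` (the `std::__cxx11` namespace) or the `B5cxx11` ABI tag. The helper names `cxx11_symbols` and `undefined_symbols` are made up for this example.

```python
import subprocess
from typing import Iterable, List

# Substrings found in mangled names that use the newer libstdc++ ABI:
# "7__cxx11" is the length-prefixed std::__cxx11 namespace,
# "B5cxx11" is the [abi:cxx11] tag.
CXX11_MARKERS = ("7__cxx11", "B5cxx11")


def cxx11_symbols(symbols: Iterable[str]) -> List[str]:
    """Return the symbols whose mangled names reference the cxx11 ABI."""
    return [s for s in symbols if any(marker in s for marker in CXX11_MARKERS)]


def undefined_symbols(path: str) -> List[str]:
    """List undefined dynamic symbols of an ELF file via GNU nm (assumed installed)."""
    out = subprocess.run(
        ["nm", "-D", "--undefined-only", path],
        check=True, capture_output=True, text=True,
    ).stdout
    # Each line ends with the symbol name, possibly followed by "@VERSION".
    return [line.split()[-1].split("@")[0] for line in out.splitlines() if line.strip()]


sample = [
    "_ZNSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEE9_M_createERmm",  # new ABI
    "_ZNSs6appendEPKc",                                                     # old ABI
]
needs_new_abi = cxx11_symbols(sample)
```

If `cxx11_symbols(undefined_symbols("./my_app"))` (with `./my_app` standing in for the real executable) returns a non-empty list and the OSv kernel predates the new ABI, the recompilation workaround above is likely needed.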

A completely different workaround might be to hide the C++ library inside OSv as an internal implementation (and only include the part of the library which OSv actually uses), and ask the application to provide its own copy of the C++ library which suits it. This would of course mean that there cannot be any OSv-specific C++ APIs used by the application, but this is normally fine (all of the Linux APIs are C-only and don't need C++).

@wkozaczuk
Collaborator

I just came across two incompatibility issues I wanted to describe (they may fall under what we described above, or they may be new ones).

  1. A mikelangelo capstan user tried to use the newest OSv kernel (loader.img) with MPM packages created in January 2018. He came across this error when trying to run OSv:
OSv v0.24-534-g54e6c42
could not load libvdso.so

[backtrace]
0x0000000000413268 <osv::application::prepare_argv(elf::program*)+1128>
0x0000000000414bbd <osv::application::application(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::allocator<std::__cxx11::basic_string<char, std::char_traits, std::allocator > > > const&, bool, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, std::__cxx11::basic_string<char, std::char_traits, std::allocator > > > > const*, std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, std::function<
0x00000000004169b8 <osv::application::run_and_join(std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, std::vector<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::allocator<std::__cxx11::basic_string<char, std::char_traits, std::allocator > > > const&, bool, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, std::__cxx11::basic_string<char, std::char_traits, std::allocator > > > > const*, waiter*, std::__cxx11::basic_string<char, std::char_traits, std::allocator > const&, std:
0x000000000040bd7b <osv::run(std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::vector<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::allocator<std::__cxx11::basic_string<char, std::char_traits, std::allocator > > >, int*, bool, std::unordered_map<std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::__cxx11::basic_string<char, std::char_traits, std::allocator >, std::hash<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::equal_to<std::__cxx11::basic_string<char, std::char_traits, std::allocator > >, std::allocator<std::pair<std::__cxx11::basic_string<char, std::char_traits, std::allocator > const, std::__cxx11::basic_string<char, std::char_traits, std::allocator > > > > const*)+107>
0x0000000000425033 <???+4345907>
0x00000000002148ad <do_main_thread(void*)+5821>
0x0000000000447725 <???+4486949>
0x00000000003e5d76 <thread_main_c+38>
0x0000000000389bf2 <???+3709938>
0x0150e3bad77aa6ff <???+-679827713>
0x00000000003e579f <???+4085663>
0xfb89485354415540 <???+1413567808>

This happened because the newest OSv kernel was trying to load the new tiny libvdso library, which was NOT part of the older osv-bootstrap.mpm package that normally contains the stuff from usr.manifest.skel.

  2. I myself was trying to test the new OSv kernel with existing mikelangelo packages (in anticipation of releasing OSv soon and what it would mean to the users), and after solving the issue above I came across this one:
java.so: Starting JVM app using: io/osv/nonisolated/RunNonIsolatedJvmApp
/java.so: failed looking up symbol _ZN3elf7program11get_libraryESsSt6vectorISsSaISsEE (elf::program::get_library(std::string, std::vector<std::string, std::allocator<std::string> >))

[backtrace]
0x000000000033e603 <elf::object::symbol(unsigned int)+227>
0x0000000000341229 <elf::object::resolve_pltgot(unsigned int)+137>
0x0000000000341415 <elf_resolve_pltgot+69>
0x00000000003888af <???+3705007>
0x0000000000c6b15f <???+13021535>
0x00000000003e5d76 <thread_main_c+38>
0x0000000000389bf2 <???+3709938>
0xe9201417c18320ff <???+-1048370945>
0x00000000003e579f <???+4085663>
0xfb89485354415540 <???+1413567808>

This one happened because osv.openjdk8-zulu-compact1.mpm from mikelangelo had an older java.so that attempted to use the OSv API function elf::program::get_library(), which has since changed to support Golang.
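
A failure like this could in principle be caught before boot by comparing the symbols a module imports against the symbols the kernel exports. The sketch below is a minimal illustration under that assumption: `missing_symbols` is a made-up helper, the old `get_library` symbol is taken from the log above, and the kernel export list is hypothetical.

```python
from typing import Iterable, List


def missing_symbols(required: Iterable[str], exported: Iterable[str]) -> List[str]:
    """Return, sorted, every required symbol that the exporter does not provide."""
    exported_set = set(exported)
    return sorted(s for s in required if s not in exported_set)


# The symbol from the log above, still imported by the older java.so.
OLD_GET_LIBRARY = "_ZN3elf7program11get_libraryESsSt6vectorISsSaISsEE"

# Hypothetical inputs: in practice these would come from inspecting
# java.so and the kernel image, for example with nm.
java_so_needs = [OLD_GET_LIBRARY, "malloc"]
new_kernel_exports = ["malloc", "free"]

unresolved = missing_symbols(java_so_needs, new_kernel_exports)
```

Here `unresolved` contains only the old `get_library` symbol, mirroring the "failed looking up symbol" message.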

I think these two issues suggest that every time we release OSv and publish the kernel binary on GitHub, we should also build and publish capstan packages that contain OSv apps/modules (java.so, httpserver-api.so, etc). This would require structuring capstan packages differently: for example, osv.openjdk8-zulu-compact1.mpm should not contain the zulu JDK and run-java artifacts together. Instead these should be separate; for example, java.so should be part of a separate MPM package.

So in general I postulate that the OSv-specific public API (not glibc, SYSCALL) should not be required to be backwards compatible. However, ideally OSv should be backwards compatible (and I think it is) as far as libc or SYSCALL are concerned.

Also I wonder if what @miha-plesko stated about 'build platform' is exactly accurate. Is it really Ubuntu 14 vs Ubuntu 16 or even Fedora 27? Or, to be more precise, is it about the version of the GCC compiler and libraries used to build the OSv kernel, its internal apps (look at java.so as an example), and user apps and the libraries they use? Does it really matter which version of GCC and which Linux distribution was used to build node.JS, python or ruby from the apps folder?

If it is the build tool chain that matters, then what makes up the "tool chain"?

I was reading more about shared libraries and I found the paragraph "Incompatible Libraries" from http://tldp.org/HOWTO/Program-Library-HOWTO/shared-libraries.html specifically applicable. In general, well-maintained libraries (for example the boost ones) should be backwards compatible. So if we always build against the newest ones, we should be good if any app was using them (the case of the 2 boost libraries that the OSv kernel is linked with and exposes).

Some other related questions that came to my mind:

  • Does the GCC compiler version really matter, or is it the C++ standard library version, which varies across GCC versions, that matters? If so, what is an example of it?
  • Should the libboost and libstdc++ libraries be hidden except from internal OSv apps (like java.so, httpserver, cloud-init, etc)?
  • How are these compatibility issues handled on Linux (what if an app uses an old version of a library that is not on the system)?
  • What is the meaning of the number (version?) in the library names; is it used/enforced by OSv? For example, elf.cc lists these with version numbers:
        "libresolv.so.2",
        "libc.so.6",
        "libm.so.6",
#ifdef __x86_64__
        "ld-linux-x86-64.so.2",
        "libboost_system.so.1.55.0",
        "libboost_program_options.so.1.55.0",
#endif /* __x86_64__ */
#ifdef __aarch64__
        "ld-linux-aarch64.so.1",
        "libboost_system-mt.so.1.55.0",
        "libboost_program_options-mt.so.1.55.0",
#endif /* __aarch64__ */
        "libpthread.so.0",
        "libdl.so.2",
        "librt.so.1",
        "libstdc++.so.6",
        "libaio.so.1",
        "libxenstore.so.3.0",
        "libcrypt.so.1",

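On the last question: on Linux the digits after `.so` are part of the library's soname, and by convention the first number changes when the ABI changes incompatibly. The dynamic linker matches the whole name as a string instead of comparing numbers. Whether OSv enforces anything beyond that string match is the open question here. As a rough sketch, a name can be split into its stem and version numbers like this (`parse_soname` is a hypothetical helper, not something from elf.cc):

```python
import re
from typing import Optional, Tuple

# "<stem>.so" optionally followed by dot-separated numeric version components.
_SONAME = re.compile(r"^(?P<stem>.+?)\.so(?:\.(?P<version>\d+(?:\.\d+)*))?$")


def parse_soname(soname: str) -> Optional[Tuple[str, Tuple[int, ...]]]:
    """Split a shared-library name into (stem, version numbers), or None if it is not one."""
    match = _SONAME.match(soname)
    if match is None:
        return None
    version = match.group("version")
    numbers = tuple(int(part) for part in version.split(".")) if version else ()
    return match.group("stem"), numbers


boost = parse_soname("libboost_system.so.1.55.0")
libc = parse_soname("libc.so.6")
```

For the entries above, `boost` is `("libboost_system", (1, 55, 0))` and `libc` is `("libc", (6,))`, so an application linked against a different Boost release would ask for a name that is not in this list at all.
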
@nyh
Contributor Author

nyh commented May 6, 2018 via email

@yieldone

yieldone commented May 7, 2018

Hi folks,

If I may chime in, I noticed @wkozaczuk mentioned something about usr.manifest.skel. In capstan-packages, the docker recipe for openjdk8-zulu-full (https://github.com/mikelangelo-project/capstan-packages/blob/master/docker_files/recipes/openjdk8-zulu-full/build.sh) contains the following:

${OSV_DIR}/scripts/build image=openjdk8-zulu-full export=all usrskel=none export_dir=$PACKAGE_RESULT_DIR -j ${CPU_COUNT}

I used this as my basis for preparing my base image. Does "usrskel=none" therefore ignore usr.manifest.skel?

My process is now as follows:

  1. gen osv-bootloader image, upload to S3
  2. gen jdk image, upload to S3
  3. gen osv.bootstrap, upload to S3
  4. refresh all the above images locally (pull)

If I include the usr.manifest.skel when I prepare the JDK image, perhaps I can avoid building osv.bootstrap explicitly?

In any case, the above process resolves all my issues thus far, really appreciated all the tips!

Cheers,

Rowland

@miha-plesko

@wkozaczuk it's most probably not an Ubuntu 14.04 vs 16.04 issue, but some underlying library version change between them, as you suggest. It's just that I never dived in but still wanted to make it as understandable to users as possible :) Also, I'm more of a high-level programmer (ruby, python) and would have a hard time actually figuring out the exact reason; gonna have to let you guys do this part 🙌 😇 .

@yieldone you can find some documentation on the "usrskel=none" argument in the OSv commit message here. Also, you can examine this commit on capstan-packages here to see how we modified recipes when the option was introduced.

BTW: I'm not surprised it works for you since you've recompiled everything you need. I'm afraid you'll have to stick with osv.bootstrap because Capstan always requires it (it's hard-coded). That being said, you could technically prepare an empty osv.bootstrap package (just remove all the files prior to zipping) and let your jdk package contain all the files you need. Although having it in a separate package sounds more reasonable to me. Here is a recipe to build it; you'll notice that we actually just build nothing (besides the OSv kernel itself) but export it all.

@wkozaczuk
Collaborator

For everybody interested I will be moving the discussion to the mailing list as it will be easier to ask/answer questions.
