DESTDIR support #5063

Closed
nalimilan opened this issue Dec 8, 2013 · 56 comments
Labels
building Build system, or building Julia or its dependencies

Comments

@nalimilan
Member

As I said in my mail to the list, this is one of the points that would make packaging easier. Currently we need a patch in particular (see the Debian one in [1]) so that we can specify SYSCONFDIR=/etc (the value when the package will be installed) and still get the files to be installed to PREFIX/../etc.

Since Julia is apparently completely relocatable, i.e. PREFIX is not hardcoded anywhere, we could live without DESTDIR. But it would still be nice since it would make things simpler: make install should install files to $DESTDIR/$PREFIX and $DESTDIR/$SYSCONFDIR [2]. This would be cleaner than the PREFIX/../etc solution.

Finally, libuv uses DESTDIR, so currently I need to pass both DESTDIR and PREFIX to make install.

1: http://ftp.de.debian.org/debian/pool/main/j/julia/julia_0.2.0+dfsg-5.debian.tar.gz, at debian/patches/sysconfdir-install.patch
2: http://www.gnu.org/prep/standards/html_node/DESTDIR.html
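
To make the DESTDIR convention requested above concrete, here is a minimal Python sketch of how a staged install location is derived. It is only an illustration: `staged_path` and the `/tmp/pkgroot` staging directory are hypothetical names, not part of Julia's Makefile. The point is that DESTDIR is prepended to the absolute final path, so PREFIX and SYSCONFDIR can keep the values they will have on the installed system.

```python
import posixpath


def staged_path(destdir: str, final_path: str) -> str:
    """Return where `make install` would write a file whose final home is final_path.

    Hypothetical helper: DESTDIR is simply prepended to the absolute final path.
    """
    if not posixpath.isabs(final_path):
        raise ValueError("final_path must be absolute, got %r" % final_path)
    # Drop trailing slashes from DESTDIR so the join never produces "//".
    return posixpath.normpath(destdir.rstrip("/") + final_path)


# PREFIX=/usr and SYSCONFDIR=/etc keep their final values; only DESTDIR differs.
print(staged_path("/tmp/pkgroot", "/usr/bin/julia"))
print(staged_path("/tmp/pkgroot", "/etc/julia"))
# With an empty DESTDIR the files go straight to their final location.
print(staged_path("", "/usr/lib64/julia"))
```

Stripping the trailing slash before joining also avoids the doubled `/` that can appear when a directory variable is concatenated with an absolute path.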

@ivarne
Member

ivarne commented Dec 8, 2013

Link to newsgroup discussion: https://groups.google.com/forum/#!topic/julia-dev/WQ-Duwlo6gg

@nalimilan
Member Author

@pao
Member

pao commented Dec 12, 2013

@nalimilan looks like that's been submitted as #5114.

@ViralBShah
Member

Yes, I merged that one a couple of days back. Would be great to hear if it does indeed work out for you.

@nalimilan
Member Author

Actually I need something more: LIBDIR should be an absolute path instead of being added to PREFIX. Cf. JuliaMath/openlibm#32 (comment). I have something mostly working here, I'll make a PR soon.

@nalimilan nalimilan reopened this Dec 28, 2013
@ViralBShah
Member

@nalimilan I gave you commit access to openlibm and openspecfun. It should be easier to create PRs with branches rather than forks, that way.

@nalimilan
Member Author

OK, please have a look at this branch: https://github.com/nalimilan/julia/compare/libdir

I'd like to know what you think of this solution. The idea is that since LIBDIR and SYSCONFDIR (and other similar variables if they were added) are traditionally defined as absolute paths, i.e. not relative to PREFIX, and since Julia needs rpath to point to LIBDIR both in-tree and after final installation, the only solution is to have the same directory layout in BUILD as in the final installation prefix. The remaining change that is missing from my branch is to change $BUILD/bin, etc., to $BUILD/$PREFIX/bin, etc. (If you like, I can also change them to BINDIR, etc., which is shorter and I think cleaner too.)

This is not a final proposal, as there are many areas to check and I wouldn't want to spend time on it if you think that's not the right way.

FWIW, I've found out that Rust has been in the very same situation a few months ago:
rust-lang/rust#5219

CC: @staticfloat

@staticfloat
Member

Without testing it, this seems like a fine approach. I will test on OSX once you think you have something testable.

@nalimilan
Member Author

@staticfloat Please give it a try then, I think it should mostly work.

There is still something weird which forced me to change $(BUILD)/$(JL_LIBDIR) to $(BUILD)$(JL_LIBDIR), which may mean you'll have to make LIBDIR start with /. Looks like make is confused by // and does not find the target.

I've also tried to remove as much special-casing for Windows/Mac as possible, considering that we had better set the *DIR variables once and for all in Make.inc. It's possible that I broke something there.

@staticfloat
Member

Some comments:

  • I don't remember where you said this, but you're right, we should set $(libdir), not $(LIBDIR). Feel free to add this change onto your branch, we should get this right the first time. :)
  • If the locations of libraries change, we have to reconfigure/rebuild pretty much everything in deps, which is really unfortunate. Can we keep BUILD = $(JULIAHOME)/usr? That way people who already have the deps built don't have to rebuild them all just to change install paths.
  • I don't have a windows build machine, but I believe that bindir == libdir on windows, as all the .dll's and .exe's get grouped together, so your changes here have likely broken that; that is the reason why we compared JL_LIBDIR against both lib and bin. Although that check is likely meaningless now since we have absolute paths in JL_LIBDIR now. We should just scrap this logic and force $(DIRS) to be unique. I'm pretty sure this logic is just here to save ourselves from trying to create the same folder multiple times anyway.
  • It looks like you've got some commits trailing along for the ride, namely 0c81ac5 and 5098c15. Not sure what that's about.
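
The "force $(DIRS) to be unique" idea from the third point can be illustrated with a small Python sketch. The directory names below are hypothetical; the only claim is that removing duplicates makes the lib-versus-bin comparison unnecessary when bindir and libdir coincide, as on Windows. In a makefile, GNU make's `$(sort ...)` function removes duplicate words as well, at the cost of reordering them.

```python
def unique_dirs(dirs):
    """Drop repeated directories while keeping first-seen order."""
    # dict preserves insertion order, so this is an order-preserving dedupe.
    return list(dict.fromkeys(dirs))


# Hypothetical Windows-style layout where libdir is the same folder as bindir.
bindir = "usr/bin"
libdir = "usr/bin"
dirs = unique_dirs([bindir, libdir, "usr/share", "usr/etc"])
print(dirs)
```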

Once you've fixed that I'll continue testing. I just don't think we can merge something that requires a make -C deps distclean, unfortunately.

@nalimilan
Member Author

Thanks for the testing and comments, I'll try to fix these.

One issue that is more difficult is your second point. Indeed it is silly to have to rebuild things only because some install paths change. GNU make's documentation warns against this:

Running ‘make install’ with a different value of prefix from the one used to build the program should not recompile the program.

In principle I very much agree with this. But in practice, this does not play well with the fact that Julia embeds rpaths, and that LIBDIR is not forced to point to a subdirectory of PREFIX. libtool seems to handle this kind of case but it would be a big change to the build process.

That said, we should be able to much reduce the problem: in theory, only the Julia executables would need to be rebuilt when LIBDIR or PREFIX change. It seems to me that most projects tend to save the results of the build process in the directory where the sources live, more or less. We could do the same: stop moving everything to BUILD, which at this point is almost identical to DESTDIR; and require people to run make install DESTDIR=usr to run Julia in-tree. make install would only need to rebuild the executables.

@nalimilan
Member Author

What do people think about this scenario? This would be quite a pervasive change.

(I realize it would have been much easier to go on assuming that LIBDIR is always relative to PREFIX, since anyway in practice this is almost always the case. Maybe it would be better to go back to that, even if I feel I wasted some time working on that... Sad to see that there's no standard solution for such a common use case.)

@nalimilan
Member Author

Note that another solution would be to set rpath to point only to the final destination, and to rely on LD_LIBRARY_PATH to point to the library directory in the build tree. AFAIK this is the solution used by many projects. julia could be a simple script running the actual executable with the correct library path (instead of a symlink as currently).

@staticfloat
Member

we should be able to much reduce the problem: in theory, only the Julia executables would need to be rebuilt when LIBDIR or PREFIX change.

It's even better than that; we use (or at least, we should be using) relative RPATHs wherever we can. That means that only changes to $(libdir) that modify the relative position of libraries to the Julia executable would require a re-link. (Which is possible to do in one command via install_name_tool on OSX, and patchelf on Linux. That was one of my first pull requests to Julia almost two years ago, funnily enough)
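
As a rough illustration of why relative RPATHs make most path changes harmless, the Python sketch below computes an RPATH relative to the executable's directory. It assumes the ELF `$ORIGIN` token understood by the Linux dynamic linker; the thread does not say exactly which form Julia's Makefile emits, and `origin_rpath` and the example directories are hypothetical.

```python
import posixpath


def origin_rpath(bindir: str, libdir: str) -> str:
    """Return an RPATH for libdir expressed relative to the executable's directory."""
    rel = posixpath.relpath(libdir, start=bindir)
    return "$ORIGIN" if rel == "." else "$ORIGIN/" + rel


# In-tree build and a plain install share the same relative layout ...
in_tree = origin_rpath("/home/user/julia/usr/bin", "/home/user/julia/usr/lib")
installed = origin_rpath("/usr/bin", "/usr/lib")
# ... but a distribution that uses lib64 changes the relative position.
lib64 = origin_rpath("/usr/bin", "/usr/lib64")

print(in_tree, installed, lib64)
print("re-link needed for plain install:", in_tree != installed)
print("re-link needed for lib64 install:", in_tree != lib64)
```

Only the last case changes the relative position of the libraries, which is the situation where the RPATH has to be rewritten with a tool such as patchelf or install_name_tool.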

In practice, only distributors/packagers are going to be messing with $(libdir), so I wouldn't worry about having to rebuild the Julia code itself. That's only a minute or two of time, whereas rebuilding deps/ can be closer to an hour.

another solution would be to set rpath to point only to the final destination, and to rely on LD_LIBRARY_PATH to point to the library directory in the build tree

Unfortunately, this becomes a little hostile to development, and the userbase that doesn't currently use packages to get their latest Julia often has their julia source just plugged into their $PATH so that they can run julia from $(BUILD), rather than installing. I don't think this solution will work, and it is one of the big reasons why I wrote the RPATH-changing code previously. But with relative RPATHs, you shouldn't need any of this.

@nalimilan
Member Author

I think the new version should address your concerns. By default BUILD is now back to usr/, and PREFIX is empty so that the tree remains stable. The tradition is to use /usr/local as a prefix, but for now it will work well enough -- anyway people installing need to wonder where it will go. This could be changed for tarballs, at least.

Regarding changes in libdir or prefix that would or would not require a rebuild, I don't know how I could detect whether the path of libdir relative to bindir has changed -- or at least not how to make make aware of this. As you said, this is not a big deal anyway, since only distributors will set this, and they (I) don't build many dependencies and rebuild everything from scratch each time.

@staticfloat
Member

The tradition is to use /usr/local as a prefix, but for now it will work well enough -- anyway people installing need to wonder where it will go.

Our prefix defaults to julia-$(JULIA_COMMIT), so this is nothing to worry about; we don't default to /usr/local anyway.

@nalimilan
Member Author

Cool. Tell me when you have had the chance of testing this.

@nalimilan
Member Author

@staticfloat Your https://github.com/JuliaLang/julia/commits/sf/libdir branch looks fine AFAICT (I've not tried to understand everything), and it builds fine here, but it triggers a crash in the RPM debugging symbols extraction tool (https://bugzilla.redhat.com/show_bug.cgi?id=1049839). I hope people there will help sort this out.

Also, the TravisCI checks fail quite early (which doesn't seem related).

@staticfloat
Member

If I had to guess, I'd say the RPM debugging symbols extraction tool failure is my fault. I do something that's a little shady in this process, and it's exactly this kind of testing that is important in making sure it doesn't break anything (which apparently it does).

The path to the system image file is hardcoded into the binary. (See this C file and this Makefile) This has to happen because we have no idea where that file is actually located. We can't store it in some other configuration file because the only place we could store that configuration file would be right next to the binary in $(bindir), which is kind of unacceptable.

I thought it would be neat to embed this information into Julia via RPATHs: we already encode the location of $(private_libdir) via RPATHs, so why not use that information inside Julia itself to find the system image? Unfortunately, Windows doesn't have RPATHs, so we'd still need to hardcode the knowledge somehow. In the end, I decided the best option was to binary patch julia: ensure that the string stored in julia has enough space to grow sufficiently, and write the new "post-install" path to the system image into the julia binary, much as I have now decided to change the RPATHs when we install as well.

This seems to work pretty well, but unfortunately it seems to screw something up in your process. Perhaps there is a simpler/better solution. I will sleep on it. :)

@nalimilan
Member Author

It would be interesting to check whether something may be wrong in the way you set the string in julia. Maybe there's not enough space and you're overwriting something important? Maybe there's no ending \0? Testing a few different things could help (for example, write a hardcoded short string and see whether it works).
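
The two failure modes raised here (not enough reserved space, and a missing terminating \0) can be made explicit in a small Python sketch of in-place string patching. This is not the actual stringpatch tool, whose source is not shown in this thread; `patch_slot`, the 32-byte capacity, and the fake image are all assumptions made for illustration.

```python
def patch_slot(image: bytes, offset: int, capacity: int, new_path: str) -> bytes:
    """Overwrite a fixed-size, NUL-padded string slot without changing the file size."""
    encoded = new_path.encode("utf-8")
    # Reserve one byte for the terminating NUL.
    if len(encoded) + 1 > capacity:
        raise ValueError(
            "path needs %d bytes, slot holds %d" % (len(encoded) + 1, capacity)
        )
    if offset < 0 or offset + capacity > len(image):
        raise ValueError("slot lies outside the image")
    # Pad with NULs so stale bytes from a longer old path cannot survive.
    slot = encoded + b"\0" * (capacity - len(encoded))
    return image[:offset] + slot + image[offset + capacity:]


# Fake "binary": 4 header bytes, a 32-byte slot, then unrelated data.
CAPACITY = 32
old = b"../lib/julia/sys.ji"
image = b"HEAD" + old + b"\0" * (CAPACITY - len(old)) + b"TAIL"

patched = patch_slot(image, 4, CAPACITY, "../lib64/julia/sys.ji")
print(len(patched) == len(image))
print(patched[4:4 + CAPACITY].rstrip(b"\0"))
```

Refusing to write when the new path does not fit is what keeps the patch from overwriting whatever follows the slot.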

@nalimilan
Member Author

Ah, and looking at the output of strings julia-basic, I've found this: ../lib/julia/sys.ji. It should be ../lib64/julia/sys.ji, so maybe that's the problem (doesn't look like it would trigger a crash, but who knows...).

Also, on Linux you may simply rely on the RPATH, which is more standard, and only use the hack on Windows.

@staticfloat
Member

@nalimilan I think I may have found the problem. I was accidentally re-using the string offset across julia executables. Now I correctly calculate the string offset for each executable individually, which should help a lot. This also explains why julia-basic had the incorrect path above; it was replacing the wrong part of the file. Please retry with the latest commit pushed and let me know if problems persist.
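
To show why the offset has to be computed per executable, here is a hedged Python sketch. The two byte strings are fake stand-ins for julia-basic and julia-readline, and `find_path_offset` is a hypothetical helper, not code from the Makefile; it only mimics the idea of locating the NUL-terminated string that ends in sys.ji.

```python
def find_path_offset(image: bytes, suffix: bytes = b"sys.ji") -> int:
    """Return the start offset of the single NUL-terminated string ending in suffix."""
    needle = suffix + b"\0"
    end = image.find(needle)
    if end == -1:
        raise ValueError("no string ending in %r found" % suffix)
    if image.find(needle, end + 1) != -1:
        raise ValueError("more than one candidate string, refusing to guess")
    # The string starts right after the previous NUL (or at offset 0).
    return image.rfind(b"\0", 0, end) + 1


# Fake executables: same embedded path, different amounts of preceding data.
julia_basic = b"\x7fELF" + b"\0" * 12 + b"../lib/julia/sys.ji\0" + b"rest"
julia_readline = b"\x7fELF" + b"\0" * 40 + b"../lib/julia/sys.ji\0" + b"rest"

for name, image in [("julia-basic", julia_basic), ("julia-readline", julia_readline)]:
    print(name, hex(find_path_offset(image)))
```

Reusing the first offset for the second image would write the new path into unrelated bytes, which matches the symptom described above.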

I want to try and keep the build as similar across platforms as possible. If need be I will special-case, but I'd like to avoid that as much as I can.

@nalimilan
Member Author

Great, now the crash is gone!

But I get an error when starting julia (after installing on the system -- tests from the build directory work fine):

$ /usr/bin/julia
/usr/bin/julia: relocation error: /usr/bin/julia: symbol , version GLIBC_2.2.5 not defined in file libc.so.6 with link time reference

Running strings on the binary, I can find this string:

GLIBC_2.2.5
GLIBC_2.4
GLIBC_2.3
GLIBC_2.3.4

@staticfloat
Member

Where is the binary being built? On your local machine? This looks like a libc mismatch error to me. If we don't do the string patching (e.g. if you comment out these lines in Makefile) and rebuild does the error go away? Note that for Julia to start up properly in that case you will need to have $(libdir_rel) == $(build_libdir_rel) and $(private_libdir_rel) == $(build_private_libdir_rel). I'm not sure if that's easy for you to set in your RPM build process, but if it's not easy, the best way to fake it is to just modify Make.inc to change $(build_libdir) and $(build_private_libdir) on these lines. If those variables are equal, that means that we don't need to do any patching when installing the binary to its eventual home, and stringpatch isn't even called.

@nalimilan
Member Author

Actually, only the last commit introduces the bug. Looks like the call to awk does not work fine. Here's what I get when calling it on the binary in the build tree:

$ strings -t x - BUILD/julia-sf-libdir/usr/bin/julia-basic | grep "sys.ji"
   4220 ../lib/julia/sys.ji
$ strings -t x - BUILD/julia-sf-libdir/usr/bin/julia-basic | grep "sys.ji$" | awk '{print $1;}'
4220

So AFAICT this is correct, but maybe there's a problem in the Makefile. I'll let you find the solution. ;-)

@staticfloat
Member

Can you give me the command you're running in order to create this bug? Everything works fine on my ubuntu machines (as far as I can tell!) so I'd like to reproduce your bug locally, if possible.

@nalimilan
Member Author

I've only tried this when building my RPM, but here are the commands:

%global commonopts USE_SYSTEM_LLVM=1 USE_SYSTEM_LIBUNWIND=1 USE_SYSTEM_READLINE=1 USE_SYSTEM_PCRE=1 USE_SYSTEM_OPENSPECFUN=1 USE_SYSTEM_LIBM=0 USE_SYSTEM_OPENLIBM=1 USE_SYSTEM_BLAS=1 USE_SYSTEM_LAPACK=1 USE_SYSTEM_FFTW=1 USE_SYSTEM_GMP=1 USE_SYSTEM_MPFR=1 USE_SYSTEM_ARPACK=1 USE_SYSTEM_SUITESPARSE=1 USE_SYSTEM_ZLIB=1 USE_SYSTEM_GRISU=1 USE_SYSTEM_RMATH=1 USE_SYSTEM_LIBUV=0 USE_LLVM_SHLIB=1 LIBBLASNAME=libopenblas.so VERBOSE=1 USE_BLAS64=0 prefix=%{_prefix} bindir=%{_bindir} libdir=%{_libdir} libexecdir=%{_libexecdir} datarootdir=%{_datarootdir} includedir=%{_includedir} sysconfdir=%{_sysconfdir}
make %{?_smp_mflags} CFLAGS="%{optflags}" CXXFLAGS="%{optflags}" FFLAGS="%{optflags}" %commonopts
make %commonopts DESTDIR=%{buildroot} install

prefix is /usr and libdir is /usr/lib64; buildroot points somewhat confusingly (RPM terminology) to an empty folder.

@staticfloat
Member

I've tried, and I cannot reproduce this. :(

Can you send me an RPM that, when installed, crashes? Or are you not to the point of creating an RPM yet?

@nalimilan
Member Author

I have put the RPM here: http://nalimilan.perso.neuf.fr/transfert/julia-base-0.2.0-2.fc20.x86_64.rpm

FWIW, CFLAGS/CXXFLAGS/FFLAGS are all equal to this when I build it:

-O2 -g -pipe -Wall -Wp,-D_FORTIFY_SOURCE=2 -fexceptions -fstack-protector-strong --param=ssp-buffer-size=4 -grecord-gcc-switches  -m64 -mtune=generic

I have checked that the problem also happens with the julia executable installed in DESTDIR, i.e. before rpmbuild does anything special on it (like stripping debugging symbols).

@staticfloat
Member

Well, something is definitely going wrong. What gcc are you using? (What version)

@nalimilan
Member Author

gcc (GCC) 4.8.2 20131212 (Red Hat 4.8.2-7)

@nalimilan
Member Author

Actually, only julia-readline is affected by the bug. julia-basic works fine. Even with USE_SYSTEM_READLINE=0, the situation is the same. Any idea?

@nalimilan
Member Author

Also, using strings, I still see a /../lib/julia/ string, which doesn't make any sense to me (leading slash, lib instead of lib64). Probably not related as it's also present in julia-basic.

@staticfloat
Member

I've downloaded a virtualbox image of fedora 20, do you have scripts or something that you use to generate your Julia install? Can I use them to generate the RPM in the same way that you do?

@nalimilan
Member Author

Hm, that's a little painful. You need to install RPM packages for double-conversion, openlibm and openspecfun from here: http://nalimilan.perso.neuf.fr/transfert/julia-libdir/

Then run rpmdev-setuptree, which will create ~/rpmbuild, and save julia-sf-libdir.tar.gz, dSFMT-src-2.2.tar.gz and libuv-0.10-0bae155.tar.gz into ~/rpmbuild/SOURCES/. Save julia.spec into ~/rpmbuild/SPECS/.

At this point you're almost done:
rpmbuild -ba ~/rpmbuild/SPECS/julia.spec will create the RPM I gave you.
rpmbuild -bi ~/rpmbuild/SPECS/julia.spec will build Julia and install it with DESTDIR=~/rpmbuild/BUILDROOT/julia-0.2.0-2.fc20.x86_64/, without making an RPM package.

For a few more details:
https://fedoraproject.org/wiki/How_to_create_an_RPM_package#The_basics_of_building_RPM_packages

@nalimilan
Member Author

Ah, and since you'll need development tools, the short way of installing them is yum install @development-tools and yum install fedora-packager.

@staticfloat
Member

Also, using strings, I still see a /../lib/julia/ string, which doesn't make any sense to me (leading slash, lib instead of lib64). Probably not related as it's also present in julia-basic

You're right, it's not related. It's because of this line; it's the default search path, we don't need to be changing that around.

@staticfloat
Member

Hmmm, quick question. I notice that you're depending on libRmath-static and using USE_SYSTEM_RMATH=1, is that a julia-specific library? If not, it lacks some stuff that julia needs. See this issue for the effort to move the code we've put into our copy of librmath into Julia, rather than being in the C code.

@nalimilan
Member Author

Regarding Rmath, I know I should use the Julia-specific version, but for now, using system packages makes building the package faster for testing. I guess it's not related to our current problem either.

@staticfloat
Member

You're right, it's not related. I can reproduce locally, I will try to track down what is going on. Thanks for all your patience.

@nalimilan
Member Author

Cool!

@staticfloat
Member

You know someone actually understands software engineering when they say "Cool!" in response to you saying "I can reliably make it crash now". :)

@staticfloat
Member

Alright. I've managed to track down why this is happening. Funnily enough, it is NOT because of my stringpatch tool. It's actually a bug in either strip or patchelf.

The problem is that when we use patchelf to modify the RPATH of the binary, strip then prints out a warning:

$ strip -g julia
BFD: stEIbiw8: warning: allocated section `.dynsym' not in segment

This doesn't happen when we don't use patchelf, and indeed this warning is only ever printed out when the resultant file complains about missing symbols. This is a known issue with patchelf, and I may have to take some time to really understand why this happens at all (it seems to me like patchelf might not be updating a field it needs to). The short-term solution is to disable stripping in the RPM by putting:

%define debug_package %{nil}
%global __os_install_post %{nil}

at the top of julia.spec. Of course, another possible solution is to pass build_libdir=/usr/lib64 so that patchelf is never called, but I don't like that solution quite so much because it relies on some internal structure that I'd like to not have to worry about when packaging.

@StefanKarpinski
Member

Yikes. I'm impressed that you tracked this down.

@staticfloat
Member

Well thank you, Stefan. Working on Julia has taught me more about dynamic linking, executable file formats, and plenty of other strange topics that I wouldn't otherwise have had reason to learn about. Speaking of which, I think I've found the problem with patchelf; it's moving the .dynsym section to the front, which confuses tools like strip which don't expect the first program header to be zero.

@StefanKarpinski
Member

The direction this is heading is clearly that we're going to have our own linker.

@staticfloat
Member

The direction this is heading is clearly that we're going to have our own linker.

Orz.

I've opened an issue on the patchelf github repo. Hopefully this will be a simple fix, and we can just download the latest patchelf for Julia once it gets a new release.

@nalimilan
Member Author

Wow. This one was deep in the stack... Looks like nobody before you has ever tried to write a program which would at the same time use RPATH, be runnable from the source tree and be installable to a different prefix without rebuilding everything.

Regarding practical issues, even if patchelf is fixed soon, I'd rather find a workaround to build the RPM and the debugging package, so that we don't have to wait for the fix to enter Fedora. Unfortunately, passing build_libdir=/usr/lib64 is not possible since we don't have the rights to write in that folder. IIRC, your goal was to avoid modifying RPATH if the relative paths do not change: so wouldn't it be enough to set build=$(BUILD) build_libdir=$(BUILD)/usr/lib64 build_bindir=$(BUILD)/usr/bin?
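
The reasoning behind this workaround can be checked with a short Python sketch: if the build tree mirrors the final layout, the relative path from bindir to libdir is identical before and after installation, so no RPATH rewrite should be needed. `needs_rpath_patch` and the `/builddir/julia` location are hypothetical, and the sketch only compares libdir, whereas the condition stated earlier in the thread also covers private_libdir.

```python
import posixpath


def needs_rpath_patch(build_bindir: str, build_libdir: str,
                      bindir: str, libdir: str) -> bool:
    """True when libdir's position relative to bindir differs between build and install."""
    build_rel = posixpath.relpath(build_libdir, start=build_bindir)
    final_rel = posixpath.relpath(libdir, start=bindir)
    return build_rel != final_rel


BUILD = "/builddir/julia"  # hypothetical stand-in for $(BUILD)

# In-tree lib/ but installed lib64/: relative paths differ, patching is required.
default_layout = needs_rpath_patch(
    BUILD + "/usr/bin", BUILD + "/usr/lib", "/usr/bin", "/usr/lib64")

# Proposed workaround: the build tree already uses usr/lib64.
workaround = needs_rpath_patch(
    BUILD + "/usr/bin", BUILD + "/usr/lib64", "/usr/bin", "/usr/lib64")

print("default layout needs patching:", default_layout)
print("workaround needs patching:", workaround)
```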

@staticfloat
Member

Sorry, yes that's exactly what I meant.
@nalimilan
Member Author

Hey, it works! ;-)

So as far as I'm concerned, this branch is OK. The bug should only affect people modifying libdir, so it may be good to merge even in this state.

@staticfloat
Member

Alright. I will rebase, do a final check pass and open a PR.

@nalimilan
Member Author

I've seen a5c253a. Thanks for doing the cleaning! Don't you think it would be useful to give a few more details and a link to this issue?

@staticfloat
Member

Hahaha, I'm still working on it, hence the asd commits. I will open a PR with plenty of detail once it's ready.

@nalimilan
Member Author

OK.

@staticfloat
Member

Status update: almost there! I've gotten the windows cross-compile finishing nicely, but make testall is failing with some weird errors. Once I have Windows, OSX, and Linux builds all working properly on this patch I will submit a PR.

All these changes are really nice, I think this overhaul has been a long time coming. Thank you for doing this initial work, @nalimilan! A nice byproduct of this effort is that some of our Windows special cases are now handled automatically simply by setting $(build_libdir) = $(build_bindir), etc. (because on Windows, .dll files live next to .exe files).

@vtjnash vtjnash closed this as completed Apr 25, 2014
@vtjnash
Member

vtjnash commented Apr 25, 2014

Our install target is now much more compliant with expectations.


7 participants