NodeJS native modules made difficult by distribution packages. #21897
Comments
/cc @nodejs/tsc. IMO, we need to have stronger guidelines about ABI compatibility for vendors and distributors shipping Node.js. If something advertises itself as "Node.js 8.11.1" it should be compatible with the official Node.js 8.11.1. |
@ofrobots I remember talking about this in a TSC meeting before the Node 10 release (with @rvagg I think)… we knew this was coming. :/ In retrospect, maybe we should have had different |
@addaleax mangling the I wouldn't be looking forward to this :-) |
OpenSSL is probably the main suspect where the ABI between different distributors could end up differing; but there could be other environmental differences, e.g. choice of the C++ compiler, C++ std library or build flags that make a non-official release potentially incompatible with official releases. I think having different I guess my question is: Is it a problem that a distributor ships an ABI incompatible version of Node that uses the same My personal opinion is that ABI incompatibility fragments the ecosystem and pushes the cost of figuring out compatibility to our users. We should document clear guidelines about ABI compatibility for distributors. Native module authors now also have to worry about shipping multiple versions of the binaries for the same Node version, as @nicolasnoble points out. If it is not compatible with Node, it is not Node. |
I discovered another, fairly edge case package: termux's nodejs package for Android. |
cc @nodejs/delivery-channels |
Open-source distributions implicitly assume users of Node will always compile addons and not install untrusted binaries when it can be avoided. Maybe checking for NODE_MODULE_VERSION is not enough, and the shared libraries against which Node has been compiled by the distributor should also be taken into account? |
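To make the suggestion above concrete, here is a minimal sketch of a compatibility check that looks at the OpenSSL series in addition to NODE_MODULE_VERSION. It assumes the two values are read from `process.versions.modules` and `process.versions.openssl`; the names `RuntimeVersions`, `opensslSeries`, `abiKey` and `canLoadPrebuilt` are hypothetical, and treating `major.minor` as a proxy for OpenSSL ABI compatibility is an assumption, not an established Node.js rule.

```typescript
// Shape of the two fields we care about; both exist on Node's process.versions.
interface RuntimeVersions {
  modules: string; // NODE_MODULE_VERSION, e.g. "57" for Node.js 8
  openssl: string; // e.g. "1.0.2o" or "1.1.0h" (illustrative values)
}

// Assumption: the first two numeric components identify the OpenSSL ABI series.
function opensslSeries(version: string): string {
  const match = /^(\d+)\.(\d+)/.exec(version);
  if (match === null) {
    throw new Error("unrecognized OpenSSL version: " + version);
  }
  return match[1] + "." + match[2];
}

// Hypothetical key combining the module version with the OpenSSL series.
function abiKey(v: RuntimeVersions): string {
  return "modules-" + v.modules + "_openssl-" + opensslSeries(v.openssl);
}

// A prebuilt binary is only considered loadable when both parts of the key match.
function canLoadPrebuilt(builtAgainst: RuntimeVersions, host: RuntimeVersions): boolean {
  return abiKey(builtAgainst) === abiKey(host);
}
```

With a check like this, a binary built against a runtime reporting modules `57` and OpenSSL `1.0.x` would be rejected on a runtime reporting modules `57` and OpenSSL `1.1.x`, and the installer could fall back to compiling from source. The same idea would extend to the other shared dependencies by adding more fields to the key.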
@nicolasnoble my point was to make sure that whatever decision is made for openssl also goes for uv, zlib, c-ares, nghttp2, http-parser, icu, should multiple versions of one of these deps be supported. |
Right, which would in effect make prebuilt binaries impossible to ship for anything but the official runtimes available on the website. This is an important trade-off to consider. |
Maybe N-API is the solution, but I'm not even sure of that. |
I don't think that N-API does anything about transitive dependencies such as openssl or libuv, no. |
It is really important to emphasize the fact that this kind of problem is a general problem of open-source distributions, not particularly related to Node. There is no quick fix and the situation with Node ABI is not as simple as it looks.
|
IMO this is a serious issue and is best addressed by NodeJS 8.x being built against OpenSSL 1.0.2 downstream (i.e. in the distros). I filed a bug against the Debian package, as the version in Ubuntu is unmodified from the Debian package. But I think this issue will be addressed in Debian by upgrading Node to 10.x for the next release (buster). Ubuntu is probably going to need to patch Bionic (18.04 LTS).
For what it's worth, I disagree with this, and I don't think most developers view natively compiled binaries from e.g. npm as untrusted. Having worked on ensuring compatibility of native CPython dependencies with PyPA, I can point to how we handle this kind of ABI compatibility: PEP-513, PEP-571. One of the key things I'll note is that we don't even consider the Python ABI to be compatible/stable; we force native extensions to statically link the target ABI of their choice. The only external symbols we approve are GCC/GXX and a few other core system shared objects, with very old versions to ensure compatibility. Given that this issue stems from an OpenSSL dependency in particular, I'd like to request that NodeJS consider making a breaking change to stop exporting those symbols as part of the Node ABI. I think it is better to distribute these symbols as a versioned, native extension, similar to how PyCA's cryptography package works. |
Even compiling addons from source has issues since the official headers package for Node.js ships openssl headers for the version the official binary was compiled with, see nodejs/node-gyp#1415. |
Note that Ubuntu / Debian's current stance seems to be that they are unwilling to correct this for the time being: https://bugs.launchpad.net/ubuntu/+source/nodejs/+bug/1779863 |
I'm hoping Ubuntu release team changes their mind on this given the additional information from @richardlau. Debian is just understaffed and will eventually fix this when node 10.x gets uploaded. The main issue there is that we want to ship the next release (buster) without openssl 1.0, and so switching 8.x back to depending on it for compatibility is essentially working backwards. It's unfortunate this bug wasn't caught prior to the Ubuntu release :( |
And I in turn disagree with you, which is why I called this bug a feature if it blocks people from using pre-compiled binaries instead of compiling from source. But if you ask me I think the whole NodeJS environment is not trustworthy anyway (I liked this story a lot for instance). But at least building from source is vaguely more trustworthy than using precompiled binaries. And in the same vein, if people trust your precompiled binaries, then they should just use the precompiled nodejs binaries too rather than the distros' ones. ;) Now if you prove to me that we indeed cannot even build affected modules from source using our node-gyp package, I’ll consider switching back to openssl-1.0, but I don’t think we are actually affected because we had switched before, then reverted because of that precise issue. Of course we are talking about the main version here, not the LTS one, but we’re using the same node-gyp and npm, so… Especially here, OpenSSL 1.0 vs 1.1 means that only suspicious ECC curves are available, because Ed25519 was brought by 1.1. I’m not sure, but I believe at least one case that I know of would be broken if switching back to 1.0. |
@ArchangeGabriel the problem really is that packages with prebuilt binaries will (should?) provide a way to always fall back on compiling from sources, but your distribution isn't flagging this in the runtime. While I don't necessarily adhere to it, I actually appreciate the will to always compile from source. In order to achieve this with arch's nodejs, you simply have to change the runtime name so as to not masquerade as the official nodejs binary, and this way all packages will always have to fall back to compiling from sources. |
Should we just advertise a different (and unique to Arch) node module version for that to work or is there something else to do? Also I guess there is no way to do this without breaking modules that were actually compiled against our binary, right? |
This is incredibly paternalistic. It's not up to you to decide for your users what their threat models are.
Building from source is hard, which is the whole reason distros and binary package registries exist. Telling end developers to just build from source in the general case is not a solution. I'm a little surprised to hear this from you as I'm under the impression that Arch is a binary distribution, and as I recall, was distributing unsigned binaries for much of the life of the distro. I don't agree with the premise of "building from source is more secure than installing binaries" as a general threat model. It's not like the average developer reads the full source of a dependency before building, and even if they wanted to, the average software project has so many layers of dependencies that this wouldn't be possible. If a binary is backdoored, there's no reason why the source couldn't be, too. And if the build environment is compromised, building from source may give a false sense of security. While the Debian reproducible builds project aims to address this, we are not yet at the point where we require reproducible source-only uploads; most if not all distros are still shipping binaries built on random devs' machines. The argument here seems to be that distros are more trustworthy than upstream because there's some semblance of us building from source, but judging from the number of packages that FTBFS that seems a little optimistic to me. I'd like to think my users put a high level of trust in the integrity of my work as a distro developer, but I'm not so arrogant to suggest that folks building node upstream or uploading to NPM from their own source builds can't possibly be worthy of the same trust.
Why don't you just live up to Arch's reputation of shipping cutting-edge software and package node 10.x which does use the openssl 1.1 ABI, as I suggested for Debian? |
Update for those not following the Ubuntu bug: they have accepted our arguments and are going to SRU the build fix. I prodded them on IRC a bit earlier in the week to explain the issue with the ABI incompat in more detail to get this moving. |
@ArchangeGabriel correct, you would most likely break existing deployments if you were to do this. But that'd really be the only thing needed, yes. Consider how electron for example returns, well, electron, as its runtime name. This alone is enough to have a compilation scheme that allows package distributors like us to differentiate between the two incompatible runtimes. |
Ah, so, if at any point ever during the course of our life as a distro, we weren't as secure as we are now, then we lose the right to argue about anything in the name of security. Gotcha.
Okay, fine, I'm fully in support of dropping our nodejs-lts-carbon package and only providing nodejs 10.8.0 (which we already do). NEXT! |
Hey everyone, please try to keep the discussion productive here. This is not the right place for discussing the general quality or security of the Node.js ecosystem, it’s about one very specific problem. It’s definitely the case that Node.js should have done something about this issue before starting the 10.x release line, and the blame for that lies on us. But we are where we are, and we need to look forward. I’m not sure what we can do on Node’s side here; We could allocate a separate “official” @nodejs/lts I think we might want to block 10.x from going LTS before this issue is resolved. |
Sure, but as I said after I’m not keen on reverting to openssl-1.0 for the sole purpose of letting people use pre-compiled binaries.
You’re right to be surprised, but I’d be a Gentoo user if I had more time for this, though that’s beside the point. I understand your statement, but in the present case I would never break something like security in exchange for people being able to use pre-compiled binaries, especially when there are better solutions (just use the main nodejs and not the LTS one, which btw is the only case you seem to support in next Debian, so…).
I definitely agree with you on most points (and btw we also work on reproducible builds partially thanks to the work your people did on that topic, and this is indeed hard), but I would say that:
Of course, everyone might not put the same trust in the same place. And while we are at it, I trust more Debian devs/packages in general than (at least some) Arch ones, but trust is not everything. ;)
We do, since it was released. But we also package the LTS, because some software still does not work with more recent versions. |
I request that you consider self-moderating this comment. All of the contents of the conversation other than this line by all parties focus on technical aspects and this feels personal. Cheers. |
@addaleax Sorry, your post landed while I was busy writing mine. Indeed our discussion drifted to what the point behind distributions and security in binaries is… Regarding the present issue: if you do not want people to compile against different libraries (here OpenSSL, but this applies to any library you provide the ABI for) until you do so for the official binaries, maybe a solution could be to not provide support for them in the code. As a distro packager, one of my roles is to try hard to compile against the latest repo version of any given software a project depends on, and packaging an older version of a lib because some software does not support the new one yet is a last resort measure. But you bet that if you allow me to compile against a newer version, I will. |
At the very least, if it had been clear from the beginning I wouldn’t have built against OpenSSL 1.1 and neither would Debian/Ubuntu, so the question wouldn’t have arisen (though I think it’s a good thing that this started a discussion about the ABI vendoring). Now of course I would be more reluctant to break existing setups (and so were Ubuntu devs), but I nevertheless will downgrade to 1.0 with a user notice that any existing package depending on the SSL ABI will have to be rebuilt, since thanks to your clarification it was never meant to be built like this in a production environment. |
Yes, as long as you clarify your intent that this is public ABI, the worst a distro will/should do is build with old openssl but open an issue to argue why there should be a better solution (e.g. ABI tags for the openssl version plus a way to automatically rebuild modules from source with the same openssl version). Also apologies for descending into snark re: stdlib. |
Here's an attempt at documenting ABI compat concerns in the Not 100% sure if this is the right place to include it, or the best text... please comment in the PR |
For the record, I've just requested some NODE_MODULE_VERSIONs for Debian. |
Sorry I'm really late to this discussion! I've proposed a "solution" here #24114 by means of maintaining a registry of |
Why? |
@OlafvdSpek first, this is a really unfair comparison. Vcpkg is for C++ stuff, tailored around it, whereas native extensions in nodejs are C++ code inserted into a nodejs environment. People who are installing vcpkg packages are usually C++ developers, with a C++ compiler installed, and might know how to deal with C++ issues. Nodejs users don't necessarily even have a C++ compiler installed. And I'm not even talking about the app deployment processes, which are inherently different between the two. Second, even vcpkg has its own issues, and not everyone is using it successfully. |
Sure, but building C++ code shouldn't be an unsolvable problem.
vcpkg doesn't do app deployment at all. |
And they would be affected, if only by the shifting requirement in their environment. All of a sudden, you'd need a compiler in your production environment to deploy. |
Building could be done in a build / dev environment.. assuming it matches the production environment. Or it could be done by the package provider, assuming matching environments are available. |
And we just went full circle, as this is exactly the topic here initially :-) |
@nicolasnoble Matching environment. So not one generic Linux build, but a unique one for each platform. |
Which is what we're currently discussing, yes. The notion of having node module tags specific to each nodejs distribution, for instance, in addition to the existing ones of platform, cpu architecture, libc version, etc. |
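As a rough illustration of what distribution-specific tags could look like, the sketch below builds a tag string from a runtime name, module version, platform, architecture and libc, then looks it up in a list of published prebuilt binaries. Everything here is hypothetical: the `TargetTags` shape, the tag layout, and the `node_debian` runtime name are assumptions made for the example, not an existing convention of any distribution or packaging tool.

```typescript
// Hypothetical description of the environment a binary was built for.
interface TargetTags {
  runtime: string;       // e.g. "node", "electron", or a distro-specific name
  moduleVersion: string; // NODE_MODULE_VERSION, e.g. "57"
  platform: string;      // e.g. "linux"
  arch: string;          // e.g. "x64"
  libc: string;          // e.g. "glibc" or "musl"
}

// Join the tags into a single identifier used to name prebuilt artifacts.
function tagString(t: TargetTags): string {
  return [t.runtime, "v" + t.moduleVersion, t.platform, t.arch, t.libc].join("-");
}

// Return the matching prebuilt tag, or null to signal "compile from source".
function findPrebuilt(host: TargetTags, available: string[]): string | null {
  const wanted = tagString(host);
  return available.indexOf(wanted) !== -1 ? wanted : null;
}
```

A distribution whose runtime reports its own name (say `node_debian`) would never match artifacts tagged `node-v57-linux-x64-glibc`, so installers would either find a binary built specifically for it or fall back to a source build.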
There are hundreds of Linux distributions out there, but I don't imagine you're suggesting developers should complete a build for each and every one. I want to do better than just providing support for the popular distros. Even within each "platform" (e.g. Debian-like platforms) different releases have wildly different ABI support---the Ubuntu LTS releases look nothing like the Debian stable releases. I maintain the toolchain for building portable native dependencies for Python. The reason building native dependencies is hard is because Linux build environments are extremely divergent. Without bundling a maximal set of dependencies with the source to build, users inevitably miss some tool or dependency. Taking this to its logical end, folks end up shipping something like a Docker artifact. While this may be appropriate in order to ship a common build environment for developers, IMO it is not appropriate for shipping software to end users. It's our job as distributors to make this easy. The manylinux policy solves this for the Python ecosystem by defining a highly supported set of core GLIBC symbols that can be assumed to be on any system, and requires developers to build their native projects against those ABIs. Dynamic linking of dependencies is handled by the auditwheel tool which vendors the system dependencies of a Python binary package into the binary, patching the RPATH of the Python binary to point to the vendored locations. There's no reason the node ecosystem couldn't build a similar tool; the policies could even be identical. |
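To illustrate the kind of policy check described above, here is a minimal sketch that flags GLIBC symbol versions newer than a policy's baseline. It assumes the list of required symbol versions has already been extracted from the binary by some other tool; the function names are hypothetical and this is not how auditwheel is implemented. The `GLIBC_2.5` limit in the usage note is the manylinux1 baseline from PEP 513.

```typescript
// A parsed GLIBC symbol version such as GLIBC_2.14 or GLIBC_2.2.5.
interface GlibcVersion {
  major: number;
  minor: number;
  patch: number;
}

function parseGlibcVersion(symbolVersion: string): GlibcVersion {
  const match = /^GLIBC_(\d+)\.(\d+)(?:\.(\d+))?$/.exec(symbolVersion);
  if (match === null) {
    throw new Error("not a GLIBC symbol version: " + symbolVersion);
  }
  return {
    major: Number(match[1]),
    minor: Number(match[2]),
    patch: match[3] === undefined ? 0 : Number(match[3]),
  };
}

// Numeric comparison: negative if a < b, zero if equal, positive if a > b.
function compareGlibc(a: GlibcVersion, b: GlibcVersion): number {
  if (a.major !== b.major) return a.major - b.major;
  if (a.minor !== b.minor) return a.minor - b.minor;
  return a.patch - b.patch;
}

// Return every required symbol version that is newer than the policy allows.
function policyViolations(required: string[], maxAllowed: string): string[] {
  const limit = parseGlibcVersion(maxAllowed);
  return required.filter(function (s) {
    return compareGlibc(parseGlibcVersion(s), limit) > 0;
  });
}
```

For example, checking `["GLIBC_2.2.5", "GLIBC_2.14", "GLIBC_2.4"]` against `GLIBC_2.5` reports only `GLIBC_2.14`. The comparison is numeric per component, since a plain string comparison would wrongly order `2.14` before `2.5`.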
@ehashman I'll be a bit brutal here, but over at grpc, the manylinux policy caused us way more trouble than it solved, and I'd hate to see the same thing happen on nodejs. |
@nicolasnoble that sounds like you should file some issues and we should fix them. :) |
No, the API/ABI you've settled on is way too old and painful to deal with properly, and there's really no fixing it, since running any modern piece of code is near impossible. This TODO is basically the reason our Python support is fairly bad. It's costing us more maintenance time than anything due to the special treatment of libraries here, and due to the horribly ancient glibc we have available. The rest of the codebase has a lot of logic to deal with various API levels from the glibc, and we just can't do such things with manylinux, and we have zero options out of it. At least with the current way nodejs does things, we can still do a lot of various work to cater for many of the distributions we want to support. If anything, Python's manylinux comes from a good idea, but it's way too inflexible, and I wish we could opt out of it. That'd be my litmus test for any proposal you may come up with for nodejs: opt in for the average developer, opt out for the advanced ones. |
@nicolasnoble does manylinux2010 address your issues? @ehashman just announced support for it in auditwheel earlier this month; see also pypa/manylinux#179 for a larger ecosystem-wide tracking issue. Have you tried building manylinux2010 wheels of grpc, and do they address any of your concerns? So I think "The API/ABI you've settled on is way too old" would have been an entirely reasonable bug to file against pypa/manylinux. We (the volunteers working on Python packaging infrastructure) aren't going to know what issues people have if people don't tell us. We know manylinux1 is extremely old but we don't have a sense of whether moving from CentOS 5 to CentOS 6 actually helps people or not, or whether we needed something slightly newer than CentOS 6, without feedback. Advocating for the ability to upload non-manylinux wheels to PyPI would also be a reasonable bug report - it needs some careful design (same sort of design as this ticket has been talking about, more or less) but it's certainly a thing people have wanted. (Though perhaps a manylinux newer than manylinux2010 would solve the problem too.) So would trying to find a way to build manylinux1 wheels against a newer distribution: there's a thread about using linker tricks to adjust symbol versions (and I have a long-overdue response to that waiting for me to get some free time to rigorously look at glibc's forwards-compatibility story, but the short answer is I think it can work pretty well), I wrote an awful hack to work around CentOS 5 using a syscall ABI that recent distros are dropping to make the CentOS 5 build container continue to work, etc.
I'm guessing from grpc/grpc#13949 that one of the things you'd like is to use To try to move this conversation back to NodeJS - it seems likely that NodeJS should base its strategy on manylinux and take into account that the Python world has found that while manylinux works pretty well, manylinux1 isn't sufficient and that periodically producing a manylinux2010, manylinux2014-ish, etc. is necessary, and that also seems to work better than defining platform variants for every single Linux distro. Among other things, it means you only have 2-3 variants to build for in CI instead of hundreds, and the installer on each platform knows what the latest manylinux it supports is, so it can get the most optimized / featureful binary package for that platform. More to the point about OpenSSL, the Python community has generally found that bundling an OpenSSL (through the "cryptography" wheel) works better for most use cases than depending on an OpenSSL from the platform, in part because it sidesteps the question of requiring OS upgrades to speak newer TLS versions. But if you want to depend on and re-export the platform OpenSSL, you should do something manylinuxish to define what platforms have which ones, so the client installer downloads the right native modules for the system's OpenSSL version. (And it would be wonderful if the Node and Python folks could work together on defining a shared standard! OpenSSL isn't part of the manylinux list right now, but I think you could make an argument that it should be.) |
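The installer-side selection described above can be sketched roughly as follows. This assumes the installer holds a list of baseline tags the host supports, ordered newest first, and that the package publishes builds for some subset of them; the function name and the tag strings are illustrative only and do not describe pip's or npm's actual resolution logic.

```typescript
// hostSupports: baseline tags the host can run, ordered newest (most featureful) first.
// available: baseline tags for which the package author published a binary.
// Returns the newest baseline both sides agree on, or null to fall back to source.
function pickBaseline(hostSupports: string[], available: string[]): string | null {
  for (const tag of hostSupports) {
    if (available.indexOf(tag) !== -1) {
      return tag;
    }
  }
  return null;
}
```

A host that supports `manylinux2010` and `manylinux1` would pick the `manylinux2010` build when both are published, and drop back to `manylinux1` when only the older one exists. The author builds two or three variants rather than one per distribution.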
Or having a distribution-provided cryptography that uses the latest system openssl. :p But I've mentioned earlier in this issue that the specific case of openssl could be solved by exporting openssl as part of a stdlib, so that js code doesn't need to concern itself with compiling against the implementation details of openssl. This is exactly what python does. Except that both python and nodejs internally link to openssl as well... and in the python case this is controllable, since the official precompiled binaries define their own version which cryptography can match, and distros using a different version will also provide their own version of the small handful of modules that link to openssl. The python packaging ecosystem is... more canonical about where you import a module from. Or do as python-cryptography does to sidestep the issue, which is to build with static openssl in order to not be leaky. There are definitely options to make this work. Of course, not everyone compiles static openssl for linking against, and nodejs explicitly documents openssl as part of the public API, so compiling in private copies of openssl routines would be defeating the purpose... |
When we were creating the manylinux ABI, we talked to the developers of Canopy and Anaconda, which are commercial python distributions. They have a lot of hard-earned experience shipping "works anywhere" pre-compiled Linux packages to lots of users. What they told us is that they wished they could depend on the system openssl, but in practice they found openssl's ABI just wasn't consistent enough, with e.g. a history of ABI breakages even within security bugfix releases. Maybe they're better these days? I don't know. And in any case, it's obviously the case that openssl breaks compatibility between releases like 1.1.0 and 1.1.1, and personally I'd be extremely reluctant to do anything that traps distros into continuing to support older openssl releases for an indefinite period. Also, you might want to talk to the distros before committing to anything here; our experience with Python is that their attitude towards precompiled binary packages distributed by external indices like npm or pypi ranges between "reluctantly tolerant" and "actively hostile", and they have a history of hacking up our packaging toolchains to add their own policies when redistributing them. If the distros decide that your openssl policy doesn't match their idea of what's best for their users, it's entirely possible they'll unilaterally "fix it". In principle it's not too hard to avoid exposing openssl as part of your ABI, but you have to wrap your head around ELF symbol lookup, which is... counterintuitive, and less helpful than it could be. In general, when you load an extension module using Python ships with a built-in openssl wrapper, but it's built as an extension module that ships with the main interpreter, rather than linked directly into the interpreter binary itself. That's how we can avoid making openssl part of Python's ABI.
This means you can absolutely take a system Python on some crusty old distro, BTW, the same issues apply to every other .so that's linked into the main nodejs executable, like ICU, nghttp2, etc. [1] Well, there's one special case where they can interfere, which we also have a workaround for, but it's not relevant here so let's ignore that for now.
I'm sorry to hear about your problems with grpc. As Geoff explained, we are trying to migrate to a newer baseline, and are open to more fine-grained options beyond that. Unfortunately Python packaging infrastructure gets zero funding from companies beyond the minimum needed to keep PyPI's servers running, so it's slow going. But, any "opt out" will still need to solve some technical challenges. If you have a package that only runs on certain distros, how do you describe that in your package metadata, and how does your installer make use of that metadata to avoid installing packages on systems that can't support them? Technically, it is possible to "opt out" in a sense. Google's official Tensorflow packages on PyPI simply lie about their ABI: they're built against some recent Ubuntu, but then are manually hacked to declare that they can run on any system newer than CentOS 5. So, pip happily installs them, and then they crash. It's not great. You might be interested in this proposal I just posted. For all its limitations, even the initial manylinux1 has been tremendously successful. Last time I checked, ~6 months ago, Python users were downloading manylinux packages ~a million times every day, and it's completely transformed the usability of Python packaging on Linux. |
This issue hasn't seen action in over 1.5 years so I'm going to go ahead and close it out. FWIW, https://bugs.launchpad.net/ubuntu/+source/nodejs/+bug/1779863 was marked as fixed and I don't think there's anything to do on our end. |
As a NodeJS native module developer, we've been relying on NodeJS' ABI to be able to publish pre-compiled binary packages to ease the installation process.
We recently discovered that the Debian / Ubuntu / Arch Linux packages (and maybe more?) aren't ABI compatible with the official NodeJS distributions, which breaks pre-built binary packages in difficult ways. More specifically, these distributions are shipping NodeJS 8 linked against OpenSSL 1.1. The official NodeJS distribution links against OpenSSL 1.0, and there have been ABI-breaking changes between the two versions. Therefore, a NodeJS 8 native module built against the official runtime will fail to work properly on Debian or Arch Linux's runtime.
The package we are publishing (grpc) is affected by this, but I also managed to identify at least a second package that is also affected: uws. The initial issue was reported and (painfully) investigated over on the gRPC bug tracker: grpc/grpc-node#341
I've then filed a detailed issue on Ubuntu's issue tracker to expose the problem with a reproduction case here.
I'm not sure what NodeJS' stance would be on this issue, hence me creating this issue here to discuss the problem. I believe that the ABI breakage from Debian and Arch Linux is an oversight and an honest mistake, but I'm not sure what the resolution should be. I know that Arch Linux has an openssl-1.0 package that the nodejs package can depend upon, but I don't think this is viable for Debian / Ubuntu.
My opinion is that at the simplest, the Node foundation should publish Vendoring Guidelines, describing what it means to ship a correct NodeJS runtime, including notes on how to properly expose ABI compatible symbols for native modules.
cc @ofrobots.