Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Linux also needs heuristic workaround for libclang default include paths #2330

Closed
8 of 11 tasks
Hexcles opened this issue Sep 13, 2016 · 19 comments
Closed
8 of 11 tasks

Comments

@Hexcles
Copy link

Hexcles commented Sep 13, 2016

Issue Prelude

Please complete these steps and check these boxes (by putting an x inside
the brackets) before filing your issue:

  • I have read and understood YCM's CONTRIBUTING document.
  • I have read and understood YCM's CODE_OF_CONDUCT document.
  • I have read and understood YCM's README, especially the
    Frequently Asked Questions section.
  • I have searched YCM's issue tracker to find issues similar to the one I'm
    about to report and couldn't find an answer to my problem. (See Using default .ycm_extra_conf.py, the standard c++11 headers are not found. #303 )
  • If filing a bug report, I have included the output of vim --version.
  • If filing a bug report, I have included the output of :YcmDebugInfo.
  • If filing a bug report, I have included the output of
    :YcmToggleLogs stderr.
  • If filing a bug report, I have included which OS (including specific OS
    version) I am using.
  • If filing a bug report, I have included a minimal test case that reproduces
    my issue.
  • I understand this is an open-source project staffed by volunteers and
    that any help I receive is a selfless, heartfelt gift of their free time. I
    know I am not entitled to anything and will be polite and courteous.
  • I understand my issue may be closed if it becomes obvious I didn't
    actually perform all of these steps.

Thank you for adhering to this process! It ensures your issue is resolved quickly and that neither your nor our time is needlessly wasted.

Issue Details

This issue was discussed in #303. Here's a quick recap: libclang does not always use the default header search paths as the clang binary does, namely the C++ stdlib tends to be missing. It was discovered that macOS was more prone to this issue, while libclang seemed to include the default search paths correctly in Linux (at least for the official libclang binary). Therefore, a workaround was added for macOS only.

New problem: however, since the clang 3.8.0 release (as well as the most recent 3.9.0 release currently used in YCM), the official libclang binary for Linux also seems to stop including the default search paths. (I haven't verified the code change on libclang side, but simply speculated from the behaviour.) I do realize that FAQ already covers this issue, but I think it would be good to add a workaround for Linux similar to what we did for macOS in 11edf12 under the same rationale. (Yet the Linux workaround might be harder/uglier because of the version number in the path...)

A trivial example:

#include <iostream>
int main() {
  std::cout << "Hello";
  return 0;
}

And :YcmDiags would complain about missing headers. After I explicitly added -isystem /usr/include/c++/6.2.1 (since I have GCC 6.2.1) to the flags, everything worked.

FWIW, here's my system info:

  • Arch Linux x86_64 with the most recent packages on 2016-09-12
  • gcc 6.2.1
  • vim 7.4.2334
  • YCM up to date (30871bc) using the official (downloaded) clang 3.9.0.
@pdy
Copy link

pdy commented Oct 4, 2016

I'm using clang and every my .ycm_extra_cong.py contains such script

def LoadSystemIncludes():
    regex = re.compile(ur'(?:\#include \<...\> search starts here\:)(?P<list>.*?)(?:End of search list)', re.DOTALL);
    process = subprocess.Popen(['clang', '-v', '-E', '-x', 'c++', '-'], stdin=subprocess.PIPE, stdout=subprocess.PIPE, stderr=subprocess.PIPE);
    process_out, process_err = process.communicate('');
    output = process_out + process_err;
    includes = [];
    for p in re.search(regex, output).group('list').split('\n'):
        p = p.strip();
        if len(p) > 0 and p.find('(framework directory)') < 0:
            includes.append('-isystem');
            includes.append(p);
    return includes;

With its output being appended to returned flags. I do not experience any problems with stdlib.

@meritozh
Copy link

@PawelSevcio Can you share your full .ycm_extra_conf.py ? I am very interested.

@pdy
Copy link

pdy commented Oct 10, 2016

@meritozh sure, it's in attachment, but there is nothing special in it really.
ycm_extra_conf.txt

@bstaletic
Copy link
Collaborator

I too have noticed that libclang does not include -isystem flags any more and explicitly adding those flags to .ycm_extra_conf.py does solve the issue. So a logical conclusion would be to add some sort of a workaround that would fix this automatically and including -isystem paths that clang uses by default would most likely be just fine for most users. But would such a workaround affect users that use cross toolchains and have quite different -isystem flags? I know that having superfuous include paths is not considered harmful, but if I have defined -isystem flags to be those of my cross toolchain I wouldn't want YCM reporting that everything is fine because it is looking in my native compiler's paths and even worse than this would be having YCM report errors where there are none for the same reason.

To me it seems like a sane decision for -isystem flags not be included by default. Unless I'm terribly mistaken, so anyone is free to correct me.

@Hexcles
Copy link
Author

Hexcles commented Oct 11, 2016

The use of cross toolchains (or any non-default toolchains) is a real and legit concern. However, it is also true that because of the upstream libclang change, there are: 1) changes in the default behaviour for common Linux users (system headers no longer included by default); 2) some behaviour differences between Mac and Linux versions (it used to be the case that both would include the system headers by default -- Linux version was due to libclang behaviours while Mac version is due to the heuristics in the YCM Python script). These were the main reasons I suggested adding some similar heuristic workaround for Linux as well.

But now, considering all these concerns, it might also be good if we can come up with a more elegant solution instead of workarounds. After all, it is really annoying that even a single-file "hello world" would also require a ycm_extra_conf.

@bstaletic
Copy link
Collaborator

I don't see anything wrong with requiring .ycm_extra_conf.py for every single project and including every single flag that the project uses, but I'm an Arch linux user so I'm used to hand configuring everything.

That said I am open to suggestions and willing to debate to find a solution fitting all of us.

@jagerman
Copy link

jagerman commented Nov 6, 2016

I don't see anything wrong with requiring .ycm_extra_conf.py for every single project and including every single flag that the project uses

But that's the thing: projects don't (and shouldn't) specify the include path to standard libraries because it's the compiler's job to do that. Right now, the .ycm_extra_conf.py has to specify more flags than the project uses, and that's the issue.

That's also ignoring the fact that it would be nice to be able edit a 3-line "Hello world" type program without having to first write a 20+ line .ycm_extra_conf.py, or to edit c++ code that isn't my project. Yeah, this isn't always going to work, but having the default often work is a whole lot better than having the default never work.

As for:

But would such a workaround affect users that use cross toolchains and have quite different -isystem flags?

Yes, that is a valid case, but it is also a rare case. It seems far better for including the default system headers to be an opt-out feature than the current hack of having to copy code to parse clang's debugging output into every .ycm_extra_conf.py before ycm will even work. (As to opting out, ycm_core could have a no_default_system_includes() or some such for the rare cases where people don't want the default includes).

jagerman added a commit to erisproject/eris that referenced this issue Nov 6, 2016
jagerman added a commit to erisproject/creativity that referenced this issue Nov 6, 2016
@bstaletic
Copy link
Collaborator

it's the compiler's job to do that.

Ideally it would be, but the projects that are cross compiled are definitely not using native clang as the compiler, it's just not a possibility. I also do not think that people that use C and are primarily cross compiling are rare. Everyone I've ever met that is using C as one's primary language is working with embedded systems and quite often bare-metal, myself included. My every single project for the last four years was cross compiled and I don't see anything changing that in the forseeable future.

It seems far better for including the default system headers to be an opt-out feature than the current hack of having to copy code to parse clang's debugging output into every .ycm_extra_conf.py before ycm will even work. (As to opting out, ycm_core could have a no_default_system_includes() or some such for the rare cases where people don't want the default includes).

I would agree that is a good option. I don't mind including one more line of vimscript, but I also know Valloric and the others don't like to expose any configuration to end users. They even think requiring .ycm_extra_conf.py for C/C++ and .tern-project for javascript was a big mistake. So I wouldn't be surprised to see this possible solution rejected. I hope my feeling is wrong and we will find a solution that wouldd allow me to still use YCM, as I've almost become dependent on :YcmCompleter subcomands and I don't know of any other plugin capable of doing all the things those subcommands are doing.

@puremourning
Copy link
Member

But that's the thing: projects don't (and shouldn't) specify the include path to standard libraries because it's the compiler's job to do that

That's a reasonable thing to assume. And your compiler probably does. Though, try using home-built OSS clang on a Mac. Anyway, realise that the precompiled libclang binaries that we use simply don't know where your standard library is installed. The clang devs try their very best to guess, but they can't know where the designers of your arbitrary system decided to arbitrarily secrete the c++ standard library. It seems the clang driver itself knows better and I know that some of the team are working towards libclang having better heuristics. Until then, there is little we can do other than YCM developers writing workarounds for each and every arbitrary system layout. We concede to do it on Mac because, while different for other OSes, Apple always put the libraries in the same place. Unfortunately, and hugely regrettably, that is not true of any of the other supported OSes, and we're not willing to develop and maintain hacks and workarounds for each and every flavour of each and every OS. In particular, it seems wasteful given that the workaround is trivial (just tell libclang where your standard library is).

While I realise this next point is a straw man, I somehow feel compelled to address it...

That's also ignoring the fact that it would be nice to be able edit a 3-line "Hello world" type program without having to first write a 20+ line .ycm_extra_conf.py

For most users, .ycm_extra_conf.py is a 2 line file for toy projects...

def FlagsForFile( file_name, **kwargs ):
  return { 'flags': [ '-x', 'c++', '-std=c++11' ] }

I'm not sure how often serious developers write a 3-line "Hello, world!" programs, and how useful a code comprehension engine would be for such toy projects, but many thousands of users use YCM on large applications in professional environments, so we really don't see the toy project use case as a major driver for a complex and brittle feature such as this one.

As has been mentioned multiple times, if all you do is toy projects, we recommend writing a global extra conf file with the requisite 2-20 lines in it (depending on your requirements).

You could always indeed incorporate my default-extra-conf branch which somewhat removes the need for project-specific .ycm_extra_conf.py files (under specific circumstances).

They even think requiring .ycm_extra_conf.py for C/C++ and .tern-project for javascript was a big mistake

Quite right! We discussed it at length, but Tern provides no way to avoid tern-project files.

It seems far better for including the default system headers to be an opt-out feature

As above, if we had a solution, it would not be optional.

than the current hack of having to copy code to parse clang's debugging output into every .ycm_extra_conf.py before ycm will even work.

I literally never do that, and use YCM on linux every day. But that's not really the point, I suppose. The point is if libclang can't parse your code, we can't provide completions. You need to tell libclang how to parse your code.

@puremourning
Copy link
Member

I suppose the other thing to point out is that we don't even have a solution, so debating the merits of doing or not doing that nonexistent solution is really not super productive. I realise I have made a nonzero contribution in that regard.

@jagerman
Copy link

jagerman commented Nov 6, 2016

I literally never do that

I'd like to know why it works for you, then, because it doesn't for me (on up-to-date Debian testing).

If I put exactly this in .ycm_extra_conf.py:

def FlagsForFile( file_name, **kwargs ):
    return { 'flags': [ '-x', 'c++', '-std=c++11' ], 'do_cache': True }

and then open this foo.cpp file:

#include <iostream>
int main() {
    std::cout << "hello world\n";
}

I get everything on the std::cout line error-highlighted, with status message invalid operands to binary expression ('ostream' (aka 'int') and 'const char *').

It seems I have to add '-isystem', '/usr/lib/llvm-3.8/lib/clang/3.8.1/include' to make it work: but changing that every time the system clang gets updated is obviously not feasible, so I use a solution like the one @pawelsevcio proposed above to hack the -isystem flags from clang's output. It's gross, and it'll break if clang's verbose output changes, but at least it gets the job done.

I'd happily like a nicer (i.e. two lines) way to do this, but the "for most users" version that you posted above isn't working.

Edit: just to be more precise, without the added -isystem flag :YcmDiags reports:

/usr/include/wchar.h|39 col 11 error| 'stdarg.h' file not found
foo.cpp|4 col 15 error| invalid operands to binary expression ('ostream' (aka 'int') and 'const char *')

@Hexcles
Copy link
Author

Hexcles commented Nov 6, 2016

@bstaletic sorry I guess I misled the discussion: having a configuration file for every project is fine, but the real problem is about hand-coding the default include paths in the conf, something you wouldn't specify in Makefile or CMake if you are doing some common local Linux development and is non-trivial to find out. Furthermore, every time after gcc/clang is upgraded, you have to go back to all these configuration files and change the paths.

@puremourning , first thanks for the insight!

It is true (and sad) that there seems to be no "elegant" (as on Mac) workaround on Linux. The only way I could think of is to automate the manual process of grepping clang's verbose output (see @pawelsevcio 's script), which is quite hacky. (BTW, @jagerman , actually this is not likely to break; the usage comes from gcc and is used in many build hacks.)

Yet I'm wondering how you can (1) use that 2-line .ycm_extra_conf.py for toy projects and (2) never copy the path of stdlib from clang preprocessor output when you work on Linux? This is precisely the issue we are discussing about: if that were the case, I'd be quite satisfied. Which distribution do you use?

@puremourning
Copy link
Member

and is non-trivial to find out.

Quite

Furthermore, every time after gcc/clang is upgraded, you have to go back to all these configuration files and change the paths.

Quite. This is already a pain for macOS due to seemingly arbitrary reorganisations between versions. The YCM development team are just not capable of doing this nontrivial exercise when any of the 10k users deigns to install their compiler and/or standard library in a nonstandard location? I don't think the 5 of us quite have that time to commit, I'm afraid.

automate the manual process of grepping clang's verbose output

As we've said before, we can't do this because we can't even guarantee that clang is installed in YCM runtime environments; there is no requirement for that to be the case.

I think we would only consider any practical heuristic that is:

  • trivial
  • universal
  • self-maintaining

Yet I'm wondering how you can (1) use that 2-line

Take a look at our test suite.

@jagerman
Copy link

jagerman commented Nov 6, 2016

(BTW, @jagerman , actually this is not likely to break; the usage comes from gcc and is used in many build hacks.)

It's actually pretty easy to break: just uninstall the clang binary (ycmd only depends on libclang, not clang itself). (As @puremourning just mentioned).

@Hexcles
Copy link
Author

Hexcles commented Nov 7, 2016

Oh yeah I forgot the requirement of having clang binary. Thanks for pointing it out. I agree, unfortunately, there is no really satisfying one so far.

An interesting update here: I just noticed that I could not reproduce this issue any more on my machine. It is not caused by changes in YCM -- I've verified by checking out to 30871bc (recursively) I originally mentioned. The most likely explanation is my system updates (it's Arch Linux, you know), and I'm still trying to find out the exact reason.

@jagerman
Copy link

jagerman commented Nov 7, 2016

There's a very good chance that the needed clang system includes in linux are to be found in /usr/lib/clang/x.y.z/include, where x.y.z is the libclang version. That path seems to work for the clang packages in Debian, Ubuntu, Arch Linux, and Fedora (I'm not saying it won't work elsewhere, but those are the four Linux distributions I checked).

@Hexcles - I've also noticed that the stdlib headers get picked up now, but the clang includes don't, which causes a failure for, among other things, std::out << "hello";. Does that work for you now? Also, which version of libclang are you using?

@Hexcles
Copy link
Author

Hexcles commented Nov 7, 2016

@jagerman No mine is working fine. And you sure it's missing clang-includes? That's not supposed to happen! YCM explicitly deals with this problem by including a copy of clang_includes and always appending it to includes. See https://github.com/Valloric/ycmd/blob/c3daaacd28decaf8250ee7feaa956d0519eaa52d/ycmd/completers/cpp/flags.py#L438

And here: https://github.com/Valloric/ycmd/tree/c3daaacd28decaf8250ee7feaa956d0519eaa52d/clang_includes/include

I'm on Arch Linux, clang 3.9.0 by the way. And the libclang is downloaded by YCM, 3.9.0.

@jagerman
Copy link

jagerman commented Nov 7, 2016

Actually, this may be is a Debian package problem; it has a symlink from /usr/lib/ycmd/clang_includes to /usr/lib/ycmd/clang/3.8/include, but I think that needs to be /usr/lib/ycmd/clang/3.8, so that clang_includes/include goes to the right place. (Edit: yes, fixing that fixes the problem).

In that case, sorry for all the noise: it seems like whatever was missing picking up the stdlib headers is fixed (and apparently on the clang side of things), and missing things like stdint.h is a problem specific to the Debian package (which has a debian bug filed).

Moreover editing "foo.cpp" (and variants thereof) in my home directory works again without errors (or needed custom configuration).

@puremourning
Copy link
Member

Joy \0/

@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 25, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants