nc_test fail in 4.5.0 with CMake #500

Closed
ArchangeGabriel opened this issue Oct 20, 2017 · 29 comments · Fixed by #654

@ArchangeGabriel
Contributor

ArchangeGabriel commented Oct 20, 2017

 70/177 Test  #70: nc_test ...............................***Failed   26.31 sec

It works without -DCMAKE_BUILD_TYPE=Release (i.e. DEBUG build), but not with it.

@ArchangeGabriel ArchangeGabriel changed the title from "nc_test fail in 4.5.0 b" to "nc_test fail in 4.5.0 with CMake" Oct 20, 2017
@ArchangeGabriel
Contributor Author

Sorry, I submitted too quickly; edited. Likely related to the warning in #501.

@WardF WardF added this to the 4.5.1 milestone Oct 20, 2017
@WardF
Member

WardF commented Oct 20, 2017

Interesting; what platform are you on? I'm unable to recreate this but will keep trying.

@WardF
Member

WardF commented Oct 20, 2017

Specifically, which distribution of 64-bit linux, sorry :)

@ArchangeGabriel
Contributor Author

ArchLinux as usual, sorry for not mentioning it. I’ll add the log requested by @DennisHeimbigner shortly.

@ArchangeGabriel
Contributor Author

https://paste.xinu.at/xrLi/

Relevant section starts at https://paste.xinu.at/xrLi/#n2211

@zerothi

zerothi commented Oct 30, 2017

This also fails for me:

# Features
--------
NetCDF-2 API:		yes
HDF4 Support:		no
NetCDF-4 API:		yes
NC-4 Parallel Support:	no
PNetCDF Support:	no
DAP2 Support:		no
DAP4 Support:		no
Diskless Support:	yes
MMap Support:		no
JNA Support:		no
CDF5 Support:		no

With gcc 7.2, zlib 1.2.11, szip 2.1.1, hdf5 1.8.18. Debian (latest). And I am not using CMake.

@zerothi

zerothi commented Oct 30, 2017

I can confirm that the test passes if the netcdf-c library is built with debugging flags (all dependent libraries compiled with high optimization).

@ArchangeGabriel
Contributor Author

zerothi: So you’re having the same failure with autotools? That’s very interesting, because I don’t: I switched back to autotools once #244 was fixed, and every test passes that way, whereas this one fails with CMake.

Maybe something with GCC 7.2 then. @zerothi What are your compile flags?

@zerothi

zerothi commented Oct 30, 2017

-m64 -fPIC -O3 -ftree-vectorize -fexpensive-optimizations -funroll-loops -fprefetch-loop-arrays -march=native

I agree, they are pretty aggressive, but hdf5 etc. completes tests without problems.

@ArchangeGabriel
Contributor Author

CFLAGS:			-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -D_LARGEFILE64_SOURCE -D_FILE_OFFSET_BITS=64 -O3 -DNDEBUG
CPPFLAGS:		 
LDFLAGS:		-Wl,-O1,--sort-common,--as-needed,-z,relro,-z,now 

The -O2 is system default, the -O3 is a RELEASE addition apparently. So I expect -O3 (+ gcc 7.2?) to be the culprit.
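
Since the CFLAGS line above carries both -O2 and -O3, it may help to recall that GCC applies the last -O option it sees, so the build effectively runs at -O3. A minimal sketch that picks the effective level out of a flag string (`effective_opt` is a hypothetical helper written for this illustration, not part of netCDF or CMake):

```shell
# Print the last -O<level> option found in a flag string.
# GCC honors the last -O option on the command line, so this is the
# level that actually applies. Linker options such as -Wl,-O1 do not
# start with -O and are therefore ignored here.
effective_opt() {
    last=""
    for flag in $1; do          # intentional word splitting on spaces
        case "$flag" in
            -O*) last="$flag" ;;
        esac
    done
    echo "${last:-none}"
}

# The CFLAGS reported above for the CMake Release build:
effective_opt "-march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -O3 -DNDEBUG"
```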

@ArchangeGabriel
Contributor Author

ArchangeGabriel commented Oct 30, 2017

I can confirm that a debug build with -O3 fails this test. It likely fails with autotools too; I’ve just realized that autotools has no DEBUG or RELEASE build types, so no flags are added and the build stays at -O2.

@zerothi

zerothi commented Oct 30, 2017

I can confirm that if I only change -O3 to -O2 in my above flags, then the tests do not fail.

@DennisHeimbigner
Collaborator

You might be able to get some additional information as follows:

  1. For CMake, look in the build directory for the file: Testing/Temporary/LastTest.log
    Search (I think) for the string "Test Fail"

  2. For Automake, look for a file named nc_test.log
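
The two lookups above could be scripted roughly as follows. This is only a sketch: `show_failures` is a hypothetical helper, and the default search pattern "Fail" is a guess in the same spirit as the "Test Fail" string suggested above, so adjust it to whatever your log actually contains.

```shell
# Print matching lines (with line numbers) from a test log.
# Usage: show_failures <logfile> [pattern]
show_failures() {
    log="$1"
    pattern="${2:-Fail}"        # assumption: failure lines contain "Fail"
    if [ ! -f "$log" ]; then
        echo "no such log: $log" >&2
        return 1
    fi
    grep -n -i -- "$pattern" "$log"
}

# CMake, run from the build directory:
#   show_failures Testing/Temporary/LastTest.log
# Automake, after locating the log in the build tree:
#   find . -name nc_test.log
#   show_failures path/to/nc_test.log FAIL
```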

@zerothi

zerothi commented Oct 31, 2017

Here it is:

*** testing nc_put_var_uchar ... FAIL nc_test (exit status: 139)
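
For readers unfamiliar with the number: assuming the harness reports the usual shell exit status, a value above 128 means the process was killed by signal (status - 128), so 139 corresponds to signal 11, which is SIGSEGV on Linux. A small sketch that decodes it (`describe_status` is a hypothetical helper for illustration):

```shell
# Decode a shell-style exit status: values above 128 conventionally mean
# "terminated by signal (status - 128)".
describe_status() {
    status="$1"
    if [ "$status" -gt 128 ]; then
        sig=$((status - 128))
        # kill -l <number> prints the signal's name on this platform
        echo "killed by signal $sig ($(kill -l "$sig"))"
    else
        echo "exited with code $status"
    fi
}

describe_status 139
```

In other words, nc_test crashes with a segmentation fault inside the nc_put_var_uchar tests rather than reporting an ordinary test mismatch.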

@ArchangeGabriel
Contributor Author

@DennisHeimbigner Mine is attached here: #500 (comment)

@edhartnett
Contributor

@ArchangeGabriel when I try that link I get an error message that there is no such file.

@ArchangeGabriel
Contributor Author

Hmm, indeed. Maybe the file expired, though that shouldn’t normally happen. Re-uploaded here: https://gist.github.com/ArchangeGabriel/a5d0abb6363b31f71ce4ad44736c60da

@DennisHeimbigner
Collaborator

Looking at the log, it appears the problem is in nc_put_att and nc_rename_att.

@WardF
Member

WardF commented Nov 14, 2017

With some of the recent memory issues being addressed via PRs, does this issue persist? I'm unable to recreate it on my end.

@WardF WardF self-assigned this Nov 14, 2017
@edhartnett
Contributor

@ArchangeGabriel I have looked at your logs.

@WardF was there some change merged recently relating to the unsigned/signed char issues somewhere?

Also, @ArchangeGabriel, can you try:

cd nc_test4
make check

The challenge with nc_test is that it's a heck of a program to debug. If we can find a simpler test that is failing, that could make it a lot easier to find the problem.

Also have you tried this with the autotools build?

@ArchangeGabriel
Contributor Author

I’m currently building from git, with both autotools and CMake, I’ll keep you updated.

@ArchangeGabriel
Contributor Author

ArchangeGabriel commented Nov 15, 2017

Still failing with CMake and O3. Confirmed failing on autotools with O3.

Unrelated, but I’ve noticed a compilation warning:

/build/netcdf/src/netcdf-c/libdispatch/dutil.c: In function ‘NC_mktmp’:
/build/netcdf/src/netcdf-c/libdispatch/dutil.c:249:15: warning: return makes pointer from integer without a cast [-Wint-conversion]
        return (NC_EPERM);
               ^

One more in ncgen3:

gcc -DHAVE_CONFIG_H -I. -I..  -I../include -I../oc2  -D_FORTIFY_SOURCE=2  -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -O3 -MT main.o -MD -MP -MF $depbase.Tpo -c -o main.o main.c &&\
mv -f $depbase.Tpo $depbase.Po
main.c: In function ‘main’:
main.c:482:10: warning: ignoring return value of ‘fread’, declared with attribute warn_unused_result [-Wunused-result]
          (void)fread(bom,1,1,fp);
          ^~~~~~~~~~~~~~~~~~~~~~~

One in a test:

gcc -DHAVE_CONFIG_H -I. -I..  -I../include -I../oc2  -I../libsrc -DTOPSRCDIR=/build/netcdf/src/netcdf-c -DTOPBINDIR= -I../liblib -I../include -I../libsrc -D_FORTIFY_SOURCE=2  -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -O3 -MT tst_utf8_validate.o -MD -MP -MF $depbase.Tpo -c -o tst_utf8_validate.o tst_utf8_validate.c &&\
mv -f $depbase.Tpo $depbase.Po
tst_utf8_validate.c:98:44: warning: null character(s) preserved in literal
 {0,"2.1.1", "1 byte  (U-00000000)",        " "},
                                            ^

Now running the check in nc_test4.

@ArchangeGabriel
Contributor Author

One warning in this folder:

gcc -DHAVE_CONFIG_H -I. -I..  -I../include -I../oc2  -D_FORTIFY_SOURCE=2  -march=x86-64 -mtune=generic -O2 -pipe -fstack-protector-strong -fno-plt -O3 -MT tst_camrun.o -MD -MP -MF $depbase.Tpo -c -o tst_camrun.o tst_camrun.c &&\
mv -f $depbase.Tpo $depbase.Po
tst_camrun.c: In function ‘get_mem_used2’:
tst_camrun.c:685:7: warning: ignoring return value of ‘fscanf’, declared with attribute warn_unused_result [-Wunused-result]
       fscanf(pf, "%u %u %u %u %u %u", &size, &resident, &share,
       ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
       &text, &lib, &data);
       ~~~~~~~~~~~~~~~~~~~

make check worked in nc_test4.

@WardF
Member

WardF commented Nov 15, 2017

Thanks for confirming the issue is still there. I’ll take a look shortly or we will see if Ed has an idea/contribution (given the rate of recent contributions!)

@edhartnett
Contributor

edhartnett commented Nov 15, 2017

Howdy @ArchangeGabriel! Thanks for the detailed info! ;-)

No need to tell us about warnings. There are lots of warnings in the code that are going to go away shortly (when outstanding PRs get merged). But we still won't be warning-free, yet.

OK, I see that you say it fails on autotools. Only with -O3? I will try...

@edhartnett
Contributor

OK, I am reproducing your problems. There are some new warnings, as well as a lot of warnings that are old frenemies. So I will wait until @WardF merges my outstanding PRs, which clear up a lot of warnings.

@ArchangeGabriel
Contributor Author

Yes, we already determined above that -O3 was the culprit. You can try with CMake or autotools: as long as you stay at -O2 you’re fine, but with -O3 both fail.

The reason why CMake revealed it is that its Release mode includes -O3, while autotools has no such mode and never adds -O3. But if you specify it yourself, it then fails.
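
A rough way to compare optimization levels under autotools, sketched under the assumption of an in-tree build from the top of the netcdf-c source directory. `check_with_opt` and the `RUN` switch are hypothetical names for this illustration; by default the function only prints the commands it would run.

```shell
# Print (or, with RUN=1, execute) the commands for an autotools build at a
# given optimization level, followed by the checks in the nc_test directory.
check_with_opt() {
    opt="$1"                    # e.g. -O2, -O3, or "-O3 -fno-inline"
    for cmd in \
        "./configure CFLAGS='$opt'" \
        "make" \
        "make -C nc_test check"
    do
        if [ "${RUN:-0}" = 1 ]; then
            sh -c "$cmd" || return 1
        else
            echo "$cmd"
        fi
    done
}

# Dry run: show what would be executed for the failing configuration.
check_with_opt -O3
```

Running it once with -O2 and once with -O3 (with `RUN=1`, and a clean tree in between) should show the difference described above; the CMake counterpart is configuring with -DCMAKE_BUILD_TYPE=Release.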

@wkliao
Contributor

wkliao commented Nov 15, 2017

The bug fix is in #654 

wkliao added a commit to wkliao/parallel-netcdf that referenced this issue Nov 16, 2017
wkliao added a commit to Parallel-NetCDF/PnetCDF that referenced this issue Nov 16, 2017
@ArchangeGabriel
Contributor Author

A bit late then, but I can confirm that it works with -O3 -fno-inline. :)

I can retry with just -O3 and your patch if you want. ;)
