Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

test/file: repeated running test_13559 triggers SystemError randomly #22566

Merged
merged 1 commit into from
Jul 22, 2017

Conversation

iblislin
Copy link
Member

@iblislin iblislin commented Jul 8, 2017

The function test_13559 in test/file.jl will randomly raise SystemError.
but (not sure why) this exception cannot trigger via (g)make test-file even I added a loop in test/file.jl.

I just extract the function in a standalone script, found it can be reproduced easily within 10 iters on my Linux and FreeBSD machines.
please checkout this script: https://gist.github.com/iblis17/b39e37071f3d816076e2770b338a8d07

Here is the result on Linux

└─[iblis@kaladbolg Oops]% ../julia test.jl 
=================== iter 0 ======================
pipe: /tmp/juliaCo8IW7
=================== iter 0 done =================
=================== iter 1 ======================
pipe: /tmp/juliaDhJxvz
=================== iter 1 done =================
=================== iter 2 ======================
pipe: /tmp/juliaw9NMlN
ERROR: LoadError: SystemError: opening file /tmp/juliaw9NMlN: Interrupted system call
Stacktrace:
 [1] #systemerror#43 at ./error.jl:64 [inlined]
 [2] systemerror(::String, ::Bool) at ./error.jl:64
 [3] open(::String, ::Bool, ::Bool, ::Bool, ::Bool, ::Bool) at ./iostream.jl:104
 [4] open(::String, ::String) at ./iostream.jl:132
 [5] test_13559() at /home/iblis/git/julia/tmp/test.jl:11
 [6] macro expansion at /home/iblis/git/julia/tmp/test.jl:25 [inlined]
 [7] anonymous at ./<missing>:?
 [8] include_from_node1(::Module, ::String) at ./loading.jl:549
 [9] include(::Module, ::String) at ./sysimg.jl:14
 [10] process_options(::Base.JLOptions) at ./client.jl:310
 [11] _start() at ./client.jl:378
while loading /home/iblis/git/julia/tmp/test.jl, in expression starting on line 23
┌─[~/git/julia/tmp]
| [Venv(py36)] [ master] [-- INSERT --]
└─[iblis@kaladbolg Oops]% uname -a
Linux kaladbolg 4.11.3-1-ARCH #1 SMP PREEMPT Sun May 28 10:40:17 CEST 2017 x86_64 GNU/Linux

See also: #20585 (comment)

@iblislin
Copy link
Member Author

iblislin commented Jul 8, 2017

@ararslan I found the pain point, please check it out!

if (fd != -1)
return fd;
if (_enonfatal(errno))
goto retry;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be implemented at a higher level.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hmm... I do not get the point "higher level"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you mean replacing goto statement ?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't change this function and check the errno after this function return.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, I modified it. but i'm curious about the reason...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's another open below in the same function.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok. I added some comment for it.

@iblislin iblislin force-pushed the fix-open_cloexec branch 3 times, most recently from 589fc7b to 97d8096 Compare July 8, 2017 22:31
@iblislin
Copy link
Member Author

rebased for changing is_windows -> Sys.iswindows.

test/file.jl Outdated
for i ∈ 1:50
test_22566()
end
end # !is_windows()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be nice to update the comment as well, though this block is short enough that the comment doesn't seem necessary IMO.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👌

@ararslan
Copy link
Member

Awesome. Great work here, thanks!

to deal with EINTR (interrupted system call) on `open`
@iblislin
Copy link
Member Author

rebased for trying FreeBSD CI

@vtjnash vtjnash merged commit 5ea8c7c into JuliaLang:master Jul 22, 2017
@iblislin iblislin deleted the fix-open_cloexec branch July 22, 2017 05:31
cmd = `$(Base.julia_cmd()) --startup-file=no -e $script`
open(pipeline(cmd, stderr=STDERR))

r = open(fn, "r")
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any possibility of race condiction between these 2 open?

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What would constitute a race condition? Yes, the order in which they occur appears to be undefined, if that matters.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(move discussion to #23001 (comment)

ararslan added a commit that referenced this pull request Jul 25, 2018
It was disabled in #24037 but reportedly seems to work okay when
isolated into this separate, node 1 only test group.
ararslan added a commit that referenced this pull request Jul 25, 2018
It was disabled in #24037 but reportedly seems to work okay when
isolated into this separate, node 1 only test group.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants