Cassandra page faults under YCSB workloadc with extra JVM logging #490

Closed
tgrabiec opened this issue Sep 5, 2014 · 5 comments

Comments

@tgrabiec
Member

tgrabiec commented Sep 5, 2014

When these options are passed to the JVM:

-verbose:gc 
-XX:+PrintGCDetails
-XX:+PrintGCDateStamps
-XX:+PrintTenuringDistribution
-XX:+PrintGCApplicationConcurrentTime
-XX:+PrintGCApplicationStoppedTime

The JVM page faults about a minute after the YCSB benchmark starts:

2014-09-05T17:53:59.527+0000: page fault outside application, addr: 0x000020000b2ae000
[registers]
RIP: 0x000000000045ba31 <???+4569649>
RFL: 0x0000000000010202  CS:  0x0000000000000008  SS:  0x0000000000000010
RAX: 0x8000000000000000  RBX: 0x000020000b2ae004  RCX: 0x0000000000000000  RDX: 0x0000000000000000
RSI: 0x0000000000000003  RDI: 0x000020000b2ab31c  RBP: 0x000020000b2ad010  R8:  0x0000000000000066
R9:  0x0000000000000000  R10: 0x00000000ffffffe2  R11: 0x0000000000000000  R12: 0x0000000000000007
R13: 0x0000000000000000  R14: 0x00000000ffffffe2  R15: 0x0000000000000000  RSP: 0x000020000b2ab2a0
Aborted

[backtrace]
0x00000000003291bf <???+3314111>
0x000000000032a2d3 <mmu::vm_fault(unsigned long, exception_frame*)+147>
0x0000000000389ff9 <page_fault+105>
0x0000000000388ee6 <???+3706598>

#0  0x00000000003fa912 in cli_hlt ()
    at /data/tgrabiec/src/osv/arch/x64/processor.hh:242
#1  halt_no_interrupts () at /data/tgrabiec/src/osv/arch/x64/arch.hh:49
#2  osv::halt () at /data/tgrabiec/src/osv/core/power.cc:36
#3  0x00000000002237a5 in abort (fmt=fmt@entry=0x6058ed "Aborted\n")
    at /data/tgrabiec/src/osv/runtime.cc:150
#4  0x00000000002237d0 in abort () at /data/tgrabiec/src/osv/runtime.cc:117
#5  0x00000000003291c0 in mmu::vm_sigsegv (addr=<optimized out>, 
    ef=0xffff8001099ce078) at /data/tgrabiec/src/osv/core/mmu.cc:1191
#6  0x000000000032a2d4 in mmu::vm_fault (addr=<optimized out>, 
    addr@entry=35184559448064, ef=ef@entry=0xffff8001099ce078)
    at /data/tgrabiec/src/osv/core/mmu.cc:1213
#7  0x0000000000389ffa in page_fault (ef=0xffff8001099ce078)
    at /data/tgrabiec/src/osv/arch/x64/mmu.cc:38
#8  <signal handler called>
#9  fmt_fp (f=0x20000b2ad2c0, y=0, w=3, p=7, fl=0, t=102)
    at /data/tgrabiec/src/osv/musl/src/stdio/vfprintf.c:291
#10 0x0000000000000000 in ?? ()
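
For readers following the dump, here is a minimal sketch that cross-checks the numbers in it. It only does arithmetic on values copied from the report above. The 4 KiB page size and the reading of `t=102` as the printf conversion character are assumptions, not something the report states, and the helper names are made up for illustration.

```python
# Values copied verbatim from the register dump and gdb backtrace above.
FAULT_ADDR = 0x000020000B2AE000          # "page fault outside application, addr:"
RBX = 0x000020000B2AE004
RSP = 0x000020000B2AB2A0
ADDR_ENTRY_DECIMAL = 35184559448064      # addr@entry in frame #6 (mmu::vm_fault)
FMT_FP_T = 102                           # t=102 in frame #9 (fmt_fp)

PAGE_SIZE = 4096  # assumption: 4 KiB pages


def page_offset(addr, page_size=PAGE_SIZE):
    """Return the offset of addr within its page."""
    return addr % page_size


def summarize():
    """Collect a few consistency checks on the reported values."""
    return {
        # gdb prints addr@entry in decimal; it should equal the hex fault address.
        "gdb_addr_matches_fault": ADDR_ENTRY_DECIMAL == FAULT_ADDR,
        # The faulting address sits exactly on a page boundary.
        "fault_page_aligned": page_offset(FAULT_ADDR) == 0,
        # RBX points just past the start of the faulting page.
        "rbx_minus_fault": RBX - FAULT_ADDR,
        # Distance in bytes from the stack pointer up to the faulting address.
        "fault_minus_rsp": FAULT_ADDR - RSP,
        # Assumption: t is the printf conversion character passed to fmt_fp.
        "conversion_char": chr(FMT_FP_T),
    }


if __name__ == "__main__":
    for name, value in summarize().items():
        print(f"{name}: {value}")
```

The checks show that the decimal `addr@entry` and the hex fault address are the same value, and that the fault lands on the first byte of a page a few kilobytes above `RSP`. They say nothing on their own about the root cause.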
@gleb-cloudius
Contributor

We saw the same crash with ifconfig. It looks like we are corrupting the
floating-point state somehow.

        Gleb.

@raphaelsc
Member

@gleb-cloudius, is this issue fixed by 0e8d9b5?

@gleb-cloudius
Contributor

On Mon, Sep 08, 2014 at 12:33:43PM -0700, Raphael S.Carvalho wrote:

@gleb-cloudius, is this issue fixed by 0e8d9b5?

Yes, it should be.

        Gleb.

@slivne
Contributor

slivne commented Sep 29, 2014

@tgrabiec @gleb-cloudius can we close this issue?

@gleb-cloudius
Contributor

On Mon, Sep 29, 2014 at 12:47:30AM -0700, slivne wrote:

@tgrabiec @gleb-cloudius can we close this issue

yes

        Gleb.

@tzach closed this as completed Sep 29, 2014