Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

no debug-info on locals in stack frames when stepping though rust-msgpack #9641

Closed
glycerine opened this issue Oct 1, 2013 · 10 comments
Closed
Labels
A-debuginfo Area: Debugging information in compiled programs (DWARF, PDB, etc.)

Comments

@glycerine
Copy link

Josh asked for where I was seeing debug-info goes missing. Most places, in my experience. An easy to repro example is just to debug the msgpack bindings, as below. The rust compiler itself, I'm told from rust/irc, is another prime example.

me@host:/tmp/rust-msgpack$ uname -a
Linux i7 3.8.0-30-generic #44~precise1-Ubuntu SMP Fri Aug 23 18:32:41 UTC 2013 x86_64 x86_64 x86_64 GNU/Linux
me@host:/tmp/rust-msgpack$ rustc -v
rustc 0.8 (48499c7 2013-09-26 15:36:10 -0700)
host: x86_64-unknown-linux-gnu
me@host:/tmp/rust-msgpack$
me@host:/tmp$ git clone https://github.com/glycerine/rust-msgpack
Cloning into 'rust-msgpack'...
remote: Counting objects: 85, done.
remote: Compressing objects: 100% (52/52), done.
remote: Total 85 (delta 33), reused 84 (delta 32)
Unpacking objects: 100% (85/85), done.
me@host:/tmp$ cd rust-msgpack/
me@host:/tmp/rust-msgpack$ git branch -v
* master 802d91b add license to readme
me@host:/tmp/rust-msgpack$ ls
example.rs  msgpack.rs  README.md
me@host:/tmp/rust-msgpack$ rustc -Zextra-debug-info -Z no-opt example.rs
me@host:/tmp/rust-msgpack$ lh
total 488
drwxr-xr-x  3 jaten jaten   4096 Sep 30 18:16 ./
drwxrwxrwt 90 root  root   53248 Sep 30 18:16 ../
-rwxrwxr-x  1 jaten jaten 403433 Sep 30 18:16 example*
drwxrwxr-x  8 jaten jaten   4096 Sep 30 18:15 .git/
-rw-rw-r--  1 jaten jaten  19994 Sep 30 18:15 msgpack.rs
-rw-rw-r--  1 jaten jaten   1530 Sep 30 18:15 example.rs
-rw-rw-r--  1 jaten jaten    164 Sep 30 18:15 README.md
me@host:/tmp/rust-msgpack$ gdb example
GNU gdb (Ubuntu/Linaro 7.4-2012.04-0ubuntu2.1) 7.4-2012.04
Copyright (C) 2012 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64-linux-gnu".
For bug reporting instructions, please see:
<http://bugs.launchpad.net/gdb-linaro/>...
Reading symbols from /tmp/rust-msgpack/example...done.
(gdb) break msgpack.rs:587
Breakpoint 1 at 0x4163e7: msgpack.rs:587. (84 locations)
(gdb) run # 84 locations??
Starting program: /tmp/rust-msgpack/example
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/x86_64-linux-gnu/libthread_db.so.1".
[New Thread 0x7ffff7fc4700 (LWP 26993)]
[New Thread 0x7ffff5d67700 (LWP 26994)]
[New Thread 0x7ffff4fff700 (LWP 26995)]
[New Thread 0x7ffff4efe700 (LWP 26996)]
[Switching to Thread 0x7ffff7fc4700 (LWP 26993)]

Breakpoint 1, std..hashmap..HashMap$LT$u32$C$u32$GT$::glue_drop::hf62265217cbfd21aaB ()
    at msgpack.rs:587
587           f(self, len)
(gdb) list
582         #[inline(always)]
583         fn read_seq<T>(&mut self, f: &fn(&mut Decoder,uint) -> T) -> T {
584           //println!("in read_seq.");
585           let len = self._read_vec_len();
586           //printfln!("I see length of %?", len);
587           f(self, len)
588         }
589
590         #[inline(always)]
591         fn read_seq_elt<T>(&mut self, _idx: uint, f: &fn(&mut Decoder) -> T) -> T { f(self) }
(gdb) bt
#0  std..hashmap..HashMap$LT$u32$C$u32$GT$::glue_drop::hf62265217cbfd21aaB () at msgpack.rs:587
#1  0x000000000042b257 in hashmap::Clone$HashMap::clone::h7953e8b94478d0bcx7aM::v0.0 ()
    at example.rs:41
#2  0x0000000000429036 in example::main () at example.rs:41
#3  0x00007ffff75a1ee0 in rt::task::__extensions__::build_start_wrapper::anon::anon::expr_fn::a0 ()
   from /tmp/rust-msgpack/../../usr/local/lib/rustc/x86_64-unknown-linux-gnu/lib/libstd-6c65cf4b443341b1-0.8.so
#4  0x00007ffff75a03a7 in rt::task::__extensions__::run::anon::expr_fn::ai ()
   from /tmp/rust-msgpack/../../usr/local/lib/rustc/x86_64-unknown-linux-gnu/lib/libstd-6c65cf4b443341b1-0.8.so
#5  0x00007ffff6d7fb13 in rust_try (f=<optimized out>, fptr=<optimized out>, env=<optimized out>)
    at /usr/cn/rust/github-master/rust/src/rt/rust_builtin.cpp:499
#6  0x00007ffff75a02dc in rt::task::Unwinder::try::h199ab8d6eb226980Vqah::v0.8 ()
   from /tmp/rust-msgpack/../../usr/local/lib/rustc/x86_64-unknown-linux-gnu/lib/libstd-6c65cf4b443341b1-0.8.so
#7  0x00007ffff75a0151 in rt::task::Task::run::h199ab8d6eb226980Z1ag::v0.8 ()
   from /tmp/rust-msgpack/../../usr/local/lib/rustc/x86_64-unknown-linux-gnu/lib/libstd-6c65cf4b443341b1-0.8.so
#8  0x00007ffff75a1a7c in rt::task::__extensions__::build_start_wrapper::anon::expr_fn::aR ()
   from /tmp/rust-msgpack/../../usr/local/lib/rustc/x86_64-unknown-linux-gnu/lib/libstd-6c65cf4b443341b1-0.8.so
#9  0x0000000000000000 in ?? ()
(gdb) list
592
593         #[inline(always)]
594         fn read_struct<T>(&mut self, _name: &str, len: uint, f: &fn(&mut Decoder) -> T) -> T {
595             //printfln!("in read_struct");
596           if len != self._read_map_len() { fail!() }
597 >>>       f(self)
598         }
599
600         #[inline(always)]
601         fn read_struct_field<T>(&mut self, _name: &str, _idx: uint, f: &fn(&mut Decoder) -> T) -> T {
(gdb) i lo
No symbol table info available.
(gdb) p len
No symbol "len" in current context.
(gdb)

@jdm
Copy link
Contributor

jdm commented Oct 1, 2013

This is a great testcase; thanks. Breaking on example::main and stepping is working fine for me; I'm going to investigate when things start to go wrong.

@jdm
Copy link
Contributor

jdm commented Oct 1, 2013

Easy example: break example::decod shows weirdness for the first two step commands (it says the pc's at msgpack.rs:692, then msgpack.rs:54) before actually showing the correct line 24. Furthermore, info locals shows bogus values until we hit line 24, which suggests to me that we're incorrectly associating the function prelude with other bits of code.

@michaelwoerister
Copy link
Member

Yes, that is probably related to the "peculiar stepping behavior" I talked about in the blog post. The #[inline(always)] attribute is also a candidate for making things less stable (though normally LLVM should handle that automatically).

@glycerine
Copy link
Author

Given that stack frame 1 is:
#1 0x000000000042b257 in hashmap::Clone$HashMap::clone::h7953e8b94478d0bcx7aM::v0.0 ()
at example.rs:41

I imagine the biggest problem may actually be the mapping of code to lines. That will mask lots of local variable visibility issues, since it appears that we've halted in the wrong place (not where we set the breakpoint).

@michaelwoerister
Copy link
Member

Yeah, that's what I meant. The whole area of generating good line tables has received virtually no attention so far (you should try stepping through an if-expression ;)
I haven't thought about how this can be properly tested with automated tests but that would be the best way to tackle this. I'll see if I can work on this over the next weeks, but now that my internship is over my time budget for Rust is smaller than it used to be.

@jdm
Copy link
Contributor

jdm commented Dec 11, 2013

Both Michael and I have taken a look at this, and neither of us have found anything conclusive. Information dump:

* It might be a good idea to reset the debug location in debuginfo::create_function_debug_context() because trans::base::new_fn_ctxt_w_id() already creates some LLVM instructions which might be assigned some random, call-site source positions otherwise.
* trans::base::alloca_maybe_zeroed() changes the position of the IR builder, possibly putting it back into the prelude section of the function. Clear source positions here also removed some unwanted source position assignments in IR.

Other than that I have not found out why we still get different results than clang. The generated IR for a small test program looked very similar, but in the clang program the arguments already had the correct value in the first line of the function, while in the Rust program the correct values where loaded after the first line...

Maybe some instruction appearing later in the IR gets assigned a wrong source position before at the beginning of the function body.

From my own investigations, I applied this patch (https://gist.github.com/jdm/7903912) and ended up generating very similar IR to clang for the programs at https://gist.github.com/jdm/7903924, but breaking on step::Foo::foopy still showed the same problem.

@michaelwoerister
Copy link
Member

This may not be entirely our fault. Problem sometimes also occurs in Clang (Ubuntu clang version 3.2-1~exp9ubuntu1), e.g. for the following program:

void function(int a, int b, int c)
{
    a = b;
    c = b;
}

int main() {
    function(10, 11, 12);
}

Instructing gdb to break at function will stop at a position where the arguments are not yet initialized:

Breakpoint 1, function (a=32767, b=-7728, c=0) at ./main.c:4
4   {

After one step the variables have the correct values, so this seems to be the same problem.
Currently I'm looking into whether morestack check in the function prelude is part of the problem...

@michaelwoerister
Copy link
Member

Here is what I tried and found so far:

  • Always set column numbers to zero. Both CLANG and GCC only provide line numbers for source positions.
  • Made sure that there are no superfluous allocas for immediate function arguments. There was some redundant copying around (in debug builds only) which I suspected to throw LLVM's function prelude detection off. This is just a simple modification of a condition in trans::_match::store_arg().
  • CLANG creates a root lexical block for the function body (instead of using the subprogram DIE as the root scope). Function arguments are declared as siblings to this root block. I replicated this behaviour to see if it made a difference.
  • As already mentioned by Josh, I made sure that debug locations are reset before the function is translated and only turned on again after the arguments are loaded.
  • I checked whether the more stack check part of the function prelude causes the problem by applying the #[no_split_stack] attribute to test functions.

Unfortunately, all of these measures combined still don't alleviate the problem. The LLVM IR looks right, the dwarfdump output looks right, but GDB and objdump -S still place the first statement before the arguments are loaded :/

Looks like this issue will pester us a while longer :)

@michaelwoerister
Copy link
Member

OK, I think I know a bit more about the nature of this problem now:
GDB does not seem to use the PROLOGUE END markers in the DWARF line information (GCC does not even generate them) Instead GDB seems to rely on some kind of prologue detection mechanism to find out where the body of a function starts. This seems to work most of the time for code generated by GCC, CLANG, and even RUSTC for functions with the #[no_split_stack] attribute.

However, the additional machine code generated for checking whether more stack space needs to be allocated seems to throw off GDB's prologue detection and it thinks the regular part of the function has already begun there. So, my current guess is that Rust's dynamic stack allocation is what leads to this behaviour.

Note that LLVM does seem to generate correct PROLOGUE END markers in the DWARF line tables, even for functions with split stacks and the IR and DWARF for Rust programs now look pretty much the same as for similar C programs compiled with CLANG. It's possible that this problem would not occur with LLDB. I've not yet had a chance to try that.

All the changes mentioned above make sense and lead to cleaner line table generation. I'll prepare a pull request ASAP :)

bors added a commit that referenced this issue Dec 16, 2013
This PR improves the stepping experience in GDB. It contains some fine tuning of line information and makes *rustc* produce nearly the same IR/DWARF as Clang. The focus of the changes is function prologue handling which has caused some problems in the past (#9641).

It seems that GDB does not properly handle function prologues when the function uses segmented stacks, i.e. it does not recognize that the `__morestack` check is part of the prologue. When setting a breakpoint like `break foo` it will set the break point before the arguments of `foo()` have been loaded and still contain bogus values. For function with the #[no_split_stack] attribute this problem has never occurred for me so I'm pretty sure that segmented stacks are the cause of the problem. @jdm mentioned that segmented stack won't be completely abandoned after all. I'd be grateful if you could tell me about what the future might bring in this regard (@brson, @cmr).

Anyway, this PR should alleviate this problem at least in the case when setting breakpoints using line numbers and also make it less confusing when setting them via function names because then GDB will break *before* the first statement where one could conceivably argue that arguments need not be initialized yet.

Also, a koala: 🐨

Cheers,
Michael
flip1995 pushed a commit to flip1995/rust that referenced this issue Oct 20, 2022
Enable cargo sparse registry in CI

https://blog.rust-lang.org/2022/06/22/sparse-registry-testing.html

The initial registry update takes around 1 minute currently, so this gives quite a nice speed boost to CI build times

r? `@flip1995`

changelog: none
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-debuginfo Area: Debugging information in compiled programs (DWARF, PDB, etc.)
Projects
None yet
Development

No branches or pull requests

3 participants