Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

How to slice the program accurately #441

Open
hrshy0629 opened this issue May 26, 2022 · 12 comments
Open

How to slice the program accurately #441

hrshy0629 opened this issue May 26, 2022 · 12 comments
Labels

Comments

@hrshy0629
Copy link

I read and implemented DG, but I found that there were still several issues that failed to reach the target.

  1. Slicing seems to be interprocedure. Can you set a parameter to control whether the output is the slice bitcode of the target function or the slice set bitcode of the interprocedure function?
  2. Can you select more accurate variable location information? LLVM debug information contains row information and column information. Can you select variables by determining row and column information?
    Thank you for any suggestions.
@mchalupa
Copy link
Owner

mchalupa commented Jun 1, 2022

Slicing seems to be interprocedure. Can you set a parameter to control whether the output is the slice bitcode of the target function or the slice set bitcode of the interprocedure function?

There is the option -entry to set the entry point of slicing (but it is still interprocedural). Most of the analyses should be able to work also intraproceduraly, but there is no switch to turn this on at this moment.

Can you select more accurate variable location information? LLVM debug information contains row information and column information. Can you select variables by determining row and column information?

You can select based on row, but not column at this moment. But it should be an easy hack at this place: https://github.com/mchalupa/dg/blob/master/tools/llvm-slicer-crit.cpp#L213
If you do not want to hack the code, you can modify the bitcode and insert a call of a dummy function at the instruction that you want to use as the slicing criterion.

@hrshy0629
Copy link
Author

@mchalupa Thanks for your comment.
In response to my first question, I would like to modify the code to get specify the coded slice of a single function. Interprocedural information is useful for debugging, but I just want to cut out my own function information. Could you tell me the general file information in DG? For example, which files involve function slicing?

@mchalupa
Copy link
Owner

mchalupa commented Jun 2, 2022

If you need to only see the result, you can try -annotate slice with llvm-slicer. It will produce a commented .ll file. Also, llvm-dg-dump has the switch -func that makes it to dump only a given function.

Slicing of LLVM DG is in this file: https://github.com/mchalupa/dg/blob/master/include/dg/llvm/LLVMSlicer.h
but if you want to only dump the information, you probably want to just take the built dependence graph and dump some information about it. For that, you can find inspiration here: https://github.com/mchalupa/dg/blob/master/tools/llvm-dg-dump.cpp

In particular, you will want to get your function from this map:
https://github.com/mchalupa/dg/blob/master/include/dg/llvm/LLVMDependenceGraph.h#L205
and then just iterate over the dependence graph and dump information about it.

@hrshy0629
Copy link
Author

@mchalupa I am not sure it is a bug. When I use llvm-slicer -sc LINE#err -entry FUNCNAME ***.bc, I find that it is good when the LINE is [132, 148], but it reports "Segmentation fault (core dumped)" when the LINE is [143, 145], which means that we do not choose the VAR that is in if/call_statement? Are there any other restrictions when using DG?
`static int jread(struct buffer_head **bhp, journal_t *journal,
unsigned int offset)
{
132# int err;
...
143# err = jbd2_journal_bmap(journal, offset, &blocknr);

145# if (err) {
printk(KERN_ERR "JBD2: bad block at offset %u\n",
offset);
148# return err;
}`

@hrshy0629
Copy link
Author

@mchalupa Hi, I find a bug. When the function instMatchesCrit(tools/llvm-slicer-crit.cpp) is matching instruction, it ignore the PHI instruction. The line in debug info of PHI is !6084 = !DILocation(line: 0, scope: !6073), you will get the line which is 0, which is error. The more reliable method is as follows:

`

        cleanup:
		%retval.0 = phi i32 [ -117, %if.then ], [ %call1, %if.then2 ], [ 0, %if.end22 ], [ -5, %if.then20 ], [ -12, %if.end4.cleanup_crit_edge ], !dbg !6084
		!6084 = !DILocation(**line: 0**, scope: !6073)
	%if.then:
		br label %cleanup, !dbg !6094
		!6094 = !DILocation(line: 140, column: 3, scope: !6093)
	%if.then2:
		br label %cleanup, !dbg !6102
		!6102 = !DILocation(line: 148, column: 3, scope: !6101)
	%if.end22
		br label %cleanup, !dbg !6144
		!6144 = !DILocation(line: 171, column: 2, scope: !6073)
	 %if.then20
		br label %cleanup, !dbg !6142
		!6142 = !DILocation(line: 167, column: 3, scope: !6140)
	if.end4.cleanup_crit_edge
		br label %cleanup, !dbg !6109
            !6109 = !DILocation(line: 152, column: 6, scope: !6073)

`

And the true line is as follows:
`
test

`
I think we can fix this bug.

@mchalupa
Copy link
Owner

mchalupa commented Jun 9, 2022

Thanks for looking into that. If you know how to fix this, can you prepare a pull request?

@hrshy0629
Copy link
Author

I need help! I debug the function instMatchesCrit. But I do not know why the function returns true when obj.empty() is true? If the obj is not NONE and we pass the line check, should we return true?

@mchalupa
Copy link
Owner

If obj.empty() is true, then obj is NONE. The rest of the code assumes, that there is some object we want to match and if there is no object that we want to specifically match, the instruction satisfies the criterion (it matches line and function). Or I do not understand the problem?

@hrshy0629
Copy link
Author

hrshy0629 commented Jun 13, 2022

The overview of DG is: 1)match the input objs to Inst, but dg required the Insts of the objs can read or write the memory; 2) the class of LLVMSlicer initializes the different graphs based on the input bc file; 3) DG queries the Insts from the different graphs and output the results of slices. But the Insts which can not read or write the memory will be discarded, which will result in inaccurate results.
Such as the example. The Inst br label %cleanup, !dbg !6102 has little to do with anything else, but it was part of our requirements.


%if.then2:
		br label %cleanup, !dbg !6102
		!6102 = !DILocation(line: 148, column: 3, scope: !6101)

Do I understand this correctly?

@mchalupa
Copy link
Owner

If an instruction is sliced away, it means none of slicing criterions depend on it, but it doesn't matter if it accesses memory or not. It is true, though, that matching slicing criteria is restricted to instructions that access memory or to function calls. But you can hack this part rather easily, I would say. Also, there is a switch 'slicing-criteria-are-next-inst' that makes the slicing criteria be the successors of the matched instructions. So what you can do is to insert calls of a dummy function before any instruction you want and use this switch to mark it as a slicing criterion.

@hrshy0629
Copy link
Author

Hi, I would like to ask whether the fault tolerance rate of DG is relatively small. I have tried many cases and encountered the problem of segment error.

`
../../llvm-slicer --sc mlxreg_fan_set_cur_state#379#err --cutoff-diverging=1 --cda ntscd --forward=1 --entry mlxreg_fan_set_cur_state 000cc5bc49aafc83a131bab2becf9922f5eec658_drivers@[email protected]
SC: Matched 'mlxreg_fan_set_cur_state#379#err' to:
call void @__asan_load8_noabort(i64 %34), !dbg !4009
%35 = load %struct.regmap*, %struct.regmap** %33, align 8, !dbg !4009
call void @__asan_load4_noabort(i64 %37), !dbg !4010
%38 = load i32, i32* %36, align 4, !dbg !4010
%call95 = call i32 @regmap_write(%struct.regmap* %35, i32 %38, i32 %div93.zext) #12, !dbg !4013
[llvm-slicer] cutoff 5 diverging blocks and 193 completely removed
WARNING: matched due to a lack of information: %33 = load %struct.regmap*, %struct.regmap** %31, align 8, !dbg !4005
WARNING: matched due to a lack of information: %36 = load i32, i32* %34, align 4, !dbg !4006
WARNING: matched due to a lack of information: %call95 = call i32 @regmap_write(%struct.regmap* %33, i32 %36, i32 %div93.zext) #13, !dbg !4009
SC: Matched 'mlxreg_fan_set_cur_state#379#err' to:
%33 = load %struct.regmap*, %struct.regmap** %31, align 8, !dbg !4005
%36 = load i32, i32* %34, align 4, !dbg !4006
%call95 = call i32 @regmap_write(%struct.regmap* %33, i32 %36, i32 %div93.zext) #13, !dbg !4009
#0 0x00007fe36401285c llvm::sys::PrintStackTrace(llvm::raw_ostream&, int) (/home/sunhy/slice_llvm/dg-master/build/tools/libdgllvmslicer.so+0x34d85c)
#1 0x00007fe364010604 llvm::sys::RunSignalHandlers() (/home/sunhy/slice_llvm/dg-master/build/tools/libdgllvmslicer.so+0x34b604)
#2 0x00007fe364010773 SignalHandler(int) (/home/sunhy/slice_llvm/dg-master/build/tools/libdgllvmslicer.so+0x34b773)
#3 0x00007fe3621c3980 __restore_rt (/lib/x86_64-linux-gnu/libpthread.so.0+0x12980)
#4 0x00000000004c6ad5 std::_Rb_tree<dg::LLVMNode*, dg::LLVMNode*, std::_Identitydg::LLVMNode*, std::lessdg::LLVMNode*, std::allocatordg::LLVMNode* >::_M_get_insert_unique_pos(dg::LLVMNode* const&) /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/stl_tree.h:0:0
#5 0x00000000004c6ad5 std::pair<std::_Rb_tree_iteratordg::LLVMNode*, bool> std::_Rb_tree<dg::LLVMNode*, dg::LLVMNode*, std::_Identitydg::LLVMNode*, std::lessdg::LLVMNode*, std::allocatordg::LLVMNode* >::_M_insert_unique<dg::LLVMNode* const&>(dg::LLVMNode* const&) /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/stl_tree.h:2091:4
#6 0x00007fe363ab9f4a std::set<dg::LLVMNode*, std::lessdg::LLVMNode*, std::allocatordg::LLVMNode* >::insert(dg::LLVMNode* const&) /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/stl_set.h:0:9
#7 0x00007fe363ab9f4a dg::DGContainer<dg::LLVMNode*, 4u>::insert(dg::LLVMNode*) /home/sunhy/slice_llvm/dg-master/include/dg/ADT/DGContainer.h:33:46
#8 0x00007fe363ab9f4a dg::Node<dg::LLVMDependenceGraph, llvm::Value*, dg::LLVMNode>::_addBidirectionalEdge(dg::LLVMNode*, dg::LLVMNode*, dg::EdgesContainer<dg::LLVMNode, 4u>&, dg::EdgesContainer<dg::LLVMNode, 4u>&) /home/sunhy/slice_llvm/dg-master/include/dg/Node.h:356:24
#9 0x00007fe363ab9f4a dg::Node<dg::LLVMDependenceGraph, llvm::Value*, dg::LLVMNode>::addUseDependence(dg::LLVMNode*) /home/sunhy/slice_llvm/dg-master/include/dg/Node.h:71:16
#10 0x00007fe363ab9f4a dg::LLVMDependenceGraph::addDefUseEdges(bool) /home/sunhy/slice_llvm/dg-master/lib/llvm/LLVMDependenceGraph.cpp:1274:25
#11 0x00000000004c3838 dg::llvmdg::LLVMDependenceGraphBuilder::_timerStart() /home/sunhy/slice_llvm/dg-master/include/dg/llvm/LLVMDependenceGraphBuilder.h:73:40
#12 0x00000000004c3838 dg::llvmdg::LLVMDependenceGraphBuilder::_runControlDependenceAnalysis() /home/sunhy/slice_llvm/dg-master/include/dg/llvm/LLVMDependenceGraphBuilder.h:93:9
#13 0x00000000004c3838 dg::llvmdg::LLVMDependenceGraphBuilder::computeDependencies(std::unique_ptr<dg::LLVMDependenceGraph, std::default_deletedg::LLVMDependenceGraph >&&) /home/sunhy/slice_llvm/dg-master/include/dg/llvm/LLVMDependenceGraphBuilder.h:235:9
#14 0x00000000004b61f4 std::__uniq_ptr_impl<dg::LLVMDependenceGraph, std::default_deletedg::LLVMDependenceGraph >::_M_ptr() const /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/unique_ptr.h:147:42
#15 0x00000000004b61f4 std::unique_ptr<dg::LLVMDependenceGraph, std::default_deletedg::LLVMDependenceGraph >::get() const /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/unique_ptr.h:332:21
#16 0x00000000004b61f4 std::unique_ptr<dg::LLVMDependenceGraph, std::default_deletedg::LLVMDependenceGraph >::release() /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/unique_ptr.h:354:16
#17 0x00000000004b61f4 std::unique_ptr<dg::LLVMDependenceGraph, std::default_deletedg::LLVMDependenceGraph >::operator=(std::unique_ptr<dg::LLVMDependenceGraph, std::default_deletedg::LLVMDependenceGraph >&&) /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/unique_ptr.h:278:12
#18 0x00000000004b61f4 Slicer::computeDependencies() /home/sunhy/slice_llvm/dg-master/tools/include/dg/tools/llvm-slicer.h:105:13
#19 0x00000000004b7821 std::_Rb_tree<dg::LLVMNode*, dg::LLVMNode*, std::_Identitydg::LLVMNode*, std::lessdg::LLVMNode*, std::allocatordg::LLVMNode* >::_Rb_tree_impl<std::lessdg::LLVMNode*, true>::_Rb_tree_impl() /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/stl_tree.h:688:28
#20 0x00000000004b7821 std::_Rb_tree<dg::LLVMNode*, dg::LLVMNode*, std::_Identitydg::LLVMNode*, std::lessdg::LLVMNode*, std::allocatordg::LLVMNode* >::_Rb_tree() /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/stl_tree.h:913:7
#21 0x00000000004b7821 std::set<dg::LLVMNode*, std::lessdg::LLVMNode*, std::allocatordg::LLVMNode* >::set() /usr/lib/gcc/x86_64-linux-gnu/7.5.0/../../../../include/c++/7.5.0/bits/stl_set.h:157:7
#22 0x00000000004b7821 Slicer::mark(std::set<dg::LLVMNode*, std::lessdg::LLVMNode*, std::allocatordg::LLVMNode* >&) /home/sunhy/slice_llvm/dg-master/tools/include/dg/tools/llvm-slicer.h:132:34
#23 0x00000000004b5750 main /home/sunhy/slice_llvm/dg-master/tools/llvm-slicer.cpp:287:9
#24 0x00007fe3614a2c87 __libc_start_main /build/glibc-uZu3wS/glibc-2.27/csu/../csu/libc-start.c:344:0
#25 0x00000000004b4aba _start (/home/sunhy/slice_llvm/dg-master/build/tools/llvm-slicer+0x4b4aba)
PLEASE submit a bug report to https://bugs.llvm.org/ and include the crash backtrace.
Segmentation fault (core dumped)

`

Is there any way to avoid or solve this problem?

@mchalupa
Copy link
Owner

Hi, I would like to ask whether the fault tolerance rate of DG is relatively small.

What do you mean by fault tolerance?

Is there any way to avoid or solve this problem?

What happens if you do not use cutoff-diverging? This option is meant for backward slicing (it is not explicitly written in the help message, but its nature kind of implies it) and I see you use forward=1.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants