Return empty CSourceModule when no lowered_funcs exists in Relay mod #4847

Merged on Mar 16, 2020 (13 commits)

kumasento
Contributor

This PR implements the dummy function idea as mentioned in #4748 - when the whole Relay module is optimized to empty, we can insert a dummy operator that allows TVM to still produce a library.

@kumasento kumasento requested review from jroesch, tqchen, zhiics and FrozenGene and removed request for FrozenGene February 8, 2020 20:46
@tqchen
Member

tqchen commented Feb 10, 2020

cc @zhiics @FrozenGene

Contributor

@mbaret mbaret left a comment

Thanks for contributing this - I've hit this issue myself in the case where the entire graph is off-loaded to external codegen. I wonder whether there's a case for making tvm::build return a 'dummy' module in the case of no lowered_funcs being provided? That way we could avoid having to put the workaround here.

Stmt body = EvaluateNode::make(0);
Array<ObjectRef> api_args;
auto dummy_func = MakeAPI(body, "__dummy__", api_args, 0, false);
lowered_funcs.Set("llvm", Array<LoweredFunc>({dummy_func}));
Contributor

Is defaulting to LLVM the correct behaviour here (e.g. will this fall over if we build without LLVM support)?

Member

I think we should set target_host_. Even if we have LLVM support, defaulting to LLVM is still not correct; imagine our target host is ARM.

Contributor Author

Thank you guys for your kind comments. We don't need to set target in the latest commit.

lowered_funcs,
target_host_,
BuildConfig::Current());
LOG(WARNING) << "No lowered funcs exist in the compiled module, "
Contributor

Do we need to retain this warning? With external codegen, having no lowered funcs can be a perfectly normal mode of operation.

Contributor Author

Thanks @mbarrett97, I've removed that log.

@tqchen
Member

tqchen commented Feb 10, 2020

I agree that an empty module perhaps provides a useful middle ground. The closest thing so far might be CSourceModule with an empty string: https://github.com/apache/incubator-tvm/blob/master/src/target/source/source_module.cc#L190

@zhiics
Member

zhiics commented Feb 11, 2020

CSourceModule with an empty string looks good to me as well. @kumasento could you do that instead of creating a dummy LLVM module? Thanks.

@kumasento
Contributor Author

Thank you guys for all your kind reviews! @mbarrett97 @FrozenGene @tqchen @zhiics

I've updated this PR with the following revisions:

  1. Use CSourceModule with an empty string as the returned module object ret_.mod, instead of creating a dummy lowered function.
  2. Remove the warning, based on @mbarrett97's review.

Now the generated module looks clean and tidy. No redundant dummy function is generated, and no extra design decisions (such as which target to pick) need to be made.

Please let me know if there is anything else you feel should be done. Thanks!
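The revision described above can be pictured with a small standalone sketch. It does not use TVM's real types: `ToyHostModule`, `BuildHost`, and the `"empty-source"` / `"host"` labels are hypothetical names chosen only for illustration. The idea it tries to capture is that an empty set of lowered functions yields an empty source module rather than a module carrying an injected dummy function.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the host module produced by the build step.
struct ToyHostModule {
  std::string type_key;                // illustrative label, not a TVM type key
  std::string source;                  // source text; stays empty in the fallback
  std::vector<std::string> functions;  // names of lowered functions it exposes
};

// Build the host module from the lowered function names.
ToyHostModule BuildHost(const std::vector<std::string>& lowered_funcs) {
  ToyHostModule host;
  if (lowered_funcs.empty()) {
    // Fallback: an empty source module, with no dummy function and no target choice.
    host.type_key = "empty-source";
  } else {
    host.type_key = "host";
    host.functions = lowered_funcs;
  }
  // No "__dummy__" symbol is ever injected: the module exposes exactly the lowered functions.
  assert(host.functions.size() == lowered_funcs.size());
  return host;
}
```

Calling `BuildHost({})` in this sketch returns a module with no functions and an empty source string, which mirrors the intent of returning an empty CSourceModule.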

@kumasento kumasento force-pushed the dev-relay-dummy branch 3 times, most recently from b3b15f2 to 1f13b9e Compare February 11, 2020 21:52
src/relay/backend/build_module.cc
@@ -438,13 +439,14 @@ class RelayBuildModule : public runtime::ModuleNode {

auto lowered_funcs = graph_codegen_->GetLoweredFunc();
if (lowered_funcs.size() == 0) {
  LOG(WARNING) << "no lowered funcs exist in the compiled module";
  ret_.mod = tvm::codegen::CSourceModuleCreate("", "");
Member

Sorry for the back and forth. Could you please add a comment here so that people know what we are doing?

Member

You can force push so that your previous CI could be terminated earlier.

Contributor Author

Sure thing, just added that.

@mbaret
Contributor

mbaret commented Feb 12, 2020

I don't think the empty CSourceModule method works. There's a check in source_module.cc that fails when you try and create one with an empty string.

@kumasento
Contributor Author

Hi @mbarrett97 thanks for your comment. Currently I haven't met such an issue while testing. Would you mind letting me know which assertion you were referring to?

Contributor

@mbaret mbaret left a comment

@kumasento kumasento changed the title Use dummy func when no lowered_funcs exists in Relay mod Return empty CSourceModule when no lowered_funcs exists in Relay mod Feb 12, 2020
@kumasento
Contributor Author

@mbarrett97 Thanks, I just noticed that the base of my PR is not the latest commit. I will update it soon.

@tqchen
Member

tqchen commented Mar 10, 2020

ping @FrozenGene, please follow up

auto target = args[0].operator std::string();
auto module_name = args[1].operator std::string();

// create a default data layout
Member

Sorry for the late response. Minor comment: the logic here creates not only a default data layout but also a default target triple. We should update the comment.

Contributor Author

Thanks @FrozenGene, I've updated the comments based on your feedback :)

Member

@FrozenGene FrozenGene left a comment

LGTM

@FrozenGene
Member

@mbaret @zhiics @tqchen could you take another look?

@tqchen tqchen merged commit 11ee1a0 into apache:master Mar 16, 2020
@tqchen
Member

tqchen commented Mar 16, 2020

Thanks @kumasento @FrozenGene @mbaret @zhiics. This PR is now merged.

@kumasento kumasento deleted the dev-relay-dummy branch March 16, 2020 21:34
@trevor-m
Contributor

This commit is causing segfaults when my entire relay program is offloaded to an external codegen.

@kumasento
Contributor Author

kumasento commented Mar 17, 2020 via email

@trevor-m
Contributor

trevor-m commented Mar 17, 2020

Hi @kumasento, the graph runtime is trying to get my external codegen functions from the empty LLVM module.

Stack trace shows the segfault is coming from LLVMModuleNode, so I added a print statement after this line which showed that GetFunction() was called on the LLVM module with name=tensorrt_29 which should be going to my external codegen module instead.

Stack trace:

[18:07:04] /data/neo-ai-tvm/src/target/llvm/llvm_module.cc:60: LLVMModuleNode::GetFunction() func name: tensorrt_29

  [bt] (0) /home/ubuntu/anaconda3/lib/python3.6/site-packages/mxnet/libmxnet.so(+0x2e6b140) [0x7f3b510c0140]
  [bt] (1) /lib/x86_64-linux-gnu/libc.so.6(+0x354b0) [0x7f3bdd7134b0]
  [bt] (2) /lib/x86_64-linux-gnu/libc.so.6(strlen+0x26) [0x7f3bdd769746]
  [bt] (3) /data/neo-ai-tvm/build/libtvm.so(tvm::codegen::LLVMModuleNode::LazyInitJIT()+0x894) [0x7f3bcca32ad4]
  [bt] (4) /data/neo-ai-tvm/build/libtvm.so(tvm::codegen::LLVMModuleNode::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, tvm::runtime::ObjectPtr<tvm::runtime::Object> const&)+0x488) [0x7f3bcca334b8]
  [bt] (5) /data/neo-ai-tvm/build/libtvm.so(tvm::runtime::ModuleNode::GetFunction(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, bool)+0x45) [0x7f3bcca757f5]
  [bt] (6) /data/neo-ai-tvm/build/libtvm.so(tvm::runtime::GraphRuntime::CreateTVMOp(tvm::runtime::TVMOpParam const&, std::vector<DLTensor, std::allocator<DLTensor> > const&, unsigned long)+0x4d5) [0x7f3bccad8e05]
  [bt] (7) /data/neo-ai-tvm/build/libtvm.so(tvm::runtime::GraphRuntime::SetupOpExecs()+0x661) [0x7f3bccadb0d1]
  [bt] (8) /data/neo-ai-tvm/build/libtvm.so(tvm::runtime::GraphRuntime::Init(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&, tvm::runtime::Module, std::vector<DLContext, std::allocator<DLContext> > const&)+0x260) [0x7f3bccadcf60]

@kumasento
Contributor Author

Thanks @trevor-m

@FrozenGene sorry for bothering you but does this ring a bell?

I find it odd that GetFunction doesn't return an invalid/empty value directly when the function name cannot be found. Why would LazyInitJIT need to be called in this scenario?

@FrozenGene
Member

Please see: https://github.com/apache/incubator-tvm/pull/4847/files#diff-8baddb83a9684e8373691bb48a946900R469-R474

When the entire program is offloaded, I find that the previous logic would do:

// Execute the whole module using external runtime.
ret_.mod = ext_mods[0];

However, with the current logic we do:

// Import all external runtime modules.
for (const auto& it : ext_mods)
  ret_.mod.Import(it);

Could you double-check whether the logic is the same as before? Thanks.
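To make the difference between the two snippets above concrete, here is a minimal standalone sketch. It assumes a toy `ToyModule` type and two hypothetical helpers, `PackageByReplacement` and `PackageByImport`; none of these are TVM APIs. It only contrasts what the caller receives in each case when no lowered functions exist.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for a runtime module; not TVM's runtime::Module.
struct ToyModule {
  std::string kind;                                 // illustrative label only
  std::vector<std::shared_ptr<ToyModule>> imports;  // imported submodules
};
using ModulePtr = std::shared_ptr<ToyModule>;

// Earlier behaviour: the first external module itself becomes the result.
ModulePtr PackageByReplacement(const std::vector<ModulePtr>& ext_mods) {
  assert(!ext_mods.empty() && "replacement needs at least one external module");
  return ext_mods[0];
}

// Behaviour after this PR: an empty host module is created first,
// and every external module is imported into it.
ModulePtr PackageByImport(const std::vector<ModulePtr>& ext_mods) {
  auto host = std::make_shared<ToyModule>();
  host->kind = "empty-host";
  for (const auto& m : ext_mods) host->imports.push_back(m);
  return host;
}
```

In the replacement variant the caller talks to the external module directly, whereas in the import variant every symbol request goes through the empty host module first. That extra hop appears to be what the stack trace earlier in this thread is exercising.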

@kumasento
Contributor Author

Hi @FrozenGene

Thanks for your explanation. I do have a question about this part, hope you won't mind:

  1. Before this PR, we replace ret_.mod with ext_mods[0] when there is no lowered_funcs, which is sensible since no ret_.mod is available when lowered_funcs does not exist.
  2. Now we will always create a ret_.mod. What would the logic be if there is only one external module (the condition for ret_.mod replacement)? Should we do replacement or import?

Also @trevor-m would you mind sending me a minimal workable example? I would like to do the tracing myself as well.

Thanks

@FrozenGene
Member

I think we should do the ret_.mod replacement, as in the previous logic. @zhiics did the external codegen part, so he can answer this with more authority. @zhiics, could you help answer this question?

@zhiics
Member

zhiics commented Mar 21, 2020

I think changing it to an LLVM module and importing all submodules is okay. Now, if you only have an external module, you will need to create an LLVM module first and then import the external module into it.

Stepping into the LLVM module to find the symbol is not wrong, because we always try to find the symbol in the host module first. If it is not found, we then check each imported module. See the code here:

https://github.com/apache/incubator-tvm/blob/050f2bde2c694af9b5569ca954ca041c3767787b/src/runtime/module.cc#L65

A minimal example to reproduce this and track the root cause would be more helpful.
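The lookup order described above (host module first, then each imported module) can be sketched in isolation as follows. This is a toy model under stated assumptions, not the code behind the linked module.cc: `ToyModule`, `FindLocal`, and the `int()` function signature are made up for the example.

```cpp
#include <cassert>
#include <functional>
#include <map>
#include <memory>
#include <string>
#include <vector>

// Toy stand-in for a runtime module; not TVM's actual ModuleNode.
struct ToyModule {
  std::string name;
  std::map<std::string, std::function<int()>> functions;  // symbols defined here
  std::vector<std::shared_ptr<ToyModule>> imports;         // imported submodules

  // Register an imported submodule.
  void Import(std::shared_ptr<ToyModule> m) {
    assert(m != nullptr && "cannot import a null module");
    imports.push_back(std::move(m));
  }

  // Look only at the symbols this module defines itself.
  const std::function<int()>* FindLocal(const std::string& symbol) const {
    auto it = functions.find(symbol);
    return it == functions.end() ? nullptr : &it->second;
  }

  // Host first; if the symbol is missing and query_imports is set,
  // search the imported modules in order (depth-first).
  const std::function<int()>* GetFunction(const std::string& symbol,
                                          bool query_imports) const {
    if (const auto* f = FindLocal(symbol)) return f;
    if (!query_imports) return nullptr;
    for (const auto& m : imports) {
      if (const auto* f = m->GetFunction(symbol, true)) return f;
    }
    return nullptr;
  }
};
```

In this model, a lookup for an externally generated symbol such as `tensorrt_29` always touches the host module before any import, so the host's own lookup has to cope with names it does not define.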

@trevor-m
Contributor

You can reproduce this by running test_extern_dnnl() after commenting out this line: https://github.com/apache/incubator-tvm/blob/master/tests/python/relay/test_pass_partition_graph.py#L203

@kumasento
Contributor Author

Hi @trevor-m
Thanks for this information. I ran the test and reproduced the bug. I've found that the segfault is raised from the create function of llvm::EngineBuilder (this line).

Now I'm looking at the internal logic in LLVM to find out what the actual cause is. Please bear with me for 1-2 days.

@kumasento
Contributor Author

@trevor-m a tentative fix has been posted in #5146

trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Mar 25, 2020
…pache#4847)

* Use dummy func when no lowered_funcs exists in Relay mod

* Dummy func -> CSourceModule with empty code str

* Added comments describing the empty CSouceModule

* Always import external modules w/o assertions

* Use CSourceModule as a fallback for LLVMModule

* Changed cond for target == llvm

* Create an empty LLVM module w/o using dummy func

* Avoid using IR str concat to create LLVM module

* Improved comments for codegen.LLVMModuleCreate

* Satisfy the linter for LLVMModuleCreate
trevor-m pushed a commit to trevor-m/tvm that referenced this pull request Apr 16, 2020
zhiics pushed a commit to neo-ai/tvm that referenced this pull request Apr 17, 2020