
[LLVM/RUNTIME] Support Parallel for on CPU #54

Merged · 1 commit merged into master on Feb 26, 2017

Conversation

@tqchen (Member) commented Feb 26, 2017

#50. Need to wait for #53 to be merged.
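For context, a minimal sketch of how a parallel for-loop on CPU is used from the Python side, written against the `tvm.te` schedule API of later TVM releases; the API surface and runtime entry points actually introduced by this PR are not shown here, and the names below (tensors, target string) are illustrative only:

```python
import numpy as np
import tvm
from tvm import te

# Elementwise add whose outer loop is marked parallel, so the LLVM
# backend emits a parallel for that the CPU runtime dispatches to
# worker threads.
n = 1024
A = te.placeholder((n,), name="A", dtype="float32")
B = te.compute((n,), lambda i: A[i] + 1.0, name="B")

s = te.create_schedule(B.op)
s[B].parallel(B.op.axis[0])  # mark the loop as a parallel for

f = tvm.build(s, [A, B], target="llvm")

dev = tvm.cpu(0)
a = tvm.nd.array(np.random.rand(n).astype("float32"), dev)
b = tvm.nd.array(np.zeros(n, dtype="float32"), dev)
f(a, b)
np.testing.assert_allclose(b.numpy(), a.numpy() + 1.0, rtol=1e-5)
```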

@tqchen tqchen requested a review from icemelon February 26, 2017 00:51
@icemelon (Member) left a comment


lgtm

@tqchen tqchen merged commit f6c043e into master Feb 26, 2017
tqchen added a commit to tqchen/tvm that referenced this pull request May 26, 2018
* [TEST] Xavier initialization for benchmarks

* remove additional line
tqchen added a commit that referenced this pull request May 29, 2018
tqchen added a commit to tqchen/tvm that referenced this pull request Jul 6, 2018
sergei-mironov pushed a commit to sergei-mironov/tvm that referenced this pull request Aug 8, 2018
jroesch added a commit to jroesch/tvm that referenced this pull request Aug 29, 2018
* Start on Relay documentation

* Add more docs

* Copy over old manual text and setup document hierarchy

* Add sphinx_autodoc_annotation
icemelon pushed a commit to icemelon/tvm that referenced this pull request Apr 14, 2020
* Add tensorrt backend.

Fix merge

Fix merge and clean up logs

Add BiasAdd, Concat, padding ceil mode, and clean up code

Fix formatting and remove unused headers

uncomment models

Fix bug with variable input, clean up

Don't split batch norm

Move TRT execution to TrtExecutor

Clean up

Clean up

Add partitioning

Implement graph_runtime execution for Relay/TRT

Fix bug in extern op

Fix compilation

Add EnableTrt pass to perform same modification as previous wholegraphannotator

Re-enable NNVM TRT

Remove SimplifyBatchnorm, add rules for converting ops

Fix format, remove unused tests

Enable multiple outputs

Fix multiple outputs

Fix activation lookup

Fix no newline at eof

Add license header. Add consistency test to models

Add method to check TRT used. Improve comments

Fix lint

Add util to check TRT version

Add if guards around TRT5.1 APIs

Add env var for workspace size, fix logger

fix build

Add TRT versioning to EnableTrt pass

Fix build error in DLR

Fix compile for DLR

Update dmlc-core, fix copyright header, undo change to includes

Remove unused headers

Fix IsTrtCompatible visitor and move op list to constructor

Add dropout to compatible ops for CheckTrtCompatible only. Add not compatible test

Add squeeze, transpose, reshape, pad, and reduce ops. Add transpose on weights workaround

Fix formatting. Add unit tests

Support transpose on weights for conv2d and dense. Support asymmetric padding. Temp fix for 1D inputs. Add units tests for all ops.

Support StridedSlice, AdaptivePooling approximation, Pytorch addmm fixer pass

Support (2,3,0,1) transpose on weights

Allow stride to be incomplete. Support ConstantNode -> kWeight

Fix pass serialized graph by value in runtime. Allow inclusive count for strided pool

Comments, disable failing test

Fix CI lint

Removed unused variables from TrtBuilder. Add more comments

Fix build for TRT4

Add GetTrtVersion(), move convert map to function, remove unneeded include, make batch_size_, logger_ TrtBuilder members, check output existence

Use shared_ptr for converters. Add check for num outputs and inputs

Support image.resize

Make GetOpConverters return a shared_ptr

Clarify count inclusive padding weirdness

Use external codegen/runtime

Move to src/runtime/contrib/tensorrt. Add Save and Load methods for tensorrt module. Rename some classes

Require format to be tensorrt so that loader knows how to load

FoldConstants

Destroy engine and context after use. Store TRT weights from op converters. Formatting

Always apply ConvertLayout to NCHW

Clean up

Add ASF header

Change ObjectRef -> NodeRef

Fix lint

Fix pylint

Fix bug with scalar weights

Making TRT cmake more informative

Make tensorrt tests dependent on whether trt codegen is enabled

Add serialization test.

* Refactor EnableTRT checkers

* Fix const weight detection

* remove tensorrt_module.h, add test for multiple outputs. Use normal GetShape. Remove GetType. Add flag for additional model testing

Undo add comments to prevent conflicts

* Separate TRT from relay. Add docstrings and more comments. Move all passes to python. Remove double lookup for Execute

Formatting

Fix lint

Fix pylint

Rename codegen_tensorrt. Check registry get. Add comments

Make trt codegen off by default.

* disable for ci

* TRT codegen can be turned on independently

* Fix tests

* Fix build without runtime

* Enable AvgPool approximation

* Remove change to cmake config

* Move passes to PreprocessForTrt. Use op.name. Rename LegalizeLayoutTransform.

* Add newline to EOF. Remove else. Reserve space for vectors

* Remove AdaptivePool2D commented-out code. Add comment for transposed weight workaround

* Rename IsCompatibleFn

* Use ++i instead of i++

* Improve incompatible messages, use string::empty, small improvements

* Use constructor to fill func_params

* Remove std::move

* Use opt level 3, add helper to check whether to run test, improve load_params

* Replace TransposeRSCKtoCKRS/KCRS with TransposeWeights4D

* Clean up VisitExpr(CallNode) for args
MasterJH5574 pushed a commit to MasterJH5574/tvm that referenced this pull request Feb 26, 2022
* nn module

* address comments.

* Add nn.init_params

* Remove nn.Builder and use BlockBuilder instead.

* Rebase.

* Refactor block builder and add tests.

* Address comments.

* Update.
MasterJH5574 pushed a commit to MasterJH5574/tvm that referenced this pull request Mar 3, 2022
vinx13 pushed a commit to vinx13/tvm that referenced this pull request Mar 9, 2022
rebased

[TIR][Schedule] fix reorder/buffer_flatten & finish CPU demo (apache#59)

[CPU DEMO] Update cpu gemm demo and fix bug (apache#58)

* [TIR][Schedule] introduce parallel and fix bugs for cpu demo

* [TIR][Schedule] update cpu demo

* [TIR][Schedule] fix lint

* [TIR][Schedule] fix

rebased

[TIR][Schedule] introduce reduction block and CPU demo (apache#53)

* [TIR] reduction : split_reduction

* [TIR] reduction : split_reduction

* [TIR] reduction : fuse_reduction

* [TIR] reduction : cpu demo

* [TIR] reduction : fix

* [TIR] reduction : pattern detect remains

* [TIR] reduction : pattern detect remains

* [TIR] reduction : pattern match done

* [TIR] reduction : fix lint

* [TIR] reduction : fix

* [TIR] reduction : fix

* [TIR] reduction : fix

* [TIR] reduction : fix

* [TIR] reduction : rebased

* [TIR] reduction : rebased

[TIR][Schedule] introduce cache_read cache_write (apache#54)

* [TIR][Schedule] introduce cache_read cache_write

* [TIR][Schedule] add more comments

* [TIR][Schedule] fix problem and add comments

* [TIR][Schedule] address comments

[TIR] schedule: introduce vectorize, unroll, loop validation (apache#47)

* [TIR] vectorize : basically complete

* [TIR] vectorize&unroll : update comments&unroll

* [TIR] vectorize&unroll : rebased

* [TIR] vectorize, unroll, cpu_demo: done

* [TIR] vectorize, unroll, cpu_demo: simplify

* [TIR] vectorize, unroll, cpu_demo: fix

* [TIR] reduction : rebased

* [TIR] reduction : fix

[TIR][Schedule] fix sref and scopes problem during replace and compute_at (apache#50)

* [TIR][Schedule] fix sref and scopes problem during replace and compute_at

* [TIR][Schedule] fix

* [TIR][Schedule] fix

[TIR][Refactor] move function to ScheduleNode

[TIR] Schedule: introduce primitive compute_at (apache#36)

* [TIR] Schedule: introduce primitive compute_at

* [TIR] Schedule: address comments

* [TIR] Schedule: address comments

* [TIR] Schedule: address comments

* [TIR] Schedule: add check to compute_at

* [TIR] Schedule: address comments

* [TIR] Schedule: address comments

[TIR] Schedule: introduce primitive reorder (apache#37)

* [Schedule] debug

* [TIR] Schedule: reorder, loop type detect remains

* [TIR] reorder complete

* [TIR] reorder complete

* [TIR] fix

* [TIR] reorder : rebased complete

* [TIR] reorder : fix container.h

* [TIR] reorder : fix

* [TIR] reorder : fix

* [TIR] reorder : fix

* [TIR] reorder : simplify

* [TIR] reorder : simplify

* [TIR] reorder : simplify

* [TIR] reorder : fix

* [TIR] reorder : fix

* [TIR] reorder : rebased

* [TIR] reorder : rebased

rebase

[TIR] Schedule: introduce BlockRealize and Block SRef reuse (apache#39)

* [TIR] BlockRealize: schedule refactor

* [TIR] BlockRealize: debug

* [TIR] BlockRealize finish

* [TIR] BlockRealize finish

* [TIR] BlockRealize fix

* [TIR] BlockRealize update test

* [TIR] BlockRealize: add loop var reuse

* [TIR] BlockRealize: add loop var reuse

* [TIR] BlockRealize: fix

* [TIR] BlockRealize: fix

* [TIR] BlockRealize: fix

* [TIR] BlockRealize: fix

* [TIR] BlockRealize: fix

* [TIR] BlockRealize: fix

* [TIR] BlockRealize: fix

* [TIR] BlockRealize: fix

* [TIR] BlockRealize: fix

* [TIR] BlockRealize: fix

[TIR] compare for module (apache#38)

* [TIR] compare for module

* [TIR] fix

* [TIR] fix

* [TIR] fix

* [TIR] fix

* [TIR] fix

* [TIR] fix

[Hybrid] Module init

[Hybrid] Module print

[Hybrid] Module print with meta

[Hybrid] adjust

[Hybrid] finished but without lint and comment check

[Hybrid] fix lint

[Hybrid] comments

[Hybrid] fix script decoration API

[Hybrid] using IRModule

[Hybrid] fix

[Hybrid] adjust API

[Hybrid] fix

[Hybrid] fix

[Hybrid] fix

[Hybrid] fix symbol table, adjust API, introduce meta_mutator and resolve import issue

[Hybrid] fix lint

[TIR] introduce pass BufferFlatten (apache#32)

* [TIR] introduce pass BufferFlatten

* [Tir] add comments & remove old TeLower

* [TIR] split GatherRegion and BufferFlatten to two Visitor/Mutator

* [TIR] address comments: Only consider stmt scope

* [TIR] BufferFlatten: address comments

* [TIR] BufferFlatten: fold BlockFlattener into BufferFlattener

* [TIR] BufferFlatten: add asserts

* [TIR] BufferFlatten: use Equal in testcase

* [TIR] Equal Pass: Enhanced the pass

* [TIR] Equal Pass: add comments

[Hybrid] refactor using Doc, introduce annotation, enhance parser (apache#28)

* [Hybrid] refactor printer, enhance parser

* [Hybrid] refactor

* [Hybrid] fix

* [Hybrid] fix

* [Hybrid] fix namespace issue

* [Hybrid] compare using Equal

[TIR] rebased

[TE] fix replace again and add primitive fuse and split (apache#27)

* [TE] add: schedule primitive fuse

* [TE] add: schedule primitive split

* [TE] address comments: add IRSubstitueInScope and other minor fix

* [TE] address comments: Enhance Equal api and fix split by nparts

* [TE] address comments

[Hybrid] introduce printer (apache#25)

* [Hybrid] substitute Block with SeqStmt, change block() syntax

* [Hybrid] add printer, type declare intrin

* [Hybrid] refactor

* [Hybrid] meta

* [Hybrid] refactor

* [Hybrid] macro

[TE] fix replace (apache#23)

* [TE] fix replace

* [TE] fix replace: add more tests

* [TE] fix replace: add more tests

[TE] rebased

[Hybrid] python syntax parser (apache#20)

* [Hybrid] python syntax parser

* [Hybrid] add a testcase

* [Hybrid] improve comments and fix bugs

* [Hybrid] improve comments, refactor __internal_assert, add new testcases

* [Hybrid] improve error report message, refactor intrin

* [Hybrid] separate ScopeEmitter from parser

* [Hybrid] refactor type check

* [Hybrid] refactor intrin

* [Hybrid] refactor intrin, allow register external functions with argument type checking, add a testcase

* [Hybrid] address comments, fix a bug in te/ir.h

* [Hybrid] remove type check

* [Hybrid] python syntax parser

* [Hybrid] add a testcase

* [Hybrid] improve comments and fix bugs

* [Hybrid] improve comments, refactor __internal_assert, add new testcases

* [Hybrid] improve error report message, refactor intrin

* [Hybrid] separate ScopeEmitter from parser

* [Hybrid] refactor type check

* [Hybrid] refactor intrin

* [Hybrid] refactor intrin, allow register external functions with argument type checking, add a testcase

* [Hybrid] address comments, fix a bug in te/ir.h

* [Hybrid] remove type check

* [Hybrid] refactor intrin, scope_handler, special_stmt

* [Hybrid] address comments

* [Hybrid] clean code, improve error reporting & testcase

* [Hybrid] clean code

* [Hybrid] clean code

[IR] introduce dependency graph and write map

[TE] refactor and clean codebase

[TE] refactor IR

[TE] introduce schedule, dependency graph and support fuse and split (apache#17)

* fix lint

* introduce dependency graph

* enable create schedule

* support get axes

* fix lint

* revert Set

* add schedule primitive fuse

* address comment

* support split

[IR] Introduce SeqStmt

add TeLower pass and enable to run Te IR (apache#15)

* add function data structure
add TeLower pass to transform Te to current IR
enable to run Te IR

* address comments

* unify terminology

TensorIR data structure init (apache#14)

* init te data structure

* finish printer and enhanced ir_builder

* address the comments

Co-authored-by: Bohan Hou <[email protected]>
jinhongyii pushed a commit to jinhongyii/tvm that referenced this pull request Jun 20, 2022
cyx-6 added a commit to cyx-6/tvm that referenced this pull request Jun 29, 2022
* `ast.Expr` and concise scope

* add `BufferStore`
junrushao pushed a commit to cyx-6/tvm that referenced this pull request Jul 4, 2022
cyx-6 added a commit to cyx-6/tvm that referenced this pull request Jul 13, 2022
Hzfengsy pushed a commit to Hzfengsy/tvm that referenced this pull request Jul 30, 2022
Hzfengsy pushed a commit to Hzfengsy/tvm that referenced this pull request Jul 30, 2022
areusch pushed a commit to areusch/tvm that referenced this pull request Sep 20, 2022
gigiblender pushed a commit to gigiblender/tvm that referenced this pull request Nov 3, 2022
MasterJH5574 pushed a commit to MasterJH5574/tvm that referenced this pull request Nov 20, 2022
yelite pushed a commit to yelite/tvm that referenced this pull request Feb 17, 2023
vinx13 pushed a commit to vinx13/tvm that referenced this pull request Mar 27, 2023
It's that time again: another merge with tvm/unity to grab the latest improvements.
mikeseven pushed a commit to mikeseven/tvm that referenced this pull request Sep 27, 2023
Develop pr

Approved-by: Mikael Sevenier
masahi added a commit to masahi/tvm that referenced this pull request Mar 13, 2024
elvin-n pushed a commit to Deelvin/tvm that referenced this pull request Mar 19, 2024
vinx13 pushed a commit to vinx13/tvm that referenced this pull request Mar 19, 2024
krishnaraj36 pushed a commit to krishnaraj36/tvm_mainline that referenced this pull request Aug 9, 2024
We can now build one binary and use it across targets

Co-authored-by: Siva <[email protected]>
LeiWang1999 added a commit to LeiWang1999/tvm that referenced this pull request Nov 8, 2024
* improve e4m3 decoding.

* append fp16xint1

* Update submodule commit reference

* chore: Update shared memory scope for float32 output dtype

* BUGFIX: UINT8/INT8 Decoding

* feat: Add rasterization options for roller module

* Refactor tensorcore_legalization method to optimize tensor core usage

* feat: Add function to collect variables from expression, improve for splitk

* chore: Update typing import in __init__.py

* chore: Refactor CPU execution of operators

* Refactor matmul implementation for splitk layout

* Refactor matmul implementation for splitk layout

* Refactor matmul implementation for splitk layout

* chore: Update version to 0.0.1.dev8

* chore: Enable debug output in bitblas.set_debug_level()

* Refactor Linear module matmul implementation for splitk layout

* Refactor matmul implementation for splitk layout

* Refactor CUDA kernel launch string for dynamic symbolic set

* Bump version to v0.0.1.dev9

* Refactor CUDA kernel launch string for dynamic symbolic set

* Bump version to v0.0.1.dev10

* Refactor CUDA kernel launch string for dynamic symbolic set

* Bump version to v0.0.1.dev12 and add MatmulConfigWithSplitK and MatmulWithSplitK

---------

Co-authored-by: LeiWang199 <leiwang199>