[Ansor][AutoTVM v2.0] Phase 0: Ansor minimum system for auto schedule generating #5962

Merged
merged 80 commits into from
Jul 15, 2020
Changes from 70 commits
7ee0902
Code migration Start (#1)
jcf94 May 26, 2020
9fcbf0b
Split transform_step out & Update more UTs (#3)
jcf94 May 27, 2020
f43e82f
Add search_task, measure and serialization (#4)
jcf94 May 28, 2020
e0a5ed5
Add MetaTileRewritePolicy (#5)
jcf94 May 29, 2020
359905a
Basic Python API for State (#6)
jcf94 Jun 3, 2020
2032a64
Add Python API: Measure & Task (#7)
jcf94 Jun 4, 2020
6b21dc6
Add ansor.auto_schedule() API; First AutoSchedule working version(#8)
jcf94 Jun 4, 2020
e52135f
Bug fix & Add python serialization API (#10)
jcf94 Jun 5, 2020
1fe6638
Improve code style, python wrapper and test cases (#11)
merrymercy Jun 7, 2020
43d1530
fix unit tests
merrymercy Jun 8, 2020
f367d15
Add RPCRunner & OpenCL/CUDA test (#12)
jcf94 Jun 8, 2020
2bd6471
rebase to upstream/master
merrymercy Jun 8, 2020
c860f2c
Add Ansor basic tutorial (#13)
jcf94 Jun 8, 2020
f60d1a6
migrate feature extraction (#14)
merrymercy Jun 8, 2020
b839c0f
Add XGBModel & RPCRunnerWarpper (#15)
jcf94 Jun 9, 2020
cfe58d7
Migrate workload_registry.py (#16)
merrymercy Jun 9, 2020
143ea45
add task scheduler (#17)
merrymercy Jun 9, 2020
ed075c2
Add conv2d cuda tutorial with workload registry (#18)
jcf94 Jun 9, 2020
74ec7d0
add tune_test.py (the old tune_wkl.py) (#19)
merrymercy Jun 9, 2020
cd0a516
Code refine for tune_test.py & Add a pre load callback (#20)
jcf94 Jun 10, 2020
3a24e49
Add python custom sketch rule (#21)
jcf94 Jun 11, 2020
a155c1f
Ansor Relay Integration (without layout rewrite) (#22)
minminsun Jun 12, 2020
674027f
Add tune_op_subgraph.py & Some code clean for tune_network.py (#23)
jcf94 Jun 12, 2020
2f241ed
add explicit_unroll_max_extent (#25)
merrymercy Jun 12, 2020
18d44b8
Add Index simplification & API update (#26)
jcf94 Jun 15, 2020
4ea6712
Update PreLoadMeasuredStates & Some bug fix (#27)
jcf94 Jun 16, 2020
6126cdb
Add tensorize step for loop_state (#31)
jcf94 Jun 19, 2020
c7364df
State python api update (#33)
jcf94 Jun 19, 2020
36cd9ef
kernel layout rewrite (#28)
minminsun Jun 19, 2020
145e61c
[cache flush] port cache flush to ansor (#32)
FrozenGene Jun 19, 2020
2c27816
Improve relay integration (#34)
merrymercy Jun 20, 2020
0794875
Fix xgb error & Simplify dispatcher (#35)
merrymercy Jun 20, 2020
a4c4548
Rename "MetaTileRewritePolicy" to "SketchPolicy". (#36)
merrymercy Jun 20, 2020
593a2c7
rebase
merrymercy Jun 20, 2020
53bd591
Migrate all node::make to noderef's construct function (#37)
jcf94 Jun 22, 2020
8e53d12
Some lint fix & Recover the double constructor of tvm::PrimExpr (#39)
jcf94 Jun 23, 2020
cd5c5ad
Add MutateComputeLocation and MutateParallel in evolutionary search (…
merrymercy Jun 23, 2020
5860191
Improve loop state python API (stage_tensors -> stage_ops) (#41)
merrymercy Jun 23, 2020
14a19cd
ComputeDAG bug fix & Add Custom TensorCore Matmul Example (#42)
jcf94 Jun 24, 2020
b012e27
Rever Commits, Start to build minimum Ansor system
jcf94 Jun 24, 2020
d6d6b85
Code clean for minimum Ansor system
jcf94 Jun 24, 2020
4042cfa
Bug fix & Delete AccessAnalyzer
jcf94 Jun 28, 2020
7695def
Delete attachmap & Code clean
jcf94 Jun 28, 2020
0c200cd
Doc update
jcf94 Jun 28, 2020
9c35e50
Headfile update & Python doc update
jcf94 Jun 28, 2020
a015051
clang-format fix
jcf94 Jun 29, 2020
6823802
pylint fix
jcf94 Jun 29, 2020
a82dbb8
Update
jcf94 Jun 29, 2020
ac36c46
Doc update
jcf94 Jun 29, 2020
a62b1e0
Update
jcf94 Jun 30, 2020
3eac89d
Merge branch 'upstream_master' into upstream_0_new
jcf94 Jun 30, 2020
526cf42
Bug fix after code merge to the new master
jcf94 Jun 30, 2020
426ec82
clang-format fix
jcf94 Jun 30, 2020
907c17c
Update
jcf94 Jul 1, 2020
64f8f8d
Update
jcf94 Jul 1, 2020
1b16dd4
Update std::vector to Array; Update verbosity setting; Some commemts
jcf94 Jul 1, 2020
9fa897b
std::vector->Array & std::string->String
jcf94 Jul 2, 2020
f40c7af
Add init_state to ComputeDAG
jcf94 Jul 2, 2020
0a24daf
Update
jcf94 Jul 2, 2020
a45fd89
Update some unordered_map to Map
jcf94 Jul 2, 2020
bfc6663
clang-format fix
jcf94 Jul 2, 2020
eb02e77
Comments addressed
jcf94 Jul 3, 2020
cb2442f
Lint fix
jcf94 Jul 3, 2020
b1ca20c
Update
jcf94 Jul 3, 2020
49dbec6
Merge branch 'upstream_master' into upstream_0_new
jcf94 Jul 3, 2020
8add768
Update
jcf94 Jul 3, 2020
78e5313
Update
jcf94 Jul 4, 2020
546abbe
Update
jcf94 Jul 4, 2020
d418a57
Update
jcf94 Jul 5, 2020
8e1d65d
Update
jcf94 Jul 5, 2020
3a67a72
Update
jcf94 Jul 9, 2020
28a7b8f
Update
jcf94 Jul 9, 2020
1360b1b
Update
jcf94 Jul 9, 2020
52afe74
Rename ansor namespace to auto_schedule
jcf94 Jul 11, 2020
6a61fb6
Update
jcf94 Jul 11, 2020
3a4e5da
Rename ThreadPool to ParallelFor
jcf94 Jul 14, 2020
dbe019b
Add parallel_for
jcf94 Jul 14, 2020
1f1b878
Remove ThreadPool
jcf94 Jul 14, 2020
02fede9
Update python/tvm/auto_schedule/auto_schedule.py
merrymercy Jul 14, 2020
eea0989
trigger CI
merrymercy Jul 14, 2020
1 change: 1 addition & 0 deletions CMakeLists.txt
@@ -185,6 +185,7 @@ assign_source_group("Include" ${GROUP_INCLUDE})

# Source file lists
file(GLOB_RECURSE COMPILER_SRCS
    src/ansor/*.cc
    src/node/*.cc
    src/ir/*.cc
    src/arith/*.cc
34 changes: 34 additions & 0 deletions python/tvm/ansor/__init__.py
@@ -0,0 +1,34 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.
# pylint: disable=unused-import, redefined-builtin
"""Namespace for Ansor auto-scheduler"""

from . import compute_dag
from . import measure
from . import record
from . import loop_state
from . import utils
from . import workload_registry

# Shortcut
from .compute_dag import ComputeDAG
from .auto_schedule import SearchTask, TuningOptions, HardwareParams, \
    auto_schedule, EmptyPolicy
from .measure import MeasureInput, LocalBuilder, LocalRunner
from .record import LogToFile, LogReader, best_measure_pair_in_file, \
    load_from_file, append_measure_records_to_file
from .workload_registry import register_workload, make_workload_key
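
The `register_workload` and `make_workload_key` shortcuts imported above suggest a registry of compute-declaration functions addressed by a string key. The sketch below is a pure-Python stand-in, not the code from `workload_registry.py`: the names `WORKLOAD_REGISTRY` and `lookup_workload`, the JSON key layout, and the `matmul_shape` example are all assumptions made for illustration.

```python
import json

# Hypothetical stand-in registry: maps a function name to the function itself.
WORKLOAD_REGISTRY = {}


def register_workload(func):
    """Decorator that records a compute-declaration function under its name."""
    WORKLOAD_REGISTRY[func.__name__] = func
    return func


def make_workload_key(func, args):
    """Serialize the function name and its arguments into a string key."""
    return json.dumps([func.__name__, *args])


def lookup_workload(workload_key):
    """Decode a key and call the registered function with the stored arguments."""
    name, *args = json.loads(workload_key)
    return WORKLOAD_REGISTRY[name](*args)


@register_workload
def matmul_shape(n, m, k):
    # A real workload would return te.Tensor objects; tuples keep the sketch standalone.
    return (n, m), (m, k), (n, k)


key = make_workload_key(matmul_shape, (128, 64, 32))
```

A string key of this kind can be logged and later used to rebuild the same computation, which is presumably why `auto_schedule` accepts a workload key in place of a `SearchTask`.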
22 changes: 22 additions & 0 deletions python/tvm/ansor/_ffi_api.py
@@ -0,0 +1,22 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

"""Register FFI APIs from C++ for the namespace tvm.ansor"""
import tvm._ffi


tvm._ffi._init_api("ansor", __name__)
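
The `_init_api` call above exposes functions registered on the C++ side under the `ansor.` prefix as attributes of this Python module. The following is a minimal sketch of that idea in plain Python. `GLOBAL_FUNCS`, `init_api`, and the fake entries are hypothetical stand-ins, not TVM's implementation.

```python
import types

# Stand-in for the global function table that the C++ side would populate.
GLOBAL_FUNCS = {
    "ansor.HardwareParams": lambda cores, vec, cache: ("HardwareParams", cores, vec, cache),
    "ansor.EmptyPolicy": lambda: ("EmptyPolicy",),
    "relay.Unrelated": lambda: None,
}


def init_api(prefix, module):
    """Attach every global function named `<prefix>.<name>` to `module` as `<name>`."""
    for full_name, func in GLOBAL_FUNCS.items():
        if full_name.startswith(prefix + "."):
            setattr(module, full_name[len(prefix) + 1:], func)


# A throwaway module object plays the role of `_ffi_api`.
ffi_api = types.ModuleType("fake_ffi_api")
init_api("ansor", ffi_api)
```

After this call, `ffi_api.EmptyPolicy()` resolves to the registered function, which mirrors how `_ffi_api.EmptyPolicy` is used by the Python wrappers in `auto_schedule.py`.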
206 changes: 206 additions & 0 deletions python/tvm/ansor/auto_schedule.py
@@ -0,0 +1,206 @@
# Licensed to the Apache Software Foundation (ASF) under one
# or more contributor license agreements. See the NOTICE file
# distributed with this work for additional information
# regarding copyright ownership. The ASF licenses this file
# to you under the Apache License, Version 2.0 (the
# "License"); you may not use this file except in compliance
# with the License. You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing,
# software distributed under the License is distributed on an
# "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
# KIND, either express or implied. See the License for the
# specific language governing permissions and limitations
# under the License.

"""
User interface for Ansor auto-scheduler.

The basic schedule search process for Ansor is designed to be:
`Program sampling` -> `Performance Tuning`.

In `Program sampling`, we use some predefined precise or heuristic rules to generate several
initial schedules. Based on these initial starting points, we perform `Performance Tuning` which
uses cost model based evolutionary search to select schedules with the best performance.

Candidate schedules are measured against the specific hardware target.
"""

import tvm._ffi
from tvm.runtime import Object
from .compute_dag import ComputeDAG
from .measure import LocalBuilder, LocalRunner
from . import _ffi_api
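
The module docstring above describes a two-stage flow: program sampling followed by cost-model-guided evolutionary tuning. The toy sketch below illustrates only the shape of that flow. The integer "tile size" search space, the `cost_model` function, and the mutation rule are assumptions invented for the example and do not come from this PR.

```python
import random


def cost_model(tile):
    """Toy stand-in cost model: pretend a tile size of 32 is ideal (lower is better)."""
    return abs(tile - 32)


def sample_programs(rng, count):
    """Program sampling: draw initial candidates from a fixed rule (powers of two)."""
    return [rng.choice([1, 2, 4, 8, 16, 32, 64, 128]) for _ in range(count)]


def evolve(rng, population, generations):
    """Performance tuning: mutate candidates and keep those the cost model prefers."""
    for _ in range(generations):
        # Each mutation doubles, halves, or keeps a tile size (never below 1).
        mutated = [max(1, t * rng.choice([1, 2]) // rng.choice([1, 2])) for t in population]
        # Keep the old candidates in the pool so the best one is never lost.
        population = sorted(set(population + mutated), key=cost_model)[:len(population)]
    return population


rng = random.Random(0)
initial = sample_programs(rng, 4)
best = evolve(rng, initial, generations=5)[0]
```

In the real system the selected candidates would then be measured on the target hardware, as the last line of the docstring notes.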


@tvm._ffi.register_object("ansor.HardwareParams")
class HardwareParams(Object):
    """ The parameters of target hardware used to guide the search process of SearchPolicy.

    TODO(jcf94): This is considered to be merged with the new Target:
    https://discuss.tvm.ai/t/rfc-tvm-target-specification/6844

    Parameters
    ----------
    num_cores : int
        The number of device cores.
    vector_unit_bytes : int
        The width of vector units in bytes.
    cache_line_bytes : int
        The size of cache line in bytes.
    """
    def __init__(self, num_cores, vector_unit_bytes, cache_line_bytes):
        self.__init_handle_by_constructor__(_ffi_api.HardwareParams, num_cores,
                                            vector_unit_bytes, cache_line_bytes)
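
As a rough illustration of how these three fields could guide a search, the sketch below derives two bounds from them. `HardwareParamsSketch` and both helper methods are hypothetical and are not part of this PR, and the example values describe an imagined 4-core CPU with 256-bit vector units.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class HardwareParamsSketch:
    """Plain-Python stand-in holding the same three fields as HardwareParams."""
    num_cores: int
    vector_unit_bytes: int
    cache_line_bytes: int

    def max_vector_lanes(self, dtype_bytes):
        """Number of elements of the given width that fit in one vector unit."""
        return self.vector_unit_bytes // dtype_bytes

    def elements_per_cache_line(self, dtype_bytes):
        """Number of elements of the given width covered by one cache line."""
        return self.cache_line_bytes // dtype_bytes


# Example values for a hypothetical 4-core CPU with 32-byte vectors and 64-byte cache lines.
params = HardwareParamsSketch(num_cores=4, vector_unit_bytes=32, cache_line_bytes=64)
```

With 4-byte `float32` elements, `params.max_vector_lanes(4)` gives an upper bound on a useful vectorization length for this imagined target.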


@tvm._ffi.register_object("ansor.SearchTask")
class SearchTask(Object):
    """ The computation information and hardware parameters for a specific schedule search task.

    Parameters
    ----------
    dag : ComputeDAG
        The ComputeDAG for the corresponding compute declaration.
    workload_key : str
        The workload key for the corresponding compute declaration.
    target : tvm.target.Target
        The target device of this search task.
    target_host : Optional[tvm.target.Target]
        The target host device of this search task.
    hardware_params : Optional[HardwareParams]
        Hardware parameters used in this search task.
    """
    def __init__(self, dag, workload_key, target, target_host=None,
                 hardware_params=None):
        self.__init_handle_by_constructor__(_ffi_api.SearchTask, dag,
                                            workload_key, target, target_host,
                                            hardware_params)


@tvm._ffi.register_object("ansor.SearchPolicy")
class SearchPolicy(Object):
    """ The base class of search policies. """


@tvm._ffi.register_object("ansor.EmptyPolicy")
class EmptyPolicy(SearchPolicy):
    """ This is an example empty search policy which will always generate
    the init state of ComputeDAG.
    """
    def __init__(self):
        self.__init_handle_by_constructor__(_ffi_api.EmptyPolicy)
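
The relationship between the base policy and `EmptyPolicy` can be sketched without TVM as follows. The `search` method name, the dict-based task, and its `init_state` field are assumptions chosen for illustration; the real classes are thin wrappers around C++ objects.

```python
class SearchPolicySketch:
    """Stand-in base class: a policy maps a task to a chosen program state."""

    def search(self, task):
        raise NotImplementedError


class EmptyPolicySketch(SearchPolicySketch):
    """Performs no search and returns the task's initial state unchanged."""

    def search(self, task):
        return task["init_state"]


# A dict stands in for SearchTask; "init_state" is a hypothetical field name.
task = {"workload_key": "demo", "init_state": ["for i", "for j", "for k"]}
result = EmptyPolicySketch().search(task)
```

A policy like this is useful as a placeholder for exercising the surrounding infrastructure before a real search policy is added.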


@tvm._ffi.register_object("ansor.TuningOptions")
class TuningOptions(Object):
    """ This controls the options of performance tuning.

    Parameters
    ----------
    num_measure_trials: int = 0
        The number of measurement trials.
        The search policy measures `num_measure_trials` schedules in total and returns the best one
        among them.
        With `num_measure_trials` == 0, the policy will do the schedule search but won't involve
        measurement.
        This can be used to get a runnable schedule quickly without auto-tuning.
    early_stopping: int = -1
        Stop the tuning early if getting no improvement after n measurements.
    num_measures_per_round: int = 64
        The number of schedules to be measured at each search round.
        The whole schedule search process will try a total number of `num_measure_trials` in several
        rounds.
    verbose: int = 1
        Verbosity level. 0 for silent, 1 to output information during schedule search.
    builder: Union[ProgramBuilder, str] = 'local'
        ProgramBuilder which builds the program.
    runner: Union[ProgramRunner, str] = 'local'
        ProgramRunner which runs the program and measures time costs.
    measure_callbacks: Optional[List[MeasureCallback]]
        Callback functions called after each measurement.
        Candidates:
        - ansor.LogToFile
    pre_search_callbacks: Optional[List[SearchCallback]]
        Callback functions called before the search process.
        Candidates:
        - ansor.PreloadMeasuredStates
        - ansor.PreloadCustomSketchRule
        TODO(jcf94): Add these implementations in later PRs.
    """
    def __init__(self, num_measure_trials=0, early_stopping=-1, num_measures_per_round=64,
Review comment (Member):
early_stopping -> early_termination

IMHO, this API looks a bit bulky to me. Should we have some config dict to do this?

Review comment (Member):
Yeah, I agree. There are lots of fields here and it's a bit hard to consume.

Review comment:
In my opinion, TuningOptions is already a class holding configurations related to schedule tuning, so I think it might be a little bit overkill to introduce another config dict.

Review comment (Contributor):
I agree with @yangjunpro. This class is a fine way of collecting the tuning options; separating another dict out is messier.

                 verbose=1, builder='local', runner='local', measure_callbacks=None,
                 pre_search_callbacks=None):
        if isinstance(builder, str):
            if builder == 'local':
                builder = LocalBuilder()
            else:
                raise ValueError("Invalid builder: " + builder)
        elif not isinstance(builder, tvm.ansor.measure.ProgramBuilder):
            # builder is not a string in this branch, so convert it before concatenating.
            raise ValueError("Invalid builder: " + str(builder) +
                             ". TuningOptions expects a ProgramBuilder or string.")

        if isinstance(runner, str):
            if runner == 'local':
                runner = LocalRunner()
            else:
                raise ValueError("Invalid runner: " + runner)
        elif not isinstance(runner, tvm.ansor.measure.ProgramRunner):
            # runner is not a string in this branch, so convert it before concatenating.
            raise ValueError("Invalid runner: " + str(runner) +
                             ". TuningOptions expects a ProgramRunner or string.")

        measure_callbacks = measure_callbacks if measure_callbacks else []
        pre_search_callbacks = pre_search_callbacks if pre_search_callbacks else []

        self.__init_handle_by_constructor__(
            _ffi_api.TuningOptions, num_measure_trials, early_stopping, num_measures_per_round,
            verbose, builder, runner, measure_callbacks, pre_search_callbacks)
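
To make the interplay of `num_measure_trials`, `num_measures_per_round`, and `early_stopping` concrete, here is a small standalone sketch of a measurement loop. It is an assumption about how such a loop could behave, not the C++ implementation behind this class: `run_measurement_rounds` is a hypothetical helper, the `costs` list stands in for real measurements, and checking the early-stopping condition only at round boundaries is a simplification.

```python
def run_measurement_rounds(costs, num_measure_trials, num_measures_per_round, early_stopping=-1):
    """Measure candidates in rounds; return (best_cost, number_of_measurements_done).

    `costs` is a list of precomputed stand-in measurement results (lower is better).
    A negative `early_stopping` disables early termination.
    """
    best_cost = float("inf")
    best_at = 0  # measurement count at which the current best was found
    measured = 0
    while measured < num_measure_trials:
        # Each round measures at most num_measures_per_round candidates.
        batch = costs[measured:min(measured + num_measures_per_round, num_measure_trials)]
        if not batch:
            break
        for cost in batch:
            measured += 1
            if cost < best_cost:
                best_cost, best_at = cost, measured
        # Stop once `early_stopping` measurements have passed without improvement.
        if early_stopping >= 0 and measured - best_at >= early_stopping:
            break
    return best_cost, measured


costs = [5.0, 3.0, 4.0, 4.0, 4.0, 4.0, 1.0, 2.0]
full_run = run_measurement_rounds(costs, 8, 2)
stopped_run = run_measurement_rounds(costs, 8, 2, early_stopping=3)
```

In this example the early-stopped run ends after six measurements and misses the better candidate found later, which is the usual trade-off of an early-stopping threshold.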


def auto_schedule(task, target, target_host=None, search_policy='default',
                  hardware_params=None, tuning_options=None):
    """ Do auto scheduling for a computation declaration.

    The task parameter can be a `string` as workload_key, or directly
    passing a `SearchTask` as input.

    Parameters
    ----------
    task : Union[SearchTask, str]
        The SearchTask or workload key for the computation declaration.
    target : tvm.target.Target
        The target device of this schedule search.
    target_host : Optional[tvm.target.Target]
        The target host device of this schedule search.
    search_policy : Union[SearchPolicy, str] = 'default'
        The search policy to be used for schedule search.
    hardware_params : Optional[HardwareParams]
        The hardware parameters of this schedule search.
    tuning_options : Optional[TuningOptions]
        Tuning and measurement options.

    Returns
    -------
        A `te.schedule` and a list of `te.Tensor` to be used in `tvm.lower` or `tvm.build`.
    """
    if isinstance(search_policy, str):
        if search_policy == 'default':
            # TODO(jcf94): This is an example policy for minimum system, will be upgraded to
            # formal search policy later.
            search_policy = EmptyPolicy()
        else:
            raise ValueError("Invalid search policy: " + search_policy)

    tuning_options = tuning_options if tuning_options else TuningOptions()

    if isinstance(task, str):
        dag = ComputeDAG(task)
        task = SearchTask(dag, task, target, target_host, hardware_params)
    elif not isinstance(task, SearchTask):
        # task is not a string in this branch, so convert it before concatenating.
        raise ValueError("Invalid task: " + str(task) +
                         ". `ansor.auto_schedule` expects a `str` or `SearchTask`.")

    sch, tensors = _ffi_api.AutoSchedule(task, search_policy, tuning_options)
    return sch, tensors
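
The argument handling in `TuningOptions.__init__` and `auto_schedule` follows one pattern: accept either a string shortcut or an instance, and reject everything else with `ValueError`. The sketch below factors that pattern into a helper. `resolve_option` and the two `*Sketch` classes are hypothetical names introduced here, not part of the PR. The error message is built with `format` rather than `+`, because concatenating a `str` with a non-string object raises `TypeError`.

```python
class ProgramBuilderSketch:
    """Stand-in for a builder base class."""


class LocalBuilderSketch(ProgramBuilderSketch):
    """Stand-in for the builder selected by the 'local' shortcut."""


def resolve_option(value, name, shortcuts, base_class):
    """Return an instance of `base_class` from a string shortcut or an existing instance."""
    if isinstance(value, str):
        if value not in shortcuts:
            raise ValueError("Invalid {}: {!r}".format(name, value))
        return shortcuts[value]()
    if not isinstance(value, base_class):
        # Format with !r so that any object type can appear in the message.
        raise ValueError("Invalid {}: {!r}. Expected a {} or string."
                         .format(name, value, base_class.__name__))
    return value


shortcuts = {"local": LocalBuilderSketch}
builder = resolve_option("local", "builder", shortcuts, ProgramBuilderSketch)
```

Passing an existing `ProgramBuilderSketch` instance returns it unchanged, while an unknown string or an unrelated object such as `42` raises `ValueError`.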