-
Notifications
You must be signed in to change notification settings - Fork 3.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Relay] Expand type unification and other utilities #2189
Conversation
I will take a review pass after I get back from MSR tomorrow. |
Had the below case fail when I tried it. This appears to be happening because the identity relations between There are two ways I can think of right now to fix this bug:
Edit: Figured out another potential issue. In the case of Edit2: Additionally, if we have a type
|
…ch (not sure why this fails)
Introducing a visitor for resolving solved the problems I had in #2189 (comment) without needing to implement any of the other measures, but I would appreciate discussion on whether that's a good approach to resolution or if there need to be larger rewrites of TypeSolver to accommodate nested type vars. |
@slyubomirsky you definitely need to recursively resolve, the boolean should only indicate if progress has been made on the relation not whether it is fully resolved. The main places that should have been changed are (1) add recursive unifier, (2) make resolver recursive, and (3) update solver to handle new relations. |
@jroesch could you please clarify what (3) means in your last comment? I'm not sure what specifically is missing from here; an example test case (even a sketch) could also be helpful |
I think I understand now what might be missing (please correct me if I'm wrong, @jroesch): Adding a relation to the solver via I don't know of a non-unification test case where this applies, however; any suggestions? |
… is added, not during unification. Resolve before running a relation
My most recent commit should propagate recursively, as described, as well as resolve before running a relation (potentially filling in solved nested type vars). Please review @jroesch, @MarisaKirisame. I think we might do well to have more complex test cases, so if any has been giving you problems, I would appreciate a suggestion. (Not sure what the issue is with the CI, does not happen to me locally.) |
I think this looks like it solves the appropriate cases, @MarisaKirisame can you send Steven the broken tests we had committed? |
@jroesch done privatedly |
@jroesch @MarisaKirisame I think the cases you showed me will require changes to type inference, rather than the solver. I will handle them in this PR. |
I thought the above problem was addressed by the addition of the propagator, which ensures that all nested types list the relations from their parent types. I will see if there is a need to have further recursive merging and if I can add more test cases for such recursive updates. |
Another option would be calling the propagator in every call to |
The current propagator logic only solves the problem when the initial relation contains recursive ones, like For example. initial condition: |
Ah, I see the issue now, very good catch. I'll work on a solution over the next day. |
I have recursive merging in, but I think each merge also needs to propagate in order for the rel lists to be correctly updated (since it is always true that if a type has nested types, updates to the nested types should cause updates to relations for the parent). I am getting some weird problems with circular links being formed, so I think I will try using different data structures to both eliminate the issue of circular links and make it easier to keep the trigger lists up to date (a lot of wacky visitors going around). (This will probably lead to some inefficiencies, but I'd rather have things be obviously correct and ready for us to fix later than subtly flawed.) |
I think we will need to revisit this mainly for performance reasons (the use of unodered_set on Node), but since this appears to implement features we need correctly, I can move on to merge this in. @jroesch can you take another close look and https://docs.tvm.ai/contribute/code_review.html#approve-and-request-changes-explicitly |
@junrushao1994 please help to take a look as well if you have time |
@tqchen Will do tomorrow night |
include/tvm/relay/pass.h
Outdated
* | ||
* \param expr the expression. | ||
* | ||
* \return List of free vars, in the PostDFS order in the expression. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like it should be "List of all vars", not only free vars?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That it is; I'll correct it. Thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are probably a few more changes we want to make in the future, but overalls good for now.
@junrushao1994 please take another look and https://docs.tvm.ai/contribute/code_review.html#approve-and-request-changes-explicitly |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Feel like it is pretty good for now
Thanks to @slyubomirsky @jroesch @junrushao1994 , this is now merged |
This PR is reverted due to some additional problems revealed by with the recent added AD, please send in another PR and fix the issues. |
This change adds greater support for type unification to Relay. In particular, it allows for handling cases that require recursing down the type ASTs. E.g.,
Unify(Tuple(?a, ?b), Tuple(Tensor(s, bt), Tensor(s, bt)))
should be able to conclude that the unified type isTuple(Tensor(s, bt), Tensor(s, bt))
and thatUnify(Tuple(?a, ?b), Tuple(?c, ?d))
implies?a = ?c
and?b = ?d
.The main concern is ensuring that if we have a type variable nested inside another type, then an update to that type variable should result in re-running all relations where that type variable appears, including those where that type variable is nested inside another type. That was the main source of complexity in these changes.
(Note to reviewers: It would be very good to ensure that this unifier does not unnecessarily add relations to the queue and thus make the solver do unnecessary work. I do not believe that it does, though.)
Please review as to whether this is the direction we want to take, @jroesch, @tqchen, and others
Outline of Relay's Type Checking and Inference Implementation
For the benefit of reviewers and future work on Relay's type system implementation, I will give a high-level outline of how type checking and inference work in Relay.
Types are actually inferred during the process of type checking, which runs in two stages:
Explanation of Constraints on Types
By "constraints," we mean constraints on the possible types permitted for a given term. For example, if a variable
v
appears in a tuple projectionv.1
, we would like to be able to express the constraint thatv
must have a tuple type with at least two members. Later, in the constraint solving phase, we would fill in the specific tuple type thatv
must have.Handling type inference requires filling in types for terms whose types have been omitted, as in functions where parameter or return types have been omitted or
let
bindings where the variable is not annotated with a type. In these cases, those terms are given incomplete types (known as type variables or type holes in the programming languages literature), to be filled in during the constraint solving phase.Unification
The most common constraint employed is the main subject of this PR, namely unification: the constraint that two types must be structurally equal. (For example, unifying two tuple types requires ensuring that the corresponding members of the tuples can be unified.) Very many type-checking rules require unification: for example, type-checking a function call requires ensuring that the types of the arguments to the call match the types expected for function arguments, so the types of the call arguments will be unified with the function's expected argument types.
Incomplete types require special handling during unification, which is where most of the complexity in the implementation here emerges. Namely, if an incomplete type is unified with another incomplete type, that means both those incomplete types must eventually resolve to the same type. So, if incomplete type
?a
and?b
are unified and then?b
is later unified with a typeTensor[(10, 10), float32]
, all occurrences of both?a
and?b
must be equal toTensor[(10, 10), float32]
.Unification for incomplete types is thus handled via a union-find (UF) data structure, where incomplete types are nodes in the UF tree. For unifying incomplete types with concrete types, we must also keep a mapping of incomplete types to concrete types: if an incomplete type is unified with a concrete type, we say it resolves to that concrete type.
There is one more complication, which is that recursive unification is not permitted. E.g.,
Unify(?a, (?a, ?b))
is invalid because there is no concrete type that is a tuple containing itself as one member. Thus, we must check for such recursion when unifying incomplete types and reject it (this is called an "occurs check" in the PL literature, since it checks whether a type variable occurs inside another type).We can thus outline the algorithm for unifying two types
t1
andt2
(Unify(t1, t2)
) as follows:t1
andt2
are both incomplete types, findr1
andr2
, their respective UF roots. Ifr1
is the same asr2
, there is no more work to do. Otherwise, we may unifyr1
andr2
in the UF tree. If one ofr1
andr2
has a resolved (concrete) type and the other doesn't, set the resolved type of the new UF root to that resolved type; if both have a resolved type (call themc1
andc2
), set the resolved type of the new UF root toUnify(c1, c2)
. Return the resolved type of the new UF root (or the UF root itself if there was no resolved type).t1
andt2
is an incomplete type and the other is concrete (WLOG, assume thatt1
is incomplete andt2
is concrete), obtain the UF root oft1
,r1
. Next perform an occurs check int2
forr1
(the occurs check simply traversest2
's AST and for each incomplete type encountered, checks whether its UF root is equal tor1
); reject if there is an occurrence. Ifr1
does not have a resolved type, set its resolved type tot2
and returnt2
. Ifr1
has a resolved typec1
, setr1
's resolved type toUnify(c1, t2)
and return the final resolved type.t1
nort2
is an incomplete type, unify them by recursing down their ASTs. Ift1
andt2
fail to match at any point of their AST (e.g., ift1
is a tuple type andt2
is a function type), reject the unification. For types that do not contain nested types (e.g., tensor types), reject if they are not equal and return the type if they are equal. Otherwise, unify all subtrees of the two types and return the unified type.Because union-find with path compression is very asymptotically efficient, unification takes time approximately linear in the size of the type ASTs being unified.
Explicit Type Relations
The other constraints typically generated in Relay are explicit type relations on functions. Type relations are typically associated with Relay operators (see the developer tutorial on adding operators, https://docs.tvm.ai/dev/relay_add_op.html, for some discussion). In their current implementation, they take the argument and return types to a Relay function and define a procedure (i.e., a function in C++) that checks if the constraint holds. The procedures may perform arbitrary checks and can update any shape variables or incomplete types within the types passed to them.
For example, the
Identity
relation simply unifies the types passed to it and is satisfied if unifying the types succeeds. TheBroadcast
relation is satisfied if all the types passed to it are tuple types and the shape of the function return type is the broadcast of the argument types' shapes.Note that because it is possible for type relations to be given incomplete types or types that contain nested incomplete types as arguments, the queue of type relations must be appropriately updated as those incomplete types are unified and resolved. In order to track these updates, each type in the program keeps a list of all the type relations in which it appears as an argument (we can call this list a "trigger list"), which is also propagated to every sub-tree of that type's AST. When two types are unified, the final unified type will combine the trigger lists of the two argument types and any unresolved relation in the two trigger lists will be put on the queue of relations, as there is now new information that may allow that relation to be resolved.
For example, if we have a tuple type
(?a, ?b)
that is given as an argument to relationR1
, thenR1
is in the trigger lists of(?a, ?b)
,?a
, and?b
. If?a
is unified with another type andR1
is not yet solved,R1
will be put back on the queue of relations because updating?a
give new information.When adding a type relation, it takes time linear in the size of each argument type AST to propagate that type relation to the trigger list of each argument type. When unifying two types, it takes time linear in the size of the types' trigger lists to iterate through the trigger lists and determine whether a relation needs to be put back on the queue. While the linear performance times here can quickly add up, the number of explicit relations in any given program will typically be quite small and will only apply to a few of the types in the program, so this is not expected to be a threat to performance.
Constraint Generation
The constraint generation phase of type checking is implemented by recursing down the AST of a program and adding new type constraints based on the type-checking rules for each AST node. Without listing out the rules for each node, this procedure consists primarily of the following:
let
expression or function argument, if the type annotation is omitted, set it to a fresh incomplete type. Similarly, if a return type to a function is not annotated, treat it as having an incomplete type for the return type.This phase of type checking should not do very much for each AST node besides adding type relations to a queue or adding more unification constraints and should only visit each AST node once, so it should be approximately linear in the size of the program AST.
Constraint Solving
The second phase of type checking requires solving constraints. In the case of unification constraints, solving is simple: for each type
t
, resolve it by recursing down its AST and replacing each incomplete type with the concrete resolved type of its root in the UF tree.Explicit type relations currently take somewhat of an ad hoc approach: There is a queue of relations kept from the generation phase. For each relation in the queue, call its updater function. If the relation holds and all types passed to the relation are fully concrete, dequeue the relation; it is solved. If a relation returns false, there is a type checking error. Otherwise, if incomplete types remain, add the relation back to the queue. Keep iterating through the queue until all relations are solved or a fixpoint is reached (i.e., no types are changing). A fixpoint may either mean there is an error (impossible constraint to satisfy) or that there is not enough information given (more annotations may be necessary).
Note that resolution of incomplete types should be performed after solving all type relations, since the relations may update incomplete types (i.e., add more unification constraints).
The complexity of this phase is difficult to classify, since relations may perform arbitrary operations. It is certainly possible to imagine constructing very entangled queues of interdependent relations that would have poor performance, but in practice, the relations used for most operators in Relay have simple constraints that do not take many iterations of the queue to fully resolve.
Remaining Issue
Bound type variables don't fit into the above story. That will take some additional design work (see below discussion); we will address it later.