diff --git a/design/15292-generics.md b/design/15292-generics.md new file mode 100644 index 00000000..78b7546c --- /dev/null +++ b/design/15292-generics.md @@ -0,0 +1,233 @@ +# Proposal: Go should have generics + +Author: [Ian Lance Taylor](iant@golang.org) + +Created: January 2011 + +Last updated: April 2016 + +Discussion at https://golang.org/issue/15292 + +## Abstract + +Go should support some form of generic programming. +Generic programming enables the representation of algorithms and data +structures in a generic form, with concrete elements of the code +(such as types) factored out. +It means the ability to express algorithms with minimal assumptions +about data structures, and vice-versa +(paraphrasing [Jazayeri, et al](http://www.dagstuhl.de/en/program/calendar/semhp/?semnr=98171)). + +## Background + +### Generic arguments in favor of generics + +People can write code once, saving coding time. +People can fix a bug in one instance without having to remember to fix it +in others. +Generics avoid boilerplate: less coding by copying and editing. + +Generics save time testing code: they increase the amount of code +that can be type checked at compile time rather than at run time. + +Every statically typed language in current use has generics in one +form or another (even C has generics, where they are called preprocessor macros; +[example](https://gcc.gnu.org/viewcvs/gcc/trunk/gcc/vec.h?revision=165314&view=markup&pathrev=165314)). + +### Existing support for generic programming in Go + +Go already supports a form of generic programming via interfaces. +People can write an abstract algorithm that works with any type that +implements the interface. +However, interfaces are limited because the methods must use specific types. +There is no way to write an interface with a method that takes an +argument of type T, for any T, and returns a value of the same type. 
+There is no way to write an interface with a method that compares two +values of the same type T, for any T. +The assumptions that interfaces require about the types that satisfy +them are not minimal. + +Interfaces are not simply types; they are also values. +There is no way to use interface types without using interface values, +and interface values aren’t always efficient. +There is no way to create a slice of the dynamic type of an interface. +That is, there is no way to avoid boxing. + +### Specific arguments in favor of generics in Go + +Generics permit type-safe polymorphic containers. +Go currently has a very limited set of such containers: slices, and +maps of most but not all types. +Not every program can be written using a slice or map. + +Look at the functions `SortInts`, `SortFloats`, `SortStrings` in the +sort package. +Or `SearchInts`, `SearchFloats`, `SearchStrings`. +Or the `Len`, `Less`, and `Swap` methods of `byName` in package io/ioutil. +Pure boilerplate copying. + +The `copy` and `append` functions exist because they make slices much +more useful. +Generics would mean that these functions are unnecessary. +Generics would make it possible to write similar functions for maps +and channels, not to mention user created data types. +Granted, slices are the most important composite data type, and that’s why +these functions were needed, but other data types are still useful. + +It would be nice to be able to make a copy of a map. +Right now that function can only be written for a specific map type, +but, except for types, the same code works for any map type. +Similarly, it would be nice to be able to multiplex one channel onto +two, without having to rewrite the function for each channel type. +One can imagine a range of simple channel manipulators, but they can +not be written because the type of the channel must be specified +explicitly. + +Generics let people express the relationship between function parameters +and results. 
+Consider the simple Transform function that calls a function on every +element of a slice, returning a new slice. +We want to write something like +``` +func Transform(s []T, f func(T) U) []U +``` +but this can not be expressed in current Go. + +In many Go programs, people only have to write explicit types in function +signatures. +Without generics, they also have to write them in another place: in the +type assertion needed to convert from an interface type back to the +real type. +The lack of static type checking provided by generics makes the code +heavier. + +### What we want from generics in Go + +Any implementation of generics in Go should support the following. + +* Define generic types based on types that are not known until they are instantiated. +* Write algorithms to operate on values of these types. +* Name generic types and name specific instantiations of generic types. +* Use types derived from generic types, as in making a slice of a generic type, + or conversely, given a generic type known to be a slice, defining a variable + with the slice’s element type. +* Restrict the set of types that may be used to instantiate a generic type, to + ensure that the generic type is only instantiated with types that support the + required operations. +* Do not require an explicit relationship between the definition of a generic + type or function and its use. That is, programs should not have to + explicitly say *type T implements generic G*. +* Write interfaces that describe explicit relationships between generic types, + as in a method that takes two parameters that must both be the same unknown type. +* Do not require explicit instantiation of generic types or functions; they + should be instantiated as needed. + +### The downsides of generics + +Generics affect the whole language. +It is necessary to evaluate every single language construct to see how +it will work with generics. + +Generics affect the whole standard library. 
+It is desirable to have the standard library make effective use of generics. +Every existing package should be reconsidered to see whether it would benefit +from using generics. + +It becomes tempting to build generics into the standard library at a +very low level, as in C++ `std::basic_string<char, std::char_traits<char>, std::allocator<char> >`. +This has its benefits (otherwise nobody would do it), but it has +wide-ranging and sometimes surprising effects, as in incomprehensible +C++ error messages. + +As [Russ pointed out](http://research.swtch.com/generic), generics are +a trade off between programmer time, compilation time, and execution +time. + +Go is currently optimizing compilation time and execution time at the +expense of programmer time. +Compilation time is a significant benefit of Go. +Can we retain compilation time benefits without sacrificing too much +execution time? + +Unless we choose to optimize execution time, operations that appear +cheap may be more expensive if they use values of generic type. +This may be subtly confusing for programmers. +I think this is less important for Go than for some other languages, +as some operations in Go already have hidden costs such as array +bounds checks. +Still, it would be essential to ensure that the extra cost of using +values of generic type is tightly bounded. + +Go has a lightweight type system. +Adding generic types inevitably makes the type system more complex. +It is essential that the result remain lightweight. + +The upsides of the downsides are that Go is a relatively small +language, and it really is possible to consider every aspect of the +language when adding generics. +At least the following sections of the spec would need to be extended: +Types, Type Identity, Assignability, Type assertions, Calls, Type +switches, For statements with range clauses. + +Only a relatively small number of packages will need to be +reconsidered in light of generics: container/*, sort, flag, perhaps +bytes. 
+Packages that currently work in terms of interfaces will generally be +able to continue doing so. + +### Conclusion + +Generics will make the language safer, more efficient to use, and more +powerful. +These advantages are harder to quantify than the disadvantages, but +they are real. + +## Examples of potential uses of generics in Go + +* Containers + * User-written hash tables that are compile-time type-safe, rather than + converting slice keys to string and using maps + * Sorted maps (red-black tree or similar) + * Double-ended queues, circular buffers + * A simpler Heap + * `Keys(map[K]V) []K`, `Values(map[K]V) []V` + * Caches + * Compile-time type-safe `sync.Pool` +* Generic algorithms that work with these containers in a type-safe way. + * Union/Intersection + * Sort, StableSort, Find + * Copy (a generic container, and also copy a map) + * Transform a container by applying a function--LISP `mapcar` and friends +* math and math/cmplx +* testing/quick.{`Check`,`CheckEqual`} +* Mixins + * like `ioutil.NopCloser`, but preserving other methods instead of + restricting to the passed-in interface (see the `ReadFoo` variants of + `bytes.Buffer`) +* protobuf `proto.Clone` +* Eliminate boilerplate when calling sort function +* Generic diff: `func [T] Diff(x, y []T) []range` +* Channel operations + * Merge N channels onto one + * Multiplex one channel onto N + * The [worker-pool pattern](http://play.golang.org/p/b5XRHnxzZF) +* Graph algorithms, for example immediate dominator computation +* Multi-dimensional arrays (not slices) of different lengths +* Many of the packages in go.text could benefit from it to avoid duplicate + implementation or APIs for `string` and `[]byte` variants; many points that + could benefit need high performance, though, and generics should provide that + benefit + +## Proposal + +I won’t discuss a specific implementation proposal here: my hope is +that this document helps show people that generics are worth having +provided the downsides can 
be kept under control. + +The following documents are my previous generics proposals, +presented for historic reference. All are flawed in various ways. + +* [Type functions](NNNN/2010-06-type-functions.md) (June 2010) +* [Generalized types](NNNN/2011-03-gen.md) (March 2011) +* [Generalized types](NNNN/2013-10-gen.md) (October 2013) +* [Type parameters](NNNN/2013-12-type-params.md) (December 2013) diff --git a/design/15292/2010-06-type-functions.md b/design/15292/2010-06-type-functions.md new file mode 100644 index 00000000..167bfbcc --- /dev/null +++ b/design/15292/2010-06-type-functions.md @@ -0,0 +1,764 @@ +# Type Functions + +This is a proposal for adding generics to Go, written by Ian Lance +Taylor in June, 2010. +This proposal will not be adopted. +It is being presented as an example for what a complete generics +proposal must cover. + +## Defining a type function + +We introduce _type functions_. +A type function is a named type with zero or more parameters. +The syntax for declaring a type function is: + +``` +type name(param1, param2, ...) definition +``` + +Each parameter to a type function is simply an identifier. +The _definition_ of the type function is any type, just as with an +ordinary type declaration. +The definition of a type function may use the type parameters any +place a type name may appear. + +``` +type Vector(t) []t +``` + +## Using a type function + +Any use of a type function must provide a value for each type +parameter. +The value must itself be a type, though in some cases the exact type +need not be known at compile time. + +We say that a specific use of a type function is concrete if all the +values passed to the type function are concrete. +All predefined types are concrete and all type literals composed using +only concrete types are concrete. +A concrete type is known statically at compile time. + +``` +type VectorInt Vector(int) // Concrete. 
+``` + +When a type function is used outside of a func or the definition of a +type function, it must be concrete. +That is, global variables and constants are required to have concrete +types. + +In this example: + +``` +var v Vector(int) +``` + +the name of the type of the variable `v` will be `Vector(int)`, and +the value of v will have the representation of `[]int`. +If `Vector(t)` has any methods, they will be attached to the type of +`v` \(methods of type functions are discussed further below\). + +## Generic types + +A type function need not be concrete when used as the type of a +function receiver, parameter, or result, or anywhere within a +function. +A specific use of a type function that is not concrete is known as +generic. +A generic type is not known at compile time. + +A generic type is named by using a type function with one or more +unbound type parameters, or by writing a type function with parameters +attached to generic types. +When writing an unbound type parameter, it can be ambiguous whether +the intent is to use a concrete type or whether the intent is to use +an unbound parameter. +This ambiguity is resolved by using the `type` keyword after each +unbound type parameter. + +\(Another way would be to use a different syntax for type variables. +For example, `$t`. This has the benefit of keeping the grammar +simpler and not needing to worry about where types are introduced +vs. used.\) + +``` +func Bound(v Vector(int)) +func Unbound(v Vector(t type)) +``` + +The `type` keyword may also be used without invoking a type function: + +``` +func Generic(v t type) t +``` + +In this example the return type is the same as the parameter type. +Examples below show cases where this can be useful. + +A value (parameter, variable, etc.) with a generic type is represented +at runtime as a generic value. +A generic value is a dynamic type plus a value of that type. 
+The representation of a generic value is thus the same as the +representation of an empty interface value. +However, it is important to realize that the dynamic type of a generic +value may be an interface type. +This of course can never happen with an ordinary interface value. + +Note that if `x` is a value of type `Vector(t)`, then although `x` is +a generic value, the elements of the slice are not. +The elements have type `t`, even though that type is not known. +That is, boxing a generic value only occurs at the top level. + +## Generic type identity + +Two generic types are identical if they have the same name and the +parameters are identical. +This can only be determined statically at compile time if the names +are the same. +For example, within a function, `Vector(t)` is identical to +`Vector(t)` if both `t` identifiers denote the same type. +`Vector(int)` is never identical to `Vector(float)`. +`Vector(t)` may or may not be identical to `Vector(u)`; +identity can only be determined at runtime. + +Checking type identity at runtime is implemented by walking through +the definition of each type and comparing each component. +At runtime all type parameters are known, so no ambiguity is possible. +If any literal or concrete type is different, the types are different. + +## Converting a value of concrete type to a generic type + +Sometimes it is necessary to convert a value of concrete type to a +generic type. +This is an operation that may fail at run time. +This is written as a type assertion: `v.(t)`. +This will verify at runtime that the concrete type of `v` is identical +to the generic type `t`. + +\(Open issue: Should we use a different syntax for this?\) + +\(Open issue: In some cases we will want the ability to convert an +untyped constant to a generic type. +This would be a runtime operation that would have to implement the +rules for conversions between numeric types. +How should this conversion be written? +Should we simply use `N`, as in `v / 2`? 
+The problem with that syntax is that the runtime conversion can fail +in some cases, at least when `N` is not in the range 0 to 127 inclusive. +The same objection applies to T(n). +That suggests N.(t), but that looks weird.\) + +\(Open issue: It is possible that we will want the ability to convert +a value from a known concrete type to a generic type. +This would also require a runtime conversion. +I'm not sure whether this is necessary or not. +What would be the syntax for this?\) + +## Generic value operations + +A function that uses generic values is only compiled once. +This is different from C++ templates. + +The only operations permitted on a generic value are those implied by +its type function. +Some operations will require extra work at runtime. + +### Declaring a local variable with generic type. + +This allocates a new generic value with the appropriate dynamic type. +Note that in the general case the dynamic type may need to be +constructed at runtime, as it may itself be a type function with +generic arguments. + +### Assigning a generic value + +As with any assignment, this is only permitted if the types are +identical. +The value is copied as appropriate. +This is much like assignment of values of empty interface type. + +### Using a type assertion with a generic value + +Programs may use type assertions with generic values just as with +interface values. +The type assertion succeeds if the target type is identical to the +value's dynamic type. +In general this will require a runtime check of type identity as +described above. + +\(Open issue: Should it be possible to use a type assertion to convert +a generic value to a generic type, or only to a concrete type? +Converting to a generic type is a somewhat different operation from +a standard type assertion. +Should it use a different syntax?\) + +### Using a type switch with a generic value + +Programs may use type switches with generic values just as with +interface values. 
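
For comparison, here is a minimal sketch of the interface-value form that the text refers to, written in ordinary Go. The function name `describe` and the particular cases are illustrative assumptions, not part of the proposal; under the proposal the same `switch x := v.(type)` form would presumably be applied to a generic value instead of an `interface{}`.

```go
package main

import "fmt"

// describe is a hypothetical helper: it inspects the dynamic type of an
// interface value with a type switch and reports what it found.
func describe(v interface{}) string {
	switch x := v.(type) {
	case int:
		return fmt.Sprintf("int %d", x)
	case string:
		return fmt.Sprintf("string %q", x)
	case []int:
		return fmt.Sprintf("[]int of length %d", len(x))
	default:
		// Any type not listed above ends up here.
		return fmt.Sprintf("other %T", x)
	}
}

func main() {
	fmt.Println(describe(42))
	fmt.Println(describe("go"))
	fmt.Println(describe([]int{1, 2, 3}))
	fmt.Println(describe(3.5))
}
```

Each case binds `x` with the concrete type named in that case, so no further assertion is needed inside the branch.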
+ +### Using a conversion with a generic value + +Generic values may only be converted to types with identical +underlying types. +This is only permitted when the compiler can verify at compile time +that the conversion is valid. +That is, a conversion to a generic type T is only permitted if the +definition of T is identical to the definition of the generic type of +the value. + +``` + var v Vector(t) + a := Vector(t)(v) // OK. + type MyVector(t) []t + b := MyVector(t)(v) // OK. + c := MyVector(u)(v) // OK iff u and t are known identical types. + d := []int(v) // Not OK. +``` + +### Taking the address of a generic value + +Programs may always take the address of a generic value. +If the generic value has type `T(x)` this produces a generic value of +type `*T(x)`. +The new type may be constructed at runtime. + +### Indirecting through a generic value + +If a generic value has a generic type that is a pointer `*T`, then a +program may indirect through the generic value. +This will be similar to a call to `reflect.PtrValue.Elem`. +The result will be a new generic value of type `T`. + +### Indexing or slicing a generic value + +If a generic value has a generic type that is a slice or array, then a +program may index or slice the generic value. +This will be similar to a call to `reflect.ArrayValue.Elem` or +`reflect.SliceValue.Elem` or `reflect.SliceValue.Slice` or +`reflect.MakeSlice`. +In particular, the operation will require a multiplication by the size +of the element of the slice, where the size will be fetched from the +dynamic type. +The result will be a generic value of the element type of the slice or +array. + +### Range over a generic value + +If a generic value has a generic type that is a slice or array, then a +program may use range to loop through the generic value. 
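
The indexing and range operations described above are compared to reflection. A rough sketch of that comparison with the current `reflect` package follows; its `Value.Len` and `Value.Index` methods roughly correspond to the older `reflect.SliceValue` and `reflect.ArrayValue` names used in this text. The helper names `elements` and `elemSize` are hypothetical and are not part of the proposal.

```go
package main

import (
	"fmt"
	"reflect"
)

// elements walks a slice or array whose element type is only known at
// run time and returns a printable form of each element.
func elements(s interface{}) []string {
	v := reflect.ValueOf(s)
	if v.Kind() != reflect.Slice && v.Kind() != reflect.Array {
		return nil
	}
	out := make([]string, 0, v.Len())
	for i := 0; i < v.Len(); i++ {
		// Index finds element i using information from the dynamic type.
		out = append(out, fmt.Sprint(v.Index(i).Interface()))
	}
	return out
}

// elemSize reports the element size in bytes recorded in the dynamic
// type. It assumes s is a slice or array; other kinds may panic.
func elemSize(s interface{}) uintptr {
	return reflect.TypeOf(s).Elem().Size()
}

func main() {
	fmt.Println(elements([]string{"a", "b"}), elemSize([]int32{}))
	fmt.Println(elements([2]int{7, 8}), elemSize([]float64{}))
}
```

The element size is looked up from the type at run time, which is the extra cost the text attributes to indexing a generic value.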
+ +### Maps + +Similarly, if a generic value has a generic type that is a map, +programs may index into the map, check whether an index is present, +assign a value to the map, and range over the map. + +### Channels + +Similarly, if a generic value has a generic type that is a channel, +programs may send and receive values of the appropriate generic type, +and range over the channel. + +### Functions + +If a generic value has a generic type that is a function, programs may +call the function. +Where the function has parameters which are generic types, the +arguments must have identical generic type or a type assertion must be +used. +This is similar to `reflect.Call`. + +### Interfaces + +If a generic value has a generic type that is an interface, programs +may call methods on the interface. +This is much like calling a function. +Note that a type assertion on a generic value may return an interface +type, unlike a type assertion on an interface value. +This in turn means that a double type assertion is meaningful. + +``` + a.(InterfaceType).(int) +``` + +### Structs + +If a generic value has a generic type that is a struct, programs may +get and set struct fields. +In general this requires finding the description of the field in the +dynamic type to discover the appropriate concrete type and field +offsets. + +### That is all + +Operations that are not explicitly permitted for a generic value are +forbidden. + +### Scope of generic type parameters + +When a generic type is used, the names of the type parameters have +scope. +The generic type normally provides the type of some name; the scope of +the unbound type parameters is the same as the scope of that name. +In cases where the generic type does not provide the type of some +name, then the unbound type parameters have no scope. +Within the scope of an unbound type parameter, it may be used as a +generic type. 
+ +``` +func Head(v Vector(t type)) { + var first t + first = v[0] +} +``` + +### Scopes of function parameters + +In order to make this work nicely, we change the scope of function +receivers, parameters, and results. +We now define their scope to start immediately after they are defined, +rather than starting in the body of the function. +This means that a function parameter may refer to the unbound type +parameters of an earlier function parameter. + +``` +func SetHead(v Vector(t type), e t) t { + v[0] = e + return e +} +``` + +The main effect of this change in scope will be to change the +behaviour of cases where a function parameter has the same name as a +global type, and that global type was used to define a subsequent +function parameter. + +\(The alternate notation approach would instead define that the +variables only persist for the top-level statement in which they +appear, so that + +``` +func SetHead(v Vector($t), e $t) $t { ... } +``` + +doesn't have to worry about which t is the declaration (the ML +approach). +Another alternative is to do what C++ does and explicitly introduce +them. + +``` +generic(t) func SetHead(v Vector(t), e t) t { ... } +``` + +\) + +### Function call argument type checking + +As can be seen from the previous example, it is possible to use generic +types to write functions in which two parameters have related types. +These types are checked at the point of the function call. +If the types at the call site are concrete, the type checking is +always done by the compiler. +If the types are generic, then the function call is only permitted if +the argument types are identical to the parameter types. +The arguments are matched against the required types from left to +right, determining bindings for the unbound type parameters. +Any failure of binding causes the compiler to reject the call with a +type error. 
+Any case where one unbound type parameter is matched to a different +unbound type parameter causes the compiler to reject the call with a +type error. +In those cases, the call site must use an explicit type assertion, +checked at run time, so that the call can be type checked at compile +time. + +``` + var vi Vector(int) + var i int + SetHead(vi, 1) // OK + SetHead(vi, i) // OK + var vt Vector(t) + var i1 t + SetHead(vt, 1) // OK? Unclear. See above. + SetHead(vt, i) // Not OK; needs syntax + SetHead(vt, i1) // OK + var i2 q // q is another generic type + SetHead(vt, i2) // Not OK + SetHead(vt, i2.(t)) // OK (may fail at run time) +``` + +### Function result types + +The result type of a function can be a generic type. +The result will be returned to the caller as a generic value. +If the call site uses concrete types, then the result type can often +be determined at compile time. +The compiler will implicitly insert a type assertion to the expected +concrete type. +This type assertion can not fail, because the function will have +ensured that the result has the matching type. +In other cases, the result type may be a generic type, in which case +the returned generic value will be handled like any other generic +value. + +### Making one function parameter the same type as another + +Sometimes we want to say that two function parameters have the same +type, or that a result parameter has the same type as a function +parameter, without specifying the type of that parameter. +This can be done like this: + +``` +func Choose(which bool, a t type, b t) t +``` + +\(Or `func Choose(which bool, a $t, b $t) $t`\) + +The argument `a` is passed as a generic value and binds the type +parameter `t` to `a`'s type. +The caller must ensure that `b` has the same type as `a`. +`b` will then also be passed as a generic value. +The result will be returned as a generic value; it must again have the +same type. 
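
For readers who want something runnable, the same "two parameters and the result share one unknown type" relationship can be sketched in the type-parameter syntax that Go adopted in version 1.18. That syntax is not part of this proposal, and it is checked entirely at compile time rather than by passing generic values, so treat the following only as an illustration of the constraint that `Choose` expresses.

```go
package main

import "fmt"

// Choose returns a when which is true and b otherwise.
// Both arguments and the result must have the same type T.
func Choose[T any](which bool, a, b T) T {
	if which {
		return a
	}
	return b
}

func main() {
	fmt.Println(Choose(true, 1, 2))
	fmt.Println(Choose(false, "x", "y"))
	// Choose(true, 1, "y") is rejected at compile time, because no
	// single type T fits both arguments.
}
```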
+ +Another example: + +``` +type Slice(t) []t +func Repeat(x t type, n int) Slice(t) { + a := make(Slice(t), n) + for i := range a { + a[i] = x + } + return a +} +``` + +\(Or `func Repeat(x $t, n int) []$t { ... }`\) + +### Nested generic types + +It is of course possible for the argument to a generic type to itself +be a generic type. +The above rules are intended to permit this case. + +``` +type Pair(a, b) struct { + first a + second b +} +func Sum(a Pair(Vector(t type), Vector(t))) Vector(t) +``` + +Note that the first occurrence of `t` uses the `type` keyword to +declare it as an unbound type parameter. +The second and third occurrences do not, which means that they are the +type whose name is `t`. +The scoping rules mean that that `t` is the same as the one bound by the +first use of Vector. +When this function is called, the type checking will match `t` to the +argument to the first `Vector`, and then require that the same `t` +appear as the argument to the second `Vector`. + +### Unknown generic types + +Note that it is possible to have a generic value whose type can not be +named. +This can happen when a result variable has a generic type. + +``` +func Unknown() t type +``` + +Now if one writes + +``` +x := Unknown() +``` + +then x is a generic value of unknown and unnamed type. +About all you can do with such a value is assign it using `:=` and use +it in a type assertion or type switch. +The only way that `Unknown` could return a value would be to use some +sort of conversion. + +### Methods on generic types + +A generic type may have methods. +When a generic type is used as a receiver, the arguments must all be +simple unbound names. +Any time a value of this generic type is created, whether the value is +generic or concrete, it will acquire all the methods defined on the +generic type. +When calling these methods, the receiver will of course be passed as a +generic value. 
+ +``` +func (v Vector(t type)) At(i int) t { + return v[i] +} + +func (v Vector(t type)) Set(i int, x t) { + v[i] = x +} +``` + +A longer example: + +``` +package hashmap + +type bucket(keytype, valtype) struct { + next *bucket(keytype, valtype) + key keytype + val valtype +} + +type Hashfn(keytype) func(keytype) uint + +type Eqfn(keytype) func(keytype, keytype) bool + +type Hashmap(keytype, valtype) struct { + hashfn Hashfn(keytype) + eqfn Eqfn(keytype) + buckets []bucket(keytype, valtype) + entries int +} + +func New(hashfn Hashfn(keytype type), eqfn Eqfn(keytype), + _ valtype type) *Hashmap(keytype, valtype) { + return &Hashmap(keytype, valtype){hashfn, eqfn, + make([]bucket(keytype, valtype), 16), + 0} +} + +// Note that the dummy valtype parameter in the New function +// exists only to get valtype into the function signature. +// This feels wrong. + +func (p *Hashmap(keytype type, valtype type)) + Lookup(key keytype) (found bool, val valtype) { + h := p.hashfn(key) % len(p.buckets) + for b := p.buckets[h]; b != nil; b = b.next { + if p.eqfn(key, b.key) { + return true, b.val + } + } + return +} +``` + +In the alternate syntax: + +``` +package hash + +type bucket($key, $val) struct { + next *bucket($key, $val) + key $key + val $val +} + +type Map($key, $val) struct { + hash func($key) uint + eq func($key, $key) bool + buckets []bucket($key, $val) + entries int +} + +func New(hash func($key) uint, eq func($key, $key) bool, _ $val) + *Map($key, $val) { + return &Map($key, $val){ + hash, + eq, + make([]bucket($key, $val), 16), + 0, + } +} + +// Again note dummy $val in the arguments to New. +``` + +## Concepts + +In order to make type functions more precise, we can additionally +permit the definition of the type function to specify an interface. +This means that whenever the type function is used, the argument is +required to satisfy the interface. +In homage to the proposed but not accepted C++0x notion, we call this +a concept. 
+ +``` +type PrintableVector(t Stringer) []t +``` + +Now `PrintableVector` may only be used with a type that implements the +interface `Stringer`. +This in turn means that given a value whose type is the parameter to +`PrintableVector`, a program may call the `String` method on that +value. + +``` +func Concat(p PrintableVector(t type)) string { + s := "" + for _, v := range p { + s += v.String() + } + return s +} +``` + +Attempting to pass `[]int` to `Concat` will cause the compiler to +issue a type checking error. +But if `MyInt` has a `String` method, then calling `Concat` with +`[]MyInt` will succeed. + +The interface restriction may also be used with a parameter whose type +is a generic type: + +``` +func Print(a t type Stringer) +``` + +This example is not useful, as it is pretty much equivalent to passing +a value of type Stringer, but there is a useful example below. + +Concepts specified in type functions are type checked as usual. +If the compiler does not know statically that the type implements the +interface, then the type check fails. +In such cases an explicit type assertion is required. + +``` +func MyConcat(v Vector(t type)) string { + if pv, ok := v.(PrintableVector(t)); ok { + return Concat(pv) + } + return "unprintable" +} +``` + +\(Note that this does a type assertion to a generic type. Should it +use a different syntax?\) + +The concept must be an interface type, but it may of course be a +generic interface type. +When using a generic interface type as a concept, the generic +interface type may itself use as an argument the type parameter which +it is restricting. + +``` +type Lesser(t) interface { + Less(t) bool +} +func Min(a, b t type Lesser(t)) t { + if a.Less(b) { + return a + } + return b +} +``` + +\(`type Mintype($t Lesser($t)) $t`\) + +This is complex but useful. 
OK, the function `Min` is not all that +useful, but this looks better when we write + +``` +func Sort(v Vector(t type Lesser(t))) +``` + +which can sort any Vector whose element type implements the Lesser +interface. + +## A note on operator methods + +You will have noticed that there is no way to use an operator with a +generic value. +For example, you can not add two generic values together. +If we implement operator methods, then it will be possible to use this +in conjunction with the interface restrictions to write simple generic +code which uses operators. +While operator methods are of course a separate extension, I think +it's important to ensure that they can work well with generic values. + +``` +type Addable(t) interface { + Binary+(t) t +} +type AddableSlice(t Addable(t)) []t +func Sum(v AddableSlice) t { + var sum t + for _, v := range v { + sum = sum + v + } + return sum +} +``` + +## Some comparisons to C++ templates + +Obviously the big difference between this proposal and C++ templates +is that C++ templates are compiled separately. +This has various consequences. +Some C++ template features that can not be implemented using type +functions: + +* C++ templates permit data structures to be instantiated differently for different component types. +* C++ templates may be instantiated for constants, not just for types. +* C++ permits specific instantiations for specific types or constants. + +The advantages of type functions are: + +* Faster compile time. +* No need for two-phase name lookup. Only the scope of the definition is relevant, not the scope of use. +* Clear syntax for separating compile-time errors from run-time errors. Avoids complex compile-time error messages at the cost of only detecting some problems at runtime. +* Concepts also permit clear compile time errors. + +In general, C++ templates have the advantages and disadvantages of +preprocessor macros. + +## Summary + +This proposal will not be adopted. +It's basically terrible. 
+ +The syntax is confusing: ```MyVector(t)(v)``` looks like two function +calls, but it's actually a type conversion to a type function. + +The notion of an unbound type parameter is confusing, and the +syntax (a trailing `type` keyword) only increases that confusion. + +Types in Go refer to themselves. +The discussion of type identity does not discuss this. +It means that comparing type identity at run time, such as in a type +assertion, requires avoiding loops. +Generic type assertions look like ordinary type assertions, but are +not constant time. + +The need to pass an instance of the value type to `hashmap.New` is a +symptom of a deeper problem. +This proposal is trying to treat generic types like interface types, +but interface types have a simple common representation and generic +types do not. +Value representations should probably be expressed in the type system, +not inferred at run time. + +The proposal suggests that generic functions can be compiled once. +It also claims that generic types can have methods. +If I write + +``` +type Vector(t) []t + +func (v Vector(t)) Read(b []t) (int, error) { + return copy(b, v), nil +} +``` + +then `Vector(byte)` should implement `io.Reader`. +But `Vector(t).Read` is going to be implemented using a generic value, +while `io.Reader` expects a concrete value. +Where is the code that translates from the generic value to the +concrete value? diff --git a/design/15292/2011-03-gen.md b/design/15292/2011-03-gen.md new file mode 100644 index 00000000..cd5a0f08 --- /dev/null +++ b/design/15292/2011-03-gen.md @@ -0,0 +1,747 @@ +# Generalized Types + +This is a proposal for adding generics to Go, written by Ian Lance +Taylor in March, 2011. +This proposal will not be adopted. +It is being presented as an example for what a complete generics +proposal must cover. + + +## Introduction + +This document describes a possible implementation of generalized types +in Go. 
+We introduce a new keyword, `gen`, which declares one or more type +parameters: types that are not known at compile time. +These type parameters may then be used in other declarations, +producing generalized types and functions. + +Some goals, borrowed from [Garcia et al](http://www.crest.iu.edu/publications/prints/2003/comparing_generic_programming03.pdf): + +* Do not require an explicit relationship between a definition of a generalized function and its use. The function should be callable with any type that fits the required form. +* Permit interfaces to express relationships between types of methods, as in a comparison function that takes two parameters of the same unknown type. +* Given a generalized type, make it possible to use related types, such as a slice of that type. +* Do not require explicit instantiation of generalized functions. +* Permit type aliasing of generalized types. + +The type parameter introduced by a `gen` declaration is a concept that +exists at compile time. +Any actual value that exists at runtime has a specific concrete type: +an ordinary non-generalized type, or a generalized type that has been +instantiated as a concrete type. +Generalized functions will be compiled to handle values whose types +are supplied at runtime. + +This is what changes in the language: + +* There is a new syntax for declaring a type parameter (or parameters) for the scope of one or more declarations. +* There is a new syntax for specifying the concrete type(s) to use when using something declared with a type parameter. +* There is a new syntax for converting values of concrete type, and untyped constants, to generalized types. Also values of generalized type are permitted in type assertions. +* Within a function, we define the operations permitted on values with a generalized type. 
+
+## Syntax
+
+Any package-scope type or function declaration may be preceded with
+the new keyword `gen` followed by a list of type parameter names in
+square brackets:
+
+```
+gen [T] type Vector []T
+```
+
+This defines `T` as a type parameter for the generalized type `Vector`.
+The scope of `Vector` is the same as it would be if `gen` did not appear.
+
+A use of a generalized type will normally provide specific types to
+use for the type parameters.
+This is done using square brackets following the generalized type.
+
+```
+type VectorInt Vector[int]
+var v1 Vector[int]
+var v2 Vector[float32]
+gen [T1, T2] type Pair struct { first T1; second T2 }
+var v3 Pair[int, string]
+```
+
+Type parameters may also be used with functions.
+
+```
+gen [T] func SetHead(v Vector[T], e T) T {
+	v[0] = e
+	return e
+}
+```
+
+For convenience, we permit a modified version of the factoring syntax
+used with `var`, `type`, and `const` to permit a series of
+declarations to share the same type parameters.
+
+```
+gen [T1, T2] (
+type Pair struct { first T1; second T2 }
+
+func MakePair(first T1, second T2) Pair {
+	return Pair{first, second}
+}
+)
+```
+
+References to other names declared within the same `gen` block do not
+have to specify the type parameters.
+When the type parameters are omitted, they are assumed to simply be
+the parameters declared for the block.
+In the above example, `Pair` when used as the result type of
+`MakePair` is equivalent to `Pair[T1, T2]`.
+
+As with generalized types, we must specify the types when we refer to
+a generalized function (but see the section on type deduction, below).
+
+```
+var MakeIntPair = MakePair[int, int]
+var IntPairZero = MakeIntPair(0, 0)
+```
+
+A generalized type can have methods.
+
+```
+gen [T] func (v *Vector[T]) SetHeadMethod(e T) T {
+	(*v)[0] = e
+	return e
+}
+```
+
+Of course a method of a generalized type may itself be a generalized function.
+ +``` +gen [T, T2] func (v *Vector[T]) Transform(f func(T) T2) Vector[T2] +``` + +The `gen` keyword may only be used with a type or function. +It may only appear in package scope, not within a function. +One `gen` keyword may appear within the scope of another. +In that case, any use of the generalized type or function must specify +all the type parameters, starting with the outermost ones. +A different way of writing the last example would be: + +``` +gen [T] ( +type Vector []T +gen [T2] func (v *Vector[T]) Transform(f func(T) T2) Vector[T2] +) + +var v Vector[int] +var v2 = v.Transform[int, string](f) +``` + +Type deduction, described below, would permit omitting the +`[int, string]` in the last line, based on the types of `v` and `f`. + +### A note on syntax + +While the use of the `gen` keyword fits reasonably well into the +existing Go language, the use of square brackets to denote the +specific types is new to Go. +We have considered a number of different approaches: + +* Use angle brackets, as in `Pair`. This has the advantage of being familiar to C++ and Java programmers. Unfortunately, it means that `f(true)` can be parsed as either a call to function `f` or a comparison of `f0 +} + +// Vector of elements that may be compared with themselves. +gen [T Comparer[T]] type SortableVector []T +``` + +## Example + +``` +package hashmap + +gen [Keytype, Valtype] ( + +type bucket struct { + next *bucket + key Keytype + val Valtype +} + +type Hashfn func(Keytype) uint +type Eqfn func(Keytype, Keytype) bool + +type Hashmap struct { + hashfn Hashfn + eqfn Eqfn + buckets []bucket + entries int +} + +// This function must be called with explicit type parameters, as +// there is no way to deduce the value type. 
+// For example,
+// h := hashmap.New[int, string](hashfn, eqfn)
+func New(hashfn Hashfn, eqfn Eqfn) *Hashmap {
+	return &Hashmap{hashfn, eqfn, make([]bucket, 16), 0}
+}
+
+func (p *Hashmap) Lookup(key Keytype) (val Valtype, found bool) {
+	h := p.hashfn(key) % uint(len(p.buckets))
+	for b := p.buckets[h]; b != nil; b = b.next {
+		if p.eqfn(key, b.key) {
+			return b.val, true
+		}
+	}
+	return
+}
+
+func (p *Hashmap) Insert(key Keytype, val Valtype) (inserted bool) {
+	// Implementation omitted.
+}
+
+) // Ends gen.
+
+package sample
+
+import (
+	"fmt"
+	"hashmap"
+	"os"
+)
+
+func hashint(i int) uint {
+	return uint(i)
+}
+
+func eqint(i, j int) bool {
+	return i == j
+}
+
+var v = hashmap.New[int, string](hashint, eqint)
+
+func Add(id int, name string) {
+	if !v.Insert(id, name) {
+		fmt.Println("duplicate id", id)
+		os.Exit(1)
+	}
+}
+
+func Find(id int) string {
+	val, found := v.Lookup(id)
+	if !found {
+		fmt.Println("missing id", id)
+		os.Exit(1)
+	}
+	return val
+}
+```
+
+## Language spec changes
+
+This is an outline of the changes required to the language spec.
+
+### Types
+
+A few paragraphs will be added to discuss generalized types.
+
+### Struct types
+
+While a struct may use a type parameter as an anonymous field, within
+generalized code only the generalized definition is considered when
+resolving field references.
+That is, given
+
+```
+gen [T] type MyGenStruct struct { T }
+type MyRealStruct struct { i int }
+type MyInstStruct MyGenStruct[MyRealStruct]
+gen [T] func GetI(p *MyGenStruct[T]) int {
+	return p.i // INVALID
+}
+func MyGetI(p *MyInstStruct) int {
+	return GetI(p)
+}
+```
+
+the function `GetI` may not refer to the field `i` even though the
+field exists when called from `MyGetI`.
+(This restriction is fairly obvious if you think about it, but is
+explicitly stated for clarity.)
+
+### Type Identity
+
+We define type identity for generalized types.
+Two generalized types are identical if they have the same name and the
+type parameters are identical.
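
The identity rule can be observed in a compilable form. The following is only a sketch under a stated assumption: it uses the type parameter syntax that Go adopted in Go 1.18, not the `gen` syntax proposed in this document, and the helper `sameType` is a hypothetical name introduced for illustration.

```go
package main

import (
	"fmt"
	"reflect"
)

// Vector is written in Go 1.18 syntax; the proposal would spell this
// "gen [T] type Vector []T".
type Vector[T any] []T

// sameType (hypothetical helper) reports whether two values have
// identical dynamic types by comparing their runtime type descriptors.
func sameType(a, b any) bool {
	return reflect.TypeOf(a) == reflect.TypeOf(b)
}

func main() {
	// Same name, identical type arguments.
	fmt.Println(sameType(Vector[int]{}, Vector[int]{}))
	// Same name, different type arguments.
	fmt.Println(sameType(Vector[int]{}, Vector[string]{}))
	// Vector[int] is a defined type, so it is distinct from []int.
	fmt.Println(sameType(Vector[int]{}, []int{}))
}
```

Only the first comparison reports identical types, which matches the rule stated above: the name and every type argument must agree.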
+
+### Assignability
+
+We define assignability for generalized types.
+A value `x` of generalized type `T1` is assignable to a variable of
+type `T2` if `T1` and `T2` are identical.
+A value `x` of concrete type is never assignable to a variable of
+generalized type: a generalized type coercion is required (see below).
+Similarly, a value `x` of generalized type is never assignable to a
+variable of concrete type: a type assertion is required.
+For example (more details given below):
+
+```
+gen [T] func Zero() (z T) {
+	z = 0      // INVALID: concrete to generalized.
+	z = int(0) // INVALID: concrete to generalized.
+	z = 0.[T]  // Valid: generalized type coercion.
+}
+gen [T] func ToInt(v T) (r int) {
+	r = v            // INVALID: generalized to concrete
+	r = int(v)       // INVALID: no conversions for gen types
+	r, ok := v.(int) // Valid: generalized type assertion.
+	if !ok {
+		panic("not int")
+	}
+}
+```
+
+### Declarations and scope
+
+A new section Generalized declarations is added, consisting of a few
+paragraphs that describe generalized declarations and the `gen` syntax.
+
+### Indexes
+
+The new syntax `x[T]` for a generalized type or function is defined,
+where `T` is a type and `x` is the name of some type or function
+declared within a `gen` scope.
+
+### Type assertions
+
+We define type assertions using generalized types.
+
+Given `x.(T)` where `x` is a value with generalized type and `T` is a
+concrete type, the type assertion succeeds if the concrete type of `x`
+is identical to `T`, or, if `T` is an interface type, the concrete
+type implements the interface `T`.
+In other words, pretty much the same as doing a type assertion of a
+value of interface type.
+
+If `x` and `T` are both generalized types, we do the same test using
+the concrete types of `x` and `T`.
+
+In general these assertions must be checked at runtime.
+
+### Generalized type coercions
+
+We introduce a new syntax for coercing a value of concrete type to a
+generalized type.
+Where `x` is a value with concrete type and `T` is a generalized type, +the expression `x.[T]` coerces `x` to the generalized type `T`. +The generalized type coercion may succeed or fail, just as with a type +assertion. +However, it is not a pure type assertion, as we permit `x` to be an +untyped constant. +The generalized type coercion succeeds if the concrete type matches +the generalized type, where any parameters of the generalized type +match the appropriate portion of the concrete type. +If the same parameter appears more than once in the generalized type, +it must match identical types in the concrete type. +If the value is an untyped constant, the coercion succeeds if an +assignment of that constant to the concrete type would succeed at +compile time. + +### Calls + +This section is extended to describe the type deduction algorithm used +to avoid explicit type parameters when possible. + +An implicit generalized type conversion is applied to convert the +arguments to the expected generalized type, even though normally +values of concrete type are not assignable to variables of generalized +type. +Type checking ensures that the arguments must be assignable to the +concrete type which is either specified or deduced, and so this +implicit generalized type conversion will always succeed. + +When a result parameter has a generalized type, an implicit type +assertion is applied to convert back to the type that the caller +expects, which may be a concrete type. +The type expected by the caller is determined by the type parameters +passed to the function, whether determined via type deduction or not. +This implicit type assertion will always succeed. +For example, in + +``` +gen [T] func Identity(v T) T { return v } +func Call(i int) { j := Identity(i) } +``` + +the variable `j` gets the type `int`, and an implicit type assertion +converts the return value of `Identity[int]` to `int`. + +### Conversions + +Nothing needs to change in this section. 
+I just want to note explicitly that there are no type conversions for +generalized types other than the standard conversions that apply to +all types. + +### Type switches + +A type switch may be used on a value of generalized type. +Type switch cases may include generalized types. +The rules are the same as for type assertions. + +### For statements + +A range clause may be used with a value of generalized type, if the +generalized type is known to be a slice, array, map or channel. + +## Implementation + +Any actual value in Go will have a concrete type. +The implementation issue that arises is how to compile a function that +has parameters with generalized type. + +### Representation + +When calling a function that uses type parameters, the type parameters +are passed first, as pointers to a runtime type descriptor. +The type parameters are thus literally additional parameters to the +functions. + +### Types + +In some cases it will be necessary to create a new type at runtime, +which means creating a new runtime type descriptor. +It will be necessary to ensure that type descriptor comparisons +continue to work correctly. +For example, the hashmap example above will require creating a new +type for each call to `hashmap.New` for the concrete types that are used +in the call. +The reflect package already creates new runtime type descriptors in +the functions `PtrTo`, `ChanOf`, `FuncOf`, etc. + +Type reflection on a generalized type will return the appropriate +runtime type descriptor, which may have been newly created. +Calling `Name()` on such a type descriptor will return a name with the +appropriate type parameters: e.g, `“Vector[int]”`. + +### Variable declarations + +A local variable in a function may be declared with a generalized +type. +In the general case, the size of the variable is unknown, and must be +retrieved from the type descriptor. +Declaring a local variable of unknown size will dynamically allocate +zeroed memory of the appropriate size. 
+As an optimization the memory may be allocated on the stack when there +is sufficient room. + +### Composite literals + +A generalized type that is defined to be a struct, array, slice, or +map type may be used to create a composite literal. +The expression has the same generalized type. +The elements of the composite literal must follow the assignability +rules. + +### Selectors + +When `x` is a value of generalized type that is a struct, `x.f` can +refer to a field of that struct. +Whether `f` is a field of `x` is known at compile time. +The exact offset of the field in the struct value may not be known. +When it is not known, the field offset is retrieved from the type +descriptor at runtime. + +Similarly, `x.f` may refer to a method of the type. +In this case the method is always known at compile time. + +As noted above under struct types, if a generalized struct type uses a +type parameter as an anonymous field, the compiler does not attempt to +look up a field name in the concrete type of the field at runtime. + +### Indexes + +A value of a generalized type that is an array, slice or map may be indexed. +Note that when indexing into a map type, the type of the value must be +assignable to the map’s key type; +in practice this means that if the map’s key type is generalized, the +value must itself have the same generalized type. +Indexing into a generalized array or slice may require multiplying by +the element size found in the type descriptor. +Indexing into a generalized map may require a new runtime function. + +### Slices + +A value of a generalized type that is an array or slice may itself be +sliced. +This operation is essentially the same as a slice of a value of +concrete type. + +### Type Assertions + +A type assertion generally requires a runtime check, and in the +general case requires comparing two concrete types at runtime, where +one of the types is known to instantiate some generalized type. 
+The complexity of the runtime check is linear in the number of tokens
+in the generalized type, and the check requires storage space to hold
+type parameters while it runs.
+This check could be inlined into the code, or it could use a general
+purpose runtime check that compares the concrete type descriptor to a
+similar representation of the generalized type.
+
+### Calls
+
+Function calls can require converting normal values to generalized
+values.
+This operation depends on the representation chosen for the
+generalized value.
+In the worst case it will be similar to passing a normal value to a
+function that takes an interface type.
+When calling a function with type parameters, the type parameters will
+be passed first, as a pointer to a runtime type descriptor.
+
+Function calls can also require converting generalized return values
+to normal values.
+This is done via an implicitly inserted type assertion.
+Depending on the representation, this may not require any actual code
+to be generated.
+
+### Communication operators
+
+We have to implement sending and receiving generalized values for
+channels of generalized type.
+
+### Assignments
+
+We have to implement assignment of generalized values.
+This will be based on the runtime type descriptor.
+
+### Type switches
+
+We have to implement type switches using generalized types.
+This will most likely devolve into a series of if statements using
+type assertions.
+
+### For statements
+
+We have to implement for statements with range clauses over
+generalized types.
+This is similar to the indexing and communication operators.
+
+### Select statements
+
+We have to implement select on channels of generalized type.
+
+### Return statements
+
+We have to implement returning a value of generalized type.
+
+### Specialization of functions
+
+This proposal is intended to support compiling a generalized function
+into code that operates on generalized values.
+In fact, it requires that this work.
+
+```
+package p1
+gen [T] func Call(f func(T) T, v T) T {
+	return f(v)
+}
+
+package p2
+func SliceIdentity(a []int) []int {
+	return a
+}
+
+package p3
+var v = p1.Call(p2.SliceIdentity, make([]int, 10))
+```
+
+Here `Call` has to support calling a generalized function.
+There is no straightforward specialization process that can implement
+this case.
+(It could be done if the full source code of p1 and p2 is available either when compiling p3 or at link time;
+that is how C++ does it, but it is not an approach that fits well with Go.)
+
+However, for many cases, this proposal can be implemented using
+function specialization.
+Whenever the compiler can use type deduction for a function call, and
+the types are known concrete types, and the body of the function is
+available, the compiler can generate a version of the function
+specialized for those types.
+This is, therefore, an optional optimization, in effect a form of
+cross-package inlining, which costs compilation time but improves
+run-time performance.
+
+## Methods on builtin types
+
+This is an optional addendum to the proposal described above.
+
+The proposal does not provide a convenient way to write a function
+that works on any numeric type.
+For example, there is no convenient way to write this:
+
+```
+gen [T] func SliceAverage(a []T) T {
+	s := T(0)
+	for _, v := range a {
+		s += v
+	}
+	return s / len(a)
+}
+```
+
+It would be nice if that function worked for any numeric type.
+However, it is not permitted under the proposal described above,
+because of the use of `+=` and `/`.
+These operators are not available for every type and therefore are not
+available for a generalized type.
+
+This approach does work:
+
+```
+gen [T] type Number interface {
+	Plus(T) T
+	Divide(T) T
+}
+
+gen [T Number[T]] func SliceAverage(a []T) T {
+	s := 0.[T]
+	for _, v := range a {
+		s = s.Plus(v)
+	}
+	return s.Divide(len(a))
+}
+```
+
+However, this requires writing explicit `Plus` and `Divide` methods for
+each type you want to use.
+These methods are themselves boilerplate:
+
+```
+func (i MyNum) Plus(v MyNum) MyNum { return i + v }
+func (i MyNum) Divide(v MyNum) MyNum { return i / v }
+```
+
+This proposal does not help with this kind of boilerplate function,
+because there is no way to use operators with generalized values.
+
+There are a few ways to solve this.
+One way that seems to fit well with Go as extended by this proposal is
+to declare that for all types that support some language operator, the
+type has a corresponding method.
+That is, we say that if the type can be used with `+`, the language
+defines a method `Plus` (or `Binary+` or whatever) for the type that
+implements the operation.
+This method can then be picked up by an interface such as the above,
+and the standard library can define convenient aggregate interfaces,
+such as an interface listing all the methods supported by an integer
+type.
+
+Note that it would not help for the standard library to define a
+`Plus` method for every integer type, as those methods would not carry
+over to user defined types.
+
+## Operator methods
+
+It is of course a smallish step from those language-defined methods to
+having operator methods, which would permit writing generalized code
+using operators rather than method calls. For the purposes of using
+generalized types, however, this is less important than having
+language defined methods for operators.
+
+## Summary
+
+This proposal will not be adopted.
+It has significant flaws.
+
+The factored `gen` syntax is convenient but looks awkward on the page.
+You wind up with a trailing close parenthesis after a set of function
+definitions.
+Indenting all the function definitions looks silly. + +This proposal doesn't let me write a trivial generalized `Max` +function, unless we include operator methods. +Even when we include operator methods, `Max` has to be written in +terms of a `Less` method. + +The handling of untyped constants in generalized functions is +extremely awkward. +They must always use a generalized type coercion. + +While this proposal is more or less palatable for data structures, +it is much weaker for functions. +You basically can't do anything with a generalized type, +except assign it and call a method on it. +Writing standardized algorithms will require developing a whole +vocabulary of quasi-standard methods. + +The proposal doesn't help write functions that work on either `[]byte` +or `string`, unless those types get additional operator methods like +`Index` and `Len`. +Even operator methods don't help with using `range`. diff --git a/design/15292/2013-10-gen.md b/design/15292/2013-10-gen.md new file mode 100644 index 00000000..ff6bec96 --- /dev/null +++ b/design/15292/2013-10-gen.md @@ -0,0 +1,978 @@ +# Generalized Types In Go + +This is a proposal for adding generics to Go, written by Ian Lance +Taylor in October, 2013. +This proposal will not be adopted. +It is being presented as an example for what a complete generics +proposal must cover. + +## Introduction + +This document describes a possible implementation of generalized types +in Go. +We introduce a new keyword, `gen`, that declares one or more type +parameters: types that are not known at compile time. +These type parameters may then be used in other declarations, +producing generalized types and functions. + +Some goals, borrowed from [Garcia et al](http://www.crest.iu.edu/publications/prints/2003/comparing_generic_programming03.pdf): + +* Do not require an explicit relationship between a definition of a generalized function and its use. The function should be callable with any suitable type. 
+* Permit interfaces to express relationships between types of methods, as in a comparison function that takes two parameters of the same unknown type. +* Given a generalized type, make it possible to use related types, such as a slice of that type. +* Do not require explicit instantiation of generalized functions. +* Permit type aliasing of generalized types. + +## Background + +My earlier proposal for generalized types had some flaws. + +People expect functions that operate on generalized types to be fast. +They do not want a reflection based interface in all cases. +The question is how to support that without excessively slowing down +the compiler. + +People want to be able to write simple generalized functions like +`Sum(v []T) T`, a function that sums the values in the slice `v`. +They are prepared to assume that `T` is a numeric type. +They don’t want to have to write a set of methods simply to implement +`Sum` or the many other similar functions for every numeric type, +including their own named numeric types. + +People want to be able to write the same function to work on both +`[]byte` and `string`, without requiring a copy. + +People want to write functions on generalized types that support +simple operations like comparison. +That is, they want to write a function that uses a generalized type +and compares a value of that type to another value of the same type. +That was awkward in my earlier proposal: it required using a form of +the curiously recurring template pattern. + +Go’s use of structural typing means that you can use any type to meet +an interface without an explicit declaration. +Generalized types should work the same way. + +## Proposal + +We permit package-level type and func declarations to use generalized +types. +There are no restrictions on how these types may be used within their +scope. +At compile time each actual use of a generalized type or function is +instantiated by replacing the generalized type with some concrete +type. 
+This may happen multiple times with different concrete types.
+A concrete type is only permitted if all the operations used with the
+generalized types are permitted for the concrete type.
+How to implement this efficiently is discussed below.
+
+## Syntax
+
+Any package-scope type or func declaration may be preceded with the
+new keyword `gen` followed by a list of type parameter names in square
+brackets:
+
+```
+gen [T] type Vector []T
+```
+
+This defines `T` as a type parameter for the generalized type `Vector`.
+The scope of `Vector` is the same as it would be if `gen` did not appear.
+
+A use of a generalized type must provide specific types to use for the
+type parameters.
+This is normally done using square brackets following the generalized
+type.
+
+```
+type VectorInt Vector[int]
+var v1 Vector[int]
+var v2 Vector[float32]
+gen [T1, T2] type Pair struct { first T1; second T2 }
+var v3 Pair[int, string]
+```
+
+Type parameters may also be used with functions.
+
+```
+gen [T] func SetHead(v Vector[T], e T) T {
+	v[0] = e
+	return e
+}
+```
+
+We permit a modified version of the factoring syntax used with `var`,
+`type`, and `const` to permit a series of declarations to share the
+same type parameters.
+
+```
+gen [T1, T2] (
+type Pair struct { first T1; second T2 }
+
+func MakePair(first T1, second T2) Pair {
+	return Pair{first, second}
+}
+) // Ends gen.
+```
+
+References to other names declared within the same `gen` block do not
+have to specify the type parameters.
+When the type parameters are omitted, they are implied to simply be
+the parameters declared for the block.
+In the above example, `Pair` when used as the result type of `MakePair` is
+equivalent to `Pair[T1, T2]`.
+
+When this syntax is used we require that the entire contents of the
+block be valid for a given concrete type.
+The block is instantiated as a whole, not in individual pieces.
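
For readers who want something they can compile, the `Pair` and `MakePair` declarations can be approximated as follows. This is a sketch under a stated assumption: it uses the type parameter syntax that Go adopted in Go 1.18, not the `gen` syntax proposed here. That syntax has no factored block, so each declaration repeats its own type parameter list, and `Pair` must be written out as `Pair[T1, T2]`.

```go
package main

import "fmt"

// Pair in Go 1.18 syntax; the proposal would declare it inside
// "gen [T1, T2] ( ... )".
type Pair[T1, T2 any] struct {
	first  T1
	second T2
}

// MakePair repeats the type parameter list, because Go 1.18 has no
// equivalent of a gen block shared by several declarations.
func MakePair[T1, T2 any](first T1, second T2) Pair[T1, T2] {
	return Pair[T1, T2]{first, second}
}

// MakeIntPair is an explicit instantiation, used as an ordinary
// function value.
var MakeIntPair = MakePair[int, int]

func main() {
	p := MakeIntPair(0, 0)
	q := MakePair[int, string](1, "one")
	fmt.Println(p.first, p.second, q.first, q.second)
}
```

Each instantiation is checked against its concrete types at compile time, which is the per-type validity requirement described above, applied to one declaration at a time rather than to a whole block.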
+
+As with generalized types, we must specify the types when we refer to
+a generalized function (but see the section on type deduction, below).
+
+```
+var MakeIntPair = MakePair[int, int]
+var IntPairZero = MakeIntPair(0, 0)
+```
+
+A generalized type can have methods.
+
+```
+gen [T] func (v *Vector[T]) SetHead(e T) T {
+	(*v)[0] = e
+	return e
+}
+```
+
+Of course a method of a generalized type may itself be a generalized function.
+
+```
+gen [T, T2] func (v *Vector[T]) Transform(f func(T) T2) Vector[T2]
+```
+
+The `gen` keyword may only be used with a type or function.
+It may only appear in package scope, not within a function.
+A non-generalized type may not have a generalized method.
+
+A `gen` keyword may appear within the scope of another `gen` keyword.
+In that case, any use of the generalized type or function must specify
+all the type parameters, starting with the outermost ones.
+A different way of writing the last example would be:
+
+```
+gen [T] (
+type Vector []T
+gen [T2] func (v *Vector[T]) Transform(f func(T) T2) Vector[T2]
+)
+var v Vector[int]
+var v2 = v.Transform[int, string](strconv.Itoa)
+```
+
+Type deduction, described below, would permit omitting the
+`[int, string]` in the last line, based on the types of `v` and
+`strconv.Itoa`.
+Inner type parameters shadow outer ones with the same name, as in
+other scopes in Go (although it’s hard to see a use for this
+shadowing).
+
+### A note on syntax
+
+While the use of the `gen` keyword fits reasonably well into the
+existing Go language, the use of square brackets to denote the
+specific types is new to Go.
+We have considered a number of different approaches:
+
+* Use angle brackets, as in `Pair<int, string>`. This has the advantage of being familiar to C++ and Java programmers.
Unfortunately, it means that `f(true)` can be parsed as either a call to function `f` or a comparison of `f +func (p realT) plus(a T) T { + return p + a.(realT) // return converts realT to T +} +type realS []realT +func (s realS) len() int { + return len(s) +} +func (s realS) index(i int) T { + return s[i] // return converts realT to T +} +``` + +When instantiating `Sum` for a new type, the compiler builds and +compiles this code for the new type and calls the compiled version of +`Sum` with the interface value for the generated interface. +As shown above, the methods automatically use type assertions and +interface conversion as needed. +The actual call to `Sum(s)` will be rewritten as `GenSum(s).(realT)`. +The type assertions and interface conversions are checked at compile +time and will always succeed. + +Note that another way to say whether a concrete type may be used to +instantiate a generalized function is to ask whether the instantiation +templates may be instantiated and compiled without error. + +More complex cases may of course involve multiple generalized types in +a single expression such as a function call. +The compiler can arbitrarily pick one value to carry the methods, and +the method implementation will use type assertions to implement the +call. +This works because all the concrete types are known at instantiation +time. + +For cases like `make` where the compiler has no value on which to invoke +a method, there are two cases. +For a generalized function, the compiler can write the function as a +closure. +The actual instantiation will pass a special object in the closure +value. +See the use of make in the next example. 
+ +``` +gen [T1, T2, T3] func Combine(a []T1, b []T2, f func(T1, T2) T3) []T3 { + r := make([]T3, len(a)) + for i, v := range a { + r[i] = f(v, b[i]) + } + return r +} +``` + +This will be rewritten as + +``` +type S1 interface { + len() int + index(int) T1 +} +type S2 interface { + index(int) T2 +} +type S3 interface { + set(int, T3) +} +type F interface { + call(T1, T2) T3 +} +type T1 interface{} +type T2 interface{} +type T3 interface{} +type Maker interface { + make(int) S3 +} + +func GenCombine(a S1, b S2, f F) S3 { + // The maker var has type Maker and is accessed via the + // function’s closure. + r := maker.make(a.len()) + for i := 0; i < a.len(); i++ { + v := a.index(i) + r.set(i, f.call(v, b.index(i))) + } + return r +} +``` + +The associated instantiation templates will be + +``` +type realT1 +type realT2 +type realT3 +type realS1 []realT1 +type realS2 []realT2 +type realS3 []realT3 +type realF func(realT1, realT2) realT3 +type realMaker struct{} +func (s1 realS1) len() int { + return len(s1) +} +func (s1 realS1) index(i int) T1 { + return s1[i] +} +func (s2 realS2) index(i int) T2 { + return s2[i] +} +func (s3 realS3) set(i int, t3 T3) { + s3[i] = t3.(realT3) +} +func (f realF) call(t1 T1, t2 T2) T3 { + return f(t1.(realT1), t2.(realT2)) +} +func (m realMaker) make(l int) S3 { + return make(realS3, l) +} +``` + +A reference to `Combine` will then be built into a closure value with +`GenCombine` as the function and a value of the `Maker` interface in the +closure. +The dynamic type of the `Maker` value will be `realMaker`. +(If a function like `make` is invoked in a method on a generalized type, +we can’t use a closure, so we instead add an appropriate hidden method +on the generalized type.) + +With this implementation approach we are taking interface types in a +different direction. +The interface type in Go lets the programmer define methods and then +implement them for different types.
+With generalized types the programmer describes how the interface is +used, and the compiler uses that description to define the methods and +their implementation. + +Another example. +When a generalized type has methods, those methods need to be +instantiated as calls to the generalized methods with appropriate type +assertions and conversions. + +``` +gen [T] ( +type Vector []T +func (v Vector) Len() int { return len(v) } +func (v Vector) Index(i int) T { return v[i] } +) // Ends gen. + +type Readers interface { + Len() int + Index(i int) io.Reader +} + +type VectorReader Vector[io.Reader] +var _ = make(VectorReader, 0).(Readers) +``` + +The last statement asserts that `VectorReader[io.Reader]` supports the +Readers interface, as of course it should. +The `Vector` type implementation will look like this. + +``` +type T interface{} +type S interface { + len() int + index(i int) T +} +type V struct { + S +} +func (v V) Len() int { return v.S.len() } +func (v V) Index(i int) T { return v.S.index(i) } +``` + +The instantiation templates will be + +``` +type realT +type realS []realT +func (s realS) len() int { return len(s) } +func (s realS) index(i int) T { return s[i] } +``` + +When this is instantiated with `io.Reader`, the compiler will generate +additional methods. + +``` +func (s realS) Len() int { return V{s}.Len() } +func (s realS) Index(i int) io.Reader { +return V{s}.Index(i).(io.Reader) +} +``` + +With an example this simple this seems like a pointless runaround. +In general, though, the idea is that the bulk of the method +implementation will be in the `V` methods, which are compiled once. +The `realS` `len` and `index` methods support those `V` methods. +The `realS` `Len` and `Index` methods simply call the `V` methods with +appropriate type conversions. +That ensures that the `Index` method returns `io.Reader` rather than +`T` aka `interface{}`, so that `realS` satisfies the `Readers` +interface in the original code.
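The `Vector` translation above can be tried out by hand in ordinary Go. The following is only a sketch under stated assumptions: it fixes the type argument to `io.Reader`, writes out `realS` manually instead of having a compiler generate it, and adds a hypothetical `demo` helper (not part of the proposal) to exercise the result.

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// T stands in for the type parameter; every value is boxed in an interface.
type T interface{}

// S is the interface that the once-compiled Vector code is written against.
type S interface {
	len() int
	index(i int) T
}

// V carries the method bodies that are compiled a single time.
type V struct {
	S
}

func (v V) Len() int      { return v.S.len() }
func (v V) Index(i int) T { return v.S.index(i) }

// realS is the hand-written instantiation for the type argument io.Reader.
type realS []io.Reader

func (s realS) len() int      { return len(s) }
func (s realS) index(i int) T { return s[i] }

// The exported wrappers call the shared V methods and convert the result
// back to the type argument, so that realS satisfies Readers.
func (s realS) Len() int              { return V{s}.Len() }
func (s realS) Index(i int) io.Reader { return V{s}.Index(i).(io.Reader) }

type Readers interface {
	Len() int
	Index(i int) io.Reader
}

// demo is a hypothetical helper: it returns the vector length and the
// contents of the last reader, going only through the Readers interface.
func demo() (int, string) {
	var r Readers = realS{strings.NewReader("a"), strings.NewReader("bc")}
	data, err := io.ReadAll(r.Index(r.Len() - 1))
	if err != nil {
		return 0, ""
	}
	return r.Len(), string(data)
}

func main() {
	fmt.Println(demo())
}
```

The type assertion in `Index` cannot fail here, because every element stored in `realS` is already an `io.Reader`; this mirrors the claim that the inserted assertions are checked at compile time.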
+ +Now an example with a variadic method. + +``` +gen [T] func F(v T) { + v.M(1) + v.M("a", "b") +} +``` + +This looks odd, but it’s valid for a type with a method +`M(...interface{})`. This is rewritten as + +``` +type T interface { + m(...interface{}) // Not the same as T’s method M. +} +func GF(v T) { + v.m(1) + v.m("a", "b") +} +``` + +The instantiation templates will be + +``` +type realT +func (t realT) m(a ...interface{}) { + t.M(a...) +} +``` + +The basic rule is that if the same method is called with different +numbers of arguments, it must be instantiated with a variadic method. +If it is called with the same number of arguments with different +types, it must be instantiated with interface{} arguments. +In the general case the instantiation template may need to convert the +argument types to the types that the real type’s method accepts. + +Because generalized types are implemented by interface types, there is +no way to write generalized code that detects whether it was +instantiated with an interface type. +If the code can assume that a generalized function was instantiated by +a non-interface type, then it can detect that type using a type switch +or type assertion. +If it is important to be able to detect whether a generalized function +was instantiated with an interface type, some new mechanism will be +required. + +In the above examples I’ve always described a rewritten implementation +and instantiation templates. +There is of course another implementation method that will be +appropriate for simple generalized functions: inline the function. +That would most likely be the implementation method of choice for +something like a generalized `Max` function. +I think this could be handled as a minor variant on traditional +function inlining. + +In some cases the compiler can determine that only a specific number +of concrete types are permitted.
+For example, the `Sum` function can only be used with types that +support that binary `+` operator, which means a numeric or string +type. +In that case the compiler could choose to instantiate and compile the +function for each possible type. +Uses of the generalized function would then call the appropriate +instantiation. +This would be more work when compiling the generalized function, but +not much more work. +It would mean no extra work for uses of the generalized function. + +## Spec changes + +I don’t think many spec changes are needed other than a new section on +generalized types. +The syntax of generalized types would have to be described. +The implementation details do not need to be in the spec. +A generalized function instantiated with concrete types is valid if +rewriting the function with the concrete types would produce valid Go +code. + + +There is a minor exception to that approach: we would want to permit +type assertions and type switches for generalized types as well as for +interface types, even if the concrete type is not an interface type. + +## Compatibility + +This approach introduces a new keyword, `gen`. +However, this keyword is only permitted in top-level declarations. +That means that we could treat it as a new syntactic category, a +top-level keyword that is only recognized as such when parsing a +`TopLevelDecl`. +That would mean that any code that currently compiles with Go 1 would +continue to compile with this new proposal, as any existing use of gen +at top-level is invalid. + +We could also maintain Go 1 compatibility by using the existing `type` +keyword instead of `gen`. +The square brackets used around the generalized type names would make +this unambiguous. + +However, syntactic compatibility is only part of the story. +If this proposal is adopted there will be a push toward rewriting +certain parts of the standard library to use generalized types. 
+For example, people will want to change the `container` and `sort` +packages. +A farther reaching change will be changing `io.Writer` to take a +generalized type that is either `[]byte` or `string`, and work to push +that through the `net` and `os` packages down to the `syscall` package. +I do not know whether this work could be done while maintaining Go 1 +compatibility. +I do not even know if this work should be done, although I’m sure that +people will want it. + +## Comparison to other languages + +### C + +Generalized types in C are implemented via preprocessor macros. +The system described here can be seen as a macro system. +However, unlike in C, each generalized function must be complete and +compilable by itself. +The result is in some ways less powerful than C preprocessor macros, +but does not suffer from problems of namespace conflict and does not +require a completely separate language (the preprocessor language) for +implementation. + +### C++ + +The system described here can be seen as a subset of C++ templates. +Go’s very simple name lookup rules mean that there is none of the +confusion of dependent vs. non-dependent names. +Go’s lack of function overloading removes any concern over just which +instance of a name is being used. +Together these permit the explicit determination of constraints when +compiling a generalized function, whereas in C++ it’s nearly +impossible to determine whether a type may be used to instantiate a +template without effectively compiling the instantiated template and +looking for errors (or using concepts, proposed for later addition to +the language). + +C++ template metaprogramming uses template specialization, non-type +template parameters, variadic templates, and SFINAE to implement a +Turing complete language accessible at compile time.
+This is very powerful but at the same time has serious drawbacks: the +template metaprogramming language has a baroque syntax, no variables +or non-recursive loops, and is in general completely different from +C++ itself. +The system described here does not support anything similar to +template metaprogramming for Go. +I believe this is a feature. +I think the right way to implement such features in Go would be to add +support in the go tool for writing Go code to generate Go code, most +likely using the go/ast package and friends, which is in turn compiled +into the final program. +This would mean that the metaprogramming language in Go is itself Go. + +### Java + +I believe this system is slightly more powerful than Java generics, in +that it permits direct operations on basic types without requiring +explicit methods. +This system also does not use type erasure. +Although the implementation described above does insert type +assertions in various places, all the types are fully checked at +compile time and those type assertions will always succeed. + +## Summary + +This proposal will not be adopted. +It has significant flaws. + +The factored `gen` syntax is convenient but looks awkward on the page. +You wind up with a trailing close parenthesis after a set of function +definitions. +Indenting all the function definitions looks silly. + +The description of constraints in the implementation section is +imprecise. +It's hard to know how well it would work in practice. +Can the proposed implementation really handle the possible cases? + +A type switch that uses cases with generalized types may wind up with +identical types in multiple different cases. +We need to clearly explain which case is chosen. 
diff --git a/design/15292/2013-12-type-params.md b/design/15292/2013-12-type-params.md new file mode 100644 index 00000000..42cd3b75 --- /dev/null +++ b/design/15292/2013-12-type-params.md @@ -0,0 +1,1693 @@ +# Type Parameters in Go + +This is a proposal for adding generics to Go, written by Ian Lance +Taylor in December, 2013. +This proposal will not be adopted. +It is being presented as an example for what a complete generics +proposal must cover. + +## Introduction + +This document describes a possible implementation of type parameters +in Go. +We permit top-level types and functions to use type parameters: types +that are not known at compile time. +Types and functions that use parameters are called parameterized, as +in "a parameterized function." + +Some goals, borrowed from [Garcia et al](http://www.crest.iu.edu/publications/prints/2003/comparing_generic_programming03.pdf): + +* Do not require an explicit relationship between a definition of a parameterized function and its use. The function should be callable with any suitable type. +* Permit interfaces to express relationships between types of methods, as in a comparison method that takes two values of the same parameterized type. +* Given a type parameter, make it possible to use related types, such as a slice of that type. +* Do not require explicit instantiation of parameterized functions. +* Permit type aliasing of parameterized types. + +## Background + +My earlier proposal for generalized types had some flaws. + +This document is similar to my October 2013 proposal, but with a +different terminology and syntax, and many more details on +implementation. + +People expect parameterized functions to be fast. +They do not want a reflection based implementation in all cases. +The question is how to support that without excessively slowing down +the compiler. 
+ +People want to be able to write simple parameterized functions like +`Sum(v []T) T`, a function that returns the sum of the values in the +slice `v`. +They are prepared to assume that `T` is a numeric type. +They don’t want to have to write a set of methods simply to implement +Sum or the many other similar functions for every numeric type, +including their own named numeric types. + +People want to be able to write the same function to work on both +`[]byte` and `string`, without requiring the bytes to be copied to a +new buffer. + +People want to parameterize functions on types that support simple +operations like comparisons. +That is, they want to write a function that uses a type parameter and +compares a value of that type to another value of the same type. +That was awkward in my earlier proposal: it required using a form of +the curiously recurring template pattern. + +Go’s use of structural typing means that a program can use any type to +meet an interface without an explicit declaration. +Type parameters should work similarly. + +## Proposal + +We permit package-level type and func declarations to use type +parameters. +There are no restrictions on how these parameters may be used within +their scope. +At compile time each actual use of a parameterized type or function is +instantiated by replacing each type parameter with an ordinary type, +called a type argument. +A type or function may be instantiated multiple times with different +type arguments. +A particular type argument is only permitted if all the operations +used with the corresponding type parameter are permitted for the type +argument. +How to implement this efficiently is discussed below. + +## Syntax + +Any package-scope type or func may be followed by one or more type +parameter names in square brackets. + +``` +type [T] List struct { element T; next *List[T] } +``` + +This defines `T` as a type parameter for the parameterized type `List`. 
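To make the instantiation model concrete, here is a hand-written sketch in ordinary Go of what `List[int]` and `List[string]` would correspond to once the type parameter `T` is replaced by a type argument. The names `ListInt`, `ListString`, and `lengthInt` are chosen for illustration only, and a real implementation need not generate separate copies like this (see the Implementation section).

```go
package main

import "fmt"

// ListInt is List with the type argument int substituted for T.
type ListInt struct {
	element int
	next    *ListInt
}

// ListString is List with the type argument string substituted for T.
type ListString struct {
	element string
	next    *ListString
}

// lengthInt is an illustrative helper that walks a ListInt and counts nodes.
func lengthInt(l *ListInt) int {
	n := 0
	for ; l != nil; l = l.next {
		n++
	}
	return n
}

func main() {
	ints := &ListInt{1, &ListInt{2, nil}}
	strs := &ListString{"a", nil}
	fmt.Println(lengthInt(ints), strs.element)
}
```

The two struct definitions differ only in the substituted type, which is exactly the duplication a parameterized `List` is meant to remove.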
+ +Every use of a parameterized type must provide specific type arguments +to use for the type parameters. +This is done using square brackets following the type name. +In `List`, the `next` field is a pointer to a `List` instantiated with +the same type parameter `T`. + +Examples in this document typically use names like `T` and `T1` for +type parameters, but the names can be any identifier. +The scope of the type parameter name is only the body of the type or +func declaration. +Type parameter names are not exported. +It is valid, but normally useless, to write a parameterized type or +function that does not actually use the type parameter; +the effect is that every instantiation is the same. + +Some more syntax examples: + +``` +type ListInt List[int] +var v1 List[int] +var v2 List[float] +type ( +[T1, T2] MyMap map[T1]T2 +[T3] MyChan chan T3 +) +var v3 MyMap[int, string] +``` + +Using a type parameter with a function is similar. + +``` +func [T] Push(l *List[T], e T) *List[T] { + return &List[T]{e, l} +} +``` + +As with parameterized types, we must specify the type arguments when +we refer to a parameterized function (but see the section on type +deduction, below). + +``` +var PushInt = Push[int] // Type is func(*List[int], int) *List[int] +``` + +A parameterized type can have methods. + +``` +func [T] (v *List[T]) Push(e T) { + *v = &List[T]{e, v} +} +``` + +A method of a parameterized type must use the same number of type +parameters as the type itself. +When a parameterized type is instantiated, all of its methods are +automatically instantiated too, with the same type arguments. + +We do not permit a parameterized method for a non-parameterized type. +We do not permit a parameterized method to a non-parameterized +interface type. + +### A note on syntax + +The use of square brackets to mark type parameters and the type +arguments to use in instantiations is new to Go. +We considered a number of different approaches: + +* Use angle brackets, as in `Vector`. 
This has the advantage of being familiar to C++ and Java programmers. Unfortunately, it means that `f<T>(true)` can be parsed as either a call to function `f<T>` or a comparison of `f<T` with `(true)`. + +* `v1 {%,&,|,^,&^,<<,>>} e2`, `e2 {%,&,|,^,&^,<<,>>} v1` + * `U1` gets the restriction _integral_. + * Type of expression is type of first operand. +* `v1 {==,!=} e2`, `e2 {==,!=} v1` + * `U1` gets the restriction _comparable_; expression has untyped boolean value. +* `v1 {<,<=,>,>=} e2`, `e2 {<,<=,>,>=} v1` + * `U1` gets the restriction _ordered_; expression has untyped boolean value. +* `v1 {&&,||} e2`, `e2 {&&,||} v1` + * `U1` gets the restriction _boolean_; type of expression is type of first operand. +* `!v` + * `U` gets the restriction _boolean_; type of expression is `U`. +* `&v` + * Does not introduce any restrictions on `U`. + * Type of expression is new unknown type as for type literal `*U`. +* `*v` + * If `U` has the restriction _points to type `U2`_, then the type of the expression is `U2`. + * Otherwise a new unknown type `U2` is created annotated as the element type of `U`, `U` gets the restriction _points to type `U2`_, and the type of the result is `U2`. +* `<-v` + * If `U` has the restriction _chan of type `U2`_, then the type of the expression is `U2`. + * Otherwise a new unknown type `U2` is created annotated as the element type of `U`, `U` gets the restriction _chan of type `U2`_, and the type of the result is `U2`. +* `U(e)` + * This is a type conversion, not a function call. + * If `e` has a known type `K`, `U` gets the restriction _convertible from `K`_. + * The type of the expression is `U`. +* `T(v)` + * This is a type conversion, not a function call. + * If `T` is a known type, `U` gets the restriction _convertible to `T`_. + * The type of the expression is `T`. + +Some statements introduce restrictions on the types of the expressions +that appear in them.
+ +* `v <- e` + * If `U` does not already have a restriction _chan of type `U2`_, then a new type `U2` is created, annotated as the element type of `U`, and `U` gets the restriction _chan of type `U2`_. +* `v++`, `v--` + * `U` gets the restriction _numeric_. +* `v = e` (may be part of tuple assignment) + * If `e` has a known type `K`, `U` gets the restriction _assignable from `K`_. +* `e = v` (may be part of tuple assignment) + * If `e` has a known type `K`, `U` gets the restriction _assignable to `K`_. +* `e1 op= e2` + * Treated as `e1 = e1 op e2`. +* `return e` + * If return type is known, treated as an assignment to a value of the return type. + +The goal of the restrictions listed above is not to try to handle +every possible case. +It is to provide a reasonable and consistent approach to type checking +of parameterized functions and preliminary type checking of types used +to instantiate those functions. + +It’s possible that future compilers will become more restrictive; +a parameterized function that can not be instantiated by any type +argument is invalid even if it is never instantiated, but we do not +require that every compiler diagnose it. +In other words, it’s possible that even if a package compiles +successfully today, it may fail to compile in the future if it defines +an invalid parameterized function. + +The complete list of possible restrictions is: + +* _addable_ +* _integral_ +* _numeric_ +* _boolean_ +* _comparable_ +* _ordered_ +* _callable_ +* _composite_ +* _points to type `U`_ +* _indexable with value type `U`_ +* _sliceable with value type `U`_ +* _map type with value type `U`_ +* _has field or method `F` of type `U`_ +* _chan of type `U`_ +* _convertible from `U`_ +* _convertible to `U`_ +* _assignable from `U`_ +* _assignable to `U`_ + +Some restrictions may not appear on the same type. +If some unknown type has an invalid pair of restrictions, the +parameterized function is invalid.
+ +* _addable_, _integral_, _numeric_ are invalid if combined with any of + * _boolean_, _callable_, _composite_, _points to_, _indexable_, _sliceable_, _map type_, _chan of_. +* _boolean_ is invalid if combined with any of + * _comparable_, _ordered_, _callable_, _composite_, _points to_, _indexable_, _sliceable_, _map type_, _chan of_. +* _comparable_ is invalid if combined with _callable_. +* _ordered_ is invalid if combined with any of + * _callable_, _composite_, _points to_, _map type_, _chan of_. +* _callable_ is invalid if combined with any of + * _composite_, _points to_, _indexable_, _sliceable_, _map type_, _chan of_. +* _composite_ is invalid if combined with any of + * _points to_, _chan of_. +* _points to_ is invalid if combined with any of + * _indexable_, _sliceable_, _map type_, _chan of_. +* _indexable_, _sliceable_, _map type_ are invalid if combined with _chan of_. + +If one of the type parameters, not some generated unknown type, has +the restriction _assignable from `T`_ or _assignable to `T`_, where `T` is a +known named type, then the parameterized function is invalid. +This restriction is intended to catch simple errors, since in general +there will be only one possible type argument. +If necessary such code can be written using a type assertion. + +As mentioned earlier, type checking an instantiation of a +parameterized function is conceptually straightforward: replace all +the type parameters with the type arguments and make sure that the +result type checks correctly. +That said, the set of restrictions computed for the type parameters +can be used to produce more informative error messages at +instantiation time. +In fact, not all the restrictions are used when compiling the +parameterized function, but they will still be useful at instantiation +time. + +## Implementation + +This section describes a possible implementation that yields a good +balance between compilation time and execution time.
+The proposal in this section is only a suggestion. + +In general there are various possible implementations that yield the +same syntax and semantics. +For example, it is always possible to implement parameterized +functions by generating a new copy of the function for each +instantiation, where the new function is created by replacing the type +parameters with the type arguments. +This approach would yield the most efficient execution time at the +cost of considerable extra compile time and increased code size. +It’s likely to be a good choice for parameterized functions that are +small enough to inline, but it would be a poor tradeoff in most other +cases. +This section describes one possible implementation with better +tradeoffs. + +Type checking a parameterized function produces a list of unknown +types, as described above. +Create a new interface type for each unknown type. +For each use of a value of that unknown type, add a method to the +interface, and rewrite the use to be a call to the method. +Compile the resulting function. + +Callers of the function will see a list of unknown types with +corresponding interfaces, with a description for each method. +The unknown types will all be annotated to indicate how they are +derived from the type arguments. +Given the type arguments used to instantiate the function, the +annotations are sufficient to determine the real type corresponding to +each unknown type. + +For each unknown type, the caller will construct a new copy of the +type argument. +For each method description for that unknown type, the caller will +compile a method for the new type. +The resulting type will satisfy the interface type that corresponds to +the unknown type. + +If the type argument is itself an interface type, the new copy of the +type will be a struct type with a single member that is the type +argument, so that the new copy can have its own methods. 
+(This will require slight but obvious adjustments in the instantiation +templates shown below.) +If the type argument is a pointer type, we grant a special exception +to permit its copy to have methods. + +The call to the parameterized function will be compiled as a +conversion from the arguments to the corresponding new types, and a +type assertion of the results from the interface types to the type +arguments. + +We will call the unknown types `Un`, the interface types created while +compiling the parameterized function `In`, the type arguments used in +the instantiation `An`, and the newly created corresponding types +`Bn`. Each `Bn` will be created as though the compiler saw `type Bn An` +followed by appropriate method definitions (modified as described +above for interface and pointer types). + +To show that this approach will work, we need to show the following: + +* Each operation using a value of unknown type can be implemented as a call to a method `M` on an interface type `I`. +* We can describe each `M` for each `I` in such a way that we can instantiate the methods for any valid type argument; for simplicity we can describe these methods as templates in the form of Go code, and we call them _instantiation templates_. +* All valid type arguments will yield valid method implementations. +* All invalid type arguments will yield some invalid method implementation, thus causing an appropriate compilation error. (Frankly this description does not really show that; I’d be happy to see counter-examples.) + +### Simple expressions + +Simple expressions turn out to be easy. +For example, consider the expression `v.F` where `v` has some unknown +type `U1`, and the expression has the unknown type `U2`. +Compiling the original function will generate interface types `I1` and `I2`. + +Add a method `$FieldF` to `I1` (here I’m using `$` to indicate that +this is not a user-callable method; +the actual name will be generated by the compiler and never seen by the user). 
+Compile `v.F` as `v.$FieldF()` (while compiling the code, `v` has type +`I1`). +Write out an instantiation template like this: + +``` +func (b1 *B1) $FieldF() I2 { return B2(A1(*b1).F) } +``` + +When the compiler instantiates the parameterized function, it knows +the type arguments that correspond to `U1` and `U2`. +It has defined new names for those type arguments, `B1` and `B2`, so +that it has something to attach methods to. +The instantiation template is used to define the method `$FieldF` by +simply compiling the method in a scope such that `A1`, `B1`, and `B2` +refer to the appropriate types. + +The conversion of `*b1` (type `B1`) will always succeed, as `B1` is +simply a new name for `A1`. + +The reference to field (or method) `F` will succeed exactly when `B1` +has a field (or method) `F`; +that is the correct semantics for the expression `v.F` in the original +parameterized function. +The conversion to type `B2` will succeed when `F` has the type `A2`. +The conversion of the return value from type `B2` to type `I2` will always +succeed, as `B2` implements `I2` by construction. + +Returning to the parameterized function, the type of `v.$FieldF()` is +`I2`, which is correct since all references to the unknown type `U2` are +compiled to use the interface type `I2`. + +An expression that uses two operands will take the second operand as a +parameter of the appropriate interface type. +The instantiation template will use a type assertion to convert the +interface type to the appropriate type argument. +For example, `v1[v2]`, where both expressions have unknown type, will +be converted to `v1.$Index(v2)` and the instantiation template will be + +``` +func (b1 *B1) $Index(i2 I2) I3 { return B3(A1(*b1)[A2(*i2.(*B2))]) } +``` + +The type conversions get admittedly messy, but the basic idea is as +above: convert the `Bn` values to the type arguments `An`, perform the +operation, convert back to `Bn`, and finally return as type `In`. 
+The method takes an argument of type `I2` as that is what the +parameterized function will use; +the type assertion to `*B2` will always succeed. + +This same general procedure works for all simple expressions: index +expressions, slice expressions, relational operators, arithmetic +operators, indirection expressions, channel receives, method +expressions, method values, conversions. + +To be clear, each expression is handled independently, regardless of +how it appears in the original source code. +That is, `a + b - c` will be translated into two method calls, something +like `a.$Plus(b).$Minus(c)` and each method will have its own +instantiation template. + +### Untyped constants + +Expressions involving untyped constants may be implemented by creating +a specific method for the specific constants. +That is, we can compile `v + 10` as `v.$Add10()`, with an instantiation +template + +``` +func (b1 *B1) $Add10() I1 { return B1(A1(*b1) + 10) } +``` + +Another possibility would be to compile it as `v.$AddX(10)` and + +``` +func (b1 *B1) $AddX(x int64) I1 { return B1(A1(*b1) + A1(x)) } +``` + +However, this approach in general will require adding some checks in +the instantiation template so that code like `v + 1.5` is rejected if +the type argument of `v` is not a floating point or complex type. + +### Logical operators + +The logical operators `&&` and `||` will have to be expanded in the +compiled form of the parameterized function so that the operands will +be evaluated only when appropriate. +That is, we can not simply replace `&&` and `||` of values of unknown +types with method calls, but must expand them into if statements while +retaining the correct order of evaluation for the rest of the +expression. +In the compiler this can be done by rewriting them using a +compiler-internal version of the C `?:` ternary operator. + +### Address operator + +The address operator requires some additional attention.
+It must be combined with the expression whose address is being taken. +For example, if the parameterized function has the expression `&v[i]`, +the compiler must generate a `$AddrI` method, with an instantiation +template like + +``` +func (b1 *B1) $AddrI(i2 I2) I3 { return B3(&A1(*b1)[A2(i2.(*B2))]) } +``` + +### Type assertions + +Type assertions are conceptually simple, but as they are permitted for +values of unknown type they require some additional attention in the +instantiation template. +Code like `v.(K)`, where `K` is a known type, will be compiled to a +method call with no parameters, and the instantiation template will +look like + +``` +func (b1 B1) $ConvK() K { + a1 := A1(b1) + var e interface{} = a1 + return e.(K) +} +``` + +Introducing `e` avoids an invalid type assertion of a non-interface type. + +For `v.(U2)` where `U2` is an unknown type, the instantiation template +will be similar: + +``` +func (b1 B1) $ConvU() I2 { + a1 := A1(b1) + var e interface{} = a1 + return B2(e.(A2)) +} +``` + +This will behave correctly whether `A2` is an interface or a +non-interface type. + +### Function calls + +A call to a function of known type requires adding implicit +conversions from the unknown types to the known types. +Those conversions will be implemented by method calls as described above. +Only conversions valid for function calls should be accepted; +these are the set of conversions valid for assignment statements, +described below. + +A call to a function of unknown type can be implemented as a method +call on the interface type holding the function value. +Multiple methods may be required if the function is called multiple +times with different unknown types, or with different numbers of +arguments for a variadic function. +In each case the instantiation template will simply be a call of the +function, with the appropriate conversions to the type arguments of +the arguments of unknown type. 
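As a rough illustration of calling a function of unknown type through a method on an interface, the sketch below emulates the scheme by hand in ordinary Go. Several things are assumptions made for the example: the type arguments are fixed to `string`, `int`, and `string`; the generated method is named `call` because `$` is not valid in Go identifiers; value receivers are used instead of the pointer receivers shown in the templates above; and `genApply` and `apply` are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// I1..I3 are the interface types the compiled parameterized code sees.
type (
	I1 interface{}
	I2 interface{}
	I3 interface{}
)

// IF is the interface type that holds the function value of unknown type.
type IF interface {
	call(I1, I2) I3
}

// A1..A3 are the type arguments of one instantiation; B1..B3 and BF are the
// fresh copies that the generated methods are attached to.
type (
	A1 = string
	A2 = int
	A3 = string
	B1 A1
	B2 A2
	B3 A3
	BF func(A1, A2) A3
)

// call plays the role of the instantiation template: convert the arguments
// back to the type arguments, call the real function, and box the result.
func (f BF) call(x I1, y I2) I3 {
	return B3(f(A1(x.(B1)), A2(y.(B2))))
}

// genApply stands for the compiled body of a parameterized function whose
// source simply calls f(x, y) on values of unknown type.
func genApply(f IF, x I1, y I2) I3 {
	return f.call(x, y)
}

// apply is what a caller-side instantiation would do: convert the arguments
// to the Bn types and assert the result back to the type argument.
func apply(f func(string, int) string, x string, y int) string {
	return string(genApply(BF(f), B1(x), B2(y)).(B3))
}

func main() {
	fmt.Println(apply(strings.Repeat, "ab", 3))
}
```

Because `genApply` only ever sees interface values, it is compiled once; each new set of type arguments only needs a new `call` method and the surrounding conversions.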
+ +A function call of the form `F1(F2())` where neither function is known +may need a method all by itself, since there is no way to know how +many results `F2` returns. + +### Composite literals + +A composite literal of a known type with values of an unknown type can +be handled by inserting implicit type conversions to the appropriate +known type. + +A composite literal of an unknown type can not be handled using the +mechanisms described above. +The problem is that there is no interface type where we can attach a +method to create the composite literal. +We need some value of type `Bn` with a method for us to call, but in the +general case there may not be any such value. + +To implement this we require that the instantiation place a value of +an appropriate interface type in the function’s closure. +This can always be done as generalized functions only occur at +top-level, so they do not have any other closure (function literals +are discussed below). +We compile the code to refer to a value `$imaker` in the closure, with +type `Imaker`. +The instantiation will place a value with the appropriate type `Bmaker` +in the function instantiation's closure. +The value is irrelevant as long as it has the right type. +The methods of `Bmaker` will, of course, be those of `Imaker`. +Each different composite literal in the parameterized function will be +a method of `Imaker`. + +A composite literal of an unknown type without keys can then be +implemented as a method of `Imaker` whose instantiation template simply +returns the composite literal, as though it were an operator with a +large number of operands. + +A composite literal of an unknown type with keys is trickier. +The compiler must examine all the keys. + +* If any of the keys are expressions or constants rather than simple names, this can not be a struct literal. We can generate a method that passes all the keys and values, and the instantiation template can be the composite literal using those keys and values. 
In this case, if one of the keys is an undefined name, we can give an error while compiling the parameterized function. +* Otherwise, if any of the names are not defined, this must be a struct literal. We can generate a method that passes the values, and the instantiation template can be the composite literal with the literal names and the value arguments. +* Otherwise, we call a method passing all the keys and values. The instantiation template is the composite literal with the key and value arguments. If the type argument is a struct, the generated method will ignore the key values passed in. + +For example, if the parameterized function uses the composite literal +`U{f: g}` and there is a local variable named `f`, this is compiled +into `$imaker.$CompLit1(f, g)`, and the instantiation template is + +``` +func (bm Bmaker) $CompLit1(f I1, g I2) I3 { + return bm.$CompLit2(A1(f.(B1)), A2(g.(B2))) +} +func (Bmaker) $CompLit2(f A1, g A2) I3 { return B3(A3{f: g}) } +``` + +If `A3`, the type argument for `U`, is a struct, then the parameter `f` is +unused and the `f` in the composite literal refers to the field `f` of +`A3` (it is an error if no such field exists). +If `A3` is not a struct, then `f` must be an appropriate key type for `A3` +and the value is used. + +### Function literals + +A function literal of known type may be compiled just like any other +parameterized function. +If a maker variable is required for constructs like composite +literals, it may be passed from the enclosing function’s closure to +the function literal’s closure. + +A function literal of unknown type requires that the function have a +maker variable, as for composite literals, above. +The function literal is compiled as a parameterized function, and +parameters of unknown type are received as interface types as we are +describing. +The type of the function literal will itself be an unknown type, and +will have corresponding real and interface types just like any other +unknown type.
+Creating the function literal value requires calling a method on the +maker variable. +That method will create a function literal of known type that simply +calls the compiled form of the function literal. + +For example: + +``` +func [T] Counter() func() T { + var c T + return func() T { + c++ + return c + } +} +``` + +This is compiled using a maker variable in the closure. +The unknown type `T` will get an interface type, called here `I1`; +the unknown type `func() T` will get the interface type `I2`. +The compiled form of the function will call a method on the maker +variable, passing a closure, something along the lines of + +``` +type CounterTClosure struct { c *I1 } +func CounterT() I2 { + var c I1 + closure := CounterTClosure{&c} + return $bmaker.$fnlit1(closure) +} +``` + +The function literal will get its own compiled form along the lines of + +``` +func fnlit1(closure CounterTClosure) I1 { + (*closure.c).$Inc() + return *closure.c +} +``` + +The compiled form of the function literal does not have to correspond +to any particular function signature, so it’s fine to pass the closure +as an ordinary parameter. + +The compiler will also generate instantiation templates for callers of +`Counter`. + +``` +func (Bmaker) $fnlit1(closure struct { c *I1}) I2 { + return func() A1 { + i1 := fnlit1(closure) + b1 := i1.(B1) + return A1(b1) + } +} + +func (b1 *B1) $Inc() { + a1 := A1(*b1) + a1++ + *b1 = B1(a1) +} +``` + +This instantiation template will be compiled with the type argument `A1` +and its method-bearing copy `B1`. +The call to `Counter` will use an automatically inserted type assertion +to convert from `I2` to the type argument `B2` aka `func() A1`. +This gives us a function literal of the required type, and tracing +through the calls above shows that the function literal behaves as it +should. + +### Statements + +Many statements require no special attention when compiling a +parameterized function. 
+A send statement is compiled as a method on the channel, much like a +receive expression. +An increment or decrement statement is compiled as a method on the +value, as shown above. +A switch statement may require calling a method for equality +comparison, just like the `==` operator. + +#### Assignment statements + +Assignment statements are straightforward to implement but require a +bit of care to implement the proper type checking. +When compiling the parameterized function it's impossible to know +which types may be assigned to any specific unknown type. +The type checking could be done using annotations of the form _`U1` +must be assignable to `U2`_, but here I’ll outline a method that +requires only instantiation templates. + +Assignment from a value of one unknown type to the same unknown type +is just an ordinary interface assignment. + +Otherwise assignment is a method on the left-hand-side value (which +must of course be addressable), where the method is specific to the +type on the right hand side. + +``` +func (b1 *B1) $AssignI2(i2 I2) { + var a1 A1 = A2(i2.(B2)) + *b1 = B1(a1) +} +``` + +The idea here is to convert the unknown type on the right hand side +back to its type argument `A2`, and then assign it to a variable of the +type argument `A1`. +If that assignment is not valid, the instantiation template can not be +compiled with the type arguments, and the compiler will give an error. +Otherwise the assignment is made. + +Return statements are implemented similarly, assigning values to +result parameters. +The code that calls the parameterized function will handle the type +conversions at the point of the call. + +#### Range clauses + +A for statement with a range clause may not know anything about the +type over which it is ranging. +This means that range clauses must in general be implemented using +compiler built-in functions that are not accessible to ordinary +programs. 
+These will be similar to the runtime functions that the compiler +already uses. +A statement: + +``` + for v1 := range v2 {} +``` + +could be compiled as something like: + +``` + for v1, it, f := v2.$init(); !f; v1, f = v2.$next(it) {} +``` + +with instantiation templates that invoke compiler built-in functions: + +``` +func (b2 B2) $init() (I1, I3, bool) { + return $iterinit(A2(b2)) +} +func (b2 B2) $next(i3 I3) (I1, bool) { + return $iternext(A2(b2), i3.(A3)) +} +``` + +Here I’ve introduced another unknown type `I3` to represent the +current iteration state. + +If the compiler knows something specific about the unknown type, then +more efficient techniques can be used. For example, a range over a +slice could be written using `$Len` and `$Index` methods. + +#### Type switches + +Type switches, like type assertions, require some attention because +the value being switched on may have a non-interface type argument. +The instantiation method will implement the type switch proper, and +pass back the index of the selected case. +The parameterized function will do a switch on that index to choose +the code to execute. + +``` +func [T] Classify(v T) string { + switch v.(type) { + case []byte: + return "slice" + case string: + return "string" + default: + return "unknown" + } +} +``` + +The parameterized function is compiled as + +``` +func ClassifyT(v I1) string { + switch v.$Type1() { + case 0: + return "slice" + case 1: + return "string" + case 2: + return "unknown" + } +} +``` + +The instantiation template will be + +``` +func (b1 B1) $Type1() int { + var e interface{} = A1(b1) + switch e.(type) { + case []byte: + return 0 + case string: + return 1 + default: + return 2 + } +} +``` + +The instantiation template will have to be compiled in an unusual way: +it will have to permit duplicate types. +That is because a type switch that uses unknown types in the cases may +wind up with the same type in multiple cases. +If that happens the first matching case should be used.
+ +#### Select statements + +Select statements will be implemented much like type switches. +The select statement proper will be in the instantiation template. +It will accept channels and values to send as required. +It will return an index indicating which case was chosen, along with a +received value (an empty interface) and a `bool` value. +The effect will be fairly similar to `reflect.Select`. + +### Built-in functions + +Most built-in functions when called with unknown types are simply +methods on their first argument: `append`, `cap`, `close`, `complex`, +`copy`, `delete`, `imag`, `len`, `real`. +Other built-in functions require no special handling for parameterized +functions: `panic`, `print`, `println`, `recover`. + +The built-in functions `make` and `new` will be implemented as methods +on a special maker variable, as described above under composite +literals. + +### Methods of parameterized types + +A parameterized type may have methods, and those methods may have +arguments and results of unknown type. +Any instantiation of the parameterized type must have methods with the +appropriate type arguments. +That means that the compiler must generate instantiation templates +that will serve as the methods of the type instantiation. +Those templates will call the compiled form of the method with the +appropriate interface types. + +``` +type [T] Vector []T +func [T] (v Vector[T]) Len() int { return len(v) } +func [T] (v Vector[T]) Index(i int) T { return v[i] } + +type Readers interface { + Len() int + Index(i int) io.Reader +} + +type VectorReader struct { Vector[io.Reader] } +var _ Readers = VectorReader{} +``` + +In this example, the type `VectorReader` inherits the methods of the +embedded field `Vector[io.Reader]` and therefore implements the +non-parameterized interface type `Readers`. +When implementing this, the compiler will assign interface types for +the unknown types `T` and `[]T`; +here those types will be `I1` and `I2`, respectively.
+The methods of the parameterized type Vector will be compiled as +ordinary functions: + +``` +func $VectorLen(i2 I2) int { return i2.$len() } +func $VectorIndex(i2 I2, i int) I1 { return i2.$index(i) } +``` + +The compiler will generate instantiation templates for the methods: + +``` +func (v Vector) Len() int { return $VectorLen(I2(v)) } +func (v Vector) Index(i int) A1 { +return A1($VectorIndex(I2(v), i).(B1)) +} +``` + +The compiler will also generate instantiation templates for the +methods of the type `B2` that corresponds to the unknown type `[]T`. + +``` +func (b2 B2) $len() int { return len(A2(b2)) } +func (b2 B2) $index(i int) I1 { return B1(A2(b2)[i]) } +``` + +With an example this simple there is a lot of effort for no real gain, +but this does show how the compiler can use the instantiation +templates to define methods of the correct instantiated type while the +bulk of the work is still done by the parameterized code using +interface types. + +### Implementation summary + +I believe that covers all aspects of the language and shows how they +may be implemented in a manner that is reasonably efficient both in +compile time and execution time. +There will be code bloat in that instantiation templates may be +compiled multiple times for the same type, but the templates are, in +general, small. +Most are only a few instructions. +There will be run time cost in that many operations will require a +method call rather than be done inline. +This cost will normally be small. +Where it is significant, it will always be possible to manually +instantiate the function for the desired type argument. + +While the implementation technique described here is general and +covers all cases, real compilers are likely to implement a blend of +techniques. +Small parameterized functions will simply be inlined whenever they are +called. 
+Parameterized functions that only permit a few types, such as the `Sum` +or `Join` examples above, may simply be compiled once for each possible +type in the package where they are defined, with callers being +compiled to simply call the appropriate instantiation. + +Implementing type parameters using interface methods shows that type +parameters can be viewed as implicit interfaces. +Rather than explicitly defining the methods of a type and then calling +those methods, type parameters implicitly define an interface by the +way in which values of that type are used. + +In order to get good stack tracebacks and a less confusing +implementation of `runtime.Caller`, it will probably be desirable to, +by default, ignore the methods generated from instantiation templates +when unwinding the stack. +However, it might be best if they could influence the reporting of the +parameterized function in a stack backtrace, so that it could indicate +the types being used. +I don’t yet know if that would be helpful or feasible. + +## Deployment + +This proposal is backward compatible with Go 1, in that all Go 1 +programs will continue to compile and run identically if this proposal +is adopted. That leads to the following plan. + +* Add support for type parameters to a future Go release 1.n, but require a command line option to use them. This will let people experiment with the new facility. +* Add easy support for that command line option to the go tool. +* Add a `// +build` constraint for the command line option. +* Try out modified versions of standard packages where it seems useful, putting the new versions under the exp directory. +* Decide whether to keep the facility for Go 2, in which the standard packages would be updated. + +In the standard library, the most obvious place where type parameters +would be used is to introduce compile-time-type-safe containers, like +`container/list` but with the type of the elements known at compile +time.
+It would also be natural to add to the `sort` package to make it easier +to sort slices with less boilerplate. +Other new packages would be `algorithms` (find the max/min/average of a +collection, transform a collection using a function), `channels` (merge +channels into one, multiplex one channel into many), `maps` (copy a +map). + +Type parameters could be used to partially unify the `bytes` and +`strings` packages. +However, the implementation would be based on using an unknown type +that could be either `[]byte` or `string`. +Values of unknown type are passed as interface values. +Neither `[]byte` nor `string` fits in an interface value, so the +values would have to be passed by taking their address. +Most of the functions in those packages are fairly simple; +one would only want to unify them if they could be inlined, or if +escape analysis were smart enough to avoid pushing the values into the +heap, or if the compiler were smart enough to see that only two types +would work and to compile both separately. + +Similar considerations apply to supporting a parameterized `Writer` +interface that accepts either `[]byte` or `string`. +On the other hand, if the compiler has the appropriate optimizations, +it would be convenient to write unified implementations for `Write` and +`WriteString` methods. + +The perhaps surprising conclusion is that type parameters permit new +kinds of packages, but need not lead to significant changes in +existing packages. +Go does after all already support generalized programming, using +interfaces, and the existing packages were designed around that fact. +In general they already work well. +Adding type parameters does not change that. +It opens up the ability to write new kinds of packages, ones that have +not been written to date because they are not well supported by +interfaces. + +## Summary + +I think this is the best proposal so far. +However, it will not be adopted. + +The syntax still needs work.
+A type is defined as `type [T] Vector []T` but is used as `Vector[int]`, +which means that the brackets are on the left in the definition but on +the right in the use. +It would be much nicer to write `type Vector[T] []T`, but that is +ambiguous with an array declaration. +That suggests the possibility of using double square brackets, as in +`Vector[[int]]`, or perhaps some other character(s). + +The type deduction rules are too complex. +We want people to be able to easily use a `Transform` function, but +the rules required to make that work without explicitly specifying type +parameters are very complex. +The rules for untyped constants are also rather hard to follow. +We need type deduction rules that are clear and obvious, so that +there is no confusion as to which type is being used. + +The implementation description is interesting but very complicated. +Is any compiler really going to implement all that? +It seems likely that any initial implementation would just use macro +expansion, and unclear whether it would ever move beyond that. +The result would be increased compile times and code bloat.