Skip to content

Latest commit

 

History

History
345 lines (333 loc) · 15.7 KB

TODO.md

File metadata and controls

345 lines (333 loc) · 15.7 KB

Things left to do

  • build register allocation verification
  • fix load store bugs
  • properly handle parameters with a copy
  • switch allocator to use uses data structure for liveness tracking
    • in uses liveness calculator, special case phis
      • phis should act like copies in source block
      • if the value does not live
        • it should not be in live out
        • it should be in block kill list
  • top to bottom register allocation verification?
  • it seems like back-links (loops) are not handled properly in the register allocator yet
  • fix value overwrites in the register allocator
  • get ssa dump html to have final assembly output
  • emit ssa dump html on command line arg
  • get a test suite working
    • ability to write .asm files
    • can invoke customasm to produce .hex files
    • can invoke emulator to run the .hex files
    • set up a test runner
    • start writing some simple test programs inspired by c-test-suite
  • storing a const needs a register store [gp, main__init_guard], 1
  • implement function parameters
    • add parameters to function entry
    • handle register ABI for parameters
  • working register allocator
    • implement parallel phi moves
    • convert boolean checks into proper block types
    • try to assign moves the same register so it will be eliminated
    • use a better live value scanner
      • handles loops properly
        • includes branches and other control forks inside the loop
        • fixed point iterative analysis (more accurate)
      • can emit a .dot file for debugging
      • phis and phi copies handled correctly
      • makes sure all loop extended vars have liveThroughCalls set properly
      • better debugging of liveness with graphviz
    • reorder PhiCopies so they don't clobber regs
      • may need to swap sometimes
  • hide debugging spam with flags
  • if a function has multiple returns
    • create a common exit block
    • set successors for each return block to it
    • create a phi in the return block for each returned value
    • change return to jump, remove controls
  • handle stack ABI for parameters
    • loading on callee side
    • storing on caller side

milestone get: min useful compiler

  • now has ability to compile simple programs that use only word data

    • global structs and arrays supported
    • allocator does not support spills
      • crashes if it needs more registers than available
  • implement multiple function return values

    • copies return values to registers a0-a2 in function
    • copies extracted values from registers a0-a2 at call site
    • test case built
  • storing comparisons in a bool

  • fix bug where return values happen after register restores

  • fix bugs with self referential phis

  • add go source to ssa.html dump

  • add original tools/go/ssa source to ssa.html dump

  • better assembly IR / output

    • design a lowest level IR for assembly
    • add a module for rj32 specific assembly generation
    • generate the assembly IR for rj32
    • add a pretty html dump to ssa.html
    • add ability to convert assembly IR into text
    • convert codegen to only produce the IR
  • better register allocation

    • remove register "guessing" -- don't think that's necessary
    • improved register colouring debugging info
    • add function return values to register verifier
    • stronger out-of-ssa handling
      • proper CSSA back out
        • additional parallel moves after the phi
        • remove assumption that PhiCopies are only at the end of a block
        • handle transition from Phi to PhiCopy block properly
  • [.] start a standard library

    • builtin print/println support
      • compiles to a series of runtime functions
    • implement some runtime functions
      • string iterator next function
      • printing various types
    • loaded package replacement
      • build an alternate goroot with replacement packages
      • replace runtime package
      • force runtime to be loaded
    • only parse functions actually used
      • make sure uses of runtime ops are loaded
    • extern func assembly
      • per architecture
      • loads from asm files
    • add runtime library support for builtin ops
      • implement mul in go
      • implement div in go
      • implement rem in go
      • implement double and quad word ops
        • add/sub should use addc/subc
        • shifts should do a function call
        • either multi-def support or register pair support
  • proper error logging with a reference to the original code location

  • Documentation especially around the IR package

  • string support

    • support len() builtin
      • parses
      • implemented as a load of the second value in the tuple
    • iteration support
      • range statements for strings
        • parsed
        • implemented as an iterator
      • string iterator next op
        • parsed
        • calls an iterator function
    • indexing strings
      • address part of tuple loads
      • offset calculated and loaded
      • strings are UTF-16
      • multi-word UTF-16 character support
      • 32-bit runes
    • string len() support
    • string to slice of bytes conversion
    • slice of bytes to string conversion
    • string to slice of runes
    • slice of runes to string
    • utf8 expansion in range operator
    • string less comparison
    • string equal comparison
    • string concatenation
    • string slicing / substring
  • Add support for a second CPU to make adding more easier

    • Artentus' A32
      • Add arch abstraction with an rj32 implementation
      • Abstract away registers and their use classes
        • Add register abstraction with config from rj32 arch
        • Allow 32 registers
      • Support a different ABI
      • Allow sizes to be parameterized by arch
        • Add sizes to rj32 arch
        • Support 32-bit, 16-bit and 8-bit values
        • Support byte addressing
      • Fix hard-coded number of arg regs
        • Fix arg spill code in elaborating calls
        • Fix arg consume code in prologue
        • Fix other references to reg.ArgRegs
      • Support three operand instructions configured by arch
        • Ensure ClobbersArg is false on 3-op arches
        • Legalize should not insert copies on 3-op arches
        • Instructions for A32 emit 3 operands
      • Add support for different cpudefs
      • Add support for different emulators
      • Add stdlib support for different arches
        • Move IO into the standard Library
        • Figure out how to utilize the IN/OUT instructions of A32
        • Fix panic to write to standard out / IO
      • Round robin reg choosing
      • Make stack slots word-size aware
      • Make stack slots align aware
      • Fix customasm producing garbage binary to stdout
      • Fix bug where string length is at offset 1 when it should be 4 on LD
      • Fix calls getting the function type not the return type
      • Fix several bugs with strings and byte addressing
      • Add A32 to test suite to make sure it doesn't break
  • Rework IR (ir2)

    • Split globals into Literals (strings) and Globals (bss data)
    • Split values into Values and Instructions
    • Instructions can return multiple values
    • Iterators with support for better xforms
    • Slab based allocation so pointers are not invalidated
    • Fully documented with doc comments
    • Blocks no longer have block ops, last instr is control instr
    • Proper generation of program-wide unique names for functions, globals and literals
    • Make a copy of the parser package named frontend
      • Get it to parse programs to ir2
      • Handle translating tuples to multiple return values
    • Can emit IR code in a way that simplifies text, assembly and html generation
      • Emits to ssa.html
    • Block parameters instead of phis
      • Values can be used by both Blocks and Instrs
      • Blocks and Instrs are Users of Values
      • Block parameters are emitted
        • params are emitted on block labels
        • correct args are passed to successor blocks
      • Block parameter defs are parsed
      • Terminal instructions specify dest block arguments
      • Frontend builds blocks with parameters
      • Phis are eliminated
    • Implement a simplified type system
      • integer types i/u 8,16,32,64
      • bool type
      • cpu flags
      • pointers
      • const
      • tuples?
      • Scan program for types used & print diagnostics
      • Map from go types to type system
      • Use types in frontend translation to IR
      • Output them in textual format as def annotations
    • Ability to parse text form of IR back into IR
      • Can lex tokens
      • Values parsed
      • Types parsed
        • Types interned to reduce memory
        • Array types
        • Slice types
        • Pointer types
        • Func types
          • Param lists
          • Result lists
          • Receivers
        • Empty interfaces
        • Full interfaces
        • Named types
          • Type names and looking up the typedef
          • Methods
        • Struct types
        • Map types
        • Chan types
        • Tuple types?
        • Type defs
      • Constants parsed
        • function references
        • literal references
        • global references
        • numbers
      • Instructions parsed
      • block labels parsed
      • func labels parsed
        • parameters, results parsed
      • packages parsed
      • parses block references and links them
      • parses value references and links them
      • Global vars
      • Global/Func forward references
      • Refactor: better package naming
        • Generate unique package names
        • Identifiers in other packages are pkg_name.ident rather than pkg_name__ident
      • Use github.com/nochso/ctxerr for error messages
    • Transform xform pkg to use new IR
      • Register func with options for config
        • grabs name from passed xform func
        • tags
        • list of active stages
        • activating op
        • can be run on each instr
        • can be run on each block
        • takes a func that takes an iterator
          • xform func responsible for leaving iterator in a resumable position
          • iterator tracks changes
            • ability to tell the iterator a change was made it can't see
      • A way to run a single transform for testing w/tags
      • A way to run a single stage for testing w/tags
  • Rework xform system

    • Have a better way to track which rules have run to trigger what code to be generated
      • Instrument SSA dump with code location that generated the code
      • A way to allow debugging of rule matching conditions inputs outputs and generated code in the ssa.html
    • Transformation tagging with the architecture or other things using configuration functions
      • Ability to configure the tags for each transformation function close to where the function is defined
        • A way to automatically document rules their inputs and outputs their purpose and automatically assign an ID to them and have a way to list documentation about them at the command line
      • An architecture can have many tags that are associated with it that can describe which rules are active
      • A way to denote which rules/tags were active in each pass for a given architecture in the ssa.html
      • Split larger transformations apart so that they are more fine-grained and thus easier to tag for specific architectures
      • Have a way to easily run a transformation at the function or block level
    • Have a way to denote how far back in the instruction list to backtrack if a change is made
      • Make that back tracking automatic so the rule doesn't have to specify it
    • Have a way to specify the dependencies or rules that have to execute first or prerequisites
      • Maintain a candidate rule list and as that rule satisfies prerequisites of other rules add those to the candidate list
      • This will allow it to automatically figure out the order to apply rules and figure out the phases accordingly
      • A way to add rules based on architecture sizes information
    • Instead of the rule having a bunch of if conditions to denote that it doesn't work have a way to specify what conditions need to be matching before the rule is ever considered
      • Have a way to denote that a rule no longer applies and can be removed from the candidate list
        • Automatically track new instructions and new code generated to see if previously rules that have been removed from the candidate list may now apply again
      • Have a way to re-trigger a transformation if any changes are made but only if another transformation has it as a prerequisite and before that one is redone
    • Register allocation as just a regular transformation
    • Think about how to add types to the xform engine
      • Type verifier that will check to make sure that the return types of ops are correct
      • A way to add types to architecture op translation
  • better copy elimination (coalescing)

    • add register preference scanning
  • track what registers are actually used by a call site

    • reduce the register restrictions (with flag to disable)
  • stack / spill support

  • create spills for temp vars still live at a call

    • reload after
  • build way to count register pressure

    • add spill support to lower register pressure
  • implement graph coloring in register allocator

    • implement adjacency lists for the graph
      • nodes can be marked as move related
      • nodes can be merged so multiple values are in one node
      • nodes can be removed one-by-one
      • fast finding of the nodes with least degree
    • use iterated register coalescing algo
  • expand various ops into calls to functions implementing them

  • stack allocation

    • slices
  • heap allocation

    • free somehow?
  • slice support

  • closures

  • Add make ready to prepare PRs or whatever

  • Far pointer and code page banking

  • Add notion of extended blocks as groups of blocks without back edges

Optimizations

  • fix allocator choosing wrong variable and doing extra copies
  • copy propagation
    • given use of X:
      • are all reaching definitions of X
        • copies from the same variable, ie X = copy Y
      • where Y is not redefined since that variable?
      • if so, substitute use of X with use of Y instead
  • move instruction defs closer to first uses to minimize reg pressure
  • constant folding
  • common subexpression elimination
    • start at entry
    • for each block copy value map from predecessors
      • ignore preds that have not been visited yet
    • for commutative operations, order in ascending order
    • hash op + arg1 + arg2
    • if already exists in value map, replace with value from that map
      • update map with new latest value
      • hash and lookup new resulting expression
    • if doesn't exist, add to map with the result value
  • dead code elimination
  • find all loops
    • loop invariant code motion
      • if for def X, no args refer to a phi node or def inside the loop
        • move X out of the loop into the pre-header
    • find loop induction variables?
      • do strength reduction on uses of induction variable?
        • like array indexing for example
    • support for inline assembly / extern assembly
      • get mul, div and rem converted to assembly