Initial design for support of group-transactions analysis #165

Open
S3v3ru5 opened this issue Mar 29, 2023 · 3 comments

Comments

@S3v3ru5
Contributor

S3v3ru5 commented Mar 29, 2023

First task of issue #83

Data Model

class Instruction:
    prev: List[Instruction]
    next: List[Instruction]
    line_num: int
    source_code_line: str
    comment: str
    comments_before_ins: List[str]
    tealer_comments: List[str]
    bb: Optional["BasicBlock"]
    supported_version: int
    supported_mode: Mode  # Mode.Application or Mode.LogicSig

class BasicBlock:
    instructions: List[Instruction]
    prev: List[BasicBlock]
    next: List[BasicBlock]
    idx: int
    subroutine: Optional["Subroutine"]
    tealer_comments: List[str]

class Subroutine:
    subroutine_name: str
    entry: BasicBlock
    basicblocks: List[BasicBlock]
    exit_blocks: List[BasicBlock]
    contract: Optional["Teal"] = None

class Function:
    cfg: List[BasicBlock]
    function_name: str
    contract: Optional["Teal"]

class Teal:
    contract_name: str
    version: int
    execution_mode: ContractType
    instructions: List[Instruction]
    basicblocks: List[BasicBlock]
    main: Subroutine
    subroutines: Dict[str, Subroutine]
    functions: Dict[str, Function]

class ContractType(Enum):  # needs: from enum import Enum, auto
    LogicSig = auto()
    ApprovalProgram = auto()
    ClearStateProgram = auto()

The CFG of the contract is divided into subroutines. Every subroutine is an independent CFG: the blocks of one subroutine are not connected to the blocks of any other subroutine. In the CFG, a callsub instruction is connected to the instruction immediately after it (the return point of the called subroutine).
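A minimal sketch of the callsub rule, assuming a toy listing where each line is one instruction or a `label:` line and there are no blank lines. The `successors` helper and the listing format are hypothetical, not Tealer's parser. It shows callsub getting an edge to its return point and no edge into the callee.

```python
from typing import Dict, List


def successors(program: List[str]) -> Dict[int, List[int]]:
    """Compute successor edges (by line index) for a toy TEAL-like listing."""
    # Map each label name to the index of its "label:" line.
    labels = {line[:-1]: i for i, line in enumerate(program) if line.endswith(":")}
    edges: Dict[int, List[int]] = {}
    for i, line in enumerate(program):
        op, *args = line.split()
        fallthrough = [i + 1] if i + 1 < len(program) else []
        if op == "b":
            edges[i] = [labels[args[0]]]
        elif op in ("bz", "bnz"):
            edges[i] = [labels[args[0]]] + fallthrough
        elif op in ("retsub", "return", "err"):
            # Exit instructions have no successors inside their own CFG.
            edges[i] = []
        else:
            # Includes callsub: the only edge goes to the return point,
            # so the caller's CFG never connects to the callee's blocks.
            edges[i] = fallthrough
    return edges
```

With this rule, each subroutine's blocks form their own connected component, which is what lets every subroutine be treated as an independent CFG.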

The contract needs to be divided into "Functions/Operations" as well. Functions are equivalent to ARC-4 methods. Every function should have its own CFG for analysis.

The CFG containing the basic blocks that are not part of any subroutine is referred to as the "contract-entry" CFG. Execution always starts at the entry block of this CFG. If we consider every subroutine as an external contract/program, then the "contract-entry" CFG can be considered the CFG of the entire contract.

If the method/function dispatcher used by the contract does not touch any of the subroutines, each function can be represented by a CFG consisting of the basic blocks related to that particular function.

In common use cases, the method dispatcher is not part of any subroutine. The method dispatcher generated by PyTeal's Router class also does not touch the subroutines. Under this assumption, the "contract-entry" CFG can be divided into individual function CFGs, and functions can be analyzed independently.

A block might be part of multiple functions. In that case, every function should have its own copy of the block.
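One way to give each function its own copy of shared blocks is to copy everything reachable from the function's entry block. This is only a sketch: `Block` is a hypothetical stand-in for BasicBlock that keeps just `idx` and `next`, and `copy_function_cfg` is not an existing Tealer function.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass(eq=False)
class Block:
    """Hypothetical stand-in for BasicBlock: an index and successor blocks."""
    idx: int
    next: List["Block"] = field(default_factory=list)


def copy_function_cfg(entry: Block) -> Dict[int, Block]:
    """Copy every block reachable from `entry` into a fresh, private CFG."""
    originals: Dict[int, Block] = {}
    stack = [entry]
    while stack:
        bb = stack.pop()
        if bb.idx in originals:
            continue  # already visited (handles loops and shared blocks)
        originals[bb.idx] = bb
        stack.extend(bb.next)
    # First create the copies, then rewire edges between the copies only.
    copies = {idx: Block(idx) for idx in originals}
    for idx, bb in originals.items():
        copies[idx].next = [copies[succ.idx] for succ in bb.next]
    return copies
```

Calling it once per function entry yields separate block objects for a block shared by two functions, so per-function annotations do not leak between them.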

class Transaction:
    type: TransactionType
    logic_sig: Optional[Function]
    application: Optional[Function]
    logic_sig_group_context: Optional[Dict[Instruction, Transaction]]
    application_group_context: Optional[Dict[Instruction, Transaction]]


class TransactionType(Enum):  # needs: from enum import Enum, auto
    Payment = auto()
    AssetConfig = auto()
    AssetTransfer = auto()
    AssetFreeze = auto()
    KeyReg = auto()
    ApplicationCall = auto()

A transaction may involve execution of both a LogicSig and an Application.

The group-context information contains a mapping from every group-related instruction of a function to a Transaction object.

Group-related instructions:

  • gtxn t f, gtxna t f i, gtxns f, gtxnsa f i, gtxnas t f, gtxnsas f
  • gaid t, gaids
  • gload t i, gloads i, gloadss

The group-context of "gtxn 1 RekeyTo" would point to the Transaction object representing the transaction at index 1.

The Transaction object for an instruction is specified for each execution of the function.

A group of transactions may execute the same function of a program in multiple transactions, or a block of code may be executed for two different functions that are both part of the group. So the group-context should be provided per (Transaction, Function) pair, because a group-related instruction can refer to a different Transaction object in each execution.

At the same time, it is not possible to provide a single Transaction object for instructions that are executed multiple times in a given execution.
If a gtxns f instruction is used in a loop with different transaction indices, the Transaction object will be different for each iteration. Instructions of this kind are ignored by the analysis.
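As an illustration of the mapping, here is a sketch that resolves only the instructions whose transaction index is an immediate argument (gtxn, gtxna, gtxnas, gaid, gload). Instructions are plain strings and transactions are arbitrary objects keyed by group index; both are simplifications, and `build_group_context` is a hypothetical helper. Stack-indexed forms such as gtxns are left unresolved here, although the design could resolve some of them through static relative indexes.

```python
from typing import Any, Dict, List, Optional

# Opcodes whose first immediate argument is the transaction's group index.
IMMEDIATE_INDEX_OPS = {"gtxn", "gtxna", "gtxnas", "gaid", "gload"}


def absolute_group_index(instruction: str) -> Optional[int]:
    """Return the immediate group index of an instruction, if it has one."""
    op, *args = instruction.split()
    if op in IMMEDIATE_INDEX_OPS and args:
        return int(args[0])
    return None  # not group-related, or the index comes from the stack


def build_group_context(
    instructions: List[str], transactions_by_index: Dict[int, Any]
) -> Dict[str, Any]:
    """Map each resolvable group-related instruction to its transaction."""
    context: Dict[str, Any] = {}
    for ins in instructions:
        index = absolute_group_index(ins)
        if index is not None and index in transactions_by_index:
            context[ins] = transactions_by_index[index]
    return context
```

In the real data model the keys would be Instruction objects and one such mapping would exist per (Transaction, Function) pair.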

class GroupConfig:
    transactions: List[Transaction]
    tealer: Optional[Tealer] = None

class Tealer:
    contracts: List[Teal]
    group_configs: List[GroupConfig]

Detector API

class AbstractDetector:

    def __init__(self, tealer: Tealer):
        pass

    def detect(self):
        pass

The detectors can be classified into two classes:

  • Type 1: Detectors that do not need any information about other contracts executed in the group to detect the bug.
    • Example: the "Inner Txn Fee must be zero" bug. The detector has to find all the inner transactions in a function and check whether the fee field is set to zero. Execution of other contracts does not invalidate the bug.
  • Type 2: Detectors that require information about the group's transactions.
    • Example: RekeyTo, AssetCloseTo, ... Any bug that depends on the possible values of a transaction field. A different contract executed in the same group can access the transaction field and perform validations on it.

Type 1 detectors use Tealer.contracts to access the individual contracts and run the analysis on each of them.
Type 2 detectors use Tealer.group_configs to access the transaction configs and run the analysis on each group of transactions.
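The split could look roughly like the following sketch. The `Tealer` here is a simplified stand-in that holds contract names and lists of transaction ids instead of Teal and GroupConfig objects, and the two detector class names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Tealer:
    """Simplified stand-in: strings replace Teal and GroupConfig objects."""
    contracts: List[str] = field(default_factory=list)
    group_configs: List[List[str]] = field(default_factory=list)


class AbstractDetector:
    def __init__(self, tealer: Tealer):
        self.tealer = tealer

    def detect(self) -> List[str]:
        raise NotImplementedError


class PerContractDetector(AbstractDetector):
    """Type 1: analyzes each contract on its own."""

    def detect(self) -> List[str]:
        return [f"analyzed contract {name}" for name in self.tealer.contracts]


class GroupDetector(AbstractDetector):
    """Type 2: analyzes each configured group of transactions."""

    def detect(self) -> List[str]:
        return [
            f"analyzed group of {len(group)} transactions"
            for group in self.tealer.group_configs
        ]
```

Both kinds share the same constructor and `detect` entry point, so the caller does not need to know which type a detector is.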

Output format:

TBD

@S3v3ru5
Contributor Author

S3v3ru5 commented Mar 30, 2023

User Configuration

Contract = {
    /* name of the contract, e.g. pool. Every contract should have a unique name */
    "name": string;
    /* filesystem path of the contract, (relative path)*/
    "path": string;
    /* Type of the contract: one of LogicSig, ApprovalProgram or ClearStateProgram*/
    "type": string;
    /* Contract's teal version */
    "version": int;
    /* Names of subroutines present in the contract */
    "subroutines": string[];
    /* Functions/User operations */
    "functions": Function[];
}
Function = {
    /* Execution path to reach the function's entry block.
        The execution path is part of the method dispatcher CFG.
        The execution path is an array of strings. For example, ["B0", "B1", "B3", "B4"]
        The basic blocks "B0", "B1", "B3", "B4" are part of the method dispatcher. The code in these blocks checks the function identifier and routes to the function accordingly.
        The block "B4" is the start of the function code.
    */
    "execution_path": string[];
    /* Name of the operation/function. Used as the identifier in "group_configurations". Should be unique within a contract. */
    "function_name": string;
}
Transaction = {
    /* A unique id for this transaction. The id is only used to refer to this transaction from other transactions of the group configuration. Example: "T1" */
    "tx_id": string;
    /* Type of the transaction: one of "pay", "keyreg", "acfg", "axfer", "afrz", "appl" or "txn". "txn" can be used to represent any type of transaction */
    "txn_type": string?;
    /* If the transaction is to be signed with a LogicSig, specify the contract name and the function name */
    "logic_sig": {
        "contract": string;
        "function": string;
    }?;
    /* If the transaction is an application call, specify the contract and the function being called */
    "application": {
        "contract": string;
        "function": string;
    }?;
    /* Transaction's index in the group. If the transaction MUST be present at a predefined index in the group and contracts in the group use an absolute index to access fields of this transaction, specify that index in this field. If the transaction is always the first transaction in the group, "absolute_index" should be `0`.
    */
    "absolute_index": int?;
    /* Relative indexes of other transactions in the group, measured from this transaction. The relative indexes specified must be predefined and static. Relative indexes that depend on other runtime information, for example on application arguments, are not completely supported: if the contracts in this transaction perform validations on the other transaction, Tealer will not be able to take those validations into account when analyzing that transaction.
    */
    "relative_indexes": [
        {
            /* id specified in the "tx_id" field of the other Transaction. Need a better field name */
            "other_tx_id": string;
            /* Relative index of the "other_tx_id" transaction from this transaction.
            For example, if the "other_tx_id" transaction must precede this transaction, then the relative index is `-1`.
            The contract executed in this transaction will access the "other_tx_id" transaction using "(Txn.GroupIndex) - 1" */
            "relative_index": int;
        }?;
    ]?;
}
GroupConfig = Transaction[]

UserConfig = {
    "contracts": Contract[],
    "group_configurations": GroupConfig[],
}

@montyly
Member

montyly commented Apr 3, 2023

  • I would add entry / exit_blocks fields to Function.
  • Should we consider the possibility of a function/subroutine having more than one entry point? I haven't seen this happen on other smart contract platforms, but because we can have a custom CFG here, could it happen if a function is an irreducible loop? Hopefully not, as we should still have one entry point, but it is better to consider it in advance.
  • For the user configuration, I think we should not use a dictionary, but a class. We should also load the information from a YAML file, or something similar.
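A class-based configuration could be sketched with dataclasses as below. The class names `FunctionConfig` and `ContractConfig` and the `from_dict` constructor are assumptions for illustration; the field names follow the Contract and Function schema from the earlier comment. The input mapping could come from any loader (for example a YAML parser), which is left out to keep the sketch dependency-free.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class FunctionConfig:
    function_name: str
    execution_path: List[str]


@dataclass
class ContractConfig:
    name: str
    path: str
    type: str  # "LogicSig", "ApprovalProgram" or "ClearStateProgram"
    version: int
    subroutines: List[str] = field(default_factory=list)
    functions: List[FunctionConfig] = field(default_factory=list)

    @classmethod
    def from_dict(cls, raw: Dict[str, Any]) -> "ContractConfig":
        """Build a ContractConfig from an already-parsed mapping."""
        return cls(
            name=raw["name"],
            path=raw["path"],
            type=raw["type"],
            version=int(raw["version"]),
            subroutines=list(raw.get("subroutines", [])),
            functions=[
                FunctionConfig(f["function_name"], list(f["execution_path"]))
                for f in raw.get("functions", [])
            ],
        )
```

Typed attributes make missing required keys fail early with a KeyError at load time instead of deep inside a detector.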

@S3v3ru5
Contributor Author

S3v3ru5 commented Apr 4, 2023

Example config:

name: protocol name
contracts:
  - name: contract1
    path: contracts/contract1.teal
    type: LogicSig
    version: 6
    subroutines:
      - sub1
      - sub2
    functions:
      - name: function1
        execution_path: [B0, B1, B2]
        entry: B2
        exit: [B10, B11]
      - name: function2
        execution_path: [B0, B1, B3]
        entry: B3
        exit: [B12, B13]
  - name: contract2
    path: contracts/contract2.teal
    type: ApprovalProgram
    version: 6
    subroutines:
      - opt
      - delete
    functions:
      - name: init
        execution_path: [B0, B1]
        entry: B1
        exit: [B13]
      - name: clear
        execution_path: [B0, B2]
        entry: B2
        exit: [B13]
groups:
  - - txn_id: T1
      txn_type: pay
      logic_sig:
        contract: contract1
        function: function1
      absolute_index: 0
    - txn_id: T2
      txn_type: appl
      application:
        contract: contract2
        function: init
      absolute_index: 1
  - - txn_id: T1
      txn_type: axfer
    - txn_id: T2
      txn_type: appl
      logic_sig:
        contract: contract1
        function: function2
      application:
        contract: contract2
        function: clear
      relative_indexes:
        - other_txn_id: T1
          relative_index: -1
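Once a group such as the ones above is parsed into a list of mappings, some basic consistency checks become possible. The sketch below uses the field names from the example config. The `check_group` helper is hypothetical, and the rule it applies is an assumption drawn from the "(Txn.GroupIndex) - 1" description: when both transactions declare an absolute_index, the other transaction's index should equal this transaction's index plus the relative index.

```python
from typing import Any, Dict, List


def check_group(group: List[Dict[str, Any]]) -> List[str]:
    """Return a list of problems found in one parsed group configuration."""
    by_id = {txn["txn_id"]: txn for txn in group}
    problems: List[str] = []
    for txn in group:
        for rel in txn.get("relative_indexes", []):
            other = by_id.get(rel["other_txn_id"])
            if other is None:
                problems.append(
                    f"{txn['txn_id']}: unknown transaction {rel['other_txn_id']}"
                )
                continue
            this_abs = txn.get("absolute_index")
            other_abs = other.get("absolute_index")
            # Only comparable when both absolute indexes are declared.
            if this_abs is None or other_abs is None:
                continue
            if other_abs != this_abs + rel["relative_index"]:
                problems.append(
                    f"{txn['txn_id']}: relative index {rel['relative_index']} "
                    f"to {other['txn_id']} contradicts the absolute indexes"
                )
    return problems
```

For the second group in the example (T2 referring to T1 at relative index -1, with no absolute indexes), the function reports no problems.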


2 participants