Documentation #503

Merged · 13 commits · Jan 18, 2018
37 changes: 37 additions & 0 deletions docs/catchup.md
@@ -0,0 +1,37 @@
# Ledger catchup

A node uses a process called `catchup` to sync its ledgers with other nodes. It performs catchup:
- after starting up: a node communicates the state of its ledgers to every node it connects to and learns whether it is ahead of, behind, or at the same state as the others.
- after a view change: once a view change starts, nodes again communicate the state of their ledgers to other nodes.
- if it realises it has missed some transactions: nodes periodically send checkpoints indicating how many transactions they have processed
recently; if a node finds that it has missed some txns, it performs catchup (a rough sketch of this check follows the list).
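
A minimal sketch of the third trigger, assuming a hypothetical helper that compares the node's own ordered state with the checkpoints reported by its peers (names here are illustrative, not the actual plenum API):

```python
# Hypothetical sketch: decide whether to start catchup based on peers' checkpoints.
# Names are illustrative, not the actual plenum API.
def should_start_catchup(own_last_ordered: int,
                         peer_checkpoint_seq_nos: list,
                         quorum: int) -> bool:
    # If enough peers report a checkpoint beyond what this node has ordered,
    # the node has missed transactions and should catch up.
    peers_ahead = [s for s in peer_checkpoint_seq_nos if s > own_last_ordered]
    return len(peers_ahead) >= quorum
```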

The catchup is managed by an object called `LedgerManager`. The `LedgerManager` maintains a `LedgerInfo` object for each ledger and references each ledger by its unique id called `ledger_id`.
When a `ledger` is initialised, the `addLedger` method of `LedgerManager` is called; this method registers the ledger with the `LedgerManager`.
`addLedger` also accepts callbacks which are called on the occurrence of different events, such as before/after starting catchup for a ledger,
before/after marking catchup complete for a ledger, and after adding any transaction received during catchup to the ledger (an illustrative registration snippet appears at the end of this document).
The `LedgerManager` performs catchup of each ledger sequentially, which means that catchup for one ledger must complete before catchup for another starts.
This is done mostly for simplicity and can be optimised, but the pool ledger needs to be caught up first since the pool ledger records how many nodes are
in the network. Catchup for any ledger involves these steps:
- Learn the correct state (how many txns, merkle roots) of the ledger by using `LedgerStatus` messages from other nodes.
Contributor

Please mention how and when LedgerStatus is sent.

Contributor Author

It is mentioned below.

- Once sufficient (`Quorums.ledger_status`) and consistent `LedgerStatus` messages are received, the node knows the latest state of the ledger: if it finds its own ledger to be
up to date, it marks the ledger catchup complete; otherwise it waits for `ConsistencyProof` messages from other nodes until a timeout.
- When a node receives a `LedgerStatus` and finds the sending node's ledger behind, it sends a `ConsistencyProof` message. This message is a
merkle subset proof between the sending node's ledger and the receiving node's ledger, e.g. if the sending node A's ledger has size 20 and merkle root `x` and
the receiving node B's ledger has size 30 with merkle root `y`, B sends a proof to A that A's ledger with size 20 and root `x` is a subset of B's ledger with size
30 and root `y`.
- After receiving a `ConsistencyProof`, the node verifies it.
- Once the node that is catching up has sufficient (`Quorums.consistency_proof`) and consistent `ConsistencyProof` messages, it knows how many transactions it needs to catch up and
requests those transactions from other nodes, distributing the load equally, e.g. if a node has to catch up 1000 txns and there are 5 other nodes in the
network, the node requests 200 txns from each (a rough sketch of this splitting follows the list). The node catching up sends a `CatchupReq` message to each of these nodes and expects a corresponding `CatchupRep`.
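
A minimal sketch of that splitting, assuming the catching-up node already knows its own ledger size and the target size (the function and variable names are assumptions, not the actual `LedgerManager` code):

```python
# Illustrative sketch of dividing the missing txn range into per-node CatchupReq
# batches; names are assumptions, not the actual LedgerManager implementation.
def split_catchup_requests(start_seq_no: int, end_seq_no: int, node_names: list) -> dict:
    """Assign each node a contiguous slice of [start_seq_no, end_seq_no] to fetch."""
    total = end_seq_no - start_seq_no + 1
    per_node = total // len(node_names)
    batches, frm = {}, start_seq_no
    for i, name in enumerate(node_names):
        to = end_seq_no if i == len(node_names) - 1 else frm + per_node - 1
        batches[name] = (frm, to)   # would become a CatchupReq(ledger_id, frm, to)
        frm = to + 1
    return batches

# e.g. 1000 missing txns (seq 21..1020) over 5 peers -> 200 txns requested from each
print(split_catchup_requests(21, 1020, ['A', 'B', 'C', 'D', 'E']))
```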

Apart from this, if the node does not receive sufficient or consistent `ConsistencyProof`s within a timeout, it requests them using `request_CPs_if_needed`.
Similarly, if the node does not receive sufficient or consistent `CatchupRep`s within a timeout, it requests them using `request_txns_if_needed`.
These timeouts can be configured through `ConsistencyProofsTimeout` and `CatchupTransactionsTimeout` respectively in the config file.
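
For illustration, overriding these in the config file might look like this (the values below are examples, not necessarily the defaults):

```python
# Example overrides in the plenum config file; the values are illustrative only.
ConsistencyProofsTimeout = 5         # seconds to wait for a ConsistencyProof quorum
CatchupTransactionsTimeout = 30      # seconds to wait for the requested CatchupReps
```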


Relevant code:
- LedgerManager: `plenum/common/ledger_manager.py`
- LedgerInfo: `plenum/common/ledger_info.py`
- Catchup message types: `plenum/common/messages/node_messages.py`
Contributor

We should also mention Quorums (quorums.py) containing all quorum values including catch-up quorums.

Contributor Author

Done

- Catchup quorum values: `plenum/server/quorums.py`
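
As a rough illustration of the `addLedger` registration described earlier, a pool ledger might be registered roughly like this (the keyword-argument names are assumptions, not the exact plenum signature):

```python
# Rough illustration of registering a ledger with the LedgerManager; the
# keyword-argument names are assumptions, not the exact plenum signature.
ledger_manager.addLedger(
    POOL_LEDGER_ID,
    pool_ledger,
    preCatchupStartClbk=on_pool_catchup_start,      # before catchup begins
    postCatchupCompleteClbk=on_pool_catchup_done,   # after catchup is marked complete
    postTxnAddedToLedgerClbk=on_pool_txn_added)     # after each caught-up txn is added
```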
2 changes: 2 additions & 0 deletions docs/main.md
@@ -0,0 +1,2 @@
Outline of the system
Node
57 changes: 57 additions & 0 deletions docs/request_handling.md
@@ -0,0 +1,57 @@
# Request handling

To handle requests sent by clients, the nodes require a ledger and/or a state as a data store. Request handling logic is written in `RequestHandler`.
`RequestHandler` is extended by new classes to support new requests. A node provides the method `register_new_ledger` to register a new `Ledger` and `register_req_handler` to register new `RequestHandler`s (a rough registration sketch follows).
A callback to run after a write request has been successfully executed can also be registered using `register_executer`.
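
A minimal sketch of extending and registering a handler, assuming a hypothetical ledger id and placeholder method bodies (the argument order of the registration calls is also illustrative):

```python
from plenum.server.req_handler import RequestHandler

# Minimal sketch of a custom handler; the ledger id and method bodies are placeholders.
class MyRequestHandler(RequestHandler):
    def doStaticValidation(self, request):
        pass  # checks that need no state, e.g. mandatory fields are present

    def validate(self, request):
        pass  # dynamic validation against the current state

    def apply(self, request):
        pass  # optimistically apply the change to the ledger and state

    def commit(self, *args, **kwargs):
        pass  # persist the optimistically applied changes

MY_LEDGER_ID = 1001  # placeholder id
node.register_new_ledger(MY_LEDGER_ID, my_ledger)          # illustrative call
node.register_req_handler(MY_LEDGER_ID, MyRequestHandler(my_ledger, my_state))
```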


There are 2 types of requests a client can send:
- Query:
Here the client requests transactions or some state variables from the node(s). The client can either send a `GET_TXN` to get any transaction with a given sequence number from any ledger,
or it can send specific `GET_*` transactions for queries (an illustrative `GET_TXN` payload follows this list).
- Write:
Here the client asks the nodes to write a new transaction to the ledger and change some state variable. This requires the nodes to run a consensus protocol (currently RBFT).
If the protocol run is successful, the client's proposed changes are written.
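
For example, a generic query might look roughly like this (the field names follow plenum's request structure, but treat the exact layout and values as illustrative):

```python
# Roughly what a GET_TXN query carries; treat the exact layout as illustrative.
get_txn_request = {
    'identifier': '<client identifier / verkey>',
    'reqId': 1516116718275843,
    'operation': {
        'type': '3',    # assumed GET_TXN txn type
        'ledgerId': 1,  # which ledger to read from
        'data': 9,      # sequence number of the wanted transaction
    },
}
```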


#### Request handling
Below is a description of how a request is processed.
On receiving a client request in `validateClientMsg`, the node:
- Performs static validation checks (validation that does not require any state, e.g. that mandatory fields are present); it uses `ClientMessageValidator` and
`RequestHandler`'s `doStaticValidation` for this.
- If static validation passes, checks whether a signature check is required (not required for queries) and, if needed, performs it in `verifySignature`. More on this later.
- Checks if it is a generic transaction query (`GET_TXN`). If so, it queries the ledger for that particular sequence number and returns the result. A `REQNACK` might be sent if the query is not correctly constructed.
- Checks if it is a specific query (one needing a specific `RequestHandler`); if so, it uses `process_query`, which uses the specific `RequestHandler` to respond to the query. A `REQNACK` might be sent if the query is not correctly constructed.
- If it is a write, the node checks whether it has already processed the request by checking the uniqueness of the `identifier` and `reqId` fields of the Request.
- If it has already processed the request, it sends the corresponding `Reply` to the client.
- If the `Request` is already in process, it sends an acknowledgement to the client as a `REQACK`.
- If the node has not seen the `Request` before, it broadcasts the `Request` to all nodes in a `PROPAGATE`.
- Once a node receives sufficient (`Quorums.propagate`) `PROPAGATE`s for a request, it forwards the request to its replicas.
- A primary replica, on receiving a forwarded request, does dynamic validation (validation requiring state, e.g. whether the request violates a unique constraint or performs an unauthorised action)
Contributor

Should we mention batching here? That Primary applies dynamic validation for each request in the batch, and that the batch can still be processed if there are invalid requests there. Reject is sent just for invalid requests from the batch.

Contributor Author

Batch is an optimisation for consensus, I didn't want to digress. I think a detailed doc explaining our differences in 3PC with RBFT should list batching, merkle roots and checks in 3PC messages, etc.

Contributor

Ok, that's fine. Probably it's sufficient here to just mention this doc so that one can have a look at how this all works with batches.

on the request using `doDynamicValidation`, which in turn calls the corresponding `RequestHandler`'s `validate` method. If the validation succeeds, the
`apply` method of the `RequestHandler` is called, which optimistically applies the changes to the ledger and state.
If the validation fails, the primary sends a `REJECT` to the client. The primary then sends a `PRE-PREPARE` to the other nodes, containing the merkle roots and some other information for all such requests.
- The non-primary replicas, on receiving the `PRE-PREPARE`, perform the same dynamic validation that the primary performed on each request. They also check whether the merkle roots match.
If all checks pass, the replicas send a `PREPARE`; otherwise they reject the `PRE-PREPARE`.
- Once the consensus protocol has been successfully executed for the request, the replicas send an `ORDERED` message to their node and the node updates the monitor.
- If the `ORDERED` message was sent by the master replica, the node executes the request, meaning it commits any changes made to the ledger and state by that
request by calling the corresponding `RequestHandler`'s `commit` method, and sends a `Reply` back to the client (a condensed sketch of this step follows the list).
- The node also tracks the request's `identifier` and `reqId` in a key value database in `updateSeqNoMap`.
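
A condensed sketch of the execution step after ordering, assuming simplified message fields and hypothetical helper names (this is not the actual `Node` code):

```python
# Condensed sketch of executing an ORDERED request; message fields and helper
# names are assumptions, not the actual Node implementation.
MASTER_INSTANCE_ID = 0

def on_ordered(node, ordered):
    node.monitor.requestOrdered(ordered)        # every ORDERED updates the monitor
    if ordered.instId != MASTER_INSTANCE_ID:
        return                                  # only the master replica's order executes
    handler = node.get_req_handler(ordered.ledgerId)
    for req_key in ordered.reqIdr:
        request = node.requests[req_key].request
        handler.commit(request)                 # persist the optimistically applied changes
        node.updateSeqNoMap([request])          # track (identifier, reqId) in the KV store
        node.send_reply_to_client(request)      # hypothetical helper that sends the Reply
```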


#### Signature verification
Each node has a `ReqAuthenticator` object which allows `ClientAuthNr` objects to be registered using the `register_authenticator` method. During signature verification,
Contributor

Should we mention that we support both single signature and multi-signatures now?

Contributor Author

Maybe? Do you?

Contributor

I think we need to mention this if we have such a section as signature verification.

a node runs each registered authenticator over the request, and if any authenticator raises an exception, signature verification is considered failed.
A node has at least 1 authenticator called `CoreAuthNr`, whose `authenticate` method is called over the serialised request data to verify the signature (a simplified sketch follows).
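
A simplified sketch of that loop, assuming a hypothetical accessor for the registered authenticators (not the exact `ReqAuthenticator` API):

```python
# Simplified sketch of running every registered authenticator over a request;
# the accessor and argument shapes are assumptions, not the exact API.
def verify_signature(req_authenticator, request) -> bool:
    for authnr in req_authenticator.get_authnrs():   # includes at least CoreAuthNr
        try:
            authnr.authenticate(request.as_dict)      # raises on a bad/missing signature
        except Exception:
            return False                              # any failure fails verification
    return True
```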


Relevant code:
- Node: `plenum/server/node.py`
- Replica: `plenum/server/replica.py`
- Propagator: `plenum/server/propagator.py`
- Request: `plenum/common/request.py`
- Request structure validation: `plenum/common/messages/client_request.py`
- RequestHandler: `plenum/server/req_handler.py`
- Request Authenticator: `plenum/server/req_authenticator.py`
- Core Authenticator: `plenum/server/client_authn.py`
- Quorums: `plenum/server/quorums.py`
3 changes: 1 addition & 2 deletions docs/storage.md
@@ -17,8 +17,7 @@ Each new transaction is added to the ledger (log) and is also hashed (sha256), which
results in a new merkle root. Thus for each transaction a merkle proof of presence, called `inclusion proof` or `audit_path`, can be created by
using the root hash and a few (`O(lgn)`, `n` being the total number of leaves/transactions in the tree/ledger) intermediate hashes.

Hashes of all the
leaves and intermediate nodes of the tree are stored in a `HashStore`, enabling the creating of inclusion proofs and subset proofs. A subset proof
Hashes of all the leaves and intermediate nodes of the tree are stored in a `HashStore`, enabling the creation of inclusion proofs and subset proofs. A subset proof
proves that a particular merkle tree is a subset of another (usually larger) merkle tree; more on this in the Catchup doc. The `HashStore` has 2 separate storages for storing leaves
and intermediate nodes; each leaf or node hash of the tree is 32 bytes. Each of these storages can be a binary file or a key value store.
The leaf or node hashes are queried by their number. When a client write request completes or it requests a transaction with a particular sequence number from a ledger,
9 changes: 5 additions & 4 deletions plenum/server/node.py
@@ -1957,17 +1957,18 @@ def processRequest(self, request: Request, frm: str):
ledgerId = self.ledger_id_for_request(request)
ledger = self.getLedger(ledgerId)

if self.is_query(request.operation[TXN_TYPE]):
self.process_query(request, frm)
return

reply = self.getReplyFromLedger(ledger, request)
if reply:
logger.debug("{} returning REPLY from already processed "
"REQUEST: {}".format(self, request))
self.transmitToClient(reply, frm)
return

if self.is_query(request.operation[TXN_TYPE]):
self.process_query(request, frm)
return

# If the node is not already processing the request
if not self.isProcessingReq(*request.key):
self.startedProcessingReq(*request.key, frm)
# If not already got the propagate request(PROPAGATE) for the