HashStore is used to recover when the ledger starts, fixed the calcu… #271

Merged 4 commits on Jul 12, 2017
Changes from 3 commits
24 changes: 19 additions & 5 deletions ledger/compact_merkle_tree.py
@@ -261,10 +261,24 @@ def leafCount(self) -> int:
     def nodeCount(self) -> int:
         return self.hashStore.nodeCount

-    def verifyConsistency(self, expectedLeafCount = -1) -> bool:
-        if expectedLeafCount > 0 and expectedLeafCount != self.leafCount:
+    @staticmethod
+    def get_expected_node_count(leaf_count):

Contributor

Why should this be here? Can other Merkle tree implementations have a different number of nodes for the same data?
It could be here if it were private, but I see that it is also used in hash_store.py:146.

Contributor Author (@lovesh, Jul 12, 2017)

Well, we store full subtree roots in CompactMerkleTree. I am not sure whether we will need to store them for other kinds of trees, but we don't have any others as of now. The same argument applies to HashStore: I am not sure we will need HashStore with all tree implementations.

Contributor

Wouldn't it be better to move it to MerkleTree or some utils class?

Even if we are not going to have other implementations, we already have a class hierarchy for this, so we should either follow it or drop it. MerkleTree is used as the type of the tree argument, but then we explicitly use CompactMerkleTree, which is confusing.

Contributor Author

We will move it when we need it in other places. I don't like the idea of moving things to general utility files. Maybe later on, if we have several kinds of trees and some functions apply to some of them and not others, we will create a tree_utils or something like that.

+        """
+        The number of nodes is the number of full subtrees present
+        """
+        count = 0
+        while leaf_count > 1:
+            leaf_count //= 2
+            count += leaf_count
+        return count
+
+    def verify_consistency(self, expected_leaf_count) -> bool:
+        """
+        Check that the tree has same leaf count as expected and the
+        number of nodes are also as expected
+        """
+        if expected_leaf_count != self.leafCount:
             raise ConsistencyVerificationFailed()
-        expectedNodeCount = count_bits_set(self.leafCount)
-        if not expectedNodeCount == self.nodeCount:
+        if self.get_expected_node_count(self.leafCount) != self.nodeCount:
             raise ConsistencyVerificationFailed()
-        return True
+        return True
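
The loop in get_expected_node_count can be checked in isolation. The sketch below is a standalone copy of that loop (the names expected_node_count and popcount are chosen here for illustration and are not part of the PR). It compares the loop against the closed form n - popcount(n), which holds because the sum of floor(n / 2^k) over k >= 1 equals n minus the number of set bits of n. Assuming count_bits_set in the removed lines returns the number of set bits, the old check compared nodeCount against the number of full subtree roots, which differs from the number of interior nodes for most sizes (for 8 leaves: 1 versus 7).

```python
def expected_node_count(leaf_count: int) -> int:
    """Standalone copy of the loop from the diff: one interior node per
    full subtree of size 2, 4, 8, ... contained in `leaf_count` leaves."""
    count = 0
    while leaf_count > 1:
        leaf_count //= 2      # full subtrees one level further up
        count += leaf_count
    return count


def popcount(n: int) -> int:
    """Number of set bits in n (illustrative helper, not from the PR)."""
    return bin(n).count("1")


# The loop agrees with the closed form n - popcount(n) for every size tried.
for n in range(0, 2049):
    assert expected_node_count(n) == n - popcount(n)

# The two quantities diverge quickly: 8 leaves give 7 interior nodes
# but only 1 full subtree root.
assert expected_node_count(8) == 7
assert popcount(8) == 1
```

The closed form is only a cross-check; the loop in the PR is what the tree and the hash store actually call.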
46 changes: 20 additions & 26 deletions ledger/ledger.py
@@ -73,34 +73,28 @@ def recoverTree(self):
                             .format(type(self.tree)))


-        # ATTENTION!
-        # This functionality is disabled until better consistency verification
-        # implemented - always using recovery from transaction log
-        # from ledger.stores.memory_hash_store import MemoryHashStore
-        # from ledger.util import ConsistencyVerificationFailed
-        # if not self.tree.hashStore \
-        #         or isinstance(self.tree.hashStore, MemoryHashStore) \
-        #         or self.tree.leafCount == 0:
-        #     logging.info("Recovering tree from transaction log")
-        #     self.recoverTreeFromTxnLog()
-        # else:
-        #     try:
-        #         logging.info("Recovering tree from hash store of size {}".
-        #                      format(self.tree.leafCount))
-        #         self.recoverTreeFromHashStore()
-        #     except ConsistencyVerificationFailed:
-        #         logging.error("Consistency verification of merkle tree "
-        #                       "from hash store failed, "
-        #                       "falling back to transaction log")
-        #         self.recoverTreeFromTxnLog()
-
-        logging.debug("Recovering tree from transaction log")
+        from ledger.stores.memory_hash_store import MemoryHashStore

Contributor

Move these two, and all your other in-function imports, to the top of their files.

Contributor Author (@lovesh, Jul 12, 2017)

ok, done

+        from ledger.util import ConsistencyVerificationFailed
         start = time.perf_counter()
-        self.recoverTreeFromTxnLog()
+        if not self.tree.hashStore \
+                or isinstance(self.tree.hashStore, MemoryHashStore) \

Contributor

Consider doing this in a more generic way that avoids instance checking.

Contributor Author

Done

+                or self.tree.leafCount == 0:
+            logging.debug("Recovering tree from transaction log")
+            self.recoverTreeFromTxnLog()
+        else:
+            try:
+                logging.debug("Recovering tree from hash store of size {}".
+                              format(self.tree.leafCount))
+                self.recoverTreeFromHashStore()
+            except ConsistencyVerificationFailed:
+                logging.error("Consistency verification of merkle tree "
+                              "from hash store failed, "
+                              "falling back to transaction log")
+                self.recoverTreeFromTxnLog()
+
         end = time.perf_counter()
         t = end - start
-        logging.debug("Recovered tree from transaction log in {} seconds".
-                      format(t))
+        logging.debug("Recovered tree in {} seconds".format(t))

     def recoverTreeFromTxnLog(self):
         # TODO: in this and some other lines specific fields of
@@ -118,7 +112,7 @@ def recoverTreeFromHashStore(self):
         hashes = list(reversed(self.tree.inclusion_proof(treeSize,
                                                          treeSize + 1)))
         self.tree._update(self.tree.leafCount, hashes)
-        self.tree.verifyConsistency(self._transactionLog.numKeys)
+        self.tree.verify_consistency(self._transactionLog.numKeys)

     def add(self, leaf):
         self._addToStore(leaf)
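
The recovery decision re-enabled in recoverTree can be sketched as a small standalone function. Everything below is a toy model and not the ledger package's code: ToyHashStore, recover and the is_persistent attribute are hypothetical names. The is_persistent flag is only one possible reading of the reviewer's suggestion to avoid instance checking; how the PR actually addressed it is not visible in this diff.

```python
import logging


class ConsistencyVerificationFailed(Exception):
    """Stand-in for the exception of the same name in ledger.util."""


def expected_node_count(leaf_count: int) -> int:
    """Same loop as get_expected_node_count in the diff."""
    count = 0
    while leaf_count > 1:
        leaf_count //= 2
        count += leaf_count
    return count


class ToyHashStore:
    """Hypothetical store that only tracks counts, for illustration."""

    def __init__(self, leaf_count, node_count, is_persistent=True):
        self.leaf_count = leaf_count
        self.node_count = node_count
        # Hypothetical capability flag used instead of isinstance().
        self.is_persistent = is_persistent

    @property
    def is_consistent(self) -> bool:
        return self.node_count == expected_node_count(self.leaf_count)


def recover(txn_count: int, hash_store) -> str:
    """Return which source the tree would be rebuilt from."""
    if hash_store is None or not hash_store.is_persistent \
            or hash_store.leaf_count == 0:
        return "txn_log"
    try:
        # Mirrors verify_consistency: leaf count first, then node count.
        if hash_store.leaf_count != txn_count or not hash_store.is_consistent:
            raise ConsistencyVerificationFailed()
        return "hash_store"
    except ConsistencyVerificationFailed:
        logging.error("Consistency verification failed, "
                      "falling back to transaction log")
        return "txn_log"


print(recover(10, ToyHashStore(10, 8)))   # consistent store
print(recover(10, ToyHashStore(10, 2)))   # wrong node count
```

With 10 leaves the expected node count is 8, so the first call uses the hash store and the second falls back to the transaction log.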
2 changes: 1 addition & 1 deletion ledger/merkle_tree.py
@@ -69,6 +69,6 @@ def nodeCount(self) -> int:
         """

     @abstractmethod
-    def verifyConsistency(self, expectedLeafCount) -> bool:
+    def verify_consistency(self, expectedLeafCount) -> bool:
         """
         """
8 changes: 8 additions & 0 deletions ledger/stores/hash_store.py
@@ -137,6 +137,14 @@ def readNodeByTree(self, start, height=None):
         pos = self.getNodePosition(start, height)
         return self.readNode(pos)

+    @property
+    def is_consistent(self) -> bool:
+        """
+        Returns True if number of nodes are consistent with number of leaves
+        """
+        from ledger.compact_merkle_tree import CompactMerkleTree
+        return self.nodeCount == CompactMerkleTree.get_expected_node_count(self.leafCount)
+
     @staticmethod
     def _validatePos(start, end=None):
         if end:
8 changes: 8 additions & 0 deletions ledger/test/test_file_hash_store.py
@@ -51,6 +51,14 @@ def testSimpleReadWrite(nodesLeaves, tempdir):
     for i, n in enumerate(nds):
         assert nodes[i][2] == n

+    # Check that hash store can be closed and re-opened and the contents remain same
+    leaf_count = fhs.leafCount
+    node_count = fhs.nodeCount
+    fhs.close()
+    reopened_hash_store = FileHashStore(tempdir)
+    assert reopened_hash_store.leafCount == leaf_count
+    assert reopened_hash_store.nodeCount == node_count
+


 def testIncorrectWrites(tempdir):
     fhs = FileHashStore(tempdir, leafSize=50, nodeSize=50)
2 changes: 1 addition & 1 deletion ledger/test/test_ledger.py
@@ -107,7 +107,7 @@ def testRecoverLedgerFromHashStore(tempdir):
     fhs = FileHashStore(tempdir)
     tree = CompactMerkleTree(hashStore=fhs)
     ledger = Ledger(tree=tree, dataDir=tempdir)
-    for d in range(10):
+    for d in range(100):
         ledger.add(str(d).encode())
     updatedTree = ledger.tree
     ledger.stop()
15 changes: 9 additions & 6 deletions ledger/test/test_merkle_proof.py
@@ -110,6 +110,9 @@
 """


+TXN_COUNT = 1000
+
+
 @pytest.yield_fixture(scope="module", params=['File', 'Memory'])
 def hashStore(request, tdir):
     if request.param == 'File':
@@ -141,15 +144,13 @@ def hasherAndTree(hasher):
 def addTxns(hasherAndTree):
     h, m = hasherAndTree

-    txn_count = 1000
-
     auditPaths = []
-    for d in range(txn_count):
+    for d in range(TXN_COUNT):
         serNo = d+1
         data = str(serNo).encode()
         auditPaths.append([hexlify(h) for h in m.append(data)])
-
-    return txn_count, auditPaths
+    print(m.hashStore.leafCount, m.hashStore.nodeCount)
+    return TXN_COUNT, auditPaths


 @pytest.fixture()
@@ -200,14 +201,16 @@ def testCompactMerkleTree2(hasherAndTree, verifier):
 def testCompactMerkleTree(hasherAndTree, verifier):
     h, m = hasherAndTree
     printEvery = 1000
-    count = 1000
+    count = TXN_COUNT
     for d in range(count):
         data = str(d + 1).encode()
         data_hex = hexlify(data)
         audit_path = m.append(data)
         audit_path_hex = [hexlify(h) for h in audit_path]
         incl_proof = m.inclusion_proof(d, d+1)
         assert audit_path == incl_proof
+        assert m.nodeCount == m.get_expected_node_count(m.leafCount)
+        assert m.hashStore.is_consistent
         if d % printEvery == 0:
             show(h, m, data_hex)
             print("audit path is {}".format(audit_path_hex))
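
The per-append assertion added to testCompactMerkleTree can be illustrated without the ledger package. The following is a toy append-only tree, not the CompactMerkleTree implementation: ToyCompactTree and its attributes are hypothetical, and the 0x00/0x01 hash prefixes follow RFC 6962, which is an assumption about the hashing scheme rather than something shown in this diff. It keeps only the roots of the full subtrees and counts one interior node per merge, so the count can be compared with the expected value after every append.

```python
import hashlib


def expected_node_count(leaf_count: int) -> int:
    """Same loop as get_expected_node_count in the diff."""
    count = 0
    while leaf_count > 1:
        leaf_count //= 2
        count += leaf_count
    return count


class ToyCompactTree:
    """Toy model: stores only the roots of full subtrees (hypothetical)."""

    def __init__(self):
        self.peaks = []        # (height, hash) per full subtree, left to right
        self.leaf_count = 0
        self.node_count = 0    # interior nodes created so far

    def append(self, data: bytes) -> None:
        node = (0, hashlib.sha256(b"\x00" + data).digest())
        self.leaf_count += 1
        # Merge equal-height subtrees; each merge creates one interior node.
        while self.peaks and self.peaks[-1][0] == node[0]:
            height, left = self.peaks.pop()
            digest = hashlib.sha256(b"\x01" + left + node[1]).digest()
            node = (height + 1, digest)
            self.node_count += 1
        self.peaks.append(node)


tree = ToyCompactTree()
for d in range(1000):
    tree.append(str(d + 1).encode())
    # Same shape of check as the assertion added in the test above.
    assert tree.node_count == expected_node_count(tree.leaf_count)
```

Checking after every append, rather than once at the end, is what catches sizes that are not powers of two, where the number of subtree roots and the number of interior nodes differ the most.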
8 changes: 4 additions & 4 deletions plenum/cli/cli.py
@@ -252,10 +252,10 @@ def __init__(self, looper, basedirpath, nodeReg=None, cliNodeReg=None,
                        eventloop=eventloop,
                        output=out)

-        RAETVerbosity = getRAETLogLevelFromConfig("RAETLogLevelCli",
-                                                  Console.Wordage.mute,
-                                                  self.config)
-        RAETLogFile = getRAETLogFilePath("RAETLogFilePathCli", self.config)
+        # RAETVerbosity = getRAETLogLevelFromConfig("RAETLogLevelCli",
+        #                                           Console.Wordage.mute,
+        #                                           self.config)
+        # RAETLogFile = getRAETLogFilePath("RAETLogFilePathCli", self.config)
         # Patch stdout in something that will always print *above* the prompt
         # when something is written to stdout.
         sys.stdout = self.cli.stdout_proxy()
@@ -1,9 +1,6 @@
-import pytest
-
 from collections import OrderedDict
 from plenum.common.messages.fields import NonNegativeNumberField, \
-    NonEmptyStringField, \
-    HexField, MerkleRootField, AnyValueField
+    NonEmptyStringField, MerkleRootField
 from plenum.common.messages.node_messages import Prepare

 EXPECTED_ORDERED_FIELDS = OrderedDict([
@@ -1,7 +1,4 @@
import types
from random import randint

import pytest

from plenum.common.constants import DOMAIN_LEDGER_ID, CONSISTENCY_PROOF
from plenum.common.ledger import Ledger

Contributor

Could you also add a simple test that compares the root hashes of trees recovered from the transaction log and from the hash store?

Contributor Author

It's there in testRecoverLedgerFromHashStore in ledger/test/test_ledger.py
