state_sync.py: Unreachable error occurred #4063

Closed
mikhailOK opened this issue Mar 9, 2021 · 6 comments · Fixed by #4070
Labels
A-python-test (Area: issues related to python tests), A-RPC (Area: rpc), C-bug (Category: This is a bug)

Comments

@mikhailOK
Contributor

http://nayduck.eastus.cloudapp.azure.com:3000/#/test/104936

Mar 09 17:12:50.914 DEBUG chain: Verifying challenges []
Mar 09 17:12:50.914  WARN jsonrpc: Unreachable error occurred: DB Not Found Error: CHUNK EXTRA: 8veiAXzFM9bBvFotemJDzpcUzj3gRPmuN1y2WipvQJ4s:0
 Cause: Unknown
thread 'actix-rt|system:0|arbiter:0' panicked at 'called `Result::unwrap()` on an `Err` value: InconsistentCardinality { expect: 1, got: 2 }', /home/azureuser/.cargo/registry/src/github.com-1ecc6299db9ec823/prometheus-0.11.0/src/vec.rs:258:49
stack backtrace:
Mar 09 17:12:50.914 DEBUG chain: Catching up: removing prev=`8veiAXzFM9bBvFotemJDzpcUzj3gRPmuN1y2WipvQJ4s` from the queue. I'm Some("test0")
Mar 09 17:12:50.915 DEBUG chain: Check orphans: from 8veiAXzFM9bBvFotemJDzpcUzj3gRPmuN1y2WipvQJ4s, # orphans 0
   0: rust_begin_unwind
             at /rustc/91a79fb29ac78d057d04dbe86be13d5dcc64309a/library/std/src/panicking.rs:483
   1: core::panicking::panic_fmt
             at /rustc/91a79fb29ac78d057d04dbe86be13d5dcc64309a/library/core/src/panicking.rs:85
   2: core::option::expect_none_failed
             at /rustc/91a79fb29ac78d057d04dbe86be13d5dcc64309a/library/core/src/option.rs:1234
   3: core::result::Result<T,E>::unwrap
             at /rustc/91a79fb29ac78d057d04dbe86be13d5dcc64309a/library/core/src/result.rs:973
   4: prometheus::vec::MetricVec<T>::with_label_values
             at /home/azureuser/.cargo/registry/src/github.com-1ecc6299db9ec823/prometheus-0.11.0/src/vec.rs:258
   5: near_metrics::inc_counter_vec

Failing after #3944
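
Editorial note: the panic in the backtrace above is raised inside `with_label_values`, and `InconsistentCardinality { expect: 1, got: 2 }` says that a metric declared with one label name was called with two label values. The snippet below is a toy Python model of that check, not the prometheus crate's code. `CounterVec`, the metric name `rpc_errors`, and the label values are hypothetical and chosen only for illustration.

```python
class InconsistentCardinality(Exception):
    """Raised when the number of label values differs from the declared labels."""

    def __init__(self, expect, got):
        super().__init__(
            f"InconsistentCardinality {{ expect: {expect}, got: {got} }}"
        )
        self.expect = expect
        self.got = got


class CounterVec:
    """Toy labelled counter (hypothetical; models only the cardinality check)."""

    def __init__(self, name, label_names):
        self.name = name
        self.label_names = tuple(label_names)
        self._counts = {}

    def inc(self, label_values):
        values = tuple(label_values)
        # The declared label names fix how many values every call must pass.
        if len(values) != len(self.label_names):
            raise InconsistentCardinality(len(self.label_names), len(values))
        self._counts[values] = self._counts.get(values, 0) + 1
        return self._counts[values]


def demo():
    # Declared with ONE label name ...
    errors = CounterVec("rpc_errors", ["method"])
    errors.inc(["query"])
    try:
        # ... but called with TWO label values, as in the panic message.
        errors.inc(["query", "-32000"])
    except InconsistentCardinality as exc:
        return str(exc)
    return None
```

In this model the mismatch surfaces as an exception the caller can handle; in the log above the equivalent `Err` was unwrapped, which is why the node thread panicked instead.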

@mikhailOK added the labels C-bug (Category: This is a bug) and A-python-test (Area: issues related to python tests) on Mar 9, 2021
@mikhailOK
Contributor Author

Same crash with transactions.py: http://nayduck.eastus.cloudapp.azure.com:3000/#/test/104924

@mikhailOK
Contributor Author

http://nayduck.eastus.cloudapp.azure.com:3000/#/run/1317
transactions.py:

AssertionError: {'jsonrpc': '2.0', 'error': {'code': -32000, 'message': 'Server error', 'data': 'It is a bug if you receive this error type, please, report this incident: https://github.com/near/nearcore/issues/new/choose. Details: DB Not Found Error: CHUNK EXTRA: 7WoNHMcyK9NiBq28NFwS3rMgMhwN3wCwoKEmfdpnwvhV:1 \n Cause: Unknown'}, 'id': 'dontcare'}

@frol
Collaborator

frol commented Mar 13, 2021

Traceback (most recent call last):
  File "tests/sanity/transactions.py", line 62, in <module>
    if ctx.get_balances() == ctx.expected_balances:
  File "lib/utils.py", line 32, in get_balances
    return [self.get_balance(i) for i in range(self.num_nodes)]
  File "lib/utils.py", line 32, in <listcomp>
    return [self.get_balance(i) for i in range(self.num_nodes)]
  File "/home/azureuser/.local/lib/python3.7/site-packages/retrying.py", line 49, in wrapped_f
    return Retrying(*dargs, **dkw).call(f, *args, **kw)
  File "/home/azureuser/.local/lib/python3.7/site-packages/retrying.py", line 212, in call
    raise attempt.get()
  File "/home/azureuser/.local/lib/python3.7/site-packages/retrying.py", line 247, in get
    six.reraise(self.value[0], self.value[1], self.value[2])
  File "/home/azureuser/.local/lib/python3.7/site-packages/six.py", line 703, in reraise
    raise value
  File "/home/azureuser/.local/lib/python3.7/site-packages/retrying.py", line 200, in call
    attempt = Attempt(fn(*args, **kwargs), attempt_number, False)
  File "lib/utils.py", line 28, in get_balance
    assert 'result' in r, r
AssertionError: {'jsonrpc': '2.0', 'error': {'code': -32000, 'message': 'Server error', 'data': 'It is a bug if you receive this error type, please, report this incident: https://github.com/near/nearcore/issues/new/choose. Details: DB Not Found Error: CHUNK EXTRA: 7WoNHMcyK9NiBq28NFwS3rMgMhwN3wCwoKEmfdpnwvhV:1 \n Cause: Unknown'}, 'id': 'dontcare'}

It seems the test itself needs to be reviewed to see whether its expectation of getting "result" there is valid. The "DB Not Found Error" should not be reported as Unreachable, but it will remain an error; we presumably handled it in #4079, yet obviously we missed something. I think the cause is that we removed the query routing, which used to wait for a result to appear on the node; now we return immediately. I think we will need to add some retry logic on the pytest side.
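
Editorial note: a minimal sketch of what retry logic on the pytest side could look like, assuming the node eventually answers with a `result` once the data is available. `call_with_retry`, `RpcError`, and the default attempt count and delay are hypothetical names and values, not part of the nearcore test library. The traceback above shows `get_balance` is already wrapped by the `retrying` package, so the real change may be a matter of tuning that wrapper rather than adding a helper like this one.

```python
import time


class RpcError(Exception):
    """Raised when every attempt returned a JSON-RPC error response."""


def call_with_retry(query, attempts=5, delay_s=0.5, sleep=time.sleep):
    """Call `query()` until the JSON-RPC response contains 'result'.

    `query` is any zero-argument callable returning a decoded JSON-RPC
    response dict. `sleep` is injectable so the helper can be tested
    without real waiting.
    """
    last = None
    for attempt in range(attempts):
        last = query()
        if 'result' in last:
            return last['result']
        # Wait only if another attempt follows.
        if attempt + 1 < attempts:
            sleep(delay_s)
    # Give up and surface the last error response to the test.
    raise RpcError(last)
```

A test would then compare balances only after `call_with_retry` returns, instead of asserting on the first response. Note that retrying blindly would also hide permanent errors such as a node that never tracks the shard, so a real version would probably retry only on the errors known to be transient.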

near-bulldozer bot pushed a commit that referenced this issue Mar 17, 2021
…tracked shards for handle_query view_client method (#4119)

This PR improves errors in `handle_query` method after refactoring. Solves some problems found in the issue #4063
@mikhailOK
Contributor Author

The Unreachable error has been fixed; transactions.py is now tracked by #4133.

state_sync.py and skip_epoch.py still fail because of RPC forwarding:

AssertionError: {'jsonrpc': '2.0', 'error': {'code': -32000, 'message': 'Server error', 'data': 'The node does not track the shard ID 0'}, 'id': 'dontcare'}

@bowenwang1996
Collaborator

@mikhailOK could we fix this by making some nodes track all shards?
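
Editorial note: a sketch of how the suggestion above might be expressed in a test, assuming the test harness accepts per-node config overrides keyed by node index and that a `tracked_shards` list of shard IDs is the relevant config field. Neither assumption is confirmed in this thread, and `track_all_shards_overrides` is a hypothetical helper.

```python
def track_all_shards_overrides(node_indices, num_shards):
    """Build per-node config overrides that make each listed node track every shard.

    Returns a dict mapping node index -> config fragment. The key name
    'tracked_shards' is an assumption about the node config format.
    """
    all_shards = list(range(num_shards))
    # Give each node its own list so later edits to one do not affect the others.
    return {i: {"tracked_shards": list(all_shards)} for i in node_indices}
```

The test would then send its queries only to the nodes listed in `node_indices`, so a response such as "The node does not track the shard ID 0" should no longer occur for them.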

@frol
Collaborator

frol commented Mar 19, 2021

@bowenwang1996 @mikhailOK You seem to have a better idea of how to address these tests. Would you mind taking it over from the node interfaces team? I hope the error messages are now helpful and it is easier to reason about them.

4 participants