
#66: Deserializing of events from flow_proto fails with transactions with many events #67

Open

lealobanov wants to merge 8 commits into base: master
Conversation

@lealobanov (Collaborator) commented Oct 28, 2024

Resolves issue #66

Summary by CodeRabbit

Release Notes

  • New Features

    • Introduced a new example for debugging transaction fetching by ID.
    • Added a new JSON structure for event capabilities, enhancing metadata representation.
  • Bug Fixes

    • Improved error handling and logging for event deserialization and transaction results.
  • Chores

    • Updated .gitignore to include myenv for better environment management.

@coderabbitai bot commented Oct 28, 2024

Walkthrough

The changes in this pull request introduce several new features and modifications across multiple files. A new class, DebuggingFetchTransactionByIdExample, is added to provide an example for fetching transactions by ID. A new JSON structure is established in failed_events.json to detail event capabilities. The decode function in flow_py_sdk/cadence/decode.py is enhanced to handle nested structures and includes improved error handling. Changes in flow_py_sdk/cadence/kinds.py simplify the decode method in the EntitlementsKind class. Additionally, significant error handling improvements are made in flow_py_sdk/client/entities.py, particularly for event deserialization. Lastly, .gitignore is updated to ignore the myenv directory.

Changes

  • examples/transactions_examples.py: Added new class DebuggingFetchTransactionByIdExample for fetching a transaction by ID.
  • failed_events.json: Introduced a new JSON structure with event objects detailing capabilities and metadata.
  • flow_py_sdk/cadence/decode.py: Modified the decode function to handle nested structures, updated its signature, and added error logging.
  • flow_py_sdk/cadence/kinds.py: Simplified the return statement in the decode method of the EntitlementsKind class.
  • flow_py_sdk/client/entities.py: Enhanced error handling and logging in the Event and TransactionResultResponse classes.
  • .gitignore: Added an entry for myenv to ignore virtual environment files.

Poem

In the code where rabbits play,
New examples hop and sway.
Fetching transactions, oh what fun,
With logs and errors, we’ve just begun!
Ignore myenv, let’s keep it neat,
In this code, our joys repeat! 🐇✨



@codecov-commenter commented Oct 28, 2024

⚠️ Please install the Codecov GitHub app to ensure uploads and comments are reliably processed by Codecov.

Codecov Report

Attention: Patch coverage is 33.92857% with 37 lines in your changes missing coverage. Please review.

Project coverage is 83.34%. Comparing base (ac0bbb9) to head (f9f4dc6).

Files with missing lines | Patch % | Lines
flow_py_sdk/client/entities.py | 4.34% | 22 Missing ⚠️
flow_py_sdk/cadence/decode.py | 53.12% | 10 Missing and 5 partials ⚠️

❗ Your organization needs to install the Codecov GitHub app to enable full functionality.

Additional details and impacted files
@@            Coverage Diff             @@
##           master      #67      +/-   ##
==========================================
- Coverage   84.79%   83.34%   -1.46%     
==========================================
  Files          29       29              
  Lines        1927     1963      +36     
  Branches      102      113      +11     
==========================================
+ Hits         1634     1636       +2     
- Misses        277      307      +30     
- Partials       16       20       +4     
Flag | Coverage Δ
unittests | 83.34% <33.92%> (-1.46%) ⬇️

Flags with carried forward coverage won't be shown. Click here to find out more.

☔ View full report in Codecov by Sentry.

@coderabbitai bot left a comment

Actionable comments posted: 10

🧹 Outside diff range and nitpick comments (7)
myenv/bin/activate.csh (1)

1-26: Remove virtual environment files from version control.

Virtual environment files should not be committed to version control as they:

  1. Are environment-specific and should be regenerated locally
  2. Can cause conflicts between different development environments
  3. May contain sensitive information

Recommended actions:

  1. Remove all virtual environment files from the repository
  2. Add the following entries to .gitignore:
+# Virtual Environment
+venv/
+env/
+.venv/
+myenv/
  3. Document the virtual environment setup steps in the project's README.md
flow_py_sdk/cadence/constants.py (1)

68-69: LGTM! Consider grouping related constants.

The new constants follow the established naming conventions. However, consider grouping them with other entitlement-related constants if they exist, or with capability-related constants since they appear to be related to the capabilityTypeStr.

 capabilityTypeStr = "Capability"
 functionTypeStr = "Function"
+entitlementConjunctionSetStr = "EntitlementConjunctionSet"
+entitlementKey = "Entitlement"
 
-entitlementConjunctionSetStr = "EntitlementConjunctionSet"
-entitlementKey = "Entitlement"

 fix64_scale = int(8)
flow_py_sdk/cadence/decode.py (2)

45-51: LGTM! The recursive handling addresses the deserialization issue.

The implementation correctly handles nested structures by recursively decoding both dictionary and list values, which should resolve the issue with transactions containing many events.

Consider adding recursion depth protection to prevent stack overflow:

+_MAX_RECURSION_DEPTH = 100
+
 def decode(obj: [dict[Any, Any]], depth: int = 0) -> Union[Value, Kind]:
+    if depth > _MAX_RECURSION_DEPTH:
+        raise ValueError("Maximum recursion depth exceeded")
+
     # ... existing checks ...
 
     for key, value in obj.items():
         if isinstance(value, dict):
-            obj[key] = decode(value)
+            obj[key] = decode(value, depth + 1)
         elif isinstance(value, list):
-            obj[key] = [decode(item) for item in value]
+            obj[key] = [decode(item, depth + 1) for item in value]

46-48: Add type validation before recursion.

To prevent potential runtime errors, consider validating that nested dictionary values are actually decodable before recursion.

     for key, value in obj.items():
         if isinstance(value, dict):
+            if not (c.kindKey in value or c.typeKey in value or c.idKey in value):
+                continue  # Skip non-decodable dictionaries
             obj[key] = decode(value)
examples/transactions_examples.py (1)

286-289: Add documentation explaining the debugging purpose.

Please add docstring documentation explaining:

  1. The purpose of this debugging class
  2. How it helps diagnose event deserialization issues
  3. Expected output format
 class DebuggingFetchTransactionByIdExample(Example):
+    """Debug helper for investigating event deserialization issues.
+    
+    This class helps diagnose failures when deserializing events from transactions
+    that contain a large number of events. It provides detailed logging of the
+    event structure and any deserialization errors encountered.
+    """
     def __init__(self) -> None:
         super().__init__(tag="T.LL.", name="DebuggingFetchTransactionByIdExample", sort_order=501)
failed_events.json (1)

24-39: Consider simplifying type system for robust deserialization

The nested type system with multiple levels of References, Resources, and Capabilities increases deserialization complexity. Consider:

  1. Flattening the type structure where possible
  2. Implementing incremental deserialization for complex types
  3. Adding type validation before full deserialization

Also applies to: 70-153, 197-280

flow_py_sdk/client/entities.py (1)

315-316: Review the impact of changing the method signature of from_proto

Adding a new parameter max_events with a default value to the from_proto method changes its signature. If from_proto is part of the public API, ensure this change is backward compatible and consider documenting the change for users who might rely on the method's parameters.
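If backward compatibility is a concern, one low-risk option is to make the new parameter keyword-only with a default. A minimal sketch under that assumption (the class, field names, and default value below are illustrative, not the SDK's actual definitions):

from dataclasses import dataclass
from typing import Any, List


@dataclass
class TransactionResultSketch:
    events: List[Any]

    @classmethod
    def from_proto(cls, proto: Any, *, max_events: int = 10_000) -> "TransactionResultSketch":
        # Existing callers that pass only `proto` are unaffected; new callers
        # can opt in to capping how many events are kept.
        return cls(events=list(proto.events)[:max_events])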

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between ac0bbb9 and 3a5433d.

📒 Files selected for processing (17)
  • examples/transactions_examples.py (1 hunks)
  • failed_events.json (1 hunks)
  • flow_py_sdk/cadence/constants.py (1 hunks)
  • flow_py_sdk/cadence/decode.py (1 hunks)
  • flow_py_sdk/cadence/kinds.py (1 hunks)
  • flow_py_sdk/client/entities.py (4 hunks)
  • myenv/bin/Activate.ps1 (1 hunks)
  • myenv/bin/activate (1 hunks)
  • myenv/bin/activate.csh (1 hunks)
  • myenv/bin/activate.fish (1 hunks)
  • myenv/bin/pip (1 hunks)
  • myenv/bin/pip3 (1 hunks)
  • myenv/bin/pip3.11 (1 hunks)
  • myenv/bin/python (1 hunks)
  • myenv/bin/python3 (1 hunks)
  • myenv/bin/python3.11 (1 hunks)
  • myenv/pyvenv.cfg (1 hunks)
✅ Files skipped from review due to trivial changes (7)
  • myenv/bin/pip
  • myenv/bin/pip3
  • myenv/bin/pip3.11
  • myenv/bin/python
  • myenv/bin/python3
  • myenv/bin/python3.11
  • myenv/pyvenv.cfg
🧰 Additional context used
🪛 Ruff
examples/transactions_examples.py

300-300: Local variable transaction is assigned to but never used

Remove assignment to unused variable transaction

(F841)

flow_py_sdk/client/entities.py

164-164: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)


169-169: Local variable e is assigned to but never used

Remove assignment to unused variable e

(F841)

🔇 Additional comments (6)
flow_py_sdk/cadence/decode.py (1)

45-51: Verify the fix with large event transactions.

Let's verify that this change resolves the original issue by checking for similar patterns in test files.

✅ Verification successful

Let me search for more specific test cases that might verify the recursive decoding functionality.


Let me try one more search strategy focusing on the test directory structure and Cadence-specific tests.


Let me check the actual test cases in encode_test.py to see if they cover nested structures.


Based on the test file content, I can now provide a final response:

The recursive decoding implementation is thoroughly tested with complex nested structures

The test file tests/cadence/encode_test.py contains comprehensive test cases that verify the recursive decoding functionality with:

  • Nested dictionaries with multiple levels of nesting
  • Complex event structures with nested resources
  • Arrays containing nested objects
  • Composite types with nested fields

The test coverage includes large event transactions like StorefrontEvent that contain multiple nested structures, validating that the recursive decoding works correctly for complex real-world scenarios.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Look for test files that verify handling of large event transactions
# Expected: Find test cases that exercise the recursive decoding with nested structures

# Search for test files related to event handling
echo "Searching for relevant test files..."
rg -l "test.*event" --type py

# Look for specific test cases with large or nested event structures
echo "Searching for test cases with nested structures..."
rg -A 5 "decode.*event" --type py

Length of output: 905


Script:

#!/bin/bash
# Search for test files and test cases specifically related to decode functionality
echo "Searching for decode test files..."
rg -l "test.*decode" --type py

# Look for test cases with nested dictionaries or lists
echo "Searching for test cases with nested structures..."
rg -A 10 "def test.*decode" --type py

# Search for actual test data structures
echo "Searching for test data structures..."
rg -A 5 "(dict|list).*decode" --type py tests/

Length of output: 396


Script:

#!/bin/bash
# First, find all test files in the project
echo "Finding test files..."
fd "test.*\.py$"

# Look specifically for Cadence-related test files
echo "Searching for Cadence test files..."
rg -l "cadence" --type py $(fd "test.*\.py$")

# Search for decode usage in test files with more context
echo "Searching for decode usage in tests..."
rg -B 3 -A 7 "decode\(" $(fd "test.*\.py$")

Length of output: 2508


Script:

#!/bin/bash
# Get the content of encode_test.py to check test cases
echo "Content of encode_test.py..."
cat tests/cadence/encode_test.py

# Also search for any dictionary or list test cases
echo "Searching for dictionary/list test cases..."
rg -B 2 -A 5 "testEncode.*(Dict|List|Composite)" tests/cadence/encode_test.py

Length of output: 52716

examples/transactions_examples.py (1)

286-302: Verify the transaction ID contains many events.

Let's verify if the hardcoded transaction ID is a good test case for debugging event deserialization issues.

🧰 Tools
🪛 Ruff

300-300: Local variable transaction is assigned to but never used

Remove assignment to unused variable transaction

(F841)

flow_py_sdk/cadence/kinds.py (1)

353-361: ⚠️ Potential issue

Improve error handling implementation

While adding error handling is a good step towards fixing the deserialization issues, there are a few concerns to address:

  1. Returning None could lead to null pointer exceptions downstream. Consider either:

    • Raising a specific exception type that can be handled by callers
    • Or returning a sentinel object that represents the error state
  2. Using print for error logging isn't ideal for production code.

Consider this improved implementation:

    @classmethod
    def decode(cls, value) -> "Kind":
        try:
            entitlements_val = value[c.entitlementsKey]
            entitlements = [
                decode(v).as_kind(EntitlementBaseKind) for v in entitlements_val
            ]
            return cls(entitlements)
        except Exception as e:
-           print(f"Error decoding entitlements for {cls.kind_str()}: {value}. Error: {e}")
-           return None
+           import logging
+           logging.error(f"Error decoding entitlements for {cls.kind_str()}: {value}", exc_info=e)
+           raise ValueError(f"Failed to decode entitlements: {e}") from e

Let's verify if this class is used in event deserialization:

failed_events.json (3)

1-291: LGTM: Well-structured test data for Flow events

The JSON structure follows Flow protocol standards with consistent event formatting.


16-17: LGTM: Consistent cross-references between events

All addresses, paths, and IDs maintain consistency across the event chain.

Also applies to: 62-63, 179-180, 162-164, 187-189, 281-282


1-47: Verify test data coverage for deserialization failure

The events represent a complete capability flow (issuance → storage → publishing). To ensure this test data reproduces the deserialization issue:

  1. Verify that these events actually failed to deserialize in the original context
  2. Consider if more events are needed to reach the failure threshold

Also applies to: 48-171, 172-290

@coderabbitai bot left a comment

Actionable comments posted: 3

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 3a5433d and 402e0f9.

📒 Files selected for processing (2)
  • .gitignore (1 hunks)
  • flow_py_sdk/client/entities.py (3 hunks)
✅ Files skipped from review due to trivial changes (1)
  • .gitignore
🔇 Additional comments (1)
flow_py_sdk/client/entities.py (1)

2-2: LGTM: Logging module import added

The addition of the logging module import is appropriate for implementing proper error handling and logging throughout the file.

@@ -281,3 +281,22 @@ async def run(self, ctx: Config):
f"transaction proposer: {transaction.proposal_key.address.hex()}"
)
self.log.info(f"transaction script: {transaction.script.decode('utf-8')}")


class DebuggingFetchTransactionByIdExample(Example):
@lealobanov (Collaborator, Author) commented:
Added example which makes a mainnet call to the failing tx ID ef6e6dcba87c5f853e0ec940dd382d609ffe11109c28ed239843d71618fa1d71, will remove this before merging changes

@lealobanov (Collaborator, Author) commented:
This is an output of the failing event payloads for reference, will remove before merging changes

@coderabbitai bot left a comment

Actionable comments posted: 5

🧹 Outside diff range and nitpick comments (2)
flow_py_sdk/cadence/decode.py (2)

20-20: Consider adding docstring to clarify return types.

The function can now return three different types. Consider adding a docstring to explain when each type is returned:

  • Value: When decoded as a Cadence value
  • Kind: When decoded as a Cadence kind
  • dict: When handling nested structures or in error cases
 def decode(obj: dict[Any, Any]) -> Union[Value, Kind, dict]:
+    """Decode a dictionary into a Cadence value, kind, or processed dictionary.
+    
+    Args:
+        obj: Dictionary to decode
+    
+    Returns:
+        Union[Value, Kind, dict]:
+            - Value: When successfully decoded as a Cadence value
+            - Kind: When successfully decoded as a Cadence kind
+            - dict: When handling nested structures or in error cases
+    """

23-24: Optimize multiple isinstance checks.

Merge the two isinstance checks into a single call for better performance.

-        if isinstance(obj, Value) or isinstance(obj, Kind):
+        if isinstance(obj, (Value, Kind)):
🧰 Tools
🪛 Ruff

23-23: Multiple isinstance calls for obj, merge into a single call

Merge isinstance calls for obj

(SIM101)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Files that changed from the base of the PR and between 402e0f9 and 8f8b88d.

📒 Files selected for processing (4)
  • examples/transactions_examples.py (1 hunks)
  • flow_py_sdk/cadence/decode.py (2 hunks)
  • flow_py_sdk/cadence/kinds.py (1 hunks)
  • flow_py_sdk/client/entities.py (3 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
  • flow_py_sdk/cadence/kinds.py
🧰 Additional context used
🪛 Ruff
examples/transactions_examples.py

303-303: Local variable transaction is assigned to but never used

Remove assignment to unused variable transaction

(F841)

flow_py_sdk/cadence/decode.py

23-23: Multiple isinstance calls for obj, merge into a single call

Merge isinstance calls for obj

(SIM101)

🔇 Additional comments (6)
flow_py_sdk/cadence/decode.py (3)

5-5: LGTM: Logging import added for error handling.


43-57: LGTM: Well-implemented recursive structure handling.

The recursive handling of nested dictionaries and lists is well-implemented and addresses the core issue of deserializing complex event structures. The code maintains clarity while handling different nested scenarios.


20-65: Verify the changes with a complex transaction.

The changes look good and should fix the event deserialization issues. Let's verify the implementation with a complex transaction.

✅ Verification successful

Based on the extensive test coverage shown in encode_test.py and the implementation in decode.py, I can now generate a final response:

The changes are well-tested and handle complex cases correctly

The decode implementation has been thoroughly verified through the comprehensive test suite that covers:

  • Complex nested structures (dictionaries, arrays)
  • Resource types with nested fields
  • Event decoding with complex payloads (e.g., StorefrontEvent test)
  • Error handling for various edge cases

The recursive handling in the decode function correctly processes all these cases, as evidenced by the passing test suite. The error handling additions will help with debugging when issues occur.

🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Search for test files that might contain complex transaction events
echo "Searching for relevant test files..."
fd -e json -e py "transaction|event" test/

# Look for existing error handling tests
echo "Checking for error handling test coverage..."
rg -l "test.*decode.*error" test/

Length of output: 401


Script:

#!/bin/bash
# Let's first check if tests are in a different directory structure
fd -t d "test"

# Search for Python test files related to decoding
fd -e py "test.*decode|decode.*test"

# Look for imports of the decode module to find related test files
rg "from flow_py_sdk.cadence.decode import"

# Check for any test files that might use the decode functionality
rg -l "decode\(" .

Length of output: 879


Script:

#!/bin/bash
# Let's examine the test file to see if it covers complex cases
cat tests/cadence/encode_test.py

# Check for any example transactions that might show usage
cat examples/transactions_examples.py

# Look for any test fixtures or test data
fd -e json . tests/

Length of output: 64207

🧰 Tools
🪛 Ruff

23-23: Multiple isinstance calls for obj, merge into a single call

Merge isinstance calls for obj

(SIM101)

flow_py_sdk/client/entities.py (1)

172-182: ⚠️ Potential issue

Improve error handling strategy in from_proto

Returning None silently on failure could mask issues in the calling code. Consider using custom exceptions for better error handling.

Apply this diff:

+class EventDeserializationError(Exception):
+    """Raised when an event cannot be deserialized from proto"""
+    pass
+
 @classmethod
 def from_proto(cls, proto: entities.Event) -> "Event":
     try:
         return Event(
             _type=proto.type,
             transaction_id=proto.transaction_id,
             transaction_index=proto.transaction_index,
             event_index=proto.event_index,
             payload=proto.payload,
         )
     except Exception as e:
-        logging.error(f"Failed to deserialize event {proto.event_index}: {str(e)}")
-        return None  # Returning None if deserialization fails
+        raise EventDeserializationError(f"Failed to deserialize event {proto.event_index}") from e

Likely invalid or redundant comment.

examples/transactions_examples.py (2)

296-299: Avoid hardcoding production endpoints.


301-305: ⚠️ Potential issue

Enhance debugging implementation to address event deserialization.

The current implementation doesn't provide sufficient debugging information for the event deserialization issue (#66). Consider implementing comprehensive event debugging:

  1. Add error handling for transaction fetching
  2. Process and log event details
  3. Add debugging for event deserialization steps

Here's a suggested implementation that better aligns with the PR objectives:

     async def run(self, ctx: Config):
         async with flow_client(
-            # host=ctx.access_node_host, port=ctx.access_node_port
-            host="access.mainnet.nodes.onflow.org",
-            port=9000,
+            host=ctx.access_node_host,
+            port=ctx.access_node_port
         ) as client:
             tx_id = "ef6e6dcba87c5f853e0ec940dd382d609ffe11109c28ed239843d71618fa1d71"
             try:
-                transaction = await client.get_transaction_result(id=bytes.fromhex(tx_id))
-                print(f"Transaction ID: {tx_id}")
+                result = await client.get_transaction_result(id=bytes.fromhex(tx_id))
+                self.log.info(f"Transaction ID: {tx_id}")
+                self.log.info(f"Status: {result.status}")
+                
+                # Debug event deserialization
+                self.log.info(f"Total events: {len(result.events)}")
+                for i, event in enumerate(result.events, 1):
+                    self.log.info(f"\nEvent {i}:")
+                    self.log.info(f"  Type: {event.type}")
+                    self.log.info(f"  Transaction Index: {event.transaction_index}")
+                    self.log.info(f"  Event Index: {event.event_index}")
+                    self.log.info(f"  Payload: {event.payload}")
+            except Exception as e:
+                self.log.error(f"Failed to process transaction {tx_id}: {str(e)}")
+                raise

This implementation:

  1. Uses the logger instead of print statements
  2. Adds comprehensive event debugging
  3. Includes proper error handling
  4. Helps diagnose event deserialization issues

Let's verify if this transaction has many events:

🧰 Tools
🪛 Ruff

303-303: Local variable transaction is assigned to but never used

Remove assignment to unused variable transaction

(F841)

@janezpodhostnik (Owner) left a comment

Thank you very much for the contribution!

Can you add a short explanation of why the events failed to de-serialize? I'm not quite sure I get it. Was there an unexpected dict/list in the event?

As for the error handling:

  • I very much like the idea of extra logging; that will make it a lot easier to figure out what's wrong next time.
  • I'm not sure I like the idea of just logging and continuing. That seems very dangerous. I would rather the error was logged and the exception re-thrown. There could be an optional parameter where, instead of re-throwing, it would just return the JSON object if decoding failed (a rough sketch of that option follows below). What do you think?
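A minimal sketch of that optional fallback, assuming decode can be imported from flow_py_sdk.cadence.decode; the wrapper and its raise_on_error flag are hypothetical, not part of the SDK:

import logging
from typing import Any, Union

from flow_py_sdk.cadence.decode import decode  # assumed import path


def decode_or_passthrough(obj: dict, *, raise_on_error: bool = True) -> Union[Any, dict]:
    # Log and re-raise by default; callers can opt in to getting the raw JSON
    # object back when decoding fails.
    try:
        return decode(obj)
    except (KeyError, NotImplementedError) as e:
        logging.error("Failed to decode Cadence object %r: %s", obj, e)
        if raise_on_error:
            raise
        return obj  # fall back to returning the undecoded JSON object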

@coderabbitai bot left a comment

Actionable comments posted: 2

🧹 Outside diff range and nitpick comments (2)
flow_py_sdk/cadence/decode.py (2)

20-20: Consider making the input type annotation more precise.

The current annotation dict[Any, Any] is very permissive. Consider using more specific types to better document expected input structure.

-def decode(obj: dict[Any, Any]) -> Union[Value, Kind, dict]:
+def decode(obj: Union[dict[str, Any], Value, Kind, list]) -> Union[Value, Kind, dict]:

23-24: Optimize multiple isinstance checks.

Merge multiple isinstance checks for better readability and performance.

-        if isinstance(obj, Value) or isinstance(obj, Kind):
+        if isinstance(obj, (Value, Kind)):
🧰 Tools
🪛 Ruff

23-23: Multiple isinstance calls for obj, merge into a single call

Merge isinstance calls for obj

(SIM101)

📜 Review details

Configuration used: CodeRabbit UI
Review profile: CHILL

📥 Commits

Reviewing files that changed from the base of the PR and between 8f8b88d and f9f4dc6.

📒 Files selected for processing (1)
  • flow_py_sdk/cadence/decode.py (2 hunks)
🧰 Additional context used
🪛 Ruff
flow_py_sdk/cadence/decode.py

23-23: Multiple isinstance calls for obj, merge into a single call

Merge isinstance calls for obj

(SIM101)

🔇 Additional comments (1)
flow_py_sdk/cadence/decode.py (1)

5-5: LGTM: Logging import is appropriately placed.

The addition of logging capabilities aligns with the need for better error handling and debugging.

Comment on lines +43 to +57
        # Recursively handle nested structures
        if isinstance(obj, dict):
            for key, value in obj.items():
                if isinstance(value, dict):
                    obj[key] = decode(value)  # Recursive call for nested dict
                elif isinstance(value, list):
                    obj[key] = [
                        decode(item) for item in value
                    ]  # Recursive list handling

        elif isinstance(obj, list):
            obj = [
                decode(item) for item in obj
            ]  # Handle obj itself as a list if list type

@coderabbitai bot commented Oct 31, 2024

🛠️ Refactor suggestion

Add safety measures for recursive handling.

The recursive handling of nested structures could potentially lead to:

  1. Stack overflow with deeply nested structures
  2. Infinite recursion if there are circular references

Consider adding these safety measures:

+MAX_RECURSION_DEPTH = 100
+
 def decode(obj: dict[Any, Any]) -> Union[Value, Kind, dict]:
+    def _decode_recursive(obj: Any, depth: int = 0) -> Any:
+        if depth > MAX_RECURSION_DEPTH:
+            raise ValueError("Maximum recursion depth exceeded")
+
+        if isinstance(obj, (Value, Kind)):
+            return obj
+
         # ... existing type checks ...
+
         if isinstance(obj, dict):
             for key, value in obj.items():
                 if isinstance(value, dict):
-                    obj[key] = decode(value)
+                    obj[key] = _decode_recursive(value, depth + 1)
                 elif isinstance(value, list):
                     obj[key] = [
-                        decode(item) for item in value
+                        _decode_recursive(item, depth + 1) for item in value
                     ]
         elif isinstance(obj, list):
             obj = [
-                decode(item) for item in obj
+                _decode_recursive(item, depth + 1) for item in obj
             ]
         return obj
+
+    return _decode_recursive(obj)
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change:

MAX_RECURSION_DEPTH = 100


def decode(obj: dict[Any, Any]) -> Union[Value, Kind, dict]:
    def _decode_recursive(obj: Any, depth: int = 0) -> Any:
        if depth > MAX_RECURSION_DEPTH:
            raise ValueError("Maximum recursion depth exceeded")

        if isinstance(obj, (Value, Kind)):
            return obj

        # ... existing type checks ...

        if isinstance(obj, dict):
            for key, value in obj.items():
                if isinstance(value, dict):
                    obj[key] = _decode_recursive(value, depth + 1)
                elif isinstance(value, list):
                    obj[key] = [
                        _decode_recursive(item, depth + 1) for item in value
                    ]
        elif isinstance(obj, list):
            obj = [
                _decode_recursive(item, depth + 1) for item in obj
            ]
        return obj

    return _decode_recursive(obj)

@franklywatson commented Oct 31, 2024

@janezpodhostnik what do we think about this suggestion? Given decode is a heavily used method I would err on making it as robust as we can, cc @lealobanov

@coderabbitai bot commented:

Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!

Comment on lines +60 to +66
    except KeyError as e:
        logging.error(f"Unhandled key during decode: {e}. Object: {obj}")
        raise

    except NotImplementedError:
        logging.error(f"Decoding not implemented for object: {obj}")
        raise
@coderabbitai bot commented:

💡 Codebase verification

⚠️ Potential issue

Error handling improvements needed with test coverage

  1. The error handling needs test coverage - there are no tests for error cases in encode_test.py
  2. The suggested log level changes are valid - KeyError should use warning level as it may be recoverable
  3. The error messages can be more descriptive as suggested
     except KeyError as e:
-        logging.error(f"Unhandled key during decode: {e}. Object: {obj}")
+        logging.warning(
+            f"Unhandled key '{e}' during decode of {type(obj).__name__}. "
+            f"Value: {obj}"
+        )
         raise

     except NotImplementedError:
-        logging.error(f"Decoding not implemented for object: {obj}")
+        logging.error(
+            f"Decoding not implemented for type {type(obj).__name__}. "
+            f"Value: {obj}"
+        )
         raise

Add test cases (a pytest sketch follows below):

  • Test KeyError handling with invalid JSON input
  • Test NotImplementedError with unsupported types
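A hedged pytest sketch of the two cases above; whether these exact payloads raise KeyError or NotImplementedError depends on decode's final behaviour, so treat the inputs as placeholders to adjust:

import pytest

from flow_py_sdk.cadence.decode import decode  # assumed import path


def test_decode_raises_key_error_for_payload_missing_expected_keys():
    # Placeholder payload: assumed to be missing a key that decode requires.
    with pytest.raises(KeyError):
        decode({"type": "UInt64"})


def test_decode_raises_not_implemented_for_unsupported_kind():
    # Placeholder payload: assumed to name a kind the SDK does not implement.
    with pytest.raises(NotImplementedError):
        decode({"kind": "SomeUnsupportedKind"})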
🔗 Analysis chain

Enhance error messages and add test coverage.

The error handling could be improved with:

  1. More descriptive error messages including object type
  2. Different log levels for different scenarios
  3. Test coverage for error cases
     except KeyError as e:
-        logging.error(f"Unhandled key during decode: {e}. Object: {obj}")
+        logging.warning(
+            f"Unhandled key '{e}' during decode of {type(obj).__name__}. "
+            f"Value: {obj}"
+        )
         raise

     except NotImplementedError:
-        logging.error(f"Decoding not implemented for object: {obj}")
+        logging.error(
+            f"Decoding not implemented for type {type(obj).__name__}. "
+            f"Value: {obj}"
+        )
         raise
🏁 Scripts executed

The following scripts were executed for the analysis:

Script:

#!/bin/bash
# Description: Check test coverage for error handling paths

# Test: Look for test files covering decode.py
fd "test.*decode.*\.py$"

# Test: Check for error handling test cases
rg -A 5 "def test.*decode.*error" 

Length of output: 62


Script:

#!/bin/bash
# Let's try a broader search for test files and error handling patterns

# Find all test files in the project
fd "test.*\.py$"

# Look for decode-related test files with a more flexible pattern
rg -l "decode" $(fd "test.*\.py$")

# Search for any error handling tests in Python files
rg -A 5 "(def|async def) test.*error" $(fd "test.*\.py$")

# Check the actual decode.py file location and its test coverage
fd "decode.py$"

Length of output: 835


Script:

#!/bin/bash
# Let's examine the encode_test.py file since it's related to decode functionality
cat tests/cadence/encode_test.py

# Also check if there are any existing error handling tests in decode.py
rg -B 2 -A 5 "except.*Error" flow_py_sdk/cadence/decode.py

# Look for any existing logging patterns in tests
rg "logging\." tests/cadence/encode_test.py

Length of output: 52727

@lealobanov (Collaborator, Author) commented Oct 31, 2024

Hi @janezpodhostnik ,

  • Just updated the exception handling to raise an exception and exit the decoding if one of the events failed to decode (instead of continuing).

I think the source of the bug might have been the recent migration to Cadence 1.0, as I noticed all of the events which failed to decode had complex nested dict/list fields such as authorizations and staticType found in the EntitlementConjunctionSet class. Entitlements seem to be a new syntax introduced since the C1.0 release (https://www.youtube.com/watch?v=aZ1aCBtPpXo), so I suspect this might be why this wasn't throwing an error before.

Some other preventative measures we could take on the decoding of the events array (a rough sketch follows below):

  • Imposing a max number of events that get decoded per transaction
  • Imposing a max limit on event payload size

Initially I tested out both of these changes, but ultimately this TX id ended up decoding successfully once I updated to decode the nested structures. So for now I think it's ok to skip on the above 2 items and potentially reconsider if events will continue to cause issues.
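For reference, a rough sketch of the two guards mentioned above (the thresholds and helper name are illustrative, and nothing like this was added in the PR):

from typing import Any, List

MAX_EVENTS_PER_TX = 1_000
MAX_PAYLOAD_BYTES = 256 * 1024


def select_events_to_decode(raw_events: List[Any]) -> List[Any]:
    # Cap how many events are decoded per transaction and skip payloads over a
    # size limit instead of failing the whole transaction.
    selected = []
    for event in raw_events[:MAX_EVENTS_PER_TX]:
        if len(event.payload) > MAX_PAYLOAD_BYTES:
            continue  # skip oversized payloads rather than aborting the decode
        selected.append(event)
    return selected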

Thanks!
