Add a way to test serialization code state coverage in Python classes written in Rust #13347

eliarbel · 2024-10-20T14:17:01Z

Background

In order to support object serialization and deserialization with Pickle of Python classes written in Rust (PyO3), we often time add __reduce__, __getstate__ and the like to the Rust structs to explicitly handle how the struct fields are serialized/deserialized and how the struct object is initialized.
With this explicit handling approach of object state w.r.t pickling, there risk in forgetting some fields out, especially when extending an already existing struct with new fields that require serialization handling.

Goal

Have a mechanized way to enforce (or at least facilitate) proper state coverage of the code of our Python classes written in Rust w.r.t pickling.

Proposal

There are probably several ways to achieve this and maybe even an already existing solution I'm not aware of. Nevertheless, here is a proposal which is based on our Python testing framework. Hopefully this would help with the discussion and with exploring other possible solutions if needed, or to conclude that there isn't an adequate approach and we should continue to rely on regular pickling testing and code reviews.

The essence of the approach is to enforce the tester, wanting to test pickle/unpickle logic of a given object, to explicitly consider how this object should be initialized by enforcing the test to use non-default values when constructing the object. This is done in the proposal by checking that the attributes of the given object return non-default values. This is done by comparing the test object with an object of the same class constructed with default values (in the code this is the reference object, self._ref_obj). This is not a bullet-proof approach and one could check the __init__ method args itself with inspect but it may be good enough for most use-cases, assuming that user visible instance attributes should be covered by the serialization logic.

To further understand the motivation of the code below, consider the case where there is already an existing pickling/unpickling test for a given Python class. For this test to pass, the test code should either initialize or waive all attributes of the tested object explicitly. In the case a developer adds a new publicly accessible attribute to this class (in Rust) without properly covering this new attribute in the serialization code, the existing test would fail. This is because the test initializes the object with this new attribute set to its default value (if exists). Initializing the new attribute to a non-default value in the test would not be sufficient either if the serialization code does not handle it (e.g. in __reduce__) since then, the attribute's values in the reference object and the actual test object will differ. To make the test pass the developer would have to handle the new attribute in the serialization code in Rust.

In the code below, the class TestPickleCoverage is a generic one meant to be subclassed by concrete test classes, preferably one for each Python class whose pickling logic we want to test.

import pickle
import inspect

from test import QiskitTestCase

from qiskit import ClassicalRegister, QuantumRegister
from qiskit._accelerate.circuit import CircuitData
from qiskit._accelerate.equivalence import Key


class TestPickleCoverage(QiskitTestCase):
    def setUp(self):
        super().setUp()
        self._ignore_list = []
        self._ref_obj = None

    def assertPickleCoverage(self, obj):
        # Go over attributes, try to force the tester to initialize all instance attributes in testing
        if not self._ref_obj:
            self._ref_obj = type(obj)()
        for member in inspect.getmembers(obj, lambda c: not inspect.isroutine(c)):
            if member[0].startswith("_"):
                continue
            if member[0] not in self._ignore_list:
                assert getattr(obj, member[0]) != getattr(self._ref_obj, member[0]), format(
                    f"`{member[0]}` should be initialized with a non-default value."
                )

        # try:
        #     (_, initializers, *_) = obj.__reduce__()
        #     copy_obj = type(obj)(*initializers)
        # except:
        copy_obj = pickle.loads(pickle.dumps(obj))

        self.assertEqual(obj, copy_obj)


class TestCircuitData(TestPickleCoverage):
    def setUp(self):
        super().setUp()
        self._ignore_list.append("parameters")  # Example for skipping attributes of no relevance

    def test_pickle_coverage(self):
        # Example for a good test
        obj = CircuitData(QuantumRegister(1), ClassicalRegister(1), global_phase=1)

        self.assertPickleCoverage(obj)

    def _test_pickle_coverage_partial_init(self):
        # Example for demonstrating a test failure due to the lack of initialization of e.g. here `global_phase`, giving this error:

        # Analogously, if a developer adds a new attribute in the Rust space which is not covered by __reduce__ or otherwise in the
        # standard pickle inspection, a similar failure will occur
        obj = CircuitData(QuantumRegister(1), ClassicalRegister(1))


        self.assertPickleCoverage(obj)


class TestKey(TestPickleCoverage):
    def setUp(self):
        super().setUp()
        self._ref_obj = Key("", 0) # Example for initializing a reference obj whose initialization requires args

    def test_pickle_coverage(self):

        obj = Key("name", 1)

        self.assertPickleCoverage(obj)

And the output running it:

python -m unittest test.python.circuit.test_pickle_coverage
.F.
======================================================================
FAIL: test_pickle_coverage_partial_init (test.python.circuit.test_pickle_coverage.TestCircuitData.test_pickle_coverage_partial_init)
test.python.circuit.test_pickle_coverage.TestCircuitData.test_pickle_coverage_partial_init
----------------------------------------------------------------------
testtools.testresult.real._StringException: Traceback (most recent call last):
  File "/home/arbel/workspace/qiskit/test/python/circuit/test_pickle_coverage.py", line 69, in test_pickle_coverage_partial_init
    self.assertPickleCoverage(obj)
  File "/home/arbel/workspace/qiskit/test/python/circuit/test_pickle_coverage.py", line 37, in assertPickleCoverage
    assert getattr(obj, member[0]) != getattr(self._ref_obj, member[0]), format(
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
AssertionError: `global_phase` should be initialized with a non-default value.


----------------------------------------------------------------------
Ran 3 tests in 0.010s

FAILED (failures=1)

The text was updated successfully, but these errors were encountered:

eliarbel added the type: qa Issues and PRs that relate to testing and code quality label Oct 20, 2024

eliarbel mentioned this issue Oct 20, 2024

Add global_phase to CircuitData::__reduce__ #13346

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add a way to test serialization code state coverage in Python classes written in Rust #13347

Add a way to test serialization code state coverage in Python classes written in Rust #13347

eliarbel commented Oct 20, 2024

Add a way to test serialization code state coverage in Python classes written in Rust #13347

Add a way to test serialization code state coverage in Python classes written in Rust #13347

Comments

eliarbel commented Oct 20, 2024

Background

Goal

Proposal