QUAFEL - QUAntum Framework EvaLuation

Not to be confused with the Quaffle 😉

Overview

See our poster contributed to QTML23 for more details.

Setup 🔨

This project follows the Kedro Framework approach. Assuming you have Poetry installed, you can install the project without development packages by running:

poetry install --without dev

There is a setup.sh script in the .vscode directory for convenience.

If you want to go with Pip instead, run

pip install -r src/requirements.in
🚧 only:

If you plan to build the docs, run the tests, or commit to the project, run:

poetry install
poetry run pre-commit autoupdate
poetry run pre-commit install
poetry run pytest
poetry run mkdocs build

Again, there is a setup_dev.sh script in the .vscode directory for convenience.

With Pip the equivalent is

pip install -r src/requirements_dev.in
pre-commit autoupdate
pre-commit install
pytest
mkdocs build

Usage 🚀

Without any configuration needed, you can execute

kedro run --pipeline prepare

followed by

kedro run

and a default pipeline should run. In this and the following examples, the leading poetry run is omitted for simplicity.

Note that you must always run the prepare pipeline before any actual processing pipeline. This is because the current implementation relies on dynamically created nodes that depend on the configuration, which requires two separate pipeline executions.

In summary, the following pipelines exist:

  • prepare : generates all possible combinations of configurations based on the current parameter set
  • measure : performs the actual time measurement by executing experiments for each of the previously generated configurations with the ability to parallelize processing
  • combine : gathers all the results from the measure pipeline and combines them into a single output dataset
  • visualize : takes the combined experiment results and generates your plots

The default pipeline covers measure, combine and visualize. You can run them separately by specifying the pipeline name.

This project can take advantage of multiprocessing to evaluate numerous combinations of qubits, depths and shots in parallel in the measure pipeline. To use this, you should explicitly call the individual pipelines. In summary the whole experiment will then look as follows:

kedro run --pipeline prepare
kedro run --pipeline measure --runner quafel.runner.Parallel
kedro run --pipeline combine
kedro run --pipeline visualize

Here, only the measure pipeline will utilize multiprocessing; the rest will run in a single process. We recommend this approach since there is no advantage in running the other pipelines in parallel as well. Of course, you can also run the measure pipeline in a single process by omitting the --runner option. If for some reason the execution of the measure pipeline gets interrupted, running the same pipeline again without running prepare will allow re-using previously generated artefacts.

For details on the output, see the Data Structure Section.

Note that if you want to re-run e.g. the visualize pipeline, you have to re-run the prepare pipeline as well. This is because intermediate data containing information about the partitions is deleted after the visualize pipeline of an experiment has run successfully. This constraint will be removed in future releases.

🚧 only: Check out the pre-defined VSCode tasks if you want to develop on the project.

Configuration 🔧

Tweaking the Partitions

Circuits are generated in the data_generation namespace of the project. To adjust the number of qubits, depth of the circuit, enabled frameworks and more, check out conf/base/parameters/data_generation.yml.

Here you can adjust the following parameters:

  • seed: Used in the circuit generation method to sample random gates
  • samples_per_parameter: for expressibility and entangling capability measures
  • haar_samples_per_qubit: for expressibility and entangling capability measures
  • min_[qubits/depth/shots]: lowest number of qubits/ circuit depth/ shots used for generating partitions
  • max_[qubits/depth/shots]: highest number of qubits/ circuit depth/ shots used for generating partitions
  • [qubits/depth/shots]_increment: steps in which the range specified by min/max value will be iterated
  • [qubits/depth/shots]_type: type of the increment (e.g. exp2 or linear)
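
As a rough illustration of how min, max, increment and type could expand into partitions, here is a minimal sketch. The helper parameter_range is hypothetical and not part of the project, and the reading of exp2 (the range enumerates exponents, each value being a power of two) is an assumption; the actual behaviour is defined by the project's own code.

```python
from itertools import product


def parameter_range(minimum, maximum, increment, kind):
    """Hypothetical helper: expand min/max/increment/type into a list of values."""
    if kind == "linear":
        # minimum, minimum + increment, ... up to and including maximum
        return list(range(minimum, maximum + 1, increment))
    if kind == "exp2":
        # Assumption: min/max are exponents, so each value is 2**exponent
        return [2**e for e in range(minimum, maximum + 1, increment)]
    raise ValueError(f"unknown increment type: {kind}")


qubits = parameter_range(1, 4, 1, "linear")  # [1, 2, 3, 4]
depths = parameter_range(1, 3, 2, "linear")  # [1, 3]
shots = parameter_range(5, 7, 1, "exp2")     # [32, 64, 128]

# One partition per combination of qubits, depth and shots
partitions = list(product(qubits, depths, shots))
print(len(partitions))  # 24
```

The number of partitions grows with the product of the three range lengths, so widening any one range multiplies the work done in the measure pipeline.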

Tweaking the Execution behaviour

Everything related to executing the circuits and time measurements is contained in the data_science namespace. Head to conf/base/parameters/data_science.yml to specify a framework and set e.g. the number of evaluations.

Tweaking the Visualization

So far, there is no specific Kedro-style configuration for the visualization. The generated plots can be adjusted using the design class located in src/quafel/pipelines/visualization/nodes.py. Propagating these settings to a .yml file is on the agenda!

Pipeline 👓

You can actually see what's going on by running

poetry run kedro-viz

which will open a browser with kedro-viz showing the pipeline.

Data Structure 💾

  • data/01_raw:
  • data/02_intermediate:
    • Evaluation partitions split into single .csv files.
    • The number of partitions depends on the configuration.
  • data/03_qasm_circuits:
    • A QASM circuit for each partition.
  • data/04_measures:
  • data/05_execution_results:
    • Simulator results of the job with the corresponding id.
    • Result formats are unified as a dictionary with the keys containing the binary bit representation of the measured qubit and the normalized counts as values.
    • Results are zero-padded so that state combinations with probability $0$ are also represented.
  • data/06_execution_durations:
    • Duration for the simulation of the job with the corresponding id
    • Duration is only measured using perf_counter and process_time
  • data/07_evaluations_combined:
    • Versioned dataset containing the combined information of the input parameters (framework, qubits, depth, shots), the measured duration and the simulator results
  • data/08_reportings:
    • Versioned dataset with the .json-formatted Plotly heatmaps
    • The data in this folder is named by the framework and the fixed parameter. E.g. when the number of qubits is plotted against the shots and the qiskit_fw is being used to simulate a circuit of depth $3$, the filename would be qiskit_fw_depth_3.
  • data/09_print:
    • Print-ready output of the visualization pipeline in pdf and png format.

Note that all datasets that are not marked as "versioned" will be overwritten on the next run!
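
The two timers mentioned for data/06_execution_durations measure different things: time.perf_counter reports elapsed wall-clock time, while time.process_time reports CPU time of the current process and excludes time spent sleeping. The following is a minimal sketch of taking both measurements around a single call; the timed helper and the workload function are hypothetical stand-ins and do not reflect the project's actual measurement code.

```python
import time


def timed(fn):
    """Run fn once and return (perf_counter duration, process_time duration)."""
    start_perf = time.perf_counter()
    start_proc = time.process_time()
    fn()
    end_proc = time.process_time()
    end_perf = time.perf_counter()
    return end_perf - start_perf, end_proc - start_proc


def workload():
    # Hypothetical stand-in for a simulator call
    sum(i * i for i in range(100_000))


wall, cpu = timed(workload)
print(f"perf_counter: {wall:.6f} s, process_time: {cpu:.6f} s")
```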

Adding new frameworks ➕

New frameworks can easily be added by editing the frameworks.py file. Frameworks are defined by classes following the NAME_fw naming template where NAME should be replaced by the framework to be implemented. Later, the framework can be selected using the class name. The constructor takes the qasm_circuit and the number of shots n_shots as parameter inputs.

The class must contain a method

def execute(self) -> None:
  ...

which should be a minimal call to the framework's simulator. The output of the simulator can be stored in an instance attribute, e.g. self.result.

This result can then be accessed in the second required method

def get_result(self) -> Dict[str, float]:
  ...

Here, the simulator output can be post-processed so that a dictionary with bitstring representations for the measured qubits as keys and the normalised counts as values is returned. This dictionary is required to contain all combinations of bitstrings that result from:

bitstrings = [format(i, f"0{self.n_qubits}b") for i in range(2**self.n_qubits)]
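
Putting the pieces together, a framework class might look like the sketch below. Everything here is illustrative: dummy_fw is a hypothetical name, it samples bitstrings uniformly at random instead of calling a real simulator, and reading n_qubits from the first OpenQASM 2 qreg declaration is an assumption about how the qubit count could be obtained.

```python
import random
import re
from collections import Counter
from typing import Dict


class dummy_fw:
    """Hypothetical framework that samples bitstrings uniformly at random."""

    def __init__(self, qasm_circuit: str, n_shots: int):
        self.qasm_circuit = qasm_circuit
        self.n_shots = n_shots
        # Assumption: the qubit count is read from the first "qreg" declaration
        match = re.search(r"qreg\s+\w+\[(\d+)\]", qasm_circuit)
        if match is None:
            raise ValueError("no qreg declaration found in circuit")
        self.n_qubits = int(match.group(1))
        self.result = None

    def execute(self) -> None:
        # Stand-in for the simulator call; a real framework would run the circuit here
        self.result = Counter(
            format(random.getrandbits(self.n_qubits), f"0{self.n_qubits}b")
            for _ in range(self.n_shots)
        )

    def get_result(self) -> Dict[str, float]:
        bitstrings = [
            format(i, f"0{self.n_qubits}b") for i in range(2**self.n_qubits)
        ]
        # Counter yields 0 for unseen bitstrings, which gives the zero padding
        return {b: self.result[b] / self.n_shots for b in bitstrings}


fw = dummy_fw("OPENQASM 2.0;\nqreg q[2];\ncreg c[2];", n_shots=100)
fw.execute()
result = fw.get_result()
print(result)
```

Keeping execute free of post-processing means that only the simulator call itself falls inside the timed region, while normalisation and zero padding happen afterwards in get_result.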