Not to be confused with the Quaffle 😉
See our poster contributed to QTML23 for more details.
This project follows the Kedro framework approach. To install it directly, without the development packages, run the following command (assuming you have Poetry installed):
poetry install --without dev
There is a setup.sh script in the .vscode directory for convenience.
If you want to go with Pip instead, run
pip install -r src/requirements.in
🚧 only:
If you consider building the docs, running tests and committing to the project, run:
poetry install
poetry run pre-commit autoupdate
poetry run pre-commit install
poetry run pytest
poetry run mkdocs build
Again, there is a setup_dev.sh script in the .vscode directory for convenience.
With Pip the equivalent is
pip install -r src/requirements_dev.in
pre-commit autoupdate
pre-commit install
pytest
mkdocs build
Without any configuration needed, you can execute
kedro run --pipeline prepare
followed by
kedro run
and a default pipeline should run. In this and the following examples, the leading poetry run is omitted for simplicity.
Note that the prepare pipeline must always be run before any actual processing pipeline.
This is because the current implementation relies on dynamically created nodes that depend on the configuration, which requires two separate pipeline executions.
In summary, the following pipelines exist:
- prepare: generates all possible combinations of configurations based on the current parameter set
- measure: performs the actual time measurement by executing experiments for each of the previously generated configurations with the ability to parallelize processing
- combine: gathers all the results from the measure pipeline and combines them into a single output dataset
- visualize: takes the combined experiment results and generates your plots
The default pipeline covers measure, combine and visualize.
You can run them separately by specifying the pipeline name.
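To picture what "all possible combinations of configurations" means for the prepare pipeline, the following minimal sketch builds a Cartesian product over a small parameter set. It is not the project's implementation: the function name build_combinations, the dictionary keys and the example_fw entry are assumptions made for illustration only.

```python
from itertools import product

# Hypothetical parameter values; the real ones come from the project configuration.
frameworks = ["qiskit_fw", "example_fw"]  # "example_fw" is a made-up name
qubits = [1, 2]
depths = [1, 3]
shots = [100]


def build_combinations(frameworks, qubits, depths, shots):
    """Return one dict per configuration (Cartesian product of all inputs)."""
    return [
        {"framework": f, "qubits": q, "depth": d, "shots": s}
        for f, q, d, s in product(frameworks, qubits, depths, shots)
    ]


combinations = build_combinations(frameworks, qubits, depths, shots)
print(len(combinations))  # 2 * 2 * 2 * 1 = 8 configurations
```

Each entry of such a list would correspond to one experiment that the measure pipeline executes, which is why the number of dynamically created nodes depends on the configuration.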
This project can take advantage of multiprocessing to evaluate numerous combinations of qubits, depths and shots in parallel in the measure pipeline.
To use this, you should explicitly call the individual pipelines.
In summary, the whole experiment will then look as follows:
kedro run --pipeline prepare
kedro run --pipeline measure --runner quafel.runner.Parallel
kedro run --pipeline combine
kedro run --pipeline visualize
Here, only the measure pipeline will use multiprocessing; the rest will run in a single process.
We recommend this approach since there is no advantage in running the other pipelines in parallel as well.
Of course, you can also run the measure pipeline in a single process by omitting the --runner option.
If for some reason the execution of the measure pipeline gets interrupted, running the same pipeline again without running prepare first will allow previously generated artefacts to be reused.
For details on the output, see the Data Structure Section.
Note that if you want to re-run e.g. the visualize pipeline, you have to re-run the prepare pipeline as well.
This is because intermediate data containing information about the partitions is deleted after the visualize pipeline of an experiment has run successfully.
This constraint will be removed in future releases.
🚧 only:
Check out the pre-defined VSCode tasks if you want to develop on the project.
Circuits are generated in the data_generation namespace of the project.
To adjust the number of qubits, depth of the circuit, enabled frameworks and more, check out conf/base/parameters/data_generation.yml.
Here you can adjust the following parameters:
- seed: used in the circuit generation method to sample random gates
- samples_per_parameter: for expressibility and entangling capability measures
- haar_samples_per_qubit: for expressibility and entangling capability measures
- min_[qubits/depth/shots]: lowest number of qubits / circuit depth / shots used for generating partitions
- max_[qubits/depth/shots]: highest number of qubits / circuit depth / shots used for generating partitions
- [qubits/depth/shots]_increment: steps in which the range specified by the min/max values will be iterated
- [qubits/depth/shots]_type: type of the increment (e.g. exp2 or linear)
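The interplay of the min, max, increment and type parameters can be illustrated with a short sketch. The exact semantics are defined by the project's code; the version below is only one plausible reading and rests on two assumptions that are not confirmed by this document: that the upper bound is inclusive, and that exp2 treats the iterated values as exponents of two. The function name parameter_range is hypothetical.

```python
def parameter_range(min_value, max_value, increment, increment_type):
    """Illustrative reading of min/max/increment/type; not the project's code."""
    # Assumption: the max value is included in the range.
    steps = range(min_value, max_value + 1, increment)
    if increment_type == "linear":
        return list(steps)
    if increment_type == "exp2":
        # Assumption: the iterated values are used as exponents of two.
        return [2**exponent for exponent in steps]
    raise ValueError(f"unknown increment type: {increment_type}")


print(parameter_range(1, 5, 2, "linear"))
print(parameter_range(1, 4, 1, "exp2"))
```

Under these assumptions, a linear type suits qubit counts and depths, while exp2 covers a wide span of shots with few values.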
Everything related to executing the circuits and time measurements is contained in the data_science namespace.
Head to conf/base/parameters/data_science.yml to specify a framework and set e.g. the number of evaluations.
So far, there is no specific Kedro-style configuration for the visualization.
The generated plots can be adjusted using the design class located in src/quafel/pipelines/visualization/nodes.py.
Propagating these settings to a .yml file is on the agenda!
You can actually see what's going on by running
poetry run kedro-viz
which will open a browser with kedro-viz showing the pipeline.
- data/01_raw:
  - Versioned Evaluation Matrix containing all valid values for frameworks, qubits, depths, and shots as specified in the data_generation.yml file.
- data/02_intermediate:
  - Evaluation partitions split into single .csv files.
  - The number of partitions depends on the configuration.
- data/03_qasm_circuits:
  - A QASM circuit for each partition.
- data/04_measures:
  - Entangling capability and expressibility of each generated circuit
  - Calculation according to Sim et al. - Expressibility and entangling capability of parameterized quantum circuits for hybrid quantum-classical algorithms
  - Statevectors of circuits are cached (./.cache/ folder) based on the MD5 hash of the provided circuit to speed up calculation
- data/05_execution_results:
  - Simulator results of the job with the corresponding id.
  - Result formats are unified as a dictionary with the keys containing the binary bit representation of the measured qubits and the normalized counts as values.
  - Results are zero-padded, so that state combinations with $0$ probability are represented as well.
- data/06_execution_durations:
  - Duration for the simulation of the job with the corresponding id
  - Duration is only measured using perf_counter and process_time
- data/07_evaluations_combined:
  - Versioned dataset containing the combined information of the input parameters (framework, qubits, depth, shots), the measured duration and the simulator results
- data/08_reportings:
  - Versioned dataset with the .json formatted Plotly heatmaps
  - The data in this folder is named by the framework and the fixed parameter. E.g. when the number of qubits is plotted against the shots and the qiskit_fw is being used to simulate a circuit of depth $3$, the filename would be qiskit_fw_depth_3.
- data/09_print:
  - Print-ready output of the visualization pipeline in pdf and png format.
Note that all datasets that are not marked as "versioned" will be overwritten on the next run!
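The two clocks mentioned for data/06_execution_durations differ in what they count: perf_counter measures elapsed wall-clock time, while process_time measures CPU time of the current process and excludes time spent sleeping. The following minimal sketch shows how both can be taken around a single call. The helper measure_duration and the busy_work stand-in are hypothetical and not taken from the project.

```python
import time


def measure_duration(func):
    """Run func once and return (wall-clock seconds, CPU seconds)."""
    start_perf = time.perf_counter()
    start_proc = time.process_time()
    func()
    end_proc = time.process_time()
    end_perf = time.perf_counter()
    return end_perf - start_perf, end_proc - start_proc


def busy_work():
    # Stand-in for a simulator call.
    sum(i * i for i in range(100_000))


wall, cpu = measure_duration(busy_work)
print(f"perf_counter: {wall:.6f} s, process_time: {cpu:.6f} s")
```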
New frameworks can easily be added by editing the frameworks.py file.
Frameworks are defined by classes following the NAME_fw naming template, where NAME should be replaced by the framework to be implemented.
Later, the framework can be selected using the class name.
The constructor takes the qasm_circuit and the number of shots n_shots as parameter inputs.
The class must contain a method
def execute(self) -> None:
    ...
which should be a minimal call to the framework's simulator.
The output of the simulator can be stored in an instance attribute, e.g. self.result.
This result can then be accessed in the second required method
def get_result(self) -> Dict[str, float]:
    ...
Here, the simulator output can be post-processed so that a dictionary with bitstring representations for the measured qubits as keys and the normalised counts as values is returned. This dictionary is required to contain all combinations of bitstrings that result from:
bitstrings = [format(i, f"0{self.n_qubits}b") for i in range(2**self.n_qubits)]
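Putting the pieces together, a framework class could look roughly like the sketch below. It is a toy example and not part of the project: dummy_fw is a hypothetical name, the "simulator" only draws uniformly random bitstrings, and reading the qubit count from a "qreg" declaration in the QASM string is an assumption about the input format.

```python
import random
import re
from collections import Counter
from typing import Dict


class dummy_fw:
    """Hypothetical framework that samples bitstrings uniformly at random."""

    def __init__(self, qasm_circuit: str, n_shots: int):
        self.n_shots = n_shots
        # Assumption: the register size is given by a line such as "qreg q[2];".
        match = re.search(r"qreg\s+\w+\[(\d+)\]", qasm_circuit)
        if match is None:
            raise ValueError("no qreg declaration found in the QASM circuit")
        self.n_qubits = int(match.group(1))
        self.result = Counter()

    def execute(self) -> None:
        # Stand-in for the minimal call to a real simulator.
        self.result = Counter(
            format(random.getrandbits(self.n_qubits), f"0{self.n_qubits}b")
            for _ in range(self.n_shots)
        )

    def get_result(self) -> Dict[str, float]:
        # Zero-pad: every possible bitstring gets an entry, even if never sampled.
        bitstrings = [
            format(i, f"0{self.n_qubits}b") for i in range(2**self.n_qubits)
        ]
        return {b: self.result.get(b, 0) / self.n_shots for b in bitstrings}


fw = dummy_fw("OPENQASM 2.0;\nqreg q[2];\ncreg c[2];", n_shots=100)
fw.execute()
result = fw.get_result()
print(result)
```

Keeping execute free of any post-processing means that only the simulator call itself is timed, while normalisation and zero-padding happen afterwards in get_result.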