Docs: add section on basic performance benchmark
The section provides a basic script to benchmark the performance of an
AiiDA installation by launching a number of `ArithmeticAddCalculation`
jobs. The script by default automatically sets up the required `Code`
and localhost `Computer`. All created nodes are automatically deleted
from the database at the end.

The documentation gives instructions on how to run the script and
provides example output of a run on a typical workstation, including
completion times for runs with a varying number of daemon workers. This
should give users an idea of the performance of their installation.
sphuber committed Sep 8, 2022
1 parent 82a81c9 commit a1d3640
Showing 2 changed files with 174 additions and 0 deletions.
127 changes: 127 additions & 0 deletions docs/source/howto/include/scripts/performance_benchmark_base.py
@@ -0,0 +1,127 @@
#!/usr/bin/env python
# -*- coding: utf-8 -*-
"""Basic script to benchmark the performance of an AiiDA installation."""
import click

from aiida.cmdline.params import options
from aiida.cmdline.utils import decorators, echo


@click.command()
@options.CODE(required=False, help='A code that can run the ``ArithmeticAddCalculation``, for example bash.')
@click.option('-n', 'number', type=int, default=10, show_default=True, help='The number of processes to submit.')
@decorators.with_dbenv()
def main(code, number):
"""Submit a number of ``ArithmeticAddCalculation`` to the daemon and record time to completion.
This command requires the daemon to be running.
The script will submit a configurable number of ``ArithmeticAddCalculation`` jobs. By default, the jobs are executed
using the ``bash`` executable of the system. If this executable cannot be found the script will exit. The jobs will
be run on the localhost, which is automatically created and configured. At the end of the script, the created nodes
will be deleted, as well as the code and computer, if they were automatically setup.
"""
    import shutil
    import tempfile
    import time
    import uuid

    from aiida import orm
    from aiida.engine import submit
    from aiida.engine.daemon.client import get_daemon_client
    from aiida.plugins import CalculationFactory
    from aiida.tools.graph.deletions import delete_nodes

    client = get_daemon_client()

    if not client.is_daemon_running:
        echo.echo_critical('The daemon is not running.')

    computer_created = False
    code_created = False

    if not code:
        # No code was specified: create a temporary `Computer` for the localhost and a `Code` wrapping `bash` on it.
        label = f'benchmark-{uuid.uuid4().hex[:8]}'
        computer = orm.Computer(
            label=label,
            hostname='localhost',
            transport_type='local',
            scheduler_type='direct',
            workdir=tempfile.gettempdir(),
        ).store()
        computer.configure(safe_interval=0.0)
        echo.echo_success(f'Created and configured temporary `Computer` {label} for localhost.')
        computer_created = True

        executable = shutil.which('bash')

        if executable is None:
            echo.echo_critical('Could not determine the absolute path for the `bash` executable.')

        code = orm.Code(label='bash', remote_computer_exec=(computer, executable)).store()
        echo.echo_success(f'Created temporary `Code` {code.label} for localhost.')
        code_created = True

    # Prepare the process builder once; every submission reuses the same trivial inputs.
    cls = CalculationFactory('arithmetic.add')
    builder = cls.get_builder()
    builder.code = code
    builder.x = orm.Int(1)
    builder.y = orm.Int(1)

    time_start = time.time()
    nodes = []

    with click.progressbar(range(number), label=f'Submitting {number} calculations.') as bar:
        for _ in bar:
            node = submit(builder)
            nodes.append(node)

    time_end = time.time()
    echo.echo(f'Submission completed in {(time_end - time_start):.2f} seconds.')

    completed = 0

    # Poll the nodes until all have reached a terminal state, updating the progress bar as they complete.
    with click.progressbar(length=number, label='Waiting for calculations to complete') as bar:
        while True:
            time.sleep(0.2)

            terminated = [node.is_terminated for node in nodes]
            newly_completed = terminated.count(True) - completed
            completed = terminated.count(True)

            bar.update(newly_completed)

            if all(terminated):
                break

    if any(node.is_excepted or node.is_killed for node in nodes):
        echo.echo_warning('At least one submitted calculation excepted or was killed.')
    else:
        echo.echo_success('All calculations finished successfully.')

    time_end = time.time()
    echo.echo(f'Elapsed time: {(time_end - time_start):.2f} seconds.')

    # Clean up: delete the benchmark nodes, and the code and computer if this script created them.
    echo.echo('Cleaning up...')
    delete_nodes([node.pk for node in nodes], dry_run=False)
    echo.echo_success('Deleted all calculations.')

    if code_created:
        code_label = code.full_label
        orm.Node.objects.delete(code.pk)
        echo.echo_success(f'Deleted the created code {code_label}.')

    if computer_created:
        computer_label = computer.label
        user = orm.User.objects.get_default()
        auth_info = computer.get_authinfo(user)
        orm.AuthInfo.objects.delete(auth_info.pk)
        orm.Computer.objects.delete(computer.pk)
        echo.echo_success(f'Deleted the created computer {computer_label}.')

    echo.echo(f'Performance {(number / (time_end - time_start)):.2f} processes / s')


if __name__ == '__main__':
    main()
47 changes: 47 additions & 0 deletions docs/source/howto/installation.rst
@@ -385,6 +385,53 @@ Here are a few general tips that might improve the AiiDA performance:
and try a simple AiiDA query with the new database.
If everything went fine, you can delete the old database location.

It is not trivial to give a performance baseline for AiiDA, as it depends strongly on the machine it runs on, the details of the installation, and the type of workload.
Nevertheless, to give some sense of the performance, we provide this :download:`simple benchmarking script <include/scripts/performance_benchmark_base.py>` :fa:`download`.
Download the script to the machine where AiiDA is running and make it executable.
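
For example, on Linux or macOS the script can be made executable with:

.. code:: console

    chmod +x performance_benchmark_base.py

Alternatively, it can be invoked through the Python interpreter of the AiiDA environment as ``python performance_benchmark_base.py``.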
Make sure the daemon is running (``verdi daemon start``) and then execute the script:

.. code:: console

    ./performance_benchmark_base.py -n 100

This will launch 100 ``ArithmeticAddCalculation`` jobs on the localhost and record the time until completion.
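
The script also accepts an existing code through the option declared by ``options.CODE``. As a sketch, assuming you have already set up a code labeled ``bash@localhost`` and that the option exposes the customary ``-X`` flag:

.. code:: console

    ./performance_benchmark_base.py -X bash@localhost -n 100

While the calculations are running, their progress can also be followed from a second terminal with ``verdi process list``.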

Below is the output of a run using AiiDA v1.6.9 with a single daemon worker on a machine with an AMD Ryzen 5 3600 six-core processor, with all services (RabbitMQ and PostgreSQL) running on the same machine:

.. code:: console

    sph@citadel:~/$ ./performance_benchmark_base.py -n 100
    Success: Created and configured temporary `Computer` benchmark-5fa8c67f for localhost.
    Success: Created temporary `Code` bash for localhost.
    Submitting 100 calculations. [####################################] 100%
    Submission completed in 9.36 seconds.
    Waiting for calculations to complete [####################################] 100%
    Success: All calculations finished successfully.
    Elapsed time: 46.55 seconds.
    Cleaning up...
    Success: Deleted all calculations.
    Success: Deleted the created code bash@benchmark-5fa8c67f.
    Success: Deleted the created computer benchmark-5fa8c67f.
    Performance 2.15 processes / s

As you can see, it took 9.36 seconds to submit the 100 calculations.
The entire script took 46.55 seconds to submit and successfully complete all 100 calculations, which matches the reported throughput of 100 / 46.55 ≈ 2.15 processes per second.
Running the same script and setup with different numbers of daemon workers gave the following results:

.. table::
    :widths: auto

    ========== ======================= ==========================
    # Workers  Total elapsed time (s)  Performance (processes/s)
    ========== ======================= ==========================
    1          46.55                   2.15
    2          27.83                   3.59
    4          16.43                   6.09
    ========== ======================= ==========================

Although the scaling is not perfectly linear, increasing the number of daemon workers clearly improves the throughput significantly.
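
The number of workers can be adjusted without restarting the daemon. As a sketch of how the multi-worker numbers above could be reproduced, starting from a running daemon with a single worker:

.. code:: console

    verdi daemon incr 3    # add three workers, for a total of four
    verdi daemon status    # confirm the number of running workers
    ./performance_benchmark_base.py -n 100

Note that adding workers only improves throughput as long as the machine has spare cores and the PostgreSQL and RabbitMQ services are not the bottleneck.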


.. _how-to:installation:update:

Updating your installation
