diff --git a/sphinx/api/built-in/rose_prune.rst b/sphinx/api/built-in/rose_prune.rst
index 7638c4ed9..fcb58d21b 100644
--- a/sphinx/api/built-in/rose_prune.rst
+++ b/sphinx/api/built-in/rose_prune.rst
@@ -3,43 +3,64 @@
 ``rose_prune``
 ==============
 
-A framework for housekeeping a cycling suite.
+A framework for housekeeping files and directories created by tasks in a
+cycling Cylc workflow.
 
-It prunes files and directories generated by suite tasks. It is designed
-to work under :ref:`command-rose-task-run` on the host that runs the suite
-daemon.
+When Cylc workflows run, jobs may create files and logs in multiple different
+locations on the network. After a few cycles of the workflow, these files
+can build up; however, older files might not be needed any more.
 
-The application is normally configured in the :rose:conf:`rose_prune[prune]`
-section in the :rose:file:`rose-app.conf` file.
+The ``rose_prune`` app can be used to remove files from older cycles from
+wherever they are stored on the network in order to free up space for
+later cycles.
 
-All settings are expressed as a space delimited list of cycles,
-normally as :term:`cycle points ` or
-:term:`offsets ` relative to the current cycle.
-For :term:`datetime cycling`, the format
-of a cycle point should be an :term:`ISO8601 datetime`, and an
-offset should be an :term:`ISO8601 duration`. E.g. ``-P1DT6H`` is 1 day
-and 6 hours before the current cycle point.
+.. important::
 
-The cycles of some settings also accept an optional argument followed
-by a colon. In these, the argument should be globs for matching items
-in the directory. If two or more globs are required, they should be
-separated by a space. In which case, either the argument should be
-quoted or the space should be escaped by a backslash.
+   The ``rose_prune`` app should be run on a "local" host (e.g. the Cylc
+   server); it uses ``ssh`` to remove files on any "remote" hosts where tasks
+   ran.
 
-.. note::
 
-   ``rose_prune`` uses Bash `extglob pattern matching`_ which supports simple
-   (e.g. ``*``) and extended (e.g. ``!(foo)``) pattern matching.
+Local Vs Remote Hosts
+---------------------
 
-   For more information see the ``shopt`` documentation for the version
-   of bash you have installed (``$ man shopt``).
+.. rubric:: Local hosts
+
+When you start a Cylc workflow, a :term:`scheduler` is launched.
+
+The host where the scheduler starts is the "local" host. Any other hosts which
+share the same filesystem (i.e. which see the same ``$HOME`` directory) are
+also "local".
+
+The Cylc scheduler manages some files on the "local" filesystem, e.g. the
+scheduler log. Cylc may be configured to copy remote job logs back to the
+local filesystem.
+
+.. rubric:: Remote hosts
+
+Jobs are often configured to run on "remote" hosts which do not share the same
+filesystem as the Cylc server (e.g. some HPC systems). The job logs and any
+files these jobs create will be on the "remote" filesystem.
 
 
 Invocation
 ----------
 
-In automatic selection mode, this built-in application will be invoked
-automatically if a task has a name that starts with ``rose_prune*``\ .
+To write a ``rose_prune`` app, add these lines at the top of a
+``rose-app.conf`` file:
+
+.. code-block:: rose
+
+   meta=rose_prune
+   mode=rose_prune
+
+Then run in a task in a Cylc workflow using ``rose task-run``.
+
+.. note::
+
+   If the ``mode`` is not specified, the ``rose_prune`` mode will be
+   automatically selected if the app is run by a Cylc task with a name that
+   starts with ``rose_prune``.
 
 
 Example
@@ -51,18 +72,73 @@ Example
    mode=rose_prune
 
    [prune]
-   cycle-format{cycle_year_month}=CCYYMM
+   # remove log files on remote filesystems from cycles which are 6 hours or
+   # more before the current cycle
+   # i.e. ssh rm -r ~/cylc-run////
    prune-remote-logs-at=-PT6H
+
+   # archive (e.g. "tar") local log files from cycles one day or more before
+   # the current cycle
+   # i.e. gzip cylc-run////
    archive-logs-at=-P1D
+
+   # remove local log files from cycles 7 days or more before the current cycle
+   # i.e. ssh rm -r ~/cylc-run////
    prune-server-logs-at=-P7D
+
+   # remove files matching the globs from the Cylc work directory
+   # i.e. ssh rm -r ~/cylc-run//work//
    prune{work}=-PT6H:task_x* -PT12H:*/other*.dat -PT18H:task_y* -PT24H
-   prune{share}=-P1D:hello-*-at-%(cycle)s.txt -P3M:monthly/%(cycle_year_month)s/
+
+   # remove selected items from the share/cycle directory
+   # ssh rm -r ~/cylc-run//share/cycle//
    prune{share/cycle}=-PT6H:foo* -PT12H:'bar* *.baz*' -P1D
 
+   # remove selected paths from the share directory
+   # i.e. ssh rm ~/cylc-run//share/hello-*-at-.txt
+   prune{share}=-P1D:hello-*-at-%(cycle)s.txt
+
 
 Configuration
 -------------
 
+The application is configured in the :rose:conf:`rose_prune[prune]`
+section in the :rose:file:`rose-app.conf` file.
+
+All settings are expressed as a space delimited list of cycles, normally as
+:term:`cycle points ` or :term:`offsets
+` relative to the current cycle.
+
+.. list-table::
+
+   * - Workflow Cycling Type
+     - :term:`Datetime Cycling`
+     - :term:`Integer Cycling`
+   * - :term:`Cycle Point` format
+     - :term:`ISO8601 datetime` (e.g. ``20000101T00Z`` - the 1st of Jan 2000)
+     - Integer (e.g. ``2`` - the second cycle)
+   * - Cycle offset format
+     - :term:`ISO8601 duration` (e.g. ``-P1DT6H`` - one day and 6 hours before
+       the current cycle point).
+     - Integer duration (e.g. ``-P2`` - two cycles before the current cycle
+       point)
+
+The cycles of some settings also accept an optional argument followed
+by a colon. In these, the argument should be globs for matching items
+in the directory. If two or more globs are required, they should be
+separated by a space. In which case, either the argument should be
+quoted or the space should be escaped by a backslash.
+
+.. _rose_prune.globs:
+
+.. note::
+
+   ``rose_prune`` uses Bash `extglob pattern matching`_ which supports simple
+   (e.g. ``*``) and extended (e.g. ``!(foo)``) pattern matching.
+
+   For more information see the ``shopt`` documentation for the version
+   of bash you have installed (``$ man shopt``).
+
 .. rose:app:: rose_prune
 
    .. rose:conf:: prune
@@ -70,9 +146,11 @@ Configuration
       .. rose:conf:: cycle-format{key}=format
 
          Specify a key to a format string for use in conjunction with a
-         :rose:conf:`prune{item-root}=cycle:globs` setting. For example, we may
+         :rose:conf:`prune{item-root}=cycle:globs` setting.
+
+         For example, we may
          have something like ``cycle-format{cycle_year}=CCYY`` and
-         ``prune{share}=-P1Y:xmas-present-%(cycle_year)s/``. In Cylc, if the
+         ``prune{share}=-P1Y:xmas-present-%(cycle_year)s/``. If the
          current cycle point is ``20151201T0000Z``, it will clear out the
          directory ``share/xmas-present-2014/``.
 
@@ -94,36 +172,67 @@ Configuration
          Archive all job logs at these cycles. Remove remote job logs on
          success.
 
-      .. rose:conf:: prune{item-root}=cycle[:globs] ...
+      .. rose:conf:: prune{item-root}=cycle[:glob] ...
+
+         Remove items from within a specified directory.
+
+         ``item-root``
+            A path within the workflow's :term:`run directory` e.g. ``work`` or
+            ``share/cycle``.
+         ``cycle``
+            The cycle to remove items from or an offset from the current cycle.
+         ``glob``
+            Remove only files matching a :ref:`glob pattern `.
+
+         By default ``rose_prune`` will remove files within a cycle
+         subdirectory under ``item-root``.
+         E.g. If current cycle is ``20141225T1200Z``,
+         ``prune{work}=-PT12H`` will remove the ``work/20141225T0000Z/``
+         directory.
+
+         If you want to clear out paths that include a cycle, rather than a
+         cycle subdirectory, you can template the path using the ``%(cycle)s``
+         substitution.
+         E.g. If current cycle is ``20141225T1200Z``, then
+         ``prune{share}=-PT12H:%(cycle)s.txt`` will remove
+         ``share/20141225T0000Z.txt``.
+
+         To use different date-time formats, add custom substitutions using
+         :rose:conf:`cycle-format{key}=format`, E.g.
+         ``cycle-format{cycle_year_month}=CCYYMM``.
+
+         .. rubric:: Examples:
+
+         If the current cycle is ``20141225T1200Z``:
+
+         .. code-block:: rose
+
+            # remove work/20141225T0000Z/
+            prune{work}=-PT12H
+
+            # remove work/20141225T0000Z/glob*
+            prune{work}=-PT12H:glob*
 
-         Remove the sub-directories under ``item-root`` (e.g.
-         :term:`work/ ` of the specified cycles.
-         E.g. In Cylc, if current cycle is ``20141225T1200Z``,
-         ``prune{work}=-PT12H`` will clear out ``work/20141225T0000Z/``.
+            # remove share/hello-*-at-20141225T0000Z.txt
+            prune{share}=-PT12H:hello-*-at-%(cycle)s.txt
 
-         If globs are specified for a cycle, it will attempt to prune only
-         items matching ``CYCLE/GLOBS`` under ``item-root``.
-         E.g. In Cylc, if current cycle is ``20141225T1200Z``, then
-         ``prune{share/cycle}=-PT12H:wild*`` will clear out all items
-         matching ``share/cycle/20141225T0000Z/wild*``.
+            # remove share/hello-*-at-201412.txt
+            cycle-format{cycle_year_month}=CCYYMM
+            prune{share}=-PT12H:hello-*-at-%(cycle_year_month)s.txt
 
-         A glob can also be specified as a formatting string containing a
-         single substitution ``%(cycle)s``\ . In this mode, the cycle
-         string will not be added as a sub-directory of the ``item-root``.
-         E.g. In Cylc, if current cycle is ``20141225T1200Z``, then
-         ``prune{share}=-PT12H:hello-*-at-%(cycle)s.txt`` will clear out
-         all items matching ``share/hello-*-at-20141225T0000Z.txt``.
+            # remove share/hello-*-at-2014.txt
+            cycle-format{cycle_year}=CCYY
+            prune{share}=-PT12H:hello-*-at-%(cycle_year)s.txt
 
-         A glob can also be specified as a formatting string containing a
-         substitution ``%(key)s``, if a
-         :rose:conf:`cycle-format{key}=format` setting is specified.
+            # remove share/cycle//
+            prune{share/cycle}=-PT6H:foo* -PT12H:'bar* *.baz*' -P1D
 
       .. rose:conf:: prune-work-at=cycle[:globs] ...
 
          .. deprecated:: 2015.04.0
-            Equivalent to ``prune{work}=cycle[:globs] ...``\ .
+            Equivalent to ``prune{work}=cycle[:globs] ...``.
 
       .. rose:conf:: prune-datac-at=cycle[:globs] ...
 
          .. deprecated:: 2015.04.0
-            Equivalent to ``prune{share/cycle}=cycle[:globs] ...``\ .
+            Equivalent to ``prune{share/cycle}=cycle[:globs] ...``.