Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

YAML loading and dumping improvements #62932

Closed
wants to merge 32 commits into from
Closed

Conversation

rhansen
Copy link
Contributor

@rhansen rhansen commented Oct 21, 2022

What does this PR do?

Introduces a new yaml_compatibility option that controls the behavior of YAML loading/dumping. If a user wants to revert to the YAML behavior from a previous version of Salt, or preview upcoming changes to YAML behavior, they can set it to the desired version of Salt. For example, setting yaml_compatibility to 3006 will pin the current behavior, and setting it to 3007 allows the user to preview the behavior that will become the default in Salt v3007.

In addition, the following behavior changes take effect immediately:

  • A warning is logged saying that the default YAML behavior will change in Salt v3007.
  • Passing a non-dict class to the salt.utils.yaml.SaltYamlSafeLoader constructor no longer causes all future !!omap nodes to raise an exception when loaded.
  • Loading a sequence node tagged with !!python/tuple is now supported, and produces a Python tuple object. Before, such nodes would raise an exception. (This change does not break backwards compatibility, which is why it takes effect immediately.)

After v3006 is released (salt.version.SaltVersionsInfo.SULFUR.released is changed to True), the following behavior changes will automatically take effect. These can be previewed now by setting yaml_compatibility to 3007.

  • salt.utils.yaml.dump() will default to salt.utils.yaml.OrderedDumper rather than PyYAML's yaml.Dumper class. (I believe this was the original intention, and now dump()'s behavior is symmetric with safe_dump()).
  • Fix salt.utils.yaml.IndentedSafeOrderedDumper list indentation (was broken only when yaml.CSafeDumper exists).
  • Loading an explicitly tagged YAML !!timestamp node will produces a Python datetime.datetime object (currently they are always loaded as strings).
  • Fix loading of YAML !!omap nodes (currently they are sometimes incorrectly parsed as mapping nodes rather than sequence nodes containing mapping nodes).
  • Loading a YAML !!omap node will always produce a collections.OrderedDict object. Currently, it sometimes produces a list of (key, value) tuples and sometimes raises an exception.
  • Dumping a Python datetime.datetime object to YAML now explicitly tags the node with !!timestamp.
  • Dumping a Python collections.OrderedDict object to YAML now consistently produces a YAML !!omap node, which is a sequence of single-valued mappings. Before, the behavior varied depending on the specific object type (collections.OrderedDict vs. salt.utils.odict.OrderedDict) and the dump function (salt.utils.yaml.dump() vs. salt.utils.yaml.safe_dump()).
  • Dumping a Python tuple object to YAML now consistently produces a YAML sequence node tagged with !!python/tuple. Before, only salt.utils.yaml.dump() would tag the node with !!python/tuple; salt.utils.yaml.safe_dump() did not.
  • Dumping an object of unsupported type using a safe dumper will raise an exception. Currently it produces a NULL node.

Clean-ups and code health:

What issues does this PR fix or reference?

#12161 (can be closed even without this PR because the bug no longer exists, although this PR adds a regression test in the form of an integration test)

Behavior Changes

  • IndentedSafeOrderedDumper:

    • Before (assuming yaml.CSafeDumper exists):

      >>> import salt.utils.yaml as y
      >>> print(y.dump({"foo": ["bar"]}, Dumper=y.IndentedSafeOrderedDumper, default_flow_style=False))
      foo:
      - bar
      
    • After:

      >>> import salt.utils.yaml as y
      >>> print(y.dump({"foo": ["bar"]}, Dumper=y.IndentedSafeOrderedDumper, default_flow_style=False))
      foo:
        - bar
      

      This fix required a switch to the pure Python yaml.SafeDumper class (for IndentedSafeOrderedDumper only), which will be slower than yaml.CSafeDumper.

  • Loading YAML !!omap nodes to Python:

    • Before:

      >>> import salt.utils.yaml as y
      >>> import collections
      >>> y.load("!!omap [{foo: bar}, {baz: bif}]")
      [('foo', 'bar'), ('baz', 'bif')]
      >>> y.SaltYamlSafeLoader("", dictclass=collections.OrderedDict)  # created for side-effect only
      <salt.utils.yamlloader.SaltYamlSafeLoader object at 0xc31a10>
      >>> y.load("!!omap [{foo: bar}, {baz: bif}]")  # exact same as before
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "salt/utils/yamlloader.py", line 159, in load
          return yaml.load(stream, Loader=Loader)
        File "venv/lib/python3.8/site-packages/yaml/__init__.py", line 81, in load
          return loader.get_single_data()
        File "venv/lib/python3.8/site-packages/yaml/constructor.py", line 51, in get_single_data
          return self.construct_document(node)
        File "venv/lib/python3.8/site-packages/yaml/constructor.py", line 60, in construct_document
          for dummy in generator:
        File "salt/utils/yamlloader.py", line 45, in construct_yaml_map
          value = self.construct_mapping(node)
        File "salt/utils/yamlloader.py", line 56, in construct_mapping
          raise ConstructorError(
      yaml.constructor.ConstructorError: expected a mapping node, but found sequence
        in "<unicode string>", line 1, column 1
    • After:

      >>> import salt.utils.yaml as y
      >>> import collections
      >>> y.load("!!omap [{foo: bar}, {baz: bif}]")
      OrderedDict([('foo', 'bar'), ('baz', 'bif')])
      >>> y.SaltYamlSafeLoader("!!omap [{foo: bar}, {baz: bif}]", dictclass=collections.OrderedDict).get_single_data()
      OrderedDict([('foo', 'bar'), ('baz', 'bif')])
      >>> y.load("!!omap [{foo: bar}, {baz: bif}]")
      OrderedDict([('foo', 'bar'), ('baz', 'bif')])
  • Loading YAML !!timestamp nodes to Python:

    • Before:

      >>> import salt.utils.yaml as y
      >>> y.load("!!timestamp 2001-12-15T02:59:43.1Z")
      '2001-12-15T02:59:43.1Z'
      >>> y.load("2001-12-15T02:59:43.1Z")  # implicit timestamp
      '2001-12-15T02:59:43.1Z'
    • After:

      >>> import salt.utils.yaml as y
      >>> y.load("!!timestamp 2001-12-15T02:59:43.1Z")
      datetime.datetime(2001, 12, 15, 2, 59, 43, 100000, tzinfo=datetime.timezone.utc)
      >>> y.load("2001-12-15T02:59:43.1Z")  # untagged is still a string
      '2001-12-15T02:59:43.1Z'
  • Loading YAML !!python/tuple nodes to Python:

    • Before:

      >>> import salt.utils.yaml as y
      >>> y.load("!!python/tuple [foo, bar]")
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "salt/utils/yamlloader.py", line 159, in load
          return yaml.load(stream, Loader=Loader)
        File "venv/lib/python3.8/site-packages/yaml/__init__.py", line 81, in load
          return loader.get_single_data()
        File "venv/lib/python3.8/site-packages/yaml/constructor.py", line 51, in get_single_data
          return self.construct_document(node)
        File "venv/lib/python3.8/site-packages/yaml/constructor.py", line 55, in construct_document
          data = self.construct_object(node)
        File "venv/lib/python3.8/site-packages/yaml/constructor.py", line 100, in construct_object
          data = constructor(self, node)
        File "venv/lib/python3.8/site-packages/yaml/constructor.py", line 427, in construct_undefined
          raise ConstructorError(None, None,
      yaml.constructor.ConstructorError: could not determine a constructor for the tag 'tag:yaml.org,2002:python/tuple'
        in "<unicode string>", line 1, column 1
    • After:

      >>> import salt.utils.yaml as y
      >>> y.load("!!python/tuple [foo, bar]")
      ('foo', 'bar')
  • Dumping Python datetime.datetime objects to YAML:

    • Before:

      >>> import datetime
      >>> import salt.utils.yaml as y
      >>> print(y.dump(datetime.datetime(2022, 10, 21, 18, 16, 3, 100000, tzinfo=datetime.timezone(datetime.timedelta(hours=-4)))))
      2022-10-21 18:16:03.100000-04:00
    • After:

      >>> import datetime
      >>> import salt.utils.yaml as y
      >>> print(y.dump(datetime.datetime(2022, 10, 21, 18, 16, 3, 100000, tzinfo=datetime.timezone(datetime.timedelta(hours=-4)))))
      !!timestamp 2022-10-21 18:16:03.100000-04:00
  • Dumping Python OrderedDict objects to YAML:

    • Before:

      >>> from salt.utils.odict import OrderedDict
      >>> import salt.utils.yaml as y
      >>> import collections
      >>> print(y.dump(OrderedDict([("foo", "bar")]), default_flow_style=False))
      !!python/object/apply:salt.utils.odict.OrderedDict
      dictitems:
        foo: bar
      
      >>> print(y.safe_dump(OrderedDict([("foo", "bar")]), default_flow_style=False))
      foo: bar
      
      >>> print(y.dump(collections.OrderedDict([("foo", "bar")]), default_flow_style=False))
      !!python/object/apply:collections.OrderedDict
      - - - foo
          - bar
      
      >>> print(y.safe_dump(collections.OrderedDict([("foo", "bar")]), default_flow_style=False))
      NULL
      
    • After:

      >>> from salt.utils.odict import OrderedDict
      >>> import salt.utils.yaml as y
      >>> import collections
      >>> print(y.dump(OrderedDict([("foo", "bar")]), default_flow_style=False))
      !!omap
      - foo: bar
      
      >>> print(y.safe_dump(OrderedDict([("foo", "bar")]), default_flow_style=False))
      !!omap
      - foo: bar
      
      >>> print(y.dump(collections.OrderedDict([("foo", "bar")]), default_flow_style=False))
      !!omap
      - foo: bar
      
      >>> print(y.safe_dump(collections.OrderedDict([("foo", "bar")]), default_flow_style=False))
      !!omap
      - foo: bar
      
  • Dumping Python tuple objects to YAML:

    • Before:

      >>> import salt.utils.yaml as y
      >>> print(y.dump(("foo", "bar"), default_flow_style=False))
      !!python/tuple
      - foo
      - bar
      
      >>> print(y.safe_dump(("foo", "bar"), default_flow_style=False))
      - foo
      - bar
      
    • After:

      >>> import salt.utils.yaml as y
      >>> print(y.dump(("foo", "bar"), default_flow_style=False))
      !!python/tuple
      - foo
      - bar
      
      >>> print(y.safe_dump(("foo", "bar"), default_flow_style=False))
      !!python/tuple
      - foo
      - bar
      
  • Dumping unsupported objects:

    • Behavior before:

      >>> import salt.utils.yaml as y
      >>> y.safe_dump(type("GeneratedObjectSubclass", (), {}))
      'NULL\n'
    • Behavior after:

      >>> import salt.utils.yaml as y
      >>> y.safe_dump(type("GeneratedObjectSubclass", (), {}))
      Traceback (most recent call last):
        File "<stdin>", line 1, in <module>
        File "salt/utils/yamldumper.py", line 270, in safe_dump
          return dump(data, stream, Dumper=SafeOrderedDumper, **kwargs)
        File "salt/utils/yamldumper.py", line 261, in dump
          return yaml.dump(data, stream, **kwargs)
        File "venv/lib/python3.8/site-packages/yaml/__init__.py", line 253, in dump
          return dump_all([data], stream, Dumper=Dumper, **kwds)
        File "venv/lib/python3.8/site-packages/yaml/__init__.py", line 241, in dump_all
          dumper.represent(data)
        File "venv/lib/python3.8/site-packages/yaml/representer.py", line 27, in represent
          node = self.represent_data(data)
        File "venv/lib/python3.8/site-packages/yaml/representer.py", line 58, in represent_data
          node = self.yaml_representers[None](self, data)
        File "venv/lib/python3.8/site-packages/yaml/representer.py", line 231, in represent_undefined
          raise RepresenterError("cannot represent an object", data)
      yaml.representer.RepresenterError: ('cannot represent an object', <class '__main__.GeneratedObjectSubclass'>)

Merge requirements satisfied?

[NOTICE] Bug fixes or features added to Salt require tests.

Commits signed with GPG?

No

Tasks

  • write a changelog entry
  • figure out where to document these changes
  • Add a feature flag to enable this new behavior.
  • Get all the tests to pass.
  • Get Lazy import #63073 merged

@rhansen
Copy link
Contributor Author

rhansen commented Oct 21, 2022

This PR is marked as draft until I complete the following tasks:

  • write a changelog entry
  • figure out where to document these changes
  • ensure that all tests pass

@rhansen rhansen force-pushed the yaml-omap branch 3 times, most recently from 3943125 to c0bc1a7 Compare October 23, 2022 04:37
@rhansen rhansen marked this pull request as ready for review October 23, 2022 04:38
@rhansen rhansen requested a review from a team as a code owner October 23, 2022 04:38
@rhansen rhansen requested review from cmcmarrow and removed request for a team October 23, 2022 04:38
@rhansen
Copy link
Contributor Author

rhansen commented Oct 23, 2022

I believe this is ready to be reviewed now.

salt/utils/yamldumper.py Outdated Show resolved Hide resolved
Comment on lines +140 to +139
assert isinstance(dataset["spam"], collections.OrderedDict)
assert isinstance(dataset["spam"]["foo"], collections.OrderedDict)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm going to trigger a full test run on this PR, simply to find out if there are any other tests that may fail due to these changes.

@MKLeb
Copy link
Contributor

MKLeb commented Oct 26, 2022

re-run full all

@waynew
Copy link
Contributor

waynew commented Oct 26, 2022

FWIW I'm on board with this idea! I don't yet know if I'm on board with the implementation - as @MKLeb implied with the full test run, this is a pretty intensive change with (potentially) broad repercussions... hopefully all of them good! But I'm sure someone out there is relying on some backwards funky behavior 🙃

I would love if we could bridge the existing behavior and this with like... a feature flag and deprecation warning, like...

if not __opts__['use_up_to_date_yaml_dumping']:
    raise DeprecationWarning('You are currently using the legacy YAML loading/dumping approach. This will work as exists until 2024 at the earliest. To use the updated YAML behavior, set `use_up_to_date_yaml_dumping: True` in your config files.")
    do_old_stuff()
else:
    do_this()

That way folks could opt-in to the new behavior, and if they have a bunch of problems then they can file some bugs and still have a better fallback than just downgrading their salt installation.

In any case, I'm sure a lot of us would feel better with an in-depth code review!

@rhansen
Copy link
Contributor Author

rhansen commented Oct 26, 2022

I would love if we could bridge the existing behavior and this with like... a feature flag and deprecation warning

I can do that. One thing: There is already an option for legacy YAML behavior: use_yamlloader_old. That option doesn't appear to be marked as deprecated. Is it old enough that we can safely remove it to avoid confusion with a new option?

I'm tempted to make the option a version selector rather than a boolean so that the YAML behavior can be changed again in the future without much fanfare. Maybe something like override_yaml_parser_version that defaults to the current version of Salt and behaves like this:

from salt.version import SaltStackVersion

want_v = __opts__.get("override_yaml_parser_version")
if want_v:
    want_v = SaltStackVersion.parse(want_v)
else:
    want_v = SaltStackVersion.current_release()

if want_v < SaltStackVersion(3006):
    do_old_stuff()
else:
    do_this_new_stuff()

Note that the new behavior would be the default behavior, so if it breaks a user they would have to take action to switch to the old behavior.

Thoughts?

@rhansen
Copy link
Contributor Author

rhansen commented Oct 26, 2022

Note that the new behavior would be the default behavior, so if it breaks a user they would have to take action to switch to the old behavior.

If we think this change (or a future change) is too risky to make the default right away, we could predate the version and add a warning:

if want_v < SaltStackVersion(3007):
    if not __opts__.get("override_yaml_parser_version"):
        salt.utils.versions.warn_until(3007, "Salt's YAML parsing behavior will change in v3007.0; see xyz for details")
    do_old_stuff()
else:
    do_this_new_stuff()

@rhansen
Copy link
Contributor Author

rhansen commented Oct 27, 2022

re-run full all

@rhansen rhansen marked this pull request as draft October 27, 2022 05:42
@rhansen
Copy link
Contributor Author

rhansen commented Oct 27, 2022

I set this to draft until all of the tasks in the PR description are done.

@rhansen
Copy link
Contributor Author

rhansen commented Oct 31, 2022

@waynew I'm trying to figure out how to coax the loader to set __opts__ in salt.utils.yaml and friends, but the loader documentation is wafer thin. Any hints?

@MKLeb
Copy link
Contributor

MKLeb commented Oct 31, 2022

@rhansen There is an example of how to mock the loader dunders here

@rhansen
Copy link
Contributor Author

rhansen commented Oct 31, 2022

@MKLeb Thanks, though it looks like __opts__ isn't available in salt.utils.yaml when Salt normally runs (not in test) because salt.utils.yaml is usually imported, not loaded through the loader. Is that correct? Can I manually trigger the loader at daemon startup to populate the dunder dictionaries so that __opts__ is available even if salt.utils.yaml is directly imported?

@MKLeb
Copy link
Contributor

MKLeb commented Oct 31, 2022

@s0undt3ch Any ideas here?

@lkubb
Copy link
Contributor

lkubb commented Nov 1, 2022

Not sure if that's helpful, but you can always manually load __opts__ and friends (watch out for performance though):

Example:
.. code-block:: python
import salt.config
import salt.loader
__opts__ = salt.config.minion_config('/etc/salt/minion')
__grains__ = salt.loader.grains(__opts__)
__opts__['grains'] = __grains__
__utils__ = salt.loader.utils(__opts__)
__salt__ = salt.loader.minion_mods(__opts__, utils=__utils__)
__salt__['test.ping']()

import salt.config
import salt.syspaths

from pathlib import Path

__opts__ = None


def my_func():
    global __opts__
    if __opts__ is None:
        __opts__ = salt.config.minion_config(Path(salt.syspaths.CONFIG_DIR) / "minion")
    # in case you need grains in __opts__, see the linked example
    # ... do stuff

This does not change the behavior, but it does simplify the code.
The `salt.loader` module is intentially imported late in
`salt.config.call_id_function()` to avoid a circular import error.
However, `salt.loader` is also used in `salt.config.mminion_config()`,
so import it there as well.  This avoids an error if
`mminion_config()` is called before `call_id_function()`, or if the
import statement in `call_id_function()` is removed in the future.
The same YAML value can be represented in different ways, so tests
that compare a generated YAML string with a manually typed string are
fragile.  Use `salt.utils.yaml` to generate the reference YAML so that
the implementation of the YAML dumper can change without breaking the
tests.
The delayed `import` statements match `import` statements at the top
of the file so they are effectively no-ops.
This commit is a prerequisite to adding version-selected behavior
changes.
This will make it possible for users to switch to an older YAML
loader/dumper behavior in case a change breaks something.

Note that several top-of-file imports were moved into
functions to avoid circular import errors.
I believe this was the original intention.  Even if it was not, the
symmetry with `salt.utils.yaml.save_dump()` makes `dump()` less
surprising now, and it makes it easier to introduce representer
changes that affect both `dump()` and `safe_dump()`.
Before, the indentation would only be increased if the compiled module
`yaml.CSafeDumper` did not exist.  Now it is indented even if it does.

Behavior before (assuming `yaml.CSafeDumper` exists):

```pycon
>>> import salt.utils.yaml as y
>>> print(y.dump({"foo": ["bar"]}, Dumper=y.IndentedSafeOrderedDumper, default_flow_style=False))
foo:
- bar

```

Behavior after:

```pycon
>>> import salt.utils.yaml as y
>>> print(y.dump({"foo": ["bar"]}, Dumper=y.IndentedSafeOrderedDumper, default_flow_style=False))
foo:
  - bar

```
The `SaltYamlSafeLoader.add_constructor()` method is a class method,
not an instance method.  Therefore, any call to that method in an
instance method (such as `__init__`) affects all future instances, not
just the current instance (`self`).  That's not a problem if the
registration calls simply re-register the same constructor functions
over and over, but that was not the case here: different functions
were registered depending on the value of the `dictclass` parameter to
`__init__`.

The net effect was that an `!!omap` node was correctly processed as a
sequence of single-entry maps (and a list of (key, value) tuples
returned) until the first time a SaltYamlSafeLoader was constructed
with a non-`dict` class.  After that, every `!!omap` node was always
incorrectly processed as a map node regardless of `dictclass`.

Now `!!omap` processing is performed as originally intended:
  * If the `dictclass` parameter is `dict` (the default), the behavior
    when loading a `!!map` or `!!omap` node is unchanged from the base
    class's behavior.
  * If the `dictclass` parameter is not `dict`:
      * `!!map` nodes are loaded like they are in the base class
        except the custom class is used instead of `dict`.
      * `!!omap` nodes are loaded the same as `!!map` nodes.  (This is
        a bug because an `!!omap` node is a sequence node of
        single-valued map nodes, not an ordinary map node.  A future
        commit will fix this.)

Behavior before:

```pycon
>>> import salt.utils.yaml as y
>>> import collections
>>> y.load("!!omap [{foo: bar}, {baz: bif}]")
[('foo', 'bar'), ('baz', 'bif')]
>>> y.SaltYamlSafeLoader("", dictclass=collections.OrderedDict)  # created for side-effect only
<salt.utils.yamlloader.SaltYamlSafeLoader object at 0xc31a10>
>>> y.load("!!omap [{foo: bar}, {baz: bif}]")  # exact same as before
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "salt/utils/yamlloader.py", line 159, in load
    return yaml.load(stream, Loader=Loader)
  File "venv/lib/python3.8/site-packages/yaml/__init__.py", line 81, in load
    return loader.get_single_data()
  File "venv/lib/python3.8/site-packages/yaml/constructor.py", line 51, in get_single_data
    return self.construct_document(node)
  File "venv/lib/python3.8/site-packages/yaml/constructor.py", line 60, in construct_document
    for dummy in generator:
  File "salt/utils/yamlloader.py", line 45, in construct_yaml_map
    value = self.construct_mapping(node)
  File "salt/utils/yamlloader.py", line 56, in construct_mapping
    raise ConstructorError(
yaml.constructor.ConstructorError: expected a mapping node, but found sequence
  in "<unicode string>", line 1, column 1
```

Behavior after:

```pycon
>>> import salt.utils.yaml as y
>>> import collections
>>> y.load("!!omap [{foo: bar}, {baz: bif}]")
[('foo', 'bar'), ('baz', 'bif')]
>>> y.SaltYamlSafeLoader("!!omap [{foo: bar}, {baz: bif}]", dictclass=collections.OrderedDict).get_single_data()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "venv/lib/python3.8/site-packages/yaml/constructor.py", line 51, in get_single_data
    return self.construct_document(node)
  File "venv/lib/python3.8/site-packages/yaml/constructor.py", line 60, in construct_document
    for dummy in generator:
  File "salt/utils/yamlloader.py", line 42, in construct_yaml_omap
    return (yield from self.construct_yaml_map(node))
  File "salt/utils/yamlloader.py", line 36, in construct_yaml_map
    value = self.construct_mapping(node)
  File "salt/utils/yamlloader.py", line 52, in construct_mapping
    raise ConstructorError(
yaml.constructor.ConstructorError: expected a mapping node, but found sequence
  in "<unicode string>", line 1, column 1
>>> y.load("!!omap [{foo: bar}, {baz: bif}]")
[('foo', 'bar'), ('baz', 'bif')]
```

This commit also adds a unit test for the `dictclass` parameter,
though it's important to note that the new test is not a regression
test for the bug fixed by this commit.  (I don't think it would be
worthwhile to write a regression test because the test code would be
complicated and unreadable.)
YAML nodes tagged with `!!timestamp` nodes now become Python
`datetime` objects rather than strings.  To preserve compatibility
with existing YAML files, timestamp-like strings without `!!timestamp`
are still loaded as strings.

Behavior before:

```pycon
>>> import datetime
>>> import salt.utils.yaml as y
>>> y.load("'2022-10-21T18:16:03.1-04:00'")
'2022-10-21T18:16:03.1-04:00'
>>> y.load("!!timestamp '2022-10-21T18:16:03.1-04:00'")
'2022-10-21T18:16:03.1-04:00'
```

Behavior after:

```pycon
>>> import datetime
>>> import salt.utils.yaml as y
>>> y.load("'2022-10-21T18:16:03.1-04:00'")
'2022-10-21T18:16:03.1-04:00'
>>> y.load("!!timestamp '2022-10-21T18:16:03.1-04:00'")
datetime.datetime(2022, 10, 21, 18, 16, 3, 100000, tzinfo=datetime.timezone(datetime.timedelta(days=-1, seconds=72000)))
```
Before, `!!omap` nodes were incorrectly assumed to be mapping nodes
when `dictclass` was not `dict`.  Now they are correctly processed as
sequences of single-entry mappings.  The resulting Python object has
type `dictclass` (usually `collections.OrderedDict` or a subclass).

This commit does not change the behavior when `dictclass` is `dict`;
loading an `!!omap` node still returns a list of (key, value)
tuples (PyYAML's default behavior).  I consider that to be a bug in
PyYAML, so a future commit may change the behavior in the `dict` case
to match the non-`dict` behavior.  (This commit uses `dictclass` for
the non-`dict` case to match what appears to be the original
intention.)

Behavior before:

```pycon
>>> import salt.utils.yaml as y
>>> import collections
>>> y.load("!!omap [{foo: bar}, {baz: bif}]")
[('foo', 'bar'), ('baz', 'bif')]
>>> y.SaltYamlSafeLoader("!!omap [{foo: bar}, {baz: bif}]", dictclass=collections.OrderedDict).get_single_data()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "venv/lib/python3.8/site-packages/yaml/constructor.py", line 51, in get_single_data
    return self.construct_document(node)
  File "venv/lib/python3.8/site-packages/yaml/constructor.py", line 60, in construct_document
    for dummy in generator:
  File "salt/utils/yamlloader.py", line 42, in construct_yaml_omap
    return (yield from self.construct_yaml_map(node))
  File "salt/utils/yamlloader.py", line 36, in construct_yaml_map
    value = self.construct_mapping(node)
  File "salt/utils/yamlloader.py", line 52, in construct_mapping
    raise ConstructorError(
yaml.constructor.ConstructorError: expected a mapping node, but found sequence
  in "<unicode string>", line 1, column 1
```

Behavior after:

```pycon
>>> import salt.utils.yaml as y
>>> import collections
>>> y.load("!!omap [{foo: bar}, {baz: bif}]")
[('foo', 'bar'), ('baz', 'bif')]
>>> y.SaltYamlSafeLoader("!!omap [{foo: bar}, {baz: bif}]", dictclass=collections.OrderedDict).get_single_data()
OrderedDict([('foo', 'bar'), ('baz', 'bif')])
```

Relevant bug: saltstack#12161
Now the behavior is consistent regardless of the value of the
Loader's `dictclass` constructor parameter.

Starting with Python 3.6, Python `dict` objects always iterate in
insertion order, so iteration order was already guaranteed when using
an ordinary YAML map.  However, `!!omap` provides a stronger guarantee
to users (plain map iteration order can be thought of as a Salt
implementation detail that might change in the future), and it allows
them to make their `.sls` files self-documenting.  For example,
instead of:

```yaml
my_pillar_data:
  key1: val1
  key2: val2
```

users can now do:

```yaml
my_pillar_data: !!omap
  - key1: val1
  - key2: val2
```

to make it clear to readers that the entries are intended to be
processed in order.

Behavior before:

```pycon
>>> import salt.utils.yaml as y
>>> import collections
>>> y.load("!!omap [{foo: bar}, {baz: bif}]")
[('foo', 'bar'), ('baz', 'bif')]
>>> y.SaltYamlSafeLoader("!!omap [{foo: bar}, {baz: bif}]", dictclass=collections.OrderedDict).get_single_data()
OrderedDict([('foo', 'bar'), ('baz', 'bif')])
```

Behavior after:

```pycon
>>> import salt.utils.yaml as y
>>> import collections
>>> y.load("!!omap [{foo: bar}, {baz: bif}]")
OrderedDict([('foo', 'bar'), ('baz', 'bif')])
>>> y.SaltYamlSafeLoader("!!omap [{foo: bar}, {baz: bif}]", dictclass=collections.OrderedDict).get_single_data()
OrderedDict([('foo', 'bar'), ('baz', 'bif')])
```

Relevant bug: saltstack#12161
My main motivation for adding this is to facilitate testing, though it
also gives module authors greater flexibility (in particular: a
`tuple` can be a `dict` key, but a `list` cannot because it is not
hashable), and I don't see a strong reason why this shouldn't be
added.

Behavior before:

```pycon
>>> import salt.utils.yaml as y
>>> y.load("!!python/tuple [foo, bar]")
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "salt/utils/yamlloader.py", line 159, in load
    return yaml.load(stream, Loader=Loader)
  File "venv/lib/python3.8/site-packages/yaml/__init__.py", line 81, in load
    return loader.get_single_data()
  File "venv/lib/python3.8/site-packages/yaml/constructor.py", line 51, in get_single_data
    return self.construct_document(node)
  File "venv/lib/python3.8/site-packages/yaml/constructor.py", line 55, in construct_document
    data = self.construct_object(node)
  File "venv/lib/python3.8/site-packages/yaml/constructor.py", line 100, in construct_object
    data = constructor(self, node)
  File "venv/lib/python3.8/site-packages/yaml/constructor.py", line 427, in construct_undefined
    raise ConstructorError(None, None,
yaml.constructor.ConstructorError: could not determine a constructor for the tag 'tag:yaml.org,2002:python/tuple'
  in "<unicode string>", line 1, column 1
```

Behavior after:

```pycon
>>> import salt.utils.yaml as y
>>> y.load("!!python/tuple [foo, bar]")
('foo', 'bar')
```
Override the implicit resolver for timestamp strings so that
`datetime.datetime` objects are always written with an accompaying
`!!timestamp` tag.

This is a behavior change: Any users that depend on a `datetime`
becoming a plain YAML string (which would be read in as a `str`
object) must convert the object to string themselves.

This change preserves the semantics of the object, and it preserves
round-trip identity.

Behavior before:

```pycon
>>> import datetime
>>> import salt.utils.yaml as y
>>> print(y.dump(datetime.datetime(2022, 10, 21, 18, 16, 3, 100000, tzinfo=datetime.timezone(datetime.timedelta(hours=-4)))))
2022-10-21 18:16:03.100000-04:00

```

Behavior after:

```pycon
>>> import datetime
>>> import salt.utils.yaml as y
>>> print(y.dump(datetime.datetime(2022, 10, 21, 18, 16, 3, 100000, tzinfo=datetime.timezone(datetime.timedelta(hours=-4)))))
!!timestamp 2022-10-21 18:16:03.100000-04:00

```
Behavior before:

```pycon
>>> from salt.utils.odict import OrderedDict
>>> import salt.utils.yaml as y
>>> import collections
>>> print(y.dump(OrderedDict([("foo", "bar")]), default_flow_style=False))
foo: bar

>>> print(y.safe_dump(OrderedDict([("foo", "bar")]), default_flow_style=False))
foo: bar

>>> print(y.dump(collections.OrderedDict([("foo", "bar")]), default_flow_style=False))
!!python/object/apply:collections.OrderedDict
- - - foo
    - bar

>>> print(y.safe_dump(collections.OrderedDict([("foo", "bar")]), default_flow_style=False))
NULL

```

Behavior after:

```pycon
>>> from salt.utils.odict import OrderedDict
>>> import salt.utils.yaml as y
>>> import collections
>>> print(y.dump(OrderedDict([("foo", "bar")]), default_flow_style=False))
foo: bar

>>> print(y.safe_dump(OrderedDict([("foo", "bar")]), default_flow_style=False))
foo: bar

>>> print(y.dump(collections.OrderedDict([("foo", "bar")]), default_flow_style=False))
foo: bar

>>> print(y.safe_dump(collections.OrderedDict([("foo", "bar")]), default_flow_style=False))
foo: bar

```
Before, an `OrderedDict` object was represented as a plain YAML
mapping.  Now it is represented as an `!!omap` node, which is a
sequence of single-valued mappings.

This is a behavior change: Any users that depend on an `OrderedDict`
becoming a plain YAML mapping (which would be read in as a `dict`
object) must first convert the `OrderedDict` to `dict`.

This change preserves the semantics of the object, and it preserves
round-trip identity.

Behavior before:

```pycon
>>> from salt.utils.odict import OrderedDict
>>> import salt.utils.yaml as y
>>> import collections
>>> print(y.dump(OrderedDict([("foo", "bar")]), default_flow_style=False))
foo: bar

>>> print(y.safe_dump(OrderedDict([("foo", "bar")]), default_flow_style=False))
foo: bar

>>> print(y.dump(collections.OrderedDict([("foo", "bar")]), default_flow_style=False))
foo: bar

>>> print(y.safe_dump(collections.OrderedDict([("foo", "bar")]), default_flow_style=False))
foo: bar

```

Behavior after:

```pycon
>>> from salt.utils.odict import OrderedDict
>>> import salt.utils.yaml as y
>>> import collections
>>> print(y.dump(OrderedDict([("foo", "bar")]), default_flow_style=False))
!!omap
- foo: bar

>>> print(y.safe_dump(OrderedDict([("foo", "bar")]), default_flow_style=False))
!!omap
- foo: bar

>>> print(y.dump(collections.OrderedDict([("foo", "bar")]), default_flow_style=False))
!!omap
- foo: bar

>>> print(y.safe_dump(collections.OrderedDict([("foo", "bar")]), default_flow_style=False))
!!omap
- foo: bar

```
Before, `!!python/tuple` was only used for `OrderedDumper`.  Now it is
also used for `SafeOrderedDumper` and `IndentedSafeOrderedDumper`.

This is a behavior change: Any users that depended on a `tuple`
becoming a plain YAML sequence (which would be read in as a `list`
object) must first convert the `tuple` to `list`.

This change preserves the semantics of the object, and it preserves
round-trip identity.  Preserving round-trip identity is particularly
important if the tuple is used as a key in a `dict` because `list`
objects are not hashable.

Behavior before:

```pycon
>>> import salt.utils.yaml as y
>>> print(y.dump(("foo", "bar"), default_flow_style=False))
!!python/tuple
- foo
- bar

>>> print(y.safe_dump(("foo", "bar"), default_flow_style=False))
- foo
- bar

```

Behavior after:

```pycon
>>> import salt.utils.yaml as y
>>> print(y.dump(("foo", "bar"), default_flow_style=False))
!!python/tuple
- foo
- bar

>>> print(y.safe_dump(("foo", "bar"), default_flow_style=False))
!!python/tuple
- foo
- bar

```
Failure to represent an object is a bug so it should be noisy, not
quietly produce `NULL`.  The representer this commit disables was
added in commit 2e2ce5f.

Behavior before:

```pycon
>>> import salt.utils.yaml as y
>>> y.safe_dump(type("GeneratedObjectSubclass", (), {}))
'NULL\n'
```

Behavior after:

```pycon
>>> import salt.utils.yaml as y
>>> y.safe_dump(type("GeneratedObjectSubclass", (), {}))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "salt/utils/yamldumper.py", line 270, in safe_dump
    return dump(data, stream, Dumper=SafeOrderedDumper, **kwargs)
  File "salt/utils/yamldumper.py", line 261, in dump
    return yaml.dump(data, stream, **kwargs)
  File "venv/lib/python3.8/site-packages/yaml/__init__.py", line 253, in dump
    return dump_all([data], stream, Dumper=Dumper, **kwds)
  File "venv/lib/python3.8/site-packages/yaml/__init__.py", line 241, in dump_all
    dumper.represent(data)
  File "venv/lib/python3.8/site-packages/yaml/representer.py", line 27, in represent
    node = self.represent_data(data)
  File "venv/lib/python3.8/site-packages/yaml/representer.py", line 58, in represent_data
    node = self.yaml_representers[None](self, data)
  File "venv/lib/python3.8/site-packages/yaml/representer.py", line 231, in represent_undefined
    raise RepresenterError("cannot represent an object", data)
yaml.representer.RepresenterError: ('cannot represent an object', <class '__main__.GeneratedObjectSubclass'>)
```
@dwoz
Copy link
Contributor

dwoz commented Dec 10, 2023

Closing this due to inactivity. Anyone should feel free to re-open it if they want to see it through to the end in one release cycle.

@dwoz dwoz closed this Dec 10, 2023
@dwoz dwoz added help-wanted Community help is needed to resolve this Abandoned labels Dec 10, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Abandoned help-wanted Community help is needed to resolve this
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants