I have an object that I want to dump to YAML. The catch is that I need to be able to write !anything into it without quotes.

When I try it with PyYAML, I end up with '!anything' (quoted) in my YAML file.
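A minimal reproduction of the quoting, as a sketch: the one-key dictionary is a made-up stand-in for the real data, and default_flow_style=False is passed explicitly only so that block style is used regardless of the PyYAML version.

```python
import yaml

# A plain Python string that happens to start with "!".
data = {"confidence": "!anything"}

# PyYAML quotes the value so that it cannot be mistaken for a tag on loading.
dumped = yaml.dump(data, default_flow_style=False)
print(dumped)
```

The value comes out in single quotes because a plain scalar may not start with the "!" indicator.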

I've already tried ruamel.yaml's PreservedScalarString and LiteralScalarString. That sort of works, but not in the way I need. I end up with YAML that looks like this:

10.1.1.16:
            text: '1470814.27'
            confidence: |-
              !anything

But I don't want the |- indicator.

My goal is to get yaml like this:

10.1.1.16:
            text: '1470814.27'
            confidence: !anything

Any ideas how I can achieve that?

MrLalatg
  • In YAML `!anything` (and, generally, anything starting with a `!`) is a tag (i.e. not content). If you want `!anything` to be treated as content, you *must* use single or double quotes or a block scalar (`|-` or `>-`). – flyx Sep 01 '20 at 13:12
  • @flyx So what should I do if I want to dump the tag itself? I need to be able to write it from my Python code. – MrLalatg Sep 01 '20 at 13:15

2 Answers

To dump a custom tag, you need to define a type and register a representer for that type. Here's how to do it for scalars:

import yaml

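class MyTag:
    """Wrapper type whose instances are dumped with the !anything tag."""

    def __init__(self, content):
        self.content = content

    def __repr__(self):
        return self.content

    def __str__(self):
        return self.content

def mytag_dumper(dumper, data):
    # Emit the wrapped content as a scalar that carries the custom tag.
    return dumper.represent_scalar("!anything", data.content)

# Tell PyYAML to use mytag_dumper for every MyTag instance.
yaml.add_representer(MyTag, mytag_dumper)

print(yaml.dump({"10.1.1.16": {
    "text": "1470814.27",
    "confidence": MyTag("")}}))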

This emits

10.1.1.16:
  confidence: !anything ''
  text: '1470814.27'

Note the '' after the tag, which is the tagged scalar (no, you can't get rid of it). You can tag collections as well, but then you'll need to use represent_sequence or represent_mapping instead of represent_scalar.
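For tagged collections, something along these lines should work. This is only a sketch: the TaggedList and TaggedDict classes and the !mylist and !mymap tags are made-up names for illustration, and default_flow_style=False is passed so the output is in block style regardless of the PyYAML version.

```python
import yaml

class TaggedList(list):
    """Hypothetical list subclass that should be dumped with a tag."""

class TaggedDict(dict):
    """Hypothetical dict subclass that should be dumped with a tag."""

def tagged_list_representer(dumper, data):
    # Represent the items as a sequence node carrying the custom tag.
    return dumper.represent_sequence("!mylist", list(data))

def tagged_dict_representer(dumper, data):
    # Represent the items as a mapping node carrying the custom tag.
    return dumper.represent_mapping("!mymap", dict(data))

yaml.add_representer(TaggedList, tagged_list_representer)
yaml.add_representer(TaggedDict, tagged_dict_representer)

dumped = yaml.dump(
    {"hosts": TaggedList(["10.1.1.16"]), "meta": TaggedDict(text="1470814.27")},
    default_flow_style=False,
)
print(dumped)
```

The tag ends up directly after the key, in front of the collection it applies to.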

flyx
  • First streaming to a StringIO buffer and then printing the content of that buffer is unnecessarily inefficient (in both memory consumption and time) compared to streaming directly to `sys.stdout`. So you should always use `yaml.dump(data, sys.stdout)` instead of `print(yaml.dump(data))`. – Anthon Sep 02 '20 at 07:25

Contrary to @flyx's comment, a tag in YAML does not need to be followed by a single- or double-quoted scalar (or a block scalar). You can try Oren Ben-Kiki's reference parser (programmatically derived from the YAML specification) to confirm that your expected output is valid YAML.

Empty content is normally loaded as None in Python (by both the outdated PyYAML and ruamel.yaml). Tagged empty content can of course only indicate the existence of a particular instance, without indicating any value.
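To illustrate both points with PyYAML, here is a small sketch. The Anything marker class and the DemoLoader subclass are made-up names, and registering the constructor on a subclass is just one way to avoid changing the global SafeLoader.

```python
import yaml

# Untagged empty content loads as None.
plain = yaml.safe_load("confidence:\n")
print(plain)

class Anything:
    """Hypothetical marker type for the !anything tag."""

class DemoLoader(yaml.SafeLoader):
    """Loader subclass so the constructor is not registered globally."""

# Map the tag to a marker instance; the (empty) scalar value is ignored.
DemoLoader.add_constructor("!anything", lambda loader, node: Anything())

tagged = yaml.load("confidence: !anything\n", Loader=DemoLoader)
print(type(tagged["confidence"]).__name__)
```

The tagged node carries no value, so the only information that survives loading is which type was constructed.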

ruamel.yaml can perfectly well round-trip your expected output:

import sys
from ruamel.yaml import YAML

yaml_str = """\
10.1.1.16:
  text: '1470814.27'
  confidence: !anything
"""

yaml = YAML()
data = yaml.load(yaml_str)

yaml.dump(data, sys.stdout)

gives:

10.1.1.16:
  text: '1470814.27'
  confidence: !anything

You can generate an object that dumps just the tag without a value from scratch (as the parser does), but if you don't want to go into the details, you can just load the tagged object and add it to your data structure:

import sys
import ruamel.yaml


yaml = ruamel.yaml.YAML()


def tagged_empty_scalar(tag):
    return yaml.load('!' + tag)

data = {'10.1.1.16': dict(text='1470814.27', confidence=tagged_empty_scalar('anything'))}

yaml.dump(data, sys.stdout)

You can get the exact same result in PyYAML, without the quotes, but that is more complicated.
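One possible way to do that in PyYAML, as a rough sketch rather than a recommended recipe: subclass the Dumper and override choose_scalar_style so that an empty scalar with a local tag is written as an (empty) plain scalar. This leans on emitter internals (choose_scalar_style, simple_key_context, canonical) that are not part of the documented API and may change between versions. MyTag and BareTagDumper are made-up names.

```python
import yaml

class MyTag:
    """Hypothetical marker type that is dumped as a bare !anything tag."""

class BareTagDumper(yaml.Dumper):
    def choose_scalar_style(self):
        event = self.event
        # For an empty scalar with a local ("!...") tag, pick the plain
        # style so the emitter writes only the tag and no quotes.
        if (event.value == "" and event.tag is not None
                and event.tag.startswith("!")
                and not self.simple_key_context
                and not self.canonical):
            return ""
        return super().choose_scalar_style()

def mytag_representer(dumper, data):
    # The scalar itself is empty; only the tag carries information.
    return dumper.represent_scalar("!anything", "")

# Register on the subclass only, leaving the default Dumper untouched.
BareTagDumper.add_representer(MyTag, mytag_representer)

dumped = yaml.dump(
    {"10.1.1.16": {"text": "1470814.27", "confidence": MyTag()}},
    Dumper=BareTagDumper,
    default_flow_style=False,
)
print(dumped)
```

Because the override only applies to empty scalars with a local tag that are not mapping keys, everything else is still quoted the way PyYAML normally does it.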

Anthon