Fix yaml dump in string templates #32

Open
wants to merge 1 commit into
base: master

tsvetkov-ii

yaml.dump always adds '\n...\n' at the end of the stream. It can't be disabled, and it causes errors when templating something in strings.

This PR fixes this behavior.

@@ -0,0 +1,33 @@
#
# Copyright 2017, Rambler Digital Solutions

Member

The year should be 2019


# yaml.dump always end the string with '\n...\n' even if explicit_end is False
# so just replace it
return yaml.dump(schema, Dumper=Dumper, *args, **kwargs).replace("\n...\n", "")

Member

Ellipsis in yaml is an end-of-document mark: https://yaml.org/spec/1.2/spec.html#id2760395

I guess it is safe to remove it from the end, and I suppose it shouldn't ever be rendered in the middle (which would mean that dump has produced multiple yaml documents with a single dump call, which is rather unexpected). So probably this should be good as is.

But I would propose to change that replace so it would strip the ellipsis only from the end of the document. Something like this:

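# requires `import re` at module level
yaml_doc = yaml.dump(schema, Dumper=Dumper, *args, **kwargs).rstrip("\n")
# Strip the end-of-document mark only when it is the final line of the output
# (re.MULTILINE would make `$` match at the end of every line).
yaml_doc = re.sub(r"\n\.\.\.\Z", "", yaml_doc)
return yaml_doc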
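
To make the difference between a blanket replace and an end-anchored strip concrete, here is a minimal sketch that works on plain strings, so it does not depend on PyYAML at all. The helper name `strip_document_end` and the sample strings are hypothetical and written by hand; like the replace in the PR, the helper also drops the final newline that follows the marker.

```python
import re

# Matches a '...' line only when it is the last line of the text.
_DOCUMENT_END = re.compile(r"\n\.\.\.\n?\Z")


def strip_document_end(text):
    """Remove one trailing '...' document-end line, if present (hypothetical helper)."""
    return _DOCUMENT_END.sub("", text)


def blanket_replace(text):
    """The approach from the PR: remove the marker wherever it appears."""
    return text.replace("\n...\n", "")


# Hand-written samples that mimic dumped output.
scalar_output = "foo\n...\n"
mapping_output = "a: b...\n"
multi_document = "first\n...\n--- second\n...\n"

print(repr(strip_document_end(scalar_output)))
print(repr(strip_document_end(mapping_output)))
print(repr(strip_document_end(multi_document)))
print(repr(blanket_replace(multi_document)))
```

On the single-scalar case both approaches agree. They differ only if a marker shows up before the end of the text, where the blanket replace glues the neighbouring lines together.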

Member

I don't think this is the right solution, since yaml has an explicit_end parameter and the document-end marker is optional. Something must be going wrong for it to occur. We need to investigate what and why.

Member

Solving this via pyyaml would definitely be better, but it might be tricky.

This commit seems to be the reason: yaml/pyyaml@5413848

But simply setting open_ended = False doesn't help, because it then gets set back to True. So the following seems to work:

diff --git a/src/airflow_declarative/schema.py b/src/airflow_declarative/schema.py
index 5f2ca6e..1517b69 100644
--- a/src/airflow_declarative/schema.py
+++ b/src/airflow_declarative/schema.py
@@ -80,10 +80,18 @@ def dump(schema, *args, **kwargs):

     # yaml.dump always end the string with '\n...\n' even if explicit_end is False
     # so just replace it
-    return yaml.dump(schema, Dumper=Dumper, *args, **kwargs).replace("\n...\n", "")
+    return yaml.dump(schema, Dumper=Dumper, *args, **kwargs)


 class Dumper(yaml.SafeDumper):
+    @property
+    def open_ended(self):
+        return False
+
+    @open_ended.setter
+    def open_ended(self, value):
+        pass
+
     def ignore_aliases(self, data):
         return True

Although I'm wondering if this solution is better than the regexp 😅
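
The property trick from the diff can be looked at in isolation. The sketch below uses a stand-in class, `FakeEmitter`, which is hypothetical and only mimics an emitter that switches the flag back on while writing; it is not PyYAML's code. It shows why a read-only property with a no-op setter holds the value where a plain assignment would be overwritten.

```python
class FakeEmitter:
    """Hypothetical stand-in for an emitter that flips a flag while writing."""

    def __init__(self):
        self.open_ended = False

    def write_root_scalar(self, text):
        # Mimics the behaviour described above: the flag is forced on,
        # so an earlier `open_ended = False` assignment is lost.
        self.open_ended = True
        return text + "\n"

    def finish(self, body):
        # Append the document-end marker only when the flag is set.
        return body + ("...\n" if self.open_ended else "")


class PinnedEmitter(FakeEmitter):
    @property
    def open_ended(self):
        return False

    @open_ended.setter
    def open_ended(self, value):
        # Swallow every assignment, including those made by the base class.
        pass


def render(emitter_cls, text):
    emitter = emitter_cls()
    return emitter.finish(emitter.write_root_scalar(text))


print(repr(render(FakeEmitter, "foo")))
print(repr(render(PinnedEmitter, "foo")))
```

Because the property is defined on the subclass, assignments made inside the base class methods go through the no-op setter as well, which is the effect the diff relies on.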

Member

It's definitely better than a blind replace on data we know nothing about. However, maybe we are the ones using yaml the wrong way here?

Member

We know pretty well that an ellipsis at the very end of a YAML document is an explicit end mark -- that is in the specs. So this is not a blind replacement.

I don't think it's a matter of incorrect usage. They set open_ended to True for a root node, effectively forcing the trailing ellipsis. And they don't seem to have that explicit_end argument covered with tests, so I would assume that it is broken in upstream.

https://github.com/yaml/pyyaml/blob/5413848f2ba250cc2c70f0192893a4a9626a8209/lib/yaml/emitter.py#L1075-L1076

Member

Others strip the ellipsis manually as well: https://stackoverflow.com/a/56988010
