Skip to content

Conversation

@ingydotnet
Copy link
Member

@ingydotnet ingydotnet commented Aug 21, 2025

Fixes #78

When go-yaml encodes a Node it tries to use the style that the node was assigned. In this case the style is "Literal" (the YAML style that uses a |).

But it has to then look at the value to see if the string can even be emitted in the literal style.
Only double quoted style can encode any possible string value.
If it doesn't think it can or should use the literal style it switches to double quoted.
I relaxed the rule a bit.
It wasn't allowing any line with a space before a linebreak.

Copilot AI review requested due to automatic review settings August 21, 2025 04:13
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes an issue where multiline scalars with trailing whitespace can now emit literal scalar style in YAML. The changes modify the scalar analysis logic to allow literal block style for scalars that contain space-break patterns, while maintaining restrictions for other cases.

Key changes:

  • Updated scalar analysis function to check event scalar style before disallowing block literals
  • Modified test expectations to reflect literal scalar output instead of double-quoted strings
  • Added new test case for middle lines ending with spaces

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

File Description
emitterc.go Modified scalar analysis logic to allow literal block style for space-break patterns when explicitly requested
encode_test.go Updated test expectations and added new test case to verify literal scalar output for multiline strings with trailing whitespace

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@ingydotnet
Copy link
Member Author

ingydotnet commented Aug 21, 2025

I have some issues with this fix.
It changes how YAML is emitted.

See how it broke 3 existing tests, that I needed to change.

That means it also changes yq even though it is reported as a bug in yq: mikefarah/yq#566

In v4 I'd like to have a lot more options on how things are emitted.
We might want to make this new formatting only available as an option.

I'd like to hear your thoughts. (Yes, you!)

@justfalter
Copy link

justfalter commented Aug 21, 2025

I get the impression from Node.Style's docs that the encoder would obey the style when non-zero, so maybe there should be tests validating as much?

go-yaml/yaml.go

Lines 433 to 434 in 7f7bed1

// Style allows customizing the apperance of the node in the tree.
Style Style

After all, it seems that Node.SetString exists to assist with at least partially determining the appropriate style

go-yaml/yaml.go

Lines 540 to 554 in 7f7bed1

// SetString is a convenience function that sets the node to a string value
// and defines its style in a pleasant way depending on its content.
func (n *Node) SetString(s string) {
n.Kind = ScalarNode
if utf8.ValidString(s) {
n.Value = s
n.Tag = strTag
} else {
n.Value = encodeBase64(s)
n.Tag = binaryTag
}
if shouldUseLiteralStyle(n.Value) {
n.Style = LiteralStyle
}
}

And, when decoding a literal string to a Node, the node's Style is LiteralStyle ... so it would make sense to reflect that when encoding.

Comment on lines +1112 to 1113
"|\n hello \n",
"Text + spaces + newline - should be double quoted (ends with spaces)",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You talked about the change here

#79 (comment)

There is clear problem indeed

Not only the format changes (why not, after all) , but here the trailing spaces before the \n are lost

I feel like it's not a good thing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not only the format changes (why not, after all) , but here the trailing spaces before the \n are lost

I'm confused.
How are they lost?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm sorry. I feel like my comment was based on a misunderstanding.

Now, I feel like I didn't understand clearly what your change is about.

The spaces are indeed there. I was confused by the presence of \n as a first character of the marshaled.

I feel like I didn't understand the change involved here. So I'll pass on this PR.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Scalar Node with Literal style encodes as Double-quoted when value has a line with trailing spaces

4 participants