Skip to content

Conversation

@itchyny
Copy link
Contributor

@itchyny itchyny commented Oct 3, 2025

Currently, decoding +0xFFFFFFFFFFFFFFFF into uint64 fails because strconv.ParseUint does not accept the + sign.

Copilot AI review requested due to automatic review settings October 3, 2025 14:40
Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR fixes an issue where YAML decoding of unsigned 64-bit integers fails when the number includes a plus sign prefix (e.g., +0xFFFFFFFFFFFFFFFF). The fix removes the plus sign before parsing to make it compatible with Go's strconv.ParseUint function.

  • Modified the parsing logic to strip plus signs from numeric strings before converting to uint64
  • Added comprehensive test cases for both decimal and hexadecimal numbers with plus signs

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.

File Description
resolve.go Updated uint64 parsing to handle plus sign prefix by trimming it before calling strconv.ParseUint
decode_test.go Added test cases for decimal and hexadecimal uint64 values with plus sign prefixes

Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.

@itchyny itchyny force-pushed the fix-uint64-plus-sign branch from 24249b3 to c17655a Compare October 4, 2025 12:06
@dolmen
Copy link

dolmen commented Oct 5, 2025

AFAIK '+' isn't allowed in integer representation.

https://yaml.org/spec/1.2.2/#10213-integer

@itchyny
Copy link
Contributor Author

itchyny commented Oct 5, 2025

Why decimal: +12345 is written in 2.4. Tags then?

@itchyny
Copy link
Contributor Author

itchyny commented Oct 5, 2025

Anyway, the parser already parses the plus sign, and this PR fixes a corner case, so this is not the right place to discuss on dropping the plus sign support. Currently +0x7FFFFFFFFFFFFFFF is accepted as an integer but +0x8FFFFFFFFFFFFFFF is not.

@ccoVeille
Copy link
Contributor

ccoVeille commented Oct 5, 2025

Anyway, the parser already parses the plus sign, and this PR fixes a corner case, so this is not the right place to discuss on dropping the plus sign support. Currently +0x7FFFFFFFFFFFFFFF is accepted as an integer but +0x8FFFFFFFFFFFFFFF is not.

I had the same consideration before validating the PR. If the rest supports it, I see no problem with it.

AFAIK '+' isn't allowed in integer representation.

https://yaml.org/spec/1.2.2/#10213-integer

I checked yesterday before validating the PR

As I found nothing clear I tried with other libraries that parse YAML (in other languages), all of them where allowing the +

So I validated it.

Why decimal: +12345 is written in 2.4. Tags then?

Indeed.

@itchyny
Copy link
Contributor Author

itchyny commented Oct 5, 2025

When encoding a value in the canonical form, the + sign is not allowed. However, the core schema allows to resolve the numbers with + sign.

@ingydotnet
Copy link
Member

This topic falls under the concept of YAML Schemas https://yaml.org/spec/1.2.2/#chapter-10-recommended-schemas
which has little to do with what you likely think of as a schema. It's an unfortunate name in the spec. Let's use the term yaml-schema consistently. A yaml-schema controls tag resolution (and normalization but let's not talk about that now).
In the YAML load process, every untagged node is assigned a tag. The tag is associated with a constructor function that converts the generic YAML data into a native data structure.

YAML was designed so that a loader can be configured to use any yaml-schema rules it wants to. The spec then defines 3 basic yaml-schemas, failsafe, json (which doesn't actually follw all the json rules as you'd expect) and core.
For the most part, loaders are asked to implement the core yaml-schema as the default.

go-yaml can actually choose whatever makes the most sense for Go. What I really want in the world is an easy standard way for any domain (like say kubernetes) to define their own yaml-schemas and have go-yaml accept a yaml-schema as a loader config option.

Back to the problem at hand. The Core yaml-schema https://yaml.org/spec/1.2.2/#1032-tag-resolution clearly shows that
+ and - are allowed in front of (plain scalars that become) normal base 10 integers, but they are not allowed in front of octal and hex variants. So if go-yaml wants to use (as much of as possible) the 1.2 Core yaml-schema as the default, it should follow those rules.

But we can also provide other more sensible "baked-in" yaml-schema options, and eventually user defined ones.

@ingydotnet
Copy link
Member

Anyway, the parser already parses the plus sign, and this PR fixes a corner case, so this is not the right place to discuss on dropping the plus sign support. Currently +0x7FFFFFFFFFFFFFFF is accepted as an integer but +0x8FFFFFFFFFFFFFFF is not.

IN YAML "parser" means something very specific. It has no logic around a +. A parser applies the YAML spec productions against an input and creates a stream of events. So the parser reports a scalar event with a style of 'plain' (unquoted) and that info is used by a later transform called tag resolution to take that info and assign a tag of !!int or !!str etc.

A YAML "loader" is the combination of these transforms. Unfortunately, the go-yaml author chose to break from YAML terminology and use the term unmarshal and decode, which are foreign to the YAML vocabulary. I would like us all to learn and use the correct terminology for this project.

@ccoVeille
Copy link
Contributor

A YAML "loader" is the combination of these transforms. Unfortunately, the go-yaml author chose to break from YAML terminology and use the term unmarshal and decode, which are foreign to the YAML vocabulary. I would like us all to learn and use the correct terminology for this project.

Well, Unmarshall is the term used for JSON in Go standard library. Using it for a library handling YAML in Go was somehow normal to me. But I get your point.

@ccoVeille
Copy link
Contributor

ccoVeille commented Oct 12, 2025

@itchyny I didn't forget your PR.

I worked on it to try to figure what is expected in YAML, what is the existing go-yaml code doing.

I also explored the notion of tag and canonical form, as they were unclear to me.

I added 50+ unit tests, and I found bugs with the code that was already there with the v3.

I'll try to sort things and go back to you when I can.

@itchyny itchyny force-pushed the fix-uint64-plus-sign branch from c17655a to 0233287 Compare October 28, 2025 02:10
Copy link
Contributor

@ccoVeille ccoVeille left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I looked again at the PR.

The PR is fine. It brings support for something that is valid depending on the YAML specs 1.1 or 1.2

So, I would say the code is fine.

Later we may fix what is not valid depending on the spec and whether the value is used in the canonical form.

But for, now I think we are good.

Thanks @itchyny

Of course, I might be wrong and I would be happy to read feedback from others.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants