-
Notifications
You must be signed in to change notification settings - Fork 31.3k
rewrite _process_parameter_type in auto_docstring.py to improve usability #42431
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
rewrite _process_parameter_type in auto_docstring.py to improve usability #42431
Conversation
|
Approved the doc build run! You should get a link to the docs with your PR in about 30 minutes - can you check that everything is working as expected? |
|
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update. |
i think yes. Could you please let me know if anything should be changed in my code for this PR? I tested this rewritten fuction several times as well. Also, i am a little bit new to git and github so sometimes i might mess up PR or commits. Sorry for that in advance. @Rocketknight1 |
bf593b7 to
f951d0c
Compare
|
i am wondering why sometimes tests failed but sometimes didn't? And my changes have nothing to do with these failures |
|
Maybe i should provide more details about this PR. But, auto_docstring would call _process_parameter_type to go through every single parameter's type hint to call this type hint's annotation.__name__. But before python 3.14, bool|int is not the same as Union[bool, int]. It means type hint like "bool|hint" has no attribute __name__. Then _process_parameter_type would raise errors, which prevents users from using kimi 48B instruct models. Btw, the downstream users of kimi 48B noticed this problem then they changed it but it seems that they didn't notify kimi this problem:
(coming from modelling_kimi.py of cerebras/Kimi-Linear-REAP-35B-A3B-Instruct)
My PR could solve this problem. Otherwise, python 3.14 is needed to use kimi 48B instruct model and i don't think transformers users could switch to python 3.14 easily |
f951d0c to
98d26ea
Compare
Rocketknight1
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hey @Yacklin, I took a look at the code! I think the bug you're fixing is real, and I think the cause is that we do lots of hacky accesses to things like annotation.__name__, like you said. I don't like if ":" in parameter_str though - doing these kinds of string manipulations also feels very hacky. It's likely to create bugs in future if anything changes with typing, or if anyone writes an unusual type hint.
Is there a clean way to correctly resolve all type hints into some kind of canonical form, and then just operate on that, rather than relying on string representations and attribute accesses?
Hi @Rocketknight1 , string manipulation is unavoidable in this function because this function would return something like tuple[str, bool]. But i could try to write an implmentation to minimize string manipulation. |
|
Hi @Yacklin, some manipulation is unavoidable at the end, but I think you should always be able to get the type hint from |
Hi @Rocketknight1 , This situation is more complex than i thought. Please read my comments and implementations carefully. Let me know what you think. import inspect
import types
import typing
from typing import get_origin, Union, get_args, Optional
def _process_parameter_type(param, param_name, func):
"""
Process and format a parameter's type annotation.
Args:
param (`inspect.Parameter`): The parameter from the function signature
param_name (`str`): The name of the parameter
func (`function`): The function the parameter belongs to
"""
"""
the minimum version of python that transformers supports is 3.9.0.
but this is the source of problem.
if user who is using python 3.9.x tried to access models that are using features coming from types.UnionType, like int|float, which is introduced in python 3.10,
error would be raised (even before this function is executed). Python developers came up with an idea that would squash int|float into string "int|float", using 'from __future__ import annotations'.
But this is something models developers should care, transformers developers could only warn models developers to solve this issue in their own model script.
the best way to solve this issue is to level up the minimum python version from 3.9 to 3.10
if user is using python 3.9, this error would be raised before this function is hit.
unfortunately, most models developers didn't pay attention to it.
If python 3.9 users tried to access models that are using features coming from types.UnionType, like int|float, which is introduced in python 3.10,
users have to contact models developers to support backward-compatibility. Transformers team has nothing to do with it, unless upgrading min version from
3.9 to 3.10, through voting or something.
"""
# this PR only takes python 3.10 into consideration, unfortunately.
param_annotation = param.annotation
print(param_annotation)
default_value = param.default
origin = get_origin(param_annotation)
args = get_args(param_annotation)
optional_flag = False
type_hint_str = ""
# Optional[sth], under the hood is equivalent to Union[sth, None]
# if the type hint is None, I mean something like age:None, it would be considered optional, very few developers would write it though.
# if the type hint is Optional[None] or Union[None], it would be considered optional, very few developers would write it though.
if (
(origin is typing.Union and type(None) in args)
or default_value is not inspect._empty
or (origin is types.UnionType and type(None) in args)
or param_annotation is types.NoneType
or param_annotation is None
):
optional_flag = True
"""
okay here is another problem, to get the string representation of type hint int, param.annotation.__name__ has to be accessed BUT
types.UnionType, like int|str, before python 3.14 has no attribute __name__, to get the string representation of int|str, access param.annotation ONLY.
"""
param_str: str = str(param)
if param_annotation is not inspect._empty:
if "=" in param_str:
type_hint_str = (
param_str.split(":", maxsplit=1)[1].split("=", maxsplit=1)[0].strip()
)
else:
type_hint_str = param_str.split(":", maxsplit=1)[1].strip()
return type_hint_str, optional_flagTo make sure python 3.9 users could use features like int|str, which is introduced in python 3.10 correctly, what IDE and PEP suggests is to use "from __future__ import annotations", within from __future__ import annotations, you could see it clearly that type hints would all be converted to strings at Runtime. |
98d26ea to
bf887c0
Compare
|
I can confirm that it fixed the issue and I was able to get it working, thank you. |
But i need to make reviewers happy about my changes, which is harder than writing a PR |





What does this PR do?
When using kimi model, kimi developers used something like "bool|int" as type hint for parameters in function signature and unfortunately, due to python's nature, format of type hint like "bool|int" is not completely the same as "typing.Union[bool, int]" until python 3.14. One of the biggest differences is when accessing param.annotation.__name__, "bool|int" would raise error that has no attribute "__name__". This error bothers me a lot so i have to fix it. I found a beautiful way to fix it and its backward compatibility could be extended to 3.4 (I just tried it a little bit but you could try versions before 3.4).
Please be advised that param_name and func are unused in this rewrited version. I chose not to remove these two parameters just in case something bad would happen.
Before submitting
Pull Request section?
to it if that's the case.
documentation guidelines, and
here are tips on formatting docstrings.
Who can review?
Anyone in the community is free to review the PR once the tests have passed. Feel free to tag
members/contributors who may be interested in your PR.
@Rocketknight1
(I accidentally deleted the previous PR, here is the new one)