Open
Description
I'm not entirely sure what the intent is here, so I hesitate to file a PR. We saw some errors thrown by our webapp (using gunicorn) and traced them to request.encget():
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/webob/request.py", line 495, in url
    url = self.path_url
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/webob/request.py", line 467, in path_url
    bpath_info = bytes_(self.path_info, self.url_encoding)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/webob/descriptors.py", line 70, in fget
    return req.encget(key, encattr=encattr)
  File "/layers/google.python.pip/pip/lib/python3.9/site-packages/webob/request.py", line 165, in encget
    return bytes_(val, 'latin-1').decode(encoding)
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xc0 in position 66: invalid start byte
My read of util.bytes_ is that, when passed a string, it performs val.encode() on it. So the following code in encget():

    return bytes_(val, "latin-1").decode(encoding)

is the same as doing:

    return val.encode("latin-1", "strict").decode(encoding)

Based on our exception we can see that the value of encoding is "utf-8", which gives us:

    return val.encode("latin-1", "strict").decode("utf-8")

or with a specific example that will fail:

    x = "À".encode('latin-1').decode('utf-8')

I'm not sure why we'd ever explicitly encode a string as latin-1 and then decode it as UTF-8 in the first place. A simpler return val.encode(encoding) would seem more appropriate here, but again, there's probably nuance that I'm not understanding, hence the issue report.
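To make the failing case easy to reproduce outside the webapp, here is a minimal standalone sketch of the expression quoted above. The helper name latin1_roundtrip is hypothetical and is not part of WebOb; it only mirrors the encode-then-decode step. As possible background (not confirmed for this code path by anything in this report), PEP 3333 has WSGI servers on Python 3 pass environ values as native strings holding the raw bytes decoded as ISO-8859-1, which might explain the latin-1 step: under that assumption the round trip succeeds when the underlying bytes are valid UTF-8 and fails when they are not.

```python
def latin1_roundtrip(val: str, encoding: str = "utf-8") -> str:
    """Hypothetical stand-in for the quoted expression in encget():
    re-encode the text as latin-1 to get bytes, then decode as `encoding`."""
    return val.encode("latin-1", "strict").decode(encoding)


A_GRAVE = "\u00c0"  # the character "À" from the failing example

# Case 1: the string holds UTF-8 bytes that were decoded as latin-1
# (the PEP 3333 style representation, assumed here). The round trip
# recovers the original character.
wsgi_style_value = A_GRAVE.encode("utf-8").decode("latin-1")
recovered = latin1_roundtrip(wsgi_style_value)
assert recovered == A_GRAVE

# Case 2: the string holds the character directly, so the latin-1 bytes
# are b"\xc0", which is not a valid UTF-8 start byte.
try:
    latin1_roundtrip(A_GRAVE)
    raised = False
except UnicodeDecodeError as exc:
    raised = True
    print("decode failed:", exc)

assert raised
```

If that background applies, the error in our traceback would mean the request path contained a raw 0xc0 byte that is not valid UTF-8, rather than a correctly encoded "À".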