Several possible problems within the ZPublisher Converters module #558
IIUC, file upload inputs in a POST request are interpreted as file-like objects in the request. This code seems to imply that you can apply the |
As Leo said, what comes into the converter may be a file-like object, and to get to the content Zope/src/ZPublisher/Converters.py Line 186 in 5733da7
This is really lame. This translates to About that |
P.S.: This blocking issue may be related: #271 |
Some more thoughts... Both Zope/src/ZPublisher/Converters.py Lines 33 to 39 in bdc1d21
and Zope/src/ZPublisher/Converters.py Lines 48 to 51 in bdc1d21
return v.read() for file-like objects, without any conversion steps.
I have to admit I lack some knowledge in this area (which I would like to improve by working on this issue), but I cannot imagine this can be true. Also, I generally find the code in this module a bit confusing, i.e. the partial use of a converter class, the many duplications, and the fact that some "public" functions call each other instead of reusing some common functionality. I'd probably write a couple of decorators, like
Any thoughts on this? |
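To make the concern about `v.read()` concrete, here is a minimal sketch. It is not the Zope code: `to_string_sketch` is a hypothetical stand-in that only mimics the pattern described above (anything with a `read` attribute is read and returned unchanged), and the UTF-8 decoding of plain bytes is an assumption of the sketch. Under Python 3, a binary file-like object hands back bytes, so a "string" converter written this way would return bytes for uploads.

```python
import io


def to_string_sketch(v):
    """Hypothetical stand-in for a "string" converter (not Zope's code)."""
    # Pattern described in the thread: file-like objects are read and
    # returned as-is, with no further conversion step.
    if hasattr(v, "read"):
        return v.read()
    # Assumption of this sketch: plain bytes are decoded as UTF-8.
    if isinstance(v, bytes):
        return v.decode("utf-8")
    return str(v)


plain = to_string_sketch(b"hello")
uploaded = to_string_sketch(io.BytesIO(b"hello"))

print(type(plain).__name__)     # text type
print(type(uploaded).__name__)  # whatever read() returned
```

The two calls receive the same payload but can come back with different types, which is the inconsistency questioned in this comment.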
Also.. how does this package and all the duplicated functions relate to the ZPublisher? |
Jürgen Gmach wrote at 2019-4-19 02:35 -0700:
Also.. how does this package and all the duplicated functions relate to the ZPublisher?
You likely see an historical effect:
The converter functions likely were first used to
implement the ":<type>" suffixes of `ZPublisher` --
and have been defined there.
Later on, some of them were also used elsewhere, e.g. for `PropertyManager`.
A refactoring took place - but maybe not a complete one.
|
Unless you have proof that the tests in Zope itself exercise the duplicate functions in that package just disregard them. In that case you can look to see if they implement anything "better" and copy that into the ZPublisher. Otherwise, don't worry about it. |
When you read from file objects you're getting bytes back. Knowing that you can easily see that Don't think too fancy before fixing what's obviously wrong and then changing the tests that tested the obviously wrong behavior. You're approaching analysis paralysis it seems. |
Python 2 repl
Python 3 repl
That would mean that Zope/src/ZPublisher/Converters.py Lines 48 to 55 in bdc1d21
As I stated above, I still have some gaps of knowledge especially with six and Python 2 vs Python 3 in terms of text vs bytes. And as this is not paid work with a hard time limit, it's a perfect opportunity to learn a thing or two. Thanks for taking your time! Going to prepare a first pull request for increasing |
Correction for my earlier comment (after seeing your manual experiments with file objects): What comes out of a file when you |
See #560 (comment) - as @dataflake pointed out - it is still not 100% clear whether the field2string or the field2bytes behaviour was correct - I have to look into it more deeply and probably set up a better test case with a real FieldStorage object. |
Jürgen Gmach wrote at 2019-4-19 22:08 -0700:
See #560 (comment) - as @dataflake pointed out - it is still not 100% clear whether the field2string or the field2bytes behaviour was correct - I have to look into it more deeply and probably set up a better test case with a real FieldStorage object.
I have my doubts that "field2*" converters should be applied to (uploaded)
files at all. If they should, the case will be non-trivial.
The file's "MIME type" tells us whether its content is (likely)
text ("content-type=text/...") or binary ("content-type=<non-text>").
For text files, we would also need to know the "charset".
The "content-type" might contain "charset" information, but likely it does
not (as the browser likely does not have this information).
Thus, it is to be expected that text file content cannot be reliably
handled by the converters as they lack the required charset information
(in the general case). The converters cannot rely on server configuration
to determine a default charset -- as (in the general
case) the charset of uploaded files
may not have any relation to the server configuration.
In my opinion, uploaded files should always be opened in binary mode
(as important information is missing in the general case to open
them reliably in text mode) and general converters should refuse
to convert their content to text, unless the converter itself
specifies the charset to use.
An alternative would be to assume the "utf-8" charset in (general) converters
and fail when the assumption obviously fails. This might be a pragmatic
approach if dropping support to convert file content is considered too
drastic.
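The two options above (require an explicit charset, or assume "utf-8" and fail when that assumption breaks) can be sketched as follows. This is only an illustration under stated assumptions: `file_to_text_sketch` and its `charset` parameter are hypothetical names, not part of `ZPublisher`, and the code targets Python 3 only.

```python
import io


def file_to_text_sketch(fileobj, charset=None):
    """Hypothetical helper: decode an uploaded file's bytes to text."""
    data = fileobj.read()
    if not isinstance(data, bytes):
        raise TypeError("expected a file opened in binary mode")
    # Assumption of this sketch: fall back to UTF-8 when the caller
    # does not name a charset.
    encoding = charset or "utf-8"
    try:
        return data.decode(encoding)
    except UnicodeDecodeError as exc:
        # Fail loudly instead of guessing another charset.
        raise ValueError("cannot decode upload as %s" % encoding) from exc


utf8_upload = io.BytesIO("Jürgen".encode("utf-8"))
latin1_upload = io.BytesIO("Jürgen".encode("latin-1"))

text = file_to_text_sketch(utf8_upload)

try:
    file_to_text_sketch(latin1_upload)
    failed = False
except ValueError:
    failed = True

# With the charset stated explicitly, the same bytes decode fine.
latin1_upload.seek(0)
recovered = file_to_text_sketch(latin1_upload, charset="latin-1")
```

The Latin-1 upload is rejected under the UTF-8 assumption but decodes correctly once the caller supplies the charset, which is the information a general converter does not have.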
|
So, this issue is getting a bit out of hand... I'll try to wrap it up. @dataflake Your initial thoughts on where the buggy behaviour is located were right. I stepped through a file upload with a "string" type converter attached to the field and that was the result:
I closed my PR #560, which would have introduced a new bug. Thanks @dataflake ! Having now spent quite some hours in the Converters and HTTPRequest.processInputs code, I have to concur with @d-maurer - a converter should not be applied to file uploads - and should not even be aware of whether the input is a file. When you take a look at the "type_converters"... Zope/src/ZPublisher/Converters.py Lines 235 to 252 in a8208a7
Imho only one should be applicable to file uploads - field2required - which is not a type converter at all. Imho it should be pulled out of the converter module and directly applied in HTTPRequest.processInputs - similar to "ignore_empty". Zope/src/ZPublisher/HTTPRequest.py Line 620 in a8208a7
As these changes would be breaking changes, I leave the decision to more senior contributors such as @dataflake / @icemac. As a side note:
TL/DR
|
Jürgen Gmach wrote at 2019-4-21 23:56 -0700:
...
- [ ] field2u* is broken for Python 3 and byte input
The "u*" things are relevant only for Python 2.
With Python 3, "u*" is equivalent to "*",
e.g. "ustring" to "string", "ulines" to "lines", etc.
Of course, this does not solve the conversion problem.
Passing a file where a simple type (such as "string")
is expected is likely extremely rare. It should not
be a big problem to drop support for this.
|
IIUC, for POST requests containing file upload inputs, these inputs are interpreted as file-like objects in the request. This code seems to imply that you can apply the |
@icemac / @dataflake / @jamadden / @mgedmin ... Somebody has to make a decision on how to proceed with this issue.
|
My opinion:
|
Jens Vagelpohl wrote at 2019-5-9 05:10 -0700:
My opinion:
- change the converters to not touch files
- leave the `field2u*` converters alone. We will revisit it when we have gathered more experience with Zope on Python 3.
Note: for Python 3, the `u`-fields are equivalent to the corresponding
fields. This means: any problem we have with the `u`-field converters,
we should have as well with the corresponding field converters.
|
Is there any preference on what should happen when a converter gets applied to a file anyway? I am no big fan of ignoring wrong usage, as I prefer to fail hard and early, i.e. I'd raise an exception. |
Jürgen Gmach wrote at 2019-5-10 10:44 -0700:
Is there any preference on what should happen when a converter gets applied to a file anyway?
I am no big fan of ignoring wrong usage, as I prefer to fail hard and early, i.e. I'd raise an exception.
+1
|
You can always use the logger to emit a warning-level log message |
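A minimal sketch of the "fail hard and early" idea combined with a warning-level log message might look like this. All names here (`reject_files`, `to_int_sketch`, the logger name) are hypothetical and chosen for illustration; nothing in this block is taken from the Converters module, and the choice of `ValueError` is an assumption.

```python
import functools
import io
import logging

logger = logging.getLogger("converters_sketch")


def reject_files(converter):
    """Hypothetical decorator: refuse file-like input instead of reading it."""
    @functools.wraps(converter)
    def wrapper(v):
        if hasattr(v, "read"):
            # Log first so the misuse stays visible even if the caller
            # catches the exception.
            logger.warning("%s applied to a file upload", converter.__name__)
            raise ValueError(
                "%s cannot be applied to file uploads" % converter.__name__
            )
        return converter(v)
    return wrapper


@reject_files
def to_int_sketch(v):
    """Hypothetical converter used only to exercise the decorator."""
    return int(v)


ok = to_int_sketch("42")

try:
    to_int_sketch(io.BytesIO(b"42"))
    raised = False
except ValueError:
    raised = True
```

Keeping the file check in one wrapper means the individual converters no longer need to know about the `read` attribute at all.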
At the April 2021 sprint we just decided to create a PR and discuss details later. |
When working on #557 I spotted several possible problems within the Converters module.
The converters which inherit from _unicode_converter all fail under Python 3 with byte input. Testcase:
This problem applies to field2utokens, field2utext, field2string. field2ulines on the other hand does not inherit from _unicode_converter and has to be examined separately.
I spotted
Zope/src/ZPublisher/Converters.py
Line 186 in 5733da7
Another possible problem is in
Zope/src/ZPublisher/Converters.py
Line 209 in 5733da7
I tried to fix these "problems" yesterday - seemingly successfully, but then suddenly
Zope/src/ZPublisher/tests/testHTTPRequest.py
Line 302 in 5733da7
Also, can anybody shed some light on what this "read" attribute is all about?
e.g.
Zope/src/ZPublisher/Converters.py
Line 221 in 5733da7
Before spending more time on these issues, I'd like somebody to confirm that these are real issues.