-
-
Notifications
You must be signed in to change notification settings - Fork 31k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
os module: use keyword-only arguments for dir_fd and nofollow to reduce function count #58831
Comments
There are some functions in the os module that do much the same thing but differ only in minor ways, like
It would be better if we consolidated these variations into one function which took keyword-only parameters that allow access to the alternate functionality. Here is a sample of the new possible function signatures, as originally proposed by Serhiy Storchaka: access(path, mode, *, followlinks=True, dirfd=None, eaccess=False)
chmod(path, mode, *, followlinks=True, dirfd=None)
chown(path, uid, gid, *, followlinks=True, dirfd=None)
link(srcpath, dstpath, *, followlinks=True, srcdirfd=None, dstdirfd=None)
mkdir(path, mode=0o777, *, dirfd=None)
mknod(path, mode=0o600, device=0, *, dirfd=None)
open(path, flag, mode=0o777, *, dirfd=None)
readlink(path, *, dirfd=None)
rename(oldpath, newpath, *, olddirfd=None, newdirfd=None)
stat(path, *, followlinks=True, dirfd=None)
symlink(src, dst, *, dirfd=None)
unlink(path, *, removedir=False, dirfd=None)
utimes(path[, (atime, mtime)], *, ns=False, dirfd=None)
mkfifoat(path, mode=0o666, *, followlinks=True, dirfd=None) My opinion about naming: PEP-8 suggests underscores to separate words in non-class identifiers. So I'd spell those "dir_fd", "src_dir_fd" (etc), "remove_dir", and "follow_symlinks". (While thinking about this, I remembered the infrequently-cited maxim "no boolean parameters". But that's more a guideline than a rule, and it tends to be a complaint about how the meaning of the parameter is unclear at a call site. As these will be keyword-only parameters I think their meanings will be perfectly clear, so I shan't worry about it.) |
storchaka: sorry for the long delay, somehow I missed your reply in python-ideas.) You said you envision this as a big patch. Could I convince you to try and make a series of smaller patches? It should be easy to break up into small pieces--do one patch adding dir_fd, the next with follow_symlinks, etc. |
I think it most (all?) of these cases, it's better to mirror the os level APIs, since many people already know them. Such fanciness is better left to high level wrappers (like path objects). |
No, Guido's boolean keyword dislike is not about things being unclear at the call site. It's about boolean parameters instead of separate functions. That is (I believe) he prefers lstat/stat to having stat take a boolean keyword, keyword-only or not. Did he weigh in on the python-ideas thread? |
r.david.murray said:
I wasn't referring to Guido specifically, just general code smell complaints about boolean parameters.
Not per se. But I ran into him when he was leaving PyCon 2012 for good (second night of the sprints iirc), and he steered me to the original python-ideas thread and asked me to follow up on it / work with the poster. (Probably because of my doing the nanosecond-precision os.stat / utime work.) If there's genuine opposition to the patch being generated here I'll ping him. Oh, btw storchaka, in the email thread you asked
No--it'll just be part of a release, and you'll check which version of Python 3 you have before using it. if sys.version_info.major >= 3 or sys.version_info.minor >= 3:
# use extra flags p.s. http://mail.python.org/pipermail/python-ideas/2012-March/014482.html |
Thank you, Larry. I was going to do it, but got stuck with other things. The main objective of this proposal is to get rid of litter os module by dozen rarely popular and non-portable functions (introduced by bpo-4761). Moreover, the descriptions are given in alphabetical order and related functions are located far from each other.
I follow the common style.
Presence of function depends not only on the version, but also from the platform. Benjamin, therefore I believe it is critically important to do this work before the release of Python 3.3. To people and have not started to use the new features. Otherwise, get rid of them will be more difficult. But I have nothing against to put "at"-functions in a separate submodule (os.posix?). David, let lstat remains. fstatat include functionality both stat and lstat (but has a different interface). I suggest to unite fstatat and stat, while maintaining backward compatibility and using a more user-friendly interface. In the C extension of the functions is impossible, that is why were introduced new functions with other names. |
Before starting to code, it is necessary to solve the problem of interface. With the majority of the functions all is good, but the One is the filename and dirfd are combined in a tuple. Instead of a tuple, you can specify only the name. link ((srcpath, srcdirfd), (dstpath, dstdirfd), *, followlinks=True) The other -- link (srcpath, dstpath, *, followlinks=True, dirfd=(srcdirfd, dstdirfd)) Which of these options (or the original, with different keywords) is preferable? |
It's true that, for example, dir_fd parameters won't work on Windows. The solution is to always accept the parameters and throw NotImplementedError on platforms where the functionality isn't available. Here are my thoughts on the interface for link(). Since the two dir_fd parameters are independent--you might specify none, one, or both--I think the dir_fd=(src,dst) form would be obnoxious. And polymorphic parameters (accept a string or a tuple of string and fd) are nearly always a bad idea; the % operator on strings is a good example of what can go wrong. So I think you should stick with the original interface, with independent explicit keyword arguments. I'd prefer that we did this everywhere it made sense, including adding a follow_symlinks parameter to stat(). But obviously you should prioritize places where we want to get rid of functions that are only in trunk (3.3) so far. I suppose there's precedent for "followlinks"; it's very old, predating PEP-8. Personally I'd overlook it if I were doing the implementation and spell the new parameters "follow_symlinks"--or at least I'd try it and see what the community thought. Anyway, there's no established precedent for "dir_fd" and "remove_dir". So I'd certainly prefer PEP-8 spellings for those. |
Here my first stab at a comprehensive proposal. Each section represents a specific new function argument, and a list of functions that the argument be added to. All new arguments are keyword-only and optional. All functions mentioned are in the os module. This annotation: This annotation: However! I am *not* proposing doing *any* of these deprecations--I suspect the right thing to do is to leave all the existing functions in. follow_symlinks=bool (default True) Allow functions to either follow symlinks (the default) or examine symlinks.
effective_ids=bool (default False) For functions that take the AT_EACCESS flag or otherwise operate on effective uid/gid.
dir_fd=int (default None) For functions that take a "dir_fd" parameter instead of / in addition to a "path" parameter.
fd=int (default None) For functions that take a "path" parameter and have an "f"-equivalent that take an "fd" instead. The "path" parameter and "fd" parameters would be exclusive; you must specify exactly one, never both. Both parameters would accept "None" as equivalent to being unspecified. For the functions that only take one parameter (chdir, listdir, stat, statvfs) I propose we give that parameter a default of None.
remove_dir=bool (default False) Allows unlink to behave like rmdir.
One question: If we do as I propose, and dir_fd is always a parameter to a pre-existing function, can we drop support for AT_FDCWD? It seems to me that |
Well, it always boils down to the same problem: should we offer thin wrappers around underlying syscalls or offer higher-level functions? Now, I'm in favor of adding optional argument to tune the behavior with regard to symlinks, however I'm skeptical about merging regular syscalls and *at() syscalls, for the following reasons:
stat(path, *, followlinks=True, dirfd=None)
is backwards, it should be
stat(dirfd, path, *, followlinks=True) since, Actually, the proper way to do this would be to use overloading, but Python doesn't support it (and even if it did, I think overloading would probably be a bad idea anyway).
|
I very much agree. I suggest anyone who thinks the os module is a thin
NotImplementedError.
try/except. If someone (maybe even me) championed PEP-362 (Function
You could hypothetically rearrange all the arguments I admit it's unlikely anyone will bother.
Speaking *purely hypothetically*, we could actually overload it. It'd if (-1 == PyArg_ParseTupleAndKeyword(arguments-with-dir_fd)) {
PyErr_Clear();
if (-1 == PyArg_ParseTupleAndKeyword(arguments-without-dir_fd)) {
/* etc. */ However, Python doesn't have overloading because we shouldn't do it.
I counter with TOOWWTDI. There are currently four chmod functions in You may also face the problem that you don't want chmod to follow I feel the sad truth of the situation is, it is desirable that Python (Though this would not help the hapless programmer who suspected maybe Also, I suggest that the programmer reading the documentation for os
I understand your critique. In defense of the proposal, I will say that Not that I claim this gives me carte blanche to check an implementation In the absence of a BDFL ruling on the proposal, I think I should |
Here's my first pass at a patch. For this patch, Specifically: This function has been removed, and instead Additionally, we *could* deprecate this function, I doubt we'll ever deprecate those functions. Notes:
/*
* A PyArg_ParseTuple "converter" function
* that handles filesystem paths in the manner
* preferred by the os module.
*
* path_converter accepts (Unicode) strings and their
* subclasses, and bytes and their subclasses. What
* it does with the argument depends on the platform:
*
* * On Windows, if we get a (Unicode) string we
* extract the wchar_t * and return it; if we get
* bytes we extract the char * and return that.
*
* * On all other platforms, strings are encoded
* to bytes using PyUnicode_FSConverter, then we
* extract the char * from the bytes object and
* return that.
*
* Input fields:
* path.nullable
* If nonzero, the path is permitted to be None.
* path.function_name
* If non-NULL, path_converter will use that as the name
* of the function in error messages.
* (If path.argument_name is NULL it omits the function name.)
* path.argument_name
* If non-NULL, path_converter will use that as the name
* of the parameter in error messages.
* (If path.argument_name is NULL it uses "path".)
*
* Output fields:
* path.wide
* Points to the path if it was expressed as Unicode
* and was not encoded. (Only used on Windows.)
* path.narrow
* Points to the path if it was expressed as bytes,
* or it was Unicode and was encoded to bytes.
* path.length
* The length of the path in characters.
* path.object
* The original object passed in.
* path.cleanup
* For internal use only. May point to a temporary object.
* (Pay no attention to the man behind the curtain.)
*
* At most one of path.wide or path.narrow will be non-NULL.
* If path was None and path.nullable was set,
* both path.wide and path.narrow will be NULL,
* and path.length will be 0.
*
* path_converter takes care to not write to the path_t
* unless it's successful. However it must reset the
* "cleanup" field each time it's called.
*
* Use as follows:
* path_t path;
* memset(&path, 0, sizeof(path));
* PyArg_ParseTuple(args, "O&", path_converter, &path);
* // ... use values from path ...
* path_cleanup(&path);
*
* (Note that if PyArg_Parse fails you don't need to call
* path_cleanup(). However it is safe to do so.)
*/
typedef struct {
char *function_name;
char *argument_name;
int nullable;
wchar_t *wide;
char *narrow;
Py_ssize_t length;
PyObject *object;
PyObject *cleanup;
} path_t; I assert path_converter is Very Useful. It nearly always
I do think it's an improvement. But I won't check any part of it I look forward to your feedback! |
Second pass at my patch. Incorporates suggestions from Serhiy's review--thanks, Serhiy! Still not ready for checkin. > 80 col lines, no docs, docstrings are messy. But code is ready for (further) review. Code passes regression test suite without errors. |
Well, I'm going to ignore the long lines and documentation. The patch is |
I'm not sure that "long" and "impressive" are words that go together when describing a patch ;-) |
Here's a nice fresh minor update. Notes on this third patch:
|
BTW: If PEP-362 is accepted, and this patch makes it for 3.3 (both of which I think will happen), I'll hand-code signatures for the functions that may throw NotImplementedError so users can use "is_implemented" to LBYL. |
I haven't read the code, but from Larry's description this looks great to me. It's amazing how many extra functions were added to the os module since 3.2! I also agree that the redundant functions that existed in 3.2 should stay and I don't see it's fair to deprecate them. I do hope that not too many people have written code based on the 3.3 alphas using all those extra functions, but I suppose they will get what they paid for. Everything else Larry wrote also sounds reasonable to me. |
Previously existing redundant functions could be deprecated in documentation. |
As in, don't start a "deprecation cycle" (warning in 3.3, deprecated in 3.4, gone in 3.5), just document "consider using this other function instead"? That's probably worth doing. I wouldn't use the word "deprecated" though, I'd just suggest a "see also". Maybe we could remove the redundant functions in 4.0. I'll put it on my wishlist :) |
New patch! What's new:
I *think* the docstrings are all fixed. The only thing I know that I really wanna get this in before the feature freeze. I promise to |
Fifth iteration of my patch. Everything is done, and I really think it's ready to be checked in.
|
And what if the system supports symlink and doesn't support |
New changeset 27f9c26fdd8b by Larry Hastings in branch 'default': |
New changeset 04fd8f77a58e by Larry Hastings in branch 'default': |
New changeset e1e0eeb07398 by Larry Hastings in branch 'default': |
New changeset 66f7377547d5 by Larry Hastings in branch 'default': |
27f9c26fdd8b broke test_shutil on the Windows buildbots: ====================================================================== Traceback (most recent call last):
File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows7\build\lib\test\test_shutil.py", line 1146, in test_basic
self.assertEqual(rv, self.temp_file.name)
AssertionError: None != 'c:\\users\\db3l\\appdata\\local\\temp\\tmpxqw4gu\\tmp7ugfmm.exe' ====================================================================== Traceback (most recent call last):
File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows7\build\lib\test\test_shutil.py", line 1152, in test_full_path_short_circuit
self.assertEqual(self.temp_file.name, rv)
AssertionError: 'c:\\users\\db3l\\appdata\\local\\temp\\tmpmwer14\\tmpeacfbz.exe' != None ====================================================================== Traceback (most recent call last):
File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows7\build\lib\test\test_shutil.py", line 1158, in test_non_matching_mode
self.assertIsNone(rv)
AssertionError: 'c:\\users\\db3l\\appdata\\local\\temp\\tmp7n6ojp\\tmp5tt9pa.exe' is not None ====================================================================== Traceback (most recent call last):
File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows7\build\lib\test\test_shutil.py", line 1181, in test_pathext_checking
self.assertEqual(self.temp_file.name, rv)
AssertionError: 'c:\\users\\db3l\\appdata\\local\\temp\\tmpipmbe3\\tmpx43hex.exe' != None ====================================================================== Traceback (most recent call last):
File "D:\cygwin\home\db3l\buildarea\3.x.bolen-windows7\build\lib\test\test_shutil.py", line 1166, in test_relative
self.assertEqual(rv, os.path.join(tail_dir, self.file))
AssertionError: None != 'tmpcluw7l\\tmp6sy_py.exe' |
The |
I agree with Antoine. I've opened bpo-15154 to track this. |
The Windows buildbots are now content; closing. |
New changeset a138a1131bae by Victor Stinner in branch 'default': |
New changeset 170cd0104267 by Victor Stinner in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: