GH-101112: Add "pattern language" section to pathlib docs #114030

Merged (18 commits) on Feb 26, 2024
Changes from 15 commits
159 changes: 103 additions & 56 deletions Doc/library/pathlib.rst
@@ -572,6 +572,9 @@ Pure paths provide the following methods and properties:
>>> PurePath('/a/b/c.py').full_match('**/*.py')
True

.. seealso::
:ref:`pattern-language` documentation.

As with other methods, case-sensitivity follows platform defaults::

>>> PurePosixPath('b.py').full_match('*.PY')
@@ -991,25 +994,15 @@ call fails (for example because the path doesn't exist).
[PosixPath('pathlib.py'), PosixPath('setup.py'), PosixPath('test_pathlib.py')]
>>> sorted(Path('.').glob('*/*.py'))
[PosixPath('docs/conf.py')]

Patterns are the same as for :mod:`fnmatch`, with the addition of "``**``"
which means "this directory and all subdirectories, recursively". In other
words, it enables recursive globbing::

>>> sorted(Path('.').glob('**/*.py'))
[PosixPath('build/lib/pathlib.py'),
PosixPath('docs/conf.py'),
PosixPath('pathlib.py'),
PosixPath('setup.py'),
PosixPath('test_pathlib.py')]

.. note::
Using the "``**``" pattern in large directory trees may consume
an inordinate amount of time.

.. tip::
Set *follow_symlinks* to ``True`` or ``False`` to improve performance
of recursive globbing.
.. seealso::
:ref:`pattern-language` documentation.

This method calls :meth:`Path.is_dir` on the top-level directory and
propagates any :exc:`OSError` exception that is raised. Subsequent
@@ -1025,11 +1018,11 @@ call fails (for example because the path doesn't exist).
wildcards. Set *follow_symlinks* to ``True`` to always follow symlinks, or
``False`` to treat all symlinks as files.

.. audit-event:: pathlib.Path.glob self,pattern pathlib.Path.glob
.. tip::
Set *follow_symlinks* to ``True`` or ``False`` to improve performance
of recursive globbing.

.. versionchanged:: 3.11
Return only directories if *pattern* ends with a pathname components
separator (:data:`~os.sep` or :data:`~os.altsep`).
.. audit-event:: pathlib.Path.glob self,pattern pathlib.Path.glob

.. versionchanged:: 3.12
The *case_sensitive* parameter was added.
@@ -1038,12 +1031,29 @@ call fails (for example because the path doesn't exist).
The *follow_symlinks* parameter was added.

.. versionchanged:: 3.13
Return files and directories if *pattern* ends with "``**``". In
previous versions, only directories were returned.
The *pattern* parameter accepts a :term:`path-like object`.


.. method:: Path.rglob(pattern, *, case_sensitive=None, follow_symlinks=None)

Glob the given relative *pattern* recursively. This is like calling
:meth:`Path.glob` with "``**/``" added in front of the *pattern*.

.. seealso::
:ref:`pattern-language` and :meth:`Path.glob` documentation.
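
The equivalence described above can be checked with a minimal sketch. The directory layout and the helper names ``build_tree`` and ``compare_rglob_and_glob`` are made up for illustration; only ``Path.glob`` and ``Path.rglob`` come from pathlib.

```python
import tempfile
from pathlib import Path


def build_tree(root: Path) -> None:
    """Create a small throwaway tree (file names are invented for the demo)."""
    (root / "docs").mkdir()
    (root / "build" / "lib").mkdir(parents=True)
    for rel in ("setup.py", "docs/conf.py", "build/lib/pathlib.py", "README.txt"):
        (root / rel).write_text("")


def compare_rglob_and_glob(root: Path, pattern: str):
    """Return sorted results of rglob(pattern) and glob('**/' + pattern)."""
    via_rglob = sorted(p.relative_to(root).as_posix() for p in root.rglob(pattern))
    via_glob = sorted(p.relative_to(root).as_posix() for p in root.glob("**/" + pattern))
    return via_rglob, via_glob


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    build_tree(root)
    via_rglob, via_glob = compare_rglob_and_glob(root, "*.py")

print(via_rglob)  # ['build/lib/pathlib.py', 'docs/conf.py', 'setup.py']
print(via_glob)   # same list
```

Both calls walk the whole tree, so the performance note for "``**``" applies to ``rglob()`` as well.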

.. audit-event:: pathlib.Path.rglob self,pattern pathlib.Path.rglob

.. versionchanged:: 3.12
The *case_sensitive* parameter was added.

.. versionchanged:: 3.13
The *follow_symlinks* parameter was added.

.. versionchanged:: 3.13
The *pattern* parameter accepts a :term:`path-like object`.


.. method:: Path.group(*, follow_symlinks=True)

Return the name of the group owning the file. :exc:`KeyError` is raised
@@ -1471,44 +1481,6 @@ call fails (for example because the path doesn't exist).
strict mode, and no exception is raised in non-strict mode. In previous
versions, :exc:`RuntimeError` is raised no matter the value of *strict*.

.. method:: Path.rglob(pattern, *, case_sensitive=None, follow_symlinks=None)

Glob the given relative *pattern* recursively. This is like calling
:func:`Path.glob` with "``**/``" added in front of the *pattern*, where
*patterns* are the same as for :mod:`fnmatch`::

>>> sorted(Path().rglob("*.py"))
[PosixPath('build/lib/pathlib.py'),
PosixPath('docs/conf.py'),
PosixPath('pathlib.py'),
PosixPath('setup.py'),
PosixPath('test_pathlib.py')]

By default, or when the *case_sensitive* keyword-only argument is set to
``None``, this method matches paths using platform-specific casing rules:
typically, case-sensitive on POSIX, and case-insensitive on Windows.
Set *case_sensitive* to ``True`` or ``False`` to override this behaviour.

By default, or when the *follow_symlinks* keyword-only argument is set to
``None``, this method follows symlinks except when expanding "``**``"
wildcards. Set *follow_symlinks* to ``True`` to always follow symlinks, or
``False`` to treat all symlinks as files.

.. audit-event:: pathlib.Path.rglob self,pattern pathlib.Path.rglob

.. versionchanged:: 3.11
Return only directories if *pattern* ends with a pathname components
separator (:data:`~os.sep` or :data:`~os.altsep`).

.. versionchanged:: 3.12
The *case_sensitive* parameter was added.

.. versionchanged:: 3.13
The *follow_symlinks* parameter was added.

.. versionchanged:: 3.13
The *pattern* parameter accepts a :term:`path-like object`.

.. method:: Path.rmdir()

Remove this directory. The directory must be empty.
@@ -1639,6 +1611,81 @@ call fails (for example because the path doesn't exist).
.. versionchanged:: 3.10
The *newline* parameter was added.


.. _pattern-language:

Pattern language
----------------

The following wildcards are supported in patterns for
:meth:`~PurePath.full_match`, :meth:`~Path.glob` and :meth:`~Path.rglob`:

``**`` (full segment)
  Matches any number of file or directory segments.
``*`` (full segment)
  Matches one file or directory segment.
``*`` (otherwise)
  Matches any number of non-separator characters.
``?``
  Matches one non-separator character.
``[seq]``
  Matches one character in *seq*.
``[!seq]``
  Matches one character not in *seq*.

For a literal match, wrap the meta-characters in brackets.
For example, ``"[?]"`` matches the character ``"?"``.
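
As a small illustration of bracket escaping, the sketch below uses ``[`` rather than ``?`` as the character to escape, because file names containing ``?`` are not valid on every platform. The file names are invented for the demo.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    for name in ("report1.txt", "report[1].txt"):
        (root / name).write_text("")

    # "[1]" is read as a character class that matches the single character "1".
    as_class = sorted(p.name for p in root.glob("report[1].txt"))

    # "[[]" is a character class containing only "[", so it matches a literal
    # opening bracket; the "]" that follows "1" is then an ordinary character.
    as_literal = sorted(p.name for p in root.glob("report[[]1].txt"))

print(as_class)    # ['report1.txt']
print(as_literal)  # ['report[1].txt']
```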

The "``**``" wildcard enables recursive globbing. A few examples:

=========================  ===========================================
Pattern                    Meaning
=========================  ===========================================
"``**/*``"                 Any path with at least one segment.
"``**/*.py``"              Any path with a final segment ending "``.py``".
"``assets/**``"            Any path starting with "``assets/``".
"``assets/**/*``"          Any path starting with "``assets/``", excluding "``assets/``" itself.
=========================  ===========================================
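
The patterns in the table can be tried against a throwaway directory tree. This is a sketch under assumptions: the layout and the helper ``matches`` are hypothetical, and the "``assets/**``" form is left out because its results differ between Python versions.

```python
import tempfile
from pathlib import Path


def matches(root: Path, pattern: str) -> list:
    """Return glob results as sorted POSIX-style paths relative to *root*."""
    return sorted(p.relative_to(root).as_posix() for p in root.glob(pattern))


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "assets" / "css").mkdir(parents=True)
    (root / "tools").mkdir()
    for rel in ("main.py", "tools/build.py", "assets/logo.png", "assets/css/site.css"):
        (root / rel).write_text("")

    any_path = matches(root, "**/*")            # every file and directory below root
    py_files = matches(root, "**/*.py")         # final segment ends with ".py"
    under_assets = matches(root, "assets/**/*") # below assets/, not assets/ itself

print(any_path)
print(py_files)      # ['main.py', 'tools/build.py']
print(under_assets)  # ['assets/css', 'assets/css/site.css', 'assets/logo.png']
```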

.. note::
Globbing with the "``**``" wildcard visits every directory in the tree.
Large directory trees may take a long time to search.

.. versionchanged:: 3.13
Globbing with a pattern that ends with "``**``" returns both files and
directories. In previous versions, only directories were returned.

In :meth:`Path.glob` and :meth:`~Path.rglob`, a trailing slash may be added to
the pattern to match only directories.

.. versionchanged:: 3.11
Globbing with a pattern that ends with a pathname component separator
(:data:`~os.sep` or :data:`~os.altsep`) returns only directories.
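
A minimal sketch of the trailing-slash behaviour, using an invented layout. The directories-only result is what the text above describes for Python 3.11 and later; on older versions the trailing slash may be ignored, so the interpreter version is printed alongside the results.

```python
import sys
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "docs").mkdir()
    (root / "src").mkdir()
    (root / "setup.py").write_text("")

    # Without a trailing slash, files and directories both match.
    everything = sorted(p.name for p in root.glob("*"))

    # With a trailing slash, only directories are expected (3.11 and later).
    dirs_only = sorted(p.name for p in root.glob("*/"))

print(sys.version_info[:2])
print(everything)  # ['docs', 'setup.py', 'src']
print(dirs_only)   # ['docs', 'src'] on Python 3.11 and later
```

The returned paths carry no trailing slash themselves, so ``p.name`` gives the plain directory name.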


Comparison to the :mod:`glob` module
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The patterns accepted and results generated by :meth:`~Path.glob` and
:meth:`~Path.rglob` differ slightly from those by the :mod:`glob` module:

1. Hidden files (beginning with a dot) are not special in pathlib. This is
   like passing ``include_hidden=True`` to :func:`glob.glob`.
2. "``**``" components are always recursive in pathlib. This is like passing
   ``recursive=True`` to :func:`glob.glob`.
3. "``**``" components do not follow symlinks by default in pathlib. Pass
   ``follow_symlinks=True`` to :meth:`Path.glob` for :func:`glob.glob`-like
   behaviour.
4. Like all :class:`PurePath` and :class:`Path` objects, the values returned
   from :meth:`Path.glob` and :meth:`~Path.rglob` don't include trailing
   slashes.
5. The values returned from pathlib's ``path.glob()`` and ``path.rglob()``
   include the *path* as a prefix, unlike the results of
   ``glob.glob(root_dir=path)``.
6. ``bytes``-based paths and :ref:`paths relative to directory descriptors
   <dir_fd>` are not supported by pathlib.


Correspondence to tools in the :mod:`os` module
-----------------------------------------------
