python BFS and DFS search documentation improvements #1361

barakatzir · 2025-01-13T08:51:46Z

What is the expected enhancement?

I find the documentations for the python visit functions (dfs_search etc.) confusing, and I think all users can benefit from some additions to them.
It might be that my sparse prior knowledge in graph algorithms, but I found the documentation confusing with all the colors (WHITE, GRAY, BLACK) mentioned without context to the BFSVisitor events, or the term "forward or cross edge" which I didn't know.
In the end, I read the rust implementation to understand the algorithm. The code is clearly well thought out and it would be an injustice to keep not let it shine in the documentation as well.

I suggest the following changes:

In docstring, mentioning the visitor events in the pseudo-code in the docs, so it's clear which event occurs when. See suggestions below.
Fix rustworkx.dfs_search docstring: mention that the graph can be a PyDiGraph as well.
Changing the docstring's reported type-annotation for source from List[int] to Sequence[int] | None with default None.

Changing the type annotations (in rustworkx.pyi) for the different visit functions so that visitor=None is not accepted by static type checkers. This can be achieved with overloads:

@overload
def digraph_bfs_search(
    graph: PyDiGraph,
    *,
    visitor: _BFSVisitor,
) -> None: ...
@overload
def digraph_bfs_search(
    graph: PyDiGraph,
    source: Sequence[int] | None,
    visitor: _BFSVisitor,
) -> None: ...

In docstring, replace the "Graph can not be mutated while traversing." to "Trying to mutate the graph while traversing raises a RuntimeError.". This way the user can know which exception type he should catch in such cases.
Documenting that raising PruneSearch in a finish_vertex method raises an exception. I checked and this behavior is documented in rustworkx-core where the function call panics in such cases.
(This is the only actual suggested change to runtime behavior) I suggest not raising a PanicException if finish_vertex raises PrunceSearch. This exception isn't caught by except Exception and user cannot easily narrow it with except PanicException since PanicException is in pyo3_runtime module, which is not site-packages usually and cannot be easily imported.

I only looked at the bfs_search and dfs_search which I use, but I suspect dijkstra_search also could benefit from similar changes.
If you agree, I can open a PR with the agreed upon changes and also look into the Dijkstra case.

Regrading the pseudo-code elaborations. Here is an option: replacing the current pseudo-code with python alternative (python is already kinda pseudo-codish). I wrote python alternatives while trying to understand the rust code. Here they are after removing the control flow for the PruneSearch and StopSearch (I can reproduce the control flow if desired):

from typing import Literal
from collections import deque

def bfs_search(graph, source=None, visitor=None):
    color: dict[int, Literal['WHITE', 'GRAY', 'BLACK']] = {}
    for u in graph.node_indices():
        color[u] = 'WHITE'  # set all vertices as unvisited
    for s in source:
        _bfs_visit(graph, s, visitor, color)
        
def _bfs_visit(graph, start, visitor, color):
    if color[start] != 'WHITE':
        return
    color[start] = 'GRAY'
    visitor.discover_vertex(start)  # discover vertex start
    stack = deque((start,))
    while True:
        try:
            u = stack.pop()
        except IndexError:
            break  # break if stack is empty
        for _, v, weight in graph.incident_edge_index_map(u).values():  # examine edge (u,v)
            if color[v] == 'WHITE':
                visitor.tree_edge((u, v, weight))
                color[v] = 'GRAY'
                visitor.discover_vertex(v)
                stack.appendleft(v)
            else:
                visitor.non_tree_edge((u, v, weight))
                if color[v] == 'GRAY':
                    visitor.gray_target_edge((u, v, weight))
                else:  # color[v] == 'BLACK'
                    visitor.black_target_edge((u, v, weight))
        color[u] = 'BLACK'
        visitor.finish_vertex(u)

and

from typing import Literal

def dfs_search(graph, source=None, visitor=None):
    color: dict[int, Literal['WHITE', 'GRAY', 'BLACK']] = {}
    for u in graph.node_indices():
        color[u] = 'WHITE'  # set all vertices as unvisited
    time = -1  # time counter
    for s in source:
        time = _dfs_visit(graph, s, visitor, color, time)
        

def _dfs_visit(graph, start, visitor, color, time):
    if color[start] != 'WHITE':
        return time
    color[start] = 'GRAY'
    time += 1
    visitor.discover_vertex(start, time)  # discover vertex start
    stack = [(start, graph.incident_edge_index_map(start).values())]
    while True:
        try:
            u, edges = stack[-1]
        except IndexError:
            break  # break if stack is empty
        next_ = None
        for _, v, weight in edges:  # examine edge (u, v)
            if color[v] == 'WHITE':
                visitor.tree_edge((u, v, weight))
                color[v] = 'GRAY'
                time += 1
                visitor.discover_vertex(v, time)
                next_ = v
                break
            elif color[v] == 'GRAY':
                visitor.back_edge((u, v, weight))
            else:  # color[v] == 'BLACK'
                visitor.forward_or_cross_edge((u, v, weight))
        if next_ is not None:
            stack.append((v, graph.incident_edge_index_map(v).values()))
            continue
        color[u] = 'BLACK'
        time += 1
        visitor.finish_vertex(u, time)
        stack.pop()
    return time

The text was updated successfully, but these errors were encountered:

IvanIsCoding · 2025-01-14T02:42:57Z

In general I think documentation improvements are always welcome. I did suggest the idea of having the pseudocode in Python in #489 but the consensus was that the existing pseudocode was good enough.

I will start by saying that the terminology used is the same as the one in CLRS. The book did not invent it but it popularized it for sure. We should probably cite Chapter 20 "Elementary Graph Algorithms" from the 4th edition.

I will group the answer by:

Panicking

Answering n.7. I agree in not panicking, I'd rather throw an error. I will merge a PR for improving that aspect of the code
Documenting n.6 is fine. It can come in the same PR

Modifying the graph

Answering about modifying the graph while iterating. Modifying the input while calling DFSVisitor or BFSVisitor is a terrible idea, the implementation does not handle that case. In the best scenario you will get a RuntimeError as mentioned but if the program managed to panic I would not be surprised.

I am ok with documenting RuntimeError with the emphasis that modifying the graph is not supported.

Type annotations

I can take that one myself and clean it up after #1353 is merged. There are some improvements from #1352 as well. Once I get Matthew to review my pending PRs it will be easier to work on that.

Documentation improvements

You can send the improvements. I don't mind changing the pseudocode to Python as long as it is roughly the same. It shouldn't be a full-fledged implementation but something good enough to give a guide on how to use the code. We should also document that the universal function dfs_search and bfs_search works with both graph types

Another idea is to add more examples to the documentation, maybe with a tutorial on how to use the visitors.

barakatzir · 2025-01-14T08:45:18Z

Thanks for the informative response! I'll get to work on the PR on the documentations in a couple of days probably.
Adding more examples is also a great idea!

Do you have a plan on what to replace the panic with? If you have a concrete exception type in mind I can put it in the documentation.

IvanIsCoding · 2025-01-14T14:49:57Z

Thanks for the informative response! I'll get to work on the PR on the documentations in a couple of days probably. Adding more examples is also a great idea!

Do you have a plan on what to replace the panic with? If you have a concrete exception type in mind I can put it in the documentation.

I think RuntimeError is fine in the place of a panic. I will think more carefully about it but that one is a good umbrella error for things we don’t want to handle

IvanIsCoding · 2025-01-15T03:03:54Z

So I tried to add the annotations in #1362 and the results was that mypy complained that we did not cover the default arguments case.

I will probably need to add an overload with https://docs.python.org/3/library/typing.html#typing.Never for the visitor=None case. Something along the lines of def dfs_search(graph: PyGraph | PyDiGraph, *, visitor: None=...) -> Never.

I never thought argument ordering would bite us back so much!

barakatzir · 2025-01-15T07:10:43Z

Overloading can be tricky. There are a lot of sharp corners for complex signatures. I'll take a look at your commit with the overloading.

It is considered bad practice to use Never or NoReturn to annotate a raised exception.
The following code is valid by mypy:

From typing import Never

def neverreturns(): -> Never:
    assert False

neverreturns()  # no errors by mypy
1 + 'hello'  # this isn't type checked

The idea is that Never can be used by functions that run a server or another never-ending event loop. All the code after a Never is returned is skipped by type checkers since it is deemed unreachable.

Python simply does not have a good alternative to rust's Result...

IvanIsCoding · 2025-01-15T13:10:20Z

I agree it is bad practice but I need to cover the all None case. I’d rather not tell users the all None case is available at all. But our tests catch that “mistake”

barakatzir · 2025-01-15T22:14:35Z

I did some trials and I believe the following should work

@overload
def digraph_bfs_search(
    graph: PyDiGraph, source: Sequence[int] | None=..., *, visitor: _BFSVisitor,
) -> None: ...
@overload
def digraph_bfs_search(
    graph: PyDiGraph, source: Sequence[int] | None, visitor: _BFSVisitor,
) -> None: ...
@overload
def digraph_bfs_search(
    graph: PyDiGraph, source: Never, visitor: None = ...,
) -> None: ...

This passes both mypy and stubtest (on minimal examples I recreated). Notice that I put the Never in the argument for source, so type checkers are instructed to forbid this overload signature.

IvanIsCoding added the documentation label Jan 14, 2025

IvanIsCoding mentioned this issue Jan 15, 2025

Improvements for rustworkx.visit annotations #1362

Merged

barakatzir linked a pull request Jan 29, 2025 that will close this issue

Enhance traversal docs #1375

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

python BFS and DFS search documentation improvements #1361

python BFS and DFS search documentation improvements #1361

barakatzir commented Jan 13, 2025 •

edited

Loading

IvanIsCoding commented Jan 14, 2025

barakatzir commented Jan 14, 2025

IvanIsCoding commented Jan 14, 2025

IvanIsCoding commented Jan 15, 2025

barakatzir commented Jan 15, 2025 •

edited

Loading

IvanIsCoding commented Jan 15, 2025 •

edited

Loading

barakatzir commented Jan 15, 2025 •

edited

Loading

python BFS and DFS search documentation improvements #1361

python BFS and DFS search documentation improvements #1361

Comments

barakatzir commented Jan 13, 2025 • edited Loading

What is the expected enhancement?

IvanIsCoding commented Jan 14, 2025

Panicking

Modifying the graph

Type annotations

Documentation improvements

barakatzir commented Jan 14, 2025

IvanIsCoding commented Jan 14, 2025

IvanIsCoding commented Jan 15, 2025

barakatzir commented Jan 15, 2025 • edited Loading

IvanIsCoding commented Jan 15, 2025 • edited Loading

barakatzir commented Jan 15, 2025 • edited Loading

barakatzir commented Jan 13, 2025 •

edited

Loading

barakatzir commented Jan 15, 2025 •

edited

Loading

IvanIsCoding commented Jan 15, 2025 •

edited

Loading

barakatzir commented Jan 15, 2025 •

edited

Loading