Debug and Validation Layers #416

Merged
merged 24 commits into from
Aug 2, 2024


Collaborator

@armansito armansito commented Dec 13, 2023

We've been talking about various ways to perform CPU-side validation/testing over the outputs of Vello pipeline stages. It's also generally useful to be able to visualize the contents of some of these intermediate data structures (such as bounding boxes, the line soup, etc) and to be able to visually interpret any errors that are surfaced from the validation tests.

I implemented the beginnings of this in a new debug_layers feature. I'm putting this up as a Draft PR as there are a few unresolved aspects that I'd like to get feedback on first (more on this below).

Running Instructions

To try this out, run the with_winit example with --features debug_layers and use the number keys (1-4) to toggle the individual layers.

Summary

This PR introduces the concept of "debug layers" that are rendered directly to the surface texture over the fine rasterizer output. There are currently 4 different layers:

  • BOUNDING_BOXES: Visualizes the edges of path bounding boxes as green lines
  • LINESOUP_SEGMENTS: Visualizes LineSoup segments as orange lines
  • LINESOUP_POINTS: Visualizes the LineSoup endpoints as cyan circles
  • VALIDATION: Runs a collection of validation tests on intermediate Vello GPU data structures. This is currently limited to a watertightness test on line segments. Following the test run, the layer visualizes the positions of detected errors as red circles.

These layers can be individually toggled using a new DebugLayers field in vello::RenderParams. The following is an example output with all 4 layers enabled:

[Image: Screenshot 2023-12-12 at 3 13 51 PM]

Each layer is implemented as an individual render pass. The first 3 layers are simple visualizations of intermediate GPU buffers. The VALIDATION layer is special since it runs CPU-side validation steps (currently a watertightness test over the LineSoup buffer), which requires read-back. The general idea is that VALIDATION can grow to encompass additional sanity checks.
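As a rough illustration of how individually toggled layers can be represented, here is a minimal sketch of a bit-set type. It is not the PR's actual `DebugLayers` definition: the `u8` representation, the `none`, `contains` and `toggle` methods, and the `layer_for_key` helper are all assumptions made for this example. Only the four layer names and the fact that key 4 toggles VALIDATION come from the discussion.

```rust
/// Hypothetical stand-in for the PR's `DebugLayers`; the real type's
/// representation and method names may differ.
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct DebugLayers(u8);

impl DebugLayers {
    pub const BOUNDING_BOXES: Self = Self(1 << 0);
    pub const LINESOUP_SEGMENTS: Self = Self(1 << 1);
    pub const LINESOUP_POINTS: Self = Self(1 << 2);
    pub const VALIDATION: Self = Self(1 << 3);

    /// No layers enabled.
    pub const fn none() -> Self {
        Self(0)
    }

    /// True if every bit of `layer` is set in `self`.
    pub fn contains(self, layer: Self) -> bool {
        (self.0 & layer.0) == layer.0
    }

    /// Flip a layer on or off, as a key press would.
    pub fn toggle(&mut self, layer: Self) {
        self.0 ^= layer.0;
    }
}

/// Maps the number keys 1-4 to layers. Only "4 = VALIDATION" is confirmed
/// by the discussion; the order of 1-3 is assumed from the layer list.
pub fn layer_for_key(key: char) -> Option<DebugLayers> {
    match key {
        '1' => Some(DebugLayers::BOUNDING_BOXES),
        '2' => Some(DebugLayers::LINESOUP_SEGMENTS),
        '3' => Some(DebugLayers::LINESOUP_POINTS),
        '4' => Some(DebugLayers::VALIDATION),
        _ => None,
    }
}
```

A renderer built along these lines would check `contains` once per layer and record one render pass for each enabled layer.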

Overview of Changes

  • Engine support for render pipeline creation and draw commands. In particular, the existing blit pipeline can now be expressed as a Recording. The debug layer render passes get recorded to this Recording. All render passes share the same render encoder and target the same surface texture.
  • A simple mechanism to extend the lifetime of GPU buffers beyond their original Recording to allow them to be used in a subsequent render pass. Currently this separation into multiple recordings is necessary since the visualizations require GPU->CPU read-back. This is partially encapsulated by the new CapturedBuffers data structure.
  • The debug module and the DebugLayers and DebugRenderer data structures. DebugRenderer is an encapsulation of the various render pipelines used to visualize the layers that are requested via DebugLayers. DebugRenderer is currently also responsible for executing the validation tests when the VALIDATION layer is enabled.
  • The with_winit example now has key bindings (the number keys 1-4) to toggle the individual layers.

Open Questions

  1. It probably makes sense to have a better separation between running validation tests and visualizing their output. Currently both are performed by DebugRenderer::render.

  2. CapturedBuffers doesn't handle buffer cleanup well. The current engine abstractions require that a buffer be returned to the underlying engine's pool via a Recording command. This creates and submits a command buffer simply to free a buffer, which is a bit too heavyweight. This whole mechanism could use some rethinking.

    Currently, these buffers get conditionally freed in various places in the code and it would be nice to consolidate that logic.

  3. The VALIDATION layer currently doesn't work with the --use-cpu flag since the buffer download command isn't supported for CPU-only buffers. At the moment, it's the job of src/render.rs to know which buffers need to be downloaded for validation purposes, and it simply records a download command. It would be nice for the engine to make the download command seamless across both CPU and GPU buffers rather than having the src/render.rs code do something different across the CPU vs GPU modalities.

  4. Currently all layers require read-back. The debug layers (BOUNDING_BOXES, LINESOUP_SEGMENTS, LINESOUP_POINTS) read back the BumpAllocators buffers to obtain instance counts used in their draw commands. This read-back could be avoided by instead issuing indirect draws for the debug layers. I think this could be implemented with a relatively simple scheme: a new compute pipeline stage is introduced (gated by #[cfg(feature = "debug_layers")]), which can inspect any number of the vello intermediate buffers (such as the bump buffer) and populate an indirect draw buffer. The indirect draw buffer would be laid out with multiple DrawIndirect entries, each assigned to a different pre-defined instance type (the DebugRenderer only issues instanced draws). DebugRenderer would then issue an indirect draw with the appropriate indirect buffer offset for each render pipeline.

    The read-back would still be necessary for CPU-side validation stages and their visualization can't really take advantage of the indirect draw. Then again, the exact ordering of the draw submission and the read-backs implemented in this PR is likely to change following the proposal in Strategy for robust dynamic memory, readback, and async #366.
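The indirect-draw scheme proposed in item 4 can be sketched on the CPU as follows. This is only a model of the buffer layout, not code from the PR. The four-`u32` argument struct follows the standard layout for non-indexed indirect draws. The `InstanceType` enum, its variant names, and `populate_indirect` are hypothetical, and in the proposal the population step would run in a compute shader rather than on the host.

```rust
use std::mem::size_of;

/// Arguments for one non-indexed indirect draw: four consecutive u32 values.
#[repr(C)]
#[derive(Clone, Copy, Debug, Default, PartialEq, Eq)]
pub struct DrawIndirectArgs {
    pub vertex_count: u32,
    pub instance_count: u32,
    pub first_vertex: u32,
    pub first_instance: u32,
}

/// Hypothetical pre-defined instance types, one indirect-buffer slot each.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum InstanceType {
    BoundingBox = 0,
    LineSegment = 1,
    LinePoint = 2,
}

impl InstanceType {
    /// Byte offset of this type's entry within the indirect draw buffer.
    pub fn indirect_offset(self) -> u64 {
        (self as u64) * size_of::<DrawIndirectArgs>() as u64
    }
}

/// CPU model of what the proposed compute stage would write: one entry per
/// instance type, with instance counts taken from the intermediate buffers.
/// The per-instance vertex counts are supplied by the caller because the
/// discussion does not specify them.
pub fn populate_indirect(
    instance_counts: [u32; 3],
    vertices_per_instance: [u32; 3],
) -> [DrawIndirectArgs; 3] {
    let mut entries = [DrawIndirectArgs::default(); 3];
    for (i, entry) in entries.iter_mut().enumerate() {
        entry.vertex_count = vertices_per_instance[i];
        entry.instance_count = instance_counts[i];
    }
    entries
}
```

Each debug render pipeline would then issue its indirect draw at `indirect_offset()` for its instance type, so no instance count has to travel back to the CPU.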

Collaborator

@waywardmonkeys waywardmonkeys left a comment

A nit, but could this and the sin_cos change be done separately from this?

@armansito armansito marked this pull request as ready for review January 18, 2024 04:01
@armansito
Collaborator Author

> A nit, but could this and the sin_cos change be done separately from this?

Thanks for taking a look. I don't mind sending those as a separate PR.

@armansito
Collaborator Author

armansito commented Jan 18, 2024

I decided to move this out of Draft status to kick off the discussion. Some of the issues that I mentioned in the original comment above (under "Open Questions") are still open and I would love to hear people's thoughts on them.

@PoignardAzur
Collaborator

To be clear, this would be visual validation for debugging, right? This doesn't include a way to say eg "assert that this list of commands doesn't trigger a validation error"?

Member

@DJMcNab DJMcNab left a comment

Overall, having some more debug code is extremely useful, thanks.

I've not done a very thoughtful/thorough review yet, but think getting initial comments in is useful.

I was unable to see any watertightness failures in tiger as in your screenshot. Were those intentionally added for an example?

src/debug.rs Outdated
}
}

const SHADERS: &str = r#"
Member

I think I'd prefer this to be its own file, primarily for syntax highlighting reasons

@armansito
Collaborator Author

> To be clear, this would be visual validation for debugging, right? This doesn't include a way to say eg "assert that this list of commands doesn't trigger a validation error"?

This PR is mostly about visual validation, yes. That said, the watertightness test could be run as part of a test runner that actually asserts or reports the failures in textual form (e.g. by incorporating it into what's proposed in #439).

@armansito
Collaborator Author

> Overall, having some more debug code is extremely useful, thanks.

> I've not done a very thoughtful/thorough review yet, but think getting initial comments in is useful.

Thanks, I'll address them as soon as I get time.

> I was unable to see any watertightness failures in tiger as in your screenshot. Were those intentionally added for an example?

No, these are real errors that are present in the flattening logic (in both the CPU and GPU versions of flatten). I'm surprised you're not seeing them. Did you try enabling that layer by pressing "4"?

I'm also intrigued by #439 and I'm wondering if the watertightness test can be run as part of that as well.

@DJMcNab
Member

DJMcNab commented Feb 15, 2024

> No, these are real errors that are present in the flattening logic (in both the CPU and GPU versions of flatten). I'm surprised you're not seeing them. Did you try enabling that layer by pressing "4"?

Hmm, I still don't see them. The layer appears to be enabled, as the screen goes dark and the frame time increases massively.

I can't test the CPU version, as that instantly panics (as expected/as you documented) due to the download issue.

@PoignardAzur
Collaborator

Could these errors be non-deterministic?

@armansito
Collaborator Author

> > No, these are real errors that are present in the flattening logic (in both the CPU and GPU versions of flatten). I'm surprised you're not seeing them. Did you try enabling that layer by pressing "4"?

> Hmm, I still don't see them. The layer appears to be enabled, as the screen goes dark, and the frame time increases massively

Ah, you're likely not seeing the errors because the GPU-side stroke expansion is disabled by default (look for GPU_STROKES in src/scene.rs) and the stroke encoding converts strokes to fills using kurbo. These errors are only present due to a bug in the new GPU stroke expansion code, so try setting GPU_STROKES to true before trying again.

Member

@xStrom xStrom left a comment

Since #456 we have standard copyright headers. Please update the two files to have consistent capitalization with the standard.

@DJMcNab DJMcNab self-requested a review February 29, 2024 15:34
@DJMcNab DJMcNab mentioned this pull request May 19, 2024
@DJMcNab
Member

DJMcNab commented Jun 5, 2024

I'm sorry this hasn't been reviewed for so long - it keeps on being pushed down my todo list.

In my mind, the major concern is that this doesn't compose with the CPU shaders. It feels like it should be trivial to do readback on a CPU buffer, given that we already have it in memory; is it only an API issue?
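One way to picture the "seamless download" idea is an enum over the two storage kinds with a single download entry point. The following is a minimal sketch under assumed names: `BufferStorage`, `GpuReadback`, `download` and `FakeGpu` are all hypothetical and are not part of the engine's API. It also glosses over the fact that real GPU read-back is asynchronous.

```rust
use std::collections::HashMap;

/// Hypothetical view of where a buffer's contents live.
pub enum BufferStorage {
    /// Contents are already in host memory (the CPU shader path).
    Cpu(Vec<u8>),
    /// Contents live on the device; `id` is a placeholder for a GPU handle.
    Gpu { id: u64 },
}

/// Hypothetical abstraction over whatever performs the actual GPU read-back.
pub trait GpuReadback {
    fn read(&mut self, id: u64) -> Vec<u8>;
}

/// A download that behaves the same for both storage kinds: CPU buffers are
/// copied directly, GPU buffers go through the read-back path.
pub fn download(storage: &BufferStorage, gpu: &mut dyn GpuReadback) -> Vec<u8> {
    match storage {
        BufferStorage::Cpu(bytes) => bytes.clone(),
        BufferStorage::Gpu { id } => gpu.read(*id),
    }
}

/// In-memory stand-in for a device, used only to exercise the sketch.
pub struct FakeGpu {
    pub buffers: HashMap<u64, Vec<u8>>,
    pub readbacks: usize,
}

impl GpuReadback for FakeGpu {
    fn read(&mut self, id: u64) -> Vec<u8> {
        self.readbacks += 1;
        self.buffers.get(&id).cloned().unwrap_or_default()
    }
}
```

With a shape like this, the rendering code would record the same download regardless of mode, and only the engine would know whether a device round trip is needed.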

Member

@DJMcNab DJMcNab left a comment

I'm slightly conflicted here; the actual debug layer code looks good and works fine, but it's very clear that the code here has outgrown our abstractions.

Given that:

  1. We recommend that users use the render_to_surface method (rather than render_to_surface_async)
  2. All the new complexity is inside render_to_surface_async

I think it's fine to ship this. It's no secret that this code is going to need a really careful rethink at some point, but that isn't the job of this PR, and doing that rethink will be easier if we have this use case in mind.
That's also why I don't mind the CPU shaders not being supported here; clearly it's pretty far from ideal, but it doesn't break anything too badly there.

Sorry again for not reviewing this earlier; my personal philosophy on code review has shifted in the intervening time, to more strongly favour landing things and improving them in situ.

This PR is however definitely blocked on restoring the wgpu-profiler functionality, i.e. the resolve_queries calls.

@@ -360,8 +386,11 @@ impl WgpuEngine {
transient_map
.bufs
.insert(buf_proxy.id, TransientBuf::Cpu(bytes));
let usage =
BufferUsages::COPY_SRC | BufferUsages::COPY_DST | BufferUsages::STORAGE;
// TODO: restrict VERTEX usage to "debug_layers" feature?
Member

Yeah, we have lots of issues with implicit usages; I think we might need to either make this explicit at the recording level, or else loop through the recording and determine what capabilities we need. I think I'd prefer to just make it explicit.

But this isn't a task for this PR.
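The second option mentioned here, looping through the recording to determine the required capabilities, could look roughly like the sketch below. The `Command` enum, the usage constants and `required_usages` are hypothetical simplifications for illustration. The engine's real `Recording` commands and wgpu's `BufferUsages` are richer than this.

```rust
use std::collections::HashMap;

pub type BufId = u64;

// Hypothetical usage bits; these only loosely mirror GPU buffer usage flags.
pub const COPY_SRC: u32 = 1 << 0;
pub const COPY_DST: u32 = 1 << 1;
pub const STORAGE: u32 = 1 << 2;
pub const VERTEX: u32 = 1 << 3;

/// Simplified stand-in for recorded commands.
pub enum Command {
    Upload(BufId),
    Dispatch(Vec<BufId>),
    Draw { vertex_buffer: BufId },
    Download(BufId),
}

/// Walks the recording once and accumulates, per buffer, the usages that
/// the recorded commands actually need.
pub fn required_usages(recording: &[Command]) -> HashMap<BufId, u32> {
    let mut usages: HashMap<BufId, u32> = HashMap::new();
    for command in recording {
        match command {
            Command::Upload(id) => {
                *usages.entry(*id).or_insert(0) |= COPY_DST;
            }
            Command::Dispatch(ids) => {
                for id in ids {
                    *usages.entry(*id).or_insert(0) |= STORAGE;
                }
            }
            Command::Draw { vertex_buffer } => {
                *usages.entry(*vertex_buffer).or_insert(0) |= VERTEX;
            }
            Command::Download(id) => {
                *usages.entry(*id).or_insert(0) |= COPY_SRC;
            }
        }
    }
    usages
}
```

Under this approach, VERTEX usage would only be requested for buffers that a draw command actually binds, which would also resolve the TODO about restricting it to the debug_layers feature. The explicit alternative preferred in the comment would instead attach the usage set when the buffer is declared.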

}
}

pub(crate) fn validate_line_soup(lines: &[LineSoup]) -> Vec<LineEndpoint> {
Member

If we did want to do this on the GPU, I suppose we'd want to do a 2d sort, so that all points at the same position are next to each other, then check that the points all pair up (maybe in a cleverer way than a single-threaded scan)
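The sort-then-pair idea can be illustrated on the CPU. This sketch is not the PR's `validate_line_soup`. The struct fields, the `find_unpaired_endpoints` name, and the definition of watertightness used here (within a path, every position must be the start of exactly as many segments as it is the end of) are assumptions made for the example. Coordinates are compared by their raw bit patterns because `f32` implements neither `Eq` nor `Ord`. One consequence is that 0.0 and -0.0 count as different positions.

```rust
/// Simplified line segment; field names are assumed for this sketch.
#[derive(Clone, Copy, Debug)]
pub struct LineSoup {
    pub path_ix: u32,
    pub p0: [f32; 2],
    pub p1: [f32; 2],
}

/// Endpoint keyed by path and by the bit patterns of its coordinates,
/// which makes it totally ordered and therefore sortable.
#[derive(Clone, Copy, Debug, PartialEq, Eq, PartialOrd, Ord)]
pub struct LineEndpoint {
    pub path_ix: u32,
    pub x: u32,
    pub y: u32,
}

impl LineEndpoint {
    fn new(path_ix: u32, p: [f32; 2]) -> Self {
        Self { path_ix, x: p[0].to_bits(), y: p[1].to_bits() }
    }
}

/// Returns every endpoint whose start and end counts do not match.
pub fn find_unpaired_endpoints(lines: &[LineSoup]) -> Vec<LineEndpoint> {
    // Tag each endpoint with +1 if a segment starts there, -1 if one ends there.
    let mut tagged: Vec<(LineEndpoint, i32)> = Vec::with_capacity(lines.len() * 2);
    for line in lines {
        tagged.push((LineEndpoint::new(line.path_ix, line.p0), 1));
        tagged.push((LineEndpoint::new(line.path_ix, line.p1), -1));
    }
    // Sorting places all tags for the same position next to each other.
    tagged.sort_unstable();

    let mut errors = Vec::new();
    let mut i = 0;
    while i < tagged.len() {
        let key = tagged[i].0;
        let mut balance = 0;
        while i < tagged.len() && tagged[i].0 == key {
            balance += tagged[i].1;
            i += 1;
        }
        // A non-zero balance means some start or end has no partner.
        if balance != 0 {
            errors.push(key);
        }
    }
    errors
}
```

The scan after the sort is single-threaded here. A GPU version would need a parallel sort and a segmented reduction over each run of equal keys, which is the "cleverer way" the comment alludes to.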

Collaborator

@waywardmonkeys waywardmonkeys left a comment

A first pass....

pub struct LineEndpoint {
pub path_ix: u32,

// Coordinates in IEEE-754 32-bit float representation
Collaborator

Why not just store them as f32? (This should at least be explained here if it stays this way.)

Member

Added my best-guess explanation

armansito and others added 7 commits August 2, 2024 10:54
Recording and WgpuEngine can now record and execute draw commands with
a render pipeline.
The blit pipeline now uses the render pass feature in Recording
instead of making wgpu calls directly.
This introduces a new debug module and DebugLayers data structure that
can render debug visualizations of GPU buffers that are internal to the
vello pipeline. This currently supports a line-based visualization of
the LineSoup and PathBbox buffers.

The debug visualization depends on CPU read-back of the BumpAllocators
buffer to issue a draw call. This could technically be avoided with an
indirect draw but the visualizations will eventually include CPU-side
processing and validation.

The draws are recorded to the same Recording as the blit in
`render_to_texture_async`. The buffers that are used by the draw
commands are temporarily retained outside of the `Render` data
structure. These buffers are currently released back to the engine
explicitly and in various places in code since safe resource removal
currently requires a Recording.
Added the `ResourceProxy::BufferRange` type, which represents a buffer
binding with an offset.
Both blit and debug pipelines need to be recompiled when the engine gets
recreated.
- Introduce the VALIDATION layer which runs a watertightness test on
  LineSoup buffer contents.
- Add debug visualization layers for LineSoup endpoints and validation
  test errors.
- Add DebugLayers options to toggle individual layers.
- Add with_winit key bindings to toggle individual layers.
Collaborator

@waywardmonkeys waywardmonkeys left a comment

I don't know about the actual visualization parts, but I'm okay with the integration parts!

@DJMcNab DJMcNab added this pull request to the merge queue Aug 2, 2024
Merged via the queue into main with commit 74b6155 Aug 2, 2024
17 checks passed
@DJMcNab DJMcNab deleted the debug-layers branch August 2, 2024 10:12