workshop with @flomonster conclusions

Signed-off-by: Leo Valais <leo.valais97@gmail.com>
OpenRailAssociation · Jan 7, 2025 · 0581f19 · 0581f19
1 parent db47135
commit 0581f19
Showing 1 changed file with 37 additions and 45 deletions.
diff --git a/content/docs/reference/design-docs/editoast-errors/_index.en.md b/content/docs/reference/design-docs/editoast-errors/_index.en.md
@@ -19,18 +19,19 @@ weight: 80
 
 - Have a clear separation between logically distinct errors.
 - Dispose of a way to actually match on errors when they occur deeper in the stack
-- Tie the errors to the endpoint they originate from in the OpenAPI.
 - Separate the error definition and their serialization.
 - Establish how we want to forward Core's errors.
-- End the `InternalError` hegemony ⚔️
+- Tie the errors to the endpoint they originate from in the OpenAPI.
 
 # Constraints
 
 - Keep the same error format (for backward compatibility reasons to avoid involving the frontend too much).
+  - We must keep the `editoast:` prefix in the error type.
 - The error must live until it is handled, conversion to our standard error format only happens in the response serialization.
 - Errors must implement `std::error::Error`.
 - Errors must be composable. This will typically be handled by `thiserror`'s `#[from]` attribute.
 - Error variants must be shareable to ensure the deduplication of error kinds. For example, let's say we have two functions `get_infra(id: usize)` and `rename_infra(id: usize, name: String)`. Both functions error types have to include a variant describing the error case of an infrastructure not being found by its ID. However, we can't duplicate something like `InfraNotFound { id: usize }` in both error types as this leads to two different error paths describing the same error case. This is especially problematic for error translation keys. We need to be able to define an error `InfraNotFound` and include it in both error types.
+  - A unicity check may be performed in the post-processing of the OpenAPI file to ensure that each error has a unique error type.
 - Each endpoint must provide all its error cases in the OpenAPI. (How the frontend will consume them is another problem that we'll have to deal with.)
 - As for OSRD errors, the `context` field of the error is populated in the views.
 - Errors can be handled in a generic manner (for situations where it makes some sense to do so).
@@ -48,6 +49,9 @@ weight: 80
 - Errors **do not** require to `impl Serialize`. However, `derive(ViewError)` generates an `impl ViewError` which is used to generate the error that will be serialized in the HTTP response.
 - Error cases that will be used repeatedly are defined as a `struct` but still `derive(thiserror::Error)`.
 - The `error_type` of each variant is generated by the macro at the format `ErrorTypeName::VariantName`, but can be provided explicitly if conflicts arise.
+  - `editoast:` will be prepended systematically to indicate the service that raised the error.
+  - Since this type is not guaranteed to be unique, we may implement a post-processing step to ensure errors with the same `error_type` have the same OpenAPI schema.
+  - To ease the debugging process, an optional `source_location` will be provided in `ViewError`s containing a link to the GitHub file and line where the error is defined.
 
 ## Nominal case
 
@@ -97,7 +101,7 @@ pub enum EndpointError {
     #[view_error(user)] // <=> status = 400
     ResourceNotFound(#[from] GetError),
 
-    #[view_error(internal)] // <=> status = 500, default, overrides the message to "Internal error - please contact your administrator"
+    #[view_error(internal)] // <=> status = 500, default
     ProcessingFailed(#[from] ProcessingError)
 }
 ```
@@ -242,12 +246,13 @@ enum Error {
     #[view_error(user, context)]
     Nein { reasons: Vec<String> }, // { "reasons": [string] }
 
-    // explicitly builds the dictionary (*)
-    #[view_error(user, context(
-        reason = "format!(\"because {reason}\")",
-        recovery = "format!(\"do this instead {recovery}\")"
-    ))]
-    QueNo { reason: String, recovery: String },
+    // select (and maybe rename) some fields
+    #[view_error(user, context(reason, recovery_id = "recovery"))]
+    QueNo {
+        reason: String,
+        recovery_id: String,
+        not_serializable: mspc::Send<()>
+    },
 }
 
 // with a provider function
@@ -263,46 +268,35 @@ fn context_provider(error: Error) -> HashMap<String, serde_json::Value> {
 }
 ````
 
-`(*)`: only if the mapping can be collected easily using `darling`.
-
 ### About Core errors
 
-The Core service is a bit special as it already returns errors with the common OSRD format. Do we (editoast) decide to forward them without changing them or to wrap them in a `struct CoreError`?
-
-#### Forwarding Core errors
-
-Pros:
-
-* Easy to implement
-* Convenient for Core devs when debugging using the frontend
-
-Cons:
-
-* We can't list the errors in the OpenApi
-* The frontend has to know which Core errors to expect
-* They're untyped in editoast, so matching on them is cumbersome
+The Core service is a bit special as it already returns errors with the common OSRD format. Since editoast doesn't really need to parse and recover from Core errors[^2], we don't need an exhaustive list of them. We still need to differentiate them from other editoast errors (let's not start tossing `InternalError` around again…) and to provide a key for the frontend to translate them.
 
-I (Léo) do not like this choice 😆
+[^2]: Core errors will likely never be recoverable from either editoast or the frontend. For the latter, such errors are likely to be displayed as a generic "Internal error" message. So no translation is needed. For these reasons, we don't need to them in the OpenAPI. However, if in the future, we want editoast to actually parse Core errors, ensuring a proper mapping will still be possible.
 
-#### Wrapping Core errors
+Core errors are then "lightly" wrapped: we keep the error as a generic `serde_json::Value` that we include into a `struct CoreError` that we can augment with additional information about the request. This way, the original is preserved, forwarded to the frontend, but fits our new error paradigm.
 
-Pros:
-
-* Errors are typed
-* Core errors are listed in the OpenAPI
-* The frontend can start to consider editoast being "the backend" instead of half of it
-* Editoast can match on Core errors
-
-Cons:
-
-* We need to maintain some sync between the errors Core sends and which errors editoast expects (this can be done automatically in the CI, but it's probably fine to maintain the sync manually at first)
-* The implementors of an `AsCoreRequest` in the client also has to list all the expected errors (one might argue that it's part of the job...)
-* We need to make sure that we forward **all** information returned by Core otherwise it might degrade the debuggability when using the frontend.
+`CoreError` draft:
 
+```rust
+// in editoast_core
 
----
+#[derive(Debug, thiserror::Error)]
+struct CoreError {
+    /// RabbitMQ "endpoint"
+    rpc: String,
+    /// Request metadata
+    metadata: HashMap<String, String>,
+    /// The original error
+    error: serde_json::Value,
+    /// Additional context the caller of the failed Core RPC can provide
+    ///
+    /// Note named "context" on purpose to avoid confusion with the `context` field of the standard OSRD error.
+    additional_information: Option<HashMap<String, serde_json::Value>>,
+}
+```
 
-We probably should organize a workshop on this topic to determine what's the target and how to maintain the sync between Core and editoast errors.
+Note: the `error` field is kept as a `serde_json::Value` and not parsed (even though its format is standard) as we're not supposed to perform any kind of analysis or recovery on it. If we end up parsing it in the future, that means we need a stronger mapping between Core errors and what editoast expects. The red flag will be more obvious if we end up manipulating a JSON dict instead of a proper structure.
 
 ### Implementation plan
 
@@ -316,7 +310,7 @@ We'll need a progressive migration as this implies too much change to fit in a s
 4. Create a derive macro `ViewError` that interfaces with `thiserror::Error` API and generates:
   * `impl ViewError`
   * `impl utoipa::ToSchema` that returns the schema of the generated OSRD error
-5. Adapt the frontend error keys collection script to also look for errors in the OpenAPI routes response section.
+5. ~~Adapt the frontend error keys collection script to also look for errors in the OpenAPI routes response section.~~ We accept a temporary desync of the error keys.
 
 We decide the status code of each `ViewError` at it's definition as it's a required data to generate the response (the status code is part of the payload). Duplicating it in the `utoipa::path` annotation is a bad idea, and extracting it from the generate `utoipa::Route` is tricky. Therefore the error has to stay the source of truth about its status code.
 
@@ -347,9 +341,7 @@ One large change that will have to be atomic will be the adaptation of `Model`'s
 Eventually, when all errors are converted and views errors are attached to their endpoint(s) in the OpenAPI, we'll have to:
 
 1. Remove `trait EditoastError`, `derive(EditoastError)` and `struct InternalError` (at least its former version as the name may be reused in a different scope)
-2. Remove the error collection mechanism
-3. Remove the scanning of `components/schemas` by the frontend script collecting error keys for translation
-4. If not already done, try to ensure the sync between Core and editoast errors is maintained in the CI
+2. Adapt the frontend error keys collection script to look for errors in the OpenAPI routes response section instead of `components/schemas`
 5. (Out of scope) Discuss with the frontend the level of visibility about internal errors we want to give the user
 
 ### Rejected ideas