Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

misc/metrics: Replace tide with some other webserver #2432

Closed
mxinden opened this issue Jan 13, 2022 · 12 comments
Closed

misc/metrics: Replace tide with some other webserver #2432

mxinden opened this issue Jan 13, 2022 · 12 comments
Labels
difficulty:easy getting-started Issues that can be tackled if you don't know the internals of libp2p very well help wanted

Comments

@mxinden
Copy link
Member

mxinden commented Jan 13, 2022

cargo audit reports two vulnerabilities in the dependency graph of tide:

Given that we use tide for an example only, I would guess any other webserver would do.

Blocks #2430.

@mxinden mxinden added difficulty:easy help wanted getting-started Issues that can be tackled if you don't know the internals of libp2p very well labels Jan 13, 2022
@esotericbyte
Copy link
Contributor

esotericbyte commented Jan 24, 2022

I'm glad to help in any way I can. I just don't to go swatting at flys with an albacore tuna if that can be avoided.
Changing the web server is not a trivial change. If another one is required that adds another dependency downstream for libp2p + tide users.
My research on this indicates that the issue was patched by forcing utc on affected platforms, and this patch is months old. Has the segfault been reproduced? This "fix" has left issues open not because of the segfault but because of breaking local time in the fix. For a webserver utc time is normal.
Tide is an actively maintained project.
Additionally the tide project might have more congruent dependencies with libp2p than whatever other webserver you choose does adding upstream dependencies as well.

Am I missing a second issue that was not fixed? It seemed the cited issues were referencing the same thing.
If you want more root cause research and tests I will set up an environment to do so.

@mxinden
Copy link
Member Author

mxinden commented Jan 24, 2022

Hi @esotericbyte,

I would very much appreciate additional help here.

My motivation:

  1. I would like to detect vulnerabilities in upstream dependencies early.
  2. To me an automated process like the one introduced through .github/workflows: Add cargo audit #2430 seems to be the most efficient.
  3. In order to merge .github/workflows: Add cargo audit #2430 we either have to:
    1. Remove all warning and error advisories.
    2. Keep track of a list of advisories to ignore.

Given that the list of advisories is relatively small, I suggested to do (3.1) (removing all advisories).

Ultimately I don't care how we achieve (1). @esotericbyte how would you approach this problem?

@esotericbyte
Copy link
Contributor

esotericbyte commented Jan 27, 2022

i wrote a js filter that condensed the advisory from cargo audit --json and realized that all of the warning issues in the main build of rust-libp2p are the same as those for tide alone and that this is also a problem for the example for open metrics client. I will start to work on a version using Warp. It uses some syntax that seems very concise but it does have the same vulnerability with chrono. Actix another alternative and cargo audit itself also uses chrono.
Do you have ideas on how to avoid the issue with Chrono? I will look into what is being done solve it upstream on Friday.

@rkuhn
Copy link
Contributor

rkuhn commented Jan 27, 2022

When using Warp, I recommend using .boxed() on filters quite aggressively — I recently tracked down a factor two in compilation times due to unboxed filters and their rather large types, and the problem is that this issue is not visible when building the library, only when linking a binary (more precisely: in the monomorphisation step before calling the linker).

@ahollmann
Copy link

Have you taken a look at axum web framework?

By the hyper and warp author on axum:

Then came Axum. Axum is a Rust server framework designed to extend from Tower and Tower-HTTP. While warp was always able to work with Tower, it’s main Filter API was very #opinionated, and felt foreign to those less-used to functional programming. Axum lives next to warp, more fully embracing the Service, and using a much more common Router style.

@esotericbyte
Copy link
Contributor

I looked at hyper ! It seems that going without a framework and just using the http server , going full boiler obviously is a good way to go and avoids 3 problems.

  1. No more audit errors!
  2. No Warp linking problems.
  3. Avoid opinions and complexity of web frameworks to serve metrics.

The disadvantage is longer code.

Thinking about responses to comments helped lead me toward this direction. I'd be interested in your opinions.

The metrics code example currently uses tide framework to serve the metrics along the side of another server or in this case peer. This is not the start of a web service or to integrate with a framework to collect the metrics. Those are separate issues. This adds a listener to an additional port to provide metrics to e.g. Prometheus. I intend to replace tide.
I thought warp would be good because the code was short but it also adds back the one advisory that is an error. This I will drop the idea of using warp.

The current plan is to define functions to serve the metrics using hyper.

When using Warp, I recommend using .boxed() on filters quite aggressively — I recently tracked down a factor two in compilation times due to unboxed filters and their rather large types, and the problem is that this issue is not visible when building the library, only when linking a binary (more precisely: in the monomorphisation step before calling the linker).

The goal is simple understandable code to serve the metrics. It's not going to get any more complex. With a short example will compile times have more impact on developer experience than simple code?
I don't understand the precise "monomorphisation" step. I will research linking details at some point. If I did go warp maybe you could add the boxed where it needs to be? Does this have zero runtime impact or does it use more memory?

Have you taken a look at axum web framework?
I had not. It looks like it would be a decent choice. After initial research, admittedly breezy, I found complaints about nesting syntax when defining a multi-point router that look less clear to me than warp but that would not apply to this case. It is also more verbose than warp. The highly functional syntax of warp filters certainly is not that weird to anyone exposed to , for example, JS6. I like the approach if warp for short code. It looks easier to have more control in axum. It also sounds like you can combine them .. but how not sure and what is Tower? Warp looks like a good framework for a rapid start on small project you know will stay small or that you will refactor or burn eventually anyway.

@rkuhn
Copy link
Contributor

rkuhn commented Jan 29, 2022

With a short example will compile times have more impact on developer experience than simple code?

This is one of the hard questions. My opinion is that warp code can be quite easy to read for someone who is only interested in the HTTP semantics, but it is definitely not simple — to me “simple” means “the abstraction works perfectly, without leakage, and I can understand the implementation details if need be”. A hammer & chisel is simple, a CNC milling machine can do many more things (which may be easy) but is definitely not simple.

Does this have zero runtime impact or does it use more memory?

As per the above, I cannot tell. I’m using warp in a context where requests happen infrequently, so it wouldn’t even matter if the resulting routing code allocated filters on the heap for every request (which BTW is what the high-level akka-http DSL does, in a completely different ecosystem).

Since serving a prometheus endpoint is a rather small surface area, I’d find it simpler to go low-level than to use ca. 1% of a high-level framework (I have no experience with axum yet, so can’t tell whether that counts as “low-level”).

@ahollmann
Copy link

ahollmann commented Jan 29, 2022

Since serving a prometheus endpoint is a rather small surface area, I’d find it simpler to go low-level than to use ca. 1% of a high-level framework (I have no experience with axum yet, so can’t tell whether that counts as “low-level”).

Axum is around the same abstraction level as tide and warp. It is very new (first release on Jul 30, 2021) and very actively developed. The benefit over tide is that is uses tokio instead of async-std as async runtime. At the same time it uses less "magic" compared to warp and is more actively developed.

If you don't need the middleware (e.g. to add metrics to your endpoints) a higher level web frameworks offers and accept some more boilerplate then hyper should be perfectly fine as nearly all web frameworks are a thin layer on top of hyper anyway. In the end it depends on your requirements. I mentioned axum as the previous choice was tide.

@dignifiedquire
Copy link
Member

Looks like chrono itself is not patched at all, which is quite unfortunate: chronotope/chrono#499.
I will take a look and see if I can get tide fixed, as that needs to be done anyway.

@esotericbyte
Copy link
Contributor

Nearly done with a hyper based metrics server that complies but i need to test it. It's on my fork but is not ready for a PR.
I need to test it and remove the tide server, and fix a couple of other issues.
fork & branch:
https://github.com/esotericbyte/rust-libp2p/tree/misc-metrics-tide-to-hyper

After nearly done I can not help but question if the extra complexity of an asynchronous threaded http server is overkill for this purpose. Among possible downsides hyper and tokio had to be compiled with feature full for some trait implementations.
Maybe the standard lib or some other place as a dead simple http server?

@mxinden
Copy link
Member Author

mxinden commented Feb 14, 2022

Maybe the standard lib or some other place as a dead simple http server?

🙏 No need to be asynchronous. The simpler the better.

@esotericbyte
Copy link
Contributor

I finished the hyper sever at the above branch and removed Tide from metrics/Cargo.toml
The code was not all that trivial for a simpler server. I could still do that.
Since the code for the hyper server is lower level and longer I put it in a separate file.

@mxinden mxinden closed this as completed Mar 5, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
difficulty:easy getting-started Issues that can be tackled if you don't know the internals of libp2p very well help wanted
Projects
None yet
Development

No branches or pull requests

5 participants