A proposal of high-performance L7 network GoLang extension for Envoy. #15152
Comments
This essentially boils down to providing some stable C API for filter management, correct? I think in the past we've been trying to avoid having to provide a C API for Envoy due to the heavy use of C++ features like polymorphism (which becomes non-trivial to map to C) and wanting to avoid yet another set of APIs to maintain, which would slow down our ability to iterate on the existing C++ APIs. I believe Cilium did this for L4 filters (https://github.com/cilium/proxy/blob/master/proxylib/libcilium.h), but that has a substantially smaller API surface area than L7 (e.g. it doesn't require mapping HeaderMap into C). Do you have a sense of what the APIs would look like to make this a useful feature? |
No. In this scenario C++ calls GoLang rather than GoLang calling C++/C, so there is no need to implement a C API in Envoy. BTW, the HTTP GoLang filter is an ordinary Envoy HTTP extension filter and does not modify filter management; it is similar to the way Envoy supports the Lua extension (https://github.com/envoyproxy/envoy/tree/main/source/extensions/filters/http/lua).
The Cilium L4 filter is a good extension: users can easily use GoLang to do some flow control at L4 (for example, enforcing network policies). But L7 information (request/response headers, trailers, etc.) is not conveniently available at L4, which means that some of the best practices mentioned above (such as dynamic routing based on request headers) cannot be implemented via an L4 extension filter. So we propose adding the L7 GoLang extension to let users extend Envoy's HTTP filters through GoLang at L7. |
In general I'm in favor of doing this as I think it would unlock a huge amount of extensibility within the existing cloud native ecosystem. A few high level comments:
I think the next steps here are to put together a more long form gdoc/design doc on this topic, and hopefully connect with the Cilium folks to come up with a shared plan. Feel free to email me if folks need help connecting. Thank you! |
The existing C++ filter API is heavily callback driven, so if we want to expose the full power of C++ filters we'd need to provide a way for the filter to interact with these callbacks, which would require a way to interact with this API through C. If the scope of this is much smaller and more in line with what the Cilium L4 filter does, then the complexity of this is greatly reduced. I guess my point is that it's not clear to me what the desired scope of this functionality is. Do we have a sense of what the API would have to look like in order to power the listed use cases? |
+1 In general this looks great. We've been thinking of using the WASM ABI as the stable C API, so that might hold here as well. We already have the Null (native) WASM extension (which is not sandboxed in WASM but statically linked), so the CGO interface could just be based on that as well. cc @mathetake working on WASM and Go WASM SDK as well. |
Can I ask a stupid question? Why not just introduce a gRPC API with an interface similar to https://github.com/proxy-wasm/proxy-wasm-rust-sdk/blob/master/examples/http_headers.rs? I understand that gRPC may imply a huge latency penalty, but when using Envoy as a sidecar, the communication is usually done over localhost. Also I am not sure how this would work for wasm or golang or whatever for any case other than HTTP since Envoy understands HTTP. For example, what if I want to use this to authorize mysql traffic (e.g. allow |
+1 in general, but we will definitely face tons of issues in GC and any other Go's language level runtime issues (e.g. When to run GC if an extension allocates inside Go?). Would love to see the design doc so that we could discuss this down to the implementation detail.
+1 |
@mandarjog the IMHO, if we wanted to go out-of-process, something like |
I think it is not necessary for the GoLang extension to expose the full API of the C++ filter's callbacks. In addition, the callbacks of an Envoy C++ filter do not necessarily need to be called through a C API; we can design a call description between the C++ and GoLang filters. For example, if set
L4 does not conveniently expose L7 request information (such as request/response headers, trailers, etc.), which means that some of the best practices mentioned above (such as dynamic routing based on request headers, or improving Dapr gRPC performance with the Envoy L7 GoLang extension) cannot be implemented via an L4 extension filter. As @mattklein123 suggested, our next step is to put together a long-form design doc on this topic. Then we can discuss the design details. Thank you! |
It's a good question. On the one hand, it is a performance issue (cross-process communication is required, which involves protocol encoding/decoding, serialization, etc.); @htuch also commented above on this point ( |
@wangfakang yes, it looks like the |
It's a good idea, I will look at this spec: https://github.com/proxy-wasm/spec/blob/cc53262df056036427476b272fb8f2438aa7975f/abi-versions/vNEXT/README.md. Thank you! |
It's very cooool.
Ok.
Thanks for your reply. We'll put together a long-form design doc on this topic, then we can discuss the design details. |
This looks quite promising! |
great! it will provide an alternative to involve envoy filters |
We are happy to participate and we are happy to upstream our work assuming we get some help. |
The CGO API (https://github.com/cilium/proxy/blob/master/cilium/proxylib.h) used by Cilium Go extensions for Envoy is already completely Cilium-independent and while it's currently operated by the Cilium network filter it should be trivial to create a stand-alone Envoy network filter for it. It defines C++ types for Go strings and slices that allow passing configuration and connection data to Go extensions without any data copies. The Go runtime does not even need to be called for all the data as the Go Extension can instruct the Envoy filter to pass or drop some data (such as protocol (reply) payload) based on parsing only the protocol (request) headers. Memory management between Envoy and Go runtime is non-trivial. To keep things simple our CGO API has no callbacks from Go to Envoy. It is the responsibility of Envoy to call into the Go extensions when new data is available or when some connection events trigger (e.g., new connection, close). All return data (policy verdict (pass/drop), injected data or responses) are passed using buffer space provided in C memory by the calls made from Envoy. To Go code these operational buffers are visible as Slices, but some care has to be exercised so that those slices are not re-allocated (expanded) on the Go side, as it is not generally safe to return Go memory to Envoy. I can propose a PR for an Envoy network filter with this CGO API if there is interest for it. At the minimum this could function as a reference for the HTTP extension effort, and at best we could perhaps extend this CGO API for HTTP so that the same Go runtime could be called from both network filter chains and HTTP filter chains. 
For this to be worthwhile for us we'd like to use this new filter instead of our own going forward so the existing functionality would need to be retained even if the implementation morphs into new forms :-) On the Go side all our current parsers are configured by Cilium via Network Policy Discovery Service (NPDS), which is a Cilium defined xDS protocol. This allows runtime configuration of either protocol specific protobuf definitions or generic key-value based policy spec without requiring the Envoy filter chain to be reconfigured on each policy change. This also allowed us to minimize the CGO Envoy API by excluding all protocol-specific definitions from it. Similarly, access logging from the Go parsers is done directly between the Go implementation and Cilium using a protobuf over a Unix domain socket. Integration with Envoy access logging will require extensions to the CGO API. |
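The buffer rule described in this comment (return data is written into space the caller provides, and the Go side must never grow that slice) can be sketched in plain Go. This is only an illustration under stated assumptions: `Verdict`, `writeReply`, `onData` and the "deny requests starting with `!`" rule are hypothetical names invented here, not Cilium's actual CGO API, and an ordinary Go slice stands in for the C memory that Envoy would own.

```go
package main

import (
	"errors"
	"fmt"
)

// Verdict mirrors the pass/drop decision mentioned in the comment above.
// Hypothetical type, not part of any real API.
type Verdict int

const (
	Pass Verdict = iota
	Drop
)

// ErrNoSpace is returned when the reply would not fit in the caller's buffer.
var ErrNoSpace = errors.New("reply does not fit in caller-provided buffer")

// writeReply copies reply into out, which stands in for buffer space owned by
// the caller. It only reslices within cap(out) and never calls append, so the
// backing array is never re-allocated on the Go side.
func writeReply(out []byte, reply []byte) ([]byte, error) {
	if len(reply) > cap(out)-len(out) {
		return out, ErrNoSpace
	}
	n := len(out)
	out = out[:n+len(reply)] // stays inside the existing capacity
	copy(out[n:], reply)
	return out, nil
}

// onData is a hypothetical entry point that the proxy side would call when
// new data arrives. Requests starting with '!' are dropped with a reply.
func onData(req []byte, out []byte) (Verdict, []byte, error) {
	if len(req) > 0 && req[0] == '!' {
		res, err := writeReply(out, []byte("denied"))
		return Drop, res, err
	}
	return Pass, out, nil
}

func main() {
	backing := make([]byte, 0, 16) // stands in for memory provided by the caller
	v, out, err := onData([]byte("!rm"), backing)
	fmt.Println(v == Drop, string(out), err)
	// The reply lives in the caller's backing array, not in new Go memory.
	fmt.Println(&out[0] == &backing[:1][0])
}
```

Using `append` instead of the explicit capacity check would silently allocate a new Go-owned array once the buffer is full, which is exactly the situation the comment warns about.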
Right, mapping HeaderMaps to a structure that Go can read will involve re-creation of an array (or slice) of Go slice headers, while the header names and values themselves do not need to be copied. The header you referred to above is the one generated by CGO and contains C prototypes for the API functions implemented in Go. The supporting code with C++ definitions for the Go slices and strings as well as the dylib code is here: |
Thanks @jrajahalme for all the detail. I met with @nobodyiam yesterday and we discussed this effort. I think these are the next steps:
Overall I'm excited to see this moving forward, at least into proposal form. Having good support for Go extensions will unlock a lot of opportunities for additional contributions and use cases within the Envoy ecosystem. Please feel free to ping me if I can help facilitate any of this either via introductions, chatting about potential solutions, etc. Thank you! cc @PiotrSikora for WASM awareness. |
As @wangfakang mentioned, the It seems the Go extension achieves better performance than WASM via less communication between Envoy and the extension. How will we balance this if we need to put some part into WASM and (maybe) call into Envoy in the Go extension? BTW, if I understand it correctly, the Go extension uses the Dapr way to work with the non-blocking IO model of Envoy, like the yield in Lua. It seems the scheduler of TinyGo (used by the WASM solution) has some limitations: https://github.com/tetratelabs/proxy-wasm-go-sdk#limitations-and-considerations. Does the Dapr way have any limitations? |
On performance, two things stand out to me:
I'm open to the outcome of this not pointing at Wasm ABI, but at the same time, would like to make sure we make that decision using good performance methodology and with the Wasm ABI roadmap in mind. There may be other impedance mismatch reasons to have a new ABI here, but it's important to keep in mind the cost of maintaining two completely independent stable ABIs for Envoy and balance this with the other technical wins. |
I agree with @htuch points above. Basically, we should at minimum investigate whether we can use the WASM ABI or not and document why not. That can be part of our design review. Thank you! |
Thanks @mattklein123 and @htuch for the good suggestions, and @jrajahalme for the details of Cilium's L4 extension. Next we will do some study on it.
|
Hi, we have a humble prototype for this which we made publicly available here, given the interest the topic has received. There are some very specific challenges with this (laid out in the README.md), but overall, we're quite happy with how far we have come... |
Thanks @bkgs! Great to have another data point and more help to drive us to a good solution for all. |
Currently, Envoy ships with 3 different ways to extend its capabilities and/or add business logic:
which offer similar capabilities, but have very different deployment and isolation models, so there are valid reasons for all of them to exist. Right now, they use different APIs, but we should fix that over time to provide consistent user experience. Both of the proposals being discussed here:
are IMHO alternatives to Proxy-Wasm plugins written using Go (TinyGo) SDK, so we should evaluate whether there is a valid reason for adding them. Please correct me if anything below is wrong. The advantages of the Proxy-Wasm solution are:
The advantages of the Go (CGo) solutions seem to be:
I'm not trying to be dismissive, but I'd love to see an apples-to-apples performance comparison, since I have my doubts regarding the claimed better performance, considering that it includes complete Go runtime. Both solutions already exist, so it should be relatively easy for someone to do it for 2 types of filters and share results:
|
A major advantage of the Go solution without a sandbox (or any solution without a sandbox, including just dlopen) is that it can use unmodified libraries, and that part is common with ext_proc. The part where ext_proc extauthz differs is that the server side cannot be configured by xds. Many times you do not want to configure the server side, but sometimes you do and that is not an option. I think these are the 3 axes
|
@PiotrSikora Besides the points mentioned by @mandarjog, maximum reusability of existing Go code did matter to us. We tried TinyGo, and it's simply not an alternative. I would also think the risk of crashing plugins exists in C++ as well, let's just assume filter creators know what they are doing although they are not C++ wizards... |
Could you elaborate on what exactly didn't work for you, so that we can improve it in the future? Is your issue primarily with the Proxy-Wasm sandbox (i.e. lack of wide-open access to the host environment and the outside world) or with the limitations of the TinyGo runtime? Also, is there a reason why Basically, I'm trying to find a proper justification for adding the 4th option, and that includes understanding why the existing options don't work. If the primary reason is indeed "we need a full Go runtime" and not "Proxy-Wasm is too slow", then I think we should evaluate performance against a |
Just to reiterate, @PiotrSikora is doing an excellent job in being really clear about what should be in the design doc / proposal. This is similar to what myself and @htuch are saying, just more fully formed. I think we are all on the same page on what we would like to see before we decide to move forward (or not). |
What is the current progress? I would really like to see Envoy support a GoLang extension; it would make Envoy more flexible as more and more application runtimes and other types of runtime appear. My team would like it as well. |
Currently, the proposal is in the review stage, and here is a draft about Envoy’s GoLang extension proposal. |
I think this will be a good candidate for the contrib/ proposal once that lands: https://docs.google.com/document/d/1yl7GOZK1TDm_7vxQvt8UQEAu07UQFru1uEKXM6ZZg_g/edit#heading=h.e31y5myyfztx |
@PiotrSikora so we have a use case where neither ext_proc nor tinygo + wasm sdk help us - we want to validate requests according to the supplied OpenAPI schema. ext_proc just murders the performance and tinygo doesn't support a lot of things that are required to parse json/yaml/xml, just as @bkgs has said.
So we'd be very happy to have a first-class Go extension API to solve these problems. Are there any formal points that need to be added to the proposal doc? Thank you. |
Since you said "ext_proc just murders the performance," I'd obviously love to know more. I recall someone (maybe you) noticing that header marshalling / unmarshalling was taking a long time, but I never heard whether it was a problem on the Envoy proxy side or on the external processor side.
Do you have a specific use case or even codebase that we could use to start to have some folks work on learning how to optimize ext_proc performance? That'd be super helpful. There are certainly going to be "best practices" that we all learn over time about how best to manage the gRPC stream and the various processing modes, and if we have a specific use case and set of requirements we might also identify some tweaks to ext_proc that could make it faster. Thanks! |
@gbrail sorry for a bit out-of-context (and probably hyperbolic) comment on ext_proc performance. What I meant was actually the latency, which would be much higher if we use this approach instead of in-process request validation. Not that it matters for all use cases (it would add no more than ~40ms tops), but some use cases may not enjoy this. Currently we decided to go with the ext_proc approach for our use case, but latency-critical applications would benefit much more from in-process validation on the proxy side. |
Thanks for the update! It'd be great to understand your use case, but I don't want to burden this PR any further. I think that ext_proc is capable of operating in an environment that adds a lot less than 40ms of latency, but I don't know if there are changes to your use of ext_proc that would help you get there. |
I would be interested in seeing this land. As we have contrib now can we just start out with a contrib extension and go from there? |
Hi @mattklein123. We used this Envoy GoLang extension proposal in our internal gateway scenario in 2021 and achieved good results. We are currently working on landing it in the ServiceMesh scenario, and there is still some work to be done, such as GoLang plugins managed by Istio, stability, usability, etc. We will compile the documentation and code for submission after we have verified and landed it in the ServiceMesh scenario. |
@wangfakang How's it going? Waiting for this feature. |
@Patrick0308 Thank you. Here's a PR #22573 about Envoy's L7 Go extension API, and thanks to @mattklein123 for helping to review it. |
I think this should be done. |
Background
Envoy is an excellent sidecar in the field of service mesh. The use of modern C++ guarantees its high performance, but also raises the barrier for developers. Although Envoy also supports Lua and WASM to enhance its extensibility, these have shortcomings in some scenarios (as shown in the table below). With the growth of GoLang's popularity and its cloud native ecosystem, we introduce a proposal that allows Envoy to support GoLang extensions.
Lua vs GoLang vs WASM Extension on Envoy
Design
By adding the HTTP GoLang extension filter to Envoy's HTTP filters, users can implement an Envoy HTTP filter in GoLang via the GoLang L7 extension SDK.
Http Golang extension filter
The GoLang HTTP extension filter is implemented in C++ on the Envoy side; this module calls the HTTP filter that the developer implements in GoLang.
GoLang L7 extension SDK
GoLang L7 extension SDK exports some CGO APIs for interaction between the Http GoLang extension filter and the GoLang http filter.
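To make the division of labour concrete, here is a minimal sketch of what a filter written against such an SDK might look like, using the header-based dynamic routing use case mentioned in this thread. Everything in it is an assumption made for illustration: `StreamFilter`, `HeaderMap`, `Status`, and the `x-canary` / `x-route-cluster` header names are hypothetical and are not taken from the proposal or from any released SDK. An in-memory map stands in for the header view that the C++ side would provide.

```go
package main

import "fmt"

// Status tells the (hypothetical) C++ side whether to continue the filter chain.
type Status int

const (
	Continue Status = iota
	StopIteration
)

// HeaderMap is a hypothetical read/write view over request headers.
type HeaderMap interface {
	Get(name string) (string, bool)
	Set(name, value string)
}

// StreamFilter is a hypothetical interface a developer would implement in Go;
// the C++ extension filter would call DecodeHeaders for each request.
type StreamFilter interface {
	DecodeHeaders(h HeaderMap, endStream bool) Status
}

// mapHeaders is an in-memory HeaderMap used only for this demonstration.
type mapHeaders map[string]string

func (m mapHeaders) Get(name string) (string, bool) { v, ok := m[name]; return v, ok }
func (m mapHeaders) Set(name, value string)         { m[name] = value }

// canaryRouter picks an upstream cluster based on a request header.
type canaryRouter struct{}

func (canaryRouter) DecodeHeaders(h HeaderMap, endStream bool) Status {
	if v, ok := h.Get("x-canary"); ok && v == "true" {
		h.Set("x-route-cluster", "service-canary")
	} else {
		h.Set("x-route-cluster", "service-stable")
	}
	return Continue
}

func main() {
	var f StreamFilter = canaryRouter{}
	h := mapHeaders{"x-canary": "true"}
	st := f.DecodeHeaders(h, true)
	fmt.Println(st == Continue, h["x-route-cluster"]) // prints "true service-canary"
}
```

Keeping the Go-facing interface this small leaves the callback-heavy C++ filter API on the Envoy side, which is the scope question raised in the comments.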
The specific architecture is as follows:
Best practice
Improve Dapr gRPC performance by Envoy L7 GoLang extension
In some scenarios, we need to use Envoy (mesh) and Dapr (an application runtime implemented in GoLang) simultaneously. In these cases, we plan to run Dapr as an Envoy http2 filter (enabled by our GoLang extension module) and re-use the existing component SDKs (we do not use Wasm because GoLang's wasm support is immature, and because of the non-blocking IO model required by Envoy). Combining Dapr and Envoy makes maintenance and management more convenient than running two separate products, not to mention the performance improvement for Dapr.
Run MOSN stream filter on Envoy via Envoy L7 GoLang extension
In the MOSN project, we use GoLang through the Envoy GoLang extension module (experimental stage) to easily implement Envoy's dynamic routing capabilities: mosn/mosn#1563.
Looking forward to your ideas/comments on this proposal! @mattklein123 @htuch @alyssawilk @lizan thanks.