-
Notifications
You must be signed in to change notification settings - Fork 627
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP: basic support for SASL/PLAIN #923
Conversation
4980737
to
fe28d91
Compare
Thanks for your contribution @thomaslee ! I am absolutely open to whatever the community puts out there. All these new contributions are very encouraging to see. I will go through these PRs as soon as I can. Can you share details on how you configured kafka to work with SASL? Ideally create another Kafka container with SASL enabled and add test to verify this works. 👍 |
Sure -- there are a few steps, hopefully I haven't missed anything: In
then create
lastly, when launching Kafka, set
Might be an easier way, but this seems to work for me. |
5370c11
to
1996f4a
Compare
@hyperlink I've added a positive & a negative test for SASL/PLAIN auth (including broker config for an extra SASL_PLAINTEXT listener -- the final config ended up being a bit more involved than what my earlier config might have implied). Both are passing for me locally against Kafka 1.0.1. That said, I've had a hard time getting the other tests to pass locally even without my changes -- leaning on Travis for broader verification. |
Ugh, digging around in Kafka source code the test failures around the SASL stuff are legitimate: as I mentioned, I've been testing against Kafka 1.0.1, which supports EDIT: modified to clarify 0.10 & 0.11 rather than 0.8-0.11: I'm not sure when the first support for SASL was introduced, but as far as I can tell SASL/PLAIN was only introduced in 0.10. So these tests will need to be skipped for anything <0.10 |
4eb6b09
to
b42cfbe
Compare
@hyperlink phew -- tests finally went green, then a build failed after a no-op change modifying commit history. 😢 I'm assuming that's an intermittent error. Okay, so the nasty details: As I mentioned: SaslHandshake v1 (used in 1.0.0+) is pretty straightforward: no surprises, easy to implement. SaslHandshake v0 (used in 0.10/0.11) on the other hand gets a bit messy because the step where we send credentials doesn't actually use normal Kafka request/response headers. As a result, we can't rely on correlation IDs for SASL authentication responses -- so we really don't want non-SASL responses coming back while we're in the middle of a SASL handshake. See the override of Anyway, because of this we now wait for the API versions response to come back before proceeding with the SASL exchange + metadata load and (ultimately) allowing the client to become "ready". I think this is probably better behavior anyway: pretty sure before this change we'd fire off the API versions request in parallel with the initial metadata load, which works but probably isn't what was originally intended. (This explains some strange log messages I saw a while back too.) Lastly, the broker seems to just close the connection if authentication fails when using SaslHandshake v0 -- so this change assumes that authentication has failed if you've sent a SASL authentication packet & the connection suddenly closes before you get a reply. If the backward compatibility stuff is too nasty, I wouldn't shed a tear if we only supported SASL for 1.0.0+. Certainly an easier, smaller patch -- but I'm sure there would be folks out there who might want to use SASL with 0.10 or 0.11. Let me know either way. I think that's everything. Seem reasonable? And should I set up Travis builders for 1.0.0 and 1.1.0? That will exercise the SaslHandshake v1 code path, which is not yet exercised by the PR builds. |
😃 good work @thomaslee ! I will look at the PR in more detail and give some feedback soon. Regarding the intermittent failures just ignore them for now I've been trying to improve the stability of the tests but they are hard to trackdown because they pass locally. I don't know why the ConsumerGroupStream tests fail. I have a branch where those tests pass everytime the suite runs. So it appears the failure is caused by some side effect of other tests. Please setup travis building for kafka version 1 if you can and thanks for your contribution! |
Cheers! Just added 1.0 and 1.1 to the Travis matrix. Something's up with the start-kafka.sh script when using KAFKA_OPTS in the 1.x images -- something to do with sed. Had to hack around that for now, holler out if you see a better solution. |
Testing today revealed issues in ConsumerGroup with this change in place (using the SaslHandshake v1 code path): https://gist.githubusercontent.com/thomaslee/dfd4479a439c583e4dde490736c0d335/raw/5787ade7c1abcbfc8f06423ed935ddb4a429afb3/gistfile1.txt The error on the broker side seems to be:
Trying to figure out what's going on, but appreciate any insight. |
Okay, seems like Seems like a proper fix might involve modifying That's a little more than I'd like to bite off here, so trying to figure out a work around by modifying |
0a39941
to
a04d91e
Compare
Alright, 11a66df introduced the new |
Ugh, there's still more issues with this when we get disconnected from a broker. I think I need to go back to the drawing board on some of this & lean more heavily on |
@thomaslee probably outside the scope of this PR but I am thinking we should start to abstract the nitty gritty of broker initialization/authentication into the |
@hyperlink yeah, I think I could see that working: so basically a call to I think I like your proposal better, but given all the thrash on this PR I'm tempted to leave it to a subsequent PR if you can forgive the warts for now. 😃 Will push the new set of changes shortly. |
e53b393
to
318306c
Compare
@hyperlink alrighty, looks like tests are green again using the new approach that makes more liberal use of Once you're happy with this & we land it, I'll follow up with a PR to rework the |
@hyperlink I'll resolve that conflict this evening, but have you had a chance to take a look at this? Wondering how far away we realistically are from landing this (obviously non-trivial and possibly risky) change. We've got it monkey-patched into 2.6.0 internally, but as you can imagine it ain't pretty. 😄 |
@thomaslee reviewing this PR is next on my list. |
e6e1bbc
to
82ac886
Compare
Rebased & fixed up that conflict 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you mind adding documentation options.sasl
in the readme?
lib/client.js
Outdated
var handlers = this.unqueueCallback(socket, correlationId); | ||
|
||
if (handlers) { | ||
var decoder = handlers[0]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we're dropping node 4 so feel free to destructure to your heart's content.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
😂
lib/kafkaClient.js
Outdated
logger.error('error initialize broker after connect', error); | ||
logger.error('error initializing broker after connect', error); | ||
if (error instanceof errors.SaslAuthenticationError) { | ||
self.emit('auth_error', error); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why not use just emit on error
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great question, I think I was concerned about raising an error
here when we don't do so for other types of error (and presumably raising other types of error here would have knock-on implications for tests, client code, etc.)
To put it another way, I wanted to bubble up this error for test purposes but wasn't convinced this PR warranted changing error
semantics & potentially breaking stuff. If that makes any sense at all. It makes a potentially large/risky change even larger and riskier. 😅
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm leaning towards emitting an error. I would expect consumers/producers would want to be notified of this failure since a failure in authentication means messages will not be received or delivered.
@@ -41,7 +43,7 @@ BrokerWrapper.prototype.isConnected = function () { | |||
}; | |||
|
|||
BrokerWrapper.prototype.isReady = function () { | |||
return this.apiSupport != null; | |||
return this.apiSupport != null && (!this.needAuthentication || this.authenticated); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you add authenticated to the toString
override?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Will do!
lib/kafkaClient.js
Outdated
async.waterfall( | ||
[ | ||
callback => { | ||
logger.debug(`Sending SASL/${mechanism} handshake request to ${broker.socket.addr}`); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can use ${broker}
since the overridden toString should print the socket.addr
along with other useful information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah of course 👍
lib/kafkaClient.js
Outdated
); | ||
let error = this.error; | ||
if (!error) { | ||
if (socket.saslAuthCorrelationId !== undefined) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should we also check this.options.sasl
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't hurt!
…n up some details
@hyperlink finally got around to applying your review feedback + rebased. Let me know if anything else needs work. FWIW we've been running the previous version of this code in production environments for a month or two without incident. |
Thanks for this PR @thomaslee. @hyperlink when do you plan on merging this please? |
@hyperlink we played with this at work. It seems functional and we are anxious to use it. We'd love to see it merged as well. |
@hyperlink We'd love to use this at work!! Could you please give some timeline after merge to release? |
Appreciate the votes of confidence folks & nobody's keener to see this land than I am -- but some patience is warranted here. It's not a totally trivial change, the project lead has a day job and I was very slow in following up on his PR feedback (because I too have a day job 😉). That said, appreciate any feedback in terms of testing/code review/etc. in the interim (looking at you @lordgraysith -- thanks for giving it a shot!). If we can get more confidence that this stuff is solid & ready to go I'm sure it'll make Xiaoxin's job easier when he does get a chance to take a look. Cheers! |
Thank you so much @hyperlink and @thomaslee! We're excited to see this published out on npm. |
Published as 3.0.0 |
@thomaslee , Is there any plan to provide support for SASL/GSSAPI?? |
Still plenty more to do here (honestly, it's a bit of a mess in places), but thought I might open this up for comment.
We have a pressing need for dumb authentication in our Kafka cluster(s) to work around some infrastructure issues and kafka-node's support for TLS client certs isn't an option in the short term. Our immediate needs would be met with basic support for SASL/PLAIN, but obviously kafka-node doesn't support SASL -- just wanted to prove to myself that it was fairly trivial to hook up something simple. So this is my whirlwind prototype.
Very, very briefly verified against a local Kafka instance configured with a SASL_PLAINTEXT listener, seems to work -- just barely. 😅
Support for SASL/GSSAPI & SASL/SCRAM-SHA-{256,512} are lower on my list of priorities, but would look pretty similar.
Previous discussion by others back in 2016 over in #511 and #482. Hopefully SASL support is still something you're open to @hyperlink.