Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix an issue on RP hash calculations. #93

Merged
merged 1 commit into from
Jun 8, 2017
Merged

Conversation

xdxu
Copy link
Contributor

@xdxu xdxu commented Jun 7, 2017

As per RFC4601, all multicast routers in a domain should use the same
algorithm for RP elections, and all the IP addresses including the mask
address used in the algorithm should be in host byte order. The change fixes
the problem about RP hash calculation so that the results will be consistent
across different routers.

Signed-off-by: Xiaodong Xu stid.smth@gmail.com

As per RFC4601, all multicast routers in a domain should use the same
algorithm for RP elections, and all the IP addresses including the mask
address used in the algorithm should be in host byte order. The change fixes
the problem about RP hash calculation so that the results will be consistent
across different routers.

Signed-off-by: Xiaodong Xu <stid.smth@gmail.com>
@troglobit
Copy link
Owner

Interesting! This looks like it could very well be the root cause to issue #81

The code also looks good. Only real problem I see, apart from introducing an incompatibility with earlier versions of pimd, is that I cannot find where in section 4.7.2 of RFC4601 it states the address/mask/group should be in host byte order? Sure, the code seems to suggest it by the use of _h suffix to the local variables, but I cannot find passage in the text to corroborate this.

If you could help me shed some light on this detail I'd be most grateful, because I have to write something in the ChangeLog to explain the possible incompatibility with older pimd versions we introduce.

@xdxu
Copy link
Contributor Author

xdxu commented Jun 7, 2017

It doesn't explicitly state that the address/mask has to be in host byte order. But if you look at the following description:

Value(G,M,C(i))=
(1103515245 * ((1103515245 * (G&M)+12345) XOR C(i)) + 12345) mod 2^31

where C(i) is the RP address and M is a hash-mask. If BSR is
being used, the hash-mask is given in the Bootstrap messages. If
BSR is not being used, the alternative mechanism that supplies
the group-range-to-RP mappings may supply the value, or else it
defaults to a mask with the most significant 30 bits being one
for IPv4 and the most significant 126 bits being one for IPv6.
The hash-mask allows a small number of consecutive groups (e.g.,
4) to always hash to the same RP.

only if the hash mask is in host byte order, can the algorithm hash to the same result for the 4 consecutive group IP addresses within the same subnet of /30.
And yes it is going to break the compatibility with older pimd version. But it will be compatible with Cisco PIM-SM :)

@troglobit
Copy link
Owner

Ah, thanks for clearing that up! Wasn't obvious to me at all, although I did read the description several times.

Hmm, maybe we should add a command line compatibility switch, or configure option to support users with mixed installations, or even bump the major version (2.4.0 👉 3.0) to denote incompatible change? 😰

Good to know this change at least makes it compatible with Cisco! ✌️

@xdxu
Copy link
Contributor Author

xdxu commented Jun 7, 2017

I agree we need bump the major version and document this new behavior in the release notes. However, I kinda feel that it is a BUG rather than a behavior change. I think you may notice two issues in the hash function, which actually cause unexpected hash value and the result could be unpredictable from the user's perspective.
Based on this fact, I would prefer not adding a configuration option for backward compatibility. So we may require the user to upgrade pimd deployed on all the routers in the same domain. Does it make sense?

@troglobit
Copy link
Owner

Agreed, let's bump major to 3.0 and document in ChangeLog and also recommend users to upgrade all routers in the same domain. It's both reasonable and the only sane option I think. Thanks for the rubber ducking!

Yeah, I noticed the ordering bug with curr_address_h as well, good catch!

OK. I'll start prepping the repo and all meta data for 3.0 and then I'll merge this.

@troglobit
Copy link
Owner

(I've been thinking about changing command line options and possibly also some .conf settings, so upping to 3.0 is really not a bad idea.)

@troglobit troglobit added this to the 3.0 milestone Jun 7, 2017
@troglobit troglobit self-assigned this Jun 7, 2017
@troglobit troglobit merged commit 5174944 into troglobit:master Jun 8, 2017
@troglobit
Copy link
Owner

@xdxu Thank you again for your contribution! 😃 👍

troglobit added a commit that referenced this pull request Jun 8, 2017
Signed-off-by: Joachim Nilsson <troglobit@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants