-
Notifications
You must be signed in to change notification settings - Fork 346
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes for Backup Edge Cache Group #1908
Conversation
Can one of the admins verify this patch? |
add to whitelist |
test this please |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just did an initial pass over this PR and it generally seems good but I did notice a couple things:
- The indentation seems off in several places. Please make sure indentation is consistent with the rest of the file.
- New features should be clearly documented. We do have documentation around the Coverage Zone File that will need updated for the new field.
Question: if a CZ's backup list exists, but none of the backup CZs have caches available, what happens? After the backup list is exhausted, should we fallback to the original logic of finding the closest available cachegroup?
@rawlinp This backupList would be used to gain more control over which CGs can be a backup for a group. If a CG is not in the backup list, it won't be used, even if that means the request will be rejected. We specifically implemented this code to not fall back to getClosestAvailableCachegroup(). This allows operators to compartmentalize traffic within portions of their network. (For example, if all the east coast caches go down, bandwidth-limited backbone/cross-country links will not be overwhelmed by all the requests going from east clients to west-coast caches). (This was actually the driving reason for this feature and why the existing getClosest... isn't sufficient in some cases). |
Code looks good. Please also add a section to CHANGELOG.md at the top-level with a few line description of the feature and a link to the new docs. |
I have addressed the review comments. Please review and merge if every thing looks good. |
.. note:: The ``"coordinates"`` section is optional and may be used by Traffic Router for localization in the case of a CZF "hit" where the zone name does not map to a Cache Group name in Traffic Ops (i.e. Traffic Router will route to the closest Cache Group(s) geographically). | ||
.. note:: The ``"coordinates"`` section is optional and may be used by Traffic Router for localization in the case of a CZF "hit" where the zone name does not map to a Cache Group name in Traffic Ops (i.e. Traffic Router will route to the closest Cache Group(s) geographically). | ||
|
||
The ``"backupList"`` section is optional and may be used by Traffic Router for localization in the case of a CZF "hit" where the zone name does not map to a Cache Group name in Traffic Ops (i.e. Traffic Router will route to the configured backup Cache Group(s)). This can be used as an alternative to ``"coordinates"`` therby enabling operators to have a control over choosing back up edge servers rather than depending on Geo-coordinates based routing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"where zone name does not map to a cg name in Traffic Ops"
This might be more clearly written as:
The ``"backupList"`` section is optional and is used by Traffic Router for localization in the case of a CZF
"hit" when there are no caches available for that DS in the matched cache group.
.. note:: The ``"coordinates"`` section is optional and may be used by Traffic Router for localization in the case of a CZF "hit" where the zone name does not map to a Cache Group name in Traffic Ops (i.e. Traffic Router will route to the closest Cache Group(s) geographically). | ||
.. note:: The ``"coordinates"`` section is optional and may be used by Traffic Router for localization in the case of a CZF "hit" where the zone name does not map to a Cache Group name in Traffic Ops (i.e. Traffic Router will route to the closest Cache Group(s) geographically). | ||
|
||
The ``"backupList"`` section is optional and may be used by Traffic Router for localization in the case of a CZF "hit" where the zone name does not map to a Cache Group name in Traffic Ops (i.e. Traffic Router will route to the configured backup Cache Group(s)). This can be used as an alternative to ``"coordinates"`` therby enabling operators to have a control over choosing back up edge servers rather than depending on Geo-coordinates based routing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please also mention that the list is ordered.
Also, include that if there are no caches available in any of the listed groups, the request will be bypassed (if configured) or rejected
Hopefully last request. Can you please fix your commit author and email (currently they are root).
Then redo the push (you might need a |
Thank you for guiding me through out this. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
couple minor spacing/indentation issues, change section of release notes
CHANGELOG.md
Outdated
@@ -16,3 +16,11 @@ Pre-2.2 the HTTP Routing Name is configurable via the `http.routing.name` option | |||
2. Add this parameter to a **single** profile in the affected CDN | |||
|
|||
With those profile parameters in place Traffic Ops can be safely upgraded to 2.2. Before taking a post-upgrade snapshot, make sure to check your Delivery Service example URLs for unexpected Routing Name changes. Once Traffic Ops has been upgraded to 2.2 and a post-upgrade snapshot has been taken, your Traffic Routers can be upgraded to 2.2 (Traffic Routers must be upgraded *after* Traffic Ops so that they can work with custom per-DeliveryService Routing Names). | |||
|
|||
|
|||
#### Backup Edge Cache group |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
since we currently have an RC for 2.2, this is most likely landing in 2.3. So I would add a new v2.3.0 [unreleased] section at the top of this file and add this subsection #### Backup Edge Cache group
under a ### Release Notes
header, ie:
v2.3.0 [unreleased]
-------------------
### Release Notes
#### my new feature
...
v2.2.0 [unreleased]
...
|
||
if (coordinates != null && coordinates.has("latitude") && coordinates.has("longitude")) { | ||
final double latitude = coordinates.get("latitude").asDouble(); | ||
final double longitude = coordinates.get("longitude").asDouble(); | ||
geolocation = new Geolocation(latitude, longitude); | ||
} | ||
|
||
if (!addNetworkNodesToRoot(root, loc, locData, geolocation, useDeep)) { | ||
if (backupList != null) { | ||
bkCacheLoc = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
too much indentation here
} | ||
|
||
@SuppressWarnings({"PMD.CyclomaticComplexity", "PMD.NPathComplexity"}) | ||
public CacheLocation getCoverageZoneCacheLocation(final String ip, final String deliveryServiceId, final boolean useDeep) { | ||
public CacheLocation getCoverageZoneCacheLocation(final String ip, final String deliveryServiceId,final Track track, final boolean useDeep) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
missing space between parameters
.. note:: The ``"coordinates"`` section is optional and may be used by Traffic Router for localization in the case of a CZF "hit" where the zone name does not map to a Cache Group name in Traffic Ops (i.e. Traffic Router will route to the closest Cache Group(s) geographically). | ||
.. note:: The ``"coordinates"`` section is optional and may be used by Traffic Router for localization in the case of a CZF "hit" where the zone name does not map to a Cache Group name in Traffic Ops (i.e. Traffic Router will route to the closest Cache Group(s) geographically). | ||
|
||
The ``"backupList"`` section is optional and is used by Traffic Router for localization in the case of a CZF "hit" when there are no caches available for that DS in the matched cache group. This backup list is ordered and if there are no caches available in any of the configured backup cache groups, the request will be bypassed (if configured) or rejected |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
question: did you build the docs with this change to make sure the backupList
note is contained within the same box as the coordinates
note? The syntax highlighting here makes me think that the 2nd note might not be in the same box
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I built the docs and loaded the file in a browser. Didn't find any issue.
Can one of the admins verify this patch? |
ok to test |
Refer to this link for build results (access rights to CI server needed): |
I think we want to allow the Maybe the format changes to:
I'm not sure it matters whether the default is true or false as long as it's documented. What do you guys think? |
I believe , we are achieving compartmentalization by avoiding closest cache
group altogether. If user wants to fallback to closest cache group, he can
do that with suitable coverage zone file which doesn't have the backup list.
…On Tue, Feb 27, 2018 at 11:45 PM, Rawlin Peters ***@***.***> wrote:
I think we want to allow the backupList to be used for compartmentalizing
traffic but also provide a configurable option to fallback to the default
behavior of finding the next closest cachegroup once all backupList
cachegroups have been exhausted. Perhaps by default we will fall back to
finding the next closest cachegroup if all backup cachegroups have been
exhausted, but if the coverageZone has a field "fallbackToNextClosest":
false the request will be dropped if the backupList has been exhausted?
Maybe the format changes to:
"GROUP2": {
"backupZones": {
"list": ["GROUP1"],
"fallbackToNextClosest": false <-----default: true or false?
},
"network6": [
"1234:567a::/64",
"1234:567b::/64"
],
"network": [
"10.197.69.0/24"
]
}
I'm not sure it matters whether the default is true or false as long as
it's documented. What do you guys think?
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub
<#1908 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/ANr-iJg8TAUBs4q-GFvS-s3l-AZbORobks5tZEY9gaJpZM4SPrbm>
.
|
@asfgit Request your inputs on fixing these . |
I think I agree with Rawlin. I understand the reason we have the backup list, and why using |
@Vijay-1 I think we could add this functionality for a pretty minor change to the logic in |
@dneuman64 . If every one agrees that this fallback configuration is required,will get it done in this PR. |
I think the All Checks have failed just means it couldn't find the unit tests to run. I think it's just an issue with the build envrionment. @dangogh can you confirm? |
Refer to this link for build results (access rights to CI server needed): |
} | ||
} | ||
|
||
if ((networkNode.getBackupLoc() != null && networkNode.isUseClosest()) || (networkNode.getBackupLoc() == null)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even if the backupZones list is empty and the operator sets useClosest=true, I think we should enter this loop because its equivalent to trying to route to the backupZones when they are all down/unavailable.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think , this condition in NetworkNode.java will take care of this, where we parse and store the useClosest flag.
if (backupConfigNode != null && backupConfigNode.has("list")) {
...
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With this the following config will do the closest group if backup list is empty
"backupZones":{
"list": [],
"fallbackToClosestGroup": true
},
and wont do closest group with the following:
"backupZones":{
"list": [],
"fallbackToClosestGroup": false
},
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, i'm trying to simplify this conditional, because its awkward to have opposite checks on both sides of an OR. Do we really need to check for networkNode.getBackupLoc()
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Got you. Let me fix this.
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
I'm good with this. @rawlinp Any more comments? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
apologies in advance, the code looks good, but I have a bunch of nitpicks mostly around variable naming and some minor cleanup
CHANGELOG.md
Outdated
|
||
### Release Notes | ||
|
||
#### Backup Edge Cache group |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just as a heads up (you shouldn't have to take any action), I might be reformatting the changelog later per the mailing list discussion about using the keepachangelog.com format.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changelog format has now changed to the keepachangelog format (there's now a link to that in the changelog), so you will need to rebase this branch onto the latest version of master and update these notes to use the new format.
@@ -881,7 +899,7 @@ The CZF is an input to the Traffic Control CDN, and as such does not get generat | |||
|
|||
The script that generates the CZF file is not part of Traffic Control, since it is different for each situation. | |||
|
|||
.. note:: The ``"coordinates"`` section is optional and may be used by Traffic Router for localization in the case of a CZF "hit" where the zone name does not map to a Cache Group name in Traffic Ops (i.e. Traffic Router will route to the closest Cache Group(s) geographically). | |||
.. note:: The ``"coordinates"`` section is optional and may be used by Traffic Router for localization in the case of a CZF "hit" where the zone name does not map to a Cache Group name in Traffic Ops (i.e. Traffic Router will route to the closest Cache Group(s) geographically). The ``"backupZones"`` section is optional and is used by Traffic Router for localization in the case of a CZF "hit" when there are no caches available for that DS in the matched cache group. This backup list is ordered and if there are no caches available in any of the configured backup cache groups, the request will be bypassed (if configured) or rejected or gets to the closest cache groups if configured (``"fallbackToClosestGroup"`` flag) for this backup configuration. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would break this last sentence up to make it clearer (starting from "This backup list...":
This backup "list"
contains an ordered list of backup groups to choose from if the matched cache group has no caches available for a requested DS. If an available cache cannot be found in any of the backup groups either, the "fallbackToClosestGroup"
flag determines the Traffic Router's following behavior. If true
, Traffic Router will find the next closest cache group with available caches. If false
(the default), Traffic Router will bypass the request (if configured) or reject it.
@@ -106,15 +108,33 @@ public static NetworkNode generateTree(final JsonNode json, final boolean verify | |||
final String loc = czIter.next(); | |||
final JsonNode locData = JsonUtils.getJsonNode(coverageZones, loc); | |||
final JsonNode coordinates = locData.get("coordinates"); | |||
final JsonNode backupConfigNode = locData.get("backupZones"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kind of a nitpick but can we call this variable backupConfigJson
? All the Node
stuff is confusing with JsonNode vs NetworkNode.
@@ -106,15 +108,33 @@ public static NetworkNode generateTree(final JsonNode json, final boolean verify | |||
final String loc = czIter.next(); | |||
final JsonNode locData = JsonUtils.getJsonNode(coverageZones, loc); | |||
final JsonNode coordinates = locData.get("coordinates"); | |||
final JsonNode backupConfigNode = locData.get("backupZones"); | |||
JsonNode fallbackNode = null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this variable is unneeded; see comment on L131
if (backupConfigNode != null) { | ||
if (backupConfigNode.has("list")) { | ||
bkCacheLoc = new ArrayList<>(); | ||
for (final JsonNode network : JsonUtils.getJsonNode(backupConfigNode, "list")) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cachegroup
rather than network
?
@@ -281,6 +304,14 @@ public Geolocation getGeolocation() { | |||
return geolocation; | |||
} | |||
|
|||
public List<String> getBackupLoc() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
getBackupCacheGroups
// Check whether the CZF entry has a geolocation and use it if so. | ||
return getClosestCacheLocation(cacheRegister.filterAvailableLocations(deliveryServiceId), networkNode.getGeolocation(), cacheRegister.getDeliveryService(deliveryServiceId)); | ||
if (networkNode.getBackupLoc() != null) { | ||
for (final String element : networkNode.getBackupLoc()) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cachegroup
rather than element
if (closestCacheLocation != null) { | ||
LOGGER.debug("Got closest CZ cache group " + closestCacheLocation.getId() + " for " + ip + ", ds " + deliveryServiceId); | ||
if (track != null) { | ||
track.setFromBackupCzGroup(true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
in this case, it wasn't from backupCzGroup but the default behavior of nextClosest. Should we just omit this if-block?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe, any thing other than the exact match on the subnet, is backup. (i.e) both co-ordinates and Backup Cache Groups are fallback mechanism and hence i feel this if-block should be fine
fallbackNode = backupConfigNode.get("fallbackToClosestGroup"); | ||
useClosestOnBackupFailure = (fallbackNode != null) ? fallbackNode.asBoolean(): false; | ||
} else { | ||
useClosestOnBackupFailure = true; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should just initialize useClosestOnBackupFailure
to true on L113 and omit this if-else block. It doesn't change behavior; it would just be cleaner because it will be set to false in the previous block if configured anyway.
bkCacheLoc.add(network.asText()); | ||
} | ||
} | ||
fallbackNode = backupConfigNode.get("fallbackToClosestGroup"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think these two lines can be replaced with:
useClosestOnBackupFailure = JsonUtils.optBoolean(backupConfigJson, "fallbackToClosestGroup", false);
Then you also don't need the fallbackNode
intermediate variable
Refer to this link for build results (access rights to CI server needed): |
@rawlinp , I have address all your comments except the removal of if-block which tags cache group selection based on coordinates as backup. I believe, any thing other than the exact match on the subnet, is backup / fallback. (i.e) both co-ordinates and Backup Cache Groups are fallback mechanism and hence i feel this if-block should be fine. |
Refer to this link for build results (access rights to CI server needed): |
@Vijay-1 thanks for addressing my comments! I pulled this PR down to run some test scenarios against it, and this case was unexpected:
I would've expected item 3 to cause the client to be bypassed (if configured) or rejected. With your current code this only happens if the Delivery Service is set to CZ-only. |
} | ||
return closestCacheLocation; | ||
} | ||
return null; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather just returning null
here, I think we have to return more information to the caller so that the caller knows the difference between "null because there wasn't a hit in the CZF" and "null because there was a hit in the CZF but we're not returning a CacheLocation due to the "backupZones"
config.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rawlinp This test case and results looks fine to me. With the Geo Limit any value other than CZF only , say for example None, TR will go all out to find a route and when doing so it will find a route via 'GEO' since this is not limited to CZF only.
For Null returning, may be we can add one more rdtl field to denote backup Zone failure. But then , i think it is supernumerary since we have enough tagging in the access logs to differentiate between selecting caches from CZ, CZ's backup configuration for successful cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I disagree, one of the reasons for specifying backupZones is to confine traffic to certain cachegroups when the matched cachegroups are unavailable. In this case we're getting a hit in the CZ (so we know where the client is in the network), that cachegroup is unavailable, but the client will still fall back to a Geo-lookup (even though from the CZ-hit we already know where the client is), best-case scenario get matched to the same cachegroup, find the closest available cachegroup, and altogether disregard the "confine traffic to certain cachegroups" rule in the CZF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"confine traffic to certain cache group" is defined by CZF. So, if we remove that limitation (Geo Limit's value other than CZF only), TR can fetch a matching cache using Geo lookup. I am still not clear on why should TR reject this when the configuration says None as Geo Limit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Geo is generally the backup case when a client's IP cannot be found in the CZF. If the client is found in the CZF, we shouldn't be falling back to a Geo-lookup which subverts the CZF's backupZones
config. We know the client is in our network, so we want it to follow the rules specified in the CZF.
If the client was not found in the CZF, then it should most likely not be in our network. In that case, the congestion might not be a problem because the client should be entering our network from somewhere else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the client was found in the CZF, but the caches in that zone are down (and theres no backupList AND fallthrough == true), then we should try to use thecoordinates
of the matching zone to find the next closest cache group. If there are no coordinates, I think the current behavior is to try geolocation, even on a client that was in the CZF.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree with that, but the scenario I'm concerned about is when fallbackToNextClosest == false
. In that case I don't think the request should fallback to Geo when the client was found in the CZF but the matched cachegroup has no caches available. In the best-case geo-lookup that client would still be traversing the same network path to the next available cachegroup (which is what the backup list is supposed to prevent).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok i agree that when fallbackToNextClosest ==false
that we should NOT geolocate
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Refer to this link for build results (access rights to CI server needed): |
CHANGELOG.md
Outdated
|
||
### Release Notes | ||
|
||
#### Backup Edge Cache group |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The changelog format has now changed to the keepachangelog format (there's now a link to that in the changelog), so you will need to rebase this branch onto the latest version of master and update these notes to use the new format.
@@ -262,7 +264,8 @@ public Geolocation getLocation(final String clientIP, final DeliveryService deli | |||
} | |||
} else { | |||
// Deep caching not enabled for this Delivery Service; use the regular CZ | |||
cacheLocation = getCoverageZoneCacheLocation(request.getClientIP(), ds); | |||
cacheLocation = getCoverageZoneCacheLocation(request.getClientIP(), ds, track, useDeep); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ideally this line should just pass track
into this method and let the method default useDeep
to false.
@@ -345,8 +352,9 @@ public DNSRouteResult route(final DNSRequest request, final Track track) throws | |||
return result; | |||
} | |||
|
|||
final CacheLocation cacheLocation = getCoverageZoneCacheLocation(request.getClientIP(), ds); | |||
final CacheLocation cacheLocation = getCoverageZoneCacheLocation(request.getClientIP(), ds, track, false); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is it possible to change the method signature to have this be getCoverageZoneCacheLocation(request.getClientIP(), ds, track);
so that we're not passing useDeep
explicitly as false? useDeep
should default to false.
@@ -279,7 +282,11 @@ public Geolocation getLocation(final String clientIP, final DeliveryService deli | |||
track.setResult(ResultType.MISS); | |||
track.setResultDetails(ResultDetails.DS_CZ_ONLY); | |||
} | |||
} else { | |||
} else if ((useDeep) || (networkNode != null && networkNode.isUseClosest()) || (networkNode == null)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this can remain a simple else
if the logic in my comment above is implemented.
@@ -116,7 +116,7 @@ public void setRegionalAlternateCount(final int regionalAlternateCount) { | |||
|
|||
public enum ResultDetails { | |||
NO_DETAILS, DS_NOT_FOUND, DS_TLS_MISMATCH, DS_NO_BYPASS, DS_BYPASS, DS_CZ_ONLY, DS_CLIENT_GEO_UNSUPPORTED, GEO_NO_CACHE_FOUND, | |||
REGIONAL_GEO_NO_RULE, REGIONAL_GEO_ALTERNATE_WITHOUT_CACHE, REGIONAL_GEO_ALTERNATE_WITH_CACHE | |||
REGIONAL_GEO_NO_RULE, REGIONAL_GEO_ALTERNATE_WITHOUT_CACHE, REGIONAL_GEO_ALTERNATE_WITH_CACHE, DS_CZ_BACKUP_CG |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe add DS_CZ_NO_BKP
to signify that the matched cachegroup and its configured backup cachegroups were unavailable.
} | ||
} | ||
return closestCacheLocation; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Following from my suggestions above, this line would probably be:
} else {
track.SetNoBackupLocationsAvailable(true);
return null;
}
Then I'm not sure if we can also remove the return null
statement below this.
@@ -262,7 +264,8 @@ public Geolocation getLocation(final String clientIP, final DeliveryService deli | |||
} | |||
} else { | |||
// Deep caching not enabled for this Delivery Service; use the regular CZ | |||
cacheLocation = getCoverageZoneCacheLocation(request.getClientIP(), ds); | |||
cacheLocation = getCoverageZoneCacheLocation(request.getClientIP(), ds, track, useDeep); | |||
networkNode = getNetworkNode(request.getClientIP()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally I'm not a big fan of polluting this method with extra NetworkNode stuff, especially since getNetworkNode
is already called in getCoverageZoneCacheLocation
. I think ideally getCoverageZoneCacheLocation
should return us the information we need.
Since we're passing in the Track
object, we could probably add that information there, ie something like track.noBackupLocationsAvailable()
which would get set within getCoverageZoneCacheLocation
. Then we can update the logic starting on L277 to something like this:
if (track.noBackupLocationsAvailable() or ds.isCoverageZoneOnly()) {
if (ds.geoRedirect) {...}
else {
resultDetails = DS_CZ_ONLY if ds.isCoverageZoneOnly else DS_CZ_NO_BKP
...
}
}
Refer to this link for build results (access rights to CI server needed): |
Addressed the following review comments: 1. Added documentation 2. Updated CHANGELOG.md 3. Fixed indentation Addressed review comments. Commit using proper Author fields. Addressed Rawlin's comments Fixing Note's section as per Rawlin's comment. Addressing Rawlin's comments on fallback configuration on backupZones. Rewrote the ambiguous if condition check based on Eric's comments Fixing unit test case failure. Addressing Rawlin's comments on: 1. Naming convention 2. Avoiding use of variable 'fallbackNode ' 3. Documentation Documentation update for Traffic Router's access logs rdtl field Addressing Rawlin's comment on expanding the scope of backupZones's fallbackToClosestGroup configuration for all Geo Limit values. Use Track to record network node's configuration accross.
Refer to this link for build results (access rights to CI server needed): |
Refer to this link for build results (access rights to CI server needed): |
@rawlinp
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
reviewed the changelog first, will continue reviewing the rest of the PR shortly
CHANGELOG.md
Outdated
@@ -1,13 +1,18 @@ | |||
# Changelog | |||
All notable changes to this project will be documented in this file. | |||
======= |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this line is unnecessary, it turns the line above into a large header
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
CHANGELOG.md
Outdated
|
||
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/). | ||
|
||
## [Unreleased] | ||
### Added | ||
- Per-DeliveryService Routing Names: you can now choose a Delivery Service's Routing Name (rather than a hardcoded "tr" or "edge" name). This might require a few pre-upgrade steps detailed [here](http://traffic-control-cdn.readthedocs.io/en/latest/admin/traffic_ops/migration_from_20_to_22.html#per-deliveryservice-routing-names) | ||
|
||
- Backup Edge Cache group: Backup Edge group for a particular cache group can be configured using coverage zone files. With this, users will be able control traffic within portions of their network, there by avoiding choosing fall back cache groups using geo co-ordinates(getClosestAvailableCachegroup) This can be controlled using "backupZones" which contains configuration for backup cache groups parsed from Coverage zone file as explained [here] (http://traffic-control-cdn.readthedocs.io/en/latest/admin/traffic_ops/using.html#the-coverage-zone-file-and-asn-table) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few things here:
- Based on the keepachangelog format, there shouldn't be blank lines between bullet points, so this line should be moved up.
- "thereby" not "there by", "fallback" not "fall back", "coordinates" not "co-ordinates"
- A period is needed after "(getClosestAvailableCachegroup)"
- "the Coverage Zone File" not "Coverage zone file"
- Also, the
[here] (http://...)
link doesn't work if there's a space between[here]
and the link(http://..)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
CHANGELOG.md
Outdated
### Changed | ||
- Reformatted this CHANGELOG file to the keep-a-changelog format | ||
|
||
[Unreleased]: https://github.com/apache/incubator-trafficcontrol/compare/RELEASE-2.1.0...HEAD | ||
-[Unreleased]: https://github.com/apache/incubator-trafficcontrol/compare/RELEASE-2.1.0...HEAD |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the change here breaks the [Unreleased]
link above, this shouldn't be changed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
CHANGELOG.md
Outdated
|
||
The format is based on [Keep a Changelog](http://keepachangelog.com/en/1.0.0/). | ||
|
||
## [Unreleased] | ||
### Added | ||
- Per-DeliveryService Routing Names: you can now choose a Delivery Service's Routing Name (rather than a hardcoded "tr" or "edge" name). This might require a few pre-upgrade steps detailed [here](http://traffic-control-cdn.readthedocs.io/en/latest/admin/traffic_ops/migration_from_20_to_22.html#per-deliveryservice-routing-names) | ||
|
||
- Backup Edge Cache group: Backup Edge group for a particular cache group can be configured using coverage zone files. With this, users will be able control traffic within portions of their network, there by avoiding choosing fall back cache groups using geo co-ordinates(getClosestAvailableCachegroup) This can be controlled using "backupZones" which contains configuration for backup cache groups parsed from Coverage zone file as explained [here] (http://traffic-control-cdn.readthedocs.io/en/latest/admin/traffic_ops/using.html#the-coverage-zone-file-and-asn-table) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
based on the keepachangelog format there should only be one blank line between sections
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
@@ -295,15 +296,15 @@ public Geolocation getLocation(final String clientIP, final DeliveryService deli | |||
track.setResult(ResultType.MISS); | |||
track.setResultDetails(ResultDetails.DS_CZ_ONLY); | |||
} | |||
} else { | |||
} else if (track.isUseNextClosest()) { | |||
//Even with Geo limit none, TR wont do geo look-up, if fallback is diabled via backupZones configuration |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
diabled -> disabled
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
@rawlinp |
Refer to this link for build results (access rights to CI server needed): |
I have successfully executed the following test cases:
With Geo Limit 'None':
|
Refer to this link for build results (access rights to CI server needed): |
This PR will be closed and a new one based on the review comments will be created. |
New PR: #2029 |
This PR implements solution for the issue: #1907
It places the backup policy in the CZF file
{
"coverageZones": {
"GROUP2": {
"backupList": ["GROUP1"],
"network6": [
"1234:567a::/64",
"1234:567b::/64"
],
"network": [
"10.197.69.0/24"
]
},
"GROUP1": {
"backupList": ["GROUP2"],
"network6": [
"1234:5677::/64",
"1234:5676::/64"
],
"network": [
"10.126.250.0/24"
]
}
}
}
The following test cases has been executed successfully for both DNS and HTTP Routing:
Test Setup
GROUP1 : Two cache Servers
Group2 : One Cache Server
GEO Limit set to CZF only
Request from GROUP1's subnet , with one server in GROUP1 down and make sure the request is getting redirected to the remaining server in GROUP1
Request from GROUP1's subnet , with both servers in GROUP1 down and make sure the request is getting redirected to the GROUP2
Regression cases
Set GEO limit to None without backup list configured in CZF
Request from GROUP1's subnet , with both servers in GROUP1 down and make sure the request is getting redirected to the GROUP2