
Add Level Manifest recovery #1452

Merged: 11 commits merged into video-dev:master on Dec 4, 2017
Conversation

@NicolasSiver (Member) commented Nov 29, 2017:

Description of the Changes

The ability to recover from failures is an important part of a good playback experience. Until now we had a segment zigzagging feature (a feature where we hunt for a playable segment through all available levels and jump from primary to backup streams and back).

We are going to introduce the same "zigzagging" logic for level manifests. Another benefit is that all of these behaviors are driven by the retry configuration, for both segments and levels. To complement the feature, the playlist loader was adjusted so it does not create extra retry requests, since all retry management is now handled entirely by the Level Controller.
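For context, both retry budgets live in the hls.js config; a minimal sketch using the standard config keys (the values here are illustrative, not recommendations from this PR):

```js
// Illustrative hls.js config: the same kind of retry knobs drive
// recovery for both fragments (segments) and level manifests.
var hls = new Hls({
  fragLoadingMaxRetry: 6,            // total fragment retries
  fragLoadingRetryDelay: 500,        // initial retry delay, ms
  fragLoadingMaxRetryTimeout: 4000,  // cap on the backed-off delay, ms
  levelLoadingMaxRetry: 4,           // total level-manifest retries
  levelLoadingRetryDelay: 500,
  levelLoadingMaxRetryTimeout: 4000
});
```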

How does it work?

If you have a backup stream, recovery will look like a zigzagging hunt for an "available" segment or level manifest:

[Animated diagram: hls-js-recovery — zigzagging across levels and primary/backup streams]

Where: F - Bad Fragment, L - Bad Level

As you can see, if you don't have enough extra renditions, the recovery logic will not be able to add much extra value to your platform.
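Since the animation may not render here, a rough textual sketch of the hunt, assuming three levels (L0-L2) each with a primary and a backup URL (illustrative only; the controller decides the exact order):

```
L2 primary: bad fragment (F)  ->  switch down to L1 primary
L1 primary: bad level (L)     ->  switch to L1 backup
L1 backup:  bad fragment (F)  ->  switch down to L0 backup
...until a playable segment/manifest is found, or the retry
budget runs out and a fatal error is raised.
```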

Retry Recommendations

If you have 4 renditions and a backup stream:

  • Level: don't use a total retry count lower than 3-4
  • Fragment: don't use a total retry count lower than 4-6
  • Implement short burst retries (i.e. a small retry delay of 0.5-4 seconds), and when the library returns a fatal error, switch to a different CDN (see the sketch below)
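A minimal sketch of that last recommendation; the error event and `data.fatal` flag are standard hls.js API, while `cdnUrls`, `video`, and `config` are assumptions for this example:

```js
// Sketch: fail over to the next CDN mirror once hls.js signals a
// fatal error, after its own burst retries have been exhausted.
var cdnUrls = [
  'https://cdn-a.example.com/master.m3u8',
  'https://cdn-b.example.com/master.m3u8'
];
var cdnIndex = 0;
var hls;

function startPlayback() {
  hls = new Hls(config);                 // same retry config as above
  hls.attachMedia(video);
  hls.loadSource(cdnUrls[cdnIndex]);
  hls.on(Hls.Events.ERROR, function (event, data) {
    if (data.fatal && cdnIndex < cdnUrls.length - 1) {
      hls.destroy();                     // give up on the current CDN
      cdnIndex += 1;
      startPlayback();                   // restart from the next mirror
    }
  });
}

startPlayback();
```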

Notes:

CheckLists

  • changes have been done against master branch, and PR does not conflict
  • no commits have been done in dist folder (we will take care of updating it)
  • new unit / functional tests have been added (whenever applicable)
  • Travis tests are passing (or test results are not worse than on master branch :))
  • API or design changes are documented in API.md

@mangui (Member) left a comment:


thanks @NicolasSiver
it would be great to add your PR explanation (with the pic) in design.md!

```js
if (levelError === true || fragmentError === true) {
  redundantLevels = level.url.length;

  if (redundantLevels > 1 && level.loadError < redundantLevels) {
```
@mangui (Member):

If my understanding is correct, in case of a levelError with redundant streams available, we first schedule a setTimeout(() => this.loadLevel(), delay); and then we switch to the redundant stream a couple of lines after?

@NicolasSiver (Member, Author):

Right. The zigzagging logic applies to both levels and fragments (condition: levelError === true || fragmentError === true). The level-specific part (since we also handle retry management for levels in the level controller) happens a bit earlier, because we can reach the retry limit and then have to produce a fatal error.

By placing the level-related logic first, we have the opportunity to bail out in a slightly more DRY way.
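In other words, the handler is shaped roughly like this (a simplified sketch of the control flow; the helper names emitFatalError, scheduleRetry, and switchToRedundantStream are placeholders, not actual hls.js functions):

```js
// Simplified shape of the level controller's error handler.
function onError(levelError, fragmentError, level, config) {
  if (levelError) {
    if (level.loadError >= config.levelLoadingMaxRetry) {
      emitFatalError();               // retry budget exhausted: bail out early
      return;
    }
    scheduleRetry();                  // setTimeout(() => this.loadLevel(), delay)
  }

  // Shared zigzag logic for both error kinds.
  if (levelError === true || fragmentError === true) {
    var redundantLevels = level.url.length;
    if (redundantLevels > 1 && level.loadError < redundantLevels) {
      switchToRedundantStream(level); // rotate urlId to the backup URL
    }
  }
}
```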

@NicolasSiver (Member, Author): @mangui, added notes to the Design document.

@mangui requested a review from johnBartos on December 1, 2017.
@mangui (Member) commented Dec 1, 2017:

LGTM, a second pair of eyes might be useful.

```diff
   if (!levelDetails || levelDetails.live === true) {
     // level not retrieved yet, or live playlist we need to (re)load it
     var urlId = level.urlId;
     hls.trigger(Event.LEVEL_LOADING, {url: level.url[urlId], level: newLevel, id: urlId});
   }
 } else {
   // invalid level id given, trigger error
-  hls.trigger(Event.ERROR, {type : ErrorTypes.OTHER_ERROR, details: ErrorDetails.LEVEL_SWITCH_ERROR, level: newLevel, fatal: false, reason: 'invalid level idx'});
+  hls.trigger(Event.ERROR, {
+    type : ErrorTypes.OTHER_ERROR,
```
Collaborator:

Not a big fan of the irregular whitespacing here, since it breaks convention with the rest of the codebase (and if we decide to upgrade our linter to something like ESLint, this may give us problems).

@NicolasSiver (Member, Author):

It will not. We use ESLint. It's a common technique for big inline object representations.
JSHint does not have a warning for that structure. The bigger issue is when a line is longer than 80-120 characters; as you can see, in this review you already can't see the whole line.
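For what it's worth, ESLint can be told to accept (and even enforce) this colon-aligned style through its key-spacing rule; a hypothetical .eslintrc.js snippet, not taken from this repo:

```js
// Hypothetical ESLint config: align object values on the colon,
// which permits the `type : ErrorTypes.OTHER_ERROR` style above.
module.exports = {
  rules: {
    'key-spacing': ['error', { align: 'colon' }]
  }
};
```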

Collaborator:

Ok, fair enough. Since it's a matter of opinion, I'm fine with it as is.

```js
  level.details = undefined;
} else {
  // Switch-down if more renditions are available
  if (this.manualLevelIndex === -1 && levelIndex !== 0) {
```
Collaborator:

To me, the spec implies we should go to auto:

In the event of an index load failure on one stream, the client chooses the highest bandwidth alternate stream that the network connection supports.

Going to auto should also handle the case where we don't have a level below (but may have one above).
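For reference, the suggested fallback would amount to re-enabling automatic level selection, which the public hls.js API exposes as a -1 level index (a sketch of the idea, not what the PR currently does):

```js
// Sketch: on an index (level manifest) load failure, drop back to
// automatic (ABR) level selection rather than only switching down.
hls.currentLevel = -1;  // -1 re-enables automatic level selection
hls.nextLevel = -1;     // let ABR pick the next level to load
```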

@NicolasSiver merged commit f4a86a1 into video-dev:master on Dec 4, 2017.
@NicolasSiver deleted the add-level-manifest-recovery branch on December 4, 2017.
@IvanRF (Contributor) commented Dec 6, 2017:

@NicolasSiver when I saw this commit I thought it was the solution to my current issue, but apparently not. Does it work only for backup streams?

I have only bad fragments (404 errors) for the 360p & 480p levels, but the player never reached the 720p level. It throws a fatal error.

I opened a new issue here: #1458

@ssreed mentioned this pull request on Jan 31, 2018.