Proposal: ManagedMediaSource API #320

Closed
jyavenard opened this issue May 29, 2023 · 52 comments

@jyavenard
Member

jyavenard commented May 29, 2023

Definitions

A “managed” MediaSource is one where more control over the MediaSource and its associated objects has been given over to the User Agent.

Introduction

The explicit goal of the Media Source Extensions specification is to transfer more control over the streaming of media data from the User Agent to the application running in the page. This transfer of control can add, and has added, points of inefficiency, where the page does not have the same level of capabilities, knowledge, or even goals as the User Agent.
Examples of these inefficiencies include the management of buffer levels, the timing and amount of network access, and media variant selection. These inefficiencies have largely been immaterial on relatively powerful devices like modern general purpose computers. However, on devices with narrower capabilities, it can be difficult to achieve the same quality of playback with the MediaSource API as is possible with native playback paths provided by the User Agent.

The goal of the ManagedMediaSource API is to transfer some control back to the User Agent from the application for the purpose of increasing playback efficiency and performance, while retaining the ability for pages to control streaming of media data.

Goals

  • Make it easier for media web site authors to support streaming media playback on constrained capability devices.
  • Allow User Agents to react to changes in available memory and networking capabilities.
  • Reduce the power impact of MediaSource APIs.
  • Efficiently support high-speed / high-energy-use networks (e.g., 5G cellular).

Non-Goals

Scenario

Low-memory availability

A user loads an MSE-based website on a device with a limited amount of physical memory, and no ability to swap. The user plays some content on that website, and pauses that content. The system subsequently faces memory pressure, and requires applications (including the User Agent) to purge unused memory. If not enough memory is made available, those applications (including the User Agent) may be killed by the system in order to free up enough memory to perform the operation triggering this memory pressure.

The User Agent runs a version of the “Coded Frame Eviction” algorithm, removing ranges of buffered data in order to free memory for use by the system. At the end of this algorithm, the User Agent fires a “bufferedchange” event at every SourceBuffer affected by this algorithm, allowing the web application to be notified that it may need to re-request purged media data from the server before beginning playback.
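
A minimal sketch of how a page might react to such an eviction, assuming it only cares about purged data at or ahead of the playhead. The helper names `toPairs` and `rangesToRefetch` are hypothetical and not part of the proposal; only the `removedRanges` attribute comes from the IDL in this explainer.

```javascript
// Hypothetical helper: convert a TimeRanges-like object into [start, end] pairs.
function toPairs(timeRanges) {
  const pairs = [];
  for (let i = 0; i < timeRanges.length; i++) {
    pairs.push([timeRanges.start(i), timeRanges.end(i)]);
  }
  return pairs;
}

// Hypothetical helper: keep only the parts of the removed ranges that end after
// the playhead, clipped to start no earlier than the playhead. Under the
// assumption above, these are the spans to re-request from the server.
function rangesToRefetch(removedPairs, currentTime) {
  return removedPairs
    .filter(([, end]) => end > currentTime)
    .map(([start, end]) => [Math.max(start, currentTime), end]);
}

// Browser wiring (assumes `sourceBuffer` is a ManagedSourceBuffer and `video`
// is the element it feeds):
// sourceBuffer.addEventListener("bufferedchange", (e) => {
//   const missing = rangesToRefetch(toPairs(e.removedRanges), video.currentTime);
//   // Re-request the media covering `missing` before resuming playback.
// });
```

Data that was evicted entirely behind the playhead is ignored here; a page that supports seeking backwards may want a different policy.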

Memory availability notification

When a call to appendBuffer is rejected with a QuotaExceededError exception, the exception can indicate the number of excess bytes, or the excess duration, that caused the error.
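
A minimal sketch of detecting that rejection, assuming the exception is thrown synchronously by `appendBuffer` as in existing MSE. The wrapper name `tryAppend` is hypothetical. The text above does not name the field that would carry the excess bytes or time, so the sketch deliberately does not read one.

```javascript
// Hypothetical wrapper: attempt an append and report whether the buffer was full.
function tryAppend(sourceBuffer, data) {
  try {
    sourceBuffer.appendBuffer(data);
    return { appended: true };
  } catch (err) {
    if (err && err.name === "QuotaExceededError") {
      // The buffer is full. The caller can free space (for example by removing
      // already-played data) and retry later.
      return { appended: false, reason: "quota" };
    }
    throw err; // Any other failure is unrelated to memory limits.
  }
}
```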

Network Streaming changes

Currently, a web application is allowed to append media data into a Source Buffer at any time, up until that Source Buffer’s “buffer full flag” is set, indicating no additional data is allowed to be appended. However, a constrained device may want to coalesce network use into a small window, and allow the network to quiesce for battery and bandwidth reasons.

Alternatively, a device may have access to a high-speed network with high power use while the relevant communications interface is active (as can happen on 5G cellular). Using such a network may be beneficial in some circumstances:

  • the network may be less congested
  • bandwidth may be unmetered on certain plans
  • and media may start, re-buffer, and reach higher resolutions faster.

To get these benefits without excessive battery drain, it's necessary to buffer more at once, and to limit streaming activity to specific windows so that the device's radio can be cycled on and off.

The User Agent would fire a “startstreaming” event at the MediaSource, indicating that the web application should begin streaming new media data. It would be up to the User Agent to determine when streaming should start, and it could take into account current buffer levels, the current time, network conditions, and other networking activity on the system.

When the User Agent determines that no further media streaming should take place, it would fire an “endstreaming” event at the MediaSource, indicating to the web application that enough media data had been buffered to allow playback to continue successfully.
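
A minimal sketch of a page-side gate driven by these two events, assuming the page fetches segments one at a time from a known list. The class name `StreamingGate` and its methods are hypothetical; the handler names `onstartstreaming` and `onendstreaming` are the ones in the IDL below.

```javascript
// Hypothetical scheduler: hands out segment URLs only while streaming is allowed.
class StreamingGate {
  constructor(segmentUrls) {
    this.queue = [...segmentUrls];
    this.streaming = false;
  }

  // Call from the onstartstreaming handler.
  start() {
    this.streaming = true;
  }

  // Call from the onendstreaming handler.
  stop() {
    this.streaming = false;
  }

  // Returns the next URL to fetch, or null when paused or when nothing is left.
  next() {
    if (!this.streaming || this.queue.length === 0) return null;
    return this.queue.shift();
  }
}

// Browser wiring (assumes `source` is a ManagedMediaSource):
// const gate = new StreamingGate(["seg1.m4s", "seg2.m4s"]);
// source.onstartstreaming = () => gate.start();
// source.onendstreaming = () => gate.stop();
```

A fetch loop would call `next()` before each request and go idle whenever it returns null.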

[Exposed=(Window,DedicatedWorker)]
interface ManagedMediaSource : MediaSource {
    constructor();

      readonly attribute boolean          streaming;
               attribute EventHandler     onstartstreaming;
               attribute EventHandler     onendstreaming;
                   
                  static boolean          isTypeSupported(DOMString type);
};

[Exposed=(Window,DedicatedWorker)]
interface BufferedChangeEvent : Event {
    constructor(DOMString type, BufferedChangeEventInit eventInitDict);

    [SameObject] readonly attribute TimeRanges addedRanges;
    [SameObject] readonly attribute TimeRanges removedRanges;
};

dictionary BufferedChangeEventInit : EventInit {
    TimeRanges addedRanges;
    TimeRanges removedRanges;
};

[Exposed=(Window,DedicatedWorker)]
interface ManagedSourceBuffer : SourceBuffer {
               attribute EventHandler     onbufferedchange;
};

Usage example

async function setUpVideoStream(){
  // Specific video format and codec
  const mediaType = 'video/mp4; codecs="mp4a.40.2,avc1.4d4015"';
  
  // Check if the type of video format / codec is supported.
  if (!ManagedMediaSource.isTypeSupported(mediaType)) {
    return; // Not supported, do something else.
  }

  // Set up video and its managed source.
  const video = document.createElement("video");
  const source = new ManagedMediaSource();
  
  video.disableRemotePlayback = true;
  video.controls = true;
  
  await new Promise((resolve) => {
    video.src = URL.createObjectURL(source);
    source.addEventListener("sourceopen", resolve, { once: true });
    document.body.appendChild(video);
  });
  
  const sourceBuffer = source.addSourceBuffer(mediaType);
  
  // Set up the event handlers
  sourceBuffer.onbufferedchange = (e) => {
    console.log("onbufferedchange event fired.");
    console.log(`Added Ranges: [${timeRangesToString(e.addedRanges)}]`);
    console.log(`Removed Ranges: [${timeRangesToString(e.removedRanges)}]`);
  };
  
  source.onstartstreaming = async () => {
    const response = await fetch("./videos/bipbop.mp4");
    const buffer = await response.arrayBuffer();
    await new Promise((resolve) => {
      sourceBuffer.addEventListener("updateend", resolve, { once: true });
      sourceBuffer.appendBuffer(buffer);
    });
  };
  
  source.onendstreaming = async () => {
    // Stop fetching new segments here
  };
}

// Helper function...
function timeRangesToString(timeRanges) {
  const ranges = [];
  for (let i = 0; i < timeRanges?.length; i++) {
    const range = [timeRanges.start(i), timeRanges.end(i)];
    ranges.push(range);
  }
  return ranges.toString();
}

<body onload="setUpVideoStream()"></body>

Privacy considerations

TODO: discuss potential privacy protections if multiple origins try to poke at this at the same time.
A concern is providing visibility into the preferred quality if it is based on network conditions such as cellular or Wi-Fi.

Other

To consider from MSE v2:

  • Define behaviour for playback over unbuffered range.

jyavenard changed the title from "[DRAFT] ManagedMediaSource: Explainer" to "Proposal: ManagedMediaSource API" on Jun 2, 2023

@marcoscaceres
Member

If there is no objection, @jyavenard and I would like to submit a Pull Request to the spec outlining the details of how it might work to solicit further feedback from working group members. We are also happy to provide a test suite in the form of Web Platform Tests.

We have a reference implementation in WebKit if folks want to try it out.

@chrisn
Member

chrisn commented Jun 2, 2023

Given positive reactions to @jyavenard's post above, I think a PR would be welcome. I'd like to arrange to talk through the proposal in an upcoming Media WG meeting, if you'd be happy to?

We're also going to need someone in the WG as a new co-editor.

@dalecurtis

dalecurtis commented Jun 2, 2023

Thanks for working on this! The buffering events look great. I do have some minor concerns, but no blocking objections:

  • quality and onqualitychanged are likely to face privacy concerns.
  • Rejecting appends for violation of streaming events seems infeasible since clients may be doing asynchronous processing before handing the data to MSE (transmuxing, unpacking proprietary stream)
  • It's unclear to me how the streaming events can properly account for bandwidth. MSE only knows the size of the appends. Even if it knew bytes downloaded and time, it wouldn't know what bytes are media.
    • Very low bandwidth situations seem like they could fire these events at the wrong times if just based on time left. Maybe this ends up not mattering in practice though and watching bytes between streaming events is enough.
    • You could imagine something like ManagedMediaSource.fetch() if we need the UA to have perfect knowledge and the ability to schedule optimally for radio.

@RealAlphabet

RealAlphabet commented Jun 6, 2023

I can't wait to see the outcome of this new proposal. I think what's being proposed here is really what MSE is lacking most today, so I'm all for it.

Closely related, but a little off topic. Is this API currently deployed in WebKit, or is it just an experiment for now? The MSE API has been disabled on the iPhone for a few years, so I was wondering if the arrival of this new proposal, as well as the arrival of Managed Media Source on iOS 17 (according to the Safari 17 Beta release notes), would result in the API being re-enabled on the iPhone.

@jyavenard
Member Author

I can't wait to see the outcome of this new proposal. I think what's being proposed here is really what MSE is lacking most today, so I'm all for it.

That's great to hear!

Managed Media Source is enabled on Safari 17 beta on macOS and iPadOS and behind an experimental flag on iOS for now.

There will be a talk for WWDC on developer.apple.com on Thursday June 8th "Explore media formats for the web" where it is presented.

@tidoust
Member

tidoust commented Jun 6, 2023

Anticipating that there may be a period when there are browsers or platforms that only support MediaSource and browsers or platforms that only support ManagedMediaSource, would it be possible to clarify the story for developers who want to use MSE across devices?

For instance, is the pull request that proposes to add support for ManagedMediaSource in HLS.js (video-dev/hls.js#5542) representative of the amount of code needed to adjust applications to leverage ManagedMediaSource given an existing MediaSource pipeline, or would it be likely that such applications need to maintain separate MediaSource and ManagedMediaSource pipelines, with different adaptive logic?

I'm also wondering about the overall message down the road. Is it "This strikes a better balance than the previous version of MSE, use ManagedMediaSource whenever possible, fall back to MediaSource only when it is not supported". Or more "ManagedMediaSource and MediaSource have different usage scenarios". If the latter, I think it would be useful to provide guidance in the spec on how to choose between options.

@jyavenard
Member Author

The message we want to convey is to use Managed Media Source first if available, and only fall back to MSE if that's the only option available.
The events are hints; you can follow them or not. On iPhone and iPad, if you follow the guidance, you will have access to 5G connectivity. So you could use Managed Media Source just like MSE.
If you don't follow the guidance and a particular user agent decides to enforce it by throwing when you attempt to append data, the remedial code would be applicable with MSE too.

Any logic to decide which resolution variant is suitable to use would be common between the two as far as bandwidth management is concerned. As mentioned by @dalecurtis, the quality attribute may not fly because of fingerprinting concerns.
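
A minimal sketch of the "Managed Media Source first, MSE as fallback" selection described above. The function name `pickMediaSourceClass` is hypothetical, and passing the global object in as a parameter is only a choice that makes the logic easy to exercise outside a browser.

```javascript
// Hypothetical helper: prefer ManagedMediaSource, fall back to MediaSource,
// and return null when neither is exposed on the given global scope.
function pickMediaSourceClass(scope) {
  if (typeof scope.ManagedMediaSource === "function") {
    return scope.ManagedMediaSource;
  }
  if (typeof scope.MediaSource === "function") {
    return scope.MediaSource;
  }
  return null;
}

// Browser usage:
// const SourceClass = pickMediaSourceClass(globalThis);
// if (SourceClass) {
//   const source = new SourceClass();
//   // Attach the startstreaming / endstreaming handlers only if they exist.
// }
```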

@mwatson2
Contributor

mwatson2 commented Jun 6, 2023

Improved exposure of and flexibility for memory constraints would be great.

Regarding browser optimization of network request timing, isn't this a general concern rather than one that is specific to streaming? The key property of streaming network requests is that they are (often) not urgent, because we have lots of buffered data, and so we're happy to trade some latency for improved overall throughput or some other benefit such as battery life. The proposed start / stop streaming events don't prevent the site from downloading, only from appending. And it could presumably happen that a network request issued during a "streaming allowed" period doesn't complete within that period, but we should still be allowed to append it.

An alternative, admittedly more radical, approach would be a way to tag network requests as "non-urgent" and so eligible for delay within the browser until a time where they can be made more efficiently. Of course the browser should only delay such requests if there is some pay-off that the site could later observe.

Regarding the quality hints, I think that as with the network request timing there needs to be some measurable benefit to the site to start listening to these and I am not sure what that is ?

Does it need to be a new class ? Or could these just be discoverable extensions to the existing MediaSource ?

@jernoble

jernoble commented Jun 7, 2023

(Implementor hat on)

@mwatson2 said:

The proposed start / stop streaming events don't prevent the site from downloading, only from appending.

Correct, they don't prevent the site from downloading. The current language allows a UA to prevent appending, but in our implementation experience, that wasn't actually necessary. We left the ability to block appends in the proposal should a UA decide it was necessary or desirable to implement, but that could be pulled out into a separate proposal and removed from this one.

An alternative, admittedly more radical, approach would be a way to tag network requests as "non-urgent" and so eligible for delay within the browser until a time where they can be made more efficiently.

Seems like a good idea to do whether or not we do ManagedMediaSource, and also something very outside the Media WG's bailiwick. Doing this without wreaking havoc on sites' bandwidth estimation would also be difficult. The current proposal allows UAs to incorporate buffer water-levels in their decision to fire startstreaming/endstreaming, and that would be much, much more difficult to do with a free-floating fetch request marked as "low-priority".

Does it need to be a new class ? Or could these just be discoverable extensions to the existing MediaSource ?

We discussed this with @wolenetz, and the alternative would be to pass a dictionary containing configuration modes into the MediaSource constructor. What we all discovered is, that kind of mode switch is not easily feature detectable; a separate class that extends MediaSource is very feature detectable and enables the same capabilities as a configuration mode.

@jyavenard
Member Author

jyavenard commented Jun 7, 2023

Improved exposure of and flexibility for memory constraints would be great.

You mean in addition to this proposition?
The difference over MSE is that the coded frame eviction algorithm can be run at any time and not just during the Prepare Append step.
When it has run, a bufferedchange event will be fired, along with the TimeRanges that were evicted.
I would say that it is up to the user agent to evict content in such a way that it doesn't prevent the current media from continuing to play. This could be suggested in the final spec. We could propose two different variants of the coded frame eviction algorithm: one that would only evict content no longer necessary for playback to continue as-is, such as past data, or future data discontinuous from the currently playing TimeRange; and one that could make playback stall under extreme memory pressure.
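
A minimal sketch of the first, non-stalling kind of eviction described above, written as page-style JavaScript purely for illustration (a User Agent would implement this internally). The function name and the `keepBehind` margin are hypothetical, and `buffered` is assumed to be a sorted list of non-overlapping [start, end] pairs.

```javascript
// Hypothetical selection of ranges that can be evicted without stalling playback:
// past data, and any range that does not contain the playhead (which covers
// future ranges discontinuous from the currently playing range).
function conservativeEvictionCandidates(buffered, currentTime, keepBehind = 0) {
  const candidates = [];
  const cutoff = currentTime - keepBehind;
  for (const [start, end] of buffered) {
    const containsPlayhead = start <= currentTime && currentTime < end;
    if (containsPlayhead) {
      // Only the already-played part of the playing range may go, minus the
      // margin kept behind the playhead.
      if (start < cutoff) candidates.push([start, cutoff]);
    } else {
      // Ranges that do not contain the playhead are evicted whole; the
      // keepBehind margin applies only to the playing range.
      candidates.push([start, end]);
    }
  }
  return candidates;
}
```

For example, with buffered ranges [0, 4], [8, 20] and [30, 40], a playhead at 10 and a margin of 1 second, the candidates are [0, 4], [8, 9] and [30, 40]; the data from 9 to 20 is left alone.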

The proposed start / stop streaming events don't prevent the site from downloading, only from appending. And it could presumably happen that a network request issued during a "streaming allowed" period doesn't complete within that period, but we should still be allowed to append it.

I will add to Jer's answers:
WebKit's implementation doesn't block appends. What it does, however, when the streaming attribute is true (that is, between the startstreaming and endstreaming events), is tag all downloads started by the page where a ManagedMediaSource is open as "media", so that they can go over the 5G network (this also depends on the user's system settings). Outside this period, the cellular modem may go into low-power mode, disabling 5G.

@chrisguttandin

An alternative, admittedly more radical, approach would be a way to tag network requests as "non-urgent" and so eligible for delay within the browser until a time where they can be made more efficiently.

When I read the comment from @mwatson2 above I vaguely remembered that such a mechanism already exists. There is a priority property that can be set when using fetch(). It seems to be only available in Chrome for now. https://web.dev/fetch-priority/#lower-the-priority-for-non-critical-data-fetches

Sorry if that was obvious to you all.
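
A minimal sketch of setting that property on a media segment request. The helper name `segmentRequestInit` and the idea of classifying segments as urgent or not are assumptions for illustration; `priority` with the values "high", "low" and "auto" is the option described on the linked page, and browsers that do not implement it are expected to ignore it.

```javascript
// Hypothetical helper: build fetch() options for a media segment request.
// Segments needed soon get "high"; segments fetched well ahead of the playhead
// get "low". This only hints at relative priority between requests.
function segmentRequestInit(isUrgent) {
  return { priority: isUrgent ? "high" : "low" };
}

// Browser usage:
// const response = await fetch(segmentUrl, segmentRequestInit(false));
```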

@mwatson2
Contributor

mwatson2 commented Jun 8, 2023

@jernoble As a site implementor I would be very concerned about the user agent making assumptions about "buffer level". The buffer state consists of both media that is appended and media that has been downloaded and not appended. When considering non-trivial playback scenarios (anything where the download is more complex than a straightforward linear sequence of media blocks) the site is managing what is downloaded and what is appended. For example, sometimes we append media "just-in-time" for playback in which case the UA has no useful information about the true buffer level.

Looked at another way, if the UA starts treating media differently based on the UA's perception of buffer level, sites are just going to optimize when they append to get to the site's idea of optimum performance.

A better concept is the "urgency" of requests i.e. some information about the actual earliest time the response might be needed.

@jyavenard About the memory constraints, I meant this proposal. However, ideally, if there are memory constraints it would be nice for the site to be able to know about them in advance so we can consider them in our adaptive streaming choices. On very constrained devices we sometimes stream at a lower bitrate even when throughput is high so as to be able to store enough media to cover adaptations in future. Of course, a site can heuristically work out what the constraint is by observing the UA behavior when it comes to removals.

From your description, it sounds like what the startstreaming and endstreaming events are really doing is advertising the availability of a more efficient network connection which is only intermittently available. Perhaps those events should not be tied to streaming and should just be global events that do exactly that (e.g. networkavailability event with values like standard, lowpower, highspeed - only better names 'cos I just made those up). And then sites would take advantage of those by deferring non-urgent requests into the more efficient time periods.

@chrisguttandin Yes, it certainly seems like a UA could defer "low" fetchpriority requests to the time period within which the 5G connection is available.

@jernoble

jernoble commented Jun 8, 2023

@mwatson2 said:

Yes, it certainly seems like a UA could defer "low" fetchpriority requests to the time period within which the 5G connection is available.

No, the fetch spec does not allow that currently. The priority is only used to prioritize fetch requests relative to each other. It doesn't appear to allow UAs to delay fetches indefinitely if conditions are not "ripe" for a low priority request. The notes from Chrome explicitly state that fetchpriority is only really useful in situations where limited bandwidth is being contended for by multiple requests. IOW, if the UA started arbitrarily delaying fetches marked as "low", that would likely make a lot of sites upset. This would need a new "super-low" or "super-duper-low" fetch priority to be specified.

@jernoble

jernoble commented Jun 9, 2023

As a site implementor I would be very concerned about the user agent making assumptions about "buffer level".

The UA has to make those assumptions in order to implement things like readyState. There's no avoiding it. I would counter that:

For example, sometimes we append media "just-in-time" for playback in which case the UA has no useful information about the true buffer level.

Seems that by not appending data that has been downloaded and the site intends to play, the resulting problem is one of the sites' own making. One that is easily avoided by just appending that downloaded data rather than saving it for a "just in time" append when the buffered level becomes critically low.

Looked at another way, if the UA starts treating media differently based on the UA's perception of buffer level, sites are just going to optimize when they append to get to the site's idea of optimum performance.

Yes, that is literally the point. :)

A site that can "lie" to the UA by fetching a ton of data up front, and appending that previously downloaded data whenever they receive startstreaming and until they receive stopstreaming will have fantastic power characteristics, as they will leave the radios quiet for the longest period of time.

In the end, startstreaming and stopstreaming events are hints, and sites which fetch and append the way the UA expects between those events may see benefits, including lower power cost and greater download speed.

@mwatson2
Contributor

mwatson2 commented Jun 9, 2023

@jernoble wrote:

Seems that by not appending data that has been downloaded and the site intends to play, the resulting problem is one of the sites' own making.

Yep, but I don't think anyone is going to hold back on appending data that they 100% intend to play. The use-case for "just-in-time" appending is when you are not sure until that time what media is to be played. I appreciate that an alternative is to append anyway and then replace if you change your mind, but this has its own complexities. The point is that MSE provides sites with the flexibility to compose media streams in whatever way they choose in a manner that is decoupled from the download strategy. This is very useful. UA assumptions about network requests made based on what has been appended are likely to be incorrect.

I'm assuming that a possible UA algorithm would be to turn on the expensive radio when buffer levels get low and to switch to a longer on / off duty cycle the higher the buffer level. A site that wanted to game this would hold back appends to gain access to the expensive fast radio more often, optimizing for their own goal (throughput) but defeating the UA's objective to save battery life. A more enlightened site developer might share your desire to preserve battery, but in that case wouldn't it be better to give control of the duty cycle to the site, which knows more about its data needs ?

@jernoble

jernoble commented Jun 9, 2023

I appreciate that an alternative is to append anyway and then replace if you change your mind, but this has its own complexities.

Yes, that is the preferred pattern of use. It doesn't seem reasonable to design and implement a complicated network API because of "complexities" in how overlapping appends work. We should just address those complexities directly!

wouldn't it be better to give control of the duty cycle to the site, which knows more about its data needs ?

No, because the site isn't the only application on the system driving the radio, nor is the ability for a website to control the duty cycle of an expensive modem a desirable thing (IMO) for the web platform.

@jernoble

jernoble commented Jun 9, 2023

@dalecurtis

You could imagine something like ManagedMediaSource.fetch() if we need the UA to have perfect knowledge and the ability to schedule optimally for radio.

We considered this, but the risk here is ManagedMediaSource.fetch() becoming a "magic incantation" in the web platform for "download this faster". (Much in the same way as transform: rotateZ(0deg) became a "magic incantation" for "make this div layer-backed".)

ManagedMediaSource.prototype.fetch() may allow the UA to know with relative certainty that a given network request was a media request, but it's not a guarantee. And it doesn't actually solve the problem of indicating when that fetch request should be issued. If ManagedMediaSource.prototype.fetch() allowed the UA to delay issuing the request until "the best time for networking" or when "buffer levels cross a low-water threshold" then it's functionally the same as the startstreaming event, at the cost of a much more complicated specification and implementation.

@dalecurtis

dalecurtis commented Jun 9, 2023

We considered this, but the risk here is ManagedMediaSource.fetch() becoming a "magic incantation" in the web platform for "download this faster". (Much in the same way as transform: rotateZ(0deg) became a "magic incantation" for "make this div layer-backed".)

I agree with this. However the risks here are equivalent to the streaming event approach if we require some percentage of fetched bytes to be appended. I.e., with either solution a page could use canned data to simulate the buffering levels required to get 5G if they really wanted to.

ManagedMediaSource.prototype.fetch() may allow the UA to know with relative certainty that a given network request was a media request, but it's not a guarantee.

The risks here also seem the same.

If ManagedMediaSource.prototype.fetch() allowed the UA to delay issuing the request until "the best time for networking" or when "buffer levels cross a low-water threshold" then it's functionally the same as the startstreaming event, at the cost of a much more complicated specification and implementation.

Yes, I was expecting the fetch version to delay in the same way, but with added benefit of knowing the download rate so that fetches can be scheduled with transfer time in mind. I don't quite follow how you expect to be able to deliver startstreaming reliably without that knowledge -- which is critical if we expect the UA to be a trusted advisor in this context and recommend developers prefer ManagedMediaSource over MediaSource. Can you elaborate on how that's expected to work?

@mwatson2
Contributor

mwatson2 commented Jun 9, 2023

@jernoble My point is that - at least - if you embed assumptions into the design - like the assumption that the site is downloading a single simple linear media sequence and will append media as soon as it is downloaded - you'd better be explicit about that assumption. So then sites that do not conform to that can avoid using the new API.

But I'd prefer a solution that did not embed such an assumption because that is clearly just one specific use case - albeit a common one.

The fundamental problem here is one where you want to schedule downloads to take advantage of a resource that is slow and/or expensive to enable, use and disable. As a result, we get the best results when the resource is intermittently available and fully utilized when it is available (i.e. we want to aggregate the idle times, compared to current download scheduling). This problem has very little to do with streaming media, except that media is one example where the application is (sometimes) robust to downloads being scheduled this way.

Ideally, the site would simply be able to provide each download with a wallclock deadline. This would give the UA perfect knowledge of when each request was required and it could schedule in the most efficient way.
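
A minimal sketch of what a scheduler could do with such deadlines, assuming each request carries a wall-clock deadline in milliseconds and the scheduler knows when the next efficient network window opens. No such API exists in Fetch today; the function name and the `deadlineMs` field are purely hypothetical.

```javascript
// Hypothetical: split requests into those that cannot wait for the next
// efficient network window and those that can be deferred to it.
function splitByDeadline(requests, nextWindowStartMs) {
  const issueNow = requests
    .filter((r) => r.deadlineMs < nextWindowStartMs)
    .sort((a, b) => a.deadlineMs - b.deadlineMs); // earliest deadline first
  const defer = requests.filter((r) => r.deadlineMs >= nextWindowStartMs);
  return { issueNow, defer };
}
```

With requests due at 500, 100 and 900 ms and a next window opening at 600 ms, the first two are issued now in deadline order (100, then 500) and the third is deferred.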

What's proposed is that during the streaming period the UA will assume a deadline for all network requests that is based on the state of the MediaSourceBuffers. It seems like a big assumption that could have unintended consequences.

@jernoble

jernoble commented Jun 9, 2023

But I'd prefer a solution that did not embed such an assumption because that is clearly just one specific use case - albeit a common one.

To be fair, this use case is the overwhelmingly most common one. Linear playback by appending chunks as soon as they are received is far and away the most common mode of operation. The solution proposed is incredibly simple, easy to specify, implement, and use for the most common use case of the API.

We can debate whether a more complicated networking coalescing API (defined outside of this specification and by a completely separate working group) would help solve the remaining (and much, much less common) use cases, but I don't believe that should prevent this proposal from moving forward.

Meanwhile, those use cases that don't fit neatly into the "fetch, append, throw away" mode above can... just continue on exactly as they have been with MediaSource. And if the Fetch API is modified to allow coalescing low-priority fetch requests with specific deadlines for each, those requests will work with this API as well.

@jernoble

jernoble commented Jun 9, 2023

@dalecurtis said:

However the risks here are equivalent to the streaming event approach if we require some percentage of fetched bytes to be appended. I.e., with either solution a page could use canned data to simulate the buffering levels required to get 5G if they really wanted to.

True. The UA has a great deal of latitude about both when to fire the startstreaming and stopstreaming events, and what it does between them. If that kind of "abuse" of the API was detected (and it does seem like it would be detectable), that latitude would allow a UA to act in a way to protect the user.

Yes, I was expecting the fetch version to delay in the same way, but with added benefit of knowing the download rate so that fetches can be scheduled with transfer time in mind. I don't quite follow how you expect to be able to deliver startstreaming reliably without that knowledge -- which is critical if we expect the UA to be a trusted advisor in this context and recommend developers prefer ManagedMediaSource over MediaSource. Can you elaborate on how that's expected to work?

In our implementation, the times at which the startstreaming event is fired are generous enough that there's no risk of responses coming so late that they lead to a buffer underrun. And because standard fetch() semantics apply, sites can use their existing download rate detection to determine things like variant selection. The only use case & behavior which may be negatively affected are things like live streaming, where sites will want to stay as close to the live edge as possible. In that case, they'll just ignore the *streaming events entirely, and those "negative" effects may simply mean the radio will switch to a slower-but-more-power-efficient mode. In our implementation, that "negative" scenario is exactly the same as the current status quo.

Other UAs may make different (and more advanced!) decisions about when to schedule those events. A hypothetical browser may make note of the speed at which fetches made between the streaming events take place, and allow the buffer levels to more completely empty before triggering startstreaming. Or conversely, notice that those same loads are occurring so slowly that they don't benefit from the higher-cost-but-faster radio, and revert to a less expensive network.

However, a non-goal of our implementation is to facilitate pages staying as close to the edge of the buffered range as possible, delivering data from the network "just in time" to avoid underruns. So the requirement to closely monitor and predict download speeds simply isn't present.

@jernoble

jernoble commented Jun 9, 2023

@mwatson2 said:

So then sites that do not conform to that can avoid using the new API.

I'm curious about this phrasing. Do you mean "avoid using ManagedMediaSource" entirely? Or just avoid listening to the startstreaming and stopstreaming events?

The design of this ManagedMediaSource proposal is such that, if you ignore the streaming events entirely, the behavior will be essentially the same as MediaSource. Clients are free to ignore the streaming events, and for some cases like live video, they must.
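
A minimal sketch of the opt-in pattern described above, not a reference implementation. The `StreamingGate` class and its `live` option are hypothetical names. The event names are the ones used in this comment (a later comment in this thread refers to `onendstreaming`, so check the current draft before relying on them). The gate accepts any `EventTarget`, which in a browser would be the ManagedMediaSource instance.

```javascript
// Hypothetical helper: tracks whether the UA currently recommends fetching.
class StreamingGate {
  constructor(source, { live = false, startEvent = "startstreaming", stopEvent = "stopstreaming" } = {}) {
    this.live = live;
    // Assumption: until the UA says otherwise, behave like plain MediaSource and keep fetching.
    this.uaWantsData = true;
    source.addEventListener(startEvent, () => { this.uaWantsData = true; });
    source.addEventListener(stopEvent, () => { this.uaWantsData = false; });
  }

  // Live players ignore the hints entirely; other players follow them.
  shouldFetch() {
    return this.live || this.uaWantsData;
  }
}
```

A VOD player could check `gate.shouldFetch()` before requesting each segment, while a live player would construct the gate with `{ live: true }` (or skip it altogether) and keep its existing fetch loop.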

@mwatson2
Contributor

mwatson2 commented Jun 9, 2023

I'm not saying the proposal shouldn't move forward. I was trying to see if there was any scope for something more flexible. I do think that the assumptions should be made explicit.

Do you mean "avoid using ManagedMediaSource" entirely? Or just avoid listening to the startstreaming and stopstreaming events?

Unless I'm mis-understanding, if you use ManagedMediaSource and then ignore the startstreaming and stopstreaming events then throughput measurements are going to be a bit messed up because the site's requests will randomly fall into the 5G streaming windows or not. Or is the radio on / off behavior essentially the same with MediaSource and the difference is just whether you tell the site about it or not ?

In the former case, it would be good if the BufferChanged event (specifically) could just be added to the existing MediaSourceBuffer - I think that would be backwards-compatible, no ?

@dalecurtis

dalecurtis commented Jun 9, 2023

@jernoble said:

However, a non-goal of our implementation is to facilitate pages staying as close to the edge of the buffered range as possible, delivering data from the network "just in time" to avoid underruns. So the requirement to closely monitor and predict download speeds simply isn't present.

Thanks that explains a lot. I can see how this system works well enough for VOD playbacks.

As you note, a live stream might have to ignore the streaming events to maintain buffering. That seems in conflict with the language around the UA being able to block appends. How do you see that being reconciled? Is stopstreaming never fired since the forward buffering level remains too low?

The text around how the streaming events are to be used will need some care to ensure developers are aware that the streaming events can't function as the sole buffering mechanism during live streaming. It'll be a bit surprising to first time authors I expect; but after ten years, there aren't many non-library based players so maybe that's no big deal.

@jernoble

jernoble commented Jun 9, 2023

@jernoble said:

However, a non-goal of our implementation is to facilitate pages staying as close to the edge of the buffered range as possible, delivering data from the network "just in time" to avoid underruns. So the requirement to closely monitor and predict download speeds simply isn't present.

Thanks that explains a lot. I can see how this system works well enough for VOD playbacks.

As you note, a live stream might have to ignore the streaming events to maintain buffering. That seems in conflict with the language around the UA being able to block appends. How do you see that being reconciled?

As I mentioned upthread, we found during implementation that blocking appends was unnecessary, and I suggested that we remove that language from the proposal and track it in another issue.

Something @jyavenard and I thought about earlier was an explicit signal to the UA that the client would be doing live streaming; something like ManagedMediaSource.prototype.streaming = true. If this flag was set, the UA would assume the client would always try to stay as close to the live edge as possible and would pick the best network for frequent small requests. But we found in practice that this wasn't really necessary; detecting live-streaming behavior was sufficient.

@jernoble

jernoble commented Jun 9, 2023

I'm not saying the proposal shouldn't move forward. I was trying to see if there was any scope for something more flexible. I do think that the assumptions should be made explicit.

Ah, I understand now, thanks.

Do you mean "avoid using ManagedMediaSource" entirely? Or just avoid listening to the startstreaming and stopstreaming events?

Unless I'm mis-understanding, if you use ManagedMediaSource and then ignore the startstreaming and stopstreaming events then throughput measurements are going to be a bit messed up because the site's requests will randomly fall into the 5G streaming windows or not. Or is the radio on / off behavior essentially the same with MediaSource and the difference is just whether you tell the site about it or not ?

This will vary from platform to platform, but on platforms with multiple radios, it's certainly possible for network speeds to vary wildly as data is routed on one or the other with different capabilities. And of course on mobile devices, users can travel in and out of coverage with varying levels of signal quality and capabilities. In my own neighborhood, I noticed that available bandwidth dropped off a cliff sometimes even when not moving, presumably as other people around me all tried to use the network simultaneously.

So yes, it could cause bandwidth measurements to change as radios were activated and deactivated. But I don't believe this is a new problem, nor one that sites are unprepared to deal with.

In the former case, it would be good if the BufferChanged event (specifically) could just be added to the existing MediaSourceBuffer - I think that would be backwards-compatible, no ?

The proposed bufferchanged event is fired when the UA purges data from SourceBuffers, either from an explicit request by the client or when a low-memory event forces the UA to evict appended data. Maybe I'm mis-understanding in turn, but how would that tie in with bandwidth estimation?
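
As a rough illustration of how a client might react to such an eviction, the sketch below works out which previously appended segments overlap the purged ranges. It assumes the event exposes the purged spans as a TimeRanges-like `removedRanges` property; that property name and the `segmentsToRefetch` helper are assumptions for illustration, and the draft text is authoritative.

```javascript
// Convert a TimeRanges-like object (length, start(i), end(i)) into [start, end] pairs.
function toPairs(ranges) {
  const pairs = [];
  for (let i = 0; i < ranges.length; i++) {
    pairs.push([ranges.start(i), ranges.end(i)]);
  }
  return pairs;
}

// Hypothetical helper: given the segments the client appended and the ranges the UA
// removed, return the segments that overlap a removed range and may need re-fetching.
function segmentsToRefetch(appendedSegments, removedRanges) {
  const removed = toPairs(removedRanges);
  return appendedSegments.filter((seg) =>
    removed.some(([start, end]) => seg.start < end && seg.end > start)
  );
}
```

In a page, this would be called from the buffer-change event listener on the source buffer, passing the client's own segment bookkeeping and the ranges reported by the event.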

@jyavenard
Member Author

Managed Media Source is enabled on Safari 17 beta on macOS and iPadOS and behind an experimental flag on iOS for now.

There is also a flag that says 'Managed Media Source requires AirPlay Source' which is checked. May I ask the purpose of that dependency? I could not find mentions of AirPlay in this proposal anywhere. But if I overlooked it, please guide me.

Besides striking a jarring note to our developers, it makes it harder to explain to our users.

It is not part of this proposal. I explained the reason behind it in https://developer.apple.com/videos/play/wwdc2023/10122

I don't see how this has any negative effects on your users that requires an explanation, quite the opposite. In order to avoid usability regression (where all videos would have the ability to be used with AirPlay), when using ManagedMediaSource on iPhone, you need to provide an alternative video source that is compatible with remote media playback.

@dragotitev

I tried to use the ManagedMediaSource API on iOS 17 in the simulator. The feature flag is enabled for the ManagedMediaSource API. When I call the addSourceBuffer() method with any MIME type, it just says it is not supported and cannot create a source buffer. And when I try the isTypeSupported() method, it returns false every time. I tried different combinations of MIME types, with and without codecs, and it always returns false. The only MIME type that returns 'true' is "video/webm" without any codecs. So my question is: which codecs and MIME types does iOS support, and do others have the same issue of not being able to use Managed Media Source on iOS? Or maybe I am doing something wrong when I use the addSourceBuffer() and isTypeSupported() methods.
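
For readers following along, the kind of check being described can be sketched as below. This is only an illustration under stated assumptions: `pickMediaSource` is a hypothetical helper, and it assumes ManagedMediaSource exposes the same static isTypeSupported() as MediaSource. It does not explain the simulator behavior reported here.

```javascript
// Hypothetical helper: pick a media source constructor that supports `mimeType`.
// `scope` is the global object (window or self in a browser).
function pickMediaSource(scope, mimeType) {
  // Prefer the managed variant, then fall back to classic MediaSource.
  const candidates = [scope.ManagedMediaSource, scope.MediaSource];
  for (const Ctor of candidates) {
    if (typeof Ctor === "function" && Ctor.isTypeSupported(mimeType)) {
      return Ctor;
    }
  }
  return null; // Neither constructor exists, or neither supports the type.
}
```

In a page this might be called as `pickMediaSource(globalThis, 'video/mp4; codecs="avc1.42E01E, mp4a.40.2"')`, where the MIME string is only an example value.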

@jyavenard
Member Author

You need to use iOS 17.1 beta 2. But this isn't the place to ask those questions; please use bugs.webkit.org. Thank you.

@briantbutton

jyavenard, thank you for the prompt reply.

It is not part of this proposal. I explained the reason behind it in https://developer.apple.com/videos/play/wwdc2023/10122

Thanks, I viewed that video twice. There is no mention of the flag 'Managed Media Source requires AirPlay Source'. Perhaps you could convey the explanation here?

Managed Media Source is enabled on Safari 17 beta on macOS and iPadOS and behind an experimental flag on iOS for now.

This proposal opened with a presentation of the new feature, an invitation to test a reference implementation, and a pointer to the Feature Flag. The 'Managed Media Source requires AirPlay Source' flag was not mentioned but is absolutely required to test the reference implementation. Probably good to fill in that part too.

I don't see how this has any negative effects on your users that requires an explanation

Respectfully, it does. As soon as we mention AirPlay, users have a dozen questions which we cannot answer. "Is this turning off AirPlay?" "Do I need to use AirPlay?" "Is your product using AirPlay?" This is not hypothetical; we have encountered it.

I believe that AirPlay is a branded Apple service, is it not? We wish to show our users that we are delivering something that uses standards and is independent of any proprietary vendor product/service.

In order to avoid usability regression (where all videos would have the ability to be used with AirPlay)

I am suddenly feeling concerned. If we ask a user to test Managed Media Source using the reference implementation introduced here, are we changing the behavior of their iPhone vis-a-vis AirPlay?

Thank you,


This is not particularly relevant but we are streaming audio, not video.

@jyavenard
Member Author

Thanks, I viewed that video twice. There is no mention of the flag 'Managed Media Source requires AirPlay Source'. Perhaps you could convey the explanation here?

There definitely is; you should watch again at the 20-minute mark, or read it in the transcript:

When designing Managed MSE, we wanted to make sure that nothing was left out by accident and that users continue to get the same level of features as they did in the past. So to activate Managed MSE on Mac, iPad, and iPhone, your player must provide an AirPlay source alternative. You can still have access to Managed MSE without it, but you must explicitly disable AirPlay by calling disableRemotePlayback on your media element from the Remote Playback API

Currently, on iPhone, only plain mp4 or HLS is supported; those inherently work with AirPlay (Apple's version of the spec's remote playback). And AirPlay is a very popular (and well-used) feature. Most A/V receivers support it these days.
MSE, however, can't work with AirPlay: it has no reference source of content.

We didn't want existing functionality to break overnight once ManagedMediaSource became available.
So you need to provide an alternative playback source, or explicitly disable remote playback.

We hope that the solution adopted will be the former (using a source that likely already exists, since earlier iOS versions need to be supported), so that users can continue to listen or view on their preferred A/V equipment.
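
The two options described above could look roughly like the sketch below. It is a hedged illustration, not the implementation's required shape: `attachWithRemotePlaybackPolicy` is a hypothetical helper, the `type` strings are assumed values, and `mseUrl` stands for an object URL created from the ManagedMediaSource.

```javascript
// Hypothetical helper: attach a media source to `media` (an HTMLMediaElement) either
// with an alternative source for remote playback, or with remote playback disabled.
function attachWithRemotePlaybackPolicy(doc, media, mseUrl, airplayUrl) {
  if (!airplayUrl) {
    // No alternative source available: explicitly opt out of remote playback.
    media.disableRemotePlayback = true;
    media.src = mseUrl;
    return;
  }
  // Offer the MSE object URL first, then a source a remote device can fetch itself.
  // The type values below are assumptions for illustration.
  const sources = [
    [mseUrl, "video/mp4"],
    [airplayUrl, "application/x-mpegURL"],
  ];
  for (const [src, type] of sources) {
    const source = doc.createElement("source");
    source.src = src;
    source.type = type;
    media.appendChild(source);
  }
}
```

In a page, `doc` would be `document` and `airplayUrl` would point at an existing HLS or mp4 rendition; omitting it takes the disableRemotePlayback path.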

I am suddenly feeling concerned. If we ask a user to test Managed Media Source using the reference implementation introduced here, are we changing the behavior of their iPhone vis-a-vis AirPlay?

So in all honesty, I believe your concerns are unwarranted. In the worst case, you need to set a single attribute on your audio element for things to work as you expected.

And again, this has nothing to do with this proposal, so this will be my last answer on this topic here.

@briantbutton

jyavenard, thank you for the patient, informative response. Sorry for hijacking this channel for a moment. We will get back to the lab and finish building it into our software.

We are delighted to have this proposal, BTW. More than you can imagine.

avelad added a commit to shaka-project/shaka-player that referenced this issue Oct 11, 2023
@jyavenard
Member Author

I've written a first draft here: https://jyavenard.github.io/media-source/media-source-respec.html
https://github.com/jyavenard/media-source/tree/managed_mse

@wilaw

wilaw commented Oct 13, 2023

@jyavenard - nice work. It might be useful to also extend the Examples section of https://github.com/jyavenard/media-source/tree/managed_mse to include an example showing the use of ManagedMediaSource and the onstartstreaming and onendstreaming events.

@jyavenard
Member Author

@jyavenard - nice work. It might be useful to also extend the Examples section of https://github.com/jyavenard/media-source/tree/managed_mse to include an example showing the use of ManagedMediaSource and the onstartstreaming and onendstreaming events.

Done

@jyavenard
Member Author

Amended the proposal so that the two TimeRanges in BufferedChangeEvent are no longer optional.

@marcoscaceres
Member

Hey folks, the PR is now up:
#329

Thanks again to everyone that's provided feedback and helped shape the overall design.

Would love to get a second (or third!) implementer commenting on the PR before landing it - as well as potential developers. We intend to prepare some tests in parallel, so we'd love some feedback on the overall design in the meantime.

@aboba

aboba commented Feb 13, 2024

Related: w3c/webtransport#522

@padenot

padenot commented Apr 9, 2024

@chrisn, @jyavenard, it's probably fine to close this, and to do the remaining work (if any) in follow-up.

@chrisn
Member

chrisn commented Apr 12, 2024

Thanks @padenot, agreed. We welcome new issues if anyone has points to follow up on.
