Proposal: Offscreen Documents for Manifest V3 #170
The downside is that this will increase the resource consumption of extensions (roughly doubling it compared to either a SW or the old background page) while the missing features are implemented for the service worker context, which realistically will take more than 5 years, maybe even 10, judging by the two-year lifetime of MV3, during which only a couple of simple features were addressed out of dozens of more complex ones. The web specification isn't interested in many/most of these lost features, and even if it were, specifications change extremely slowly due to the need to account for a lot of edge cases that matter for the web but not for extensions. Coupled with the fact that service workers offer no real benefits over the classic environment for the majority of extensions, as the wake-up time, the memory footprint, and performance are practically the same, this means that the decision to remove the old DOM-based background pages is premature and should be postponed to ManifestV4. There's just one rare case where a service worker is better: extensions that observe frequent events like network requests while showing an exceptionally heavy UI that can block the extension's main thread, e.g. complex extensions for devtools. For ManifestV3 a much more performant option is to allow both SW and non-persistent background pages, while enticing developers to migrate by offering preferential treatment of SW-based extensions, e.g. shorter reviews and/or some kind of badge in the store and in the browser. Mozilla is doing the right thing here with their support of the limited event pages, which isn't surprising as they have an actual experienced extension developer involved in the design and implementation. |
I have one request regarding the "Purpose". Therefore I suggest that for force-installed extensions (or any other method available to install an extension by an admin) the "Purpose" will not be presented to the end user nor any prompt to approve it. Side note: I think that this proposal is too little too late. MV3 has far too many limitations that this proposal does not address. |
It would be good if the OffscreenDocument had an optional "incognito mode", which does not have the user's cookies and other personal state. In something like DOM scraping, you don't want users to have a bad or confusing experience of a particular website if that website generates a personalised experience based on the user's browsing history and that history has been "polluted" by an extension's OffscreenDocument accessing certain web pages without the user's explicit involvement. An "incognito mode" would fix this. For something like user automation (with an extension taking certain actions on a website on the user's behalf), you don't want "incognito mode", as the extension will need the user's session to access the website and take action. |
Yes and no. Chrome is planning to implement offscreen documents in the same process and thread as other extensions pages in our initial implementation, but we may want to change this in the future.
While we're not planning to expose offscreen documents in this method, we are interested in making it easier for other contexts to exchange messages with offscreen documents.
No. Broadly speaking, we'd like to bring cross-context WebExtensions communications capabilities more in line with the web platform. While there are some ways that two documents can synchronously interact with each other (e.g. using window.open() to open another page on the same origin), this is the exception rather than the rule. In order for two independently created tabs on the same origin to communicate with each other, they must use asynchronous message passing capabilities like the ones you outlined. If these capabilities are insufficient, we believe it's best to address that problem at the web platform level.
We are currently planning to expand
Not yet decided, but at the moment I'm hesitant on this. If you (or anyone else) feel this would be valuable, please share how/why you'd like a separate CSP for offscreen documents (especially since, as a full document, these could be included instead within the HTML element with a tag).
Yes. |
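The asynchronous message passing mentioned in the answers above can be sketched as a small request/reply layer. This is only an illustration under stated assumptions: `OffscreenRequest`, `OffscreenReply`, `createOffscreenRouter`, and `callOffscreen` are hypothetical names, and the transport is passed in as a plain async function rather than tied to any specific extension API.

```typescript
// Envelope for requests sent to the offscreen document (hypothetical shape).
interface OffscreenRequest {
  type: string;
  payload: unknown;
}

// Every request gets an explicit reply, successful or not.
type OffscreenReply =
  | { ok: true; result: unknown }
  | { ok: false; error: string };

type OffscreenHandler = (payload: unknown) => unknown;

// Offscreen document side: map message types to handlers and always answer
// with a reply object, so the caller never waits on a silent failure.
function createOffscreenRouter(
  handlers: Map<string, OffscreenHandler>
): (request: OffscreenRequest) => Promise<OffscreenReply> {
  return async function route(request: OffscreenRequest): Promise<OffscreenReply> {
    const handler = handlers.get(request.type);
    if (handler === undefined) {
      return { ok: false, error: "unknown message type: " + request.type };
    }
    try {
      // Handlers may be synchronous or return a promise; await covers both.
      return { ok: true, result: await handler(request.payload) };
    } catch (err) {
      return { ok: false, error: String(err) };
    }
  };
}

// Caller side (for example the service worker): send a request over any
// asynchronous transport and unwrap the reply, throwing on failure.
async function callOffscreen(
  send: (request: OffscreenRequest) => Promise<OffscreenReply>,
  type: string,
  payload: unknown
): Promise<unknown> {
  const reply = await send({ type, payload });
  if (!reply.ok) {
    throw new Error(reply.error);
  }
  return reply.result;
}
```

In an extension, `send` would presumably wrap something like `chrome.runtime.sendMessage`, with the router invoked from a message listener in the offscreen document; that wiring is left out here because it depends on the final API surface.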
Implementing native DOM API in a separate thread might be architecturally impossible in all browsers, because DOM was never designed to be thread-safe, and I doubt it'll change in the foreseeable future (~10 years), so running it in the main thread may be the only realistic solution. |
On the TensorFlow.js side we have a number of users wanting to create Chrome extensions for Zoom, Google Meet, and other online video conferencing solutions. These extensions need access to the webcam stream so they can manipulate it in some way via offscreen canvas or similar, then send that canvas stream to a virtual camera that the original app can pick up as its webcam source. As an example, here is me joining a Google Meet with an ML-powered Chrome extension that does just this: https://www.youtube.com/watch?v=-JpCPIx1WGw Since publishing the above I have seen more companies and individuals try to replicate this for their own domains too. I am seeing growing interest in this area for long-lived (>5 mins) situations that need to constantly handle stream data from the webcam/microphone, along with machine learning models that may leverage things like WebGL / WASM for acceleration so that the ML can run fast in the browser environment. |
Simeon -
|
Since @dotproto didn't answer in the latest minutes why the Chromium team insists that this proposal is better in the long term than the old event pages or the new limited event pages idea, I'd like to point out that this offscreen document proposal (on top of being an unnecessary complication) can be completely subsumed by event pages if the API provides a method to prolong the lifetime of the currently running background script with a justification - exactly as proposed here, but better, because a single event page would consume far fewer resources and less CPU than an SW that constantly/randomly restarts in parallel with an offscreen document. |
Hi, Thank you for posting this proposal. I have read it, and while I have quite possibly misunderstood important parts, I am concerned about its potential long-term impact.
Concern about fragmentation
Suppose we add the ability to play audio from ServiceWorkers via AudioWorklet, obviating the need for offscreen documents for that particular use case. There's every chance that Mozilla or Apple may decide not to implement that feature due to privacy or usability or architectural concerns, or simply due to current resource constraints. In such a situation, extension authors would need to prepare two versions of their extension: the offscreen document version and the ServiceWorker version. This might appear to be the same as any other platform feature that needs to be feature-detected but it is more significant because it involves a different architecture based on the availability of the feature (i.e. it is not progressive enhancement but moving functionality around). Furthermore, given the amount of work required to implement a new Web platform API, it seems likely such situations will multiply over time making the permutations of architectures authors need to support unwieldy.
Concern about stability
The requirement to provide a series of Despite the amount of anxiety raised by the transition to MV3, this proposal sounds like extension developers can expect to endure a similar experience of having to scramble to rewrite their extensions every few months as If you will allow me to be a bit frank, in terms of optics, this comes across as a rather condescending API. There will undoubtedly also be many a use case that does not neatly fit in any of the provided Perhaps this can be addressed by requiring manual reviews for extensions using the "other" category, but doing so introduces a cost to both extension store providers and extension developers. Would it not be preferable to use an API that does not introduce such costs?
Concern about impact on the architecture of the Web platform
Furthermore, are we (and the TAG etc.) really confident that every reasonable use case should, from an architectural point-of-view, be made available to Service Workers in the future anyway, even if only made accessible to extensions? For example, my add-on would like, from the background page/worker, to load an iframe from an associated cross-origin Web site and query its offline data via Thank you for all your work on this and for publishing your findings. I hope my comments are constructive or that you can explain where I have misunderstood. |
I would like to use Offscreen Document to create thumbnails of pages. (in my speed dial) To make this work I need |
I strongly suggest |
I too wonder about an update on the timeline. There is a new commit that was sent for review yesterday, following the commit in July: https://chromium-review.googlesource.com/c/chromium/src/+/4000752 |
Requires new Can someone on the Google side please explain what is wrong with Limited Event Pages for MV3 #134? |
For the past several years the Chromium team has effectively ignored such appeals to back up their claims about [negligible] problems with background pages or event pages and their claims about [non-existent] improvements from using service workers. We also see no improvement in the simplicity of Chromium's source code. On the contrary, it will become more and more complicated as it deviates from the standard service worker specification for the Web platform, and it has to deviate because 99% of extensions don't need service worker technology per se (they just need an invisible JS context), which means that all the limitations and caveats of service workers need to be patched/disabled for extensions. We also see thousands of users still reporting that service worker registration is unreliable and can break randomly on update. It's time someone who can admit the reality steps up and stops this service worker + offscreen farce. |
It also speaks volumes that we have 26 days left to migrate our extensions to Manifest V3, otherwise we will lose our Featured badges, and a significant source of traffic with them, while a good portion of web APIs have simply been made unavailable to extensions. I'm not sure how much motivation it will be for anyone to watch years of their work get destroyed while they eagerly wait for some crumbs to be dropped in the form of the Offscreen Documents API sometime next year. |
Very well said! |
Switched this to
implemented: chrome
|
I tried this API yesterday and wrote up two pieces of feedback at: |
I tried Offscreen Documents for calling Had to do this workaround: https://github.com/wireworks-app/chrome-screen-recording |
I'm very confused about why this got built for Web Extensions specifically. Offscreen DOM has been asked for time and time again (notably captured in w3c/ServiceWorker#846, which links plenty of other things). It feels like rather than obeying the Extensible Web Manifesto and building general low-level capabilities everyone can use, something boutique got built just for Web Extensions. How, when, and where does the rest of the web start to also benefit from this accidental invention on the side of Web Extensions? Why was this the right arena for this work? To me, the name illustrates the weirdness here. This work would be invaluable as "Offscreen Documents". Making it "Offscreen Documents for Manifest V3" is a qualifier that makes it niche and special-purpose. The overspecialization makes it feel like this work won't parlay appropriately into the greater web project. I'd love to know, for example, what the WinterCG people think might work or might be a mismatch in the ideas here. |
Just to close the loop, starting in Chrome 116 we support obtaining a stream ID in a service worker (using the chrome.tabCapture API). This can then be passed to an offscreen document :) There's some more information here: https://developer.chrome.com/docs/extensions/mv3/screen_capture/#audio-and-video-offscreen-doc
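A minimal sketch of the hand-off described above, assuming the service worker obtains the ID via `chrome.tabCapture.getMediaStreamId` and forwards it with a runtime message. Both calls are injected through a hypothetical `CaptureDeps` interface so the logic can run outside a browser; the message shape and the function names `requestTabRecording` and `buildTabCaptureConstraints` are illustrative, not taken from the thread.

```typescript
// Message the service worker sends to the offscreen document (hypothetical shape).
interface StartRecordingMessage {
  target: "offscreen";
  type: "start-recording";
  streamId: string;
}

// The two extension calls the service worker needs, injected for testability.
// In Chrome these would wrap chrome.tabCapture.getMediaStreamId and
// chrome.runtime.sendMessage (assumption: promise-returning variants).
interface CaptureDeps {
  getMediaStreamId(options: { targetTabId: number }): Promise<string>;
  sendMessage(message: StartRecordingMessage): Promise<unknown>;
}

// Service worker side: obtain a stream ID for the tab and pass it on.
async function requestTabRecording(
  deps: CaptureDeps,
  tabId: number
): Promise<StartRecordingMessage> {
  const streamId = await deps.getMediaStreamId({ targetTabId: tabId });
  const message: StartRecordingMessage = {
    target: "offscreen",
    type: "start-recording",
    streamId,
  };
  await deps.sendMessage(message);
  return message;
}

// Chrome-specific constraint shape for consuming a tab-capture stream ID.
interface TabSourceConstraint {
  mandatory: { chromeMediaSource: "tab"; chromeMediaSourceId: string };
}

// Offscreen document side: build the constraints that would be passed to
// navigator.mediaDevices.getUserMedia() for the received stream ID.
function buildTabCaptureConstraints(
  streamId: string
): { audio: TabSourceConstraint; video: TabSourceConstraint } {
  return {
    audio: { mandatory: { chromeMediaSource: "tab", chromeMediaSourceId: streamId } },
    video: { mandatory: { chromeMediaSource: "tab", chromeMediaSourceId: streamId } },
  };
}
```

The offscreen document would call `buildTabCaptureConstraints(message.streamId)` on receipt and feed the result to `getUserMedia`; the linked Chrome documentation is the authoritative reference for the exact flow.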
I definitely appreciate the concern. As mentioned in the Summary section, the long term goal is that this functionality should be supported in service workers. Offscreen documents are intended as a temporary migration measure and hopefully over time they will be needed less and less. While the proposal was written before I joined Google, I can definitely see why the team might have chosen to build this as an extension specific API - building something for the web would need a lot of thought due to the long term backwards-compatibility implications. On the other hand, while extension APIs still need thought, it's a place where we can definitely move more quickly which was important in unblocking use cases in the migration to Manifest V3. If we realise this concept is useful beyond migration or learn interesting things we can absolutely feed those back to the web. |
First off, thanks to the chrome team for supporting passing a I am working on an extension that needs to support full screen recording (not just on a per tab basis) and was wondering if similar support for Both functions return a Are there any blocking privacy concerns that would prevent this from ever being supported? Or is this something the team would consider / is already on the roadmap? |
@anthonylebrun, glad you're excited about the tabCapture change and appreciate the feedback on desktopCapture! Out of interest, is there anything specific desktopCapture gives you that getDisplayMedia doesn't? It would be nice to lean on the web APIs if possible. |
@oliverdunk desktopCapture supports cancelChooseDesktopMedia, which allows the extension to determine if the "Choose what to share" screen is open, and close it to avoid multiple simultaneous recordings. Our extension is designed to only capture one source at a time, and although with getDisplayMedia it is possible to detect if the window is open, it's not possible to close it, requiring an additional UI to instruct the user to find the pre-existing "Choose what to share" screen.
Are there any benefits to using getDisplayMedia? chooseDesktopMedia still leverages getUserMedia for the capture of the screen. |
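The one-picker-at-a-time behaviour described in this comment could look roughly like the sketch below. It assumes the `chrome.desktopCapture` semantics where `chooseDesktopMedia` returns a request ID that `cancelChooseDesktopMedia` accepts, and where an empty stream ID means the user dismissed the picker. `DesktopCaptureApi` is a hypothetical, trimmed-down interface injected for illustration, and `createSinglePicker` is not the commenter's actual code.

```typescript
// Subset of chrome.desktopCapture used here, injected so the sketch runs
// outside a browser. The real chooseDesktopMedia also accepts an optional
// target tab and passes extra options to its callback.
interface DesktopCaptureApi {
  chooseDesktopMedia(sources: string[], callback: (streamId: string) => void): number;
  cancelChooseDesktopMedia(requestId: number): void;
}

type OnChosen = (streamId: string) => void;

// Returns a function that opens the picker, first closing any picker that is
// still open from an earlier call.
function createSinglePicker(
  api: DesktopCaptureApi,
  sources: string[]
): (onChosen: OnChosen) => void {
  // Request ID of the picker that is currently open, if any.
  let pendingRequestId: number | null = null;

  return function choose(onChosen: OnChosen): void {
    if (pendingRequestId !== null) {
      api.cancelChooseDesktopMedia(pendingRequestId);
      pendingRequestId = null;
    }
    let requestId = -1; // placeholder until the API returns the real ID
    requestId = api.chooseDesktopMedia(sources, (streamId: string) => {
      // Only clear the pending marker if it still refers to this picker.
      if (pendingRequestId === requestId) {
        pendingRequestId = null;
      }
      // An empty ID is treated as "user cancelled" and is not forwarded.
      if (streamId !== "") {
        onChosen(streamId);
      }
    });
    pendingRequestId = requestId;
  };
}
```

With `getDisplayMedia` there is no equivalent of the cancel step, which is the gap the comment points out.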
Hi @oliverdunk and everyone, I have a use case where the extension opens a pop-up that lists a few different websites. When the user clicks one of the options, we open a new tab and record that new tab. However, it seems like I have tried a few ways to retrieve the tab id Method 1: Get the tabId from
Method 2: Query
|
Thanks, that's good context!
The benefit for this sort of thing is reducing the amount of code in Chromium + making it easier for web developers to learn extensions. Generally the fewer APIs someone needs to be aware of the better.
I believe you can do it for any tab, but the extension must have been invoked on that tab (so a page opened by a popup would not work). You could use the normal getDisplayMedia API for this, or open a feature request to have a way of opening a new tab and immediately starting recording. To be clear, I don't think this is a new limitation in MV3. For replies to any of the above, let's move to Chrome bugs or https://groups.google.com/a/chromium.org/g/chromium-extensions to avoid adding noise to the WECG :) |
A use case for this PR turns out to be… There are two issues because this API is not available in workers and not available at all in Firefox's and Safari's background pages.
There isn't a If this API becomes cross-browser and this usage becomes accepted, I think we wouldn't necessarily need a dedicated |
Can we get the media streams through the following process :
Tried the above way but ended with the following errors :
|
This is possible for tabs using the It is not currently possible for I'd suggest the chromium-extensions mailing list if you want to chat more just so we keep this thread focused on the proposal :) |
Thanks @oliverdunk, Basically this is my use case :
Following the above while using the
I have tried removing and attaching the |
Hi @Nitesh2503, as mentioned could you move to the mailing list? Happy to help but I'd like to avoid adding too much noise here :) |
Today, Manifest V3 Chrome extensions cannot leverage DOM-related features and APIs in a background context due to service workers not having access to the DOM. This impacts capabilities like headless audio playback, background document scraping, and stream processing. To continue supporting these features, the Chrome team is proposing adding Offscreen Documents support to the Manifest V3 platform.
Allowing Extensions to Open Offscreen Documents in Manifest V3
The full proposal can be found above, but in short offscreen documents provide a temporary, headless page environment that allows an extension to leverage DOM capabilities in the background. Since the contexts are specifically designed for DOM API usage, they will not have access to powerful extension APIs. Offscreen document lifetimes are not tied to the context that spawned them, meaning that an offscreen document may outlive the service worker that created it. As an ephemeral context, offscreen documents will be terminated if they are no longer doing work; extension authors should anticipate and prepare to recover from such scenarios.
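As a rough sketch of the recovery advice above: the snippet assumes an API shaped like the `chrome.offscreen.createDocument({ url, reasons, justification })` call that Chrome later shipped, injected through a hypothetical `OffscreenApi` interface so the logic can run outside a browser. `hasDocument` stands in for whatever existence check the browser offers, and `createOffscreenGuard` is an illustrative name, not part of the proposal.

```typescript
// Parameters modeled on Chrome's shipped chrome.offscreen.createDocument.
interface OffscreenCreateParams {
  url: string;
  reasons: string[];
  justification: string;
}

// Hypothetical, minimal view of an offscreen-document API. hasDocument is a
// placeholder for whatever existence check is available in the browser.
interface OffscreenApi {
  hasDocument(): Promise<boolean>;
  createDocument(params: OffscreenCreateParams): Promise<void>;
}

// Returns a function that (re)creates the offscreen document on demand.
// Callers invoke it before every use, because the browser may have
// terminated the document since the last call.
function createOffscreenGuard(
  api: OffscreenApi,
  params: OffscreenCreateParams
): () => Promise<void> {
  // In-flight creation, shared so concurrent callers create only one document.
  let pending: Promise<void> | null = null;

  return async function ensureOffscreenDocument(): Promise<void> {
    if (await api.hasDocument()) {
      return;
    }
    if (pending === null) {
      pending = api.createDocument(params);
    }
    try {
      await pending;
    } finally {
      // Clear the marker so a later termination triggers a fresh creation.
      pending = null;
    }
  };
}
```

In a service worker, the returned function would be awaited before each message sent to the offscreen document, so a document that was terminated in the meantime is transparently recreated.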
We are particularly interested in discussing and iterating on the API surface for this capability and identifying reasons extensions may create an offscreen document.
EDIT 2022-03-10: Added a description. Previously we only had a link to the proposal doc.