Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds fetch stream logic for networking part of PDF.js #8768

Merged
merged 1 commit into from
Aug 30, 2017

Conversation

mukulmishra18
Copy link
Contributor

@mukulmishra18 mukulmishra18 commented Aug 10, 2017

This PR adds Fetch support in networking part of PDF.js project. This PR implements IPDFStream interfaces using Fetch API to stream PDF data. Using Fetch can give us streamable response, that streams PDF data in small chunks.

P.S. fixes #6126

@timvandermeij
Copy link
Contributor

Does this fix #6126?

@mukulmishra18
Copy link
Contributor Author

Does this fix #6126?

Yes, this patch is intended to add support for fetch stream. Currently it is using fetch if browser has native implementation, although I am thinking if we can add github polyfill for fetch, if browser doesn't have support.

While testing this patch I get to know that we don't have any property named body in Response Object in firefox(but defined in chrome), so 8370f1d#diff-ffa21364c93bc3328ab1eb76226f8a71R84 might throw error. I think we have to detect it and use response.arrayBuffer()(?) instead.

@mukulmishra18 mukulmishra18 force-pushed the fetch_stream branch 5 times, most recently from 688ecd7 to 63b4286 Compare August 16, 2017 18:20
@mukulmishra18
Copy link
Contributor Author

mukulmishra18 commented Aug 16, 2017

While testing this PR locally, I find out that this._headers.get(name) is not returning value for Accept-Ranges http response header from 63b4286#diff-ffa21364c93bc3328ab1eb76226f8a71R87, although it is present in the response header(confirmed from the network tab in web console).

Due to this, test for range requests are failing. I am getting Error: 405(Method not allowed) in rangeReader.

} else if (typeof Response !== 'undefined' && 'body' in Response.prototype) {
var PDFFetchStream = require('./display/fetch_stream.js').PDFFetchStream;
pdfjsDisplayAPI.setPDFNetworkStreamClass(PDFFetchStream);
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Rob--W do you know if we can enable only fetch_stream.js for Chrome extension?

Copy link
Member

@Rob--W Rob--W Aug 26, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean?

  1. Whether fetch_stream is disabled for every target except for the Chrome extension build, or
  2. Whether fetch_stream can be unconditionally be used in the Chrome extension, and display/network be removed?

I can't answer 1, but if you meant 2:

Based on 1.4M unique data points (i.e. pings where the client cannot be distinguished from another client in the same data set; this already happens when the user upgrades their browser version), the usage statistics are as follows:

Chrome 62 0.91%
Chrome 61 0.87%
Chrome 60 56.85% <-- Current stable (officially released July 2017, 25th)
Chrome 59 32.11%
Chrome 58 2.08%
Chrome 57 1.17%
Chrome 56 0.77%
Chrome 55 0.97%
Chrome 54 0.20%
Chrome 53 0.34%
Chrome 52 0.60%
Chrome 51 0.33%
Chrome 50 0.19%
Chrome 49 0.91% <-- last supported Chrome for Windows XP
Chrome 48 0.07%
Chrome 47 0.17%
Chrome 46 0.08%
Chrome 45 0.07%
Chrome 44 0.04%
Chrome 43 0.13% <-- First version where streams API is supported
Chrome 42 0.04%
Chrome 41 0.06%
Chrome 40 0.05%
Chrome 39 0.15%
Chrome 38 0.01%
Chrome 37 0.02%
Chrome 36 0.01%
Chrome 35 0.38% <-- mostly 35.0.1878.0, win7. Doesn't look like an official build.
Chrome 34 0.37% <-- Ubuntu https://apps.ubuntu.com/cat/applications/chromium-browser/
Chrome 33 0.02%
Chrome 32 0.01%
Chrome 31 0.01%
Chrome 100 0.01% <-- I don't know what this is...

1.13% of the users on the latest version of the extension (16k pings) use a Chrome version that does not support the streams API (Chrome 42 and earlier). For now, I'd like to keep supporting these users.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for response and analysis. I was asking about (2) -- fetch_stream can be unconditionally be used in the Chrome extension. You answered my question, so we will handle Chrome extension no different than generic viewer.

@yurydelendik
Copy link
Contributor

@mukulmishra18 Can you rebase the commit to the current master?

@mukulmishra18 mukulmishra18 force-pushed the fetch_stream branch 2 times, most recently from d242f17 to 4451a97 Compare August 28, 2017 18:19
@yurydelendik
Copy link
Contributor

We are trying to "serialize" the reason we are not suppose too. Add makeReasonSerializable:

diff --git a/src/shared/util.js b/src/shared/util.js
index 9bd0ff1b..15b3f8ff 100644
--- a/src/shared/util.js
+++ b/src/shared/util.js
@@ -1247,6 +1247,17 @@ function wrapReason(reason) {
   }
 }
 
+function makeReasonSerializable(reason) {
+  if (!(reason instanceof Error) ||
+      reason instanceof AbortException ||
+      reason instanceof MissingPDFException ||
+      reason instanceof UnexpectedResponseException ||
+      reason instanceof UnknownErrorException) {
+    return reason;
+  }
+  return new UnknownErrorException(reason.message, reason.toString());
+}
+
 function resolveOrReject(capability, success, reason) {
   if (success) {
     capability.resolve();
@@ -1307,16 +1318,12 @@ function MessageHandler(sourceName, targetName, comObj) {
             data: result,
           });
         }, (reason) => {
-          if (reason instanceof Error) {
-            // Serialize error to avoid "DataCloneError"
-            reason = reason + '';
-          }
           comObj.postMessage({
             sourceName,
             targetName,
             isReply: true,
             callbackId: data.callbackId,
-            error: reason,
+            error: makeReasonSerializable(reason),
           });
         });
       } else if (data.streamId) {

redirect: 'follow',
}).then((response) => {
if (response.status !== 200) {
throw validateResponseStatus(response.status, url);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Change name to createResponseStatusError.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks.

credentials: this._withCredentials ? 'omit' : 'include',
redirect: 'follow',
}).then((response) => {
if (response.status !== 200) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you need to validate status based on protocol, for http it is 200 for other protocols it's 0. I recommend to add/have/change validateResponseStatus(status, isHttp) so it will return boolean that will tell is status is valid. Use this function in the range reader as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done with new commit. Thanks.

@yurydelendik
Copy link
Contributor

/botio test

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @yurydelendik received. Current queue size: 0

Live output at: http://54.215.176.217:8877/b5dc6fa707adeb3/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_test from @yurydelendik received. Current queue size: 0

Live output at: http://54.67.70.0:8877/d2c468ba2000d41/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/d2c468ba2000d41/output.txt

Total script time: 16.50 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot
Copy link

From: Bot.io (Windows)


Failed

Full output at http://54.215.176.217:8877/b5dc6fa707adeb3/output.txt

Total script time: 29.51 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://54.215.176.217:8877/b5dc6fa707adeb3/reftest-analyzer.html#web=eq.log

return false;
}
return true;
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if (!isHttp) {
  return status === 0;
}
return status === 200 || status === 206;

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done. Thanks.

fetch(url, {
method: 'GET',
headers: this._headers,
}).then((response) => {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

define the same options as in full reader. extract the createFetchOptions(headers) method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

created createFetchOptions with new commit. Thanks.

Copy link
Contributor

@yurydelendik yurydelendik left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, with the validate and options changes.

@yurydelendik
Copy link
Contributor

/botio test

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Received

Command cmd_test from @yurydelendik received. Current queue size: 0

Live output at: http://54.67.70.0:8877/9cc359156484d66/output.txt

@pdfjsbot
Copy link

From: Bot.io (Windows)


Received

Command cmd_test from @yurydelendik received. Current queue size: 0

Live output at: http://54.215.176.217:8877/cad322ff23abd42/output.txt

@pdfjsbot
Copy link

From: Bot.io (Linux m4)


Success

Full output at http://54.67.70.0:8877/9cc359156484d66/output.txt

Total script time: 16.50 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: Passed

@pdfjsbot
Copy link

From: Bot.io (Windows)


Failed

Full output at http://54.215.176.217:8877/cad322ff23abd42/output.txt

Total script time: 29.23 mins

  • Font tests: Passed
  • Unit tests: Passed
  • Regression tests: FAILED

Image differences available at: http://54.215.176.217:8877/cad322ff23abd42/reftest-analyzer.html#web=eq.log

@yurydelendik
Copy link
Contributor

@yurydelendik yurydelendik merged commit bad3203 into mozilla:master Aug 30, 2017
@yurydelendik
Copy link
Contributor

Thank you for the patch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Implement Fetch API for streaming PDF loads on Chrome
5 participants