Render first page before download is complete (linear / web optimized) #193

Open
VladislavSavvateev opened this issue Jul 16, 2020 · 27 comments · Fixed by #195

Comments

@VladislavSavvateev

Without the Linear PDF feature, most PDFs (especially those with pictures) load very slowly. According to the console output, the pdf.js version bundled with this plug-in is 2.1.266, but the current version of pdf.js is 2.5.179, which supports linearized PDFs.

Steps to reproduce

  1. Load any Linearized PDF to storage (or use my PDF)
  2. Check how slowly it loads
  3. Upload to any host and check how fast your browser can load the first page, while loading other pages

Expected behaviour

Linear PDF support in PDF viewer.

Actual behaviour

No Linear PDF support, and slow loading.

Server configuration

Operating system: Linux 5.4.0-26-generic #30-Ubuntu SMP Mon Apr 20 16:58:30 UTC 2020 x86_64

Web server: nginx/1.17.10 (fpm-fcgi)

Database: mysql 8.0.20

PHP version: 7.4.7
Modules loaded: Core, date, libxml, openssl, pcre, zlib, filter, hash, Reflection, SPL, session, standard, sodium, cgi-fcgi, mysqlnd, PDO, xml, calendar, ctype, curl, dom, mbstring, FFI, fileinfo, ftp, gd, gettext, gmp, iconv, igbinary, imagick, intl, json, exif, msgpack, mysqli, pdo_mysql, Phar, posix, readline, shmop, SimpleXML, sockets, sysvmsg, sysvsem, sysvshm, tokenizer, xmlreader, xmlwriter, xsl, zip, memcached, Zend OPcache

Nextcloud version: 18.0.7 - 18.0.7.1

Where did you install Nextcloud from: from official website

List of activated apps:

Enabled:

  • accessibility: 1.4.0
  • activity: 2.11.0
  • bruteforcesettings: 1.6.0
  • cloud_federation_api: 1.1.0
  • comments: 1.8.0
  • dav: 1.14.0
  • federatedfilesharing: 1.8.0
  • federation: 1.8.0
  • files: 1.13.1
  • files_pdfviewer: 1.7.0
  • files_rightclick: 0.15.2
  • files_sharing: 1.10.1
  • files_trashbin: 1.8.0
  • files_versions: 1.11.0
  • files_videoplayer: 1.7.0
  • firstrunwizard: 2.7.0
  • issuetemplate: 0.6.0
  • logreader: 2.3.0
  • lookup_server_connector: 1.6.0
  • nextcloud_announcements: 1.7.0
  • notifications: 2.6.0
  • oauth2: 1.6.0
  • password_policy: 1.8.0
  • photos: 1.0.0
  • privacy: 1.2.0
  • provisioning_api: 1.8.0
  • recommendations: 0.6.0
  • serverinfo: 1.8.0
  • settings: 1.0.0
  • sharebymail: 1.8.0
  • support: 1.1.1
  • survey_client: 1.6.0
  • systemtags: 1.8.0
  • text: 2.0.0
  • theming: 1.9.0
  • twofactor_backupcodes: 1.7.0
  • updatenotification: 1.8.0
  • viewer: 1.2.0
  • workflowengine: 2.0.0

Disabled:

  • admin_audit
  • encryption
  • files_external
  • user_ldap

Nextcloud configuration:

{
    "instanceid": "REMOVED SENSITIVE VALUE",
    "passwordsalt": "REMOVED SENSITIVE VALUE",
    "secret": "REMOVED SENSITIVE VALUE",
    "trusted_domains": [
        "cloud.sonic.fan"
    ],
    "datadirectory": "REMOVED SENSITIVE VALUE",
    "dbtype": "mysql",
    "version": "18.0.7.1",
    "overwrite.cli.url": "https://cloud.sonic.fan",
    "dbname": "REMOVED SENSITIVE VALUE",
    "dbhost": "REMOVED SENSITIVE VALUE",
    "dbport": "",
    "dbtableprefix": "oc_",
    "mysql.utf8mb4": true,
    "dbuser": "REMOVED SENSITIVE VALUE",
    "dbpassword": "REMOVED SENSITIVE VALUE",
    "installed": true,
    "memcache.local": "\\OC\\Memcache\\Memcached",
    "memcache.distributed": "\\OC\\Memcache\\Memcached",
    "memcached_servers": [
        [
            "localhost",
            11211
        ]
    ],
    "theme": "",
    "loglevel": 2,
    "maintenance": false,
    "updater.secret": "REMOVED SENSITIVE VALUE",
    "updater.release.channel": "stable"
}

Client configuration

Browser: Mozilla/5.0 (Windows NT 10.0; Win64; x64; rv:78.0) Gecko/20100101 Firefox/78.0

Operating system: Windows 10 Pro x64

Logs

Nextcloud log (data/owncloud.log)

There is no log for this.

Browser log

Version string from console: "(PDF.js: 2.1.266)"

@elpraga

elpraga commented Aug 9, 2020

That would be a great feature to be implemented!

Is there a chance of it happening?

@elpraga

elpraga commented Aug 11, 2020

I've just tried to replace the PDF.js in the files_pdfviewer with the 2.5 version. PDFs can still be opened, but even linearized PDFs are first downloaded and then displayed.

I've placed the same PDF online and opened it from a browser (served by apache2).

Both Chrome and Firefox started displaying the file before fully downloaded.

Why is this happening? Can it be fixed?

Being able to show linearized PDFs correctly would be a great improvement.

@elpraga

elpraga commented Aug 11, 2020

@skjnldsv skjnldsv mentioned this issue Aug 13, 2020
9 tasks
@elpraga

elpraga commented Aug 14, 2020

Thank you for the quick fix @skjnldsv !

Is there a way to try the new feature out in the current 19.0.1?
When cloning the master branch, the min server version is set to 20.

@elpraga

elpraga commented Aug 14, 2020

(I did try to simply set the min server version to 19, but after that I was unable to open the server at all)

@skjnldsv
Member

Yep, this is for 20 only

@elpraga

elpraga commented Oct 9, 2020

I've just updated to NC 20.0.0 and linear PDF files are still not displayed correctly. Users still need to wait for the whole file to be downloaded in order to see any part of the document.

Could this issue be reopened?

@skjnldsv
Member

skjnldsv commented Oct 9, 2020

@elpraga is it working on the official pdf.js example page?
Current pdf.js version in this app is 2.4.456

const PDFJSversion = '2.4.456'

@elpraga

elpraga commented Oct 10, 2020

If you mean whether it's working on https://mozilla.github.io/pdf.js/web/viewer.html, it's hard to tell: it renders so fast that I cannot really see the difference.

I can confirm, though, that the desktop versions of both Firefox and Google Chrome start rendering and displaying my large sample linearized PDF file before it has finished downloading whenever the file is served by Apache and accessed by a direct link. By contrast, NC's implementation starts displaying the content only once it has finished downloading.

This leads me to believe that pdf.js does have the capability to display progressive pdf files correctly.

I'm not sure if you had a different pdf.js example page in mind though.

@elpraga

elpraga commented Oct 10, 2020

Could it have something to do with viewer.js? I'm sorry, I'm an English teacher, not a programmer, so it is hard for me to investigate the issue. I was trying to find out which pdf.js version Firefox is using right now, and I noticed a reference to viewer.js.

I was also trying to find out whether pdf.js 2.4.456 can display linearized PDFs, and it seems to me that the feature should already be present in this version.

@skjnldsv
Member

Maybe I misunderstand then :)
What is a linear pdf? For me the display on https://mozilla.github.io/pdf.js/web/viewer.html is the same as on my Nextcloud instance 🤔

@elpraga

elpraga commented Oct 10, 2020

Check Document Properties in the previous link, where it says "Fast Web View: No".

Linearized PDF is a pdf prepared in such a way that it can start displaying before it is fully loaded. It is extremely useful for large files.

Try opening
https://tomaskaluza.net/notes/ttt.pdf
Then you can upload it to any Nextcloud server and compare the experience
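
For anyone who wants to check a file without opening Document Properties, here is a minimal sketch of a "Fast Web View" test. It assumes that a linearized PDF carries a `/Linearized` entry in its first object, within the first 1024 bytes of the file, as the PDF specification describes. The function name `looksLinearized` is made up for this illustration and is not part of pdf.js or Nextcloud. It is only a heuristic: it does not verify that the rest of the file is actually laid out in linearized order.

```javascript
// Heuristic check for a linearized ("Fast Web View") PDF.
// Assumption: `bytes` is a Uint8Array holding at least the start of the file.
function looksLinearized(bytes) {
  // Only the beginning of the file matters for this check.
  const head = bytes.subarray(0, 1024);

  // Decode byte-for-byte; PDF syntax keywords are plain ASCII.
  let text = "";
  for (const b of head) {
    text += String.fromCharCode(b);
  }

  // A PDF header plus a "/Linearized <number>" entry near the top.
  return text.includes("%PDF-") && /\/Linearized\s+\d/.test(text);
}
```

Passing the first kilobyte of a file (for example, read from a file input in the browser) is enough; the rest of the document is never inspected.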

@elpraga

elpraga commented Oct 10, 2020

The PDF Mozilla uses in its example is not optimized, but pdf.js is capable of handling optimized PDFs properly.

@elpraga

elpraga commented Oct 10, 2020

"Optimized" is sometimes used to mean the same thing as "linearized".

@skjnldsv
Member

Thanks for the clarifications!

@elpraga

elpraga commented Nov 22, 2020

Is there any update on this feature? I've been able to confirm that it is not implemented in NC 20.0.2

@skjnldsv
Member

@elpraga the issue is still open.
The latest pdf.js releases have breaking changes in the way they are compiled, and these require changes on our side to make them work here. The task is not easy.

@elpraga

elpraga commented Nov 25, 2020

Thank you for the update @skjnldsv !

@skjnldsv
Member

updated

@elpraga

elpraga commented Mar 30, 2021

I'm sorry, but the PDF viewer in NC21 still downloads the whole PDF (even web-optimised, i.e. linear, ones) before displaying the first page.

Why was this issue closed if the update has not fixed it?

@elpraga

elpraga commented Mar 30, 2021

If I am not missing anything important, @skjnldsv, could you reopen this issue until it is actually solved?

@skjnldsv
Member

I upgraded the pdf library to the version they said supported the linear feature! 🤷

@skjnldsv skjnldsv reopened this Mar 30, 2021
@elpraga

elpraga commented Mar 30, 2021

OK, I see. Is there a way to investigate what is happening and why it isn't working?

@elpraga

elpraga commented Mar 30, 2021

This is one of the files I've been testing the feature on here

Currently, it displays the front page almost immediately in Firefox, but not in Google Chrome (I guess this is due to its new PDF engine; it used to work before. I'll try to investigate).

@elpraga

elpraga commented Mar 30, 2021

I have been unable to make the linear PDF view feature work in current Google Chrome (version 89.0.4389.90 on Ubuntu) using its built-in PDF viewer, but I am confident it was working in the past.

As I mentioned earlier, it works in Firefox 87.0 (PDF.js: 2.8.117) on Ubuntu 20.04 64-bit (version numbers included). It has been working in Firefox for about a year or so, at least as far as I can remember.

@mattewan

mattewan commented Mar 31, 2021

The real issue here is unlikely to be with pdf.js, but more to do with nextcloud.

Due to authentication, I assume that nextcloud internally opens the PDF using PHP and serves the data out to pdf.js rather than giving pdf.js direct access to the PDF. It would be done this way to validate that the user trying to access the file has all the relevant permissions etc.

Problem is, without direct access to the file, pdf.js can't ask for byte ranges within the file, which means that even though the PDF may support fast web view, the technology serving it (nextcloud via php) doesn't.
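
One way to test this assumption is to send the file URL a request with a `Range: bytes=0-0` header and look at what comes back. The sketch below only interprets such a response. The function name, the result labels, and the lower-cased header object are all made up for this illustration, and nothing here is taken from Nextcloud or pdf.js.

```javascript
// Interpret the response to a probe request sent with "Range: bytes=0-0".
// Assumptions of this sketch: `status` is the HTTP status code and `headers`
// is a plain object whose keys are lower-cased header names.
function interpretRangeProbe(status, headers) {
  // 206 Partial Content with a Content-Range header: the range was served.
  if (status === 206 && typeof headers["content-range"] === "string") {
    return "ranges-honoured";
  }

  // 200 with "Accept-Ranges: bytes": support is advertised, but this
  // particular request was still answered with the full body.
  const accept = (headers["accept-ranges"] || "").toLowerCase();
  if (status === 200 && accept === "bytes") {
    return "advertised-but-ignored";
  }

  // Anything else: treat the server as not supporting byte ranges.
  return "no-range-support";
}
```

If the direct Apache link gave "ranges-honoured" and the Nextcloud download URL did not, that would support the explanation above.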

This could be fixed, but the fix needs to be applied to nextcloud, not to pdf.js.

You can use PHP to get the requested byte range from the HTTP headers, open the file then serve only the requested byte range from the file. For anyone who wants to investigate this, google "byte serving pdf's with php"
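
The byte-serving logic itself is small. Below is a minimal sketch of it, written in JavaScript rather than PHP purely for illustration. The names `parseRange` and `serveBytes` are hypothetical and the file is assumed to be already in memory as a `Uint8Array`. It handles a single range only, and it simplifies one point: a real server would normally answer an unsatisfiable range with 416, whereas this sketch falls back to sending the whole file.

```javascript
// Parse a single-range header such as "bytes=0-1023", "bytes=500-" or
// "bytes=-200" against a file of `size` bytes. Returns { start, end }
// (both inclusive), or null when the header is absent, malformed or
// cannot be satisfied.
function parseRange(header, size) {
  const m = /^bytes=(\d*)-(\d*)$/.exec(header || "");
  if (!m || (m[1] === "" && m[2] === "")) return null;

  let start;
  let end;
  if (m[1] === "") {
    // Suffix form: the last N bytes of the file.
    const n = Number(m[2]);
    if (n === 0) return null;
    start = Math.max(size - n, 0);
    end = size - 1;
  } else {
    start = Number(m[1]);
    // An open end ("500-") or an end past EOF is clamped to the last byte.
    end = m[2] === "" ? size - 1 : Math.min(Number(m[2]), size - 1);
  }

  if (start > end || start >= size) return null;
  return { start, end };
}

// Build the response for one request: 206 with just the requested slice when
// a valid range is given, otherwise 200 with the whole file.
function serveBytes(file, rangeHeader) {
  const range = parseRange(rangeHeader, file.length);
  if (!range) {
    return {
      status: 200,
      headers: { "Accept-Ranges": "bytes", "Content-Length": String(file.length) },
      body: file,
    };
  }

  const body = file.subarray(range.start, range.end + 1);
  return {
    status: 206,
    headers: {
      "Accept-Ranges": "bytes",
      "Content-Range": `bytes ${range.start}-${range.end}/${file.length}`,
      "Content-Length": String(body.length),
    },
    body,
  };
}
```

For example, `serveBytes(file, "bytes=0-1023")` returns only the first kilobyte together with a `Content-Range` header. That partial response is what lets a viewer fetch the first page's data without waiting for the rest of the document.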

@elpraga

elpraga commented Apr 3, 2021

Wow! Thank you @mattewan for the input! It is way beyond my understanding and capabilities, but thank you!

@joshtrichards joshtrichards added the dependencies Pull requests that update a dependency file label Oct 14, 2024
@joshtrichards joshtrichards removed the dependencies Pull requests that update a dependency file label Feb 19, 2025
@joshtrichards joshtrichards changed the title Update pdf.js for Linear PDF feature Render first page before download is complete Feb 19, 2025
@joshtrichards joshtrichards changed the title Render first page before download is complete Render first page before download is complete (linear / web optimized) Feb 19, 2025