Failed to run pdf2json: ENOENT no such file or directory #349

Closed
styfle opened this issue Apr 8, 2019 · 10 comments

@styfle
Member

styfle commented Apr 8, 2019

Steps to reproduce

git clone https://github.com/friedhelmensch/node_now_pdf2json_issue
cd node_now_pdf2json_issue
ncc build index.js
node dist

Error

Error: ENOENT: no such file or directory, open '/Users/styfle/Desktop/foobar/node_now_pdf2json_issue/dist/baseshared/util.js'
    at Object.openSync (fs.js:438:3)
    at Object.readFileSync (fs.js:343:35)
    at module.exports.179._pdfjsFiles.forEach (/Users/styfle/Desktop/foobar/node_now_pdf2json_issue/dist/index.js:1756:66)
    at Array.forEach (<anonymous>)
    at Object.179 (/Users/styfle/Desktop/foobar/node_now_pdf2json_issue/dist/index.js:1756:13)
    at __webpack_require__ (/Users/styfle/Desktop/foobar/node_now_pdf2json_issue/dist/index.js:22:30)
    at Object.932 (/Users/styfle/Desktop/foobar/node_now_pdf2json_issue/dist/index.js:26354:10)
    at __webpack_require__ (/Users/styfle/Desktop/foobar/node_now_pdf2json_issue/dist/index.js:22:30)
    at Object.318 (/Users/styfle/Desktop/foobar/node_now_pdf2json_issue/dist/index.js:7517:19)
    at __webpack_require__ (/Users/styfle/Desktop/foobar/node_now_pdf2json_issue/dist/index.js:22:30)
styfle added the "package issue" and "priority" labels on Apr 8, 2019
@guybedford
Contributor

Here's the code path in play here:

const _pdfjsFiles = [
    'shared/util.js',
    'shared/colorspace.js',
    'shared/pattern.js',
    'shared/function.js',
    'shared/annotation.js',

    'core/core.js',
    'core/obj.js',
    'core/charsets.js',
    'core/crypto.js',
    'core/evaluator.js',
    'core/fonts.js',
    'core/font_renderer.js',
    'core/glyphlist.js',
    'core/image.js',
    'core/metrics.js',
    'core/parser.js',
    'core/stream.js',
    'core/worker.js',
    'core/jpx.js',
    'core/jbig2.js',
    'core/bidi.js',
    'core/jpg.js',
    'core/chunked_stream.js',
    'core/pdf_manager.js',
    'core/cmap.js',
    'core/cidmaps.js',

    'display/canvas.js',
    'display/font_loader.js',
    'display/metadata.js',
    'display/api.js'
];

_pdfjsFiles.forEach( (fieldName, idx, arr) => _fileContent += fs.readFileSync(_basePath + fieldName, 'utf8') );

eval(_fileContent)

From an analysis point of view, the main issue is that backtracking from a readFileSync call to piece together the whole path expression would be a new type of analysis.

Furthermore, the read is followed by an eval, so this may again be better handled by special-casing.
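To make the failure mode concrete, here is a minimal sketch of the same pattern on a throwaway directory. The directory layout, the `loadAll` helper, and the file contents are hypothetical stand-ins, not pdf2json's actual code. The point is only that the path is assembled at runtime from a base path plus a list entry, so nothing copies the files when the code is moved to an output directory.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical stand-in for the library layout: a base directory with one source file.
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'readfile-sketch-'));
const basePath = path.join(root, 'base') + path.sep;
fs.mkdirSync(path.join(basePath, 'shared'), { recursive: true });
fs.writeFileSync(path.join(basePath, 'shared', 'util.js'), 'var answer = 42;\n');

const files = ['shared/util.js'];

// Same shape as the quoted code: each path is only known once the loop runs.
function loadAll(base) {
  let content = '';
  files.forEach((fileName) => {
    content += fs.readFileSync(base + fileName, 'utf8');
  });
  return content;
}

// Reading from the original location works.
const loaded = loadAll(basePath);

// Reading from an output directory the files were never copied to fails.
const distPath = path.join(root, 'dist', 'base') + path.sep;
let errorCode = null;
try {
  loadAll(distPath);
} catch (err) {
  errorCode = err.code;
}

console.log(JSON.stringify(loaded), errorCode);
```

The sketch stops short of the eval step, since the missing files already cause the read to throw before anything is evaluated.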

@styfle
Member Author

styfle commented Apr 8, 2019

@guybedford

Would changing the source code solve the problem?

- _pdfjsFiles.forEach( (fieldName, idx, arr) => _fileContent += fs.readFileSync(_basePath + fieldName, 'utf8') );
+ _pdfjsFiles.forEach( (fieldName, idx, arr) => _fileContent += fs.readFileSync(__dirname + '/../base/' + fieldName, 'utf8') );

Or is the cause of the problem simply that they went up a directory (../)?

@guybedford
Contributor

The problem is more that we don't know to emit the assets in the list of _pdfjsFiles, or how to relocate their references. This is because of the level of abstraction between _basePath and _pdfjsFiles in the readFileSync expression.
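As a rough illustration of what "emitting the assets and relocating their references" could involve, the sketch below copies each listed file into an output directory and returns the new base path to read from. The `emitAssets` helper and the directory names are hypothetical and are not ncc's implementation. It assumes the base directory and file list are already known, which is exactly the part the analysis cannot currently recover.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: copy each listed file into outDir, preserving relative
// paths, and return the relocated base path the bundled code should read from.
function emitAssets(srcBase, outDir, files) {
  for (const rel of files) {
    const dest = path.join(outDir, rel);
    fs.mkdirSync(path.dirname(dest), { recursive: true });
    fs.copyFileSync(path.join(srcBase, rel), dest);
  }
  return outDir + path.sep;
}

// Demo on a throwaway directory (layout is an assumption for illustration).
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'emit-sketch-'));
const srcBase = path.join(root, 'node_modules', 'pkg', 'base');
fs.mkdirSync(path.join(srcBase, 'core'), { recursive: true });
fs.writeFileSync(path.join(srcBase, 'core', 'obj.js'), '// obj\n');

const relocatedBase = emitAssets(srcBase, path.join(root, 'dist', 'base'), ['core/obj.js']);

// The relocated base path plus the original relative name now resolves.
const emitted = fs.readFileSync(relocatedBase + 'core/obj.js', 'utf8');
console.log(JSON.stringify(emitted));
```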

@guybedford
Contributor

@styfle if we had something like your example, then yes, the wildcard emission proposal would handle this, even without the file list analysis.

@styfle
Member Author

styfle commented Apr 8, 2019

@guybedford If "wildcard emission proposal" is referring to #297 then I think we should prioritize #297 and then we could probably submit a PR to pdf2json with my patch above.

@guybedford
Contributor

Actually we know exactly what _basePath is in our analysis, so this should work with just #297 implemented.
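One way to picture the wildcard idea: if the base directory is known statically, every file beneath it can be treated as a candidate asset, with no need to recover the individual file list. The sketch below is only an assumption about the general approach, not the design in #297. The `listFilesRecursive` helper and the demo layout are hypothetical.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Hypothetical helper: return every file under dir as a sorted list of
// forward-slash relative paths, i.e. what a "base/**" wildcard would match.
function listFilesRecursive(dir, prefix = '') {
  const out = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const rel = prefix ? prefix + '/' + entry.name : entry.name;
    if (entry.isDirectory()) {
      out.push(...listFilesRecursive(path.join(dir, entry.name), rel));
    } else {
      out.push(rel);
    }
  }
  return out.sort();
}

// Demo on a throwaway directory (layout is an assumption for illustration).
const root = fs.mkdtempSync(path.join(os.tmpdir(), 'wildcard-sketch-'));
const base = path.join(root, 'base');
fs.mkdirSync(path.join(base, 'shared'), { recursive: true });
fs.mkdirSync(path.join(base, 'core'), { recursive: true });
fs.writeFileSync(path.join(base, 'shared', 'util.js'), '');
fs.writeFileSync(path.join(base, 'core', 'obj.js'), '');

const candidates = listFilesRecursive(base);
console.log(candidates);
```

The trade-off is that a wildcard may emit more files than the code actually reads, in exchange for not needing to analyze the list.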

@styfle
Member Author

styfle commented May 17, 2019

Fixed in #378

styfle closed this as completed on May 17, 2019
@JinhooBong

I'm a bit new to GitHub issues, so I find it hard to follow where the solution can be found. I'm struggling with this pdf2json issue, which is causing failures in my Next.js app hosted on Vercel.
I am hitting this error: "Error: ENOENT: no such file or directory, open '/var/task/node_modules/pdf2json/base/shared/util.js'"

@PaulChase

> I'm a bit new to GitHub issues so it's a bit confusing to follow where the solution can be found. But I'm struggling with this issue with pdf2json causing failures on my Next.js app hosted on Vercel. I am hitting this error: "Error: ENOENT: no such file or directory, open '/var/task/node_modules/pdf2json/base/shared/util.js'"

I am also facing the same issue with my Next.js app hosted on Netlify.

@JinhooBong

Hey @PaulChase, I followed the thread modesty/pdf2json#330 and was able to resolve my issue. Pointing the dependency at a fix from a fellow dev seemed to resolve it for me. Hopefully this helps!
