This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

unable to deal with file names that have invalid encoding #8619

Closed
andrewrk opened this issue Oct 25, 2014 · 10 comments

Comments

@andrewrk

There is a file on my user's file system that Node.js cannot handle with any of the fs APIs, because the file name has an invalid encoding. File names are byte arrays, not strings; this is the root of the problem.

How should I solve this problem? Is this a limitation of Node.js, that it cannot deal with file names with invalid encodings?

andrewrk/groovebasin#383

@a0viedo
Member

a0viedo commented Nov 1, 2014

Can you provide a use case for this issue?

@othiym23

othiym23 commented Nov 1, 2014

The use case seems readily apparent, as described in andrewrk/groovebasin#383. If ordinary processes can deal with these (badly-encoded) filenames, Node should be able to as well. If the problem is JS strings' weird semi-broken character encoding and a desire to not change locked APIs, then perhaps Node could be converted to use Buffers with an encoding type of binary and call .toString() when they're going to be handed off to APIs expecting Strings. If so, though, there should be a way to get the raw Buffer so that applications that scan the filesystem don't error out on "invalid" filenames.
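
For readers arriving later: the "raw Buffer" idea above can be sketched with options that exist in later Node.js releases, not in the v0.10/v0.12 versions discussed in this thread. The sketch assumes a Node.js version where fs.readdirSync accepts { encoding: 'buffer' } and where fs functions accept Buffer paths. The scratch directory and the plain.txt entry are hypothetical and exist only for illustration.

```javascript
const fs = require('fs');
const os = require('os');
const path = require('path');

// Scratch directory for the demo (hypothetical; any directory works).
const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'encoding-test-'));
const prefix = Buffer.from(dir + path.sep);

// An ordinary entry, so the listing is never empty.
fs.writeFileSync(path.join(dir, 'plain.txt'), '');

// Try to create an entry whose name is the single invalid byte 0xFF.
// Some filesystems reject or rewrite such names, so failure is tolerated here.
try {
  fs.writeFileSync(Buffer.concat([prefix, Buffer.from([0xff])]), '');
} catch (err) {
  console.log('could not create raw-byte name:', err.code);
}

// Ask for raw bytes instead of decoded strings.
const names = fs.readdirSync(dir, { encoding: 'buffer' });

// Join directory and entry as bytes so nothing is decoded along the way.
const stats = names.map((name) => fs.lstatSync(Buffer.concat([prefix, name])));

console.log(names, stats.map((s) => s.isFile()));
```

Keeping the path as bytes from readdir through to lstat avoids the string round trip entirely, which is the property an application scanning the filesystem would need.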

@a0viedo
Member

a0viedo commented Nov 3, 2014

I'm not able to reproduce this on either Windows or Linux, using v0.10.32 in both cases.

@andrewrk
Author

andrewrk commented Nov 3, 2014

Try using the results of readdir. I'll piece together an example, just a moment...

@andrewrk
Author

andrewrk commented Nov 3, 2014

$ wget https://s3.amazonaws.com/superjoe/temp/encoding-test.tar.gz
$ tar xvf encoding-test.tar.gz 
encoding-test/
encoding-test/test.js
encoding-test/dir/
encoding-test/dir/\377
$ cd encoding-test/
$ cat test.js
var fs = require('fs');
var path = require('path');
var dir = 'dir/';
var list = fs.readdirSync(dir);
var filename = path.join(dir, list[0]);
console.log(filename);
console.log(fs.lstatSync(filename));
$ node test.js 
dir/�

fs.js:688
  return binding.lstat(pathModule._makeLong(path));
                 ^
Error: ENOENT, no such file or directory 'dir/�'
    at Object.fs.lstatSync (fs.js:688:18)
    at Object.<anonymous> (/home/andy/Downloads/encoding-test/test.js:7:16)
    at Module._compile (module.js:456:26)
    at Object.Module._extensions..js (module.js:474:10)
    at Module.load (module.js:356:32)
    at Function.Module._load (module.js:312:12)
    at Function.Module.runMain (module.js:497:10)
    at startup (node.js:119:16)
    at node.js:906:3

@cjihrig

cjihrig commented Nov 13, 2014

I ran this with the 0.12 branch on OS X and got what appears to be correct output (see below). Can you try with more recent code?

$ ls dir/
%FF
$ node test.js 
dir/%FF
{ dev: 16777219,
  mode: 33188,
  nlink: 1,
  uid: 501,
  gid: 0,
  rdev: 0,
  blksize: 4096,
  ino: 48664766,
  size: 0,
  blocks: 0,
  atime: Wed Dec 31 1969 19:00:00 GMT-0500 (EST),
  mtime: Sun Nov 02 2014 21:45:21 GMT-0500 (EST),
  ctime: Thu Nov 13 2014 11:08:21 GMT-0500 (EST),
  birthtime: Sun Nov 02 2014 21:45:21 GMT-0500 (EST) }

@andrewrk
Author

Just tried master branch, currently at cfcb1de, same problem.

Linux andy-bx 3.16.0-24-generic #32-Ubuntu SMP Tue Oct 28 13:07:32 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux

@chrisdickinson

I have a sneaking suspicion that this is related to #2387 -- the age-old problem of JS strings being in UCS2, but the filesystem being in ISO-8859-1, or some other encoding.
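
One plausible way to see the mismatch, assuming the file name is decoded as UTF-8 on its way into a JS string (an assumption, not something confirmed in this thread): the single byte 0xFF from the test archive does not survive a decode and re-encode, which would be consistent with the ENOENT in the earlier output. Buffer.from is used here and requires a newer Node.js than the versions discussed.

```javascript
// The single byte 0xFF (the file name from the test archive) is not valid UTF-8.
const rawName = Buffer.from([0xff]);

// Decoding as UTF-8 substitutes U+FFFD (the replacement character) for the bad byte.
const asString = rawName.toString('utf8');

// Re-encoding yields the three-byte UTF-8 form of U+FFFD, not the original byte.
const roundTripped = Buffer.from(asString, 'utf8');

console.log(rawName);                      // <Buffer ff>
console.log(roundTripped);                 // <Buffer ef bf bd>
console.log(rawName.equals(roundTripped)); // false

// A one-byte-per-character encoding such as 'latin1' does round-trip arbitrary bytes.
const latin1RoundTrip = Buffer.from(rawName.toString('latin1'), 'latin1');
console.log(rawName.equals(latin1RoundTrip)); // true
```

Under that assumption, the path handed back to lstat names a different sequence of bytes than the one on disk, so the lookup fails even though the file exists.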

@andrewrk
Author

Looks like a duplicate issue indeed.

I would like to point out that this is a huge shortcoming in Node.js and should be fixed before 1.0 if Node.js wants to be taken seriously.

With this issue, Node.js is fundamentally unsuitable for dealing with users' files.

@chrisdickinson

Closing this issue in favor of #2387.
