Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[CLOSED] Getting mode from file extension won't always work #2762

Open
core-ai-bot opened this issue Aug 29, 2021 · 10 comments
Open

[CLOSED] Getting mode from file extension won't always work #2762

core-ai-bot opened this issue Aug 29, 2021 · 10 comments

Comments

@core-ai-bot
Copy link
Member

Issue by MarkMurphy
Thursday Feb 21, 2013 at 04:31 GMT
Originally opened as adobe/brackets#2923


Getting mode from file extension won't always work where some files don't have extensions like Makefiles for example.

Instead I propose that the getModeFromFileExtension method be refactored to getModeFromFilePath where the filename including the extension can be compared to a regex representing each mode.

for example:

var Mode,
    modes = [],
    modesByName = {};

var modeList = [{
    name: "Makefile", 
    options: "text/plain", 
    mime: "text/plain"
    matches: "^GNUmakefile|^makefile|^Makefile|^OCamlMakefile|make"
} , {
    name: "HTML"
    options: {
        name: "htmlmixed",
        scriptTypes: [{matches: /\/x-handlebars-template|\/x-mustache/i, mode: null}]
    },
    mime: "text/html"
    matches: "html|htm|shtm|shtm|xhtml|cfm|cfml|cfc|dhtml|xht"
}];

var Mode = function(name, options, mime, matches) {
    this.name = name;
    this.options= options;
    this.mime= mime;

    var re;

    if (/\^/.test(matches)) {
        re = matches.replace(/\|(\^)?/g, function(a, b){
            return "$|" + (b ? "^" : "^.*\\.");
        }) + "$";
    } else {
        re = "^.*\\.(" + matches+ ")$";
    }   

    this.matches= new RegExp(re, "gi");
};

Mode.prototype.supportsFile = function(filename) {
    return filename.match(this.matches);
};

for (var i = 0, length = modeList.length; i < length; i++) {
    var mode = modeList[i];
    mode = new Mode(mode.name, mode.options, mode.mime, mode.matches);
    modesByName[mode.name.toLowerCase()] = mode;
    modes.push(mode);
}

function getModeFromPath(path) {
    var filename = path.split(/[\/\\]/).pop(),
         length = modes.length
         i;

    for (i = 0; i < length; i++) {
        if (modes[i].supportsFile(filename)) {
            mode = modes[i];
            break;
        }
    }

    if (!mode) {
         console.log("Called EditorUtils.js _getModeFromFilePath with an unhandled file name or extension: " + filename);
        return "";
    }

    return mode;
}
@core-ai-bot
Copy link
Member Author

Comment by pthiess
Monday Feb 25, 2013 at 19:39 GMT


Hi Mark,
This seems like a reasonable approach to deal with filenames without an extension. We may want to discuss it regarding performance.
However, due to the teams bandwidth we would need to put it on the backlog for now.
Would you feel confident to work on this by yourself?
If so, that would be cool and I would encourage you to start a discussion about the best solution on our developer list (Google Groups).
Thanks a ton for your suggestion!

@core-ai-bot
Copy link
Member Author

Comment by peterflynn
Monday Feb 25, 2013 at 19:46 GMT


Are there cases where you'd need a full-blown regexp? Or would a list of extensionless filenames suffice? Something like this:

{
    name: "Makefile", 
    options: "text/plain", 
    mime: "text/plain",
    fileExtensions: [],
    fileNames: ["GNUmakefile", "makefile", "Makefile", "OCamlMakefile", "make"]
}

I'd worry a little bit that regexps make it too easy to create bugs where language extensions match an overly-broad set of filenames. (I think your example above actually contains such a bug :-) -- there's no "^" before the last item, so any filename containing the contiguous substring "make" would get matched). Regexps also make it impractical to detect clashing extensions, whereas with simple lists we can automatically detect & warn when two language extensions are fighting over the same file.

@core-ai-bot
Copy link
Member Author

Comment by MarkMurphy
Monday Feb 25, 2013 at 20:18 GMT


@peterflynn no bug, if you look at the Mode constructor, it assumes to add a '$' to the expression:

if (/\^/.test(matches)) {
    re = matches.replace(/\|(\^)?/g, function(a, b){
    return "$|" + (b ? "^" : "^.*\\.");
    }) + "$";
} else {
    re = "^.*\\.(" + matches+ ")$";
}   

this.matches= new RegExp(re, "gi");

In this case the regexps are so simplistic that there's really no reason to worry about using them for this.

Performance wise, it wouldn't be any faster that the current implementation of using the switch + whatever time ti takes to evaluate against each regex.

What I do like about about this approach is that it makes it much easier for me to quickly see what extension is associated to what language/mode.

Another approach I was considering was to create a key/value pair object where the keys would be extensions and extension-less file names and the value would be an object containing the mode name, options, mime, etc.

Here's an example:

var modes = {
    "html": { name: 'HTML', mode: ... },
     "htm": { alias: "html" },
    "makefile": { name: 'Makefile', mode: ... }
};

function getModeFromFilePath (path)  {
    var filename = ... ,
        extension = ... , 
        syntax = { mime: 'text/plain', mode: 'text/plain' };

    extension = extension ? extension : filename;

    while(extension && modes[extension] && modes[extension].alias) {
        extension = modes[extension].alias;
    }

    return extension && modes[extension] ? modes[extension] : syntax;
}

@core-ai-bot
Copy link
Member Author

Comment by peterflynn
Monday Feb 25, 2013 at 20:35 GMT


@MarkMurphy: this is definitely about to become a lot more declarative -- see pull request #2844, in particular LanguageManager.js and the languages.json file that feeds into it.

Sorry about the "bug" confusion -- I didn't realize that you were preprocessing the "match" string and not using it as a literal regexp directly.

I think my concerns there remain, though... I'd prefer if we could do this via a simple list of filenames, similar to your last proposal above. (Except that I think we should keep it separate from the list of extensions -- otherwise an extensionless file whose name matches any file extension would get mapped to that mode automatically, which seems wrong).

@core-ai-bot
Copy link
Member Author

Comment by MarkMurphy
Monday Feb 25, 2013 at 21:38 GMT


@peterflynn that makes sense.

@core-ai-bot
Copy link
Member Author

Comment by peterflynn
Monday Feb 25, 2013 at 23:54 GMT


Cool. Well, feel free to make a branch off of #2844's and start playing with an implementation of this if you're interested :-)

@core-ai-bot
Copy link
Member Author

Comment by MarkMurphy
Tuesday Feb 26, 2013 at 01:21 GMT


@peterflynn Sure, I should be able to have something done soon.

@core-ai-bot
Copy link
Member Author

Comment by DennisKehrig
Wednesday Mar 13, 2013 at 17:03 GMT


@MarkMurphy Can you verify this as working?

@core-ai-bot
Copy link
Member Author

Comment by MarkMurphy
Wednesday Mar 13, 2013 at 17:27 GMT


@DennisKehrig Confirmed. Tested by creating an extension-less file named "Cakefile" in a project which Brackets successfully interpreted as a CoffeeScript file.

@core-ai-bot
Copy link
Member Author

Comment by DennisKehrig
Wednesday Mar 13, 2013 at 17:31 GMT


@MarkMurphy Woohoo, thanks! Then I'll go ahead and close this one :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant