UglifyJS – a JavaScript parser/compressor/beautifier
This package implements a general-purpose JavaScript
parser/compressor/beautifier toolkit. It is developed on NodeJS, but it
should work on any JavaScript platform supporting the CommonJS module system
(and if your platform of choice doesn't support CommonJS, you can easily
implement it, or discard the exports.* lines from UglifyJS sources).

The tokenizer/parser generates an abstract syntax tree from JS code. You
can then traverse the AST to learn more about the code, or do various
manipulations on it. This part is implemented in parse-js.js and it's a
port to JavaScript of the excellent parse-js Common Lisp library from
Marijn Haverbeke.

(See cl-uglify-js if you're looking for the Common Lisp version of
UglifyJS.)

The second part of this package, implemented in process.js, inspects and
manipulates the AST generated by the parser to provide the following:

* ability to re-generate JavaScript code from the AST, optionally
  indented. You can use this if you want to “beautify” a program that has
  been compressed, so that you can inspect the source. You can also run
  our code generator to print out an AST without any whitespace, so you
  achieve compression as well.

* shorten variable names (usually to single characters). Our mangler will
  analyze the code and generate proper variable names, depending on scope
  and usage, and is smart enough to deal with globals defined elsewhere, or
  with eval() calls or with{} statements. In short, if eval() or with{}
  are used in some scope, then all variables in that scope and any
  variables in the parent scopes will remain unmangled, and any references
  to such variables remain unmangled as well.

* various small optimizations that may lead to faster code but certainly
  lead to smaller code. Where possible, we do the following:

  * foo["bar"] ==> foo.bar

  * remove block brackets {}

  * join consecutive var declarations:
    var a = 10; var b = 20; ==> var a=10,b=20;

  * resolve simple constant expressions: 1 + 2 * 3 ==> 7. We only do the
    replacement if the result occupies fewer bytes; for example 1/3 would
    translate to 0.333333333333, so in this case we don't replace it.

  * consecutive statements in blocks are merged into a sequence; in many
    cases, this leaves blocks with a single statement, so then we can
    remove the block brackets.

  * various optimizations for IF statements:

    * if (foo) bar(); else baz(); ==> foo?bar():baz();
    * if (!foo) bar(); else baz(); ==> foo?baz():bar();
    * if (foo) bar(); ==> foo&&bar();
    * if (!foo) bar(); ==> foo||bar();
    * if (foo) return bar(); else return baz(); ==> return foo?bar():baz();
    * if (foo) return bar(); else something(); ==> {if(foo)return bar();something()}

  * remove some unreachable code and warn about it (code that follows a
    return, throw, break or continue statement, except function/variable
    declarations).

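The IF rewrites listed above rely only on standard JavaScript semantics, so they can be checked without UglifyJS at all. Here is a minimal sketch; the helpers `run` and `sameBehaviour` are hypothetical names made up for this illustration, and each pair of functions is a hand-written "before" and "after" form, not output produced by the compressor.

```javascript
// Hypothetical helper: runs fn with recording versions of bar() and baz()
// and reports which of them were called, in order.
function run(fn, foo) {
    var log = [];
    function bar() { log.push("bar"); }
    function baz() { log.push("baz"); }
    fn(foo, bar, baz);
    return log.join(",");
}

// if (foo) bar(); else baz();   ==>   foo?bar():baz();
function ifElse(foo, bar, baz) { if (foo) bar(); else baz(); }
function ternary(foo, bar, baz) { foo ? bar() : baz(); }

// if (foo) bar();   ==>   foo&&bar();
function ifOnly(foo, bar) { if (foo) bar(); }
function andForm(foo, bar) { foo && bar(); }

// if (!foo) bar();   ==>   foo||bar();
function ifNot(foo, bar) { if (!foo) bar(); }
function orForm(foo, bar) { foo || bar(); }

// Hypothetical helper: true when both forms call the same functions
// for a handful of truthy and falsy conditions.
function sameBehaviour(a, b) {
    return [true, false, 0, "x", null].every(function (foo) {
        return run(a, foo) === run(b, foo);
    });
}

console.log(sameBehaviour(ifElse, ternary)); // true
console.log(sameBehaviour(ifOnly, andForm)); // true
console.log(sameBehaviour(ifNot, orForm));   // true
```

The short forms are expressions rather than statements, which is also what lets them take part in the sequence merging described in the same list.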
1.1 Unsafe transformations
The following transformations can in theory break code, although they're
probably safe in most practical cases. To enable them you need to pass the
--unsafe flag.

1.1.1 Calls involving the global Array constructor
The following transformations occur:

    new Array(1, 2, 3, 4)  => [1,2,3,4]
    Array(a, b, c)         => [a,b,c]
    new Array(5)           => Array(5)
    new Array(a)           => Array(a)

These are all safe if the Array name isn't redefined. JavaScript does allow
one to globally redefine Array (and pretty much everything, in fact) but I
personally don't see why anyone would do that.

UglifyJS does handle the case where Array is redefined locally, or even
globally but with a function or var declaration. Therefore, in the
following cases UglifyJS doesn't touch calls or instantiations of Array:

    // case 1. globally declared variable
      var Array;
      new Array(1, 2, 3);
      Array(a, b);

      // or (can be declared later)
      new Array(1, 2, 3);
      var Array;

      // or (can be a function)
      new Array(1, 2, 3);
      function Array() { ... }

    // case 2. declared in a function
      (function(){
        a = new Array(1, 2, 3);
        b = Array(5, 6);
        var Array;
      })();

      // or
      (function(Array){
        return Array(5, 6, 7);
      })();

      // or
      (function(){
        return new Array(1, 2, 3, 4);
        function Array() { ... }
      })();

      // etc.

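To see why the rewrites are only valid for the built-in Array, and why a single-argument call is shortened to Array(5) rather than turned into a literal, here is a small sketch in plain JavaScript. The function `shadowed` and its local Array are hypothetical, written only for this illustration.

```javascript
// With the built-in Array, constructor calls and literals agree.
var viaNew = new Array(1, 2, 3, 4);
var viaCall = Array("a", "b", "c");
console.log(JSON.stringify(viaNew) === JSON.stringify([1, 2, 3, 4]));   // true
console.log(JSON.stringify(viaCall) === JSON.stringify(["a", "b", "c"])); // true

// A single numeric argument is a length, not an element, so the literal
// [5] would NOT be equivalent; dropping "new" is.
console.log(new Array(5).length); // 5
console.log(Array(5).length);     // 5
console.log([5].length);          // 1

// When Array is redeclared in a scope, calls in that scope mean something
// else entirely, so rewriting them to a literal would change the result.
function shadowed() {
    return Array(5, 6);
    function Array() { return "not an array"; }
}
console.log(shadowed()); // not an array
```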
1.1.2 obj.toString() ==> obj+""
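For ordinary values the two forms produce the same string. A minimal sketch of situations where they can differ follows; the object `odd` and the two wrapper functions are hypothetical, made up for this illustration.

```javascript
// For plain values both forms give the same string.
var n = 42;
console.log(n.toString() === n + ""); // true

// Hypothetical object whose valueOf() and toString() disagree:
// string concatenation consults valueOf() first.
var odd = {
    valueOf: function () { return 1; },
    toString: function () { return "one"; }
};
console.log(odd.toString()); // one
console.log(odd + "");       // 1

// On null the method call throws, while the concatenation does not.
function viaToString(x) { return x.toString(); }
function viaConcat(x) { return x + ""; }

console.log(viaConcat(null)); // null
try {
    viaToString(null);
} catch (e) {
    console.log(e instanceof TypeError); // true
}
```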
1.2 Install (NPM)

UglifyJS is now available through NPM: npm install uglify-js should do
the job.

1.3 Install latest code from GitHub
    ## clone the repository
    mkdir -p /where/you/wanna/put/it
    cd /where/you/wanna/put/it
    git clone git://github.com/mishoo/UglifyJS.git

    ## make the module available to Node
    mkdir -p ~/.node_libraries/
    cd ~/.node_libraries/
    ln -s /where/you/wanna/put/it/UglifyJS/uglify-js.js

    ## and if you want the CLI script too:
    mkdir -p ~/bin
    cd ~/bin
    ln -s /where/you/wanna/put/it/UglifyJS/bin/uglifyjs
      # (then add ~/bin to your $PATH if it's not there already)

1.4 Usage
There is a command-line tool that exposes the functionality of this library
for your shell-scripting needs:

    uglifyjs [ options... ] [ filename ]

filename should be the last argument and should name the file from which
to read the JavaScript code. If you don't specify it, it will read code
from STDIN.

Supported options:

* -b or --beautify – output indented code; when passed, additional
  options control the beautifier:

  * -i N or --indent N – indentation level (number of spaces)

  * -q or --quote-keys – quote keys in literal objects (by default,
    only keys that cannot be identifier names will be quoted).

* --ascii – pass this argument to encode non-ASCII characters as
  \uXXXX sequences. By default UglifyJS won't bother to do it and will
  output Unicode characters instead. (The output is always encoded in UTF8,
  but if you pass this option you'll only get ASCII.)

* -nm or --no-mangle – don't mangle variable names

* -ns or --no-squeeze – don't call ast_squeeze() (which does various
  optimizations that result in smaller, less readable code).

* -mt or --mangle-toplevel – mangle names in the toplevel scope too
  (by default we don't do this).

* --no-seqs – when ast_squeeze() is called (thus, unless you pass
  --no-squeeze) it will reduce consecutive statements in blocks into a
  sequence. For example, "a = 10; b = 20; foo();" will be written as
  "a=10,b=20,foo();". On various occasions, this allows us to discard the
  block brackets (since the block becomes a single statement). This is ON
  by default because it seems safe and saves a few hundred bytes on some
  libs that I tested it on, but pass --no-seqs to disable it.

* --no-dead-code – by default, UglifyJS will remove code that is
  obviously unreachable (code that follows a return, throw, break or
  continue statement and is not a function/variable declaration). Pass
  this option to disable this optimization.

* -nc or --no-copyright – by default, uglifyjs will keep the initial
  comment tokens in the generated code (assumed to be copyright information
  etc.). If you pass this it will discard them.

* -o filename or --output filename – put the result in filename. If
  this isn't given, the result goes to standard output (or see next one).

* --overwrite – if the code is read from a file (not from STDIN) and you
  pass --overwrite then the output will be written in the same file.

* --ast – pass this if you want to get the Abstract Syntax Tree instead
  of JavaScript as output. Useful for debugging or learning more about the
  internals.

* -v or --verbose – output some notes on STDERR (for now just how long
  each operation takes).

* --unsafe – enable other additional optimizations that are known to be
  unsafe in some contrived situations, but could still be generally useful.
  For now only this:

  * foo.toString() ==> foo+""

* --max-line-len (default 32K characters) – add a newline after around
  32K characters. I've seen both FF and Chrome croak when all the code was
  on a single line of around 670K. Pass --max-line-len 0 to disable this
  safety feature.

* --reserved-names – some libraries rely on certain names to be used, as
  pointed out in issue #92 and #81, so this option allows you to exclude
  such names from the mangler. For example, to keep names require and
  $super intact you'd specify --reserved-names "require,$super".

* --inline-script – when you want to include the output literally in an
  HTML <script> tag you can use this option to prevent </script from
  showing up in the output.

* --lift-vars – when you pass this, UglifyJS will apply the following
  transformations (see the notes in API, ast_lift_variables):

  * put all var declarations at the start of the scope
  * make sure a variable is declared only once
  * discard unused function arguments
  * discard unused inner (named) functions
  * finally, try to merge assignments into that one var declaration, if
    possible.

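To make the effect of --ascii and --inline-script more concrete, here is a rough sketch of the kind of string rewriting they describe. Both helpers, `toAsciiOnly` and `escapeInlineScript`, are hypothetical and written only for this illustration; they are not UglifyJS's own code, and the real options operate on string literals in the generated output rather than on arbitrary text.

```javascript
// Hypothetical helper: replace every non-ASCII UTF-16 code unit with a
// \uXXXX escape, which is what the --ascii description asks for.
function toAsciiOnly(str) {
    return str.replace(/[^\x00-\x7f]/g, function (ch) {
        var hex = ch.charCodeAt(0).toString(16);
        while (hex.length < 4) hex = "0" + hex;
        return "\\u" + hex;
    });
}

// Hypothetical helper: break up "</script" so that an HTML parser does not
// end the enclosing <script> element early. Inside a JavaScript string
// literal, "\/" still means "/", so the string's value is unchanged.
function escapeInlineScript(str) {
    return str.replace(/<\/(script)/gi, "<\\/$1");
}

console.log(toAsciiOnly("caf\u00e9"));          // caf\u00e9
console.log(escapeInlineScript("a</script>b")); // a<\/script>b
```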
1.4.1 API
To use the library from JavaScript, you'd do the following (example for
NodeJS):

    var jsp = require("uglify-js").parser;
    var pro = require("uglify-js").uglify;

    var orig_code = "... JS code here";
    var ast = jsp.parse(orig_code); // parse code and get the initial AST
    ast = pro.ast_mangle(ast); // get a new AST with mangled names
    ast = pro.ast_squeeze(ast); // get an AST with compression optimizations
    var final_code = pro.gen_code(ast); // compressed code here

The above performs the full compression that is possible right now. As you
can see, there is a sequence of steps which you can apply. For example if
you want compressed output but for some reason you don't want to mangle
variable names, you would simply skip the line that calls
pro.ast_mangle(ast).

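The idea of skipping a step can be wrapped in a small helper. The sketch below is only an illustration: `compress` and its `mangle`/`squeeze` switches are hypothetical names, not part of the library, and the parser and processor objects are passed in as arguments. To keep the sketch runnable on its own, it is exercised with stand-in objects that merely record which of the four documented functions get called.

```javascript
// Hypothetical wrapper around the four calls shown above.
// jsp and pro are passed in, so nothing is assumed about module loading.
function compress(jsp, pro, code, options) {
    options = options || {};
    var ast = jsp.parse(code);           // initial AST
    if (options.mangle !== false) {
        ast = pro.ast_mangle(ast);       // optional: mangle names
    }
    if (options.squeeze !== false) {
        ast = pro.ast_squeeze(ast);      // optional: compression optimizations
    }
    return pro.gen_code(ast);            // back to source text
}

// Stand-in objects that only record the order of calls. With the library
// installed you would pass require("uglify-js").parser and
// require("uglify-js").uglify instead.
var calls = [];
var fakeParser = {
    parse: function (code) { calls.push("parse"); return ["toplevel", code]; }
};
var fakeProcessor = {
    ast_mangle: function (ast) { calls.push("ast_mangle"); return ast; },
    ast_squeeze: function (ast) { calls.push("ast_squeeze"); return ast; },
    gen_code: function (ast) { calls.push("gen_code"); return ast[1]; }
};

compress(fakeParser, fakeProcessor, "var a = 1;", { mangle: false });
console.log(calls.join(" -> ")); // parse -> ast_squeeze -> gen_code
```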
Some of these functions take optional arguments. Here's a description:

* jsp.parse(code, strict_semicolons) – parses JS code and returns an AST.
  strict_semicolons is optional and defaults to false. If you pass
  true then the parser will throw an error when it expects a semicolon and
  it doesn't find it. For most JS code you don't want that, but it's useful
  if you want to strictly sanitize your code.

* pro.ast_lift_variables(ast) – merge and move var declarations to the
  top of the scope; discard unused function arguments or variables; discard
  unused (named) inner functions. It also tries to merge assignments
  following the var declaration into it.

  If your code is very hand-optimized concerning var declarations,
  lifting variable declarations might actually increase size. For me it
  helps out. On jQuery it adds 865 bytes (243 after gzip). YMMV. Also
  note that (since it's not enabled by default) this operation isn't yet
  heavily tested (please report if you find issues!).

  Note that although it might increase the output size, it's technically
  more correct: in certain situations, dead code removal might drop
  variable declarations, which would not happen if the variables are
  lifted in advance.

  Here's an example of what it does:

      function f(a, b, c, d, e) {
          var q;
          var w;
          w = 10;
          q = 20;
          for (var i = 1; i < 10; ++i) {
              var boo = foo(a);
          }
          for (var i = 0; i < 1; ++i) {
              var boo = bar(c);
          }
          function foo(){ ... }
          function bar(){ ... }
          function baz(){ ... }
      }

      // transforms into ==>

      function f(a, b, c) {
          var i, boo, w = 10, q = 20;
          for (i = 1; i < 10; ++i) {
              boo = foo(a);
          }
          for (i = 0; i < 1; ++i) {
              boo = bar(c);
          }
          function foo() { ... }
          function bar() { ... }
      }

* pro.ast_mangle(ast, options) – generates a new AST containing mangled
  (compressed) variable and function names. It supports the following
  options:

  * toplevel – mangle toplevel names (by default we don't touch them).
  * except – an array of names to exclude from compression.

* pro.ast_squeeze(ast, options) – employs further optimizations designed
  to reduce the size of the code that gen_code would generate from the
  AST. Returns a new AST. options can be a hash; the supported options
  are:

  * make_seqs (default true) which will cause consecutive statements in a
    block to be merged using the "sequence" (comma) operator
  * dead_code (default true) which will remove unreachable code.

* pro.gen_code(ast, options) – generates JS code from the AST. By
  default it's minified, but using the options argument you can get nicely
  formatted output. options is, well, optional :-) and if you pass it it
  must be an object and supports the following properties (below you can see
  the default values):

  * beautify: false – pass true if you want indented output
  * indent_start: 0 (only applies when beautify is true) – initial
    indentation in spaces
  * indent_level: 4 (only applies when beautify is true) –
    indentation level, in spaces (pass an even number)
  * quote_keys: false – if you pass true it will quote all keys in
    literal objects
  * space_colon: false (only applies when beautify is true) – whether
    to put a space before the colon in object literals
  * ascii_only: false – pass true if you want to encode non-ASCII
    characters as \uXXXX.
  * inline_script: false – pass true to escape occurrences of
    </script in strings

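The lifting done by pro.ast_lift_variables, described in the list above, preserves behaviour because var declarations are scoped to the whole function, not to the block or loop they appear in. Here is a small plain-JavaScript sketch of that; `scattered` and `lifted` are hypothetical functions written by hand for this illustration, not output generated by UglifyJS.

```javascript
// Declarations scattered through the body, including repeated ones.
function scattered(items) {
    var total = 0;
    for (var i = 0; i < items.length; ++i) {
        var item = items[i];
        total += item;
    }
    for (var i = 0; i < 1; ++i) {
        var item = "last";
    }
    return [total, i, item];
}

// The same function with every declaration lifted into a single var
// statement at the top, and the simple initial assignment merged into it.
function lifted(items) {
    var i, item, total = 0;
    for (i = 0; i < items.length; ++i) {
        item = items[i];
        total += item;
    }
    for (i = 0; i < 1; ++i) {
        item = "last";
    }
    return [total, i, item];
}

console.log(JSON.stringify(scattered([1, 2, 3]))); // [6,1,"last"]
console.log(JSON.stringify(lifted([1, 2, 3])));    // [6,1,"last"]
```

Whether the lifted form is shorter depends on the code: the repeated var keywords disappear, but names that were declared where they were first used now appear twice.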
1.4.2 Beautifier shortcoming – no more comments
The beautifier can be used as a general purpose indentation tool. It's
useful when you want to make a minified file readable. One limitation,
though, is that it discards all comments, so you don't really want to use it
to reformat your code, unless you don't have, or don't care about, comments.

In fact it's not the beautifier that discards comments; they are dumped at
the parsing stage, when we build the initial AST. Comments don't really
make sense in the AST, and while we could add nodes for them, it would be
inconvenient because we'd have to add special rules to ignore them at all
the processing stages.

1.5 Compression – how good is it?

Here are updated statistics. (I also updated my Google Closure and YUI
installations.)

We're still a lot better than YUI in terms of compression, though slightly
slower. We're still a lot faster than Closure, and compression after gzip
is comparable.

-File | UglifyJS | UglifyJS+gzip | Closure | Closure+gzip | YUI | YUI+gzip |
---|---|---|---|---|---|---|
jquery-1.6.2.js | 91001 (0:01.59) | 31896 | 90678 (0:07.40) | 31979 | 101527 (0:01.82) | 34646 |
paper.js | 142023 (0:01.65) | 43334 | 134301 (0:07.42) | 42495 | 173383 (0:01.58) | 48785 |
prototype.js | 88544 (0:01.09) | 26680 | 86955 (0:06.97) | 26326 | 92130 (0:00.79) | 28624 |
thelib-full.js (DynarchLIB) | 251939 (0:02.55) | 72535 | 249911 (0:09.05) | 72696 | 258869 (0:01.94) | 76584 |
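To put the differences in the table into proportion, the sketch below computes how much larger or smaller the UglifyJS output is relative to the other two tools. The numbers are copied from the non-gzip columns of the table and are assumed here to be sizes in bytes; `sizes` and `percentDifference` are hypothetical names made up for this illustration.

```javascript
// Minified sizes copied from the table above (assumed to be bytes).
var sizes = {
    "jquery-1.6.2.js": { uglify: 91001, closure: 90678, yui: 101527 },
    "paper.js": { uglify: 142023, closure: 134301, yui: 173383 },
    "prototype.js": { uglify: 88544, closure: 86955, yui: 92130 },
    "thelib-full.js": { uglify: 251939, closure: 249911, yui: 258869 }
};

// Size of the UglifyJS output relative to another tool, in percent.
// Negative values mean UglifyJS produced the smaller file.
function percentDifference(file, other) {
    var row = sizes[file];
    return (row.uglify - row[other]) / row[other] * 100;
}

Object.keys(sizes).forEach(function (file) {
    console.log(file,
        "vs YUI:", percentDifference(file, "yui").toFixed(1) + "%",
        "vs Closure:", percentDifference(file, "closure").toFixed(1) + "%");
});
```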
1.6 Bugs?
Unfortunately, for the time being there is no automated test suite. But I
ran the compressor manually on non-trivial code, and then I tested that the
generated code works as expected. A few hundred times.

DynarchLIB was started in times when there was no good JS minifier.
Therefore I was quite religious about trying to write short code manually,
and as such DL contains a lot of syntactic hacks such as
“foo == bar ? a = 10 : b = 20”, though the more readable version would
clearly be to use “if/else”.

Since the parser/compressor runs fine on DL and jQuery, I'm quite confident
that it's solid enough for production use. If you can identify any bugs,
I'd love to hear about them (use the Google Group or email me directly).

1.7 Links

* Twitter: @UglifyJS
* Project at GitHub: http://github.com/mishoo/UglifyJS
* Google Group: http://groups.google.com/group/uglifyjs
* Common Lisp JS parser: http://marijn.haverbeke.nl/parse-js/
* JS-to-Lisp compiler: http://github.com/marijnh/js
* Common Lisp JS uglifier: http://github.com/mishoo/cl-uglify-js

1.8 License
UglifyJS is released under the BSD license:

    Copyright 2010 (c) Mihai Bazon <mihai.bazon@gmail.com>
    Based on parse-js (http://marijn.haverbeke.nl/parse-js/).

    Redistribution and use in source and binary forms, with or without
    modification, are permitted provided that the following conditions
    are met:

        * Redistributions of source code must retain the above
          copyright notice, this list of conditions and the following
          disclaimer.

        * Redistributions in binary form must reproduce the above
          copyright notice, this list of conditions and the following
          disclaimer in the documentation and/or other materials
          provided with the distribution.

    THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDER “AS IS” AND ANY
    EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
    IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
    PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER BE
    LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY,
    OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,
    PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR
    PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
    THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR
    TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF
    THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
    SUCH DAMAGE.

Date: 2011-08-20 10:08:28 EEST