With frameworks like AngularJS it is possible to write really nice web applications relying on just HTML, JavaScript and REST services. Of course you indent and comment your HTML and JavaScript files, but that data does not need to be served to the user. The web browser just needs the functional parts.
There are many minifiers, uglifiers and obfuscators: programs that remove comments and formatting from your code to make it smaller. Sometimes they also scramble (obfuscate) the code with the intention of making it harder for someone to understand (and possibly use or exploit).
Those minifiers can be Windows applications, web pages or web server plugins, and they are implemented in a wide variety of languages and platforms depending on use. What I wanted was something very simple that I could just include in a simple build script on a Linux system: a command line tool (one that does not rely on installing a bunch of Java libraries or PHP packages, and that does not support hundreds of dangerous options).
For JavaScript it was easy: I found JSMin written by a Master, Crockford. JSMin comes as a single C source file – that is easy for me.
For HTML it was trickier, probably because few people actually write big HTML files directly: most often a web server and a server framework (like PHP) deliver the code. There were also many web-based HTML minifiers, but those are annoying to automate and depend on. So I spent more time looking for something as simple as JSMin than I spent implementing the thing myself. It was tempting to do it in C, but then it would have taken longer to implement than the time I had already wasted looking for a tool. I chose Python (version 3, so it is incompatible with most people's Python interpreter). Here we go:
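    #!/usr/bin/python3
    import sys

    in_pre = False      # True while inside a <PRE> block
    in_comment = False  # True while inside an HTML comment

    # Outside a comment: returns (text before '<!--', comment opened?, remainder)
    def outCommentHandler(line):
        x = line.find('<!--')
        if -1 == x:
            return line,False,''
        else:
            return line[:x],True,line[4+x:]

    # Inside a comment: returns (still in comment?, remainder after '-->')
    def inCommentHandler(line):
        x = line.find('-->')
        if -1 == x:
            return True,''
        else:
            return False,line[3+x:]

    for line in sys.stdin:
        rem = line.strip()
        if in_pre:
            if rem.upper() == '</PRE>':
                in_pre = False
                print(rem)
            else:
                # line still ends with its newline, so do not add another
                print(line, end='')
        elif rem.upper() == '<PRE>':
            in_pre = True
            print(rem)
        elif '' != rem:
            while '' != rem:
                if in_comment:
                    clean = ''
                    in_comment,rem = inCommentHandler(rem)
                else:
                    clean,in_comment,rem = outCommentHandler(rem)
                # drop white space left over next to a removed comment
                clean = clean.strip()
                if '' != clean:
                    print(clean)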
Both jsmin and htmlmin.py are used the same way:
    $ jsmin < code.js > code.min.js
    $ htmlmin.py < page.html > page.min.html
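If the build script itself is written in Python, the same stdin/stdout convention can be driven with the standard `subprocess` module. This is only a sketch: the `run_filter` helper is a made-up name, and it assumes that `jsmin` and `htmlmin.py` are installed somewhere on the PATH.

```python
import subprocess

def run_filter(command, src_path, dst_path):
    """Pipe src_path through a stdin/stdout filter and write the result to dst_path.

    `command` is an argument list such as ['jsmin'] (assumed to be on the PATH).
    """
    with open(src_path, 'rb') as src, open(dst_path, 'wb') as dst:
        # check=True raises CalledProcessError if the tool exits with an error
        subprocess.run(command, stdin=src, stdout=dst, check=True)
```

Calling `run_filter(['jsmin'], 'code.js', 'code.min.js')` should then do the same as the first shell line above, and `run_filter(['htmlmin.py'], 'page.html', 'page.min.html')` the same as the second.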
Neither program is perfect.
JSMin
I found that JSMin fails with regular expression patterns like this:
    var alpha=/^[a-z]+$/
    var ALPHA=/^[A-Z]+$/
Adding ; to the end of the lines fixes that problem.
htmlmin.py
What it does is simply:
- Preserves <PRE> as long as the PRE-tags are on their own lines
- Removes all comments:
<!-- A comment -->
- Removes white space in the beginning (and end) of lines
- Removes empty lines
This covers the low-hanging fruit only, but I think it is good enough for most purposes.
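To make the rules concrete, here is a small sketch that applies them to a string instead of stdin, so it can be tried on a sample. The function name `minify_html` and the sample markup are invented for this illustration, and unlike the script it joins the pieces of a line that remain around a removed comment.

```python
def minify_html(text):
    """Apply the four rules above to a string and return the result."""
    out = []
    in_pre = False
    in_comment = False
    for line in text.splitlines():
        rem = line.strip()
        if in_pre:
            if rem.upper() == '</PRE>':
                in_pre = False
                out.append(rem)
            else:
                out.append(line)      # preformatted text is kept untouched
        elif rem.upper() == '<PRE>':
            in_pre = True
            out.append(rem)
        else:
            kept = ''
            while rem:
                if in_comment:
                    end = rem.find('-->')
                    if end == -1:
                        rem = ''      # comment continues on the next line
                    else:
                        in_comment = False
                        rem = rem[end + 3:]
                else:
                    start = rem.find('<!--')
                    if start == -1:
                        kept += rem
                        rem = ''
                    else:
                        kept += rem[:start]
                        in_comment = True
                        rem = rem[start + 4:]
            kept = kept.strip()
            if kept:                  # empty lines are dropped
                out.append(kept)
    return '\n'.join(out)

sample = '\n'.join([
    '<div>',
    '    <!-- A comment -->',
    '    <p>Hello</p>   ',
    '',
    '<pre>',
    '  keep   this',
    '</pre>',
    '</div>',
])
print(minify_html(sample))
```

The comment line and the empty line disappear, the paragraph loses its indentation, and the line inside the PRE block keeps its spacing.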
What is the “compression” rate?
For my test code:
HTML was compressed from 59 kB to 42 kB
JavaScript was compressed from 162 kB to 108 kB
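Expressed as a percentage, the saving can be worked out with a few lines of Python. The helper name `reduction_percent` is made up here; the numbers are the sizes quoted above.

```python
def reduction_percent(before, after):
    """Return how much smaller `after` is than `before`, in percent."""
    return 100.0 * (before - after) / before

print(f"HTML:       {reduction_percent(59, 42):.1f}%")    # 28.8%
print(f"JavaScript: {reduction_percent(162, 108):.1f}%")  # 33.3%
```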
It is possible to do better with better tools, but this is very simple, and it takes away the obvious waste from the files with minimal risk of changing behavior. Heavier JavaScript minifiers rename variables and rewrite code.