Sunday, December 18, 2011

Don't use the jQuery .data() method. Use .attr() instead.

I just discovered the hard way that the jQuery .data() method is horribly broken. By design, when it reads a data-* attribute from the markup, it attempts to convert the value into a native type.

I've got a template where I'm generating a button with a data-key attribute:

<button id="fooButton" data-key="1.4000">Click me to edit</button>

http://jsfiddle.net/KwjvA/

The value looks like a float, and one could argue that 1.4000 === 1.4, but what I really want here is the string "1.4000". Instead, .data() hands back the number 1.4. I certainly don't want my input to be modified. One suggestion I found in the issue tracker is to put single quotes around the value (i.e. data-key="'1.4000'"). That seems rather absurd as well.
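To make the conversion concrete, here is a minimal sketch in plain JavaScript of the kind of coercion described above. The coerce function is a hypothetical stand-in written for illustration; it is not jQuery's source and only approximates the behavior for a few simple cases.

```javascript
// Hypothetical stand-in for the type conversion applied when a data-*
// attribute is read. This is NOT jQuery's code, only an illustration.
function coerce(raw) {
    if (raw === "true") return true;
    if (raw === "false") return false;
    if (raw === "null") return null;
    // Numeric-looking strings become numbers, which drops trailing zeros
    if (raw.trim() !== "" && isFinite(raw)) return parseFloat(raw);
    return raw;
}

var raw = "1.4000";          // what the template wrote into data-key
var converted = coerce(raw); // what a converting reader hands back

console.log(typeof converted);          // "number"
console.log(String(converted));         // "1.4"
console.log(String(converted) === raw); // false: the original string is gone
```

Once the value has been turned into a number there is no way to recover the original string, which is why a key such as "1.4000" cannot safely pass through a converting reader.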

The only reason I'm warning about this here is that I've seen a bunch of libraries using .data() to store stuff in elements in the DOM. When a method is called .data(), you expect to be able to store something in it and get back out exactly what you put in. I think it is a really bad idea for it to behave any other way.

The recommended alternative is to use .attr(). The catch is that while it achieves the same effect here, it works quite differently from .data(): .data() stores information in an internal cache within jQuery, while .attr() calls element.setAttribute().

I read through several bug reports on the jQuery website, where people are also confused by this behavior and all of them get closed with a wontfix. I see this as a terrible choice. Yuck.

Update: Here is a bug I just filed, hopefully that explains things better to the people who seem to be having a hard time understanding what I'm talking about: http://bugs.jquery.com/ticket/11060

Thursday, December 8, 2011

Optimizing Web Application JavaScript Delivery

For my new company, I had the following design goals for my heavy use of JavaScript (JS):
  • I'm using CoffeeScript (CS), so I need to have my IDE automatically compile CS to JS when I save. That way the whole Change file, Save, Reload the browser process works cleanly.
  • Be able to split the files up depending on which page is loaded so that only the JS which is needed for the page is sent to the client.
  • Be able to differentiate JS files to be loaded between different states such as logged in, logged out and both. That way, once someone is already logged in, the JS which controls the login and forgot password dialog does not get served again. The flip side is that JS for logged in pages is not served up to anonymous users.
  • In development mode, have everything unminified, but in production mode, automatically minify everything.
  • Run all of the JS through the Closure compiler regardless of dev/prod so that I know that things that work in dev also work in prod.
  • Limit the number of <script> tags to the bare minimum. Ideally, 2-3 for .js files served from my site and not directly off of a CDN. Fewer loads means less network traffic.
  • Be able to transparently support new code / application versions so that when I upgrade the application, the browsers dump their cached copies of my files.
  • No dependencies on external xml, json, property or other configuration file formats to implement the goals above. Everything should be configured by someone editing the html templates.
In order to accomplish the goals above, I first looked at a bunch of solutions, but they all failed in various ways. So, I started on my own and went through various iterations before I came up with the ideal solution which I think is pretty unique and easy.

Let's start off by talking about one of the tools I'm using. LabJS enables me to load JS only when I need it. As part of the 'master' template which contains the skin for all pages, at the very bottom before the closing </body> tag I have something that looks like this:

<script src="/js/LAB.min.js"></script>
<script>
    var country = "${country}";
    var fbAppId = "${fb.APP_ID}";

    var js = '${tool.jsbuilder(
        me != null,
        'json2:both',
        'handlebars.1.0.0.beta.master:both',
        'bootstrap-twipsy:both',
        'bootstrap-popover:both',
        'gen/global/page:both',
        'gen/global/common:both',
        'gen/global/search:both',
        'gen/modal/loginDialog:out', // ! logged in
        'gen/global/loggedInMenu:in', // logged in
        'gen/global/master:both'
        )}';

    var lab = $LAB
        .script("//ajax.googleapis.com/ajax/libs/jquery/1.7/jquery.min.js")
        .wait()
        .script("//ajax.googleapis.com/ajax/libs/jqueryui/1.8.16/jquery-ui.js")
        .script("//connect.facebook.net/en_US/all.js")
        .script("//apis.google.com/js/plusone.js")
        .script("//platform.twitter.com/widgets.js")
        .wait()
        .script(js)
        ;

    // Variable "pagecode" should be a function that takes a LAB and does any page-specific loading
    if (typeof pagecode === 'function')
        pagecode(lab);
</script>

Since I'm using Cambridge Templates with JEXL to process things first, the ${tool.jsbuilder(...)} section runs some Java code which does a lot of the magic while the page is being rendered. The first argument is a boolean to indicate whether or not I'm logged in. 'me' is an object in the context, and if it is null I'm not logged in. The rest of the arguments are Strings, which arrive as a String[]. The method signature looks like this:

    public String jsbuilder(boolean loggedIn, String[] files);

What happens in that method is that it will parse the array of Strings, and based on a setting of 'in' for logged in, 'out' for logged out, or 'both' for either logged in or out, it will compare that to the loggedIn boolean and either load the appropriate JS file or not. The files are then loaded into memory, in order, and sent through the Closure compiler to minify the code.
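The selection step can be sketched as follows. The original is Java; this is a hypothetical re-sketch in JavaScript, and the name selectFiles is an assumption made for illustration. It only covers choosing the files, not loading or compiling them.

```javascript
// Hypothetical re-sketch of the selection step (the original is Java).
// Each entry is "path:state" where state is "in", "out", or "both".
function selectFiles(loggedIn, entries) {
    return entries
        .map(function (entry) {
            var idx = entry.lastIndexOf(":");
            return { path: entry.slice(0, idx), state: entry.slice(idx + 1) };
        })
        .filter(function (file) {
            if (file.state === "both") return true;
            return loggedIn ? file.state === "in" : file.state === "out";
        })
        .map(function (file) { return file.path; });
}

var entries = [
    "json2:both",
    "gen/modal/loginDialog:out",
    "gen/global/loggedInMenu:in",
    "gen/global/master:both"
];

console.log(selectFiles(true, entries).join(", "));
// json2, gen/global/loggedInMenu, gen/global/master
console.log(selectFiles(false, entries).join(", "));
// json2, gen/modal/loginDialog, gen/global/master
```

The order of the input is preserved, which matters because the files are concatenated in that order before being compiled.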

The output from Closure is then cached in a global HashMap which is never cleared out. (Note: for languages that don't really persist memory between requests, like PHP, you can store this data in something like memcached).

The key of the Map is generated from an MD5 hash of the list of filenames combined with the application version. The hash looks something like this: be712950814b2ccc6b92ff5c3. This hash is the String that is returned from the jsbuilder method. Using the names plus the application version ensures that a new hash is generated each time the application is upgraded.
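As a rough illustration of the key generation, here is a sketch using Node's built-in crypto module. The function name bundleKey, the separators, and the version strings are assumptions; the post does not say exactly how the names and version are combined.

```javascript
var crypto = require("crypto");

// Hypothetical key builder: MD5 over the ordered file names plus the
// application version. The separators and version format are assumptions.
function bundleKey(fileNames, appVersion) {
    return crypto
        .createHash("md5")
        .update(fileNames.join(",") + "|" + appVersion)
        .digest("hex");
}

var files = ["json2", "gen/global/page", "gen/global/master"];
var keyV1 = bundleKey(files, "1.0.0");
var keyV2 = bundleKey(files, "1.0.1");

console.log("/js2/" + keyV1 + ".js");
console.log(keyV1 === keyV2); // false: a new version yields a new URL
```

Because the version is part of the hashed input, upgrading the application changes the URL, so browsers fetch the new bundle instead of reusing a cached copy.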

In dev mode, the Map isn't used at all. The code is generated for each request, which ensures that my changes get immediately reflected in the browser. In production, the Map is first checked for the key and if it exists, the key is immediately returned from the jsbuilder method.
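The dev/prod split might look roughly like this. It is a simplified sketch under assumptions: getBundle, bundleCache, and the compile callback are hypothetical names, and the function returns the compiled source rather than the key to keep the example short.

```javascript
// Hypothetical cache wrapper. `compile` stands in for the Closure step
// and `key` for the MD5 key; both names are assumptions.
var bundleCache = new Map();

function getBundle(key, devMode, compile) {
    if (devMode) {
        return compile();                // always rebuild so edits show up
    }
    if (!bundleCache.has(key)) {
        bundleCache.set(key, compile()); // first production request pays
    }
    return bundleCache.get(key);
}

// Count how often the (fake) compile step actually runs
var builds = 0;
function fakeCompile() { builds += 1; return "/* minified js */"; }

getBundle("abc", false, fakeCompile);
getBundle("abc", false, fakeCompile);
console.log(builds); // 1: the second production call hit the cache

getBundle("abc", true, fakeCompile);
console.log(builds); // 2: dev mode bypasses the cache
```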

The final rendered page looks something like this to the web browser:

    var js = '/js2/be712950814b2ccc6b92ff5c3.js';

When the LabJS code executes in the browser and loads my script with the line .script(js), a Servlet listening for requests to /js2/*.js looks up the key from the URL in the Map and returns the appropriate JS data. This servlet can also set the correct browser cache headers depending on dev or prod.
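The lookup itself is small. Here is a hypothetical sketch of the logic as a plain function. The real thing is a Java Servlet, and the names resolveBundle and bundles, the 404 handling, and the specific Cache-Control values are all assumptions rather than what the site actually sends.

```javascript
// Hypothetical lookup used by a handler for /js2/*.js requests.
var bundles = new Map([["be712950814b2ccc6b92ff5c3", "/* minified js */"]]);

function resolveBundle(urlPath, devMode) {
    // Pull the key out of a path such as /js2/<key>.js
    var match = /^\/js2\/([0-9a-f]+)\.js$/.exec(urlPath);
    if (!match || !bundles.has(match[1])) {
        return { status: 404, headers: {}, body: "" };
    }
    return {
        status: 200,
        headers: {
            "Content-Type": "application/javascript",
            // Assumed policy: no caching in dev, long-lived in prod
            "Cache-Control": devMode ? "no-cache" : "public, max-age=31536000"
        },
        body: bundles.get(match[1])
    };
}

console.log(resolveBundle("/js2/be712950814b2ccc6b92ff5c3.js", false).status); // 200
console.log(resolveBundle("/js2/nope.js", false).status);                      // 404
```

A long cache lifetime is safe in production only because the key changes whenever the file list or application version changes.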

As you can see, 10+ separate files have been combined and minified into a single file which makes the requests more efficient. All without configuration files or a crazy syntax that only a backend developer can understand.

If I wanted to split the JS files into more loads so that the browser can take advantage of concurrent loading, I could do that as well by just adding more calls to jsbuilder. That is effectively what is happening in the pagecode section near the end of the <script> block above. The body template, which is loaded into the master template by Cambridge, can optionally define a JS function called pagecode. When it executes, it calls lab.script() again with similar output from the jsbuilder tool. This allows me to split up my code so that there is global code as well as page-specific code.
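A page-specific pagecode function might look something like the sketch below. The URL is a placeholder standing in for the output of another ${tool.jsbuilder(...)} call, and fakeLab is a hypothetical stub for the $LAB chain so that the example runs outside a browser.

```javascript
// Hypothetical body-template snippet. In the real page the URL would come
// from another ${tool.jsbuilder(...)} call; this hash is a placeholder.
var pageJs = "/js2/0123456789abcdef.js";

function pagecode(lab) {
    // Queue the page-specific bundle after the global scripts
    return lab.script(pageJs);
}

// Minimal stand-in for the $LAB chain so the sketch runs outside a browser
var loaded = [];
var fakeLab = {
    script: function (src) { loaded.push(src); return this; },
    wait: function () { return this; }
};

pagecode(fakeLab);
console.log(loaded.join(", ")); // /js2/0123456789abcdef.js
```

Because the master template only calls pagecode when it is defined as a function, pages that need no extra JS can simply leave it out.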

Enjoy.

Wednesday, December 7, 2011

Github Pages

I'm a huge fan of Github. Pull requests are the best innovation since the idea behind open source was created.

For one of my projects hosted on Google Code, I was recently asked to move it to Github so that someone could submit pull requests more easily. I happily complied because of how much I love Github.

As part of this move, I decided to finally explore Github Pages in order to publish the nicely formatted documentation of my project. I was thinking they'd be as great as pull requests, and after an hour of reading the documentation and installing everything, I was terribly disappointed.

The main issue is that it uses a site generation tool called Jekyll. While this tool is generally ok, it has quite a few major failings as a product for Github.
  • It is clearly a product of Not-Invented-Here syndrome. How many static website generators does this world need? Why did you feel the need to invent yet another one? Hell, I don't even want a static website generator, I just want to write some documentation.
  • I don't want 50 different options for generating content. Just give me Markdown. I don't even want 2 different types of Markdown. Just give me the one that works best, as default.
  • By default, it comes with nothing to help you design a site full of great looking documentation. Even just a default template setup would suffice. Some people have tried to create some helper projects, but they all have terrible UI and they all seem somewhat abandoned.
  • It requires me to become an expert in this tool. I have to install a bunch of random software, learn configuration files, learn a specific file layout. All I want to do is write some documentation that looks nice!
  • It was basically created for publishing blogs. Why is this being promoted as GH Pages? It seems rather absurd for a source code repository to provide a tool for publishing blogs, but not a tool for publishing great documentation.
Back over on Google Code, I just create some wiki pages, link them together with a table of contents (also a wiki page) and point at a specific url. Everything just works and looks great.

Github, please fix this!

Thursday, December 1, 2011

Social buttons

Today, I finally got around to implementing those social Like buttons that you see on all of the websites. Personally, I never click them, but it is clear lots of other people do, so I'm jumping on the bandwagon.

They look something like this:

The top ones are Facebook, then Twitter, then Google+.

Styling the first two rows of buttons with CSS is simple. They have class="fb-like" and class="twitter-share-button". I can move them around on my page and place them exactly where I want them quite easily.

What does Google have, you ask? Nothing. Zip. Zilch. Instead, it has id="___plusone_0", which is next to useless as a general CSS selector.

I know this is nitpicky on such a small issue, but an oversight like this seems hard to fathom. I've googled around a bit and it really makes me wonder: how come no other site designers have brought this up?

I was hoping that a workaround for this would be to just write it as a div instead of as the <g:plusone> element. But it turns out that the div loses its class attribute when it is rewritten by their JavaScript.

<div class="g-plusone" data-size="tall" ... ></div>

In the end, I just put my own div around the element, but that seems like such a kludge when the other services seem to do this correctly.

Maybe I'm reading too far into this because it is such a little thing, but it really makes me feel like Google doesn't understand the needs of site developers like the other Social players do.