Wednesday, January 25, 2012

GitHire spam again...

Just keeping this around for posterity since they deleted the HN posting, and maybe others will want to see the truth if they are googling around for why they are getting spam from these guys.

http://news.ycombinator.com/item?id=3508655

I called them out for being the spammers that they are, and I got a rather odd response:
Hi LatchKey,
I'm really sorry that we sent you that email. We just launched a little over a week ago with this crazy idea, and were extremely surprised at how quickly we were overwhelmed with orders.
We made a bad judgement call in sending some emails to people asking if anyone is interested in jobs.
If it makes you feel any better, you can see that we aren't finding very many talented engineers, and we will likely need to refund a lot of money in a few days.
We are honestly trying to be a great service for software developers and employers. We need feedback from people like yourself to learn how we can be the best service possible to reshape the hiring industry.
We actually sent you an email, but never heard back. Please let us know if you're interested in continuing this discussion further on or off of a public message board.
Thanks for keeping us honest.
The HN thread goes on with a lot of people pointing out their own issues with GitHire and I have my response to the above quote here: http://news.ycombinator.com/item?id=3508750

Sorry, but I just don't have any tolerance for spammers or people trying to profit off my profile without my permission.

Sunday, January 22, 2012

Scalable System Architecture Comedy

I was reading Scaling a PHP MySQL Web Application, which is a technical document published on the Oracle website. As I was scrolling down the page, I saw the typical Load balancing Figure 1 that you always see in any PHP/MySQL web application.


But then, as the article goes on, it gets more entertaining. It goes on to Figure 3, showing Multiple MySQL Slaves, which is now 4 machines.

But wait, there's more. Now you need a dedicated database slave for each Web server, so the picture expands to even more lines and arrows in Figure 4. A total of 8 machines.


As you keep scrolling, you get to Figure 5. A real gem of an image. Arrows in every direction. Arrows jumping through other arrows. 8 machines, but a completely incomprehensible image.


Ok, now we've randomized the connections between all the web servers and database slaves.

Could you imagine one of these machines going down or throwing errors and trying to figure out which one it is or how it connects to the other machines?

We all know that as systems grow, they get more complex. That said, if you draw an incomprehensible picture of your architecture, it is a clear sign that you are doing it wrong.

Monday, January 16, 2012

GitHire.com is a spammer

I've kept up with all the recent Hacker News articles on GitHire. They seemed like a rather interesting service because I believe that hosting your projects on a site like GitHub is the best kind of resume a software engineer can have.

That is, until they just spammed me with a whole list of completely random and unrelated jobs. I could understand it if I signed up to their site and requested spam (aka: LinkedIn), but I'm definitely not interested as I'm in the middle of starting my own company!


After checking out their site, I realized they've also got a profile up for me that I had no hand in creating, nor any desire to have. Apologies if that link stops working, hopefully they will remove my information soon. Maybe I should be pissed that I'm only in their Top 50%? ;-)


Since I wanted to remove myself, I clicked the "Opt out of Githire" link, which then takes me to a page on Github to authorize their application?!?


Hell no, I'm not going to authorize your application, just so I can opt out of your website. That is wrong on so many levels.

Anyway, I cc'd support@github on my response to 'Steve', so hopefully they will be going away soon. I can't see how GitHub is allowing this company to exist, when they so clearly violate their terms of service policy.

Friday, January 6, 2012

Thursday, January 5, 2012

Going on 6 months now...

Here is a bit of a status update for the new year:

We've been working full steam ahead on Voost, going on 6 months now. In that timeframe, Jeff and I have accomplished an impressive amount of work for just two people. I've been putting in 10-15 hour days, seven days a week, of solid coding and Jeff has been doing the same. When he or I are out of town, we sit on Skype all day (and night) long working through any issues we have and bouncing ideas off each other. It has been a hugely productive cooperative development effort.

I've become a much better UX/UI designer than when I started. It had been years since I had worked on this side of things and it has been a lot of fun picking it back up. I've also become an absolute expert in CoffeeScript, jQuery, Less and all of the other hot front end technologies that are out there. On the back end, we've integrated with BrowserID for secure sign in as an alternative to Facebook Connect. We've also switched to Objectify 4 which is the most advanced way to interact with the Google App Engine database backend.

The sad face news is we have nothing public to show for all of this hard work quite yet. I could go on with a list of reasons, but they aren't really worth going over in detail. Suffice it to say, we just aren't ready to launch. I'd say that we are about 85-90% of the way there. Hopefully not more than about a month or so. For a few of my friends, what we have is enough and they are pressuring me to just put something out there, even if it is incomplete or buggy. I'm pushing back on them.

While I realize the cycling season is quickly picking up, I'm not in a huge hurry. Thankfully, after years of penny pinching, I have enough savings left to last me until we do launch. I really want to do this right. I want all of my cards on the table. I want people to wonder why nobody has done a site this good before. As cheesy as it sounds, I expect something close to perfection, even if it isn't absolutely feature complete. I think of how the original iPhone disrupted the cell phone market. We went from the clam shells and keyboards to touch screens overnight. It may sound silly, but I'm passionate about doing something like that with the event registration market.

Even without all of the features that other companies have, our application is light years more advanced than any other registration product out there. I know this because I've seen their systems, analyzed everything wrong with them and spent the time to come up with a vastly better designed user experience. This takes a lot of hard work and this will be a huge differentiator in the marketplace for us. I'm very proud of that fact. It will be very expensive and nearly impossible for our competitors to hire enough engineering talent to catch up with us.

A question I get a lot is: do you have any customers? Well, we don't. Not yet. I'm ok with that because I do have enough contacts and relationships to get the word out there to promoters. I think that people also really want this product, so when we launch, it will almost sell itself. I can't tell you how many times I've heard 'I hate XYZ's excessive fees!' and 'This XYZ registration site is so difficult to use!'

Besides, the cycling community, our initial target audience, is very small and I don't want to really start pressuring promoters to try out a system that isn't launched yet. I sure wouldn't trust anyone who doesn't have a live product. On the flip side, if I was a competitor, I'd be really scared of us right now. We are going to be very hungry for customers and it will be that much harder to compete when we have a better product with better pricing.

A bit of good news is that we are close to having a great company logo. We put a bounty up on one of those crowd sourced websites full of designers and got a number of excellent designs, out of over a hundred submissions. We are in the process of choosing the final one over the next couple of days. I look forward to announcing it.

Thanks for listening. Thanks to all my friends and family for the encouragement and advice. Thanks to my wife for putting up with me working all the time. Thanks to everyone who has offered to help. Expect another update soon. This is going to be a lot of fun!

Sunday, December 18, 2011

Don't use the jQuery .data() method. Use .attr() instead.

I just discovered the hard way that the jQuery .data() method is horribly broken. By design, it attempts to convert whatever you put into it into a native type.

I've got a template where I'm generating a button with a data-key attribute:

<button id="fooButton" data-key="1.4000">Click me to edit</button>

http://jsfiddle.net/KwjvA/

It looks like a float, so one could assume that 1.4000 === 1.4, and that is exactly what .data('key') hands back: the number 1.4. What I really want here is the string 1.4000. I certainly don't want my input to be modified. One suggestion I found in the issue tracker is to put single quotes around the field (i.e. data-key="'1.4000'"). That seems rather absurd as well.
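To make the conversion concrete, here is a minimal sketch in plain JavaScript. It is only an approximation of the coercion jQuery applies when reading a data-* attribute, not the library's source, and coerceDataAttr is a made-up name for illustration:

```javascript
// Simplified approximation of the coercion applied when .data() reads a
// data-* attribute. Hypothetical helper, not jQuery's actual code.
function coerceDataAttr(raw) {
    if (raw === "true") return true;
    if (raw === "false") return false;
    if (raw === "null") return null;
    // Numeric-looking strings are parsed, so trailing zeros are lost.
    if (!isNaN(parseFloat(raw)) && isFinite(raw)) return parseFloat(raw);
    return raw;
}

var viaData = coerceDataAttr("1.4000"); // what .data("key") would give back
var viaAttr = "1.4000";                 // what .attr("data-key") would give back

console.log(typeof viaData, viaData);   // number 1.4
console.log(typeof viaAttr, viaAttr);   // string 1.4000
```

Once the value has been turned into a number, the original formatting is gone for good, which is why reading the raw attribute is the only way to get the string back.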

The only reason why I'm warning about this here is that I've seen a bunch of libraries using .data() to store stuff in elements in the DOM. I think it is a really bad idea to have a method called .data() where you expect to be able to store something in it and get back exactly what you put in, but can't.

The recommended alternative is to use .attr(). The problem is that while it achieves the same effect here, it works quite differently from .data(). .data() stores information in an internal cache within jQuery, while .attr() calls element.setAttribute().

I read through several bug reports on the jQuery website, where people are also confused by this behavior and all of them get closed with a wontfix. I see this as a terrible choice. Yuck.

Update: Here is a bug I just filed, hopefully that explains things better to the people who seem to be having a hard time understanding what I'm talking about: http://bugs.jquery.com/ticket/11060

Thursday, December 8, 2011

Optimizing Web Application JavaScript Delivery

For my new company, I had the following design goals for my heavy use of JavaScript (JS):
  • I'm using CoffeeScript (CS), so I need to have my IDE automatically compile CS to JS when I save. That way the whole Change file, Save, Reload the browser process works cleanly.
  • Be able to split the files up depending on which page is loaded so that only the JS which is needed for the page is sent to the client.
  • Be able to differentiate JS files to be loaded between different states such as logged in, logged out and both. That way, once someone is already logged in, the JS which controls the login and forgot password dialog does not get served again. The flip side is that JS for logged in pages is not served up to anonymous users.
  • In development mode, have everything unminified, but in production mode, automatically minify everything.
  • Run all of the JS through the Closure compiler regardless of dev/prod so that I know that things that work in dev also work in prod.
  • Limit the number of <script> tags to the bare minimum. Ideally, 2-3 for .js files served from my site and not directly off of a CDN. Fewer loads means less network traffic.
  • Be able to transparently support new code / application versions so that when I upgrade the application, the browsers dump their cached copies of my files.
  • No dependencies on external xml, json, property or other configuration file formats to implement the goals above. Everything should be configured by someone editing the html templates.
In order to accomplish the goals above, I first looked at a bunch of solutions, but they all failed in various ways. So, I started on my own and went through various iterations before I came up with the ideal solution which I think is pretty unique and easy.

Let's start off by talking about one of the tools I'm using. LabJS enables me to load JS only when I need it. As part of the 'master' template which contains the skin for all pages, at the very bottom before the closing </body> tag I have something that looks like this:

<script src="/js/LAB.min.js"></script>
<script>
    var country = "${country}";
    var fbAppId = "${fb.APP_ID}";

    var js = '${tool.jsbuilder(
        me != null,
        'json2:both',
        'handlebars.1.0.0.beta.master:both',
        'bootstrap-twipsy:both',
        'bootstrap-popover:both',
        'gen/global/page:both',
        'gen/global/common:both',
        'gen/global/search:both',
        'gen/modal/loginDialog:out', // ! logged in
        'gen/global/loggedInMenu:in', // logged in
        'gen/global/master:both'
        )}';

    var lab = $LAB
        .script("//ajax.googleapis.com/ajax/libs/jquery/1.7/jquery.min.js")
        .wait()
        .script("//ajax.googleapis.com/ajax/libs/jqueryui/1.8.16/jquery-ui.js")
        .script("//connect.facebook.net/en_US/all.js")
        .script("//apis.google.com/js/plusone.js")
        .script("//platform.twitter.com/widgets.js")
        .wait()
        .script(js)
        ;

    // Variable "pagecode" should be a function that takes a LAB and does any page-specific loading
    if (typeof pagecode === 'function')
        pagecode(lab);
</script>

Since I'm using Cambridge Templates with JEXL to process things first, the ${tool.jsbuilder(...)} section runs some Java code which does a lot of the magic during the rendering portion of the page. The first argument is a boolean to indicate whether or not I'm logged in. 'me' is an object in the context and if it is null I'm not logged in. The rest of the arguments are String[]. The method signature looks like this:

    public String jsbuilder(boolean loggedIn, String[] files);

What happens in that method is that it will parse the array of Strings, and based on a setting of 'in' for logged in, 'out' for logged out, or 'both' for either logged in or out, it will compare that to the loggedIn boolean and either load the appropriate JS file or not. The files are then loaded into memory, in order, and sent through the Closure compiler to minify the code.
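The in/out/both filtering described above can be sketched roughly as follows. The real method is Java; this is a JavaScript illustration only, and selectFiles is a hypothetical name:

```javascript
// Hypothetical helper: decide which "name:state" entries apply to a request.
function selectFiles(loggedIn, specs) {
    return specs
        .map(function (spec) {
            var idx = spec.lastIndexOf(":");
            return { name: spec.slice(0, idx), state: spec.slice(idx + 1) };
        })
        .filter(function (entry) {
            return entry.state === "both" ||
                (entry.state === "in" && loggedIn) ||
                (entry.state === "out" && !loggedIn);
        })
        .map(function (entry) { return entry.name; });
}

var specs = [
    "json2:both",
    "gen/modal/loginDialog:out",
    "gen/global/loggedInMenu:in",
    "gen/global/master:both"
];

console.log(selectFiles(true, specs).join(", "));
// json2, gen/global/loggedInMenu, gen/global/master
console.log(selectFiles(false, specs).join(", "));
// json2, gen/modal/loginDialog, gen/global/master
```

The order of the input list is preserved, which matters because the files are concatenated in that order before being minified.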

The output from Closure is then cached in a global HashMap which is never cleared out. (Note: for languages that don't really persist memory between requests, like PHP, you can store this data in something like memcached).

The key of the Map is generated from an MD5 hash of the list of filenames combined together + application version. The hash looks something like this: be712950814b2ccc6b92ff5c3. This hash is the String that is returned from the jsbuilder method. Using the names + application version ensures a new hash will be generated each time the application is upgraded.
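A rough sketch of that key derivation, written for Node.js using its built-in crypto module. The function name, the separators and the version strings are assumptions made for this example, not the actual implementation:

```javascript
var crypto = require("crypto");

// Hypothetical helper: derive a cache key from the selected file names
// plus the application version.
function cacheKey(fileNames, appVersion) {
    return crypto
        .createHash("md5")
        .update(fileNames.join(",") + "|" + appVersion)
        .digest("hex");
}

var names = ["json2", "gen/global/page", "gen/global/master"];
var keyV1 = cacheKey(names, "1.0.0");
var keyV2 = cacheKey(names, "1.0.1");

console.log(keyV1 === cacheKey(names, "1.0.0")); // true: same inputs, same key
console.log(keyV1 === keyV2);                    // false: new version, new key
```

Because the version is part of the hashed input, every upgrade produces a new URL, so browsers fetch the new file instead of reusing a cached copy.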

In dev mode, the Map isn't used at all. The code is generated for each request, which ensures that my changes get immediately reflected in the browser. In production, the Map is first checked for the key and if it exists, the key is immediately returned from the jsbuilder method.
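As an illustration of the dev/prod split, here is a small JavaScript sketch. The Map stands in for the global HashMap, and getBundle, build and fakeBuild are hypothetical names used only here:

```javascript
// Hypothetical in-memory cache, standing in for the global HashMap.
var bundleCache = new Map();

// `key` is the hash described above; `build` is any function that returns
// the combined, minified source (an assumption for this sketch).
function getBundle(key, devMode, build) {
    if (devMode) {
        return build();                // always rebuild, never touch the cache
    }
    if (!bundleCache.has(key)) {
        bundleCache.set(key, build()); // first request pays the build cost
    }
    return bundleCache.get(key);
}

var builds = 0;
function fakeBuild() { builds += 1; return "/* minified js */"; }

getBundle("abc123", false, fakeBuild);
getBundle("abc123", false, fakeBuild);
console.log(builds); // 1: the second production call hit the cache

getBundle("abc123", true, fakeBuild);
console.log(builds); // 2: dev mode rebuilt
```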

The final rendered page looks something like this to the web browser:

    var js = '/js2/be712950814b2ccc6b92ff5c3.js';

When the LabJS code executes in the browser and loads my script with the line .script(js), there is a Servlet listening for requests to /js2/*.js and it looks up the key from the url in the Map and returns the appropriate JS data. This servlet can also set the correct browser cache headers depending on dev or prod.
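The lookup side could look something like the sketch below. The real thing is a Java Servlet; this JavaScript version only illustrates the logic, and the function name, the response shape and the specific cache header values are all assumptions:

```javascript
// Hypothetical request handler logic: map "/js2/<key>.js" to a cached bundle.
function serveBundle(path, cache, devMode) {
    var match = /^\/js2\/([0-9a-f]+)\.js$/.exec(path);
    if (!match || !cache.has(match[1])) {
        return { status: 404, headers: {}, body: "" };
    }
    return {
        status: 200,
        headers: {
            "Content-Type": "application/javascript",
            // Assumed policy: no caching in dev, long-lived caching in prod.
            // Long-lived is safe because the key changes with each version.
            "Cache-Control": devMode ? "no-cache" : "public, max-age=31536000"
        },
        body: cache.get(match[1])
    };
}

var cache = new Map([["be71", "/* bundle */"]]);
console.log(serveBundle("/js2/be71.js", cache, false).status); // 200
console.log(serveBundle("/js2/dead.js", cache, false).status); // 404
```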

As you can see, 10+ separate files have been combined and minified into a single file which makes the requests more efficient. All without configuration files or a crazy syntax that only a backend developer can understand.

If I wanted to split the JS files into more loads so that the browser can take advantage of concurrent loading, I could do that as well by just creating more calls to jsbuilder. That is effectively what is happening in the pagecode section near the end of the <script> element above. The body template, which is loaded into the master template by Cambridge, optionally defines a JS function called pagecode. When it executes, it calls lab.script() again with similar output from the jsbuilder tool. This allows me to split up my code so that there is global code as well as page specific code.
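A page-specific hook might look roughly like this. The bundle path is a made-up example of what a second jsbuilder call could render, and the tiny lab object is only a stand-in for the LabJS chain so the sketch runs outside a browser:

```javascript
// Page-specific hook, as a body template might define it. The path below is
// a hypothetical rendered value, not a real bundle.
function pagecode(lab) {
    lab.script("/js2/0123456789abcdef0123456789abcdef.js");
}

// Minimal stand-in for the LabJS chain: records what would be loaded.
var loaded = [];
var lab = {
    script: function (src) { loaded.push(src); return this; },
    wait: function () { return this; }
};

// Same guard the master template uses.
if (typeof pagecode === "function") {
    pagecode(lab);
}

console.log(loaded.length); // 1
```

Pages that have no extra scripts simply don't define pagecode, and the typeof guard skips the call.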

Enjoy.

Wednesday, December 7, 2011

Github Pages

I'm a huge fan of Github. Pull requests are the best innovation since the idea behind open source was created.

For one of my projects hosted on Google Code, I was recently asked to move it to Github so that someone could submit pull requests more easily. I happily complied because of how much I love Github.

As part of this move, I decided to finally explore Github Pages in order to publish the nicely formatted documentation of my project. I was thinking they'd be as great as pull requests, and after an hour of reading the documentation and installing everything, I was terribly disappointed.

The main issue is that it uses a site generation tool called Jekyll. While this tool is generally ok, it has quite a few major failings as a product for Github.
  • It is clearly a product of Not-Invented-Here syndrome. How many static website generators does this world need? Why did you feel the need to invent yet another one? Hell, I don't even want a static website generator, I just want to write some documentation.
  • I don't want 50 different options for generating content. Just give me Markdown. I don't even want 2 different types of Markdown. Just give me the one that works best, as default.
  • By default, it comes with nothing to help you design a site full of great looking documentation. Even just a default template setup would suffice. Some people have tried to create some helper projects, but they all have terrible UI and they all seem somewhat abandoned.
  • It requires me to become an expert in this tool. I have to install a bunch of random software, learn configuration files, learn a specific file layout. All I want to do is write some documentation that looks nice!
  • It was basically created for publishing blogs. Why is this being promoted as GH Pages? It seems rather absurd for a source code repository to provide a tool for publishing blogs, but not a tool for publishing great documentation.
Back over on Google Code, I just create some wiki pages, link them together with a table of contents (also a wiki page) and point at a specific url. Everything just works and looks great.

Github, please fix this!

Thursday, December 1, 2011

Social buttons

Today, I finally got around to implementing those social Like buttons that you see on all of the websites. Personally, I never click them, but it is clear lots of other people do so I'm going with the bandwagon.

They look something like this:

The top ones are Facebook, then Twitter, then Google+.

Styling the first two rows of buttons with CSS is simple. They have class="fb-like" and class="twitter-share-button". I can move them around on my page and place them exactly where I want them quite easily.

What does Google have, you ask? Nothing. Zip. Zilch. Instead, it has id="___plusone_0" which means that it is somewhat useless as a general CSS selector.

I know this is nit picky on such a small issue, but an oversight like this seems hard to fathom. I've google'd around a bit and it really makes me wonder, how come no other site designers have brought this up?

I was hoping that a workaround for this would be to just write it as a div instead of as the <g:plusone> element. But it turns out that the div loses the class attribute when it is re-written by their JavaScript.

<div class="g-plusone" data-size="tall" ... ></div>

In the end, I just put my own div around the element, but that seems like such a kludge when the other services seem to do this correctly.

Maybe I'm reading too far into this because it is such a little thing, but it really makes me feel like Google doesn't understand the needs of site developers like the other Social players do.

Thursday, November 17, 2011

Contributing to Open Source

I've been working on various open source projects since around 1993. Long before I even really thought of it as open source. It just seemed natural to me to make the fixes I needed and contribute them back. It was always a bit of a challenge to figure out how to get my fixes to the developers. Obviously, they don't know me, so they aren't going to just let me write the files directly. So, I end up sending patches via email or some other means.

Over the years, the process for contributing to projects has gotten easier. Even more recently, it has grown by leaps and bounds thanks to Github.

Case in point. I've been using the Twitter Bootstrap project for parts of the design of my new company Voost. I like the project a lot. Like millions of other projects, it is hosted on Github.

Yesterday, I noticed a small bit of documentation was missing, so I forked the project by clicking a button on the website, created a branch to work on (git checkout -b docadditions), edited the documentation, committed and published my changes and then created a pull request which tells the developers of bootstrap that I have something to contribute:

https://github.com/twitter/bootstrap/pull/647

Mark, one of the developers, who I've never met in my life, was able to take my contributions and combine them with his code by simply clicking a button on the website. Yes, it was that easy.

I also had an enhancement request... so I created an issue...

https://github.com/twitter/bootstrap/issues/646

It was resolved in a few hours with just a small bit of effort. I can then merge his changes into my local fork of the project with a couple easy commands. We stay in perfect sync together.

Bam. That is how collaborative development should work.

As a comparison, in the past, I've done a huge amount of work for the Apache Software Foundation. They have a great open source license, and a huge following. But, they don't use github.

With the ASF, it feels like 1993 again. For each project I want to contribute to, it feels like I'm making a lifetime commitment to that project.

I have to go to the project website and navigate around to figure out how to join a mailing list. This takes several contextual steps in an email client. I need to make sure to set up a mail filter to deal with a potentially insane amount of email that I really don't care about. Then, I email a patch to the list (or put it up on gist / pastebin)... and I hope maybe one of the developers might be watching my carefully crafted subject line. Chances are that nobody would respond or the email would get lost, so I'd have to keep nagging people because everyone is busy...

I don't really contribute to the ASF nearly as much anymore.