Monday, August 25, 2008

I want one of these

Monday, August 18, 2008

Y Combinator Startup Ideas

My comments on this list.
  1. I don't have a good answer.
  2. Flock, BlackTop
  3. NewsTrust
  4. IT won't ever go away.
  5. JBoss
  6. Will always be custom.
  7. An enterprise porn affiliate tracking system. The existing solutions suck.
  8. Very interesting to me.
  9. Flickr (the only 'web company' I give money to... happily.)
  10. Yawn
  11. Yawn
  12. We need more creative people.
  13. Interesting, but schools are difficult to work with.
  14. Interesting idea, but it is more about combating politics than measuring anything.
  15. Nice, but a huge initial investment.
  16. Google won, get over it.
  17. VERY interesting to me. I've got an idea in this area, but I don't know the solution for it yet. =(
  18. Yawn.
  19. Yawn.
  20. Yelp
  21. Mint
  22. Yawn, too specific.
  23. I agree, Wikipedia is almost impossible to put new content into.
  24. I hate extortionists, don't you?
  25. Yea, I've had no luck selling stuff on CL recently. It is full of too much crap.
  26. Too much competition.
  27. Huh?
  28. I like gmail, sure it has some warts, but it works well enough for me and I like the freedom it gives me.
  29. Yawn
  30. Sales pitch.

Saturday, August 16, 2008

Enterprise Java

Jim recently blogged about what Enterprise worthy means to him. I agree fully with what he says: if Ruby had the same level of infrastructure that Java does, it might be considered enterprise worthy, just like Java. This won't happen for a very long time though.

Why? Simply put, Java has a huge head start. IDEs such as Eclipse have also made writing code a no-brainer, with features such as instant compilation (remember when javac was slow?), refactoring and code completion. I just don't see C or Ruby having IDE support at the same level as Java for a long long time, if ever.

What makes Java enterprise worthy?

JBoss
EJB3/Hibernate (clustered second level and query caching)
JMS
JMX
SOAP/Axis
IDE support (Eclipse - easy code re-factoring is HUGE)
Velocity
JSP taglibs
GWT (Google Web Toolkit)
Jakarta (and similar projects with large amounts of released libraries)

I mention the above tools because I successfully work with nearly all of those technologies on a daily basis. Once you see millions and millions of hits being reliably served off of servers that are literally sleeping, using those technologies, you become sold on the idea that it is possible to create truly scalable web applications with a relatively minimal amount of work. For the most part, EJB3 and the associated annotations are actually a pleasure to work with.

Sam Ruby had a posting recently about how he was struggling to make Ruby (the language) scale to serve a feed. He talks about page caching, etags, simpler get support, action caching, fragment caching and sweepers. Oh my god, six different caching layers? That sounds like a train wreck to maintain, debug and understand. I like to call that a brittle system and I try to avoid anything that complicated as much as possible. After all these years, doing all this weird caching just to make something scale seems so brain dead to me. These problems have been solved and transparently implemented in the Java world for ages now, so why not make life easy and just use what works? For my feed generation (I do a lot of it in my support for PicLens), I use simple Velocity templates and populate the context with data objects that I get from my EJB3 entities. Nowhere do I need to worry about caching because the JVM performs as expected and the rest of the database caching is all handled for me. The code is simple and it all just performs like magic.

Even if things did slow down (and we have meetings about this at work every once in a while), it all becomes a matter of adding more hardware to the cluster and/or making sure the caches are being allocated properly. I'm serving 15+ heavily trafficked porn sites on 3 standard server configurations. We really only need one server for all of this but we have 3 just for redundancy. It is relatively simple and it just works. I like that and that is what makes Java Enterprise worthy to me.

How I read the NYTimes.com website

I often get links to articles on the NYTimes.com website. When I go to the website, I'm presented with a login/register page to create a free account. I then go to bugmenot.com and use one of their fake accounts to log in to the NYTimes website. I'm then taken to some interstitial page where I have to either view an advertisement or click a continue link. I then read the article.

Some answers to questions you might have:

Q: Why is it that every time I visit the site, I have to go to bugmenot again?
A: Because NYTimes is playing a whack-a-mole game with bugmenot.com. Every time a new account gets created on bugmenot, they close it down within a day or two.

Q: Why don't I just create my own account?
A: Two reasons: First, I've tried creating my own accounts and they always seem to just stop working after a while. Second, I really don't see the point of forcing me to create an account to read a newspaper online. It should be a value-add sort of deal. If I want to comment on a story, make me create an account.

I guess I just don't understand why the NYTimes runs its website like it does. It gives me a bad feeling every time I visit their website and I'm sure I'm not the only one. Why would you want to do that to people?

Friday, August 8, 2008

Download Managers

It has come to my attention that download managers suck. The reason is that they have not agreed on an easy way (i.e., an API) for websites to authenticate users so they can download content from a CDN in a secure way.

We use a CDN that allows us to create a token which we pass to the CDN in a cookie or URL. The CDN authenticates that token and provides access to any request that contains it. The token is simple: it is an MD5 hash of a shared secret, a future expiration time and a path to match against. It looks something like this: MD5(mySecret/content/protected.ext?e=1182665958). The URL to download the content then looks like this: /content/protected.ext?e=1182665958&h=886dbef7390dfd70aea27fd41e459e7f. Everything after the ? can either be put into a cookie or passed on the query string as described above.
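To make the token scheme concrete, here is a rough sketch in Python. It assumes the hash input is simply the secret concatenated with the path and the `?e=` expiration, as in the MD5(...) example above. The function names (`make_token`, `signed_url`, `is_valid`) are made up for illustration, and a real CDN may build or check the token differently.

```python
import hashlib
import hmac


def make_token(secret: str, path: str, expires: int) -> str:
    """MD5 hex digest of secret + path + '?e=' + expiration (assumed layout)."""
    signed = f"{secret}{path}?e={expires}"
    return hashlib.md5(signed.encode("utf-8")).hexdigest()


def signed_url(secret: str, path: str, expires: int) -> str:
    """Build the download URL with the expiration and hash on the query string."""
    return f"{path}?e={expires}&h={make_token(secret, path, expires)}"


def is_valid(secret: str, path: str, expires: int, token: str, now: int) -> bool:
    """What the CDN side might do: reject expired tokens, then compare hashes."""
    if now >= expires:
        return False
    expected = make_token(secret, path, expires)
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(expected.encode("utf-8"), token.encode("utf-8"))


if __name__ == "__main__":
    # "mySecret" is the placeholder secret from the example above
    print(signed_url("mySecret", "/content/protected.ext", 1182665958))
```

Because the path is part of the hashed string, a token issued for one file does not validate for another, and changing the `e` value invalidates the hash.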

Now, the problem with download managers is that you can't easily script the generation of those tokens. So, anyone using a download manager has to hit the site, grab the cookie and then put the cookie into the download manager along with the urls. This is a royal pain in the ass.

If download managers supported a RESTful API such as:

https://sitename.com/getToken?username=USERNAME&password=PASSWORD&path=urlencode(PATH)

Then, when I receive a request like the one above, all I would need to do is authenticate the user, check to make sure they are allowed access to that path and return a token. If the download manager gets back a 403 Forbidden, then the token probably expired and the download manager could then just request a new token.
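Here is a rough sketch in Python of both halves of that exchange, under a pile of assumptions: the in-memory `USERS` table, the one-hour `TOKEN_TTL`, the 401 for bad credentials, and the names `get_token` and `download` are all invented for illustration. The HTTP layer is left out; the download manager side just takes callables for "request a token" and "fetch a URL".

```python
import hashlib
import hmac
import time

SECRET = "mySecret"   # placeholder for the secret shared with the CDN
TOKEN_TTL = 3600      # assumed token lifetime in seconds

# Hypothetical user store: username -> (password, allowed path prefixes)
USERS = {"alice": ("s3cret", ["/content/"])}


def get_token(username, password, path, now=None):
    """Site side: authenticate, check path access, return (status, token)."""
    now = int(time.time()) if now is None else now
    user = USERS.get(username)
    if user is None or not hmac.compare_digest(
        user[0].encode("utf-8"), password.encode("utf-8")
    ):
        return 401, ""   # assumption: bad credentials get a 401
    if not any(path.startswith(prefix) for prefix in user[1]):
        return 403, ""   # user is not allowed to download this path
    expires = now + TOKEN_TTL
    digest = hashlib.md5(f"{SECRET}{path}?e={expires}".encode("utf-8")).hexdigest()
    return 200, f"e={expires}&h={digest}"


def download(path, username, password, request_token, fetch, max_attempts=2):
    """Download manager side: get a token, fetch, and retry once on a 403."""
    for _ in range(max_attempts):
        status, token = request_token(username, password, path)
        if status != 200:
            raise PermissionError(f"token request failed with status {status}")
        status, body = fetch(f"{path}?{token}")
        if status == 200:
            return body
        if status != 403:
            raise OSError(f"download failed with status {status}")
        # 403: the token probably expired, so loop and ask for a fresh one
    raise PermissionError("still forbidden after requesting a new token")
```

A real download manager would probably hold on to the token and reuse it across files until it sees a 403, rather than asking for a new one per request as this sketch does.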

I would be more than happy to implement something like that.

P.S. Kink has a system called Warden that implements a token-based authentication scheme similar to the one above but works independently of a CDN. We will be making it open source as soon as I have some free time to put it up online.