Okay, so I'm probably the only one who didn't know this, but I've been wondering why it seems that every website owned by someone within a few degrees of separation from TimBL tends to use URLs of the form:


Just one of those things I figured kinda made sense, but was never sure why. Then today, after a bit of wandering while researching things RDF and SemanticWeb, I found a link from Sean B. Palmer pointing to Hypertext Style: Cool URIs don't change by TimBL himself. Seems the pattern is laid out right there by the man who started it.

Seems like it would work like a limited sort of concurrent versioning scheme, but it just looked wonky the first time I saw it. I mean, date-based website layout? I'd been raised on the highfalutin' directory trees made by very well (overly?) paid Information Architect types. /2000/10/stuff? What about /about-us/corporate/ceo.html?

Of course, this is ignoring the fact that some webservers need not directly tie physical disk layout to URL layout. Or that site architecture is best presented via links in the documents themselves. It's just that plain vanilla Apache uses a 1:1 match between file path and URL path, and that's what most everyone uses.
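To make that decoupling concrete for myself, here's a minimal Python sketch of a lookup table that maps a stable public URL path to wherever the file happens to live today. Everything in it is made up for illustration: the `ROUTES` table, the `resolve` function, and the storage paths are my own assumptions, not how Apache or any particular server does it.

```python
from typing import Optional

# Hypothetical routing table: the key is the public URL path that should
# never change; the value is wherever the file lives on disk right now.
ROUTES = {
    "/2000/10/stuff": "content/essays/stuff-v3.html",
    "/about-us/corporate/ceo.html": "content/people/ceo.html",
}


def resolve(url_path: str) -> Optional[str]:
    """Return the current storage path for a URL path, or None if unknown."""
    # Treat /2000/10/stuff/ and /2000/10/stuff as the same resource.
    key = url_path.rstrip("/") or "/"
    return ROUTES.get(key)


if __name__ == "__main__":
    print(resolve("/2000/10/stuff/"))
    print(resolve("/no/such/page"))
```

The point is only that the files can be shuffled around on disk by editing the right-hand side of the table, while the URLs people have bookmarked stay put.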

Hmm.. Might play with it a bit around here.


Archived Comments

  • Then you might want to check out some work I've done, mixing TBL's "Cool URLs don't change" with Ted Nelson's tumblers (my take on the idea): About the Boston Diaries and The Electric King James Bible. Also, I've kept the same structure to my website since I started it in 1993 or 1994 or thereabouts. I've found that both the stability and longevity of the site have helped my search engine rankings tremendously.
  • One of my problems with this style of URL is it doesn't seem to allow for alternate representations of the same info, e.g. file.html versus file.pdf. And while I'll admit that extensions are lousy conventions, they work. If I see a URL that ends in .pdf, then I know it might crash my browser, take longer, etc. So for me at least (and there are lots of geeky folks out there) extensions work. The Boston Diaries stuff is cool, imho.
  • But part of the HTTP protocol is content negotiation (which includes language preference): the client says “I can accept the following resource types ... ” and the server can then determine the best type to send. Ideally, if a client says “only plain text please,” then the server can convert from HTML to text (for now it can wimp out and send nothing). Extensions can be completely hidden. I also don't think you'll be seeing dynamically generated PDF files à la The Electric King James any time soon. Heck, I've yet to see any other site send down dynamically generated HTML like I've done ...
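
For anyone curious what the content negotiation described in that last comment might look like, here's a rough Python sketch of picking a representation from an Accept header. It's a simplification under stated assumptions: the function names (`parse_accept`, `matches`, `negotiate`) are mine, it only understands the `q` parameter, and it ignores the specificity rules a real server would apply, so treat it as an illustration rather than how any actual server does it.

```python
def parse_accept(header):
    """Parse an Accept header into (media_type, q) pairs, best q first."""
    prefs = []
    for part in header.split(","):
        fields = [f.strip() for f in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        q = 1.0  # a missing q parameter means full preference
        for param in fields[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    q = float(value)
                except ValueError:
                    q = 0.0  # unreadable q: treat the type as unwanted
        prefs.append((media_type, q))
    # Python's sort is stable, so ties keep the order the client sent.
    return sorted(prefs, key=lambda pref: pref[1], reverse=True)


def matches(pattern, media_type):
    """True if an Accept pattern such as text/* covers media_type."""
    if pattern == "*/*":
        return True
    p_type, _, p_sub = pattern.partition("/")
    m_type = media_type.partition("/")[0]
    if p_sub == "*":
        return p_type == m_type
    return pattern == media_type


def negotiate(accept_header, available):
    """Return the first available type the client accepts, or None."""
    for pattern, q in parse_accept(accept_header):
        if q <= 0:
            continue  # q=0 means "do not send me this"
        for media_type in available:
            if matches(pattern, media_type):
                return media_type
    return None


if __name__ == "__main__":
    on_offer = ["text/html", "application/pdf"]
    print(negotiate("application/pdf;q=0.2, text/html", on_offer))
    print(negotiate("text/plain", on_offer))
```

With something like this on the server side, one extensionless URL can hand back HTML or PDF depending on what the client asks for, and a `None` result is the "wimp out and send nothing" case.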