
URLs are for People, not Computers

210 points | Kop | 13 years ago | not-implemented.com

152 comments

[+] potatolicious|13 years ago|reply
The example about Amazon is inaccurate.

Here is an Amazon URL:

http://www.amazon.com/Bioshock-Infinite-Premium-Edition-Xbox...

Is it completely clean? Nope. It contains a lot of information that feeds into the backend, but the core URL is this:

http://www.amazon.com/Bioshock-Infinite-Premium-Edition-Xbox...

This URL will take you to the correct page, every time, and it doesn't take a genius to figure this out. It also doesn't take a genius to figure out what this page is about before you even paste the link into your browser. By putting the human-relevant portion of the URL as far forward as possible it's able to accomplish both priorities: giving the machine as much information as possible, and giving the human as much information as possible.

The trick here is that "Bioshock-Infinite-Premium-Edition-Xbox-360" is entirely superfluous. It is entirely there for SEO and human readability purposes. This URL works just fine and leads to the same place:

http://www.amazon.com/dp/B009PJ9L3Y/

Amazon isn't blind to these issues. So sure, you can take this very last URL and try to make a point about obfuscated URLs, but that's not what's actually in use at Amazon. It seems odd to pick them as an example when they're not even a violator.

[edit] It looks like HN truncates long URLs for display, which only goes further to prove the point.
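The scheme described above, a stable ID in the path plus an ignorable human-readable slug, can be sketched in a few lines. This is a rough illustration with made-up names and a simplified ASIN pattern, not Amazon's actual routing:

```python
import re

# Hypothetical route pattern: an optional human-readable slug,
# then /dp/ and the canonical product ID (ASIN-style, 10 chars).
PRODUCT_URL = re.compile(r"^(?:/(?P<slug>[^/]+))?/dp/(?P<asin>[A-Z0-9]{10})/?$")

def resolve(path):
    """Return the product ID for a path; the slug, if any, is ignored."""
    m = PRODUCT_URL.match(path)
    return m.group("asin") if m else None
```

With a matcher like this, `/Bioshock-Infinite-Premium-Edition-Xbox-360/dp/B009PJ9L3Y/` and the bare `/dp/B009PJ9L3Y/` resolve to the same product, which is exactly the behavior the comment describes.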

[+] randomdrake|13 years ago|reply
Although slightly unrelated, I'd just like to add something else to commend Amazon when it comes to link handling. I'm extremely impressed at Amazon's ability to handle very old links to products. Here's a link to Hitchhiker's Guide to the Galaxy that I used on a webpage I built 13 years ago and it still works: http://www.amazon.com/exec/obidos/ASIN/0517149257/o/qid=9295...

Note: it is a particularly ugly link:

  http://www.amazon.com/exec/obidos/ASIN/0517149257/o/qid=929505204/sr=2-1/002-9367729-7762218
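Longevity like that usually comes from the server pulling the one stable identifier (the ASIN) out of whatever legacy URL shape arrives and redirecting to the current canonical form. A rough sketch of that idea (my guess at the mechanism, not Amazon's actual code):

```python
import re

# The old /exec/obidos/ASIN/<id>/... style carries the product ID right
# after "ASIN"; session and query fragments around it can be discarded.
LEGACY = re.compile(r"/ASIN/(?P<asin>[A-Z0-9]{10})")

def modernize(path):
    """Map a legacy product path to its canonical form, if an ID is found."""
    m = LEGACY.search(path)
    return f"/dp/{m.group('asin')}/" if m else None
```

A server that does this once, with a 301 redirect, keeps thirteen-year-old links alive no matter how many times the URL scheme changes around them.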
[+] danso|13 years ago|reply
The Amazon case is an interesting one because, despite the appeal of the OP's argument, one can hardly deny the success of Amazon's product listings in spite of their ugly URLs.

However, this raises an important consequence of clean URL design: when you're offering things that may be classified in several categories, it requires good design on the backend/framework to make sure your URL taxonomy isn't overly constricting. For example, example.com/toys/Nintendo-wii or example.com/consoles/Nintendo-wii?

Either one is legitimate, but creating and maintaining a consistent taxonomy is difficult enough on its own without simultaneously worrying about what the URL looks like.

[+] wcoenen|13 years ago|reply
I'm especially annoyed by http://outlook.com. When I go there, I get redirected to a garbage URL on http://login.live.com, and then automatically to another garbage URL on http://bay156.mail.live.com where I can see my inbox. Yuck.
[+] felixmar|13 years ago|reply
Logging out on outlook.com is even worse. You get redirected to msn.com with its sleazy celebrity "news" and other dubious items. It's like you're in a renovated theater thinking this new interior design is pretty good and then after the show is over the exit leads to a back alley full of trash.
[+] MatthewPhillips|13 years ago|reply
I just did it and got blu172.mail.live.com. Load balancing is exposed to the user... nice.
[+] richardwhiuk|13 years ago|reply
These are all good ideas, but part of me wonders whether, because they are all being broken by Microsoft, Google, and Amazon (as demonstrated in the article and in the comments here), their importance is overstated.

URLs are fundamentally for web browsers to translate into a domain name lookup and an HTTP request. They are for computers; the fact that we've managed to convince humans that they should care about them, that they should be decipherable by humans, is, IMHO, a failing of the web as it stands.

On a side note, I notice that films are starting to advertise Facebook URLs as [FB Logo]/trancethemovie, which the user is intended to translate into https://www.facebook.com/trancethemovie

[+] geargrinder|13 years ago|reply
This is an often-heard argument: "Amazon (or other large company) are doing it and it works just fine for them."

But you are not Amazon. Your listings may be competing against Amazon's listings, without the brand recognition, trust and backlinks they have built up.

So you have to do everything better, like building friendly URLs, just to have a chance of getting clicks that Amazon can take for granted.

(Amazon does a pretty good job with URLs, as others have pointed out, but most larger, established companies are still pretty poor at this, leaving the door open for upstarts to do it better.)

[+] 0x0|13 years ago|reply
According to that logic we might as well do away with DNS entirely, and just use numeric IPs in URLs.
[+] criley|13 years ago|reply
I thought the entire point of URLs was that humans can't remember and use IPs naturally, so we created URLs and DNS to let humans interface with the machines' IP language.

If URLs are actually intended for computers, then I'd say we've failed rather badly.

The whole point was to interface with people.

If people don't matter and it's for machines, why use URLs? Just type IPs. Skip DNS altogether...

[+] k3n|13 years ago|reply
The more accurate claim would be: "DNS is for computers, not people", because that is actually true.

URLs are for both, and so you see hints of both concerns represented. Once your routing passes a certain level of complexity, there is no way to make URLs that are both functional and human-friendly.

The only thing users should really be concerned with WRT URLs is the DNS portion; pretty URLs are just that -- pretty -- and a rose by any other name... Ultimately the user should either trust your FQDN or not, at which point the actual URL is inconsequential.

EDIT: additionally, a URL is not a UI element, and the user should never even need to see or know about any particular URL (much less its scheme), only that interacting with an anchor tag named "profile" takes them to the profile page, for example. It's up to developers to translate URLs into human-friendly counterparts.

[+] johnchristopher|13 years ago|reply
>Edward Cutrell and Zhiwei Guan from Microsoft Research have conducted an eyetracking study of search engine use (warning: PDF) that found that people spend 24% of their gaze time looking at the URLs in the search results.

>We found that searchers are particularly interested in the URL when they are assessing the credibility of a destination. If the URL looks like garbage, people are less likely to click on that search hit. On the other hand, if the URL looks like the page will address the user’s question, they are more likely to click.

I wish someone at MS would follow up on that and fix the whole bay0X.cdn URL jumping every time I connect to outlook/hotmail.com

[+] Ntrails|13 years ago|reply
It would also lower the amount of time I spend allowing things on noScript
[+] CodeCube|13 years ago|reply
Though, to be fair... their inbox isn't necessarily something they have to find by gazing through search results, so that doesn't really apply here.
[+] apaprocki|13 years ago|reply
URLs having any meaning at all strikes me as bias. If my parents visited an SSL site (say, a bank) and the address bar simply displayed the company name and nothing more, they would not miss URLs at all.

This is also why your parents and grandparents can just type random text into an address bar to execute a search instead of having to go to google.com or type in something cryptic like google.com?q=thing%20i%20want.

[+] alistair77|13 years ago|reply
That's confusing two different concepts. URLs are the equivalent of a full postal address whereas searching is the equivalent of asking a stranger how to get somewhere. Why would you want to always ask and trust a search engine when you already know how to get somewhere? DNS spoofing aside, I know I can trust anything under, for example, bbc.co.uk and I know where /radio, /news etc takes me. Guess what, so do my seventy year old parents!
[+] drunkpotato|13 years ago|reply
This is a good and oft-neglected part of UI and API design to keep in mind. I especially like the discussion of the implications of hierarchical, semantic URLs in improving user trust and likelihood of clicking.

Much like with database design, it's easy for programmers to take over the task of URL design and make it easy to use from the write-first, read-never programmer perspective. User considerations come later if at all. I like the reminder to pay attention to these factors. We should all be reminded to question our first impulses; are we making something good for us or good for the user?

[+] jader201|13 years ago|reply
I disagree with many points of this article, and actually feel the reverse is true:

URLs are for computers, not people.

To me, a URL is an address to a web site, not the title (or description).

If I want to find somebody's address on a map, I don't go to "Bobby's House". I go to "123 Main Street, New York City, NY". If I search for Bobby's house, I'm not given "Bobby's House" on a map, I'm given a surrogate street address.

If humans are expecting the URL to look pretty and descriptive, then the issue here is that we've conditioned this expectation and we should instead condition users to expect succinct, surrogate URLs that only serve the purpose of identifying the article you're trying to reach.

Additionally, I think the fact that search engines weight URLs so heavily in their ranking is terrible and counter to the purpose of a URL. This is what <title></title> and other <meta></meta> headers are for.

The URL should not determine a page's rank in search results, at all. At the very most, it may make sense to factor the root domain into SEO, but that's where it should end. This isn't the 1990s, when much of the web was static HTML pages that could be given meaningful file names. In today's world, where the web is dynamic and mostly made up of user-driven content, URLs are designed to route the user based on one or many identifiers, which are often surrogate identifiers, not natural or meaningful ones.

Edit: I do agree with the point about useless garbage in the URL (like the Google search examples) that is there only in the interest of the site and tracking/analytics. I think URLs should only serve to get the user where they need to go, and contain exactly enough data to get them there.

[+] nopal|13 years ago|reply
But people often need to parse URLs before they provide them to their computers (via click or keyboard), and I think that's the point.

The issue is that URLs are often the only piece of information users receive, and it's why we've conditioned users to expect meaningful URLs.

From our standpoint, it's not too hard to make URLs more meaningful, even with user-generated content. Plenty of sites incorporate the title of submitted content into the URLs, and it's even easier when creating content for oneself.

Would you prefer a link to http://www.example.com/about or a link to http://www.example.com/?id=123? Which are you able to understand before clicking it? Which are you more likely to click?
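Incorporating a title into a URL is usually a one-step "slugify" transform. A minimal sketch (simplified; real implementations also transliterate Unicode and truncate long titles):

```python
import re

def slugify(title):
    """Lowercase the title, drop punctuation, join words with hyphens."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    return "-".join(words)
```

Applied to this article's title, `slugify("URLs are for People, not Computers")` yields `urls-are-for-people-not-computers`, which can be appended to (or substituted for) the numeric ID in the URL.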

[+] radley|13 years ago|reply
123 Main St (comma) New York (comma) New York is a human paradigm.

You left out zip+4 code, GEO coordinates, user spoken language, travel-type preference setting, internal id, and unique user id tracker.

[+] brc|13 years ago|reply
URLs are important to search engines because they should express the primary purpose of the page. Unlike meta keywords, it is very difficult to keyword-stuff a URL, because search engines can detect duplicate content. So a person has to select the URL that best describes the page, which increases the URL's quality as a search signal.

URLs are important to humans for the same reason we save documents with meaningful names instead of random gibberish.

URLs are the way people use sites, and that's just the way it is.

[+] UnoriginalGuy|13 years ago|reply
I blame the tools...

Most tools and frameworks are designed from the ground up to be document-focused. Some even going as far as to purposely simulate a document when none exists (e.g. Tomcat).

Let's take PHP, ASP.NET, and Java: they make up the majority of the internet right now, with RoR and MS MVC being outliers.

It is VERY hard to develop applications in them without a document focus because they use documents to direct functionality (e.g. logout.php and login.php might have different underlying functionality).

Now, yes, web servers do support request redirection, so you can redirect from /logout to /logout.php, but such "magic" is time-consuming because there is a disconnect between the underlying framework, which "understands" pages, and the dumb web server, which just does what it is told.

Even if you just automate it so you strip out the extension (e.g. strip ".php") you still wind up /thinking/ about things from a document perspective rather than a functionality perspective (e.g. "this functionality is on THIS page, this functionality is on THAT page").

We just need more modern frameworks that are based on a hierarchy from the ground up, rather than on documents/files/etc. This should all be dictated by the framework, not the server's filesystem.
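The hierarchy-not-files idea is what a route table gives you: URLs map to handlers, with no file on disk behind them. A toy dispatcher to illustrate (hypothetical, framework-agnostic; real frameworks add pattern matching, HTTP methods, and so on):

```python
# Routes are declared as paths mapped to functions; the URL space is an
# explicit hierarchy the framework owns, independent of the filesystem.
ROUTES = {}

def route(path):
    def register(handler):
        ROUTES[path] = handler
        return handler
    return register

@route("/login")
def login():
    return "login form"

@route("/logout")
def logout():
    return "logged out"

def dispatch(path):
    """Look the path up in the route table; no .php file is involved."""
    handler = ROUTES.get(path)
    return handler() if handler else "404"
```

Here `/logout` works and `/logout.php` naturally 404s, because the URL describes functionality, not a document.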

[+] phpnode|13 years ago|reply
These sound like complaints that would have been valid 10 years ago, but not any more. Virtually every framework has the concept of "routes" that map URLs to appropriate logic, they're not document based (whatever that means)
[+] throwawayG9|13 years ago|reply
Server's filesystem? Are you serious? Welcome to 2013, all the problems you mention are long gone, and are only brought back from time to time by people like you who stopped learning a decade ago. You should check out symfony.com... Or any other framework for that matter! For Christ's sake.
[+] greghinch|13 years ago|reply
Isn't this what routing does? The problem you're describing seems to be primarily a PHP one, particularly with the lack of a dominant framework in the PHP community. (Possibly also a .Net one, I've avoided working with ASP like the plague in my career).
[+] jeff303|13 years ago|reply
"Java" - what specifically are you referring to? Spring MVC, as one example, has advanced routing capability and is in no way "forced document centric."
[+] abraininavat|13 years ago|reply
Yeah, I don't think you've used RoR.
[+] cargo8|13 years ago|reply
While I do appreciate clean URLs, the reason behind all those random obfuscated query parameters in Google's URL is not a mystery. They are hidden indicators that are only available at query time, and/or experimentation flags, used to improve the results. URLs only need to be readable for the portion the user typed or is consciously aware of; the rest is for computers.
[+] NelsonMinar|13 years ago|reply
Google search result URLs used to be simpler, but years ago they made a decision not to care how they look, and they keep stuffing more and more data into them. I assume it helps their tracking and maybe optimization.

Even more offensive are the result URLs on the results page. Here's the URL from a logged-out search for "hacker news": https://www.google.com/url?sa=t&rct=j&q=&esrc=s&...

[+] Kop|13 years ago|reply
You get that long and obscure URL while visiting Google logged out, with no cookies, so they must all be the defaults.

If they are the defaults, why put them in the URL?

[+] robinh|13 years ago|reply
Could they not set those parameters in a POST request so they don't show up in the URL?
[+] mikecarroll|13 years ago|reply
While I agree that URLs should be considered intrinsic to good UI, social media is also undermining the value/importance of semantic URLs.

Why put extra work into making your RESTful URL structure more semantic, in other words, if Twitter is just going to shorten your URLs to the point that they are no longer fully readable, or Facebook is just going to hide them behind a preview?

[+] aerique|13 years ago|reply
Twitter does show the full URL in the tooltip.
[+] njharman|13 years ago|reply
I agree, except that hierarchies are problematic. The world is not hierarchical. Or rather, it is composed of innumerable hierarchies: some disjoint, some overlapping, some redundant, some varying with time. Which one to apply, and what the levels are, is a huge bikeshed/distraction.

Chair example:

furniture/chairs/desk/chair

inventory/current/reorder/chair

customer/me/bought/chair

customer/me/wishlist/wedding/chair

products/used/modern/office/chair

products/wood/four legs/padded/black/chair

ad nauseam.

[+] sravfeyn|13 years ago|reply
On a side note, I have made a movie web app where you can just enter a movie name into the URL to get its rating & trailer, like www.instamovi.com/#<ANY_MOVIE_NAME_HERE>. It works for keywords as long as they are spelled correctly... like http://instamovi.com/#bourne
[+] duck|13 years ago|reply
A study conducted by Microsoft found URLs play a vital role in assessing the security and credibility of a website

Why then do most Microsoft sites not follow this finding? Also, a lot of their products break it as well (I'm looking at you SharePoint and CRM).

[+] ygra|13 years ago|reply
MSR does research and prototyping. It's up to the product teams to implement important findings and they still might have other priorities first.

Also, »Microsoft« isn't one big monolithic entity, and it's not uncommon for individual parts of it to do things in quite different ways.

[+] danibx|13 years ago|reply
The only URLs I care about are the main domain URLs. And I don't even type them; I just use Google to reach the main site. It is faster than typing a full URL, even more so on mobile devices.

Or for commonly accessed sites I just type a few letters on my browser address bar. reddit.com is actually re+enter. news.ycombinator.com is actually ne+enter to me. After I reach the main site I usually click around or use the site's search bar.

So, I would say that good URL names are a secondary optimisation.

I would prefer to focus on this priority: 1) A good unique domain name; 2) Good SEO; 3) Good site information architecture; 4) Good internal site search.

[+] cdoxsey|13 years ago|reply
His examples aren't helping his case. If the most successful store and the most popular search engine don't use pretty URLs, why should anyone else care?
[+] bobwise|13 years ago|reply
Well, obviously URLs are for people, because raw IP addresses are unsuitable, but that doesn't mean that textual URLs as they exist today are our best option. Even well-designed URLs are too complicated. https://news.ycombinator.com/item?id=5498198 is mostly devoid of meaning even to me. I can tell that the URL references a discussion on Hacker News, but "Hacker News" and the title of the article are not present in the URL.

Hierarchical URLs betray the underlying model of the internet as a series of interrelated documents. People don't care about understanding the layout of files on a web server; they just want to open Facebook, or their email, or perform a search. Nobody types "http://www.facebook.com" into their browser. They either click a bookmark or type "facebook" into the search or URL bar. What happens next is up to the browser.

The best solution would conform to the mental model that people already have. They don't think of a website as a bunch of documents on a web server (despite the shared vocabulary with printed media: words like "page" and "bookmark"). Their mental model is probably something like buildings on a city block. You can pick one to go into, and when you're inside you can do things and learn things that are unique to that building. Rooms are connected by hallways and doors. There are windows where you can see outside or into other buildings. You can bring things with you into the building and take things out when you leave. To get back to a room in a building you've visited before, you can either go back to the front door and follow the path you took originally, or you can "bookmark the page", which is like a shortcut directly to that room.

[+] igorgue|13 years ago|reply
I think I read somewhere that a good number of Flickr's users just hack this URL: http://www.flickr.com/photos/tags/<tag_name>

Like somebody else said, I blame the tools, the requirements ("but we gotta track the referring URL of the referring URL!!!"), and the programmers.

[+] stormbrew|13 years ago|reply
Kind of a tangent, but I'm always really amused by generated clean-looking URLs that cut out short words. It's very common for the word "not" or "no" to drop out and produce a headline with completely inverted meaning.

localnewspaper.example.com/1934342342/mayor-dropping-race-after-scandal -> Mayor Not Dropping Out Of Race After Scandal
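That failure mode is easy to reproduce: a slug generator with an over-aggressive stop-word list will happily drop "not". A sketch of the buggy behavior (the stop-word list is hypothetical; including "not" and "no" in it is precisely the bug):

```python
import re

# Over-aggressive stop-word list: "not" and "no" should never be here,
# because dropping them can invert the headline's meaning.
STOPWORDS = {"a", "an", "the", "of", "out", "not", "no"}

def bad_slug(headline):
    """Slugify a headline, stripping stop words -- meaning be damned."""
    words = re.findall(r"[a-z0-9]+", headline.lower())
    return "-".join(w for w in words if w not in STOPWORDS)
```

Fed "Mayor Not Dropping Out Of Race After Scandal", this produces `mayor-dropping-race-after-scandal`: the exact inversion in the example above.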