
A URL shortener service in 45 lines of Scala

27 points| lauriswtf | 11 years ago |grasswire-engineering.tumblr.com | reply

20 comments

[+] TeeWEE|11 years ago|reply
I know Scala is very powerful, but I think this code proves to me that Scala code is difficult to read. I'm a pro functional programmer, but Scala mixes too many concepts.

'pseudo' Python URL shortener in 14 lines of code:

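  import base64
  import redis
  from webapp2 import RequestHandler

  store = redis.Redis()  # assumes a Redis server on localhost:6379
  def shorten_url(url):  # naive key: first 7 chars of the URL's base64 encoding
      return base64.urlsafe_b64encode(url.encode()).decode()[:7]

  class Handler(RequestHandler):
      def get(self, path):
          self.redirect(store.get(path).decode())  # short key -> long URL

      def post(self, path):
          store.set(shorten_url(path), path)  # long URL stored under its short key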
[+] GregorStocks|11 years ago|reply
I believe you'll have a lot of collisions that way, as two URLs with the same 7-character prefix in base64 will have the same key. The Scala version uses 7 random characters, which I think will lead to a collision after about sqrt(36^7) entries due to the birthday paradox (sqrt(62^7) if alphanumeric includes both uppercase and lowercase). That's better, but I'd recommend just using something autoincrementing instead, which in theory makes collisions impossible.
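For a rough sense of the numbers in that estimate, here is a minimal sketch that computes the square root of the keyspace for 7-character keys. The function name `birthday_threshold` is made up for illustration, and the square root is only an order-of-magnitude figure (a 50% collision chance sits at roughly 1.18 times that value).

```python
import math

def birthday_threshold(alphabet_size, length):
    """Integer square root of the keyspace: the rough number of random
    keys after which a collision becomes likely (birthday paradox)."""
    keyspace = alphabet_size ** length
    return math.isqrt(keyspace)

# 36 symbols: lowercase letters plus digits; 62 symbols: mixed case plus digits.
print(birthday_threshold(36, 7))
print(birthday_threshold(62, 7))
```

So a lowercase-only 7-character key starts colliding after a few hundred thousand entries, while a mixed-case one holds out until a couple of million.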
[+] tshadwell|11 years ago|reply
As far as I am aware, the standard implementation of a URL shortening algorithm is to convert the link's numeric ID to a higher base, not to generate a random string. This has the advantages of faster lookups and smaller storage size.

Here is one I wrote long enough ago for me to be divorced from its implementation flaws: https://github.com/TShadwell/go-shorten/blob/master/shorten/...
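Not the linked Go code, but a minimal Python sketch of the same idea, assuming the numeric ID comes from an auto-incrementing counter and that base 62 (digits, lowercase, uppercase) is the target alphabet. The names `encode_id` and `decode_key` are hypothetical.

```python
import string

# 62 symbols: 0-9, a-z, A-Z (an assumed alphabet; any fixed ordering works).
ALPHABET = string.digits + string.ascii_lowercase + string.ascii_uppercase
BASE = len(ALPHABET)

def encode_id(n):
    """Convert a non-negative integer ID to a base-62 string."""
    if n < 0:
        raise ValueError("ID must be non-negative")
    if n == 0:
        return ALPHABET[0]
    chars = []
    while n:
        n, rem = divmod(n, BASE)
        chars.append(ALPHABET[rem])
    # Digits come out least significant first, so reverse them.
    return "".join(reversed(chars))

def decode_key(key):
    """Convert a base-62 string back to the integer ID."""
    n = 0
    for ch in key:
        n = n * BASE + ALPHABET.index(ch)
    return n
```

Since every ID maps to exactly one key, two links can never collide; the trade-off is that sequential keys are easy to enumerate.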

[+] xwowsersx|11 years ago|reply
That makes sense. Thanks for the code.
[+] skrebbel|11 years ago|reply
Now that Twitter auto-shortens URLs (and expands them in the UI), is there really a use left for URL-shorteners? Are there any other major platforms where text length matters?
[+] ff7c11|11 years ago|reply
In print: I like how net magazine uses netm.ag/description-123 for any links so it's easy to type and doesn't take up too much space. This also gives analytics for print articles, where you normally can't see whether anyone followed a link.

You don't need to rely on routing through Twitter. It works when Twitter is blocked or down. Branding.

Also tracking.

Also you can modify where the link points after distributing it in case you notice a mistake.

[+] austenallred|11 years ago|reply
The biggest advantage of URL shorteners, outside of the length, is the ability to quickly generate URLs that are trackable, instead of constantly adding campaign parameters with something like grasswire.com/1234?refid=twitter, which is how a lot of marketers track where their clicks come from.
[+] JazCE|11 years ago|reply
Length matters when it comes to print format.
[+] u124556|11 years ago|reply
What would be the minimum path size in the shortened URL to ensure a large enough storage capacity without letting users guess other URLs easily?
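One way to frame the question, as a sketch under stated assumptions: keys are drawn at random from a 62-character alphabet, and "hard to guess" means a random guess hits a stored URL with at most some chosen probability. The function `min_key_length` and its parameters are hypothetical.

```python
def min_key_length(expected_urls, max_hit_rate, alphabet_size=62):
    """Smallest key length whose keyspace keeps the fraction of occupied
    keys (the chance that a random guess hits a real URL) at or below
    max_hit_rate once expected_urls links are stored."""
    length = 1
    while alphabet_size ** length * max_hit_rate < expected_urls:
        length += 1
    return length

# One million stored URLs, at most a one-in-a-million hit chance per guess.
print(min_key_length(1_000_000, 1e-6))
```

This only holds for random keys; with sequential IDs the neighbouring URLs are guessable at any length.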
[+] agscala|11 years ago|reply
Where does this handle getting the same random shortened url twice?
[+] xwowsersx|11 years ago|reply
Good catch - it doesn't. We'd need to add a line to check for a collision and retry. Better yet is to use hashing instead of a random string, as some folks mentioned in the comments on the blog.
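A minimal sketch of the check-and-retry idea, in Python rather than the post's Scala, using a plain dict as a stand-in for the real store. The names `random_key` and `store_url`, the retry limit, and the lowercase-plus-digits alphabet are all assumptions for illustration.

```python
import secrets
import string

ALPHABET = string.ascii_lowercase + string.digits  # assumed 36-symbol alphabet

def random_key(length=7):
    """Generate a random key of the given length."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))

def store_url(store, url, max_attempts=10, make_key=random_key):
    """Store url under a fresh random key, retrying if the key is taken."""
    for _ in range(max_attempts):
        key = make_key()
        if key not in store:  # collision check
            store[key] = url
            return key
    raise RuntimeError("could not find a free key")

links = {}
key = store_url(links, "https://example.com/a/very/long/path")
print(key, "->", links[key])
```

With a shared store such as Redis, the check and the write would need to be a single atomic operation (for example redis-py's `set(key, url, nx=True)`), otherwise two requests could still claim the same key.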
[+] stevewilhelm|11 years ago|reply
Instead of working on a URL shortener, I suggest you work on not sending usernames and passwords in clear text via HTTP POSTs.