Sunday, March 20, 2011

A simple URL shortening algorithm

Shortening a URL is a convenient way to save long URL to make use of space when posting. It's especially popular on Twitter where message is limited to 140 words. Many websites provide this service such as tinyurl.com, bit.ly,...

There some gems or wrapper to use that service in your Rails app. But the shortened URLs belong to another domain (ex: http://bit.ly/ek8Hhe which belongs to bit.ly). If you want to make it belonged to your domain (ex: http://example.com/wfds7i), you must implement your own URLs shortener.

This is a simple way to do that.

Basically, the problem is:
given a URL, how to map it to a string which has pattern XXXXXX, where X belongs to {0..9a-zA-Z}. There would be 62^6 = 56800235584 such strings. That amount is almost enough.
Then the simple idea to solve that problem is:
map the URL to an integer in 1..62^6. That number must correspond to a string in space {XXXXXX} that could be calculated by using a 10-base to 62-base conversion algorithm (you can understand easily by figuring out how to convert a decimal number to hexa number). 

Here is an implementation of mine in Ruby:

class URLShortener
  CHARSET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
  BASE = 62
  CODE_LENGTH = 6

  def self.encode(id)
    code = ""
    while (id > 0) do
      code = CHARSET[id % BASE].chr + code
      id = id / BASE
    end

    (code.length > CODE_LENGTH) ? "" : "0" * (CODE_LENGTH - code.length) + code 
  end

  def self.decode(code)
    return -1 if code.length != CODE_LENGTH
    id = 0
    for i in 0..(CODE_LENGTH-1) do
      n = CHARSET.index(code[i])
      return -1 if n.nil?
      id += n * (BASE ** (CODE_LENGTH - i - 1))
    end
    return id
  end
end

1 comment:

Unknown said...

Good news and info! I am finding free url shortener. If you know about it , please share.
Thanks a lot