Thursday, June 9, 2011

Turn back to Bochs

After a few days of trying to use QEMU as an emulator for studying low-level software, I found that QEMU did not offer me much for debugging. Things like single-stepping and setting breakpoints did not seem to be available. So I went back and gave Bochs a try. To my surprise, Bochs is not as hard to use as I had thought. It even provides very good debugging features.

Here are some of my notes on installing and using Bochs:

  1. Download Bochs source code at

  2. Extract Bochs code
      $ gunzip -c bochs-version.tar.gz | tar -xvf -

  3. Configure the build with the debugger enabled
      $ ./configure --enable-debugger --enable-disasm

  4. Install
      $ make
      $ sudo make install

  5. There is an example Bochs configuration file named .bochsrc in the extracted directory

Sunday, May 29, 2011

Intel vs AT&T syntax

There are two popular x86 assembly syntaxes: Intel and AT&T. Intel syntax is popular in the Windows world. On Linux, AT&T syntax is more common, though GAS (the GNU Assembler) supports both.

The following lists some major differences between the two syntaxes:

  1. AT&T prefixes registers with a % sign, Intel does not
    * Intel: eax, ebx, ecx,...
    * AT&T: %eax, %ebx, %ecx,...

  2. AT&T prefixes immediate values with a $ sign, Intel does not
    * Intel: 10, 80h
    * AT&T: $10, $0x80

  3. AT&T and Intel syntax use opposite operand order
    * Intel: mnemonic destination, source
       Ex: mov eax, 100
    * AT&T: mnemonic source, destination
       Ex: movl $100, %eax

  4. AT&T adds a suffix to the mnemonic to specify the operand size (1 byte: b, 2 bytes: w, 4 bytes: l), while Intel puts a size directive before the memory operand (1 byte: byte ptr, 2 bytes: word ptr, 4 bytes: dword ptr)
    * Intel:
       mov al, byte ptr foo
    * AT&T:
       movb foo, %al

Saturday, May 28, 2011

Assembly Language

Assembly can be seen as machine language written in symbols/mnemonics instead of 0s and 1s. Because it maps so directly to machine instructions, a program written in assembly can make use of every capability of the machine.

Assembly is specific to a machine architecture. IA32 (also called x86 or i386) is the most popular architecture for PCs.
  1. Register
    Registers can be classified into 4 types:
    - general purpose (eax, ebx, ecx, edx)
    - pointer/index (esp, ebp, esi, edi)
    - instruction pointer (eip)
    - flags (eflags)

    These are all 32 bits wide. The general-purpose registers also have 16-bit and 8-bit parts. Ex: eax (32 bits), ax (low 16 bits of eax), ah (high 8 bits of ax), al (low 8 bits of ax).

  2. Instruction
    - arithmetic/logic: add, sub, and, or,...
    - control: jmp,..
    - data movement: mov,..

  3. Operand
    - register: operand value is contained in register
    - immediate: operand value is a constant
    - memory: operand value is in memory

  4. Addressing mode
    Addressing mode is the way to specify a memory address.
    - absolute:
      address = a value
    - register:
      address = register content
    - displacement:
      address = register content + a value
    - indexed:
      address = register content + a value + another register content (index) * another value (scale)

  5. Subroutine
    - subroutine is a set of instructions
    - parameters passed to subroutine are usually pushed on stack
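
To make items 1 and 4 above more concrete, here is a minimal sketch in Ruby (chosen only because the other posts here use Ruby). The helper names sub_registers and effective_address are made up for illustration. The first one extracts the 16-bit and 8-bit parts of a 32-bit register value; the second one computes an address the way the indexed mode is described above, and the simpler modes are just special cases with some parts left at zero.

```ruby
# Hypothetical helpers for illustration; not part of any assembler or library.

# Split a 32-bit register value (e.g. eax) into its named parts.
def sub_registers(eax)
  ax = eax & 0xFFFF # low 16 bits of eax
  { ax: ax, ah: (ax >> 8) & 0xFF, al: ax & 0xFF }
end

# Indexed addressing: base register + displacement + index register * scale.
# The simpler modes fall out by leaving the unused parts at their defaults.
def effective_address(base: 0, disp: 0, index: 0, scale: 1)
  base + disp + index * scale
end

parts = sub_registers(0x12345678)
puts parts.map { |name, value| format("%s = 0x%X", name, value) }.join(", ")
puts format("0x%X", effective_address(base: 0x1000, disp: 8, index: 3, scale: 4))
```

For example, effective_address(disp: 0x2000) models the absolute mode, and effective_address(base: 0x1000, disp: 8) models the displacement mode.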

Tuesday, May 17, 2011

Some notes on QEMU

I am looking for a computer emulator that I can use to experiment with low-level software. There are two prominent free and open-source emulators: Bochs and QEMU. Bochs seems more popular but lacks documentation. So I have decided to use QEMU because it is quite simple to use and well documented.

1. Installation (on Ubuntu)
    $ sudo apt-get install qemu

2. Start emulator
    $ qemu [options] [disk_image]
           --> Start an emulator with the specified options and disk_image (which usually contains an OS)

    $ qemu linux.img
           --> Start an emulator with default options and linux.img as its hard disk

    There are lots of options that specify what your emulated computer looks like, such as the type of its CPU, hard disk, video card, sound card,... You can get more details on each option in the QEMU documentation.

3. Monitoring
    QEMU provides a monitor with which you can inspect your emulator, control it, change its devices, query its status,...

    You can switch back and forth between the emulator and its monitor with keystrokes: Ctrl+Alt+2 and Ctrl+Alt+1

    Some monitoring commands:
    (qemu) help or ? [cmd] 
    (qemu) change device setting
    (qemu) x/fmt addr
           Virtual memory dump starting at addr
    (qemu) xp/fmt addr
           Physical memory dump starting at addr 

Monday, May 16, 2011

Use helpers in controller

Rails adopts REST as part of its design.

To follow REST, you should:

1. Model your web app as resources
2. Manipulate your app's resources through a conventional interface

- if you have a Product resource, then the URLs to create, edit, update and delete this resource would be:

Action   URL                HTTP method   RESTful helper
Create   /products          POST          products_url
Edit     /products/1/edit   GET           edit_product_url
Update   /products/1        PUT           product_url
Delete   /products/1        DELETE        product_url

Then, to get links to edit/delete a product through the RESTful interface, we can use the link_to method as below:

link_to 'Edit', edit_product_path(product)

link_to 'Remove', product_path(product), :confirm => 'Are you sure?', 
        :method => :delete

By default, view helpers such as link_to are only available in the view layer. How can we use them in the controller layer?
  • In Rails 2, call through @template variable
    @template.link_to('Edit', edit_product_path(product))
  • In Rails 3, call through view_context method
    view_context.link_to('Delete', product_path(product), :confirm => 'Are you sure?', :method => :delete)

Tuesday, April 12, 2011

Code folding in Emacs

Code folding is a feature to wrap (fold) or unwrap a block of code. It helps you get an overview of your code. This is a quick and simple way to get that feature in Emacs. I modified it a little bit to make it as flexible as I want.

;; code folding
(defun toggle-selective-display (level)
  "Hide lines indented by LEVEL or more columns; 0 shows everything."
  (interactive "nEnter indentation level: ")
  (set-selective-display level))

(global-set-key "\M-3" 'toggle-selective-display)

So, if I have this code:

class User < ActiveRecord
  has_many :projects
  has_many :assignments

  def full_name
    "#{first_name} #{last_name}"

  def show_project_names
    projects.each do |prj|

then when I press Alt+3 and enter 4, all code indented at level 4 (4 spaces) or deeper will be wrapped as below:

class User < ActiveRecord
  has_many :projects
  has_many :assignments

  def full_name...

  def show_project_names...

If I press Alt+3 and enter 2, all code indented at level 2 or deeper will be wrapped:

class User < ActiveRecord...

If I press Alt+3 and enter 0, the wrapped code will be unwrapped again.

class User < ActiveRecord
  has_many :projects
  has_many :assignments

  def full_name
    "#{first_name} #{last_name}"

  def show_project_names
    projects.each do |prj|

It's quite fun, right?

Sunday, March 20, 2011

A simple URL shortening algorithm

Shortening a URL is a convenient way to save space when posting a long URL. It's especially popular on Twitter, where a message is limited to 140 characters. Many websites provide this service.

There are some gems or wrappers for using such a service in your Rails app, but the shortened URLs then belong to another domain. If you want them to belong to your own domain, you must implement your own URL shortener.

This is a simple way to do that.

Basically, the problem is:
given a URL, how do we map it to a string of the pattern XXXXXX, where each X belongs to {0..9a-zA-Z}? There are 62^6 = 56800235584 such strings, which is enough for most purposes.
Then the simple idea to solve that problem is:
map the URL to an integer in 1..62^6 - 1. That number corresponds to exactly one string in the space {XXXXXX}, which can be calculated with a base-10 to base-62 conversion (the same way you would convert a decimal number to a hexadecimal one).

Here is an implementation of mine in Ruby:

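class URLShortener
  CHARSET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"
  BASE = 62
  CODE_LENGTH = 6 # length of the XXXXXX pattern

  # Converts a non-negative integer id to a zero-padded base-62 code.
  # Returns "" if the id does not fit in CODE_LENGTH characters.
  def self.encode(id)
    code = ""
    while id > 0
      code = CHARSET[id % BASE].chr + code
      id = id / BASE
    end

    (code.length > CODE_LENGTH) ? "" : "0" * (CODE_LENGTH - code.length) + code
  end

  # Converts a base-62 code back to its integer id.
  # Returns -1 if the code has the wrong length or an invalid character.
  def self.decode(code)
    return -1 if code.length != CODE_LENGTH
    id = 0
    for i in 0..(CODE_LENGTH - 1) do
      n = CHARSET.index(code[i])
      return -1 if n.nil?
      id += n * (BASE ** (CODE_LENGTH - i - 1))
    end
    id
  end
end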
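
As a cross-check of the base-10 to base-62 idea, here is a small standalone sketch that does the same conversion with Integer#digits, which is available in Ruby 2.4 and later. The method names to_base62 and from_base62 and the default width of 6 are my own assumptions for illustration; only the character set is taken from the class above. Unlike the class, this sketch does not reject ids that need more than 6 characters or codes with invalid characters.

```ruby
# Same character set as the URLShortener class; everything else is hypothetical.
CHARSET = "0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ"

# Integer#digits(62) returns the base-62 digits, least significant first,
# so they are reversed before being mapped to characters.
def to_base62(id, width = 6)
  id.digits(62).reverse.map { |d| CHARSET[d] }.join.rjust(width, "0")
end

# Horner's rule: multiply the running total by 62 and add the next digit.
def from_base62(code)
  code.each_char.reduce(0) { |acc, ch| acc * 62 + CHARSET.index(ch) }
end

puts to_base62(125)         # 125 = 2 * 62 + 1, so this prints 000021
puts from_base62("000021")  # prints 125
```

The largest id that still fits in 6 characters is 62**6 - 1, which this sketch encodes as "ZZZZZZ".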