ARM project: Week 2
String sort in ARM assembler


ARM project home | Tasks (weeks): 1 | 2 | 3 | 4

ARM Developer Suite

; ARM project Easter 2003
; Week 2
; Your crsid   your given name   your family name
; Year-month-day
;  
; This program is free software, released under GNU GPL version 2

; hello.s
; Originally written by Simon Moore; minor revision by Frank Stajano.

; TUT: do not exceed 79 characters per line in your source code (columns 0-78)
;         1         2         3         4         5         6         7 
; 23456789012345678901234567890123456789012345678901234567890123456789012345678
; Line things up as follows:
;  
; label         opcode  operands        ; inline comment
; col 0         col 16  col 24          col 40


                ; Hello world in ARM assembler

                AREA    text, CODE      ; TUT: this section is called 'text'
                                        ; and contains code.

                ENTRY                   ; TUT: entry point of the code

                ; Print 'Hello world'.

                ; Put into r4 the address of the string.
                adr     r4, hello       ; TUT: adr: address in register

loop            ; TUT: "loop" is a label and designates this address 
                ; symbolically.

                ; TUT: call the putchar routine to display each character
                ; to illustrate how a loop works.

                ldrb    r0, [r4], #1    ; Get next character.
                                        ; TUT: r0 := mem8[r4]; r4 := r4 + 1
                cmp     r0, #0          ; Stop when we hit a null.
                beq     print           ; TUT: "branch if equal" = if EQ goto

                bl      putchar         ; TUT: "branch+link" = subroutine call
                b       loop            ; TUT: "branch" =  goto

print
                ; Alternatively, use the library routine 'putstring' 
                ; to write out the whole string a second time, but
                ; all in one go
                adr     r0, hello       ; point at the string with r0
                bl      putstring       ; TUT: "branch+link" = subroutine call

finish
                ; Standard exit code: SWI 0x123456, calling routine 0x18
                ; with argument 0x20026
                mov     r0, #0x18
                mov     r1, #0x20000    ; build the "difficult" number...
                add     r1, r1, #0x26   ; ...in two steps
                SWI     0x123456        ; TUT: "software interrupt" = sys call

hello
                DCB     "Hello World\n",0

                END

; ARM project Easter 2003
; Week 2
; Your crsid   your given name   your family name
; Year-month-day
;  
; This program is free software, released under GNU GPL version 2

; ioroutines.s
; Originally written by Simon Moore; minor revision by Frank Stajano.


; TUT: do not exceed 79 characters per line in your source code (columns 0-78)
;         1         2         3         4         5         6         7 
; 23456789012345678901234567890123456789012345678901234567890123456789012345678
; Line things up as follows:
;  
; label         opcode  operands        ; inline comment
; col 0         col 16  col 24          col 40


; I/O routines


; These are the function codes for some useful system calls.
; Place the function code in r0, the argument in r1, then call SWI 0x123456.
SYS_WRITEC      EQU     3               ; Write the single char whose
                                        ; ascii code is pointed to by r1.
SYS_WRITE0      EQU     4               ; Write the null-terminated string
                                        ; pointed to by r1.
SYS_READC       EQU     7               ; Read a char from the debugger's stdin
                                        ; and return it in r0.


                AREA    text, CODE
                                                
                EXPORT  printhex, putchar, putstring

; --------------------------------------

printhex                                
                ; Print number in r0 as 8 hex digits.
                ; Conforms to APCS.

                stmfd   sp!, {r4, lr}   ; TUT: "store multiple full descending"
                                        ; sp := sp - 4
                                        ; mem32[sp] := r4
                                        ; sp := sp - 4
                                        ; mem32[sp] := lr
                                        ; and when you got here (via bl), the
                                        ; return address was placed in the lr

                adr     r2, hexDigits   ; Point at a constant table of digits
                adr     r3, storage     ; Point at the output buffer

                ; r4 is the bit position of the 4-bit-wide portion of r0
                ; that we print as a hex digit in each round of the loop.
                ; for r4 := 28 to 0 step -4
                mov     r4, #28         
printLoop
                mov     r1, r0, lsr r4  ; TUT: r1 := r0 shifted right by r4
                and     r1, r1, #0x0f   ; mask off lower nibble
                ldrb    r1, [r2, r1]    ; r1 now contains a hex number; turn
                                        ; it into an ASCII code by looking
                                        ; it up in the table.
                                        ; TUT: r1 := mem8[r2 + r1]
                strb    r1, [r3], #1    ; Append character to output buffer
                                        ; TUT: mem8[r3] := r1; r3 := r3 + 1
                subs    r4, r4, #4      ; TUT: r4 := r4 - 4; and set flags
                bpl     printLoop       ; TUT: "branch if positive or equal"

                adr     r0, storage     ; point at the 8-char-wide buffer
                bl      putstring       ; and print it

                ldmfd   sp!, {r4, pc}   ; TUT: idiom for returning from routine
                                        ; "load multiple full descending"
                                        ; r4 := mem32[sp]
                                        ; sp := sp + 4
                                        ; pc := mem32[sp]
                                        ; sp := sp + 4

; - - - - - - - - - - - - - - - - - - - 
; data area

hexDigits       DCB     "0123456789ABCDEF"
storage         DCB     "12345678", 0   ; storage for 8 digit hex string, 
                                        ; null terminated
                ALIGN

; --------------------------------------

putchar
                ; Print the single character whose ASCII code is in r0.
                ; Conforms to APCS.


                stmfd   sp!, {r4-r12, lr} 
                                        ; push the registers that 
                                        ; you want to save

                ; Call SYS_WRITEC, with r1 containing a POINTER TO a character.
                adr     r1, char
                strb    r0, [r1]
                mov     r0, #SYS_WRITEC
                swi     0x123456

                ldmfd   sp!, {r4-r12, pc} 
                                        ; and pop them once you're back
; - - - - - - - - - - - - - - - - - - - 
char            DCB     0
                ALIGN
; --------------------------------------

putstring
                ; Print the null-terminated string pointed to by r0.
                ; Conforms to APCS.

                stmfd   sp!, {r4-r12, lr}

                mov     r1, r0
                mov     r0, #SYS_WRITE0
                swi     0x123456

                ldmfd   sp!, {r4-r12, pc}
; --------------------------------------

                END
                IMPORT  printhex
                IMPORT  putchar
                IMPORT  putstring

AXD (ARM Debugger for Windows and UNIX)

Start AXD with Project | Debug.

When AXD loads, it may ask you to choose a target. This refers to running the program on the ARM board, rather than in the emulator. As the link to the board is rather slow, we'll run the first half of this exercise in the emulator (select ARMUL).

Now, let's go back and look at the program's execution in more detail.

In summary, "step in" (F8) will do one instruction at a time; "step over" will do one instruction or subroutine at a time; and breakpoints halt the run whenever they are hit. They provide different granularities of control over debugging.

Now try it on the ARM board. To do this:

  1. Make sure the board is connected to your PC via a serial cable (not the Altera's parallel cable) and switched on.
  2. Select Options | Configure Target..., and click on ADP.
  3. Configure it like this:

Once you get all this prewritten code going, flex your muscles by adding some code of your own to "hello.s" so that it also writes out "2+2=", computes the result in assembler, and prints the result (as a 32-bit hex number) by calling printhex.

Task

The task is to write a program which reads in a list of strings, sorts them and prints them out.

Use this template for any ARM code you write.

Allocate a pool of space to store the strings using the "%" operator. This is done by adding the following just before the END line:

                ALIGN   4               ; ensure allocated memory is 
                                        ; word aligned
strings         %       4096

Strings should not be allocated a fixed amount of space per string. Instead, pack the strings into the large allocated space so that the space is used efficiently. Use the C convention and store each string as a sequence of characters terminated by null (ASCII code 0, written '\0' in assembler).

A table of pointers to the strings (itself null-terminated) should also be built. During sorting, swapping strings over just requires the appropriate pointers in this table to be swapped.

The code to perform the sort should be a subroutine which accepts a pointer to the table of string references. The overall sort routine should call another routine to actually compare a pair of strings. The comparison of strings should be done on ASCII character values with the first character being the most significant.

The strings should be read in one per line ('\n' is the end of line character). The end of input should be indicated by entering a line with a single full stop ('.') on it.

Note that all other inputs, including empty lines and lines with multiple full stops, should be treated as valid and therefore accepted and sorted.

You are not required to provide a backspace facility to get full credit but it's ok if you do.

It is not recommended that you attempt to provide any special cursor movement facilities (arrow keys etc).

The characters can be read from the keyboard one at a time using the appropriate SWI call. The debugger ROM on the board defines a number of useful SWIs, all of which are accessed by putting the function code in r0 and executing a SWI 0x123456 instruction. Note that reading a character will not result in a character being displayed on the console window, so you have to echo the character yourself (to give some some feedback to the user).

Function Code in r0 Arguments in r1 Returns
SYS_WRITEC0x3 A pointer to a character to be printed. None
SYS_WRITE0 0x4 A pointer to a null-terminated string to be printed. None
SYS_READC 0x07 Must be zero. A character read from the debugger's input window.
Angel_ReportException 0x18 An exception code; the most useful one is 0x20026 (exit) Does not return.

E.g., to read a character:

                mov     r0, #7
                mov     r1, #0
                swi     0x123456

These routines may corrupt the values of r4-r12, and r14, so you should save them before calling if you are using them (see ioroutines.s for examples). For SYS_READC, the character typed will be returned in r0.

Hint: build up this program slowly, starting with a program which reads and writes a single string, then multiple strings without sorting, and finally with sorting.

Remember to comment your code appropriately.

Question 1

W2-Q1: How many comparison subroutine calls will be made to sort the following input?

elude
marmalade
elephant
arm
wombat
grapefruit
elephants
elude
ARM

Question 2

W2-Q2: If you were given the job of testing your neighbour's program, with a prize for every crash you could provoke, what kind of "poisoned" inputs would you feed it?

(Note: do not take this to imply that you will lose marks unless you make your own program absolutely bulletproof. For the limited purposes of this week's workshop, if it works on "regular" inputs, it's ok. However, as a designer, you must be aware of the pitfalls of writing low level code, which this week's experience may have highlighted.)

Weekly mini-report

Submit your mini-report before 11:00 on Thursday according to the usual rules.

You must supply a full source listing for the string sort, but not for the hello, 2+2, ioroutines or anything else.

You must make good use of procedure calls to reduce the complexity of any one code segment (this is for your own sanity as well as the marker's!). In particular, your string comparison routine should obey ARM procedure calling standards (arguments passed in registers r0-r3, result returned in r0; registers r4-r11 and r13 cannot be corrupted by the subroutine and must be preserved if they are used).

Be sure to include the answers to the two questions in a comment at the end of your source.

Please don't leave submission to the last moment: expect queues for the demonstrators' attention during the last hour. If you haven't submitted by 11:00 on Thursday, your next chance is on Monday, but with a penalty of 3 marks per weekday (i.e. 6 marks). Worth avoiding.