Princeton University
COS 217:  Introduction to Programming Systems

Precept 12:  Getting Started with SPARC Assembly Language

Purpose

Help you learn how to create simple "hello world" programs in SPARC assembly language

Reading

Paul, Chapters 1, 2, 3, 4, 8, 9 (for several precepts)

Approach

Study many small C programs and corresponding hand-written assembly language programs

After studying each program, refer to these summary sheets to reinforce and generalize the new material that the program illustrates

SPARC Architecture Summary

SPARC Assembly Language Summary

Assembly Language Program Development

To develop an assembly language program named p:

Method 1 (simple):

(1) Create assembly language source code

xemacs p.s

Emacs recognizes the ".s" suffix as indicating that the file contains an assembly language program

Uses "assembler mode" (and not "C mode")

TAB characters are not treated specially

(2) Assemble and link the program

gcc -o p p.s

gcc recognizes the ".s" suffix as indicating that the file contains an assembly language program

Assembles and links the program

Does not preprocess or compile

(3) Execute

p

Note:  gcc -S p.c produces an assembly language version of p.c in p.s

Valuable learning tool

Should not use for Assignment 4!

Using the m4 Preprocessor

Problem:  How to define symbolic constants?

Solution:  Use the m4 preprocessor

To define a symbol:  define(symbol, value)

To use a symbol: symbol

Method 2 (with m4): 

(1) Create assembly language source code containing m4 macros

xemacs p.m

(2) Preprocess

m4 p.m > p.s

(3) Assemble and link the program

gcc -o p p.s

(4) Execute

p

Suggestion:  If you want to use m4, use it minimally

Use it to define symbolic constants

Do not use it to hide assembly language 

(Contrary to the Paul textbook)

There is an easier way...

Using the C Preprocessor

Problem:  How to define symbolic constants?

Solution:  Use the C preprocessor!!!

Method 3 (with C preprocessor):

(1) Create assembly language source code containing C preprocessor macros

xemacs p.S

Emacs recognizes the ".S" suffix as indicating that the file contains an assembly language program

(2) Preprocess, assemble and link the program

gcc -o p p.S

gcc recognizes the ".S" suffix as indicating that the file contains an assembly language program

Preprocesses, assembles and links the program

Does not compile

(3) Execute

p

Suggestion:  

Use the C preprocessor minimally -- only to define symbolic constants

Do not use macros to hide assembly language
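As a minimal sketch of this minimal use, the C fragment below defines one symbolic constant and uses it. The names FRAME_SIZE and stackAdjustment are hypothetical, chosen only for illustration; the value 96 is the one that appears in the save instruction of the hellop example. Because gcc runs the same preprocessor over a ".S" file, the same #define line could appear at the top of p.S.

```c
#include <assert.h>

/* Symbolic constant (hypothetical name).  The preprocessor replaces
   every later occurrence of FRAME_SIZE with 96 before translation. */
#define FRAME_SIZE 96

/* Return the amount that would be added to the stack pointer to push
   a frame of FRAME_SIZE bytes. */
int stackAdjustment(void)
{
   /* After preprocessing, the expression below reads -96. */
   assert(FRAME_SIZE > 0);
   return -FRAME_SIZE;
}
```

The constant has a name and nothing more: the code that uses it is still ordinary code, which is the spirit of the suggestion above.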

Example: hellop

See hellop.c and hellop.S

What it does

Prints "Hello world."

How it works

Calls printf

The code...

Assembly Time

General points:

Job of assembler is to read hellop.S and write hellop.o

An object file contains sections; so, in other words...

Job of the assembler is to produce the sections that comprise hellop.o

The object files that we'll study contain four sections:  rodata, data, bss, and text

rodata section contains program-initialized read-only data

data section contains program-initialized read-write data

bss section contains data that should be initialized to 0

text section contains executable code
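To connect the four sections to familiar C, the sketch below shows one C definition of each kind. The placements in the comments are what gcc typically does on ELF systems; that is an assumption, not something the C language guarantees. All names are hypothetical.

```c
/* assert.h is included so that the checks in the note below compile. */
#include <assert.h>

/* Initialized and read-only: typically placed in the rodata section. */
const char acGreeting[] = "Hello world.\n";

/* Initialized and read-write: typically placed in the data section. */
int iCount = 5;

/* No initializer: typically placed in the bss section, so every
   element is 0 when the program starts. */
int aiTotals[4];

/* Executable code: placed in the text section. */
int sumTotals(void)
{
   int i;
   int iSum = 0;
   for (i = 0; i < 4; i++)
      iSum += aiTotals[i];
   return iSum;
}
```

At program startup, assert(sumTotals() == 0) and assert(iCount == 5) both hold. The first holds because of how the bss section is initialized, the second because of the initializer in the source code.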

Assembler keeps a location counter for each section

Each is set to 0 initially

The specifics:

!

Comment

A comment begins with ! and extends to the end of the line

Can also use C-style comments (/* ... */)

.section

An assembler pseudo-op

Also called an assembler directive

Gives instructions/information to the assembler, but does not cause assembler to generate an executable instruction

.section ".rodata"

Add the following code to the rodata section

pcGreeting:

A label:  marks a location in some section

Record the fact that pcGreeting marks location 0 within the rodata section

.asciz "Hello world.\n"

An assembler pseudo-op

Place the ASCII codes for the given string, followed by a null character, into the object file

Increment location counter by 14
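The count of 14 can be checked in C: 13 characters (the "\n" escape is a single character) plus the terminating null character. The function name ascizSize is hypothetical; this is a sketch of the arithmetic, not of the assembler itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Return the number of bytes that .asciz would place in the object
   file for pcString:  one per character, plus one for the terminating
   null character. */
size_t ascizSize(const char *pcString)
{
   assert(pcString != NULL);
   return strlen(pcString) + 1;
}
```

For the greeting, ascizSize("Hello world.\n") is 14, which agrees with sizeof("Hello world.\n") in C.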

.section ".data"

Add the following code to the data section

Unnecessary in this program

.section ".bss"

Add the following code to the bss section

Unnecessary in this program

.section ".text"

Add the following code to the text section

.align 4

An assembler pseudo-op

Increment this section's location counter so it is at an image location that is evenly divisible by 4

Instructions must be aligned on 4-byte boundaries
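The effect of .align on a location counter can be sketched in C as rounding up to the next multiple of the alignment. The function name alignUp is hypothetical, and the sketch assumes the alignment is a power of 2.

```c
#include <assert.h>

/* Round uiCounter up to the next multiple of uiAlignment, which must
   be a power of 2.  A counter that is already aligned is unchanged. */
unsigned int alignUp(unsigned int uiCounter, unsigned int uiAlignment)
{
   assert(uiAlignment != 0 && (uiAlignment & (uiAlignment - 1)) == 0);
   return (uiCounter + uiAlignment - 1) & ~(uiAlignment - 1);
}
```

In hellop the text section's location counter is still 0 when .align 4 is reached, and alignUp(0, 4) is 0, so the pseudo-op changes nothing there. A counter of 14 would become 16.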

.global main

An assembler pseudo-op

Mark the "main" symbol in the program image so it will be available to the linker

Thus the symbol "main" will be addressable from outside of this file

In C terminology:  main is not a static function

main:

Another label

Record the fact that main marks location 0 within the text section

save %sp, -96, %sp

Assembly language instruction

Assembler places corresponding machine code in the program image

Place appropriate 4 bytes (32 bits) in program image, and increment location counter

set pcGreeting, %o0

A synthetic instruction:  an abbreviation for an assembly language instruction, or a sequence of them

At runtime, pcGreeting will be a 32-bit address

We want to store that 32-bit address in register %o0

Can't be done using 1 instruction: each instruction is itself only 32 bits long, so it has no room for a full 32-bit address!!!

Assembler generates two machine language instructions

If you wrote them yourself in assembly language, they would look like this:

sethi  %hi(pcGreeting), %o0
or     %o0, %lo(pcGreeting), %o0

call printf

Assembly language instruction

nop

Assembly language instruction

No operation (pronounced "no op")

mov 0, %i0

Synthetic instruction

Equivalent to one machine language instruction:

or %g0, 0, %i0

ret

Synthetic instruction for jmpl %i7+8, %g0

restore

Synthetic instruction for restore %g0, %g0, %g0
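As a check on the location counter arithmetic, the C sketch below adds up the bytes that the text section would occupy, assuming hellop.S contains exactly the seven instructions walked through above and nothing else in its text section. Every instruction occupies 4 bytes, and set counts as two instructions. The type and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

enum { BYTES_PER_INSTRUCTION = 4 };

/* One source line of the text section, and the number of machine
   language instructions that the assembler generates for it. */
struct SourceLine
{
   const char *pcText;
   unsigned int uiMachineInstructions;
};

static const struct SourceLine asLines[] =
{
   {"save %sp, -96, %sp",  1},
   {"set pcGreeting, %o0", 2},   /* expands to sethi and or */
   {"call printf",         1},
   {"nop",                 1},
   {"mov 0, %i0",          1},
   {"ret",                 1},
   {"restore",             1}
};

/* Return the value of the text section's location counter after all
   of the lines above have been assembled, starting from 0. */
unsigned int textSectionSize(void)
{
   unsigned int uiCounter = 0;
   size_t i;
   size_t uLineCount = sizeof(asLines) / sizeof(asLines[0]);
   assert(uLineCount > 0);
   for (i = 0; i < uLineCount; i++)
      uiCounter += asLines[i].uiMachineInstructions * BYTES_PER_INSTRUCTION;
   return uiCounter;
}
```

Under that assumption, seven source lines become eight machine language instructions, so textSectionSize() is 32.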

Link Time

General points:

Job of linker is to read object file(s) and produce executable file

Linker replaces references to section offsets with real memory addresses

The specifics:

In the sethi and or instructions, replaces the reference to pcGreeting with a memory address

In the call instruction, replaces the reference to printf with a memory address
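For the sethi and or instructions, that replacement amounts to filling in bit fields of the two instruction words. The C sketch below shows the idea. It assumes the standard SPARC encodings, in which sethi carries a 22-bit immediate in its low order 22 bits and or carries its immediate in its low order 13 bits. The function names, and the address 0x10A44 used in the note, are hypothetical.

```c
/* assert.h is included so that the checks in the note below compile. */
#include <assert.h>
#include <stdint.h>

/* Fill the 22-bit immediate field of a sethi instruction with the
   high order 22 bits of uiAddress. */
uint32_t patchHi22(uint32_t uiInstruction, uint32_t uiAddress)
{
   return (uiInstruction & ~UINT32_C(0x3FFFFF)) | (uiAddress >> 10);
}

/* Fill the low order 10 bits of an or instruction's immediate field
   with the low order 10 bits of uiAddress. */
uint32_t patchLo10(uint32_t uiInstruction, uint32_t uiAddress)
{
   return (uiInstruction & ~UINT32_C(0x3FF)) | (uiAddress & UINT32_C(0x3FF));
}
```

If pcGreeting ended up at the hypothetical address 0x10A44, then patching 0x11000000 (sethi with a zero immediate and destination %o0) would give 0x11000042, and patching 0x90122000 (or %o0, 0, %o0) would give 0x90122244.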

Run Time

Execution begins at the instruction whose label is main

save %sp, -96, %sp

%sp stands for the sp register, alias %o6, alias %r14

Pushes a new stack frame onto the top of the runtime stack

(Meaning is described further in later lectures/precepts; for now accept it on faith!)

sethi  %hi(pcGreeting), %o0

Stores high order 22 bits of the memory address denoted by pcGreeting into the high order 22 bits of %o0

Clears the low order 10 bits of %o0

or     %o0, %lo(pcGreeting), %o0

Stores low order 10 bits of the memory address denoted by pcGreeting into the low order 10 bits of %o0
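The combined effect of the two instructions on %o0 can be sketched in C with shifts and masks. The function names are hypothetical, and the register is modeled as a plain 32-bit unsigned integer.

```c
/* assert.h is included so that the checks in the note below compile. */
#include <assert.h>
#include <stdint.h>

/* Model sethi %hi(address), reg:  return the new register contents.
   The high order 22 bits come from the address; the low order 10
   bits are cleared. */
uint32_t sethiResult(uint32_t uiAddress)
{
   return uiAddress & ~UINT32_C(0x3FF);
}

/* Model or reg, %lo(address), reg:  return the new register contents.
   The low order 10 bits of the address are or'ed into the register. */
uint32_t orLoResult(uint32_t uiRegister, uint32_t uiAddress)
{
   return uiRegister | (uiAddress & UINT32_C(0x3FF));
}
```

For any 32-bit address a, orLoResult(sethiResult(a), a) is a again. For example, with the made-up address 0x10A44, sethi leaves 0x10800 in the register and or completes it to 0x10A44.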

call printf

To call a function:

(1) Set its actual parameters into registers %o0, %o1, ...

(2) Execute the call instruction

nop

No operation (pronounced "no op")

A branching instruction (such as call) should always be followed by a nop instruction

Why?  See upcoming lectures and precepts

or %g0, 0, %i0

Assigns 0 to register %i0

Note:  %g0 is the "black hole" register

Used by many synthetic instructions

Allows fewer "real" instructions

Would the set instruction have worked here?  Yes; for a small constant such as 0, the assembler would optimize set to a single instruction
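The "black hole" behavior of %g0 can be sketched with a toy register file in C: reads of register 0 always yield 0, and writes to it are discarded. This is a simplified model that ignores register windows. All names are hypothetical, and the register numbers assume the usual numbering in which %g0 is %r0 and %i0 is %r24.

```c
#include <assert.h>
#include <stdint.h>

enum { REGISTER_COUNT = 32, REG_G0 = 0, REG_I0 = 24 };

static uint32_t auiRegisters[REGISTER_COUNT];

/* Return the contents of a register.  %g0 always reads as 0. */
uint32_t readRegister(int iRegister)
{
   assert(iRegister >= 0 && iRegister < REGISTER_COUNT);
   return (iRegister == REG_G0) ? 0 : auiRegisters[iRegister];
}

/* Store a value in a register.  A write to %g0 is discarded. */
void writeRegister(int iRegister, uint32_t uiValue)
{
   assert(iRegister >= 0 && iRegister < REGISTER_COUNT);
   if (iRegister != REG_G0)
      auiRegisters[iRegister] = uiValue;
}

/* Model or rs1, immediate, rd. */
void orImmediate(int iSource, uint32_t uiImmediate, int iDest)
{
   writeRegister(iDest, readRegister(iSource) | uiImmediate);
}
```

With this model, orImmediate(REG_G0, 0, REG_I0) leaves 0 in %i0 whatever %i0 held before, which is why or with %g0 can stand in for a separate "move" instruction.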

jmpl %i7+8, %g0

Causes return to caller (and thus ends program execution)

We'll study later

To return from a function:

(1) Set return value into register %i0

(2) Execute the ret instruction

restore %g0, %g0, %g0

Mate to save; we'll study it later

Actually executed before control is returned to caller (as described later in course)

Introduced these assembly language features:

Comments

Can also use C-style comments

Pseudo-ops

.section ".rodata"
.section ".data"
.section ".bss"
.section ".text"
.asciz
.align
.global

Labels

Register notation

Control instructions

call
nop
set (synthetic)
sethi
ret (synthetic)
save (described more thoroughly later)
restore (synthetic) (described more thoroughly later)

Logical instructions

or
mov (synthetic)

Example: hellosp

See hellosp.c and hellosp.S

What it does

Reads a name

Prints "Hello name."

How it works

Calls scanf and printf

The code

.skip 100

Assembler pseudo-op

Increment the location counter by 100
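A rough C counterpart of .skip 100 is an array definition with no initializer. The sketch below assumes that the pseudo-op is used in the bss section to reserve a 100-byte buffer for the name; neither the section nor the label appears in this outline, so the names here are hypothetical.

```c
/* assert.h is included so that the checks in the note below compile. */
#include <assert.h>
#include <stddef.h>

enum { NAME_BUFFER_SIZE = 100 };

/* No initializer:  typically placed in the bss section, so all 100
   bytes are 0 when the program starts. */
char acNameBuffer[NAME_BUFFER_SIZE];

/* Return 1 if every byte of the buffer is 0, and 0 otherwise. */
int nameBufferIsZeroed(void)
{
   size_t i;
   for (i = 0; i < sizeof(acNameBuffer); i++)
      if (acNameBuffer[i] != 0)
         return 0;
   return 1;
}
```

Before anything is read into the buffer, sizeof(acNameBuffer) is 100 and nameBufferIsZeroed() returns 1. No bytes for the buffer need to be listed in the source code; only the location counter advances.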

Introduced these assembly language features:

Pseudo-ops

.skip

Control instructions

(More substantial use of function parameters with call instruction)

Copyright © 2002 by Robert M. Dondero, Jr.