COMP 273 Lecture Notes - Lecture 10

Computer Science (Sci)
Course Code
COMP 273
Piotr Przytycki

COMP 273 10 - MIPS instructions 3 Feb. 10, 2016
In the past two lectures, we discussed MIPS operations on integers. Today we will consider
a few data structures that you are familiar with, namely arrays and strings, and discuss how to
implement them in MIPS.
Suppose we wish to declare and use an array of N integers. How would we represent this array in
MIPS? If we were programming in C, we might let the array be a[ ]. Let the address of the first
element of the array be kept in one of the registers, say $s0. This address is called the base address
of the array. To access an element of the array, we need to specify the address of that element relative
to the base address, i.e. the offset.
For example, consider the C instruction:
a[12] = a[10];
In MIPS, we cannot transfer data directly from one location in Memory to another. Instead, we
need to pass the data through a register. The corresponding MIPS instructions might be:
lw $t1,40($s0)
sw $t1,48($s0)
The offsets from the base address are 40 and 48 for a[10] and a[12] respectively. Why? We need
four bytes for each word, so an offset of 10 words is an offset of 40 bytes and an offset of 12 words
is an offset of 48 bytes.
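The word-to-byte conversion can be checked with a small C sketch. The helper names byte_offset and measured_offset are hypothetical (they are not part of the lecture), and the sketch assumes a word is a 32-bit int32_t, as in MIPS.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Byte offset of element `index` in an array of 32-bit words.
   (Hypothetical helper: 4 bytes per word, as in MIPS.) */
size_t byte_offset(size_t index)
{
    return index * sizeof(int32_t);
}

/* The same offset, measured from the actual addresses of a[0] and a[index]. */
size_t measured_offset(const int32_t *a, size_t index)
{
    assert(a != NULL);
    return (size_t)((const char *)&a[index] - (const char *)a);
}
```

Here byte_offset(10) gives 40 and byte_offset(12) gives 48, which are the immediates used in the lw and sw instructions above.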
Another example is the C instruction,
m = a[i];
Suppose the integer variable m is stored in register $s1 and the integer index i is stored in register
$s2. We multiply the contents of $s2 by 4 to get the offset, and then we add the offset to the base
address, which we’ll assume is in $s0 as in the above example. Note that we multiply i by 4
using the sll instruction (a left shift by 2 bits), which is simpler than having to perform a general
multiplication. (We will see how to perform multiplication in an upcoming lecture.)
sll $t0, $s2, 2
add $t0, $s0, $t0
lw $s1, 0($t0)
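To see the three MIPS instructions step by step, here is a rough C sketch that mirrors them one line at a time. The function name load_element is hypothetical, and the sketch assumes 32-bit words; it is meant as an illustration of the address arithmetic, not as how a C compiler would necessarily translate an array access.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Mirrors: sll $t0,$s2,2 ; add $t0,$s0,$t0 ; lw $s1,0($t0) */
int32_t load_element(const int32_t *base, uint32_t i)
{
    assert(base != NULL);
    uint32_t offset = i << 2;                  /* sll: i * 4 bytes          */
    const unsigned char *addr =
        (const unsigned char *)base + offset;  /* add: base address + offset */
    int32_t word;
    memcpy(&word, addr, sizeof word);          /* lw: read the 4-byte word  */
    return word;
}
```

Calling load_element(a, i) returns the same value as a[i], provided i is a valid index into a.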
In the C programming language (COMP 206), a variable of type “char” uses one byte.
Thus, each character in a string is stored in one byte.
Previously, we saw the instructions lw and sw which load a word from Memory or store a word
to Memory, respectively. There are similar instructions for single bytes: lb loads a byte, and sb
stores a byte. These instructions are I-format. Again, we specify the address of the byte with a
base address plus an offset. The offset is a signed 16-bit number.
last updated: 17th Feb, 2016. Lecture notes © Michael Langer
lb $s2, 3( $s1 )
sb $s2, -2( $s1 )
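The address used by each of these two instructions is the base address in $s1 plus the signed offset. A minimal C sketch of that computation follows; the function name effective_address is hypothetical, and the base address used in the usage note is only an example value.

```c
#include <assert.h>
#include <stdint.h>

/* Address = base register + signed 16-bit offset from the immediate field. */
uint32_t effective_address(uint32_t base, int32_t offset)
{
    /* The offset must fit in the 16-bit signed immediate field. */
    assert(offset >= -32768 && offset <= 32767);
    return base + (uint32_t)offset;   /* unsigned wrap-around handles negatives */
}
```

For instance, if $s1 held 0x10010010, then lb would read the byte at 0x10010013 and sb would write the byte at 0x1001000E.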
lb takes one byte from memory and puts it into the lower 8 bits of a register. The upper 24 bits of
the register are sign-extended, i.e. filled with whatever value is in the most significant bit (MSB) of
that byte. There is also an unsigned version of load byte, namely lbu, which fills the upper 24 bits
with 0’s (regardless of the MSB).
sb copies the lower 8 bits of a register into some byte in memory. This instruction ignores the
upper 24 bits of the word in the register.
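The difference between lb, lbu and sb can be sketched in C as follows. The three C functions are hypothetical stand-ins named after the instructions; they model a register as a uint32_t and a Memory location as a pointer to a byte.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* lb: load a byte and copy its MSB into the upper 24 bits. */
uint32_t lb(const uint8_t *mem)
{
    assert(mem != NULL);
    uint32_t reg = *mem;
    if (reg & 0x80u)            /* MSB of the byte is 1 */
        reg |= 0xFFFFFF00u;     /* fill upper 24 bits with 1's */
    return reg;
}

/* lbu: load a byte and fill the upper 24 bits with 0's. */
uint32_t lbu(const uint8_t *mem)
{
    assert(mem != NULL);
    return *mem;
}

/* sb: store the lower 8 bits of the register, ignoring the upper 24. */
void sb(uint8_t *mem, uint32_t reg)
{
    assert(mem != NULL);
    *mem = (uint8_t)(reg & 0xFFu);
}
```

For the byte 0x80, lb produces 0xFFFFFF80 whereas lbu produces 0x00000080. Storing the register value 0x12345678 with sb writes only the byte 0x78.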
Example: How to calculate the length of a character string?
Consider the following C instructions that compute the length of a string. In C, a string is a
sequence of bytes, terminated by the ASCII NULL character which is denoted ’\0’ and which has
byte value 0x00.
char *str;               // Declare a pointer to a string.
                         // str is an address (a 32 bit number).
int ct = 0;              // Number of characters counted so far.
str = "I love COMP 273";
while ( *(str + ct) != '\0' ){
    ct++;
}
The variable str is a pointer (or “reference”) and so its value is an address, which is just an unsigned
integer (32 bits). The fancy instruction above then adds an offset ct to this address. The * operator
dereferences that address, which means that it returns the content that is stored at that address. If
you are taking COMP 206 now, then you will learn about dereferencing soon. (Joseph told me so.)
Here is MIPS code that does roughly the same thing as the while loop part of the above code.
(Let’s not deal with the initialization part of the code that comes before the while loop. For now,
just accept that the first instruction la “loads the address” of the string str into register $s0. Also,
assume ct is in register $s1.)
      la   $s0, str             # pseudoinstruction (load address)
      add  $s1, $zero, $zero    # initialize ct, $s1 = 0.
loop: add  $t0, $s0, $s1        # address of byte to examine next
      lb   $t1, 0( $t0 )        # load that byte to get *(str + ct)
      beq  $t1, $zero, exit     # branch if *(str + ct) == '\0'
      addi $s1, $s1, 1          # increment ct
      j    loop
exit:                           # ct (in $s1) is now the string length
Register $t1 holds the ct-th char (a byte) and this byte is stored in the lower 8 bits of the register.
Note also that I have used $t registers for temporary values that do not get assigned to variables
in the C program, and I have used $s registers for variables. I didn’t need to do this, but I find it
cleaner to do it.
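To make the correspondence between registers and variables explicit, here is a C sketch of the same loop in which each local variable is named after the register that plays its role in the MIPS code. The function name string_length is hypothetical, and the line-by-line pairing is only an illustration.

```c
#include <assert.h>
#include <stddef.h>

/* s0 = base address of the string, s1 = ct, t0 and t1 = temporaries. */
size_t string_length(const char *s0)
{
    assert(s0 != NULL);
    size_t s1 = 0;                  /* add  $s1, $zero, $zero */
    for (;;) {
        const char *t0 = s0 + s1;   /* add  $t0, $s0, $s1     */
        char t1 = *t0;              /* lb   $t1, 0($t0)       */
        if (t1 == '\0')             /* beq  $t1, $zero, exit  */
            break;
        s1 = s1 + 1;                /* addi $s1, $s1, 1       */
    }                               /* j    loop              */
    return s1;
}
```

For the string "I love COMP 273" the function returns 15, since the terminating NULL character is not counted.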
Assembler directives
We have seen many MIPS instructions. We now would like to put them together and write programs.
To define a program, we need certain special instructions that specify where the program begins, etc.
We also need instructions for declaring variables and initializing these variables. Such instructions
are called assembler directives, since they “direct” the assembler on what (data) to put where.
Let’s look at an example. Recall the MIPS code that computes the length of a string. We said
that the string was stored somewhere in Memory and that the address of the string was contained
in some register. But we didn’t say how the string got there. Below is a program that
defines a string in MIPS Memory and that counts the number of characters in the string.
The program begins with several assembler directives (.data, .asciiz, ...) followed by a main
program for computing string length. The .data directive says that the program is about to declare
data that should be put into the data segment of Memory. Next, the label str just
defines an address that you can use by name when you are programming. It allows you to refer to a
data item (in this case, a string). We’ll see how this is done later. In particular, the line has an
.asciiz directive which declares a string, “COMP 273”. When the program below uses the label
str, the assembler translates this label into a number. This number can be either an offset value
which is stored in the immediate field (16 bits) of an I-format instruction or a 26 bit offset which is
stored in a J-format instruction.
The .text directive says that the instructions that follow should be put in the instruction
(“text”) segment of memory. The .align 2 directive puts the instruction at an address whose
lower 2 bits are 00. The .globl main directive declares that the instructions that follow are your
program. The first line of your program is labelled main.
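The effect of .align 2 on an address can be sketched in C. The helper name align_up is hypothetical, and the sketch assumes that alignment is achieved by rounding the address up to the next multiple of 2^n; it illustrates the arithmetic only, not how any particular assembler implements the directive.

```c
#include <assert.h>
#include <stdint.h>

/* Round addr up to the next address whose lower n bits are all 0. */
uint32_t align_up(uint32_t addr, unsigned n)
{
    assert(n < 32);
    uint32_t mask = (1u << n) - 1u;   /* n = 2 gives mask = 0b11 */
    return (addr + mask) & ~mask;     /* clear the lower n bits  */
}
```

With n = 2, the address 0x00400001 is rounded up to 0x00400004, while 0x00400004 is already word aligned and is left unchanged.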
.data # Tells assembler to put the following
# ABOVE the static data part of Memory.
# (see remark at bottom of this page)
str: .asciiz "COMP 273" # Terminates string with NULL character.
.text # Tells the assembler to put following
# in the instruction segment of Memory.
.align 2 # Word align (2^2)
.globl main # Assembler expects program to have
# a "main" label, where program starts
main: la $s0, str # pseudoinstruction (load address)
The address of the string is loaded into register $s0 using the pseudoinstruction la, which stands for
load address. MARS translates this instruction into
lui $s0, 4097
Note that (4097)_10 = 0x1001. Since lui places its 16-bit immediate in the upper 16 bits of the
register, the address is 0x10010000, which is the beginning address of
the user data segment. (Actually there is a small “static” part of the data segment below this which
starts at address 0x10000000. You can see this in MARS.)
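The arithmetic behind lui can be checked with a short C sketch. The function name load_upper_immediate is hypothetical; it models the register as a uint32_t whose upper 16 bits receive the immediate and whose lower 16 bits are set to 0.

```c
#include <assert.h>
#include <stdint.h>

/* lui: place a 16-bit immediate in the upper half of a 32-bit register. */
uint32_t load_upper_immediate(uint32_t imm16)
{
    assert(imm16 <= 0xFFFFu);   /* immediate field is 16 bits wide */
    return imm16 << 16;         /* lower 16 bits become 0 */
}
```

With the immediate 4097 (which is 0x1001), the result is 0x10010000.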