PIC for Programmers or Yet Another PIC tutorial

1. Introduction

If you know assembly programming but are not familiar with the PIC family of microcontrollers, you will find in these notes a concise yet rather complete introduction to assembly programming on the PIC. If you are already a PIC programmer, you may find useful the discussion below on orthogonal assembly notation for addressing, its use in designing a simple macro library that overcomes PIC's asymmetric assembler notation, and the program examples and exercises at the end of this tutorial. Note, however, that we do not address I/O programming at all, even though it is the main reason to use a PIC microcontroller in the first place: controlling devices, handling interrupts and building many kinds of real-time applications. You will find plenty of excellent examples of that on the many PIC pages, such as beginners check list and talking electronics. So take the word "complete" above in a very restricted sense: data access methods, extended precision arithmetic and programming algorithms.

PIC architecture summary:

Some highlights of the PIC16F8X microcontroller architecture, taken from the manufacturer's datasheet:

Harvard architecture with a separate program memory bus (14 bits wide) for instructions and a data memory bus (8 bits wide).
RISC architecture with 35 instructions, each occupying a single 14 bit program memory word and a two-stage pipeline allowing most instructions to be executed in a single cycle (the 16F8X models have 1K program flash memory words on the chip; other models have up to 8K words).
internal RAM implemented in two switchable file register banks with 80 bytes each (they are switched by bit 5 of the Status register; other PIC models may have up to 4 banks); the first 12 file registers are special purpose (they are called Special Function Registers or SFR), including the Status register, Program Counter (PC), interrupt control and timer.
64 bytes of EEPROM memory for storing constant data.
hardware controlled stack, 8 levels deep (up to 8 nested subroutine calls).
13 bidirectional I/O pins.
4 internal and external interrupt sources, a programmable timer and a Watchdog timer.
orthogonal instruction set "allowing any operation on any register using any addressing mode".

PIC's Arithmetic and Logic Unit (ALU) is 8 bits wide and has a single accumulator called the working register or W register. The ALU is capable of addition, subtraction (two's complement) and logic operations such as rotates, or, and, exclusive or, etc. Three bits in the STATUS register (which is file register 03) may be affected by these instructions: Z (Zero), C (Carry) and DC (Digit Carry, which is analogous to the Auxiliary Carry of the 8085 and 8086 microprocessors). They are, respectively, Status register bits 2, 0 and 1. Two-operand arithmetic and logic instructions take W as one operand and a file register or a literal (constant) as the second operand. When the operands are W and a file register, one bit in the instruction selects the destination of the result, which can be either the working register W (value 0) or the file register (value 1). This destination is generically called d and is written w or f in assembler source. For example, the instruction addwf fr1, w adds file register fr1 and W leaving the result in W, while addwf fr1, f does the same addition but leaves the result in file register fr1. This allows some unconventional operations such as subwf fr1, w, which performs the operation: fr1 - w => w.
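To make the role of the destination bit d and of the Z, C and DC flags concrete, here is a minimal C sketch that models addwf and subwf. The type pic_t and the function names are our own illustration, not part of any Microchip tool; only the three flags discussed above are modelled.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model of the PIC ALU state; all names are illustrative. */
enum { FLAG_C = 1 << 0, FLAG_DC = 1 << 1, FLAG_Z = 1 << 2 }; /* STATUS bits 0, 1, 2 */

typedef struct {
    uint8_t w;          /* working register W */
    uint8_t status;     /* only C, DC and Z are modelled */
    uint8_t file[0x50]; /* file registers 0x00..0x4F */
} pic_t;

static void set_flags(pic_t *p, unsigned result, int carry, int digit_carry)
{
    p->status &= (uint8_t)~(FLAG_C | FLAG_DC | FLAG_Z);
    if (carry)                p->status |= FLAG_C;
    if (digit_carry)          p->status |= FLAG_DC;
    if ((result & 0xFF) == 0) p->status |= FLAG_Z;
}

/* addwf fr, d : fr + W => W (d = 0) or fr (d = 1) */
void addwf(pic_t *p, uint8_t fr, int d)
{
    assert(fr < sizeof p->file);
    unsigned sum = (unsigned)p->file[fr] + p->w;
    int dc = ((p->file[fr] & 0x0F) + (p->w & 0x0F)) > 0x0F;
    set_flags(p, sum, sum > 0xFF, dc);
    if (d) p->file[fr] = (uint8_t)sum; else p->w = (uint8_t)sum;
}

/* subwf fr, d : fr - W => W (d = 0) or fr (d = 1).
   After a subtraction C = 1 means "no borrow", i.e. fr >= W. */
void subwf(pic_t *p, uint8_t fr, int d)
{
    assert(fr < sizeof p->file);
    unsigned diff = (unsigned)p->file[fr] - p->w;
    int c  = p->file[fr] >= p->w;
    int dc = (p->file[fr] & 0x0F) >= (p->w & 0x0F);
    set_flags(p, diff, c, dc);
    if (d) p->file[fr] = (uint8_t)diff; else p->w = (uint8_t)diff;
}
```

Note that after a subtraction the carry flag works as an inverted borrow: it is set when the file register is greater than or equal to W.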

Three mov type instructions allow one to copy the value from a file register to W (movf fr, w), from W to a file register (movwf fr), and to load a constant or literal into W (movlw k). We find these assembler mnemonics asymmetric and particularly confusing for the beginner PIC programmer, for the reasons outlined in the next section.

2. Assembly language addressing paradigms

There are 2 widely used paradigms for addressing operands in assembly languages:

the Intel paradigm (we call it this because it is used by all Intel processors) codes a generic two operand opr instruction (such as add) this way:
opr dest, source
with the meaning: dest <= dest opr source (read this as: dest becomes "dest opr source")
the PDP11 paradigm (named after the venerable PDP11 minicomputer of the early 70's and used by its successors from Motorola, the MC68HC11 and MC68000 microprocessors) codes a similar instruction as:
opr source, dest
with the meaning: source opr dest => dest (read this as: "source opr dest" goes to dest)

These two paradigms are equally convenient and natural if used in an orthogonal (i.e., symmetric) way: every instruction with two operands should use one of these two formats. The PIC16F8X adopts the PDP11 paradigm (the destination designator d is the second operand in a two register operand instruction), but in a non-orthogonal way, as the three different mov instructions above clearly show. It would be much clearer to write, for example: mov fr, w, mov w, fr and mov #literal, w (as the PDP11 did). A simple macro library that overcomes some of these problems can be found here. It extends the mov macro-instruction to include two distinct file registers and includes a clever xchg fr1, fr2 macro that exchanges the values of two file registers using only the accumulator W as a temporary variable (adapted from a macro of Ivan Cenov). These macros may help a beginner programmer to think about the problem to be solved instead of the assembler idiosyncrasies.
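Exchanging two registers with a single scratch register looks impossible at first sight. One well known way to do it is the exclusive-or trick sketched below in C. This is only our assumption about how such a macro can work, not a copy of the macro in the library; the PIC instruction that each C line stands for is shown in the comment.

```c
#include <assert.h>
#include <stdint.h>

/* Exchange two file registers using only the accumulator W as scratch.
   A sketch of the XOR technique; the function name is illustrative. */
void xchg(uint8_t *fr1, uint8_t *fr2)
{
    uint8_t w;
    assert(fr1 != 0 && fr2 != 0);
    w = *fr1;      /* movf  fr1, w : fr1 => W                  */
    w ^= *fr2;     /* xorwf fr2, w : fr1 xor fr2 => W          */
    *fr1 ^= w;     /* xorwf fr1, f : fr1 now holds the old fr2 */
    *fr2 ^= w;     /* xorwf fr2, f : fr2 now holds the old fr1 */
}
```

The sequence costs four single-word instructions and leaves fr1 xor fr2 in W.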

3. Instruction Set Summary

Most Instruction Set documents, including Microchip manuals, group PIC instructions according to their physical format and not by their common addressing modes or functions, although the latter grouping makes learning and using them much easier. We have adopted this latter approach and divided PIC instructions into the following groups:

Mov instructions - they copy a value from/to a file register or literal to/from register W
Logic and arithmetic instructions with a file register and register W as operands
Logic and arithmetic instructions with a literal and register W as operands
One operand Logic and Arithmetic instructions
Branch, Skip, Call and Return instructions
Useful macros for conditional branches, logic and arithmetic operations

You should look carefully at this instruction summary document. To test your first PIC programs, an assembler and a simulator are the ideal tools. You can download from Microchip the excellent MPLAB IDE for Windows, an integrated editor, assembler and simulator.

4. Pointer or indirect addressing with PIC

If you need to use arrays or more complex data structures such as lists, you will need pointer variables, which in most computer architectures are implemented through register indirect addressing: in other words, you use the contents of a register as the address of some aggregate data structure, and access the data indirectly through this register. PIC has just one such register, called FSR (file register 04), which is itself used in an indirect way: whenever you want to use FSR as a pointer, you name the fictitious register INDF (file register 0) as one operand of your mov, arithmetic or logical instruction, and the PIC processor "takes the contents of the FSR register" as the register number, as if you had coded that number directly in your instruction instead of INDF. It seems weird, but it makes sense if you recall that PIC designers wanted to code all instructions with a single 14 bit word. (Well, you may argue, they could have designed PIC with a 15 or 16 bit instruction word and reserved one bit for indirect file register addressing, turning any file register into a potential pointer register; wouldn't that be great? There are indeed PIC models with a 16 bit program word but, as far as I know, none incorporates this feature!) In any case, you can easily loop through a vector of bytes by addressing each data element through INDF and incrementing (or decrementing) FSR to point to the next element. As an example (adapted from PIC's datasheet), this program fragment fills the 68 General Purpose Registers (GPR), addresses 0x0C through 0x4F, with the constant 0xFF:

movlw 0x0C	; 0x0C => W (first GPR number)
movwf FSR	; W => FSR
loop:
movlw 0x50	; 0x50 => W (last GPR number + 1)
clrf INDF	; clear the register pointed to by FSR
decf INDF, 1	; 0 - 1 => register pointed to by FSR, i.e. 0xFF
incf FSR, 1	; FSR points to next file register
subwf FSR, w	; FSR - 0x50 => W, Z is set when FSR = 0x50
bnz loop	; if result # 0 goto loop
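For readers who think more easily in C, the fragment above behaves like the following sketch, in which the file registers are modelled as a byte array and FSR is simply element 4 of that array. The names write_indf and fill_gpr are ours, chosen only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

#define FSR       0x04   /* file register 4 holds the pointer */
#define GPR_FIRST 0x0C   /* first general purpose register    */
#define GPR_END   0x50   /* last GPR number + 1               */

/* Write through INDF: the data goes to the register whose number is in FSR. */
static void write_indf(uint8_t file[GPR_END], uint8_t value)
{
    assert(file[FSR] < GPR_END);
    file[file[FSR]] = value;
}

/* Fill the 68 general purpose registers 0x0C..0x4F with 0xFF. */
void fill_gpr(uint8_t file[GPR_END])
{
    file[FSR] = GPR_FIRST;            /* movlw 0x0C / movwf FSR     */
    do {
        write_indf(file, 0xFF);       /* clrf INDF / decf INDF, 1   */
        file[FSR]++;                  /* incf FSR, 1                */
    } while (file[FSR] != GPR_END);   /* subwf FSR, w / bnz loop    */
}
```

The pointer lives inside the same array it points into, which is exactly the situation on the PIC: FSR is itself a file register.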

Exercise 1: change the above program fragment to fill the 68 GPR registers with the numbers 1, 2, ..., 68.

As a more elaborate example of pointer addressing with INDF and FSR, this program computes the first few elements of the Fibonacci sequence (recall from your Math classes that each element of the Fibonacci sequence is the sum of the previous two: you start with the first two elements 0 and 1, and next you get 1, 2, 3, 5, 8, 13, 21, 34, and so on). The xchg macro fits nicely into this example. In the program code below, count, f0 and f1 are scratchpad variables; the computed Fibonacci numbers are stored in a table starting at file register fib; f0 and f1 hold the last two computed Fibonacci numbers. Up to 12 Fibonacci numbers can be computed this way with 8 bit precision.

Computing the first 12 Fibonacci numbers:

movlw fib	; table address => W
movwf FSR	; table address => FSR
movl d'12', w	; compute 12 Fibonacci numbers
mov w, count	; count them
clrf f0	; 1st Fibonacci number is 0
clrf f1
incf f1, f	; 2nd Fibonacci number is 1
loop:
mov f0, w	; f0 => W
add f1, w	; f0 + f1 => W
movwf INDF	; store f0 + f1 in current table entry
xchg f1, w	; f1 => W, f0 + f1 => f1
mov w, f0	; move previous f1 value to f0
incf FSR, f	; FSR points to next table entry
decbnz count, loop	; count - 1 => count, if # 0 goto loop
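The same loop can be written in C as a cross-check of the table contents. This is a sketch with names of our own (fibonacci8, COUNT); every variable is 8 bits wide, as on the PIC, and the comments show the corresponding assembly lines.

```c
#include <assert.h>
#include <stdint.h>

#define COUNT 12   /* number of table entries to compute */

/* Store the Fibonacci numbers 1, 2, 3, 5, ... in fib[], with 8 bit arithmetic only. */
void fibonacci8(uint8_t fib[COUNT])
{
    uint8_t f0 = 0, f1 = 1;              /* clrf f0 / clrf f1 / incf f1 */
    uint8_t *fsr = fib;                  /* movlw fib / movwf FSR       */
    uint8_t count = COUNT;
    assert(fib != 0);
    do {
        uint8_t w = (uint8_t)(f0 + f1);  /* mov f0, w / add f1, w       */
        *fsr = w;                        /* movwf INDF                  */
        f0 = f1;                         /* xchg f1, w / mov w, f0      */
        f1 = w;
        fsr++;                           /* incf FSR                    */
    } while (--count != 0);              /* decbnz count, loop          */
}
```

The twelfth entry is 233; a thirteenth one would be 377, which no longer fits in 8 bits.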

Exercise 2: extend this program to compute Fibonacci numbers with 16 bit precision; for this purpose write a 16 bit addition subroutine; detect the 16 bit sum overflow in order to end your loop (therefore you don't need to count the 23 Fibonacci numbers that fit in 16 bits).

5. Using program memory to store data tables

The PIC 16F8X has a relatively large program memory (1K 14 bit words) compared to only 68 bytes of general purpose RAM. It would be nice if we could use part of the program memory to store tables of read only data. This can be easily done if the table is small enough to fit in a single 256 word "page" (a block of program memory starting at an address that is a multiple of 256). If you look at the PIC instruction set you will find a useful return instruction called retlw k, which loads W with a literal k before returning to the calling program (popping the Program Counter from the hardware stack); this gives a convenient and fast way to return a value from a routine call. This instruction does the trick if we fill our program memory table with up to 256 such return instructions, each containing the desired constant, and use this table as a "call and jump table". We pay 6 extra bits for each constant, but our program memory may have enough free space anyway. How can we index into this table to read an entry value? The solution lies in the fact that the 8 least significant bits of the Program Counter (which, by the way, is 13 bits wide, although 10 bits are enough to address the 1K words of the 16F8X) are accessible as file register 2 (called PCL). Now, suppose that the first entry of our jump table sits exactly at a 256 word page boundary, that the instruction just before it (at the boundary - 1; call that address mytable) is addwf PCL, 1 (which adds W to PCL), and that we want to read the value of the entry whose index we have loaded in W. In our program, we should execute the following instructions:

movlw HIGH (mytable + 1)	; get the high order bits of the first entry address into W,
movwf PCLATH	; and store them in this SFR, to be concatenated later with PCL
mov index, w	; put index into W
call mytable	; should return in W the desired table entry

When the instruction call mytable is executed the following actions take place:

The PC (containing the address of the instruction after the call) is pushed onto the hardware stack.
The PC is loaded with the 11 bit constant embedded in the call instruction, which is the address of the instruction at mytable; at the same time PCL is loaded with the 8 least significant bits of this address. The 5 most significant bits of the PC are not copied into the special register PCLATH (file register 0Ah); that is why we had to load it beforehand.
The instruction addwf PCL, 1 at mytable is executed. At this point (just before this execution) the PC has been incremented to point to the next instruction, which is the first instruction of our jump table, and PCL has been adjusted accordingly. Executing the instruction addwf PCL, 1 adds W to PCL and loads the PC with the contents of PCLATH concatenated with the contents of PCL, leading the processor to fetch the desired table entry, where the retlw k instruction returns in W the required entry value! Because this is an 8 bit addition and not an 11 bit addition, our table cannot extend beyond the current 256 word page boundary without further calculations in the initial setup just before the call mytable instruction.
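The address arithmetic of the last step can be sketched in C as follows. The function computed_goto is a hypothetical model of ours that only reproduces the 8 bit addition and the concatenation with PCLATH described above; it shows why an entry beyond the page boundary is not reached.

```c
#include <assert.h>
#include <stdint.h>

/* Address fetched after "addwf PCL, 1" located at program address addwf_addr.
   pclath is the value the caller stored in PCLATH, w is the table index. */
uint16_t computed_goto(uint16_t addwf_addr, uint8_t pclath, uint8_t w)
{
    uint8_t pcl;
    assert(addwf_addr < 0x2000);            /* 13 bit program counter        */
    pcl = (uint8_t)(addwf_addr + 1);        /* PC already points past addwf  */
    pcl = (uint8_t)(pcl + w);               /* 8 bit addition, carry is lost */
    return (uint16_t)(((pclath & 0x1F) << 8) | pcl);
}
```

With the addwf instruction at address 0x1FF and PCLATH = 0x02, index 5 leads to address 0x205 as intended. With the addwf instruction at 0x2F0 instead, index 0x20 leads to 0x211 rather than 0x311, because the carry out of PCL never reaches the high bits.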

Exercise 3: suppose your table spans multiple 256 byte pages and its index is computed with 16 bit precision. Modify the above setup calculations in order to retrieve the required table entry. As a further enhancement allow your jump table to start at any memory address and not only at a 256 byte page boundary (you will need this if you decide to go on and work on exercise 6 at the end of these notes!)

Let's apply this technique in a complete example, the solution of the so called "Maximum Sum Subvector Problem": given a vector of randomly distributed 8 bit signed integers, find a subvector of consecutive elements with maximum sum. It is simple to devise an algorithm with computation effort proportional to the cube of the number n of elements in the vector (computer scientists call this an "O(n**3) solution"), but it is not trivial to devise a linear time algorithm (i.e. O(n)). The following deceptively simple algorithm (written in C) is such a solution (try it!):

Linear Time Maximum Sum Subvector Algorithm:

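#include <stdio.h>
#include <stdlib.h>

#define TMAX 16   /* vector size (an assumed value, pick your own) */

/* assumed helper: fill tab with n random signed 8 bit integers */
void rand8(signed char *tab, int n)
{
  int k;
  for (k = 0; k < n; k++)
    tab[k] = (signed char)(rand() % 256 - 128);
}

int main(void)
{
  int i, j, start, end, csum, maxsum;
  signed char tab[TMAX];
  rand8(tab, TMAX);   /* initialize vector with random signed 8 bit integers */
  csum = maxsum = start = 0; end = -1;
  for (i = 0, j = 0; j < TMAX; j++) {
    csum = csum + tab[j];
    if (csum > maxsum) {
      maxsum = csum;
      start = i;
      end = j;
    } else if (csum < 0) {
      i = j + 1;
      csum = 0;
    }
  }
  printf("start=%d end=%d maxsum=%d\n", start, end, maxsum);
  return 0;
}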
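For comparison, the cubic time approach mentioned above can be sketched as below. It simply tries every pair of start and end positions, and it is handy as a reference when testing the linear algorithm on small vectors. The function name maxsum_cubic and its parameters are our own choice.

```c
#include <assert.h>

/* Brute force O(n**3) search. Returns the maximum sum and stores the
   subvector limits in *start and *end. An all-negative vector gives
   sum 0 with start = 0 and end = -1, as in the linear algorithm. */
int maxsum_cubic(const signed char *tab, int n, int *start, int *end)
{
    int i, j, k, maxsum = 0;
    assert(n >= 0);
    *start = 0;
    *end = -1;
    for (i = 0; i < n; i++)
        for (j = i; j < n; j++) {
            int sum = 0;
            for (k = i; k <= j; k++)   /* sum of tab[i..j] */
                sum += tab[k];
            if (sum > maxsum) {
                maxsum = sum;
                *start = i;
                *end = j;
            }
        }
    return maxsum;
}
```

When several subvectors share the maximum sum, the two algorithms may report different limits, so compare the sums first.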

Our goal is to rewrite this algorithm in PIC assembly language, initializing the vector of random signed integers as a table in PIC's program memory (on a tiny PIC 16F8X we could have a table with more than 900 entries!). This exercise will illustrate several important general assembly language programming techniques:

loop control based on a counter variable (the C for statement)
the C statement csum = csum + tab[j] requires sign extending the value tab[j] to 16 bits and a 16 bit addition to compute csum (with only 8 bits we could easily get an overflow); see the subroutine sum in the assembly code.
the comparison in if (csum > maxsum) requires a 16 bit two's complement signed integer comparison routine (subroutine cmpcsummaxsum)
the if statements require several conditional branches in assembly code
several 8 bit and 16 bit variable assignment statements (trivial but tedious to code; our macro-instruction mov fr1, fr2 comes in handy here).
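The sign extension and 16 bit addition in the second item can be sketched in C using only byte-sized operations, which is how the work has to be split on an 8 bit ALU. The type word16 and the functions add_sext8 and to_long are illustrative names of ours, not the names used in the assembly program.

```c
#include <assert.h>
#include <stdint.h>

/* A 16 bit variable kept in two file registers. */
typedef struct { uint8_t lo, hi; } word16;

/* csum = csum + (sign extended) x, using 8 bit additions and a carry. */
void add_sext8(word16 *csum, uint8_t x)
{
    uint8_t x_hi;
    unsigned lo;
    assert(csum != 0);
    x_hi = (x & 0x80) ? 0xFF : 0x00;   /* high byte of the sign extended x */
    lo = (unsigned)csum->lo + x;       /* low byte addition                */
    csum->lo = (uint8_t)lo;
    csum->hi = (uint8_t)(csum->hi + x_hi + (lo >> 8)); /* add the carry    */
}

/* Read the pair back as a signed value (for checking only). */
long to_long(word16 v)
{
    long u = ((long)v.hi << 8) | v.lo;
    return u >= 0x8000L ? u - 0x10000L : u;
}
```

Adding the byte 0xFF (that is, -1) to zero gives hi = 0xFF and lo = 0xFF, the 16 bit representation of -1.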

Although it is unlikely you will ever find a practical use for this algorithm in a real PIC application, the sub-problems listed above will certainly arise in many real applications, which is the main reason to include the program in this tutorial.
I also think it is more instructive to show a complete small structured program than code fragments (it is more fun anyway, and this was indeed the first non-trivial program I wrote for the PIC).

With the above explanations, the code should now be simple to follow:

the instructions from init: to loop: initialize the program variables,
the C for loop goes from label loop: till instruction b loop,
the first 3 instructions after loop check if we have iterated j from 0 to TMAX,
the sum and cmpcsummaxsum subroutines do the more complex computations in two's complement arithmetic; read them carefully;
the rest of the code fills in the several conditional branches needed by the two C if statements: look at the assembly line comments to check the corresponding C source code.

Exercise 4: we cheated a bit when we said that the subroutine cmpcsummaxsum compares two signed 16 bit integers: in reality the second integer (maxsum) is always >= 0, and we took advantage of that to make our code faster. Rewrite this subroutine so that it can compare two arbitrary signed 16 bit integers. You can start by looking at some 16 bit unsigned comparison subroutines.

Exercise 4.1: write a 16 bit signed subtraction subroutine. Look here for a 16 bit subtraction subroutine. Test your subroutine computing the Fibonacci numbers backwards, i.e., start with two consecutive Fibonacci numbers (for example, 46368 and 28657), and subtract backwards until you reach number 0.

Exercise 4.2: write a 16 bit signed multiplication subroutine. Test it computing the successive powers of a small negative integer (say, -3), which gives alternately positive and negative integers. For this purpose you could extend this 8 bit unsigned multiplication routine found in the MPLAB installation.

Exercise 5:

Compile the above C program with your preferred C compiler (you can download the maxsum.c program, written for the free, old, but still useful Borland C 2.0) and test it with small vectors. Make sure it gives the correct answers.
Test your program with a vector of 256 entries. Show the start, end and maxsum results in hexadecimal.
Change your C program in order to generate the vector of random signed integers in the format required by PIC's assembler tab jump table, as in our example (recall that the assembler default for constants is hexadecimal; you can change that with the radix dec assembler directive). Write this output to a file.
Copy and paste the above file into your PIC assembly source code where the tab table is.
Assemble and run your program. Check the results against your C program results: they should be the same.

Exercise 6 (this should be fun!):
If the above exercise was not a real challenge for you, allow your jump table to start at any address and to be as large as possible (900 bytes, say); you should modify the assembler program so that all table indices are now 16 bits wide. Make sure csum does not overflow 16 bits in your C tests before generating the assembler table (let me know if it works!).
