Raspberry Pico: Programming with PIO State Machines

By Sebastian Günther

—

21st June, 2021

—

Posted in Microcontroller, Raspberry_pico, C

In microcontroller programming, interfacing with other hardware can be either very simple or very challenging. If the other hardware, e.g. a sensor, supports a standard bus system like I2C, SPI or UART, you just wire it up and read/write data via that bus. If it does not, you are forced to implement precise timing signals yourself, send and receive data over multiple pins, and interpret the signals.

You could program these timing considerations in plain C, but this requires very careful programming because you are tied to the processor's clock cycle and need to understand the timing impact of each line of code.

To address this challenge, the Raspberry Pico has a unique hardware extension: PIO, an abbreviation for Programmable Input/Output. Each PIO block is realized as 4 independent state machines. Each state machine is connected with FIFO queues to exchange data with the main program. Besides the queues, state machines can use DMA and access all GPIOs, but no other hardware or protocols.

The Pico community uses PIO to output sound effects or video, to connect to proprietary LCD systems, or connect other hardware that requires a very specific protocol.

To help you get started with PIO, this article is a concise introduction: Learn the hardware essentials, see what a PIO program looks like and how it interacts with a C main program, and finally dive into the PIO programming language.

Why do you need PIO?

When you want to interface with hardware that cannot be connected via the onboard supported protocols of USB, I2C, SPI or UART, you are forced to write very time-constrained code to read and write to GPIOs. When the external hardware requires a very low-speed data transmission, you need to work with interrupts or long wait cycles.

The official C SDK guide clearly states that using IRQs for protocols that are roughly a factor of 1000 slower than your main processor becomes impractical because you will doom the CPU to be waiting most of the time. Or, on the other end of the spectrum, you might have hardware with a high clock rate, and you are forcing your microcontroller to never miss a single tick. Both challenges essentially force you into the same situation: All your CPU resources will be spent processing or waiting to work with just a single piece of external hardware. You cannot use your Pico for anything else.

The PIO subsystem introduces a novel solution to this problem. Superficially, it's similar to a Field Programmable Gate Array (FPGA) by providing a programming environment for building complex logic. But you are not designing integrated circuits with software and then writing microcontroller software that interacts with this state. Instead, you are directly programming up to 4 state machines per PIO block. Each state machine can freely access the GPIO pins for reading and writing data, it can buffer data from the processor or from DMA, and it notifies the processor about its results via interrupts or polling.

PIO Example Program: Blinking LED

Let’s define a simple PIO starter program that will blink an LED. We need to define two files: A PIO file, which holds the Assembler-like code, and a normal C file with a main function.

Let’s see the PIO file first. A PIO file consists of two parts: A program section in which you define the PIO instructions, and a c-sdk section that contains a function for exposing the PIO program to your main program. The basic layout is this:

.program hello

...

% c-sdk {
...
%}

PIO Program

The PIO program itself is written in an Assembler-like language with a small, dedicated set of statements. To alternately switch an LED on and off, the following program is sufficient:

.program hello

set pindirs, 1

loop:
  set pins, 1 [31]
  set pins, 0 [31]
  jmp loop

Let’s dissect this program line-by-line.

Line 1: The program statement starts the declaration of a PIO program. It needs to have an identifier, which will be used during the compilation and linking process.
Line 3: The SET instruction is a multi-purpose statement. This line means that we will set all configured set pins to be outputs.
Line 5: This loop declaration is a free-form label to group parts of a larger program.
Line 6: Set the configured LED pins to output a HIGH value for a total of 32 clock cycles. Each PIO statement is executed within 1 clock cycle, and an additional 5-bit value in square brackets (0 to 31) can be used to wait additional cycles.
Line 7: Set the configured LED pins to output a LOW value for a total of 32 clock cycles.
Line 8: With JMP we go back to the previously defined loop label.
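
To get a feeling for the resulting timing, the cycle counts from the walkthrough can simply be added up. The following is a minimal sketch in plain C that runs on any host (no SDK needed). The helper names are made up for this illustration and are not part of the SDK.

```c
#include <assert.h>
#include <stdint.h>

// Cycles consumed by one PIO instruction: 1 for the instruction itself
// plus the optional delay given in square brackets (0..31).
static uint32_t sketch_instr_cycles(uint32_t delay) {
    assert(delay <= 31);  // the delay field is only 5 bits wide
    return 1u + delay;
}

// Cycles for one pass through the loop of the hello program.
static uint32_t sketch_hello_loop_cycles(void) {
    return sketch_instr_cycles(31)   // set pins, 1 [31]
         + sketch_instr_cycles(31)   // set pins, 0 [31]
         + sketch_instr_cycles(0);   // jmp loop
}

// Resulting on/off frequency in Hz for a given state machine clock.
static double sketch_hello_blink_hz(double sm_clock_hz) {
    return sm_clock_hz / (double)sketch_hello_loop_cycles();
}
```

One pass through the loop takes 65 cycles. Assuming the state machine runs at the SDK's default system clock of 125 MHz with no clock divider, sketch_hello_blink_hz(125e6) gives roughly 1.9 MHz, which is far too fast to see: the LED will just look dimmed. For a visible blink you would slow the state machine down with the clock divider and add longer delay loops.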

C-SDK Bindings

To get this program to run, you also need to define a C-SDK binding. In essence, the binding is a C function inside the PIO file. During compilation, it will be picked up by the PIO assembler (pioasm), which outputs a header file that you can include in your main program.

Add the following code - a detailed explanation comes later in this article.

% c-sdk {

static inline void hello_program_init(PIO pio, uint sm, uint offset, uint pin) {
  // 1. Define a config object
  pio_sm_config config = hello_program_get_default_config(offset);

  // 2. Set and initialize the output pins
  sm_config_set_set_pins(&config, pin, 1);
  pio_gpio_init(pio, pin);  // hand control of the pin over to the PIO block

  // 3. Apply the configuration & activate the State Machine
  pio_sm_init(pio, sm, offset, &config);
  pio_sm_set_enabled(pio, sm, true);
}
%}

Main Program

Finally, we add everything together in the main program file.

#include <stdio.h>
#include <stdbool.h>
#include <pico/stdlib.h>
#include <hardware/pio.h>
#include <hello.pio.h>

#define LED_BUILTIN 25

int main() {
  stdio_init_all();

  PIO pio = pio0;
  uint state_machine_id = 0;
  uint offset = pio_add_program(pio, &hello_program);

  hello_program_init(pio, state_machine_id, offset, LED_BUILTIN);

  while(1) {
    //do nothing
  }
}

Here we see the following details:

Line 4: To work with the PIO state machines, we need to include this special header
Line 5: This statement includes the PIO program that is assembled during compilation. It will expose the defined function from before, hello_program_init, and define a pointer to the program as hello_program. Mind the naming conventions!
Line 12: The Pico has two PIO blocks, pio0 and pio1, and we need to assign our state machine to one of them.
Line 13: We define the index of the state machine within that block (0 to 3)
Line 14: This statement loads the assembled program into the instruction memory of the PIO block. It returns the offset at which the program was placed, which we will pass to the state machine initialization
Line 16: We initialize and start the program

PIO Technical Details

After seeing an example, let’s dive into the technical details.

PIO Components

The Pico provides two PIO blocks, with 4 state machines in each block. Each state machine provides the following components.

TX FIFO/RX FIFO: Receive or send a 32-bit value from/to the main program
Input Shift Register (ISR)/ Output Shift Register (OSR): These registers hold volatile data for direct exchange between a state machine and the main program.
Scratch Registers: Labelled x and y, these 32-bit registers allow you to store any additional data that is required for the state machine.
Configurable Clock Divider: The Pico’s system clock (up to 133 MHz; the SDK default is 125 MHz) can be divided by a value with a 16-bit integer part, down to roughly 2000 Hz
Flexible GPIO Mappings: At the heart of PIO lies the ability to access the GPIO pins, and each state machine can work with four different sets of GPIOs (input, output, set, side-set)
DMA Access: Direct access to memory without involving the main processor
IRQ Flags: 8 global flags can be set or cleared, each state machine and the main program have immediate access to the interrupts

PIO Assembly Language

To program the PIO, you are using a special dialect of assembly language. In the example program, we already saw how to apply logic levels to pins and how to define a simple loop. There are only 9 commands in the assembly language, and some additional statements for code structuring. I will briefly cover all of them, but for the complete definition of all directives, see section 3.3.2 of the official documentation.

Because the language is very compact, several statements perform multiple functions. Working with the GPIO pins correctly can be especially tricky. Therefore, I group the statements by the kind of function they perform.

Program Structure

To structure your program in general, you have the following commands available.

.program NAME - the name of the program, and also the name of the header file that will be generated during compilation to give you access to the state machine in your main program
.define NAME VALUE - similar to your C program, you can define top-level constants that are visible in the state machine
LABEL: - labels are syntactic grouping of related statements. You can define any label, and then jump back to it
; COMMENT - Anything behind a semicolon is a comment
.wrap_target and .wrap - Directives to repeatedly run a section of your PIO program
.word - Store a raw 16-bit value as instructions in the program (each PIO statement is a 16-bit value)
.side_set COUNT (opt) - This directive additionally configures the SIDE pins of this program. The COUNT value is the number of bits that are taken away from the delay field of each instruction, and the opt value determines whether side statements inside your PIO program are optional or mandatory. When you work with this declaration, you can attach a side-set to any instruction, for example out x, 1 side 0 would shift one bit from the OSR into the scratch register X, and set the SIDE pin to logic level LOW.
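
Since each PIO statement is a 16-bit value, it can help to see how such a value is put together. The sketch below hand-encodes the four instructions of the hello program, assuming the instruction layout from the RP2040 datasheet: a 3-bit opcode, a 5-bit delay/side-set field, a 3-bit destination or condition, and a 5-bit data value or address. All names are hypothetical and chosen for this illustration only. In practice the PIO assembler produces these values for you.

```c
#include <assert.h>
#include <stdint.h>

// Opcodes occupy bits 15..13 of every instruction.
#define DEMO_OP_JMP 0x0u
#define DEMO_OP_SET 0x7u

// Destination field of SET (bits 7..5).
#define DEMO_SET_DEST_PINS    0x0u
#define DEMO_SET_DEST_X       0x1u
#define DEMO_SET_DEST_Y       0x2u
#define DEMO_SET_DEST_PINDIRS 0x4u

// Assemble the common layout: opcode | delay | 3-bit field | 5-bit operand.
static uint16_t demo_encode(uint32_t opcode, uint32_t delay,
                            uint32_t field, uint32_t operand) {
    assert(opcode <= 7 && delay <= 31 && field <= 7 && operand <= 31);
    return (uint16_t)((opcode << 13) | (delay << 8) | (field << 5) | operand);
}

// set DEST, value [delay]
static uint16_t demo_encode_set(uint32_t dest, uint32_t value, uint32_t delay) {
    return demo_encode(DEMO_OP_SET, delay, dest, value);
}

// jmp ADDRESS (unconditional, so the condition field is 0)
static uint16_t demo_encode_jmp(uint32_t address) {
    return demo_encode(DEMO_OP_JMP, 0, 0, address);
}
```

With these helpers, set pindirs, 1 encodes to 0xE081, set pins, 1 [31] to 0xFF01, set pins, 0 [31] to 0xFF00, and a jump to the loop label at address 1 to 0x0001. The layout also shows why .side_set reduces the available delay: with .side_set 1 the top bit of the 5-bit field carries the side-set value, which leaves delays of 0 to 15.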

Move data inside the shift register

in SOURCE, count - Shift count bits (1...32) from SOURCE into the ISR, where SOURCE can be X, Y, OSR or ISR
out DESTINATION, count - Shift count bits (1...32) out of the OSR to DESTINATION X, Y or ISR
mov DESTINATION, SOURCE - Copy data from SOURCE (X, Y, OSR or ISR) to DESTINATION (X, Y, OSR or ISR)
set DESTINATION, data - write a 5-bit data value (0...31) to DESTINATION (X, Y)
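
To make the shifting more concrete, here is a small host-side model of out and in for a state machine that is configured to shift right. It is only a sketch, assuming the behaviour described in the RP2040 datasheet: out hands over the lowest count bits of the OSR, and in first shifts the ISR to make room and then places the new bits at the most significant end. The function names are invented for this illustration.

```c
#include <assert.h>
#include <stdint.h>

// Mask with the lowest `count` bits set; count must be 1..32.
static uint32_t model_low_bits_mask(unsigned count) {
    assert(count >= 1 && count <= 32);
    return (count == 32) ? 0xFFFFFFFFu : ((1u << count) - 1u);
}

// Model of `out DEST, count` with shift direction "right":
// returns the lowest `count` bits of the OSR and shifts them out.
static uint32_t model_out_right(uint32_t *osr, unsigned count) {
    uint32_t data = *osr & model_low_bits_mask(count);
    *osr = (count == 32) ? 0u : (*osr >> count);
    return data;
}

// Model of `in SOURCE, count` with shift direction "right":
// the ISR moves right and the lowest `count` bits of `source`
// enter at the most significant end.
static void model_in_right(uint32_t *isr, uint32_t source, unsigned count) {
    uint32_t data = source & model_low_bits_mask(count);
    if (count == 32) {
        *isr = data;
    } else {
        *isr = (*isr >> count) | (data << (32u - count));
    }
}
```

Starting with an OSR of 0xA5, two calls of model_out_right(&osr, 4) return 0x5 and then 0xA, so the least significant bits leave first. With the opposite shift direction, the most significant bits would leave first instead.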

Move data between the shift register and the main program

pull - Load data from the TX FIFO into the OSR
push - Push data from the ISR to the RX FIFO, then clear the ISR
irq set INDEX / irq clear INDEX - Set or clear the IRQ flag with the number INDEX; irq wait INDEX sets the flag and then waits until it is cleared again

Write data to GPIO pins

SET pins
- set PINDIRS, 1 - define the configured SET pins as output pins
- set PINS, value - write HIGH (value=1) or LOW (value=0) to the SET pins
OUT pins
- mov PINS, SOURCE - write from SOURCE (X, Y, OSR or ISR) to the OUT pins

Read data from GPIO pins

SET pins
- set PINDIRS, 0 - define the configured SET pins as input pins
INPUT pins
- mov DESTINATION, PINS - read from the IN pins into DESTINATION (X, Y, OSR, ISR, or the OUT pins)

Conditional Statements

jmp CONDITION LABEL - go to LABEL when one of the following types of CONDITION is true
- !X | !Y - true when the scratch register is zero
- !OSRE - true when the OSR is not empty
- X-- | Y-- - true when the scratch register is non-zero; the register is decremented afterwards in either case
- PIN - true when the JUMP pin is logic level HIGH
wait POLARITY TYPE NUMBER - stall further processing until the selected source matches POLARITY, where the source is one of
- pin NUMBER - INPUT pin
- gpio NUMBER - absolutely numbered gpio
- irq NUMBER - IRQ number (if POLARITY is 1, the IRQ number is cleared)
nop - Don’t do anything
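
The X-- condition is a frequent source of off-by-one errors, so here is a tiny host-side model of a delay loop of the form loop: jmp x-- loop. It assumes the behaviour documented in the RP2040 datasheet: the jump is taken when the register is non-zero before the decrement, and the decrement happens either way. The function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

// Model of the condition `x--`: reports whether the jump is taken
// (x was non-zero) and then decrements x, wrapping around at zero.
static bool model_jmp_x_dec(uint32_t *x) {
    bool taken = (*x != 0u);
    *x -= 1u;  // unsigned wrap-around: 0 becomes 0xFFFFFFFF
    return taken;
}

// Counts how often `loop: jmp x-- loop` executes for a start value of x.
static uint32_t model_count_loop_iterations(uint32_t x) {
    uint32_t executed = 0;
    bool taken;
    do {
        taken = model_jmp_x_dec(&x);
        executed++;
    } while (taken);
    return executed;
}
```

A start value of N therefore makes the instruction run N + 1 times: model_count_loop_iterations(3) returns 4. Keep this in mind when you load a loop counter with set x, N.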

PIO Configuration

A PIO program is highly configurable. The c-sdk section in your PIO file defines a wrapper function that the PIO assembler copies into the generated header file. This function is accessible from the main program, and it can receive any arguments.

You can configure a bewildering number of aspects in this function - the following list briefly describes all options.

Define the input pins, output pins, and side pins
Define a special pin for the JMP instruction
Initialize the direction of input pins
Configure the shift direction, autopush/autopull and threshold (up to 32 bits) of the input and output shift register
Join the two FIFOs, so that the RX FIFO serves as additional TX FIFO capacity, or vice versa
Set the clock divider to be applied to the system clock (133 MHz in this article; the SDK default is 125 MHz). Its integer part is a 16-bit value, so you can scale the PIO clock down to roughly 2000 Hz, i.e. a cycle time of about 0.49 ms.
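
As a worked example of the clock divider arithmetic: the divider is the system clock divided by the desired state machine frequency, and it has to stay within what the hardware can represent. The sketch below assumes a divider range of 1 to 65536, based on the 16-bit integer part described in the RP2040 datasheet. The helper names are my own and not part of the SDK.

```c
#include <assert.h>

#define SKETCH_CLKDIV_MIN 1.0
#define SKETCH_CLKDIV_MAX 65536.0

// Divider needed to run a state machine at target_hz, clamped
// to the range the hardware can represent.
static double sketch_clkdiv_for(double sys_clk_hz, double target_hz) {
    assert(sys_clk_hz > 0.0 && target_hz > 0.0);
    double div = sys_clk_hz / target_hz;
    if (div < SKETCH_CLKDIV_MIN) div = SKETCH_CLKDIV_MIN;
    if (div > SKETCH_CLKDIV_MAX) div = SKETCH_CLKDIV_MAX;
    return div;
}

// Lowest state machine frequency reachable from a given system clock.
static double sketch_min_frequency_hz(double sys_clk_hz) {
    return sys_clk_hz / SKETCH_CLKDIV_MAX;
}
```

For a 133 MHz system clock the lower bound works out to about 2029 Hz, which matches the roughly 2000 Hz quoted above. At 125 MHz it is about 1907 Hz, and a target of 2000 Hz needs a divider of 62500.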

In order to have all configuration options present when working with PIO, I like to use the following template. Following this template, I simply configure what I need to adapt, or delete that which I don't need.

static inline void __program_init(PIO pio, uint sm, uint offset, uint in_pin, uint in_pin_count, uint out_pin, uint out_pin_count, float frequency) {
  // 1. Define a config object
  pio_sm_config config = __program_get_default_config(offset);

  // 2. Set and initialize the input pins
  sm_config_set_in_pins(&config, in_pin);
  pio_sm_set_consecutive_pindirs(pio, sm, in_pin, in_pin_count, false);  // false = input
  for (uint i = 0; i < in_pin_count; i++) {
    pio_gpio_init(pio, in_pin + i);
  }

  // 3. Set and initialize the output pins
  sm_config_set_out_pins(&config, out_pin, out_pin_count);
  pio_sm_set_consecutive_pindirs(pio, sm, out_pin, out_pin_count, true);  // true = output
  for (uint i = 0; i < out_pin_count; i++) {
    pio_gpio_init(pio, out_pin + i);
  }

  // 4. Set clock divider (assumes frequency is given in Hz; needs hardware/clocks.h)
  if (frequency < 2000) {
    frequency = 2000;
  }
  float clock_divider = (float) clock_get_hz(clk_sys) / frequency;
  sm_config_set_clkdiv(&config, clock_divider);

  // 5. Configure input shift register
  // args: BOOL right_shift, BOOL auto_push, 1..32 push_threshold
  sm_config_set_in_shift(&config, true, false, 32);

  // 6. Configure output shift register
  // args: BOOL right_shift, BOOL auto_pull, 1..32 pull_threshold
  sm_config_set_out_shift(&config, true, false, 32);

  // 7. Join the TX & RX FIFOs
  // PIO_FIFO_JOIN_NONE = 0, PIO_FIFO_JOIN_TX = 1, PIO_FIFO_JOIN_RX = 2
  sm_config_set_fifo_join(&config, PIO_FIFO_JOIN_NONE);

  // 8. Apply the configuration
  pio_sm_init(pio, sm, offset, &config);

  // 9. Activate the State Machine
  pio_sm_set_enabled(pio, sm, true);
}

Conclusion

PIO, the programmable input/output state machine(s) of the Raspberry Pico, is a novel solution to interface with any hardware. Instead of wasting CPU cycles on idle wait times, or quite the opposite, on reading from and writing to pins all the time, state machines do the heavy lifting of interacting with the hardware. They can be configured to run from about 2000 Hz to 133 MHz, have free access to all GPIO pins, and can read from and write to these pins on every clock cycle. With a reduced, Assembler-like language you program these state machines to adhere to specific timing constraints and exchange bit-data with the main program. This article showed how PIO works and listed its components and all of its programming language statements. Finally, we saw the many configuration options of a state machine. You can run up to 8 state machines alongside your main program - what will be your use cases?
