Scheduling Code on Bare Metal
Idioms
Something that is called idiomatic when it is an expression that is natural to a native speaker but opaque to a newcomer. I also use the word idiom to describe a fragment of code that I have used over and over for many years, for example the standard embedded systems main program idiom:
Initialize();
FOREVER {
ProcessStuff();
}
Bare Metal Scheduling
When you are working at the low end of embedded systems programming, your processor may be too small for an RTOS and you fall back to using Big Loop programming with interrupts. This is programming on the bare metal, with no help from new fangled schedulers or operating systems. It’s just you and a timer to keep your code on schedule.
Over the years I have used one particular idiom for executing a piece of code periodically: assume that you have a timer set up to give you an interrupt 1000 times per second. The interrupt handler simply increments a 32-bit unsigned integer called tickTime. You then have a function that can give you the number of ticks since the system booted. You will see this construct in a lot of systems, including FreeRTOS and the CMSIS sysTick handler.
My standard way of working with this system was:
/* Don't use this code. It has a bug. Read the post!. */
#define NEXT_RUN_DELTA (TICKS_PER_SECOND / 10U)
uint32_t nowTime;
uint32_t nextRunTime = 0U;
FOREVER {
nowTime = timerGetTicks();
if (nowTime >= nextRunTime) {
DoSomething();
nextRunTime = nowTime + NEXT_RUN_DELTA;
}
}
This code spins around, checking “are we there yet?” until nowTime increases enough to be greater than or equal to the run time. We do something, then set our next run time to be now plus enough ticks to represent a tenth of a second. This works well, but it has a fatal bug.
What? A Bug?
Today, I was working with a very small, slow, painful, 8-bit processor, and it could not keep up to 1000 interrupts per second doing a 32-bit unsigned increment of the tick time. To do a 32-bit increment, this processor adds one to the least significant byte, checks for a carry, adds the carry into the next byte, checks for a carry, and so on. The process is stupid slow, and when your processor is jogging along at 250KHz, you have problems doing it 1000 times per second.
I figured that I could change to 16-bit values and, indeed, the reduction in the increment time was enough that I could get my millisecond tick times, but it uncovered a horrible bug. My idiom doesn’t handle tick accumulator roll over AT ALL.
Roll Over
If you were to increment an 8-bit unsigned value 1000 times per second, it would count from 0 to 255 in 255 milliseconds. One millisecond later the increment would set the value to 0 again and continue counting and rolling over every 256 milliseconds.
In a 32-bit unsigned tick timer, it would count from 0 to 4,294,967,295 and then wrap back to zero, but this would happen in 49.7 days. That’s like the length of summer here in The Great White North; longer than your typical test run, but too short for real life.
When I changed to a 16-bit tick accumulator, it rolled every 65.5 seconds, at which time my loop locked up. Something in my system had taken more than 1/10 of a second and I ended up with a situation where:
nextRunTime = 65480
nowTime == 6434
Apparently, something like this happened; at some point
nextRunTime == 65480
nowTIme = 65470
then the program went away for about 1/10 of a second (100 ticks). nowTime becomes 65570.
But since we are working with unsigned 16-bit values, the counter would have wrapped when 65535 was incremented. So instead nowTime has a value of 34.
Our conditional now looks like:
if ( 34 >= 65480) {
And gets bypassed because nowTime wrapped before being checked. Poop.
A small amount of googling turned up a new form of my idiom:
if ((nowTime - lastRunTime) >= NEXT_RUN_DELTA) {
DoSomething();
lastRunTime = nowTime;
}
So, basically, we are checking for the number of ticks between now and when we ran last. This code requires unsigned integers, and the timer must only wrap once or less during the largest delay interval. And this code works in little-bit processors that can’t afford big tick accumulators.
(Elecia White wrote about this exact situation in her book Making Embedded Systems (page 138-140). I even read that chapter, but I still got it wrong initially.)
Voodoo Math
One part of this solution really bothered me, the accumulator had rolled over and now we are subtracting a big value from a small value. Let’s take a look.
From the example above:
nextRunTime == 65480
nowTime == 34 (which is 65570 after it rolled over.)
nowTime - lastRunTime == 34 - 65480
== -65446
But we are using unsigned math here, so that value isn’t what we will get. Somewhat magically we get the answer 90.
Let’s look at that action one more time:
65530 - 65530 = 0
65531 - 65530 = 1
65532 - 65530 = 2
65533 - 65530 = 3
65534 - 65530 = 4
65535 - 65530 = 5
0 - 65530 = 6
1 - 65530 = 7
2 - 65530 = 8
3 - 65530 = 9
...
So, even though we have a roll over, the subtraction still works and gives us the answer that we want.
The idioms of embedded systems programming change as we get faster processors and more memory, but they also have to be reviewed when we learn better ways of doing things or when we discover lingering bugs. I think it’s time to reread Elecia’s book.
This post is part of a series. Please see the other posts here