Tuesday, April 16, 2013

RF24 Performance Improvement w/SPI

The Up Side

The RF24 (nrf24l01, nrf24l01p, nrf24l01+) radios use the SPI protocol to communicate with their master device. The speed at which the devices communicate is controlled by the master's clock speed. Here, the Arduino is the master device, so we can control the SPI bus speed from the Arduino. On the Arduino, this clock speed is set by means of a divisor applied to the speed of the CPU. For a standard Arduino, the CPU runs at 16MHz.

To reduce the chance of communication errors, a compromise was made in the RF24 driver to not run at maximum bus speed. The driver's default divisor value is SPI_CLOCK_DIV4. This divides the 16MHz clock by 4, which drives the SPI bus at (16/4) 4MHz.

To further improve performance, it is possible to use a divisor of 2 (SPI_CLOCK_DIV2), allowing for an 8MHz SPI bus speed. That's twice the bus clock of the default setting, which translates into half the latency for SPI bus communication. If you're pinched for maximum performance, this may well be a change worthy of consideration.

To make this change, look at RF24.cpp. Find the line containing "SPI_CLOCK_DIV4". Change that value to one of the other supported constants to adjust the SPI bus speed accordingly. Just remember, it's a divisor, so the larger the number, the slower the SPI bus speed. Conversely, the lower the number, the higher the SPI bus speed.
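
As a rough illustration of the arithmetic above, here is a small sketch which works out the SPI clock for a given CPU clock and divisor, and how long it takes to clock a number of bytes across the bus. The function names are made up for this example and are not part of the RF24 driver. The 33 byte figure used below assumes one command byte plus a full 32 byte payload, and the timing ignores any gaps between bytes.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not part of the RF24 driver.

// SPI clock in Hz for a given CPU clock and an SPI_CLOCK_DIVn style divisor.
constexpr std::uint32_t spiClockHz(std::uint32_t cpuHz, std::uint32_t divisor) {
    return cpuHz / divisor;
}

// Microseconds spent clocking `bytes` bytes over the bus (8 clocks per byte),
// ignoring any idle time between bytes.
constexpr std::uint64_t spiTransferMicros(std::uint64_t bytes, std::uint32_t spiHz) {
    return (bytes * 8u * 1000000u) / spiHz;
}
```

With a 16MHz CPU, spiClockHz(16000000, 4) works out to 4MHz and a 33 byte transfer takes roughly 66 microseconds. With a divisor of 2 the bus runs at 8MHz and the same transfer takes roughly 33 microseconds.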

The Down Side

There are down sides to running at faster speeds, and this is why a slower speed was picked for the default. At faster speeds noise becomes much more noticeable and communication between devices becomes more easily corrupted. Corrupted communication means packet loss, an increase in communication errors, and/or corrupted packets at the receiving end of things. This is important because, should a packet become corrupted during SPI transfer, it will be delivered in its corrupted form at the receiving radio. In spite of the corruption, it will still pass a CRC check because it was received exactly as it was transmitted: corrupted. That's not to say these things will happen. But they can happen. So caution is advised.

For breadboard use, where jumper leads are long and noise is likely, I strongly encourage you to maintain a slower SPI bus speed of 4MHz. But, once you're ready for production use, where you have everything connected properly with decoupling capacitors and short leads, a clock divisor of 2 (SPI_CLOCK_DIV2), providing for a (16/2) 8MHz SPI bus speed, may well give your performance a tiny, extra boost.


Customization

As happens with many Arduino projects, you may find you've simplified things by converting to a simple AVR project. Many users who do this deviate from the 16MHz clock speed to something slower or faster, as dictated by the given project. As such, if you find you've increased your clock speed to 20MHz or even decreased it to 1MHz, don't forget you can tweak these settings as needed. Always remember, SPI bus speed is determined by the speed of your CPU. If you change the speed of your CPU, you should re-evaluate your SPI bus divisor.
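
To make the re-evaluation concrete, here is a minimal sketch which picks the smallest divisor that keeps the SPI bus at or below a speed you are comfortable with. It assumes the AVR hardware SPI divisors, which are the powers of two from 2 through 128. The function name and the idea of passing in a target ceiling are my own for this example; nothing like this exists in the RF24 driver.

```cpp
#include <cstdint>

// Hypothetical helper for illustration only; not part of the RF24 driver.
// Returns the smallest power-of-two divisor (2..128) for which
// cpuHz / divisor does not exceed maxSpiHz. If even 128 is too fast,
// 128 is returned as the slowest setting available.
constexpr std::uint32_t pickSpiDivisor(std::uint32_t cpuHz,
                                       std::uint32_t maxSpiHz,
                                       std::uint32_t divisor = 2) {
    return (divisor >= 128 || cpuHz / divisor <= maxSpiHz)
               ? divisor
               : pickSpiDivisor(cpuHz, maxSpiHz, divisor * 2);
}
```

For a 4MHz ceiling, a 16MHz CPU lands on a divisor of 4, a 20MHz CPU lands on 8 (2.5MHz, since 20/4 would be 5MHz), and a 1MHz CPU can simply use 2.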

History & Summary

When the original SPI bus speed was selected in Maniacbug's code, I did observe errors when running at 8MHz on a breadboard. Infrequent yet present. This is why I picked a 4MHz speed as a default value. That, however, doesn't mean you can't try faster.

Be mindful of your decision should you decide to run at a faster SPI bus speed. Use leads as short as possible. Use decoupling capacitors as close to the radio as possible. If you're unsure, leave it where it is. Worst case, you now know you have some options to squeeze every last bit of performance out of your project.

Monday, April 8, 2013

Detour II: A USB nRF24L01 Radio Modem

Article is currently incomplete and unfinished. Significant progress has been made on the firmware. It's tight on flash, needless to say. The massive bloat of Arduino libraries >= 1.0 makes Arduino >= 1.0 completely useless for development. As such, I've been forced to continue development with Arduino 022 libraries. Likely I'll wind up with some native AVR code, bypassing the Arduino libraries, simply to save space and increase efficiency. But more to come...

Creation Of A Detour
Now that I've decided to focus on nRF24L01 based radios and the driver is fairly feature complete, it's time to look at a computer interface for these wonderful radios.

Having an easy to use computer interface for these radios will go a long way toward both making them more accessible and making it easier to debug and test during application development. Given the ease of development and inexpensive cost, I expect everyone using these wonderful radio modules will soon have a USB Radio Modem.

The intention is to create a small footprint radio modem which plugs directly into a USB port. Once connected, the modem looks like any other RS232 serial port on the computer. To maintain a small footprint, most Arduino solutions are simply too large and bulky. Not to mention their power draw is generally too high because of all the extra components on board. As such, we'll not be using an Arduino for this portion of the build. Rather, we are directly using an AVR uC plus a USB interface and our nRF24 radio module.

The Microcontroller
To find a uC for our project, we need to look for one which has enough flash and enough pins to satisfy our physical and software requirements. After some modest testing, I estimated 8K flash is likely to be the minimum requirement for our software footprint.

Next we need to estimate the number of physical pins required for our project. We know we want at least RX, TX, RF24 IRQ, DTR, CTS, DCD, SS, MOSI, MISO, SCK, and LEDs on RX, TX, Power, and/or either an operational mode or DTR state. We know we want at least one hardware U(S)ART on the uC. That makes for a minimum signal pin count of 14 I/O pins. When we add GND, power, reset, and XTAL x2, we have a minimal physical pin count of 19 pins. We also know we want to run at 16MHz or faster because of the desired data rate of at least 1Mbps.

This means our minimal requirements are 8K flash and at least 19 pins. After some searching, it looks like the ATmega8-16PU will fit our requirements, including a hardware USART port which provides RX/TX. But flash and memory space are tight.

Since estimates indicate we're already tight on flash space, we know we won't have room for an Arduino bootloader. Basically we're looking at an AVR on strip or proto board, which means an inexpensive USBASP interface (or other) is also required for programming. Visiting ebay indicates these can be had very inexpensively for between $2.50 and $5.00. Given this purchase allows us to save so much space and opens the door for so many additional projects, it's easily money well spent. I bought mine on ebay for $3.99, delivered to the door. That's hard to argue with considering it's built around an ATmega8L, a 12MHz crystal, and several LEDs. Plus it comes with the interface ribbon cable.

USB Interface
There are several different ways to create this project. One way is to use an AVR uC to directly drive the USB interface via one of several software based AVR-USB libraries. This is fairly inexpensive with a reasonably low component count. Cost for this project is likely to be roughly three to five dollars including the uC. From a frugal perspective it seems like a real win. The down side of the approach is that it requires some very non-Arduino programming which can be rather involved and complex, imposes significant processing overhead, and consumes roughly 1.5K of the uC's flash. With the extra software footprint, I don't believe everything will fit on an ATmega8, which would increase the cost of this solution. Also, the associated software skill level is very non-Arduino friendly. This just doesn't sound like a good beginner's path. As such, I don't believe a software USB solution will work so long as an ATmega8 is the target.

An alternate path is to look at the readily available USB serial TTL interfaces. Given we want this to look like a logical RS232 serial port and some plans directly depend on making use of some of RS232's signals, we can go shopping on ebay. Technically what we're shopping for are not RS232 ports. Just the same, we do want an interface which provides many of the logical signalling conventions.

After shopping for various USB serial TTL interfaces, I quickly focused on the CP2102 variants. More specifically, I found some inexpensive CP2102 based boards which plug directly into a USB port and provide regulated 5v, 3.3v, RX, TX, GND, and CP2102 reset, all broken out to 90-degree pins. Also found are six additional RS232 signals (DTR, DSR, RTS, CTS, DCD, RI), plus two USB suspend signal lines (SPD, SPD/), broken out to through hole solder pads. I bought from hittime_hk on ebay. They were $2.88USD each, including shipping to the door. They look like the following and even come with a small, 5-pin ribbon cable.




The Plan: Pulling It Together
Okay, the plan is to use a CP2102 USB TTL interface board to provide power, ground, and various RS232 logical signals to my ATmega8-16PU microcontroller. The uC in turn will be connected to the nRF24L01(P), as well as several LEDs, a 16MHz crystal, and several capacitors. The RS232 TTL signals will also be connected to the uC. The ATmega8 is to be programmed via a USBASP interface and AVRdude.

The highest baud rate supported by the CP2102 is 1Mbps. According to the ATmega8 data sheet (page 153), when clocked at 16MHz, it supports an asynchronous 1Mbps data rate with zero errors. This makes it a perfect baud rate match. The catch is, out of the box, the CP2102 doesn't support 1Mbps without creating a baud rate alias. For this, we'll need to either use Silabs' CP2102 software to create a baud rate alias or use the cp210x python utility available on SourceForge. After exploring this more, it turns out the python utility does not support baud rate alias table updates. I've done some work to add this, but it's currently a work in progress. So for now, it's limited to 115,200 baud.

To allow for proper baud rates, the uC will be statically configured for 1,000,000 baud. On the computer, we will open the serial port at a baud rate of 230,400, which the CP2102 will alias to 1,000,000 baud. We pick this baud rate to alias because it has an error of 3.5% when running the ATmega8 at 16MHz. As such, it's otherwise a poor data rate to use. Furthermore, 115,200 works well, so worst case we'll keep that as our fall back and go-to data rate until we can get the aliases programmed on the CP2102. Honestly, I'm not sure 16MHz on the ATmega8 even provides enough headroom to cleanly communicate at 1,000,000 baud.
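
For anyone wanting to check these error figures, here is a small sketch based on the asynchronous baud rate formulas in the ATmega8 data sheet, where the baud rate is the clock divided by 16 x (UBRR + 1) in normal mode, or by 8 x (UBRR + 1) in double speed (U2X) mode. The function names are my own and the error is reported in tenths of a percent, truncated toward zero. Treat it as a back-of-the-envelope check rather than firmware code.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only.

constexpr std::uint32_t roundedDiv(std::uint32_t a, std::uint32_t b) {
    return (a + b / 2) / b;
}

// Closest UBRR register value for the requested baud rate.
// u2x selects double speed mode (divide by 8 rather than 16).
constexpr std::uint32_t ubrrFor(std::uint32_t clockHz, std::uint32_t baud, bool u2x) {
    return roundedDiv(clockHz, (u2x ? 8u : 16u) * baud) == 0
               ? 0
               : roundedDiv(clockHz, (u2x ? 8u : 16u) * baud) - 1;
}

// Baud rate the hardware actually produces for a given UBRR value.
constexpr std::uint32_t actualBaud(std::uint32_t clockHz, std::uint32_t ubrr, bool u2x) {
    return clockHz / ((u2x ? 8u : 16u) * (ubrr + 1));
}

// Error in tenths of a percent (so -35 means roughly -3.5%).
constexpr long long baudErrorTenthsPercent(std::uint32_t clockHz, std::uint32_t baud, bool u2x) {
    return (static_cast<long long>(actualBaud(clockHz, ubrrFor(clockHz, baud, u2x), u2x))
            - static_cast<long long>(baud)) * 1000 / static_cast<long long>(baud);
}
```

At 16MHz this gives a UBRR of 0 and no error for 1,000,000 baud in normal mode, roughly -3.5% for 230,400 baud in double speed mode (UBRR of 8), and roughly +2.1% for 115,200 baud in double speed mode (UBRR of 16).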

Since we're running the ATmega8 at 16MHz, we'll be running this off of the 5v line provided from the CP2102 board.

RF'n Around
The nRF24L01 module is connected via a SPI interface and controlled via the RF24 library. By default, the RF24 library uses a clock divider of 4, which means the SPI interface, given a 16MHz crystal on the ATmega8, will run at 4MHz. Since the fastest RF data rate supported by the module is 2Mbps, and our CP2102/AVR interface is limited to 1Mbps, this is plenty fast, even after accounting for all the extra SPI bus traffic used to configure the radio module and obtain its state.

To connect the radio, I'm using standard 0.1" headers, whereby the radio module will simply plug in. This allows for different radio modules to be interchangeably used without additional work. So please, make sure your pin-outs match those used by this project.

It's very important to remember the nRF24 modules require 3.3v and should never be connected to a 5v source. Since the CP2102 TTL module provides a nice 3.3v source, it's a match made in heaven. Even better, the nRF24L01 is 5v TTL tolerant, which means, while the module itself must be powered at 3.3v, we can directly connect the data lines from our uC to the nRF24 module without any additional components.

Sleep'n Time
The last part of the puzzle is to make use of the CP2102's USB suspend lines. Most computers dramatically cut power to USB ports when they suspend. To be USB and battery friendly, we'll do the same thing. By monitoring at least one of these signals on our uC, we can both power down the nRF24L01(P) as well as put the uC into a sleep mode, allowing for significant power savings. The end result is we'll be very USB and laptop friendly.

The Modem
The magic of serial port modems is the ability to plug them in, open the associated serial (com) port, and start communicating with the device and/or endpoints connected on the other end of the line. This is exactly what we want here. While not connected through a phone line, we do want to allow for transparent communications with our endpoints.

In days gone by I've had no end of exposure to the famous Hayes 'AT' command interface for modems. Personally I'm not a fan. The commands are cryptic and terse and generally not human friendly anyway. Given the flash size constraints of our ATmega8, implementing a full 'AT' command set is likely to prove too large with little in return. So rather than implement an 'AT' command set, we're going to implement an even more primitive command set - and yes, it uses binary data. Sadly, using binary data makes configuration of the operating mode non-terminal program friendly. But, once configured, the intent is to make it terminal program RX compatible and very friendly. Basically plug in and go, assuming our remote nodes are known at configuration time.

Here are the functions we want to support:
  • Set our local address. Think of this as our MAC or hardware address.
  • Configure our RF24 channel or frequency.
  • Configure the data rate mode of the RF24.
  • Turn the RF24 receiver on/off.
  • Power radio on/off.
  • Configure dynamic or fixed length payloads.
  • Configuring transparent operating mode of our modem.
  • Configuring RF24 receiver pipeline address (0-5).
  • Set Shockburst mode, timeouts, and retries.
  • RF24 CRC, off, 1 or 2 bytes
  • Store the modem configuration on EEPROM.
  • Reset the uC + RF24 module; restarting with saved configuration.
  • Scan a frequency
Command mode of the modem always follows DTR state. When DTR is high, the modem enters transparent mode. When DTR is low, the modem enters command mode.
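
As a tiny sketch of that rule, the firmware's main loop could decide what to do with each incoming serial byte based purely on the DTR line. The enum and function names here are hypothetical and only illustrate the DTR convention described above; the actual firmware is not finished and may look nothing like this.

```cpp
// Hypothetical names for illustration only; not taken from the modem firmware.
enum class ModemMode { Command, Transparent };

// DTR high selects transparent mode, DTR low selects command mode.
constexpr ModemMode modeForDtr(bool dtrHigh) {
    return dtrHigh ? ModemMode::Transparent : ModemMode::Command;
}

// True when an incoming serial byte should be parsed as part of a binary
// command rather than passed straight through to the radio.
constexpr bool isCommandByte(bool dtrHigh) {
    return modeForDtr(dtrHigh) == ModemMode::Command;
}
```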



References:
Atmel
ATmega8 Datasheet

Nordic Semiconductor
nRF24L01 Datasheet
nRF24L01+ Datasheet, which is also the same thing as the nRF24L01P.

Please note the nRF24L01 has been officially obsoleted by the nRF24L01+ variant; which is frequently referred to as the nRF24L01P. Quoting Nordic, "This product is not recommended for new designs. Nordic recommends its drop-in compatible nRF24L01+ or for a System-on-Chip solution the Nordic nRF24LE1 or nRF24LU1+."

Silabs
CP2102 Datasheet
CP2102 Baud Rate Support Datasheet
CP210x Programmer Software


Thursday, April 4, 2013

RF24 - Avoiding RX Pipe 0 For Enhanced Reliability

RX Pipe 0 Is Special


The NRF24L01(+) radios have six receiving hardware pipes as part of their MultiCeiver design. These pipes, zero through five [0-5], can each have their own address. This sounds great, but the reality is, pipe 0 is special.

Pipe 0 is special because, whenever you transmit, RX pipe 0's address is changed to the writing pipe's address. The fact this occurs is obscured, by design, by the RF24 driver. This is because the RF24 driver has specific logic for pipe 0. This is primarily why the startListening() method exists. Every time a call to startListening() is made, the RX pipe 0 address is shuffled back into the radio. It's shuffled in because whenever the radio transmits, the RX pipe 0 address is internally replaced by the radio.

While not explicitly declared in the data sheet, I believe I understand why this behaviour takes place. When you enable auto-acknowledgement, the receiving radio needs to reliably inform the transmitter of its ACK/NAK status, potentially returning an ACK payload as well. However, the receiver doesn't directly know who transmitted the message. It's not part of the message. In order for the receiver to reply, the transmitter must be prepared to listen for a reply back from the receiving radio. As such, if the transmitting radio simply listens for a reply using the destination address, it should always match and filter properly. This is a clever idea to avoid transmitting source addresses.

That's fine and all, and is rather clever, but there is a problem. The RF24 driver, in its attempt to hide this detail, creates an opportunity for lost and/or missed packets. It's a classic race condition. This is a race condition because, should a transmitter send a message before the application's call to startListening() completes, the radio will completely ignore the message. Even if received, it will be silently filtered out and ignored. That means all messages destined for RX pipe 0's address will be silently ignored until the reloading of RX pipe 0's address completes. That completion only takes place with a call to startListening().

This race condition is potentially compounded by the fact applications are free to have any amount of logic between the end of a write() or startWrite() call and the completion of a startListening() call. For almost the entire duration between [write()/startWrite()] and startListening() calls, the radio will ignore all messages addressed to pipe 0's address.
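
To picture the window, here is a toy model of the behaviour described above. To be clear, this is not the RF24 driver and it doesn't touch any hardware; it only tracks which address pipe 0 is matching at a given moment. All names and addresses are made up for the example, and it needs a C++14 or newer compiler.

```cpp
#include <cstdint>

// Toy model of the behaviour described above; this is NOT the RF24 driver.
// It tracks only which address the radio's pipe 0 is currently matching.
struct Pipe0Model {
    std::uint64_t savedRxAddress = 0;   // what the application asked pipe 0 to listen on
    std::uint64_t activeRxAddress = 0;  // what the radio is matching right now

    constexpr void openReadingPipe0(std::uint64_t address) {
        savedRxAddress = address;
        activeRxAddress = address;
    }
    // Transmitting points pipe 0 at the destination so the ACK can be heard.
    constexpr void write(std::uint64_t destination) { activeRxAddress = destination; }
    // Only here is the application's pipe 0 address shuffled back in.
    constexpr void startListening() { activeRxAddress = savedRxAddress; }
    constexpr bool hears(std::uint64_t address) const { return activeRxAddress == address; }
};

// Does pipe 0 still hear its own address after a write,
// with or without a following startListening() call?
constexpr bool pipe0HearsAfterWrite(bool callStartListening) {
    Pipe0Model radio{};
    radio.openReadingPipe0(0xC3C3C3C3C3ULL);
    radio.write(0x3C3C3C3C3CULL);
    if (callStartListening) radio.startListening();
    return radio.hears(0xC3C3C3C3C3ULL);
}
```

In this model pipe0HearsAfterWrite(false) is false and pipe0HearsAfterWrite(true) is true. Everything the application does between the write and the startListening() call happens while the model is in the first, deaf state.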

The solution? Well, there really isn't a neat and clean solution. While many applications won't have issue with using RX pipe 0, and the associated message loss, high traffic networks are likely to suffer from periodic message loss and potential packet loss without full use of all retry attempts. For this second case, imagine the radio starts listening 14 retries into a possible 15. That means all but one of the available retry opportunities have been lost simply because the radio was ignoring those messages. In turn, this would also drive up latency on the transmitter's side.

Long story short, if you want a reliable network, don't use RX pipe 0. For small networks, use is unlikely to cause significant issue. But for a better option, just pretend you only have RX pipes one through five (1-5); for a total of five, rather than six. But if you insist on using RX pipe 0, always ensure your [write()/startWrite()] calls are as close as is possible to your startListening() call, so as to minimize the window of potential lost packets.

Wednesday, April 3, 2013

New RF24 Driver Release - A Fork

New RF24 Driver Release

The RF24 driver was forked to add new features and fix bugs. These improvements are outlined below. This release's primary aim is to improve performance and increase operational reliability. I believe those aims were achieved.

The intention is to once again have my features and bug fixes merged back into Maniacbug's RF24 driver. While this is currently a fork, I hope in the near future this code will be part of the official RF24 driver repository.

My RF24 Fork: https://github.com/gcopeland/RF24

Auto-Acknowledgement Retry Bug Fix

The default timeout for auto acknowledgements is wrong. If using maximum payload size at 250Kbps, an erroneous timeout may cause errors in transmission, resulting in a failure to send. This timeout has been changed. Users who explicitly set their own timeouts via setRetries() function call are immune to this issue, assuming the provided values comply with operating requirements set out by the data sheet.
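
For those setting their own values, the first argument to setRetries() is the auto retransmit delay expressed in steps of 250 microseconds, where 0 means 250us and 15 means 4000us. The nRF24L01+ data sheet asks for a delay of 500us or more when running at 250Kbps. A minimal sketch of the conversion follows; the helper names are my own and are not part of the driver.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not part of the RF24 driver.

// Auto retransmit delay, in microseconds, for a setRetries() delay argument (0..15).
constexpr std::uint32_t retryDelayMicros(std::uint32_t delaySteps) {
    return (delaySteps + 1) * 250;
}

// Smallest setRetries() delay argument giving at least `micros` of delay,
// clamped to the largest value the radio supports (15, i.e. 4000us).
constexpr std::uint32_t delayStepsFor(std::uint32_t micros) {
    return micros <= 250 ? 0
         : ((micros + 249) / 250 - 1 > 15 ? 15 : (micros + 249) / 250 - 1);
}
```

So a 500us requirement maps to a delay argument of 1, for example radio.setRetries(1, 15), while anything asking for more than 4000us is clamped to 15.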


Reliability Improvements

Following each write(), the radio was explicitly powered down by the driver. I classify the previous behaviour as a bug. In order to improve reliability and performance, write() no longer powers down the radio.

Additionally, powering down the radio after each write also means the radio will not receive data until the startListening() method is called. This in turn means the radio is completely deaf between the end of the write() call and the end of the following startListening() call. This latency increases the likelihood of a missed transmission on a busy, multi-node network. Powering down also needlessly adds SPI bus traffic. This bug decreases radio reliability and wastes time on the SPI bus. Fixing this behaviour means the radio can now function optimally from standby mode while using less application time.

By allowing the radio to enter standby mode, the radio will continue to listen for messages, including ACKs/NAKs, and transmissions which might otherwise be missed; as intended. The radio will respond dramatically faster from standby activation than it does from a powered down state. This in turn is a performance optimization which reduces the window for lost packets.

This is an important improvement because while the radio is free to process both ACK/NAK reply messages, it can also receive other messages even though startListening() has not been called. The messages will simply wait in the corresponding rx pipe until it is processed by the application. As such, this change also increases the radio's parallel pipe performance. Another performance optimization.

See the Compatibility section for more details.

Higher Performance

The driver now has fewer delays and blocking calls. Fewer SPI reads/writes now take place within various radio method calls. This in turn means more CPU is available for applications. In doing so, a number of internal function calls were removed. This has the effect of very minor memory footprint improvements.

Multicasting

This driver adds support for multicasting. This allows a single transmitter to transmit exactly once, allowing for multiple concurrent receivers, without changing the radio's operational mode. This means use of this feature does not interfere with the use of auto-acknowledgement. This is because auto-acknowledgement is a radio operational mode, whereas the new multicasting implementation leverages a per-message mode. As such, it does not interfere with auto-acknowledgements in any way.

Multicasted messages are inherently unreliable. Even with auto-acknowledgement enabled, multicast messages will never be ACK'd or NAK'd. They are fire and forget. Either the message is received, or it's not.

To multicast a message, use the write() or startWrite() methods. There is now a third optional argument. If the third argument is not provided, or 'false' is used, it will transmit exactly as previously. If the third argument is 'true', the packet will be multicasted.

Example: radio.write( &msg, sizeof(msg), true ) ;

Closing Pipes

A new method, closeReadingPipe(), has been added. This allows for a previously opened reading pipe to be shut down. A pipe which has been closed will no longer accept messages for the corresponding pipe address.

Variable Timeouts

A new method, getMaxTimeout(), is now available. This method returns the maximum number of microseconds a read/write operation will take to successfully complete. The value is calculated based on radio configuration at the time the method is called. Reconfiguration of the radio via setRetries() will invalidate the results returned from getMaxTimeout().

Compatibility

Compatibility should not be an issue unless your application depends on a power management side effect. If it does, see Power Management.

Power Management

Power management is now an explicit mechanism. Applications which errantly rely on the driver to handle power management as a side effect will find higher power demands. The fix is for the application to properly implement powerDown() and powerUp() calls as needed.

Battery powered projects might see a minuscule increase in power requirements, but ultimately it's up to the application to match what was previously a side effect of the driver.

Timeout Calculation

If you previously had loops which look something like the following, where 'myTimeoutValue' is a fixed value, a better solution is now available.

unsigned long t = millis() ;
unsigned long myTimeoutValue = 250UL ;
// Keep waiting only while nothing has arrived AND the timeout has not yet expired.
while( !radio.available() && millis() - t < myTimeoutValue ) {
}

Now, you can initialize myTimeoutValue as follows. Notice it rounds up to the next millisecond.
unsigned long myTimeoutValue = 1 + (radio.getMaxTimeout()/1000) ;

Doing so will ensure the minimal amount of time is spent waiting for a transmission to complete.

As a reminder, the timeout above only covers the radio's own transmit and acknowledge cycle. If you enter that loop immediately after a write, it will not provide time for the remote's code to process its message and reply, unless you are using enableAckPayloads() and a payload is immediately available for transmission on the remote's end. As such, some experimentation may be required on a per application basis. Regardless, this mechanism allows for timed loop optimizations.

Testing

At this point, this fork has been tested by three users (me being one), on nine radios, two platforms (Arduino and Raspberry Pi), and four different Arduino models (uno, nano, mega2560, and due). Thus far it's been 100% compatible.

Update: At this point many people on both the Arduino and rPi platforms have tested this fork. All reports are good, confirming the validity of these changes.


Tuesday, April 2, 2013

Improve RF24 Radio Performance With Proper Addressing Schemes

RF24 Addressing Gotchas

The NRF24L01(+) radios require addresses be configured on each of their MultiCeiver pipelines. Without much thought, people assign addresses to these pipelines. This can be problematic because of RF decoding requirements as dictated by the preamble (data transmitted in front) of each message. Read 7.3.1 and 7.3.2 of the data sheet for more information.

A longer addressing scheme allows for more opportunity to disambiguate from the preamble. As such, in my opinion, the maximum available address length should be preferred. 40-bit addressing is most effective when proper addressing is provided to each pipeline. Per the data sheet, addresses which use single bit transitions should be avoided as they are more likely to confuse preamble message detection logic should the preamble become distorted from noise.

As such, addresses with the following octets and/or nibbles (half byte) should be avoided.

Octet values to avoid:
'0xaa' - '0b10101010'
'0x55' - '0b01010101'
'0x2a' - '0b00101010'
'0x15' - '0b00010101'

Nibble values to avoid:
'0x0a' - '0b00001010'
'0x05' - '0b00000101'
'0x02' - '0b00000010'
'0x01' - '0b00000001'

This in turn means addresses with those nibbles should not be used. Remember, a nibble is half an octet, so two of those nibbles can be used to create an octet. The octets of 0xaa and 0x55 are very bad. Always avoid them. Notice these appear in the octet list to avoid and that they are made up of a pair of nibbles from our nibble list to avoid. This is because both nibbles of both octets (both halves of the byte) are 0x0a or 0x05; making for an octet of 0xaa or 0x55. Accordingly, octets 0xaa and 0x55 represent the worst possible octet to use for an address because they exactly mirror the preamble of messages.

Other nibble combinations to avoid, for example, would be 0xa1, 0x52, 0x12, 0x25, and so on. But even simple octets, such as 0x01, 0x02, and 0x05, should also be avoided.

In general, use of values contained above is more likely to cause the loss of packets and in turn, require additional retransmissions if using the auto-acknowledgement hardware features.
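
If you'd rather have the compiler check a candidate address against the nibble list above, here is a minimal sketch. It flags any octet containing one of the four nibbles listed (0x0a, 0x05, 0x02, 0x01) and then walks the five octets of a 40-bit address. The function names are my own, and this only encodes the rule of thumb from this article; it says nothing about other patterns the data sheet warns about.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; they encode this article's
// rule of thumb and are not part of the RF24 driver.

constexpr bool isRiskyNibble(std::uint8_t nibble) {
    return nibble == 0x0A || nibble == 0x05 || nibble == 0x02 || nibble == 0x01;
}

// An octet is flagged when either of its halves is on the nibble list.
constexpr bool isRiskyOctet(std::uint8_t octet) {
    return isRiskyNibble(static_cast<std::uint8_t>(octet >> 4)) ||
           isRiskyNibble(static_cast<std::uint8_t>(octet & 0x0F));
}

// Checks the low `octets` bytes of an address (5 for 40-bit addressing).
constexpr bool hasRiskyOctet(std::uint64_t address, unsigned octets = 5) {
    return octets != 0 &&
           (isRiskyOctet(static_cast<std::uint8_t>(address & 0xFF)) ||
            hasRiskyOctet(address >> 8, octets - 1));
}
```

For example, hasRiskyOctet(0xC3C3C3C3A1) is true because of the trailing 0xa1, while hasRiskyOctet(0xC3C3C3C3C3) is false.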

But Wait, There's More

The list above is hardly the exclusive list of bit patterns to avoid. In section 7.3.2, the data sheet says, "Addresses where the level shifts only one time (that is, 000FFFFFFF) can often be detected in noise and can give a false detection, which may give a raised Packet Error Rate [(PER)]. Addresses as a continuation of the preamble (hi-low toggle; [single bit transitions, as referred above]) also raises the Packet Error Rate."

As such, an address of 0xFFFFFFFFFF should never be used. And in general, octet sequences of all bits on or off should be avoided, which means nibbles of 0x0F and 0x00 should be frowned upon.
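
One way to spot both extremes the data sheet describes is to count how many times the level shifts between neighbouring bits of a 40-bit address. The sketch below does that with an XOR against the address shifted by one bit. The function names are my own, and where to draw the line between "too few" and "too many" shifts is left to you; the data sheet only calls out the two extremes.

```cpp
#include <cstdint>

// Hypothetical helpers for illustration only; not part of the RF24 driver.

// Number of set bits in v.
constexpr unsigned countSetBits(std::uint64_t v) {
    return v == 0 ? 0u : static_cast<unsigned>(v & 1u) + countSetBits(v >> 1);
}

// Number of 0->1 or 1->0 level shifts between adjacent bits of a 40-bit
// address. There are 39 adjacent pairs, so the result ranges from 0 to 39.
constexpr unsigned levelShifts(std::uint64_t address40) {
    return countSetBits((address40 ^ (address40 >> 1)) & 0x7FFFFFFFFFULL);
}
```

The data sheet's 000FFFFFFF example has exactly one level shift, 0xFFFFFFFFFF has none, and 0xAAAAAAAAAA, which continues the preamble's toggling, has the maximum of 39.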

Please note addressing is a little more complex than the tips provided here. These tips should be regarded as rules of thumb rather than absolutes. Regardless, if you follow the advice here, your reliability is generally improved (PER is reduced).