How do I get detailed reset cause information on EM35x chips?
Sure, you can get a simple reset reason code from halGetResetString(), and with the EM35x HAL you can even get some more verbose description of that reset type through halGetExtendedResetString(), but these only tell you what general variety of reset you experienced.
For certain kinds of resets, it may be useful to just know the approximate program counter (PC) address where the code ended up just prior to the crash. In those cases, the PCDiagnostics routines can be used to pinpoint the last known PC, similar to how this works on Ember’s EM250 platform. (See http://portal.ember.com/resets for a quick tutorial on using those functions.) Once you have the PC, you can reference that back to your program’s list file (*.LST) to determine what function the reset occurred from, or at least whether this is in your application code or Ember’s pre-built libraries.
To go a step further and work out what actually caused the crash (assuming you don’t already know and can’t easily find out by watching the program with a debugger attached), the best approach is to use the EM35x HAL’s utilities for dumping crash information to a serial port. You can then let your program run freely and monitor the serial output for crash diagnostics whenever an unintended reset occurs.
First, a little background on how the EM35x HAL handles unexpected resets (“crashes”):
What kind of information is captured?
The EM3xx platform has a lot of possible crash causes; the Cortex-M3 core and its MPU detect many different types, and others are found by Ember-specific hardware (clock failure, RAM protection, etc.) or by debugging software (asserts). The standard MPU configuration protects against many common errors such as dereferencing a null pointer.
The data captured depends on the type of crash. A crash detected by MPU usually captures the most data, including the address of the instruction that caused the crash and, if the crash was due to an illegal memory access, the location accessed illegally. At the other extreme, a CPU lockup crash cannot capture any data because the CPU is stopped until it is reset by the watchdog or by some other means.
How is crash information preserved across resets?
Most crashes result from a fault or a non-maskable interrupt (NMI) and vector to a common fault handler (in faults.s79) that does the following:
1. Saves the processor’s general purpose registers in the nvData.crash structure located at the start of RAM. This structure is intended to be shared with the stack, and may be overwritten once the program no longer needs the crash data.
2. Checks that the stack pointer is within the stack segment and that using it will not overwrite nvData.crash. If necessary, the stack pointer is reset to the top of the stack segment.
3. Calls halCrashHandler() which does the following:
* If the stack pointer was valid when the fault occurred, saves the program counter and processor status register, and then searches upward in the stack and records up to six return addresses.
* Computes the maximum stack usage by searching upward from the stack bottom until it finds a word not equal to the startup fill value (0xCDCDCDCD).
* Saves various NVIC registers useful in diagnosing the crash cause.
* Determines the crash reason based on the type of fault or NMI, and returns this to the common fault handler.
4. Calls halInternalSysReset() which stops any DMA in progress and then resets the processor.
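The stack-usage step in the list above can be illustrated with a short host-side sketch. This is not the HAL's implementation; the function name stack_bytes_used and its parameters are assumptions made for illustration. Only the fill value 0xCDCDCDCD and the "search upward from the bottom" idea come from the description above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Fill value written to the stack at startup, as described in the text. */
#define STACK_FILL_VALUE 0xCDCDCDCDu

/* Hypothetical helper: stack_bottom points at the lowest word of the stack
 * segment and stack_words is the segment length in 32-bit words. The stack
 * grows downward, so every word from the first non-fill word up to the top
 * of the segment is counted as used. */
size_t stack_bytes_used(const uint32_t *stack_bottom, size_t stack_words)
{
    assert(stack_bottom != NULL);

    size_t untouched = 0;
    while (untouched < stack_words &&
           stack_bottom[untouched] == STACK_FILL_VALUE) {
        untouched++;
    }
    return (stack_words - untouched) * sizeof(uint32_t);
}
```

The result is only an estimate: a stack word that legitimately holds 0xCDCDCDCD at the deepest point of usage would be counted as untouched.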
…So now that you understand what happens, here’s how you access that information:
Displaying crash data
To output crash data to a serial port, your software must call one or more crash diagnostic functions. For the most complete output, include the following code fragment, ideally so that it executes fairly soon after main() is entered. (This reduces the risk of the stack overwriting the crash data.) The function halResetWasCrash() checks the reset cause and returns FALSE unless the reset was due to a crash.
    if (halResetWasCrash()) {
      halPrintCrashSummary(port);
      halPrintCrashDetails(port);
      halPrintCrashData(port);
    }
To conserve flash, you can choose to call only one or two of these functions.
In case you’re wondering what this information looks like, here is a sample of the various kinds of information this prints out with different functions from above:
    Thread mode using main stack (20000094 to 20000694), SP = 2000059C
    376 bytes used (24%) in main stack (out of 1536 bytes total)
    No interrupts active
    Reset cause: Memory Management Fault
    Instruction address: 08004E6A
    Illegal access address: 00000000
    CFSR.DACCVIOL: attempted load or store at an illegal address
    CFSR.MMARVALID: MMAR contains valid fault address
    R0 = 00000000, R1 = 00000000, R2 = 080100DC, R3 = 00000000
    R4 = 200016E8, R5 = 00000000, R6 = 2000170C, R7 = 00000020
    R8 = 00000000, R9 = 200016EC, R10 = 00000000, R11 = 00000001
    R12 = 20000840, R13(LR) = FFFFFFF9, MSP = 2000059C, PSP = 00000000
    PC = 08004E6A, xPSR = 01000000, MSP used = 00000178, PSP used = 00000000
    ICSR = 00000804, SHCSR = 00070001, INT_ACTIVE = 00000000, CFSR = 00000082
    HFSR = 00000000, DFSR = 0000000B, MMAR/BFAR = 00000000, AFSR = 00000000
    Ret0 = 0800F98E, Ret1 = 0800F6EA, Ret2 = 080004C0, Ret3 = 0800BE52
    Ret4 = 00000000, Ret5 = 00000000, Dat0 = F0416801, Dat1 = 60010140
Here, lines 01-03 were output by halPrintCrashSummary(), lines 04-08 by halPrintCrashDetails(), and lines 09-17 by halPrintCrashData().
01 execution mode
This identifies the CPU execution mode (thread or handler) and the stack in use (main or process). If a fault or NMI occurs in an ISR, it will be in handler mode, otherwise it will be thread mode. Currently Ember software always uses the main stack.
02 stack data
This is an estimate of the maximum stack usage found by searching upward from the stack segment bottom to find the first word that does not contain the fill value written at startup.
03 interrupts active
This lists the interrupt(s) active or pre-empted when the fault or NMI occurred.
04 reset cause
This identifies the type of fault, NMI or other crash cause:
- Memory management fault – caused by a data read or write to an address protected by the MPU, or by trying to execute an instruction from an address where that is not allowed.
- Usage fault – caused by executing an undefined instruction, dividing by zero, and various other problems with instruction execution.
- Bus fault – caused by writing to a non-existent register in the Ember peripheral area, or writing to protected RAM using user (unprivileged) access. When due to a write, a bus fault may be imprecise. This means that it doesn’t occur immediately, but is delayed due to write buffering. As a result, it is not possible to capture the exact address of the instruction that performed the write.
- NMI – caused by watchdog timeout (at the earlier low-water mark), or by failure of the 24 MHz crystal clock when it is in use.
- Hard fault – can be caused by a second fault within a fault handler, a bad vector table address and other reasons.
- Debug monitor fault – caused by unsupported debug events.
05 faulting instruction address
The address of the instruction that caused a fault, or that was interrupted by an NMI. In the case of imprecise bus faults, the address is that of an instruction executed several cycles after the one that caused the fault. If this address is zero, the stack pointer was invalid and the address is not known.
06 fault address
The address whose illegal access resulted in a memory management or bus fault. Note that it may be legal to access an address in one way, but not in others. For example, RAM can be read and written, but cannot be executed. (The flash functions disable the MPU to do this.)
07-08 fault status details
These lines decode the various flags set in the CFSR register. The number of lines depends on how many flags are set.
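If you only have the raw CFSR word from the register dump (CFSR = 00000082 in the sample), you can decode it yourself. The sketch below uses the flag names and bit positions of the Cortex-M3 Configurable Fault Status Register; the helper names cfsr_has_flag and cfsr_print_flags are hypothetical and are not part of the HAL.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

typedef struct {
    uint32_t mask;
    const char *name;
} CfsrFlag;

/* Cortex-M3 CFSR flags: memory management (bits 0-7),
 * bus fault (bits 8-15) and usage fault (bits 16-31). */
static const CfsrFlag cfsr_flags[] = {
    { 1u << 0,  "IACCVIOL"    }, { 1u << 1,  "DACCVIOL"   },
    { 1u << 3,  "MUNSTKERR"   }, { 1u << 4,  "MSTKERR"    },
    { 1u << 7,  "MMARVALID"   }, { 1u << 8,  "IBUSERR"    },
    { 1u << 9,  "PRECISERR"   }, { 1u << 10, "IMPRECISERR" },
    { 1u << 11, "UNSTKERR"    }, { 1u << 12, "STKERR"     },
    { 1u << 15, "BFARVALID"   }, { 1u << 16, "UNDEFINSTR" },
    { 1u << 17, "INVSTATE"    }, { 1u << 18, "INVPC"      },
    { 1u << 19, "NOCP"        }, { 1u << 24, "UNALIGNED"  },
    { 1u << 25, "DIVBYZERO"   },
};

#define CFSR_FLAG_COUNT (sizeof cfsr_flags / sizeof cfsr_flags[0])

/* Hypothetical helper: returns 1 if the named flag is set in cfsr,
 * 0 if it is clear or the name is unknown. */
int cfsr_has_flag(uint32_t cfsr, const char *name)
{
    assert(name != NULL);
    for (size_t i = 0; i < CFSR_FLAG_COUNT; i++) {
        if (strcmp(cfsr_flags[i].name, name) == 0) {
            return (cfsr & cfsr_flags[i].mask) != 0;
        }
    }
    return 0;
}

/* Hypothetical helper: prints one "CFSR.<flag>" line per set flag
 * and returns how many flags were set. */
int cfsr_print_flags(uint32_t cfsr)
{
    int count = 0;
    for (size_t i = 0; i < CFSR_FLAG_COUNT; i++) {
        if (cfsr & cfsr_flags[i].mask) {
            printf("CFSR.%s\n", cfsr_flags[i].name);
            count++;
        }
    }
    return count;
}
```

For the sample value 0x00000082, bits 1 and 7 are set, which corresponds to the DACCVIOL and MMARVALID lines in the sample output.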
09-13 processor registers
This displays:
- contents of the general purpose registers (R0-R12)
- the link register (LR, printed as R13(LR) in the output; architecturally, the Cortex-M3 link register is R14)
- main and process stack pointers (MSP and PSP)
- program counter (PC) and program status register (xPSR)
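In the sample output the saved link register is FFFFFFF9, which is one of the Cortex-M3 EXC_RETURN values loaded into LR on exception entry. Assuming the saved LR is the value seen by the fault handler (the source does not say so explicitly), it can be used to cross-check the execution mode reported on line 01. The names below are hypothetical, not HAL identifiers.

```c
#include <assert.h>
#include <stdint.h>

typedef enum {
    EXC_RETURN_HANDLER_MAIN_STACK,    /* 0xFFFFFFF1 */
    EXC_RETURN_THREAD_MAIN_STACK,     /* 0xFFFFFFF9 */
    EXC_RETURN_THREAD_PROCESS_STACK,  /* 0xFFFFFFFD */
    EXC_RETURN_NONE                   /* an ordinary return address */
} ExcReturnKind;

/* Classifies a saved LR value using the three EXC_RETURN
 * encodings defined for the Cortex-M3. */
ExcReturnKind classify_exc_return(uint32_t lr)
{
    switch (lr) {
    case 0xFFFFFFF1u: return EXC_RETURN_HANDLER_MAIN_STACK;
    case 0xFFFFFFF9u: return EXC_RETURN_THREAD_MAIN_STACK;
    case 0xFFFFFFFDu: return EXC_RETURN_THREAD_PROCESS_STACK;
    default:          return EXC_RETURN_NONE;
    }
}

/* Returns a human-readable description of the interrupted context. */
const char *describe_exc_return(ExcReturnKind kind)
{
    switch (kind) {
    case EXC_RETURN_HANDLER_MAIN_STACK:
        return "handler mode using main stack";
    case EXC_RETURN_THREAD_MAIN_STACK:
        return "thread mode using main stack";
    case EXC_RETURN_THREAD_PROCESS_STACK:
        return "thread mode using process stack";
    case EXC_RETURN_NONE:
        return "not an EXC_RETURN value";
    }
    assert(0 && "unknown ExcReturnKind");
    return "";
}
```

Classifying 0xFFFFFFF9 gives thread mode on the main stack, which agrees with the first line of the sample output.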
14-15 NVIC registers
The Cortex-M3 Technical Reference Manual’s Nested Vectored Interrupt Controller chapter describes these registers. They identify interrupts in progress or suspended, give fault details, and include the Ember-defined Auxiliary Fault Status Register (AFSR). The AFSR records illegal accesses to Ember peripheral addresses that generate bus faults.
16-17 return addresses (ret0 – ret5)
If there was a valid stack pointer when the fault or NMI occurred, these contain probable return addresses from the function that caused the fault. ret0 is the return address of the faulting or interrupted function, ret1 is the return address from its caller, and so forth. If fewer than six return addresses were found in the stack, the remaining values are set to zero.
The heuristic used to find the return addresses has its limitations:
- If the fault occurred while in an ISR, the address of the instruction interrupted by the ISR will not be in the list.
- Some of the return addresses might actually be pointers to literals in the code segment.
Note that a valid return address points to the instruction immediately following a function call, or BL instruction.
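To check a suspect value by hand, you can test whether the instruction just before it is a call. The sketch below is one possible host-side check run against a copy of the code image. It is not the heuristic the HAL uses, and the function and parameter names are hypothetical. It accepts a candidate if it is preceded by a 32-bit Thumb-2 BL or a 16-bit BLX register instruction.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Reads a little-endian 16-bit halfword from the code image. */
static uint16_t read_halfword(const uint8_t *image, size_t offset)
{
    return (uint16_t)(image[offset] | (image[offset + 1] << 8));
}

/* Hypothetical check: image is a copy of the code segment that is loaded
 * at image_base. Returns true if the instruction immediately before
 * candidate decodes as BL <label> or BLX <Rm>. */
bool looks_like_return_address(const uint8_t *image, uint32_t image_base,
                               size_t image_size, uint32_t candidate)
{
    assert(image != NULL);

    uint32_t addr = candidate & ~1u;  /* ignore the Thumb state bit */
    if (addr < image_base || addr - image_base > image_size) {
        return false;                 /* outside the code image */
    }
    size_t offset = addr - image_base;

    /* 16-bit BLX <Rm>: 0100 0111 1mmm m000 */
    if (offset >= 2 &&
        (read_halfword(image, offset - 2) & 0xFF87u) == 0x4780u) {
        return true;
    }
    /* 32-bit BL: 1111 0xxx xxxx xxxx, then 11x1 xxxx xxxx xxxx */
    if (offset >= 4) {
        uint16_t hw1 = read_halfword(image, offset - 4);
        uint16_t hw2 = read_halfword(image, offset - 2);
        if ((hw1 & 0xF800u) == 0xF000u && (hw2 & 0xD000u) == 0xD000u) {
            return true;
        }
    }
    return false;
}
```

A check like this can still misfire, for example when the second half of an unrelated 32-bit instruction or a literal happens to match one of the bit patterns, so treat a positive result as "probable" rather than certain.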
17 fault-specific data (dat0 – dat1)
These words contain additional data for specific crash causes. In all other cases their content is not meaningful.
- If the crash was due to DMA writing to protected RAM, dat0 and dat1 save the contents of the DMA_PROT_ADDR_REG and DMA_PROT_CH registers. These identify the address written, and the DMA channel that caused the crash, respectively.
- If the crash was caused by a failed software assert, dat0 is a pointer to the filename and dat1 is the line number of assert().
So what do I do with this information?
In most cases, the first step ought to be to determine the function and statement executing when the crash occurred. If the PC was saved (i.e., is not zero), consult the program .map file to find the function containing that address, and then locate the statement within the .lst file containing the function. Recall that the addresses in the list file are relative offsets, and must be added to a base address in the .map file to get the actual address. An alternative approach is to load the program in an identical target, then run the IAR IDE and start its debugger. If you enter the fault PC address into the debugger’s disassembly window, it will show you the code executing at the time of the fault.
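The .map lookup can also be done mechanically once the function start addresses and sizes have been extracted from the map file. The following is a minimal host-side sketch, assuming you have built a table sorted by start address. The MapSymbol type and both functions are hypothetical, and the sketch assumes the base address taken from the .map file is the start of the entry that contains the PC.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* One entry extracted by hand (or by script) from the .map file. */
typedef struct {
    uint32_t start;    /* absolute start address */
    uint32_t size;     /* length in bytes */
    const char *name;  /* function or section name */
} MapSymbol;

/* Binary search over a table sorted by ascending start address.
 * Returns the entry containing address, or NULL if none does. */
const MapSymbol *find_symbol(const MapSymbol *table, size_t count,
                             uint32_t address)
{
    size_t lo = 0;
    size_t hi = count;
    while (lo < hi) {
        size_t mid = lo + (hi - lo) / 2;
        if (address < table[mid].start) {
            hi = mid;
        } else if (address - table[mid].start >= table[mid].size) {
            lo = mid + 1;
        } else {
            return &table[mid];
        }
    }
    return NULL;
}

/* Converts an absolute address into the relative offset to look for
 * in the list file: offset = address - base. */
uint32_t list_file_offset(const MapSymbol *symbol, uint32_t address)
{
    assert(symbol != NULL && address >= symbol->start);
    return address - symbol->start;
}
```

As an illustration with made-up numbers: if the table contained an entry starting at 0x08004E40, the sample PC of 0x08004E6A would fall inside it at offset 0x2A, and that offset is what you would search for in the list file.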
You can use the same procedure to trace back through the function’s callers using the return address data in ret0, ret1, etc. This is useful when a fault occurs within a utility function that is called from numerous places, such as MEMCOPY or a packet buffer function.








