How can I replace a device that has failed in the field?

Categories:
  • Software : Embedded
  • Software : Networking
  • EmberZNet PRO

When a node fails in the field, you have two options for replacing it:

1. Cloned node method
Clone the non-volatile data in its entirety by taking a HEX dump of the original node (see http://portal.ember.com/hex_dump for guidance), including Manufacturing Token data, and then restoring this HEX dump (overwriting the existing mfg tokens) to the new device, replacing its EUI64 with that of the old node. (For EZSP designs, only the NCP needs to be cloned for this process to work.) The new device then simply calls NetworkInit to restore its settings from non-volatile memory. This method works in nearly all cases, but it requires access to the original device, with its flash contents intact and readable with external tools (which may not be the case if the original device has been irreparably damaged), as well as access to a field programming device (such as an InSight USB Link) that can read and write flash data.

2. Secondary node method
In this method, no backup of data from the original device is required, since the new device isn’t a “replacement” for the physical device itself, just a secondary device fulfilling the role of the original device (like adding a backup controller that other nodes can use when the primary controller fails). Have the new device simply join the network and inform the others that it has taken over responsibility for the old device (acting in a secondary role as a replacement for the primary device). If the network can’t be put back into a commissioning mode (where Permit Joining is set True on at least one accessible device), you may need to force the new node to rejoin (rather than joining through the traditional Association process) by using an EmberJoinMethod parameter of EMBER_USE_NWK_REJOIN in the EmberNetworkParameters structure passed to emberJoinNetwork. Note that informing the other devices about the presence of the secondary node, and getting them to use that device instead of the primary node, is up to the application. Methods for doing this depend on your implementation. (For example, in Ember’s Sensor/Sink application sample, the fail-over from the original Sink node to a secondary, replacement Sink node would happen automatically after a few missed reporting periods.)
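
As a rough illustration of the join-method choice described above, the following C sketch fills in join parameters for a replacement node. The type definitions are simplified stand-ins for EmberZNet's EmberJoinMethod and EmberNetworkParameters (the real ones come from the stack headers and contain more fields), the numeric enum values are assumptions, and prepareReplacementJoin with its permitJoiningAvailable argument is a hypothetical helper introduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Simplified stand-ins for the EmberZNet types named in the text. The real
 * definitions come from the stack headers and carry more fields; the
 * numeric enum values here are assumptions for illustration only. */
typedef enum {
  EMBER_USE_MAC_ASSOCIATION = 0,  /* normal join; needs Permit Joining */
  EMBER_USE_NWK_REJOIN = 1        /* forced rejoin; bypasses Permit Joining */
} EmberJoinMethod;

typedef struct {
  uint8_t extendedPanId[8];
  uint16_t panId;
  uint8_t radioChannel;
  EmberJoinMethod joinMethod;
} EmberNetworkParameters;

/* Hypothetical helper: fill in the parameters for a replacement node.
 * permitJoiningAvailable says whether some reachable device currently
 * has Permit Joining enabled. */
void prepareReplacementJoin(EmberNetworkParameters *params,
                            const uint8_t extendedPanId[8],
                            uint16_t panId,
                            uint8_t radioChannel,
                            bool permitJoiningAvailable)
{
  assert(params != NULL && extendedPanId != NULL);
  memset(params, 0, sizeof(*params));
  memcpy(params->extendedPanId, extendedPanId, 8);
  params->panId = panId;
  params->radioChannel = radioChannel;
  /* Fall back to a forced rejoin when no device will permit association. */
  params->joinMethod = permitJoiningAvailable ? EMBER_USE_MAC_ASSOCIATION
                                              : EMBER_USE_NWK_REJOIN;
}
```

In a real application the filled-in structure would then be passed to emberJoinNetwork, as described above.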

These methods are the only real ways to ensure that the new device can
participate successfully in the network without a lot of confusion about
security settings and so forth.

With regard to the Cloned Node method: unless the replacement device has the same EUI64, node ID, and security frame counter values as the old device, there will likely be communication issues with the new device. This is why a full HEX dump, including stack token data and manufacturing token data, is necessary.
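
To make the requirement concrete, here is a minimal C sketch of a sanity check that a bench tool might run after cloning. The NodeIdentity record and cloneMatchesOriginal function are hypothetical and not part of EmberZNet. Accepting a frame counter that is equal to or ahead of the original's is an assumption based on how replay protection works; a lower value would make the clone's frames look like replays.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical record of the identity values named in the text. */
typedef struct {
  uint8_t eui64[8];
  uint16_t nodeId;
  uint32_t nwkFrameCounter;  /* outgoing NWK-layer security frame counter */
} NodeIdentity;

/* Returns true if the clone would look like the original to its peers.
 * Assumption: the clone's frame counter must not be behind the
 * original's, otherwise receivers may reject its frames as replays. */
bool cloneMatchesOriginal(const NodeIdentity *original,
                          const NodeIdentity *clone)
{
  assert(original != NULL && clone != NULL);
  return memcmp(original->eui64, clone->eui64, 8) == 0
      && original->nodeId == clone->nodeId
      && clone->nwkFrameCounter >= original->nwkFrameCounter;
}
```

A clone restored from a full HEX dump passes this check by construction, whereas a freshly programmed device whose counter starts at zero would fail it.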

Now, regarding the coordinator in particular: since the coordinator role is essentially just a special router that happens to be the first device in the network (and chooses the settings for the PAN), there is normally no reason that it can’t simply be replaced with a new device that joins the existing network as a router and takes over where the coordinator left off (as per the Secondary Node Method above). However, since the coordinator is also the sole Trust Center for most networks (unless you have enabled the EmberZNet-only option of Distributed Trust Center mode when forming the PAN), the network maintainer is then faced with the issue of “Trust Center replacement”, which (to be frank) is a problem for which ZigBee currently does not provide any official solution. (Ember is working on creative [and interoperable] ways to solve this problem in the future, but it’s not an easy thing to sort out. Furthermore, ZigBee’s Smart Energy group is looking at an SE-specific solution for replacing the ESP in an SE network, and they hope to make this available in an upcoming revision to the SE application profile.) While the Cloned Node Method above can be used here, that method is often impractical in the field (because the device’s flash is inaccessible or no experienced technician is available to perform the process onsite).

As far as some guidelines for device replacement in general:

  • If your device is anything other than the Trust Center, you can install its replacement into the network in a manner similar to adding any new device to an existing system (as per the Secondary Node Method discussed above): permit joining at one or more devices within range of the incoming device, then initiate a normal JoinNetwork action at the replacement device. It should enter the network and receive the network key, after which the PermitJoining flag can be disabled as needed.
  • If you rely on some “Controller” device to tell the other nodes to permit joining via OTA messaging (using emberPermitJoiningRequest() to construct a ZDO command; see http://portal.ember.com/zdo for more info on that topic), and you need to replace this Controller (assuming the Controller is not the Trust Center device), you can have your replacement Controller node force a rejoin (to get around the PermitJoining restriction) by calling emberJoinNetwork() with the networkParameters.joinMethod struct member set to EMBER_USE_NWK_REJOIN. This process could also be used if your network’s policy is set to not allow new joins, only rejoins.
  • To avoid the issue of failure of a single, centralized Trust Center, it is possible (in EmberZNet) to form a network that uses “Distributed Trust Center” [DTC] mode. This is done by setting the EMBER_DISTRIBUTED_TRUST_CENTER_MODE flag in your initial security bitmask before calling FormNetwork. Using DTC mode ensures that all routers (including the coordinator) can perform authentication for their own local joining devices rather than having to consult a central TC node (the coordinator). This removes the single point of failure but also could make authentication more difficult if you were relying on a single connection between some node (the central TC) and an out-of-band (non-ZigBee) system to track authorized/refused nodes in your PAN. The main downsides to this DTC approach though are: (1) it’s not a ZigBee standard mechanism, so non-EmberZNet devices won’t operate properly in a network using this mode, meaning it’s only usable if you run a closed network of all EmberZNet nodes; (2) it prevents you from updating the Network layer encryption key after the network has been formed, which can be problematic for networks that wish to roll their security key periodically to mitigate risk of compromise.
  • If your TC dies and you are not using DTC mode, your network is not completely useless: your replacement device could simply come back into the network (using one of the two methods described above) as a normal router and fulfill its application layer duties; it just wouldn’t be able to handle authentication requests (such as allowing joining and insecure rejoining of devices and facilitating security key updates). If the network is fairly static (no joining or insecure rejoining occurring), it may be feasible for your PAN to operate in this state.
  • Since a lot of security information about the TC persists among the other nodes, and since the TC role is tied to a particular EUI64, there are only two options for actually “replacing” the TC. One is to start with a new node, form a new PAN, and get all the devices from the original PAN to leave their network and join this new PAN (like starting a system from scratch). The other is to take a HEX dump of the old TC (assuming it is not so dead that its flash can no longer be read), clone that HEX image to the replacement device, and program the old device’s EUI64 into the replacement node as a Custom EUI64 (so that it looks identical to the old node). The replacement should then be able to call NetworkInit to restore itself on the network as if it were the original device.
  • The main problem is keeping track of the outgoing security frame counters… For each key (APS or NWK layer) a device has a unique outgoing frame counter value that is incremented with every frame transmitted with that encryption key and placed into the packet (causing a new Message Integrity Code [MIC] to be generated and appended to the packet as well). The receiving device tracks this counter and uses it to guard against replay attacks (since the same payload transmitted twice will have different frame counters and therefore the MIC hashed against the security key and frame counter will be unique each time).
  • Concerning EZSP network coprocessor [NCP] designs:
    You can read the security frame counters back from the NCP via GetKeyTableEntry (for per-device APS keys) and GetKey (for NWK key and TC link key), but there is no method by which you can tell the NCP, “Go set up a network using the following frame counter data.” So the only way to properly restore the frame counters (which live in tokens) would be to clone the SimEEPROM memory (where the token data resides) from the device in question, as per Cloned Node Method above.
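
The frame counter behavior described above can be modeled with a short C sketch. This is a simplified illustration of replay protection in general, not EmberZNet's actual bookkeeping; the SenderState type and acceptFrame function are hypothetical names. It shows why a replacement device whose outgoing counter restarts from a low value gets its frames rejected by peers that remember the old device's counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-sender state kept by a receiving node. */
typedef struct {
  bool seenAny;               /* has any frame been accepted yet? */
  uint32_t lastFrameCounter;  /* counter of the last accepted frame */
} SenderState;

/* Accept a frame only if its counter is strictly greater than the last
 * accepted one from this sender; anything else is treated as a replay. */
bool acceptFrame(SenderState *sender, uint32_t frameCounter)
{
  assert(sender != NULL);
  if (sender->seenAny && frameCounter <= sender->lastFrameCounter) {
    return false;  /* replayed or stale frame */
  }
  sender->seenAny = true;
  sender->lastFrameCounter = frameCounter;
  return true;
}
```

For example, if a receiver last accepted counter 1001 from the old device, a replacement that transmits with counter 0 is rejected until its counter climbs past 1001, which is why the cloned token data matters.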

©2007-2008 Ember Corporation | All rights reserved | Privacy