How Do You Troubleshoot Data Corruption?

Posted by: Kerri McConnell

In the world of embedded computing, data-related failures are unfortunately part of being in business. Even with the right hardware, software and development, frustrating and costly failures can occur. But when issues do surface, many companies don’t possess the right tools to troubleshoot the challenges they run into.

Companies often call upon Datalight, as experts in making flash data storage reliable in embedded systems, to figure out what went wrong and how to fix it. Over the years, we have developed a robust set of tools to diagnose data issues, often even without the benefit of a reproducible case.

Flash failures are often complex, making it difficult to discern whether the problem resides in the file system, the flash driver or the hardware. Almost every corruption we see is unique. The issue could result from power failures, or the system may have experienced bit flips caused by over-programming or read disturb.
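When a known-good copy of a sector is available, one rough way to tell a bit flip from a larger failure is to count how many bits differ. The sketch below is only an illustration: the function names and the `flip_threshold` value are hypothetical, not part of any Datalight tool, and a real threshold would depend on the flash part and its ECC strength.

```python
def count_flipped_bits(expected: bytes, observed: bytes) -> int:
    """Return the number of bit positions that differ between two equal-length buffers."""
    if len(expected) != len(observed):
        raise ValueError("buffers must be the same length")
    # XOR leaves a 1 in every position where the two bytes disagree.
    return sum(bin(a ^ b).count("1") for a, b in zip(expected, observed))


def classify_sector(expected: bytes, observed: bytes, flip_threshold: int = 4) -> str:
    """Hypothetical heuristic: a handful of differing bits hints at bit flips,
    while many differing bits hints at an interrupted or misplaced write."""
    diff = count_flipped_bits(expected, observed)
    if diff == 0:
        return "intact"
    if diff <= flip_threshold:
        return "possible bit flip"
    return "possible interrupted or misplaced write"
```

A result like this only narrows the search. It does not by itself say whether the file system, the driver or the hardware is at fault.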

As we have investigated more failures, patterns have emerged, and we have been able to devise several tools and methods to diagnose errors. Errors are rarely identified quickly. Rather, the process tends to be one of trial and error, eliminating possible causes one at a time instead of landing directly on the root cause.

Recently, Datalight solved an issue for an automotive client that was seeing corruption in images. The customer had grouped the failures into nearly a dozen different “symptom buckets” that seemed to be unrelated. Datalight engineers identified patterns that enabled them to combine symptoms, reducing them to a few suspected root cause buckets, one of which led to a solution.

Datalight’s Reliance family of file systems makes diagnosing hardware failures easier than ever before. Because it never overwrites live data, we can replay exactly what the system was writing and locate the bad sector. We can also examine erase counts on the flash media. This combination usually enables us to identify hardware errors.
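The replay idea can be sketched in a few lines. This is a simplified illustration under stated assumptions, not Reliance's actual format or Datalight's tooling: the write log is assumed to be a list of (sector, data) pairs, the media is modeled as a dictionary of sector contents, and the names `replay_writes`, `find_bad_sectors` and `high_wear_blocks` are hypothetical.

```python
import statistics
from typing import Dict, List, Tuple

WriteRecord = Tuple[int, bytes]  # (sector number, data written); assumed log format


def replay_writes(log: List[WriteRecord]) -> Dict[int, bytes]:
    """Rebuild the expected contents of each sector by applying writes in order."""
    expected: Dict[int, bytes] = {}
    for sector, data in log:
        expected[sector] = data  # the most recent write to a sector wins
    return expected


def find_bad_sectors(log: List[WriteRecord], media: Dict[int, bytes]) -> List[int]:
    """Return sectors whose contents on the media differ from the replayed writes."""
    expected = replay_writes(log)
    return sorted(s for s, data in expected.items() if media.get(s) != data)


def high_wear_blocks(erase_counts: Dict[int, int], factor: float = 2.0) -> List[int]:
    """Return blocks whose erase count exceeds `factor` times the median count.
    The factor is an arbitrary illustrative cutoff, not a vendor specification."""
    if not erase_counts:
        return []
    median = statistics.median(erase_counts.values())
    return sorted(b for b, count in erase_counts.items() if count > factor * median)
```

For example, replaying a log of three writes against a media image in which sector 1 holds unexpected data reports sector 1 as the mismatch. If that sector also sits in a block with an unusually high erase count, that points toward the hardware rather than the software.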

Download our new guide Troubleshooting Data Corruption on NAND Flash Memory for an in-depth discussion of the topic.

 


