
Managed NAND Performance: It’s All About Use Case

Last week the UK magazine PC Pro published an interesting article about fast SD cards (http://www.pcpro.co.uk/features/380167/does-your-camera-need-a-fast-sd-card), with a good description of the SD card Class system. With some clever testing, they show how six cards perform in a continuous-shooting situation.

These tests also demonstrate how SD card manufacturers have tuned their firmware for sequential writes. A Class 10 card must sustain a minimum of 10 MB/sec of write throughput, and the supplemental Ultra High Speed (UHS) rating indicates a higher clock rate and a correspondingly higher transfer rate. For larger frame sizes (12-megapixel photos, HD video), high transfer rates are a requirement, and the resulting data is almost always sequential, which matches the firmware characteristics well.

The article raises one more interesting point: performance measurements taken with an SD card in a desktop system don’t always reflect the real use case. The authors end up running their tests in an actual camera, getting as close to that use case as possible.

For applications that rely on random I/O (as on tablets and other Android devices), these firmware optimizations aren’t necessary, and in some cases they actually lower random I/O performance. Similar firmware shows up in eMMC media as well. A software solution (such as FlashFXe) can reshape much of the I/O to be more sequential, more closely matching the performance the firmware was optimized for.

At Embedded World a few weeks ago we recorded our demonstration showing the benefits of our new FlashFXe product on eMMC.

Watch our FlashFXe Demo Video Here

Thom Denholm | March 15, 2013 | Flash Memory, Flash Memory Manager, Performance

Even When Not Using a Database, You Are Still Using a Database

Recently, we’ve focused considerable development effort on improving database performance for embedded devices, specifically for Android. This is because Android is a particularly database-centric environment.

On an Android platform, each application is equipped with its own SQLite database. Data stored there is accessible by any class in the application, but not by outside applications. The database is entirely self-contained and serverless, while still being transactional and still using standard SQL for queries. With this approach, a crash in one application (the dreaded “force close” message) will not affect the data store of any other application. While fantastic for protection, this method is quite often implemented on flash media whose firmware is tuned for large sequential reads and writes.
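
To make that concrete, here is a minimal sketch in C against the public SQLite API (the database file and table names are invented for the example). Even a one-row insert inside a transaction ends with the journal being flushed to the media:

```c
#include <stdio.h>
#include <sqlite3.h>

/* Minimal sketch: one transactional insert into a hypothetical "settings"
 * table. Every COMMIT forces SQLite to sync its journal to the media, so
 * even a tiny write becomes flush-heavy, small, random I/O. */
int main(void)
{
    sqlite3 *db;
    char *err = NULL;

    if (sqlite3_open("app.db", &db) != SQLITE_OK)
        return 1;

    sqlite3_exec(db,
        "CREATE TABLE IF NOT EXISTS settings (key TEXT PRIMARY KEY, value TEXT);"
        "BEGIN;"
        "INSERT OR REPLACE INTO settings VALUES ('volume', '7');"
        "COMMIT;",                 /* commit = journal write plus fsync */
        NULL, NULL, &err);

    if (err != NULL) {
        fprintf(stderr, "sqlite error: %s\n", err);
        sqlite3_free(err);
    }
    sqlite3_close(db);
    return 0;
}
```

Multiply that pattern by every application on the device updating its own database, and the flush-heavy, random nature of the workload becomes clear.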

For years, benchmarks have touted the pure performance of a drive through large sequential reads and writes. On managed flash media, the firmware programmers have responded by optimizing for this use case – at the expense of the random I/O used by most databases, including SQLite. Another challenge is the very high ratio of flushes performed by the database (sometimes one flush per write). The majority of database writes are also not aligned to sector boundaries, which is especially problematic for flash media that must program an entire block at a time.

While there are a few unified “flash file systems” for Linux, such as YAFFS and JFFS2, designed specifically for flash memory, they have fallen out of favor because they do not plug neatly into the standard software stack and therefore cannot take advantage of standard Linux features such as the system cache. Traditional file systems such as VFAT and ext2/3/4 can work with flash, but they were not designed with that purpose in mind, so their performance and reliability suffer. For example, discard support has largely been tacked onto Linux file systems and is still considered somewhat experimental; to quote the Linux v3.5 ext4 documentation, discard support is “off by default until sufficient testing has been done.” Another example: file systems on flash memory typically benefit from a copy-on-write design, which ext4 does not use. The reality is that most file systems are designed for desktop (and often server) environments, where high resource usage is acceptable and power loss is infrequent.
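
For a sense of what discard support actually asks of the device, here is a hedged sketch using the Linux BLKDISCARD ioctl from user space. The device path and byte range are placeholders only; a file system issuing discards does the equivalent of this internally for ranges it knows are free:

```c
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fs.h>   /* BLKDISCARD */

/* Sketch: tell the device that a byte range no longer holds useful data,
 * so its firmware can reclaim the underlying flash blocks. The device path
 * and range are placeholders for illustration. */
int main(void)
{
    uint64_t range[2] = { 1024ULL * 1024, 4ULL * 1024 * 1024 }; /* offset, length */
    int fd = open("/dev/mmcblk0", O_WRONLY);

    if (fd < 0)
        return 1;
    if (ioctl(fd, BLKDISCARD, &range) != 0)
        perror("BLKDISCARD");
    close(fd);
    return 0;
}
```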

Our solution to improving database performance on flash memory is to provide a more unified stack in which the various pieces work in a cohesive fashion. Furthermore, the solution is specifically designed for embedded systems using flash memory, where power loss is a common event. Datalight’s Reliance Nitro file system is a transactional, copy-on-write file system, designed from the ground up to support flash memory discards and power-loss-safe operation.

The result of our work in this area is FlashFXe, a new Datalight product built on our many years of experience managing raw NAND, but designed for eMMC. When used together with Reliance Nitro, almost all write operations become sequential and aligned on sector boundaries for the highest performance. Internal operations are more efficiently organized for the copy-on-write nature of flash media. A multi-tiered approach allows small random writes with very frequent flushes to be efficiently handled while maintaining power-loss safe operations.

This month at Embedded World, we will be demonstrating the results of our efforts to improve database performance on embedded devices using Android. Prepare to be impressed!

Learn more about FlashFXe

Thom Denholm | February 12, 2013 | Datalight Products, Flash File System, Performance

Why CRCs Are Important

Datalight’s Reliance Nitro and journaling file systems such as ext4 are designed to recover from unexpected power interruption. These “post mortem” recoveries typically consist of determining which files are in which state and restoring them to a proper working state. Methods like these are fine for recovering from a power failure, but what about a media failure?

When a media block fails, it lies either in the empty space, the user data, or the file system data. A failed block in the empty space is detected on the next write, which will either cause a failure at the application or be handled internally by marking the block bad and moving on to another block. When a media block in the user space fails, it cannot be reliably read; often the media driver will detect and report an unreadable sector, resulting in an error status (and probably no data) to the user application. When a media block containing file system data or metadata fails, it is the responsibility of the file system to detect and, if possible, repair that damage. Often the best thing that can be done is to stop writing to the media immediately.

In some ways, blocks lost to media corruption present a problem similar to recovering deleted files. If the loss is detected quickly enough, the cyclical journal file can be analyzed to determine the previous state of the file system metadata. That information can then be used to construct a replacement for the damaged block, effectively restoring a file.

Metadata checksums were added to several types of file system blocks for ext4 in the 3.5 kernel release. Noticeably absent from that list are the indirect and double-indirect pointer blocks used to map the block trees of very large files. The latest release of Datalight’s Reliance Nitro file system (version 3.0) adds CRCs to all file system metadata and internal blocks, allowing rapid and thorough detection of media failures.

Also optional in this new version of Reliance Nitro are CRCs on user data blocks, for individual files or for entire volumes. When corruption is detected, this failsafe can be configured to write-protect the volume or halt system operation, and diagnostic messages are available to indicate the specific logical block number of the corrupted block.
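
As a generic illustration of the mechanism (not Reliance Nitro’s actual algorithm or on-media layout), a CRC-32 stored alongside each metadata block lets the file system verify on every read that the block still matches what was written:

```c
#include <stdint.h>
#include <stddef.h>

/* Generic CRC-32 (reflected, polynomial 0xEDB88320), illustrative only. */
static uint32_t crc32_calc(const uint8_t *data, size_t len)
{
    uint32_t crc = 0xFFFFFFFFu;

    for (size_t i = 0; i < len; i++) {
        crc ^= data[i];
        for (int bit = 0; bit < 8; bit++)
            crc = (crc >> 1) ^ (0xEDB88320u & (uint32_t)(-(int32_t)(crc & 1)));
    }
    return ~crc;
}

/* A metadata block carries the CRC of its payload; the CRC is recomputed on
 * every read and compared, so silent media corruption is caught immediately.
 * The block layout here is invented for the example. */
struct meta_block {
    uint32_t stored_crc;
    uint8_t  payload[508];
};

static int meta_block_is_valid(const struct meta_block *b)
{
    return crc32_calc(b->payload, sizeof(b->payload)) == b->stored_crc;
}
```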

The combination of full CRC protection on every metadata block and optional protection of user file data blocks is one of the key attributes of this release of Reliance Nitro. Embedded system designers can detect more media failures in testing, and can diagnose failed units more quickly, leading to greater success in the marketplace.

Learn more about Reliance Nitro

Thom Denholm | January 26, 2013 | Flash File System, Flash Memory, Reliability

fsck and chkdsk

Before embedded devices, file systems were designed to work in servers and desktops. Power loss was an infrequent occurrence, so little consideration was given to protecting the data. Frequent checks of the file system structures were important, and were often handled at system startup by a program such as chkdsk (for FAT) or fsck (for Linux file systems). In each case, the OS could also request a run of these utilities when an inconsistency was detected or when power had been interrupted.

The method behind these tools is a check of the entire disk – reading each block to determine whether it is allocated for use, then cross-checking against an allocation list stored elsewhere on the media. FAT file systems have little other protection, and can only flag sections of the media without matching metadata by creating a CHK file for later user analysis. Linux file systems add a journal mechanism to detect which files are affected, and can often correct the damage without user intervention.
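
A grossly simplified sketch of that cross-check might look like the following; the structures and volume size are invented for illustration and do not correspond to any particular file system:

```c
#include <stdint.h>
#include <stdio.h>

#define BLOCK_COUNT 1024   /* hypothetical, tiny volume */

/* Simplified illustration of what fsck/chkdsk do: walk every block reachable
 * from the metadata, then compare against the allocation bitmap stored on the
 * media. Mismatches are the inconsistencies these tools repair. */
static int check_allocation(const uint8_t on_disk_bitmap[BLOCK_COUNT],
                            const uint8_t reachable[BLOCK_COUNT])
{
    int errors = 0;

    for (uint32_t blk = 0; blk < BLOCK_COUNT; blk++) {
        if (reachable[blk] && !on_disk_bitmap[blk]) {
            printf("block %u in use but marked free\n", blk);
            errors++;
        } else if (!reachable[blk] && on_disk_bitmap[blk]) {
            printf("block %u marked used but unreachable (lost cluster)\n", blk);
            errors++;
        }
    }
    return errors;   /* non-zero means the volume needs repair */
}
```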

These utilities are necessary because these basic file systems are not atomic in nature – data and metadata are written separately. Datalight’s Reliance Nitro file system treats updates as a single operation, and thus the file system is never in a state where it would need to be corrected. Our Dynamic Transaction Point technology allows the user to customize just how atomic their design is, protecting not just a block of data and metadata but the whole file – half a JPG is pretty much useless, from a user perspective.

The repairs that fsck and chkdsk can perform are completely unnecessary with the Reliance Nitro file system. At the device design level, this results in quicker boot times for a system that is completely protected from power failure. A file system checker is of course provided, and is useful for detecting failures caused by media corruption.

Taking chkdsk and fsck to the next level of protection would be a tool to repair some media corruption. If a block of data on the media becomes only partially readable, this tool could read it multiple times (to collect as much data as possible) and store the results in a newly allocated block, correcting the file system structures appropriately. User intervention would likely be required to decide whether enough data was recovered to make the effort worthwhile. Stay tuned for more updates on this topic.

Learn more about Dynamic Transaction Point technology

Thom Denholm | January 7, 2013 | Reliability

Reliability with ext4

The challenge: make ext4 just as reliable as Datalight’s Reliance Nitro file system, within the limitations of the POSIX specification. Unlike most real-world embedded designs, performance and media lifetime are not a consideration for this exercise.

1) Change the way programs are written. By changing the habits of coders (and of the underlying libraries), it is possible to gain some measure of reliability. I’m talking specifically about the write(), fsync(), close(), rename() combination discussed on some forums (see the sketch after these two approaches). This gets around the aggressive buffering of ext4 and gives some assurance that a file either exists or doesn’t. What it does not handle is an update that overwrites part of the file, unless an entire new copy of the file is written.

What this also fails to handle is a multithreaded environment. Because the sequence as a whole is not atomic, consistency can be lost if power is interrupted while one thread is performing an fsync() and another is writing or issuing a rename.

2) Journal the data as well as the metadata. For smaller files or overwrites, this approach keeps all the data in the journal, available for replay when recovering from a power loss. While this worked well in ext3, the behavior changed in ext4: the delayed allocation strategy improves performance but can cause more data loss than expected. Applications which frequently rewrite many small files seem especially vulnerable.

Shortening the system’s writeback window, via kernel tunables such as vm.dirty_expire_centisecs (exposed under /proc/sys/vm/), can help mitigate this by reducing the amount of data that can be lost.
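
For reference, here is a sketch of the write()/fsync()/close()/rename() idiom from approach 1. The file names are hypothetical and error handling is abbreviated:

```c
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Sketch of the atomic-replace idiom: write a complete new copy to a temp
 * file, force it to media, then rename over the original. After a power loss
 * either the old or the new file exists, never a mix. File names are
 * hypothetical; error handling is abbreviated. */
static int save_config(const char *data, size_t len)
{
    int fd = open("config.tmp", O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;

    if (write(fd, data, len) != (ssize_t)len || fsync(fd) != 0) {
        close(fd);
        unlink("config.tmp");
        return -1;
    }
    close(fd);

    if (rename("config.tmp", "config.dat") != 0)
        return -1;

    /* For full durability the containing directory should be fsync()ed too. */
    return 0;
}
```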

Some changes just aren’t possible, however. Writes are not atomic, because the metadata and data are written separately; power can be interrupted between the two, no matter what is done at the application level. While these suggestions make ext4 more reliable, the costs in changed user code, lost performance, and additional wear are not insubstantial. Far better to use a file system designed for unexpected power loss, where atomic writes let the system designer decide when to put data at risk.

Thom Denholm | October 31, 2012 | Reliability

Startup and Shutdown: Challenges in User Experience

Embedded device designers have to build systems that handle variations in network traffic, resource-constrained environments, and battery limitations. All of that pales in comparison with the challenges at system startup and power-down time.

When a multithreaded device powers down, each application wants to get its state information committed to the media, and the operating system has its own set of files or registries to update before the power goes off. At the lowest level, MLC NAND flash and eMMC media require power to be maintained while a block is being written, to prevent corruption of that block or others on the media. The result is a large group of block writes arriving at the media level simultaneously, all of which must finish before the power drops.

The only thing worse than having to perform several writes in a short time frame is when those writes exhibit the worst performance characteristics of eMMC media. These block writes typically target scattered locations, which is essentially the definition of random I/O.

When power is restored, the same situation occurs in reverse, with each thread requesting its state information. For some file systems, interrupted writes must be validated, the file system checked, or the journal replayed. Here the time pressure comes not from losing power but from frustrated users, who want the device to be on, now!

This is a very important use case for designers of file systems and block device drivers to consider. While shutdown consists primarily of writes, a multithreaded file system should not block reads; the same applies at startup, where a write from one thread should not block the read threads. Where possible, the block device driver should sequence the reads and writes to match the high-performance characteristics of the media, allowing the most blocks to be accessed in the least amount of time.
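
As a toy illustration of that sequencing idea (not how FlashFX Tera or any shipping driver is implemented), a pending request queue can simply be sorted by starting sector so the media sees mostly ascending, sequential-looking accesses:

```c
#include <stdint.h>
#include <stdlib.h>

/* Toy illustration only: sort the pending request queue by starting sector.
 * Real drivers are far more sophisticated about merging, fairness, and
 * read/write priorities. */
struct io_request {
    uint64_t start_sector;
    uint32_t sector_count;
    int      is_write;
};

static int by_start_sector(const void *a, const void *b)
{
    const struct io_request *ra = a, *rb = b;

    if (ra->start_sector < rb->start_sector)
        return -1;
    return ra->start_sector > rb->start_sector;
}

static void sequence_queue(struct io_request *queue, size_t count)
{
    qsort(queue, count, sizeof(queue[0]), by_start_sector);
}
```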

We are currently working on an eMMC-specific driver that will manage the sequencing of writes. It promises to improve write performance to many times the rate of standard drivers. Check back with us for more on this topic next month!

If possible, applications should also be written to minimize the startup and shutdown impact. With more applications being written by outside developers, this is getting harder to control. The system designer must focus on what they can control – choice of hardware, device drivers and file system to mitigate these problems.

Learn more about eMMC

Thom Denholm | June 26, 2012 | Extended Flash Life, Flash Memory, Performance, Product Benefit, Reliability, Uncategorized

Device Longevity using Software

The new chief executive of Research in Motion Ltd., Thorsten Heins, mentioned recently that 80 to 90 percent of all BlackBerry users in the U.S. are still using older devices rather than the latest BlackBerry 7.

Longevity of a consumer device is something that we at Datalight believe belongs firmly in the hands of the product designer, rather than being limited by the shortened lifespan of poorly managed NAND flash media. Both Datalight’s FlashFX Tera and Reliance Nitro incorporate algorithms which reduce write amplification on all flash media. These methods are especially important on eMMC, which is at its heart NAND flash. In addition, the static and dynamic wear leveling in FlashFX Tera wears all of the flash evenly for the maximum achievable lifetime.
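
Write amplification is simply the ratio of bytes the flash actually programs to bytes the host asked it to write. A small illustrative sketch of that bookkeeping (not Datalight code) shows why the number maps directly to device lifetime:

```c
#include <stdint.h>
#include <stdio.h>

/* Write amplification = bytes physically programmed to NAND divided by bytes
 * the host requested. A factor of 3.0 means the flash wears out three times
 * faster than the host workload alone would suggest. Illustrative only. */
struct wa_counters {
    uint64_t host_bytes_written;   /* requested by the file system / apps  */
    uint64_t nand_bytes_written;   /* actually programmed, incl. GC copies */
};

static double write_amplification(const struct wa_counters *c)
{
    if (c->host_bytes_written == 0)
        return 0.0;
    return (double)c->nand_bytes_written / (double)c->host_bytes_written;
}

int main(void)
{
    struct wa_counters c = { 100ULL << 20, 320ULL << 20 };  /* 100 MB vs 320 MB */

    printf("write amplification: %.2f\n", write_amplification(&c));  /* 3.20 */
    return 0;
}
```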

A shorter lifetime may be acceptable for some consumer devices, such as low-end cell phones. However, many newer converged mobile devices that command a higher price, such as tablets, are expected by consumers to have a much longer lifetime. These devices may be replaced by the primary user with some frequency, but since they are viewed as mini-computers and therefore less “disposable,” they will likely be handed down to younger users rather than discarded or recycled. Consumers will protest if they discover their $500 tablet only has a lifespan of three years, and they will be even more upset if, due to flash densities and write amplification, the next version they purchase has an even shorter lifespan.

How will flash longevity affect your new embedded design?

Thom Denholm | March 6, 2012 | Extended Flash Life, Flash Industry Info, Flash Memory, Flash Memory Manager

The Next Generation File System for Windows

There’s a lot of buzz on the MSDN blog site regarding their latest file system post (http://blogs.msdn.com/b/b8/archive/2012/01/16/building-the-next-generation-file-system-for-windows-refs.aspx) – and plenty of insightful comments as well.

I for one am happy to see people talking about file system features, especially Data Integrity, knowledge of Flash Media, and faster access through B+ trees. Of course, Datalight’s own Reliance Nitro file system has had all this and more for some time now…

Microsoft has a new term for something we’ve often seen in the case of unexpected power loss – a “Torn Write.” They point this out as a specific problem for their journaling file system, NTFS, but updating any file system metadata in place can be problematic. It looks to me like the new file system, ReFS, handles this by bundling metadata writes with other metadata writes or with the file data. If the former, this demonstrates the trade-off between reliability and performance that we are very familiar with at Datalight; bundling smaller writes helps with both spinning media and flash. In time we will see how much control the application developer has over this configuration – another important point for our customers.

One of the commenters posted that error correction belongs at the block device layer, and I tend to agree. Microsoft’s design goal “to detect and correct corruption” is a noble one, but how would they detect corruption in user data? Additional file checksums and ECC algorithms would be intrusive and potentially time-consuming. Keeping watch on vital file system structures is important, of course, and a good backup in case block-level error detection fails.

I look forward to reading more from Microsoft’s file system team in the future, and especially hope to see a roadmap for when these important changes will make it down to the embedded space.

Learn more about what happens during a power interruption.

 

Thom Denholm | January 18, 2012 | Reliability, Uncategorized

Advances in Nonvolatile Memory Interfaces Keep Pace with the Data Volume

This article, entitled Advances in Nonvolatile Memory Interfaces Keep Pace with the Data Volume and recently published in RTC Magazine, gives a nice overview of maintaining performance on newer technologies.

 

Learn more about Datalight and ClearNAND

Michele Pike | November 22, 2011 | Flash Memory, Flash Memory Manager, Performance

EZ NAND Compared to eMMC

A recent article by Doug Wong (http://www.eetimes.com/design/memory-design/4218886) compares the performance characteristics of eMMC and the ONFI EZ NAND specification, specifically Toshiba’s SmartNAND.

One consideration I would add to this excellent summary concerns the availability of drivers. Raw NAND has been around for quite a while, and the market supplies a large range of drivers; many of these will work with the basic functionality of SmartNAND and other EZ NAND chips with only small modifications. Drivers for eMMC, on the other hand, are much harder to find. Only Linux has a freely available driver, which Google’s Android has taken advantage of in recent releases.

At Datalight, we continue to be excited by both of these new technologies. On the JEDEC eMMC side, features such as Secure Delete and Replay Protected Memory Block are very exciting. On the other hand, the sheer performance of Toshiba’s SmartNAND and other EZ NAND solutions is very much in demand.

Thom Denholm | November 8, 2011 | Flash Industry Info, Flash Memory, Performance, Uncategorized