What Is DirectFlash and How Does It Work?

Pure Storage® DirectFlash® is a pioneering flash management solution that includes Purity software and DirectFlash Modules—both components that can be independently and non-disruptively upgraded.

Here’s how it works, why it’s different, and why you need it.

Flash Storage Overview

Invented at Toshiba in 1980, flash memory, also known as flash storage, is a type of non-volatile memory (meaning it retains data without a continuous power supply) that can be electrically erased and reprogrammed.

The Two Main Types of Flash Memory

There are two main types of flash memory, NOR and NAND, which differ at the circuit level and are named for the logic gates their cell wiring resembles. Currently, NAND flash represents more than 95% of the flash memory market and is used in almost all non-embedded flash devices.

Within the NAND category, there are various types of memory, classified based on the number of bits stored per memory cell, including:

  • SLC: One (single) bit per cell
  • MLC: Two (or multiple) bits per cell
  • TLC: Three bits per cell
  • QLC: Four (quad) bits per cell
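
Each extra bit per cell doubles the number of charge states a cell has to hold and distinguish. The short Python sketch below simply tabulates that relationship for the four types listed above; the function and variable names are our own, chosen for illustration.

```python
# Bits stored per cell for each NAND type, as listed above.
BITS_PER_CELL = {"SLC": 1, "MLC": 2, "TLC": 3, "QLC": 4}


def charge_states(bits: int) -> int:
    """Number of distinct charge states needed to encode `bits` bits in one cell."""
    return 2 ** bits


def states_table() -> dict:
    """Map each NAND type to the number of charge states per cell."""
    return {name: charge_states(bits) for name, bits in BITS_PER_CELL.items()}


if __name__ == "__main__":
    for name, states in states_table().items():
        print(f"{name}: {BITS_PER_CELL[name]} bit(s) per cell, {states} charge states")
```

A QLC cell therefore has to separate 16 states where an SLC cell separates only two, which is one reason denser flash is more demanding to program and read.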

What Is DirectFlash Module (DFM) Storage?

The DirectFlash Module (DFM) is a flash module designed by Pure Storage that allows all-flash arrays to communicate directly with raw flash storage. Pure Storage’s holistic approach to building all-flash systems involves using “raw” flash to build our DirectFlash Modules rather than buying commodity solid-state drives (SSDs). By doing this, we get our flash at a different point in the supply chain from other solid-state array vendors. But the benefits of DirectFlash go well beyond better supply chain economics.

Other all-flash or hybrid arrays that use commodity off-the-shelf SSDs talk to their flash drives in essentially the same way they would a legacy hard drive: as if each were one contiguous set of identical blocks.

Hard drives had tracks and sectors, and laying all those sectors end to end was how you got one long list of blocks. SSDs reproduce this same geometry by placing a complex piece of firmware, called a flash translation layer (FTL), between the host system and the flash.

DirectFlash takes a different approach and talks to flash memory directly, which maximizes the capabilities of flash and provides better performance, power utilization, and efficiency.

Specifically, DirectFlash offers:

  • System-level media management, as opposed to drive-level, which means the drives work in concert with the system itself, allowing the system to:
    • Make smarter data placement decisions based on broader context.
    • Understand the activity of the system from the block, file, or object level all the way down to an individual flash cell.
    • Maximize efficiency by laying out data in ways optimized for the media, avoiding write amplification and increasing endurance.
    • Avoid duplicate work by centralizing functions like garbage collection, sparing, and wear leveling.
  • Reduction of overall media costs by eliminating duplicate efforts and processes that happen across every drive in a traditional system. Petabyte-scale systems that leverage SSDs can have terabytes of DRAM in the drives themselves—not even including system memory—to maintain their individual FTL mappings and metadata. Each drive also contains its own overprovisioned spare space that’s necessary for media management by the FTL. Each one of these components is an added cost that as drive sizes increase will make up a larger and larger portion of the overall media cost. The cost per bit of DRAM hasn’t improved in the last several years, so efficient use of DRAM will become more and more critical.
  • Increased module reliability: DirectFlash Modules fail at a much lower rate than SSDs (three to four times lower), primarily because they run simpler firmware.
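
To get a feel for the DRAM figure in the list above, here is a rough back-of-the-envelope calculation in Python. It assumes a flat FTL map with one 4-byte entry per 4 KiB page, a common rule of thumb and not a figure taken from any particular drive; both values are parameters you can change.

```python
def ftl_dram_bytes(flash_bytes: int, page_bytes: int = 4096, entry_bytes: int = 4) -> int:
    """DRAM needed for a flat logical-to-physical map with one entry per page.

    page_bytes and entry_bytes are illustrative assumptions, not vendor data.
    """
    pages = flash_bytes // page_bytes
    return pages * entry_bytes


if __name__ == "__main__":
    petabyte = 10 ** 15
    dram = ftl_dram_bytes(petabyte)
    print(f"1 PB of flash needs about {dram / 10 ** 12:.2f} TB of mapping DRAM")
```

Under these assumptions, every petabyte of SSD capacity carries close to a terabyte of DRAM spread across its drives, before counting any system memory.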

How Solid-state Drives Work

An SSD is composed of NAND flash chips, also known as NAND flash dies, with each die being broken down into smaller elements called blocks, which are made up of pages.

However, flash blocks don’t support random overwrites. Once a page is written with data, it can’t be rewritten until the entire block that contains it has been erased. At the same time, every SSD is built to support a backward-compatible disk sector interface.
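
The erase-before-rewrite rule can be illustrated with a toy model. The Python class below is a simplified sketch, not real drive firmware; the class name, the four-page block size, and the error message are all made up for the example.

```python
class FlashBlock:
    """Toy model of one NAND block: pages are programmed once, erased together."""

    def __init__(self, pages_per_block: int = 4):
        self.pages = [None] * pages_per_block  # None means the page is erased
        self.erase_count = 0                   # erases wear the block out

    def program(self, page: int, data: bytes) -> None:
        """Write data to an erased page; programmed pages cannot be overwritten."""
        if self.pages[page] is not None:
            raise ValueError("page already programmed; erase the block first")
        self.pages[page] = data

    def erase(self) -> None:
        """Erase every page in the block at once."""
        self.pages = [None] * len(self.pages)
        self.erase_count += 1


if __name__ == "__main__":
    block = FlashBlock()
    block.program(0, b"v1")
    try:
        block.program(0, b"v2")  # in-place overwrite is rejected
    except ValueError as err:
        print("overwrite failed:", err)
    block.erase()                # wipes all four pages, not just page 0
    block.program(0, b"v2")
```

Note that the erase wipes every page in the block, so any neighboring data that is still needed has to be copied elsewhere first.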

What Is the Flash Translation Layer (FTL)?

This contradiction is resolved by a firmware component known as a “flash translation layer,” or FTL, which implements a virtual disk sector interface and lets the drive write incoming data to any free flash page, no matter which logical block the data was intended for. The FTL keeps track of all this mapping metadata in its own memory and metadata storage.

But, because you’re now writing new versions of data into different flash pages, eventually you accumulate data in those blocks that could be considered “garbage” because the data has either been overwritten or logically deleted.
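
As a rough illustration of the idea, the Python sketch below models an FTL as a dictionary from logical block addresses to physical pages. It is a deliberately minimal, hypothetical design (append-only writes, no erase handling), not how any specific SSD firmware works.

```python
class SimpleFTL:
    """Minimal log-structured mapping: every write lands on the next free page."""

    def __init__(self, total_pages: int):
        self.total_pages = total_pages
        self.l2p = {}        # logical block address -> physical page
        self.flash = {}      # physical page -> stored data
        self.stale = set()   # physical pages whose data has been superseded
        self.next_free = 0

    def write(self, lba: int, data: bytes) -> int:
        """Store data for a logical address and return the physical page used."""
        if self.next_free >= self.total_pages:
            raise RuntimeError("no free pages; garbage collection needed")
        old_page = self.l2p.get(lba)
        if old_page is not None:
            self.stale.add(old_page)  # the old copy becomes garbage
        page = self.next_free
        self.next_free += 1
        self.flash[page] = data
        self.l2p[lba] = page
        return page

    def read(self, lba: int) -> bytes:
        """Return the latest data written to a logical address."""
        return self.flash[self.l2p[lba]]


if __name__ == "__main__":
    ftl = SimpleFTL(total_pages=8)
    ftl.write(5, b"v1")
    ftl.write(5, b"v2")  # same logical block, new physical page
    print(ftl.read(5), "stale pages:", sorted(ftl.stale))
```

The host only ever sees logical address 5, while the drive quietly accumulates stale physical pages that someone has to clean up later.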

How Garbage Collection Works in SSDs

To reclaim this physical capacity, a “garbage collector” process in the drive firmware takes the data that is still valid and moves it to a new location so that it can then erase the entire block containing the “tombstoned” data. For this garbage collector to work, each drive needs extra flash memory, known as “overprovisioned space,” and every garbage collection event consumes some of the flash’s finite number of program/erase cycles. The ratio of physical writes performed on the flash to logical writes issued by the host is known as “write amplification.”
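
Here is a small Python sketch of that process, under simplifying assumptions: a block is a list of (data, is_valid) pairs, clearing the list stands in for the erase, and the function names are our own.

```python
def garbage_collect(victim: list, destination: list) -> int:
    """Copy still-valid pages out of `victim`, then erase it.

    Each page is a (data, is_valid) tuple. Returns the number of pages
    that had to be rewritten to free the block.
    """
    moved = 0
    for data, is_valid in victim:
        if is_valid:
            destination.append((data, True))
            moved += 1
    victim.clear()  # stands in for the block erase
    return moved


def write_amplification(host_pages: int, relocated_pages: int) -> float:
    """Physical pages written divided by logical pages the host asked to write."""
    return (host_pages + relocated_pages) / host_pages


if __name__ == "__main__":
    block = [(b"a", True), (b"b", False), (b"c", True), (b"d", False)]
    spare = []
    moved = garbage_collect(block, spare)
    print("pages relocated:", moved)
    print("write amplification:", write_amplification(4, moved))
```

In this example the host wrote four pages, but freeing the block cost two more page writes, giving a write amplification of 1.5.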

Write amplification leads to premature wear and a shortened life span for the SSD, and overprovisioning adds flash that the host can’t use. There are also performance impacts from this design because whenever a flash die is busy with garbage collection, reads and writes can’t be served from that die. Therefore, performance of the SSD fluctuates unpredictably as the garbage collector becomes more or less active.

What makes this even more challenging is that SSDs have no way to communicate this garbage collection activity to the system that’s accessing them. Rather, each SSD has to maintain the illusion that it’s just like a hard drive. As the number of bits per cell in NAND flash increases, these performance inconsistencies only get worse, as program/erase cycles take longer and longer, leading to longer periods of data inaccessibility.

The Benefits of Using DirectFlash

DirectFlash takes a different approach to flash media management. Rather than deputizing every SSD to perform its own wear leveling, garbage collection, and overprovisioning, the Purity operating system performs these functions in software at the array level. This means each DirectFlash Module is simpler than a traditional solid-state drive, as it only has to provide access to the media itself and handle low-level data and signaling tasks.

The benefits that this provides are numerous:

  • Improved Density and Efficiency. Our DirectFlash Modules (DFMs) deliver a storage density two to three times better and consume from 39% to 54% fewer watts per terabyte than our closest competitors today. Pure Storage DFMs do not emulate mechanical HDDs, allowing silicon-based flash media to be optimally managed in a way that significantly improves performance, storage density, effective capacity, media endurance, and cost per usable TB relative to COTS SSDs. Pure Storage is shipping 48TB DFMs today, is adding 75TB DFMs later this year, will be adding 150TB DFMs within 18 months, and is planning for 300TB DFMs by 2026.
  • Smart Data Placement. Instead of each SSD making decisions about data placement and media management in a vacuum, Purity knows about all ongoing and scheduled system tasks such as current IO activity, data reduction operations, pending garbage collection cycles, and overall array workload and health. This allows Purity to make much smarter placement and scheduling decisions than a single drive could on its own. For example, data with similar expected life spans can be co-located on the same blocks to minimize cases where some pages in a block are “tombstoned” while others are still valid. Purity knows if certain pages are all part of the same file or object or come from the same host system, so it can group those pages into the same blocks. When that file or object is deleted, the entire block can be freed at once, without rewriting other live data and causing write amplification.
  • They Outperform and Outlast. By performing no garbage collection and causing no write amplification, DirectFlash Modules outperform and outlast their commodity counterparts. Fewer writes mean less wear and thus longer drive life spans. Fewer writes also mean more IO cycles are available to service “real” client IO. And because Purity knows about current IO activity and has visibility into the entire system, it’s never surprised by one of these program/erase cycles blocking access to data. In the worst case, Purity can just reconstruct that data from parity rather than waiting for a program/erase cycle to finish. This significantly reduces the worst-case latency of our systems, even when using QLC flash.
  • They Improve Over Time. Because we perform all these media management tasks in software, we can improve this software over time. All Pure Storage systems connected to the internet securely phone home telemetry data, and since we have deep insight into the health and activity of the underlying flash memory, we aggregate and analyze this data to improve how our software works in the real world. This means over time, our systems’ reliability and performance can improve with regular software updates.
  • They’re Simpler and More Reliable. Because we perform all these activities at the array level in software, our DirectFlash Modules don’t need complex controllers and large amounts of RAM to do all this work on their own. Thus, our modules are simpler and therefore more reliable, in addition to being more efficient. We can also scale the size of our drives with advances in NAND flash fabrication technology, without needing to increase drive complexity or cost.
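
To see why grouping data by expected life span matters, consider the toy comparison below. It is only an illustration of the principle described under Smart Data Placement, not Purity’s actual placement logic; the owner labels, block size, and function names are invented for the example.

```python
def pack(page_owners: list, pages_per_block: int) -> list:
    """Split a write-ordered list of page owners into fixed-size blocks."""
    return [
        page_owners[i:i + pages_per_block]
        for i in range(0, len(page_owners), pages_per_block)
    ]


def relocations_after_delete(blocks: list, deleted_owner: str) -> int:
    """Count live pages that must be copied to reclaim every block the delete touches."""
    moved = 0
    for block in blocks:
        if deleted_owner in block:
            moved += sum(1 for owner in block if owner != deleted_owner)
    return moved


if __name__ == "__main__":
    interleaved = pack(["A", "B"] * 4, pages_per_block=4)      # objects mixed together
    grouped = pack(["A"] * 4 + ["B"] * 4, pages_per_block=4)   # one object per block
    print("interleaved:", relocations_after_delete(interleaved, "A"))
    print("grouped:", relocations_after_delete(grouped, "A"))
```

With the two objects interleaved, deleting object A leaves four live pages of B that must be copied before the blocks can be erased. With each object in its own block, the same delete frees a whole block with no copying at all.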

What this means for customers is systems that deliver higher performance, more consistently, with greater reliability and longevity than other all-flash or hybrid systems designed around SSDs.

Pure Storage was founded around the belief that the future of the data center was all flash, and we’ve built our DirectFlash technology around making this vision a reality. We believe the best way to build all-flash systems is to build the system from the ground up for all-flash. That means eliminating the parts of the system designed around legacy interfaces and paradigms and letting the technology truly shine.

Want to take advantage of DirectFlash technology in your data center? Check out our suite of all-flash storage solutions today.
