To understand what makes Toshiba’s BiCS 3 TLC NAND different from typical NAND, a bit of background information is required. The foundation of all Not-AND (aka ‘NAND’) Solid State storage is a tiny electrical gate and an equally tiny electrical ‘battery’ cell which together make up a transistor. To change the stored data bit from a 0 to a 1, or vice versa, the controller tells the gate to either let a little bit of the charge ‘out’ (aka ‘discharge the cell’) or let a little bit in to the cell. This is called a program operation as it is literally setting the charge state of the cell, or ‘programming’ it.
To read the NAND cell the controller simply pushes a tiny trickle of power across the gate and reads that cell’s charge (via resistance). For example, in SLC or Single Level Cell NAND the cell has two states: full and empty. Ironically, since the controller is reading a difference in voltage across the cell, the default ‘empty’ state is usually considered a 1 and not a 0. So an ‘empty’ or discharged state is the same as storing a 1 in that cell, and fully charged is a 0 bit. For old-timers, this is why in AS Cleaner or ‘Tony Trim’ type programs (before TRIM was a sure thing) you had to use the ‘FF’ write option… as ‘FF’ is the same as all ones being written to the cells.
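To make the ‘empty equals 1’ idea concrete, here is a small, purely illustrative Python sketch (the voltage numbers and read threshold are made-up values, not real silicon figures) showing how an erased SLC page reads back as all ones – i.e. a byte of ‘FF’:

```python
# Toy model only: how an erased (discharged) SLC cell reads back as a 1,
# and why a freshly erased byte reads as 0xFF. Voltages/threshold are hypothetical.

ERASED_VOLTAGE = 0.0      # discharged cell
PROGRAMMED_VOLTAGE = 1.0  # charged cell
READ_THRESHOLD = 0.5      # hypothetical reference point between the two states

def read_slc_bit(cell_voltage: float) -> int:
    """A discharged (erased) cell reads as 1; a charged cell reads as 0."""
    return 1 if cell_voltage < READ_THRESHOLD else 0

erased_page = [ERASED_VOLTAGE] * 8                 # a freshly erased 8-cell 'byte'
bits = [read_slc_bit(v) for v in erased_page]
print(bits)                                        # [1, 1, 1, 1, 1, 1, 1, 1]
print(hex(int("".join(map(str, bits)), 2)))        # 0xff
```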
Since a NAND transistor is capable of holding a charge even when the system is powered off, NAND storage is considered non-volatile (unlike RAM, where the data is lost the moment the power is turned off). However, just like ‘bit rot’ on a Hard Disk Drive this state is not permanent. Just like the battery in your turned-off phone, if left long enough the power in that transistor will leak out… and turn a 0 into a 1. This is a bad thing from a long term storage point of view.
With SLC NAND, a cell that is left alone will not lose all that much voltage and is fairly stable for months. However, as each NAND cell can only store 1 bit, it is rather expensive to create, say, a 1TB Solid State Drive, and scaling up beyond 1TB requires a lot of NAND chips. Even when layering NAND cells on top of NAND cells, the resulting footprint was rather large. This combination is why Multi-Level Cell NAND was created, which doubled a NAND cell’s storage capacity from 1 bit to 2. It did this by adding more voltage resistance points between ‘full’ and ‘empty’. So instead of two states there were now four, evenly spaced between fully discharged and fully charged. This certainly increased storage density, and decreased the cost of manufacture to hit a given storage capacity, but at the same time made the self-discharging of NAND a greater issue – as a lot less of a voltage change can now impact two bits of data. This is why MLC is not as stable as SLC and has to be refreshed / rewritten more often (aka ‘internal housecleaning’).
As time went by, MLC was in turn supplanted by TLC, or Triple-Level Cell NAND, which can store three bits of data per cell. In TLC NAND there are not two, not four, but eight voltage states (and for comparison’s sake, hitting 4 bits with QLC NAND requires sixteen voltage points). This in turn increases the necessary precision in properly writing and reading a given state of a NAND cell. It also significantly decreases how long a TLC NAND cell will hold that rather precise voltage state. This increased precision plays a large role in why TLC NAND is ‘slower’ at write operations than MLC (which is in turn slower than SLC).
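To put some rough numbers on that shrinking margin for error, here is a minimal sketch (the 0.0 to 1.0 voltage window is an arbitrary illustration, not a real spec) of how each extra bit per cell doubles the number of states the controller must distinguish:

```python
# Minimal sketch: each extra bit per cell doubles the number of voltage states
# (SLC=2, MLC=4, TLC=8, QLC=16), which shrinks the margin between adjacent states.
# The 0.0-1.0 window is purely illustrative.

def voltage_states(bits_per_cell: int) -> list:
    """Evenly spaced nominal levels needed to encode 2**bits_per_cell states."""
    n_states = 2 ** bits_per_cell
    return [i / (n_states - 1) for i in range(n_states)]

for name, bits in [("SLC", 1), ("MLC", 2), ("TLC", 3), ("QLC", 4)]:
    levels = voltage_states(bits)
    print(f"{name}: {len(levels)} states, gap between states = {levels[1]:.3f}")
# A charge leak that is harmless in SLC (gap 1.000) can push a TLC or QLC cell
# (gap 0.143 / 0.067) across a neighbouring threshold and silently flip bits.
```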
We mention all this because back in 2007 Toshiba understood this underlying issue and took steps to ensure that their NAND would hold a charge longer, would not become damaged as quickly, and could even be written to faster than typical NAND designs. In typical NAND designs the gate is a ‘floating’ gate (FG for short) and is the conductor of electricity. So, in order for the drive’s controller to change a NAND cell’s state it pushes electricity through the gate itself. Over time this current running through the floating gate causes minuscule fractures in the crystal lattice that makes up the foundation of the gate. On its own a single fracture will not cause voltage-related issues, but over time the cumulative effect of these tiny damaged portions of the gate adds up to an actual problem. When this happens, the end result is a gate which is incapable of keeping voltage levels inside the transistor stable. This is what a ‘dead’ NAND cell really is – a NAND cell that can no longer hold a precise charge.
Worse still, with every reduction in fabrication size not only does the cell itself become smaller, the gate becomes smaller along with it. A smaller gate means fewer defects are required before the gate fails. Also, the more precise the voltage requirements, the faster accumulated defects become an issue. By creating ‘3D’ NAND designs, NAND manufacturers have been able to create larger cells with larger gates, but the underlying issue of using a classic FG design remains.
In layman’s terms, a floating gate literally ‘eats itself’ every time it has to change the state of its transistor. The smaller the gate, or the more precision required, the faster it finishes its meal… and the faster the cells die. NAND manufacturers are well aware of this issue and include spare NAND cells to replace ‘dead’ cells. They are also fully aware that this is only a band-aid solution, as eventually there will be no more replacement cells left.
Controller and storage companies are also aware of this issue and not only include even more ‘replacement cells’ (aka ‘over-provisioning’, and why your 256GB drive is seen as a 240GB drive) but also use slower / gentler write algorithms to reduce this damage. In simplistic terms this means they write to the NAND using less voltage but over a longer period of time. This does reduce cell damage, but at the cost of write speed. To overcome this side effect, the controller will pretend a given chunk of TLC NAND is SLC NAND and only store two charge states instead of eight (or 16 in the case of QLC) per cell. This allows those cells to continue being used long after the amount of damage to the gate would result in random bit changes if eight (or more) states were needed. This pseudo-SLC is then used as a buffer for real-time writes and is called different things by different manufacturers. For example, Crucial calls it “Dynamic Write Acceleration”, while Seagate calls it ‘DuraWrite’.
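As a rough illustration of the pseudo-SLC buffering idea (the class name, block counts, and folding policy below are invented for the example – real controller firmware is far more sophisticated), the behaviour can be sketched like this:

```python
# Hedged sketch of pseudo-SLC write buffering: incoming writes land in blocks
# temporarily treated as SLC (fast, gentle programs), and get 'folded' into
# normal TLC blocks during idle time. All names/sizes here are hypothetical.

class PseudoSlcCache:
    def __init__(self, cache_blocks: int):
        self.cache_blocks = cache_blocks
        self.cache = []          # blocks currently written in SLC mode
        self.tlc_storage = []    # data already folded into 3-bit-per-cell blocks

    def write(self, data_block) -> str:
        if len(self.cache) < self.cache_blocks:
            self.cache.append(data_block)     # fast path: SLC-mode program
            return "fast (pseudo-SLC)"
        self.tlc_storage.append(data_block)   # buffer full: slow direct TLC program
        return "slow (direct TLC)"

    def fold_during_idle(self):
        """Background housekeeping: rewrite cached data as TLC, free the buffer."""
        self.tlc_storage.extend(self.cache)
        self.cache.clear()

cache = PseudoSlcCache(cache_blocks=2)
print([cache.write(b) for b in ("A", "B", "C")])   # third write misses the buffer
cache.fold_during_idle()                           # idle-time housecleaning
print(cache.write("D"))                            # fast again
```

This is also why a drive can feel fast in short bursts yet slow down once that buffer fills and has to be emptied.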
So with all that background information to digest out of the way, let’s start by saying that Toshiba BiCS NAND is not based upon a floating gate design. Instead it is a Charge Trap gate design (aka CT or CTF). In CTF-based NAND, the gate is not a conductor and is instead an insulator. So instead of being responsible for getting power into the cell, it is responsible for keeping it there. This means during write or erase operations less stress is being placed on the gate itself. The end result is that those deadly fractures are less likely to occur, and when they do occasionally happen they are usually electron-level in size. Since they are smaller, they can only drain off the electrons that are ‘touching’ the fracture, leaving the rest of the electrons in the cell to hold the voltage steadily enough that the controller can still accurately tell what the cell’s state is.
Yes, eventually fractures in the gate’s lattice will multiply to the point that they too cause the cell to be marked as ‘dead’, but their reduced likelihood and severity mean a CTF-based NAND transistor will last noticeably longer than a floating gate design. Equally important, this not only increases the longevity of the NAND cell but also allows for faster – and yet, ironically enough, still gentler – programming cycles to be implemented on the NAND, which in turn increases overall write performance. Charge Trap gates also gain longevity and performance from the switch from planar (aka 2D) NAND designs to 3D designs. Arguably CTF NAND gains even more durability from this switch, as it does not ‘eat itself’ nearly as fast as FG NAND – so the increased mass lasts even longer. When paired with pseudo-SLC algorithms the end result is faster performance, increased durability, and generally a better end-user experience.
Due to the unique gate design of the cells in BiCS, the layout of the cells in the 3D NAND block is also different. Typical floating gate based 3D NAND is laid out much like a large apartment building. To imagine this, each apartment in the building is a cell, and the hallways and elevators are the control pathways along which data is transferred from the NAND cells to the controller and back again (i.e. the interconnects the controller uses). As the interconnects are optimized for a small footprint (so more ‘apartments’ can be fit in the ‘building’), when the controller needs to erase data to make cells ready for future writes it has to move the data in smaller chunks lest the interconnect bus become saturated. This is basically why floating gate NAND can usually only erase one page’s worth of cell blocks at a time. To imagine why this is the case, think of the typical mega-apartment building. Now imagine every person on a given floor, with all their furniture/clothes/etc., suddenly trying to cram into the hallway all at once. Now imagine instead only a few apartments being emptied and moved at a time. It becomes obvious why FG based designs purposely slow things down during their erase cycle.
With Toshiba BiCS, the NAND is not laid out in a rectangular/square block like a typical apartment building with vertical or horizontal interconnect pathways. Instead each vertical pathway is U-shaped, with multiple vertical lines of NAND interconnected at the ‘base of the building’. This different layout means each ‘floor’ has a wider ‘hallway’ and ‘elevator’ to use during a move. This wider interconnect is why BiCS 3 can erase (or, in our analogy, move the families out of the apartments) not one but up to three pages’ worth of cells at one time. So, while both types of NAND are still erased at the block level… it will take a lot less time for the erase cycle to complete with BiCS 3 NAND. This means fewer cycles wasted on housecleaning, more cycles for real-time I/O requests, and less chance of the end user noticing ‘slow-downs’ when the pseudo-SLC NAND blocks are full and need to be erased before this ‘buffer’ can once again boost performance.
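To see why processing three pages per internal step instead of one matters, here is a back-of-the-envelope sketch (the 256-page block size and the one-step-at-a-time model are assumptions made purely for illustration):

```python
# Toy comparison of erase-cycle housekeeping: a design limited to one page per
# internal step versus one that can handle three pages per step (as described
# for BiCS 3 above). The 256-page block size is a hypothetical figure.

def erase_steps(pages_in_block: int, pages_per_step: int) -> int:
    """Number of internal steps needed to work through a block's pages."""
    return -(-pages_in_block // pages_per_step)   # ceiling division

PAGES_IN_BLOCK = 256
print("1 page per step :", erase_steps(PAGES_IN_BLOCK, 1), "steps")   # 256
print("3 pages per step:", erase_steps(PAGES_IN_BLOCK, 3), "steps")   # 86
# Fewer housekeeping steps per block means more controller time left over
# for servicing real-time host I/O requests.
```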
On top of these two significant differences, there is one more major difference between Charge Trap and Floating Gate NAND designs: the likelihood of capacitive coupling occurring. Put simply, since CTF cells are less likely to become damaged, and since the damage is much smaller, capacitive coupling is also a lot less likely to happen. Capacitive coupling occurs when defects in the gate do not drain the cell’s voltage enough for the controller to notice, but enough that free electrons can bleed over to adjacent cells. When enough of this bleed-over occurs, the extra electricity can jump into a neighbouring cell and the bits stored in the failing cell not only randomly change, the data in the cells around it also randomly changes.
Even more concerning, if this bleed-over happens over a long enough period the surrounding cells not only have their voltage state silently changed but become linked – or ‘fused’ – together, via a pathway that is not monitored by the controller. When this happens, the controller thinks it is writing to a single cell but may actually be writing to multiple NAND cells at the same time. Needless to say, when these linked cells are read the low-level Error Correction Code immediately kicks in and reduces read performance as it tries to recover the corrupted data.
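For readers unfamiliar with what that low-level ECC recovery actually does, the basic idea can be shown with a deliberately simple example (real drives use far stronger codes such as BCH or LDPC, not the 3x repetition used here):

```python
# Illustrative only: redundancy lets the controller recover data when a cell
# silently flips, at the cost of extra work on every read. Real NAND ECC is
# much stronger (BCH/LDPC); 3x repetition is just the simplest possible code.

def encode(bits):
    """Store each data bit three times."""
    return [b for bit in bits for b in (bit, bit, bit)]

def decode(raw):
    """Majority-vote each group of three; a single flipped copy is corrected."""
    return [1 if sum(raw[i:i + 3]) >= 2 else 0 for i in range(0, len(raw), 3)]

data = [1, 0, 1, 1]
stored = encode(data)
stored[4] ^= 1                    # simulate one cell's bit silently flipping
print(decode(stored) == data)     # True -- the flip was corrected on read
```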
When all this is explained to people, the most common question asked is ‘why is Charge Trap not used for all NAND?!’. The answer is simple: cost of manufacture and manufacturing inertia. Floating gate storage is a very mature process and most manufacturers have literally billions of dollars and untold man-hours invested in this technology. They know it, they know how to produce a lot of FG NAND quickly, and they do not want to change as they believe it is ‘good enough’… which it is. Creating CTF NAND, on the other hand, would require not only expensive retooling, but also expensive fixes for the fabrication-process bugs that always crop up, and literally re-writing decades’ worth of institutional knowledge. This is also why it has taken Toshiba three generations’ worth of trial and error, and over a decade, before BiCS was ready for prime-time… and why it is called ‘BiCS 3’ and not just plain old ‘BiCS’.
Taken as a whole, BiCS 3 is different, and it is great to see more and more companies opting for this alternative take on NAND.