[Arm-netbook] #2- "MLC NAnd" corrupts-on-read how oft? Would your's have "ECCs", and use them to boot correctly?

Wookey wookey at wookware.org
Sat Sep 10 02:09:03 BST 2016


On 2016-09-07 20:18 -0600, chadvellacott at sasktel.net wrote:

>      I have been reading on "MLC NAnd", and it seems that now I better
> understand the problem of corruption-on-read.

I have some experience of NAND due to working on YAFFS a few years
ago. My info may be slightly out of date as NAND has got even fatter
in the last few years.

>    (a) How common is it for corruption-on-read to occur with "MLC NAnd"
> (like by something as basic as reads done by the built-in "ROM"
> "boot"-loader)?
>    (Perhaps the answer is like "N % probability that one or more of the [1
> to 4] bits in a cell, shall wrongly change its logical value, after X reads
> of one or more pages in the same block, Y writes to one or more pages in the
> same block, and Z erases of the block".)
>    At first I was thinking that the FIRST time data (mini "boot"-loader or
> otherwise) is read from the "NAnd", corruption might likely occur.
>    But perhaps this corruption-on-read naturally happens ONLY after many
> reads or writes in a block and many erases of a block.
>    So how common is it?

'Rare'. Corruption-on-read (called 'read disturb' in the literature)
will only happen after there have been quite a few reads aligned on
just the 'wrong' page. 20,000 or so might do it in modern
MLC. Write-disturb is much more likely than read-disturb: each read
that energises the same 'row' in the flash layout increases the error
potential a tiny amount, and each write increases it quite a lot more
(but in modern flash with a flash-aware filesystem you only ever write
a page once before erasing it, so this is not an issue).

Bits are 'refreshed' (and the probabilities of error reset) when the
bits in question are rewritten. So a really smart flash filesystem
will ensure that 'old' data sitting near pages that have been read a
lot gets moved.
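
As a rough illustration of that read-count-and-refresh idea, here is a
minimal Python sketch. The class and method names are hypothetical, the
threshold is just the ballpark 20,000 figure mentioned above, and this
is not a description of how YAFFS or any particular filesystem does it.

```python
# Assumption: ballpark read-disturb limit taken from the figure quoted above.
READ_DISTURB_LIMIT = 20_000


class BlockReadTracker:
    """Counts page reads per erase block since that block was last rewritten."""

    def __init__(self, limit=READ_DISTURB_LIMIT):
        self.limit = limit
        self.reads = {}  # block number -> reads since last refresh

    def note_read(self, block):
        """Record one page read in `block`.

        Returns True once the block has been read often enough that its
        data should be copied elsewhere and the block erased.
        """
        self.reads[block] = self.reads.get(block, 0) + 1
        return self.reads[block] >= self.limit

    def note_refresh(self, block):
        """Call after the block's data has been moved and the block erased."""
        self.reads[block] = 0
```

A filesystem using something like this would call note_read() on every
page read and, when it returns True, schedule the block for copying and
erasure, then call note_refresh().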

>    (b) Would the "MLC NAnd" planned in the computer-cards via "Crowd
> Supply", have Error-Correction Codes?

Yes. All NAND flash has this; otherwise it would be uselessly unreliable.

>    (c) If so, then does whatever reads the "NAnd" on "booting" (I guess it
> is called the "eGON boot-ROM"), know that it should (and know how to) use
> those "ECCs", to correct errors (if any) which it encounters when trying to
> start the "booting" process, so that it loads the correct original bits of
> the "boot"-loader ("minimalist" or otherwise)?

Yes. All NAND reads check the ECC.
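
To show what 'checking the ECC' on a read amounts to, here is a toy
Python sketch using a Hamming(7,4) code, which can correct one flipped
bit in a 7-bit codeword. This is an illustration only: real NAND
controllers use much stronger codes computed over whole pages and
stored in the spare area, and the function names here are hypothetical.

```python
def hamming74_encode(data):
    """Encode 4 data bits into a 7-bit codeword (bit positions 1..7)."""
    d1, d2, d3, d4 = data
    p1 = d1 ^ d2 ^ d4  # parity over positions 1, 3, 5, 7
    p2 = d1 ^ d3 ^ d4  # parity over positions 2, 3, 6, 7
    p4 = d2 ^ d3 ^ d4  # parity over positions 4, 5, 6, 7
    return [p1, p2, d1, p4, d2, d3, d4]


def hamming74_decode(codeword):
    """Return (data_bits, syndrome), correcting a single-bit error if present.

    A syndrome of 0 means no error was detected; otherwise it is the
    1-based position of the bit that was flipped back.
    """
    c = list(codeword)
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]
    s4 = c[3] ^ c[4] ^ c[5] ^ c[6]
    syndrome = s1 + 2 * s2 + 4 * s4
    if syndrome:
        c[syndrome - 1] ^= 1  # flip the faulty bit back
    return [c[2], c[4], c[5], c[6]], syndrome
```

Flipping any one bit of an encoded word and passing it to
hamming74_decode() gives back the original four data bits, which is the
same principle that lets a boot ROM load the right loader even when a
bit in the page has been disturbed.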

The YAFFS site has a load of info on the issue of NAND
(un)reliability, and what it does to manage/mitigate it: 
http://yaffs.net/documents/yaffs-nand-flash-failure-mitigation
Specifically: 
http://yaffs.net/documents/yaffs-nand-flash-failure-mitigation#Read_disturb

HTH

Wookey
-- 
Principal hats:  Linaro, Debian, Wookware, ARM
http://wookware.org/