Hi All,

I've seen the following strange behavior on a number of BeagleBone Black 
devices:  The eMMC reads back different data each time I read it (and it is 
not being written to by anything else).

The devices had been operating just fine 24/7 for months.  Then, suddenly, 
after a power cycle, they would not boot any more.

1. The blue "power" LED indicated that the BBB was getting power.
2. No other LEDs illuminated.
3. No output whatsoever on serial console after power was applied.

It was clear that U-Boot was not able to run.  I was starting to think the 
BBB was dead.

On a whim, I inserted a "flasher" SD card and held down the boot switch 
(S2) while applying power.  The board booted just fine off of the SD card!

I was suspicious of the contents of the eMMC, so I broke into the U-Boot 
prompt, set cmdline=init=/bin/bash and booted into Linux.

Since I'd set the "init=" command-line argument, my bash process was the 
only userspace process running, and I verified that /dev/mmcblk1 (the eMMC 
device) was not mounted.

I then did a raw read of the entire eMMC 4 times and compared the results 
byte-by-byte with a python script.  Each of the 4 reads produced a 
different result, with differences between each of the 4 files distributed 
mostly uniformly across the eMMC!  There were no error codes, kernel debug 
messages, or complaints of any kind that I was able to observe during the 
reading of the eMMC.
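
A byte-by-byte comparison of this kind can be sketched in Python roughly as follows. This is an illustrative sketch, not the original script: the function name `diff_histogram`, the chunk size, and the 64 MiB bucket size are assumptions chosen only to show how differences can be counted and grouped by offset to see how they are distributed across the device.

```python
from collections import Counter

CHUNK = 1 << 20  # compare 1 MiB at a time to keep memory use low


def diff_histogram(path_a, path_b, bucket_size=64 * 1024 * 1024):
    """Count differing bytes between two dump files, grouped by offset bucket.

    Returns (total_differing_bytes, Counter mapping bucket index -> count).
    Raises ValueError if the two files differ in length.
    """
    total = 0
    buckets = Counter()
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(CHUNK)
            b = fb.read(CHUNK)
            if len(a) != len(b):
                raise ValueError("dump files differ in length")
            if not a:
                break
            if a != b:  # only walk the bytes of chunks that actually differ
                for i, (x, y) in enumerate(zip(a, b)):
                    if x != y:
                        total += 1
                        buckets[(offset + i) // bucket_size] += 1
            offset += len(a)
    return total, buckets
```

Calling `diff_histogram("read1.img", "read2.img")` (hypothetical file names) on two raw dumps returns the number of differing bytes and a per-bucket count; roughly equal counts in every bucket would correspond to differences spread uniformly across the eMMC.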

I have since seen the same behavior on at least 3 other units.  Rather than 
read back the entire 4GB eMMC device and compare, I've been able to test 
for the same result by doing something like the following:

root@(none):/# dd if=/dev/mmcblk1p3 bs=8M count=3 | md5sum
49438591914268785da79c8569b3b571  -
3+0 records in
3+0 records out
25165824 bytes (25 MB) copied, 11.4388 s, 2.2 MB/s

root@(none):/# dd if=/dev/mmcblk1p3 bs=8M count=3 | md5sum
21ec205a55605c44c5097d2c07b73029  -
3+0 records in
3+0 records out
25165824 bytes (25 MB) copied, 11.3298 s, 2.2 MB/s

root@(none):/# dd if=/dev/mmcblk1p3 bs=8M count=3 | md5sum
4f9cbb7698b9aecc88ff8da69aa21178  -
3+0 records in
3+0 records out
25165824 bytes (25 MB) copied, 11.3444 s, 2.2 MB/s

Note how the md5sum changes each time!  I know that I had previously 
written valid data to /dev/mmcblk1p3 (and even if I hadn't, it shouldn't 
read back as different data each time).
Again, I am pretty sure that nothing else is writing to the eMMC in the 
cases above (I have booted from the mmcblk0 SD card, my bash shell is the 
only userspace process, and the mmcblk1 eMMC device is not mounted).
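
The same repeated-read check can be scripted so that it is easy to run on several units. The following is a minimal sketch under stated assumptions: the helper names `read_digests` and `reads_are_stable` are hypothetical, and unlike `dd iflag=direct` this plain read does not bypass the kernel page cache, so it is only a rough equivalent of the `dd | md5sum` commands above.

```python
import hashlib


def read_digests(path, length, repeats=3, block_size=8 * 1024 * 1024):
    """Read the first `length` bytes of `path` `repeats` times.

    Returns a list with one MD5 hex digest per read pass.
    """
    digests = []
    for _ in range(repeats):
        md5 = hashlib.md5()
        remaining = length
        # buffering=0 avoids Python-level buffering (not the kernel cache)
        with open(path, "rb", buffering=0) as f:
            while remaining > 0:
                chunk = f.read(min(block_size, remaining))
                if not chunk:  # device or file is shorter than requested
                    break
                md5.update(chunk)
                remaining -= len(chunk)
        digests.append(md5.hexdigest())
    return digests


def reads_are_stable(digests):
    """True if every read pass produced the same digest."""
    return len(set(digests)) == 1
```

For the case above, `read_digests("/dev/mmcblk1p3", 3 * 8 * 1024 * 1024)` would hash the same 25165824 bytes that `bs=8M count=3` reads, and `reads_are_stable` returning False would match the changing md5sums.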

If I write zeros to the same location, then the readbacks become reliable 
(and also take much less time):

root@(none):/# dd if=/dev/zero bs=8M count=3 of=/dev/mmcblk1p3
3+0 records in
3+0 records out
25165824 bytes (25 MB) copied, 15.0492 s, 1.7 MB/s

root@(none):/# dd if=/dev/mmcblk1p3 bs=8M count=3 iflag=direct | md5sum
3+0 records in
3+0 records out
25165824 bytes (25 MB) copied, 0.983744 s, 25.6 MB/s
77377273b0a4b61febdbf7bbf52b9db9  -

root@(none):/# dd if=/dev/mmcblk1p3 bs=8M count=3 iflag=direct | md5sum
3+0 records in
3+0 records out
25165824 bytes (25 MB) copied, 0.986371 s, 25.5 MB/s
77377273b0a4b61febdbf7bbf52b9db9  -

I also power-cycled the board after this and verified that I still get the 
same result (77377273b0a4b61febdbf7bbf52b9db9, reads back in about 1 second 
instead of 11 seconds) after the power-cycle.
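
One way to confirm that a stable digest really corresponds to all-zero data, rather than to some other repeatable pattern, is to compute the MD5 of the same number of zero bytes on another machine and compare it with the readback. A minimal sketch, where the function name `zero_digest` is an assumption made for illustration:

```python
import hashlib


def zero_digest(length, block_size=1024 * 1024):
    """Return the MD5 hex digest of `length` zero bytes, hashed in blocks."""
    md5 = hashlib.md5()
    zeros = bytes(block_size)  # a reusable block of zero bytes
    remaining = length
    while remaining > 0:
        n = min(block_size, remaining)
        md5.update(zeros[:n])
        remaining -= n
    return md5.hexdigest()
```

Here `zero_digest(25165824)` gives the reference value for the 3 x 8 MiB region written above; if it matches the digest printed by `md5sum`, the eMMC is returning exactly the zeros that were written.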


The first device I discovered this issue on had been exposed to high 
temperatures, so I initially suspected heat to be the cause.  However, 
I've since seen this happen to three other BBBs which never left my 
air-conditioned lab.

I've seen the same behavior on at least 4 BBBs now.  So far, all of the 
devices on which I've seen this issue appear to have been populated with 
the Kingston KE4CN2H5A eMMC, judging by the eMMC size (3825205248 bytes).

Does anyone have any idea what might be causing this?  Could it just be 
the eMMC device wearing out?  It seems that this eMMC device (eMMC 4.5 
spec) does not provide any access to health monitoring statistics.

Any help is greatly appreciated.

-Jeremy Trimble
