On 01/20/17 18:04, Sam Smith wrote:
I'll try to keep this short. I bought a used Lenovo T520 back in May. It
had the motherboard with nvidia GPU. Because it sucks power and doesn't
really have good suspend/resume support, I bought a used motherboard
off of ebay that only had intel integrated graphics and I swapped it
out. All was well and I had installed in it 8gb + 4gb of ram.
It ran like that for about 8 weeks before I bought an 8gb stick and
stuck that in, so then I had 8gb + 8gb of ram. After about 6 weeks of
running like that, it randomly rebooted overnight. I shrugged it off and
thought maybe the power went out or something (even though it had a
battery in it). But then about 2 weeks later it did it again... and then
two weeks after that. So I pulled out the new 8gb stick I had put in and
let it run with just one 8gb stick. It ran like that for about 10 weeks
without a problem. I put the old 4gb stick in just for fun, bringing it
back to the original 8gb + 4gb configuration. But about 2 weeks later it
rebooted again. At that point I bought a matched 16gb kit (8gb + 8gb)
from Newegg that seemed to come recommended from google searching for
compatible ram for this model. But just a couple of days ago (about 3
weeks after installing it), it rebooted by itself.
I am kind of at a loss here now. I can buy another motherboard and swap
it out again, but that takes a few hours and I don't feel like doing it.
The cooling and thermal stuff is all good on the laptop; I've run prime95
and video encoding for hours and it is fine (temps stay below 80* at
least, normal usage is 40-55*). I've also run memtest for a few hours.
What I find weird is that the machine suddenly reboots. At least a few
years ago, ram issues would just lead to a kernel panic screen. But with
this, the machine is just like someone pulled the plug and rebooted it.
I started to wonder if there is some built in watchdog somewhere that
will reboot the machine if it hangs, but I can't tell? Other than that,
if this is the kernel that is rebooting the machine, is there any way I
can get it to dump some info somewhere before it fully reboots? Before I
go through the pain of swapping the board again, I'd just like to really
know that this is a hardware issue and not the kernel detecting
something and just choosing to reboot...
Memory errors are more common than we'd like to believe:
http://www.zdnet.com/article/dram-error-rates-nightmare-on-dimm-street/?_escaped_fragment_=#!
A two-and-a-half year study of DRAM on 10s of thousands Google
servers found DIMM error rates are hundreds to thousands of times
higher than thought -- a mean of 3,751 correctable errors per DIMM
per year.
Non-ECC DRAM is more common: Most DIMMs don’t include ECC because it
costs more. Without ECC the system doesn’t know a memory error has
occurred.
Bad news: Besides error rates much higher than expected - which is
plenty bad - the study found that error rates were motherboard, not
DIMM type or vendor, dependent. This means that some popular mobos
have poor EMI hygiene. Route a memory trace too close to noisy
component or shirk on grounding layers and instant error problems.
Other interesting findings: For all platforms they found that 20% of
the machines with errors make up more than 90% of all observed
errors on that platform. There be lemons out there!
Without ECC memory, there's no way to know if you really have a memory
problem.
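To put the article's mean in everyday terms, here is a rough arithmetic sketch. Only the 3,751 figure comes from the article; the function names are mine. The study covered servers, and the mean is dominated by a minority of bad machines, so treat this as a sense of scale and not as a prediction for your laptop.

```python
# Figure quoted from the article above; everything else is illustrative.
MEAN_CORRECTABLE_ERRORS_PER_DIMM_YEAR = 3751
DAYS_PER_YEAR = 365


def errors_per_day(per_dimm_year, dimms=1):
    """Average correctable errors per day for the given number of DIMMs."""
    return per_dimm_year * dimms / DAYS_PER_YEAR


def errors_in_interval(per_dimm_year, dimms, days):
    """Average correctable errors expected over a span of days."""
    return errors_per_day(per_dimm_year, dimms) * days


if __name__ == "__main__":
    rate = MEAN_CORRECTABLE_ERRORS_PER_DIMM_YEAR
    print(f"one DIMM:  {errors_per_day(rate):.1f} per day")
    print(f"two DIMMs: {errors_per_day(rate, dimms=2):.1f} per day")
```

At that mean, one DIMM averages about 10 correctable errors per day. On a machine without ECC, none of them would be detected or corrected.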
Looking at the data sheet for your computer:
http://www.lenovo.com/shop/americas/content/pdf/system_data/t520_tech_specs.pdf
It covers three variants:
1. ThinkPad® T520 4243 (Onsite)
2. ThinkPad® T520 4243 (Optimus) - Onsite
3. ThinkPad® T520i 4239 (TopSeller)
All three say:
Memory
8GB max [footnote 7] / PC3-10600 1333MHz DDR3, non-parity,
dual-channel capable, two 204-pin SO-DIMM sockets
See footnotes for more detailed information
I can't find the footnotes.
It appears that you have exceeded the manufacturer's specifications.
Why do you believe installing two 8 GB memory modules will work in this
computer?
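If you want to check what the kernel actually sees against the data sheet, here is a minimal sketch. It assumes a Linux system, where /proc/meminfo has a "MemTotal:" line in kB. The helper names and the 8 GiB constant (taken from the "8GB max" line above) are mine. MemTotal is always somewhat less than the installed amount, so 8 GB installed reads below the limit.

```python
import re

SPEC_MAX_GIB = 8  # assumption: the "8GB max" from the data sheet quoted above


def mem_total_gib(meminfo_text):
    """Return the MemTotal value from /proc/meminfo-style text, in GiB."""
    match = re.search(r"^MemTotal:\s+(\d+)\s+kB", meminfo_text, re.MULTILINE)
    if match is None:
        raise ValueError("no MemTotal line found")
    return int(match.group(1)) / (1024 * 1024)  # kB (KiB) -> GiB


def exceeds_spec(meminfo_text, spec_max_gib=SPEC_MAX_GIB):
    """True if the reported memory is above the data-sheet maximum."""
    return mem_total_gib(meminfo_text) > spec_max_gib


def read_meminfo(path="/proc/meminfo"):
    """Read the kernel's memory summary (Linux only)."""
    with open(path) as f:
        return f.read()
```

On the laptop itself, `exceeds_spec(read_meminfo())` would tell you whether the running configuration is past the documented limit.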
What operating system are you running?
Have you installed any software other than official Debian binary packages?
Some ideas:
1. Put identifying marks or sequential numbers on items that otherwise
look the same -- memory modules, SATA cables, adapter cards, etc.
2. Keep detailed notes in a plain text file using a method that allows
access from multiple computers. (I use CVS over SSH, with the
repository on my file server.)
3. If everything is per the specifications, and module X in slot A and
module Y in slot B results in problems, swapping the modules sometimes
solves the problem. This has worked for me more than once.
4. I once built a computer with what appeared to be an infrequent
memory problem. memtest86 ran for over a day before finding one error.
5. I once upgraded a computer from 2 @ 256 MB memory modules to 2 @ 1
GB, and encountered memory problems. I had another machine with the
same motherboard, 2 @ 512 MB memory modules, and no problems. I swapped
memory between the two machines and both computers worked fine.
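For idea 2, the note-taking can be as simple as appending timestamped lines to a text file that you then commit to whatever version control you use. A minimal sketch follows; the function names, file name, and line format are just my illustration, not any particular tool.

```python
from datetime import datetime


def format_entry(when, text):
    """Return one log line: timestamp, two spaces, then the note."""
    return f"{when:%Y-%m-%d %H:%M}  {text.strip()}"


def append_note(path, text, when=None):
    """Append a timestamped note to a plain-text file and return the line."""
    line = format_entry(when or datetime.now(), text)
    with open(path, "a", encoding="utf-8") as notes:
        notes.write(line + "\n")
    return line
```

For example, `append_note("t520-notes.txt", "module #2 moved to slot B")` after each hardware change, plus one line per unexpected reboot, makes it much easier to match failures to configurations weeks later.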
David