Hello,

Faulty RAM somewhere along the path could also be the cause. Check the logs for ECC errors (both system memory and the RAID controller) and run memtest on the boxes involved for a couple of days. I would also suggest stress testing the new server, if that has not been done already.
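As a rough starting point for the system-memory side of that check, the sketch below reads the corrected/uncorrected error counters that the Linux EDAC subsystem exposes in sysfs. It assumes a Linux host with an EDAC driver loaded for the memory controller; the function name `read_edac_counts` is made up for illustration, and it says nothing about ECC on the RAID controller, which has to be checked with the vendor's own tools.

```python
from pathlib import Path


def read_edac_counts(edac_root="/sys/devices/system/edac/mc"):
    """Return {controller: {"ce_count": int|None, "ue_count": int|None}}.

    ce_count = corrected errors, ue_count = uncorrected errors.
    An empty dict means no EDAC memory controllers were found
    (assumption: EDAC driver not loaded, or not a Linux host).
    """
    counts = {}
    root = Path(edac_root)
    if not root.is_dir():
        return counts
    for mc in sorted(root.glob("mc[0-9]*")):
        entry = {}
        for name in ("ce_count", "ue_count"):
            try:
                entry[name] = int((mc / name).read_text().strip())
            except (OSError, ValueError):
                # Counter missing or unreadable on this platform.
                entry[name] = None
        counts[mc.name] = entry
    return counts


if __name__ == "__main__":
    found = read_edac_counts()
    if not found:
        print("No EDAC memory controllers found")
    for controller, entry in found.items():
        print(controller, entry)
```

Non-zero counters that keep growing under load would point towards memory; zero counters do not rule it out, so the memtest run is still worthwhile.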
Best regards,
Dimitris

On Sun, Sep 21, 2014 at 3:22 PM, Jörg Saßmannshausen <j.sassmannshau...@ucl.ac.uk> wrote:

> Dear all,
>
> I have a rather strange problem with one of my file servers, which I
> recently upgraded in order to accommodate more disc space.
>
> The problem: I copied the files from the old file space to a temporary
> disc storage space using this rsync command:
>
> rsync -vrltH -pgo --stats -D --numeric-ids -x oldserver:foo tempspace:baa
>
> I have been doing this for some years now and never had any problems.
>
> As always, I ran md5sum afterwards to be sure there is no problem later
> and the user is not losing data. This time, for a rather large file
> (around 16 GB), the md5sum failed after I moved the files from the temp
> space back to the new destination using the same command as above.
>
> Still having access to the old file space, I decided to move this file
> over from the old file space. Strangely enough, rsync did not sync the
> file again, so I had to delete the file. Even after deleting the file
> and re-syncing it from the old source, the md5sum is wrong.
>
> Copying the file to a different file space did not cause this problem,
> i.e. the md5sum is correct.
> As it is a tar.gz file, I simply decided to decompress the original file
> on the different file server. That worked. The file with the wrong
> md5sum did not decompress on the different file server; gunzip crashed
> with an error message. So the file is broken.
>
> The setup:
>
> Originally I was using an old Infortrend box which had old PATA discs in
> it. This box is connected via SCSI to a frontend server which exports
> the file space via iSCSI. The backend for that, i.e. the one the user is
> accessing, is on a different physical machine and is a XEN guest. The
> reason for that setup is that the frontend acts as a backup server and I
> don't want people to have access to it.
>
> I then exchanged the Infortrend box with a more recent model which has
> SATA capabilities but still has a SCSI connection to the frontend. The
> frontend is the same. I got a new controller for that box as the old one
> was broken. There are no changes in the backend; that is still the same
> XEN guest on the same hardware.
>
> What I cannot work out is why the old Infortrend box does not have any
> problems with the new file, whereas the newer one does. Also, when I
> copied over some files (again using the rsync command above), a few
> files did not copy correctly (again md5sum) in the first instance but
> did so later.
>
> I find that highly alarming, as it means that at least for larger and/or
> some binary files there seems to be a problem. However, I am not sure
> where to look, as I am out of ideas.
>
> Could it be that there is a problem with the 'new' controller?
> In all cases I was using ext4 as the file system and I did not have any
> problems with that.
>
> Anybody got some sentiments here?
>
> All the best from a sunny London
>
> Jörg
>
> P.S. To make things worse, I am off on a work-related trip from Monday
> onwards and I have been working on this problem since Friday evening.
>
> --
> *************************************************************
> Dr. Jörg Saßmannshausen, MRSC
> University College London
> Department of Chemistry
> Gordon Street
> London
> WC1H 0AJ
>
> email: j.sassmannshau...@ucl.ac.uk
> web: http://sassy.formativ.net
>
> Please avoid sending me Word or PowerPoint attachments.
> See http://www.gnu.org/philosophy/no-word-attachments.html
>
> _______________________________________________
> Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> To change your subscription (digest mode or unsubscribe) visit
> http://www.beowulf.org/mailman/listinfo/beowulf
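
The md5sum comparison described in the quoted message can be taken one step further to find out where a bad copy starts to differ from the good one, which may help to tell a single flipped block from a file that is damaged throughout. The following is only a sketch: `file_md5` and `first_differing_chunk` are hypothetical helper names, both files are assumed to be reachable as local paths (e.g. via a mount), and a re-read shortly after a copy may be served from the page cache rather than from the discs.

```python
import hashlib


def file_md5(path, chunk_size=1024 * 1024):
    """MD5 of a file, read in chunks so a 16 GB file need not fit in RAM."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def first_differing_chunk(path_a, path_b, chunk_size=1024 * 1024):
    """Byte offset of the first chunk that differs, or None if identical.

    A length mismatch is reported as the offset of the chunk in which
    the shorter file ends.
    """
    offset = 0
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        while True:
            a = fa.read(chunk_size)
            b = fb.read(chunk_size)
            if a != b:
                return offset
            if not a:
                # Both files hit end-of-file at the same point.
                return None
            offset += len(a)
```

Running `first_differing_chunk` several times on the same pair of files and getting different offsets would suggest an intermittent fault in the read path (RAM, controller, cabling) rather than data that was written wrongly once.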