Hi Andrew, thanks.
I will look into that. It is good to hear it is ready for production now. The
last time I looked into it, it was not.

All the best

Jörg

On Sunday 21 September 2014 you wrote:
> > Regarding ZFS: is that available for Linux now? I have lost track a bit
> > here.
>
> Yes.
>
> http://zfsonlinux.org/
>
> I would say it's ready for production now. Intel are about to start
> supporting it under Lustre in the next couple of months and they are
> typically careful about such things.
>
> Cheers,
>
> Andrew
>
> > All the best from London
> >
> > Jörg
> >
> > On Sunday 21 September 2014 you wrote:
> > > Hi Jörg,
> > >
> > > Sounds like a "typical" but very uncommon silent data corruption
> > > problem. If you have another copy of the data, compare to that? If you
> > > don't have another copy, accept the fact that some of your data may
> > > have been silently corrupted.
> > >
> > > Most RAID controllers do periodic "scrubbing"; was your Infortrend
> > > doing that?
> > >
> > > For the new system, consider using ZFS pointed at plain disks, as it
> > > may have more layers of checksums compared to your current system.
> > >
> > > Regards,
> > > Alex
> > >
> > > On Sunday, September 21, 2014, Jörg Saßmannshausen
> > > <j.sassmannshau...@ucl.ac.uk> wrote:
> > > > Dear all,
> > > >
> > > > I have a rather strange problem with one of my file servers, which I
> > > > recently upgraded in order to accommodate more disc space.
> > > >
> > > > The problem: I copied the files from the old file space to a
> > > > temporary disc storage space using this rsync command:
> > > >
> > > > rsync -vrltH -pgo --stats -D --numeric-ids -x oldserver:foo tempspace:baa
> > > >
> > > > I have been doing this for some years now and never had any problems.
> > > >
> > > > As always, I run md5sum afterwards to be sure there is no problem
> > > > later and the user is not losing data.
> > > > This time around, for a rather large file (around 16 GB), the md5sum
> > > > failed after I moved the files from the temp space back to the new
> > > > destination using the same command as above.
> > > >
> > > > Still having access to the old file space, I decided to copy this
> > > > file again from the old file space. Strangely enough, rsync does not
> > > > sync the file again, so I had to delete the file. Even after deleting
> > > > the file and re-syncing it from the old source, the md5sum is wrong.
> > > >
> > > > Copying the file to a different file space did not cause this
> > > > problem, i.e. the md5sum is correct.
> > > > As it is a tar.gz file, I simply decided to decompress the original
> > > > file on the different file server. That worked. The file where the
> > > > md5sum is wrong did not decompress on the different file server but
> > > > crashed with an error message when I executed gunzip. So the file is
> > > > broken.
> > > >
> > > > The setup:
> > > >
> > > > Originally I was using an old Infortrend box which had old PATA discs
> > > > in it. This box is connected via SCSI to a frontend server which
> > > > exports the file space via iSCSI. The backend for that, i.e. the one
> > > > the user is accessing, is on a different physical machine and is a
> > > > Xen guest. The reason behind that setup is that the frontend is
> > > > acting as a backup server and I don't want people to have access to
> > > > it.
> > > > I then exchanged the Infortrend box for a more recent model which has
> > > > SATA capabilities but still has a SCSI connection to the frontend.
> > > > The frontend is the same. I got a new controller for that box as the
> > > > old one was broken. There are no changes in the backend; that is
> > > > still the same Xen guest on the same hardware.
> > > >
> > > > What I cannot work out is why the old Infortrend box does not have
> > > > any problems with the new file, while the newer one has a problem
> > > > here. Also, when I copied over some files (again using the rsync
> > > > command above), a few files did not copy correctly (again according
> > > > to md5sum) in the first instance but did so later.
> > > >
> > > > I find that highly alarming as it means that, at least for larger
> > > > and/or some binary files, there seems to be a problem. However, I am
> > > > not sure where to look as I am out of ideas.
> > > >
> > > > Could it be there is a problem with the 'new' controller?
> > > > In all cases I was using ext4 as a file system and I did not have any
> > > > problems with that.
> > > >
> > > > Has anybody got some thoughts here?
> > > >
> > > > All the best from a sunny London
> > > >
> > > > Jörg
> > > >
> > > > P.S. To make things worse, I am off on a work-related trip from
> > > > Monday onwards and I have been working on this problem since Friday
> > > > evening.
> > > >
> > > > --
> > > > *************************************************************
> > > > Dr. Jörg Saßmannshausen, MRSC
> > > > University College London
> > > > Department of Chemistry
> > > > Gordon Street
> > > > London
> > > > WC1H 0AJ
> > > >
> > > > email: j.sassmannshau...@ucl.ac.uk
> > > > web: http://sassy.formativ.net
> > > >
> > > > Please avoid sending me Word or PowerPoint attachments.
> > > > See http://www.gnu.org/philosophy/no-word-attachments.html
> >
> > --
> > *************************************************************
> > Dr. Jörg Saßmannshausen, MRSC
> > University College London
> > Department of Chemistry
> > Gordon Street
> > London
> > WC1H 0AJ
> >
> > email: j.sassmannshau...@ucl.ac.uk
> > web: http://sassy.formativ.net
> >
> > Please avoid sending me Word or PowerPoint attachments.
> > See http://www.gnu.org/philosophy/no-word-attachments.html
> >
> > _______________________________________________
> > Beowulf mailing list, Beowulf@beowulf.org sponsored by Penguin Computing
> > To change your subscription (digest mode or unsubscribe) visit
> > http://www.beowulf.org/mailman/listinfo/beowulf

--
*************************************************************
Dr. Jörg Saßmannshausen, MRSC
University College London
Department of Chemistry
Gordon Street
London
WC1H 0AJ

email: j.sassmannshau...@ucl.ac.uk
web: http://sassy.formativ.net

Please avoid sending me Word or PowerPoint attachments.
See http://www.gnu.org/philosophy/no-word-attachments.html
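The post-copy check described in the original message (run md5sum on both sides after the rsync and compare) could be sketched roughly as below. This is only an illustration under assumptions: both trees are reachable as local paths, and `md5_of` and `compare_trees` are hypothetical helper names that do not come from the thread. As a side note, rsync by default skips files whose size and modification time already match, which would be consistent with it not re-sending the damaged file; its `--checksum` option compares file contents instead.

```python
import hashlib
from pathlib import Path


def md5_of(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read fixed-size chunks so a 16 GB file never has to fit in RAM.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def compare_trees(source_root, copy_root):
    """Return relative paths that are missing from, or differ in, copy_root."""
    source_root, copy_root = Path(source_root), Path(copy_root)
    bad = []
    for source_file in sorted(source_root.rglob("*")):
        if not source_file.is_file():
            continue  # skip directories and other non-regular entries
        relative = source_file.relative_to(source_root)
        copy_file = copy_root / relative
        if not copy_file.is_file() or md5_of(source_file) != md5_of(copy_file):
            bad.append(str(relative))
    return bad
```

Calling `compare_trees("/old/space", "/new/space")` (both paths are placeholders) returns a list of files to re-copy or investigate; an empty list means every regular file under the source has a copy with a matching digest. Re-running it after some time on the same destination may also help show whether a mismatch is stable or intermittent.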