Bret Busby wrote:
> I have an external USB HDD connected to a system running Debian 6 LTS.

I don't really have any great contribution.  But since no one else
seems to have any good response I will contribute what I know.

I have never had good luck with USB connected hard drives.  They work
for a while.  But then invariably they get dropped offline.  It might
be 3-6 months between events.  But for me they just are not reliable.
In the past I have tried very hard to use them as system disks.  Now I
consider that something to avoid.

I still use USB disks as "large floppies".  They are still great for
being large temporary data stores for holding and moving data between
machines.  But only when connected for short term use.  Reading your
problems just reinforces this belief.

[On the other hand USB network devices have been rock solid for me.
While I avoid USB disks I actively use USB networking on several
machines to add additional NICs.  I am planning another site using
additional USB NICs.  It is probably hardware dependent, but they have
been working great for me, the opposite of my experience with disks.
And I have three sites using USB sound cards very robustly.]

> I have tried to transfer data from the desktop internal HDD, to the
> external USB HDD.
>
> The file manager shows as being "Nautilus 2.30.1".

I read from your message that you are a graphical desktop user.  That's
fine.  But for transferring large amounts of data, command line tools
such as rsync are the best in class.  I wouldn't even consider using
nautilus or other graphical file managers for this type of task.  I
would highly recommend rsync, even if for you it means a stretch to
get off the mouse and over to the keyboard.

The biggest advantage of a tool such as rsync is that it is
interruptible and restartable with a minimum of lost effort.  I may be
1G into a 3G transfer and want to stop it, change something, and
restart it.  With a normal copy that would mean copying the original
data again.  With rsync it means it will examine what still needs to
be done and continue the copy, treating the already transferred data
as done and moving forward.

  rsync -avP /from/here/dir-or-file /to/there/dir/

> From time to time, as in this instance, I forget (until too late) that
> Debian 6 can not cope with transferring data more than about 1GB at a
> time; in this instance, I had tried to transfer about 3GB, to make
> room in my /home partition.

Knowing how flaky USB disk interfaces tend to be, I think this is most
likely a hardware problem.  That doesn't change your situation.  But I
think it puts the blame in the right place.

In any case I have definitely copied gigs and gigs of data to and from
USB disks.  It can be very good to make large data sets portable on a
portable USB drive.  My complaints usually happen after the disk has
been in active use as a system device for a month and then it goes
offline.

> The transfer had seized up, after transferring about 1.1GB of the
> 3.2GB that I had tried to transfer, so, after a couple of days of it
> apparently doing nothing, I stopped it, and, as the system monitor
> showed a system load of around 43 (whatever that means - if it was as
> a percentage of system capacity, it could be more meaningful, to me).
> The system monitor currently shows a system load average of about 35.

Let me give a short explanation of system load.  It is almost
impossible to explain briefly, so forgive me in advance for leaving out
important parts, saying half of it wrong, and still saying too much.
The concept is the important part here.

First there is no set capacity.  There isn't a cap such as 5 or 10 or
100.  Therefore there isn't a way to say what percentage of your
system is being used at any particular system load.  But it is an
important indicator of system status and health.  Loads of 35 and 43
are both very high!

The operating system process scheduler schedules processes to run.
A process ready to run is queued into the run queue.  If the process
is calculating PI to a zillion decimal places then it is going to use
100% of the cpu until it has consumed its time slice and is suspended
to give the next process time to run.  If there are no other processes
then the cpu will be given back to this process and the cpu will
continue to be 100% utilized forever.

But what about processes reading and writing to the disk drive or
network?  In computer terms spinning disk drives are slow.  Networks
are slow.  Web servers are slow.  Say that your web browser sends an
HTTP GET request to a web site.  It then must wait for the response.
Your web browser would like to run.  But it can't.  It is waiting for
an external event to complete.  It is "blocked" waiting for I/O.
While it is waiting the OS will schedule another process to run.  If
there is another process ready it will get the cpu.  A process may
likewise be waiting for the disk drive to spin and return data.  Or to
spin and complete a data write.

The OS will spin through the run queue running any process that is
ready to run.  Some will run for a short time and then perform a read
or a write to an I/O device.  That will cause them to stop running
while their I/O request is processed.  Other processes such as those
running protein folding research will consume all cpu running until
the OS suspends them.  The OS manages the run queue and keeps things
moving through the cpu using a variety of algorithms to make best use
of the cpu.  A system running a lot of processes where all of the
processes are using 100% cpu will be very busy and feel very slow.  A
system running a lot of processes where most of them are waiting for
some I/O to complete will be mostly idle because it will be mostly
waiting and will feel very responsive.  You might not even notice it
is doing so many things.

I hope this explains the run queue in the OS and how it is used.  The
load average you are seeing is built from it.  On Linux it counts the
processes that are running or ready to run plus those blocked in
uninterruptible I/O wait, averaged over the last 1, 5, and 15 minutes.
Those numbers 43 and 35 say that you have around forty processes in
that state.  Wow!  In a typical desktop system the load average will
hover near zero.  That yours is so high says something is blocking
those processes from completing.  Your system may be very responsive
if they are all waiting, if they are all "blocked waiting for I/O".
Or it may be really bogged down if they are all crunching something on
the cpu.  Or bogged down if they are consuming too much memory and
your system swaps.  A high load average indicates something is
blocking those processes from completing and they are stacking up in
the queue.  When airplanes can't land at an airport fast enough then
air traffic control stacks them up in holding patterns.  That is what
is being indicated by your high load average.  There isn't a maximum
and therefore there isn't a percentage of capacity.
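
To put numbers like those in some context, here is a minimal sketch
that reads the load averages and shows them next to the number of
cpus.  It assumes a Linux system (/proc/loadavg) with nproc from GNU
coreutils.  Dividing by the cpu count is only a rule of thumb of mine
for scale.  It is still not a percentage, since the load has no upper
bound.

```shell
#!/bin/sh
# /proc/loadavg is Linux-specific.  Its first three fields are the
# 1-, 5-, and 15-minute load averages.
read -r load1 load5 load15 rest < /proc/loadavg

# Number of cpus available to this process, for scale.
cpus=$(nproc)

# awk does the floating-point division; plain sh arithmetic cannot.
per_cpu=$(awk -v l="$load1" -v c="$cpus" 'BEGIN { printf "%.2f", l / c }')

echo "load averages: $load1 (1 min) $load5 (5 min) $load15 (15 min)"
echo "cpus: $cpus"
echo "1 min load per cpu: $per_cpu"
```

A per-cpu figure well above 1 for a long time means processes are
queuing up faster than the machine is getting through them.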

If you see a high load average such as that, run a 'ps' command to see
what processes are running.  Sorry I don't know how to do that from a
graphical desktop.  But in the command line shell I would normally
run:

  ps -efH | less   # Or the bsd way: ps aux | less

Then look to see what is running.  If you have a load of 40 then there
will be something like 40 unusual processes there.  Maybe that is 40
nautilus processes stacked up trying to talk to the USB.  I think that
likely.  That would definitely indicate a system problem talking to
the USB device.
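
One way to narrow that listing down, as a sketch assuming the procps
version of ps found on Linux, is to show only processes whose state
starts with R (running or runnable) or D (uninterruptible sleep,
usually waiting on I/O).  Those are the ones that count toward the
load.

```shell
#!/bin/sh
# Print state, pid, and command name with no header line, keep only
# states beginning with R or D, and finish with a count.
busy=$(ps -e -o stat= -o pid= -o comm= |
  awk '$1 ~ /^[RD]/ { print; n++ }
       END { printf "%d processes in state R or D\n", n }')

echo "$busy"
```

The ps command itself is running while it takes the listing, so
expect it to show up in state R.  A long column of D entries all
belonging to the same program would point at stuck I/O.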

You may ask, "What is an unusual process?"  That is a hard one to
answer.  You should look at a normally running system routinely and
learn what is normally running.  Then later when things are not
running normally you will recognize patterns that are different.  For
example you would probably not normally see nautilus running.  Or if
you have it open I expect you would see one process.  But if you were
to see 40 nautilus processes then you would know that was unusual and
likely an indicator of a problem.
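
A quick way to spot that kind of pile-up, again as a sketch assuming
procps ps, is to count processes by command name and look at the top
of the list.

```shell
#!/bin/sh
# Count how many processes share each command name, most common first.
top_names=$(ps -e -o comm= | sort | uniq -c | sort -rn | head -n 10)

echo "$top_names"
```

Run it now and then on a healthy system so you know what the normal
counts look like.  Then a name with dozens of copies will stand out.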

What you would do with that information depends upon the information.
I would try to figure out why those processes are stacking up.  I
would try to clear them.  That might be by removing the USB.  Or by
killing processes with kill.  Or other things depending upon the
situation.  But something is wrong and it is good to know what.
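
For the kill part, here is a minimal sketch.  The sleep process is
only a harmless stand-in of mine for a stuck process.  On a real
system you would take the pid from the ps listing.

```shell
#!/bin/sh
# Start a harmless stand-in for a stuck process and remember its pid.
sleep 300 &
pid=$!

# Ask politely first.  SIGTERM gives the process a chance to clean up.
kill -TERM "$pid"

# Collect the exit status.  A process ended by a signal reports
# 128 plus the signal number, and SIGTERM is signal 15.
status=0
wait "$pid" || status=$?
echo "exit status: $status"
```

Be aware that a process in state D generally will not respond to any
signal until its I/O request returns.  If that is what you have then
dealing with the device itself is usually the only way forward.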

I don't really know what to say about the rest of your problems and
situation.  But I hope this small piece was helpful for this small
part of it.

Bob
