Re: History file clobbered by multiple simultaneous exits

2013-07-25 Thread Geoff Kuenning
>   As for the problem... the fact that you're using 4.2 would seem
> to make the algorithm:
> open()
> write(whatever we have as history)
> close(set eof to where we are).
>
> What file system are you on?  Is it local or networked?

Local, ext3.
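For reference, the sequence you describe boils down to something like
the following.  (This is just a sketch in C of the generic
truncate-and-rewrite pattern, not bash's actual history-writing code.)

/* Sketch of the truncate-and-rewrite pattern under discussion;
 * not bash's actual source. */
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

static int save_history(const char *path, const char *text)
{
    /* O_TRUNC sets the file length to zero at open time. */
    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;
    /* Write whatever we have as history... */
    ssize_t n = write(fd, text, strlen(text));
    /* ...and EOF ends up wherever the write stopped. */
    close(fd);
    return n < 0 ? -1 : 0;
}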

> One way for it to be zero is if the last bash exiting had no history,
> since the zero/truncate on each open can wipe whatever file a previous
> bash left behind.

I thought of that too, but it's not the case for me.  Even after the
failure has wiped the old history, my new shells have at least 1-2
commands kicking around.  So I could imagine my nice 500-line history
turning into a 2-line one, but not zero-length.

> I can also see the possibility of some kernel or file system routine
> waiting after you issue the close call so that it doesn't have to zero
> the area where data is arriving.  I.e. it might only zero the file beyond
> the valid text AFTER some delay (5 seconds?) OR might wait until the file
> is closed, so if you completely overwrite the old file with text, the
> kernel won't have to zero anything out.

If so, that would be a big bug.  When you're truncating a file to a
shorter length, some filesystems do indeed delay freeing the blocks in
hopes of reusing them.  But the length is set to zero when the O_TRUNC
happens, and likewise if you write n bytes, the length is immediately
increased by n.  There are certain races on some filesystems that could
cause the n bytes to be incorrect (e.g., garbage), but that generally
happens only on system crashes.  There's a paper on this from a few
years back; I'd have to review it to be sure but my recollection is that
you can't get zero-length files in the absence of system or hardware
failures.  (However, I'm not sure whether they used SMPs...)
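That's easy enough to check from user space; a snippet along these lines
(the scratch path is arbitrary) shows the size going to zero at open
time and jumping to n as soon as the write returns:

/* Quick check that O_TRUNC and write() update the file size at once.
 * /tmp/histtest is an arbitrary scratch path. */
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    const char msg[] = "echo hello\n";
    struct stat st;

    int fd = open("/tmp/histtest", O_WRONLY | O_CREAT | O_TRUNC, 0600);
    assert(fd >= 0);

    fstat(fd, &st);
    assert(st.st_size == 0);                  /* truncated at open time */

    assert(write(fd, msg, strlen(msg)) == (ssize_t)strlen(msg));
    fstat(fd, &st);
    assert(st.st_size == (off_t)strlen(msg)); /* size grows immediately */

    close(fd);
    return 0;
}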

Still, I suppose it could be a kernel bug.  Maybe I'll have to write a
better test program and let it run overnight.
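Something along these lines is what I have in mind: a handful of
processes that all rewrite the same file the way exiting shells do, with
a check for a zero-length result after every round.  (Only a sketch; the
path, writer count, and iteration count are arbitrary choices.)

/* Stress test: several processes repeatedly truncate-and-rewrite the
 * same file, and the parent checks whether it ever ends up zero-length.
 * The path, writer count, and round count are arbitrary. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <sys/wait.h>
#include <unistd.h>

#define PATH    "/tmp/histtest"
#define WRITERS 8
#define ROUNDS  100000L

static void writer(int id)
{
    char buf[64];
    int len = snprintf(buf, sizeof buf, "echo hello from writer %d\n", id);
    int fd = open(PATH, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd >= 0) {
        if (write(fd, buf, len) != len)
            _exit(1);
        close(fd);
    }
    _exit(0);
}

int main(void)
{
    for (long round = 0; round < ROUNDS; round++) {
        for (int i = 0; i < WRITERS; i++)
            if (fork() == 0)
                writer(i);
        for (int i = 0; i < WRITERS; i++)
            wait(NULL);

        struct stat st;
        if (stat(PATH, &st) == 0 && st.st_size == 0) {
            printf("zero-length file after round %ld\n", round);
            return 1;
        }
    }
    puts("no zero-length files seen");
    return 0;
}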

> In the case of write...close to a non-pre-zeroed record, the operation
> becomes a read-modify-write.  Thing is, if proc 3 goes out for the
> partial buffer (~4k is likely), it may have been completely zeroed from
> proc 2 closing where proc 3 wants to write.

No generic Linux filesystem that I'm aware of zeroes discarded data at
any time; it's too expensive.  And the partial buffer would be in-memory
at that point, since the processes can exit much faster than the buffer
could be written to disk.  So I don't think that's it.

> (multi-threaded ops on real multiple execution units do the darndest things).

Ain't that the truth!
-- 
Geoff Kuenning   ge...@cs.hmc.edu   http://www.cs.hmc.edu/~geoff/

An Internet that is not Open represents a potentially grave risk to
freedoms of many sorts -- freedom of speech and other civil liberties,
freedom of commerce, and more -- and that openness is what we must so
diligently work to both preserve and expand.
-- Lauren Weinstein



Re: History file clobbered by multiple simultaneous exits

2013-07-25 Thread Linda Walsh



Geoff Kuenning wrote:

>> I can also see the possibility of some kernel or file system routine
>> waiting after you issue the close call so that it doesn't have to zero
>> the area where data is arriving.  I.e. it might only zero the file beyond
>> the valid text AFTER some delay (5 seconds?) OR might wait until the file
>> is closed, so if you completely overwrite the old file with text, the
>> kernel won't have to zero anything out.
>
> If so, that would be a big bug.  When you're truncating a file to a
> shorter length, some filesystems do indeed delay freeing the blocks in
> hopes of reusing them.  But the length is set to zero when the O_TRUNC
> happens, and likewise if you write n bytes, the length is immediately
> increased by n.  There are certain races on some filesystems that could
> cause the n bytes to be incorrect (e.g., garbage), but that generally
> happens only on system crashes.  There's a paper on this from a few
> years back; I'd have to review it to be sure but my recollection is that
> you can't get zero-length files in the absence of system or hardware
> failures.  (However, I'm not sure whether they used SMPs...)

Instead of "junk", secure file systems mark it as needing to be zeroed.
Perhaps instead of zeroing it, ext3 simply marks it as zero length?
Imagine that embedded in the junk are credit cards and passwords, and
you'll begin to understand why zero pages are kept "in stock" (in
memory) in the kernel so it can rapidly issue a fresh page.


It's an edge case usually only seen during a system crash, as you
mentioned, so I can't see how it would cause the symptom you are seeing;
in more mature file systems it's only seen in crashes.  Can you
reproduce it on another file system like xfs?




> Still, I suppose it could be a kernel bug.  Maybe I'll have to write a
> better test program and let it run overnight.


Well... remember, between bash and the kernel are layers of libc-library
stuff, as well as file-system drivers that often all act just slightly
differently from every other driver... ;-)



>> In the case of write...close to a non-pre-zeroed record, the operation
>> becomes a read-modify-write.  Thing is, if proc 3 goes out for the
>> partial buffer (~4k is likely), it may have been completely zeroed from
>> proc 2 closing where proc 3 wants to write.


> No generic Linux filesystem that I'm aware of zeroes discarded data at
> any time; it's too expensive.

Actually the buffer is zeroed before the user stuff is copied into it if
it is a partial record.  Having no data leakage between processes is a
security requirement on secure systems, and that requirement has just
become the status quo for most modern OS's...




Re: History file clobbered by multiple simultaneous exits

2013-07-25 Thread Geoff Kuenning
>   Instead of "junk", secure file systems mark it as needing to be
> zeroed.  Perhaps instead of zeroing it, ext3 simply marks it as zero
> length?  Imagine that embedded in the junk are credit cards and
> passwords, and you'll begin to understand why zero pages are kept "in
> stock" (in memory) in the kernel so it can rapidly issue a fresh page.

Perhaps I should mention that file systems are my field.

While you're right that secure-deletion file systems clobber deleted
data, that's the exception rather than the rule.  So your guess about
ext3 is also correct: when you truncate a file, ext3 marks it as zero
length but never overwrites the data.  This is easy to prove on any
machine that has a way of monitoring disk activity (either an LED or via
programs like gkrellm or iotop).  Simply create a multi-gigabyte file,
sync the disk, and cp /dev/null to the file.  The disk will be only
briefly active while the free list is updated, not nearly long enough to
rewrite zeros to the freed blocks.
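If watching an LED seems too crude, the timing alone makes the same
point: something like the following (the path and the 1 GiB size are
arbitrary) builds a gigabyte of file, syncs it, and then truncates it,
and the truncate finishes orders of magnitude faster than rewriting that
much data could.

/* Rough timing check that truncation doesn't rewrite the freed blocks:
 * build a large file, sync it, then time the truncating open.
 * The path and the 1 GiB size are arbitrary. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
    static char block[1 << 20];                 /* 1 MiB of junk */
    memset(block, 'x', sizeof block);

    int fd = open("/tmp/bigfile", O_WRONLY | O_CREAT | O_TRUNC, 0600);
    for (int i = 0; i < 1024; i++)              /* 1024 x 1 MiB = 1 GiB */
        if (write(fd, block, sizeof block) != (ssize_t)sizeof block)
            return 1;
    fsync(fd);                                  /* force it out to disk */
    close(fd);

    struct timespec t0, t1;
    clock_gettime(CLOCK_MONOTONIC, &t0);
    fd = open("/tmp/bigfile", O_WRONLY | O_TRUNC);  /* the "cp /dev/null" step */
    close(fd);
    clock_gettime(CLOCK_MONOTONIC, &t1);

    printf("truncating 1 GiB took %.1f ms\n",
           (t1.tv_sec - t0.tv_sec) * 1e3 + (t1.tv_nsec - t0.tv_nsec) / 1e6);
    return 0;
}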

The difference from zero pages in memory is that the API for memory is
designed around the "give me a block and I'll do things with it later"
model.  So asking for a page doesn't inherently clobber it, and the
kernel has to do that for you.  By contrast, the only way to allocate a
page in a file system is to either write to it or to lseek over it.  The
first case automatically clobbers it; in the second it gets marked as
"demand zero if anybody ever reads it" but the disk never gets
written--in fact, no block is allocated unless you write to it.
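That's easy to see with a quick experiment (the scratch path here is
arbitrary): seek a megabyte past the end of an empty file, write one
byte, and the hole reads back as zeros even though almost nothing was
allocated.

/* Demonstration of a file hole: lseek past EOF, write one byte, and the
 * skipped region is demand-zero with no blocks allocated for it.
 * /tmp/holetest is an arbitrary scratch path. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(void)
{
    int fd = open("/tmp/holetest", O_RDWR | O_CREAT | O_TRUNC, 0600);
    lseek(fd, 1024 * 1024, SEEK_SET);   /* skip a megabyte without writing */
    if (write(fd, "x", 1) != 1)         /* only this byte allocates a block */
        return 1;

    struct stat st;
    fstat(fd, &st);
    printf("size = %lld bytes, allocated = %lld bytes\n",
           (long long)st.st_size, (long long)st.st_blocks * 512);

    char c = 1;
    pread(fd, &c, 1, 0);                /* read from inside the hole */
    printf("byte at offset 0 reads back as %d\n", c);

    close(fd);
    return 0;
}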

(As a historical note, Unix at least through V7 didn't demand-zero newly
allocated memory.  Eventually somebody noticed the bug and wrote a
program to steal passwords that way, prompting a change in the kernel
code.  I also believe that in V6 if you wrote a partial block at the end
of a file, you could get leftover data from another file in the
remainder of the block; since you couldn't read it back--it was past
EOF--it was considered OK to have garbage there, and zeroing the memory
before the write was thought to be too expensive.  Unfortunately my
treasured photocopy of the Lions book is in storage while my office is
being moved, so I can't check the V6 code to be sure.)

> It's an edge case usually only seen during a system crash, as you
> mentioned, so I can't see how it would cause the symptom you are
> seeing; in more mature file systems it's only seen in crashes.  Can you
> reproduce it on another file system like xfs?

Good question.  Unfortunately that's not a practical experiment for me
to run.  :-(
-- 
Geoff Kuenning   ge...@cs.hmc.edu   http://www.cs.hmc.edu/~geoff/

Paymasters come in only two sizes: one sort shows you where the book
says that you can't have what you've got coming to you; the second
sort digs through the book until he finds a paragraph that lets you
have what you need even if you don't rate it.  Doughty was the second
sort.
-- Robert A. Heinlein, "The Door Into Summer"