Hugh Sasse wrote:
>
> OK, if it is in fields, like /etc/passwd, then awk is probably more
> suited to this problem than reading it directly with shell script.
>
> If it has some delimited keyword, but each line has variable structure,
> then you'd be better using sed.
>
The files contain something like:

    aaa xxx xxx xxxxx xxxxx xxxx
    bbb xxx xxx xxx xxx xx xxxx xx
    ccc xx xxxxx xxxx xxxxx xxxx

aaa, bbb, ccc are the known unique elements. No, they don't have a
fixed size. And no, there's no delimiting keyword except the first
space after them. Those xxx are sequences of characters that can be
anything, from numbers to letters, and of varying length. The elements
are known and unique, and I need to extract the whole line beginning
with such an element. That's why I used the example of a "database
table". Is awk suitable? I know nothing about awk.

Hugh Sasse wrote:
>
> Both of these operate linewise on their input, and can use regular
> expressions and actions in braces to produce some textual response.
> You can pass that response to `xargs -n 1` or something.
>

I'm not sure I understand, since I know nothing about awk. But this
could be postponed to a later discussion if appropriate.

Hugh Sasse wrote:
>
>> unique element that determines what should be done. Of course, I could
>> go a grep on the file to find out the elements, but this would give a
>> complexity of O(n^2).
>
> Not sure how you get the O(n^2) from that unless you don't know what
> the unique elements are, but I still make that "one pass to read them
> all, one pass to execute them" [with apologies to Tolkien :-)]
>
>> I know that every line is processed only once, and the pointer to the
>> current line will never go back. So I figure out that I could read every
>> line in an array element and I could process line by line. This would
>> give a O(n) which is much faster.
>
> Yes, agreed. Throw us a few example lines, fictionalised, then we may
> be able to give you an example of an approach with greater simplicity.
>

To put it simply, the pseudo-algorithm for extracting the lines is:

    n = number of lines in the file (which is also the number of elements to process)
    element = array(1 to n) of known elements
    for i = 1 to n
        use grep or whatever to extract the whole line beginning with element(i)
        // process the line
    end

Here, grep has to scan the whole file to extract one line. In other
words, if there are 3 elements, grep scans 3 lines for each element,
so it scans 9 lines over the whole run. Therefore, if there are n
elements, grep scans n lines n times: O(n^2). Even if grep stops at
the first occurrence of the element, it still scans n/2 lines on
average, so the time is proportional to n^2/2, and the complexity is
still O(n^2).
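For what it's worth, the single-pass awk approach Hugh is pointing at
would look roughly like this. It is only a sketch: I'm assuming the
known elements sit one per line in a file I'll call keys.txt, with the
data in data.txt (both names made up for the example):

    # First pass (NR == FNR holds only while reading keys.txt):
    # remember each known element as an index into the array "want".
    # Second pass (data.txt): print every line whose first
    # whitespace-separated field is one of the remembered elements.
    awk 'NR == FNR { want[$1]; next } $1 in want { print }' keys.txt data.txt

Each file is read exactly once, so the total work grows linearly with
the number of lines instead of quadratically.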
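And if the per-line processing has to happen in the shell itself, a
plain `while read` loop gives the same one-pass behaviour with no grep
at all. Again just a sketch, matching the "process line by line" idea
above; process_aaa and friends are hypothetical placeholders for
whatever each element requires:

    # One pass over the file: each line is read exactly once and
    # dispatched on its first field, so no line is ever re-scanned.
    while read -r key rest; do
        case $key in
            aaa) process_aaa "$rest" ;;
            bbb) process_bbb "$rest" ;;
            ccc) process_ccc "$rest" ;;
        esac
    done < data.txt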