Hugh Sasse wrote:
>
> OK, if it is in fields, like /etc/passwd, then awk is probably more
> suited to this problem than reading it directly with shell script.
>
> If it has some delimited keyword, but each line has variable structure,
> then you'd be better using sed.
>
The files contain something like:

    aaa xxx xxx xxxxx xxxxx xxxx
    bbb xxx xxx xxx xxx xx xxxx xx
    ccc xx xxxxx xxxx xxxxx xxxx

aaa, bbb, ccc are the known unique elements. No, they don't have a
fixed size. And no, there's no delimiting keyword except the first
space after them. Those xxx are sequences of characters that can be
anything, from numbers to letters, and of varying length. The elements
are known and unique, and I need to extract the whole line beginning
with such an element. That's why I used the example of a "database
table". Is awk suitable? I know nothing about awk.

Hugh Sasse wrote:
>
> Both of these operate linewise on their input, and can use regular
> expressions and actions in braces to produce some textual response.
> You can pass that response to `xargs -n 1` or something.
>

I'm not sure I understand, since I know nothing about awk. But this
could be postponed to a later discussion if appropriate.

Hugh Sasse wrote:
>
>> unique element that determines what should be done. Of course, I could
>> go a grep on the file to find out the elements, but this would give a
>> complexity of O(n^2).
>
> Not sure how you get the O(n^2) from that unless you don't know what
> the unique elements are, but I still make that "one pass to read them
> all, one pass to execute them" [with apologies to Tolkien :-)]
>
>> I know that every line is processed only once, and the pointer to the
>> current line will never go back. So I figure out that I could read every
>> line in an array element and I could process line by line. This would
>> give a O(n) which is much faster.
>
> Yes, agreed. Throw us a few example lines, fictionalised, then we may
> be able to give you an example of an approach with greater simplicity.
>

To put it simply, the pseudo-algorithm for extracting the lines is:

    n = number of lines in the file (which is also the number of elements to process)
    element = array(1 to n) of known elements
    for i = 1 to n
        use grep or whatever to extract the whole line beginning with element(i)
        // process the line
    end

Here, grep has to scan the whole file to extract one line. In other
words, if there are 3 elements, grep scans 3 lines for each element,
so it scans 9 lines over the whole run. Therefore, if there are n
elements, grep scans n lines n times: O(n^2). Even if grep stops at
the first occurrence of the element, it still scans n/2 lines on
average, so the time is proportional to n^2/2, and the complexity is
still O(n^2).
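For what it's worth, the single-pass awk approach Hugh is pointing at
would look roughly like this. It is only a sketch: I'm assuming the
known elements sit one per line in a file I'll call keys.txt, with the
data in data.txt (both names made up for the example):

    # First pass (NR == FNR holds only while reading keys.txt):
    # remember each known element as an index into the array "want".
    # Second pass (data.txt): print every line whose first
    # whitespace-separated field is one of the remembered elements.
    awk 'NR == FNR { want[$1]; next } $1 in want { print }' keys.txt data.txt

Each file is read exactly once, so the total work grows linearly with
the number of lines instead of quadratically.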
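And if the per-line processing has to happen in the shell itself, a
plain `while read` loop gives the same one-pass behaviour with no grep
at all. Again just a sketch, matching the "process line by line" idea
above; process_aaa and friends are hypothetical placeholders for
whatever each element requires:

    # One pass over the file: each line is read exactly once and
    # dispatched on its first field, so no line is ever re-scanned.
    while read -r key rest; do
        case $key in
            aaa) process_aaa "$rest" ;;
            bbb) process_bbb "$rest" ;;
            ccc) process_ccc "$rest" ;;
        esac
    done < data.txt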