Problem with reading a file and executing other stuff?
I've been struggling with the following code, which reads a text file (test.txt) and counts the number of lines. I know there are simpler ways to count the lines of a text file, but that's not the point of this post.

  n=0
  cat test.txt | while read line
  do
    n=$((n+1))
    echo "${line}"
  done
  echo "$n"

The result of the last echo is zero, meaning that n is never incremented inside the while loop. It seems to me that inside this while loop, nothing is done except echoing the lines. Pitfall? Bug? Or feature? I'd appreciate it if somebody could shed some light on this.
Re: Problem with reading a file and executing other stuff?
Paul Jarc wrote:
>
> Read entry E4 in the bash FAQ:
> http://tiswww.case.edu/php/chet/bash/FAQ
>
OK, I see: the problem comes from the use of a pipeline, which triggers the creation of a subprocess (why should it do so? -- no need to answer that question :p ). I've read that section several times, but I'm not sure how to use the IFS. However, http://en.wikipedia.org/wiki/Bash_syntax_and_semantics#I.2FO_redirection seems to give some hints. I need to try it.
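For illustration, a minimal sketch of the effect (my own example, not taken from the FAQ): the increment happens in the subshell created for the last pipeline stage and is lost when that subshell exits.

  n=0
  echo "one line" | while read line
  do
    n=$((n+1))        # increments a copy of n inside the subshell
  done
  echo "$n"           # the parent shell still prints 0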
Re: Problem with reading a file and executing other stuff?
Paul Jarc wrote:
>
> If you're reading from a regular file, you can just eliminate the
> useless use of cat:
> while read line; do ...; done < test.txt
>
Oh yes! This is a lot better and syntactically simpler than using file descriptor 6 (which nevertheless is also a working solution). It's a pity that the filename can't be put before the while loop, or it would be a lot easier to read, especially when the while loop is very long. (Once more, no need to answer this comment of mine :p ) Is there any pitfall in using this solution of yours? You talked about a "regular file"; what is that supposed to be? A text file as opposed to a binary file? I've found that if the last line isn't terminated by a newline, that line can't be read. This seems to be a very common error and I've seen it in other commands.
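For the record, my original counting loop rewritten with the redirection (a minimal sketch, assuming test.txt as before):

  n=0
  while read line
  do
    n=$((n+1))
    echo "${line}"
  done < test.txt
  echo "$n"           # now prints the actual line count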
Re: Problem with reading a file and executing other stuff?
Hugh Sasse wrote:
>
> On Fri, 2 Nov 2007, Horinius wrote:
>> I've found that if the last line isn't terminated by a new-line, that
>> line can't be read. This seems to be a very common error and I've seen
>> it in other commands.
>
> This is a Unix convention. I don't know the origins.
>
I was not talking about new-line vs carriage-return vs new-line AND carriage-return. I was saying that if the last character of the last line is also the last character of the file, that line isn't read. Or is that really what you're referring to as a "Unix convention"?
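For what it's worth, read returns a non-zero status when it hits end-of-file before seeing a newline, even though it has already filled the variable, so the loop body is skipped for that final fragment. A sketch of a common workaround (the `|| [ -n "$line" ]` guard is my own addition, not something suggested earlier in this thread):

  # printf deliberately leaves the last line unterminated
  printf 'first\nsecond\nlast' > test.txt

  while read line || [ -n "$line" ]
  do
    echo "${line}"
  done < test.txt
  # Without the guard, "last" is read into $line but the loop body never
  # runs for it; with the guard, all three lines are printed.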
Re: Problem with reading a file and executing other stuff?
Hugh Sasse wrote:
>
> And vi warns about it in a similar way to ed.
>
> Again, what problem are you trying to solve, if any?
>
I'm doing some processing on a big file which is well formatted. It's a sort of database table (or a CSV file if you like). Every line contains a unique element that determines what should be done. Of course, I could do a grep on the file to find the elements, but this would give a complexity of O(n^2). I know that every line is processed only once, and the pointer to the current line never goes back. So I figured I could read every line into an array element and process the file line by line, as sketched below. This would give O(n), which is much faster.
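A minimal sketch of what I have in mind, reading each line into an array and then processing the array (data.txt is a made-up name for the big file):

  lines=()
  while IFS= read -r line || [ -n "$line" ]   # guard for a missing final newline
  do
    lines+=("$line")                          # append; needs a reasonably recent bash
  done < data.txt

  for ((i = 0; i < ${#lines[@]}; i++))
  do
    echo "processing: ${lines[$i]}"           # placeholder for the real work
  done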
Re: Problem with reading a file and executing other stuff?
Hugh Sasse wrote:
>
> OK, if it is in fields, like /etc/passwd, then awk is probably more
> suited to this problem than reading it directly with shell script.
>
> If it has some delimited keyword, but each line has variable structure,
> then you'd be better using sed.
>
The files contain something like:

  aaa xxx xxx x x
  bbb xxx xxx xxx xxx xx xx
  ccc xx x x

aaa, bbb, ccc are the known unique elements. No, they don't have a fixed size. And no, there's no delimiting keyword except the first space after them. Those xxx are sequences of characters that can be anything, from numbers to letters, and of varying length. The elements are known and unique, and I need to extract the whole line beginning with each such element. That's why I used the example of a "database table". Is awk suitable? I know nothing about awk.

Hugh Sasse wrote:
>
> Both of these operate linewise on their input, and can use regular
> expressions and actions in braces to produce some textual response.
> You can pass that response to `xargs -n 1` or something.
>
I'm not sure I understand, since I know nothing about awk. But this could be postponed to a later time for discussion if appropriate.

Hugh Sasse wrote:
>
>> unique element that determines what should be done. Of course, I could
>> do a grep on the file to find the elements, but this would give a
>> complexity of O(n^2).
>
> Not sure how you get the O(n^2) from that unless you don't know what
> the unique elements are, but I still make that "one pass to read them
> all, one pass to execute them" [with apologies to Tolkien :-)]
>
>> I know that every line is processed only once, and the pointer to the
>> current line never goes back. So I figured I could read every line into
>> an array element and process the file line by line. This would give
>> O(n), which is much faster.
>
> Yes, agreed. Throw us a few example lines, fictionalised, then we may
> be able to give you an example of an approach with greater simplicity.
>
Put simply, the pseudo-algorithm for extracting the lines is like this:

  n = number of lines in the file (which is also the number of elements to process)
  element = array(1 to n) of known elements
  for i = 1 to n
    use grep or whatever to extract the whole line beginning with element(i)
    // process the line
  end

Here, grep has to parse the whole file to extract one line. In other words, if there are 3 elements, grep has to parse 3 lines for every element, so it parses 9 lines over the whole algorithm. Therefore, if there are n elements, grep has to parse n lines n times, which is O(n^2). Even if grep stops at the first occurrence of the element, it still has to parse n/2 lines on average, so the time is proportional to n^2/2 and the complexity is still O(n^2).
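For comparison, a single awk pass over both files would look something like this (only a sketch; elements.txt holding one known element per line and data.txt holding the table are names I made up, not files mentioned in this thread):

  # First pass (NR == FNR): remember every known element.
  # Second pass: print each data line whose first field is a known element.
  # Each file is read exactly once, so this stays O(n).
  awk 'NR == FNR { want[$1] = 1; next }
       $1 in want { print }' elements.txt data.txt

The printed lines could then be handed to whatever per-line processing is needed, instead of running grep once per element.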