Re: GNU Parallel seems to drop

2012-09-25 Thread Ole Tange
On Tue, Sep 25, 2012 at 6:42 AM, Dirk Eddelbuettel wrote: > > Here is more concrete example: You have not given enough information to generate a big inputfile, so I cannot reproduce your test. Based on your explanation it sounds as if the awk script opens files A, B and C. If 2 awk scripts both

Re: GNU Parallel seems to drop

2012-09-25 Thread Dirk Eddelbuettel
Hi Ole, Ole Tange tange.dk> writes: > You have not given enough information to generate a big inputfile, so > I cannot reproduce your test. I know. I created a quick R script for dummy data I can post if there is interest. > Based on your explanation it sounds as if the awk script opens fil

Re: GNU Parallel seems to drop

2012-09-25 Thread Ole Tange
On Tue, Sep 25, 2012 at 1:22 PM, Dirk Eddelbuettel wrote: >> One way to solve that is to instead have the first invocation open A1, >> B1 and C1 while the second writes to A2, B2 and C2. You can use {#} or >> $PARALLEL_SEQ for that by writing to A{#} or A$PARALLEL_SEQ. > > Hm. Then I have ~ N x c
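A minimal sketch of the per-job output file suggestion above, assuming a current GNU Parallel is installed. The scratch directory and the inputs x, y and z are made up for illustration; only {#} comes from the thread.

```shell
#!/bin/sh
# Skip quietly when GNU Parallel is not installed.
command -v parallel >/dev/null 2>&1 || exit 0

workdir=$(mktemp -d)   # hypothetical scratch directory

# Each job writes to its own file A1, A2, A3 ({#} is the job sequence number),
# so no two jobs ever open the same output file.
parallel "echo {} > $workdir/A{#}" ::: x y z

# Merge the per-job files in job order once all jobs have finished.
merged=$(cat "$workdir/A1" "$workdir/A2" "$workdir/A3")
```

The cost is the extra merge step and one file per job per output, which is the "N x c" concern raised in the reply.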

Re: GNU Parallel seems to drop

2012-09-25 Thread Dirk Eddelbuettel
Dirk Eddelbuettel debian.org> writes: > Ole Tange tange.dk> writes: > > If 2 awk scripts both open A, B and C then the last one wins and all > > data written by the first one is lost. > > Plonk. I think that may indeed be the case. I had not thought that through. > I have to find a tool that does

Re: GNU Parallel seems to drop

2012-09-25 Thread Ole Tange
On Tue, Sep 25, 2012 at 1:50 PM, Dirk Eddelbuettel wrote: > Well a little "apt-get install gawk-doc" and two seconds of searching led to > the '>>' operator to append to files ... and tada, it now works. Depending on how it appends that may not work. Do you know for sure it flushes for every re
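The difference between '>' and '>>' in awk can be sketched with two separate awk processes writing to the same file. The file names and input strings are hypothetical. The two processes run one after the other here, so this shows only the truncation effect, not the flushing or interleaving question raised above for concurrent writers.

```shell
#!/bin/sh
workdir=$(mktemp -d)   # hypothetical scratch directory

# With '>' each awk process truncates the file when it first opens it,
# so the second process wipes what the first one wrote.
echo "first"  | awk -v out="$workdir/A.txt" '{ print > out }'
echo "second" | awk -v out="$workdir/A.txt" '{ print > out }'
truncated=$(cat "$workdir/A.txt")

# With '>>' each awk process opens the file in append mode,
# so the output of both processes is kept.
echo "first"  | awk -v out="$workdir/B.txt" '{ print >> out }'
echo "second" | awk -v out="$workdir/B.txt" '{ print >> out }'
appended=$(cat "$workdir/B.txt")
```

Here A.txt ends up holding only the second process's line, while B.txt holds both.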

Re: GNU Parallel seems to drop data

2012-09-25 Thread Dirk Eddelbuettel
Ole Tange tange.dk> writes: > On Tue, Sep 25, 2012 at 1:50 PM, Dirk Eddelbuettel debian.org> > wrote: > > > Well a little "apt-get install gawk-doc" and two seconds of searching led > > to > > the '>>' operator to append to files ... and tada, it now works. > > Depending on how it appends th

Re: GNU Parallel seems to drop data

2012-09-25 Thread Ole Tange
On Tue, Sep 25, 2012 at 3:48 PM, Dirk Eddelbuettel wrote: > Yes, now that I am in the office and have my actual data, that verification is the > next step. I probably also need the '-k' switch [ does that have "significant" > performance implications? ] to ensure the order is the same which is importa

Re: GNU Parallel seems to drop data

2012-09-25 Thread Dirk Eddelbuettel
Ole Tange tange.dk> writes: > On Tue, Sep 25, 2012 at 3:48 PM, Dirk Eddelbuettel debian.org> > wrote: > But remember: -k only affects stdout/stderr. It does *not* affect > output into files (like you are doing). This may turn out to be a showstopper. Dang. Thanks also for pertinent info re c
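A small sketch of what -k does and does not cover, assuming a current GNU Parallel is installed. The sleep-based jobs are made up for illustration: the first job finishes last, yet -k prints its stdout first.

```shell
#!/bin/sh
# Skip quietly when GNU Parallel is not installed.
command -v parallel >/dev/null 2>&1 || exit 0

# Job 1 sleeps for 1 second and job 2 for 0 seconds, so job 2 finishes first.
# -k holds back job 2's stdout until job 1's output has been printed.
ordered=$(parallel -k 'sleep {}; echo {}' ::: 1 0)

# Anything a job writes into a file itself (for example with '>>')
# bypasses this buffering and lands in the file in completion order.
```

If the order of records inside the output files matters, the per-job files have to be merged in job order afterwards; -k alone does not give that.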

Re: GNU Parallel seems to drop data

2012-09-25 Thread Ole Tange
On Tue, Sep 25, 2012 at 6:15 PM, Dirk Eddelbuettel wrote: > BTW the sort/md5sum trick didn't work: > > $ parallel -k --tag 'sort {} | md5sum' :: dataSer/* > parallel: Input is read from the terminal. Only experts do this on purpose.\ > Press CTRL-D to exit. : > Any idea? Count your colons.

Re: GNU Parallel seems to drop data

2012-09-25 Thread Dirk Eddelbuettel
Ole Tange tange.dk> writes: > On Tue, Sep 25, 2012 at 6:15 PM, Dirk Eddelbuettel debian.org> > > Any idea? > > Count your colons. Fails either way: edd@max:/tmp/parallel/dataSerial$ ls -l total 19696 -rw-rw-r-- 1 edd edd 4027062 Sep 25 06:47 A.txt -rw-rw-r-- 1 edd edd 4032566 Sep 25 06:47 B.t

Re: GNU Parallel seems to drop data

2012-09-25 Thread Ole Tange
On Wed, Sep 26, 2012 at 12:08 AM, Dirk Eddelbuettel wrote: > Ole Tange tange.dk> writes: >> On Tue, Sep 25, 2012 at 6:15 PM, Dirk Eddelbuettel debian.org> >> > Any idea? >> >> Count your colons. > > Fails either way: > > edd@max:/tmp/parallel/dataSerial$ ls -l > total 19696 > -rw-rw-r-- 1 edd ed

Re: GNU Parallel seems to drop data

2012-09-25 Thread Dirk Eddelbuettel
Ole Tange tange.dk> writes: > Please paste the full output of: > > parallel --version Sure. I tried 0422 and 0822 which both fail, though in different ways. What I sent earlier used 0422. Dirk edd@max:/tmp/parallel$ cd dataSerial/ edd@max:/tmp/parallel/dataSerial$ parallel wc ::: A.txt B.tx

Re: GNU Parallel seems to drop data

2012-09-25 Thread Ole Tange
On Wed, Sep 26, 2012 at 3:23 AM, Dirk Eddelbuettel wrote: > Ole Tange tange.dk> writes: >> Please paste the full output of: >> >> parallel --version > > Sure. I tried 0422 and 0822 which both fail, though in different ways. What I > sent earlier used 0422. You did not say earlier that you are u

Re: GNU Parallel seems to drop data

2012-09-25 Thread Dirk Eddelbuettel
Ole Tange tange.dk> writes: > You did not say earlier that you are using --tollef: I didn't set it. It came with the package. I'll mention it to the maintainer. You never followed up on the "documentation in texi source but no info file" question I had in my first mail. Any hope that'll change

Re: .texi but not .info-file

2012-09-25 Thread Ole Tange
On Wed, Sep 26, 2012 at 4:14 AM, Dirk Eddelbuettel wrote: > You never followed up on the "documentation in texi source but no info file" > question I had in my first mail. Any hope that'll change? 1. It was in a PS. Never put anything you want people to read in a PS. 2. The subject of the emai