To whom it may engage...
This is an automated request, but not an unsolicited one. For
more information please visit http://gump.apache.org/nagged.html,
and/or contact the folk at gene...@gump.apache.org.
Project commons-vfs2-test has an issue affecting its community integration.
This i
To whom it may engage...
This is an automated request, but not an unsolicited one. For
more information please visit http://gump.apache.org/nagged.html,
and/or contact the folk at gene...@gump.apache.org.
Project commons-id has an issue affecting its community integration.
This issue af
To whom it may engage...
This is an automated request, but not an unsolicited one. For
more information please visit http://gump.apache.org/nagged.html,
and/or contact the folk at gene...@gump.apache.org.
Project commons-proxy-test has an issue affecting its community integration.
This
To whom it may engage...
This is an automated request, but not an unsolicited one. For
more information please visit http://gump.apache.org/nagged.html,
and/or contact the folk at gene...@gump.apache.org.
Project commons-configuration-test has an issue affecting its community
integrati
To whom it may engage...
This is an automated request, but not an unsolicited one. For
more information please visit http://gump.apache.org/nagged.html,
and/or contact the folk at gene...@gump.apache.org.
Project commons-scxml-test has an issue affecting its community integration.
This
On Mar 12, 2012, at 5:44 PM, sebb wrote:
> On 13 March 2012 00:29, Emmanuel Bourg wrote:
>> On 13/03/2012 01:25, sebb wrote:
>>
>>
>>> I'm concerned that the CSV code may grow and grow with private
>>> versions of code that could be provided by the JDK.
>>>
>>> By all means make sure the c
To whom it may engage...
This is an automated request, but not an unsolicited one. For
more information please visit http://gump.apache.org/nagged.html,
and/or contact the folk at gene...@gump.apache.org.
Project commons-digester3 has an issue affecting its community integration.
This i
On 13 March 2012 01:47, Niall Pemberton wrote:
> On Tue, Mar 13, 2012 at 12:29 AM, Emmanuel Bourg wrote:
>> On 13/03/2012 01:25, sebb wrote:
>>
>>
>>> I'm concerned that the CSV code may grow and grow with private
>>> versions of code that could be provided by the JDK.
>>>
>>> By all means mak
To whom it may engage...
This is an automated request, but not an unsolicited one. For
more information please visit http://gump.apache.org/nagged.html,
and/or contact the folk at gene...@gump.apache.org.
Project commons-io-test has an issue affecting its community integration.
This iss
On Tue, Mar 13, 2012 at 12:29 AM, Emmanuel Bourg wrote:
> On 13/03/2012 01:25, sebb wrote:
>
>
>> I'm concerned that the CSV code may grow and grow with private
>> versions of code that could be provided by the JDK.
>>
>> By all means make sure the code is efficient in the way it uses the
>> JD
On Mon, Mar 12, 2012 at 9:12 PM, sebb wrote:
> On 10 March 2012 01:02, sebb wrote:
> > This is a VOTE to release commons-parent 24 based on RC2.
> >
> > As agreed previously, commons parent release votes operate on lazy
> > consensus, i.e. the vote is assumed to have passed if 72 hours have
> >
On 10 March 2012 01:02, sebb wrote:
> This is a VOTE to release commons-parent 24 based on RC2.
>
> As agreed previously, commons parent release votes operate on lazy
> consensus, i.e. the vote is assumed to have passed if 72 hours have
> elapsed without an objection.
72 hours have now elapsed si
On 13 March 2012 00:29, Emmanuel Bourg wrote:
> On 13/03/2012 01:25, sebb wrote:
>
>
>> I'm concerned that the CSV code may grow and grow with private
>> versions of code that could be provided by the JDK.
>>
>> By all means make sure the code is efficient in the way it uses the
>> JDK classes,
On Mar 12, 2012, at 20:30, Emmanuel Bourg wrote:
> On 13/03/2012 01:25, sebb wrote:
>
>> I'm concerned that the CSV code may grow and grow with private
>> versions of code that could be provided by the JDK.
>>
>> By all means make sure the code is efficient in the way it uses the
>> JDK classe
On Mar 12, 2012, at 20:25, sebb wrote:
> On 13 March 2012 00:12, Emmanuel Bourg wrote:
>> I kept tickling ExtendedBufferedReader and I have some interesting results.
>>
>> First I tried to simplify it by extending java.io.LineNumberReader instead
>> of BufferedReader. The performance decreased b
On 13/03/2012 00:56, sebb wrote:
1. Do nothing and address it in the next release with the bean mapping.
Parsing the file would then look like this:
CSVFormat format = CSVFormat.DEFAULT.withType(Person.class);
for (Person person : format.parse(in)) {
persons.add(person);
}
Do
On 13/03/2012 01:25, sebb wrote:
I'm concerned that the CSV code may grow and grow with private
versions of code that could be provided by the JDK.
By all means make sure the code is efficient in the way it uses the
JDK classes, but I don't think we should be recoding standard classes.
I a
On 13 March 2012 00:12, Emmanuel Bourg wrote:
> I kept tickling ExtendedBufferedReader and I have some interesting results.
>
> First I tried to simplify it by extending java.io.LineNumberReader instead
> of BufferedReader. The performance decreased by 20%, probably because the
> class is synchron
I kept tickling ExtendedBufferedReader and I have some interesting results.
First I tried to simplify it by extending java.io.LineNumberReader
instead of BufferedReader. The performance decreased by 20%, probably
because the class is synchronized internally.
But wait, isn't BufferedReader als
On 12 March 2012 22:11, Emmanuel Bourg wrote:
> [csv] is missing some elements to ease the use of headers. I have no clear
> idea how to address this; here are my thoughts.
>
> Headers are used when the fields are accessed by the column name rather than
> by the index. This provides some flexib
A lot of bioinformaticians would love us if we added this!
On Mar 12, 2012 4:20 PM, "Thomas Neidhart"
wrote:
> Hi,
>
> on the weekend, I started to work on issue LANG-680
> (https://issues.apache.org/jira/browse/LANG-680), which is about adding
> support for finding the longest common substring o
[csv] is missing some elements to ease the use of headers. I have no
clear idea how to address this; here are my thoughts.
Headers are used when the fields are accessed by the column name rather
than by the index. This provides some flexibility because the input file
can be slightly modifie
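
A minimal sketch of the idea of accessing fields by column name rather than by index. This is not the [csv] API; the class HeaderIndex and its methods are hypothetical names chosen for illustration, and the header and records are assumed to be already split into String arrays.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper, not part of [csv]: maps column names to positions. */
final class HeaderIndex {

    private HeaderIndex() {
    }

    /** Builds a name-to-position map from the header row, keeping column order. */
    static Map<String, Integer> fromHeader(String[] header) {
        Map<String, Integer> index = new LinkedHashMap<>();
        for (int i = 0; i < header.length; i++) {
            index.put(header[i], i);
        }
        return index;
    }

    /** Looks up a field of a record by column name instead of by position. */
    static String get(String[] record, Map<String, Integer> index, String column) {
        Integer position = index.get(column);
        if (position == null) {
            throw new IllegalArgumentException("Unknown column: " + column);
        }
        return record[position];
    }
}
```

With such a map, code that calls HeaderIndex.get(record, index, "name") should keep working when the columns of the input file are reordered, because only the header row decides the position.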
On 12 March 2012 21:20, Thomas Neidhart wrote:
> Hi,
>
> on the weekend, I started to work on issue LANG-680
> (https://issues.apache.org/jira/browse/LANG-680), which is about adding
> support for finding the longest common substring of a set of Strings.
>
> Suffix Trees are a standard data struc
Hi,
on the weekend, I started to work on issue LANG-680
(https://issues.apache.org/jira/browse/LANG-680), which is about adding
support for finding the longest common substring of a set of Strings.
Suffix Trees are a standard data structure to efficiently solve this
problem, and I created a varia
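
For readers unfamiliar with the problem itself, here is a minimal sketch of the longest common substring of two Strings. It deliberately does not use a suffix tree and is not the LANG-680 code: it is the simple dynamic-programming approach, restricted to two inputs, and the class name LongestCommonSubstring is a hypothetical one.

```java
/** Hypothetical illustration: dynamic programming, not the suffix tree from LANG-680. */
final class LongestCommonSubstring {

    private LongestCommonSubstring() {
    }

    /** Returns the longest substring that occurs in both a and b ("" if none). */
    static String of(String a, String b) {
        // prev[j] = length of the common suffix of a[0..i-1) and b[0..j)
        int[] prev = new int[b.length() + 1];
        int best = 0;    // length of the longest match found so far
        int endInA = 0;  // exclusive end index of that match in a
        for (int i = 1; i <= a.length(); i++) {
            int[] cur = new int[b.length() + 1];
            for (int j = 1; j <= b.length(); j++) {
                if (a.charAt(i - 1) == b.charAt(j - 1)) {
                    cur[j] = prev[j - 1] + 1;
                    if (cur[j] > best) {
                        best = cur[j];
                        endInA = i;
                    }
                }
            }
            prev = cur;
        }
        return a.substring(endInA - best, endInA);
    }
}
```

This runs in time proportional to the product of the two lengths, which is the cost a suffix tree is meant to avoid, especially for a whole set of Strings.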
On 12/03/2012 20:38, Sébastien Brisard wrote:
> Hello,
Hi Sébastien,
> can we commit now that 3.0 is out?
Sure!
> Do we need to update the pom.xml to 3.1-SNAPSHOT?
I think Sebb did it today.
We should also close the Jira issues that have been solved as of 3.0.
For newcomers, our policy on
Hello,
can we commit now that 3.0 is out?
Do we need to update the pom.xml to 3.1-SNAPSHOT?
Best regards,
Sébastien
-
To unsubscribe, e-mail: dev-unsubscr...@commons.apache.org
For additional commands, e-mail: dev-h...@commons.ap
On 12/03/2012 16:50, sebb wrote:
> In addition to Javadoc for 3.0 and 2.2, the website still offers the
> following:
>
> Javadoc (2.1 release)
> Javadoc (2.0 release)
> Javadoc (1.2 release)
> Javadoc (1.1 release)
> Javadoc (1.0 release)
>
> This seems rather unnecessary. Surely we should on
To whom it may engage...
This is an automated request, but not an unsolicited one. For
more information please visit http://gump.apache.org/nagged.html,
and/or contact the folk at gene...@gump.apache.org.
Project commons-id has an issue affecting its community integration.
This issue af
On 12/03/2012 18:51, Benedikt Ritter wrote:
you wrote about the printer ;)
Yes, because the line separator of CSVFormat is only used there.
Usually you have to be permissive in what you read, but strict in what
you write.
Emmanuel Bourg
On 12/03/2012 18:52, sebb wrote:
The only possible ambiguity is whether the file uses CR, LF, or CRLF.
Yes that's what I mean by knowing in advance.
The line separator is not obvious when you open the file in an editor.
That's not the case for the delimiter.
Emmanuel Bourg
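
A permissive reader can sidestep the CR / LF / CRLF ambiguity by accepting all three terminators. The following is only a sketch of that idea, working on text already in memory; LineEndings and its methods are hypothetical names and are not taken from CSVLexer.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical helper: treats CR, LF and CRLF uniformly as line terminators. */
final class LineEndings {

    private LineEndings() {
    }

    /** Returns 2 for CRLF, 1 for a lone CR or LF, and 0 if no terminator starts at i. */
    static int terminatorLength(CharSequence text, int i) {
        char c = text.charAt(i);
        if (c == '\r') {
            boolean crlf = i + 1 < text.length() && text.charAt(i + 1) == '\n';
            return crlf ? 2 : 1;
        }
        return c == '\n' ? 1 : 0;
    }

    /** Splits text into lines, whatever mix of terminators it uses. */
    static List<String> splitLines(CharSequence text) {
        List<String> lines = new ArrayList<>();
        int start = 0;
        int i = 0;
        while (i < text.length()) {
            int n = terminatorLength(text, i);
            if (n > 0) {
                lines.add(text.subSequence(start, i).toString());
                i += n;
                start = i;
            } else {
                i++;
            }
        }
        if (start < text.length()) {
            lines.add(text.subSequence(start, text.length()).toString());
        }
        return lines;
    }
}
```

Note that this sketch ignores quoting: in real CSV input a line break inside a quoted field is data, not a record separator.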
On 12 March 2012 17:38, Emmanuel Bourg wrote:
> On 12/03/2012 18:31, Benedikt Ritter wrote:
>
>
>> I'm not sure if I got you right. You have to pass a CSVFormat if you
>> want to construct a CSVLexer(), so we could use the lexer's internal
>> CSVFormat.
>
>
> Yes that's what I understood, when
On 12 March 2012 18:38, Emmanuel Bourg wrote:
> On 12/03/2012 18:31, Benedikt Ritter wrote:
>
>
>> I'm not sure if I got you right. You have to pass a CSVFormat if you
>> want to construct a CSVLexer(), so we could use the lexer's internal
>> CSVFormat.
>
>
> Yes that's what I understood, when
On 12/03/2012 18:31, Benedikt Ritter wrote:
I'm not sure if I got you right. You have to pass a CSVFormat if you
want to construct a CSVLexer(), so we could use the lexer's internal
CSVFormat.
Yes that's what I understood, when I mention the parser it includes the
lexer as well.
I think
On 12 March 2012 16:54, Christian Grobmeier wrote:
> On Mon, Mar 12, 2012 at 5:48 PM, sebb wrote:
>> On 12 March 2012 08:45, wrote:
>>> Author: ebourg
>>> Date: Mon Mar 12 08:45:34 2012
>>> New Revision: 1299580
>>>
>>> URL: http://svn.apache.org/viewvc?rev=1299580&view=rev
>>> Log:
>>> Seriali
On 12 March 2012 18:24, Emmanuel Bourg wrote:
> On 12/03/2012 18:17, Benedikt Ritter wrote:
>
>
>> this method assumes that a line separator will always be "\r" or
>> "\r\n". This is true for the pre-configured CSVFormats EXCEL, TDF and
>> MYSQL. I'm not a pro when it comes to file encoding,
On 12/03/2012 18:17, Benedikt Ritter wrote:
this method assumes that a line separator will always be "\r" or
"\r\n". This is true for the pre-configured CSVFormats EXCEL, TDF and
MYSQL. I'm not a pro when it comes to file encoding, but isn't there
the possibility that new encodings will have
Hi,
while looking for potential performance optimization I came across
CSVLexer.isEndOfLine(int c). Here is the source:
private boolean isEndOfLine(int c) throws IOException {
// check if we have \r\n...
if (c == '\r' && in.lookAhead() == '\n') {
// note: does not
Yes this is what I mean. It might be worth a shot. Folks who specialize
in parsing have spent much time on these libraries. It would make sense
that they are quite fast. It gets us out of the parsing business.
On Mar 12, 2012 12:41 PM, "Emmanuel Bourg" wrote:
> Le 12/03/2012 17:28, James Carm
On Mon, Mar 12, 2012 at 5:48 PM, sebb wrote:
> On 12 March 2012 08:45, wrote:
>> Author: ebourg
>> Date: Mon Mar 12 08:45:34 2012
>> New Revision: 1299580
>>
>> URL: http://svn.apache.org/viewvc?rev=1299580&view=rev
>> Log:
>> Serialization test for CSVFormat
>
> Note: this does not test seriali
On Mon, Mar 12, 2012 at 5:41 PM, Emmanuel Bourg wrote:
> On 12/03/2012 17:28, James Carman wrote:
>
>> Would one of the parser libraries not work here?
>
>
> Are you thinking of something like JavaCC or ANTLR? Not sure it'll be more
> efficient than a handcrafted parser. The CSV format is simple enoug
On Mon, Mar 12, 2012 at 5:30 PM, sebb wrote:
> On 12 March 2012 16:25, James Carman wrote:
>> We could say we support short-term storage (or transmission-only) only when
>> it comes to serialization. That would help eliminate some of the burden
>
> Perhaps, but why take on any burden without a go
On 12 March 2012 08:45, wrote:
> Author: ebourg
> Date: Mon Mar 12 08:45:34 2012
> New Revision: 1299580
>
> URL: http://svn.apache.org/viewvc?rev=1299580&view=rev
> Log:
> Serialization test for CSVFormat
Note: this does not test serialisation between versions or JDKs as the
same JVM is used fo
On 12/03/2012 17:28, James Carman wrote:
Would one of the parser libraries not work here?
Are you thinking of something like JavaCC or ANTLR? Not sure it'll be more
efficient than a handcrafted parser. The CSV format is simple enough to
do it manually.
Emmanuel Bourg
On 12 March 2012 17:22, Emmanuel Bourg wrote:
> On 12/03/2012 17:03, Benedikt Ritter wrote:
>
>
>> The whole logic behind CSVLexer.nextToken() is very hard to read
>> (IMHO). Maybe some refactoring would help to make it easier to
>> identify bottlenecks?
>
>
> Yes I started investigating in
On 12 March 2012 16:25, James Carman wrote:
> We could say we support short-term storage (or transmission-only) only when
> it comes to serialization. That would help eliminate some of the burden
Perhaps, but why take on any burden without a good use case?
Any form of serialisation adds extra te
Would one of the parser libraries not work here?
On Mar 12, 2012 12:22 PM, "Emmanuel Bourg" wrote:
> On 12/03/2012 17:03, Benedikt Ritter wrote:
>
> The whole logic behind CSVLexer.nextToken() is very hard to read
>> (IMHO). Maybe some refactoring would help to make it easier to
>> identify
We could say we support short-term storage (or transmission-only) only when
it comes to serialization. That would help eliminate some of the burden
On Mar 12, 2012 11:23 AM, "sebb" wrote:
> On 12 March 2012 09:02, Jörg Schaible wrote:
> > Emmanuel Bourg wrote:
> >
> >> Le 12/03/2012 00:16, Bened
On 12/03/2012 17:03, Benedikt Ritter wrote:
The whole logic behind CSVLexer.nextToken() is very hard to read
(IMHO). Maybe some refactoring would help to make it easier to
identify bottlenecks?
Yes I started investigating in this direction. I filed a few bugs
regarding the behavior of th
On 12 March 2012 11:31, Emmanuel Bourg wrote:
> I have identified the performance killer, it's the ExtendedBufferedReader.
> It implements a complex logic to fetch one character ahead, but this extra
> character is rarely used. I have implemented a simpler look ahead using
> mark/reset as suggest
In addition to Javadoc for 3.0 and 2.2, the website still offers the following:
Javadoc (2.1 release)
Javadoc (2.0 release)
Javadoc (1.2 release)
Javadoc (1.1 release)
Javadoc (1.0 release)
This seems rather unnecessary. Surely we should only offer Javadoc for
current releases?
2012/3/12 sebb :
> On 12 March 2012 15:08, Thomas Neidhart wrote:
>> 2012/3/12 Sébastien Brisard
>>
>>> Hi,
>>> I was unable to download the binaries of this release from the
>>> website. Downloading the source works fine. Maybe I did something
>>> wrong, could anyone else check?
>>>
>>
>> Did no
On 12/03/2012 16:44, sebb wrote:
Java has a PushbackReader class - could that not be used?
I considered it, but it doesn't mix well with line reading. The
mark/reset solution is really simple and efficient.
Emmanuel Bourg
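
A minimal sketch of a one-character look-ahead built on mark/reset, as discussed in this thread. It is not the actual ExtendedBufferedReader change; the class name LookAheadReader is hypothetical, and only standard java.io.BufferedReader methods (mark, read, reset) are used.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;

/** Hypothetical illustration, not the real ExtendedBufferedReader from [csv]. */
final class LookAheadReader extends BufferedReader {

    LookAheadReader(Reader in) {
        super(in);
    }

    /** Returns the next character without consuming it, or -1 at end of stream. */
    int lookAhead() throws IOException {
        mark(1);         // remember the position; one character of read-ahead suffices
        int c = read();  // read a single character (or -1 at end of stream)
        reset();         // rewind to the marked position
        return c;
    }
}
```

Because the subclass adds no buffering of its own, readLine() and the other inherited methods keep working unchanged, which is the property that makes this simpler than a PushbackReader when line reading is also needed.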
On 12 March 2012 10:31, Emmanuel Bourg wrote:
> I have identified the performance killer, it's the ExtendedBufferedReader.
> It implements a complex logic to fetch one character ahead, but this extra
> character is rarely used. I have implemented a simpler look ahead using
> mark/reset as suggeste
On Mon, Mar 12, 2012 at 11:23 AM, sebb wrote:
> On 12 March 2012 09:02, Jörg Schaible wrote:
> > Emmanuel Bourg wrote:
> >
> >> On 12/03/2012 00:16, Benedikt Ritter wrote:
> >>
> >>> I just saw that CSVFormat implements Serializable, but neither does it
> >>> provide a no-arg constructor nor
On 12 March 2012 15:08, Thomas Neidhart wrote:
> 2012/3/12 Sébastien Brisard
>
>> Hi,
>> I was unable to download the binaries of this release from the
>> website. Downloading the source works fine. Maybe I did something
>> wrong, could anyone else check?
>>
>
> Did not work for me either, the reaso
On 12 March 2012 09:02, Jörg Schaible wrote:
> Emmanuel Bourg wrote:
>
>> On 12/03/2012 00:16, Benedikt Ritter wrote:
>>
>>> I just saw that CSVFormat implements Serializable, but neither does it
>>> provide a no-arg constructor nor any of the special serialization
>>> methods (and it has no cu
2012/3/12 Sébastien Brisard
> Hi,
> I was unable to download the binaries of this release from the
> website. Downloading the source works fine. Maybe I did something
> wrong, could anyone else check?
>
Did not work for me either. The link is wrong: the -bin suffix
is missing, it shou
On Mon, Mar 12, 2012 at 10:47 AM, Benedikt Ritter wrote:
> On 12 March 2012 15:39, Gary Gregory wrote:
> > On Mon, Mar 12, 2012 at 10:17 AM, Benedikt Ritter <
> benerit...@googlemail.com
> >> wrote:
> >
> >> Hey Gary,
> >>
> >> thanks for the hint. Should I just send patches for minor changes l
On 12 March 2012 15:39, Gary Gregory wrote:
> On Mon, Mar 12, 2012 at 10:17 AM, Benedikt Ritter wrote:
>
>> Hey Gary,
>>
>> thanks for the hint. Should I just send patches for minor changes like
>> that to the ML (plain text, not as attachment of course ;)?
>>
>
> Hm, I thought a committer was s
On Mon, Mar 12, 2012 at 10:17 AM, Benedikt Ritter wrote:
> Hey Gary,
>
> thanks for the hint. Should I just send patches for minor changes like
> that to the ML (plain text, not as attachment of course ;)?
>
Hm, I thought a committer was submitting these... JIRA is the way to submit
code indeed.
Hey Gary,
thanks for the hint. Should I just send patches for minor changes like
that to the ML (plain text, not as attachment of course ;)?
Benedikt
On 12 March 2012 15:03, Gary Gregory wrote:
> I do not think we need tickets for this kind of change.
>
> Gary
>
> On Mar 12, 2012, at 9:59,
I do not think we need tickets for this kind of change.
Gary
On Mar 12, 2012, at 9:59, "Benedikt Ritter (Created) (JIRA)"
wrote:
> Replace while(true)-loop in CSVParser.getRecord() with do-while-loop
>
>
>
Hi,
I was unable to download the binaries of this release from the
website. Downloading the source works fine. Maybe I did something
wrong, could anyone else check?
Sébastien
I have identified the performance killer, it's the
ExtendedBufferedReader. It implements a complex logic to fetch one
character ahead, but this extra character is rarely used. I have
implemented a simpler look ahead using mark/reset as suggested by Bob
Smith in CSV-42 and the performance improv
Emmanuel Bourg wrote:
> On 12/03/2012 00:16, Benedikt Ritter wrote:
>
>> I just saw that CSVFormat implements Serializable, but neither does it
>> provide a no-arg constructor nor any of the special serialization
>> methods (and it has no custom serialUID). Is this the way it is
>> supposed to
On 12/03/2012 00:16, Benedikt Ritter wrote:
I just saw that CSVFormat implements Serializable, but neither does it
provide a no-arg constructor nor any of the special serialization
methods (and it has no custom serialUID). Is this the way it is
supposed to be?
I wrote a test and it seems th
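
On the question above: Java serialization does not require a no-arg constructor on the Serializable class itself, only on its closest non-serializable superclass (here java.lang.Object). The sketch below shows a same-JVM round trip under that assumption. SimpleFormat is a hypothetical stand-in, not CSVFormat, and the test says nothing about compatibility between versions or JDKs.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;

/** Hypothetical stand-in for an immutable format class without a no-arg constructor. */
final class SimpleFormat implements Serializable {

    // Explicit UID so that recompiling the class does not silently change it.
    private static final long serialVersionUID = 1L;

    private final char delimiter;
    private final String lineSeparator;

    SimpleFormat(char delimiter, String lineSeparator) {
        this.delimiter = delimiter;
        this.lineSeparator = lineSeparator;
    }

    char getDelimiter() {
        return delimiter;
    }

    String getLineSeparator() {
        return lineSeparator;
    }
}

/** Serializes a value to a byte array and reads it back within the same JVM. */
final class SerializationRoundTrip {

    private SerializationRoundTrip() {
    }

    static <T extends Serializable> T roundTrip(T value)
            throws IOException, ClassNotFoundException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
        }
        try (ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
            @SuppressWarnings("unchecked")
            T copy = (T) in.readObject();
            return copy;
        }
    }
}
```

The final fields are restored by the serialization mechanism without running the two-argument constructor, so any validation done there would be bypassed on deserialization.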