Re: [Text] JaccardSimilarity

2019-03-07 Thread Alex Herbert
> On 8 Mar 2019, at 00:01, Bruno P. Kinoshita wrote: > >> I’d favour dropping the round and adding it to the Changes.xml via a Jira >> ticket so it is noted if someone upgrades. They can always restore >> functionality to as-it-was by doing a round on the output of the class. > +1 >> I’v

[dbutils] Taking advantage of database cursors in QueryRunner.

2019-03-07 Thread Robert Huffman
My apologies, but this is a repost. I failed to include a subject the first time. I like DbUtils QueryRunner, but it forces me to read the entire ResultSet into memory rather than allowing me to use cursors. So I developed a little library to do that, which I called dbstream. It is on GitHub:

Re: [Text] JaccardSimilarity

2019-03-07 Thread Bruno P. Kinoshita
>I’d favour dropping the round and adding it to the Changes.xml via a Jira >ticket so it is noted if someone upgrades. They can always restore >functionality to as-it-was by doing a round on the output of the class.  +1 >I’ve already made the test using the python distance.jaccard function fro

Re: [Text] JaccardSimilarity

2019-03-07 Thread Alex Herbert
Hi Bruno, > On 7 Mar 2019, at 21:18, Bruno P. Kinoshita wrote: > > Hi Alex, > Can't recall why it was done that way. When the initial code for the edit > distances was created, some Java libraries like Simmetrics, > java-string-similarity, Lucene, and also R/Python code were used to verify >

Re: [Text] JaccardSimilarity

2019-03-07 Thread Bruno P. Kinoshita
Hi Alex, Can't recall why it was done that way. When the initial code for the edit distances was created, some Java libraries like Simmetrics, java-string-similarity, Lucene, and also R/Python code were used to verify the output of the edit distances. Maybe we used Math.round just to get a test

[Text] JaccardSimilarity

2019-03-07 Thread Alex Herbert
A quick question about the JaccardSimilarity class: Q. Why does it round the similarity to 2 decimal places? This is not documented. It is also done in the complementary JaccardDistance class. Looking at the history in git, it seems to have always been that way. First commit was 2016-11-27.
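
To make the rounding question concrete, here is a minimal sketch of what is being discussed. It is not the Commons Text source: the class name JaccardSketch, the method names, and the handling of empty input are assumptions made only for illustration. It computes an unrounded Jaccard similarity over the sets of characters in two inputs, and shows how a caller could apply the 2-decimal rounding to the output if the old behaviour is wanted.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical illustration only; not the Commons Text implementation.
final class JaccardSketch {

    private JaccardSketch() {
    }

    /** Unrounded Jaccard similarity: |A intersect B| / |A union B| over character sets. */
    static double similarity(CharSequence left, CharSequence right) {
        Set<Character> a = toCharSet(left);
        Set<Character> b = toCharSet(right);

        Set<Character> union = new HashSet<>(a);
        union.addAll(b);
        if (union.isEmpty()) {
            // Assumption of this sketch: two empty inputs score 0.
            return 0d;
        }

        Set<Character> intersection = new HashSet<>(a);
        intersection.retainAll(b);

        return (double) intersection.size() / union.size();
    }

    /** Rounds to 2 decimal places, as a caller could do on the output. */
    static double roundTo2(double value) {
        return Math.round(value * 100d) / 100d;
    }

    /** Collects the distinct characters of the input into a set. */
    private static Set<Character> toCharSet(CharSequence text) {
        Set<Character> set = new HashSet<>();
        for (int i = 0; i < text.length(); i++) {
            set.add(text.charAt(i));
        }
        return set;
    }

    public static void main(String[] args) {
        double raw = similarity("night", "nacht"); // 3 shared of 7 distinct characters
        System.out.println(raw);
        System.out.println(roundTo2(raw));
    }
}
```

With rounding left to the caller, the class would return the full-precision value, and anyone who depends on the 2-decimal result can wrap the call as in roundTo2 above.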