-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/17058/#review32202
-----------------------------------------------------------


Here are some initial comments on the changes made since my last review.


src/java/datafu/pig/sampling/WeightedReservoirSampleWithExpJump.java
<https://reviews.apache.org/r/17058/#comment60929>

    You also need to implement new versions of Intermediate and Final that set 
the reservoir to InverseWeightJumpSampleReservoir.  Currently they use the 
default Reservoir, which does not perform the exponential jumps, so the 
algebraic version doesn't behave as intended.  I confirmed this by stepping 
through the code in the debugger.
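
    A rough sketch of the pattern I have in mind, in case it helps.  All class 
and method names below are hypothetical stand-ins, not the actual DataFu 
classes, and I am assuming the stages build their reservoir through something 
like an overridable factory method.  The point is only that each algebraic 
stage has to choose the jump reservoir itself; otherwise it silently inherits 
the default one.

```java
// Hypothetical stand-ins only; these are not the actual DataFu classes.

// Plays the role of the default reservoir: no exponential jumps.
class ReservoirSketch {
    private final int capacity;

    ReservoirSketch(int capacity) {
        this.capacity = capacity;
    }

    int capacity() {
        return capacity;
    }

    boolean usesExponentialJumps() {
        return false;
    }
}

// Plays the role of InverseWeightJumpSampleReservoir.
class JumpReservoirSketch extends ReservoirSketch {
    JumpReservoirSketch(int capacity) {
        super(capacity);
    }

    @Override
    boolean usesExponentialJumps() {
        return true;
    }
}

// Plays the role of an inherited Intermediate/Final stage that lazily
// builds its reservoir through an overridable factory method (an assumption).
class StageSketch {
    private ReservoirSketch reservoir;

    protected ReservoirSketch createReservoir(int capacity) {
        return new ReservoirSketch(capacity);
    }

    ReservoirSketch getReservoir(int capacity) {
        if (reservoir == null) {
            reservoir = createReservoir(capacity);
        }
        return reservoir;
    }
}

// New Intermediate stage: overrides the factory so the jump reservoir is used.
class JumpIntermediateSketch extends StageSketch {
    @Override
    protected ReservoirSketch createReservoir(int capacity) {
        return new JumpReservoirSketch(capacity);
    }
}

// New Final stage: same override as the Intermediate stage.
class JumpFinalSketch extends StageSketch {
    @Override
    protected ReservoirSketch createReservoir(int capacity) {
        return new JumpReservoirSketch(capacity);
    }
}

class AlgebraicStagesDemo {
    public static void main(String[] args) {
        System.out.println("inherited stage jumps: "
            + new StageSketch().getReservoir(10).usesExponentialJumps());
        System.out.println("new Intermediate jumps: "
            + new JumpIntermediateSketch().getReservoir(10).usesExponentialJumps());
        System.out.println("new Final jumps: "
            + new JumpFinalSketch().getReservoir(10).usesExponentialJumps());
    }
}
```

    A unit test that checks which reservoir type each stage ends up holding 
would catch this without needing the debugger.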



src/java/datafu/pig/sampling/WeightedReservoirSampleWithExpJump.java
<https://reviews.apache.org/r/17058/#comment60918>

    Minor comment: declaring 'r' does not appear to be needed.  Math.random() 
could be called directly on line 146.



src/java/datafu/pig/stats/entropy/stream/ChaoShenEntropyEstimator.java
<https://reviews.apache.org/r/17058/#comment60916>

    I think everything in the datafu.pig.stats.entropy.stream package should be 
moved to datafu.pig.stats.entropy. 


- Matthew Hayes


On Jan. 17, 2014, 6:57 p.m., Matthew Hayes wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/17058/
> -----------------------------------------------------------
> 
> (Updated Jan. 17, 2014, 6:57 p.m.)
> 
> 
> Review request for DataFu.
> 
> 
> Repository: datafu
> 
> 
> Description
> -------
> 
> This is Jian Wang's implementation of the entropy and weighted sampling 
> algorithms in DATAFU-2.  I generated the diff by applying the patches in a 
> branch and squashing into one commit.
> 
> 
> Diffs
> -----
> 
>   src/java/datafu/pig/sampling/ReservoirSample.java 403bfaf06402c8dac2a70c0133ee955c625cb06e 
>   src/java/datafu/pig/sampling/ScoredSampleReservoir.java PRE-CREATION 
>   src/java/datafu/pig/sampling/ScoredTuple.java 293cf3d6fcfa041287258b94a1f686a2a77ab4ce 
>   src/java/datafu/pig/sampling/WeightedReservoirSample.java PRE-CREATION 
>   src/java/datafu/pig/sampling/WeightedReservoirSampleWithExpJump.java PRE-CREATION 
>   src/java/datafu/pig/stats/entropy/Entropy.java PRE-CREATION 
>   src/java/datafu/pig/stats/entropy/EntropyUtil.java PRE-CREATION 
>   src/java/datafu/pig/stats/entropy/stream/ChaoShenEntropyEstimator.java PRE-CREATION 
>   src/java/datafu/pig/stats/entropy/stream/EmpiricalEntropyEstimator.java PRE-CREATION 
>   src/java/datafu/pig/stats/entropy/stream/EntropyEstimator.java PRE-CREATION 
>   src/java/datafu/pig/stats/entropy/stream/StreamingCondEntropy.java PRE-CREATION 
>   src/java/datafu/pig/stats/entropy/stream/StreamingEntropy.java PRE-CREATION 
>   test/pig/datafu/test/pig/sampling/WeightedReservoirSampleWithExpJumpTests.java PRE-CREATION 
>   test/pig/datafu/test/pig/sampling/WeightedReservoirSamplingTests.java PRE-CREATION 
>   test/pig/datafu/test/pig/stats/entropy/AbstractEntropyTests.java PRE-CREATION 
>   test/pig/datafu/test/pig/stats/entropy/EntropyTests.java PRE-CREATION 
>   test/pig/datafu/test/pig/stats/entropy/StreamingChaoShenEntropyTests.java PRE-CREATION 
>   test/pig/datafu/test/pig/stats/entropy/StreamingEmpiricalCondEntropyTests.java PRE-CREATION 
>   test/pig/datafu/test/pig/stats/entropy/StreamingEmpiricalEntropyTests.java PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/17058/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Matthew Hayes
> 
>
