Hello Joel,
I used a bigger trainingSet of around 200K documents (Amazon reviews) and it
worked out well. I verified the extracted feature terms, and the classify
function was able to output the correct probability of reviews being negative
or positive. Big thanks for adding this.
I wonder what you have ne
Got it. Thanks, Joel.
On Thu, Feb 9, 2017 at 11:17 AM, Susheel Kumar
wrote:
> I increased from 250 to 2500 and from 100 to 1000 when I didn't get the
> expected result. Let me add more examples.
>
> Thanks,
> Susheel
>
> On Thu, Feb 9, 2017 at 11:03 AM, Joel Bernstein
> wrote:
>
>> A few things that I see
I increased from 250 to 2500 and from 100 to 1000 when I didn't get the
expected result. Let me add more examples.
Thanks,
Susheel
On Thu, Feb 9, 2017 at 11:03 AM, Joel Bernstein wrote:
> A few things that I see right off:
>
> 1) 2500 terms is too many. I was testing with 100-250 terms
> 2) 1000 iteration
Also you can see in the final iteration of the model that there are 8 true
positives and 8 false positives. So this model classifies everything as
positive. At that point you know that it's not a good model.
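
The check described above can be sketched in a few lines of Python. This is
only an illustration, not part of Solr: the function name confusion_summary
is hypothetical, and the zero counts for true negatives and false negatives
are an assumption that follows from "classifies everything as positive"
rather than numbers quoted from the model.

```python
def confusion_summary(tp, fp, tn, fn):
    """Return precision, recall, and whether every example was predicted positive."""
    predicted_positive = tp + fp
    predicted_negative = tn + fn
    actual_positive = tp + fn
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / predicted_positive if predicted_positive else 0.0
    recall = tp / actual_positive if actual_positive else 0.0
    return {
        "precision": precision,
        "recall": recall,
        # True when the model never predicts the negative class.
        "all_positive": predicted_negative == 0 and predicted_positive > 0,
    }


# 8 true positives and 8 false positives come from the thread;
# tn=0 and fn=0 are assumed (the model never predicts negative).
summary = confusion_summary(tp=8, fp=8, tn=0, fn=0)
print(summary)
```

With these counts, recall is perfect but precision is only 0.5, which is
what a model that labels every document positive would score on a balanced
set of 16 examples.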
Joel Bernstein
http://joelsolr.blogspot.com/
On Thu, Feb 9, 2017 at 11:03 AM, Joel Bernstein w
A few things that I see right off:
1) 2500 terms is too many. I was testing with 100-250 terms
2) 1000 iterations is too high. If the model hasn't converged by 100
iterations it's likely not going to converge.
3) You're going to need more examples. You may want to run features first
and see what it
Hello Joel,
Here is the final iteration in json format.
https://www.dropbox.com/s/g3a3606ms6cu8q4/final_iteration.json?dl=0
Below is the expression used
update(models,
       batchSize="50",
       train(trainingSet,
             features(trainingSet,
Can you post the final iteration of the model?
Also the expression you used to train the model?
How much training data do you have? How many positive examples and negative
examples?
Joel Bernstein
http://joelsolr.blogspot.com/
On Tue, Feb 7, 2017 at 2:14 PM, Susheel Kumar wrote:
> Hello,
>
>
Hello,
I am trying to follow http://joelsolr.blogspot.com/ to see if we can
classify positive & negative feedback using streaming expressions. Everything
works except the end result: the probability_d value from the classify
expression is similar for positive and negative feedback. See below
What I may b