I've only tested with the training data in its own collection, but it was
designed for multiple training sets in the same collection.

I suspect your training set is too small to get a reliable model from.
The training sets we tested with were considerably larger.

All the idfs_ds values being the same seems odd though. The idfs_ds in
particular were designed to be accurate when there are multiple training
sets in the same collection.
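
For reference, here is a rough sketch of the single-collection setup that
was reported to work, adapted from the expression quoted below. The
collection name threatTraining is hypothetical and stands in for a
collection that holds only the tagged training documents. Every other
parameter is copied unchanged from the quoted expression, so treat this as
an illustration rather than a verified configuration:

    update(models2, batchSize="50",
           train(threatTraining,
                 features(threatTraining,
                          q="*:*",
                          featureSet="threatFeatures3",
                          field="ClusterText",
                          outcome="out_i",
                          positiveLabel=1,
                          numTerms=250),
                 q="*:*",
                 name="threatModel3",
                 field="ClusterText",
                 outcome="out_i",
                 maxIterations="100"))

The only changes are the collection name in train() and features() and the
query, which becomes q="*:*" because every document in the dedicated
collection is part of the training set.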

Joel Bernstein
http://joelsolr.blogspot.com/

On Mon, Mar 20, 2017 at 5:41 PM, Joe Obernberger <
joseph.obernber...@gmail.com> wrote:

> If I put the training data into its own collection and use q="*:*", then
> it works correctly.  Is that a requirement?
> Thank you.
>
> -Joe
>
>
>
> On 3/20/2017 3:47 PM, Joe Obernberger wrote:
>
>> I'm trying to build a model using tweets.  I've manually tagged 30 tweets
>> as threatening, and 50 random tweets as non-threatening.  When I build the
>> model with:
>>
>> update(models2, batchSize="50",
>>              train(UNCLASS,
>>                       features(UNCLASS,
>>                                      q="ProfileID:PROFCLUST1",
>>                                      featureSet="threatFeatures3",
>>                                      field="ClusterText",
>>                                      outcome="out_i",
>>                                      positiveLabel=1,
>>                                      numTerms=250),
>>                       q="ProfileID:PROFCLUST1",
>>                       name="threatModel3",
>>                       field="ClusterText",
>>                       outcome="out_i",
>>                       maxIterations="100"))
>>
>> It appears to work, but all the idfs_ds values are identical. The
>> terms_ss values look reasonable, but nearly all the weights_ds are 1.0.
>> The out_i value is either -1 for non-threatening tweets or +1 for
>> threatening tweets.  I'm trying to follow along with Joel Bernstein's
>> excellent post here:
>> http://joelsolr.blogspot.com/2017/01/deploying-ai-alerting-system-with-solrs.html
>>
>> Tips?
>>
>> Thank you!
>>
>> -Joe
>>
>>
>
