problem setting default params for /browse (velocity) queries

2016-09-20 Thread Matt Work Coarr
Good morning. I'm attempting to set some default query parameters via
solrconfig.xml, but they are not taking effect. I'm using the /browse
interface (i.e. the Velocity templates).

To keep it simple, let's start with "fl".  When I specify fl as a url
parameter, it does take effect.  But when I put it in solrconfig.xml, it's
not being used.

Right now, I'm putting this in a new init params element:


  
field1,field2,field3
  



Then I tell the /browse request handler to add it to the list of useParams
(I've tried the beginning and end of the list):



I have also tried adding "useParams=myParams" in the url.

FYI, I have been including wt=xml in my url so that I eliminate the
possibility that the velocity templates are causing a problem.
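
For reference, the URL being tested can be sketched like this. It is a minimal illustration, not taken from the original message: the host, the core name, and the q=*:* query are placeholder assumptions, and only useParams=myParams and wt=xml come from the description above.

```python
from urllib.parse import urlencode


def build_browse_url(solr_base, core, param_set, extra=None):
    """Build a /browse URL that names a paramset and forces XML output."""
    # q=*:* is an assumed placeholder query; useParams and wt=xml are the
    # two request parameters mentioned in the message.
    params = {"q": "*:*", "useParams": param_set, "wt": "xml"}
    if extra:
        params.update(extra)
    return f"{solr_base}/{core}/browse?{urlencode(params)}"


# Hypothetical host and core name, for illustration only.
url = build_browse_url("http://localhost:8983/solr", "mycore", "myParams")
print(url)
```

Passing extra={"fl": "field1,field2,field3"} gives the variant where fl is supplied directly on the URL, which is the case that did take effect.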

Any ideas?

Thanks!
Matt


Re: problem setting default params for /browse (velocity) queries

2016-09-20 Thread Matt Work Coarr
Thanks Erick.

The pre-existing request handler for /browse (the velocity template driven
interface) already had this:



I just added an entry for "myParams" and added the initParams element in
solrconfig.xml.

I also tried adding an initParams element with a path of /browse (similar to
how the existing initParams elements were set up).

I was wondering where the paramsets on the /browse request handler were
coming from.  Now that I know to look for params.json, I see a copy in my
core's conf directory and it has "query", "facets", and "velocity" defined.

I'm going to try setting via the parameters api and see what happens...

Thanks for the pointers.

Matt


Re: problem setting default params for /browse (velocity) queries

2016-09-20 Thread Matt Work Coarr
Awesome! Thanks Erik and Erick!!

To close the loop on this, I was able to create a paramset via the REST API
and then use it in a query via ?useParams=myParams and it's working!!

Hopefully this information will help someone else...

My dataset has some text fields that should be used in more-like-this. It
also has some machine learning classifier score fields, with values from 0
to 1, that I want to facet on by score range.
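
As a rough illustration of what range faceting over those scores produces, the sketch below lists the buckets for the start/end/gap values used in the paramset that follows (0.5, 1.0, 0.1). It is plain arithmetic, not a Solr call, and it assumes the gap divides the range evenly, as it does here.

```python
from decimal import Decimal


def range_buckets(start, end, gap):
    """List the (lower, upper) bounds of each range-facet bucket."""
    lower, stop, step = Decimal(start), Decimal(end), Decimal(gap)
    buckets = []
    while lower < stop:
        upper = lower + step
        buckets.append((str(lower), str(upper)))
        lower = upper
    return buckets


# Decimal avoids the floating-point drift that 0.5 + 0.1 + ... would show.
buckets = range_buckets("0.5", "1.0", "0.1")
for lower, upper in buckets:
    print(f"{lower} to {upper}")
```

With facet.range.other set to "all", Solr also reports counts for values before the start, after the end, and between the two.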

Here's the rest call to create my paramset:

export SOLR_BASE=http://myserver.mycompany.com:8983/solr
export CORE=mycore

curl "$SOLR_BASE/$CORE/config/params" -H 'Content-type:application/json' \
  -d '{
  "update": {
    "myParams": {
      "rows": "5",
      "facet": "on",
      "facet.range": ["classificationfield1", "classificationfield2", "classificationfield3"],
      "facet.range.start": "0.5",
      "facet.range.end": "1.0",
      "facet.range.gap": "0.1",
      "facet.range.other": "all",
      "fl": "title,textfield1,textfield2,classificationfield1,classificationfield2,classificationfield3,score",
      "mlt": "on",
      "mlt.fl": "textfield1,textfield2",
      "df": "_text_"
    }
  }
}'
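
The same call can be sketched from a script. This is a minimal illustration using only the Python standard library. The function name and the shortened parameter dictionary are my own, and the request is only constructed here, not sent.

```python
import json
from urllib.request import Request


def build_paramset_request(solr_base, core, name, params):
    """Build a POST to /config/params that updates one named paramset."""
    body = json.dumps({"update": {name: params}}).encode("utf-8")
    return Request(
        f"{solr_base}/{core}/config/params",
        data=body,
        headers={"Content-type": "application/json"},
        method="POST",
    )


# Shortened parameter set, for illustration only.
req = build_paramset_request(
    "http://myserver.mycompany.com:8983/solr",
    "mycore",
    "myParams",
    {"rows": "5", "facet": "on", "fl": "title,score", "df": "_text_"},
)
print(req.full_url)
print(req.data.decode("utf-8"))
```

Sending it would be a matter of passing req to urllib.request.urlopen against a running Solr instance.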


Then I needed to add this new paramset to the *END* of the list in the
requestHandler's useParams attribute:




A few wiki pages that I found useful...

   - "Request Parameters API":
     https://cwiki.apache.org/confluence/display/solr/Request+Parameters+API
   - "InitParams in SolrConfig":
     https://cwiki.apache.org/confluence/display/solr/InitParams+in+SolrConfig
   - "Config API":
     https://cwiki.apache.org/confluence/display/solr/Config+API

Matt


Re: problem setting default params for /browse (velocity) queries

2016-09-20 Thread Matt Work Coarr
It would be nice to have a link to that films example in the cwiki "Request
Parameters API" page.

Erik H, in your Lucidworks blog post, what is the meaning of the empty
string keyed entries in each of the param sets?

"":{"v":0}


Matt


any way to post json document to a MoreLikeThisHandler?

2018-09-11 Thread Matt Work Coarr
Hello,

Using a MoreLikeThisHandler, I was hoping to be able to pass a JSON document
in the POST body (the same format as a document indexed in my core, but
the document in the request is not and should not be added to the core).

I'm thinking it would handle an incoming document similar to how the
/update handler can split up a json document into the set of fields defined
in the schema (or auto created fields).

For instance, my input document would look like this:

{
  "id": 1234,
  "field1": "blah blah blah",
  "field2": "foo bar",
  "field3": 112233
}

And then I want to be able to use the MoreLikeThis query parameters to
determine which fields are used in the MLT comparison.

Thanks,
Matt


Re: any way to post json document to a MoreLikeThisHandler?

2018-09-11 Thread Matt Work Coarr
Thanks Alex.  Yes, I've been using the MoreLikeThisHandler, but that takes
a block of text as input posted to the request, not the structured json
that corresponds to the fields.
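
One possible stopgap, sketched below, is to flatten the chosen fields of the JSON document into a single block of text on the client and post that. This is only an illustration. The "/mlt" handler path, the function name, and the choice of mlt.interestingTerms=list are assumptions, and flattening loses the per-field structure I'm actually after.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_mlt_request(solr_base, core, doc, fields, handler="/mlt"):
    """Flatten selected fields of a document into text for an MLT request."""
    # Concatenate only the requested fields that are present in the document.
    text = " ".join(str(doc[f]) for f in fields if f in doc)
    query = urlencode({
        "mlt.fl": ",".join(fields),
        "mlt.interestingTerms": "list",
        "wt": "json",
    })
    return Request(
        f"{solr_base}/{core}{handler}?{query}",
        data=text.encode("utf-8"),
        headers={"Content-type": "text/plain; charset=utf-8"},
        method="POST",
    )


doc = {"id": 1234, "field1": "blah blah blah", "field2": "foo bar", "field3": 112233}
# Hypothetical host and core name; the request is built but not sent.
req = build_mlt_request("http://localhost:8983/solr", "mycore", doc, ["field1", "field2"])
print(req.full_url)
print(req.data.decode("utf-8"))
```

The posted text is compared against every field in mlt.fl as a whole, so this does not match field1 content only against field1.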

On Tue, Sep 11, 2018 at 10:14 AM Alexandre Rafalovitch 
wrote:

> There are three ways to trigger MLT:
> https://lucene.apache.org/solr/guide/7_4/morelikethis.html
>
> MoreLikeThisHandler allows text to be supplied externally. Unfortunately, I
> can't find a specific example demonstrating it, so I'm not sure if it is
> just a blob of text or a document.
>
> Regards,
>Alex.


Re: any way to post json document to a MoreLikeThisHandler?

2018-09-13 Thread Matt Work Coarr
Thank you Alex! I'll take a look at DumpRequestHandler and see if I can
pull pieces from there and from MLT.

I was also looking at DirectUpdateHandler2 to try and tease apart the logic
for parsing an incoming document (JSON, XML, or whatever the incoming format
is).  Do you think that's worthwhile?

My thought was that this is what backs "/update" and that's how I'm loading
my json files.

It looks like DirectUpdateHandler2.split() might be relevant too.

Matt


On Tue, Sep 11, 2018 at 5:13 PM Alexandre Rafalovitch 
wrote:

> Hmm.
>
> I guess the issue is that the handler is the one doing parsing, so the
> input document can be in XML or JSON or CSV. And MLT as a handler is then a
> competing end point.
>
> So you actually want to use it later in a pipeline but with a document
> constructed on the fly and not stored.
>
> This may not exist right now. Though maybe some combination of
> DumpRequestHandler and MLT as a search component could do the trick?
>
> I would be curious to know if it can be made to work out of the box.
> Otherwise, patches are welcome. But they should not expect just JSON
> input format.
>
> Regards,
> Alex


highlighting in more like this?

2018-09-18 Thread Matt Work Coarr
Is it possible to get highlighting in more like this queries?  My initial
attempts seem to indicate that it isn't possible (I've only attempted this
via modifying MLT query urls)

(I'm looking for something similar to hl=true&hl.fl=field1,field5,field6 in
a normal search)

Thanks,
Matt


highlighting more-like-this

2018-10-12 Thread Matt Work Coarr
I want to get highlighted results for more like this queries.  More like
this doesn't support highlighting.

So what I did was run a more like this query (I have the source document A
and say I get three similar documents back: A1, A2, and A3).  I then create
a second query where I use the contents of A as the query.

More specifically, I have a subset of my fields being appended to a
multivalued "catchall" field.  I use A's concatenated catchall (with
punctuation removed) as the search:

q=catchall:(*CONCATENATED_A_CATCHALL_TEXT*)

And I limit the results to the three documents A1/A2/A3 via fq:

fq=id:A1_ID+id:A2_ID+id:A3_ID
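
The second query described above can be sketched as follows. This is an illustration under assumptions: it restricts the ids with a filter query (fq), the helper name is made up, and the ids and field names are placeholders.

```python
import re
from urllib.parse import parse_qs, urlencode


def build_highlight_query(source_text, similar_ids, hl_fields, field="catchall"):
    """Build the query string for the second, highlighted search."""
    # Replace punctuation with spaces, then collapse runs of whitespace.
    cleaned = " ".join(re.sub(r"[^\w\s]", " ", source_text).split())
    params = {
        "q": f"{field}:({cleaned})",
        # Restrict results to the documents returned by the MLT query.
        "fq": "id:(" + " OR ".join(similar_ids) + ")",
        "hl": "true",
        "hl.fl": ",".join(hl_fields),
    }
    return urlencode(params)


qs = build_highlight_query(
    "Solr: fast, scalable search!",
    ["A1_ID", "A2_ID", "A3_ID"],
    ["field1", "field5", "field6"],
)
parsed = parse_qs(qs)
print(parsed["q"][0])
print(parsed["fq"][0])
```

Uppercase AND, OR, and NOT in the source text would still be read as query operators, and ids containing query syntax characters would need escaping, so real input needs more cleanup than this.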

Now I get highlighted results.  But my main problem is very frequent terms
(for/the/to/in...) are highlighted.  I would have thought these would be
excluded via inverse document frequency (since they show up in just about
every document).

Is there a way to improve the highlighting? (Remove the less important
terms, set some threshold, etc)

Matt