problem setting default params for /browse (velocity) queries
Good morning.

I'm attempting to set some default query parameters via solrconfig.xml, but they are not taking effect. I'm using the /browse interface (i.e. the velocity templates).

To keep it simple, let's start with "fl". When I specify fl as a url parameter, it does take effect. But when I put it in solrconfig.xml, it's not being used.

Right now, I'm putting this in a new initParams element: field1,field2,field3

Then I tell the /browse request handler to add it to the list of useParams (I've tried both the beginning and the end of the list). I have also tried adding "useParams=myParams" in the url.

FYI, I have been including wt=xml in my url to rule out the velocity templates as the cause of the problem.

Any ideas?

Thanks!
Matt
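For anyone reproducing the test described above, here is a minimal Python sketch that builds the kind of /browse URL being tried, with wt=xml and useParams on the query string. The host, port, and core name are placeholders (assumptions, not values from the original setup); only "myParams", "wt=xml", and the "fl" parameter come from the message.

```python
from urllib.parse import urlencode

def browse_url(solr_base, core, param_set, fl=None):
    """Build a /browse query URL that asks Solr to apply a named paramset.

    solr_base and core are placeholders for illustration.
    """
    params = {
        "q": "*:*",              # match-all query, just for testing
        "wt": "xml",             # bypass the velocity templates
        "useParams": param_set,  # named paramset to apply
    }
    if fl is not None:
        params["fl"] = fl        # explicit field list on the URL
    return f"{solr_base}/{core}/browse?{urlencode(params)}"

# Hypothetical server and core names.
url = browse_url("http://localhost:8983/solr", "mycore", "myParams")
print(url)
```

Comparing the response for this URL against one that passes fl explicitly is a quick way to see whether the paramset is being picked up at all.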
Re: problem setting default params for /browse (velocity) queries
Thanks Erick.

The pre-existing request handler for /browse (the velocity template driven interface) already had a useParams attribute. I just added an entry for "myParams" and added the initParams element in solrconfig.xml. I also tried adding an initParams element with a path of /browse (similar to how the existing initParams elements were set up).

I was wondering where the param sets on the /browse request handler were coming from. Now that I know to look for params.json, I see a copy in my core's conf directory, and it has "query", "facets", and "velocity" defined.

I'm going to try setting it via the parameters API and see what happens...

Thanks for the pointers.
Matt
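A quick way to see which paramsets a core already defines is to read params.json and list the keys under its top-level "params" object. The sketch below is in Python and assumes that layout; the sample content is made up for illustration and only the three paramset names come from the message above.

```python
import json

def param_set_names(params_json_text):
    """Return the names of the paramsets defined in a params.json document.

    Assumes the file has a top-level "params" object keyed by paramset name.
    """
    data = json.loads(params_json_text)
    return sorted(data.get("params", {}))

# Illustrative content only; the real file will have different values.
sample = """
{
  "params": {
    "query":    {"defType": "edismax"},
    "facets":   {"facet": "on"},
    "velocity": {"wt": "velocity"}
  }
}
"""

names = param_set_names(sample)
print(names)
```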
Re: problem setting default params for /browse (velocity) queries
Awesome! Thanks Erik and Erick!!

To close the loop on this, I was able to create a paramset via the REST API and then use it in a query via ?paramSet=myParams and it's working!! Hopefully this information will help someone else...

My dataset has some text fields that should be used in more-like-this, and it has some machine learning classifier score fields that range from 0 to 1. I want to be able to facet over ranges of those scores.

Here's the REST call to create my paramset:

    export SOLR_BASE=http://myserver.mycompany.com:8983/solr
    export CORE=mycore
    curl "$SOLR_BASE/$CORE/config/params" -H 'Content-type:application/json' -d '{
      "update":{
        "myParams":{
          "rows":"5",
          "facet": "on",
          "facet.range": ["classificationfield1","classificationfield2","classificationfield3"],
          "facet.range.start": "0.5",
          "facet.range.end": "1.0",
          "facet.range.gap": "0.1",
          "facet.range.other" : "all",
          "fl": "title,textfield1,textfield2,classificationfield1,classificationfield2,classificationfield3,score",
          "mlt": "on",
          "mlt.fl": "textfield1,textfield2",
          "df":"_text_"}}
    }'

Then I needed to add this new paramset to the *END* of the list in the requestHandler's useParams attribute.

A few wiki pages that I found useful...

- "Request Parameters API" - https://cwiki.apache.org/confluence/display/solr/Request+Parameters+API
- "InitParams in SolrConfig" - https://cwiki.apache.org/confluence/display/solr/InitParams+in+SolrConfig
- "Config API" - https://cwiki.apache.org/confluence/display/solr/Config+API

Matt
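For scripting the same call without curl, here is a rough Python equivalent using only the standard library. It builds the POST request for the /config/params endpoint shown above but does not send it; the server URL is a placeholder, and the parameter dictionary is trimmed to a few of the values from the message.

```python
import json
from urllib.request import Request

def build_paramset_request(solr_base, core, name, params):
    """Build (but do not send) a POST that creates or updates a paramset."""
    body = json.dumps({"update": {name: params}}).encode("utf-8")
    return Request(
        f"{solr_base}/{core}/config/params",
        data=body,
        headers={"Content-type": "application/json"},
        method="POST",
    )

# Placeholder server; a subset of the parameters from the curl call.
req = build_paramset_request(
    "http://localhost:8983/solr",
    "mycore",
    "myParams",
    {
        "rows": "5",
        "facet": "on",
        "facet.range": ["classificationfield1", "classificationfield2"],
        "mlt": "on",
        "mlt.fl": "textfield1,textfield2",
    },
)
print(req.get_method(), req.full_url)
# To actually send it: urllib.request.urlopen(req)
```

Keeping request construction separate from sending makes it easy to inspect the JSON body before it touches a live core.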
Re: problem setting default params for /browse (velocity) queries
It would be nice to have a link to that films example in the cwiki "Request Parameters API" page. Erik H, in your Lucidworks blog post, what is the meaning of the empty string keyed entries in each of the param sets? "":{"v":0} Matt
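Whatever the empty-string entry turns out to mean, it is easy to keep it apart from the real query parameters when reading a paramset in client code. The Python sketch below assumes (and this is only an assumption, not something confirmed in this thread) that the "" key carries bookkeeping metadata such as a version counter rather than a request parameter.

```python
def split_param_set(param_set):
    """Separate the empty-string entry from the actual parameters.

    Assumption: the "" key holds metadata (e.g. {"v": 0}), not a
    query parameter.
    """
    meta = param_set.get("", {})
    params = {key: value for key, value in param_set.items() if key != ""}
    return params, meta

# Illustrative paramset in the shape quoted above.
params, meta = split_param_set({"rows": "5", "facet": "on", "": {"v": 0}})
print(params)
print(meta)
```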
any way to post json document to a MoreLikeThisHandler?
Hello,

Using a MoreLikeThisHandler, I was hoping to be able to pass a json document in the post body (the same format as a document indexed in my core, but the document in the request is not, and should not be, added to the core).

I'm thinking it would handle an incoming document similar to how the /update handler can split up a json document into the set of fields defined in the schema (or auto created fields).

For instance, my input document would look like this:

{
  "id": 1234,
  "field1": "blah blah blah",
  "field2": "foo bar",
  "field3": 112233
}

And then I want to be able to use the MoreLikeThis query parameters to determine which fields are used in the MLT comparison.

Thanks,
Matt
Re: any way to post json document to a MoreLikeThisHandler?
Thanks Alex. Yes, I've been using the MoreLikeThisHandler, but that takes a block of text as input posted to the request, not the structured json that corresponds to the fields.

On Tue, Sep 11, 2018 at 10:14 AM Alexandre Rafalovitch wrote:

> There are three ways to trigger MLT:
> https://lucene.apache.org/solr/guide/7_4/morelikethis.html
>
> MoreLikeThisHandler allows to supply text externally. Unfortunately, I
> can't find the specific example demonstrating it, so not sure if it
> just a blob of text or a document.
>
> Regards,
> Alex.
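Since the handler takes a block of text rather than structured json, one client-side workaround is to pull the fields of interest out of the json document and join them into a single text blob before posting. This is a Python sketch of that idea only, not something the handler does itself, and it loses the per-field distinction that the original question was after.

```python
def mlt_text(doc, fields):
    """Concatenate the chosen fields of a json-style document into one string.

    Missing fields are skipped; multivalued (list) fields are flattened.
    """
    parts = []
    for name in fields:
        value = doc.get(name)
        if value is None:
            continue
        values = value if isinstance(value, list) else [value]
        parts.extend(str(v) for v in values)
    return " ".join(parts)

# The example document from the original question.
doc = {
    "id": 1234,
    "field1": "blah blah blah",
    "field2": "foo bar",
    "field3": 112233,
}

text = mlt_text(doc, ["field1", "field2"])
print(text)
```

The resulting string is what would be posted as the text body of the MLT request.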
Re: any way to post json document to a MoreLikeThisHandler?
Thank you Alex! I'll take a look at DumpRequestHandler and see if I can pull pieces from there and from MLT.

I was also looking at DirectUpdateHandler2 to try and tease apart the logic for parsing an incoming json document (or xml, or whatever the incoming format is). Do you think that's worthwhile? My thought was that this is what backs "/update", and that's how I'm loading my json files. It looks like DirectUpdateHandler2.split() might be relevant too.

Matt

On Tue, Sep 11, 2018 at 5:13 PM Alexandre Rafalovitch wrote:

> Hmm.
>
> I guess the issue is that the handler is the one doing parsing, so the
> input document can be in XML or JSON or CSV. And MLT as a handler is then a
> competing end point.
>
> So you actually want to use it later in a pipeline but with a document
> constructed on the fly and not stored.
>
> This may not exist right now. Though maybe some combination of
> DumpRequestHandler and MLT as a search component could do the trick?
>
> I would be curious to know if it can be made to work out of the box.
> Otherwise, patches are welcome. But they should not expect just JSON
> input format.
>
> Regards,
> Alex
highlighting in more like this?
Is it possible to get highlighting in more like this queries?

My initial attempts seem to indicate that it isn't possible (I've only attempted this by modifying MLT query urls). I'm looking for something similar to hl=true&hl.fl=field1,field5,field6 in a normal search.

Thanks,
Matt
highlighting more-like-this
I want to get highlighted results for more like this queries. More like this doesn't support highlighting.

So what I did was run a more like this query (I have the source document A, and say I get three similar documents back: A1, A2, and A3). I then create a second query where I use the contents of A as the query. More specifically, I have a subset of my fields being appended to a multivalued "catchall" field. I use A's concatenated catchall (with punctuation removed) as the search:

q=catchall:(CONCATENATED_A_CATCHALL_TEXT)

And I limit the results to the three documents A1/A2/A3 via fq:

fq=id:A1_ID+id:A2_ID+id:A3_ID

Now I get highlighted results. But my main problem is that very frequent terms (for/the/to/in...) are highlighted. I would have thought these would be excluded via inverse document frequency (since they show up in just about every document).

Is there a way to improve the highlighting? (Remove the less important terms, set some threshold, etc.)

Matt
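One client-side way to keep very frequent words out of the highlights is to drop them from A's text before building the second query, so they never become query terms. The Python sketch below does that with a small hand-written stop list (illustrative only) and assembles the parameters for the second query, expressing the restriction to A1/A2/A3 as a filter query. It assumes the catchall field's analyzer lowercases terms and that the ids need no escaping; the field names in the usage line are taken from the earlier messages.

```python
import re

# Illustrative stop list; a real one would be tuned to the corpus.
STOP_WORDS = {"a", "an", "and", "for", "in", "of", "the", "to"}

def significant_terms(text, stop_words=STOP_WORDS):
    """Strip punctuation, lowercase, drop stop words, keep first occurrences."""
    terms = []
    for word in re.findall(r"[A-Za-z0-9]+", text.lower()):
        if word not in stop_words and word not in terms:
            terms.append(word)
    return terms

def highlight_params(source_text, ids, hl_fields):
    """Build the parameters for the second, highlighting query."""
    terms = " ".join(significant_terms(source_text))
    return {
        "q": f"catchall:({terms})",
        "fq": "id:(" + " OR ".join(ids) + ")",  # only the MLT hits
        "hl": "true",
        "hl.fl": ",".join(hl_fields),
    }

params = highlight_params(
    "The cat sat in the hat, for the cat!",
    ["A1_ID", "A2_ID", "A3_ID"],
    ["textfield1", "textfield2"],
)
print(params)
```

This only changes which terms are sent as the query, so it is a workaround rather than a highlighter setting; a frequency-based threshold would need document-frequency data from the index instead of a fixed list.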