Suggester highlighter offsets inaccurate

2017-10-12 Thread Timothy Hill
Hello,

I am using Solr 6.6's Suggester functionality to power an autosuggest
widget that returns lists of people's names.

One requirement that we have is that the suggester be
punctuation-insensitive. For example, entering:

'Dr Joh' should provide the suggestion 'Dr. John', despite the fact that
the user omitted the period after 'dr'.

'Hank Williams Jr' should provide the suggestion 'Hank Williams, Jr.'
despite the omission of both the comma and the period.

This functionality is present - but the punctuation-stripping appears to be
causing highlighting offsets to be miscalculated: we end up with 'Dr
John' for the first query and 'Hank Williams, Jr.' for the second.
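
To illustrate the kind of drift I suspect, here is a toy model in Python. It is only a sketch of one way such a mismatch can arise, not a description of what the Suggester highlighter actually does: all function names are made up, and the <b> tags are an assumed highlight marker. Offsets are computed against the punctuation-stripped text and then applied either directly to the original string (wrong) or after mapping back to original positions (right).

```python
import re


def strip_punct(text):
    """Drop everything except word characters and whitespace."""
    return re.sub(r"[^\w\s]", "", text)


def wrap(text, start, end):
    """Wrap text[start:end] in <b> tags (assumed highlight marker)."""
    return text[:start] + "<b>" + text[start:end] + "</b>" + text[end:]


def naive_highlight(original, query):
    """Offsets found in the stripped text are applied to the original as-is."""
    stripped = strip_punct(original).lower()
    needle = strip_punct(query).lower()
    start = stripped.find(needle)
    if start < 0:
        return original
    return wrap(original, start, start + len(needle))


def mapped_highlight(original, query):
    """Same match, but offsets are mapped back to original positions."""
    # positions[i] is the index in `original` of the i-th surviving character
    positions = [i for i, ch in enumerate(original) if re.match(r"[\w\s]", ch)]
    stripped = "".join(original[i] for i in positions).lower()
    needle = strip_punct(query).lower()
    start = stripped.find(needle)
    if start < 0:
        return original
    end = start + len(needle)
    return wrap(original, positions[start], positions[end - 1] + 1)


if __name__ == "__main__":
    for name, query in [("Dr. John", "Dr Joh"),
                        ("Hank Williams, Jr.", "Hank Williams Jr")]:
        print(naive_highlight(name, query), "|", mapped_highlight(name, query))
```

In this model every stripped character before the end of the match shortens the highlighted span by one, which is at least consistent with the symptoms described above.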

Here are the relevant parts of the solrconfig.xml and schema.xml
configurations:




suggestEntity
AnalyzingInfixLookupFactory
DocumentDictionaryFactory
skos_prefLabel
derived_score
payload
suggestType
2
false
false
true
suggest_filters




true
true
10
suggestEntity


suggestEntity













As you can see from the schema.xml document, I've tried storing term
vectors, offsets, etc., but the Suggester highlighter doesn't seem to take
advantage of them.

Does anyone know what I'm doing wrong here? Or is this a bug in the
highlighter?

Thanks,

Tim Hill


Application of different stemmers / stopword lists within a single field

2014-04-25 Thread Timothy Hill
This may not be a practically solvable problem, but the company I work for
has a large number of lengthy mixed-language documents - for example,
scholarly articles about Islam written in English but containing lengthy
passages of Arabic. Ideally, we would like users to be able to search both
the English and Arabic portions of the text, using the full complement of
language-processing tools such as stemming and stopword removal.

The problem, of course, is that these two languages co-occur in the same
field. Is there any way to apply different processing to different words or
paragraphs within a single field through language detection? Is this to all
intents and purposes impossible within Solr? Or is another approach (using
language detection to split the single large field into
language-differentiated smaller fields, for example) possible/recommended?
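
To make the field-splitting idea concrete, here is a rough Python sketch of a pre-indexing step. It is only an approximation under stated assumptions: it classifies paragraphs by Unicode script rather than by true language detection (so Persian or Urdu would also land in the Arabic bucket), and the field names text_en and text_ar are hypothetical.

```python
import unicodedata


def dominant_script(paragraph):
    """Return 'ar' if Arabic-script letters outnumber Latin ones, else 'en'."""
    arabic = latin = 0
    for ch in paragraph:
        if not ch.isalpha():
            continue  # ignore digits, punctuation, whitespace, marks
        name = unicodedata.name(ch, "")
        if name.startswith("ARABIC"):
            arabic += 1
        elif name.startswith("LATIN"):
            latin += 1
    return "ar" if arabic > latin else "en"


def split_by_language(text):
    """Split blank-line-separated paragraphs into per-language field values."""
    fields = {"text_en": [], "text_ar": []}  # hypothetical field names
    for paragraph in text.split("\n\n"):
        paragraph = paragraph.strip()
        if paragraph:
            fields["text_" + dominant_script(paragraph)].append(paragraph)
    return fields


if __name__ == "__main__":
    sample = "A common greeting is:\n\n\u0627\u0644\u0633\u0644\u0627\u0645 \u0639\u0644\u064a\u0643\u0645\n\nIt means peace."
    print(split_by_language(sample))
```

Each resulting list could then be sent to a multivalued field whose analyzer chain carries the stemmer and stopword list for that language. Passages that mix both scripts within one paragraph would need a finer-grained split than this.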

Thanks,

Tim Hill


Mysterious 'Specify the group.field as parameter or local parameter' error with some field combinations

2014-05-22 Thread Timothy Hill
Launching the following query against a sharded Solr 4.7 installation in
EC2 yields a 'Specify the group.field as parameter or local parameter'
error:

http://my_solr_instance:8983/solr/core_name/select/?q=*:mozart&facet=true&facet.field=publisher_facet&facet.field=publication_year_facet&f.publication_year_facet.facet.limit=100&f.publication_year_facet.facet.range.start=1700&f.publication_year_facet.facet.range.end=2014&f.publication_year_facet.facet.range.gap=10&f.publication_year_facet.facet.range.other=before&fq=collection_id:LIBI-BFobject_type:bibliographic_entity&start=0&rows=1&group=true&group.facet=true&group.field=group_id&timeAllowed=3&facet.range=publication_year_facet&version=2.2&wt=json

The error mystifies me. The error appears to be complaining that the
'group.field' parameter hasn't been specified - however, it is clearly
present in the query.

Curiously, the error goes away if I substitute another facet field of the
same datatype for 'facet.field=publisher_facet'. For instance, if I change
this to 'facet.field=language_facet' the query behaves as expected.

From the documentation I got the idea that the 'shards' parameter would
have to be specified. This throws off the counts - and leaves the error
intact.
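
To rule out a malformed query string as the cause, I have also tried assembling the request programmatically. The Python sketch below covers only a subset of the parameters in the URL above; the helper name is made up, and the round-trip check shows only that group.field survives encoding, not what Solr does with it.

```python
from urllib.parse import parse_qs, urlencode


def build_grouped_facet_query(base_url, term, group_field, facet_fields):
    """Build a grouped, faceted select URL with every parameter encoded."""
    params = [
        ("q", term),
        ("group", "true"),
        ("group.facet", "true"),
        ("group.field", group_field),
        ("facet", "true"),
    ]
    # facet.field may legitimately repeat, so keep the params as a list
    params += [("facet.field", name) for name in facet_fields]
    params += [("rows", "1"), ("wt", "json")]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    url = build_grouped_facet_query(
        "http://my_solr_instance:8983/solr/core_name/select",
        "*:mozart",
        "group_id",
        ["publisher_facet", "publication_year_facet"],
    )
    print(url)
    # Round-trip check: group.field is present exactly once
    print(parse_qs(url.split("?", 1)[1])["group.field"])
```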

Any ideas what could be going wrong here?

Thanks much,

Tim Hill


Luke misreporting index-time boosts?

2013-04-24 Thread Timothy Hill
Hello, all

I have recently been attempting to apply index-time boosts to fields using
the following syntax:



bleah bleah bleah
content here
content here


content here
bleah bleah bleah
content here



The intention is that matches on important_field should be more important
to score than matches on trivial_field (so that a search across all fields
for the term 'content' would return the second document above the first),
while still being able to use the standard query parser.
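
For anyone unfamiliar with the mechanism: in the XML update format, a per-field index-time boost is given as a boost attribute on the field element. The Python sketch below generates such a message. The field names come from this message, but the boost value of 2.0 and the helper name are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Build a Solr XML <add> message.

    docs is a list of documents; each document is a list of
    (field_name, value, boost) tuples, where boost may be None.
    """
    add = ET.Element("add")
    for fields in docs:
        doc = ET.SubElement(add, "doc")
        for name, value, boost in fields:
            field = ET.SubElement(doc, "field", name=name)
            if boost is not None:
                field.set("boost", str(boost))  # index-time field boost
            field.text = value
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    message = build_add_message([
        [("trivial_field", "content here", None),
         ("important_field", "bleah bleah bleah", 2.0)],  # assumed boost
        [("trivial_field", "bleah bleah bleah", None),
         ("important_field", "content here", 2.0)],
    ])
    print(message)
```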

Looking at output from Luke, however, all fields are reported as having a
boost of 1.0.

The following possibilities occur to me.

(1) The entire index-time-boosting approach is misconceived
(2) Luke is misreporting, because index-time boosting alters more
fundamental aspects of scoring (tf-idf calculations, I suppose), and the
index-time boost is thus invisible to it
(3) Some combination of (1) and (2)

Can anyone help illuminate the situation for me? Documentation for these
questions seems patchy.

Thanks,

Tim


Syntax for parameter substitution in function queries?

2012-08-07 Thread Timothy Hill
Hello, all ...

According to http://wiki.apache.org/solr/FunctionQuery/#What_is_a_Function.3F,
it is possible under Solr 4.0 to perform parameter substitutions
within function queries.

However, I can't get the syntax provided in the documentation there to
work *at all* with Solr 4.0 out of the box: the only location at which
function queries can be specified, it seems, is in the 'fl' parameter.
And attempts at parameter substitutions here fail. Using (haphazardly
guessed) syntax like

select?q=*:*&fl=*, test_id:if(exists(employee), employee_id,
socialsecurity_id), boost_id:sum($test_id, 10)&wt=xml

results in the following error

Error parsing fieldname: Missing param test_id while parsing function
'sum($test_id, 10)'

Right now I'm entertaining the following hypotheses:

(i) I'm somehow borking the syntax, probably in an embarrassingly obvious way
(ii) param substitutions of this type work with the syntax given in
the wiki, but not with the standard query handler
(iii) the wiki documentation is an artefact of development-in-progress
for a feature that was subsequently dropped

Can anyone on the list shed any light on this?

Thanks,

Tim


Re: Syntax for parameter substitution in function queries?

2012-08-08 Thread Timothy Hill
Thanks very much; that does indeed work as I'd hoped/expected.
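
For the archive, here is the working pattern as a small Python sketch, based on the URL in the reply quoted below: the name after the $ has to exist as a request parameter of its own. The helper name is made up, and the host and field names are those of the example data in that reply.

```python
from urllib.parse import parse_qs, urlencode


def function_query_url(base_url, param_name, param_value):
    """Build a query whose fl dereferences $param_name.

    The substituted name is also sent as a separate request parameter,
    which is what the dereference requires.
    """
    fl = "*,test_id:if(exists(price),id,name),boost_id:sum($%s,10)" % param_name
    params = [("q", "*:*"), ("fl", fl), (param_name, param_value)]
    return base_url + "?" + urlencode(params)


if __name__ == "__main__":
    url = function_query_url("http://localhost:8983/solr/query", "param", "price")
    print(url)
    # The dereferenced name must appear among the request parameters
    print(parse_qs(url.split("?", 1)[1])["param"])
```

My original attempt failed because test_id was only a pseudo-field alias inside fl, not a request parameter.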

On 7 August 2012 17:12, Yonik Seeley wrote:
> On Tue, Aug 7, 2012 at 3:01 PM, Timothy Hill wrote:
>> Hello, all ...
>>
>> According to 
>> http://wiki.apache.org/solr/FunctionQuery/#What_is_a_Function.3F,
>> it is possible under Solr 4.0 to perform parameter substitutions
>> within function queries.
>>
>> However, I can't get the syntax provided in the documentation there to
>> work *at all* with Solr 4.0 out of the box: the only location at which
>> function queries can be specified, it seems, is in the 'fl' parameter.
>> And attempts at parameter substitutions here fail. Using (haphazardly
>> guessed) syntax like
>>
>> select?q=*:*&fl=*, test_id:if(exists(employee), employee_id,
>> socialsecurity_id), boost_id:sum($test_id, 10)&wt=xml
>>
>> results in the following error
>>
>> Error parsing fieldname: Missing param test_id while parsing function
>> 'sum($test_id, 10)'
>
> test_id needs to be an actual request parameter.
>
> This worked for me on the example data:
> http://localhost:8983/solr/query?q=*:*&fl=*,%20test_id:if(exists(price),id,name),%20boost_id:sum($param,10)&param=price
>
> -Yonik
> http://lucidimagination.com