strange copied field problem

2011-09-21 Thread Tanner Postert
I have 3 fields that I am working with: genre, genre_search and text. genre
is a string field which comes from the data source. genre_search is a text
field that is copied from genre, and text is a text field that is copied
from genre_search and a few other fields. The text field is the default search
field for queries. When I search for q=genre_search:indie+rock, Solr returns
several records that have both Indie as a genre and Rock as a genre, which
is great, but when I search for q=indie+rock or q=text:indie+rock, I get no
results.

Why would the source field return the value when the destination doesn't?
Both genre_search and text are the same data type, so there shouldn't be any
strange translations happening.


Re: strange copied field problem

2011-09-21 Thread Tanner Postert
I believe that was the original configuration, but I can switch it back and
see if that yields any results.

On Wed, Sep 21, 2011 at 10:54 AM, Pulkit Singhal wrote:

> I am NOT claiming that making a copy of a copy field is wrong or leads
> to a race condition. I don't know that. BUT did you try to copy into
> the text field directly from the genre field? Instead of the
> genre_search field? Did that yield working queries?


Re: strange copied field problem

2011-09-21 Thread Tanner Postert
Sure enough, that worked. I could have sworn we had it this way before, but
either way, that fixed it. Thanks.



Re: How to get the fields that match the request?

2011-09-22 Thread Tanner Postert
This would be useful to me as well.

Even when searching with q=test, I know it defaults to the default search
field, but it would be helpful to know which field(s) matched the query term.

On Thu, Sep 22, 2011 at 3:29 AM, Nicolas Martin wrote:

> Hi everyBody,
>
> I need your help to get more information in my solR query's response.
>
> i've got a simple input text which allows me to query several fields in the
> same query.
>
> So my query  looks like this
> "q=email:martyn+OR+name:**martynn+OR+commercial:martyn ..."
>
> Is it possible in the response to know the fields where "martynn" has been
> found ?
>
> Thanks a Lot :-)
>


Stemming numbers

2012-01-10 Thread Tanner Postert
We've had some issues with people searching for a document with the
search term '200 movies'. The document is actually titled 'two hundred
movies'.

Do we need to add every number to our synonyms dictionary to
accomplish this? Is it best done at index or search time?
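
If the synonyms route is taken, the entries do not have to be typed by
hand. Below is a minimal Python sketch that spells out integers from 0 to
999 and emits comma-separated lines in the style of a synonyms file. The
helper names number_to_words and synonym_lines are invented for this
example, and multi-word synonyms such as "two hundred" may need extra care
depending on how the field is analyzed.

```python
ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = [
    "", "", "twenty", "thirty", "forty", "fifty", "sixty", "seventy",
    "eighty", "ninety",
]


def number_to_words(n):
    """Spell out an integer from 0 to 999 in English words."""
    if not 0 <= n <= 999:
        raise ValueError("only 0..999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] + (" " + ONES[ones] if ones else "")
    hundreds, rest = divmod(n, 100)
    words = ONES[hundreds] + " hundred"
    return words + (" " + number_to_words(rest) if rest else "")


def synonym_lines(numbers):
    """Build one 'digits, words' line per number."""
    return [f"{n}, {number_to_words(n)}" for n in numbers]


for line in synonym_lines([100, 200, 300]):
    print(line)
```

For 200 this yields the line "200, two hundred", which covers the
'200 movies' case from the question.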


Re: Stemming numbers

2012-01-10 Thread Tanner Postert
You mention that "that is one way to do it". Is there another that I'm not seeing?

On Jan 10, 2012, at 4:34 PM, Ted Dunning  wrote:

> On Tue, Jan 10, 2012 at 5:32 PM, Tanner Postert 
> wrote:
>
>> We've had some issues with people searching for a document with the
>> search term '200 movies'. The document is actually title 'two hundred
>> movies'.
>>
>> Do we need to add every number to our  synonyms dictionary to
>> accomplish this?
>
>
> That is one way to deal with this.
>
> But it depends on a lot of hand engineering of special cases.  That is good
> to have for the low hanging fruit, but it only takes you so far.  You can
> also automate the discovery of such cases to a certain degree by analyzing
> query logs.
>
>
>> Is it best done at index or search time?
>>
>
> I would say that opinion is divided on this and in the end, you probably
> have to do versions of this at both times.  This is especially true if you
> want to include secondary information like inferred query purpose
> (obviously only available at query time) and inferred document
> characteristics (best known at indexing time).  Partly the choice about
> when to do this is driven by which trade-offs you are OK making.  For
> instance, some people are driven by index size but not query response time.
> They would probably opt for pushing load to the query.  Others may be
> bound by response time or query throughput.  They may wish to minimize
> query complexity and size.


No system property or default value specified for...

2011-01-14 Thread Tanner Postert
I'm trying to dynamically add a core to a multi core system using the
following command:

http://localhost:8983/solr/admin/cores?action=CREATE&name=items&instanceDir=items&config=data-config.xml&schema=schema.xml&dataDir=data&persist=true

the data-config.xml looks like this:


  
  
   





this same configuration works for a core that is already imported into the
system, but when trying to add the core with the above command I get the
following error:

No system property or default value specified for local.code

so I added a  tag in the solr.xml figuring that it needed some
type of default value for this to work, then I restarted solr, but now when
I try the import I get:

No system property or default value specified for
dataimporter.last_index_time

Do I have to define a default value for every variable I will conceivably
use for future cores? is there a way to bypass this error?

Thanks in advance


Re: No system property or default value specified for...

2011-01-19 Thread Tanner Postert
I even have to define default values for the dataimport.delta values? That
doesn't seem right.

On Wed, Jan 19, 2011 at 11:57 AM, Markus Jelsma
wrote:

> Hi,
>
> I'm unsure if i completely understand but you first had the error for
> local.code and then set the property in solr.xml? Then of course it will
> give
> an error for the next undefined property that has no default set.
>
> If you use a property without default it _must_ be defined in solr.xml or
> solrcore.properties. And since you don't use defaults in your dataconfig
> they
> all must be explicitely defined.
>
> This is proper behaviour.
>
> Cheers,
>
> > I'm trying to dynamically add a core to a multi core system using the
> > following command:
> >
> >
> http://localhost:8983/solr/admin/cores?action=CREATE&name=items&instanceDir
> > =items&config=data-config.xml&schema=schema.xml&dataDir=data&persist=true
> >
> > the data-config.xml looks like this:
> >
> > 
> >>url="jdbc:mysql://localhost/"
> >...
> >name="server"/>
> >   
> > >query="select code from master.locals"
> >rootEntity="false">
> >  > query="select '${local.code}' as localcode,
> > items.*
> > FROM ${local.code}_meta.item
> > WHERE
> >   item.lastmodified > '${dataimporter.last_index_time}'
> > OR
> >   '${dataimporter.request.clean}' != 'false'
> > order by item.objid"
> > />
> > 
> > 
> > 
> >
> > this same configuration works for a core that is already imported into
> the
> > system, but when trying to add the core with the above command I get the
> > following error:
> >
> > No system property or default value specified for local.code
> >
> > so I added a  tag in the solr.xml figuring that it needed some
> > type of default value for this to work, then I restarted solr, but now
> when
> > I try the import I get:
> >
> > No system property or default value specified for
> > dataimporter.last_index_time
> >
> > Do I have to define a default value for every variable I will conceivably
> > use for future cores? is there a way to bypass this error?
> >
> > Thanks in advance
>


Re: No system property or default value specified for...

2011-01-19 Thread Tanner Postert
The error I am getting is that I have no default value
for ${dataimporter.last_index_time}

should I just define -00-00 00:00:00 as the default for that field?
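
The rule described in the reply quoted below (a property must either be
defined or carry a default) can be pictured with a toy resolver for
${name} and ${name:default} placeholders. This is a Python illustration
only, not Solr's implementation; the function name resolve and the sample
values are assumptions. Whether a default is appropriate for the data
importer's own variables, such as dataimporter.last_index_time, is not
settled in this thread.

```python
import re

# Matches ${name} and ${name:default}; the default may itself contain colons.
PLACEHOLDER = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")


def resolve(text, properties):
    """Replace placeholders using `properties`, falling back to defaults."""

    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in properties:
            return properties[name]
        if default is not None:
            return default
        raise KeyError(
            f"No system property or default value specified for {name}"
        )

    return PLACEHOLDER.sub(substitute, text)


# Defined property: the value is used.
print(resolve("${local.code}_meta.item", {"local.code": "abc"}))

# Undefined property with a default ("xyz" is a made-up sample value).
print(resolve("${local.code:xyz}_meta.item", {}))

# Undefined property without a default: the lookup fails.
try:
    resolve("${local.code}_meta.item", {})
except KeyError as error:
    print(error.args[0])
```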

On Wed, Jan 19, 2011 at 12:45 PM, Markus Jelsma
wrote:

> No, you only need defaults if you use properties that are not defined in
> solr.xml or solrcore.properties.
>
> What would the value for local.core be if you don't define it anyway and
> you
> don't specify a default? Quite unpredictable i gues =)
>


help with dismax query

2011-02-11 Thread Tanner Postert
I'm having a problem using the dismax query for the term "obsessed with
winning"

http://localhost:8983/solr/core1/select?q=obsessed+with+winning&fq=code:xyz&shards=localhost:8983/solr/core1,localhost:8983/solr/core2,&rows=10&start=0&defType=dismax&qf=title^10+description^4+text^1&debugQuery=true

that query yields zero results, but removing the dismax stuff it works
fine:

http://localhost:8983/solr/core1/select?q=obsessed+with+winning&fq=code:xyz&shards=localhost:8983/solr/core1,localhost:8983/solr/core2,&rows=10&start=0&debugQuery=true

or even adding the mm=2 yields results, but mm=3 does not.

There is at least 1 record which contains the exact phrase 'obsessed with
winning' in the title as well as the description multiple times, yet when
the defType=dismax option is added, the query yields no results.  Am I
missing something?

Thanks in advance.


help with dismax query

2011-02-11 Thread Tanner Postert
I'm having a problem using the dismax query. For example: for the term
"obsessed with winning" I use:

http://localhost:8983/solr/core1/select?q=obsessed+with+winning&fq=code:xyz&shards=localhost:8983/solr/core1,localhost:8983/solr/core2,&rows=10&start=0&defType=dismax&qf=title^10+description^4+text^1&debugQuery=true

that query yields zero results, but removing the dismax stuff it works
fine:

http://localhost:8983/solr/core1/select?q=obsessed+with+winning&fq=code:xyz&shards=localhost:8983/solr/core1,localhost:8983/solr/core2,&rows=10&start=0&debugQuery=true

or even adding the mm=2 yields results, but mm=3 does not.

Looking at the discussion here:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg29682.html I see
that possibly sending the qf fields as separate values, rather than one may
yield better results but searching for:

http://localhost:8983/solr/core1/select?q=obsessed+with+winning&fq=code:xyz&shards=localhost:8983/solr/core1,localhost:8983/solr/core2,&rows=10&start=0&defType=dismax&qf=title^10&qf=description^4&qf=text^1&debugQuery=true

yields no results either

There is at least 1 record which contains the exact phrase 'obsessed with
winning' multiple times in the title as well as the description and text (text
is just a copied field of title, description and a couple of other fields),
yet when the defType=dismax option is added, the query yields no
results.  Am I missing something?

Thanks in advance.


Re: help with dismax query

2011-02-11 Thread Tanner Postert
Looks like that might be the case. If I just do a search for "with"
including the dismax parameters, it returns no results, whereas a
search for 'obsessed' does return results. Is there any way I can get around
this behavior, or do I have something configured wrong?

>
> Might "with" be a stop word removed by one of those qf fields?  That'd explain
> why mm=3 doesn't work, I think.
>
> Erik
>
>


Re: help with dismax query

2011-02-11 Thread Tanner Postert
I think I found the answer here:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg04433.html

I think the title and description fields did not have the stopword filter
applied to them, which was causing the problem. When I took out the qf=title
and qf=description fields, the query returned results. I am rebuilding my
indexes now.



Multicore boosting to only 1 core

2011-02-14 Thread Tanner Postert
I have a multicore system and I am looking to boost results by date, but
only for 1 core. Is this at all possible?

Basically one of the core's content is very new, and changes all the time,
and if I boost everything by date, that core's content will almost always be
at the top of the results, so I only want to do the date boosting to the
cores that have older content so that their more recent results get boosted
over the older content.


Any way to get back search query with parsed out stop words

2011-02-15 Thread Tanner Postert
I am trying to see if there is a way to get back the query that Solr actually
searched, excluding the stopwords.  Right now when I search for "the
year in review" I can see in the debug output that the parsed query contains
text:"? year ? review", but that information is mixed in with all the parsed
boosting queries and isn't easily accessible via that XML (and would require
me to always pass debugQuery=true to my production queries).

I am trying to only get back the natural searched terms so I can highlight
them in the returned search results. I know that solr has built in
highlighting capability, but I can't use it because some of the fields
contain HTML themselves and I need to strip it all out when I display the
search results.

Right now, if I pass the full searched phrase to the highlighter, it looks a
little strange having every occurrence of "the" or "and" highlighted, so I'd
like to highlight only the non-stopwords if I could.

Any thoughts or ideas would be appreciated.
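
One client-side workaround, sketched in Python under the assumption that
the application can load the same word list the index uses, is to drop the
stop words before handing the terms to the custom highlighter. The function
name highlight_terms and the sample word list are hypothetical; Solr is not
involved in this step at all.

```python
def highlight_terms(query, stopwords):
    """Return the query terms that are not stop words, in original order."""
    stop = {word.lower() for word in stopwords}
    return [term for term in query.split() if term.lower() not in stop]


# Assumed sample list; a real one would mirror the index's stopwords.txt.
SAMPLE_STOPWORDS = ["the", "in", "and", "a", "of"]

print(highlight_terms("the year in review", SAMPLE_STOPWORDS))
# ['year', 'review']
```

This only splits on whitespace, so it will not match whatever tokenizer the
field uses for punctuation or hyphenated words.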


Re: Any way to get back search query with parsed out stop words

2011-02-15 Thread Tanner Postert
OK, I will look at using that filter factory on my content.

But I was also looking for the number of terms left after the stop filter, so
I could adjust my mm parameter based on the number of non-stopwords in the
search query and not run into the dismax stopword issue. Is there any way
around that other than using a very low mm?
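
A possible client-side approach, shown here as a Python sketch and not as a
Solr feature, is to count the non-stopword terms before sending the query
and derive mm from that count. The function name min_should_match, the cap
of 3, and the one-word stop list are assumptions for illustration.

```python
def min_should_match(query, stopwords, cap=3):
    """Pick an mm value no larger than the number of non-stopword terms."""
    stop = {word.lower() for word in stopwords}
    kept = [term for term in query.split() if term.lower() not in stop]
    # Never go below 1, and never above the configured cap.
    return max(1, min(cap, len(kept)))


print(min_should_match("obsessed with winning", ["with"]))  # 2
```

For "obsessed with winning" this gives mm=2, the value that returned
results in the earlier dismax thread.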

On Tue, Feb 15, 2011 at 1:45 PM, Ahmet Arslan  wrote:

> > I am trying to only get back the natural searched terms so
> > I can highlight
> > them in the returned search results. I know that solr has
> > built in
> > highlighting capability, but I can't use it because some of
> > the fields
> > contain HTML themselves and I need to strip it all out when
> > I display the
> > search results.
>
> I would stick with the solr's highlighting. If you strip html codes with a
> solr.HTMLStripCharFilterFactory, you can highlight html fields without
> problem.
>
>
>
>
>
>


Re: solr.HTMLStripCharFilterFactory not working

2011-02-15 Thread Tanner Postert
Never mind, I think I found my answer here:
http://www.mail-archive.com/solr-user@lucene.apache.org/msg34622.html

I will add the HTML stripper to the data importer and see how that goes.

On Tue, Feb 15, 2011 at 3:43 PM, Tanner Postert wrote:

> I have several fields defined and one of the field types includes a
> solr.HTMLStripCharFilterFactory field in the analyzer but it doesn't
> appear to be affecting the field as I would expect.
> I have tried a simple:
>
> 
> followed by the tokenizer
> 
>
> or the combined factory
>
> 
>
> but neither seems to work.
>
> Returned search results from the webtitle & webdescription as well as text
> include the original HTML characters that the title & description fields
> have.
>
> The relevant schema:
>
> 
>  omitNorms="true"/>
>
> 
>   
> 
>
>  words="stopwords.txt" enablePositionIncrements="true"/>
>
>  generateNumberParts="1" catenateWords="1" catenateNumbers="1"
> catenateAll="0" splitOnCaseChange="1"/>
> 
>  protected="protwords.txt"/>
>   
>   
>
> 
>
>  ignoreCase="true" expand="true"/>
>
>  words="stopwords.txt" enablePositionIncrements="true"/>
>
>  generateNumberParts="1" catenateWords="0" catenateNumbers="0"
> catenateAll="0" splitOnCaseChange="1"/>
>
> 
>  protected="protwords.txt"/>
>   
> 
>
>  positionIncrementGap="100" omitNorms="true">
>   
> 
>  words="stopwords.txt"/>
> 
> 
>   
>   
> 
>  ignoreCase="true" expand="true"/>
>  words="stopwords.txt"/>
> 
> 
>   
> 
> 
>
> 
> stored="true"   multiValued="false" />
> stored="true"   multiValued="false" />
> 
>
> stored="true"   multiValued="false" compressed="true" />
> stored="true"   mutliValued="false" compressed="true" />
> 
>
>stored="true"   multiValued="true" />
> 
> 
>
>multiValued="true" />
> 
> 
>
> 
>
>


Re: solr.HTMLStripCharFilterFactory not working

2011-02-15 Thread Tanner Postert
I am using the data import handler, and using the HTMLStripTransformer
doesn't seem to be working either.

I've changed webtitle and webdescription to not be copied from title and
description in the schema.xml file, then set them both to just be duplicates
of title and description in the data importer query:


 
  
 




Re: Passing parameters to DataImportHandler

2011-02-15 Thread Tanner Postert
Yes, it is possible via ${dataimporter.request.param}

see

http://wiki.apache.org/solr/DataImportHandler#Accessing_request_parameters
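
As a rough Python sketch of the calling side: any extra parameter on the
import request can then be referenced in the SQL as
${dataimporter.request.<name>}. The handler path /dataimport, the core name
core1, and the parameter name local are assumptions for this example and
depend on how the handler is registered.

```python
from urllib.parse import urlencode


def dataimport_url(base, command, **params):
    """Build a DataImportHandler request URL with extra request parameters."""
    query = {"command": command}
    query.update(params)
    return f"{base}/dataimport?{urlencode(query)}"


# 'local' would be read in data-config.xml as ${dataimporter.request.local}.
url = dataimport_url(
    "http://localhost:8983/solr/core1", "full-import", local="xyz"
)
print(url)
# http://localhost:8983/solr/core1/dataimport?command=full-import&local=xyz
```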

On Tue, Feb 15, 2011 at 4:45 PM, Jason Rutherglen <
jason.rutherg...@gmail.com> wrote:

> It'd be nice to be able to pass HTTP parameters into DataImportHandler
> that'd be passed into the SQL as parameters, is this possible?
>


Spellchecking with some misspelled words in index source

2011-02-15 Thread Tanner Postert
I'm building my spellcheck index from my content and it seems to be working,
but my problem is that there are a few misspelled words in my content.  For
example: the word Sheriff is misspelled as Sherrif in my content a
couple dozen times (but spelled correctly a couple thousand times). The
results of the spellcheck at first glance indicate that the word is spelled
correctly because it is found in the spellcheck dictionary and has valid
search results. Adding spellcheck.onlyMorePopular=true to the query
results in the spellcheck returning additional suggestions, but none of them
is the correct spelling of the word:




sherriff  10
sherri    2319
sherril   155
sherif    19
sherric   4




Is this just a strange glitch in my spellcheck dictionary based on my
content? What is strange is that sending the spellchecker sherriff (which is
another misspelling that has results in the index) results in it
sending back the correct spelling as the top result.


Re: solr.HTMLStripCharFilterFactory not working

2011-02-16 Thread Tanner Postert
I updated my data importer.

I used to have:




which wasn't working. But I changed that to




and it is working fine.

On Tue, Feb 15, 2011 at 5:50 PM, Koji Sekiguchi  wrote:

> (11/02/16 8:03), Tanner Postert wrote:
>
>> I am using the data import handler and using the HTMLStripTransformer
>> doesn't seem to be working either.
>>
>> I've changed webtitle and webdescription to not by copied from title and
>> description in the schema.xml file then set them both to just but
>> duplicates
>> of title and description in the data importer query:
>>
>> 
>>  > query="select
>>   title as title,
>>   title as webtitle,
>>   description as description,
>>   description as webdescription
>>   FROM ...>
>>   
>>   
>>  
>> 
>>
>>
> Just for input (I'm not sure that I could help you), I'm using
> HTMLStripTransformer
> with PlainTextEntityProcessor and it works fine with me:
>
> 
>baseUrl="http://lucene.apache.org/"/>
>  
> transformer="HTMLStripTransformer"
>dataSource="f" url="solr/">
>  
>
>  
> 
>
> Koji
> --
> http://www.rondhuit.com/en/
>


Spellcheck Phrases

2011-02-23 Thread Tanner Postert
Right now when I search for 'brake a leg', Solr returns valid results with
no indication of misspelling, which is understandable since all of those
terms are valid words and are probably found in a few pieces of our content.
My question is:

Is there any way for it to recognize that the phrase should be "break a leg"
and not "brake a leg" and suggest the proper phrase?


manually editing spellcheck dictionary

2011-02-25 Thread Tanner Postert
I'm using an index-based spellcheck dictionary and I was wondering if there
is a way for me to manually remove certain words from the dictionary.

Some of my content has some misspellings, and for example when I search for
the word sherrif (which should be spelled sheriff), I get recommendations
like sherriff or sherri instead. If I could remove those words, it
seems like the system would work a little better.


Re: Basic Dismax syntax question

2011-02-28 Thread Tanner Postert
I noticed that your search terms use capitals in some queries and lower case
in others. Are your search fields perhaps not set to lowercase the indexed
terms and/or the search term?

On Mon, Feb 28, 2011 at 10:41 AM, mrw  wrote:

> Say I have an index with first_name and last_name fields, and also a copy
> field for the full name called full_name.  Say I add two employees:
> Napoleon Bonaparte and Napoleon Dynamite.
>
> If I search for just the first or last name, or both names, with mm=1, I
> get
> the expected results:
>
> q=Napoleon&defType=dismax&tie=0.1&qf=first_name%20last_name&mm=1  // 2
> results
> q=Bonaparte&defType=dismax&tie=0.1&qf=first_name%20last_name&mm=1 // 2
> results
>
> q=Napoleon%20Bonaparte&defType=dismax&tie=0.1&qf=first_name%20last_name&mm=1
> // 2 results
>
>
> However, if I try to search for both names with mm=2 (which I think means
> term1 AND term2), I get 0 results:
>
>
> q=napoleon%20bonaparte&defType=dismax&tie=0.1&qf=first_name%20last_name&mm=2
>// 0 results
> q=napoleon%20bonaparte&defType=dismax&tie=0.1&qf=full_name&mm=2 // 0
> results
>
> I also see this when I put all fields (including the copy field) into the
> qf
> parameter.
>
>
> Thoughts?
>
>
> Thanks!
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Different-behavior-for-q-goo-com-vs-q-goo-com-in-queries-tp2168935p2596768.html
> Sent from the Solr - User mailing list archive at Nabble.com.


FilterQuery OR statement

2011-03-03 Thread Tanner Postert
Trying to figure out how I can run something similar to this for the fq
parameter

Field1 in ( 1, 2, 3 4 )
AND
Field2 in ( 4, 5, 6, 7 )

I found some examples on the net that looked like this: &fq=+field1:(1 2 3
4) +field2(4 5 6 7) but that yields no results.
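
One way to avoid depending on the default operator is to spell out the OR
between values. The Python sketch below builds such a filter and
URL-encodes it; the helper name any_of is made up, and the colon after
field2 assumes that its absence in the example above was a typo. Note that
a literal "+" has to be percent-encoded in a URL, otherwise it is decoded
as a space.

```python
from urllib.parse import urlencode


def any_of(field, values):
    """Build a required clause matching any one of `values` in `field`."""
    joined = " OR ".join(str(value) for value in values)
    return f"+{field}:({joined})"


fq = " ".join([any_of("field1", [1, 2, 3, 4]), any_of("field2", [4, 5, 6, 7])])
print(fq)
# +field1:(1 OR 2 OR 3 OR 4) +field2:(4 OR 5 OR 6 OR 7)

# urlencode turns the leading '+' into %2B and spaces into '+'.
encoded = urlencode({"fq": fq})
print(encoded)
```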


Re: FilterQuery OR statement

2011-03-03 Thread Tanner Postert
That worked. I thought I had tried it before; not sure why it didn't work then.

Also, is there a way to query without a q parameter?

I'm just trying to pull back all of the results where field1:(1 OR 2
OR 3) etc., so I figured I'd use the fq param for caching purposes because
those queries will likely be run a lot, but if I leave the q parameter off I
get a null pointer error.
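
A common way to get this effect with the standard query parser is to keep q
but set it to the match-all query *:* and put the real restriction in fq.
The following Python sketch builds such a request; the function name
filter_only_query and the core1 base URL are assumptions for illustration.

```python
from urllib.parse import urlencode


def filter_only_query(base, filters, rows=10):
    """Build a select URL that matches everything and narrows with fq."""
    params = [("q", "*:*")]
    params += [("fq", clause) for clause in filters]
    params.append(("rows", rows))
    return f"{base}/select?{urlencode(params)}"


url = filter_only_query(
    "http://localhost:8983/solr/core1", ["field1:(1 OR 2 OR 3)"]
)
print(url)
```

Each fq clause is sent as its own parameter, so it can be cached separately
from the others.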

On Thu, Mar 3, 2011 at 11:05 AM, Ahmet Arslan  wrote:

> > Trying to figure out how I can run
> > something similar to this for the fq
> > parameter
> >
> > Field1 in ( 1, 2, 3 4 )
> > AND
> > Field2 in ( 4, 5, 6, 7 )
> >
> > I found some examples on the net that looked like this:
> > &fq=+field1:(1 2 3
> > 4) +field2(4 5 6 7) but that yields no results.
>
> May be your default operator is set to AND in schema.xml?
> If yes, try using +field2(4 OR 5 OR 6 OR 7)
>
>
>
>


Spacial Search Field Type

2011-03-17 Thread Tanner Postert
I am using Solr 1.4.1 (Solr Implementation Version: 1.4.1 955763M - mark -
2010-06-17 18:06:42, to be exact).

I'm trying to implement the geospatial field type by adding to the schema:

 





but I get the following errors:


org.apache.solr.common.SolrException: Unknown fieldtype 'location'
specified on field geo

and


org.apache.solr.common.SolrException: Error loading class 'solr.LatLonType'


I thought I read that you had to have Solr 4.0 for the LatLon field
type, but isn't 1.4 = 4.0? Do I need some type of patch or different
version of Solr to use that field type?


Re: Spellcheck Phrases

2011-05-27 Thread Tanner Postert
Are there any updates on this? Are there any third-party apps that can make
this work as expected?

On Wed, Feb 23, 2011 at 12:38 PM, Dyer, James wrote:

> Tanner,
>
> Currently Solr will only make suggestions for words that are not in the
> dictionary, unless you specifiy "spellcheck.onlyMorePopular=true".  However,
> if you do that, then it will try to "improve" every word in your query, even
> the ones that are spelled correctly (so while it might change "brake" to
> "break" it might also change "leg" to "log".)
>
> You might be able to alleviate some of the pain by setting the
> "thresholdTokenFrequency" so as to remove misspelled and rarely-used words
> from your dictionary, although I personally haven't been able to get this
> parameter to work.  It also doesn't seem to be documented on the wiki but it
> is in the 1.4.1. source code, in class IndexBasedSpellChecker.  Its also
> mentioned in Smiley&Pugh's book.  I tried setting it like this, but got a
> ClassCastException on the float value:
>
> 
>  text_spelling
>  
>  spellchecker
>  Spelling_Dictionary
>  text_spelling
>  true
>  .001
>  
> 
>
> I have it on my to-do list to look into this further but haven't yet.  If
> you decide to try it and can get it to work, please let me know how you do
> it.
>
> James Dyer
> E-Commerce Systems
> Ingram Content Group
> (615) 213-4311


Better Spellcheck

2011-05-31 Thread Tanner Postert
I've tried to use a spellcheck dictionary built from my own content, but my
content ends up having a lot of misspelled words so the spellcheck ends up
being less than effective. I could use a standard dictionary, but it may
have problems with proper nouns. It also misses phrases. When someone
searches for "Untied States" I would hope the spellcheck would suggest
"United States" but it just recognizes that "untied" is a valid word and
doesn't suggest anything.

Is there any way around this? Are there any third-party modules or
spellcheck systems that I could implement to get these types of features?