PatternReplaceFilterFactory -- what does this regex do?

2013-03-22 Thread Eric Wilson
I'm using the Solr Suggester for autocompletion with the WFSTLookup suggest
component, and a text file with phrases and weights
(http://wiki.apache.org/solr/Suggester).

I found that the following filter made it impossible to match on
ampersands, so I removed it. But I'm sure it was there for a reason. What
was it supposed to do?



Thanks,

Eric Wilson


Using multiple text files for Suggester dictionaries

2013-03-27 Thread Eric Wilson
I'm using the Suggester component for autocomplete. I have a variety of
types of suggestions that I would like to offer, such as locations, company
names, products, and dictionary words.

These lists vary in size and volatility, so keeping them all in the same
text file is not the most convenient.

I'm using text files because I want the ability to add weights to the terms
suggested.

Is it possible to use multiple text files? I tried the following:

   <str name="name">suggestword</str>
   org.apache.solr.spelling.suggest.Suggester
   org.apache.solr.spelling.suggest.fst.WFSTLookupFactory
   suggestword
   false
   true
   ../data/words.txt
   ../data/cities.txt

But the second list, the cities, is apparently not picked up, even after
restarting Tomcat and rebuilding the dictionary. Can this be done? If
not, how would you recommend managing different dictionaries?
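
If the Suggester reads only a single sourceLocation (an assumption here, not something confirmed in this thread), one possible workaround is to keep the lists in separate files and merge them into one dictionary file before rebuilding. A minimal Python sketch, where `merge_dictionaries` and the file names are hypothetical and the input is assumed to be tab-separated phrase/weight lines:

```python
from pathlib import Path


def merge_dictionaries(sources, target):
    """Merge tab-separated 'phrase<TAB>weight' files into a single file.

    Comment lines (starting with '#') and blank lines are dropped. If a
    phrase appears in more than one file, the highest weight is kept.
    Returns the number of distinct phrases written.
    """
    weights = {}
    for source in sources:
        for line in Path(source).read_text(encoding="utf-8").splitlines():
            line = line.strip()
            if not line or line.startswith("#"):
                continue
            phrase, sep, weight = line.rpartition("\t")
            if not sep:
                # No tab on the line: assume a bare phrase with weight 1.0.
                phrase, weight = line, "1.0"
            phrase = phrase.strip()
            value = float(weight)
            if phrase not in weights or value > weights[phrase]:
                weights[phrase] = value
    lines = [f"{phrase}\t{weight}" for phrase, weight in sorted(weights.items())]
    Path(target).write_text("\n".join(lines) + "\n", encoding="utf-8")
    return len(weights)
```

The merged file could then be the one file that the dictionary configuration points at, with the merge re-run whenever one of the source lists changes.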

Thanks,

Eric Wilson


Can mm (min-match) be specified by field in dismax or edismax?

2013-06-03 Thread Eric Wilson
I would like to have the min-match set differently for different fields in
my dismax handler. Is this possible?


Using suggester for smarter phrase autocomplete

2013-03-13 Thread Eric Wilson
I'm trying to use the suggester for auto-completion with Solr 4. I have
followed the example configuration for phrase suggestions at the bottom of
this wiki page:
http://wiki.apache.org/solr/Suggester

This shows how to use a text file with the following text for phrase
suggestions:

# simple auto-suggest phrase dictionary for testing
# note this uses tabs as separator!
the first phrase	1.0
the second phrase	2.0
testing 1234	3.0
foo	5.0
the fifth phrase	2.0
the final phrase	4.0

This seems to be working in the expected way. If I query for "the f" I
receive the following suggestions:

 the final phrase
 the fifth phrase
 the first phrase

I would like to deal with the case where the user is interested in "the
foo". When "the fo" is entered, there will be no suggestions. Is it
possible to provide both the phrase matches, and the matches for individual
words, so that when the user-entered text is no longer part of any actual
phrase, there are still suggestions to be made for the final word?
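
To make the desired behaviour concrete, here is a minimal sketch in plain Python. It is not the Suggester API; `load_dictionary` and `suggest` are hypothetical names, and the fallback rule (retry with only the last word typed) is just one possible reading of the question.

```python
def load_dictionary(text):
    """Parse tab-separated 'phrase<TAB>weight' lines into {phrase: weight}."""
    entries = {}
    for line in text.splitlines():
        if not line.strip() or line.startswith("#"):
            continue
        phrase, sep, weight = line.rpartition("\t")
        if not sep:
            continue  # skip lines that are not tab-separated
        entries[phrase.strip()] = float(weight)
    return entries


def suggest(entries, query, limit=5):
    """Return phrases starting with the query, highest weight first.

    If no phrase matches the whole query, fall back to matching only the
    last word the user typed.
    """
    def by_prefix(prefix):
        hits = [(w, p) for p, w in entries.items() if p.startswith(prefix)]
        hits.sort(key=lambda hit: (-hit[0], hit[1]))
        return [phrase for _, phrase in hits[:limit]]

    matches = by_prefix(query)
    if matches or not query.split():
        return matches
    return by_prefix(query.split()[-1])
```

With the dictionary above, `suggest(entries, "the f")` gives the three phrase matches in weight order, while `suggest(entries, "the fo")` finds no phrase and falls back to the prefix "fo", which matches "foo".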

Thanks,

Eric Wilson


How to use suggester for smarter multi-word autocomplete?

2013-03-13 Thread Eric Wilson
I'm trying to use the suggester for auto-completion with Solr 4. I have 
followed the example configuration for phrase suggestions at the bottom of this 
wiki page: http://wiki.apache.org/solr/Suggester

This shows how to use a text file with the following text for phrase 
suggestions:

# simple auto-suggest phrase dictionary for testing
# note this uses tabs as separator!
the first phrase	1.0
the second phrase	2.0
testing 1234	3.0
foo	5.0
the fifth phrase	2.0
the final phrase	4.0

This seems to be working in the expected way. If I query for "the f" I receive 
the following suggestions:

the final phrase
the fifth phrase
the first phrase

I would like to deal with the case where the user is interested in "the foo". 
When "the fo" is entered, there will be no suggestions. Is it possible to 
provide both the phrase matches, and the matches for individual words, so that 
when the user-entered text is no longer part of any actual phrase, there are 
still suggestions to be made for the final word?

Eric Wilson


Re: Using suggester for smarter phrase autocomplete

2013-03-13 Thread Eric Wilson
I'm not concerned about stopwords, rather the situation where the first and
second words are rarely used together, so don't occur together in a phrase
in the dictionary. Thanks.

On Wed, Mar 13, 2013 at 11:11 AM, Robert Muir wrote:

> On Wed, Mar 13, 2013 at 11:07 AM, Eric Wilson 
> wrote:
> > I'm trying to use the suggester for auto-completion with Solr 4. I have
> > followed the example configuration for phrase suggestions at the bottom of
> > this wiki page:
> > http://wiki.apache.org/solr/Suggester
> >
> > This shows how to use a text file with the following text for phrase
> > suggestions:
> >
> > # simple auto-suggest phrase dictionary for testing
> > # note this uses tabs as separator!
> > the first phrase	1.0
> > the second phrase	2.0
> > testing 1234	3.0
> > foo	5.0
> > the fifth phrase	2.0
> > the final phrase	4.0
> >
> > This seems to be working in the expected way. If I query for "the f" I
> > receive the following suggestions:
> >
> >  the final phrase
> >  the fifth phrase
> >  the first phrase
> >
> > I would like to deal with the case where the user is interested in "the
> > foo". When "the fo" is entered, there will be no suggestions. Is it
> > possible to provide both the phrase matches, and the matches for individual
> > words, so that when the user entered text is no longer part of any actual
> > phrase, there are still suggestions to be made for the final word?
> >
>
> Is it really the case that you want matches for individual words, or
> just to handle e.g. the stopwords case like 'the fo' -> foo ?
>
> the latter can be done with analyzingsuggester (configure a stopfilter
> on the analyzer).
>
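
For reference, the stopword case mentioned in the quoted reply ('the fo' -> foo) can be illustrated roughly as below. This is a plain-Python sketch of the idea only, not how AnalyzingSuggester is implemented; `STOPWORDS`, `analyze` and `suggest_analyzed` are hypothetical names, and the stopword list is made up for the example.

```python
STOPWORDS = {"the", "a", "an", "of"}  # illustrative list, not Lucene's default


def analyze(text):
    """Lowercase, split on whitespace, and drop stopwords."""
    return " ".join(t for t in text.lower().split() if t not in STOPWORDS)


def suggest_analyzed(phrases, query):
    """Return phrases whose analyzed form starts with the analyzed query."""
    key = analyze(query)
    return [p for p in phrases if analyze(p).startswith(key)]
```

Here "the fo" is reduced to "fo" before matching, so "foo" is suggested. It does not help when the first word is an ordinary term that simply never co-occurs with the second one in the dictionary, which is the case I am asking about.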


Solr spellchecking fails on sharded query

2012-06-19 Thread Eric Wilson
I have a Solr application that is distributed into 11 shards, using Solr
version 4.0.0.2011.07.26.16.34.16

In the solrconfig.xml for each shard, I have configured a spellcheck
component:


  textSpell
    cn_spell
    company_name_spell
    0.0001
    true
    ./spellchecker_cn_spell


I have built the dictionary for each shard, and verified that each shard
will return suggestions for misspellings. Moreover, it is evident that a
different dictionary is being used for the various shards.

The problem comes when I submit a sharded query. In that case the result
comes back with the following:


  


In other words, the list of words for which there are suggestions is empty.

Is there a trick to sharded spellchecking? I appreciate any suggestions.
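
One setting that may be worth checking is `shards.qt`, the Solr parameter that names the request handler used for the per-shard sub-requests. Whether it explains the empty result here is only a guess. Below is a minimal sketch of building such a request URL in Python; the host names, the `/spell` handler path and the function name are all assumptions, and only the `cn_spell` dictionary name comes from the configuration above.

```python
from urllib.parse import urlencode


def sharded_spellcheck_url(base, shards, query, handler="/spell",
                           dictionary="cn_spell"):
    """Build a distributed spellcheck request URL.

    'base' is the core URL of the node receiving the request, 'shards' is
    a list of host:port/path strings, and 'handler' is assumed to be a
    request handler that includes the spellcheck component on every shard.
    """
    params = {
        "q": query,
        "shards": ",".join(shards),
        "shards.qt": handler,  # route sub-requests to the same handler
        "spellcheck": "true",
        "spellcheck.dictionary": dictionary,
    }
    return f"{base}{handler}?{urlencode(params)}"
```

For example, `sharded_spellcheck_url("http://host1:8080/solr", ["host1:8080/solr", "host2:8080/solr"], "compnay")` yields a URL that carries both `shards` and `shards.qt=/spell`.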

Eric


Solr DataImport failing silently, intermittently

2012-06-22 Thread Eric Wilson
One of our Solr indexes is failing to import successfully on occasion,
without any useful errors being reported.

This is a small index (about 15k documents) in Solr 1.4, deployed in Tomcat
7. It is updated every six minutes using the DataImportHandler. These jobs
are managed by JobScheduler.

Lately it has been failing several times a day. It will perform the "clean"
successfully, not show any errors in the logs, and fail to import any (or
most) of the data. Sometimes when this happens the index ends up with zero
documents, other times there are a handful of documents.
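
Since the Solr messages themselves are all at INFO level, one way to narrow things down may be to filter the wrapper log for lines that did not come from the JVM. A minimal Python sketch, assuming the `LEVEL | source | timestamp | message` layout of the log excerpt in this message; the function names are hypothetical.

```python
def parse_wrapper_log(text):
    """Split wrapper output into (level, source, timestamp, message) tuples.

    Lines that do not have the 'LEVEL | source | timestamp | message'
    layout are treated as continuations of the previous record, because
    long messages are hard-wrapped in the excerpt.
    """
    records = []
    for line in text.splitlines():
        parts = [p.strip() for p in line.split("|", 3)]
        if len(parts) == 4 and parts[0].isupper():
            records.append(parts)
        elif records and line.strip():
            records[-1][3] = (records[-1][3] + " " + line.strip()).strip()
    return [tuple(r) for r in records]


def non_jvm_events(records):
    """Return records whose source is not the JVM, e.g. wrapper STATUS lines."""
    return [r for r in records if not r[1].startswith("jvm")]
```

Running `non_jvm_events(parse_wrapper_log(log_text))` over a full day of output would list every wrapper-level event with its timestamp, which can then be compared against the times of the failed imports.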

Here is an example of the log output of the case where it fails. This log
output is produced by Java Service Wrapper, which we use to maintain our
production Tomcat instances.

INFO   | jvm 1| 2012/06/21 17:33:28 | INFO: [] webapp=/subscription
path=/dataimport params={optimize=true&clean=true&command=full-import}
status=0 QTime=22
INFO   | jvm 1| 2012/06/21 17:33:28 | Jun 21, 2012 5:33:28 PM
org.apache.solr.core.SolrCore execute
INFO   | jvm 1| 2012/06/21 17:33:28 | INFO: [] webapp=/subscription
path=/dataimport params={} status=0 QTime=0
INFO   | jvm 1| 2012/06/21 17:33:28 | Jun 21, 2012 5:33:28 PM
org.apache.solr.handler.dataimport.DataImporter doFullImport
INFO   | jvm 1| 2012/06/21 17:33:28 | INFO: Starting Full Import
INFO   | jvm 1| 2012/06/21 17:33:28 | Jun 21, 2012 5:33:28 PM
org.apache.solr.handler.dataimport.SolrWriter readIndexerProperties
INFO   | jvm 1| 2012/06/21 17:33:28 | INFO: Read dataimport.properties
INFO   | jvm 1| 2012/06/21 17:33:28 | Jun 21, 2012 5:33:28 PM
org.apache.solr.update.DirectUpdateHandler2 deleteAll
INFO   | jvm 1| 2012/06/21 17:33:28 | INFO: [] REMOVING ALL DOCUMENTS
FROM INDEX
INFO   | jvm 1| 2012/06/21 17:33:28 | Jun 21, 2012 5:33:28 PM
org.apache.solr.core.SolrDeletionPolicy onInit
INFO   | jvm 1| 2012/06/21 17:33:28 | INFO: SolrDeletionPolicy.onInit:
commits:num=1
INFO   | jvm 1| 2012/06/21 17:33:28 |
commit{dir=/rpt/src/solr-home/mst/pbl/solr/data/index,segFN=segments_7j,version=1340212196528,generation=271,filenames=[_cj.fdt,
_cj.tii, _cj.fnm, _cj.frq, segments_7j, _cj.nrm, _cj.tis, _cj.prx, _cj.fdx]
INFO   | jvm 1| 2012/06/21 17:33:28 | Jun 21, 2012 5:33:28 PM
org.apache.solr.core.SolrDeletionPolicy updateCommits
INFO   | jvm 1| 2012/06/21 17:33:28 | INFO: newest commit =
1340212196528
INFO   | jvm 1| 2012/06/21 17:33:28 | Jun 21, 2012 5:33:28 PM
org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO   | jvm 1| 2012/06/21 17:33:28 | INFO: Creating a connection for
entity PBL with URL: jdbc:oracle:thin:@ecnext-ecn01:1545:ecn01
STATUS | wrapper  | 2012/06/21 17:33:29 | TERM trapped.  Shutting down.
INFO   | jvm 1| 2012/06/21 17:33:29 | Jun 21, 2012 5:33:29 PM
org.apache.solr.handler.dataimport.JdbcDataSource$1 call
INFO   | jvm 1| 2012/06/21 17:33:29 | INFO: Time taken for
getConnection(): 1187
INFO   | jvm 1| 2012/06/21 17:33:31 | Jun 21, 2012 5:33:31 PM
org.apache.solr.core.SolrCore close
INFO   | jvm 1| 2012/06/21 17:33:31 | INFO: []  CLOSING SolrCore
org.apache.solr.core.SolrCore@57044c5
INFO   | jvm 1| 2012/06/21 17:33:31 | Jun 21, 2012 5:33:31 PM
org.apache.solr.core.SolrCore closeSearcher
INFO   | jvm 1| 2012/06/21 17:33:31 | INFO: [] Closing main searcher on
request.
INFO   | jvm 1| 2012/06/21 17:33:31 | Jun 21, 2012 5:33:31 PM
org.apache.solr.search.SolrIndexSearcher close
INFO   | jvm 1| 2012/06/21 17:33:31 | INFO: Closing Searcher@192425amain
INFO   | jvm 1| 2012/06/21 17:33:31 |
fieldValueCache{lookups=0,hits=0,hitratio=0.00,inserts=0,evictions=0,size=0,warmupTime=0,cumulative_lookups=0,cumulative_hits=0,cumulative_hitratio=0.00,cumulative_inserts=0,cumulative_evictions=0}
INFO   | jvm 1| 2012/06/21 17:33:31 | Jun 21, 2012 5:33:31 PM
org.apache.solr.update.DirectUpdateHandler2 close
INFO   | jvm 1| 2012/06/21 17:33:31 | INFO: closing
DirectUpdateHandler2{commits=0,autocommits=0,optimizes=0,rollbacks=0,expungeDeletes=0,docsPending=41,adds=42,deletesById=0,deletesByQuery=1,errors=0,cumulative_adds=42,cumulative_deletesById=0,cumulative_deletesByQuery=1,cumulative_errors=0}
INFO   | jvm 1| 2012/06/21 17:33:31 | Jun 21, 2012 5:33:31 PM
org.apache.solr.core.SolrDeletionPolicy onCommit
INFO   | jvm 1| 2012/06/21 17:33:31 | INFO:
SolrDeletionPolicy.onCommit: commits:num=2
INFO   | jvm 1| 2012/06/21 17:33:31 |
commit{dir=/rpt/src/solr-home/mst/pbl/solr/data/index,segFN=segments_7j,version=1340212196528,generation=271,filenames=[_cj.fdt,
_cj.tii, _cj.fnm, _cj.frq, segments_7j, _cj.nrm, _cj.tis, _cj.prx, _cj.fdx]
INFO   | jvm 1| 2012/06/21 17:33:31 |
commit{dir=/rpt/src/solr-home/mst/pbl/solr/data/index,segFN=segments_7k,version=1340212196529,generation=272,filenames=[_ck.nrm,
_ck.prx, _ck.fdt, _ck.fnm, _ck.tis, _ck.fdx, segments_7k, _ck.tii, _ck.frq]
INFO   | jvm 1| 2012/06/21 17:33:31 | Jun 21, 2012 5:33:31 PM
org.apache.solr.core.SolrDeletionPolicy updateCommi