RE: bi-grams for common terms - any analyzers do that?

2010-09-24 Thread Dennis Gearon
I'm looking for doing CJK applications by mid next year, also Euro/Russian. Are the analyzers for all those up and running? Dennis Gearon Signature Warning EARTH has a Right To Life, otherwise we all die. Read 'Hot, Flat, and Crowded' Laugh at http://www.yert.com/film.php

Re: matches in result grouping

2010-09-24 Thread Koji Sekiguchi
Correct. The "matches" is the doc count before any grouping (and for field.query that means before the restriction given by field.query is applied). It won't always be the same though - for example we might implement filter excludes like we do with faceting, etc. -Yonik http://lucenerevolutio

Re: Localsolr with Dismax

2010-09-24 Thread gearond
Ever get this working? -- View this message in context: http://lucene.472066.n3.nabble.com/Localsolr-with-Dismax-tp1402956p1578050.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: LocalSolr unknown handler: geo

2010-09-24 Thread gearond
Actually, I'm doing this from Nabble and getting used to that. Sorry for back to back posts. If I wanted to filter on a distance from a supplied point, and also return a field which is the distance from the point, which: 1/ Lucene version should I use? 2/ Which GIS code should I use? -- View th

Re: LocalSolr unknown handler: geo

2010-09-24 Thread gearond
I also wonder where the mailing list for localsolr is. TIA -- View this message in context: http://lucene.472066.n3.nabble.com/LocalSolr-unknown-handler-geo-tp1572964p1578030.html Sent from the Solr - User mailing list archive at Nabble.com.

RE: bi-grams for common terms - any analyzers do that?

2010-09-24 Thread Andy
--- On Thu, 9/23/10, Burton-West, Tom wrote: > It also splits on whitespace which causes all CJK queries > to be treated as phrase queries regardless of the CJK > tokenizer you use. But I thought specialized analyzers like CJKAnalyzer are designed for those languages, which don't use whitespa

Re: Calculating distances in Solr using longitude latitude

2010-09-24 Thread Dennis Gearon
hmm, So what works? I 'only' need: 1/ sorting by distance (maybe, probably could be avoided in Solr) 2/ filtering by max distance, or at least a bounding box 3/ a pseudo field - distance from given point for each returned result (REALLY a must). Any of this available now? Ba
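For reference, the three requirements listed above (sort by distance, filter by maximum distance, and return the distance as a per-result value) can be sketched outside Solr. This is only a client-side illustration using the haversine great-circle formula, not how Solr or LocalSolr implements spatial search; the document layout (`lat`/`lon` keys) and the function names are assumptions made up for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an assumption of this sketch


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points (degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def within_distance(origin, docs, max_km):
    """Return (doc, distance_km) pairs no farther than max_km, nearest first.

    `docs` is a list of dicts with hypothetical "lat" and "lon" keys.
    """
    lat0, lon0 = origin
    hits = []
    for doc in docs:
        d = haversine_km(lat0, lon0, doc["lat"], doc["lon"])
        if d <= max_km:
            hits.append((doc, d))  # the distance plays the role of the "pseudo field"
    hits.sort(key=lambda pair: pair[1])
    return hits
```

Computing the distance for every document does not scale to a large index, which is why a bounding-box or tile filter is normally applied first, as the thread suggests.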

Re: Data Import Handler Rich Format Documents

2010-09-24 Thread Dennis Gearon
What's a GA release? --- On Fri, 9/24/10, Lance Norskog wrote: > From: Lance Norskog > Subject: Re: Data Import Handler Ric

Re: Calculating distances in Solr using longitude latitude

2010-09-24 Thread Lance Norskog
As it turns out, the problem is not trivial in general, and shoehorning it into an existing search system nicely is also not simple. I think the current spatial stuff is the third go-round on doing GIS in Lucene/Solr. PeterKerk wrote: It would be such a shame if there's no way to get it now al

Re: Data Import Handler Rich Format Documents

2010-09-24 Thread Lance Norskog
The TikaEntityProcessor is the class in the DIH that calls the Tika libraries. TikaEntityProcessor is not in Solr 1.4 or 1.4.1. It is in the trunk and the 3.x branch. I have set it up from the 3.x branch. I discovered that the "DefaultParser" does not work, and you have to explicitly name the

Re: Solr Highlighting Question

2010-09-24 Thread Koji Sekiguchi
(10/09/25 8:07), Jed Glazner wrote: Hi Koji, I'm trying to get the FVH to work per your suggestion, but I think I must have something misconfigured... Here is the field def in my schema.xml: Then here is my request handler in solrconfig.xml: edismax name_title^3 plain^6 grams^1 sou

Re: LocalSolr unknown handler: geo

2010-09-24 Thread Lance Norskog
In answer to your actual question: "geo" is a "request handler" that is configured in solrconfig.xml. I don't know what it needs. The LocalSolr stuff should supply samples of how to change solrconfig.xml and schema.xml. Lance PeterKerk wrote: I've configured LocalSolr according to description

Re: Solr Highlighting Question

2010-09-24 Thread Jed Glazner
Hi Koji, I'm trying to get the FVH to work per your suggestion, but I think I must have something misconfigured... Here is the field def in my schema.xml: Then here is my request handler in solrconfig.xml:

RE: Autocomplete: match words anywhere in the token

2010-09-24 Thread Jonathan Rochkind
I'm pretty sure under the algorithm that Chantal describes, if you use a multi-valued field for matching, you're going to get results in your auto-suggest that are in the same document with things that matched your entry, but don't actually match your entry themselves. Chantal seemed to confirm

Re: LocalSolr unknown handler: geo

2010-09-24 Thread Dennis Gearon
We'll be able to 'kick the tires' in about 3 weeks, sorry for that delay. But at that time, we'll be able to start using geo spatial in combination with all the other solr stuff. Not having the pseudo field is a REAL BUMMER, though, since I want to display distance from the results. Is there a

Re: LocalSolr unknown handler: geo

2010-09-24 Thread Grant Ingersoll
I'd say it's about 85-90% there, but of course, you don't know what you don't know. Biggest thing missing now is "pseudo-fields", IMO, i.e. the ability to return the distance as a field. Next after that is some type of tile/grid approach, but that's less important given we have Trie fields for

Highlighting splitting HTML named entities

2010-09-24 Thread Andrew Cogan
We're using SOLR 1.4.1 with the highlighting formatter org.apache.solr.highlight.HtmlFormatter. Is there a way to configure the rules it uses for determining token boundaries? We're getting highlight markup inserted into the middle of HTML named entities. For example, if the user searches for "

Re: upgrade index from 2.9 to 3.x

2010-09-24 Thread mike anderson
Thanks. I found the Jars for Lucene 3.0.2, but for the life of me I can't figure out how to compile solr against that version. Is there a parameter I can pass to ant that tells it which version to use? Should I just dump everything Lucene related into the 'lib' folder? -mike On Fri, Sep 24, 2010 at

Re: LocalSolr unknown handler: geo

2010-09-24 Thread Dennis Gearon
How functional is spatial in 3.x? --- On Fri, 9/24/10, Grant Ingersoll wrote: > From: Grant Ingersoll > Subject: Re: LocalSo

FieldType for storing date

2010-09-24 Thread Jibo John
Hello, I was wondering what would be the best FieldType for storing date with a millisecond precision that would allow me to sort and run range queries against this field. We would like to achieve the best query performance, minimal heap - fieldcache - requirements, good indexing throughput an

Re: possible to have uniqueKey to be type long?

2010-09-24 Thread Lin Bin Chen
remove QueryElevationComponent in solrconfig.xml 2010/9/24 Andy > I have a uniqueKey "id". I want to have id of the type long. So I changed > my schema.xml to have: > > /> > > When I tried to index data, I got the error: > > Severe errors in solr configuration. > Check your log files for more

RE: Autocomplete: match words anywhere in the token

2010-09-24 Thread Jonathan Rochkind
Chantal Ackermann wrote: "I definitely need to have a look at how to use facetting in combination with multivalued fields for autocomplete." My one kind of crazy idea is to (ab)use the Highlighting Component. If you make a query for auto-suggest terms based on facets, using Chantal's technique, b

Re: Help need in setting up delta imports

2010-09-24 Thread Alexey Serba
Your example doesn't mention deleting Employee. Is this a valid use case? If not then you can simplify things: query="SELECT name, address from employee where endtimestamp is null" deltaQuery= "SELECT DISTINCT name FROM employee WHERE eventtimestamp > '${dataimporter.last_index_time}' " d

Re: upgrade index from 2.9 to 3.x

2010-09-24 Thread Markus Jelsma
There is a recent thread on this one http://www.mail-archive.com/solr-user@lucene.apache.org/msg40491.html On Friday 24 September 2010 16:30:36 mike anderson wrote: > What is the right way to upgrade a solr index from Lucene 2.9.1 to 3.x. I'm > getting the exception: > > SEVERE: java.lang.Runtim

upgrade index from 2.9 to 3.x

2010-09-24 Thread mike anderson
What is the right way to upgrade a solr index from Lucene 2.9.1 to 3.x. I'm getting the exception: SEVERE: java.lang.RuntimeException: org.apache.lucene.index.IndexFormatTooOldException: Format version is not supported in file '_aw5w.fdx': 1 (needs to be between 2 and 2). This version of Lucene on

possible to have uniqueKey to be type long?

2010-09-24 Thread Andy
I have a uniqueKey "id". I want to have id of the type long. So I changed my schema.xml to have: When I tried to index data, I got the error: Severe errors in solr configuration. Check your log files for more detailed information on what may be wrong. If you want solr to continue after configu

Re: AbstractMethodError with lucid KStem

2010-09-24 Thread Bernd Fehling
Hi Yonik, which is the most recent solr version working with lucid KStem? I started with lucidworks for solr but tika for PDFs was way too old and made problems so I had to change to trunk with tika 0.8 SNAPSHOT. Runs perfect but now the problem with KStem... What is your advice? Regards, Bernd

Re: AbstractMethodError with lucid KStem

2010-09-24 Thread Yonik Seeley
On Fri, Sep 24, 2010 at 9:39 AM, Bernd Fehling wrote: > I tried using lucid KStem with solr trunk version but get AbstractMethodError. That hasn't been ported to trunk yet. -Yonik http://lucenerevolution.org Lucene/Solr Conference, Boston Oct 7-8

AbstractMethodError with lucid KStem

2010-09-24 Thread Bernd Fehling
I tried using lucid KStem with solr trunk version but get AbstractMethodError. schema.xml: ... ... solr/lib/ has lucid-kstem.jar and lucid-solr-kstem.jar. When using /

Re: Solr UIMA integration

2010-09-24 Thread Tommaso Teofili
Hi Maheshkumar, I never had this one before, which version of UIMA dependencies (uima-core, AlchemyAPIAnnotator, OpenCalaisAnnotator, Tagger, WhitespaceTokenizer) are you using? It should be 2.3.1-SNAPSHOT. Which version of Solr? It seems that there is a problem in Tagger reading its model (to gene

Re: Solr UIMA integration

2010-09-24 Thread maheshkumar
I have configured solr and uima as described by you. I have the following dependency jars also AlchemyAPIAnnotator.jar commons-beanutils-1.7.0.jar commons-digester-2.0.jar commons-lang-2.4.jar OpenCalaisAnnotator.jar slf4j-api-1.5.5.jar slf4j-jdk14-1.5.5.jar solr-uima.jar Tagger.jar uima-core.jar

Re: Autocomplete: match words anywhere in the token

2010-09-24 Thread Arunkumar Ayyavu
I realize that Solr has facet.mincount and I believe I could use that. Sorry for writing too many mails without much study. On Fri, Sep 24, 2010 at 5:57 PM, Arunkumar Ayyavu < arunkumar.ayy...@gmail.com> wrote: > > > On Fri, Sep 24, 2010 at 5:22 PM, Arunkumar Ayyavu < > arunkumar.ayy...@gmail.co

Re: LocalSolr unknown handler: geo

2010-09-24 Thread PeterKerk
I tried looking for a place where I could ask those questions, but where can I find the LocalSolr mailing list you are referring to? Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/LocalSolr-unknown-handler-geo-tp1572964p1573583.html Sent from the Solr - User mailing

Re: Autocomplete: match words anywhere in the token

2010-09-24 Thread Arunkumar Ayyavu
On Fri, Sep 24, 2010 at 5:22 PM, Arunkumar Ayyavu < arunkumar.ayy...@gmail.com> wrote: > Thanks. That was cool. > > Now, I'm looking into another problem. Say, I search for sony, I don't want > to see all that starts with sony. Only when I type more text, say sony slr, > I want to see those entrie

Re: LocalSolr unknown handler: geo

2010-09-24 Thread Grant Ingersoll
You're best off to ask on the Local Solr mailing list, or have a look at Solr's 3.x and trunk branches, which have spatial baked in (although it is still improving) On Sep 24, 2010, at 5:57 AM, PeterKerk wrote: > > I've configured LocalSolr according to description on this page: > http://www.g

Re: Data Import Handler Rich Format Documents

2010-09-24 Thread Tod
On 9/23/2010 6:52 AM, mehdi.es...@gmail.com wrote: Hi, I have exactly the same problem as the one you submitted in this link http://lucene.472066.n3.nabble.com/Data-Import-Handler-Rich-Format-Documents-td905478.html and I would like to ask you if you got a solution for that. I started to have

Re: Autocomplete: match words anywhere in the token

2010-09-24 Thread Peter Karich
Jonathan, this field described here from Chantal: > 2.) create an additional field that stores uses the > String type with the same content (use copy field to fill either) can be multivalued. Or what did you mean? BTW: The nice thing about facet.prefix is that you can add an arbitrary (filter)
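The facet.prefix approach discussed in this thread can be illustrated by the request parameters it relies on. The sketch below only builds a query string; `facet`, `facet.field`, `facet.prefix`, `facet.mincount` and `facet.limit` are standard Solr faceting parameters, while the field name `suggest_str`, the base URL and the helper names are assumptions invented for the example.

```python
from urllib.parse import urlencode


def autocomplete_params(prefix, field="suggest_str", min_count=1, limit=10):
    """Parameters for a facet-based autocomplete request.

    `field` is a hypothetical untokenized (String type) copy of the
    suggestion text, as proposed in the thread.
    """
    return [
        ("q", "*:*"),            # match everything; filters can be added via fq
        ("rows", 0),             # only the facet counts are needed, not documents
        ("facet", "true"),
        ("facet.field", field),
        ("facet.prefix", prefix),        # keep only values starting with the typed text
        ("facet.mincount", min_count),   # hide values that occur in no matching document
        ("facet.limit", limit),
    ]


def autocomplete_url(base_url, prefix, **kwargs):
    """Build the full request URL; `base_url` is an assumed select handler URL."""
    return base_url + "?" + urlencode(autocomplete_params(prefix, **kwargs))
```

For example, `autocomplete_url("http://localhost:8983/solr/select", "sony sl")` yields a URL whose facet section asks only for values beginning with "sony sl". Any additional restriction would go in as extra `fq` parameters, which is the advantage Peter mentions.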

Re: Autocomplete: match words anywhere in the token

2010-09-24 Thread Arunkumar Ayyavu
On Thu, Sep 23, 2010 at 1:57 PM, Chantal Ackermann < chantal.ackerm...@btelligent.de> wrote: > On Wed, 2010-09-22 at 20:14 +0200, Arunkumar Ayyavu wrote: > > Thanks for the responses. Now, I included the EdgeNGramFilter. But, I get > > the following results when I search for "canon pixma". > > Can
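As background for the EdgeNGramFilter mentioned here, the following is a minimal sketch of what front edge n-grams of a token look like. It is not Solr's implementation; the function names and the whitespace tokenization are assumptions for illustration, with `min_gram` and `max_gram` playing the role of the filter's minimum and maximum gram sizes.

```python
def edge_ngrams(token, min_gram=1, max_gram=25):
    """Return the prefixes of `token` with lengths from min_gram to max_gram."""
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]


def index_terms(text, min_gram=1, max_gram=25):
    """Edge n-grams of every whitespace-separated, lowercased word in `text`."""
    terms = []
    for word in text.lower().split():
        terms.extend(edge_ngrams(word, min_gram, max_gram))
    return terms
```

Because each word is expanded separately, a query term such as "pix" matches an entry containing "pixma" anywhere in the text, which is the "match words anywhere" behaviour this thread is after. It also means a multi-word query can match entries where the words appear in any order.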

Re: Autocomplete: match words anywhere in the token

2010-09-24 Thread Arunkumar Ayyavu
Thanks. That was cool. Now, I'm looking into another problem. Say, I search for sony, I don't want to see all that starts with sony. Only when I type more text, say sony slr, I want to see those entries starting with sony slr. Let me see if I can find the answer soon. On Thu, Sep 23, 2010 at 1:50

Re: Can Solr do approximate matching?

2010-09-24 Thread Markus Jelsma
The mlt handler isn't present in the default solrconfig.xml, but the component is. You can simply register the component and add it to your request handler. See how it's done in the configuration http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/example/solr/conf/solrconfig.xml On Friday 24

Re: Search a URL

2010-09-24 Thread Markus Jelsma
WordDelimiterFilter On Friday 24 September 2010 02:42:52 Dennis Gearon wrote: > WDF is not WTF(what I think when I see WDF), right ;-) > > What is WDF?

Re: Autocomplete: match words anywhere in the token

2010-09-24 Thread Chantal Ackermann
Hi Jonathan, yes it works only for single-valued fields without great effort. For multivalued fields you'd have to do some extra work getting only the values which contain tokens that start with the given prefix. But maybe you mean also whether it works for several fields in one query. I guess not,

LocalSolr unknown handler: geo

2010-09-24 Thread PeterKerk
I've configured LocalSolr according to the description on this page: http://www.gissearch.com/localsolr I use Solr 1.4.1. I've compiled the LocalSolr jars and put them in the Solr lib: \apache-solr-1.4.1\example\solr\lib The lib folder didn't exist so I created it and put the LocalSolr jars in there

Re: WordDelimiterFilter combined with PositionFilter

2010-09-24 Thread Robert Muir
On Fri, Sep 24, 2010 at 3:54 AM, Mathias Walter wrote: > Hi, > > I've combined the WordDelimiterFilter with the PositionFilter to prevent the > creation of expensive Phrase and MultiPhraseQueries. But > if I now parse an escaped string consisting of two terms, the analyser > returns a BooleanQuery.

Re: Range query not working

2010-09-24 Thread PeterKerk
It works! :) @Jonathan: Indeed, I'm using Solr1.4.1 example schema. I have now added: in schema.xml changed relevant fields in schema.xml from "integer" to "int" Thanks! -- View this message in context: http://lucene.472066.n3.nabble.com/Range-query-not-working-tp1570324p1572605.html Sent

WordDelimiterFilter combined with PositionFilter

2010-09-24 Thread Mathias Walter
Hi, I've combined the WordDelimiterFilter with the PositionFilter to prevent the creation of expensive Phrase and MultiPhraseQueries. But if I now parse an escaped string consisting of two terms, the analyser returns a BooleanQuery. That's not what I would expect. If a string is escaped, I would