Re: Using Chinese / How to ?

2009-06-02 Thread James liu
Do you mean how to configure Solr to support Chinese?

Or is it an update problem?

On Tuesday, June 2, 2009, Fer-Bj  wrote:
>
> I'm sending 3 files:
> - schema.xml
> - solrconfig.xml
> - error.txt (with the error description)
>
> I can confirm by now that this error is due to invalid characters for the
> XML format (ASCII 0 or 11).
> However, this problem is now taking a different direction: how to start
> using CJK analysis instead of the English one!
> http://www.nabble.com/file/p23825881/error.txt error.txt
> http://www.nabble.com/file/p23825881/solrconfig.xml solrconfig.xml
> http://www.nabble.com/file/p23825881/schema.xml schema.xml
>
>
> Grant Ingersoll-6 wrote:
>>
>> Can you provide details on the errors?  I don't think we have a
>> specific how to, but I wouldn't think it would be much different from
>> 1.2
>>
>> -Grant
>> On May 31, 2009, at 10:31 PM, Fer-Bj wrote:
>>
>>>
>>> Hello,
>>>
>>> is there any "how to" already created to get me up using SOLR 1.3
>>> running
>>> for a chinese based website?
>>> Currently our site is using SOLR 1.2, and we tried to move into 1.3
>>> but we
>>> couldn't complete our reindex as it seems like 1.3 is more strict
>>> when it
>>> comes to special chars.
>>>
>>> I would appreciate any help anyone may provide on this.
>>>
>>> Thanks!!
>>> --
>>> View this message in context:
>>> http://www.nabble.com/Using-Chinese---How-to---tp23810129p23810129.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>
>>
>> --
>> Grant Ingersoll
>> http://www.lucidimagination.com/
>>
>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
>> using Solr/Lucene:
>> http://www.lucidimagination.com/search
>>
>>
>>
>
> --
> View this message in context: 
> http://www.nabble.com/Using-Chinese---How-to---tp23810129p23825881.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>

-- 
regards
j.L ( I live in Shanghai, China)


Re: Solr multiple keyword search as google

2009-06-02 Thread James liu
You can find the answer in the tutorial and the examples.

On Tuesday, June 2, 2009, The Spider  wrote:
>
> Hi,
>    I am using a Solr nightly build for my search.
> I have to search in the location field of the table which is not my default
> search field.
> I will briefly explain my requirement below:
> I want to get the same/similar result when I give location multiple
> keywords, say  "San jose ca USA"
> or "USA ca san jose" or "CA San jose USA" (like that of google search). That
> means even if I rearranged the keywords of location I want to get proper
> results. Is there any way to do that?
> Thanks in advance
> --
> View this message in context: 
> http://www.nabble.com/Solr-multiple-keyword-search-as-google-tp23826278p23826278.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>

-- 
regards
j.L ( I live in Shanghai, China)
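Beyond the tutorial pointer, the short answer to the question is that order-independent keyword matching is what a tokenized text field already gives you: each keyword becomes a term, and an AND (or dismax) query requires all terms to be present regardless of their order. A toy sketch of that behavior (the field and query strings below are illustrative, not from a real index):

```python
def matches_all_terms(field_value, query):
    # Order-independent match: every query token must appear somewhere in
    # the field, which is what a tokenized field plus an AND query gives you.
    field_tokens = set(field_value.lower().split())
    return all(t in field_tokens for t in query.lower().split())

loc = "San Jose CA USA"
# All permutations of the keywords match the same location:
print(matches_all_terms(loc, "San jose ca USA"))   # True
print(matches_all_terms(loc, "USA ca san jose"))   # True
```

With the standard text field type in schema.xml, Solr's analyzer does the lowercasing and tokenizing for you at index and query time.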


Solritas Problem with "faces"

2009-06-02 Thread Jörg Agatz
Hi... I have a problem with facets in Solritas.

I search for "Ipod" or "plesnik" and the facet counts say:

(PDF) 39
(TXT) 109
(DOC) 1200

When I click on PDF, I want to see the 39 PDFs with the keyword "plesnik",
but I get more than 800, which is all the PDFs in the index.
Is this a bug or a feature?
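What the facet click should do is add the clicked value as a filter query on top of the original keyword query; if the fq is sent without the original q, you get every PDF in the index, which matches the counts above. A hedged sketch of the drill-down request (the host, handler path, and field name `extension` are assumptions based on this thread):

```python
from urllib.parse import urlencode

params = {
    "q": "plesnik",            # keep the original keyword query
    "facet": "true",
    "facet.field": "extension",
    "fq": "extension:PDF",     # drill-down filter from the clicked facet value
}
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

If the link behind the facet value drops the q parameter, the fq alone selects all PDFs, which would explain the 800+ results.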


Phrase query search returns no result

2009-06-02 Thread SergeyG

Hi,

I'm trying to implement a full-text search but can't get the right result
with a Phrase query search. The field I search through was indexed as a
"text" field. The phrase was "It was as long as a tree". During both
indexing and searching the StopWordsFilter was on. For a search I used these
settings: 

  
   dismax
   explicit
   
  title author category content
   
   
  id,title,author,isbn,category,content,score
   
   100
   content
  


But the returned docs list was empty. Using the Solr Admin console for
debugging showed that parsedquery=+() ().
Switching the StopwordsFilter off during searching didn't help either. 

Am I missing something?

Thanks,
Sergey
-- 
View this message in context: 
http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23833024.html
Sent from the Solr - User mailing list archive at Nabble.com.



NPE in dataimport.DebugLogger.peekStack (DIH Development Console)

2009-06-02 Thread Steffen B.

Hi,
I'm trying to debug my DI config on my Solr server and it constantly fails
with a NullPointerException:
Jun 2, 2009 4:20:46 PM org.apache.solr.handler.dataimport.DataImportHandler
processConfiguration
INFO: Processing configuration from solrconfig.xml: {config=dataconfig.xml}
Jun 2, 2009 4:20:46 PM org.apache.solr.handler.dataimport.DataImporter
loadDataConfig
INFO: Data Configuration loaded successfully
Jun 2, 2009 4:20:46 PM org.apache.solr.handler.dataimport.DataImporter
verifyWithSchema
INFO: id is a required field in SolrSchema . But not found in DataConfig
Jun 2, 2009 4:20:46 PM org.apache.solr.handler.dataimport.SolrWriter
readIndexerProperties
INFO: Read dataimport.properties
Jun 2, 2009 4:20:46 PM org.apache.solr.handler.dataimport.DataImporter
doFullImport
INFO: Starting Full Import
Jun 2, 2009 4:20:46 PM org.apache.solr.handler.dataimport.SolrWriter
readIndexerProperties
INFO: Read dataimport.properties
Jun 2, 2009 4:20:46 PM org.apache.solr.handler.dataimport.DataImporter
doFullImport
SEVERE: Full Import failed
java.lang.NullPointerException
at
org.apache.solr.handler.dataimport.DebugLogger.peekStack(DebugLogger.java:78)
at
org.apache.solr.handler.dataimport.DebugLogger.log(DebugLogger.java:98)
at
org.apache.solr.handler.dataimport.SolrWriter.log(SolrWriter.java:248)
at
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:304)
at
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:224)
at
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:167)
at
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:316)
at
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:376)
at
org.apache.solr.handler.dataimport.DataImportHandler.handleRequestBody(DataImportHandler.java:187)
at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:1328)
at
org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:244)
at
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:215)
at
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:188)
at
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:213)
at
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:174)
at
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:127)
at
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:117)
at
org.apache.catalina.valves.RequestFilterValve.process(RequestFilterValve.java:276)
at
org.apache.catalina.valves.RemoteHostValve.invoke(RemoteHostValve.java:81)
at
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:108)
at
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:151)
at
org.apache.coyote.http11.Http11AprProcessor.process(Http11AprProcessor.java:834)
at
org.apache.coyote.http11.Http11AprProtocol$Http11ConnectionHandler.process(Http11AprProtocol.java:640)
at
org.apache.tomcat.util.net.AprEndpoint$Worker.run(AprEndpoint.java:1286)
at java.lang.Thread.run(Thread.java:619)

Running a normal full-import works just fine, but whenever I try to run the
debugger, it gives me this error. I'm using the most recent Solr nightly
build (2009-06-01) and the method in question is:
private DebugInfo peekStack() {
return debugStack.isEmpty() ? null : debugStack.peek();
}
I'm using a DI config that has been working fine for several previous
builds, so that shouldn't be the problem... any ideas what the problem could
be?
Thanks in advance,
Steffen
-- 
View this message in context: 
http://www.nabble.com/NPE-in-dataimport.DebugLogger.peekStack-%28DIH-Development-Console%29-tp23833878p23833878.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Phrase query search returns no result

2009-06-02 Thread Otis Gospodnetic

Your stopwords were removed during indexing, so if all those terms were 
stopwords, and they likely were, none of them exist in the index now.  You can 
double-check that with Luke.  You need to remove stopwords from the index-time 
analyzer, too, and then reindex.
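The mechanics can be seen with a toy analyzer: whatever the index-time chain removes never reaches the index, so a phrase query built from the full sentence asks for term positions no document can satisfy. (The stopword set below is an assumed subset of the stock stopwords.txt, for illustration only.)

```python
STOPWORDS = {"it", "was", "as", "a", "an", "the"}  # assumed subset of stopwords.txt

def analyze(text, stopwords):
    # Toy index-time analyzer: lowercase, whitespace-split, drop stopwords.
    return [t for t in text.lower().split() if t not in stopwords]

indexed = analyze("It was as long as a tree", STOPWORDS)
print(indexed)  # only the non-stopword terms survive: ['long', 'tree']
```

For the phrase to match, the query-side analyzer must drop the same words (or a slop must bridge the position gaps the removed words leave behind), and the index must have been built with the same chain.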

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: SergeyG 
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 2, 2009 9:57:17 AM
> Subject: Phrase query search returns no result
> 
> 
> Hi,
> 
> I'm trying to implement a full-text search but can't get the right result
> with a Phrase query search. The field I search through was indexed as a
> "text" field. The phrase was "It was as long as a tree". During both
> indexing and searching the StopWordsFiler was on. For a search I used these
> settings: 
> 
>   
>   dismax
>   explicit
>   
>   title author category content
>   
>   
>   id,title,author,isbn,category,content,score
>   
>   100
>   content
>   
> 
> 
> But I the returned docs list was empty. Using Solr Admin console for
> debugging showed that parsedquery=+() ().
> Switching the StopwordsFilter off during searching didn't help either. 
> 
> Am I missing something?
> 
> Thanks,
> Sergey
> -- 
> View this message in context: 
> http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23833024.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Using SolrJ with multicore/shards

2009-06-02 Thread ahammad

Hello,

I have a MultiCore install of solr with 2 cores with different schemas and
such. Querying directly using http request and/or the solr interface works
very well for my purposes.

I want to have a proper search interface though, so I have some code that
basically acts as a link between the server and the front-end. Basically,
depending on the options, the search string is built, and when the search is
submitted, that string gets passed as an http request. The code then would
parse through the xml to get the information.

This method works with shards because I can add the shards parameter
straight into the link that I end up hitting. Although this is currently
functional, I was thinking of using SolrJ simply because it is simpler to
use and would cut down the amount of code.

The question is, how would I be able to define the shards in my query, so
that when I do search, I hit both shards and get mixed results back? Using
http requests, it's as simple as adding a shard=core0,core1 snippet. What is
the equivalent of this in SolrJ?

BTW, I do have some SolrJ code that is able to query and return results, but
for a single core. I am currently using CommonsHttpSolrServer for that, not
the Embedded one.

Cheers
-- 
View this message in context: 
http://www.nabble.com/Using-SolrJ-with-multicore-shards-tp23834518p23834518.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Phrase query search returns no result

2009-06-02 Thread SergeyG

Thanks, Otis. 

Checking for the stop words was the first thing I did after getting the
empty result. Not all of those words are in the stopwords.txt file. Then,
just to experiment, I commented out the stopwords filter during
indexing and reindexed. But the phrase was still not found.

Sergey


Otis Gospodnetic wrote:
> 
> 
> Your stopwords were removed during indexing, so if all those terms were
> stopwords, and they likely were, none of them exist in the index now.  You
> can double-check that with Luke.  You need to remove stopwords from the
> index-time analyzer, too, and then reindex.
> 
>  Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
>> From: SergeyG 
>> To: solr-user@lucene.apache.org
>> Sent: Tuesday, June 2, 2009 9:57:17 AM
>> Subject: Phrase query search returns no result
>> 
>> 
>> Hi,
>> 
>> I'm trying to implement a full-text search but can't get the right result
>> with a Phrase query search. The field I search through was indexed as a
>> "text" field. The phrase was "It was as long as a tree". During both
>> indexing and searching the StopWordsFiler was on. For a search I used
>> these
>> settings: 
>> 
>>   
>>   dismax
>>   explicit
>>   
>>   title author category content
>>   
>>   
>>   id,title,author,isbn,category,content,score
>>   
>>   100
>>   content
>>   
>> 
>> 
>> But I the returned docs list was empty. Using Solr Admin console for
>> debugging showed that parsedquery=+() ().
>> Switching the StopwordsFilter off during searching didn't help either. 
>> 
>> Am I missing something?
>> 
>> Thanks,
>> Sergey
>> -- 
>> View this message in context: 
>> http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23833024.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23834693.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Phrase query search returns no result

2009-06-02 Thread Otis Gospodnetic

And "your phrase here"~100 works?

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: SergeyG 
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 2, 2009 11:17:23 AM
> Subject: Re: Phrase query search returns no result
> 
> 
> Thanks, Otis. 
> 
> Checking for the stop words was the first thing I did after getting the
> empty result. Not all of those words are in the stopwords.txt file. Then
> just for experimenting purposes I commented out the StopWordsAnalyser during
> indexing and reindexed. But the phrase was not found again.
> 
> Sergey
> 
> 
> Otis Gospodnetic wrote:
> > 
> > 
> > Your stopwords were removed during indexing, so if all those terms were
> > stopwords, and they likely were, none of them exist in the index now.  You
> > can double-check that with Luke.  You need to remove stopwords from the
> > index-time analyzer, too, and then reindex.
> > 
> >  Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > 
> > 
> > 
> > - Original Message 
> >> From: SergeyG 
> >> To: solr-user@lucene.apache.org
> >> Sent: Tuesday, June 2, 2009 9:57:17 AM
> >> Subject: Phrase query search returns no result
> >> 
> >> 
> >> Hi,
> >> 
> >> I'm trying to implement a full-text search but can't get the right result
> >> with a Phrase query search. The field I search through was indexed as a
> >> "text" field. The phrase was "It was as long as a tree". During both
> >> indexing and searching the StopWordsFiler was on. For a search I used
> >> these
> >> settings: 
> >> 
> >>  
> >>   dismax
> >>   explicit
> >>  
> >>   title author category content
> >>  
> >>  
> >>   id,title,author,isbn,category,content,score
> >>  
> >>   100
> >>   content
> >>  
> >> 
> >> 
> >> But I the returned docs list was empty. Using Solr Admin console for
> >> debugging showed that parsedquery=+() ().
> >> Switching the StopwordsFilter off during searching didn't help either. 
> >> 
> >> Am I missing something?
> >> 
> >> Thanks,
> >> Sergey
> >> -- 
> >> View this message in context: 
> >> 
> http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23833024.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> > 
> > 
> > 
> 
> -- 
> View this message in context: 
> http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23834693.html
> Sent from the Solr - User mailing list archive at Nabble.com.



spell checking

2009-06-02 Thread Yao Ge

Can someone help by providing a tutorial-like introduction on how to get
spell checking working in Solr? It appears many steps are required before the
spell-checking functions can be used. It also appears that a dictionary (a
list of correctly spelled words) is required to set up the spell checker. Can
anyone confirm my impression?

Thanks.
-- 
View this message in context: 
http://www.nabble.com/spell-checking-tp23835427p23835427.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: NPE in dataimport.DebugLogger.peekStack (DIH Development Console)

2009-06-02 Thread Shalin Shekhar Mangar
On Tue, Jun 2, 2009 at 8:06 PM, Steffen B. wrote:

>
> I'm trying to debug my DI config on my Solr server and it constantly fails
> with a NullPointerException:
> Jun 2, 2009 4:20:46 PM org.apache.solr.handler.dataimport.DataImporter
> doFullImport
> SEVERE: Full Import failed
> java.lang.NullPointerException
>at
>
> org.apache.solr.handler.dataimport.DebugLogger.peekStack(DebugLogger.java:78)
>at
> org.apache.solr.handler.dataimport.DebugLogger.log(DebugLogger.java:98)
>at
> org.apache.solr.handler.dataimport.SolrWriter.log(SolrWriter.java:248)
>at...
>
> Running a normal full-import works just fine, but whenever I try to run the
> debugger, it gives me this error. I'm using the most recent Solr nightly
> build (2009-06-01) and the method in question is:
> private DebugInfo peekStack() {
>return debugStack.isEmpty() ? null : debugStack.peek();
> }
> I'm using a DI config that has been working fine in for several previous
> builds, so that shouldn't be the problem... any ideas what the problem
> could
> be?



A previous commit to change the EntityProcessor API broke this
functionality. I'll open an issue and give a patch.

-- 
Regards,
Shalin Shekhar Mangar.


Re: spell checking

2009-06-02 Thread Grant Ingersoll

Have you gone through: http://wiki.apache.org/solr/SpellCheckComponent
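Once the component is configured and its dictionary built (per the wiki page above), a spellcheck request is just a few extra parameters on an ordinary query. A hedged sketch of the usual ones (parameter names follow the SpellCheckComponent documentation; the /spell handler path and host are assumptions):

```python
from urllib.parse import urlencode

params = {
    "q": "spel chek",
    "spellcheck": "true",           # turn the component on for this request
    "spellcheck.collate": "true",   # ask for a rewritten whole query
    "spellcheck.count": "5",        # suggestions to return per misspelled term
}
url = "http://localhost:8983/solr/spell?" + urlencode(params)
print(url)
```

The dictionary can be built from a field of the index itself rather than a hand-made word list, which answers part of the original question.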


On Jun 2, 2009, at 8:50 AM, Yao Ge wrote:



Can someone help providing a tutorial like introduction on how to get
spell-checking work in Solr. It appears many steps are requires  
before the
spell-checkering functions can be used. It also appears that a  
dictionary (a
list of correctly spelled words) is required to setup the spell  
checker. Can

anyone validate my impression?

Thanks.
--
View this message in context: 
http://www.nabble.com/spell-checking-tp23835427p23835427.html
Sent from the Solr - User mailing list archive at Nabble.com.



--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Search combination?

2009-06-02 Thread Jörg Agatz
Hi users...

I have a problem.

I want to search for:

http://192.168.105.54:8983/solr/itas?q=size:7*&extension:db

I mean, I search for all documents that have size 7* and extension:pdf,
but it doesn't work:
I get some other files, with extension doc or db.
What is happening here?

Jörg


Re: NPE in dataimport.DebugLogger.peekStack (DIH Development Console)

2009-06-02 Thread Steffen B.

Glad to hear that it's not a problem with my setup.
Thanks for taking care of it! :)


Shalin Shekhar Mangar wrote:
> 
> On Tue, Jun 2, 2009 at 8:06 PM, Steffen B.
> wrote:
> 
>>
>> I'm trying to debug my DI config on my Solr server and it constantly
>> fails
>> with a NullPointerException:
>> Jun 2, 2009 4:20:46 PM org.apache.solr.handler.dataimport.DataImporter
>> doFullImport
>> SEVERE: Full Import failed
>> java.lang.NullPointerException
>>at
>>
>> org.apache.solr.handler.dataimport.DebugLogger.peekStack(DebugLogger.java:78)
>>at
>> org.apache.solr.handler.dataimport.DebugLogger.log(DebugLogger.java:98)
>>at
>> org.apache.solr.handler.dataimport.SolrWriter.log(SolrWriter.java:248)
>>at...
>>
>> Running a normal full-import works just fine, but whenever I try to run
>> the
>> debugger, it gives me this error. I'm using the most recent Solr nightly
>> build (2009-06-01) and the method in question is:
>> private DebugInfo peekStack() {
>>return debugStack.isEmpty() ? null : debugStack.peek();
>> }
>> I'm using a DI config that has been working fine in for several previous
>> builds, so that shouldn't be the problem... any ideas what the problem
>> could
>> be?
> 
> 
> 
> A previous commit to change the EntityProcessor API broke this
> functionality. I'll open an issue and give a patch.
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/NPE-in-dataimport.DebugLogger.peekStack-%28DIH-Development-Console%29-tp23833878p23835897.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Using SolrJ with multicore/shards

2009-06-02 Thread Otis Gospodnetic

You should be able to set any name=value URL parameter pair and send it to Solr 
using SolrJ.  What's the name of that class... MapSolrParams, I believe.
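For the concrete parameter, here is a sketch of what the request has to end up looking like. In SolrJ this would amount to setting a single "shards" parameter on the query (the exact SolrJ call is left hedged here); note the name is shards, plural, and the value is one comma-separated list of host:port/path entries, conventionally without the http:// scheme. Host, port and core names below are taken from this thread:

```python
from urllib.parse import urlencode

params = {
    "q": "some keywords",
    # One comma-separated value, not repeated parameters:
    "shards": "localhost:8080/solr/core0,localhost:8080/solr/core1",
}
# The request is sent to one core, which fans out to both shards:
url = "http://localhost:8080/solr/core0/select?" + urlencode(params)
print(url)
```

Whatever client builds this URL gets merged results back from both cores, which is the behavior the question asks for.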

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: ahammad 
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 2, 2009 11:06:55 AM
> Subject: Using SolrJ with multicore/shards
> 
> 
> Hello,
> 
> I have a MultiCore install of solr with 2 cores with different schemas and
> such. Querying directly using http request and/or the solr interface works
> very well for my purposes.
> 
> I want to have a proper search interface though, so I have some code that
> basically acts as a link between the server and the front-end. Basically,
> depending on the options, the search string is built, and when the search is
> submitted, that string gets passed as an http request. The code then would
> parse through the xml to get the information.
> 
> This method works with shards because I can add the shards parameter
> straight into the link that I end up hitting. Although this is currently
> functional, I was thinking of using SolrJ simply because it is simpler to
> use and would cut down the amount of code.
> 
> The question is, how would I be able to define the shards in my query, so
> that when I do search, I hit both shards and get mixed results back? Using
> http requests, it's as simple as adding a shard=core0,core1 snippet. What is
> the equivalent of this in SolrJ?
> 
> BTW, I do have some SolrJ code that is able to query and return results, but
> for a single core. I am currently using CommonsHttpSolrServer for that, not
> the Embedded one.
> 
> Cheers
> -- 
> View this message in context: 
> http://www.nabble.com/Using-SolrJ-with-multicore-shards-tp23834518p23834518.html
> Sent from the Solr - User mailing list archive at Nabble.com.



RE: Solr.war

2009-06-02 Thread Francis Yakin
 Thank You!

Francis

-Original Message-
From: Koji Sekiguchi [mailto:k...@r.email.ne.jp]
Sent: Monday, June 01, 2009 5:14 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr.war

They are identical. solr.war is a copy of apache-solr-1.3.0.war.
You may want to look at example target in build.xml:

  



Koji

Francis Yakin wrote:
> We are planning to upgrade solr 1.2.0 to 1.3.0
>
> Under 1.3.0 - Which of war file that I need to use and deploy on my 
> application?
>
> We are using weblogic.
>
> There are two war files under 
> /opt//apache-solr-1.3.0/dist/apache-solr-1.3.0.war and under 
> /opt/apache-solr-1.3.0/example/webapps/solr.war.
> Which is one are we suppose to use?
>
>
> Thanks
>
> Francis
>
>
>
>



Re: Phrase query search returns no result

2009-06-02 Thread SergeyG

Actually, "my phrase here"~0 (for an exact match) didn't work. I tried, just
as an experiment, to set "qs=100".
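For reference, qs is the dismax query phrase slop: it loosens explicit phrase queries so that the terms may sit up to N positions apart. A toy two-term illustration of the idea (real Lucene sloppy-phrase scoring is more involved than this):

```python
def within_slop(pos_a, pos_b, slop):
    # Toy proximity check for a two-term phrase "A B": is some occurrence
    # of B at most `slop` positions beyond directly following some A?
    return any(abs(pb - pa - 1) <= slop for pa in pos_a for pb in pos_b)

# "long" at position 0, "tree" at position 3 (removed stopwords left gaps):
print(within_slop([0], [3], 0))  # False: an exact phrase (slop 0) fails
print(within_slop([0], [3], 2))  # True: slop 2 bridges the position gap
```

This is why a generous slop can rescue a phrase whose inner stopwords were stripped at index time, but only if the surviving terms are actually in the index.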

Otis Gospodnetic wrote:
> 
> 
> And "your phrase here"~100 works?
> 
>  Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
>> From: SergeyG 
>> To: solr-user@lucene.apache.org
>> Sent: Tuesday, June 2, 2009 11:17:23 AM
>> Subject: Re: Phrase query search returns no result
>> 
>> 
>> Thanks, Otis. 
>> 
>> Checking for the stop words was the first thing I did after getting the
>> empty result. Not all of those words are in the stopwords.txt file. Then
>> just for experimenting purposes I commented out the StopWordsAnalyser
>> during
>> indexing and reindexed. But the phrase was not found again.
>> 
>> Sergey
>> 
>> 
>> Otis Gospodnetic wrote:
>> > 
>> > 
>> > Your stopwords were removed during indexing, so if all those terms were
>> > stopwords, and they likely were, none of them exist in the index now. 
>> You
>> > can double-check that with Luke.  You need to remove stopwords from the
>> > index-time analyzer, too, and then reindex.
>> > 
>> >  Otis
>> > --
>> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>> > 
>> > 
>> > 
>> > - Original Message 
>> >> From: SergeyG 
>> >> To: solr-user@lucene.apache.org
>> >> Sent: Tuesday, June 2, 2009 9:57:17 AM
>> >> Subject: Phrase query search returns no result
>> >> 
>> >> 
>> >> Hi,
>> >> 
>> >> I'm trying to implement a full-text search but can't get the right
>> result
>> >> with a Phrase query search. The field I search through was indexed as
>> a
>> >> "text" field. The phrase was "It was as long as a tree". During both
>> >> indexing and searching the StopWordsFiler was on. For a search I used
>> >> these
>> >> settings: 
>> >> 
>> >>  
>> >>   dismax
>> >>   explicit
>> >>  
>> >>   title author category content
>> >>  
>> >>  
>> >>   id,title,author,isbn,category,content,score
>> >>  
>> >>   100
>> >>   content
>> >>  
>> >> 
>> >> 
>> >> But I the returned docs list was empty. Using Solr Admin console for
>> >> debugging showed that parsedquery=+() ().
>> >> Switching the StopwordsFilter off during searching didn't help either. 
>> >> 
>> >> Am I missing something?
>> >> 
>> >> Thanks,
>> >> Sergey
>> >> -- 
>> >> View this message in context: 
>> >> 
>> http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23833024.html
>> >> Sent from the Solr - User mailing list archive at Nabble.com.
>> > 
>> > 
>> > 
>> 
>> -- 
>> View this message in context: 
>> http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23834693.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23836414.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Using SolrJ with multicore/shards

2009-06-02 Thread ahammad

I'm still not sure what you meant. I took a look at that class but I haven't
got any idea on how to proceed.

BTW I tried something like this 

query.setParam("shard", "http://localhost:8080/solr/core0/",
"http://localhost:8080/solr/core1/");

But it doesn't seem to work for me. I tried it with different variations
too, like removing the http://, and combining both cores as a single string.

Could you please clarify your suggestion?

Regards


Otis Gospodnetic wrote:
> 
> 
> You should be able to set any name=value URL parameter pair and send it to
> Solr using SolrJ.  What's the name of that class... MapSolrParams, I
> believe.
> 
>  Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
>> From: ahammad 
>> To: solr-user@lucene.apache.org
>> Sent: Tuesday, June 2, 2009 11:06:55 AM
>> Subject: Using SolrJ with multicore/shards
>> 
>> 
>> Hello,
>> 
>> I have a MultiCore install of solr with 2 cores with different schemas
>> and
>> such. Querying directly using http request and/or the solr interface
>> works
>> very well for my purposes.
>> 
>> I want to have a proper search interface though, so I have some code that
>> basically acts as a link between the server and the front-end. Basically,
>> depending on the options, the search string is built, and when the search
>> is
>> submitted, that string gets passed as an http request. The code then
>> would
>> parse through the xml to get the information.
>> 
>> This method works with shards because I can add the shards parameter
>> straight into the link that I end up hitting. Although this is currently
>> functional, I was thinking of using SolrJ simply because it is simpler to
>> use and would cut down the amount of code.
>> 
>> The question is, how would I be able to define the shards in my query, so
>> that when I do search, I hit both shards and get mixed results back?
>> Using
>> http requests, it's as simple as adding a shard=core0,core1 snippet. What
>> is
>> the equivalent of this in SolrJ?
>> 
>> BTW, I do have some SolrJ code that is able to query and return results,
>> but
>> for a single core. I am currently using CommonsHttpSolrServer for that,
>> not
>> the Embedded one.
>> 
>> Cheers
>> -- 
>> View this message in context: 
>> http://www.nabble.com/Using-SolrJ-with-multicore-shards-tp23834518p23834518.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Using-SolrJ-with-multicore-shards-tp23834518p23836485.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Search combination?

2009-06-02 Thread Thomas Traeger

I assume you are using the StandardRequestHandler, so this should work:

http://192.168.105.54:8983/solr/itas?q=size:7* AND extension:pdf

Also have a look at the following links:

http://wiki.apache.org/solr/SolrQuerySyntax
http://lucene.apache.org/java/2_4_1/queryparsersyntax.html
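The reason the original URL misbehaves: & separates HTTP parameters, so extension:db never becomes part of the q parameter at all; only size:7* is searched and the extension clause is silently dropped. A quick standard-library demonstration:

```python
from urllib.parse import urlparse, parse_qs, urlencode

broken = "http://192.168.105.54:8983/solr/itas?q=size:7*&extension:db"
q = parse_qs(urlparse(broken).query)
print(q)  # {'q': ['size:7*']} -- the extension clause was lost to URL parsing

# The whole boolean expression must live inside the single q parameter:
fixed = "http://192.168.105.54:8983/solr/itas?" + urlencode(
    {"q": "size:7* AND extension:pdf"}
)
print(fixed)
```

With the expression URL-encoded inside q, the StandardRequestHandler sees both clauses and ANDs them as intended.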

Thomas

Jörg Agatz wrote:

Hi users...

i have a Problem...

i will search for:

http://192.168.105.54:8983/solr/itas?q=size:7*&extension:db

i mean i search for all documents they are size 7* and extension:pdf,

But it dosent work
i get some other files, with extension doc ore db
what is Happens about ?

Jörg

  




Questions regarding IT search solution

2009-06-02 Thread Silent Surfer
Hi,
I am new to the Lucene forum and this is my first question. I need a
clarification from you.

Requirement:
1. Build an IT search tool for logs similar to that of Splunk (only with
respect to searching logs, not reporting, graphs etc.) using Solr/Lucene. The
log files are mainly server logs like JBoss and custom application server
logs (may or may not be log4j logs), and the file size can potentially go up
to 100 MB.
2. The logs are spread across multiple servers (25 to 30 servers).
3. Capability to search almost in realtime.
4. Support distributed search.

Our search criterion can be based on a keyword or timestamp or IP address etc.
Can anyone throw some light on whether Solr/Lucene is the right solution for
this?
Appreciate any quick help in this regard.

Thanks,
Surfer


  

Re: Avoid duplicates in MoreLikeThis using field collapsing

2009-06-02 Thread Otis Gospodnetic

But why does MLT return duplicates in the first place?  That seems strange to 
me.  If there are no duplicates in your index, how does MLT manage to return 
dupes?

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Marc Sturlese 
> To: solr-user@lucene.apache.org
> Sent: Friday, May 29, 2009 7:05:15 AM
> Subject: Avoid duplicates in MoreLikeThis using field collapsing
> 
> 
> Hey there, 
> I am testing the MoreLikeThis feature (with the MoreLikeThis component and
> with the MoreLikeThis handler) and I have noticed that many of the similar
> documents returned are duplicates. To avoid that I
> have tried to use the field collapsing patch, but it's not taking effect.
> 
> In the case of the MoreLikeThis handler I think that's expected, as I have
> seen it extends directly from RequestHandlerBase.java and not from
> SearchHandler.java, which is the one whose handleRequestBody function
> deals with components:
> 
>   for( SearchComponent c : components ) {
> rb.setTimer( subt.sub( c.getName() ) );
> c.prepare(rb);
> rb.getTimer().stop();
>   }
> 
> To sort it out I have embedded the collapseFilter in the getMoreLikeThis
> method of MoreLikeThisHandler.java.
> This is working alright, but I would like to know if there is any cleaner
> way to make MoreLikeThisHandler able to deal with components. I mean via
> solrconfig.xml or "plugging" something in instead of "hacking" it.
> 
> Thanks in advance
> 
> 
> -- 
> View this message in context: 
> http://www.nabble.com/Avoid-duplicates-in-MoreLikeThis-using-field-collapsing-tp23778054p23778054.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Avoid duplicates in MoreLikeThis using field collapsing

2009-06-02 Thread Marc Sturlese

With the DeDuplication patch I create a signature field to control duplicates, which
is an MD5 of 3 different fields:
hashField = hash(fieldA + fieldB + fieldC)

With MoreLikeThis I want to show fieldA.
There are documents that DeDuplication will not consider duplicates because
fieldC was different for each. However, fieldA is exactly the same. These are
the duplicate documents that MoreLikeThis is showing me.

Hope I explained myself more or less ok...
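To illustrate the situation described above, here is a minimal sketch of such a three-field MD5 signature. The field values and class name are placeholders, not the actual schema or the DeDuplication patch's code:

```java
import java.math.BigInteger;
import java.security.MessageDigest;

public class DedupSignature {
    // Sketch of the signature described above: an MD5 over the concatenation
    // of three field values, rendered as 32 lowercase hex characters.
    static String signature(String fieldA, String fieldB, String fieldC) throws Exception {
        MessageDigest md5 = MessageDigest.getInstance("MD5");
        byte[] digest = md5.digest((fieldA + fieldB + fieldC).getBytes("UTF-8"));
        return String.format("%032x", new BigInteger(1, digest));
    }

    public static void main(String[] args) throws Exception {
        // Two documents sharing fieldA and fieldB but differing in fieldC get
        // different signatures, so DeDuplication keeps both -- and MoreLikeThis
        // can then surface what looks like a duplicate on fieldA alone.
        System.out.println(signature("same title", "same body", "sourceX"));
        System.out.println(signature("same title", "same body", "sourceY"));
    }
}
```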




Otis Gospodnetic wrote:
> 
> 
> But why does MLT return duplicates in the first place?  That seems strange
> to me.  If there are no duplicates in your index, how does MLT manage to
> return dupes?
> 
>  Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
>> From: Marc Sturlese 
>> To: solr-user@lucene.apache.org
>> Sent: Friday, May 29, 2009 7:05:15 AM
>> Subject: Avoid duplicates in MoreLikeThis using field collapsing
>> 
>> 
>> Hey there, 
>> I am testing MoreLikeThis feaure (with MoreLikeThis component and with
>> MoreLikeThis handler) and I am getting lots of duplicates. I have noticed
>> that lots of the similar documents returned are duplicates. To avoid that
>> I
>> have tried to use the field collapsing patch but it's not taking effect.
>> 
>> In case of MoreLikeThis handler I think it's normal has I have seen it
>> extends directly from RequestHandlerBase.java and not from
>> SearchHandler.java that is the one that in the function handleRequestBody
>> will deal with components:
>> 
>>   for( SearchComponent c : components ) {
>> rb.setTimer( subt.sub( c.getName() ) );
>> c.prepare(rb);
>> rb.getTimer().stop();
>>   }
>> 
>> To sort it out I have "embbed" the collapseFilter in the getMoreLikeThis
>> method of the MoreLikeThisHandler.java
>> This is working alrite but would like to know if is there any more polite
>> way to make MoreLikeThisHandler able to deal with components. I mean via
>> solrconfig.xml or "pluging" something instead of "hacking" it.
>> 
>> Thanks in advance
>> 
>> 
>> -- 
>> View this message in context: 
>> http://www.nabble.com/Avoid-duplicates-in-MoreLikeThis-using-field-collapsing-tp23778054p23778054.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Avoid-duplicates-in-MoreLikeThis-using-field-collapsing-tp23778054p23837785.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Using SolrJ with multicore/shards

2009-06-02 Thread ahammad

Hello,

I played around some more with it and I found out that I was pointing my
constructor to an older class that doesn't have the MultiCore capability.

This is what I did to set up the shards:

query.setParam("shards",
"localhost:8080/solr/core0/,localhost:8080/solr/core1/");

I do have a new issue with this though. Here is how the results are
displayed:

QueryResponse qr = server.query(query);

SolrDocumentList sdl = qr.getResults();

System.out.println("Found: " + sdl.getNumFound());
System.out.println("Start: " + sdl.getStart());
System.out.println("Max Score: " + sdl.getMaxScore());
System.out.println("");

ArrayList<HashMap<String, Object>> hitsOnPage =
        new ArrayList<HashMap<String, Object>>();

for (SolrDocument d : sdl) {

    HashMap<String, Object> values = new HashMap<String, Object>();

    for (Iterator<Map.Entry<String, Object>> i = d.iterator();
            i.hasNext(); ) {
        Map.Entry<String, Object> e2 = i.next();

        values.put(e2.getKey(), e2.getValue());
    }

    hitsOnPage.add(values);

    String outputString = String.valueOf(values.get("title"));
    System.out.println(outputString);
}

The field "title" is one of the common fields that is shared between the two
schemas. When I print the results of my query, I get null for everything.
However, the result of sdl.getNumFound() is correct, so I know that both
cores are being accessed.

Is there a difference with how SolrJ handles multicore requests?

Disclaimer: The code 



ahammad wrote:
> 
> Hello,
> 
> I have a MultiCore install of solr with 2 cores with different schemas and
> such. Querying directly using http request and/or the solr interface works
> very well for my purposes.
> 
> I want to have a proper search interface though, so I have some code that
> basically acts as a link between the server and the front-end. Basically,
> depending on the options, the search string is built, and when the search
> is submitted, that string gets passed as an http request. The code then
> would parse through the xml to get the information.
> 
> This method works with shards because I can add the shards parameter
> straight into the link that I end up hitting. Although this is currently
> functional, I was thinking of using SolrJ simply because it is simpler to
> use and would cut down the amount of code.
> 
> The question is, how would I be able to define the shards in my query, so
> that when I do search, I hit both shards and get mixed results back? Using
> http requests, it's as simple as adding a shard=core0,core1 snippet. What
> is the equivalent of this in SolrJ?
> 
> BTW, I do have some SolrJ code that is able to query and return results,
> but for a single core. I am currently using CommonsHttpSolrServer for
> that, not the Embedded one.
> 
> Cheers
> 

-- 
View this message in context: 
http://www.nabble.com/Using-SolrJ-with-multicore-shards-tp23834518p23838351.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Using SolrJ with multicore/shards

2009-06-02 Thread ahammad

Sorry for the additional message, the disclaimer was missing.

Disclaimer: The code that was used was taken from the following site:
http://e-mats.org/2008/04/using-solrj-a-short-guide-to-getting-started-with-solrj/
. 


ahammad wrote:
> 
> Hello,
> 
> I played around some more with it and I found out that I was pointing my
> constructor to an older class that doesn't have the MultiCore capability.
> 
> This is what I did to set up the shards:
> 
> query.setParam("shards",
> "localhost:8080/solr/core0/,localhost:8080/solr/core1/");
> 
> I do have a new issue with this though. Here is how the results are
> displayed:
> 
>QueryResponse qr = server.query(query);
> 
> SolrDocumentList sdl = qr.getResults();
> 
> System.out.println("Found: " + sdl.getNumFound());
> System.out.println("Start: " + sdl.getStart());
> System.out.println("Max Score: " + sdl.getMaxScore());
> System.out.println("");
> 
> ArrayList> hitsOnPage = new
> ArrayList>();
> 
> for(SolrDocument d : sdl)
> {
>   
> HashMap values = new HashMap Object>();
> 
> for(Iterator> i = d.iterator();
> i.hasNext(); )
> {
> Map.Entry e2 = i.next();
> 
> values.put(e2.getKey(), e2.getValue());
> }
> 
> hitsOnPage.add(values);
>  
> String outputString = new String(  values.get("title") );
> System.out.println(outputString);
> }
> 
> The field "title" is one of the common fields that is shared between the
> two schemas. When I print the results of my query, I get null for
> everything. However, the result of sdl.getNumFound() is correct, so I know
> that both cores are being accessed.
> 
> Is there a difference with how SolrJ handles multicore requests?
> 
> Disclaimer: The code 
> 
> 
> 
> ahammad wrote:
>> 
>> Hello,
>> 
>> I have a MultiCore install of solr with 2 cores with different schemas
>> and such. Querying directly using http request and/or the solr interface
>> works very well for my purposes.
>> 
>> I want to have a proper search interface though, so I have some code that
>> basically acts as a link between the server and the front-end. Basically,
>> depending on the options, the search string is built, and when the search
>> is submitted, that string gets passed as an http request. The code then
>> would parse through the xml to get the information.
>> 
>> This method works with shards because I can add the shards parameter
>> straight into the link that I end up hitting. Although this is currently
>> functional, I was thinking of using SolrJ simply because it is simpler to
>> use and would cut down the amount of code.
>> 
>> The question is, how would I be able to define the shards in my query, so
>> that when I do search, I hit both shards and get mixed results back?
>> Using http requests, it's as simple as adding a shard=core0,core1
>> snippet. What is the equivalent of this in SolrJ?
>> 
>> BTW, I do have some SolrJ code that is able to query and return results,
>> but for a single core. I am currently using CommonsHttpSolrServer for
>> that, not the Embedded one.
>> 
>> Cheers
>> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Using-SolrJ-with-multicore-shards-tp23834518p23838988.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Phrase query search returns no result

2009-06-02 Thread SergeyG

Hmmm... It looks a bit like magic. After 3 days of experimenting with various
parameters and getting only wrong results, I deleted all the indexed data
and left the minimum set of parameters: qs=default (I omitted it),
StopWords=off (StopWordsFilter was commented out), no copyFields,
requestHandler=standard. And guess what - it started producing the expected
results! :) So for me the question remains: what was the cause of all the
previous trouble?
Anyway, thanks for the discussion.


SergeyG wrote:
> 
> Actually, "my phrase here"~0 (for an exact match) didn't work I tried,
> just for to experiment, to put "qs=100". 
> 
> Otis Gospodnetic wrote:
>> 
>> 
>> And "your phrase here"~100 works?
>> 
>>  Otis
>> --
>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>> 
>> 
>> 
>> - Original Message 
>>> From: SergeyG 
>>> To: solr-user@lucene.apache.org
>>> Sent: Tuesday, June 2, 2009 11:17:23 AM
>>> Subject: Re: Phrase query search returns no result
>>> 
>>> 
>>> Thanks, Otis. 
>>> 
>>> Checking for the stop words was the first thing I did after getting the
>>> empty result. Not all of those words are in the stopwords.txt file. Then
>>> just for experimenting purposes I commented out the StopWordsAnalyser
>>> during
>>> indexing and reindexed. But the phrase was not found again.
>>> 
>>> Sergey
>>> 
>>> 
>>> Otis Gospodnetic wrote:
>>> > 
>>> > 
>>> > Your stopwords were removed during indexing, so if all those terms
>>> were
>>> > stopwords, and they likely were, none of them exist in the index now. 
>>> You
>>> > can double-check that with Luke.  You need to remove stopwords from
>>> the
>>> > index-time analyzer, too, and then reindex.
>>> > 
>>> >  Otis
>>> > --
>>> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>> > 
>>> > 
>>> > 
>>> > - Original Message 
>>> >> From: SergeyG 
>>> >> To: solr-user@lucene.apache.org
>>> >> Sent: Tuesday, June 2, 2009 9:57:17 AM
>>> >> Subject: Phrase query search returns no result
>>> >> 
>>> >> 
>>> >> Hi,
>>> >> 
>>> >> I'm trying to implement a full-text search but can't get the right
>>> result
>>> >> with a Phrase query search. The field I search through was indexed as
>>> a
>>> >> "text" field. The phrase was "It was as long as a tree". During both
>>> >> indexing and searching the StopWordsFiler was on. For a search I used
>>> >> these
>>> >> settings: 
>>> >> 
>>> >>  
>>> >>   dismax
>>> >>   explicit
>>> >>  
>>> >>   title author category content
>>> >>  
>>> >>  
>>> >>   id,title,author,isbn,category,content,score
>>> >>  
>>> >>   100
>>> >>   content
>>> >>  
>>> >> 
>>> >> 
>>> >> But I the returned docs list was empty. Using Solr Admin console for
>>> >> debugging showed that parsedquery=+() ().
>>> >> Switching the StopwordsFilter off during searching didn't help
>>> either. 
>>> >> 
>>> >> Am I missing something?
>>> >> 
>>> >> Thanks,
>>> >> Sergey
>>> >> -- 
>>> >> View this message in context: 
>>> >> 
>>> http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23833024.html
>>> >> Sent from the Solr - User mailing list archive at Nabble.com.
>>> > 
>>> > 
>>> > 
>>> 
>>> -- 
>>> View this message in context: 
>>> http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23834693.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>> 
>> 
>> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23839134.html
Sent from the Solr - User mailing list archive at Nabble.com.



How to avoid space on facet field

2009-06-02 Thread Bny Jo

Hello, 

 I am wondering why Solr is returning the manufacturer name field ("Dell, Inc") as
two facet values: "Dell" as one result and "Inc" as another. Is there a way to
facet on a field that has spaces or delimiters in it?

query.addFacetField("manu");
query.setFacetMinCount(1);
query.setIncludeScore(true);
List<FacetField> facetFieldList = qr.getFacetFields();
for (FacetField facetField : facetFieldList) {
    System.out.println(facetField.toString() + " Manufacturers");
}
And it returns 
-
[manu:[dell (5), inc (5), corp (1), sharp (1), sonic (1), view (1), viewson 
(1), vizo (1)]]



  

Re: Dismax handler phrase matching question

2009-06-02 Thread anuvenk

I have to search over multiple fields, so passing everything in the 'q' might
not be neat. Can something be done with facet.query to accomplish this?
I'm using the facet parameters. I'm not familiar with Java, so I'm not sure if a
function query could be used to accomplish this. Any other thoughts?


Shalin Shekhar Mangar wrote:
> 
> On Tue, Jun 2, 2009 at 12:53 AM, anuvenk  wrote:
> 
>>
>> title  state
>>
>> dui faq1   california
>> dui faq2   florida
>> dui faq3   federal
>>
>> Now I want to be able to return federal results irrespective of the
>> state.
>> For example dui california should return all federal results for 'dui'
>> also
>> along with california results.
>>
> 
> Perhaps you just need to create your query in such a way that both match?
> 
> q=title:(dui california) state:(dui california) state:federal
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Dismax-handler-phrase-matching-question-tp23820340p23840154.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Phrase query search returns no result

2009-06-02 Thread Erick Erickson
Did you by any chance change your schema, rename a field, change your
analyzers, etc. between the time you originally
generated your index and blowing it away?

I'm wondering if blowing away your index and regenerating just
caused any changes in how you index/search to get picked
up...

Best
Erick

On Tue, Jun 2, 2009 at 3:28 PM, SergeyG  wrote:

>
> Hmmm... It looks a bit magic. After 3 days of experimenting with various
> parameters and getting only wrong results, I deleted all the indexed data
> and left the minimum set of parameters: qs=default (I omitted it),
> StopWords=off (StopWordsFilter was commented out), no copyFields,
> requestHandler=standard. And guess what - it started producing the expected
> results! :) So for me the question remains: what was the cause of all the
> previous trouble?
> Anyway, thanks for the discussion.
>
>
> SergeyG wrote:
> >
> > Actually, "my phrase here"~0 (for an exact match) didn't work I tried,
> > just for to experiment, to put "qs=100".
> >
> > Otis Gospodnetic wrote:
> >>
> >>
> >> And "your phrase here"~100 works?
> >>
> >>  Otis
> >> --
> >> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >>
> >>
> >>
> >> - Original Message 
> >>> From: SergeyG 
> >>> To: solr-user@lucene.apache.org
> >>> Sent: Tuesday, June 2, 2009 11:17:23 AM
> >>> Subject: Re: Phrase query search returns no result
> >>>
> >>>
> >>> Thanks, Otis.
> >>>
> >>> Checking for the stop words was the first thing I did after getting the
> >>> empty result. Not all of those words are in the stopwords.txt file.
> Then
> >>> just for experimenting purposes I commented out the StopWordsAnalyser
> >>> during
> >>> indexing and reindexed. But the phrase was not found again.
> >>>
> >>> Sergey
> >>>
> >>>
> >>> Otis Gospodnetic wrote:
> >>> >
> >>> >
> >>> > Your stopwords were removed during indexing, so if all those terms
> >>> were
> >>> > stopwords, and they likely were, none of them exist in the index now.
> >>> You
> >>> > can double-check that with Luke.  You need to remove stopwords from
> >>> the
> >>> > index-time analyzer, too, and then reindex.
> >>> >
> >>> >  Otis
> >>> > --
> >>> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >>> >
> >>> >
> >>> >
> >>> > - Original Message 
> >>> >> From: SergeyG
> >>> >> To: solr-user@lucene.apache.org
> >>> >> Sent: Tuesday, June 2, 2009 9:57:17 AM
> >>> >> Subject: Phrase query search returns no result
> >>> >>
> >>> >>
> >>> >> Hi,
> >>> >>
> >>> >> I'm trying to implement a full-text search but can't get the right
> >>> result
> >>> >> with a Phrase query search. The field I search through was indexed
> as
> >>> a
> >>> >> "text" field. The phrase was "It was as long as a tree". During both
> >>> >> indexing and searching the StopWordsFiler was on. For a search I
> used
> >>> >> these
> >>> >> settings:
> >>> >>
> >>> >>
> >>> >>   dismax
> >>> >>   explicit
> >>> >>
> >>> >>   title author category content
> >>> >>
> >>> >>
> >>> >>   id,title,author,isbn,category,content,score
> >>> >>
> >>> >>   100
> >>> >>   content
> >>> >>
> >>> >>
> >>> >>
> >>> >> But I the returned docs list was empty. Using Solr Admin console for
> >>> >> debugging showed that parsedquery=+() ().
> >>> >> Switching the StopwordsFilter off during searching didn't help
> >>> either.
> >>> >>
> >>> >> Am I missing something?
> >>> >>
> >>> >> Thanks,
> >>> >> Sergey
> >>> >> --
> >>> >> View this message in context:
> >>> >>
> >>>
> http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23833024.html
> >>> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >>> >
> >>> >
> >>> >
> >>>
> >>> --
> >>> View this message in context:
> >>>
> http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23834693.html
> >>> Sent from the Solr - User mailing list archive at Nabble.com.
> >>
> >>
> >>
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23839134.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


Can submit docs to different indexes?

2009-06-02 Thread Darren Govoni
Hi,
  Pardon if this question has an answer I missed in the archives. I
couldn't find it or in the docs (again, I may have missed it). But what
I want to do is submit docs to Solr as usual, but also tell Solr which
index to store the doc in, and then be able to query while also specifying
the index, to keep sets of indexes separate.

is this possible with Solr?

thanks,
Darren



Re: spell checking

2009-06-02 Thread Yao Ge

Yes, I did. I was not able to grasp the concept of making spell checking
work.
For example, the wiki page says a spell check index needs to be built, but
does not say how to do it. Does Solr build the index out of thin air? Or is the
index built from the main index? Or is the index built from a dictionary or
word list?

Please help.


Grant Ingersoll-6 wrote:
> 
> Have you gone through: http://wiki.apache.org/solr/SpellCheckComponent
> 
> 
> On Jun 2, 2009, at 8:50 AM, Yao Ge wrote:
> 
>>
>> Can someone help providing a tutorial like introduction on how to get
>> spell-checking work in Solr. It appears many steps are requires  
>> before the
>> spell-checkering functions can be used. It also appears that a  
>> dictionary (a
>> list of correctly spelled words) is required to setup the spell  
>> checker. Can
>> anyone validate my impression?
>>
>> Thanks.
>> -- 
>> View this message in context:
>> http://www.nabble.com/spell-checking-tp23835427p23835427.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
> 
> --
> Grant Ingersoll
> http://www.lucidimagination.com/
> 
> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
> using Solr/Lucene:
> http://www.lucidimagination.com/search
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/spell-checking-tp23835427p23840843.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: spell checking

2009-06-02 Thread Otis Gospodnetic

Hello,

This is how you build the SC index:
http://wiki.apache.org/solr/SpellCheckComponent#head-78f5afcf43df544832809abc68dd36b98152670c

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Yao Ge 
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 2, 2009 5:03:24 PM
> Subject: Re: spell checking
> 
> 
> Yes. I did. I was not able to grasp the concept of making spell checking
> work.
> For example, the wiki page says an spell check index need to be built. But
> did not say how to do it. Does Solr buid the index out of thin air? Or the
> index is buit from the main index? or index is built form a dictionary or
> word list?
> 
> Please help.
> 
> 
> Grant Ingersoll-6 wrote:
> > 
> > Have you gone through: http://wiki.apache.org/solr/SpellCheckComponent
> > 
> > 
> > On Jun 2, 2009, at 8:50 AM, Yao Ge wrote:
> > 
> >>
> >> Can someone help providing a tutorial like introduction on how to get
> >> spell-checking work in Solr. It appears many steps are requires  
> >> before the
> >> spell-checkering functions can be used. It also appears that a  
> >> dictionary (a
> >> list of correctly spelled words) is required to setup the spell  
> >> checker. Can
> >> anyone validate my impression?
> >>
> >> Thanks.
> >> -- 
> >> View this message in context:
> >> http://www.nabble.com/spell-checking-tp23835427p23835427.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >>
> > 
> > --
> > Grant Ingersoll
> > http://www.lucidimagination.com/
> > 
> > Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
> > using Solr/Lucene:
> > http://www.lucidimagination.com/search
> > 
> > 
> > 
> 
> -- 
> View this message in context: 
> http://www.nabble.com/spell-checking-tp23835427p23840843.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Can submit docs to different indexes?

2009-06-02 Thread Otis Gospodnetic

Hi Darren,

Yes, it is possible! :)
First you need to make sure your Solr has multiple indices using one of the 
following options:

http://wiki.apache.org/solr/MultipleIndexes

The most popular approach is the MultiCore approach.  If you go that route, 
then you query things like in this example:

http://wiki.apache.org/solr/CoreAdmin#head-aeda88bd432e812ebbcf1f86baec51f1f10eca0f
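With MultiCore, "which index" is simply chosen by the base URL you post to or query, since each core lives under its own path segment. A small sketch of that addressing scheme; the host, port, and core names here are assumptions, not values from the wiki pages above:

```java
public class CoreUrls {
    // With MultiCore, each core is addressed under its own path segment,
    // so the target index is picked by the URL, not by a document field.
    static String updateUrl(String base, String core) {
        return base + "/" + core + "/update";
    }

    static String selectUrl(String base, String core, String query) {
        return base + "/" + core + "/select?q=" + query;
    }

    public static void main(String[] args) {
        String base = "http://localhost:8983/solr"; // assumed host/port
        System.out.println(updateUrl(base, "core0"));
        System.out.println(selectUrl(base, "core1", "id:1"));
    }
}
```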

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Darren Govoni 
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 2, 2009 4:59:35 PM
> Subject: Can submit docs to different indexes?
> 
> Hi,
>   Pardon if this question has an answer I missed in the archives. I
> couldn't find it or in the docs (again, I may have missed it). But what
> I want to do is submit docs to Solr as usual, but also tell Solr which
> index to store the doc and then be able to query also providing which
> index to keep sets of indexes separate.
> 
> is this possible with Solr?
> 
> thanks,
> Darren



Re: spell checking

2009-06-02 Thread Jeff Newburn
The spell checking dictionary should be built on startup when spellchecking
is enabled in the system.

First we defined the component in solrconfig.xml.  Notice how it has
buildOnCommit to tell it to rebuild the dictionary.

  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="classname">solr.IndexBasedSpellChecker</str>
      <str name="field">field</str>
      <str name="spellcheckIndexDir">./spellchecker1</str>
      <float name="accuracy">0.5</float>
      <str name="buildOnCommit">true</str>
    </lst>
    <lst name="spellchecker">
      <str name="name">jarowinkler</str>
      <str name="field">field</str>
      <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
      <str name="spellcheckIndexDir">./spellchecker2</str>
      <float name="accuracy">0.5</float>
      <str name="buildOnCommit">true</str>
    </lst>
  </searchComponent>

Second we added the component to the dismax handler:

  <arr name="last-components">
    <str>spellcheck</str>
  </arr>

This seems to work for us.  Hope it helps

-- 
Jeff Newburn
Software Engineer, Zappos.com
jnewb...@zappos.com - 702-943-7562


> From: Yao Ge 
> Reply-To: 
> Date: Tue, 2 Jun 2009 14:03:24 -0700 (PDT)
> To: 
> Subject: Re: spell checking
> 
> 
> Yes. I did. I was not able to grasp the concept of making spell checking
> work.
> For example, the wiki page says an spell check index need to be built. But
> did not say how to do it. Does Solr buid the index out of thin air? Or the
> index is buit from the main index? or index is built form a dictionary or
> word list?
> 
> Please help.
> 
> 
> Grant Ingersoll-6 wrote:
>> 
>> Have you gone through: http://wiki.apache.org/solr/SpellCheckComponent
>> 
>> 
>> On Jun 2, 2009, at 8:50 AM, Yao Ge wrote:
>> 
>>> 
>>> Can someone help providing a tutorial like introduction on how to get
>>> spell-checking work in Solr. It appears many steps are requires
>>> before the
>>> spell-checkering functions can be used. It also appears that a
>>> dictionary (a
>>> list of correctly spelled words) is required to setup the spell
>>> checker. Can
>>> anyone validate my impression?
>>> 
>>> Thanks.
>>> -- 
>>> View this message in context:
>>> http://www.nabble.com/spell-checking-tp23835427p23835427.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>> 
>> 
>> --
>> Grant Ingersoll
>> http://www.lucidimagination.com/
>> 
>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
>> using Solr/Lucene:
>> http://www.lucidimagination.com/search
>> 
>> 
>> 
> 
> -- 
> View this message in context:
> http://www.nabble.com/spell-checking-tp23835427p23840843.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 



Re: Search combination?

2009-06-02 Thread Otis Gospodnetic

Are you trying to find items with size > 7?  If so, 7* is not the way to do 
that - 7* will find items whose "size" field starts with "7", e.g. 7, 70, 71, 
72, 73...79, 700, 701


What you may want is an open-ended range query: q=size:[7 TO *] (I think that's 
the correct syntax, but please double-check it)

Also, I assume you already indexed file extensions into a separate "extension" 
field.
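The difference between the prefix query and the range query can be illustrated with a quick sketch. This is plain Java standing in for the query semantics, not Solr API code; note also that on a plain string field, [7 TO *] compares lexicographically, so a numeric/sortable field type is needed for true numeric ranges:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class PrefixVsRange {
    // What a prefix query like size:7* effectively matches: values whose
    // string form starts with "7".
    static List<Integer> prefixMatches(List<Integer> sizes, String prefix) {
        List<Integer> out = new ArrayList<Integer>();
        for (Integer s : sizes) {
            if (String.valueOf(s).startsWith(prefix)) {
                out.add(s);
            }
        }
        return out;
    }

    // What an open-ended range query like size:[7 TO *] means numerically:
    // values >= 7.
    static List<Integer> rangeMatches(List<Integer> sizes, int min) {
        List<Integer> out = new ArrayList<Integer>();
        for (Integer s : sizes) {
            if (s >= min) {
                out.add(s);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        List<Integer> sizes = Arrays.asList(5, 7, 9, 70, 701);
        System.out.println(prefixMatches(sizes, "7")); // [7, 70, 701]
        System.out.println(rangeMatches(sizes, 7));    // [7, 9, 70, 701]
    }
}
```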
 
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Jörg Agatz 
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 2, 2009 12:18:42 PM
> Subject: Search combination?
> 
> Hi users...
> 
> i have a Problem...
> 
> i will search for:
> 
> http://192.168.105.54:8983/solr/itas?q=size:7*&extension:db
> 
> i mean i search for all documents they are size 7* and extension:pdf,
> 
> But it dosent work
> i get some other files, with extension doc ore db
> what is Happens about ?
> 
> Jörg



Re: spell checking

2009-06-02 Thread Yao Ge

Sorry for not being able to get my point across.

I know the syntax that leads to an index build for spell checking. I actually
ran the command and saw some additional files created in the data\spellchecker1
directory. What I don't understand is what is in there, as I cannot get
Solr to make spell suggestions based on the documented query structure in
the wiki.

Can anyone tell me what happens after the default spell check index is
built? In my case, I used copyField to copy a couple of text fields into a
field called "spell". These fields are the original text; they are the ones
with typos that I need to run spell check on. But how can this original
data be used as a base for spell checking? How does Solr know which words are
correctly spelled?

   
   
   ...
   
   ...
   
   



Yao Ge wrote:
> 
> Can someone help providing a tutorial like introduction on how to get
> spell-checking work in Solr. It appears many steps are requires before the
> spell-checkering functions can be used. It also appears that a dictionary
> (a list of correctly spelled words) is required to setup the spell
> checker. Can anyone validate my impression?
> 
> Thanks.
> 

-- 
View this message in context: 
http://www.nabble.com/spell-checking-tp23835427p23841373.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Can submit docs to different indexes?

2009-06-02 Thread Darren Govoni
Thanks Otis!!!

On Tue, 2009-06-02 at 14:13 -0700, Otis Gospodnetic wrote:
> Hi Darren,
> 
> Yes, it is possible! :)
> First you need to make sure your Solr has multiple indices using one of the 
> following options:
> 
> http://wiki.apache.org/solr/MultipleIndexes
> 
> The most popular approach is the MultiCore approach.  If you go that route, 
> then you query things like in this example:
> 
> http://wiki.apache.org/solr/CoreAdmin#head-aeda88bd432e812ebbcf1f86baec51f1f10eca0f
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
> > From: Darren Govoni 
> > To: solr-user@lucene.apache.org
> > Sent: Tuesday, June 2, 2009 4:59:35 PM
> > Subject: Can submit docs to different indexes?
> > 
> > Hi,
> >   Pardon if this question has an answer I missed in the archives. I
> > couldn't find it or in the docs (again, I may have missed it). But what
> > I want to do is submit docs to Solr as usual, but also tell Solr which
> > index to store the doc and then be able to query also providing which
> > index to keep sets of indexes separate.
> > 
> > is this possible with Solr?
> > 
> > thanks,
> > Darren
> 



Is there Downside to a huge synonyms file?

2009-06-02 Thread anuvenk

In my index I have legal FAQs, forms, legal videos, etc., with a state field for
each resource.
Now if I search for "real estate san diego", I want to be able to return other
'california' results, i.e. results from San Francisco.
I have the following fields in the index

title                             state        description
real estate san diego example 1   california   some description
real estate carlsbad example 2    california   some desc

So when I search for "real estate san francisco", since there is no match, I
want to be able to return the other real estate results in california
instead of returning none, because sometimes they might be searching for a
real estate form and the city probably doesn't matter.

I have two things in mind. One is adding a synonym mapping
san diego, california
carlsbad, california
san francisco, california

(which probably isn't the best way)
hoping that search for san francisco real estate would map san francisco to
california and hence return the other two california results
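For reference, the first approach could look like this in synonyms.txt with query-time, one-directional expansion. The exact entries below are illustrative assumptions, not a recommended mapping:

```
# Illustrative one-directional mappings: a city query also matches its state
san diego => san diego, california
carlsbad => carlsbad, california
san francisco => san francisco, california
```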

OR

adding the mapping of city to state in the index itself like..

title                        state        city                                 description
real estate san diego eg 1   california   carlsbad, san francisco, san diego   some description
real estate carlsbad eg 2    california   carlsbad, san francisco, san diego   some description

Which of the above two is better? Does a huge synonym file affect
performance? Or is there an even better way? I'm sure there is, but I can't
put my finger on it yet, and I'm not familiar with Java either.

-- 
View this message in context: 
http://www.nabble.com/Is-there-Downside-to-a-huge-synonyms-file--tp23842527p23842527.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: spell checking

2009-06-02 Thread Otis Gospodnetic

Hello,

In short, the assumption behind this type of SC is that the text in the
main index is (mostly) correctly spelled.  When the SC finds query
terms that are close in spelling to words indexed in SC, it offers
spelling suggestions/correction using those presumably correctly spelled terms 
(there are other parameters that control the exact behaviour, but this is the 
idea)

Solr (Lucene's spellchecker, which Solr uses under the hood, actually) turns the 
input text (values from those fields you copy to the spell field) into so-called 
n-grams.  You can see that if you open up the SC index with something 
like Luke.  Please see
http://wiki.apache.org/jakarta-lucene/SpellChecker .
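As a rough illustration of that decomposition (simplified; the real Lucene spellchecker indexes several gram sizes plus start/end grams, and this sketch is not its actual code):

```java
import java.util.ArrayList;
import java.util.List;

public class NGrams {
    // Splits a word into contiguous character n-grams, roughly what the
    // Lucene spellchecker indexes for each dictionary word (simplified).
    static List<String> ngrams(String word, int n) {
        List<String> grams = new ArrayList<String>();
        for (int i = 0; i + n <= word.length(); i++) {
            grams.add(word.substring(i, i + n));
        }
        return grams;
    }

    public static void main(String[] args) {
        // A misspelling shares most of its grams with the correct word,
        // which is how candidate suggestions are found.
        System.out.println(ngrams("spelling", 3)); // [spe, pel, ell, lli, lin, ing]
        System.out.println(ngrams("speling", 3));  // [spe, pel, eli, lin, ing]
    }
}
```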

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Yao Ge 
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 2, 2009 5:34:07 PM
> Subject: Re: spell checking
> 
> 
> Sorry for not be able to get my point across.
> 
> I know the syntax that leads to a index build for spell checking. I actually
> run the command saw some additional file created in data\spellchecker1
> directory. What I don't understand is what is in there as I can not trick
> Solr to make spell suggestions based on the documented query structure in
> wiki. 
> 
> Can anyone tell me what happened after when the default spell check is
> built? In my case, I used copyField to copy a couple of text fields into a
> field called "spell". These fields are the original text, they are the ones
> with typos that I need to run spell check on. But how can these original
> data be used as a base for spell checking? How does Solr know what are
> correctly spelled words?
> 
>   
> multiValued="true"/>
>   
> multiValued="true"/>
>...
>   
> multiValued="true"/>
>...
>   
>   
> 
> 
> 
> Yao Ge wrote:
> > 
> > Can someone help providing a tutorial like introduction on how to get
> > spell-checking work in Solr. It appears many steps are requires before the
> > spell-checkering functions can be used. It also appears that a dictionary
> > (a list of correctly spelled words) is required to setup the spell
> > checker. Can anyone validate my impression?
> > 
> > Thanks.
> > 
> 
> -- 
> View this message in context: 
> http://www.nabble.com/spell-checking-tp23835427p23841373.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: Is there Downside to a huge synonyms file?

2009-06-02 Thread Otis Gospodnetic

Hi,

If index-time synonym expansion/indexing is used, then a large synonym file 
means your index is going to be bigger.
If query-time synonym expansion is used, then your queries are going to be 
larger (i.e. more ORs, thus a bit slower).

How much, it really depends on your specific synonyms, so I can't generalize.  
I have a feeling you are not dealing with millions of documents, in which case 
you can most likely ignore increase in index or query size.

 
Adding synonyms sounds like the easiest approach.  I'd try that and worry about 
improvement only IF I see that doesn't give adequate results.
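To make the query-size point concrete: with a query-time mapping such as
`san francisco, california`, the parsed query schematically grows from one term
to an OR group, something like the sketch below (the exact parsed form depends
on the query parser and analyzer in use):

```text
before:  +real +estate +"san francisco"
after:   +real +estate +("san francisco" OR california)
```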

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: anuvenk 
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 2, 2009 6:55:27 PM
> Subject: Is there Downside to a huge synonyms file?
> 
> 
> In my index i have legal faqs, forms, legal videos etc with a state field for
> each resource.
> Now if i search for real estate san diego, I want to be able to return other
> 'california' results i.e results from san francisco.
> I have the following fields in the index
> 
> title                           | state      | description
> real estate san diego example 1 | california | some description
> real estate carlsbad example 2  | california | some desc
> 
> so when i search for real estate san francisco, since there is no match, i
> want to be able to return the other real estate results in california
> instead of returning none. Because sometimes they might be searching for a
> real estate form and city probably doesn't matter. 
> 
> I have two things in mind. One is adding a synonym mapping
> san diego, california
> carlsbad, california
> san francisco, california
> 
> (which probably isn't the best way)
> hoping that search for san francisco real estate would map san francisco to
> california and hence return the other two california results
> 
> OR
> 
> adding the mapping of city to state in the index itself like..
> 
> title                      | state      | city                               | description
> real estate san diego eg 1 | california | carlsbad, san francisco, san diego | some description
> real estate carlsbad eg 2  | california | carlsbad, san francisco, san diego | some description
> 
> which of the above two is better. Does a huge synonym file affect
> performance. Or Is there a even better way? I'm sure there is but I can't
> put my finger on it yet & I'm not familiar with java either.
> 
> -- 
> View this message in context: 
> http://www.nabble.com/Is-there-Downside-to-a-huge-synonyms-file--tp23842527p23842527.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: spell checking

2009-06-02 Thread Yao Ge

Excellent. Now everything makes sense to me. :-)

The spell-checking suggestion is the closest variant of the user input that
actually exists in the main index. The so-called "correction" is relative to
the indexed text, so there is no need for a brute-force list of all correctly
spelled words. Maybe we should call this "alternative search terms" or
"suggested search terms" instead of spell checking. The name is misleading, as
there is no right or wrong spelling here, only popular (term frequency?)
alternatives.

Thanks for the insight.


Otis Gospodnetic wrote:
> 
> 
> Hello,
> 
> In short, the assumption behind this type of SC is that the text in the
> main index is (mostly) correctly spelled.  When the SC finds query
> terms that are close in spelling to words indexed in SC, it offers
> spelling suggestions/correction using those presumably correctly spelled
> terms (there are other parameters that control the exact behaviour, but
> this is the idea)
> 
> Solr (Lucene's spellchecker, which Solr uses under the hood, actually)
> turn the input text (values from those fields you copy to the spell field)
> into so called n-grams.  You can see that if you open up the SC index with
> something like Luke.  Please see
> http://wiki.apache.org/jakarta-lucene/SpellChecker .
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
>> From: Yao Ge 
>> To: solr-user@lucene.apache.org
>> Sent: Tuesday, June 2, 2009 5:34:07 PM
>> Subject: Re: spell checking
>> 
>> 
>> Sorry for not be able to get my point across.
>> 
>> I know the syntax that leads to a index build for spell checking. I
>> actually
>> run the command saw some additional file created in data\spellchecker1
>> directory. What I don't understand is what is in there as I can not trick
>> Solr to make spell suggestions based on the documented query structure in
>> wiki. 
>> 
>> Can anyone tell me what happened after when the default spell check is
>> built? In my case, I used copyField to copy a couple of text fields into
>> a
>> field called "spell". These fields are the original text, they are the
>> ones
>> with typos that I need to run spell check on. But how can these original
>> data be used as a base for spell checking? How does Solr know what are
>> correctly spelled words?
>> 
>>   
>> multiValued="true"/>
>>   
>> multiValued="true"/>
>>...
>>   
>> multiValued="true"/>
>>...
>>   
>>   
>> 
>> 
>> 
>> Yao Ge wrote:
>> > 
>> > Can someone help providing a tutorial like introduction on how to get
>> > spell-checking work in Solr. It appears many steps are requires before
>> the
>> > spell-checkering functions can be used. It also appears that a
>> dictionary
>> > (a list of correctly spelled words) is required to setup the spell
>> > checker. Can anyone validate my impression?
>> > 
>> > Thanks.
>> > 
>> 
>> -- 
>> View this message in context: 
>> http://www.nabble.com/spell-checking-tp23835427p23841373.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/spell-checking-tp23835427p23844050.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: spell checking

2009-06-02 Thread Otis Gospodnetic

I'm glad my late night explanation helped.
You may be right about there being a better name for this functionality.
Note that we do have support for file-based (dictionary-like) spellchecker, too.
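For reference, the file-based spellchecker is configured in solrconfig.xml
along these lines (a sketch assuming the SpellCheckComponent; the file and
directory names are examples, and spellings.txt would hold one correctly
spelled word per line):

```xml
<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">file</str>
    <str name="classname">solr.FileBasedSpellChecker</str>
    <!-- plain-text dictionary: one correctly spelled word per line -->
    <str name="sourceLocation">spellings.txt</str>
    <str name="characterEncoding">UTF-8</str>
    <str name="spellcheckIndexDir">./spellcheckerFile</str>
  </lst>
</searchComponent>
```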

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Yao Ge 
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 2, 2009 9:42:48 PM
> Subject: Re: spell checking
> 
> 
> Excellent. Now everything make sense to me. :-)
> 
> The spell checking suggestion is the closest variance of user input that
> actually existed in the main index. So called "correction" is relative the
> text existed indexed. So there is no need for a brute force list of all
> correctly spelled words. Maybe we should call this "alternative search
> terms" or "suggested search terms" instead of spell checking. It is
> misleading as there is no right or wrong in spelling, there is only popular
> (term frequency?) alternatives.
> 
> Thanks for the insight.
> 
> 
> Otis Gospodnetic wrote:
> > 
> > 
> > Hello,
> > 
> > In short, the assumption behind this type of SC is that the text in the
> > main index is (mostly) correctly spelled.  When the SC finds query
> > terms that are close in spelling to words indexed in SC, it offers
> > spelling suggestions/correction using those presumably correctly spelled
> > terms (there are other parameters that control the exact behaviour, but
> > this is the idea)
> > 
> > Solr (Lucene's spellchecker, which Solr uses under the hood, actually)
> > turn the input text (values from those fields you copy to the spell field)
> > into so called n-grams.  You can see that if you open up the SC index with
> > something like Luke.  Please see
> > http://wiki.apache.org/jakarta-lucene/SpellChecker .
> > 
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> > 
> > 
> > 
> > - Original Message 
> >> From: Yao Ge 
> >> To: solr-user@lucene.apache.org
> >> Sent: Tuesday, June 2, 2009 5:34:07 PM
> >> Subject: Re: spell checking
> >> 
> >> 
> >> Sorry for not be able to get my point across.
> >> 
> >> I know the syntax that leads to a index build for spell checking. I
> >> actually
> >> run the command saw some additional file created in data\spellchecker1
> >> directory. What I don't understand is what is in there as I can not trick
> >> Solr to make spell suggestions based on the documented query structure in
> >> wiki. 
> >> 
> >> Can anyone tell me what happened after when the default spell check is
> >> built? In my case, I used copyField to copy a couple of text fields into
> >> a
> >> field called "spell". These fields are the original text, they are the
> >> ones
> >> with typos that I need to run spell check on. But how can these original
> >> data be used as a base for spell checking? How does Solr know what are
> >> correctly spelled words?
> >> 
> >>  
> >> multiValued="true"/>
> >>  
> >> multiValued="true"/>
> >>...
> >>  
> >> multiValued="true"/>
> >>...
> >>  
> >>  
> >> 
> >> 
> >> 
> >> Yao Ge wrote:
> >> > 
> >> > Can someone help providing a tutorial like introduction on how to get
> >> > spell-checking work in Solr. It appears many steps are requires before
> >> the
> >> > spell-checkering functions can be used. It also appears that a
> >> dictionary
> >> > (a list of correctly spelled words) is required to setup the spell
> >> > checker. Can anyone validate my impression?
> >> > 
> >> > Thanks.
> >> > 
> >> 
> >> -- 
> >> View this message in context: 
> >> http://www.nabble.com/spell-checking-tp23835427p23841373.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> > 
> > 
> > 
> 
> -- 
> View this message in context: 
> http://www.nabble.com/spell-checking-tp23835427p23844050.html
> Sent from the Solr - User mailing list archive at Nabble.com.



Re: How to avoid space on facet field

2009-06-02 Thread Anshuman Manur
Hey,

From what you have written, I'm guessing that in your schema.xml file you have
defined the field manu to be of type "text", which is good for keyword
searches, as the text type tokenizes on whitespace: "Dell Inc." is indexed as
"dell" and "inc", so a keyword search matches either dell or inc. But when you
want to facet on a particular field, you want exact matches regardless of the
whitespace in between. In such cases it's a good idea to use the string type.
Let me illustrate with an example based on my settings:

Here are my fields:

   
   
   
   
   
   
   
   

   
   

   
   
   
   

   
   
   
   
   
   

   
   
   

So, when doing keyword searches I use the field named text to search across
all the fields, since I copyField all the other fields onto it. But for
faceting I use the exact fields, which are of type string and don't
split on whitespace.
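The stock Solr example schema uses this pattern for the manu field; a sketch of
the relevant schema.xml lines (field names follow the example schema, so treat
them as illustrative):

```xml
<!-- analyzed field used for keyword search -->
<field name="manu" type="text" indexed="true" stored="true"/>
<!-- untokenized copy used only for faceting: facet.field=manu_exact -->
<field name="manu_exact" type="string" indexed="true" stored="false"/>

<copyField source="manu" dest="manu_exact"/>
```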


Anshu

On Wed, Jun 3, 2009 at 1:50 AM, Bny Jo  wrote:

>
> Hello,
>
>  I am wondering why solr is returning a manufacturer name field ( Dell,
> Inc) as Dell one result and Inc another result. Is there a way to facet a
> field which have space or delimitation on them?
>
> query.addFacetField("manu");
> query.setFacetMinCount(1);
>query.setIncludeScore(true);
>  List facetFieldList=qr.getFacetFields();
>for(FacetField facetField: facetFieldList){
>System.out.println(facetField.toString() +"Manufactures");
>}
> And it returns
> -
> [manu:[dell (5), inc (5), corp (1), sharp (1), sonic (1), view (1), viewson
> (1), vizo (1)]]
>
>
>
>


Re: Using Chinese / How to ?

2009-06-02 Thread Fer-Bj

Right now we have figured out the insert-new-documents problem, which was
solved by removing "special" ASCII chars not accepted in XML by SOLR 1.3.

The question now is: how to configure SOLR 1.3 with Chinese support!

James liu-2 wrote:
> 
> u means how to config solr which support chinese?
> 
> Update problem?
> 
> On Tuesday, June 2, 2009, Fer-Bj  wrote:
>>
>> I'm sending 3 files:
>> - schema.xml
>> - solrconfig.xml
>> - error.txt (with the error description)
>>
>> I can confirm by now that this error is due to invalid characters for the
>> XML format (ASCII 0 or 11).
>> However, this problem now is taking a different direction: how to start
>> using the CJK instead of the english!
>> http://www.nabble.com/file/p23825881/error.txt error.txt
>> http://www.nabble.com/file/p23825881/solrconfig.xml solrconfig.xml
>> http://www.nabble.com/file/p23825881/schema.xml schema.xml
>>
>>
>> Grant Ingersoll-6 wrote:
>>>
>>> Can you provide details on the errors?  I don't think we have a
>>> specific how to, but I wouldn't think it would be much different from
>>> 1.2
>>>
>>> -Grant
>>> On May 31, 2009, at 10:31 PM, Fer-Bj wrote:
>>>

 Hello,

 is there any "how to" already created to get me up using SOLR 1.3
 running
 for a chinese based website?
 Currently our site is using SOLR 1.2, and we tried to move into 1.3
 but we
 couldn't complete our reindex as it seems like 1.3 is more strict
 when it
 comes to special chars.

 I would appreciate any help anyone may provide on this.

 Thanks!!
 --
 View this message in context:
 http://www.nabble.com/Using-Chinese---How-to---tp23810129p23810129.html
 Sent from the Solr - User mailing list archive at Nabble.com.

>>>
>>> --
>>> Grant Ingersoll
>>> http://www.lucidimagination.com/
>>>
>>> Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)
>>> using Solr/Lucene:
>>> http://www.lucidimagination.com/search
>>>
>>>
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Using-Chinese---How-to---tp23810129p23825881.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> -- 
> regards
> j.L ( I live in Shanghai, China)
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Using-Chinese---How-to---tp23810129p23844708.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Is there Downside to a huge synonyms file?

2009-06-02 Thread anuvenk

I'm using query-time synonyms. I have more fields in my index though; this is
just a sample of the data from my index. Yes, we don't have millions of
documents: around 300,000, and that might increase in the future. The reason
I'm using query-time synonyms is the nature of my data: I can't re-index the
data every time I add or remove a synonym. But for this particular
requirement, is it best to have index-time synonyms because of the multi-word
synonym nature? Again, if I add more cities to the synonym file, I can't be
re-indexing all the data over and over again.



anuvenk wrote:
> 
> In my index i have legal faqs, forms, legal videos etc with a state field
> for each resource.
> Now if i search for real estate san diego, I want to be able to return
> other 'california' results i.e results from san francisco.
> I have the following fields in the index
> 
> title                           | state      | description
> real estate san diego example 1 | california | some description
> real estate carlsbad example 2  | california | some desc
> 
> so when i search for real estate san francisco, since there is no match, i
> want to be able to return the other real estate results in california
> instead of returning none. Because sometimes they might be searching for a
> real estate form and city probably doesn't matter. 
> 
> I have two things in mind. One is adding a synonym mapping
> san diego, california
> carlsbad, california
> san francisco, california
> 
> (which probably isn't the best way)
> hoping that search for san francisco real estate would map san francisco
> to california and hence return the other two california results
> 
> OR
> 
> adding the mapping of city to state in the index itself like..
> 
> title                      | state      | city                               | description
> real estate san diego eg 1 | california | carlsbad, san francisco, san diego | some description
> real estate carlsbad eg 2  | california | carlsbad, san francisco, san diego | some description
> 
> which of the above two is better. Does a huge synonym file affect
> performance. Or Is there a even better way? I'm sure there is but I can't
> put my finger on it yet & I'm not familiar with java either.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Is-there-Downside-to-a-huge-synonyms-file--tp23842527p23844761.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Is there Downside to a huge synonyms file?

2009-06-02 Thread Otis Gospodnetic

Hello,

300K is a pretty small index.  I wouldn't worry about the number of synonyms 
unless you are turning a single term into dozens of ORed terms.

Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: anuvenk 
> To: solr-user@lucene.apache.org
> Sent: Tuesday, June 2, 2009 11:28:43 PM
> Subject: Re: Is there Downside to a huge synonyms file?
> 
> 
> I'm using query time synonyms. I have more fields in my index though. This is
> just an example or sample of data from my index. Yes, we don't have millions
> of documents. Could be around 300,000 and might increase in future. The
> reason i'm using query time synonyms is because of the nature of my data. I
> can't re-index the data everytime i add or remove a synonym. But for this
> particular requirement is it best to have index time synonyms because of the
> multi-word synonym nature. Again if i add more cities list to the synonym
> file, I can't be re-indexing all the data over and over again. 
> 
> 
> 
> anuvenk wrote:
> > 
> > In my index i have legal faqs, forms, legal videos etc with a state field
> > for each resource.
> > Now if i search for real estate san diego, I want to be able to return
> > other 'california' results i.e results from san francisco.
> > I have the following fields in the index
> > 
> > title                           | state      | description
> > real estate san diego example 1 | california | some description
> > real estate carlsbad example 2  | california | some desc
> > 
> > so when i search for real estate san francisco, since there is no match, i
> > want to be able to return the other real estate results in california
> > instead of returning none. Because sometimes they might be searching for a
> > real estate form and city probably doesn't matter. 
> > 
> > I have two things in mind. One is adding a synonym mapping
> > san diego, california
> > carlsbad, california
> > san francisco, california
> > 
> > (which probably isn't the best way)
> > hoping that search for san francisco real estate would map san francisco
> > to california and hence return the other two california results
> > 
> > OR
> > 
> > adding the mapping of city to state in the index itself like..
> > 
> > title                      | state      | city                               | description
> > real estate san diego eg 1 | california | carlsbad, san francisco, san diego | some description
> > real estate carlsbad eg 2  | california | carlsbad, san francisco, san diego | some description
> > 
> > which of the above two is better. Does a huge synonym file affect
> > performance. Or Is there a even better way? I'm sure there is but I can't
> > put my finger on it yet & I'm not familiar with java either.
> > 
> > 
> 
> -- 
> View this message in context: 
> http://www.nabble.com/Is-there-Downside-to-a-huge-synonyms-file--tp23842527p23844761.html
> Sent from the Solr - User mailing list archive at Nabble.com.



fq vs. q

2009-06-02 Thread Martin Davidsson
I've tried to read up on how to decide, when writing a query, what  
criteria goes in the q parameter and what goes in the fq parameter, to  
achieve optimal performance. Is there some documentation that  
describes how each field is treated internally, or even better, some  
kind of rule of thumb to help me decide how to split things up when  
querying against one or more fields. In most cases, I'm looking for  
exact matches but sometimes an occasional wildcard query shows up too.  
Thank you!


-- Martin
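For reference, the usual split looks like the sketch below (field names
invented for illustration): the relevance-scored keyword part goes in q, while
exact-match constraints that repeat across queries go in fq, where each filter
is cached independently and does not affect scoring:

```text
# scored full-text part
q=title:solr
# cached, non-scoring filters; a good fit for exact matches
fq=category:books
fq=inStock:true
```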



Re: Using Chinese / How to ?

2009-06-02 Thread James liu
1: modify your schema.xml, adding a field type that uses a Chinese analyzer

2: add your field using that field type

3: add your analyzer jar to {solr_dir}\lib\

4: rebuild Solr and you will find it in {solr_dir}\dist

5: follow the tutorial to set up Solr

6: open the Solr admin page in your browser and use the analysis tool to check
how your text is analyzed and which analyzer is applied
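For steps 1 and 2, a schema.xml sketch along these lines would do it (the
type and field names here are examples, not from the original post; CJKAnalyzer
is one of the analyzers bundled with Lucene for Chinese/Japanese/Korean text):

```xml
<!-- sketch: a field type backed by Lucene's bundled CJK analyzer -->
<fieldType name="text_cjk" class="solr.TextField">
  <analyzer class="org.apache.lucene.analysis.cjk.CJKAnalyzer"/>
</fieldType>

<field name="content" type="text_cjk" indexed="true" stored="true"/>
```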


-- 
regards
j.L ( I live in Shanghai, China)


Re: Dismax handler phrase matching question

2009-06-02 Thread Shalin Shekhar Mangar
On Wed, Jun 3, 2009 at 1:59 AM, anuvenk  wrote:

>
> I have to search over multiple fields so passing everything in the 'q'
> might
> not be neat. Can something be done with the facet.query to accomplish this.
> I'm using the facet parameters. I'm not familiar with java so not sure if a
> function query could be used to accomplish this. Any other thoughts?
>
>
I don't think facet.query and function queries have anything to do with
this. Using the dismax params seem to be the right way.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Phrase query search returns no result

2009-06-02 Thread SergeyG

Yes, Erick, I did. Actually, the course of events was as follows. I started
with the example config files (solrconfig.xml & schema.xml) and added my own
fields. My search has 2 clauses: one for a phrase and one for a set of
keywords. And from the very beginning it worked fine. Until, on the second
day, one phrase ("It was as long as a tree") gave me back the wrong response.
Trying to find the reason, I started changing different parameters one by one
(field types - from text to string and back, copyFields, analyzers, etc.).
The result: I came to a situation where all the queries returned only
wrong responses. During my "research" I deleted all the indexed xml files
several times, which, in theory, should have cleaned up the index itself (as I
understand it). And then I decided to start all over again. The only two
differences from the very first attempt were that I turned the StopWordsFilter
off (although I had toggled it several times while playing with params;
besides, the phrase that initially caused trouble doesn't consist only of stop
words) and that I commented out the copyField declarations for my own fields.
I'm still wondering what happened.

Thank you,
Sergey


Erick Erickson wrote:
> 
> Did you by any chance change your schema? Rename a field? Change your
> analyzers? etc? between the time you originally
> generated your index and blowing it away?
> 
> I'm wondering if blowing away your index and regenerating just
> caused any changes in how you index/search to get picked
> up...
> 
> Best
> Erick
> 
> On Tue, Jun 2, 2009 at 3:28 PM, SergeyG  wrote:
> 
>>
>> Hmmm... It looks a bit magic. After 3 days of experimenting with various
>> parameters and getting only wrong results, I deleted all the indexed data
>> and left the minimum set of parameters: qs=default (I omitted it),
>> StopWords=off (StopWordsFilter was commented out), no copyFields,
>> requestHandler=standard. And guess what - it started producing the
>> expected
>> results! :) So for me the question remains: what was the cause of all the
>> previous trouble?
>> Anyway, thanks for the discussion.
>>
>>
>> SergeyG wrote:
>> >
>> > Actually, "my phrase here"~0 (for an exact match) didn't work I tried,
>> > just for to experiment, to put "qs=100".
>> >
>> > Otis Gospodnetic wrote:
>> >>
>> >>
>> >> And "your phrase here"~100 works?
>> >>
>> >>  Otis
>> >> --
>> >> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>> >>
>> >>
>> >>
>> >> - Original Message 
>> >>> From: SergeyG 
>> >>> To: solr-user@lucene.apache.org
>> >>> Sent: Tuesday, June 2, 2009 11:17:23 AM
>> >>> Subject: Re: Phrase query search returns no result
>> >>>
>> >>>
>> >>> Thanks, Otis.
>> >>>
>> >>> Checking for the stop words was the first thing I did after getting
>> the
>> >>> empty result. Not all of those words are in the stopwords.txt file.
>> Then
>> >>> just for experimenting purposes I commented out the StopWordsAnalyser
>> >>> during
>> >>> indexing and reindexed. But the phrase was not found again.
>> >>>
>> >>> Sergey
>> >>>
>> >>>
>> >>> Otis Gospodnetic wrote:
>> >>> >
>> >>> >
>> >>> > Your stopwords were removed during indexing, so if all those terms
>> >>> were
>> >>> > stopwords, and they likely were, none of them exist in the index
>> now.
>> >>> You
>> >>> > can double-check that with Luke.  You need to remove stopwords from
>> >>> the
>> >>> > index-time analyzer, too, and then reindex.
>> >>> >
>> >>> >  Otis
>> >>> > --
>> >>> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>> >>> >
>> >>> >
>> >>> >
>> >>> > - Original Message 
>> >>> >> From: SergeyG
>> >>> >> To: solr-user@lucene.apache.org
>> >>> >> Sent: Tuesday, June 2, 2009 9:57:17 AM
>> >>> >> Subject: Phrase query search returns no result
>> >>> >>
>> >>> >>
>> >>> >> Hi,
>> >>> >>
>> >>> >> I'm trying to implement a full-text search but can't get the right
>> >>> result
>> >>> >> with a Phrase query search. The field I search through was indexed
>> as
>> >>> a
>> >>> >> "text" field. The phrase was "It was as long as a tree". During
>> both
>> >>> >> indexing and searching the StopWordsFiler was on. For a search I
>> used
>> >>> >> these
>> >>> >> settings:
>> >>> >>
>> >>> >>
>> >>> >>   dismax
>> >>> >>   explicit
>> >>> >>
>> >>> >>   title author category content
>> >>> >>
>> >>> >>
>> >>> >>   id,title,author,isbn,category,content,score
>> >>> >>
>> >>> >>   100
>> >>> >>   content
>> >>> >>
>> >>> >>
>> >>> >>
>> >>> >> But I the returned docs list was empty. Using Solr Admin console
>> for
>> >>> >> debugging showed that parsedquery=+() ().
>> >>> >> Switching the StopwordsFilter off during searching didn't help
>> >>> either.
>> >>> >>
>> >>> >> Am I missing something?
>> >>> >>
>> >>> >> Thanks,
>> >>> >> Sergey
>> >>> >> --
>> >>> >> View this message in context:
>> >>> >>
>> >>>
>> http://www.nabble.com/Phrase-query-search-returns-no-result-tp23833024p23833024.html
>> >>> >> Sent from the Solr - User mailing list archive at Nabble.com.
>> >>> >
>> >>> >