Re: Sub entities

2011-03-01 Thread Stefan Matheis
Brian,

Except for the SQL syntax error in your specie_relations query, "SELECT
specie_id FROMspecie_relations .." (missing whitespace after FROM),
your config looks okay.

A few questions:
* Is there a field named specie in your schema? (Otherwise the DIH will
silently ignore it.)
* Did you check your MySQL query log to see which queries were
executed and what their results were?

And, just as a quick notice: there is no need to use <field column="foo" name="foo"/> when both attributes have the same value.
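For reference, a minimal sketch of the kind of nested DIH config being discussed. Table, column, and entity names here are hypothetical, not taken from Brian's actual setup:

```xml
<!-- sketch only: a child entity runs once per parent row; each <field>
     mapping must point at a field that actually exists in schema.xml -->
<entity name="animal" query="SELECT id, genus FROM animals">
  <field column="genus" name="genus"/>
  <entity name="specie"
          query="SELECT specie_id FROM specie_relations WHERE animal_id = '${animal.id}'">
    <field column="specie_id" name="specie"/>
  </entity>
</entity>
```

If the parent row can have several child rows, the specie field also needs to be declared multiValued in the schema, otherwise only a single value per document can be kept.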

Regards
Stefan

On Mon, Feb 28, 2011 at 9:52 PM, Brian Lamb
 wrote:
> Hi all,
>
> I was able to get my dataimport to work correctly but I'm a little unclear
> as to how the entity within an entity works in regards to search results.
> When I do a search for all results, it seems only the outermost responses
> are returned. For example, I have the following in my db config file:
>
> 
>   driver="com.mysql.jdbc.Driver"
> url="jdbc:mysql://localhost/db?characterEncoding=UTF8&zeroDateTimeBehavior=convertToNull"
> user="user" password="password"/>
>    
>      
>        
>        
>        
>
>        
>        
>          
>            
>          
>        
>      
>    
>  
> 
>
> However, specie never shows up in my search results:
>
> 
>  Mammal
>  1
>  Canis
> 
>
> I had hoped the results would include the species. Can it? If so, what is my
> malfunction?
>


Re: Disabling caching for fq param?

2011-03-01 Thread Markus Jelsma
If the filterCache hit ratio is low then just disable it in solrconfig.xml by
deleting the section or setting its values to 0.
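A minimal sketch of what that looks like in solrconfig.xml (the sizes are illustrative; simply deleting the element has the same effect):

```xml
<!-- effectively disable the filterCache -->
<filterCache class="solr.FastLRUCache"
             size="0" initialSize="0" autowarmCount="0"/>
```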

> Based on what I've read here and what I could find on the web, it seems
> that each fq clause essentially gets its own results cache.  Is that
> correct?
> 
> We have a corporate policy of passing the user's Oracle OLS labels into the
> index in order to be matched against the labels field.  I currently
> separate this from the user's query text by sticking it into an fq
> param...
> 
> ?q=
> &fq=labels:
> &qf= 
> &tie=0.1
> &defType=dismax
> 
> ...but since its value (a collection of hundreds of label values) only
> applies to that user, the accompanying result set won't be reusable by other
> users.
> 
> My understanding is that this query will result in two result sets (q and
> fq) being cached separately, with the union of the two sets being returned
> to the user.  (Is that correct?)
> 
> There are thousands of users, each with a unique combination of labels, so
> there seems to be little value in caching the result set created from the
> fq labels param.  It would be beneficial if there were some kind of fq
> parameter override telling Solr not to cache those results.
> 
> 
> Thanks!


Re: Problem with sorting using functions.

2011-03-01 Thread Jan Høydahl
Also, if you're on 3.1, the function needs to be without spaces since sort will 
split on space to find the sort order.
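Illustrative URLs (not from the thread) showing the difference the whitespace makes:

```
/select?q=*:*&sort=sum(1,1) desc     works: no whitespace inside the function
/select?q=*:*&sort=sum(1, 1) desc    fails: the inner space splits the sort spec early
```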

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 28. feb. 2011, at 22.34, John Sherwood wrote:

> Fair call.  Thanks.
> 
> On Tue, Mar 1, 2011 at 8:21 AM, Geert-Jan Brits  wrote:
>> sort by functionquery is only available from solr 3.1 (from :
>> http://wiki.apache.org/solr/FunctionQuery#Sort_By_Function)
>> 
>> 
>> 2011/2/28 John Sherwood 
>> 
>>> This works:
>>> /select/?q=*:*&sort=price desc
>>> 
>>> This throws a 400 error:
>>> /select/?q=*:*&sort=sum(1, 1) desc
>>> 
>>> "Missing sort order."
>>> 
>>> I'm using 1.4.2.  I've tried all sorts of different numbers, functions, and
>>> fields but nothing seems to change that error.  Any ideas?
>>> 
>> 



Re: multi-core solr, specifying the data directory

2011-03-01 Thread Jan Høydahl
Have you tried removing the <dataDir> tag from solrconfig.xml? Then it should
fall back to the default ./data relative to the core instanceDir.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 1. mars 2011, at 00.00, Jonathan Rochkind wrote:

> Unless I'm doing something wrong, in my experience in multi-core Solr in 
> 1.4.1, you NEED to explicitly provide an absolute path to the 'data' dir.
> 
> I set up multi-core like this:
> 
> 
> 
> 
> 
> 
> 
> Now, setting instanceDir like that works for Solr to look for the 'conf' 
> directory in the default location you'd expect, ./some_core/conf.
> 
> You'd expect it to look for the 'data' dir for an index in ./some_core/data 
> too, by default.  But it does not seem to. It's still looking for the 'data' 
> directory in the _main_ solr.home/data, not under the relevant core directory.
> 
> The only way I can manage to get it to look for the /data directory where I 
> expect is to spell it out with a full absolute path:
> 
> 
> 
> 
> 
> And then in the solrconfig.xml do a <dataDir>${dataDir}</dataDir>
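A sketch of that workaround (core names and paths are hypothetical):

```xml
<!-- solr.xml -->
<cores adminPath="/admin/cores">
  <core name="some_core" instanceDir="some_core">
    <property name="dataDir" value="/abs/path/to/solr/some_core/data"/>
  </core>
</cores>

<!-- solrconfig.xml: pick up the per-core property -->
<dataDir>${dataDir}</dataDir>
```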
> 
> Is this what everyone else does too? Or am I missing a better way of doing 
> this?  I would have thought it would "just work", with Solr by default 
> looking for a ./data subdir of the specified instanceDir.  But it definitely 
> doesn't seem to do that.
> 
> Should it? Anyone know if Solr in trunk past 1.4.1 has been changed to do 
> what I expect? Or am I wrong to expect it? Or does everyone else do 
> multi-core in some different way than me where this doesn't come up?
> 
> Jonathan
> 



Re: Problem with Solr and Nutch integration

2011-03-01 Thread Paul Rogers
Hi Anurag

The request handler has been added to the solrconfig file.

I'll try your attached requesthandler and see if that helps.

Interestingly enough, the whole setup worked when I was using Nutch 1.2/Solr 1.4.1.
 It is only since moving to Nutch trunk/Solr branch_3x that the problem has
occurred.  I assume that something has changed in between and the tutorial's
request handler is incorrect for the later Solr version.  Which versions of
Solr/Nutch are you using?

Assuming the catalina.out file is the correct log file, the output I get is
shown below.  This output occurs on restarting the solr-example after adding
the new request handler.  When I access the Solr admin page no additional
logging occurs.  Can anyone see the problem?

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome

INFO: Using JNDI solr.home: /opt/solr/example/solr

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader 

INFO: Solr home set to '/opt/solr/example/solr/'

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
addToClassLoader

SEVERE: Can't find (or read) file to add to classloader:
/opt/solr/example/solr/./lib

Feb 28, 2011 6:28:59 PM org.apache.solr.servlet.SolrDispatchFilter init

INFO: SolrDispatchFilter.init()

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome

INFO: Using JNDI solr.home: /opt/solr/example/solr

Feb 28, 2011 6:28:59 PM org.apache.solr.core.CoreContainer$Initializer
initialize
INFO: looking for solr.xml: /opt/solr/example/solr/solr.xml

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome

INFO: Using JNDI solr.home: /opt/solr/example/solr

Feb 28, 2011 6:28:59 PM org.apache.solr.core.CoreContainer 

INFO: New CoreContainer: solrHome=/opt/solr/example/solr/ instance=6794958

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader 

INFO: Solr home set to '/opt/solr/example/solr/'

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
addToClassLoader

SEVERE: Can't find (or read) file to add to classloader:
/opt/solr/example/solr/./lib

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader 

INFO: Solr home set to '/opt/solr/example/solr/./'

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
addToClassLoader

SEVERE: Can't find (or read) file to add to classloader:
/opt/solr/example/solr/././lib

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrConfig initLibs

INFO: Adding specified lib dirs to ClassLoader

Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding
'file:/opt/solr/contrib/extraction/lib/commons-compress-1.1.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/log4j-1.2.14.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding
'file:/opt/solr/contrib/extraction/lib/commons-logging-1.1.1.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/tika-parsers-0.8.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/asm-3.1.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/icu4j-4_6.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/xercesImpl-2.8.1.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/bcmail-jdk15-1.45.jar'
to classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/fontbox-1.3.1.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/poi-3.7.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/dom4j-1.6.1.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding
'file:/opt/solr/contrib/extraction/lib/geronimo-stax-api_1.0_spec-1.0.1.jar'
to classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/poi-ooxml-3.7.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding 'file:/opt/solr/contrib/extraction/lib/xml-apis-1.0.b2.jar' to
classloader
Feb 28, 2011 6:28:59 PM org.apache.solr.core.SolrResourceLoader
replaceClassLoader

INFO: Adding
'file:/

Re: Problem with Solr and Nutch integration

2011-03-01 Thread Anurag
I have Nutch 1.0 and Apache Solr 1.3.0 (integrated these two).

On 3/1/11, Paul Rogers [via Lucene]
 wrote:
>
>
> Hi Anurag
>
> The request handler has been added to the solrconfig file.
>
> I'll try your attached requesthandler and see if that helps.
>
> Interestingly enough, the whole setup worked when I was using Nutch 1.2/Solr 1.4.1.
>  It is only since moving to Nutch trunk/Solr branch_3x that the problem has
> occurred.  I assume that something has changed in between and the tutorial's
> request handler is incorrect for the later Solr version.  Which versions of
> Solr/Nutch are you using?
>
> Assuming the catalina.out file is the correct log file the output I get is
> shown below.  This output occurs on restarting the solr-example after adding
> the new requesthandler.  When I access the solr admin page no additional
> logging occurs.  Can any one see the problem?
>
> [quoted log output snipped; see the original message above]

Error during auto-warming of key

2011-03-01 Thread Markus Jelsma
Hi,

Yesterday's error log contains something peculiar: 

 ERROR [solr.search.SolrCache] - [pool-29-thread-1] - : Error during auto-
warming of key:+*:* 
(1.0/(7.71E-8*float(ms(const(1298682616680),date(sort_date)))+1.0))^20.0:java.lang.NullPointerException
at org.apache.lucene.util.StringHelper.intern(StringHelper.java:36)
at 
org.apache.lucene.search.FieldCacheImpl$Entry.(FieldCacheImpl.java:275)
at 
org.apache.lucene.search.FieldCacheImpl.getLongs(FieldCacheImpl.java:525)
at 
org.apache.solr.search.function.LongFieldSource.getValues(LongFieldSource.java:57)
at 
org.apache.solr.search.function.DualFloatFunction.getValues(DualFloatFunction.java:48)
at 
org.apache.solr.search.function.ReciprocalFloatFunction.getValues(ReciprocalFloatFunction.java:61)
at 
org.apache.solr.search.function.FunctionQuery$AllScorer.(FunctionQuery.java:123)
at 
org.apache.solr.search.function.FunctionQuery$FunctionWeight.scorer(FunctionQuery.java:93)
at 
org.apache.lucene.search.BooleanQuery$BooleanWeight.scorer(BooleanQuery.java:297)
at 
org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:246)
at org.apache.lucene.search.Searcher.search(Searcher.java:171)
at 
org.apache.solr.search.SolrIndexSearcher.getDocSetNC(SolrIndexSearcher.java:651)
at 
org.apache.solr.search.SolrIndexSearcher.getDocSet(SolrIndexSearcher.java:545)
at 
org.apache.solr.search.SolrIndexSearcher.cacheDocSet(SolrIndexSearcher.java:520)
at 
org.apache.solr.search.SolrIndexSearcher$2.regenerateItem(SolrIndexSearcher.java:296)
at org.apache.solr.search.FastLRUCache.warm(FastLRUCache.java:168)
at 
org.apache.solr.search.SolrIndexSearcher.warm(SolrIndexSearcher.java:1481)
at org.apache.solr.core.SolrCore$2.call(SolrCore.java:1131)
at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
at java.util.concurrent.FutureTask.run(FutureTask.java:138)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
at 
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
at java.lang.Thread.run(Thread.java:619)


Well, I use dismax's bf parameter to boost very recent documents. I'm not using 
the queryResultCache or documentCache, only the filterCache and the Lucene 
FieldCache. I've checked LUCENE-1890 but am unsure if that's the issue. Any 
thoughts on this one?

https://issues.apache.org/jira/browse/LUCENE-1890

Cheers,

-- 
Markus Jelsma - CTO - Openindex
http://www.linkedin.com/in/markus17
050-8536620 / 06-50258350


Problem with sorting using functions

2011-03-01 Thread John Sherwood
This works:
/select/?q=*:*&sort=price desc

This throws a 400 error:
/select/?q=*:*&sort=sum(1, 1) desc

"Missing sort order."

I'm using 1.4.2.  I've tried all sorts of different numbers/functions/fields
and nothing seems to change that error.  Any ideas?


Retrieving payload from each highlighted term

2011-03-01 Thread Fabiano Nunes
How can I get the payload from each highlighted term?


Re: Disabling caching for fq param?

2011-03-01 Thread mrw
We use fq params for filtering as well (not shown in the previous example), so we
only want to be able to override fq caching on a per-parameter basis (e.g.,
fq={!noCache userLabels}).
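Something very close to this hypothetical syntax does exist in later Solr releases (around 3.4 onward, so not in the 1.4/3.1-era versions discussed here) as a real local param on fq; the label values below are placeholders:

```
&fq={!cache=false}labels:(101 205 317)
```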

Thanks


Markus Jelsma-2 wrote:
> 
> If filterCache hitratio is low then just disable it in solrconfig by
> deleting 
> the section or setting its values to 0.
> 
>> Based on what I've read here and what I could find on the web, it seems
>> that each fq clause essentially gets its own results cache.  Is that
>> correct?
>> 
>> We have a corporate policy of passing the user's Oracle OLS labels into
>> the
>> index in order to be matched against the labels field.  I currently
>> separate this from the user's query text by sticking it into an fq
>> param...
>> 
>> ?q=
>> &fq=labels:
>> &qf= 
>> &tie=0.1
>> &defType=dismax
>> 
>> ...but since its value (a collection of hundreds of label values) only
>> apply to that user, the accompanying result set won't be reusable by
>> other
>> users:
>> 
>> My understanding is that this query will result in two result sets (q and
>> fq) being cached separately, with the union of the two sets being
>> returned
>> to the user.  (Is that correct?)
>> 
>> There are thousands of users, each with a unique combination of labels,
>> so
>> there seems to be little value in caching the result set created from the
>> fq labels param.  It would be beneficial if there were some kind of fq
>> parameter override to indicate to Solr to not cache the results?
>> 
>> 
>> Thanks!
> 
> 


-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Disabling-caching-for-fq-param-tp2600188p2602986.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Disabling caching for fq param?

2011-03-01 Thread mrw
That clause will always be the same per-user (i.e., you have values 1,2,4 and
I have values 1,2,8) across queries.  In the result set denoted by the
labels param, some users will have tens of thousands of documents and others
will have millions of documents.  

It sounds like you don't see a huge problem with our approach, so maybe
we'll stick with it for the time being.

Thanks!
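A sketch of the cache sizing Jonathan suggests below (the numbers are illustrative, not a recommendation):

```xml
<!-- solrconfig.xml: a larger filterCache so per-user fq entries
     can coexist with the shared filters -->
<filterCache class="solr.FastLRUCache"
             size="16384" initialSize="4096" autowarmCount="512"/>
```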


Jonathan Rochkind wrote:
> 
> As far as I know there is not, it might be beneficial, but also worth
> considering: "thousands" of users isn't _that_ many, and if that same
> clause is always the same per user, then if the same user does a query a
> second time, it wouldn't hurt to have their user-specific fq in the cache. 
> A single fq cache entry may not take as much RAM as you think; you could
> potentially afford to increase your fq cache size to
> thousands/tens-of-thousands, and win all the way around. 
> 
> The filter cache should be a least-recently-used-out-first cache, so even
> if the filter cache isn't big enough for all of them, fq's that are used
> by more than one user will probably stay in the cache as old user-specific
> fq's end up falling off the back as least-recently-used. 
> 
> So in actual practice, one way or another, it may not be a problem. 
> 
> From: mrw [mikerobertsw...@gmail.com]
> Sent: Monday, February 28, 2011 9:06 PM
> To: solr-user@lucene.apache.org
> Subject: Disabling caching for fq param?
> 
> Based on what I've read here and what I could find on the web, it seems
> that
> each fq clause essentially gets its own results cache.  Is that correct?
> 
> We have a corporate policy of passing the user's Oracle OLS labels into
> the
> index in order to be matched against the labels field.  I currently
> separate
> this from the user's query text by sticking it into an fq param...
> 
> ?q=
> &fq=labels:
> &qf= 
> &tie=0.1
> &defType=dismax
> 
> ...but since its value (a collection of hundreds of label values) only
> apply
> to that user, the accompanying result set won't be reusable by other
> users:
> 
> My understanding is that this query will result in two result sets (q and
> fq) being cached separately, with the union of the two sets being returned
> to the user.  (Is that correct?)
> 
> There are thousands of users, each with a unique combination of labels, so
> there seems to be little value in caching the result set created from the
> fq
> labels param.  It would be beneficial if there were some kind of fq
> parameter override to indicate to Solr to not cache the results?
> 
> 
> Thanks!
> 
> 
> 
> 
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Disabling-caching-for-fq-param-tp2600188p2600188.html
> Sent from the Solr - User mailing list archive at Nabble.com.
> 
> 


-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/Disabling-caching-for-fq-param-tp2600188p2603009.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: Query on multivalue field

2011-03-01 Thread Steven A Rowe
Hi Scott,

Querying against a multi-valued field just works - no special incantation 
required.

Steve
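A minimal illustration (field name and values are hypothetical): a query matches a document if any one of the field's stored values matches.

```xml
<!-- schema.xml -->
<field name="keywords" type="text" indexed="true" stored="true" multiValued="true"/>

<!-- a document indexed with keywords = ["solr", "lucene"]
     is returned by q=keywords:lucene -->
```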

> -Original Message-
> From: Scott Yeadon [mailto:scott.yea...@anu.edu.au]
> Sent: Monday, February 28, 2011 11:50 PM
> To: solr-user@lucene.apache.org
> Subject: Query on multivalue field
> 
> Hi,
> 
> I have a variable number of text-based fields associated with each
> primary record which I wanted to apply a search across. I wanted to
> avoid the use of dynamic fields if possible or having to create a
> different document type in the index (as the app is based around the
> primary record and different views mean a lot of work to revamp
> pagination etc).
> 
> So, is there a way to apply a query to each value of a multivalued field
> or is it always treated as a "single" field from a query perspective?
> 
> Thanks.
> 
> Scott.


Help with explain query syntax

2011-03-01 Thread Glòria Martínez
Hello,

I can't understand why this query is not matching anything. Could someone
help me please?

*Query*
http://localhost:8894/solr/select?q=linguajob.pl&qf=company_name&wt=xml&qt=dismax&debugQuery=on&explainOther=id%3A1



0
12

id:1
on
linguajob.pl
company_name
xml
dismax




linguajob.pl
linguajob.pl

+DisjunctionMaxQuery((company_name:"(linguajob.pl linguajob) pl")~0.01) ()


+(company_name:"(linguajob.pl linguajob) pl")~0.01 ()


id:1



0.0 = (NON-MATCH) Failure to meet condition(s) of required/prohibited
clause(s)
  0.0 = no match on required clause (company_name:"(linguajob.pl linguajob)
pl") *<- What does this syntax (field:"(token1 token2) token3") mean?*
0.0 = (NON-MATCH) fieldWeight(company_name:"(linguajob.pl linguajob) pl"
in 0), product of:
  0.0 = tf(phraseFreq=0.0)
  1.6137056 = idf(company_name:"(linguajob.pl linguajob) pl")
  0.4375 = fieldNorm(field=company_name, doc=0)


DisMaxQParser


+

...




There's only one document indexed:

*Document*
http://localhost:8894/solr/select?q=1&qf=id&wt=xml&qt=dismax


0
2

id
xml
dismax
1




LinguaJob.pl
1
6
2011-03-01T11:14:24.553Z




*Solr Admin Schema*
Field: company_name
Field Type: text
Properties: Indexed, Tokenized, Stored
Schema: Indexed, Tokenized, Stored
Index: Indexed, Tokenized, Stored

Position Increment Gap: 100

Index Analyzer: org.apache.solr.analysis.TokenizerChain Details
Tokenizer Class: org.apache.solr.analysis.WhitespaceTokenizerFactory
Filters:
schema.UnicodeNormalizationFilterFactory args:{composed: false
remove_modifiers: true fold: true version: java6 remove_diacritics: true }
org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt
ignoreCase: true enablePositionIncrements: true }
org.apache.solr.analysis.WordDelimiterFilterFactory args:{preserveOriginal:
1 splitOnCaseChange: 1 generateNumberParts: 1 catenateWords: 1
generateWordParts: 1 catenateAll: 0 catenateNumbers: 1 }
org.apache.solr.analysis.LowerCaseFilterFactory args:{}
org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}

Query Analyzer: org.apache.solr.analysis.TokenizerChain Details
Tokenizer Class: org.apache.solr.analysis.WhitespaceTokenizerFactory
Filters:
schema.UnicodeNormalizationFilterFactory args:{composed: false
remove_modifiers: true fold: true version: java6 remove_diacritics: true }
org.apache.solr.analysis.SynonymFilterFactory args:{synonyms: synonyms.txt
expand: true ignoreCase: true }
org.apache.solr.analysis.StopFilterFactory args:{words: stopwords.txt
ignoreCase: true }
org.apache.solr.analysis.WordDelimiterFilterFactory args:{preserveOriginal:
1 splitOnCaseChange: 1 generateNumberParts: 1 catenateWords: 0
generateWordParts: 1 catenateAll: 0 catenateNumbers: 0 }
org.apache.solr.analysis.LowerCaseFilterFactory args:{}
org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory args:{}
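One possible reading of the mismatch (an assumption, not confirmed in the thread): at index time "LinguaJob.pl" is split on the case change into three positions (lingua / job / pl), while the already-lowercased query "linguajob.pl" splits only on the dot into two, so the positions in the auto-generated phrase query cannot line up with the index. Since the catenated token linguajobpl is already in the index, one common fix to try is letting the query-time WordDelimiterFilter catenate as well:

```xml
<!-- sketch: align the query-time filter with the index-time settings -->
<filter class="solr.WordDelimiterFilterFactory"
        preserveOriginal="1" splitOnCaseChange="1"
        generateWordParts="1" generateNumberParts="1"
        catenateWords="1" catenateNumbers="1" catenateAll="0"/>
```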

Docs: 1
Distinct: 5
Top 5 terms
term frequency
lingua 1
linguajob.pl 1
linguajobpl 1
pl 1
job 1

*Solr Analysis*
Field name: company_name
Field value (Index): LinguaJob.pl
Field value (Query): linguajob.pl

*Index Analyzer

org.apache.solr.analysis.WhitespaceTokenizerFactory {}
term position 1
term text LinguaJob.pl
term type word
source start,end 0,12
payload

schema.UnicodeNormalizationFilterFactory {composed=false,
remove_modifiers=true, fold=true, version=java6, remove_diacritics=true}
term position 1
term text LinguaJob.pl
term type word
source start,end 0,12
payload

org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt,
ignoreCase=true, enablePositionIncrements=true}
term position 1
term text LinguaJob.pl
term type word
source start,end 0,12
payload

org.apache.solr.analysis.WordDelimiterFilterFactory {preserveOriginal=1,
splitOnCaseChange=1, generateNumberParts=1, catenateWords=1,
generateWordParts=1, catenateAll=0, catenateNumbers=1}
term position     1              2             3
term text         LinguaJob.pl   Job           pl
                  Lingua         LinguaJobpl
term type         word           word          word
                  word           word
source start,end  0,12           6,9           10,12
                  0,6            0,12
payload

org.apache.solr.analysis.LowerCaseFilterFactory {}
term position     1              2             3
term text         linguajob.pl   job           pl
                  lingua         linguajobpl
term type         word           word          word
                  word           word
source start,end  0,12           6,9           10,12
                  0,6            0,12
payload

org.apache.solr.analysis.RemoveDuplicatesTokenFilterFactory {}
term position     1              2             3
term text         linguajob.pl   job           pl
                  lingua         linguajobpl
term type         word           word          word
                  word           word
source start,end  0,12           6,9           10,12
                  0,6            0,12
payload

*Query Analyzer

org.apache.solr.analysis.WhitespaceTokenizerFactory {}
term position 1
term text linguajob.pl
term type word
source start,end 0,12
payload

schema.UnicodeNormalizationFilterFactory {composed=false,
remove_modifiers=true, fold=true, version=java6, remove_diacritics=true}
term position 1
term text linguajob.pl
term type word
source start,end 0,12
payload

org.apache.solr.analysis.SynonymFilterFactory {synonyms=synonyms.txt,
expand=true, ignoreCase=true}
term position 1
term text linguajob.pl
term type word
source start,end 0,12
payload

org.apache.solr.analysis.StopFilterFactory {words=stopwords.txt,

Re: multi-core solr, specifying the data directory

2011-03-01 Thread Jonathan Rochkind
I did try that, yes. I tried that first in fact!  It seems to fall back 
to a ./data directory relative to the _main_ solr directory (the one 
above all the cores), not the core instanceDir, which is not what I 
expected either.


I wonder if this should be considered a bug? I wonder if anyone has 
considered this and thought of changing/fixing it?


On 3/1/2011 4:23 AM, Jan Høydahl wrote:

Have you tried removing the <dataDir> tag from solrconfig.xml? Then it should 
fall back to the default ./data relative to the core instanceDir.

--
Jan Høydahl, search solution architect
Cominvent AS - www.cominvent.com

On 1. mars 2011, at 00.00, Jonathan Rochkind wrote:


Unless I'm doing something wrong, in my experience in multi-core Solr in 1.4.1, 
you NEED to explicitly provide an absolute path to the 'data' dir.

I set up multi-core like this:







Now, setting instanceDir like that works for Solr to look for the 'conf' 
directory in the default location you'd expect, ./some_core/conf.

You'd expect it to look for the 'data' dir for an index in ./some_core/data 
too, by default.  But it does not seem to. It's still looking for the 'data' 
directory in the _main_ solr.home/data, not under the relevant core directory.

The only way I can manage to get it to look for the /data directory where I 
expect is to spell it out with a full absolute path:





And then in the solrconfig.xml do a <dataDir>${dataDir}</dataDir>

Is this what everyone else does too? Or am I missing a better way of doing this?  I would 
have thought it would "just work", with Solr by default looking for a ./data 
subdir of the specified instanceDir.  But it definitely doesn't seem to do that.

Should it? Anyone know if Solr in trunk past 1.4.1 has been changed to do what 
I expect? Or am I wrong to expect it? Or does everyone else do multi-core in 
some different way than me where this doesn't come up?

Jonathan





MLT with boost

2011-03-01 Thread Mark
Is it possible to add function queries/boosts to the results that are returned by 
MLT? If not out of the box, how would one go about achieving this 
functionality?


Thanks


Re: please make JSONWriter public

2011-03-01 Thread Ryan McKinley
You may have noticed the ResponseWriter code is pretty hairy!  Things
are package-protected so that the API can change between minor releases
without concern for back-compatibility.

In 4.0 (/trunk) I hope to rework the whole ResponseWriter framework so
that it is cleaner and hopefully stable enough that making parts of it
public is helpful.

For now, you can:
- copy the code
- put your class in the same package name
- make it public in your own distribution

ryan



On Mon, Feb 28, 2011 at 2:56 PM, Paul Libbrecht  wrote:
>
> Hello fellow SOLR experts,
>
> may I ask to make top-level and public the class
>    org.apache.solr.request.JSONWriter
> inside
>    org.apache.solr.request.JSONResponseWriter
> I am re-using it to output JSON search result to code that I wish not to 
> change on the client but the current visibility settings (JSONWriter is 
> package protected) makes it impossible for me without actually copying the 
> code (which is possible thanks to the good open-source nature).
>
> thanks in advance
>
> paul


Re: Sub entities

2011-03-01 Thread Brian Lamb
Yes, it looks like I had left off the field (misspelled it actually). I
reran the full import and the fields did properly show up. However, it is
still not working as expected. Using the example below, a result returned
would only list one specie instead of a list of species. I have the
following in my schema.xml file:

<field column="specie" name="specie" indexed="true" stored="true" required="false" />

I reran the full-import but it is still only listing one specie instead of
multiple. Is my above declaration incorrect?

On Tue, Mar 1, 2011 at 3:41 AM, Stefan Matheis <
matheis.ste...@googlemail.com> wrote:

> Brian,
>
> except for your sql-syntax error in the specie_relations-query "SELECT
> specie_id FROMspecie_relations .." (missing whitespace after FROM)
> your config looks okay.
>
> following questions:
> * is there a field named specie in your schema? (otherwise dih will
> silently ignore it)
> * did you check your mysql-query log? to see which queries were
> executed and what their result is?
>
> And, just as quick notice .. there is no need to use <field column="foo" name="foo"> (while both attributes have the same value).
>
> Regards
> Stefan
>
> On Mon, Feb 28, 2011 at 9:52 PM, Brian Lamb
>  wrote:
> > Hi all,
> >
> > I was able to get my dataimport to work correctly but I'm a little
> unclear
> > as to how the entity within an entity works in regards to search results.
> > When I do a search for all results, it seems only the outermost responses
> > are returned. For example, I have the following in my db config file:
> >
> > 
> >   > driver="com.mysql.jdbc.Driver"
> >
> url="jdbc:mysql://localhost/db?characterEncoding=UTF8&zeroDateTimeBehavior=convertToNull"
> > user="user" password="password"/>
> >
> >  
> >
> >
> >
> >
> >
> >
> >  
> >
> >  
> >
> >  
> >
> >  
> > 
> >
> > However, specie never shows up in my search results:
> >
> > 
> >  Mammal
> >  1
> >  Canis
> > 
> >
> > I had hoped the results would include the species. Can it? If so, what is
> my
> > malfunction?
> >
>


Re: Indexed, but cannot search

2011-03-01 Thread Brian Lamb
Thank you for your reply but the searching is still not working out. For
example, when I go to:

http://localhost:8983/solr/select/?q=*%3A*

I get the following as a response:


  
Mammal
1
Canis
  


(plus some other docs but one is enough for this example)

But if I go to 
http://localhost:8983/solr/select/?q=type%3AMammal

I only get:



But it seems that should return at least the result I have listed above.
What am I doing incorrectly?

On Mon, Feb 28, 2011 at 6:57 PM, Upayavira  wrote:

> q=dog is equivalent to q=text:dog (where the default search field is
> defined as text at the bottom of schema.xml).
>
> If you want to specify a different field, well, you need to tell it :-)
>
> Is that it?
>
> Upayavira
>
> On Mon, 28 Feb 2011 15:38 -0500, "Brian Lamb"
>  wrote:
> > Hi all,
> >
> > I was able to get my installation of Solr indexed using dataimport.
> > However,
> > I cannot seem to get search working. I can verify that the data is there
> > by
> > going to:
> >
> >
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
> >
> > This gives me the response:  > start="0">
> >
> > But when I go to
> >
> >
> http://localhost:8983/solr/select/?q=dog&version=2.2&start=0&rows=10&indent=on
> >
> > I get the response: 
> >
> > I know that dog should return some results because it is the first result
> > when I select all the records. So what am I doing incorrectly that would
> > prevent me from seeing results?
> >
> ---
> Enterprise Search Consultant at Sourcesense UK,
> Making Sense of Open Source
>
>


Re: Sub entities

2011-03-01 Thread Stefan Matheis
Brian,

On Tue, Mar 1, 2011 at 4:52 PM, Brian Lamb
 wrote:
> <field column="specie" name="specie" indexed="true" stored="true" required="false" />

Not sure, but iirc <field> in this context has no column-Attribute ..
that should normally not break your solr-configuration.

Are you sure, that your animal has multiple species assigned? Checked
the Query from the MySQL-Query-Log and verified that it returns more
than one record?

Otherwise you could enable
http://wiki.apache.org/solr/DataImportHandler#LogTransformer for your
dataimport, which outputs a log-row for every record .. just to
ensure, that your Query-Results is correctly imported

HTH, Regards
Stefan


Re: Indexed, but cannot search

2011-03-01 Thread Edoardo Tosca
Hi,
I'm not sure if it is a typo; anyway, the second query you mentioned should
be:
http://localhost:8983/solr/select/?q=type:*

HTH,

Edo

On Tue, Mar 1, 2011 at 4:06 PM, Brian Lamb wrote:

> Thank you for your reply but the searching is still not working out. For
> example, when I go to:
>
> http://localhost:8983/solr/select/?q=*%3A*
>
> I get the following as a response:
>
> 
>  
>Mammal
>1
>Canis
>  
> 
>
> (plus some other docs but one is enough for this example)
>
> But if I go to http://localhost:8983/solr/select/?q=type%3AMammal
>
> I only get:
>
> 
>
> But it seems that should return at least the result I have listed above.
> What am I doing incorrectly?
>
> On Mon, Feb 28, 2011 at 6:57 PM, Upayavira  wrote:
>
> > q=dog is equivalent to q=text:dog (where the default search field is
> > defined as text at the bottom of schema.xml).
> >
> > If you want to specify a different field, well, you need to tell it :-)
> >
> > Is that it?
> >
> > Upayavira
> >
> > On Mon, 28 Feb 2011 15:38 -0500, "Brian Lamb"
> >  wrote:
> > > Hi all,
> > >
> > > I was able to get my installation of Solr indexed using dataimport.
> > > However,
> > > I cannot seem to get search working. I can verify that the data is
> there
> > > by
> > > going to:
> > >
> > >
> >
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
> > >
> > > This gives me the response:  > > start="0">
> > >
> > > But when I go to
> > >
> > >
> >
> http://localhost:8983/solr/select/?q=dog&version=2.2&start=0&rows=10&indent=on
> > >
> > > I get the response: 
> > >
> > > I know that dog should return some results because it is the first
> result
> > > when I select all the records. So what am I doing incorrectly that
> would
> > > prevent me from seeing results?
> > >
> > ---
> > Enterprise Search Consultant at Sourcesense UK,
> > Making Sense of Open Source
> >
> >
>



-- 
Edoardo Tosca
Sourcesense - making sense of Open Source: http://www.sourcesense.com


Re: Indexed, but cannot search

2011-03-01 Thread Upayavira
Next question, do you have your "type" field set to indexed="true" in your
schema?

Upayavira

On Tue, 01 Mar 2011 11:06 -0500, "Brian Lamb"
 wrote:
> Thank you for your reply but the searching is still not working out. For
> example, when I go to:
> 
> http://localhost:8983/solr/select/?q=*%3A*
> 
> I get the following as a response:
> 
> 
>   
> Mammal
> 1
> Canis
>   
> 
> 
> (plus some other docs but one is enough for this example)
> 
> But if I go to
> http://localhost:8983/solr/select/?q=type%3AMammal
> 
> I only get:
> 
> 
> 
> But it seems that should return at least the result I have listed above.
> What am I doing incorrectly?
> 
> On Mon, Feb 28, 2011 at 6:57 PM, Upayavira  wrote:
> 
> > q=dog is equivalent to q=text:dog (where the default search field is
> > defined as text at the bottom of schema.xml).
> >
> > If you want to specify a different field, well, you need to tell it :-)
> >
> > Is that it?
> >
> > Upayavira
> >
> > On Mon, 28 Feb 2011 15:38 -0500, "Brian Lamb"
> >  wrote:
> > > Hi all,
> > >
> > > I was able to get my installation of Solr indexed using dataimport.
> > > However,
> > > I cannot seem to get search working. I can verify that the data is there
> > > by
> > > going to:
> > >
> > >
> > http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
> > >
> > > This gives me the response:  > > start="0">
> > >
> > > But when I go to
> > >
> > >
> > http://localhost:8983/solr/select/?q=dog&version=2.2&start=0&rows=10&indent=on
> > >
> > > I get the response: 
> > >
> > > I know that dog should return some results because it is the first result
> > > when I select all the records. So what am I doing incorrectly that would
> > > prevent me from seeing results?
> > >
> > ---
> > Enterprise Search Consultant at Sourcesense UK,
> > Making Sense of Open Source
> >
> >
> 
--- 
Enterprise Search Consultant at Sourcesense UK, 
Making Sense of Open Source



Re: Question on writing custom UpdateHandler

2011-03-01 Thread Chris Hostetter

In your first attempt, the crux of your problem was probably that you were 
never closing the searcher/reader.

: Or how can I perform a query on the current state of the index from within an
: UpdateProcessor?

If you implement UpdateRequestProcessorFactory, the getInstance method is 
given the SolrQueryRequest, which you can use to access the current 
SolrIndexSearcher.

this will only show you the state of the index as of the last commit, so 
it won't be real time as you are streaming new documents, but it will give 
you the same results as a search query happening concurrently with your 
update.


-Hoss
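A pseudocode sketch of what Hoss describes, Java-flavoured and modelled from memory on the Solr 1.4-era UpdateRequestProcessor API — check the class and method names against your Solr version, and treat this as a sketch rather than compilable code:

```java
// Sketch only: obtain the searcher from the request instead of opening your own.
public class LookupProcessorFactory extends UpdateRequestProcessorFactory {
  public UpdateRequestProcessor getInstance(SolrQueryRequest req,
      SolrQueryResponse rsp, UpdateRequestProcessor next) {
    // The request owns this searcher reference and releases it when the
    // request is closed -- no manual close needed, which avoids the
    // never-closed-searcher leak Hoss points out in the first attempt.
    SolrIndexSearcher searcher = req.getSearcher();
    // ... run a query against 'searcher' (index state as of the last commit),
    // then build and return a processor that wraps 'next' ...
    return next;
  }
}
```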


Re: multi-core solr, specifying the data directory

2011-03-01 Thread Chris Hostetter

: Unless I'm doing something wrong, in my experience in multi-core Solr in
: 1.4.1, you NEED to explicitly provide an absolute path to the 'data' dir.

have you looked at the example/multicore directory that was included in 
the 1.4.1 release?

it has a solr.xml that loads two cores w/o specifying a data dir in the 
solr.xml (or the solrconfig.xml) and it uses the "data" dir inside the 
specified instanceDir.

If that example works for you, but your own configs do not, then we'll 
need more details about your own configs -- how are you running solr, what 
does the solrconfig.xml of the core look like, etc...


-Hoss
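For readers without the release to hand, the shipped example/multicore/solr.xml is, reproduced from memory (so verify against the actual 1.4.1 download), roughly this — note that neither core sets a dataDir anywhere:

```xml
<solr persistent="false">
  <cores adminPath="/admin/cores">
    <core name="core0" instanceDir="core0" />
    <core name="core1" instanceDir="core1" />
  </cores>
</solr>
```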


Re: please make JSONWriter public

2011-03-01 Thread Paul Libbrecht
Ryan,

honestly, the hairiness was rather mild.
I found it fairly readable.

paul


On 1 March 2011 at 16:46, Ryan McKinley wrote:

> You may have noticed the ResponseWriter code is pretty hairy!  Things
> are package protected so that the API can change between minor release
> without concern for back compatibility.
> 
> In 4.0 (/trunk) I hope to rework the whole ResponseWriter framework so
> that it is more clean and hopefully stable enough that making parts
> public is helpful.
> 
> For now, you can:
> - copy the code
> - put your class in the same package name
> - make it public in your own distribution
> 
> ryan
> 
> 
> 
> On Mon, Feb 28, 2011 at 2:56 PM, Paul Libbrecht  wrote:
>> 
>> Hello fellow SOLR experts,
>> 
>> may I ask to make top-level and public the class
>>org.apache.solr.request.JSONWriter
>> inside
>>org.apache.solr.request.JSONResponseWriter
>> I am re-using it to output JSON search result to code that I wish not to 
>> change on the client but the current visibility settings (JSONWriter is 
>> package protected) makes it impossible for me without actually copying the 
>> code (which is possible thanks to the good open-source nature).
>> 
>> thanks in advance
>> 
>> paul



solr different sizes on master and slave

2011-03-01 Thread Mike Franon
I was curious why the size would be dramatically different even though
the index versions are the same?

One is 1.2 GB, and on the slave it is 512 MB

I would think they should both be the same size no?

Thanks


Re: Sub entities

2011-03-01 Thread Brian Lamb
Thanks for the help Stefan. It seems removing column="specie" fixed it.

On Tue, Mar 1, 2011 at 11:18 AM, Stefan Matheis <
matheis.ste...@googlemail.com> wrote:

> Brian,
>
> On Tue, Mar 1, 2011 at 4:52 PM, Brian Lamb
>  wrote:
> > <field column="specie" name="specie" indexed="true" stored="true" required="false" />
>
> Not sure, but iirc <field> in this context has no column-Attribute ..
> that should normally not break your solr-configuration.
>
> Are you sure, that your animal has multiple species assigned? Checked
> the Query from the MySQL-Query-Log and verified that it returns more
> than one record?
>
> Otherwise you could enable
> http://wiki.apache.org/solr/DataImportHandler#LogTransformer for your
> dataimport, which outputs a log-row for every record .. just to
> ensure, that your Query-Results is correctly imported
>
> HTH, Regards
> Stefan
>


Re: Indexed, but cannot search

2011-03-01 Thread Brian Lamb
Hi all,

The problem was that my fields were defined as type="string" instead of
type="text". Once I corrected that, it seems to be fixed. The only part that
still is not working though is the search across all fields.

For example:

http://localhost:8983/solr/select/?q=type%3AMammal

Now correctly returns the records matching mammal. But if I try to do a
global search across all fields:

http://localhost:8983/solr/select/?q=Mammal
http://localhost:8983/solr/select/?q=text%3AMammal

I get no results returned. Here is how the schema is set up:

<field name="text" ... multiValued="true"/>
<defaultSearchField>text</defaultSearchField>


Thanks to everyone for your help so far. I think this is the last hurdle I
have to jump over.
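Since the schema lines were stripped from the message above, here is a hedged sketch of the wiring that usually makes an all-fields search work — the field and type names are assumptions based on the thread, and a <copyField> into the default field is the piece most often missing in this situation:

```xml
<field name="type" type="text" indexed="true" stored="true"/>
<!-- catch-all field: searched, not stored -->
<field name="text" type="text" indexed="true" stored="false" multiValued="true"/>
<!-- without a copyField, nothing ever lands in "text" -->
<copyField source="type" dest="text"/>
<defaultSearchField>text</defaultSearchField>
```

After adding a copyField like this, a full reindex is needed before q=Mammal can match.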

On Tue, Mar 1, 2011 at 12:34 PM, Upayavira  wrote:

> Next question, do you have your "type" field set to index="true" in your
> schema?
>
> Upayavira
>
> On Tue, 01 Mar 2011 11:06 -0500, "Brian Lamb"
>  wrote:
> > Thank you for your reply but the searching is still not working out. For
> > example, when I go to:
> >
> > http://localhost:8983/solr/select/?q=*%3A*
> >
> > I get the following as a response:
> >
> > 
> >   
> > Mammal
> > 1
> > Canis
> >   
> > 
> >
> > (plus some other docs but one is enough for this example)
> >
> > But if I go to
> > http://localhost:8983/solr/select/?q=type%3AMammal
> >
> > I only get:
> >
> > 
> >
> > But it seems that should return at least the result I have listed above.
> > What am I doing incorrectly?
> >
> > On Mon, Feb 28, 2011 at 6:57 PM, Upayavira  wrote:
> >
> > > q=dog is equivalent to q=text:dog (where the default search field is
> > > defined as text at the bottom of schema.xml).
> > >
> > > If you want to specify a different field, well, you need to tell it :-)
> > >
> > > Is that it?
> > >
> > > Upayavira
> > >
> > > On Mon, 28 Feb 2011 15:38 -0500, "Brian Lamb"
> > >  wrote:
> > > > Hi all,
> > > >
> > > > I was able to get my installation of Solr indexed using dataimport.
> > > > However,
> > > > I cannot seem to get search working. I can verify that the data is
> there
> > > > by
> > > > going to:
> > > >
> > > >
> > >
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
> > > >
> > > > This gives me the response:  > > > start="0">
> > > >
> > > > But when I go to
> > > >
> > > >
> > >
> http://localhost:8983/solr/select/?q=dog&version=2.2&start=0&rows=10&indent=on
> > > >
> > > > I get the response: 
> > > >
> > > > I know that dog should return some results because it is the first
> result
> > > > when I select all the records. So what am I doing incorrectly that
> would
> > > > prevent me from seeing results?
> > > >
> > > ---
> > > Enterprise Search Consultant at Sourcesense UK,
> > > Making Sense of Open Source
> > >
> > >
> >
> ---
> Enterprise Search Consultant at Sourcesense UK,
> Making Sense of Open Source
>
>


Re: Indexed, but cannot search

2011-03-01 Thread Markus Jelsma
Traditionally, people forget to reindex ;)

> Hi all,
> 
> The problem was that my fields were defined as type="string" instead of
> type="text". Once I corrected that, it seems to be fixed. The only part
> that still is not working though is the search across all fields.
> 
> For example:
> 
> http://localhost:8983/solr/select/?q=type%3AMammal
> 
> Now correctly returns the records matching mammal. But if I try to do a
> global search across all fields:
> 
> http://localhost:8983/solr/select/?q=Mammal
> http://localhost:8983/solr/select/?q=text%3AMammal
> 
> I get no results returned. Here is how the schema is set up:
> 
> <field name="text" ... multiValued="true"/>
> <defaultSearchField>text</defaultSearchField>
> 
> 
> Thanks to everyone for your help so far. I think this is the last hurdle I
> have to jump over.
> 
> On Tue, Mar 1, 2011 at 12:34 PM, Upayavira  wrote:
> > Next question, do you have your "type" field set to index="true" in your
> > schema?
> > 
> > Upayavira
> > 
> > On Tue, 01 Mar 2011 11:06 -0500, "Brian Lamb"
> > 
> >  wrote:
> > > Thank you for your reply but the searching is still not working out.
> > > For example, when I go to:
> > > 
> > > http://localhost:8983/solr/select/?q=*%3A*
> > > I get the following as a response:
> > > 
> > > 
> > > 
> > >   
> > >   
> > > Mammal
> > > 1
> > > Canis
> > >   
> > >   
> > > 
> > > 
> > > 
> > > (plus some other docs but one is enough for this example)
> > > 
> > > But if I go to
> > > http://localhost:8983/solr/select/?q=type%3AMammal
> > > 
> > > I only get:
> > > 
> > > 
> > > 
> > > But it seems that should return at least the result I have listed
> > > above. What am I doing incorrectly?
> > > 
> > > On Mon, Feb 28, 2011 at 6:57 PM, Upayavira  wrote:
> > > > q=dog is equivalent to q=text:dog (where the default search field is
> > > > defined as text at the bottom of schema.xml).
> > > > 
> > > > If you want to specify a different field, well, you need to tell it
> > > > :-)
> > > > 
> > > > Is that it?
> > > > 
> > > > Upayavira
> > > > 
> > > > On Mon, 28 Feb 2011 15:38 -0500, "Brian Lamb"
> > > > 
> > > >  wrote:
> > > > > Hi all,
> > > > > 
> > > > > I was able to get my installation of Solr indexed using dataimport.
> > > > > However,
> > > > > I cannot seem to get search working. I can verify that the data is
> > 
> > there
> > 
> > > > > by
> > 
> > > > > going to:
> > http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&indent=on
> > 
> > > > > This gives me the response:  > > > > numFound="234961" start="0">
> > > > > 
> > > > > But when I go to
> > 
> > http://localhost:8983/solr/select/?q=dog&version=2.2&start=0&rows=10&indent=on
> > 
> > > > > I get the response: 
> > > > > 
> > > > > I know that dog should return some results because it is the first
> > 
> > result
> > 
> > > > > when I select all the records. So what am I doing incorrectly that
> > 
> > would
> > 
> > > > > prevent me from seeing results?
> > > > 
> > > > ---
> > > > Enterprise Search Consultant at Sourcesense UK,
> > > > Making Sense of Open Source
> > 
> > ---
> > Enterprise Search Consultant at Sourcesense UK,
> > Making Sense of Open Source


Re: solr different sizes on master and slave

2011-03-01 Thread Markus Jelsma
Are there pending commits on the master?

> I was curious why would the size be dramatically different even though
> the index versions are the same?
> 
> One is 1.2 Gb, and on the slave it is 512 MB
> 
> I would think they should both be the same size no?
> 
> Thanks


Re: Indexed, but cannot search

2011-03-01 Thread Brian Lamb
Oh if only it were that easy :-). I have reindexed since making that change
which is how I was able to get the regular search working. I have not
however been able to get the search across all fields to work.

On Tue, Mar 1, 2011 at 3:01 PM, Markus Jelsma wrote:

> Traditionally, people forget to reindex ;)
>
> > Hi all,
> >
> > The problem was that my fields were defined as type="string" instead of
> > type="text". Once I corrected that, it seems to be fixed. The only part
> > that still is not working though is the search across all fields.
> >
> > For example:
> >
> > http://localhost:8983/solr/select/?q=type%3AMammal
> >
> > Now correctly returns the records matching mammal. But if I try to do a
> > global search across all fields:
> >
> > http://localhost:8983/solr/select/?q=Mammal
> > http://localhost:8983/solr/select/?q=text%3AMammal
> >
> > I get no results returned. Here is how the schema is set up:
> >
> >  > multiValued="true"/>
> > text
> > 
> >
> > Thanks to everyone for your help so far. I think this is the last hurdle
> I
> > have to jump over.
> >
> > On Tue, Mar 1, 2011 at 12:34 PM, Upayavira  wrote:
> > > Next question, do you have your "type" field set to index="true" in
> your
> > > schema?
> > >
> > > Upayavira
> > >
> > > On Tue, 01 Mar 2011 11:06 -0500, "Brian Lamb"
> > >
> > >  wrote:
> > > > Thank you for your reply but the searching is still not working out.
> > > > For example, when I go to:
> > > >
> > > > http://localhost:8983/solr/select/?q=*%3A*<
> > >
> > >
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&in
> > > dent=on
> > >
> > > > I get the following as a response:
> > > >
> > > > 
> > > >
> > > >   
> > > >
> > > > Mammal
> > > > 1
> > > > Canis
> > > >
> > > >   
> > > >
> > > > 
> > > >
> > > > (plus some other docs but one is enough for this example)
> > > >
> > > > But if I go to
> > > > http://localhost:8983/solr/select/?q=type%3A<
> > >
> > >
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&in
> > > dent=on
> > >
> > > > Mammal
> > > >
> > > > I only get:
> > > >
> > > > 
> > > >
> > > > But it seems that should return at least the result I have listed
> > > > above. What am I doing incorrectly?
> > > >
> > > > On Mon, Feb 28, 2011 at 6:57 PM, Upayavira  wrote:
> > > > > q=dog is equivalent to q=text:dog (where the default search field
> is
> > > > > defined as text at the bottom of schema.xml).
> > > > >
> > > > > If you want to specify a different field, well, you need to tell it
> > > > > :-)
> > > > >
> > > > > Is that it?
> > > > >
> > > > > Upayavira
> > > > >
> > > > > On Mon, 28 Feb 2011 15:38 -0500, "Brian Lamb"
> > > > >
> > > > >  wrote:
> > > > > > Hi all,
> > > > > >
> > > > > > I was able to get my installation of Solr indexed using
> dataimport.
> > > > > > However,
> > > > > > I cannot seem to get search working. I can verify that the data
> is
> > >
> > > there
> > >
> > > > > > by
> > >
> > > > > > going to:
> > >
> http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&in
> > > dent=on
> > >
> > > > > > This gives me the response:  > > > > > numFound="234961" start="0">
> > > > > >
> > > > > > But when I go to
> > >
> > >
> http://localhost:8983/solr/select/?q=dog&version=2.2&start=0&rows=10&inde
> > > nt=on
> > >
> > > > > > I get the response:  start="0">
> > > > > >
> > > > > > I know that dog should return some results because it is the
> first
> > >
> > > result
> > >
> > > > > > when I select all the records. So what am I doing incorrectly
> that
> > >
> > > would
> > >
> > > > > > prevent me from seeing results?
> > > > >
> > > > > ---
> > > > > Enterprise Search Consultant at Sourcesense UK,
> > > > > Making Sense of Open Source
> > >
> > > ---
> > > Enterprise Search Consultant at Sourcesense UK,
> > > Making Sense of Open Source
>


Re: solr different sizes on master and slave

2011-03-01 Thread Mike Franon
No pending commits. It looks like there are almost two copies
of the index on the master; not sure how that happened.



On Tue, Mar 1, 2011 at 3:08 PM, Markus Jelsma
 wrote:
> Are there pending commits on the master?
>
>> I was curious why would the size be dramatically different even though
>> the index versions are the same?
>>
>> One is 1.2 Gb, and on the slave it is 512 MB
>>
>> I would think they should both be the same size no?
>>
>> Thanks
>


numberic or string type for non-sortable field?

2011-03-01 Thread cyang2010
I wonder if i shall use solr int or string for such field with following
requirement

multi-value
facet needed
sort not needed


The field value is an id.  Therefore, i can store it as either a numeric field
or just a string.   Shall i choose string for efficiency?

Thanks.

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/numberic-or-string-type-for-non-sortable-field-tp2606353p2606353.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: numberic or string type for non-sortable field?

2011-03-01 Thread Ahmet Arslan
> I wonder if i shall use solr int or
> string for such field with following
> requirement
> 
> multi-value
> facet needed
> sort not needed
> 
> 
> The field value is a an id.  Therefore, i can store as
> either numeric field
> or just a string.   Shall i choose string
> for efficiency?

Trie based integer (tint) is preferred for faster faceting.
 





Re: Query on multivalue field

2011-03-01 Thread Scott Yeadon

Thanks, but just to confirm the way multiValued fields work:

In a multiValued field, call it field1, if I have two values indexed to 
this field, say value 1 = "some text...termA...more text" and value 2 = 
"some text...termB...more text" and do a search such as field1:(termA termB)
(where the default operator is AND) I'm getting a hit 
returned even though both terms don't occur within a single value in the 
multiValued field.


What I'm wondering is if there is a way of applying the query against 
each value of the field rather than against the field in its entirety. 
The reason being is the number of values I want to store is variable and 
I'd like to avoid the use of dynamic fields or restructuring the index 
if possible.


Scott.

On 2/03/11 12:35 AM, Steven A Rowe wrote:

Hi Scott,

Querying against a multi-valued field just works - no special incantation 
required.

Steve


-Original Message-
From: Scott Yeadon [mailto:scott.yea...@anu.edu.au]
Sent: Monday, February 28, 2011 11:50 PM
To:solr-user@lucene.apache.org
Subject: Query on multivalue field

Hi,

I have a variable number of text-based fields associated with each
primary record which I wanted to apply a search across. I wanted to
avoid the use of dynamic fields if possible or having to create a
different document type in the index (as the app is based around the
primary record and different views mean a lot of work to revamp
pagination etc).

So, is there a way to apply a query to each value of a multivalued field
or is it always treated as a "single" field from a query perspective?

Thanks.

Scott.




Re: solr different sizes on master and slave

2011-03-01 Thread Mike Franon
Ok, doing some more research I noticed that the slave has multiple
index folders, for example:

index
index.20110204010900
index.20110204013355
index.20110218125400

and then there is an index.properties that shows which index it is using.

I am just curious why does it keep multiple copies?  Is there a
setting somewhere I can change to only keep one copy so as not to waste
space?

Thanks

On Tue, Mar 1, 2011 at 3:26 PM, Mike Franon  wrote:
> No pending commits, what it looks like is there are almost two copies
> of the index on the master, not sure how that happened.
>
>
>
> On Tue, Mar 1, 2011 at 3:08 PM, Markus Jelsma
>  wrote:
>> Are there pending commits on the master?
>>
>>> I was curious why would the size be dramatically different even though
>>> the index versions are the same?
>>>
>>> One is 1.2 Gb, and on the slave it is 512 MB
>>>
>>> I would think they should both be the same size no?
>>>
>>> Thanks
>>
>


Re: numberic or string type for non-sortable field?

2011-03-01 Thread Chris Hostetter

: > The field value is a an id.  Therefore, i can store as
: > either numeric field
: > or just a string.   Shall i choose string
: > for efficiency?
: 
: Trie based integer (tint) is preferred for faster faceting.

range faceting/filtering yes -- not for "field" faceting which is what i 
think he's asking about.

in that case int would still probably be more efficient, but you don't want 
precision steps (that will introduce added terms)

-Hoss

Re: multi-core solr, specifying the data directory

2011-03-01 Thread Jonathan Rochkind
Hmm, okay, have to try to find time to install the example/multicore and 
see.


It's definitely never worked for me, weird.

Thanks.

On 3/1/2011 2:38 PM, Chris Hostetter wrote:

: Unless I'm doing something wrong, in my experience in multi-core Solr in
: 1.4.1, you NEED to explicitly provide an absolute path to the 'data' dir.

have you looked at the example/multicore directory that was included in
the 1.4.1 release?

it has a solr.xml that loads two cores w/o specifying a data dir in the
solr.xml (or the solrconfig.xml) and it uses the "data" dir inside the
specified instanceDir.

If that example works for you, but your own configs do not, then we'll
need more details about your own configs -- how are you running solr, what
does the solrconfig.xml of the core look like, etc...


-Hoss



Re: solr different sizes on master and slave

2011-03-01 Thread Jonathan Rochkind
The slave should not keep multiple copies _permanently_, but it might 
temporarily after it has fetched the new files from the master, but before 
it has committed them and fully warmed the new index searchers on the 
slave.  Could that be what's going on -- is your slave just still working 
on committing and warming the new version(s) of the index?


[If you do 'commit' to the slave (and a replication pull counts as a 
'commit') so quickly that you get overlapping commits before the slave was 
able to warm a new index... it's going to be trouble all around.]


On 3/1/2011 4:27 PM, Mike Franon wrote:

ok doing some more research I noticed, on the slave it has multiple
folders where it keeps them for example

index
index.20110204010900
index.20110204013355
index.20110218125400

and then there is an index.properties that shows which index it is using.

I am just curious why does it keep multiple copies?  Is there a
setting somewhere I can change to only keep one copy so not to lose
space?

Thanks

On Tue, Mar 1, 2011 at 3:26 PM, Mike Franon  wrote:

No pending commits, what it looks like is there are almost two copies
of the index on the master, not sure how that happened.



On Tue, Mar 1, 2011 at 3:08 PM, Markus Jelsma
  wrote:

Are there pending commits on the master?


I was curious why would the size be dramatically different even though
the index versions are the same?

One is 1.2 Gb, and on the slave it is 512 MB

I would think they should both be the same size no?

Thanks


Re: numberic or string type for non-sortable field?

2011-03-01 Thread cyang2010
Sorry i didn't make my question clear.

I will only facet on the field value, not do range queries (it is just some
ids in a multi-value field).  And i won't sort on the field either.

In that case, is string more efficient for the requirement?

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/numberic-or-string-type-for-non-sortable-field-tp2606353p2606762.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Query on multivalue field

2011-03-01 Thread Ahmet Arslan
> In a multiValued field, call it field1, if I have two
> values indexed to 
> this field, say value 1 = "some text...termA...more text"
> and value 2 = 
> "some text...termB...more text" and do a search such as
> field1:(termA termB)
> (where ) I'm
> getting a hit 
> returned even though both terms don't occur within a single
> value in the 
> multiValued field.
> 
> What I'm wondering is if there is a way of applying the
> query against 
> each value of the field rather than against the field in
> its entirety. 
> The reason being is the number of values I want to store is
> variable and 
> I'd like to avoid the use of dynamic fields or
> restructuring the index 
> if possible.

Your best bet is to use positionIncrementGap and issue a phrase query 
(implicit AND) with the appropriate slop value. 

If you have positionIncrementGap="100", you can simulate this using
&q=field1:"termA termB"~100

http://search-lucene.com/m/Hbdvz1og7D71/


  


Re: numberic or string type for non-sortable field?

2011-03-01 Thread Ahmet Arslan
> I will only facet based on field value, not ranged
> query  (it is just some
> ids for a  multi-value field).   And i
> won't do sort on the field either.
> 
> In that case, is string more efficient for the
> requirement?

Hoss was saying to use,  





Searching all terms - SolrJ

2011-03-01 Thread openvictor Open
Dear all,

First I am sorry if this question has already been asked ( I am sure it
was...) but I can't find the right option with solrj.

I want to query only documents that contains ALL query terms.
Let me take an example, I have 4 documents that are simple sequences  ( they
have only one field : text ):

1 : The cat is on the roof
2 : The dog is on the roof
3 : The cat is black
4 : the cat is black and on the roof

if I search "cat roof" I will have doc 1,2,3,4
In my case I would like to have only : doc 1 and doc 4 (either cat or roof
don't appear in doc 2 and 3).

Is there a simple way to do that automatically with SolrJ, or should I do
something like:
text:cat AND text:roof ?

Thank you very much for your help !

Best regards,
Victor


Re: Searching all terms - SolrJ

2011-03-01 Thread Ahmet Arslan

--- On Wed, 3/2/11, openvictor Open  wrote:

> From: openvictor Open 
> Subject: Searching all terms - SolrJ
> To: solr-user@lucene.apache.org
> Date: Wednesday, March 2, 2011, 12:20 AM
> Dear all,
> 
> First I am sorry if this question has already been asked (
> I am sure it
> was...) but I can't find the right option with solrj.
> 
> I want to query only documents that contains ALL query
> terms.
> Let me take an example, I have 4 documents that are simple
> sequences  ( they
> have only one field : text ):
> 
> 1 : The cat is on the roof
> 2 : The dog is on the roof
> 3 : The cat is black
> 4 : the cat is black and on the roof
> 
> if I search "cat roof" I will have doc 1,2,3,4
> In my case I would like to have only : doc 1 and doc 4
> (either cat or roof
> don't appear in doc 2 and 3).
> 
> Is there a simple way to do that automatically with SolrJ
> or should I do
> something like :
> text:cat AND text:roof ?
> 
> Thank you very much for your help !

You can use  in your schema.xml





Re: Query on multivalue field

2011-03-01 Thread Scott Yeadon
The only trick with this is ensuring the searches return the right 
results and don't go across value boundaries. If I set the gap to the 
largest text size we expect (approx 5000 chars), what impact does such a 
large value have (i.e. does Solr physically separate these fragments in 
the index, or just apply the figure as part of any query)?


Scott.

On 2/03/11 9:01 AM, Ahmet Arslan wrote:

In a multiValued field, call it field1, if I have two
values indexed to
this field, say value 1 = "some text...termA...more text"
and value 2 =
"some text...termB...more text" and do a search such as
field1:(termA termB)
(where) I'm
getting a hit
returned even though both terms don't occur within a single
value in the
multiValued field.

What I'm wondering is if there is a way of applying the
query against
each value of the field rather than against the field in
its entirety.
The reason being is the number of values I want to store is
variable and
I'd like to avoid the use of dynamic fields or
restructuring the index
if possible.

Your best bet can be using positionIncrementGap and to issue a phrase query 
(implicit AND) with the appropriate slop value.

If you have positionIncrementGap="100", you can simulate this by using
&q=field1:"termA termB"~100

http://search-lucene.com/m/Hbdvz1og7D71/








Re: Distances in spatial search (Solr 4.0)

2011-03-01 Thread Alexandre Rocco
Hi Bill,

I was using a different approach to sort by the distance with the dist()
function, since geodist() is not documented on the wiki (
http://wiki.apache.org/solr/FunctionQuery)

Tried something like:
&sort=dist(2, 45.15,-93.85, lat, lng) asc

I made some tests with geodist() function as you pointed and got different
results.
Is it safe to assume that geodist() is the correct way of doing it?

Also, can you clear up how I can see the distance using the "_Val_" as you
told?

Thanks!
Alexandre

On Tue, Mar 1, 2011 at 12:03 AM, Bill Bell  wrote:

> Use sort with geodist() to sort by distance.
>
> Getting the distance returned is documented on the wiki if you are not
> using score. See the reference to _Val_
>
> Bill Bell
> Sent from mobile
>
>
> On Feb 28, 2011, at 7:54 AM, Alexandre Rocco  wrote:
>
> > Hi guys,
> >
> > We are implementing a separate index on our website, that will be
> dedicated
> > to spatial search.
> > I've downloaded a build of Solr 4.0 to try the spatial features and got
> the
> > geodist working really fast.
> >
> > We now have 2 other features that will be needed on this project:
> > 1. Returning the distance from the reference point to the search hit (in
> > kilometers)
> > 2. Sorting by the distance.
> >
> > On item 2, the wiki doc points that a distance function can be used but I
> > was not able to find good info on how to accomplish it.
> > Also, returning the distance (item 1) is noted as currently being in
> > development and there is some workaround to get it.
> >
> > Anyone had experience with the spatial feature and could help with some
> > pointers on how to achieve it?
> >
> > Thanks,
> > Alexandre
>


Re: multi-core solr, specifying the data directory

2011-03-01 Thread Michael Sokolov
I tried this in my 1.4.0 installation (commenting out what had been 
working, hoping the default would be as you said works in the example):













In the log after starting up, I get these messages (among many others):

...

Mar 1, 2011 7:51:23 PM org.apache.solr.core.CoreContainer$Initializer 
initialize

INFO: looking for solr.xml: /usr/local/tomcat/solr/solr.xml
Mar 1, 2011 7:51:23 PM org.apache.solr.core.SolrResourceLoader 
locateSolrHome

INFO: No /solr/home in JNDI
Mar 1, 2011 7:51:23 PM org.apache.solr.core.SolrResourceLoader 
locateSolrHome
INFO: solr home defaulted to 'solr/' (could not find system property or 
JNDI)

Mar 1, 2011 7:51:23 PM org.apache.solr.core.SolrResourceLoader 
INFO: Solr home set to 'solr/'

Mar 1, 2011 7:51:23 PM org.apache.solr.core.SolrResourceLoader 
INFO: Solr home set to 'solr/bpro/'
...
Mar 1, 2011 7:51:24 PM org.apache.solr.core.SolrCore 
INFO: [bpro] Opening new SolrCore at solr/bpro/, dataDir=./solr/data/
...
Mar 1, 2011 7:51:25 PM org.apache.solr.core.SolrResourceLoader 
INFO: Solr home set to 'solr/pfapp/'
...
Mar 1, 2011 7:51:26 PM org.apache.solr.core.SolrCore 
INFO: [pfapp] Opening new SolrCore at solr/pfapp/, dataDir=solr/pfapp/data/

and it's pretty clearly using the "wrong" directory at that point.

Some more details:

/usr/local/tomcat has the usual tomcat distribution (this is 6.0.29)
conf/server.xml has:


rosen
rosen.ifactory.com




There is a solrconfig.xml in each of the core directories (should there 
only be one of these?).  I believe these are pretty generic (and they 
are identical); the one in the bpro folder has:



${solr.data.dir:./solr/data}



-Mike

On 3/1/2011 4:38 PM, Jonathan Rochkind wrote:
Hmm, okay, have to try to find time to install the example/multicore 
and see.


It's definitely never worked for me, weird.

Thanks.

On 3/1/2011 2:38 PM, Chris Hostetter wrote:
: Unless I'm doing something wrong, in my experience in multi-core 
Solr in
: 1.4.1, you NEED to explicitly provide an absolute path to the 
'data' dir.


have you looked at the example/multicore directory that was included in
the 1.4.1 release?

it has a solr.xml that loads two cores w/o specifying a data dir in the
solr.xml (or the solrconfig.xml) and it uses the "data" dir inside the
specified instanceDir.

If that example works for you, but your own configs do not, then we'll
need more details about your own configs -- how are you running solr, 
what

does the solrconfig.xml of the core look like, etc...


-Hoss





Re: Query on multivalue field

2011-03-01 Thread Jonathan Rochkind
Each token has a position set on it. So if you index the value "alpha 
beta gamma", it winds up stored in Solr as (sort of, for the way we want 
to look at it)


document1:
alpha:position 1
beta:position 2
gamma: position 3

 If you set the position increment gap large, then after one value in a 
multi-valued field ends, the position increment gap will be added to the 
positions for the next value. Solr doesn't actually internally have much 
of any idea of a multi-valued field, ALL a multi-valued indexed field 
is, is a position increment gap separating tokens from different 'values'.


So index in a multi-valued field, with position increment gap 10000,  
the values:  ["alpha beta gamma", "aleph bet"], you get kind of like:


document1:
alpha: 1
beta: 2
gamma: 3
aleph: 10004
bet: 10005

A large position increment gap, as far as I know and can tell (please 
someone correct me if I'm wrong, I am not a Solr developer) has no 
effect on the size or efficiency of your index on disk.


I am not sure why positionIncrementGap doesn't just default to a very 
large number, to provide behavior that more matches what people expect 
from the idea of a "multi-valued field". So maybe there is some flaw in 
my understanding, that justifies some reason for it not to be this way?


But I set my positionIncrementGap very large, and haven't seen any issues.


On 3/1/2011 5:46 PM, Scott Yeadon wrote:

The only trick with this is ensuring the searches return the right
results and don't go across value boundaries. If I set the gap to the
largest text size we expect (approx 5000 chars) what impact does such a
large value have (i.e. does Solr physically separate these fragments in
the index or just apply the figure as part of any query?

Scott.

On 2/03/11 9:01 AM, Ahmet Arslan wrote:

In a multiValued field, call it field1, if I have two
values indexed to
this field, say value 1 = "some text...termA...more text"
and value 2 =
"some text...termB...more text" and do a search such as
field1:(termA termB)
(where) I'm
getting a hit
returned even though both terms don't occur within a single
value in the
multiValued field.

What I'm wondering is if there is a way of applying the
query against
each value of the field rather than against the field in
its entirety.
The reason being is the number of values I want to store is
variable and
I'd like to avoid the use of dynamic fields or
restructuring the index
if possible.

Your best bet can be using positionIncrementGap and to issue a phrase query 
(implicit AND) with the appropriate slop value.

If you have positionIncrementGap="100", you can simulate this by using
&q=field1:"termA termB"~100

http://search-lucene.com/m/Hbdvz1og7D71/
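
The position model described above is easy to sanity-check with a toy simulation (Python, not Solr/Lucene code — it just reproduces the arithmetic of token positions and the gap between values):

```python
def assign_positions(values, gap):
    """Toy model of position assignment in a multiValued field: every
    token advances the position by 1, and the positionIncrementGap is
    added between consecutive values."""
    positions = {}
    pos = 0
    for i, value in enumerate(values):
        if i > 0:
            pos += gap  # the gap separates one value's tokens from the next
        for token in value.split():
            pos += 1
            positions[token] = pos
    return positions

print(assign_positions(["alpha beta gamma", "aleph bet"], gap=10000))
# {'alpha': 1, 'beta': 2, 'gamma': 3, 'aleph': 10004, 'bet': 10005}
```

A phrase query with a slop smaller than the gap (e.g. ~100 against a gap of 10000) can then never match across a value boundary, which is exactly the trick in the quoted advice.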








Re: multi-core solr, specifying the data directory

2011-03-01 Thread Jonathan Rochkind
This definitely matches my own experience, and I've heard it from 
others. I haven't heard of anyone who HAS gotten it to work like that.  
But apparently the multi-core example distributed with Solr is claimed to 
work in a way it doesn't for us.


One of us has to try the Solr distro multi-core example, as Hoss 
suggested/asked, to see if the problem shows up even there, and if not, 
figure out what the difference is.  Sorry, haven't found time to figure 
out how to install and start up the demo.


I am running in Tomcat, I wonder if container could matter, and maybe it 
somehow works in Jetty or something?


Jonathan


On 3/1/2011 7:05 PM, Michael Sokolov wrote:

I tried this in my 1.4.0 installation (commenting out what had been
working, hoping the default would be as you said works in the example):












In the log after starting up, I get these messages (among many others):

...

Mar 1, 2011 7:51:23 PM org.apache.solr.core.CoreContainer$Initializer
initialize
INFO: looking for solr.xml: /usr/local/tomcat/solr/solr.xml
Mar 1, 2011 7:51:23 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: No /solr/home in JNDI
Mar 1, 2011 7:51:23 PM org.apache.solr.core.SolrResourceLoader
locateSolrHome
INFO: solr home defaulted to 'solr/' (could not find system property or
JNDI)
Mar 1, 2011 7:51:23 PM org.apache.solr.core.SolrResourceLoader
INFO: Solr home set to 'solr/'

Mar 1, 2011 7:51:23 PM org.apache.solr.core.SolrResourceLoader
INFO: Solr home set to 'solr/bpro/'
...
Mar 1, 2011 7:51:24 PM org.apache.solr.core.SolrCore
INFO: [bpro] Opening new SolrCore at solr/bpro/, dataDir=./solr/data/
...
Mar 1, 2011 7:51:25 PM org.apache.solr.core.SolrResourceLoader
INFO: Solr home set to 'solr/pfapp/'
...
Mar 1, 2011 7:51:26 PM org.apache.solr.core.SolrCore
INFO: [pfapp] Opening new SolrCore at solr/pfapp/, dataDir=solr/pfapp/data/

and it's pretty clearly using the "wrong" directory at that point.

Some more details:

/usr/local/tomcat has the usual tomcat distribution (this is 6.0.29)
conf/server.xml has:


rosen
rosen.ifactory.com




There is a solrconfig.xml in each of the core directories (should there
only be one of these?).  I believe these are pretty generic (and they
are identical); the one in the bpro folder has:


${solr.data.dir:./solr/data}



-Mike

On 3/1/2011 4:38 PM, Jonathan Rochkind wrote:

Hmm, okay, have to try to find time to install the example/multicore
and see.

It's definitely never worked for me, weird.

Thanks.

On 3/1/2011 2:38 PM, Chris Hostetter wrote:

: Unless I'm doing something wrong, in my experience in multi-core
Solr in
: 1.4.1, you NEED to explicitly provide an absolute path to the
'data' dir.

have you looked at the example/multicore directory that was included in
the 1.4.1 release?

it has a solr.xml that loads two cores w/o specifying a data dir in the
solr.xml (or hte solrconfig.xml) and it uses the "data" dir inside the
specified instanceDir.

If that example works for you, but your own configs do not, then we'll
need more details about your own configs -- how are you running solr,
what
does the solrconfig.xml of the core look like, etc...


-Hoss





[ANNOUNCE] Web Crawler

2011-03-01 Thread Dominique Bejean

Hi,

I would like to announce Crawl Anywhere. Crawl-Anywhere is a Java Web 
Crawler. It includes :


   * a crawler
   * a document processing pipeline
   * a solr indexer

The crawler has a web administration interface in order to manage the web 
sites to be crawled. Each web site crawl is configured with a lot of possible 
parameters (not all mandatory):


   * number of simultaneous items crawled by site
   * recrawl period rules based on item type (html, PDF, …)
   * item type inclusion / exclusion rules
   * item path inclusion / exclusion / strategy rules
   * max depth
   * web site authentication
   * language
   * country
   * tags
   * collections
   * ...

The pipeline includes various ready-to-use stages (text extraction, 
language detection, a Solr ready-to-index XML writer, ...).


All is very configurable and extendible either by scripting or java coding.

With scripting technology, you can help the crawler handle javascript 
links, or help the pipeline extract relevant titles and clean up the 
html pages (remove menus, headers, footers, ...)


With java coding, you can develop your own pipeline stages

The Crawl Anywhere web site provides good explanations and screen shots. 
All is documented in a wiki.


The current version is 1.1.4. You can download and try it out from here: 
www.crawl-anywhere.com



Regards

Dominique



Re: Searching all terms - SolrJ

2011-03-01 Thread openvictor Open
Yes but I want to leave the choice to the user.

He can either search all the terms or just some.

Is there any more flexible solution ? Even if I have to code it by hand ?



2011/3/1 Ahmet Arslan 

>
> --- On Wed, 3/2/11, openvictor Open  wrote:
>
> > From: openvictor Open 
> > Subject: Searching all terms - SolrJ
> > To: solr-user@lucene.apache.org
> > Date: Wednesday, March 2, 2011, 12:20 AM
> > Dear all,
> >
> > First I am sorry if this question has already been asked (
> > I am sure it
> > was...) but I can't find the right option with solrj.
> >
> > I want to query only documents that contains ALL query
> > terms.
> > Let me take an example, I have 4 documents that are simple
> > sequences  ( they
> > have only one field : text ):
> >
> > 1 : The cat is on the roof
> > 2 : The dog is on the roof
> > 3 : The cat is black
> > 4 : the cat is black and on the roof
> >
> > if I search "cat roof" I will have doc 1,2,3,4
> > In my case I would like to have only : doc 1 and doc 4
> > (either cat or roof
> > don't appear in doc 2 and 3).
> >
> > Is there a simple way to do that automatically with SolrJ
> > or should I do
> > something like :
> > text:cat AND text:roof ?
> >
> > Thank you very much for your help !
>
> You can use  in your schema.xml
>
>
>
>
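
Since you mention coding it by hand: one way to leave the choice to the user is to build the query string client-side before handing it to SolrJ. A minimal sketch (plain string joining on the `text` field from the thread; a real version would also need to escape Lucene special characters in the terms):

```python
def build_query(field, terms, match_all):
    """Join user-entered terms into a Lucene query string: AND for
    'all terms must match', OR for 'any term may match'."""
    op = " AND " if match_all else " OR "
    return op.join("%s:%s" % (field, term) for term in terms)

print(build_query("text", ["cat", "roof"], match_all=True))   # text:cat AND text:roof
print(build_query("text", ["cat", "roof"], match_all=False))  # text:cat OR text:roof
```

The resulting string can be passed as the q parameter; the same effect can also be had per-request with the q.op parameter rather than a schema-wide default operator.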


Re: Query on multivalue field

2011-03-01 Thread Scott Yeadon
Tested it out and seems to work well as long as I set the gap to a value 
much longer than the text - 10000 appears to work fine for our current 
data. Thanks heaps for all the help guys!


Scott.

On 2/03/11 11:13 AM, Jonathan Rochkind wrote:
Each token has a position set on it. So if you index the value "alpha 
beta gamma", it winds up stored in Solr as (sort of, for the way we 
want to look at it)


document1:
alpha:position 1
beta:position 2
gamma: postition 3

 If you set the position increment gap large, then after one value in 
a multi-valued field ends, the position increment gap will be added to 
the positions for the next value. Solr doesn't actually internally 
have much of any idea of a multi-valued field, ALL a multi-valued 
indexed field is, is a position increment gap seperating tokens from 
different 'values'.


So index in a multi-valued field, with position increment gap 10000,  
the values:  ["alpha beta gamma", "aleph bet"], you get kind of like:


document1:
alpha: 1
beta: 2
gamma: 3
aleph: 10004
bet: 10005

A large position increment gap, as far as I know and can tell (please 
someone correct me if I'm wrong, I am not a Solr developer) has no 
effect on the size or efficiency of your index on disk.


I am not sure why positionIncrementGap doesn't just default to a very 
large number, to provide behavior that more matches what people expect 
from the idea of a "multi-valued field". So maybe there is some flaw 
in my understanding, that justifies some reason for it not to be this 
way?


But I set my positionIncrementGap very large, and haven't seen any 
issues.



On 3/1/2011 5:46 PM, Scott Yeadon wrote:

The only trick with this is ensuring the searches return the right
results and don't go across value boundaries. If I set the gap to the
largest text size we expect (approx 5000 chars) what impact does such a
large value have (i.e. does Solr physically separate these fragments in
the index or just apply the figure as part of any query?

Scott.

On 2/03/11 9:01 AM, Ahmet Arslan wrote:

In a multiValued field, call it field1, if I have two
values indexed to
this field, say value 1 = "some text...termA...more text"
and value 2 =
"some text...termB...more text" and do a search such as
field1:(termA termB)
(where) I'm
getting a hit
returned even though both terms don't occur within a single
value in the
multiValued field.

What I'm wondering is if there is a way of applying the
query against
each value of the field rather than against the field in
its entirety.
The reason being is the number of values I want to store is
variable and
I'd like to avoid the use of dynamic fields or
restructuring the index
if possible.
Your best bet can be using positionIncrementGap and to issue a 
phrase query (implicit AND) with the appropriate slop value.


If you have positionIncrementGap="100", you can simulate this by 
using

&q=field1:"termA termB"~100

http://search-lucene.com/m/Hbdvz1og7D71/












Re: numberic or string type for non-sortable field?

2011-03-01 Thread cyang2010
Can I know why?  I thought solr is tuned for string if no sorting or faceting by
range query is needed.

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/numberic-or-string-type-for-non-sortable-field-tp2606353p2607932.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: solr different sizes on master and slave

2011-03-01 Thread Markus Jelsma
Indeed, the slave should not keep useless copies, but it does, at least in 
1.4.0. I haven't seen it in 3.x, but that was just a small test that did not 
exactly match my other production installs.

In 1.4.0 Solr does not remove old copies at startup and it does not cleanly 
abort running replications at shutdown. Between shutdown and startup there 
might be a higher index version, it will then proceed as expected; download 
the new version and continue. Old copies will appear.

There is an earlier thread i started, but without a patch. You can, however, work 
around the problem by letting Solr delete a running replication: 1) disable 
polling and then 2) abort the replication. You can also write a script that will 
compare the current and available replication directories before startup and act 
accordingly.


> The slave should not keep multiple copies _permanently_, but might
> temporarily after it's fetched the new files from master, but before
> it's committed them and fully wamred the new index searchers in the
> slave.  Could that be what's going on, is your slave just still working
> on committing and warming the new version(s) of the index?
> 
> [If you do 'commit' to slave (and a replication pull counts as a
> 'commit') so quick that you get overlapping commits before the slave was
> able to warm a new index... its' going to be trouble all around.]
> 
> On 3/1/2011 4:27 PM, Mike Franon wrote:
> > ok doing some more research I noticed, on the slave it has multiple
> > folders where it keeps them for example
> > 
> > index
> > index.20110204010900
> > index.20110204013355
> > index.20110218125400
> > 
> > and then there is an index.properties that shows which index it is using.
> > 
> > I am just curious why does it keep multiple copies?  Is there a
> > setting somewhere I can change to only keep one copy so not to lose
> > space?
> > 
> > Thanks
> > 
> > On Tue, Mar 1, 2011 at 3:26 PM, Mike Franon  wrote:
> >> No pending commits, what it looks like is there are almost two copies
> >> of the index on the master, not sure how that happened.
> >> 
> >> 
> >> 
> >> On Tue, Mar 1, 2011 at 3:08 PM, Markus Jelsma
> >> 
> >>   wrote:
> >>> Are there pending commits on the master?
> >>> 
>  I was curious why would the size be dramatically different even though
>  the index versions are the same?
>  
>  One is 1.2 Gb, and on the slave it is 512 MB
>  
>  I would think they should both be the same size no?
>  
>  Thanks
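
The comparison script mentioned above can be sketched roughly as below. It assumes the slave's index.properties contains an `index=` line naming the directory in use (check your own file first — the property name here is an assumption) and removes any other index* directories:

```python
import os
import shutil

def cleanup_old_indexes(data_dir, dry_run=True):
    """Delete index.* directories under data_dir that are not the one
    referenced by index.properties. Assumes that file has a line like
    'index=index.20110204013355'."""
    props_path = os.path.join(data_dir, "index.properties")
    current = "index"  # directory used when no index.properties exists
    if os.path.exists(props_path):
        with open(props_path) as f:
            for line in f:
                if line.startswith("index="):
                    current = line.split("=", 1)[1].strip()
    for name in os.listdir(data_dir):
        path = os.path.join(data_dir, name)
        if os.path.isdir(path) and name.startswith("index") and name != current:
            if dry_run:
                print("would remove", path)
            else:
                shutil.rmtree(path)
    return current
```

Run it with dry_run=True first, and only while Solr is stopped; deleting the directory a live replication is writing into would defeat the purpose.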


Re: Indexed, but cannot search

2011-03-01 Thread Markus Jelsma
Hmm, please provide the analyzer for text and the output of debugQuery=true. Anyway, if 
the field's type is the fieldType text, the catchall field text is of fieldType text as 
well, and you reindexed, it should work as expected.

> Oh if only it were that easy :-). I have reindexed since making that change
> which is how I was able to get the regular search working. I have not
> however been able to get the search across all fields to work.
> 
> On Tue, Mar 1, 2011 at 3:01 PM, Markus Jelsma 
wrote:
> > Traditionally, people forget to reindex ;)
> > 
> > > Hi all,
> > > 
> > > The problem was that my fields were defined as type="string" instead of
> > > type="text". Once I corrected that, it seems to be fixed. The only part
> > > that still is not working though is the search across all fields.
> > > 
> > > For example:
> > > 
> > > http://localhost:8983/solr/select/?q=type%3AMammal
> > > 
> > > Now correctly returns the records matching mammal. But if I try to do a
> > > global search across all fields:
> > > 
> > > http://localhost:8983/solr/select/?q=Mammal
> > > http://localhost:8983/solr/select/?q=text%3AMammal
> > > 
> > > I get no results returned. Here is how the schema is set up:
> > > 
> > >  > > multiValued="true"/>
> > > text
> > > 
> > > 
> > > Thanks to everyone for your help so far. I think this is the last
> > > hurdle
> > 
> > I
> > 
> > > have to jump over.
> > > 
> > > On Tue, Mar 1, 2011 at 12:34 PM, Upayavira  wrote:
> > > > Next question, do you have your "type" field set to index="true" in
> > 
> > your
> > 
> > > > schema?
> > > > 
> > > > Upayavira
> > > > 
> > > > On Tue, 01 Mar 2011 11:06 -0500, "Brian Lamb"
> > > > 
> > > >  wrote:
> > > > > Thank you for your reply but the searching is still not working
> > > > > out. For example, when I go to:
> > > > > 
> > > > > http://localhost:8983/solr/select/?q=*%3A*<
> > 
> > http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&in
> > 
> > > > dent=on
> > > > 
> > > > > I get the following as a response:
> > > > > 
> > > > > 
> > > > > 
> > > > >   
> > > > >   
> > > > > Mammal
> > > > > 1
> > > > > Canis
> > > > >   
> > > > >   
> > > > > 
> > > > > 
> > > > > 
> > > > > (plus some other docs but one is enough for this example)
> > > > > 
> > > > > But if I go to
> > > > > http://localhost:8983/solr/select/?q=type%3A<
> > 
> > http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&in
> > 
> > > > dent=on
> > > > 
> > > > > Mammal
> > > > > 
> > > > > I only get:
> > > > > 
> > > > > 
> > > > > 
> > > > > But it seems that should return at least the result I have listed
> > > > > above. What am I doing incorrectly?
> > > > > 
> > > > > On Mon, Feb 28, 2011 at 6:57 PM, Upayavira  wrote:
> > > > > > q=dog is equivalent to q=text:dog (where the default search field
> > 
> > is
> > 
> > > > > > defined as text at the bottom of schema.xml).
> > > > > > 
> > > > > > If you want to specify a different field, well, you need to tell
> > > > > > it
> > > > > > 
> > > > > > :-)
> > > > > > 
> > > > > > Is that it?
> > > > > > 
> > > > > > Upayavira
> > > > > > 
> > > > > > On Mon, 28 Feb 2011 15:38 -0500, "Brian Lamb"
> > > > > > 
> > > > > >  wrote:
> > > > > > > Hi all,
> > > > > > > 
> > > > > > > I was able to get my installation of Solr indexed using
> > 
> > dataimport.
> > 
> > > > > > > However,
> > > > > > > I cannot seem to get search working. I can verify that the data
> > 
> > is
> > 
> > > > there
> > > > 
> > > > > > > by
> > 
> > > > > > > going to:
> > http://localhost:8983/solr/select/?q=*%3A*&version=2.2&start=0&rows=10&in
> > 
> > > > dent=on
> > > > 
> > > > > > > This gives me the response:  > > > > > > numFound="234961" start="0">
> > > > > > > 
> > > > > > > But when I go to
> > 
> > http://localhost:8983/solr/select/?q=dog&version=2.2&start=0&rows=10&inde
> > 
> > > > nt=on
> > > > 
> > > > > > > I get the response:  > 
> > start="0">
> > 
> > > > > > > I know that dog should return some results because it is the
> > 
> > first
> > 
> > > > result
> > > > 
> > > > > > > when I select all the records. So what am I doing incorrectly
> > 
> > that
> > 
> > > > would
> > > > 
> > > > > > > prevent me from seeing results?
> > > > > > 
> > > > > > ---
> > > > > > Enterprise Search Consultant at Sourcesense UK,
> > > > > > Making Sense of Open Source
> > > > 
> > > > ---
> > > > Enterprise Search Consultant at Sourcesense UK,
> > > > Making Sense of Open Source


Re: multi-core solr, specifying the data directory

2011-03-01 Thread Chris Hostetter
: 
: ${solr.data.dir:./solr/data}

that directive says "use the solr.data.dir system property to pick a path, 
if it is not set, use "./solr/data" (relative to the CWD)

if you want it to use the default, then you need to eliminate it 
completely, or you need to change it to the empty string...

   ${solr.data.dir:}

or...

   


-Hoss
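
The ${name:default} resolution rule can be made concrete with a toy model of the substitution (not Solr's actual code — just the order described above: system property first, then the text after the colon):

```python
import re

def resolve(value, system_props):
    """Toy model of ${name:default} substitution: use the system
    property if it is set, otherwise the (possibly empty) default."""
    def substitute(match):
        name, _, default = match.group(1).partition(":")
        return system_props.get(name, default)
    return re.sub(r"\$\{([^}]*)\}", substitute, value)

print(resolve("${solr.data.dir:./solr/data}", {}))                       # ./solr/data
print(resolve("${solr.data.dir:./solr/data}", {"solr.data.dir": "/x"}))  # /x
print(resolve("${solr.data.dir:}", {}))                                  # empty string
```

With the empty default, dataDir ends up unset and Solr falls back to the data dir inside the instanceDir, which is the behavior the multicore example relies on.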


Re: numberic or string type for non-sortable field?

2011-03-01 Thread Chris Hostetter

: Can I know why?  I thought solr is tuned for string if no sorting of facet by
: range query is needed.

"tuned for string" doesn't really mean anything to me, i'm not sure what 
that's in refrence to.  nothing thta i know of is particularly optimized 
for strings.  Almost anything can be indexed/stored/represented as a 
string (in some form ot another) and that tends to work fine in solr, but 
some things are optimized for other more specialized datatypes.

the reason i suggested that using ints might (marginally) be better is 
because of the FieldCache and the fieldValueCache -- the int 
representation uses less memory than if it was holding strings 
representing the same ints.

worrying about that is really a premature optimization though -- model 
your data in the way that makes the most sense -- if your ids are 
inherently ints, model them as ints until you come up with a reason to 
model them otherwise and move on to the next problem.


-Hoss
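
The FieldCache argument is about Java internals, but the direction is easy to illustrate even at the Python level (CPython object sizes, purely as an analogy — Lucene's per-entry costs differ):

```python
import sys

# an int is a small fixed-size object; the string rendering of the same
# value grows with the number of digits (plus per-object overhead)
print(sys.getsizeof(123456789))
print(sys.getsizeof("123456789"))
assert sys.getsizeof(123456789) < sys.getsizeof("123456789")
```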


Re: Distances in spatial search (Solr 4.0)

2011-03-01 Thread William Bell
See http://wiki.apache.org/solr/SpatialSearch and yes, use sort=geodist()+asc

This Wiki page has everything you should need.


On Tue, Mar 1, 2011 at 3:49 PM, Alexandre Rocco  wrote:
> Hi Bill,
>
> I was using a different approach to sort by the distance with the dist()
> function, since geodist() is not documented on the wiki (
> http://wiki.apache.org/solr/FunctionQuery)
>
> Tried something like:
> &sort=dist(2, 45.15,-93.85, lat, lng) asc
>
> I made some tests with geodist() function as you pointed and got different
> results.
> Is it safe to assume that geodist() is the correct way of doing it?
>
> Also, can you clear up how can I see the distance using the "_Val_" as you
> told?
>
> Thanks!
> Alexandre
>
> On Tue, Mar 1, 2011 at 12:03 AM, Bill Bell  wrote:
>
>> Use sort with geodist() to sort by distance.
>>
>> Getting the distance returned us documented on the wiki if you are not
>> using score. see reference to _Val_
>>
>> Bill Bell
>> Sent from mobile
>>
>>
>> On Feb 28, 2011, at 7:54 AM, Alexandre Rocco  wrote:
>>
>> > Hi guys,
>> >
>> > We are implementing a separate index on our website, that will be
>> dedicated
>> > to spatial search.
>> > I've downloaded a build of Solr 4.0 to try the spatial features and got
>> the
>> > geodist working really fast.
>> >
>> > We now have 2 other features that will be needed on this project:
>> > 1. Returning the distance from the reference point to the search hit (in
>> > kilometers)
>> > 2. Sorting by the distance.
>> >
>> > On item 2, the wiki doc points that a distance function can be used but I
>> > was not able to find good info on how to accomplish it.
>> > Also, returning the distance (item 1) is noted as currently being in
>> > development and there is some workaround to get it.
>> >
>> > Anyone had experience with the spatial feature and could help with some
>> > pointers on how to achieve it?
>> >
>> > Thanks,
>> > Alexandre
>>
>
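
For sanity-checking the numbers Solr returns: geodist() computes great-circle distance in kilometers, and a standalone haversine implementation (the standard formula — Solr's earth-radius constant may differ slightly) reproduces it closely:

```python
import math

EARTH_RADIUS_KM = 6371.0087714  # mean earth radius; an approximation of Solr's constant

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))

# one degree of latitude is about 111.2 km everywhere on the sphere
print(round(haversine_km(45.15, -93.85, 46.15, -93.85), 1))  # 111.2
```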


Re: Question about Nested Span Near Query

2011-03-01 Thread William Bell
I am not 100% sure. But why did you not use the standard config for "text"?


  








  
  







  



You are using:

- 
- 
  
  
  
- 
  
  


Can you try a more standard approach ?

solr.WhitespaceTokenizerFactory
solr.LowerCaseFilterFactory

??

Thanks.
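
Roughly, that pair of factories reduces to whitespace splitting plus lowercasing (a toy sketch, ignoring Solr's offset/position bookkeeping):

```python
def analyze(text):
    """Approximation of WhitespaceTokenizerFactory + LowerCaseFilterFactory."""
    return [token.lower() for token in text.split()]

print(analyze("Evaluation of LOAN and lease Portfolios"))
# ['evaluation', 'of', 'loan', 'and', 'lease', 'portfolios']
```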


On Mon, Feb 28, 2011 at 2:38 AM, Ahsan |qbal  wrote:
> Hi Bill
> Any update..
>
> On Thu, Feb 24, 2011 at 8:58 PM, Ahsan |qbal 
> wrote:
>>
>> Hi
>> schema and document are attached.
>>
>> On Thu, Feb 24, 2011 at 8:24 PM, Bill Bell  wrote:
>>>
>>> Send schema and document in XML format and I'll look at it
>>>
>>> Bill Bell
>>> Sent from mobile
>>>
>>>
>>> On Feb 24, 2011, at 7:26 AM, "Ahsan |qbal" 
>>> wrote:
>>>
>>> > Hi
>>> >
>>> > To narrow down the issue I indexed a single document with one of the
>>> > sample
>>> > queries (given below) which was giving issue.
>>> >
>>> > *"evaluation of loan and lease portfolios for purposes of assessing the
>>> > adequacy of" *
>>> >
>>> > Now when i Perform a search query (*TextContents:"evaluation of loan
>>> > and
>>> > lease portfolios for purposes of assessing the adequacy of"*) the
>>> > parsed
>>> > query is
>>> >
>>> >
>>> > *spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([spanNear([Contents:evaluation,
>>> > Contents:of], 0, true), Contents:loan], 0, true), Contents:and], 0,
>>> > true),
>>> > Contents:lease], 0, true), Contents:portfolios], 0, true),
>>> > Contents:for], 0,
>>> > true), Contents:purposes], 0, true), Contents:of], 0, true),
>>> > Contents:assessing], 0, true), Contents:the], 0, true),
>>> > Contents:adequacy],
>>> > 0, true), Contents:of], 0, true)*
>>> >
>>> > and search is not successful.
>>> >
>>> > If I remove '*evaluation*' from start OR *'assessing the adequacy of*'
>>> > from
>>> > end it works fine. Issue seems to come on relatively long phrases but I
>>> > have
>>> > not been able to find a pattern and its really mind boggling coz I
>>> > thought
>>> > this issue might be due to large position list but this is a single
>>> > document
>>> > with one phrase. So its definitely not related to size of index.
>>> >
>>> > Any ideas whats going on??
>>> >
>>> > On Thu, Feb 24, 2011 at 10:25 AM, Ahsan |qbal
>>> > wrote:
>>> >
>>> >> Hi
>>> >>
>>> >> It didn't search.. (means no results found even results exist) one
>>> >> observation is that it works well even in the long phrases but when
>>> >> the long
>>> >> phrases contain stop words and same stop word exist two or more time
>>> >> in the
>>> >> phrase then, solr can't search with query parsed in this way.
>>> >>
>>> >>
>>> >> On Wed, Feb 23, 2011 at 11:49 PM, Otis Gospodnetic <
>>> >> otis_gospodne...@yahoo.com> wrote:
>>> >>
>>> >>> Hi,
>>> >>>
>>> >>> What do you mean by "this doesn't work fine"?  Does it not work
>>> >>> correctly
>>> >>> or is
>>> >>> it slow or ...
>>> >>>
>>> >>> I was going to suggest you look at Surround QP, but it looks like you
>>> >>> already
>>> >>> did that.  Wouldn't it be better to get Surround QP to work?
>>> >>>
>>> >>> Otis
>>> >>> 
>>> >>> Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
>>> >>> Lucene ecosystem search :: http://search-lucene.com/
>>> >>>
>>> >>>
>>> >>>
>>> >>> - Original Message 
>>>  From: Ahsan |qbal 
>>>  To: solr-user@lucene.apache.org
>>>  Sent: Tue, February 22, 2011 10:59:26 AM
>>>  Subject: Question about Nested Span Near Query
>>> 
>>>  Hi All
>>> 
>>>  I had a requirement to implement queries that involves phrase
>>> >>> proximity.
>>>  like user should be able to search "ab cd" w/5 "de fg", both
>>>   phrases as
>>>  whole should be with in 5 words of each other. For this I  implement
>>>  a
>>> >>> query
>>>  parser that make use of nested span queries, so above query  would
>>>  be
>>> >>> parsed
>>>  as
>>> 
>>>  spanNear([spanNear([Contents:ab, Contents:cd], 0,  true),
>>>  spanNear([Contents:de, Contents:fg], 0, true)], 5,  false)
>>> 
>>>  Queries like this seems to work really good when phrases are small
>>>   but
>>> >>> when
>>>  phrases are large this doesn't work fine. Now my question, Is there
>>>   any
>>>  limitation of SpanNearQuery. that we cannot handle large phrases in
>>> >>> this
>>>  way?
>>> 
>>>  please help
>>> 
>>>  Regards
>>>  Ahsan
>>> 
>>> >>>
>>> >>
>>> >>
>>
>
>


indexing mysql dateTime/timestamp into solr date field

2011-03-01 Thread cyang2010
Hi,

I can't seem to index into a solr date field from a query result
using DataImportHandler.  Does anyone know how to resolve the problem?












When I check the solr document, there is no term populated for the release_date
field.  All other fields are populated with terms.

The field, "release_date" is a solr date type field.


Appreciate your help.


-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/indexing-mysql-dateTime-timestamp-into-solr-date-field-tp2608327p2608327.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: More Date Math: NOW/WEEK

2011-03-01 Thread Chris Hostetter
: Digging into the source code of DateMathParser.java, I found the following
: comment:
:
:    99   // NOTE: consciously choosing not to support WEEK at this time,
:   100   // because of complexity in rounding down to the nearest week
:   101   // arround a month/year boundry.
:   102   // (Not to mention: it's not clear what people would *expect*)
: 
: I was able to implement a work-around in my ruby client using the following 
: pseudo code:
:   wd=NOW.wday; "NOW-#{wd}DAY/DAY"

the main issue that comment in DateMathParser.java is referring to is the 
ambiguity of what should happen when you try to do something like 
"2009-01-02T00:00:00Z/WEEK"

"WEEK" would be the only unit where rounding changed a unit 
*larger* than the one you rounded on -- ie: rounding on day only affects 
hours, minutes, seconds, millis; rounding on month only affects days, 
hours, minutes, seconds, millis; but in an example like the one above, 
where Jan 2 2009 was a Friday, rounding down a week (using logic similar 
to what you have) would result in "2008-12-28T00:00:00Z" -- changing the 
month and year.

It's not really clear that that is what people would expect -- I'm 
guessing at least a few people would expect it to stop at the 1st of the 
month.

The ambiguity of which behavior makes the most sense is why I never got 
around to implementing it -- it's certainly possible, but the various 
options seemed too confusing to be generally useful and easy to 
understand.

As you point out: people who really want special logic like this (and know 
how they want it to behave) have an easy workaround by evaluating "NOW" 
in the client, since every week has exactly seven days.
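The client-side workaround from the original message can be sketched as follows; this is a hedged Python translation of the Ruby pseudo-code (the function name is mine, and it assumes the week starts on Sunday, matching Ruby's Time#wday):

```python
from datetime import datetime, timezone

def week_floor_expr(now=None):
    """Build a Solr date-math expression that rounds down to the start of
    the current week (Sunday), since Solr itself does not support /WEEK."""
    now = now or datetime.now(timezone.utc)
    # Python's weekday() is Monday=0; shift so Sunday=0 like Ruby's wday.
    wd = (now.weekday() + 1) % 7
    return "NOW-%dDAY/DAY" % wd
```

For the example date above, Friday 2009-01-02, this yields "NOW-5DAY/DAY", which Solr then rounds down to 2008-12-28T00:00:00Z -- exactly the month/year-crossing behavior discussed here.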



-Hoss


Re: indexing mysql dateTime/timestamp into solr date field

2011-03-01 Thread Chris Hostetter
:   query="select ID,  title_full as TITLE_NAME, YEAR,
: COUNTRY_OF_ORIGIN,  modified as RELEASE_DATE from title limit 10">

Are you certain that the first 10 results returned (you have "limit 10") 
all have a value in the "modified" field?

If modified is nullable, you could very easily just happen to be getting 10 
docs that don't have values in that field.


-Hoss


Re: Searching all terms - SolrJ

2011-03-01 Thread Chris Hostetter

: Yes but I want to leave the choice to the user.
: 
: He can either search all the terms or just some.
: 
: Is there any more flexible solution ? Even if I have to code it by hand ?

the declaration in the schema dictates the default.

you can override the default at query time using the "q.op" param (ie: 
q.op=AND, q.op=OR) in the request.

in SolrJ you would just call solrQuery.set("q.op","OR") on your SolrQuery 
object.

-Hoss


Re: indexing mysql dateTime/timestamp into solr date field

2011-03-01 Thread cyang2010
Yes, I am pretty sure every row has a modified field.   I did my testing
before posting the question.

I tried adding DateFormatTransformer; it still does not help.













I assume it is ok to just get the date part of the information out of a
datetime field?

Any thought on this?

-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/indexing-mysql-dateTime-timestamp-into-solr-date-field-tp2608327p2608452.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: indexing mysql dateTime/timestamp into solr date field

2011-03-01 Thread William Bell


Did you convert the date to standard GMT format as above in DIH?

Also add transformer="DateFormatTransformer,..."

http://lucene.apache.org/solr/api/org/apache/solr/schema/DateField.html
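As a hedged illustration of what "standard GMT format" means here: Solr's DateField expects the canonical 1995-12-31T23:59:59Z form, so a MySQL DATETIME string has to be reshaped roughly like this (the helper name is mine; DIH's DateFormatTransformer does the equivalent during import):

```python
from datetime import datetime

def to_solr_date(mysql_dt):
    """Convert a MySQL DATETIME string (e.g. '2011-03-01 19:54:07') to
    Solr's canonical UTC date format, assuming the value is already UTC."""
    dt = datetime.strptime(mysql_dt, "%Y-%m-%d %H:%M:%S")
    return dt.strftime("%Y-%m-%dT%H:%M:%SZ")
```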



On Tue, Mar 1, 2011 at 7:54 PM, cyang2010  wrote:
> Yes, I am pretty sure every row has a modified field.   I did my testing
> before posting question.
>
> I tried with adding DateFormatTransformer, still not help.
>
>
>                                query="select ID,  title_full as TITLE_NAME, YEAR,
> COUNTRY_OF_ORIGIN,  modified as RELEASE_DATE from title limit 10"
>
>
> transformer="RegexTransformer,DateFormatTransformer,TemplateTransformer">
>
>            
>
>            
>
>            
>            
>
>             dateTimeFormat="yyyy-MM-dd"/>
>
> I assume it is ok to just get the date part of the information out of a
> datetime field?
>
> Any thought on this?
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/indexing-mysql-dateTime-timestamp-into-solr-date-field-tp2608327p2608452.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: [ANNOUNCE] Web Crawler

2011-03-01 Thread David Smiley (@MITRE.org)
Dominique,
The obvious number one question is of course why you re-invented this wheel
when there are several existing crawlers to choose from.  Your website says
the reason is that the UIs on existing crawlers (e.g. Nutch, Heritrix, ...)
weren't sufficiently user-friendly or had the site-specific configuration
you wanted.  Well if that is the case, why didn't you add/enhance such
capabilities for an existing crawler?

~ David Smiley

-
 Author: https://www.packtpub.com/solr-1-4-enterprise-search-server/book
-- 
View this message in context: 
http://lucene.472066.n3.nabble.com/ANNOUNCE-Web-Crawler-tp2607831p2608956.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: indexing mysql dateTime/timestamp into solr date field

2011-03-01 Thread cyang2010
Bill,

I did try the approach you suggested above.  Unfortunately it does not
work either.

It is pretty much the same as my last reply, except with
dateTimeFormat="yyyy-MM-dd'T'hh:mm:ss"

Thanks,

cyang

--
View this message in context: 
http://lucene.472066.n3.nabble.com/indexing-mysql-dateTime-timestamp-into-solr-date-field-tp2608327p2609053.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Searching all terms - SolrJ

2011-03-01 Thread openvictor Open
Great !

Thank you very much Chris, it will come handy !

Best regards,
Victor

2011/3/1 Chris Hostetter 

>
> : Yes but I want to leave the choice to the user.
> :
> : He can either search all the terms or just some.
> :
> : Is there any more flexible solution ? Even if I have to code it by hand ?
>
> the declaration in the schema dictates the default.
>
> you can override the default at query time using the "q.op" param (ie:
> q.op=AND, q.op=OR) in the request.
>
> in SolrJ you would just call solrQuery.set("q.op","OR") on your SolrQuery
> object.
>
> -Hoss
>


how to debug dataimporthandler

2011-03-01 Thread cyang2010
I wonder how to run DataImportHandler in debug mode.  Currently I can't get
data correctly into the index through DataImportHandler, especially a timestamp
column into a solr date field.  I want to debug the process.

According to this wiki page:

Commands
The handler exposes all its API as HTTP requests. The following are the
possible operations:
•full-import : Full Import operation can be started by hitting the URL
http://:/solr/dataimport?command=full-import 
...
■clean : (default 'true'). Tells whether to clean up the index before the
indexing is started 
■commit: (default 'true'). Tells whether to commit after the operation 
■optimize: (default 'true'). Tells whether to optimize after the operation 
■debug : (default false). Runs in debug mode.It is used by the interactive
development mode (see here) 
■Please note that in debug mode, documents are never committed
automatically. If you want to run debug mode and commit the results too, add
'commit=true' as a request parameter. 


Therefore, I run

http://:/solr/dataimport?command=full-import&debug=true

Not only did I not see any log at "DEBUG" level, it also crashed my machine
a few times.   I was surprised it could even do that ...

Has anyone tried to debug the process before?  What is your experience
with it?
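Since debug mode never commits automatically (per the wiki text quoted above), the request needs commit=true added explicitly. A minimal sketch of building such a URL -- the host and port are hypothetical placeholders for your deployment:

```python
from urllib.parse import urlencode

def dih_debug_url(base="http://localhost:8983/solr"):
    """Build a DataImportHandler full-import request that runs in debug
    mode but still commits the results (debug alone never commits)."""
    params = {"command": "full-import", "debug": "true", "commit": "true"}
    return "%s/dataimport?%s" % (base, urlencode(sorted(params.items())))
```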



--
View this message in context: 
http://lucene.472066.n3.nabble.com/how-to-debug-dataimporthandler-tp2611506p2611506.html
Sent from the Solr - User mailing list archive at Nabble.com.


MorelikeThis not working with Shards(distributed Search)

2011-03-01 Thread Isha Garg



Hi,

I am experimenting with *morelikethis* to see if it also works with
*distributed* search, but I have not found a solution yet. Can anyone
help me with this? Please provide a detailed description, as I did not
get it working by updating
MoreLikeThisComponent.java, MoreLikeThisHandler.java, and ShardRequest.java
as specified in the AlternateDistributedMLT.patch.

Thanks in advance..
Isha Garg