Searching Behavior Multi Word

2010-01-10 Thread brianeno

Hello, I am having trouble figuring out an issue with Solr and the results
returned for a query.  I have a document titled "Band Of Brothers" and am seeing
the following results.

- “Band” = works, returns the document
- “Brothers” = works, returns the document
- “Band of” = works, returns the document
- “Band of “ = works, returns the document
- “of Brothers” = works, returns the document
- “ of Brothers” = works, returns the document
- “Band of Brothers” = does not work, does not return the document

I can't figure out why the last one does not work.  I am using the dismax
query parser and have tried changes related to stopwords, but to no
avail.

Can anyone steer me in the right direction?  I assume this is pretty
standard Lucene/Solr behavior, but looking through the documentation and forum I
either haven't seen it or have missed it.

Many thanks.

-- 
View this message in context: 
http://old.nabble.com/Searching-Behavior-Multi-Word-tp27098985p27098985.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Searching Behavior Multi Word

2010-01-10 Thread Erik Hatcher

Give &debugQuery=true a try and see what each of those parse to.

What are your dismax settings?   And for the fields you're querying  
(pf/qf), what are their field types and analysis details?
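
As a minimal sketch of the debugQuery suggestion above, the following builds a request URL for one of the queries. The host, port, handler path and the "title" field in qf are assumptions for illustration, not details from the original post; defType=dismax assumes the parser is chosen per request rather than through handler defaults.

```python
from urllib.parse import urlencode


def debug_query_url(base_url, query, qf):
    """Build a Solr select URL that asks for debug output for a dismax query."""
    params = {
        "q": query,
        "defType": "dismax",   # assumption: parser selected per request
        "qf": qf,              # assumption: field(s) to search, e.g. "title"
        "debugQuery": "true",  # adds a debug section to the response
    }
    return base_url + "?" + urlencode(params)


# Hypothetical local Solr instance; adjust to the real host and handler.
print(debug_query_url("http://localhost:8983/solr/select",
                      "Band of Brothers", "title"))
```

Comparing the parsedquery entry in the debug section for "Band of" and "Band of Brothers" should show how each query is actually interpreted.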


Erik



On Jan 10, 2010, at 9:48 AM, brianeno wrote:



Hello, I am having trouble figuring out an issue with Solr and returning of a
query.  I have a document title "Band Of Brothers" and seeing the following
results.

- “Band” = works, returns the document
- “Brothers” = works, returns the document
- “Band of” = works, returns the document
- “Band of “ = works, returns the document
- “of Brothers” = works, returns the document
- “ of Brothers” = works, returns the document
- “Band of Brothers” = does not work, does not return the document

I can't figure out why the last one does not work?  I am using the dismax
query parser and have made attempts in connection with stopwords, but to no
avail.

Can anyone steer me in the right direction, I assume this is a pretty
standard Lucene/Solr behavior but looking at documents and forum I haven't
seen it or missed it.

Many thanks.






Tokenizer question

2010-01-10 Thread rswart

Hi,

This is probably an easy question. 

I am doing a simple query on postcode and house number. If the house number
contains a minus sign, like:

q=PostCode:(1078 pw)+AND+HouseNumber:(39-43)

the resulting parsed query contains a phrase query:

+(PostCode:1078 PostCode:pw) +PhraseQuery(HouseNumber:"39 43")

This never matches.

What I want solr to do is generate the following parsed query (essentially
an OR for both house numbers):

+(PostCode:1078 PostCode:pw) +(HouseNumber:39 HouseNumber:43)

Solr generates this based on the following query (so a space instead of a
minus sign):

q=PostCode:(1078 pw)+AND+HouseNumber:(39 43)


I tried two things to have Solr generate the desired parsed query:

1. WordDelimiterFilterFactory with generateNumberParts=1 but this results in
a phrase query
2. PatternTokenizerFactory that splits on (\s+|-).

Neither option works. 

Any suggestions on how to get rid of the phrase query?
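
The behavior described above can be illustrated with a toy model. This is only a sketch of the general rule (the query parser splits on whitespace first, then analyzes each chunk, and a chunk that yields several tokens becomes a phrase); it is not Solr's or Lucene's actual code, and the function names are hypothetical.

```python
import re


def analyze(text):
    """Stand-in for an analyzer that splits on whitespace or '-'."""
    return [tok for tok in re.split(r"\s+|-", text) if tok]


def parse_clause(field, raw):
    """Toy query parser: split on whitespace, then analyze each chunk."""
    clauses = []
    for chunk in raw.split():
        tokens = analyze(chunk)
        if len(tokens) > 1:
            # Several tokens from one chunk: emitted as a phrase query.
            clauses.append('%s:"%s"' % (field, " ".join(tokens)))
        else:
            clauses.extend("%s:%s" % (field, tok) for tok in tokens)
    return " ".join(clauses)


print(parse_clause("HouseNumber", "39-43"))
print(parse_clause("HouseNumber", "39 43"))
```

In this model the tokenizer choice does not matter: "39-43" reaches the analyzer as one chunk, so any analyzer that splits it produces a phrase, while "39 43" is already two chunks before analysis.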

Thanks,

Richard
-- 
View this message in context: 
http://old.nabble.com/Tokenizer-question-tp27099119p27099119.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: How to Split Index file.

2010-01-10 Thread Andrzej Bialecki

On 2010-01-10 01:55, Lance Norskog wrote:

Make two copies of the index. In each copy, delete the records you do
not want. Optimize.


... which is essentially what the MultiPassIndexSplitter does, only it 
avoids the initial copy (by deleting in the source index).



--
Best regards,
Andrzej Bialecki <><
 ___. ___ ___ ___ _ _   __
[__ || __|__/|__||\/|  Information Retrieval, Semantic Web
___|||__||  \|  ||  |  Embedded Unix, System Integration
http://www.sigram.com  Contact: info at sigram dot com



Re: Understanding the query parser

2010-01-10 Thread Ahmet Arslan

> I am using Solr 1.3.
> I have an index with a field called "name". It is of type
> "text"
> (unmodified, stock text field from solr).
> 
> My query
> field:foo-bar
> is parsed as a phrase query
> field:"foo bar"
> 
> I was rather expecting it to be parsed as
> field:(foo bar)
> or
> field:foo field:bar
> 
> Is there an expectation mismatch? Can I make it work as I
> expect it to?

If the query analyzer produces two or more tokens from a single query term, 
QueryParser constructs a PhraseQuery, so this behavior is expected. 

Without writing custom code, it seems impossible to alter this behavior.

Modifying QueryParser to change this behavior would be troublesome. 
I think the easiest way is to replace '-' with whitespace before the analysis phase, 
probably on the client side or in a custom RequestHandler.

Maybe you can set qp.setPhraseSlop(Integer.MAX_VALUE); so that 
field:foo-bar and field:(foo AND bar) will be virtually equivalent.
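
A minimal sketch of the client-side replacement, assuming the client builds the query string itself. The function name is hypothetical, and the rule used here (only hyphens that sit between two word characters are replaced) is one possible choice, made so that a leading '-' used as the prohibit operator is left alone.

```python
import re


def split_hyphens(value):
    """Replace hyphens between word characters with a space."""
    return re.sub(r"(?<=\w)-(?=\w)", " ", value)


print(split_hyphens("foo-bar"))   # foo bar
print(split_hyphens("39-43"))     # 39 43
print(split_hyphens("-foo"))      # -foo (leading minus is untouched)
```

The client would apply this to the user-supplied value only, before wrapping it in field:( ... ), so that the parser sees two separate terms.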

Hope this helps.


  


London Search Social - this Tuesday, 12th January

2010-01-10 Thread Richard Marr
Hi all,

Apologies for the cross-post. If you're near London on Tuesday the
12th Jan (i.e. this Tuesday) please come along and geek with us over a
beer or two. All experience levels welcome, don't be scared. Details
on the Meetup page below... (please sign up on there if you're
interested in subsequent events).

http://www.meetup.com/london-search-social/

Cheers,

Richard Marr


Re: London Search Social - this Tuesday, 12th January

2010-01-10 Thread rob





On Sun 10/01/10 20:24 , Richard Marr  wrote:

> Hi all,
> Apologies for the cross-post. If you're near London on Tuesday the
> 12th Jan (i.e. this Tuesday) please come along and geek with us over
> a
> beer or two. All experience levels welcome, don't be scared. Details
> on the Meetup page below... (please sign up on there if you're
> interested in subsequent events).
> http://www.meetup.com/london-search-social/
> Cheers,
> Richard Marr
> 
> 
Message sent via Atmail Open - http://atmail.org/


[PECL-DEV] [ANNOUNCEMENT] solr-0.9.9 (beta) Released

2010-01-10 Thread Israel Ekpo
The new PECL package solr-0.9.9 (beta) has been released at
http://pecl.php.net/.

Release notes
-------------
- Fixed Bug #17009 Creating two SolrQuery objects leads to wrong query value
- Reset the buffer for the request data from the previous request in
SolrClient
- Added new internal static function solr_set_initial_curl_handle_options()
- Moved the initialization of CURL handle options to
solr_set_initial_curl_handle_options() function
- Resetting the CURL options on the (CURL *) handle after each request is
completed
- Added more explicit error message to indicate that cloning SolrParams
objects and its descendants is currently not yet supported

Package Info
------------
It effectively simplifies the process of interacting with Apache Solr using
PHP5 and it already comes with built-in readiness for the latest features
available in Solr 1.4. The extension has features such as built-in,
serializable query string builder objects which effectively simplify the
manipulation of name-value pair request parameters across repeated requests.
The response from the Solr server is also automatically parsed into native
PHP objects whose properties can be accessed as array keys or object
properties without any additional configuration on the client-side. Its
advanced HTTP client reuses the same connection across multiple requests and
provides built-in support for connecting to Solr servers secured behind HTTP
Authentication or HTTP proxy servers. It is also able to connect to
SSL-enabled containers. Please consult the documentation for more details on
features.

Related Links
-------------
Package home: http://pecl.php.net/package/solr
Changelog: http://pecl.php.net/package-changelog.php?package=solr
Download: http://pecl.php.net/get/solr-0.9.9.tgz
Documentation: http://us.php.net/solr

Authors
-------
Israel Ekpo  (lead)


Errant link on the wiki

2010-01-10 Thread Jason Rutherglen
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/Kstem

Which is a cool algo, however the MIT link is totally down... Is it up
sometimes or is it discontinued in favor of the Lucid version (which
is open source or not?)?


Re: Errant link on the wiki

2010-01-10 Thread Pradeep Pujari
Hi Jason,

Which one is the MIT link? Can you please paste the url?

Pradeep.

On Sun, Jan 10, 2010 at 8:51 PM, Jason Rutherglen <
jason.rutherg...@gmail.com> wrote:

> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters/Kstem
>
> Which is a cool algo, however the MIT link is totally down... Is it up
> sometimes or is it discontinued in favor of the Lucid version (which
> is open source or not?)?
>


Could not start SOLR issue

2010-01-10 Thread dipti khullar
Hi

We have been running Solr 1.3 in a master/slave setup in production for about 5
months.

Yesterday, we faced the following issue on one of the slaves for the first time,
because of which we had to restart the slave.

SEVERE: Could not start SOLR. Check solr/home property
java.lang.RuntimeException: java.io.FileNotFoundException: no segments* file
found in 
org.apache.lucene.store.FSDirectory@/opt/solr/solr_slave/solr/data/index:
files: null

I searched the forums but couldn't find any relevant info on what could have
caused the issue.

In the snapinstaller logs, the following failure was observed:

2010/01/11 04:20:06 started by solr
2010/01/11 04:20:06 command:
/opt/solr/solr_slave/solr/solr/bin/snapinstaller
2010/01/11 04:20:07 installing snapshot
/opt/solr/solr_slave/solr/data/snapshot.20100111041402
2010/01/11 04:20:07 notifing Solr to open a new Searcher
2010/01/11 04:20:07 failed to connect to Solr server
2010/01/11 04:20:07 snapshot installed but Solr server has not open a new
Searcher
2010/01/11 04:20:08 failed (elapsed time: 1 sec)


Configurations:
There are 2 search servers in a virtualized VMware environment. Each has 2
instances of Solr running on separate ports in Tomcat.
Server 1: hosts 1 master (application 1), 1 slave (application 1)
Server 2: hosts 1 master (application 2), 1 slave (application 1)

Both servers have 4 CPUs and 4 GB RAM.
Master
- 4GB RAM
- 1GB JVM Heap memory is allocated to Solr
Slave1/Slave2:
- 4GB RAM
- 2GB JVM Heap memory is allocated to Solr

Can there be any possible reasons that solr/home property couldn't be found?
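
A minimal sketch for checking, at the moment of failure, whether the index directory named in the exception exists and contains a segments* file. The path is taken from the error message above; the function name is hypothetical, and this only inspects the directory, it does not explain why the snapshot install left it in that state.

```python
import os


def segments_files(index_dir):
    """Return sorted segments* file names, or None if the directory is unreadable."""
    try:
        names = os.listdir(index_dir)
    except OSError:
        # Directory is missing or not readable by this user.
        return None
    return sorted(name for name in names if name.startswith("segments"))


found = segments_files("/opt/solr/solr_slave/solr/data/index")
if found is None:
    print("index directory missing or unreadable")
elif not found:
    print("index directory exists but has no segments* file")
else:
    print("segments files:", found)
```

Running this as the same user that runs Tomcat would also surface permission problems on the index directory.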

Thanks
Dipti


Re: Synonyms from Database

2010-01-10 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Sun, Jan 10, 2010 at 1:04 PM, Otis Gospodnetic
 wrote:
> Ravi,
>
> I think if your synonyms were in a DB, it would be trivial to periodically 
> dump them into a text file Solr expects.  You wouldn't want to hit the DB to 
> look up synonyms at query time...
Why at query time? Can it not be done at startup time?
>
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch
>
>
>
> ----- Original Message ----
>> From: Ravi Gidwani 
>> To: solr-user@lucene.apache.org
>> Sent: Sat, January 9, 2010 10:20:18 PM
>> Subject: Synonyms from Database
>>
>> Hi :
>>      Is there any work done in providing synonyms from a database instead of
>> synonyms.txt file ? Idea is to have a dictionary in DB that can be enhanced
>> on the fly in the application. This can then be used at query time to check
>> for synonyms.
>>
>> I know I am not putting thoughts to the performance implications of this
>> approach, but will love to hear about others thoughts.
>>
>> ~Ravi.
>
>
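
Otis's suggestion of periodically dumping the synonyms into the text file Solr expects can be sketched as below. The table name "synonyms", its columns "term" and "synonym", and the use of SQLite are all assumptions for illustration; any database with an equivalent query would do. Each output line is a comma-separated group, which is the form synonyms.txt uses for equivalent terms.

```python
import sqlite3
from collections import OrderedDict


def synonym_lines(conn):
    """Group (term, synonym) rows into comma-separated synonyms.txt lines."""
    groups = OrderedDict()
    rows = conn.execute(
        "SELECT term, synonym FROM synonyms ORDER BY term, synonym")
    for term, synonym in rows:
        groups.setdefault(term, []).append(synonym)
    return [", ".join([term] + syns) for term, syns in groups.items()]


def dump_synonyms(conn, path):
    """Write the grouped lines to path and return how many were written."""
    lines = synonym_lines(conn)
    with open(path, "w", encoding="utf-8") as out:
        out.write("\n".join(lines) + "\n")
    return len(lines)
```

A scheduled job could call dump_synonyms(conn, "synonyms.txt") against the Solr conf directory. The synonym filter reads the file when it is initialized, so a core reload or restart would presumably be needed for changes to take effect.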



-- 
-
Noble Paul | Systems Architect| AOL | http://aol.com