Google like searching

2006-07-05 Thread Andre Basse
Hi Solr users,
 
I would like to configure my Solr search to make it Google like.
With the standard setup, the "OR" operator is used between two or more
search values.
 
Example:
 
A search for  berlin robin
returns in debug mode:
berlin robin
pubtext:berlin pubtext:robin
(and will give me all documents that contain berlin or robin)
 
An "AND" search for  berlin AND Robin  
returns in debug mode :
berlin AND robin
+pubtext:berlin +pubtext:robin
(and will give me only the documents that contain both berlin and robin)
 
 
How can I set up Solr so that my users don't have to key in AND all the
time?
 
Any help is much appreciated.
 
 
Regards,
 
Andre


*
The information contained in this e-mail message and any accompanying files is 
or may be confidential.  If you are not the intended recipient, any use, 
dissemination, reliance, forwarding, printing or copying of this e-mail or any 
attached files is unauthorised. This e-mail is subject to copyright. No part of 
it should be reproduced, adapted or communicated without the written consent of 
the copyright owner. If you have received this e-mail in error, please advise 
the sender immediately by return e-mail, or telephone and delete all copies. 
Fairfax does not guarantee the accuracy or completeness of any information 
contained in this e-mail or attached files. Internet communications are not 
secure, therefore Fairfax does not accept legal responsibility for the contents 
of this message or attached files.
*
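
The two debug outputs above differ only in the "+" (required) prefix on each clause. A minimal sketch, assuming the change is made on the client side rather than in Solr's own configuration: rewrite the user's input so that every bare term is marked as required before the query is sent. The helper name require_all_terms is hypothetical, and the sketch deliberately ignores quoted phrases and field-qualified terms.

```python
def require_all_terms(user_query: str) -> str:
    """Mark every whitespace-separated term as required with a '+' prefix."""
    terms = user_query.split()
    # If the user already wrote explicit boolean syntax, leave the query alone.
    if any(t in ("AND", "OR", "NOT") for t in terms):
        return user_query
    # Keep terms that already carry a '+' or '-' prefix unchanged.
    return " ".join(t if t[0] in "+-" else "+" + t for t in terms)


if __name__ == "__main__":
    print(require_all_terms("berlin robin"))
    print(require_all_terms("berlin AND robin"))
```

With this rewrite, a search typed as "berlin robin" is sent in the same required-term form that the explicit AND query parses to.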



RE: Google like searching

2006-07-06 Thread Andre Basse
Hi Hoss,

Thank you very much. Works great!

Another question, probably more index related.
When I do a search for "ageing", my query will also return documents
that contain only the word "age" (not "ageing").
I could imagine that age == ageing, but not ageing == age.

Please, how can I change that? 


Thanks,

Andre
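
This behaviour is most likely the effect of a stemming step in the field's analysis: documents and queries are both reduced to a common stem, so the match is symmetric by construction. A toy sketch of the idea follows. The function toy_stem is a hypothetical stand-in written for illustration only; it is not the algorithm Solr actually applies.

```python
def toy_stem(word: str) -> str:
    """Very crude suffix stripping, for illustration only."""
    word = word.lower()
    for suffix in ("ing", "e"):
        # Strip the suffix only if at least two characters remain.
        if word.endswith(suffix) and len(word) - len(suffix) >= 2:
            word = word[: -len(suffix)]
    return word


def matches(query_word: str, doc_words: list) -> bool:
    """A document matches if it shares a stem with the query word."""
    indexed = {toy_stem(w) for w in doc_words}
    return toy_stem(query_word) in indexed


if __name__ == "__main__":
    print(toy_stem("ageing"), toy_stem("age"))
    print(matches("ageing", ["age"]), matches("age", ["ageing"]))
```

Because the same reduction is applied at index time and at query time, a one-directional match (age finds ageing, but not the reverse) cannot come out of stemming alone. Keeping the two words distinct would require a field whose analysis does not stem.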






SOLR stylesheet

2006-07-17 Thread Andre Basse
Hi SOLR users,
 
I know this issue has been discussed before but I'm not sure if there
was a final answer.

I would like to apply a stylesheet as mentioned in the tutorial.
 
http://localhost:8983/solr/select/?stylesheet=
 
Any ideas where to place the stylesheet, any examples available?
 
 
Thanks,
 
Andre
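
For reference, a small sketch of how the request URL with a stylesheet parameter could be assembled. The file name results.xsl and the helper select_url are hypothetical; where Solr expects the stylesheet file to live is exactly the open question above and is not assumed here.

```python
from urllib.parse import urlencode


def select_url(base: str, query: str, stylesheet: str) -> str:
    """Build a select URL carrying the query and a stylesheet name."""
    params = {"q": query, "stylesheet": stylesheet}
    return base + "?" + urlencode(params)


if __name__ == "__main__":
    # "results.xsl" is a placeholder name, not a file shipped with Solr.
    print(select_url("http://localhost:8983/solr/select/", "berlin", "results.xsl"))
```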
 
 

 






Problem with well-formed XML docs

2006-07-28 Thread Andre Basse
Hi all,
 
 
I have imported some XML documents into Solr. However, when I query for 
certain documents I get the following error message in the browser:
 
XML Parsing Error: not well-formed
Location: 
http://192.168.32.128:8983/solr/select/?stylesheet=&q=cat%0D%0A&version=2.1&start=0&rows=10&indent=on
 

 
Line Number 149, Column 185:An unusual Wemyss tabby cat, 32cm high, sporting a 
broad grin and inset green-glass eyes, caused a mild sensation on January 27 
when it fetched £23,900 at Edinburgh auctioneers Lyon & Turnbull. This was 
more than four times the upper estimate. Among a sizeable line-up of Wemyss 
porkers, a seated pig just 16cm long and painted with shamrocks fetched £4780, 
almost 10 times its estimate.
^
 
 
The error message is pointing to the & char in the result.
 
 
 
This is the part of my source XML document that shows that the "&" is 
well-formed before import: 
 
"...when it fetched £23,900 at Edinburgh auctioneers Lyon &amp; Turnbull. This 
was more than four .."
 
 
Any idea?
 
 
Any help is much appreciated!
 
 
 
 
Thanks,
 
Andre
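
The parser error above is what any XML consumer reports when a bare ampersand reaches the output. A small sketch of the difference between escaped and unescaped text, using only the Python standard library; the <str> wrapper element is an assumption made for the example and is not taken from the actual response.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

RAW_TEXT = "Edinburgh auctioneers Lyon & Turnbull"


def is_well_formed(xml_text: str) -> bool:
    """Return True if the string parses as XML, False otherwise."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


# Bare ampersand: not well-formed.
unescaped = "<str>" + RAW_TEXT + "</str>"
# Ampersand written as an entity reference: well-formed.
escaped = "<str>" + escape(RAW_TEXT) + "</str>"

if __name__ == "__main__":
    print(is_well_formed(unescaped), is_well_formed(escaped))
```

If the source document is well-formed but the response is not, the escaping is being lost somewhere between indexing and output rather than in the input file.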





RE: Problem with well-formed XML docs

2006-07-31 Thread Andre Basse
Updated to the latest build. - Problem solved.

Thanks for your help!!!




-Original Message-
From: Chris Hostetter [mailto:[EMAIL PROTECTED] 
Sent: Friday, 28 July 2006 5:42 PM
To: solr-user@lucene.apache.org
Subject: Re: Problem with well-formed XML docs


Andre, which appserver are you using to run Solr? ... there have been several 
reports of bugs with the way Jetty deals with the XML-escaped output 
produced by Solr, particularly when non-ASCII characters are involved.

If you are using a version of Jetty, have you tried using a build more recent 
than July 17th, when this patch was applied...

http://issues.apache.org/jira/browse/SOLR-32

?




-Hoss



Faceted Searching problems

2006-09-13 Thread Andre Basse
Hi all,
 
I just installed the nightly build to try faceted searching. After some 
testing I discovered that some characters are missing in the result XML and 
that field values containing "/" are sometimes split into two entries.
 
Example:
1 should be France
1 should be Culture/Festivals

Please find details below.
 
Original XML
=
 
Metro
 

Culture/Film
Culture/Festivals



France
Sydney

 
 
 
SOLR response for the query
=
(http://192.168.157.128:8983/solr/select/?q=Bellucci&rows=0&facet=true&facet.limit=5&facet.field=section&facet.field=geoloc&facet.field=classification)

[XML markup and facet names were lost in the archive; only the element values remain]

0
518

2 0 0 0 0
1 1 0 0 0
1 1 1 1 1

 
 
Any help is much appreciated!
 
 
Thanks,
 
Andre
 
 
 





RE: Faceted Searching problems

2006-09-13 Thread Andre Basse
Sorry, please ignore that email. Problem solved (I should read more mails...)

Thanks to Jeff.













RE: Faceted Searching problems

2006-09-14 Thread Andre Basse

Time to say: Thank you all for your great support!


-Andre




> You need to use an untokenized field for facets.

At least 3 answers in 5 minutes... we should try synchronized swimming
;-)

-Yonik
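
The advice quoted above explains the split entries reported earlier in the thread: facet counts are computed over indexed terms, so a tokenized field yields one facet entry per token instead of one per original value. A minimal sketch of the effect follows. The assignment of the sample values to the field names classification and geoloc is an assumption, and the tokenizer is a crude stand-in rather than Solr's own analysis chain.

```python
import re
from collections import Counter

# One sample document; the field assignment is assumed for illustration.
DOCS = [
    {
        "classification": ["Culture/Film", "Culture/Festivals"],
        "geoloc": ["France", "Sydney"],
    },
]


def facet_counts(docs, field, tokenized):
    """Count documents per indexed term of the given field."""
    counts = Counter()
    for doc in docs:
        terms = set()
        for value in doc.get(field, []):
            if tokenized:
                # Split on non-word characters and lowercase, like a text field.
                terms.update(t.lower() for t in re.split(r"\W+", value) if t)
            else:
                # Untokenized: the whole value is a single term.
                terms.add(value)
        counts.update(terms)
    return counts


if __name__ == "__main__":
    print(dict(facet_counts(DOCS, "classification", tokenized=True)))
    print(dict(facet_counts(DOCS, "classification", tokenized=False)))
```

In the tokenized case "Culture/Festivals" no longer exists as a term at all, which matches the symptom of values being split at the "/" character.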





RE: maximum index size

2007-03-27 Thread Andre Basse
> I've 50 million documents each about 10K in size and I've 4 index
> partitions each consisting of 12.5 million documents. Each index
> partition is about 80GB. A search typically takes about 3-5 seconds.
> Single word searches are faster than multi-word searches. I'm still
> working on finding the ideal index size that Solr can handle well
> within a second.

Hi Venkatesh,

I'm looking at a similar size of archive. What hardware are you running?
Do you use collection distribution?


Thanks,

Andre
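
For comparison with other archives, the figures quoted above can be put side by side with a little arithmetic. This is only a back-of-the-envelope sketch; it assumes decimal units (1 GB = 1,000,000 KB) and takes "10K" to mean 10 KB per document.

```python
TOTAL_DOCS = 50_000_000
PARTITIONS = 4
PARTITION_SIZE_GB = 80
DOC_SIZE_KB = 10
KB_PER_GB = 1_000_000  # decimal units, an assumption

docs_per_partition = TOTAL_DOCS // PARTITIONS
total_index_gb = PARTITIONS * PARTITION_SIZE_GB
raw_corpus_gb = TOTAL_DOCS * DOC_SIZE_KB / KB_PER_GB
index_kb_per_doc = PARTITION_SIZE_GB * KB_PER_GB / docs_per_partition

if __name__ == "__main__":
    print(docs_per_partition, total_index_gb, raw_corpus_gb, index_kb_per_doc)
```

Under these assumptions the index occupies roughly two thirds of the raw document size.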

