Re: synonyms

2008-03-31 Thread Lucas F. A. Teixeira

Hello All,

We've implemented a PortugueseStemmer and would like to make it available to
everyone. Where can I commit it?



[]s,

Lucas

Leonardo Santagada wrote:


On 28/03/2008, at 16:28, Lance Norskog wrote:

Lucas-

Your examples are Portuguese and Spanish. You might find a Spanish-language
stemmer that follows the very rigid conjugation in Spanish (and I'm assuming
in Portuguese as well). Spanish follows conjugation rules that embed much
more semantics than English, so a huge number of synonyms can be stemmed to
the same word.


Well, his examples are in Brazilian Portuguese, not Spanish, and the
biggest problem is that a Spanish stemmer is not going to work. I
haven't found a pt_BR stemmer; have I overlooked something?


--
Leonardo Santagada


Re: sorting on aggregate averages

2008-03-31 Thread Umar Shah
Hi,

It took me some time, but I implemented the required function by developing a
custom plugin for our specific example. However, now I have another issue:

I am computing a sorted rank list and returning a slice of it (for pagination),
but I have to recompute the result for each request. The results for the actual
q and fq parameters are cached, but the sorted list is not, and that is what I
would like to cache and reuse on subsequent requests.

I might have a look at the caching as well. Any suggestions in this regard?
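
One possible way to avoid re-sorting on every page request is a small LRU
cache that holds the fully sorted list per q/fq pair and hands out slices.
The sketch below uses only the JDK and is not Solr's own cache API; the class
name RankListCache, the key format and the use of manufacturer names as list
entries are assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical LRU cache of fully sorted rank lists, keyed by q and fq. */
final class RankListCache {
    private final Map<String, List<String>> cache;

    RankListCache(final int capacity) {
        // accessOrder=true makes the map track least-recently-used entries.
        this.cache = new LinkedHashMap<String, List<String>>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, List<String>> eldest) {
                return size() > capacity;
            }
        };
    }

    private static String key(String q, String fq) {
        return q + "\n" + fq;
    }

    /** Returns the cached sorted list, or null if it has to be recomputed. */
    synchronized List<String> get(String q, String fq) {
        return cache.get(key(q, fq));
    }

    /** Stores a defensive, read-only copy of the sorted list. */
    synchronized void put(String q, String fq, List<String> sorted) {
        cache.put(key(q, fq), Collections.unmodifiableList(new ArrayList<String>(sorted)));
    }

    /** Returns one page of a sorted list; empty if start is past the end. */
    static List<String> page(List<String> sorted, int start, int rows) {
        if (start < 0 || rows < 0) {
            throw new IllegalArgumentException("start and rows must be non-negative");
        }
        if (start >= sorted.size()) {
            return Collections.emptyList();
        }
        int end = start + Math.min(rows, sorted.size() - start);
        return new ArrayList<String>(sorted.subList(start, end));
    }
}
```

A request handler would call get() first, compute and put() the list only on
a miss, and then return page(list, start, rows). The cache would also have to
be cleared whenever a new searcher is opened, since the ranks may change.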

thanks.
-umar


On Wed, Mar 19, 2008 at 2:59 AM, Chris Hostetter <[EMAIL PROTECTED]>
wrote:

>
> : I have a problem returning a list of results which is sorted on an
> : average of ranks returned from aggregates.
> : The query would be something like:
> : q=product:p1+product:p2+product:p3; sort score desc
> : To explain: suppose I have documents with fields Product, Manufacturer,
> : Rank and I want to return the top manufacturers across products p1,p2,p3
> : with the highest average rank on these products.
>
> the topic of generating statistics on facet constraints has come up before
> ... but nothing for doing that is provided out of the box at the moment.
>
> while basic stats like the min/mean/median/stddev/max of a numeric facet
> field (in the context of a q/fq) would be relatively straightforward to
> add to Solr's built-in simple facet support, more complex types of statistics
> (like what you describe) would be difficult to implement in a way that
> would be generally reusable through simple query params ... however: it
> would probably be fairly straightforward to implement domain-specific stats
> like this directly in a custom plugin.
>
> The new SearchComponents framework available in the trunk would probably
> be an easy way to do this, although it's not very well documented at the
> moment.  If you look at the existing FacetComponent, however, seeing how it
> generates facet counts, and extending it to know about your specific
> fields and generate the type of stats you want should be possible.
>
>
>
>
> -Hoss
>
>
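
The domain-specific statistic discussed in this thread (top manufacturers by
average rank across the matched products) can be sketched independently of
Solr. The code below is only an illustration of the aggregation step a custom
plugin would perform over the matching documents; the Doc type and the class
name ManufacturerRanking are hypothetical and not part of any Solr API.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

final class ManufacturerRanking {

    /** One matched document, reduced to the two fields the aggregate needs. */
    static final class Doc {
        final String manufacturer;
        final double rank;

        Doc(String manufacturer, double rank) {
            this.manufacturer = manufacturer;
            this.rank = rank;
        }
    }

    /** Returns (manufacturer, mean rank) pairs ordered by descending mean rank. */
    static List<Map.Entry<String, Double>> topByAverageRank(List<Doc> docs) {
        // Per manufacturer: element 0 accumulates the sum, element 1 the count.
        Map<String, double[]> sumAndCount = new LinkedHashMap<String, double[]>();
        for (Doc d : docs) {
            double[] acc = sumAndCount.get(d.manufacturer);
            if (acc == null) {
                acc = new double[2];
                sumAndCount.put(d.manufacturer, acc);
            }
            acc[0] += d.rank;
            acc[1] += 1;
        }
        List<Map.Entry<String, Double>> result = new ArrayList<Map.Entry<String, Double>>();
        for (Map.Entry<String, double[]> e : sumAndCount.entrySet()) {
            Double mean = Double.valueOf(e.getValue()[0] / e.getValue()[1]);
            result.add(new AbstractMap.SimpleImmutableEntry<String, Double>(e.getKey(), mean));
        }
        result.sort(Map.Entry.<String, Double>comparingByValue().reversed());
        return result;
    }
}
```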


Re: search for non empty field

2008-03-31 Thread Matt Mitchell
Thanks Erik. I think this is the thread here:

http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200709.mbox/[EMAIL PROTECTED]

Matt

On Sun, Mar 30, 2008 at 9:50 PM, Erik Hatcher <[EMAIL PROTECTED]>
wrote:

> Documents with a particular field can be matched using:
>
>  field:[* TO *]
>
> Or documents without a particular field with:
>
>  -field:[* TO *]
>
> An empty field?  Meaning one that was indexed but with no terms?  I'm
> not sure about that one.  Seems like Hoss replied to something
> similar on this last week or so though - check the archives.
>
>Erik
>
>
> On Mar 30, 2008, at 9:43 PM, Matt Mitchell wrote:
> > I'm looking for the exact same thing.
> >
> > On Sun, Mar 30, 2008 at 8:45 PM, Ismail Siddiqui <[EMAIL PROTECTED]>
> > wrote:
> >
> >> Hi all,
> >>
> >>
> >> I have a situation where I have to filter results on a non-empty
> >> field. A wildcard won't work, as it has to match at least one letter.
> >> How can I form a query to return results where a particular field is
> >> non-empty?
> >>
> >>
> >>
> >> Ismail
> >>
>
>
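
When the open-ended range query quoted above is sent over HTTP, the brackets,
colon and spaces have to be URL-encoded. A minimal sketch of building the q
parameter follows; the class name NonEmptyFieldQuery and the field name used
in the test are made up for illustration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class NonEmptyFieldQuery {

    /** Matches documents that have at least one term in the field. */
    static String hasField(String field) {
        return field + ":[* TO *]";
    }

    /** Matches documents that have no term in the field. */
    static String lacksField(String field) {
        return "-" + field + ":[* TO *]";
    }

    /** URL-encodes a query as UTF-8 and returns it as a q=... parameter. */
    static String asQueryParam(String query) {
        try {
            return "q=" + URLEncoder.encode(query, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```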


Re: Message = "The remote server returned an error: (500) Internal Server Error."

2008-03-31 Thread Ryan McKinley

Are you using jetty?

I forget the JIRA issue to point you to, but (assuming it is Jetty),
this has something to do with the war file extracting itself again.
The solution is to change the directory it is configured to use.


The default jetty settings included in the nightly builds should avoid  
this problem


ryan


On Mar 28, 2008, at 2:35 AM, farhanali wrote:



Hi:

When I post an XML file to Solr, the data is indexed, but if I post the same
file again after a week or two, I usually get this error: "The remote server
returned an error: (500) Internal Server Error."
I don't know what the problem is.

If I create a new instance of Solr, place "solr.config" and "schema.xml"
into it, and post the same XML again, then it is accepted.

I want to know why, after some time, Solr refuses to index the same file
that it accepted earlier.

Regards

Farhan Ali


--
View this message in context: 
http://www.nabble.com/Message-%3D-%22The-remote-server-returned-an-error%3A-%28500%29-Internal-Server-Error.%22-tp16346616p16346616.html
Sent from the Solr - User mailing list archive at Nabble.com.





Re: Can We append a field to the response that is not in the index but computed at runtime.

2008-03-31 Thread Ryan McKinley

Without writing any custom code, no.

If you write a "SearchComponent" http://wiki.apache.org/solr/SearchComponent
 -- you can programmatically change the response at runtime.


ryan



On Mar 28, 2008, at 3:38 AM, Umar Shah wrote:


Hi,

I wanted to know whether we can append a field (Fdyn say) to each doc in the
returned set
Fdyn is computed as some complex function of the fields stored in the index
during the runtime in SOLR.



-umar




Re: Can We append a field to the response that is not in the index but computed at runtime.

2008-03-31 Thread Umar Shah
thanks ryan for the reply.

I have looked at the prepare and process methods in the SearchComponents (Query,
Filter etc).
I'm using all the default components to prepare and then process the results,
and then I compute a custom field by iterating through all the documents in
the result set. Having created this field for each document, how do I
add the corresponding custom field to each document in the response set?


On Mon, Mar 31, 2008 at 6:25 PM, Ryan McKinley <[EMAIL PROTECTED]> wrote:

> Without writing any custom code, no.
>
> If you write a "SearchComponent"
> http://wiki.apache.org/solr/SearchComponent
>  -- you can programatically change the response at runtime.
>
> ryan
>
>
>
> On Mar 28, 2008, at 3:38 AM, Umar Shah wrote:
>
> > Hi,
> >
> > I wanted to know whether we can append a field (Fdyn say) to each
> > doc in the
> > returned set
> > Fdyn is computed as some complex function of the fields stored in
> > the index
> > during the runtime in SOLR.
> >
> >
> >
> > -umar
>
>


Is number of stored fields affects query performance?

2008-03-31 Thread Evgeniy Strokin
I have two questions related to the subject:
 
1. If I have 100 fields in my document, all indexed. Will my queries run slower 
if I store all 100 fields or just 10?
 
2. If I have 100 fields in my documents, all stored. Will my queries run slower 
if I index all 100 fields or just 10?
 
Thanks in advance,
Eugene

Re: Can We append a field to the response that is not in the index but computed at runtime.

2008-03-31 Thread Ryan McKinley

Two approaches:
1. make a map and add it to the response:
  rb.rsp.add( "mystuff", mymap );

2. Augment the documents with a field value -- this is a bit more
complex and runs the risk of name collisions with fields in your
documents.  You can pull the DocList out from the response and add
fields to each document.


If #1 works, go with that...
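
Approach 1 can be sketched without any Solr classes: compute the extra value
for each document and collect the values in a map keyed by the document's
unique key, which a component could then hand to rb.rsp.add. Everything below
is a stand-in for illustration: the documents are plain field maps, and the
class name ComputedFieldMap, the id field and the function are assumptions.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class ComputedFieldMap {

    /**
     * Builds {unique key -> computed value} in result order.
     * Each input map stands in for the stored fields of one returned document.
     */
    static Map<String, Object> build(List<Map<String, Object>> docs,
                                     String idField,
                                     Function<Map<String, Object>, Object> compute) {
        Map<String, Object> out = new LinkedHashMap<String, Object>();
        for (Map<String, Object> doc : docs) {
            out.put(String.valueOf(doc.get(idField)), compute.apply(doc));
        }
        return out;
    }
}
```

The client then joins the extra map to the documents by id, which keeps the
document list itself untouched and avoids field name collisions.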

ryan



On Mar 31, 2008, at 9:51 AM, Umar Shah wrote:


thanks ryan for the reply.

I have looked at the prepare and process methods in SearchComponents(Query,
Filter etc).
I'm using all the default components to prepare and then process the results.
and then prepare a custom field after iterating through all the documents in
the result set. After having created this field for each document how do I
add corresponding custom field to each document in the response set.


On Mon, Mar 31, 2008 at 6:25 PM, Ryan McKinley <[EMAIL PROTECTED]>  
wrote:



Without writing any custom code, no.

If you write a "SearchComponent"
http://wiki.apache.org/solr/SearchComponent
-- you can programatically change the response at runtime.

ryan



On Mar 28, 2008, at 3:38 AM, Umar Shah wrote:


Hi,

I wanted to know whether we can append a field (Fdyn say) to each
doc in the
returned set
Fdyn is computed as some complex function of the fields stored in
the index
during the runtime in SOLR.



-umar







Solr GET requests return quickly, POST requests take very long, why?

2008-03-31 Thread jnagro

Hello,

Earlier this week we started experiencing a strange situation with our Solr
installation. We have a home-grown query tool which started to time out (we
had the timeout set low, at 2 seconds, which was always more than enough). After
some rather in-depth investigation, it appears that Solr is processing POST
requests much, much slower than GET requests. Using the Solr admin interface
I can GET a search in under a second (about 0.2 to be exact), and when I
change that form to be a POST form (using the Firefox Web Developer toolbar)
suddenly the request takes ~5-7 seconds, sometimes longer, to return. Our
query tool needs to make POSTs because we run into the max URL length
problem very quickly with some of the queries we need to run. Any thoughts?
I was going to try a newer build, but it seems strange that we would
all of a sudden run into this issue.

Thanks!

-John
-- 
View this message in context: 
http://www.nabble.com/Solr-GET-requests-return-quickly%2C-POST-requests-take-very-long%2C-why--tp16396262p16396262.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Can We append a field to the response that is not in the index but computed at runtime.

2008-03-31 Thread Umar Shah
On Mon, Mar 31, 2008 at 7:38 PM, Ryan McKinley <[EMAIL PROTECTED]> wrote:

> Two approaches:
> 1. make a map and add it to the response:
>   rb.rsp.add( "mystuff", mymap );
>

I tried using both Map and NamedList.

That appends to the results as a whole, but
I have to attach the corresponding field to each document.


> 2. Augment the documents with a field value -- this is a bit more
> complex and runs the risk of name collisions with fields in your
> documents.  You can pull the docLIst out from the response and add
> fields to each document.

This seems more appropriate.
I'm okay with resolving name collisions. How do I add the field? Are there any
specific methods to do that?


> If #1 works, go with that...
>
> ryan
>
>
>
> On Mar 31, 2008, at 9:51 AM, Umar Shah wrote:
>
> > thanks ryan for the reply.
> >
> > I have looked at the prepare and process methods in
> > SearchComponents(Query,
> > Filter etc).
> > I'm using all the default components to prepare and then process the
> > reults.
> > and then prepare a custom field after iterating through all the
> > documents in
> > the result set. After having created this field for each document
> > how do I
> > add corresponding custom field to each document in the response set.
> >
> >
> > On Mon, Mar 31, 2008 at 6:25 PM, Ryan McKinley <[EMAIL PROTECTED]>
> > wrote:
> >
> >> Without writing any custom code, no.
> >>
> >> If you write a "SearchComponent"
> >> http://wiki.apache.org/solr/SearchComponent
> >> -- you can programatically change the response at runtime.
> >>
> >> ryan
> >>
> >>
> >>
> >> On Mar 28, 2008, at 3:38 AM, Umar Shah wrote:
> >>
> >>> Hi,
> >>>
> >>> I wanted to know whether we can append a field (Fdyn say) to each
> >>> doc in the
> >>> returned set
> >>> Fdyn is computed as some complex function of the fields stored in
> >>> the index
> >>> during the runtime in SOLR.
> >>>
> >>>
> >>>
> >>> -umar
> >>
> >>
>
>


Solr interprets UTF-8 as ISO-8859-1

2008-03-31 Thread Daniel Löfquist

Hello,

We're building a webapplication that uses Solr for searching and I've
come upon a problem that I can't seem to get my head around.

We have a servlet that accepts input via XML-RPC and based on that input
constructs the correct URL to perform a search with the Solr-servlet.

I know that the call to Solr (the URL) from our servlet looks like this
(which is what it should look like):

http://myserver:8080/solrproducts/select/?q=all_SV:ljusblå+status:online&fl=id%2Cartno%2Ctitle_SV%2CtitleSort_SV%2Cdescription_SV%2C&sort=titleSort_SV+asc,id+asc&start=0&q.op=AND&rows=25

But Solr reports the input-fields (the GET-variables in the URL) as:

INFO: /select/
fl=id,artno,title_SV,titleSort_SV,description_SV,&sort=titleSort_SV+asc,id+asc&start=0&q=all_SV:ljusblå+status:online&q.op=AND&rows=25

which is all fine except where it says "ljusblå". Apparently Solr is
interpreting the UTF-8 string "ljusblå" as ISO-8859-1 and thus creates
this garbage that makes the search return 0 when it should in reality
return 3 hits.

All other searches that don't use special characters work 100% fine.

I'm new to Solr so I'm not sure what I'm doing wrong here. Can anybody
help me out and point me in the direction of a solution?

Sincerely,

Daniel Löfquist



Re: Solr interprets UTF-8 as ISO-8859-1

2008-03-31 Thread Sean Timm
Send the URL with the å character URL encoded as %C3%A5.  That is the 
UTF-8 URL encoding.


http://myserver:8080/solrproducts/select/?q=all_SV:ljusbl%C3%A5+status:online&fl=id%2Cartno%2Ctitle_SV%2CtitleSort_SV%2Cdescription_SV%2C&sort=titleSort_SV+asc,id+asc&start=0&q.op=AND&rows=25
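
On the calling side, the percent-encoding can be produced with the JDK rather
than by hand. A minimal sketch, assuming the calling servlet is written in
Java; the class name Utf8QueryEncoding is made up. Note that URLEncoder also
encodes the colon as %3A, which is equivalent for the server.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class Utf8QueryEncoding {

    /** Percent-encodes a parameter value using UTF-8 bytes. */
    static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // \u00e5 is the letter a-with-ring, written as an escape so the
        // example does not depend on the source file encoding.
        System.out.println("q=" + encode("all_SV:ljusbl\u00e5 status:online"));
    }
}
```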

-Sean


Daniel Löfquist wrote:

Hello,

We're building a webapplication that uses Solr for searching and I've
come upon a problem that I can't seem to get my head around.

We have a servlet that accepts input via XML-RPC and based on that input
constructs the correct URL to perform a search with the Solr-servlet.

I know that the call to Solr (the URL) from our servlet looks like this
(which is what it should look like):

http://myserver:8080/solrproducts/select/?q=all_SV:ljusblå+status:online&fl=id%2Cartno%2Ctitle_SV%2CtitleSort_SV%2Cdescription_SV%2C&sort=titleSort_SV+asc,id+asc&start=0&q.op=AND&rows=25 



But Solr reports the input-fields (the GET-variables in the URL) as:

INFO: /select/
fl=id,artno,title_SV,titleSort_SV,description_SV,&sort=titleSort_SV+asc,id+asc&start=0&q=all_SV:ljusblå+status:online&q.op=AND&rows=25 



which is all fine except where it says "ljusblå". Apparently Solr is
interpreting the UTF-8 string "ljusblå" as ISO-8859-1 and thus creates
this garbage that makes the search return 0 when it should in reality
return 3 hits.

All other searches that don't use special characters work 100% fine.

I'm new to Solr so I'm not sure what I'm doing wrong here. Can anybody
help me out and point me in the direction of a solution?

Sincerely,

Daniel Löfquist



Re: Solr interprets UTF-8 as ISO-8859-1

2008-03-31 Thread Siegfried Goeschl

Hi Daniel,

the following topic might help (at least it did the trick for me using
German characters)


http://wiki.apache.org/solr/FAQ - Why don't International Characters Work?

So I wrote the following servlet filter (taken from the Wiki/mailing list)

import org.apache.solr.servlet.SolrDispatchFilter;

import javax.servlet.FilterChain;
import javax.servlet.ServletException;
import javax.servlet.ServletRequest;
import javax.servlet.ServletResponse;
import java.io.IOException;

/**
 * A workaround for the case where the URL parameters are encoded using UTF-8
 * but no character encoding is defined. So enforce UTF-8 to make it work with
 * German characters.
 */
public class CdpSolrDispatchFilter extends SolrDispatchFilter {

  @Override
  public void doFilter(ServletRequest request, ServletResponse response,
      FilterChain chain) throws IOException, ServletException {

    // Fall back to UTF-8 only when the client did not declare an encoding
    String encoding = request.getCharacterEncoding();
    if (null == encoding) {
      // Set your default encoding here
      request.setCharacterEncoding("UTF-8");
    } else {
      request.setCharacterEncoding(encoding);
    }

    super.doFilter(request, response, chain);
  }
}

Cheers,

Siegfried Goeschl

Daniel Löfquist wrote:

Hello,

We're building a webapplication that uses Solr for searching and I've
come upon a problem that I can't seem to get my head around.

We have a servlet that accepts input via XML-RPC and based on that input
constructs the correct URL to perform a search with the Solr-servlet.

I know that the call to Solr (the URL) from our servlet looks like this
(which is what it should look like):

http://myserver:8080/solrproducts/select/?q=all_SV:ljusblå+status:online&fl=id%2Cartno%2Ctitle_SV%2CtitleSort_SV%2Cdescription_SV%2C&sort=titleSort_SV+asc,id+asc&start=0&q.op=AND&rows=25 



But Solr reports the input-fields (the GET-variables in the URL) as:

INFO: /select/
fl=id,artno,title_SV,titleSort_SV,description_SV,&sort=titleSort_SV+asc,id+asc&start=0&q=all_SV:ljusblå+status:online&q.op=AND&rows=25 



which is all fine except where it says "ljusblå". Apparently Solr is
interpreting the UTF-8 string "ljusblå" as ISO-8859-1 and thus creates
this garbage that makes the search return 0 when it should in reality
return 3 hits.

All other searches that don't use special characters work 100% fine.

I'm new to Solr so I'm not sure what I'm doing wrong here. Can anybody
help me out and point me in the direction of a solution?

Sincerely,

Daniel Löfquist





Re: Solr GET requests return quickly, POST requests take very long, why?

2008-03-31 Thread Ryan McKinley

what app container are you running on?  (jetty? tomcat? resin...)

what version of solr are you running?

In solr -- the request goes through all the same hoops if it is GET or  
POST, so I suspect it is something to do with the container... but  
honestly don't know.
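
One way to take the browser out of the picture is to time the same query as
GET and as POST from a small client. The following is only a sketch using the
JDK's HttpURLConnection; the class name GetVsPostTimer is made up, and the
select URL and parameters are whatever your installation uses.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.io.UnsupportedEncodingException;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;
import java.nio.charset.StandardCharsets;
import java.util.Map;

final class GetVsPostTimer {

    /** Encodes parameters as application/x-www-form-urlencoded (UTF-8). */
    static String formEncode(Map<String, String> params) {
        StringBuilder sb = new StringBuilder();
        try {
            for (Map.Entry<String, String> e : params.entrySet()) {
                if (sb.length() > 0) {
                    sb.append('&');
                }
                sb.append(URLEncoder.encode(e.getKey(), "UTF-8"))
                  .append('=')
                  .append(URLEncoder.encode(e.getValue(), "UTF-8"));
            }
        } catch (UnsupportedEncodingException ex) {
            throw new IllegalStateException("UTF-8 is always supported", ex);
        }
        return sb.toString();
    }

    /** Sends the query, reads the whole response, returns elapsed milliseconds. */
    static long timeMillis(String selectUrl, Map<String, String> params, boolean post)
            throws IOException {
        String encoded = formEncode(params);
        long start = System.nanoTime();
        URL url = new URL(post ? selectUrl : selectUrl + "?" + encoded);
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        conn.setUseCaches(false);
        try {
            if (post) {
                conn.setRequestMethod("POST");
                conn.setDoOutput(true);
                conn.setRequestProperty("Content-Type",
                        "application/x-www-form-urlencoded; charset=UTF-8");
                try (OutputStream os = conn.getOutputStream()) {
                    os.write(encoded.getBytes(StandardCharsets.UTF_8));
                }
            }
            try (InputStream in = conn.getInputStream()) {
                byte[] buf = new byte[8192];
                while (in.read(buf) != -1) {
                    // drain the response so the full transfer is timed
                }
            }
        } finally {
            conn.disconnect();
        }
        return (System.nanoTime() - start) / 1000000L;
    }
}
```

Running both variants several times against the same query should show
whether the gap also exists outside the browser.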


ryan


On Mar 31, 2008, at 10:11 AM, jnagro wrote:



Hello,

Earlier this week we started experiencing a strange situation with our Solr
installation. We have a home-grown query tool which started to time out (we
had the timeout set low, at 2 seconds, which was always more than enough). After
some rather in-depth investigation, it appears that Solr is processing POST
requests much, much slower than GET requests. Using the Solr admin interface
I can GET a search in under a second (about 0.2 to be exact), and when I
change that form to be a POST form (using the Firefox Web Developer toolbar)
suddenly the request takes ~5-7 seconds, sometimes longer, to return. Our
query tool needs to make POSTs because we run into the max URL length
problem very quickly with some of the queries we need to run. Any thoughts?
I was going to try a newer build, but it seems strange that we would
all of a sudden run into this issue.

Thanks!

-John
--
View this message in context: 
http://www.nabble.com/Solr-GET-requests-return-quickly%2C-POST-requests-take-very-long%2C-why--tp16396262p16396262.html
Sent from the Solr - User mailing list archive at Nabble.com.





federated search - Carrot Clustering Engine

2008-03-31 Thread Grégoire Neuville
Hi,

I've recently posted a message on this list about the ways of
implementing a 'federated search' (a search on multiple indices) with
Solr. Starting from the answers I got, plus information from the
mailing list archives and the Solr Wiki, I've planned to test three
configurations:

- a single index for all applications (each one queries the index
its own way)
- one index and one schema.xml per application (each index managed by
a Core of the MultiCore system)
- one index per application and a single schema.xml for them all (each
'shard' also managed individually by a Solr Core)

What do you think of those possibilities? Have you tested and/or
compared them yourself?

Secondly, as I am currently developing a portal for federated search,
I'm getting interested in the clustering engine Carrot²
(http://project.carrot2.org); as it is described as being able to
read results produced by Solr, I was wondering if any of you Solr
users had already used them together?

Thanks,
-- 
Grégoire Neuville


Re: Can We append a field to the response that is not in the index but computed at runtime.

2008-03-31 Thread Siegfried Goeschl

Hi folks,

I had to solve a similar problem with SOLR 1.2 and used a custom
org.apache.solr.request.QueryResponseWriter - you can trigger your
custom response writer using SOLR admin, but it is not an elegant
solution (I think the XMLWriter is a final class, therefore some
copy&waste code)


Cheers,

Siegfried Goeschl



Umar Shah wrote:

On Mon, Mar 31, 2008 at 7:38 PM, Ryan McKinley <[EMAIL PROTECTED]> wrote:

  

Two approaches:
1. make a map and add it to the response:
  rb.rsp.add( "mystuff", mymap );




I tried using both  Map/ NamedList

it appends to the results
I have to attach each document with corresponding field.


  

2. Augment the documents with a field value -- this is a bit more
complex and runs the risk of name collisions with fields in your
documents.  You can pull the docLIst out from the response and add
fields to each document.



this seems more appropriate,
I'm okay, to resolve name collision , how do I add the  field.. any specific
methods to do that?


  

If #1 works, go with that...

ryan



On Mar 31, 2008, at 9:51 AM, Umar Shah wrote:



thanks ryan for the reply.

I have looked at the prepare and process methods in
SearchComponents(Query,
Filter etc).
I'm using all the default components to prepare and then process the
reults.
and then prepare a custom field after iterating through all the
documents in
the result set. After having created this field for each document
how do I
add corresponding custom field to each document in the response set.


On Mon, Mar 31, 2008 at 6:25 PM, Ryan McKinley <[EMAIL PROTECTED]>
wrote:

  

Without writing any custom code, no.

If you write a "SearchComponent"
http://wiki.apache.org/solr/SearchComponent
-- you can programatically change the response at runtime.

ryan



On Mar 28, 2008, at 3:38 AM, Umar Shah wrote:



Hi,

I wanted to know whether we can append a field (Fdyn say) to each
doc in the
returned set
Fdyn is computed as some complex function of the fields stored in
the index
during the runtime in SOLR.



-umar
  




  


schema version bug

2008-03-31 Thread Andrew Nagy
Hello - I stumbled upon an odd bug, or what appears to be a bug, today.  I have
been using my own custom version numbers for my schema and tried to change the
version number from 0.8 to 0.8.1, which rendered Solr useless, yielding a schema
parsing error.  I then tried to change it to 0.8-1 with the same results.

Is this a bug or a "feature"?

Thanks
Andrew


Re: Solr GET requests return quickly, POST requests take very long, why?

2008-03-31 Thread Vinci

Hi,

I don't use POST requests for queries, but I think you should first check what
your browser actually sends, using Firebug. It would also be more helpful
if you could tell us how long your queries are, and give some information
about your query server, etc.

Quick comment: POST usually needs a longer time to process compared to GET.
But I think you should pay more attention to how you use the search
engine. Maybe synonyms could help you reduce the amount of information
the user sends.

Thank you,
Vinci

jnagro wrote:
> 
> Hello,
> 
> Earlier this week we started experiencing a strange situation with our
> Solr installation. We have a home-grown query tool which started to
> timeout (we had it set low at 2seconds which was always more than enough).
> In doing some rather in-depth investigation its appears that Solr is
> processing POST requests much much slower than GET requests. Using the
> solr admin interface i can GET a search in under a second (about 0.2 to be
> exact) and when i change that form to be a POST form (using firefox
> web-developer toolbar) suddenly the request takes ~5-7 seconds - sometimes
> longer - to return. Our query tool needs to make POSTS because we run into
> the max URL length problem very quickly with some of the queries we need
> to run. Any thoughts? I was going to try a newer build but it seems
> strange that we would all-of-a-sudden run into this issue.
> 
> Thanks!
> 
> -John
> 

-- 
View this message in context: 
http://www.nabble.com/Solr-GET-requests-return-quickly%2C-POST-requests-take-very-long%2C-why--tp16396262p16397736.html
Sent from the Solr - User mailing list archive at Nabble.com.



Indexing a word in url

2008-03-31 Thread Vinci

Hi all,

I would like to ask: if I want to index the words in a URL, which data type and
parser should I use?

Thank you,
Vinci
-- 
View this message in context: 
http://www.nabble.com/Indexing-a-word-in-url-tp16397739p16397739.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Solr interprets UTF-8 as ISO-8859-1

2008-03-31 Thread Uwe Klosa
You should set URIEncoding="UTF-8" in your application server. For Tomcat
you can do that on the Connector in server.xml. For Glassfish you have to create a
sun-web.xml containing the corresponding parameters. Your application server
should provide a similar mechanism.
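
What such a setting changes can be shown with the JDK alone: the same
percent-encoded bytes decode to different text depending on the charset the
server assumes. This is only an illustration of the effect, not container
code; the class name UriDecodingDemo is made up.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;

final class UriDecodingDemo {

    /** Decodes a percent-encoded value, interpreting the bytes in the given charset. */
    static String decode(String raw, String charset) {
        try {
            return URLDecoder.decode(raw, charset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charset, e);
        }
    }

    public static void main(String[] args) {
        String raw = "ljusbl%C3%A5";
        // Decoded as UTF-8 the two bytes form one letter (7 characters in total);
        // decoded as ISO-8859-1 each byte becomes its own character (8 in total).
        System.out.println(decode(raw, "UTF-8").length());
        System.out.println(decode(raw, "ISO-8859-1").length());
    }
}
```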

Uwe

On Mon, Mar 31, 2008 at 4:32 PM, Daniel Löfquist <
[EMAIL PROTECTED]> wrote:

> Hello,
>
> We're building a webapplication that uses Solr for searching and I've
> come upon a problem that I can't seem to get my head around.
>
> We have a servlet that accepts input via XML-RPC and based on that input
> constructs the correct URL to perform a search with the Solr-servlet.
>
> I know that the call to Solr (the URL) from our servlet looks like this
> (which is what it should look like):
>
> http://myserver:8080/solrproducts/select/?q=all_SV:ljusblå+status:online&fl=id%2Cartno%2Ctitle_SV%2CtitleSort_SV%2Cdescription_SV%2C&sort=titleSort_SV+asc,id+asc&start=0&q.op=AND&rows=25
>
> But Solr reports the input-fields (the GET-variables in the URL) as:
>
> INFO: /select/
>
> fl=id,artno,title_SV,titleSort_SV,description_SV,&sort=titleSort_SV+asc,id+asc&start=0&q=all_SV:ljusblå+status:online&q.op=AND&rows=25
>
> which is all fine except where it says "ljusblå". Apparently Solr is
> interpreting the UTF-8 string "ljusblå" as ISO-8859-1 and thus creates
> this garbage that makes the search return 0 when it should in reality
> return 3 hits.
>
> All other searches that don't use special characters work 100% fine.
>
> I'm new to Solr so I'm not sure what I'm doing wrong here. Can anybody
> help me out and point me in the direction of a solution?
>
> Sincerely,
>
> Daniel Löfquist
>
>


Re: schema version bug

2008-03-31 Thread Chris Hostetter

: Hello - I stumbled upon a odd bug, or what appears to be a bug, today.  
: I have been using my own custom version numbers for my schema and tried 
: to change the version number from 0.8 to 0.8.1 which rendered solr 
: useless yielding a schema parsing error.  I then tried to change it to 
: 0.8-1 with the same results.

the "version" attribute of a schema has a very specific meaning relating 
to how the schema.xml file will be parsed.  There is a note about this in 
the example schema.xml...

http://svn.apache.org/viewvc/lucene/solr/trunk/example/solr/conf/schema.xml?view=markup


  


...that said, Solr should probably warn you (and function as if the 
default version is specified) if an unrecognized version number is found.  
feel free to open a bug to do that.
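
The suggested behaviour (warn and fall back to the default) could look roughly
like the sketch below. It assumes that the version attribute is read as a
floating-point number, which would explain why 0.8.1 and 0.8-1 are rejected
while 0.8 is accepted; the class name SchemaVersion and the default of 1.0 are
assumptions for illustration, not Solr's actual code.

```java
final class SchemaVersion {

    /** Assumed default; the real default is whatever Solr defines. */
    static final float DEFAULT_VERSION = 1.0f;

    /** Parses the version attribute, warning and falling back if it is not a number. */
    static float parseOrDefault(String raw) {
        if (raw == null) {
            return DEFAULT_VERSION;
        }
        try {
            return Float.parseFloat(raw.trim());
        } catch (NumberFormatException e) {
            System.err.println("WARNING: unrecognized schema version '" + raw
                    + "', using " + DEFAULT_VERSION);
            return DEFAULT_VERSION;
        }
    }
}
```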


-Hoss



Re: Search fail if copyField absent?(+ Jetty Question)

2008-03-31 Thread Chris Hostetter

: I doesn't change other thing about field in default schema, so I think you
: are correct...Then here is one question: Can I use parameter to change the
: search field when query come in?

You can change the query string, or you can change the "df" default search
field, or if you are using the dismax syntax you can change the "qf"
query fields.

http://wiki.apache.org/solr/StandardRequestHandler#head-54c4743325ef5891a2c734963bfa367d89d7adc5
http://wiki.apache.org/solr/DisMaxRequestHandler#head-af452050ee272a1c88e2ff89dc0012049e69e180

Or you could change "The Default Search Field" right in the schema if you
don't plan on using the "text" field at all...

http://wiki.apache.org/solr/SchemaXml#head-b80c539a0a01eef8034c3776e49e8fe1c064f496
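
As a rough sketch of the per-request options, the parameters can be appended
to the query string like this. The df and qf parameter names come from the
message above; the class name SearchFieldParams and the field names in the
test are made up, and qf only has an effect when the request is handled by a
dismax handler.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

final class SearchFieldParams {

    private static String enc(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    /** Standard handler: override the default search field for this request. */
    static String withDefaultField(String q, String defaultField) {
        return "q=" + enc(q) + "&df=" + enc(defaultField);
    }

    /** Dismax handler: override the query fields (and boosts) for this request. */
    static String withQueryFields(String q, String queryFields) {
        return "q=" + enc(q) + "&qf=" + enc(queryFields);
    }
}
```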


-Hoss



Re: Solr GET requests return quickly, POST requests take very long, why?

2008-03-31 Thread jnagro

I appreciate the response. We're running Tomcat/Apache at the moment. All of
these questions are good; however, they don't really explain why this would
be happening so suddenly and why there is such a wide difference between
POST and GET. Do you have any other thoughts that I could investigate? Some
of these queries are taking almost 20 seconds to return, but running them as
GETs returns them in under a second (we've even restarted servers to ensure
the query cache was cleared).

We are running a nightly build - we might try a newer one.

I will try to get some more info for you, but any other insight would be
helpful.


Vinci wrote:
> 
> Hi,
> 
> I don't use POST request for query, but I think you should check what is
> actually your browser sent by firebug first...Also, it will be more
> helpful if you can tell us how long of your query are, and some
> information of  your query server, etc
> 
> Quick comment: POST usually need longer time to process when compare to
> GET. But I think you should pay more attention of your use on the search
> engine..may be synonymy may help you to reduce the amount of
> information of user sent. 
> 
> Thank you,
> Vinci
> 
> jnagro wrote:
>> 
>> Hello,
>> 
>> Earlier this week we started experiencing a strange situation with our
>> Solr installation. We have a home-grown query tool which started to
>> timeout (we had it set low at 2seconds which was always more than
>> enough). In doing some rather in-depth investigation its appears that
>> Solr is processing POST requests much much slower than GET requests.
>> Using the solr admin interface i can GET a search in under a second
>> (about 0.2 to be exact) and when i change that form to be a POST form
>> (using firefox web-developer toolbar) suddenly the request takes ~5-7
>> seconds - sometimes longer - to return. Our query tool needs to make
>> POSTS because we run into the max URL length problem very quickly with
>> some of the queries we need to run. Any thoughts? I was going to try a
>> newer build but it seems strange that we would all-of-a-sudden run into
>> this issue.
>> 
>> Thanks!
>> 
>> -John
>> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Solr-GET-requests-return-quickly%2C-POST-requests-take-very-long%2C-why--tp16396262p16398625.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Can We append a field to the response that is not in the index but computed at runtime.

2008-03-31 Thread Chris Hostetter

: > 2. Augment the documents with a field value -- this is a bit more
: > complex and runs the risk of name collisions with fields in your
: > documents.  You can pull the docLIst out from the response and add
: > fields to each document.
: 
: this seems more appropriate,
: I'm okay, to resolve name collision , how do I add the  field.. any specific
: methods to do that?

I *think* the missing step here is that while DocLists can't easily be
modified, new SolrDocument and SolrDocumentList classes have been
added since 1.2.  Solr by default doesn't use them, but the built-in Solr
ResponseWriters can output them, so your custom component can build a
SolrDocumentList based on the DocList, and add whatever fields you want.

I'm not sure if there are any helper methods to do the
DocList->SolrDocumentList conversion.

(Ryan: keep me honest if this isn't what you had in mind)
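
The copy-and-augment idea can be sketched with plain JDK collections standing
in for the Solr types: each map plays the role of one document, and the
returned list plays the role of the new document list. This is not the Solr
API; the class name DocumentAugmenter and the collision check are assumptions
for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.function.Function;

final class DocumentAugmenter {

    /**
     * Copies every document and adds one computed field to each copy.
     * Fails fast if the new field name collides with an existing field.
     */
    static List<Map<String, Object>> augment(List<Map<String, Object>> docs,
                                             String newField,
                                             Function<Map<String, Object>, Object> compute) {
        List<Map<String, Object>> out = new ArrayList<Map<String, Object>>(docs.size());
        for (Map<String, Object> doc : docs) {
            if (doc.containsKey(newField)) {
                throw new IllegalArgumentException("field name collision: " + newField);
            }
            Map<String, Object> copy = new LinkedHashMap<String, Object>(doc);
            copy.put(newField, compute.apply(doc));
            out.add(copy);
        }
        return out;
    }
}
```

In a real component the input would be read from the DocList and the copies
would be SolrDocument instances collected in a SolrDocumentList.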




-Hoss



Re: Can We append a field to the response that is not in the index but computed at runtime.

2008-03-31 Thread Ryan McKinley


On Mar 31, 2008, at 2:43 PM, Chris Hostetter wrote:



: > 2. Augment the documents with a field value -- this is a bit more
: > complex and runs the risk of name collisions with fields in your
: > documents.  You can pull the DocList out from the response and add
: > fields to each document.
:
: this seems more appropriate,
: I'm okay with resolving name collisions; how do I add the field? Any specific
: methods to do that?

I *think* the missing step here is that while DocLists can't easily be
modified, new SolrDocument and SolrDocumentList classes have been
added since 1.2.  Solr by default doesn't use them, but the built-in Solr
ResponseWriters can output them, so your custom component can build a
SolrDocumentList based on the DocList, and add whatever fields you want.

I'm not sure if there are any helper methods to do the
DocList->SolrDocumentList conversion.

(Ryan: keep me honest if this isn't what you had in mind)



Correct -- there isn't really a clean way to do this.

For an example, you can check the "locallucene" query component:
https://locallucene.svn.sourceforge.net/svnroot/locallucene/trunk/localsolr/src/com/pjaol/search/solr/component/LocalSolrQueryComponent.java
(towards the bottom of process)

that adds a calculated "geo_distance" field to each returned document.

ryan

















Re: Solr GET requests return quickly, POST requests take very long, why?

2008-03-31 Thread Erik Hatcher
It's probably the HTTP caching :)  Catches me off guard often too  
when hitting Solr from a browser.  Shift-refresh or something like  
that to force it to GET without caching headers sent.


There should really be no difference in speed between GET and POST to  
Solr, all caching aside.
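
One way to take the browser (and its cache) out of the picture is to send the same query as a GET and as a POST from a small script and time both. The following is only a rough sketch in plain Python: the URL in SOLR_SELECT and the helper names build_requests and time_request are assumptions for illustration, not anything that ships with Solr.

```python
import time
import urllib.parse
import urllib.request

# Hypothetical Solr endpoint; adjust host, port and path to your install.
SOLR_SELECT = "http://localhost:8983/solr/select"


def build_requests(params, base_url=SOLR_SELECT):
    """Return (get_request, post_request) carrying the same parameters."""
    encoded = urllib.parse.urlencode(params)
    get_req = urllib.request.Request(base_url + "?" + encoded)
    # Supplying a body makes urllib send the request as a POST.
    post_req = urllib.request.Request(
        base_url,
        data=encoded.encode("utf-8"),
        headers={"Content-Type": "application/x-www-form-urlencoded"},
    )
    return get_req, post_req


def time_request(req):
    """Send one request and return the elapsed wall-clock seconds."""
    start = time.perf_counter()
    with urllib.request.urlopen(req) as resp:
        resp.read()
    return time.perf_counter() - start
```

Calling time_request() on each of the two requests a few times, against a running server, should show whether the gap is still there once the browser is not involved.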


Erik


On Mar 31, 2008, at 2:39 PM, jnagro wrote:


I appreciate the response. We're running tomcat/apache at the moment. All of
these questions are good, however it doesn't really explain why this would
be happening so suddenly and why there is such a wide difference between
POST and GET. Do you have any other thoughts that i could investigate? Some
of these queries are taking almost 20 seconds to return, but running them as
GETS returns them in under a second (we've even restarted servers to ensure
query caching was cleared).

We are running a nightly build - we might try a newer one.

I will try and get some more info for you but any other insight would be
helpful.


Vinci wrote:


Hi,

I don't use POST requests for queries, but I think you should first check
with Firebug what your browser actually sends... Also, it would be more
helpful if you could tell us how long your queries are, and give some
information about your query server, etc.

Quick comment: POST usually needs longer to process compared to
GET. But I think you should pay more attention to how you use the search
engine... maybe synonyms could help you reduce the amount of
information the user sends.

Thank you,
Vinci

jnagro wrote:


Hello,

Earlier this week we started experiencing a strange situation with our
Solr installation. We have a home-grown query tool which started to
timeout (we had it set low at 2 seconds which was always more than
enough). In doing some rather in-depth investigation it appears that
Solr is processing POST requests much much slower than GET requests.
Using the solr admin interface i can GET a search in under a second
(about 0.2 to be exact) and when i change that form to be a POST form
(using firefox web-developer toolbar) suddenly the request takes ~5-7
seconds - sometimes longer - to return. Our query tool needs to make
POSTS because we run into the max URL length problem very quickly with
some of the queries we need to run. Any thoughts? I was going to try a
newer build but it seems strange that we would all-of-a-sudden run into
this issue.

Thanks!

-John










Re: synonyms

2008-03-31 Thread Chris Hostetter

: We've implemented a PortugueseSteemer. We want to let it available for
: everyone. Where can I commit it ?

If it's just a Stemmer that has no Solr dependencies (or a Stemmer built 
as a TokenFilter) the best thing to do is contribute it to the 
Lucene-Java project...

http://wiki.apache.org/lucene-java/HowToContribute

...the best place for it to live would be in the contrib/analysis package.

Make sure to note in the issue how it is different from/better than the 
existing Stemmers for Portuguese so people can differentiate it.



-Hoss



Re: Indexing a word in url

2008-03-31 Thread Mike Klaas


On 31-Mar-08, at 10:50 AM, Vinci wrote:


Hi all,

I would like to ask, if I want to index word in a URL, which data type and
parser should I use?


Depends on how you want to search it.  I use WordDelimiterFilter with  
parts generation on only (no catenation), and an additional stopwords  
list that excludes a few tokens like 'http'.
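
To give a feel for the kind of tokens this produces, here is a rough plain-Python approximation. It is not the Solr filter itself: it only splits on runs of non-alphanumeric characters (the real WordDelimiterFilter also splits on things like case changes), and the stopword set and the name url_tokens are assumptions made up for this sketch.

```python
import re

# Assumed stopword list; extend it with whatever tokens you want to ignore.
URL_STOPWORDS = {"http", "https", "www"}


def url_tokens(url, stopwords=URL_STOPWORDS):
    """Split a URL on every run of non-alphanumeric characters."""
    parts = re.split(r"[^A-Za-z0-9]+", url)
    # Drop empty strings left by leading/trailing separators, then stopwords.
    return [p.lower() for p in parts if p and p.lower() not in stopwords]
```

Each part is kept as its own token and nothing is concatenated back together, which roughly mirrors "parts generation only, no catenation".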


-Mike


Re: Is number of stored fields affects query performance?

2008-03-31 Thread Mike Klaas


On 31-Mar-08, at 6:57 AM, Evgeniy Strokin wrote:

I have two questions related to the subject:

1. If I have 100 fields in my document, all indexed. Will my queries  
run slower if I store all 100 fields or just 10?


Depends on the total size of the stored fields.  Really large stored  
documents can result in more disk seeks to fetch the documents to  
return.




2. If I have 100 fields in my documents, all stored. Will my queries  
run slower if I index all 100 fields or just 10?


Not really.

Whether there is a practical difference depends strongly on the size  
of your index, the heft of the machine you're using, and the  
definition of "slower".  There really isn't any fixed advice other  
than "benchmark it under production-like conditions".


-Mike


RE: Highlight - get terms used by lucene

2008-03-31 Thread Chris Hostetter

: Solr returns the max score and the score per document.

: This means that the best hit always is 100% which is not always what you 
: want because the article itself could still be quite irrelevant...

Solr doesn't give you a percentage, and there's no reason to divide a 
doc's score by maxScore to get a percentage -- any more than there would be 
with the Oracle function as described.  The Oracle docs don't say that you 
can divide a score of 23 by a max score of 100 to determine it's a 23% 
match, just that scores will always be less than 100 ... in fact the doc 
you linked to specifically says you can't compare scores, so a score of 23 
for one query doesn't mean the same thing as a score of 23 from another 
query (which is also true for Lucene scores BTW, Lucene just doesn't 
promise you any particular max score because there are so many more 
interesting and complex query types in Lucene that make determining such 
a max impossible)
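
A tiny illustration of why the "percentage" carries no information across queries (plain Python, with made-up scores; as_percent_of_max is a hypothetical helper, not anything in Solr or Lucene):

```python
def as_percent_of_max(scores):
    """Divide each score by the largest one, as a 'percentage' would."""
    top = max(scores)
    return [s / top for s in scores]


# Made-up raw scores for two unrelated queries: one with strong matches,
# one where even the best hit scored very low.
strong_query = [8.0, 4.0, 2.0]
weak_query = [0.08, 0.04, 0.02]
```

Both lists normalize to the same values, with the top hit at 1.0 in each case, so the result says nothing about how relevant the best document actually is.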

My main point was: rather than letting Solr score the results one way, and 
then trying to come up with your own variation on that score externally 
(which is error prone given that your scoring variation might result in a 
different ordering and change which results appear per "page") let 
Solr compute the score for you. 

If you aren't happy with the way Solr computes the score, and you want a 
simpler score calculation like what Oracle provides (that will only work 
for simple Term queries) write a custom Similarity instance that does what 
you want...

http://lucene.apache.org/java/2_3_1/api/org/apache/lucene/search/Similarity.html
http://wiki.apache.org/solr/SolrPlugins

Off the cuff I think you'd get what Oracle describes by:
  - omitting norms on all fields in your schema.xml
  - making Similarity.queryNorm(float) always return 3
  - making Similarity.tf(float) always return its input
  - not using query boosts

...all bets are off though if you use multi term queries (or phrase 
queries, or fuzzy queries, etc..) but you can play with the other methods 
in Similarity if you have a particular idea how you'd like those scored if 
you do use them.


-Hoss



Re: Setting a Threshold of a sortable field to filter the result?

2008-03-31 Thread Chris Hostetter
: 
: How can I set a threshold value of a field so that I can filter the result
: which is lower than the threshold? By the schema.xml or set by the query?

fq=your_field_name:[* TO your_max_value]

or

fq=your_field_name:[your_min_value TO *]

depending on whether you want a minimum or maximum filter.
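
If you are building the request in code, the brackets and spaces need URL-encoding. A minimal sketch in plain Python, where the field name "rank" and the helper names range_filter and select_query_string are just placeholders for illustration:

```python
import urllib.parse


def range_filter(field, minimum=None, maximum=None):
    """Build an fq value such as 'rank:[5 TO *]'; None means open-ended."""
    low = "*" if minimum is None else str(minimum)
    high = "*" if maximum is None else str(maximum)
    return f"{field}:[{low} TO {high}]"


def select_query_string(q, fq):
    """URL-encode q and fq into a query string for a select request."""
    return urllib.parse.urlencode({"q": q, "fq": fq})
```

For example, select_query_string("solr", range_filter("rank", minimum=5)) gives a string you can append after the "?" of your select URL.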




-Hoss



Re: Solr GET requests return quickly, POST requests take very long, why?

2008-03-31 Thread Vinci

Hi,

You need to give us some examples... Meanwhile, you should ask the Tomcat user
group how Tomcat/Apache deals with POST requests since, as Ryan said, GET and
POST go through the same loop.

Thank you,
Vinci


jnagro wrote:
> 
> I appreciate the response. We're running tomcat/apache at the moment. All
> of these questions are good, however it doesn't really explain why this
> would be happening so suddenly and why there is such a wide difference
> between POST and GET. Do you have any other thoughts that i could
> investigate? Some of these queries are taking almost 20 seconds to return,
> but running them as GETS returns them in under a second (we've even
> restarted servers to ensure query caching was cleared).
> 
> We are running a nightly build - we might try a newer one.
> 
> I will try and get some more info for you but any other insight would be
> helpful.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Solr-GET-requests-return-quickly%2C-POST-requests-take-very-long%2C-why--tp16396262p16407788.html
Sent from the Solr - User mailing list archive at Nabble.com.



stored and indexed in schema

2008-03-31 Thread Vinci

Hi,

I would like to ask: if I set a field to be indexed but not stored, can I
still retrieve the document but not this field?
If I have a large field that I want to index but am not supposed to show to
the user (the original content is stored in another processed document, and I am
using another field in Solr to point to its location... I hand the
retrieval job to the server :P), will I get a faster response even when the query
doesn't ask Solr to return this large field?

Thank you,
Vinci
-- 
View this message in context: 
http://www.nabble.com/stored-and-indexed-in-schema-tp16411090p16411090.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Indexing a word in url

2008-03-31 Thread Vinci

Hi,

Thank you for your reply.
Actually I want to treat anything that is not a letter or digit as a
separator - anything between separators will be a word (so that I can use the URL
fragment to see what is indexed about this site)... any suggestions?

Thank you,
Vinci


Mike Klaas wrote:
> 
> 
> On 31-Mar-08, at 10:50 AM, Vinci wrote:
>>
>> Hi all,
>>
>> I would like to ask, if I want to index word in a URL, which data  
>> type and
>> parser should I use?
> 
> Depends on how you want to search it.  I use WordDelimiterFilter with  
> parts generation on only (no catenation), and an additional stopwords  
> list that excludes a few tokens like 'http'.
> 
> -Mike
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Indexing-a-word-in-url-tp16397739p16411091.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Setting a Threshold of a sortable field to filter the result?

2008-03-31 Thread Vinci

Hi,

One more thing: which numeric data type should I use, sfloat or float, for
the fq parameter? 

Thank you,
Vinci


hossman wrote:
> 
> : 
> : How can I set a threshold value of a field so that I can filter the result
> : which is lower than the threshold? By the schema.xml or set by the query?
> 
> fq=your_field_name:[* TO your_max_value]
> 
> or
> 
> fq=your_field_name:[your_min_value TO *]
> 
> depending on whether you want a minimum or maximum filter.
> 
> 
> 
> 
> -Hoss
> 
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Setting-a-Threshold-of-a-sortable-field-to-filter-the-result--tp16367336p16411382.html
Sent from the Solr - User mailing list archive at Nabble.com.



How to handle multiple application?

2008-03-31 Thread Bhavin Pandya
I have configured a Solr instance for one of my applications, with one 
master server and 3 slave servers.

I want to add one more application to the same Solr instance. Is that possible, or 
do I need to run multiple instances of Solr?

Please help.

Bhavin pandya
Software engineer,
Rediff.com India Ltd,