Lucene Query to Solr query

2009-05-25 Thread Reza Safari

Hello,

One little question: is there any utility that can convert a core Lucene
query (any type, e.g. TermQuery etc.) to a Solr query? It is really a
lot of work for me to rewrite existing code.


Thanks,
Reza

--
Reza Safari
LUKKIEN
Copernicuslaan 15
6716 BM Ede

The Netherlands
-
http://www.lukkien.com
t: +31 (0) 318 698000

This message is for the designated recipient only and may contain  
privileged, proprietary, or otherwise private information. If you have  
received it in error, please notify the sender immediately and delete  
the original. Any other use of the email by you is prohibited.


Re: How to index large set data

2009-05-25 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Mon, May 25, 2009 at 10:56 AM, nk 11  wrote:
> Hello
> Interesting thread. One request please, because I don't have much experience
> with solr, could you please use full terms and not DIH, RES etc.?

nk11.
DIH =  DataImportHandler
RES=?

It is unavoidable that we end up using short names out of
laziness/lack of time. But if you ever come across one, do not
hesitate to ask; we will be more than glad to clarify.
>
> Thanks :)
>
> On Mon, May 25, 2009 at 4:44 AM, Jianbin Dai  wrote:
>
>>
>> Hi Paul,
>>
>> Hope you have a great weekend so far.
>> I still have a couple of questions you might help me out:
>>
>> 1. In your earlier email, you said "if possible, you can set up multiple
>> DIH, say /dataimport1, /dataimport2 etc, and split your files to achieve
>> parallelism".
>> I am not sure if I understand it right. I put two requestHandlers in
>> solrconfig.xml, like this:
>>
>> <requestHandler name="/dataimport"
>>   class="org.apache.solr.handler.dataimport.DataImportHandler">
>>   <lst name="defaults">
>>     <str name="config">./data-config.xml</str>
>>   </lst>
>> </requestHandler>
>>
>> <requestHandler name="/dataimport2"
>>   class="org.apache.solr.handler.dataimport.DataImportHandler">
>>   <lst name="defaults">
>>     <str name="config">./data-config2.xml</str>
>>   </lst>
>> </requestHandler>
>>
>>
>> and create data-config.xml and data-config2.xml.
>> then I run the command
>> http://host:8080/solr/dataimport?command=full-import
>>
>> But only one data set (the first one) was indexed. Did I get something
>> wrong?
>>
>>
>> 2. I noticed that after Solr indexed about 8M documents (around two hours),
>> it gets very, very slow. Using the "top" command in Linux, I noticed that
>> RES (resident memory) is 1 GB. I did several experiments; every time RES
>> reaches 1 GB, the indexing process becomes extremely slow. Is this memory
>> limit set by the JVM? And how can I set the JVM memory when I use DIH
>> through the web command full-import?
>>
>> Thanks!
>>
>>
>> JB
>>
>>
>>
>>
>> --- On Fri, 5/22/09, Noble Paul നോബിള്‍  नोब्ळ् 
>> wrote:
>>
>> > From: Noble Paul നോബിള്‍  नोब्ळ् 
>> > Subject: Re: How to index large set data
>> > To: "Jianbin Dai" 
>> > Date: Friday, May 22, 2009, 10:04 PM
>> > On Sat, May 23, 2009 at 10:27 AM,
>> > Jianbin Dai 
>> > wrote:
>> > >
>> > > Hi Paul, but in your previous post, you said "there is
>> > already an issue for writing to Solr in multiple threads
>> >  SOLR-1089". Do you think use solrj alone would be better
>> > than DIH?
>> >
>> > nope
>> > you will have to do indexing in multiple threads
>> >
>> > if possible , you can setup multiple DIH say /dataimport1,
>> > /dataimport2 etc and split your files and can achieve
>> > parallelism
>> >
>> >
>> > > Thanks and have a good weekend!
>> > >
>> > > --- On Fri, 5/22/09, Noble Paul നോബിള്‍
>> >  नोब्ळ् 
>> > wrote:
>> > >
>> > >> no need to use embedded Solrserver..
>> > >> you can use SolrJ with streaming
>> > >> in multiple threads
>> > >>
>> > >> On Fri, May 22, 2009 at 8:36 PM, Jianbin Dai
>> > 
>> > >> wrote:
>> > >> >
>> > >> > If I do the xml parsing by myself and use
>> > embedded
>> > >> client to do the push, would it be more efficient
>> > than DIH?
>> > >> >
>> > >> >
>> > >> > --- On Fri, 5/22/09, Grant Ingersoll 
>> > >> wrote:
>> > >> >
>> > >> >> From: Grant Ingersoll 
>> > >> >> Subject: Re: How to index large set data
>> > >> >> To: solr-user@lucene.apache.org
>> > >> >> Date: Friday, May 22, 2009, 5:38 AM
>> > >> >> Can you parallelize this?  I
>> > >> >> don't know that the DIH can handle it,
>> > >> >> but having multiple threads sending docs
>> > to Solr
>> > >> is the
>> > >> >> best
>> > >> >> performance wise, so maybe you need to
>> > look at
>> > >> alternatives
>> > >> >> to pulling
>> > >> >> with DIH and instead use a client to push
>> > into
>> > >> Solr.
>> > >> >>
>> > >> >>
>> > >> >> On May 22, 2009, at 3:42 AM, Jianbin Dai
>> > wrote:
>> > >> >>
>> > >> >> >
>> > >> >> > about 2.8 m total docs were created.
>> > only the
>> > >> first
>> > >> >> run finishes. In
>> > >> >> > my 2nd try, it hangs there forever
>> > at the end
>> > >> of
>> > >> >> indexing, (I guess
>> > >> >> > right before commit), with cpu usage
>> > of 100%.
>> > >> Total 5G
>> > >> >> (2050) index
>> > >> >> > files are created. Now I have two
>> > problems:
>> > >> >> > 1. why it hangs there and failed?
>> > >> >> > 2. how can i speed up the indexing?
>> > >> >> >
>> > >> >> >
>> > >> >> > Here is my solrconfig.xml
>> > >> >> >
>> > >> >> > false
>> > >> >> > 3000
>> > >> >> > 1000
>> > >> >> > 2147483647
>> > >> >> > 1
>> > >> >> > false
>> > >> >> >
>> > >> >> >
>> > >> >> >
>> > >> >> >
>> > >> >> > --- On Thu, 5/21/09, Noble Paul
>> > >> >> നോബിള്‍  नो
>> > >> >> > ब्ळ् 
>> > >> >> wrote:
>> > >> >> >
>> > >> >> >> From: Noble Paul
>> > നോബിള്‍
>> > >> >> नोब्ळ्
>> > >> >> >> 
>> > >> >> >> Subject: Re: How to index large
>> > set data
>> > >> >> >> To: solr-user@lucene.apache.org
>> > >> >> >> Date: Thursday, May 21, 2009,
>> > 10:39 PM
>> > >> >> >> what is the total no:of docs
>> > created
>> 
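On question 1 in the thread above: each DataImportHandler instance registered in solrconfig.xml listens on its own request path, so calling /dataimport only runs the first configuration; every handler has to be triggered with its own full-import call. A sketch (host, port and handler names are illustrative, and the echo only prints the requests; replace it with plain curl to actually send them):

```shell
# One full-import request per registered DIH handler; they then run in parallel.
BASE="http://localhost:8080/solr"
for HANDLER in dataimport dataimport2; do
  echo curl -s "$BASE/$HANDLER?command=full-import"
done
```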

Re: Lucene Query to Solr query

2009-05-25 Thread Avlesh Singh
If you use SolrJ client to perform searches, does this not work for you?

SolrQuery solrQuery = new SolrQuery();
solrQuery.setQuery(*myLuceneQuery.toString()*);
QueryResponse response = mySolrServer.query(solrQuery);

Cheers
Avlesh

On Mon, May 25, 2009 at 12:39 PM, Reza Safari  wrote:

> Hello,
>
> One little question: is there any utility that can convert core Lucene
> query (any type e.q. TermQuery etc) to solr query? It's is really a lot of
> work for me to rewrite existing code.
>
> Thanks,
> Reza
>


Sending Mlt POST request

2009-05-25 Thread Ohad Ben Porat
Hello,



I wish to send an MLT (MoreLikeThis) request to Solr and filter the results by
a list of values for a specific field. The problem is that sometimes the list
can include thousands of values, and it's impossible to send such a GET request.

Sending this request as POST didn't work well... Is POST supported by MLT? If
not, is support planned for one of the next versions? Or is there perhaps a
different solution?



I will appreciate any help and advice,

Thanks,

Ohad.
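For reference: Solr's dispatch filter accepts ordinary form-encoded POST bodies on its search handlers, which sidesteps URL-length limits for huge filter lists. A hedged sketch (endpoint and field name are illustrative, and the echo only prints the command; drop it to actually send the request):

```shell
# Thousands of filter values go in the POST body instead of the query string.
FQ='docId:(1 OR 2 OR 3)'   # imagine thousands of OR'd values here
echo curl -s 'http://localhost:8983/solr/select' \
     --data-urlencode 'q=id:123' \
     --data-urlencode 'mlt=true' \
     --data-urlencode "fq=$FQ"
```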



Is it memory leaking in solr?

2009-05-25 Thread Jianbin Dai

I am using DIH to do indexing. After I indexed about 8M documents (which took
about 1h40m), it used up almost all memory (4 GB), and indexing became
extremely slow. If I delete the whole index and shut down Tomcat, it still
shows over 3 GB of memory in use. Is it a memory leak? If it is, is the leak
in Solr indexing or in DIH?  Thanks.
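A note on interpreting those numbers: a JVM grows its heap toward -Xmx and does not normally return that memory to the operating system until the process exits, so a large RES reading even after deleting the index is expected and is not by itself a leak. The heap ceiling is set where the servlet container is launched, not per request; a sketch assuming Tomcat (values illustrative):

```shell
# Cap the heap of the JVM that Tomcat (and therefore Solr/DIH) runs in.
export JAVA_OPTS="-Xms512m -Xmx2g"
echo "$JAVA_OPTS"   # Tomcat's startup scripts pick this variable up
```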


  



RE: Boolean query in Solr

2009-05-25 Thread Sagar Khetkade


Hi Erik,
 
This mail just got into my junk folder, so it was left unread. Well, after I
turned debugQuery on and fired the same query, I got some different vibes
from it. Suppose the query is Content:xyz AND Ticket_Id:(123 OR 1234), and
the search term "xyz" is in the stop word list; even then, search results
were retrieved. Expected was that none of the documents would be retrieved.
Here I am getting documents in which "xyz" does not appear at all.

The debug query was q=Content:xyz+AND+Ticket_Id:4
 
When I used the Solr admin UI, I saw that the parsed query omitted the search
field (Content), as it is marked as a stop word, but kept the Ticket_Id
clause, which is what produced the results.

My question is whether it is possible in Solr to fire a query that retrieves
documents containing the search term (Content) only within the selected
Ticket_Id.
 
Thanks in advance.
~ Sagar
 

 
> From: e...@ehatchersolutions.com
> To: solr-user@lucene.apache.org
> Subject: Re: Boolean query in Solr
> Date: Tue, 14 Apr 2009 09:33:27 -0400
> 
> 
> On Apr 14, 2009, at 5:38 AM, Sagar Khetkade wrote:
> 
> >
> > Hi,
> > I am using SolrJ and firing the query on Solr indexes. The indexed 
> > contains three fields viz.
> > 1. Document_id (type=integer required= true)
> > 2. Ticket Id (type= integer)
> > 3. Content (type=text)
> >
> > Here the query formulation is such that I am having query with “AND” 
> > clause. So the query, that I am firing on index files look like 
> > “Content: search query AND Ticket_id:123 Ticket_Id:789)”.
> 
> That query is invalid query parser syntax, with an unopen paren first 
> of all. I assume that's a typo though. Be careful in how you 
> construct queries with field selectors. Saying:
> 
> Content:search query
> 
> does NOT necessarily mean that the term "query" is being searched in 
> the Content field, as that depends on your default field setting for 
> the query parser. This, however, does use the Content field for both 
> terms:
> 
> Content:(search query)
> 
> 
> > I know this type of query is easily fired on lucene indexes. But 
> > when I am firing the above query I am not getting the required 
> > result . The result contains the document which does not belongs to 
> > the ticket id mentioned in the query.
> > Please can anyone help me out of this issue.
> 
> What does the query parse to with &debugQuery output? That's mighty 
> informative info.
> 
> Erik
> 


Recover crashed solr index

2009-05-25 Thread Wang Guangchen
Hi everyone,

I have 8M docs to index, and each doc is around 50 KB. Solr crashed in the
middle of indexing; the error message said that one of the files in the data
directory is missing. I don't know why this happened.

So right now I have to find a way to recover the index and avoid re-indexing.
Does anyone know any tools or methods to recover a crashed index? Please
help.

Thanks a lot.

Regards
GC
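One avenue worth trying before re-indexing: Lucene ships a CheckIndex tool that can inspect an index and, with -fix, drop corrupt segments so the rest becomes readable again. Documents in the dropped segments are lost and the segments file is rewritten, so copy the index directory first. Paths below are illustrative, and the echo only prints the command:

```shell
# Inspect, then repair, a possibly corrupt Lucene index (back it up first!).
INDEX=/var/solr/data/index          # illustrative path
echo java -cp lucene-core.jar org.apache.lucene.index.CheckIndex "$INDEX" -fix
```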


Re: Lucene Query to Solr query

2009-05-25 Thread Reza Safari
Hmmm, overriding toString() can work wonders. I will try as you suggested.
Thanx for the quick reply.


Gr, Reza

On May 25, 2009, at 9:34 AM, Avlesh Singh wrote:

If you use SolrJ client to perform searches, does this not work for  
you?


SolrQuery solrQuery = new SolrQuery();
solrQuery.setQuery(*myLuceneQuery.toString()*);
QueryResponse response = mySolrServer.query(solrQuery);

Cheers
Avlesh

On Mon, May 25, 2009 at 12:39 PM, Reza Safari   
wrote:



Hello,

One little question: is there any utility that can convert core  
Lucene
query (any type e.q. TermQuery etc) to solr query? It's is really a  
lot of

work for me to rewrite existing code.

Thanks,
Reza

--
Reza Safari
LUKKIEN
Copernicuslaan 15
6716 BM Ede

The Netherlands
-
http://www.lukkien.com
t: +31 (0) 318 698000


R: Filtering query terms

2009-05-25 Thread Branca Marco
Hi,
I tested the new filters' configuration and it works fine.


The problem with ISOLatin1AccentFilterFactory was not due to Solr, but to a
core-specific configuration in a Solr multi-core environment. It was only
necessary to set the property 'splitOnCaseChange' to 0 in
solr.WordDelimiterFilterFactory.
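Since the filter chain tested above was stripped from the mail in transit, here is a hedged sketch of what such an analyzer chain typically looks like in schema.xml, with 'splitOnCaseChange' set to 0 as described (the field type name and tokenizer are assumptions, not Marco's actual configuration):

```xml
<fieldType name="text" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- splitOnCaseChange="0" keeps mixed-case tokens like "PaPa" whole -->
    <filter class="solr.WordDelimiterFilterFactory" splitOnCaseChange="0"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
  </analyzer>
</fieldType>
```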

Thanks for your support,

Marco


Marco Branca
Consultant Sytel Reply S.r.l.
Via Ripamonti,  89 - 20139 Milano
Mobile: (+39) 348 2298186
e-mail: m.bra...@reply.it
Website: www.reply.eu

From: Ensdorf Ken [ensd...@zoominfo.com]
Sent: Friday, 22 May 2009 18:16
To: 'solr-user@lucene.apache.org'
Subject: RE: Filtering query terms

> When I try testing the filter "solr.LowerCaseFilterFactory" I get
> different results calling the following urls:
>
>  1. http://[server-ip]:[server-port]/solr/[core-
> name]/select/?q=all%3Apapa&version=2.2&start=0&rows=10&indent=on
>  2. http://[server-ip]:[server-port]/solr/[core-
> name]/select/?q=all%3APaPa&version=2.2&start=0&rows=10&indent=on

In this case, the WordDelimiterFilterFactory is kicking in on your second
search, so "PaPa" is split at the case change into "Pa" and "Pa".  You can
double-check this by using the analysis tool in the admin UI -
http://localhost:8983/solr/admin/analysis.jsp

>
> Besides, when trying to test the "solr.ISOLatin1AccentFilterFactory" I
> get different results calling the following urls:
>
>  1. http://[server-ip]:[server-port]/solr/[core-
> name]/select/?q=all%3Apapa&version=2.2&start=0&rows=10&indent=on
>  2. http://[server-ip]:[server-port]/solr/[core-
> name]/select/?q=all%3Apapà&version=2.2&start=0&rows=10&indent=on

Not sure what is happening here, but again I would check it with the analysis
tool.

--
The information transmitted is intended for the person or entity to which it is 
addressed and may contain confidential and/or privileged material. Any review, 
retransmission, dissemination or other use of, or taking of any action in 
reliance upon, this information by persons or entities other than the intended 
recipient is prohibited. If you received this in error, please contact the 
sender and delete the material from any computer.


Re: Lucene Query to Solr query

2009-05-25 Thread Avlesh Singh
You missed the point, Reza. toString *has to be implemented* by all Query
objects in Lucene. All you have to do is compose the right Lucene query
matching your needs (any combination of TermQueries, BooleanQueries,
RangeQueries etc.) and just do a luceneQuery.toString() when performing a
Solr query.

Thinking aloud, does it make sense for the SolrQuery object to take a Lucene
Query object? I am suggesting something like this -
SolrQuery.setQuery(org.apache.lucene.search.Query luceneQuery)

Cheers
Avlesh

On Mon, May 25, 2009 at 2:32 PM, Reza Safari  wrote:

> Hmmm, overriding toString() can make wonders. I will try as you suggested.
> Thanx for quick reply.
>
> Gr, Reza
>
>
> On May 25, 2009, at 9:34 AM, Avlesh Singh wrote:
>
>  If you use SolrJ client to perform searches, does this not work for you?
>>
>> SolrQuery solrQuery = new SolrQuery();
>> solrQuery.setQuery(*myLuceneQuery.toString()*);
>> QueryResponse response = mySolrServer.query(solrQuery);
>>
>> Cheers
>> Avlesh
>>
>> On Mon, May 25, 2009 at 12:39 PM, Reza Safari 
>> wrote:
>>
>>  Hello,
>>>
>>> One little question: is there any utility that can convert core Lucene
>>> query (any type e.q. TermQuery etc) to solr query? It's is really a lot
>>> of
>>> work for me to rewrite existing code.
>>>
>>> Thanks,
>>> Reza
>>>
>


RE: Solr statistics of top searches and results returned

2009-05-25 Thread Plaatje, Patrick
Hi all,

I created a script that uses a Solr search component, which hooks into the
main Solr core and captures the searches being done. It then tokenizes the
search and sends both the tokenized and the original query to another Solr
core. I have not written a factory for this, but if required it shouldn't be
hard to modify the script and code database support into it.

You can find the source here:

http://www.ipros.nl/uploads/Stats-component.zip

It includes a README, and a schema.xml that should be used.

Please let me know your thoughts.

Best,

Patrick



 

-Original Message-
From: Umar Shah [mailto:u...@wisdomtap.com] 
Sent: vrijdag 22 mei 2009 10:03
To: solr-user@lucene.apache.org
Subject: Re: Solr statistics of top searches and results returned

Hi,

Good feature to have. Maintaining a top N would also require storing all the
search queries done so far and keeping the counts updated (or at least within
some time window).

Having pluggable persistent storage for all-time search queries would be
great.

Tell me, how can I help?

-umar

On Fri, May 22, 2009 at 12:21 PM, Shalin Shekhar Mangar 
 wrote:
> On Fri, May 22, 2009 at 3:22 AM, Grant Ingersoll wrote:
>
>>
>> I think you will want some type of persistence mechanism otherwise 
>> you will end up consuming a lot of resources keeping track of all the 
>> query strings, unless I'm missing something.  Either a Lucene index 
>> (Solr core) or the option of embedding a DB.  Ideally, it would be 
>> pluggable such that people could choose their storage mechanism.  
>> Most people do this kind of thing offline via log analysis as logs can grow 
>> quite large quite quickly.
>>
>
> For a general case, yes. But I was thinking more of a top 'n' queries 
> as a running statistic.
>
> --
> Regards,
> Shalin Shekhar Mangar.
>


Index size concerns

2009-05-25 Thread Muhammed Sameer

Salaam,

We are using Apache Solr to index our files for faster searches. Everything
happens without a problem; my only concern is the size of the cache.

It seems the trend is that if I cache 1 GB of files, the index goes to 800 MB,
i.e. we are seeing an 80% cache size.

Is this normal, or am I missing something in the configuration of Solr?

Thanks and regards,
Muhammed Sameer


  


Re: Getting 404 for MoreLikeThis handler

2009-05-25 Thread Koji Sekiguchi

jlist9 wrote:

Thanks. Will that still be the MoreLikeThisRequestHandler?
Or the StandardRequestHandler with mlt option?

  

Yes, StandardRequestHandler. MoreLikeThisComponent is
available by default. Set mlt=on when you want to get MLT results.

Koji



Re: Plugin Not Found

2009-05-25 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi Jeff,
look at these lines in the log:

May 22, 2009 7:38:25 AM org.apache.solr.core.SolrResourceLoader 
INFO: Solr home set to '/home/zetasolr/'
May 22, 2009 7:38:25 AM org.apache.solr.core.SolrResourceLoader
createClassLoader
INFO: Adding 'file:/home/zetasolr/lib/FacetCubeComponent.jar' to Solr
classloader
May 22, 2009 7:38:25 AM org.apache.solr.core.SolrResourceLoader 
INFO: Solr home set to '/home/zetasolr/cores/zeta-main/'
May 22, 2009 7:38:25 AM org.apache.solr.core.SolrResourceLoader
createClassLoader
INFO: Reusing parent classloader


This means that Solr is just using the webapp classloader instead of
its own. Which version of Solr are you using?

Is it possible for you to apply this patch and restart, to see if you get a
different error message?



--

On Fri, May 22, 2009 at 8:15 PM, Jeff Newburn  wrote:
> I have included the configuration and the log for the error on startup. It
> does appear it tries to load the lib but then simply can't reference it.
>
>  default="true" >
>        
>            explicit
>            0.01
>            
>                productId^10.0
>
>                personality^15.0
>                subCategory^20.0
>                category^10.0
>                productType^8.0
>
>                brandName^10.0
>                realBrandName^9.5
>                productNameSearch^20
>
>                size^1.2
>                width^1.0
>                heelHeight^1.0
>
>                productDescription^5.0
>                color^6.0
>                price^1.0
>
>                expandedGender^0.5
>            
>            
>                brandName^5.0  productNameSearch^5.0 productDescription^5.0
> personality^10.0 subCategory^20.0 category^10.0 productType^8.0
>            
>            
>                productId, productName, price, originalPrice,
> brandNameFacet, productRating, imageUrl, productUrl, isNew, onSale
>            
>            rord(popularity)^1
>            100%
>            1
>            5
>            *:*
>
>            
>             name="mlt.fl">brandNameFacet,productTypeFacet,productName,categoryFacet,subC
> ategoryFacet,personalityFacet,colorFacet,heelHeight,expandedGender
>            1
>            1
>        
>         
>               spellcheck
>               facetcube
>         
>
>    
>
>    <searchComponent name="facetcube" class="com.zappos.solr.FacetCubeComponent"/>
>
>
> LOGS
> May 22, 2009 7:38:24 AM org.apache.catalina.startup.SetAllPropertiesRule
> begin
> WARNING: [SetAllPropertiesRule]{Server/Service/Connector} Setting property
> 'maxProcessors' to '500' did not find a matching property.
> May 22, 2009 7:38:24 AM org.apache.catalina.startup.SetAllPropertiesRule
> begin
> WARNING: [SetAllPropertiesRule]{Server/Service/Connector} Setting property
> 'maxProcessors' to '500' did not find a matching property.
> May 22, 2009 7:38:24 AM org.apache.catalina.core.AprLifecycleListener init
> INFO: The APR based Apache Tomcat Native library which allows optimal
> performance in production environments was not found on the
> java.library.path: /usr/local/apr/lib
> May 22, 2009 7:38:24 AM org.apache.tomcat.util.net.NioSelectorPool
> getSharedSelector
> INFO: Using a shared selector for servlet write/read
> May 22, 2009 7:38:24 AM org.apache.coyote.http11.Http11NioProtocol init
> INFO: Initializing Coyote HTTP/1.1 on http-8080
> May 22, 2009 7:38:24 AM org.apache.tomcat.util.net.NioSelectorPool
> getSharedSelector
> INFO: Using a shared selector for servlet write/read
> May 22, 2009 7:38:24 AM org.apache.coyote.http11.Http11NioProtocol init
> INFO: Initializing Coyote HTTP/1.1 on http-8443
> May 22, 2009 7:38:24 AM org.apache.catalina.startup.Catalina load
> INFO: Initialization processed in 1011 ms
> May 22, 2009 7:38:24 AM org.apache.catalina.core.StandardService start
> INFO: Starting service Catalina
> May 22, 2009 7:38:24 AM org.apache.catalina.core.StandardEngine start
> INFO: Starting Servlet Engine: Apache Tomcat/6.0.16
> May 22, 2009 7:38:24 AM org.apache.catalina.startup.HostConfig deployWAR
> INFO: Deploying web application archive solr.war
> May 22, 2009 7:38:25 AM org.apache.solr.servlet.SolrDispatchFilter init
> INFO: SolrDispatchFilter.init()
> May 22, 2009 7:38:25 AM org.apache.solr.core.SolrResourceLoader
> locateInstanceDir
> INFO: No /solr/home in JNDI
> May 22, 2009 7:38:25 AM org.apache.solr.core.SolrResourceLoader
> locateInstanceDir
> INFO: using system property solr.solr.home: /home/zetasolr
> May 22, 2009 7:38:25 AM org.apache.solr.core.CoreContainer$Initializer
> initialize
> INFO: looking for solr.xml: /home/zetasolr/solr.xml
> May 22, 2009 7:38:25 AM org.apache.solr.core.SolrResourceLoader 
> INFO: Solr home set to '/home/zetasolr/'
> May 22, 2009 7:38:25 AM org.apache.solr.core.SolrResourceLoader
> createClassLoader
> INFO: Adding 'file:/home/zetasolr/lib/FacetCubeComponent.jar' to Solr
> classloader
> May 22, 2009 7:38:25 AM org.apache.solr.core.SolrResourceLoader 
> INFO: Solr home set to '/home/zetasolr/cores/zeta-main/'
> May 22, 2009 7:38:25 AM

Re: Getting 404 for MoreLikeThis handler

2009-05-25 Thread Erik Hatcher
That's the standard request handler.   You have to create a mapping in  
solrconfig.xml to the MoreLikeThisHandler (not  
MoreLikeThis*Request*Handler) in order to use that.  It is not mapped  
in the default example config (at least on trunk).


Erik

On May 24, 2009, at 11:08 PM, jlist9 wrote:


Thanks. Will that still be the MoreLikeThisRequestHandler?
Or the StandardRequestHandler with mlt option?


Hi, I'm trying out the mlt handler but I'm getting a 404 error.

HTTP Status 404 - /solr/mlt

solrconfig.xml seem to say that mlt handler is available by default.
i wonder if there's anything else I should do before I can use it?
I'm using version 1.3.


Try /solr/select with mlt=on parameter.

Koji
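The mapping Erik describes is a one-line addition to solrconfig.xml; a sketch (the "/mlt" path is a common choice, not mandated):

```xml
<!-- expose MoreLikeThisHandler at /solr/mlt -->
<requestHandler name="/mlt" class="solr.MoreLikeThisHandler"/>
```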




Re: Lucene Query to Solr query

2009-05-25 Thread Erik Hatcher
Warning: toString on a Query object is *NOT* guaranteed to be parsable  
back into the same Query.  Don't use Query.toString() in this manner.


What you probably want to do is create your own QParserPlugin for Solr  
that creates the Query however you need from textual parameters from  
the client.


Erik


On May 25, 2009, at 5:16 AM, Avlesh Singh wrote:


You missed the point, Reza. toString *has to be implemented* by all
Queryobjects in Lucene. All you have to do is to compose the right
Lucene query
matching your needs (all combinations of TermQueries, BooleanQueries,
RangeQueries etc ..) and just do a luceneQuery.toString() when  
performing a

Solr query.

Thinking aloud, does it make sense for the SolrQuery object to take  
a Lucene

Query object?
I am suggesting something like this -
SolrQuery.setQuery(org.apache.lucene.search.Query
luceneQuery)

Cheers
Avlesh

On Mon, May 25, 2009 at 2:32 PM, Reza Safari   
wrote:


Hmmm, overriding toString() can make wonders. I will try as you  
suggested.

Thanx for quick reply.

Gr, Reza


On May 25, 2009, at 9:34 AM, Avlesh Singh wrote:

If you use SolrJ client to perform searches, does this not work for  
you?


SolrQuery solrQuery = new SolrQuery();
solrQuery.setQuery(*myLuceneQuery.toString()*);
QueryResponse response = mySolrServer.query(solrQuery);

Cheers
Avlesh

On Mon, May 25, 2009 at 12:39 PM, Reza Safari 
wrote:

Hello,


One little question: is there any utility that can convert core  
Lucene
query (any type e.q. TermQuery etc) to solr query? It's is really  
a lot

of
work for me to rewrite existing code.

Thanks,
Reza



exceptions when using existing index with latest build

2009-05-25 Thread Peter Wolanin
Building Solr last night from updated svn, I'm now getting the
exception below when I use any fq parameter to search a pre-existing
index. So far I cannot fix it by tweaking config files; I had to
delete and re-index.

I note that Solr was recently updated to the latest lucene build, so
maybe something broke in the index format?

here's the relevant part of the trace:

org.apache.lucene.index.ReadOnlySegmentReader cannot be cast to
org.apache.solr.search.SolrIndexReader

java.lang.ClassCastException:
org.apache.lucene.index.ReadOnlySegmentReader cannot be cast to
org.apache.solr.search.SolrIndexReader
   at 
org.apache.solr.search.SortedIntDocSet$2.getDocIdSet(SortedIntDocSet.java:530)
   at 
org.apache.lucene.search.IndexSearcher.doSearch(IndexSearcher.java:237)
   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:221)
   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:212)
   at org.apache.lucene.search.Searcher.search(Searcher.java:150)
   at 
org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1032)
   at 
org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:894)
   at 
org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:337)
   at 
org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:176)
   at 
org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
   at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1328)

-- 
Peter M. Wolanin, Ph.D.
Momentum Specialist,  Acquia. Inc.
peter.wola...@acquia.com


Re: exceptions when using existing index with latest build

2009-05-25 Thread Erik Hatcher
Peter - I posted this to the solr-dev list this morning also.  The  
thread to follow is over there.


Erik

On May 25, 2009, at 9:05 AM, Peter Wolanin wrote:


Building Solr last night from updated svn, I'm now getting the
exception below when I use any fq parameter searching a pre-existing
index.  So far, I cannot fix it by tweak config files, but I had to
delete and re-index.

I note that Solr was recently updated to the latest lucene build, so
maybe something broke in the index format?

here's the relevant part of the trace:

org.apache.lucene.index.ReadOnlySegmentReader cannot be cast to
org.apache.solr.search.SolrIndexReader

java.lang.ClassCastException:
org.apache.lucene.index.ReadOnlySegmentReader cannot be cast to
org.apache.solr.search.SolrIndexReader
   at org.apache.solr.search.SortedIntDocSet$2.getDocIdSet(SortedIntDocSet.java:530)
   at org.apache.lucene.search.IndexSearcher.doSearch(IndexSearcher.java:237)
   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:221)
   at org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:212)
   at org.apache.lucene.search.Searcher.search(Searcher.java:150)
   at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:1032)
   at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:894)
   at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:337)
   at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:176)
   at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:195)
   at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:131)
   at org.apache.solr.core.SolrCore.execute(SolrCore.java:1328)

--
Peter M. Wolanin, Ph.D.
Momentum Specialist,  Acquia. Inc.
peter.wola...@acquia.com




Re: More questions about MoreLikeThis

2009-05-25 Thread Koji Sekiguchi

jlist9 wrote:

The wiki page (http://wiki.apache.org/solr/MoreLikeThis) says:

mlt.fl: The fields to use for similarity. NOTE: if possible, these
should have a stored TermVector

I didn't set TermVector to true, yet MoreLikeThis with StandardRequestHandler
seems to work fine. The first question is: is TermVector only a performance
optimization?

  

I think yes.


The second question is: after I changed the mlt.fl fields from both
indexed and stored to indexed only, I started to get zero results back.
Do mlt.fl fields always need to be stored?

Thanks

  


MLT uses termVector if it exists for the field. If termVector is not 
available,

MLT tries to get stored field data. If stored field is not available, MLT
does nothing for the field as you were seeing.

So mlt.fl fields don't always need to be stored.

Koji
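For reference, enabling a term vector on a field used with mlt.fl is a one-attribute change in schema.xml. A sketch of such a field definition (the field name and type here are examples, not from this thread):

```xml
<!-- Illustrative schema.xml fragment: termVector="true" lets MLT read
     term vectors directly instead of re-analyzing the stored value. -->
<field name="description" type="text" indexed="true" stored="true"
       termVector="true"/>
```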




problem with Solritas (show only some of the fields)

2009-05-25 Thread Jörg Agatz
Hello...

I have a problem...

I want to use Solritas for Solr.
At the moment, Solritas presents all fields in the results, but I must change
it to present only some, like id, name, cat and inStock (example documents).

I think this is the code that displays all fields:


  #foreach($fieldname in $doc.fieldNames)
 
   $fieldname :
   
   #foreach($value in $doc.getFieldValues($fieldname))
 $value
   #end
   
 
  #end
  #if($params.getBool("debugQuery",false))
toggle explain
$response.getExplainMap().get($doc.getFirstValue('id'))
  #end



Maybe someone can explain how I can change the code to show only some fields?
Jörg Agatz


Re: Index size concerns

2009-05-25 Thread Shalin Shekhar Mangar
On Mon, May 25, 2009 at 3:53 PM, Muhammed Sameer wrote:

>
> We are using apache-solr to index our files for faster searches, all things
> happen without a problem, my only concern is the size of the cache.
>
> It seems that the trend is that if I cache 1 GB of files, the index goes
> to 800MB, i.e. we are seeing an 80% cache size.
>
> Is this normal or am I missing something in the configuration of solr
>

I'm sorry I do not understand your question. Which files are you talking
about? The Solr cache has got nothing to do with files. It caches the
query/filter results and solr documents.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Lucene Query to Solr query

2009-05-25 Thread Avlesh Singh
Point taken, Erik. But is there really a downside to using
Query.toString() if someone is not using any of the complex Query Subclasses
(like a SpanQuery)?

Cheers
Avlesh

On Mon, May 25, 2009 at 5:38 PM, Erik Hatcher wrote:

> Warning: toString on a Query object is *NOT* guaranteed to be parsable back
> into the same Query.  Don't use Query.toString() in this manner.
>
> What you probably want to do is create your own QParserPlugin for Solr that
> creates the Query however you need from textual parameters from the client.
>
>Erik
>
>
> On May 25, 2009, at 5:16 AM, Avlesh Singh wrote:
>
>  You missed the point, Reza. toString *has to be implemented* by all
>> Query objects in Lucene. All you have to do is to compose the right
>>
>> Lucene query
>> matching your needs (all combinations of TermQueries, BooleanQueries,
>> RangeQueries etc ..) and just do a luceneQuery.toString() when performing
>> a
>> Solr query.
>>
>> Thinking aloud, does it make sense for the SolrQuery object to take a
>> Lucene
>> Query object?
>> I am suggesting something like this -
>> SolrQuery.setQuery(org.apache.lucene.search.Query
>> luceneQuery)
>>
>> Cheers
>> Avlesh
>>
>> On Mon, May 25, 2009 at 2:32 PM, Reza Safari 
>> wrote:
>>
>>>  Hmmm, overriding toString() can work wonders. I will try as you
>>> suggested.
>>> Thanx for quick reply.
>>>
>>> Gr, Reza
>>>
>>>
>>> On May 25, 2009, at 9:34 AM, Avlesh Singh wrote:
>>>
>>> If you use SolrJ client to perform searches, does this not work for you?
>>>

 SolrQuery solrQuery = new SolrQuery();
 solrQuery.setQuery(*myLuceneQuery.toString()*);
 QueryResponse response = mySolrServer.query(solrQuery);

 Cheers
 Avlesh

 On Mon, May 25, 2009 at 12:39 PM, Reza Safari 
 wrote:

 Hello,

>
> One little question: is there any utility that can convert core Lucene
> query (any type e.q. TermQuery etc) to solr query? It's is really a lot
> of
> work for me to rewrite existing code.
>
> Thanks,
> Reza
>
> --
> Reza Safari
> LUKKIEN
> Copernicuslaan 15
> 6716 BM Ede
>
> The Netherlands
> -
> http://www.lukkien.com
> t: +31 (0) 318 698000
>
> This message is for the designated recipient only and may contain
> privileged, proprietary, or otherwise private information. If you have
> received it in error, please notify the sender immediately and delete
> the
> original. Any other use of the email by you is prohibited.
>
>


Re: Lucene Query to Solr query

2009-05-25 Thread Shalin Shekhar Mangar
On Mon, May 25, 2009 at 9:16 PM, Avlesh Singh  wrote:

> Point taken, Erik. But, is there really a downside towards using
> Query.toString() if someone is not using any of the complex Query
> Subclasses
> (like a SpanQuery)?
>

Well, you will be relying on undocumented behavior that might change in
future releases.

Also, most (none?) Query objects do not have a parseable toString
representation so it may not even work at all.

-- 
Regards,
Shalin Shekhar Mangar.


Re: Lucene Query to Solr query

2009-05-25 Thread Avlesh Singh
>
> Also, most (none?) Query objects do not have a parseable toString
> representation so it may not even work at all.
>

IMO, this behavior is limited to the Subclasses of SpanQuery.
Anyways, I understand the general notion here.

Cheers
Avlesh

On Mon, May 25, 2009 at 9:30 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> On Mon, May 25, 2009 at 9:16 PM, Avlesh Singh  wrote:
>
> > Point taken, Erik. But, is there really a downside towards using
> > Query.toString() if someone is not using any of the complex Query
> > Subclasses
> > (like a SpanQuery)?
> >
>
> Well, you will be relying on undocumented behavior that might change in
> future releases.
>
> Also, most (none?) Query objects do not have a parseable toString
> representation so it may not even work at all.
>
> --
> Regards,
> Shalin Shekhar Mangar.
>


Re: Recover crashed solr index

2009-05-25 Thread Peter Wolanin
You can use the Lucene jar that ships with Solr to invoke the CheckIndex tool -
this will possibly allow you to recover if you pass the -fix param.

You may lose some docs, however, so this is only viable if you can,
for example, query to check what's missing.

The command looks like (from the root of the solr svn checkout):

java -ea:org.apache.lucene -cp lib/lucene-core-2.9-dev.jar
org.apache.lucene.index.CheckIndex [path to index directory]

For example, to check the example index:

java -ea:org.apache.lucene -cp lib/lucene-core-2.9-dev.jar
org.apache.lucene.index.CheckIndex example/solr/data/index/

-Peter

On Mon, May 25, 2009 at 4:42 AM, Wang Guangchen  wrote:
> Hi everyone,
>
> I have 8m docs to index, and each doc is around 50kb. The solr crashed in
> the middle of indexing. error message said that one of the file in the data
> directory is missing. I don't know why this is happened.
>
> So right now I have to find a way to recover the index to avoid re-index. Is
> there anyone know any tools or method to recover the crashed index? Please
> help.
>
> Thanks a lot.
>
> Regards
> GC
>



-- 
Peter M. Wolanin, Ph.D.
Momentum Specialist,  Acquia. Inc.
peter.wola...@acquia.com


Re: Is it memory leaking in solr?

2009-05-25 Thread Jianbin Dai

Again, indexing becomes extremely slow after indexing 8M documents (about 25G of 
original file size). Here is the memory usage info of my computer. Does this 
have anything to do with the Tomcat settings? Thanks.


top - 08:09:53 up  7:22,  1 user,  load average: 1.03, 1.01, 1.00
Tasks:  78 total,   2 running,  76 sleeping,   0 stopped,   0 zombie
Cpu(s): 49.9%us,  0.2%sy,  0.0%ni, 49.8%id,  0.2%wa,  0.0%hi,  0.0%si,  0.0%st
Mem:   4044776k total,  3960740k used,    84036k free,    42196k buffers
Swap:  2031608k total,       84k used,  2031524k free,  2729892k cached

  PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
 3322 root      21   0 1357m 1.0g  11m S  100 27.0 397:51.74 java



--- On Mon, 5/25/09, Jianbin Dai  wrote:

> From: Jianbin Dai 
> Subject: Is it memory leaking in solr?
> To: solr-user@lucene.apache.org, noble.p...@gmail.com
> Date: Monday, May 25, 2009, 1:27 AM
> 
> I am using DIH to do indexing. After I indexed about 8M
> documents (took about 1hr40m), it used up almost all memory
> (4GB), and the indexing becomes extremely slow. If I delete
> all indexing and shutdown tomcat, it still shows over 3gb
> memory was used. Is it memory leaking? if it is, then the
> leaking is in solr indexing or DIH?  Thanks.
> 
> 
>       
> 
> 






Re: Index size concerns

2009-05-25 Thread Muhammed Sameer

Salaam,

Sorry for that, here is the big picture.

Actually we use Solr to index all the mails that come to us so that we can 
allow for faster lookups.

We have seen that after our mail server accepts, say, a GB of mails, the index 
size goes up to 800MB.

I hope that this time I am clear in conveying the problem.

What I wanted to know is: is this index size normal?

Regards,
Muhammed Sameer

--- On Mon, 5/25/09, Shalin Shekhar Mangar  wrote:

> From: Shalin Shekhar Mangar 
> Subject: Re: Index size concerns
> To: solr-user@lucene.apache.org
> Date: Monday, May 25, 2009, 11:19 AM
> On Mon, May 25, 2009 at 3:53 PM,
> Muhammed Sameer wrote:
> 
> >
> > We are using apache-solr to index our files for faster
> searches, all things
> > happen without a problem, my only concern is the size
> of the cache.
> >
> > It seems that the trend is that the if I cache 1 GB of
> files the index goes
> > to 800MB ie we are seeing a 80% cache size.
> >
> > Is this normal or am I missing something in the
> configuration of solr
> >
> 
> I'm sorry I do not understand your question. Which files
> are you talking
> about? The Solr cache has got nothing to do with files. It
> caches the
> query/filter results and solr documents.
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 


  


Re: grouping response docs together

2009-05-25 Thread Matt Mitchell
Thanks guys. I looked at the dedup stuff, but the documents I'm adding
aren't really duplicates. They're very similar, but different.

I checked out the field collapsing feature patch, applied the patch but
can't get it to build successfully. Will this patch work with a nightly
build?

Thanks!

On Fri, May 15, 2009 at 7:47 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

>
> Matt - you may also want to detect near duplicates at index time:
>
> http://wiki.apache.org/solr/Deduplication
>
>  Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> - Original Message 
> > From: Matt Mitchell 
> > To: solr-user@lucene.apache.org
> > Sent: Friday, May 15, 2009 6:52:48 PM
> > Subject: grouping response docs together
> >
> > Is there a built-in mechanism for grouping similar documents together in
> the
> > response? I'd like to make it look like there is only one document with
> > multiple "hits".
> >
> > Matt
>
>


Re: grouping response docs together

2009-05-25 Thread Thomas Traeger

Hello Matt,

the patch should work with trunk and after a small fix with 1.3 too (see
my comment in SOLR-236). I just made a successful build to be sure.

Do you see any error messages?

Thomas

Matt Mitchell schrieb:

Thanks guys. I looked at the dedup stuff, but the documents I'm adding
aren't really duplicates. They're very similar, but different.

I checked out the field collapsing feature patch, applied the patch but
can't get it to build successfully. Will this patch work with a nightly
build?

Thanks!

On Fri, May 15, 2009 at 7:47 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:


Matt - you may also want to detect near duplicates at index time:

http://wiki.apache.org/solr/Deduplication

 Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 

From: Matt Mitchell 
To: solr-user@lucene.apache.org
Sent: Friday, May 15, 2009 6:52:48 PM
Subject: grouping response docs together

Is there a built-in mechanism for grouping similar documents together in

the

response? I'd like to make it look like there is only one document with
multiple "hits".

Matt








issues with shards

2009-05-25 Thread Valdir Salgueiro
Hello, I'm using Solr 1.3, and having some problems when I search with the
shards parameter, for example:

shards=localhost:9090/isearch

(I'm using 9090 as the default port.)

I get this error:

> INFO: Filter queries (object): [null]
> 25/05/2009 17:06:33 org.apache.solr.core.SolrCore execute
> INFO: webapp=null path=null
> params={facet.zeros=false&facet=true&hl.autofield.excluderegex=.*(?:_blob)$&facet.limit=200&hl.simple.pre=&hl.autofields=true&ling=none&hl=true&fl=id,score&allcats=0&hl.autofield.regex=^(?:(?:show)|(?:ctrl))##.%2B&hl.simple.post=&hl.merge=false&fsv=true&fq=*:*&hl.fragsize=100&hl.fl=&IDENTIFICADOR_ACESSO=&wt=javabin&rows=10&hl.snippets=3&start=0&q=content:(cesar)&idAcl=&hl.notags=true&isShard=true}
> hits=46 status=0 QTime=297
> 25/05/2009 17:06:33 org.apache.solr.common.SolrException log
> SEVERE: java.lang.RuntimeException: This is a binary writer , Cannot write
> to a characterstream
> at
> org.apache.solr.request.BinaryResponseWriter.write(BinaryResponseWriter.java:48)
> at org.apache.solr.servlet.SolrServlet.doGet(SolrServlet.java:89)
> at org.apache.solr.servlet.SolrServlet.doPost(SolrServlet.java:65)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:637)
> at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
> at
> org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
> at
> org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
> at
> org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
> at
> org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
> at
> org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:433)
> at
> org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
> at
> org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
> at
> org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
> at
> org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
> at
> org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
> at
> org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
> at
> org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
> at java.lang.Thread.run(Thread.java:619)
>
> 25/05/2009 17:06:33 org.apache.solr.common.SolrException log
> SEVERE: org.apache.solr.common.SolrException: Internal Server Error
>
> Internal Server Error
>
> request: http://localhost:9090/isearch/select


I've been searching for this error for some days (SEVERE:
java.lang.RuntimeException: This is a binary writer , Cannot write to a
characterstream) without much luck, and have no idea how to fix it :( Any
help is appreciated :)

I'm posting the full log below in case anyone is interested. Thanks.


> 25/05/2009 17:06:24 com.aileader.isearch.ui.InitISearchServlet init
> INFO: InitISearchServlet.init()
> 25/05/2009 17:06:24 org.apache.solr.core.SolrResourceLoader
> locateInstanceDir
> INFO: No /solr/home in JNDI
> 25/05/2009 17:06:24 org.apache.solr.core.SolrResourceLoader
> locateInstanceDir
> INFO: solr home defaulted to 'solr/' (could not find system property or
> JNDI)
> 25/05/2009 17:06:24 org.apache.solr.core.SolrResourceLoader <init>
> INFO: Solr home set to 'solr/'
> 25/05/2009 17:06:24 org.apache.solr.core.SolrResourceLoader
> createClassLoader
> INFO: Reusing parent classloader
> 25/05/2009 17:06:24 org.apache.solr.core.SolrConfig <init>
> INFO: Loaded SolrConfig: solrconfig.xml
> 25/05/2009 17:06:24 org.apache.solr.core.SolrCore <init>
> INFO: Opening new SolrCore at solr/, dataDir=./solr/data/
> 25/05/2009 17:06:24 org.apache.solr.schema.IndexSchema readSchema
> INFO: Reading Solr Schema
> 25/05/2009 17:06:24 org.apache.solr.schema.IndexSchema readSchema
> INFO: Schema name=default
> 25/05/2009 17:06:24 org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created string: org.apache.solr.schema.StrField
> 25/05/2009 17:06:24 org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created boolean: org.apache.solr.schema.BoolField
> 25/05/2009 17:06:24 org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created integer: org.apache.solr.schema.IntField
> 25/05/2009 17:06:24 org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created long: org.apache.solr.schema.LongField
> 25/05/2009 17:06:24 org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created float: org.apache.solr.schema.FloatField
> 25/05/2009 17:06:24 org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created double: org.apache.solr.schema.DoubleField
> 25/05/2009 17:06:24 org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: created sint: org.apache.solr.schema.SortableIntField
> 25/05/2009 17:06:24 org.apache.solr.util.plugin.AbstractPluginLoader load
> INFO: creat

Re: highlighting performance

2009-05-25 Thread Matt Mitchell
Thanks Otis. I added termVector="true" for those fields, but there isn't a
noticeable difference. So, just to be a little more clear, the dynamic
fields I'm adding... there might be hundreds. Do you see this as a problem?

Thanks,
Matt

On Fri, May 15, 2009 at 7:48 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

>
> Matt,
>
> I believe indexing those fields that you will use for highlighting with
> term vectors enabled will make things faster (and your index a bit bigger).
>
>
> Otis --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> - Original Message 
> > From: Matt Mitchell 
> > To: solr-user@lucene.apache.org
> > Sent: Friday, May 15, 2009 5:08:23 PM
> > Subject: highlighting performance
> >
> > Hi,
> >
> > I'm experimenting with highlighting and am noticing a big drop in
> > performance with my setup. I have documents that use quite a few dynamic
> > fields (20-30). The fields are multiValued stored/indexed text fields,
> each
> > with a few paragraphs worth of text. My hl.fl param is set to *_t
> >
> > What kinds of things can I tweak to make this faster? Is it because I'm
> > highlighting so many different fields?
> >
> > Thanks,
> > Matt
>
>


Re: Lucene Query to Solr query

2009-05-25 Thread Yonik Seeley
On Mon, May 25, 2009 at 3:09 AM, Reza Safari  wrote:
> One little question: is there any utility that can convert core Lucene query
> (any type e.q. TermQuery etc) to solr query? It's is really a lot of work
> for me to rewrite existing code.

Solr internal APIs take Lucene query types.
I guess perhaps you mean transforming a Lucene query into a parameter
for the external HTTP API?

new TermQuery(new Term("foo","bar"))
  would be transformed to
q=foo:bar

-Yonik
http://www.lucidimagination.com
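The mapping Yonik describes can be mechanized for simple cases. A minimal sketch in plain Java (no Lucene on the classpath; the class and method names below are invented for illustration, not a Solr or Lucene API) that builds a field:value clause and escapes the characters the query parser treats specially:

```java
// Sketch: turning a (field, value) pair into a Solr q parameter clause,
// e.g. the q=foo:bar form from the message above. Assumes plain Java.
public class SolrQueryStrings {

    // Characters the Lucene/Solr query parser treats specially.
    private static final String SPECIALS = "+-&|!(){}[]^\"~*?:\\";

    // Escape a raw value so the query parser reads it literally.
    static String escape(String value) {
        StringBuilder sb = new StringBuilder();
        for (char c : value.toCharArray()) {
            if (SPECIALS.indexOf(c) >= 0) sb.append('\\');
            sb.append(c);
        }
        return sb.toString();
    }

    // Build a field:value clause.
    static String termClause(String field, String value) {
        return field + ":" + escape(value);
    }

    public static void main(String[] args) {
        System.out.println(termClause("foo", "bar"));  // foo:bar
        System.out.println(termClause("path", "a:b")); // path:a\:b
    }
}
```

For anything beyond term and simple boolean clauses (spans, function queries), hand-building the string breaks down, which is why Erik's QParserPlugin suggestion elsewhere in this thread is the safer route.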


Re: grouping response docs together

2009-05-25 Thread Matt Mitchell
Hi Thomas,

In a 5-24-09 nightly build, I applied the patch:

cd apache-solr-nightly

patch -p0 < ~/Projects/apache-solr-patches/SOLR-236_collapsing.patch
patching file src/common/org/apache/solr/common/params/CollapseParams.java
patching file
src/java/org/apache/solr/handler/component/CollapseComponent.java
patching file src/java/org/apache/solr/search/CollapseFilter.java
patching file src/java/org/apache/solr/search/NegatedDocSet.java
patching file src/java/org/apache/solr/search/SolrIndexSearcher.java
Hunk #1 succeeded at 1444 (offset -39 lines).
patching file src/test/org/apache/solr/search/TestDocSet.java
Hunk #1 succeeded at 134 (offset 42 lines).

... and got this when running "ant dist"

docs:
[mkdir] Created dir:
/Users/mwm4n/Downloads/apache-solr-nightly/contrib/javascript/dist/doc
 [java] Exception in thread "main" java.lang.NoClassDefFoundError:
org/mozilla/javascript/tools/shell/Main
 [java] at JsRun.main(Unknown Source)

BUILD FAILED
/Users/mwm4n/Downloads/apache-solr-nightly/common-build.xml:338: The
following error occurred while executing this line:
/Users/mwm4n/Downloads/apache-solr-nightly/common-build.xml:215: The
following error occurred while executing this line:
/Users/mwm4n/Downloads/apache-solr-nightly/contrib/javascript/build.xml:74:
Java returned: 1

Not sure what any of that means, but the "ant dist" task worked fine before
the patch. Any ideas?

Thanks,

Matt

On Mon, May 25, 2009 at 3:59 PM, Thomas Traeger  wrote:

> Hello Matt,
>
> the patch should work with trunk and after a small fix with 1.3 too (see
> my comment in SOLR-236). I just made a successful build to be sure.
>
> Do you see any error messages?
>
> Thomas
>
> Matt Mitchell schrieb:
>
>  Thanks guys. I looked at the dedup stuff, but the documents I'm adding
>> aren't really duplicates. They're very similar, but different.
>>
>> I checked out the field collapsing feature patch, applied the patch but
>> can't get it to build successfully. Will this patch work with a nightly
>> build?
>>
>> Thanks!
>>
>> On Fri, May 15, 2009 at 7:47 PM, Otis Gospodnetic <
>> otis_gospodne...@yahoo.com> wrote:
>>
>>  Matt - you may also want to detect near duplicates at index time:
>>>
>>> http://wiki.apache.org/solr/Deduplication
>>>
>>>  Otis
>>> --
>>> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>>>
>>>
>>>
>>> - Original Message 
>>>
 From: Matt Mitchell 
 To: solr-user@lucene.apache.org
 Sent: Friday, May 15, 2009 6:52:48 PM
 Subject: grouping response docs together

 Is there a built-in mechanism for grouping similar documents together in

>>> the
>>>
 response? I'd like to make it look like there is only one document with
 multiple "hits".

 Matt

>>>
>>>
>>
>


Shuffling results

2009-05-25 Thread yaymicro_bjorn

Hi

I'm responsible for the search engine at yaymicro.com. yaymicro.com is a
microstock agency (sells images). We are using the excellent solr search
engine, but I have a problem with series of similar images showing up. I'll
try to explain: 

A search for dog for example
http://yaymicro.com/search.action?search.search=dog&x=0&y=0&search.first=true

very often results in variations of images of the same motive appearing close to
each other. This is logical, but unwanted behaviour, since we would love to
show our customers more variation in the search results. 

So my question is quite simple: is there a way to configure Solr to put some
"randomness" into the search results? To shuffle the results, not completely, but
"a bit", to avoid such series of similar images. 

Any respons would be highly appreciated

Bjorn 
CTO of YayMicro



-- 
View this message in context: 
http://www.nabble.com/Shuffling-results-tp23715563p23715563.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: problem with Solritas (show only some of the fields)

2009-05-25 Thread Erik Hatcher


On May 25, 2009, at 11:15 AM, Jörg Agatz wrote:

I will use Solitas for solr..


Yay!  Our first customer ;)

At the moment, Solitas present All fields in the Results, but i  
musst change
it and present only some, Like id, name, cat and inStock  
(exampledocuments)


i think that is the code to post all fields..


 #foreach($fieldname in $doc.fieldNames)

  $fieldname :
  
  #foreach($value in $doc.getFieldValues($fieldname))
$value
  #end
  

 #end


Right - this is just generic code to show all stored fields from the  
document (TODO: and really should be adjusted to be fl parameter aware).


Maby someone can explain me how i can change the code to get some  
field


Sure... first, this page describes the objects you have available in  
the template (Velocity) context:  http://wiki.apache.org/solr/VelocityResponseWriter 
 - you can link off to javadocs from there to see more about what  
each object provides in terms of getters and such.


There is a $response.  From the default browse.vm template, $doc is a  
single item in an iteration over $response.results


$doc is an org.apache.solr.common.SolrDocument (http://lucene.apache.org/solr/api/org/apache/solr/common/SolrDocument.html 
)


So from a $doc, you can do this:

  $doc.getFirstValue("name")

getFirstValue is used when it is known to be a single valued field  
(experiment with the other getters on SolrDocument to see how they  
work with various fields).


In the custom templates I have been using, I define a macro to make
this even easier - you can define it in VM_global_library.vm in
conf/velocity to make it global to all templates:


   #macro(field $f)$!{esc.html($doc.getFirstValue($f))}#end

Now you can use #field("name") instead, making templates much
cleaner.  The $!{esc.html(...)} bit is there to HTML-escape the field
value, otherwise it leaves the possibility of malformed rendering or
even the possibility of a JavaScript injection vulnerability.  The
exclamation point is a Velocity templating feature to not render
anything if the value is null; otherwise it would literally render
"${...}" in the output.


Perhaps more than you were asking for, but I wanted to be thorough  
since this is a feature of Solr I'd like to see get some more, ahem,  
visibility.
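As a side note on why that escaping matters, here is a rough sketch in plain Java of what an HTML-escaping helper like $esc.html does (the real Velocity EscapeTool covers more entities; this minimal version is for illustration only):

```java
// Minimal HTML escaping: markup in a stored field is rendered as inert
// text instead of being interpreted by the browser.
public class HtmlEscape {
    static String escapeHtml(String s) {
        StringBuilder sb = new StringBuilder(s.length());
        for (char c : s.toCharArray()) {
            switch (c) {
                case '<':  sb.append("&lt;");   break;
                case '>':  sb.append("&gt;");   break;
                case '&':  sb.append("&amp;");  break;
                case '"':  sb.append("&quot;"); break;
                default:   sb.append(c);
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // A field value containing a script tag becomes harmless text.
        System.out.println(escapeHtml("<script>alert(1)</script>"));
        // prints &lt;script&gt;alert(1)&lt;/script&gt;
    }
}
```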


Erik




Re: Shuffling results

2009-05-25 Thread Avlesh Singh
If simply getting random results (matching your query) from Solr is your
requirement, then a dynamic RandomSortField is what you need. Details here
-
http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html

Cheers
Avlesh

On Tue, May 26, 2009 at 6:54 AM, yaymicro_bjorn
wrote:

>
> Hi
>
> I'm responsible for the search engine at yaymicro.com. yaymicro.com is a
> microstock agency (sells images). We are using the excellent solr search
> engine, but I have a problem with series of similar images showing up. I'll
> try to explain:
>
> A search for dog for example
>
> http://yaymicro.com/search.action?search.search=dog&x=0&y=0&search.first=true
>
> very often results in variations of the images of the same motive close to
> each other. This is logical, but unwanted behaviour, since we would love to
> show our customers more variations in search result.
>
> So my question is quite simple, is there a way to configure solr to put
> some
> "randomness" in the search result? To shuffle the result, not completly,
> but
> "a bit" to avoid such series of similar images.
>
> Any respons would be highly appreciated
>
> Bjorn
> CTO of YayMicro
>
>
>
> --
> View this message in context:
> http://www.nabble.com/Shuffling-results-tp23715563p23715563.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


Re: Shuffling results

2009-05-25 Thread yaymicro_bjorn

Hi Avlesh

No, as I was trying to explain, I obviously don't want a totally random
result. I just want to mix it up a "little". Is there a way to achieve this
with solr? 

Bjorn


Avlesh Singh wrote:
> 
> If simply getting random results (matching your query) from Solr is your
> requirement, then a dynamic RandomSortField is what you need. Details here
> -
> http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html
> 
> Cheers
> Avlesh
> 
> On Tue, May 26, 2009 at 6:54 AM, yaymicro_bjorn
> wrote:
> 
>>
>> Hi
>>
>> I'm responsible for the search engine at yaymicro.com. yaymicro.com is a
>> microstock agency (sells images). We are using the excellent solr search
>> engine, but I have a problem with series of similar images showing up.
>> I'll
>> try to explain:
>>
>> A search for dog for example
>>
>> http://yaymicro.com/search.action?search.search=dog&x=0&y=0&search.first=true
>>
>> very often results in variations of the images of the same motive close
>> to
>> each other. This is logical, but unwanted behaviour, since we would love
>> to
>> show our customers more variations in search result.
>>
>> So my question is quite simple, is there a way to configure solr to put
>> some
>> "randomness" in the search result? To shuffle the result, not completly,
>> but
>> "a bit" to avoid such series of similar images.
>>
>> Any respons would be highly appreciated
>>
>> Bjorn
>> CTO of YayMicro
>>
>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/Shuffling-results-tp23715563p23715563.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 

-- 
View this message in context: 
http://www.nabble.com/Shuffling-results-tp23715563p23716312.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Shuffling results

2009-05-25 Thread Avlesh Singh
>
> I just want to mix it up a "little"
>

Sounds very subjective and open.

Give this a thought - you can try a multi-field sort, with the first sort being on
the score (so that the more relevant results appear first), and the
second being a sort on the random field (which shuffles the order of results
with the same score).
In Solr, you can do multi-field sorting like this - sort=<field>+<asc|desc>[,<field>+<asc|desc>]...

Cheers
Avlesh

On Tue, May 26, 2009 at 8:59 AM, yaymicro_bjorn
wrote:

>
> Hi Avlesh
>
> No, as I was trying to explain, I obviously don't want a totally random
> result. I just want to mix it up a "little". Is there a way to achieve this
> with solr?
>
> Bjorn
>
>
> Avlesh Singh wrote:
> >
> > If simply getting random results (matching your query) from Solr is your
> > requirement, then a dynamic RandomSortField is what you need. Details
> here
> > -
> >
> http://lucene.apache.org/solr/api/org/apache/solr/schema/RandomSortField.html
> >
> > Cheers
> > Avlesh
> >
> > On Tue, May 26, 2009 at 6:54 AM, yaymicro_bjorn
> > wrote:
> >
> >>
> >> Hi
> >>
> >> I'm responsible for the search engine at yaymicro.com. yaymicro.com is
> a
> >> microstock agency (sells images). We are using the excellent solr search
> >> engine, but I have a problem with series of similar images showing up.
> >> I'll
> >> try to explain:
> >>
> >> A search for dog for example
> >>
> >>
> http://yaymicro.com/search.action?search.search=dog&x=0&y=0&search.first=true
> >>
> >> very often results in variations of the images of the same motive close
> >> to
> >> each other. This is logical, but unwanted behaviour, since we would love
> >> to
> >> show our customers more variations in search result.
> >>
> >> So my question is quite simple, is there a way to configure solr to put
> >> some
> >> "randomness" in the search result? To shuffle the result, not completly,
> >> but
> >> "a bit" to avoid such series of similar images.
> >>
> >> Any respons would be highly appreciated
> >>
> >> Bjorn
> >> CTO of YayMicro
> >>
> >>
> >>
> >> --
> >> View this message in context:
> >> http://www.nabble.com/Shuffling-results-tp23715563p23715563.html
> >> Sent from the Solr - User mailing list archive at Nabble.com.
> >>
> >>
> >
> >
>
> --
> View this message in context:
> http://www.nabble.com/Shuffling-results-tp23715563p23716312.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>
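The two-key sort idea above can be sketched outside Solr in plain Java. The Hit type and the seeded hash below are illustrative only (this is not Solr's RandomSortField implementation): sort by score descending, then break ties with a hash of the document id and a seed, so equal-score groups get shuffled while the overall relevance order is preserved.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class TieBreakShuffle {
    static final class Hit {
        final String id; final double score;
        Hit(String id, double score) { this.id = id; this.score = score; }
        @Override public String toString() { return id; }
    }

    // Primary key: score descending. Secondary key: a seeded hash of the
    // id, so only documents with equal scores get reshuffled per seed.
    static List<Hit> order(List<Hit> hits, int seed) {
        List<Hit> out = new ArrayList<>(hits);
        out.sort(Comparator
            .comparingDouble((Hit h) -> -h.score)
            .thenComparingInt(h -> (h.id + "#" + seed).hashCode()));
        return out;
    }

    public static void main(String[] args) {
        List<Hit> hits = Arrays.asList(
            new Hit("a", 1.0), new Hit("b", 1.0), new Hit("c", 0.5));
        // "c" always sorts last (lowest score); "a" and "b" may swap
        // places depending on the seed.
        System.out.println(order(hits, 1));
        System.out.println(order(hits, 7));
    }
}
```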


Re: More questions about MoreLikeThis

2009-05-25 Thread jlist9
Thanks. That explains it! I'll set termVector to true and give it a try again.

On Mon, May 25, 2009 at 7:41 AM, Koji Sekiguchi  wrote:
> MLT uses termVector if it exists for the field. If termVector is not
> available,
> MLT tries to get stored field data. If stored field is not available, MLT
> does nothing for the field as you were seeing.
>
> So mlt.fl fields don't always need to be stored.


Re: Recover crashed solr index

2009-05-25 Thread Wang Guangchen
Hi peter,

Thank you very much for your quick reply.

I tried the CheckIndex tool. It doesn't work on my crashed index.
The error message says the segments file in the directory is missing,
and when I use the -fix param, a new segments file still can't be written.
I even tried CheckIndex without assertions, and it still doesn't work.


Do you know why this is happening? Does it mean that the segments file can't
be rewritten at all?

Btw, i am using the nightly build solr.

following is the error messages:

[r...@localhost lib]# java  -cp lucene-core-2.9-dev.jar
org.apache.lucene.index.CheckIndex -fix /solr/example/data/index/

NOTE: testing will be more thorough if you run java with
'-ea:org.apache.lucene...', so assertions are enabled

Opening index @ /solr/example/data/index/

ERROR: could not read any segments file in directory
java.io.FileNotFoundException: /solr/example

/data/index/segments_cje (No such file or directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
at
org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:630)
at
org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:660)
at
org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:566)
at
org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:560)
at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:224)
at
org.apache.lucene.index.SegmentInfos$1.doBody(SegmentInfos.java:292)
at
org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:688)
at org.apache.lucene.index.SegmentInfos.read(SegmentInfos.java:289)
at
org.apache.lucene.index.CheckIndex.checkIndex(CheckIndex.java:258)
at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:678)
WARNING: 0 documents will be lost

NOTE: will write new segments file in 5 seconds; this will remove 0 docs
from the index. THIS IS YOUR LAST CHANCE TO CTRL+C!
  5...
  4...
  3...
  2...
  1...
Writing...
Exception in thread "main" java.lang.NullPointerException
at org.apache.lucene.index.CheckIndex.fixIndex(CheckIndex.java:556)
at org.apache.lucene.index.CheckIndex.main(CheckIndex.java:697)


Regards

GC



On Tue, May 26, 2009 at 12:49 AM, Peter Wolanin wrote:

> you can use the lucene jar with solr to invoke the CheckIndex method -
> this will  possibly allow you to recover if you pass the -fix param.
>
> You may lose some docs, however, so this is only viable if you can,
> for example, query to check what's missing.
>
> The command looks like (from the root of the solr svn checkout):
>
> java -ea:org.apache.lucene -cp lib/lucene-core-2.9-dev.jar
> org.apache.lucene.index.CheckIndex [path to index directory]
>
> For example, to check the example index:
>
> java -ea:org.apache.lucene -cp lib/lucene-core-2.9-dev.jar
> org.apache.lucene.index.CheckIndex example/solr/data/index/
>
> -Peter
>
> On Mon, May 25, 2009 at 4:42 AM, Wang Guangchen 
> wrote:
> > Hi everyone,
> >
> > I have 8m docs to index, and each doc is around 50kb. Solr crashed in
> > the middle of indexing. The error message said that one of the files in
> > the data directory is missing. I don't know why this happened.
> >
> > So right now I have to find a way to recover the index and avoid
> > re-indexing. Does anyone know of any tools or methods to recover a
> > crashed index? Please help.
> >
> > Thanks a lot.
> >
> > Regards
> > GC
> >
>
>
>
> --
> Peter M. Wolanin, Ph.D.
> Momentum Specialist,  Acquia. Inc.
> peter.wola...@acquia.com
>


commit question

2009-05-25 Thread Ashish P

If I add 10 documents to solrServer, as in solrServer.addIndex(docs) (using
the embedded server), and then commit, and the commit fails for some reason,
can I retry the commit later, say after some time, or are the added
documents lost?
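Whether a failed commit leaves the added documents intact depends on why it failed, but the retry itself is easy to wrap; a minimal generic sketch (not SolrJ-specific: the Callable below stands in for a hypothetical solrServer.commit() call):

```java
import java.util.concurrent.Callable;

public class RetryCommit {
    // Retry a commit-like action up to maxTries times,
    // doubling the pause between attempts (exponential backoff).
    static boolean retry(Callable<Boolean> commit, int maxTries, long pauseMs)
            throws InterruptedException {
        for (int i = 0; i < maxTries; i++) {
            try {
                if (commit.call()) {
                    return true;              // commit succeeded
                }
            } catch (Exception e) {
                // treat the failure as transient; fall through and retry
            }
            Thread.sleep(pauseMs);
            pauseMs *= 2;
        }
        return false;                         // gave up after maxTries attempts
    }

    public static void main(String[] args) throws InterruptedException {
        final int[] attempts = {0};
        // Hypothetical commit that fails twice, then succeeds on the third try.
        boolean ok = retry(() -> ++attempts[0] >= 3, 5, 10L);
        System.out.println(ok + " after " + attempts[0] + " attempts");
    }
}
```

If the failure is persistent (disk full, corrupt index), retrying won't help; the cause has to be fixed first, and re-adding the documents afterwards is the safe path.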
