How to index large set data

2009-05-20 Thread Jianbin Dai
Hi, I have about 45GB of XML files to be indexed. I am using DataImportHandler. I started the full import 4 hours ago, and it's still running. My computer has 4GB of memory. Any suggestions? Thanks! JB

Re: Plugin Not Found

2009-05-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
what else is there in the solr.home/lib other than this component? On Wed, May 20, 2009 at 9:08 PM, Jeff Newburn wrote: > I tried to change the package name to com.zappos.solr. > > When I declared the search component with: > class="com.zappos.solr.FacetCubeComponent"/> > > I get: > SEVERE: org.

Re: Issue with AND/OR Operator in Dismax Request

2009-05-20 Thread Doug Steigerwald
http://issues.apache.org/jira/browse/SOLR-405 ? It's quite old and it's exactly what you want, but I think it might be the JIRA ticket that Otis mentioned. Using a filter query was what we really needed. I'm also not really sure why you need a dismax query at all. You're not querying for

Re: dataimport.properties; configure writable location?

2009-05-20 Thread Shalin Shekhar Mangar
On Wed, May 20, 2009 at 11:44 PM, Wesley Small wrote: > Is there a place in a core's solrconfig, where one can set the directory/path > where the dataimport.properties file is written to? > It is not configurable right now. Can you please open a jira issue for this? -- Regards, Shalin Shekhar Mangar.

Re: phrase query & word delimiting

2009-05-20 Thread Otis Gospodnetic
Alex, You might want to paste in your tokenizer/token filter config. You may also want to paste in how your analyzer configuration breaks those phrases and what the position of each term is. This will make it easier for others to understand what you have, what doesn't work, and what your opti

XPath query support in Solr Cell

2009-05-20 Thread Eric Pugh
So I am trying to filter down what I am indexing, and the basic XPath queries don't work. For example, working with tutorial.pdf this indexes all the : curl http://localhost:8983/solr/update/extract?ext.idx.attr=true \&ext.def.fl=text\&ext.map.div=foo_t\&ext.capture=div \&ext.literal.id=12

RE: Solr statistics of top searches and results returned

2009-05-20 Thread Plaatje, Patrick
Hi Shalin, Let me investigate. I think the challenge will be in storing/managing these statistics. I'll get back to the list when I have thought of something. Rgrds, Patrick -Original Message- From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com] Sent: Wednesday, May 20, 2009 10:33

Re: Facet counts limit

2009-05-20 Thread Matt Weber
1. The limit parameter takes a signed integer, so the max value is 2,147,483,647. 2. I don't think there is a defined limit, which would mean you are only limited to what your system can handle. Thanks, Matt Weber eSr Technologies http://www.esr-technologies.com On May 20, 2009, at 11:4
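
The figure in point 1 is the largest value a signed 32-bit integer can hold. A small sketch of that arithmetic, assuming the parameter being discussed is Solr's `facet.limit`; the field name `category` and the helper `facet_params` are hypothetical and only serve the illustration:

```python
from urllib.parse import urlencode

# Largest value a signed 32-bit integer can hold: 2^31 - 1.
INT32_MAX = 2**31 - 1


def facet_params(field, limit):
    """Build facet query parameters, rejecting limits above the signed 32-bit maximum."""
    if limit > INT32_MAX:
        raise ValueError(f"facet.limit {limit} exceeds {INT32_MAX}")
    # "category" style field names are placeholders, not taken from the thread.
    return urlencode({"facet": "true", "facet.field": field, "facet.limit": limit})


print(INT32_MAX)
print(facet_params("category", 100))
```

Whether a very large limit is practical is a separate matter; as the reply says, that depends on what the system can handle.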

Facet counts limit

2009-05-20 Thread sachin78
I have two questions: 1) What is the limit on facet counts? e.g. test(10,0). Is this valid? 2) What is the limit on the number of facets? How many facets can a query get? --Sachin -- View this message in context: http://www.nabble.com/Facet-counts-limit-tp23641105p23641105.html Sent from the So

Re: phrase query & word delimiting

2009-05-20 Thread Avlesh Singh
Using an NGramTokenizerFactory as an analyzer for your field would help you achieve the desired result. Here's a nice article - http://coderrr.wordpress.com/2008/05/08/substring-queries-with-solr-acts_as_solr/ Cheers Avlesh On Wed, May 20, 2009 at 11:26 PM, Alex Life wrote: > > Hi All, > > Could you pl

Re: dataimport.properties; configure writable location?

2009-05-20 Thread Wesley Small
Is there a place in a core's solrconfig, where one can set the directory/path where the dataimport.properties file is written to? On 5/20/09 2:09 PM, "Giovanni De Stefano" wrote: > Doh, > > can you please rephrase? > > Giovanni > > On Wed, May 20, 2009 at 3:47 PM, Wesley Small > wrote: > >> In So

Re: dataimport.properties; configure writable location?

2009-05-20 Thread Giovanni De Stefano
Doh, can you please rephrase? Giovanni On Wed, May 20, 2009 at 3:47 PM, Wesley Small wrote: > In Solr 1.3, is there a setting that allows one to modify where the > dataimport.properties file resides? > > In a production environment, the solrconfig directory needs to be > read-only. > I ha

phrase query & word delimiting

2009-05-20 Thread Alex Life
Hi All, Could you please help? I have the following document "Super PowerShot SD" I want this document to be found by both phrase queries below: "super powershot sd" "super power shot sd" Is this possible without a sloppy phrase query (at least theoretically)? I don't see any way setting term/positi

RE: Creating a distributed search in a searchComponent

2009-05-20 Thread Nick Bailey
It seems I sent this out a bit too soon. After looking at the source it seems there are two separate paths for distributed and regular queries, however the prepare method for all components is run before the shards parameter is checked. So I can build the shards portion by using the prepar

Creating a distributed search in a searchComponent

2009-05-20 Thread Nick Bailey
Hi, I am wondering if it is possible to basically add the distributed portion of a search query inside of a searchComponent. I am hoping to build my own component and add it as a first-component to the StandardRequestHandler. Then hopefully I will be able to use this component to build the "s

Re: best way to cache "base" queries (before application of filters)

2009-05-20 Thread Walter Underwood
An HTTP cache will still work. We make three or four back end queries for each search page. We use separate request handlers with filter query specs instead of putting the filter query in the URL, but those two approaches are equivalent for the HTTP cache. We get similar cache hit rates on the fac

Re: best way to cache "base" queries (before application of filters)

2009-05-20 Thread Walter Underwood
How often do you update the indexes? We update once per day, and our HTTP cache has a hit rate of 75% once it gets warmed up. wunder On 5/20/09 9:07 AM, "Otis Gospodnetic" wrote: > > Kent, > > Solr plays nice with HTTP caches. Perhaps the simplest solution is to put > Solr behind a caching se

Re: Issue with AND/OR Operator in Dismax Request

2009-05-20 Thread dabboo
Hi, Yeah you are right. Can you please tell me the URL of JIRA. Thanks, Amit Otis Gospodnetic wrote: > > > Amit, > > That's the same question as the other day, right? > Yes, DisMax doesn't play well with Boolean operators. Check JIRA, it has > a search box, so you may be able to find rela

Re: Shutting down an instance of EmbeddedSolrServer

2009-05-20 Thread Eric Pugh
I created ticket SOLR-1178 for the small tweak. https://issues.apache.org/jira/browse/SOLR-1178 Eric On May 5, 2009, at 12:26 AM, Noble Paul നോബിള്‍ नोब्ळ् wrote: hi Eric, there should be a getter for CoreContainer in EmbeddedSolrServer. Open an issue --Noble On Tue, May 5, 2009 at 12:

Re: Cleanly shutting down Solr/Jetty on Windows

2009-05-20 Thread Eric Pugh
Wouldn't you want to run it as a windows service and use net start/ net stop? If you download and install Jetty it comes with the appropriate scripts to be installed as a service. Eric On May 20, 2009, at 12:39 PM, Chris Harris wrote: I'm running Solr with the default Jetty setup on Wi

Re: best way to cache "base" queries (before application of filters)

2009-05-20 Thread Yonik Seeley
On Wed, May 20, 2009 at 12:43 PM, Yonik Seeley wrote: >    true Of course the examples you gave used the default sort (by score) so this wouldn't help if you do actually need to sort by score. -Yonik http://www.lucidimagination.com

Re: best way to cache "base" queries (before application of filters)

2009-05-20 Thread Yonik Seeley
Some thoughts: #1) This is sort of already implemented in some form... see this section of solrconfig.xml and try uncommenting it: Unfortunately, it's currently a system-wide setting... you can't select it per-query. #2) Your problem might be able to be solved with field collapsing on the "

Cleanly shutting down Solr/Jetty on Windows

2009-05-20 Thread Chris Harris
I'm running Solr with the default Jetty setup on Windows. If I start solr with "java -jar start.jar" from a command window, then I can cleanly shut down Solr/Jetty by hitting Control-C. In particular, this causes the shutdown hook to execute, which appears to be important. However, I don't especia

Re: Issue with AND/OR Operator in Dismax Request

2009-05-20 Thread Otis Gospodnetic
Amit, That's the same question as the other day, right? Yes, DisMax doesn't play well with Boolean operators. Check JIRA, it has a search box, so you may be able to find related patches. I think the patch I was thinking about is actually for something else - allowing field names to be specifie

Re: best way to cache "base" queries (before application of filters)

2009-05-20 Thread Yonik Seeley
On Wed, May 20, 2009 at 12:07 PM, Otis Gospodnetic wrote: > Solr plays nice with HTTP caches.  Perhaps the simplest solution is to put > Solr behind a caching server such as Varnish, Squid, or even Apache? In Kent's case, the other query parameters (the other filters mainly) change, so an extern

Re: best way to cache "base" queries (before application of filters)

2009-05-20 Thread Otis Gospodnetic
Kent, Solr plays nice with HTTP caches. Perhaps the simplest solution is to put Solr behind a caching server such as Varnish, Squid, or even Apache? Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Kent Fitch > To: solr-user@lucene.apach

Re: How to retrieve all available Cores in a "static way" ?

2009-05-20 Thread Giovanni De Stefano
Thank you all for your replies. I guess I will stick with another approach: all my request handlers inherit from a custom base handler which is CoreAware. Its inform(core) method notifies a static map held by another object avoiding duplicates. Thanks again! Giovanni On Wed, May 20, 2009 at 3:

Re: Plugin Not Found

2009-05-20 Thread Jeff Newburn
I tried to change the package name to com.zappos.solr. When I declared the search component with: I get: SEVERE: org.apache.solr.common.SolrException: Unknown Search Component: facetcube at org.apache.solr.core.SolrCore.getSearchComponent(SolrCore.java:874) at org.apache.solr.handler.co

Re: java.lang.RuntimeException: after flush: fdx size mismatch

2009-05-20 Thread James X
Hi Mike, thanks for the quick response: $ java -version java version "1.6.0_11" Java(TM) SE Runtime Environment (build 1.6.0_11-b03) Java HotSpot(TM) 64-Bit Server VM (build 11.0-b16, mixed mode) I hadn't noticed the 268m trigger for LUCENE-1521 - I'm definitely not hitting that yet! The excepti

Re: QueryElevationComponent : hot update of elevate.xml

2009-05-20 Thread Nicolas Pastorino
Hi, On May 12, 2009, at 12:33 , Nicolas Pastorino wrote: Hi, On May 7, 2009, at 6:03 , Noble Paul നോബിള്‍ नोब्ळ् wrote: going forward the java based replication is going to be the preferred means replicating index. It does not support replicating files in the dataDir , it only supports re

Re: Plugin Not Found

2009-05-20 Thread Grant Ingersoll
Just a wild guess here, but... Try doing one of two things: 1. change the package name to be something other than o.a.s 2. Change your config to use solr.FacetCubeComponent You might also try turning on trace level logging for the SolrResourceLoader and report back the output. -Grant On

Re: Plugin Not Found

2009-05-20 Thread Jeff Newburn
Error is below. This error does not appear when I manually copy the jar file into the tomcat webapp directory, only when I try to put it in the solr.home lib directory. SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.apache.solr.handler.component.FacetCubeComponent' at o

dataimport.properties; configure writable location?

2009-05-20 Thread Wesley Small
In Solr 1.3, is there a setting that allows one to modify where the dataimport.properties file resides? In a production environment, the solrconfig directory needs to be read-only. I have observed that the DIH process works regardless, but a whopping error is put in the logs when the dataimpor

Re: How to retrieve all available Cores in a "static way" ?

2009-05-20 Thread Ryan McKinley
I cringe to suggest this but you can use the deprecated call: SolrCore.getSolrCore().getCoreContainer() On May 19, 2009, at 11:21 AM, Giovanni De Stefano wrote: Hello all, I have a quick question but I cannot find a quick answer :-) I have a Java client running on the same JVM where Sol

Re: Optimize

2009-05-20 Thread Erik Hatcher
To send an optimize command, POST an <optimize/> message to /solr/update Erik On May 20, 2009, at 6:49 AM, Gargate, Siddharth wrote: Hi all, I am not sure how to call optimize on the existing index. I tried with following URL http://localhost:9090/solr/update?optimize=true Wi
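
A minimal sketch of the POST described in this reply, assuming the message meant is Solr's XML update command `<optimize/>` and reusing the host and port from the quoted question. The helper name `build_optimize_request` is made up for illustration, and the request is only constructed here, not sent:

```python
from urllib.request import Request


def build_optimize_request(base_url="http://localhost:9090/solr"):
    """Construct (but do not send) a POST of an <optimize/> message to /update."""
    return Request(
        url=base_url.rstrip("/") + "/update",
        data=b"<optimize/>",  # assumed XML update message for an optimize
        headers={"Content-Type": "text/xml; charset=utf-8"},
        method="POST",
    )


req = build_optimize_request()
print(req.get_method(), req.full_url, req.data)
# To actually send it: urllib.request.urlopen(req)
```

An optimize rewrites the index into fewer segments, so a long response time and a temporary growth of the index directory, as reported in the question, are plausible while it runs.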

Re: What are the basic requirements for on-the-fly registration/creation of new Core?

2009-05-20 Thread KK
Thanks for the response. That means I have to have the directory before I pass it to Solr; it's not going to create it by itself. Or will just passing the name make it create a new directory? Do I have to give the full path? It seems I won't be able to register a new core on the fly by just passing the name. D

Optimize

2009-05-20 Thread Gargate, Siddharth
Hi all, I am not sure how to call optimize on the existing index. I tried with following URL http://localhost:9090/solr/update?optimize=true With this request, the response took a long time, and the index folder size doubled. Then again I queried the same URL and index size re

Re: What are the basic requirements for on-the-fly registration/creation of new Core?

2009-05-20 Thread KK
Hi Noble, I downloaded the latest nightly build(20th may) and deployed the nightly-build.war and tried to run the same thing for creating a new core and the bad news is that it didn't work. I tried this /opt/solr/data/${core.name} also /opt/solr/data/${solr.core.name} but it gave me the same error

Re: java.lang.RuntimeException: after flush: fdx size mismatch

2009-05-20 Thread Michael McCandless
Hmm... somehow Lucene is flushing a new segment on closing the IndexWriter, and thinks 1 doc had been added to the stored fields file, yet the fdx file is the wrong size (0 bytes). This check (& exception) are designed to prevent corruption from entering the index, so it's at least good to see Che

Re: How to change the weight of the fields ?

2009-05-20 Thread Vincent Pérès
I tried the following request after changing the dismax configuration: http://localhost:8983/solr/listings/select/?q=novel&qt=dismax&qf=title_s^2.0&fl=title_s+isbn_s&version=2.2&start=0&rows=5&indent=on&debugQuery=on But I don't get any results : novel novel +DisjunctionMaxQuery((title_s:novel^2.0)~0.01) ()
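
For reference, the request above can be assembled programmatically so that the boost in `qf` is escaped correctly. This is only a sketch: the core name `listings` and the field names come from the message, while the helper `dismax_url` is hypothetical. Note that `^` is percent-encoded and the space in `fl` becomes `+`:

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def dismax_url(base, q, qf, fl, start=0, rows=5):
    """Build a select URL that routes the query to the handler named 'dismax'."""
    params = {
        "q": q,
        "qt": "dismax",   # request handler name, as used in the message
        "qf": qf,         # space-separated fields with optional ^boost
        "fl": fl,         # fields to return, space-separated
        "start": start,
        "rows": rows,
        "indent": "on",
        "debugQuery": "on",
    }
    return base + "?" + urlencode(params)


url = dismax_url(
    "http://localhost:8983/solr/listings/select/",
    q="novel",
    qf="title_s^2.0",
    fl="title_s isbn_s",
)
print(url)
# Decoding the query string recovers the original parameter values.
print(parse_qs(urlsplit(url).query)["qf"])
```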

Re: Solr statistics of top searches and results returned

2009-05-20 Thread Shalin Shekhar Mangar
On Wed, May 20, 2009 at 1:31 PM, Plaatje, Patrick < patrick.plaa...@getronics.com> wrote: > > At the moment Solr does not have such functionality. I have written a > plugin for Solr though which uses a second Solr core to store/index the > searches. If you're interested, send me an email and I'll

Re: How to retrieve all available Cores in a "static way" ?

2009-05-20 Thread Andrey Klochkov
AFAIK there's no way of getting it in a "static" way. If you look into SolrDispatchFilter.java, you'll see these lines: // put the core container in request attribute req.setAttribute("org.apache.solr.CoreContainer", cores); So later in your servlet you can get this request attribute, I do it in thi

Re: query clause and filter query

2009-05-20 Thread Andrey Klochkov
Read "Consider using filters" section here: http://wiki.apache.org/lucene-java/ImproveSearchingSpeed On Wed, May 20, 2009 at 10:24 AM, Ashish P wrote: > > what is the difference between query clause and filter query?? > Thanks, > Ashish > -- > View this message in context: > http://www.nabble.c
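
In Solr request terms, the query clause is the `q` parameter, which determines both matching and scoring, while a filter query is an `fq` parameter, which only restricts the result set and can be cached independently of `q`. A small sketch of how the two are passed; the field names and the helper `select_query` are hypothetical:

```python
from urllib.parse import urlencode, parse_qs


def select_query(q, filters=()):
    """Build a query string with one scoring query (q) and any number of filters (fq)."""
    params = [("q", q)]
    # Each filter is sent as its own fq parameter.
    params.extend(("fq", f) for f in filters)
    return urlencode(params)


qs = select_query("novel", ["format:paperback", "lang:en"])
print(qs)
print(parse_qs(qs))
```

Keeping each restriction in its own `fq` lets a later request that changes only `q` reuse the same filters.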

Re: How to change the weight of the fields ?

2009-05-20 Thread Vincent Pérès
Hello, I'm sorry, I made a mistake. I meant: http://localhost:8983/solr/listings/select/?q=novel&qf=title_s^5.0&fl=title_s+isbn_s&version=2.2&start=0&rows=5&indent=on&debugQuery=on (using qf (Query Fields)) But it seems I need to add dismax as well and configure it by default in solr config? Th

RE: Solr statistics of top searches and results returned

2009-05-20 Thread Plaatje, Patrick
Hi, At the moment Solr does not have such functionality. I have written a plugin for Solr though which uses a second Solr core to store/index the searches. If you're interested, send me an email and I'll get you the source for the plugin. Regards, Patrick -Original Message- From: solr

best way to cache "base" queries (before application of filters)

2009-05-20 Thread Kent Fitch
Hi, I'm looking for some advice on how to add "base query" caching to SOLR. Our use-case for SOLR is: - a large Lucene index (32M docs, doubling in 6 months, 110GB increasing x 8 in 6 months) - a frontend which presents views of this data in 5 "categories" by firing off 5 queries with the same s

Re: What are the basic requirements for on-the-fly registration/creation of new Core?

2009-05-20 Thread KK
OK then, I assume that the nightly build will solve my basic problem of "On the fly creation of new cores using dataDir as req parameter", then I can wait for two more hours. One more thing: the new nightly build will be called solr-2009-05-20.tgz, right, as the current one is solr-2009-05-19.tgz, rig
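
A sketch of what such an on-the-fly request could look like, assuming the build in question accepts `dataDir` as a parameter of the CoreAdmin `CREATE` action and that the admin handler is mounted at `/admin/cores`. The host, core name, and paths below are placeholders, and `create_core_url` is a hypothetical helper:

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def create_core_url(name, instance_dir, data_dir, base="http://localhost:8983/solr"):
    """Build a CoreAdmin CREATE URL; the dataDir parameter is assumed to be supported."""
    params = {
        "action": "CREATE",
        "name": name,
        "instanceDir": instance_dir,
        "dataDir": data_dir,
    }
    return f"{base}/admin/cores?{urlencode(params)}"


# Placeholder values for illustration only.
url = create_core_url("core1", "/opt/solr", "/opt/solr/data/core1")
print(url)
print(parse_qs(urlsplit(url).query)["dataDir"])
```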

java.lang.RuntimeException: after flush: fdx size mismatch

2009-05-20 Thread James X
Hello all, I'm running Solr 1.3 in a multi-core environment. There are up to 2000 active cores in each Solr webapp instance at any given time. I've noticed occasional errors such as: SEVERE: java.lang.RuntimeException: after flush: fdx size mismatch: 1 docs vs 0 length in bytes of _h.fdx at