Indexing zip files

2010-04-27 Thread Sandhya Agarwal
Hello, I see that solr 1.4 is bundled with tika 0.4, which does not do proper content extraction of zip files. So, I replaced tika jars with the latest tika 0.7 jars. I still see an issue and the individual files in the zip file are not being indexed. Any configuration I must do to get this wor

Re: Replicate cores from master to slave

2010-04-27 Thread Chris Hostetter
: If I create a new core on a Solr master, is there a way to instruct a : Solr slave to replicate the new core? replication is handled by a handler running inside the core, so the slave has to have the core running before it can start replicating. but as i understand the new cloud stuff (by whic

Re: Don't return field from query

2010-04-27 Thread Lance Norskog
No, there is no "-fieldname" feature. On Tue, Apr 27, 2010 at 11:38 AM, Darren Govoni wrote: > Hi Solrnauts, >    I know I can specify in my query which fields I want. But is there a > way to specify that I DON'T want a given field > returned (e.g. because its too large or whatever). > > thanks f
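Since there is no "-fieldname" syntax, the usual workaround is to name every wanted field explicitly in the fl parameter. A minimal Python sketch of that idea, assuming the client already knows the stored field names; the helper build_fl and the field names are hypothetical and not part of Solr:

```python
def build_fl(all_fields, excluded):
    """Return a comma-separated fl value naming every field except the excluded ones."""
    excluded = set(excluded)
    # Keep the caller's field order so the generated request is predictable.
    return ",".join(f for f in all_fields if f not in excluded)


# Hypothetical schema fields; "body" stands in for the large field to skip.
fields = ["id", "title", "author", "body"]
fl = build_fl(fields, ["body"])
print(fl)
```

The resulting string would be sent as the fl request parameter alongside the query.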

Need help/assistance with Multicore admin/cores?action=CREATE

2010-04-27 Thread Turner, Robbin J
Using the information off the CoreAdmin Wiki, I initially set up Solr with one core with solr.xml looking like the following: The application starts up fine and I can get to the http:///solrcores

Sort by membership of range query

2010-04-27 Thread Simon Droscher
Hi. I'm trying to figure out how to do a particular sort in Solr (1.3). I have an index for a product which includes a startDate field and an endDate field. In the UI we define a product as "active" if the current date is within the startDate, endDate range. Now we can filter search results to
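The ordering being asked for can be illustrated outside Solr. The Python sketch below only shows the intended result (active products first, otherwise keeping the incoming order); it is not a Solr-side solution, and the product dictionaries and helper names are assumptions made for the example:

```python
from datetime import date


def is_active(product, today):
    """A product is active when today falls within [startDate, endDate]."""
    return product["startDate"] <= today <= product["endDate"]


def sort_active_first(products, today):
    # False sorts before True, so "not active" puts active products first.
    # sorted() is stable, so the original relevance order is kept within each group.
    return sorted(products, key=lambda p: not is_active(p, today))


# Hypothetical sample data.
products = [
    {"id": "a", "startDate": date(2009, 1, 1), "endDate": date(2009, 12, 31)},
    {"id": "b", "startDate": date(2010, 1, 1), "endDate": date(2010, 12, 31)},
    {"id": "c", "startDate": date(2010, 4, 1), "endDate": date(2010, 5, 1)},
]
ordered = [p["id"] for p in sort_active_first(products, date(2010, 4, 27))]
print(ordered)
```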

RE: nfs vs sas in production

2010-04-27 Thread Burton-West, Tom
Hi Kallin, Given the previous postings on the list about terrible NFS performance we were pleasantly surprised when we did some tests against a well tuned NFS RAID array on a private network. We got reasonably good results (given our large index sizes.) See http://www.hathitrust.org/blogs/lar

Replicate cores from master to slave

2010-04-27 Thread Jason Rutherglen
If I create a new core on a Solr master, is there a way to instruct a Solr slave to replicate the new core?

Score cutoff

2010-04-27 Thread Satish Kumar
Hi, For some of our queries, the top xx (five or so) results are of very high quality and results after xx are very poor. The difference in score for the high quality and poor quality results is high. For example, 3.5 for high quality and 0.8 for poor quality. We want to exclude results with score

Re: Using NoOpMergePolicy (Lucene 2331) from Solr

2010-04-27 Thread Jason Rutherglen
Tom, Interesting, can you post your findings after you've found them? :) Jason On Tue, Apr 27, 2010 at 2:33 PM, Burton-West, Tom wrote: > Is it possible to use the NoOpMergePolicy ( > https://issues.apache.org/jira/browse/LUCENE-2331   ) from Solr? > > We have very large indexes and always opt

Re: nfs vs sas in production

2010-04-27 Thread Walter Underwood
Look here for a number of messages on this: http://markmail.org/search/solr+nfs You'll find my posting, where indexing on NFS was 100X slower than local disk. And 276 other e-mails on the subject. wunder On Apr 27, 2010, at 2:30 PM, Otis Gospodnetic wrote: > Kallin, > > I don't have experien

Using NoOpMergePolicy (Lucene 2331) from Solr

2010-04-27 Thread Burton-West, Tom
Is it possible to use the NoOpMergePolicy ( https://issues.apache.org/jira/browse/LUCENE-2331 ) from Solr? We have very large indexes and always optimize, so we are thinking about using a very large ramBufferSizeMB and a NoOpMergePolicy and then running an optimize to avoid extra disk reads a

Re: nfs vs sas in production

2010-04-27 Thread Otis Gospodnetic
Kallin, I don't have experience with SAS storage and don't recall SAS being mentioned on Lucene/Solr lists. But I do recall NFS being mentioned on several occasions: http://search-lucene.com/?q=sas+nfs http://search-lucene.com/?q=sas+nfs+san From a few of my quick Google-based self-educat

Re: Solr Spellcheck on Large index size

2010-04-27 Thread Abdelhamid ABID
Hi, With spellcheck.build=true, IMO Solr will build the spellcheck dictionary at each request, so with the 29m documents Solr can pop up from the server with some error like "I quit" :) I would build the dictionary once after data index creation; you may set this option to the spell request
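The "build once" advice can be sketched as two kinds of request: a single one that carries spellcheck.build=true after indexing, and ordinary ones that omit it. A minimal Python sketch, assuming the standard spellcheck and spellcheck.build request parameters; the helper spell_params is hypothetical:

```python
from urllib.parse import urlencode, parse_qs


def spell_params(query, build=False):
    """Build the query string for a spellcheck request.

    build=True adds spellcheck.build=true and is meant to be used once,
    after the index has been created, not on every request.
    """
    params = [("q", query), ("spellcheck", "true")]
    if build:
        params.append(("spellcheck.build", "true"))
    return urlencode(params)


one_time = spell_params("solr", build=True)   # run once after indexing
regular = spell_params("solr")                # every later request
print(one_time)
print(regular)
```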

Solr Spellcheck on Large index size

2010-04-27 Thread Kyle J G
I am trying to create a spell checker for my company's website. Currently there are approx 29 million documents in the index. When trying to create the spelling index it just seems to skip over the command. My fields in schema.xml look like the following:

nfs vs sas in production

2010-04-27 Thread Nagelberg, Kallin
Hey, A question was raised during a meeting about our new Solr based search projects. We're getting 4 cutting edge servers each with something like 24 Gigs of ram dedicated to search. However there is some problem with the amount of SAS based storage each machine can handle, and people wonder i

date facets without intersections

2010-04-27 Thread Király Péter
Dear Solr users, I am interested in whether it is possible to get date facets without intersecting ranges. Currently, documents that fall on the boundaries of ranges are counted in both ranges. An example: facet result (from Solr): 3 3 12 If we translate into queries, it means that the number of d
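The double counting comes from both ends of each range being inclusive. A small Python sketch of the difference between inclusive ranges and half-open ranges [start, end), using made-up dates; this illustrates the counting only and makes no claim about which Solr options provide it:

```python
from datetime import date


def count_inclusive(dates, boundaries):
    """Count dates per range with both ends inclusive (boundary dates count twice)."""
    pairs = list(zip(boundaries, boundaries[1:]))
    return [sum(1 for d in dates if lo <= d <= hi) for lo, hi in pairs]


def count_half_open(dates, boundaries):
    """Count dates per range as [start, end), so each date lands in exactly one range."""
    pairs = list(zip(boundaries, boundaries[1:]))
    return [sum(1 for d in dates if lo <= d < hi) for lo, hi in pairs]


# Hypothetical data: one document sits exactly on the 2010-02-01 boundary.
boundaries = [date(2010, 1, 1), date(2010, 2, 1), date(2010, 3, 1)]
dates = [date(2010, 1, 15), date(2010, 2, 1), date(2010, 2, 10)]

inclusive = count_inclusive(dates, boundaries)
half_open = count_half_open(dates, boundaries)
print(inclusive, half_open)
```

With inclusive ranges the counts add up to more than the number of documents; with half-open ranges they add up to exactly the number of documents inside the overall span.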

Re: How are (multiple) filter queries processed?

2010-04-27 Thread Chris Hostetter
: i was wondering how the following query might be processed: : : ?q=*:*&fq=+tags:(Gucci)&fq=-tags:(watch sunglasses) they are intersected so only documents matching all of them are potential matches. : and if there is a difference to a query with only one fq parameter like : : ?q=*:*&fq=+tag
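The intersection behaviour can be modelled with plain sets. A minimal Python sketch, assuming the default OR operator inside tags:(watch sunglasses) and a tiny in-memory document list; the helpers are hypothetical and only mimic how each fq narrows the candidate set:

```python
def has_gucci(doc):
    """Models fq=+tags:(Gucci)."""
    return "Gucci" in doc["tags"]


def no_watch_or_sunglasses(doc):
    """Models fq=-tags:(watch sunglasses), assuming OR between the two terms."""
    return not ({"watch", "sunglasses"} & set(doc["tags"]))


def apply_filters(docs, filters):
    """Start from all documents (q=*:*) and intersect with each filter's matches."""
    result = {d["id"] for d in docs}
    for f in filters:
        result &= {d["id"] for d in docs if f(d)}
    return result


# Hypothetical documents.
docs = [
    {"id": 1, "tags": ["Gucci", "bag"]},
    {"id": 2, "tags": ["Gucci", "watch"]},
    {"id": 3, "tags": ["Prada", "bag"]},
    {"id": 4, "tags": ["Gucci", "sunglasses"]},
]

two_filters = apply_filters(docs, [has_gucci, no_watch_or_sunglasses])
one_filter = apply_filters(docs, [lambda d: has_gucci(d) and no_watch_or_sunglasses(d)])
print(two_filters, one_filter)
```

Both forms select the same documents in this model; the practical difference is in how the filters are cached, since each separate fq is cached on its own.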

Re: Embedded / Servlet simultaneous use

2010-04-27 Thread Chris Hostetter
: I am wondering if having an EmbeddedServer and regular Servlet running : on the same solr_home, using the former to add to the index and the : latter to query it, makes sense as a server setup. having two instances of Solr pointed at the same solr home (regardless of whether one is Embedded or o

Re: SEVERE: Could not start SOLR. Check solr/home property

2010-04-27 Thread Chris Hostetter
: SEVERE: Could not start SOLR. Check solr/home property it means something went horribly wrong when starting solr, and since this is frequently caused by either an incorrect explicit solr/home or an incorrect implicitly guessed solr home, that is mentioned in the error message as something to

Re: resolutions and chapters

2010-04-27 Thread Chris Hostetter
: as an intermediate hack the best solution i see is to just make the : chapter fields multivalued fields inside the resolutions, this should be : a decent solution for (a), though this way i will not really have any : information on the number of chapters matched (let alone their id's). : thi

Re: Solr does not honor facet.mincount and field.facet.mincount

2010-04-27 Thread Chris Hostetter
: For example: If I want to display facets on fields A, B, C and D. But in : case a field say C, does not have any data, then C should be excluded from : the solr response it doesn't work that way -- it would actually make the parsing code a lot more complicated for most clients because since t

Re: Merging Solr Cores Urgent

2010-04-27 Thread Chris Hostetter
: : The Wiki Documentation says that "Merged" core must exist prior to calling : the merge command : : So I created the "Merged" core and pointed it to some "data dir". : : However even after merging the cores it does still points to the old "data : dir" : : Shouldn't the merge command create a

RE: FSDirectory Synchronization Issues

2010-04-27 Thread Giovanni Fernandez-Kincade
Well right now we're using a nightly build of Solr 1.4 with Lucene 2.9.1. But I would probably just upgrade to a trunk build of Lucene/Solr just to get this working Will do! This bug is pretty disappointing. -Original Message- From: Michael McCandless [mailto:luc...@mikemccandless.

Re: FSDirectory Synchronization Issues

2010-04-27 Thread Michael McCandless
I don't think that'll work anymore (it used to with old versions of Lucene). Hmm but which underlying Lucene version are you using? I don't otherwise know how to configure it -- anyone else? You should also go and vote on the underlying Sun (Oracle!) bug that falsely bottlenecks threads sharing

Don't return field from query

2010-04-27 Thread Darren Govoni
Hi Solrnauts, I know I can specify in my query which fields I want. But is there a way to specify that I DON'T want a given field returned (e.g. because it's too large or whatever). thanks for the tip!

Monitoring via JMX; changing mbean names?

2010-04-27 Thread Dan Trainor
Hi - I've not been able to find a way to change the names of mbeans exposed from solr via JMX. Is this even possible? Am I just plain doing it wrong? For example, when running multiple instances of solr in the same Tomcat instance, each has an associated searc...@1234567 mbean. Alright, I ex

RE: FSDirectory Synchronization Issues

2010-04-27 Thread Giovanni Fernandez-Kincade
I was considering it, but we're already tight on memory usage. How do you configure Solr to use it? Is this correct? http://www.mail-archive.com/solr-user@lucene.apache.org/msg28574.html > So, start your servlet container with > -Dorg.apache.lucene.FSDirectory.class=org.apache.lucene.store.MMap

Re: Minimum Should Match the other way round

2010-04-27 Thread Chris Hostetter
: thank you for joining the discussion :). Heh ... no problem, i was a little behind on my mail for a while there ... but i'm catching up. : 2) If I understood the API-documentation right, the behaviour of the : FieldQParser depends exactly on what I've defined in my analyzer. right ... it's

Re: FSDirectory Synchronization Issues

2010-04-27 Thread Michael McCandless
Try MMapDirectory? Mike On Tue, Apr 27, 2010 at 2:09 PM, Giovanni Fernandez-Kincade wrote: > Hello, > I'm encountering a lot of contention around > SimpleFSDirectory$SimpleFSIndexInput.readInternal, pretty much identical to > what this user described back in 2008: > http://www.mail-archive.com

Re: Spell check suggesting corrections for boost function

2010-04-27 Thread Chris Hostetter
: I'm trying to perform spell checking as part of a query using the Lucene : parser, and I'm finding that the spell checker is giving me suggestions for : the mathematical functions used in my boost clause. Here's my request as : seen through solr admin: spellchecker looks at the literal "q" par

FSDirectory Synchronization Issues

2010-04-27 Thread Giovanni Fernandez-Kincade
Hello, I'm encountering a lot of contention around SimpleFSDirectory$SimpleFSIndexInput.readInternal, pretty much identical to what this user described back in 2008: http://www.mail-archive.com/solr-user@lucene.apache.org/msg15516.html I also found this JIRA issue, where it appears that the conc

Re: Installing Solr under Glassfish

2010-04-27 Thread Chris Hostetter
: Nope, still not right. The index is not created under solr home, instead it is : created relative to $CWD, which in Glassfish is in the configuration area of : the domain. : : How do I get the index back to live under solr.solr.home? what does your dataDir directive in solrconfig.xml look like

Re: Indexing all versions of Microsoft Office Documents

2010-04-27 Thread Shashi Kant
If you are on Windows try the Microsoft IFilter API - it supports current Office versions. http://www.microsoft.com/downloads/details.aspx?FamilyId=60C92A37-719C-4077-B5C6-CAC34F4227CC&displaylang=en On Tue, Apr 27, 2010 at 6:08 AM, Roland Villemoes wrote: > Hi All, > > Does anyone have a runn

Re: Indexing all versions of Microsoft Office Documents

2010-04-27 Thread Otis Gospodnetic
Roland, A better place to ask might in fact be tika-user mailing list. Sorry, I don't have the answer except for this pointer. Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Lucene ecosystem search :: http://search-lucene.com/ - Original Message > From: Roland

Re: AutoSuggest with custom sorting

2010-04-27 Thread Papiya Misra
I guess my basic issue is that Solr scores all matches for prefix searches equally. Any way to score PINK over PINKSHEETS when you are searching for PI ? Thanks Papiya Papiya Misra wrote: Hi I am supposed to implement auto suggest where the prefix matches are sorted based on the following crit

RE: indexer threading?

2010-04-27 Thread Wawok, Brian
Hi Alex, Were you ever able to get the indexing machine to go over about 1 CPU worth of work? I am also curious of how BinaryRequestWriter compares to the StreamingUpdateSolrServer that I was using... Brian -Original Message- From: Alexey Serba [mailto:ase...@gmail.com] Sent: Tues

Re: resolutions and chapters

2010-04-27 Thread Lukas Kahwe Smith
On 26.04.2010, at 12:48, Lukas Kahwe Smith wrote: > Hi, > > I am currently putting together a search for a DB where I have resolutions > along with their metadata as well as chapters, its text and metadata. Most of > the searching will actually be done on the metadata. The plan atm is to > su

Re: indexer threading?

2010-04-27 Thread Alexey Serba
Hi Brian, I was testing indexing performance on a high cpu box recently and came to the same issue. I tried different indexing methods ( xml, CSVRequestHandler and Solrj + BinaryRequestWriter with multiple threads ). The last method is the fastest indeed. I believe that multiple threads approach g

Re: Installing Solr under Glassfish

2010-04-27 Thread Theodore Omtzigt
Nope, still not right. The index is not created under solr home, instead it is created relative to $CWD, which in Glassfish is in the configuration area of the domain. How do I get the index back to live under solr.solr.home? Theo On 4/26/2010 9:32 PM, Theodore Omtzigt wrote: Turns out that I

Re: multiple cores on SOLR under Tomcat

2010-04-27 Thread Jon Baer
I would not use this layout, you are putting important Solr config files outside onto the docroot (presuming we are looking @ the webapps folder) ... here is my current Tomcat project (if it helps): [507][jonbaer.MBP: tomcat]$ pwd /Users/jonbaer/WORKAREA/SVN_HOME/my-project/tomcat [508][jonbaer

Re: multiple cores on SOLR under Tomcat

2010-04-27 Thread Shawn Heisey
Here's how I've got things set up. It's a different directory structure than yours, and I run it under jetty, but hopefully it gives you the basic idea. The dataDir setting is relative to the instanceDir setting. I run jetty with -Dsolr.solr.home=/index/solr so it can find solr.xml. [r...@i

Re: SV: multiple cores on SOLR under Tomcat

2010-04-27 Thread Dimitrios Sferopoulos
Roland Villemoes wrote: Yes, I have a solution running on that. If you remove your "multicore" level or add it to your solr.xml it will work. Roland Villemoes OK removed the "multicore" level but it still doesn't work. Even tried having the absolute path in solr.xml which did not work ei

SV: multiple cores on SOLR under Tomcat

2010-04-27 Thread Roland Villemoes
Yes, I have a solution running on that. If you remove your "multicore" level or add it to your solr.xml it will work. Roland Villemoes -Original Message- From: Sergei Goorov [mailto:goo...@gmail.com] Sent: 27 April 2010 14:37 To: solr-user@lucene.apache.org Subject: Re: multiple c

Re: multiple cores on SOLR under Tomcat

2010-04-27 Thread Sergei Goorov
> My SOLR directory structure is: > > solr >  admin >  home >        bin >        conf >        data >        solr.xml >         multicore >                core0 >                    data >                    conf >                core1 >                    data >                    conf >  META-IN

multiple cores on SOLR under Tomcat

2010-04-27 Thread Dimitrios Sferopoulos
Hi all, I have been trying to set up multiple cores on SOLR that runs under Apache Tomcat but haven't had much luck. I followed the instruction on the wiki but that didn't help much. This is what I get when I browse in: http://devel.edina.ac.uk:20232/solr/admin/cores My SOLR directory struc

AW: copyField for dynamicFields

2010-04-27 Thread Jan Simon Winkelmann
Hi, thanks very much for that. I was actually worried I would have to restructure the index and the interface in our application. regards, Jan-Simon > -Original Message- > From: Naga Darbha [mailto:ndar...@opentext.com] > Sent: Tuesday, 27 April 2010 12:16 > To: solr-user@lu

RE: copyField for dynamicFields

2010-04-27 Thread Naga Darbha
Hi, I think copyField copies the un-processed content (that will be processed by source field) onto the target field and processes it based on target field's type. It is *copied first*. regards, Naga -Original Message- From: Jan Simon Winkelmann [mailto:winkelm...@newsfactory.de] Sen
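The "copied first" order described above can be modelled in a few lines. A minimal Python sketch, assuming two toy analyzers that stand in for the source and target field types; none of these functions are Solr APIs:

```python
def analyze_tokenized(text):
    """Toy analyzer for the source field type: lowercase and split on whitespace."""
    return text.lower().split()


def analyze_verbatim(text):
    """Toy analyzer for the target field type: keep the raw value as a single token."""
    return [text]


def index_document(raw_value, source_analyzer, target_analyzer):
    # The raw, unprocessed value is copied first ...
    copied = raw_value
    # ... and only then is each field analyzed according to its own type.
    return {
        "source": source_analyzer(raw_value),
        "target": target_analyzer(copied),
    }


indexed = index_document("Hello World", analyze_tokenized, analyze_verbatim)
print(indexed)
```

The target field never sees the lowercased tokens of the source field, which is the point of copying before analysis.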

Indexing all versions of Microsoft Office Documents

2010-04-27 Thread Roland Villemoes
Hi All, Does anyone have a running solution indexing Microsoft Office Documents e.g. .docx .xlsx etc. ? I can see a lot of examples using Tika for rich content extraction, but still nothing when it comes to newer versions of Microsoft Office? What libraries to use if not Tika? med venlig hilse

Re: "Solr 1.4 Enterprise Search Server" book examples

2010-04-27 Thread findbestopensource
I downloaded the 5883_Code.zip file but am not able to extract the complete contents. Regards Aditya www.findbestopensource.com On Tue, Apr 27, 2010 at 12:45 AM, Johan Cwiklinski < johan.cwiklin...@ajlsm.com> wrote: > Hello, > > On 26/04/2010 20:53, findbestopensource wrote: > > I am able to s

Re: TikaEntityProcessor in Solr1.4

2010-04-27 Thread Monmohan Singh
typo: Also, is there a timeframe on Solr1. release? should be Also, is there a timeframe on Solr1.5 release? On Tue, Apr 27, 2010 at 8:10 AM, monmohan wrote: > > Hi, > I would like to use TikaEntityProcessor with Solr1.4. > https://issues.apache.org/jira/browse/SOLR-1358 shows that this is added

Re: aaaah. No response, No Result

2010-04-27 Thread stockii
Oh, it works again... but why? I changed my DIH import back again and it works fine like before... but what's the reason for this behavior? This is the query that did not work: query="select i.id, i.shop_id, i.name, ... FROM items_de.shop_items as i JOIN base.shops s ON s.id = i.shop_i

How are (multiple) filter queries processed?

2010-04-27 Thread Alexander Valet
Hi, i was wondering how the following query might be processed: ?q=*:*&fq=+tags:(Gucci)&fq=-tags:(watch sunglasses) and if there is a difference to a query with only one fq parameter like ?q=*:*&fq=+tags:(Gucci) -tags:(watch sunglasses) I am aware of the caching implications but i am not sure

copyField for dynamicFields

2010-04-27 Thread Jan Simon Winkelmann
Hi, i have the following configured in my schema.xml: What I can't quite figure out, is when exactly the data from the _i fields gets copied to the _i_f fields. Does it get processed first (Tokenizer, Filters, etc.) or copied first? I would appreciate any insight. Thanks in advance! Best,

Embedded / Servlet simultaneous use

2010-04-27 Thread Valodim
Hi guys, I am wondering if having an EmbeddedServer and regular Servlet running on the same solr_home, using the former to add to the index and the latter to query it, makes sense as a server setup. The query is not critical, if something is indexed but only later shown in query results due to ca

Re: aaaah. No response, No Result

2010-04-27 Thread stockii
Hey. Sorry for giving too little information. I get no documents when using the Standard Request Handler. Dismax works fine. What I tried: - Delete the index and import. - Delete the index folder and import. - Restart Tomcat. - Logging said commit flush worked. - No exceptions in my log files. I changed my Data-