Hi,
which version do you use? 1.4.1 is highly recommended, since previous versions
contained some bugs related to memory usage that could lead to memory leaks.
I had this GC overhead limit in my setup as well. The only workaround that
helped was a daily restart of all instances.
With 1.4.1 this iss
This is in solrconfig.xml:

<searchComponent name="spellcheck" class="solr.SpellCheckComponent">
  <lst name="spellchecker">
    <str name="name">default</str>
    <str name="classname">solr.IndexBasedSpellChecker</str>
    <str name="field">spell</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
    <str name="accuracy">0.7</str>
    <str name="buildOnCommit">true</str>
    <str name="buildOnOptimize">true</str>
  </lst>
  <lst name="spellchecker">
    <str name="name">jarowinkler</str>
    <str name="field">lowerfilt</str>
    <str name="distanceMeasure">org.apache.lucene.search.spell.JaroWinklerDistance</str>
    <str name="spellcheckIndexDir">./spellchecker</str>
  </lst>
</searchComponent>
Do you have any custom code, or is this stock solr (and which version,
and what is the request)?
-Yonik
http://www.lucidimagination.com
On Tue, Jul 27, 2010 at 12:30 AM, Manepalli, Kalyan
wrote:
> Hi,
> I am stuck at this weird problem during querying. While querying the solr
> index I am get
Hi,
I am stuck at this weird problem during querying. While querying the solr
index I am getting the following error.
Index: 52, Size: 16
java.lang.IndexOutOfBoundsException: Index: 52, Size: 16
        at java.util.ArrayList.RangeCheck(ArrayList.java:547)
        at java.util.ArrayList.get(ArrayList.java:32
I think the search log will require a lot of storage, which may make the index
size unreasonably large if stored in Solr.
And the aggregation results may not really fit the Lucene index structure.
:)
kiwi
happy hacking !
On Tue, Jul 27, 2010 at 7:47 AM, Tommy Chheng wrote:
> Alternatively, hav
Man, what types of fields is StatsComponent actually known to work with?
With an sint, it seems to have trouble if there are any documents with null
values for the field. It appears to decide that a null/empty/blank value is
-1325166535, and is thus the minimum value.
At least if I'm interpret
> Short answer: "GC overhead limit exceeded" means "out of memory".
Aha, thanks. So the answer is just "raise your Xmx/heap size, you need more
memory to do what you're doing", yeah?
Jonathan
I want a cache that caches the full result of a query (all steps, including
collapse, highlight, and facet). I read
http://wiki.apache.org/solr/SolrCaching, but can't find a global
cache. Maybe I can use an external cache to store key-value pairs. Is there
any such cache in Solr?
On Mon, Jul 26, 2010 at 7:17 PM, Jonathan Rochkind wrote:
> I am now occasionally getting a Java "GC overhead limit exceeded" error in
> my Solr. This may or may not be related to recently adding much better (and
> more) warming queries.
When memory gets tight, the JVM kicks off a garbage collect
See below:
On Mon, Jul 26, 2010 at 11:49 AM, Pramod Goyal wrote:
> Hi,
> I have a requirement where I need to keep updating certain fields in
> the schema. My requirement is to change some of the fields or add some
> values to a field (multi-valued field). I understand that I can use Solr
>
: Sorry, like the subject, I mean the total number of terms.
it's not stored anywhere, so the only way to fetch it is to actually
iterate all of the terms and count them (that's why LukeRequestHandler is
so slow to compute this particular value).
If I remember right, someone mentioned at one p
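For reference, the brute-force count looks roughly like this with the Lucene 2.9
API that Solr 1.4 ships (a sketch; the index path is just an example):

import java.io.File;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.index.TermEnum;
import org.apache.lucene.store.FSDirectory;

public class TermCounter {
  public static void main(String[] args) throws Exception {
    // Open the index read-only and walk every term once, counting as we go.
    IndexReader reader =
        IndexReader.open(FSDirectory.open(new File("/path/to/solr/data/index")), true);
    TermEnum terms = reader.terms();
    long total = 0;
    while (terms.next()) {
      total++;
    }
    terms.close();
    reader.close();
    System.out.println("total terms: " + total);
  }
}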
It's almost impossible to analyze this kind of thing without seeing your
schema and debug output. You might want to review:
http://wiki.apache.org/solr/UsingMailingLists
Best
Erick
On Mon, Jul 26, 2010 at 9:56 AM, satya swaroop wrote:
> hi all,
>i am a new one to solr and able to implem
I need much more detailed information before I can make sense of your use
case.
Could you provide some sample?
MoreLikeThis sounds in the right neighborhood, but I'm guessing.
Best
Erick
On Mon, Jul 26, 2010 at 9:02 AM, wrote:
>
> Hi,
>
> I would like to implement a similar search feature...
I'm having trouble getting my head around what you're trying to accomplish,
so if this is off base you know why.
But what it smells like is that you're trying to do database-ish things in
a SOLR index, which is almost always the wrong approach. Is there a
way to index redundant data with each doc
We have an index around 25-30G w/ 1 master and 5 slaves. We perform
replication every 30 mins. During replication the disk I/O obviously
shoots up on the slaves to the point where all requests routed to that
slave take a really long time... sometimes to the point of timing out.
Is there any lo
Alternatively, have you considered storing (or I should say indexing)
the search logs with Solr?
This lets you text search across your search queries. You can perform
time range queries with Solr as well.
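For example, something along these lines with SolrJ (just a sketch; the core URL
and the field names query_text/timestamp are made up):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrInputDocument;

public class SearchLogExample {
  public static void main(String[] args) throws Exception {
    CommonsHttpSolrServer server =
        new CommonsHttpSolrServer("http://localhost:8983/solr/searchlogs");

    // Index one search-log entry as a document.
    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "log-1");
    doc.addField("query_text", "cheap flights to paris");
    doc.addField("timestamp", new java.util.Date());
    server.add(doc);
    server.commit();

    // Text search across logged queries, restricted to the last day.
    SolrQuery q = new SolrQuery("query_text:paris");
    q.addFilterQuery("timestamp:[NOW-1DAY TO NOW]");
    QueryResponse rsp = server.query(q);
    System.out.println("matches: " + rsp.getResults().getNumFound());
  }
}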
@tommychheng
Programmer and UC Irvine Graduate Student
Find a great grad school based on
On 7/26/10 4:43 PM, Mark wrote:
We are thinking about using Cassandra to store our search logs. Can
someone point me in the right direction/lend some guidance on design?
I am new to Cassandra and I am having trouble wrapping my head around
some of these new concepts. My brain keeps wanting to g
We are thinking about using Cassandra to store our search logs. Can
someone point me in the right direction/lend some guidance on design? I
am new to Cassandra and I am having trouble wrapping my head around some
of these new concepts. My brain keeps wanting to go back to a RDBMS design.
We wi
: However, when I'm trying this very URL with curl within my (perl) script, I
: receive a NullPointerException:
: CURL-COMMAND: curl -sL
:
http://localhost:8983/solr/select?indent=on&version=2.2&q=*&fq=ListId%3A881&start=0&rows=0&fl=*%2Cscore&qt=standard&wt=standard
it appears you aren't quoting
I am now occasionally getting a Java "GC overhead limit exceeded" error
in my Solr. This may or may not be related to recently adding much
better (and more) warming queries.
I can get it when trying a 'commit', after deleting all documents in my
index, or in other cases.
Anyone run into thi
Sorry, like the subject, I mean the total number of terms.
On Mon, Jul 26, 2010 at 4:03 PM, Jason Rutherglen
wrote:
> What's the fastest way to obtain the total number of docs from the
> index? (The Luke request handler takes a long time to load so I'm
> looking for something else).
>
What's the fastest way to obtain the total number of docs from the
index? (The Luke request handler takes a long time to load so I'm
looking for something else).
Hi *,
I'd like to see how many documents I have in my index with a certain ListId,
in this example ListId 881.
http://localhost:8983/solr/select?indent=on&version=2.2&q=*&fq=ListId%3A881&start=0&rows=0&fl=*%2Cscore&qt=standard&wt=standard
In the browser, the output looks perfect, I indeed have 3
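For what it's worth, the same count can also be read programmatically with SolrJ
from numFound (a sketch; only the ListId filter comes from the URL above, the rest
is assumed):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class CountByListId {
  public static void main(String[] args) throws Exception {
    CommonsHttpSolrServer server =
        new CommonsHttpSolrServer("http://localhost:8983/solr");

    // rows=0: we only care about the total hit count, not the documents.
    SolrQuery q = new SolrQuery("*:*");
    q.addFilterQuery("ListId:881");
    q.setRows(0);

    long count = server.query(q).getResults().getNumFound();
    System.out.println("docs with ListId 881: " + count);
  }
}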
Hi Savannah,
I have just answered this question over on drupal.org.
http://drupal.org/node/811062
Responses 5 and 11 will help you. On the solrconfig.xml side of things
you will only really need Drupal's version.
Although still in alpha, my Nutch module will help you out with integration
I am using Drupal ApacheSolr module to integrate solr with drupal. I already
integrated solr with nutch. I already moved nutch's solrconfig.xml and
schema.xml to solr's example directory, and it works. I tried to append
Drupal's
ApacheSolr module's own solrconfig.xml and schema.xml into the s
Hello all,
I’m working on a project with Solr. I had 1.4.1 working OK using
ExtractingRequestHandler except that it was crashing on some PDFs. I noticed
that Tika bundled with 1.4.1 was 0.4, which was kind of old. I decided to try
updating to 0.7 as per the directions here:
http://wiki.apac
ah okay thx =)
so the class "SolrInputDocument" is only for indexing a document and
"SolrDocument" is for the search?
when Solr indexes a document, the first step is to create a SolrInputDocument.
then the class "DocumentBuilder" creates the Lucene document in the function
"Document toDocument(SolrInputDoc, Schema)"
an L
: No I didn't. I thought you aren't supposed to run optimize on slaves. Well
correct, you should make all changes to the master.
: but it doesn't matter now, as I think it's fixed now. I just added a dummy
: document on master, ran a commit call and then once that executed ran an
: optimize call.
: where is a Jar, containing org.apache.solr.client.solrj.embedded?
Classes in the embedded package are useless w/o the rest of the Solr
internal "core" classes, so they are included directly in the
apache-solr-core-1.4.1.jar.
(i know .. the directory structure doesn't make a lot of sense)
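Once apache-solr-core-1.4.1.jar and its dependencies are on the classpath, usage
looks roughly like this (a sketch; the solr home path is a placeholder):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.embedded.EmbeddedSolrServer;
import org.apache.solr.core.CoreContainer;

public class EmbeddedExample {
  public static void main(String[] args) throws Exception {
    // Point Solr at a home directory containing conf/solrconfig.xml and conf/schema.xml.
    System.setProperty("solr.solr.home", "/path/to/solr/home");
    CoreContainer.Initializer initializer = new CoreContainer.Initializer();
    CoreContainer cores = initializer.initialize();

    // "" selects the default core in a single-core setup.
    EmbeddedSolrServer server = new EmbeddedSolrServer(cores, "");
    System.out.println(server.query(new SolrQuery("*:*")).getResults().getNumFound());

    cores.shutdown();
  }
}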
:
: i want to learn more about the technology.
:
: exists an issue to create really an solrDoc ? Or its in the code only for a
: better understanding of the lucene and solr border ?
There is a real and actual class named "SolrDocument". It is a simpler
object than Lucene's "Document" class becua
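Roughly, the two classes show up on the two sides of SolrJ like this (a sketch;
the server URL and field names are placeholders):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrDocument;
import org.apache.solr.common.SolrInputDocument;

public class DocRoundTrip {
  public static void main(String[] args) throws Exception {
    SolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

    // Indexing side: SolrInputDocument (turned into a Lucene Document internally).
    SolrInputDocument in = new SolrInputDocument();
    in.addField("id", "42");
    in.addField("title", "hello world");
    server.add(in);
    server.commit();

    // Search side: results come back as SolrDocuments, not Lucene Documents.
    SolrDocument out = server.query(new SolrQuery("id:42")).getResults().get(0);
    System.out.println(out.getFieldValue("title"));
  }
}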
Every so often I need to index new batches of scanned PDFs and occasionally
Adobe's OCR can't recognize the text in a couple of these documents. In these
situations I would like to type in a small amount of text onto the document and
have it be extracted by Solr CELL.
Adobe Pro 9 has a numbe
I still assume that what you mean by "search queries data" is just some
other form of document (in this case containing 1 search request per
document).
I'm not sure what you intend to do by that actually, but yes indexing stays
the same (you probably want to mark field "type" as required so you don't
Isn't it always one of these three? (from most likely to least likely,
generally)
Memory
Disk Speed
WebServer and its code
CPU.
Memory and Disk are related, as swapping occurs between them. As long as memory
is high enough, it becomes:
Disk Speed
WebServer and its code
CPU
If the WebServer
If it's not the data that's being searched, you can always encode it before
inserting it. You either have to further encode it to base64 to make it
printable before storing it, or use a binary field.
You probably could also set up an external process that cycles through every
document in
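The encode-before-indexing option could look roughly like this with commons-codec,
which ships with Solr (a sketch; the field names are made up):

import org.apache.commons.codec.binary.Base64;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class Base64StoreExample {
  public static void main(String[] args) throws Exception {
    byte[] raw = loadBinaryPayload();                       // whatever bytes you need to store
    String printable = new String(Base64.encodeBase64(raw), "US-ASCII");

    SolrInputDocument doc = new SolrInputDocument();
    doc.addField("id", "blob-1");
    doc.addField("payload_b64", printable);                 // stored, not searched

    CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");
    server.add(doc);
    server.commit();
    // On retrieval: Base64.decodeBase64(value.getBytes("US-ASCII")) gives the bytes back.
  }

  private static byte[] loadBinaryPayload() {
    return new byte[] {0x01, 0x02, 0x03};                   // placeholder
  }
}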
Thanks for your answer! That's great.
Now, to index the search queries data, is there something special to do? Or does
it stay as usual?
-Original Message-
From: Geert-Jan Brits
To: solr-user@lucene.apache.org
Sent: Mon, Jul 26, 2010 4:57 pm
Subject: Re: 2 type of docs in same schema?
Hi,
    I have a requirement where I need to keep updating certain fields in
the schema. My requirement is to change some of the fields or add some
values to a field (multi-valued field). I understand that I can use Solr
update for this. If I am using Solr update, do I need to publish the entire
I want to learn more about the technology.
Does an issue exist to really create a SolrDoc? Or is it in the code only for a
better understanding of the Lucene and Solr border?
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-Doc-Lucene-Doc-tp995922p99.html
Sent from the
As far as I know this is not needed; the optimized index is automatically
replicated to the
slaves. Therefore something seems to be really wrong with your setup. Maybe the
slave index
got corrupted for some reason? Did you try deleting the data dir + slave restart
for a fresh
replicated index? may
No I didn't. I thought you aren't supposed to run optimize on slaves. Well
but it doesn't matter now, as I think it's fixed now. I just added a dummy
document on master, ran a commit call and then once that executed ran an
optimize call. This triggered snapshooter to replicate the index, which
some
You can easily have different types of documents in 1 core:
1. define searchquery as a field (just as the others in your schema)
2. define type as a field (this allows you to decide which type of documents
to search for, e.g. "type_normal" or "type_search")
Now searching on regular docs becomes:
q
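Roughly the same idea in SolrJ (a sketch; the field names come from the steps
above, the URL and query terms are made up):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class TypedDocsQuery {
  public static void main(String[] args) throws Exception {
    CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

    // Regular documents only: filter on the "type" field.
    SolrQuery normal = new SolrQuery("title:foo");
    normal.addFilterQuery("type:type_normal");

    // Logged search queries only: search the "searchquery" field instead.
    SolrQuery logged = new SolrQuery("searchquery:foo");
    logged.addFilterQuery("type:type_search");

    System.out.println(server.query(normal).getResults().getNumFound());
    System.out.println(server.query(logged).getResults().getNumFound());
  }
}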
did you try an optimize on the slave too?
> Yes I always run an optimize whenever I index on master. In fact I just ran
> an optimize command an hour ago, but it didn't make any difference.
>
Hi Chantal,
did you try to write a custom DIH function
(http://wiki.apache.org/solr/DIHCustomFunctions)?
If not, I think this would be a solution.
Just check whether "${prog.vip}" is an empty string or null.
If so, you need to replace it with a value that can never return anything.
So the vi
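Very roughly, such a function could look like the sketch below (the class name and
the no-match placeholder are invented, and argument parsing is simplified compared
to the wiki examples):

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.Evaluator;

// Registered in data-config.xml, roughly: <function name="nonEmpty" class="..."/>
// and then used as e.g. "${dataimporter.functions.nonEmpty(prog.vip)}".
public class NonEmptyEvaluator extends Evaluator {
  public String evaluate(String expression, Context context) {
    // Simplified: treat the expression as a single variable name to resolve.
    Object resolved = context.getVariableResolver().resolve(expression.trim());
    String value = resolved == null ? "" : resolved.toString().trim();
    // Return a value that can never match anything when the input is empty/null.
    return value.length() == 0 ? "NO_SUCH_VALUE" : value;
  }
}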
I need your expertise on this one...
We would like to index every search query that is passed to our solr engine
(same core).
Our doc format is like this (already in our schema):
title
content
price
category
etc...
Now how to add "search queries" as a field in our schema? Know that the sea
DataImportHandler (DIH) is an add-on to Solr. It lets you import documents
from a number of sources in a flexible way. The only connection DIH has to
Lucene is that Solr uses Lucene as the index engine.
When you work with Solr you naturally talk about Solr Documents, if you were
working with Luce
I have just fixed it.
The problem was related to an operating system value - it was different from what
Solr expected for the incoming datastream.
Regards,
Rafal Zawadzki
On Mon, Jul 26, 2010 at 3:20 PM, Chantal Ackermann <
chantal.ackerm...@btelligent.de> wrote:
> On Mon, 2010-07-26 at 14:46 +0200, Raf
I just checked my config file, and I do have the exact same values for the
deletionPolicy tag as you attached in your email, so I don't really think it
could be this.
--
View this message in context:
http://lucene.472066.n3.nabble.com/slave-index-is-bigger-than-master-index-tp996329p996373.html
Sent fr
Yes I always run an optimize whenever I index on master. In fact I just ran
an optimize command an hour ago, but it didn't make any difference.
--
View this message in context:
http://lucene.472066.n3.nabble.com/slave-index-is-bigger-than-master-index-tp996329p996364.html
Sent from the Solr - Us
hi all,
I am new to solr and was able to implement indexing of documents
by following the solr wiki. Now I am trying to add spellchecking. I
followed the spellcheck component in the wiki but am not getting the suggested
spellings. I first built it with spellcheck.build=true,...
here I give u
Hi,
are you calling optimize on the master to finally remove deleted documents and
merge the index files?
once a day is recommended:
http://wiki.apache.org/solr/SolrPerformanceFactors#Optimization_Considerations
cheers
-----Original Message-----
From: Muneeb Ali [mailto:muneeba...@hotmail.com]
G
Hi,
I think that you may be using a Lucene/Solr IndexDeletionPolicy that does
not remove old commits (and you aren't propagating solr-config via
replication).
You can configure this feature in solrconfig.xml inside the
<deletionPolicy> tag:

<deletionPolicy class="solr.SolrDeletionPolicy">
  <str name="maxCommitsToKeep">1</str>
  <str name="maxOptimizedCommitsToKeep">0</str>
</deletionPolicy>

I hope this can be help
Hi,
I am using Solr 1.4, with a master-slave setup. We have one master
server and two slave servers. It was all working fine, but lately the solr slaves
are behaving strangely. Particularly during replication of the index, the slave
nodes die and always need a restart. Also the index size of slave no
On Mon, 2010-07-26 at 14:46 +0200, Rafal Bluszcz Zawadzki wrote:
> EEE, d MMM HH:mm:ss z
not sure but you might want to try with an uppercase 'Z' for the
timezone (surrounded by single quotes, alternatively). The rest of your
pattern looks fine. But if you still run into problems try differen
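Since DIH's dateTimeFormat is a SimpleDateFormat pattern, it can be sanity-checked
outside DIH as well (a sketch; the sample input strings are invented):

import java.text.SimpleDateFormat;
import java.util.Locale;

public class DateFormatCheck {
  public static void main(String[] args) throws Exception {
    // Lowercase 'z' expects a general time-zone name like "GMT" or "CEST",
    // uppercase 'Z' expects an RFC 822 offset like "+0200".
    SimpleDateFormat general = new SimpleDateFormat("EEE, d MMM HH:mm:ss z", Locale.ENGLISH);
    SimpleDateFormat rfc822  = new SimpleDateFormat("EEE, d MMM HH:mm:ss Z", Locale.ENGLISH);

    System.out.println(general.parse("Mon, 26 Jul 14:46:00 GMT"));
    System.out.println(rfc822.parse("Mon, 26 Jul 14:46:00 +0200"));
  }
}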
Hi,
I would like to implement a similar search feature... but not relative to the
initial search query but relative to each result document.
The structure of each doc is:
id
title
content
price
etc...
Then we have a database of global search queries; I'm thinking of
integrating this in
btw, I want to put all the requestHandlers (more than 1) in 1 xml file and I
want to use that file in my solrConfig.xml.
I have used xinclude but it didn't work ..
please suggest anything
Thanks,
Prasad
--
View this message in context:
http://lucene.472066.n3.nabble.com/2-solr-dataImport-requ
Thank you very much ..
--
View this message in context:
http://lucene.472066.n3.nabble.com/2-solr-dataImport-requests-on-a-single-core-at-the-same-time-tp978649p996190.html
Sent from the Solr - User mailing list archive at Nabble.com.
I am also using other dateFormat strings in the same data handler, and they
work. But not this one.
And this data is fetched from an external source, so I don't have the
possibility to modify it (well, theoretically I can save it, edit it etc., but
this is not the way). Why is this not working with
Hi,
I think there is an open bug for it at:
https://issues.apache.org/jira/browse/SOLR-1902
Using Solr 1.4.1 and upgrading the Tika libraries to a 0.8 snapshot, I also had
to upgrade pdfbox, fontbox and jempbox to 1.2.1; I got no errors and it seems
it's able to index PDFs without any errors (I can query t
I use a format like yyyy-MM-ddThh:mm:ssZ. It works.
2010/7/26 Rafal Bluszcz Zawadzki :
> Hi,
>
> I am using Data Import Handler from Solr 1.4.
>
> Parts of my data-config.xml are:
>
>
> processor="XPathEntityProcessor"
> stream="false"
> forEach="
Hi,
I am using Data Import Handler from Solr 1.4.
Parts of my data-config.xml are:
.
During full-import I got message:
WARNING: Error creating document :
SolrInputDocument[{SearchableText=SearchableText(1.0)={phrase},
parentPaths=parentPaths(1.0)={/site
Hello experts,
where is a Jar, containing org.apache.solr.client.solrj.embedded?
I can't find this package in 'apache-solr-solrj-1.4.[01].jar'.
Also I can't find any other sources than
>http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/src/webapp/src/org/apache/solr/client/solrj/embedded/ , which
... but the code talks about SolrDocuments. These are higher-level
docs, used to construct the Lucene doc to index ... !!?!?!?!?
And the wiki talks about "Build Solr documents by aggregating data from
multiple columns and tables according to configuration"
http://wiki.apache.or
Hi,
my use case is the following:
In a sub-entity I request rows from a database for an input list of
strings:
/* multivalued, not required */
The root entity is "prog" and it has an optional multivalued field
called "vip". When the list of "vip" val
Stockii,
Solr's index is a Lucene Index. Therefore, Solr documents are Lucene
documents.
Kind regards,
- Mitch
--
View this message in context:
http://lucene.472066.n3.nabble.com/Solr-Doc-Lucene-Doc-tp995922p995968.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hello.
I am writing a little text about SOLR and LUCENE and their use of the DIH.
What documents does the DIH create and insert? The wiki talks about
"solr documents", but I thought that solr uses lucene to do this, and so the
DIH creates Lucene Documents, not Solr Documents!?
what are doing the D
Hi ,
There are no required fields unless you specify fields as required. You can
remove or add as many fields as you want.
That is an example schema which shows how fields are configured
--
View this message in context:
http://lucene.472066.n3.nabble.com/schema-xml-tp995696p995800.html
Sent from the S
Hi Girish,
I am not aware of such a thing.
But you could use a middleware to prevent certain fields from being
retrieved, via the 'fl' parameter:
http://wiki.apache.org/solr/CommonQueryParameters#fl
E.g. for your customers the query looks like q=hello&fl=title and for
your admin the query looks like
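In middleware built on SolrJ, for example, that restriction is one call (a sketch;
field names from the example above, everything else assumed):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class RestrictedFieldsQuery {
  public static void main(String[] args) throws Exception {
    CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

    // Customer-facing query: only return the title field.
    SolrQuery customer = new SolrQuery("hello");
    customer.setFields("title");

    // Admin query: no fl restriction, all stored fields come back.
    SolrQuery admin = new SolrQuery("hello");

    System.out.println(server.query(customer).getResults());
    System.out.println(server.query(admin).getResults());
  }
}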
Hi,
I haven't read everything thoroughly but have you considered creating
fields for each of your (I think what you call) "party value"?
So that you can query like "client:Pramod".
You would then be able to facet on client and supplier.
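Something like this SolrJ sketch (field names as suggested above, everything else
is assumed):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.FacetField;
import org.apache.solr.client.solrj.response.QueryResponse;

public class PartyFacets {
  public static void main(String[] args) throws Exception {
    CommonsHttpSolrServer server = new CommonsHttpSolrServer("http://localhost:8983/solr");

    SolrQuery q = new SolrQuery("client:Pramod");
    q.setFacet(true);
    q.addFacetField("client", "supplier");

    QueryResponse rsp = server.query(q);
    for (FacetField ff : rsp.getFacetFields()) {
      System.out.println(ff.getName() + " -> " + ff.getValues());
    }
  }
}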
Cheers,
Chantal
On Fri, 2010-07-23 at 23:23 +0200, Geert
Hi everybody,
I have been working with solr for a while and I have integrated it with
liferay 6.0.3. So every search request from liferay is processed by solr
and its index.
But I have to integrate another system; this system offers me a
webservice. The results of this webservice should be in the resul