What is the performance impact of a fq that matches all docs?

2009-02-20 Thread Peter Wolanin
We are working on integration with the Drupal CMS, and so are writing code that carries out operations that might be relevant for only a small subset of the sites/indexes that might use the integration module. In this regard, I'm wondering if adding to the query (using the dismax or mlt handl

Suggested hardening of Solr schema.jsp admin interface

2009-02-20 Thread Peter Wolanin
My colleague Paul opened this issue and supplied a patch, and I commented on it regarding a potential security weakness in the admin interface: https://issues.apache.org/jira/browse/SOLR-1031 -- Peter M. Wolanin, Ph.D. Momentum Specialist, Acquia, Inc. peter.wola...@acquia.com

Re: mapping pdf metadata

2009-02-20 Thread Erik Hatcher
And when you do use the ExtractingRequestHandler (aka Solr Cell), you can find the metadata fields by using the ext.extract.only=true setting. You might also find this article by Sami Siren helpful: Erik
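A minimal sketch of what such an extract-only request might look like, built but not sent. Only the `ext.extract.only=true` parameter comes from the message above; the handler URL `http://localhost:8983/solr/update/extract`, the raw-body upload with an `application/pdf` content type, and the function name are assumptions for illustration and should be checked against your own solrconfig.xml.

```python
from urllib.parse import urlencode
from urllib.request import Request


def extract_only_request(pdf_bytes, base_url="http://localhost:8983/solr/update/extract"):
    """Build (but do not send) a POST asking Solr Cell to only extract content.

    base_url is a hypothetical handler path, not taken from the thread.
    """
    # ext.extract.only=true is the setting named in the message above.
    query = urlencode({"ext.extract.only": "true"})
    return Request(
        f"{base_url}?{query}",
        data=pdf_bytes,
        headers={"Content-Type": "application/pdf"},  # assumed upload style
        method="POST",
    )


req = extract_only_request(b"%PDF-1.4 placeholder bytes")
print(req.get_method(), req.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; the response should then list the extracted metadata field names that are available for mapping.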

Re: mapping pdf metadata

2009-02-20 Thread Otis Gospodnetic
Josh, You didn't mention whether you are using http://wiki.apache.org/solr/ExtractingRequestHandler, but if you are not, maybe this already has what you need: http://wiki.apache.org/solr/ExtractingRequestHandler#head-c413be32c951c89c0a28f4f8336aa7d2774ec2d6 Otis -- Sematext -- http://semat

Re: show first couple sentences from found doc

2009-02-20 Thread Koji Sekiguchi
Josh Joy wrote:
> Hi, I would like to do something similar to Google, in that for my list of hits, I would like to grab the surrounding text around my query term so I can include that in my search results. What's the easiest way to do this? Thanks, Josh
Highlighter? http://wiki.apache.org/

show first couple sentences from found doc

2009-02-20 Thread Josh Joy
Hi, I would like to do something similar to Google, in that for my list of hits, I would like to grab the surrounding text around my query term so I can include that in my search results. What's the easiest way to do this? Thanks, Josh

mapping pdf metadata

2009-02-20 Thread Josh Joy
Hi, I'm having trouble figuring out how to map the tika metadata fields to my own solr schema document fields. I guess the first hurdle I need to overcome, is where can I find a list of the Tika PDF metadata fields that are available for mapping? Thanks, Josh

Re: Defining shards in solrconfig with multiple cores

2009-02-20 Thread Yonik Seeley
On Fri, Feb 20, 2009 at 10:32 AM, jdleider wrote:
> However when i try to /select using this shards param in the solrconfig.xml the query just hangs.
The basic /select url should normally not have shards set as a default... this will cause infinite recursion when the top level searcher sends re

Re: Question about etag

2009-02-20 Thread Pascal Dimassimo
Sorry, the xml of the solrconfig.xml was lost. It is
> Hi guys, I'm having trouble understanding the behavior of firefox and the etag. After cleaning the cache, I send this request from firefox:
> GET /solr/select/?q=television HTTP/1.1
> Host: localhost:8088
> User-Agent: Mozilla/5.0 (Windows

Re: Updating a single field of a document

2009-02-20 Thread Shalin Shekhar Mangar
On Sat, Feb 21, 2009 at 1:00 AM, Amit Nithian wrote:
> Thanks Otis. Are these Solr specific issues. In looking through Lucene's FAQ, it seems that you would have to delete the document and re-add. Could a possible solution be to find the document by the unique-id and set the fields that w

Re: Updating a single field of a document

2009-02-20 Thread Amit Nithian
Thanks Otis. Are these Solr specific issues. In looking through Lucene's FAQ, it seems that you would have to delete the document and re-add. Could a possible solution be to find the document by the unique-id and set the fields that were changed or would this not scale when doing a lot of document

Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Marc Sturlese
I am working with 3 indexes of 1 GB each. I am using the standard GC settings, haven't changed anything, and am using java version "1.6.0_07". I don't know much about GC configuration... just read this http://marcus.net/blog/2007/11/10/solr-search-and-java-gc-tuning/ when a month ago I exepr

Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Shalin Shekhar Mangar
On Fri, Feb 20, 2009 at 11:23 PM, Marc Sturlese wrote:
> Yes,
> Now it's almost three days non-stop since I am running updates with the 3 cores with cron jobs. If there are updates of 1 docs everything is alright. When I start doing updates of 30 is when that core runs really slow. I

Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Marc Sturlese
Yes, Now it's almost three days non-stop since I am running updates with the 3 cores with cron jobs. If there are updates of 1 docs everything is alright. When I start doing updates of 30 is when that core runs really slow. I have to abort the import in that core and keep updating with less

Question about etag

2009-02-20 Thread Pascal Dimassimo
Hi guys, I'm having trouble understanding the behavior of firefox and the etag. After cleaning the cache, I send this request from firefox:
GET /solr/select/?q=television HTTP/1.1
Host: localhost:8088
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6) Gecko/2009011913 F

Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Shalin Shekhar Mangar
On Fri, Feb 20, 2009 at 10:43 PM, Marc Sturlese wrote:
> Hey,
> Yeah, I patched the bug reported by Ryuuichi of the SimpleDateFormat as well.
> Is there any other known concurrency bug that maybe I am missing?
> In my use case I could manage to index not concurrently but would like to discove

Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Marc Sturlese
Hey, Yeah, I patched the bug reported by Ryuuichi of the SimpleDateFormat as well. Is there any other known concurrency bug that maybe I am missing? In my use case I could manage to index not concurrently but would like to discover why this is happening... Thank you very much! Shalin Shekhar Ma

Re: Add jdbc entity to DataImportHandler in runtime

2009-02-20 Thread Shalin Shekhar Mangar
On Fri, Feb 20, 2009 at 8:01 PM, Rui Pereira wrote:
> Only one more question: doesn't full-import delete all records before execution, or in this case does it only delete the entities passed in the url?
If no 'entity' parameter is specified, a full-import deletes all existing documents. But if a 'e

Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Shalin Shekhar Mangar
On Fri, Feb 20, 2009 at 8:41 PM, Marc Sturlese wrote:
> Hey there,
> I am indexing 3 cores concurrently from 3 different MySQL tables (I do it every 5 minutes with a cron job).
> The three cores use JdbcDataSource as datasource in data-config.xml
> At some point, the core that fetches more my

Defining shards in solrconfig with multiple cores

2009-02-20 Thread jdleider
Hey All, I am trying to load balance two solr installations, solr1 and solr2. Each box is running 4 cores, core0 - core3. I would like to define the shards for each box in solrconfig as such: solr1:8080/solr/core0,solr1:8080/solr/core1,solr1:8080/solr/core2,solr1:8080/solr/core3 Fo

concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Marc Sturlese
Hey there, I am indexing 3 cores concurrently from 3 different MySQL tables (I do it every 5 minutes with a cron job). The three cores use JdbcDataSource as datasource in data-config.xml. At some point, the core that fetches more MySQL rows starts running very slowly until the thread seems to stop

RE: Retrieve last indexed documents...

2009-02-20 Thread Pierre-Yves LANDRON
OK, thanks. That's what I've done; I had hoped that there was a nicer way to go, but after all, it works that way anyway... Cheers, P Landron
> Date: Fri, 20 Feb 2009 06:05:24 -0800
> From: otis_gospodne...@yahoo.com
> Subject: Re: Retrieve last indexed documents...
> To: solr-user@luc

Re: Add jdbc entity to DataImportHandler in runtime

2009-02-20 Thread Rui Pereira
Only one more question: doesn't full-import delete all records before execution, or in this case does it only delete the entities passed in the url? Thanks in advance, Rui Pereira
On Fri, Feb 20, 2009 at 1:07 PM, Shalin Shekhar Mangar <shalinman...@gmail.com> wrote:
> On Fri, Feb 20, 2009 at 5:4

Re: Retrieve last indexed documents...

2009-02-20 Thread Otis Gospodnetic
Pierre, This is the issue to watch: https://issues.apache.org/jira/browse/SOLR-1023 I don't think there is a super nice way to do that currently. You could use the match-all query (*:*) and sort by timestamp desc, and use start=0&rows=1. Using a raw timestamp that includes milliseconds is no
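The query suggested above can be sketched as a URL built from its parameters. This is only an illustration: the base URL `http://localhost:8983/solr/select`, the field name `timestamp`, and the helper name are assumptions, so substitute the handler path and timestamp field from your own setup.

```python
from urllib.parse import urlencode


def latest_docs_url(base_url, timestamp_field="timestamp", rows=1):
    """Build a select URL that returns the most recently indexed documents.

    base_url and timestamp_field are placeholders, not values from the thread.
    """
    params = {
        "q": "*:*",                         # match-all query
        "sort": f"{timestamp_field} desc",  # newest first
        "start": 0,
        "rows": rows,
    }
    return f"{base_url}?{urlencode(params)}"


url = latest_docs_url("http://localhost:8983/solr/select")
print(url)
```

`urlencode` percent-encodes `*:*` and turns the space in the sort clause into `+`, so the result can be requested directly.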

Re: delta-import not giving updated records

2009-02-20 Thread Shalin Shekhar Mangar
1. There is no closing quote in transformer="TemplateTransformer
2. Attribute names are case-sensitive so it should be deltaQuery instead of deltaquery
On Fri, Feb 20, 2009 at 6:48 PM, con wrote:
> Hi alll
> I am trying to run delta-import. For this I am having the below data-config.xml
>
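Because a mis-cased attribute such as deltaquery is easy to overlook, a small check along these lines may help. It is a sketch only: the attribute list contains just the two names mentioned in this thread (deltaQuery and deltaImportQuery), and the sample configuration and function name are made up for illustration, not taken from the original post.

```python
import xml.etree.ElementTree as ET

# Attribute names mentioned in this thread; extend as needed for your config.
KNOWN_ATTRS = ("deltaQuery", "deltaImportQuery")


def miscased_attributes(config_xml):
    """Return (entity name, found attribute, expected attribute) tuples."""
    by_lower = {name.lower(): name for name in KNOWN_ATTRS}
    problems = []
    for entity in ET.fromstring(config_xml).iter("entity"):
        for attr in entity.attrib:
            expected = by_lower.get(attr.lower())
            if expected is not None and attr != expected:
                problems.append((entity.get("name"), attr, expected))
    return problems


# Hypothetical data-config.xml fragment with the casing mistake.
SAMPLE = """
<dataConfig>
  <document>
    <entity name="item" query="select * from item"
            deltaquery="select id from item"/>
  </document>
</dataConfig>
"""

print(miscased_attributes(SAMPLE))
```

The first problem in the list above (a missing closing quote) would surface earlier, since `ET.fromstring` raises `xml.etree.ElementTree.ParseError` on XML that is not well-formed.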

Re: delta-import not giving updated records

2009-02-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
There is a very good chance that the query created by DIH is wrong. Try giving the 'deltaImportQuery' explicitly in the entity.
On Fri, Feb 20, 2009 at 6:48 PM, con wrote:
> Hi alll
> I am trying to run delta-import. For this I am having the below data-config.xml
> driver="ora

delta-import not giving updated records

2009-02-20 Thread con
Hi all, I am trying to run delta-import. For this I have the data-config.xml below. But nothing happens when I call http://localhost:8080/solr/

Re: Add jdbc entity to DataImportHandler in runtime

2009-02-20 Thread Shalin Shekhar Mangar
On Fri, Feb 20, 2009 at 5:44 PM, Rui Pereira wrote:
> Hello all!
> I'm trying to add jdbc entities to Solr in runtime. I can update data-config.xml and reload the file using the reload-config command, but I wanted to make the first index on the new entities (not full-index), that is, add to

Add jdbc entity to DataImportHandler in runtime

2009-02-20 Thread Rui Pereira
Hello all! I'm trying to add jdbc entities to Solr at runtime. I can update data-config.xml and reload the file using the reload-config command, but I want to run the first indexing pass on the new entities only (not a full index), that is, add to the index the data given by the query in the new entities. How can

Re: Field Boosting Code

2009-02-20 Thread Grant Ingersoll
It's in Lucene. See the Field class. Assuming you mean boosting the Field at index time and not boosting the term (text + field name) at query time.
On Feb 20, 2009, at 6:26 AM, dabboo wrote:
> Hi, I was looking into the Solr code and was trying to figure out as where the code for field

Retrieve last indexed documents...

2009-02-20 Thread Pierre-Yves LANDRON
Hello everybody, I suppose this is a very common question, and I'm sorry if it has been answered before : How can I retrieve the last indexed documents (I use a timestamp field defined as ) ? Thanks, Pierre Landron

Boosting Code

2009-02-20 Thread dabboo
Hi, Can anyone please tell me where I can find the actual logic/implementation of field boosting in Solr. I am looking for classes. Thanks, Amit Garg

Field Boosting Code

2009-02-20 Thread dabboo
Hi, I was looking into the Solr code and was trying to figure out where the code for field boosting is written. I am specifically looking for the classes that get called for that functionality. If somebody knows where the code is, it will be of great help. Thanks, Amit Garg

Re: why don't we have a forum for discussion?

2009-02-20 Thread Gunnar Wagenknecht
Martin Lamothe wrote:
> This mailing list overloads my poor BB curve.
You can configure BIS/BES to not deliver mailing list email to your device. Note that this mailing list is already available as a newsgroup via NNTP today. No need to subscribe. Just get an NNTP news reader (e.g. Mozilla Thunderbird). :