Re: why don't we have a forum for discussion?

2009-02-20 Thread Gunnar Wagenknecht
Martin Lamothe schrieb:
> This mailing list overloads my poor BB curve.

You can configure BIS/BES to not deliver mailing list email to your device.

Note that this mailing list is already available as a newsgroup via NNTP.
No need to subscribe; just get an NNTP news reader (e.g. Mozilla
Thunderbird). :)

news://news.gmane.org/gmane.comp.jakarta.lucene.solr.user

-Gunnar

-- 
Gunnar Wagenknecht
gun...@wagenknecht.org
http://wagenknecht.org/



Field Boosting Code

2009-02-20 Thread dabboo

Hi,

I was looking into the Solr code, trying to figure out where the
code for field boosting is written. I am specifically looking for the classes
that get called for that functionality.

If somebody knows where the code is, it would be of great help.

Thanks,
Amit Garg
-- 
View this message in context: 
http://www.nabble.com/Field-Boosting-Code-tp22118997p22118997.html
Sent from the Solr - User mailing list archive at Nabble.com.



Boosting Code

2009-02-20 Thread dabboo

Hi,

Can anyone please tell me where I can find the actual logic/implementation
of field boosting in Solr? I am looking for the classes.

Thanks,
Amit Garg
-- 
View this message in context: 
http://www.nabble.com/Boosting-Code-tp22119017p22119017.html
Sent from the Solr - User mailing list archive at Nabble.com.



Retrieve last indexed documents...

2009-02-20 Thread Pierre-Yves LANDRON

Hello everybody,

I suppose this is a very common question, and I'm sorry if it has been answered
before: how can I retrieve the last indexed documents? (I use a timestamp field
with default="NOW".)
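
For reference, the timestamp field in the stock Solr example schema (consistent with the default="NOW" multiValued="false" fragment quoted later in the thread) looks like:

```xml
<field name="timestamp" type="date" indexed="true" stored="true"
       default="NOW" multiValued="false"/>
```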

Thanks,
Pierre Landron

_
Show them the way! Add maps and directions to your party invites. 
http://www.microsoft.com/windows/windowslive/products/events.aspx

Re: Field Boosting Code

2009-02-20 Thread Grant Ingersoll
It's in Lucene; see the Field class. That's assuming you mean boosting the
field at index time and not boosting the term (text + field name) at
query time.
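
For the query-time case, boosts are expressed in Lucene query syntax with a caret. A minimal sketch of building such a query string (the field names title and body are hypothetical, not from the thread):

```python
from urllib.parse import urlencode

# Sketch of Lucene query-syntax term boosting, the query-time case
# distinguished above from index-time Field boosts.
def boosted_query(clauses):
    """Join query clauses, appending ^boost where a boost is given."""
    parts = []
    for clause, boost in clauses:
        parts.append(f"{clause}^{boost}" if boost is not None else clause)
    return " ".join(parts)

q = boosted_query([("title:solr", 2.0), ("body:solr", None)])
params = urlencode({"q": q})  # ready to append to /solr/select?
```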


On Feb 20, 2009, at 6:26 AM, dabboo wrote:



Hi,

I was looking into the Solr code and was trying to figure out as  
where the
code for field boosting is written. I am specifically looking for  
classes,

which gets called for that functionality.

If somebody knows as where the code is, it will be of great help.

Thanks,
Amit Garg
--
View this message in context: 
http://www.nabble.com/Field-Boosting-Code-tp22118997p22118997.html
Sent from the Solr - User mailing list archive at Nabble.com.



--
Grant Ingersoll
http://www.lucidimagination.com/

Search the Lucene ecosystem (Lucene/Solr/Nutch/Mahout/Tika/Droids)  
using Solr/Lucene:

http://www.lucidimagination.com/search



Add jdbc entity to DataImportHandler in runtime

2009-02-20 Thread Rui Pereira
Hello all!
I'm trying to add JDBC entities to Solr at runtime. I can update
data-config.xml and reload the file using the reload-config command, but I
want to run a first import of just the new entities (not a full index), that
is, add to the index the data returned by the queries of the new entities.
How can I manage to do this?

Thanks in advance.


Re: Add jdbc entity to DataImportHandler in runtime

2009-02-20 Thread Shalin Shekhar Mangar
On Fri, Feb 20, 2009 at 5:44 PM, Rui Pereira wrote:

> Hello all!
> I'm trying to add jdbc entities to Solr in runtime. I can update
> data-config.xml and reload the file using the reload-config command, but I
> wanted to make the first index on the new entities (not full-index), that
> is, add to index the data given by the query in the new entities.
> How can I manage to do this?
>

You can pass 'entity=<first-entity-name>&entity=<second-entity-name>' when
calling full-import to import only the specified entities.
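
For example (entity names, host, and port are illustrative):

```python
from urllib.parse import urlencode

# Sketch: restrict full-import to newly added entities by repeating
# the 'entity' parameter in the request.
params = urlencode(
    {"command": "full-import", "entity": ["new_entity_a", "new_entity_b"]},
    doseq=True,  # emit entity=... once per list element
)
url = "http://localhost:8983/solr/dataimport?" + params
```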

-- 
Regards,
Shalin Shekhar Mangar.


delta-import not giving updated records

2009-02-20 Thread con

Hi all,

I am trying to run delta-import. For this I am using the below
data-config.xml:

<dataConfig>
  <dataSource driver="oracle.jdbc.driver.OracleDriver"
              url="***" user="" password="*"/>
  <document>
    <entity transformer="TemplateTransformer pk="USER_ID"
            query="select USERS.USER_ID, USERS.USER_NAME, USERS.CREATED_TIMESTAMP
                   FROM USERS, CUSTOMERS where USERS.USER_ID = CUSTOMERS.USER_ID"
            deltaquery="select USERS.USER_ID, USERS.USER_NAME,
                        USERS.CREATED_TIMESTAMP FROM USERS, CUSTOMERS
                        where USERS.USER_ID = CUSTOMERS.USER_ID" >
    </entity>
  </document>
</dataConfig>
But nothing happens when I call
http://localhost:8080/solr/users/dataimport?command=delta-import, whereas
dataimport.properties does get updated with the time at which the
delta-import was run.

http://localhost:8080/solr/users/dataimport?command=full-import, on the other
hand, inserts data properly.

Can anybody suggest what is wrong with this configuration?

Thanks
con


-- 
View this message in context: 
http://www.nabble.com/delta-import-not-giving-updated-records-tp22120184p22120184.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: delta-import not giving updated records

2009-02-20 Thread Noble Paul നോബിള്‍ नोब्ळ्
There is a very good chance that the query created by DIH is wrong.
Try giving the 'deltaImportQuery' explicitly in the entity.

On Fri, Feb 20, 2009 at 6:48 PM, con  wrote:
>
> Hi alll
>
> I am trying to run delta-import. For this I am having the below
> data-config.xml
>
> 
> driver="oracle.jdbc.driver.OracleDriver"
> url="***" user="" password="*"/>
>
>query="select USERS.USER_ID, USERS.USER_NAME, 
> USERS.CREATED_TIMESTAMP
> FROM USERS, CUSTOMERS where USERS.USER_ID = CUSTOMERS.USER_ID"
>
>deltaquery="select USERS.USER_ID, USERS.USER_NAME,
> USERS.CREATED_TIMESTAMP FROM USERS, CUSTOMERS where USERS.USER_ID =
> CUSTOMERS.USER_ID" >
>
>
>
> 
>
> But nothing is happening when i call
> http://localhost:8080/solr/users/dataimport?command=delta-import. Whereas
> the dataimport.properties is getting updated with the time at which
> delta-import is run.
>
> Where as http://localhost:8080/solr/users/dataimport?command=full-import is
> properly inserting data.
>
> Can anybody suggest what is wrong with this configuration.
>
> Thanks
> con
>
>
> --
> View this message in context: 
> http://www.nabble.com/delta-import-not-giving-updated-records-tp22120184p22120184.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>



-- 
--Noble Paul


Re: delta-import not giving updated records

2009-02-20 Thread Shalin Shekhar Mangar
1. There is no closing quote in transformer="TemplateTransformer
2. Attribute names are case-sensitive, so it should be deltaQuery instead of
deltaquery.
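
With both fixes applied, the entity would look roughly like the sketch below. The entity name and the <field> mappings weren't visible in the archived message, so they are hypothetical here; also, a deltaQuery normally filters on ${dataimporter.last_index_time}, shown below as the usual pattern:

```xml
<entity name="users" pk="USER_ID" transformer="TemplateTransformer"
        query="select USERS.USER_ID, USERS.USER_NAME, USERS.CREATED_TIMESTAMP
               from USERS, CUSTOMERS where USERS.USER_ID = CUSTOMERS.USER_ID"
        deltaQuery="select USERS.USER_ID from USERS, CUSTOMERS
                    where USERS.USER_ID = CUSTOMERS.USER_ID
                    and USERS.CREATED_TIMESTAMP &gt; '${dataimporter.last_index_time}'">
</entity>
```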

On Fri, Feb 20, 2009 at 6:48 PM, con  wrote:

>
> Hi alll
>
> I am trying to run delta-import. For this I am having the below
> data-config.xml
>
> 
> driver="oracle.jdbc.driver.OracleDriver"
> url="***" user="" password="*"/>
>
> transformer="TemplateTransformer pk="USER_ID"
>query="select USERS.USER_ID, USERS.USER_NAME,
> USERS.CREATED_TIMESTAMP
> FROM USERS, CUSTOMERS where USERS.USER_ID = CUSTOMERS.USER_ID"
>
>deltaquery="select USERS.USER_ID, USERS.USER_NAME,
> USERS.CREATED_TIMESTAMP FROM USERS, CUSTOMERS where USERS.USER_ID =
> CUSTOMERS.USER_ID" >
>
>
>
> 
>
> But nothing is happening when i call
> http://localhost:8080/solr/users/dataimport?command=delta-import. Whereas
> the dataimport.properties is getting updated with the time at which
> delta-import is run.
>
> Where as http://localhost:8080/solr/users/dataimport?command=full-importis
> properly inserting data.
>
> Can anybody suggest what is wrong with this configuration.
>
> Thanks
> con
>
>
> --
> View this message in context:
> http://www.nabble.com/delta-import-not-giving-updated-records-tp22120184p22120184.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
Regards,
Shalin Shekhar Mangar.


Re: Retrieve last indexed documents...

2009-02-20 Thread Otis Gospodnetic

Pierre,

This is the issue to watch: https://issues.apache.org/jira/browse/SOLR-1023

I don't think there is a particularly nice way to do that currently. You could use 
the match-all query (*:*), sort by timestamp desc, and use start=0&rows=1.  
Using a raw timestamp that includes milliseconds is not recommended unless you 
really need milliseconds.


Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Pierre-Yves LANDRON 
> To: solr-user@lucene.apache.org
> Sent: Friday, February 20, 2009 8:04:28 PM
> Subject: Retrieve last indexed documents...
> 
> 
> Hello everybody,
> 
> I suppose this is a very common question, and I'm sorry if it has been 
> answered 
> before : How can I retrieve the last indexed documents (I use a timestamp 
> field 
> defined as 
> default="NOW" multiValued="false"/>) ? 
> 
> Thanks,
> Pierre Landron
> 
> _
> Show them the way! Add maps and directions to your party invites. 
> http://www.microsoft.com/windows/windowslive/products/events.aspx



Re: Add jdbc entity to DataImportHandler in runtime

2009-02-20 Thread Rui Pereira
Only one more question: doesn't full-import delete all records before
execution, or in this case does it delete only the documents of the entities
passed in the URL?

Thanks in advance,
Rui Pereira


On Fri, Feb 20, 2009 at 1:07 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:

> On Fri, Feb 20, 2009 at 5:44 PM, Rui Pereira  >wrote:
>
> > Hello all!
> > I'm trying to add jdbc entities to Solr in runtime. I can update
> > data-config.xml and reload the file using the reload-config command, but
> I
> > wanted to make the first index on the new entities (not full-index), that
> > is, add to index the data given by the query in the new entities.
> > How can I manage to do this?
> >
>
> You can use 'entity=&entity=' when
> calling full-import to import only the specified entities.
>
> --
> Regards,
> Shalin Shekhar Mangar.
>


RE: Retrieve last indexed documents...

2009-02-20 Thread Pierre-Yves LANDRON

OK, thanks,

That's what I've done; I had kind of hoped that there was a nicer way to go,
but after all, it works that way anyway...

Cheers,
P Landron

> Date: Fri, 20 Feb 2009 06:05:24 -0800
> From: otis_gospodne...@yahoo.com
> Subject: Re: Retrieve last indexed documents...
> To: solr-user@lucene.apache.org
> 
> 
> Pierre,
> 
> This is the issue to watch: https://issues.apache.org/jira/browse/SOLR-1023
> 
> I don't think there is a super nice way to do that currently.  You could use 
> the match-all query (*:*) and sort by timestamp desc, and use start=0&rows=1. 
>  Using a raw timestamp that includes milliseconds is not recommended unless 
> you really need milliseconds.
> 
> 
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> 
> 
> - Original Message 
> > From: Pierre-Yves LANDRON 
> > To: solr-user@lucene.apache.org
> > Sent: Friday, February 20, 2009 8:04:28 PM
> > Subject: Retrieve last indexed documents...
> > 
> > 
> > Hello everybody,
> > 
> > I suppose this is a very common question, and I'm sorry if it has been 
> > answered 
> > before : How can I retrieve the last indexed documents (I use a timestamp 
> > field 
> > defined as 
> > default="NOW" multiValued="false"/>) ? 
> > 
> > Thanks,
> > Pierre Landron
> > 
> > _
> > Show them the way! Add maps and directions to your party invites. 
> > http://www.microsoft.com/windows/windowslive/products/events.aspx
> 


concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Marc Sturlese

Hey there,
I am indexing 3 cores concurrently from 3 different MySQL tables (I do it
every 5 minutes with a cron job).
The three cores use JdbcDataSource as the datasource in data-config.xml.
At some point, the core that fetches the most MySQL rows starts running so
slowly that the thread seems to stop (but the other two keep working
fine)... yet Java doesn't throw an exception.
I am using a nightly from early January. I found that someone experienced the
same problem and uploaded a templateString patch to make it thread-safe.

http://www.nabble.com/Concurrency-problem-with-delta-import-td21665540.html#a21665540

The thing is, even with this patch, the problem doesn't disappear.
Does someone know what is happening?
Thank you.
-- 
View this message in context: 
http://www.nabble.com/concurrency-problem-with-delta-import-%28indexing-various-cores-simultaniously%29-tp22120430p22120430.html
Sent from the Solr - User mailing list archive at Nabble.com.



Defining shards in solrconfig with multiple cores

2009-02-20 Thread jdleider

Hey All,

I am trying to load balance two solr installations, solr1 and solr2. Each
box is running 4 cores, core0 - core3. I would like to define the shards for
each box in solrconfig.xml as such:

<lst name="defaults">
  <str name="shards">solr1:8080/solr/core0,solr1:8080/solr/core1,solr1:8080/solr/core2,solr1:8080/solr/core3</str>
</lst>

For whatever reason /admin works. However, when I try to /select using
this shards param from solrconfig.xml, the query just hangs. I've looked
everywhere trying to figure this one out, and the syntax looks right. The
query works as it is supposed to when the shards param is removed from
solrconfig.xml and appended to the URL. However, I can't use the load
balancer if I have to specify the shards hosts in the URL.

Am I doing something wrong or is this not supported yet? Is there a
workaround that I can use?

Thanks!

Justin

-- 
View this message in context: 
http://www.nabble.com/Defining-shards-in-solrconfig-with-multiple-cores-tp22120446p22120446.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Shalin Shekhar Mangar
On Fri, Feb 20, 2009 at 8:41 PM, Marc Sturlese wrote:

>
> Hey there,
> I am indexing 3 cores concurrently from 3 diferent mysql tables (I do it
> every 5 minutes with a cron job).
> The three cores use JdbcDataSource as datasource in data-config.xml
> Reached a point, the core that fetches more mysql rows starts running so so
> solw until the thread seems to stop (but the other tow keep working
> fine)...but java doesn't throw and exception...
> I am using a nightly from early january. I found someone experienced the
> same problem and uploaded a templateString patch to make it thread-save.
>

Marc, I'd strongly recommend using a more recent nightly build. There was
another problem related to unsafe usage of SimpleDateFormat which was fixed
recently.

See https://issues.apache.org/jira/browse/SOLR-1017 (which was fixed on 11th
Feb)
-- 
Regards,
Shalin Shekhar Mangar.


Re: Add jdbc entity to DataImportHandler in runtime

2009-02-20 Thread Shalin Shekhar Mangar
On Fri, Feb 20, 2009 at 8:01 PM, Rui Pereira wrote:

> Only one more question: doesn't full-import deletes all records before
> execution, or in this case only deletes the entities passed in the url?
>

If no 'entity' parameter is specified, a full-import deletes all existing
documents. But if an 'entity' is specified, then the deleteQuery is not
executed; there's no way for DataImportHandler to figure out which documents
were generated by which entity.

You can use the 'preImportDeleteQuery' attribute on an entity to specify a
delete query which can delete the documents created by that entity.

http://wiki.apache.org/solr/DataImportHandler#head-70d3fdda52de9ee4fdb54e1c6f84199f0e1caa76
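
A sketch of that attribute (entity name, columns, and the marker field used in the delete query are all hypothetical):

```xml
<entity name="products" pk="ID"
        preImportDeleteQuery="source:products"
        query="select ID, NAME from PRODUCTS">
  <!-- deletes only docs this entity created, assuming each doc
       stores a marker value like source:products -->
  <field column="ID" name="id"/>
  <field column="NAME" name="name"/>
</entity>
```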

-- 
Regards,
Shalin Shekhar Mangar.


Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Marc Sturlese

Hey,
Yeah, I applied Ryuuichi's SimpleDateFormat patch as well.
Is there any other known concurrency bug that maybe I am missing?
In my use case I could manage to index non-concurrently, but I would like to
discover why this is happening...

Thank you very much!



Shalin Shekhar Mangar wrote:
> 
> On Fri, Feb 20, 2009 at 8:41 PM, Marc Sturlese
> wrote:
> 
>>
>> Hey there,
>> I am indexing 3 cores concurrently from 3 diferent mysql tables (I do it
>> every 5 minutes with a cron job).
>> The three cores use JdbcDataSource as datasource in data-config.xml
>> Reached a point, the core that fetches more mysql rows starts running so
>> so
>> solw until the thread seems to stop (but the other tow keep working
>> fine)...but java doesn't throw and exception...
>> I am using a nightly from early january. I found someone experienced the
>> same problem and uploaded a templateString patch to make it thread-save.
>>
> 
> Marc, I'd strongly recommend using a more recent nightly build. There was
> another problem related to unsafe usage of SimpleDateFormat which was
> fixed
> recently.
> 
> See https://issues.apache.org/jira/browse/SOLR-1017 (which was fixed on
> 11th
> Feb)
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/concurrency-problem-with-delta-import-%28indexing-various-cores-simultaniously%29-tp22120430p22123287.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Shalin Shekhar Mangar
On Fri, Feb 20, 2009 at 10:43 PM, Marc Sturlese wrote:

>
> Hey,
> Yeah, I patched the bug reported by Ryuuichi of the SimpleDateFormat
> aswell.
> Is there any other known concurrency bug that maybe I am missing?
> In my use case I could manage to index not concurrently but would like to
> discover why this is happening...
>
> Thank you very much!
>
>
I don't see any obvious issue except for these two fixes. Are you
experiencing this problem even after applying both of Ryuuichi's fixes?

-- 
Regards,
Shalin Shekhar Mangar.


Question about etag

2009-02-20 Thread Pascal Dimassimo

Hi guys,
 
I'm having trouble understanding the behavior of Firefox and the ETag.
 
After clearing the cache, I send this request from Firefox:
 
GET /solr/select/?q=television HTTP/1.1
Host: localhost:8088
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6) 
Gecko/2009011913 Firefox/3.0.6 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: JSESSIONID=AA71D602A701BB6287C60083DD6879CD
 
Which solr responds with:
 
HTTP/1.1 200 OK
Last-Modified: Thu, 19 Feb 2009 19:57:14 GMT
ETag: "NmViOTJkMjc1ODgwMDAwMFNvbHI="
Content-Type: text/xml; charset=utf-8
Transfer-Encoding: chunked
Server: Jetty(6.1.3)
(#data following#)
 
So far so good. But then I press F5 to refresh the page. If I understand 
correctly how the ETag works, Firefox should resend the request with an 
"If-None-Match" header carrying the ETag, and the server should return a 304 
"Not Modified" code.
 
But what happens is that Firefox just doesn't send anything. In the Firebug 
window, I only see "0 requests". Just to make sure, I tested with tcpmon, and 
nothing is sent by Firefox.
 
Does this make sense? Am I missing something?
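
The validation handshake expected here can be sketched as a tiny server-side function (a simplification that only shows the If-None-Match comparison, not full HTTP caching):

```python
# Simplified sketch of ETag validation: the browser echoes the ETag it
# cached; when it matches the server's current ETag, the server answers
# 304 Not Modified instead of resending the body.
def response_status(request_headers, current_etag):
    sent = request_headers.get("If-None-Match")
    if sent is None:
        return 200  # no validator sent: full response
    tags = [t.strip() for t in sent.split(",")]
    return 304 if current_etag in tags or "*" in tags else 200

etag = '"NmViOTJkMjc1ODgwMDAwMFNvbHI="'
first = response_status({}, etag)                         # first load
refresh = response_status({"If-None-Match": etag}, etag)  # expected on F5
```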
 
My solrconfig.xml has this config:


 
 
Thanks!



Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Marc Sturlese

Yes,
It has now been almost three days non-stop that I have been running updates on
the 3 cores with cron jobs. If there are updates of 1 docs, everything is
alright. When I start doing updates of 30, that core runs really
slow; I have to abort the import in that core and keep updating with fewer
rows each time.
Another thing to point out is that Tomcat reaches the maximum memory I allow
(2 gig) and never goes down (but at least it doesn't run out of memory). Is
that normal? Shouldn't the memory drop a lot after an update is
completed?

Thank you very much!


Shalin Shekhar Mangar wrote:
> 
> On Fri, Feb 20, 2009 at 10:43 PM, Marc Sturlese
> wrote:
> 
>>
>> Hey,
>> Yeah, I patched the bug reported by Ryuuichi of the SimpleDateFormat
>> aswell.
>> Is there any other known concurrency bug that maybe I am missing?
>> In my use case I could manage to index not concurrently but would like to
>> discover why this is happening...
>>
>> Thank you very much!
>>
>>
> I don't see any obvious issue except for these two fixes. Are you
> experiencing this problem even after applying both of Ryuuichi's fixes?
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/concurrency-problem-with-delta-import-%28indexing-various-cores-simultaniously%29-tp22120430p22125443.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Shalin Shekhar Mangar
On Fri, Feb 20, 2009 at 11:23 PM, Marc Sturlese wrote:

>
> Yes,
> Now it's almost tree days non-stop since I am running updates with the 3
> cores with cron jobs. If there are updates of 1 docs everything is
> alrite. When I start doing updates of 30 is when that core runs really
> slow. I have to abort the import in that core and keep updating with less
> rows each time.
> Another thing to point is that tomcat reaches the maximum memory I allow
> (2Gig) and never goes down (but at least it doesn't run out of memory). Is
> that normal? Shouldn't the memory go down a lot after an update is
> completed?
>

I suspect you are being hit by garbage collection. Memory utilization should
go down once an import completes. Which GC are you using? There have been a
few recent threads on GC settings; perhaps you can try out some of those
settings. I don't know how big your documents/index are, but if possible give
it more memory.

-- 
Regards,
Shalin Shekhar Mangar.


Re: concurrency problem with delta-import (indexing various cores simultaniously)

2009-02-20 Thread Marc Sturlese

I am working with 3 indexes of 1 gig each. I am using the standard settings of
the GC, haven't changed anything, and am using Java version 1.6.0_07.
I don't know much about GC configuration... I just read this:

http://marcus.net/blog/2007/11/10/solr-search-and-java-gc-tuning/

when, a month ago, I experienced another problem with Solr (in the end it was
not the GC's fault). So, any advice on which GC I should try or what I should
tune?

Thank you very much!



Shalin Shekhar Mangar wrote:
> 
> On Fri, Feb 20, 2009 at 11:23 PM, Marc Sturlese
> wrote:
> 
>>
>> Yes,
>> Now it's almost tree days non-stop since I am running updates with the 3
>> cores with cron jobs. If there are updates of 1 docs everything is
>> alrite. When I start doing updates of 30 is when that core runs
>> really
>> slow. I have to abort the import in that core and keep updating with less
>> rows each time.
>> Another thing to point is that tomcat reaches the maximum memory I allow
>> (2Gig) and never goes down (but at least it doesn't run out of memory).
>> Is
>> that normal? Shouldn't the memory go down a lot after an update is
>> completed?
>>
> 
> I guess you are being hit by garbage collection. Memory utilization should
> go down once an import completes. Which GC are you using? There have been
> a
> few recent threads on GC settings. Perhaps you can try out a few of those
> settings. I don't know how big your documents/index are but if possible
> give
> it more memory.
> 
> -- 
> Regards,
> Shalin Shekhar Mangar.
> 
> 

-- 
View this message in context: 
http://www.nabble.com/concurrency-problem-with-delta-import-%28indexing-various-cores-simultaniously%29-tp22120430p22125716.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Updating a single field of a document

2009-02-20 Thread Amit Nithian
Thanks Otis. Are these Solr-specific issues? Looking through Lucene's
FAQ, it seems that you would have to delete the document and re-add it. Could a
possible solution be to find the document by its unique id and set only the
fields that changed, or would this not scale when doing a lot of
document field updates?
Which JIRA issues were you referring to?
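
Absent in-place updates, the workaround is to resend the whole document with the changed field included. A minimal sketch of building such a Solr XML update message (field names are hypothetical):

```python
from xml.sax.saxutils import escape

# Sketch of the delete-and-re-add reality: posting a document whose
# uniqueKey already exists replaces the whole document, so the caller
# must resend every field, not just the changed one.
def add_doc_xml(doc):
    """Build a Solr <add> message for one complete document."""
    fields = "".join(
        f'<field name="{name}">{escape(str(value))}</field>'
        for name, value in doc.items()
    )
    return f"<add><doc>{fields}</doc></add>"

# Full document, with only 'title' actually changed:
xml = add_doc_xml({"id": "42", "title": "Updated title", "views": 7})
```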

Thanks
Amit

On Thu, Feb 19, 2009 at 6:57 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:

>
> Amit,
>
> This is still the case.  I believe 2 separate issues related to this exist
> in JIRA, but none is in a finished state.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> - Original Message 
> > From: Amit Nithian 
> > To: solr-user@lucene.apache.org
> > Sent: Friday, February 20, 2009 7:00:03 AM
> > Subject: Updating a single field of a document
> >
> > Is there a way in Solr 1.2 (or Solr 1.3) to update a single field of an
> > existing document if I know the primary key? Reason I ask is that I
> > construct a document from multiple sources and some fields may need
> periodic
> > updating from one of those sources. I would prefer not to have to
> > reconstruct the entire document (and hence query the multiple sources)
> for a
> > single field change.
> > I noticed that Solr 1.2 will delete and add the new document rather than
> > replace individual fields. Is there a way around this?
> >
> > Thanks
> > Amit
>
>


Re: Updating a single field of a document

2009-02-20 Thread Shalin Shekhar Mangar
On Sat, Feb 21, 2009 at 1:00 AM, Amit Nithian  wrote:

> Thanks Otis. Are these Solr specific issues. In looking through Lucene's
> FAQ, it seems that you would have to delete the document and re-add. Could
> a
> possible solution be to find the document by the unique-id and set the
> fields that were changed or would this not scale when doing a lot of
> document field updates?
> Which JIRA issues were you referring to?
>

https://issues.apache.org/jira/browse/SOLR-139
https://issues.apache.org/jira/browse/SOLR-828

-- 
Regards,
Shalin Shekhar Mangar.


Re: Question about etag

2009-02-20 Thread Pascal Dimassimo

Sorry, the XML of my solrconfig.xml was lost. Here it is:






Hi guys,
 
I'm having trouble understanding the behavior of Firefox and the ETag.
 
After clearing the cache, I send this request from Firefox:
 
GET /solr/select/?q=television HTTP/1.1
Host: localhost:8088
User-Agent: Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.9.0.6)
Gecko/2009011913 Firefox/3.0.6 (.NET CLR 3.5.30729)
Accept: text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Cookie: JSESSIONID=AA71D602A701BB6287C60083DD6879CD
 
Which solr responds with:
 
HTTP/1.1 200 OK
Last-Modified: Thu, 19 Feb 2009 19:57:14 GMT
ETag: "NmViOTJkMjc1ODgwMDAwMFNvbHI="
Content-Type: text/xml; charset=utf-8
Transfer-Encoding: chunked
Server: Jetty(6.1.3)
(#data following#)
 
So far so good. But then I press F5 to refresh the page. If I understand
correctly how the ETag works, Firefox should resend the request with an
"If-None-Match" header carrying the ETag, and the server should return a 304
"Not Modified" code.
 
But what happens is that Firefox just doesn't send anything. In the Firebug
window, I only see "0 requests". Just to make sure, I tested with tcpmon, and
nothing is sent by Firefox.
 
Does this make sense? Am I missing something?
 
My solrconfig.xml has this config:


 
 
Thanks!

-- 
View this message in context: 
http://www.nabble.com/Question-about-etag-tp22125449p22127322.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Defining shards in solrconfig with multiple cores

2009-02-20 Thread Yonik Seeley
On Fri, Feb 20, 2009 at 10:32 AM, jdleider  wrote:
> However when i try to /select using
> this shards param in the solrconfig.xml the query just hangs.

The basic /select URL should normally not have shards set as a
default: this causes infinite recursion, where the top-level
searcher sends requests to the sub-searchers until you exhaust all
threads and run into a distributed deadlock. Set up another handler
with the default shards param instead.
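
Following that suggestion, a sketch of a second handler carrying the shards default (the handler name is hypothetical; plain /select stays shard-free):

```xml
<requestHandler name="/distrib" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="shards">solr1:8080/solr/core0,solr1:8080/solr/core1,solr1:8080/solr/core2,solr1:8080/solr/core3</str>
  </lst>
</requestHandler>
```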

-Yonik
Lucene/Solr? http://www.lucidimagination.com


mapping pdf metadata

2009-02-20 Thread Josh Joy
Hi,

I'm having trouble figuring out how to map the Tika metadata fields to my
own Solr schema document fields. I guess the first hurdle I need to
overcome is: where can I find a list of the Tika PDF metadata fields that
are available for mapping?

Thanks,
Josh


show first couple sentences from found doc

2009-02-20 Thread Josh Joy
Hi,

I would like to do something similar to Google, in that for my list of hits,
I would like to grab the surrounding text around my query term so I can
include that in my search results. What's the easiest way to do this?

Thanks,
Josh


Re: show first couple sentences from found doc

2009-02-20 Thread Koji Sekiguchi

Josh Joy wrote:

Hi,

I would like to do something similar to Google, in that for my list of hits,
I would like to grab the surrounding text around my query term so I can
include that in my search results. What's the easiest way to do this?

Thanks,
Josh

  


Highlighter?

http://wiki.apache.org/solr/HighlightingParameters
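
A sketch of the request parameters involved (field name and snippet sizes are illustrative):

```python
from urllib.parse import urlencode

# Sketch: enable Solr highlighting to get query-term snippets,
# similar to the excerpts under Google results.
params = urlencode({
    "q": "body:television",
    "hl": "true",
    "hl.fl": "body",       # field(s) to pull snippets from
    "hl.fragsize": 100,    # snippet length in characters
    "hl.snippets": 2,      # snippets per field
})
url = "http://localhost:8983/solr/select?" + params
```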

Koji




Re: mapping pdf metadata

2009-02-20 Thread Otis Gospodnetic

Josh,

You didn't mention whether you are using 
http://wiki.apache.org/solr/ExtractingRequestHandler , but if you are not, 
maybe this already has what you need: 
http://wiki.apache.org/solr/ExtractingRequestHandler#head-c413be32c951c89c0a28f4f8336aa7d2774ec2d6

 
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 
> From: Josh Joy 
> To: solr-user@lucene.apache.org
> Sent: Saturday, February 21, 2009 9:11:01 AM
> Subject: mapping pdf metadata
> 
> Hi,
> 
> I'm having trouble figuring out how to map the tika metadata fields to my
> own solr schema document fields. I guess the first hurdle I need to
> overcome, is where can I find a list of the Tika PDF metadata fields that
> are available for mapping?
> 
> Thanks,
> Josh



Re: mapping pdf metadata

2009-02-20 Thread Erik Hatcher
And when you do use the ExtractingRequestHandler (aka Solr Cell), you
can find the metadata fields by using the ext.extract.only=true setting.

You might also find this article by Sami Siren helpful:


Erik


On Feb 20, 2009, at 8:39 PM, Otis Gospodnetic wrote:



Josh,

You didn't mention whether you are using http://wiki.apache.org/solr/ExtractingRequestHandler 
 , but if you are not, maybe this already has what you need: http://wiki.apache.org/solr/ExtractingRequestHandler#head-c413be32c951c89c0a28f4f8336aa7d2774ec2d6



Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch



- Original Message 

From: Josh Joy 
To: solr-user@lucene.apache.org
Sent: Saturday, February 21, 2009 9:11:01 AM
Subject: mapping pdf metadata

Hi,

I'm having trouble figuring out how to map the tika metadata fields  
to my

own solr schema document fields. I guess the first hurdle I need to
overcome, is where can I find a list of the Tika PDF metadata  
fields that

are available for mapping?

Thanks,
Josh




Suggested hardening of Solr schema.jsp admin interface

2009-02-20 Thread Peter Wolanin
My colleague Paul opened this issue and supplied a patch and I
commented on it regarding a potential security weakness in the admin
interface:

https://issues.apache.org/jira/browse/SOLR-1031


-- 
Peter M. Wolanin, Ph.D.
Momentum Specialist,  Acquia. Inc.
peter.wola...@acquia.com


What is the performance impact of a fq that matches all docs?

2009-02-20 Thread Peter Wolanin
We are working on integration with the Drupal CMS, and so are writing
code that carries out operations that might be relevant for only
a small subset of the sites/indexes that use the integration
module. In this regard, I'm wondering if adding to the query (using
the dismax or mlt handlers) an fq that matches all documents would have
any impact on performance? I gather that there is caching for fq
matches, but it seems like that would still incur some overhead,
especially for a large index?

As a more concrete example, suppose each document has a string field
that names the role of user allowed to see the content, e.g.
'public', 'registered', 'admin'. Most sites have only public content,
but because our code is generic, we might add &fq=role:public to
every query. What would the expected performance effect be, compared
to omitting that fq, if we had a way to determine in advance that all
site content matches 'public'?

Thanks,

Peter

-- 
Peter M. Wolanin, Ph.D.
Momentum Specialist,  Acquia. Inc.
peter.wola...@acquia.com