master master, repeaters

2010-12-19 Thread Tri Nguyen
Hi,

In the master-slave configuration, I'm trying to figure out how to configure
the system setup for master failover.

Does Solr support a master-master setup?  From my reading, it does not.

I've read about repeaters as well, where the slave can act as a master. When
the main master goes down, do the other slaves switch to the repeater?

Barring better solutions, I'm thinking about putting 2 masters behind a load
balancer.

If this is not implemented already, perhaps Solr could be updated to support a
list of masters for fault tolerance.

Tri

shard versus core

2010-12-19 Thread Tri Nguyen
Hi,

Was wondering about the pros and cons of using sharding versus cores.

An index can be split up into multiple cores or multiple shards.

So why choose one over the other?

Thanks,


tri

Re: shard versus core

2010-12-19 Thread Erick Erickson
Well, they can be different beasts. First of all, different cores can have
different schemas, which is not true of shards. Also, shards are almost
always assumed to be running on different machines as a scaling technique,
whereas multiple cores run on a single Solr instance.

So using multiple cores is very similar to running multiple "virtual" Solr
servers on a single machine, each independent of the other. This can make
sense if, for instance, you wanted to have a bunch of small indexes all
on one machine. You could use multiple cores rather than multiple
instances of Solr. These indexes may or may not have anything to do with
each other.
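
As an illustration of that "virtual servers" idea, each core gets its own URL
under the same Solr instance, so a client simply points at whichever core it
wants. A minimal SolrJ sketch (the core names and server class here are just
examples, not anything from the original thread):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class MultiCoreClient {
    public static void main(String[] args) throws Exception {
        // Two independent cores hosted by one Solr instance on one machine.
        SolrServer products = new CommonsHttpSolrServer("http://localhost:8983/solr/products");
        SolrServer reviews  = new CommonsHttpSolrServer("http://localhost:8983/solr/reviews");

        // Each core has its own schema and index and is queried separately.
        System.out.println(products.query(new SolrQuery("ipod")).getResults().getNumFound());
        System.out.println(reviews.query(new SolrQuery("battery")).getResults().getNumFound());
    }
}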

Sharding, on the other hand, is almost always used to split a single logical
index up amongst multiple machines in order to improve performance. The
assumption usually is that the index is too big to give satisfactory
performance on a single machine, so you'll split it into parts. That
assumption really implies that it makes no sense to put multiple shards
on the #same# machine.

So really, the answer to your question is that you choose the right technique
for the problem you're trying to solve. They aren't really different
solutions to the same problem...

Hope this helps.
Erick

On Sun, Dec 19, 2010 at 4:07 AM, Tri Nguyen  wrote:

> Hi,
>
> Was wondering about the pros and cons of using sharding versus cores.
>
> An index can be split up into multiple cores or multiple shards.
>
> So why choose one over the other?
>
> Thanks,
>
>
> tri


Re: shard versus core

2010-12-19 Thread Shawn Heisey

On 12/19/2010 2:07 AM, Tri Nguyen wrote:

Was wondering about the pros and cons of using sharding versus cores.

An index can be split up into multiple cores or multiple shards.

So why choose one over the other?


If you split your index into multiple cores, you still have to use the 
shards parameter to tell Solr where to find the parts.  You can use 
multiple servers, multiple cores, or even both.  Which method to use 
depends on why you've decided to split your index into multiple pieces.
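
For illustration, a distributed query just lists where the index pieces live in
the shards parameter, whether those pieces are separate servers, separate
cores, or both. A rough SolrJ sketch (host names and core names are made up):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.client.solrj.response.QueryResponse;

public class DistributedSearchExample {
    public static void main(String[] args) throws Exception {
        // The server that receives the request fans the query out to every shard listed.
        SolrServer solr = new CommonsHttpSolrServer("http://host1:8983/solr/core0");

        SolrQuery query = new SolrQuery("ipod");
        // Comma-separated shard locations: other servers and/or other cores on the same box.
        query.set("shards",
            "host1:8983/solr/core0,host1:8983/solr/core1,host2:8983/solr/core0");

        QueryResponse rsp = solr.query(query);
        System.out.println("total hits: " + rsp.getResults().getNumFound());
    }
}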


If the primary motivating factor is index size, you'll probably want to
use separate servers.  Unless the only reason for distributed search is
making the build process easier (or possible), I personally would not have
multiple "live" cores on a single machine.  An example where multiple
cores per server are entirely appropriate (creating a new core every five
minutes):


http://www.loggly.com/2010/08/our-solr-system/

I went to this guy's talk at Lucene Revolution.  Amazing stuff.

Shawn



Re: DIH for sharded database?

2010-12-19 Thread Dennis Gearon
The easiest way, and probably the way that the database needs to use those
shards, is to use a view with a query; I think it joins on the primary
key.

 Dennis Gearon


Signature Warning

It is always a good idea to learn from your own mistakes. It is usually a 
better 
idea to learn from others’ mistakes, so you do not have to make them yourself. 
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.



- Original Message 
From: Andy 
To: solr-user@lucene.apache.org
Sent: Sat, December 18, 2010 6:20:54 PM
Subject: DIH for sharded database?

I have a table that is broken up into many virtual shards. So basically I have
N identical tables:

Document1
Document2
.
.
Document36

Currently these tables all live in the same database, but in the future they
may be moved to different servers to scale out if the needs arise.

Is there any way to configure a DIH for these tables so that it will 
automatically loop through the 36 identical tables and pull data out for 
indexing?

Something like (pseudo code):

for (i = 1; i <= 36; i++) {
   ## retrieve data from the table Document{$i} & index the data
}

What's the best way to handle a situation like this?

Thanks


Re: DIH for sharded database?

2010-12-19 Thread Dennis Gearon
Some talk on giant databases in postgres:
  
http://wiki.postgresql.org/images/3/38/PGDay2009-EN-Datawarehousing_with_PostgreSQL.pdf

wikipedia
  http://en.wikipedia.org/wiki/Partition_%28database%29
  (says to use a UNION)
postgres description on how to do it:
  http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html

 Dennis Gearon


Signature Warning

It is always a good idea to learn from your own mistakes. It is usually a 
better 
idea to learn from others’ mistakes, so you do not have to make them yourself. 
from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'


EARTH has a Right To Life,
otherwise we all die.



- Original Message 
From: Andy 
To: solr-user@lucene.apache.org
Sent: Sat, December 18, 2010 6:20:54 PM
Subject: DIH for sharded database?

I have a table that is broken up into many virtual shards. So basically I have
N identical tables:

Document1
Document2
.
.
Document36

Currently these tables all live in the same database, but in the future they
may be moved to different servers to scale out if the needs arise.

Is there any way to configure a DIH for these tables so that it will 
automatically loop through the 36 identical tables and pull data out for 
indexing?

Something like (pseudo code):

for (i = 1; i <= 36; i++) {
   ## retrieve data from the table Document{$i} & index the data
}

What's the best way to handle a situation like this?

Thanks


Re: master master, repeaters

2010-12-19 Thread Upayavira
We had a (short) thread on this late last week. 

Solr doesn't support automatic failover of the master, at least in
1.4.1. I've been discussing with my colleague (Tommaso) about ways to
achieve this.

There's ways we could 'fake it', scripting the following:

 * set up a 'backup' master, as a replica of the actual master
 * monitor the master for 'up-ness'
 * if it fails:
   * tell the master to start indexing to the backup instead
   * tell the slave(s) to connect to a different master (the backup)
 * then, when the master is back:
   * wipe its index (backing up dir first?)
   * configure it to be a backup of the new master
   * make it pull a fresh index over

But, Jan Høydahl suggested using SolrCloud. I'm going to follow up on
how that might work in that thread.

Upayavira
 

On Sun, 19 Dec 2010 00:20 -0800, "Tri Nguyen" 
wrote:
> Hi,
> 
> In the master-slave configuration, I'm trying to figure out how to
> configure the 
> system setup for master failover.
> 
> Does solr support master-master setup?  From my readings, solr does not.
> 
> I've read about repeaters as well where the slave can act as a master. 
> When the 
> main master goes down, do the other slaves switch to the repeater?
> 
> Barring better solutions, I'm thinking about putting 2 masters behind  a
> load 
> balancer.
> 
> If this is not implemented already, perhaps solr can be updated to
> support a 
> list of masters for fault tolerance.
> 
> Tri


Re: master master, repeaters

2010-12-19 Thread Tri Nguyen
How do we tell the slaves to point to the new master without modifying the 
config files?  Can we do this while the slave is up, issuing a command to it?
 
Thanks,
 
Tri

--- On Sun, 12/19/10, Upayavira  wrote:


From: Upayavira 
Subject: Re: master master, repeaters
To: solr-user@lucene.apache.org
Date: Sunday, December 19, 2010, 10:13 AM


We had a (short) thread on this late last week. 

Solr doesn't support automatic failover of the master, at least in
1.4.1. I've been discussing with my colleague (Tommaso) about ways to
achieve this.

There's ways we could 'fake it', scripting the following:

* set up a 'backup' master, as a replica of the actual master
* monitor the master for 'up-ness'
* if it fails:
   * tell the master to start indexing to the backup instead
   * tell the slave(s) to connect to a different master (the backup)
* then, when the master is back:
   * wipe its index (backing up dir first?)
   * configure it to be a backup of the new master
   * make it pull a fresh index over

But, Jan Høydahl suggested using SolrCloud. I'm going to follow up on
how that might work in that thread.

Upayavira


On Sun, 19 Dec 2010 00:20 -0800, "Tri Nguyen" 
wrote:
> Hi,
> 
> In the master-slave configuration, I'm trying to figure out how to
> configure the 
> system setup for master failover.
> 
> Does solr support master-master setup?  From my readings, solr does not.
> 
> I've read about repeaters as well where the slave can act as a master. 
> When the 
> main master goes down, do the other slaves switch to the repeater?
> 
> Barring better solutions, I'm thinking about putting 2 masters behind  a
> load 
> balancer.
> 
> If this is not implemented already, perhaps solr can be updated to
> support a 
> list of masters for fault tolerance.
> 
> Tri


Re: Transparent redundancy in Solr

2010-12-19 Thread Upayavira
Jan,

I'd appreciate a little more explanation here. I've explored SolrCloud
somewhat, but there's some bits of this architecture I don't yet get.

You say "next time one of the indexer slaves pings ZK". What is an "indexer
slave"? Is that the external entity that is posting indexing
content? If it is the app that posts to Solr, do you imply it must check with
ZK before it can do an HTTP post to Solr? Also, once you do this leader
election to switch to an alternative master, are you implying that this
new master was once a slave of the original master, and thus has a valid
index?

I find this interesting, but I'm still not quite sure how exactly it works.

Upayavira

On Fri, 17 Dec 2010 10:09 +0100, "Jan Høydahl / Cominvent"
 wrote:
> Hi,
> 
> I believe the way to go is through ZooKeeper[1], not property files or
> local hacks. We've already started on this route and it makes sense to
> let ZK do what it is designed for, such as leader election. When a node
> starts up, it asks ZK what role it should have and fetches corresponding
> configuration. Then it polls ZK regularly to know if the world has
> changed. So if a master indexer goes down, ZK will register that as a
> state change condition, and next time one of the indexer slaves pings ZK,
> it may be elected as new master, and config in ZK is changed
> correspondingly, causing all adds to flow to the new master...
> 
> Then, when the slaves cannot contact their old master, they ask ZK for an
> update, and retrieve a new value for master URL.
> 
> Note also that SolrCloud is implementing load-balancing and sharding as
> part of the arcitecture so often we can skip dedicated LBs.
> 
> [1] : http://wiki.apache.org/solr/SolrCloud
> 
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
> 
> On 15. des. 2010, at 18.50, Tommaso Teofili wrote:
> 
> > Hi all,
> > me, Upayavira and other guys at Sourcesense have collected some Solr
> > architectural views inside the presentation at [1].
> > For sure one can set up an architecture for failover and resiliency on the
> > "search face" (search slaves with coordinators and distributed search) but
> > I'd like to ask how would you reach transparent redundancy in Solr on the
> > "index face".
> > On slide 13 we put 2 slave backup masters and so if one of the main masters
> > goes down you can switch slaves' replication on the backup master.
> > First question if how could it be made automatic?
> > In a previous thread [2] I talked about a possible solution writing the
> > master url of slaves in a properties file so when you have to switch you
> > change that url to the backup master and reload the slave's core but that is
> > not automatic :-) Any more advanced ideas?
> > Second question: when main master comes up how can it be automatically
> > considered as the backup master (since hopefully the backup master has
> > received some indexing requests in the meantime)? Also consider that its
> > index should be wiped out and replicated from the new master to ensure index
> > integrity.
> > Looking forward for your feedback,
> > Cheers,
> > Tommaso
> > 
> > [1] : http://www.slideshare.net/sourcesense/sharded-solr-setup-with-master
> > [2] : http://markmail.org/thread/vjj5jovbg6evpmpp
> 


Re: master master, repeaters

2010-12-19 Thread Upayavira


On Sun, 19 Dec 2010 10:20 -0800, "Tri Nguyen" 
wrote:
> How do we tell the slaves to point to the new master without modifying
> the config files?  Can we do this while the slave is up, issuing a
> command to it?

I believe this can be done (details are in
http://wiki.apache.org/solr/SolrReplication), but I've not actually done
it.
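
For reference, that wiki page describes a fetchindex command on the replication
handler which accepts a masterUrl parameter, so a failover script could in
principle tell a slave to pull from a different master over HTTP. A rough,
untested sketch (host names are hypothetical):

import java.io.InputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.net.URLEncoder;

public class PointSlaveAtNewMaster {
    public static void main(String[] args) throws Exception {
        String slave = "http://slave1:8983/solr";
        String newMaster = "http://backup-master:8983/solr/replication";

        // Ask the slave's replication handler to pull the index from a different master.
        URL url = new URL(slave + "/replication?command=fetchindex&masterUrl="
                + URLEncoder.encode(newMaster, "UTF-8"));
        HttpURLConnection conn = (HttpURLConnection) url.openConnection();
        System.out.println("HTTP " + conn.getResponseCode());
        InputStream in = conn.getInputStream();
        in.close();
    }
}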

Upayavira  

> --- On Sun, 12/19/10, Upayavira  wrote:
> 
> 
> From: Upayavira 
> Subject: Re: master master, repeaters
> To: solr-user@lucene.apache.org
> Date: Sunday, December 19, 2010, 10:13 AM
> 
> 
> We had a (short) thread on this late last week. 
> 
> Solr doesn't support automatic failover of the master, at least in
> 1.4.1. I've been discussing with my colleague (Tommaso) about ways to
> achieve this.
> 
> There's ways we could 'fake it', scripting the following:
> 
> * set up a 'backup' master, as a replica of the actual master
> * monitor the master for 'up-ness'
> * if it fails:
>    * tell the master to start indexing to the backup instead
>    * tell the slave(s) to connect to a different master (the backup)
> * then, when the master is back:
>    * wipe its index (backing up dir first?)
>    * configure it to be a backup of the new master
>    * make it pull a fresh index over
> 
> But, Jan Høydahl suggested using SolrCloud. I'm going to follow up on
> how that might work in that thread.
> 
> Upayavira
> 
> 
> On Sun, 19 Dec 2010 00:20 -0800, "Tri Nguyen" 
> wrote:
> > Hi,
> > 
> > In the master-slave configuration, I'm trying to figure out how to
> > configure the 
> > system setup for master failover.
> > 
> > Does solr support master-master setup?  From my readings, solr does not.
> > 
> > I've read about repeaters as well where the slave can act as a master. 
> > When the 
> > main master goes down, do the other slaves switch to the repeater?
> > 
> > Barring better solutions, I'm thinking about putting 2 masters behind  a
> > load 
> > balancer.
> > 
> > If this is not implemented already, perhaps solr can be updated to
> > support a 
> > list of masters for fault tolerance.
> > 
> > Tri
> 


Re: Custom scoring for searhing geographic objects

2010-12-19 Thread Alexey Serba
Hi Pavel,

I had a similar problem several years ago - I had to find
geographical locations in textual descriptions, geocode these objects
to lat/long during the indexing process, and allow users to filter/sort
search results to specific geographical areas. The important issue was
that there were several types of geographical objects - street < town
< region < country. The idea was to geocode to the most narrow
geographical area possible. The relevance logic in this case could be
specified as "find the most narrow result that is uniquely identified by
your text or search query". So I came up with a custom algorithm that
was quite good in terms of performance and precision/recall. Here's
the simple description:
* Intersect all text/search-query terms with a locations dictionary to
keep only the geo terms.
* Search your locations Lucene index, filtered to street objects only
(the most narrow areas). Thanks to the tf*idf formula you'll get the most
relevant results first. Then post-process the top N (3/5/10) results and
verify that they are matches indeed. I intersected the search terms with
each result's terms and ran another Lucene search to verify whether those
terms uniquely identify the match. If they do, return the matching street.
If there's no match, proceed with the same algorithm for towns,
regions, and countries.
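
A rough sketch of that cascading idea in SolrJ terms (the field names, the
street/town/region/country type values, and the verification step are all
illustrative assumptions, not the original implementation):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.response.QueryResponse;
import org.apache.solr.common.SolrDocument;

public class GeoMatcher {
    private final SolrServer locations; // index of geographical objects

    public GeoMatcher(SolrServer locations) {
        this.locations = locations;
    }

    /** Try the narrowest object type first, then fall back to broader ones. */
    public SolrDocument match(String geoTerms) throws SolrServerException {
        for (String type : new String[] {"street", "town", "region", "country"}) {
            SolrQuery q = new SolrQuery(geoTerms);
            q.addFilterQuery("type:" + type); // restrict to one level of the hierarchy
            q.setRows(5);                     // post-process only the top few hits
            QueryResponse rsp = locations.query(q);
            for (SolrDocument doc : rsp.getResults()) {
                if (uniquelyIdentifies(geoTerms, doc)) {
                    return doc;               // narrowest unambiguous match wins
                }
            }
        }
        return null; // nothing matched unambiguously
    }

    private boolean uniquelyIdentifies(String geoTerms, SolrDocument doc) {
        // Placeholder for the verification step: intersect the query terms with the
        // candidate's terms and re-query to check that the match is unambiguous.
        return true;
    }
}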

HTH,
Alexey

On Wed, Dec 15, 2010 at 6:28 PM, Pavel Minchenkov  wrote:
> Hi,
> Please give me advise how to create custom scoring. I need to result that
> documents were in order, depending on how popular each term in the document
> (popular = how many times it appears in the index) and length of the
> document (less terms - higher in search results).
>
> For example, index contains following data:
>
> ID    | SEARCH_FIELD
> --
> 1     | Russia
> 2     | Russia, Moscow
> 3     | Russia, Volgograd
> 4     | Russia, Ivanovo
> 5     | Russia, Ivanovo, Altayskaya street 45
> 6     | Russia, Moscow, Kremlin
> 7     | Russia, Moscow, Altayskaya street
> 8     | Russia, Moscow, Altayskaya street 15
> 9     | Russia, Moscow, Altayskaya street 15/26
>
>
> And I should get next results:
>
>
> Query                     | Document result set
> --
> Russia                    | 1,2,4,3,6,7,8,9,5
> Moscow                  | 2,6,7,8,9
> Ivanovo                    | 4,5
> Altayskaya              | 7,8,9,5
>
> In fact --- it is a search for geographic objects (cities, streets, houses).
> At the same time can be given only part of the address, and the results
> should appear the most relevant results.
>
> Thanks.
> --
> Pavel Minchenkov
>


Performance Monitoring Solution

2010-12-19 Thread Cameron Hurst
I am at the point in my setup where I am happy with how things are being
indexed and my interface is good to go, but what I don't know how to
judge is how often it will be queried and how many resources it needs to
function properly. So what I am looking for is some sort of performance
monitoring solution. I know that if I go to the statistics page I can find
the number of queries and the average response time. What I want is a
bit more detailed result, showing how it varies over time: a plot of RAM
usage, and possibly disk IO that is due to Solr, over time as well.

Because I am new to the program, and also unsure about my users and how
much they will use my search interface, I need detailed results of its
use. One solution I have found is New Relic RPM, which apparently has
support for Solr, but you need to use one of their paid packages, which I
would like to avoid. The other option I found is LiquidGaze; it says it
is an open source solution for monitoring the handlers and can do a lot
of what I need, but has anyone ever used it before who can give it a
rating, good or bad? Is there another solution for this that I have
missed that would be better than the two that I listed?


Re: Dataimport performance

2010-12-19 Thread Alexey Serba
> With subquery and with left join:   320k in 6 Min 30
It's 820 records per second. That's _really_ impressive considering the
fact that DIH performs a separate SQL query for every record in your
case.

>> So there's one track entity with an artist sub-entity. My (admittedly
>> rather limited) experience has been that sub-entities, where you have
>> to run a separate query for every row in the parent entity, really
>> slow down data import.
Sub-entities slow down data import indeed. You can try to avoid a
separate query for every row by using CachedSqlEntityProcessor. There
are a couple of options - 1) you can load all sub-entity data into memory,
or 2) you can reduce the number of SQL queries by caching sub-entity
data per id. There's no silver bullet and each option has its own pros
and cons.

Also Ephraim proposed a really neat solution with GROUP_CONCAT, but
I'm not sure that all RDBMS-es support that.
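
As a purely hypothetical illustration of the GROUP_CONCAT idea (not Ephraim's
actual query; the table and column names are invented, and GROUP_CONCAT is
MySQL-specific syntax), the sub-entity can be collapsed into the main query so
DIH only ever runs one SQL statement:

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class GroupConcatExample {
    public static void main(String[] args) throws Exception {
        String sql =
            "SELECT t.id, t.title, " +
            "       GROUP_CONCAT(a.name SEPARATOR ', ') AS artists " +
            "FROM track t " +
            "LEFT JOIN track_artist ta ON ta.track_id = t.id " +
            "LEFT JOIN artist a ON a.id = ta.artist_id " +
            "GROUP BY t.id, t.title";

        try (Connection con = DriverManager.getConnection(args[0]); // JDBC URL as first argument
             Statement st = con.createStatement();
             ResultSet rs = st.executeQuery(sql)) {
            while (rs.next()) {
                // One row per track, with the artists already denormalized into one column.
                System.out.println(rs.getLong("id") + " | " + rs.getString("artists"));
            }
        }
    }
}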


2010/12/15 Robert Gründler :
> i've benchmarked the import already with 500k records, one time without the 
> artists subquery, and one time without the join in the main query:
>
>
> Without subquery: 500k in 3 min 30 sec
>
> Without join and without subquery: 500k in 2 min 30.
>
> With subquery and with left join:   320k in 6 Min 30
>
>
> so the joins / subqueries are definitely a bottleneck.
>
> How exactly did you implement the custom data import?
>
> In our case, we need to de-normalize the relations of the sql data for the 
> index,
> so i fear i can't really get rid of the join / subquery.
>
>
> -robert
>
>
>
>
>
> On Dec 15, 2010, at 15:43 , Tim Heckman wrote:
>
>> 2010/12/15 Robert Gründler :
>>> The data-config.xml looks like this (only 1 entity):
>>>
>>>      [data-config.xml snippet elided by the archive: a single track entity
>>>      with several field mappings (including sf_unique_id) and an artist
>>>      sub-entity]
>>
>> So there's one track entity with an artist sub-entity. My (admittedly
>> rather limited) experience has been that sub-entities, where you have
>> to run a separate query for every row in the parent entity, really
>> slow down data import. For my own purposes, I wrote a custom data
>> import using SolrJ to improve the performance (from 3 hours to 10
>> minutes).
>>
>> Just as a test, how long does it take if you comment out the artists entity?
>
>


Re: Dataimport performance

2010-12-19 Thread Lukas Kahwe Smith

On 19.12.2010, at 23:30, Alexey Serba wrote:

> 
> Also Ephraim proposed a really neat solution with GROUP_CONCAT, but
> I'm not sure that all RDBMS-es support that.


That's MySQL-only syntax.
But if you google you can find similar solutions for other RDBMSes.

regards,
Lukas Kahwe Smith
m...@pooteeweet.org





DIH for taxonomy faceting in Lucid webcast

2010-12-19 Thread Andy
Hi,

I watched the Lucid webcast:
http://www.lucidimagination.com/solutions/webcasts/faceting

It talks about encoding hierarchical categories to facilitate faceting. So a
category "path" of "NonFic>Science" would be encoded as the multiple values
"0/NonFic" & "1/NonFic/Science".

1) My categories are stored in the database as coded numbers instead of fully
spelled out names. For example, I would have a category of "2/7" and a lookup
dictionary to convert "2/7" into "NonFic/Science". How do I do such a lookup in
DIH?

2) Once I have the fully spelled out category path such as "NonFic/Science", 
how do I turn that into "0/NonFic" & "1/NonFic/Science" using the DIH?
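
For what it's worth, the path-to-token encoding in (2) is mechanical enough to
sketch in a few lines of Java (this is only an illustration of the scheme from
the webcast, not DIH code; in DIH it could be wired in through a script or
custom transformer):

import java.util.ArrayList;
import java.util.List;

public class PathFacetEncoder {
    /**
     * Turn a category path like "NonFic/Science" into the depth-prefixed tokens
     * "0/NonFic" and "1/NonFic/Science" used for hierarchical faceting.
     */
    public static List<String> encode(String path) {
        List<String> tokens = new ArrayList<String>();
        String[] parts = path.split("/");
        StringBuilder prefix = new StringBuilder();
        for (int depth = 0; depth < parts.length; depth++) {
            prefix.append("/").append(parts[depth]);
            tokens.add(depth + prefix.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(encode("NonFic/Science")); // prints [0/NonFic, 1/NonFic/Science]
    }
}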

3) Some of my categories are multi-word and contain whitespace, such as
"Computer Science" and "Functional Programming", so I'd have facet values such
as "2/NonFic/Computer Science/Functional Programming". How do I handle the
whitespace in this case? Would filtering by fq still work?

Thanks


  


Re: DIH for sharded database?

2010-12-19 Thread Andy
This is helpful. Thank you.

--- On Sun, 12/19/10, Dennis Gearon  wrote:

> From: Dennis Gearon 
> Subject: Re: DIH for sharded database?
> To: solr-user@lucene.apache.org
> Date: Sunday, December 19, 2010, 11:56 AM
> Some talk on giant databases in
> postgres:
>   
> http://wiki.postgresql.org/images/3/38/PGDay2009-EN-Datawarehousing_with_PostgreSQL.pdf
> 
> wikipedia
>   http://en.wikipedia.org/wiki/Partition_%28database%29
>   (says to use a UNION)
> postgres description on how to do it:
>   http://www.postgresql.org/docs/current/interactive/ddl-partitioning.html
> 
>  Dennis Gearon
> 
> 
> Signature Warning
> 
> It is always a good idea to learn from your own mistakes.
> It is usually a better 
> idea to learn from others’ mistakes, so you do not have
> to make them yourself. 
> from 'http://blogs.techrepublic.com.com/security/?p=4501&tag=nl.e036'
> 
> 
> EARTH has a Right To Life,
> otherwise we all die.
> 
> 
> 
> - Original Message 
> From: Andy 
> To: solr-user@lucene.apache.org
> Sent: Sat, December 18, 2010 6:20:54 PM
> Subject: DIH for sharded database?
> 
> I have a table that is broken up into many virtual shards.
> So basically I have N 
> identical tables:
> 
> Document1
> Document2
> .
> .
> Document36
> 
> Currently these tables all live in the same database, but
> in the future they may 
> be moved to different servers to scale out if the needs
> arise.
> 
> Is there any way to configure a DIH for these tables so
> that it will 
> automatically loop through the 36 identical tables and pull
> data out for 
> indexing?
> 
> Something like (pseudo code):
> 
> for (i = 1; i <= 36; i++) {
>    ## retrieve data from the table
> Document{$i} & index the data
> }
> 
> What's the best way to handle a situation like this?
> 
> Thanks
> 





Custom transformer to get file content from file path

2010-12-19 Thread vasu p
Hi,
I have a custom library which takes a file path as input and returns the
file content as a string.
My DB has a file path in one of the tables, and I am using a DIH configuration
in Solr to do the indexing. I couldn't use TikaEntityProcessor to index a
file located in the file system. I thought of using a custom Transformer to
transform the file_path into a file_content field in the row.

I would like to know the following details:
1. Setting the file content as a string in a custom file_content field might
cause memory issues: a very big file of hundreds of megabytes might
consume the RAM space. Is it possible to send a stream as input to Solr?
What field type should be configured in schema.xml?
2. Is there any better approach than a custom transformer?
3. Is there any other good approach to implement indexing based on a file path?
Thanks a lot.


Re: Custom transformer to get file content from file path

2010-12-19 Thread Ahmet Arslan
> I have a custom library, which is used to input a file path
> and it returns
> file content as a string output.
> My DB has a file path in one of the table and using DIH
> configuration in
> Solr to do the indexing. I couldnt use TikaEntityProcessor
> to do indexing of
> a file located in file system. I though of using Custom
> Transformer to
> transform file_path to file_content field in the row.
> 
> I would like to know following details:
> 1. Setting file content as a string to a custom
> file_content field might
> cause memory issue if a very big file over hundreds of mega
> bites might
> consume the RAM space. Is it possible to send a stream as
> input to Solr?
> What is the filed type should be configured in schema.xml?
> 2. Is there any better approach than a custom transformer?
> 3. Any other best approach to implement indexing based on a
> file path?

http://wiki.apache.org/solr/DataImportHandler#PlainTextEntityProcessor should 
do the trick.
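
If a custom transformer is still preferred over PlainTextEntityProcessor, a
minimal sketch of one might look like this (the class name, field names and
charset handling are assumptions, and it is untested):

import java.io.File;
import java.io.IOException;
import java.nio.charset.Charset;
import java.nio.file.Files;
import java.util.Map;

import org.apache.solr.handler.dataimport.Context;
import org.apache.solr.handler.dataimport.Transformer;

/** DIH transformer that replaces a file_path column with the file's text content. */
public class FileContentTransformer extends Transformer {
    @Override
    public Object transformRow(Map<String, Object> row, Context context) {
        Object path = row.get("file_path");
        if (path != null) {
            try {
                byte[] bytes = Files.readAllBytes(new File(path.toString()).toPath());
                row.put("file_content", new String(bytes, Charset.forName("UTF-8")));
            } catch (IOException e) {
                // Skip the content on read errors; the rest of the row is still indexed.
            }
        }
        return row;
    }
}

The class would then be referenced from the entity's transformer attribute in
data-config.xml. Note that this still loads the whole file into memory, so it
does not address the streaming concern in question 1.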


  


[Reload-Config] not working

2010-12-19 Thread Adam Estrada
Full Import:
http://localhost:8983/solr/select?clean=false&commit=true&qt=%2Fdataimport&command=full-import

Reload Configuration:
http://localhost:8983/solr/select?clean=false&commit=true&qt=%2Fdataimport&command=reload-config

All,

The links above are meant to let me reload the configuration file after a
change is made and to perform the full import. My problem is that the
reload-config option does not seem to be working. Am I doing anything wrong?
Your expertise is greatly appreciated!

Adam


Re: [Reload-Config] not working

2010-12-19 Thread Ahmet Arslan

--- On Mon, 12/20/10, Adam Estrada  wrote:

> From: Adam Estrada 
> Subject: [Reload-Config] not working
> To: solr-user@lucene.apache.org
> Date: Monday, December 20, 2010, 5:33 AM
> Full Import:
> http://localhost:8983/solr/select?clean=false&commit=true&qt=%2Fdataimport&command=full-import
> Reload Configuration:
> http://localhost:8983/solr/select?clean=false&commit=true&qt=%2Fdataimport&command=reload-config
> 
> All,
> 
> The links above are meant for me to reload the
> configuration file after a
> change is made and the other is to perform the full import.
> My problem is
> that The reload-config option does not seem to be working.
> Am I doing
> anything wrong? Your expertise is greatly appreciated!
> 
> Adam
> 


  


Re: [Reload-Config] not working

2010-12-19 Thread Ahmet Arslan
> Full Import:
> http://localhost:8983/solr/select?clean=false&commit=true&qt=%2Fdataimport&command=full-import
> Reload Configuration:
> http://localhost:8983/solr/select?clean=false&commit=true&qt=%2Fdataimport&command=reload-config
> 
> All,
> 
> The links above are meant for me to reload the
> configuration file after a
> change is made and the other is to perform the full import.
> My problem is
> that The reload-config option does not seem to be working.
> Am I doing
> anything wrong? Your expertise is greatly appreciated!

I am sorry, I hit the reply button accidentally. 

Are you receiving/checking the message "Configuration Re-loaded sucessfully"
after the reload?

And are you checking that data-config.xml is valid XML after editing it
programmatically?

And instead of editing the data-config.xml file, can't you use a variable
resolver? http://search-lucene.com/m/qYzPk2n86iI&subj


  


Re: Performance Monitoring Solution

2010-12-19 Thread Gora Mohanty
On Mon, Dec 20, 2010 at 3:13 AM, Cameron Hurst  wrote:
> I am at the point in my set up that I am happy with how things are being
> indexed and my interface is all good to go but what I don't know how to
> judge is how often it will be queried and how much resources it needs to
> function properly. So what I am looking for is some sort of performance
> monitoring solution. I know if I go to the statistics page i can find
> the number of queries and the average response time. What I want is a
> bit more detailed result, showing how it varies over time. A plot of RAM
> usage and possibly disk IO that is due to Solr over time as well.
[...]

Solr exposes statistics through JMX: http://wiki.apache.org/solr/SolrJmx
As described there, you can examine them using jconsole. Nagios also
has several JMX plugins, of which we have successfully used
http://exchange.nagios.org/directory/Plugins/Java-Applications-and-Servers/Syabru-Nagios-JMX-Plugin/details
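
As a rough illustration of reading those statistics programmatically (this
assumes Solr was started with remote JMX enabled on a hypothetical port 18983;
the MBean names and attributes vary by Solr version):

import java.util.Set;
import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class SolrJmxDump {
    public static void main(String[] args) throws Exception {
        JMXServiceURL url =
            new JMXServiceURL("service:jmx:rmi:///jndi/rmi://localhost:18983/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            // Solr registers its MBeans under a "solr"-prefixed domain.
            Set<ObjectName> names = mbs.queryNames(new ObjectName("solr*:*"), null);
            for (ObjectName name : names) {
                System.out.println(name);
                // e.g. mbs.getAttribute(name, "requests") on a request handler MBean
            }
        } finally {
            connector.close();
        }
    }
}

Polling something like this from cron and graphing the numbers (or feeding them
to Nagios as described above) gives the over-time view being asked for.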

There are also several open-source JMX clients available, but we did
not find any actively-maintained one with the features that we would
have liked. We have prototyped an in-house JMX application, focused
on Solr, and hope to be able to open-source this soon.

> The other option I found is LiquidGaze, it says it
> is an open source solution for monitoring the handlers and can do a lot
> of what I need but has anyone ever used it before and can give it a
> rating, good or bad.
[...]

I presume that you mean LucidGaze. We have been meaning to check
it out, but the person implementing it was running into some issues.

Regards,
Gora


Re: master master, repeaters

2010-12-19 Thread Lance Norskog
If you have a load balancer available, that is a much cleaner solution
than anything else. After the main indexer comes back, you have to get
the current index state to it to start again. But otherwise...

On Sun, Dec 19, 2010 at 10:39 AM, Upayavira  wrote:
>
>
> On Sun, 19 Dec 2010 10:20 -0800, "Tri Nguyen" 
> wrote:
>> How do we tell the slaves to point to the new master without modifying
>> the config files?  Can we do this while the slave is up, issuing a
>> command to it?
>
> I believe this can be done (details are in
> http://wiki.apache.org/solr/SolrReplication), but I've not actually done
> it.
>
> Upayavira
>
>> --- On Sun, 12/19/10, Upayavira  wrote:
>>
>>
>> From: Upayavira 
>> Subject: Re: master master, repeaters
>> To: solr-user@lucene.apache.org
>> Date: Sunday, December 19, 2010, 10:13 AM
>>
>>
>> We had a (short) thread on this late last week.
>>
>> Solr doesn't support automatic failover of the master, at least in
>> 1.4.1. I've been discussing with my colleague (Tommaso) about ways to
>> achieve this.
>>
>> There's ways we could 'fake it', scripting the following:
>>
>> * set up a 'backup' master, as a replica of the actual master
>> * monitor the master for 'up-ness'
>> * if it fails:
>>    * tell the master to start indexing to the backup instead
>>    * tell the slave(s) to connect to a different master (the backup)
>> * then, when the master is back:
>>    * wipe its index (backing up dir first?)
>>    * configure it to be a backup of the new master
>>    * make it pull a fresh index over
>>
>> But, Jan Høydahl suggested using SolrCloud. I'm going to follow up on
>> how that might work in that thread.
>>
>> Upayavira
>>
>>
>> On Sun, 19 Dec 2010 00:20 -0800, "Tri Nguyen" 
>> wrote:
>> > Hi,
>> >
>> > In the master-slave configuration, I'm trying to figure out how to
>> > configure the
>> > system setup for master failover.
>> >
>> > Does solr support master-master setup?  From my readings, solr does not.
>> >
>> > I've read about repeaters as well where the slave can act as a master.
>> > When the
>> > main master goes down, do the other slaves switch to the repeater?
>> >
>> > Barring better solutions, I'm thinking about putting 2 masters behind  a
>> > load
>> > balancer.
>> >
>> > If this is not implemented already, perhaps solr can be updated to
>> > support a
>> > list of masters for fault tolerance.
>> >
>> > Tri
>>
>



-- 
Lance Norskog
goks...@gmail.com


Re: DIH for sharded database?

2010-12-19 Thread Lance Norskog
You said: Currently these tables all live in the same database, but in
the future they may be moved to different servers to scale out if the
needs arise.

That's why I concentrated on the JDBC url problem.

But you can use a file as a list of tables. Read each line, and a
sub-entity can substitute the line value into the SQL statement.
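
If DIH turns out to be too awkward for this, the same loop can be written
directly against SolrJ, along the lines of the custom import mentioned in the
"Dataimport performance" thread. A rough sketch (the JDBC URL, column names and
Solr URL are assumptions):

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class ShardedTableImporter {
    public static void main(String[] args) throws Exception {
        SolrServer solr = new CommonsHttpSolrServer("http://localhost:8983/solr");
        Connection con = DriverManager.getConnection(args[0]); // JDBC URL of the database

        // Loop over the 36 identical shard tables: Document1 .. Document36.
        for (int i = 1; i <= 36; i++) {
            String table = "Document" + i;
            Statement st = con.createStatement();
            ResultSet rs = st.executeQuery("SELECT * FROM " + table);
            while (rs.next()) {
                SolrInputDocument doc = new SolrInputDocument();
                // Column names here are hypothetical; map whatever the schema defines.
                doc.addField("id", rs.getString("id"));
                doc.addField("title", rs.getString("title"));
                solr.add(doc);
            }
            rs.close();
            st.close();
        }
        solr.commit();
        con.close();
    }
}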

On Sat, Dec 18, 2010 at 6:46 PM, Andy  wrote:
>
> --- On Sat, 12/18/10, Lance Norskog  wrote:
>
>> You can have a file with 1,2,3 on
>> separate lines. There is a
>> line-by-line file reader that can pull these as separate
>> drivers.
>> Inside that entity the JDBC url has to be altered with the
>> incoming
>> numbers. I don't know if this will work.
>
> I'm not sure I understand.
>
> How will altering the JDBC url change the name of the table it is importing 
> data from?
>
> Wouldn't I need to change the  actual SQL query itself?
>
> "select * from Document1"
> "select * from Document2"
> ...
> "select * from Document36"
>
>
>
>



-- 
Lance Norskog
goks...@gmail.com


Re: DIH for sharded database?

2010-12-19 Thread Andy

--- On Mon, 12/20/10, Lance Norskog  wrote:

> You said: Currently these tables all
> live in the same database, but in
> the future they may be moved to different servers to scale
> out if the
> needs arise.
> 
> That's why I concentrated on the JDBC url problem.
> 
> But you can use a file as a list of tables. Read each line,
> and a
> sub-entity can substitute the line value into the SQL
> statement.
> 

Can you give me an example of how to do this, or point me to documentation
that illustrates it? I think I sort of understand what you're saying
conceptually, but I need to be sure about the specifics.

Thanks.