Some of these are big questions. Try asking them in separate emails.

On Wed, Sep 29, 2010 at 9:40 AM, Sharma, Raghvendra
<sraghven...@corelogic.com> wrote:
> Some questions.
>
> 1. I have about 3-5 tables. Designing schema.xml for a single table looks
> ok, but I am not sure what the direction is for handling multiple table
> structures. Would it be one big XML file, wherein those three tables
> (assuming it's three) would show up as three different tag-trees, nullable?
>
> My source provides me a single flat file per table (tab delimited).
>
> Do you think having multiple indexes could be a solution for this case ?? or 
> do I really need to spend effort in denormalizing the data ?
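
A rough sketch of what denormalizing could look like, following the "rows become documents, columns become fields" advice quoted further down. It assumes each tab-delimited file has a header row and that the tables share a key column; the names `read_tsv`, `denormalize`, the `id` key and the sample data are all made up for illustration, and nothing here is a Solr API.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-delimited text (header row first) into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t"))


def denormalize(parents, children, key, prefix):
    """Fold child rows into their parent row: one flat document per parent.

    Each child column becomes a multi-valued field named prefix + column.
    """
    docs = {row[key]: dict(row) for row in parents}
    for child in children:
        doc = docs.get(child[key])
        if doc is None:
            continue  # orphan child row: no parent document to attach to
        for column, value in child.items():
            if column != key:
                doc.setdefault(prefix + column, []).append(value)
    return list(docs.values())


# Hypothetical sample data standing in for two of the flat files.
PROPERTIES = "id\taddress\n1\t12 Oak St\n2\t9 Elm Ave\n"
SALES = "id\tyear\tprice\n1\t2008\t210000\n1\t2010\t250000\n"

docs = denormalize(read_tsv(PROPERTIES), read_tsv(SALES), key="id", prefix="sale_")
for doc in docs:
    print(doc)
```

Each resulting dict maps onto one document, with the child columns as multi-valued fields in schema.xml. Whether this beats separate indexes depends on how you need to query across the tables.
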
>
> 2. Further, loading into solr can use some perf tuning.. any tips ? best 
> practices ?
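
One common tip for loading is to send documents in batches and commit once at the end rather than after every document. A minimal sketch of that pattern follows; `send_batch` and `commit` are placeholders for whatever client calls you end up using (they are assumptions, not a real Solr client API), and the batch size of 1000 is arbitrary.

```python
from itertools import islice


def batches(docs, size):
    """Yield successive lists of at most `size` documents from any iterable."""
    it = iter(docs)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def load(docs, send_batch, commit, size=1000):
    """Send docs in batches, commit once at the end, return the doc count."""
    total = 0
    for batch in batches(docs, size):
        send_batch(batch)  # placeholder: one update request per batch
        total += len(batch)
    commit()  # placeholder: a single commit after all batches are sent
    return total


# Stand-ins that only record what would have been sent.
sent, commits = [], []
count = load(({"id": i} for i in range(2500)), sent.append,
             lambda: commits.append(True), size=1000)
print(count, [len(b) for b in sent], len(commits))
```

The right batch size depends on document size and memory, so treat the number as something to measure, not a recommendation.
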
>
> 3. Also, is there a way to specify a xslt at the server side, and make it 
> default, i.e. whenever a response is returned, that xslt is applied to the 
> response automatically...
>
> 4. And last question for the day - :) there was one post saying that the
> spatial support is really basic in Solr and is going to be improved in the
> next versions... Can you people help me get a definitive yes or no on
> spatial support... in its current form, does it work or not? I would store
> lat and long, and would need to make them searchable...
>
> --raghav..
>
> -----Original Message-----
> From: Sharma, Raghvendra [mailto:sraghven...@corelogic.com]
> Sent: Tuesday, September 28, 2010 11:45 AM
> To: solr-user@lucene.apache.org
> Subject: RE: Is Solr right for my business situation ?
>
> Thanks for the responses people.
>
> @Grant
>
> 1. can you show me some direction on that.. loading data from an incoming 
> stream.. do I need some third party tools, or need to build something 
> myself...
>
> 4. I am basically attempting to build a very fast search interface for the 
> existing data. The volume I mentioned is more like static one (data is 
> already there). The sql statements I mentioned are daily updates coming. The 
> good thing is that the history is not there, so the overall volume is not 
> growing, but I need to apply the update statements.
>
> One workaround I had in mind (though the performance would not be great) is
> to apply the updates to a copy of the RDBMS, and then feed the RDBMS extract
> to Solr.  Sounds like overkill, but I don't have another idea right now.
> Perhaps business discussions would yield something.
>
> @All -
>
> Some more questions guys.
>
> 1. I have about 3-5 tables. Designing schema.xml for a single table looks
> ok, but I am not sure what the direction is for handling multiple table
> structures. Would it be one big XML file, wherein those three tables
> (assuming it's three) would show up as three different tag-trees, nullable?
>
> My source provides me a single flat file per table (tab delimited).
>
> 2. Further, loading into solr can use some perf tuning.. any tips ? best 
> practices ?
>
> 3. Also, is there a way to specify a xslt at the server side, and make it 
> default, i.e. whenever a response is returned, that xslt is applied to the 
> response automatically...
>
> 4. And last question for the day - :) there was one post saying that the
> spatial support is really basic in Solr and is going to be improved in the
> next versions... Can you people help me get a definitive yes or no on
> spatial support... in its current form, does it work or not? I would store
> lat and long, and would need to make them searchable...
>
> Looks like I'm close to my solution.. :)
>
> --raghav
>
> -----Original Message-----
> From: Grant Ingersoll [mailto:gsing...@apache.org]
> Sent: Tuesday, September 28, 2010 1:05 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Is Solr right for my business situation ?
>
> Inline.
>
> On Sep 27, 2010, at 1:26 PM, Walter Underwood wrote:
>
>> When do you need to deploy?
>>
>> As I understand it, the spatial search in Solr is being rewritten and is 
>> slated for Solr 4.0, the release after next.
>
> It will be in 3.x, the next release
>
>>
>> The existing spatial search has some serious problems and is deprecated.
>>
>> Right now, I think the only way to get spatial search in Solr is to deploy a 
>> nightly snapshot from the active development on trunk. If you are deploying 
>> a year from now, that might change.
>>
>> There is not any support for SQL-like statements or for joins. The best 
>> practice for Solr is to think of your data as a single table, essentially 
>> creating a view from your database. The rows become Solr documents, the 
>> columns become Solr fields.
>
> There are now group-by capabilities in trunk as well, which may or may not
> help.
>
>>
>> wunder
>>
>> On Sep 27, 2010, at 9:34 AM, Sharma, Raghvendra wrote:
>>
>>> I am sure these kinds of questions keep coming to you guys, but I want to
>>> raise the same question in a different context... my own business
>>> situation. I am very new to Solr, and though I have tried to read through
>>> the documentation, I am nowhere near finishing it.
>>>
>>> The need is like this -
>>>
>>> We have a huge RDBMS database/table. A single table houses perhaps 100+
>>> million rows. Though Oracle is doing a fine job of handling inserts and
>>> updates, querying is where our main concerns lie.  Since we have spatial
>>> data, index building takes hours and hours for such tables.
>>>
>>> That's when we thought of moving away from standard rdbms and thought of 
>>> trying something different and fast.
>>> My last week has been spent in a journey reading through bigtable to hadoop 
>>> to hbase, to hive and then finally landed on solr. As far as I am in my 
>>> tests, it looks pretty good, but I have a few unanswered questions still. 
>>> Trying this group for them  :)  (I am sure I can find some answers if I 
>>> read/google more on the topic, but now I m being lazy and feel asking the 
>>> people who are already using it/or perhaps developing it is a better bet).
>>>
>>> 1. Can I get my solr instance to load data (fresh data for indexing) from a 
>>> stream (imagine a mq kind of queue, or similar) ?
>
> Yes, with a little bit of work.
>
>>> 2. Can I host my solr instance to use hbase as the database/file system 
>>> (read HDFS) ?
>
> Probably, but I doubt it will be fast.  Local disk is usually the best.  100+ 
> M rows is large but not unreasonable.
>
>>> 3. are there somewhere any reports available (as in benchmarks ) for a solr 
>>> instance's performance ?
>
> You can probably search the web for these.  I've personally seen several 
> installs w/ 1B+ docs and subsecond search and faceting and heard of others.  
> You might look at the stuff the Hathi trust has put up.
>
>>> 4. are there any APIs available which might help me apply ANSI sql kind of 
>>> statements to my solr data ?
>
> No.  Question back: what kinds of things are you trying to do?
>
>>>
>>> It would be great if people could help share their experience in the 
>>> area... if it's too much trouble writing all of it, perhaps url would be 
>>> easier... I welcome all kinds of help here... any advice/suggestions are 
>>> good ...
>>>
>>> Looking forward to your viewpoints..
>>>
>>> --raghav..
>>>
>>
>>
>>
>>
>
> --------------------------
> Grant Ingersoll
> http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 7-8
>
>



-- 
Lance Norskog
goks...@gmail.com
