Is there any rule of thumb for the maximum size of a document when storing
it in Solr?



Otis Gospodnetic-5 wrote
> Use Solr.  It's pretty clear you don't yet have any problems that
> would make you think about alternatives.  Using Solr to store and not
> just index will make your life simpler (and your app simpler and
> likely faster).
> 
> Otis
> --
> Solr & ElasticSearch Support
> http://sematext.com/
> 
> 
> 
> 
> 
> On Tue, Apr 16, 2013 at 6:31 PM, Furkan KAMACI <

> furkankamaci@

> > wrote:
>> Thanks again for your answer. If I find any document about such
>> comparisons, I would like to read it.
>>
>> By the way, is there any advantage to using Lucene instead of anything
>> else, such as this:
>>
>> Lucene is natively supported by Solr, and if I use anything else I
>> may face compatibility or communication problems?
>>
>>
>> 2013/4/17 Otis Gospodnetic <

> otis.gospodnetic@

> >
>>
>>> People do use other data stores to retrieve data sometimes. e.g. Mongo
>>> is popular for that.  Like I hinted in another email, I wouldn't
>>> necessarily recommend this for common cases.  Don't do it unless you
>>> really know you need it.  Otherwise, just store in Solr.
>>>
>>> Otis
>>> --
>>> Solr & ElasticSearch Support
>>> http://sematext.com/
>>>
>>>
>>>
>>>
>>>
>>> On Tue, Apr 16, 2013 at 5:32 PM, Furkan KAMACI <

> furkankamaci@

> >
>>> wrote:
>>> > Hi Otis and Jack;
>>> >
>>> > I have done some research on highlights and debugged the code. I see
>>> > that highlights are query dependent and not stored. Why does Solr use
>>> > Lucene for storing text, i.e. the content of a web page? Is there any
>>> > comparison of storing text in HBase or other databases versus Lucene?
>>> >
>>> > Also, I would like to know whether anybody on the Solr user list has
>>> > used anything other than Lucene to store document text.
>>> >
>>> > 2013/4/11 Otis Gospodnetic <

> otis.gospodnetic@

> >
>>> >
>>> >> Source code is your best bet.  Wiki has info about how to use it, but
>>> >> not how highlighting is implemented.  But you don't need to understand
>>> >> the implementation details to understand that they are dynamic,
>>> >> computed specifically for each query for each matching document, so
>>> >> you cannot store them anywhere ahead of time.
>>> >>
>>> >> Otis
>>> >> --
>>> >> Solr & ElasticSearch Support
>>> >> http://sematext.com/
>>> >>
>>> >>
>>> >>
>>> >>
>>> >>
>>> >> On Thu, Apr 11, 2013 at 11:22 AM, Furkan KAMACI <

> furkankamaci@

> >> >
>>> >> wrote:
>>> >> > Hi Otis;
>>> >> >
>>> >> > It seems that I should read more about highlights. Is there anywhere
>>> >> > that explains in detail how highlights are generated in Solr?
>>> >> >
>>> >> > 2013/4/11 Otis Gospodnetic <

> otis.gospodnetic@

> >
>>> >> >
>>> >> >> Hi,
>>> >> >>
>>> >> >> You can't store highlights ahead of time because they are query
>>> >> >> dependent.  You could store documents in HBase and use Solr just for
>>> >> >> indexing.  Is that what you want to do?  If so, a custom
>>> >> >> SearchComponent executed after QueryComponent could fetch data from
>>> >> >> external store like HBase.  I'm not sure if I'd recommend that.
>>> >> >>
>>> >> >> Otis
>>> >> >> --
>>> >> >> Solr & ElasticSearch Support
>>> >> >> http://sematext.com/
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >>
>>> >> >> On Thu, Apr 11, 2013 at 10:01 AM, Furkan KAMACI <
>>> 

> furkankamaci@

>>> >> >
>>> >> >> wrote:
>>> >> >> > Actually, I am not planning to store documents in Solr. I want to
>>> >> >> > store just highlights (snippets) in HBase and retrieve them from
>>> >> >> > HBase when needed.
>>> >> >> > What do you think about separating just the highlights from Solr
>>> >> >> > and storing them in HBase with SolrCloud? Also, I would welcome an
>>> >> >> > explanation of at which stage and how highlights are generated in
>>> >> >> > Solr.
>>> >> >> >
>>> >> >> >
>>> >> >> > 2013/4/9 Otis Gospodnetic <

> otis.gospodnetic@

> >
>>> >> >> >
>>> >> >> >> You may also be interested in looking at things like solrbase
>>> >> >> >> (on GitHub).
>>> >> >> >>
>>> >> >> >> Otis
>>> >> >> >> --
>>> >> >> >> Solr & ElasticSearch Support
>>> >> >> >> http://sematext.com/
>>> >> >> >>
>>> >> >> >>
>>> >> >> >>
>>> >> >> >>
>>> >> >> >>
>>> >> >> >> On Sat, Apr 6, 2013 at 6:01 PM, Furkan KAMACI <
>>> >> 

> furkankamaci@

>>
>>> >> >> >> wrote:
>>> >> >> >> > Hi;
>>> >> >> >> >
>>> >> >> >> > First of all, I should mention that I am new to Solr and am
>>> >> >> >> > researching it. What I am trying to do is crawl some websites
>>> >> >> >> > with Nutch and then index them with Solr. (Nutch 2.1,
>>> >> >> >> > Solr-SolrCloud 4.2)
>>> >> >> >> >
>>> >> >> >> > I wonder about something. I have a cloud of machines that
>>> >> >> >> > crawls websites and stores those documents. Then I send those
>>> >> >> >> > documents to SolrCloud. Solr indexes those documents, generates
>>> >> >> >> > indexes, and saves them. I know from Information Retrieval
>>> >> >> >> > theory that it *may* not be efficient to store indexes in a
>>> >> >> >> > NoSQL database (they are something like linked lists, and if
>>> >> >> >> > you store them in that kind of database you *may* have a sparse
>>> >> >> >> > representation. By the way, there may be some solutions for
>>> >> >> >> > it; if you can explain them, that would be welcome.)
>>> >> >> >> >
>>> >> >> >> > However, Solr stores some documents too (i.e. highlights), so
>>> >> >> >> > some of my documents will be duplicated somehow. Considering
>>> >> >> >> > that I will have many documents, those duplicated documents may
>>> >> >> >> > cause a problem for me. So is there any way to avoid storing
>>> >> >> >> > those documents in Solr and point to them in HBase (where I
>>> >> >> >> > save my crawled documents), or, instead of pointing, to store
>>> >> >> >> > them directly in HBase (is that efficient or not)?
>>> >> >> >>
>>> >> >>
>>> >>
>>>
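
For anyone reading the thread above, here is a minimal sketch of the two
points Otis makes: highlights are computed per query, so they cannot be
stored ahead of time, and an index can hand back ids whose text is then
fetched from a separate store. This is not Solr or HBase code. The dict
EXTERNAL_STORE and the functions build_index, highlight, and search are
hypothetical stand-ins invented for illustration only.

```python
import re

# Hypothetical stand-in for an external store such as HBase: id -> full text.
EXTERNAL_STORE = {
    "doc1": "Solr stores field values in the Lucene index when stored is true.",
    "doc2": "HBase is a column-oriented store that runs on top of Hadoop.",
}


def build_index(store):
    """Build a tiny inverted index: lowercase term -> set of document ids."""
    index = {}
    for doc_id, text in store.items():
        for term in re.findall(r"\w+", text.lower()):
            index.setdefault(term, set()).add(doc_id)
    return index


def highlight(text, term, window=20):
    """Return a snippet around the first match of term, wrapped in <em> tags."""
    match = re.search(re.escape(term), text, flags=re.IGNORECASE)
    if match is None:
        return None
    start = max(0, match.start() - window)
    end = min(len(text), match.end() + window)
    return (
        text[start:match.start()]
        + "<em>" + match.group(0) + "</em>"
        + text[match.end():end]
    )


def search(index, store, term):
    """Look up ids in the index, fetch text from the store, highlight per query."""
    ids = sorted(index.get(term.lower(), set()))
    return {doc_id: highlight(store[doc_id], term) for doc_id in ids}


if __name__ == "__main__":
    index = build_index(EXTERNAL_STORE)
    # The same document yields a different snippet for each query term.
    print(search(index, EXTERNAL_STORE, "store"))
    print(search(index, EXTERNAL_STORE, "HBase"))
```

The snippet for doc2 differs between the two queries, which is why a
snippet saved for one query is of no use for another. The extra fetch from
the store on every hit is also the cost Otis is cautious about when he
says he would not necessarily recommend this setup.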





--
View this message in context: 
http://lucene.472066.n3.nabble.com/Pointing-to-Hbase-for-Docuements-or-Directly-Saving-Documents-at-Hbase-tp4054277p4056599.html
Sent from the Solr - User mailing list archive at Nabble.com.
