Is there any rule of thumb for the maximum size of a document when storing it in Solr?

Otis Gospodnetic-5 wrote
> Use Solr. It's pretty clear you don't yet have any problems that would make you think about alternatives. Using Solr to store and not just index will make your life simpler (and your app simpler and likely faster).
>
> Otis
> --
> Solr & ElasticSearch Support
> http://sematext.com/
>
> On Tue, Apr 16, 2013 at 6:31 PM, Furkan KAMACI <furkankamaci@> wrote:
>> Thanks again for your answer. If I find any document about such comparisons, I would like to read it.
>>
>> By the way, is there any advantage to using Lucene instead of anything else, such as the following: Lucene is natively supported by Solr, and if I use anything else I may face compatibility or communication problems?
>>
>> 2013/4/17 Otis Gospodnetic <otis.gospodnetic@>
>>
>>> People do use other data stores to retrieve data sometimes, e.g. Mongo is popular for that. Like I hinted in another email, I wouldn't necessarily recommend this for common cases. Don't do it unless you really know you need it. Otherwise, just store in Solr.
>>>
>>> Otis
>>>
>>> On Tue, Apr 16, 2013 at 5:32 PM, Furkan KAMACI <furkankamaci@> wrote:
>>> > Hi Otis and Jack;
>>> >
>>> > I have done some research on highlights and debugged the code. I see that highlights are query dependent and not stored. Why does Solr use Lucene for storing text, i.e. the content of a web page? Is there any comparison of storing text in HBase or any other database versus Lucene?
>>> >
>>> > Also, is there anybody on our Solr user list who has used anything other than Lucene to store document text?
>>> >
>>> > 2013/4/11 Otis Gospodnetic <otis.gospodnetic@>
>>> >
>>> >> Source code is your best bet. Wiki has info about how to use it, but not how highlighting is implemented.
>>> >> But you don't need to understand the implementation details to understand that they are dynamic, computed specifically for each query for each matching document, so you cannot store them anywhere ahead of time.
>>> >>
>>> >> Otis
>>> >>
>>> >> On Thu, Apr 11, 2013 at 11:22 AM, Furkan KAMACI <furkankamaci@> wrote:
>>> >> > Hi Otis;
>>> >> >
>>> >> > It seems that I should read more about highlights. Is there anywhere that explains in detail how highlights are generated in Solr?
>>> >> >
>>> >> > 2013/4/11 Otis Gospodnetic <otis.gospodnetic@>
>>> >> >
>>> >> >> Hi,
>>> >> >>
>>> >> >> You can't store highlights ahead of time because they are query dependent. You could store documents in HBase and use Solr just for indexing. Is that what you want to do? If so, a custom SearchComponent executed after QueryComponent could fetch data from an external store like HBase. I'm not sure if I'd recommend that.
>>> >> >>
>>> >> >> Otis
>>> >> >>
>>> >> >> On Thu, Apr 11, 2013 at 10:01 AM, Furkan KAMACI <furkankamaci@> wrote:
>>> >> >> > Actually, I am not planning to store documents in Solr. I want to store just the highlights (snippets) in HBase and retrieve them from HBase when needed.
>>> >> >> > What do you think about separating just the highlights from Solr and storing them in HBase with SolrCloud? By the way, if you could explain in which process and how highlights are generated in Solr, I would appreciate it.
>>> >> >> >
>>> >> >> > 2013/4/9 Otis Gospodnetic <otis.gospodnetic@>
>>> >> >> >
>>> >> >> >> You may also be interested in looking at things like solrbase (on GitHub).
>>> >> >> >>
>>> >> >> >> Otis
>>> >> >> >>
>>> >> >> >> On Sat, Apr 6, 2013 at 6:01 PM, Furkan KAMACI <furkankamaci@> wrote:
>>> >> >> >> > Hi;
>>> >> >> >> >
>>> >> >> >> > First of all, I should mention that I am new to Solr and am doing some research on it. What I am trying to do is crawl some websites with Nutch and then index them with Solr. (Nutch 2.1, Solr-SolrCloud 4.2)
>>> >> >> >> >
>>> >> >> >> > I wonder about something. I have a cloud of machines that crawls websites and stores those documents. Then I send those documents to SolrCloud. Solr indexes those documents, generates the indexes, and saves them. I know from Information Retrieval theory that it *may* not be efficient to store indexes in a NoSQL database (they are something like linked lists, and if you store them in that kind of database you *may* have a sparse representation. By the way, there may be some solutions for it. If you could explain them, I would appreciate it.)
>>> >> >> >> >
>>> >> >> >> > However, Solr stores some documents too (i.e. highlights), so some of my documents will be duplicated somehow. Considering that I will have many documents, those duplicated documents may cause a problem for me.
>>> >> >> >> > So is there any way to avoid storing those documents in Solr and instead point to them in HBase (where I save my crawled documents), or, instead of pointing, to store them directly in HBase (is that efficient or not)?

--
View this message in context: http://lucene.472066.n3.nabble.com/Pointing-to-Hbase-for-Docuements-or-Directly-Saving-Documents-at-Hbase-tp4054277p4056599.html
Sent from the Solr - User mailing list archive at Nabble.com.