Re: Solr index size has increased in solr 7.7.2

2020-04-15 Thread David Hastings
I wouldn't worry about the index size until you get above a half terabyte or so. Adding doc values and other features means you sacrifice things that don't matter, like size. Memory and SSDs are cheap. On Wed, Apr 15, 2020 at 1:21 PM Rajdeep Sahoo wrote: > Hi all > We are migrating from solr 4.

Re: solr index data from hdfs with error

2019-12-20 Thread Erick Erickson
Morphlines support was removed from Solr in Solr 6.6, see: https://issues.apache.org/jira/browse/SOLR-9221 So I don’t think anyone here will be very conversant in the details. I vaguely recall that this process added an ID field by default, but it’s been a very long time since I looked. Do chec

Re: Solr index

2019-08-08 Thread Dario Rigolin
Do you know that your Solr is open to the internet? It's better to filter the port, or at least not post the full address here... On Thu, 8 Aug 2019 at 15:58, HTMLServices.it < i...@htmlservices.it> wrote: > Hi everyone > I installed Solr on a test server (centos 7) to get the faste

Re: Solr index slow response

2019-03-19 Thread Walter Underwood
> Sent: Tuesday, March 19, 2019 3:29:17 PM > To: solr-user@lucene.apache.org > Subject: Re: Solr index slow response > > Indexing is CPU bound. If you have enough RAM, SSD disks, and enough client > threads, you should be able to drive CPU to over 90%. > > Start with two cl

Re: Solr index slow response

2019-03-19 Thread Aaron Yingcai Sun
time. I will try with a SolrCloud cluster, maybe get better speed there. //Aaron From: Walter Underwood Sent: Tuesday, March 19, 2019 3:29:17 PM To: solr-user@lucene.apache.org Subject: Re: Solr index slow response Indexing is CPU bound. If you have enough RAM

Re: Solr index slow response

2019-03-19 Thread Emir Arnautović
> wiki.apache.org > Schema Design Considerations. indexed fields. The number of indexed fields > greatly increases the following: Memory usage during indexing ; Segment merge > time > > > > > > From: Emir Arnautović > Sent: Tu

Re: Solr index slow response

2019-03-19 Thread Michael Gibney
I'll second Emir's suggestion to try disabling swap. "I doubt swap would affect it since there is such huge free memory." -- sounds reasonable, but has not been my experience, and the stats you sent indicate that swap is in fact being used. Also, note that in many cases setting vm.swappiness=0 is n

Re: Solr index slow response

2019-03-19 Thread Walter Underwood
Indexing is CPU bound. If you have enough RAM, SSD disks, and enough client threads, you should be able to drive CPU to over 90%. Start with two client threads per CPU. That allows one thread to be sending data over the network while another is waiting for Solr to process the batch. A couple of
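The sizing rule above (two client threads per CPU, so that one thread can be sending data while another waits for Solr) can be sketched roughly as follows. This is only an illustration under stated assumptions: `send_batch`, the batch size of 100, and all function names are hypothetical placeholders, not part of the original post or of any Solr client library.

```python
from concurrent.futures import ThreadPoolExecutor


def client_thread_count(solr_cpus, threads_per_cpu=2):
    """Number of indexing client threads for a Solr host with `solr_cpus` CPUs."""
    return max(1, solr_cpus * threads_per_cpu)


def chunk(docs, batch_size):
    """Split a list of documents into consecutive batches."""
    return [docs[i:i + batch_size] for i in range(0, len(docs), batch_size)]


def index_in_parallel(docs, send_batch, solr_cpus, batch_size=100):
    """Send batches concurrently.

    `send_batch` is a caller-supplied (hypothetical) function that posts one
    batch to Solr and returns whatever result the caller wants to collect.
    """
    workers = client_thread_count(solr_cpus)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # list() waits for every batch and re-raises any worker exception.
        return list(pool.map(send_batch, chunk(docs, batch_size)))
```

For a 32-CPU Solr host this would start with 64 client threads; the right number is then best found by watching CPU usage on the Solr side.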

Re: Solr index slow response

2019-03-19 Thread Bernd Fehling
Isn't there something about largePageTables which must be enabled in Java and also supported by the OS for such huge heaps? Just a guess. On 19.03.19 at 15:01, Jörn Franke wrote: It could be an issue with JDK 8 that may not be suitable for such large heaps. Have more nodes with smaller heaps (e.g. 3

Re: Solr index slow response

2019-03-19 Thread Jörn Franke
It could be an issue with JDK 8 that may not be suitable for such large heaps. Have more nodes with smaller heaps (e.g. 31 GB) > On 18.03.2019 at 11:47, Aaron Yingcai Sun wrote: > > Hello, Solr! > > > We are having some performance issue when try to send documents for solr to > index. The rep

Re: Solr index slow response

2019-03-19 Thread Chris Ulicny
___ > From: Emir Arnautović > Sent: Tuesday, March 19, 2019 1:00:19 PM > To: solr-user@lucene.apache.org > Subject: Re: Solr index slow response > > If you start indexing with just a single thread/client, do you still see > slow bulks? > > Emir >

Re: Solr index slow response

2019-03-19 Thread Aaron Yingcai Sun
Design Considerations. indexed fields. The number of indexed fields greatly increases the following: Memory usage during indexing ; Segment merge time From: Emir Arnautović Sent: Tuesday, March 19, 2019 1:00:19 PM To: solr-user@lucene.apache.org Subject: Re: Sol

Re: Solr index slow response

2019-03-19 Thread Emir Arnautović
> From: Emir Arnautović > Sent: Tuesday, March 19, 2019 12:30:33 PM > To: solr-user@lucene.apache.org > Subject: Re: Solr index slow response > > Just to add different perspective here: how do you send documents to Solr? > Are those log lines from your client? Maybe it is not Sol

Re: Solr index slow response

2019-03-19 Thread Aaron Yingcai Sun
2019 12:30:33 PM To: solr-user@lucene.apache.org Subject: Re: Solr index slow response Just to add different perspective here: how do you send documents to Solr? Are those log lines from your client? Maybe it is not Solr that is slow. Could it be network or client itself. If you have some dry run on

Re: Solr index slow response

2019-03-19 Thread Emir Arnautović
re > any other faster way to index such big amount of data? > > > BRs > > //Aaron > > > From: Walter Underwood > Sent: Monday, March 18, 2019 4:59:20 PM > To: solr-user@lucene.apache.org > Subject: Re: Solr index slow response > > Solr is not designed t

Re: Solr index slow response

2019-03-19 Thread Aaron Yingcai Sun
9 ms. > 190318-162821.028-189979 DBG1:doc_count: 10 , doc_size: 584 KB, Res code: > 200, QTime: 22800 ms, Request time: 22802 ms. > 190318-162821.056-189948 DBG1:doc_count: 10 , doc_size: 670 KB, Res code: > 200, QTime: 34193 ms, Request time: 34195 ms. > 190318-162821.062-189983 D

Re: Solr index slow response

2019-03-18 Thread Walter Underwood
18-162821.028-189979 DBG1:doc_count: 10 , doc_size: 584 KB, Res code: > 200, QTime: 22800 ms, Request time: 22802 ms. > 190318-162821.056-189948 DBG1:doc_count: 10 , doc_size: 670 KB, Res code: > 200, QTime: 34193 ms, Request time: 34195 ms. > 190318-162821.062-189983 DBG1:doc_count: 10 , do

Re: Solr index slow response

2019-03-18 Thread Aaron Yingcai Sun
s Ulicny Sent: Monday, March 18, 2019 2:54:25 PM To: solr-user@lucene.apache.org Subject: Re: Solr index slow response One other thing to look at besides the heap is your commit settings. We've experienced something similar, and changing commit settings alleviated the issue. Are you opening a sea

Re: Solr index slow response

2019-03-18 Thread Chris Ulicny
__ > From: Emir Arnautović > Sent: Monday, March 18, 2019 2:19:19 PM > To: solr-user@lucene.apache.org > Subject: Re: Solr index slow response > > Hi Aaron, > Without looking too much into numbers, my bet would be that it is large > heap that is causing issues.

Re: Solr index slow response

2019-03-18 Thread Emir Arnautović
___ > From: Emir Arnautović > Sent: Monday, March 18, 2019 2:19:19 PM > To: solr-user@lucene.apache.org > Subject: Re: Solr index slow response > > Hi Aaron, > Without looking too much into numbers, my bet would be that it is large heap > that is causi

Re: Solr index slow response

2019-03-18 Thread Emir Arnautović
142655.210-160208 DBG1:doc_count: 10 , doc_size: 605 KB, Res code: > 200, QTime: 108 ms, Request time: 110 ms. > 190318-142655.304-160208 DBG1:doc_count: 10 , doc_size: 481 KB, Res code: > 200, QTime: 89 ms, Request time: 90 ms. > 190318-142655.410-160208 DBG1:doc_count: 10 , doc_size: 4

Re: Solr index slow response

2019-03-18 Thread Aaron Yingcai Sun
time. BRs //Aaron From: Emir Arnautović Sent: Monday, March 18, 2019 2:19:19 PM To: solr-user@lucene.apache.org Subject: Re: Solr index slow response Hi Aaron, Without looking too much into numbers, my bet would be that it is large heap that is causing issues

Re: Solr index slow response

2019-03-18 Thread Aaron Yingcai Sun
olr-user@lucene.apache.org Subject: Re: Solr index slow response On Mon, 2019-03-18 at 10:47 +, Aaron Yingcai Sun wrote: > Solr server is running on a quite powerful server, 32 cpus, 400GB RAM, > while 300 GB is reserved for solr, [...] 300GB for Solr sounds excessive. > Our applica

Re: Solr index slow response

2019-03-18 Thread Emir Arnautović
.0_144/jre/classes", >> "classpath":"...", >> "commandLineArgs":["-Xms100G", >> "-Xmx300G", >> "-DSTOP.PORT=8079", >> "-DSTOP.KEY=..", >> "-Dsolr.solr.home=.."

Re: Solr index slow response

2019-03-18 Thread Emir Arnautović
lr.home=..", >"-Djetty.port=8983"], > "startTime":"2019-03-18T09:35:27.892Z", > "upTimeMS":9258422}}, > "system":{ > "name":"Linux", >"arch":"amd64", >"av

Re: Solr index slow response

2019-03-18 Thread Toke Eskildsen
On Mon, 2019-03-18 at 10:47 +, Aaron Yingcai Sun wrote: > Solr server is running on a quite powerful server, 32 cpus, 400GB RAM, > while 300 GB is reserved for solr, [...] 300GB for Solr sounds excessive. > Our application sends 100 documents to solr per request, json encoded. > the size is aro

Re: Solr index slow response

2019-03-18 Thread Aaron Yingcai Sun
":32, "systemLoadAverage":14.72, "version":"3.0.101-311.g08a8a9d-default", "committedVirtualMemorySize":2547960700928, "freePhysicalMemorySize":4530696192, "freeSwapSpaceSize":3486846976, "processCpuLoad":0.

Re: Solr index slow response

2019-03-18 Thread Emir Arnautović
Hi Aaron, Which version of Solr? How did you configure your heap? Is it standalone Solr or SolrCloud? A single server? Do you use some monitoring tool? Do you see spikes or pauses, or is CPU usage constant? Thanks, Emir -- Monitoring - Log Management - Alerting - Anomaly Detection Solr & Elast

Re: Solr Index Size after reindex

2019-02-14 Thread David Hastings
e rsync confirm that it has been entirely > completed. > > > > I don't see any transaction not completed that normaly means that the > indexation is completed. That's why I don't understand the difference. > > > > Kind Regards > > > > Matthieu &

Re: Solr Index Size after reindex

2019-02-14 Thread Erick Erickson
e > colleague who realized the rsync confirm that it has been entirely completed. > > I don't see any transaction not completed that normaly means that the > indexation is completed. That's why I don't understand the difference. > > Kind Regards > > M

RE: Solr Index Size after reindex

2019-02-13 Thread Mathieu Menard
se.io] Sent: samedi 9 février 2019 16:56 To: solr-user@lucene.apache.org Subject: Re: Solr Index Size after reindex Yes, those numbers are different and that should explain the different size. I think you should be able to find some information in the Alfresco or Solr log. There must be a reas

Re: Solr Index Size after reindex

2019-02-09 Thread Andrea Gazzarini
* vendredi 8 février 2019 14:54 *To:* solr-user@lucene.apache.org *Subject:* Re: Solr Index Size after reindex Hi Mathieu, what about the docs in the two infrastructures? Do they have the same numbers (numdocs / maxdocs)? Any meaningful message (error or not) in log files? Andrea On 08/02

RE: Solr Index Size after reindex

2019-02-08 Thread Mathieu Menard
9 14:54 To: solr-user@lucene.apache.org Subject: Re: Solr Index Size after reindex Hi Mathieu, what about the docs in the two infrastructures? Do they have the same numbers (numdocs / maxdocs)? Any meaningful message (error or not) in log files? Andrea On 08/02/2019 14:19, Mathieu Menard wrote:

Re: Solr Index Size after reindex

2019-02-08 Thread Andrea Gazzarini
Hi Mathieu, what about the docs in the two infrastructures? Do they have the same numbers (numdocs / maxdocs)? Any meaningful message (error or not) in log files? Andrea On 08/02/2019 14:19, Mathieu Menard wrote: Hello, I would like to have your point of view about an observation we have

Re: Solr index writing to s3

2019-01-17 Thread Mikhail Khludnev
There is some experience on backup to s3 https://issues.apache.org/jira/browse/SOLR-9952 iirc, it lacks performance. Jörn, it's not a point, but literally s3 consistency might be enough, since s3 provides read-after-write for PUT and Lucene index writer is append-only. On Thu, Jan 17, 2019 at 10:1

Re: Solr index writing to s3

2019-01-16 Thread Jörn Franke
This is not a requirement. This is a statement to a problem where there could be other solutions. s3 is only eventually consistent and I am not sure Solr works properly in this case. You may also need to check the S3 consistency to be applied. > On 16.01.2019 at 19:39, Naveen M wrote: > > hi

Re: Solr index writing to s3

2019-01-16 Thread Hendrik Haddorp
Theoretically you should be able to use the HDFS backend, which you can configure to use s3. Last time I tried that it did however not work for some reason. Here is an example for that, which also seems to have ultimately failed: https://community.plm.automation.siemens.com/t5/Developer-Space/R

Re: Solr Index Data will be delete if state.json did not exists

2018-12-14 Thread Jan Høydahl
I would use the Backup/Restore API https://lucene.apache.org/solr/guide/7_5/making-and-restoring-backups.html Alternatively, you could create collection B, using same configset as A, stop solr, copy the data folder and

Re: [solr-index]Can I do a lot of analysis on one field at the time of indexing?

2018-12-13 Thread Walter Underwood
> > If you can afford the time, can you give us a specific sample of the proposed > method? > > Thank you. > > -Original Message- > From: Walter Underwood > Sent: Friday, December 14, 2018 12:11 PM > To: solr-user@lucene.apache.org > Subject: Re: [solr-

RE: [solr-index]Can I do a lot of analysis on one field at the time of indexing?

2018-12-13 Thread 유정인
WalterUnderwood, thank you for your reply. If you can afford the time, can you give us a specific sample of the proposed method? Thank you. -Original Message- From: Walter Underwood Sent: Friday, December 14, 2018 12:11 PM To: solr-user@lucene.apache.org Subject: Re: [solr-index]Can

Re: [solr-index]Can I do a lot of analysis on one field at the time of indexing?

2018-12-13 Thread Walter Underwood
Right, no feature that does that for you. You should be able to code that with an update request processor script. You can fetch an analyzer chain, run it, add the results to a field, then do that again. I have one that runs a chain with minhash then saves the hex values of the hashes to a field.

Re: [solr-index]Can I do a lot of analysis on one field at the time of indexing?

2018-12-13 Thread Erick Erickson
In a word, "no". A field can have exactly one tokenizer, and there are no conditional filters. You can copyField to multiple individual fields and treat each one of those differently, i.e. copy from title to title1, title2 etc. where each one has a different analysis chain. Best, Erick On Thu, Dec

Re: SOLR Index Time Running Optimization

2018-09-26 Thread Walter Underwood
How long does the query take when it is run directly, without Solr? For our DIH queries, Solr was not the slow part. It took 90 minutes directly or with DIH. With our big cluster, I’ve seen indexing rates of one million docs per minute. wunder Walter Underwood wun...@wunderwood.org http://observe

Re: SOLR Index Time Running Optimization

2018-09-26 Thread Jan Høydahl
With DIH you are doing indexing single-threaded. You should be able to configure multiple DIH's on the same collection and then partition the data between them, issuing slightly different SQL to each. But I don't exactly know what that would look like. -- Jan Høydahl, search solution architect

Re: SOLR Index Time Running Optimization

2018-09-26 Thread Susheel Kumar
Also, are you using Solr data import? That will be much slower compared to writing your own little indexer which does indexing in batches and with multiple threads. On Wed, Sep 26, 2018 at 8:00 AM Vincenzo D'Amore wrote: > Hi, I know this is the shortest way but, had you tried to add more core

Re: SOLR Index Time Running Optimization

2018-09-26 Thread Vincenzo D'Amore
Hi, I know this is the shortest way, but have you tried adding more cores or CPUs to your Solr instances? How big is your collection in terms of GB and number of documents? Ciao, Vincenzo > On 26 Sep 2018, at 08:36, Krizelle Mae Hernandez > wrote: > > Hi. > > Our SOLR currently is running appr

Re: Solr index clearing

2018-09-25 Thread Jan Høydahl
Hi, Solr does not do anything automatically, so I think this is a question for the Nutch community - http://nutch.apache.org/mailing_lists.html -- Jan Høydahl, search solution architect Cominvent AS - www.cominvent.com > On 24 Sep 2018 at 20:06, Bineesh wrote: > > Team, > > We use solr 7.3.1

Re: Solr Index Issues

2018-09-10 Thread Walter Underwood
Every time you see "Expected mime type application/octet-stream but got text/html" from SolrJ, it means that Solr returned an error. Look for an error in the Solr logs at the same time as the SolrJ message. It could be any error, which is why we can’t help more. After you know the Solr error, w

Re: Solr Index Issues

2018-09-10 Thread Erick Erickson
It would be best to ask on the Nutch mailing list; this list doesn't have very many people who know _how_ Nutch uses Solr, though. Best, Erick On Sun, Sep 9, 2018 at 11:47 PM Bineesh wrote: > > Hi Team, > > We are using Nutch 1.15 and Solr 6.6.3 > > We tried crawling one of the URL and noticed

Re: Solr index getting replaced instead of merged

2017-08-31 Thread David Hastings
>Can anyone tell is it possible to paginate the data using Solr UI? Use the start/rows input fields, with start counting from 0 like a standard array index, i.e. start=0, rows=10; start=10, rows=10; start=20, rows=10. On Thu, Aug 31, 2017 at 8:21 AM, Agrawal, Harshal (GE Digital) < harshal.agra...@ge.com> wrote: > Hello A
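The start/rows paging scheme described above can be sketched as a small helper. `q`, `start` and `rows` are standard Solr query parameters; the function names and the default `*:*` query are illustrative choices for this sketch only.

```python
from urllib.parse import urlencode


def page_params(page, rows=10, query="*:*"):
    """Return Solr query parameters for a zero-based page number."""
    if page < 0 or rows <= 0:
        raise ValueError("page must be >= 0 and rows must be > 0")
    # Page 0 starts at offset 0, page 1 at offset `rows`, and so on.
    return {"q": query, "start": page * rows, "rows": rows}


def page_query_string(page, rows=10, query="*:*"):
    """URL-encoded form of page_params(), ready to append to a select URL."""
    return urlencode(page_params(page, rows, query))
```

With `rows=10`, pages 0, 1 and 2 map to `start=0`, `start=10` and `start=20`, matching the values in the reply.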

RE: Solr index getting replaced instead of merged

2017-08-31 Thread Agrawal, Harshal (GE Digital)
Hello All, When I unchecked the clean option while indexing the 2nd table, it worked. Thanks Gurdeep :) Can anyone tell me whether it is possible to paginate the data using the Solr UI? If yes, please tell me which features I can use. Regards Harshal From: Agrawal, Harshal (GE Digital) Sent: Wednesday, August 30, 2017

Re: Solr index getting replaced instead of merged

2017-08-30 Thread Gurdeep Singh
Not sure how you are doing indexing. Try adding clean=false in your indexing command/script when you do second table indexing. > On 30 Aug 2017, at 7:06 PM, Agrawal, Harshal (GE Digital) > wrote: > > Hello Guys, > > I have installed solr in my local system and was able to connect to Terad

RE: Solr Index issue on string type while querying

2017-08-02 Thread padmanabhan
Thank you Matt for the reply. My apologies for the lack of clarity in the problem statement. The problem was with the source attribute value defined at the source system. Source system with the heightSquareTube_string_mv: > 90 - 100 mm Solr index converts the xml or html code to its symbol equivale

Re: SOLR Index and Schema.xml file corruption

2017-05-23 Thread Erick Erickson
If you have classic schema factory configured, then Solr will not write the schema.xml file out. So either something's strange with SiteCore or someone inadvertently hand-edited the schema. I suggest contacting the SiteCore people to see how it would get that way. You should be able to shut Solr/S

RE: Solr Index issue on string type while querying

2017-05-16 Thread Matt Kuiper
Your problem statement is not quite clear, however I will make a guess. Assuming your problem is that when you remove the '>' sign from your query term you receive zero results, then this is actually expected behavior for field types that are of type string. When searching against string fields

Re: Solr Index size keeps fluctuating, becomes ~4x normal size.

2017-04-18 Thread Shawn Heisey
On 4/10/2017 1:57 AM, Himanshu Sachdeva wrote: > Thanks for your time and quick response. As you said, I changed our > logging level from SEVERE to INFO and indeed found the performance > warning *Overlapping onDeckSearchers=2* in the logs. I am considering > limiting the *maxWarmingSearchers* coun

Re: Solr Index size keeps fluctuating, becomes ~4x normal size.

2017-04-11 Thread Toke Eskildsen
On Mon, 2017-04-10 at 13:27 +0530, Himanshu Sachdeva wrote: > Thanks for your time and quick response. As you said, I changed our > logging level from SEVERE to INFO and indeed found the performance > warning *Overlapping onDeckSearchers=2* in the logs. If you only see it occasionally, it is proba

Re: Solr Index size keeps fluctuating, becomes ~4x normal size.

2017-04-10 Thread kshitij tyagi
Hi Himanshu, maxWarmingSearchers would break nothing on production. Whenever you request Solr to open a new searcher, it autowarms the searcher so that it can utilize caching. After autowarm is complete, a new searcher is opened. The questions you need to address here are 1. Are you using soft-com

Re: Solr Index size keeps fluctuating, becomes ~4x normal size.

2017-04-10 Thread Himanshu Sachdeva
Hi Toke, Thanks for your time and quick response. As you said, I changed our logging level from SEVERE to INFO and indeed found the performance warning *Overlapping onDeckSearchers=2* in the logs. I am considering limiting the *maxWarmingSearchers* count in configuration but want to be sure that n

Re: Solr Index size keeps fluctuating, becomes ~4x normal size.

2017-04-06 Thread Toke Eskildsen
On Thu, 2017-04-06 at 16:30 +0530, Himanshu Sachdeva wrote: > We monitored the index size for a few days and found that it varies > widely from 11GB to 43GB. Lucene/Solr indexes consist of segments, each holding a number of documents. When a document is deleted, its bytes are not removed immedia
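To get a feel for how much of an index is taken up by deleted documents that have not yet been merged away, the numDocs and maxDoc figures that Solr reports for a core can be compared (maxDoc still counts deleted documents, numDocs does not). The helper below is a minimal sketch; the function name is made up for illustration, and the fraction is by document count, which only approximates the share of bytes on disk.

```python
def deleted_fraction(num_docs, max_doc):
    """Fraction of documents in the index that are marked deleted.

    num_docs: live documents (Solr's numDocs)
    max_doc:  live plus not-yet-purged deleted documents (Solr's maxDoc)
    """
    if num_docs < 0 or max_doc < num_docs:
        raise ValueError("expected 0 <= num_docs <= max_doc")
    if max_doc == 0:
        return 0.0  # empty index, nothing deleted
    return (max_doc - num_docs) / max_doc
```

A value that climbs and then drops after segment merges is consistent with the size fluctuation described in this thread.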

Re: Solr Index upgradation Merging issue observed

2017-01-09 Thread Shawn Heisey
On 1/8/2017 11:21 PM, Manan Sheth wrote: > Currently, We are in process of upgrading existing Solr indexes from Solr 4.x > to Solr 6.2.1. In order to upgrade existing indexes we are planning to use > IndexUpgrader class in sequential manner from Solr 4.x to Solr 5.x and Solr > 5.x to Solr 6.2.1.

Re: SOLR index help (SQL Anywhere 16, MS SQL 2014)

2016-12-05 Thread Erick Erickson
There are two basic choices, see Data Import Handler (DIH) or roll-your-own solrJ client, see: https://cwiki.apache.org/confluence/display/solr/Uploading+Structured+Data+Store+Data+with+the+Data+Import+Handler https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/ Best, Erick On Mon, Dec 5,

Re: Solr - index polygons from csv

2016-04-28 Thread David Smiley
Hi. To use polygons, you need to add JTS, otherwise you get an unsupported shape error. See https://cwiki.apache.org/confluence/display/solr/Apache+Solr+Reference+Guide it involves not only adding a JTS lib to your classpath (ideal spot is WEB-INF/lib ) but also adding a spatialContextFactory att

Re: solr index size issue

2016-03-20 Thread Zheng Lin Edwin Yeo
Did you check if your index still contains 500 docs, or is there more? Regards, Edwin On 12 March 2016 at 22:54, Toke Eskildsen wrote: > sara hajili wrote: > > why solr index size become bigger and bigger without adding any new doc? > > Solr does not change the index unprovoked. It sounds lik

Re: solr index size issue

2016-03-12 Thread Toke Eskildsen
sara hajili wrote: > why solr index size become bigger and bigger without adding any new doc? Solr does not change the index unprovoked. It sounds like your external document feeding process is still running. - Toke Eskildsen

Re: Solr | index | Lock Type

2016-02-26 Thread Shawn Heisey
On 2/26/2016 7:48 AM, Prateek Jain J wrote: > WARN - 2016-02-26 05:49:29.191; org.apache.solr.core.SolrCore; [cm_history] > WARNING: Solr index directory '/foo/solr/cm_history/data/index/' is locked. > Unlocking... > WARN - 2016-02-26 05:49:29.680; org.apache.solr.rest.ManagedResource; No > s

Re: Solr index segment level merge

2015-12-29 Thread Tomás Fernández Löbbe
Would collection aliases be an option (assuming you are using SolrCloud mode)? https://cwiki.apache.org/confluence/display/solr/Collections+API#CollectionsAPI-api4 On Tue, Dec 29, 2015 at 9:21 PM, Erick Erickson wrote: > Could you simply add the new documents to the current index? > > That asi

Re: Solr index segment level merge

2015-12-29 Thread Erick Erickson
Could you simply add the new documents to the current index? That aside, merging does not need to create a new core or a new folder. The form: mergeindexes&core=core0&indexDir=/opt/solr/core1/data/index&indexDir=/opt/solr/core2/data/index Should merge the indexes from the two directories into th
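The merge request quoted above can be assembled programmatically. The sketch below only builds the CoreAdmin URL and does not send it; the base URL `http://localhost:8983/solr` and the function name are assumptions for illustration, while `action=mergeindexes`, `core` and the repeated `indexDir` parameters follow the form given in the reply.

```python
from urllib.parse import urlencode


def merge_indexes_url(base_url, target_core, index_dirs):
    """Build a CoreAdmin mergeindexes URL for the given source index directories."""
    if not index_dirs:
        raise ValueError("at least one index directory is required")
    # A list of pairs lets the indexDir parameter repeat once per directory.
    params = [("action", "mergeindexes"), ("core", target_core)]
    params += [("indexDir", path) for path in index_dirs]
    return base_url.rstrip("/") + "/admin/cores?" + urlencode(params)
```

For example, `merge_indexes_url("http://localhost:8983/solr", "core0", ["/opt/solr/core1/data/index", "/opt/solr/core2/data/index"])` yields a URL with one `indexDir` entry per source directory, each path percent-encoded.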

Re: Solr index segment level merge

2015-12-29 Thread Walter Underwood
You probably do not NEED to merge your indexes. Have you tried not merging the indexes? wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On Dec 29, 2015, at 7:31 PM, jeba earnest wrote: > > I have a scenario that I need to merge the solr indexes online

RE: Solr Index data lost

2015-04-22 Thread Vijay Bhoomireddy
your help.. Regards Vijay From: Vijaya Narayana Reddy Bhoomi Reddy [mailto:vijaya.bhoomire...@whishworks.com] Sent: 21 April 2015 09:22 To: solr-user@lucene.apache.org Subject: Re: Solr Index data lost Shawn, Yes, I had used java -jar start.jar. I haven't tried moving it to a

Re: Solr Index data lost

2015-04-21 Thread Vijaya Narayana Reddy Bhoomi Reddy
en done not in the correct fashion. > > > > Thanks & Regards > > Vijay > > > > -Original Message- > > From: Shawn Heisey [mailto:apa...@elyograg.org] > > Sent: 20 April 2015 22:34 > > To: solr-user@lucene.apache.org > > Subject: Re:

Re: Solr Index data lost

2015-04-20 Thread Erick Erickson
e window where > it was started earlier. > > Please correct me if something has been done not in the correct fashion. > > Thanks & Regards > Vijay > > -Original Message- > From: Shawn Heisey [mailto:apa...@elyograg.org] > Sent: 20 April 2015 22:34 > To: so

Re: Solr Index data lost

2015-04-20 Thread Shawn Heisey
On 4/20/2015 4:58 PM, Vijay Bhoomireddy wrote: > I haven’t changed any DirectoryFactory setting in the solrconfig.xml as I am > using in a local setup and using the default configurations. > > Device has been unmounted successfully (confirmed through windows message in > the lower right corner).

RE: Solr Index data lost

2015-04-20 Thread Vijay Bhoomireddy
5 22:34 To: solr-user@lucene.apache.org Subject: Re: Solr Index data lost On 4/20/2015 2:55 PM, Vijay Bhoomireddy wrote: > I have configured Solr example server on a pen drive. I have indexed > some content. The data directory was under > example/solr/collection1/data which is the defau

Re: Solr Index data lost

2015-04-20 Thread Shawn Heisey
On 4/20/2015 2:55 PM, Vijay Bhoomireddy wrote: > I have configured Solr example server on a pen drive. I have indexed some > content. The data directory was under example/solr/collection1/data which is > the default one. After indexing, I stopped the Solr server and unplugged the > pen drive and re

Re: solr index design for this use case?

2015-04-15 Thread vsriram30
Hi Eric, Thanks for your response. I was planning to do the same, to store the data in a single collection with site parameter differentiating duplicated content for different sites. But my use case is that in future the content would run into millions and potentially there could be large number o

Re: solr index design for this use case?

2015-04-15 Thread Erick Erickson
At this data size, don't worry at _all_ about duplicating content. A single Solr node easily holds 20M docs. 50M is common and 250M is not unheard of. My bold claim is: you can freely duplicate the data to your heart's content and you'll never notice it. In fact, you can put it all in a single co

Re: SOLR Index in shared/Network folder

2015-03-30 Thread Walter Underwood
I suggest that you do not try to save money on disk space. Disk is cheap. You will spend weeks of expensive engineering time trying to make this work. Once you make it work, it will be slow and unreliable. 300GB Amazon EBS volumes are $180/year, $360/year for SSD. Just spend the money. wunder Wa

Re: SOLR Index in shared/Network folder

2015-03-30 Thread Erick Erickson
First examine whether you can reduce the amount of data you keep around, field norms, stored fields, etc. Here's a place to start: http://stackoverflow.com/questions/10080881/solr-index-size-reduction I have heard of people doing what you suggest, but be _very_ careful that you don't accidentally

Re: SOLR Index in shared/Network folder

2015-03-29 Thread abhi Abhishek
Hello, Thanks for the suggestions. My aim is to reduce the disk space usage. I have 1 master with 2 slaves configured, where the slaves are used for searching and the master ingests new data that is replicated to the slaves, but as my index size is in the 100s of GB we see a 3x space overhead. I would like to red

Re: SOLR Index in shared/Network folder

2015-03-27 Thread Erick Erickson
To pile on: If you're talking about pointing two Solr instances at the _same_ index, it doesn't matter whether you are on NFS or not, you'll have all sorts of problems. And if this is a SolrCloud installation, it's particularly hard to get right. Please do not do this unless you have a very good r

Re: SOLR Index in shared/Network folder

2015-03-27 Thread Walter Underwood
Several years ago, I accidentally put Solr indexes on an NFS volume and it was 100X slower. If you have enough RAM, query speed should be OK, but startup time (loading indexes into file buffers) could be really long. Indexing could be quite slow. wunder Walter Underwood wun...@wunderwood.org ht

Re: SOLR Index in shared/Network folder

2015-03-26 Thread Shawn Heisey
On 3/27/2015 12:06 AM, abhi Abhishek wrote: > Greetings, > I am trying to use a network shared location as my index directory. > are there any known problems in using a Network File System for running a > SOLR Instance? It is not recommended. You will probably need to change the lockType, .

Re: Solr index corrupt question

2014-10-31 Thread ku3ia
Erick Erickson wrote > What version of Solr/Lucene? At first it was Lucene/Solr v4.6, but later it was changed to Lucene/Solr 4.8. Later still, the _root_ field and child doc support were added to the schema. A full data re-index was not done on each change. But not so long ago I had made an optimize to

Re: Solr index corrupt question

2014-10-31 Thread Erick Erickson
What version of Solr/Lucene? There have been some instances of index corruption, see the lucene/CHANGES.txt file that might account for it. This is something of a stab in the dark though. Because this is troubling... Best, Erick On Fri, Oct 31, 2014 at 7:57 AM, ku3ia wrote: > Hi, Erick. Thanks

Re: Solr index corrupt question

2014-10-31 Thread ku3ia
Hi, Erick. Thanks for your response. I checked my index with the check index utility, and this is what I got: 3 of 41: name=_1ouwn docCount=518333 codec=Lucene46 compound=false numFiles=11 size (MB)=431.564 diagnostics = {timestamp=1412166850391, os=Linux, os.version=3.2.0-68-generic, mergeFactor

Re: Solr index corrupt question

2014-10-31 Thread Erick Erickson
Not quite sure what you mean by "destroy". I can use a delete-by-query with *:* and mark all docs in my index deleted. Search results will return nothing but it's still a valid index, it just consists of all deleted docs. All the segments may be removed even in the absence of an optimize due to seg

Re: Solr Index to Helio Search

2014-10-13 Thread Norgorn
*totalprovidencevideo It worked, thanks, you helped me save nearly a week of reindexing and a lot of nerves. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-Index-to-Helio-Search-tp4163446p4164114.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr Index to Helio Search

2014-10-09 Thread Yonik Seeley
Hmmm, I imagine this is due to the lucene back compat bugs that were in 4.10, and the fact that the last release of heliosearch was branched off of the 4x branch. I just tried moving an index back and forth between my local heliosearch copy and solr 4.10.1 and things worked fine. Here's the snaps

Re: Solr index filename doesn't match with solr vesion

2014-02-17 Thread Nguyen Manh Tien
Thanks Shawn, Tri for your infos, explanation. Tien On Mon, Feb 17, 2014 at 1:36 PM, Tri Cao wrote: > Lucene main file formats actually don't change a lot in 4.x (or even 5.x), > and the newer codecs just delegate to previous versions for most file > types. The newer file types don't typically

Re: Solr index filename doesn't match with solr vesion

2014-02-16 Thread Tri Cao
Lucene main file formats actually don't change a lot in 4.x (or even 5.x), and the newer codecs just delegate to previous versions for most file types. The newer file types don't typically include Lucene's version in file names. For example, Lucene 4.6 codecs basically delegate stored fields and term

Re: Solr index filename doesn't match with solr vesion

2014-02-16 Thread Shawn Heisey
On 2/16/2014 7:25 PM, Nguyen Manh Tien wrote: > I upgraded recently from solr 4.0 to solr 4.6, > I check solr index folder and found there file > > _aars_*Lucene41*_0.doc > _aars_*Lucene41*_0.pos > _aars_*Lucene41*_0.tim > _aars_*Lucene41*_0.tip > > I don't know why it don't have *Lucene46* in fi

Re: SOLR index Recovery & availability

2013-09-08 Thread Walter Underwood
This sounds very complicated for only 30K documents. Put them all on one server, give it enough memory so that the index can all be in file buffers. If there is a disaster, reindex everything. That should only take a few minutes. And don't optimize. wunder On Sep 8, 2013, at 3:01 PM, atuldj.ja

Re: Solr Index Files in a Directories

2013-07-26 Thread Otis Gospodnetic
Or simply use Flume Solr Sink and skip writing to local disk. Otis -- Solr & ElasticSearch Support -- http://sematext.com/ Performance Monitoring -- http://sematext.com/spm On Thu, Jul 25, 2013 at 11:02 PM, Jack Krupansky wrote: > Use LucidWorks Search, define a file system data source and set

Re: Solr Index Files in a Directories

2013-07-25 Thread Jack Krupansky
Use LucidWorks Search, define a file system data source and set the schedule to crawl the directory every minute, 5 minutes, 30 seconds, or whatever interval you want. http://docs.lucidworks.com/display/lweug/Simple+Filesystem+Data+Sources http://docs.lucidworks.com/display/help/Schedules -- J

Re: Solr index lot of pdf, doc, txt

2013-07-19 Thread sodoo
I'm using Solr 4.2, but I don't really understand this recursive post approach. I was thinking of writing a bash script, but a bash script is not a good solution. Is there another way? Please advise. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-index-lot-of-pdf-doc-txt-tp4078

Re: Solr index lot of pdf, doc, txt

2013-07-17 Thread Alexandre Rafalovitch
You don't seem to be too creative with your doc_id values, so perhaps you can use Solr 4's post.jar recursive option: http://wiki.apache.org/solr/ExtractingRequestHandler#SimplePostTool_.28post.jar.29 Otherwise, you need to correlate the ID and the source file somehow, so you probably need a file

Re: Solr index searcher to lucene index searcher

2013-04-26 Thread parnab kumar
Hi, thanks Chris. For every document that matches the query I want to be able to compute the following set of features for a query-document pair: LuceneScore (the vector space score that Lucene gives to each doc), LinkScore (computed from Nutch), OpicScore (computed from

Re: Solr index searcher to lucene index searcher

2013-04-26 Thread Chris Hostetter
: used to call the lucene IndexSearcher . As the documents are collected in : TopDocs in Lucene , before that is passed back to Nutch , i used to look : into the top K matching documents , consult some external repository : and further score the Top K documents and reorder them in the TopDocs array

Re: Solr index searcher to lucene index searcher

2013-04-23 Thread parnab kumar
Hi, thanks Chris. I had been using Nutch 1.1. The Nutch IndexSearcher used to call the Lucene IndexSearcher. As the documents are collected in TopDocs in Lucene, before that is passed back to Nutch, I used to look into the top K matching documents, consult some external repository an
