Re: Re:Interpreting Solr indexing times

2021-01-13 Thread Alessandro Benedetti
I agree, documents may be gigantic or very small, with heavy text analysis or simple strings ... so it's not possible to give an evaluation here. But you could make use of the nightly benchmark to give you an idea of Lucene indexing speed (the engine inside Apache Solr) : http://home.apache.org/~

Re:Interpreting Solr indexing times

2021-01-10 Thread xiefengchang
it's hard to answer your question without your solrconfig.xml, managed-schema(or schema.xml), and good to have some log snippet as well~ At 2021-01-07 21:28:00, "ufuk yılmaz" wrote: >Hello all, > >I have been looking at our SolrCloud indexing performance statistics and >trying t

Interpreting Solr indexing times

2021-01-07 Thread ufuk yılmaz
Hello all, I have been looking at our SolrCloud indexing performance statistics and trying to make sense of the numbers. We are using a custom Flume sink and sending updates to Solr (8.4) using SolrJ. I know this stuff depends on a lot of things but can you tell me if these statistics are horr

Re: SOLR indexing takes longer time

2020-08-18 Thread Walter Underwood
Instead of writing code, I’d fire up SQL Workbench/J, load the same JDBC driver that is being used in Solr, and run the query. https://www.sql-workbench.eu If that takes 3.5 hours, you have isolated the problem. wunder Walter Underwood wun...@wunderwood.org http:/

Re: SOLR indexing takes longer time

2020-08-18 Thread David Hastings
Another thing to mention is to make sure the indexer you build doesn't send commits until it's actually done. Made that mistake with some early in-house indexers. On Tue, Aug 18, 2020 at 9:38 AM Charlie Hull wrote: > 1. You could write some code to pull the items out of Mongo and dump > them to d
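
The advice about holding commits until the indexer is done can be sketched roughly as follows. This is only an illustration, not anyone's actual indexer: `send_batch` and `commit` are hypothetical callables standing in for whatever client calls a real indexer makes, and the batch size of 500 is an arbitrary assumption.

```python
from typing import Callable, Iterable, Iterator, List


def batched(docs: Iterable[dict], size: int) -> Iterator[List[dict]]:
    """Yield lists of at most `size` documents."""
    batch: List[dict] = []
    for doc in docs:
        batch.append(doc)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:
        yield batch


def index_all(
    docs: Iterable[dict],
    send_batch: Callable[[List[dict]], None],  # hypothetical: posts one batch, no commit
    commit: Callable[[], None],                # hypothetical: issues a single commit
    batch_size: int = 500,                     # assumed value, tune for your data
) -> int:
    """Send every batch without committing, then commit exactly once."""
    sent = 0
    for batch in batched(docs, batch_size):
        send_batch(batch)
        sent += len(batch)
    commit()  # the only commit, issued after the last batch
    return sent
```

With this shape, the per-batch requests never trigger a commit themselves; whether server-side autocommit settings also apply depends on the Solr configuration.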

Re: SOLR indexing takes longer time

2020-08-18 Thread Charlie Hull
1. You could write some code to pull the items out of Mongo and dump them to disk - if this is still slow, then it's Mongo that's the problem. 2. Write a standalone indexer to replace DIH, it's single threaded and deprecated anyway. 3. Minor point - consider whether you need to index everything e

Re: SOLR indexing takes longer time

2020-08-17 Thread Aroop Ganguly
Adding on to what others have said, indexing speed in general is largely affected by the parallelism and isolation you can give to each node. Is there a reason why you cannot have more than 1 shard? If you have 5 node cluster, why not have 5 shards, maxshardspernode=1 replica=1 is ok. You should

Re: SOLR indexing takes longer time

2020-08-17 Thread Shawn Heisey
On 8/17/2020 12:22 PM, Abhijit Pawar wrote: We are indexing some 200K plus documents in SOLR 5.4.1 with no shards / replicas and just single core. It takes almost 3.5 hours to index that data. I am using a data import handler to import data from the mongo database. Is there something we can do t

Re: SOLR indexing takes longer time

2020-08-17 Thread Walter Underwood
while you are indexing. If it is under 50%, the bottleneck is MongoDB and single-threaded indexing. For another check, run that same query in a regular database client and time it. The Solr indexing will never be faster than that. wunder Walter Underwood wun...@wunderwood.org http

Re: SOLR indexing takes longer time

2020-08-17 Thread Abhijit Pawar
Sure Divye, *Here's the config.* *conf/solr-config.xml:* /home/ec2-user/solr/solr-5.4.1/server/solr/test_core/conf/dataimport/data-source-config.xml *schema.xml:* has of all the field definitions *conf/dataimport/data-source-config.xml* . . . 4-5 more nested entities..

Re: SOLR indexing takes longer time

2020-08-17 Thread Jörn Franke
The DIH is single threaded and deprecated. Your best bet is to have a script/program extracting data from MongoDB and write them to Solr in Batches using multiple threads. You will see a significant higher performance for your data. > Am 17.08.2020 um 20:23 schrieb Abhijit Pawar : > > Hello,
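
A rough sketch of the "batches using multiple threads" idea, under stated assumptions: `send_batch` is a hypothetical function that posts one batch to Solr and returns how many documents it sent, and the batch size and worker count are placeholder values rather than recommendations from the thread.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence


def index_in_parallel(
    docs: Sequence[dict],
    send_batch: Callable[[Sequence[dict]], int],  # hypothetical sender, returns docs sent
    batch_size: int = 1000,                       # assumed value
    workers: int = 4,                             # assumed value
) -> int:
    """Split docs into batches and send them from a small thread pool."""
    batches: List[Sequence[dict]] = [
        docs[i:i + batch_size] for i in range(0, len(docs), batch_size)
    ]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves batch order in its results and re-raises
        # the first exception raised by any sender call.
        counts = list(pool.map(send_batch, batches))
    return sum(counts)
```

Extraction from the source database would still need to keep up with the senders; this sketch only covers the Solr side.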

Re: SOLR indexing takes longer time

2020-08-17 Thread Divye Handa
Can you share the dih configuration you are using for same? On Mon, 17 Aug, 2020, 23:52 Abhijit Pawar, wrote: > Hello, > > We are indexing some 200K plus documents in SOLR 5.4.1 with no shards / > replicas and just single core. > It takes almost 3.5 hours to index that data. > I am using a data

SOLR indexing takes longer time

2020-08-17 Thread Abhijit Pawar
Hello, We are indexing some 200K plus documents in SOLR 5.4.1 with no shards / replicas and just single core. It takes almost 3.5 hours to index that data. I am using a data import handler to import data from the mongo database. Is there something we can do to reduce the time taken to index? Will

Re: Solr indexing with Tika DIH - ZeroByteFileException

2020-04-23 Thread Charlie Hull
If users can upload any PDF, including broken or huge ones, and some cause a Tika error, you should decouple Tika from Solr and run it as a separate process to extract text before indexing with Solr. Otherwise some of what is uploaded *will* break Solr. https://lucidworks.com/post/indexing-with

Re: Solr indexing with Tika DIH - ZeroByteFileException

2020-04-22 Thread ravi kumar amaravadi
Hi, I am also facing the same issue. Does anyone have any update/solution on how to fix this issue as part of DIH? Thanks. Regards, Ravi kumar

Re: Solr indexing performance

2019-12-05 Thread Shawn Heisey
On 12/5/2019 10:42 PM, Paras Lehana wrote: Can ulimit settings impact this? Review once. If the OS limits prevent Solr from opening a file or starting a thread, it is far more likely

Re: Solr indexing performance

2019-12-05 Thread Paras Lehana
Can ulimit settings impact this? Review once. On Thu, 5 Dec 2019 at 23:31, Shawn Heisey wrote: > On 12/5/2019 10:28 AM, Rahul Goswami wrote: > > We have a Solr 7.2.1 Solr Cloud setup w

Re: Solr indexing performance

2019-12-05 Thread Shawn Heisey
On 12/5/2019 10:28 AM, Rahul Goswami wrote: We have a Solr 7.2.1 Solr Cloud setup where the client is indexing in 5 parallel threads with 5000 docs per batch. This is a test setup and all documents are indexed on the same node. We are seeing connection timeout issues thereafter some time into ind

Re: Solr indexing performance

2019-12-05 Thread Vincenzo D'Amore
Hi, the clients are reusing their SolrClient? Ciao, Vincenzo -- mobile: 3498513251 skype: free.dev > On 5 Dec 2019, at 18:28, Rahul Goswami wrote: > > Hello, > > We have a Solr 7.2.1 Solr Cloud setup where the client is indexing in 5 > parallel threads with 5000 docs per batch. This is a te

Solr indexing performance

2019-12-05 Thread Rahul Goswami
Hello, We have a Solr 7.2.1 Solr Cloud setup where the client is indexing in 5 parallel threads with 5000 docs per batch. This is a test setup and all documents are indexed on the same node. We are seeing connection timeout issues thereafter some time into indexing. I am yet to analyze GC pauses a

Re: Solr indexing for unstructured data

2019-08-22 Thread Alexandre Rafalovitch
In Admin UI, there is schema browsing screen: https://lucene.apache.org/solr/guide/8_1/schema-browser-screen.html That shows you all the fields you have, their configuration and their (tokenized) indexed content. This seems to be a good midpoint between indexing and querying. So, I would check whe

Solr indexing for unstructured data

2019-08-22 Thread amrit pattnaik
Hi , I am a newbie in Solr. I have a scenario wherein the pdf documents with unstructured data have been parsed as text and kept in a separate directory. Now once I build a collection and do indexing using "bin/post -c collection name document name", the document gets indexed and I am able to retr

Solr indexing with Tika DIH - ZeroByteFileException

2019-06-11 Thread neilb
Hi, while going through solr logs, I found data import error for certain documents. Here are details about the error. Exception while processing: file document : null:org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to read content Processing Document # 7866 at org.apa

Re: Solr indexing with Tika DIH local vs network share

2019-04-04 Thread neilb
Thank you Erick, this is very helpful!

Re: Solr indexing with Tika DIH local vs network share

2019-03-29 Thread Erick Erickson
So just try adding the autocommit and autosoftcommit settings. All of the example configs have these entries and you can copy/paste/change > On Mar 29, 2019, at 10:35 AM, neilb wrote: > > Hi Erick, I am using solrconfig.xml from samples only and has very few > entries. I have attached my config

Re: Solr indexing with Tika DIH local vs network share

2019-03-29 Thread neilb
Hi Erick, I am using solrconfig.xml from samples only and has very few entries. I have attached my config files for review along with reply. Thanks solrconfig.xml tika-data-config.xml

Re: Solr indexing with Tika DIH local vs network share

2019-03-29 Thread Erick Erickson
I suspect that your autocommit settings in solrconfig.xml are something like: hard commit has openSearcher set to “false”; soft commit has the interval set to -1 (never). That means that until an external commit is executed, you won’t see any documents. Try setting your soft commit to somet
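
One possible way to change the soft commit interval without editing solrconfig.xml by hand is Solr's Config API. The sketch below only builds the JSON payload. The property name `updateHandler.autoSoftCommit.maxTime` and the `set-property` command are my understanding of that API and should be checked against the reference guide for your Solr version, and the 15000 ms value in the usage note is an arbitrary assumption, not the value suggested in the message above.

```python
import json


def soft_commit_payload(max_time_ms: int) -> str:
    """Build a Config API body that sets the soft commit interval.

    Assumption: the target Solr version accepts `set-property` with
    `updateHandler.autoSoftCommit.maxTime`; verify before relying on it.
    """
    return json.dumps(
        {"set-property": {"updateHandler.autoSoftCommit.maxTime": max_time_ms}}
    )
```

The resulting string, for example `soft_commit_payload(15000)`, would be POSTed with a JSON content type to the collection's `/config` endpoint; the equivalent change in solrconfig.xml is the `maxTime` element inside `autoSoftCommit`.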

Re: Solr indexing with Tika DIH local vs network share

2019-03-29 Thread neilb
Hi Erick, thanks a lot for your suggestions. I will look into it. But to answer my own query, I was little impatient and checking indexing status after every minute. What I found is after few hours, status started updating with document count and finished the indexing process in around 5Hrs. Do you

Re: Solr indexing with Tika DIH local vs network share

2019-03-26 Thread Erick Erickson
Not quite an answer to your specific question, but… There are a number of reasons why it’s better to run your Tika process outside of Solr and DIH. Here’s the long form: https://lucidworks.com/2012/02/14/indexing-with-solrj/ Ignore the RDBMS parts. It’s somewhat old, but should be adaptable easily.

Solr indexing with Tika DIH local vs network share

2019-03-26 Thread neilb
Hi, I am trying to setup Solr for our project which can return full text searches on PDF documents. I am able to run the sample Tika DIH example locally on my windows server machine. It can index all PDF documents recursively in "baseDir" of config xml. Presently "baseDir" points to local folder o

Re: Docker and Solr Indexing

2019-02-12 Thread solrnoobie
Oh ok, then that must not be the culprit. I got these logs from our application server but I'm not sure if this is useful: Caused by: org.apache.solr.client.solrj.SolrServerException: org.apache.http.ParseException: Invalid content type: at org.apache.solr.client.solrj.impl.LBHttpSolrS

Re: Docker and Solr Indexing

2019-02-12 Thread Shawn Heisey
On 2/12/2019 6:56 AM, solrnoobie wrote: I know this is too late of a reply but I found this on our solr.log java.nio.file.NoSuchFileException: USUALLY, this is a harmless annoyance, not an indication of an actual problem. Some people have indicated that it causes problems when using the bac

Re: Docker and Solr Indexing

2019-02-12 Thread solrnoobie
I know this is too late of a reply but I found this on our solr.log java.nio.file.NoSuchFileException: /opt/solr/server/solr/primaryCollectionPERF_shard1_replica9/data/index/segments_78 at java.base/sun.nio.fs.UnixException.translateToIOException(UnixException.java:92) at java.base

Solr indexing raises error while posting PDF

2019-01-23 Thread sonam mittal
I am using Solr-6.6.4 version and Ubuntu 16 version.I have created a collection in Solr using the configuration files of the Solr example *techproducts*. I am trying to post a PDF in Solr but it is raising some errors.I have also installed the apache tika through maven but still it is showing the f

Re: Making Solr Indexing Errors Visible

2018-09-30 Thread Jason Gerlowski
Hi Also worth mentioning that bin/post only handles certain file extensions, and AFAIR it doesn't mention specifically when it skips over a file because of the extension. You mentioned you're trying to index Word docs and pdf's. Are there any other formats in the directory that might be messing u

Re: Making Solr Indexing Errors Visible

2018-09-27 Thread Shawn Heisey
On 9/26/2018 2:39 PM, Terry Steichen wrote: Let me try to clarify a bit - I'm just using bin/post to index the files in a directory.  That indexing process produces a lengthy screen display of files that were indexed.  (I realize this isn't production-quality, but I'm not ready for production jus

Re: Making Solr Indexing Errors Visible

2018-09-26 Thread Shawn Heisey
On 9/26/2018 2:39 PM, Terry Steichen wrote: To the best of my knowledge, I'm not using SolrJ at all.  Just Solr-out-of-the-box.  In this case, if I understand you below, it "should indicate an error status" I think you'd know if you were using SolrJ directly.  You'd have written the indexing p

Re: Making Solr Indexing Errors Visible

2018-09-26 Thread Terry Steichen
Alex, Please look at my embedded responses to your questions. Terry On 09/26/2018 04:57 PM, Alexandre Rafalovitch wrote: > The challenge here is to figure out exactly what you are doing, > because the original description could have been 10 different things. > > So: > 1) You are using bin/post

Re: Making Solr Indexing Errors Visible

2018-09-26 Thread Alexandre Rafalovitch
The challenge here is to figure out exactly what you are doing, because the original description could have been 10 different things. So: 1) You are using bin/post command (we just found this out) 2) You are indexing a bunch of files (what format? all same or different?) 3) You are indexing them i

Re: Making Solr Indexing Errors Visible

2018-09-26 Thread Terry Steichen
Shawn, To the best of my knowledge, I'm not using SolrJ at all.  Just Solr-out-of-the-box.  In this case, if I understand you below, it "should indicate an error status"  But it doesn't. Let me try to clarify a bit - I'm just using bin/post to index the files in a directory.  That indexing proce

Re: Making Solr Indexing Errors Visible

2018-09-26 Thread Shawn Heisey
On 9/26/2018 1:23 PM, Terry Steichen wrote: I'm pretty sure this was covered earlier.  But I can't find references to it.  The question is how to make indexing errors clear and obvious. If there's an indexing error and you're NOT using the concurrent client in SolrJ, the response that Solr ret
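
For a client that talks to Solr over plain HTTP, checking the error status mentioned above might look like the sketch below. It assumes the JSON response format, in which a successful update carries `responseHeader.status` equal to 0 and a failed one carries an `error` object. The function name is hypothetical and the sample bodies in the usage note are illustrative, not captured from a real server.

```python
import json


def update_failed(http_status: int, body: str) -> bool:
    """Return True if a Solr update response looks like a failure.

    Assumes a JSON response (wt=json); anything unparseable is treated
    as a failure so that problems are not silently skipped.
    """
    if http_status != 200:
        return True
    try:
        payload = json.loads(body)
    except ValueError:
        return True
    if not isinstance(payload, dict) or "error" in payload:
        return True
    return payload.get("responseHeader", {}).get("status", 0) != 0
```

An indexing script could log the file name whenever `update_failed(...)` returns True, for instance for a body such as `{"responseHeader": {"status": 400}, "error": {"msg": "...", "code": 400}}`, which would make the missing documents identifiable.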

Making Solr Indexing Errors Visible

2018-09-26 Thread Terry Steichen
I'm pretty sure this was covered earlier.  But I can't find references to it.  The question is how to make indexing errors clear and obvious.  (I find that there are maybe 10% more files in a directory than end up in the index.  I presume they were indexing errors, but I have no idea which ones or

Re: Docker and Solr Indexing

2018-09-12 Thread Shawn Heisey
On 9/12/2018 7:43 AM, Dominique Bejean wrote: Are you aware about issues in Java applications in Docker if java version is not 10 ? https://blog.docker.com/2018/04/improved-docker-container-integration-with-java-10/ Solr explicitly sets heap size when it starts, so Java is *NOT* determining th

Re: Docker and Solr Indexing

2018-09-12 Thread Dominique Bejean
Hi, Are you aware about issues in Java applications in Docker if java version is not 10 ? https://blog.docker.com/2018/04/improved-docker-container-integration-with-java-10/ Regards. Dominique Le mer. 12 sept. 2018 à 05:42, Shawn Heisey a écrit : > On 9/11/2018 9:20 PM, solrnoobie wrote: > >

Re: Docker and Solr Indexing

2018-09-11 Thread Shawn Heisey
On 9/11/2018 9:20 PM, solrnoobie wrote: So what we did is we upgraded the instances to 16 gigs and we rarely encounter this now. So what we did was to increase the batch size to 500 instead of 50 and it worked for our test data. But when we tried 1000 batch size, the invalid content type error r

Re: Docker and Solr Indexing

2018-09-11 Thread solrnoobie
Thank you all for the kind and timely reply. So what we did is we upgraded the instances to 16 gigs and we rarely encounter this now. So what we did was to increase the batch size to 500 instead of 50 and it worked for our test data. But when we tried 1000 batch size, the invalid content type err

Re: Docker and Solr Indexing

2018-09-11 Thread Jan Høydahl
You have not shed any light on what the reason for the container restart was, and there is too little information about your setup and Solr usage to guess what goes on. Whether 4Gb is sufficient or not depends on how much data and queries you plan for each shard to handle, how much heap you give

Re: Docker and Solr Indexing

2018-09-11 Thread Walter Underwood
4 Gb is very small for Solr. Solr is not designed for Dockerized, fail-often use. We use a LOT of Docker ECS, but all of our Solr servers are on EC2 instances. That’s about sixty instances in several clusters. We run an 8 Gb heap for all our Solr instances. Instances in our biggest cluster (in t

Docker and Solr Indexing

2018-09-10 Thread solrnoobie
So we have a dockerized aws environment with the solr docker container having only 4 gigs for max ram. Our problem is whenever we index, the container containing the leader shard will restart after around 2 or less minutes of index time (batch is 50 docs per batch with 3 threads in our app thread

Re: Solr indexing Duplicate URL's ending with /

2018-08-29 Thread Jan Høydahl
Hi, You would have to direct this question to the crawler you are using, since it is the crawler that decides the document ID to send to Solr. Most crawlers will have configuration options to normalize the URL for each document. However you could also try to clean the URL after it arrives in SO
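
If the crawler cannot be configured, one option is to normalize the URL in client code before it is used as the document ID. The sketch below is an assumption about what "normalize" could mean here (dropping trailing slashes and lowercasing the host); it is not a description of what any particular crawler does.

```python
from urllib.parse import urlsplit, urlunsplit


def normalize_url(url: str) -> str:
    """Strip trailing slashes from the path and lowercase the host."""
    parts = urlsplit(url)
    # Keep a bare "/" for the site root so the URL stays valid.
    path = parts.path.rstrip("/") or "/"
    return urlunsplit(
        (parts.scheme, parts.netloc.lower(), path, parts.query, parts.fragment)
    )
```

With this, `https://www.abc.com/2018/test.html` and `https://www.abc.com/2018/test.html/` map to the same ID, so the second document would overwrite the first instead of being added as a duplicate.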

Solr indexing Duplicate URL's ending with /

2018-08-29 Thread kunhu0...@gmail.com
Team, Need suggestion on how to remove the duplicate entries while indexing to Solr. Below are the sample entries i see in solr collection while i need to remove the one which is ending with / https://www.abc.com/2018/test.html https://www.abc.com/2018/test.html/ Thank you -- Sent from: http

Re: Solr Indexing error

2018-08-28 Thread Shawn Heisey
On 8/28/2018 6:03 AM, kunhu0...@gmail.com wrote: possible analysis error: Document contains at least one immense term in field="content" (whose UTF8 encoding is longer than the max length 32766), It's telling you exactly what is wrong. The field named "content" is probably using a field class
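
The limit in the error message is measured in UTF-8 bytes, not characters. A small check like the one below, run on the client before sending a document, could flag values that would hit the limit if the field is indexed as a single term. The constant comes from the error message; the function name is hypothetical.

```python
MAX_TERM_BYTES = 32766  # limit quoted in the "immense term" error message


def exceeds_term_limit(value: str) -> bool:
    """True if the value's UTF-8 encoding is longer than the term limit."""
    return len(value.encode("utf-8")) > MAX_TERM_BYTES
```

Note that non-ASCII text reaches the limit sooner: a character such as "é" takes two bytes in UTF-8.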

Solr Indexing error

2018-08-28 Thread kunhu0...@gmail.com
Hello All, Need help on the error related to Solr indexing. We are using Solr 6.6.3 and Nutch crawler 1.14. While indexing data to Solr we see errors as below possible analysis error: Document contains at least one immense term in field="content" (whose UTF8 encoding is longer th

Keep Solr Indexing live

2017-12-20 Thread shashiroushan
Hello All, I am using DIH to import data from SQL to Solr using Url "/dataimport?command=full-import&clean=true". My problem is, When SQL query return zero record then Solr also return zero records. But as per my project requirement, Solr indexing should be clean only when SQ

Re: Urgent - Solr indexing is taking hours and dashboard page is not getting rendered at all :(

2017-03-09 Thread Shawn Heisey
to the list. They almost never do. If you need to share something that's too big to include as regular text in your email, store it in a semi-permanent place on the public Internet and provide a URL to access it. Remember that few people here know anything about hybris. Information from th

Re: Urgent - Solr indexing is taking hours and dashboard page is not getting rendered at all :(

2017-03-09 Thread Charlie Hull
On 09/03/2017 15:16, Gaurav Srivastava wrote: Hi All, I have a eCommerce site built on Hybris 6.2.0.4 which uses SOLR OOB (vendor=hybris version=6.2.0.2) as a search engine. I am facing below 2 problems : 1. Indexing is taking lot of time(4-5 hours) in last couple of weeks. (data has increased

Urgent - Solr indexing is taking hours and dashboard page is not getting rendered at all :(

2017-03-09 Thread Gaurav Srivastava
Hi All, I have an eCommerce site built on Hybris 6.2.0.4 which uses SOLR OOB (vendor=hybris version=6.2.0.2) as a search engine. I am facing the 2 problems below: 1. Indexing has been taking a lot of time (4-5 hours) in the last couple of weeks. (data has increased though) 2. Our dashboard page is getting hung

Re: How to know if SOLR indexing is completed prorammatically

2016-09-30 Thread subinalex
Thanks a lot christian.. let me explore that.. :) -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-know-if-SOLR-indexing-is-completed-prorammatically-tp4298799p4298807.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: How to know if SOLR indexing is completed prorammatically

2016-09-30 Thread Christian Ortner
t programmatically and based on this trigger the > next solr indexing batch job. > > > Please help with this. > > > :) > > > > -- > View this message in context: http://lucene.472066.n3. > nabble.com/How-to-know-if-SOLR-indexing-is-completed- > prorammatically-tp4298799.html > Sent from the Solr - User mailing list archive at Nabble.com. >

How to know if SOLR indexing is completed prorammatically

2016-09-30 Thread subinalex
Hi Guys, We are running back to back solr indexing batch jobs.We need to ensure if the triggered batch indexing is completed before starting the next. I know we can check the status by viewing the 'Logging' and 'CoreAdmin' page of solr admin console. But,we n

Re: Solr indexing sequentially or randomly?

2016-06-14 Thread Zheng Lin Edwin Yeo
Thank you. On 14 June 2016 at 20:03, Mikhail Khludnev wrote: > Sequentially. > > On Tue, Jun 14, 2016 at 12:32 PM, Zheng Lin Edwin Yeo < > edwinye...@gmail.com> > wrote: > > > Hi, > > > > i would like to find out, does Solr writes to the disk sequentially or > > randomly during indexing? > > I'm

Re: Solr indexing sequentially or randomly?

2016-06-14 Thread Mikhail Khludnev
Sequentially. On Tue, Jun 14, 2016 at 12:32 PM, Zheng Lin Edwin Yeo wrote: > Hi, > > i would like to find out, does Solr writes to the disk sequentially or > randomly during indexing? > I'm using Solr 6.0.1. > > Regards, > Edwin > -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid

Solr indexing sequentially or randomly?

2016-06-14 Thread Zheng Lin Edwin Yeo
Hi, i would like to find out, does Solr writes to the disk sequentially or randomly during indexing? I'm using Solr 6.0.1. Regards, Edwin

Re: solr Indexing PDF attachments not working. in ubuntu

2016-01-23 Thread Binoy Dalal
Do you see any exceptions in the solr log? On Sat, 23 Jan 2016, 16:29 Moncif Aidi wrote: > HI, > > I have a problem with integrating solr in Ubuntu server.Before using solr > on ubuntu server i tested it on my mac it was working perfectly. it indexed > my PDF,Doc,Docx documents.so after installi

solr Indexing PDF attachments not working. in ubuntu

2016-01-23 Thread Moncif Aidi
Hi, I have a problem with integrating Solr in Ubuntu server. Before using Solr on Ubuntu server I tested it on my Mac and it was working perfectly. It indexed my PDF, Doc, Docx documents. So after installing Solr on Ubuntu server and using the same configuration files and libraries, I've found out that s

Re: Problem with Solr indexing "non-searchable" pdf files

2015-12-17 Thread Erick Erickson
Not sure how much help I can be, I have no clue what DSpace is doing with Solr. If you're willing to try to index straight to Solr, you can always use SolrJ to parse the files, it's actually not very hard. Here's an example: https://lucidworks.com/blog/2012/02/14/indexing-with-solrj/ some databas

Problem with Solr indexing "non-searchable" pdf files

2015-12-17 Thread RICARDO EITO BRUN
Hi, I am using SOLR as part of the dspace 5.4 SW application. I have a problem when running the dspace indexing command (index-discovery). Most of the files are not being added to the index, and an exception is raised. It seems that Solr does not process the PDF files that are result of scanning w

Re: solr indexing warning

2015-11-20 Thread Shawn Heisey
On 11/20/2015 12:33 AM, Midas A wrote: > As we are this server as a master server there are no queries running on > it . in that case should i remove these configuration from config file . The following cache info says that there ARE queries being run on this server: > QueryResultCache: > > lo

Re: solr indexing warning

2015-11-20 Thread Emir Arnautovic
Hi, Since this is master node, and not expected to have queries, you can disable caches completely. However, from numbers cache autowarm is not an issue here but probably frequency of commits and/or warmup queries. How do you do commits? Since master-slave, I don't see reason to have them too

Re: solr indexing warning

2015-11-19 Thread Midas A
thanks Shawn, As we are this server as a master server there are no queries running on it . in that case should i remove these configuration from config file . Total Docs: 40 0 Stats # Document cache : lookups:823 hits:4 hitratio:0.00 inserts:820 evictions:0 size:820 warmupTime:0 cumulati

Re: solr indexing warning

2015-11-19 Thread Shawn Heisey
On 11/19/2015 11:06 PM, Midas A wrote: > autowarmCount="1000"/> size="1000" initialSize="1000" autowarmCount="1000"/> ="1000" autowarmCount="1000"/> Your caches are quite large. More importantly, your autowarmCount is very large. How many documents are in each of your cores? If you check t

Re: solr indexing warning

2015-11-19 Thread Midas A
Thanks Emir , So what we need to do to resolve this issue . This is my solr configuration. what changes should i do to avoid the warning . ~abhishek On Thu, Nov 19, 2015 at 6:37 PM, Emir Arnautovic < emir.arnauto...@sematext.com> wrote: > This means that one searcher is still warming

Re: solr indexing warning

2015-11-19 Thread Emir Arnautovic
This means that one searcher is still warming when another searcher is created due to a commit with openSearcher=true. This can be due to frequent commits or searcher warmup taking too long. Emir -- Monitoring * Alerting * Anomaly Detection * Centralized Log Management Solr & Elasticsearch Support *

solr indexing warning

2015-11-19 Thread Midas A
Getting following log on solr PERFORMANCE WARNING: Overlapping onDeckSearchers=2

Re: Problem with the Content Field during Solr Indexing

2015-11-02 Thread Susheel Kumar
Hi Shruti, If you are looking to index images to make them searchable (Image Search) then you will have to look at LIRE (Lucene Image Retrieval) http://www.lire-project.net/ and can follow Lire Solr Plugin at this site https://bitbucket.org/dermotte/liresolr. Thanks, Susheel On Sat, Oct 31, 201

Re: Problem with the Content Field during Solr Indexing

2015-10-31 Thread Zheng Lin Edwin Yeo
Hi Shruti, From what I understand, the /update/extract handler is for indexing rich-text documents, and does not support ".png" files. It only supports the following files format: pdf, doc, docx, ppt, pptx, xls, xlsx, odt, odp, ods, ott, otp, ots, rtf, htm, html, txt, log If you use the default

Re: Problem with the Content Field during Solr Indexing

2015-10-30 Thread Shruti Mundra
Hi Edwin, The file extension of the image file is ".png" and we are following this url for indexing: " http://blog.thedigitalgroup.com/vijaym/wp-content/uploads/sites/11/2015/07/SolrImageExtract.png " Thanks and Regards, Shruti Mundra On Thu, Oct 29, 2015 at 8:33 PM, Zheng Lin Edwin Yeo wrote:

Re: Problem with the Content Field during Solr Indexing

2015-10-29 Thread Zheng Lin Edwin Yeo
The "\n" actually means new line as decoded by Solr from the indexed document. What is your file extension of your image file, and which method are you using to do the indexing? Regards, Edwin On 30 October 2015 at 04:38, Shruti Mundra wrote: > Hi, > > When I'm trying index an image file dire

Problem with the Content Field during Solr Indexing

2015-10-29 Thread Shruti Mundra
Hi, When I'm trying index an image file directly to Solr, the attribute content, consists of trails of "\n"s and not the data. We are successful in getting the metadata for that image. Can anyone help us out on how we could get the content along with the Metadata. Thanks! - Shruti Mundra

Re: Solr indexing based on last_modified

2015-08-17 Thread Erick Erickson
cheduler which will call solr in scheduled > time interval. Any updates to the file system must be indexed by solr. Only > changes must be re-indexed as file system is huge and cannot be re-indexed > every time. > > > > -- > View this message in context: > http://lucene

Re: Solr indexing based on last_modified

2015-08-17 Thread coolmals
.nabble.com/Solr-indexing-based-on-last-modified-tp4223506p4223511.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr indexing based on last_modified

2015-08-17 Thread Erick Erickson
; -- > View this message in context: > http://lucene.472066.n3.nabble.com/Solr-indexing-based-on-last-modified-tp4223506.html > Sent from the Solr - User mailing list archive at Nabble.com.

Solr indexing based on last_modified

2015-08-17 Thread coolmals
View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexing-based-on-last-modified-tp4223506.html Sent from the Solr - User mailing list archive at Nabble.com.

RE: Optimizing Solr indexing over WAN

2015-07-22 Thread Markus Jelsma
nesday 22nd July 2015 17:43 > To: solr-user@lucene.apache.org > Subject: RE: Optimizing Solr indexing over WAN > > Indexing over a WAN will be slow, limited by the bandwidth of the pipe. > > I think you will be better served to move the data in bulk to the same LAN as > your

RE: Optimizing Solr indexing over WAN

2015-07-22 Thread Reitzel, Charles
ll of this complexity may buy you a few % improvement in indexing speed. Probably not worth the development cost ... -Original Message- From: Ali Nazemian [mailto:alinazem...@gmail.com] Sent: Wednesday, July 22, 2015 2:21 AM To: solr-user@lucene.apache.org Subject: Optimizing Solr indexin

Optimizing Solr indexing over WAN

2015-07-21 Thread Ali Nazemian
Dears, Hi, I know that there are lots of tips about how to make the Solr indexing faster. Probably some of the most important ones which are considered in client side are choosing batch indexing and multi-thread indexing. There are other important factors that are server side which I dont want to

Re: lucene vs Solr Indexing on Sample data

2015-06-15 Thread Erick Erickson
Basically I expect you're falling afoul of a very common misunderstanding; It's not that Solr is slower, it's that the client isn't feeding Solr as fast as it should. If you profile your Solr server, my suspicion is that you're not driving it very hard. You'll probably see 4 spikes in CPU activity

Re: lucene vs Solr Indexing on Sample data

2015-06-15 Thread Alessandro Benedetti
Actually I can see a problem in your question… Lucene and Solr are not competitor technologies. Solr is a Search Server that internally uses the Lucene library and offers easy to use configuration and REST API. Lucene is a library that implements tons of search algorithms and features. You can see

lucene vs Solr Indexing on Sample data

2015-06-15 Thread Argho Chatterjee
Hello Everyone, I had posted a question on stackoverflow.com after performing a few POCs. My hardware consists of a single i-3 intel processor (4 CPU as per "dxdiag" on run ), 8GB Ram, Laptop machine. My Question Link : http://stackoverflow.com/questions/30823314/lucene-vs-solr-indexning-speed-for

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-04 Thread Swaraj Kumar
I am not sure but the following regex have worked for me in JAVA. Kindly check with your's one. ([^\x01])\x01([^\x01])\x01..([^\x01])$ Thanks, Swaraj

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-04 Thread Toke Eskildsen
avinash09 wrote: > Thanks Toke , nice explanation , i have one more concern instead of comma > separated my columns are ^A separated how to deal ^A ?? I am really not proficient with control characters and regexp. If ^A is Start Of Heading, which has ASCII & unicode character 1, my guess is that

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-04 Thread avinash09
Thanks Toke , nice explanation , i have one more concern instead of comma separated my columns are ^A separated how to deal ^A ?? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-04 Thread Swaraj Kumar
I have used the following and it works very fast in DIH solr-5.0 You can try this for getting groupNames from regex. Regards, Swaraj Kumar Senior Software Engineer I MakeMyTrip.com +91-9811774497

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-03 Thread Toke Eskildsen
avinash09 wrote: > regex="^(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*), > (.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*)$" A better solution seems to have been presented, but for the record I would like to note that the regexp above is quite an ef
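
One common reason a pattern built from 28 greedy `(.*)` groups is slow is backtracking: each group first grabs the rest of the line and then gives characters back until the commas line up. Whether that is exactly the point made in the truncated message above is an assumption on my part. The sketch below shows an alternative using a negated character class, written in Python for illustration even though DIH itself uses Java regular expressions; the names are hypothetical.

```python
import re
from typing import List, Optional

COLUMNS = 28  # column count taken from the thread subject

# Each group may only match non-comma characters, so it can never run
# past a separator and the engine has almost nothing to backtrack over.
STRICT = re.compile("^" + ",".join(["([^,]*)"] * COLUMNS) + "$")


def parse_line(line: str) -> Optional[List[str]]:
    """Return the 28 fields of a comma-separated line, or None if it does not match."""
    match = STRICT.match(line)
    return list(match.groups()) if match else None
```

For simple input without quoted commas this gives the same result as `line.split(",")`, which is usually simpler still; real CSV with quoting needs a proper CSV parser.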

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread avinash09
Alex, finally it worked for me found ctrl A separator ==( separator=%01&escape=\) Thanks for your help -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4197143.html Sent
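
The `%01` in the working parameters is simply the URL-encoded form of the Ctrl-A character (code point 1). A small sketch of building that query string in client code, assuming the `separator` and `escape` parameter names shown in the message above; the function name is hypothetical:

```python
from urllib.parse import urlencode


def csv_update_query(separator: str = "\x01", escape: str = "\\") -> str:
    """Build the query string for a CSV upload with a Ctrl-A separator."""
    return urlencode({"separator": separator, "escape": escape})
```

Calling `csv_update_query()` produces `separator=%01&escape=%5C`; the backslash is percent-encoded as `%5C` here, which should be equivalent to passing it literally as in the message.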

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread Alexandre Rafalovitch
ile csv upload > > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4196998.html > Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread avinash09
thanks Erick and Alexandre Rafalovitch R one more doubt how to pass ctrl A(^A) separator while csv upload -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4196998.html Sent

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread Erick Erickson
question m confuse here what is difference between data import > handler and update csv > > > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4196940.h

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread avinash09
sir , a silly question m confuse here what is difference between data import handler and update csv -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-indexing-from-csv-file-having-28-cols-taking-lot-of-time-plz-help-i-m-new-to-solr-tp4196904p4196940.html Sent from the
