Re: SOLR indexing takes longer time

2020-08-18 Thread Walter Underwood
Instead of writing code, I’d fire up SQL Workbench/J, load the same JDBC driver that is being used in Solr, and run the query. https://www.sql-workbench.eu If that takes 3.5 hours, you have isolated the problem. wunder Walter Underwood wun...@wunderwood.org http:/

Re: SOLR indexing takes longer time

2020-08-18 Thread David Hastings
Another thing to mention is to make sure the indexer you build doesn't send commits until it's actually done. Made that mistake with some early in-house indexers. On Tue, Aug 18, 2020 at 9:38 AM Charlie Hull wrote: > 1. You could write some code to pull the items out of Mongo and dump > them to d

Re: SOLR indexing takes longer time

2020-08-18 Thread Charlie Hull
1. You could write some code to pull the items out of Mongo and dump them to disk - if this is still slow, then it's Mongo that's the problem. 2. Write a standalone indexer to replace DIH, it's single threaded and deprecated anyway. 3. Minor point - consider whether you need to index everything e

Re: SOLR indexing takes longer time

2020-08-17 Thread Aroop Ganguly
Adding on to what others have said, indexing speed in general is largely affected by the parallelism and isolation you can give to each node. Is there a reason why you cannot have more than 1 shard? If you have a 5-node cluster, why not have 5 shards? maxShardsPerNode=1, replicationFactor=1 is OK. You should

Re: SOLR indexing takes longer time

2020-08-17 Thread Shawn Heisey
On 8/17/2020 12:22 PM, Abhijit Pawar wrote: We are indexing some 200K plus documents in SOLR 5.4.1 with no shards / replicas and just single core. It takes almost 3.5 hours to index that data. I am using a data import handler to import data from the mongo database. Is there something we can do t

Re: SOLR indexing takes longer time

2020-08-17 Thread Walter Underwood
I’m seeing multiple red flags for performance here. The top ones are “DIH”, “MongoDB”, and “SQL on MongoDB”. MongoDB is not a relational database. Our multi-threaded extractor using the Mongo API was still three times slower than the same approach on MySQL. Check the CPU usage on the Solr hosts w

Re: SOLR indexing takes longer time

2020-08-17 Thread Abhijit Pawar
Sure Divye, *Here's the config.* *conf/solr-config.xml:* /home/ec2-user/solr/solr-5.4.1/server/solr/test_core/conf/dataimport/data-source-config.xml *schema.xml:* has all the field definitions *conf/dataimport/data-source-config.xml* . . . 4-5 more nested entities..

Re: SOLR indexing takes longer time

2020-08-17 Thread Jörn Franke
The DIH is single threaded and deprecated. Your best bet is to have a script/program extracting data from MongoDB and writing it to Solr in batches using multiple threads. You will see significantly higher performance for your data. > On 17.08.2020 at 20:23, Abhijit Pawar wrote: > > Hello,
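
For readers landing on this thread: a rough SolrJ sketch of what such a replacement indexer could look like. The core URL, batch size, thread count and the fetchNextBatch() Mongo reader are placeholders, not anything from the original posts.

    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;
    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class MongoToSolrIndexer {
        public static void main(String[] args) throws Exception {
            SolrClient solr = new HttpSolrClient.Builder("http://localhost:8983/solr/mycore").build();
            ExecutorService pool = Executors.newFixedThreadPool(4);  // tune to your hardware

            List<SolrInputDocument> batch;
            while ((batch = fetchNextBatch(1000)) != null) {         // hypothetical Mongo reader
                final List<SolrInputDocument> docs = batch;
                pool.submit(() -> {
                    try {
                        solr.add(docs);                              // send a batch, no commit here
                    } catch (Exception e) {
                        e.printStackTrace();                         // real code: log and retry
                    }
                });
            }
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
            solr.commit();                                           // single commit at the very end
            solr.close();
        }

        // Placeholder: read the next batch from MongoDB and map each document to a
        // SolrInputDocument; return null when the cursor is exhausted.
        private static List<SolrInputDocument> fetchNextBatch(int size) {
            return null;
        }
    }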

Re: SOLR indexing takes longer time

2020-08-17 Thread Divye Handa
Can you share the dih configuration you are using for same? On Mon, 17 Aug, 2020, 23:52 Abhijit Pawar, wrote: > Hello, > > We are indexing some 200K plus documents in SOLR 5.4.1 with no shards / > replicas and just single core. > It takes almost 3.5 hours to index that data. > I am using a data

Re: Solr indexing with Tika DIH - ZeroByteFileException

2020-04-23 Thread Charlie Hull
If users can upload any PDF, including broken or huge ones, and some cause a Tika error, you should decouple Tika from Solr and run it as a separate process to extract text before indexing with Solr. Otherwise some of what is uploaded *will* break Solr. https://lucidworks.com/post/indexing-with
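
For reference, a minimal sketch of running Tika outside Solr using its facade API (the file path is a placeholder); the extracted text can then be sent to Solr as an ordinary field.

    import java.io.File;
    import org.apache.tika.Tika;

    public class ExtractText {
        public static void main(String[] args) throws Exception {
            Tika tika = new Tika();
            // A broken or huge PDF can only crash this external process,
            // never the Solr JVM itself.
            String text = tika.parseToString(new File("/path/to/upload.pdf"));
            System.out.println(text);
        }
    }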

Re: Solr indexing with Tika DIH - ZeroByteFileException

2020-04-22 Thread ravi kumar amaravadi
Hi, I am also facing the same issue. Does anyone have any update/solution on how to fix this issue as part of DIH? Thanks. Regards, Ravi kumar

Re: Solr indexing performance

2019-12-05 Thread Shawn Heisey
On 12/5/2019 10:42 PM, Paras Lehana wrote: Can ulimit settings impact this? Review once. If the OS limits prevent Solr from opening a file or starting a thread, it is far more likely

Re: Solr indexing performance

2019-12-05 Thread Paras Lehana
Can ulimit settings impact this? Review once. On Thu, 5 Dec 2019 at 23:31, Shawn Heisey wrote: > On 12/5/2019 10:28 AM, Rahul Goswami wrote: > > We have a Solr 7.2.1 Solr Cloud setup w

Re: Solr indexing performance

2019-12-05 Thread Shawn Heisey
On 12/5/2019 10:28 AM, Rahul Goswami wrote: We have a Solr 7.2.1 Solr Cloud setup where the client is indexing in 5 parallel threads with 5000 docs per batch. This is a test setup and all documents are indexed on the same node. We are seeing connection timeout issues thereafter some time into ind

Re: Solr indexing performance

2019-12-05 Thread Vincenzo D'Amore
Hi, are the clients reusing their SolrClient? Ciao, Vincenzo > On 5 Dec 2019, at 18:28, Rahul Goswami wrote: > > Hello, > > We have a Solr 7.2.1 Solr Cloud setup where the client is indexing in 5 > parallel threads with 5000 docs per batch. This is a te

Re: Solr indexing for unstructured data

2019-08-22 Thread Alexandre Rafalovitch
In Admin UI, there is schema browsing screen: https://lucene.apache.org/solr/guide/8_1/schema-browser-screen.html That shows you all the fields you have, their configuration and their (tokenized) indexed content. This seems to be a good midpoint between indexing and querying. So, I would check whe

Re: Solr indexing with Tika DIH local vs network share

2019-04-04 Thread neilb
Thank you Erick, this is very helpful!

Re: Solr indexing with Tika DIH local vs network share

2019-03-29 Thread Erick Erickson
So just try adding the autoCommit and autoSoftCommit settings. All of the example configs have these entries and you can copy/paste/change. > On Mar 29, 2019, at 10:35 AM, neilb wrote: > > Hi Erick, I am using solrconfig.xml from samples only and has very few > entries. I have attached my config

Re: Solr indexing with Tika DIH local vs network share

2019-03-29 Thread neilb
Hi Erick, I am using the solrconfig.xml from the samples only and it has very few entries. I have attached my config files for review along with this reply. Thanks solrconfig.xml tika-data-config.xml

Re: Solr indexing with Tika DIH local vs network share

2019-03-29 Thread Erick Erickson
I suspect that your autocommit settings in solrconfig.xml are something like this: the hard commit has openSearcher set to “false” and the soft commit has the interval set to -1 (never). That means that until an external commit is executed, you won’t see any documents. Try setting your soft commit to somet
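
For context, the kind of solrconfig.xml settings being discussed look roughly like this (the intervals are illustrative only):

    <autoCommit>
      <maxTime>60000</maxTime>            <!-- hard commit every 60 s, flushes to disk -->
      <openSearcher>false</openSearcher>  <!-- does not make documents visible -->
    </autoCommit>
    <autoSoftCommit>
      <maxTime>120000</maxTime>           <!-- soft commit makes docs searchable; -1 disables -->
    </autoSoftCommit>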

Re: Solr indexing with Tika DIH local vs network share

2019-03-29 Thread neilb
Hi Erick, thanks a lot for your suggestions. I will look into it. But to answer my own query, I was a little impatient and checking indexing status every minute. What I found is that after a few hours the status started updating with the document count, and the indexing process finished in around 5 hrs. Do you

Re: Solr indexing with Tika DIH local vs network share

2019-03-26 Thread Erick Erickson
Not quite an answer to your specific question, but… There are a number of reasons why it’s better to run your Tika process outside of Solr and DIH. Here’s the long form: https://lucidworks.com/2012/02/14/indexing-with-solrj/ Ignore the RDBMS parts. It’s somewhat old, but should be adaptable easily.

Re: Solr indexing Duplicate URL's ending with /

2018-08-29 Thread Jan Høydahl
Hi, You would have to direct this question to the crawler you are using, since it is the crawler that decides the document ID to send to Solr. Most crawlers will have configuration options to normalize the URL for each document. However you could also try to clean the URL after it arrives in SO
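
One possible shape of that in-Solr cleanup, sketched as an update request processor chain (the chain name and the assumption that the URL is the uniqueKey field "id" are mine; normalizing in the crawler is usually the better fix):

    <updateRequestProcessorChain name="normalize-url" default="true">
      <processor class="solr.RegexReplaceProcessorFactory">
        <str name="fieldName">id</str>        <!-- assumes the URL is the uniqueKey -->
        <str name="pattern">/$</str>          <!-- strip a single trailing slash -->
        <str name="replacement"></str>
      </processor>
      <processor class="solr.LogUpdateProcessorFactory"/>
      <processor class="solr.RunUpdateProcessorFactory"/>
    </updateRequestProcessorChain>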

Re: Solr Indexing error

2018-08-28 Thread Shawn Heisey
On 8/28/2018 6:03 AM, kunhu0...@gmail.com wrote: possible analysis error: Document contains at least one immense term in field="content" (whose UTF8 encoding is longer than the max length 32766), It's telling you exactly what is wrong. The field named "content" is probably using a field class

Re: Solr indexing sequentially or randomly?

2016-06-14 Thread Zheng Lin Edwin Yeo
Thank you. On 14 June 2016 at 20:03, Mikhail Khludnev wrote: > Sequentially. > > On Tue, Jun 14, 2016 at 12:32 PM, Zheng Lin Edwin Yeo < > edwinye...@gmail.com> > wrote: > > > Hi, > > > > i would like to find out, does Solr writes to the disk sequentially or > > randomly during indexing? > > I'm

Re: Solr indexing sequentially or randomly?

2016-06-14 Thread Mikhail Khludnev
Sequentially. On Tue, Jun 14, 2016 at 12:32 PM, Zheng Lin Edwin Yeo wrote: > Hi, > > i would like to find out, does Solr writes to the disk sequentially or > randomly during indexing? > I'm using Solr 6.0.1. > > Regards, > Edwin > -- Sincerely yours Mikhail Khludnev Principal Engineer, Grid

Re: solr Indexing PDF attachments not working. in ubuntu

2016-01-23 Thread Binoy Dalal
Do you see any exceptions in the solr log? On Sat, 23 Jan 2016, 16:29 Moncif Aidi wrote: > HI, > > I have a problem with integrating solr in Ubuntu server.Before using solr > on ubuntu server i tested it on my mac it was working perfectly. it indexed > my PDF,Doc,Docx documents.so after installi

Re: solr indexing warning

2015-11-20 Thread Shawn Heisey
On 11/20/2015 12:33 AM, Midas A wrote: > As we are this server as a master server there are no queries running on > it . in that case should i remove these configuration from config file . The following cache info says that there ARE queries being run on this server: > QueryResultCache: > > lo

Re: solr indexing warning

2015-11-20 Thread Emir Arnautovic
Hi, Since this is the master node, and not expected to serve queries, you can disable caches completely. However, from the numbers, cache autowarm is not the issue here; the problem is probably the frequency of commits and/or warmup queries. How do you do commits? Since this is master-slave, I don't see a reason to have them too
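
On a master that serves no queries, the cache entries in solrconfig.xml can be shrunk to nothing, along these lines (a sketch using the stock config's element names):

    <filterCache      class="solr.FastLRUCache" size="0" initialSize="0" autowarmCount="0"/>
    <queryResultCache class="solr.LRUCache"     size="0" initialSize="0" autowarmCount="0"/>
    <documentCache    class="solr.LRUCache"     size="0" initialSize="0" autowarmCount="0"/>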

Re: solr indexing warning

2015-11-19 Thread Midas A
Thanks Shawn. As we are using this server as a master server, there are no queries running on it. In that case, should I remove these configurations from the config file? Total Docs: 40 0 Stats # Document cache : lookups:823 hits:4 hitratio:0.00 inserts:820 evictions:0 size:820 warmupTime:0 cumulati

Re: solr indexing warning

2015-11-19 Thread Shawn Heisey
On 11/19/2015 11:06 PM, Midas A wrote: > [quoted cache definitions, each with size="1000", initialSize="1000", autowarmCount="1000"] Your caches are quite large. More importantly, your autowarmCount is very large. How many documents are in each of your cores? If you check t

Re: solr indexing warning

2015-11-19 Thread Midas A
Thanks Emir. So what do we need to do to resolve this issue? This is my Solr configuration. What changes should I make to avoid the warning? ~abhishek On Thu, Nov 19, 2015 at 6:37 PM, Emir Arnautovic < emir.arnauto...@sematext.com> wrote: > This means that one searcher is still warming

Re: solr indexing warning

2015-11-19 Thread Emir Arnautovic
This means that one searcher is still warming when another searcher is created due to a commit with openSearcher=true. This can be due to frequent commits or searcher warmup taking too long. Emir

Re: Solr indexing based on last_modified

2015-08-17 Thread Erick Erickson
Well, you'll have to have some kind of timestamp that you can reference and only re-send files that have a newer timestamp. Or keep a DB around with file path/last indexed timestamp or Best, Erick On Mon, Aug 17, 2015 at 12:36 PM, coolmals wrote: > I have a file system. I have a scheduler wh

Re: Solr indexing based on last_modified

2015-08-17 Thread coolmals
I have a file system. I have a scheduler which will call Solr at a scheduled time interval. Any updates to the file system must be indexed by Solr. Only changes must be re-indexed, as the file system is huge and cannot be re-indexed every time.

Re: Solr indexing based on last_modified

2015-08-17 Thread Erick Erickson
There's no way that I know of with post.jar. Post.jar was never really intended as a production tool, and sending all the files to Solr for parsing (pdf, word and the like) is putting quite a load on the Solr server. What is your use-case? You might consider a SolrJ program, it would be simple eno

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-04 Thread Swaraj Kumar
I am not sure, but the following regex has worked for me in Java. Kindly check it against yours. ([^\x01])\x01([^\x01])\x01..([^\x01])$ Thanks, Swaraj

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-04 Thread Toke Eskildsen
avinash09 wrote: > Thanks Toke , nice explanation , i have one more concern instead of comma > separated my columns are ^A separated how to deal ^A ?? I am really not proficient with control characters and regexp. If ^A is Start Of Heading, which has ASCII & unicode character 1, my guess is that

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-04 Thread avinash09
Thanks Toke, nice explanation. I have one more concern: instead of comma separated, my columns are ^A separated. How do I deal with ^A?

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-04 Thread Swaraj Kumar
I have used the following and it works very fast in DIH on Solr 5.0. You can try this for getting groupNames from the regex. Regards, Swaraj Kumar

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-03 Thread Toke Eskildsen
avinash09 wrote: > regex="^(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*), > (.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*),(.*)$" A better solution seems to have been presented, but for the record I would like to note that the regexp above is quite an ef

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread avinash09
Alex, finally it worked for me. I found the Ctrl-A separator options: separator=%01&escape=\. Thanks for your help
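
Put together as a full request, that would look something like the following (core name and file are placeholders; %5C is the URL-encoded backslash for the escape parameter):

    curl 'http://localhost:8983/solr/mycore/update?commit=true&separator=%01&escape=%5C' \
         -H 'Content-type: text/csv' \
         --data-binary @test.csv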

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread Alexandre Rafalovitch
That's an interesting question. The reference shows you how to set a separator, but ^A is a special case. You may need to pass it in as a URL escape character or similar. But I would first get a sample working with more conventional separator and then worry about ^A. Just so you are not confusing

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread avinash09
Thanks Erick and Alexandre Rafalovitch. One more doubt: how do I pass the Ctrl-A (^A) separator while uploading CSV?

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread Erick Erickson
Data Import Handler is a process in Solr that reaches out, grabs "something external" and indexes it. "Something external" can be a database, files on the server etc. Along the way, you can do many transformations of the data. The point is that the source can be anything. The update handler is an

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread avinash09
Sir, a silly question, I am confused here: what is the difference between the Data Import Handler and update CSV?

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread Alexandre Rafalovitch
Well, I believe the tutorial has an example. Always a good thing - going through the tutorial. And the reference guide has the details: https://cwiki.apache.org/confluence/display/solr/Uploading+Data+with+Index+Handlers#UploadingDatawithIndexHandlers-CSVFormattedIndexUpdates . Regards, Alex. -

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread avinash09
No, could you please share an example?

Re: Solr -indexing from csv file having 28 cols taking lot of time ..plz help i m new to solr

2015-04-01 Thread Alexandre Rafalovitch
Solr actually has a CSV update handler. You could send the file to that directly. Have you tried that? Regards, Alex On 1 Apr 2015 11:56 pm, "avinash09" wrote: > >processor="LineEntityProcessor" > dataSource="fds" > url="test.csv" >

Re: SOLR indexing strategy

2015-03-21 Thread Jack Krupansky
Don't you have a number of "types" of transactions, where some fields may be common to all transactions, but with plenty of fields that are not common to all transactions? The point is that if the number of fields that need to be populated for each document type is relatively low, it becomes much m

Re: SOLR indexing strategy

2015-03-21 Thread varun sharma
1. All fields should be retrievable and are populated for each row, maybe with default values for some. 2. Out of 1000 fields, 10-15 need to be indexed. In our current proprietary solution, the index as well as the data files (compressed) reside together on SAN storage, and based on date range d

Re: SOLR indexing strategy

2015-03-21 Thread varun sharma
It's more of a financial message where, for each customer, there are various fields that specify various aspects of the transaction. On Friday, 20 March 2015 8:09 PM, Priceputu Cristian wrote: Why would you need 1000 fields for ? C On Fri, Mar 20, 2015 at 1:12 PM, varun sharma wrote

Re: SOLR indexing strategy

2015-03-20 Thread Shawn Heisey
On 3/20/2015 10:08 PM, Jack Krupansky wrote: > 1. With 1000 fields, you may only get 10 to 25 million rows per node. So, a > single date may take 15 to 50 nodes. > 2. How many of the fields need to be indexed for reference in a query? > 3. Are all the fields populated for each row? > 4. Maybe you c

Re: SOLR indexing strategy

2015-03-20 Thread Jack Krupansky
1. With 1000 fields, you may only get 10 to 25 million rows per node. So, a single date may take 15 to 50 nodes. 2. How many of the fields need to be indexed for reference in a query? 3. Are all the fields populated for each row? 4. Maybe you could split each row, so that one Solr collection would

Re: SOLR indexing strategy

2015-03-20 Thread Erick Erickson
On the surface, this is impossible: bq: This query should load only indexes within this date range How would one "load only indexes with this date range"? The nature of Lucene's merging segments makes it unclear what this would even mean. Best, Erick On Fri, Mar 20, 2015 at 5:09 AM, Priceputu C

Re: SOLR indexing strategy

2015-03-20 Thread Priceputu Cristian
Why would you need 1000 fields? C On Fri, Mar 20, 2015 at 1:12 PM, varun sharma wrote: > Requirements of the system that we are trying to build are: for each date > we need to create a SOLR index containing about 350-500 million documents, > where each document is a single structured record

Re: solr indexing not working when i try to insert 1000000 rows but works fine when i try to index 400000 rows or below

2014-06-03 Thread madhav bahuguna
Hi Ahmet, Thanks a ton. You were absolutely right; the moment I added batchSize="-1" it worked. Thank you so much, it's been 7 days and I just could not figure out what the issue was. It's working like a charm now. Thanks again. Regards, Madhav Bahuguna On Tue, Jun 3, 2014 at 5:28 PM, Ahmet Ars

Re: solr indexing not working when i try to insert 1000000 rows but works fine when i try to index 400000 rows or below

2014-06-03 Thread Shawn Heisey
On 6/3/2014 12:00 AM, madhav bahuguna wrote: > I am using solr 4.7.1 and trying to do a full import. My data source is a > table in mysql. It has 1000000 rows and 20 columns. > > Whenever I am trying to do a full import solr stops responding. But when I > try to do an import with a limit of 400000 or

Re: solr indexing not working when i try to insert 1000000 rows but works fine when i try to index 400000 rows or below

2014-06-03 Thread Ahmet Arslan
Hi Madhav, Just a guess, try using batchSize="-1" Ahmet On Tuesday, June 3, 2014 12:48 PM, madhav bahuguna wrote: Hi, I am using solr 4.7.1 and trying to do a full import. My data source is a table in mysql. It has 1000000 rows and 20 columns. Whenever I am trying to do a full import solr st
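
For reference, batchSize="-1" goes on the JDBC data source in data-config.xml; with the MySQL driver it streams rows instead of buffering the whole result set in memory. A sketch (driver, URL, credentials and query are placeholders):

    <dataConfig>
      <dataSource type="JdbcDataSource"
                  driver="com.mysql.jdbc.Driver"
                  url="jdbc:mysql://localhost:3306/mydb"
                  user="dbuser" password="dbpass"
                  batchSize="-1"/>
      <document>
        <entity name="item" query="SELECT id, name, price FROM items">
          <field column="id" name="id"/>
          <field column="name" name="name"/>
        </entity>
      </document>
    </dataConfig>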

Re: Solr indexing on urlencoded fields

2013-10-22 Thread neerajp
Thanks for giving me your valuable thoughts. I used CDATA for escaping the special characters ('<', '>', '&', etc.) in the XML file.

Re: Solr indexing on urlencoded fields

2013-10-22 Thread Erik Hatcher
You only URL-encode data that is in the URL. If you're posting Solr XML, you need to encode entities appropriately for it to be valid XML, but that's not the same as URL encoding. Erik On Oct 22, 2013, at 6:54 AM, neerajp wrote: > Hi, > I am a new solr user. I need to integrate Solr wi
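
To make the distinction concrete, a Solr XML add document only needs XML-level escaping (or CDATA) for special characters in field values, for example:

    <add>
      <doc>
        <field name="id">doc-1</field>
        <field name="title">Fish &amp; Chips &lt;special&gt;</field>
        <field name="body"><![CDATA[Raw text with < > & left untouched]]></field>
      </doc>
    </add>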

Re: Solr Indexing Status

2013-08-22 Thread Prasi S
Thanks much . This was useful. On Thu, Aug 22, 2013 at 2:24 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > You can use the /admin/mbeans handler to get all system stats. You can > find stats such as "adds" and "cumulative_adds" under the update > handler section. > > http://localho

Re: Solr Indexing Status

2013-08-22 Thread Shalin Shekhar Mangar
You can use the /admin/mbeans handler to get all system stats. You can find stats such as "adds" and "cumulative_adds" under the update handler section. http://localhost:8983/solr/collection1/admin/mbeans?stats=true On Thu, Aug 22, 2013 at 12:35 PM, Prasi S wrote: > I am not using dih for indexi

Re: Solr Indexing Status

2013-08-22 Thread Prasi S
I am not using DIH for indexing CSV files. I'm pushing data through SolrJ code. But I want a status, something like what DIH gives, i.e. fire command=status and get the response. Is anything like that available for any type of file indexing which we do through the API? On Thu, Aug 22, 2013 at 12:09

Re: Solr Indexing Status

2013-08-21 Thread Shalin Shekhar Mangar
Yes, you can invoke http://<host>:<port>/solr/dataimport?command=status which will return how many Solr docs have been added etc. On Wed, Aug 21, 2013 at 4:56 PM, Prasi S wrote: > Hi, > I am using solr 4.4 to index csv files. I am using solrj for this. At > frequent intervals my user may request for "Status"

Re: Solr Indexing Status

2013-08-21 Thread Furkan KAMACI
You know the size of CSV files and you can calculate it if you want. 2013/8/21 Prasi S > Hi, > I am using solr 4.4 to index csv files. I am using solrj for this. At > frequent intervels my user may request for "Status". I have to send get > something like in DIH " Indexing in progress.. Added x

Re: Solr indexing slows down

2013-06-10 Thread Michael Della Bitta
-Original Message- > From: Michael Della Bitta [mailto:michael.della.bi...@appinions.com] > Sent: Thursday, 6 June 2013 18:29 > To: solr-user@lucene.apache.org > Subject: Re: Solr indexing slows down > > Hi Sebastian, > > What database are you using? How much RAM is availabl

Re: Solr indexing slows down

2013-06-10 Thread Walter Underwood
8 million documents in two hours is over 1000/sec. That is a pretty fast indexing rate. It may be hard to go faster than that. wunder On Jun 10, 2013, at 7:12 AM, Shawn Heisey wrote: > On 6/10/2013 2:32 AM, Sebastian Steinfeld wrote: >> Hi Shawn, >> >> thank you for your answer. >> >> I am us

Re: Solr indexing slows down

2013-06-06 Thread Shawn Heisey
On 6/6/2013 4:13 AM, Sebastian Steinfeld wrote: The amount of documents I want to index is 8 million; the first 1.6 million are indexed in 2 min, but completing the import takes nearly 2 hours. The size of the index on the hard drive is 610MB. I started the solr server with 2GB memory. I read

Re: Solr indexing slows down

2013-06-06 Thread Michael Della Bitta
Hi Sebastian, What database are you using? How much RAM is available on your machine? It looks like you're selecting from a view... Have you tried paging through the view outside of Solr? Does that slow down as well? Do you notice any increased load on the Solr box or the database server? Micha

Re: Solr Indexing Rich Documents

2013-04-27 Thread Ahmet Arslan
Indexing Rich Documents Yes, file system 2013/4/27 Ahmet Arslan > hi, > > Where do you store your rich documents? File system? > > > > > - Original Message - > From: Furkan KAMACI > To: solr-user@lucene.apache.org > Cc: > Sent: Friday, April 26, 2013 6:19

Re: Solr Indexing Rich Documents

2013-04-27 Thread Furkan KAMACI
Yes, file system 2013/4/27 Ahmet Arslan > hi, > > Where do you store your rich documents? File system? > > > > > - Original Message - > From: Furkan KAMACI > To: solr-user@lucene.apache.org > Cc: > Sent: Friday, April 26, 2013 6:19 PM > Subject:

Re: Solr Indexing Rich Documents

2013-04-27 Thread Ahmet Arslan
hi, Where do you store your rich documents? File system? - Original Message - From: Furkan KAMACI To: solr-user@lucene.apache.org Cc: Sent: Friday, April 26, 2013 6:19 PM Subject: Re: Solr Indexing Rich Documents Is there any example at wiki for Manifold? 2013/4/26 Ahmet Arslan

Re: Solr Indexing Rich Documents

2013-04-26 Thread Furkan KAMACI
ts. It utilises ExtractingRequestHandler feature > of solr. > > --- On Fri, 4/26/13, Furkan KAMACI wrote: > > > From: Furkan KAMACI > > Subject: Re: Solr Indexing Rich Documents > > To: solr-user@lucene.apache.org > > Date: Friday, April 26, 2013, 3:39 PM > > Thanks for the an

Re: Solr Indexing Rich Documents

2013-04-26 Thread Ahmet Arslan
rom: Furkan KAMACI > Subject: Re: Solr Indexing Rich Documents > To: solr-user@lucene.apache.org > Date: Friday, April 26, 2013, 3:39 PM > Thanks for the answer, I get an error > now: FileNotFound Exception as I > mentioned at other thread. Now I' trying to solve it. >

Re: Solr Indexing Rich Documents

2013-04-26 Thread Furkan KAMACI
Thanks for the answer. I get an error now: a FileNotFound exception, as I mentioned in the other thread. Now I'm trying to solve it. 2013/4/26 Jack Krupansky > It's called SolrCell or the ExtractingRequestHandler (/update/extract), > which the newer post.jar knows to use for some file types: > http://wi

Re: Solr Indexing Rich Documents

2013-04-26 Thread Jack Krupansky
It's called SolrCell or the ExtractingRequestHandler (/update/extract), which the newer post.jar knows to use for some file types: http://wiki.apache.org/solr/ExtractingRequestHandler -- Jack Krupansky -Original Message- From: Furkan KAMACI Sent: Friday, April 26, 2013 4:48 AM To: sol
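
A typical invocation looks something like this (core name, id and file path are placeholders):

    curl 'http://localhost:8983/solr/mycore/update/extract?literal.id=doc1&commit=true' \
         -F 'myfile=@/path/to/document.pdf'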

Re: Solr indexing

2013-04-18 Thread uohzoaix
You just change the date fieldType to string.

Re: Solr indexing

2013-04-18 Thread Jack Krupansky
Solr dates are always "Z", GMT. -- Jack Krupansky -Original Message- From: hassancrowdc Sent: Thursday, April 18, 2013 11:49 AM To: solr-user@lucene.apache.org Subject: Solr indexing Solr is not showing the dates i have in database. any help? is solr following any specific timezone?

Re: Solr indexing

2013-04-18 Thread Andy Lester
On Apr 18, 2013, at 10:49 AM, hassancrowdc wrote: > Solr is not showing the dates i have in database. any help? is solr following > any specific timezone? On my database my date is 2013-04-18 11:29:33 but > solr shows me "2013-04-18T15:29:33Z". Any help Solr knows nothing of timezones. Solr

Re: Solr Indexing My SQL Timestamp or Date Time field

2013-04-15 Thread Erick Erickson
Solr requires precise date formats, see: http://lucene.apache.org/solr/api-4_0_0-BETA/org/apache/solr/schema/DateField.html Best Erick On Sun, Apr 14, 2013 at 11:43 AM, ursswak...@gmail.com wrote: > Hi, > > To index Date in Solr, Date should be in ISO format. > Can we index MySQL Timestamp or Da
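
If the timestamp comes out of the database as a plain string, DIH's DateFormatTransformer can convert it to the ISO form Solr expects. A sketch with invented table and column names:

    <entity name="item"
            query="SELECT id, modified_at FROM items"
            transformer="DateFormatTransformer">
      <field column="modified_at" name="last_modified"
             dateTimeFormat="yyyy-MM-dd HH:mm:ss"/>
    </entity>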

Re: Solr indexing binary files

2013-03-18 Thread Luis
Hi Gora, Yes, my urlpath points to a URL like that. I do not get why uncommenting the catch-all dynamic field ("*") does not work for me.

Re: Solr indexing binary files

2013-03-15 Thread Gora Mohanty
On 16 March 2013 00:30, Luis wrote: > Sorry, Gora. It is ${fileSourcePaths.urlpath} actually. Most likely, there is some issue with the selected urlpath not pointing to a proper http or file source. E.g., urlpath could be something like http://example.com/myfile.pdf . Please check that ${fileSou

Re: Solr indexing binary files

2013-03-15 Thread Luis
Sorry, Gora. It is ${fileSourcePaths.urlpath} actually. *My complete schema.xml is this:*

Re: Solr indexing binary files

2013-03-15 Thread Gora Mohanty
On 15 March 2013 20:16, Luis wrote: > > Hi Gora, thank you for your reply. I am not using any commands, I just go > on > the Solr dashboard, db > Dataimport and execute a full-import. In that case, you are not using the ExtractingRequestHandler, but using the DataImportHandler, even though you h

Re: Solr indexing binary files

2013-03-15 Thread Luis
Hi Gora, thank you for your reply. I am not using any commands, I just go on the Solr dashboard, db > Dataimport and execute a full-import. *My schema.xml looks like this:*

Re: Solr indexing binary files

2013-03-15 Thread Gora Mohanty
On 15 March 2013 19:28, Luis wrote: > Hi Jack, thanks a lot for your reply. I did add the catch-all dynamicField > (type="text" multiValued="true"). However, when I run Solr it gives me a > bunch of errors. It actually displays the content of my files on my command > line and shows some logs like this: > > org.apache.so

Re: Solr indexing binary files

2013-03-15 Thread Luis
Hi Jack, thanks a lot for your reply. I did add the catch-all dynamicField (type="text", multiValued="true"). However, when I run Solr it gives me a bunch of errors. It actually displays the content of my files on my command line and shows some logs like this: org.apache.solr.common.SolrException: Document is missing mandatory uniqueKey field: id

Re: Solr indexing binary files

2013-03-14 Thread Jack Krupansky
Take a look at Solr Cell: http://wiki.apache.org/solr/ExtractingRequestHandler Include a dynamicField with a "*" pattern and you will see the wide variety of metadata that is available for PDF and other rich document formats. -- Jack Krupansky -Original Message- From: Luis Sent: Th
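
Such a catch-all pattern in schema.xml could look like the following (the field type name is an assumption; the stock schemas use a similar "*" pattern mapped to an ignored type when the metadata is not wanted):

    <dynamicField name="*" type="text_general" indexed="true" stored="true" multiValued="true"/>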

Re: Solr Indexing with huge number of documents

2012-12-26 Thread Gora Mohanty
On 26 December 2012 15:00, bsargurunathan wrote: > Hi Everyone, > > In Solr indexing, I need to index millions and millions of records at a single > time from an XML file. > While I am doing the indexing, I am constructing the XML file and passing it to > the solr. > But right now I am controlling the record

Re: Solr Indexing MAX FILE LIMIT

2012-11-15 Thread Alexandre Rafalovitch
Maybe you can start by testing this with split -l and xargs :-) These are standard Unix toolkit approaches and since you use one of them (curl) you may be happy to use others too. Regards, Alex. Personal blog: http://blog.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalov
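
A rough sketch of that approach (paths, chunk size and core name are made up; if the CSV has a header row, pass fieldnames=...&header=false or prepend the header to each chunk):

    # cut the big file into 200k-line chunks
    split -l 200000 big.csv chunk_
    # post each chunk without committing
    ls chunk_* | xargs -I{} curl 'http://localhost:8983/solr/mycore/update?commit=false' \
         -H 'Content-type: text/csv' --data-binary @{}
    # one explicit commit when all chunks are in
    curl 'http://localhost:8983/solr/mycore/update' -H 'Content-type: text/xml' --data-binary '<commit/>'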

Re: Solr Indexing MAX FILE LIMIT

2012-11-14 Thread mitra
Thank you Erick. I didn't know that we could write a Java class for it; can you provide me with some info on how to? Thanks

Re: Solr Indexing MAX FILE LIMIT

2012-11-13 Thread Erick Erickson
Have you considered writing a small SolrJ (or other client) program that processed the rows in your huge file and sent them to solr in sensible chunks? That would give you much finer control over how the file was processed, how many docs were sent to Solr at a time, what to do with errors. You coul

RE: Solr Indexing MAX FILE LIMIT

2012-11-13 Thread mitra
Thank you *** I understand that the default size for HTTP POST in Tomcat is 2MB. Can we change that somehow so that I don't need to split the 10GB CSV into 2MB chunks? curl http://localhost:8080/solr/update/csv -F "stream.file=D:\eighth.csv" -F "commit=true" -F "optimize=true" -F "encapsulate

RE: Solr Indexing MAX FILE LIMIT

2012-11-13 Thread Markus Jelsma
Hi - instead of trying to make the system ingest such large files perhaps you can split the files in many small pieces. -Original message- > From:mitra > Sent: Tue 13-Nov-2012 09:05 > To: solr-user@lucene.apache.org > Subject: Solr Indexing MAX FILE LIMIT > > Hello Guys > > Im using

Re: solr indexing using jquery AJAX

2012-11-01 Thread amit
I changed as per your feedback. Added quotes and escaped them before id and name. Still not able to insert. data: "20trailblazers", The Tomcat log says bad request. 192.168.11.88 - - [01/Nov/2012:17:10:35 +0530] "OPTIONS /solr/update?commit=true HTTP/1.1" 400 1052 In Google Chrome there are

Re: solr indexing using jquery AJAX

2012-10-30 Thread Michael Della Bitta
Also, I believe you have to set the MIME type on your post. You might want to refer to the server-side logs to see what Solr is saying about your request. There will likely be a descriptive error there. Michael Della Bitta

Re: solr indexing using jquery AJAX

2012-10-30 Thread Gora Mohanty
On 30 October 2012 20:16, Amit Mallik wrote: > Hi I am trying to index using AJAX basically jquery. > Below is my code > try { > $.ajax({ > type: "POST", > url: > "http://myserver:8080/solr/update?commit=true";

Re: Solr Indexing and Search on different Product Catalog Views

2012-10-06 Thread Gora Mohanty
On 6 October 2012 21:53, nitin malhotra wrote: > Hi, > > The catalogue views are of dynamic fashion and a fresh update and index > creation can bring in a change in the distribution of the product / records > from one view set to another. There can be modifications for every index > refresh. In su

Re: Solr Indexing and Search on different Product Catalog Views

2012-10-06 Thread nitin malhotra
a/ExternalFileField.html> > > -- Jack Krupansky > > -Original Message- From: Jack Krupansky > Sent: Saturday, October 06, 2012 11:43 AM > To: solr-user@lucene.apache.org > Subject: Re: Solr Indexing and Search on different Product Catalog Views > > > One approa
