Re: DIH

2021-01-21 Thread dmitri maziuk
On 2021-01-20 6:26 PM, Joshua Wilder wrote: Please reconsider the removal of the DIH from future versions. The repo it's been moved to is a ghost town with zero engagement from Rohit (or anyone). Not sure how 'moving' it caused it to now only support MariaDB but that appears to be the case. The c

Re: DIH and UUIDProcessorFactory

2020-12-17 Thread Dmitri Maziuk
On 12/17/2020 4:05 PM, Alexandre Rafalovitch wrote: Try with the explicit URP chain too. It may work as well. Actually in this case we're just making sure uniqueKey is in fact unique in all documents, so default is what we want. For this particular dataset I may at some future point look int

Re: DIH and UUIDProcessorFactory

2020-12-17 Thread Alexandre Rafalovitch
Try with the explicit URP chain too. It may work as well. Regards, Alex. On Thu, 17 Dec 2020 at 16:51, Dmitri Maziuk wrote: > > On 12/12/2020 4:36 PM, Shawn Heisey wrote: > > On 12/12/2020 2:30 PM, Dmitri Maziuk wrote: > >> Right, ```Every update request received by Solr is run through a chai

Re: DIH and UUIDProcessorFactory

2020-12-17 Thread Dmitri Maziuk
On 12/12/2020 4:36 PM, Shawn Heisey wrote: On 12/12/2020 2:30 PM, Dmitri Maziuk wrote: Right, ```Every update request received by Solr is run through a chain of plugins known as Update Request Processors, or URPs.``` The part I'm missing is whether DIH's 'name="/dataimport"' counts as an "Upda

Re: DIH and UUIDProcessorFactory

2020-12-12 Thread Shawn Heisey
On 12/12/2020 2:30 PM, Dmitri Maziuk wrote: Right, ```Every update request received by Solr is run through a chain of plugins known as Update Request Processors, or URPs.``` The part I'm missing is whether DIH's 'name="/dataimport"' counts as an "Update Request", my reading is it doesn't and U

Re: DIH and UUIDProcessorFactory

2020-12-12 Thread Dmitri Maziuk
On 12/12/2020 2:50 PM, Shawn Heisey wrote: The only way I know of to use an update processor chain with DIH is to set 'default="true"' when defining the chain. I did manage to find an example with the default attribute, in javadocs: https://lucene.apache.org/solr/5_0_0/solr-core/org/apache/so

Re: DIH and UUIDProcessorFactory

2020-12-12 Thread Shawn Heisey
On 12/12/2020 12:54 PM, Dmitri Maziuk wrote: is there an easy way to use the stock UUID generator with DIH? We have a hand-written one-liner class we use as DIH entity transformer but I wonder if there's a way to use the built-in UUID generator class instead. From the TFM it looks like there

Re: DIH and UUIDProcessorFactory

2020-12-12 Thread Alexandre Rafalovitch
Why not? You should be able to put an URP chain after DIH, the usual way. Is that something about UUID that is special? Regards, Alex On Sat., Dec. 12, 2020, 2:55 p.m. Dmitri Maziuk, wrote: > Hi everyone, > > is there an easy way to use the stock UUID generator with DIH? We have a > hand-w

Re: DIH on SolrCloud

2020-08-14 Thread Jan Høydahl
DIH should run fine from any node. It sends update requests as any other client, and those are routed to the leader, wherever it is. It could be problematic if node 2 gets overloaded by both doing DIH work, Overseer work and perhaps shard leader work, and an overloaded node gets into all kind of p

Re: DIH on SolrCloud

2020-08-13 Thread Issei Nishigata
Thank you for your quick reply. Can I make sure that the indexing isn't conducted on the node where the DIH executed but conducted on the Leader node, right? As far as I have seen a log, there are errors: the failed establishment of connection occurred from Node2 on the state of Replica on running

Re: DIH on SolrCloud

2020-08-13 Thread Jörn Franke
DIH is deprecated in current Solr versions. The general recommendation is to do processing outside the Solr server and use the update handler (the normal one, not Cell) to add documents to the index. So you should avoid using it as it is not future-proof. If you need more time to migrate to a
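
A minimal sketch of the "process outside Solr, then post to the update handler" approach described above, in Python with only the standard library. The base URL, the collection name `books`, and the document fields are assumptions for illustration, not taken from the thread; the sketch relies only on the update handler accepting a JSON array of documents at `/update`.

```python
import json
import urllib.request


def build_update_request(base_url, collection, docs, commit=True):
    """Build a POST request that sends docs as a JSON array to /update."""
    url = f"{base_url.rstrip('/')}/{collection}/update"
    if commit:
        # Commit right away so the documents become searchable; for large
        # loads it is usually better to commit once at the end instead.
        url += "?commit=true"
    body = json.dumps(docs).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def index_docs(base_url, collection, docs):
    """Send the documents to Solr and return the parsed JSON response."""
    request = build_update_request(base_url, collection, docs)
    with urllib.request.urlopen(request) as resp:
        return json.load(resp)


# Hypothetical usage; the rows would come from whatever source DIH used to read:
# index_docs("http://localhost:8983/solr", "books", [{"id": "1", "title": "Example"}])
```

Keeping the request construction separate from the network call makes it easy to batch rows from any source and to check the payload without a running Solr.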

Re: DIH nested entity repeating query in verbose output

2020-05-14 Thread matthew sporleder
I think this is just an issue in the verbose/debug output. tcpdump does not show the same issue. On Wed, May 13, 2020 at 7:39 PM matthew sporleder wrote: > > I am attempting to use nested entities to populate documents from > different tables and verbose/debug output is showing repeated queries

Re: DIH across two SQL DBs

2019-10-31 Thread Jan Høydahl
Hmm, I'll have a look, but the SELECT is a bit more involved so the IDs from the other DB will be OR'ed into the WHERE clause, i.e. be added to those selected from other part of the where clause, so it's not a pure join. I'll think some more -- Jan Høydahl, search solution architect Cominvent A

Re: DIH across two SQL DBs

2019-10-31 Thread Mikhail Khludnev
Hello, Jan. Have you considered join="zipper" ? On Thu, Oct 31, 2019 at 12:52 AM Jan Høydahl wrote: > I need a SELECT which filters IDS based on an ‘id’ list coming from > another database, i.e. SELECT * FROM maindb.maintable WHERE id IN (SELECT > myid FROM otherdb.other_table). > > The docs ar

Re: DIH: Create Child Documents in ScriptTransformer

2019-09-19 Thread Jörn Franke
Hi, thanks for all the feedback. The context parameter in the ScriptTransformer is new to me - thanks for this insight. I could not find it in any docs. So just for people that also did not know it: you can have the ScriptTransformer with 2 parameters, e.g. function mytransformer(row,context){ ...

Re: DIH: Create Child Documents in ScriptTransformer

2019-09-18 Thread Mikhail Khludnev
Hello, Jörn. Have you tried to find a parent doc in the context which is passed as a second argument into ScriptTransformer? On Wed, Sep 18, 2019 at 9:56 PM Jörn Franke wrote: > > Hi, > > I load a set of documents. Based on these documents some logic needs to be > applied to split them into chapt

Re: DIH: Create Child Documents in ScriptTransformer

2019-09-18 Thread Jörn Franke
I fully agree. However, I am just curious to see the limits. > On 18.09.2019 at 23:33, Erick Erickson wrote: > > When it starts getting complex, I usually move to SolrJ. You say > you're loading documents, so I assume Tika is in the mix too. > > Here's a blog on the topic so you can see how to

Re: DIH: Create Child Documents in ScriptTransformer

2019-09-18 Thread Erick Erickson
When it starts getting complex, I usually move to SolrJ. You say you're loading documents, so I assume Tika is in the mix too. Here's a blog on the topic so you can see how to get started... https://lucidworks.com/post/indexing-with-solrj/ Best, Erick On Wed, Sep 18, 2019 at 2:56 PM Jörn Franke

Re: DIH import fails when importing multi-valued field

2019-06-27 Thread Erick Erickson
This looks like a problem with your select statement returning too many rows. I doubt it has to do with the multiValued field, I don’t think DIH is getting to the point where it even tries to create a SolrInputDocument. Depending on the driver, there are ways to limit the number of rows returned

Re: DIH for TikaEntityProcessor

2018-10-12 Thread Kamuela Lau
Glad to help :) On Fri, Oct 12, 2018 at 21:10, Martin Frank Hansen (MHQ) wrote: > You sir just made my day!!! > > It worked!!! Thanks a million! > > > Martin Frank Hansen, > > -Original Message- > From: Kamuela Lau > Sent: 12 October 2018 11:41 > To: solr-user@

Re: DIH for TikaEntityProcessor

2018-10-12 Thread Alexandre Rafalovitch
Solr ships with DIH Tika example that seems 90% identical to yours. Can you get that to run? If it works, then you can focus on the 10% difference. Perhaps it is explicit dataSource=null in the outer entity? Or maybe format=text on the inner one. Regards, Alex On Fri, Oct 12, 2018, 3:11 AM

Re: DIH for TikaEntityProcessor

2018-10-12 Thread Kamuela Lau
Also, just wondering, have you tried to specify dataSource="bin" for read_file? On Fri, Oct 12, 2018 at 6:38 PM Kamuela Lau wrote: > Hi, > > I was unable to reproduce the error that you got with the information > provided. > Below are the data-config.xml and managed-schema fields I used; th

Re: DIH for TikaEntityProcessor

2018-10-12 Thread Kamuela Lau
Hi, I was unable to reproduce the error that you got with the information provided. Below are the data-config.xml and managed-schema fields I used; the data-config is mostly the same (I think that BinFileDataSource doesn't actually require a dataSource, so I think it's safe to put dataSource="null

Re: DIH for different levels of XML

2018-10-07 Thread Alexandre Rafalovitch
If your ID field comes from one XML level and your record details from another, they are processed as two separate records. Have a look at atom example that ships with DIH example set. Specifically, at commonField parameter, it may be useful for you: https://lucene.apache.org/solr/guide/7_4/uploadi

Re: DIH with huge data

2018-04-12 Thread Sujay Bawaskar
That sounds like a good option. So the Spark job will connect to MySQL and create Solr documents, which are pushed into Solr using SolrJ, probably in batches. On Thu, Apr 12, 2018 at 10:48 PM, Rahul Singh wrote: > If you want speed, Spark is the fastest easiest way. You can connect to > relational tables direc

Re: DIH with huge data

2018-04-12 Thread Rahul Singh
CSV -> Spark -> SolR https://github.com/lucidworks/spark-solr/blob/master/docs/examples/csv.adoc If speed is not an issue there are other methods. Spring Batch / Spring Data might have all the tools you need to get speed without Spark. -- Rahul Singh rahul.si...@anant.us Anant Corporation On

Re: DIH with huge data

2018-04-12 Thread Rahul Singh
If you want speed, Spark is the fastest easiest way. You can connect to relational tables directly and import or export to CSV / JSON and import from a distributed filesystem like S3 or HDFS. Combining a dfs with spark and a highly available SolR - you are maximizing all threads. -- Rahul Sing

Re: DIH with huge data

2018-04-12 Thread Sujay Bawaskar
Thanks Rahul. Data source is JdbcDataSource with MySQL database. Data size is around 100GB. I am not very familiar with Spark, but are you suggesting that we should create documents by merging distinct RDBMS tables using RDDs? On Thu, Apr 12, 2018 at 10:06 PM, Rahul Singh wrote: > How much data

Re: DIH with huge data

2018-04-12 Thread Rahul Singh
How much data and what is the database source? Spark is probably the fastest way. -- Rahul Singh rahul.si...@anant.us Anant Corporation On Apr 12, 2018, 7:28 AM -0400, Sujay Bawaskar , wrote: > Hi, > > We are using DIH with SortedMapBackedCache but as data size increases we > need to provide mo

Re: DIH XPathEntityProcessor XPath subset?

2018-01-05 Thread Rick Leir
Stefan There is at least one free Solr WP plugin. There are several Solr PHP toolkits on github. Start with these unless your WP is wildly custo..  .. cheers -- Rick On 01/03/2018 11:50 AM, Erik Hatcher wrote: Stefan - If you pre-transform the XML, I’d personally recommend either transform

Re: DIH XPathEntityProcessor XPath subset?

2018-01-03 Thread Erik Hatcher
Stefan - If you pre-transform the XML, I’d personally recommend either transforming it into straight up Solr XML (docs/fields/values) or some other format or posting directly to Solr. Avoid this DIH thing when things get complicated. Erik > On Jan 3, 2018, at 11:40 AM, Stefan Moises

Re: DIH not stop

2017-11-16 Thread Rick Leir
Can, I would like to learn many languages, but so far only two. Shawn suggested you get help from a friend who knows English. As well, Google translate is great for me, but I have not used it with Turkish. Cheers -- Rick On November 16, 2017 5:19:33 AM EST, Shawn Heisey wrote: >On 11/15/2017 11

Re: DIH not stop

2017-11-16 Thread Shawn Heisey
On 11/15/2017 11:59 PM, Can Ezgi Aydemir wrote: I configured Solr and Cassandra. Running full data import but not stop. Only core load during this process, stop it. Seeing that stop dih, not write dataimport.properties. In dataconfig.xml file, i define simplepropertywriter type and filename. B

RE: DIH not stop

2017-11-16 Thread Can Ezgi Aydemir
dde No:14, Beysukent 06800, Ankara, Türkiye T : 0 312 233 50 00 .:. F : 0312 235 56 82 E : cayde...@islem.com.tr .:. W : http://www.islem.com.tr -Original Message- From: Sujay Bawaskar [mailto:sujaybawas...@gmail.com] Sent: 16 November 2017 11:49 To: solr-user@lucene.apache.org Subje

Re: DIH not stop

2017-11-16 Thread Sujay Bawaskar
9} status=0 QTime=0 > 2017-11-16 07:21:36.076 INFO (qtp1638215613-14) [ x:cea2] > o.a.s.c.S.Request [cea2] webapp=/solr path=/dataimport > params={indent=on&wt=json&command=status&_=1510816148489} status=0 QTime=0 > 2017-11-16 07:21:38.064 INFO (qtp1638215613-14) [ x

RE: DIH not stop

2017-11-15 Thread Can Ezgi Aydemir
75 INFO (qtp1638215613-43) [ x:cea2] o.a.s.c.S.Request [cea2] webapp=/solr path=/dataimport params={indent=on&wt=json&command=status&_=1510816148489} status=0 QTime=2 ^C Can Ezgi Aydemir Oracle Veri Tabanı Yöneticisi & Oracle Database Admin İşlem Coğrafi Bilgi Sistemleri Müh. & Eğitim AŞ.

Re: DIH not stop

2017-11-15 Thread Sujay Bawaskar
I have experienced this problem recently with MySQL and, after checking solr.log, found that there was a connection timeout from MySQL. Please check solr.log for any Cassandra connection errors. Thanks, Sujay On Thu, Nov 16, 2017 at 12:29 PM, Can Ezgi Aydemir wrote: > Hi all, > > I configured Solr

Re: DIH, multiple sources, cores and search: single core with multiple entities or single core per source with search across multiple cores?

2017-07-24 Thread Rick Leir
Giovanni, Start with your search results page and work back from there. Decide what fields you want to display in a results page, then plan for your Solr document to contain all these fields. Now you will need a program to ingest the data from whatever database, and create documents for Solr. Th

RE: DIH issue with streaming xml file

2017-07-10 Thread Miller, William K - Norman, OK - Contractor
, 2017 2:12 PM To: 'solr-user@lucene.apache.org' Subject: RE: DIH issue with streaming xml file Thank you for your response. I will look into this link. Also, sorry I did not specify the file type. I am working with XML files. ~~~ William Kevin Miller ECS Fe

Re: DIH delta import with cache 5.3.1 issue

2017-06-20 Thread Sujay Bawaskar
Hi, Did not encounter this issue with solr 6.x. But delta import with cache executes nested query for every element encountered in parent query. Since this select does not have where clause because we are using cache, it takes a long time. So delta import with cache is very slow. My observation is

RE: DIH issue with streaming xml file

2017-06-12 Thread Miller, William K - Norman, OK - Contractor
[mailto:arafa...@gmail.com] Sent: Monday, June 12, 2017 1:26 PM To: solr-user Subject: Re: DIH issue with streaming xml file Solr 6.5.1 DIH setup has - somewhat broken - RSS example (redone as ATOM example in 6.6) that shows how to get stuff from https URL. You can see the atom example here: https

Re: DIH issue with streaming xml file

2017-06-12 Thread Alexandre Rafalovitch
reate a custom entity processor? > > > > > ~~~ > William Kevin Miller > > ECS Federal, Inc. > USPS/MTSC > (405) 573-2158 > > > -Original Message- > From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] > Sent: Monday, June 12, 2017 12:57 PM > To: solr-user >

RE: DIH issue with streaming xml file

2017-06-12 Thread Miller, William K - Norman, OK - Contractor
) 573-2158 -Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Monday, June 12, 2017 12:57 PM To: solr-user Subject: Re: DIH issue with streaming xml file How do you get a list of URLs for the files on the remote server? That's probably the first issue. Once you have

Re: DIH issue with streaming xml file

2017-06-12 Thread Alexandre Rafalovitch
How do you get a list of URLs for the files on the remote server? That's probably the first issue. Once you have the URLs in an outside entity or two, you can feed them one by one into the inner entity. Regards, Alex. http://www.solr-start.com/ - Resources for Solr users, new and experien

Re: DIH Speed

2017-04-27 Thread Vijay Kokatnur
Let me clarify - DIH is running on Solr 6.5.0 that calls a different solr instance running on 4.5.0, which has 150M documents. If we try fetch them using DIH onto new solr cluster, wouldn't it result in deep paging on solr 4.5.0 and drastically slow down indexing on solr 6.5.0? On Thu, Apr 27,

Re: DIH Speed

2017-04-27 Thread Shawn Heisey
On 4/27/2017 9:15 PM, Vijay Kokatnur wrote: > Hey Shawn, Unfortunately, we can't upgrade the existing cluster. That > was my first approach as well. Yes, SolrEntityProcessor is used so it > results in deep paging after certain rows. I have observed that > instead of importing for a larger period, i

Re: DIH Speed

2017-04-27 Thread Vijay Kokatnur
:07 PM *To:* solr-user@lucene.apache.org *Subject:* Re: DIH Speed On 4/27/2017 5:40 PM, Erick Erickson wrote: > I'm unclear why DIH and deep paging are mixed. DIH is indexing and deep paging is querying. > > If it's querying, consider cursorMark or the /export handler. https://luc

Re: DIH Speed

2017-04-27 Thread Shawn Heisey
On 4/27/2017 5:40 PM, Erick Erickson wrote: > I'm unclear why DIH and deep paging are mixed. DIH is indexing and deep paging > is querying. > > If it's querying, consider cursorMark or the /export handler. > https://lucidworks.com/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-

Re: DIH Speed

2017-04-27 Thread Erick Erickson
I'm unclear why DIH and deep paging are mixed. DIH is indexing and deep paging is querying. If it's querying, consider cursorMark or the /export handler. https://lucidworks.com/2013/12/12/coming-soon-to-solr-efficient-cursor-based-iteration-of-large-result-sets/ If it's DIH, please explain a bit
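
The cursorMark suggestion above can be sketched in Python roughly as follows. This is an illustration under assumptions, not code from the thread: the uniqueKey field is assumed to be `id`, the select URL is a placeholder, and `fetch_page` is a hypothetical helper. What it relies on is the documented cursor contract: start with `cursorMark=*`, sort on a clause that includes the uniqueKey, and stop when `nextCursorMark` no longer changes.

```python
import json
import urllib.parse
import urllib.request


def make_http_fetcher(select_url):
    """Return a fetch_page(params) function that queries a /select handler."""
    def fetch_page(params):
        query = urllib.parse.urlencode({**params, "wt": "json"})
        with urllib.request.urlopen(f"{select_url}?{query}") as resp:
            return json.load(resp)
    return fetch_page


def iter_with_cursor(fetch_page, rows=1000, sort="id asc"):
    """Yield every matching document by following nextCursorMark.

    The sort must include the uniqueKey field (assumed here to be "id").
    """
    cursor = "*"  # "*" asks Solr to start a new cursor
    while True:
        params = {"q": "*:*", "rows": rows, "sort": sort, "cursorMark": cursor}
        response = fetch_page(params)
        yield from response["response"]["docs"]
        next_cursor = response["nextCursorMark"]
        if next_cursor == cursor:
            # An unchanged cursor means there are no more results.
            break
        cursor = next_cursor


# Hypothetical usage against a placeholder URL:
# fetch = make_http_fetcher("http://localhost:8983/solr/oldcore/select")
# for doc in iter_with_cursor(fetch, rows=500):
#     ...
```

Unlike `start`/`rows` paging, each request here costs about the same no matter how far into the result set it is, which is the point of the suggestion when the source index holds many millions of documents.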

Re: DIH Issues

2017-04-25 Thread Sales
> On Apr 25, 2017, at 10:28 AM, AJ Lemke wrote: > > Thanks for the thought Alex! > The fields that have this happen most often are numeric and boolean fields. > These fields have real data (id numbers, true/false, etc.) > > AJ > We had an identical problem a few months ago, and there was no

Re: DIH Issues

2017-04-25 Thread Alexandre Rafalovitch
and boolean fields. >> These fields have real data (id numbers, true/false, etc.) >> >> AJ >> >> -Original Message- >> From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] >> Sent: Tuesday, April 25, 2017 8:27 AM >> To: solr-user >

Re: DIH Issues

2017-04-25 Thread Erick Erickson
se, etc.) > > AJ > > -Original Message- > From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] > Sent: Tuesday, April 25, 2017 8:27 AM > To: solr-user > Subject: Re: DIH Issues > > Maybe the content gets simplified away between the database and the Solr >

RE: DIH Issues

2017-04-25 Thread AJ Lemke
: solr-user Subject: Re: DIH Issues Maybe the content gets simplified away between the database and the Solr schema. For example if your field contains just spaces and you have UpdateRequestProcessors to do trim and removal of empty fields? Schemaless mode will remove empty fields, but will not

Re: DIH Issues

2017-04-25 Thread Alexandre Rafalovitch
Maybe the content gets simplified away between the database and the Solr schema. For example if your field contains just spaces and you have UpdateRequestProcessors to do trim and removal of empty fields? Schemaless mode will remove empty fields, but will not trim for example. Regards, Alex. -

Re: DIH delta import with cache 5.3.1 issue

2017-03-16 Thread Sujay Bawaskar
Thanks Alex. I will test it with 5.4 and 6.4 and let you know. On Thu, Mar 16, 2017 at 7:40 PM, Alexandre Rafalovitch wrote: > You have nested entities and accumulate the content of the inner > entities in the outer one with caching on an inner one. Your > description sounds like the inner cache

Re: DIH delta import with cache 5.3.1 issue

2017-03-16 Thread Alexandre Rafalovitch
You have nested entities and accumulate the content of the inner entities in the outer one with caching on an inner one. Your description sounds like the inner cache is not reset on the next iteration of the outer loop. This may be connected to https://issues.apache.org/jira/browse/SOLR-7843 (Fixe

Re: DIH delta import with cache 5.3.1 issue

2017-03-16 Thread Sujay Bawaskar
This behaviour is for delta import only. One document get field values of all documents. These fields are child entities which maps column to multi valued fields. On Thu, Mar 16, 2017 at 6:35 PM, Alexandre Rafalovitch wrote: > Could you give a bit more details. Do y

Re: DIH delta import with cache 5.3.1 issue

2017-03-16 Thread Alexandre Rafalovitch
Could you give a bit more details. Do you mean one document gets the content of multiple documents? And only on delta? Regards, Alex On 16 Mar 2017 8:53 AM, "Sujay Bawaskar" wrote: Hi, We are using DIH with cache(SortedMapBackedCache) with solr 5.3.1. We have around 2.8 million documents i

Re: DIH Full Index Issue

2017-03-09 Thread Alexandre Rafalovitch
> adtype > 2017-03-09 13:41:00.053 INFO (qtp2080166188-41928) [c:collectionXXX s:shard1 > r:core_node1 x:collectionXXX_shard1_replica2] o.a.s.c.S.Request > [collectionXXX_shard1_replica2] webapp=/solr path=/schema params={wt=json} > status=0 QTime=0 > > > > AJ >

RE: DIH Full Index Issue

2017-03-09 Thread AJ Lemke
params={wt=json} status=0 QTime=0 AJ -Original Message- From: Alexandre Rafalovitch [mailto:arafa...@gmail.com] Sent: Wednesday, March 8, 2017 9:33 AM To: solr-user Subject: Re: DIH Full Index Issue Are you perhaps indexing at the same time from the source other than DIH? Because th

Re: DIH Full Index Issue

2017-03-08 Thread Alexandre Rafalovitch
Are you perhaps indexing at the same time from the source other than DIH? Because the commit is global and all the changes from all the sources will become visible. Check the access logs perhaps to see the requests to /update handler or similar. Regards, Alex. http://www.solr-start.com/

Re: DIH: last_index_time not updated on if 0 docs updated

2017-02-27 Thread Erick Erickson
Seems like a legitimate request, if you can't find a JIRA feel free to open one. And if you wanted to supply a patch, _well_ ;) On Mon, Feb 27, 2017 at 10:37 AM, xavier jmlucjav wrote: > Hi, > > After getting our interval for calling delta index shorter and shorter, I > have found out that last_

Re: DIH - Parent-Child-Problems - GrapQuery-Or-BlockJoin - Order with Orderlines

2017-02-01 Thread Mikhail Khludnev
Have you checked https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-BlockJoinQueryParsers ? On 1 Feb 2017 at 10:42, "Kent Iversen" wrote: > I'm a newbie to Solr and can't seem to get this to work, properly. Gonna > use Order with Orderlines as an example. >

Re: DIH do not work. Child-entity cannot refer parent's id.

2017-01-23 Thread Keiichi MORITA
Resolved. My problem occurred because of the case-sensitive. I've read the source code of Solr-6.3 and found a code referencing metadata of databases, so I finally noticed that Oracle Database returns *UPPERCASE* letters from metadata. As a correct setting, in the where clause of the query calle

Re: DIH do not work. Child-entity cannot refer parent's id.

2017-01-21 Thread Keiichi MORITA
Hi Shawn, Thank you for helpful information and suggestions. > Are you using the Oracle JVM? This is recommended. Version 1.8.x (Java > 8) > is required for Solr 6.3.0. I'm using Oracle Java 8 (1.8.0_111). In response to your advice, I've changed the logging level for JdbcDataSource to DEBUG

Re: DIH do not work. Child-entity cannot refer parent's id.

2017-01-20 Thread Shawn Heisey
On 1/20/2017 7:40 AM, Shawn Heisey wrote: > One thing you might want to try doing is enclosing the property > ${books.book_id} in single quotes. The example configs on the > dataimport wiki page have the properties referenced from parent > entities surrounded by single quotes: A second look reveal

Re: DIH do not work. Child-entity cannot refer parent's id.

2017-01-20 Thread Shawn Heisey
On 1/20/2017 5:45 AM, Keiichi MORITA wrote: > DataImportHandler *can't* work out with Oracle 12c and Solr 6.3. > Query in nested entities are called, the mapping values are not in child's > WHERE clause. > What is the cause of this error? I want some help. > > > ## data-config.xml > > >

Re: DIH Commit Issue

2016-12-22 Thread Erick Erickson
I would set the times in the autoCommit to a large number (or -1 I think). It's possible that there's a default there if the autocommit section is found but nothing specified, you'll have to look at the code to be sure. But what I would do is use aliasing (either core if you're in stand-alone or c

Re: DIH problem with multiple (types of) resources

2016-11-15 Thread Peter Blokland
hi, On Tue, Nov 15, 2016 at 02:54:49AM +1100, Alexandre Rafalovitch wrote: >> >> > Attribute names are case sensitive as far as I remember. Try > 'dataSource' for the second definition. oh wow... that's sneaky. in the old version the case didn't seem to matter, but now it certainly d

Re: DIH problem with multiple (types of) resources

2016-11-14 Thread Alexandre Rafalovitch
On 15 November 2016 at 02:19, Peter Blokland wrote: > > Attribute names are case sensitive as far as I remember. Try 'dataSource' for the second definition. Regards, Alex. Solr Example reading group is starting November 2016, join us at http://j.mp/SolrERG Newsletter and

Re: DIH Delete with Full Import

2016-05-31 Thread nikosmarinos
Thank you Kiran. Simple and nice. I lost a day today trying to make the delta-import work. -- View this message in context: http://lucene.472066.n3.nabble.com/DIH-Delete-with-Full-Import-tp4040070p4279981.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: DIH Schedule Solr 6

2016-04-21 Thread Shawn Heisey
On 4/21/2016 5:25 AM, Mahmoud Almokadem wrote: > We have a cluster of solr 4.8.1 installed on tomcat servlet container and > we’re able to use DIH Schedule by adding this lines to web.xml of the > installation directory: > > > > org.apache.solr.handler.dataimport.scheduler.ApplicationListe

Re: DIH with Nested Documents - Configuration Issue

2016-04-14 Thread Mikhail Khludnev
Giving child="true" Solr 5.5 creates a documents block with implicit relations across parent and nested children. These later retrievable via https://cwiki.apache.org/confluence/display/solr/Other+Parsers#OtherParsers-BlockJoinQueryParsers only. Giving the fact you run 4.10 I don't think you really

RE: DIH Caching w/ BerkleyBackedCache

2015-12-16 Thread Dyer, James
yer Ingram Content Group -Original Message- From: Todd Long [mailto:lon...@gmail.com] Sent: Wednesday, December 16, 2015 10:21 AM To: solr-user@lucene.apache.org Subject: RE: DIH Caching w/ BerkleyBackedCache James, I apologize for the late response. Dyer, James-2 wrote > With the DIH

RE: DIH Caching w/ BerkleyBackedCache

2015-12-16 Thread Todd Long
James, I apologize for the late response. Dyer, James-2 wrote > With the DIH request, are you specifying "cacheDeletePriorData=false" We are not specifying that property (it looks like it defaults to "false"). I'm actually seeing this issue when running a full clean/import. It appears that the

RE: DIH Caching w/ BerkleyBackedCache

2015-11-20 Thread Dyer, James
or getting it to work. James Dyer Ingram Content Group -Original Message- From: Todd Long [mailto:lon...@gmail.com] Sent: Tuesday, November 17, 2015 8:11 AM To: solr-user@lucene.apache.org Subject: Re: DIH Caching w/ BerkleyBackedCache Mikhail Khludnev wrote > It's worth to me

Re: DIH Caching w/ BerkleyBackedCache

2015-11-17 Thread Todd Long
Mikhail Khludnev wrote > It's worth to mention that for really complex relations scheme it might be > challenging to organize all of them into parallel ordered streams. This will most likely be the issue for us which is why I would like to have the Berkley cache solution to fall back on, if possib

Re: DIH Caching w/ BerkleyBackedCache

2015-11-16 Thread Mikhail Khludnev
On Mon, Nov 16, 2015 at 5:08 PM, Todd Long wrote: > Mikhail Khludnev wrote > > "External merge" join helps to avoid boilerplate caching in such simple > > cases. > > Thank you for the reply. I can certainly look into this though I would have > to apply the patch for our version (i.e. 4.8.1). I re

Re: DIH Caching w/ BerkleyBackedCache

2015-11-16 Thread Todd Long
Mikhail Khludnev wrote > "External merge" join helps to avoid boilerplate caching in such simple > cases. Thank you for the reply. I can certainly look into this though I would have to apply the patch for our version (i.e. 4.8.1). I really just simplified our data configuration here which actually

Re: DIH Caching w/ BerkleyBackedCache

2015-11-13 Thread Mikhail Khludnev
Hello Todd, "External merge" join helps to avoid boilerplate caching in such simple cases. it should be something On Fri, Nov 13, 2015 at 10:54 PM, Todd Long wrote: > We currently index using DIH along with the SortedMapBackedCache cache > implementation which has worked wel

Re: DIH Caching with Delta Import

2015-11-03 Thread Todd Long
Erick Erickson wrote > Have you considered using SolrJ instead of DIH? I've seen > situations where that can make a difference for things like > caching small tables at the start of a run, see: > > searchhub.org/2012/02/14/indexing-with-solrj/ Nice write-up. I think we're going to move to that ev

Re: DIH Caching with Delta Import

2015-10-25 Thread Erick Erickson
Have you considered using SolrJ instead of DIH? I've seen situations where that can make a difference for things like caching small tables at the start of a run, see: searchhub.org/2012/02/14/indexing-with-solrj/ Best, Erick On Sat, Oct 24, 2015 at 6:17 PM, Todd Long wrote: > Dyer, James-2 wrot

RE: DIH Caching with Delta Import

2015-10-24 Thread Todd Long
Dyer, James-2 wrote > The DIH Cache feature does not work with delta import. Actually, much of > DIH does not work with delta import. The workaround you describe is > similar to the approach described here: > https://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport , > which in my op

RE: DIH Caching with Delta Import

2015-10-21 Thread Dyer, James
The DIH Cache feature does not work with delta import. Actually, much of DIH does not work with delta import. The workaround you describe is similar to the approach described here: https://wiki.apache.org/solr/DataImportHandlerDeltaQueryViaFullImport , which in my opinion is the best way to i

RE: DIH parallel processing

2015-10-15 Thread Davis, Daniel (NIH/NLM) [C]
This is also what I have done, but I agree with the notion of using something external to load the data. -Original Message- From: Dyer, James [mailto:james.d...@ingramcontent.com] Sent: Thursday, October 15, 2015 9:24 AM To: solr-user@lucene.apache.org Subject: RE: DIH parallel

RE: DIH parallel processing

2015-10-15 Thread Dyer, James
Nabil, What we do is have multiple dih request handlers configured in solrconfig.xml. Then in the sql query we put something like "where mod(id, ${partition})=0". Then an external script calls a full import on each request handler at the same time and monitors the response. This isn't the mo
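
The "external script" part of the approach above might look roughly like this in Python. It is a sketch under assumptions: the handler names (`/dataimport0`, ...) and the core URL are hypothetical, `clean=false` is added here on the assumption that one partition's import should not wipe documents written by the others, and the `status` value `busy` is assumed to be what a running import reports. The SQL partitioning itself stays in each handler's data-config, as described in the message.

```python
import json
import time
import urllib.request
from concurrent.futures import ThreadPoolExecutor


def import_urls(core_url, handlers, command="full-import"):
    """Build one DIH command URL per handler (handler names are hypothetical)."""
    base = core_url.rstrip("/")
    return [f"{base}{h}?command={command}&clean=false&wt=json" for h in handlers]


def get_json(url):
    """Fetch a URL and parse the JSON response."""
    with urllib.request.urlopen(url) as resp:
        return json.load(resp)


def start_all(core_url, handlers, fetch=get_json):
    """Kick off the import on every handler at the same time."""
    urls = import_urls(core_url, handlers)
    with ThreadPoolExecutor(max_workers=max(1, len(urls))) as pool:
        return list(pool.map(fetch, urls))


def wait_until_idle(core_url, handlers, poll_seconds=5, fetch=get_json):
    """Poll command=status on each handler until none reports 'busy'."""
    base = core_url.rstrip("/")
    pending = set(handlers)
    while pending:
        for handler in list(pending):
            status = fetch(f"{base}{handler}?command=status&wt=json")
            if status.get("status") != "busy":
                pending.discard(handler)
        if pending:
            time.sleep(poll_seconds)


# Hypothetical usage with four partitioned handlers:
# handlers = [f"/dataimport{i}" for i in range(4)]
# start_all("http://localhost:8983/solr/core1", handlers)
# wait_until_idle("http://localhost:8983/solr/core1", handlers)
```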

Re: DIH parallel processing

2015-10-15 Thread Charlie Hull
On 15/10/2015 09:57, nabil Kouici wrote: Hi All, I'm using DIH to index more than 15M from Sql Server to Solr. This takes more than 2 hours. A large part of this time is consumed by fetching data from the database. I'm thinking about a solution to have parallel (threaded) loading in the same DIH. Each thr

Re: DIH FileDataSource and delta-import

2015-09-09 Thread Shawn Heisey
On 9/9/2015 4:27 PM, Scott Derrick wrote: > I can't seem to get delta-imports to work with a FileDataSource DIH The information I have says delta-import won't work with that kind of entity. http://wiki.apache.org/solr/DataImportHandler#Using_delta-import_command-1 Also, please make note of this:

Re: DIH delta-import pk

2015-08-24 Thread CrazyDiamond
i have autogenerated uuid for each document in solr. it is not marked as uniquefield. i add uuid in config to generate uuid when i add document from client. But now each time i update document uuid is changed. -- View this message in context: http://lucene.47

Re: DIH delta-import pk

2015-08-23 Thread CrazyDiamond
i don't use SQL now. i'm adding documents manually. db_id_s -- View this message in context: http://lucene.472066.n3.nabble.com/DIH-delta-import-pk-tp4224342p4224762.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: DIH delta-import pk

2015-08-23 Thread William Bell
Send the SQL and Schema.xml. Also logs. Does it complain about _id_ or you field in schema? On Sun, Aug 23, 2015 at 4:55 AM, CrazyDiamond wrote: > Now I set db id as unique field and uuid field,which should be generated > automatically as required. but when i add document i have an error tha

Re: DIH delta-import pk

2015-08-23 Thread CrazyDiamond
Now I set db id as unique field and uuid field,which should be generated automatically as required. but when i add document i have an error that my required uuid field is missing. -- View this message in context: http://lucene.472066.n3.nabble.com/DIH-delta-import-pk-tp4224342p4224701.html Sen

Re: DIH delta-import pk

2015-08-23 Thread CrazyDiamond
As far as I understand I can't use 2 unique fields. I need db id and uuid because I am moving data from database to solr index entirely. And temporarily I need it to be compatible with delta-import, but in future I will use only the new uuid. -- View this message in context: http://lucene.472066.n3.nabble

Re: DIH delta-import pk

2015-08-21 Thread Erick Erickson
"use 2 unique fields" to do what? Solr replaces older docs with newer docs based _solely_ on the <uniqueKey> defined in schema.xml. There is no notion of "compound unique key" like there can be in a database. You could concatenate the PK and a uuid, but what would be the point? Since the uuid (presumably) c
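
To illustrate the point about a single unique key, here is a toy model in Python. It is not Solr code: the index is just a dict keyed by an assumed uniqueKey field `id`, and the field names `uuid` and `title` are made up. A second field such as `uuid` plays no role in deciding whether an incoming document replaces an existing one.

```python
def add_documents(index, docs, unique_key="id"):
    """Add docs to a dict-backed toy index; a repeated key replaces the older doc."""
    for doc in docs:
        index[doc[unique_key]] = doc
    return index


index = add_documents({}, [
    {"id": "42", "uuid": "a", "title": "first version"},
    {"id": "42", "uuid": "b", "title": "second version"},  # same id: replaces the first
    {"id": "43", "uuid": "c", "title": "another doc"},
])
```

After this runs, the toy index holds two documents, and the one with `id` 42 is the second version even though its `uuid` differs from the first.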

Re: DIH delta-import pk

2015-08-21 Thread CrazyDiamond
ok, can I use 2 unique fields one with uuid and one with db id? what will happened then? -- View this message in context: http://lucene.472066.n3.nabble.com/DIH-delta-import-pk-tp4224342p4224395.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: DIH delta-import pk

2015-08-20 Thread Shawn Heisey
On 8/20/2015 4:27 PM, CrazyDiamond wrote: > i have a DIH delta-import query based on last_index_time.it works perfectly > But sometimes i add documents to Solr manually and i want DIH not to add > them again.I have UUID unique field and also i have "id" from database which > is marked as pk in DI

Re: DIH solr cloud

2015-07-29 Thread Erick Erickson
Just pick a node to run it on. I vastly prefer, though, using a SolrJ client, here's a sample: https://lucidworks.com/blog/indexing-with-solrj/ Best, Erick On Wed, Jul 29, 2015 at 4:37 AM, Midas A wrote: > Hi, > > I have to create DIH with solr cloud shared with multi node architecture > for s

Re: DIH question: importing string containing comma-delimited list into a multiValued field

2015-07-17 Thread Shawn Heisey
On 7/17/2015 8:23 AM, Bill Au wrote: > One of my database column is a varchar containing a comma-delimited list of > values. I would like to import these values into a multiValued field. I > figure that I will need to write a ScriptTransformer to do that. Is there > a better way? DIH provides th

Re: DIH Not Indexing Two Documents

2015-07-15 Thread Paden
You were 100 percent right. I went back and checked the metadata looking for multiple instances of the same file path. Both of the files had an extra set of metadata with the same filepath. Thank you very much. -- View this message in context: http://lucene.472066.n3.nabble.com/DIH-Not-Indexin

Re: DIH Not Indexing Two Documents

2015-07-15 Thread Erick Erickson
My first guess is that somehow these two documents have the same <uniqueKey> as some other documents so later docs are replacing earlier docs. Although not conclusive, looking at the admin page for the cores in question may show numDocs=278 and maxDoc=280 or some such in which case that would be what's happenin
