DIH documents not indexed because of loss in XSL transformation.

2013-12-10 Thread jerome . dupont
Hello, I'm indexing XML files with XPathEntityProcessor, and a few hundred documents out of 12 million are not processed. When I tried to index just one of the failing documents, it was not indexed either, so it's not a matter of the large number of documents. We tried to do the XSLT transformation external

using facet enum and fc in the same query.

2014-09-22 Thread jerome . dupont
Hello, I have a Solr index (12 M docs, 45 GB) with facets, and I'm trying to improve facet query performance. 1/ I tried to use docValues on the facet fields; it didn't work well. 2/ I tried facet.threads=-1 in my query, and it worked perfectly (from more than 15 s down to 2 s for the longest queries). 3/ I'm tryi

RE: RE: using facet enum and fc in the same query.

2014-09-23 Thread jerome . dupont
tp://tokee.github.io/lucene-solr/ Right now we use Solr 4.6 and will deliver our release soon, so I'm afraid I won't have time to try it this time, but I can try for the next release (next month, I think). Thanks very much again, Jerome Dupont

[SOLR 4.4 or 4.2] indexing with dih and solrcloud

2013-08-29 Thread jerome . dupont
Hello, I'm trying to index documents with the Data Import Handler and SolrCloud at the same time (huge collection, so indexing needs to run in parallel). First, I had a DIH configuration which works with standalone Solr (it has been indexing every week for two months). I've transformed my configuration to "cloudify"

Re: Re: [SOLR 4.4 or 4.2] indexing with dih and solrcloud

2013-08-29 Thread jerome . dupont
Hello again. Finally, I found the problem. It seems that: _ The indexing request was sent with an HTTP GET and not a POST, because I was launching it from a bookmark in my browser. Launching the indexing of my documents from the admin interface made it work. _ Another problem was that som

solr cloud and DIH, indexation runs only on one shard.

2013-09-03 Thread jerome . dupont
Hello again, I'm still trying to index with SolrCloud and DIH. I can index, but it seems that indexing is done on only 1 shard (my goal was to parallelize it to go faster). This is my configuration: I have 2 Tomcat instances. One with ZooKeeper embedded in Solr 4.4.0 started and 1 shard (port 8080). The othe

Re: solr cloud and DIH, indexation runs only on one shard.

2013-09-03 Thread jerome . dupont
It works! I've done what you said: _ In my query to get the list of documents, I added a where clause filtering the select that gets the documents to index: where noticebib.numnoticebib LIKE '%${dataimporter.request.suffixeNotice}'" _ And I called my DIH on each shard with the parameter suffixeNotice

[DIH] Logging skipped documents

2013-09-23 Thread jerome . dupont
Hello, I have a question. I index documents and a small part of them are skipped (I am in onError="skip" mode). I'm trying to get a list of them in order to analyse what's wrong with these documents. Is there a way to get the list of skipped documents, and some more information (my onError="skip" i

error while indexing huge filesystem with data import handler and FileListEntityProcessor

2013-05-24 Thread jerome . dupont
Hello, we are trying to use the Data Import Handler, in particular on a collection which contains many files (one XML file per document). Our configuration works for a small number of files, but the data import fails with an OutOfMemoryError when running it on 10M files (in several directories...). This is it

Re: Re: error while indexing huge filesystem with data import handler and FileListEntityProcessor

2013-05-29 Thread jerome . dupont
The configuration works with LineEntityProcessor, with a few documents (I haven't tested with many documents yet). For information, this is the config ... fields de

[DIH] Using SqlEntity to get a list of files and read files in XpathEntityProcessor

2013-05-30 Thread jerome . dupont
Hello, I want to index a huge list of XML files. _ Using FileListEntityProcessor causes an OutOfMemoryException (too many files...). _ I can do it using a LineEntityProcessor reading a list of files generated externally, but I would prefer to generate the list in Solr. _ So to avoid to mantai

Re: Re: [DIH] Using SqlEntity to get a list of files and read files in XpathEntityProcessor

2013-05-30 Thread jerome . dupont
Hi, thanks for your answer, it helped me move forward. The name of the entity was wrong: it was not consistent with the schema. Now the first entity works fine: the query is sent to the database and returns the correct result. The problem is that the second entity, which is an XPathEntityProcessor entity, doesn't r

RE: [DIH] Using SqlEntity to get a list of files and read files in XpathEntityProcessor

2013-05-31 Thread jerome . dupont
Thanks very much, it works with dataSource (capital S)! In the end, I didn't have to define a "CHEMINRELATIF" field in the configuration; it works without it. This is the definitive working configuration:

Issue with dataimport xml validation with dtd and jetty: conflict of use for user.dir variable

2019-02-08 Thread jerome . dupont
Hello, I use Solr and the data import handler to index XML files with a DTD. The DTD is referenced like this. Previously, we were using Solr 4 in a Tomcat container. During the import process, Solr tries to validate the XML file against the DTD. To find it, we were defining -Duser.dir=pathToDtD and Solr could find