Help extracting text from PDF images when indexing files

2019-05-03 Thread Miguel Fernandes
Any help on how to correctly extract the text from the PDF images would be great. Thanks Miguel

Doubt about facet with dates

2017-10-06 Thread Miguel Valencia Zurera
Hi, I need to get faceted results by a date field. There must be two facets: 1) all values lower than the system date, and 2) all values greater than the system date. Is it possible to get these two facets? I've been reading the Solr wiki about facet.date and facet.range but I haven't found a good solution. Any idea
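
One possible approach, offered as a hedged sketch and not as an answer given in the thread, is to send two facet.query parameters that use Solr date math relative to NOW. The field name `my_date` and the helper name `date_split_facet_params` are hypothetical. Both ranges are inclusive here, so a value exactly equal to NOW would be counted in both buckets.

```python
from urllib.parse import urlencode


def date_split_facet_params(field):
    """Build Solr request parameters with two facet queries:
    one counting values up to NOW, one counting values from NOW on."""
    return [
        ("q", "*:*"),
        ("rows", "0"),  # only the facet counts are wanted
        ("facet", "true"),
        ("facet.query", f"{field}:[* TO NOW]"),  # values before the system date
        ("facet.query", f"{field}:[NOW TO *]"),  # values after the system date
    ]


if __name__ == "__main__":
    # "my_date" is a hypothetical field name.
    print(urlencode(date_split_facet_params("my_date")))
```

The printed string can be appended to a select URL; each facet.query comes back as its own count in the facet_queries section of the response.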

Re: High disk write usage

2017-07-11 Thread Antonio De Miguel
Thanks Shawn! I will try to change the values of those parameters. 2017-07-10 14:57 GMT+02:00 Shawn Heisey : > On 7/10/2017 2:57 AM, Antonio De Miguel wrote: > > I continue digging into this problem... high write rates continue. > > > > Searching in the logs I see this:

Re: High disk write usage

2017-07-10 Thread Antonio De Miguel
=1.784 MB docs/MB=244.978 A flush happens every 10 seconds (my autoSoftCommit time is 10 secs and the hard commit is 5 minutes). Is this the expected behaviour? I thought soft commits do not write to disk... 2017-07-06 0:02 GMT+02:00 Antonio De Miguel : > Hi Erick. > > What I want to say is tha

Help with updateHandler commit stats

2017-07-07 Thread Antonio De Miguel
Hi, I'm taking a look at the UpdateHandler stats... and I see that when an autoSoftCommit occurs (every 10 secs) both metrics, "commits" and "soft autocommits", increment by one. Is this normal? My config is: autoCommit: 180 secs autoSoftCommit: 10 secs Thanks!

Re: High disk write usage

2017-07-05 Thread Antonio De Miguel
> How much physical memory does the machine have and how much memory is > allocated to _all_ of the JVMs running on that machine? > > see: http://blog.thetaphi.de/2012/07/use-lucenes-mmapdirectory- > on-64bit.html > > Best, > Erick > > > On Wed, Jul 5, 2017 at 9:

Re: High disk write usage

2017-07-05 Thread Antonio De Miguel
isking > your data. That might be a quick experiment you could run though, > disable tlogs and see what that changes. Of course I'd do this on my > test system ;). > > But yeah, Solr will use a lot of I/O in the scenario you are outlining > I'm afraid. > > Best, >

Re: High disk write usage

2017-07-05 Thread Antonio De Miguel
ds and consider better hardware > (SSD's) > > -Original message- > > From:Antonio De Miguel > > Sent: Wednesday 5th July 2017 16:48 > > To: solr-user@lucene.apache.org > > Subject: Re: High disk write usage > > > > Thanks a lot Alessandro! &

Re: High disk write usage

2017-07-05 Thread Antonio De Miguel
Thanks a lot Alessandro! Yes, we have very big dedicated physical machines, with a topology of 5 shards and 10 replicas per shard. 1. Transaction log files are growing, but not at this rate. 2. We've tried values between 300 and 2000 MB... without any visible results. 3. We don't us

High disk write usage

2017-07-05 Thread Antonio De Miguel
Hi, We are implementing a SolrCloud cluster (version 6.6) with NRT requirements. We are indexing 600 docs/sec with peaks of 1500 docs/sec, and we are serving about 1500 qps. Our documents have 300 fields with some doc values, are about 4 KB each, and we have 3 million documents. HardCommit is set to 15 minutes

distinc date format on parameter lastModified in LukeRequestHandler

2016-06-03 Thread Miguel Valencia Zurera
Hi everybody, I have two installations of Apache Solr 3.5.0, and when I check the "/admin/luke" page I see that the lastModified parameter has a different format in each. The first shows: 2016-05-20T13:03:03Z and the second shows: name="lastModified">2016-05-20T13:03:03.593Z Why does the second Solr show m

Re: Search over XML data using xpath

2016-04-07 Thread Miguel Valencia Zurera
siren/overview/ - I believe they indexed XML Regards, Alex. Newsletter and resources for Solr beginners and intermediates: http://www.solr-start.com/ On 1 April 2016 at 19:41, Miguel Valencia Zurera wrote: Hi everybody, I'm looking for a way to store an XML file and keep the hierarchy o

Search over XML data using xpath

2016-04-01 Thread Miguel Valencia Zurera
Hi everybody, I'm looking for a way to store an XML file and keep the hierarchy of the data, because I need to show the full XML and also search inside the XML nodes. I have only found XPathEntityProcessor for importing XML, but it does not keep the hierarchy of the data. I have not found one ty

Re: filters to work with dates

2016-02-04 Thread Miguel Valencia Zurera
Hi Markus, At first, I thought of keeping the original field and creating a new field using the "Copying Fields " function. For this reason, I thought it was a better choice to use a filter function on the destination field. However, I am going to stud

filters to work with dates

2016-02-02 Thread Miguel Valencia Zurera
Hi everybody, I'm looking for a filter or similar function to solve the following problem in my Solr index: I have a string field that contains a date, but each record of this field can be in a different format. Now I have to sort by this field, and for this I have to normalize it. I've thou
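
As a rough illustration of the normalization step described here, the sketch below converts mixed date strings into one sortable UTC timestamp format before indexing. It runs on the client side and is not a Solr filter. The list of candidate formats is hypothetical (the thread does not say which formats occur), and input values are assumed to already be in UTC.

```python
from datetime import datetime, timezone

# Hypothetical formats; replace with the ones that actually occur in the data.
CANDIDATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%Y%m%d", "%d-%m-%Y %H:%M:%S"]


def normalize_date(raw):
    """Return raw as a Solr-style UTC timestamp string,
    or None if none of the candidate formats matches."""
    for fmt in CANDIDATE_FORMATS:
        try:
            parsed = datetime.strptime(raw.strip(), fmt)
        except ValueError:
            continue  # try the next format
        # Assumption: the source values are already expressed in UTC.
        return parsed.replace(tzinfo=timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return None


if __name__ == "__main__":
    for value in ["2016-02-02", "02/02/2016", "04-02-2016 10:30:00", "unknown"]:
        print(value, "->", normalize_date(value))
```

The normalized value could then be indexed into a separate date field used only for sorting, leaving the original string field untouched.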

Re: About enableLazyFieldLoading and memory

2014-03-19 Thread Miguel
An interesting check would be to disable compression on stored fields and see if your searcher works better. Disabling compression should increase the stored size, and the searcher should be quicker. I have read that to disable compression all you need to do is write a new codec that uses a stored fields f

Re: About enableLazyFieldLoading and memory

2014-03-18 Thread Miguel
Hi David, If you use lazy field loading (/enableLazyFieldLoading=true/), /documentCache/ functionality is somewhat limited. This means that the document stored in the /documentCache/ will contain only those fields that were passed to the /fl/ parameter. /documentCache/ requires memory, the

Re: Debugging Solr XSL

2013-06-14 Thread Miguel
Hi, You can use an online XSL validator, for example: http://xslttest.appspot.com/ but I think it's better to use an XSLT editor. Visual Studio surely has one. Regards. On 13/06/2013 23:45, O. Olson wrote: Hi, I am attempting to transform the XML output of Solr using the Xs

Re: conditional queries?

2013-04-09 Thread Miguel
I'm not sure, but you can create a class that extends SearchComponent and include it last in your request handler, and in this way add optional actions on any query to your Solr server. Example solrconfig.xml actions Regards On 09/04/2013 17:05, Walter Underw

Re: Group By and Sum

2013-03-18 Thread Miguel
Hi Adam, Have you seen the wiki page about field collapsing? http://wiki.apache.org/solr/FieldCollapsing I think this page will help you emulate GROUP BY. On 18/03/2013 17:48, Adam Harris wrote: Hello All, Pretty stuck here and I am hoping you might be the person to help me out. I am working

Re: xml output question

2013-03-13 Thread Miguel
Hi jazz, You can use the "wt" and "tr" parameters for XSLT transformation, for example: &wt=xslt&tr=test.xsl so you can generate whatever XML output you want from Solr. On 12/03/2013 20:22, jazz wrote: Hi Michael, Thanks for the reply. My question is how to make child XML elements such as: These c
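
To make the parameter usage concrete, here is a minimal sketch that builds a select URL with the "wt" and "tr" parameters mentioned in the message. The base URL, the query, and the helper name `xslt_select_url` are assumptions for illustration; only `wt=xslt` and `tr=test.xsl` come from the thread.

```python
from urllib.parse import urlencode


def xslt_select_url(base_url, query, stylesheet):
    """Build a Solr select URL that asks for the XSLT response writer
    and names the stylesheet to apply."""
    params = {"q": query, "wt": "xslt", "tr": stylesheet}
    return f"{base_url}/select?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical host and port.
    print(xslt_select_url("http://localhost:8983/solr", "*:*", "test.xsl"))
```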

Re: incorrect datetime to use now function

2013-03-07 Thread Miguel
Hi Cris, I've used the TZ param with UpdateXmlMessages, and the data are indexed with the default GMT value in the timestamp field, one hour behind the current system date. I am testing with the curl command, like this: /curl http://10.240.234.133:8080/solr/update?TZ=Europe/Madrid --data-binary @data.xml -H "

Re: incorrect datetime to use now function

2013-03-05 Thread Miguel
Thanks Cris, I've used this configuration for my timestamp field and it works: default="NOW+1HOUR" multiValued="false"/> Anyway, I would like to know the possible configurations of the TZ parameter. When you say "clients can specify a "TZ" param", does this mean it is possible to configure the TZ paramete

incorrect datetime to use now function

2013-03-05 Thread Miguel
Hi, I am using a timestamp field configured in the schema like this: default="NOW" multiValued="false"/> When I checked the new data, I saw that the datetime value is one hour behind the current date. I thought it was a problem with the Java configuration, so I included in JAVA_OPTS = -Duser.timezone=Euro
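
A likely explanation, not confirmed in this snippet, is that Solr stores and returns date values in UTC, so a NOW default looks one hour behind a Madrid wall clock in winter. The sketch below converts a Solr timestamp to local time on the client side. It assumes a fixed UTC+1 offset for Europe/Madrid in early March (no daylight saving), and the sample timestamp is made up.

```python
from datetime import datetime, timedelta, timezone

# Assumption: Europe/Madrid is at UTC+1 in early March (winter time).
MADRID_WINTER = timezone(timedelta(hours=1))


def solr_utc_to_local(stamp, tz=MADRID_WINTER):
    """Parse a Solr UTC timestamp (e.g. 2013-03-05T09:00:00Z)
    and return it as an aware datetime in the given time zone."""
    parsed = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
    return parsed.replace(tzinfo=timezone.utc).astimezone(tz)


if __name__ == "__main__":
    # Made-up sample value.
    print(solr_utc_to_local("2013-03-05T09:00:00Z").isoformat())
```

Under this reading the indexed value is correct and only its display needs adjusting, which avoids baking an offset such as NOW+1HOUR into the schema.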

Re: Open 2 ports on Solr3.6 Tomcat6?

2013-03-01 Thread Miguel
Hi, You could do IP routing using the Linux iptables command to redirect requests from port 80 to the Tomcat port. This page explains how to do it: http://forum.slicehost.com/index.php?p=/discussion/2497/iptables-redirect-port-80-to-port-8080/p1 On 01/03/2013 12:43, Bruno Mannina wrote: Dear

Re: get content is put in the index queue but is not committed

2013-02-21 Thread Miguel
Thanks Cris, I'm going to look at both the UpdateLog and RealTimeGetComponent classes, but I'm not sure I can use them because I'm working with Apache Solr version 1.4.1 (I know it's old). Anyway, I'll tell you my problem. I am developing a custom class that extends UpdateRequestProcessorFactory.

Re: How to retrive all terms with their frequency in that website.

2013-02-21 Thread Miguel
Hi, Look at the Luke page in the Solr admin: /admin/luke?show=index That page shows the topTerms, so I suppose it is possible to get the frequency of all terms. On 21/02/2013 12:58, search engn dev wrote: I have indexed data of 10 websites in solr. Now i want to dump data of each website with follo

get content is put in the index queue but is not committed

2013-02-19 Thread Miguel
Hi everybody, Does anybody know how to get content that has been put in the index queue but is not yet committed? I am developing a custom UpdateRequestProcessorFactory, and in the add event I see the documents, and I need to access these same documents in the commit event. When the add event has implicit com

Re: How-to get date of indexing process

2013-02-14 Thread Miguel
Thanks Markus, I didn't know that page. It's all I need. Thanks again On 14/02/2013 10:47, Markus Jelsma wrote: See: admin/luke?show=index or the admin UI. -Original message- From:Miguel Sent: Thu 14-Feb-2013 10:45 To: solr-user@lucene.apache.org Subject: How-to get d

How-to get date of indexing process

2013-02-14 Thread Miguel
Hi everybody, I am looking for a way to get the date of the last indexing process or commit event that happened on my Solr server. I found a possible solution, adding a timestamp field, for example: || But I would like a solution that does not modify the schema of the Solr server. I checked statistics p

Re: how-to configure mysql pool connection on Solr Server

2013-02-08 Thread Miguel
17:35, Michael Della Bitta wrote: Hello Miguel, If you set up a JNDI datasource in your servlet container, you can use that as your database config. Then you just need to use a pooling datasource: http://wiki.apache.org/solr/DataImportHandlerFaq#How_do_I_use_a_JNDI_DataSource.3F http://dev.mysql.com

how-to configure mysql pool connection on Solr Server

2013-02-07 Thread Miguel
Hi, I need to configure a MySQL connection pool on the Solr server for use in a custom plugin. I saw the DataImportHandler wiki: http://wiki.apache.org/solr/DataImportHandler , but it seems that DataImportHandler opens the connection when the handler is called and closes it when the import finishes, and I need to keep

Re: Fwd: advice about develop AbstractSolrEventListener.

2013-02-06 Thread Miguel
nt of handler Solr is using for the update process, and allow including events associated with updated records after the commit event. Thanks On 04/02/2013 9:03, Miguel wrote: Hi everybody, Please, I need to know if anybody has done something similar. I have to develop a notification when commit

Re: Fwd: advice about develop AbstractSolrEventListener.

2013-02-04 Thread Miguel
don't see how to get the updated data associated with the commit event. Thanks for the help On 31/01/2013 9:32, Miguel wrote: Hi, After studying the Apache Solr documentation, I think the only way to know the updated records (modify, delete and insert actions) is to develop a class ex

Re: Fwd: advice about develop AbstractSolrEventListener.

2013-01-31 Thread Miguel
confirm that this is the best way? Or are there other options? Thanks On 30/01/2013 13:39, Miguel wrote: Hi, I have to develop a function that must communicate with a web service, and this function must execute after each commit. My question: is it possible to get the records that had

Fwd: advice about develop AbstractSolrEventListener.

2013-01-30 Thread Miguel
Hi, I have to develop a function that must communicate with a web service, and this function must execute after each commit. My question: is it possible to get the records that have been updated in the Solr index? My function must send information about added, updated and deleted records from the Solr index to exter

Re: Solr 4.1 Maven artifacts.

2013-01-28 Thread Miguel Ángel Martín
Hi Lu: Look at https://repository.apache.org/content/groups/snapshots/org/apache/solr/ ;-) On 28/01/2013 20:10, "Luis Cappa Banda" wrote: > Hello! > > I haven't found Solr 4.1 maven artifacts to update my Solr projects > dependencies. Are they published in the public Mvn repositories? > >

Re: edismax with df qf and alias

2013-01-06 Thread Juan Miguel Cejuela
No, *f.a.qf=a^3* is not a solution as I get the parsing error: "Field aliases lead to a cycle" 2013/1/6 Juan Miguel Cejuela > Hi, > > I have the following exemplified parameters in my edismax query: > > qf=a^3 x^2 y^1 > f.var.qf=x^2 y^1 > df=var > > Whe

edismax with df qf and alias

2013-01-06 Thread Juan Miguel Cejuela
t see it documented. Otherwise, how can I set different scores to parameters while keeping an independent *df*? Like the following? f.a.qf=a^3 f.var.qf=x^2 y^1 df=var Thanks for your help. -- Juan Miguel Cejuela

Re: edismax: implicit AND changes into implicit OR

2013-01-06 Thread Juan Miguel Cejuela
changes the default operator to OR like --> "(xxx OR yyy) OR zzz" 2012/12/13 Jack Krupansky > &debug=query -- Juan Miguel Cejuela

error opening index solr 4.0 with lukeall-4.0.0-ALPHA.jar

2012-11-16 Thread Miguel Ángel Martín
Hi all: I can't open an index created with Solr 4.0 using Luke version lukeall-4.0.0-ALPHA.jar. I get the error: Format version is not supported (resource: NIOFSIndexInput(path="/Users/desa/data/index/_2.tvx")): 1 (needs to be between 0 and 0) at org.apache.lucene.codecs.CodecUtil.checkHeaderN

solr 4.0 error validation in web.xml

2012-10-18 Thread Miguel Ángel Martín
Hi: I'm testing Solr 4 in Eclipse and I uncommented the solr home section in web.xml, but I got this XML validation error: *cvc-complex-type.2.4.a: Invalid content was found starting with element 'env-entry-type'. One of '{"http://java.sun.com/xml/ns/javaee":mapped-name, "http://java.sun.com/xml/**

Re: Boosting in query level the relevance based in content of any fields

2012-09-28 Thread Miguel Ángel Martín
Hi c. Recently I was testing that issue; I use a boost query in Solr 3.6.1 with edismax and it works fine for me. Regards On 28/09/2012 14:44, "Claudio Ranieri" wrote: > Hello, > > How can I boost in query level the relevance of documents based in > content of any fields? Example, I have 5 docum

SolrCloud state

2011-09-21 Thread Miguel Coxo
"promoted" to a master. Or will it remain search only and only recover when a new master is setup. Also how is the document indexing distributed by the shards? Can i add a new shard dynamically? All the best, Miguel Coxo.

add documents to the slave

2011-08-30 Thread Miguel Valencia
Hi, I've read that it's possible to add documents to a slave machine: http://wiki.apache.org/solr/SolrReplication#What_if_I_add_documents_to_the_slave_or_if_slave_index_gets_corrupted.3F Is there any way to prevent adding documents to the slave machine? For example, by changing the configuration files t

Re: SRW/U and OAI-PMH servers over solr

2009-03-26 Thread Miguel Coxo
> > > On Mar 25, 2009, at 3:30 PM, Miguel Coxo wrote: > > Hello there, >> >> I'm looking for a way to implement SRW/U and a OAI-PMH servers over solr, >> similar to what i have found here: >> http://marc.info/?l=solr-dev&m=116405019011211&w

SRW/U and OAI-PMH servers over solr

2009-03-25 Thread Miguel Coxo
calls to the data provider). Any information that you guys can provide is welcome =). -- All the best, Miguel Coxo.