Re: Integrating Oauth2 with Solr MailEntityProcessor

2014-02-12 Thread Dileepa Jayakody
Hi again, Is anybody interested in this feature for the Solr MailEntityProcessor? WDYT? Thanks, Dileepa On Thu, Jan 30, 2014 at 11:00 AM, Dileepa Jayakody < dileepajayak...@gmail.com> wrote: > Hi All, > > I think OAuth2 integration is a valid use case for Solr when it comes to > i

API to get documents imported in dataimport from EventListener.onEvent(Context cntxt)

2014-02-04 Thread Dileepa Jayakody
Hi All, Is there a way to retrieve the documents being imported in a dataimport request from an EventListener configured to run at onImportEnd? I need to get the set of values of the field:content of all the documents imported to perform an enhancement task. Is there a way to retrieve the documents

Re: Concurrency handling in DataImportHandler

2014-01-30 Thread Dileepa Jayakody
, Jan 30, 2014 at 4:13 PM, Dileepa Jayakody wrote: > I would particularly like to know how DIH handles concurrency in JDBC > database connections during dataimport. > > url="jdbc:mysql://localhost:3306/solrtest" user="usr1" password="123" > batchSize

Re: Concurrency handling in DataImportHandler

2014-01-30 Thread Dileepa Jayakody
I would particularly like to know how DIH handles concurrency in JDBC database connections during dataimport. Thanks, Dileepa On Thu, Jan 30, 2014 at 4:05 PM, Dileepa Jayakody wrote: > Hi All, > > Can I please know how concurrency is handled in the DIH? > What happens

Concurrency handling in DataImportHandler

2014-01-30 Thread Dileepa Jayakody
Hi All, Can I please know how concurrency is handled in the DIH? What happens if multiple /dataimport requests are issued to the same Datasource? I'm doing some custom processing at the end of the dataimport process in an EventListener configured in the data-config.xml as below. Will each DI

Re: Integrating Oauth2 with Solr MailEntityProcessor

2014-01-29 Thread Dileepa Jayakody
also be a good project for GSoC 2014. Thanks, Dileepa On Wed, Jan 29, 2014 at 3:57 PM, Dileepa Jayakody wrote: > Hi All, > > I'm doing a research project on Email Reputation Analysis, and for this > project I'm planning to use Apache Solr, Tika and Mahout projects to >

Integrating Oauth2 with Solr MailEntityProcessor

2014-01-29 Thread Dileepa Jayakody
Hi All, I'm doing a research project on Email Reputation Analysis, and for this project I'm planning to use the Apache Solr, Tika and Mahout projects to analyse, store and query the reputation of emails and correspondents. For indexing emails in Solr I'm going to use the MailEntityProcessor [1]. But I s

Re: How to run a subsequent update query to documents indexed from a dataimport query

2014-01-27 Thread Dileepa Jayakody
Hi All, I have implemented my requirement as an EventListener which runs on importEnd of the DataImportHandler. I'm running a SolrJ-based client within my EventListener to send Stanbol enhancement updates to the documents. Thanks, Dileepa On Mon, Jan 27, 2014 at 4:34 PM, Dileepa Jayakody

Re: How to run a subsequent update query to documents indexed from a dataimport query

2014-01-27 Thread Dileepa Jayakody
the changes from your > client would be visible faster. > > Alternately you could do the same thing at the DIH level by writing a > custom Transformer ( > http://wiki.apache.org/solr/DataImportHandler#Writing_Custom_Transformers) > > > On Mon, Jan 27, 2014 a

Re: What is the last_index_time in dataimport.properties?

2014-01-26 Thread Dileepa Jayakody
Hi, > > last_index_time traditionally is used to query Database. But it seems that > you want to query solr, right? > > > > > On Sunday, January 26, 2014 11:15 PM, Dileepa Jayakody < > dileepajayak...@gmail.com> wrote: > Hi Ahmet, > > Thanks a lot. > It means I can us

Re: How to run a subsequent update query to documents indexed from a dataimport query

2014-01-26 Thread Dileepa Jayakody
a timestamp-based approach to trigger a /update query to all documents imported after the last_index_time in dataimport.properties and update them with NLP fields. Hope my requirement is clear :). Appreciate your suggestions. [1] http://stanbol.apache.org/ > > > > > On Sunday, January

Re: What is the last_index_time in dataimport.properties?

2014-01-26 Thread Dileepa Jayakody
Hi Dileepa, > > It is the time that the last dataimport process started. So it is safe to > use it when considering updated documents during the import. > > Ahmet > > > > On Sunday, January 26, 2014 9:10 PM, Dileepa Jayakody < > dileepajayak...@gmail.com> wrot
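
Editor's note: a minimal Java sketch of reading last_index_time back out of dataimport.properties, for anyone who wants to use it as the lower bound of a follow-up query. It assumes the usual DIH layout, where the value is written as yyyy-MM-dd HH:mm:ss with the colons backslash-escaped in the properties file. The class name LastIndexTime and the sample timestamp are hypothetical, and the sketch reads from a string rather than from the real file in the core's conf directory.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;
import java.util.Properties;

// Hypothetical helper, not part of Solr.
final class LastIndexTime {
    // Assumption: the timestamp pattern DIH writes by default.
    private static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");

    // Parses last_index_time from the text of a dataimport.properties file.
    static LocalDateTime parse(String propertiesText) {
        Properties props = new Properties();
        try {
            // Properties.load un-escapes "\:" to ":" for us.
            props.load(new StringReader(propertiesText));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        String raw = props.getProperty("last_index_time");
        if (raw == null) {
            throw new IllegalArgumentException("last_index_time is not set");
        }
        return LocalDateTime.parse(raw, FORMAT);
    }

    public static void main(String[] args) {
        // Made-up sample content; a real file may hold other keys as well.
        String sample = "last_index_time=2014-01-26 21\\:10\\:05\n";
        System.out.println(parse(sample));
    }
}
```

Since the reply above says the value marks when the last import started, documents changed while that import was running have timestamps at or after it, so a query bounded below by this value would still pick them up.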

What is the last_index_time in dataimport.properties?

2014-01-26 Thread Dileepa Jayakody
Hi All, Can I please know what timestamp in the dataimport process is recorded as the last_index_time in dataimport.properties? Is it the time that the last dataimport process started, or is it the time that the last dataimport process finished? Thanks, Dileepa

Re: How to run a subsequent update query to documents indexed from a dataimport query

2014-01-26 Thread Dileepa Jayakody
Hi all, Any ideas on how to run a reindex update process for all the imported documents from a /dataimport query? Appreciate your help. Thanks, Dileepa On Thu, Jan 23, 2014 at 12:21 PM, Dileepa Jayakody < dileepajayak...@gmail.com> wrote: > Hi All, > > I did some research on

Re: How to run a subsequent update query to documents indexed from a dataimport query

2014-01-22 Thread Dileepa Jayakody
those doc ids with an /update query to run the Stanbol update process. Please give me your ideas and suggestions. Thanks, Dileepa On Wed, Jan 22, 2014 at 6:14 PM, Dileepa Jayakody wrote: > Hi All, > > I have a Solr requirement to send all the documents imported from a > /dataimport q

How to run a subsequent update query to documents indexed from a dataimport query

2014-01-22 Thread Dileepa Jayakody
Hi All, I have a requirement in Solr to send all the documents imported by a /dataimport query through another update chain as a separate background process. Currently I have configured my custom update chain in the /dataimport handler itself. But since my custom update process needs to conne

Concurrent request configurations for Solr Processors

2013-12-18 Thread Dileepa Jayakody
Hi All, I have written a custom update request processor and configured an UpdateRequestProcessor chain in solrconfig.xml as below; Can I please know how I can configure the number of concurrent requests for my processor? What is the default number of concurrent requests per a Solr

Re: Passing a Parameter to a Custom Processor

2013-12-17 Thread Dileepa Jayakody
Thanks a lot for the info Koji. I'm going through the source-code, to find out. Regards, Dileepa On Fri, Dec 13, 2013 at 5:40 PM, Koji Sekiguchi wrote: > Hi Dileepa, > > The stanbolInterceptor processor chain will be used in multiple request >> handlers. Then I will have to pass the stanbol.e

Passing a Parameter to a Custom Processor

2013-12-13 Thread Dileepa Jayakody
Hi All, I have written a custom update-request processor and need to pass certain parameters to the Processor. I believe solrconfig.xml is the place to pass these parameters. At the moment I define my parameter in the request handler as below; data-config.xml stanbolInterceptor *http:/

Re: How to use batchSize in DataImportHandler to throttle updates in a batch-mode

2013-12-01 Thread Dileepa Jayakody
I actually tweaked the Stanbol server to handle more results and successfully ran 10K imports within 30 minutes with no server issue. I'm looking to further improve the results with regard to efficiency and NLP accuracy. Thanks, Dileepa On Sun, Dec 1, 2013 at 8:17 PM, Dileepa Jay

Re: How to use batchSize in DataImportHandler to throttle updates in a batch-mode

2013-12-01 Thread Dileepa Jayakody
eeps a while after every N requests to simulate > throttling. > On 26 Nov 2013 14:21, "Dileepa Jayakody" > wrote: > > > Hi All, > > > > I have a requirement to import a large amount of data from a mysql > database > > and index documents (about 1000 d
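
Editor's note: the reply quoted above suggests pausing after every N requests to simulate throttling. A minimal Java sketch of that idea follows. The class BatchThrottle, its method names, and the parameter values are all hypothetical; nothing here is a DIH or Solr API, and where it would be called from (for example a custom update processor) is left open.

```java
// Hypothetical helper: pauses the calling thread after every batchSize requests.
final class BatchThrottle {
    private final int batchSize;
    private final long pauseMillis;
    private int sinceLastPause = 0;
    private int pauses = 0;

    BatchThrottle(int batchSize, long pauseMillis) {
        if (batchSize <= 0) {
            throw new IllegalArgumentException("batchSize must be positive");
        }
        if (pauseMillis < 0) {
            throw new IllegalArgumentException("pauseMillis must not be negative");
        }
        this.batchSize = batchSize;
        this.pauseMillis = pauseMillis;
    }

    // Call once after each request; returns true when this call paused.
    // Not thread-safe: intended for a single importing thread.
    boolean afterRequest() {
        sinceLastPause++;
        if (sinceLastPause < batchSize) {
            return false;
        }
        sinceLastPause = 0;
        pauses++;
        try {
            Thread.sleep(pauseMillis);
        } catch (InterruptedException e) {
            // Restore the interrupt flag so the caller can react to it.
            Thread.currentThread().interrupt();
        }
        return true;
    }

    // Number of pauses taken so far.
    int pauseCount() {
        return pauses;
    }
}
```

With new BatchThrottle(100, 2000), for instance, the importing thread would stop for two seconds after every hundredth enhancement request, giving the external server time to catch up.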

How to use batchSize in DataImportHandler to throttle updates in a batch-mode

2013-11-26 Thread Dileepa Jayakody
Hi All, I have a requirement to import a large amount of data from a MySQL database and index documents (about 1000 documents). During the indexing process I need to do special processing of a field by sending enhancement requests to an external Apache Stanbol server. I have configured my dataimpo

Re: An UpdateHandler to run following a MySql DataImport

2013-11-15 Thread Dileepa Jayakody
ith-solrj/ > > Best, > Erick > > > On Fri, Nov 15, 2013 at 2:48 AM, Dileepa Jayakody < > dileepajayak...@gmail.com > > wrote: > > > Hi All, > > > > I have written a custom update request handler to do some custom > processing > > of do

An UpdateHandler to run following a MySql DataImport

2013-11-14 Thread Dileepa Jayakody
Hi All, I have written a custom update request handler to do some custom processing of documents and configured the /update handler to use my custom handler in the default: update.chain. The same requestHandler should be configured for the data-import-handler when it loads documents to solr index

Re: Indexing a token to a different field in a custom filter

2013-11-12 Thread Dileepa Jayakody
a lot of built-in update processors as well as a JavaScript > script update processor. > > -- Jack Krupansky > > -Original Message- From: Dileepa Jayakody > Sent: Tuesday, November 12, 2013 1:31 AM > To: solr-user@lucene.apache.org > Subject: Indexing a token to a diff

Re: Indexing a token to a different field in a custom filter

2013-11-12 Thread Dileepa Jayakody
m adding a new document, instead of updating the same document) Any pointers please? Thanks, Dileepa On Tue, Nov 12, 2013 at 12:01 PM, Dileepa Jayakody < dileepajayak...@gmail.com> wrote: > Hi All, > > In my custom filter, I need to index the processed token into a different > fie

Indexing a token to a different field in a custom filter

2013-11-11 Thread Dileepa Jayakody
Hi All, In my custom filter, I need to index the processed token into a different field. The processed token is a Stanbol enhancement response. The solution I have found so far is to use a Solr client (SolrJ) to add a new Document with my processed field into Solr. Below is the sample code segment

Re: HTTP 500 error when invoking a REST client in Solr Analyzer

2013-11-11 Thread Dileepa Jayakody
hard coded in the Analyzer class and from the Solr Analysis UI. The UI response failed intermittently when I adjust the field value. This could be a problem with character encoding of the field value it seems. Thanks, Dileepa On Tue, Nov 12, 2013 at 1:33 AM, Dileepa Jayakody wrote: > Hi

HTTP 500 error when invoking a REST client in Solr Analyzer

2013-11-11 Thread Dileepa Jayakody
Hi All, I am working on a custom analyzer in Solr to post content to Apache Stanbol for enhancement during indexing. To post content to Stanbol, inside my custom analyzer's incrementToken() method I have written the code below using the Jersey client API sample [1]; public boolean incrementToken() throws

Re: Error instantiating a Custom Filter in Solr

2013-11-10 Thread Dileepa Jayakody
To: solr-user@lucene.apache.org > Subject: Re: Error instantiating a Custom Filter in Solr > > > Well, I think Jack Krupansky's book has some examples, at $10 it's probably > a steal. > > Best, > Erick > > > > > On Fri, Nov 8, 2013 at 1:4

Re: Error instantiating a Custom Filter in Solr

2013-11-07 Thread Dileepa Jayakody
like LowerCaseFilterFactory > and use that as a model, although you probably don't need to implement > the MultiTermAware bit. > > FWIW, > Erick > > > On Thu, Nov 7, 2013 at 1:31 PM, Dileepa Jayakody > wrote: > > > Hi All, > > > > I'm a novic

Error instantiating a Custom Filter in Solr

2013-11-07 Thread Dileepa Jayakody
Hi All, I'm a novice in Solr and I'm continuously bumping into problems with my custom filter I'm trying to use for analyzing a fieldType during indexing as below; Below is my custom FilterFactory class; *public class ContentFilterFactory extends TokenFilterFactory {* * publ

Re: Help to find BaseTokenFilterFactory to write a Custom TokenFilter

2013-11-07 Thread Dileepa Jayakody
.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:151) ... 18 more *Caused by: java.lang.NoSuchMethodException: com.solr.test.analyzer.ContentFilterFactory.<init>(java.util.Map)* at java.lang.Class.getConstructor0(Class.java:2810) at java.lang.Class.getConstructor(Class.java:1718) at org.apache.so
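
Editor's note: the trace above shows the loader calling Class.getConstructor with a java.util.Map argument and not finding a match on ContentFilterFactory. The self-contained Java sketch below reproduces that lookup with two stand-in classes, so it runs without Solr on the classpath. The class names are hypothetical. The likely fix on the real factory (an assumption based on this trace, not something confirmed in the thread) is a public constructor that takes a Map<String,String> and passes it to the superclass.

```java
import java.util.Map;

// Stand-alone illustration of the reflective constructor lookup seen in the trace.
final class FactoryConstructorCheck {

    // Stand-in for a factory that only has a no-argument constructor.
    public static class NoMapConstructorFactory {
        public NoMapConstructorFactory() {
        }
    }

    // Stand-in for a factory that exposes the constructor the loader asks for.
    public static class MapConstructorFactory {
        final Map<String, String> args;

        public MapConstructorFactory(Map<String, String> args) {
            this.args = args;
        }
    }

    // Mirrors the lookup in the trace: is there a public constructor taking a Map?
    static boolean hasMapConstructor(Class<?> type) {
        try {
            type.getConstructor(Map.class);
            return true;
        } catch (NoSuchMethodException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(hasMapConstructor(NoMapConstructorFactory.class));
        System.out.println(hasMapConstructor(MapConstructorFactory.class));
    }
}
```

Note that getConstructor only sees public constructors, so a package-private or protected Map constructor would fail the same way.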

Re: Help to find BaseTokenFilterFactory to write a Custom TokenFilter

2013-11-07 Thread Dileepa Jayakody
lr-4-0 > > > On Thu, Nov 7, 2013 at 1:05 PM, Dileepa Jayakody > wrote: > > > Hi All, > > > > I am writing a custom TokenFilter to post a token value to Apache Stanbol > > for enhancement. In this Custom TokenFilter I'm trying to retrieve the > > resp

Help to find BaseTokenFilterFactory to write a Custom TokenFilter

2013-11-06 Thread Dileepa Jayakody
Hi All, I am writing a custom TokenFilter to post a token value to Apache Stanbol for enhancement. In this Custom TokenFilter I'm trying to retrieve the response from Stanbol and index it as a new document in Solr. I'm following [1] to write a custom filter, but I'm having trouble locating BaseTo

Writing a Solr custom analyzer to post content to Stanbol {was: Need additional data processing in Data Import Handler prior to indexing}

2013-11-02 Thread Dileepa Jayakody
s, Dileepa On Wed, Oct 30, 2013 at 11:26 AM, Dileepa Jayakody < dileepajayak...@gmail.com> wrote: > Thanks guys for your ideas. > > I will go through them and come back with questions. > > Regards, > Dileepa > > > On Wed, Oct 30, 2013 at 7:00 AM, Erick Eri

Re: Need additional data processing in Data Import Handler prior to indexing

2013-10-29 Thread Dileepa Jayakody
tHandler#Usage_with_XML.2FHTTP_Datasource > > > > > > Michael Della Bitta > > > > > > Applications Developer > > > > > > o: +1 646 532 3062 | c: +1 917 477 7906 > > > > > > appinions inc. > > > > > > “The Scienc

Need additional data processing in Data Import Handler prior to indexing

2013-10-29 Thread Dileepa Jayakody
Hi All, I'm a newbie to Solr, and I have a requirement to import data from a MySQL database, enhance the imported content to identify the Persons mentioned, and index them as a separate field in Solr along with the other fields defined for the original db query. I'm using Apache Stanbol [1] for the co