Re: AEM SOLR integration

2017-09-24 Thread Tommaso Teofili
Integration can be done in AEM at different layers; however, my suggestion would be to enable it at the repository (Oak) level [1] so that the usual AEM search would also take ACLs into account. [1] : http://jackrabbit.apache.org/oak/docs/query/solr.html On Fri 22 Sep 2017 at 18:47, Davi

Re: Knn classifier doesn't work

2017-09-19 Thread Tommaso Teofili
hi Alessandro, yes please, feel free to open a Jira issue, patches welcome! Tommaso On Mon 18 Sep 2017 at 14:30, alessandro.benedetti <a.benede...@sease.io> wrote: > Hi Tommaso, > you are definitely right! > I see that the method : MultiFields.getTerms > returns : > if (term

Re: multi language search engine in solr

2017-09-11 Thread Tommaso Teofili
another thing to consider is what users would expect: would English users search over English docs only? If yes, the most important task would be to correctly set up / create accurate per-language analyzers; otherwise you may also consider adopting machine translation, either on the search queries

Re: Knn classifier doesn't work

2017-09-02 Thread Tommaso Teofili
it sounds like none of the docs in your index has the "class" field, in your case Tags, whereas classification needs some bootstrapping (add some examples of correctly classified docs to the index beforehand). On the other hand the naive bayes implementation definitely has a bug as the MultiFi

Re: Exception during integration of Solr with UIMA

2017-03-20 Thread Tommaso Teofili
Hi, the UIMA OpenCalais Annotator you're using refers to an old endpoint which is no longer available, see log line [1]. I would suggest to simply remove the OpenCalaisAnnotator entry from your UIMAUpdateRequestProcessor configuration in solrconfig.xml. More generally you should put only the UIMA

Re: Solr UIMA Custom Annotator PEAR file installation on Linux

2016-01-08 Thread Tommaso Teofili
Hi, do you mean you want to use a PEAR to provide the Annotator for the Solr UIMA UpdateProcessor ? Can you please detail a bit more your needs? Regards, Tommaso 2016-01-08 1:57 GMT+01:00 techqnq : > implemented custom annotator and generated the PEAR file. > Windos has the PEAR installer util

Re: Using SimpleNaiveBayesClassifier in solr

2015-10-12 Thread Tommaso Teofili
Hi Yewint, the SNB classifier is not an online one, so you should retrain it every time you want to update it. What you pass to the Classifier is a Reader, therefore you should make sure it stays accessible (do not close it) for classification to work. Regarding performance SNB becomes slower

Re: solr uima and opennlp

2015-06-01 Thread Tommaso Teofili
yeah, I think you'd rather post it to d...@uima.apache.org . Regards, Tommaso 2015-05-28 15:19 GMT+02:00 hossmaa : > Hi Tommaso > > Thanks for the quick reply! I have another question about using the > Dictionary Annotator, but I guess it's better to post it separately. > > Cheers > Andreea > >

Re: solr uima and opennlp

2015-05-21 Thread Tommaso Teofili
Hi Andreea, 2015-05-21 18:12 GMT+02:00 hossmaa : > Hi everyone > > I'm trying to plug in a new UIMA annotator into solr. What is necessary for > this? Is it enough to build a Jar similarly to the ones from the > uima-addons > package? yes, exactly. Actually you just need a jar containing the An

Re: /suggest through SolrJ?

2015-04-29 Thread Tommaso Teofili
2015-04-27 19:22 GMT+02:00 Alessandro Benedetti : > Just had the very same problem, and I confirm that currently is quite a > mess to manage suggestions in SolrJ ! > I have to go with manual Json parsing. > or very not nice NamedList API mess (see an example in JR Oak [1][2]). Regards, Tommaso

Re: Issue with multivalued fields in UIMA

2014-08-29 Thread Tommaso Teofili
Hi, it'd be good if you could open a Jira issue (preferably with a patch) describing your findings. Thanks, Tommaso 2014-08-29 18:34 GMT+02:00 mkhordad : > I solved it. It was caused by a bug in UIMAUpdateRequestProcessor. > > > > -- > View this message in context: > http://lucene.472066.n3.n

Tika analyzers

2014-07-30 Thread Tommaso Teofili
Hi all, while SolrCell works nicely when in need of indexing binary documents, I am wondering about the possibility of having Lucene / Solr documents that have binaries in specific Lucene fields, e.g. title="a nice doc", name="blabla.doc", binary="0x1234...". In that case the "binary" field should

Re: Integrate solr with openNLP

2014-06-04 Thread Tommaso Teofili
Hi all, Ahmet was suggesting to eventually use UIMA integration because OpenNLP already has an integration with Apache UIMA and so you would just have to use that [1]. And that's one of the main reasons the UIMA integration was done: it's a framework that you can easily hook into in order to plug your

Re: deep paging without sorting / keep IRs open

2014-05-19 Thread Tommaso Teofili
thanks Yonik, that looks promising, I'll have a look at it. Tommaso 2014-05-17 17:57 GMT+02:00 Yonik Seeley : > On Sat, May 17, 2014 at 10:30 AM, Yonik Seeley > wrote: > > I think searcher leases would fit the bill here? > > https://issues.apache.org/jira/browse/SOLR-2809 > > > > Not yet imple

deep paging without sorting / keep IRs open

2014-05-15 Thread Tommaso Teofili
Hi all, in one use case I'm working on [1] I am using Solr in combination with a MVCC system [2][3], so that the (Solr) index is kept up to date with the system and must handle search requests that are tied to a certain state / version of it and of course multiple searches based on different versi

Re: [Clustering] Full-Index Offline cluster

2014-03-10 Thread Tommaso Teofili
Hi Ahmet, Ale, right, there's a classification module for Lucene (and therefore usable in Solr as well), but no clustering support there. Regards, Tommaso 2014-03-10 19:15 GMT+01:00 Ahmet Arslan : > Hi, > > Thats weird. As far as I know there is no such thing. There is > classification stuff b

Re: Caching requests to Solr

2014-03-08 Thread Tommaso Teofili
following up on this, I've created https://issues.apache.org/jira/browse/SOLR-5826 , with a draft patch. Regards, Tommaso 2014-03-05 8:50 GMT+01:00 Tommaso Teofili : > Hi all, > > I have the following requirement where I have an application talking to > Solr via SolrJ where I d

Caching requests to Solr

2014-03-04 Thread Tommaso Teofili
Hi all, I have the following requirement where I have an application talking to Solr via SolrJ where I don't know upfront which type of Solr instance that will be communicating with, while this is easily solvable by using different SolrServer implementations I also need a way to ensure that all th

Re: Alternatives to GATE?

2014-01-16 Thread Tommaso Teofili
If you need a framework to build your enhancement pipeline on I think Apache UIMA [1] is good as it's also able to store annotated documents into Lucene and Solr so it may be a good fit for your needs. Just consider that you have to learn how to use / develop on top of it, it's not a big deal but n

Re: Too slow UIMA with Solr

2013-08-29 Thread Tommaso Teofili
p.s. see https://issues.apache.org/jira/browse/SOLR-5201 2013/8/29 Tommaso Teofili > Hi Jun, > > I agree the AE (instead of the AEProvider) should be cached on the > UpdateRequestProcessor. > In previous revisions [1] it was cached directly by the BasicAEProvider so > there w

Re: Too slow UIMA with Solr

2013-08-29 Thread Tommaso Teofili
Hi Jun, I agree the AE (instead of the AEProvider) should be cached on the UpdateRequestProcessor. In previous revisions [1] it was cached directly by the BasicAEProvider so there wasn't need of that in the UIMAUpdateRequestProcessor but, since that has changed, I agree that should be done there a

Re: Document Similarity Algorithm at Solr/Lucene

2013-07-23 Thread Tommaso Teofili
entually train a classifier to help you mark other texts as quote / plagiarism HTH, Tommaso 2013/7/23 Furkan KAMACI > Actually I need a specialized algorithm. I want to use that algorithm to > detect duplicate blog posts. > > 2013/7/23 Tommaso Teofili > > > Hi, > >

Re: Document Similarity Algorithm at Solr/Lucene

2013-07-23 Thread Tommaso Teofili
Hi, you may leverage and / or improve the MLT component [1]. HTH, Tommaso [1] : http://wiki.apache.org/solr/MoreLikeThis 2013/7/23 Furkan KAMACI > Hi; > > Sometimes a huge part of a document may exist in another document. As like > in student plagiarism or quotation of a blog post at another b

Re: Solr UIMA

2013-02-21 Thread Tommaso Teofili
Hi Bart, I think the only way you can do that is by reindexing, or maybe by just doing a dummy atomic update [1] to each of the documents (e.g. adding or changing a field of type 'ignored' or something like that) that weren't "tagged" by UIMA before. Regards, Tommaso [1] : http://wiki.apache.org

Re: which analyzer is used for facet.query?

2013-02-13 Thread Tommaso Teofili
I agree that's definitely strange, I'll have a look at it. Tommaso 2013/2/12 Chris Hostetter > > : > So it seems that facet.query is using the analyzer of type index. > : > Is it a bug or is there another analyzer type for the facet query? > > That doesn't really make any sense ... > > i don't

Re: Indexing nouns only with UIMA works - performance issue?

2013-02-05 Thread Tommaso Teofili
nceAE.xml" > tokenType="org.apache.uima.SentenceAnnotation" ngramsize="2" > modelFile="file:german/TuebaModel.dat" /> > > ??? > > Thanks, > > Kai > > > -Original Message- > From: Tommaso Teofili [mailto:tommaso.teof...@gmai

Re: Indexing nouns only with UIMA works - performance issue?

2013-02-04 Thread Tommaso Teofili
descriptor and is then set with the given actual value. HTH, Tommaso 2013/2/4 Tommaso Teofili > Regarding configuration parameters have a look at > https://issues.apache.org/jira/browse/LUCENE-4749 > Regards, > Tommaso > > > 2013/2/4 Tommaso Teofili > >> Thanks

Re: Indexing nouns only with UIMA works - performance issue?

2013-02-04 Thread Tommaso Teofili
Regarding configuration parameters have a look at https://issues.apache.org/jira/browse/LUCENE-4749 Regards, Tommaso 2013/2/4 Tommaso Teofili > Thanks Kai for your feedback, I'll look into it and let you know. > Regards, > Tommaso > > > 2013/2/1 Kai Gülzau > >>

Re: Indexing nouns only with UIMA works - performance issue?

2013-02-04 Thread Tommaso Teofili
Thanks Kai for your feedback, I'll look into it and let you know. Regards, Tommaso 2013/2/1 Kai Gülzau > I now use the "stupid" way to use the german corpus for UIMA: copy + paste > :-) > > I modified the Tagger-2.3.1.jar/HmmTagger.xml to use the german corpus > ... > > file:german/TuebaMode

Re: Solr UIMA with KEA

2012-11-23 Thread Tommaso Teofili
the AlchemyAPI service is not mandatory (it's there just as an example and can be safely removed), you can use whatever service you want as long as it's wrapped by a UIMA AnalysisEngine and you specify its descriptor. See following updateChain example configuration : /path/to/KEAdescritpor.x

Re: UIMA for lemmatization

2012-09-25 Thread Tommaso Teofili
Hi, I think you'd better ask this on u...@uima.apache.org list as this is more related to Apache UIMA itself rather than to Apache Solr. Regards, Tommaso 2012/9/25 abhayd > hi > I m new to UIMA. Solr doea not have lemmatization component, i was > thinking > of using UIMA for this. > > Is t

Re: Backup strategy for SolrCloud

2012-09-20 Thread Tommaso Teofili
I also think that's a good question and currently without a "use this" answer :-) I think it shouldn't be hard to write a Solr service querying ZK and replicate both conf and indexes (via SnapPuller or ZK itself) so that such a node is responsible to back up the whole cluster in a secure storage (N

Re: Embedded Server Issue : SOLRJ : No Such Core Found

2012-09-19 Thread Tommaso Teofili
Hi Senthil, try using the following: CoreContainer coreContainer = new CoreContainer.Initializer().initialize(); SolrServer solrServer = new EmbeddedSolrServer(coreContainer, "collection1"); Hope it helps, Tommaso 2012/9/19 Senthil Kk Mani > > Hi, > > I am facing an issue while trying to us

Re: Levenstein Distance

2012-06-07 Thread Tommaso Teofili
During the analysis phase you could add payloads to the terms using LevensteinDistance and then use that in conjunction with a PayloadSimilarity class (see [1] for an example), or just use a custom Similarity class which uses LevensteinDistance for scoring. HTH Tommaso [1] : http://www.lucidimagin
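The scoring idea in this reply rests on an edit-distance measure between two terms. As a minimal, library-free sketch of that measure (the class LevenshteinSketch and its method names are hypothetical, this is not the Lucene LevensteinDistance API, and the normalisation to a 0..1 similarity is an assumption made for illustration), it could look like this:

```java
// Hypothetical helper, not part of Lucene or Solr: plain Levenshtein edit
// distance plus an assumed normalisation to a similarity in [0, 1].
final class LevenshteinSketch {

    // Number of single-character insertions, deletions and substitutions
    // needed to turn a into b, computed row by row with two arrays.
    static int distance(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j; // turning "" into the first j chars of b costs j inserts
        }
        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i; // turning the first i chars of a into "" costs i deletes
            for (int j = 1; j <= b.length(); j++) {
                int substitution = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                curr[j] = Math.min(
                        Math.min(curr[j - 1] + 1, prev[j] + 1),
                        prev[j - 1] + substitution);
            }
            int[] swap = prev;
            prev = curr;
            curr = swap;
        }
        return prev[b.length()];
    }

    // Assumed normalisation: 1.0 for identical strings, 0.0 when every
    // character has to change.
    static float similarity(String a, String b) {
        int longest = Math.max(a.length(), b.length());
        return longest == 0 ? 1f : 1f - (float) distance(a, b) / longest;
    }

    public static void main(String[] args) {
        System.out.println(distance("kitten", "sitting"));
        System.out.println(similarity("solr", "solar"));
    }
}
```

A value like the one returned by similarity could then be stored as a term payload at analysis time or computed inside a custom scoring class, as the reply suggests; how it is wired into Lucene is left out here on purpose.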

Re: Solr with UIMA

2012-06-04 Thread Tommaso Teofili
Hi all, 2012/6/1 Jack Krupansky > Is it failing on the first document? I see "uid 5", suggests that it is > not. If not, how is this document different from the others? > > I see the exception > org.apache.uima.resource.**ResourceInitializationExceptio**n, suggesting > that some file cannot be l

Re: shard distribution of multiple collections in SolrCloud

2012-05-24 Thread Tommaso Teofili
'll take a look and try to help there. Tommaso > > > On May 24, 2012, at 4:39 AM, Tommaso Teofili wrote: > > > 2012/5/23 Mark Miller > > > >> Yeah, currently you have to create the core on each node...we are > working > >> on a 'collections'

Re: shard distribution of multiple collections in SolrCloud

2012-05-24 Thread Tommaso Teofili
2012/5/23 Mark Miller > Yeah, currently you have to create the core on each node...we are working > on a 'collections' api that will make this a simple one call operation. > Mark, is there a Jira for that yet? Tommaso > > We should have this soon. > > - Mark > > On May 23, 2012, at 2:36 PM, Da

Re: Problem with AND clause in multi core search query

2012-05-14 Thread Tommaso Teofili
The latter is supposed to work: http://localhost:8983/solr/core0/select?shards=localhost:8983/solr/core0,localhost:8983/solr/core1&q=column1:"A" OR column2:"B" The first query cannot work as there is no document in either core0 or core1 which has A in field column1 and B in field column2 but
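When a request like the one in this reply is sent from code, the quotes, colons and spaces in the q and shards values need URL encoding. A minimal sketch using only the JDK follows; the class ShardedQueryUrlSketch and its build method are hypothetical names, and the host, core and field names are simply the ones quoted in the message:

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

// Hypothetical helper, not part of SolrJ: assembles a distributed-search URL
// with the shards and q parameters form-encoded.
final class ShardedQueryUrlSketch {

    // collectorCore is the core that receives the request and merges results,
    // e.g. "localhost:8983/solr/core0"; shards is the comma-separated shard list.
    static String build(String collectorCore, String shards, String query) {
        return "http://" + collectorCore + "/select"
                + "?shards=" + encode(shards)
                + "&q=" + encode(query);
    }

    private static String encode(String value) {
        try {
            return URLEncoder.encode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // UTF-8 is always available on a compliant JVM.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(build(
                "localhost:8983/solr/core0",
                "localhost:8983/solr/core0,localhost:8983/solr/core1",
                "column1:\"A\" OR column2:\"B\""));
    }
}
```

Keeping the OR between the two clauses is what lets a document from either core match, which is the point made in the reply.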

Re: Using UIMA in Solr behind a firewall

2012-04-04 Thread Tommaso Teofili
Hello Peter, I think that is more related to UIMA AlchemyAPIAnnotator [1] or to AlchemyAPI services themselves [2] because Solr just uses the out of the box UIMA AnalysisEngine for that. Thus it may make sense to ask on d...@uima.apache.org (or even directly to AlchemyAPI guys). HTH, Tommaso [1] :

Re: Solr with UIMA

2012-04-04 Thread Tommaso Teofili
Hi again Chris, I finally managed to find some proper time to test your configuration. First thing to notice is that it worked for me assuming the following pre-requisites were satisfied: - you had the jar containing the AnalysisEngine for the RoomAnnotator.xml in your libraries section (this is ac

Re: Solr with UIMA

2012-03-28 Thread Tommaso Teofili
Hi Chris, I have never tried the Nutch integration so I can't help with that. However I'll try to repeat your same setup and will let you know what comes out for me. Tommaso 2012/3/28 chris3001 > Still not getting there on Solr with UIMA... > Has anyone taken example 1 (RoomAnnotator) and su

Re: Solr with UIMA

2012-03-28 Thread Tommaso Teofili
Hi Chris, 2012/3/28 chris3001 > I am having a hard time integrating UIMA with Solr. I have downloaded the > Solr 3.5 dist and have it successfully running with nutch and tika on > windows 7 using solrcell and curl via cygwin. To begin, I copied the 6 jars > from solr/contrib/uima/lib to the work

Re: Solr Monitoring / Stats

2012-03-15 Thread Tommaso Teofili
would http://www.lucidimagination.com/blog/2011/10/02/monitoring-apache-solr-and-lucidworks-with-zabbix/ work for your scenario? Tommaso 2012/3/12 Alex Leonhardt > Hi All, > > I was wondering if anyone knows of a free tool to use to monitor multiple > Solr hosts under one roof ? I found some non

Re: Reporting tools

2012-03-09 Thread Tommaso Teofili
as Gora says there is the stats component you can take advantage of or you could also use JMX directly [1] or LucidGaze [2][3] or commercial services like [4] or [5] (these are the ones I know but there may be also others), each of them with different level/type of service. Tommaso [1] : http://w

Re: in solr how to support Document.SetBoost as lucene?

2012-03-07 Thread Tommaso Teofili
when indexing a Solr document by sending XML files via HTTP POST you can set it adding the boost element to the doc one, see http://wiki.apache.org/solr/UpdateXmlMessages#Optional_attributes_on_.22doc.22 If you plan to index using the java APIs (SolrJ, see http://wiki.apache.org/solr/Solrj) you can

Re: performance between ExternalFileField and Join

2012-03-01 Thread Tommaso Teofili
Also regarding the Join functionality I remember Yonik pointed out it's O(# unique terms) but I agree with Erick on the ExternalFileField as you can use it just inside a function query, for example, for boosting. Tommaso 2012/3/1 Erick Erickson > Hmmm. ExternalFileFields can only be float values,

Re: proper syntax for using sort query parameter in responseHandler

2012-02-17 Thread Tommaso Teofili
Hi Mark, Having a look at that requestHandler it looks ok [1], are you experiencing any errors? If so did you check the wiki page FieldOptionsByUseCase [2], maybe that field (rankNo) options contain indexed="false" or multiValued="true"? HTH, Tommaso [1] : http://wiki.apache.org/solr/CommonQueryPa

Re: Sorting solrdocumentlist object after querying

2012-02-09 Thread Tommaso Teofili
Hi Kashif, maybe the field collapsing feature [1] may help you with your requirement. Hope this helps, Tommaso [1] : http://wiki.apache.org/solr/FieldCollapsing

Re: How to do this in Solr? random result for the first few results

2012-02-09 Thread Tommaso Teofili
I think you may use/customize the query elevation component to achieve that. http://wiki.apache.org/solr/QueryElevationComponent Tommaso 2012/2/9 mtheone > Say I have a classified ads site, I want to display 2 random items (premium > ads) in the beginning of the search result and the rest are re

Re: How to get the time document was indexed?

2012-01-20 Thread Tommaso Teofili
Hi Alex, you can create a field in the schema.xml of type date or tdate called (something like) idx_timestamp and set its default option to NOW then you won't have to add any extra fields to the documents because it will be automatically created when documents are indexed. Hope it helps. Tommaso 2

Re: Problems with SolrUIMA

2011-12-10 Thread Tommaso Teofili
Hello Adriana, your configuration looks fine to me. The exception you pasted makes me think you're using a Solr instance at a certain version (3.4.0) while the Solr-UIMA module jar is at a different version; I remember there has been a change in the UpdateRequestProcessorFactory API at some point

Re: Document Processing

2011-12-06 Thread Tommaso Teofili
Hello Michael, I can help you with using the UIMA UpdateRequestProcessor [1]; the current implementation uses in-memory execution of UIMA pipelines but since I was planning to add the support for higher scalability (with UIMA-AS [2]) that may help you as well. Tommaso [1] : http://svn.apache.org

Re: Upgratding the Index from 1.4.1 to 3.4 using replication

2011-10-27 Thread Tommaso Teofili
I don't think it'll work as I've tried this approach myself and the blocking issue was that Solr 1.4.1 uses a different javabin version than Solr 3.4 (I think it's 1 vs 2) so the master and the slave(s) can't communicate using the standard replication handler and thus can't exchange information and data

Re: UIMA DictionaryAnnotator partOfSpeach

2011-09-28 Thread Tommaso Teofili
I think one problem is that the featurePath is not set correctly. Note that you are assuming PoS are written somewhere in some annotation feature, so this means you should have set up the UIMA pipeline to also include, for example, the HMM Tagger [1] which adds (by default) the posTag feature to TokenAn

Different Solr versions between Master and Slave(s)

2011-09-19 Thread Tommaso Teofili
Hi all, while thinking about a migration plan of a Solr 1.4.1 master / slave architecture (1 master with N slaves already in production) to Solr 3.x I imagined to go for a graceful migration, starting with migrating only one/two slaves, making the needed tests on those while still offering the inde

Re: solr UIMA exception

2011-08-29 Thread Tommaso Teofili
The UIMA AlchemyAPI annotator is failing for you due to an error on the server side and I think you should look at your Solr UIMA configuration as it seems you wanted to extract entities from text: "Senator Dick Durbin (D-IL) Chicago , March 3,2007." while the error says "org.apache.solr.uima.processor

Re: Solr UIMA integration problem

2011-08-17 Thread Tommaso Teofili
At first glance I think the problem is in the 'feature' element which is set to 'title'. The 'feature' element should contain a UIMA Feature of the type defined in element 'type'; for example, SentenceAnnotation [1], defined in the HMM Tagger, has 'only' the default features of a UIMA Annotation: be

Re: (Solr-UIMA) Indexing problems with UIMA fields.

2011-07-14 Thread Tommaso Teofili
"text" field. Is it > enough if I specify that inside SolrConfig in this point...or should I do > something more? > this is ok > > 3) Where can I see a more detailed Log about what is happening inside Solr? > I am running Solr from Eclipse + Tomcat. Neither the Console

Re: (Solr-UIMA) Indexing problems with UIMA fields.

2011-07-13 Thread Tommaso Teofili
Hello, I think the problem might be the following, if you defined the update request handlers like in the sample solrconfig : uima ... then the uima update chain will be executed only for HTTP POSTs on /update and not for /update/javabin (that is

Re: Different Indexing formats for Older Lucene versions and Solr?

2011-07-05 Thread Tommaso Teofili
Which Lucene version were you using? Regards, Tommaso 2011/7/5 Sowmya V.B. > Hi All > > A quick doubt on the index files of Lucene and Solr. > > I had an older version of lucene (with UIMA) till recently, and had an > index > built thus. > I shifted to Solr (3.3, with UIMA)..and tried to use the

Re: Problems using Solr with UIMA

2011-07-04 Thread Tommaso Teofili
adClass(ClassLoader.java:307) at > java.net.FactoryURLClassLoader.loadClass(URLClassLoader.java:627) at > java.lang.ClassLoader.loadClass(ClassLoader.java:248) at > java.lang.Class.forName0(Native Method) at > java.lang.Class.forName(Class.java:247) at > > org.apache.solr.core.So

Re: Problems using Solr with UIMA

2011-07-04 Thread Tommaso Teofili
Hello Sowmya, Is the problem a ClassNotFoundException? If so check that there is a <lib/> element referencing the solr-uima jar. Otherwise it may be some configuration error. By the way, which version of Solr are you using ? I ask since you're seeing README for trunk but you may be using Solr jars with dif

Re: Query time noun, verb boosting

2011-06-24 Thread Tommaso Teofili
2011/6/23 Anshum > Pooja, > You could use UIMA (or any other) Parts of Speech Tagger. You could read a > little more about it here. > > http://uima.apache.org/downloads/sandbox/hmmTaggerUsersGuide/hmmTaggerUsersGuide.html#sandbox.tagger.annotatorDescriptor > This would help you annotate and segr

Re: Showing facet of first N docs

2011-06-20 Thread Tommaso Teofili
correct. do you need to improve relevancy? I have a quite good relevance obtained after playing a bit with dismax and bq. I think the problem is just in how the facets are being used, I think a customized SpellChecker sounds like the right component to provide smart suggestions. 2011/6/20 Toke Eskil

Re: Showing facet of first N docs

2011-06-16 Thread Tommaso Teofili
nstraints to allow > paging. > > The default value is 0. > > This parameter can be specified on a per field basis. > > > Dmitry > > > On Thu, Jun 16, 2011 at 1:39 PM, Tommaso Teofili > wrote: > > > Hi all, > > Do you know if it is possible to show the

Showing facet of first N docs

2011-06-16 Thread Tommaso Teofili
Hi all, Do you know if it is possible to show the facets for a particular field related only to the first N docs of the total number of results? It seems facet.limit doesn't help with it as it defines a window in the facet constraints returned. Thanks in advance, Tommaso

Re: [Mahout] Integration with Solr

2011-06-09 Thread Tommaso Teofili
Hello Adam, I've managed to create a small POC of integrating Mahout with Solr for a clustering task, do you want to use it for clustering only or possibly for other purposes/algorithms? More generally speaking, I think it'd be nice if Solr could be extended with a proper API for integrating cluste

Re: How can I query mutlitcore with solrJ

2011-05-20 Thread Tommaso Teofili
Or, if you want results from both together, you can use the distributed search [1]. Just decide which one of the cores will be the "collector" and add the shards=localhost:8983/solr/fund_dih,localhost:8983/solr/fund_tika parameter like : SolrServer server = new CommonsHttpSolrServer(" http://local

Re: UIMA analysisEngine path

2011-05-18 Thread Tommaso Teofili
rsion 1.4.1 as well? > the UpdateRequestProcessorChain API has changed from 1.4.1 to 3.1.0 so, although it should be easy to back port, it's not compatible with Solr 1.4.1 out of the box. Tommaso > > Thanks again > > > > On Tue, May 17, 2011 at 12:13 PM, Tommaso Teofili

Re: UIMA analysisEngine path

2011-05-17 Thread Tommaso Teofili
ds, Tommaso > > On Mon, May 16, 2011 at 6:19 PM, Tommaso Teofili [via Lucene] < > ml-node+2948866-1333438441-399...@n3.nabble.com> wrote: > > > The error you pasted doesn't seem to be related to a (class)path issue > but > > more likely to be related to a

Re: UIMA analysisEngine path

2011-05-16 Thread Tommaso Teofili
ssorFactory > > > Regards, > Chamara > > > On Mon, May 16, 2011 at 9:17 AM, Tommaso Teofili [via Lucene] < > ml-node+2946920-843126873-399...@n3.nabble.com> wrote: > >> Hello, >> >> if you want to take the descriptor from a jar, provided that y

Re: UIMA analysisEngine path

2011-05-16 Thread Tommaso Teofili
Hello, if you want to take the descriptor from a jar, provided that you configured the jar inside a <lib> element in solrconfig, then you just need to write the correct classpath in the analysisEngine element. For example if your descriptor resides in com/something/desc/ path inside the jar then you sh

Re: uima fieldMappings and solr dynamicField

2011-05-09 Thread Tommaso Teofili
Thanks Koji for opening that, the dynamicField mapping is a commonly used feature especially for named entities mapping. Tommaso 2011/5/7 Koji Sekiguchi > I've opened https://issues.apache.org/jira/browse/SOLR-2503 . > > Koji > -- > http://www.rondhuit.com/en/ > > (11/05/06 20:15), Koji Sekiguch

Re: UIMA analysisEngine path

2011-05-06 Thread Tommaso Teofili
ount in the first implementation since many existing annotators already deliver descriptors bundled inside the jars/pears but this addition sounds like a good improvement so, basically, let's do it ;-) Regards, Tommaso [1] : http://uima.apache.org/d/uimaj-2.3.1/api/org/apache/uima/util/XMLInputSou

Re: UIMA analysisEngine path

2011-05-06 Thread Tommaso Teofili
relative paths pointing into the pear > subdirectory. > Grabbing the descriptor from the jar breaks that since > OverridingParamsAEProvider > uses the XMLInputSource method without relative path signature. > > Barry > > > On 5/4/2011 6:16 AM, Tommaso Teofili wrote: > >

Re: UIMA analysisEngine path

2011-05-04 Thread Tommaso Teofili
Hello Barry, the main AnalysisEngine descriptor defined inside the <analysisEngine> element should be inside one of the jars imported with the <lib> elements. At the moment it cannot be taken from expanded directories but it should be easy to do it (and indeed useful) modifying the OverridingParamsAEProvider class [1]

Re: Viewing Raw index data

2011-04-19 Thread Tommaso Teofili
Hello Dave, the LukeRequestHandler [1] and the Analysis service [2] should help you : Regards, Tommaso [1] : http://wiki.apache.org/solr/LukeRequestHandler [2] : http://wiki.apache.org/solr/FAQ#My_search_returns_too_many_.2BAC8_too_little_.2BAC8_unexpected_results.2C_how_to_debug.3F 2011/4/19 Da

Re: solr- Uima integration

2011-04-19 Thread Tommaso Teofili
Hi Isha 2011/4/18 Isha Garg > Can anyone explain me the what are runtimeParameters specified in the > as in link http://wiki.apache.org/solr/SolrUIMA. also tell me > how to integrate our own analysis engine to solr. I am new to this. the runtimeParameters contains parameters' settings that

Re: AbstractSolrTestCase and Solr 3.1.0

2011-04-12 Thread Tommaso Teofili
Thanks Robert, that was very useful :) Tommaso 2011/4/12 Robert Muir > On Tue, Apr 12, 2011 at 6:44 AM, Tommaso Teofili > wrote: > > Hi all, > > I am porting a previously series of Solr plugins developed for 1.4.1 > version > > to 3.1.0, I've written some

AbstractSolrTestCase and Solr 3.1.0

2011-04-12 Thread Tommaso Teofili
Hi all, I am porting a series of Solr plugins previously developed for version 1.4.1 to 3.1.0. I've written some integration tests extending the AbstractSolrTestCase [1] utility class but now it seems that wasn't included in the solr-core 3.1.0 artifact as it's in the solr/src/test directory. Was t

Re: UIMA example setup w/o OpenCalais

2011-04-08 Thread Tommaso Teofili
Hi Jay, you should be able to do so by simply removing the OpenCalaisAnnotator from the execution pipeline, commenting out line 124 of the file: solr/contrib/uima/src/main/resources/org/apache/uima/desc/OverridingParamsExtServicesAE.xml Hope this helps, Tommaso 2011/4/7 Jay Luker > Hi, > > I'd wo

Re: invert terms in search with exact match

2011-03-24 Thread Tommaso Teofili
Hi Gastone, I think you should use proximity search as described here in Lucene query syntax page [1]. So searching for "my love"~2 should work for your use case. Cheers, Tommaso [1] : http://lucene.apache.org/java/2_9_3/queryparsersyntax.html#ProximitySearches 2011/3/24 Gastone Penzo > Hi, >

Re: boosting with standard search handler

2011-03-24 Thread Tommaso Teofili
Hi Gastone, I used to do that in the standard search handler using the following parameters: q={!boost b=query($qq,0.7)} text:something title:other qq=date:[NOW-60DAY TO NOW]^5 OR date:[NOW-15DAY TO NOW]^8 which enables custom recency-based boosting. My 2 cents, Tommaso 2011/3/24 Gastone Penzo > Hi

Solr UIMA Wiki page

2011-03-09 Thread Tommaso Teofili
Hi all, I just improved the Solr UIMA integration wiki page [1] so if anyone is using it and/or has any feedback it'd be more than welcome. Regards, Tommaso [1] : http://wiki.apache.org/solr/SolrUIMA

Re: Use of multiple tomcat instance and shards.

2011-03-08 Thread Tommaso Teofili
2011/3/8 Tommaso Teofili > Hi Rajani, > > i > > > 2011/3/8 rajini maski > > >> Tommaso, Please can you share any link that explains me about how to >> enable >> and do load balancing on the machines that you did mention above..? >> >>

Re: Use of multiple tomcat instance and shards.

2011-03-08 Thread Tommaso Teofili
Hi Rajani, i 2011/3/8 rajini maski > > Tommaso, Please can you share any link that explains me about how to enable > and do load balancing on the machines that you did mention above..? > > > > if you're querying Solr via SolrJ [1] you could use the LBHttpSolrServer [2] otherwise, if you still

Re: Use of multiple tomcat instance and shards.

2011-03-08 Thread Tommaso Teofili
Hi, from my experience when you have to scale in the number of documents it's a good idea to use shards (so one schema and N shards containing (1/N)*total#docs) while if the requirement is sustaining a high query volume you could get a significant boost from replicating the same index on 2 or mo

Re: Faceting

2011-02-21 Thread Tommaso Teofili
Hi Praveen, as far as I understand you have to set the type of the field(s) you are searching over to be conservative. So for example you won't include stemmer and lowercase filters and use only a whitespace tokenizer; moreover you should search with the default operator set to AND. Then faceting

Re: Best way for a query-expander?

2011-02-18 Thread Tommaso Teofili
Hi Paul, a colleague and I worked on a QParserPlugin to "expand" alias field names to many existing field names ex: q=mockfield:val ==> q=actualfield1:val OR actualfield2:val but if you want to be able to use other params that come from the HTTP request you should use a custom RequestHandler I thi
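The alias expansion described in this reply (q=mockfield:val ==> q=actualfield1:val OR actualfield2:val) can be illustrated outside Solr with plain string handling. This is only a sketch of the rewriting step: the class AliasExpansionSketch is hypothetical, it handles a single field:value clause, and it is not the QParserPlugin mentioned in the message.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical illustration of alias-field expansion; a real plugin would
// work on the parsed query rather than on the raw string.
final class AliasExpansionSketch {

    // Rewrites "alias:value" into "field1:value OR field2:value" when the
    // alias is known; any other input is returned unchanged.
    static String expand(String clause, Map<String, List<String>> aliases) {
        int colon = clause.indexOf(':');
        if (colon < 0) {
            return clause; // no field prefix, nothing to expand
        }
        String field = clause.substring(0, colon);
        String value = clause.substring(colon + 1);
        List<String> targets = aliases.get(field);
        if (targets == null || targets.isEmpty()) {
            return clause; // not an alias
        }
        StringJoiner expanded = new StringJoiner(" OR ");
        for (String target : targets) {
            expanded.add(target + ":" + value);
        }
        return expanded.toString();
    }

    public static void main(String[] args) {
        Map<String, List<String>> aliases = Collections.singletonMap(
                "mockfield", Arrays.asList("actualfield1", "actualfield2"));
        System.out.println(expand("mockfield:val", aliases));
    }
}
```

Inside Solr the same mapping would live in a query parser or request handler, where the alias table could come from configuration or from request parameters.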

Re: UIMA Error

2011-02-05 Thread Tommaso Teofili
Hi Darx, The other in the basis configuration is the AlchemyAPIAnnotator. Cheers, Tommaso 2011/2/6, Darx Oman : > Hi Tommaso > yes my server isn't connected to the internet. > what other UIMA annotators that I can run which doesn't require an internet > connection? >

Re: UIMA Error

2011-02-05 Thread Tommaso Teofili
Hi Darx, are you running it without an internet connection? As the problem seems to be that the OpenCalais service host cannot be resolved. Remember that you can select which UIMA annotators run inside the OverridingParamsAggregateAEDescriptor.xml. Hope this helps. Tommaso 2011/2/5, Darx Oman : >

Re: solr - uima error

2011-01-30 Thread Tommaso Teofili
I found the issue is in the README.txt as the right class to use is UIMAUpdateRequestProcessorFactory, please change that in your solrconfig. Regards, Tommaso 2011/1/30 Darx Oman > Hi > I already copied "apache-solr-uima-4.0-SNAPSHOT.jar" to solr\lib > but what causing the error is this >

Re: solr - uima error

2011-01-29 Thread Tommaso Teofili
Hi Darx, you need to run 'ant dist' under solr/contrib/uima and then reference the created jar (under solr/contrib/uima/build) inside the solrconfig.xml ( tag) of your instance. Hope this helps, Tommaso 2011/1/29 Darx Oman > I tried to do the uima integration with solr > I followed the steps in t

Re: Searchers and Warmups

2011-01-14 Thread Tommaso Teofili
Hi David, The idea is that you can define some "listeners" which make a list of queries to an IndexSearcher. In particular the firstSearcher event is related to the very first IndexSearcher being created inside the Solr instance while the newSearcher is the event related to the creation of a new In

Solr and UIMA #2

2011-01-04 Thread Tommaso Teofili
Hi all, just a quick notice to let you know that a new component to consume UIMA objects to a (local or remote) Solr instance is available inside UIMA sandbox [1]. Note that this "writes" to Solr from UIMA pipelines (push) while in SOLR-2129 [2] Solr "asks" UIMA to extract metadata while indexing d

Transparent redundancy in Solr

2010-12-15 Thread Tommaso Teofili
Hi all, me, Upayavira and other guys at Sourcesense have collected some Solr architectural views inside the presentation at [1]. For sure one can set up an architecture for failover and resiliency on the "search face" (search slaves with coordinators and distributed search) but I'd like to ask how

Parenthesis in query string

2010-12-15 Thread Tommaso Teofili
Hi all, I've just noticed a strange behavior (or, at least, I didn't expect that), when adding useless parenthesis to a query. Using the lucene query parser in Solr I get no results with the query: * ((( NOT (text:"something"))) AND date <= 2010-12-15) * while I get the expected results when the

Re: Problem with multicore

2010-12-15 Thread Tommaso Teofili
Hi Jörg, I think the first thing you should check is your Ubuntu's encoding, second one is file permissions (BTW why are you sudoing?). Did you try using the bash script under example/exampledocs named "post.sh" (use it like this: 'sh post.sh *.xml') Cheers, Tommaso 2010/12/15 Jörg Agatz > Hall

Re: Taxonomy and Faceting

2010-12-13 Thread Tommaso Teofili
With the SOLR-2129 patch you enable an Apache UIMA [1] pipeline to enrich documents being indexed. The base pipeline provided with the patch uses the following blocks (see OverridingParamsExtServicesAE.xml): AggregateSentenceAE OpenCalaisAnnotator TextKeywordExtractionAED

Re: Indexing documents with SOLR

2010-12-10 Thread Tommaso Teofili
Hi Pankaj, you can find the needed documentation right here [1]. Hope this helps, Tommaso [1] : http://wiki.apache.org/solr/ExtractingRequestHandler 2010/12/10 pankaj bhatt > Hi All, > I am a newbie to SOLR and trying to integrate TIKA + SOLR. > Can anyone please guide me, how to achieve
