Re: What would cause: "SEVERE: java.lang.ClassCastException: com.company.MyCustomTokenizerFactory cannot be cast to org.apache.solr.analysis.TokenizerFactory"

2012-06-08 Thread Aaron Daubman
Just in case it is helpful, here are the relevant pieces of my schema.xml: ---snip--

What would cause: "SEVERE: java.lang.ClassCastException: com.company.MyCustomTokenizerFactory cannot be cast to org.apache.solr.analysis.TokenizerFactory"

2012-06-08 Thread Aaron Daubman
Greetings, I am in the process of updating custom code and schema from Solr 1.4 to 3.6.0 and have run into the following issue with our two custom Tokenizer and Token Filter components. I've been banging my head against this one for far too long, especially since it must be something obvious I'm

Re: Adding Custom-Parser to Tika

2012-06-08 Thread Lance Norskog
The doc is old. Tika hunts for parsers in the classpath now. http://www.lucidimagination.com/search/link?url=https://issues.apache.org/jira/browse/SOLR-2116?focusedCommentId=12977072#action_12977072 On Fri, Jun 8, 2012 at 2:20 PM, Chris Hostetter wrote: > You can specify a "tika.config" option po

Re: Writing custom data import handler for Solr.

2012-06-08 Thread Lance Norskog
The DataImportHandler is a toolkit in Solr. It has a few different kinds of plugins. It is very possible that you do not have to write any Java code. If you have an unusual external data feed (database, file system, Amazon S3 buckets) then you would write a Datasource. The only examples are the so

RE: Writing custom data import handler for Solr.

2012-06-08 Thread ram anam
Hi Eric, I cannot disclose the data source which we are planning to index in Solr, as it is confidential. But the client wants it to be in the form of an Import Handler. We plan to install Solr and our custom data import handlers so that the client can just consume it. Could you please provide me the poin

Re: Boost by Nested Query / Join Needed?

2012-06-08 Thread Chris Hostetter
: For posterity, I think we're going to remove 'preference' data from Solr : indexing and go in the custom Function Query direction with a key-value : store. that would be my suggestion. Assuming you really are modeling candy & users, my guess is the number of distinct candies you have is "very

RE: Adding Custom-Parser to Tika

2012-06-08 Thread Chris Hostetter
You can specify a "tika.config" option pointing at your own tika-config.xml file, which ExtractingRequestHandler will use to configure Tika... http://wiki.apache.org/solr/ExtractingRequestHandler "The tika.config entry points to a file containing a Tika configuration. You would only need th

RE: Adding Custom-Parser to Tika

2012-06-08 Thread spring
The parser must be registered in the service registry (META-INF/services/org.apache.tika.parser.Parser). Just being on the classpath is not enough. > -Original Message- > From: Lance Norskog [mailto:goks...@gmail.com] > Sent: Friday, 8 June 2012 22:38 > To: solr-user@lucene.apache.org
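
The registration described above follows the usual Java service-provider layout: a text file named after the interface, listing one implementation class per line. Below is a minimal Python sketch of packaging such a file into a separate jar, so that tika.jar itself stays untouched. The class name com.example.MyParser and the jar name are hypothetical, and a real jar would also need to contain the compiled parser class.

```python
import tempfile
import zipfile
from pathlib import Path

# Service file path, as quoted in the message above: the fully qualified
# name of the Tika Parser interface under META-INF/services/.
SERVICE_ENTRY = "META-INF/services/org.apache.tika.parser.Parser"


def build_parser_jar(jar_path, parser_classes):
    """Write a jar (a zip archive) holding only the service registration file.

    parser_classes: fully qualified class names, written one per line.
    """
    body = "\n".join(parser_classes) + "\n"
    with zipfile.ZipFile(jar_path, "w") as jar:
        jar.writestr(SERVICE_ENTRY, body)
    return jar_path


def read_registered_parsers(jar_path):
    """Return the class names listed in the jar's service file."""
    with zipfile.ZipFile(jar_path) as jar:
        text = jar.read(SERVICE_ENTRY).decode("utf-8")
    # Skip blank lines and '#' comment lines.
    return [line.strip() for line in text.splitlines()
            if line.strip() and not line.lstrip().startswith("#")]


if __name__ == "__main__":
    out = Path(tempfile.mkdtemp()) / "my-parser-services.jar"
    # com.example.MyParser is a hypothetical class name used for illustration.
    build_parser_jar(out, ["com.example.MyParser"])
    print(read_registered_parsers(out))
```

Whether Solr's classloader picks up the service file from a jar in its lib directories is not confirmed in this thread, so treat that part as something to verify.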

Re: Adding Custom-Parser to Tika

2012-06-08 Thread Lance Norskog
Solr will find libs in top-level directory solr/lib (next to solr.xml) or a lib/ directory inside each core directory. You can put your new parser in a jar file in one of those places. Like this: solr/ solr/solr.xml solr/lib solr/lib/yourjar.jar solr/collection1 solr/collection1/conf solr/collecti

Adding Custom-Parser to Tika

2012-06-08 Thread spring
Hi, I have written a new parser for Tika. The problem is that I have to edit org.apache.tika.parser.Parser in the tika.jar, but I do not want to edit the jar. Is there another way to register the new parser? It must work with a plain AutoDetectParser, since this is used in other parsers directly (e.

Re: Writing custom data import handler for Solr.

2012-06-08 Thread Erick Erickson
You need to back up a bit and describe _why_ you want to do this, perhaps there's an easy way to do what you want. This could easily be an XY problem... For instance, you can write a SolrJ program to index data, which _might_ be what you want. It's a separate process runnable anywhere. See: http:/
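
The message above names SolrJ (Java) for this. As an illustration of the same "separate process that pushes documents to Solr" idea, here is a rough Python sketch that only builds the XML body of an update request. The field names are hypothetical, and sending the body (an HTTP POST to the core's update handler, followed by a commit) is left out.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Build a Solr XML <add> message from a list of dicts.

    A list or tuple value becomes repeated <field> elements, which is how
    multivalued fields are expressed in the XML update format.
    """
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            values = value if isinstance(value, (list, tuple)) else [value]
            for v in values:
                field = ET.SubElement(doc_el, "field", name=name)
                field.text = str(v)  # ElementTree escapes &, < and > on output
    return ET.tostring(add, encoding="unicode")


if __name__ == "__main__":
    # "id" and "title" are hypothetical field names; use those in your schema.
    print(build_add_message([{"id": "1", "title": "a & b"}]))
```

Building the message separately from sending it keeps the data-source logic testable without a running Solr instance.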

Writing custom data import handler for Solr.

2012-06-08 Thread ram anam
Hi, I am planning to write a custom data import handler for Solr for some data source. Could you give me some pointers to documentation and examples on how to write a custom data import handler and how to integrate it with Solr? Thank you for your help. Thanks and regards, Ram Anam.

Re: Help! Confused about using Jquery for the Search query - Want to ditch it

2012-06-08 Thread Roman Chyla
Hi, what you want to do is not that difficult; you can use JSON, e.g.: try: conn = urllib.urlopen(url, params) page = conn.read() rsp = simplejson.loads(page) conn.close() return rsp except Exception, e: log.error(str(e)) log.error(page
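
The snippet above is Python 2 and is cut off in the archive. The following is a Python 3 sketch of the same idea, not the poster's code: request JSON from Solr with wt=json and read the documents from the parsed response. The base URL, the /select handler path and the helper names are assumptions for illustration.

```python
import json
import logging
from urllib.parse import urlencode
from urllib.request import urlopen

log = logging.getLogger(__name__)


def build_select_url(base_url, query, rows=10):
    """Build a select URL that asks Solr for a JSON response (wt=json)."""
    params = urlencode({"q": query, "rows": rows, "wt": "json"})
    return f"{base_url}/select?{params}"


def parse_response(page):
    """Return the list of documents from a Solr JSON response body."""
    rsp = json.loads(page)
    return rsp["response"]["docs"]


def search(base_url, query, rows=10):
    """Run the query and return the documents, or None on failure."""
    url = build_select_url(base_url, query, rows)
    try:
        with urlopen(url) as conn:
            page = conn.read().decode("utf-8")
        return parse_response(page)
    except (OSError, ValueError, KeyError) as e:
        # Network errors, malformed JSON, or an unexpected response shape.
        log.error(str(e))
        return None
```

For example, search("http://localhost:8983/solr", "title:solr") would return a list of dicts if a Solr instance is listening at that (assumed) address.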

Re: ContentStreamUpdateRequest method addFile in 4.0 release.

2012-06-08 Thread Ryan McKinley
for the ExtractingRequestHandler, you can put anything into the request contentType. try: addFile( file, "application/octet-stream" ) but anything should work ryan On Thu, Jun 7, 2012 at 2:32 PM, Koorosh Vakhshoori wrote: > In latest 4.0 release, the addFile() method has a new argument 'con

terms count in multivalues field

2012-06-08 Thread preetesh dubey
Is it possible to get the number of entries in a multivalued field with a Solr query? Let's say I want to query Solr for all documents where the count of some multivalued field is > 1. Is that possible in Solr? -- Thanks & Regards Preetesh Dubey

Re: defaultSearchField and param df are messed up in 3.6.x

2012-06-08 Thread Jack Krupansky
Besides the obvious need to clean up the getDefaultSearchFieldName references, I would also suggest that the "df" param have a hard-wired default of "text" since that is the obvious default. -- Jack Krupansky -Original Message- From: Bernd Fehling Sent: Friday, June 08, 2012 10:15 AM

Re: highlighter not respecting sentence boundary

2012-06-08 Thread abhayd
Hi, here is the snippet I get when "i phone" is highlighted: == , a car charger and a battery backup for iPods and iPhones. I expect it to start at the beginning of the sentence. Here is my Solr config: ===

defaultSearchField and param df are messed up in 3.6.x

2012-06-08 Thread Bernd Fehling
Unfortunately, I have to say that defaultSearchField and param df are pretty much messed up in Solr 3.6.x. Yes, I have seen issues SOLR-2724 and SOLR-3292. So if defaultSearchField has been removed (deprecated) from schema.xml, then why are there still calls to "org.apache.solr.schema.IndexSchema.getDefau

RE: per-fieldtype similarity not working

2012-06-08 Thread Markus Jelsma
Excellent! Thanks -Original message- > From:Robert Muir > Sent: Fri 08-Jun-2012 13:06 > To: Markus Jelsma > Cc: solr-user@lucene.apache.org > Subject: Re: per-fieldtype similarity not working > > On Fri, Jun 8, 2012 at 5:04 AM, Markus Jelsma > wrote: > > Thanks Robert, > > > > The

Re: timeAllowed flag in the response

2012-06-08 Thread Michael Kuhlmann
On 08.06.2012 11:55, Laurent Vaills wrote: Hi Michael, Thanks for the details that helped me to take a deeper look in the source code. I noticed that each time a TimeExceededException is caught the method setPartialResults(true) is called...which seems to be what I'm looking for. I have to i

Re: ExtendedDisMax Question - Strange behaviour

2012-06-08 Thread André Maldonado
Thanks, Jack. That is exactly it. My mistake. Thanks -- "And you shall know the truth, and the truth shall set you free." (John 8:32) andre.maldonado@gmail.com (11) 9112-4227

track unused parts of config, schema

2012-06-08 Thread bryan rasmussen
Hi, Our configs and schemas are quite big. Are there any tools, code snippets (in any language), or methodologies that people use to clean these up? By methodologies I mostly mean things to look for that are almost always present and almost never used, so I can look at those first. Thanks, Br

Re: per-fieldtype similarity not working

2012-06-08 Thread Robert Muir
On Fri, Jun 8, 2012 at 5:04 AM, Markus Jelsma wrote: > Thanks Robert, > > The difference in scores is clear now so it shouldn't matter as queryNorm > doesn't affect ranking but coord does. Can you explain why coord is left out > now and why it is considered to skew results and why queryNorm skew

Text appears garbled when I use DIH from Oracle database

2012-06-08 Thread 涂小刚
Hello, when I use DIH to import from an Oracle database, the text appears garbled. Why? PS: my Oracle database uses GBK encoding with Chinese text. How can I solve the problem? Thanks!

Re: How to cap facet counts beyond a specified limit

2012-06-08 Thread Toke Eskildsen
On Thu, 2012-06-07 at 10:01 +0200, Andrew Laird wrote: > For our needs we don't really need to know that a particular facet has > exactly 14,203,527 matches - just knowing that there are "more than a > million" is enough. If I could somehow limit the hit counts to a > million (say) [...] It shoul

Re: timeAllowed flag in the response

2012-06-08 Thread Laurent Vaills
Hi Michael, Thanks for the details that helped me to take a deeper look in the source code. I noticed that each time a TimeExceededException is caught the method setPartialResults(true) is called...which seems to be what I'm looking for. I have to investigate, since this partialResults does not s
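
If the flag set by setPartialResults(true) reaches the client, a check could look like the Python sketch below. It assumes a JSON response (wt=json) and assumes the flag appears as "partialResults" in the responseHeader; this thread does not confirm that for 3.6, so verify it against your own responses first.

```python
import json


def is_partial(response_text):
    """Return True if the response header marks the results as partial.

    Assumption: the flag is exposed as responseHeader.partialResults.
    A missing header or missing flag is treated as "not partial".
    """
    header = json.loads(response_text).get("responseHeader", {})
    return bool(header.get("partialResults", False))


if __name__ == "__main__":
    # Hand-written sample bodies, not captured from a real Solr instance.
    timed_out = '{"responseHeader": {"status": 0, "partialResults": true}}'
    complete = '{"responseHeader": {"status": 0}}'
    print(is_partial(timed_out), is_partial(complete))
```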

Re: what's better for in memory searching?

2012-06-08 Thread Lance Norskog
Yes, use MMapDirectory. It is faster and uses memory more efficiently than RAMDirectory. This sounds wrong, but it is true. With RAMDirectory, Java has to work harder doing garbage collection. On Fri, Jun 8, 2012 at 1:30 AM, Li Li wrote: > hi all >   I want to use lucene 3.6 providing searching s

Re: Carrot2 using rawtext of field for clustering

2012-06-08 Thread Stanislaw Osinski
> > Is there any workaround in Solr/Carrot2 So that we could pass tokens that'd > been filtered with customer tokenizer/filters instead of rawtext that it > currently > uses for clustering ? > > I read an issue in following link too . > > https://issues.apache.org/jira/browse/SOLR-2917 > > > Is wri

RE: per-fieldtype similarity not working

2012-06-08 Thread Markus Jelsma
Thanks Robert, The difference in scores is clear now so it shouldn't matter as queryNorm doesn't affect ranking but coord does. Can you explain why coord is left out now and why it is considered to skew results and why queryNorm skews results? And which specific new ranking algorithms they conf

Re: Sorting performance

2012-06-08 Thread Dmitry Kan
Hi, this may help you get started: https://issues.apache.org/jira/browse/SOLR-1297 Dmitry On Mon, Jun 4, 2012 at 9:51 PM, Gau wrote: > Here is the usecase: > I am using synonym expansion at query time to get results. this is > essentially a name search, so a search for Jim may be expanded

Re: timeAllowed flag in the response

2012-06-08 Thread Michael Kuhlmann
Hi Laurent, alas there is currently no such option. The time limit is handled by an internal TimeLimitingCollector, which is used inside SolrIndexSearcher. Since the using method only returns the DocList and doesn't have access to the QueryResult, it won't be easy to return this information in