Re: Field missing when use distributed search + dismax

2010-06-22 Thread Scott Zhang
Hi. Lance. Thanks for replying. Yes. I especially checked the schema.xml and did another simple test. The broker is running on localhost:7499/solr. A solr instance is running on localhost:7498/solr. For this test, I only use these 2 instances. 7499's index is empty. 7498 has 12 documents in inde

Re: Nested table support ability

2010-06-22 Thread amit_ak
Hi Otis, Thanks for the update. My parametric search has to span the customer table and 30 child tables. We have close to 1 million customers. Do you think Lucene/Solr is the right solution for such requirements, or would a database search be more optimal? Regards, Amit -- View this message

about function query

2010-06-22 Thread Li Li
I want to integrate a document's timestamp into the scoring of search. I found an example in the book "Solr 1.4 Enterprise Search Server" about function query. I want to boost documents that are newer, so it may be a function such as 1/(timestamp+1). But the function query is added to the final resu
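
As a rough illustration of the idea in this post: a reciprocal of the document's age (rather than of the raw timestamp, which is larger for newer documents) yields a larger value for newer documents. The sketch below is plain Python, not Solr code. The name `recency_boost`, the constants `m`, `a`, `b`, and the sample "now" value are all assumptions chosen for illustration; the shape is a / (m * age + b).

```python
def recency_boost(doc_timestamp_ms, now_ms, m=3.16e-11, a=1.0, b=1.0):
    """Return a / (m * age + b), where age = now - document timestamp (ms).

    All parameter names and defaults are hypothetical. With m close to
    1 / (milliseconds per year), a one-year-old document gets roughly
    half the boost of a brand-new one.
    """
    age_ms = max(0, now_ms - doc_timestamp_ms)  # clamp future timestamps to age 0
    return a / (m * age_ms + b)


now = 1_277_164_800_000                    # hypothetical "now" in epoch milliseconds
one_year_ms = 365 * 24 * 60 * 60 * 1000

fresh = recency_boost(now, now)            # document indexed just now
year_old = recency_boost(now - one_year_ms, now)
```

Whether such a value is added to or multiplied into the relevance score changes how strongly it affects ranking, which appears to be the concern raised above.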

Re: Field missing when use distributed search + dismax

2010-06-22 Thread Lance Norskog
Do all of the Solr instances, including the broker, use the same schema.xml? On 6/22/10, Scott Zhang wrote: > Hi. All. >I was using distributed search over 30 solr instance, the previous one > was using the standard query handler. And the result was returned correctly. > each result has 2 fie

Re: Help with highlighting

2010-06-22 Thread Erik Hatcher
You need to share with us the Solr request you made, and any custom request handler settings it might map to. Chances are you just need to twiddle with the highlighter parameters (see wiki for docs) to get it to do what you want. Erik On Jun 22, 2010, at 4:42 PM, n...@frameweld.

Re: collapse exception

2010-06-22 Thread Erik Hatcher
Martijn - Maybe the patches to SolrIndexSearcher could be extracted into a new issue so that we can put in the infrastructure at least. That way this could truly be a drop-in plugin without it actually being in core. I haven't looked at the specifics, but I imagine we could get the core s

Re: SOLR partial string matching question

2010-06-22 Thread Joe Calderon
you want a combination of WhitespaceTokenizer and EdgeNGramFilter http://lucene.apache.org/solr/api/org/apache/solr/analysis/WhitespaceTokenizerFactory.html http://lucene.apache.org/solr/api/org/apache/solr/analysis/EdgeNGramFilterFactory.html the first will create tokens for each word the second
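
To make the suggestion above concrete, here is a small Python sketch (not Solr code) of what splitting on whitespace and then emitting edge n-grams for each token produces, using the example from the original question. The function names and the `min_gram`/`max_gram` defaults are assumptions for illustration. The sketch assumes the query side is only split on whitespace, and it ignores token positions, so it does not check word order.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the leading-edge n-grams (prefixes) of a single token."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


def analyze(text, min_gram=1, max_gram=15):
    """Split on whitespace, then expand every token into its edge n-grams."""
    grams = []
    for token in text.split():
        grams.extend(edge_ngrams(token, min_gram, max_gram))
    return grams


# Index side: every prefix of every word becomes a searchable term.
indexed_terms = set(analyze("bank of america"))

# Query side (assumed): whitespace split only, no n-gramming.
query_terms = "of ameri".split()
matches = all(term in indexed_terms for term in query_terms)
```

Here "of" and "ameri" are both prefixes of indexed words, so the partial query matches, while a mid-word fragment such as "meri" would not.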

SOLR partial string matching question

2010-06-22 Thread Vladimir Sutskever
Hi, Can you guys make a recommendation for which types/filters to use to accomplish the following partial keyword match: A. Actual Indexed Term: "bank of america" B. User Enters Search Term: "of ameri" I would like SOLR to match document "bank of america" with the partial string "of ameri"

Re: collapse exception

2010-06-22 Thread Martijn v Groningen
I checked your stacktrace and I can't remember putting SolrIndexSearcher.getDocListAndSet(...) in the doQuery(...) method. I guess the patch was modified before it was applied. I think the error occurs when you do a field collapse search with a fq parameter. That is the only reason I can think of w

Re: Field Collapsing SOLR-236

2010-06-22 Thread Martijn v Groningen
What exactly did not work? Patching, compiling or running it? On 22 June 2010 16:06, Rakhi Khatwani wrote: > Hi, >      I tried checking out the latest code (rev 956715) the patch did not > work on it. > Infact i even tried hunting for the revision mentioned earlier in this > thread (i.e. rev 955

Help with highlighting

2010-06-22 Thread noel
Hi, I need help with highlighting fields that would match a query. So far, my results only highlight if the field is from all_text, and I would like it to use other fields. It simply isn't the case if I just turn highlighting on. Any ideas why it only applies to all_text? Here is my schema:

Re: OOM on sorting on dynamic fields

2010-06-22 Thread Matteo Fiandesio
The fields I'm sorting on are dynamic, so one query sorts on erick_time_1, erick_timeA_1 and another sorts on erick_time_2, and so on. What we see in the heap are a lot of arrays, most of them filled with 0s, maybe due to the fact that these timestamp fields are not present in all the documents. By the w

Re: anyone use hadoop+solr?

2010-06-22 Thread Blargy
Muneeb Ali wrote: > > Hi Blargy, > > Nice to hear that I am not alone ;) > > Well we have been using Hadoop for other data-intensive services, those > that can be done in parallel. We have multiple nodes, which are used by > Hadoop for all our MapReduce jobs. I personally don't have much expe

Re: example for searching hibernate entities

2010-06-22 Thread Peter Karich
as always: it depends. take a look into hibernate search also, which is lucene powered. Peter. > I have complex data model with bi directional relations I Use hibernate > as ORM provider, so I have several model objects representing data model. All > together my model objects are 75 to 100 an

Re: anyone use hadoop+solr?

2010-06-22 Thread Jason Rutherglen
We (Attensity Group) have been using SOLR-1301 for 6+ months now because we have a ready Hadoop cluster and need to be able to re/index up to 3 billion docs. I read the various emails and wasn't sure what you're asking. Cheers... On Tue, Jun 22, 2010 at 8:27 AM, Neeb wrote: > > Hey James, > > J

Performance related question on DISMAX handler..

2010-06-22 Thread bbarani
Hi, I just want to know if there will be any overhead / performance degradation if I use the Dismax search handler instead of standard search handler? We are planning to index millions of documents and not sure if using Dismax will slow down the search performance. Would be great if someone can

Re: Change the Solr searcher

2010-06-22 Thread Erik Hatcher
Sounds like what you want is to override Solr's "query" component. Have a look at the built-in one and go from there. Erik On Jun 22, 2010, at 1:38 PM, sarfaraz masood wrote: I am a novice in solr / lucene. but i have gone thru the documentations of both.I have even implemented prog

Change the Solr searcher

2010-06-22 Thread sarfaraz masood
I am a novice in Solr/Lucene, but I have gone through the documentation of both. I have even implemented programs in Lucene for searching etc. My problem is to apply a new search technique other than the one used by Solr. Now as I know that Lucene has its own searcher which is used by Solr as wel

Re: OOM on sorting on dynamic fields

2010-06-22 Thread Erick Erickson
Hmmm, I'm missing something here then. Sorting over 15 fields of type long shouldn't use much memory, even if all the values are unique. When you say "12-15 dynamic fields", are you talking about 12-15 fields per query out of XXX total fields? And is XXX large? At a guess, how many different fields

Re: solr with hadoop

2010-06-22 Thread Jon Baer
I was playing around w/ Sqoop the other day, it's a simple Cloudera tool for imports (mysql -> hdfs) @ http://www.cloudera.com/developers/downloads/sqoop/ It seems to me it would be pretty efficient to dump to HDFS and have something like Data Import Handler be able to read from hdfs:// directl

Re: OOM on sorting on dynamic fields

2010-06-22 Thread Matteo Fiandesio
Hi Erick, the index is quite small (1691145 docs) but sorting is massive and often on unique timestamp fields. OOMs occur after three to four hours, depending as well on whether users browse a part of the application. We use solrj to make the queries so we did not use Readers obje

Re: anyone use hadoop+solr?

2010-06-22 Thread Muneeb Ali
Hi Blargy, Nice to hear that I am not alone ;) Well we have been using Hadoop for other data-intensive services, those that can be done in parallel. We have multiple nodes, which are used by Hadoop for all our MapReduce jobs. I personally don't have much experience with its use and hence wouldn

Re: anyone use hadoop+solr?

2010-06-22 Thread Marc Sturlese
Well, the patch consumes the data from a csv. You have to modify the input to use TableInputFormat (I don't remember if it's called exactly like that) and it will work. Once you've done that, you have to specify as many reducers as shards you want. I know 2 ways to index using hadoop method 1 (so

Re: Data Import Handler Rich Format Documents

2010-06-22 Thread Tod
On 6/18/2010 2:42 PM, Chris Hostetter wrote: : > I don't think DIH can do that, but who knows, let's see what others say. : Looks like the ExtractingRequestHandler uses Tika as well. I might just use : this but I'm wondering if there will be a large performance difference between : using it to

Re: anyone use hadoop+solr?

2010-06-22 Thread Blargy
Neeb, Seems like we are in the same boat. Our index consists of 5M records, which roughly equals around 30 gigs. All in all that's not too bad, however our indexing process (we use DIH but I'm now revisiting that idea) takes a whopping 30+ hours!!! I just bought the Hadoop In Action early edition b

Re: solr with hadoop

2010-06-22 Thread MitchK
I wanted to add a Jira issue about exactly what Otis is asking here. Unfortunately, I haven't had time for it because of my exams. However, I'd like to add a question to Otis' ones: If you distribute the indexing process this way, are you able to replicate the different documents correctly? Thank y

Re: anyone use hadoop+solr?

2010-06-22 Thread Neeb
Thanks Marc, Well I have an HBASE storage architecture and solr master-slave setup with two slave servers. Would this patch work with my setup? Do I need sharding in place? and what tasks would be run at map and reduce phases? I was thinking something like: At Map: read documents as key/value

Re: solr with hadoop

2010-06-22 Thread Marc Sturlese
I think a good solution could be to use hadoop with SOLR-1301 to build solr shards and then use solr distributed search against these shards (you will have to copy to local from HDFS to search against them) -- View this message in context: http://lucene.472066.n3.nabble.com/solr-with-hadoop-tp48

Re: anyone use hadoop+solr?

2010-06-22 Thread Marc Sturlese
I think there are people using this patch in production: https://issues.apache.org/jira/browse/SOLR-1301 I have tested it myself indexing data from CSV and from HBase and it works properly -- View this message in context: http://lucene.472066.n3.nabble.com/anyone-use-hadoop-solr-tp485333p914553.ht

Re: Configuring RequestHandler in solrconfig.xml OR in the Servlet code using SolrJ

2010-06-22 Thread Sven Maurmann
Hi, there are reasons for both options. Usually it is a good idea to put the default configuration into the solrconfig.xml (and even fix some of the configuration) in order to have simple client-side code. But sometimes it is necessary to have some flexibility for the actual query. In this si

Re: Configuring RequestHandler in solrconfig.xml OR in the Servlet code using SolrJ

2010-06-22 Thread Jan Høydahl / Cominvent
Hi, Sometimes I do both. I put the defaults in solrconfig.xml and thus have one place to define all kinds of low-level default settings. But then I make it possible in the application space to add/override any parameters as well. This gives you great flexibility to let server administrators (

Re: solr with hadoop

2010-06-22 Thread Neeb
Hi, We currently have a master-slave setup for solr with two slave servers. We are using Solrj (stream-update-solr-server) to index master slave, which takes 6 hours to index around 15 million documents. I would like to explore hadoop, in particular for indexing jobs using a mapreduce approach.

Re: anyone use hadoop+solr?

2010-06-22 Thread Neeb
Hey James, Just wondering if you ever had a chance to try out hadoop with solr? Would appreciate any information/directions you could give. I am particularly interested in indexing using a mapreduce job. Cheers, -Ali -- View this message in context: http://lucene.472066.n3.nabble.com/anyone-u

Field missing when use distributed search + dismax

2010-06-22 Thread Scott Zhang
Hi. All. I was using distributed search over 30 solr instance, the previous one was using the standard query handler. And the result was returned correctly. each result has 2 fields. "ID" and "type". Today I want to use search with dismax, I tried search with each instance with dismax. It wo

Re: OOM on sorting on dynamic fields

2010-06-22 Thread Erick Erickson
H.. A couple of details I'm wondering about. How many documents are we talking about in your index? Do you get OOMs when you start fresh or does it take a while? You've done some good investigations, so it seems like there could well be something else going on here than just "the usual suspect

RE: example for searching hibernate entities

2010-06-22 Thread Fornoville, Tom
Have you already looked at Hibernate Search? It combines Hibernate ORM with indexing/searching functionality of Lucene. The latest version even comes with the Solr analyzers. http://www.hibernate.org/subprojects/search.html Regards, Tom -Original Message- From: fachhoch [mailto:fachh...@

Re: Searching across multiple repeating fields

2010-06-22 Thread Geert-Jan Brits
Perhaps my answer is useless, bc I don't have an answer to your direct question, but: You *might* want to consider if your concept of a solr-document is on the correct granular level, i.e: your problem posted could be tackled (afaik) by defining a document being a 'sub-event' with only 1 daterang

Re: performance sorting multivalued field

2010-06-22 Thread Erick Erickson
Curiosity is good . Do be aware, though, that the behavior is not guaranteed, it's just "how things happen to work" and may change without warning Erick On Tue, Jun 22, 2010 at 4:01 AM, Marc Sturlese wrote: > > >>Well, sorting requires that all the unique values in the target field > >>get l

example for searching hibernate entities

2010-06-22 Thread fachhoch
I have a complex data model with bidirectional relations. I use Hibernate as the ORM provider, so I have several model objects representing the data model. Altogether I have 75 to 100 model objects, and each table in my database has several records, like 20,000. Please suggest in my case will text search

Re: Field Collapsing SOLR-236

2010-06-22 Thread Rakhi Khatwani
Hi, I tried checking out the latest code (rev 956715), but the patch did not work on it. In fact, I even tried hunting for the revision mentioned earlier in this thread (i.e. rev 955615) but cannot find it in the repository (it has revision 955569 followed by revision 955785). Any pointers? Regar

Searching across multiple repeating fields

2010-06-22 Thread Mark Allan
Hi all, Firstly, I apologise for the length of this email but I need to describe properly what I'm doing before I get to the problem! I'm working on a project just now which requires the ability to store and search on temporal coverage data - ie. a field which specifies a date range durin

How to wait for StreamingUpdateSolrServer to finish?

2010-06-22 Thread Stephen Duncan Jr
I'm prototyping using StreamingUpdateSolrServer. I want to send a commit (or optimize) after I'm done adding all of my docs, rather than wait for the autoCommit to kick in. However, since StreamingUpdateSolrServer is multi-threaded, I can't simply call commit when I'm done, because that can happe
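
The race described above (a commit issued while background threads still hold queued documents) can be sketched in general terms. The Python below is not SolrJ code and does not use any Solr API; `run_indexing`, the `sent` list standing in for the server, and the snapshot standing in for a commit are all hypothetical. It only illustrates the pattern of blocking until the work queue has fully drained before committing.

```python
import queue
import threading


def run_indexing(docs, num_workers=3):
    """Send docs through worker threads, then 'commit' only after the queue drains."""
    pending = queue.Queue()
    sent = []                      # stand-in for documents received by the server
    lock = threading.Lock()

    def worker():
        while True:
            doc = pending.get()
            try:
                if doc is None:    # sentinel: no more work for this worker
                    return
                with lock:
                    sent.append(doc)
            finally:
                pending.task_done()

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()

    for doc in docs:
        pending.put(doc)

    pending.join()                 # block until every queued doc has been handled
    committed = list(sent)         # stand-in for issuing the commit

    for _ in threads:              # shut the workers down
        pending.put(None)
    for t in threads:
        t.join()
    return committed


committed = run_indexing(range(50))
```

The key step is the blocking call before the commit; without it, the snapshot could be taken while some documents are still waiting in the queue.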

[NEWS] New Response Writer for Native PHP Solr Client

2010-06-22 Thread Israel Ekpo
Hi Solr users, If you are using Apache Solr via PHP, I have some good news for you. There is a new response writer for the PHP native extension, currently available as a plugin. This new feature adds a new response writer class to the org.apache.solr.request package. This class is used by the P

Re: Alternative for field collapsing

2010-06-22 Thread Rakhi Khatwani
Thanks Peter :) On Tue, Jun 22, 2010 at 3:08 PM, Peter Karich wrote: > ups, sorry. I meant Martijn! Not the germanized Martin :-/ > > Peter. > > > Hi, > > I wanted to apply field collapsing on the title(type string). but > > want to show only one document (and the count of such documents

Re: Alternative for field collapsing

2010-06-22 Thread Peter Karich
ups, sorry. I meant Martijn! Not the germanized Martin :-/ Peter. > Hi, > I wanted to apply field collapsing on the title(type string). but > want to show only one document (and the count of such documents) per title > rather than show all the documents. > > Regards > Raakhi > > > On Tue,

Re: Alternative for field collapsing

2010-06-22 Thread Peter Karich
Hi Raakhi, yes, then the collapse patch works perfectly in our case. If you don't get the patch applied correctly, try asking directly here: https://issues.apache.org/jira/browse/SOLR-236 I did the same and got an immediate response from Martin & Co or try the latest patch: 2010-06-17 03:08 PM Mar

Re: OOM on sorting on dynamic fields

2010-06-22 Thread Matteo Fiandesio
First of all thanks for your answers. Those OOMEs are pretty nasty for our production environment. I didn't try the solution of ordering by function as it was a solr 1.5 feature and we prefer to use a stable version 1.4. I made a temporary patch that looks like it is working fine. I patched the lucene-

Re: LocalParams?

2010-06-22 Thread Peter Karich
E.g. take a look at: http://www.craftyfella.com/2010/01/faceting-and-multifaceting-syntax-in.html Peter. > Huh? Read through the wiki: See http://wiki.apache.org/solr/LocalParams but I > still don't understand its utility? > > Can someone explain to me why this would even be used? Any examples t

Re: performance sorting multivalued field

2010-06-22 Thread Marc Sturlese
>>Well, sorting requires that all the unique values in the target field >>get loaded into memory That's what I thought, thanks. >>But a larger question is whether what your doing is worthwhile >>even as just a measurement. You say >>"This is good for me, I don't care for my tests". I claim that >>

RE: solr string field

2010-06-22 Thread ZAROGKIKAS,GIORGOS
It's ok It was a problem with my schema Thanks anyway -Original Message- From: Erik Hatcher [mailto:erik.hatc...@gmail.com] Sent: Monday, June 21, 2010 5:09 PM To: solr-user@lucene.apache.org Subject: Re: solr string field Or even better for an exact string query: q={!raw f=field_

Re: Alternative for field collapsing

2010-06-22 Thread Rakhi Khatwani
Hi, I wanted to apply field collapsing on the title(type string). but want to show only one document (and the count of such documents) per title rather than show all the documents. Regards Raakhi On Tue, Jun 22, 2010 at 12:59 AM, Peter Karich wrote: > Hi Raakhi, > > First, field collap

Re: OOM on sorting on dynamic fields

2010-06-22 Thread Lance Norskog
No, this is basic to how Lucene works. You will need larger EC2 instances. On Mon, Jun 21, 2010 at 2:08 AM, Matteo Fiandesio wrote: > Compiling solr with lucene 2.9.3 instead of 2.9.1 will solve this issue? > Regards, > Matteo > > On 19 June 2010 02:28, Lance Norskog wrote: >> The Lucene impleme

Re: Mr Lance : customize the search algorithm of solr

2010-06-22 Thread Lance Norskog
Solr depends on Lucene's implementation of queries and how it returns document hits. I can't help you architect these changes. On Mon, Jun 21, 2010 at 7:47 AM, sarfaraz masood wrote: > Mr Lance > > Thanks > a lot for ur reply.. I am a novice a solr / lucene. but i have gone > thru the documentati