Re: autosuggest combination of data from documents and popular queries

2011-09-29 Thread abhayd
Hi Hoss, This helps. The only thing I am not sure about is the use of TermsComponent. As I understand it, TermsComponent allows sorting only on count|index, so I'm not sure how popularity could be used for sort or boost. Any thoughts around using TermsComponent with popularity? If this is possible then i dont thin

32-bit to 64-bit

2011-09-29 Thread - -
Hi, I indexed my data on my 32-bit computer. Do I need to re-index if I upload my data to a 64-bit server, or would copying the data directories suffice? Thank you.

About solr distributed search

2011-09-29 Thread Pengkai Qin
Hi all, I'm doing research on Solr distributed search, and it is said that distributed search becomes reasonable beyond one million documents. So I want to know: does anyone have test results (such as time cost) comparing a single index with distributed search on more than one million documents? I

Re: SOLR Index Speed

2011-09-29 Thread Lord Khan Han
Hi, The no-op run completed in 20 minutes. The only commented-out line was "solr.addBean(doc)". We've tried SUSS as a drop-in replacement for CommonsHttpSolrServer, but its behavior was weird. We have seen 10Ks of seconds for updates and it continues for a very long time after sending to solr is comple

Re: basic solr cloud questions

2011-09-29 Thread Darren Govoni
That was kinda my point. The "new" cloud implementation is not about replication, nor should it be. But rather about horizontal scalability where "nodes" manage different parts of a unified index. One of the design goals of the "new" cloud implementation is for this to happen more or less automati

Re: Query failing because of omitTermFreqAndPositions

2011-09-29 Thread Michael McCandless
Once a given field has omitted positions in the past, even for just one document, it "sticks" and that field will forever omit positions. Try creating a new index, never omitting positions from that field? Mike McCandless http://blog.mikemccandless.com On Thu, Sep 29, 2011 at 1:14 AM, Isan Fuli

Re: SolrCloud: is there a programmatic way to create an ensemble

2011-09-29 Thread Yury Kats
Nope On 9/29/2011 12:17 AM, Pulkit Singhal wrote: > Did you find out about this? > > 2011/8/2 Yury Kats : >> I have multiple SolrCloud instances, each running its own Zookeeper >> (Solr launched with -DzkRun). >> >> I would like to create an ensemble out of them. I know about -DzkHost >> paramete

Re: basic solr cloud questions

2011-09-29 Thread Yury Kats
On 9/29/2011 7:22 AM, Darren Govoni wrote: > That was kinda my point. The "new" cloud implementation > is not about replication, nor should it be. But rather about > horizontal scalability where "nodes" manage different parts > of a unified index. It's about many things. You stated one, but there

RE: 32-bit to 64-bit

2011-09-29 Thread Jaeger, Jay - DOT
Are you changing just the host OS or the JVM, or both, from 32 bit to 64 bit? If it is just the OS, the answer is definitely no, you don't need to do anything more than copy. If the answer is the JVM, I *think* the answer is still no, but others more authoritative than I may wish to respond. -

Errors in requesthandler statistics

2011-09-29 Thread roySolr
Hello, I was taking a look at my Solr statistics and I see, in the requesthandler section, an error count of 23. How can I see which requests return these errors? Can I log this somewhere? Thanks, Roy

Re: Solr stopword problem in Query

2011-09-29 Thread Erick Erickson
I think your problem is that you've set omitTermFreqAndPositions="true" It's not real clear from the Wiki page, but the tricky little phrase "Queries that rely on position that are issued on a field with this option will silently fail to find documents." And phrase queries rely on position info

RE: About solr distributed search

2011-09-29 Thread Jaeger, Jay - DOT
I am no expert, but here is my take and our situation. Firstly, are you asking what the minimum number of documents is before it makes *any* sense at all to use a distributed search, or are you asking what the maximum number of documents is before a distributed search is essentially required?

Re: SolrCloud: is there a programmatic way to create an ensemble

2011-09-29 Thread Mark Miller
That's normally what you want to do - setup a separate quorum for production. On Sep 29, 2011, at 1:36 AM, Jamie Johnson wrote: > I'm not a solrcloud guru, but why not start your zookeeper quorum separately? > > I also believe that you can specify a zoo.cfg file which will create a > zk quorum f

Re: DIH when using XML Files questions

2011-09-29 Thread Erick Erickson
Specific replies below, but what I'd seriously consider is writing my own filesystem-aware hook that pushed documents to known Solr servers rather than using DIH to pull them. You could use the code from FileSystemEntityProcessor as a base and go from there. The FileSystemEntityProcessor isn't real

RE: Errors in requesthandler statistics

2011-09-29 Thread Jaeger, Jay - DOT
I am no expert, but based on my experience, the information you are looking for should indeed be in your logs. There are at least three logs you might look for / at: - An HTTP request log - The solr log - Logging by the application server / JVM Some information is available at http://wiki.apac

Re: How to reserve ids?

2011-09-29 Thread Erick Erickson
Hmmm, if treating them as stopwords, wouldn't you have to list all the possible variants? E.g. mystuff.msn.com yourstuff.msn.com etc? Is that sufficient or do you want *.msn.com (which isn't legal in a stopword file as far as I know)? Best Erick On Tue, Sep 27, 2011 at 11:39 PM, Otis Gospodnetic

Indexing geohash in solrj - Multivalued spatial search

2011-09-29 Thread Alessandro Benedetti
Hi all, I have already read the topics in the mailing list that are regarding spatial search, but I haven't found an answer ... I have to index a multivalued field of type : "geohash" via solrj. Now I build a string with the lat and lon comma separated ( like 54.569468,67.58494 ) and index it in t

RE: Errors in requesthandler statistics

2011-09-29 Thread roySolr
Hi, Thanks for your answer. I have some logging by jetty. Every request looks like this: 2011-09-29T12:28:47 1317292127479 18470 org.apache.solr.core.SolrCore INFO org.apache.solr.core.SolrCore execute 20 [] webapp=/solr path=/select/ params={spellcheck=true&facet=true&sort=ge

RE: Errors in requesthandler statistics

2011-09-29 Thread Jaeger, Jay - DOT
If you are asking how to tell which of 94000 records failed in a SINGLE HTTP update request, I have no idea, but I suspect that you cannot necessarily tell. It might help if you copied and pasted what you find in the solr log for the failure (see my previous response for how to figure out where

Re: Upgrading from 3.1 to 3.4

2011-09-29 Thread Erick Erickson
They should be outlined in CHANGES.txt if there are any. But usually changes to minor versions don't require any special steps... Best Erick On Wed, Sep 28, 2011 at 4:14 AM, Rohit wrote: > I have been using solr 3.1 am planning to update to solr 3.4, whats the > steps to be followed or anything

Re: Distributed search has problems with some field names

2011-09-29 Thread Erick Erickson
I know I've seen other anomalies with odd characters in field names. In general, it's much safer to use only letters, numbers, and underscores. In fact, I even prefer lowercase letters. Since you're pretty sure those work, why not just use them? Best Erick On Wed, Sep 28, 2011 at 6:59 AM, Luis Ne
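The naming advice above can be turned into a quick check before fields are added to a schema. This is a minimal sketch, not a rule enforced by Solr: the function name `is_safe_field_name` is hypothetical, and the pattern simply encodes "lowercase letters, numbers, and underscores", with the additional assumption that a name should not start with a digit.

```python
import re

# Conservative pattern based on the advice in this thread:
# lowercase letters, digits, and underscores only.
# Disallowing a leading digit is an extra assumption, not a Solr requirement.
SAFE_FIELD_NAME = re.compile(r"[a-z_][a-z0-9_]*")


def is_safe_field_name(name: str) -> bool:
    """Return True if the whole name matches the conservative pattern."""
    return SAFE_FIELD_NAME.fullmatch(name) is not None
```

Under this check, a name such as `product_name_2` passes, while `product-name` and `Price` are rejected.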

Solr on OC4J

2011-09-29 Thread Raja Ghulam Rasool
Hi all, I have installed Solr on OC4J, but when I try to access the admin page it throws a 'StackOverflowError': Sep 28, 2011 3:35:25 PM org.apache.solr.common.SolrException log SEVERE: java.lang.StackOverflowError Is there something I am doing wrong? Any tweak or config that I need to change? p

Re: Solr on OC4J

2011-09-29 Thread Raja Ghulam Rasool
Just to explain a bit more, OC4J standalone version is 10.1.3.5.0 and Solr version is 3.4.0. Any help will be greatly appreciated guys :) On Thu, Sep 29, 2011 at 6:15 PM, Raja Ghulam Rasool wrote: > Hi all, > > I have installed solr on oc4j. but when i try to access the admin page it > throws

Re: synonym filtering at index time

2011-09-29 Thread Erick Erickson
Biggest red flag is "KeywordTokenizerFactory". You don't say whether your input is multi-word or not, but that tokenizer does NOT break up input, so even the input "my watche" would not trigger a synonym substitution. Try something like WhitespaceTokenizer. Second red flag. Changing your analy
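The tokenizer point can be illustrated with a toy model. This is only a sketch of the idea and does not use Solr's analysis chain: the two tokenizer functions imitate the behavior of a keyword tokenizer (whole input as one token) and a whitespace tokenizer, and the synonym map `watche -> watch` is a hypothetical example, since the poster's actual synonym file is not shown.

```python
def keyword_tokenize(text):
    """Imitates a keyword tokenizer: the entire input is a single token."""
    return [text]


def whitespace_tokenize(text):
    """Imitates a whitespace tokenizer: split the input on whitespace."""
    return text.split()


# Hypothetical single-token synonym mapping, for illustration only.
SYNONYMS = {"watche": "watch"}


def apply_synonyms(tokens, synonyms):
    """Replace each token that has a synonym entry; leave others untouched."""
    return [synonyms.get(token, token) for token in tokens]
```

With the input `my watche`, the keyword-style tokenizer yields the single token `my watche`, which has no entry in the map, so nothing is substituted. The whitespace-style tokenizer yields `my` and `watche`, and the second token is replaced.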

Re: Indexing geohash in solrj - Multivalued spatial search

2011-09-29 Thread Smiley, David W.
Hi Alessandro. I can't think of any good reason anyone would use the geohash field type that is a part of Solr today. If you are shocked I would say that, keep in mind the work I've done with geohashes is an extension of what's in Solr, it's not what's in Solr today. Recently I ported SOLR-2155

Re: basic solr cloud questions

2011-09-29 Thread Darren Govoni
Agree. Thanks also for clarifying. It helps. On 09/29/2011 08:50 AM, Yury Kats wrote: On 9/29/2011 7:22 AM, Darren Govoni wrote: That was kinda my point. The "new" cloud implementation is not about replication, nor should it be. But rather about horizontal scalability where "nodes" manage diffe

Re: basic solr cloud questions

2011-09-29 Thread Sami Siren
2011/9/29 Yury Kats : > True, but there is a big gap between goals and current state. > Right now, there is distributed search, but not distributed indexing > or auto-sharding, or auto-replication. So if you want to use the SolrCloud > now (as many of us do), you need do a number of things yourself

Re: Distributed search has problems with some field names

2011-09-29 Thread Luis Neves
Hi, On 09/29/2011 03:10 PM, Erick Erickson wrote: I know I've seen other anomalies with odd characters in field names. In general, it's much safer to use only letters, numbers, and underscores. In fact, I even prefer lowercase letters. Since you're pretty sure those work, why not just use them?

Re: Errors in requesthandler statistics

2011-09-29 Thread Shawn Heisey
On 9/29/2011 7:42 AM, roySolr wrote: I have some logging by jetty. Every request looks like this: 2011-09-29T12:28:47 1317292127479 18470 org.apache.solr.core.SolrCore INFO org.apache.solr.core.SolrCore execute 20 [] webapp=/solr path=/select/ params={spellcheck=true&

Query with plus sign failing

2011-09-29 Thread Shawn Heisey
The following query is failing: ((Google +)) This is ultimately reduced to 'google' by my analysis chain, but the following is in my log (3.2.0, but 3.4.0 also fails): SEVERE: org.apache.solr.common.SolrException: org.apache.lucene.queryParser.ParseException: Cannot parse '( (Google +))':

PDF indexing

2011-09-29 Thread Jón Helgi Jónsson
Good day, I'm checking if Solr would work for indexing PDFs. My requirements are: 1) I must know which page has what contents. 2) Right-to-left search support, such as Hebrew. This has been the trickiest to achieve. I also prefer to know the position of the searched contents on the page but

Re: Query with plus sign failing

2011-09-29 Thread Erik Hatcher
Just a fact of life with the Lucene query parser. You'll need to escape the + with a backslash for this to work. Erik On Sep 29, 2011, at 12:31 , Shawn Heisey wrote: > The following query is failing: > > ((Google +)) > > This is ultimately reduced to 'google' by my analysis chain, bu
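Escaping can be done on the client before the user's text is placed into a query. The following is a minimal sketch, assuming the character list of the classic Lucene query parser (`+ - ! ( ) : ^ [ ] " { } ~ * ? | & \`); the function name `escape_lucene` is hypothetical. It escapes every special character, so it is meant for user-supplied terms only, not for parts of a query where parentheses or operators are intended as syntax.

```python
# Characters treated as syntax by the classic Lucene query parser
# (an assumption based on its documented special characters).
LUCENE_SPECIAL_CHARS = set('\\+-!():^[]"{}~*?|&')


def escape_lucene(text: str) -> str:
    """Prefix each special character with a backslash."""
    return "".join(
        "\\" + ch if ch in LUCENE_SPECIAL_CHARS else ch
        for ch in text
    )
```

For example, `escape_lucene("Google +")` returns `Google \+`, which the parser reads as a literal plus sign instead of a dangling operator.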

Re: Trouble configuring multicore / accessing admin page

2011-09-29 Thread Joshua Miller
On Sep 28, 2011, at 2:16 PM, Joshua Miller wrote: > On Sep 28, 2011, at 2:11 PM, Jaeger, Jay - DOT wrote: > >> cores adminPath="/admij/cores" >> >> Was that a cut and paste? If so, the /admij/cores is presumably incorrect, >> and ought to be /admin/cores >> > > No, that was a typo -- th

Solr integration with Hbase

2011-09-29 Thread Stuti Awasthi
Hi all, I am a newbie to Solr. I have my application on HBase and Hadoop and I want to provide search functionality using Solr. I read http://wiki.apache.org/solr/DataImportHandler and learned that there is support for SQL databases. My question is: Is Solr also good for NoSQL like databas

Re: About solr distributed search

2011-09-29 Thread Gregor Kaczor
Hi Pengkai, my experience is based on http://www.findfiles.net/ which holds >700 Mio documents, each about 2kb size. A single Index containing that kind of data should hold below 80 Mio documents. In case you have complex queries with lots of facets, sorting, function queries then even 50 Mi
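The figures in this message translate into a quick shard estimate. This is a back-of-the-envelope sketch only: the function name `shards_needed` is hypothetical, and the per-index cap is whatever limit fits your own documents and queries (80 million is the figure quoted above for roughly 2 KB documents).

```python
def shards_needed(total_docs: int, max_docs_per_index: int) -> int:
    """Smallest number of indexes so that none exceeds the per-index cap."""
    if total_docs < 0 or max_docs_per_index <= 0:
        raise ValueError("total_docs must be >= 0 and the cap must be > 0")
    # Integer ceiling division avoids floating-point rounding.
    return (total_docs + max_docs_per_index - 1) // max_docs_per_index
```

With 700 million documents and a cap of 80 million per index, `shards_needed(700_000_000, 80_000_000)` gives 9.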

Automate startup/shutdown of SolrCloud Shards

2011-09-29 Thread Jamie Johnson
I am trying to automate the startup/shutdown of SolrCloud shards and have noticed that there is a bit of a timing issue: if the server which is to bootstrap ZK with the configs does not complete its process (i.e. there is no data at the Conf yet), the other servers will fail to start. An obvi
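The message is cut off before the poster's own idea, so the following is only one possible approach, not the author's: have the startup script poll until the bootstrap data is present before launching the remaining servers. The helper `wait_until` and its `is_ready` callback are hypothetical; `is_ready` stands in for whatever check confirms that the configs exist in ZooKeeper.

```python
import time


def wait_until(is_ready, timeout_s=60.0, interval_s=1.0,
               sleep=time.sleep, clock=time.monotonic):
    """Poll is_ready() until it returns True or the timeout expires.

    Returns True if the condition was met, False on timeout.
    """
    deadline = clock() + timeout_s
    while True:
        if is_ready():
            return True
        if clock() >= deadline:
            return False
        sleep(interval_s)
```

A startup script would call `wait_until(configs_present)` (with `configs_present` supplied by you) and start the other shards only if it returns True.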

Re: Solr integration with Hbase

2011-09-29 Thread pulkitsinghal
Try lilyproject.com I think they do exactly what you are asking for. Sent from my iPhone On Sep 29, 2011, at 6:27 AM, Stuti Awasthi wrote: > Hi all, > > I am newbee in Solr. I have my application on Hbase and Hadoop and I want to > provide search functionality using Solr. I read > http://wik

Re: Solr integration with Hbase

2011-09-29 Thread Haspadar
http://www.lilyproject.org 2011/9/29 > Try lilyproject.com I think they do exactly what you are asking for. > > Sent from my iPhone > > On Sep 29, 2011, at 6:27 AM, Stuti Awasthi wrote: > > > Hi all, > > > > I am newbee in Solr. I have my application on Hbase and Hadoop and I want > to provide

Re: Indexing geohash in solrj - Multivalued spatial search

2011-09-29 Thread Alessandro Benedetti
Sorry David, I probably misunderstood your reply. What do you mean? I'm using Lucid Works Enterprise 1.8 and, as far as I know, it includes the geohash patch. I have to index a multivalued location field and I have to make location queries on it! So I figured I would use the geohash type ... Any hint about ind

Re: Indexing geohash in solrj - Multivalued spatial search

2011-09-29 Thread Smiley, David W.
On Sep 29, 2011, at 5:10 PM, Alessandro Benedetti wrote: > Sorry David, probably I misunderstood your reply, what do you mean? > > I'm using Lucid Work Enterprise 1.8, and, as I know , it includes geohashes > patch. Solr 3x, trunk, and I suspect Lucid Works Enterprise 2.0 (doubtful 1.8) suppo

removing dynamic fields

2011-09-29 Thread zarni aung
Hi, I've been experimenting with Solr dynamic fields. Here is what I've gathered based on my research. For instance, I have a setup where I am catching undefined custom fields this way. I am using (trie) types by the way. I am dealing with

Re: Getting facet counts for 10,000 most relevant hits

2011-09-29 Thread Lan
I implemented a similar feature for a categorization suggestion service. I did the faceting in the client code, which is not exactly the best performing but it worked very well. It would be nice to have the Solr server do the faceting for performance. Burton-West, Tom wrote: > > If relevance ra
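Client-side faceting over the top hits can be sketched as follows. This is an illustration of the general idea, not the poster's code: it assumes the client has already fetched the results sorted by relevance as a list of dictionaries, and the function name `facet_top_hits` and the field names in the usage note are hypothetical.

```python
from collections import Counter


def facet_top_hits(docs, field, top_n=10000):
    """Count values of `field` across the first `top_n` documents.

    `docs` is assumed to be ordered by relevance, most relevant first.
    Multivalued fields (lists or tuples) contribute one count per value.
    """
    counts = Counter()
    for doc in docs[:top_n]:
        value = doc.get(field)
        if value is None:
            continue  # document has no value for this field
        values = value if isinstance(value, (list, tuple)) else [value]
        counts.update(values)
    return counts
```

Calling `facet_top_hits(docs, "category").most_common(5)` would then list the five most frequent categories among the top hits. As noted above, this moves the work to the client and requires transferring all the top documents, so it trades performance for simplicity.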

dismax with AND/OR combination

2011-09-29 Thread abhayd
Hi, I'm using Solr from trunk (4.0). Also, dismax is set as the default qt with text^2.5 features^1.1 displayName^15.0 mfg^4.0 description^3.0 My query is: q=+"ab sx"+OR+(mfg:abc+OR+sx)+OR+(displayName:abc+OR+sx)&qt=dismax It is not working as I expect. Any wa

Re: dismax with AND/OR combination

2011-09-29 Thread Erick Erickson
Well, you have to tell us what you expected and what you're seeing. Including the output with &debugQuery=on and telling us what you disagree with would be the best way. You might also include your definition from your solrconfig file. You included a fragment of it, but other parts may have bearin

split index horizontally

2011-09-29 Thread Robert Yu
Is there an efficient way to handle my case? Each document has several groups of fields; some of them are updated frequently, some infrequently. Is it possible to maintain the index per group but search over all of them as ONE index? To some extent, it is a three layer of

Re: dismax with AND/OR combination

2011-09-29 Thread yingshou guo
You can't use this kind of query syntax against the dismax query parser. Your query can be understood by the standard query parser or the edismax query parser. The "qt" request parameter is used by Solr to select the request handler plugin, not the query parser. Keep in mind that different query parser can understan
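The distinction between selecting a handler and selecting a parser can be shown by building the request parameters explicitly. This is a minimal sketch, assuming a Solr version that ships the edismax parser and that the parser is chosen with the `defType` request parameter; the function name `build_select_params` and the sample query are illustrative only.

```python
from urllib.parse import urlencode


def build_select_params(query: str, parser: str = "edismax") -> str:
    """Build a URL query string that selects the query parser via defType.

    `qt` is deliberately not set here: it picks the request handler,
    not the query parser.
    """
    return urlencode({"q": query, "defType": parser})
```

For example, `build_select_params("mfg:abc OR sx")` returns `q=mfg%3Aabc+OR+sx&defType=edismax`, which can be appended to the select URL of your Solr instance.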

Re: dismax with AND/OR combination

2011-09-29 Thread Jason Toy
Can dismax understand that query in a translated form? On Sep 29, 2011, at 10:01 PM, yingshou guo wrote: > you cann't use this kind of query syntax against dismax query parser. > your query can by understood by standard query parser or edismax query > parser. "qt" request parameter is used by solr to

Re: dismax with AND/OR combination

2011-09-29 Thread yingshou guo
I don't understand what you mean by "a translated form". The only special symbols that the dismax query parser understands are the double quote, + and -, i.e. phrase, mandatory, and prohibited semantics, something like: "term1 term2" +term3 -term4. The dismax parser will treat the other operators as part of the query string. I guess when

Re: autosuggest combination of data from documents and popular queries

2011-09-29 Thread abhayd
Anyone? How do I sort with TermsComponent?

About solr distributed search

2011-09-29 Thread 秦鹏凯
Hi all, Now I'm doing research on solr distributed search, and it is said documents more than one million is reasonable to use distributed search. So I want to know, does anyone have the test result(Such as time cost) of using single index and distributed search of more than one million data?

Re: About solr distributed search

2011-09-29 Thread Jerry Li
Hi, I suggest you set up an environment and test it yourself; 1M documents is no problem at all. 2011/9/30 秦鹏凯 : > Hi all, > > Now I'm doing research on solr distributed search, and it > is said documents more than one million is reasonable to use > distributed search. > So I want to know, does anyone have the test > result(Such as time cost) of using singl

Lucene 3.4.0 Merging

2011-09-29 Thread Ahson Iqbal
Hi, I have 3 Solr 3.4.0 indexes that I want to merge. After searching the web I found that there are two ways to do it: 1. Using the Lucene merge tool. 2. Merging through core admin. I am using the 1st method; for this I have downloaded Lucene 3.4.0, unpacked it, and then run the following command on com