Re: [SolrCloud] shard hash ranges changed after restoring backup

2016-06-16 Thread Gary Yao
> > On Wed, Jun 15, 2016 at 8:36 AM, Gary Yao wrote: >> Hi all, >> >> My team at work maintains a SolrCloud 5.3.2 cluster with multiple >> collections configured with sharding and replication. >> >> We recently backed up our Solr indexes using the b

[SolrCloud] shard hash ranges changed after restoring backup

2016-06-15 Thread Gary Yao
Hi all, My team at work maintains a SolrCloud 5.3.2 cluster with multiple collections configured with sharding and replication. We recently backed up our Solr indexes using the built-in backup functionality. After the cluster was restored from the backup, we noticed that atomic updates of documen

Re: SolrCloud result correctness compared with single core

2015-01-29 Thread Yandong Yao
say, an optimize. > > So called "bottom line" is that yes, the scoring may change, but > IMO not any more radically than was possible with single cores, > and I wouldn't worry about unless I had evidence that it was > biting me. > > Best > Erick > > On

SolrCloud result correctness compared with single core

2015-01-23 Thread Yandong Yao
Hi Guys, As the main scoring mechanism is based on tf/idf, will the same query running against SolrCloud return different results from running it against a single core with the same data set, since idf will only count df inside one core? E.g.: Assume I have 100GB of data: A) Index the data using a single core B)

Re: Index optimize takes more than 40 minutes for 18M documents

2013-02-21 Thread Yandong Yao
and have no updates in between. But even then it may be a > waste of time. > > You need lots of free disk space for merging, whether a forced merge or > automatic. Free space equal to the size of the index is usually enough, but > worst case can need double the size of the index.

Re: How to run many MoreLikeThis request efficiently?

2013-01-09 Thread Yandong Yao
out. > > Otis > Solr & ElasticSearch Support > http://sematext.com/ > On Jan 9, 2013 6:07 PM, "Yandong Yao" wrote: > > > Any comments on this? Thanks very much in advance! > > > > 2013/1/9 Yandong Yao > > > > > Hi Solr Guru, > > >

Re: How to run many MoreLikeThis request efficiently?

2013-01-09 Thread Yandong Yao
Any comments on this? Thanks very much in advance! 2013/1/9 Yandong Yao > Hi Solr Guru, > > I have two sets of documents in one SolrCore, each set has about 1M > documents with a different document type, say 'type1' and 'type2'. > > Many documents in

How to run many MoreLikeThis request efficiently?

2013-01-08 Thread Yandong Yao
Hi Solr Guru, I have two sets of documents in one SolrCore, each set has about 1M documents with a different document type, say 'type1' and 'type2'. Many documents in the first set are very similar to 1 or 2 documents in the second set. What I want to get is: for each document in set 2, return the mo

Re: mergeindex: what happens if there is deletion during index merging

2012-08-21 Thread Yandong Yao
Hi Shalin, Thanks very much for your detailed explanation! Regards, Yandong 2012/8/21 Shalin Shekhar Mangar > On Tue, Aug 21, 2012 at 8:47 AM, Yandong Yao wrote: > > > Hi guys, > > > > From http://wiki.apache.org/solr/MergingSolrIndexes, it said 'Using > >

mergeindex: what happens if there is deletion during index merging

2012-08-20 Thread Yandong Yao
Hi guys, From http://wiki.apache.org/solr/MergingSolrIndexes, it said 'Using "srcCore", care is taken to ensure that the merged index is not corrupted even if writes are happening in parallel on the source index'. What does it mean? If there is a deletion request during merging, will this delet

Count is inconsistent between facet and stats

2012-07-18 Thread Yandong Yao
Hi Guys, Steps to reproduce: 1) Download apache-solr-4.0.0-ALPHA 2) cd example; java -jar start.jar 3) cd exampledocs; ./post.sh *.xml 4) Use statsComponent to get the stats info for field 'popularity' based on facet 'cat'. And the 'count' for 'electronics' is 3 http://localhost:8983/solr/coll

Re: SolrCloud: how to index documents into a specific core and how to search against that core?

2012-05-23 Thread Yandong Yao
rks against the automation in > solrcore, but maybe there's a good reason you want to do it this way. > > > > --- Original Message --- > > On 5/22/2012 07:35 AM Yandong Yao wrote:Hi Darren, > > > > Thanks very much for your reply. > > > > The reason I wa

Re: SolrCloud: how to index documents into a specific core and how to search against that core?

2012-05-22 Thread Yandong Yao
or you, therefore when you try to search a node/core > with no documents, all the results from the "cloud" are retrieved > regardless. This is considered "A Good Thing". > > It requires a change in thinking about indexing and searching > > On Tue, 2012-05

SolrCloud: how to index documents into a specific core and how to search against that core?

2012-05-21 Thread Yandong Yao
Hi Guys, I use the following commands to start SolrCloud according to the SolrCloud wiki. yydzero:example bjcoe$ java -Dbootstrap_confdir=./solr/conf -Dcollection.configName=myconf -DzkRun -DnumShards=2 -jar start.jar yydzero:example2 bjcoe$ java -Djetty.port=7574 -DzkHost=localhost:9983 -jar start.jar

Re: Faster Solr Indexing

2012-03-11 Thread Yandong Yao
I have similar issues using DIH, and org.apache.solr.update.DirectUpdateHandler2.addDoc(AddUpdateCommand) consumes most of the time when indexing 10K rows (each row is about 70K) - DIH nextRow takes about 10 seconds in total - If index uses whitespace tokenizer and lower case filter, th

RE: Permissions and user to acess administrative interface

2012-02-13 Thread Ge, Yao (Y.)
. Modified the web.xml to map different modules to different roles. -Yao -Original Message- From: Em [mailto:mailformailingli...@yahoo.de] Sent: Monday, February 13, 2012 11:05 AM To: solr-user@lucene.apache.org Subject: Re: Permissions and user to acess administrative interface Hi Anderson

How to use nested query in fq?

2012-02-07 Thread Yandong Yao
Hi Guys, I am using Solr 3.5, and would like to use an fq like 'getField(getDoc(uuid:workspace_${workspaceId})), "isPublic"):true? - workspace_${workspaceId}: workspaceId is an indexed field. - getDoc(uuid:concat("workspace_", workspaceId): returns the document whose uuid is "workspace_${workspaceI

Re: Need help for solr searching case insensative item

2010-10-26 Thread yandong yao
Sounds like WordDelimiterFilter config issue, please refer to http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.WordDelimiterFilterFactory . Also it will help if you could provide: 1) Tokenizers/Filters config in schema file 2) analysis.jsp output in admin page. 2010/10/26 wu liu

Re: A question on WordDelimiterFilterFactory

2010-09-14 Thread yandong yao
After upgrading to 1.4.1, it is fixed. Thanks very much for your help! Regards, Yandong Yao 2010/9/14 yandong yao > Hi Robert, > > I am using solr 1.4, will try with 1.4.1 tomorrow. > > Thanks very much! > > Regards, > Yandong Yao > > 2010/9/14 Robert Muir >

Re: A question on WordDelimiterFilterFactory

2010-09-14 Thread yandong yao
Hi Robert, I am using solr 1.4, will try with 1.4.1 tomorrow. Thanks very much! Regards, Yandong Yao 2010/9/14 Robert Muir > did you index with solr 1.4 (or are you using solr 1.4) ? > > at a quick glance, it looks like it might be this: > https://issues.apache.org/jira/brow

A question on WordDelimiterFilterFactory

2010-09-14 Thread yandong yao
Hi Guys, I encountered a problem when enabling WordDelimiterFilterFactory for both index and query (pasted the relevant part of schema.xml at the bottom of the email). *1. Steps to reproduce:* 1.1 The indexed sample document contains only one sentence: "This is a TechNote." 1.2 Query is: q=TechNo

Re: how to support "implicit trailing wildcards"

2010-08-11 Thread yandong yao
> you could satisfy this by making 2 fields: > > 1. exactmatch > > 2. wildcardmatch > > > > use copyfield in your schema to copy 1 --> 2 . > > > > q=exactmatch:mount+wildcardmatch:mount*&q.op=OR > > this would score exact matches above (solely) wildcar
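The two-field query quoted above can also be assembled programmatically. What follows is only a minimal Python sketch: the field names `exactmatch` and `wildcardmatch` come from the quoted suggestion, while the helper name `build_prefix_query` is hypothetical, and no escaping of special query characters is attempted.

```python
from urllib.parse import urlencode


def build_prefix_query(term):
    """Return Solr request parameters matching `term` exactly or as a prefix."""
    # One clause per field; q.op=OR lets either clause match on its own.
    query = f"exactmatch:{term} wildcardmatch:{term}*"
    return {"q": query, "q.op": "OR"}


params = build_prefix_query("mount")
# URL-encode the parameters so they can be appended to a select URL.
query_string = urlencode(params)
print(query_string)
```

A document that matches both clauses collects score from both, which is why, as the quoted reply notes, exact matches should rank above wildcard-only matches.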

Re: how to support "implicit trailing wildcards"

2010-08-09 Thread yandong yao
rationale is that if search 'mounted', I also want documents with 'mount' match. So seems built-in wildcard search could not satisfy my requirements if i understand correctly. Thanks very much! 2010/8/9 Bastian Spitzer > Wildcard-Search is already built in, just use:

how to support "implicit trailing wildcards"

2010-08-09 Thread yandong yao
Hi everyone, How to support 'implicit trailing wildcard *' using Solr, e.g.: using Google to search 'umoun', 'umount' will be matched; search 'mounta', 'mountain' will be matched. From my point of view, there are several ways, each with disadvantages: 1) Using EdgeNGramFilterFactory, thus 'umou
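To illustrate why front-edge n-grams give the 'implicit trailing wildcard' behaviour described in this post, here is a toy Python model. It is not Solr code: `edge_ngrams` is a hypothetical helper, and `min_gram` / `max_gram` only mirror the idea behind the `minGramSize` / `maxGramSize` attributes of EdgeNGramFilterFactory, with assumed values.

```python
def edge_ngrams(token, min_gram=1, max_gram=15):
    """Return the leading-edge n-grams of `token`, shortest first."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


# Index-time view: every prefix of each word becomes a searchable term.
indexed = set(edge_ngrams("umount")) | set(edge_ngrams("mountain"))

print("umoun" in indexed)   # prefix of 'umount'
print("mounta" in indexed)  # prefix of 'mountain'
print("ount" in indexed)    # not a prefix of either word
```

The cost is visible in the model as well: each indexed word contributes one term per prefix length, so the term dictionary grows accordingly.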

RE: Solr Search problem; cannot search the existing word in the index content

2010-06-02 Thread Zero Yao
Modify all settings in solrconfig.xml and try again; by default solr will only index the first 1 fields. Best Regards, Yandong -Original Message- From: Mint o_O! [mailto:mint@gmail.com] Sent: June 3, 2010 13:58 To: solr-user@lucene.apache.org Subject: Re: Solr Search problem; cann

RE: Solr read-only core

2010-05-25 Thread Yao
My motivation is more from the performance perspective than the functional perspective. I was hoping that by opening the Solr index/core read-only, the underlying Lucene IndexReader can be opened in read-only mode for optimum query performance (removing the overhead of multi-thread management). -- View this m

Solr read-only core

2010-05-25 Thread Yao
Is there a way to open a Solr index/core in read-only mode? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-read-only-core-tp843049p843049.html Sent from the Solr - User mailing list archive at Nabble.com.

RE: Date faceting and memory leaks

2010-05-17 Thread Yao
and got 1,500/minute throughput by only using a little more than 0.5 GB of Heap Memory. The hit ratio reported in the Solr admin statistics page shows filterCache has a hitratio of 0.99. With 103800 lookups and 103773 hits, I assume it is 99%. Have a nice day. -Yao
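The figures quoted here can be checked with a few lines of arithmetic. A small Python sketch using only the two numbers from the post (that the admin page cuts the value to two decimals is an assumption):

```python
lookups = 103800
hits = 103773

# Hit ratio is simply hits divided by lookups.
hit_ratio = hits / lookups
misses = lookups - hits

print(f"hit ratio: {hit_ratio:.4f}")
print(f"misses: {misses}")
```

So only 27 of the 103800 lookups missed the cache, which is consistent with the reported 0.99.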

Re: Date faceting and memory leaks

2010-05-17 Thread Yao
on you can help with your insight with just the little fraction of information provided. Thanks again! -Yao -- View this message in context: http://lucene.472066.n3.nabble.com/Date-faceting-and-memory-leaks-tp824372p825059.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Date faceting and memory leaks

2010-05-17 Thread Yao
Chris, Thanks for the detailed response. No, I am not using Date Facet but Facet Query for facet display. Here is the full configuration of my "dismax" query handler: dismax explicit 0.01 title text^0.5 domain^0.1 nature^0.1 author title

Re: Date faceting and memory leaks

2010-05-17 Thread Yao
No, I still have the OOM issue with repeated facet query requests on the date field. I forgot to mention that I am running 64-bit IBM 1.5 JVM. I also tried the Sun 1.6 JVM with and without your GC arguments. The GC pattern is different but the heap size does not drop as the test goes on. I tested w

RE: Date faceting and memory leaks

2010-05-17 Thread Ge, Yao (Y.)
I do not have any GC-specific setting on the command line. I had tried to force GC collection via JConsole at the end of the run but it didn't seem to do anything to the heap size. -Yao -Original Message- From: Antonio Lobato [mailto:alob...@symplicity.com] Sent: Monday, May 17, 2010 2:

Date faceting and memory leaks

2010-05-17 Thread Yao
I have been running load testing using JMeter on a Solr 1.4 index with ~4 million docs. I notice a steady JVM heap size increase as I iterate over 100 query terms a number of times against the index. The GC does not seem to reclaim the heap after the test run is completed. It will run into OutOfMemory

hl.maxAlternateFieldLength defaults in solrconfig.xml

2010-02-10 Thread Yao Ge
It appears the hl.maxAlternateFieldLength parameter default setting in solrconfig.xml does not take effect. I can only get it to work by explicitly sending the parameter via the client request. It is not a big deal but it appears to be a bug. -- View this message in context: http://old.nabble.com/

DIH delta import - last modified date

2010-01-19 Thread Yao Ge
I am struggling with the concept of delta import in DIH. According to the documentation, the delta import will automatically record the last index time stamp and make it available to use for the delta query. However in many cases when the last_modified date time stamp in the database lags behind the

DIH - Export to XML

2009-10-30 Thread Yao Ge
For Data Import Handler, is there a way to dump data to a SOLR feed format XML file? -- View this message in context: http://old.nabble.com/DIH---Export-to-XML-tp26138213p26138213.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Google Side-By-Side UI

2009-10-02 Thread Yao Ge
Yes. I think it would be a very helpful tool for tuning search relevancy - you can do a controlled experiment with your target audiences to understand their responses to the parameter changes. We plan to use this feature to benchmark Lucene/SOLR against our in-house commercial search engine - it will

Re: Item Facet

2009-08-07 Thread Yao Ge
Are your product_name* fields numeric fields (integer or float)? Dals wrote: > > Hi... > > Is there any way to group values like shopping.yahoo.com or > shopper.cnet.com do? > > For instance, I have documents like: > > doc1 - product_name1 - value1 > doc2 - product_name1 - value2 > doc3 - p

RE: Item Facet

2009-08-06 Thread Ge, Yao (Y.)
If you can reindex, simply rebuild the index with fields replaced by combining existing fields. -Yao -Original Message- From: David Lojudice Sobrinho [mailto:dalss...@gmail.com] Sent: Thursday, August 06, 2009 4:17 PM To: solr-user@lucene.apache.org Subject: Item Facet Hi... Is there

Re: Limiting facets for huge data - setting indexed=false in schema.xml

2009-07-31 Thread Yao Ge
Having a large number of fields is not the same as having a large number of facets. Facets are something you would display to users as an aid for query refinement or navigation. There is no way for a user to use 3700 facets at the same time. So it is more a question of how to determine what facets t

Re: Solr's MLT query call doesn't work

2009-07-08 Thread Yao Ge
A couple of things: your mlt.fl value must be part of fl. In this case, content_mlt is not included in fl. I think the fl parameter value needs to be comma separated. Try fl=title,author,content_mlt,score -Yao SergeyG wrote: > > Hi, > > Recently, while implementing the MoreLik

Re: about defaultSearchField

2009-07-07 Thread Yao Ge
Try with fl=* or fl=*,score added to your request string. -Yao Yang Lin-2 wrote: > > Hi, > I have some problems. > For my solr program, I want to type only the Query String and get all > field > results that include the Query String. But now I can't get any result > with

Re: A big question about Solr and SolrJ range query ?

2009-07-07 Thread Yao Ge
Use Solr's Filter Query parameter "fq": fq=x:[10 TO 100]&fq=y:[20 TO 300]&fl=title -Yao huenzhao wrote: > > Hi all: > > Suppose that my index has 3 fields: title, x and y. > > I know one range(10 < x < 100) can be queried like this: > >
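The filter-query answer above can be sketched as a small request builder in Python. The helper `build_range_params`, the match-all `q=*:*`, and the base URL `http://localhost:8983/solr/select` are assumptions added for illustration; they do not appear in the thread. Note that square brackets include both endpoints, whereas the question asks for 10 < x < 100; curly braces (`{10 TO 100}`) are the query-syntax form for an exclusive range.

```python
from urllib.parse import urlencode


def build_range_params(ranges, fields, inclusive=True):
    """Build Solr request parameters with one fq range filter per field."""
    # [a TO b] includes both endpoints, {a TO b} excludes them.
    open_b, close_b = ("[", "]") if inclusive else ("{", "}")
    params = [("q", "*:*")]  # assumed match-all main query
    for field, (low, high) in ranges:
        params.append(("fq", f"{field}:{open_b}{low} TO {high}{close_b}"))
    params.append(("fl", ",".join(fields)))
    return params


params = build_range_params([("x", (10, 100)), ("y", (20, 300))], ["title"])
# Assumed base URL of a local example instance.
url = "http://localhost:8983/solr/select?" + urlencode(params)
print(url)
```

Each `fq` is applied as a separate filter and the filters are intersected, so a document must fall in both ranges to be returned.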

Re: Faceting with MoreLikeThis

2009-07-07 Thread Yao Ge
Faceting remains the same as with the standard request handler. -Yao Yao Ge wrote: > > Does Solr support faceting on MoreLikeThis search results? > -- View this message in context: http://www.nabble.com/Faceting-with-MoreLikeThis-tp24356166p24380459.html Sent from the Solr - User mailing list a

Re: Filtering MoreLikeThis results

2009-07-07 Thread Yao Ge
The answer to my own question: ... ... would work. -Yao Yao Ge wrote: > > I am not sure about the parameters for MLT the requestHandler plugin. Can > one of you share the solrconfig.xml entry for MLT? Thanks in advance. > -Yao > > > Bill Au wrote: >

Re: Filtering MoreLikeThis results

2009-07-07 Thread Yao Ge
I am not sure about the parameters for the MLT requestHandler plugin. Can one of you share the solrconfig.xml entry for MLT? Thanks in advance. -Yao Bill Au wrote: > > I have been using the StandardRequestHandler (ie /solr/select). fq does > work with the MoreLikeThisHandler. I wi

Filtering MoreLikeThis results

2009-07-06 Thread Yao Ge
I could not find any support from http://wiki.apache.org/solr/MoreLikeThis on how to restrict MLT results to certain subsets. I passed along a fq parameter and it is ignored. Since we can not incorporate the filters in the query itself which is used to retrieve the target for similarity comparison

Faceting with MoreLikeThis

2009-07-06 Thread Yao Ge
Does Solr support faceting on MoreLikeThis search results? -- View this message in context: http://www.nabble.com/Faceting-with-MoreLikeThis-tp24356166p24356166.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Query Filter fq with OR operator

2009-06-26 Thread Yao Ge
I would like to submit a JIRA issue for this. Can anyone tell me where to go? -Yao Otis Gospodnetic wrote: > > > Brian, > > Opening a JIRA issue if it doesn't already exist is the best way. If you > can provide a patch, even better! > > Otis > --

Re: Faceting on text fields

2009-06-11 Thread Yao Ge
/download.html). Yao Ge wrote: > > FYI. I did a direct integration with Carrot2 with Solrj with a separate > Ajax call from UI for top 100 hits to cluster terms in the two text > fields. It got comparable performance to other facets in terms of > response time. > > In terms of alg

Re: Faceting on text fields

2009-06-11 Thread Yao Ge
" which I don't recognize. But I think at least one of them might have used SVD (http://en.wikipedia.org/wiki/Singular_value_decomposition). -Yao Otis Gospodnetic wrote: > > > I'd call it related (their application in search encourages exploration), > but also dist

Re: Faceting on text fields

2009-06-10 Thread Yao Ge
Thanks for the insight, Otis. I had no awareness of ClusteringComponent until now. It is time to move to Solr 1.4 -Yao Otis Gospodnetic wrote: > > > Yao, > > Solr can already cluster top N hits using Carrot2: > http://wiki.apache.org/solr/ClusteringComponent > > I

Re: Faceting on text fields

2009-06-09 Thread Yao Ge
over top N docs might be more attractive. Another feature I would really appreciate is search-time n-gram term clustering. Maybe this might be better suited for "spell checker" as it is just a different way to display the alternative search terms. -Yao Michael Ludwig-4 wrote:

Query Filter fq with OR operator

2009-06-05 Thread Yao Ge
If I want to use the OR operator with multiple query filters, I can do: fq=popularity:[10 TO *] OR section:0 Is there a more efficient alternative to this? -- View this message in context: http://www.nabble.com/Query-Filter-fq-with-OR-operator-tp23895837p23895837.html Sent from the Solr - User mailing li
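One reason the OR in this question has to live inside a single `fq` is that separate `fq` parameters are intersected rather than unioned. The toy Python model below illustrates only that set semantics; the document ids are hypothetical and it says nothing about which form is more efficient.

```python
# Hypothetical document ids matched by each clause.
popular = {1, 2, 3}       # popularity:[10 TO *]
section_zero = {3, 4}     # section:0

# fq=popularity:[10 TO *] OR section:0  -> one filter, union of both clauses
single_fq_with_or = popular | section_zero

# fq=popularity:[10 TO *]&fq=section:0  -> two filters, intersected
two_separate_fqs = popular & section_zero

print(sorted(single_fq_with_or))
print(sorted(two_separate_fqs))
```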

Re: Faceting on text fields

2009-06-04 Thread Yao Ge
Yes. I am using 1.3. When is 1.4 due for release? Yonik Seeley-2 wrote: > > Are you using Solr 1.3? > You might want to try the latest 1.4 test build - faceting has changed a > lot. > > -Yonik > http://www.lucidimagination.com > > On Thu, Jun 4, 2009 at 12:01 PM

Faceting on text fields

2009-06-04 Thread Yao Ge
I am indexing a database with over 1 million rows. Two of the fields contain unstructured text but the size of each field is limited (256 characters). I came up with an idea to visualize the text fields using a text cloud by turning the two text fields into facets. The weight of font and size is of each

Re: spell checking

2009-06-02 Thread Yao Ge
> into so-called n-grams. You can see that if you open up the SC index with > something like Luke. Please see > http://wiki.apache.org/jakarta-lucene/SpellChecker . > > Otis > -- > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > > - Original

Re: spell checking

2009-06-02 Thread Yao Ge
pos that I need to run spell check on. But how can these original data be used as a base for spell checking? How does Solr know what are correctly spelled words? ... ... Yao Ge wrote: > > Can someone help providing a tutorial like introduction on how to get >

Re: spell checking

2009-06-02 Thread Yao Ge
dictionary or word list? Please help. Grant Ingersoll-6 wrote: > > Have you gone through: http://wiki.apache.org/solr/SpellCheckComponent > > > On Jun 2, 2009, at 8:50 AM, Yao Ge wrote: > >> >> Can someone help providing a tutorial like introduction on how to get >

spell checking

2009-06-02 Thread Yao Ge
Can someone help by providing a tutorial-like introduction on how to get spell-checking to work in Solr? It appears many steps are required before the spell-checking functions can be used. It also appears that a dictionary (a list of correctly spelled words) is required to set up the spell checker. Can

DataImportHandler Template Transformer

2009-05-18 Thread Yao Ge
It took me a while to understand that to use the Template Transformer (http://lucene.apache.org/solr/api/org/apache/solr/handler/dataimport/TemplateTransformer.html), all building variable names (e.g. ${e.firstName} ${e.lastName} etc.) cannot contain null values. I hope the parser can do a better

Re: Solr Shard - Strange results

2009-05-18 Thread Yao Ge
Maybe you want to try with docNumber field type as "string" and see if it would make a difference. CB-PO wrote: > > I'm not quite sure what logs you are talking about, but in the > tomcat/logs/catalina.out logs, i found the following [note, i can't > copy/paste, so i am typing up a summary]: > >

Query Boost Functions

2009-05-18 Thread Yao Ge
ied to add single and double quotes around the field name but it didn't help. Can a field name contain a hyphen in boost functions? How to do it? If not, where do I find the field name special character restrictions? -Yao -- View this message in context: http://www.nabble.com/Query-

encountered the "Cannot allocate memory" when calling snapshooter program after optimize command

2009-01-07 Thread Justin Yao
Hi, I configured solr to listen on the postOptimize event and call the snapshooter program after an optimize command. It works well when the Java heap size is set to less than 4G. But if I increase the Java heap size to 5G, the snapshooter program can't be successfully called after the optimize

Look ahead queries

2007-05-03 Thread Ge, Yao (Y.)
uggestion. Let me know if someone has already done this and there are better ways or suggestions to accomplish this. I figured solr's caching will make this type of application more efficient than a straight Lucene integration. Thanks. -Yao

RE: Faceted count syntax (exclude zeros)...

2007-05-01 Thread Ge, Yao (Y.)
There is a bug related to "facet.mincount" in the incubating version. http://www.mail-archive.com/solr-user@lucene.apache.org/msg03269.html -Yao -Original Message- From: escher2k [mailto:[EMAIL PROTECTED] Sent: Tuesday, May 01, 2007 2:00 AM To: solr-user@lucene.apache.org Subjec

solr java client code and XML schema

2007-04-21 Thread Ge, Yao (Y.)
XML schema? -Yao

RE: Facet.query

2007-04-19 Thread Ge, Yao (Y.)
Never mind. I should have read the example (http://wiki.apache.org/solr/SimpleFacetParameters#head-1da3ab3995bc4abcdce8e0f04be7355ba19e9b2c) first. From: Ge, Yao (Y.) Sent: Thursday, April 19, 2007 10:41 PM To: 'solr-user@lucene.apache.org' Subject: F

Facet.query

2007-04-19 Thread Ge, Yao (Y.)
When multiple facet queries are specified, are they combined as OR or AND? -Yao