Re: Loading an index (generated by map reduce) in SolrCloud

2014-09-24 Thread rulinma
Copying is not a good choice; transfer to HDFS and merge.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Loading-an-index-generated-by-map-reduce-in-SolrCloud-tp4159530p4160855.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Combining several fields for facets.

2014-09-24 Thread SolrUser1543
Using a copy field will require reindexing my data; I am looking for a
solution without reindexing.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Combining-several-fields-for-facets-tp4160679p4160858.html
Sent from the Solr - User mailing list archive at Nabble.com.


RE: SolrCloud RAM requirements

2014-09-24 Thread Toke Eskildsen
Norgorn [lsunnyd...@mail.ru] wrote:
> I have CLOUD with 3 nodes and 16 MB RAM on each.
> My index is about 1 TB and search speed is awfully bad.

We all have different standards with regard to search performance. What is 
"awfully bad" and what is "good enough" for you?

Related to this: How many documents are in your index, how do you query 
(faceting, sorting, special searches) and how often is the index updated?

> I've read, that one needs at least 50% of index size in RAM,

That is the common advice, yes. The advice is not bad for some use cases. The 
problem is that it has become gospel.

I am guessing that you are using spinning drives? Solr needs fast random access 
reads and spinning drives are very slow for that. You can either compensate by 
buying enough RAM or you can change to a faster underlying storage technology. 
The obvious choice these days is Solid State Drives (we bought Samsung 840 
EVO's last time and would probably buy those again). They will not give you RAM 
speed, but they do give a lot more bang for the buck and depending on your 
performance requirements they can be enough.

You might want to read 
http://sbdevel.wordpress.com/2013/06/06/memory-is-overrated/ (I am the author)

All that being said, it is not certain that your performance problems are due 
to slow IO. But 3*16MB for 1TB of index certainly points that way.

> SOLR spec is hs_0.06

I have no idea what that means.

- Toke Eskildsen


Re: Combining several fields for facets.

2014-09-24 Thread lboutros
How many different values do you have in your fields, and do you know them?

Faceting by query is not an option for you ?

Ludovic.



-
Jouve
France.
--
View this message in context: 
http://lucene.472066.n3.nabble.com/Combining-several-fields-for-facets-tp4160679p4160866.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Accessing document stored fields in a custom function

2014-09-24 Thread Mikhail Khludnev
Actual stored fields are definitely a no-go. You can hit any kind of
forward-view index.
http://www.youtube.com/watch?v=T5RmMNDR5XI
Look at StrFieldSource, IntFieldSource. If you wonder how to access stored
fields anyway, call org.apache.lucene.index.IndexReader.document(int).
Beware of the difference between segment-local doc numbers and global ones,
which are sometimes used by Solr.
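
A rough, untested sketch of the kind of ValueSource described above (assuming the
Lucene/Solr 4.8 function-query API; the class name and the "Fred" field are only
illustrative, and per-document stored-field access is much slower than FieldCache
or DocValues):

  // Sketch only: reads the stored "Fred" field per document via
  // IndexReader.document(int). 'doc' is already segment-local here,
  // because getValues() is called per AtomicReaderContext.
  import java.io.IOException;
  import java.util.Map;
  import org.apache.lucene.index.AtomicReaderContext;
  import org.apache.lucene.queries.function.FunctionValues;
  import org.apache.lucene.queries.function.ValueSource;
  import org.apache.lucene.queries.function.docvalues.DoubleDocValues;

  public class StoredFredValueSource extends ValueSource {
    @Override
    public FunctionValues getValues(Map context, final AtomicReaderContext readerContext)
        throws IOException {
      return new DoubleDocValues(this) {
        @Override
        public double doubleVal(int doc) {
          try {
            String v = readerContext.reader().document(doc).get("Fred");
            return v == null ? 0.0 : Double.parseDouble(v);
          } catch (IOException | NumberFormatException e) {
            return 0.0;
          }
        }
      };
    }

    @Override
    public boolean equals(Object o) { return o instanceof StoredFredValueSource; }

    @Override
    public int hashCode() { return getClass().hashCode(); }

    @Override
    public String description() { return "storedFred()"; }
  }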

On Wed, Sep 24, 2014 at 3:08 AM, Scott Smith 
wrote:

> I'm creating a custom function (extends ValueSource).  I'm generating a
> value that will both be returned as a value in the hit for each doc and
> also be used to sort.  As I read the documentation, this is not difficult.
>
> To determine the value for a document, I need to access the "stored"
> fields for that document (i.e., the value that the function will generate
> partially depends on stored information in the document).  How do I access
> them from the getValues() method?  Is this via the FieldCache.DEFAULT?  I'm
> using solr 4.8 if that makes a difference (which I think it does since
> older examples seem to have been deprecated).  For example, if I have a
> field called "Fred", how do I access that field from the document?
>
> Is accessing the stored data going to have a big impact on the time to
> return results?
>
> Thanks
>
> Scott
>



-- 
Sincerely yours
Mikhail Khludnev
Principal Engineer,
Grid Dynamics





RE: RE: using facet enum and fc in the same query.

2014-09-24 Thread Toke Eskildsen
jerome.dup...@bnf.fr [jerome.dup...@bnf.fr] wrote:

[1 thread = 15 seconds, ∞ threads = 2 seconds]

> The "slow" request corresponds to our website search query. It is for our
> book catalog: some facets are for type of documents, author, title
> subjects, location of the book, dates...

> In this request we have now 35 facets.

Okay, that might by itself be a little problematic as Solr treats all facets 
separately. That also explains the large speed-up from threading.

> About unique values, for the "slow" query:
> 1 facet goes up to 4M unique values (authors),
> 1 facet has 250.000 unique values
> 1 has 5
> 1 has 6700
> 4 have between 300 and 1000
> 5 have between 100 and 160
> 16 have less than 65

Guessing that a slow query is one that matches a lot of documents (so that the 
document IDs are represented internally as a bitmap), each facet means that 12M 
bits are checked and several million of those (the hits) are looked up in 
doc->term_ordinal tables, in order to update the facet counters. With 35 
facets, this means checking 400M bits, doing maybe 100M lookups and the same 
amount of updates.

If you could collapse some of the facets by indexing their values in a common 
field, it would help a lot.
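
A rough sketch of that collapsing, as it might look in schema.xml (field names are
invented for illustration; if the source fields must remain distinguishable after
collapsing, a prefix or similar would have to be added at index time, which
copyField by itself does not do):

  <field name="facet_combined" type="string" indexed="true" stored="false" multiValued="true"/>
  <copyField source="doc_type" dest="facet_combined"/>
  <copyField source="location" dest="facet_combined"/>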

>> Toke: Or you could try Sparse Faceting

I take that somewhat back: Sparse faceting would help with your authors field, 
and maybe the 250.000 one, but your main problem seems to be the sheer number 
of facet fields.

- Toke Eskildsen


RE: How does KeywordRepeatFilterFactory help giving a higher score to an original term vs a stemmed term

2014-09-24 Thread Markus Jelsma
Hi - but this makes no sense; they are scored as equals, except for tiny 
differences in TF and IDF. What you would need is something like a stemmer that 
preserves the original token and gives a < 1 payload to the stemmed token. The 
same goes for filters like decompounders and accent folders that change the meaning 
of words.
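
A rough, untested sketch of the token-filter side of that idea (assuming the Lucene
4.x analysis API and a chain of KeywordRepeatFilter followed by a stemmer; the class
name and the 0.5 payload value are made up, and scoring on the payload would
additionally require a payload-aware query/similarity):

  import java.io.IOException;
  import org.apache.lucene.analysis.TokenFilter;
  import org.apache.lucene.analysis.TokenStream;
  import org.apache.lucene.analysis.payloads.PayloadHelper;
  import org.apache.lucene.analysis.tokenattributes.KeywordAttribute;
  import org.apache.lucene.analysis.tokenattributes.PayloadAttribute;
  import org.apache.lucene.util.BytesRef;

  public final class StemPayloadFilter extends TokenFilter {
    private final KeywordAttribute keywordAtt = addAttribute(KeywordAttribute.class);
    private final PayloadAttribute payloadAtt = addAttribute(PayloadAttribute.class);
    // Tokens preserved by KeywordRepeatFilter keep full weight; stemmed copies get less.
    private final BytesRef originalPayload = new BytesRef(PayloadHelper.encodeFloat(1.0f));
    private final BytesRef stemmedPayload = new BytesRef(PayloadHelper.encodeFloat(0.5f));

    public StemPayloadFilter(TokenStream input) {
      super(input);
    }

    @Override
    public boolean incrementToken() throws IOException {
      if (!input.incrementToken()) {
        return false;
      }
      payloadAtt.setPayload(keywordAtt.isKeyword() ? originalPayload : stemmedPayload);
      return true;
    }
  }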
 
 
-Original message-
> From:Diego Fernandez 
> Sent: Wednesday 17th September 2014 23:37
> To: solr-user@lucene.apache.org
> Subject: Re: How does KeywordRepeatFilterFactory help giving a higher score 
> to an original term vs a stemmed term
> 
> I'm not 100% on this, but I imagine this is what happens:
> 
> (using -> to mean "tokenized to")
> 
> Suppose that you index:
> 
> "I am running home" -> "am run running home"
> 
> If you then query "running home" -> "run running home", more of the indexed tokens 
> match, and thus it gives a higher score than if you query "runs home" -> "run runs home"
> 
> 
> - Original Message -
> > The Solr wiki says   "A repeated question is "how can I have the
> > original term contribute
> > more to the score than the stemmed version"? In Solr 4.3, the
> > KeywordRepeatFilterFactory has been added to assist this
> > functionality. "
> > 
> > https://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#Stemming
> > 
> > (Full section reproduced below.)
> > I can see how in the example from the wiki reproduced below that both
> > the stemmed and original term get indexed, but I don't see how the
> > original term gets more weight than the stemmed term.  Wouldn't this
> > require a filter that gives terms with the keyword attribute more
> > weight?
> > 
> > What am I missing?
> > 
> > Tom
> > 
> > 
> > 
> > -
> > "A repeated question is "how can I have the original term contribute
> > more to the score than the stemmed version"? In Solr 4.3, the
> > KeywordRepeatFilterFactory has been added to assist this
> > functionality. This filter emits two tokens for each input token, one
> > of them is marked with the Keyword attribute. Stemmers that respect
> > keyword attributes will pass through the token so marked without
> > change. So the effect of this filter would be to index both the
> > original word and the stemmed version. The 4 stemmers listed above all
> > respect the keyword attribute.
> > 
> > For terms that are not changed by stemming, this will result in
> > duplicate, identical tokens in the document. This can be alleviated by
> > adding the RemoveDuplicatesTokenFilterFactory.
> > 
> > <fieldtype name="text_keyword" class="solr.TextField" positionIncrementGap="100">
> >   <analyzer>
> >     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >     <filter class="solr.KeywordRepeatFilterFactory"/>
> >     <filter class="solr.PorterStemFilterFactory"/>
> >     <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> >   </analyzer>
> > </fieldtype>
> > "
> > 
> 
> -- 
> Diego Fernandez - 爱国
> Software Engineer
> GSS - Diagnostics
> 
> 


Re: Solr: Boost of child documents (json)

2014-09-24 Thread ku3ia
ku3ia wrote
> I can't find an example of posting a document with boosted child documents
> using the json update handler.
> ...
> How to set the "boost" of child documents??

No ideas? Is it possible at all?



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Solr-Boost-of-childs-json-tp4160711p4160888.html
Sent from the Solr - User mailing list archive at Nabble.com.


log to file and to logging tab in web interface at the same time

2014-09-24 Thread wolvverine
Is there any way to log to a file (I have done this) AND still see fresh logs in the
Solr web interface (Logging tab)?

my  configuration in *.war file classes/log4j.properties:

#  Logging level
solr.log=/usr/local/inp/logs/tomcat
log4j.rootLogger=WARN, File, Console

# Log to console
log4j.appender.Console=org.apache.log4j.ConsoleAppender
log4j.appender.Console.layout=org.apache.log4j.PatternLayout
log4j.appender.Console.layout.ConversionPattern=%-4r [%t] %-5p %c %x \u2013 %m%n

#- size rotation with log cleanup.
log4j.appender.File=org.apache.log4j.RollingFileAppender
log4j.appender.File.MaxFileSize=4MB
log4j.appender.File.MaxBackupIndex=9

#- File to log to and log format
log4j.appender.File.File=${solr.log}/solr.log
log4j.appender.File.layout=org.apache.log4j.PatternLayout
log4j.appender.File.layout.ConversionPattern=%-5p - %d{yyyy-MM-dd
HH:mm:ss.SSS}; %C; %m\n

# set  Log level log messages
log4j.logger.org.apache.zookeeper=WARN
log4j.logger.org.apache.hadoop=WARN

# set to INFO to enable infostream log messages
log4j.logger.org.apache.solr.update.LoggingInfoStream=OFF
log4j.logger.org.apache.solr=WARN
log4j.logger.org.apache.solr.cloud=WARN
log4j.logger.org.apache.solr.common=WARN
log4j.logger.org.apache.solr.core=WARN
log4j.logger.org.apache.solr.dataimport=WARN
log4j.logger.org.apache.zookeeper=WARN


RE: SolrCloud RAM requirements

2014-09-24 Thread Norgorn
Thanks for your reply.

The collection contains about a billion documents.
I'm mostly using simple queries with date and other filters (5 filters
per query).
Yup, the disks are the cheapest and simplest kind.

In the end, I want to reach several seconds per search query (for a non-cached
query =) ), so please give me some reference points.
How much (roughly) will I need RAM with and without SSDs?

I know it depends, but at least something, please.

And HS means HelioSearch, a Solr variant which stores filter caches outside the JVM
heap; for me it helps to avoid OOM exceptions.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SlrCloud-RAM-requirments-tp4160853p4160891.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: [ANN] Lucidworks Fusion 1.0.0

2014-09-24 Thread Grant Ingersoll
Hi Thomas,

Thanks for the question, yes, I give a brief demo of it in action during my 
talk and we will have demos at our booth.  I will also give a demo during the 
Webinar, which will be recorded.  As others have said as well, you can simply 
download it and try yourself.

Cheers,
Grant

On Sep 23, 2014, at 2:00 AM, Thomas Egense  wrote:

> Hi Grant.
> Will there be a Fusion demostration/presentation  at Lucene/Solr Revolution
> DC? (Not listed in the program yet).
> 
> 
> Thomas Egense
> 
> On Mon, Sep 22, 2014 at 3:45 PM, Grant Ingersoll 
> wrote:
> 
>> Hi All,
>> 
>> We at Lucidworks are pleased to announce the release of Lucidworks Fusion
>> 1.0.   Fusion is built to overlay on top of Solr (in fact, you can manage
>> multiple Solr clusters -- think QA, staging and production -- all from our
>> Admin).In other words, if you already have Solr, simply point Fusion at
>> your instance and get all kinds of goodies like Banana (
>> https://github.com/LucidWorks/Banana -- our port of Kibana to Solr + a
>> number of extensions that Kibana doesn't have), collaborative filtering
>> style recommendations (without the need for Hadoop or Mahout!), a modern
>> signal capture framework, analytics, NLP integration, Boosting/Blocking and
>> other relevance tools, flexible index and query time pipelines as well as a
>> myriad of connectors ranging from Twitter to web crawling to Sharepoint.
>> The best part of all this?  It all leverages the infrastructure that you
>> know and love: Solr.  Want recommendations?  Deploy more Solr.  Want log
>> analytics?  Deploy more Solr.  Want to track important system metrics?
>> Deploy more Solr.
>> 
>> Fusion represents our commitment as a company to continue to contribute a
>> large quantity of enhancements to the core of Solr while complementing and
>> extending those capabilities with value adds that integrate a number of 3rd
>> party (e.g connectors) and home grown capabilities like an all new,
>> responsive UI built in AngularJS.  Fusion is not a fork of Solr.  We do not
>> hide Solr in any way.  In fact, our goal is that your existing applications
>> will work out of the box with Fusion, allowing you to take advantage of new
>> capabilities w/o overhauling your existing application.
>> 
>> If you want to learn more, please feel free to join our technical webinar
>> on October 2: http://lucidworks.com/blog/say-hello-to-lucidworks-fusion/.
>> If you'd like to download: http://lucidworks.com/product/fusion/.
>> 
>> Cheers,
>> Grant Ingersoll
>> 
>> 
>> Grant Ingersoll | CTO
>> gr...@lucidworks.com | @gsingers
>> http://www.lucidworks.com
>> 
>> 


Grant Ingersoll | @gsingers
http://www.lucidworks.com







RE: SolrCloud RAM requirements

2014-09-24 Thread Toke Eskildsen
Norgorn [lsunnyd...@mail.ru] wrote:
> Collection contains about billion of documents.

So 3-400M documents per core. That is a challenge with frequent updates and 
facets, but with your simple queries it should be doable.

> In the end, I want to reach several seconds per search query (for a non-cached
> query =) ), so please give me some reference points.

The frustratingly true answer is
http://lucidworks.com/blog/sizing-hardware-in-the-abstract-why-we-dont-have-a-definitive-answer/

> How much (roughly) will I need RAM with and without SSDs?
> I know it depends, but at least something, please.

Okay, something it is: We have a 256GB machine, running a SolrCloud with 17 
shards, each shard being about 900GB / 300M documents and put on a dedicated 
SSD. The machine currently has 160GB free for disk cache or about 1% of the 
total index size. For very simple unwarmed searches (just query on 1-3 terms, 
edismax over 6 fields with 2 phrase fields), median response time is < 200ms 
and nearly all response times are < 1 second. An extremely rough downscale by 
a factor of 15 to approximate your 1TB index would leave 11GB for disk cache; 
divide it by 3 for your machines and it's 4GB disk cache/machine + whatever it 
takes to run your Solrs and the system itself.

BUT! All the shards are fully optimized and never updated, range filters can be 
tricky, multiple filters take time, you have more documents/bytes than we have, 
etc.

> And HS means HelioSearch,

Ah. Of course. Although it helps with processing performance, it cannot do 
anything for your IO problem.

How much memory is used for disk caching with your current setup?

- Toke


Re: Issue Adding Filter Query

2014-09-24 Thread aaguilar
Hello Erick,

Just wanted to let you know that I did the change you suggested and
everything works as expected.  Also, thanks for letting me know about the
Analysis page in solr.  I did not know about it and I have found it very
useful.

Thanks!

On Mon, Sep 22, 2014 at 5:41 PM, Antelmo Aguilar 
wrote:

> Hello Erick,
>
> Thank you so much for your help.  That makes perfect sense.  I will do the
> changes you suggest and let you know how it goes.
>
> Thanks!
>
> On Mon, Sep 22, 2014 at 4:12 PM, Erick Erickson [via Lucene] <
> ml-node+s472066n4160547...@n3.nabble.com> wrote:
>
>> You have your index and query time analysis chains defined much
>> differently. Omitting the WordDelimiterFilterFactory from the
>> query-time analysis chain will lead to endless problems.
>>
>> With the definition you have, here are the terms in the index and
>> their term positions as  below. This is available from the
>> admin/analysis page if you click the "verbose" checkbox, although I
>> admit it's kind of hard to read:
>> 1      2             3        4
>> fatty  acid-binding  binding  protein
>>        acid
>>
>> But at query time, this is how they're being analyzed
>> 1      2             3
>> fatty  acid-binding  protein
>>
>> So searching for "fatty acid-binding protein" requires that the tokens
>> "fatty" "acid-binding" and "protein" appear in term positions 1, 2, 3
>> rather  than where they actually are (1, 2, 4). Searching for "fatty
>> acid-binding protein"~1 would actually find this, the "~1" means allow
>> one gap in there.
>>
>> HOWEVER, that's the least of your problems. WordDelimiterFilterFactory
>> will _also_ "split on intra-word delimiters (all non alpha-numeric
>> characters)". While that doesn't really say so explicitly, that will
>> have the effect of removing puncutation. So searching for "fatty
>> acid-binding protein."~1 (note the period) will fail since the token
>> will include the period.
>>
>> I'd _really_ advise you to use the stock WordDelimiterFilterFactory
>> settings in both analysis and query times included in the stock Solr
>> release for, say, text_en_splitting or even a single analyzer like
>> text_en_splitting_tight.
>>
>> Best,
>> Erick
>>
>> On Mon, Sep 22, 2014 at 6:33 AM, aaguilar <[hidden email]
>> > wrote:
>>
>> > Hello Erick.
>> >
>> > Below is the information you requested.   Thanks for your help!
>> >
>> > > positionIncrementGap=
>> > "100">  > > "solr.WhitespaceTokenizerFactory"/> > > "solr.WordDelimiterFilterFactory" splitOnNumerics="0"
>> splitOnCaseChange="0"
>> > generateWordParts="1" generateNumberParts="0" catenateWords="0"
>> > catenateNumbers="0" catenateAll="0" preserveOriginal="1"/> > class=
>> > "solr.StopFilterFactory"/> > class="solr.LowerCaseFilterFactory"/> > > analyzer>  > > "solr.WhitespaceTokenizerFactory"/> > > "solr.LowerCaseFilterFactory"/>  
>> >
>> >
>> > > stored="true"
>> > />
>> >
>> > On Fri, Sep 19, 2014 at 7:36 PM, Erick Erickson [via Lucene] <
>> > [hidden email] >
>> wrote:
>> >
>> >> Hmmm, I'd have to see the schema definition for your description
>> >> field. For this, the admin/analysis page is very helpful. Here's my
>> >> guess:
>> >>
>> >> Your analysis chain doesn't break the incoming tokens up quite like
>> >> you think it is. Thus you have the tokens in your index like
>> >> 'protein,' (notice the comma) and 'protein-like' rather than just
>> >> 'protein'. However, I can't quite reconcile this with your statement:
>> >> "Another weird thing is that if I used description:"fatty
>> >> acid-binding" AND description:"protein"
>> >>
>> >> so I'm at something of a loss. If you paste in your schema definition
>> >> for the 'description' field _and_ the corresponding 
>> >> definition I can give it a quick whirl.
>> >>
>> >> Best,
>> >> Erick
>> >>
>> >> On Fri, Sep 19, 2014 at 11:53 AM, aaguilar <[hidden email]
>> >> > wrote:
>> >>
>> >> > Hello Erick,
>> >> >
>> >> > Thanks for the response.  I tried adding the debug=True to the
>> query,
>> >> but I
>> >> > do not know exactly what I am looking for in the output.  Would it
>> be
>> >> > possible for you to look at the results?  I would really appreciate
>> it.
>> >> I
>> >> > attached two files, one of them is with the filter query
>> >> description:"fatty
>> >> > acid-binding" and the other is with the filter query
>> description:"fatty
>> >> > acid-binding protein".  If you see the file that has the results for
>> >> > description:"fatty acid-binding" , you can see that the hits do have
>> >> "fatty
>> >> > acid-binding protein" and nothing in between.  I really appreciate
>> any
>> >> help
>> >> > you can provide.
>> >> >
>> >> > Thanks you
>> >> >
>> >> > On Fri, Sep 19, 2014 at 2:03 PM, Erick Erickson [via Lucene] <
>> >> > [hidden email] 

Spellchecking and suggesting part numbers

2014-09-24 Thread Lochschmied, Alexander
Hello Solr Users,

we are trying to get suggestions for part numbers using the spellchecker.

Problem scenario:

ABCD1234 // This is the search term
ABCE1234 // This is what we get from spellchecker
ABCD1244 // This is what we would like to get from spellchecker

Characters towards the left of our part numbers are more relevant.


The setup is:



solr.IndexBasedSpellChecker
./spellchecker
did_you_mean_part




did_you_mean_part
on


spellcheck_part

Can we tweak the setup such that we should get more relevant part numbers?

Thanks,
Alexander


RE: Spellchecking and suggesting part numbers

2014-09-24 Thread Dyer, James
Alexander,

You could use a higher value for spellcheck.count, maybe 20 or so, then in your 
application pick out the suggestions that make changes on the right side.

Another option is to use DirectSolrSpellChecker (usually a better choice 
anyhow) and set the "minPrefix" option.  This will require up to n characters on 
the left side to match before it will make suggestions.  Taking a quick look at 
the code, it seems to me it won't try to correct anything in this prefix 
region either.  So perhaps you can set this to 2-4 (default=1).  See 
http://lucene.apache.org/core/4_10_0/suggest/org/apache/lucene/search/spell/DirectSpellChecker.html#setMinPrefix%28int%29
 .
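
A rough solrconfig.xml sketch of that option (names and values are illustrative, not
taken from this thread; the higher spellcheck.count suggestion would be a request
parameter such as spellcheck.count=20 rather than part of this block):

  <searchComponent name="spellcheck_part" class="solr.SpellCheckComponent">
    <lst name="spellchecker">
      <str name="name">did_you_mean_part</str>
      <str name="field">did_you_mean_part</str>
      <str name="classname">solr.DirectSolrSpellChecker</str>
      <int name="minPrefix">3</int>
    </lst>
  </searchComponent>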

James Dyer
Ingram Content Group
(615) 213-4311


-Original Message-
From: Lochschmied, Alexander [mailto:alexander.lochschm...@vishay.com] 
Sent: Wednesday, September 24, 2014 9:06 AM
To: solr-user@lucene.apache.org
Subject: Spellchecking and suggesting part numbers

Hello Solr Users,

we are trying to get suggestions for part numbers using the spellchecker.

Problem scenario:

ABCD1234 // This is the search term
ABCE1234 // This is what we get from spellchecker
ABCD1244 // This is what we would like to get from spellchecker

Characters towards the left of our part numbers are more relevant.


The setup is:



solr.IndexBasedSpellChecker
./spellchecker
did_you_mean_part




did_you_mean_part
on


spellcheck_part

Can we tweak the setup such that we should get more relevant part numbers?

Thanks,
Alexander




Help in selecting the appropriate feature to obtain results

2014-09-24 Thread barrybear
Hi guys, I'm still a beginner with Solr and I'm not sure whether to implement a
custom filter query or use some other available feature/plugin that I am not
aware of in Solr. I am using Solr v4.4.0.

I have a collection as an example as below:

[
   {
  description: 'group1',
  group: ['G?', 'GE*']
   },
   {
  description: 'group2',
  group: ['GEB']
   },
   {
  description: 'group3',
  group: ['G']
   }
]

The group field is multiValued and will contain alphabetic codes which
determine the ranking, plus two special characters: ? and *. Placing a ?
at the back means any direct subordinate of that ranking, while * means all
levels of subordinates of that particular ranking.

If I were to search for group:'GEB', I would expect to obtain this result: 
[
   {
  description: 'group1',
  group: ['G?', 'GE*']
   },
   {
  description: 'group2',
  group: ['GEB']
   }
] 

Meanwhile, searching for group:'GE' should return this result:
[
   {
  description: 'group1',
  group: ['G?', 'GE*']
   }
]

And finally searching for group:'G' should only return one result:
[
   {
  description: 'group3',
  group: ['G']
   }
]

Hope my explanation is clear enough; thanks for your attention and
time.



--
View this message in context: 
http://lucene.472066.n3.nabble.com/Help-in-selecting-the-appropriate-feature-to-obtain-results-tp4160944.html
Sent from the Solr - User mailing list archive at Nabble.com.


Re: Solr upgrade to latest version

2014-09-24 Thread Erick Erickson
Did you look at the rest of this thread? There are some comments there.

The CHANGES.txt file will guide you through each intermediate step.

There's no going straight from 1.4 to 4.x. You could go from 1.4 -> 3.x
then 3.x->4.x, but frankly I'd just start with a stock 4.x distro and
transfer over
only the things you've changed (like schema definitions, NOT the whole file,
see above) and re-index.

Best,
Erick

On Tue, Sep 23, 2014 at 10:23 PM, Vivek Misra  wrote:

> Hi,
>
> Currently I am using SOLR 1.4.1 and want to migrate to SOLR 4.9.
>
> Is there any manual or link for 1.4 to 4.9? Which can guide step by step on
>
> 1. solrconfig.xml changes
>
> 2. schema.xml changes
>
> 3. changes required in version 1.4.1 queries.
>
>
> Thanks
>
> Vivek
>
>
>
>
>
>
> On Tue, Sep 23, 2014 at 9:19 AM, Danesh Kuruppu 
> wrote:
>
> > Thanks Alex and Erick for quick response,
> > This is really helpful.
> >
> > On Tue, Sep 23, 2014 at 1:19 AM, Erick Erickson  >
> > wrote:
> >
> > > Probably go for 4.9.1. There'll be a 4.10.1 out in the not-too-distant
> > > future that you can upgrade to if you wish. 4.9.1 -> 4.10.1 should be
> > > quite painless.
> > >
> > > But do _not_ copy your schema.xml and solrconfig.xml files over form
> > > 1.4 to 4.x. There are some fairly easy ways to shoot yourself in the
> > > foot there. Take the stock distribution configuration files and copy
> > > _parts_ of your schema.xml and solrconfig.xml you care about.
> > >
> > > If you're using multiple cores, read about core discovery here:
> > > https://wiki.apache.org/solr/Core%20Discovery%20(4.4%20and%20beyond)
> > >
> > > And be very aware that you should _not_ remove any of the _field_
> > > entries in schema.xml. In particular _version_ and _root_ should be
> > > left alone. As well as the "id" field.
> > >
> > > And you'll have to re-index everything; Solr 4.x will not read Solr
> > > 1.4 indexes. If that's impossible, you'll have to upgrade from 1.4 to
> > > 3.x, optimize your index, then upgrade from 3.x to 4.x, add some
> > > documents, and optimize/force_merge again.
> > >
> > > HTH
> > > Erick
> > >
> > > On Mon, Sep 22, 2014 at 2:29 AM, Danesh Kuruppu 
> > > wrote:
> > > > Hi all,
> > > >
> > > > I currently working on upgrade sorl 1.4.1 to sorl latest stable
> > release.
> > > >
> > > > What is the latest stable release I can use?
> > > > Is there specfic things I need to look at when upgrade.
> > > >
> > > > Need help
> > > > Thanks
> > > >
> > > > Danesh
> > >
> >
>


Re: Issue Adding Filter Query

2014-09-24 Thread Erick Erickson
Glad your problem isn't one any longer. Yeah, there are a lot
of nooks and crannies that one gets used to with Solr!

I'd estimate that between learning how to read the debug
output and the analysis page, 80-90% of the
"my search isn't working" questions on the list can be answered,
but it takes a while to get comfortable with those tools (and
to even know they exist!)...

Best
Erick

On Wed, Sep 24, 2014 at 6:57 AM, aaguilar  wrote:

> Hello Erick,
>
> Just wanted to let you know that I did the change you suggested and
> everything works as expected.  Also, thanks for letting me know about the
> Analysis page in solr.  I did not know about it and I have found it very
> useful.
>
> Thanks!
>
> On Mon, Sep 22, 2014 at 5:41 PM, Antelmo Aguilar <
> antelmo.aguilar...@nd.edu>
> wrote:
>
> > Hello Erick,
> >
> > Thank you so much for your help.  That makes perfect sense.  I will do
> the
> > changes you suggest and let you know how it goes.
> >
> > Thanks!
> >
> > On Mon, Sep 22, 2014 at 4:12 PM, Erick Erickson [via Lucene] <
> > ml-node+s472066n4160547...@n3.nabble.com> wrote:
> >
> >> You have your index and query time analysis chains defined much
> >> differently. Omitting the WordDelimiterFilterFactory from the
> >> query-time analysis chain will lead to endless problems.
> >>
> >> With the definition you have, here are the terms in the index and
> >> their term positions as  below. This is available from the
> >> admin/analysis page if you click the "verbose" checkbox, although I
> >> admit it's kind of hard to read:
> >> 1      2             3        4
> >> fatty  acid-binding  binding  protein
> >>        acid
> >>
> >> But at query time, this is how they're being analyzed
> >> 1      2             3
> >> fatty  acid-binding  protein
> >>
> >> So searching for "fatty acid-binding protein" requires that the tokens
> >> "fatty" "acid-binding" and "protein" appear in term positions 1, 2, 3
> >> rather  than where they actually are (1, 2, 4). Searching for "fatty
> >> acid-binding protein"~1 would actually find this, the "~1" means allow
> >> one gap in there.
> >>
> >> HOWEVER, that's the least of your problems. WordDelimiterFilterFactory
> >> will _also_ "split on intra-word delimiters (all non alpha-numeric
> >> characters)". While that doesn't really say so explicitly, that will
> >> have the effect of removing puncutation. So searching for "fatty
> >> acid-binding protein."~1 (note the period) will fail since the token
> >> will include the period.
> >>
> >> I'd _really_ advise you to use the stock WordDelimiterFilterFactory
> >> settings in both analysis and query times included in the stock Solr
> >> release for, say, text_en_splitting or even a single analyzer like
> >> text_en_splitting_tight.
> >>
> >> Best,
> >> Erick
> >>
> >> On Mon, Sep 22, 2014 at 6:33 AM, aaguilar <[hidden email]
> >> > wrote:
> >>
> >> > Hello Erick.
> >> >
> >> > Below is the information you requested.   Thanks for your help!
> >> >
> >> >  >> positionIncrementGap=
> >> > "100">   >> > "solr.WhitespaceTokenizerFactory"/>  >> > "solr.WordDelimiterFilterFactory" splitOnNumerics="0"
> >> splitOnCaseChange="0"
> >> > generateWordParts="1" generateNumberParts="0" catenateWords="0"
> >> > catenateNumbers="0" catenateAll="0" preserveOriginal="1"/>  >> class=
> >> > "solr.StopFilterFactory"/>  >> class="solr.LowerCaseFilterFactory"/>  >> > analyzer>   >> > "solr.WhitespaceTokenizerFactory"/>  >> > "solr.LowerCaseFilterFactory"/>  
> >> >
> >> >
> >> >  >> stored="true"
> >> > />
> >> >
> >> > On Fri, Sep 19, 2014 at 7:36 PM, Erick Erickson [via Lucene] <
> >> > [hidden email]  >>
> >> wrote:
> >> >
> >> >> Hmmm, I'd have to see the schema definition for your description
> >> >> field. For this, the admin/analysis page is very helpful. Here's my
> >> >> guess:
> >> >>
> >> >> Your analysis chain doesn't break the incoming tokens up quite like
> >> >> you think it is. Thus you have the tokens in your index like
> >> >> 'protein,' (notice the comma) and 'protein-like' rather than just
> >> >> 'protein'. However, I can't quite reconcile this with your statement:
> >> >> "Another weird thing is that if I used description:"fatty
> >> >> acid-binding" AND description:"protein"
> >> >>
> >> >> so I'm at something of a loss. If you paste in your schema definition
> >> >> for the 'description' field _and_ the corresponding 
> >> >> definition I can give it a quick whirl.
> >> >>
> >> >> Best,
> >> >> Erick
> >> >>
> >> >> On Fri, Sep 19, 2014 at 11:53 AM, aaguilar <[hidden email]
> >> >> > wrote:
> >> >>
> >> >> > Hello Erick,
> >> >> >
> >> >> > Thanks for the response.  I tried adding the debug=True to the
> >> query,
> >> >> but I
> >> >> > do not know exactly what I am looking for in the output.  Would it
> >> be
> >> >> > possible for yo

Memory issue in merge thread

2014-09-24 Thread Thomas Mortagne
Hi guys,

I recently upgraded from Solr 4.0 to 4.8.1. I start it with a clean
index (we did some change in the Solr schema in the meantime) and
after some time of indexing a very big database my instance is
becoming totally unusable with 99% of the heap filled. Then when I
restart it, it gets stuck very quickly with the same memory issue, so it
seems linked to the size of the Lucene index more than the time spent
indexing data.

YourKit is telling me that "Lucene Merge Thread #1"
(ConcurrentMergeScheduler$MergeThread) is keeping 4,095 instances of
org.apache.lucene.codecs.lucene42.Lucene42DocValuesProducer$2 in some
local variable(s) which retain about 422MB of RAM and keep adding more
of those to the heap.

Whatever this thread is doing (the ConcurrentMergeScheduler javadoc is not
very detailed), it does not seem to be doing it in a very streamed
fashion (if not "simply" a memory leak in 4.8.1). Any idea if this is
expected? Do I have a way to control the size of the heap this thread
is going to need?

I have the heap dump if anyone wants more details or wants to look at
it. The Solr setup can be seen at
https://github.com/xwiki/xwiki-platform/tree/xwiki-platform-6.2/xwiki-platform-core/xwiki-platform-search/xwiki-platform-search-solr/xwiki-platform-search-solr-api/src/main/resources/solr/xwiki/conf

I know all that is only talking about Lucene classes, but since on my
side what I use is Solr I thought it was better to ask on this mailing
list.

Thanks,
-- 
Thomas


MRIT's morphline mapper doesn't co-locate with data

2014-09-24 Thread Tom Chen
Hi,

The MRIT (MapReduceIndexerTool) uses NLineInputFormat for the morphline
mapper. The mapper doesn't co-locate with the input data that it processes.
Isn't this a performance hit?

Ideally, the morphline mapper should be run on those hosts that contain most of the
data blocks for the input files it processes.

Regards,
Tom


Re: Spellchecking and suggesting part numbers

2014-09-24 Thread Jorge Luis Betancourt Gonzalez
I’ve done something similar to this using EdgeNGram rather than the 
spellchecker component; I don’t know if this is in line with your requirements:

The relevant portion of my fieldType config:



 class="solr.SpellCheckComponent">
>   
>   solr.IndexBasedSpellChecker
>   ./spellchecker
>   did_you_mean_part
>   
>   
>startup="lazy">
>   
>   did_you_mean_part
>   on
>   
>   
>   spellcheck_part
>   
>   
> 
> 
>positionIncrementGap="100">
>   
>class="solr.PatternReplaceCharFilterFactory" pattern="[\s]+" replacement=""/>
>   
>   
>minGramSize="1" maxGramSize="20" side="front"/>
>class="solr.RemoveDuplicatesTokenFilterFactory"/>
>   
>   
>class="solr.PatternReplaceCharFilterFactory" pattern="[\s]+" replacement=""/>
>   
>   
>minGramSize="1" maxGramSize="20" side="front"/>
>   
>   
> 
> Can we tweak the setup such that we should get more relevant part numbers?
> 
> Thanks,
> Alexander

Concurso "Mi selfie por los 5". Detalles en 
http://justiciaparaloscinco.wordpress.com
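
The configuration quoted above was mangled by the mailing list archive; purely as a
rough reconstruction-style sketch of the edge-n-gram approach described (the type
name and the KeywordTokenizerFactory are assumptions, not the poster's exact
settings):

  <fieldType name="text_partno_prefix" class="solr.TextField" positionIncrementGap="100">
    <analyzer type="index">
      <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\s]+" replacement=""/>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20" side="front"/>
      <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
    </analyzer>
    <analyzer type="query">
      <charFilter class="solr.PatternReplaceCharFilterFactory" pattern="[\s]+" replacement=""/>
      <tokenizer class="solr.KeywordTokenizerFactory"/>
    </analyzer>
  </fieldType>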


Re: Solr 4.10 termsIndexInterval and termsIndexDivisor not supported with default PostingsFormat?

2014-09-24 Thread Tom Burton-West
Thanks Hoss,

Just opened SOLR-6560 and attached a patch which removes the offending
section from the example solrconfig.xml file.

  We suspect that with the much more efficient block- and FST-based Solr 4
default postings format, the need to mess with the parameters in order
to reduce memory usage has gone away.  Haven't really tested yet.

If there is still a use case for configuring the Solr default
PostingsFormat  and the ability to set the parameters currently exists,
then maybe someone who understands this could put an example in the
solrconfig.xml file and documentation.   On the other hand if the use case
still exists and Solr doesn't have the ability to configure the parameters,
maybe another issue should be opened.  Looks like all that would be needed
is a mechanism to pass a couple of ints to the Lucene postings format:

" For example, Lucene41PostingsFormat
implements
the term index instead based upon how terms share prefixes. To configure
its parameters (the minimum and maximum size for a block), you would
instead use Lucene41PostingsFormat.Lucene41PostingsFormat(int, int),
which can also be configured on a per-field basis:"

Tom
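
The javadoc's own per-field example did not survive here; a rough, untested sketch of
that kind of override (assuming the Lucene 4.10 codec API; 25/48 are Lucene's default
block sizes, and Solr would additionally need a custom CodecFactory to pick this up):

  import org.apache.lucene.codecs.PostingsFormat;
  import org.apache.lucene.codecs.lucene41.Lucene41PostingsFormat;
  import org.apache.lucene.codecs.lucene410.Lucene410Codec;

  public class TunedBlockCodec extends Lucene410Codec {
    // Smaller/larger block sizes trade term-index memory against lookup cost.
    private final PostingsFormat tuned = new Lucene41PostingsFormat(25, 48);

    @Override
    public PostingsFormat getPostingsFormatForField(String field) {
      return tuned;  // could also vary per field name
    }
  }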

On Thu, Sep 18, 2014 at 1:42 PM, Chris Hostetter 
wrote:

>
> : I think the documentation and example files for Solr 4.x need to be
> : updated.  If someone will let me know I'll be happy to fix the example
> : and perhaps someone with edit rights could fix the reference guide.
>
> I think you're correct - can you open a Jira with suggested improvements
> for the configs?  (i see you commented on the ref guide page which is
> helpful - but the jira issue will also help serve as a reminder to audit
> *all* the pages for refrences to these options, ie: in config snippets,
> etc...)
>
> : According to the JavaDocs for IndexWriterConfig, the Lucene level
> : implementations of these do not apply to the default PostingsFormat
> : implementation.
> :
> http://lucene.apache.org/core/4_10_0/core/org/apache/lucene/index/IndexWriterConfig.html#setReaderTermsIndexDivisor%28int%29
> :
> : Despite this statement in the Lucene JavaDocs, in the
> : example/solrconfig.xml there is the following:
>
> Yeah ... I'm not sure what (if anything?) we should say about these in the
> example configs -- the *setting* is valid and supported by
> IndexWriterConfig no matter what posting format you use, so it's not an
> error to configure this, but it can be ignored in many cases.
>
> : Can someone please confirm that these two parameter settings
> : termIndexInterval and termsIndexDivisor, do not apply to the default
> : PostingsFormat for Solr 4.10?
>
> I was taking your word for it :)
>
>
> -Hoss
> http://www.lucidworks.com/
>


Scoring with wildcards

2014-09-24 Thread Pigeyre Romain
Hi,

I have two records with a name_fra field,
one with name_fra="un test CARREAU"
and another one with name_fra="un test CARRE":

{
"codeBarre": "1",
"name_FRA": "un test CARREAU"
  }
{
"codeBarre": "2",
"name_FRA": "un test CARRE"
  }

The configuration of these fields is:


When I'm using this query:
http://localhost:8983/solr/cdv_product/select?q=text%3Acarre*&fl=score%2C+*&wt=json&indent=true&debugQuery=true
The result is :
{
  "responseHeader":{
"status":0,
"QTime":2,
"params":{
  "debugQuery":"true",
  "fl":"score, *",
  "indent":"true",
  "q":"text:carre*",
  "wt":"json"}},
  "response":{"numFound":2,"start":0,"maxScore":1.0,"docs":[
  {
   "codeBarre":"1",
"name_FRA":"un test CARREAU",
"_version_":1480150860842401792,
"score":1.0},
  {
"codeBarre":"2",
"name_FRA":"un test CARRE",
"_version_":1480150875738472448,
"score":1.0}]
  },
  "debug":{
"rawquerystring":"text:carre*",
"querystring":"text:carre*",
"parsedquery":"text:carre*",
"parsedquery_toString":"text:carre*",
"explain":{
  "1":"\n1.0 = (MATCH) ConstantScore(text:carre*), product of:\n  1.0 = 
boost\n  1.0 = queryNorm\n",
  "2":"\n1.0 = (MATCH) ConstantScore(text:carre*), product of:\n  1.0 = 
boost\n  1.0 = queryNorm\n"},
"QParser":"LuceneQParser",
"timing":{
  "time":2.0,
  "prepare":{
"time":1.0,
"query":{
  "time":1.0},
"facet":{
  "time":0.0},
"mlt":{
  "time":0.0},
"highlight":{
  "time":0.0},
"stats":{
  "time":0.0},
"expand":{
  "time":0.0},
"debug":{
  "time":0.0}},
  "process":{
"time":1.0,
"query":{
  "time":0.0},
"facet":{
  "time":0.0},
"mlt":{
  "time":0.0},
"highlight":{
  "time":0.0},
"stats":{
  "time":0.0},
"expand":{
  "time":0.0},
"debug":{
  "time":1.0}

The score is the same for both records. The CARREAU record is first and CARRE is 
next. I want to place CARRE before the CARREAU result because CARRE is an exact 
match. Is that possible?

NB: scoring for this query only uses queryNorm and boosts

In this test:
http://localhost:8983/solr/cdv_product/select?q=text%3Acarre&fl=score%2C*&wt=json&indent=true&debugQuery=true

I have only one record found but the scoring is more complex. Why?

{
  "responseHeader":{
    "status":0,
    "QTime":2,
    "params":{
      "debugQuery":"true",
      "fl":"score,*",
      "indent":"true",
      "q":"text:carre",
      "wt":"json"}},
  "response":{"numFound":1,"start":0,"maxScore":0.53033006,"docs":[
      {
        "codeBarre":"2",
        "name_FRA":"un test CARRE",
        "_version_":1480150875738472448,
        "score":0.53033006}]
  },
  "debug":{
    "rawquerystring":"text:carre",
    "querystring":"text:carre",
    "parsedquery":"text:carre",
    "parsedquery_toString":"text:carre",
    "explain":{
      "2":"\n0.53033006 = (MATCH) weight(text:carre in 0) [DefaultSimilarity], result of:\n  0.53033006 = fieldWeight in 0, product of:\n1.4142135 = tf(freq=2.0), with freq of:\n  2.0 = termFreq=2.0\n1.0 = idf(docFreq=1, maxDocs=2)\n0.375 = fieldNorm(doc=0)\n"},
    "QParser":"LuceneQParser",
    "timing":{
      "time":2.0,
      "prepare":{
        "time":1.0,
        "query":{
          "time":1.0},
        "facet":{
          "time":0.0},
        "mlt":{
          "time":0.0},
        "highlight":{
          "time":0.0},
        "stats":{
          "time":0.0},
        "expand":{
          "time":0.0},
        "debug":{
          "time":0.0}},
      "process":{
        "time":1.0,
        "query":{
          "time":0.0},
        "facet":{
          "time":0.0},
        "mlt":{
          "time":0.0},
        "highlight":{
          "time":0.0},
        "stats":{
          "time":0.0},
        "expand":{
          "time":0.0},
        "debug":{
          "time":1.0}





Romain PIGEYRE
Centre de service de Lyon


Sopra
Parc du Puy d'Or
72 Allée des Noisetiers - CS 10137
69578 - LIMONEST
France
Phone : +33 (0)4 37 26 43 33
romain.pige...@sopra.com - 
www.sopra.com


This message may contain confidential information
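
For what it's worth, the explain output above shows why both documents score 1.0: a
wildcard query such as text:carre* is rewritten to a constant-score query, so term
statistics are not used at all. One possible workaround (not from this thread, and
untested against this schema) is to combine the wildcard with a boosted exact term
so that exact matches sort first, e.g.:

  q=text:carre^10 OR text:carre*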

Re: [ANN] Lucidworks Fusion 1.0.0

2014-09-24 Thread Sebastián Ramírez
It's good to know you'll talk about it at Lucene/Solr Revolution 2014 too.


*Sebastián Ramírez*
Diseñador de Algoritmos

 

 Tel: (+571) 795 7950 ext: 1012
 Cel: (+57) 300 370 77 10
 Calle 73 No 7 - 06  Piso 4
 Linkedin: co.linkedin.com/in/tiangolo/
 Email: sebastian.rami...@senseta.com
 www.senseta.com

On Wed, Sep 24, 2014 at 6:13 AM, Grant Ingersoll 
wrote:

> Hi Thomas,
>
> Thanks for the question, yes, I give a brief demo of it in action during
> my talk and we will have demos at our booth.  I will also give a demo
> during the Webinar, which will be recorded.  As others have said as well,
> you can simply download it and try yourself.
>
> Cheers,
> Grant
>
> On Sep 23, 2014, at 2:00 AM, Thomas Egense 
> wrote:
>
> > Hi Grant.
> > Will there be a Fusion demostration/presentation  at Lucene/Solr
> Revolution
> > DC? (Not listed in the program yet).
> >
> >
> > Thomas Egense
> >
> > On Mon, Sep 22, 2014 at 3:45 PM, Grant Ingersoll 
> > wrote:
> >
> >> Hi All,
> >>
> >> We at Lucidworks are pleased to announce the release of Lucidworks
> Fusion
> >> 1.0.   Fusion is built to overlay on top of Solr (in fact, you can
> manage
> >> multiple Solr clusters -- think QA, staging and production -- all from
> our
> >> Admin).In other words, if you already have Solr, simply point
> Fusion at
> >> your instance and get all kinds of goodies like Banana (
> >> https://github.com/LucidWorks/Banana -- our port of Kibana to Solr + a
> >> number of extensions that Kibana doesn't have), collaborative filtering
> >> style recommendations (without the need for Hadoop or Mahout!), a modern
> >> signal capture framework, analytics, NLP integration, Boosting/Blocking
> and
> >> other relevance tools, flexible index and query time pipelines as well
> as a
> >> myriad of connectors ranging from Twitter to web crawling to Sharepoint.
> >> The best part of all this?  It all leverages the infrastructure that you
> >> know and love: Solr.  Want recommendations?  Deploy more Solr.  Want log
> >> analytics?  Deploy more Solr.  Want to track important system metrics?
> >> Deploy more Solr.
> >>
> >> Fusion represents our commitment as a company to continue to contribute
> a
> >> large quantity of enhancements to the core of Solr while complementing
> and
> >> extending those capabilities with value adds that integrate a number of
> 3rd
> >> party (e.g connectors) and home grown capabilities like an all new,
> >> responsive UI built in AngularJS.  Fusion is not a fork of Solr.  We do
> not
> >> hide Solr in any way.  In fact, our goal is that your existing
> applications
> >> will work out of the box with Fusion, allowing you to take advantage of
> new
> >> capabilities w/o overhauling your existing application.
> >>
> >> If you want to learn more, please feel free to join our technical
> webinar
> >> on October 2:
> http://lucidworks.com/blog/say-hello-to-lucidworks-fusion/.
> >> If you'd like to download: http://lucidworks.com/product/fusion/.
> >>
> >> Cheers,
> >> Grant Ingersoll
> >>
> >> 
> >> Grant Ingersoll | CTO
> >> gr...@lucidworks.com | @gsingers
> >> http://www.lucidworks.com
> >>
> >>
>
> 
> Grant Ingersoll | @gsingers
> http://www.lucidworks.com
>
>
>
>
>
>

-- 
**
*This e-mail transmission, including any attachments, is intended only for 
the named recipient(s) and may contain information that is privileged, 
confidential and/or exempt from disclosure under applicable law. If you 
have received this transmission in error, or are not the named 
recipient(s), please notify Senseta immediately by return e-mail and 
permanently delete this transmission, including any attachments.*


Does soft commit block on autowarming?

2014-09-24 Thread Bruce Johnson
I currently have an algorithm that needs to know whether query results are
fresh up to a known point in time, and I'm using an explicit soft commit
request to act as a latch point. I record the time T just before I issue a
soft commit request, and when it returns, I assume that query results
include all documents indexed prior to T. (Note that I record T *prior* to
issuing the soft commit request, to avoid the obvious race.)

Is that sound? Is it reliably true that once a soft commit request returns,
any subsequent queries will hit a new (and autowarmed) searcher? I'm
specifically wondering whether Solr might continue to autowarm a new
pending searcher in the background after claiming to finish the soft commit
and then at some unknown point later switch to the newer searcher, such
that queries can hit a stale searcher accidentally.

Hope that question makes sense. Any help, pointers to code, whatever, would
be greatly appreciated!

- Bruce


Re: Does soft commit block on autowarming?

2014-09-24 Thread Yonik Seeley
On Wed, Sep 24, 2014 at 6:56 PM, Bruce Johnson  wrote:
> Is it reliably true that once a soft commit request returns,
> any subsequent queries will hit a new (and autowarmed) searcher?

Yes.
The default for commit and softCommit commands is waitSearcher=true,
which will not return until a new searcher is "registered".  After
that point, you're guaranteed to get the new searcher for any
requests.  Autowarming happens before searcher registration and hence
isn't an issue.
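
For reference, an explicit soft commit with that blocking behavior can be issued like
this (collection name is illustrative):

  curl 'http://localhost:8983/solr/collection1/update?softCommit=true&waitSearcher=true&wt=json'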

-Yonik
http://heliosearch.org - native code faceting, facet functions,
sub-facets, off-heap data


Re: MRIT's morphline mapper doesn't co-locate with data

2014-09-24 Thread Wolfgang Hoschek
Based on our measurements, Lucene indexing is so CPU intensive that it wouldn’t 
really help much to exploit data locality on read. The overwhelming bottleneck 
remains the same. Having said that, we have an ingestion tool in the works that 
will take advantage of data locality for splittable files as well.

Wolfgang.

On Sep 24, 2014, at 9:38 AM, Tom Chen  wrote:

> Hi,
> 
> The MRIT (MapReduceIndexerTool) uses NLineInputFormat for the morphline
> mapper. The mapper doesn't co-locate with the input data that it process.
> Isn't this a performance hit?
> 
> Ideally, morphline mapper should be run on those hosts that contain most
> data blocks for the input files it process.
> 
> Regards,
> Tom



Solr Cloud Default Document Routing

2014-09-24 Thread Susmit Shukla
Hi,

I'm building out a multi-shard Solr collection as the index size is likely
to grow fast.
I was testing out the setup with 2 shards on 2 nodes with test data.
Indexed a few documents with "id" as the unique key.
collection create command -
/solr/admin/collections?action=CREATE&name=multishard&numShards=2

used this command to upload - curl
http://server/solr/multishard/update/json?commitWithin=2000 --data-binary
@data.json -H 'Content-type:application/json'

data.json -
[
  {
"id": "100161200"
  },
  {
"id": "100161384"
  }
]

when I query one of the nodes with an id constraint, I see the query
executed on both shards, which looks inefficient - QTime increased to double
digits. I guess Solr would know, based on the id, which shard the data went to.

I have a few questions around this as I could not find pertinent
information on user lists or documentation.
- the query is hitting all shards and replicas - if I have 3 shards and 5
replicas, how would the performance be impacted, since for the very simple
case it increased to double digits?
- Could id lookup queries just go to one shard automatically?

/solr/multishard/select?q=id%3A100161200&wt=json&indent=true&debugQuery=true

"QTime":13,

  "debug":{
"track":{
  "rid":"-multishard_shard1_replica1-1411605234897-171",
  "EXECUTE_QUERY":[
"http://server1/solr/multishard_shard1_replica1/";,[
  "QTime","1",
  "ElapsedTime","4",
  "RequestPurpose","GET_TOP_IDS",
  "NumFound","1",
  "Response","some resp"],
"http://server2/solr/multishard_shard2_replica1/";,[
  "QTime","1",
  "ElapsedTime","6",
  "RequestPurpose","GET_TOP_IDS",
  "NumFound","0",
  "Response","some"]],
  "GET_FIELDS":[
"http://server1/solr/multishard_shard1_replica1/";,[
  "QTime","0",
  "ElapsedTime","4",
  "RequestPurpose","GET_FIELDS,GET_DEBUG",
  "NumFound","1",


Thanks,
Susmit
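
As a side note not raised in this thread: with the default compositeId router, Solr
4.x also accepts a _route_ request parameter that restricts a query to the shard
owning the given routing key, which is one way an id lookup can avoid fanning out.
A rough sketch against the collection above (assuming the default router; untested):

  http://server/solr/multishard/select?q=id:100161200&_route_=100161200&wt=json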


RE: SolrCloud RAM requirements

2014-09-24 Thread Norgorn
Thanks again.
I'd answered before properly reading your post; my apologies.

I can't say for sure, because the filter caches are outside the JVM (that's HS), but
top shows 5 GB cached and no free RAM.
The only question for me now is how to balance disk cache and filter cache.
Do I need to worry about that, or is a big disk cache enough?
And does "optimized index" mean the Solr "optimize" command, or something else?

Anyway, your previous answers are really great, so don't spend time on this if you
don't have much to spare :)



--
View this message in context: 
http://lucene.472066.n3.nabble.com/SlrCloud-RAM-requirments-tp4160853p4161047.html
Sent from the Solr - User mailing list archive at Nabble.com.