weak documents

2013-11-27 Thread Thomas Scheffler
Hi, I am relatively new to SOLR and I am looking for a neat way to implement weak documents with SOLR. Whenever a document is updated or deleted, all its dependent documents should be removed from the index. In other words they exist as long as the document exists they refer to when they were

Persist solr cache

2013-11-27 Thread Prasi S
Hi all, Is there any way the Solr caches (document / field / query) can be persisted on disk? In case of a system crash, can I have the new cache loaded from the persisted cache? Thanks, Prasi

Re: weak documents

2013-11-27 Thread Upayavira
Just a guess, I haven't investigated them fully yet, but I wonder if block joins could serve you here, as they involve creating docs in a parent child relationship. Or, you could easily fake it: abcd parent:abcd Not sure if that syntax is completely right, but using that sort of thing woul

Re: Solr Autowarmed queries on jvm crash

2013-11-27 Thread michael.boom
As Shawn stated above, when you start up Solr there will be no such thing as caches or old searchers. If you want to warm up, you can only rely on firstSearcher and newSearcher queries. /"What would happen to the autowarmed queries , cache , old searcher now ?"/ They're all gone. - Thanks,

Re: Persist solr cache

2013-11-27 Thread michael.boom
Caches are only valid as long as the Index Searcher is valid. So, if you make a commit with opening a new searcher then caches will be invalidated. However, in this scenario you can configure your caches so that the new searcher will keep a certain number of cache entries from the previous one (aut

Re: weak documents

2013-11-27 Thread Paul Libbrecht
Thomas, our experience with Curriki.org is that evaluating what I call the "related documents" is a procedure that needs access to the complete content and thus is run at the DB level and not the Solr level. For example, if a user changes a part of his name, we need to reindex all of his resou

Re: syncronization between replicas

2013-11-27 Thread adfel70
I'm sorry, I forgot to write the problem. adfel70 wrote > 1. take one of the replicas of shard1 down(it doesn't matter which one) > 2. continue indexing documents(that's important for this scenario) > 3. take down the second replica of shard1(now the shard is down and we > cannot index anymore) >

Re: Solr 3.6.1 stalling with high CPU and blocking on field cache

2013-11-27 Thread Guido Medina
We have a webapp running with a very high HEAP size (24GB) and we have no problems with it AFTER we enabled the new GC that is meant to replace sometime in the future the CMS GC, but you have to have Java 6 update "Some number I couldn't find but latest should cover" to be able to use: 1. Remo

Re: Persist solr cache

2013-11-27 Thread Prasi S
Currently, once Solr is started, we run a batch that fires queries to Solr (just like the firstSearcher does). Once this is done, the users start using search. If the server is restarted or anything crashes, I have to run this batch again, which I cannot contro

Re: Multivalued true Error?

2013-11-27 Thread Furkan KAMACI
Thanks Sujit, I got the problem and fixed it. 2013/11/26 Sujit Pal > Hi Furkan, > > In the stock definition of the payload field: > > http://svn.apache.org/viewvc/lucene/dev/trunk/solr/example/solr/collection1/conf/schema.xml?view=markup > > the analyzer for payloads field type is a WhitespaceT

Re: weak documents

2013-11-27 Thread Thomas Scheffler
On 27.11.2013 09:58, Paul Libbrecht wrote: Thomas, our experience with Curriki.org is that evaluating what I call the "related documents" is a procedure that needs access to the complete content and thus is run at the DB level and not the Solr level. For example, if a user changes a part of i

TrimFilterFactory and IllegalArgumentException with Solr4.6

2013-11-27 Thread Bernd Fehling
Now this is strange, while using TrimFilterFactory with attribute "updateOffsets=true" as described in http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.TrimFilterFactory and https://cwiki.apache.org/confluence/display/solr/Filter+Descriptions#FilterDescriptions-TrimFilter I get "

News url is not showing correct

2013-11-27 Thread Vishal GUPTA
Hi, I am facing a problem in Solr with the tt_news URL. Every time, it shows all news from one detail page. For example, I have two categories of news: 1. Corporate 2. Human. So the URL for corporate should be formed like: domainname/pagename/corporate/detail/article/newsheading And for human

LanguageIdentifierUpdateProcessor uses only firstValue() on multivalued fields

2013-11-27 Thread Müller , Stephan
Hello, this is a repost. This message was originally posted on the 'general' list but it was suggested, that the 'user' list might be a better place to ask. Original Message Hi, we are passing a multivalued field to the LanguageIdentifierUpdateProcessor. This multivalued field contai

Re: Persist solr cache

2013-11-27 Thread michael.boom
You could just add the queries you have set up in your batch script to the firstSearcher queries. That way, you wouldn't need to run the script every time you restart Solr. As for crash protection and immediate action, that's outside the scope of the Solr mailing list. You could set up a watchdog t

Re: syncronization between replicas

2013-11-27 Thread Daniel Collins
I think when a replica becomes leader, it tries to sync *from* all the other replicas to see if anyone else is more up to date than it is, then it syncs back out *to* the replicas. But that probably won't happen in your case, since when replica1 comes back (step 4) it is the only contender, so it

Which patch for which solr version

2013-11-27 Thread Ramo Karahasan
Hi, on https://issues.apache.org/jira/browse/SOLR-3583 there are some patches listed. I currently can't really figure out, for which solr version this patch is valid, since the issue listed there is still open and should be fixed for version 4.6. I'm wondering if this patch can be appli

RE: Need help on Joining and sorting syntax and limitations between multiple documents in solr-4.4.0

2013-11-27 Thread Sukanta Dey
Hi Team, As per the latest updates in the support ticket in Lucid portal we have some concerns as below 1. The join key id's seem to have to be integers. It says they require longs, but I am having trouble with anything but an integer as the "from" and "to" key values. --regarding the

Re: SOLR Master-Slave Repeater with Load balancer

2013-11-27 Thread Erick Erickson
Yes. This is going to hurt you a lot. The intent of M/S is that you should be indexing to one, and only one machine, the master. All slaves pull their indexes from the master. Frankly I don't know quite what will happen in the configuration you're talking about. I strongly recommend you do not do t

Re: TermVectorComponent NullPointerException

2013-11-27 Thread Erick Erickson
Please review: http://wiki.apache.org/solr/UsingMailingLists You've given us almost no information to go on here. Best, Erick On Tue, Nov 26, 2013 at 2:21 PM, GOYAL, ANKUR wrote: > Hi, > > I am working on using term vector component with solr 4.2.1. If I use solr > in a multicore environment,

Re: LanguageIdentifierUpdateProcessor uses only firstValue() on multivalued fields

2013-11-27 Thread Jack Krupansky
I suspect that it is an oversight for a use case that was not considered. I mean, it should probably either ignore or convert non text/string values. Hmmm... are you using JSON input? I mean, how are the types being set? Solr XML doesn't have a way to set the value types. You could work around

Re: Client-side proxy for Solr 4.5.0

2013-11-27 Thread Reyes, Mark
What about using some JSONP techniques since the results in the Solr instance rest as key/value pairs? On 11/26/13, 10:53 AM, "Markus Jelsma" wrote: >I don't think you mean client-side proxy. You need a server side layer >such as a normal web application or good proxy. We use Nginx, it is very

Re: Is it possible to have only fq in my solr query?

2013-11-27 Thread Erick Erickson
The sense of "fq" clauses is "for all the docs that match my primary query, only show the ones that match the fq clause". There's no primary query to work with. If you really need this capability, you can add this to the section of your request handler in solrconfig.xml *:* The oob request handl
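The filter-only request Erick describes can also be built on the client by always supplying a match-all primary query. The following is a minimal Python sketch, not the solrconfig.xml change from the message; the helper name `filter_only_params` and the field names in the usage line are hypothetical.

```python
from urllib.parse import urlencode, parse_qs


def filter_only_params(filters, q=None):
    """Build a Solr query string that always carries a primary query.

    If the caller gives no q, fall back to the match-all query *:* so
    the fq clauses have a result set to restrict.
    """
    params = [("q", q if q else "*:*")]
    # Each filter becomes its own fq parameter.
    params.extend(("fq", f) for f in filters)
    return urlencode(params)


# Hypothetical field names, for illustration only.
query_string = filter_only_params(["type:book", "inStock:true"])
print(parse_qs(query_string))
```

Keeping each filter in a separate fq parameter, rather than joining them into one clause, lets Solr cache them independently.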

Re: weak documents

2013-11-27 Thread Jack Krupansky
Just bite the bullet and do the query at your application level. I mean, Solr/Lucene would have to do the same amount of work internally anyway. If the perceived performance overhead is too great, get beefier hardware. -- Jack Krupansky -Original Message- From: Thomas Scheffler Sent:

Re: News url is not showing correct

2013-11-27 Thread Erick Erickson
My _guess_, and it's only a guess since you haven't shown us anything about your Solr setup, is that all your documents are getting indexed with the same ID so you only have one live document. You might review: http://wiki.apache.org/solr/UsingMailingLists Best, Erick On Wed, Nov 27, 2013 at 5:

Re: syncronization between replicas

2013-11-27 Thread Erick Erickson
As Daniel says, there's no information available in step 4 for that node to know it's out of date. "Don't do that" isn't very helpful. I think the only recovery strategy I can think of offhand is to reindex from some time T prior to step <1>... Best, Erick On Wed, Nov 27, 2013 at 6:07 AM, Danie

Term Vector Component Question

2013-11-27 Thread Jamie Johnson
I am interested in retrieving the tf for terms that matched the query, not all terms in the document. Is this possible? Looking at the example when I search for the word cable I get the response that is shown below, ideally I'd like to see only the tf for the word cable. Is this possible or woul

Re: Which patch for which solr version

2013-11-27 Thread Erick Erickson
"Try it and see". Not really helpful, but the best we I can do. There's no formal method for insuring that a patch will work with an arbitrary version. At least you're trying to apply it to a version newer than it was created on. Not much help, but If you _do_ apply it to 4.5.1, and if you ha

Re: Term Vector Component Question

2013-11-27 Thread Erick Erickson
Would it serve to return the tf or ttf? You'd have to tack on clauses like fl=*,ttf(name,drive) or fl=*.ttf(name,drive) Which implies that you'd have to do some work on the query side to add the tf or ttf clauses. See: http://wiki.apache.org/solr/FunctionQuery#tf Best, Erick On Wed, Nov 27, 20

Re: Term Vector Component Question

2013-11-27 Thread Jack Krupansky
That information would be included in the debugQuery output as well. -- Jack Krupansky -Original Message- From: Jamie Johnson Sent: Wednesday, November 27, 2013 9:32 AM To: solr-user@lucene.apache.org Subject: Term Vector Component Question I am interested in retrieving the tf for

solr as a service for multiple projects in the same environment

2013-11-27 Thread adfel70
Hi, I have various Solr-related projects in a single environment. These projects are not related to one another. I'm thinking of building a Solr architecture so that all the projects will use different Solr collections in the same cluster, as opposed to having a Solr cluster for each project. 1. as

Re: Client-side proxy for Solr 4.5.0

2013-11-27 Thread Guido Medina
Why complicate it? I think the simplest solution to the poster's question is either a transparent proxy or proxying Jetty (or Tomcat) via Apache Web Server. I don't think there will be any difference between the two, only in how easy one or the other is to implement. HTH, Guido. On 27/11/13 14:13

Re: Client-side proxy for Solr 4.5.0

2013-11-27 Thread Guido Medina
Mark, As a 2nd thought, maybe, I was just focusing on what I thought you needed initially, which is to allow clients to query Solr and at the same time restrict specific request parameters. Both Apache and any rich transparent proxy can do the job easily; Apache can rewrite the URL and map only

SolR vs large PDF

2013-11-27 Thread Marcello Lorenzi
Hi All, on our test environment we have implemented a new search engine based on Solr 4.3 with 2 instances hosted on different servers and 1 shard present on each servlet container. During some stress tests we noticed a bottleneck in the crawling of large PDF files that blocks the serving of resu

RE: LanguageIdentifierUpdateProcessor uses only firstValue() on multivalued fields

2013-11-27 Thread Müller , Stephan
> I suspect that it is an oversight for a use case that was not considered. > I mean, it should probably either ignore or convert non text/string > values. Ok, I'll see that I provide a patch against trunk. It actually ignores non string values, but is unable to check the remaining values of a mul

Re: weak documents

2013-11-27 Thread Walter Underwood
Right. Delete by query "id:foo OR dependsOn:foo". --wunder On Nov 27, 2013, at 6:23 AM, "Jack Krupansky" wrote: > Just bite the bullet and do the query at your application level. I mean, > Solr/Lucene would have to do the same amount of work internally anyway. If > the perceived performance o
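Walter's delete-by-query can be sent to Solr as an XML update message. The following is a minimal Python sketch that only builds the message body; `build_delete_query` is a hypothetical helper, and `dependsOn` is the illustrative field name from the message, so it assumes the schema actually stores the parent id in such a field.

```python
from xml.sax.saxutils import escape


def build_delete_query(doc_id, ref_field="dependsOn"):
    """Build an XML update message deleting a document and its dependents."""
    # Quote the id as a phrase so characters such as ':' or spaces are
    # not interpreted as query syntax.
    quoted = '"' + doc_id.replace("\\", "\\\\").replace('"', '\\"') + '"'
    query = "id:%s OR %s:%s" % (quoted, ref_field, quoted)
    # escape() protects &, < and > inside the XML element.
    return "<delete><query>%s</query></delete>" % escape(query)


print(build_delete_query("foo"))
```

The resulting string would be posted to the update handler of the core, followed by a commit; how that is done depends on the client in use.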

Re: Multivalued true Error?

2013-11-27 Thread Furkan KAMACI
Hi Sujit; Your example has that line: override def decodeNormValue(b: Byte) = 1.0F However it is a final class. Do you have any idea to handle it? 2013/11/27 Furkan KAMACI > Thanks Sujit, I got the problem and fixed it. > > > 2013/11/26 Sujit Pal > >> Hi Furkan, >> >> In the stock defini

Re: Multivalued true Error?

2013-11-27 Thread Furkan KAMACI
"it is a final *method*". Can not be overrided at Solr 4.5.1? 2013/11/27 Furkan KAMACI > Hi Sujit; > > Your example has that line: > > override def decodeNormValue(b: Byte) = 1.0F > > > However it is a final class. Do you have any idea to handle it? > > > > 2013/11/27 Furkan KAMACI > >> Thank

Error when creating collection in Solr 4.6

2013-11-27 Thread lansing
Hi, I am using Solr 4.6, with external ZooKeeper 3.4.5, 5 nodes, 5 shards, 3 replicas. I uploaded the collection configuration to ZooKeeper. I am using the new core discovery mode. I have this issue when I try to create a collection with this call : http://10.0.5.227:8101/solr/admin/collections?action=C

Re: solr as a service for multiple projects in the same environment

2013-11-27 Thread michael.boom
Hi, There's nothing unusual in what you are trying to do, this scenario is very common. To answer your questions: > 1. as I understand I can separate the configs of each collection in > zookeeper. is it correct? Yes, that's correct. You'll have to upload your configs to ZK and use the Collection

Re: Error when creating collection in Solr 4.6

2013-11-27 Thread Yago Riveiro
Lansing, I ran the command without any issue http://localhost:8983/solr/admin/collections?action=CREATE&name=Current1&numShards=5&replicationFactor=3&maxShardsPerNode=15&collection.configName=default The only difference was that I have only one box and used the default config from example folde

What is the right way to list top terms for a given field?

2013-11-27 Thread Dave Seltzer
Hello, I'm trying to get a list of top terms for a field called "Tags". One way to do this would be to query all data *:* and then facet by the Tags column: /solr/collection/admin/select?q=*:*&rows=0&facet=true&facet.field=Tags I've noticed another way to do this is using the luke interface like
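The faceting request Dave describes can be assembled programmatically. The following is a minimal Python sketch; the helper `top_terms_facet_url`, the host, and the handler path in the usage line are assumptions, while `facet.limit` is the standard Solr parameter that caps the number of facet values returned.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def top_terms_facet_url(base, field, limit=10):
    """Build a URL that facets on one field and returns no documents."""
    params = [
        ("q", "*:*"),            # match every document
        ("rows", 0),             # only the facet counts are needed
        ("facet", "true"),
        ("facet.field", field),
        ("facet.limit", limit),  # number of top terms to return
    ]
    return base + "?" + urlencode(params)


# Assumed host and handler path, for illustration only.
url = top_terms_facet_url("http://localhost:8983/solr/collection/select", "Tags")
print(url)
```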

Solr 4.3.1 :: Error loading class 'solr.ICUFoldingFilterFactory'

2013-11-27 Thread Raheel Hasan
Hi, I got a new issue now. I have Solr 4.3.0 running just fine. However on Solr 4.3.1, it won't load. I get this issue: {msg=SolrCore 'mycore' is not available due to init failure: Plugin init failure for [schema.xml] fieldType "text_ws": Plugin init failure for [schema.xml] analyzer/filter: Erro

Re: Term Vector Component Question

2013-11-27 Thread Jamie Johnson
Jack, I'm not following, are you suggesting to turn on debug and then parse the explain? Seems very round about if that is the case, no? On Wed, Nov 27, 2013 at 9:40 AM, Jack Krupansky wrote: > That information would be included in the debugQuery output as well. > > -- Jack Krupansky > > -

Re: What is the right way to list top terms for a given field?

2013-11-27 Thread Alexandre Rafalovitch
You can always expose the admin handler on non-admin URL. That's all just definitions in solrconfig.xml. Personal website: http://www.outerthoughts.com/ LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch - Time is the quality of nature that keeps events from happening all at once. Lately, i

Re: What is the right way to list top terms for a given field?

2013-11-27 Thread Stefan Matheis
Since your users shouldn't be allowed at any time to access Solr directly, it's up to you to implement that on the client side anyway? I can't tell if there is a technical difference between the two calls you named, but i'd guess that the second might be a more direct way to access this informa

Re: Term Vector Component Question

2013-11-27 Thread Jamie Johnson
I definitely want tf, the number of times the matched term appears in the document, the key is that I want only the term that was searched for, not all terms. Looking at the tf function this is close, except it needs to be the exact term, I really need it to be the user entered text. So for insta

Re: Term Vector Component Question

2013-11-27 Thread Jack Krupansky
There is an XML version of explain as well, if parsing the structured text is too difficult for your application. The point is that debug "explain" details precisely the term vector values for actual query terms. Don't let the "debug" moniker throw you - this parameter is simply giving you acc

Re: What is the right way to list top terms for a given field?

2013-11-27 Thread Dave Seltzer
It certainly seems to be faster (in my limited testing). I just don't want to base my software on the Luke scripts if they're prone to changing in the future. And yes, I realize there are ways to make this secure. I just wanted to know if it's something I should avoid doing (perhaps for reasons

Re: SolR vs large PDF

2013-11-27 Thread Erick Erickson
I'm assuming you're using the ExtractingRequestHandler. Offloading the entire work onto your Solr box that is also serving queries and indexing is not going to scale well. Consider using Tika/SolrJ (Tika is what the ERH uses anyway) to offload the PDF parsing amongst as many clients as you can aff

SolrCloud and 2MB Synonym file

2013-11-27 Thread Puneet Pawaia
Hi, I am trying to set up a test SolrCloud 4.5.1 implementation. My synonym file is about 1.6 MB. When I try to add the collection to ZooKeeper 3.4.5 on Ubuntu 12.4, it fails because of the 1MB limit of ZooKeeper. Does anyone have experience with using such synonym files? Can I store them in some other lo

Re: SolrCloud and 2MB Synonym file

2013-11-27 Thread Yago Riveiro
You can use the jute.maxbuffer > 1M as a workaround. You must set -Djute.maxbuffer in zookeeper and solr to work properly -- Yago Riveiro Sent with Sparrow (http://www.sparrowmailapp.com/?sig) On Wednesday, November 27, 2013 at 5:15 PM, Puneet Pawaia wrote: > Hi > > I am trying to setup a t
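To make Yago's workaround concrete, the following Python sketch derives a `-Djute.maxbuffer` value from a file size. The helper `jute_maxbuffer_flag` and the 50% headroom are assumptions; the default of 0xFFFFF bytes is the value given in the ZooKeeper administrator's guide as I understand it, so verify it against the version in use.

```python
# Documented ZooKeeper default for jute.maxbuffer (just under 1 MB).
ZK_DEFAULT_MAX_BUFFER = 0xFFFFF


def jute_maxbuffer_flag(file_size_bytes, headroom_percent=150):
    """Return a JVM flag sized for the file, or None if the default suffices."""
    if file_size_bytes < ZK_DEFAULT_MAX_BUFFER:
        return None
    # Leave headroom above the file size (hypothetical safety margin).
    return "-Djute.maxbuffer=%d" % (file_size_bytes * headroom_percent // 100)


# A synonym file of roughly 1.6 MB, as in the question.
print(jute_maxbuffer_flag(1600000))
```

As the reply notes, the same flag has to be passed to both the ZooKeeper servers and the Solr JVMs.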

Re: Term Vector Component Question

2013-11-27 Thread Jamie Johnson
thanks I'm looking at this now, debug seems pretty close to what I want. Is there a way to exclude information from the debug response, for instance I don't need idf, fieldnorm, timing information, etc. Again thanks. On Wed, Nov 27, 2013 at 11:49 AM, Jack Krupansky wrote: > There is an XML ver

Re: Term Vector Component Question

2013-11-27 Thread Jamie Johnson
A little more reading gave me it. I can just do debug=results, but that still includes idf and fieldnorm. Much less though so it's a step ;) If there is any way to get just idf that would be great, otherwise no big deal On Wed, Nov 27, 2013 at 12:18 PM, Jamie Johnson wrote: > thanks I'm looki

Re: SolrCloud and 2MB Synonym file

2013-11-27 Thread Puneet Pawaia
Yago, not sure if this is a good idea. Docs say this is dangerous stuff. Anyway, not being a linux or java expert, I would appreciate if you could point me to an implementation of this. Regards Puneet Pawaia On 27 Nov 2013 22:54, "Yago Riveiro" wrote: > You can use the jute.maxbuffer > 1M as

Re: Term Vector Component Question

2013-11-27 Thread Jack Krupansky
To be honest, this kind of question comes up so often, that it probably is worth a Jira to have a more customized or parameterized "explain". Function queries in the "fl" list give you a lot more control, but not at the level of actual terms that matched. -- Jack Krupansky -Original Mess

Re: Solr 4.3.1 :: Error loading class 'solr.ICUFoldingFilterFactory'

2013-11-27 Thread Shawn Heisey
On 11/27/2013 9:37 AM, Raheel Hasan wrote: I got a new issue now. I have Solr 4.3.0 running just fine. However on Solr 4.3.1, it won't load. I get this issue: {msg=SolrCore 'mycore' is not available due to init failure: Plugin init failure for [schema.xml] fieldType "text_ws": Plugin init failur

Custom Relevancy Using Field Payloads

2013-11-27 Thread Furkan KAMACI
I have a payload field in my schema (Solr 4.5.1). When a user searches for a keyword I will calculate the usual score and "if" a match occurs in that payload field I will add the payload to the general score (payload * normalize coefficient). How can I do that? Custom payload similarity class or custom

Re: SolrCloud and 2MB Synonym file

2013-11-27 Thread Yago Riveiro
How are you launching Solr? Do you have an ensemble or are you running ZooKeeper embedded? Yes, the doc says that jute.maxbuffer is dangerous, but without it you can store nothing larger than 1M in ZooKeeper … and at some point you can have a clusterstate.json with a size greater than 1M -

Re: Function query matching

2013-11-27 Thread Peter Keegan
Hi, So, this query does just what I want, but it's typically 3 times slower than the edismax query without the functions: select?qq={!edismax v='news' qf='title^2 body'}&scaledQ=scale(product( query($qq),1),0,1)&q={!func}sum(product(0.75,$scaledQ), product(0.25,field(myfield)))&fq={!query v=$qq}

Re: SolrCloud and 2MB Synonym file

2013-11-27 Thread Mark Miller
They are just trying to keep users from using ZK in a bad way. Storing and accessing a ton of huge files is not what ZooKeeper was designed for. A 1MB limit is a fairly arbitrary limiter to make sure you don’t shoot yourself in the foot and store lots of large files. With modern networks and har

Re: Function query matching

2013-11-27 Thread Chris Hostetter
: So, this query does just what I want, but it's typically 3 times slower : than the edismax query without the functions: that's because the scale() function is inherently slow (it has to compute the min & max value for every document in order to know how to scale them) what you are seeing is
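The cost Hoss describes comes from min-max scaling needing the extremes of the whole value set before any single value can be scaled. The following is a minimal Python sketch of that idea, combined with the 0.75/0.25 weighting from Peter's query; it is an illustration, not Solr's implementation, and the handling of the case where all values are equal is an assumption.

```python
def scale(values, target_min=0.0, target_max=1.0):
    """Min-max scale values into [target_min, target_max]."""
    # A full pass over all values is needed before any value can be scaled.
    lo, hi = min(values), max(values)
    if hi == lo:
        # Assumed behaviour for a degenerate range.
        return [target_min for _ in values]
    factor = (target_max - target_min) / (hi - lo)
    return [target_min + (v - lo) * factor for v in values]


def blended(scores, field_values, w_score=0.75, w_field=0.25):
    """Weighted sum of the scaled relevance score and a per-document field."""
    scaled = scale(scores)
    return [w_score * s + w_field * f for s, f in zip(scaled, field_values)]


print(scale([2.0, 4.0, 6.0]))
print(blended([2.0, 4.0, 6.0], [1.0, 0.0, 1.0]))
```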

Re: Function query matching

2013-11-27 Thread Peter Keegan
Although the 'scale' is a big part of it, here's a closer breakdown. Here are 4 queries with increasing functions, and their response times (caching turned off in solrconfig): 100 msec: select?q={!edismax v='news' qf='title^2 body'} 135 msec: select?qq={!edismax v='news' qf='title^2 body'}q={!fun

Re: SolrCloud and 2MB Synonym file

2013-11-27 Thread Timothy Potter
I'm curious how much compression you get with your synonym file using something basic like gzip? If significant, would it make sense to store the compressed syn file in ZooKeeper (or any other metadata you need to distribute around the cluster)? This would require the code that reads the syn file f

Re: What is the right way to list top terms for a given field?

2013-11-27 Thread Timothy Potter
Hi Dave, Have you looked at the TermsComponent? http://wiki.apache.org/solr/TermsComponent It is easy to wire into an existing request handler and allows you to return the top terms for a field. Example server even includes an example request handler that uses it: true fa

Re: Term Vector Component Question

2013-11-27 Thread Jamie Johnson
Thanks Jack, I'll see if I can find anything on Jira about this and if not I'll create a ticket for it. On Wed, Nov 27, 2013 at 12:28 PM, Jack Krupansky wrote: > To be honest, this kind of question comes up so often, that it probably is > worth a Jira to have a more customized or parameterized "

Re: Term Vector Component Question

2013-11-27 Thread Jamie Johnson
I didn't see anything so I created this https://issues.apache.org/jira/browse/SOLR-5511 On Wed, Nov 27, 2013 at 2:35 PM, Jamie Johnson wrote: > Thanks Jack, I'll see if I can find anything on Jira about this and if not > I'll create a ticket for it. > > > On Wed, Nov 27, 2013 at 12:28 PM, Jack

Re: SolR vs large PDF

2013-11-27 Thread Marcello Lorenzi
Hi Erick, In our architecture we use Apache ManifoldCF to invoke the scheduling from Manifold-web and we use the Manifold-agent to take the PDF file from the filesystem to the Solr instances. Is it possible to redirect the Manifold scheduling to the SolrJ instance for specific schedules? Tha

NOTE: comments currently disabled for most users in Solr Ref Guide

2013-11-27 Thread Chris Hostetter
jca recently pointed out on the #solr IRC channel that normal (ie: non-committer) confluence-users are not able to post comments on any pages of the Solr Ref Guide. This is evidently due to a change made by Infra that was mentioned in an email to all PMC members on Oct 1 -- but the ramificati

Caches contain deleted docs (?)

2013-11-27 Thread Roman Chyla
Hi, I'd like to check - there is something I don't understand about cache - and I don't know if it is a bug, or feature the following calls return a cache FieldCache.DEFAULT.getTerms(reader, idField); FieldCache.DEFAULT.getInts(reader, idField, false); the resulting arrays *will* contain entrie

Can't post comment on Confluence pages under "Apache Solr Reference Guide"

2013-11-27 Thread Julien Canquelain
Hi, I would like to post a comment about the problem below on Solr Confluence documentation, but comments are disabled right now for confluence-users (at least at the time I'm writing this - it was confirmed on IRC a minute ago). The page I would like to comment on is: https://cwiki.apache.org/conf

Re: NOTE: comments currently disabled for most users in Solr Ref Guide

2013-11-27 Thread Chris Hostetter
FYI: https://issues.apache.org/jira/browse/INFRA-7058 : Changing this back for just the ref guide wiki space would be fairly easy -- : but i don't want to do that until i have a chance to talk to Infra about it. -Hoss

Re: What is the right way to list top terms for a given field?

2013-11-27 Thread Dave Seltzer
Thanks Tim, That seems to be exactly what I'm looking for! -Dave > On Nov 27, 2013, at 2:34 PM, Timothy Potter wrote: > > Hi Dave, > > Have you looked at the TermsComponent? > http://wiki.apache.org/solr/TermsComponent It is easy to wire into an > existing request handler and allows you to re

Re: Term Vector Component Question

2013-11-27 Thread Erick Erickson
Jamie: Before jumping into using debug, do take a bit to test the performance! I've seen the debug component take up to 80% of the query time. Admittedly, that was, I think, 3.6 or something so it may be much different now. But I should have asked first, "Why do you care?". What is your use case.

Re: Caches contain deleted docs (?)

2013-11-27 Thread Erick Erickson
Yep, it's expected. Segments are write-once. It's been a long standing design that deleted data will be reclaimed on segment merge, but not before. It's pretty expensive to change the terms loaded on the fly to respect deleted document's removed data. Best, Erick On Wed, Nov 27, 2013 at 4:07 PM,
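The behaviour Erick describes can be pictured with a toy model: a segment's stored data never changes, a delete only flips a flag, and the deleted entry disappears when a merge writes a new segment. The following Python sketch is a simplification under those assumptions; `Segment` and `merge` are hypothetical names and do not correspond to Lucene's actual classes.

```python
class Segment:
    """Toy stand-in for a write-once index segment."""

    def __init__(self, doc_ids):
        self.doc_ids = tuple(doc_ids)            # written once, never modified
        self.live = [True] * len(self.doc_ids)   # deletes only flip these flags

    def delete(self, doc_id):
        for i, d in enumerate(self.doc_ids):
            if d == doc_id:
                self.live[i] = False

    def cached_ids(self):
        # A cache built from the stored data still sees deleted documents.
        return list(self.doc_ids)

    def live_ids(self):
        # Filtering by the live flags hides deleted documents.
        return [d for d, alive in zip(self.doc_ids, self.live) if alive]


def merge(segments):
    """Write a new segment that contains only the live documents."""
    return Segment([d for s in segments for d in s.live_ids()])


seg = Segment([1, 2, 3])
seg.delete(2)
print(seg.cached_ids())               # deleted id is still present
print(merge([seg]).cached_ids())      # reclaimed after the merge
```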

Re: Error when creating collection in Solr 4.6

2013-11-27 Thread Erick Erickson
Are you using old-style XML files with a tag and maybe tags as well? If so, see: https://issues.apache.org/jira/browse/SOLR-5510 Short form: you may have better luck if you're using old-style solr.xml files by adding: genericCoreNodeNames="${genericCoreNodeNames:true}" to your tag, something li

Re: Caches contain deleted docs (?)

2013-11-27 Thread Roman Chyla
I understand that changes would be expensive, but shouldn't the cache simply skip the deleted docs? In the same way as the cache for multivalued fields (that accepts livedocs bits). Thanks, roman On Wed, Nov 27, 2013 at 6:26 PM, Erick Erickson wrote: > Yep, it's expected. Segments are write-o

Re: NOTE: comments currently disabled for most users in Solr Ref Guide

2013-11-27 Thread Chris Hostetter
FYI: comments should now be working for all registered users. If comment spam becomes a problem too unwieldy to manage by deleting after the fact, we'll have to consider going the same route as we do with an explicit white list of users like we have with moin moin. : Date: Wed, 27 Nov 2013 14

(info) how view lucene merge process

2013-11-27 Thread Jacky.J.Wang (mis.cnsh04.Newegg) 41361
Hello Lucene, how can I view the Lucene merge process?

Re: (info) how view lucene merge process

2013-11-27 Thread Jack Krupansky
What do you really want to do/accomplish? I mean, for what purpose? You can turn on the Lucene infostream for logging of index writing. See: https://cwiki.apache.org/confluence/display/solr/IndexConfig+in+SolrConfig Set to "true". There are some examples in my e-book. -- Jack Krupansky

Re: SolrCloud and 2MB Synonym file

2013-11-27 Thread Puneet Pawaia
I am running an ensemble. Can I get examples of how to use the option? I think there are not many examples available of the exact usage. Regards Puneet On 27 Nov 2013 23:23, "Yago Riveiro" wrote: > How are you launching Solr? > > Do you have an ensemble or you're running zookeeper embedded? > >

Re: Can't post comment on Confluence pages under "Apache Solr Reference Guide"

2013-11-27 Thread Ahmet Arslan
Hi Julien, Please see: http://search-lucene.com/m/MTRUH1cyNGZ1 and https://issues.apache.org/jira/browse/INFRA-7058 On Wednesday, November 27, 2013 11:19 PM, Julien Canquelain wrote: Hi, I would like to post a comment about the problem below on Solr Confluence documentation, but com

Request for Contributors Group

2013-11-27 Thread Ahmet Arslan
Hello all, Please add my username ( iorixxx ) to the Contributors Group. With this, can I edit Confluence too?

Request for Contributors Group

2013-11-27 Thread Shinichiro Abe
Hi, Please add my username ( shinichiro ) to Contributors Group. Thanks in advance, Shinichiro Abe

Re: Error when creating collection in Solr 4.6

2013-11-27 Thread lansing
Thank you for your replies, I am using the new-style discovery It worked after adding this setting : ${genericCoreNodeNames:true} -- View this message in context: http://lucene.472066.n3.nabble.com/Error-when-creating-collection-in-Solr-4-6-tp4103536p4103696.html Sent from the Solr - User ma