One of three cores is missing userData and lastModified fields from /admin/cores

2015-03-24 Thread Aaron Daubman
Hey All, On a Solr server running 4.10.2 with three cores, two return the expected info from /solr/admin/cores?wt=json but the third is missing userData and lastModified. The first (artists) and third (tracks) cores from the linked screenshot are the ones I care about. Unfortunately, the third (t

Re: Understanding fieldNorm differences between 3.6.1 and 4.9 solrs

2014-07-02 Thread Aaron Daubman
/%3CCALyTvnpwZMj4zxPbK0abVpnyRJny=qauijdqmj7e3zgnv7u...@mail.gmail.com%3E In the meantime, I'm still happy to hear any new thoughts / suggestions on making similarity contiguous across upgrades. Thanks again, Aaron On Tue, Jul 1, 2014 at 11:14 PM, Aaron Daubman wrote: > In trying to determine some subtle

Understanding fieldNorm differences between 3.6.1 and 4.9 solrs

2014-07-01 Thread Aaron Daubman
In trying to determine some subtle scoring differences (causing occasionally significant ordering differences) among search results, I wrote a parser to normalize debug.explain.structured JSON output. It appears that every score that is different comes down to a difference in fieldNorm, where the
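As a minimal sketch of the kind of normalization described above: walk the structured explain tree, keep only the fieldNorm nodes, and strip the internal doc ids so that output from two servers can be compared. The key names (description, value, details) follow Solr's structured explain output; the helper names and the sample data are made up for illustration and are not the original parser.

```python
import re

# Internal Lucene doc ids differ between indexes, so drop them before comparing.
DOC_ID = re.compile(r",?\s*doc=\d+")


def field_norms(node):
    """Return [(normalized description, value)] for every fieldNorm node."""
    found = []
    description = DOC_ID.sub("", node.get("description", ""))
    if description.startswith("fieldNorm("):
        found.append((description, float(node["value"])))
    for child in node.get("details", []):
        found.extend(field_norms(child))
    return found


def differing_norms(explain_a, explain_b):
    """Pair up fieldNorm nodes in traversal order and keep the ones that differ."""
    pairs = zip(field_norms(explain_a), field_norms(explain_b))
    return [(desc, a, b) for (desc, a), (_, b) in pairs if a != b]


# Hypothetical explain fragments for the same document on two servers.
old_explain = {
    "match": True, "value": 0.5,
    "description": "weight(title:foo in 7), product of:",
    "details": [{"match": True, "value": 0.375,
                 "description": "fieldNorm(field=title, doc=7)"}],
}
new_explain = {
    "match": True, "value": 0.4,
    "description": "weight(title:foo in 12), product of:",
    "details": [{"match": True, "value": 0.3125,
                 "description": "fieldNorm(field=title, doc=12)"}],
}

print(differing_norms(old_explain, new_explain))
```

Pairing nodes by traversal order assumes both explain trees have the same shape, which will not hold if the two versions parse the query differently.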

Re: Cannot run Solr4 from Intellij Idea

2012-12-04 Thread Aaron Daubman
Interestingly, I have run into this same (or very similar) issue when attempting to run embedded solr. All of the solr.* classes that were recently moved to lucene would not work with the solr.* shorthand - I had to replace them with the full classpath. As you found, these shorthands in the same s

Re: Range Queries performing differently on SortableIntField vs TrieField of type integer

2012-12-04 Thread Aaron Daubman
I forgot a possibly important piece... Given the different Solr versions, the schema version (and its related different defaults) is also a change: Solr 1.4.1 Has: Solr 3.6.1 Has: > Solr 1.4.1 Relevant Schema Parts - Working as desired: > > >

Re: Range Queries performing differently on SortableIntField vs TrieField of type integer

2012-12-04 Thread Aaron Daubman
Hi Upayavira, One small question - did you re-index in-between? The index structure > will be different for each. > Yes, the Solr 1.4.1 (working) instance was built using the original schema and that solr version. The Solr 3.6.1 (not working) instance was re-built using the new schema and Solr 3.

Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-08 Thread Aaron Daubman
> (plus when I deploy, my deploy script > runs some actual simple test queries to ensure they return before enabling > the ping handler to return 200s) to avoid this problem. > What are you doing to programmatically disable/enable the ping handler? This sounds like exactly what I should be doing

Re: Preventing accepting queries while custom QueryComponent starts up?

2012-11-08 Thread Aaron Daubman
Nov 8, 2012 at 11:54 AM, Aaron Daubman wrote: > > > Greetings, > > > > I have several custom QueryComponents that have high one-time startup > costs > > (hashing things in the index, caching things from a RDBMS, etc...) > > > > Is there a way to prevent so

Preventing accepting queries while custom QueryComponent starts up?

2012-11-08 Thread Aaron Daubman
Greetings, I have several custom QueryComponents that have high one-time startup costs (hashing things in the index, caching things from a RDBMS, etc...) Is there a way to prevent solr from accepting connections before all QueryComponents are "ready"? Especially, since many of our instances are l
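The general pattern being asked about can be sketched as a readiness gate: a health check reports "not ready" until every slow-starting component has signalled that its warm-up is done. This is a generic illustration in plain Python, not a Solr API; the class name, the component names, and the 200/503 status codes are all assumptions.

```python
import threading


class ReadinessGate:
    """Tracks which components are still warming up (hypothetical helper)."""

    def __init__(self, component_names):
        self._pending = set(component_names)
        self._lock = threading.Lock()

    def mark_ready(self, name):
        # Called by a component once its one-time startup work has finished.
        with self._lock:
            self._pending.discard(name)

    def is_ready(self):
        with self._lock:
            return not self._pending

    def health_status(self):
        # What a health-check endpoint would return to whatever polls it.
        return 200 if self.is_ready() else 503


gate = ReadinessGate(["index_hashes", "rdbms_cache"])
print(gate.health_status())   # still warming up
gate.mark_ready("index_hashes")
gate.mark_ready("rdbms_cache")
print(gate.health_status())   # all components ready
```

Whatever sits in front of the server would need to poll this status and hold traffic until it flips; that wiring is outside the sketch.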

Re: Improving performance for use-case where large (200) number of phrase queries are used?

2012-10-24 Thread Aaron Daubman
Hi Peter, Thanks for the recommendation - I believe we are thinking along the same lines, but wanted to check to make sure. Are you suggesting something different than my #5 (below) or are we essentially suggesting the same thing? On Wed, Oct 24, 2012 at 1:20 PM, Peter Keegan wrote: > Could you

Re: Improving performance for use-case where large (200) number of phrase queries are used?

2012-10-24 Thread Aaron Daubman
Thanks for the ideas - some followup questions in-line below: > * use shingles e.g. to turn two-word phrases into single terms (how > long is your average phrase?). Would this be different than what I was calling "common grams"? (other than shingling every two words, rather than just common ones
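To make the shingle suggestion concrete, here is a small sketch of turning every adjacent pair of words into a single term, so that a two-word phrase query becomes a plain term lookup. In Solr this would normally be done at analysis time (for example with ShingleFilterFactory); the function name and the underscore separator below are assumptions for illustration only.

```python
def shingles(tokens, size=2, sep="_"):
    """Join every run of `size` adjacent tokens into one term."""
    return [sep.join(tokens[i:i + size])
            for i in range(len(tokens) - size + 1)]


print(shingles(["indie", "rock", "band"]))
# → ['indie_rock', 'rock_band']
```

Unlike common grams, which only pair up words from a chosen list of frequent terms, this pairs every adjacent word, which trades a larger term dictionary for cheaper phrase matching.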

Improving performance for use-case where large (200) number of phrase queries are used?

2012-10-24 Thread Aaron Daubman
Greetings, We have a solr instance in use that gets some perhaps atypical queries and suffers from poor (>2 second) QTimes. Documents (~2,350,000) in this instance are mainly comprised of various "descriptive fields", such as multi-word (phrase) tags - an average document contains 200-400 phrases

Why does SolrIndexSearcher.java enforce mutual exclusion of filter and filterList?

2012-10-21 Thread Aaron Daubman
Greetings, I'm wondering if somebody would please explain why SolrIndexSearcher.java enforces mutual exclusion of filter and filterList (e.g. see: https://github.com/apache/lucene-solr/blob/trunk/solr/core/src/java/org/apache/solr/search/SolrIndexSearcher.java#L2039 ) For a custom application we

ScorerDocQueue.java's downHeap showing up as frequent hotspot in profiling - ideas why?

2012-10-16 Thread Aaron Daubman
Greetings, In a recent batch of solr 3.6.1 slow response time queries the profiler highlighted downHeap (line 212) in ScorerDocQueue.java as averaging more than 60ms across the 16 calls I was looking at and showing it spiking up over 100ms - which, after looking at the code (two int comparisons?!?

Re: PriorityQueue:initialize consistently showing up as hot spot while profiling

2012-10-10 Thread Aaron Daubman
Hi Mikhail, On Fri, Oct 5, 2012 at 7:15 AM, Mikhail Khludnev wrote: > okay. huge rows value is no.1 way to kill Lucene. It's not possible, > absolutely. You need to rethink logic of your component. Check Solr's > FieldCollapsing code, IIRC it makes second search to achieve similar goal. > Also ch

Re: Why is SolrDispatchFilter using 90% of the Time?

2012-10-10 Thread Aaron Daubman
Hi Stijn, I have occasionally been seeing similar behavior when profiling one of our Solr 3.6.1 servers using the similar AppDynamics product. Did you ever hunt down what was causing this for you or get more info? (I haven't been able to rule out truncated or filtered call-graphs that don't show

Re: PriorityQueue:initialize consistently showing up as hot spot while profiling

2012-10-05 Thread Aaron Daubman
Fri, Oct 5, 2012 at 6:56 AM, Aaron Daubman wrote: > >> Greetings, >> >> I've been seeing this call chain come up fairly frequently when >> debugging longer-QTime queries under Solr 3.6.1 but have not been able >> to understand from the code what is re

PriorityQueue:initialize consistently showing up as hot spot while profiling

2012-10-04 Thread Aaron Daubman
Greetings, I've been seeing this call chain come up fairly frequently when debugging longer-QTime queries under Solr 3.6.1 but have not been able to understand from the code what is really going on - the call graph and code follow below. Would somebody please explain to me: 1) Why this would show

Re: Understanding fieldCache SUBREADER "insanity"

2012-10-01 Thread Aaron Daubman
Hi Yonik, I've been attempting to fix the SUBREADER insanity in our custom component, and have made perhaps some progress (or is this worse?) - I've gone from SUBREADER to VALUEMISMATCH insanity: ---snip--- entries_count : 12 entry#0 : 'MMapIndexInput(path="/io01/p/solr/playlist/c/playlist/index/

Solr Caching - how to tune, how much to increase, and any tips on using Solr with JDK7 and G1 GC?

2012-09-29 Thread Aaron Daubman
Greetings, I've recently moved to running some of our Solr (3.6.1) instances using JDK 7u7 with the G1 GC (playing with max pauses in the 20 to 100ms range). By and large, it has been working well (or, perhaps I should say that without requiring much tuning it works much better in general than my

Re: How to more gracefully handle field format exceptions?

2012-09-24 Thread Aaron Daubman
e error on the client, fix/clean/remove, and retry, no? > > Otis > -- > Search Analytics - http://sematext.com/search-analytics/index.html > Performance Monitoring - http://sematext.com/spm/index.html > > > On Mon, Sep 24, 2012 at 9:21 PM, Aaron Daubman wrote: >> Greet

How to more gracefully handle field format exceptions?

2012-09-24 Thread Aaron Daubman
Greetings, Is there a way to configure more graceful handling of field formatting exceptions when indexing documents? Currently, there is a field being generated in some documents that I am indexing that is supposed to be a float but sometimes slips through as an empty string. (I know, fix the d
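One client-side workaround, sketched under the assumption that documents are built as plain dictionaries before being sent for indexing: drop any float field whose value does not parse, so the rest of the document still gets indexed. The function and field names are hypothetical.

```python
def drop_unparseable_floats(doc, float_fields):
    """Return a copy of doc without float fields that fail to parse."""
    cleaned = dict(doc)
    for field in float_fields:
        if field not in cleaned:
            continue
        try:
            float(cleaned[field])
        except (TypeError, ValueError):
            # Empty strings, None and other junk land here.
            del cleaned[field]
    return cleaned


print(drop_unparseable_floats({"id": "a1", "hotness": ""}, ["hotness"]))
# → {'id': 'a1'}
```

Note that Python's float() accepts strings such as "nan" and "inf", so a stricter check may be wanted if those should also be rejected.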

Re: Understanding fieldCache SUBREADER "insanity"

2012-09-21 Thread Aaron Daubman
Yonik, et al. I believe I found the section of code pushing me into 'insanity' status: ---snip--- int[] collapseIDs = null; float[] hotnessValues = null; String[] artistIDs = null; try { collapseIDs = FieldCache.DEFAULT.getInts(searcher.getIndexReader(),

Re: Understanding fieldCache SUBREADER "insanity"

2012-09-19 Thread Aaron Daubman
Hi Tomás, > This probably means that you are using the same field for faceting and for > sorting (tf_normalizedTotalHotttnesss), sorting uses the segment level > cache and faceting uses by default the global field cache. This can be a > problem because the field is duplicated in cache, and then it

Understanding fieldCache SUBREADER "insanity"

2012-09-19 Thread Aaron Daubman
Hi all, In reviewing a solr instance with somewhat variable performance, I noticed that its fieldCache stats show an insanity_count of 1 with the insanity type SUBREADER: ---snip--- insanity_count : 1 insanity#0 : SUBREADER: Found caches for descendants of ReadOnlyDirectoryReader(segments_k _6h9(

Re: Solr request/response lifecycle and logging full response time

2012-09-06 Thread Aaron Daubman
.getEndTime() - st), (int) (System.currentTimeMillis() - st))); } } } Please advise if: - Flowcharts for any solr/lucene-related lifecycles exist - There is a better way of doing this Thanks, Aaron On Thu, Sep 6, 2012 at 9:16 PM, Aaron Daubman wrote: > Greetings, > > I'm looking t

Solr request/response lifecycle and logging full response time

2012-09-06 Thread Aaron Daubman
Greetings, I'm looking to add some additional logging to a solr 3.6.0 setup to allow us to determine actual time spent by Solr responding to a request. We have a custom QueryComponent that sometimes returns 1+ MB of data and while QTime is always on the order of ~100ms, the response time at the c

Re: Frustrating differences in fieldNorm between two different versions of solr indexing the same document

2012-07-19 Thread Aaron Daubman
Robert, So this is lossy: basically you can think of there being only 256 > possible values. So when you increased the number of terms only > slightly by changing your analysis, this happened to bump you over the > edge rounding you up to the next value. > > more information: > http://lucene.apach
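The "only 256 possible values" point can be illustrated with a small sketch. It assumes the classic length norm of 1/sqrt(numTerms) with no boosts, and a one-byte float encoding modeled on my understanding of Lucene's SmallFloat (keep the top bits of a float32 and offset the exponent). Treat the exact bit layout as an assumption; it may not match a given Lucene version.

```python
import math
import struct

ZERO_POINT = (63 - 15) << 3  # exponent offset used by the byte encoding


def float_to_byte315(value):
    """Squeeze a float into one byte by truncating its float32 bits."""
    bits = struct.unpack("<i", struct.pack("<f", value))[0]
    small = bits >> 21
    if small <= ZERO_POINT:
        return 0 if bits <= 0 else 1   # underflow
    if small >= ZERO_POINT + 0x100:
        return 255                     # overflow
    return small - ZERO_POINT


def byte315_to_float(b):
    """Expand the byte back into a float; the truncated bits are gone."""
    if b == 0:
        return 0.0
    bits = ((b & 0xFF) << 21) + ((63 - 15) << 24)
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def stored_norm(num_terms):
    """Length norm as it would come back after the one-byte round trip."""
    return byte315_to_float(float_to_byte315(1.0 / math.sqrt(num_terms)))


for n in (8, 9, 10, 11):
    print(n, stored_norm(n))
# 8, 9 and 10 terms all come back as 0.3125; 11 terms comes back as 0.25
```

Under these assumptions, an analysis change that adds a single token to a ten-term field is enough to move the stored norm from 0.3125 to 0.25, while going from eight to ten terms changes nothing.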

Re: Frustrating differences in fieldNorm between two different versions of solr indexing the same document

2012-07-19 Thread Aaron Daubman
Robert, > I have a solr 1.4.1 instance and a solr 3.6.0 instance, both configured as > > identically as possible (given deprecations) and indexing the same > document. > > Why did you do this? If you want the exact same scoring, use the exact > same analysis. > This means specifying luceneMatchVer

Frustrating differences in fieldNorm between two different versions of solr indexing the same document

2012-07-18 Thread Aaron Daubman
Greetings, I've been digging into this for two days now and have come up short - hopefully there is some simple answer I am just not seeing: I have a solr 1.4.1 instance and a solr 3.6.0 instance, both configured as identically as possible (given deprecations) and indexing the same document. Fo

Debugging jetty IllegalStateException errors?

2012-07-04 Thread Aaron Daubman
Greetings, I'm wondering if anybody has experienced (and found the root cause of) errors like this. We're running Solr 3.6.0 with latest stable Jetty 7 (7.6.4.v20120524). I know this is likely due to a client (or the server) terminating the connection unexpectedly, but we see these fairly frequently

Re: Correct way to deal with source data that may include a multivalued field that needs to be used for sorting?

2012-06-11 Thread Aaron Daubman
While I look into doing some refactoring, as well as creating some new UpdateRequestProcessors (and/or backporting), would you please point me to some reading material on why you say the following: In this day and age, a custom update handler is almost never the right > answer to a problem -- nor

Re: Correct way to deal with source data that may include a multivalued field that needs to be used for sorting?

2012-06-10 Thread Aaron Daubman
Hoss, The new FieldValueSubsetUpdateProcessorFactory classes look phenomenal. I haven't looked yet, but what are the chances these will be back-ported to 3.6 (or how hard would it be to backport them?)... I'll have to check out the source in more detail. If stuck on 3.6, what would be the best wa

Re: What would cause: "SEVERE: java.lang.ClassCastException: com.company.MyCustomTokenizerFactory cannot be cast to org.apache.solr.analysis.TokenizerFactory"

2012-06-10 Thread Aaron Daubman
ry > and TokenizerFactory. > > -- Jack Krupansky > > -Original Message- From: Aaron Daubman > Sent: Saturday, June 09, 2012 12:03 AM > To: solr-user@lucene.apache.org > Subject: What would cause: "SEVERE: java.lang.ClassCastException: > com.company.*

Re: What would cause: "SEVERE: java.lang.ClassCastException: com.company.MyCustomTokenizerFactory cannot be cast to org.apache.solr.analysis.TokenizerFactory"

2012-06-08 Thread Aaron Daubman
--- ---snip--- On Sat, Jun 9, 2012 at 12:03 AM, Aaron Daubman wrote: > Greetings, > > I am in the process of updating custom code and schema from Solr 1.4 to > 3.6.0 and have run into the following issue with our two custom Tokenizer > and Token Filter components

What would cause: "SEVERE: java.lang.ClassCastException: com.company.MyCustomTokenizerFactory cannot be cast to org.apache.solr.analysis.TokenizerFactory"

2012-06-08 Thread Aaron Daubman
Greetings, I am in the process of updating custom code and schema from Solr 1.4 to 3.6.0 and have run into the following issue with our two custom Tokenizer and Token Filter components. I've been banging my head against this one for far too long, especially since it must be something obvious I'm

Re: Correct way to deal with source data that may include a multivalued field that needs to be used for sorting?

2012-06-05 Thread Aaron Daubman
Thanks for the responses, By saying "dirty data" you imply that only one of the values is "good" or > "clean" and that the others can be safely discarded/ignored, as opposed to > true multi-valued data where each value is there for good reason and needs > to be preserved. In any case, how do you k

Correct way to deal with source data that may include a multivalued field that needs to be used for sorting?

2012-06-04 Thread Aaron Daubman
Greetings, I have "dirty" source data where some documents being indexed, although unlikely, may contain multivalued fields that are also required for sorting. In previous versions of Solr, sorting on this field worked fine (possibly because few or no multivalued fields were ever encountered?), ho

Re: Tips on creating a custom QueryCache?

2012-05-30 Thread Aaron Daubman
Hoss, : 1) Any recommendations on which best to sub-class? I'm guessing, for this > : scenario with "rare" batch puts and no evictions, I'd be looking for get > : performance. This will also be on a box with many CPUs - so I wonder if > the > : older LRUCache would be preferable? > > i suspect yo

Example setup of using Solr 3.6.0 with Jetty 7 (7.6.3)?

2012-05-29 Thread Aaron Daubman
Greetings, Has anybody gotten Solr 3.6.0 to work well with Jetty 7.6.3, and if so, would you mind sharing your config files / directory structure / other useful details? Thanks, Aaron

Generating maven artifacts for 3.6.0 build - correct -Dversion to use?

2012-05-25 Thread Aaron Daubman
Greetings, Following the directions here: http://svn.apache.org/repos/asf/lucene/dev/trunk/dev-tools/maven/README.maven for building Lucene/Solr with Maven, what is the correct -Dversion to pass in to get-maven-poms? This seems set up for building -SNAPSHOT; however, I would like to use maven to

Re: Tips on creating a custom QueryCache?

2012-05-24 Thread Aaron Daubman
Hoss, brilliant as always - many thanks! =) Subclassing the SolrCache class sounds like a good way to accomplish this. Some questions: 1) Any recommendations on which best to sub-class? I'm guessing, for this scenario with "rare" batch puts and no evictions, I'd be looking for get performance. Th

Re: Tips on creating a custom QueryCache?

2012-05-24 Thread Aaron Daubman
hat's run before the usual > QueryComponent? > This component would be responsible for loading queries, executing them, > caching results, and for returning those results when these queries are > encountered later on. > > Otis > > >________ >

Tips on creating a custom QueryCache?

2012-05-23 Thread Aaron Daubman
Greetings, I'm looking for pointers on where to start when creating a custom QueryCache. Our usage patterns are possibly a bit unique, so let me explain the desired use case: Our Solr index is read-only except for dedicated periods where it is updated and re-optimized. On startup, I would like t