Optimizing integer primary key lookup speed: optimal FieldType and Codec?

2019-06-17 Thread Gregg Donovan
nt fields, but this does not appear to be due to any inherent limitation in the Points field. At least, that's what I infer from the JIRA <https://issues.apache.org/jira/browse/SOLR-11162>. Are the Point fields the right choice for fast 32-bit int ID lookups? Thanks! Gregg

1969 vs 1960s: not-quite-synonyms in Solr

2019-03-06 Thread Gregg Donovan
th eDisMax without adding new fields to the index. Thanks. Gregg

Re: Query of Death Lucene/Solr 7.6

2019-02-22 Thread Gregg Donovan
FWIW: we have also seen serious Query of Death issues after our upgrade to Solr 7.6. Are there any open issues we can watch? Are Markus' findings around `pf` our best guess? We've seen these issues even with ps=0. We also use the WDF. On Fri, Feb 22, 2019 at 8:58 AM Markus Jelsma wrote: > Hello M

Compression for solrbin?

2015-11-13 Thread Gregg Donovan
version[2] and compressing/decompressing inside of JavaBinCodec#marshal[3] and JavaBinCodec#unmarshal[4] would allow us to retain backwards compatibility with older clients or existing files. Thoughts? --Gregg [1] http://cyan4973.github.io/lz4/#tab-2 [2] https://github.com/apache/lucene-solr/blob/

ShardHandler semantics

2015-04-02 Thread Gregg Donovan
ts independently, each with its own completion service and pending queue. Does that sound right? Thanks! --Gregg

Re: How To Interrupt Solr Query Execution

2015-03-20 Thread Gregg Donovan
SOLR-5986 looks like a great enhancement for enforcing timeouts. I'm curious about how to handle *manual* cancellation. We're working on backup requests -- e.g. wait till 90% of shards have responded then send out a backup request for the lagging (e.g. GC, cache miss, overloaded, etc.) shards afte
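
The backup-request pattern described above can be sketched roughly as follows. This is only an illustration in plain Python, not Solr's ShardHandler API: `query_with_backups`, `fetch`, and the `quorum` parameter are hypothetical names, and manual cancellation of the lagging request (the open question in this thread) is deliberately left out.

```python
from concurrent.futures import ThreadPoolExecutor, wait, FIRST_COMPLETED


def query_with_backups(shards, fetch, quorum=0.9):
    """Query every shard; once `quorum` of them have answered, send one
    backup request to each shard that is still lagging.

    `fetch(shard, attempt)` is a hypothetical callable that performs the
    shard request and returns its result.
    """
    results = {}
    # Room for one primary and one backup request per shard.
    with ThreadPoolExecutor(max_workers=2 * len(shards)) as pool:
        pending = {pool.submit(fetch, s, "primary"): s for s in shards}
        backups_sent = False
        while len(results) < len(shards):
            done, _ = wait(pending, return_when=FIRST_COMPLETED)
            for fut in done:
                shard = pending.pop(fut)
                # The first answer for a shard wins; later ones are ignored.
                results.setdefault(shard, fut.result())
            if not backups_sent and len(results) >= quorum * len(shards):
                backups_sent = True
                for s in set(shards) - set(results):
                    pending[pool.submit(fetch, s, "backup")] = s
    # Note: leaving the `with` block still waits for the slow primary
    # requests to finish; nothing here cancels them.
    return results
```

The sketch shows why cancellation matters: the losing request keeps running and occupying a worker until it completes on its own.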

Re: Enforcing a hard timeout on shard requests?

2014-06-02 Thread Gregg Donovan
0, 2014 at 6:09 PM, Jason Hellman < jhell...@innoventsolutions.com> wrote: > Gregg, > > I don’t have an answer to your question but I’m very curious what use case > you have that permits such arbitrary partial-results. Is it just an edge > case or do you want to permit a common

Enforcing a hard timeout on shard requests?

2014-05-30 Thread Gregg Donovan
Is there a way to enforce this timeout without failing the request entirely? I'd still like to get as many shards to return in 120ms as I can, even if they have partialResults. Thanks. --Gregg

How to optimize a DisMax of multiple cachable queries?

2014-04-25 Thread Gregg Donovan
dexSearcher and then merging the DocLists manually. My concern is that this would lose the query normalization that happens in DisjunctionMaxQuery. This seems like a common problem: how to cache parts of a complex Solr query individually. Any ideas or common patterns for solving it? Thanks. --Gregg

Re: Estimating RAM usage of SolrCache instances?

2014-04-16 Thread Gregg Donovan
is about "average size of a filter query" + maxdoc/8 > document cache is about "average size of the stored fields in bytes" * > size. > > HTH, > Erick > > On Mon, Apr 14, 2014 at 5:17 PM, Gregg Donovan wrote: > > We'd like to graph the approximate RA
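
As a rough worked example of the rule of thumb quoted above: the function names and sample numbers below are made up for illustration, and reading the first formula as a per-entry cost that is multiplied by the number of cached filters is an assumption, since the start of the quoted message is cut off.

```python
def estimate_filter_cache_bytes(entries, max_doc, avg_query_bytes):
    """Assumed per-entry cost: the cached query plus a bitset of max_doc bits."""
    return entries * (avg_query_bytes + max_doc // 8)


def estimate_document_cache_bytes(size, avg_stored_bytes):
    """Average size of a document's stored fields times the cache size."""
    return size * avg_stored_bytes


# Hypothetical numbers: 512 cached filters over an 8M-document index,
# and a documentCache of 1024 entries with 2 KB of stored fields each.
filter_bytes = estimate_filter_cache_bytes(512, 8_000_000, 200)
document_bytes = estimate_document_cache_bytes(1024, 2048)
print(filter_bytes, document_bytes)
```

These are upper-bound style estimates only; they do not account for map overhead or for filters stored in a more compact form.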

Estimating RAM usage of SolrCache instances?

2014-04-14 Thread Gregg Donovan
te. I assume this is due to an issue with how the variably-sized backing maps were calculated, but I'm not sure. Any ideas for how to get an accurate RAM estimation for SolrCache objects? --Gregg [1] https://gist.github.com/greggdonovan/10682810

Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-07 Thread Gregg Donovan
t seems to get the job done - for me - it's not > a reusable component, but might serve as an illustration of one way to > handle the problem > > -Mike > > > On 04/07/2014 12:23 PM, Gregg Donovan wrote: > >> That was my first attempt, but it's much trickier tha

Re: Fetching uniqueKey and other int quickly from documentCache?

2014-04-07 Thread Gregg Donovan
Mar 3, 2014 at 11:14 AM, Gregg Donovan wrote: > > Yonik, > > > > That's a very clever idea. Unfortunately, I think that will skip the > > distributed query optimization we were hoping to take advantage of in > > SOLR-1880 [1], but it should work with the proposed

Re: Distributed tracing for Solr via adding HTTP headers?

2014-04-07 Thread Gregg Donovan
; > > > Regards, > >Alex. > > Personal website: http://www.outerthoughts.com/ > > Current project: http://www.solr-start.com/ - Accelerating your Solr > proficiency > > > > > > On Sat, Apr 5, 2014 at 3:16 AM, Gregg Donovan > wrote: > >>

Distributed tracing for Solr via adding HTTP headers?

2014-04-04 Thread Gregg Donovan
"Solr-" prefix. On the server, we look for those headers and add them to the SLF4J MDC[3]. Here's a patch [4] that does this that we're testing out. Is this a good idea? Would anyone else find this useful? If so, I'll open a ticket. --Gregg [1] http://logging.apache.org/log

Re: Fetching uniqueKey and other int quickly from documentCache?

2014-03-03 Thread Gregg Donovan
Yonik, That's a very clever idea. Unfortunately, I think that will skip the distributed query optimization we were hoping to take advantage of in SOLR-1880 [1], but it should work with the proposed distrib.singlePass optimization in SOLR-5768 [2]. Does that sound right? --Gregg [1]

Re: SolrCloud: heartbeat succeeding while node has failing SSD?

2014-03-03 Thread Gregg Donovan
some local readings and depending on the results, pulls itself out > of the mix as best it can (remove itself from clusterstate.json or simply > closes its zk connection). > > - Mark > > http://about.me/markrmiller > > On Mar 2, 2014, at 3:42 PM, Gregg Donovan wrote:

SolrCloud: heartbeat succeeding while node has failing SSD?

2014-03-02 Thread Gregg Donovan
he disk is checked as part of the heartbeat and/or we verify that it can serve queries. Any pointers would be appreciated. Thanks! --Gregg

Fetching uniqueKey and other int quickly from documentCache?

2014-02-24 Thread Gregg Donovan
've used a custom SolrCache maintaining that mapping to quickly filter over personalized collections. Maybe the uniqueKey should be more optimized out of the box? Perhaps a custom "uniqueKey" codec that also maintained the docId->uniqueKey mapping in memory? --Gregg [1] http://search-lucene.com/m/oCUKJ1heHUU1

Re: DistributedSearch: Skipping STAGE_GET_FIELDS?

2014-02-24 Thread Gregg Donovan
Thank you Shalin and Yonik! Both SOLR-1880 and SOLR-5768 will be very helpful for our distributed search performance. On Mon, Feb 24, 2014 at 5:02 AM, Shalin Shekhar Mangar < shalinman...@gmail.co

DistributedSearch: Skipping STAGE_GET_FIELDS?

2014-02-23 Thread Gregg Donovan
n the pipeline. Is this possible out-of-the-box? If not, how would you recommend implementing it? Thanks! --Gregg

Caching Solr boost functions?

2014-02-18 Thread Gregg Donovan
unction so that one could do: boost=cache(expensiveFunctionQuery()) Thanks. --Gregg

Re: Consistent relevance tie-breaking across clusters?

2013-03-02 Thread Gregg Donovan
tureUpdateProcessorFactory unless there's a case to be made for the usefulness of an UpdateProcessor that only operates on uniqueKey. Thanks for the feedback! --Gregg On Sat, Mar 2, 2013 at 8:21 PM, Chris Hostetter wrote: > : bq: we don't want to use either the primary key or the record

Re: SolrCore#getIndexDir() contract change between 3.6 and 4.1?

2013-02-07 Thread Gregg Donovan
.. Anyway, I'll follow up in JIRA. --Gregg [1] https://issues.apache.org/jira/browse/SOLR-4413 On Wed, Feb 6, 2013 at 8:42 PM, Mark Miller wrote: > Thanks Gregg - can you file a JIRA issue? > > - Mark > > On Feb 6, 2013, at 5:57 PM, Gregg Donovan wrote: > > > M

Re: SolrCore#getIndexDir() contract change between 3.6 and 4.1?

2013-02-06 Thread Gregg Donovan
getIndexDir() method to know the active index directory"* is the behavior that we were reliant on. Since it's now hardcoded to dataDir + "index/", it doesn't always return the active index directory. --Gregg On Wed, Feb 6, 2013 at 5:13 PM, Mark Miller wrote: > >

SolrCore#getIndexDir() contract change between 3.6 and 4.1?

2013-02-06 Thread Gregg Donovan
ed into the older code paths. I can certainly appreciate that it's tough to make the changes needed for SolrCloud while maintaining perfect compatibility in pre-Cloud code paths. Would restoring the previous contract of SolrCore#getIndexDir() break anything in SolrCloud? Thanks! --Gregg G

Re: replicateOnStartup not finding commits after SOLR-3911?

2013-01-29 Thread Gregg Donovan
Thanks, Mark -- that fixed the issue for us. I created https://issues.apache.org/jira/browse/SOLR-4380 to track it. On Tue, Jan 29, 2013 at 4:06 PM, Mark Miller wrote: > > On Jan 29, 2013, at 3:50 PM, Gregg Donovan wrote: > >> should we >> just try uncommenting that line

replicateOnStartup not finding commits after SOLR-3911?

2013-01-29 Thread Gregg Donovan
ry uncommenting that line in ReplicationHandler? Thanks! --Gregg Gregg Donovan Senior Software Engineer, Etsy.com gr...@etsy.com [1] https://issues.apache.org/jira/browse/SOLR-3911 https://issues.apache.org/jira/secure/attachment/12548596/SOLR-3911.patch [2] http://svn.apache.org/viewvc/luce

PK uniqueness aware Solr index merging?

2013-01-24 Thread Gregg Donovan
nner than re-adding all of the documents in each directory to a new Solr index to avoid PK duplicates? Thanks. --Gregg Gregg Donovan Senior Software Engineer, Etsy.com gr...@etsy.com

Re: Solr 4.0 SnapPuller version vs. generation issue

2013-01-10 Thread Gregg Donovan
ration || forceReplication; and that fixed our post-reindexing HTTP replication issues. But I'm not sure if that check works for all of the cases that SnapPuller is designed for. --Gregg On Thu, Jan 10, 2013 at 4:28 PM, Mark Miller wrote: > > On Jan 10, 2013, at 4:11 PM, Gregg Donova

Solr 4.0 SnapPuller version vs. generation issue

2013-01-10 Thread Gregg Donovan
napPuller.java?r1=1144761&r2=1235888&pathrev=1235888&diff_format=h Thanks! --Gregg Gregg Donovan Senior Software Engineer, Etsy.com gr...@etsy.com

Re: Is FileFloatSource's WeakHashMap cache only cleaned by GC?

2012-06-06 Thread Gregg Donovan
Thanks for the suggestion, Erick. I created a JIRA and moved the patch to SVN, just to be safe. [1] --Gregg [1] https://issues.apache.org/jira/browse/SOLR-3514 On Wed, Jun 6, 2012 at 2:35 PM, Erick Erickson wrote: > > Hmmm, it would be better to open a Solr JIRA and attach this as a

Is FileFloatSource's WeakHashMap cache only cleaned by GC?

2012-06-05 Thread Gregg Donovan
deas? Overall, what do you think? Does relying on GC to clean this cache make sense as a possible cause of GC spikiness? If so, does the patch [3] look like a decent approach? Thanks! --Gregg [1] https://github.com/apache/lucene-solr/blob/a3914cb5c0243913b827762db2d616ad7cc6801d/solr/core/src/jav

Good time for an upgrade to Solr/Lucene trunk?

2011-06-21 Thread Gregg Donovan
ut to land on trunk that are worth waiting a few weeks for? Thanks for the guidance! --Gregg Gregg Donovan Technical Lead, Search, Etsy.com gr...@etsy.com

Sorting and filtering on fluctuating multi-currency price data?

2010-10-20 Thread Gregg Donovan
SolrIndexReader, but could be per-segment. Perhaps a custom poly-field could accomplish something like this? Has anyone dealt with this sort of problem? Do any of these approaches sound more or less reasonable? Are we missing anything? Thanks for the help! Gregg Donovan Technical Lead, Search Etsy.com

Re: Help on spelling.

2010-09-09 Thread Gregg Hoshovsky
> -----Original message- > From: Gregg Hoshovsky > Sent: Thu 09-09-2010 22:40 > To: solr-user@lucene.apache.org; > Subject: Help on spelling. > > I am trying to use the spellchecker but cannot get past the point of having > the spelling possibilities returned.

Help on spelling.

2010-09-09 Thread Gregg Hoshovsky
=2.2&start=0&rows=10&indent=on&wt=json I would expect that this would have returned some spelling suggestions (such as wedge) but I don’t get anything besides: { "responseHeader":{ "status":0, "QTime":1}, "response":{"numFound":0,"start":0,"docs":[] }} Any help is appreciated. Gregg

Re: Solr and NLP

2010-07-02 Thread Gregg Hoshovsky
ease share your findings. I will have to venture down this path someday myself. Gregg On 7/2/10 8:15 AM, "Moazzam Khan" wrote: Hi guys, Is there a way I can make Solr work with an NLP application? Are there any NLP applications that will work with Solr? Can someone please point

Highlight question

2010-06-23 Thread Gregg Hoshovsky
tterns but they didn't do anything. Here is a snippet of the config file. Any help is appreciated. Gregg 4 70 0.2 [-\w ,/\n\"']{1,1} 4 100

Re: DIH field options

2010-03-13 Thread Gregg Hoshovsky
You can do this in MySQL: select *, 'staticdata' as staticdata from table x. As long as your field name is staticdata, this should add it there. On 3/12/10 8:39 AM, "Tommy Chheng" wrote: Haven't tried this myself but try adding a default value and don't specify it during the import. http://wiki.ap

Re: How to return filtered tokens as query results?

2010-02-05 Thread Gregg Horan
On Fri, Feb 5, 2010 at 2:31 AM, Ahmet Arslan wrote: > > > Is there a way to return Solr's > > analyzed/filtered tokens from a query, > > rather than the original indexed data? (Ideally at a > > fairly high level like > > solrj). > > TermVectorComponent [1] can do that. > > [1]http://wiki.apache.

How to return filtered tokens as query results?

2010-02-04 Thread Gregg Horan
Is there a way to return Solr's analyzed/filtered tokens from a query, rather than the original indexed data? (Ideally at a fairly high level like solrj). Thanks

Re: Getting solr response data in a JS query

2010-01-11 Thread Gregg Hoshovsky
You might be running into an Ajax restriction. See if an article like this helps. http://www.nathanm.com/ajax-bypassing-xmlhttprequest-cross-domain-restriction/ On 1/9/10 11:37 PM, "Otis Gospodnetic" wrote: Dan, You didn't mention whether you tried &wt=json . Does it work if you use that

Re: solrj query size limit?

2009-11-03 Thread Gregg Horan
That was it. Didn't see that optional parameter - the POST works. Thanks! On Nov 3, 2009, at 1:57 AM, Avlesh Singh wrote: Did you hit the limit for maximum number of characters in a GET request? Cheers Avlesh On Tue, Nov 3, 2009 at 9:36 AM, Gregg Horan wrote: I'm cons

solrj query size limit?

2009-11-02 Thread Gregg Horan
chars. I do have the maxBooleanClauses jacked up to 2048. Using javabin. 1.4-dev. Are there any other options or settings I might be overlooking? -Gregg

Re: Difficulty with Multi-Word Synonyms

2009-09-17 Thread Gregg Donovan
n of SynonymFilter that also implemented incrementToken() be helpful? --Gregg On Thu, Sep 17, 2009 at 7:38 PM, Yonik Seeley wrote: > On Thu, Sep 17, 2009 at 6:29 PM, Lance Norskog wrote: > > Please add a Jira issue for this. It will get more attention there. > > > > BTW, t

Difficulty with Multi-Word Synonyms

2009-09-14 Thread Gregg Donovan
use ["e","e"] is the value of the token stream     assertEquals(Arrays.asList("a","e"), tokens);   } } Any help would be much appreciated. Thanks. --Gregg

Re: 1.4 Replication

2009-05-27 Thread Matthew Gregg
Bug filed. Thank you. On Wed, 2009-05-27 at 22:40 +0530, Shalin Shekhar Mangar wrote: > On Wed, May 27, 2009 at 9:01 PM, Matthew Gregg wrote: > > > That is disappointing then. Restricting by IP may be doable, but much > > more work than basic auth. > > > > >

Re: 1.4 Replication

2009-05-27 Thread Matthew Gregg
That is disappointing then. Restricting by IP may be doable, but much more work than basic auth. On Wed, 2009-05-27 at 20:41 +0530, Noble Paul നോബിള്‍ नोब्ळ् wrote: > replication has no builtin security > > > > On Wed, May 27, 2009 at 8:37 PM, Matthew Gregg > wrote: >

Re: 1.4 Replication

2009-05-27 Thread Matthew Gregg
which i feel may not need to be protected > > The other API's methods can have security . say dnappull, diableSnapPoll etc > > > > On Wed, May 27, 2009 at 7:47 PM, Matthew Gregg > wrote: > > On Wed, 2009-05-27 at 19:06 +0530, Noble Paul നോബിള്‍ नोब्ळ् wrote: >

Re: 1.4 Replication

2009-05-27 Thread Matthew Gregg
On Wed, 2009-05-27 at 19:06 +0530, Noble Paul നോബിള്‍ नोब्ळ् wrote: > On Wed, May 27, 2009 at 6:48 PM, Matthew Gregg > wrote: > > Does replication in 1.4 support passing credentials/basic auth? If not > > what is the best option to protect replication? > do you me

1.4 Replication

2009-05-27 Thread Matthew Gregg
Does replication in 1.4 support passing credentials/basic auth? If not what is the best option to protect replication?

Re: How to handle database replication delay when using DataImportHandler?

2009-01-29 Thread Gregg Donovan
se them to my custom function at query time? Thanks. --Gregg On Wed, Jan 28, 2009 at 11:20 PM, Noble Paul നോബിള്‍ नोब्ळ् < noble.p...@gmail.com> wrote: > The problem you are trying to solve is that you cannot use > ${dataimporter.last_index_time} as is. you may need something like

How to handle database replication delay when using DataImportHandler?

2009-01-28 Thread Gregg
o delta-imports. Has anyone run into this? In our non-DIH indexing system we get around this by either using the slave DB's seconds-behind-master or the max last update time of the records returned. Thanks. Gregg
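
The two workarounds mentioned above might look something like this. It is a sketch of one possible reading, not DataImportHandler code: the function names, the `updated_at` column, and the `margin_seconds` safety margin are all hypothetical.

```python
from datetime import datetime, timedelta


def safe_delta_cutoff(last_index_time, seconds_behind_master, margin_seconds=0):
    """Move the delta-import cutoff back by the replica's reported lag, so
    rows that had not yet replicated at the last import are picked up."""
    lag = timedelta(seconds=seconds_behind_master + margin_seconds)
    return last_index_time - lag


def next_cutoff_from_rows(rows, previous_cutoff):
    """Alternative: use the newest update time actually seen in the rows
    returned, instead of the wall-clock time of the import."""
    return max((row["updated_at"] for row in rows), default=previous_cutoff)


cutoff = safe_delta_cutoff(datetime(2009, 1, 28, 12, 0, 0), seconds_behind_master=30)
print(cutoff.isoformat())
```

Either way, some rows may be re-imported; that is harmless as long as documents are keyed on a uniqueKey and simply overwritten.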

Re: SolrUpdateServlet Warning

2008-09-23 Thread Gregg
This turned out to be a fairly pedestrian bug on my part: I had "/update" appended to the Solr base URL when I was adding docs via SolrJ. Thanks for the help. --Gregg On Tue, Sep 23, 2008 at 12:42 PM, Ryan McKinley <[EMAIL PROTECTED]> wrote: > > On Sep 23, 2008, at

SolrUpdateServlet Warning

2008-09-23 Thread Gregg
te rather then use this servlet. Add: to your solrconfig.xml I have an update handler configured in solrconfig.xml as follows: What's the preferred solution? Should I comment out the SolrUpdateServlet in solr's web.xml? My Solr server is running at /solr, if that helps. Thanks. Gregg