Possible bug report—ICU Tokenizer: letter-space-number-letter tokenized inconsistently

2021-01-22 Thread Trey Jones
for a few years, and looking into it when it cropped up again recently I realized it is probably an upstream problem, so I wanted to open an issue for Lucene. Is this a known issue, or should I create a new ticket? Thanks! —Trey Trey Jones Sr. Computational Linguist, Search Platform Wikimedia

Re: [EXTERNAL] Getting rid of Master/Slave nomenclature in Solr

2020-06-18 Thread Trey Grainger
g" vs. "Unmanaged Clustering" Mode Alt F: "Managed Clustering" vs. "Manual Clustering" Mode ? I think I prefer option F. Trey Grainger Founder, Searchkernel https://searchkernel.com On Thu, Jun 18, 2020 at 5:59 PM Jan Høydahl wrote: > I support Mike Dro

Re: Getting rid of Master/Slave nomenclature in Solr

2020-06-17 Thread Trey Grainger
w terminology to clearly distinguish between modes. Regardless of the naming decided on, I'm in support of removing the master/slave nomenclature. Trey Grainger Founder, Searchkernel https://searchkernel.com On Wed, Jun 17, 2020 at 7:00 PM Shawn Heisey wrote: > On 6/17/2020 2:36 PM, Trey Grainge

Re: Getting rid of Master/Slave nomenclature in Solr

2020-06-17 Thread Trey Grainger
Sorry: > > but I maintain that leader vs. follower behavior is inconsistent here. Sorry, that should have said "I maintain that leader vs. follower behavior is consistent here." Trey Grainger Founder, Searchkernel https://searchkernel.com On Wed, Jun 17, 2020 at 6:03 PM Trey

Re: Getting rid of Master/Slave nomenclature in Solr

2020-06-17 Thread Trey Grainger
is true that Standalone mode does not currently have support for two of the replica TYPES that SolrCloud mode does, but I maintain that leader vs. follower behavior is inconsistent here. Trey Grainger Founder, Searchkernel https://searchkernel.com On Wed, Jun 17, 2020 at 5:41 PM Walter Underwood w

Re: Getting rid of Master/Slave nomenclature in Solr

2020-06-17 Thread Trey Grainger
and followers, whereas in standalone mode you have to manage them manually (as is the case with most things in SolrCloud vs. Standalone). My view is that having an entirely different set of terminology describing the same thing is way more cognitive overhead than having consistent terminology. Trey Grain

Re: Getting rid of Master/Slave nomenclature in Solr

2020-06-17 Thread Trey Grainger
eady specific and well established meaning of "replica" within Solr. All the Best, Trey Grainger Founder, Searchkernel https://searchkernel.com On Wed, Jun 17, 2020 at 3:38 PM Anshum Gupta wrote: > Hi everyone, > > Moving a conversation that was happening on the PMC list t

[PSA] Activate 2019 Call for Speakers ends May 8

2019-05-04 Thread Trey Grainger
Just wanted to make sure everyone in the development and user community here was aware of the conference and didn't miss the opportunity to submit a talk by Wednesday if interested. All the best, Trey Grainger Chief Algorithms Officer @ Lucidworks https://www.linkedin.com/in/treygrainger/

Re: IRA or IRA the Person

2019-04-01 Thread Trey Grainger
s: https://www.slideshare.net/treygrainger/how-to-build-a-semantic-search-system All the best, Trey Grainger Chief Algorithms Officer @ Lucidworks On Mon, Apr 1, 2019 at 11:45 AM Moyer, Brett wrote: > Hello, > > Looking for ideas on how to determine intent and drive r

Re: Disabling XmlQParserPlugin through solrconfig

2017-10-12 Thread Trey Grainger
This way, the xml query parser is loaded in as a version of the eDismax query parser instead, and any queries that are trying to reference the xml query parser through local params will instead hit the eDismax query parser and use its parsing logic instead. All the best, Trey Grainger SVP

Re: Semantic Knowledge Graph

2017-10-09 Thread Trey Grainger
Hi David, that's my fault. I need to do a final proofread through them before they get posted (and may have to push one quick code change, as well). I'll try to get that done within the next few days. All the best, Trey Grainger SVP of Engineering @ Lucidworks Co-author, Solr in Ac

RE: Solr 6.4 - Transient core loading is extremely slow with HDFS and S3

2017-04-12 Thread Cahill, Trey
he best way to access S3 right now. Finally, remember that S3 is a service; if the S3 service is slow (for example due to a heavy stream of requests), then your operations with S3 will also be slow. Hope this helps and good luck, Trey -Original Message- From: Amarnath palavalli [ma

RE: Solr with HDFS on AWS S3 - Server restart fails to load the core

2017-04-07 Thread Cahill, Trey
://cwiki.apache.org/confluence/display/solr/CoreAdmin+API#CoreAdminAPI-RELOAD). Best of luck, Trey From: Amarnath palavalli [mailto:pamarn...@gmail.com] Sent: Friday, April 07, 2017 3:20 PM To: solr-user@lucene.apache.org Subject: Solr with HDFS on AWS S3 - Server restart fails to load the core

RE: reset version number

2017-01-11 Thread Cahill, Trey
What are you trying to accomplish by resetting the version number? -Original Message- From: Kris Musshorn [mailto:mussho...@comcast.net] Sent: Tuesday, January 10, 2017 9:31 PM To: solr-user@lucene.apache.org Subject: RE: reset version number Obviously deleting and rebuilding the core wi

Re: "on deck" searcher vs warming searcher

2016-12-09 Thread Trey Grainger
r was the next one to take off), and was later used heavily in baseball (the "on deck" batter was the one warming up to go next) and probably elsewhere. I've always understood the "on deck" searcher(s) being the same as the warming searcher(s). So you have the "active

Re: Related Search

2016-10-26 Thread Trey Grainger
dler right now), but once it's finished that could certainly be done. Just wanted to mention it as another approach to solve this specific problem. -Trey Grainger SVP of Engineering @ Lucidworks Co-author, Solr in Action On Wed, Oct 26, 2016 at 1:59 PM, Markus Jelsma wrote: > Indeed,

Re: Hackday next month

2016-09-21 Thread Trey Grainger
I know a bunch of folks who would be likely to attend the hackday (including committers) will have some other meetings on Wednesday before the conference, so I think that Tuesday is actually a pretty good time to have this. My 2 cents, Trey Grainger SVP of Engineering @ Lucidworks Co-author, Solr

RE: Integration Solr Cloudera with squirrel-sql

2016-08-26 Thread Cahill, Trey
Hardika, Parallel SQL and the accompanying JDBC connector only became available in Solr 6.x. Since Cloudera's Solr is only at 4.10, it will not have this feature. Trey -Original Message- From: Hardika Catur S [mailto:hardika.sa...@solusi247.com.INVALID] Sent: Friday, August 26,

Re: [ANN] Relevant Search by Manning out! (Thanks Solr community!)

2016-06-21 Thread Trey Grainger
anyone on the mailing list who is contemplating buying it), it is a REALLY great book that will teach you the ins and outs of how search relevancy works under the covers and how you can manipulate and improve it. It's very well-written, and definitely worth the read. Congrats again, guys. Trey Gra

Re: Lucene Revolution ?

2015-10-18 Thread Trey Grainger
JSON faceting API and the enhanced analytical capabilities therein. Once again, several other talks on faceting and analytics, but there was quite a strong committer focus on that topic. Definitely worth checking out the slides and videos when they are posted - lots of really good material all arou

Re: are there any SolrCloud supervisors?

2015-10-12 Thread Trey Grainger
I'd be very interested in taking a look if you post the code. Trey Grainger Co-Author, Solr in Action Director of Engineering, Search & Recommendations @ CareerBuilder On Fri, Oct 2, 2015 at 3:09 PM, r b wrote: > I've been working on something that just monitors ZooKeeper to

Re: catchall fields or multiple fields

2015-10-12 Thread Trey Grainger
d if you are okay losing IDF per-field (you'll still have it globally across all fields). If you want to use a catch-all field, but still want to boost content based upon the field it originated within, you can accomplish this with payloads. All the best, Trey Grainger Co-author, Solr in Action

Re: JSON Facet & Analytics API in Solr 5.1

2015-04-17 Thread Trey Grainger
his without adding in extra levels to make it look like the input side, so this is an exception case even though it seems syntactically valid. So in conclusion, I'd give a strong vote to the flatter structure. Can someone enumerate the benefits of the current format over the flatter structure

Re: Basic Multilingual search capability

2015-02-23 Thread Trey Grainger
th the ICUTokenizer then it will work to a point, but some of the problems Walter mentioned may eventually bite you if you are supporting certain groups of languages. All the best, Trey Grainger Co-author, Solr in Action Director of Engineering, Search & Recommendations @ CareerBuilder On M

What's the most efficient way to sort by "number of terms matched"?

2014-11-05 Thread Trey Grainger
rms from the main query (q parameter) execution:: q=python OR solr OR hadoop&sort=uniquematchedterms() desc,score desc. I don't think anything like this exists, but would love some suggestions if anyone else has solved this before. Thanks, -Trey

Re: How to implement multilingual word components fields schema?

2014-09-08 Thread Trey Grainger
be talking about multilingual search in November at Lucene/Solr Revolution, so I'd ideally like to finish before then so I can demonstrate it there. Thanks, -Trey Grainger Director of Engineering, Search & Analytics @ CareerBuilder On Mon, Sep 8, 2014 at 3:31 PM, Jorge Luis Betancourt G

Re: facet.field counts when q includes field

2014-04-27 Thread Trey Grainger
No problem, Mike. Glad you got it sorted out. Trey Grainger Co-author, Solr in Action Director of Engineering, Search & Analytics @ CareerBuilder On Sun, Apr 27, 2014 at 7:23 PM, Michael Sokolov < msoko...@safaribooksonline.com> wrote: > On 4/27/14 7:02 PM, Michael Sokolov wrote:

Re: facet.field counts when q includes field

2014-04-27 Thread Trey Grainger
uments which are both a type of book and also match the "q=toto" query, you should get 0 documents and thus the counts of all your facet values will be zero. As you mentioned, it is possible to utilize tags and excludes to change the behavior described above, but hopefully this ans

Re: multiple analyzers for one field

2014-04-10 Thread Trey Grainger
complicated. At any rate, that's the specific answer to your specific question about whether it is possible to utilize multiple Analyzers within a field based upon multiple inputs. All the best, Trey Grainger Co-author, Solr in Action Director of Engineering, Search & Analytics @ CareerBuil

Re: Multiple Languages in Same Core

2014-03-27 Thread Trey Grainger
d and run the code examples for free, though they may be harder to follow without the context from the book. Thanks, Trey Grainger Co-author, Solr in Action Director of Engineering, Search & Analytics @CareerBuilder On Wed, Mar 26, 2014 at 4:34 AM, Liu Bo wrote: > Hi Jeremy > >

Re: [ANN] Solr in Action book release (Solr 4.7)

2014-03-27 Thread Trey Grainger
ook. Best regards, Trey Grainger Co-author, Solr in Action Director of Engineering, Search & Analytics @CareerBuilder On Thu, Mar 27, 2014 at 12:04 PM, Philippe Soares wrote: > > Thanks Trey ! > I just tried to download my copy from my manning account, and this final > version

[ANN] Solr in Action book release (Solr 4.7)

2014-03-27 Thread Trey Grainger
*full source code are also available* at http://solrinaction.com. I would love it if you would check the book out, and I would also appreciate your feedback on it, especially if you find the book to be a useful guide as you are working with Solr! Timothy Potter and I (Trey Grainger) worked tireless

Re: analyzer with multiple stem-filters for more languages

2014-03-14 Thread Trey Grainger
on/tree/master/src/main/java/sia/ch14 Of course, if you want to take a simpler route, you can always just copy your text to two separate fields (one per language) and then search across them at query time using the eDisMax query parser. There are pros and cons to both approaches. All the best, -

Re: Facet pivot and distributed search

2014-02-07 Thread Trey Grainger
FYI, the last distributed pivot facet patch functionally works, but there are some sub-optimal data structures being used and some unnecessary duplicate processing of values. As a result, we found that for certain worst-case scenarios (i.e. data is not randomly distributed across Solr cores and req

Re: Single multilingual field analyzed based on other field values

2013-12-19 Thread Trey Grainger
ng to be much simpler than putting it all into the field to be analyzed to begin with (or better yet having an update request processor do it for you - including the detection of language boundaries - inside of Solr so the customer doesn't have to worry about it). -Trey On Tue, Oct 29, 201

Re: Re: LanguageIdentifierUpdateProcessor uses only firstValue() on multivalued fields

2013-12-12 Thread Trey Grainger
Hmm... haven't run into the case where null was returned in a multi-valued scenario yet... I probably just haven't tested that case. I likely need to add a null check there - thanks for pointing it out. -Trey On Fri, Nov 29, 2013 at 6:10 AM, Müller, Stephan < muel...@ponton-

Re: Function query matching

2013-12-02 Thread Trey Grainger
= this (in QueryValueSource). This should be an easy fix. I'll create a JIRA ticket to use better key names in these functions and push up a patch. This will eliminate the need for the extra NoOp function. -Trey On Mon, Dec 2, 2013 at 12:41 PM, Peter Keegan wrote: > I'm pursuing

Re: LanguageIdentifierUpdateProcessor uses only firstValue() on multivalued fields

2013-11-28 Thread Trey Grainger
n/java/sia/ch14/MultiTextFieldLanguageIdentifierUpdateProcessor.java Good luck! -Trey On Wed, Nov 27, 2013 at 10:16 AM, Müller, Stephan < muel...@ponton-consulting.de> wrote: > > I suspect that it is an oversight for a use case that was not considered. > > I mean, it should probably either

Re: Single multilingual field analyzed based on other field values

2013-10-28 Thread Trey Grainger
ached in a ThreadLocal context depending upon the internal ReusePolicy, and I'm skeptical that you'll be able to pull this off cleanly. It would really be hacking around the Lucene APIs even if you were able to pull it off. -Trey On Mon, Oct 28, 2013 at 5:15 PM, Jack Krupan

Re: Getting a query parameter in a TokenFilter

2013-09-22 Thread Trey Grainger
in my implementation, but I could see where it might be more user-friendly for many Solr users. I'm just finishing up the "multilingual search" chapter and code now and will be happy to post it to SOLR-5053 once I finish in the next few days if this would be helpful to you. -Trey

Re: Need help understanding the use cases behind core auto-discovery

2013-09-21 Thread Trey Grainger
's JIRA comment: https://issues.apache.org/jira/browse/SOLR-4478 Thanks, -Trey On Sat, Sep 21, 2013 at 2:25 PM, Erick Erickson wrote: > Also consider where SolrCloud is going. Trying to correctly maintain > all the solr.xml files yourself on all the nodes would have > been..."

Re: [ANNOUNCE] Solr wiki editing change

2013-03-30 Thread Trey Grainger
Please add TreyGrainger to the contributors group. Thanks! -Trey On Sun, Mar 24, 2013 at 11:18 PM, Steve Rowe wrote: > The wiki at http://wiki.apache.org/solr/ has come under attack by > spammers more frequently of late, so the PMC has decided to lock it down in > an attempt

PostingsHighlighter and analysis

2013-03-11 Thread Trey Hyde
ENE-4641 that is really critical to our searches is the WordDelimiter filter. My current index time filter config (which I believe has been unchanged for me for 5+ years): Does anyone have any suggestions to deal with this? Perhaps limiting certain options will always produce tokens in order

Re: Can I invert the inverted index?

2011-07-05 Thread Trey Grainger
r 3.3. One of these days I'll remove the JSP dependency and this may eventually make it into trunk. Thanks, -Trey Grainger Search Technology Development Team Lead, Careerbuilder.com Site Architect, Celiaccess.com On Tue, Jul 5, 2011 at 3:59 PM, Gabriele Kahlout wrote: > Hello, > >

Re: Indexes in ramdisk don't show performance improvement?

2011-06-02 Thread Trey Grainger
ene index files, which should be a safe assumption considering you're trying to put the whole index in a ramdisk. -Trey On Thu, Jun 2, 2011 at 7:15 PM, Erick Erickson wrote: > What I expect is happening is that the Solr caches are effectively making the > two tests identical, usin

Re: old searchers not closing after optimize or replication

2011-04-21 Thread Trey Grainger
ushed up a patch into the 3x branch and trunk for this issue. I can confirm that applying the patch (or just removing startup replication) resolved the issue for us. Do you think this is your issue? Thanks, -Trey On Thu, Apr 21, 2011 at 2:27 AM, Bernd Fehling wrote: > Hi Erik, > > >

Re: Apache Spam Filter Blocking Messages

2011-04-21 Thread Trey Grainger
Good to know; I'll go change those settings, then.  Thanks for the feedback. -Trey On Thu, Apr 21, 2011 at 4:42 AM, Em wrote: > > This really helps at the mailinglists. > If you send your mails with Thunderbird, be sure to check that you enforce > plain-text-emails. If not, i

Apache Spam Filter Blocking Messages

2011-04-20 Thread Trey Grainger
t? Thanks, -Trey

Re: Solr 3.1: Old Index Files Not Removed on Optimize?

2011-04-15 Thread Trey Grainger
Thank you, Yonik! I see the Jira issue you created and am guessing it's due to this issue. We're going to remove replicateAfter="startup" in the meantime to see if that helps (assuming this is the issue the jira ticket described). I appreciate you taking a look at this. Th

Solr 3.1: Old Index Files Not Removed on Optimize?

2011-04-15 Thread Trey Grainger
nfig.xml]: true commit optimize startup 10 30 false 1 Thanks in advance, -Trey

Re: useFastVectorHighlighter creates fragments with cut off terms, incomplete

2010-10-12 Thread Trey Hyde
lighter.java:379) ... >> >> I also set the value to a value larger than the possible size of the field >> but I still get a left truncated highlight in many cases. >> >> >> hl.fragListBuilder and hl.fragmentsBuilder sound like they may be relevant >> but I ha

useFastVectorHighlighter creates fragments with cut off terms, incomplete

2010-10-12 Thread Trey Hyde
This is my highlighter set up for the time being. true true 3 1 200 true Any suggestions? Thanks. I'm running revision 1021880 in the lusolr 3_1 branch. Trey Hyde th...@centraldesktop.com Central Desktop, Inc. Organize, Share, Collaborate

Re: Luke browser does not show non-String Solr fields?

2010-05-31 Thread Trey Grainger
can't tell what problem you are trying to solve), but I thought it would be worth mentioning as one tool in your toolbox. -Trey > > > >

Re: resetting stats

2010-03-31 Thread Trey Grainger
: reloading the core just to reset the stats definitely seems like throwing : out the baby with the bathwater. Agreed about throwing out the baby with the bath water - if stats need to be reset, though, then that's the only way today. A reset stats button would be a nice way to prevent having to

Re: resetting stats

2010-03-30 Thread Trey Grainger
negligible and the time to reload is essentially zero. The primary disadvantage of the core reloading approach is that your warmed caches are dropped (if you are using caches on that core), but as long as you have good warmup queries you should be okay as long as the reload isn't constant. -Trey O

Re: How to edit / compile the SOLR source code

2010-03-12 Thread Trey
Or, (as Joe Calderon said in the apparent sibling thread) you can just type "ant clean dist" if you want to verifiably blow away the old jars and replace them with the new jars/war to deploy. -Trey On Sat, Mar 13, 2010 at 12:03 AM, Trey wrote: > Hmm... sorry for the bad link.

Re: How to edit / compile the SOLR source code

2010-03-12 Thread Trey
'ant generate-maven-artifacts' to generate maven artifacts. [echo] Use 'ant package' to generate zip, tgz, and maven artifacts for distribution. [echo] Use 'ant luke' to start luke. see: http://www.getopt.org/luke/ [echo] Use 'ant test' to run unit t

GC performance: 1.3 vs 1.4

2010-03-12 Thread Trey Hyde
y suggestions on a change to my GC settings or solrconfig.xml to match with changes in 1.4? I'm going to add an SSD L2ARC to get some better performance but that will not help my GC issues. Trey Hyde th...@centraldesktop.com Central Desktop, Inc. Organize, Share, Collaborate

Re: How to edit / compile the SOLR source code

2010-03-11 Thread Trey
can attach the same way. The last step (after you've made your changes) is that you would just need to rebuild with Ant (run "ant" from the directory containing the build.xml file to see the build options for Solr). I think that just running "ant example" there should do th

Any way to recover a corrupt index from a "live" IndexReader?

2010-02-24 Thread Trey
Hi All, It seems I have a corrupt index on disk on my Master, but the live IndexReader is still working. I don't want to restart Solr (1.4), because I'm pretty sure the corrupt index will be loaded upon restart, causing me to delete and rebuild the index from source. Is there any way to restore

Re: Multiple Cores Vs. Single Core for the following use case

2010-01-26 Thread Trey
shared by multiple users? -Trey On Tue, Jan 26, 2010 at 7:27 PM, Matthieu Labour wrote: > Hi > > > > Shall I set up Multiple Core or Single core for the following use case: > > > > I have X number of users. > > > > When I do a search, I always know for which

Re: Replication Handler Severe Error: Unable to move index file

2010-01-21 Thread Trey
error > happened. > > > > I'm not looking at the source code now, but is that really the only error > you got? No exception stack trace? > > > > Otis > > -- > > Sematext -- http://sematext.com/ -- Solr - Lucene - Nutch > > > > > >

Replication Handler Severe Error: Unable to move index file

2010-01-20 Thread Trey
Does anyone know what would cause the following error?: 10:45:10 AM org.apache.solr.handler.SnapPuller copyAFile SEVERE: *Unable to move index file* from: /home/solr/cores/core8/index.20100119103919/_6qv.fnm to: /home/solr/cores/core8/index/_6qv.fnm This occurred a few days back and we notic

Re: NPE when trying to view a specific document via Luke

2009-11-12 Thread Solr Trey
I played around with it and am also getting a NullPointerException on Solr 1.4, as well (albeit with a slightly different dump). Some of my documents actually return, FYI, just not all. I'm on a multi-solr-core system searching /solr/core1/admin/luke?id=MYID. My Exception looked different,

Re: faceted browsing

2006-04-04 Thread Trey Hyde
Chris Hostetter wrote: : My (our) query plugin uses specialized SolrCache's in lieu of the meta : data records. For each new searcher installed each field's possible : values will be determined and stored in a cache (off the top of my head, Are you determining the field values based on all in

Re: faceted browsing

2006-04-03 Thread Richard "Trey" Hyde
My (our) query plugin uses specialized SolrCache's in lieu of the meta data records. For each new searcher installed each field's possible values will be determined and stored in a cache (off the top of my head, some fields have a cardinality of well over 500k). Each time a query is run that r