king for?
Thanks,
Brett
Field Definition:
sortMissingLast="true" omitNorms="false" omitTermFreqAndPositions="true">
is the case.
Best
Erick
On Wed, Jan 11, 2012 at 7:57 PM, Brett wrote:
I'm implementing a feature where admins have the ability to control the
order of the results by adding a boost to any specific search.
The search is a faceted interface (no text input) from which we take a hash
of the
.GoLive: Live merging of index shards into
Solr cluster took 2.31269488E8 secs
14/04/17 16:00:31 INFO hadoop.GoLive: Live merging failed
I'm digging into the code now, but wanted to send this out as a sanity
check.
Thanks,
Brett
https://gist.github.com/bretthoerner/0dc6bfdbf45a18328d4b
On Thu, Apr 17, 2014 at 11:31 AM, Mark Miller wrote:
> Odd - might be helpful if you can share your solrconfig.xml being used.
>
> --
> Mark Miller
> about.me/markrmiller
>
> On April 17, 2014 at 12:18:37
Sorry to bump this, I have the same issue and was curious about the sanity
of trying to work around it.
* I have a constant stream of realtime documents I need to continually
index. Sometimes they even overwrite very old documents (by using the same
unique ID).
* I also have a *huge* backlog of do
I'm back to looking at the code but holy hell is debugging Hadoop hard. :)
On Thu, Apr 17, 2014 at 12:33 PM, Brett Hoerner wrote:
> https://gist.github.com/bretthoerner/0dc6bfdbf45a18328d4b
>
>
> On Thu, Apr 17, 2014 at 11:31 AM, Mark Miller wrote:
>
>> Odd - might b
merge is complete. If writes are allowed, corruption may occur on the
merged index." Is that saying that Solr will block writes, or is that
saying the end user has to ensure no writes are happening against the
collection during a merge? That seems... risky?
On Tue, Apr 22, 2014 at 9:29 AM, Brett
If I run a query like this,
fq=text:lol
fq=created_at_tdid:[1400544000 TO 1400630400]
It takes about 6 seconds. Following queries take only 50ms or less, as
expected because my fqs are cached.
However, if I change the query to not cache my big range query:
fq=text:lol
fq={!cache=false}created_a
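For reference, a hedged sketch of the non-cached variant (field name and epoch bounds copied from the cached query above; the `cost` value is an assumption about how one might control filter ordering, not something from the original mail):

```
q=*:*
fq=text:lol
fq={!cache=false cost=100}created_at_tdid:[1400544000 TO 1400630400]
```

In Solr, `cache=false` skips the filterCache for that fq, and a `cost` of 100 or more asks Solr to apply the filter as a post-filter where the query type supports it.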
act of storing the work after
it's done (it has to be done in either case) is taking 4 whole seconds?
On Tue, Jun 3, 2014 at 3:59 PM, Shawn Heisey wrote:
> On 6/3/2014 2:44 PM, Brett Hoerner wrote:
> > If I run a query like this,
> >
> > fq=text:lol
> > fq=created_a
, but that seems...
surprising to me.
On Tue, Jun 3, 2014 at 4:02 PM, Brett Hoerner
wrote:
> In this case, I have >400 million documents, so I understand it taking a
> while.
>
> That said, I'm still not sure I understand why it would take *more* time.
> In your exampl
Yonik, I'm familiar with your blog posts -- and thanks very much for them.
:) Though I'm not sure what you're trying to show me with the q=*:* part? I
was of course using q=*:* in my queries, but I assume you mean to leave off
the text:lol bit?
I've done some Cluster changes, so these are my basel
The following two queries are doing the same thing, one using a "normal" fq
range query and another using a parent query. The cache is warm (these are
both hits) but the "normal" ones takes ~6 to 7.5sec while the parent query
hack takes ~1.2sec.
Is this expected? Is there anything "wrong" with my
t 5:09 AM, Mikhail Khludnev wrote:
> Brett,
>
> That's a really interesting observation. I can only speculate. It's worth
> checking the cache hit stats and cache content via
> http://wiki.apache.org/solr/SolrCaching#showItems (the key question is what
> classes the cached doc sets are). A
Can anyone explain the difference between these two queries?
text:(+"happy") AND -user:("123456789") = numFound 2912224
But
text:(+"happy") AND user:(-"123456789") = numFound 0
Now, you may just say "then just put - in front of your field, duh!" Well,
text:(+"happy") = numFound 2912224
". For example:
>
> text:(+"happy") AND user:(*:* -"123456789")
>
> -- Jack Krupansky
>
> -Original Message- From: Brett Hoerner
> Sent: Tuesday, July 1, 2014 2:51 PM
> To: solr-user@lucene.apache.org
> Subject: Confusion about location of
Also, does anyone have the Solr or Lucene bug # for this?
On Tue, Jul 1, 2014 at 3:06 PM, Brett Hoerner
wrote:
> Interesting, is there a performance impact to sending the *:*?
>
>
> On Tue, Jul 1, 2014 at 2:53 PM, Jack Krupansky
> wrote:
>
>> Yeah, there's a k
hought this would be expected since I do routing myself.
Did the upgrade change something here? I didn't see anything related to
this in the upgrade notes.
Thanks,
Brett
Here's my clusterstate.json:
https://gist.github.com/bretthoerner/a8120a8d89c93f773d70
On Mon, Nov 25, 2013 at 10:18 AM, Brett Hoerner wrote:
> Hi, I've been using a collection on Solr 4.5.X for a few weeks and just
> did an upgrade to 4.6 and am having some issues.
(is there
a tool for this? I've always done it manually), started the cluster up
again and it's all good now.
On Mon, Nov 25, 2013 at 10:38 AM, Brett Hoerner wrote:
> Here's my clusterstate.json:
>
> https://gist.github.com/bretthoerner/a8120a8d89c93f773d70
>
>
I have Solr 4.6.1 on the server and just upgraded my indexer app to SolrJ
4.6.1 and indexing ceased (indexer returned "No live servers for shard" but
the real root from the Solr servers is below). Note that SolrJ 4.6.1 is
fine for the query side, just not adding documents.
21:35:21.508 [qtp14184
On Fri, Feb 7, 2014 at 6:15 PM, Mark Miller wrote:
> You have to update the other nodes to 4.6.1 as well.
>
I'm not sure I follow, all of the Solr instances in the cluster are 4.6.1
to my knowledge?
Thanks,
Brett
> - Mark
>
> http://about.me/markrmiller
>
>
>
> On Feb 7, 2014, 7:01:24 PM, Brett Hoerner wrote:
> I have Solr 4.6.1 on the server and just upgraded my indexer app to SolrJ
> 4.6.1 and indexing ceased (indexer returned "No live servers for shard" but
> th
not 4.6.1. That code couldn’t have been 4.6.1 it seems.
>
> - Mark
>
> http://about.me/markrmiller
>
> On Feb 8, 2014, at 11:12 AM, Brett Hoerner wrote:
>
> > Hmmm, I'm assembling into an uberjar that forces uniqueness of classes. I
> > verified 4.6.1 i
Mark, you were correct. I realized I was still running a prerelease of
4.6.1 (by a handful of commits). Bounced them with proper 4.6.1 and we're
all good, sorry for the spam. :)
On Sat, Feb 8, 2014 at 10:29 AM, Brett Hoerner wrote:
> Oh, I was talking about my indexer. That stack is
I have a very weird problem that I'm going to try to describe here to see
if anyone has any "ah-ha" moments or clues. I haven't created a small
reproducible project for this but I guess I will have to try in the future
if I can't figure it out. (Or I'll need to bisect by running long Hadoop
jobs...
(StandardDirectoryReader.java:277)
at
org.apache.lucene.index.DirectoryReader.openIfChanged(DirectoryReader.java:251)
at org.apache.solr.core.SolrCore.openNewSearcher(SolrCore.java:1476)
... 25 more
On Tue, Sep 16, 2014 at 12:54 PM, Brett Hoerner
wrote:
> I have a very weird prob
To be clear, those exceptions are during the "main" mapred job that is
creating the many small indexes. If these errors above occur (they don't
fail the job), I am 99% sure that is when the MTree job later hangs.
On Tue, Sep 23, 2014 at 1:02 PM, Brett Hoerner
wrote:
> I
ing take a long time. I haven't tried to
see if the issue shows on smaller jobs yet (does 1 minute become 6
minutes?).
Brett
On Tue, Sep 16, 2014 at 12:54 PM, Brett Hoerner
wrote:
> I have a very weird problem that I'm going to try to describe here to see
> if anyone has any "
I'm interested in using the new custom sharding features in the
collections API to search a rolling window of event data. I'd appreciate a
spot/sanity check of my plan/understanding.
Say I only care about the last 7 days of events and I have thousands per
second (billions per week).
Am I correct
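For what it's worth, a rolling-window plan like the one described is often sketched with the implicit router, creating tomorrow's shard and dropping the oldest one each day (collection and shard names below are assumptions, and the shard list is abbreviated):

```
# create the collection with the implicit router
/admin/collections?action=CREATE&name=events&router.name=implicit
    &shards=day_2013_10_01,day_2013_10_02,...

# each day: add tomorrow's shard, drop the oldest one
/admin/collections?action=CREATESHARD&collection=events&shard=day_2013_10_08
/admin/collections?action=DELETESHARD&collection=events&shard=day_2013_10_01
```

With the implicit router, documents land on the shard named by the `_route_` parameter (or a configured `router.field`), so the indexer must route each event to its day's shard.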
It seems that changes in 4.5 collection configuration now require users to
set a maxShardsPerNode (or it defaults to 1).
Maybe this was the case before, but with the new CREATESHARD API it seems
very restrictive. I've just created a very simple test collection on 3
machines where I set maxShards
would create 1 new shard with 1 replica on any
server in 4.5?
Thanks!
On Tue, Oct 1, 2013 at 8:14 PM, Brett Hoerner wrote:
> It seems that changes in 4.5 collection configuration now require users to
> set a maxShardsPerNode (or it defaults to 1).
>
> Maybe this was the case before
n odd
number like 1000 just to get around this?
Thanks!
On Wed, Oct 2, 2013 at 12:04 AM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:
> Thanks for reporting this Brett. This is indeed a bug. A workaround is to
> specify replicationFactor=1 with the createShard command
ow is that since I'm only using the default 16 bits my data
is being split across multiple shards (because of my high # of shards).
Thanks,
Brett
zeable amount of data (68M and 128M) and the rest are very
small as expected.
The fact that two are receiving so much makes me think my data is being
split into two shards. I'm trying to debug more now.
On Tue, Oct 8, 2013 at 5:45 PM, Yonik Seeley wrote:
> On Tue, Oct 8, 2013 at 6:
y hour and it's
been running for 2). There *is* a little old data in my stream, but not
that much (like <5%). What's confusing to me is that 5 of them are rather
large, when I'd expect 2 of them to be.
On Tue, Oct 8, 2013 at 5:45 PM, Yonik Seeley wrote:
> On Tue, Oct 8, 2013 at
e, Oct 8, 2013 at 7:31 PM, Brett Hoerner
> wrote:
> > This is my clusterstate.json:
> > https://gist.github.com/bretthoerner/0098f741f48f9bb51433
> >
> > And these are my core sizes (note large ones are sorted to the end):
> > https://gist.github.com/bretthoerner/f5b5e0
Ignore me I forgot about shards= from the wiki.
On Tue, Oct 8, 2013 at 7:11 PM, Brett Hoerner wrote:
> I have a silly question, how do I query a single shard in SolrCloud? When
> I hit solr/foo_shard1_replica1/select it always seems to do a full cluster
> query.
>
> I can&
Thanks folks,
As an update for future readers --- the problem was on my side (my logic in
picking the _route_ was flawed) as expected. :)
On Tue, Oct 8, 2013 at 7:35 PM, Yonik Seeley wrote:
> On Tue, Oct 8, 2013 at 8:27 PM, Shawn Heisey wrote:
> > There is also the "distrib=false" parameter t
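A hedged sketch of both single-shard approaches mentioned above (host, port, core, and shard names are placeholders):

```
# query one core directly, without fanning out across the cluster
http://host:8983/solr/foo_shard1_replica1/select?q=*:*&distrib=false

# or ask the cluster to restrict the query to named shards
http://host:8983/solr/foo/select?q=*:*&shards=shard1
```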
ng like a facet
query would need to go to *all* shards for any query (I'm using the default
SolrCloud sharding mechanism, nothing special).
How could a text field search for 'happy' always work while 'austin' always
returns an error? Shouldn't that "down server" be hit for a 'happy' query
too?
Thanks,
Brett
ile correctly returns the text of
the file.
I am wondering:
* Is this a known issue? (I couldn't find any mention of this
particular issue anywhere...)
* Are there any workarounds or does anyone have any suggestions?
Thanks,
Brett.
...
Brett.
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Sunday, November 25, 2012 9:27 PM
To: solr-user@lucene.apache.org
Subject: Re: Problem with Solr 3.6.1 extracting ODT content using SolrCell's
ExtractingRequestHandler
Did you commit after you adde
Hi,
I have a Cloud setup of 4 machines. I bootstrapped them with 1 collection,
which I called "default" and haven't used since. I'm using an external ZK
ensemble that was completely empty before I started this cloud.
Once I had all 4 nodes in the cloud I used the collection API to create the
real
CREATE or DELETE actually did anything, though. (Again, HTTP
200 OK)
Still stuck here, any ideas?
Brett
On Tue, Dec 4, 2012 at 7:19 PM, Brett Hoerner wrote:
> Hi,
>
> I have a Cloud setup of 4 machines. I bootstrapped them with 1 collection,
> which I called "default" and ha
ument]},
attr_revision_number=attr_revision_number(1.0)={1},
attr_template=attr_template(1.0)={Normal.dotm},
attr_last_author=attr_last_author(1.0)={Brett Melbourne},
attr_page_count=attr_page_count(1.0)={1},
attr_application_name=attr_application_name(1.0)={Microsoft Office Word},
author=autho
ing any
deletes. :)
Brett
On Fri, Dec 7, 2012 at 10:50 AM, Mark Miller wrote:
> Anything in any of the other logs (the other nodes)? The key is getting
> the logs from the node designated as the overseer - it should hopefully
> have the error.
>
> Right now because you pass this
I was using Solr 4.0 but ran into a few problems using SolrCloud. I'm
trying out 4.1 RC1 right now but the update URL I used to use is returning
HTTP 404.
For example, I would post my document updates to,
http://localhost:8983/solr/collection1
But that is 404ing now (collection1 exists according
on it reports 404
sometimes. What's odd is that I can use curl to post a JSON document to the
same URL and it will return 200.
When I log every request I make from my indexer process (using SolrJ) it's
about 50/50 between 404 and 200...
On Sat, Jan 19, 2013 at 5:22 PM, Brett Hoerner wrot
So the ticket I created wasn't related, there is a working patch for that
now but my original issue remains, I get 404 when trying to post updates to
a URL that worked fine in Solr 4.0.
On Sat, Jan 19, 2013 at 5:56 PM, Brett Hoerner wrote:
> I'm actually wondering if this other is
Sorry, I take it back. It looks like fixing
https://issues.apache.org/jira/browse/SOLR-4321 fixed my issue after all.
On Sun, Jan 20, 2013 at 2:21 PM, Brett Hoerner wrote:
> So the ticket I created wasn't related, there is a working patch for that
> now but my original issue remains
I have a collection in Solr 4.1 RC1 and doing a simple query like
text:"puppy dog" is causing an exception. Oddly enough, I CAN query for
text:puppy or text:"puppy", but adding the space breaks everything.
Schema and config: https://gist.github.com/f49da15e39e5609b75b1
This happens whether I quer
set to use the Lucene version 4.0
> index format but you mention you are using 4.1
>
> LUCENE_40
>
>
>
> On Mon, Jan 21, 2013 at 4:26 PM, Brett Hoerner wrote:
>
> > I have a collection in Solr 4.1 RC1 and doing a simple query like
> > text:"pup
Hi,
I have a 5 server cluster running 1 collection with 20 shards, replication
factor of 2.
Earlier this week I had to do a rolling restart across the cluster, this
worked great and the cluster stayed up the whole time. The problem is that
the last node I restarted is now the leader of 0 shards,
very busy, indexing 5k+ small documents
per second, but the nodes were all fine until I had to restart them and
they had to re-sync.
Here is the log since reboot: https://gist.github.com/396af4b217ce8f536db6
Any ideas?
On Sat, Feb 2, 2013 at 10:27 AM, Brett Hoerner wrote:
> Hi,
>
> I have
ores?action=unload&name=core1. This removes the core/shard from
> bob, giving the other servers a chance to grab leader props.
>
> -Joey
>
> On Feb 2, 2013, at 11:27 AM, Brett Hoerner wrote:
>
> > Hi,
> >
> > I have a 5 server cluster running 1 collection with
I have a SolrCloud cluster (2 machines, 2 Solr instances, 32 shards,
replication factor of 2) that I've been using for over a month now in
production.
Suddenly, the hourly cron I run that dispatches a delete by query
completely halts all indexing. Select queries still run (and quickly),
there is n
4.1, I'll induce it again and run jstack.
On Wed, Mar 6, 2013 at 1:50 PM, Mark Miller wrote:
> Which version of Solr?
>
> Can you use jconsole, visualvm, or jstack to get some stack traces and see
> where things are halting?
>
> - Mark
>
> On Mar 6, 2013, at 1
f the shards to be mastered on server2
(unload/create on server1)
5) restart indexer
And it works again until a delete eventually kills it.
To be clear again, select queries continue to work indefinitely.
Thanks,
Brett
On Wed, Mar 6, 2013 at 1:50 PM, Mark Miller wrote:
> Which version
rated pretty deadlock graphs of those.
> >
> >
> > Regards,
> > Alex.
> >
> >
> >
> >
> >
> > Personal blog: http://blog.outerthoughts.com/
> > LinkedIn: http://www.linkedin.com/in/alexandrerafalovitch
> > - Time is the quality of na
replica as well? (also when
> it's locked up of course).
>
> - Mark
>
> On Mar 6, 2013, at 3:34 PM, Brett Hoerner wrote:
>
> > If there's anything I can try, let me know. Interestingly, I think I have
> > noticed that if I stop my indexer, do my delete, and r
As a side note, do you think that was a poor idea? I figured it's better to
spread the master "load" around?
On Thu, Mar 7, 2013 at 11:29 AM, Mark Miller wrote:
>
> On Mar 7, 2013, at 9:03 AM, Brett Hoerner wrote:
>
> > To be clear, neither is really "the
Thu, Mar 7, 2013 at 11:03 AM, Brett Hoerner wrote:
> Here is the other server when it's locked:
> https://gist.github.com/3529b7b6415756ead413
>
> To be clear, neither is really "the replica", I have 32 shards and each
> physical server is the leader for 16, and the rep
sting all URLs be
evaluated as lowercase. What is the best practice on URL case? Is there a
negative to making all lowercase? I know I can drop the index and re-crawl to
fix it, but long term how should URL case be treated? Thanks!
Brett
https://www.nuveen.com/mutual-funds/nuveen-high-yield-municipal-bond-fund
https://www.nuveen.com/mutual-funds/Nuveen-High-Yield-Municipal-Bond-Fund
Is there any issue if we just lowercase all URLs? I can't think of an issue
that would be caused, but that's why I'm asking the Guru&
I'm using the below FieldType/Field but when I index my documents, the URL is
not being lowercased. Any ideas? Do I have the below wrong?
Example: http://connect.rightprospectus.com/RSVP/TADF
Expect: http://connect.rightprospectus.com/rsvp/tadf
Brett
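For reference, an analysis chain like the one discussed only affects the *indexed* terms; the *stored* value (what comes back in results) is the raw input. A minimal sketch of a field type that lowercases the indexed form (names are assumptions; to change the stored value you would lowercase before indexing, e.g. in an update processor or in the client):

```xml
<fieldType name="lowercase_url" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- keep the URL as a single token, then lowercase the indexed term -->
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="url" type="lowercase_url" indexed="true" stored="true"/>
```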
he case I still don't get why the
Lowercase doesn't fire when the data is being indexed.
Brett Moyer
-Original Message-
From: Shawn Heisey [mailto:apa...@elyograg.org]
Sent: Thursday, March 14, 2019 10:44 AM
To: solr-user@lucene.apache.org
Subject: Re: FieldTypes and Lower
also storing the lowercase form in the
inverted index, right?
I'm getting closer I think. Ok so if I want to physically lowercase the URL and
store it that way, I need to do it before it gets to the Index as you stated.
Ok got it, Thanks!
Brett Moyer
Manager, Sr. Technical Lead | TFS Technology
a bad habit of only using single terms for search. A
very common search term is "ira". The PERSON page ranks higher than the article
on IRA's. With essentially no information from the user, what are some ways we
can detect this and rank differently?
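One common tool for this kind of hand-tuning is Solr's QueryElevationComponent, which pins chosen documents to the top for exact query strings. A hedged sketch of elevate.xml (the doc id is made up for illustration):

```xml
<elevate>
  <!-- for the bare query "ira", force the retirement-account article first -->
  <query text="ira">
    <doc id="article-ira-overview"/>
  </query>
</elevate>
```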
I'll take a closer look at
what you sent, Thank you!
Brett Moyer
Manager, Sr. Technical Lead | TFS Technology
Public Production Support
Digital Search & Discovery
8625 Andrew Carnegie Blvd | 4th floor
Charlotte, NC 28263
Tel: 704.988.4508
Fax: 704.988.4907
bmo...@tiaa.org
-Origina
llation", <-this is the bad line
"account"
]
Previous Solr versions
--
"spellcheck": {
  "suggestions": [
    "acount",
    {
      "numFound": 1,
      "startOffset": 0,
      "endOffset": 6,
      "suggestion": [
        "account"
      ]
    }
I'm looking to know what
other approaches people have created. Did you put your Solr in T1? I assume
not; that would put it at risk. Thanks!
Brett Moyer
*
This e-mail may contain confidential or privileged information.
I
large, in total we might have 50k documents. How can I reduce this /data/solr
space? Is this what the Solr Optimize command is for? Thanks!
Brett
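If the space is segment bloat from deleted or replaced documents, the optimize call looks roughly like this (host and core name are placeholders; note that optimize rewrites the whole index and is expensive):

```
curl 'http://host:8983/solr/mycore/update?optimize=true&maxSegments=1'
```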
Highlight? What about using the Highlighter?
https://lucene.apache.org/solr/guide/6_6/highlighting.html
Brett Moyer
/lzd6hkoikhagujs/CoreOne.png?dl=0
https://www.dropbox.com/s/ae6rayb38q39u9c/CoreTwo.png?dl=0
Brett
-Original Message-
From: Erick Erickson
Sent: Thursday, August 8, 2019 5:49 PM
To: solr-user@lucene.apache.org
Subject: Re: Indexed Data Size
On the surface, this makes no sense at all, so there’s
n 8/9/2019 6:12 AM, Moyer, Brett wrote:
> Thanks! We update each index nightly, we don’t clear, but bring in New and
> Deltas, delete expired/404. All our data are basically webpages, so none are
> very large. Some PDFs but again not too large. We are running Solr 7.5,
> hopefully y
Turns out this is due to a job that indexes logs. We were able to clear some
with another job. We are working through the value of these indexed logs.
Thanks for all your help!
Brett Moyer
user to their desired result. Something about it doesn’t seem right. Is this
right with a flat, single-level pattern like what we have? Should each doc
have multiple fields mapping to different values? Any help is
appreciated. Thanks!
Example Facets:
Brokerage
Retirement
Open an Account
Move Money
rrect for Facets? Why do
you say the above are not Facets?
Here is an excerpt from our JSON:
"facet_counts": {
  "facet_queries": {},
  "facet_fields": {
    "Tags": [
      "Retirement",
      1260,
      "Locations & People",
      1149,
      "Advice and Tools"
[
{"name":"brokage",{
"type":"str","value":"numFound":1,
"startOffset":0,
"endOffset":9,
"suggestion":["brokerage"]}}],
"collations":
[
{"
Yes, we are stemming. Ahh, so we shouldn't run the spellcheck field through stemming?
Brett Moyer
-Original Message-
From: Jörn Franke
Sent: Friday, November 22, 2019 8:34 AM
To: solr-user@lucene.apache.org
Subject: Re: Odd Edge Case for SpellCheck
Stemming involved ?
> Am 22.11.2019
This is a great help, thank you!
Brett Moyer
-Original Message-
From: Erick Erickson
Sent: Monday, November 25, 2019 4:12 PM
To: solr-user@lucene.apache.org
Subject: Re: Odd Edge Case for SpellCheck
If you’re using direct spell checking, it looks for the _indexed_ term. So this
means