h+Worker+Collections
>
> As I work on the documentation I'll revalidate the performance numbers I
> was seeing when I did the performance testing several months ago.
>
> Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Mon, May 16, 2016 at 10:51 AM, Ryan Cutter
> >> likely take some time to complete though as you are sorting
> >> and exporting 30,000,000 docs from a single node.
> >>
> >> 2) Then try running the same *:* search() against the /export handler in
> >> parallel() gradually increasing the number of workers
> workers, you could export 52,000,000 docs per second. With 40
> shards, 5 replicas and 40 workers you could export 130,000,000 docs per
> second.
>
> So with large clusters you could do very large distributed joins with
> sub-second performance.
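
For reference, a parallel /export expression of the kind described above
might look roughly like this (the worker collection name, zkHost, and field
names are illustrative assumptions, not from the original message):

parallel(workerCollection,
         search(triple,
                q="*:*",
                fl="triple_id,subject_id",
                sort="subject_id asc",
                qt="/export",
                partitionKeys="subject_id"),
         workers="20",
         zkHost="localhost:9983",
         sort="subject_id asc")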
>
>
>
>
> Joel Bernstein
>
coming in 6.1:
> > >
> > >
> >
> https://cwiki.apache.org/confluence/pages/viewpage.action?pageId=62693238
> > >
> > > Joel Bernstein
> > > http://joelsolr.blogspot.com/
> > >
> > > On Fri, May 13, 2016 at 5:57 PM, Joel Bernstein
> > Also keep in mind that the /export handler requires
> > that sort fields and fl fields have docValues set.
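
A field satisfying that /export requirement would be declared with docValues
enabled, e.g. (a minimal sketch; the field name is illustrative):

<field name="subject_id" type="long" indexed="true" stored="true" docValues="true"/>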
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
> >
> > On Fri, May 13, 2016 at 5:36 PM, Ryan Cutter
> wrote:
> >
> >> Question #1:
Question #1:
The triple_type collection has a few hundred docs and triple has 25M docs.
When I search for a particular subject_id in triple, which I know has 14
results, and do not pass in a 'rows' param, it returns 0 results:
innerJoin(
search(triple, q=subject_id:1656521, fl="triple_id,subject_id
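
One common way to avoid depending on 'rows' is to stream the full result set
through the /export handler instead of the default /select handler; a sketch
of such a search(), assuming subject_id and triple_id have docValues as noted
above:

search(triple, q=subject_id:1656521, fl="triple_id,subject_id", sort="subject_id asc", qt="/export")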
> > coming from the wrapper exception I
> > believe.
> >
> > Joel Bernstein
> > http://joelsolr.blogspot.com/
> >
> > On Tue, May 10, 2016 at 12:30 AM, Ryan Cutter
> > wrote:
> >
> >> Yes, the people collection has the personId and pets has ownerId, as
Joel Bernstein
> http://joelsolr.blogspot.com/
>
> On Mon, May 9, 2016 at 10:43 PM, Ryan Cutter wrote:
>
> > Thanks Joel, I added the personId and ownerId fields before ingesting a
> > little data. I made them stored=true/multiValued=false/longs (and
> > strings, la
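
Field definitions matching that description would look something like this
(a sketch, not the poster's actual schema):

<field name="personId" type="long" indexed="true" stored="true" multiValued="false"/>
<field name="ownerId" type="long" indexed="true" stored="true" multiValued="false"/>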
> http://joelsolr.blogspot.com/
>
> On Mon, May 9, 2016 at 9:22 PM, Ryan Cutter wrote:
>
> > Hello, I'm checking out the cool stream join operations in Solr 6.0 but
> > can't seem to get the example listed on the wiki to work:
> >
> >
> >
Hello, I'm checking out the cool stream join operations in Solr 6.0 but
can't seem to get the example listed on the wiki to work:
https://cwiki.apache.org/confluence/display/solr/Streaming+Expressions#StreamingExpressions-innerJoin
innerJoin(
  search(people, q=*:*, fl="personId,name", sort="personId asc"),
  search(pets, q=type:cat, fl="ownerId,petName", sort="ownerId asc"),
  on="personId=ownerId"
)
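
For reference, an expression like this is sent to a collection's /stream
handler, e.g. (host and collection are illustrative):

curl --data-urlencode 'expr=innerJoin(search(people, q=*:*, fl="personId,name", sort="personId asc"), search(pets, q=type:cat, fl="ownerId,petName", sort="ownerId asc"), on="personId=ownerId")' http://localhost:8983/solr/people/stream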
Shawn, thank you very much for that explanation. It helps a lot.
Cheers, Ryan
On Wed, May 20, 2015 at 5:07 PM, Shawn Heisey wrote:
> On 5/20/2015 5:57 PM, Ryan Cutter wrote:
> > GC is operating the way I think it should but I am lacking memory. I am
> > just surprised beca
? And if there isn't much slack memory lying
around to begin with, there's a bunch of contention/swap?
Thanks Shawn!
On Wed, May 20, 2015 at 4:50 PM, Shawn Heisey wrote:
> On 5/20/2015 5:41 PM, Ryan Cutter wrote:
> > I have a collection with 1 billion documents and I want to del
I have a collection with 1 billion documents and I want to delete 500 of
them. The collection has a dozen shards and a couple replicas. Using Solr
4.4.
Sent the delete query via HTTP:
http://hostname:8983/solr/my_collection/update?stream.body=<delete><query>source:foo</query></delete>
Took a couple minutes and several repli
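
For reference, the same delete can also be sent as a POST (an equivalent
form; the explicit commit parameter is an illustrative addition):

curl 'http://hostname:8983/solr/my_collection/update?commit=true' -H 'Content-Type: text/xml' --data-binary '<delete><query>source:foo</query></delete>'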
Sorry, I believe this can be disregarded. There were changes made to
system time that likely caused this state. Apologies, Ryan
On Mon, Sep 29, 2014 at 8:24 AM, Ryan Cutter wrote:
> Solr 4.7.2 went down during a period of little activity. Wondering if
> anyone has an idea about what
Solr 4.7.2 went down during a period of little activity. Wondering if
anyone has an idea about what's going on, thanks!
INFO - 2014-09-26 15:35:00.152;
org.apache.solr.cloud.DistributedQueue$LatchChildWatcher; LatchChildWatcher
fired on path: null state: Disconnected type None
then eventually:
>
> On Wed, Jul 30, 2014 at 7:29 PM, Jost Baron wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> Hi Ryan,
>>
>> On 07/31/2014 01:26 AM, Ryan Cutter wrote:
>> > Is there a way to index time or date ranges? That
Is there a way to index time or date ranges? That is, assume 2 docs:
#1: date = 2014-01-01
#2: date = 2014-02-01 through 2014-05-01
Would there be a way to index #2's date as a single field and have all the
search options you usually get with time/date?
One strategy could be to index the start
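
A sketch of that start/end strategy (field names are illustrative; a doc
with a single date would set both fields to the same value):

<field name="date_start" type="date" indexed="true" stored="true"/>
<field name="date_end" type="date" indexed="true" stored="true"/>

A doc whose range overlaps the window [A TO B] matches:
  date_start:[* TO B] AND date_end:[A TO *]

e.g. q=date_start:[* TO 2014-04-01T00:00:00Z] AND date_end:[2014-03-01T00:00:00Z TO *]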
Thanks, everything worked fine after these pointers and I was able to
generate a patch properly.
Cheers, Ryan
On Mon, Jan 6, 2014 at 7:31 AM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:
> On Mon, Jan 6, 2014 at 8:54 PM, Ryan Cutter wrote:
> > 1. Should we be using Ja
1. Should we be using Java 6 or 7? The docs say 1.6 (
http://wiki.apache.org/solr/HowToContribute) but running 'ant test' on
trunk/ yields:
/lucene/common-build.xml:328: Minimum supported Java version is 1.7.
I don't get that error with branch_4x/ which leads to my next question.
2. Should
Marcello,
Can you quantify what you're seeing? Did you pass the JVM any args (-Xmx,
-Xms, etc)?
Thanks, Ryan
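
For reference, heap args on a Solr 4.x Jetty start typically look like this
(the sizes are illustrative):

java -Xms2g -Xmx2g -jar start.jar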
On Mon, Dec 16, 2013 at 1:01 AM, Marcello Lorenzi wrote:
> Hi All,
> we have deployed on our production environment a new Solr 4.3 instance (2
> nodes with SolrCloud) but this morning on
Shawn's right that if you're going to scale this big you'd be very well
served to spend time getting the index as small as possible. In my
experience, if your searches require real-time random access reads (that is,
the entire index needs to be fast), you don't want to wait for HDD
reads.
Get
Michal,
I don't have much experience with DIH so I'll leave that to someone else
but I would suggest you profile Solr during imports. That might show you
where the bottleneck is.
Generally, it's reasonable to think Solr updates will get slower the larger
the indexes get and the more load you put
For people who run into this situation in the future: I had the exact same
problem Sebastien had while using 4.4.0 (1 of my 6 nodes died). We rebuilt
a host to take its place but gave it the same hostname instead of making a
new one. It was configured the same way with the same config files but
w
I think Upayavira's suggestion of writing a filter factory fits what you're
asking for. However, the other end of cleverness is to simply use
solr.TrieIntField and store everything in MB. So for 1TB you'd
write 1048576. A range query for 256MB to 1GB would be field:[256 TO 1024].
Conversion from
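
A sketch of that approach (field and type names are illustrative):

<fieldType name="tint" class="solr.TrieIntField" precisionStep="8" positionIncrementGap="0"/>
<field name="size_mb" type="tint" indexed="true" stored="true"/>

1TB stored in MB:  size_mb = 1048576
256MB to 1GB:      q=size_mb:[256 TO 1024]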