Thanks for the replies. I made the changes so that the external file field
is loaded per:
On 10/8/2016 1:18 PM, Mike Lissner wrote:
> I want to make sure I understand this properly and document this for
> future people that may find this thread. Here's what I interpret your
> advice to be:
> 0. Slacken my auto soft commit interval to something more like a minute.
Yes, I would do this.
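For concreteness, a one-minute soft commit interval in solrconfig.xml looks like this; the hard-commit block shown alongside is illustrative, not taken from this thread:

```xml
<!-- solrconfig.xml: open a new searcher at most once a minute -->
<autoSoftCommit>
  <maxTime>60000</maxTime>
</autoSoftCommit>

<!-- hard commits can stay frequent; with openSearcher=false they
     flush to disk without opening a new searcher -->
<autoCommit>
  <maxTime>15000</maxTime>
  <openSearcher>false</openSearcher>
</autoCommit>
```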
I chose 16 as a place to start. You usually reach diminishing returns
pretty quickly; I feel it's a mistake to set your autowarm counts to, say,
256 (and I've seen this in the thousands) unless you have some proof
that it's useful to bump higher.
But certainly if you set them to 16 and see spikes j
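The autowarm counts being discussed are set on the cache definitions in solrconfig.xml; a sketch using the suggested starting point of 16 (the cache classes and sizes here are illustrative):

```xml
<!-- solrconfig.xml: autowarmCount controls how many entries are
     replayed into each cache when a new searcher opens -->
<filterCache class="solr.FastLRUCache"
             size="512"
             initialSize="512"
             autowarmCount="16"/>
<queryResultCache class="solr.LRUCache"
                  size="512"
                  initialSize="512"
                  autowarmCount="16"/>
```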
With time-oriented data, you can use an old trick (goes back to Infoseek in
1995).
Make a “today” collection that is very fresh. Nightly, migrate new documents to
the “not today” collection. The today collection will be small and can be
updated
quickly. The archive collection will be large and
On Fri, Oct 7, 2016 at 8:18 PM Erick Erickson wrote:
> What you haven't mentioned is how often you add new docs. Is it once a
> day? Steadily
> from 8:00 to 17:00?
>
Alas, it's a steady trickle during business hours. We're ingesting court
documents as they're posted on court websites, then sendi
On Sat, Oct 8, 2016 at 8:46 AM Shawn Heisey wrote:
> > Most soft commit documentation talks about setting up soft commits
> > with a maxTime of about a second.
>
> IMHO any documentation that recommends autoSoftCommit with a maxTime of
> one second is bad documentation, and needs to be fixed. Where h
On 10/7/2016 6:19 PM, Mike Lissner wrote:
> Soft commits seem to be exactly the thing for this, but whenever I open a
> new searcher (which soft commits seem to do), the external file is
> reloaded, and all queries are halted until it finishes loading. When I just
> measured, this took about 30 sec
bq: Most soft commit
documentation talks about setting up soft commits with a maxTime of about a
second.
I think this is really a consequence of this being included in the
example configs for illustrative purposes; personally, I never liked this.
There is no one right answer. I've seen soft commit interval
I have an index of about 4M documents with an external file field
configured to do boosting based on pagerank scores of each document. The
pagerank file is about 93MB as of today -- it's pretty big.
Each day, I add about 1,000 new documents to the index, and I need them to
be available as soon as
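For readers unfamiliar with the feature, an external file field of this sort is declared in schema.xml roughly as below (the field names and defVal are illustrative); the scores live in a file of docid=value lines under the index data directory:

```xml
<!-- schema.xml: keyField maps each line of the external file to a document -->
<fieldType name="externalPagerank" class="solr.ExternalFileField"
           keyField="id" defVal="0" valType="float"/>
<field name="pagerank" type="externalPagerank"
       indexed="false" stored="false"/>
```

Boosting would then typically use a function query such as boost=field(pagerank).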
This is great! I guess there is nothing left to worry about for a while.
Erick & Yonik, thank you again for your great responses.
Bests,
Jak
On Thu, Nov 17, 2011 at 4:01 PM, Yonik Seeley wrote:
> On Thu, Nov 17, 2011 at 3:56 PM, Jak Akdemir wrote:
> > Is it ok to see soft committed records af
On Thu, Nov 17, 2011 at 3:56 PM, Jak Akdemir wrote:
> Is it ok to see soft committed records after server restart, too?
Yes... we currently have Jetty configured to call some cleanups on
exit (such as closing the index writer).
-Yonik
http://www.lucidimagination.com
Yonik,
Is it ok to see soft committed records after server restart, too? If it is,
there is no problem left at all.
I added the list of changing files and 1 second of log at the end of the e-mail. One
significant line says softCommit=true, so Solr recognizes our softCommit
request.
INFO: start
commit(optimize=fa
On Thu, Nov 17, 2011 at 1:34 PM, Erick Erickson wrote:
> Hmmm. It is suspicious that your index files change every
> second.
Why is this suspicious?
A soft commit still writes out some files currently... it just doesn't
fsync them.
-Yonik
http://www.lucidimagination.com
1- There is an improvement on the issue. I added a 10-second interval
to the delta query in data-config.xml, which will also cover records that
were already indexed.
"revision_time > DATE_SUB('${dataimporter.last_index_time}', INTERVAL 10
SECOND);"
In this case, 1369 new records were inserted at 7 records per sec
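For reference, that widened delta sits in data-config.xml along these lines (the driver, table, column, and connection details are illustrative):

```xml
<dataConfig>
  <dataSource driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost/mydb"
              user="solr" password="secret"/>
  <document>
    <!-- deltaQuery finds changed rows; note the 10-second overlap
         subtracted from last_index_time to avoid missing records -->
    <entity name="revision"
            query="SELECT id, title, body FROM revision"
            deltaQuery="SELECT id FROM revision
                        WHERE revision_time &gt;
                              DATE_SUB('${dataimporter.last_index_time}', INTERVAL 10 SECOND)"
            deltaImportQuery="SELECT id, title, body FROM revision
                              WHERE id = '${dataimporter.delta.id}'"/>
  </document>
</dataConfig>
```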
Hmmm. It is suspicious that your index files change every
second. If you change your cron task to update every 10
seconds, do the index files change every 10 seconds?
Regarding your question about
"After a server restart last query results reserved. (In NRT they would
disappear, right?)"
not necess
Yonik,
I updated my solrconfig to be time-based only, as follows.
30
1000
And changed my soft commit script to the first case.
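The two bare numbers above look like commit settings whose XML tags were stripped somewhere in transit; a time-based-only commit section has this shape (which value belongs to which element, and the units, are guesses):

```xml
<autoCommit>
  <maxTime>30000</maxTime>   <!-- hard commit on time only; maxDocs removed -->
</autoCommit>
<autoSoftCommit>
  <maxTime>1000</maxTime>    <!-- soft commit roughly every second -->
</autoSoftCommit>
```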
while [ 1 ]; do
echo "Soft commit applied!"
wget -O /dev/null '
http://localhost:8080/solr-jak/dataimport?command=delta-import&commit=false
On Thu, Nov 17, 2011 at 11:48 AM, Jak Akdemir wrote:
> 2) I am sure the delta-queries are configured well. Full-import completes
> in 40 secs for 400,000 docs. And deltas take 1 sec for 15 new records.
> Also I checked it. There is no problem with it.
That's 10,000 docs/sec. If you configure a
re-indexing *everything* every time? There's an interactive
> debugging console you can use that may help, try:
> http://localhost:8983/solr/admin/dataimport.jsp
>
> Best
> Erick
>
> On Thu, Nov 17, 2011 at 3:19 AM, Jak Akdemir wrote:
> > Hi,
> >
> > I wa
every time? There's an interactive
debugging console you can use that may help, try:
http://localhost:8983/solr/admin/dataimport.jsp
Best
Erick
On Thu, Nov 17, 2011 at 3:19 AM, Jak Akdemir wrote:
> Hi,
>
> I was trying to configure a Solr instance with the near real-time search
>
Hi,
I was trying to configure a Solr instance with the near real-time search
and auto-complete capabilities. I got stuck on the NRT feature. There are
15 new records per second inserted into the database (MySQL), and I
index them with DIH. First, I tried to manage autoCommits from
> no, I only thought you use one day :-)
> so you don't or do you have 31 shards?
>
No, we use 1 shard per month - e.g. 7 shards will hold 7 months of data.
It can be set to 1 day, but you would need to have a huge amount of
data in a single day to warrant doing that.
On Thu, Nov 18, 2010 at 8
Does yours need to be once a day?
no, I only thought you use one day :-)
so you don't or do you have 31 shards?
having a look at Solr Cloud or Katta - could be useful
here in dynamically allocating shards.
ah, thx! I will take a look at it (after trying solr4)!
Regards,
Peter.
May
> Maybe I didn't fully understand what you explained: but doesn't this mean
> that you'll have one index per day?
> Or are you overwriting, via replicating, every shard and the number of shard
> is fixed?
> And why are you replicating from the local replica to the next shard? (why
> not directly fr
Hi Peter!
* I believe the NRT patches are included in the 4.x trunk. I don't
think there's any support as yet in 3x (uses features in Lucene 3.0).
I'll investigate how much effort it is to update to solr4
* For merging, I'm talking about commits/writes. If you merge while
commits are going on
* I believe the NRT patches are included in the 4.x trunk. I don't
think there's any support as yet in 3x (uses features in Lucene 3.0).
* For merging, I'm talking about commits/writes. If you merge while
commits are going on, things can get a bit messy (maybe on source
cores this is ok, but I hav
Hi Peter,
thanks for your response. I will dig into the sharding stuff asap :-)
This may have changed recently, but the NRT stuff - e.g. per-segment
commits etc. is for the latest Solr 4 trunk only.
Do I need to turn something 'on'?
Or do you know whether the NRT patches are documented some
Hi Peter,
First off, many thanks for putting together the NRT Wiki page!
This may have changed recently, but the NRT stuff - e.g. per-segment
commits etc. is for the latest Solr 4 trunk only.
If your setup uses the 3x Solr code branch, then there's a bit of work
to do to move to the new version.
Hi,
I wanted to provide my indexed docs (tweets) relatively fast: so 1 to 10
sec or even 30 sec would be ok.
At the moment I am using the read only core scenario described here
(point 5)*
with a commit frequency of 180 seconds, which was fine until a few days ago.
(I am using solr1.4.1)
Now the time
running into memory issues?
> >
> > Otis
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> >
> >
> > - Original Message
> >> From: Mark Ferguson
> >> To: solr-user@lucene.apache.org
> >> Sent:
http://sematext.com/ -- Lucene - Solr - Nutch
- Original Message
> From: Mark Ferguson
> To: solr-user@lucene.apache.org
> Sent: Friday, February 20, 2009 1:14:15 AM
> Subject: Near real-time search of user data
>
> Hi,
>
> I am trying to come up with a strateg
Hi,
I am trying to come up with a strategy for a solr setup in which a user's
indexed data can be nearly immediately available to them for search. My
current strategy (which is starting to cause problems) is as follows:
- each user has their own personal index (core), which gets committed
after
information. (We were planning on using a Paxos algorithm to help with this.)
Hopefully Jason, if you are reading this, will add further to it.
regards
Ian
Grant Ingersoll wrote:
Hi James,
Can you provide more information about what you are trying to do? By
real time search, do you mean you
have more memory and CPU, just open more instances, not one by one.)
Finally, we merge the results and show them to the user.
That is all, I think; I have not tested it.
2007/9/24, Grant Ingersoll <[EMAIL PROTECTED]>:
>
> Hi James,
>
> Can you provide more information about what you are trying to do?
On Sep 24, 2007, at 8:13 AM, Grant Ingersoll wrote:
Hi James,
Can you provide more information about what you are trying to do?
By real time search, do you mean you want indexed documents to be
available immediately
Hi James,
Can you provide more information about what you are trying to do? By
real time search, do you mean you want indexed documents to be
available immediately? Or is a minute or two acceptable? Do all
users need to see them immediately, or just the current user?
We can better
I want to do it.
Maybe someone has done it; if so, give me some tips.
Thanks
--
regards
jl