Re: Real Time Search and External File Fields

2016-10-10 Thread Mike Lissner
Thanks for the replies. I made the changes so that the external file field is loaded per:

Re: Real Time Search and External File Fields

2016-10-09 Thread Shawn Heisey
On 10/8/2016 1:18 PM, Mike Lissner wrote:
> I want to make sure I understand this properly and document this for
> future people that may find this thread. Here's what I interpret your
> advice to be:
> 0. Slacken my auto soft commit interval to something more like a minute.

Yes, I would do this.
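
As a rough illustration of the advice above, the soft commit interval could be relaxed to about a minute through Solr's Config API instead of hand-editing solrconfig.xml. The sketch below only builds the request and does not send it. The host, port, and collection name (`mycollection`) are placeholders, and it assumes a Solr version that exposes the Config API and its `updateHandler.autoSoftCommit.maxTime` property.

```python
import json
from urllib.request import Request


def soft_commit_request(base_url, collection, max_time_ms):
    """Build (but do not send) a Config API request that sets autoSoftCommit maxTime."""
    payload = {
        "set-property": {"updateHandler.autoSoftCommit.maxTime": max_time_ms}
    }
    return Request(
        url=f"{base_url}/solr/{collection}/config",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder host and collection; 60_000 ms is "something more like a minute".
req = soft_commit_request("http://localhost:8983", "mycollection", 60_000)
print(req.full_url)
print(req.data.decode("utf-8"))
```

Passing `req` to `urllib.request.urlopen` would apply the change to a running node; whether one minute is the right interval depends on how fresh results need to be.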

Re: Real Time Search and External File Fields

2016-10-08 Thread Erick Erickson
I chose 16 as a place to start. You usually reach diminishing returns pretty quickly; I feel it's a mistake to set your autowarm counts to, say, 256 (and I've seen this in the thousands) unless you have some proof that it's useful to bump higher. But certainly if you set them to 16 and see spikes j

Re: Real Time Search and External File Fields

2016-10-08 Thread Walter Underwood
With time-oriented data, you can use an old trick (goes back to Infoseek in 1995). Make a “today” collection that is very fresh. Nightly, migrate new documents to the “not today” collection. The today collection will be small and can be updated quickly. The archive collection will be large and
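
A minimal sketch of the "today" / "not today" split, assuming plain in-memory dictionaries stand in for the two collections. The names `pick_collection`, `nightly_migrate`, and the `date` field are hypothetical and are not part of Solr; a real setup would index into and delete from actual collections.

```python
from datetime import date


def pick_collection(doc_date, today):
    """Route a document to the small fresh collection or to the large archive."""
    return "today" if doc_date >= today else "archive"


def nightly_migrate(today_docs, archive_docs, today):
    """Move every document dated before `today` from the fresh collection to the archive."""
    stale_ids = [doc_id for doc_id, doc in today_docs.items() if doc["date"] < today]
    for doc_id in stale_ids:
        archive_docs[doc_id] = today_docs.pop(doc_id)
    return len(stale_ids)


# Two in-memory "collections": one fresh, one archive.
today_docs = {
    "a": {"date": date(2016, 10, 7)},
    "b": {"date": date(2016, 10, 8)},
}
archive_docs = {}

moved = nightly_migrate(today_docs, archive_docs, today=date(2016, 10, 8))
print(moved, sorted(today_docs), sorted(archive_docs))
```

Queries would then go to both collections and the results be merged, so only the small collection needs frequent commits.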

Re: Real Time Search and External File Fields

2016-10-08 Thread Mike Lissner
On Fri, Oct 7, 2016 at 8:18 PM Erick Erickson wrote:
> What you haven't mentioned is how often you add new docs. Is it once a
> day? Steadily from 8:00 to 17:00?

Alas, it's a steady trickle during business hours. We're ingesting court documents as they're posted on court websites, then sendi

Re: Real Time Search and External File Fields

2016-10-08 Thread Mike Lissner
On Sat, Oct 8, 2016 at 8:46 AM Shawn Heisey wrote:
> > Most soft commit documentation talks about setting up soft commits with of
> > about a second.
>
> IMHO any documentation that recommends autoSoftCommit with a maxTime of
> one second is bad documentation, and needs to be fixed.

Where h

Re: Real Time Search and External File Fields

2016-10-08 Thread Shawn Heisey
On 10/7/2016 6:19 PM, Mike Lissner wrote:
> Soft commits seem to be exactly the thing for this, but whenever I open a
> new searcher (which soft commits seem to do), the external file is
> reloaded, and all queries are halted until it finishes loading. When I just
> measured, this took about 30 sec

Re: Real Time Search and External File Fields

2016-10-07 Thread Erick Erickson
bq: Most soft commit documentation talks about setting up soft commits with of about a second.

I think this is really a consequence of this being included in the example configs for illustrative purposes; personally, I never liked this. There is no one right answer. I've seen soft commit interval

Real Time Search and External File Fields

2016-10-07 Thread Mike Lissner
I have an index of about 4M documents with an external file field configured to do boosting based on pagerank scores of each document. The pagerank file is about 93MB as of today -- it's pretty big. Each day, I add about 1,000 new documents to the index, and I need them to be available as soon as

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Jak Akdemir
This is great! I guess there is nothing left to worry about for a while. Erick & Yonik, thank you again for your great responses.

Bests,
Jak

On Thu, Nov 17, 2011 at 4:01 PM, Yonik Seeley wrote:
> On Thu, Nov 17, 2011 at 3:56 PM, Jak Akdemir wrote:
> > Is it ok to see soft committed records af

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Yonik Seeley
On Thu, Nov 17, 2011 at 3:56 PM, Jak Akdemir wrote:
> Is it ok to see soft committed records after server restart, too?

Yes... we currently have Jetty configured to call some cleanups on exit (such as closing the index writer).

-Yonik
http://www.lucidimagination.com

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Jak Akdemir
Yonik, Is it ok to see soft committed records after server restart, too? If it is, there is no problem left at all. I added changing files and 1 sec of log at the end of the e-mail. One significant line says softCommit=true, so Solr recognizes our softCommit request. INFO: start commit(optimize=fa

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Yonik Seeley
On Thu, Nov 17, 2011 at 1:34 PM, Erick Erickson wrote:
> Hmmm. It is suspicious that your index files change every
> second.

Why is this suspicious? A soft commit still writes out some files currently... it just doesn't fsync them.

-Yonik
http://www.lucidimagination.com
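
To make the distinction in the reply above concrete, here is a small general-purpose sketch of "write a file" versus "write a file and fsync it". This is an analogy at the operating-system level, not Lucene's or Solr's actual commit code; the function name `write_segment` and the file names are made up for illustration.

```python
import os
import tempfile


def write_segment(path, data, durable):
    """Write bytes to a file; only force them to stable storage when `durable` is set."""
    with open(path, "wb") as f:
        f.write(data)
        f.flush()  # hand the bytes to the operating system
        if durable:
            os.fsync(f.fileno())  # ask the OS to push them to disk


with tempfile.TemporaryDirectory() as tmp:
    soft_path = os.path.join(tmp, "soft.bin")
    hard_path = os.path.join(tmp, "hard.bin")
    write_segment(soft_path, b"new docs", durable=False)  # visible, not guaranteed durable
    write_segment(hard_path, b"new docs", durable=True)   # visible and flushed to disk
    with open(soft_path, "rb") as f:
        soft_content = f.read()
    with open(hard_path, "rb") as f:
        hard_content = f.read()

print(soft_content == hard_content)
```

Both files are readable immediately, which is why index files can change on every soft commit; the difference only matters if the machine crashes before the operating system writes the unsynced data out.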

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Jak Akdemir
1- There is an improvement on the issue. I added a 10-second interval to the delta query in data-config.xml, which covers records that were already indexed. "revision_time > DATE_SUB('${dataimporter.last_index_time}', INTERVAL 10 SECOND);" In this case 1369 new records inserted with 7 records per sec
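
The overlap trick in the delta query above can be sketched outside SQL as follows. This is only an illustration under assumptions: rows are plain dictionaries with a `revision_time` field, and `delta_cutoff` and `select_delta` are hypothetical helper names, not DIH functions.

```python
from datetime import datetime, timedelta


def delta_cutoff(last_index_time, overlap_seconds=10):
    """Mirror DATE_SUB(last_index_time, INTERVAL 10 SECOND): step the cutoff back a little."""
    return last_index_time - timedelta(seconds=overlap_seconds)


def select_delta(rows, last_index_time, overlap_seconds=10):
    """Return rows revised after the cutoff, including a small already-indexed overlap."""
    cutoff = delta_cutoff(last_index_time, overlap_seconds)
    return [row for row in rows if row["revision_time"] > cutoff]


last_index_time = datetime(2011, 11, 17, 12, 0, 0)
rows = [
    {"id": 1, "revision_time": datetime(2011, 11, 17, 11, 59, 40)},  # older than the overlap
    {"id": 2, "revision_time": datetime(2011, 11, 17, 11, 59, 55)},  # inside the overlap
    {"id": 3, "revision_time": datetime(2011, 11, 17, 12, 0, 5)},    # genuinely new
]
selected_ids = [row["id"] for row in select_delta(rows, last_index_time)]
print(selected_ids)
```

Rows inside the overlap are picked up a second time. That is normally harmless when each document carries a unique key, since re-adding a document with the same key replaces the earlier copy.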

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Erick Erickson
Hmmm. It is suspicious that your index files change every second. If you change your cron task to update every 10 seconds, do the index files change every 10 seconds? Regarding your question about "After a server restart last query results reserved. (In NRT they would disappear, right?)" not necess

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Jak Akdemir
Yonik, I updated my solrconfig time based only as follows. 30 1000 And changed my soft commit script to the first case. while [ 1 ]; do echo "Soft commit applied!" wget -O /dev/null ' http://localhost:8080/solr-jak/dataimport?command=delta-import&commit=fa

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Yonik Seeley
On Thu, Nov 17, 2011 at 11:48 AM, Jak Akdemir wrote:
> 2) I am sure about delta-queries configured well. Full-Import is completed
> in 40 secs for 40 docs. And delta's are in 1 sec for 15 new records.
> Also I checked it. There is no problem in it.

That's 10,000 docs/sec. If you configure a

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Jak Akdemir
re-indexing *everything* every time? There's an interactive > debugging console you can use that may help, try: > http://localhost:8983/solr/admin/dataimport.jsp > > Best > Erick > > On Thu, Nov 17, 2011 at 3:19 AM, Jak Akdemir wrote: > > Hi, > > > > I wa

Re: Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Erick Erickson
every time? There's an interactive debugging console you can use that may help, try: http://localhost:8983/solr/admin/dataimport.jsp Best Erick On Thu, Nov 17, 2011 at 3:19 AM, Jak Akdemir wrote: > Hi, > > I was trying to configure a Solr instance with the near real-time search >

Solr Near Real-Time Search, Soft Commit Problem

2011-11-17 Thread Jak Akdemir
Hi, I was trying to configure a Solr instance with near real-time search and auto-complete capabilities. I got stuck on the NRT feature. There are 15 new records per second inserted into the database (MySQL), and I indexed them with DIH. First, I tried to manage autoCommits from

Re: Possibilities of (near) real time search with solr

2010-11-18 Thread Peter Sturge
> no, I only thought you use one day :-)
> so you don't or do you have 31 shards?

No, we use 1 shard per month - e.g. 7 shards will hold 7 months of data. It can be set to 1 day, but you would need to have a huge amount of data in a single day to warrant doing that.

On Thu, Nov 18, 2010 at 8
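
A small sketch of the one-shard-per-month layout described above, assuming each document has a date and each shard is named after its month. The `shard_YYYY_MM` naming scheme and both function names are invented for illustration.

```python
from datetime import date


def shard_for(d):
    """Name of the monthly shard that a document dated `d` belongs to."""
    return f"shard_{d.year:04d}_{d.month:02d}"


def shards_for_range(start, end):
    """List the monthly shards a query must touch to cover start..end inclusive."""
    names = []
    year, month = start.year, start.month
    while (year, month) <= (end.year, end.month):
        names.append(f"shard_{year:04d}_{month:02d}")
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return names


print(shard_for(date(2010, 11, 18)))
print(shards_for_range(date(2010, 5, 1), date(2010, 11, 18)))
```

Covering May through November 2010 touches seven shards, matching the "7 shards will hold 7 months" example; switching to daily shards would only change the naming function and the loop step.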

Re: Possibilities of (near) real time search with solr

2010-11-18 Thread Peter Karich
Does yours need to be once a day? no, I only thought you use one day :-) so you don't or do you have 31 shards? having a look at Solr Cloud or Katta - could be useful here in dynamically allocating shards. ah, thx! I will take a look at it (after trying solr4)! Regards, Peter. May

Re: Possibilities of (near) real time search with solr

2010-11-18 Thread Peter Sturge
> Maybe I didn't fully understand what you explained: but doesn't this mean
> that you'll have one index per day?
> Or are you overwriting, via replicating, every shard and the number of shards
> is fixed?
> And why are you replicating from the local replica to the next shard? (why
> not directly fr

Re: Possibilities of (near) real time search with solr

2010-11-18 Thread Peter Karich
Hi Peter!

> * I believe the NRT patches are included in the 4.x trunk. I don't think there's any support as yet in 3x (uses features in Lucene 3.0).

I'll investigate how much effort it is to update to solr4

> * For merging, I'm talking about commits/writes. If you merge while commits are going on

Re: Possibilities of (near) real time search with solr

2010-11-17 Thread Peter Sturge
* I believe the NRT patches are included in the 4.x trunk. I don't think there's any support as yet in 3x (uses features in Lucene 3.0).
* For merging, I'm talking about commits/writes. If you merge while commits are going on, things can get a bit messy (maybe on source cores this is ok, but I hav

Re: Possibilities of (near) real time search with solr

2010-11-16 Thread Peter Karich
Hi Peter, thanks for your response. I will dig into the sharding stuff asap :-)

> This may have changed recently, but the NRT stuff - e.g. per-segment commits etc. is for the latest Solr 4 trunk only.

Do I need to turn something 'on'? Or do you know whether the NRT patches are documented some

Re: Possibilities of (near) real time search with solr

2010-11-16 Thread Peter Sturge
Hi Peter, First off, many thanks for putting together the NRT Wiki page! This may have changed recently, but the NRT stuff - e.g. per-segment commits etc. is for the latest Solr 4 trunk only. If your setup uses the 3x Solr code branch, then there's a bit of work to do to move to the new version.

Possibilities of (near) real time search with solr

2010-11-15 Thread Peter Karich
Hi, I want to make my indexed docs (tweets) searchable relatively fast: 1 to 10 sec, or even 30 sec, would be ok. At the moment I am using the read-only core scenario described here (point 5)* with a commit frequency of 180 seconds, which was fine until a few days ago. (I am using Solr 1.4.1.) Now the time

Re: Near real-time search of user data

2009-02-19 Thread Mark Ferguson
running into memory issues? > > > > Otis > > -- > > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch > > > > > > > > - Original Message > >> From: Mark Ferguson > >> To: solr-user@lucene.apache.org > >> Sent:

Re: Near real-time search of user data

2009-02-19 Thread Noble Paul നോബിള്‍ नोब्ळ्
t.com/ -- Lucene - Solr - Nutch > > > > - Original Message >> From: Mark Ferguson >> To: solr-user@lucene.apache.org >> Sent: Friday, February 20, 2009 1:14:15 AM >> Subject: Near real-time search of user data >> >> Hi, >> >> I a

Re: Near real-time search of user data

2009-02-19 Thread Otis Gospodnetic
http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Mark Ferguson > To: solr-user@lucene.apache.org > Sent: Friday, February 20, 2009 1:14:15 AM > Subject: Near real-time search of user data > > Hi, > > I am trying to come up with a strateg

Near real-time search of user data

2009-02-19 Thread Mark Ferguson
Hi, I am trying to come up with a strategy for a solr setup in which a user's indexed data can be nearly immediately available to them for search. My current strategy (which is starting to cause problems) is as follows: - each user has their own personal index (core), which gets committed after

Re: real time search

2007-09-25 Thread Ian Holsman
information. (we were planning on using a paxos algorithm to help with this) Hopefully Jason if you are reading this will add further to it. regards Ian Grant Ingersoll wrote: Hi James, Can you provide more information about what you are trying to do? By real time search, do you mean you

Re: real time search

2007-09-24 Thread James liu
have more memory and CPU, just open more instances, not one by one.) Finally, we merge the results and show them to the user. That's all I have in mind; I haven't tested it.

2007/9/24, Grant Ingersoll <[EMAIL PROTECTED]>:
> Hi James,
>
> Can you provide more information about what you are trying to do?

Re: real time search

2007-09-24 Thread James liu
Ingersoll wrote: > > > Hi James, > > > > Can you provide more information about what you are trying to do? > > By real time search, do you mean you want indexed documents to be > > available immediately? Or is a minute or two acceptable? Do all > > users need

Re: real time search

2007-09-24 Thread Matthew Runo
] | 702-943-7833 ++ On Sep 24, 2007, at 8:13 AM, Grant Ingersoll wrote: Hi James, Can you provide more information about what you are trying to do? By real time search, do you mean you want indexed documents to be available immediately

Re: real time search

2007-09-24 Thread Grant Ingersoll
Hi James, Can you provide more information about what you are trying to do? By real time search, do you mean you want indexed documents to be available immediately? Or is a minute or two acceptable? Do all users need to see them immediately, or just the current user? We can better

real time search

2007-09-23 Thread James liu
I want to do this. Maybe someone has already done it; if so, please give me some tips. Thanks.

-- regards jl