This turned out to be a missing SolrDeletionPolicy in the configuration.
Once the slaves had a SolrDeletionPolicy, they stopped growing out of
control.
Ian.
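For context, the deletion policy is configured in solrconfig.xml; a typical stanza looks like the following (values are illustrative, not taken from the thread):

```xml
<deletionPolicy class="solr.SolrDeletionPolicy">
  <!-- keep only the most recent commit point so old segments get cleaned up -->
  <str name="maxCommitsToKeep">1</str>
  <str name="maxOptimizedCommitsToKeep">0</str>
</deletionPolicy>
```

Without a policy limiting commit points, old index.* directories can accumulate on replicas, which matches the runaway growth described above.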
On Wed, Aug 17, 2011 at 8:46 AM, Ian Connor wrote:
> Hi,
>
> We have noticed that many index.* directories are appearing on sla
14, 2011, at 11:34 , Ian Connor wrote:
>
> > It is nothing special - just like this:
> >
> > conn = Solr::Connection.new("http://#{LOCAL_SHARD}",
> >   {:timeout => 1000, :autocommit => :on})
> > options[:shards] = HA_SHARDS
> >res
this to see if we can find out how to
reproduce it or at least the conditions that tend to reproduce it.
--
Regards,
Ian Connor
1 Leighton St #723
Cambridge, MA 02141
Call Center Phone: +1 (714) 239 3875 (24 hrs)
Fax: +1(770) 818 5697
Skype: ian.connor
uest. Are you keeping an instance around?
>
> Erik
>
>
> On Aug 8, 2011, at 12:03 , Ian Connor wrote:
>
> > Hi,
> >
> > I have seen some of these errors come through from time to time. It looks
> > like:
> >
> > /usr/lib/ruby/1.8/net/http.rb:
. Would it be good to create a new
one inside of the connection or is something more serious going on?
ubuntu 10.04
passenger 3.0.8
rails 2.3.11
--
Regards,
Ian Connor
dinator core (the overhead of searching one
> distributed shard vs doing the same query directly is usually very
> measurable, even if the shard is the same Solr instance as your
> coordinator)
>
>
>
> -Hoss
>
>
--
Regards,
Ian Connor
uld contain the id.
>
> I'll be honest though: I'm guessing that if your example query doesn't
> work, my suggestion won't either -- because if you get that error just
> trying to access the "id" field, the same thing will probably happen when
> the de
I have found that this search crashes:
/solr/select?q=*%3A*&fq=&start=0&rows=1&fl=id
SEVERE: java.lang.IndexOutOfBoundsException: Index: 114, Size: 90
at java.util.ArrayList.RangeCheck(ArrayList.java:547)
at java.util.ArrayList.get(ArrayList.java:322)
at org.apache.lucene.index.FieldI
als...
>
> On Thu, Feb 11, 2010 at 13:22, Ian Connor wrote:
> > This seems to allow you to log each query - which is a good start.
> >
> > I was thinking of something that would add all the ms together and report
> it
> > in the "completed at" line so you ca
13 PM, Mat Brown wrote:
> On Thu, Feb 11, 2010 at 13:07, Ian Connor wrote:
> > The idea is that in the log is currently like:
> >
> > Completed in 1290ms (View: 152, DB: 75) | 200 OK [
> > http://localhost:3000/search?q=nik+gene+cluster&view=2]
> >
> > I wan
334) | 200 OK [
http://localhost:3000/search?q=nik+gene+cluster&view=2]
Has anyone done such a plug-in or extension already?
--
Regards,
Ian Connor
0 at 11:49 AM, Tim Underwood wrote:
> Have you played around with the "option httpclose" or the "option
> forceclose" configuration options in HAProxy (both documented here:
> http://haproxy.1wt.eu/download/1.3/doc/configuration.txt)?
>
> -Tim
>
>
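Those two options are set in haproxy.cfg; an illustrative fragment (section placement and values assumed, not from the thread) might look like:

```text
defaults
    mode http
    option httpclose      # add "Connection: close" so both ends tear down the TCP session
    # option forceclose   # alternative: actively close the server-side connection
```

Either option prevents idle keep-alive connections from piling up in CLOSE_WAIT between the proxy and the shards.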
; http://hc.apache.org/httpclient-3.x/
>
> We used 'balance' at another project and did not have any problems.
>
> On Tue, Feb 9, 2010 at 5:54 AM, Ian Connor wrote:
> > I have been using distributed search with haproxy but noticed that I am
> > suffering a little from tcp co
I have been using distributed search with haproxy but noticed that I am
suffering a little from tcp connections building up waiting for the OS level
closing/time out:
netstat -a
...
tcp6 1 0 10.0.16.170%34654:53789 10.0.16.181%363574:8893
CLOSE_WAIT
tcp6 1 0 10.0.16.170%34654
an error is
> returned and the search fails, is there a way to avoid the error and
> return the results from the shards that are still up?
>
> thx much
>
> --joe
>
--
Regards,
Ian Connor
Can anyone think of a reason why these locks would hang around for more than
2 hours?
I have been monitoring them and they look like they are very short lived.
On Tue, Jan 26, 2010 at 10:15 AM, Ian Connor wrote:
> We traced one of the lock files, and it had been around for 3 hours. A
> r
We traced one of the lock files, and it had been around for 3 hours. A
restart removed it - but is 3 hours normal for one of these locks?
Ian.
On Mon, Jan 25, 2010 at 4:14 PM, mike anderson wrote:
> I am getting this exception as well, but disk space is not my problem. What
> else can I do to de
;> If Ruby is not using the HTTP to talk EmbeddedSolrServer, what is it
>> using?
>>
>> Thanks and Regards
>> Rajan Chandi
>>
>> On Thu, Sep 17, 2009 at 9:44 PM, Erik Hatcher > >wrote:
>>
>>
>>> On Sep 17, 2009, at 11:40 AM, Ian Connor w
takes place it uses a binary protocol instead of text. I
wanted to know if that was available or could be available via the ruby
library.
Is it possible to host a local shard and skip HTTP between ruby and solr?
--
Regards,
Ian Connor
at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:378)
at org.mortbay.jetty.bio.SocketConnector$Connection.run(SocketConnector.java:226)
at org.mortbay.thread.BoundedThreadPool$PoolThread.run(Boun"
/usr/lib/ruby/1.8/net/http.rb:2097:in `error!'
On Mon, Aug 17, 2009 at 12:08 PM, Mark Miller wrote:
> Ian Connor wrote:
>
&
e anderson wrote:
> I am e-mailing to inquire about the status of the spellchecking component
> in
> 1.4 (distributed). I saw SOLR-785, but it is unreleased and for 1.5. Any
> help would be much appreciated.
> Thanks in advance,
> Mike
>
--
Regards,
Ian Connor
pubget.com
o be using it in production
> > yet. Any information you can provide would be most welcome.
> >
> >
> We're using Solr 1.4 built from r793546 in production along with the new
> java based replication.
>
> --
> Regards,
> Shalin Shekhar Mangar.
>
--
Regards,
Ian Connor
ust over a second query time).
Regards,
Ian Connor
http://pubget.com
On Thu, Aug 6, 2009 at 1:27 AM, Silent Surfer wrote:
>
> Hi,
>
> We initially went with Hadoop path, but as it is one more software based
> file system on top of the OS file system, we didn't get a buy in from
o to
> 24 GB).
>
> We need to figure out how many servers are required to handle such amount
> of data..
>
> Any help would be greatly appreciated.
>
> Thanks
> SilentSurfer
>
>
>
>
>
--
Regards,
Ian Connor
example here:
https://issues.apache.org/jira/browse/SOLR-572 after reading through
http://wiki.apache.org/solr/SpellCheckComponent
--
Regards,
Ian Connor
s - thanks.
On Mon, Apr 20, 2009 at 1:30 PM, Yonik Seeley wrote:
> On Mon, Apr 20, 2009 at 1:22 PM, Ian Connor wrote:
> > If I have a field that is the default type text (from the sample schema)
> > with the lowercase filter and so forth, is it possible to also do
> se
only work if the case is the same.
Do I need to create two fields? One for text and the other for case
insensitive sentence matching?
--
Regards,
Ian Connor
q=*:*&rows=10
I am not sure how hard this would be to plug into the rails logger and if
you could then also make it add up at the end in the summary line for the
request.
However, it certainly would be nice to know and help focus performance
debugging.
On Thu, Mar 5, 2009 at 4:33 PM, Ian C
he jetty vs tomcat vs resin vs whatever question pretty much comes down
> >> to what you are comfortable running/managing.
> >>
> >> Solr tries its best to stay container agnostic.
> >>
> >>
> >> On Mar 5, 2009, at 1:55 PM, Jonathan Haddad wrote:
> >>
> >>> Is there any compelling reason to use tomcat instead of jetty if all
> >>> we're doing is using solr? We don't use tomcat anywhere else.
> >>> --
> >>> Jonathan Haddad
> >>> http://www.rustyrazorblade.com
> >
> >
>
>
>
> --
>
> -
>
--
Regards,
Ian Connor
at 3:19 PM, Erik Hatcher wrote:
> First, note we have a ruby-...@lucene.apache.org list which focuses
> primarily on the solr-ruby library, flare, and other Ruby specific things.
> But this forum is as good as any, though I'm CC'ing ruby-dev too.
>
> On Mar 5, 2009, at 1
ace that I could set that would be great.
I am also happy to submit a patch on a ticket if that works.
--
Regards,
Ian Connor
d in order to reduce them.
Is there any debug output or other way to confirm and investigate this further?
--
Regards,
Ian Connor
' || c == '-' || c == '!' || c == '(' || c ==
> ')' || c == ':'
> || c == '^' || c == '[' || c == ']' || c == '\"' || c == '{' || c ==
> '}' || c == '~'
> || c =
admin interface, the parser does
not like it and returns an error.
What is the way to escape this? Is there such code for ruby?
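The quoted Java above is Solr's server-side escaping; a rough Ruby equivalent, sketched from the Lucene query parser's documented special-character list (not code from the thread), might be:

```ruby
# Backslash-escape Lucene/Solr query syntax characters so user input can be
# embedded in a query safely. The character list follows the Lucene query
# parser syntax docs; adjust for your Solr version.
SOLR_SPECIALS = /([+\-!(){}\[\]^"~*?:\\]|&&|\|\|)/

def solr_escape(text)
  text.gsub(SOLR_SPECIALS) { |m| "\\" + m }
end
```

For example, `solr_escape("doi:10.1000/182")` turns the colon into `\:` so the parser reads it as a literal rather than a field separator.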
--
Regards,
Ian Connor
nterface.
>
> It's a stretch, and written in Erlang.. but perhaps there is some
> inspiration to be had for 'solr as the data store'.
>
> - Neal Richter
>
--
Regards,
Ian Connor
s incredibly expensive (time or money), you need to keep that in
> mind.
>
> -Todd
>
>
> -Original Message-
> From: Ian Connor [mailto:ian.con...@gmail.com]
> Sent: Wednesday, January 28, 2009 12:38 PM
> To: solr
> Subject: solr as the data store
>
> Hi
massaging and reindexing seems very
appealing.
Has anyone else thought about this or done this and ran into problems that
caused them to go back to a seperate database model? Is there a critical
need you can think is missing?
--
Regards,
Ian Connor
When you query by *:*, what order does it use? Is there a chance results will
come back in a different order as you page through (and miss/duplicate
some)? Is it best to set the order explicitly by 'id', or is that implied
already?
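One way to make paging deterministic (a sketch, not from the thread; it assumes "id" is the schema's uniqueKey) is to always pass an explicit sort, e.g. by building the select URL directly:

```ruby
require 'cgi'

# Deep-paging sketch: an explicit sort on the uniqueKey keeps document order
# stable across pages, so paging neither skips nor duplicates results.
def paged_select_url(host, start, rows)
  "http://#{host}/solr/select?" +
    "q=#{CGI.escape('*:*')}&sort=#{CGI.escape('id asc')}" +
    "&start=#{start}&rows=#{rows}"
end
```

Without a sort, match-all results typically come back in internal docid order, which can shift between commits.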
On Mon, Jan 26, 2009 at 12:00 PM, Ian Con
her then id:[* TO *], just try *:* -- this should match all documents
> without using a range query.
>
>
>
> On Jan 25, 2009, at 3:16 PM, Ian Connor wrote:
>
> Hi,
>>
>> Given the only real way to reindex is to save the document again, what is
>> the fastest way
I don't know of any standard export/import tool -- i think luke has
> > something, but it will be faster if you write your own.
> >
> > Rather than id:[* TO *], just try *:* -- this should match all
> > documents without using a range query.
> >
> >
> > O
quickly export
the index to a text file or making queries 1000 at a time is the best option
and dealing with the time it takes to query once you are deep into the
index?
--
Regards,
Ian Connor
stored version of the words or do I need
to mirror a field that does the indexing but without the filters?
I am hoping there might be something I am missing here and a new field is
not needed.
--
Regards,
Ian Connor
it's more the case that if you have an invalid field value, it
> could blow up at different points in different code paths. The root
> cause is still an invalid value in the field.
>
> -Yonik
>
--
Regards,
Ian Connor
1 Leighton St #605
Cambridge, MA 02141
Direct Line: +
million
queries over it last night (pubmed gets about 3 million per day) and the
response time was within a few seconds. This was also under write load as
well which made me feel very confident in the scalability of solr and
lucene.
--
Regards,
Ian Connor
pubget.com
Cambridge, MA
iconnor [at] mit.edu
experience with pages or the like in solr? Is splitting it
into two fields like this needed or can I do that with one of the standard
filters that I have missed?
--
Regards,
Ian Connor
) {
val = null;
}
It seems only the BinaryResponseWriter is actually that fussy about
null items. Once it comes back to the client for display, it is
handled without error.
On Thu, Aug 21, 2008 at 9:05 AM, Ian Connor <[EMAIL PROTECTED]> wrote:
> More of an u
More of an update and work around. When you query a number field
locally, it can return null. However, when you go through a shard if
you have an empty number it throws an error.
Should I open a bug for this?
On Thu, Aug 21, 2008 at 8:56 AM, Ian Connor <[EMAIL PROTECTED]> wrote:
> I thi
I think I have narrowed it down to:
where integer is defined in the example as:
It returns fine when I query directly, but blows up when going through
the binary conversion that shards uses.
On Thu, Aug 21, 2008 at 8:37 AM, Ian Connor <[EMAIL PROTECTED]> wrote:
> Hi,
>
&
g this? I just updated from
the latest trunk 1.3
--
Regards,
Ian Connor
1 Leighton St #605
Cambridge, MA 02141
Direct Line: +1 (978) 672
Call Center Phone: +1 (714) 239 3875 (24 hrs)
Mobile Phone: +1 (312) 218 3209
Fax: +1(770) 818 5697
Suisse Phone: +41 (0) 22 548 1664
Skype: ian.connor
in my performance was the removal of the lock.
>
> I expect that helps you out.
>
> 2008/8/20 Ian Connor <[EMAIL PROTECTED]>
>
>> I have based my machines on bare bones servers (I call them ghetto
>> servers). I essentially have motherboards in a rack sitting on
>>
problem
since. So, it looks like it was just bad hardware - sorry about the
confusion.
On Mon, Aug 18, 2008 at 8:29 AM, Michael McCandless
<[EMAIL PROTECTED]> wrote:
>
> OK gotchya. Please keep us posted one way or another...
>
> Mike
>
> Ian Connor wrote:
>
>> Hi
>>> like 1TB of data over 1M docs how do you think my machine requirements might
>>> be affected as compared to yours?
>>>
>>
>> You are in a much better position to determine this than we are. See how
>> big an index you can put on a single machine while maint
1.3 only.
>
>> If 1.3, is the nightly build the best one to grab bearing in mind that we
>> would want any protocols around distributed search to be as stable as
>> possible? Or just wait for the 1.3 release?
>
> Go for the nightly build. The release will look very sim
OST requests while searching content in solr?
>
> Thanks in advance,
> Sunil.
>
>
>
--
Regards,
Ian Connor
Could this idea of a "computed field" actually just be a query
filter? Can the filter just add a field on the return like this?
On Tue, Aug 19, 2008 at 9:10 AM, Ian Connor <[EMAIL PROTECTED]> wrote:
> I was thinking more that it would be an extra field you get back. My
>
t
when creating the query.
On Tue, Aug 19, 2008 at 8:59 AM, Brian Whitman <[EMAIL PROTECTED]> wrote:
>
> On Aug 19, 2008, at 8:49 AM, Ian Connor wrote:
>
>> What is the current "special requestHandler" that you can set currently?
>
> If you're referring t
What is the current "special requestHandler" that you can set currently?
On Tue, Aug 19, 2008 at 8:41 AM, Shalin Shekhar Mangar
<[EMAIL PROTECTED]> wrote:
> There's an issue open for this. Look at
> https://issues.apache.org/jira/browse/SOLR-705
>
> On Tue, Au
Hi,
Is there a way to know which shard contains a given result? This would
help when you want to write updates back to the correct place.
The idea is when you read your results, there would be an item to say
where a given result came from.
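One common workaround (my sketch, not something proposed in the thread; the field and shard names are assumptions) is to stamp each document with its home shard at index time, as a stored field:

```ruby
# Stamp each document with the shard it lives on, so a later update can be
# routed back to the right place. "shard_s" is an assumed stored string
# field; add it to your schema.
SHARD_NAME = "shard1.example.com:8983"

def with_shard(doc)
  doc.merge(:shard_s => SHARD_NAME)
end
```

Every result then carries its origin in `shard_s`, at the cost of a little index size per document.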
--
Regards,
Ian Connor
how you have a LB handling shards? Do you put a separate LB
>> in front of each group of replica shards?
>
> A single load balancer should be fine... each shard has its own VIP
> which maps to 2 or more solr servers with a replica of that shard.
>
> -Yonik
>
--
Regards,
Ian Connor
t. Also, as far as I know,
>> there is nothing that gracefully handles problematic Solr instances during
>> distributed search.
>>
>> Right... we punted that issue to a load balancer (which assumes that
>> you have more than one copy of each shard).
>>
>> -Yonik
>
>
--
Regards,
Ian Connor
Brian Whitman
<[EMAIL PROTECTED]> wrote:
> On Aug 18, 2008, at 11:51 AM, Ian Connor wrote:
>>
>> On Mon, Aug 18, 2008 at 9:31 AM, Ian Connor <[EMAIL PROTECTED]> wrote:
>>>
>>> I don't think this patch is working yet. If I take a shard out of
&
rusty. Can anyone explain how the
QueryRequest here uses the code that is found in SolrIndexSearcher?
On Mon, Aug 18, 2008 at 9:31 AM, Ian Connor <[EMAIL PROTECTED]> wrote:
> I don't think this patch is working yet. If I take a shard out of
> rotation (even just one out of four
te docIDs are merged in the SOLR-303
> response, so other than building in some "this host isn't working, just move
> on and report it" and of course the work to index redundantly, we wouldn't
> need anything to achieve a good redundant shard implementation.
>
> B
>
>
>
--
Regards,
Ian Connor
s corrupt and see which segment it is? Then post back the
> exception and "ls -l" of your index directory?
>
> If you could post the client-side code you're using to build & submit
> docs to Solr, and if I can get access to the Medline content, and I
> can the r
if one of my shards goes down, then I can still give
results. If there was some option that said wait 1 second and then
give up, this would work perfectly for me.
--
Regards,
Ian Connor
Ignore that error - I think I installed the Sun JVM incorrectly - this
seems unrelated to the error.
On Fri, Aug 15, 2008 at 9:01 AM, Ian Connor <[EMAIL PROTECTED]> wrote:
> I tried it again (rm -rf /solr/index and post all the docs again) but
> this time, I get the error (I also swi
change. I don't
> do that any more ;).
>
> Doug
>
> On Aug 14, 2008, at 11:08 PM, Yonik Seeley wrote:
>
>> Since this looks like more of a lucene issue, I've replied in
>> [EMAIL PROTECTED]
>>
>> -Yonik
>>
>> On Thu, Aug 14, 2008 at 10:18
:
> - what platform are you running on, and what JVM?
> - are you using multicore? (I fixed some index locking bugs recently)
> - are there any exceptions in the log before this?
> - how reproducible is this?
>
> -Yonik
>
> On Thu, Aug 14, 2008 at 2:47 PM, Ian Connor <[EM
or another internal
one?
Regards,
Ian Connor
have solr-ruby do this to dates for me would be
ideal.
if field.class == Date
  field = field.to_s + "T23:59:59Z"
end
On Tue, Aug 12, 2008 at 5:11 AM, Erik Hatcher
<[EMAIL PROTECTED]> wrote:
>
> On Aug 11, 2008, at 3:03 PM, Ian Connor wrote:
>>
>> I originally us
better here than time.
What is the best strategy here:
1. Use Dates and treat it as a solr.String;
2. Customize the Date class to output a valid solr.DateField string; or
3. Treat it as a string in ruby and handle to/from Date in my model?
--
Regards,
Ian Connor
my employer. Contents may be hot. Slippery when wet.
> Reading disclaimers makes you go blind. Writing them is worse. You have been
> Warned.
>
--
Regards,
Ian Connor
"lee" instead of "Lee".
Also, can anyone see danger in using StandardTokenizerFactory for
people's names?
--
Regards,
Ian Connor
.
On Tue, Aug 5, 2008 at 4:59 PM, Smiley, David W. (DSMILEY)
<[EMAIL PROTECTED]> wrote:
> Yes.
>
>
> On 8/5/08 4:58 PM, "Ian Connor" <[EMAIL PROTECTED]> wrote:
>
>> Hi,
>>
>> When you store a multivaluefield in a given order
>> ['one
Hi,
When you store a multivaluefield in a given order
['one','two','three','four'], will it always return the values in that
order?
--
Regards,
Ian Connor
en only 100 docs/second.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
>
>
> - Original Message
>> From: Ian Connor <[EMAIL PROTECTED]>
>> To: solr-user@lucene.apache.org
>> Sent: Friday, August 1, 2008 3:36:1
update' -H
'Content-Type: text/xml'
Is there a faster way to load up these documents into a number of solr
shards? I seem to be able to cover 3000/second just catting them
together (2500 at a time is the sweet spot for me) - but this slows
down to under 100/s once I try to do the post with curl.
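One way to keep curl's per-request overhead from dominating is to send one `<add>` envelope per POST containing a whole batch (the message above found ~2500 docs per request the sweet spot). A sketch of building that envelope (field names and batch size are assumptions):

```ruby
require 'cgi'

# Wrap a batch of documents (each a field=>value hash) in a single <add>
# envelope, so one HTTP POST carries many docs instead of one.
def add_batch_xml(docs)
  inner = docs.map do |doc|
    fields = doc.map { |k, v| %(<field name="#{k}">#{CGI.escapeHTML(v.to_s)}</field>) }.join
    "<doc>#{fields}</doc>"
  end.join
  "<add>#{inner}</add>"
end
```

The resulting string is what gets POSTed to /update with Content-Type: text/xml; reusing one persistent HTTP connection per shard also helps.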
--
Regards,
Ian Connor
Indeed - one of my shards had it listed as "text" doh!
thanks for the assurance that led me to find my bug
On Tue, Jul 22, 2008 at 11:43 AM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On Tue, Jul 22, 2008 at 11:39 AM, Ian Connor <[EMAIL PROTECTED]> wrote:
>> >
8 at 11:20 AM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> On Tue, Jul 22, 2008 at 11:08 AM, Ian Connor <[EMAIL PROTECTED]> wrote:
>> How can I require an exact field match in a query. For instance, if a
>> title field contains "Nature" or "Nature Cell Biolog
re index it with the
field defined in a certain way?
I have this definition now - but it returns all titles that contain
"Nature" rather than just the ones that equal it exactly.
--
Regards,
Ian Connor
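A common schema-level approach to the exact-title question above (a sketch; field names assumed, consistent with the earlier reply that a shard mistakenly had the field typed as "text") is to copy the tokenized title into an untokenized string field:

```xml
<field name="title" type="text" indexed="true" stored="true"/>
<field name="title_exact" type="string" indexed="true" stored="false"/>
<copyField source="title" dest="title_exact"/>
```

Queries against `title_exact:"Nature"` then match only whole-field equality, while `title` keeps supporting tokenized search.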
Hi,
I am trying to query a "doi" field. However, a doi can contain a ":"
colon character and the query parser throws an error at this point.
How do you escape a colon?
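A backslash before the colon makes the query parser treat it as a literal character; a minimal sketch in Ruby (the DOI value is hypothetical):

```ruby
# Escape the colon so "doi:<value>" is parsed as field:literal-value
# rather than as two field separators.
doi = "10.1038/nature01234:1"          # hypothetical DOI
escaped = doi.gsub(':') { '\:' }       # insert a backslash before each colon
query = "doi:#{escaped}"
```

Quoting the whole value (`doi:"10.1038/..."`) is another option when the value contains several special characters.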
--
Regards,
Ian
gt;
> On Fri, Jul 11, 2008 at 9:55 PM, Ian Connor <[EMAIL PROTECTED]> wrote:
> > Could you give me an example of how combining standard with dismax would
> > look in a query string or URL?
>>
>> I thought you had to set qt=dismax in the URL and it applied to the
>>
=10&fl=*%2Cscore&qt=dismax&wt=ruby&explainOther=&hl.fl=
On Fri, Jul 11, 2008 at 11:50 AM, Shalin Shekhar Mangar
<[EMAIL PROTECTED]> wrote:
> Note that you can use both the standard and dismax style as when you
> need more control vs. searching all fields.
>
&g
field:value syntax
>
> To get all terms, you can use the facet.field for all fields and get
> all the terms. However, I'm not able to understand the use-case for
> this.
>
> On Fri, Jul 11, 2008 at 5:12 PM, Ian Connor <[EMAIL PROTECTED]> wrote:
>>
>> Is it po
there at the end to be the
defaultSearchField.
Thanks for any advice,
Ian Connor
Does anyone have a suggestion on a simple "restart" script. I see
tools like supervise can restart a process when it goes down.
Nagios would be ideal because it can warn you before your solr
instance starts to die - but a simple restart script with an email
alert might be good enough for most.
On
through my data before slowing down.
On Wed, Jul 9, 2008 at 7:56 PM, Jacob Singh <[EMAIL PROTECTED]> wrote:
> My total guess is that indexing is CPU bound, and searching is RAM bound.
>
> Best,
> Jacob
> Ian Connor wrote:
>> There was a thread a while ago, that suggested ju
200. Assuming each returns around 30k
> documents, it adds to 200 * 3 bits = 750K.
>
> If we use a document cache of size 20K, assuming each document size is around
> 5KB at the max, it will take up 20K * 5KB = 100MB.
>
> Thus we can increase the cache more drastically and still it wi