Does DataImportHandler do any sanitizing?

2012-08-15 Thread Jon Drukman
I am pulling some fields from a mysql database using DataImportHandler and some of them have invalid XML in them. Does DataImportHandler do any kind of filtering/sanitizing to ensure that it will go in OK or is it all on me? Example bad data: orphaned ampersands ("Peanut Butter & Jelly"), curly
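
As a rough illustration of the sanitizing the question is asking about (not a statement of what DataImportHandler does internally), here is a minimal Python sketch that strips control characters forbidden by XML 1.0 and escapes orphaned ampersands and angle brackets before a value goes into an XML update document. The function name `sanitize_for_xml` is hypothetical.

```python
import re
from xml.sax.saxutils import escape

# Control characters that XML 1.0 does not allow in documents
# (tab, newline and carriage return are legal and are kept).
_INVALID_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f]")


def sanitize_for_xml(value: str) -> str:
    """Drop forbidden control characters, then escape &, < and >."""
    cleaned = _INVALID_XML_CHARS.sub("", value)
    return escape(cleaned)


if __name__ == "__main__":
    print(sanitize_for_xml("Peanut Butter & Jelly"))
```

Note that this escapes every ampersand, so text that is already entity-encoded would be escaped a second time; it assumes the database values are raw text.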

Re: Running out of memory

2012-08-13 Thread Jon Drukman
On Sun, Aug 12, 2012 at 12:31 PM, Alexey Serba wrote: > > It would be vastly preferable if Solr could just exit when it gets a memory error, because we have it running under daemontools, and that would cause an automatic restart. > -XX:OnOutOfMemoryError="<cmd args>; <cmd args>" > Run user-defined commands

Re: DataImportHandler WARNING: Unable to resolve variable

2012-08-10 Thread Jon Drukman
when while using a Template Transformer. My > fields *always* have a value as well - it is getting indexed correctly. > > Furthermore, the number of warnings I get seems arbitrary. I imported one > document (debug mode) and I got roughly ~400 of those warning messages for > the single f

Re: Connect to SOLR over socket file

2012-08-10 Thread Jon Drukman
On Fri, Aug 10, 2012 at 2:44 AM, Jason Axelson wrote: > You're correct that there is an underlying problem I'm trying to > solve. The underlying problem is that due to the security policies I > cannot run another service that listens on a TCP port, but a unix > domain socket would be okay. It look

Re: /solr/admin/stats.jsp null pointer exception

2012-08-09 Thread Jon Drukman
On Wed, Aug 8, 2012 at 3:03 PM, Chris Hostetter wrote: > I can't reproduce with the example configs -- it looks like you've > tweaked the logging to use the XML file format, any way to get the > stacktrace of the "Caused by" exception so we can see what is null and > where? > Here is the caused by

/solr/admin/stats.jsp null pointer exception

2012-08-08 Thread Jon Drukman
New install of Solr 3.6.1, getting a Null Pointer Exception when trying to access admin/stats.jsp: 2012-08-08T17:55:09 138509624 694 org.apache.solr.servlet.SolrDispatchFilter SEVERE org.apache.solr.common.SolrException log 25 org.apache.jasper.JasperException: java.lang.Nu

Re: Solr always at 100% (or more) CPU

2012-07-09 Thread Jon Drukman
ppened to > me last week. > > > http://blog.wpkg.org/2012/07/01/java-leap-second-bug-30-june-1-july-2012-fix/ > > Michael Della Bitta > > > Appinions, Inc. -- Where Influence Isn't a Game. > http://www.appinions.com

Solr always at 100% (or more) CPU

2012-07-09 Thread Jon Drukman
I have a very small Solr setup. The index is 32MB and there are only 8 fields, most of which are ints. I run a cron job every hour to use DataImportHandler to do a full reimport of a database which has 42,600 rows. There is minimal traffic on the server. Maybe a few dozen queries a minute. Usu

Re: Exception in DataImportHandler (stack overflow)

2012-05-15 Thread Jon Drukman
OK, setting the wait_timeout back to its previous value and adding readOnly didn't help, I got the stack overflow again. I re-upped the mysql timeout value again. -jsd- On Tue, May 15, 2012 at 2:42 PM, Jon Drukman wrote: > I fixed it for now by upping the wait_timeout on the mysq

Re: Exception in DataImportHandler (stack overflow)

2012-05-15 Thread Jon Drukman
ould open a bug > report and get this fixed in DIH. > > James Dyer > E-Commerce Systems > Ingram Content Group > (615) 213-4311 > > > -Original Message- > From: Jon Drukman [mailto:jdruk...@gmail.com] > Sent: Tuesday, May 15, 2012 4:12 PM > To: solr-user@

Re: Exception in DataImportHandler (stack overflow)

2012-05-15 Thread Jon Drukman
< michael.della.bi...@appinions.com> wrote: > Hi, Jon: > > Well, you don't see that every day! > > Is it possible that you have something weird going on in your DDL > and/or queries, like a tree schema that now suddenly has a cyclical > reference? > > Michael > > On

Facet auto-suggest

2012-01-17 Thread Jon Drukman
I don't even know what to call this feature. Here's a website that shows the problem: http://pulse.audiusanews.com/pulse/index.php Notice that you can end up in a situation where there are no results. For example, in order, press: People, Performance, Technology, Photos. The client wants it so th

Re: Case insensitive but number sensitive string?

2011-02-25 Thread Jon Drukman
Ahmet Arslan yahoo.com> writes: > > > I want a string field that is case > > insensitive.  This is what I tried: > > > > > sortMissingLast="true" > > omitNorms="true"> > >         > >                 > > > >         > >         > >                 > > > >         > >     > > > > > >

Case insensitive but number sensitive string?

2011-02-25 Thread Jon Drukman
I want a string field that is case insensitive. This is what I tried: However, it is matching "opengl" for "opengl128". I want exact string matches, but I want them case-insensitive. What did I do wrong?

Sorting - bad performance

2011-02-22 Thread Jon Drukman
The performance factors wiki says: "If you do a lot of field based sorting, it is advantageous to add explicitly warming queries to the "newSearcher" and "firstSearcher" event listeners in your solrconfig which sort on those fields, so the FieldCache is populated prior to any queries being executed

Shutdown hook executing for a long time

2011-02-16 Thread Jon Drukman
2011-02-16 11:32:45.489::INFO: Shutdown hook executing 2011-02-16 11:35:36.002::INFO: Shutdown hook complete The shutdown time seems to be proportional to the amount of time that Solr has been running. If I immediately restart and shut down again, it takes a fraction of a second. What causes i

DataImportHandler: regex debugging

2011-02-09 Thread Jon Drukman
I am trying to use the regex transformer but it's not returning anything. Either my regex is wrong, or I've done something else wrong in the setup of the entity. Is there any way to debug this? Making a change and waiting 7 minutes to reindex the entity sucks. This returns columns

DataImportHandler: no queries when using entity=something

2011-02-02 Thread Jon Drukman
So I'm trying to update a single entity in my index using DataImportHandler. http://solr:8983/solr/dataimport?command=full-import&entity=games It ends near-instantaneously without hitting the database at all, apparently. Status shows: 0 0 0 0 Indexing completed. Added/Updated: 0 documents. Del

Re: DataImportHandler: full import of a single entity

2011-01-18 Thread Jon Drukman
Ahmet Arslan yahoo.com> writes: > > > I've got a DataImportHandler set up > > with 5 entities.  I would like to do a full > > import on just one entity.  Is that possible? > > > > Yes, there is a parameter named entity for that. > solr/dataimport?command=full-import&entity=myEntity That seem
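
For reference, the single-entity request described in the reply can be assembled programmatically. This is only a sketch: the base URL `http://solr:8983/solr` is taken from the related thread, and the helper name `dih_url` is hypothetical.

```python
from urllib.parse import urlencode


def dih_url(base: str, command: str, entity: str) -> str:
    """Build a DataImportHandler request URL restricted to one entity."""
    query = urlencode({"command": command, "entity": entity})
    return f"{base}/dataimport?{query}"


if __name__ == "__main__":
    print(dih_url("http://solr:8983/solr", "full-import", "myEntity"))
```

Using `urlencode` rather than string concatenation keeps the URL valid if an entity name ever contains characters that need escaping.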

DataImportHandler: full import of a single entity

2011-01-14 Thread Jon Drukman
I've got a DataImportHandler set up with 5 entities. I would like to do a full import on just one entity. Is that possible? I worked around it temporarily by hand editing the dataimport.properties file and deleting the delta line for that one entity, and kicking off a delta. But for (hopefully)

Boosting on a document value

2010-11-15 Thread Jon Drukman
I've got a document with a "type" field. If the type is 1, I want to boost the document's relevancy, but type=1 is not a requirement. Types other than 1 should still be returned and scored as normal, just without the boost. How do I do this? -jsd-
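
One possible approach, assuming the dismax query parser is in use, is a boost query (`bq`) that adds score to matching documents without filtering out the rest. The Python sketch below only builds the request parameters; the boost factor of 10 and the helper name `boosted_params` are assumptions chosen for illustration and would need tuning against real results.

```python
from urllib.parse import urlencode


def boosted_params(user_query: str, field: str = "type",
                   value: str = "1", boost: float = 10.0) -> dict:
    """Return dismax parameters that boost, but do not require, field:value."""
    return {
        "defType": "dismax",
        "q": user_query,
        # Documents matching the boost query score higher; others still match.
        "bq": f"{field}:{value}^{boost:g}",
    }


if __name__ == "__main__":
    print(urlencode(boosted_params("call of duty")))
```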

Re: Searching with AND + OR and spaces

2010-11-12 Thread Jon Drukman
Ahmet Arslan yahoo.com> writes: > > > (title:"Call of Duty" OR subhead:"Call of Duty") > > > > No matches, despite the fact that there are many documents > > that should match. > > Field types of title and subhead are important here. Do you use stopwordfilterfactory with enable > position inc

Searching with AND + OR and spaces

2010-11-12 Thread Jon Drukman
I want to search two fields for the phrase Call Of Duty. I tried this: (title:"Call of Duty" OR subhead:"Call of Duty") No matches, despite the fact that there are many documents that should match. So I left out the quotes, and it seems to work. But now when I try doing things like title:Call

Re: SEVERE: Could not start SOLR. Check solr/home property

2010-04-28 Thread Jon Drukman
On 4/27/10 12:04 PM, Chris Hostetter wrote: : SEVERE: Could not start SOLR. Check solr/home property it means something went horribly wrong when starting solr, and since this is frequently caused by either an incorrect explicit solr/home or an incorrect implicitly guessed solr home, that is men

Re: SEVERE: Could not start SOLR. Check solr/home property

2010-04-26 Thread Jon Drukman
On 4/26/10 1:18 PM, Siddhant Goel wrote: Did you by any chance set up multicore? Try passing in the path to the Solr home directory as -Dsolr.solr.home=/path/to/solr/home while you start Solr. Nope, no multicore. I destroyed the index and re-created it from scratch and now it works fine. No

SEVERE: Could not start SOLR. Check solr/home property

2010-04-26 Thread Jon Drukman
What does this error mean? SEVERE: Could not start SOLR. Check solr/home property I've had this solr installation working before, but I haven't looked at it in a few months. I checked it today and the web side is returning a 500 error, the log file shows this when starting up: SEVERE: Coul

Boost documents based on a constant value in a field

2010-02-05 Thread Jon Drukman
I have a very simple schema: two integers and two text fields. required="true" /> stored="true"/> I want to do full text searching on the text fields as normal. However, I want to boost all documents where question_source == 3 to the top. How do I do that? So the results s

DataImportHandler delta-import confusion

2010-02-01 Thread Jon Drukman
First, let me just say that DataImportHandler is fantastic. It got my old mysql-php-xml index rebuild process down from 30 hours to 6 minutes. I'm trying to use the delta-import functionality now but failing miserably. Here's my entity tag: (some SELECT statements reduced to increase readabil

Re: stemming (maybe?) question

2009-03-17 Thread Jon Drukman
Yonik Seeley wrote: Not sure... I just took the stock solr example, and it worked fine. I inserted "o'meara" into example/exampledocs/solr.xml Advanced o'meara Full-Text Search Capabilities using Lucene then indexed everything: ./post.sh *.xml Then queried in various ways: q=o'meara q=omeara

Re: stemming (maybe?) question

2009-03-16 Thread Jon Drukman
Yonik Seeley wrote: On Thu, Mar 12, 2009 at 1:36 PM, Jon Drukman wrote: is it possible to make solr think that "omeara" and "o'meara" are the same thing? WordDelimiter would handle it if the document had "o'meara" (but you may or may

stemming (maybe?) question

2009-03-12 Thread Jon Drukman
is it possible to make solr think that "omeara" and "o'meara" are the same thing? -jsd-

Re: exceeded limit of maxWarmingSearchers

2009-02-09 Thread Jon Drukman
Otis Gospodnetic wrote: I'd say: "Make sure you don't commit more frequently than the time it takes for your searcher to warm up", or else you risk searcher overlap and pile-up. cool. i found a place in our code where we were committing the same thing twice in very rapid succession. fingers
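
The advice above (do not commit more often than a searcher takes to warm up) can be enforced on the client side. The following is a minimal Python sketch under that assumption; the class name `CommitThrottle` and the injectable clock are hypothetical, and the actual commit call is left to the caller.

```python
import time


class CommitThrottle:
    """Allow a commit only if `min_interval` seconds have passed since the last one."""

    def __init__(self, min_interval: float, clock=time.monotonic):
        self.min_interval = min_interval
        self._clock = clock
        self._last_commit = None  # no commit issued yet

    def should_commit(self) -> bool:
        now = self._clock()
        if self._last_commit is None or now - self._last_commit >= self.min_interval:
            self._last_commit = now
            return True
        return False


if __name__ == "__main__":
    throttle = CommitThrottle(min_interval=60.0)
    print(throttle.should_commit())  # first call is always allowed
    print(throttle.should_commit())  # an immediate second call is suppressed
```

A caller that skips a commit still has uncommitted updates pending, so in practice this would be paired with a deferred commit once the interval has elapsed.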

Re: exceeded limit of maxWarmingSearchers

2009-02-05 Thread Jon Drukman
Otis Gospodnetic wrote: Jon, If you can, don't commit on every update and that should help or fully solve your problem. is there any sort of heuristic or formula i can apply that can tell me when to commit? put it in a cron job and fire it once per hour? there are certain updates that are

Re: exceeded limit of maxWarmingSearchers

2009-02-04 Thread Jon Drukman
Otis Gospodnetic wrote: That should be fine (but apparently isn't), as long as you don't have some very slow machine or if your caches are large and configured to copy a lot of data on commit. this is becoming more and more problematic. we have periods where we get 10 of these exceptio

Re: exceeded limit of maxWarmingSearchers

2009-01-30 Thread Jon Drukman
Yonik Seeley wrote: I'd advise setting it to a very low limit (like 2) and committing less often. Once you get too many overlapping searchers, things will slow to a crawl and that will just cause more to pile up. The root cause is simply too many commits in conjunction with warming too long. I

exceeded limit of maxWarmingSearchers

2009-01-30 Thread Jon Drukman
I am getting hit by a storm of these once a day or so: SEVERE: org.apache.solr.common.SolrException: Error opening new searcher. exceeded limit of maxWarmingSearchers=16, try again later. I keep bumping up maxWarmingSearchers. It's at 32 now. Is there any way to figure out what the "right"

Re: I get SEVERE: Lock obtain timed out

2009-01-29 Thread Jon Drukman
Yonik Seeley wrote: On Thu, Jan 29, 2009 at 1:16 PM, Jon Drukman wrote: Julian, have you had any luck figuring this out? My production instance just started having this problem. It seems to crop up after solr's been running for several hours. Our usage is very light (maybe one query

Re: permanently setting log level?

2009-01-29 Thread Jon Drukman
Vannia Rajan wrote: On Thu, Jan 29, 2009 at 11:55 PM, Jon Drukman wrote: if i go to /solr/admin/logging, i can set the "root" log level to WARNING, which is what i want. however, every time solr restarts, it is set back to INFO. Is there a way to get the WARNING level to stick p

permanently setting log level?

2009-01-29 Thread Jon Drukman
if i go to /solr/admin/logging, i can set the "root" log level to WARNING, which is what i want. however, every time solr restarts, it is set back to INFO. Is there a way to get the WARNING level to stick permanently? -jsd-

Re: I get SEVERE: Lock obtain timed out

2009-01-29 Thread Jon Drukman
Julian Davchev wrote: Hi, Any documents or something I can read on how locks work and how I can control it. When do they occur etc. Cause only way I got out of this mess was restarting tomcat SEVERE: org.apache.lucene.store.LockObtainFailedException: Lock obtain timed out: SingleInstanceLock: w

Handling proper names

2008-11-07 Thread Jon Drukman
Is there any way to tell Solr that Stephen is the same as Steven and Steve? Carl and Karl? Bobby/Bob/Robert, and so on... -jsd-
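
Within Solr this kind of equivalence is usually expressed as a synonym list applied during analysis (for example with `SynonymFilterFactory`). As a language-neutral illustration of the idea only, here is a small Python sketch that maps each name variant to one canonical form; the groups come from the question, while the choice of the alphabetically first name as canonical is an arbitrary assumption.

```python
# Each set lists names that should be treated as the same person name.
NAME_GROUPS = [
    {"stephen", "steven", "steve"},
    {"carl", "karl"},
    {"bobby", "bob", "robert"},
]

# Map every variant to one representative (alphabetically first in its group).
CANONICAL = {name: min(group) for group in NAME_GROUPS for name in group}


def canonical_name(token: str) -> str:
    """Return the canonical form of a name, or the lowercased token if unknown."""
    lowered = token.lower()
    return CANONICAL.get(lowered, lowered)


if __name__ == "__main__":
    for name in ("Steve", "Karl", "Robert", "Alice"):
        print(name, "->", canonical_name(name))
```

Applying the same mapping at index time and at query time is what makes the variants match each other.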

Re: exceeded limit of maxWarmingSearchers

2008-10-29 Thread Jon Drukman
Feak, Todd wrote: Have you looked at how long your warm up is taking? If it's taking longer to warm up a searcher than it does for you to do an update, you will be behind the curve and eventually run into this no matter how big that number. Most of them say warmupTime=0. It ranges from 0 to

exceeded limit of maxWarmingSearchers

2008-10-29 Thread Jon Drukman
I am getting this error quite frequently on my Solr installation: SEVERE: org.apache.solr.common.SolrException: Error opening new searcher. exceeded limit of maxWarmingSearchers=8, try again later. I've done some googling but the common explanation of it being related to autocommit doesn't a

dismax and stopwords (was Re: dismax and long phrases)

2008-10-09 Thread Jon Drukman
Norberto Meijome wrote: On Tue, 07 Oct 2008 09:27:30 -0700 Jon Drukman <[EMAIL PROTECTED]> wrote: Yep, you can "fake" it by only using fieldsets (qf) that have a consistent set of stopwords. does that mean changing the query or changing the schema? Jon, - you change sche

Re: dismax and long phrases

2008-10-07 Thread Jon Drukman
Mike Klaas wrote: On 6-Oct-08, at 11:20 AM, Jon Drukman wrote: Chris Hostetter wrote: It's not a bug in the implementation, it's a side effect of the basic tenet of how dismax works since it inverts the input and creates a DisjunctionMaxQuery for each "word" in the inp

Re: dismax and long phrases

2008-10-06 Thread Jon Drukman
Chris Hostetter wrote: It's not a bug in the implementation, it's a side effect of the basic tenet of how dismax works since it inverts the input and creates a DisjunctionMaxQuery for each "word" in the input, any word that is valid in at least one of the "qf" fields generates a "should" claus

dismax and long phrases

2008-10-03 Thread Jon Drukman
i have a document with the following field Saying goodbye to Norman if i search for "saying goodbye to norman" with the standard query, it works fine. if i specify dismax, however, it does not match. here's the output of debugQuery, which I don't understand at all: saying goodbye to norman

Re: help required: how to design a large scale solr system

2008-09-24 Thread Jon Drukman
Martin Iwanowski wrote: How can I setup to run Solr as a service, so I don't need to have a SSH connection open? The advice that I was given on this very list was to use daemontools. I set it up and it is really great - starts when the machine boots, auto-restart on failures, easy to bring u

Re: dismax - undefined field exception

2008-09-22 Thread Jon Drukman
Sean Timm wrote: Add echoParams=all to your URL and look for the "cat" field in one of the passed parameters. Specifically, in pf and qf. These can be defaulted in the solrconfig.xml file. i tried that but the exception prevents solr from returning anything. but i did look in solrconfig.xml

dismax - undefined field exception

2008-09-22 Thread Jon Drukman
whenever i try to use qt=dismax i get the following error: Sep 22, 2008 11:50:48 AM org.apache.solr.common.SolrException log SEVERE: org.apache.solr.common.SolrException: undefined field cat at org.apache.solr.schema.IndexSchema.getDynamicFieldType(IndexSchema.java:1053) i don't have

How to use copyfield with dynamicfield?

2008-09-22 Thread Jon Drukman
I have a dynamicField declaration: I want to copy any *_t's into a text field for searching with dismax. As it is, it appears you can't search dynamicfields this way. I tried adding a copyField: I do have a text field in my schema: However I get 400 errors whenever I try to update a

Re: Illegal character in xml file

2008-09-19 Thread Jon Drukman
James liu wrote: > first, u should escape some string like (code by php) > >> function escapeChars($string) { >> > $string = str_replace("&", "&amp;", $string); > > $string = str_replace("<", "&lt;", $string); > > $string = str_replace(">", "&gt;", $string); > > $string = str_replace("'", "&apos;", $string);

Re: Dismax + Dynamic fields

2008-09-18 Thread Jon Drukman
Daniel Papasian wrote: Norberto Meijome wrote: Thanks Yonik. ok, that matches what I've seen - if i know the actual name of the field I'm after, I can use it in a query, but i can't use the dynamic_field_name_* (with wildcard) in the config. Is adding support for this something that is desir

Adding a field?

2008-08-26 Thread Jon Drukman
Is there a way to add a field to an existing index without stopping the server, deleting the index, and reloading every document from scratch? -jsd-

Re: Solr won't start under jetty on RHEL5.2

2008-08-18 Thread Jon Drukman
Jon Drukman wrote: I just migrated my solr instance to a new server, running RHEL5.2. I installed java from yum but I suspect it's different from the one I used to use. Turns out my instincts were correct. The version from yum does not work. I installed the official sun jdk and n

Solr won't start under jetty on RHEL5.2

2008-08-18 Thread Jon Drukman
I just migrated my solr instance to a new server, running RHEL5.2. I installed java from yum but I suspect it's different from the one I used to use. Anyway, my Solr no longer works. 2008-08-18 18:01:12.079::INFO: Logging to STDERR via org.mortbay.log.StdErrLog 2008-08-18 18:01:12.229::INF

Re: Administrative questions

2008-08-15 Thread Jon Drukman
Jason Rennie wrote: On Wed, Aug 13, 2008 at 1:52 PM, Jon Drukman <[EMAIL PROTECTED]> wrote: Duh. I should have thought of that. I'm a big fan of djbdns so I'm quite familiar with daemontools. Thanks! :) My pleasure. Was nice to hear recently that DJB is moving towa

Re: Administrative questions

2008-08-13 Thread Jon Drukman
Jason Rennie wrote: On Tue, Aug 12, 2008 at 8:49 PM, Jon Drukman <[EMAIL PROTECTED]> wrote: 1. How do people deal with having solr start when system reboots, manage the log output, etc. Right now I run it manually under a unix 'screen' command with a wrapper script that takes

Administrative questions

2008-08-12 Thread Jon Drukman
1. How do people deal with having solr start when system reboots, manage the log output, etc. Right now I run it manually under a unix 'screen' command with a wrapper script that takes care of restarts when it crashes. That means that only my user can connect to it, and it can't happen when t

Re: Wildcard search question

2008-06-24 Thread Jon Drukman
Norberto Meijome wrote: ok well let's say that i can live without john/jon in the short term. what i really need today is a case insensitive wildcard search with literal matching (no fancy stemming. bobby is bobby, not bobbi.) what are my options? http://wiki.apache.org/solr/AnalyzersTokeniz
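
Because wildcard terms are not run through the analyzer, one workaround is to normalize the term on the client before adding the wildcard. The Python sketch below assumes the target field lowercases tokens at index time and does no stemming, which is an assumption about the schema rather than something stated in the thread; `prefix_query` is a hypothetical helper, and it does not escape query-syntax characters.

```python
def prefix_query(field: str, user_term: str) -> str:
    """Build a prefix query whose term is lowercased to match a lowercased index."""
    term = user_term.strip().lower()
    return f"{field}:{term}*"


if __name__ == "__main__":
    print(prefix_query("name", "Bobby"))
```

With a stemmed field this would still miss, since the indexed token would be the stem rather than the literal word.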

Re: Wildcard search question

2008-06-23 Thread Jon Drukman
Erik Hatcher wrote: No, because the original data is Bobby Gaza, so Bobby* would match, but not bobby*. "string" type (in the example schema, to be clear) does effectively no analysis, leaving the original string indexed as-is, case and all. [...] stemming and wildcard term queries aren't

Re: Wildcard search question

2008-06-23 Thread Jon Drukman
Erik Hatcher wrote: Jon, You provided a lot of nice details, thanks for helping us help you :) The one missing piece is the definition of the "text" field type. In Solr's _example_ schema, "bobby" gets analyzed (stemmed) to "bobbi"[1]. When you query for bobby*, the query parser is not ru

Wildcard search question

2008-06-23 Thread Jon Drukman
When I search with q=bobby I get the following record: 2008-06-23T07:06:40Z http://farm1.static.flickr.com/117/... 9 Bobby Gaza [EMAIL PROTECTED] When I search with bobby* I get nothing. When I search with steve* I get "Steve Ballmer" and "Steve Jobs"... What's going on?

Best type to use for enum-like behavior

2008-06-12 Thread Jon Drukman
I am going to store two totally different types of documents in a single solr instance. Eventually I may separate them into separate instances but we are a long way from having either the size or traffic to require that. I read somewhere that a good approach is to add a 'type' field to the d

Re: Newbie Q: searching multiple fields

2008-06-02 Thread Jon Drukman
Yonik Seeley wrote: There is your issue: type "string" indexes the whole field value as a single token. You want type "text" like you have on the name field. yep, i noticed that right after i hit send. things are working now. sorry, i did say i was a newbie! -jsd-

Re: Newbie Q: searching multiple fields

2008-06-02 Thread Jon Drukman
Yonik Seeley wrote: Verify all the fields you want to search on are indexed Verify that the query is being correctly built by adding debugQuery=true to the request here is the schema.xml extract: required="true" /> here is the debugQuery output. i have no idea how to read

Newbie Q: searching multiple fields

2008-06-02 Thread Jon Drukman
I am brand new to Solr. I am trying to get a very simple setup running. I've got just a few fields: name, description, tags. I am only able to search on the default field (name) however. I tried to set up the dismax config to search all the fields, but I never get any results on the other f