Re: solr.DirectUpdateHandler2 failed to instantiate

2013-06-27 Thread Mark Bennett
hen I removed copies of my jar from other lib directories which I had been experimenting with. -- Mark Bennett / LucidWorks: Search & Big Data / mark.benn...@lucidworks.com Office: 408-898-4201 / Telecommute: 408-733-0387 / Cell: 408-829-6513 On Mar 13, 2013, at 11:52 AM, Jack Park wrote: >

Re: solr.DirectUpdateHandler2 failed to instantiate

2013-06-27 Thread Mark Bennett
Jack, Did you ever find a fix for this? I'm having similar issues (different parts of solrconfig) and my guess is it's a config issue somewhere, vs. a proper casting problem, some nested init issue. Was curious what you found? On Mar 13, 2013, at 11:52 AM, Jack Park wrote: > I can safely sa

Re: [ANN] vifun: tool to help visually tweak Solr boosting

2013-03-04 Thread Mark Bennett
vrh, /clustering, /terms, /elevate (from default Solr 4.1 solrconfig.xml) I'm using /select On Feb 23, 2013, at 6:12 AM, jm

Re: From a high level query call, tell Solr / Lucene to automatically apply a leaf operator?

2013-03-03 Thread Mark Bennett
PHA/queryparser/org/apache/lucene/queryparser/flexible/standard/package-summary.html > > > > On Sun, Feb 24, 2013 at 11:33 AM, Mark Bennett > wrote: > > > Scenario: > > > > You're submitting a block of text as a query. > > > > You're content to

From a high level query call, tell Solr / Lucene to automatically apply a leaf operator?

2013-02-23 Thread Mark Bennett
's analyzers has some other token filter you forgot about, so you'd have to bring that logic forward as well. (Long story of why I'd want to do all this... and I know people think adding ~2 to all tokens will give bad results anyway, trying to fix inherited code that can't be scr

Missing documents with ConcurrentUpdateSolrServer (vs. HttpSolrServer) ?

2013-01-15 Thread Mark Bennett
Solr server, is there some mechanism to queue them onto disk, or does it try to hold them all in RAM? And *if* the backlog caused an OOM condition, wouldn't that JVM have mostly crashed (if not completely)? Any guesses on the most likely failure point, and where to look? Thanks, Mark

Re: Force SolrJ 4.0.0 to use XML to talk to Solr 1.4.1 server

2013-01-03 Thread Mark Bennett
Thank you Shawn for the option. Your second post made me smile! -- Mark Bennett / New Idea Engineering, Inc. / mbenn...@ideaeng.com Direct: 408-733-0387 / Main: 866-IDEA-ENG / Cell: 408-829-6513 On Thu, Jan 3, 2013 at 12:21 PM, Shawn Heisey wrote: > On 1/3/2013 12:39 PM, Shawn Heisey wr

Force SolrJ 4.0.0 to use XML to talk to Solr 1.4.1 server

2013-01-03 Thread Mark Bennett
and my understanding was that this was possible if using XML.

Re: Indexing performance with solrj vs. direct lucene API

2012-11-29 Thread Mark Bennett
obably use HttpSolrServer. But when doing massive updates, you might consider using ConcurrentUpdateSolrServer instead. On Wed, Nov 28, 2012 at 10:02 AM, Robert Ste

Re: Odd casting error in embedded Jetty container

2012-11-28 Thread Mark Bennett
to be working. (Anonymous - via GTD book) > > > On Tue, Nov 27, 2012 at 4:49 PM, Mark Bennett > wrote: > > > Hi Alex and Erick, > > > > I added -verbose which shows every class load. > > > > From the 12,000+ lines of output, I only see 4.0.0 jar

Re: Odd casting error in embedded Jetty container

2012-11-27 Thread Mark Bennett
e you're mixing up jars from old and new Solrs somehow. > > You've > > stripped down the classpath, but what about the solr libraries? All the > > things that > > can be defined in directives in solrconfig.xml? > > > > Not much help I know, but the best I

Re: Odd casting error in embedded Jetty container

2012-11-27 Thread Mark Bennett
Erick & Alex, If it still sounds like jars to you two, I'll take another whack in that direction. From previous Google searches I thought I had already eliminated that, but logging every gosh-darn-jar and reviewing the list makes sense.

Odd casting error in embedded Jetty container

2012-11-26 Thread Mark Bennett
Class.newInstance(Class.java:308) at org.eclipse.jetty.servlet.ServletContextHandler$Context.createFilter(ServletContextHandler.java:951)

Some Parser Resources / Links, and some related questions

2010-10-04 Thread Mark Bennett
ava, instead of using REST/CGI arguments and XML output You can build queries with it. Link: http://wiki.apache.org/solr/Solrj#Advanced_usage Question: Any good examples of custom query parsers and SolrJ? I think some advanced features aren't always available in SolrJ, it may lag a bit. --

Re: Sites with Innovative Presentation of Tags and Facets

2010-05-28 Thread Mark Bennett
n mobile. Just passing this on, please don't shoot the messenger. ;-) Mark On Thu, May 27, 2010 at 2:55 PM, Geert-Jan Brits wrote: > Perhaps you could sho

Re: Sites with Innovative Presentation of Tags and Facets

2010-05-28 Thread Mark Bennett
Some value *(50/60)* You'd have: Some value *- 50 of 60* Something like that.... I'm no artist. On Thu, May 27, 2010 at 2:37 PM, Lukas Kahwe Smith wr

Re: Sites with Innovative Presentation of Tags and Facets

2010-05-28 Thread Mark Bennett
generally not what folks expect. On Thu, May 27, 2010 at 2:32 PM, Geert-Jan Brits wrote: > Something like sliders perhaps? > Of course only numerical ranges

Sites with Innovative Presentation of Tags and Facets

2010-05-27 Thread Mark Bennett
raph.com/navigator.html Cool articles on the subject: (some examples now offline) http://www.cs.umd.edu/class/spring2005/cmsc838s/viz4all/viz4all_a.html

Re: [search_dev] Re: Opinions on Facet+Fulltext behavior?

2010-04-08 Thread Mark Bennett
nly does this when users are coming from the Advanced Search form and have therefore demonstrated some comfort with advanced search. Mark Mail's target audience is also rather sophisticated and "nerdy". * Show both sets of results in different window panes. This at least maintains

Re: [search_dev] Opinions on Facet+Fulltext behavior?

2010-03-18 Thread Mark Bennett
rs, so punctuation in the search box would cause smoke to billow out of ears I imagine. ;-) On Thu, Mar 18, 2010 at 10:07 AM, Chris Biow wrote: > > > On

Opinions on Facet+Fulltext behavior?

2010-03-18 Thread Mark Bennett
io B: The engine brings back furnished apartments with a garage all over the Bay Area, and I get 800 matches. To limit the search to Sunnyvale, I must again click the City facet and select it. There are strengths and weaknesses to both scenarios, but I don't wanna bias anybody's answer.

Question on Facets and Multiple values (confusion from the Wiki)

2010-02-26 Thread Mark Bennett
e And this: http://wiki.apache.org/solr/FieldOptionsByUseCase multiValued is left blank in many cases, and not filled in for facets.

Code sync between Lucene and Solr, crossing Apache project boundaries, etc.

2009-09-22 Thread Mark Bennett
up the difference exclusively to that then? If folks work with both code trees a lot, maybe having a parent build file could copy the fresh Lucene jar over to Solr. Also curious if there's an automated way to get this working in Eclipse.

Re: Clarifications to Synonym Filter Wiki entry? (2 of 2)

2009-08-24 Thread Mark Bennett
should always carry the caveat "... and it can cause false positive matches when only one of the words is present?" Am I understanding this correctly? If true, that may be acceptable in many applications; it's just a question of understanding the trade-offs. Mark

Re: Overview of Query Parsing API Stack? / Dismax parsing, new 1.4 parsing, etc.

2009-08-24 Thread Mark Bennett
also thought it would be more "granular". Have you chatted with them to confirm? On Thu, Aug 20, 2009 at 7:16 PM, Chris Hostetter wrote: > > : Subj

Clarifications to Synonym Filter Wiki entry? (2 of 2)

2009-08-24 Thread Mark Bennett
Analyzer to indicate that two terms occupy the same position: there is no way to indicate that a "phrase" occupies the same position > as a term. For our example the resulting MultiPhraseQuery would be "(sea | sea | seabiscuit) (biscuit | biscit)" which would not match > t

Clarifications to Synonym Filter Wiki entry? (1 of 2)

2009-08-24 Thread Mark Bennett
r, where D would expand to match all 9 letters you can either: 1: Put the synonym filter in the pipeline twice, along with the remove duplicates filter OR 2: Use the synonym filter at both index and query time Does anybody disagree with this? And what should be added to the Wiki doc?

Re: Solr 1.4 Clustering / mlt AS search?

2009-08-13 Thread Mark Bennett
* mlb: comments On Thu, Aug 13, 2009 at 9:39 AM, Stanislaw Osinski wrote: > Hi, > > On Tue, Aug 11, 2009 at 22:19, Mark Bennett wrote: > > Carrot2 has several pluggable algorithms to choose from, though I have no > > evidence that they're "better" than

Re: Trouble with Shingle filter and query parsing / expansion

2009-08-11 Thread Mark Bennett
One other idea I tried, which didn't work, was to see if I could get proper parsing via the stream arg: http://localhost:8983/solr/mlt?stream.body=hello+world&mlt.fl=shingle_field&mlt.mintf=0&debugQuery=true On Tue, Aug 11, 2009 at 9:09 AM, Mark Bennett wrote: > I'
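
For reference, a minimal sketch of how the request above can be assembled programmatically. The helper name mlt_url is hypothetical; the host, handler path, and parameter names are taken from the URL quoted in the message, and nothing here claims the request parses the way the poster wanted.

```python
from urllib.parse import urlencode

def mlt_url(text, field, base="http://localhost:8983/solr/mlt"):
    """Build a MoreLikeThis request that passes the query text via stream.body.

    mlt_url is a hypothetical helper; the parameters mirror the URL in the message.
    """
    params = [
        ("stream.body", text),   # raw text to analyze instead of a stored document
        ("mlt.fl", field),       # field whose terms drive the similarity query
        ("mlt.mintf", 0),        # minimum term frequency, as in the original URL
        ("debugQuery", "true"),  # ask Solr to include the parsed query in the response
    ]
    # urlencode escapes spaces as '+', matching the hand-written URL.
    return base + "?" + urlencode(params)

print(mlt_url("hello world", "shingle_field"))
```

Building the query string this way avoids hand-escaping longer blocks of text.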

Re: Solr 1.4 Clustering / mlt AS search?

2009-08-11 Thread Mark Bennett
Thanks Grant. *** mlb: comments inline On Tue, Aug 11, 2009 at 12:40 PM, Grant Ingersoll wrote: > Inline... > > On Aug 11, 2009, at 12:44 PM, Mark Bennett wrote: > > I'm going somewhere with this... be patient. :-) I had asked about this >> briefly at the SF meetup,

Re: Solr 1.4 Clustering / mlt AS search?

2009-08-11 Thread Mark Bennett
ses the TF/IDF stuff. Still wondering if anybody's tried MLT or Carrot clustering as a primary search entry point. On Tue, Aug 11, 2009 at 9:44 AM, Mark Bennett wrote: > I'm going somewhere with this... be patient. :-) I had asked about this > briefly at the SF meetup, but there wa

Solr 1.4 Clustering / mlt AS search?

2009-08-11 Thread Mark Bennett
e for the "more like this" feature. Take a user's search text (presumably lengthy), quickly index it, then use that new temp doc as an MLT seed doc. I haven't looked deep into the code; it might be that it uses essentially the same relevancy as a query.

Trouble with Shingle filter and query parsing / expansion

2009-08-11 Thread Mark Bennett
ng me to the single token at a time problem. I did verify that it's finding my schema, and if I put a non-existent field name in there, it certainly notices. I've tried with and without the PositionFilterFactory filter. If I comment out the shingle stage everything works.

Overview of Query Parsing API Stack? / Dismax parsing, new 1.4 parsing, etc.

2009-08-10 Thread Mark Bennett
I don't recall whether that impacted just Lucene, or if Solr was also going to be affected.

Revisiting IDF Problems and Index Slices

2009-08-06 Thread Mark Bennett
and Solr call "shingles". Using my previous examples: Question: "How can I get a widget ?" Normal index: "how", "can", "i", "get", "widget" Shingles: "how_can", "can_i", "i_get", "get_widget" Question: "How do I change the widget battery?" Normal index: "how", "do", "i", "change", "widget", "battery" Shingles: "how_do", "do_i", "i_change", "change_widget", "widget_battery" And both sets of tokens would be added to the index, in different fields, and with possibly different boosts. The idea being that tokens like "can_i" might be somewhat more statistically significant than "can" and "i" by themselves. So at least in the first question, which has all low quality words, the "can_i" might help a bit. I'd appreciate any comments on these ideas from y'all, or perhaps names of specific algorithms / authors.
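
The shingle examples above can be reproduced with a small sketch. This is plain Python, not Solr's shingle filter; the stop-word list is an assumption inferred from the examples, which drop "a" and "the" but keep words such as "how" and "i".

```python
import re

# Assumed stop-word list: inferred from the examples in the message, not from any Solr config.
STOP_WORDS = {"a", "the"}

def tokenize(text):
    """Lowercase the text, keep alphanumeric runs, and drop the assumed stop words."""
    return [t for t in re.findall(r"[a-z0-9]+", text.lower()) if t not in STOP_WORDS]

def shingles(tokens, sep="_"):
    """Join each adjacent pair of tokens into a two-word shingle."""
    return [a + sep + b for a, b in zip(tokens, tokens[1:])]

for question in ("How can I get a widget ?", "How do I change the widget battery?"):
    tokens = tokenize(question)
    print(tokens)            # would go into the normal field
    print(shingles(tokens))  # would go into a separate shingle field
```

Each list would feed its own field, so the shingle field can carry a different boost from the single-token field.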

Re: Using Luke to get terms for docs matching a specific query filter?

2009-08-03 Thread Mark Bennett
So just make sure to use rows=1? On Mon, Aug 3, 2009 at 5:51 PM, Yonik Seeley wrote: > On Mon, Aug 3, 2009 at 8:26 PM, Mark Bennett wrote: > > Yonik

Re: Using Luke to get terms for docs matching a specific query filter?

2009-08-03 Thread Mark Bennett
s faster when a field has many unique terms in the index. On Mon, Aug 3, 2009 at 2:49 PM, Yonik Seeley wrote: > Sounds like faceting? > q=state:CA&facet

Re: Using Luke to get terms for docs matching a specific query filter?

2009-08-03 Thread Mark Bennett
e='facet_fields']/lst[@name='title']/int/@name The Luke XPath (terms for all docs) is something like: /response/lst[@name='fields']/lst[@name='title']/lst[@name='topTerms']/int/@name
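
A minimal sketch of pulling term names out of a Luke-style response, assuming the usual Solr XML layout of nested lst elements keyed by a name attribute. The SAMPLE document and the top_terms helper are hypothetical, made up for illustration; they are not output from a real server.

```python
import xml.etree.ElementTree as ET

# Hypothetical response fragment shaped like the Luke output described in the message.
SAMPLE = """<response>
  <lst name="fields">
    <lst name="title">
      <lst name="topTerms">
        <int name="solr">12</int>
        <int name="lucene">7</int>
      </lst>
    </lst>
  </lst>
</response>"""

def top_terms(xml_text, field):
    """Return the term names listed under fields/<field>/topTerms."""
    root = ET.fromstring(xml_text)  # root is the <response> element
    path = "./lst[@name='fields']/lst[@name='%s']/lst[@name='topTerms']/int" % field
    # ElementTree cannot select attributes directly, so read @name from each node.
    return [node.get("name") for node in root.findall(path)]

print(top_terms(SAMPLE, "title"))
```

The facet variant would follow the same pattern with the facet path from the message in place of the Luke path.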

Using Luke to get terms for docs matching a specific query filter?

2009-08-03 Thread Mark Bennett
cuments, not just terms that are also present in the query. Thanks, Mark

Re: Solr/Lucene performance differences on Mac OS X running Tiger vs. Leopard ?

2009-07-31 Thread Mark Bennett
default was back then. Though, I wouldn't expect JVMs to be getting worse in the performance department either. More importantly, nobody has chimed in with a "yes" about Tiger vs. Leopard, and I've found no smoking gun online, so I'm thinking the OS upgrade is NOT the is

Solr/Lucene performance differences on Mac OS X running Tiger vs. Leopard ?

2009-07-30 Thread Mark Bennett
king around. Searches on Google haven't turned up any reports, so I'm suspecting the issue lies elsewhere. Also I've run on Leopard for months without any performance issues, though I really don't tax anything on my workstation.

Re: Anybody reformatted the "explain" output to be more visual?

2009-07-28 Thread Mark Bennett
use? In the XML it'd be nice to identify numeric fields. And to possibly call out operands and results. Maybe have a field that conveys "final score" (for this branch), etc. If there's interest, maybe I could mock up a draft of what the XML might look like.

Anybody reformatted the "explain" output to be more visual?

2009-07-28 Thread Mark Bennett
propriately) * Having these various sizes of fonts and boxes on one line would also be easier to visually "sum" * Maybe a mouse-over for precise numerical values Just some ideas. Visually representing dismax would be another challenge.

Different structure of standard generated query for CJK vs. Western query

2009-07-15 Thread Mark Bennett
1 to 3 or 4 would make them no longer overlap, if that's all there is to it. Ideally I'd like the cjk queries to be structured the same as the English ones. Also it'd be better if this could be done with just schema or config changes, though I realize that's not as likely. --

Confirming doc change for Wiki for schema / plugins config

2009-07-02 Thread Mark Bennett
r or not to make an "uber" Lucene class loader, and the performance impact that might have here: http://www.mail-archive.com/solr-user@lucene.apache.org/msg04487.html