hi all :)
last week I reworked an older patch for SOLR-14
https://issues.apache.org/jira/browse/SOLR-14
this functionality is actually fairly important for our ongoing
migration to solr, so I'd really love to get SOLR-14 into 1.3. but
open-source being what it is, my super-important featur
Norberto Meijome wrote:
> Hi there,
>
> Short and sweet :
> Is SCRH intended to honour qt= ?
>
>
> longer...
I'm testing the newest SCRH (SOLR-572), using last night's nightly build.
>
> I have defined a 'dismax' request handler which searches across a number
of fields. When I use the SCRH in a
Grant Ingersoll wrote:
On Jun 26, 2008, at 5:25 PM, Geoffrey Young wrote:
well *almost* - it works most excellently with q=$term but when I add
spellchecker.q=$term things implode:
HTTP Status 500 - null java.lang.NullPointerException at
org.apache.solr.handler
I had null pointer exceptions left and right while composing this
email... then I added spellcheck.build=true to one and they went away.
do you need to rebuild the spelling index every time you alter (certain
parts of) solrconfig.xml? it was very consistent as reported below, but
after simpl
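for later readers: that rebuild can be triggered explicitly by adding
spellcheck.build=true to any request to the handler, e.g. (handler name as
used elsewhere in this thread):
  http://localhost:8080/solr/spellCheckCompRH?q=*:*&spellcheck=true&spellcheck.build=true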
When I make this request:
http://localhost:8080/solr/spellCheckCompRH?q=*:*&spellcheck.q=ruck&spellcheck=true
I get this exception:
HTTP Status 500 - null java.lang.NullPointerException at
org.apache.solr.handler.component.SpellCheckComponent.getTokens(SpellCheckComponent.java:217)
I see this all the t
Shalin Shekhar Mangar wrote:
Hi Geoff,
I can't find anything in the code which would give this exception when both
q and spellcheck.q are specified. Though, this exception is certainly
possible when you restart solr. Anyway, I'll look into it more deeply.
great, thanks.
There are a few wa
Shalin Shekhar Mangar wrote:
The problems you described in the spellchecker are noted in
https://issues.apache.org/jira/browse/SOLR-622 -- I shall create an issue to
synchronize spellcheck.build so that the index is not corrupted.
I'd like to discuss this a little...
I'm not sure that I want
2. I believe there is a bug in IndexBased- and FileBasedSpellChecker.java
where the analyzer variable is only set on the build command. Therefore,
when the index is reloaded, but not built after starting solr, issuing a
query with the spellcheck.q parameter will cause a NullPointerException to
b
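until that's fixed, one workaround is to have Solr rebuild the dictionary on
every commit, so the analyzer is always initialized after a restart. a sketch
of the relevant solrconfig.xml block (names and paths here are illustrative):
  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="field">spell</str>
      <str name="spellcheckIndexDir">./spellchecker</str>
      <str name="buildOnCommit">true</str>
    </lst>
  </searchComponent>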
dudes dudes wrote:
Hi,
I'm trying to couple the spell-checking mechanism with faceting in one URL... I can get the spell check right, but faceting doesn't work when it's combined
with the spell-checker...
http://localhost:8080/solr/spellCheckCompRH?q=smath&spellcheck.q=smath&spellcheck=tr
Jonathan Lee wrote:
I don't see the patch attached to my original email either -- does solr-user
not allow attachments?
This is ugly, but here's the patch inline:
issue created in jira:
https://issues.apache.org/jira/browse/SOLR-648
--Geoff
This issue has been fixed in the trunk. Can you please use the latest trunk
code and try?
current trunk looks good.
thanks!
--Geoff
Andrew Nagy wrote:
Hello - I am attempting to add the spellCheck component in my
"search" requesthandler so when a users does a search, they get the
results and spelling corrections all in one query just like the way
the facets work.
I am having some trouble accomplishing this - can anyone poi
Andrew Nagy wrote:
Thanks for getting back to me Geoff. Although, that is pretty much
what I have. Maybe if I show my solrconfig someone might be able to
point out what I have incorrect? The problem is that nothing related
to the spelling options is shown in the results, just the normal
expe
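for what it's worth, the usual wiring hangs the component off the handler's
last-components list. a sketch (handler and component names are assumptions):
  <requestHandler name="/search" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="spellcheck">true</str>
    </lst>
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
  </requestHandler>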
hi all :)
I'm sorry I need to ask this, but after reading and re-reading the wiki
I don't see a clear path...
I have a well-formed xml file, suitable for POSTting to solr. that
works just fine. it's very large, though, and using curl in production
is so very lame. is there a very simple config
Chris Hostetter wrote:
> : I have a well-formed xml file, suitable for POSTting to solr. that
> : works just fine. it's very large, though, and using curl in production
> : is so very lame. is there a very simple config that will let solr just
> : slurp up the file via the DataImportHandler?
Geoffrey Young wrote:
>
> Chris Hostetter wrote:
>> : I have a well-formed xml file, suitable for POSTting to solr. that
>> : works just fine. it's very large, though, and using curl in production
>> : is so very lame. is there a very simple config that will le
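if DataImportHandler is on the table, one possibility (hedged: this assumes
the file is in Solr's own <add><doc> format and that XPathEntityProcessor's
useSolrAddSchema mode fits; the path is illustrative) is a data-config along
these lines:
  <dataConfig>
    <dataSource type="FileDataSource" encoding="UTF-8"/>
    <document>
      <entity name="docs" processor="XPathEntityProcessor"
              stream="true" useSolrAddSchema="true"
              url="/path/to/big-file.xml"/>
    </document>
  </dataConfig>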
Chris Hostetter wrote:
> : I chug away at 1.5 million records in a single file, but solr never
> : commits. specifically, it ignores my settings. (I can
> : commit separately at the end, of course :)
>
> the way the autocommit settings work is something I always get confused by
> -- the aut
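for reference, the settings being discussed live under updateHandler in
solrconfig.xml. a sketch with illustrative thresholds (documents or
milliseconds, whichever trips first):
  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxDocs>10000</maxDocs>
      <maxTime>60000</maxTime>
    </autoCommit>
  </updateHandler>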
sunnyfr wrote:
> I tried last evening before leaving, and this
> morning the elapsed time was very long, as you can see above, and no
> snapshot and no errors in the logs.
I'm actually having a similar trouble. I've enabled postCommit and
postOptimize hooks with an absolute path to snapshooter.
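for comparison, the hook wiring in solrconfig.xml normally looks something
like this (the exe path here is illustrative):
  <listener event="postCommit" class="solr.RunExecutableListener">
    <str name="exe">/var/solr/bin/snapshooter</str>
    <str name="dir">.</str>
    <bool name="wait">true</bool>
  </listener>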
hi all :)
I was surprised to find that snapshooter didn't account for the
spellcheck dictionary. but then again, since you can call it whatever
you want I guess it couldn't.
so, how are people distributing their dictionaries across their slaves?
since it takes so long to generate, I can't see i
hi all :)
I'm having difficulty filtering my documents when a field is either
blank or set to a specific value. I would have thought this would work
fq=-Type:[* TO *] OR Type:blue
which I would expect to find all document where either Type is undefined
or Type is "blue". my actual result se
Lance Norskog wrote:
> Try: Type:blue OR -Type:[* TO *]
>
> You can't have a negative clause at the beginning. Yes, Lucene should barf
> about this.
I did try that, before and again now, and still no luck.
anything else?
--Geoff
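one workaround that is often suggested (assuming the standard lucene query
parser is in play): give the negative clause an explicit positive set to
subtract from, since a purely negative clause inside a disjunction matches
nothing by itself:
  fq=Type:blue OR (*:* -Type:[* TO *])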
hi all :)
I have two filters combined with dismax on the query side:
WordDelimiterFilterFactory { preserveOriginal=1,
generateNumberParts=1, catenateWords=0, generateWordParts=1,
catenateAll=0, catenateNumbers=0}
followed by the lowercase filter factory. the analyzer shows the phrase
gUYS and d
hi all :)
I'm having trouble with camel-cased query strings and the dismax handler.
a user query
LeAnn Rimes
isn't matching the indexed term
Leann Rimes
even though both are lower-cased in the end. furthermore, the
analysis tool shows a match.
the debug query looks like
"parsedquery":"+
On Wed, May 13, 2009 at 6:23 AM, Yonik Seeley
wrote:
> On Tue, May 12, 2009 at 7:19 PM, Geoffrey Young
> wrote:
>> hi all :)
>>
>> I'm having trouble with camel-cased query strings and the dismax handler.
>>
>> a user query
>>
>> LeAnn Rim
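one commonly suggested fix for this class of problem (an assumption here,
since the full analyzer chains aren't shown) is to make the index- and
query-time WordDelimiterFilter agree on case splits, catenating and/or
preserving the original so that "LeAnn" can still meet "leann":
  <filter class="solr.WordDelimiterFilterFactory"
          splitOnCaseChange="1" preserveOriginal="1"
          generateWordParts="1" catenateWords="1"/>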
hi all :)
I'm just getting up to speed with solr (and lucene, for that matter) for
a new project. after reading through the available docs I'm not finding
an answer to my most basic (newbie, certainly) question. please feel
free to just point me to the proper doc :)
this isn't my actual us
hi :)
I'm trying to work out a schema for our widgets. more than "just coming
up with something" I'd like something idiomatic in solr terms. any help
is much appreciated. here's a similar problem space to what I'm working
with...
let's say we're talking books. books are written by authors
t
more complex than the examples out there.
thanks for the feedback.
--Geoff
Otis
-- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Rachel McConnell wrote:
Our Solr use consists of several rather different data types, some of
which have one-to-many relationships with other types. We don't need
to do any searching of quite the kind you describe, but I have an idea
about it, depending on what you need to do with the book dat
the trouble I'm having is one of dimension. an author has many, many
attributes (name, birthdate, biography in $language, etc). as does
each book (title in $language, summary in $language, genre, etc). as
does each library (name, address, directions in $language, etc). so
an author with N b
hi all :)
I didn't see any documentation on this, so I was wondering what the
experience here was with updating solr with a small but constant trickle
of daemon-style updates.
unfortunately, it's a business requirement that backend db updates make
it to search as the changes roll in (5 minut
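one pattern that fits a trickle-update freshness budget (assuming a Solr
version with commitWithin support; field names are illustrative): post small
batches and let Solr bound the commit latency, e.g. five minutes:
  <add commitWithin="300000">
    <doc>
      <field name="id">123</field>
    </doc>
  </add>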
pointer to
SOLR-303 - I found the distributed search docs from there and will keep
that in mind as I move forward.
--Geoff
Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
hi :)
I've noticed that (with solr 1.2) the returned order (as well as the
actual matched set) is affected by the number of matches you ask for:
q=hanna&suggestionCount=1
"suggestions":["Yanna"]
q=hanna&suggestionCount=2
"suggestions":["Manna",
"Yanna"]
q=hanna&suggestion
Shalin Shekhar Mangar wrote:
Hi Geoffrey,
Yes, this is a caveat in the lucene contrib spellchecker which Solr uses.
From the lucene spell checker javadocs:
As the Lucene similarity that is used to fetch the most relevant n-grammed
terms is not the same as the edit distance strategy us
Otis Gospodnetic wrote:
Not in one place and documented. The places to look are the query parsers, but
things like AND, OR, NOT, and TO are the ones to look out for.
this seems like something solr ought to handle gracefully on the backend
for me - if I need to write logic to make sure a malicious quer
hi :)
I'm looking for a filter that will compress all tokens into a single
token. the WordDelimiterFilterFactory does it for tokens it finds
itself, but not ones passed to it.
basically, I'm trying to match
Radiohead
in the index with
radio head
in the query. if it were spelled Radio
Yonik Seeley wrote:
If there are only a few such cases, it might be better to use synonyms
to correct them.
unfortunately, there are too many to handle this way.
Off the top of my head there's no concatenating token filter, but it
wouldn't be hard to make one.
hmm, ok. I'm not a java guy
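for anyone curious later: a concatenating filter along the lines Yonik
describes might look roughly like this against the newer attribute-based
TokenStream API. this is a hypothetical sketch (class name and behavior are
assumptions, not a shipped Solr filter):

  import java.io.IOException;
  import org.apache.lucene.analysis.TokenFilter;
  import org.apache.lucene.analysis.TokenStream;
  import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

  // Collapses every token produced upstream into a single concatenated
  // token, e.g. "radio", "head" -> "radiohead".
  public final class ConcatenateAllFilter extends TokenFilter {
    private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
    private boolean done;

    public ConcatenateAllFilter(TokenStream input) {
      super(input);
    }

    @Override
    public boolean incrementToken() throws IOException {
      if (done) return false;
      done = true;
      // Drain the upstream stream, appending each term's characters.
      StringBuilder buf = new StringBuilder();
      while (input.incrementToken()) {
        buf.append(termAtt.buffer(), 0, termAtt.length());
      }
      if (buf.length() == 0) return false;
      // Emit exactly one token holding the concatenation.
      clearAttributes();
      termAtt.setEmpty().append(buf);
      return true;
    }

    @Override
    public void reset() throws IOException {
      super.reset();
      done = false;
    }
  }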
Walter Underwood wrote:
I've been doing it with synonyms and I have several hundred of them.
I'm dealing mostly with proper names, so I expect more like 80k of them
for our data :)
Concatenating bi-word groups is pretty useful for English. We have a
habit of gluing words together. "datab
Walter Underwood wrote:
I doubt it would be that many. I recommend tracking the searches and
the clicks, and working on queries with low clickthrough.
the trouble is I'm in a dynamic biz - last week's popular clicks are very
different from this week's, so by the time I analyze last week's popul
Otis Gospodnetic wrote:
Geoff,
Whether synonyms are applied at index time or query time is
controlled via schema.xml - it depends on where you put the synonym
factory, whether in the index-time or query-time section of a
fieldType. Synonyms are read once on start, I believe. It might be
good
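concretely, the placement Otis describes looks like this in schema.xml (a
sketch; the field type, tokenizer, and file name are illustrative). put the
filter under the index-time analyzer for index-time expansion, or under the
query-time analyzer for query-time:
  <fieldType name="text" class="solr.TextField">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
              ignoreCase="true" expand="true"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    </analyzer>
  </fieldType>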
Erik Hatcher wrote:
What field type is chapterTitle? I'm betting it is an analyzed field
with multiple values (tokens/terms) per document. To successfully sort,
you'll need to have a single value per document - using copyField can
help with this to have both a searchable field and a sortab
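a sketch of the arrangement Erik describes (field and type names are
illustrative): search against the analyzed field, sort on an untokenized copy:
  <field name="chapterTitle" type="text" indexed="true" stored="true"/>
  <field name="chapterTitleSort" type="string" indexed="true" stored="false"/>
  <copyField source="chapterTitle" dest="chapterTitleSort"/>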
Otis Gospodnetic wrote:
There is actually a Wiki page explaining this pretty well... have you
seen it?
I guess not. I've been reading the wiki, but the trouble with wikis
always seems to be (for me) finding stuff. can you point it out?
Index-time expansion means larger indices and ina
hi :)
I'm having an interesting problem with my data. in general, I want the
results of the WordDelimiterFilter for better matching, but there are
times when it's just too aggressive. for example
boys2men => boys 2 men (good)
p!nk => pnk (maybe)
!!! => (nothing - bad)
there'
Chris Hostetter wrote:
by "expand=true" it sounds like you mean you are looking for a way to
preserve the orriginal term without any characteres removed.
yes, that's it.
This sounds like SOLR-14 ... you might want to take a look at it, and see
if the patch is still usable, and if not see
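for later readers: the behavior asked about here eventually surfaced as an
attribute on the filter itself. a sketch (the other attributes shown are
illustrative):
  <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1"
          generateWordParts="1" generateNumberParts="1"/>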
Solr allows you to specify filters in separate parameters that are
applied to the main query, but cached separately.
q=the user query&fq=folder:f13&fq=folder:f24
I've been wanting more explanation around this for a while, so maybe now
is a good time to ask :)
the "cached separately" verbi
climbingrose wrote:
It depends on your query. The second query is better if you know that
fieldb:bar filtered query will be reused often since it will be cached
separately from the query. The first query occupies one cache entry while
the second one occupies two cache entries, one in queryCac
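the "cached separately" part refers to the filterCache configured in
solrconfig.xml; a sketch with illustrative sizes:
  <filterCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="128"/>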
hi all :)
maybe I'm just missing it, but I don't see a way to consistently (and
easily) know the number of returned documents without a lot of acrobatics.
numFound represents the entire number of matching documents
if numFound <= rows then numFound is the number of documents in the
respo
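for example (a sketch of the standard JSON response shape, values
illustrative): with rows=10 and 1500 matches, numFound reports 1500 while the
docs array holds only the 10 actually returned:
  {"response":{"numFound":1500,"start":0,"docs":[ ... 10 docs ... ]}}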
Walter Underwood wrote:
Parse the results into a list and do something like this:
Math.min(results.size(), numFound) // Java
min(len(results), numFound) # Python
we're using json, but sure, I can calculate it :)
Doesn't seem all that hard, and not worth duplicating the info.
Chris Hostetter wrote:
: not hard, but useful information to have handy without additional
: manipulations on my part.
: our pages are the results of multiple queries. so, given a max number of
: records per page (or total), the rows asked of query2 is max - query1, of
in the common case, co