hi all :)
last week I reworked an older patch for SOLR-14
https://issues.apache.org/jira/browse/SOLR-14
this functionality is actually fairly important for our ongoing
migration to solr, so I'd really love to get SOLR-14 into 1.3. but
open-source being what it is, my super-important featur
Norberto Meijome wrote:
> Hi there,
>
> Short and sweet :
> Is SCRH intended to honour qt= ?
>
>
> longer...
I'm testing the newest SCRH (SOLR-572), using last night's nightly build.
>
> I have defined a 'dismax' request handler which searches across a number
of fields. When I use the SCRH in a
Grant Ingersoll wrote:
On Jun 26, 2008, at 5:25 PM, Geoffrey Young wrote:
well *almost* - it works most excellently with q=$term but when I add
spellchecker.q=$term things implode:
HTTP Status 500 - null java.lang.NullPointerException at
org.apache.solr.handler
I had null pointer exceptions left and right while composing this
email... then I added spellcheck.build=true to one and they went away.
do you need to rebuild the spelling index every time you alter (certain
parts of) solrconfig.xml? it was very consistent as reported below, but
after simpl
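for later readers: that rebuild can be triggered explicitly by adding
spellcheck.build=true to any request to the handler, e.g. (handler name as
used elsewhere in this thread):
  http://localhost:8080/solr/spellCheckCompRH?q=*:*&spellcheck=true&spellcheck.build=true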
When I make this request:
http://localhost:8080/solr/spellCheckCompRH?q=*:*&spellcheck.q=ruck&spellcheck=true
I get this exception:
HTTP Status 500 - null java.lang.NullPointerException at
org.apache.solr.handler.component.SpellCheckComponent.getTokens(SpellCheckComponent.java:217)
I see this all the t
Shalin Shekhar Mangar wrote:
Hi Geoff,
I can't find anything in the code which would give this exception when both
q and spellcheck.q are specified. Though, this exception is certainly
possible when you restart solr. Anyway, I'll look into it more deeply.
great, thanks.
There are a few wa
Shalin Shekhar Mangar wrote:
The problems you described in the spellchecker are noted in
https://issues.apache.org/jira/browse/SOLR-622 -- I shall create an issue to
synchronize spellcheck.build so that the index is not corrupted.
I'd like to discuss this a little...
I'm not sure that I want
2. I believe there is a bug in IndexBased- and FileBasedSpellChecker.java
where the analyzer variable is only set on the build command. Therefore,
when the index is reloaded, but not built after starting solr, issuing a
query with the spellcheck.q parameter will cause a NullPointerException to
b
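until that's fixed, one workaround is to have Solr rebuild the dictionary on
every commit, so the analyzer is always initialized after a restart. a sketch
of the relevant solrconfig.xml block (names and paths here are illustrative):
  <searchComponent name="spellcheck" class="solr.SpellCheckComponent">
    <lst name="spellchecker">
      <str name="name">default</str>
      <str name="field">spell</str>
      <str name="spellcheckIndexDir">./spellchecker</str>
      <str name="buildOnCommit">true</str>
    </lst>
  </searchComponent>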
dudes dudes wrote:
Hi,
I'm trying to couple the spell-checking mechanism with faceting in one URL... I can get the spell check right, but faceting doesn't work when it's combined
with the spell-checker...
http://localhost:8080/solr/spellCheckCompRH?q=smath&spellcheck.q=smath&spellcheck=tr
Jonathan Lee wrote:
I don't see the patch attached to my original email either -- does solr-user
not allow attachments?
This is ugly, but here's the patch inline:
issue created in jira:
https://issues.apache.org/jira/browse/SOLR-648
--Geoff
This issue has been fixed in the trunk. Can you please use the latest trunk
code and try?
current trunk looks good.
thanks!
--Geoff
Andrew Nagy wrote:
Hello - I am attempting to add the spellCheck component in my
"search" requesthandler so when a users does a search, they get the
results and spelling corrections all in one query just like the way
the facets work.
I am having some trouble accomplishing this - can anyone poi
Andrew Nagy wrote:
Thanks for getting back to me Geoff. Although, that is pretty much
what I have. Maybe if I show my solrconfig someone might be able to
point out what I have incorrect? The problem is that nothing related
to the spelling options is shown in the results, just the normal
expe
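for what it's worth, the usual wiring hangs the component off the handler's
last-components list. a sketch (handler and component names are assumptions):
  <requestHandler name="/search" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="spellcheck">true</str>
    </lst>
    <arr name="last-components">
      <str>spellcheck</str>
    </arr>
  </requestHandler>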
hi all :)
I'm sorry I need to ask this, but after reading and re-reading the wiki
I don't see a clear path...
I have a well-formed xml file, suitable for POSTting to solr. that
works just fine. it's very large, though, and using curl in production
is so very lame. is there a very simple config
Chris Hostetter wrote:
> : I have a well-formed xml file, suitable for POSTting to solr. that
> : works just fine. it's very large, though, and using curl in production
> : is so very lame. is there a very simple config that will let solr just
> : slurp up the file via the DataImportHandler?
Geoffrey Young wrote:
>
> Chris Hostetter wrote:
>> : I have a well-formed xml file, suitable for POSTting to solr. that
>> : works just fine. it's very large, though, and using curl in production
>> : is so very lame. is there a very simple config that will le
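if DataImportHandler is on the table, one possibility (hedged: this assumes
the file is in Solr's own <add><doc> format and that XPathEntityProcessor's
useSolrAddSchema mode fits; the path is illustrative) is a data-config along
these lines:
  <dataConfig>
    <dataSource type="FileDataSource" encoding="UTF-8"/>
    <document>
      <entity name="docs" processor="XPathEntityProcessor"
              stream="true" useSolrAddSchema="true"
              url="/path/to/big-file.xml"/>
    </document>
  </dataConfig>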
Chris Hostetter wrote:
> : I chug away at 1.5 million records in a single file, but solr never
> : commits. specifically, it ignores my settings. (I can
> : commit separately at the end, of course :)
>
> the way the autocommit settings work is something I always get confused by
> -- the aut
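for reference, the settings being discussed live under updateHandler in
solrconfig.xml. a sketch with illustrative thresholds (documents or
milliseconds, whichever trips first):
  <updateHandler class="solr.DirectUpdateHandler2">
    <autoCommit>
      <maxDocs>10000</maxDocs>
      <maxTime>60000</maxTime>
    </autoCommit>
  </updateHandler>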
sunnyfr wrote:
> I tried last evening before leaving, and this
> morning the elapsed time was very long, as you can see above, and no
> snapshot and no errors in the logs.
I'm actually having a similar trouble. I've enabled postCommit and
postOptimize hooks with an absolute path to snapshooter.
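for comparison, the hook wiring in solrconfig.xml normally looks something
like this (the exe path here is illustrative):
  <listener event="postCommit" class="solr.RunExecutableListener">
    <str name="exe">/var/solr/bin/snapshooter</str>
    <str name="dir">.</str>
    <bool name="wait">true</bool>
  </listener>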
hi all :)
I was surprised to find that snapshooter didn't account for the
spellcheck dictionary. but then again, since you can call it whatever
you want I guess it couldn't.
so, how are people distributing their dictionaries across their slaves?
since it takes so long to generate, I can't see i
hi all :)
I'm having difficulty filtering my documents when a field is either
blank or set to a specific value. I would have thought this would work
fq=-Type:[* TO *] OR Type:blue
which I would expect to find all document where either Type is undefined
or Type is "blue". my actual result se
Lance Norskog wrote:
> Try: Type:blue OR -Type:[* TO *]
>
> You can't have a negative clause at the beginning. Yes, Lucene should barf
> about this.
I did try that, before and again now, and still no luck.
anything else?
--Geoff
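one workaround that is often suggested (assuming the standard lucene query
parser is in play): give the negative clause an explicit positive set to
subtract from, since a purely negative clause inside a disjunction matches
nothing by itself:
  fq=Type:blue OR (*:* -Type:[* TO *])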
hi all :)
I have two filters combined with dismax on the query side:
WordDelimiterFilterFactory { preserveOriginal=1,
generateNumberParts=1, catenateWords=0, generateWordParts=1,
catenateAll=0, catenateNumbers=0}
followed by the lowercase filter factory. the analyzer shows the phrase
gUYS and d
hi all :)
I'm having trouble with camel-cased query strings and the dismax handler.
a user query
LeAnn Rimes
isn't matching the indexed term
Leann Rimes
even though both are lower-cased in the end. furthermore, the
analysis tool shows a match.
the debug query looks like
"parsedquery":"+
On Wed, May 13, 2009 at 6:23 AM, Yonik Seeley
wrote:
> On Tue, May 12, 2009 at 7:19 PM, Geoffrey Young
> wrote:
>> hi all :)
>>
>> I'm having trouble with camel-cased query strings and the dismax handler.
>>
>> a user query
>>
>> LeAnn Rim
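one commonly suggested fix for this class of problem (an assumption here,
since the full analyzer chains aren't shown) is to make the index- and
query-time WordDelimiterFilter agree on case splits, catenating and/or
preserving the original so that "LeAnn" can still meet "leann":
  <filter class="solr.WordDelimiterFilterFactory"
          splitOnCaseChange="1" preserveOriginal="1"
          generateWordParts="1" catenateWords="1"/>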
hi all :)
I'm just getting up to speed with solr (and lucene, for that matter) for
a new project. after reading through the available docs I'm not finding
an answer to my most basic (newbie, certainly) question. please feel
free to just point me to the proper doc :)
this isn't my actual us
hi :)
I'm trying to work out a schema for our widgets. more than "just coming
up with something" I'd like something idiomatic in solr terms. any help
is much appreciated. here's a similar problem space to what I'm working
with...
let's say we're talking books. books are written by authors
t
more complex than the examples out there.
thanks for the feedback.
--Geoff
Otis
-- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Rachel McConnell wrote:
Our Solr use consists of several rather different data types, some of
which have one-to-many relationships with other types. We don't need
to do any searching of quite the kind you describe, but I have an idea
about it, depending on what you need to do with the book dat
the trouble I'm having is one of dimension. an author has many, many
attributes (name, birthdate, biography in $language, etc). as does
each book (title in $language, summary in $language, genre, etc). as
does each library (name, address, directions in $language, etc). so
an author with N b
hi all :)
I didn't see any documentation on this, so I was wondering what the
experience here was with updating solr with a small but constant trickle
of daemon-style updates.
unfortunately, it's a business requirement that backend db updates make
it to search as the changes roll in (5 minut
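one pattern that fits a trickle-update freshness budget (assuming a Solr
version with commitWithin support; field names are illustrative): post small
batches and let Solr bound the commit latency, e.g. five minutes:
  <add commitWithin="300000">
    <doc>
      <field name="id">123</field>
    </doc>
  </add>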
pointer to
SOLR-303 - I found the distributed search docs from there and will keep
that in mind as I move forward.
--Geoff
Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
hi :)
I've noticed that (with solr 1.2) the returned order (as well as the
actual matched set) is affected by the number of matches you ask for:
q=hanna&suggestionCount=1
"suggestions":["Yanna"]
q=hanna&suggestionCount=2
"suggestions":["Manna",
"Yanna"]
q=hanna&suggestion
Shalin Shekhar Mangar wrote:
Hi Geoffrey,
Yes, this is a caveat in the lucene contrib spellchecker which Solr uses.
From the lucene spell checker javadocs:
As the Lucene similarity that is used to fetch the most relevant n-grammed
terms is not the same as the edit distance strategy us
Otis Gospodnetic wrote:
Not in one place and documented. The places to look are the query parsers, but
things like AND, OR, NOT, and TO are the ones to look out for.
this seems like something solr ought to handle gracefully on the backend
for me - if I need to write logic to make sure a malicious quer
hi :)
I'm looking for a filter that will compress all tokens into a single
token. the WordDelimiterFilterFactory does it for tokens it finds
itself, but not ones passed to it.
basically, I'm trying to match
Radiohead
in the index with
radio head
in the query. if it were spelled Radio
Yonik Seeley wrote:
If there are only a few such cases, it might be better to use synonyms
to correct them.
unfortunately, there are too many to handle this way.
Off the top of my head there's no concatenating token filter, but it
wouldn't be hard to make one.
hmm, ok. I'm not a java guy
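for anyone curious later: a concatenating filter along the lines Yonik
describes might look roughly like this against the newer attribute-based
TokenStream API. this is a hypothetical sketch (class name and behavior are
assumptions, not a shipped Solr filter):

  import java.io.IOException;
  import org.apache.lucene.analysis.TokenFilter;
  import org.apache.lucene.analysis.TokenStream;
  import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

  // Collapses every token produced upstream into a single concatenated
  // token, e.g. "radio", "head" -> "radiohead".
  public final class ConcatenateAllFilter extends TokenFilter {
    private final CharTermAttribute termAtt = addAttribute(CharTermAttribute.class);
    private boolean done;

    public ConcatenateAllFilter(TokenStream input) {
      super(input);
    }

    @Override
    public boolean incrementToken() throws IOException {
      if (done) return false;
      done = true;
      // Drain the upstream stream, appending each term's characters.
      StringBuilder buf = new StringBuilder();
      while (input.incrementToken()) {
        buf.append(termAtt.buffer(), 0, termAtt.length());
      }
      if (buf.length() == 0) return false;
      // Emit exactly one token holding the concatenation.
      clearAttributes();
      termAtt.setEmpty().append(buf);
      return true;
    }

    @Override
    public void reset() throws IOException {
      super.reset();
      done = false;
    }
  }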
Walter Underwood wrote:
I've been doing it with synonyms and I have several hundred of them.
I'm dealing mostly with proper names, so I expect more like 80k of them
for our data :)
Concatenating bi-word groups is pretty useful for English. We have a
habit of gluing words together. "datab
Walter Underwood wrote:
I doubt it would be that many. I recommend tracking the searches and
the clicks, and working on queries with low clickthrough.
the trouble is I'm in a dynamic biz - last week's popular clicks are very
different from this week's, so by the time I analyze last week's popul
Otis Gospodnetic wrote:
Geoff,
Whether synonyms are applied at index time or query time is
controlled via schema.xml - it depends on where you put the synonym
factory, whether in the index-time or query-time section of a
fieldType. Synonyms are read once on start, I believe. It might be
good
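concretely, the placement Otis describes looks like this in schema.xml (a
sketch; the field type, tokenizer, and file name are illustrative). put the
filter under the index-time analyzer for index-time expansion, or under the
query-time analyzer for query-time:
  <fieldType name="text" class="solr.TextField">
    <analyzer type="index">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
              ignoreCase="true" expand="true"/>
    </analyzer>
    <analyzer type="query">
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    </analyzer>
  </fieldType>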
Erik Hatcher wrote:
What field type is chapterTitle? I'm betting it is an analyzed field
with multiple values (tokens/terms) per document. To successfully sort,
you'll need to have a single value per document - using copyField can
help with this to have both a searchable field and a sortab
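a sketch of the arrangement Erik describes (field and type names are
illustrative): search against the analyzed field, sort on an untokenized copy:
  <field name="chapterTitle" type="text" indexed="true" stored="true"/>
  <field name="chapterTitleSort" type="string" indexed="true" stored="false"/>
  <copyField source="chapterTitle" dest="chapterTitleSort"/>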
Otis Gospodnetic wrote:
There is actually a Wiki page explaining this pretty well... have you
seen it?
I guess not. I've been reading the wiki, but the trouble with wikis
always seems to be (for me) finding stuff. can you point it out?
Index-time expansion means larger indices and ina
hi :)
I'm having an interesting problem with my data. in general, I want the
results of the WordDelimiterFilter for better matching, but there are
times when it's just too aggressive. for example
boys2men => boys 2 men (good)
p!nk => pnk (maybe)
!!! => (nothing - bad)
there'
Chris Hostetter wrote:
by "expand=true" it sounds like you mean you are looking for a way to
preserve the orriginal term without any characteres removed.
yes, that's it.
This sounds like SOLR-14 ... you might want to take a look at it, and see
if the patch is still usable, and if not see
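for later readers: the behavior asked about here eventually surfaced as an
attribute on the filter itself. a sketch (the other attributes shown are
illustrative):
  <filter class="solr.WordDelimiterFilterFactory" preserveOriginal="1"
          generateWordParts="1" generateNumberParts="1"/>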
Solr allows you to specify filters in separate parameters that are
applied to the main query, but cached separately.
q=the user query&fq=folder:f13&fq=folder:f24
I've been wanting more explanation around this for a while, so maybe now
is a good time to ask :)
the "cached separately" verbi
climbingrose wrote:
It depends on your query. The second query is better if you know that
fieldb:bar filtered query will be reused often since it will be cached
separately from the query. The first query occupies one cache entry while
the second one occupies two cache entries, one in queryCac
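the "cached separately" part refers to the filterCache configured in
solrconfig.xml; a sketch with illustrative sizes:
  <filterCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="128"/>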
hi all :)
maybe I'm just missing it, but I don't see a way to consistently (and
easily) know the number of returned documents without a lot of acrobatics.
numFound represents the entire number of matching documents
if numFound <= rows then numFound is the number of documents in the
respo
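for example (a sketch of the standard JSON response shape, values
illustrative): with rows=10 and 1500 matches, numFound reports 1500 while the
docs array holds only the 10 actually returned:
  {"response":{"numFound":1500,"start":0,"docs":[ ... 10 docs ... ]}}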
Walter Underwood wrote:
Parse the results into a list and do something like this:
Math.min(results.size(), numFound) // Java
min(len(results), numFound) # Python
we're using json, but sure, I can calculate it :)
Doesn't seem all that hard, and not worth duplicating the info.
Chris Hostetter wrote:
: not hard, but useful information to have handy without additional
: manipulations on my part.
: our pages are the results of multiple queries. so, given a max number of
: records per page (or total), the rows asked of query2 is max - query1, of
in the common case, co