Hi,
I didn't hear any responses here, so I went ahead and made a bunch of
changes to the highlighting parameters wiki:
- Highlighter is now known as Original Highlighter so it's more clear
that Highlighter doesn't just refer to the highlighting utilities generally.
- I need help with fragsize
Hi, I'm having some issues trying to sort my grouped results by more than one
field. If I use just one, regardless of which, it works fine (I
mean it sorts).
I have a case where the first sorting key is equal for all the head docs of each
group, so I expect to return the groups sor
Straying a bit from the subject,
don't you think it will be useful to have the shards parameter used also in
the index, in order to maintain document uniqueness?
I mean as an out of the box feature of Solr.
Because the situation today is that a Solr client working with a sharded
Solr is respons
It's a bit of a privacy through obscurity measure, unfortunately. The
problem is that American courts do a lousy job of removing social
security numbers from cases that I put on my site. I do anonymization
before sending the cases to Solr, but if you're clever (and the
stopwords weren't in plac
Hi Tanner,
Here is another simple way: AutoComplete.
You know what your users are searching for, you can identify top queries and
you can identify common queries that are not finding matches. This all allows
you to figure out what to feed in AutoComplete. And hopefully your
AutoComplete doesn
I was afraid you would say that.
See http://fora.tv/2009/10/14/ACM_Data_Mining_SIG_Ted_Dunning#fullprogram,
click on the Recommendations section to skip to the good part.
The point is that cross recommendation can let you learn what sorts of
rewrites of this kind are needed. The idea is that you
You mention "that is one way to do it". Is there another I'm not seeing?
On Jan 10, 2012, at 4:34 PM, Ted Dunning wrote:
> On Tue, Jan 10, 2012 at 5:32 PM, Tanner Postert
> wrote:
>
>> We've had some issues with people searching for a document with the
>> search term '200 movies'. The document i
On Tue, Jan 10, 2012 at 5:32 PM, Tanner Postert wrote:
> We've had some issues with people searching for a document with the
> search term '200 movies'. The document is actually titled 'two hundred
> movies'.
>
> Do we need to add every number to our synonyms dictionary to
> accomplish this?
Tha
Thanks for your reply.
I added the argument in the solrconfig.xml and it worked like a charm.
Thanks again
Minh
-Original Message-
From: Koji Sekiguchi [mailto:k...@r.email.ne.jp]
Sent: Tuesday, January 10, 2012 01:25
To: solr-user@lucene.apache.org
Subject: Re: ignoreTikaException value
We've had some issues with people searching for a document with the
search term '200 movies'. The document is actually titled 'two hundred
movies'.
Do we need to add every number to our synonyms dictionary to
accomplish this? Is it best done at index or search time?
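If the synonyms route is taken, the entries do not have to be typed by hand. Below is a minimal sketch, assuming a comma-separated synonyms file in which each line lists equivalent terms; the helper names `number_to_words` and `synonym_lines` and the 0-999 range are illustrative assumptions, not anything from this thread. Note that multi-word synonyms such as "two hundred" are generally easier to handle at index time than at query time.

```python
# Sketch: generate "digits, words" synonym lines for small numbers.
# The function names and the 0-999 range are assumptions for illustration.

ONES = [
    "zero", "one", "two", "three", "four", "five", "six", "seven", "eight",
    "nine", "ten", "eleven", "twelve", "thirteen", "fourteen", "fifteen",
    "sixteen", "seventeen", "eighteen", "nineteen",
]
TENS = ["", "", "twenty", "thirty", "forty", "fifty",
        "sixty", "seventy", "eighty", "ninety"]


def number_to_words(n):
    """Spell out an integer in the range 0..999 in English words."""
    if not 0 <= n < 1000:
        raise ValueError("only 0..999 is supported in this sketch")
    if n < 20:
        return ONES[n]
    if n < 100:
        tens, ones = divmod(n, 10)
        return TENS[tens] if ones == 0 else f"{TENS[tens]} {ONES[ones]}"
    hundreds, rest = divmod(n, 100)
    head = f"{ONES[hundreds]} hundred"
    return head if rest == 0 else f"{head} {number_to_words(rest)}"


def synonym_lines(start, stop):
    """Return one 'digits, words' line per number in range(start, stop)."""
    return [f"{n}, {number_to_words(n)}" for n in range(start, stop)]


if __name__ == "__main__":
    # Print a small slice; redirect to a file to build a synonyms list.
    for line in synonym_lines(198, 203):
        print(line)
```

The output can be appended to an existing synonyms file; whether it is applied at index or search time is still a schema decision.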
my copyField was defined as copyfield <--- notice the lowercase f
On Tue, Jan 10, 2012 at 2:50 PM, Dyer, James wrote:
> Three things to check:
>
> 1. Use a higher spellcheck.count than 1. Try 10. IndexBasedSpellChecker
> pre-filters the possibilities in a first pass of a 2-pass process.
Three things to check:
1. Use a higher spellcheck.count than 1. Try 10. IndexBasedSpellChecker
pre-filters the possibilities in a first pass of a 2-pass process. If
spellcheck.count is too low, all the good suggestions might get filtered on the
first pass and then it won't find anything on
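As a rough illustration of the first suggestion, the request below raises `spellcheck.count` to 10. This is only a sketch: the base URL, the `/select` handler path, and the helper name `spellcheck_url` are assumptions, and it presumes the spellcheck component is attached to that handler.

```python
from urllib.parse import urlencode


def spellcheck_url(base_url, query, count=10):
    """Build a query URL that asks for up to `count` spelling suggestions."""
    params = {
        "q": query,
        "spellcheck": "true",
        "spellcheck.count": count,  # higher than 1, as suggested above
    }
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical local instance; adjust host, port, and handler as needed.
    print(spellcheck_url("http://localhost:8983/solr/select", "exmaple"))
```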
I am trying to get the IndexBasedSpellChecker to work. I believe I have
everything set up properly and the spellcheck component seems to be running,
but the suggestions list is empty.
I am using Solr 3.5 with Jetty.
My solrconfig.xml and schema.xml are as follows:
solrconfig.xml: http://pastie.o
Right, but that says exactly nothing about how that identifier is used. --wunder
On Jan 10, 2012, at 11:23 AM, Gora Mohanty wrote:
> On Wed, Jan 11, 2012 at 12:37 AM, Walter Underwood
> wrote:
>> Thanks! That looks like it fixed the problem. This list continues to be
>> awesome.
>>
>> Is the f
On Wed, Jan 11, 2012 at 12:37 AM, Walter Underwood
wrote:
> Thanks! That looks like it fixed the problem. This list continues to be
> awesome.
>
> Is the function of the name attribute actually described in the docs? I could
> not figure out what it was for.
Yes, it is, though maybe not very pr
Thanks! That looks like it fixed the problem. This list continues to be awesome.
Is the function of the name attribute actually described in the docs? I could
not figure out what it was for.
wunder
On Jan 10, 2012, at 10:41 AM, dan whelan wrote:
> just a guess but this might need to change fro
just a guess but this might need to change from
${biblio.id}
to
${book.id}
Since the entity name is book instead of biblio
On 1/10/12 10:37 AM, Walter Underwood wrote:
I see a missing required "title" field for every document when I'm using DIH.
Yes, these documents have titles in the dat
I see a missing required "title" field for every document when I'm using DIH.
Yes, these documents have titles in the database. Is there a way to see what
exact queries are sent to MySQL or received by MySQL?
Here is a relevant chunk of the dataConfig:
On 1/9/2012 5:15 PM, Hector Castro wrote:
Hi,
Has anyone had success with multicore single node Solr configurations that have
one core acting solely as a dispatcher for the other cores? For example, say
you had 4 populated Solr cores – configure a 5th to be the definitive endpoint
with `shar
I have no idea what you mean by "different hash", and you
haven't provided much information to go on here.
What is your evidence that the document is in the index
twice? If you're inspecting the index at a low level
that's expected, since documents are just marked
as deleted not immediately removed f
On Tue, Jan 10, 2012 at 9:04 PM, geeky2 wrote:
> thank you both for the information.
>
> Gora, when you mentioned:
>
>>>
> - For keeping both values, use synonyms.
> <<
>
> what did you mean exactly?
[...]
Please take a look at
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.Syno
Hi Erick,
I changed all my URL fields into text (they were string fields before), and
added a WordDelimiterFilterFactory, so that URL fields can be tokenized
into several words. But I still got around 15 seconds response time
measured using debugQuery=on, and most of the time is still spent in
DebugC
thank you both for the information.
Gora, when you mentioned:
>>
- For keeping both values, use synonyms.
<<
what did you mean exactly?
mark
--
View this message in context:
http://lucene.472066.n3.nabble.com/best-way-to-force-substitutions-in-data-tp3646195p3647920.html
Sent from the Solr -
> I am not able to do highlighting with surround query parser
> on the returned
> results.
> I have tried the highlighting component but it does not
> return highlighted
> results.
Highlighter does not recognize Surround Query. It must be re-written to enable
highlighting in o.a.s.search.QParser#
Hello again,
Well, after further review, the IDs are different. The difference was
just so small I missed it after staring at it for a few hours.
BR,
Lauri
On 01/10/2012 02:20 PM, Hyttinen Lauri wrote:
Hello,
I sent some data into the solr/lucene index but when I query
the data I see weird resu
On a very quick glance, it looks like the source is at:
./lucene/contrib/analyzers/common/src/java/org/tartarus/snowball
and from there just compile Lucene and/or Solr as you normally would.
See: http://wiki.apache.org/solr/HowToContribute
Best
Erick
On Mon, Jan 9, 2012 at 2:13 PM, wrote:
> H
Thank you for your patience and assistance. XML is not my forte, but layoffs
and attrition have reduced IT staff well below minimum functional levels here.
Thanks to your help, the exact title matches have made it to the first page of
results.
Robert McCarroll
Systems Administration
NYS De
Hello,
I sent some data into the solr/lucene index but when I query
the data I see weird results.
There are documents with identical id fields but they have different
hash values.
Apart from the hash values the results are the same.
I thought it was impossible to have documents with same uniq
In my case the cores are populated with different records that adhere to the
same schema. The question about randomly distributing requests is because each
core has the `shards` parameter populated so that it can hit the other core's
indexes.
My question is more about the advantages (if any) of
I think I solved it... It seems to be because of the - that's just before the
query facet name
--
Mauro Asprea
E-Mail: mauroasp...@gmail.com
Mobile: +34 654297582
Skype: mauro.asprea
On Tuesday, January 10, 2012 at 11:33 AM, Mauro Asprea wrote:
> Hi, I'm having issues using the "new" way
Hi, I'm having issues using the "new" way of faceting dates with the Query
Facets.
The issue is that it is returning wrong counts. I tested it using a Date Facet
instead and the Date Facet did return correct counts. I'm using the Sunspot RSolr
client and I'm also using the new folding/group feature.
On Tue, Jan 10, 2012 at 4:44 AM, geeky2 wrote:
[...]
> i have a database with approximately 7Million rows that i am bringing in to
> solr.
>
> for a very small sub-set of these 7Million rows (about 130 rows), i need to
> substitute an old part number for a new part number. i know ahead of time
>
how about using regular expressions:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.PatternReplaceCharFilterFactory
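To make the regular-expression idea concrete, here is a minimal sketch in Python of pattern-based substitution over a small old-to-new mapping. It only illustrates the technique outside Solr; the part numbers, the `SUBSTITUTIONS` table, and the function name are hypothetical, and an actual char filter would be configured in the schema rather than written like this.

```python
import re

# Hypothetical old -> new part numbers (the real list has about 130 rows).
SUBSTITUTIONS = {
    "OLD-1001": "NEW-2001",
    "OLD-1002": "NEW-2002",
}

# One alternation over all old part numbers, matched as whole tokens only.
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(old) for old in SUBSTITUTIONS) + r")\b"
)


def substitute_part_numbers(text):
    """Replace every known old part number in `text` with its new value."""
    return PATTERN.sub(lambda match: SUBSTITUTIONS[match.group(1)], text)


if __name__ == "__main__":
    print(substitute_part_numbers("bracket OLD-1001 and hinge OLD-1002"))
```

Applying the substitution before the data reaches the index keeps the mapping in one place; doing it inside the analysis chain instead leaves the stored value untouched.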
On Tue, Jan 10, 2012 at 1:14 AM, geeky2 wrote:
> Hello all,
>
> i have been reading the solr book as well as searching the archives of this
> list to learn how
Hello everyone,
I have a question on how to update the index using XML messages when there are
some complex custom field types in my index, like:
And the field offer has some attributes in it...
I've read the page http://wiki.apache.org/solr/UpdateXmlMessages, and the example
shows that the XML should be like:
Hi,
I understand that setting omitTermFreqAndPositions="true" for a field in
schema.xml stores less information in the index with some restrictions e.g.
phrase search.
But does setting this property to "true" for a field which is of type
"string", "int" or is analyzed by KeywordAnalyzer make an
35 matches