Hi Ranveer,
The error in the facet counts is caused by the tokenized field that
you are using. If you want to facet on the whole string, use a
fieldType that doesn't split the field into tokens, such as the string field.
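To see why a tokenized field skews facet counts, here is a toy Python illustration. It is not Solr code; the sample values and the whitespace/lowercase tokenization are assumptions made for the sketch. Faceting counts indexed terms, so a tokenized field produces one bucket per token, while a string field produces one bucket per whole value.

```python
from collections import Counter

# Hypothetical field values from three documents.
values = ["New York", "New York", "New Jersey"]

# Tokenized field: each value is split into terms, and facets count the terms.
tokenized_facets = Counter(tok for v in values for tok in v.lower().split())

# String field: each value is indexed as one single term.
string_facets = Counter(values)

print(tokenized_facets)
print(string_facets)
```

With the tokenized field, "new" becomes its own facet bucket covering all three documents, which is usually not what is wanted when faceting on whole values.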
Regards,
Marco Martínez Bautista
http://www.paradigmatecnologico
Hello,
I am confused about the proper usage of the Boolean operators AND, OR and NOT.
Could somebody please give me an easy-to-understand explanation?
Thanks,
Sandhya
Andy, I think it is important to know what a stemmer really is.
It reduces words to their infinitives. Those infinitives are not always the
real infinitive; for the system, however, each one acts as an infinitive,
since all its derivatives can be reduced to the same form.
That's a stemmer.
Acc
Hello Sandhya,
title: star AND wars NOT sdi
This query will match every document where "star" *and* "wars" occur but
*not* the term "sdi" (SDI => Strategic Defense Initiative => in the media
there was often the term star wars used to describe the project).
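As a rough mental model, the Boolean operators behave like membership tests on the set of terms in a field. The sketch below is plain Python, not the Lucene query parser, and the three titles are made up for illustration; it evaluates the AND/NOT query above next to a plain OR query.

```python
# Hypothetical titles, reduced to sets of lowercase terms.
titles = {
    1: "star wars episode one",
    2: "star wars sdi program",
    3: "star trek",
}
terms = {doc_id: set(text.split()) for doc_id, text in titles.items()}

def matches_and_not(doc_terms):
    # title: star AND wars NOT sdi
    return "star" in doc_terms and "wars" in doc_terms and "sdi" not in doc_terms

def matches_or(doc_terms):
    # title: star OR wars
    return "star" in doc_terms or "wars" in doc_terms

and_not_hits = sorted(d for d, t in terms.items() if matches_and_not(t))
or_hits = sorted(d for d, t in terms.items() if matches_or(t))
print(and_not_hits)  # [1]
print(or_hits)       # [1, 2, 3]
```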
title: star OR wars
This query will mat
Hello Sandhya,
please, show us your schema.xml, so that we can have a look whether
something might be wrong there.
However, if the source of a copyField is "description" and the destination
is "description_stemmed", you can query both: description and
description_stemmed. There will be no error.
Hi,
I have the following filter for a field named "myText"
This enables stemming, I guess.
My questions are:
1) Can I disable stemming for the same field at the query time?
2) Do I need to copyField the "myText" to "nonStemText", wherein "nonStemText"
is not configured with the PorterFilterF
Hello!
If you want to have both a non-stemmed and a stemmed field, you should
use copyField.
Even if it were possible to disable the snowball filter at
query time, you would still have stemmed tokens written in the index.
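The point about stemmed tokens in the index can be shown with a small Python sketch. The suffix-stripping function is a deliberately naive stand-in and is not the Porter or Snowball algorithm; the index is just a Python set. It shows that once only stems are indexed, an unstemmed query term no longer matches.

```python
def toy_stem(word):
    # Naive suffix stripping for illustration only; NOT Porter or Snowball.
    for suffix in ("ing", "ed", "s"):
        if word.endswith(suffix) and len(word) > len(suffix) + 2:
            return word[: -len(suffix)]
    return word

docs = ["working", "worked"]

# Index-time analysis with stemming: only the stems end up in the index.
index = {toy_stem(w) for w in docs}

def search(query, stem_query):
    # Apply the same stemming to the query only if stem_query is set.
    term = toy_stem(query) if stem_query else query
    return term in index

print(index)                                # {'work'}
print(search("working", stem_query=True))   # True
print(search("working", stem_query=False))  # False
```

This is why a separate non-stemmed field, filled via copyField, is the usual way to offer exact matching alongside stemmed matching.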
> Hi,
> I have the following filter for a field named "myText"
>
Naga,
1) Yes, it is possible.
... define those filters which you want to apply at query-time
2) I am not sure whether I understand your question right:
You do not need to copyField your myText-field, if it is oka
Hello!
MitchK posted the right solution; my post may be confusing ;( Sorry
for that.
> Hello!
> If you want to have both a non-stemmed and a stemmed field, you should
> use copyField.
> Even if it were possible to disable the snowball filter at
> query time, you would still have stemm
Thank you Mitch! I will try that.
regards,
Naga
-Original Message-
From: MitchK [mailto:mitc...@web.de]
Sent: Monday, April 19, 2010 2:35 PM
To: solr-user@lucene.apache.org
Subject: Re: Stemming - disable at query time - reg.
Naga,
1) Yes, it is possible.
Thank You Mitch.
I have a query mentioned below : (my defaultOperator is set to "AND")
(field1 : This is a good string AND field2 : This is a good string AND field3 :
This is a good string AND (field4 : ASCIIDocument OR field4 : BinaryDocument OR
field4 : HTMLDocument) AND field5 : doc)
This i
Hi Mitch,
I have defined my field like:
I have indexed two documents with "working" and "worked" values and when I
search for "working" it is not giving me any results
Also, one of the fields here, *field3* is a dynamic field. All the other fields
except this field, are copied into "text" with copyField.
Thanks,
Sandhya
-Original Message-
From: Sandhya Agarwal [mailto:sagar...@opentext.com]
Sent: Monday, April 19, 2010 2:55 PM
To: solr-user@lucene.apa
I need to perform wildcard search in phrase query. I have 2 documents
containing text "how do impair" and "how to improve". I want to be able to
search both documents by searching (how to im*). There is a provision in
lucene which allows me to perform this operation using SpanWildcardQuery and
kee
Hey All
I have 2 cores which have been used with Tika to index files.
I would like to run one query on both at once, as I will be searching
the attr_content field.
If I do a test on each core I get 1 & 17 results but trying with shards I just
get 17 results.
Here is my example query
http://loca
Hi Naga,
I think you should add the same filter to the query configuration:
**
That way stemming is applied to the query, so it would search for "work"
instead of "working" and, therefo
Hi Grant,
I tried the command line of Tika v0.7 (the newest), and it parsed the file. I
believe Solr 1.4 contains version 0.4 of Tika.
Do you suggest upgrading to the new Tika? Can I upgrade only Tika in Solr 1.4,
or do I need to wait until Solr ships with the new Tika?
Thanks.
On Sun, Apr 18, 2010 at 11:24 PM, Gra
Praveen Agrawal wrote:
Hi Grant,
I tried the command line of Tika v0.7 (the newest), and it parsed the file. I
believe Solr 1.4 contains version 0.4 of Tika.
Do you suggest upgrading to the new Tika? Can I upgrade only Tika in Solr 1.4,
or do I need to wait until Solr ships with the new Tika?
Thanks.
Solr trunk
I want to build a function expression for a dismax request handler 'bf'
field, to boost a document if it is referenced by other documents.
I.e. the more often a document is referenced, the higher the boost.
Something like
linear(query(myQueryReturningACountOfHowOftenThisDocumentIsReferen
Hi, I am using Solr running on Windows Server 2008 32-bit.
I added about 100 million articles to Solr without setting the stored attribute (only
the document id is stored; the index is about 164 GB).
When I run a query without sorting, it returns doc ids in a few ms, but when
I add a sort, I get the below
Regarding stemmers, I ditched them altogether a long time ago in favor
of a dictionary of morphologies of all known words (for any given
language). A simple lookup of any word morphology thus produces the set,
including the correct stem.
Works great. 100% of the time.
Just a tip from me.
On Mon
Hello..
I didn't find anything about my problem...
How can I replace an ampersand at index time?
My autosuggest words contain ampersands. How can I replace this sign (&)?
PatternReplaceCharFilterFactory?
How do I use this factory?
Or RegexTransformer?
Thanks for your help ;)
--
View
Thanks for the explanation Mitch.
You're right. There can't be universal stemmers.
What about multi-language stemmers? I'm mostly interested in English, Spanish,
German, French, Italian. Are there any stemmers that would handle those
languages?
If not, what's the recommended way to deal with d
Thanks for the tip.
Is there any publicly available dictionary of morphologies that I could use?
Or did you build your own one?
--- On Mon, 4/19/10, Darren Govoni wrote:
> From: Darren Govoni
> Subject: Re: LucidWorks Solr
> To: solr-user@lucene.apache.org
> Date: Monday, April 19, 2010, 7:
maybe of interest to those doing geo-search in solr?
paul
Begin forwarded message:
From: "Gavin McArdle"
Date: 19 April 2010 14:46:05 GMT+02:00
To: dbwo...@cs.wisc.edu
Subject: [Dbworld] Survey on Web Geo-Spatial Open-Source Technologies
Reply-To: dbworld_ow...@yahoo.com
[Apologies for
Dear All,
We're pleased to announce the 3.3.0 release of Carrot2 which significantly
improves the scalability of the clustering algorithms (up to 7 times faster
clustering in case of the STC algorithm) and fixes a number of minor issues.
Release notes:
http://project.carrot2.org/release-3.3.0-no
hey.
sorry for this ... stupid question ;)
When I perform an import of my data, I use some filters. How can I really
be sure that Solr used my configured filters and analyzer?
When I search in Solr, the result looks 100% like before the import.
thx =)
--
View this message in context:
http:
Analyzers/Tokenizers/TokenFilters operate on the text that gets
indexed. Stored text remains exactly as you sent it in.
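A toy Python model of this split between indexed and stored text follows. It is only a sketch: the hyphen-splitting, lowercasing "analysis chain" and the dictionary layout are assumptions, not how Solr stores data internally.

```python
raw = "Kamera-Wasserwaage"

# Hypothetical analysis chain: split on hyphens and lowercase.
indexed_terms = [part.lower() for part in raw.split("-")]

# The document keeps the untouched input next to the analyzed terms.
document = {"stored": raw, "indexed": indexed_terms}

def search(term):
    # Matching uses the indexed terms; the response returns the stored value.
    return document["stored"] if term.lower() in document["indexed"] else None

print(search("kamera"))  # Kamera-Wasserwaage
print(search("camera"))  # None
```

So search results look the same before and after changing filters; only which queries match changes.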
Erik
On Apr 19, 2010, at 9:53 AM, stockii wrote:
hey.
sorry for this ... stupid question ;)
When I perform an import of my data, I use some filters. How can I
Hi,
could you provide at least some information? Usually you
can be 100% sure that Solr uses the configuration it is
provided with.
Cheers,
Sven
--On Monday, 19 April 2010 05:53 -0800 stockii wrote:
hey.
sorry for this ... stupid question ;)
When I perform an import of my data, I use
There have been some open source ones. I don't have the links handy at
this moment[1]. But I parsed through the electronic dictionary and
generated a database of each word and its morphologies. I got tired of
lame stemmers that were wrong half the time. Computers are fast enough to
do lookups on 15
okay.
As an example: I want to check whether WordDelimiterFilterFactory works correctly. And I
want to experiment with substring search using EdgeNGram...
I have a problem with this string: "Kamera-Wasserwaage" ...
I think Solr should split it like this:
Kamera-Wasserwaage
-> Kamera
-> Wasserwaa
On 19.04.2010 16:09, stockii wrote:
> so i want to see how it is indexed.
>
>
Go to the admin panel, open the schema browser, and set the number of
shown tokens to 1 or something.
-Michael
If you're submitting this:
field1 : This is a good string
then you're searching in "field1" ONLY for "This". The tokens "is",
"a", "good" and "string" are being searched against your default
search field as defined in your schema.
Have you tried parenthesizing?
Try the SOLR admin page for lo
oha, yes thx but
we have 800 000 items ... how do I find the right one this way? XD
--
View this message in context:
http://n3.nabble.com/is-solr-ignored-my-filters-tp729646p729749.html
Sent from the Solr - User mailing list archive at Nabble.com.
On 19.04.2010 16:29, stockii wrote:
>
> oha, yes thx but
>
> we have 800 000 items ... how do I find the right one this way? XD
Then use the TermsComponent: http://wiki.apache.org/solr/TermsComponent
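A request to the TermsComponent could be built as in the sketch below. The host, the `/terms` handler path (as registered in the example solrconfig.xml), the field name `text` and the prefix are assumptions for illustration; `terms`, `terms.fl`, `terms.prefix` and `terms.limit` are the component's documented parameters.

```python
from urllib.parse import urlencode

base_url = "http://localhost:8983/solr/terms"  # assumed host and handler path
params = {
    "terms": "true",
    "terms.fl": "text",        # hypothetical field name
    "terms.prefix": "wasser",  # only list indexed terms starting with this prefix
    "terms.limit": 20,         # cap the number of terms returned
}
url = base_url + "?" + urlencode(params)
print(url)
```

Restricting by prefix lets you inspect the terms produced for one item without paging through the whole index.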
-Michael
> I didn't find anything about my problem...
>
> How can I replace an ampersand at index time?
>
> My autosuggest words contain ampersands. How can I
> replace this sign (&)?
>
Easiest way is to use MappingCharFilterFactory before your tokenizer.
mapping.txt will be placed under solrhom
> I need to perform wildcard search in phrase query. I have 2
> documents
> containing text "how do impair" and "how to improve". I
> want to be able to
> search both documents by searching (how to im*). There is a
> provision in
> lucene which allows me to perform this operation using
> SpanWildca
hello,
we want to index and search in our intranet documents.
the field "body" contains html-tags.
in our schema.xml we have a fieldType text_de (see at the end of this mail)
which uses charFilter solr.HTMLStripCharFilterFactory with index.
so this is no problem. the text is put into the index
I'm setting up my Solr index to be updated every x minutes.
Does Solr cache the result of a search, so that the next time the same search
is requested, it recognizes that the index has not changed and therefore just
returns the previous result from cache without processing the search again?
I
> I'm setting up my Solr index to be
> updated every x minutes.
>
> Does Solr cache the result of a search, so that the next
> time the same search is requested, it recognizes that the
> index has not changed and therefore just returns the previous
> result from cache without processing the sea
In addition to Alejandro's post, I would say that you don't need to
specify separate analyzers for index time and query time, since it *seems* (maybe I
am wrong) that you want the same behaviour at index and
query time.
Hope this helps
- Mitch
--
View this message in context:
http://n
> we want to index and search in our intranet documents.
> the field "body" contains html-tags.
>
> in our schema.xml we have a fieldType text_de (see at the
> end of this mail) which uses charFilter
> solr.HTMLStripCharFilterFactory with index.
> so this is no problem. the text is put into the
Hi everybody:
I have a big problem with the amount of memory Solr is using on a server.
I start Solr with the "java -jar start.jar" command on an Ubuntu server;
the start.jar process is using 7 GB of the server's memory, and it is
considerably affecting the performance of the server.
I woul
Erick,
I am a little bit confused, because I wasn't aware of this fact (and have
never noticed any wrong behaviour... maybe because I used the
dismax-handler).
How should I search for
field1: This is a good string
without doing something like
field1:this field1:is ... ?
If I quote the whole thi
> Hi everybody:
>
> I have a big problem with the amount of memory Solr is
> using on a server.
> I start Solr with the "java -jar start.jar" command on
> an Ubuntu server;
> the start.jar process is using 7 GB of the
> server's memory, and it is
> considerably affecting the performance of
I am curious:
The idea behind a stemmer is not that it produces the correct infinitive for
a given word. The idea is that it produces always the same infinitive for any
derivative of the word.
What happens if there is an unknown word? For example, something like
slang? How does your solution work
How should Solr know that Wasserwaage consists of "Wasser" and "Waage"?
You are looking for an extra filter like
DictionaryCompoundWordTokenFilter.
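The idea behind a dictionary-based compound splitter can be sketched in a few lines of Python. This is a simplified illustration, not Lucene's implementation: the function name, the tiny word list and the size limits are all assumptions. It scans every substring within the size limits and keeps those found in the dictionary.

```python
def decompound(token, dictionary, min_subword=2, max_subword=15):
    """Return every dictionary word found inside token (case-insensitive)."""
    lower = token.lower()
    parts = []
    for start in range(len(lower)):
        for length in range(min_subword, max_subword + 1):
            end = start + length
            if end > len(lower):
                break
            if lower[start:end] in dictionary:
                parts.append(lower[start:end])
    return parts

# Hypothetical dictionary; a real one would be a word list for the language.
german_words = {"kamera", "wasser", "waage"}
print(decompound("Wasserwaage", german_words))  # ['wasser', 'waage']
```

Without such a word list there is nothing in the string itself that marks where "Wasser" ends and "Waage" begins.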
Kind regards
- Mitch
stockii wrote:
>
> okay.
>
> As an example: I want to check whether WordDelimiterFilterFactory works correctly. And I
> want to exper
I have just read the post, but it doesn't say whether the memory problems
are associated with starting Solr that way. The Jetty web server is used when I start
Solr that way, so I supposed that memory problems should not happen
because Jetty must manage the way the memory is used.
Then are you
This is a little bit of hijacking going on here, but
It's algorithmic. That is, there isn't a list of variants that
stem to the same infinitive, and your statement
"always the same infinitive for any derivative of the word"
isn't quite what happens.
Stemmers will always produce the same infiniti
yes, that's what I'm saying to my boss...
but I found another solution just now ;)
->
I use EdgeNGram only for my product names and search with an OR operator in
my default "text" field and in the productname field. So I find all
substrings :D
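For reference, this is roughly what edge n-gram generation yields for one token. It is a plain Python sketch rather than the Solr filter itself, and the gram sizes are assumptions.

```python
def edge_ngrams(token, min_gram=2, max_gram=10):
    # Produce the leading prefixes of the token, from min_gram up to max_gram characters.
    token = token.lower()
    upper = min(max_gram, len(token))
    return [token[:n] for n in range(min_gram, upper + 1)]

print(edge_ngrams("Kamera"))  # ['ka', 'kam', 'kame', 'kamer', 'kamera']
```

Note that edge n-grams cover only prefixes of each token, so a match inside a word (for example "mera") would still need plain n-grams or another approach.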
--
View this message in context:
http://n3.na
If you want to limit the memory used by the Java process, you could use
java -XmxNg -jar start.jar
where N is the number of gigabytes to which you want to limit the Jetty container's heap.
On Mon, Apr 19, 2010 at 10:05 PM, Ariel wrote:
> I have just read the post, but it doesn't say whether the memory problems
> are associated
And what is the recommended maximum memory size I should use? Is there a
recommended value?
Regards.
On Mon, Apr 19, 2010 at 12:44 PM, Geek Gamer wrote:
> If you want to limit the memory used by the Java process, you could use
> java -XmxNg -jar start.jar
> where N is the number of gigabytes to which you want to limit
> And what is the recommended maximum memory
> size I should use? Is there a
> recommended value?
What is your index size?
Yes, you are right, thank you Erick.
I missed this point and thought only of common cases, not of special ones.
However, one can combine the mentioned solutions and different stem filters
in different fields, so that one can be quite (not absolutely) sure that in
most cases the applicat
Wasn't there a good posting on lucidworks.com?
The title was something like "deadly sins" or so.
There are some good suggestions on things like that :).
Kind regards
- Mitch
--
View this message in context:
http://n3.nabble.com/Big-problem-with-solr-in-an-official-server-tp730049p730168.html
S
Any ideas about my below Q ?
Lee
Begin forwarded message:
> From: Lee Smith
> Date: 19 April 2010 11:19:45 GMT+01:00
> To: solr-user@lucene.apache.org
> Subject: Query 2 Cores
> Reply-To: solr-user@lucene.apache.org
>
> Hey All
>
> I have 2 cores which have been used with tika to do index fil
On 4/19/2010 11:09 AM, Lee Smith wrote:
http://localhost8983/solr/core1/select?shards=localhost:8983/solr/core2&q=attr_content:test
Is this the correct way to query 2 cores at once ?
This should do what you want:
http://localhost:8983/solr/core1/select?shards=localhost:8983/solr/core1
My use requires a more correct processing of language than what you define
as a stemmer. My experience with stemmers is that even for some words
that have no stem, they make up a new word. I consider those false
positives.
My approach is based on the need to recognize that walk, walked, walking
> This is a little bit of hijacking going on here, but
You are right. Accept my regrets.
> It's algorithmic. That is, there isn't a list of variants that
> stem to the same infinitive, and your statement
> "always the same infinitive for any derivative of the word"
> isn't quite what happens.
>
no big deal, just wanted to mention.
On Mon, Apr 19, 2010 at 1:24 PM, wrote:
> > This is a little bit of hijacking going on here, but
> You are right. Accept my regrets.
>
>
> > It's algorithmic. That is, there isn't a list of variants that
> > stem to the same infinitive, and your state
Hello *, I'm having issues with the synonym filter altering token offsets.
My input text is
"saturday night live"
It is tokenized by the whitespace tokenizer, yielding 3 tokens:
[saturday, 0,8], [night, 9, 14], [live, 15,19]
On indexing, these are passed through a synonym filter that has this line
s
Andy,
This will help with smooth injection of your multilingual documents into Solr
(multilingual either in the sense of 1 doc containing fields in multiple
languages or 1 index containing documents in different languages):
http://sematext.com/products/multilingual-indexer/index.html
Re your
> Andy,
>
> This will help with smooth injection of your multilingual
> documents into Solr (multilingual either in the sense of 1
> doc containing fields in multiple languages or 1 index
> containing documents in different languages):
>
> http://sematext.com/products/multilingual-indexer/inde
Did you try parenthesizing:
field1:(This is a good string)
You can try lots of things easily by going to
http://localhost:8983/solr/admin/form.jsp
and clicking the "debug enable" checkbox...
HTH
Erick
On Mon, Apr 19, 2010 at 12:23 PM, MitchK wrote:
>
> Erick,
>
> I am a little bit confused, be
Careful though... the Solr admin page is for *analysis* testing, not
query parsing. I saw that mentioned earlier too. To test query
parsing, submit your query to http://localhost:8983/solr/select?q=your_query&debugQuery=true
and look at the parsed query output.
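Such a debug request can be assembled as in the sketch below. The host and port come from the URL in this message; the query string is the parenthesized example from this thread. The response is not fetched here, only the URL is built with proper escaping.

```python
from urllib.parse import urlencode

base_url = "http://localhost:8983/solr/select"
params = {
    "q": "field1:(This is a good string)",  # the query whose parsing you want to inspect
    "debugQuery": "true",                   # ask Solr to include the parsed query in the response
}
url = base_url + "?" + urlencode(params)
print(url)
```

URL-encoding matters here because the colon, spaces and parentheses in the query would otherwise be mangled or misread.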
Erik
On Apr 19, 20
Hmmm, I *thought* I saw the XML response with the parsed query in it, did I
miss the details *again*?
Erick
On Mon, Apr 19, 2010 at 7:15 PM, Erik Hatcher wrote:
> Careful though... the Solr admin page is for *analysis* testing, not query
> parsing. I saw that mentioned earlier too. To test que
Ah sorry... my bad. You're right. I thought you were referring to
the admin analysis.jsp page, but I misread and replied too quickly.
You're spot on, Erick.
Erik
On Apr 19, 2010, at 7:21 PM, Erick Erickson wrote:
Hmmm, I *thought* I saw the XML response with the parsed query in
I have the following text field:
...
When I search for women's, womens or women I correctly get back all the
results I want. However when I use the highlighting feature it only
highlights women in the women's cases. Ho
Same general question about highlighting the full word "sunglasses" when I
search for glasses. Is this possible?
Thanks
--
View this message in context:
http://n3.nabble.com/Highlighting-apostrophe-tp731155p731305.html
Yes, both have the same filters, so we can avoid specifying the analyzer type.
- Naga
-Original Message-
From: MitchK [mailto:mitc...@web.de]
Sent: Monday, April 19, 2010 9:44 PM
To: solr-user@lucene.apache.org
Subject: Re: Stemming - disable at query time - reg.
Additionally to Alejandro's po
Thanks Erick. Using parentheses works.
With parentheses, the query q=field1: (this is a good string) is parsed as
follows:
+field1:this +field1:good +field1:string
Is that OK to do?
Thanks,
Sandhya
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Tues
I'm using the Solr 1.4 distribution with Solr Cell. Can I update only Tika to the new
version in the Solr 1.4 distribution? If yes, is there a guide?
Thanks.
On Mon, Apr 19, 2010 at 4:36 PM, Koji Sekiguchi wrote:
> Praveen Agrawal wrote:
>
>> Hi Grant,
>> I tried command line of Tika v-0.7(newest), and it parsed th