Hi,
Can you post the full stack trace? I'd need to know if it's
really org.apache.solr.handler.clustering.ClusteringComponent that's missing
or some other class ClusteringComponent depends on.
Cheers,
Staszek
On Thu, Jun 30, 2011 at 04:19, Walter Closenfleight <
walter.p.closenflei...@gmail.co
>
> and my second question is does clustering effect indexes.
>
No, it doesn't. Clustering is performed only on the search results produced
by Solr, it doesn't change anything in the index.
Cheers,
Staszek
Thanks iorixxx, I changed my configuration to include clustering in the search
results. In my XML-format search results I got a tag , to show
these clusters in the search results, do I need to parse this XML?
And my second question: does clustering affect indexes?
-
Thanks & Regards
Romi
On 6/29/2011 7:50 PM, Yonik Seeley wrote:
OK, your filter queries have hundreds of terms in them (and that means
hundreds of term lookups, which uses the term index).
Thus, your termIndexInterval change is the leading suspect for the
slowdown. A termIndexInterval of 1024 means that
a term loo
I had set up the clusteringComponent in solrconfig.xml for my first core. It
has been working fine and now I want to get my next core working. I set up
the second core with the clustering component so that I could use it, use
solritas properly, etc. but Solr did not like the solrconfig.xml changes
On Wed, Jun 29, 2011 at 3:28 PM, Yonik Seeley
wrote:
>
> On Wed, Jun 29, 2011 at 1:43 PM, Shawn Heisey wrote:
> > Just now, three of the six shards had documents deleted, and they took
> > 29.07, 27.57, and 28.66 seconds to warm. The 1.4.1 counterpart to the 29.07
> > second one only took 4.78 s
Hi Filype,
in the response you should have a list of fq arguments something like
field:facetValue
field:FacetValue
use this to set your inputs to be selected / checked
On 29 June 2011 23:54, Filype Pereira wrote:
> Hi all,
> I am looking for some help in building a front end facet filter us
> Are there any best practices or
> preferred ways to accomplish what I am
> trying?
People usually prefer multiplicative boosting. But in your case you want
additive boosting. Dismax's bf is additive.
There is also _val_ hook. http://wiki.apache.org/solr/SolrQuerySyntax
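To make the additive/multiplicative distinction concrete, here is a minimal Python sketch. It is not Solr code: the function names and the sample numbers are assumptions made up for illustration, and real dismax scoring involves more than is shown here. The last function mirrors the formula sum(weight1 * text relevance score, weight2 * price) asked about in this thread.

```python
def additive_boost(relevance, boost):
    """Additive boosting: the boost value is added to the relevance score."""
    return relevance + boost


def multiplicative_boost(relevance, boost):
    """Multiplicative boosting: the relevance score is scaled by the boost."""
    return relevance * boost


def weighted_sum(relevance, price, weight1, weight2):
    """Hypothetical helper for weight1 * relevance + weight2 * price."""
    return weight1 * relevance + weight2 * price


if __name__ == "__main__":
    # Sample values are illustrative only.
    print(additive_boost(2.0, 0.5))
    print(multiplicative_boost(2.0, 0.5))
    print(weighted_sum(2.0, 10.0, 1.0, 0.5))
```

Keeping weight1 and weight2 as plain parameters, as above, is one way to tweak them between requests until the balance looks right.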
> Do the params for d
Hi all,
I am looking for some help in building a front end facet filter using XSLT.
The code I use is: http://pastebin.com/xVv9La9j
On the image attached, the checkbox should be selected. (You clicked and
submitted the facet form. The URL changed)
I can use xsl:if, but there's nothing that I can
> So I made a custom search component
> which runs right after the query
> component and this custom component will update the score
> of each based on
> some things (and no, I definitely can't use existing
> components). I didn't
> see any easy way to just update the score so what I
> currently d
Thanks,
Yes this is the work around I am currently doing.
Still wondering if the sort method can be used alone.
On 29 June 2011 18:34, Michael Ryan wrote:
> You could try adding a new int field (like "typeSort") that has the desired
> sort values. So when adding a document with type:car, als
bump
--
View this message in context:
http://lucene.472066.n3.nabble.com/After-the-query-component-has-the-results-can-I-do-more-filtering-on-them-tp3114775p3123502.html
Sent from the Solr - User mailing list archive at Nabble.com.
Hi Yonik,
as this recommendation comes from you, I am not going to test it, you
are well known as a speed junkie ;)
When we are there (in SignatureUpdateProcessor), why is this code not
moved to the constructor, but remains in processAdd
...
Signature sig = (Signature)
req.getCore().getRe
On Wed, Jun 29, 2011 at 4:32 PM, eks dev wrote:
> req.getSearcher().getFirstMatch(t) != -1;
Yep, this is currently the fastest option we have.
-Yonik
http://www.lucidimagination.com
Thanks Shalin!
would you not expect
req.getSearcher().docFreq(t);
to be slightly faster? Or maybe even
req.getSearcher().getFirstMatch(t) != -1;
which one should be faster, any known side effects?
On Wed, Jun 29, 2011 at 1:45 PM, Shalin Shekhar Mangar
wrote:
> On Wed, Jun 29, 2011 at 2:01
Is there a Solr plugin example similar to Nutch's
(http://wiki.apache.org/nutch/WritingPluginExample) example? All I found was the
SolrPlugins (http://wiki.apache.org/solr/SolrPlugins) wiki page, but it didn't
have any example code. It would be helpful if there was a concrete example
that would explain how
On Wed, Jun 29, 2011 at 1:43 PM, Shawn Heisey wrote:
> Just now, three of the six shards had documents deleted, and they took
> 29.07, 27.57, and 28.66 seconds to warm. The 1.4.1 counterpart to the 29.07
> second one only took 4.78 seconds, and it did twice as many autowarm
> queries.
Can you po
Does the phonetic analysis preserve the offsets of the original text field?
If so, you should probably be able to hack up FastVectorHighlighter to
do what you want.
-Mike
On 06/29/2011 02:22 PM, Jamie Johnson wrote:
I have a schema with a text field and a text_phonetic field and would like
t
From all I've read, using something like PatternReplaceFilterFactory allows
you to replace / remove text in an index, but is there anything similar that
allows manipulation of the text in the associated field? For example, if I
pulled a status from Twitter like, "Hi, this is a #hashtag." I would l
: The problem with TikaEntityProcessor is this installation is still running
: v1.4.1 so I'll need to upgrade.
:
: Any short and sweet instructions for upgrading to 3.2? I have a pretty
: straight forward Tomcat install, would just dropping in the new war suffice?
It should be fairly straight f
I have a schema with a text field and a text_phonetic field and would like
to perform highlighting on them in such a way that the tokens that match are
combined. What would be a reasonable way to accomplish this?
Anything is an option, but I think I found another way. I am going to add a
new SearchComponent which reads some additional query parameters and builds
the appropriate filter.
On Tue, Jun 28, 2011 at 2:07 PM, Dmitry Kan wrote:
> You should modify the SolrCore for this, if I'm not mistaken.
>
>
OK - I figured it out. It's not solr at all (and I'm not really surprised).
In the prototype benchmarks, we used a different instance of tomcat than we're
using for production load tests. Our prototype tomcat instance had no
maxThreads value set, so was using the default value of 200. The pro
Are there any best practices or preferred ways to accomplish what I am
trying?
Do the params for defType, qf and bf belong in a solr request handler?
Is it possible to have the weights as variables so they can be tweaked till
we find the optimum balance in showing our results?
Thanks!
Ah, I think I suddenly answered my own question, but appreciate further
insight if you have it. I converted the & in &myword; to an &amp; so it
looks like this:
Solr is a really &amp;myword; search engine!
On Wed, Jun 29, 2011 at 12:40 PM, Walter Closenfleight <
walter.p.closenflei...@gmail.com> wrote
On 6/29/2011 11:27 AM, Shawn Heisey wrote:
On 6/29/2011 9:17 AM, Yonik Seeley wrote:
Hmmm, you could comment out the query and filter caches on both 1.4.1
and 3.2
and then run some of the queries to see if you can figure out which
are slower?
Do any of the queries have stopwords in fields whe
On 06/28/2011 12:04 PM, Chris Hostetter wrote:
: I'm streaming over the document content (presumably via tika) and its
: gathering the document's metadata which includes the keywords metadata field.
: Since I'm also passing that field from the DB to the REST call as a list (as
: you suggested) t
We have some text entities in fields to index (and search) like so:
Solr is a really &myword; search engine!
I would like to preserve/protect &myword; and not resolve it in the indexing
or search results.
What sort of methods have people used? I realize the results are returned in
XML format, so
You could try adding a new int field (like "typeSort") that has the desired
sort values. So when adding a document with type:car, also add typeSort:1; when
adding type:van, also add typeSort:2; etc. Then you could do "sort=typeSort
asc" to get them in your desired order.
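A minimal Python sketch of this idea follows. It only prepares and orders plain dictionaries and makes no Solr calls. The mapping values for boat and bike, the helper name with_type_sort, and the sample documents are assumptions for illustration (the values continue the car:1, van:2 pattern in the requested order).

```python
# Hypothetical mapping from type to the desired sort position.
TYPE_SORT = {"car": 1, "van": 2, "boat": 3, "bike": 4}


def with_type_sort(doc):
    """Return a copy of the document with a typeSort field added."""
    enriched = dict(doc)
    enriched["typeSort"] = TYPE_SORT[doc["type"]]
    return enriched


if __name__ == "__main__":
    docs = [
        {"id": "1", "type": "bike"},
        {"id": "2", "type": "car"},
        {"id": "3", "type": "boat"},
        {"id": "4", "type": "van"},
    ]
    to_index = [with_type_sort(d) for d in docs]
    # Sorting locally imitates what "sort=typeSort asc" would request.
    for doc in sorted(to_index, key=lambda d: d["typeSort"]):
        print(doc["type"], doc["typeSort"])
```

The extra field is computed once at indexing time, so the query side only needs the sort parameter.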
I think this is also po
On 6/29/2011 9:17 AM, Yonik Seeley wrote:
Hmmm, you could comment out the query and filter caches on both 1.4.1 and 3.2
and then run some of the queries to see if you can figure out which are slower?
Do any of the queries have stopwords in fields where you now index
those? If so, that could ent
Hi
Say I have a field type in multiple documents which can be either
type:bike
type:boat
type:car
type:van
and I want to order a search to give me documents in the following order
type:car
type:van
type:boat
type:bike
Is there a way I can do this just using the &sort method?
Thanks
I'm using Solr trunk.
If it's Levenshtein/edit distance, that's great, that's what I want. It just
didn't seem to be officially documented anywhere so I wanted to find out for
sure. Thanks for confirming.
--
View this message in context:
http://lucene.472066.n3.nabble.com/Fuzzy-Query-Param-tp312
In solr, is it possible to 'chain' copyfields so that you can copy the value
of one into another?
Example:
Point being, every time I add a new field to the autocomplete, I want it to
automatically also be added to ac_spellcheck without having to do it twice.
Can you get a thread dump to see what is hanging?
-Yonik
http://www.lucidimagination.com
On Wed, Jun 29, 2011 at 11:45 AM, Bob Sandiford
wrote:
> Hi, all.
>
> I'm hoping someone has some thoughts here.
>
> We're running Solr 3.1 (with the patch for SolrQueryParser.java to not do the
> getLucene
Hi,
I need help in figuring out the right configuration to perform highlighting
in Solr. I can retrieve the matching documents plus the highlighted
matches.
I've used another tool called DTSearch where it would return the offset
positions of the field value to highlight. I've tried a few differ
Hi!
I would like to announce that Solr 3.2 with RankingAlgorithm now has Near Real
Time capability. The NRT performance is very high, 1428
documents/sec [ MBArtists 390k index]. The NRT functionality allows you
to add documents without the IndexSearchers being closed or caches being
cleared. A com
Hi, all.
I'm hoping someone has some thoughts here.
We're running Solr 3.1 (with the patch for SolrQueryParser.java to not do the
getLuceneVersion() calls, but use luceneMatchVersion directly).
We're running in a Tomcat instance, 64 bit Java. CATALINA_OPTS are: -Xmx7168m
-Xms7168m -XX:MaxPerm
Hmmm, you could comment out the query and filter caches on both 1.4.1 and 3.2
and then run some of the queries to see if you can figure out which are slower?
Do any of the queries have stopwords in fields where you now index
those? If so, that could entirely account for the difference.
-Yonik
ht
I have noticed a significant difference in filter cache warming times on
my shards between 3.2 and 1.4.1. What can I do to troubleshoot this?
Please let me know what additional information you might need to look
deeper. I know this isn't enough.
It takes about 3 seconds to do an autowarm co
> too bad it is still in todo, that's
> why i was asking some for some tips on
> writing, compiling, registration, calling...
Here is general information about how to customize solr via plugins.
http://wiki.apache.org/solr/SolrPlugins
Here is the registration and code example.
http://wiki.apache.
Too bad it is still a TODO; that's why I was asking for some tips on
writing, compiling, registration, calling...
--
View this message in context:
http://lucene.472066.n3.nabble.com/Regex-replacement-not-working-tp3120748p3121856.html
Sent from the Solr - User mailing list archive at Nabbl
I have had the same problems with regex and I went with the regular pattern
replace filter rather than the charfilter. When I added it to the very end
of the chain, only then would it work...I am on Solr 3.2. I have also
noticed that the HTML filter factory is not working either. When I dump the
fi
> ok, last question on the
> UpdateProcessor: can you please give me the steps to
> implement my own?
> i mean, i can push my custom processor in solr's code, and
> then what?
> i don't understand how i have to change the solrconf.xml
> and how can i bind
> that to the updater i just wrotea
> and a
ok, last question on the UpdateProcessor: can you please give me the steps to
implement my own?
i mean, i can push my custom processor in solr's code, and then what?
i don't understand how i have to change the solrconfig.xml and how can i bind
that to the updater i just wrote
and also i don't unders
> I just went through solr wiki page
> for clustering. But i am not getting what
> is the benefit of using clustering. Can anyone tell me what
> is actually
> clusering and what its use in indexing and searching.
> does it effect search results??
> Please reply
It is for search result clustering.
> my goal is/was storing the value into
> the field, and i get i have to create
> my Update handler.
>
> i was trying to use query with salary_min:[100 TO 200] and
> it's actually
> working... since i just need it to search, i'll stay with
> this solution
>
> is the [100 TO 200] a performance kil
Thanks to both of you, I understand now and am now getting the expected
results.
Cheers!
On Wed, Jun 29, 2011 at 2:21 AM, Ahmet Arslan wrote:
>
> > I believe I am missing something very elementary. The
> > following query
> > returns zero hits:
> >
> > http://localhost:8983/solr/core0/select/?q
my goal is/was storing the value into the field, and i get i have to create
my Update handler.
i was trying to use query with salary_min:[100 TO 200] and it's actually
working... since i just need it to search, i'll stay with this solution
is the [100 TO 200] a performance killer? i remember read
Am 29.06.2011 12:30, schrieb samuele.mattiuzzo:
>
>
...
> this is the "final" version of my schema part, but what i get is this:
>
>
>
> 1.0
> Negotiable
> Negotiable
> Negotiable
>
...
The mistake is that you assume that the filter applied to the result.
This is not true. Index
Hi Samuele,
It's not clear to me if your goal is to search on that field (for example,
"salary_min:[100 TO 200]") or if you want to show the transformed field to
the user (so you want the result of the regex replacement to be included in
the search results).
If your goal is to show the results t
I was using SnowballPorterFilterFactory for stemming, and that stemmer was
stemming the words.
I added the keyword "ansys" to the file "protwords.txt".
Now the stemming is not happening for "ansys" and it's OK now.
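The effect of a protected-words list can be sketched in a few lines of Python. This is a toy illustration only: toy_stem is a made-up two-rule stemmer, not the Snowball/Porter algorithm, and the function names are assumptions. It merely shows the general idea that a word found in the protected set bypasses stemming.

```python
def toy_stem(word):
    """Toy rules only: drop a trailing "s", then turn a trailing "y" into "i"."""
    if word.endswith("s"):
        word = word[:-1]
    if word.endswith("y"):
        word = word[:-1] + "i"
    return word


def stem_with_protection(word, protected):
    """Return protected words unchanged; stem everything else."""
    if word in protected:
        return word
    return toy_stem(word)


if __name__ == "__main__":
    protected = {"ansys"}  # stands in for the entries of a protected-words file
    print(toy_stem("ansys"))
    print(stem_with_protection("ansys", protected))
```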
On 29 June 2011 17:12, Ahmet Arslan wrote:
> > I am using solr1.4
> > When I search
admin/analysis.jsp page shows RemoveDuplicatesTokenFilterFactory,
ReversedWildcardFilterFactory, EnglishPorterFilterFactory
-
Thanks & Regards
Romi
--
View this message in context:
http://lucene.472066.n3.nabble.com/filters-effect-on-search-results-tp3120968p3121506.html
Sent from the Solr
ok, but i'm not applying the filtering on the copyfields.
this is how my schema looks:
and the two datatypes defined before. that's why i thought i could first use
"copyField" to copy the value then index them with my two datatypes
filtering...
--
View this message in context:
http://l
I just went through the Solr wiki page for clustering, but I don't understand
the benefit of using clustering. Can anyone tell me what clustering actually
is and what its use is in indexing and searching?
Does it affect search results?
Please reply
-
Thanks & Regards
Romi
> i have the string "You may earn 25k
> dollars per week" stored in the field
> "salary"
>
> i'm using 2 copyfields "salary_min" and "salary_max" with
> source in "salary"
> with those 2 datatypes
>
> salary is "text"
> salary_min is "salary_min_text"
> salary_max is "salary_max_text"
>
> so, i
i have the string "You may earn 25k dollars per week" stored in the field
"salary"
i'm using 2 copyfields "salary_min" and "salary_max" with source in "salary"
with those 2 datatypes
salary is "text"
salary_min is "salary_min_text"
salary_max is "salary_max_text"
so, i was expecting this:
solr
> Index Analyzer
> org.apache.solr.analysis.KeywordTokenizerFactory
> {luceneMatchVersion=LUCENE_31}
> position 1
> term text £22000 - £25000 per annum +
> benefits
> startOffset 0
> endOffset 36
>
>
> org.apache.solr.analysis.PatternReplaceFilterFactory
> {replacement=$2,
> pattern=[
Index Analyzer
org.apache.solr.analysis.KeywordTokenizerFactory
{luceneMatchVersion=LUCENE_31}
position 1
term text £22000 - £25000 per annum + benefits
startOffset 0
endOffset 36
org.apache.solr.analysis.PatternReplaceFilterFactory {replacement=$2,
pattern=[^\d]?([0-9]+[k,
I am working on indexing Arabic documents containing Arabic diacritics and
dotless characters (old Arabic characters). I am using the Apache Tomcat server,
and I am using my modified version of the aramorph analyzer as the Arabic
analyzer. I managed on the development environment to normalize the arabi
Indeed, I find the Porter stemmer to be too 'aggressive' for my taste, I prefer
the EnglishMinimalStemFilterFactory, with the caveat that it depends on your
data set.
Cheers
François
On Jun 29, 2011, at 6:21 AM, Ahmet Arslan wrote:
>> Hi, when i query for "elegant" in
>> solr i get results fo
On Wed, Jun 29, 2011 at 2:01 AM, eks dev wrote:
> Quick question,
> Is there a way with solr to conditionally update document on unique
> id? Meaning, default, add behavior if id is not already in index and
> *not to touch index" if already there.
>
> Deletes are not important (no sync issues).
>
> I am using solr1.4
> When I search for keyword "ansys" I get lot of posts.
> but when I search for "ansys NOT ansi" I get nothing.
> I guess its because of Phonetic search, "ansys" is
> converted into "ansi" (
> that is NOT keyword) and nothing returns.
>
> How to handle this kind of problem.
F
> name="salary_min_text" class="solr.TextField" >
>
> class="solr.PatternReplaceCharFilterFactory"
> pattern="[^\d]?([0-9]+[k,.]?[0-9]*)+.*?([0-9]+[k,.]?[0-9]*)+.*"
> replacement="$1"/>
> class="solr.KeywordTokenizerFactory"/>
> class="solr.LowerCaseFilterFact
I am using solr1.4
When I search for keyword "ansys" I get lot of posts.
but when I search for "ansys NOT ansi" I get nothing.
I guess it's because of Phonetic search, "ansys" is converted into "ansi" (
that is NOT keyword) and nothing returns.
How to handle this kind of problem.
--
Thanks and Re
Which version of Solr (Lucene) are you using?
Recent versions of Lucene now accept ~N with N > 1 as an edit distance, i.e.
foobar~2 matches any term that's <= 2 edit distance away from foobar.
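For readers unfamiliar with the term, edit distance can be illustrated with a short Python sketch. This is the textbook dynamic-programming Levenshtein distance, not Lucene's actual fuzzy-query implementation, and the helper fuzzy_matches and the candidate terms are assumptions for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def fuzzy_matches(term, candidates, max_edits):
    """Keep the candidates within max_edits of term, like term~max_edits."""
    return [c for c in candidates if edit_distance(term, c) <= max_edits]


if __name__ == "__main__":
    print(fuzzy_matches("foobar", ["foobar", "fobar", "foxbaz", "bar"], 2))
```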
Mike McCandless
http://blog.mikemccandless.com
On Tue, Jun 28, 2011 at 11:00 PM, entdeveloper
wrote:
> Accord
this is the "final" version of my schema part, but what i get is this:
1.0
> Hi, when i query for "elegant" in
> solr i get results for "elegance" too.
>
> *I used these filters for index analyze*
> WhitespaceTokenizerFactory
> StopFilterFactory
> WordDelimiterFilterFactory
> LowerCaseFilterFactory
> SynonymFilterFactory
> EnglishPorterFilterFactory
> RemoveDuplicate
> Hi, i have this bunch of lines in my
> schema.xml that should do a replacement
> but it doesn't work!
>
> class="solr.TextField"
> omitNorms="true">
>
> class="solr.StandardTokenizerFactory"/>
> class="solr.PatternReplaceCharFilterFactory"
> pattern="([0-9]+k?[.,
Hi, when i query for "elegant" in solr i get results for "elegance" too.
*I used these filters for index analyze*
WhitespaceTokenizerFactory
StopFilterFactory
WordDelimiterFilterFactory
LowerCaseFilterFactory
SynonymFilterFactory
EnglishPorterFilterFactory
RemoveDuplicatesTokenFilterFactory
Re
sure, SSD or RAM disks fix these problems with IO.
Anyhow, I can really see no alternative for some in memory index for
slaves, especially for low latency master-slave apps (high commit rate
is a problem).
Having the possibility to run slaves in memory that are slurping updates
from the master seems t
On Wed, 2011-06-29 at 09:35 +0200, eks dev wrote:
> In MMAP, you need to have really smart warm up (MMAP) to beat IO
> quirks, for RAMDir you need to tune gc(), choose your poison :)
Other alternatives are operating system RAM disks (avoids the GC
problem) and using SSDs (nearly the same performa
Hi, i have this bunch of lines in my schema.xml that should do a replacement
but it doesn't work!
I need it to extract only the numbers from some other string. The strings
can be anything: only letters (so it should replace it with an empty
string), le
...Using RAMDirectory really does not help performance...
I kind of agree, but in my experience with lucene, there are cases
where RAMDirectory helps a lot, with all its drawbacks (huge heap and
gc() tuning).
We had very good experience with MMAP on average, but moving to
RAMDirectory with prop
> I believe I am missing something very elementary. The
> following query
> returns zero hits:
>
> http://localhost:8983/solr/core0/select/?q=testabc
With this URL, you are hitting the RequestHandler defined as in your core0/conf/solrconfig.xml.
> However, using solritas, it finds many results
> I am trying to create a feature that
> allows search results to be displayed by
> this formula sum(weight1*text relevance score, weight2 *
> price). weight1 and
> weight2 are numeric values that can be changed to influence
> the search
> results.
>
> I am sending the following query params to th