Also regarding the Join functionality, I remember Yonik pointed out it's
O(# unique terms), but I agree with Erik on the ExternalFileField, as you can
use it just inside a function query, for example for boosting.
Tommaso
2012/3/1 Erick Erickson
> Hmmm. ExternalFileFields can only be float values,
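To illustrate the point about ExternalFileField living only inside function queries, here is a minimal sketch of how such a field is typically declared and used for boosting; the field name, key field, and values below are illustrative, not taken from this thread:

```xml
<!-- schema.xml sketch (names are illustrative): a per-document float
     loaded from a file named external_popularity in the index directory,
     with lines of the form "<id>=<float>" -->
<fieldType name="extFile" class="solr.ExternalFileField"
           keyField="id" defVal="1.0" valType="pfloat"/>
<field name="popularity" type="extFile"/>
```

The values are then only reachable through a function query, e.g. a dismax boost function like bf=field(popularity), or q={!boost b=field(popularity)}phone; you cannot search or return the field directly.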
Hi
I face the issue that I have n business users. Each business user has its
own set of products. I want to provide an interface for each business user
where he can find only the products he offers. What would be a better
solution:
1.) To have one big index and filter by customer name
Hello Donnie,
1. Nothing besides design considerations prevents you from doing a search in
a QueryResponseWriter. You have a request, which isn't closed yet, from which
you can obtain a searcher.
2. Your use case isn't clear. If you need just to search categories, and
return the lists of subcategories p
Thanks Ahmet! That's good to know someone else also tried to make phrase
queries to fix multi-word synonym issue. :-)
On Thu, Mar 1, 2012 at 1:42 AM, Ahmet Arslan wrote:
> > I don't think mm will help here because it defaults to 100%
> > already by the
> > following code.
>
> Default behavior
(12/03/02 6:05), Ahmet Arslan wrote:
I have the same problem. This happens
only for some documents in the index.
Andrew, can you provide a document string and a query pair? I will try to
reproduce the exception. Then we can create a test case that fails. Others can
look into it.
+1. Please
Hi,
I am sorry if this has already been posted.
I am new to Solr.
I am crawling my site using Nutch and posting it to Solr. I am trying to
implement a feature where I want to get all data where url starts with
"http://someurl/"
Any thoughts?
Thanks,
Stan
> I'm assuming the Windows configuration looked correct?
Yeah, so far I cannot spot any smoking gun... I'm confounded at the moment.
I'll re-read through everything once more...
- Mark
I reindex every time I change something.
I also delete any ZooKeeper data.
I'm assuming the Windows configuration looked correct?
On Thu, Mar 1, 2012 at 3:39 PM, Mark Miller wrote:
> P.S. FYI you will have to reindex after adding _version_ back to the schema...
>
> On Mar 1, 2012, at 3:35 PM, M
I tried publishing to /update/extract request handler using manifold, but
got the same result.
I also tried swapping out the replication handlers too, but that didn't do
anything.
Otherwise, that's it.
On Thu, Mar 1, 2012 at 3:35 PM, Mark Miller wrote:
> Any other customizations you are making
Hi,
Apologies if this has been answered before, I tried searching for it and
didn't find anything answering this exactly.
I want to find similar documents using MLT Handler using some specified
fields but I want to filter down the returned matches with some keywords as
well.
I looked at the exam
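For the question above, the MLT handler does accept fq alongside the mlt.* parameters, so one way to sketch the request is shown below; the handler path, field names, and filter value are illustrative assumptions, not values from this thread:

```python
from urllib.parse import urlencode

# Sketch only: handler path, field names, and the filter are assumptions.
def mlt_url(base, seed_query, similarity_fields, keyword_filter):
    params = {
        "q": seed_query,                        # the seed document
        "mlt.fl": ",".join(similarity_fields),  # fields used for similarity
        "fq": keyword_filter,                   # narrows the returned matches
        "mlt.mintf": 1,
        "mlt.mindf": 1,
    }
    return base + "/mlt?" + urlencode(params)

url = mlt_url("http://localhost:8983/solr", "id:123",
              ["title", "body"], "keywords:solr")
print(url)
```

The fq is applied to the documents MLT would return, which is exactly the "filter down the matches with some keywords" part.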
On Thu, Mar 1, 2012 at 3:34 AM, Michael Jakl wrote:
> The topic field holds roughly 5
> values per doc, but I wasn't able to compute the correct number right
> now.
How many unique values for that field in the whole index?
If you have log output (or output from the stats page for
fieldValueCache)
> @iorixxx: Where can I find that
> example schema.xml?
Please find text_general_rev at
http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/example/solr/conf/schema.xml
> And when I find it, can I just make the title field which
> currently is of
> "text" type then of "text_rev" type?
Yes,
Hi all,
The documents in our Solr index have a parent-child relationship which we
have basically flattened in our Solr queries. We have massaged Solr into
being the query API for 3rd-party data. The relationship is a simple
parent-child relationship as follows:
category
+-sub-category
this ult
Only one interval? in that case you could add a filter query and facet
in the regular way. That is:
facet.field=person&fq=person:[A TO C]
But consider that you will get the search results that include those
persons only.
Thanks
Emmanuel
2012/3/1 AlexR :
> Hi
>
> i need to build buckets with al
One frequent method of doing leading and trailing wildcards
is to use ngrams (as distinct from edge ngrams). That in
combination with phrase queries might work well in this case.
You also might be surprised at how little space bigrams take;
give it a test and see.
Best
Erick
On Thu, Mar 1, 2012
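A rough sketch of why ngrams cover both leading and trailing wildcards: every contiguous substring of length n of an indexed token becomes a term, so an infix query reduces to matching the query's own grams. Real matching also enforces phrase positions; the set-membership check below is a simplification:

```python
# Illustration only: what an NGram filter conceptually emits.
def ngrams(text, n=2):
    """All contiguous n-grams of a string."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]

indexed = set(ngrams("smartphone"))   # sm, ma, ar, rt, tp, ph, ho, on, ne
infix = ngrams("mart")                # a *mart* query: ma, ar, rt
print(all(g in indexed for g in infix))  # True: every query gram is indexed
```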
I'm really confused here. Your first question seemed to be about http
involved in index replication, which really doesn't seem to be
related to your latest post. Can you start over from the beginning?
Best
Erick
On Thu, Mar 1, 2012 at 9:56 AM, Neel wrote:
> Hi Erick, Thanks for your post.
>
> We
I don't think Spatial search will fully fit into this. I have 2 approaches in
mind but I am not satisfied with either one of them.
a) Have 2 separate indexes. First one to store the information about all the
cities and second one to store the retail stores information. Whenever user
searches fo
@iorixxx: Where can I find that example schema.xml?
I downloaded the latest version here:
ftp://apache.mirror.easycolocate.nl//lucene/solr/3.5.0
And checked \example\example-DIH\solr\db\conf\schema.xml
But no text_rev type is defined in there.
And when I find it, can I just make the title field w
--- On Thu, 3/1/12, PeterKerk wrote:
> From: PeterKerk
> Subject: Re: Need tokenization that finds part of stringvalue
> To: solr-user@lucene.apache.org
> Date: Thursday, March 1, 2012, 6:59 PM
> @iorixxx: yes, that is what I need.
> But also when its IN the text, not
> necessarily at the begi
> I have the same problem. This happens
> only for some documents in the index.
Andrew, can you provide a document string and a query pair? I will try to
reproduce the exception. Then we can create a test case that fails. Others can
look into it.
P.S. FYI you will have to reindex after adding _version_ back to the schema...
On Mar 1, 2012, at 3:35 PM, Mark Miller wrote:
> Any other customizations you are making to solrconfig?
>
> On Mar 1, 2012, at 1:48 PM, Matthew Parker wrote:
>
>> Added it back in. I still get the same result.
>>
>> On
Any other customizations you are making to solrconfig?
On Mar 1, 2012, at 1:48 PM, Matthew Parker wrote:
> Added it back in. I still get the same result.
>
> On Wed, Feb 29, 2012 at 10:09 PM, Mark Miller wrote:
> Do you have a _version_ field in your schema? I actually just came back to
> this
I have the same problem. This happens only for some documents in the index.
Like sharadgaur, the problem ceased when I removed
ReversedWildcardFilterFactory from my analysis chain,
HTMLStripCharFilterFactory has been there before and after.
I am running branch-3.6 r1238628. As far as I can tell,
Hi, all!
It may seem strange, but can those of you who read this post answer some
questions? I want to understand whether maybe I want too much from my Solr, so:
1) Solr version;
2) Summary doc count;
3) Shards count (if exists);
4) rows count at query (from ... into);
5) Average queries per minute (QP
Added it back in. I still get the same result.
On Wed, Feb 29, 2012 at 10:09 PM, Mark Miller wrote:
> Do you have a _version_ field in your schema? I actually just came back to
> this thread with that thought and then saw your error - so that remains my
> guess.
>
> I'm going to improve the doc
Hi
I need to build buckets with alphanumeric values.
For example:
facet.field=person
person: Alex(10), Ben(5), George(8), Paul(3), Peter(2), Stefan(9)
Now I need all persons in the interval A-C.
With facet.query=person:[A TO C] I only get the number of matches (15),
but I want to have the values
Thanks Robert. Yes, that's right, I can get some more accuracy if I use
transposition in addition to substitution, insertion and deletion.
From: Robert Muir [rcm...@gmail.com]
Sent: Thursday, March 01, 2012 9:50 PM
To: solr-user@lucene.apache.org
Subject: Re: Spe
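The difference transposition makes can be sketched with the classic dynamic-programming edit distance, extended with the restricted Damerau transposition case. This is a from-scratch illustration, not Lucene's implementation:

```python
def edit_distance(a, b, transpositions=False):
    """Levenshtein distance; optionally count an adjacent transposition
    as a single edit (restricted Damerau-Levenshtein)."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
            if (transpositions and i > 1 and j > 1
                    and a[i - 1] == b[j - 2] and a[i - 2] == b[j - 1]):
                d[i][j] = min(d[i][j], d[i - 2][j - 2] + 1)
    return d[m][n]

# "Marien" vs "Marine": two substitutions under plain Levenshtein,
# but a single adjacent transposition when transpositions count as one.
print(edit_distance("Marien", "Marine"))        # 2
print(edit_distance("Marien", "Marine", True))  # 1
print(edit_distance("Marien", "Market"))        # 2
```

With transpositions counted, "Marine" wins over "Market" on distance alone, which is the extra accuracy being discussed.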
Thanks James. I loved the last line in your mail: "But in the end, especially
with 1-word queries, I doubt even the best algorithms are going to always
accurately guess what the user wanted." I absolutely agree with this; if it is a
phrase (instead of a single word) then probably we can apply some N
@iorixxx: yes, that is what I need. But also when it's IN the text, not
necessarily at the beginning.
So using the * character like:
q=smart*
the product is found, but when I do this:
q=*mart*
it isn't... why is that?
On Thu, Mar 1, 2012 at 6:43 AM, Husain, Yavar wrote:
> Hi
>
> For spell checking component I set extendedResults to get the frequencies and
> then select the word with the best frequency. I understand the spell check
> algorithm is based on Edit Distance. For example:
>
> Query to Solr: Marien
>
> but the following doesn't work.
> TESTING*
Please see the following writeups:
http://wiki.apache.org/solr/MultitermQueryAnalysis
http://www.lucidimagination.com/blog/2011/11/29/whats-with-lowercasing-wildcard-multiterm-queries-in-solr/
> if title holds "smartphone" I want it to be found when
> someone types
> "martph" or "smar" or "smart".
Peter, so you want a beginsWith/startsWith type of search? You can use
wildcard search (with the star operator) for this, e.g. &q=smar*
Alternatively, if your index size is not huge, you
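Where that alternative was cut off, the usual route is index-time n-gramming, so that a plain query term like "mart" matches "smartphone" without wildcards. A sketch of such a fieldType; the gram sizes are assumptions to tune for your data:

```xml
<!-- Illustrative fieldType: n-gram at index time only, so plain query
     terms match substrings of indexed tokens without wildcards -->
<fieldType name="text_ngram" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.NGramFilterFactory" minGramSize="2" maxGramSize="15"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```

The trade-off is index size and some false matches on very short grams, which is why it suits smaller indexes.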
Speaking of which, there is a spellchecker in jira that will detect word-break
errors like this. See "WordBreakSpellChecker" at
https://issues.apache.org/jira/browse/LUCENE-3523 .
To use it with Solr, you'd also need to apply SOLR-2993
(https://issues.apache.org/jira/browse/SOLR-2993). This S
Yavar,
When you listed what the spell checker returns you put them in this order:
> Marine (Freq: 120), Market (Freq: 900) and others
Was "Marine" listed first, and then did you pick "Market" because you thought
higher frequency is better? If so, you probably have the right settings
already b
I once used a spell checker to break up compound words. It was slow, but worked
pretty well.
wunder
On Mar 1, 2012, at 5:53 AM, Erick Erickson wrote:
> Right, there's nothing in Solr that I know of that'll help here. How would
> a tokenizer understand that "smartphone" should be "smart" "phone"
Hi!
Having just worked through the solr tutorial
(http://lucene.apache.org/solr/tutorial.html) I think I found two minor
"bugs":
1.
The "delete by query" example
java -Ddata=args -jar post.jar ""
should read
java -Ddata=args -jar post.jar "name:DDR"
2.
The link to the mailing lists at the end
Any segment files on SSD will be faster in cases where the file is not
in OS cache. If you have enough RAM a lot of index segment files will
end up in the OS cache so it won't have to go to disk anyway. Since
most indexes are bigger than RAM, an SSD helps a lot. But if the index is
much larger than
Hi Erick, Thanks for your post.
We are not directly providing search results from the Lucene index to the
user. We are processing the Lucene search results and adding additional
information to them, obtained from different sources [from other Lucene
indexes or from databases]. So,
consuming search results fro
I'm just starting out...
for either
testing QA
TESTING QA
I can query with the following strings and find my text:
testing
TESTING
testing*
but the following doesn't work.
TESTING*
any ideas?
thanks
Neil
> what about if a search string starts with "$o$"? This is not recognized by
> dismax either, right? Is there another filter I have to use?
I don't fully follow your question, but it seems that you want to search for
special characters too? With the raw or term query parser plugin you can do that.
htt
Hi,
what about if a search string starts with "$o$"? This is not recognized by
dismax either, right? Is there another filter I have to use?
Thanks,
Ramo
-Original Message-
From: Ahmet Arslan [mailto:iori...@yahoo.com]
Sent: Thursday, March 1, 2012 12:44
To: solr-user@lucene.a
Perfect! Thanks!
On Wed, Feb 29, 2012 at 3:29 PM, Emmanuel Espina
wrote:
> I think that what you want is FieldCollapsing:
>
> http://wiki.apache.org/solr/FieldCollapsing
>
> For example
> &q=my search&group=true&group.field=subject&group.limit=5
>
> Test it to see if that is what you want.
>
> Th
I think I didn't explain myself clearly: I need to be able to find substrings.
So, it's not that I'd expect Solr to find synonyms, but rather if a piece of
text contains the searched text, for example:
if the title holds "smartphone" I want it to be found when someone types
"martph" or "smar" or "smart"
Right, there's nothing in Solr that I know of that'll help here. How would
a tokenizer understand that "smartphone" should be "smart" "phone"?
There's no general solution for this issue.
You can do domain-specific solutions with synonyms for instance, or
some other word list that contains terms yo
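As a sketch of the domain-specific route: a synonyms file consumed by SynonymFilterFactory at index time can decompound known terms. The mapping below is an invented example, not an actual word list:

```
# synonyms.txt (illustrative): map a known compound to its parts
smartphone => smartphone, smart phone
```

Keeping the original token on the right-hand side preserves exact-match behavior while also making "smart" and "phone" searchable.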
Currently, the page you referenced here:
http://wiki.apache.org/solr/SolrReplication
is the standard way to replicate incremental indexes.
You say you're "worried about the extra http". Why?
Do you have any evidence that this would be a problem?
HTTP isn't inherently inefficient at all, and even if
Hmmm. ExternalFileFields can only hold float values, so I'm not
sure "the necessary data" is straightforward. Additionally, they
are used in function queries. Does this still work?
I really don't know the performance characteristics if, say, you have
users with access to all documents for SOLR-2272
On Thursday 01 March 2012 13:03:18 Bernd Fehling wrote:
> What is netstat telling you about the connections on the servers?
>
> Any connections in "CLOSE_WAIT" (passive close) hanging?
I can't tell exact numbers right now but there were a lot between all the
cores and the indexing clients.
>
Do you have autocommit enabled? I tested this with 1m docs indexed by
using the default example config and saw used file descriptors go up
to 2400 (did not come down even after the final commit at the end).
Then I disabled autocommit, reindexed and the descriptor count stayed
pretty much flat at ar
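For reference, the knob being toggled above is the autoCommit block in solrconfig.xml; a sketch with illustrative thresholds, not the values used in the test above:

```xml
<!-- solrconfig.xml sketch (thresholds are assumptions): bound the amount
     of uncommitted state so open segment files get released periodically -->
<updateHandler class="solr.DirectUpdateHandler2">
  <autoCommit>
    <maxDocs>10000</maxDocs>
    <maxTime>60000</maxTime> <!-- milliseconds -->
  </autoCommit>
</updateHandler>
```

With no commits at all during a large bulk index, file descriptors accumulate until the final commit, which matches the behavior described.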
Hi,
Just wondering if anyone had any experience with solr and flashcache
[https://wiki.archlinux.org/index.php/Flashcache], my guess is it might
be particularly useful for indices not changing that often, and for
large indices where an SSD of that size is prohibitive.
Cheers,
Dan
What is netstat telling you about the connections on the servers?
Any connections in "CLOSE_WAIT" (passive close) hanging?
Saw this on my servers last week.
Used a little program to spoof a local connection on those server ports
and was able to fake the TCP stack into closing those connections.
It a
> does that affect my result list? Because if I use dismax, and type into
> my search field the title "blue on blue" (without quotes), I get this
> product as the first result. If I use dismax without boosting and search for
> "blue on blue" (without quotes) I'm not getting this result
>
Hi
For spell checking component I set extendedResults to get the frequencies and
then select the word with the best frequency. I understand the spell check
algorithm is based on Edit Distance. For example:
Query to Solr: Marien
Spell Check Text Returned: Marine (Freq: 120), Market (Freq: 900)
Hi,
Yesterday we had an issue with too many open files, which was solved
because a username was misspelled. But there is still a problem with
open files.
We cannot successfully index a few million documents from MapReduce to
a 5-node SolrCloud cluster. One of the problems is that after a wh
Hi,
does that affect my result list? Because if I use dismax, and type into
my search field the title "blue on blue" (without quotes), I get this
product as the first result. If I use dismax without boosting and search for
"blue on blue" (without quotes) I'm not getting this result in the first
> I've got an issue when searching with a search string like: 'title:"Blue"
> on "Blu'. The original search string is: 'title:"Blue" on "Blue"' and this
> works well. If I now delete the last double quote and the "e" then I get the
> error below. Is there any filter that can handle such
>
Hi,
I've got an issue when searching with a search string like: 'title:"Blue"
on "Blu'. The original search string is: 'title:"Blue" on "Blue"' and this
works well. If I now delete the last double quote and the "e" then I get the
error below. Is there any filter that can handle such searches w
> I don't think mm will help here because it defaults to 100%
> already by the
> following code.
Default behavior of mm has changed recently. So it is a good idea to explicitly
set it to 100%. Then all of the search terms must match.
> Regarding multi-word synonym, what is the best way to handle
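Since the default changed, pinning mm explicitly in the handler defaults is the safe move; an illustrative solrconfig.xml fragment, where the handler name and the other defaults are assumptions:

```xml
<!-- solrconfig.xml sketch: pin mm so all query terms must match,
     regardless of the version's built-in default -->
<requestHandler name="/select" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="mm">100%</str>
  </lst>
</requestHandler>
```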
Hi!
On Wed, Feb 29, 2012 at 22:21, Emmanuel Espina wrote:
> No. But probably we can find another way to do what you want. Please
> describe the problem and include some "numbers" to give us an idea of
> the sizes that you are handling. Number of documents, size of the
> index, etc.
Thank you! Ou
Thanks Mark,
Good, this is probably good enough to give it a try. My analyzers are
normally fast; doing duplicate analysis (at each replica) is
probably not going to cost a lot, if there is some decent "batching".
Can this be somehow controlled (depth of this buffer / time till flush
or some such