Hm, I was purposely avoiding mentioning ngrams because just ngramming
all indexed tokens would balloon the index. My assumption was that
only *some* words are misspelled, in which case it may be better not
to ngram all tokens.
Otis
--
Solr & ElasticSearch Support
http://sematext.com/
On
Could anyone help me see why the Solritas page fails?
I can go to http://localhost:8080/solr without a problem, but fail to go to
http://localhost:8080/solr/browse
The status report is below. Any help is appreciated.
Thanks!
Andy
*type* Status report
*mess
Another theoretical answer to this question is the n-gram approach. You can
index the word and its trigrams. Query the index by the string as well as
its trigrams, with a % match search. You then pass the exhaustive result set
through a more expensive scoring function such as Smith-Waterman.
Thanks,
Jagdish
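The trigram lookup described above can be sketched in a few lines of Python.
This is only an illustration of the idea, not Solr code: the "$" padding
character, the function names, and the 0.5 match threshold are all
assumptions made for the example.

```python
from collections import defaultdict

def trigrams(word):
    """Return the set of character trigrams of a word, padded with '$'."""
    padded = "$" + word.lower() + "$"
    return {padded[i:i + 3] for i in range(len(padded) - 2)}

def build_index(words):
    """Map each trigram to the set of indexed words that contain it."""
    index = defaultdict(set)
    for word in words:
        for gram in trigrams(word):
            index[gram].add(word)
    return index

def candidates(index, query, min_match=0.5):
    """Return indexed words sharing at least min_match of the query's trigrams."""
    grams = trigrams(query)
    hits = defaultdict(int)
    for gram in grams:
        for word in index.get(gram, ()):
            hits[word] += 1
    return sorted(w for w, n in hits.items() if n / len(grams) >= min_match)

index = build_index(["elasticsearch", "solr", "lucene"])
print(candidates(index, "elasticsaerch"))
```

The candidate list is deliberately generous; a more expensive scorer would
then be run only over these candidates rather than over the whole index.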
Hi Kevin,
Would http://search-lucene.com/?q=LBHttpSolrServer work for you?
Otis
--
Solr & ElasticSearch Support
http://sematext.com/
On Fri, May 24, 2013 at 3:12 PM, Kevin Osborn wrote:
> We are looking to install SolrCloud on Azure. We want it to be an internal
> service. For some application
I have not heard of anyone using HLL in Solr, but:
https://docs.google.com/presentation/d/1ESNiqd7HuIfuwXSSK81PAAu6AmEPEE0u_vyk4FU5x9o/present#slide=id.p
https://github.com/ptdavteam/elasticsearch-approx-plugin
Otis
--
Solr & ElasticSearch Support
http://sematext.com/
On Tue, May 28, 2013 at
It's 2013 and people suffer from ADD. Break it up into a la carte
chapter books.
Otis
--
Solr & ElasticSearch Support
http://sematext.com/
On Wed, May 29, 2013 at 6:23 PM, Jack Krupansky wrote:
> Markus,
>
> Okay, more pages it is!
>
> -- Jack Krupansky
>
> -Original Message- From:
n-grams might help, followed by an edit distance metric such as Jaro-Winkler
or Smith-Waterman-Gotoh to filter the results further.
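As a rough sketch of that second filtering step, here is a plain
Smith-Waterman local alignment score in Python. Note that this is the basic
linear-gap variant, not the Gotoh (affine-gap) refinement named above, and the
scoring values (match +2, mismatch -1, gap -1) and function names are
assumptions chosen for the example.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-1):
    """Best local alignment score between strings a and b (linear gap cost)."""
    prev = [0] * (len(b) + 1)
    best = 0
    for ca in a:
        curr = [0]
        for j, cb in enumerate(b, start=1):
            diag = prev[j - 1] + (match if ca == cb else mismatch)
            # A local alignment may restart anywhere, so scores never drop below 0.
            score = max(0, diag, prev[j] + gap, curr[j - 1] + gap)
            curr.append(score)
            best = max(best, score)
        prev = curr
    return best

def rerank(query, candidates):
    """Order candidate words by descending alignment score against the query."""
    return sorted(candidates, key=lambda c: smith_waterman(query, c), reverse=True)

print(rerank("solr", ["lucene", "solar", "solr"]))
```

Because the alignment is quadratic in the word lengths, it makes sense to run
it only on the short candidate list that the n-gram lookup returns.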
On Sun, Jun 9, 2013 at 1:59 AM, Otis Gospodnetic wrote:
> Interesting problem. The first thing that comes to mind is to do
> "word expansion" during indexing. Kind of lik
Interesting problem. The first thing that comes to mind is to do
"word expansion" during indexing. Kind of like synonym expansion, but
maybe a bit more dynamic. If you can have a dictionary of correctly
spelled words, then for each token emitted by the tokenizer you could
look up the dictionary a
Hi,
Is there a notion of a data-node vs. non-data node in SolrCloud?
Something a la http://www.elasticsearch.org/guide/reference/modules/node/
Thanks,
Otis
Solr & ElasticSearch Support
http://sematext.com/
I have noticed that when I write a doc with an id that already exists, it
creates a new revision with only the fields from the second write. I
guess there is a REST API in the latest Solr version which updates only
selected fields.
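For reference, Solr 4.x atomic updates express "change only these fields" with
a "set" modifier in the JSON update message. The sketch below only builds such
a payload; the unique key name "id" and the field name "price" are
hypothetical, and atomic updates generally need the affected fields to be
stored and the update log to be enabled.

```python
import json

def atomic_update(doc_id, changes):
    """Build a JSON update message that sets only the given fields."""
    doc = {"id": doc_id}  # assumes the unique key field is called "id"
    for field, value in changes.items():
        doc[field] = {"set": value}
    return json.dumps([doc])

# Hypothetical document id and field; the body would be POSTed to the
# update handler with Content-Type: application/json.
print(atomic_update("doc1", {"price": 99}))
```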
In my opinion, merge should be creating a doc which is a union
Hello,
This sounds like a custom SearchComponent.
Which clustering library you use, or whether you DIY, is up to you, but go
with the SearchComponent approach. You will still need to process N
hits, but you won't need to first send them all over the wire.
Otis
--
Solr & ElasticSearch Support
http://se
Hi,
Is there anything in SolrCloud that would support query-node/shard
affinity/stickiness?
What I mean by that is a mechanism that is smart enough to keep
sending the same query X to the same node(s)+shard(s)... with the goal
being better utilization of Solr and OS caches?
Example:
* Imagine a
The syntax looks fine, but I get all the records. With the example
given above I get all the documents, meaning the filtering did not work. I am
curious to know whether my indexing went fine. I will check and report
back.
On Sun, Jun 9, 2013 at 7:21 AM, Otis Gospodnetic wrote:
> Try:
>
> ...
Also, please note that for some documents, blocked_company_ids may not be
present at all. In such cases the document should still appear in the search
results.
BR,
Kamal
On Sun, Jun 9, 2013 at 7:07 AM, Kamal Palei wrote:
> Dear All
> I have a multi-valued field blocked_company_ids in index.
Try:
...&q=*:*&fq=-blocked_company_ids:5
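To show what this negative filter is expected to return, here is a small
Python sketch that builds the parameters and mimics the filter on the sample
documents from the question. It is a toy model, not Solr itself; "document4"
is a hypothetical document without the field, added to show that such
documents still pass a negative filter.

```python
from urllib.parse import urlencode, parse_qs

# Query-string parameters: match everything, then filter out company 5.
params = urlencode({"q": "*:*", "fq": "-blocked_company_ids:5"})

docs = {
    "document1": [1, 5, 7],
    "document2": [2, 6, 7],
    "document3": [4, 5, 6],
    "document4": None,  # hypothetical: field not present at all
}

def exclude_blocked(docs, company_id):
    """Keep documents whose blocked_company_ids does not contain company_id."""
    return [name for name, ids in docs.items() if company_id not in (ids or [])]

print(parse_qs(params))
print(exclude_blocked(docs, 5))
```

If the real index returns every document for this filter, that would suggest
the field values were not indexed the way the query assumes.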
Otis
--
Solr & ElasticSearch Support
http://sematext.com/
On Sat, Jun 8, 2013 at 9:37 PM, Kamal Palei wrote:
> Dear All
> I have a multi-valued field blocked_company_ids in index.
>
> You can think like
>
> 1. document1 , blocked_company_ids: 1, 5, 7
Dear All
I have a multi-valued field blocked_company_ids in the index.
You can think of it like
1. document1, blocked_company_ids: 1, 5, 7
2. document2, blocked_company_ids: 2, 6, 7
3. document3, blocked_company_ids: 4, 5, 6
and so on.
If I want to retrieve all the documents where blocked_compan
Maybe returned hits match other query terms.
Otis
Solr & ElasticSearch Support
http://sematext.com/
On Jun 8, 2013 6:34 PM, "jchen2000" wrote:
> I wanted to analyze high frequency terms using Solr's Luke request handler
> and keep updating the stopwords file for new queries from time to time.
>
Remove the stopFilter from the "index" section of your fieldType, only keep it
in the "query" section. This way your stopwords will always be indexed and
edismax will be able to selectively remove stopwords from the query depending
on whether all words are stopwords or not.
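The query-side behaviour described here can be modelled with a toy Python
function: stopwords are dropped from the query unless the query consists only
of stopwords. This is a sketch of the logic, not Solr's implementation, and
the stopword list is a made-up example.

```python
# Hypothetical stopword list, for illustration only.
STOPWORDS = {"the", "a", "of", "to", "be", "or", "not"}

def query_terms(query, stopwords=STOPWORDS):
    """Drop stopwords from the query unless every term is a stopword."""
    terms = query.lower().split()
    kept = [t for t in terms if t not in stopwords]
    return kept if kept else terms

print(query_terms("the history of solr"))
print(query_terms("to be or not to be"))
```

The all-stopword query can only match if the stopwords were kept at index
time, which is why the filter is removed from the "index" section.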
--
Jan Høydahl, sear
I wanted to analyze high frequency terms using Solr's Luke request handler
and keep updating the stopwords file for new queries from time to time.
Obviously I have to index all terms whether they belong to the stopwords list
or not.
So I configured query analyzer stopwords list but disabled index anal
Turning swappiness down to 0 can have some decent performance impact.
- http://en.wikipedia.org/wiki/Swappiness
In the past, I've seen better performance with ext3 over ext4 around
commits/fsync. Tests were slow enough (lots of these operations),
that I made a special ext3 partition w
On Jun 8, 2013, at 12:52 PM, Jamie Johnson wrote:
> When merging through the core admin (
> http://wiki.apache.org/solr/MergingSolrIndexes) what is the policy for
> conflicts during the merge? So for instance if I am merging core 1 and
> core 2 into core 0 (first example), what happens if core
When merging through the core admin (
http://wiki.apache.org/solr/MergingSolrIndexes) what is the policy for
conflicts during the merge? So for instance if I am merging core 1 and
core 2 into core 0 (first example), what happens if core 1 and core 2 both
have a document with the same key, say core
Usually, people want to do the opposite - store the numeric code as a
numeric field for perceived efficiency and let the user query and view
results with the text form. But, there isn't any evidence of any great
performance benefit of doing so - just store the string code in a string
field.
A