>Generally a good idea, but be prepared to entertain requests asking that
>queries can also be performed using those aliases. I mean, when you talk
>about something "similar" to aliases in SQL, remember that those aliases
>can be used in the WHERE clause of SQL scripts too.
>
>Cheers
>Avlesh
Collins: I am not sure what you are trying to say.
--
regards
j.L (I live in Shanghai, China)
On Fri, Jun 5, 2009 at 5:54 AM, Kir4 wrote:
>
> Is it possible to create a custom analyzer (index time) that uses
> UpdateRequestProcessor to add new fields to posts, based on the tokens
> generated by the other analyzers that have been run (before my custom
> analyzer)?
No. UpdateRequestProcessors run before the analysis chain, so the tokens
produced by the analyzers are not available to them.
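For context, update processors are configured as a chain in solrconfig.xml and operate on the whole input document before any analysis runs; a minimal sketch, with the custom class name purely hypothetical:

  <updateRequestProcessorChain name="addfields">
    <!-- a custom processor sees the raw SolrInputDocument, never analyzer tokens -->
    <processor class="com.example.AddFieldsProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>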
Generally a good idea, but be prepared to entertain requests asking that
queries can also be performed using those aliases. I mean, when you talk
about something "similar" to aliases in SQL, remember that those aliases
can be used in the WHERE clause of SQL scripts too.
Cheers
Avlesh
2009/6/5 Noble Paul നോബിള് नोब्ळ्:
Hi Otis,
is it a good idea to provide an aliasing feature for Solr, similar to
SQL's 'as'?
In SQL we can do:
select location_da_dk as location
Solr could have:
fl.alias=location_da_dk:location
--Noble
On Fri, Jun 5, 2009 at 3:10 AM, Otis Gospodnetic wrote:
>
> Aha, so you really want to rename the field at response time?
And the field should be of type text, right, Otis?
Does one still need those "anchors" if the type is string with the filters
you suggested?
Cheers
Avlesh
On Fri, Jun 5, 2009 at 6:35 AM, Otis Gospodnetic wrote:
>
> I re-read your original request. Here is the recipe that should work:
>
> * Define a new field type that: ...
Nice suggestion Noble!
If you are using SolrJ, then this particular binding can be an answer to
your question.
Cheers
Avlesh
2009/6/5 Noble Paul നോബിള് नोब्ळ्
> How are you accessing Solr? SolrJ?
>
> does this help?
> https://issues.apache.org/jira/browse/SOLR-1129
>
> On Fri, Jun 5, 2009 at 3
How are you accessing Solr? SolrJ?
does this help?
https://issues.apache.org/jira/browse/SOLR-1129
On Fri, Jun 5, 2009 at 3:00 AM, Manepalli, Kalyan
wrote:
> Otis,
> With that solution, the client has to accept all the per-language location
> fields (location_de_de, location_it_it). I want to copy the result into the
> "location" field, so that the client can just accept location.
Can you analyze the logs to see which categories people choose for
each query? When there are enough queries and a clear preference,
you can highlight that choice.
wunder
On 6/4/09 9:21 PM, "Avlesh Singh" wrote:
> If you haven't already given this a thought, you may want to try out an
> auto-complete feature, suggesting those categories upfront.
Did you try the NumberFormatTransformer?
On Fri, Jun 5, 2009 at 12:09 AM, Jianbin Dai wrote:
>
> Hi, one of the fields to be indexed is price, which is comma separated, e.g.,
> 12,034.00. How can I index it as a number?
> I am using DIH to pull the data. Thanks.
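To illustrate Noble's suggestion, a minimal data-config.xml sketch using that transformer (the entity name and query are made up):

  <entity name="item" transformer="NumberFormatTransformer"
          query="select price from items">
    <!-- parses "12,034.00" into a plain number -->
    <field column="price" formatStyle="number"/>
  </entity>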
If you haven't already given this a thought, you may want to try out an
auto-complete feature, suggesting those categories upfront.
Cheers
Avlesh
On Fri, Jun 5, 2009 at 3:56 AM, ram_sj wrote:
>
> Hi,
>
> I have more than 20 categories for my search application. I'm interested in
> finding the category of the query entered by the user dynamically.
On Thu, Jun 4, 2009 at 11:29 PM, Robert Purdy wrote:
>
> Thanks for the good information. :) Well, I haven't had any evictions in any of
> the caches in years, but the hit ratio is 0.51 in queryResultCache, 0.77 in
> documentCache, 1.00 in the fieldValueCache, and 0.99 in the filterCache. So
> in your opinion should the documentCache and queryResultCache use th
On Mon, Feb 16, 2009 at 4:30 PM, revathy arun wrote:
> Hi,
>
> When I index Chinese content using the Chinese tokenizer and analyzer in Solr
> 1.3, some of the Chinese text files are getting indexed but others are not.
>
Are you sure your analyzer handles it correctly?
If you are not sure, you can check it with the analysis link in the Solr admin UI.
First: you do not have to restart Solr; you can replace the old data with new
data and tell Solr to use the new index. You can find something for this in the
shell scripts that ship with Solr.
Second: you do not have to restart Solr; just keep the id the same. For example,
with old doc id:1,title:hi and new doc id:1,title:welcome, just index the new
data and it will overwrite the old document.
I re-read your original request. Here is the recipe that should work:
* Define a new field type that:
Uses KeywordTokenizer
Uses LowerCaseFilter
* Make your field be of the above type.
* Use those begin/end anchor characters at index and search time.
I believe that should work. Please try it.
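A minimal schema.xml sketch of that recipe (the type and field names here are made up for illustration):

  <fieldType name="string_exact" class="solr.TextField">
    <analyzer>
      <!-- KeywordTokenizer emits the whole field value as a single token -->
      <tokenizer class="solr.KeywordTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
    </analyzer>
  </fieldType>
  <field name="title_exact" type="string_exact" indexed="true" stored="true"/>

With that in place, both the indexed value and the query would carry the anchors, e.g. "$hello the world$".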
I don't think there is anything ready to be used in Solr (but it would be easy
to add); if you indexed your field with custom "beginning of string" and "end
of string" anchors, you'd be able to get your exact matching working.
For example, convert "hello the world" to "$hello the world$" before indexing.
Ram,
Typical queries are short, so they are hard to categorize using statistical
approaches. Maybe categorization of queries would work with a custom set of
rules applied to queries?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
Is it possible to create a custom analyzer (index time) that uses
UpdateRequestProcessor to add new fields to posts, based on the tokens
generated by the other analyzers that have been run (before my custom
analyzer)?
The content of said fields must differ from post to post based on the tokens
extracted.
Hey,
Your system sounds similar to the work done by Stu Hood at Rackspace in their
Mailtrust unit. See
http://highscalability.com/how-rackspace-now-uses-mapreduce-and-hadoop-query-terabytes-data
for more details and inspiration.
Regards,
Jeff
On Thu, Jun 4, 2009 at 4:58 PM, wrote:
> Hi,
> This i
Hi,
It is encouraging to know that a Solr/Lucene solution may work.
Can anyone using Solr/Lucene for such a scenario confirm that the solution is
used and working fine? That would be really helpful, as I only started looking
into the Solr/Lucene solution a couple of days back and might be di
Here is what we have:
for all the documents we have a field called "small_body", which is a 60-char
max text field where we store the "abstract" for each article.
We have about 8,000,000 documents indexed, and usually we display this
small_body on our "listing pages".
For each listing pa
What we usually do to reindex is:
1. stop Solr
2. rm -rf data (that is, to remove everything in /opt/solr/data/)
3. mkdir data
4. start Solr
5. start the reindex. With this we're sure we have no old copies of the
index.
To check the index size we do:
cd data
du -sh
Otis Gospodnetic wrote:
I still have a problem with exact matching.
query.setQuery("title:\"hello the world\"");
This will return all docs with title containing "hello the world"; i.e.,
"hello the world, Jack" will also be matched. What I want is exactly "hello the
world". Setting this field to string instead of text
Hi,
I have more than 20 categories for my search application. I'm interested in
finding the category of the query entered by the user dynamically, instead of
asking the user to filter the results through a long list of categories.
It's a general question, not specific to Solr though; any suggestion abo
My guess is Solr/Lucene would work. Not sure how well/fast, but it would, esp.
if you avoid range queries (or use tdate), and esp. if you shard/segment
indices smartly, so that at query time you send (or distribute if you have to)
the query to only those shards that have the data (if your quer
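(With distributed search, that routing amounts to choosing the shards parameter per query, e.g. shards=host1:8983/solr,host2:8983/solr, listing only the shards that cover the relevant time range; the host names here are placeholders.)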
Hi,
As Alex correctly pointed out, my main intention is to figure out whether
Solr/Lucene offers the functionality to replicate what Splunk does in terms of
building indexes etc. for enabling search capabilities.
We evaluated Splunk, but it is not a very cost-effective solution for us, as we
may hav
Aha, so you really want to rename the field at response time? I wonder if this
is something that could be done with (or should be added to) response writers.
That's where I'd go look first.
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
I can't tell what that analyzer does, but I'm guessing it uses n-grams?
Maybe consider trying https://issues.apache.org/jira/browse/LUCENE-1629 instead?
Otis
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
On Tue, Jun 2, 2009 at 11:28 PM, anuvenk wrote:
> I'm using query time synonyms.
These don't currently work if the synonyms expand to more than one
option, and those options have a different number of words.
-Yonik
http://www.lucidimagination.com
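The usual workaround is to expand such synonyms at index time instead; a minimal schema.xml sketch (the synonyms file name is assumed):

  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- expand="true" indexes every variant, so multi-word mappings like
         "usa, united states of america" still match at query time -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
  </analyzer>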
Otis,
With that solution, the client has to accept all the per-language location
fields (location_de_de, location_it_it). I want to copy the result into the
"location" field, so that the client can just accept location.
Thanks,
Kalyan Manepalli
I would also be interested to know what other solutions exist.
Splunk's advantage is that it does extraction of the fields with
advanced searching functionality (it has lexers/parsers for multiple
content types). I believe that's the Solr function desired in the
original posting. At the tim
Hello,
If you know what language the user specified (or is associated with), then you
just have to ensure the "fl" URL parameter contains that field (and any other
fields you want returned). So if the language/locale is de_de, then make sure
the request has fl=location_de_de,another_field,anot
Take a look at the security section in the wiki; you could do this with
firewall rules or password access.
On Thursday, June 4, 2009, ashokc wrote:
>
> Hi,
>
> I find that I am freely able to post to my production SOLR server, from any
> other host that can run the post command. So somebody can wip
Hi,
I find that I am freely able to post to my production SOLR server, from any
other host that can run the post command. So somebody can wipe out the whole
index by posting a delete query. Is there a way SOLR can be configured so
that it will take updates ONLY from the server on which it is running?
Yes. I am using 1.3. When is 1.4 due for release?
Yonik Seeley-2 wrote:
>
> Are you using Solr 1.3?
> You might want to try the latest 1.4 test build - faceting has changed a
> lot.
>
> -Yonik
> http://www.lucidimagination.com
>
> On Thu, Jun 4, 2009 at 12:01 PM, Yao Ge wrote:
>>
>> I am ind
Hi, one of the fields to be indexed is price, which is comma separated, e.g.,
12,034.00. How can I index it as a number?
I am using DIH to pull the data. Thanks.
On Thu, Jun 4, 2009 at 7:52 AM, Marc Sturlese wrote:
> Hey there, I am trying to optimize the setup of hashDocSet.
Be aware that in the latest versions of Solr 1.4, HashDocSet is no
longer used by Solr.
https://issues.apache.org/jira/browse/SOLR-1169
> Have read the documentation here:
> http://w
Thanks for the good information. :) Well, I haven't had any evictions in any of
the caches in years, but the hit ratio is 0.51 in queryResultCache, 0.77 in
documentCache, 1.00 in the fieldValueCache, and 0.99 in the filterCache. So
in your opinion should the documentCache and queryResultCache use th
Are you using Solr 1.3?
You might want to try the latest 1.4 test build - faceting has changed a lot.
-Yonik
http://www.lucidimagination.com
On Thu, Jun 4, 2009 at 12:01 PM, Yao Ge wrote:
>
> I am indexing a database with over 1 million rows. Two of the fields contain
> unstructured text but size of e
Hi,
I am trying to customize the response that I receive from Solr. In
the index I have multiple fields that contain the same data in different
languages.
At query time the client specifies the language. Based on this param, I want to
return the value, copied into a different field.
E
On Thu, Jun 4, 2009 at 7:24 PM, Michael Ludwig wrote:
> Shalin Shekhar Mangar wrote:
>
> | If you use spellcheck.q parameter for specifying
> | the spelling query, then the field's analyzer will
> | be used [...] If you use the q parameter, then the
> | SpellingQueryConverter is used.
>
> http://
On Jun 4, 2009, at 6:42 AM, Erick Erickson wrote:
It *will* cause performance issues if you load that field for a large
number of documents on a particular search. I know Lucene itself
has lazy field loading that helps in this case, but I don't know how
to persuade SOLR to use it (it may even
Hi,
I was wondering if there's an option to return statistics about distances
from the query terms to the most frequent terms in the result documents.
At present I return the most frequent terms using facet search, which returns
for each word in the result documents the number of occurrences (wit
I am indexing a database with over 1 million rows. Two of the fields contain
unstructured text, but the size of each field is limited (256 characters).
I came up with the idea to visualize the text fields as a text cloud by
turning the two text fields into facets. The font weight and size of
each
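In case it helps, such a cloud could be driven by an ordinary facet request along these lines (field names assumed): facet=true&facet.field=text_one&facet.field=text_two&facet.limit=100, with each term's facet count mapped to its font size.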
Why build one? Don't those already exist?
Personally, I'd start with Hadoop instead of Solr. Putting logs in a
search index is guaranteed to not scale. People were already trying
different approaches ten years ago.
wunder
On 6/4/09 8:41 AM, "Silent Surfer" wrote:
> Hi,
> Any help/pointers on t
2009/6/4 Noble Paul നോബിള് नोब्ळ् :
> FastLRUCache is designed to be lock free so it is well suited for
> caches which are hit several times in a request. I guess there is no
> harm in using FastLRUCache across all the caches.
Gets are cheaper, but evictions are more expensive. If the cache hit
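For reference, the cache implementation is chosen per cache in solrconfig.xml; a minimal sketch (sizes are illustrative):

  <filterCache class="solr.FastLRUCache" size="512"
               initialSize="512" autowarmCount="128"/>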
Hi,
Any help/pointers on the following message would really help me..
Thanks, Surfer
--- On Tue, 6/2/09, Silent Surfer wrote:
From: Silent Surfer
Subject: Questions regarding IT search solution
To: solr-user@lucene.apache.org
Date: Tuesday, June 2, 2009, 5:45 PM
Hi,
I am new to Lucene forum and
"query suggest" --wunder
On 6/4/09 1:25 AM, "Michael Ludwig" wrote:
> Yao Ge schrieb:
>
>> Maybe we should call this "alternative search terms" or
>> "suggested search terms" instead of spell checking. It is
>> misleading as there is no right or wrong in spelling, there
>> is only popular (term
Shalin Shekhar Mangar wrote:
| If you use spellcheck.q parameter for specifying
| the spelling query, then the field's analyzer will
| be used [...] If you use the q parameter, then the
| SpellingQueryConverter is used.
http://markmail.org/message/k35r7qmpatjvllsc - message
http://markmail.org/t
Warning: This is from a Lucene perspective
I don't think it matters. I'm pretty sure that COMPRESS only applies to
*storing* the data, not putting the tokens in the index
(this latter is what's searched)...
It *will* cause performance issues if you load that field for a large
number of documents
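For reference, a schema.xml sketch of the flag in question (the field name is taken from the thread; the type is assumed):

  <!-- compressed affects only the stored value; the indexed tokens are unchanged -->
  <field name="body" type="text" indexed="true" stored="true" compressed="true"/>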
Hmmm, are you quite sure that you emptied the index first and didn't just add
all the documents a second time to the index?
Also, when you say the index almost doubled, were you looking only
at the size of the *directory*? SOLR might have been holding a copy
of the old index open while you built a
Hey there, I am trying to optimize the setup of hashDocSet.
Have read the documentation here:
http://wiki.apache.org/solr/SolrPerformanceFactors#head-2de2e9a6f806ab8a3afbd73f1d99ece48e27b3ab
But can't exactly understand it.
Does it mean that the maxSize should be 0.005 x NumberDocsOfMyIndex or that
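For reference, the solrconfig.xml entry being tuned looks like this in Solr 1.3 (values are illustrative; as noted elsewhere in the thread, Solr 1.4 ignores it per SOLR-1169):

  <HashDocSet maxSize="3000" loadFactor="0.75"/>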
Yao Ge schrieb:
Maybe we should call this "alternative search terms" or
"suggested search terms" instead of spell checking. It is
misleading as there is no right or wrong in spelling, there
is only popular (term frequency?) alternatives.
I had exactly the same difficulty in understanding the c
Is it correct to assume that using field compression will cause performance
issues if we decide to allow search over this field?
i.e.:
if I decide to add "compressed=true" to the BODY field... and I allow
search on body... would that be a problem?
At the same time: if I add compress