On Tue, 2011-09-27 at 02:43 +0200, Bictor Man wrote:
> thanks for your replies. indeed the filesystem caching seems to be the
> difference. sadly I can't add more memory and the 6GB/20core combination
> doesn't work. so I'll just try to tweak it as much as I can.
A (better) alternative to more mem
In case anyone is curious, I responded to him with a solution using either
SOLR-2155 (Geohash prefix query filter) or LSP:
https://issues.apache.org/jira/browse/SOLR-2155?focusedCommentId=13115244#comment-13115244
~ David Smiley
-
Author: https://www.packtpub.com/solr-1-4-enterprise-search-s
On Tue, Sep 27, 2011 at 11:25 AM, nagarjuna wrote:
> Hi gora can u pls quit ur answers like these..
> i may get the perfect answer from anybody but not u,so kindly
> please be quit
Sorry, didn't mean to be particularly obnoxious.
> i already googled and i saw many links as
I'm interested in the stopwords solution as it sounds like less work, but I'm
not sure I understand how it works. Having msn.com as a stopword doesn't
mean I won't get msn.com as a result for, say, 'hotmail'. My understanding is that
msn.com will never make it to the similarity function and thu
Hi gora can u pls quit ur answers like these..
i may get the perfect answer from anybody but not u,so kindly
please be quit
I already googled and saw many links, but as a beginner I am unable to get the
main intention behind using the delta query, even though we have query. And I
di
On Tue, Sep 27, 2011 at 10:51 AM, nagarjuna wrote:
> Hi everybody.
>
> right now i have little bit idea about the solr query ..but i am not
> clear about delta query
> wht it is? and how to write ?any sample delta query?
http://lmgtfy.com/?q=solr+delta+query
There are many useful links a
Hi Rahul,
I also tried searching "Coke Studio MTV" but no documents were returned.
Here is the snippet of my schema file.
*
Firstly, just to make it clear, the dictionary is made out of already indexed
terms; or rather, it is built upon the index, if you are using *solr.IndexBasedSpellChecker*, which you are.
Next, a lot of changes are required in your *solrconfig.xml*
1. spell is the name of the field which will be used
to create yo
I have been able to set up the Solr spell checker on my web application. It is a
file-based spell checker that I have implemented. I would like to add that
it isn't that accurate, since I haven't applied any specific algorithm
for having the most relevant search result. Kindly do let me know in ca
>From: Kiwi de coder
>
>wow, this search engine is powerful !
Thanks, glad it helps.
>too bad after look throught it, still got not solution.
>
>seem like I need to get my hand dirty to make one :)
:)
Please consider contributing: http://wiki.apache.org/solr/HowToContribute
Otis
>kiwi
>
>
The following should help with size estimation:
http://search-lucene.com/?q=estimate+memory&fc_project=Solr
http://issues.apache.org/jira/browse/LUCENE-3435
I'll just add that with that much RAM you'll be more than fine.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene
Hi
Do you mean to copy the String field to a Text field, or the reverse?
This is the approach I am currently following
Step 1: Created a FieldType
Step 2 :
Step 3 :
And in the SOLR Query planning to q=hospitals&qf=body^4.0 title^5.0
Wow, this search engine is powerful!
Too bad that after looking through it, I still have no solution.
Seems like I need to get my hands dirty to make one :)
kiwi
On Tue, Sep 27, 2011 at 12:08 PM, Otis Gospodnetic <
otis_gospodne...@yahoo.com> wrote:
> Hi,
>
> Here is a 1 month old thread I found on sea
Hi Roland,
Have a look at hit #1
here: http://search-lucene.com/?q=manifoldcf&fc_project=Solr
I think this is what you are after.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
>
>From: R
Rajat,
What version? If < 3.4.0, I'd try 3.4.0 first.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
>
>From: shinkanze
>To: solr-user@lucene.apache.org
>Sent: Monday, September 26, 2011
Hello,
> PS: solr streamindex is not option because we need to submit javabin...
If you are referring to StreamingUpdateSolrServer, then the above statement
makes no sense and you should give SUSS a try.
Are you sure your 16 reducers produce more than 500 docs/second?
I think somebody already
Aha! See, it was the DB after all! ;) Thanks for following up, I was curious.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
>
>From: eks dev
>To: solr-user
>Sent: Monday, September 26,
Hi,
Here is a 1 month old thread I found on search-lucene -- didn't even have to do
a search, I got it as a suggestion from AutoComplete when I started typing the
word mongodb :)
http://search-lucene.com/m/8AEE31AaTd32
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene
If I were you, I would probably try defining two fields:
1. ts_category as a string type
2. ts_category1 as a text_en type
Make sure to copy ts_category to ts_category1.
You can use the following as qf in your dismax:
qf=body^4.0 title^5.0 ts_category^10.0 ts_category1^5.0
or something like that.
YH
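To make the qf suggestion above concrete, here is a minimal sketch of how the request parameters could be assembled on the client side. The helper name build_dismax_params and the sample query "hospitals" are only illustrative assumptions; the field names and boosts are the ones from the message, and defType=dismax is the standard way to select the dismax parser.

```python
from urllib.parse import urlencode


def build_dismax_params(user_query, field_boosts):
    """Build dismax request parameters (hypothetical helper, for illustration).

    field_boosts is a list of (field_name, boost) pairs; each pair becomes
    one "field^boost" entry in the space-separated qf parameter.
    """
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts)
    return {"q": user_query, "defType": "dismax", "qf": qf}


# Boosts taken from the message above: the exact-match string field
# (ts_category) is weighted higher than its analyzed copy (ts_category1).
field_boosts = [
    ("body", 4.0),
    ("title", 5.0),
    ("ts_category", 10.0),
    ("ts_category1", 5.0),
]

params = build_dismax_params("hospitals", field_boosts)
query_string = urlencode(params)  # URL-encodes spaces and the ^ character
print(query_string)
```

The resulting string can be appended to the select URL of your own Solr instance. The relative boosts are only a starting point and will likely need tuning against real queries.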
Hi Gabriele,
Either the latter option, or just treat them as stop words if you just want to
remove those urls/ids from indexed docs (may still get highlighted).
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
>___
Hi Mark,
Eh, I don't have Lucene/Solr source code handy, but I *think* for that you'd
need to write custom Lucene similarity.
Otis
Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch
Lucene ecosystem search :: http://search-lucene.com/
>
>From: Mark
Hi, guys,
Do you have any plans to support function queries on the score field? For
example, sort=floor(product(score, 100)+0.5) desc?
So far I am getting the following error:
undefined field score
I can't use subquery in this case because I am trying to use secondary
sorting, however I will be open
I found the answer to my question:
basically, it works only with a complete match.
--
View this message in context:
http://lucene.472066.n3.nabble.com/external-file-field-partial-data-match-in-key-field-tp3368547p3371328.html
Sent from the Solr - User mailing list archive at Nabble.com.
Is UpdateProcessor triggered when updating an existing document or for new
documents also?
On Tue, Sep 27, 2011 at 6:00 AM, Chris Hostetter-3 [via Lucene] <
ml-node+s472066n3371110...@n3.nabble.com> wrote:
>
> : Hi Erick, The problem I am trying to solve is to filter invalid entities.
>
> : User
Hi guys,
thanks for your replies. indeed the filesystem caching seems to be the
difference. sadly I can't add more memory and the 6GB/20core combination
doesn't work. so I'll just try to tweak it as much as I can.
thanks a lot.
2011/9/26 François Schiettecatte
> You have not said how big your
: Hi Erick, The problem I am trying to solve is to filter invalid entities.
: Users might mispell or enter a new entity name. This new/invalid entities
: need to pass through a KeepWordFilter so that it won't pollute our
: autocomplete result.
how are you doing autocomplete?
if you are using th
I have a use case where I would like to search across two fields but I
do not want to weight a document that has a match in both fields higher
than a document that has a match in only 1 field.
For example.
Document 1
- Field A: "Foo Bar"
- Field B: "Foo Baz"
Document 2
- Field A: "Foo Blar
: Subject: Re: Unique Key error on trunk
:
:
: You can replicate it with the example app by replacing the id definition in
schema.xml with
:
: >
thanks for reporting this Viswa, I've filed a bug to track it...
https://issues.apache.org/jira/browse/SOLR-2796
-Hoss
: References:
:
: In-Reply-To:
:
: Subject: how to implemente a query like " like '%pattern%' "
https://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists
When starting a new discussion on a mailing list, please do not reply to
an existing message, instead star
Are you batching the documents before sending them to the solr server? Are
you doing a commit only at the end? Also since you have 32 cores, you can
try upping the number of concurrent updaters from 16 to 32.
Jaeger, Jay - DOT wrote:
>
> 500 / second would be 1,800,000 per hour (much more than
We used copyField to copy the address to two fields:
1. Which contains just the first token up to the first whitespace
2. Which copies all of it, but translates to lower case.
Then our users can enter either a street number, a street name, or both. We
copied all of it to the second field bec
If you need those kinds of searches then you should probably not be using
the KeywordTokenizerFactory. Is there any reason why you can't switch to a
WhitespaceTokenizer, for example? Then you could use a simple phrase query
for your search case. If you need everything as a token, you could use a
cop
Hi all
I am new to Solr and have a question about boosting exact terms to the top
on a particular field.
For example:
I have a text field named ts_category and I want to give more boost to
this field than to other fields, so in my query I pass the following in
the QF params "qf=body^4.0 ti
Hello,
While indexing, there are certain urls/ids I'd never want to appear in the
search results (so they should not be indexed). Is there already a 'supported by design'
mechanism for that which you could point me to, or should I just create this blacklist
as a processor in the update chain?
--
Regards,
K. Gabriele
---
500 / second would be 1,800,000 per hour (much more than 500K documents).
1) how big is each document?
2) how big are your index files?
3) as others have recently written, make sure you don't give your JRE so much
memory that your OS is starved for memory to use for file system cache.
JRJ
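The arithmetic behind the first line can be checked with a small sketch. The 500 docs/second rate and the 500K document total are the figures quoted in this thread; the function names are illustrative only.

```python
SECONDS_PER_HOUR = 3600


def docs_per_hour(docs_per_second):
    """Convert a sustained indexing rate from docs/second to docs/hour."""
    return docs_per_second * SECONDS_PER_HOUR


def seconds_to_index(total_docs, docs_per_second):
    """Time needed to index total_docs at a constant rate, in seconds."""
    return total_docs / docs_per_second


rate = 500          # docs per second, as quoted in the thread
total = 500_000     # the 500K documents mentioned by the original poster

print(docs_per_hour(rate))            # 1800000
print(seconds_to_index(total, rate))  # 1000.0
```

In other words, at a steady 500 docs/second the whole 500K collection would take roughly 17 minutes, assuming the rate holds for the entire run.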
--
Dan:
The disconnect here seems to be that these example URLs on the
MoreLikeThisHandler wiki page assume a "/mlt" request handler exists, but
no handler by that name has ever actually existed in the solr example
configs. (the wiki page doesn't explicitly state that those URLs will
work with
You can replicate it with the example app by replacing the id definition in
schema.xml with
>
Then remove the id field from one of the example doc.xml files and post it to Solr.
Thanks
Viswa
On Sep 26, 2011, at 12:15 AM, Viswa S wrote:
> Hello,
>
> We use solr.UUIDField to generate unique
Hello guys,
I need to implement a functionality which requires something similar
to aggregate functions in SQL. My Solr schema looks like this:
-doc_id: integer
-date: date
-value1: integer
-value2: integer
Basically the index contains some numerical values (value1, value2,
etc) per doc and
:
: Unfortunately the facet fields are not static. The field are dynamic SOLR
: fields and are generated by different applications.
: The field names will be populated into a data store (like memcache) and
: facets have to be driven from that data store.
:
: I need to write a Custom FacetComponen
Hi all.
How can we do a query similar to 'like'?
If I have this phrase as a single token in the index: "This phrase has
various words" (using KeywordTokenizerFactory)
and I would like an exact match of: "phrase has various" or "various words", for
instance...
How can I do this?
Thanks a lot.
You have not said how big your index is but I suspect that allocating 13GB for
your 20 cores is starving the OS of memory for caching file data. Have you
tried 6GB with 20 cores? I suspect you will see the same performance as 6GB &
10 cores.
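As a rough illustration of the point about starving the OS, the sketch below compares how much RAM is left outside the JVM heap in the two configurations. It is a deliberate simplification: it assumes everything not given to the heap is available for file caching, and ignores what the OS and other processes need for themselves. The function name is hypothetical.

```python
def memory_left_for_os_cache(total_ram_gb, jvm_heap_gb):
    """RAM (in GB) not claimed by the JVM heap.

    Simplifying assumption: the remainder is treated as available to the
    OS file system cache, ignoring the OS and other processes.
    """
    if jvm_heap_gb > total_ram_gb:
        raise ValueError("heap cannot exceed physical RAM")
    return total_ram_gb - jvm_heap_gb


TOTAL_RAM_GB = 16  # server size given by the original poster

for heap_gb in (13, 6):
    left = memory_left_for_os_cache(TOTAL_RAM_GB, heap_gb)
    print(f"heap={heap_gb}GB -> about {left}GB left for file caching")
```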
Generally it is better to allocate just enough memory
On 9/26/2011 9:33 AM, Bictor Man wrote:
Hi everyone,
Sorry if this issue has been discussed before, but I'm new to the list.
I have a solr (3.4) instance running with 20 cores (around 4 million docs
each).
The instance has allocated 13GB in a 16GB RAM server. If I run several sets
of queries se
Hi everyone,
Sorry if this issue has been discussed before, but I'm new to the list.
I have a solr (3.4) instance running with 20 cores (around 4 million docs
each).
The instance has allocated 13GB in a 16GB RAM server. If I run several sets
of queries sequentially in each of the cores, the I/O a
hi,
Is there any DIH plugin for MongoDB?
regards,
kiwi
Hi Isan,
Does your search return any documents when you remove the 'at' keyword and
just search for "Coke studio MTV" ?
Also, can you please provide the snippet of schema.xml file where you have
mentioned this field name and its "type" description ?
On Mon, Sep 26, 2011 at 6:09 AM, Isan Fulia wro
OK. This is exactly what I did.
With a fresh download of Solr 3.2,
unpack and go to the example directory.
Start Solr: java -jar start.jar
Then go to exampledocs and run: ./post.sh *xml
Then go here:
http://localhost:8983/solr/mlt?stream.body=electronics%20memory&mlt.fl=manu,cat&mlt.interestingTerm
Is there any limitation, be it technical or for sanity reasons, on the
number of shards that can be part of a solr cloud implementation?
Please don't say "it's just like the example". If it was,
then it would most likely be working.
If you don't take the time to show us what you've tried,
and the results you get back, then there's not much we
can do to help.
Best
Erick
On Mon, Sep 26, 2011 at 7:18 AM, dan whelan wrote:
> On 9/24
This is a pretty serious issue.
Bill Bell
Sent from mobile
On Sep 26, 2011, at 4:09 AM, Isan Fulia wrote:
> Hi all,
>
> I have a text field named* textForQuery* .
> Following content has been indexed into solr in field textForQuery
> *Coke Studio at MTV*
>
> when i fired the query as
> *textFor
Just to bring closure on this one, we were slurping data from the
wrong DB (hardly desktop class machine)...
Solr did not cough on 41Mio records @34k updates / sec., single threaded.
Great!
On Sat, Sep 24, 2011 at 9:18 PM, eks dev wrote:
> just looking for hints where to look for...
>
> We we
On 9/24/11 12:17 PM, Erick Erickson wrote:
What version of Solr?
I am using solr 3.2
When you copied the default, did you set up
default values for MLT?
This is what I need help with.
"How should the request handler / solrconfig be setup?"
Showing us the request you used
The request is
Hi Alonso, Gora,
I ran into the same problem with the MailEntityProcessor.
I have an email folder called "Test". Inside there are only two messages.
When I run the DIH everything looks fine, except that the two emails don't
get indexed.
Is there any additional information on this problem?
I'm
I won't guarantee this is the 'best algorithm', but here's what we use. (This
is in a final class with only static helper methods):
// Set of characters / Strings SOLR treats as having special meaning in a
query, and the corresponding Escaped versions.
// Note that the actual operators
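The original helper is cut off in this preview, so the following is not the poster's code. It is only a minimal Python sketch of the same idea: prefix every character the Lucene query parser treats as special with a backslash. The character set is an assumption based on the Lucene query syntax documentation (+ - && || ! ( ) { } [ ] ^ " ~ * ? : \), and the function name is hypothetical.

```python
# Characters with special meaning in the Lucene query syntax. The & and |
# characters are escaped individually here, although the actual operators
# are the two-character forms && and ||.
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\')


def escape_query_chars(text):
    """Return text with each special character prefixed by a backslash."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


print(escape_query_chars("title:(1+1)"))
```

Escaping like this is meant for user-supplied terms only; applying it to a full query string would also neutralize any operators you intended to use.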
Tirthankar,
are you indexing 1. smaller docs or 2. books?
If 1., your caches are too big for your memory, as Erick already said.
Try to allocate 10GB for the JVM, leave 14GB for your HDD cache and make your
caches smaller.
If 2., read the blog posts on hathitrust.org.
http://www.hathitrust.org/blogs/la
Hi,
We have 500K web documents and are using Solr (trunk) to index them. We have
a special analyzer which is a little bit CPU-heavy.
Our machine config:
32 x cpu
32 gig ram
SAS HD
We are sending documents with 16 reduce clients (from Hadoop) to the standalone
Solr server. The problem is we couldn't get speedi
On Sun, 2011-09-25 at 22:00 +0200, Ikhsvaku S wrote:
> Documents: We have close to ~12 million XML docs, of varying sizes average
> size 20 KB. These documents have 150 fields, which should be searchable &
> indexed. [...] Approximately ~6000 such documents are updated & 400-800 new
> ones
> are a
Thx for your response, we will try dynamic fields for this
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: Saturday, 24 September 2011 21:33
To: solr-user@lucene.apache.org
Subject: Re: How to map database table for facted search?
In general, you
Hi all,
I have a text field named *textForQuery*.
The following content has been indexed into Solr in the field textForQuery:
*Coke Studio at MTV*
When I fired the query
*textForQuery:("coke studio at mtv")* the results showed 0 documents.
After running the same query in debugMode I got the following r
Sorry for the somewhat lengthy post. I would like to make clear that I covered
my bases here and am looking for an alternative solution, because the more
trivial solutions don't seem to work for my use case.
Consider bars, museums, etc.
These places have multiple opening hours that can depend on:
RE
Hi,
I am replicating Solr and getting this error. I am unable to make out the
cause, so kindly help.
26 Sep, 2011 8:00:14 AM org.slf4j.impl.JDK14LoggerAdapter fillCallerData
SEVERE: Error during auto-warming of
key:org.apache.solr.search.QueryResultKey@150f0455:java.lang.NullPointerExcept
Hi,
I have product inventory data in a Solr index. I would like to boost or sort
results by some measure of popularity.
For instance, the Solr index has a field named Title, with some docs having titles like
iphone 4 - white
iphone 3 - white
blackberry torch
I would like to boost docs where title contains word "iphone"
Hello,
We use solr.UUIDField to generate unique ids. Using the latest trunk (change
list 1163767) seems to throw the error "Document is missing mandatory uniqueKey
field: id". The schema is set up to generate an id field on updates.
Thanks
Viswa
SEVERE: org.apache.solr.common.SolrException: