I'm planning on having one master and multiple slaves (cloud based; slaves
go up and down randomly).
The slaves should be constantly available, meaning search performance
should ideally not be affected by the updates at all.
It's unclear to me how the cluster-based replication works, does
We found that optimising too often killed our slave performance. An optimise
will cause you to merge and ship the whole index rather than just the relevant
portions when you replicate.
The change on our slaves in terms of IO and CPU as well as RAM was marked.
Andrew
Sent on the run.
On 23/
Jan's point that keeping different fields can make some statistical issues
more correct is sound.
The basic idea is that a common word in a rare language should be treated
as a common word if you are working in that language. The simplest way to
make that happen is by having a different field for
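A rough schema.xml sketch of that per-language approach (the field and type
names are illustrative, not from the original mail, and it assumes
text_en / text_fr style field types already exist in your schema):

<!-- one field per language, each with its own analysis chain, so that
     term statistics (e.g. IDF for a word common only in one language)
     stay separate per language -->
<field name="body_en" type="text_en" indexed="true" stored="true"/>
<field name="body_fr" type="text_fr" indexed="true" stored="true"/>

At query time you would then search only the field for the language you are
working in.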
I was running 32-bit Java (JDK, JRE & Tomcat) on my 64-bit Windows. For
indexing I was not able to allocate more than 1.5GB of heap space on my machine.
My Tomcat process would hit that upper bound (i.e. 1.5GB) very
quickly, so I thought of moving to 64-bit Java/Tomcat. Now I don't see
Hi,
We're building quite a large shared index of resources, using Solr. The
application that makes use of these resources is a multitenant one
(i.e., many customers using the same index). For resources that are
"private" to a customer, it's fairly easy to tag a document with their
customer ID a
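(As an illustration of that kind of tagging, a per-customer request might
carry a filter query along these lines; the field name and value are
hypothetical:)

fq=customer_id:12345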
Thanks for the article.
I am indexing each page of a document as if it were a document.
I think the answer is to configure SOLR for use of the TermVector Component:
http://wiki.apache.org/solr/TermVectorComponent
I have not tried it yet, but someone on the StackExchange forum told me to try
this on
Hi wunder,
for us, it works with internal dots when specifying the properties in
$SOLR_HOME/[core]/conf/solrcore.properties, like this:
db.url=xxx
db.user=yyy
db.passwd=zzz
$SOLR_HOME/[core]/conf/data-config.xml:
Cheers,
Chantal
On Sat, 2012-01-21 at 01:01 +0100, Walter Underwood wrote:
>
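(The data-config.xml content is cut off above; presumably the properties are
picked up through Solr's ${...} substitution, roughly along these lines, where
everything except the ${db.*} references is a placeholder:)

<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="${db.url}" user="${db.user}" password="${db.passwd}"/>
  <document>
    <entity name="item" query="SELECT id, name FROM item">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
    </entity>
  </document>
</dataConfig>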
Hi Dipti,
just to make sure: are you aware of
http://wiki.apache.org/solr/DisMaxQParserPlugin
This will handle the user input in a very conventional and user-friendly
way. You just have to specify which fields you want it to search.
With the 'mm' parameter you have a powerful option to speci
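A minimal example of such a request (field names, boosts and the mm value are
only illustrative):

http://localhost:8983/solr/select?defType=dismax&q=solar+panels&qf=title^2+body&mm=2

Here mm=2 means at least two of the query clauses must match; the full mm
syntax also allows percentages and conditional rules.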
check your defaultOperator, ensure it's OR
On 23 January 2012 05:56, jawedshamshedi wrote:
> Hi
> Thanks for the reply..
> I am using NGramFilterFactory for this. But it's not working as desired.
> Like I have a field article_type that has been indexed using the below
> mentioned field type.
>
>
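For reference, the defaultOperator mentioned above is set in schema.xml; a
minimal sketch:

<solrQueryParser defaultOperator="OR"/>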
Hi,
Do you have any kind of "group" membership for your users? If you do, a
resource's list of security access tokens could be smaller, and you would avoid
re-indexing most resources when adding "normal" users, who mostly belong to
groups. The common way is to add filters on the query. You could also, on
selection, issue another query to get your additional data (if I
follow what you want).
On 22 January 2012 18:53, Dave wrote:
> I take it from the overwhelming silence on the list that what I've asked is
> not possible? It seems like the suggester component is not well supported
> or understood,
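As a sketch of the filter approach described above (the acl field name and the
token values are hypothetical):

fq=acl:(group_sales OR group_support OR user_12345)

Each document would carry the access tokens of the groups and users allowed to
see it, and the application appends the current user's tokens as a filter query.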
I'm using trunk and FVH, and even though I filter stopwords when searching, I
would like to highlight stopwords in fragments. Using a different field
without the stopwords filter did not have the desired effect.
Is there a way to do this?
Hey,
Thanks for that, I have uploaded a new patch as advised.
Cheers,
David
On 23/01/2012 1:01 PM, Erick Erickson wrote:
David:
There's some good info here:
http://wiki.apache.org/solr/HowToContribute#Working_With_Patches
But the short form is to go into solr_home and issue this command
You can update the document in the index quite frequently. I don't know what
your requirement is; another option would be to boost at query time.
On Sun, Jan 22, 2012 at 5:51 AM, Bing Li wrote:
> Dear Shashi,
>
> Thanks so much for your reply!
>
> However, I think the value of PageRank is not a static one.
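A sketch of what boosting at query time (as suggested above) could look like,
assuming a dismax-style handler and a hypothetical numeric pagerank field that
is kept up to date in the index:

q=solr&defType=dismax&qf=title+body&bf=log(pagerank)

bf adds the value of the function to the score, so a document's current
pagerank value influences ranking at query time.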
David,
Thank you for taking the time to evaluate SOLR-2585. Perhaps the title of the
issue advertises more than it delivers? (The name is borrowed from a section
in the first book listed here:
http://wiki.apache.org/lucene-java/InformationRetrieval) In any case, I think
SOLR-2585 is a step
What is the proper query URL to limit the term frequency to just one term
in a document?
Below is an example query to search for the term frequency in a document,
but it is returning the frequency for all the terms.
[
http://localhost:8983/solr/select/
Hi, I've been wondering why some of my queries did not return the
results I expected. A debugQuery resulted in the following:
"java"^0.0 OR "haskell"^0.0 OR "python"^0.0 OR ("ruby"^0.0) AND
(("programming"^0.0)) OR "programming language"^0.0 OR "code
coding"^0.0 OR -"mobile"^0.0 OR -"android"^0.0
In general, do not optimize unless you
1> have a very static index
2> actually test the search performance afterwards.
First, as Andrew says, optimizing will force a complete
copy of the entire index at replication. If you do NOT
optimize, only the most recent segments to be written
are copied.
S
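For context, the replication being described here is the ReplicationHandler
configured in solrconfig.xml; a minimal sketch (host name, port, and poll
interval are placeholders):

<!-- master solrconfig.xml -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="master">
    <str name="replicateAfter">commit</str>
    <str name="confFiles">schema.xml,stopwords.txt</str>
  </lst>
</requestHandler>

<!-- slave solrconfig.xml -->
<requestHandler name="/replication" class="solr.ReplicationHandler">
  <lst name="slave">
    <str name="masterUrl">http://master-host:8983/solr/replication</str>
    <str name="pollInterval">00:00:60</str>
  </lst>
</requestHandler>

With this setup only the new segments are pulled on each poll, which is why
avoiding optimize keeps the transfer small.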
Please provide more info. In particular what is the
output when you attach &debugQuery=on?
Best
Erick
On Mon, Jan 23, 2012 at 5:11 AM, Lee Carroll
wrote:
> check your defaultOperator, ensure its OR
>
> On 23 January 2012 05:56, jawedshamshedi wrote:
>> Hi
>> Thanks for the reply..
>> I am using
A second, but arguably quite expert option, is to use the no-cache option.
See: https://issues.apache.org/jira/browse/SOLR-2429
The idea here is that you can specify that a filter is "expensive" and it
will only be run after all the other filters etc. have been applied.
Furthermore,
it will not b
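The syntax, per SOLR-2429, is a local param on the filter query; a sketch with
a placeholder filter:

fq={!cache=false cost=200}expensive_field:some_value

cache=false keeps the result out of the filter cache, and a higher cost pushes
its evaluation after the cheaper filters.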
Count your parentheses (anyone here speak Lisp?) I think that +
is outside the entire clause, meaning it's saying that there is
a single mandatory clause, and it's the whole thing
But boosting by 0.0 is probably a really bad thing. This may be
dropping all the scores to 0, which means "no matc
You can have a large number of cores; some people run several
hundred. Having multiple cores is preferred over having
multiple JVMs since it's more efficient at sharing system
resources. If you're running a 32 bit JVM, you are limited in
the amount of memory you can let the JVM use, so that's a
Wonderful input. Thank you very much Erick.
One question: I've been told that Solr supports a multi-core mode of operation
where you build the index on the master (optimized or not), then pass it
to the "stand-by" core on the slaves. Once the synchronization is complete
you switch, on the slave, betw
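(For reference, that kind of core swap is done through the CoreAdmin handler;
the core names below are just placeholders:)

http://localhost:8983/solr/admin/cores?action=SWAP&core=live&other=ondeck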
My first reaction is that, unless you have a specific use-case,
this is unnecessary. When using a slave the Solr replication
goes on in the background. Autowarming also is carried out
in the background. Only when the autowarming is done are
queries sent to the new (internal-to-solr) searcher. All w
Hi,
I would really appreciate any hint/guide to fix this query issue. A Java
webapp hits Solr with a query that does not return any results, but it works for
other states (FL, CA for instance).
From logs:
[code]
solr path=/select
params={facet=on&facet.mincount=5&facet.sort=count&q=listing.property.s
Hi!
On Mon, Jan 23, 2012 at 18:42, Erick Erickson wrote:
> Count your parentheses (anyone here speak Lisp?) I think that +
> is outside the entire clause, meaning it's saying that there is
> a single mandatory clause, and it's the whole thing
You're right in that case it's the whole query. P
> I would really appreciate any hint/guide to fix this query
> issue. A Java
> webapp hits solr with a query that does not returns any
> result but works for
> other states. (FL, CA for instance)
> From logs:
> [code]
> solr path=/select
> params={facet=on&facet.mincount=5&facet.sort=count&q=listin
Right. Essentially, the precedence is given to AND, so this is parsed
as though it were python OR (ruby AND programming) OR "programming language"
Best
Erick
On Mon, Jan 23, 2012 at 10:55 AM, Michael Jakl wrote:
> Hi!
>
> On Mon, Jan 23, 2012 at 18:42, Erick Erickson wrote:
>> Count your parent
Hi,
I've been trying to figure this out for a few days now and I'm just not
getting anywhere, so any pointers would be MOST welcome. I'm in the
process of upgrading from 1.3 to the latest and greatest version of
Solr and I'm getting there slowly. However I have this (final) problem
that when sending
> Below is an example query to search for the term frequency
> in a document,
> but it is returning the frequency for all the terms.
>
> [
> http://localhost:8983/solr/select/?fl=documentPageId&q=documentPageId:49667.3&qt=tvrh&tv.tf=true&tv.fl=contents][1
> ]
>
> I would like to be able to limit
I have some hierarchical data that I want to represent in the Solr UI
(/browse). I've read through many discussions on this topic, including
http://wiki.apache.org/solr/HierarchicalFaceting and
http://packtlib.packtpub.com/library/9781849516068/ch06lvl1sec09 . However, I
didn't see a solution
On Mon, Jan 23, 2012 at 22:05, Erick Erickson wrote:
> Right. Essentially, the precedence is given to AND, so this is parsed
> as though it were python OR (ruby AND programming) OR "programming language"
That's exactly what I'd expect, but the problem is that "ruby" is
marked as mandatory, that i
Hi,
I implemented the facet using
query.addFacetQuery
query.addFilterQuery
to facet on:
gender:male
state:DC
This works fine. How can I facet on multiple values using the SolrJ API, like
the following:
gender:male
gender:female
state:DC
I've tried, but it returns 0. Can anyone help?
Thanks,
-jingjung n
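A rough SolrJ sketch of one way to get a count per value (untested; it assumes
you want each count independently of the others, either as separate facet
queries or by faceting on the field):

import org.apache.solr.client.solrj.SolrQuery;
import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.response.QueryResponse;

public class GenderFacetExample {
    public static QueryResponse facetCounts(SolrServer server) throws SolrServerException {
        SolrQuery query = new SolrQuery("*:*");
        query.setFacet(true);
        // each facet.query gets its own, independent count
        query.addFacetQuery("gender:male");
        query.addFacetQuery("gender:female");
        query.addFacetQuery("state:DC");
        // alternatively, facet on the field to get counts for all of its values
        query.addFacetField("gender");
        return server.query(query);
    }
}

Note that if the same values are also added as filter queries
(query.addFilterQuery), the filters intersect, and gender:male AND
gender:female matches nothing, which could explain the zero counts.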
(12/01/23 23:14), O. Klein wrote:
Im using trunk and FVH and eventhough I filter stopwords when searching, I
would like to highlight stopwords in fragments. Using a different field
without the stopwords filter did not have the desired effect.
Please provide more info. In particular, how your qu
Hello,
I'm no expert here (just started learning/using Solr a few months ago) but I
ran into the same issue of needing to search for and facet on the OR
abbreviation.
What worked for me was to double-escape OR (a la :\\OR) for queries and single
escape (:\OR) when doing a facet query.
The pag
Hi,
It's because lowernames=true by default in solrconfig.xml, and it will convert
any "-" into "_" in field names. So try adding a request parameter
&lowernames=false or change the default in solrconfig.xml. Alternatively, leave
as is but name your fields project_id and company_id :)
http://w
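(For example, something along these lines against the extracting handler,
where the handler path and the literal field are only illustrative:)

http://localhost:8983/solr/update/extract?lowernames=false&literal.id=doc1&commit=true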
On Mon, 23 Jan 2012 14:33:00 -0800 (PST), Yuhao
wrote:
> Programmatically, something like this might work: for each facet field,
> add another hidden field that identifies its parent. Then, program
> additional logic in the UI to show only the facet terms at the currently
> selected level. For
Another way is to store the original hierarchy in a SQL database (in
the form: id, parent_id, name, level) and, in the Lucene index, store the
complete hierarchy (from root to leaf node) for each document in one field,
using the ids from the SQL database. That way you can get documents
at any level
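A sketch of what that single field might hold for a document sitting under
Electronics (id 17) > Cameras (id 243), with made-up ids:

category_path: 1/17/243

A query for everything under Electronics could then filter on the prefix
(e.g. category_path:1/17/*), while the SQL table supplies the names and levels
for display.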
Koji Sekiguchi wrote
>
> (12/01/23 23:14), O. Klein wrote:
>> Im using trunk and FVH and eventhough I filter stopwords when searching,
>> I
>> would like to highlight stopwords in fragments. Using a different field
>> without the stopwords filter did not have the desired effect.
>
> Please provi
Hi All,
I need the community's feedback about deploying newer versions of a Solr schema
into production while the existing (older) schema is in use by applications.
How do people perform these upgrades? What have people learned
about this?
Any thoughts are welcome.
Thanks
Saroj
Thanks for your replies. I can't apply an index-time boost because I don't
know the term frequencies in advance. Additionally, new documents come in
every few minutes, which makes maintaining term frequencies outside Solr a
difficult task.
Facet prefix would probably help in this case. I thought there
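For reference, a facet.prefix request along these lines (the field name and
prefix are placeholders):

http://localhost:8983/solr/select?q=*:*&rows=0&facet=true&facet.field=suggest_text&facet.prefix=pyth

This returns the indexed terms starting with the given prefix together with
their document counts.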
Hi
Is there some index size (or number of docs) at which it becomes necessary to
break the index into shards?
I have an index of 100GB. This index grows by 10GB per year
(I don't have information on how many docs it has) and the docs will never
be deleted. Thinking 30 years out, the index will be around 40
Well, at root the Lucene query parser makes no claim of
enforcing boolean logic. Think in terms of MUST, SHOULD
and NOT instead.
Here's a good writeup...
http://www.lucidimagination.com/blog/2011/12/28/why-not-and-or-and-not/
Best
Erick
On Mon, Jan 23, 2012 at 2:43 PM, Michael Jakl wrote:
> On
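As an illustration of thinking in those terms (this is not a claim about the
original query's intent, just the operator syntax):

+programming +(java haskell python ruby) -mobile -android

Here the first two clauses are MUST, the terms inside the parentheses are
SHOULDs of which at least one has to match, and the last two are prohibited.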
Hi Mukund,
Since I have been getting this issue for a long time, I did some trial and
error. In my case I am connecting to the local Tomcat server using SolrJ.
SolrJ's default HttpClient allows only 2 connections per host and 20 in total.
As I have a heavy load and a lot of dependencies on Solr, this seems very low.
To increase the
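A rough sketch of raising those limits with SolrJ 3.x's CommonsHttpSolrServer
(the method names and numbers are assumptions to verify against your version):

import java.net.MalformedURLException;
import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

public class SolrClientFactory {
    public static CommonsHttpSolrServer create() throws MalformedURLException {
        CommonsHttpSolrServer server =
            new CommonsHttpSolrServer("http://localhost:8080/solr");
        // raise the HttpClient defaults of 2 connections per host / 20 total
        server.setDefaultMaxConnectionsPerHost(50);
        server.setMaxTotalConnections(100);
        return server;
    }
}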
Hi All,
I have two Tomcat instances on the same server. One is for Solr and the other
is my application server. I am connecting to the Solr server with SolrJ from
the application server. As I am connecting locally, the default connection
limit seems to be very low. My server stops responding every few hours and
only comes back up when I reset the
Hi Kuli,
Did you find a solution to this problem? I am still facing it.
Please help me overcome it.
regards
On Wed, Oct 26, 2011 at 1:16 PM, Michael Kuhlmann wrote:
> Hi;
>
> we have a similar problem here. We already raised the file ulimit on the
> server to 4096, but
Hello,
AFAIK, by setting connectionManager.closeIdleConnections(0L); you are
preventing your HTTP connections from being reused, i.e. disabling keep-alive.
If you increase it enough you won't see many CLOSE_WAIT connections.
Here is some explanation and a solution for the JDK's HTTP client (URLConnection),
not for your
On Tue, Jan 24, 2012 at 06:27, Erick Erickson wrote:
> Well, at root the Lucene query parser makes no claim of
> enforcing boolean logic. Think in terms of MUST, SHOULD
> and NOT instead.
>
> Here's a good writeup...
>
> http://www.lucidimagination.com/blog/2011/12/28/why-not-and-or-and-not/
Hi,