I am using Solr 4.0.
I want the number of replicas for each shard to be 3.
How can I do this?
Sincerely
Vince Wei
From: Vince Wei (jianwei)
Sent: May 25, 2012 11:40
To: 'solr-user@lucene.apache.org'
Subject: how can I specify the number of replications for each shard?
Hi All,
How can I specify the number of replicas for each shard?
Thanks!
Sincerely
Vince Wei
: Anyone found a solution to the getTransformer error. I am getting the same
: error.
If you use Solr 3.6, with the example jetty and example configs, do you
get the same error using the provided example XSL files?
http://localhost:8983/solr/select?q=*:*&wt=xslt&tr=example.xsl
: in sufficient amount .. But still it is throwing a Null Pointer Exception in
: Tomcat, and in Eclipse while debugging I saw the error "Error Executing
: Query". Please give me a suggestion for this.
:
: Note: While the ids are below or equal to 99800 the Query is returning the
: Result
what e
I tried it and it does appear to be the SnowballPorterFilterFactory that
normally does the accent folding but can't here because it is not multi-term
aware. I did notice that the text_de field type that comes in the Solr 3.6
example schema handles your case fine. It uses the
GermanNormalizationFilterFactory.
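From memory, the relevant part of that field type looks roughly like this in
the 3.6 example schema.xml (filter order and stopword settings may differ in
your copy):

  <fieldType name="text_de" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.StandardTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <!-- folds umlauts etc. without the stemmer's multi-term limitation -->
      <filter class="solr.GermanNormalizationFilterFactory"/>
      <filter class="solr.GermanLightStemFilterFactory"/>
    </analyzer>
  </fieldType>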
Yes, this is the right way for the DIH. You might find it easier to
write a separate local client that polls the DB and uploads changes.
The DIH is oriented toward longer batch jobs.
On Thu, May 24, 2012 at 7:29 AM, Esteban Donato wrote:
> Hi community,
>
> I am using Solr with DIH to index conte
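If you do stay with the DIH, delta-import is the usual way to poll for
changes; a rough sketch of the entity config (table and column names are made
up, field mappings omitted):

  <entity name="item" pk="id"
          query="select id, name from item"
          deltaQuery="select id from item where last_modified > '${dataimporter.last_index_time}'"
          deltaImportQuery="select id, name from item where id='${dih.delta.id}'"/>

You would then hit /dataimport?command=delta-import from a cron job or script
at whatever interval you need.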
: 1) Any recommendations on which best to sub-class? I'm guessing, for this
: scenario with "rare" batch puts and no evictions, I'd be looking for get
: performance. This will also be on a box with many CPUs - so I wonder if the
: older LRUCache would be preferable?
I suspect you are correct ...
Hoss, brilliant as always - many thanks! =)
Subclassing the SolrCache class sounds like a good way to accomplish this.
Some questions:
1) Any recommendations on which best to sub-class? I'm guessing, for this
scenario with "rare" batch puts and no evictions, I'd be looking for get
performance. Th
Interesting problem,
w/o making any changes to Solr, you could probably get this behavior by:
a) sizing your cache large enough
b) using a firstSearcher that generates your N queries on startup
c) configuring autowarming of 100%
d) ensuring every query you send uses cache=false
The tricky part
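For (b) and (c), that's roughly the following in solrconfig.xml (the query
shown is a placeholder; add one <lst> per query you want pinned):

  <listener event="firstSearcher" class="solr.QuerySenderListener">
    <arr name="queries">
      <lst><str name="q">your pinned query here</str></lst>
    </arr>
  </listener>
  <queryResultCache class="solr.LRUCache" size="4096" autowarmCount="100%"/>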
: We are using mm=70% in solrconfig.xml
: We are using qf=title description
: We are not doing phrase query in "q"
:
: In case of a multi-word search text, mostly the end results are the junk
: ones, because the words mentioned in the search text are written in different
: fields and in different c
: I get two different DocSets from two different searchers. I need
: to merge them into one and get the facet counts from the merged
: docSets. How do I do it? Any pointers would be appreciated.
1) if you really mean "two different searchers" then you can not do this
-- DocSets, and the do
On 5/23/2012 12:27 PM, Lance Norskog wrote:
If you want to suppress merging, set the 'mergeFactor' very high.
Perhaps 100. Note that Lucene opens many files (50? 100? 200?) for
each segment. You would have to set the 'ulimit' for file descriptors
to 'unlimited' or 'millions'.
My installation (S
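For reference, mergeFactor is set in solrconfig.xml; in the 3.x example it
sits under <indexDefaults> (on trunk/4.x the equivalent lives in
<indexConfig>), something like:

  <indexDefaults>
    <mergeFactor>100</mergeFactor>
  </indexDefaults>

The file descriptor limit is raised outside Solr, e.g. with ulimit -n before
starting the JVM.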
True, no argument there as to usage.
I should have clarified that the encoding of the character used for alif
(02BE) carries with it an assigned property in the Unicode database of
(Lm), putting it into the category of 'Modifier_Letter', which contrasts
with the property (Sk), 'Modifier_Symbol', a
On 5/23/2012 2:48 PM, pramila_tha...@ontla.ola.org wrote:
Hi Everyone,
Solr 3.6 does not seem to be honoring the field compress option.
While merging the indexes the size of Index is very big.
Is there any other way to handle this to keep compression functionality?
Compression support was removed in Lucene/Solr 3.x.
> Just wondering if you have any suggestions!!! The other
> thing I tried using
> following url and the results returned same way as they were
> (no trimming of
> description to 300 chars). not sure if it is because of
> config file
> settings.
>
>
> http://localhost:8983/solr/browse?&hl=true&hl.
: I just happened to notice a typo when I mistyped a Unicode escape sequence in
: a query:
Thanks Jack, r1342363.
: Dismax doesn’t get the error since apparently it doesn’t recognize Unicode
: escape sequences.
correct .. dismax doesn't accept any escape sequence (but literal
unicode characters
That's my understanding for releases of Solr before 4.0: the default
for mm is 100%. You can add a default value for mm to your query request
handler in solrconfig.xml.
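For example, something along these lines in the handler definition (handler
name and parser are whatever you already use):

  <requestHandler name="/select" class="solr.SearchHandler">
    <lst name="defaults">
      <str name="defType">edismax</str>
      <str name="mm">100%</str>
    </lst>
  </requestHandler>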
-- Jack Krupansky
-Original Message-
From: geeky2
Sent: Thursday, May 24, 2012 10:48 AM
To: solr-user@lucene.ap
The alif and ayn can also be used as diacritic-like characters in Korean; this
is a known practice. But thanks anyway.
On May 24, 2012, at 9:30 AM, Charles Riley wrote:
> Hi Naomi,
>
> I don't have a conclusive answer for you on this yet, but let me pick up on a
> few points.
>
> First, th
On Thu, May 24, 2012 at 7:29 AM, Michael Kuhlmann wrote:
> However, I doubt it. I've not been too deeply into the UpdateHandler yet,
> but I think it first needs to parse the complete XML file before it starts
> to index.
Solr's update handlers all stream (XML, JSON, CSV), reading and
indexing a
Hi Naomi,
I don't have a conclusive answer for you on this yet, but let me pick up on
a few points.
First, the apostrophe is probably being handled through ignoring
punctuation in the ICUCollationKeyFilterFactory.
Alif isn't a diacritic but a letter, and its character properties would be
handled
Have you heard of NG Data with their product called Lily?
I vaguely recall some thread blocking issue with trying to parse too many
PDF files at one time in the same JVM.
Occasionally Tika (actually PDFBox) has been known to hang for some PDF
docs.
Do you have enough memory in the JVM? When the CPU is busy, is there much
memory available in the JVM
Hi,
there are some serious issues with encoding of the data sent to Solr
in the released 3.6.0 version of Solrj (HttpSolrServer), for example:
https://issues.apache.org/jira/browse/SOLR-3375
I believe your issue should already be fixed in the 3.6.0 branch. The
contents from that branch will event
Thanks for the reply,
Do you have any pointers to relevant Docs or Examples that show how this
should be chained together?
Thanks again,
Aaron
On Thu, May 24, 2012 at 3:03 AM, Otis Gospodnetic <otis_gospodne...@yahoo.com> wrote:
> Perhaps this could be a custom SearchComponent that's run
Thanks for the link, will investigate further. On the outset though, it
looks as though it's not what we want to be going towards.
Also note that it's not open-sourced (other than Solandra, which hasn't
been updated in ages: https://github.com/tjake/Solandra).
Rather than build on top of Cassand
Thanks for your reply.
I need to boost at the document level and at the field level as well. Only
queries that match certain fields would get the boost.
In DIH there is $docBoost (boost at the document level), but there is no
documentation about field boost at all.
On Thu, May 24, 2012 at 10:32 PM, Walter Underwood w
If you want different boosts for different documents, then use the "boost"
parameter in edismax. You can store the factor in a field, then use it to
affect the score.
If you store it in a field named "docboost", you could use this in an edismax
config in your solrconfig.xml.
log(max(docboost,1))
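In solrconfig.xml that would be something like this in the edismax handler's
defaults (docboost being the example field name above):

  <str name="boost">log(max(docboost,1))</str>

or, equivalently, passed per request as &boost=log(max(docboost,1)).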
The "pf" fields are used for implicit phrase queries to do "implicit phrase
proximity boosting" and don't relate at all to explicit phrase queries. I
don't think there is any way to control the fields for explicit phrase
queries separate from non-phrase term queries.
-- Jack Krupansky
-Original Message-
Hi iorixxx,
Just wondering if you have any suggestions!!! The other thing I tried was using
the following URL, and the results returned the same way as they were (no
trimming of description to 300 chars). Not sure if it is because of config file
settings.
http://localhost:8983/solr/browse?&hl=true&hl.fl=DESC
I need to do index-time field boosting because clients buy position
assets. Therefore, some documents, when matched, are more important than
others. That's what index-time boost does, right?
On Thu, May 24, 2012 at 10:10 PM, Walter Underwood wrote:
> Why? Query-time boosting is fast and more flexi
You should take a look at what DataStax has already done with Solr and
Cassandra.
http://www.datastax.com/dev/blog/cassandra-with-solr-integration-details
wunder
On May 24, 2012, at 7:50 AM, Nicholas Ball wrote:
>
> Hey all,
>
> I've been working on a SOLR set up with some heavy customizatio
Why? Query-time boosting is fast and more flexible.
wunder
Search Guy, Netflix & Chegg
On May 24, 2012, at 6:11 AM, Chamnap Chhorn wrote:
> Could anyone help me? I really need index-time field boosting.
>
> On Thu, May 24, 2012 at 4:21 PM, Chamnap Chhorn
> wrote:
>
>> Hi all,
>>
>> I want t
Hey all,
I've been working on a SOLR set up with some heavy customization (using
the adminHandler as a way into the system) for a research project @
Imperial College London; however, I now see there has been a substantial
push towards NoSQL. For this, there needs to be some kind of optimistic
f
environment: Solr 3.5
default operator is OR
I want to make sure I understand how the mm param (minimum match) works for
the edismax parser:
http://wiki.apache.org/solr/ExtendedDisMax?highlight=%28dismax%29#mm_.28Minimum_.27Should.27_Match.29
It looks like the rule is that 100% of the terms must match.
On your <dataSource> tag, specify "batchSize" with a value that your
db/driver allows. The hardcoded default is 500. If you set it to -1, DIH
converts it to Integer.MIN_VALUE. See
http://wiki.apache.org/solr/DataImportHandler#Configuring_JdbcDataSource ,
which recommends using this -1 value in the case of
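A sketch of what that looks like in the DIH config (driver, URL, and
credentials are placeholders):

  <dataSource type="JdbcDataSource"
              driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/mydb"
              user="dbuser" password="dbpass"
              batchSize="-1"/>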
As Ahmet says, the update chain is probably the place to integrate such
document-oriented processing.
See http://www.cominvent.com/2011/04/04/solr-architecture-diagram/ for how it
integrates with Solr.
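A sketch of such a chain in solrconfig.xml (the NER factory class is
hypothetical; the Log/Run processors are the standard tail of a chain):

  <updateRequestProcessorChain name="ner">
    <processor class="com.example.NerEnrichmentProcessorFactory"/>
    <processor class="solr.LogUpdateProcessorFactory"/>
    <processor class="solr.RunUpdateProcessorFactory"/>
  </updateRequestProcessorChain>

You would then select it with <str name="update.chain">ner</str> in your
update handler's defaults.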
--
Jan Høydahl, search solution architect
Cominvent AS - www.facebook.com/Cominvent
Solr Train
Hi community,
I am using Solr with DIH to index content from a DB. The point is
that I have to configure DIH to check for changes in the DB very
frequently (approx. 1 sec) to keep the index almost up-to-date. I
noted that JDBCDataSource closes the DB connection after every
execution which is not a
Hello All.
I'm a newbie with Solr. I have seen this subject a lot, but no answer was
satisfactory, or (probably) I don't know how to properly set up the Solr
environment.
I indexed documents in Solr with a French content field. I used the field
type "text_fr" that comes with the solr schema.xml file.
Hello Paul, Mahout is a machine learning [clustering & classification] and
recommendation library, with Hadoop linking.
So, the answer is yes, it qualifies as a recommender engine on its own (with
no other libs), scalable through Hadoop.
On Tue, Mar 13, 2012 at 9:23 AM, Paul Libbrecht wrote:
>
>
2012/5/24 Mark Miller
> I don't think there is yet - my fault, did not realize - we should make
> one.
>
> I've been messing around with some early stuff, but I'm still unsure about
> some things. Might just put in something simple to start though.
>
Sure, I'll take a look and try to help there.
Hi,
Sorry, I have no idea, as I never worked on this.
Thanks,
Rohan
From: Trev [via Lucene] [mailto:ml-node+s472066n3985922...@n3.nabble.com]
Sent: Thursday, May 24, 2012 7:37 PM
To: Rohan Ashok Kumbhar
Subject: Re: List of recommendation engines with solr
Have you heard of NG Data with thei
I don't think there is yet - my fault, did not realize - we should make one.
I've been messing around with some early stuff, but I'm still unsure about some
things. Might just put in something simple to start though.
On May 24, 2012, at 4:39 AM, Tommaso Teofili wrote:
> 2012/5/23 Mark Miller
Thanks Alexey Serba. I encountered the *java.sql.SQLException: Illegal value
for setFetchSize()* error after upgrading one of my servers to MySQL version
5.5.22.
> I am currently working on a project to integrate a
> Named-Entity-Recognition framework (NER) into an existing
> search platform based on Solr. The platform uses ManifoldCF
> to automatically gather the content from various
> repositories. The NER-Framework creates Annotations/Metadata
> from given
Could anyone help me? I really need index-time field boosting.
On Thu, May 24, 2012 at 4:21 PM, Chamnap Chhorn wrote:
> Hi all,
>
> I want to do index-time boost field on DIH. Is there any way to do this? I
> see on this documentation, there is only $docBoost. How about field boost?
> Is it possi
Hi,
With (e)dismax, explicit phrase queries are executed on the qf fields. The qf
parameter, however, may contain field(s) we don't want a phrase query for. How
can we tell the dismax query parser to only do phrase queries (explicit or not)
on the fields listed in the pf parameter?
Thanks
Markus
Hey Guys,
I am currently working on a project to integrate a
Named-Entity-Recognition framework (NER) into an existing search platform based
on Solr. The platform uses ManifoldCF to automatically gather the content from
various repositories. The NER framework creates Annotations/Metadata from given
Humm... OK, I will do the test as soon as I receive the database.
Thanks a lot!
On 24/05/2012 13:29, Michael Kuhlmann wrote:
Just try it!
Maybe you're lucky, and it works with 80M docs. If each document takes
100 k, then it only needs 8 GB memory for indexing.
However, I doubt it. I've not be
-Original message-
> From:Michael McCandless
> Sent: Thu 24-May-2012 13:15
> To: Markus Jelsma
> Cc: solr-user@lucene.apache.org
> Subject: Re: field "name" was indexed without position data; cannot
> run PhraseQuery (term=a)
>
> I believe termPositions=false refers to the term ve
Ron,
Did you actually add a new XSLT file there, or did you try to use the
example one? If the latter, I believe the filename is example.xsl, not
example.xslt.
--
Sami Siren
On Wed, May 23, 2012 at 5:30 PM, watson wrote:
> Here is my query:
> http://127.0.0.1:/solr/JOBS/select/??q=Apache&wt=xsl
Just try it!
Maybe you're lucky, and it works with 80M docs. If each document takes
100 k, then it only needs 8 GB memory for indexing.
However, I doubt it. I've not been too deeply into the UpdateHandler
yet, but I think it first needs to parse the complete XML file before it
starts to inde
I believe termPositions=false refers to the term vectors and not how
the field is indexed (which is very confusing I think...).
I think you'll need to index a separate field, with term freqs +
positions disabled, alongside the field the query parser can query?
But ... if all of this is to just do custom scor
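A sketch of that separate-field approach in schema.xml (field and type names
are made up):

  <field name="name" type="text_general" indexed="true" stored="true"/>
  <field name="name_nopos" type="text_general" indexed="true" stored="false"
         omitTermFreqAndPositions="true"/>
  <copyField source="name" dest="name_nopos"/>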
What version of Solr (SolrJ) are you using?
--
Sami Siren
On Thu, May 24, 2012 at 8:41 AM, in.abdul wrote:
> Hi Dmitry ,
>
> There is no out of memory exception in Solr.
> Thanks and Regards,
> S SYED ABDUL KATHER
>
>
>
> On Thu, May 24, 2012 at 1:14 AM, Dmitry Kan [via Luce
Thanks!
How can we, in that case, omit term frequency for a qf field? I assume the way
to go is to configure a custom flat term frequency similarity for that field.
And how can it be that this error is not thrown with termPositions=false for
that field but only with omitTermFreqAndPositions?
Ma
In fact it's not for an update but only for the initial indexing.
I mean, I will receive the full database with around 80M docs in some
XML files (one per country in the world).
From these 80M docs I will generate the right XML format for each doc. (I
don't need all fields from the source.)
And as
This behavior has changed.
In 3.x, you silently got no results in such cases.
In trunk, you get an exception notifying you that the query cannot run.
Mike McCandless
http://blog.mikemccandless.com
On Thu, May 24, 2012 at 6:04 AM, Markus Jelsma
wrote:
> Hi,
>
> What is the intended behaviour f
Hi,
What is the intended behaviour for explicit phrase queries on fields without
position data? If an (e)dismax qf parameter includes a field with
omitTermFreqAndPositions=true, explicit user phrase queries throw the following
error on trunk but not on the 3x branch.
java.lang.IllegalStateException:
"pish it too jard" - sounds funny. :)
I meant "push it too hard".
On 24.05.2012 11:46, Michael Kuhlmann wrote:
There is no hard limit for the maximum number of documents per update.
It's only memory dependent. The smaller each document, and the more
memory Solr can acquire, the more documen
There is no hard limit for the maximum number of documents per update.
It's only memory dependent. The smaller each document, and the more
memory Solr can acquire, the more documents you can send in one update.
However, I wouldn't pish it too jard anyway. If you can send, say, 100
documents
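For reference, batching just means one <add> message carrying many <doc>
elements, roughly:

  <add>
    <doc><field name="id">doc-1</field><field name="title">first</field></doc>
    <doc><field name="id">doc-2</field><field name="title">second</field></doc>
    <!-- ... and so on, e.g. ~100 docs per request -->
  </add>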
I can't find my answer concerning the max number of <doc> elements.
Can someone tell me if there is no limit?
On 24/05/2012 09:55, Bruno Mannina wrote:
Sorry, I just found: http://wiki.apache.org/solr/UpdateXmlMessages
I will also take a look to find the max number of <doc> elements.
On 24/05/2012 09:51, Paul Li
2012/5/23 Mark Miller
> Yeah, currently you have to create the core on each node...we are working
> on a 'collections' api that will make this a simple one call operation.
>
Mark, is there a Jira for that yet?
Tommaso
>
> We should have this soon.
>
> - Mark
>
> On May 23, 2012, at 2:36 PM, Da
Ok, thanks a lot, good to know.
BTW: The speed of creating a collection is not the fastest - at least here
on this server I use (approx. a second), but this is normal, right?
On Wed, May 23, 2012 at 9:28 PM, Mark Miller wrote:
> Yeah, currently you have to create the core on each node...we are wo
Sorry, I just found: http://wiki.apache.org/solr/UpdateXmlMessages
I will also take a look to find the max number of <doc> elements.
On 24/05/2012 09:51, Paul Libbrecht wrote:
Bruno,
see solrconfig.xml; you have all sorts of tweaks for this kind of thing.
paul
On 24 May 2012 at 09:49, Bruno Mannina
Hi All
I have a scenario: suppose I want to use a new feature which is available in
trunk and also available as a patch.
Should I apply the patch to the latest release version to use the new feature,
or directly use trunk?
Which one will be the better approach, and why?
Thanks in advance
Hemant
Bruno,
see the solrconfig.xml, you have all sorts of tweaks for this kind of things.
paul
On 24 May 2012 at 09:49, Bruno Mannina wrote:
> Hi All,
>
> Just a little question concerning the max number of <doc> elements
> that I can write in the XML source file before indexing? Only one, 10, 10
Hi All,
Just a little question concerning the max number of <doc> elements
that I can write in the XML source file before indexing? Only one, 10,
100, 1000, unlimited...?
I must index 80M docs, so I can't create one XML file per doc.
thanks,
Bruno
Thanks a lot for all this help!
On 24/05/2012 09:12, Otis Gospodnetic wrote:
Bruno,
You can use jconsole to see the size of the JVM heap, if that's what you are
after.
Otis
Performance Monitoring for Solr / ElasticSearch / HBase -
http://sematext.com/spm
Bruno,
You can use jconsole to see the size of the JVM heap, if that's what you are
after.
Otis
Performance Monitoring for Solr / ElasticSearch / HBase -
http://sematext.com/spm
>
> From: Bruno Mannina
>To: solr-user@lucene.apache.org
>Sent: Tuesday,
Christian,
You don't mention SolrCloud explicitly and based on what you wrote I'm assuming
you are thinking/planning on using the Solr 3.* setup for this. I think that's
the first thing to change - this is going to be a pain to manage if you use
Solr 3.*. You should immediately start looking
Perhaps this could be a custom SearchComponent that's run before the usual
QueryComponent?
This component would be responsible for loading queries, executing them,
caching results, and for returning those results when these queries are
encountered later on.
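Wiring-wise, that would be a first-components entry on the search handler in
solrconfig.xml (component name and class here are hypothetical):

  <searchComponent name="pinnedQueries" class="com.example.PinnedQueryComponent"/>
  <requestHandler name="/select" class="solr.SearchHandler">
    <arr name="first-components">
      <str>pinnedQueries</str>
    </arr>
  </requestHandler>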
Otis
Performance Monitoring for
Scott,
In addition to what Lance said, make sure your ramBufferSizeMB in
solrconfig.xml is high. Try with 512MB or 1024MB. Seeing Solr/Lucene index
segment merging visualization in SPM for Solr is one of my favourite reports in
SPM. It's kind of "amazing" how much index size fluctuates!
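That's this knob in solrconfig.xml (under <indexDefaults> in 3.x,
<indexConfig> on trunk):

  <indexDefaults>
    <ramBufferSizeMB>1024</ramBufferSizeMB>
  </indexDefaults>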
Otis