Hi all
I'm trying to integrate Solr with HDFS HA. When I start the Solr server, an
exception comes up [1].
I do know this is because the hadoop.conf.Configuration in
HdfsDirectoryFactory.java does not include the HA configuration.
So I want to know: in Solr, is there any way to include the HA configuration?
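For context, this is the kind of directoryFactory setting involved - a minimal
sketch, assuming solr.hdfs.confdir can point at the Hadoop client config (home
path, confdir, and nameservice name are placeholders):

<directoryFactory name="DirectoryFactory" class="solr.HdfsDirectoryFactory">
  <str name="solr.hdfs.home">hdfs://mycluster/solr</str>
  <!-- directory holding core-site.xml/hdfs-site.xml with the HA nameservice -->
  <str name="solr.hdfs.confdir">/etc/hadoop/conf</str>
</directoryFactory>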
Hi Chris & Jeroen,
Tonight I posted some tips on Solr's wiki on this subject:
http://wiki.apache.org/solr/SpatialClustering
~ David
Chris Atkinson wrote
> Did you get any resolution for this? I'm about to implement something
> identical.
> On 3 Jul 2013 23:03, "Jeroen Steggink" <
> jeroen@
>
Dan,
StandardTokenizer implements the word boundary rules from the Unicode Text
Segmentation standard annex UAX#29:
http://www.unicode.org/reports/tr29/#Word_Boundaries
Every character sequence within UAX#29 boundaries that contains a numeric or an
alphabetic character is emitted as a term,
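For example, a minimal field type using it (the sample analysis in the comment
is illustrative):

<fieldType name="text_std" class="solr.TextField">
  <analyzer>
    <!-- UAX#29 word boundaries: "Dan's R2-D2 costs $3.14!" yields
         the terms Dan's | R2 | D2 | costs | 3.14 - punctuation-only
         sequences such as "$" and "!" are dropped -->
    <tokenizer class="solr.StandardTokenizerFactory"/>
  </analyzer>
</fieldType>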
Hi Quan
You claim to be using LatLonType, yet the error you posted makes it clear
you are in fact using SpatialRecursivePrefixTreeFieldType (RPT).
Regardless of which spatial field you use, it's not clear to me what sort of
statistics could be useful on a spatial field. The stats component doesn
Awesome!
Be sure to "watch" the JIRA issue as it develops. The patch will improve
(I've already improved it but not posted it) and one day a solution is bound
to get committed.
~ David
Jeff Wartes wrote
> This is actually pretty far afield from my original subject, but it turns
> out that I al
This is actually pretty far afield from my original subject, but it turns
out that I also had issues with NRT and multi-field geospatial
performance in Solr 4, so I'll follow that up.
I've been testing and working with David's SOLR-5170 patch ever since he
posted it, and I pushed it into produc
OK - I see that this can be done with Field Collapsing/Grouping. I also
see the mentions in the Wiki for avoiding duplicates using a 16-byte hash.
So, question withdrawn...
On Thu, Aug 22, 2013 at 10:21 PM, Dan Davis wrote:
> Suppose I have two documents with different id, and there is anothe
You are right, but here's my null hypothesis for studying the impact on
relevance. Hash the query to deterministically seed a random number
generator. Pick one from column A or column B at random.
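A minimal sketch of that selection (all names are illustrative):

import java.util.Random;

public class CorpusPicker {
    // Hash the query text to seed the RNG, so the same query always
    // lands on the same corpus; then flip a fair coin.
    static String pick(String query) {
        Random rng = new Random(query.hashCode());
        return rng.nextBoolean() ? "A" : "B";
    }
}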
This is of course wrong - a query might find two non-relevant results in
corpus A and lots of rele
Ah, but what is the definition of punctuation in Solr?
On Wed, Aug 21, 2013 at 11:15 PM, Jack Krupansky wrote:
> "I thought that the StandardTokenizer always split on punctuation, "
>
> Proving that you haven't read my book! The section on the standard
> tokenizer details the rules that the toke
Alright, thanks for all your help. I finally fixed this problem using
PatternReplaceFilterFactory + WordDelimiterFilterFactory.
I first replace _ (underscore) using PatternReplaceFilterFactory and then
use WordDelimiterFilterFactory to generate the word and number parts to
increase user search hits (sketch below). Alt
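Here is a sketch of the chain (attribute values are illustrative, not my exact
settings):

<fieldType name="text_underscore" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <!-- rewrite foo_bar42 to foo-bar42 so the word delimiter can split it -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="_" replacement="-" replace="all"/>
    <filter class="solr.WordDelimiterFilterFactory"
            generateWordParts="1" generateNumberParts="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>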
Suppose I have two documents with different id, and there is another field,
for instance "content-hash" which is something like a 16-byte hash of the
content.
Can Solr be configured to return just one copy, and drop the other if both
are relevant?
If Solr does drop one result, do you get any indi
be careful with drop_caches - make sure you sync first
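The usual sequence is something like this (as root; echo 1 drops only the page
cache, echo 3 also drops dentries and inodes):

sync
echo 3 > /proc/sys/vm/drop_caches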
On Thu, Aug 22, 2013 at 1:28 PM, Jean-Sebastien Vachon <
jean-sebastien.vac...@wantedanalytics.com> wrote:
> I was afraid someone would tell me that... thanks for your input
>
> > -Original Message-
> > From: Toke Eskildsen [mailto:t
Hi Dmitry,
So it seems solrjmeter should not assume the adminPath - and perhaps it needs
to be passed as an argument. When you set the adminPath, are you able to
access localhost:8983/solr/statements/admin/cores ?
roman
On Wed, Aug 21, 2013 at 7:36 AM, Dmitry Kan wrote:
> Hi Roman,
>
> I have not
Hi,
I am using Solr 4.3 with 3 Solr hosts and an external ZooKeeper
ensemble of 3 servers. And just 1 shard currently.
When I create collections using the Collections API, it creates collections with
names,
collection1_shard1_replica1, collection1_shard1_replica2,
collection1_shard1_replica3.
I
What we need is similar to what is discussed here, except not as a filter
but as an actual query:
http://lucene.472066.n3.nabble.com/filter-query-from-external-list-of-Solr-unique-IDs-td1709060.html
We'd like to implement a query parser/scorer that would allow us to combine
SOLR searches with sear
Hi jfeist,
Your mail reminds me of this blog; not sure about Solr though.
http://blog.mikemccandless.com/2011/11/searcherlifetimemanager-prevents-broken.html
From: jfeist
To: solr-user@lucene.apache.org
Sent: Friday, August 23, 2013 12:09 AM
Subject: Storing qu
Hi,
How can I prevent Solr from updating some fields when updating a doc?
The problem is, I have a UUID with the field name uuid, but it is not a
unique key. When an RSS source updates a feed, Solr will update the doc with the
same link, but it generates a new uuid. This is not desired because
I am in the process of setting up a search application that allows the user
to view paginated query results. The documents are highly dynamic but I
want the search results to be static, i.e. I don't want the user to click
the next page button, have the query rerun, and get a different set of
se
I should have said that I have set it both to "true" and to "false" and
restarted Solr each time, and the rankings and info in the debug query
showed no change.
Does this have to be set at index time?
Tom
>
Thanks Markus,
I set it, but it seems to make no difference in the score or statistics
listed in the debugQuery or in the ranking. I'm using a field with
CommonGrams and a huge list of common words, so there should be a huge
difference in the document length with and without discountOverlaps.
Hi Tom,
Don't set it as attributes but as lists, as Solr uses everywhere:
<bool name="discountOverlaps">true</bool>
For BM25 you can also set k1 and b, which is very convenient!
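For example, a minimal per-field-type sketch (the k1/b values shown are just
the common defaults):

<similarity class="solr.BM25SimilarityFactory">
  <bool name="discountOverlaps">true</bool>
  <float name="k1">1.2</float>
  <float name="b">0.75</float>
</similarity>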
Cheers
-Original message-
> From:Tom Burton-West
> Sent: Thursday 22nd August 2013 22:42
> To: solr-user@lucene.apache.org
> Subject: How to
If I am using solr.SchemaSimilarityFactory to allow different similarities
for different fields, do I set "discountOverlaps="true" on the factory or
per field?
What is the syntax? The below does not seem to work
Tom
What version of solr are you using? Have you copied a solr.xml from
somewhere else? I can almost reproduce the error you're getting if I put a
non-existent core in my solr.xml, e.g.:
...
On Thu, Aug 22, 2013 at 1:30 PM, yriveiro wrote:
> Hi all,
>
> I think that there is some lack
Thanks, Erick that's exactly the clarification/confirmation I was looking for!
Greg
Ah. That's because the Tika processor does not support path extraction. You
need to nest one more level.
Regards,
Alex
On 22 Aug 2013 13:34, "Andreas Owen" wrote:
> I can do it like this but then the content isn't copied to text; it's just
> in text_test
>
> url="${rec.path}${rec.file}" dataS
I don't think you can go into production with that. But the Cloudera
distribution (with Hue) might be a similar or better option.
Regards,
Alex
On 22 Aug 2013 14:38, "Lance Norskog" wrote:
> You need to:
> 1) crawl the SVN database
> 2) index the files
> 3) make a UI that fetches the original fi
On Aug 22, 2013, at 19:53 , Kamaljeet Kaur wrote:
> On Thu, Aug 22, 2013 at 10:56 PM, SolrLover [via Lucene]
> wrote:
>>
>> Now use DIH to get the data from MYSQL database in to SOLR..
>>
>> http://wiki.apache.org/solr/DataImportHandler
>
>
> These are for versions 1.3, 1.4, 3.6 or 4.0.
> Why
Versions mentioned in the wiki only tell you that these features are
available from that version of Solr onward. This will not be a problem in your
case as you are using the latest version, so everything you find in the wiki
should be available in Solr 4.4.
Right, it's a little arcane. But the lockup is because the
various leaders send documents to each other and wait
for returns. If there are a _lot_ of incoming packets to
various leaders, it can generate the distributed deadlock.
So the shuffling you refer to is the root of the issue.
If the leader
Erick,
I've read over SOLR-4816 after finding your comment about the server-side
stack traces showing threads locked up over semaphores and I'm curious how
that issue cures the problem on the server-side as the patch only includes
client-side changes. Do the servers get so tied up shuffling docume
Hi,
I have a Solr cloud setup with 12 shards with 2 replicas each, divided over 6
servers (each server hosting 4 cores). The Solr version is 4.3.1.
Due to memory errors on one machine, 3 of its 4 indexes became corrupted. I
unloaded the cores, repaired the indexes with the Lucene CheckIndex tool, an
Thanks a lot !!!
On 22/08/2013 16:23, Andrea Gazzarini wrote:
First, a core is a separate index, so it is completely independent from
the already existing core(s). So basically you don't need to reindex.
In order to have two cores (but the same applies for n cores): you
must have in your so
You need to:
1) crawl the SVN database
2) index the files
3) make a UI that fetches the original file when you click on a search
result.
Solr only has #2. If you run a subversion web browser app, you can
download the developer-only version of the LucidWorks product and crawl
the SVN web view
Hello, I am dealing with an issue of highlighting and so far the other posts
that I've read have not provided a solution.
When using proximity search ("coming soon"~10) I get some documents with no
highlights and some documents highlight these words even when they are not
in a 10 word proximity.
After you connect to Subversion, you'll need parsers for code, etc.
You might want to try Krugle instead, since they have already written all that
stuff: http://krugle.org/
wunder
On Aug 22, 2013, at 10:43 AM, SolrLover wrote:
> I don't think there's a Solr-SVN connector available out of th
We warm the file buffers before starting Solr to avoid spending time waiting
for disk IO. The script is something like this:
for core in core1 core2 core3
do
find /apps/solr/data/${core}/index -type f | xargs cat > /dev/null
done
It makes a big difference in the first few minutes of service.
Your first problem is that the terms aren't getting to the field
analysis chain as a unit, if you attach &debug=query to your
query and say you're searching lastName:(ogden erickson),
you'll see something like
lastName:ogden lastName:erickson
when what you want is
lastname:ogden erickson
(note, th
On Thu, Aug 22, 2013 at 10:56 PM, SolrLover [via Lucene]
wrote:
>
> Now use DIH to get the data from MYSQL database in to SOLR..
>
> http://wiki.apache.org/solr/DataImportHandler
These are for versions 1.3, 1.4, 3.6 or 4.0.
Why are versions mentioned there? Don't they work on Solr 4.4.0?
--
K
I don't think there's a Solr-SVN connector available out of the box.
You can write a custom SolrJ indexer program to get the necessary data from
SVN (using the Java API) and add the data to Solr.
I can do it like this but then the content isn't copied to text; it's just in
text_test
On 22. Aug 2013, at 6:12 PM, Andreas Owen wrote:
> I put it in the tika-entity as an attribute, but it doesn't change anything. My
> bigger concern is why text_test isn't populated at all
Hi all,
I think there is something missing in Solr's ref doc.
Section "Running Solr" says to run Solr using the command:
$ java -jar start.jar
But If I do this with a fresh install, I have a stack trace like this:
http://pastebin.com/5YRRccTx
Is this behavior expected?
-
Best regards
I was afraid someone would tell me that... thanks for your input
> -Original Message-
> From: Toke Eskildsen [mailto:t...@statsbiblioteket.dk]
> Sent: August-22-13 9:56 AM
> To: solr-user@lucene.apache.org
> Subject: Re: Flushing cache without restarting everything?
>
> On Tue, 2013-08-20
Now use DIH to get the data from the MySQL database into Solr:
http://wiki.apache.org/solr/DataImportHandler
You need to define the field mapping (between MySQL and the Solr document) in
data-config.xml, as in the sketch below.
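A minimal data-config.xml sketch (driver, URL, credentials, table and field
names are all placeholders):

<dataConfig>
  <dataSource type="JdbcDataSource" driver="com.mysql.jdbc.Driver"
              url="jdbc:mysql://localhost:3306/mydb"
              user="dbuser" password="dbpass"/>
  <document>
    <!-- each row becomes one Solr document; columns map to schema fields -->
    <entity name="item" query="SELECT id, name, description FROM item">
      <field column="id" name="id"/>
      <field column="name" name="name"/>
      <field column="description" name="description"/>
    </entity>
  </document>
</dataConfig>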
Hello there,
I have installed Solr and it's working fine on localhost. I have indexed the
example files given along with solr-4.4.0; these are CSV or XML. Now I want
to index a MySQL database for a Django project, search the queries from the
user end, and also implement more features. What should I do?
-
Ok, found:
<requestHandler name="/dataimport"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">dih-config.xml</str>
    <str name="update.chain">nohtml</str>
  </lst>
</requestHandler>
Of course, my mistake... when I changed the name of the chain I deleted
the "<" char.
Sorry
On 08/22/2013 06:15 PM, Shawn Heisey wrote:
On 8/22/2013 10:06 AM, Andrea Gazzarini wrote:
yes, yes of course, you should use your already declared request
handler... that was just a copied and pasted example :)
I'm curious about what kind of error you got... I copied the snippet
above from a working core (just replaced the name of the cha
I put it in the tika-entity as an attribute, but it doesn't change anything. My
bigger concern is why text_test isn't populated at all
On 22. Aug 2013, at 5:27 PM, Alexandre Rafalovitch wrote:
> Can you try SOLR-4530 switch:
> https://issues.apache.org/jira/browse/SOLR-4530
>
> Specifically, setti
On 8/22/2013 10:02 AM, Steve Rowe wrote:
You could declare your update chain as the default by adding 'default="true"'
to its declaring element:
<updateRequestProcessorChain name="nohtml" default="true">
and then you wouldn't need to declare it as the default update.chain in either
of your request handlers.
If I did this, would it only apply t
yes, yes of course, you should use your already declared request
handler... that was just a copied and pasted example :)
I'm curious about what kind of error you got... I copied the snippet
above from a working core (just replaced the name of the chain)
BTW: AFAIK it is the "update.processor" tha
You could declare your update chain as the default by adding 'default="true"'
to its declaring element:
<updateRequestProcessorChain name="nohtml" default="true">
and then you wouldn't need to declare it as the default update.chain in either
of your request handlers.
On Aug 22, 2013, at 11:57 AM, Shawn Heisey wrote:
> On 8/22/2013 9:42 AM, Andre
On 8/22/2013 9:42 AM, Andrea Gazzarini wrote:
You should declare this
<str name="update.chain">nohtml</str>
in the "defaults" section of the RequestHandler that corresponds to your
DataImportHandler. You should have something like this:
<requestHandler name="/dataimport"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">dih-config.xml</str>
    <str name="update.chain">nohtml</str>
  </lst>
</requestHandler>
Ot
You should declare this
<str name="update.chain">nohtml</str>
in the "defaults" section of the RequestHandler that corresponds to your
DataImportHandler. You should have something like this:
<requestHandler name="/dataimport"
    class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">dih-config.xml</str>
    <str name="update.chain">nohtml</str>
  </lst>
</requestHandler>
I have an updateProcessor defined. It seems to work perfectly when I
index with SolrJ, but when I use DIH (which I do for a full index
rebuild), it doesn't work. This is the case with both Solr 4.4 and Solr
4.5-SNAPSHOT, svn revision 1516342.
Here's a solrconfig.xml excerpt:
ft_t
Can you try SOLR-4530 switch:
https://issues.apache.org/jira/browse/SOLR-4530
Specifically, setting htmlMapper="identity" on the entity definition. This
will tell Tika to send full HTML rather than a seriously stripped one.
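A sketch of where the switch goes (the other entity attributes are
placeholders):

<entity name="doc" processor="TikaEntityProcessor"
        url="${rec.path}${rec.file}" dataSource="bin"
        format="html" htmlMapper="identity">
  <field column="text" name="text"/>
</entity>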
Regards,
Alex.
Personal website: http://www.outerthoughts.com/
LinkedIn:
I'm trying to index an HTML page and only use the div with the id="content".
Unfortunately nothing is working within the tika-entity; only the standard text
(content) is populated.
Do I have to use copyField for test_text to get the data?
Or is there a problem with the entity-h
Hello All,
I am currently doing a spatial query in solr. I indexed "coordinates"
(type="location" class="solr.LatLonType"), but the following query failed.
http://localhost/solr/quan/select?q=*:*&stats=true&stats.field=coordinates&stats.facet=township&rows=0
It showed an error:
Field type location
First, a core is a separate index, so it is completely independent from
the already existing core(s). So basically you don't need to reindex.
In order to have two cores (but the same applies for n cores): you must
have in your solr.home the file (solr.xml) described here
http://wiki.apache.org
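A minimal solr.xml sketch in the legacy (pre-Solr 4.4) format, with
placeholder core names:

<solr persistent="true">
  <cores adminPath="/admin/cores">
    <core name="core1" instanceDir="core1"/>
    <core name="core2" instanceDir="core2"/>
  </cores>
</solr>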
A small clarification: I'm on Ubuntu 12.04 LTS.
On 22/08/2013 15:56, Bruno Mannina wrote:
Dear Users,
(Solr 3.6 + Tomcat 7)
I have been using Solr with one core for two years; I would now like to add
another core (a new database).
Can I do this without re-indexing my core1?
could you point me to a goo
On Tue, 2013-08-20 at 20:04 +0200, Jean-Sebastien Vachon wrote:
> Is there a way to flush the cache of all nodes in a Solr Cloud (by
> reloading all the cores, through the collection API, ...) without
> having to restart all nodes?
As MMapDirectory shares data with the OS disk cache, flushing of
S
Dear Users,
(Solr 3.6 + Tomcat 7)
I have been using Solr with one core for two years; I would now like to add
another core (a new database).
Can I do this without re-indexing my core1?
Could you point me to a good tutorial for doing that?
(my current database is around 200GB for 86,000,000 docs)
My new
On 8/22/2013 2:25 AM, YouPeng Yang wrote:
> Hi all
> About RAMBufferSize and commit, I have read the doc:
> http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/60544
>
>    I cannot figure out how they work together.
>
> Given the settings:
>
> <ramBufferSizeMB>10</ramBufferSizeMB>
>
> <maxDocs>${solr.autoC
How can you validate that the changes you just made had any impact on the
performance of the cloud if you don't have the same starting conditions?
What we basically do is run a batch of requests to warm up the index and
then launch the benchmark itself. That way we can measure the impact of
Updated to Sun Java 1.7.0_25 on Solr 4.4.0, but we are still getting mutated
strings:
725597 [Thread-20] ERROR org.apache.solr.update.SolrCmdDistributor - shard
update error StdNode:
http://10.231.188.126:8080/solr/kunde0/:org.apache.solr.client.solrj.impl.HttpSolrServer$RemoteSolrException:
Unexpe
Call "optimize" on your Solr 3.5 server which will write a new index
segment in v3.5 format. Such an index should be read in Solr 4.x
without any problem.
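A typical way to trigger it (host, port, and path are placeholders):

curl 'http://localhost:8983/solr/update?optimize=true'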
On Thu, Aug 22, 2013 at 5:00 PM, Montu v Boda
wrote:
> thanks
>
> actually the problem is that we have migrated the solr 1.4 index data to
> s
optimize is an explicit request to perform a merge. Merges occur in the
background, automatically, as needed or indicated by the parameters of the
merge policy. An optimize is requested from outside of Solr.
-- Jack Krupansky
-Original Message-
From: YouPeng Yang
Sent: Thursday, Augu
On Wed, 2013-08-21 at 10:09 +0200, sivaprasad wrote:
> The slave will poll for every 1hr.
And are there normally changes?
> We have configured ~2000 facets and the machine configuration is given
> below.
I assume that you only request a subset of those facets at a time.
How much RAM does your
Hello All,
I am also facing a similar issue. I am using Solr 4.3.
Following is the configuration I gave in schema.xml
thanks
Actually the problem is that we have migrated the Solr 1.4 index data to
Solr 3.5 using the replication feature of Solr 3.5, so whatever data we
have in Solr 3.5 is in the Solr 1.4 format.
So I do not think it will work in Solr 4.x.
So please suggest your view based on my point above.
Thanks & R
But is it really good benchmarking if you flush the cache? Wouldn't you
want to benchmark against a system that is comparable to what is under
real (= production) load?
Dmitry
On Tue, Aug 20, 2013 at 9:39 PM, Jean-Sebastien Vachon <
jean-sebastien.vac...@wantedanalytics.com> wrote:
> I
Thanks much. This was useful.
On Thu, Aug 22, 2013 at 2:24 PM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:
> You can use the /admin/mbeans handler to get all system stats. You can
> find stats such as "adds" and "cumulative_adds" under the update
> handler section.
>
> http://localho
Hi, I'm using DIH to index data into Solr. Solr version 4.4 is used. Indexing
proceeds normally in the beginning.
I have some 10 data-config files:
file1 -> select * from table where id between 1 and 100
file2 -> select * from table where id between 100 and 300, and so
on.
Here 4 batches
No one is asking you to re-index data. The Solr 3.5 index can be read
and written by a Solr 4.x installation.
On Thu, Aug 22, 2013 at 12:08 PM, Montu v Boda
wrote:
> Thanks for suggestion
>
> but as per us this is not the right way to re-index all the data each and
> every time. we mean when we m
You can use the /admin/mbeans handler to get all system stats. You can
find stats such as "adds" and "cumulative_adds" under the update
handler section.
http://localhost:8983/solr/collection1/admin/mbeans?stats=true
On Thu, Aug 22, 2013 at 12:35 PM, Prasi S wrote:
> I am not using dih for indexi
I have been running DIH imports (>15,000,000 rows) all day, and every now and
then I get some weird errors. Some examples:
A letter is replaced by an unknown character (it should have been a 'V'):
285680 [Thread-20] ERROR org.apache.solr.update.SolrCmdDistributor - shard
update error StdNode:
http://10
Aliasing instead of swapping removed this problem!
DO NOT USE "SWAP" WHEN IN CLOUD MODE (solr 4.3)
Hi all
About RAMBufferSize and commit, I have read the doc:
http://comments.gmane.org/gmane.comp.jakarta.lucene.solr.user/60544
I cannot figure out how they work together.
Given the settings:
<ramBufferSizeMB>10</ramBufferSizeMB>
<maxDocs>${solr.autoCommit.maxDocs:1000}</maxDocs>
<openSearcher>false</openSearcher>
If the indexs docs up to 10
Hi All
I have some difficulty understanding the relation between
optimize and merge.
Can anyone give some tips about the difference?
Regards
I am not using DIH for indexing CSV files; I'm pushing data through SolrJ
code. But I want a status something like what DIH gives, i.e. fire a
command=status and get the response. Is anything like that available for
any type of file indexing which we do through the API?
On Thu, Aug 22, 2013 at 12:09