I also think that's a good question, and currently one without a "use this"
answer :-)
I think it shouldn't be hard to write a Solr service that queries ZK and
replicates both conf and indexes (via SnapPuller or ZK itself), so that such
a node is responsible for backing up the whole cluster to secure storage
(N
I am seeing the garbage text in the browser, in the Luke Index Toolbox, and
everywhere else it is the same. My servlet container is Jetty, the out-of-the-box one.
Many other special chars are getting indexed and stored properly; only a few
characters cause pain.
*Pranav Prakash*
"temet nosce"
On Fri, Sep 14
Hi Hoss,
Thanks for your quick reply.
Below is my solr.xml configuration; persistent is already set to true.
The test1 and test1-ondeck contents were just copied from
example/solr/collection1.
Then I published 1 record to test1 and queried; it's OK now.
INFO: [test1] webapp=/solr path=/sel
The Solr caches are thrown away on each hard commit. The document cache could
be conserved across commits. Documents in segments that still exist would be
saved. Documents in segments that are removed would be thrown away.
Perhaps the document cache should be pushed down into Lucene, to handle t
: In Solr 3.6, core swap function works good. After switch to use Solr 4.0
: Beta, and found it doesn't work well.
can you elaborate on what exactly you mean by "doesn't work well" ? ..
what does your solr.xml file look like? what command did you run to do the
swap? what results did you get from
What sorts of failures are you thinking of? Power loss? Index
corruption? Server overload?
Could you keep somewhat remote replicas of each shard, but not behind
your load balancer?
Then, should all your customer facing nodes go down, those replicas
would be elected leaders. When you bring the cus
I have been thinking about this some more.
So my search scenario is as follows.
A visitor types in
3 bed 2 bath condo new york
Now my schema has bed, bath, property type, city. The data going in is
denormalised csv files, so column headings are the fields.
The search consists of a near exac
If reindexing from raw XML files is feasible (less than 30 minutes) it would be
the easiest option. The problem with recovering from old snapshots is that you
have to remove bad indices from all cores, and possibly stale (or in-progress
recovery) indices, and replace them with your snapshot and mo
I'm thinking about catastrophic failure and recovery. If, for some reason,
the cluster should go down or become unusable and I simply want to bring it
back up as quickly as possible, what's the best way to accomplish that?
Maybe I'm thinking about this incorrectly? Is this not a concern?
--
That is a great idea to run the updates thru the LB also! I like it!
Thanks for the replies guys
-Original Message-
From: jimtronic [mailto:jimtro...@gmail.com]
Sent: Thursday, September 20, 2012 1:46 PM
To: solr-user@lucene.apache.org
Subject: Re: some general solr 4.0 questions
I've
He explained why in the message. Because it is faster to bring up a new host
from a snapshot.
I presume that he doesn't need the full cluster running all the time.
wunder
On Sep 20, 2012, at 2:19 PM, Markus Jelsma wrote:
> Hi,
>
> Why do you want to back up? With enough machines and a decent
Hi,
Why do you want to back up? With enough machines and a decent replication
factor (3 or higher) there is usually little need to back it up. If you have
the space it's better to launch a second cluster in another DC.
You can also choose to increase the number of maxCommitsToKeep but it'll tak
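For reference, maxCommitsToKeep lives in the deletion policy section of
solrconfig.xml; the values below are only illustrative:

  <deletionPolicy class="solr.SolrDeletionPolicy">
    <str name="maxCommitsToKeep">3</str>
    <str name="maxOptimizedCommitsToKeep">1</str>
  </deletionPolicy>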
Just added this today.
https://issues.apache.org/jira/browse/SOLR-3862
--
View this message in context:
http://lucene.472066.n3.nabble.com/deleting-a-single-value-from-multivalued-field-tp4009092p4009292.html
Sent from the Solr - User mailing list archive at Nabble.com.
I'm trying to determine my options for backing up data from a SolrCloud
cluster.
For me, bringing up my cluster from scratch can take several hours. It's way
faster to take snapshots of the index periodically and then use one of these
when booting a new instance. Since I use static xml files and d
Sorry, but it looks like the SolrEntityProcessor does a raw split on commas
of its "fq" parameter, with no provision for escaping.
You should be able to combine the fq into the query parameter as a nested
query which does not have the split issue.
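For example, something along these lines in data-config.xml (the frange bounds
and function are placeholders for whatever you use; note the comma inside the
function is harmless here):

  <entity name="sep" processor="SolrEntityProcessor" url="http://localhost:8983/solr"
          query="*:* AND _query_:&quot;{!frange l=0 u=1}div(popularity,price)&quot;"/>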
-- Jack Krupansky
-Original Message-
I've got a setup like yours -- lots of cores and replicas, but no need for
shards -- and here's what I've found so far:
1. Zookeeper is tiny. I would think network I/O is going to be the biggest
concern.
2. I think this is more about high availability than performance. I've been
experimenting wit
I'll answer the other easy ones ;)
#1 yes, no need for a ton of RAM and tons of cores.
#2 it's not the overhead, it's that zookeeper is sensitive to not
hearing from nodes and marking them dead, at least in the Hadoop and
HBase world.
#3 yes, the external LB would simply spread the query load ov
Hi guys,
Has anybody got any idea about that?
I'm really open to any suggestions
Thanks!
Dirceu
On Thu, Sep 20, 2012 at 11:58 AM, Dirceu Vieira wrote:
> Hi,
>
> I'm attempting to write a filter query for my SolrEntityProcessor using
> {frange} over a function.
> It works fine when I'm te
Hi
You gave us very little information to go on, but I can list some probable
reasons why your search doesn't match any documents. In schema.xml check
that:
- you have specified fields for custid, familyname, usrname
- that those fields have the attribute indexed="true"
- that they are not of
I'll answer the easy one:
#4 - yes! In fact, it would seem wise in many of these straightforward cases
like yours to leave standard master/slave as-is for the time being even when
upgrading to Solr 4. No need to make life more complicated. Now, if you did
want to have NRT where updates are
Hello solr user group,
I am evaluating the new Solr 4.0 beta with an eye to how to fit it into our
current solr setup. Our current setup is running on solr 3.6.1 and uses 12
slaves behind a load balancer and a master which we index into, and they all
have three cores (now referred to as collec
: I've created a custom process in Solr that has a Zookeeper Watcher
: configured to pull Solr XML files from a znode. When I receive a file I can
: send the file to /update and get it indexed, but that seems inefficient. I
: could use SolrJ, but I believe that is still sending an HTTP request to
My limited understanding, confirmed by a profiler though, is that doing mmap
IO costs you copying bytes from mmapped virtual memory into the JVM heap. Just
look at java.nio.DirectByteBuffer.get(byte[], int, int). It has happened
several times to me - we saw a hotspot in the profiler on mmapped IO (yep, just
in co
Ah, I just upgraded us to 3.6, and abandoned xi:include in favor of
symlinks, so I didn't know whether it was fixed or not.
Another thing I just thought of is if you want your config files to be
available from the web UI, the xi:include directives won't be
resolved, so you'll just see the literal
So I just had a curiosity question pop up and wanted to check it out.
Solr has the documentCache, designed to hold stored fields while
various parts of a requestHandler do their tricks, keeping the stored
content from having to be re-fetched from disk. When using
MMapDirectory, is this even somethi
: "xi:include" directives work in Solr config files, but in most (all?)
: versions of Solr, they require absolute paths, which makes portable
: configuration slightly more sticky. Still, a very viable solution.
Huh?
There were bugs in xinclude parsing up to Solr 1.4 that caused relative
paths
Hi
We have a bunch of data that was indexed using a 4.0 snapshot build of Solr.
We'd like to migrate to the 4.0.beta version. Is there a recommended way to
migrate the indices, or is reindexing the best option?
Thanks
--
View this message in context:
http://lucene.472066.n3.nabble.com/4-0-snapsh
There isn’t a mechanism to update or delete only a subset of a multivalued
field. You would have to supply the full list of values you want to have in
the multivalued field.
You may want to offer it as a suggested improvement.
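For example, to end up with exactly two values in a multivalued "tags" field,
you would resend the whole document (names here are made up):

  <add>
    <doc>
      <field name="id">123</field>
      <field name="tags">red</field>
      <field name="tags">blue</field>
    </doc>
  </add>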
-- Jack Krupansky
-Original Message-
From: deniz
Sent: T
Is there any reason why the log function shouldn't be modified to
always take 1 + the number being requested to be logged? The reason I ask is
that I am taking the log of the value output by another function which
could return 0. For testing, I modified it to return 1, which works, but I
would rather have the log
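In the meantime you can get the same effect without modifying Solr by wrapping
the inner function yourself, e.g. (function and field names are hypothetical):

  log(sum(1, myOtherFunction(myField)))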
Thanks, Erick. That really helped us in learning about tokens and how the
Analyzer works. Thank you!
Warm regards,
Alex
-Original Message-
From: Erick Erickson [mailto:erickerick...@gmail.com]
Sent: 19 September 2012 3:56 PM
To: solr-user@lucene.apache.org
Subject: Re: Wildcard search
Hi,
Sorry for all the questions today, but I paid a third-party coder to develop
a schema for me, and now that I have more of an understanding myself I have a
few questions.
The aim is to do spatial searching, so in my schema I have this:
My site doesn't seem to submit via JSON to lat_lng_0_coordina
Depends on where the bottlenecks are I guess.
On a single system, increasing shards decreases throughput (this
isn't specific to Solr). The increased parallelism *can* decrease
latency to the degree that the parts that were parallelized outweigh
the overhead.
Going from one shard to two shards
Before anyone asks, these results were obtained warm.
On 20 Sep 2012, at 14:39, Tom Mortimer wrote:
> Hi all,
>
> After reading
> http://carsabi.com/car-news/2012/03/23/optimizing-solr-7x-your-search-speed/
> , I thought I'd do my own experiments. I used 2M docs from wikipedia, indexed
> in
Hi,
I'm using Solr 4.0-BETA and trying to import a CSV file as follows:
curl http://localhost:8080/solr//update -d overwrite=false -d
commit=true -d stream.contentType='text/csv;charset=utf-8' -d
stream.url=file:///dir/file.csv
I have 2 tomcat servers running on different machines and a separate
Ah... you are probably not "encoding" the & and % in your URL, so they are
being eaten when the URL is parsed. Use % followed by the 2-digit hex ASCII
character code. & should be %26 and % should be %25.
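For example, with the two queries from this thread:

  q=%22johnson%20%26%20johnson%22   instead of   q="johnson & johnson"
  q=%220,5%25%22                    instead of   q="0,5%"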
-- Jack Krupansky
-Original Message-
From: Gustav
Sent: Thursday, September 20,
You probably are using a "text" field which is tokenizing the input when
this data should probably be a "string" (or "text" with the
KeywordAnalyzer.)
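For example, a minimal schema.xml setup along these lines (the field name is
hypothetical):

  <fieldType name="string" class="solr.StrField" sortMissingLast="true"/>
  <field name="partNumber" type="string" indexed="true" stored="true"/>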
-- Jack Krupansky
-Original Message-
From: zainu
Sent: Thursday, September 20, 2012 5:49 AM
To: solr-user@lucene.apache.org
Subject:
Hello Jack,
My fieldtype is configured as follows:
What other filter could I use to preserve the "&" char?
Another problem that came up: when I search for ?q="0,5%" it gives an
error:
HTTP Status 400 - missing query string
Probably
But even with the XA log, am I correct in thinking that the writes themselves
will be mostly sequential?
Regards,
Phil.
From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
Sent: Thu 20/09/2012 14:09
To: solr-user@lucene.apache.org
Subject: Re: Solr Write w
Seriously, if you are having trouble finding the build file, I would suggest
that you do a lot more homework reading and studying the available Solr and
Lucene materials online before asking for further assistance.
Start with:
http://lucene.apache.org/solr/
http://lucene.apache.org/solr/version
Use a field type whose analyzer preserves the &. What field type are you
using?
-- Jack Krupansky
-Original Message-
From: Gustav
Sent: Thursday, September 20, 2012 9:05 AM
To: solr-user@lucene.apache.org
Subject: "&" char in querystring
Good Morning Everyone!
Again, I need your help
Hi all,
After reading
http://carsabi.com/car-news/2012/03/23/optimizing-solr-7x-your-search-speed/ ,
I thought I'd do my own experiments. I used 2M docs from wikipedia, indexed in
Solr 4.0 Beta on a standard EC2 large instance. I compared an unsharded and
2-shard configuration (the latter set
Hi James,
If you don't want this field to be included in user searches, just omit it from
the search configuration (e.g. if using eDisMax parser, don't put it in the qf
list). To keep it out of search results, exclude it from the fl list. See
http://wiki.apache.org/solr/CommonQueryParam
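For example, with eDisMax the request might look like this (field names are
made up); note the hidden field appears in neither qf nor fl:

  q=some+user+query&defType=edismax&qf=title^2+body&fl=id,title,score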
Hi,
And there is a wonderful report in SPM for Solr that shows how your index
changes over time in terms of size, index files, segments, indexed docs,
deleted docs... very useful for understanding what's going on at that
level.
Otis
--
Performance Monitoring - http://sematext.com/spm
On Sep 20, 2
Hi,
Right, documents are buffered in the JVM heap according to the ramBufferSizeMB
setting before getting indexed.
But the XA log doesn't do that, I don't think.
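ramBufferSizeMB is set in solrconfig.xml; 100 below is just an example value:

  <indexConfig>
    <ramBufferSizeMB>100</ramBufferSizeMB>
  </indexConfig>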
Otis
--
Performance Monitoring - http://sematext.com/spm
On Sep 20, 2012 8:11 AM, "John, Phil (CSS)" wrote:
> Hi,
>
> We're in the process of final
Good Morning Everyone!
Again, I need your help, Lucene community!
I have a query string just like this: q="johnson & johnson", and when I use
debugQuery=true I see that the Solr parser breaks the string exactly at
the "&" char, changing my query to q="johnson". I would like to know, is
there any wa
Hi, Simone:
"xi:include" directives work in Solr config files, but in most (all?)
versions of Solr, they require absolute paths, which makes portable
configuration slightly more sticky. Still, a very viable solution.
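For example, in solrconfig.xml (the path is illustrative, and per the above it
should be absolute on older versions):

  <config xmlns:xi="http://www.w3.org/2001/XInclude">
    <xi:include href="/etc/solr/common/handlers.xml"/>
  </config>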
Michael Della Bitta
Appinions
Hi, James,
If you don't store or index this value, it won't exist in Solr.
If you want to be able to find these records by the unique id, you
need to index it. If you want to find the corresponding DB record from
a Solr document you brought up by other means, you'll need to store
the unique id.
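For example, in schema.xml (the field name is hypothetical):

  <field name="db_id" type="string" indexed="true" stored="true"/>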
Hi.
My SQL database assigns a uniqueID to each item. I want to keep this
uniqueID associated with the items that are in Solr even though I won't ever
need to display it or have it searchable. I do however want to be able
to target specific items in Solr with it, for updating or deleting the
recor
Hi fellows,
I added the following fields in my data-config.xml to
implement the Data Import Handler.
When I perform the steps of the full-import example at
http://wiki.apache.org/solr/DataImportHandler
I can successfully index my databas
Yeah, I sent a note to the web folks there about the images.
I'll leave the rest to people who really _understand_ all that stuff
On Thu, Sep 20, 2012 at 8:31 AM, Bernd Fehling
wrote:
> Hi Erik,
>
> thanks for the link.
> Now if we could see the images in that article that would be great
Hi Erik,
thanks for the link.
Now if we could see the images in that article that would be great :-)
By the way, one cause of the memory jumps was identified: a "killer search" from
a user.
The interesting part is that the verbose gc.log showed a "hiccup" in the GC.
Which means that during a GC r
Hi,
We're in the process of finalising the specification for our Solr cluster and
just wanted to double check something:
What is the major IO/write workload type in Solr?
From what I understand, the main workload appears to be largely sequential
appends to segments, rather than heavily bi
Not enough info to go on here, what is your fieldType?
But the first place to look is admin/analysis to see how the
text is tokenized.
Best
Erick
On Thu, Sep 20, 2012 at 5:49 AM, zainu wrote:
> Dear fellows,
> I have a field in solr with value '8E0061123-8E1'. Now when I search '8E*',
> it does
Dear fellows,
I have a field in Solr with the value '8E0061123-8E1'. Now when I search '8E*'
it returns all values starting with '8E', which is totally right, but it
returns nothing when I search '8E0*'. I guess it is not indexing 8E0 or so.
I want to search with all combinations like '8E', '8E0',
Here's a wonderful writeup about GC and memory in Solr/Lucene:
http://searchhub.org/dev/2011/03/27/garbage-collection-bootcamp-1-0/
Best
Erick
On Thu, Sep 20, 2012 at 5:49 AM, Robert Muir wrote:
> On Thu, Sep 20, 2012 at 3:09 AM, Bernd Fehling
> wrote:
>
>> By the way while looking for upgradi
> Is it correct that a segment file is ready for merging after a commit has
> been done (e.g. using the autoCommit property), so I will see merges of 100
> and up documents (and the index writer continues writing into a new segment
> file)?
Yes, merging won't happen until after a segment is closed
Well, from the bullet points on the Wiki page:
Planned to be included in Solr_4.1
The referenced JIRA points to a jar that Marko kindly provides;
you can try that.
Best
Erick
On Wed, Sep 19, 2012 at 10:22 PM, rayvicky wrote:
> dataimport.properties
> #Thu Sep 20 10:11:09 CST 2012
> interval=1
Hi,
is it possible to split schema.xml and solrconfig.xml configurations? My
configurations are getting quite large and I'd like to be able to partition
them logically in multiple files.
thank you in advance,
S
Hi,
I'm attempting to write a filter query for my SolrEntityProcessor using
{frange} over a function.
It works fine when I'm testing it on the admin, but once I move it into my
data-config.xml the query blows up because of the commas in the function.
The problem is that the fq parameter can be a comma
On Thu, Sep 20, 2012 at 3:09 AM, Bernd Fehling
wrote:
> By the way while looking for upgrading to JDK7, the release notes say under
> section
> "known issues" about the "PorterStemmer" bug:
> "...The recommended workaround is to specify -XX:-UseLoopPredicate on the
> command line."
> Is this st
Hello everybody. I already posted this question on stackoverflow but didn't
get an answer.
I am using the solr suggestion component with the following configuration:
schema.xml
solrconfig.xml
suggest
org.apache.solr.spelling.suggest.Suggester
Hi
Thanks a lot for your answer, Erick!
I changed the value of the autoSoftCommit property and it had the
expected effect. Note that this is per core, so I get four getReader
calls per autoSoftCommit interval when my Solr contains four cores.
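For reference, that property lives in solrconfig.xml's updateHandler section;
the 5-second value below is just an example:

  <autoSoftCommit>
    <maxTime>5000</maxTime>
  </autoSoftCommit>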
Is it correct that a segment file is
Hi Alex,
during replication the slave is still available and serving requests, but
as you can imagine the responses will be slower because of disk usage,
even with 15k rpm disks.
We have one master and two slaves: the master only for indexing, the slaves
for searching.
Only one slave is online; the other i
Hi Alex,
During replication your slave will also be available for searches, and it
opens a new searcher just after replication. You won't get any downtime, but
you might not have a warmed cache at that moment. Please look into the cache
configuration for Solr.
Regards
Harshvardhan OJha
-Original Mes
Hi - at first I didn't recreate the Zookeeper data, but I got it to work. I'll
check the removal of the LOG line.
thanks
-Original message-
> From:Sami Siren
> Sent: Wed 19-Sep-2012 17:45
> To: solr-user@lucene.apache.org
> Subject: Re: Nodes cannot recover and become unavailable
>
> a
Hi All!
I want to replicate my Solr server.
At the beginning I want to have one master and one slave. The master would
be used for indexing and the slave (slaves in the future) would be used for
searching. I was wondering if anybody could tell me what happens to the
slave during replication. Is it unavaila
That is the problem with a JVM: it is a virtual machine.
Ask 10 experts about good JVM settings and you get 15 answers. Maybe that's a
tradeoff of the flexibility of JVMs. There is always a right setting for any
application running on a JVM, but you just have to find it.
How about a Solr Wiki page abo
Hello,
I am using Solr 3.6.0 and I have observed many connections in the CLOSE_WAIT
state after using the Solr server for some time. On further analysis and
googling, I found that I need to close idle connections from the client that
connects to Solr to query data, and doing so does reduce the number of CLO
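A sketch of the kind of cleanup that helps, assuming the SolrJ 3.x
commons-httpclient stack (the URL and timeout values are arbitrary):

  import org.apache.commons.httpclient.HttpClient;
  import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
  import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;

  public class SolrClientCleanup {
    public static void main(String[] args) throws Exception {
      // Share one connection manager so idle connections can be reaped.
      MultiThreadedHttpConnectionManager cm = new MultiThreadedHttpConnectionManager();
      CommonsHttpSolrServer server =
          new CommonsHttpSolrServer("http://localhost:8983/solr", new HttpClient(cm));
      // ... issue queries with 'server' ...
      // Close connections that have been idle for more than 30 seconds,
      // so they don't linger in CLOSE_WAIT.
      cm.closeIdleConnections(30 * 1000L);
    }
  }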