Hi,
How do people usually update Solr configuration files from a continuous
integration environment like TeamCity or Jenkins?
We have multiple development and testing environments and use WebDeploy- and
AwsDeploy-type tools to remotely deploy code multiple times a day. To
update Solr I wrote a si
It looks like Consul solves a different problem than ZooKeeper. Consul manages
what servers are up and starts new ones as needed. ZooKeeper doesn't start
servers, but does leader election when they fail.
I don't see any way that Consul could replace ZooKeeper, but it could solve
another part of
Not that I know of, but look before you leap. I took a quick look at
Consul and it really doesn't look like any kind of drop-in replacement.
Also, the ZooKeeper usage in SolrCloud isn't really pluggable
AFAIK, so there'll be lots of places in the Solr code that need to be
reworked, etc., especially
I am investigating a project to make SolrCloud run on Consul instead of
ZooKeeper. So far, my research revealed no such efforts, but I wanted to check
with this list to make sure I am not going to be reinventing the wheel. Has
anyone attempted using Consul instead of ZK to coordinate SolrCloud
@Will:
I can't tell you how many times questions like
"Why do you want to use CSV in SolrJ?" have
led to solutions different from what the original
question might imply. It's a question I frequently
ask in almost the exact same way; it's a
perfectly legitimate question IMO.
Best,
Erick
On Fri
No, but it is a reasonable request, as a global default, a
collection-specific default, a request-specific default, and on an
individual fuzzy term.
-- Jack Krupansky
-----Original Message-----
From: elisabeth benoit
Sent: Thursday, October 30, 2014 6:07 AM
To: solr-user@lucene.apache.org
Su
October 2014, Apache Solr™ 4.10.2 available
The Lucene PMC is pleased to announce the release of Apache Solr 4.10.2
Solr is the popular, blazing fast, open source NoSQL search platform
from the Apache Lucene project. Its major features include powerful
full-text search, hit highlighting, faceted
: "Why do you want to use CSV in SolrJ?" Alexandre are you looking for a
It's a legitimate question - part of providing good community support is
making sure we understand *why* users are asking how to do something, so
we can give good advice on other solutions people might not even have
thought of.
On 31 October 2014 14:58, will martin wrote:
> "Why do you want to use CSV in SolrJ?" Alexandre are you looking for a
> design gig. This kind of question really begs nothing but disdain.
Nope. Not looking for a design gig. I give that advice away for free:
http://www.airpair.com/solr/workshops/d
Erick Erickson wrote
> What version of Solr/Lucene?
At first it was Lucene/Solr 4.6, but later it was changed to Lucene/Solr
4.8. Later still, the _root_ field and child-document support were added to
the schema. A full data re-index was not done after each change. But not so
long ago I ran an optimize to
Yes, I was inadvertently sending them to a replica. When I sent them to the
leader, the leader reported (1000 adds) and the replica reported only 1 add
per document. So, it looks like the leader forwards the batched jobs
individually to the replicas.
On Fri, Oct 31, 2014 at 3:26 PM, Erick Erickson
Internally, the docs are batched up into smaller buckets (10 as I
remember) and forwarded to the correct shard leader. I suspect that's
what you're seeing.
Erick
On Fri, Oct 31, 2014 at 12:20 PM, Peter Keegan wrote:
> Regarding batch indexing:
> When I send batches of 1000 docs to a standalone S
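One way to cut down on that forwarding is to send updates through
CloudSolrServer, which reads the cluster state from ZooKeeper and routes
documents to the correct shard leader directly from the client
(leader-to-replica forwarding still happens). A minimal sketch against
SolrJ 4.x; the ZooKeeper addresses and collection name are placeholder
assumptions:

import org.apache.solr.client.solrj.impl.CloudSolrServer;

public class CloudIndexerSetup {
    public static void main(String[] args) throws Exception {
        // Reads cluster state from ZooKeeper and sends each update
        // directly to the correct shard leader, skipping the extra hop.
        CloudSolrServer server =
            new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
        server.setDefaultCollection("collection1");
        // ... server.add(batchOfDocs); server.commit(); ...
        server.shutdown();
    }
}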
Regarding batch indexing:
When I send batches of 1000 docs to a standalone Solr server, the log file
reports "(1000 adds)" in LogUpdateProcessor. But when I send them to the
leader of a replicated index, the leader log file reports much smaller
numbers, usually "(12 adds)". Why do the batches appea
"Why do you want to use CSV in SolrJ?" Alexandre are you looking for a
design gig. This kind of question really begs nothing but disdain.
Commodity search exists, no matter what Paul Nelson writes, and part of
that problem is due to advanced users always rewriting the reqs and specs
of less experi
I think I'm getting the idea now. You either use the response writer via an
HTTP call, or you write your own exporter. Thanks to everyone for their
input.
Sorry to say this, but I don't think the numDocs/maxDoc numbers
are telling you anything, because it looks like you've optimized,
which purges any data associated with deleted docs, including
the internal IDs that the numDocs/maxDocs figures reflect. So if there
were deletions, we can't see any evi
In addition to Alexandre's comment, your index chain looks suspect:
So the pattern replace stuff happens on the grams, not the full input. You might
be better off with a
solr.PatternReplaceCharFilterFactory
which works on the entire input string before tokenization is even done.
Th
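For illustration, a field type along these lines applies the replacement to
the whole string before the tokenizer ever sees it (a sketch only; the
pattern, tokenizer, and filters here are placeholder assumptions, not the
poster's actual chain):

<fieldType name="text_pattern" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- charFilter: runs on the entire input string, before tokenization -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="\s*/.*$" replacement=""/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>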
What version of Solr/Lucene? There have been some instances of index
corruption that might account for it; see the lucene/CHANGES.txt file.
This is something of a stab in the dark
though.
Because this is troubling...
Best,
Erick
On Fri, Oct 31, 2014 at 7:57 AM, ku3ia wrote:
> Hi, Erick. Thanks
: Sure thing, but how do I get the results output in CSV format?
: response.getResults() is a list of SolrDocuments.
Either use something like the NoOpResponseParser, which will give you the
entire response back as a single string, or implement your own
ResponseParser along the lines of...
publ
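A minimal sketch of the first option with SolrJ 4.x; the core URL, query,
and row count are placeholder assumptions:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.impl.NoOpResponseParser;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;

public class CsvViaSolrJ {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("q", "*:*");
        params.set("wt", "csv");   // let the server render CSV
        params.set("rows", 1000);

        QueryRequest request = new QueryRequest(params);
        // Skip SolrJ's parsing and get the raw response body as one string.
        request.setResponseParser(new NoOpResponseParser("csv"));

        NamedList<Object> result = server.request(request);
        String csv = (String) result.get("response");
        System.out.print(csv);

        server.shutdown();
    }
}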
Why do you want to use CSV in SolrJ? You would just have to parse it again.
You could just trigger that as a URL call from outside with cURL, or as
a plain HTTP (not SolrJ) call from a Java client.
Regards,
Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter:
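A minimal sketch of that approach, streaming the CSV response straight to a
file from plain Java (the URL, field list, and row count are placeholder
assumptions):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.URL;

public class CsvViaHttp {
    public static void main(String[] args) throws Exception {
        // Plain HTTP call; Solr renders the CSV server-side via wt=csv.
        String url = "http://localhost:8983/solr/collection1/select"
                   + "?q=*:*&wt=csv&fl=id,title&rows=1000000";

        try (BufferedReader in = new BufferedReader(
                 new InputStreamReader(new URL(url).openStream(), "UTF-8"));
             PrintWriter out = new PrintWriter("export.csv", "UTF-8")) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(line);   // write the CSV to disk line by line
            }
        }
    }
}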
I have run some more tests so the numbers have changed a bit.
Index Results done on Node 1:
Indexing completed. Added/Updated: 903,993 documents. Deleted 0 documents.
(Duration: 31m 47s)
Requests: 1 (0/s), Fetched: 903,993 (474/s), Skipped: 0, Processed: 903,993
Node 1:
Last Modified: 44 minutes
Sure thing, but how do I get the results output in CSV format?
response.getResults() is a list of SolrDocuments.
copyField can copy only part of the string, but the cutoff is defined by
character count. If you want to use regular expressions, you may be
better off doing the copy in the UpdateRequestProcessor chain instead:
http://www.solr-start.com/info/update-request-processors/#RegexReplaceProcessorFactory
What you
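For illustration, such a chain in solrconfig.xml might look roughly like the
following; the field names (title, title_only) and the pattern are
placeholder assumptions based on the question below, and the chain still has
to be referenced from your update handler (e.g. via the update.chain
parameter):

<updateRequestProcessorChain name="copy-title">
  <!-- copy the raw title into title_only -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">title</str>
    <str name="dest">title_only</str>
  </processor>
  <!-- then strip everything from the " / " separator onward -->
  <processor class="solr.RegexReplaceProcessorFactory">
    <str name="fieldName">title_only</str>
    <str name="pattern">\s*/.*$</str>
    <str name="replacement"></str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>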
So I have a title field that commonly looks like this:
Personal legal forms simplified : the ultimate guide to personal legal forms
/ Daniel Sitarz.
I made a copyField of type "title_only". I want to copy ONLY the
text "Personal legal forms simplified : the ultimate guide to personal l
When you fire a query against Solr with wt=csv, the response coming from
Solr is *already* in CSV; the CSVResponseWriter is responsible for translating
SolrDocument instances into CSV on the server side, so I don't see any
reason to use it yourself. Solr already does the heavy lifting
Hi, Erick. Thanks for your response.
I checked my index with the CheckIndex utility, and here's what I got:
3 of 41: name=_1ouwn docCount=518333
codec=Lucene46
compound=false
numFiles=11
size (MB)=431.564
diagnostics = {timestamp=1412166850391, os=Linux,
os.version=3.2.0-68-generic, mergeFactor
OK, that is puzzling.
bq: If there were duplicates only one of the duplicates should be
removed and I still should be able to search for the ID and find one
correct?
Correct.
Your bad request error is puzzling, you may be on to something there.
What it looks like is that somehow some of the docu
I am trying to invoke the CSVResponseWriter to create a CSV file of all
stored fields. There are millions of documents so I need to write to the
file iteratively. I saw a snippet of code online that claimed it could
effectively remove the SolrDocumentList wrapper and allow the docs to be
retrieved i
NP, just making sure.
I suspect you'll get lots more bang for the buck, and
results much more closely matching your expectations if
1> you batch up a bunch of docs at once rather than
sending them one at a time. That's probably the easiest
thing to try. Sending docs one at a time is something of
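A minimal sketch of point 1>, buffering documents and sending them in
batches with SolrJ 4.x (the URL, batch size, and fields are placeholder
assumptions):

import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BatchIndexer {
    private static final int BATCH_SIZE = 1000;

    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        List<SolrInputDocument> buffer = new ArrayList<SolrInputDocument>();

        for (int i = 0; i < 100000; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", Integer.toString(i));
            doc.addField("title", "document " + i);
            buffer.add(doc);

            if (buffer.size() >= BATCH_SIZE) {
                server.add(buffer);   // one HTTP request per 1,000 docs
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) {
            server.add(buffer);       // flush the remainder
        }
        server.commit();
        server.shutdown();
    }
}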
Hi Erick:
All of the records are coming out of an auto-numbered field, so the IDs will
all be unique.
Here is the test I ran this morning:
Indexing completed. Added/Updated: 903,993 documents. Deleted 0 documents.
(Duration: 28m)
Requests: 1 (0/s), Fetched: 903,993 (538/s), Skipped: 0, Pro
Your message looks like it's missing stuff (snapshots?), the
e-mail for this list generally strips attachments, so you'll
have to put them somewhere else and link to them if you
want us to see them.
Best,
Erick
On Fri, Oct 31, 2014 at 5:11 AM, 5ton3 wrote:
> Hi!
>
> Not sure if this is a problem
Not quite sure what you mean by "destroy". I can
use a delete-by-query with *:* and mark all docs in
my index deleted. Search results will return nothing
but it's still a valid index, it just consists of all deleted
docs. All the segments may be removed even in the
absence of an optimize due to seg
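For reference, the delete-by-query described above is a one-liner from SolrJ
(a sketch; the core URL is a placeholder):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class DeleteAll {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        server.deleteByQuery("*:*");  // marks every doc deleted; index stays valid
        server.commit();
        server.shutdown();
    }
}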
I started this collection using this command:
http://localhost:8983/solr/admin/collections?action=CREATE&name=inventory&numShards=1&replicationFactor=2&maxShardsPerNode=4
So 1 shard and replicationFactor of 2
AJ
-----Original Message-----
From: S.L [mailto:simpleliving...@gmail.com]
Sent: Thur
Hi Erick -
Thanks for the detailed response and apologies for my confusing
terminology. I should have said "WPS" (writes per second) instead of QPS
but I didn't want to introduce a weird new acronym since QPS is well
known. Clearly a bad decision on my part. To clarify: I am doing
*only* writes
Hi!
Not sure if this is a problem or if I just don't understand the debug
response, but it seems somewhat odd to me.
The "main" entity can have multiple BLOB documents. I'm using Tika Entity
Processor to retrieve the body (plaintext) from these documents and put the
result in a multivalued field,
Hi folks!
I'm interested in: can a delete operation destroy a Solr index if the
optimize command is never performed?
Thanks Chris
With Regards
Aman Tandon
On Fri, Oct 31, 2014 at 5:45 AM, Chris Hostetter
wrote:
>
> : I was just trying to index the fields returned by my MySQL and I found
> this
>
> If you are importing dates from MySQL where you have 0000-00-00T00:00:00Z
> as the default value, you should actually
Oh yes, I want to display the stored data in an HTML file. I have 2 pages: on
one page there is a form, and I show the results there.
The result is a link (by ID) to the file with the whole conversation on the
second page. And what did you mean by separating each conversation
interaction? Thanks.
Thanks for your help.
OK, I'll try to explain it once more; sorry for my English.
I need some functionality in my searching.
1.) I will naturally have a lot of documents, and I want to find out if, for
example, a phrase occurs with its words up to 5 words apart. I used w:"Good
morning"~5. (In the example Solr it works, but