Hi,
How do people usually update Solr configuration files from a continuous
integration environment like TeamCity or Jenkins?
We have multiple development and testing environments and use WebDeploy- and
AwsDeploy-type tools to remotely deploy code multiple times a day. To
update Solr I wrote a si
It looks like Consul solves a different problem than ZooKeeper. Consul manages
what servers are up and starts new ones as needed. ZooKeeper doesn't start
servers, but does leader election when they fail.
I don't see any way that Consul could replace ZooKeeper, but it could solve
another part of
Not that I know of, but look before you leap. I took a quick look at
Consul and it really doesn't look like any kind of drop-in replacement.
Also, the ZooKeeper usage in SolrCloud isn't really pluggable
AFAIK, so there'll be lots of places in the Solr code that need to be
reworked, etc., especially
I am investigating a project to make SolrCloud run on Consul instead of
ZooKeeper. So far, my research revealed no such efforts, but I wanted to check
with this list to make sure I am not going to be reinventing the wheel. Has
anyone attempted using Consul instead of ZK to coordinate SolrCloud
@Will:
I can't tell you how many times questions like
"Why do you want to use CSV in SolrJ?" have
led to solutions different from what the original
question might imply. It's a question I frequently
ask in almost the exact same way; it's a
perfectly legitimate question IMO.
Best,
Erick
On Fri
No, but it is a reasonable request, as a global default, a
collection-specific default, a request-specific default, and on an
individual fuzzy term.
-- Jack Krupansky
-----Original Message-----
From: elisabeth benoit
Sent: Thursday, October 30, 2014 6:07 AM
To: solr-user@lucene.apache.org
Su
October 2014, Apache Solr™ 4.10.2 available
The Lucene PMC is pleased to announce the release of Apache Solr 4.10.2
Solr is the popular, blazing fast, open source NoSQL search platform
from the Apache Lucene project. Its major features include powerful
full-text search, hit highlighting, faceted
: "Why do you want to use CSV in SolrJ?" Alexandre are you looking for a
It's a legitimate question - part of providing good community support is
making sure we understand *why* users are asking how to do something, so
we can give good advice on other solutions people might not even have
thought of.
On 31 October 2014 14:58, will martin wrote:
> "Why do you want to use CSV in SolrJ?" Alexandre are you looking for a
> design gig. This kind of question really begs nothing but disdain.
Nope. Not looking for a design gig. I give that advice away for free:
http://www.airpair.com/solr/workshops/d
Erick Erickson wrote
> What version of Solr/Lucene?
At first it was Lucene/Solr 4.6, but later it was changed to Lucene/Solr
4.8. Later still, the _root_ field and child-document support were added to
the schema. A full data re-index was not done after each change. But not so
long ago I ran an optimize to
Yes, I was inadvertently sending them to a replica. When I sent them to the
leader, the leader reported (1000 adds) and the replica reported only 1 add
per document. So, it looks like the leader forwards the batched jobs
individually to the replicas.
On Fri, Oct 31, 2014 at 3:26 PM, Erick Erickson
Internally, the docs are batched up into smaller buckets (10 as I
remember) and forwarded to the correct shard leader. I suspect that's
what you're seeing.
Erick
On Fri, Oct 31, 2014 at 12:20 PM, Peter Keegan wrote:
> Regarding batch indexing:
> When I send batches of 1000 docs to a standalone S
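One way to cut down on that forwarding is to send updates through
CloudSolrServer, which reads the cluster state from ZooKeeper and routes
documents to the correct shard leader directly from the client
(leader-to-replica forwarding still happens). A minimal sketch against
SolrJ 4.x; the ZooKeeper addresses and collection name are placeholder
assumptions:

import org.apache.solr.client.solrj.impl.CloudSolrServer;

public class CloudIndexerSetup {
    public static void main(String[] args) throws Exception {
        // Reads cluster state from ZooKeeper and sends each update
        // directly to the correct shard leader, skipping the extra hop.
        CloudSolrServer server =
            new CloudSolrServer("zk1:2181,zk2:2181,zk3:2181");
        server.setDefaultCollection("collection1");
        // ... server.add(batchOfDocs); server.commit(); ...
        server.shutdown();
    }
}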
Regarding batch indexing:
When I send batches of 1000 docs to a standalone Solr server, the log file
reports "(1000 adds)" in LogUpdateProcessor. But when I send them to the
leader of a replicated index, the leader log file reports much smaller
numbers, usually "(12 adds)". Why do the batches appea
"Why do you want to use CSV in SolrJ?" Alexandre are you looking for a
design gig. This kind of question really begs nothing but disdain.
Commodity search exists, no matter what Paul Nelson writes, and part of
that problem is due to advanced users always rewriting the reqs and specs
of less experi
I think I'm getting the idea now. You either use the response writer via an
HTTP call, or you write your own exporter. Thanks to everyone for their
input.
Sorry to say this, but I don't think the numDocs/maxDoc numbers
are telling you anything, because it looks like you've optimized,
which purges any data associated with deleted docs, including
the internal IDs that the numDocs/maxDocs figures reflect. So if there
were deletions, we can't see any evi
In addition to Alexandre's comment, your index chain looks suspect:
So the pattern replace stuff happens on the grams, not the full input. You might
be better off with a
solr.PatternReplaceCharFilterFactory
which works on the entire input string before tokenization is even done.
Th
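For illustration, a field type along these lines applies the replacement to
the whole string before the tokenizer ever sees it (a sketch only; the
pattern, tokenizer, and filters here are placeholder assumptions, not the
poster's actual chain):

<fieldType name="text_pattern" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <!-- charFilter: runs on the entire input string, before tokenization -->
    <charFilter class="solr.PatternReplaceCharFilterFactory"
                pattern="\s*/.*$" replacement=""/>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>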
What version of Solr/Lucene? There have been some instances of index
corruption that might account for it; see the lucene/CHANGES.txt file.
This is something of a stab in the dark
though.
Because this is troubling...
Best,
Erick
On Fri, Oct 31, 2014 at 7:57 AM, ku3ia wrote:
> Hi, Erick. Thanks
: Sure thing, but how do I get the results output in CSV format?
: response.getResults() is a list of SolrDocuments.
Either use something like the NoOpResponseParser, which will give you the
entire response back as a single string, or implement your own
ResponseParser along the lines of...
publ
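A minimal sketch of the first option with SolrJ 4.x; the core URL, query,
and row count are placeholder assumptions:

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.client.solrj.impl.NoOpResponseParser;
import org.apache.solr.client.solrj.request.QueryRequest;
import org.apache.solr.common.params.ModifiableSolrParams;
import org.apache.solr.common.util.NamedList;

public class CsvViaSolrJ {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");

        ModifiableSolrParams params = new ModifiableSolrParams();
        params.set("q", "*:*");
        params.set("wt", "csv");   // let the server render CSV
        params.set("rows", 1000);

        QueryRequest request = new QueryRequest(params);
        // Skip SolrJ's parsing and get the raw response body as one string.
        request.setResponseParser(new NoOpResponseParser("csv"));

        NamedList<Object> result = server.request(request);
        String csv = (String) result.get("response");
        System.out.print(csv);

        server.shutdown();
    }
}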
Why do you want to use CSV in SolrJ? You would just have to parse it again.
You could just trigger that as a URL call from outside with cURL, or as
a plain HTTP (not SolrJ) call from a Java client.
Regards,
Alex.
Personal: http://www.outerthoughts.com/ and @arafalov
Solr resources and newsletter:
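A minimal sketch of that approach, streaming the CSV response straight to a
file from plain Java (the URL, field list, and row count are placeholder
assumptions):

import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.URL;

public class CsvViaHttp {
    public static void main(String[] args) throws Exception {
        // Plain HTTP call; Solr renders the CSV server-side via wt=csv.
        String url = "http://localhost:8983/solr/collection1/select"
                   + "?q=*:*&wt=csv&fl=id,title&rows=1000000";

        try (BufferedReader in = new BufferedReader(
                 new InputStreamReader(new URL(url).openStream(), "UTF-8"));
             PrintWriter out = new PrintWriter("export.csv", "UTF-8")) {
            String line;
            while ((line = in.readLine()) != null) {
                out.println(line);   // write the CSV to disk line by line
            }
        }
    }
}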
I have run some more tests so the numbers have changed a bit.
Index Results done on Node 1:
Indexing completed. Added/Updated: 903,993 documents. Deleted 0 documents.
(Duration: 31m 47s)
Requests: 1 (0/s), Fetched: 903,993 (474/s), Skipped: 0, Processed: 903,993
Node 1:
Last Modified: 44 minutes
Sure thing, but how do I get the results output in CSV format?
response.getResults() is a list of SolrDocuments.
copyField can copy only part of the string, but the cutoff is defined by
character count. If you want to use regular expressions, you may be
better off doing the copy in the UpdateRequestProcessor chain instead:
http://www.solr-start.com/info/update-request-processors/#RegexReplaceProcessorFactory
What you
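For illustration, such a chain in solrconfig.xml might look roughly like the
following; the field names (title, title_only) and the pattern are
placeholder assumptions based on the question below, and the chain still has
to be referenced from your update handler (e.g. via the update.chain
parameter):

<updateRequestProcessorChain name="copy-title">
  <!-- copy the raw title into title_only -->
  <processor class="solr.CloneFieldUpdateProcessorFactory">
    <str name="source">title</str>
    <str name="dest">title_only</str>
  </processor>
  <!-- then strip everything from the " / " separator onward -->
  <processor class="solr.RegexReplaceProcessorFactory">
    <str name="fieldName">title_only</str>
    <str name="pattern">\s*/.*$</str>
    <str name="replacement"></str>
  </processor>
  <processor class="solr.LogUpdateProcessorFactory"/>
  <processor class="solr.RunUpdateProcessorFactory"/>
</updateRequestProcessorChain>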
So I have a title field that commonly looks like this:
Personal legal forms simplified : the ultimate guide to personal legal forms
/ Daniel Sitarz.
I made a copyField of type "title_only". I want to copy ONLY the
text "Personal legal forms simplified : the ultimate guide to personal l
When you fire a query against Solr with wt=csv, the response coming from
Solr is *already* in CSV; the CSVResponseWriter is responsible for translating
SolrDocument instances into CSV on the server side, so I don't see any
reason to use it yourself. Solr already does the heavy lifting
Hi, Erick. Thanks for your response.
I checked my index with the CheckIndex utility, and here's what I got:
3 of 41: name=_1ouwn docCount=518333
codec=Lucene46
compound=false
numFiles=11
size (MB)=431.564
diagnostics = {timestamp=1412166850391, os=Linux,
os.version=3.2.0-68-generic, mergeFactor
OK, that is puzzling.
bq: If there were duplicates only one of the duplicates should be
removed and I still should be able to search for the ID and find one
correct?
Correct.
Your bad request error is puzzling, you may be on to something there.
What it looks like is that somehow some of the docu
I am trying to invoke the CSVResponseWriter to create a CSV file of all
stored fields. There are millions of documents so I need to write to the
file iteratively. I saw a snippet of code online that claimed it could
effectively remove the SolrDocumentList wrapper and allow the docs to be
retrieved i
NP, just making sure.
I suspect you'll get lots more bang for the buck, and
results much more closely matching your expectations if
1> you batch up a bunch of docs at once rather than
sending them one at a time. That's probably the easiest
thing to try. Sending docs one at a time is something of
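A minimal sketch of point 1>, buffering documents and sending them in
batches with SolrJ 4.x (the URL, batch size, and fields are placeholder
assumptions):

import java.util.ArrayList;
import java.util.List;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BatchIndexer {
    private static final int BATCH_SIZE = 1000;

    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        List<SolrInputDocument> buffer = new ArrayList<SolrInputDocument>();

        for (int i = 0; i < 100000; i++) {
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", Integer.toString(i));
            doc.addField("title", "document " + i);
            buffer.add(doc);

            if (buffer.size() >= BATCH_SIZE) {
                server.add(buffer);   // one HTTP request per 1,000 docs
                buffer.clear();
            }
        }
        if (!buffer.isEmpty()) {
            server.add(buffer);       // flush the remainder
        }
        server.commit();
        server.shutdown();
    }
}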
Hi Erick:
All of the records are coming out of an auto-numbered field, so the IDs will
all be unique.
Here is the test I ran this morning:
Indexing completed. Added/Updated: 903,993 documents. Deleted 0 documents.
(Duration: 28m)
Requests: 1 (0/s), Fetched: 903,993 (538/s), Skipped: 0, Pro
Your message looks like it's missing stuff (snapshots?), the
e-mail for this list generally strips attachments, so you'll
have to put them somewhere else and link to them if you
want us to see them.
Best,
Erick
On Fri, Oct 31, 2014 at 5:11 AM, 5ton3 wrote:
> Hi!
>
> Not sure if this is a problem
Not quite sure what you mean by "destroy". I can
use a delete-by-query with *:* and mark all docs in
my index deleted. Search results will return nothing
but it's still a valid index, it just consists of all deleted
docs. All the segments may be removed even in the
absence of an optimize due to seg
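For reference, the delete-by-query described above is a one-liner from SolrJ
(a sketch; the core URL is a placeholder):

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;

public class DeleteAll {
    public static void main(String[] args) throws Exception {
        SolrServer server = new HttpSolrServer("http://localhost:8983/solr/collection1");
        server.deleteByQuery("*:*");  // marks every doc deleted; index stays valid
        server.commit();
        server.shutdown();
    }
}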
I started this collection using this command:
http://localhost:8983/solr/admin/collections?action=CREATE&name=inventory&numShards=1&replicationFactor=2&maxShardsPerNode=4
So 1 shard and replicationFactor of 2
AJ
-----Original Message-----
From: S.L [mailto:simpleliving...@gmail.com]
Sent: Thur
Hi Erick -
Thanks for the detailed response and apologies for my confusing
terminology. I should have said "WPS" (writes per second) instead of QPS
but I didn't want to introduce a weird new acronym since QPS is well
known. Clearly a bad decision on my part. To clarify: I am doing
*only* writes
Hi!
Not sure if this is a problem or if I just don't understand the debug
response, but it seems somewhat odd to me.
The "main" entity can have multiple BLOB documents. I'm using Tika Entity
Processor to retrieve the body (plaintext) from these documents and put the
result in a multivalued field,
Hi folks!
I'm interested in: can a delete operation destroy a Solr index if the
optimize command is never performed?
Thanks Chris
With Regards
Aman Tandon
On Fri, Oct 31, 2014 at 5:45 AM, Chris Hostetter
wrote:
>
> : I was just trying to index the fields returned by my MySQL and I found
> this
>
> If you are importing dates from MySQL where you have 0000-00-00T00:00:00Z
> as the default value, you should actually
Oh yes, I want to display the stored data in an HTML file. I have 2 pages: on
one page there is a form, and I show the results there.
The result is a link (by ID) to the file with the whole conversation on the
second page. And what did you mean by separating each conversation
interaction? Thanks.
Thanks for your help.
OK, I'll try to explain it once more; sorry for my English.
I need some functionality in my searching.
1.) I will naturally have a lot of documents, and I want to find out if, for
example, a phrase occurs with its words up to 5 words apart. I used w:"Good
morning"~5. (In the example Solr it works, but