Re: Poll: SolrCloud vs. Master-Slave usage

2013-02-28 Thread Amit Nithian
Erick, Well put and thanks for the clarification. One question: "And if you need NRT, you just can't get it with traditional M/S setups." ==> Can you explain how that works with SolrCloud? I agree with what you said too because there was an article or discussion I read that said having high-avail

Re: index storage on other server

2013-02-28 Thread Chris Hostetter
: I need to store the index folder on other server. : On local system I have less space so I want to application server(Tomcat) : in local machine but Index folder or Index can be stored on other machine. That doesn't really make sense. the entire point of a system like solr is that it maintains

Re: What makes an Analyzer/Tokenizer/CharFilter/etc suitable for Solr?

2013-02-28 Thread Jack Krupansky
The package Javadoc for Solr analysis is a good start: http://lucene.apache.org/solr/4_1_0/solr-core/org/apache/solr/analysis/package-tree.html Especially the AbstractAnalysisFactory: http://lucene.apache.org/core/4_1_0/analyzers-common/org/apache/lucene/analysis/util/AbstractAnalysisFactory.ht

What makes an Analyzer/Tokenizer/CharFilter/etc suitable for Solr?

2013-02-28 Thread Alexandre Rafalovitch
Hello, I want to have a unified reference of all different processors one could use in Solr in various extension points. I have written a small tool to extract all implementations of UpdateRequestProcessorFactory, Analyzer, CharFilterFactory, etc (actually of any root class). However, I assume n

Re: A few operations questions about the tlog (UpdateLog)

2013-02-28 Thread Mark Miller
To add: Current best practice is to do a hard commit with openSearcher=false every minute or so. AutoCommit is great for this. It shouldn't affect your overall indexing performance and it will constrain the transaction log. - Mark On Feb 28, 2013, at 8:35 PM, Erick Erickson wrote: > A new tl

Re: Role of zookeeper at runtime

2013-02-28 Thread Mark Miller
Yup - nothing about it will be automatic or easy - multi dc is not really a current feature. I'm just saying it's a fast way to move the data. If you setup the same cluster on each side though, the appropriate stuff will be in ZooKeeper. - Mark On Feb 28, 2013, at 9:04 PM, varun srivastava wr

Re: filter query on multi-Valued field

2013-02-28 Thread Deepak Parmar
Jack, thanks for your suggestions. I made "initials" a tokenized field and now I'm getting the expected result. On Thu, Feb 28, 2013 at 10:03 AM, Jack Krupansky wrote: > First, what is your unique key field? If it is "id", then only one of > these documents will be stored since they have the sam

Re: Poll: SolrCloud vs. Master-Slave usage

2013-02-28 Thread Erick Erickson
Amit: It's a balancing act. If I was starting fresh, even with one shard, I'd probably use SolrCloud rather than deal with the issues around the "how do I recover if my master goes down" question. Additionally, SolrCloud allows one to monitor the health of the entire system by monitoring the state

Re: Repartition solr cloud

2013-02-28 Thread Erick Erickson
In the works, high priority: https://issues.apache.org/jira/browse/SOLR-3755 Best Erick On Thu, Feb 28, 2013 at 8:13 PM, Vaillancourt, Tim wrote: > Sort of off-topic, is there a way to do the reverse, ie: split indexes? > > This could be useful for people that would like to move to sharding fro

Re: Custom filter for document permissions

2013-02-28 Thread Erick Erickson
You might get some mileage out of encoding what you can in the documents and doing a standard fq clause on that part, and then have your post-filter do the really wild stuff. But you're right, you have to be prepared for the nightmare scenario of your sysadmin who has rights to see everything firin

Re: Role of zookeeper at runtime

2013-02-28 Thread varun srivastava
"You can replicate from a SolrCloud node still. Just hit it's replication handler and pass in the master url to replicate to" How will this work ? lets say s1dc1 is master of s1dc2 , s2dc1 is master for s2dc2 .. so after hitting replicate index binary will get copied but then how appropriate entri

Re: Sort by currency field using OpenExchangeRatesOrgProvider

2013-02-28 Thread Chris Hostetter
: I have as part of my schema one currency field. ... : IT works properly when i filter or do a query using an specific currency. ... : But when I try to sort by this field it returns the results by amount and it : doesn't take in consideration the currency. ... : Do you kn

Re: solrCloud insert data steps

2013-02-28 Thread Erick Erickson
bq: I want to improve performance and want to become a committer if possible Super! In the words of the great wise one (Yonik), you become a committer by acting like a committer. There are lots of options: 1> pick some JIRAs that are interesting to you. Ping the dev list to see if there's active

Re: Solr3.5 Vs Solr4.1 - Help please

2013-02-28 Thread Erick Erickson
First, did the index you moved get built with 3.x? In that case the fields won't be compressed, and things will be slower due to 4.1 having to go through the back-compat layer. If you did do this, try optimizing since that'll re-write the index in 4.1 format (I'm pretty sure at least). Second, fl _do

Re: A few operations questions about the tlog (UpdateLog)

2013-02-28 Thread Erick Erickson
A new tlog gets created and the current one is closed on a hard commit (openSearcher=true or false doesn't matter). The old one will be kept around for a bit, I suspect if you'd done it a third time the first one would go away. For all I know, the code might read if (tlog docs > 100) and you may be

Re: MultiValued Search using Filter Query

2013-02-28 Thread Erick Erickson
Well, first are you using that syntax? Because it's not correct. q=sym:TEST fq=initials:MKN [] is the "range" operator. Second, you are using a "string" type for initials, which isn't analyzed in any way so you'd have to search on exactly MKN JRT. MKN wouldn't match. JRT wouldn't match. MKN JRT

Re: Poll: SolrCloud vs. Master-Slave usage

2013-02-28 Thread Amit Nithian
I don't know a ton about SolrCloud but for our setup and my limited understanding of it is that you start to bleed operational and non-operational aspects together which I am not comfortable doing (i.e. software load balancing). Also adding ZooKeeper to the mix is yet another thing to install, setu

Re: update fails if one doc is wrong

2013-02-28 Thread Erick Erickson
This has been hanging around for a long time. I did some preliminary work here: https://issues.apache.org/jira/browse/SOLR-445 but moved on to other things before committing it. The discussion there might be useful. FWIW, Erick On Wed, Feb 27, 2013 at 5:32 AM, Mikhail Khludnev < mkhlud...@griddy

RE: Repartition solr cloud

2013-02-28 Thread Vaillancourt, Tim
Sort of off-topic, is there a way to do the reverse, ie: split indexes? This could be useful for people that would like to move to sharding from one core and could be interesting under SolrCloud. Cheers, Tim -Original Message- From: Mark Miller [mailto:markrmil...@gmail.com] Sent: Fri

Re: Custom filter for document permissions

2013-02-28 Thread Chris Hostetter
: Actually, after thinking for a bit, it makes sense to apply the post : filter everywhere, otherwise I wouldn't be able to know the number of : results overall (something I unfortunately really need). Not to mention things like facet counts, which need access to the full set of matching documen

Re: Solr 4.1 Solr Cloud Shard Structure

2013-02-28 Thread Mark Miller
On Feb 28, 2013, at 7:55 PM, Walter Underwood wrote: > 100 shards on a node will almost certainly be slow I think it depends on some things - with one of the largest of those things being your hardware. Many have found that you can get much better performance out of super concurrent, beefy ha

Re: can we configure spellcheck to be invoked after request processing?

2013-02-28 Thread Jack Krupansky
You can execute search components in any order you want, but their execution will be unconditional, so you would be best off to invoke spellcheck in your application layer. In other words, a second call to Solr, but only if no results were found. -- Jack Krupansky -Original Message-
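
The "second call, but only if no results were found" idea can be sketched in the application layer roughly as follows. This is a minimal sketch, not a definitive client: `fetch` is a hypothetical callable (not part of any Solr client library) that sends a parameter dict to a search handler and returns the parsed JSON response, and it assumes that handler has the spellcheck component configured so that `spellcheck=true` has an effect.

```python
def search_with_spellcheck_fallback(fetch, query):
    """Run the plain query first; ask for spellcheck only when nothing matched.

    `fetch` is a hypothetical callable: it takes a dict of request
    parameters and returns the parsed JSON response as a dict.
    """
    first = fetch({"q": query, "wt": "json"})
    if first["response"]["numFound"] > 0:
        # Results were found, so no second request is made.
        return first
    # No results: repeat the request with spellchecking switched on.
    return fetch({"q": query, "wt": "json", "spellcheck": "true"})
```

Passing the transport in as a function keeps the decision logic independent of whichever HTTP client the application already uses.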

Re: Solr 4.1 Solr Cloud Shard Structure

2013-02-28 Thread Walter Underwood
100 shards on a node will almost certainly be slow, but at least it would be scalable. 7TB of data on one node is going to be slow regardless of how you shard it. I might choose a number with more useful divisors than 100, perhaps 96 or 144. wunder On Feb 28, 2013, at 4:25 PM, Mark Miller wrot
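
The remark about "more useful divisors" can be checked with a few lines of arithmetic. The sketch below simply lists the divisors of each candidate shard count; reading more divisors as "more ways to spread the shards evenly across nodes" is an assumption about the intent, not something stated in the message.

```python
def divisors(n):
    """Return every positive integer that divides n evenly."""
    return [d for d in range(1, n + 1) if n % d == 0]

# Compare the shard counts mentioned in the message.
for shards in (100, 96, 144):
    print(shards, len(divisors(shards)), divisors(shards))
```

By this count 100 has 9 divisors, 96 has 12, and 144 has 15.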

Re: Solr Case-sensitivity issue with search field name

2013-02-28 Thread Walter Underwood
Lower case is safer than upper case. For unicode, uppercasing is a lossy conversion. There are sets of different lower case characters that convert to the same upper case character. When you convert back to lower case, you don't know which one it was originally. Always use lower case for text.
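
The lossy round trip is easy to see with a few characters. This is a small illustration using Python's built-in Unicode case mapping; the three sample characters (sharp s, long s, dotless i) are chosen here as examples and are not from the original message.

```python
def survives_round_trip(text):
    """True if upper-casing and then lower-casing gives back the same text."""
    return text.upper().lower() == text

# Each of these upper-cases to the same result as a different lower-case string.
for ch, plain in (("ß", "ss"), ("ſ", "s"), ("ı", "i")):
    same_upper = ch.upper() == plain.upper()
    print(repr(ch), repr(ch.upper()), same_upper, survives_round_trip(ch))
```

In each case the upper-case form is shared with a plain ASCII spelling, so converting back to lower case cannot recover the original character.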

Re: Problems with Solr 3.6 and Magento

2013-02-28 Thread Chris Hostetter
: I noticed that Magento is using the overwritePending commit directive but I : can't find any documentation on this. Does the overwritePending directive : purge any added docs since the last commit? Any help would be appreciated. overwritePending has never been a commit option. it was an "add"

Re: Role of zookeeper at runtime

2013-02-28 Thread Mark Miller
On Feb 28, 2013, at 6:20 PM, varun srivastava wrote: > So we need way of indexing 1 dc > and then somehow quickly propagate the index binary to others. You can replicate from a SolrCloud node still. Just hit its replication handler and pass in the master url to replicate to. It doesn't have a

Re: Problems with documents that are added not showing up in index Solr 3.5

2013-02-28 Thread Chris Hostetter
: but at a certain add : all documents after that add will not exist in the index : what settings could affect this behavior? : I just need somewhere to start looking : could it be the merge policy? Is anything else logged arround the time of this special "add" ? what are the numDocs & maxDoc va

Re: Solr 4.1 Solr Cloud Shard Structure

2013-02-28 Thread Mark Miller
You will pay some in performance, but it's certainly not bad practice. It's a good choice for setting up so that you can scale later. You just have to do some testing to make sure it fits your requirements. The Collections API even has built in support for this - you can specify more shards than

Re: Role of zookeeper at runtime

2013-02-28 Thread Shawn Heisey
On 2/28/2013 4:20 PM, varun srivastava wrote: We have 10 virtual data centres. It's set up like this because we do rolling updates. While the 1st dc is getting indexed, the other 9 serve traffic. Indexing one dc takes 2 hours. With a single shard we used to index one dc and then quickly replicate inde

Re: Solr Case-sensitivity issue with search field name

2013-02-28 Thread Shawn Heisey
On 2/28/2013 3:40 PM, hyrax wrote: I'm using Solr 4.0 and I recently noticed an issue that bothers me a lot: if you define a field in your schema named 'HOST', then in the query you have to specify this field as 'HOST', while if you use 'host' it throws an 'undefined field' error.

Re: Role of zookeeper at runtime

2013-02-28 Thread varun srivastava
Any thoughts on this? We have 10 virtual data centres. It's set up like this because we do rolling updates. While the 1st dc is getting indexed, the other 9 serve traffic. Indexing one dc takes 2 hours. With a single shard we used to index one dc and then quickly replicate the index into other dcs by havin

Re: Timestamp field is changed on update

2013-02-28 Thread Isaac Hebsh
Hoss Man suggested a wonderful solution for this need: Always set update="add" on the field you want to keep (if it exists), and use FirstFieldValueUpdateProcessorFactory in the update chain, after DistributedUpdateProcessorFactory (so the AtomicUpdate will add the existing field first, if it exists).

Solr Case-sensitivity issue with search field name

2013-02-28 Thread hyrax
Hi guys, I'm using Solr 4.0 and I recently noticed an issue that bothers me a lot: if you define a field in your schema named 'HOST', then in the query you have to specify this field as 'HOST', while if you use 'host' it throws an 'undefined field' error. I have done some googling

What am I doing wrong - writing an OpenNLP Filter

2013-02-28 Thread vybe3142
Since the official OpenNLP filter is not yet in an actual release, I'm experimenting with the OpenNLP filter implementation described in chapter 8 of the Taming Text Book http://www.manning.com/ingersoll/Sample-ch08.pdf . The original code is at : https://github.com/tamingtext/book/tree/master/src

RE: Get page number of searchresult of a pdf in solr

2013-02-28 Thread Swati Swoboda
You can get the paragraph of the search result via highlights. You'd have to mark your field as stored (re-indexing required) and then specify it in the highlighting parameters. http://wiki.apache.org/solr/HighlightingParameters#hl As for getting the page number, I am not sure if there is more

Re: Solr4.1 Loggin Level Screen just shows root

2013-02-28 Thread adityab
The logging screen seems to be broken on Solr 4.1. Any ideas? -- View this message in context: http://lucene.472066.n3.nabble.com/Solr4-1-Loggin-Level-Screen-just-shows-root-tp4042556p4043787.html Sent from the Solr - User

Re: Solr3.5 Vs Solr4.1 - Help please

2013-02-28 Thread adityab
thanks Shawn, I did try with specifying a fixed set of fl and with no score none gave any better performance. We have a different VM with same index and Solr4.1 on Jboss 5.1 which does perfectly fine with all the queries. So this is confusing us a bit more. Have our VM expert to look now hopeful

Re: Get page number of searchresult of a pdf in solr

2013-02-28 Thread Michael Della Bitta
My guess is the best way to do this is to index each page separately and to store a link to the PDF/page in each document. That would probably require you to preprocess the PDFs to turn each one into a single page per PDF, or to extract the text per page another way. Michael Della Bitta

Re: Problems with documents that are added not showing up in index Solr 3.5

2013-02-28 Thread Mark Miller
On Feb 28, 2013, at 3:22 PM, Shawn Heisey wrote: > To the experts: Does an empty update request with commit=true work for this It should work fine. - Mark

Re: Solr cloud deployment on tomcat in prod

2013-02-28 Thread varun srivastava
Great .. I will do it and send you all for review. Thanks Varun On Thu, Feb 28, 2013 at 4:50 AM, Erick Erickson wrote: > Anyone can edit the Wiki, contributions welcome! > > Best > Erick > > > On Mon, Feb 25, 2013 at 5:50 PM, varun srivastava >wrote: > > > Hi, > > Is there any official documen

Get page number of searchresult of a pdf in solr

2013-02-28 Thread dev
Hello, I'm building a web application where users can search for pdf documents and view them with pdf.js. I would like to display the search results with a short snippet of the paragraph where the search term was found and a link to open the document at the right page. So what I need is

Re: Problems with documents that are added not showing up in index Solr 3.5

2013-02-28 Thread Shawn Heisey
On 2/28/2013 11:07 AM, dboychuck wrote: Yes I confirmed in the logs. I have also committed manually several times using the updatehandler /update?commit=true To the experts: Does an empty update request with commit=true work for this, or would the user have to send the actual commit command?

Re: Solr3.5 Vs Solr4.1 - Help please

2013-02-28 Thread Shawn Heisey
On 2/28/2013 5:48 AM, adityab wrote: Well i did another test. Copied the Index files from perf lab to Dev machine which has Solr4.1 Now ran solrmeter to generate load on Dev server. We were able to drive the QPS upto 150 with CPU on avg 35%. but the same index is generating 100% CPU at 1 QPS in p

Re: Role of zookeeper at runtime

2013-02-28 Thread varun srivastava
How can I set up cloud master-slave? Can you point me to any sample config or tutorial which describes the steps to get Solr cloud into a master-slave setup? As you know from my previous mails, I don't need active solr replicas, I just need a mechanism to copy a given solr cloud index to a new inst

Re: Syntax for sorting by a sub-query

2013-02-28 Thread Chris Hostetter
: > &sort=query({qf=worked_companies_im:61}) desc : : which yields : : > sort param could not be parsed as a query, and is not a field that : exists in the index: query({qf=worked_companies_im:61}) because it can't be parsed as a query (and is not a field) Try a simple request like... htt

Syntax for sorting by a sub-query

2013-02-28 Thread Edward Rudd
I'm using solr 4.0 in a project and I need to sort the results based on whether they match another filter. e.g. I have a "worked_companies" multi-integer field that contains the list of company ids some person has worked with before. I have a series of other fq= filters to narrow down the lis

Re: Zookeeper Error When Trying to Setup SolrCloud on Weblogic

2013-02-28 Thread Mishra, Shikhar
I was able to move past the error by trying out the fix proposed here: https://gist.github.com/barkbay/4153107 It feels strange to catch RuntimeException, though. On 2/27/13 2:09 PM, "Mishra, Shikhar" wrote: >Hi, > >I'm trying to setup Solr Cloud on Weblogic 12c. I've started Zookeeper in >a

Re: Solr 3.6.1 Query large field

2013-02-28 Thread Chris Hostetter
: Subject: Solr 3.6.1 Query large field : In-Reply-To: https://people.apache.org/~hossman/#threadhijack Thread Hijacking on Mailing Lists When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if you change the subject

Re: Problems with documents that are added not showing up in index Solr 3.5

2013-02-28 Thread dboychuck
numdocs is wrong and they will not show up when I search by uniqueid On Thu, Feb 28, 2013 at 10:45 AM, Upayavira [via Lucene] < ml-node+s472066n4043724...@n3.nabble.com> wrote: > What do you mean by 'will not show up'? Is numdocs wrong? They don't > show in queries? > > Upayavira > > On Thu, Feb

Re: Problems with documents that are added not showing up in index Solr 3.5

2013-02-28 Thread Upayavira
What do you mean by 'will not show up'? Is numdocs wrong? They don't show in queries? Upayavira On Thu, Feb 28, 2013, at 06:07 PM, dboychuck wrote: > Yes I confirmed in the logs. I have also committed manually several times > using the updatehandler /update?commit=true > > > > -- > View this m

Re: Problems with documents that are added not showing up in index Solr 3.5

2013-02-28 Thread dboychuck
Yes I confirmed in the logs. I have also committed manually several times using the updatehandler /update?commit=true -- View this message in context: http://lucene.472066.n3.nabble.com/Problems-with-documents-that-are-added-not-showing-up-in-index-Solr-3-5-tp4043539p4043716.html Sent from the

Re: Custom filter for document permissions

2013-02-28 Thread Colin Hebert
Actually, after thinking for a bit, it makes sense to apply the post filter everywhere, otherwise I wouldn't be able to know the number of results overall (something I unfortunately really need). Anyways, thank you Timothy Colin Hebert On 28 February 2013 17:38, Colin Hebert wrote: > I know tha

Re: Custom filter for document permissions

2013-02-28 Thread Colin Hebert
I know that the query selects everything, this is why I made this request to test my solution. If a user makes a query with a very large amount of results with paging, I expected the post filter to be executed only when necessary (as it can be expensive). Colin On 28 February 2013 17:25, Timothy

Re: Custom filter for document permissions

2013-02-28 Thread Timothy Potter
Hi Colin, Your query is *:* so that is every document. Try a query that only matches a small subset and see if you get different results. Cheers, Tim On Thu, Feb 28, 2013 at 8:17 AM, Colin Hebert wrote: > Thank you Timothy, > > With the indication you gave me (and the help of this article > htt

Sort by currency field using OpenExchangeRatesOrgProvider

2013-02-28 Thread marotosg
Hi, I have as part of my schema one currency field. http://myurwithjsonfile"/> It works properly when I filter or do a query using a specific currency. Let's say I want to filter by USD price_c:[10.00,USD TO 100.00,USD]. It returns documents where the currency is not in USD and it makes the con

Re: Search in String and Text_en fields simultaneously with edismax

2013-02-28 Thread Jack Krupansky
The analyzer/query generator for a tokenized field will in fact tokenize the value in quotes, but it will generate a "phrase query" to assure that the list of terms occur as a phrase in the index. -- Jack Krupansky -Original Message- From: Burgmans, Tom Sent: Thursday, February 28, 2

Re: SolrCloud as my primary data store

2013-02-28 Thread Michael Sokolov
On 02/21/2013 12:02 AM, jimtronic wrote: Now that I've been running Solr Cloud for a couple months and gotten comfortable with it, I think it's time to revisit this subject. ... I'd really like to hear from someone who has made the leap. Cheers, Jim We use Solr as our primary storage

RE: Search in String and Text_en fields simultaneously with edismax

2013-02-28 Thread Burgmans, Tom
Ah OK. I didn't have a good view of query parsing vs query generation. Thanks for clearing this up. So it means that searching in a tokenized and non-tokenized field simultaneously is not possible when I want - the expression parsed as phrase for the non-tokenized field - the expression parsed a

Re: Solr 3.6 - Out Of Memory Exception

2013-02-28 Thread Jack Krupansky
As a general guide: use the following process. 1. Set your JVM heap to a fairly large size. 2. Load Solr. 3. Do a bunch of common queries that cover the range of what production will see. Be sure to use the most expensive operations you expect, such as facets and filters, and all of the fields

Re: Search in String and Text_en fields simultaneously with edismax

2013-02-28 Thread Jack Krupansky
Query text is always "tokenized" (more properly, "parsed"), unless the text is enclosed in quotes or spaces are escaped with backslash. Try: q=valueadd:"test . test2" or q=valueadd:test\ .\ test2 Parentheses simply provide grouping, either to control boolean operator evaluation order or to a
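
When either form is sent over HTTP, the quotes, backslashes and spaces also need URL encoding. The following is a minimal sketch using only the Python standard library; the helper name `solr_query_string` is made up for illustration, and the field name `valueadd` is taken from the thread.

```python
from urllib.parse import urlencode

def solr_query_string(q):
    """Return the URL-encoded 'q=...' fragment for a raw Solr query."""
    return urlencode({"q": q})

# Quoted form: the whole value is kept together as one phrase.
phrase = 'valueadd:"test . test2"'
# Escaped form: each space is protected by a backslash.
escaped = "valueadd:test\\ .\\ test2"

print(solr_query_string(phrase))   # q=valueadd%3A%22test+.+test2%22
print(solr_query_string(escaped))  # q=valueadd%3Atest%5C+.%5C+test2
```

Building the query string with a library call rather than by hand avoids mixing up parameter separators and encoded characters.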

Re: geodist() spatial sorting: sort param could not be parsed as a query, and is not a field that exists in the index: geodist()

2013-02-28 Thread PeterKerk
You were right, sloppy on my side. I replaced the %20 with & (in more than 1 place) and now it does work. Thanks. -- View this message in context: http://lucene.472066.n3.nabble.com/geodist-spatial-sorting-sort-param-could-not-be-parsed-as-a-query-and-is-not-a-field-that-exists-in--tp4043603p40

Re: Solr 3.6 - Out Of Memory Exception

2013-02-28 Thread Manivannan Selvadurai
Hi, thanks for the quick reply. Total memory in the server is around 7.5 G. Even though there are around 61075834 docs the index size is around 44G. I tried changing the directoryFactory to MMapDirectory; it didn't help. Previously we used Lucene to query for term vectors using TermVectorMapper, W

Search in String and Text_en fields simultaneously with edismax

2013-02-28 Thread Burgmans, Tom
I have a field "valueadd" of type String and field "body" of type text_en (with tokenization and linguistic processing). When I search with edismax against field valueadd like this: q=valueadd:(test . test2) I see that the parsed query is (valueadd:test valueadd:. valueadd:test2)~3 Why not (valu

Re: geodist() spatial sorting: sort param could not be parsed as a query, and is not a field that exists in the index: geodist()

2013-02-28 Thread David Smiley (@MITRE.org)
Strange. The code in Solr that has that error string passes an additional exception that will have its own error message that is more detailed, and you'll see that in the stack trace in the Solr logs; perhaps in your error response too but I'm not sure. If you remove the sorting, are the searc

Re: solr 4.1 - trying to create 2 collection with 2 different sets of configurations

2013-02-28 Thread Anirudha Jadhav
1. Empty Zookeeper 2. Empty index directories for solr 3. Empty solr.xml 3.1 Upload / link cfg in zookeeper for test collection 4. Start 4 solr servers on different machines 5. Access server: I see that's ok 6. CREATE collection http://hostname:15000/solr/admin/collections?

Re: Solr 3.6 - Out Of Memory Exception

2013-02-28 Thread Jan Høydahl
How much memory on the server in total? For such a large index you should leave PLENTY of free memory for the OS to cache your index efficiently. A quick thing to try is to upgrade to Solr4.1, as the index size itself will shrink dramatically and you will get better utilization of whatever memory

Re: solr 4.1 - trying to create 2 collection with 2 different sets of configurations

2013-02-28 Thread Mark Miller
On Feb 28, 2013, at 9:41 AM, Shankar Sundararaju wrote: > did having the config name same as collection name allow you to > create collections without having to link the corresponding config names? I > did not try this myself. It should work that way or there is a bug. - Mark

Re: query builder for solr UI?

2013-02-28 Thread Jan Høydahl
Again - what problems did you face when attempting this with the eDismax parser? Are you saying you are unhappy with the way eDisMax interprets -foo as NOT foo? A dash on its own - is treated like a dash. Your JavaScript code would anyway need to handle URL encoding properly so that a query input

Re: Custom filter for document permissions

2013-02-28 Thread Colin Hebert
Thank you Timothy, With the indication you gave me (and the help of this article http://searchhub.org/2012/02/22/custom-security-filtering-in-solr/ ) I managed to draft my own filter, but it seems that it doesn't work quite as I expected. Here is what I've done so far: https://github.com/ColinHeb

Solr 3.6 - Out Of Memory Exception

2013-02-28 Thread Manivannan Selvadurai
Hi, I'm using Solr 3.6 on Tomcat 6, Xmx is set to 4096m. I have indexed about 61075834 documents using shingle filter with max shingle size 3. Basically I have a lot of terms. Whenever I request 3-4 queries at a time to get the termvector component, I get the following exception. SE

Re: filter query on multi-Valued field

2013-02-28 Thread Jack Krupansky
First, what is your unique key field? If it is "id", then only one of these documents will be stored since they have the same id values. Please provide the exact request URL so we can see exactly what the q and fq parameters look like. Your fq looks malformed, but it's hard to say for sure wit

Re: query builder for solr UI?

2013-02-28 Thread eShard
Good question, if the user types in special characters like the dash - How will I know to treat it like a dash or the NOT operator? The first one will need to be URL encoded the second one won't be resulting in very different queries. So I apologize for not being more clear, so really what I'm af

Re: solr 4.1 - trying to create 2 collection with 2 different sets of configurations

2013-02-28 Thread Shankar Sundararaju
You may also have to link the config name with collection name in the zookeeper. Here's the command to do it: cloud-scripts/zkcli.sh -cmd linkconfig -zkhost localhost:9983 -collection COLLECTION_1_NAME -confname CONF_1_NAME Rafal, did having the config name same as collection name allow you to cr

Re: Role of zookeeper at runtime

2013-02-28 Thread Mark Miller
On Feb 26, 2013, at 6:49 PM, varun srivastava wrote: > So does it mean while doing "document add" the state of cluster is fetched > from zookeeper and then depending upon hash of docid the target shard is > decided ? We keep the zookeeper info cached locally. We only update it when ZooKeeper

Re: query builder for solr UI?

2013-02-28 Thread Jan Høydahl
Hi, Have you tried edismax across your original (not text copyfield) fields? If no, try it. If yes, which of your expectations did it not satisfy? Why would you want to "build" a query yourself, when Solr's queryParser is made to do just that for you from the input query string? -- Jan Høydahl,

Re: query builder for solr UI?

2013-02-28 Thread eShard
sorry, The easiest way to describe it is specifically we desire a "google-like" experience. so if the end user types in a phrase or quotes or +, - (for and, not) etc etc. the UI will be flexible enough to build the correct solr query syntax. How will edismax help? And I tried simplifying queries

Re: solrCloud insert data steps

2013-02-28 Thread Erick Erickson
Why do you want to know? General issues or are you having a specific problem? But here's the flow as I understand it. Let's say a leader receives a doc - if the doc is for this shard, forward the doc to all replicas and collect the results before responding - if the doc is for a different shar

Re: Role of zookeeper at runtime

2013-02-28 Thread Erick Erickson
To update at least one node must be up for each shard, otherwise updates fail. Solr replication works fine in 4.x, in fact it's used to synchronize when bulk updates happen (say you bring up a new node). The transaction logs are only used to store at least 100 currently documents for synchronizing

Re: solr/admin java.io.IOException: A file, file system or message queue is no longer available.

2013-02-28 Thread Erick Erickson
Very strange. I'm assuming you've re-started Solr after the directories were removed? I'd expect to see a message in the log (WARNING) indicating the index dirs were being created. Otherwise, permissions errors can manifest themselves in tricky ways. not much help... Erick On Tue, Feb 26, 2

Re: Stored values and date math query

2013-02-28 Thread Erick Erickson
Just to check, your order_prep_time is _indexed_ too, right? It's a bit confusing but anything you use in your function queries will be from indexed terms, not stored ones Best Erick On Tue, Feb 26, 2013 at 4:05 PM, Indika Tantrigoda wrote: > Hi All, > > I am trying to use a stored value in the

Re: Consistent relevance tie-breaking across clusters?

2013-02-28 Thread Erick Erickson
bq: we don't want to use either the primary key or the record's update date as the tie-breaker, as it may introduce an new bias into the ranking algorithm Are you thinking of adding something to your main clause to force this? If so, why not just use sorting by adding a sort clause like: &sort=sc

Re: POI error while extracting docx document

2013-02-28 Thread Erick Erickson
I'd guess you have old Tika jars in your classpath. Best Erick On Tue, Feb 26, 2013 at 12:40 PM, Carlos Alexandro Becker < caarl...@gmail.com> wrote: > sorry: > > http://stackoverflow.com/questions/15095202/extracting-docx-files-with-tika-in-apache-solr-gives-nosuchmethod-error > > > On Tue, Fe

Re: AW: 170G index, 1.5 billion documents, out of memory on query

2013-02-28 Thread Erick Erickson
Personally I've never seen any single node support 1.5B documents. I advise biting the bullet and sharding. Even if you do get the simple keyword search working, the first time you sort I expect it to blow up. Then you'll try to facet and it'll blow up. Then you'll start using filter queries and it

Re: solr 4.1 - trying to create 2 collection with 2 different sets of configurations

2013-02-28 Thread Rafał Kuć
Hello! Yes I did :) -- Regards, Rafał Kuć Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch - ElasticSearch > Thanks > I'm going to try this. > Have you tried it yourself? > -- > View this message in context: > http://lucene.472066.n3.nabble.com/solr-4-1-trying-to-create-2-collec

Re: A few operations questions about the tlog (UpdateLog)

2013-02-28 Thread adityab
What's the life cycle of a tlog file? Is it purged after commit (even with soft commit)? I posted 100 docs to solr (standalone) and did a hard commit. Observed a new tlog file is created. Re-posted the same 100 docs and did a hard commit. Observed a new tlog file is created. Old one still exists. When d

Re: solr 4.1 - trying to create 2 collection with 2 different sets of configurations

2013-02-28 Thread adfel70
Thanks I'm going to try this. Have you tried it yourself? -- View this message in context: http://lucene.472066.n3.nabble.com/solr-4-1-trying-to-create-2-collection-with-2-different-sets-of-configurations-tp4043609p4043617.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr cloud deployment on tomcat in prod

2013-02-28 Thread Erick Erickson
Anyone can edit the Wiki, contributions welcome! Best Erick On Mon, Feb 25, 2013 at 5:50 PM, varun srivastava wrote: > Hi, > Is there any official documentation around deployment of solr cloud in > production on tomcat ? > > I am looking for anything as detailed as following one .. It will be

Re: Solr3.5 Vs Solr4.1 - Help please

2013-02-28 Thread adityab
Thanks guys. Well, I did another test. Copied the index files from the perf lab to a Dev machine which has Solr 4.1. Then ran solrmeter to generate load on the Dev server. We were able to drive the QPS up to 150 with CPU at 35% on average, but the same index is generating 100% CPU at 1 QPS in the perf lab. On a side n

Re: A few operations questions about the tlog (UpdateLog)

2013-02-28 Thread Erick Erickson
Sure, putting the tlog on a different disk can reduce disk I/O contention. Of course, you don't need to bother unless you can demonstrate that your Solr app is I/O bound. If it's not, you won't see much benefit. Don't know about the compression. Note that tlogs are only guaranteed (at present) t

Re: solr 4.1 - trying to create 2 collection with 2 different sets of configurations

2013-02-28 Thread Rafał Kuć
Hello! You can try doing the following: 1. Run Solr with no collection and no cores, just an empty solr.xml 2. If you don't have a ZooKeeper, run Solr with -DzkRun 3. Upload your configurations to ZooKeeper by running cloud-scripts/zkcli.sh -cmd upconfig -zkhost localhost:9983 -confdir CONFIGUR
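
The steps above could be sketched roughly as follows in Python. This is only an illustration under stated assumptions: the message shows `-cmd upconfig`, `-zkhost` and `-confdir`, while the `-confname` argument, the config and collection names, the shard count, and the Solr base URL used here are hypothetical and not taken from the thread.

```python
from urllib.parse import urlencode

ZKCLI = "cloud-scripts/zkcli.sh"  # script path as quoted in the message


def upconfig_command(zk_host, conf_dir, conf_name):
    """Build the argument list that uploads one config directory to ZooKeeper."""
    return [ZKCLI, "-cmd", "upconfig", "-zkhost", zk_host,
            "-confdir", conf_dir, "-confname", conf_name]


def create_collection_url(solr_base, name, conf_name, num_shards=1):
    """Build a Collections API CREATE URL that names the config set to use."""
    params = urlencode({
        "action": "CREATE",
        "name": name,
        "numShards": num_shards,
        "collection.configName": conf_name,
    })
    return f"{solr_base}/admin/collections?{params}"


if __name__ == "__main__":
    zk = "localhost:9983"
    # Hypothetical directories, config names and collection names.
    setups = [("conf_a", "confA", "collectionA"),
              ("conf_b", "confB", "collectionB")]
    for conf_dir, conf_name, collection in setups:
        print(" ".join(upconfig_command(zk, conf_dir, conf_name)))
        print(create_collection_url("http://localhost:8983/solr",
                                    collection, conf_name))
```

The sketch only prints the command and the URL for each collection. Running them against a real cluster is left to the reader, since the original message is cut off before the remaining steps.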

Re: Backtick character in field values, and results

2013-02-28 Thread Erick Erickson
ICUFoldingFilterFactory is "folding" the backtick (grave accent). See admin/analysis page, it's a lifesaver in these situations! Best Erick On Fri, Feb 22, 2013 at 3:46 PM, Neelesh wrote: > With a text_unbroken field > omitTermFreqAndPositions="true"> "solr.KeywordTokenizerFactory" /> "so

Re: solr4.1.0 how to config field length

2013-02-28 Thread Erick Erickson
I'd try using Solr in isolation first; you have a couple of other products in there and you could be having this issue anywhere along the chain. What is your evidence that you're not getting the limit you expect? Best would be to provide a test case illustrating the problem, it should be pretty easy

solr 4.1 - trying to create 2 collection with 2 different sets of configurations

2013-02-28 Thread adfel70
solr 4.1 - trying to create 2 collections with 2 different sets of configurations. Has anyone accomplished this? If I run bootstrap twice on different conf dirs, I get both of them in ZooKeeper, but using the collections API to create a collection with collection.configName=seconfConf doesn't work. Any idea?

Re: Solr3.5 Vs Solr4.1 - Help please

2013-02-28 Thread Michael Della Bitta
We're noticing a performance delta between 3.6 and 4.1 too. We're transitioning some textual fields from indexed only to stored as well. One of the reasons why we're testing 4.1 in this particular case is more efficient storage use, which would eliminate some iowait behavior we were noticing. Tha

geodist() spatial sorting: sort param could not be parsed as a query, and is not a field that exists in the index: geodist()

2013-02-28 Thread PeterKerk
I want to sort the results of my query on distance. But I get this error: sort param could not be parsed as a query, and is not a field that exists in the index: geodist() On this query: http://localhost:8983/solr/tt/select/?indent=on&facet=true&fq=countryid:1&fq={!geofilt}&pt=51.8425,5.85278%20
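
For comparison, here is a minimal Python sketch of how a distance-sorted request is commonly assembled, with the point (`pt`), spatial field (`sfield`) and radius (`d`) passed as top-level parameters so that both `{!geofilt}` and `geodist()` can read them. The field name `geolocation` and the 20 km radius are hypothetical, since the original URL is truncated. This is not offered as a diagnosis of the error above.

```python
from urllib.parse import urlencode


def geodist_sort_params(lat, lon, sfield, radius_km):
    """Return request parameters for a geofiltered, distance-sorted query."""
    return [
        ("q", "*:*"),
        ("fq", "countryid:1"),      # filter taken from the posted query
        ("fq", "{!geofilt}"),       # spatial filter, reads sfield/pt/d
        ("sfield", sfield),         # hypothetical location field name
        ("pt", f"{lat},{lon}"),
        ("d", radius_km),           # assumed radius in kilometres
        ("sort", "geodist() asc"),  # nearest results first
    ]


def build_url(base, params):
    """Append URL-encoded parameters to the select handler URL."""
    return f"{base}?{urlencode(params)}"


if __name__ == "__main__":
    params = geodist_sort_params(51.8425, 5.85278, "geolocation", 20)
    print(build_url("http://localhost:8983/solr/tt/select/", params))
```

Building the parameters as a list of pairs keeps the repeated `fq` entries intact and lets `urlencode` handle the escaping of `{!geofilt}` and the space in the sort clause.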

Re: index storage on other server

2013-02-28 Thread Gora Mohanty
On 28 February 2013 14:55, Jonty Rhods wrote: > Hi All, > > I need to store the index folder on other server. > On local system I have less space so I want to application server(Tomcat) > in local machine but Index folder or Index can be stored on other machine. > > Is it possible? Don't think so

Re: Dropping slow queries

2013-02-28 Thread adm1n
Thanks, but it's not exactly what I need. According to the documentation "this value *only* applies to the search and *not to requests* in general." What I need is to affect the request - to drop it. To tell the cloud to drop all requests which take more than x msec. No matter why - slow search

Re: Solr 3.6.1 Query large field

2013-02-28 Thread Otis Gospodnetic
Mark, Look at http://svn.apache.org/repos/asf/lucene/dev/trunk/solr/example/solr/collection1/conf/solrconfig.xml: Otis -- Solr & ElasticSearch Support http://sematext.com/ On Wed, Feb 27, 2013 at 11:08 AM, Mark Wilson wrote: > Hi > > I am using Nutch to crawl a site, and post it i

Re: Poll: Largest SolrCloud out there?

2013-02-28 Thread Otis Gospodnetic
I'd love to know, too. What we observed at Sematext was that 4.0 SolrCloud was very, very buggy and difficult, so I suspect there aren't many big Solr 4.0 based clusters out there. 4.1 is much better (thanks Mark & Co.) and I'm looking forward to 4.2 in March. Also, based on the stats we have access t