Re: SOLR4 cluster - strange CPU spike on slave

2012-11-28 Thread John Nielsen
Yup you read it right. We originally intended to do all our indexing to varnish02, replicate to varnish01 and then search from varnish01 (through a fail-over ip which would switch the reader to varnish02 in case of trouble). When I saw the spikes, I tried to eliminate possibilities by starting se

Re: multiple filter query with seperate result sets (in one call)

2012-11-28 Thread ninaddesai82
Thanks, Erick, for replying. Well, I am actually trying to build an autosuggestion; however, the functionality I need is a little bit tricky. So, just to give you an idea - I have certain generic attributes (say category, city, etc.). When the user types, I want autosuggest to populate, but while doing that I want

Re: Odd casting error in embedded Jetty container

2012-11-28 Thread Mark Bennett
OK Alex. BTW, I did go through your presentation, and even emailed it to some cohorts of mine. On Tue, Nov 27, 2012 at 5:58 PM, Alexandre Rafalovitch wrote: > Are you sure that NamedSPILoader respects -verbose flag? I would still > check the filesystem access using something that does not lie (

Re: SOLR4 cluster - strange CPU spike on slave

2012-11-28 Thread Otis Gospodnetic
If this is caused by index segment merging you should be able to see that very clearly on the Index report in SPM, where you would see sudden graph movement at the time of spike and corresponding to CPU and disk activity. I think uncommenting that infostream in solrconfig would also show it. Otis

Re: Excluding caching of queryresult

2012-11-28 Thread Erick Erickson
setting cache to false is, as far as I know, only possible on filter queries. Do note that the queryResultCache is very cheap. All it is is a map where the key is the query and the value is int[windowsize] where windowsize is the value in solrconfig.xml (queryResultWindowSize). It's primarily used
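
A minimal sketch of what disabling caching on a filter query can look like from a client, assuming the `{!cache=false}` local-params prefix on `fq`. The host, handler path, and field names are hypothetical and only there for illustration.

```python
from urllib.parse import urlencode


def uncached_fq(filter_query):
    """Prefix a filter query with the cache=false local param."""
    return "{!cache=false}" + filter_query


def build_select_url(base_url, q, filters):
    """Build a /select URL whose filter queries all bypass the filterCache."""
    params = [("q", q)] + [("fq", uncached_fq(f)) for f in filters]
    return base_url + "/select?" + urlencode(params)


# Hypothetical host and field names, for illustration only.
url = build_select_url("http://localhost:8983/solr", "*:*", ["category:wild"])
print(url)
```

This only affects the filter queries; consistent with the reply above, the main query still goes through the queryResultCache.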

Re: Downloading files from the solr replication Handler

2012-11-28 Thread Erick Erickson
How are you downloading them? I suspect the issue is with the download process rather than Solr, but I'm just guessing. Best Erick On Wed, Nov 28, 2012 at 12:19 PM, Eva Lacy wrote: > Just to add to that, I'm using solr 3.6.1 > > > On Wed, Nov 28, 2012 at 5:18 PM, Eva Lacy wrote: > > > I downl

Re: multiple filter query with seperate result sets (in one call)

2012-11-28 Thread Erick Erickson
I don't think you can. What's the use-case you're really looking at? You could consider grouping/field collapsing, which sounds related, see: http://wiki.apache.org/solr/FieldCollapsing Best, Erick On Wed, Nov 28, 2012 at 8:35 AM, ninaddesai82 wrote: > Hi, > > I want to have multiple search at

Re: SOLR4 cluster - strange CPU spike on slave

2012-11-28 Thread Erick Erickson
Am I reading this right? All you're doing on varnish1 is replicating to it? You're not searching or indexing? I'm sure I'm misreading this. "The spike, which only lasts for a couple of minutes, sends the disks racing" This _sounds_ suspiciously like segment merging, especially the "disks raci

Re: predefined variables usable in schema.xml ?

2012-11-28 Thread T. Kuro Kurosaka
Thank you, Hoss. I found this SolrWiki page talks about pre-defined properties such as solr.core.instanceDir: http://wiki.apache.org/solr/CoreAdmin I tried to use ${solr.core.instanceDir} in the default single-core schema.xml, and it didn't work. Is this page wrong, or these properties are av

Re: Search differences between solr 1.4.0 and 3.6.1

2012-11-28 Thread Jack Krupansky
One change was to change the default for autoGeneratePhraseQueries from true to false. That means that now RoC would match Ro OR C rather than "Ro C" (phrase). Simply add autoGeneratePhraseQueries=true to your field type - no need to re-index. -- Jack Krupansky -Original Message- F

programmatically get dataDir setting from solrconfig.xml

2012-11-28 Thread Jie Sun
I am trying to get the value of 'dataDir' that was set in solrconfig.xml. Other than querying Solr with http://[host]:8080/solr/default/admin/file/?contentType=text/xml;charset=utf-8&file=solrconfig.xml, parsing the dataDir element using some XML parser, and then resolving all possible environment vari
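
The parse-and-resolve approach described in this question could be sketched roughly as follows. This is an illustration only: it assumes the `${name:default}` placeholder style used in solrconfig.xml, the `properties` dict is a hypothetical stand-in for whatever system properties the Solr JVM actually sees, and the sample XML is made up.

```python
import re
import xml.etree.ElementTree as ET

# Matches ${name} or ${name:default}.
_PLACEHOLDER = re.compile(r"\$\{([^}:]+)(?::([^}]*))?\}")


def resolve_placeholders(text, properties):
    """Replace ${name:default} placeholders using the given properties."""
    def substitute(match):
        name, default = match.group(1), match.group(2)
        if name in properties:
            return properties[name]
        if default is not None:
            return default
        raise KeyError("no value or default for property %r" % name)
    return _PLACEHOLDER.sub(substitute, text)


def read_data_dir(solrconfig_xml, properties):
    """Return the resolved <dataDir> value, or None if it is not set."""
    root = ET.fromstring(solrconfig_xml)
    element = root.find("dataDir")
    if element is None or element.text is None:
        return None
    return resolve_placeholders(element.text.strip(), properties)


# Hypothetical config fragment, for illustration only.
SAMPLE = """<config>
  <dataDir>${solr.data.dir:/var/solr/data}</dataDir>
</config>"""

print(read_data_dir(SAMPLE, {}))
print(read_data_dir(SAMPLE, {"solr.data.dir": "/mnt/index"}))
```

A client-side resolution like this can disagree with the running server if the client does not know the server's actual property values, which is the weakness the question points at.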

Excluding caching of queryresult

2012-11-28 Thread richardg
We are aware of adding cache=false to our queries but everything we see seems to reference filterCache. We weren't sure if this parameter would work the same way with the queryResultsCache. Here is an example of a query we don't want to cache(NOTE: I'm just the administrator, I'm not so familiar

Re: Solr4 - new characters requiring backslash escape?

2012-11-28 Thread Chris Hostetter
: Here's the query showing the apostrophe problem, pulled from our search logs: : q=( (MEXICO DAY OF THE DEAD CELEBRATION 'TRADITIONS OF LIFE AND DAY')) : : This is the error msg I can see in my browser when I send that to Solr : 4.1-SNAPSHOT from 2012/11/26. I am not doing any testing with t

Re: Connecting MySQL to Solr 4.0

2012-11-28 Thread Shawn Heisey
On 11/28/2012 12:03 PM, Joseph C. Trubisz wrote: Anybody have the problem of attempting to use MySQL J/Connector on Solr 4.0 and getting the Standard Java error: ClassNotFoundException: Unable to load com.mysql.jdbc.Driver. Problem is that when I go to the admin screen, the driver (which IS th

Connecting MySQL to Solr 4.0

2012-11-28 Thread Joseph C. Trubisz
Anybody have the problem of attempting to use MySQL J/Connector on Solr 4.0 and getting the Standard Java error: ClassNotFoundException: Unable to load com.mysql.jdbc.Driver. Problem is that when I go to the admin screen, the driver (which IS there, I did check) fails to show. In fact, using th

Re: Access document scores from within a search component?

2012-11-28 Thread Per Fredelius
Thanks! I found out maybe an hour ago. Spent way too long looking for that. Tried to trace where it was put into the response all the way up into TopFieldCollector.java. That didn't make me any wiser though. :) / Per 2012/11/28 Chris Hostetter > > : to retrieve the documents pointed to by the d

Re: Access document scores from within a search component?

2012-11-28 Thread Chris Hostetter
: to retrieve the documents pointed to by the doc ids that I can find in : responsebuilder.getResults().docList. This works well for retrieving fields : but not surprisingly, since I'm accessing the index directly, it : doesn't expose the scoring. If you have a DocList where hasScores() returns tr

Re: Indexing performance with solrj vs. direct lucene API

2012-11-28 Thread Mark Miller
One difference is that Solr will call update rather than add by default. If you are willing to ensure unique ids, you can specify overwrite=false (I think that's the one) and it will use add instead. - Mark On Wed, Nov 28, 2012 at 1:02 PM, Robert Stewart wrote: > I have a project where I am port

Indexing performance with solrj vs. direct lucene API

2012-11-28 Thread Robert Stewart
I have a project where I am porting an existing application from direct Lucene API usage to using SOLR and the SOLRJ client API. The problem I have is that indexing is 2-5x slower using SOLRJ+SOLR than using the direct Lucene API. I am creating batches of documents between 200 and 500 documents per call to

Re: SolrCloud and exernal file fields

2012-11-28 Thread Mikhail Khludnev
Mark, your comment is quite valuable. Let me mention the keyword so I can find it later: NoOpDistributingUpdateProcessorFactory. Thanks! On Wed, Nov 28, 2012 at 5:56 PM, Mark Miller wrote: > Keep in mind that the distrib update proc will be auto inserted into > chains! You have to includ

Re: Does SolrCloud support distributed IDFs?

2012-11-28 Thread Sandeep Mestry
Dear All, Can anyone suggest how long it will take to get SOLR-1632 patch into Solr 4? Also, it'd be good if someone has used any alternate method like Ultraseek XPA Java library to calculate the distributed ranking? Many Thanks, Sandeep On 22 October 2012 13:23, Sascha SZOTT wrote: > Hi Mark

Re: Solr4 - new characters requiring backslash escape?

2012-11-28 Thread Shawn Heisey
On 11/28/2012 10:16 AM, Jack Krupansky wrote: Forward slash is now reserved for regular expression terms. For the full list, see the Javadoc, here: http://lucene.apache.org/core/4_0_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Escaping_Special_Characters I don't k

RE: Search differences between solr 1.4.0 and 3.6.1

2012-11-28 Thread Frederico Azeiteiro
Also, I'm having issues with searching "RoC". It returns thousands of matches on 3.6.1 against just a few on Solr 1.4.0. Looking at the analysis I see no differences... Should I add "RoC" to protected keywords or can I tweak something in the schema to achieve exact "RoC" matches? -Mensagem origin

Re: Downloading files from the solr replication Handler

2012-11-28 Thread Eva Lacy
Just to add to that, I'm using solr 3.6.1 On Wed, Nov 28, 2012 at 5:18 PM, Eva Lacy wrote: > I downloaded some configuration and data files directly from solr in an > attempt to develop a backup solution. > I noticed there are some characters at the start and end of the file that > aren't in con

RE: Search differences between solr 1.4.0 and 3.6.1

2012-11-28 Thread Frederico Azeiteiro
Ok, I'll test that and let you know. Is there some test I can easily do to confirm that it was really a side-effect of the bug? Frederico Azeiteiro Developer -Original Message- From: Jack Krupansky [mailto:j...@basetechnology.com] Sent:

Downloading files from the solr replication Handler

2012-11-28 Thread Eva Lacy
I downloaded some configuration and data files directly from solr in an attempt to develop a backup solution. I noticed there are some characters at the start and end of the file that aren't in the configuration files; I notice the same characters at the start and end of the data files. Anyone with any

Re: Solr4 - new characters requiring backslash escape?

2012-11-28 Thread Jack Krupansky
Forward slash is now reserved for regular expression terms. For the full list, see the Javadoc, here: http://lucene.apache.org/core/4_0_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Escaping_Special_Characters I don't know of any change related to apostrophe. That may
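
As a rough client-side sketch of the escaping discussed here: the helper below backslash-escapes the characters that the Lucene 4.0 classic query parser documentation lists as special, including the newly reserved forward slash. The function name is hypothetical. The apostrophe is deliberately left alone, since it is not on that list.

```python
# Characters listed as special in the Lucene 4.0 classic query parser docs.
# "&&" and "||" are handled by escaping every "&" and "|" individually.
SPECIAL_CHARS = set('+-&|!(){}[]^"~*?:\\/')


def escape_query_chars(text):
    """Backslash-escape every query-parser special character in text."""
    return "".join("\\" + ch if ch in SPECIAL_CHARS else ch for ch in text)


print(escape_query_chars("AC/DC"))
print(escape_query_chars("(1+1):2"))
```

Escaping like this is only appropriate for user-entered terms; applying it to a whole query string would also neutralize intentional operators and field prefixes.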

Re: Solr cloud recovery, why does restarting leader need replicas?

2012-11-28 Thread Mark Miller
On Nov 28, 2012, at 11:58 AM, Mark Miller wrote: > and we don't want to lose any updates. That's probably somewhat inaccurate - in this case it's more about consistency - we only ack updates once they are on every replica. So it's not a lost updates issue, but a consistency issue. The lost

Re: Solr cloud recovery, why does restarting leader need replicas?

2012-11-28 Thread Mark Miller
This is a protective measure. When it looks like a shard is first coming up, we wait to see all the expected shards, or for a timeout, to ensure that everyone participates in the initial sync process - if all the nodes went down, we don't know what documents made it where, and we don't want to l

Solr4 - new characters requiring backslash escape?

2012-11-28 Thread Shawn Heisey
I've been putting a new Solr 4.1 deployment through extensive testing before we upgrade from 3.5. My testing has turned up two characters that used to work fine with no escaping that now give syntax errors without a preceding backslash. Those characters are forward slash and apostrophe (singl

Re: Permanently Full Old Generation...

2012-11-28 Thread Shawn Heisey
On 11/28/2012 9:06 AM, Annette Newton wrote: We are seeing strange gc behaviour after running solr cloud under quite heavy insert load for a period of time. The old generation becomes full and no amount of garbage collection will free up the memory. I have attached a memory profile, as you can

RE: Search differences between solr 1.4.0 and 3.6.1

2012-11-28 Thread Frederico Azeiteiro
Sorry, ignore the "". Somehow that text appeared when I copy/pasted the XML from IE and I did not notice, but that is not part of the schema... :) Still can't figure this thing out... -Original Message- From: Erick Erickson [

RE: Permanently Full Old Generation...

2012-11-28 Thread Annette Newton
Hi, Class Instance Count Total Size class [C 3410607 699552656 class [Lorg.apache.lucene.util.fst.FST$Arc;

Re: Permanently Full Old Generation...

2012-11-28 Thread Jack Krupansky
Have you done a Java heap dump to see what the most common objects are? -- Jack Krupansky From: Annette Newton Sent: Wednesday, November 28, 2012 11:06 AM To: solr-user@lucene.apache.org Cc: Andy Kershaw Subject: Permanently Full Old Generation... Hi, I’m hoping someone can help me with a

RE: Problem with migration to SolrAdaptersForLuceneSpatial4

2012-11-28 Thread David Smiley (@MITRE.org)
Viacheslav, Did you re-index? Clearly re-indexing is needed when changing field types. ~ David From: Viacheslav Davidovich [via Lucene] [ml-node+s472066n4022861...@n3.nabble.com] Sent: Wednesday, November 28, 2012 4:42 AM To: Smiley, David W. Subject: Re: Problem

Re: Dynamic ranking based on search term

2012-11-28 Thread Floyd Wu
Hi Upayavira, Let me explain what I need in other words. The list is the result of analyzing the log. The key-value pair list means that when the search term is java, these documents (doc1, doc2, doc5) should be boosted. For example: java, doc1,doc2,doc5 Any ideas? Thanks. 2012/11/28

Re: Replication Backup

2012-11-28 Thread Eva Lacy
There doesn't seem to be a lock file created by the snapshooter, it news up a lock file but it never obtains the lock. So there is no indication of when it is finished backing up the files. On Sun, Nov 25, 2012 at 5:32 AM, Otis Gospodnetic < otis.gospodne...@gmail.com> wrote: > Hi Eva, > > You j

Re: Total number of hits within all documents

2012-11-28 Thread Jack Krupansky
If users don't understand it anyway... just sum up termfreq(field,term) for all query terms. Who will know that it is only an approximation?! BUT... it may cause queries to be significantly slower. I mean, you COULD add a custom value source such as sumtermfreq(field1,term1,field2,term2..
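
The "sum up termfreq(field,term)" suggestion could be expressed as a function query built on the client, roughly as below. This is a sketch under assumptions: the field name `body` and the alias `hits` are hypothetical, and the terms must already be in their indexed (analyzed) form because termfreq looks up the raw term.

```python
def sum_termfreq(field, terms):
    """Build a function query summing termfreq(field,'term') over all terms."""
    calls = ["termfreq({},'{}')".format(field, term) for term in terms]
    if len(calls) == 1:
        return calls[0]
    return "sum(" + ",".join(calls) + ")"


# Hypothetical field name and alias, for illustration only.
fl = "id,hits:" + sum_termfreq("body", ["knights", "arabia"])
print(fl)
```

The resulting value is per document, so a total across the result set would still have to be added up on the client, and as noted above it is only an approximation of a "hit count".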

Re: How to search with single string in a multivalued string field

2012-11-28 Thread Jack Krupansky
Copy that string field to a text field. Use the string field when you want exact matches, and use the text field for keyword or phrase searches. -- Jack Krupansky -Original Message- From: Sangeetha Sent: Wednesday, November 28, 2012 2:39 AM To: solr-user@lucene.apache.org Subject: How

Re: Total number of hits within all documents

2012-11-28 Thread ses
Unfortunately a vague specification is all I have, due to the fact I am trying to replicate the functionality in a closed-source legacy search product. I suspect no-one at the company knows precisely how this works. The purpose is ultimately to display to the user the entire number of 'hits' found

Re: Socket Error

2012-11-28 Thread Alexandre Rafalovitch
Not a Solr issue. Your client has closed the network connection and your server threw this when trying to write to it. Usually happens if you serve long or slow content and the client closes the browser window. In your case, there might be a wrinkle "org.apache.catalina.valves. ErrorReportValve.inv

Re: SolrCloud and exernal file fields

2012-11-28 Thread Mark Miller
Keep in mind that the distrib update proc will be auto inserted into chains! You have to include a proc that disables it - see the FAQ: http://wiki.apache.org/solr/SolrCloud#FAQ - Mark On Nov 28, 2012, at 7:25 AM, Mikhail Khludnev wrote: > Martin, > Right as far node in Zookeeper Distributed

Re: positions and qf parameter in (e)dismax

2012-11-28 Thread Jack Krupansky
Edismax is considered Solr, although the same issue exists in the Lucene query parser. -- Jack Krupansky -Original Message- From: Markus Jelsma Sent: Wednesday, November 28, 2012 8:50 AM To: solr-user@lucene.apache.org Subject: RE: positions and qf parameter in (e)dismax I think i ag

RE: positions and qf parameter in (e)dismax

2012-11-28 Thread Markus Jelsma
I think I agree. Is this something that should be resolved in Solr or Lucene? Thanks -Original message- > From:Jack Krupansky > Sent: Tue 27-Nov-2012 17:47 > To: solr-user@lucene.apache.org > Subject: Re: positions and qf parameter in (e)dismax > > That is exactly the exception I woul

Re: Search differences between solr 1.4.0 and 3.6.1

2012-11-28 Thread Jack Krupansky
You need to add the generateNumberParts=1 attribute - assuming you actually want the number generated. The fact that your schema worked in 1.4 was probably simply a side effect of this bug: https://issues.apache.org/jira/browse/SOLR-1706 "wrong tokens output from WordDelimiterFilter depending

multiple filter query with seperate result sets (in one call)

2012-11-28 Thread ninaddesai82
Hi, I want to run multiple searches at a time with grouped results, i.e. if I am calling http://localhost:8983/solr/file/select?q=*&fq=category:wild&fq=health_status:good&fq=animal:lion+OR+tiger then I need two results: one for {category=wild and health_status=good and animal=lion} and second on

Re: Total number of hits within all documents

2012-11-28 Thread Jack Krupansky
Clue us in as to what you actually want to do with this number. Maybe an approximation might solve the problem as well? In other words, what degree of accuracy is actually required? Also, make sure you actually can reduce your proposed calculation to a mathematical function. As stated, it is a

Re: Search differences between solr 1.4.0 and 3.6.1

2012-11-28 Thread Erick Erickson
Well, I get the same results in 1.4 and 3.6. The only difference is I didn't put in. In both cases the 12 is missing from the query analysis but is in the index analysis, due to the catenateNumbers being 1 in one case and 0 in the oth

Re: SolrCloud and exernal file fields

2012-11-28 Thread Mikhail Khludnev
Martin, Right: as long as the node is in ZooKeeper, DistributedUpdateProcessor will broadcast commits to all peers. To hack this you can introduce a dedicated UpdateProcessorChain without DistributedUpdateProcessor and send the commit to that chain. On 28.11.2012 13:16, user "Martin Koch" wrote: > Mikhail >

Total number of hits within all documents

2012-11-28 Thread ses
I'm trying to find a way to retrieve from a Solr query the total number of hits for a query across all documents. I'm using an edismax query handler which searches across several fields (specified in the schema.xml). I have tried: /solr/my_core/keyword?q=knights of arabia&fl=ttf:totaltermfreq(htm

AW: Preferred query notation for alternative field values

2012-11-28 Thread Charra, Johannes
Thanks for the hint. You are right: Both queries are identical after parsing. >>> -Original Message- >>> From: Upayavira [mailto:u...@odoko.co.uk] >>> Sent: Wednesday, 28 November 2012 12:04 >>> To: solr-user@lucene.apache.org >>> Subject: Re: Preferred query notation for alterna

Re: SOLR4 cluster - strange CPU spike on slave

2012-11-28 Thread John Nielsen
I apologize for the late reply. The query load is more or less stable during the spikes. There are always fluctuations, but nothing on the order of magnitude that could explain this spike. In fact, the latest spike occurred last night when almost no one was using it. To test a hunch of mine,

Re: Dynamic ranking based on search term

2012-11-28 Thread Upayavira
Isn't this what Solr/Lucene are designed to do?? On indexing a document, Lucene creates an inverted index, mapping terms back to their containing documents. The data you have is already inverted. I'd suggest uninverting it and then hand it to Solr in that format, thus: doc1: java doc2: java doc4
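
The "uninverting" step suggested here, turning a term-to-documents list into per-document term lists that can then be indexed, might look like the sketch below. The `java` entry mirrors the example given earlier in the thread; the second term is hypothetical.

```python
from collections import defaultdict


def uninvert(term_to_docs):
    """Turn {term: [doc ids]} into {doc id: [terms]}."""
    doc_to_terms = defaultdict(list)
    for term, doc_ids in term_to_docs.items():
        for doc_id in doc_ids:
            doc_to_terms[doc_id].append(term)
    return dict(doc_to_terms)


# "java" follows the thread's example; "solr" is a made-up second term.
boost_list = {"java": ["doc1", "doc2", "doc5"], "solr": ["doc2"]}
for doc_id, terms in sorted(uninvert(boost_list).items()):
    print(doc_id, terms)
```

Each resulting term list could then be stored in an extra field on its document, so that a match on that field raises the document's score for the corresponding search term.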

Re: Preferred query notation for alternative field values

2012-11-28 Thread Upayavira
Use debugQuery=true to see the format of the parsed query. Solr will parse the query that you provide into Lucene Query objects, which are then used to execute the query. The parsed query info provided by debugQuery=true is basically these Query objects converted back into a string representation,
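
One way to compare two query notations, sketched under assumptions, is to add debugQuery=true to each request URL and then compare the parsed-query strings in the responses. The helper below only covers the URL step; the host and query are hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode


def with_debug_query(url):
    """Return url with debugQuery=true set, replacing any existing value."""
    parts = urlsplit(url)
    params = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "debugQuery"
    ]
    params.append(("debugQuery", "true"))
    return urlunsplit(parts._replace(query=urlencode(params)))


# Hypothetical host and query, for illustration only.
print(with_debug_query("http://localhost:8983/solr/select?q=title:solr"))
```

If the parsed-query strings returned for both notations are identical, Solr treats the two notations as the same query.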

User-Agent string in Solr

2012-11-28 Thread bkrz
Hi all, I want to use Solr Cell to extract content from remote URLs. The User-Agent string that my Solr server uses is "Java/1.6.0_30", so I guess that it's hardcoded somewhere in the class implementing the HTTP Client functionality. My question is: Does Solr provide any way to change this string like

Re: Help with sort on dynamic field and out of memory error

2012-11-28 Thread Toke Eskildsen
On Wed, 2012-11-28 at 03:25 +0100, Arun Rangarajan wrote: [Sorting on 14M docs, 250 fields] > From what I have read, I understand that restricting the number of distinct > values on sortable Solr fields will bring down the fieldCache space. The > values in these sortable fields can be any integer

Re: Problem with migration to SolrAdaptersForLuceneSpatial4

2012-11-28 Thread Viacheslav Davidovich
Hi David, thank you for the reply. Actually, when I change the fieldType to some magic happens and the old query starts to work. This change resolves my problems with distance calculation even without using the solr.SpatialRecursivePrefixTreeFieldType field type. WBR Viacheslav. On 26.11.2012, at 18

RE: Search differences between solr 1.4.0 and 3.6.1

2012-11-28 Thread Frederico Azeiteiro
I just reloaded both indexes to make sure that all definitions are loaded. In the Analysis tool I can see differences, even though the fields are defined in the same way: Query Analyser for 3.6.1 org.apache.solr.analysis.WordDelimiterFilterFactory {protected=protwords.txt, splitOnCaseChange=1, gene

Re: SolrCloud and exernal file fields

2012-11-28 Thread Martin Koch
Mikhail I haven't experimented further yet. I think that the previous experiment of issuing a commit to a specific core proved that all cores get the commit, so I don't think that this approach will work. Thanks, /Martin On Tue, Nov 27, 2012 at 6:24 PM, Mikhail Khludnev < mkhlud...@griddynamics

Re: stopwords in solr

2012-11-28 Thread 曹霖
Yep, it is a bad idea to eliminate stopwords during indexing; maybe you can eliminate stopwords during querying. That is more flexible. 2012/11/28 Walter Underwood > Eliminating stopwords is generally a bad idea. It means you cannot search > for "vitamin a".