Understanding RecoveryStrategy

2012-05-04 Thread Trym R. Møller
Hi Using Solr trunk with the replica feature, I see the below exception repeatedly in the Solr log. I have been looking into the code of RecoveryStrategy#commitOnLeader and read the code as follows: 1. sends a commit request (with COMMIT_END_POINT=true) to the Solr instance containing the lead

Re: Invalid version expected 2, but 60 on CentOS

2012-05-04 Thread Ravi Solr
Thank you very much for responding, Mr. Miller. There are 5 different apps deployed on the same server as SOLR, and all of them call SOLR via SOLRJ with localhost:8080/solr/sitecore as the constructor URL for HttpSolrServer. Out of these 5 apps, only one has this issue. If it is really the web se

Re: solr snapshots - old school and replication - new school ?

2012-05-04 Thread Lance Norskog
Yes. Replication is a lot easier to use and does a lot more. On Thu, May 3, 2012 at 6:00 AM, geeky2 wrote: > hello all, > > environment: centOS and solr 3.5 > > i want to make sure i understand the difference between snapshots and solr > replication. > > snapshots are "old school" and have been

Re: correct XPATH syntax

2012-05-04 Thread Lance Norskog
The XPath implementation in DIH is very minimal- it is tuned for speed, not features. The XSL option lets you do everything you could want, with a slower engine. On Thu, May 3, 2012 at 7:30 AM, lboutros wrote: > ok, not that easy :) > > I did not test it myself but it seems that you could use an

Re: Phrase Slop probelm

2012-05-04 Thread Lance Norskog
Maybe it could throw an exception because the user is clearly trying to do something impossible. On Wed, May 2, 2012 at 3:19 PM, Jack Krupansky wrote: > You are missing the "pf", "pf2", and "pf3" request parameters, which says > which fields to do phrase proximity boosting on. > > "pf" boosts usi

Re: Searching by location – What do I send to Solr?

2012-05-04 Thread Lance Norskog
You could just download postalcodes every day. To be nice, you could pull the HEAD of each file and check if it is new. This is just a set of tables, which you denormalize and add to your other fields. There are other sources of polygonal shape data, but there is no official Solr toolkit for quer

Re: Solr Merge during off peak times

2012-05-04 Thread Lance Norskog
Optimize takes a 'maxSegments' option. This tells it to stop when there are N segments instead of just one. If you use a very high mergeFactor and then call optimize with a sane number like 50, it only merges the little teeny segments. On Thu, May 3, 2012 at 8:28 PM, Shawn Heisey wrote: > On 5/2
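A minimal sketch of the request described above, assuming a hypothetical Solr base URL of http://localhost:8983/solr and an update handler that accepts optimize and maxSegments as request parameters. The helper name optimize_url is illustrative only; the code builds the URL and does not send it.

```python
from urllib.parse import urlencode


def optimize_url(base_url, max_segments):
    """Build an update URL asking Solr to merge down to at most max_segments segments."""
    if max_segments < 1:
        raise ValueError("max_segments must be at least 1")
    # Dict order is preserved, so the parameters appear in this order in the URL.
    params = {"optimize": "true", "maxSegments": str(max_segments)}
    return base_url.rstrip("/") + "/update?" + urlencode(params)


# Hypothetical host; 50 is the "sane number" used in the message above.
url = optimize_url("http://localhost:8983/solr", 50)
print(url)
```

Sending this during off-peak hours would leave the larger segments alone and merge only the small ones, which is the behaviour the message describes.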

Re: how to present html content in browse

2012-05-04 Thread Lance Norskog
You need positions and offsets to do highlighting. A CharFilter does not preserve positions. I think you have to analyze the raw HTML with a different Analyzer, as well as the stripper. I think this is how it works: use a new Analyzer stack that uses the StandardAnalyzer, and the lower case filter

RE: elevate vs. select numFound results

2012-05-04 Thread Noordeen, Roxy
I modified mysolrconfig.xml to: dismax explicit true 0.01 content^2.0 15 1 *:* elevator Then added enableElevation=true parameter to my elevate url. http://mydomain:8181/solr/elevate?q=dwayne+rock+johnson&wt=xml&sort=score+desc&fl=id,bundle_name&exclusive=true&debugQuery=on&enableElevation

Re: SOLRJ: Is there a way to obtain a quick count of total results for a query

2012-05-04 Thread Li Li
don't score by relevance and score by document id may speed it up a little? I haven't done any test of this. may be u can give it a try. because scoring will consume some cpu time. you just want to match and get total count On Wed, May 2, 2012 at 11:58 PM, vybe3142 wrote: > I can achieve this by

Minor type in example solrconfig: process of provided docuemnts

2012-05-04 Thread Jack Krupansky
I noticed this minor typo in the example solrconfig.xml for both 3.6 and trunk (as of 5/1): An analysis handler that provides a breakdown of the analysis process of provided docuemnts. This handler expects a (single) “docuemnts” should be “documents”. -- Jack Krupansky

Re: Single Index to Shards

2012-05-04 Thread Lance Norskog
If you are not using SolrCloud, splitting an index is simple: 1) copy the index 2) remove what you do not want via "delete-by-query" 3) Optimize! #2 brings up a basic design question: you have to decide which documents go to which shards. Mostly people use a value generated by a hash on the actual
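A rough sketch of the shard-assignment decision in step 2, assuming the hash is taken over a document key (here a hypothetical string id) and reduced modulo the number of shards. The function name and the choice of MD5 are assumptions for illustration, not something the thread prescribes.

```python
import hashlib


def shard_for(doc_id, num_shards):
    """Map a document id to a shard number in [0, num_shards) using a stable hash."""
    if num_shards < 1:
        raise ValueError("num_shards must be at least 1")
    # hashlib gives the same result on every run, unlike Python's built-in hash().
    digest = hashlib.md5(doc_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


# Each copy of the index would keep only the ids that map to its own shard number
# and remove the rest with delete-by-query before optimizing.
ids = ["doc-1", "doc-2", "doc-3", "doc-4"]
assignment = {doc_id: shard_for(doc_id, 2) for doc_id in ids}
print(assignment)
```

The same function has to be used at indexing time afterwards, so that new documents land on the shard that already holds their id.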

RE: elevate vs. select numFound results

2012-05-04 Thread Noordeen, Roxy
My actual problem is with elevate not working with "exclusive=true". I have a special pinned widget, that has to display only the nodes defined in my elevate.xml, kind of sponsored results. If I define "game" in my elevate.xml, and send "exclusive=true" I get only the elevated entries. http://:

Minor typo: None-hex character in unicode escape sequence

2012-05-04 Thread Jack Krupansky
I just happened to notice a typo when I mistyped a Unicode escape sequence in a query: org.apache.lucene.queryparser.classic.ParseException: Cannot parse 'sku:abc-0\ugabc0)': None-hex character in unicode escape sequence: g “None-hex” should be “Non-hex”. And “unicode” should be “Unicode”. Sa

RE: Single Index to Shards

2012-05-04 Thread Young, Cody
You can also make a copy of your existing index, bring it up as a second instance/core and then send delete queries to both indexes. -Original Message- From: Erick Erickson [mailto:erickerick...@gmail.com] Sent: Friday, May 04, 2012 8:37 AM To: solr-user@lucene.apache.org Subject: Re: Si

Re: elevate vs. select numFound results

2012-05-04 Thread Jack Krupansky
Some ways that fewer docs might be returned by query elevation: 1. The "exclude" option: exclude="true" in the xml file. 2. The "exclusive" request parameter: &exclusive=true in the URL. (Certainly not your case.) 3. The "exclusive" request parameter default set to "true" in "defaults" for the "

Re: how to present html content in browse

2012-05-04 Thread okayndc
Okay, thanks for the info. On Fri, May 4, 2012 at 4:42 PM, Jack Krupansky wrote: > Evidently there was a problem with highlighting of HTML that is supposedly > fixed in Solr 3.6 and trunk: > > https://issues.apache.org/jira/browse/SOLR-42 > > > --

Re: Facet and totaltermfreq

2012-05-04 Thread Jamie Johnson
it might be...can you provide an example of the request/response? On Fri, May 4, 2012 at 3:31 PM, Dmitry Kan wrote: > I have tried (as a test) combining facets and term vectors ( > http://wiki.apache.org/solr/TermVectorComponent ) in one query and was able > to get a list of facets and for each f

Re: how to present html content in browse

2012-05-04 Thread Jack Krupansky
Evidently there was a problem with highlighting of HTML that is supposedly fixed in Solr 3.6 and trunk: https://issues.apache.org/jira/browse/SOLR-42 -- Jack Krupansky -Original Message- From: okayndc Sent: Friday, May 04, 2012 4:35 PM To: solr-user@lucene.apache.org Subject: Re: how

Re: how to present html content in browse

2012-05-04 Thread okayndc
Is it possible to return the HTML field highlighted? On Fri, May 4, 2012 at 1:27 PM, Jack Krupansky wrote: > 1. The raw html field (call it, "text_html") would be a "string" type > field that is "stored" but not "indexed". This is the field you direct DIH > to output to. This is the field you wou

Re: Invalid version expected 2, but 60 on CentOS

2012-05-04 Thread Mark Miller
On May 4, 2012, at 4:09 PM, Ravi Solr wrote: > Thanking you in anticipation, Generally this happens because the webapp server is returning an html error response of some kind. Often it's a 404. I think in trunk this might have been addressed - that is, it's easier to see the true error. Not p

Invalid version expected 2, but 60 on CentOS

2012-05-04 Thread Ravi Solr
Hello, We recently migrated our production SOLR 3.6 servers OS from Solaris to CentOS and from then on we started seeing "Invalid version (expected 2, but 60)" errors on one of the query servers (oddly one other query server seems fine). If we restart the problematic server everything re

Re: SOLRJ: Is there a way to obtain a quick count of total results for a query

2012-05-04 Thread vybe3142
Fair enough, Thanks. Just wanted to confirm that there wasn't a better way of accomplishing this. -- View this message in context: http://lucene.472066.n3.nabble.com/SOLRJ-Is-there-a-way-to-obtain-a-quick-count-of-total-results-for-a-query-tp3955322p3963295.html Sent from the Solr - User mailing

Re: Facet and totaltermfreq

2012-05-04 Thread Dmitry Kan
I have tried (as a test) combining facets and term vectors ( http://wiki.apache.org/solr/TermVectorComponent ) in one query and was able to get a list of facets and for each facet there was a term freq under termVectors section. Not sure, if that's what you are trying to achieve. -Dmitry On Fri,

elevate vs. select numFound results

2012-05-04 Thread roxy.noord...@wwecorp.com
I need help understanding the difference in the numFound number in the result when I execute two queries against my solr instance, one with the elevation and one without. I have a simple elevate.xml file created and working and am searching for terms that are not meant to be elevated. Elevate quer

Re: Template in a database field does not work. Please Help

2012-05-04 Thread RTI QA
Figured out. I have to specify the column name incident_id in uppercase: Looks like it is case sensitive for the transformer, even though to Oracle, the column name is not case sensitive. Thanks, RTI QA On Fri, May 4, 2012 at 1:44 PM, RTI QA wrote: > I specified template in a f

Template in a database field does not work. Please Help

2012-05-04 Thread RTI QA
I specified template in a field When doing full import, for each row retrieved from oracle, there is this output in the console: May 03, 2012 3:47:08 PM org.apache.solr.handler.dataimport.TemplateTransformer transformRow WARNING: Unable to resolve variable: incident.incident_id whi

Re: query keyword-tokenized fields with solrj

2012-05-04 Thread Jack Krupansky
You have an embedded space in your keyword value, which must be escaped, somehow. So, the actual query can be written as article:"L. 111-5-2" or article:L.\ 111-5-2 The latter is slightly prettier, I suppose. I suppose you could use a wildcard: article:L.*111-5-2 article:L.?111-5-2 If you w
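The backslash-escaping option above can be sketched as a small helper. This is an illustrative stand-in, not SolrJ code: the name escape_term and the exact character set are assumptions, chosen to cover whitespace plus the characters the classic Lucene query parser treats as syntax. SolrJ users may prefer the library's own ClientUtils.escapeQueryChars.

```python
import re

# Whitespace plus characters with special meaning to the classic query parser (assumed set).
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|\s])')


def escape_term(value):
    """Prefix every special character or whitespace in value with a backslash."""
    return _SPECIAL.sub(r"\\\1", value)


# The field name is added after escaping so that its colon stays unescaped.
query = "article:" + escape_term("L. 111-5-2")
print(query)
```

This version also escapes the hyphens, which is harmless for a keyword-tokenized field; escaping only the space, as in the message, works as well.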

Invalid version (expected 2, but 60) on CentOS in production please Help!!!

2012-05-04 Thread Ravi Solr
Hello, We recently migrated our SOLR 3.6 server OS from Solaris to CentOS and from then on we started seeing "Invalid version (expected 2, but 60)" errors on one of the query servers (oddly one other query server seems fine). If we restart the server having issue everything will be alri

Re: how to present html content in browse

2012-05-04 Thread Jack Krupansky
1. The raw html field (call it, "text_html") would be a "string" type field that is "stored" but not "indexed". This is the field you direct DIH to output to. This is the field you would return in your search results with the HTML to be displayed. 2. The stripped field (call it, "text_stripped

Re: Faceting on a date field multiple times

2012-05-04 Thread SUJIT PAL
Hi Ian, I believe you may be able to use a bunch of facet.query parameters, something like this: facet.query=yourfield:[NOW-1DAY TO NOW] facet.query=yourfield:[NOW-2DAY TO NOW-1DAY] ... and so on. -sujit On May 3, 2012, at 10:41 PM, Ian Holsman wrote: > Hi. > > I would like to be able to do
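A short sketch of generating such a series of facet.query parameters, assuming one bucket per day counted back from NOW. The field name yourfield comes from the message; the helper name and the q=*:* query are assumptions. Note that the range operator has to be an uppercase TO.

```python
from urllib.parse import urlencode


def daily_facet_queries(field, days):
    """Return one range query per day, newest first, using Solr date math."""
    queries = []
    for n in range(days):
        upper = "NOW" if n == 0 else f"NOW-{n}DAY"
        lower = f"NOW-{n + 1}DAY"
        queries.append(f"{field}:[{lower} TO {upper}]")
    return queries


facet_queries = daily_facet_queries("yourfield", 2)
# A list of pairs lets the facet.query key repeat in the query string.
params = [("q", "*:*"), ("facet", "true")] + [("facet.query", fq) for fq in facet_queries]
query_string = urlencode(params)
print(query_string)
```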

Re: >1MB file to Zookeeper

2012-05-04 Thread Yonik Seeley
On Fri, May 4, 2012 at 12:50 PM, Mark Miller wrote: >> And how should we detect if data is compressed when >> reading from ZooKeeper? > > I was thinking we could somehow use file extensions? > > eg synonyms.txt.gzip - then you can use different compression algs depending > on the ext, etc. > > We

Re: how to present html content in browse

2012-05-04 Thread okayndc
Hello, I'm having a hard time understanding this, and I had this same question. When using DIH should the HTML field be stored in the raw HTML string field or the stripped field? Also what source field(s) need to be copied and to what destination? Thanks On Thu, May 3, 2012 at 10:15 PM, Lance

Re: >1MB file to Zookeeper

2012-05-04 Thread Mark Miller
On May 3, 2012, at 8:30 AM, Markus Jelsma wrote: > Hi. > > Compression is a good suggestion. All large dictionaries are compressed well > below 1MB with GZIP. Where should this be implemented? SolrZkClient or > ZkController? Hmm...I'm not sure - we want to be careful with this feature. Offhan

Re: Documents With large number of fields

2012-05-04 Thread Darren Govoni
I'm also interested in this. Same situation. On Fri, 2012-05-04 at 10:27 -0400, Keswani, Nitin - BLS CTR wrote: > Hi, > > My data model consists of different types of data. Each data type has its own > characteristics > > If I include the unique characteristics of each type of data, my single So

query keyword-tokenized fields with solrj

2012-05-04 Thread G.Long
Hi :) In schema.xml I added a custom fieldType called keyword: and a field called article : Now I would like to query this field using solrj. I'm using the following code: SolrQuery query = new SolrQuery("article:L. 111-5-2"); QueryResponse rsp = server.query(query); list = rsp.getR

Re: Single Index to Shards

2012-05-04 Thread Erick Erickson
There's no way to split an _existing_ index into multiple shards, although some of the work on SolrCloud is considering being able to do this. You have a couple of choices here: 1> Just reindex everything from scratch into two shards 2> delete all the docs from your index that will go into shard 2

Re: problem with date searching.

2012-05-04 Thread Erick Erickson
Right, you need to do the explicit qualification of the date field. dismax parsing is intended to work with text-type fields, not numeric or date fields. If you attach &debugQuery=on, you'll see that your "scanneddate" field is just dropped. Furthermore, dismax was never intended to work with rang

Documents With large number of fields

2012-05-04 Thread Keswani, Nitin - BLS CTR
Hi, My data model consists of different types of data. Each data type has its own characteristics. If I include the unique characteristics of each type of data, my single Solr Document could end up containing 300-400 fields. In order to drill down to this data set I would have to provide facetin

RE: Single Index to Shards

2012-05-04 Thread Keswani, Nitin - BLS CTR
Yes you can split your index into multiple shards More info on shards can be found here : http://lucidworks.lucidimagination.com/display/solr/Distributed+Search+with+Index+Sharding Thanks. Regards, Nitin Keswani -Original Message- From: michaelsever [mailto:sever_mich...@bah.com] S

Re: search case: Elision and truncate in french

2012-05-04 Thread Jack Krupansky
Okay, the issue is that only *some* of the filters are "multi-term aware" and the elision filter is one that is NOT multi-term aware. -- Jack Krupansky -Original Message- From: Jack Krupansky Sent: Friday, May 04, 2012 9:42 AM To: solr-user@lucene.apache.org Subject: Re: search case:

Single Index to Shards

2012-05-04 Thread michaelsever
If I have a single Solr index running on a Core, can I split it or migrate it into 2 shards? -- View this message in context: http://lucene.472066.n3.nabble.com/Single-Index-to-Shards-tp3962380.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: search case: Elision and truncate in french

2012-05-04 Thread Jack Krupansky
Well, if it was "fixed", then it is now broken again - in the 3.6 release! Here’s a snippet from debugQuery showing that the generated query has the elision intact in the analyzed term: text_fr:l'avion* text_fr:l'avion* +text_fr:l'avion* +text_fr:l'avion* And for the same term without wildcard

Why would solr norms come up different from Lucene norms?

2012-05-04 Thread Benson Margulies
So, I've got some code that stores the same documents in a Lucene 3.5.0 index and a Solr 3.5.0 instance. It's only five documents. For a particular field, the Solr norm is always 0.625, while the Lucene norm is .5. I've watched the code in NormsWriterPerField in both cases. In Solr we've got .57

Re: search case: Elision and truncate in french

2012-05-04 Thread Erik Hatcher
Jack - that was true, until Solr 3.6+: So, Claire, it's possible with the latest Solr release, to do this using bits and pieces of your existing analysis chain. As Jack said, though, this is a manual chore in pre-Solr-3.6 releases. E

Re: search case: Elision and truncate in french

2012-05-04 Thread Jack Krupansky
Unfortunately, use of a wildcard causes the normal token analysis processing to be completely bypassed, including the elision filter. So, when using a wildcard you have to simulate in your head all of the analysis features, such as manually performing the elision. -- Jack Krupansky -Orig

Re: get latest 50 documents the fastest way

2012-05-04 Thread Nagendra Nagarajayya
You can do this with Solr 4.0 with RankingAlgorithm 1.4.2. Please pass the below parameters to your search: &age=latest&docs=50 For eg: http://localhost:8983/solr/select/?q=*:*&age=latest&docs=50 This would inspect the latest last 50 documents in real time and returns results accordingly. Us

Re: Word recognised in a search

2012-05-04 Thread Dmitry Kan
have you tried HighlightComponent? hl=true&hl.fl=orig_text_field - Dmitry On Fri, May 4, 2012 at 1:52 PM, mattia.martine...@gmail.com < mattia.martine...@gmail.com> wrote: > Hi. > > I'm making some searches using Apache SOLR 1.4, but I will upgrade to 3.6. > > When SOLR uses stemming, it is v

Re: Parent-Child relationship

2012-05-04 Thread Erick Erickson
See: https://issues.apache.org/jira/browse/LUCENE-3759 No time-frame mentioned though. Best Erick On Fri, May 4, 2012 at 4:20 AM, tamanjit.bin...@yahoo.co.in wrote: > Hi, > As per my understanding the join is confined to a single core only and it is > not possible to have joins between docs of

Re: Faceting on a date field multiple times

2012-05-04 Thread Ian Holsman
Thanks Marc. On May 4, 2012, at 8:52 PM, Marc Sturlese wrote: > http://lucene.472066.n3.nabble.com/Multiple-Facet-Dates-td495480.html > > -- > View this message in context: > http://lucene.472066.n3.nabble.com/Faceting-on-a-date-field-multiple-times-tp3961282p3961865.html > Sent from the Solr -

Word recognised in a search

2012-05-04 Thread mattia.martine...@gmail.com
Hi. I'm making some searches using Apache SOLR 1.4, but I will upgrade to 3.6. When SOLR uses stemming, it is very difficult to know which words are really found (for example, if I search "ups" SOLR finds "up" too). I need to know that because I need to highlight found words in the t

Re: Faceting on a date field multiple times

2012-05-04 Thread Marc Sturlese
http://lucene.472066.n3.nabble.com/Multiple-Facet-Dates-td495480.html -- View this message in context: http://lucene.472066.n3.nabble.com/Faceting-on-a-date-field-multiple-times-tp3961282p3961865.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: problem with date searching.

2012-05-04 Thread Dmitry Kan
unless, something else is wrong, my question would be, if you have the documents in solr stamped with these dates? also could try for a test specifying the field name directly: q=scanneddate:["2011-09-22T22:40:30Z" TO "2012-02-02T01:30:52Z"] also, in your first e-mail you said you have used [*"2

Re: problem with date searching.

2012-05-04 Thread ayyappan
thanks for quick response. I tried your advice . ["2011-09-22T22:40:30Z" TO "2012-02-02T01:30:52Z"] like that even though i am not getting any result . -- View this message in context: http://lucene.472066.n3.nabble.com/problem-with-date-searching-tp3961761p3961833.html Sent from the Solr - Us

Re: problem with date searching.

2012-05-04 Thread Dmitry Kan
you have dates in the wrong order in the second query. Try instead: ["2011-09-22T22:40:30Z" TO "2012-02-02T01:30:52Z"] in general: [start_date TO end_date] Dmitry On Fri, May 4, 2012 at 1:10 PM, ayyappan wrote: > Hi > > I'm having a slight problem with date searching... if i give same date
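A small sketch of building the range so that the earlier date always comes first, assuming UTC datetimes and the field name scanneddate used in this thread. The helper name is hypothetical. The field is named explicitly in the query, in line with the advice elsewhere in the thread that dismax does not handle date fields.

```python
from datetime import datetime, timezone

SOLR_DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"


def date_range_query(field, first, second):
    """Return field:[start TO end] with the two UTC datetimes put in ascending order."""
    start, end = sorted((first, second))
    return f"{field}:[{start.strftime(SOLR_DATE_FORMAT)} TO {end.strftime(SOLR_DATE_FORMAT)}]"


# The arguments are deliberately passed in the wrong order; the helper swaps them.
q = date_range_query(
    "scanneddate",
    datetime(2012, 2, 2, 1, 30, 52, tzinfo=timezone.utc),
    datetime(2011, 9, 22, 22, 40, 30, tzinfo=timezone.utc),
)
print(q)
```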

problem with date searching.

2012-05-04 Thread ayyappan
Hi, I'm having a slight problem with date searching. If I give the same date for both ends of the range in the search query it seems to work fine, but when I try a different date range I am not getting any results. Ex : select/?defType=dismax&q=[*"2012-02-02T01:30:52Z" TO "2012-02-02T01:30:52Z"*]&qf=scanneddate

Re: Advanced search with results matrix

2012-05-04 Thread Mikhail Khludnev
Hi, have you considered putting your subqueries into a disjunction (BooleanClause.Occur.SHOULD) and requesting http://wiki.apache.org/solr/SimpleFacetParameters#facet.query_:_Arbitrary_Query_Faceting? On Fri, May 4, 2012 at 1:32 PM, Gnanakumar wrote: > > 1. If I understand correctly you just need to

RE: Advanced search with results matrix

2012-05-04 Thread Gnanakumar
> 1. If I understand correctly you just need to perform one query. Like so > (translated to proper syntax of course): >("SQL Server" OR SQL) OR ("Visual Basic" OR VB.NET) OR (Java AND > JavaScript) No, it's not just one single query, rather, as I've mentioned before, it's combination of sea

search case: Elision and truncate in french

2012-05-04 Thread Claire Hernandez
Hi all, I have a little problem, I don't find an easy configuration solution but maybe my google search is wrong :) - ElisionFilterFactory is enabled for searching and indexing analyzer. - Index contains: *l'aventure* => when I search *l'avent** solr finds nothing I would have a solution whic

Re: Parent-Child relationship

2012-05-04 Thread tamanjit.bin...@yahoo.co.in
Hi, As per my understanding the join is confined to a single core only and it is not possible to have joins between docs of different cores. Am I correct here? If yes, is there a possibility of having joins across cores anytime soon? -- View this message in context: http://lucene.472066.n3.nabble

Re: SOLR 3.5 Index Optimization not producing single .cfs file

2012-05-04 Thread pravesh
Thanx Mike, >If you really must have a CFS (how come?) then you can call >TieredMergePolicy.setNOCFSRatio(1.0) -- not sure how/where this is >exposed in Solr though. BTW, would this impact the search performance? I mean i was just trying few random keyword searches(without sort and filters) on b

Re: Advanced search with results matrix

2012-05-04 Thread David Radunz
Hey Gnanam, 1. If I understand correctly you just need to perform one query. Like so (translated to proper syntax of course): ("SQL Server" OR SQL) OR ("Visual Basic" OR VB.NET) OR (Java AND JavaScript) 2. Every query you perform with Solr returns the 'results' count, if you ONLY want the r