Re: is there a way we can build spell dictionary from solr index such that it only take words leaving all special characters

2013-03-12 Thread Rohan Thakur
while building the spell dictionary... On Wed, Mar 13, 2013 at 11:29 AM, Rohan Thakur wrote: > even do not want to break the words as in samsung to s a m s u n g or sII > to s II or s2 to s 2 > > On Wed, Mar 13, 2013 at 11:28 AM, Rohan Thakur wrote: > >> k as in like if the field I am indexing f

Re: is there a way we can build spell dictionary from solr index such that it only take words leaving all special characters

2013-03-12 Thread Rohan Thakur
even do not want to break the words as in samsung to s a m s u n g or sII to s II or s2 to s 2 On Wed, Mar 13, 2013 at 11:28 AM, Rohan Thakur wrote: > k as in like if the field I am indexing from the database like title that > has characters like () - # /n// > example: > > Screenguard for Samsun

Re: is there a way we can build spell dictionary from solr index such that it only take words leaving all special characters

2013-03-12 Thread Rohan Thakur
k as in like if the field I am indexing from the database like title that has characters like () - # /n// example: Screenguard for Samsung Galaxy SII (Matt and Gloss) (with Dual Protection, Cleaning Cloth and Bubble Remover) or samsung-galaxy-sii-screenguard-matt-and-gloss.html or /s/a/samsung_gal

Re: copyField with * stops working with 4.2 (related to SOLR-3798 ?)

2013-03-12 Thread Steve Rowe
Yes, this is a regression, definitely my fault. Sorry Alex! The table on SOLR-3798 is missing this case: a glob matching one or more explicit fields (as opposed to dynamic fields). I've filed a JIRA: https://issues.apache.org/jira/browse/SOLR-4567 On Mar 13, 2013, at 12:20 AM, "Jack Krupansk

Re: debugQuery, explain tag - What does the fieldWeight value refer to?,

2013-03-12 Thread David Philip
Hi, Any reply on this: How are the documents sequenced in the case when the product of tf, idf, coord and fieldNorm is the same for both documents? Thanks - David P.S : This link was very useful to understand the scoring in detail: http://mail-archives.apache.org/mod_mbox/lucene-java-user/2

Re: copyField with * stops working with 4.2 (related to SOLR-3798 ?)

2013-03-12 Thread Jack Krupansky
Solr-4503 made the changes to copyField semantics. Indeed, it is not clear whether Solr-4503 (or even Solr-3798) was really intended to de-commit existing functionality. I mean, the normal procedure is to deprecate a feature long before removing it. And, the wiki does not note the decommission

Re: Is Lucene's DrillSideways something suitable for Solr?

2013-03-12 Thread Yonik Seeley
On Tue, Mar 12, 2013 at 10:27 PM, Alexandre Rafalovitch wrote: > Lucene seems to get a new DrillSideways functionality on top of its own > facet implementation. > > I would love to have something like that in Solr Solr has had multi-select faceting for 4 years now. My understanding of DrillSidewa

RE: Poll: Largest SolrCloud out there?

2013-03-12 Thread Vaillancourt, Tim
Considering the silence, I'll take the unofficial largest SolrCloud award until beaten :D: 2 VMWare VMs 4GB RAM/VM 4 Virtual CPUs < 1000mb index Beat that :)!! Tim -Original Message- From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com] Sent: Thursday, February 28, 2013 12:00 AM

Re: How to Integrate Solr With Hbase

2013-03-12 Thread Bharat Mallampati
We haven't used Nutch to crawl data into SOLR, we used the standard HBASE API to read and SOLRJ API to write to solr. And our document size is relatively small with 100 to 150 fields. Thanks Bharat On Tue, Mar 12, 2013 at 1:15 AM, adfel70 wrote: > > DO you store all your crawled nutch data i

Re: SolrException: Error opening new searcher

2013-03-12 Thread mark12345
I am continuing to work on this problem, so I will update this thread as I go. These are the only logs I have through the "http://localhost:8080/solr-app/#/~logging" interface. I am using tomcat to run the solr war. Is there anything I can do to get more descriptive logs? -- View this messag

Re: Solr _docid_ parameter

2013-03-12 Thread mark12345
The following relates directly to my question above. Thanks Erick. Erick Erickson wrote > Don't use the internal Lucene doc ID. It _will_ change, even the > relationship between existing docs will change. When cores are merged, the > Lucene doc IDs are renumbered. Segments are NOT merged in ins

Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

2013-03-12 Thread Shawn Heisey
On 3/12/2013 4:17 PM, feroz_kh wrote: > Do we really need to optimize in order to reformat ? The alternative would be to start with an empty index and just reindex your data. That is actually the best way to go, if that option is available to you. > If yes, What is the best way of optimizing

copyField with * stops working with 4.2 (related to SOLR-3798 ?)

2013-03-12 Thread Alexandre Rafalovitch
Hello, I have an example schema which worked in 4.1 but is failing to load in 4.2 with: "copyField source :'addr_*' is not an explicit field and doesn't match a dynamicField". I think this must be due to SOLR-3798, but I don't understand why even after reading it through several times. My schema

Re: spellchecker does not have suggestion for keywords typed through a non-whitespace delimiter

2013-03-12 Thread Jack Krupansky
Take a look at the WordBreakSolrSpellChecker. See: http://wiki.apache.org/solr/SpellCheckComponent -- Jack Krupansky -Original Message- From: alx...@aim.com Sent: Tuesday, March 12, 2013 8:11 PM To: solr-user@lucene.apache.org Subject: spellchecker does not have suggestion for keyword

spellchecker does not have suggestion for keywords typed through a non-whitespace delimiter

2013-03-12 Thread alxsss
Hello, Recently we noticed that solr and its spellchecker do not return results for keywords typed with a non-whitespace delimiter. A user accidentally typed u instead of white space. For example, paulusoles instead of paul soles. Solr does not return any results or spellcheck suggestion for key

Re: having trouble escaping a character string

2013-03-12 Thread geeky2
oh - now i see what i was doing wrong. i kept trying to use the hex code of %22 as a replacement for the double quote - but that was not working - thank you jack, mark -- View this message in context: http://lucene.472066.n3.nabble.com/having-trouble-escaping-a-character-string-tp404679

Re: having trouble escaping a character string

2013-03-12 Thread Jack Krupansky
You need to escape a quote that is used within a quoted string, using a backslash: 30326R-26" TILLER becomes "30326R-26\" TILLER", which becomes http://bogus/solrpartscat/core2/select?qt=modelItemNoSearch&q=itemModelNoExactMatchStr:%2230326R-26\%22%20TILLER%22 -- Jack Krupansky -Original Message
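
A small sketch of the escaping step described above, in Python. The helper names (quote_phrase, field_query) are made up for illustration and are not part of Solr or SolrJ. It backslash-escapes the embedded quote and then percent-encodes the whole q value, so the backslash is encoded as well instead of being sent raw.

```python
from urllib.parse import quote


def quote_phrase(value: str) -> str:
    """Wrap value in double quotes, backslash-escaping backslashes and embedded quotes."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'


def field_query(field: str, value: str) -> str:
    """Build field:"value" for an exact-match query (hypothetical helper)."""
    return f"{field}:{quote_phrase(value)}"


q = field_query("itemModelNoExactMatchStr", '30326R-26" TILLER')
print(q)                         # the query as the parser should see it
print("q=" + quote(q, safe=""))  # the same query, percent-encoded for a URL
```

Encoding the finished query string in one pass avoids mixing hand-written %22 sequences with literal characters.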

Re: Upgrade Solr3.5 to Solr4.1 - Index Reformat ?

2013-03-12 Thread feroz_kh
Hi Shawn, Do we really need to optimize in order to reformat ? If yes, What is the best way of optimizing index - Online or Offline ? Can we do it online ? If yes - 1. What is the http request which we can use to invoke optimization - How long it takes ? 2. What is the command line command to invo

Re: having trouble escaping a character string

2013-03-12 Thread geeky2
attempting to upload the screenshot bmp file. the embedded image is difficult to make out. temp1.bmp -- View this message in context: http://lucene.472066.n3.nabble.com/having-trouble-escaping-a-character-string-tp4046796p4046798.

having trouble escaping a character string

2013-03-12 Thread geeky2
hello all, i am searching on this field type: for this string: 30326R-26" TILLER when i use the analyzer and issue the query - it indicates success (please see attached screen shot) but when i issue the searc

Re: PDF keyword searches not accurate

2013-03-12 Thread Jack Krupansky
You could just download Solr and run it by itself (quite easy), sending a PDF document to the solr /update/extract handler as per that wiki page. See: http://lucene.apache.org/solr/4_2_0/tutorial.html -- Jack Krupansky -Original Message- From: JDJ Sent: Tuesday, March 12, 2013 5:12 P

Re: PDF keyword searches not accurate

2013-03-12 Thread JDJ
Hello, Michael. Thank you for your suggestion. I'm unfamiliar with analysis handler. Do you have a link, for that? Much appreciated, - JDJ "There are two kinds of people in the world; those who understand binary, and those who don't. -- View this message in context: http

Re: PDF keyword searches not accurate

2013-03-12 Thread Michael Della Bitta
You could also use the analysis handler to see if your field definition strips numeric input. Michael Della Bitta Appinions 18 East 41st Street, 2nd Floor New York, NY 10017-6271 www.appinions.com Where Influence Isn’t a Game On Tue, Mar 12, 20

Re: How to Integrate Solr With Hbase

2013-03-12 Thread lboutros
Hi Kamaci, why don't you use the Nutch indexing functionality ? The Nutch Crawling script already contains the Solr indexing step. http://wiki.apache.org/nutch/bin/nutch%20solrindex Ludovic. - Jouve France. -- View this message in context: http://lucene.472066.n3.nabble.com/How-to-Integ

Natural Language Processing on input

2013-03-12 Thread perryg
I'm wondering what the best approach would be in this somewhat unique case. My data is a fairly flat table of classification data, where each column gets increasingly specific, down to the exact item i'm looking for. Think the biology tree, where each column would be Kingdom, Order, Family, Genus,

Re: PDF keyword searches not accurate

2013-03-12 Thread Jack Krupansky
Use the "extract only" option for Solr Cell to get the text stream that was extracted by Solr Cell/Tika/PDFBox, then manually search through the response for some text that is near the "1386", and see what text is output in the vicinity of the "1386". See: http://wiki.apache.org/solr/Extractin

Re: Special characters not indexed

2013-03-12 Thread Timothy Potter
Just to add to Jack's points, you can also use the term query parser to avoid all the escaping for special characters, e.g. fq={!term f=some_field} See Erik's preso from Apache Eurocon 2012 around 25:50 - http://vimeopro.com/user11514798/apache-lucene-eurocon-2012/video/55822628 On Tue, Mar 12,
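
To make the {!term} suggestion concrete, here is a hedged Python sketch that builds such a filter and URL-encodes it. The function name term_filter and the sample value are assumptions for illustration; only the fq={!term f=some_field} form comes from the message above.

```python
from urllib.parse import urlencode


def term_filter(field: str, raw_value: str) -> str:
    """Build a {!term} filter; the value after the closing brace is taken as-is."""
    return f"{{!term f={field}}}{raw_value}"


# No query-parser escaping is applied to the value; only URL encoding is needed.
params = [("q", "*:*"), ("fq", term_filter("some_field", "a&b (c)"))]
print(urlencode(params))
```

Note that the term parser matches the indexed term exactly, so this fits string fields or values that already look like what the analyzer produced.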

Re: JoinQuery and scores

2013-03-12 Thread Chris Hostetter
: Every doc returned has a score of 1.0 with the join. : Without join I get scores between 0.40337953 and 0.40530312. Saying you get scores w/o joining is an apples/oranges comparison -- w/o joining you have a completely different query matching different documents. what the JoinQParser does i

Re: How can I limit my Solr search to an arbitrary set of 100,000 documents?

2013-03-12 Thread Andy Lester
On Mar 12, 2013, at 1:21 PM, Chris Hostetter wrote: > How are these sets of flrids created/defined? (understanding the source > of the filter information may help inspire alternative suggestions, ie: > XY Problem) It sounds like you're looking for patterns that could potentially providing

PDF keyword searches not accurate

2013-03-12 Thread JDJ
Hello, everyone. I'm working (basically for the first time) on a project that requires PDFs to be indexed and searched via Solr under ColdFusion Server 9. I've completed the project, but the client is asking a question that I don't have the answer for. Basically, there is one PDF that has "1386"

Re: xml output question

2013-03-12 Thread jazz
Hi Michael, Thanks for the reply. My question is how to make child XML elements such as: > These can be converted using Saxon. Regards Bart On 12 Mar 2013, at 20:18, Michael Della Bitta wrote: >

Re: solr.DirectUpdateHandler2 failed to instantiate

2013-03-12 Thread Jack Park
Indeed! Perhaps the germane part is this, before the failure to instantiate notice: Caused by: java.lang.ClassCastException: class org.apache.solr.update.DirectUpdateHandler2 at java.lang.Class.asSubclass(Unknown Source) at org.apache.solr.core.SolrResourceLoader.findClass(SolrRes

Re: xml output question

2013-03-12 Thread Michael Della Bitta
Hi Bart, You've linked to the page that explains how to use Saxon to run XSLT over the output. So the answer is yes? I'm having trouble understanding what your real question is. Thanks, Michael Della Bitta Appinions 18 East 41st Street, 2nd Floo

Re: solr.DirectUpdateHandler2 failed to instantiate

2013-03-12 Thread Mark Miller
There should be a stack trace - also, you shouldn't have to do anything special to use this class. It's the default and only truly supported implementation… - Mark On Mar 12, 2013, at 2:53 PM, Jack Park wrote: > That messages gives great, but terrible google. Zillions of hits, > mostly filled

xml output question

2013-03-12 Thread jazz
Hi, I am having trouble with XML output: localhost:8983/solr/collection1/select?q=*.*?wt=xml Is it possible to use this schema format in Solr? So, extending a field with XML children ext and last? Or is it possible to reformat the XML output with an XSLT processor such as Saxon (ht
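
A side note on the request URL quoted above: it contains a second "?", while parameters in a query string are normally separated with "&", and Solr's match-all query is usually written *:* instead of *.*. The Python sketch below builds the select URL with the standard library so the separators and encoding come out right. The helper name select_url is hypothetical; host and core are copied from the message.

```python
from urllib.parse import urlencode


def select_url(base: str, **params: str) -> str:
    """Join a Solr core URL and request parameters into a /select URL."""
    return f"{base}/select?{urlencode(params)}"


url = select_url("http://localhost:8983/solr/collection1", q="*:*", wt="xml")
print(url)
```

This only addresses the request syntax; whether nested child elements can appear in the response is a separate question.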

solr.DirectUpdateHandler2 failed to instantiate

2013-03-12 Thread Jack Park
That messages gives great, but terrible google. Zillions of hits, mostly filled with very long log traces, and zero messages (that I could find) about what to do about it. I switched over to using that handler since it has an update log specified, and that's the only place I've found how to use up

Re: question about syntax for multiple terms in filter query

2013-03-12 Thread Jack Krupansky
Filter query. -- Jack Krupansky -Original Message- From: geeky2 Sent: Tuesday, March 12, 2013 2:28 PM To: solr-user@lucene.apache.org Subject: Re: question about syntax for multiple terms in filter query jack, did you mean "function query" or filter query i was going to do this

Re: Special characters not indexed

2013-03-12 Thread Jack Krupansky
Use the white space tokenizer and be sure to escape a lot of them in queries since a number of them have meaning to the query parser. Or, enclose query terms in quotes. -- Jack Krupansky -Original Message- From: vsl Sent: Tuesday, March 12, 2013 11:16 AM To: solr-user@lucene.apache.o
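
For the escaping advice above, a minimal Python sketch. It is similar in spirit to SolrJ's ClientUtils.escapeQueryChars but is not a port of it, and the character list is an assumption based on the classic Lucene query syntax. The function name is chosen for illustration only.

```python
import re

# Characters treated as syntax by the classic Lucene/Solr query parser
# (assumed list; & and | are escaped individually for simplicity).
_SPECIAL = re.compile(r'([+\-&|!(){}\[\]^"~*?:\\/])')


def escape_query_chars(text: str) -> str:
    """Prefix every query-syntax character with a backslash."""
    return _SPECIAL.sub(r"\\\1", text)


print(escape_query_chars("a&b"))
print(escape_query_chars("(1+1):2"))
```

Escaping only helps at query time; the field still needs a tokenizer that keeps these characters in the index, as noted above.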

Re: question about syntax for multiple terms in filter query

2013-03-12 Thread geeky2
jack, did you mean "function query" or filter query i was going to do this in my request handler for parts +itemType:1 +sellingPrice:[1 TO *] -- View this message in context: http://lucene.472066.n3.nabble.com/question-about-syntax-for-multiple-terms-in-filter-query-tp4046442p4046715

Re: How can I limit my Solr search to an arbitrary set of 100,000 documents?

2013-03-12 Thread Chris Hostetter
: q=title:dogs AND : (flrid:(123 125 139 34823) OR : flrid:(34837 ... 59091) OR : ... OR : flrid:(101294813 ... 103049934)) : The problem with this approach (besides that it's clunky) is that it : seems to perform O(N^2) or so. With 1,000 FLRIDs,

Re: It seems a issue of deal with chinese synonym for solr

2013-03-12 Thread Robert Muir
I agree. Actually that top-level logic is fine. It's the loop that follows that's wrong: it needs to look at position increment and do the right thing. Want to open a JIRA issue? On Mon, Mar 11, 2013 at 9:15 PM, 李威 wrote: > in org.apache.solr.parser.SolrQueryParserBase, there is a function: > "pr

RE: Solr 4.1: problems with Spatial Search.

2013-03-12 Thread Smiley, David W.
Hi Luis, I'm glad it's working out. When I *eventually* get a patch together addressing the issue, I'll let you know so you can try it out. 'd' is measured in kilometers. "degrees" is only for the Solr 4 spatial field's Circle radius parameter and for its distance score results. Kind of a m
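
Since 'd' is in kilometers while the circle radius and distance scores use degrees, a conversion helper can be handy. The Python sketch below assumes a spherical Earth with a mean radius of 6371.0087714 km. Both the constant and the function names are assumptions for illustration, not taken from Solr.

```python
import math

# Assumed spherical mean Earth radius in kilometers.
EARTH_MEAN_RADIUS_KM = 6371.0087714

# Length of one degree of arc along a great circle, in kilometers.
KM_PER_DEGREE = EARTH_MEAN_RADIUS_KM * math.pi / 180.0


def km_to_degrees(km: float) -> float:
    """Convert a great-circle distance in kilometers to degrees of arc."""
    return km / KM_PER_DEGREE


def degrees_to_km(degrees: float) -> float:
    """Convert degrees of arc back to kilometers."""
    return degrees * KM_PER_DEGREE


print(km_to_degrees(5.0))
print(degrees_to_km(1.0))
```

One degree comes out at roughly 111.2 km under this assumption.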

Solr 4.2.0 on Glassfish 3.1.2.2

2013-03-12 Thread rmohan
Hi, This doesn't look like a Solr or Glassfish issue but before debugging SSL which is tangential to this I thought I should ask. This is a fresh deployment of Solr WAR on Glassfish. Any ideas ? It seems to be basically a HttpClient call during deployment. Thanks. [#|2013-03-12T16:19:12.

Re: question about syntax for multiple terms in filter query

2013-03-12 Thread Jack Krupansky
So they definitely should be specified in a single function query. -- Jack Krupansky -Original Message- From: geeky2 Sent: Tuesday, March 12, 2013 1:30 PM To: solr-user@lucene.apache.org Subject: Re: question about syntax for multiple terms in filter query hello jack, yes - i will al

Re: is there a way we can build spell dictionary from solr index such that it only take words leaving all special characters

2013-03-12 Thread Alexandre Rafalovitch
Sorry, leaving them where? Can you give a concrete example or problem. Regards, Alex On Mar 12, 2013 1:31 PM, "Rohan Thakur" wrote: > hi all > > wanted to know is there way we can make spell dictionary from solr index > such that it only takes words from the index leaving all the special >

Re: [ANNOUNCE] Apache Solr 4.2 released

2013-03-12 Thread Marthi, Suneel
We presently have Indexes generated from Solr 4.1. What is the upgrade path to Solr 4.2 ? On 3/11/13 8:37 PM, "Robert Muir" wrote: >March 2013, Apache Solr 4.2 available >The Lucene PMC is pleased to announce the release of Apache Solr 4.2 > >Solr is the popular, blazing fast, open source No

Re: question about syntax for multiple terms in filter query

2013-03-12 Thread geeky2
hello jack, yes - i will always be using the two constraints at the same time. thank you again for the info. thx mark -- View this message in context: http://lucene.472066.n3.nabble.com/question-about-syntax-for-multiple-terms-in-filter-query-tp4046442p4046650.html Sent from the Solr - User

Re: SolrException: Error opening new searcher

2013-03-12 Thread Mark Miller
exceeding max warming searchers and failing to open a searcher are entirely different things. Your problem is more severe. Have any more logs? There should likely be deeper stack traces telling why the Searcher could not be opened. - Mark On Mar 12, 2013, at 12:41 AM, mark12345 wrote: > I a

Re: Solr 4.1: problems with Spatial Search.

2013-03-12 Thread Luis Cappa Banda
Hey, David. How are you? I did what you suggested and now it works fine. However I hope that those performance issues will be resolved soon and I hope I can help in some way: coding, testing, whatever. About cache warm-up, I have set up some warm-up queries in solrconfig.xml that fill the cache. O

Re: [ANNOUNCE] Apache Solr 4.2 released

2013-03-12 Thread Andre Bois-Crettez
On 03/12/2013 01:37 AM, Robert Muir wrote: * Collection Aliasing. Got time based data? Want to re-index in a temporary collection and then swap it into production? Done. Stay tuned for Shard Aliasing. Nice :) Seems that this solves the main use case I have for core SWAP (was missing in SolrCloud

Re: searching exact phrase with stop word returns bad results

2013-03-12 Thread Jack Krupansky
The Word Delimiter Filter will remove all punctuation characters. That is its function. Maybe you should first describe in simple English what your token/term rules are, and then it would be more clear what tokenizer and filters would be most appropriate. -- Jack Krupansky -Original Mes

Special characters not indexed

2013-03-12 Thread vsl
Hi, I am trying to index special characters and make them searchable. User Story: 1. Index document with content: §$ %&/( )=? +*#'-<> 2. Find indexed document using search term: & Additionally I have several other fields that are copied to textAll Field. The search is performed on this field. Doe

Re: Solr 4.0 to Solr 4.1 upgrade

2013-03-12 Thread richardg
This ended up being a SPM issue. I noticed the same issue w/ 4.2 and decided to upgrade to monitor version 1.9.0 and it is now showing correct data. -- View this message in context: http://lucene.472066.n3.nabble.com/Solr-4-0-to-Solr-4-1-upgrade-tp4044990p4046631.html Sent from the Solr - User

Re: question about syntax for multiple terms in filter query

2013-03-12 Thread Jack Krupansky
It simply comes down to the matter of whether you commonly use two (or more) constraints at the same time, or if selection of the multiple constraints is independent. If you commonly use them together, combine them into one filter query. If you commonly pick and choose what constraints to apply
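
The two options described above can be sketched in Python as follows. The function name build_params is hypothetical; the field names are the ones from geeky2's message in this thread. With combine=True the constraints go into a single fq, otherwise each constraint gets its own fq and can be cached and reused independently.

```python
from urllib.parse import urlencode


def build_params(q, filters, combine):
    """Return request parameters with one combined fq or one fq per constraint."""
    params = [("q", q)]
    if combine:
        # Constraints always used together: one fq, each clause required.
        params.append(("fq", " ".join(f"+{f}" for f in filters)))
    else:
        # Constraints picked independently: one fq per clause.
        params.extend(("fq", f) for f in filters)
    return params


filters = ["itemType:1", "sellingPrice:[1 TO *]"]
print(urlencode(build_params("*:*", filters, combine=True)))
print(urlencode(build_params("*:*", filters, combine=False)))
```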

Re: SOLR - Recommendation on architecture

2013-03-12 Thread Shawn Heisey
On 3/12/2013 4:12 AM, kobe.free.wo...@gmail.com wrote: Following is the prod scenario:- 1. Web Server 1 (with above mentioned configuration) - will be hosting Solr instance and web site. 2. Web Server 2 (with above mentioned configuration) - will be hosting second Solr instance and web site. Do

Re: How to Integrate Solr With Hbase

2013-03-12 Thread adfel70
Do you store all your crawled nutch data in solr? including the text? If you do - don't you get problems with too big documents? If you don't - how do you support snippets and highlighting ? Bharat Mallampati wrote > We do have same kind of scenario in our application also. > > The way we are ac

Re: searching exact phrase with stop word returns bad results

2013-03-12 Thread adfel70
I see that there is no token with @. The question is why. This is my field type: any idea? Erick Erickson wrote > Take a look at admin/analysis for the field in question, feed it values > and > see how they are tokenized. My guess is that th