Re: question about Field Collapsing/ grouping

2011-09-13 Thread O. Klein
Isn't that what the parameter group.ngroups=true is for?
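
For reference, a minimal sketch of a grouped request that also asks for the number of groups. Only `group.ngroups=true` comes from the thread; the host, the handler path, and the field name `category_s` are assumptions made up for illustration.

```python
from urllib.parse import urlencode

def grouped_query_url(base_url, query, group_field):
    """Build a Solr select URL that groups results and requests the group count."""
    params = [
        ("q", query),
        ("group", "true"),             # turn on result grouping (field collapsing)
        ("group.field", group_field),  # hypothetical field to collapse on
        ("group.ngroups", "true"),     # ask Solr to report the number of groups
    ]
    return base_url + "?" + urlencode(params)

# Hypothetical host and field name, for illustration only.
url = grouped_query_url("http://localhost:8983/solr/select", "*:*", "category_s")
print(url)
```

With this parameter set, the grouped section of the response should carry an `ngroups` count next to the total match count.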

Re: question about Field Collapsing/ grouping

2011-09-13 Thread Jayendra Patil
Yup, it seems the group count feature is included now, as mentioned by Klein. Regards, Jayendra On Tue, Sep 13, 2011 at 8:27 AM, O. Klein wrote: > Isn't that what the parameter group.ngroups=true is for?

Re: can indexing information stored in db rather than filesystem?

2011-09-13 Thread kiran.bodigam
Thanks for your reply, guys. As suggested, I agree that we are losing many of the benefits of Solr/Lucene, but I still want to store the index output (index files) in a DB table. Please suggest what steps I need to follow to configure the DB with the Solr engine (As how we done in it solrconfig.xm

Re: can indexing information stored in db rather than filesystem?

2011-09-13 Thread Markus Jelsma
I'm curious; what benefits do you think you'll get by storing the files in some DB table? On Tuesday 13 September 2011 15:51:19 kiran.bodigam wrote: > Thanks for u r reply guys > > As suggested i agree that we are losing many of the benefits of Solr/Lucene > but i still want to store the ind

RE: can indexing information stored in db rather than filesystem?

2011-09-13 Thread Jaeger, Jay - DOT
I don't think you understand. Solr does not have the code to do that. It just isn't there, nor would I expect it would ever be there. Solr is open source though. You could look at the code and figure out how to do it (though why anyone would do that remains beyond my ability to understand).

Re: can indexing information stored in db rather than filesystem?

2011-09-13 Thread Walter Underwood
On Sep 13, 2011, at 6:51 AM, kiran.bodigam wrote: > As suggested i agree that we are losing many of the benefits of Solr/Lucene > but i still want to store the index output (index files) in db table please > suggest what are the steps i need to follow to configure the db with SOLR > Engine The s

using a function query with OR and spaces?

2011-09-13 Thread Jason Toy
I had queries breaking on me when there were spaces in the text I was searching for. Originally I had : fq=state_s:New York and that would break, I found a work around by using: fq={!raw f=state_s}New York My problem now is doing this with an OR query, this is what I have now, but it doesn't w

Re: How to combine RSS w/ Tika when using Data Import Handler (DIH)

2011-09-13 Thread Pulkit Singhal
Hello Everyone, I've been investigating and I understand that using the RegexTransformer is an option that is open for identifying and extracting data to multiple fields from a single rss value source ... But rather than hack together something I once again wanted to check with the community: Is t

Re: using a function query with OR and spaces?

2011-09-13 Thread josh lucas
On Sep 13, 2011, at 8:37 AM, Jason Toy wrote: > I had queries breaking on me when there were spaces in the text I was > searching for. Originally I had : > > fq=state_s:New York > and that would break, I found a work around by using: > > fq={!raw f=state_s}New York > > > My problem now is doin

Re: using a function query with OR and spaces?

2011-09-13 Thread Chris Hostetter
: Subject: using a function query with OR and spaces? First off, what you are asking about is a "filter query" not a "function query" https://wiki.apache.org/solr/CommonQueryParameters#fq : I had queries breaking on me when there were spaces in the text I was : searching for. Originally I had :

Re: How to combine RSS w/ Tika when using Data Import Handler (DIH)

2011-09-13 Thread Chris Hostetter
: I've been investigating and I understand that using the RegexTransformer is : an option that is open for identifying and extracting data to multiple : fields from a single rss value source ... But rather than hack together : something I once again wanted to check with the community: Is there ano

Highlight compounded word instead of part

2011-09-13 Thread O. Klein
I am using DictionaryCompoundWordTokenFilterFactory and want to highlight the whole word instead of the part that matched the dictionary. So when the query "word" matches on "compoundedword", the whole word "compoundedword" is highlighted, instead of just "word". Any ideas?

Re: Solr messing up the UK GBP (pound) symbol in response, even though Java environment variabe has file encoding is set to UTF 8....

2011-09-13 Thread Chris Hostetter
: Any idea why solr is unable to return the pound sign as-is? : : I tried typing in £ 1 million in Solr admin GUI and got following response. ... : £ 1 million ... : Here is my Java Properties I got also from admin interface: ... : catalina.home = : /home/rbhagdev/SCCRepo

How to plug a new ANTLR grammar

2011-09-13 Thread Roman Chyla
Hi, The standard lucene/solr parsing is nice but not really flexible. I saw questions and discussion about ANTLR, but unfortunately never a working grammar, so... maybe you find this useful: https://github.com/romanchyla/montysolr/tree/master/src/java/org/apache/lucene/queryParser/iqp/antlr In th

Re: How to serach on specific file types ?

2011-09-13 Thread ahmad ajiloo
1- How can I put the file extension into my index? I'm using Nutch to crawl web pages and sending Nutch's data to Solr for indexing, and I have no idea how to put the file extension into my index. 2- Please give me some helpful links about MIME types. I'm new to Solr and don't know anything about mime typ

Get field value in custom searchcomponent (solr 3.3)

2011-09-13 Thread Pablo Ricco
What is the best way to get a float field value from a docID? I tried the following code, but when it runs it throws an exception For input string: "` ??eI" at line float lat = Float.parseFloat(tlat); schema.xml: ... ... component.java: @Override public void process(ResponseBuilder rb) thro

Re: How to serach on specific file types ?

2011-09-13 Thread Chris Hostetter
: 1- How can I put the file extension into my index? I'm using Nutch to : crawling web pages and sending Nutch's data to Solr for indexing. and I have : no idea to put the file extension to my index. : 2- please give me some help links about mime type. I'm new to Solr and don't : know anything abo

Re: How to serach on specific file types ?

2011-09-13 Thread Markus Jelsma
> 1- How can I put the file extension into my index? I'm using Nutch to > crawling web pages and sending Nutch's data to Solr for indexing. and I > have no idea to put the file extension to my index. To get the file extension in a separate field you can copyField the url and use Solr's char patt

Re: using a function query with OR and spaces?

2011-09-13 Thread Jason Toy
I wrote the title wrong; it's a filter query, not a function query. Thanks for the correction. The field is a string. I had tried fq=stats_s:"New York" before and that did not work, and I'm puzzled as to why. I tried out your b suggestion and that worked, thanks! On Tue, Sep 13, 2011 at
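
A sketch of how a filter on values containing spaces might be written with the standard Lucene query syntax, including an OR over several values. Unquoted, `state_s:New York` is parsed as `state_s:New` plus a separate term `York` searched in the default field, which is why the original filter broke. The helper name and the second state value are made up for illustration, and this is not necessarily the suggestion referred to in the thread.

```python
from urllib.parse import urlencode

def phrase_filter(field, values):
    """Return an fq value matching any of the given values, each quoted as a phrase."""
    quoted = []
    for value in values:
        # Escape backslashes and double quotes so the value stays inside the phrase.
        escaped = value.replace("\\", "\\\\").replace('"', '\\"')
        quoted.append('"' + escaped + '"')
    if len(quoted) == 1:
        return field + ":" + quoted[0]
    return field + ":(" + " OR ".join(quoted) + ")"

fq = phrase_filter("state_s", ["New York", "California"])
print(fq)
# URL-encode the whole parameter so spaces and quotes survive the HTTP request.
print(urlencode({"q": "*:*", "fq": fq}))
```

Encoding the parameter on the client side also rules out the case where the query is correct but the raw space truncates the URL.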

Re: Adding Query Filter custom implementation to Solr's pipeline

2011-09-13 Thread Chris Hostetter
: If you do need to implement something truely custom, writing it as your : own QParser to trigger via an "fq" can be advantageous so it can cached : and re-used by many queries. I forgot to mention a very cool new feature that is about to be released in Solr 3.4 You can now instruct Solr tha

Re: How to plug a new ANTLR grammar

2011-09-13 Thread Jason Toy
I'd love to see the progress on this. On Tue, Sep 13, 2011 at 10:34 AM, Roman Chyla wrote: > Hi, > > The standard lucene/solr parsing is nice but not really flexible. I > saw questions and discussion about ANTLR, but unfortunately never a > working grammar, so... maybe you find this useful: > >

Out of memory

2011-09-13 Thread Rohit
I have Solr running on a machine with 18 GB RAM, with 4 cores. One of the cores is very big, containing 77516851 docs; the stats for the searcher are given below: searcherName : Searcher@5a578998 main caching : true numDocs : 77516851 maxDoc : 77518729 lockFactory=org.apache.lucene.store.NativeFSLockF

Lucene Grid question

2011-09-13 Thread sol myr
Hi, I have a huge Lucene index, which I'd like to split between machines ("Grid"). E.g. say I have a chain of book-stores, in different countries, and I'm aiming for the following: - Each country has its own index file, on its own machine (e.g. books from Japan are indexed on machine "japan1")

Using the contrib flexible query parser in Solr

2011-09-13 Thread Michael Ryan
Has anyone used the "Flexible Query Parser" (https://issues.apache.org/jira/browse/LUCENE-1567) in Solr? I'm just starting to look at it for the first time and was wondering if it is something that can be dropped into Solr fairly easily, or if more extensive changes are needed. I thought perh

Re: How to plug a new ANTLR grammar

2011-09-13 Thread Peter Keegan
Roman, I'm not familiar with the contrib, but you can write your own Java code to create Query objects from the tree produced by your lexer and parser something like this: StandardLuceneGrammarLexer lexer = new ANTLRReaderStream(new StringReader(queryString)); CommonTokenStream tokens = new Commo

Re: OOM issue

2011-09-13 Thread Erick Erickson
Multiple webapps will not help you, they're still on the underlying memory. In fact, it'll make matters worse since they won't share resources. So questions become: 1> Why do you have 10 cores? Putting 10 cores on the same machine doesn't really do much. It can make lots of sense to put 10 cores o

Re: Document row in solr Result

2011-09-13 Thread Erick Erickson
Not sure if it really applies, but consider the QueryElevationComponent. It can force the display of certain documents (identified by search term) to the top of the results list. Best Erick On Mon, Sep 12, 2011 at 5:44 AM, Eric Grobler wrote: > Hi Pierre, > > Great idea, that will speed things u

RE: can indexing information stored in db rather than filesystem?

2011-09-13 Thread Jaeger, Jay - DOT
Nicely put. ;^) -Original Message- From: Walter Underwood [mailto:wun...@wunderwood.org] Sent: Tuesday, September 13, 2011 9:16 AM To: solr-user@lucene.apache.org Subject: Re: can indexing information stored in db rather than filesystem? On Sep 13, 2011, at 6:51 AM, kiran.bodigam wrote:

RE: Out of memory

2011-09-13 Thread Jaeger, Jay - DOT
numDocs is not the number of documents in memory. It is the number of documents currently in the index (which is kept on disk). Same goes for maxDocs, except that it is a count of all of the documents that have ever been in the index since it was created or optimized (including deleted documen
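
As a worked example of the distinction, the gap between the two counters reported earlier in this thread (numDocs 77516851, maxDoc 77518729) gives the number of deleted documents still present in the index. The function name is made up for illustration.

```python
def deleted_docs(num_docs, max_doc):
    """Documents marked as deleted that still occupy space in the index."""
    if max_doc < num_docs:
        raise ValueError("maxDoc cannot be smaller than numDocs")
    return max_doc - num_docs

# Values taken from the searcher stats posted in the original message.
print(deleted_docs(77516851, 77518729))  # 1878
```

Neither counter says anything about heap usage; both describe the on-disk index.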

Can index size increase when no updates/optimizes are happening?

2011-09-13 Thread Yury Kats
One of my users observed that the index size (in bytes) increased over night. There was no indexing activity at that time, only querying was taking place. Running "optimize" brought the index size back down to what it was when indexing finished the day before. What could explain that?

Document Boost not evaluated when using standard Query Type?

2011-09-13 Thread Daniel Pötzinger
Hey all, I want to show all documents of a certain type. The documents should be ordered by the index-time document boost. So I expected that this would work: /select?debugQuery=on&q=doctype:music&q.op=OR&qt=standard But in fact every document gets the same score: 0.7306 = (MATCH)

Re: DIH load only selected documents with XPathEntityProcessor

2011-09-13 Thread Pulkit Singhal
This solution doesn't seem to be working for me. I am using Solr trunk and I have the same question as Bernd with a small twist: the field that should NOT be empty, happens to be a derived field called price, see the config below: ... I have also changed the sample script to check the pric

Re: DIH load only selected documents with XPathEntityProcessor

2011-09-13 Thread Pulkit Singhal
Oh, and I'm sure that I'm using Java 6 because the properties from the Solr webpage spit out: java.runtime.version = 1.6.0_26-b03-384-10M3425 On Tue, Sep 13, 2011 at 4:15 PM, Pulkit Singhal wrote: > This solution doesn't seem to be working for me. > > I am using Solr trunk and I have the same qu

Managing solr machines (start/stop/status)

2011-09-13 Thread Jamie Johnson
I know this isn't a solr specific question but I was wondering what folks do in regards to managing the machines in their solr cluster? Are there any recommendations for how to start/stop/manage these machines? Any suggestions would be appreciated.

DIH skipping imports with skipDoc vs skipRow

2011-09-13 Thread Pulkit Singhal
Hello, 1) The documented explanation of skipDoc and skipRow is not enough for me to discern the difference between them: $skipDoc : Skip the current document . Do not add it to Solr. The value can be String true/false $skipRow : Skip the current row. The document will be added with rows from othe

Re: select query does not find indexed pdf document

2011-09-13 Thread Michael Dockery
Thank you for your informative reply. I would like to start simple by combining both filename and content into the same default search field ...which my default schema xml calls "text" ... text ... also: -case and accent insensitive -no splits on numb3rs -no highlights -text processing sa

Re: Document Boost not evaluated when using standard Query Type?

2011-09-13 Thread Chris Hostetter
: I want to show all documents with of a certain type. The documents : should be ordered by the index time document boost. ... : But in fact every document gets the same score: : : 0.7306 = (MATCH) fieldWeight(doctype:music in 1), product of: : 1.0 = tf(termFreq(doctype:music)=1)

where is the SOLR_HOME ?

2011-09-13 Thread ahmad ajiloo
Hi, this page ( http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ICUTokenizerFactory ) says: "Note: to use th

Re: Document Boost not evaluated when using standard Query Type?

2011-09-13 Thread Daniel Pötzinger
Thanks, that helped! On Sep 14, 2011, at 4:56 PM, Chris Hostetter wrote: > > > fieldNorms are very corse. In my opinion, if you have a > "weighting" you want to use to affect score sort, it's better to index > that weight as a numeric field, and explicitly factor it into the score > using a
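
A minimal sketch of the quoted advice, assuming the weight is indexed in a numeric field. It uses Solr's `{!boost}` query parser, which multiplies the score of the wrapped query by a function value. The field name `weight_f`, the host, and the helper names are made up for illustration; only `doctype:music` comes from the thread.

```python
from urllib.parse import urlencode

def boost_query(query, weight_field):
    """Wrap a query so its score is multiplied by the value of a numeric field."""
    return "{!boost b=" + weight_field + "}" + query

def boosted_query_url(base_url, query, weight_field):
    """Build a select URL that returns all fields plus the resulting score."""
    params = {"q": boost_query(query, weight_field), "fl": "*,score"}
    return base_url + "?" + urlencode(params)

# weight_f is a hypothetical numeric field holding the per-document weight.
q = boost_query("doctype:music", "weight_f")
print(q)
print(boosted_query_url("http://localhost:8983/solr/select", "doctype:music", "weight_f"))
```

Unlike an index-time boost folded into the field norm, a weight stored in its own field keeps its full precision and can be changed by reindexing just that field value.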

Re: Document Boost not evaluated when using standard Query Type?

2011-09-13 Thread Daniel Pötzinger
> > fieldNorms are very corse. In my opinion, if you have a > "weighting" you want to use to affect score sort, it's better to index > that weight as a numeric field, and explicitly factor it into the score > using a function query... I see that in this use case this makes most sense - thanks

DIH delta last_index_time

2011-09-13 Thread Maria Vazquez
Hi, How do you handle the situation where the time on the server running Solr doesn't match the time in the database? I'm using the last_index_time saved by Solr in the delta query checking it against lastModifiedDate field in the database but the times are not in sync so I might lose some changes.
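
One possible workaround, sketched here as an assumption rather than a built-in DIH feature, is to move the cutoff back by a safety margin larger than the expected clock skew. Some rows are then imported twice, which is harmless as long as documents are overwritten by their unique key. The sketch assumes the timestamp is formatted as `yyyy-MM-dd HH:mm:ss`; the function name and the five-minute margin are illustrative.

```python
from datetime import datetime, timedelta

TIMESTAMP_FORMAT = "%Y-%m-%d %H:%M:%S"  # assumed format of last_index_time

def safe_cutoff(last_index_time, margin_seconds):
    """Shift the last index time back so clock skew cannot hide recent changes."""
    parsed = datetime.strptime(last_index_time, TIMESTAMP_FORMAT)
    return (parsed - timedelta(seconds=margin_seconds)).strftime(TIMESTAMP_FORMAT)

# A five-minute margin; rows modified after this cutoff would be re-imported.
print(safe_cutoff("2011-09-13 10:00:00", 300))  # 2011-09-13 09:55:00
```

The same subtraction can usually be done in the delta query itself with the database's own date arithmetic, so that only one clock decides the cutoff.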

RE: Out of memory

2011-09-13 Thread Rohit
Thanks, Jaeger. Actually, I am storing Twitter streaming data in the core, so the indexing rate is about 12 tweets (docs)/second. The same Solr instance contains 3 other cores, but these cores are not very heavy. Now the Twitter core has become very large (77516851) and it's taking a long time to query (Mostly

EofException with Solr in Jetty

2011-09-13 Thread Michael Szalay
Hi all, sometimes we have this error in our system. We are running Solr 3.1.0 on Jetty 7.2.2. Does anyone have an idea how to tune this? 14:41:05,693 | ERROR | qtp283504850-36 | SolrDispatchFilter | apache.solr.common.SolrException 151 | 154 - mvn_ch.basis06.eld.indexer_ch.basis06.eld.indexer.sol

solr 1.4 highlighting issue

2011-09-13 Thread Dmitry Kan
Hello list, Not sure how many of you are still using solr 1.4 in production, but here is an issue with highlighting, that we've noticed: The query is: (drill AND ships) OR rigs Excerpt from the highlighting list: Within the fleet of 27 floating rigs (semisubmersibles and drillships