trie fields and sortMissingLast

2009-09-30 Thread Steve Conover
Am I correct in thinking that trie fields don't support sortMissingLast (my tests show that they don't). If not, is there any plan for adding it in? Regards, Steve

Re: field collapsing sums

2009-09-30 Thread Matt Weber
You might want to see how the stats component works with field collapsing. Thanks, Matt Weber On Sep 30, 2009, at 5:16 PM, Uri Boness wrote: Hi, At the moment I think the most appropriate place to put it is in the AbstractDocumentCollapser (in the getCollapseInfo method). Though, it mi

Re: Create new core on the fly

2009-09-30 Thread djain101
> So, if we > do a create, will it modify the solr.xml everytime? Can it be avoided in > subsequent requests for create? > > >No, solr.xml will be modified only if persist=true is passed as a request >param. I don't understand your second question. Why would you want to issue >create commands for
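
The point above, that solr.xml is only rewritten when persist=true is passed, can be sketched as a small URL builder for the CoreAdmin CREATE request. This is an illustration only: the host, port, and core names are assumptions, and `core_create_url` is a hypothetical helper, not part of Solr or SolrJ.

```python
from urllib.parse import urlencode


def core_create_url(base_url, name, instance_dir, persist=False):
    """Build a CoreAdmin CREATE URL (hypothetical helper).

    The persist parameter is only added when requested, so by default
    the request does not ask Solr to rewrite solr.xml.
    """
    params = {"action": "CREATE", "name": name, "instanceDir": instance_dir}
    if persist:
        params["persist"] = "true"
    return f"{base_url}/admin/cores?{urlencode(params)}"


# Assumed local Solr instance and core name, for illustration only.
BASE = "http://localhost:8983/solr"
transient_url = core_create_url(BASE, "core1", "core1")
persisted_url = core_create_url(BASE, "core1", "core1", persist=True)
print(transient_url)
print(persisted_url)
```

Keeping persist off by default mirrors the behaviour described in the reply: a create request leaves solr.xml alone unless the caller opts in.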

Re: field collapsing sums

2009-09-30 Thread Uri Boness
Hi, At the moment I think the most appropriate place to put it is in the AbstractDocumentCollapser (in the getCollapseInfo method). Though, it might not be the most efficient. Cheers, Uri Joe Calderon wrote: hello all, i have a question on the field collapsing patch, say i have an integer f

Re: changing dismax parser to not treat symbols differently

2009-09-30 Thread Mark Miller
Joe Calderon wrote: > how would i go about modifying the dismax parser to treat +/- as regular text? > Would be nice if there was a tiny simple method you could override for this, but: You should extend the dismax parser and override addMainQuery Where it calls SolrPluginUtils.partialEscape, c

changing dismax parser to not treat symbols differently

2009-09-30 Thread Joe Calderon
how would i go about modifying the dismax parser to treat +/- as regular text?

Webinar: Apache Solr 1.4 – Faster, Easier, and More Versatile than Ever

2009-09-30 Thread Erik Hatcher
Excuse the cross-posting and gratuitous marketing :) Erik My company, Lucid Imagination, is sponsoring a free and in-depth technical webinar with Erik Hatcher, one of our co-founders as Lucid Imagination, as well as co-author of Lucene in Action, and Lucene/Solr PMC member and com

Re: Writing optimized index to different storage?

2009-09-30 Thread Phillip Farber
Sorry, I should have given more background. We have, at the moment 3.8 million documents of 0.7MB/doc average so we have extremely large shards. We build about 400,000 documents to a shard resulting 200GB/shard. We are also using LVM snapshots to manage a snapshot of the shard which we serve

Re: Seattle / PNW Hadoop/Lucene/HBase Meetup, Wed Sep 30th

2009-09-30 Thread Nick Dimiduk
As Bradford is out of town this evening, I will take up the mantle of Person-on-Point. Contact me with questions re: tonight's gathering. See you tonight! -Nick 614.657.0267 On Mon, Sep 28, 2009 at 4:33 PM, Bradford Stephens < bradfordsteph...@gmail.com> wrote: > Hello everyone! > Don't forget

Seeking Solr/Nutch consultant in San Jose, CA

2009-09-30 Thread Leann Pereira
Hi, I am working with a SaaS vendor who is integrated with Nutch 0.9 and SOLR. We are looking for some help to migrate this to Nutch 1.0. The work involves: 1) We made changes to Nutch 0.9; these need to be ported to Nutch 1.0. 2) Configure SOLR integration with Nutch 1.0 3)

mergefactor=1 questions

2009-09-30 Thread Phillip Farber
In order to make maximal use of our storage by avoiding the dead 2x overhead needed to optimize the index we are considering setting mergefactor=1 and living with the slow indexing performance which is not a problem in our use case. Some questions: 1) Does mergefactor=1 mean that the size o

field collapsing sums

2009-09-30 Thread Joe Calderon
hello all, i have a question on the field collapsing patch, say i have an integer field called "num_in_stock" and i collapse by some other column, is it possible to sum up that integer field and return the total in the output, if not how would i go about extending the collapsing component to suppor
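
What the question asks for, a per-group total of "num_in_stock", can be illustrated outside Solr with a few lines of Python. This is not the field collapsing patch's code; it is a client-side sketch, and the "brand" field and the sample documents are assumptions made up for the example.

```python
from collections import defaultdict


def sum_by_group(docs, group_field, sum_field):
    """Sum an integer field over documents that share a group value."""
    totals = defaultdict(int)
    for doc in docs:
        # Documents without the summed field contribute 0.
        totals[doc[group_field]] += doc.get(sum_field, 0)
    return dict(totals)


# Hypothetical documents; only "num_in_stock" comes from the question.
docs = [
    {"brand": "acme", "num_in_stock": 3},
    {"brand": "acme", "num_in_stock": 4},
    {"brand": "zenith", "num_in_stock": 5},
]
print(sum_by_group(docs, "brand", "num_in_stock"))
```

A server-side version would compute the same totals while the collapse groups are being built, which is what the replies in this thread discuss.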

Re: Conditional deduplication

2009-09-30 Thread Mauricio Scheffer
See http://wiki.apache.org/solr/FieldCollapsing On Wed, Sep 30, 2009 at 4:41 PM, Michael wrote: > If I index a bunch of email documents, is there a way to say "show me all > email documents, but only one per To: email address" > so that if there are a total of 10 distinct To: fields in the corpus

Conditional deduplication

2009-09-30 Thread Michael
If I index a bunch of email documents, is there a way to say "show me all email documents, but only one per To: email address" so that if there are a total of 10 distinct To: fields in the corpus, I get back 10 email documents? I'm aware of http://wiki.apache.org/solr/Deduplication but I want to re
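
The desired result, one document per distinct To: address, can be sketched as a simple client-side filter. This only illustrates the expected output; it is not how Solr's deduplication or field collapsing is implemented, and the "to" field name and sample data are assumptions.

```python
def first_per_value(docs, field):
    """Keep the first document seen for each distinct value of a field."""
    seen = set()
    kept = []
    for doc in docs:
        key = doc.get(field)
        if key not in seen:
            seen.add(key)
            kept.append(doc)
    return kept


# Hypothetical email documents with an assumed "to" field.
emails = [
    {"id": 1, "to": "a@example.com"},
    {"id": 2, "to": "b@example.com"},
    {"id": 3, "to": "a@example.com"},
]
print(first_per_value(emails, "to"))
```

With 10 distinct To: values in the input, the function returns 10 documents, which is the behaviour the question describes.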

Re: NGramTokenFilter behaviour

2009-09-30 Thread Shalin Shekhar Mangar
On Wed, Sep 30, 2009 at 11:24 PM, wrote: > If I index the following text: "I live in Dublin Ireland where > Guinness is brewed" > > Then search for: duvlin > > Should Solr return a match? > > In the admin interface under the analysis section, Solr highlights > some NGram matches? > > When I enter

Re: NGramTokenFilter behaviour

2009-09-30 Thread Shalin Shekhar Mangar
On Wed, Sep 30, 2009 at 11:24 PM, wrote: > > Can someone please clarify what the purpose of the > NGramFilter/tokenizer is, if not to allow for > misspellings/morphological variation and also, what the correct > configuration is in terms of use at index/query time. > > If it is spellcheck you are

Re: Number of terms in a SOLR field

2009-09-30 Thread Fergus McMenemie
>Fergus McMenemie wrote: >>> Fergus McMenemie wrote: Hi all, I am attempting to test some changes I made to my DIH based indexing process. The changes only affect the way I describe my fields in data-config.xml, there should be no changes to the way the data is indexe

Re: n-Gram, only works with queries of 2 letters

2009-09-30 Thread aodhol
Has this issue been fixed yet? can anyone shed some light on what's going on here please. NGramming is critical to my app. I will have to look to something other than Solr if it's not possible to do :(

RE: NGramTokenFilter behaviour

2009-09-30 Thread Feak, Todd
My understanding of a NGramTokenizing is to help with languages that don't necessarily contain spaces as a word delimiter (Japanese et al). In that case bi-gramming is used to find words contained within a stream of unbroken characters. In that case, you want to find all of the bi-grams that you
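
The bi-gramming described above can be shown with a short sketch that splits a string into character n-grams and compares two words. It is not Solr's NGramTokenFilter, which is configured with minimum and maximum gram sizes in the schema; the function names here are hypothetical.

```python
def char_ngrams(text, n=2):
    """Return the character n-grams of text, in order."""
    return [text[i:i + n] for i in range(len(text) - n + 1)]


def shared_ngrams(a, b, n=2):
    """Return the sorted n-grams that two strings have in common."""
    return sorted(set(char_ngrams(a, n)) & set(char_ngrams(b, n)))


print(char_ngrams("dublin"))              # ['du', 'ub', 'bl', 'li', 'in']
print(shared_ngrams("dublin", "duvlin"))  # ['du', 'in', 'li']
```

"dublin" and "duvlin" share three of their five bigrams, which is why the analysis page can highlight partial matches. Whether a search actually returns the document depends on the index-time and query-time analyzer configuration, which this sketch does not model.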

Re: Adding data from nutch to a Solr index

2009-09-30 Thread Andrzej Bialecki
Sönke Goldbeck wrote: Alright, first post to this list and I hope the question is not too stupid or misplaced ... what I currently have: - a nicely working Solr 1.3 index with information about some entities e.g. organisations, indexed from an RDBMS. Many of these entities have an URL pointing a

NGramTokenFilter behaviour

2009-09-30 Thread aodhol
If I index the following text: "I live in Dublin Ireland where Guinness is brewed" Then search for: duvlin Should Solr return a match? In the admin interface under the analysis section, Solr highlights some NGram matches? When I enter the following query string into my browser address bar, I ge

Adding data from nutch to a Solr index

2009-09-30 Thread Sönke Goldbeck
Alright, first post to this list and I hope the question is not too stupid or misplaced ... what I currently have: - a nicely working Solr 1.3 index with information about some entities e.g. organisations, indexed from an RDBMS. Many of these entities have an URL pointing at further information,

Multi-valued field cache

2009-09-30 Thread wojtekpia
I want to build a FunctionQuery that scores documents based on a multi-valued field. My intention was to use the field cache, but that doesn't get me multiple values per document. I saw other posts suggesting UnInvertedField as the solution. I don't see a method in the UnInvertedField class that w

Questions about synonyms and highlighting

2009-09-30 Thread Nourredine K.
Hi, Can you please give me some answers for those questions : 1 - How can I get synonyms found for a keyword ? I mean i search "foo" and i have in my synonyms.txt file the following tokens : "foo, foobar, fee" (with expand = true) My index contains "foo" and "foobar". I want to display a

Re: Solr Porting to .Net

2009-09-30 Thread Mauricio Scheffer
Solr is a server that runs on Java and it exposes a http interface. SolrNet is a client library for .Net that connects to a Solr instance via its http interface. My experiment (let's call it SolrIKVM) is an attempt to run Solr on .Net. Hope that clears things up. On Wed, Sep 30, 2009 at 11:50 AM, A

Re: Where do I need to install Solr

2009-09-30 Thread Jérôme Etévé
Solr is a separate service, in the same way an RDBMS is a separate service. Whether you install it on the same machine as your webserver or not, it's logically separated from your server. Jerome. 2009/9/30 Claudio Martella : > Kevin Miller wrote: >> Does Solr have to be installed on the web server

Re: Showing few results for each category (facet)

2009-09-30 Thread Varun Gupta
Thanks Matt!! I will take a look at the patch for field collapsing. Thanks Marian for pointing that out. If the field collapse does not work then I will have to rely on solr caching. Thanks, Varun Gupta On Wed, Sep 30, 2009 at 1:44 AM, Matt Weber wrote: > So, you want to display 5 results fro

Re: Solr Porting to .Net

2009-09-30 Thread Antonio Calò
Hi guys, thanks for your prompt feedback. So, you are saying that SolrNet is just a wrapper written in C# that connects to Solr (still written in Java and running on IKVM)? Is my understanding correct? Regards Antonio 2009/9/30 Mauricio Scheffer > SolrNet is only a http client to Solr. >

Re: Where do I need to install Solr

2009-09-30 Thread Claudio Martella
Kevin Miller wrote: > Does Solr have to be installed on the web server, or can I install Solr > on a different server and access it from my web server? > > Kevin Miller > Web Services > > you can access it from your webserver (or browser) via HTTP/XML requests and responses. have a look at solr

Where do I need to install Solr

2009-09-30 Thread Kevin Miller
Does Solr have to be installed on the web server, or can I install Solr on a different server and access it from my web server? Kevin Miller Web Services

Re: init parameters for queryParser

2009-09-30 Thread Shalin Shekhar Mangar
On Wed, Sep 30, 2009 at 7:14 PM, Jérôme Etévé wrote: > Hi all, > > I've got my own query parser plugin defined thanks to the queryParser tag: > > > > The QParserPlugin class has got an init method like this: > public void init(NamedList args); > > Where and how do I put my args to be passed to i

Re: delay while adding document to solr index

2009-09-30 Thread Jérôme Etévé
Hi, - Try to let solr do the commits for you (setting up autocommit feature). (and stop committing after inserting one document). This should greatly improve the delays you're experiencing. - If you do not optimize, it's normal your index size only grows. Optimize once regularly when your load is
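
The advice to stop committing after every document can also be followed on the client side by sending documents in batches and committing once. The sketch below builds Solr XML update messages for that pattern. It is an alternative to the autocommit feature mentioned above (which is configured in solrconfig.xml), not an implementation of it, and the field names are assumptions.

```python
import xml.etree.ElementTree as ET


def build_add_message(docs):
    """Build one <add> update message containing every document in docs."""
    add = ET.Element("add")
    for doc in docs:
        doc_el = ET.SubElement(add, "doc")
        for name, value in doc.items():
            field = ET.SubElement(doc_el, "field", name=name)
            field.text = str(value)
    return ET.tostring(add, encoding="unicode")


# A single commit message is sent after the whole batch, not per document.
COMMIT_MESSAGE = "<commit/>"

# Hypothetical batch; "id" and "title" are assumed field names.
batch = [{"id": 1, "title": "first"}, {"id": 2, "title": "second"}]
messages = [build_add_message(batch), COMMIT_MESSAGE]
for message in messages:
    print(message)
```

Posting the two messages to the update handler replaces one commit per document with one commit per batch.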

Re: Solr Porting to .Net

2009-09-30 Thread Mauricio Scheffer
SolrNet is only a http client to Solr. I've been experimenting with IKVM but wasn't very successful... There seem to be some issues with class loading, but unfortunately I don't have much time to continue these experiments right now. In case you're interested in continuing this, here's the reposito

init parameters for queryParser

2009-09-30 Thread Jérôme Etévé
Hi all, I've got my own query parser plugin defined thanks to the queryParser tag: The QParserPlugin class has got an init method like this: public void init(NamedList args); Where and how do I put my args to be passed to init for my query parser plugin? I'm trying value1 value1

Re: search for non empty field

2009-09-30 Thread Erik Hatcher
field:[* TO *] matches documents that have one or more terms in that field. If your indexer is sending a value, it'll end up with a term. Note that changing from string to long requires reindexing, though that isn't the issue here. Erik On Sep 30, 2009, at 2:39 AM,
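
The open-ended range query described above can be assembled and URL-encoded as follows. The field name refFaseExp and the host are taken from the original question in this thread; the helper names are hypothetical, and the sketch only builds the request URL without sending it.

```python
from urllib.parse import urlencode


def non_empty_field_query(field):
    """Return a query matching documents with at least one term in field."""
    return f"{field}:[* TO *]"


def select_url(base_url, query, rows=10):
    """Build a /select URL with the query properly URL-encoded."""
    return f"{base_url}/select?{urlencode({'q': query, 'rows': rows})}"


query = non_empty_field_query("refFaseExp")
url = select_url("http://host.com", query, rows=0)
print(query)
print(url)
```

As the reply notes, a document whose indexer sent an empty value still has a term in the field, so it will match this query.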

Re: delay while adding document to solr index

2009-09-30 Thread swapna_here
thanks again for your immediate response yes, i am running the commit after a document is indexed here i don't understand why my index size is increased to 625MB(for the 10 documents) which was previously 250MB is this due to i have not optimized at all my index or since i am adding documen

Re: ${dataimporter.last_index_time} as an argument to newerThan in FileListEntityProcessor?

2009-09-30 Thread Shalin Shekhar Mangar
On Tue, Sep 29, 2009 at 11:43 PM, Shalin Shekhar Mangar < shalinman...@gmail.com> wrote: > On Tue, Sep 29, 2009 at 8:14 PM, Bill Dueber wrote: > >> Is this possible? I can't figure out a syntax that works, and all the >> examples show using last_index_time as an argument to an SQL query. >> >> >

Re: delay while adding document to solr index

2009-09-30 Thread Pravin Paratey
Swapna While the disk space does increase during the process of optimization, it should almost always return to the original size or slightly less. This is a silly question. But off the top of my head, I can't think of any other reason why the index size would increase - Are you running a after

Re: delay while adding document to solr index

2009-09-30 Thread swapna_here
thanks for your reply i have not optimized at all my knowledge is optimize improves the query performance but it will take more disk space except that i have no idea how to use it previously for 10 documents the size occupied was around 250MB But after 2 months it is 625MB why this happened

Re: delay while adding document to solr index

2009-09-30 Thread Pravin Paratey
Also, what is your merge factor set to? Pravin 2009/9/30 Pravin Paratey : > Swapna, > > Your answers are inline. > > 2009/9/30 swapna_here : >> >> hi all, >> >> I have indexed 10 documents (daily around 5000 documents will be indexed >> one at a time to solr) >> at the same time daily few(aro

Re: Solr Porting to .Net

2009-09-30 Thread Pravin Paratey
You may want to check out - http://code.google.com/p/solrnet/ 2009/9/30 Antonio Calò : > Hi All > > I'm wondering if is already available a Solr version for .Net or if it is > still under development/planning. I've searched on Solr website but I've > found only info on Lucene .Net project. > > Bes

Solr Porting to .Net

2009-09-30 Thread Antonio Calò
Hi All I'm wondering if is already available a Solr version for .Net or if it is still under development/planning. I've searched on Solr website but I've found only info on Lucene .Net project. Best Regards Antonio -- Antonio Calò -- Software Developer E

Re: delay while adding document to solr index

2009-09-30 Thread Pravin Paratey
Swapna, Your answers are inline. 2009/9/30 swapna_here : > > hi all, > > I have indexed 10 documents (daily around 5000 documents will be indexed > one at a time to solr) > at the same time daily few(around 2000) indexed documents (added 30 days > back) will be deleted using DeleteByQuery of

Re: search for non empty field

2009-09-30 Thread Jorge Agudo Praena
Hi, i'm not having the expected results when using [* TO *], the results are including empty fields. Here is my configuration: schema.xml: bean: @Field private List refFaseExp= new ArrayList(); query: http://host.com/select?rows=0&facet=true&facet.field=refFaseExp&q=*:* AND refFaseExp:[* TO *]

Invalid response with search key having numbers

2009-09-30 Thread con
Hi all I am getting incorrect results when i search with numbers only or string containing numbers. when such a search is done, all the results in the index is returned, irrespective of the search key. For eg, the phone number field is mapped to TextField. it can contains values like , 653-23345

Re: Problem getting Solr home from JNDI in Tomcat

2009-09-30 Thread Andrew Clegg
hossman wrote: > > > : Hi all, I'm having problems getting Solr to start on Tomcat 6. > > which version of Solr? > > Sorry -- a nightly build from about a month ago. Re. your other message, I was sure the two machines had the same version on, but maybe not -- when I'm back in the office tom

Re: Number of terms in a SOLR field

2009-09-30 Thread Andrzej Bialecki
Fergus McMenemie wrote: Fergus McMenemie wrote: Hi all, I am attempting to test some changes I made to my DIH based indexing process. The changes only affect the way I describe my fields in data-config.xml, there should be no changes to the way the data is indexed or stored. As a QA check I

Re: Number of terms in a SOLR field

2009-09-30 Thread Fergus McMenemie
>Fergus McMenemie wrote: >> Hi all, >> >> I am attempting to test some changes I made to my DIH based >> indexing process. The changes only affect the way I >> describe my fields in data-config.xml, there should be no >> changes to the way the data is indexed or stored. >> >> As a QA check I wa

delay while adding document to solr index

2009-09-30 Thread swapna_here
hi all, I have indexed 10 documents (daily around 5000 documents will be indexed one at a time to solr) at the same time daily few(around 2000) indexed documents (added 30 days back) will be deleted using DeleteByQuery of SolrJ Previously each document used to be indexed within 5ms.. but rece

Re: Create new core on the fly

2009-09-30 Thread Shalin Shekhar Mangar
On Wed, Sep 30, 2009 at 3:48 AM, djain101 wrote: > > Hi Shalin, > > Can you please elaborate, why we need to do unload after create? No you don't need to. You can unload if you want to for some reasons. > So, if we > do a create, will it modify the solr.xml everytime? Can it be avoided in > s