Re: Any way to modify result ranking using an integer field?

2010-01-03 Thread Andy
What I meant was: is there any way to make {!boost b=log(popularity)} the default query type, so that every query uses it? From: Andy Subject: Re: Any way to modify result ranking using an integer field? To: solr-user@lucene.apache.org Date: Monday, January 4, 2010, 1:08 AM Thanks
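
To illustrate what a boost of the form {!boost b=log(popularity)} does to ranking, here is a minimal Python sketch. It assumes the boost multiplies the text-relevance score by the function value and that log is base 10; the document names, scores, and the helper `boosted_score` are made up for illustration and are not part of Solr.

```python
import math


def boosted_score(relevance: float, popularity: int) -> float:
    """Combine relevance and popularity multiplicatively.

    Assumption: the boost multiplies the relevance score by
    log10(popularity), and popularity is at least 1.
    """
    return relevance * math.log10(popularity)


# Hypothetical documents: (id, relevance score, popularity)
docs = [
    ("a", 1.2, 10),
    ("b", 1.0, 1000),
    ("c", 1.5, 1),
]

# Sort by the combined score, highest first.
ranked = sorted(docs, key=lambda d: boosted_score(d[1], d[2]), reverse=True)
ranked_ids = [doc_id for doc_id, _, _ in ranked]
```

Under these assumptions a popularity of 1 yields a boost of 0, so document "c" drops to the bottom despite having the best text match. A multiplicative log boost may therefore need an offset for documents with low popularity.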

RE: Reverse sort facet query [SOLR-1672]

2010-01-03 Thread Chris Hostetter
: Yes, I thought about adding some 'new syntax', but I opted for a separate 'facet.sortorder' parameter, : : mainly because I'm not familiar enough with the codebase to know what effect this might have on : : backward compatibility. It would be easy enough to modify the patch I created to do

Search algorithm used in Solr

2010-01-03 Thread abhishes
Hello everyone, Is there an article that explains (at a high level) the search algorithm used in Solr? How does Solr's search approach compare to the "inverted index" technique? Regards, Abhishek --Original Message-- From: Mattmann, Chris A (388J) To: solr-user@lucene.apache.org ReplyTo: s

Re: Any way to modify result ranking using an integer field?

2010-01-03 Thread Andy
Thanks Ahmet. Do I need to do anything to enable BoostQParserPlugin in Solr, or is it already enabled? --- On Sun, 1/3/10, Ahmet Arslan wrote: From: Ahmet Arslan Subject: Re: Any way to modify result ranking using an integer field? To: solr-user@lucene.apache.org Date: Sunday, January 3, 2010

Re: performance question

2010-01-03 Thread A. Steven Anderson
> > dynamic fields don't make it worse ... the number of actual field names > you sort on makes it worse. > > If you sort on 100 fields, the cost is the same regardless of whether all > 100 of those fields exist because of a single declaration, > or 100 distinct declarations. > Ahh...thanks for t

Re: performance question

2010-01-03 Thread Chris Hostetter
: > If you sort on many of your dynamic fields your memory use will : > explode, and the same with index norms and disk space. : Thanks for the info. In general, I knew sorting was expensive, but I didn't : realize that dynamic fields made it worse. dynamic fields don't make it worse ... the nu

Re: performance question

2010-01-03 Thread A. Steven Anderson
> Sorting and index norms have space penalties. > Sorting on a field creates an array of Java ints, one for every > document in the index. Index norms (used for boosting documents and > other things) create an array of bytes in the Lucene index files, one > for every document in the index. > If you
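
The space penalties described in this message can be estimated with simple arithmetic. The sketch below assumes one 4-byte Java int per document for each field sorted on, and one norm byte per document for each field that keeps norms (the per-field multiplication is an assumption, not stated in the message). The function names are hypothetical.

```python
JAVA_INT_BYTES = 4  # size of a Java int


def sort_array_bytes(num_docs: int, num_sort_fields: int) -> int:
    """Memory for the sort arrays: one int per document per sorted field."""
    return num_docs * num_sort_fields * JAVA_INT_BYTES


def norms_bytes(num_docs: int, num_fields_with_norms: int) -> int:
    """Space for norms: one byte per document per field with norms (assumed)."""
    return num_docs * num_fields_with_norms


# Example: 10 million documents, sorting on 100 distinct field names.
sort_cost = sort_array_bytes(10_000_000, 100)
norm_cost = norms_bytes(10_000_000, 100)
```

With these assumed numbers the sort arrays alone come to 4,000,000,000 bytes, which is consistent with the point made in the thread that the cost depends on how many field names are actually sorted on, not on how they were declared.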

Rules engine and Solr

2010-01-03 Thread Avlesh Singh
I have a Solr (version 1.3) powered search server running in production. Search is keyword driven and is supported by custom fields and tokenizers. I am planning to build a rules engine on top of search. The rules are database driven and can't be stored inside Solr indexes. These rules would ultimatel

Re: Indexing the latests MS Office documents

2010-01-03 Thread Mattmann, Chris A (388J)
Hi Roland, You probably want to send your email to tika-u...@lucene.apache.org. Best of luck! Cheers, Chris On 1/3/10 4:00 PM, "Roland Villemoes" wrote: > Hi All, > > Anyone who knows how to index the latest MS office documents like .docx and > .xlsx ? > > From searching it seems like Ti

Re: SOLR: Replication

2010-01-03 Thread Yonik Seeley
On Sun, Jan 3, 2010 at 2:55 PM, Peter Wolanin wrote: > Related to the difference between rsync and native Solr replication - > we are seeing issues with Solr 1.4 where search queries that come in > during a replication request hang for excessive amount of time (up to > 100's of seconds for a resul

Indexing the latests MS Office documents

2010-01-03 Thread Roland Villemoes
Hi All, Does anyone know how to index the latest MS Office documents, such as .docx and .xlsx? From searching, it seems that Tika only supports the earlier formats .doc and .xls. Best regards, Roland Villemoes Tel: (+45) 22 69 59 62 E-Mail: mailto:r...@alpha-solutions.dk

Re: Any way to modify result ranking using an integer field?

2010-01-03 Thread Ahmet Arslan
> Is there any way to modify result > ranking using an integer field? > > I have documents that have an integer field "popularity". > > I want to rank results by a combination of normal fulltext > search > relevance and popularity. It's kinda like search in digg - > result > ranking is based

Any way to modify result ranking using an integer field?

2010-01-03 Thread Andy
Is there any way to modify result ranking using an integer field? I have documents that have an integer field "popularity". I want to rank results by a combination of normal fulltext search relevance and popularity. It's kinda like search in digg - result ranking is based on the search releva

Re: Remove the deleted docs from the Solr Index

2010-01-03 Thread Ravi Gidwani
Lance: At times we don't have the freedom to make these database changes. Currently I am in this situation. Hence the requirement on the DIH. ~Ravi. On Sat, Jan 2, 2010 at 3:44 PM, Lance Norskog wrote: > The other option is to have a 'deleted' column in your table, and have > the applica

Re: SOLR Performance Tuning: Pagination

2010-01-03 Thread Peter Wolanin
At the NOVA Apache Lucene/Solr Meetup last May, one of the speakers from Near Infinity (Aaron McCurry I think) mentioned that he had a patch for lucene that enabled unlimited depth memory-efficient paging. Is anyone in contact with him? -Peter On Thu, Dec 24, 2009 at 11:27 AM, Grant Ingersoll w

Re: SOLR: Replication

2010-01-03 Thread Peter Wolanin
Related to the difference between rsync and native Solr replication - we are seeing issues with Solr 1.4 where search queries that come in during a replication request hang for an excessive amount of time (up to 100's of seconds for a result that normally takes ~50 ms). We are replicating pretty ofte

Re: Tokenizing problem with numbers in query

2010-01-03 Thread Erick Erickson
This is an *extremely* useful page for figuring out what various tokenizers/filters are doing. The javadocs for the classes referenced can also provide some additional details http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters Erick On Sun, Jan 3, 2010 at 11:26 AM, Bernd Brod wrote

Re: Tokenizing problem with numbers in query

2010-01-03 Thread Ahmet Arslan
> when searching for a string: "asdf5qwerty" solr will > tokenize it to: > "asdf", "5", "qwerty" and display documents matching either > string. > > How can i stop this behaviour and make it just search for > plain > "asdf5qwerty"? What is the type of your field? If you have solr.WordDelimiterFi

RE: SOLR: Replication

2010-01-03 Thread Fuad Efendi
Thank you Yonik, excellent wiki! I'll try without APR. I believe it's an environmental issue; a 100 Mbps switched network should be 10 times faster (current replication speed is 1 MByte/sec) > -Original Message- > From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik > Seeley > Sent: Januar

Tokenizing problem with numbers in query

2010-01-03 Thread Bernd Brod
Hello, when searching for the string "asdf5qwerty", Solr tokenizes it to "asdf", "5", "qwerty" and displays documents matching any of these strings. How can I stop this behaviour and make it search for plain "asdf5qwerty" only? Thanks in advance. Bernd
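
The splitting described in this question can be reproduced outside Solr with a small Python sketch. This is only an illustration of splitting a term at letter/digit boundaries versus keeping it whole; it is not Solr's analysis code, and both function names are hypothetical.

```python
import re


def split_on_letter_digit_boundaries(term: str) -> list[str]:
    """Break a term into runs of ASCII letters and runs of digits."""
    return re.findall(r"[A-Za-z]+|[0-9]+", term)


def keep_whole(term: str) -> list[str]:
    """Treat the term as a single token, as the poster wants."""
    return [term]


split_tokens = split_on_letter_digit_boundaries("asdf5qwerty")
whole_tokens = keep_whole("asdf5qwerty")
```

With the first function the query becomes three separate tokens, so any document containing just "5" could match. With the second, only the full string is searched for. Which behaviour Solr applies depends on the analysis chain configured for the field type.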

Re: SOLR: Replication

2010-01-03 Thread Yonik Seeley
On Sat, Jan 2, 2010 at 11:35 PM, Fuad Efendi wrote: > I tried... I set APR to improve performance... server is slow while replica; > but "top" shows only 1% of I/O wait... it is probably environment specific; So you're saying that stock tomcat (non-native APR) was also 10 times slower? > but the

Re: solrJ and spell check queries

2010-01-03 Thread Jay Fisher
Thank you. That did it. ~ Jay On Sun, Jan 3, 2010 at 7:21 AM, Sascha Szott wrote: > Hi, > > > Jay Fisher wrote: > >> I'm trying to find a way to formulate the following query in solrJ. This >> is >> the only way I can get the desired result but I can't figure out how to >> get >> solrJ to gener

Re: solrJ and spell check queries

2010-01-03 Thread Sascha Szott
Hi, Jay Fisher wrote: I'm trying to find a way to formulate the following query in solrJ. This is the only way I can get the desired result but I can't figure out how to get solrJ to generate the same query string. It always generates a url that starts with select and I need it to start with spe

Re: Using IDF to find Collactions and SIPs . . ?

2010-01-03 Thread Siddhartha Pahade
pl unsubscribe me On 12/28/09, Subscriptions wrote: > > I am trying to write a query analyzer to pull: > > > > 1. Common phrases (also known as Collocations) with in a query > > > > 2. Highly unusual phrases (also known as Statistically Improbable > Phrases or SIPs) with in a query > >