FW: What is the format of data contained in a Named List?

2009-07-16 Thread Amandeep Singh09
Hi, Thanks for your reply. But I need one clarification. When you say it will contain the data you requested for, do you mean the data as requested in fl parameter of the query? Thanks. Aman -Original Message- From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble

Re: What is the format of data contained in a Named List?

2009-07-16 Thread Noble Paul നോബിള്‍ नोब्ळ्
The contents of a SolrDocument are fixed. It will only contain the data that you requested. On Fri, Jul 17, 2009 at 10:36 AM, Kartik1 wrote: > > A named list contains a key value pair. At the very basic level, if we want > to access the data that is contained in named list > > NamedList foo = t

What is the format of data contained in a Named List?

2009-07-16 Thread Kartik1
A named list contains key-value pairs. At the very basic level, if we want to access the data that is contained in a named list: NamedList foo = thisIsSolrQueryResponseObject.getValues(); Entry<String, Object> bar = null; // Creating an iterator to iterate through the response Iterator<Entry<String, Object>> It = foo.iterator(); while (
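For readers following this thread, here is a minimal sketch of walking name/value pairs with a for-each loop instead of an explicit iterator. It deliberately uses a plain `List<Map.Entry<String, Object>>` as a stand-in for Solr's `NamedList`, so that it runs without Solr on the classpath. The class name `NamedListWalk`, the method `describe`, and the sample keys are hypothetical and are not taken from the original message.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class NamedListWalk {
    // Stand-in for a NamedList: an ordered list of name/value pairs.
    // Each value can be of any type, so the sketch reports the runtime type too.
    static List<String> describe(List<Map.Entry<String, Object>> pairs) {
        List<String> lines = new ArrayList<>();
        for (Map.Entry<String, Object> pair : pairs) {
            Object value = pair.getValue();
            String type = (value == null) ? "null" : value.getClass().getSimpleName();
            lines.add(pair.getKey() + " (" + type + ") = " + value);
        }
        return lines;
    }

    public static void main(String[] args) {
        // Hypothetical sample data, not a real Solr response.
        List<Map.Entry<String, Object>> pairs = new ArrayList<>();
        pairs.add(new SimpleEntry<String, Object>("status", 0));
        pairs.add(new SimpleEntry<String, Object>("QTime", 12));
        describe(pairs).forEach(System.out::println);
    }
}
```

Printing the runtime type of each value is often the quickest way to learn what a given response actually holds, since the values are not restricted to one type.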

Re: spellcheck with misspelled words in index

2009-07-16 Thread Peter Wolanin
I think you can just tell the spellchecker to only supply "more popular" suggestions, which would naturally omit these rare misspellings: spellcheck.onlyMorePopular=true -Peter On Wed, Jul 15, 2009 at 7:30 PM, Jay Hill wrote: > We had the same thing to deal with recently, and a great solution was posted > to the lis

Re: Wikipedia or reuters like index for testing facets?

2009-07-16 Thread Peter Wolanin
AWS provides some standard data sets, including an extract of all wikipedia content: http://developer.amazonwebservices.com/connect/entry.jspa?externalID=2345&categoryID=249 Looks like it's not being updated often, so this or another AWS data set could be a consistent basis for benchmarking? -Pe

Re: Distributed search has problem for facet

2009-07-16 Thread Yonik Seeley
Thanks for the bug report... this looks like an escaping bug. But, it looks like it stems from a really weird field name? facet.field=authorname: Shouldn't that be facet.field=authorname? -Yonik http://www.lucidimagination.com On Thu, Jul 16, 2009 at 8:14 PM, zehua wrote: > > I use two shards i

can i use solr to do this

2009-07-16 Thread Antonio Eggberg
Hi, every Solr document I have has a creation date, which is the default timestamp "NOW". What I'd like to know is how I can have facets like the following: Past 24 Hours (3) Past 7 days (23) Past 15 days (33) Past 30 days (59). Is this possible, i.e. a range query as a facet? Regards Anton _
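One way to get buckets like these is one `facet.query` parameter per range, using Solr date math. The sketch below only builds the parameter strings. The field name `timestamp` and the names `RecencyFacets` and `facetQueries` are assumptions for illustration, not details from the original message.

```java
import java.util.ArrayList;
import java.util.List;

final class RecencyFacets {
    // Builds one facet.query parameter per "past N days" bucket using Solr date math.
    static List<String> facetQueries(String dateField, int... days) {
        List<String> params = new ArrayList<>();
        for (int d : days) {
            String unit = (d == 1) ? "DAY" : "DAYS";
            params.add("facet.query=" + dateField + ":[NOW-" + d + unit + " TO NOW]");
        }
        return params;
    }

    public static void main(String[] args) {
        // "timestamp" is an assumed field name; substitute the real date field.
        for (String p : facetQueries("timestamp", 1, 7, 15, 30)) {
            System.out.println(p);
        }
    }
}
```

The values still need URL-encoding before they go into a request (spaces and brackets in particular), and the request also needs `facet=true`. Each `facet.query` comes back with its own count, which can then be labelled "Past 24 Hours", "Past 7 days", and so on by the client.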

Distributed search has problem for facet

2009-07-16 Thread zehua
I use two shards on two different machines. Here is my URL: http://machine1:8900/solr/select/?shards=machine1:8900/solr,machine2:8900/solr&q=body\::dell&start=0&rows=10&facet=true&facet.field=authorname This works great in 1.3.0. I just downloaded Solr 1.4 from trunk and it breaks. If I run the

Re: Lock timed out 2 worker running

2009-07-16 Thread Chris Hostetter
This is really odd. Just to clarify... 1) you are running a normal solr installation (in a servlet container) and using SolrJ to send updates to Solr from another application, correct? 2) Do you have any special custom plugins running? 3) do you have any other apps that might be attemptin

Re: Change in DocListAndSetNC not messing everything

2009-07-16 Thread Marc Sturlese
Hey Hoss, thanks for answering such a specific question. I realized that maybe I was not clear enough in my explanation, so I just posted it again in another thread, trying to give more detail and including the code of my hack in getDocListAndSetNC: http://www.nabble.com/Custom-funcionality-

Re: Create incremental snapshot

2009-07-16 Thread Chris Hostetter
: Thanks for the reply Asif. We have already tried removing the optimization : step. Unfortunately the commit command alone is also causing an identical : behaviour . Is there any thing else that we are missing ? the hardlinking behavior of snapshots is based on the files in the index directory,

Re: boosting MoreLikeThis results

2009-07-16 Thread Chris Hostetter
: I have a need to boost MoreLikeThis results. So far the only boost that I : have come across is boosting query field by using mlt.qf. But what I really : need is to use boost query and boost function like those in the : DisMaxRequestHandler. Is that possible at all either out-of-the-box or by

Re: Boosting for most recent documents

2009-07-16 Thread Chris Hostetter
: Does anyone know if Solr supports sorting by internal document ids, : i.e, like Sort.INDEXORDER in Lucene? If so, how? It does not. In Solr, the decision to make "score desc" the default search meant there is no way to request simple docId ordering. : Also, if anyone have any insight on if

Re: Change in DocListAndSetNC not messing everything

2009-07-16 Thread Chris Hostetter
: For testing, what I have done is do some hacks to SolrIndexSearcher's : getDocListAndSetNC function. I fill the ids array in my own order or I just : don't add some docs id's (and so change this id's array size). I have been : testing it and the performance is dramatically better than using the p

Re: Getting Facet Count of combination of fields

2009-07-16 Thread Brian Johnson
An interesting analogy for this feature is that you're doing a count(*) on a group by in SQL. While it's true that you can pre-compute these if you have a small set of combinations you know you want to show a priori, if you want to present a more dynamic customer experience, you need to be able t

Re: posting binary file and metadata in two separate documents

2009-07-16 Thread Chris Hostetter
: Subject: posting binary file and metadata in two separate documents there was some discussion a while back about the fact that you can push multiple "ContentStreams" to Solr in a single request, and while the existing handlers all just iterate over and process them separately, it would be *

Re: Filtering MoreLikeThis results

2009-07-16 Thread Chris Hostetter
: At least in trunk, if you request for: : http://localhost:8084/solr/core_A/mlt?q=id:7468365&fq=price[100 TO 200] : It will filter the MoreLikeThis results I think a big part of the confusion people have about this is the distinction between the MLT RequestHandler, and the MLT SearchComponent.

Re: Problems Issuing Parallel Queries with SolrJ

2009-07-16 Thread danben
Actually, it's obvious that the second case wouldn't work after looking at SimpleHttpConnectionManager. So my question boils down to being able to use a single CommonsHttpSolrServer in a multithreaded fashion. danben wrote: > > I have a running Solr (1.3) server that I want to query with SolrJ

Problems Issuing Parallel Queries with SolrJ

2009-07-16 Thread danben
I have a running Solr (1.3) server that I want to query with SolrJ, and I'm running a benchmark that uses a pool of 10 threads to issue 1000 random queries to the server. Each query executes 7 searches in parallel. My first attempt was to use a single instance of CommonsHttpSolrServer, using the

Re: Highlight arbitrary text

2009-07-16 Thread Jason Rutherglen
Interesting. Many sites don't store text in Lucene/Solr and so need a way to highlight text stored in a database (or some equivalent). They have two options: re-analyze the doc for the term positions, or access the term vectors from Solr and hand them to the client, which then performs the highlighting

Re: Any benefit from compressed object pointers? (java6u14)

2009-07-16 Thread Glen Newton
I am going to do some (large scale) indexing tests using Lucene & will post to both this and the Lucene list. More info on compressed pointers: http://wikis.sun.com/display/HotSpotInternals/CompressedOops -Glen Newton http://zzzoot.blogspot.com/search?q=lucene 2009/7/16 Kevin Peterson : > I not

Any benefit from compressed object pointers? (java6u14)

2009-07-16 Thread Kevin Peterson
I noticed that Ubuntu pushed java 6 update 14 as an update to 9.04 today. This update includes compressed object pointers which are designed to reduce memory requirements with 64bit JVMs. Has anyone experimented with this to see if it provides any benefit to Solr? If not, can anyone comment on wh

Re: Solr 1.4 Release Date

2009-07-16 Thread Mark Miller
Agreed! We are pushing towards it - one of the holdups is that Lucene 2.9 is about to release, so we are waiting for that. We really need to prune down the JIRA list though. A few have been tackling it, but many of the issues are still up in the air. I think once Lucene 2.9 releases though, Solr 1

Re: Multivalued fields and scoring/sorting

2009-07-16 Thread Peter Wolanin
Assuming that you know the unique ID when constructing the query (which it sounds like you do) why not try a boost query with a high boost for 2 and a lower boost for 1 - then the default sort by score should match your desired ordering, and this order can be further tweaked with other bf or bq ar

Re: Word frequency count in the index

2009-07-16 Thread Walter Underwood
I haven't researched old versions of Lucene, but I think it has always been a vector space, tf.idf engine. I don't see any hint of probabilistic scoring. A bit of background about stop words and idf. They are two versions of the same thing. Stop words are a manual, on/off decision about what word

Re: DefaultSearchField ? "important"

2009-07-16 Thread Mani Kumar
On Thu, Jul 16, 2009 at 12:33 AM, Erik Hatcher wrote: > > On Jul 15, 2009, at 2:59 PM, Mani Kumar wrote: > >> @mark, @otis: >> > > Can I answer too? :) > your welcome :) ... thanks > > > yeah copying all the fields to one text field will work but what if i want >> to assign specific weightage

RE: Solr 1.4 Release Date

2009-07-16 Thread Daniel Alheiros
Come on it's time to cut this release, folks! I'm just waiting for that since it was forecasted for early summer. :) Cheers -Original Message- From: Otis Gospodnetic [mailto:otis_gospodne...@yahoo.com] Sent: 15 July 2009 02:18 To: solr-user@lucene.apache.org Subject: Re: Solr 1.4 Release

RE: How to filter old revisions

2009-07-16 Thread Daniel Alheiros
Hi Are you ever going to search for earlier revisions or only the latest? If in your use cases you need the latest, just replace earlier revisions with the latest on your index Regards, Daniel -Original Message- From: Reza Safari [mailto:r.saf...@lukkien.com] Sent: 15 July 2009 12

Re: Dedicated Slave Master

2009-07-16 Thread wojtekpia
Hey Grant, It's a middleman, not a backup. We don't have any issues in the current setup, just trying to make sure we have a solution in case this becomes an issue. I'm concerned about a situation with dozens of searchers. The i/o and network load on the indexer might become significant at that p

RE: Word frequency count in the index

2009-07-16 Thread Daniel Alheiros
Hi Walter, Has it always been there? Which version of Lucene are we talking about? Regards, Daniel -Original Message- From: Walter Underwood [mailto:wunderw...@netflix.com] Sent: 16 July 2009 15:04 To: solr-user@lucene.apache.org Subject: Re: Word frequency count in the index Lucene us

Re: Word frequency count in the index

2009-07-16 Thread Otis Gospodnetic
Plus there is a single class that you can run from the command line in Lucene's contrib. I think it's called HighFreqTerms or something close to that. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Grant Ingersoll > To: solr-user@lucen

Re: Highlight arbitrary text

2009-07-16 Thread Erik Hatcher
On Jul 16, 2009, at 7:41 AM, Shalin Shekhar Mangar wrote: On Thu, Jul 16, 2009 at 4:52 PM, Anders Melchiorsen wrote: What we want to do is to have an extra text highlighted by the Solr highlighter. That text should never be stored in the Solr index, but rather be provided in an HTTP reque

Re: wildcards and German umlauts

2009-07-16 Thread solenweg
Hi, I've got the same problem: searching using wildcards and umlaut -> no results. Just as you described it: "if i type complete word (such as "übersicht"). But there are no hits, if i use wildcards (such as "über*") Searching with wildcards and without umlauts works as well." Anyone found the

Re: Word frequency count in the index

2009-07-16 Thread Walter Underwood
Lucene uses a tf.idf relevance formula, so it automatically finds common words (stop words) in your documents and gives them lower weight. I recommend not removing stop words at all and letting Lucene handle the weighting. wunder On 7/16/09 3:29 AM, "Pooja Verlani" wrote: > Hi, > > Is there an
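To make the weighting argument concrete, here is a simplified sketch of tf.idf in the style of Lucene's classic similarity (square-root term frequency, logarithmic inverse document frequency). It leaves out the other factors that real scoring applies, such as length norms and boosts. The class name `TfIdfSketch` and the sample numbers are made up for illustration.

```java
final class TfIdfSketch {
    // Inverse document frequency: 1 + ln(numDocs / (docFreq + 1)).
    // Terms that occur in almost every document get a value close to 1.
    static double idf(int docFreq, int numDocs) {
        return 1.0 + Math.log((double) numDocs / (docFreq + 1));
    }

    // Term frequency dampened with a square root, then scaled by idf.
    static double weight(int termFreq, int docFreq, int numDocs) {
        return Math.sqrt(termFreq) * idf(docFreq, numDocs);
    }

    public static void main(String[] args) {
        int numDocs = 1000;
        // Same in-document frequency; only the document frequency differs.
        System.out.println("common term: " + weight(4, 990, numDocs));
        System.out.println("rare term:   " + weight(4, 10, numDocs));
    }
}
```

With the same in-document frequency, a term found in 990 of 1000 documents ends up with a much smaller weight than one found in 10, which is the effect described above: very common words are discounted without being removed.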

Re: Multivalued fields and scoring/sorting

2009-07-16 Thread Matt Schraeder
The first number is a unique ID that points to a particular customer, the second is a value. It basically tells us whether or not a customer already has that product. The main use of it is to be able to search our product listing for products the customer does not already have. The alter

Re: Getting Facet Count of combination of fields

2009-07-16 Thread Erik Hatcher
On Jul 16, 2009, at 4:35 AM, Koji Sekiguchi wrote: ashokcz wrote: Hi all, i have a scenario where i need to get facet count for combination of fields. Say i have two fields Manufacturer and Year of manufacture. I search for something and it gives me 15 results and my facet count as like

Re: Highlight arbitrary text

2009-07-16 Thread Shalin Shekhar Mangar
On Thu, Jul 16, 2009 at 4:52 PM, Anders Melchiorsen wrote: > > What we want to do is to have an extra text highlighted by the Solr > highlighter. That text should never be stored in the Solr index, but rather > be provided in an HTTP request along with the search query. > > Is this possible? > I

Re: Highlight arbitrary text

2009-07-16 Thread Anders Melchiorsen
On Wed, 15 Jul 2009 11:54:22 +0200, Anders Melchiorsen wrote: > Is it possible to have Solr highlight an arbitrary text that is posted at > request time? Hi again. I wonder whether my question was too terse to be well understood. What we want to do is to have an extra text highlighted by the S

Re: Dedicated Slave Master

2009-07-16 Thread Grant Ingersoll
Hi Wojtek, Is this a backup or is it a middleman? I can't say that I have seen the middleman approach before, but that doesn't mean it won't work. Are you actually having an issue with the current setup or just trying to make sure you don't in the future? -Grant On Jul 15, 2009, at 1:3

Re: Word frequency count in the index

2009-07-16 Thread Grant Ingersoll
In the trunk version, the TermsComponent should give you this: http://wiki.apache.org/solr/TermsComponent . Also, you can use the LukeRequestHandler to get the top words in each field. Alternatively, you may just want to point Luke at your index. On Jul 16, 2009, at 6:29 AM, Pooja Verlani w

Word frequency count in the index

2009-07-16 Thread Pooja Verlani
Hi, is there any way in Solr to know the count of each word indexed in Solr? I want to find out the different word frequencies to figure out 'application-specific stop words'. Please let me know if it's possible. Thank you, Regards, Pooja

Re: Getting Facet Count of combination of fields

2009-07-16 Thread ashokcz
hmmm but in my case it will be dynamic . they may choose different fields at run time and accordingly i need to populate the values ... Avlesh Singh wrote: > > If you create a field called "brand_year_of_manufacturing" and populate it > with the "brandName - YOM" data while indexing, you can ach

Re: Getting Facet Count of combination of fields

2009-07-16 Thread ashokcz
hmmm thanks Koji :handshake: will try my hands on 1.4 version and see my luck =^D Koji Sekiguchi-2 wrote: > > ashokcz wrote: >> Hi thanks "Koji Sekiguchi-2" for your reply . >> Ya i was looking something like that . >> So when doing the solrrequest is should have extra config parameters >> facet.

Re: Getting Facet Count of combination of fields

2009-07-16 Thread Avlesh Singh
If you create a field called "brand_year_of_manufacturing" and populate it with the "brandName - YOM" data while indexing, you can achieve the desired result with a simple facet on this field. Cheers Avlesh On Thu, Jul 16, 2009 at 1:19 PM, ashokcz wrote: > > Hi all, > i have a scenario where i need to
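A minimal sketch of the combined-field idea: build the "brandName - YOM" value at index time, then count on it the way a facet on that single field would. Nothing here calls Solr. The class `CombinedFacet`, its methods, and the sample documents are hypothetical.

```java
import java.util.Map;
import java.util.TreeMap;

final class CombinedFacet {
    // Value for the combined field, following the "brandName - YOM" pattern.
    static String combinedValue(String brand, int year) {
        return brand + " - " + year;
    }

    // Counts documents per combined value, as a facet on that field would.
    // Each doc is a {brand, year} pair of strings.
    static Map<String, Integer> count(String[][] docs) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] doc : docs) {
            counts.merge(combinedValue(doc[0], Integer.parseInt(doc[1])), 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        // Made-up sample documents.
        String[][] docs = { {"Nokia", "2007"}, {"Nokia", "2007"}, {"Motorola", "2008"} };
        System.out.println(count(docs));
    }
}
```

The trade-off is that the combination has to be chosen before indexing, so each new pair of fields to facet on together needs its own combined field and a reindex.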

Re: Getting Facet Count of combination of fields

2009-07-16 Thread Koji Sekiguchi
ashokcz wrote: Hi thanks "Koji Sekiguchi-2" for your reply . Ya i was looking something like that . So when doing the solrrequest is should have extra config parameters facet.tree and i shud give the fields csv to specify the hierarchy will try and see if its giving me desired results . But

Re: Getting Facet Count of combination of fields

2009-07-16 Thread ashokcz
Hi thanks "Koji Sekiguchi-2" for your reply . Ya i was looking something like that . So when doing the solrrequest is should have extra config parameters facet.tree and i shud give the fields csv to specify the hierarchy will try and see if its giving me desired results . But just one doubt i

Re: Getting Facet Count of combination of fields

2009-07-16 Thread Koji Sekiguchi
ashokcz wrote: Hi all, i have a scenario where i need to get facet count for combination of fields. Say i have two fields Manufacturer and Year of manufacture. I search for something and it gives me 15 results and my facet count as like this : Manufacturer : Nokia(5);Motorola(7);iphone(

Getting Facet Count of combination of fields

2009-07-16 Thread ashokcz
Hi all, i have a scenario where i need to get facet count for combination of fields. Say i have two fields Manufacturer and Year of manufacture. I search for something and it gives me 15 results and my facet count as like this : Manufacturer : Nokia(5);Motorola(7);iphone(3) Year of manuf