facet results in order of rank

2009-04-23 Thread ristretto.rb
Hello, Is it possible to order the facet results on some ranking score? I've had a look at the facet.sort param, (http://wiki.apache.org/solr/SimpleFacetParameters#head-569f93fb24ec41b061e37c702203c99d8853d5f1) but that seems to order the facet either by count or by index value (in my case alphabe

How to indexing non-english html text in unicode with Solr?

2009-04-23 Thread ahmed baseet
Hi All, I'm trying to index some regional/non-eng html pages with Solr. I thought of indexing the corresponding unicode text for that page as Solr supports Unicode indexing, right? But I'm not able to extract Xml from the html page, because for posting to Solr we require Xml. Can anyone tell me any

Re: Custom score for a id field

2009-04-23 Thread Shalin Shekhar Mangar
On Thu, Apr 23, 2009 at 10:49 PM, Raju444us wrote: > > I have a requirement.I index a field "id" and a calculated score for that > field named "fieldScore". > > Note:I have many other fields which are also indexed.But only for this id > field i want a custom calculated score. > > So when I search

Re: Query | Solr conf and data (index) distribution using master slave configuration

2009-04-23 Thread Shalin Shekhar Mangar
On Thu, Apr 23, 2009 at 10:10 AM, Vicky_Dev wrote: > > 1. Please confirm whether the tag entry : > In solrconfig.xml should match for the Slave solr server / master solr > server in accordance to the scripts.conf configuration settings. Yes, dataDir in solrconfig.xml and scripts.conf should b

Re: how to reset the index in solr

2009-04-23 Thread sagi4
Thanks for your valuable suggestions. Can I get the rake task for clearing the Solr index, I mean "rake index::rebuild"? It would be very helpful and would also avoid deleting IDs manually. Regards, Sg.. Otis Gospodnetic wrote: > > > You can also delete it with "delete by query" using th

Re: autowarmcount how to check if cache has been warmed up

2009-04-23 Thread Shalin Shekhar Mangar
OK, let's try this: 1. Before a commit, check the stats page, see if the size is more than 5 2. Then call commit, and verify that the size is more than 5 If the original size was > 5, then you should have size > 5 after autowarming too. On Wed, Apr 22, 2009 at 2:57 PM, sunnyfr wrote: > > still

PageRank sort

2009-04-23 Thread Marcus Herou
Hi. I've posted before but here it goes again: I have BlogData data which is more or less 100% static but one field is not - the PageRank. I would like to sort on that field and on the Lucene list I got these answers. 1. Use two indexes and a ParallelReader 2. Use a FieldScoreQuery containing t

Re: Delete from Solr index...

2009-04-23 Thread Shalin Shekhar Mangar
On Thu, Apr 23, 2009 at 9:25 AM, lupiss wrote: > > hello again! > it's true, that command is the one that deletes an index; I already tried it and yes, that is how > I will delete the test records of my project. It would be good to know how to > delete it from the application via solrj. Regards, thanks :) > > hello again!

Re: Get date facet counts per month

2009-04-23 Thread Shalin Shekhar Mangar
On Thu, Apr 23, 2009 at 2:36 AM, Raju444us wrote: > > In the example on the wiki at gives the facet counts for date per day.How > should the query look like to get date facets by month. > > > > http://wiki.apache.org/solr/SimpleFacetParameters#head-068dc96b0dac1cfc7264fe85528d7df5bf391acd > > > H
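The reply above is cut off, so what follows is only a hedged sketch of one way to ask for monthly buckets, using the date faceting parameters (`facet.date`, `facet.date.start`, `facet.date.end`, `facet.date.gap`) described on the wiki page linked in the question. The base URL, the field name `timestamp` and the chosen date range are assumptions made for illustration.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def monthly_date_facet_url(base_url: str, date_field: str) -> str:
    """Build a select URL that facets a date field in one-month steps."""
    params = [
        ("q", "*:*"),
        ("rows", "0"),                           # only the facet counts are wanted
        ("facet", "true"),
        ("facet.date", date_field),
        ("facet.date.start", "NOW/YEAR"),        # assumed range: the current year
        ("facet.date.end", "NOW/YEAR+1YEAR"),
        ("facet.date.gap", "+1MONTH"),           # one bucket per month
    ]
    # urlencode escapes "+" as %2B; a raw "+" would be decoded as a space.
    return f"{base_url}?{urlencode(params)}"

# Hypothetical host and field name, for illustration only.
url = monthly_date_facet_url("http://localhost:8983/solr/select", "timestamp")
print(url)
```

The gap is the only value that differs from a per-day request; the escaping of the leading plus sign is the detail most easily missed when the URL is typed by hand.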

Re: replicated index files have incorrect timestamp

2009-04-23 Thread Noble Paul നോബിള്‍ नोब्ळ्
looks like a bug. https://issues.apache.org/jira/browse/SOLR-1126 On Fri, Apr 24, 2009 at 3:26 AM, Jeff Newburn wrote: > I have attached the output from our filelist below.  The slaves are on the > same version using the replication internal to solr 1.4.  All replicated > files are set to the dat

RE: OutofMemory on Highlightling

2009-04-23 Thread Gargate, Siddharth
I am not sure whether lazy loading should help solve this problem. I have set enableLazyFieldLoading to true but it is not helping. I went through the code and observed that DefaultSolrHighlighter.doHighlighting is reading all the documents and the fields for highlighting (In my case, 1 MB stored

Solr Performance bottleneck

2009-04-23 Thread Jon Bodner
Hi all, I am trying to solve a serious performance problem with our Solr search index. We're running under Solr 1.3. We've sharded our index into 4 shards. Index data is stored on a network mount that is accessed over Fibre Channel. Each document's text is indexed, but not stored. Each day,

Re: Access HTTP headers from custom request handler

2009-04-23 Thread Ryan McKinley
Right, you will have to build a new war with your own subclass of SolrDispatchFilter *rather* than using the packaged one. On Apr 23, 2009, at 12:34 PM, Noble Paul നോബിള്‍ नोब्ळ् wrote: nope. you must edit the web.xml and register the filter there On Thu, Apr 23, 2009 at 3:45 PM, Giovann

newbie question about indexing RSS feeds with SOLR

2009-04-23 Thread Tom H
Hi, I've just downloaded solr and got it working, it seems pretty cool. I have a project which needs to maintain an index of articles that were published on the web via rss feed. Basically I need to watch some rss feeds, and search and index the items to be searched. Additionally, I need to run

Re: replicated index files have incorrect timestamp

2009-04-23 Thread Jeff Newburn
I have attached the output from our filelist below. The slaves are on the same version using the replication internal to solr 1.4. All replicated files are set to the date Dec 31 1969 ? 0 1 ? ? _b7t.fdx 1240473795000 1248940 ? _b7t.nrm 1240473844000 27164362 ? _b7u.tii 1240502374000 12

Re: prefix matching

2009-04-23 Thread Grant Ingersoll
Hmm, did some poking around and this conversation rang a bell from the Lucene list; see http://www.lucidimagination.com/search/document/3e4ce083206664d2/ngrams_and_positions#3e4ce083206664d2 Looks like Lucene would need to solve LUCENE-1224 and LUCENE-1225. https://issues.apache.org/jira/browse

Re: Change boost of documents / single fields / external scoring ?

2009-04-23 Thread Marcus Herou
Could an ExternalFileField help me ? http://lucene.apache.org/solr/api/org/apache/solr/schema/ExternalFileField.html On Thu, Apr 23, 2009 at 10:01 PM, Marcus Herou wrote: > Hi. > > Confusing subject eh ? Trying to become a little clearer in a few > sentences. > > We have a Solr/Lucene index where

RE: storing xml - how to highlight hits in response?

2009-04-23 Thread Ensdorf Ken
> Yeah great idea, thanks. Does anyone know if there is code out there > that > will do this sort of thing? > Perhaps a much simpler option would be to use this: http://lucene.apache.org/solr/api/org/apache/solr/analysis/PatternReplaceFilterFactory.html with a regex of "<[^>]*>" or something lik
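As a rough illustration of what the suggested regex does, here is a minimal Python sketch that applies the same pattern, `<[^>]*>`, to a stored XML string. It only demonstrates the pattern; it is not the PatternReplaceFilterFactory itself, and the sample document is made up.

```python
import re

# Same pattern as quoted in the message: any "<...>" run is treated as a tag.
TAG_PATTERN = re.compile(r"<[^>]*>")

def strip_tags(raw_xml: str) -> str:
    """Remove everything that looks like an XML tag, leaving only the text."""
    return TAG_PATTERN.sub("", raw_xml)

# Hypothetical stored value, for illustration only.
sample = '<item id="42"><name>ipod shuffle</name></item>'
print(strip_tags(sample))
```

Note that attribute values disappear along with the tags, which is the point here: matches on tag names or attribute values can no longer end up inside the highlighted text.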

Re: Synonym file in a different location or loading synonyms from database

2009-04-23 Thread rajam
Thanks Otis. I tried putting the absolute path and it worked. But I wanted something configurable so that it can be changed if required (maybe through an admin interface?). In the meantime, another idea struck me: to maintain all the synonyms in a database. I tried writing a FilterFactory of my own for c

Re: Custom score for a id field

2009-04-23 Thread Marcus Herou
Did you find an answer to this ? On Thu, Apr 23, 2009 at 7:19 PM, Raju444us wrote: > > I have a requirement.I index a field "id" and a calculated score for that > field named "fieldScore". > > Note:I have many other fields which are also indexed.But only for this id > field i want a custom calcu

Re: modify SOLR scoring

2009-04-23 Thread Marcus Herou
Hi. I am interested in a topic very similar to yours. I want to modify the field named "score" and the document boost without reindexing all the fields, since that would take too much power. Please let me know if you find a solution to this. Kindly //Marcus On Thu, Apr 23, 2009 at 10:02 PM, Ensdorf

Re: storing xml - how to highlight hits in response?

2009-04-23 Thread Matt Mitchell
Yeah great idea, thanks. Does anyone know if there is code out there that will do this sort of thing? Matt On Thu, Apr 23, 2009 at 3:23 PM, Ensdorf Ken wrote: > > Hi, > > > > I'm storing some raw xml in solr (stored and non-tokenized). I'd like > > to > > highlight hits in the response, obviou

Re: replicated index files have incorrect timestamp

2009-04-23 Thread Akshay
You need to specify the index version number for which list of files is to be shown. The URL should be like this: http://:/solr/replication?command=filelist&indexversion= You can get the index version number from the URL: http://:/solr/replication?command=indexversion On Fri, Apr 24, 2009 at 1:10
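The two requests described above can be sketched as a small helper. The host, port and index version used below are placeholders chosen for illustration (the archive stripped the original values), and only the `/solr/replication` path and the `command` and `indexversion` parameters come from the message.

```python
from urllib.parse import urlencode

def replication_url(host: str, port: int, command: str, **extra) -> str:
    """Build a replication handler URL for the given command."""
    params = {"command": command, **extra}
    return f"http://{host}:{port}/solr/replication?{urlencode(params)}"

# Step 1: ask for the current index version.
version_url = replication_url("localhost", 8983, "indexversion")

# Step 2: pass that version back to list the files (made-up version number).
files_url = replication_url("localhost", 8983, "filelist",
                            indexversion=1234567890123)

print(version_url)
print(files_url)
```

On a multicore setup the core name would presumably sit in the path as well; that variation is not covered by this sketch.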

RE: modify SOLR scoring

2009-04-23 Thread Ensdorf Ken
I believe you can use a function query to do this: http://wiki.apache.org/solr/FunctionQuery if you embed the following in your query, you should get a boost for more recent date values: _val_:"ord(dateField)" Where "dateField" is the field name of the date you want to use. > -Original Me
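A minimal sketch of how the suggested clause might be attached to a query string follows. The `_val_:"ord(...)"` syntax is taken from the message; the base URL is an assumption, and the field names `fulltext` and `parutiondate` are borrowed from the schema in the original question.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

def boosted_query_url(base_url: str, user_query: str, date_field: str) -> str:
    """Append the function-query clause so newer dates add to the score."""
    q = f'{user_query} _val_:"ord({date_field})"'
    # urlencode takes care of escaping the quotes, colon and parentheses.
    return f"{base_url}?{urlencode({'q': q})}"

# Hypothetical host, for illustration only.
url = boosted_query_url("http://localhost:8983/solr/select",
                        "fulltext:solr", "parutiondate")
print(url)
```

Because the clause is part of `q`, its value is combined with the normal relevance score rather than replacing it, which matches the stated goal of keeping the Solr score and adding a date component.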

Change boost of documents / single fields / external scoring ?

2009-04-23 Thread Marcus Herou
Hi. Confusing subject eh ? Trying to become a little clearer in a few sentences. We have a Solr/Lucene index where each document is a Blog Entry. We have just implemented the PageRank algorithm for Blogs and are about to add a column to the index called score and perhaps adjust the document boost

modify SOLR scoring

2009-04-23 Thread Bertrand DUMAS-PILHOU
Hi everybody, I'm using SOLR with a schema (for example) like this: parutiondate, date, indexed, not stored; fulltext, stemmed, indexed, not stored. I know it's possible to order by one or more fields, but I want to order by score and modify the "score" formula. I want to keep the SOLR score but ad

Re: replicated index files have incorrect timestamp

2009-04-23 Thread Jeff Newburn
We see the exact same thing. Additionally, that url returns 404 on a multicore and gives an error when I add the core. 0 0 no indexversion specified -- Jeff Newburn Software Engineer, Zappos.com jnewb...@zappos.com - 702-943-7562 > From: Jian Han Guo > Reply-To: > Date: Wed, 22 Apr

RE: Highlight question

2009-04-23 Thread Bertrand DUMAS-PILHOU
Thanks a lot for your answer, I'm going to test and I will reply. Bertrand Ensdorf Ken wrote: > > Add the following parameters to the url: > > hl=true&hl.fl=xhtml > > http://wiki.apache.org/solr/HighlightingParameters > > > >> -Original Message- >> From: Bertrand DUMAS-PILHOU [mail
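For reference, the two parameters quoted above can be added to a request as in the sketch below. The parameters `hl=true` and `hl.fl=xhtml` come from the quoted reply; the base URL and the query term are assumptions for illustration.

```python
from urllib.parse import urlencode

def highlight_url(base_url: str, query: str) -> str:
    """Build a select URL with highlighting enabled on the xhtml field."""
    params = [("q", query), ("hl", "true"), ("hl.fl", "xhtml")]
    return f"{base_url}?{urlencode(params)}"

# Hypothetical host and query, for illustration only.
url = highlight_url("http://localhost:8983/solr/select", "solr")
print(url)
```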

RE: storing xml - how to highlight hits in response?

2009-04-23 Thread Ensdorf Ken
> Hi, > > I'm storing some raw xml in solr (stored and non-tokenized). I'd like > to > highlight hits in the response, obviously this is problematic as the > highlighting elements are also xml. So if I match an attribute value or > tag > name, the xml response is messed up. Is there a way to highli

storing xml - how to highlight hits in response?

2009-04-23 Thread Matt Mitchell
Hi, I'm storing some raw xml in solr (stored and non-tokenized). I'd like to highlight hits in the response, obviously this is problematic as the highlighting elements are also xml. So if I match an attribute value or tag name, the xml response is messed up. Is there a way to highlight only text,

Re: Control segment size

2009-04-23 Thread Otis Gospodnetic
Hi, You are looking for maxMergeDocs, I believe. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: vivek sar > To: solr-user@lucene.apache.org > Sent: Thursday, April 23, 2009 1:08:20 PM > Subject: Control segment size > > Hi, > > Is th

Custom score for a id field

2009-04-23 Thread Raju444us
I have a requirement. I index a field "id" and a calculated score for that field named "fieldScore". Note: I have many other fields which are also indexed, but only for this id field do I want a custom calculated score. So when I search for that id q="id:1234". What I want is in the results if I use re

Control segment size

2009-04-23 Thread vivek sar
Hi, Is there any configuration to control the segments' file size in Solr? Currently, I have an index (70G) with 80 segment files, and one of the files is 24G. We noticed that in some cases commit takes over 2 hours to complete (committing 50K records), whereas usually it finishes in 20 seconds. Aft

RE: Sorting dates with reduced precision

2009-04-23 Thread Ensdorf Ken
> >> Yes, but dates are fairly specific, say 06:45 Nov. 2 , 2009. What if > I > >> want to say "Sort so that within entries for Nov. 2 , you sort by > >> relevance" for example? > >> > > > > Append "/DAY" to the date value you index, for example > > > > "1995-12-31T23:59:59Z/DAY" will yield "1995-
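To make the effect of the rounding concrete, here is a small Python sketch that imitates it outside Solr: timestamps are truncated to midnight, so all entries from the same day tie on the date and a secondary sort on relevance can decide their order. The documents and scores are made up for illustration, and this only mimics the idea; it is not how Solr implements date math.

```python
from datetime import datetime

SOLR_FORMAT = "%Y-%m-%dT%H:%M:%SZ"

def round_down_to_day(solr_date: str) -> str:
    """Imitate the /DAY rounding: drop the time-of-day part of a timestamp."""
    parsed = datetime.strptime(solr_date, SOLR_FORMAT)
    return parsed.replace(hour=0, minute=0, second=0).strftime(SOLR_FORMAT)

# Hypothetical (date, relevance score) pairs.
docs = [
    ("2009-11-02T06:45:00Z", 0.4),
    ("2009-11-02T18:10:00Z", 0.9),
    ("2009-11-01T12:00:00Z", 0.7),
]

# Sort by rounded day first, then by descending score within the same day.
ranked = sorted(docs, key=lambda d: (round_down_to_day(d[0]), -d[1]))
for date, score in ranked:
    print(round_down_to_day(date), score)
```

Without the rounding, the two Nov. 2 entries would be ordered by their exact time, and the relevance score would never come into play.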

Re: Access HTTP headers from custom request handler

2009-04-23 Thread Noble Paul നോബിള്‍ नोब्ळ्
nope. you must edit the web.xml and register the filter there On Thu, Apr 23, 2009 at 3:45 PM, Giovanni De Stefano wrote: > Hello Hoss, > > thank you for your reply. > > I have no problems subclassing the SolrDispatchFilter...but where shall I > configure it? :-) > > I cannot find any doc/wiki ex

prefix matching

2009-04-23 Thread Tom Morton
Hi all, I'm trying to use prefixes to match similar strings to a query string. I have the following field type: field: copyField: If I apply this to an indexed string: "ipod shuffle" and query string: "shufle" (missing f) I get mat

Re: Synonym file in a different location

2009-04-23 Thread Otis Gospodnetic
Hi Raja, Try putting the absolute path to the synonyms file in the schema.xml. If that doesn't work you can always just use 'ln': http://unixhelp.ed.ac.uk/CGI/man-cgi?ln Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: rajam > To: solr

Synonym file in a different location

2009-04-23 Thread rajam
Hi All, I am trying to use synonyms in my project. I would like to know whether it is possible to pick the synonyms.txt file from a configurable location. Ideally I would like to specify the location in a properties file and make solr read it to load the synonyms file. Could any one please let me

Re: MLT for sorting results?

2009-04-23 Thread Otis Gospodnetic
That is true *only if* you combine those 2 clauses with AND. It's not true with OR. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Shrutipriya > To: solr-user@lucene.apache.org > Sent: Wednesday, April 22, 2009 11:45:30 PM > Subject: R

Re: Some characters are searchable

2009-04-23 Thread Paul Libbrecht
Not with the default analyzers. But certainly with a WhitespaceAnalyzer. paul On 23 Apr 09 at 11:57, Koushik Mitra wrote: Hi, I am trying to search the following characters present through solr. `, @, #, $, %, _ , , , . But I am not getting any result back, even if those characters are

Re: Access HTTP headers from custom request handler

2009-04-23 Thread Giovanni De Stefano
Hello Hoss, thank you for your reply. I have no problems subclassing the SolrDispatchFilter...but where shall I configure it? :-) I cannot find any doc/wiki explaining how to configure a custom dispatch filter. I believe it should be in solrconfig.xml ... Any idea? Is there a schema for sol

Some characters are searchable

2009-04-23 Thread Koushik Mitra
Hi, I am trying to search the following characters present through solr. `, @, #, $, %, _ , , , . But I am not getting any result back, even if those characters are present in the document . So my question is are these characters getting indexed? Thanks, Koushik CAUTION - Dis

Re: autowarmcount how to check if cache has been warmed up

2009-04-23 Thread sunnyfr
It looks like it doesn't warm up, no? sunnyfr wrote: > > still the same ? > > Seems done : > lookups : 0 > hits : 0 > hitratio : 0.00 > inserts : 0 > evictions : 0 > size : 5 > warmupTime : 20973 > cumulative_lookups : 0 > cumulative_hits : 0 > cumulative_hitratio : 0.00 > cumulative_inserts