Normalizing multiple Chars with MappingCharFilter possible?

2009-11-24 Thread Andreas Kahl
Hello everyone, is it possible to normalize Strings like '`e' (2 chars) => 'e' (in contrast to 'é' (1 char) => 'e') with org.apache.lucene.analysis.MappingCharFilter? I am asking this because I am considering to index some multilingual and multi-alphabetic data with Solr which uses such Strings
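To make the question concrete, here is a minimal sketch of the behaviour being asked about: a greedy, longest-match replacement in which a two-character key ('`e') and a one-character key ('é') both map to 'e'. This is plain Python for illustration only, not Lucene's MappingCharFilter; the function name and the mapping table are hypothetical.

```python
def normalize(text, mapping):
    """Replace the longest matching key at each position, scanning left to right."""
    max_len = max((len(key) for key in mapping), default=0)
    out = []
    i = 0
    while i < len(text):
        # Try the longest candidate first so multi-char keys win over shorter ones.
        for size in range(min(max_len, len(text) - i), 0, -1):
            chunk = text[i:i + size]
            if chunk in mapping:
                out.append(mapping[chunk])
                i += size
                break
        else:
            # No key matches here: copy the character through unchanged.
            out.append(text[i])
            i += 1
    return "".join(out)


# Hypothetical mapping: a two-char sequence and a single precomposed char.
MAPPING = {"`e": "e", "é": "e"}
print(normalize("caf`e café", MAPPING))
```

Both spellings end up as the same normalized string, which is the outcome the question hopes to get from a char filter.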

Tomcat vs Jetty for a solr instance?

2009-11-24 Thread Kevin Jackson
Hi, We're running a high traffic (very high peak load) site with solr 1.3 (we can't upgrade to 1.4 just yet as we don't have capacity to remove one of our servers even for the 10 mins time it will take!) We're currently running the webapp deployed in tomcat 6.0.18. Most of the documentation ment

Re: Output all, from one field

2009-11-24 Thread Shalin Shekhar Mangar
On Tue, Nov 24, 2009 at 2:31 AM, Chris Hostetter wrote: > > : Do you want to return just one field from all documents? If yes, you can: > : > :1. Query with q=*:*&fl=name > :2. Use TermsComponent - http://wiki.apache.org/solr/TermsComponent > > note that those are very different creatures
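The first option quoted above (q=*:*&fl=name) can be sketched as a small URL builder. The base URL and the helper name are assumptions for illustration; only the q and fl parameters come from the message.

```python
from urllib.parse import urlencode


def build_select_url(base_url, field):
    """Build a select URL that matches every document but returns a single field."""
    params = {"q": "*:*", "fl": field}
    # urlencode percent-encodes '*' and ':' so the query survives transport intact.
    return base_url + "/select?" + urlencode(params)


# Assumed host and port; adjust to the actual Solr instance.
url = build_select_url("http://localhost:8983/solr", "name")
print(url)
```

Note that this returns the stored value of the field for each matching document, whereas TermsComponent returns indexed terms, which is the difference the reply is pointing at.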

Re: How to use DataImportHandler with ExtractingRequestHandler?

2009-11-24 Thread Shalin Shekhar Mangar
On Fri, Nov 20, 2009 at 9:13 PM, javaxmlsoapdev wrote: > > did you extend DIH to do this work? can you share code samples. I have > similar requirement where I need to index database records and each record > has a column with document path so need to create another index for > documents (we allo

Re: Tomcat vs Jetty for a solr instance?

2009-11-24 Thread Shalin Shekhar Mangar
On Tue, Nov 24, 2009 at 3:14 PM, Kevin Jackson wrote: > Hi, > > We're running a high traffic (very high peak load) site with solr 1.3 > (we can't upgrade to 1.4 just yet as we don't have capacity to remove > one of our servers even for the 10 mins time it will take!) > > We're currently running t

Re: error with multicore CREATE action

2009-11-24 Thread Shalin Shekhar Mangar
On Mon, Nov 23, 2009 at 11:49 PM, Chris Harris wrote: > Are there any use cases for CREATE where the instance directory > *doesn't* yet exist? I ask because I've noticed that Solr will create > an instance directory for me sometimes with the CREATE command. In > particular, if I run something lik

Re: creating Lucene document from an external XML file.

2009-11-24 Thread Phanindra Reva
Hello..., Thank you both for patiently reading and understanding my question. // " you already have code that builds up files in the "..." update message syntax solr expects, but you want to modify those documents (wi/o changing your existing code) .. " .. // yeah.. I al

Re: Normalizing multiple Chars with MappingCharFilter possible?

2009-11-24 Thread Koji Sekiguchi
Andreas Kahl wrote: Hello everyone, is it possible to normalize Strings like '`e' (2 chars) => 'e' (in contrast to 'é' (1 char) => 'e') with org.apache.lucene.analysis.MappingCharFilter? I am asking this because I am considering to index some multilingual and multi-alphabetic data with Solr wh

Re: Huge load and long response times during search

2009-11-24 Thread Tomasz Kępski
Hi, : I'm using SOLR(1.4) to search among about 3,500,000 documents. After the : server kernel was updated to 64bit system has started to suffer. ...if the *only* thing that was upgraded was switching the kernel from 32bit to 64bit, then perhaps you are getting bit by java now using 64 bit po

Re: solr+jetty logging to syslog?

2009-11-24 Thread Shalin Shekhar Mangar
On Sun, Nov 22, 2009 at 2:39 AM, Steve Conover wrote: > Does no one send solr logging to syslog? > > On Thu, Nov 19, 2009 at 5:54 PM, Steve Conover wrote: > > The solution involves slf4j to log4j to syslog (at least, for solr), > > but I'm having some trouble stringing all the parts together. I

Re: initiate reindexing in solr for field type changes

2009-11-24 Thread Shalin Shekhar Mangar
On Thu, Nov 19, 2009 at 4:50 AM, darniz wrote: > > Thanks > Could you elaborate what is compatible schema change. > Do you mean schema change which deals only with query time. > > A compatible schema change would be addition of new fields. Removal of fields may also be called compatible as long a

Re: help with dataimport delta query

2009-11-24 Thread Joel Nylund
Thanks that was it, well really this part: ${dataimporter.delta.job_jobs_id} I thought the jobs_id was part of the DIH, but I guess it was just the example, duh! thanks Joel --- On Tue, 11/24/09, Noble Paul നോബിള്‍ नोब्ळ् wrote: > From: Noble Paul നോബിള്‍ नोब्ळ् > Subject: Re: help with

Turning down logging for SOLR running on Weblogic

2009-11-24 Thread DEO, SHANTANU S (ATTCINW)
Hi We recently started a SOLR instance running under Weblogic and noticed that there are a lot of DEBUG messages being output, that we did not notice before when we used tomcat. Where can we turn this logging level down ? Thanks Shantanu AT&T eCommerce Web Hosting - Release Management Office: (

Re: [N to M] range search out of sum of field. howto search this?

2009-11-24 Thread Julian Davchev
Hi, You got right what I am after. Seems I will have to find a workaround for this one. Also I am still stuck on 1.3 so.. Thanks a lot JD Chris Hostetter wrote: > : fq={!frange l=5 u=10}sum(user,num) > > H, One of us massivly missunderstood the original question - and i'm > pretty sure it's Y

Re: Turning down logging for SOLR running on Weblogic

2009-11-24 Thread Mark Miller
DEO, SHANTANU S (ATTCINW) wrote: > Hi > We recently started a SOLR instance running under Weblogic and noticed > that there are a lot of DEBUG messages being output, that we did not > notice before when we used tomcat. > Where can we turn this logging level down ? > > Thanks > Shantanu > AT&T eC

Migrating to Solr

2009-11-24 Thread Tommy Molto
Hi, I'm new to Solr and I need to run a "test pilot" of a migration from Fast ESP to Apache Solr. Has anyone had this experience before? Att,

Re: Boost document base on field length

2009-11-24 Thread Tomasz Kępski
Hi, I think I'm reading the question differently than Grant -- his suggestion applies when you are searching in the description field, and don't want documents with shorter descriptions to score higher when the same terms match the same number of times (the default behavior of lengthNorm) my

Re: Migrating to Solr

2009-11-24 Thread Lukáš Vlček
Hi, I think there were some links about FAST to Solr migration published recently. See: http://blog.isabel-drost.de/index.php/archives/110/moving-from-fast-to-solr However, as of writing those links are not working, not sure what happened... Regards, Lukas On Tue, Nov 24, 2009 at 2:55 PM, Tommy

Get one document from each category

2009-11-24 Thread Tomasz Kępski
Hi, I have the following case: In my index I do have documents categorized (category_id - int sortable field). I would like to get three top documents matching user query BUT each have to be from different category.: for example from returned set (doc_id : category id): 1:1 2:1 3:1 4:2 5:1

Re: Migrating to Solr

2009-11-24 Thread Grant Ingersoll
I've been involved with a fair share of these migrations now. What are you looking for? On Nov 24, 2009, at 8:55 AM, Tommy Molto wrote: > Hi, > > I'm new at Solr and i need to make a "test pilot" of a migration from Fast > ESP to Apache Solr, anyone had this experience before? > > > Att,

Re: ExternalRequestHandler and ContentStreamUpdateRequest usage

2009-11-24 Thread javaxmlsoapdev
http://machinename:port/solr/admin/luke gives me 404 error so seems like its not able to find luke. I am reusing schema, which is used for indexing other entity from database, which has no relevance to documents. that was my next question that what do I put in, in a schema if my documents don't n

Re: Migrating to Solr

2009-11-24 Thread Shashi Kant
Here is a link that might be helpful: http://sesat.no/moving-from-fast-to-solr-review.html The site is choc-a-bloc with great information on their migration experience. On Tue, Nov 24, 2009 at 8:55 AM, Tommy Molto wrote: > Hi, > > I'm new at Solr and i need to make a "test pilot" of a migrati

Re: Get one document from each category

2009-11-24 Thread Andrey Klochkov
Hi I think you need field collapsing, look here http://wiki.apache.org/solr/FieldCollapsing 2009/11/24 Tomasz Kępski > Hi, > > I have the following case: > > In my index I do have documents categorized (category_id - int sortable > field). I would like to get three top documents matching user
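As a rough illustration of what collapsing on category_id amounts to, here is a client-side sketch that walks a ranked result list and keeps the first document seen in each category. It is not the Solr field collapsing feature itself; the function name is hypothetical, and the last pair in the sample data is made up because the example in the question is cut off.

```python
def top_per_category(ranked, limit):
    """Keep the best-ranked document of each category until `limit` are picked."""
    seen = set()
    picked = []
    for doc_id, category_id in ranked:
        if category_id in seen:
            continue  # a better-ranked document already represents this category
        seen.add(category_id)
        picked.append(doc_id)
        if len(picked) == limit:
            break
    return picked


# (doc_id, category_id) pairs in ranked order; (6, 3) is a hypothetical addition.
ranked = [(1, 1), (2, 1), (3, 1), (4, 2), (5, 1), (6, 3)]
print(top_per_category(ranked, 3))
```

Doing this on the client means over-fetching results until enough distinct categories turn up, which is the cost a server-side collapse avoids.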

PatternTokenizer question

2009-11-24 Thread j philoon
I have defined a comma-delimited pattern tokenizer as follows: This appears to work fine when adding documents, since if I add a field commafld as "word1,WORD2,word 3" I see terms in the index as expected: "word1", "word2", and "word 3". When I query,

Re: access denied to solr home lib dir

2009-11-24 Thread Charles Moad
Thank you all for the insight into this problem. I was 100% positive that selinux and file permissions were not the problems. Turns out that tomcat 6 on ubuntu comes with a tomcat security manager enabled by default. I had no desire to figure out how this works since this is for local testin

Re: Migrating to Solr

2009-11-24 Thread Tommy Molto
This is really a great source on migration. I guess I will have good questions after trying. What I know will be a little harder is the use of collections (facets in Solr) and hierarchical navigators. On Tue, Nov 24, 2009 at 1:05 PM, Shashi Kant wrote: > Here is a link that might b

Re: ExternalRequestHandler and ContentStreamUpdateRequest usage

2009-11-24 Thread javaxmlsoapdev
I was able to configure /docs index separately from my db data index. still I am seeing same behavior where it only puts .docName & its size in the "content" field (I have renamed field to "content" in this new schema) below are the only two fields I have in schema.xml Following is upda

Trouble Configuring WordDelimiterFilterFactory

2009-11-24 Thread Rahul R
Hello, In our application we have a catch-all field (the 'text' field) which is configured as the default search field. Now this field will have a combination of numbers, alphabets, special characters etc. I have a requirement wherein the WordDelimiterFilterFactory does not work on numbers, especial

Re: ExternalRequestHandler and ContentStreamUpdateRequest usage

2009-11-24 Thread javaxmlsoapdev
I was able to configure /docs index separately from my db data index. still I am seeing same behavior where it only puts .docName & its size in the "content" field (I have renamed field to "content" in this new schema) below are the only two fields I have in schema.xml Following is updat

SolrPlugin Guidance

2009-11-24 Thread Vauthrin, Laurent
Hello, Our team is trying to make a Solr plugin that needs to parse/decompose a given query into potentially multiple queries. The idea is that we're trying to abstract a complex schema (with different document types) from the users so that their queries can be simpler. So basically, we're

Re: initiate reindexing in solr for field type changes

2009-11-24 Thread darniz
thanks darniz Shalin Shekhar Mangar wrote: > > On Thu, Nov 19, 2009 at 4:50 AM, darniz wrote: > >> >> Thanks >> Could you elaborate what is compatible schema change. >> Do you mean schema change which deals only with query time. >> >> > A compatible schema change would be addition of new fiel

Creating Facets

2009-11-24 Thread Tommy Molto
People, I looked in the Solr wiki and only found how to use facets, not how to configure them in the schema or solrconfig. Any tips on how to do it? Att,

Re: ExternalRequestHandler and ContentStreamUpdateRequest usage

2009-11-24 Thread javaxmlsoapdev
Following is luke response. is empty. can someone assist to find out why file content isn't being index? 0 0 0 0 0 1259085661332 false true false org.apache.lucene.store.NIOFSDirectory:org.apache.lucene.store.NIOFSDirectory@/home/tomcat-solr/bin/docs/dat

Re: Implementing phrase autopop up

2009-11-24 Thread darniz
Thanks for your input. You made a valid point: if we are using field type text to get autocomplete, it won't work because it goes through the tokenizer. Hence it looks like for my use case I need to have a field which uses ngram and copy. Here is what I did: I created a field the same as the lucid blog sa
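For readers unfamiliar with the n-gram idea mentioned here, a minimal sketch of what an n-gram field stores: every substring of a term between a minimum and maximum length, so that a partial input can match as an exact term. The function name and size limits are assumptions for illustration and do not reproduce any particular Solr filter configuration.

```python
def ngrams(text, min_size, max_size):
    """Return all substrings of `text` with length between min_size and max_size."""
    grams = []
    for size in range(min_size, max_size + 1):
        for start in range(len(text) - size + 1):
            grams.append(text[start:start + size])
    return grams


print(ngrams("solr", 2, 3))
```

The trade-off is index size: the number of grams grows quickly with term length and with the gap between the two size limits.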

Multi-Term Synonyms

2009-11-24 Thread brad anderson
Hi Folks, I was trying to get multi term synonyms to work. I'm experiencing some strange behavior and would like some feedback. In the synonyms file I have the line: thomas, boll holly, thomas a, john q => tom And I have a document with the text field as; tom However, when I do a se

Re: Normalizing multiple Chars with MappingCharFilter possible?

2009-11-24 Thread Andreas Kahl
Am 24.11.09 12:30, schrieb Koji Sekiguchi: > Andreas Kahl wrote: >> Hello everyone, >> >> is it possible to normalize Strings like '`e' (2 chars) => 'e' (in >> contrast to 'é' (1 char) => 'e') with >> org.apache.lucene.analysis.MappingCharFilter? >> >> I am asking this because I am considering to

Re: Multi-Term Synonyms

2009-11-24 Thread Tom Hill
Hi Brad, I suspect that this section from the wiki for SynonymFilterFactory might be relevant: http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory *"Keep in mind that while the SynonymFilter will happily work with synonyms containing multiple words (ie: "**sea

Re: PatternTokenizer question

2009-11-24 Thread j philoon
I think the answer to my question is contained in the wiki when discussing the SynonymFilter, "The Lucene QueryParser tokenizes on white space before giving any text to the Analyzer". This would indeed explain what I am getting. Next question - can I avoid that behavior? j philoon wrote: > >
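The mismatch described here can be sketched in a few lines. The indexing side splits only on commas, while the query side is first split on whitespace and only then analyzed, so a term containing a space never reaches the analyzer in one piece. Both function names are hypothetical, and the lowercasing step is inferred from the terms reported in the original post.

```python
def index_terms(value):
    """Index-time view: split on commas and lowercase each piece."""
    return [term.lower() for term in value.split(",")]


def query_terms(query):
    """Query-time view: whitespace split first, then the same analysis per chunk."""
    terms = []
    for chunk in query.split():
        terms.extend(index_terms(chunk))
    return terms


indexed = index_terms("word1,WORD2,word 3")
searched = query_terms("word 3")
print(indexed)
print(searched)
print([term for term in searched if term in indexed])
```

The query for "word 3" yields the two terms "word" and "3", neither of which exists in the index, which matches the behaviour reported in the thread.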

Re: Boost document base on field length

2009-11-24 Thread Lance Norskog
The Lucene norms, if set, are 1/number of terms in the field. I cannot find a function that makes norms available. Yo gurus- is this impossible, a bad idea, or just an oversight? On Tue, Nov 24, 2009 at 6:06 AM, Tomasz Kępski wrote: > Hi, > >> I think i'm reading he question differently then Gra
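For reference, a small sketch of the length normalization under discussion. It assumes the classic Lucene default, which to my knowledge is 1/sqrt(number of terms) rather than 1/(number of terms), and it ignores index-time boosts and the lossy single-byte encoding of stored norms. The function name is made up for illustration.

```python
import math


def length_norm(num_terms):
    """Default-style length norm: gets smaller as the field gets longer."""
    return 1.0 / math.sqrt(num_terms)


for n in (1, 4, 16, 100):
    print(n, length_norm(n))
```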

[SolrResourceLoader] Unable to load cached class-name

2009-11-24 Thread Stuart Grimshaw
Bit of a long error message, so I won't post it all in the subject :-) I'm trying to create a log4j solr appender to help us track down log entries from across our jboss cluster, I might be able to make use of the faceted search to identify errors that occur more often and things like that. Anywa

Re: Multi-Term Synonyms

2009-11-24 Thread brad anderson
Thanks for the help. Can't believe I missed that part in the wiki. 2009/11/24 Tom Hill > Hi Brad, > > > I suspect that this section from the wiki for SynonymFilterFactory might be > relevant: > > > http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory > > *"Keep i

Re: Migrating to Solr

2009-11-24 Thread Lance Norskog
Collections in FAST do not exist in Solr. A FAST collection can be implemented in Solr using facets or shards. The collection abstraction in FAST is actually more shard-like in semantics: it is a separate top-level set of content. This has strong ramifications for relevance: if collections have the

configure solr

2009-11-24 Thread Jill Han
Hi, I just downloaded solr -1.4.0 to my computer, C:\apache-solr-1.4.0. 1.I followed the instruction to run the sample, java -jar start.jar at C:\apache-solr-1.4.0\example And then go to http://localhost:8983/solr/admin, however, I got HTTP ERROR: 404 NOT_FOUND RequestURI=/s

Re: Creating Facets

2009-11-24 Thread Lance Norskog
There is nothing special to configure. All facet processing happens during processing the query. On Tue, Nov 24, 2009 at 9:56 AM, Tommy Molto wrote: > People, > > I look in the solr wiki and only found about the use of the fecets, not how > to configure it in the schema or solrconfig. Any tip how

Re: ExternalRequestHandler and ContentStreamUpdateRequest usage

2009-11-24 Thread Lance Norskog
If you are using multicore, you have to run Luke on a particular core: http://machine:port/solr/core/admin/luke And, admin itself: http://machine:port/solr/core/admin On Tue, Nov 24, 2009 at 10:18 AM, javaxmlsoapdev wrote: > > Following is luke response. is empty. can someone > assist to find

why is XMLWriter declared as final?

2009-11-24 Thread Matt Mitchell
Is there any reason the XMLWriter is declared as final? I'd like to extend it for a special case but can't. The other writers (ruby, php, json) are not final. Thanks, Matt

how is score computed with hsin functionquery?

2009-11-24 Thread gdeconto
I was looking at functionqueries, and noticed that: 1. if I use the sum functionquery, the score in the results is the sum of the values I want to sum (all well and good and expected): http://127.0.0.1:8080/solr/select?q=(*:*)^0%20%20_val_:"sum(1,2,3,4,5)"&fl=score,Latitude,Longitude&sort=score%

Deduplication in 1.4

2009-11-24 Thread KaktuChakarabati
Hey, I've been trying to find some documentation on using this feature in 1.4 but the Wiki page is a little sparse. Specifically, here's what I'm trying to do: I have a field, say 'duplicate_group_id' that I'll populate based on some offline documents deduplication process I have. All I want is for s

Index Splitter

2009-11-24 Thread Giovanni Fernandez-Kincade
Hi, I've heard about a tool that can be used to split Lucene indexes, for cases where you want to break up a large index into shards. Do you know where I can find it? Any observations/recommendations about its use? This seems promising but I'm not sure if there is anything more mature out there

Re: how is score computed with hsin functionquery?

2009-11-24 Thread gdeconto
gdeconto wrote: > > ... > is there some way to convert the hsin value to distance? > ... > I just noticed that the solr wiki states "Values must be in Radians" and all my test values were in degrees. -- View this message in context: http://old.nabble.com/how-is-score-computed-with-hsin-func
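The radians point can be illustrated with a standalone haversine sketch. The argument order and function name are my own and are not meant to mirror Solr's hsin signature; the only thing taken from the thread is that angles must be converted from degrees to radians before use.

```python
import math


def haversine(radius, lat1, lon1, lat2, lon2):
    """Great-circle distance on a sphere; all four angles are in radians."""
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * radius * math.asin(math.sqrt(a))


# Degrees must be converted first; passing raw degrees gives meaningless results.
lat1, lon1 = math.radians(0.0), math.radians(0.0)
lat2, lon2 = math.radians(0.0), math.radians(90.0)
quarter_circle = haversine(1.0, lat1, lon1, lat2, lon2)
print(quarter_circle)
```

With a radius of 1 the result is the central angle in radians; multiplying by a planet radius in kilometres or miles turns it into a distance in that unit.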

Re: Index Splitter

2009-11-24 Thread Koji Sekiguchi
Giovanni Fernandez-Kincade wrote: Hi, I've heard about a tool that can be used to split Lucene indexes, for cases where you want to break up a large index into shards. Do you know where I can find it? Any observations/recommendations about its use? This seems promising but I'm not sure if ther

Re: configure solr

2009-11-24 Thread Erick Erickson
For the second question, do the instructions here help? http://wiki.apache.org/solr/SolrTomcat I suspect your SOLR instance doesn't know where to find the SOLR config files. So a severe error, indeed. It can't find them at all . WARNING: I'm *really* not a tomcat expert, and the instructions at t

how to do partial word searches?

2009-11-24 Thread Joel Nylund
Hi, I saw some older postings on this, but didn't see a resolution. I have a field called title, and I would like to be able to find partial word matches within the title. For example: http://localhost:8983/solr/select?q=textTitle:%22*sulli*%22 I would expect it to find: the daily dish | by andr

Re: configure solr

2009-11-24 Thread Joel Nylund
for #1, under example, is there a webapps folder, does it contain solr.war ? are there any errors in your startup log for jetty, does it say anything about setting up solr, and solr home etc. Joel On Nov 24, 2009, at 4:55 PM, Jill Han wrote: Hi, I just downloaded solr -1.4.0 to my compute

Re: Implementing phrase autopop up

2009-11-24 Thread darniz
Can anybody tell me whether it's possible, when a word within a phrase matches, to display that phrase? darniz darniz wrote: > > Thanks for your input > You made a valid point, if we are using field type as text to get > autocomplete it wont work because it goes through tokenizer. > Hence loo

Re: how to do partial word searches?

2009-11-24 Thread Erick Erickson
Copying from Erik Hatcher: See http://issues.apache.org/jira/browse/SOLR-218 - Solr currently does not have leading wildcard support enabled. There's a pretty extensive recent exchange on this, see the thread on the user's list titled "leading and trailing wildcard query". Best, Erick On Tue, Nov

Re: Deduplication in 1.4

2009-11-24 Thread Otis Gospodnetic
Hi, As far as I know, the point of deduplication in Solr ( http://wiki.apache.org/solr/Deduplication ) is to detect a duplicate document before indexing it in order to avoid duplicates in the index in the first place. What you are describing is closer to field collapsing patch in SOLR-236. Ot

Re: [SolrResourceLoader] Unable to load cached class-name

2009-11-24 Thread Otis Gospodnetic
Hi Stuart, I don't understand your last paragraph, but yes, that class is not in Solr 1.3. It is in Solr 1.4 and Solr 1.4 is available in Apache maven repo. Otis -- Sematext is hiring -- http://sematext.com/about/jobs.html?mls Lucene, Solr, Nutch, Katta, Hadoop, HBase, UIMA, NLP, NER, IR ---

Re: [SolrResourceLoader] Unable to load cached class-name

2009-11-24 Thread Otis Gospodnetic
Oh, and regarding the log4j Solr appender, could you please contribute it to log4j? http://logging.apache.org/log4j/1.2/index.html That way it will get more user exposure and developer/maintenance love. Otis -- Sematext is hiring -- http://sematext.com/about/jobs.html?mls Lucene, Solr, Nutch, Ka

Re: Migrating to Solr

2009-11-24 Thread Otis Gospodnetic
Except http://sesat.no/ hasn't been reachable for about 2 days now Google cache to the rescue! Otis - Original Message > From: Shashi Kant > To: solr-user@lucene.apache.org > Sent: Tue, November 24, 2009 10:05:30 AM > Subject: Re: Migrating to Solr > > Here is a link that might

Re: solr+jetty logging to syslog?

2009-11-24 Thread Otis Gospodnetic
Not many people do that, judging from http://www.google.com/search?&q=+solr%20+syslogd . But I think this is really not a Solr-specific question. Isn't the question really "how do I configure log4j to log to syslogd?". Oh, and then "how do I configure slf4j to use log4j?" The answer to the f