Solr Documentation

2008-11-06 Thread Sajith Vimukthi
Hi all, Can someone of you tell me a source where I can find an elaborated documentation for solr. Regards, Sajith Vimukthi Weerakoon Associate Software Engineer | ZONE24X7 | Tel: +94 11 2882390 ext 101 | Fax: +94 11 2878261 | http://www.zone24x7.com

Re: Solr Documentation

2008-11-06 Thread Erik Hatcher
On Nov 6, 2008, at 1:54 AM, Sajith Vimukthi wrote: Can someone of you tell me a source where I can find an elaborated documentation for solr. http://wiki.apache.org/solr Erik

Solr documentation - configuration

2008-11-06 Thread Sajith Vimukthi
Hi all, I have some problems regarding configuring solr framework. And I need to know the internal architecture of the framework. So can someone pass me a fair documentation. Regards, Sajith Vimukthi Weerakoon Associate Software Engineer | ZONE24X7 | Tel: +94 11 2882390 ext 101 | Fax:

RE: Solr Documentation

2008-11-06 Thread Sajith Vimukthi
Thanks Erik. But I need the details that are done by the classes. Where can I find those??? -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Thursday, November 06, 2008 3:40 PM To: solr-user@lucene.apache.org Subject: Re: Solr Documentation On Nov 6, 2008, at 1:54 A

Calculating peaks

2008-11-06 Thread gistolero
Hello, How can I get ALL the matching documents back? How can I return an unlimited number of rows? Yes, I have read the FAQ and I got your point, but I need Solr to calculate number based peaks for my indexed data: - For each of my documents the text ('text'), the creation time ('date') and o

Re: Calculating peaks

2008-11-06 Thread Erik Hatcher
Would faceting on date (&facet.field=date&facet=on) satisfy your need? It'll give you back all the dates and frequencies of them within the matched results. Erik On Nov 6, 2008, at 4:59 AM, [EMAIL PROTECTED] wrote: How can I get ALL the matching documents back? How can I return an
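A minimal sketch of the facet request Erik describes, assuming a Solr instance at a placeholder host and a field named `date`. The host, port, path, and the `text:solr` query are illustrative assumptions, not taken from the thread. Setting `rows=0` returns only the facet counts, and `facet.limit=-1` asks for all facet values rather than the default top entries.

```python
from urllib.parse import urlencode

# Assumption: placeholder location of the Solr select handler.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_date_facet_url(query, facet_field="date"):
    """Build a select URL that returns per-value counts for facet_field."""
    params = [
        ("q", query),
        ("rows", 0),                  # no documents, facet counts only
        ("facet", "on"),              # enable faceting
        ("facet.field", facet_field),
        ("facet.limit", -1),          # negative limit: do not cap the number of values
    ]
    return SOLR_SELECT_URL + "?" + urlencode(params)


url = build_date_facet_url("text:solr")
print(url)
```

The counts in the response give the frequency of each date among the matched documents, so peaks can be computed without fetching every row.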

Re: Solr documentation - configuration

2008-11-06 Thread Erik Hatcher
On Nov 6, 2008, at 4:30 AM, Sajith Vimukthi wrote: Well I will just tell u what I want to do. I need to develop an application that is capable of giving out search results. I need to index particular documents which are with me now. So please tell me how I can develop this whole thing?

Re: Regex Transformer Error

2008-11-06 Thread Ahmed Hammad
It worked by replacing < with &lt; and > with &gt;. Thank you for your support, ahmd On Thu, Nov 6, 2008 at 2:39 AM, Norskog, Lance <[EMAIL PROTECTED]> wrote: > There is a nice HTML stripper inside Solr. > "solr.HTMLStripStandardTokenizerFactory" > > > -Original Message- > From: Ahmed Hammad [

solr1.3 / tomcat - does it create the doc if no data in sub entite?

2008-11-06 Thread sunnyfr
Hi, I'd like to know what happens if there is no video in these entities: does it create the document? I ask because I just ran a test full import with a condition in my main entity where video_id=1, which doesn't have any rel_group_ids, and it fails with an error: can't make this

RE: Large Data Set Suggestions

2008-11-06 Thread Steven Anderson
> The performance of DIH is likely to be faster than SolrJ. > Because , it does not have the overhead of an http request. Understood. However, we may not have the option of co-locating the data to be injested with the Solr server. > What is your data source? I am assuming it is xml. Yes. Inco

Re: solr1.3 / tomcat - does it create the doc if no data in sub entite?

2008-11-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
it must create the document as long as rel_group_ids is not a required field On Thu, Nov 6, 2008 at 6:29 PM, sunnyfr <[EMAIL PROTECTED]> wrote: > > Hi, > > Just to know what's happening if in this entities there is no video : > >query="SELECT group_id AS rel_group_ids FROM grou

RE: Large Data Set Suggestions

2008-11-06 Thread Steven Anderson
> In that case you may put the file in a mounted NFS directory > or you can serve it out with an apache server. That's one option although someone else on the list mentioned that performance was 10x slower in their NFS experience. Another option is to serve up the files via Apache and pull them

RE: Solr documentation - configuration

2008-11-06 Thread Sajith Vimukthi
Thanks a lot, Erik. -Original Message- From: Erik Hatcher [mailto:[EMAIL PROTECTED] Sent: Thursday, November 06, 2008 4:10 PM To: solr-user@lucene.apache.org Subject: Re: Solr documentation - configuration On Nov 6, 2008, at 4:30 AM, Sajith Vimukthi wrote: > Well I will just

Re: Solr documentation - configuration

2008-11-06 Thread Erik Hatcher
On Nov 6, 2008, at 4:05 AM, Sajith Vimukthi wrote: I have some problems regarding configuring solr framework. And I need to know the internal architecture of the framework. So can someone pass me a fair documentation. What specifically are you having problems with? For internal details

Trying to run solr-1.3.0 under tomcat 5.5.20 on OS X 10.5.5 (works with 1.2.0)

2008-11-06 Thread Fergus McMenemie
Further to last message. I downloaded and repeated everything using Solr 1.2.0. This time everything worked fine! But I have to confess that my system is running 10.4.11 tiger rather than leopard, I do not know if that is significant. So it seems the instructions for deploying solr version 1.3.0 t

Re: Large Data Set Suggestions

2008-11-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Thu, Nov 6, 2008 at 7:04 PM, Steven Anderson <[EMAIL PROTECTED]> wrote: >> The performance of DIH is likely to be faster than SolrJ. >> Because , it does not have the overhead of an http request. > > Understood. However, we may not have the option of co-locating the data > to be injested with t

RE: Solr documentation - configuration

2008-11-06 Thread Sajith Vimukthi
Well I will just tell u what I want to do. I need to develop an application that is capable of giving out search results. I need to index particular documents which are with me now. So please tell me how I can develop this whole thing? Regards, Sajith -Original Message- From: Erik Hat

Re: Huge increase in index size adding just 2 fields

2008-11-06 Thread Phillip Farber
May I ask again whether an index size increase from 120GB to 166GB is expected simply from adding a stored date field and a stored repeating string field of length perhaps 20, with roughly 2 values per doc on average, for 500,000 docs? The doc is a large body of OCR and the position index dominates due to the

solr 1.3 - its obviously FULL_DUMP /dataimport,

2008-11-06 Thread sunnyfr
Hi, I don't understand what's happening. I tried a full import limited to a range between two ids ... everything works fine at first, then the timer keeps running but it looks stuck, and I don't have any errors in my logs. It's as if a process couldn't be executed. I didn't turn on snapshooter or autocommit is pr

Re: question about Solr directories on mounted file systems

2008-11-06 Thread Chris Hostetter
: machine or human error can sometimes unmount the file system. This causes : Solr to write index files to a different area from the index I am using. can you clarify what you mean by this? Solr will only ever write to the data dir you configure it with. if the parent dir (or volume) doesn't e

Preferred Tomcat version on Windows 2003 (64 bits)

2008-11-06 Thread Jaco
Hello, I am planning a brand new environment for Solr running on a Windows 2003 Server 64 bits platform. I want to use Tomcat, and was wondering whether there is any preference in general for using Tomcat 5.5 or Tomcat 6.0 with Solr. Any suggestions would be appreciated! Thanks, bye, Jaco.

Re: exceeded limit of maxWarmingSearchers

2008-11-06 Thread Chris Hostetter
: SEVERE: org.apache.solr.common.SolrException: Error opening new searcher. : exceeded limit of maxWarmingSearchers=8, try again later. : Our server is not even in public use yet, it's serving maybe one query every : second, or less. I don't understand what could be causing this. that warning i

Re: Preferred Tomcat version on Windows 2003 (64 bits)

2008-11-06 Thread Otis Gospodnetic
I don't think there are preferences. If you are going with a brand new setup, why not go with Tomcat 6.0? Also be aware that if you want a master-slave setup on Windows you will need to use a post-1.3 version of Solr (nightly) that includes functionality from SOLR-561. Otis -- Sematext -- http://sematext.co

Re: Huge increase in index size adding just 2 fields

2008-11-06 Thread Otis Gospodnetic
I'll make a very wild guess and say that it's possible for this to happen if your dates are very granular (down to milliseconds). All of a sudden you probably got 500,000 new terms there. Wild guess. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message -

Re: Huge increase in index size adding just 2 fields

2008-11-06 Thread Chris Hostetter
: We added the following 2 fields to the above schema as follows: : : : : : where the "hlb" field consists of not more than 3-4 strings such as "Social : Science"/ : : Our 500,000 document index size increased to 166G! This seems completely if you don't need fieldNorms for these fields (it a

Re: Preferred Tomcat version on Windows 2003 (64 bits)

2008-11-06 Thread Jaco
Thanks for the fast reply! I've tested SOLR-561, and it is working beautifully! Excellent functionality. Cheers, Jaco. 2008/11/6 Otis Gospodnetic <[EMAIL PROTECTED]> > I don't think there are preferences. If going with the brand new setup why > not go with Tomcat 6.0. > Also be aware that if

Re: Large Data Set Suggestions

2008-11-06 Thread Walter Underwood
100X, not 10X. And with the index on NFS. Reading the input data from NFS would be slower than local, but probably not 10X. --wunder On 11/6/08 5:56 AM, "Steven Anderson" <[EMAIL PROTECTED]> wrote: > That's one option although someone else on the list mentioned that > performance was 10x slower i

Re: question about Solr directories on mounted file systems

2008-11-06 Thread Walter Underwood
I've seen a nasty problem like this. When the mounted filesystem goes away, you can create regular directories and files under the mount point. When it comes back, the newly created files are not accessible. Yuk. --wunder On 11/6/08 8:19 AM, "Chris Hostetter" <[EMAIL PROTECTED]> wrote: > : machin

Very bad performance

2008-11-06 Thread Cedric Houis
Hi Solr’s users, and first of all, sorry for my bad english =^D !!! We are experimenting Solr with the intention of using it in a web site. To test solr’s performance, we have made a little program that simulates users on the web site. This program starts N threads (users), with 0.5 second pause

Re: Huge increase in index size adding just 2 fields

2008-11-06 Thread Phillip Farber
Hi Otis and Hoss, My dates are not too granular. They're always YYYY-MM-DD 00:00:00 but I see that I did not omitNorms on the date field and hlb field. Thanks for pointing me in the right direction. Phil Chris Hostetter wrote: : We added the following 2 fields to the above schema as foll

RE: Large Data Set Suggestions

2008-11-06 Thread Lance Norskog
You can also do streaming XML upload for the XML-based indexing. This can feed, say, 100k records in one XML file from a separate machine. All of these options ignore the case where there is an error in your input records v.s. the schema. DIH gives up on an error. Streaming XML gives up on an err

Re: Very bad performance

2008-11-06 Thread Ryan McKinley
Data : 367380 documents nGeographicLocations : 39298 distincts values nPersonNames : 325142 distincts values nOrganizationNames : 130681 distincts values nCategories : 929 distincts values nSimpleConcepts : 110198 distincts values nComplexConcepts : 1508141 distincts values Each of those fields

Re: Very bad performance

2008-11-06 Thread Yonik Seeley
Your problem is most likely the time it takes to facet on those multi-valued fields. Help is coming within the month I'd estimate, in the form of faster faceting for multivalued fields where the number of values per document is low. Until then, you might be able to get some smaller speedups by try

Re: Preferred Tomcat version on Windows 2003 (64 bits)

2008-11-06 Thread William Pierce
I am using tomcat 6.0.14 without any problems on windows 2003 R2 server. I am also using the 1.3 patch (using the nightly build of 10/23) for master-slave replication... That's been working great! -- Bill -- From: "Otis Gospodnetic" <[EMAIL PRO

Re: question about Solr directories on mounted file systems

2008-11-06 Thread Chris Hostetter
: I've seen a nasty problem like this. When the mounted filesystem goes away, : you can create regular directories and files under the mount point. When : it comes back, the newly created files are not accessible. Yuk. --wunder Ahhh i see what you mean. isn't the solution there to make sure

how to improve cold sort time?

2008-11-06 Thread Matt Kent
Hey all, I'm working on optimizing my query times, and I was wondering if there's any secrets to improving sort time. Let me emphasize that I'm working on __cold__ query times. I'm intentionally trying to simulate cache misses and I'm aware that caching would improve my times greatly. Here's my si

Re: How to use multicore feature in JBOSS

2008-11-06 Thread Chris Hostetter
: But for the first question, I am still not clear. : I think to use the multicore feature we should inform the server. In the : Jetty server, we are starting the server using: java : -Dsolr.solr.home=multicore -jar start.jar : Once the server is started I think it will take the parameters from :

Re: Bias score proximity for a given field

2008-11-06 Thread Chris Hostetter
: Subject: Bias score proximity for a given field : In-Reply-To: <[EMAIL PROTECTED]> : Is there a way to specify a range boosting for a numeric/date field? 1) take a look at the "bq" (boost query) param option if you want discrete ranges to get boosted scores, or take a look at bf (boost func
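A rough sketch of the discrete-range approach with `bq`, assuming the request goes to a handler configured for dismax (which is where `bq` applies). The field name `date`, the range boundaries, the boost values, the `ipod` query, and the host are all illustrative assumptions. Each `bq` clause is an ordinary range query with a boost suffix, and several `bq` parameters can be sent in one request.

```python
from urllib.parse import urlencode


def range_boost(field, low, high, boost):
    """Format one boost clause: a range query followed by ^boost."""
    return f"{field}:[{low} TO {high}]^{boost}"


def build_boosted_url(base_url, user_query, boosts):
    """Attach one bq parameter per boost clause to the user's query."""
    params = [("q", user_query)]
    params += [("bq", clause) for clause in boosts]
    return base_url + "?" + urlencode(params)


# Hypothetical ranges: recent documents get the larger boost.
boosts = [
    range_boost("date", "NOW-1MONTH", "NOW", 3.0),
    range_boost("date", "NOW-1YEAR", "NOW-1MONTH", 1.5),
]
url = build_boosted_url("http://localhost:8983/solr/select", "ipod", boosts)
print(url)
```

Discrete ranges like these give a step-shaped boost; the `bf` option mentioned in the reply is the alternative when a smooth function of the field value is wanted.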

Re: Calculating peaks

2008-11-06 Thread gistolero
Thank you, Erik. That's what I need. Sorry, I missed the 'facet' chapter. Original Message > Date: Thu, 6 Nov 2008 05:07:39 -0600 > From: Erik Hatcher <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Subject: Re: Calculating peaks > Would faceting on date (&facet.field=

Re: Solr Autowarming

2008-11-06 Thread Chris Hostetter
: Yes, you can extend QuerySenderListener to do this. one of the really simple approaches i use (since the EventListener api is kind of awkward) is to implement a RequestHandler that executes whatever queries you want (using LocalSolrRequest instances) and then configure QuerySenderListener to

Distributed Search ...

2008-11-06 Thread souravm
Hi, I have a query on distributed search. The wiki mentioned that Solr can query and merge results from an index split in multiple shards. My question is which server actually does the job of merging. Will there be a separate master node/shard to the merging job ? Also is there any performan

Solr Multicore ...

2008-11-06 Thread souravm
Hi, Can I use the multicore feature to have multiple indexes (that is, each core would take care of one type of index) within a single Solr instance? Will there be any performance impact from this type of setup? Regards, Sourav

Different tokenizing algorithms for the same stream

2008-11-06 Thread Yuri Jan
Hello all, I'm trying to implement a tokenizer that will behave differently on different parts of the incoming stream. For example, for the first X words in the stream I would like to use one tokenizing algorithm, while for the rest of the stream a different tokenizing algorithm will be used. Wha

Re: Solr Multicore ...

2008-11-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Fri, Nov 7, 2008 at 3:28 AM, souravm <[EMAIL PROTECTED]> wrote: > > Hi, > > Can I use multi core feature to have multiple indexes (That is each core > would take care of one type of index) within a single Solr instance ? Yes. This is what it was conceived for. > > Will there be any performance im

Re: Distributed Search ...

2008-11-06 Thread Otis Gospodnetic
Sourav, Whichever Solr instance you send the request to will dispatch requests to other Solr instances you specified and will merge the results. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: souravm <[EMAIL PROTECTED]> > To: "solr-user@
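A small sketch of what Otis describes, assuming two hypothetical shard hosts. The request is sent to any one instance together with a `shards` parameter listing every shard, and that receiving instance fans the query out and merges the results. Host names and the query are placeholders, not from the thread.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_distributed_url(coordinator, shards, query):
    """Build a select URL for `coordinator` that searches all `shards`.

    coordinator and each shard are given as host:port/path without a scheme,
    which is the form the shards parameter expects.
    """
    params = {"q": query, "shards": ",".join(shards)}
    return f"http://{coordinator}/select?" + urlencode(params)


# Hypothetical shard locations.
shards = ["solr1.example.com:8983/solr", "solr2.example.com:8983/solr"]

# Any instance can act as the merging node; here the first shard is used.
url = build_distributed_url(shards[0], shards, "title:lucene")
print(url)

# Decode the query string again to show what the receiving instance sees.
decoded = parse_qs(urlsplit(url).query)
print(decoded["shards"][0])
```

Because the merging node is simply whichever instance receives the request, no dedicated master is needed for distributed search; a load balancer can spread the merging work across instances.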

Solr with Wordpress - Anyone doing this?

2008-11-06 Thread Stephen Weiss
Hi, We recently implemented Solr for one major search component of our site, and now that this is complete we're turning to other areas of our site to see where Solr can help us improve results relevancy and performance. One major area where I think Solr could do a lot of good is to repla

Re: Large Data Set Suggestions

2008-11-06 Thread Noble Paul നോബിള്‍ नोब्ळ्
Hi Lance, This is one area we left open in DIH. What is the best way to handle this. On error it should give up or continue with the next? On Fri, Nov 7, 2008 at 12:44 AM, Lance Norskog <[EMAIL PROTECTED]> wrote: > You can also do streaming XML upload for the XML-based indexing. This can > feed,

Re: how to improve cold sort time?

2008-11-06 Thread Shalin Shekhar Mangar
On Fri, Nov 7, 2008 at 1:27 AM, Matt Kent <[EMAIL PROTECTED]> wrote: > I'm using the all query *:* with a few filter queries (type:stuff, > userId:12345, category:3, etc). Without sorting, my query takes about > 600ms, > which is fine for me. My target range is 500-1000ms. When I add a sort > fiel

Re: How to use multicore feature in JBOSS

2008-11-06 Thread con
Thanks Norberto and Hossman. That really helped me. First of all I deleted the contents of the example/solr directory and created a solr.xml that points to the multicore. That itself worked. Then based on Hossman's comment I deleted the actual solr directory and renamed the multicore directory to