Re: filter result by catalog

2010-02-19 Thread Kevin Osborn
Yes I thought about both methods. The ACL method is easier, but has some scalability issues. We use the bitset method in another product, but there are some complexity and resource problems. This is a new project so I am revisiting the issue to see if anyone had any better ideas. On Fri Feb 19

Re: optimize is taking too much time

2010-02-19 Thread Otis Gospodnetic
Hello, Solr will never optimize the whole index without somebody explicitly asking for it. Lucene will merge index segments on the master as documents are indexed. How often it does that depends on mergeFactor. See: http://search-lucene.com/?q=mergeFactor+segment+merge&fc_project=Lucene&fc_pro
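
A rough way to picture how mergeFactor drives segment merges is sketched below. This is a simplified model and an assumption for illustration only, not Lucene's actual merge policy code: each flush writes one small segment, and whenever mergeFactor segments pile up at one level they are merged into a single segment at the next level.

```python
from typing import List


def segments_after_flushes(num_flushes: int, merge_factor: int = 10) -> List[int]:
    """Return the number of segments per level after num_flushes flushes.

    Simplified model (assumption, not Lucene's real merge policy):
    every flush adds one level-0 segment; merge_factor segments on the
    same level are merged into one segment on the next level.
    """
    if merge_factor < 2:
        raise ValueError("merge_factor must be at least 2")
    levels: List[int] = []
    for _ in range(num_flushes):
        level = 0
        while True:
            if level == len(levels):
                levels.append(0)
            levels[level] += 1
            if levels[level] < merge_factor:
                break
            # merge_factor segments on this level: merge them upward
            levels[level] = 0
            level += 1
    return levels


# 25 flushes with mergeFactor=10 leave 5 small segments and 2 merged ones
print(segments_after_flushes(25, 10))  # → [5, 2]
```

In this model a lower mergeFactor means fewer segments on disk at any time but more frequent merging during indexing.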

Re: replications issue

2010-02-19 Thread Otis Gospodnetic
Hello, You are replicating every 60 seconds? I hope you don't have a large index with lots of continuous index updates on the master, as replicating every 60 seconds, while doable, may be a bit too frequent (depending on index size, amount of changes, cache settings, etc.). Otis Sematext

Re: What is largest reasonable setting for ramBufferSizeMB?

2010-02-19 Thread Otis Gospodnetic
Glen may be referring to LuSql indexing with multiple threads? Does/can DIH do that, too? Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Hadoop ecosystem search :: http://search-hadoop.com/ - Original Message > From: Yonik Seeley > To: solr-user@lucene.apache.org

Re: Documents disappearing

2010-02-19 Thread Otis Gospodnetic
Pascal, Look at the difference between numDocs and maxDoc. That delta represents deleted docs. Maybe there is something deleting your docs after all! Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch Hadoop ecosystem search :: http://search-hadoop.com/ - Original Mess
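
The arithmetic behind that advice can be sketched as follows. The two numbers are the first two values from Pascal's LukeRequestHandler output in this thread; reading them as numDocs and maxDoc is an assumption based on the usual order of that output. The function name is made up for the example.

```python
def deleted_doc_count(num_docs: int, max_doc: int) -> int:
    """Deleted-but-not-yet-merged-away documents: maxDoc minus numDocs."""
    if max_doc < num_docs:
        raise ValueError("max_doc should never be smaller than num_docs")
    return max_doc - num_docs


# Assumed to be numDocs=7725 and maxDoc=28099 from the Luke output
print(deleted_doc_count(7725, 28099))  # → 20374
```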

Re: filter result by catalog

2010-02-19 Thread Otis Gospodnetic
Hello Kevin, What have you tried so far? I see from http://www.search-lucene.com/m?id=839141.906...@web81107.mail.mud.yahoo.com||acl you've tried the "acl field" approach. How about the bitset approach described there? Otis Sematext :: http://sematext.com/ :: Solr - Lucene - Nutch

Re: spellcheck.build=true has no effect

2010-02-19 Thread darniz
Hello, can someone please correct me or confirm whether this is the correct behaviour? Thanks, darniz darniz wrote: > > Hello All. > After doing a lot of research I came to this conclusion; please correct me > if I am wrong. > I noticed that if you have buildOnCommit and buildOnOptimize as true in >

Re: highlighting fragments EMPTY

2010-02-19 Thread adeelmahmood
Well, OK, I guess that makes sense. I tried changing my title field to the text type, and then highlighting worked on it. But 1) as for not merging all fields into the catchall field and instead configuring the dismax handler to search through them: do you mean I'll then have to specify the field I wa

filter result by catalog

2010-02-19 Thread Kevin Osborn
So, I am looking at better ways to filter a resultset by catalog. So, I have an index of products. And based on the user, I want to filter the search results to what they are allowed to see. I will probably have up to 200 or so different catalogs.
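
For reference, the "ACL field" style of filtering discussed in this thread could look roughly like the sketch below: each product document carries the catalogs it belongs to, and every search adds a filter query restricted to the user's catalogs. The field name `catalog_id` and the helper function are hypothetical; only the `q` and `fq` request parameters are standard Solr.

```python
from urllib.parse import urlencode


def catalog_filter_params(user_query, allowed_catalogs, field="catalog_id"):
    """Build a query string that limits results to the user's catalogs.

    `field` is a hypothetical multi-valued field holding catalog IDs.
    """
    if not allowed_catalogs:
        raise ValueError("user must be allowed to see at least one catalog")
    clause = " OR ".join(str(c) for c in sorted(allowed_catalogs))
    return urlencode({"q": user_query, "fq": f"{field}:({clause})"})


print(catalog_filter_params("ipod", [7, 3]))
```

With up to 200 catalogs the filter clause stays small as long as each user sees only a few of them; the scalability concern raised in the thread shows up when a user is entitled to many catalogs at once.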

Re: long warmup duration

2010-02-19 Thread Antonio Lobato
You can disable warming, and a new searcher will register (almost) instantly, no matter the size. However, once you run your first search, you will be "warming" your searcher, and it will block for a long, long time, giving the end user a "frozen" page. Warming is just another word for "runni

Re: Solr 1.5 in production

2010-02-19 Thread Grant Ingersoll
On Feb 19, 2010, at 4:54 PM, Asif Rahman wrote: > What is the prevailing opinion on using solr 1.5 in a production > environment? I know that many people were using 1.4 in production for a > while before it became an official release. > > Specifically I'm interested in using some of the new spa

Solr 1.5 in production

2010-02-19 Thread Asif Rahman
What is the prevailing opinion on using solr 1.5 in a production environment? I know that many people were using 1.4 in production for a while before it became an official release. Specifically I'm interested in using some of the new spatial features. Thanks, Asif -- Asif Rahman Lead Engineer

Re: Multicore Example

2010-02-19 Thread K Wong
The point that these guys are trying to make is that if another program is using the port that Solr is trying to bind to, then the two will fight over exclusive use of the port. Both the netstat and lsof commands work fine on my Mac (Leopard 10.5.8). Trinity:~ kelvin$ which netstat /usr/sbin/
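
If netstat or lsof are not at hand, the same check can be approximated from a script. The sketch below simply tries to bind to the port and treats a failure as "something else already has it"; the function name is made up for the example, and 8983 and 8080 are the ports mentioned in this thread.

```python
import socket


def port_in_use(port, host="127.0.0.1"):
    """Return True if binding to (host, port) fails, i.e. the port is taken."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            return True
        return False


for candidate in (8983, 8080):
    print(candidate, "in use" if port_in_use(candidate) else "free")
```

Note that a bind failure can also be a permissions problem (for example on ports below 1024), so this is a quick hint rather than a replacement for lsof.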

Re: Multicore Example

2010-02-19 Thread Lee Smith
Thanks Shawn. I am actually running it on a Mac, and it does not like those unix commands. Any further advice? Lee On 19 Feb 2010, at 20:32, Shawn Heisey wrote: > Assuming you are on a unix variant with a working lsof, use this. This > probably won't work correctly on Solaris 10: > > lsof -nPi

RE: Documents disappearing

2010-02-19 Thread Pascal Dimassimo
Using LukeRequestHandler, I see: 7725 28099 758826 1266355690710 false true true org.apache.lucene.store.NIOFSDirectory:org.apache.lucene.store.NIOFSDirectory@/opt/solr/myindex/data/index I will copy the index to my local machine so I can open it with luke. Should I look for something specific

Re: Multicore Example

2010-02-19 Thread Shawn Heisey
Assuming you are on a unix variant with a working lsof, use this. This probably won't work correctly on Solaris 10: lsof -nPi | grep 8983 lsof -nPi | grep 8080 On Windows, you can do this in a command prompt. It requires elevation on Vista or later. The -b option was added in WinXP SP2 and

Re: Multicore Example

2010-02-19 Thread Dave Searle
Are you on windows? Try netstat -a Sent from my iPhone On 19 Feb 2010, at 20:02, "Lee Smith" wrote: > How can I find out ?? > > > On 19 Feb 2010, at 19:26, Dave Searle wrote: > >> Do you have something else using port 8983 or 8080? >> >> Sent from my iPhone >> >> On 19 Feb 2010, at 19:22, "Lee

RE: Documents disappearing

2010-02-19 Thread Ankit Bhatnagar
Try inspecting your index with Luke. Ankit -Original Message- From: Pascal Dimassimo [mailto:thesuper...@hotmail.com] Sent: Friday, February 19, 2010 2:22 PM To: solr-user@lucene.apache.org Subject: Documents disappearing Hi, I have encountered a situation that I can't explain. We are

Strange performance behaviour when concurrent requests are done

2010-02-19 Thread Marc Sturlese
Hey there, I have been doing some stress testing on a server with 2 physical CPUs (4 cores each). After some reading about GC performance tuning I have configured it this way: /usr/lib/jvm/java-6-sun/bin/java -server -Xms7000m -Xmx7000m -XX:ReservedCodeCacheSize=10m -XX:NewSize=1000m -XX:MaxNewSize=100

Re: long warmup duration

2010-02-19 Thread Yonik Seeley
On Fri, Feb 19, 2010 at 12:17 PM, Stefan Neumann wrote: > I am quite confused with your configuration. It seems to me, that your > caches are extremely small for 30 million documents (128) The units of the cache are entries, not documents. So a queryResultCache autowarm count of a few dozen is nor
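
The "entries, not documents" point can be illustrated with a toy cache. This is only a sketch of the idea and not Solr's cache implementation: the size limit counts cached queries, and each entry can hold any number of document IDs.

```python
from collections import OrderedDict


class QueryResultCache:
    """Toy LRU cache whose size is counted in entries (one per query)."""

    def __init__(self, max_entries):
        self.max_entries = max_entries
        self._entries = OrderedDict()

    def put(self, query, doc_ids):
        self._entries[query] = list(doc_ids)
        self._entries.move_to_end(query)
        # Evict least recently used queries once the entry limit is exceeded
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)

    def get(self, query):
        if query not in self._entries:
            return None
        self._entries.move_to_end(query)
        return self._entries[query]

    def __len__(self):
        return len(self._entries)


cache = QueryResultCache(max_entries=2)
cache.put("title:solr", range(1000))    # one entry, 1000 doc IDs
cache.put("title:lucene", range(5000))  # one entry, 5000 doc IDs
cache.put("title:nutch", range(10))     # evicts the oldest entry
print(len(cache))  # → 2
```

A cache of 128 entries can therefore answer 128 distinct queries regardless of how many documents the index holds.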

Re: Multicore Example

2010-02-19 Thread Lee Smith
How can I find out ?? On 19 Feb 2010, at 19:26, Dave Searle wrote: > Do you have something else using port 8983 or 8080? > > Sent from my iPhone > > On 19 Feb 2010, at 19:22, "Lee Smith" wrote: > >> Hey All >> >> Trying to dip my feet into multicore and hoping someone can advise >> why th

Re: Seattle Hadoop/Lucene/NoSQL Meetup; Wed Feb 24th, Feat. MongoDB

2010-02-19 Thread Nick Dimiduk
Reminder: this month's Seattle Hadoop Meetup is this Wednesday. Don't forget to RSVP! On Tue, Feb 16, 2010 at 6:09 PM, Bradford Stephens < bradfordsteph...@gmail.com> wrote: > Greetings, > > It's time for another awesome Seattle Hadoop/Lucene/Scalability/NoSQL > Meetup! > > As always, it's at the

Re: Multicore Example

2010-02-19 Thread Dave Searle
Do you have something else using port 8983 or 8080? Sent from my iPhone On 19 Feb 2010, at 19:22, "Lee Smith" wrote: > Hey All > > Trying to dip my feet into multicore and hoping someone can advise > why the example is not working. > > Basically I have been working with the example single cor

Re: Multicore Example

2010-02-19 Thread Pascal Dimassimo
Are you sure that you don't have any java processes that are still running? Did you change the port or are you still using 8983? Lee Smith-6 wrote: > > Hey All > > Trying to dip my feet into multicore and hoping someone can advise why the > example is not working. > > Basically I have been w

Documents disappearing

2010-02-19 Thread Pascal Dimassimo
Hi, I have encountered a situation that I can't explain. We are indexing documents that are often duplicates so we activated deduplication like this: true true signature title,text org.apache.solr.update.processor.Lookup3Signature What I can't explain is that when

Multicore Example

2010-02-19 Thread Lee Smith
Hey All Trying to dip my feet into multicore and hoping someone can advise why the example is not working. Basically I have been working with the example single core fine so I have stopped the server and restarted with the new command line for multicore ie, java -Dsolr.solr.home=multicore -jar

Re: What is largest reasonable setting for ramBufferSizeMB?

2010-02-19 Thread Tom Burton-West
Hi Glen, I'd love to use LuSql, but our data is not in a db. It's 6-8TB of files containing OCR (one file per page for about 1.5 billion pages) gzipped on disk which are ungzipped, concatenated, and converted to Solr documents on-the-fly. We have multiple instances of our Solr document producer s
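
The pipeline described above (gunzip each page, concatenate, emit one document) might be sketched as below. The field names `id` and `ocr` and the function name are assumptions for illustration; the real document producer is not shown in the thread.

```python
import gzip


def build_document(volume_id, gzipped_pages):
    """Decompress per-page OCR blobs and join them into one document body.

    `gzipped_pages` is an iterable of gzip-compressed byte strings,
    one per page, in page order. Field names are hypothetical.
    """
    pages = (gzip.decompress(blob).decode("utf-8") for blob in gzipped_pages)
    return {"id": volume_id, "ocr": "\n".join(pages)}


sample_pages = [gzip.compress(b"page one"), gzip.compress(b"page two")]
print(build_document("vol1", sample_pages))
```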

Re: Run Solr within my war

2010-02-19 Thread Richard Frovarp
Pulkit Singhal wrote: Using EmbeddedSolrServer is a client side way of communicating with Solr via the file system. Solr has to still be up and running before that. My question is more along the lines of how to put the server jars that perform the core functionality and bundle them to start up wi

Re: @Field annotation support

2010-02-19 Thread Pulkit Singhal
Ok then, is this the correct class to support the @Field annotation? Because I have it on the path but its not working. org\apache\solr\solr-solrj\1.4.0\solr-solrj-1.4.0.jar/org\apache\solr\client\solrj\beans\Field.class 2010/2/18 Noble Paul നോബിള്‍ नोब्ळ् : > solrj jar > > On Thu, Feb 18, 2010

Re: Run Solr within my war

2010-02-19 Thread Pulkit Singhal
Using EmbeddedSolrServer is a client side way of communicating with Solr via the file system. Solr has to still be up and running before that. My question is more along the lines of how to put the server jars that perform the core functionality and bundle them to start up within a war which is also

Re: long warmup duration

2010-02-19 Thread Stefan Neumann
Hey, I am quite confused with your configuration. It seems to me, that your caches are extremely small for 30 million documents (128) and during warmup you only put up to 20 docs in it. Please correct me if I misunderstand anything. In my opinion your warm up duration is not that impressive, since

Re: What is largest reasonable setting for ramBufferSizeMB?

2010-02-19 Thread Yonik Seeley
On Fri, Feb 19, 2010 at 5:03 AM, Glen Newton wrote: > You may consider using LuSql[1] to create the indexes, if your source > content is in a JDBC accessible db. It is quite a bit faster than > Solr, as it is a tool specifically created and tuned for Lucene > indexing. Any idea why it's faster? A

Re: highlighting fragments EMPTY

2010-02-19 Thread Ahmet Arslan
> hi > i am trying to get highlighting working and its turning out > to be a pain. > here is my schema > > stored="true" required="true" > /> > stored="true"  /> > stored="true" /> > stored="true" /> > > here is the catchall field (default field for search as > well) > stored="false" > m

Re: highlighting fragments EMPTY

2010-02-19 Thread Jan
All of your fields seem to be of a "string" type, that's why the highlighting doesn't work. The highlighting fields must be tokenized before you can do the highlighting on them. Jan. --- On Fri, 2/19/10, adeelmahmood wrote: From: adeelmahmood Subject: highlighting fragments EMPTY To: sol
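
Why tokenization matters here can be shown with a toy model. This is a simplified illustration and not how Solr's analyzers or highlighter are implemented: an untokenized "string" field keeps the whole value as a single term, so an individual query word never matches anything inside it.

```python
def index_terms(value, tokenized):
    """Toy analysis: split and lowercase for text fields, keep strings whole."""
    if tokenized:
        return value.lower().split()
    return [value]


def has_highlightable_match(value, query_term, tokenized):
    """True if the query term equals one of the terms produced for the field."""
    return query_term.lower() in index_terms(value, tokenized)


title = "Introduction to Solr highlighting"
print(has_highlightable_match(title, "solr", tokenized=True))   # → True
print(has_highlightable_match(title, "solr", tokenized=False))  # → False
```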

highlighting fragments EMPTY

2010-02-19 Thread adeelmahmood
hi i am trying to get highlighting working and its turning out to be a pain. here is my schema here is the catchall field (default field for search as well) here is how I have setup the solrconfig file title pi status 0 content content cont

Re: Range Searches in Collections

2010-02-19 Thread cjkadakia
Unfortunately the number of fees is unknown so we couldn't add the fields into the solr schema until runtime. The work-around we did was create an additional column in the view I'm pulling from for the index to determine each record's minimum "fee" and throw that into the column. A total hack, but

range of scores : queryNorm()

2010-02-19 Thread Smith G
Hello, I have observed that even if we change boosting drastically, scores are being normalized at the end because of the queryNorm value. Is there anything (regarding the queryNorm) that we can rely on? For example, will the score always be under 10 or some fixed value? The main objective is to
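
The effect described above can be reproduced with a small calculation, assuming Lucene's default similarity, where queryNorm is 1 / sqrt(sumOfSquaredWeights). The weights below are made-up numbers, and the sketch covers only the query-side normalization, not the full score.

```python
import math


def query_norm(term_weights):
    """queryNorm under the default similarity: 1 / sqrt(sum of squared weights)."""
    sum_of_squares = sum(w * w for w in term_weights)
    if sum_of_squares == 0:
        raise ValueError("at least one non-zero weight is required")
    return 1.0 / math.sqrt(sum_of_squares)


def normalized_weights(term_weights):
    """Per-term query weights after queryNorm has been applied."""
    norm = query_norm(term_weights)
    return [w * norm for w in term_weights]


# Scaling every boost by 100 leaves the normalized weights unchanged
print(normalized_weights([1.0, 2.0]))
print(normalized_weights([100.0, 200.0]))
```

So only the relative sizes of the boosts survive normalization. queryNorm makes scores from different queries more comparable, but by itself it does not pin the final score under any fixed value.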

Re: Question regarding wildcards and dismax

2010-02-19 Thread gwk
Have a look at the q.alt parameter (http://wiki.apache.org/solr/DisMaxRequestHandler#q.alt) which is used for exactly this issue. Basically putting q.alt=*:* in your query means you can leave out the q parameter if you want all documents to be selected. Regards, gwk On 2/19/2010 11:28 AM, Ro
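
A request built along those lines might look like the sketch below. The `q.alt`, `facet` and `facet.field` parameters are standard; selecting the handler with `defType=dismax`, the facet field names and the helper function are assumptions for the example.

```python
from urllib.parse import urlencode


def dismax_params(user_input, facet_fields):
    """Build dismax parameters; omit q when the user has typed nothing yet."""
    params = [("defType", "dismax"), ("q.alt", "*:*"), ("facet", "true")]
    params += [("facet.field", name) for name in facet_fields]
    if user_input and user_input.strip():
        params.append(("q", user_input.strip()))
    return urlencode(params)


# First page load: no q, so q.alt=*:* selects all documents
print(dismax_params("", ["brand", "category"]))
```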

Question regarding wildcards and dismax

2010-02-19 Thread Roland Villemoes
Hi all, We have a web application built on top of Solr, and we are using a lot of facets - everything works just fine. When the user first hits the search page - we would like to do a "get all query" to get a result, and thereby get all facets so we can build up the user interface from this resu

Re: What is largest reasonable setting for ramBufferSizeMB?

2010-02-19 Thread Glen Newton
I've run Lucene with heap sizes as large as 28GB of RAM (on a 32GB machine, 64bit, Linux) and a ramBufferSize of 3GB. While I haven't noticed the GC issues mark mentioned in this configuration, I have seen them in the ranges he discusses (on 1.6 http://lab.cisti-icist.nrc-cnrc.gc.ca/cistilabswiki/i

Re: replications issue

2010-02-19 Thread giskard
Ciao, Uhm, after some time a new index has been written to data/index on the slave, roughly the size of the master index. The configuration on both master and slave is the same as the one on the SolrReplication wiki page, "enable/disable master/slave in a node" ${enable.master:false} commit schema

Re: How does one sort facet queries?

2010-02-19 Thread gwk
On 2/19/2010 2:15 AM, Kelly Taylor wrote: All sorting of facets works great at the field level (count/index)...all good there...but how is sorting accomplished with range queries? The solrj response doesn't seem to maintain the order the queries are sent in, and the order is not in index or count