Re: Unsubscribe from this mailing-list

2009-09-25 Thread Avlesh Singh
You seem to be desperate to get out of the Solr mailing list :) Send an email to solr-user-unsubscr...@lucene.apache.org Cheers Avlesh On Fri, Sep 25, 2009 at 11:54 AM, Rafeek Raja wrote: > Unsubscribe from this mailing-list >

Highlighting on text fields

2009-09-25 Thread Avlesh Singh
I am new to the whole highlighting API and have a few basic questions: I have a "text" type field defined as underneath: And the schema field is assoc

Re: Showcase: Facetted Search for Wine using Solr

2009-09-25 Thread Marian Steinbach
Hi Grant! Thanks for the advice, I added the link to the list. Regards, Marian On Fri, Sep 25, 2009 at 5:14 AM, Grant Ingersoll wrote: > Hi Marian, > > Looks great!  Wish I could order some wine.  When you get a chance, please > add the site to http://wiki.apache.org/solr/PublicServers! > >

problem with HTMLStripStandardTokenizerFactory

2009-09-25 Thread Kundig, Andreas
Hello I can't bring HTMLStripStandardTokenizerFactory to remove the content of the style tag, as the documentation says it should. A search for 'mso' returns a document where the search term only appears in the style tag (it's a word document saved as html). Here is the highlight returned by s

Using two Solr documents to represent one logical document/file

2009-09-25 Thread Peter Ledbrook
Hi, I want to index both the contents of a document/file and metadata associated with that document. Since I also want to update the content and metadata indexes independently, I believe that I need to use two separate Solr documents per real/logical document. The question I have is how do I merg

Re: Highlighting on text fields

2009-09-25 Thread Avlesh Singh
I got the answer to my question. The field needs to be "stored" (or "termVector" enabled) for highlighting to work properly. Cheers Avlesh On Fri, Sep 25, 2009 at 1:01 PM, Avlesh Singh wrote: > I am new to the whole highlighting API and have a few basic questions: > I have a "text" type field d

What options would you recommend for the Sun JVM?

2009-09-25 Thread Jérôme Etévé
Hi solr addicts, I know there's no one-size-fits-all set of options for the Sun JVM, but I think it'd be useful to everyone to share your tips on using the Sun JVM with solr. For instance, I recently figured out that setting the tenured generation garbage collection to Concurrent mark and sweep (

OOM error during merge - index still ok?

2009-09-25 Thread Phillip Farber
Can I expect the index to be left in a usable state after an out of memory error during a merge or is it most likely to be corrupt? I'd really hate to have to start this index build again from square one. Thanks, Phil --- Exception in thread "http-8080-Processor2505" java.lang.

DIH & RSS > 1.4 nightly 2009-09-25 > full-import&clean=false always clean and import command do nothing

2009-09-25 Thread Brahim Abdesslam
Hello everybody, we are using Solr to index some RSS feeds for a news aggregator application. We've got some difficulties with the publication date of each item because each site uses a homemade date format. The fact is that we want to have the exact amount of time between the date of publicati

RE: Alphanumeric Wild Card Search Question

2009-09-25 Thread Carr, Adrian
Hi Ken, I am using the WordDelimiterFilterFactory. I thought I needed it because I thought that's what gave me the control over the options of how the words are split and indexed? I did try taking it out completely, but that didn't seem to help. I'll try the analysis tool today. There has got t

Solr and Garbage Collection

2009-09-25 Thread Jonathan Ariel
Hi to all! Lately my solr servers seem to stop responding once in a while. I'm using solr 1.3. Of course I'm having more traffic on the servers. So I logged the Garbage Collection activity to check if it's because of that. It seems like 11% of the time the application runs, it is stopped because of

Re: Solr and Garbage Collection

2009-09-25 Thread Yonik Seeley
On Fri, Sep 25, 2009 at 9:30 AM, Jonathan Ariel wrote: > Hi to all! > Lately my solr servers seem to stop responding once in a while. I'm using > solr 1.3. > Of course I'm having more traffic on the servers. > So I logged the Garbage Collection activity to check if it's because of > that. It seems

RE: Alphanumeric Wild Card Search Question

2009-09-25 Thread Carr, Adrian
In case it helps, here's what I have currently, but I've been messing with different options: -Original Message- From: Carr, Adrian [mailto:adrian.c...@jtv.com] Sent: Friday, September 25, 2009 9:28 AM To: solr-user@lucene.apache.org Subject: RE: Alphanumeric Wild Card Search Questio

Re: OOM error during merge - index still ok?

2009-09-25 Thread Yonik Seeley
On Fri, Sep 25, 2009 at 8:20 AM, Phillip Farber wrote: >  Can I expect the index to be left in a usable state after an out of memory > error during a merge or is it most likely to be corrupt? It should be in the state it was after the last successful commit. -Yonik http://www.lucidimagination.co

Re: Can we point a Solr server to index directory dynamically at runtime..

2009-09-25 Thread Michael
Are you storing (in addition to indexing) your data? Perhaps you could turn off storage on data older than 7 days (requires reindexing), thus losing the ability to return snippets but cutting down on your storage space and server count. I've experienced 10x decrease in space requirements and a la

Re: Parallel requests to Tomcat

2009-09-25 Thread Michael
Thank you Grant and Lance for your comments -- I've run into a separate snag which puts this on hold for a bit, but I'll return to finish digging into this and post my results. - Michael On Thu, Sep 24, 2009 at 9:23 PM, Lance Norskog wrote: > Are you on Java 5, 6 or 7? Each release sees some twea

RE: Mixed field types and boolean searching

2009-09-25 Thread Ensdorf Ken
> No- there are various analyzers. StandardAnalyzer is geared toward > searching bodies of text for interesting words - punctuation is > ripped out. Other analyzers are more useful for "concrete" text. You > may have to work at finding one that leaves punctuation in. > My problem is not with the

Faceted Search on Dynamic Fields?

2009-09-25 Thread danben
I'm trying to perform a faceted query with the facet field referencing a field that is not in the schema but matches a dynamicField with its suffix. The query returns results but for some reason the facet list is always empty. When I change the facet field to one that is explicitly named in the

Re: What options would you recommend for the Sun JVM?

2009-09-25 Thread Grant Ingersoll
On Sep 25, 2009, at 7:30 AM, Jérôme Etévé wrote: Hi solr addicts, I know there's no one-size-fits-all set of options for the Sun JVM, but I think it'd be useful to everyone to share your tips on using the Sun JVM with solr. For instance, I recently figured out that setting the tenured generat

Re: Faceted Search on Dynamic Fields?

2009-09-25 Thread danben
Also, here is the field definition in the schema -- View this message in context: http://www.nabble.com/Faceted-Search-on-Dynamic-Fields--tp25612887p25612936.html Sent from the Solr - User mailing list archive at Nabble.com.

Re: Solr and Garbage Collection

2009-09-25 Thread Jonathan Ariel
Right, now I'm giving it 12GB of heap memory. If I give it less (10GB) it throws the following exception: Sep 5, 2009 7:18:32 PM org.apache.solr.common.SolrException log SEVERE: java.lang.OutOfMemoryError: Java heap space at org.apache.lucene.search.FieldCacheImpl$10.createValue(FieldCache

RE: Solr and Garbage Collection

2009-09-25 Thread cbennett
Hi, Have you looked at tuning the garbage collection? Take a look at the following articles: http://www.lucidimagination.com/blog/2009/09/19/java-garbage-collection-boot-camp-draft/ http://java.sun.com/docs/hotspot/gc5.0/gc_tuning_5.html Changing to the concurrent or throughput collector shoul

Re: Solr and Garbage Collection

2009-09-25 Thread Mark Miller
I've got the start of a Garbage Collection article here: http://www.lucidimagination.com/blog/2009/09/19/java-garbage-collection-boot-camp-draft/ I plan to tie it more into Lucene/Solr and add some more about the theory/methods in the final version. With so much RAM, I take it you prob have a han

FACET_SORT_INDEX descending?

2009-09-25 Thread Gerald Snyder
Is there any value for the "f.my_year_facet.facet.sort" parameter that will return the facet values in descending order? So far I only see "index" and "count" as the choices. http://lucene.apache.org/solr/api/org/apache/solr/common/params/FacetParams.html#FACET_SORT_INDEX Thanks. Gerald Sn
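If no server-side option turns up, one workaround is to request the facet with the "index" sort and reverse the order on the client. The following is only a minimal sketch: it assumes the JSON response carries the facet as a flat [value, count, value, count, ...] list, and the helper names and sample values are hypothetical.

```python
def facet_pairs(flat):
    """Turn a flat [value, count, value, count, ...] list into (value, count) pairs."""
    return list(zip(flat[0::2], flat[1::2]))


def descending_by_value(flat):
    """Return the facet pairs ordered by facet value, highest first."""
    return sorted(facet_pairs(flat), key=lambda pair: pair[0], reverse=True)


# Made-up sample data standing in for the my_year_facet entry of a response.
sample = ["2007", 12, "2008", 30, "2009", 7]
print(descending_by_value(sample))
```

This costs nothing extra on the server, but it only works if the client receives every facet value it needs (for example, it does not combine well with a small facet.limit).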

RE: Solr and Garbage Collection

2009-09-25 Thread Fuad Efendi
> Bigger heaps lead to bigger GC pauses in general. Opposite viewpoint: 1sec GC happening once an hour is MUCH BETTER than 30ms GC once-per-second. To lower frequency of GC: -Xms4096m -Xmx4096m (make it equal!) Use -server option. -server option of JVM is 'native CPU code', I remember WebLogic

RE: Solr and Garbage Collection

2009-09-25 Thread Fuad Efendi
Give it even more memory. Lucene FieldCache is used to store non-tokenized single-value non-boolean (DocumentId -> FieldValue) pairs, and it is used (in-full!) for instance for sorting query results. So that if you have 100,000,000 documents with specific heavily distributed field values (cardina

Re: Solr and Garbage Collection

2009-09-25 Thread Jonathan Ariel
You are saying that I should give more memory than 12GB? When I was with 10GB I had the exceptions that I sent. Switching to 12GB made them disappear. So I think I don't have problems with FieldCache any more. What does seem like a problem is the 11% of application time dedicated to GC. Especially wh

RE: Solr and Garbage Collection

2009-09-25 Thread Fuad Efendi
> You are saying that I should give more memory than 12GB? Yes. Look at this: > > SEVERE: java.lang.OutOfMemoryError: Java heap space > org.apache.lucene.search.FieldCacheImpl$10.createValue(FieldCacheImpl.java:3 > 61 > > ) It can't find few (!!!) contiguous bytes for .createValue(...) It ca

Re: Solr and Garbage Collection

2009-09-25 Thread Mark Miller
Yes - more RAM is not a solution to your problem. Jonathan Ariel wrote: > You are saying that I should give more memory than 12GB? > When I was with 10GB I had the exceptions that I sent. Switching to 12GB > made them disappear. > So I think I don't have problems with FieldCache any more. What it

Re: Faceted Search on Dynamic Fields?

2009-09-25 Thread Avlesh Singh
Faceting, as of now, can only be done on definitive field names. Faceting on field names matching wildcards (dynamic field being one such scenario) is yet to be supported. There are a lot of open issues aiming to achieve this. Find a similar discussion here - http://www.lucidimagination.com/search/d

RE: Solr and Garbage Collection

2009-09-25 Thread cbennett
I would look at the JVM. Have you tried switching to the concurrent low pause collector ? Colin. -Original Message- From: Jonathan Ariel [mailto:ionat...@gmail.com] Sent: Friday, September 25, 2009 12:07 PM To: solr-user@lucene.apache.org Subject: Re: Solr and Garbage Collection You ar

Re: Solr and Garbage Collection

2009-09-25 Thread Jonathan Ariel
I can't really understand how increasing the heap will decrease the 11% dedicated to GC On 9/25/09, Fuad Efendi wrote: >> You are saying that I should give more memory than 12GB? > > > Yes. Look at this: > >> > SEVERE: java.lang.OutOfMemoryError: Java heap space >> > org.apache.lucene.search.Fiel

Re: Solr and Garbage Collection

2009-09-25 Thread Jonathan Ariel
BTW why making them equal will lower the frequency of GC? On 9/25/09, Fuad Efendi wrote: >> Bigger heaps lead to bigger GC pauses in general. > > Opposite viewpoint: > 1sec GC happening once an hour is MUCH BETTER than 30ms GC once-per-second. > > To lower frequency of GC: -Xms4096m -Xmx4096m (ma

Re: Solr and Garbage Collection

2009-09-25 Thread Mark Miller
It won't really - it will just keep the JVM from wasting time resizing the heap on you. Since you know you need so much RAM anyway, no reason not to just pin it at what you need. Not going to help you much with GC though. Jonathan Ariel wrote: > BTW why making them equal will lower the frequency o

Re: Faceted Search on Dynamic Fields?

2009-09-25 Thread Yonik Seeley
On Fri, Sep 25, 2009 at 12:19 PM, Avlesh Singh wrote: > Faceting, as of now, can only be done on definitive field names. To further clarify, the fields you can facet on can include those defined by dynamic fields. You just have to specify the exact field name when you facet. Did you reall

Re: Solr and Garbage Collection

2009-09-25 Thread Mark Miller
>-server option of JVM is 'native CPU code', I remember WebLogic 7 console >with SUN JVM 1.3 not showing any GC (just horizontal line). Not sure what that is all about either. -server and -client are just two different versions of hotspot. The -server version is optimized for long running applicat

RE: Solr and Garbage Collection

2009-09-25 Thread Walter Underwood
30ms is not better or worse than 1s until you look at the service requirements. For many applications, it is worth dedicating 10% of your processing time to GC if that makes the worst-case pause short. On the other hand, my experience with the IBM JVM was that the maximum query rate was 2-3X bette

Re: download pre-release nightly solr 1.4

2009-09-25 Thread michael8
markrmiller wrote: > > michael8 wrote: >> Hi, >> >> I know Solr 1.4 is going to be released any day now pending Lucene 2.9 >> release. Is there anywhere where one can download a pre-released nightly >> build of Solr 1.4 just for getting familiar with new features (e.g. field >> collapsing)? >>

Re: Solr and Garbage Collection

2009-09-25 Thread Mark Miller
Walter Underwood wrote: > 30ms is not better or worse than 1s until you look at the service > requirements. For many applications, it is worth dedicating 10% of your > processing time to GC if that makes the worst-case pause short. > > On the other hand, my experience with the IBM JVM was that the

Re: download pre-release nightly solr 1.4

2009-09-25 Thread Mark Miller
michael8 wrote: > > markrmiller wrote: > >> michael8 wrote: >> >>> Hi, >>> >>> I know Solr 1.4 is going to be released any day now pending Lucene 2.9 >>> release. Is there anywhere where one can download a pre-released nightly >>> build of Solr 1.4 just for getting familiar with new feature

boost function for date as unix stamp

2009-09-25 Thread Joe Calderon
hello *, i read on the wiki about using recip(rord(...)...) to boost recent documents with a date field, does anyone have a good function for doing something similar with unix timestamps? if not, is there a lot of overhead related to counting the number of distinct values for rord() ? thx much
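To make the shape of a reciprocal boost over raw timestamps concrete, here is a small sketch that evaluates recip(x, m, a, b) = a / (m*x + b) on a document's age in seconds. It runs outside Solr and only illustrates the curve; the constants (a = b = 1, m = one over the seconds in a year) and the function name recency_boost are assumptions chosen for illustration, not values from the post.

```python
SECONDS_PER_YEAR = 365 * 24 * 3600


def recip(x, m, a, b):
    """Reciprocal function: a / (m * x + b)."""
    return a / (m * x + b)


def recency_boost(doc_ts, now_ts, m=1.0 / SECONDS_PER_YEAR, a=1.0, b=1.0):
    """Boost for a document stamped doc_ts (unix seconds), evaluated at now_ts."""
    age = max(0, now_ts - doc_ts)  # clamp future-dated documents to age 0
    return recip(age, m, a, b)


now = 1253836800  # an arbitrary unix timestamp used only for the demo
for years in (0, 1, 3):
    print(years, recency_boost(now - years * SECONDS_PER_YEAR, now))
```

With these constants a brand-new document gets a boost of 1.0 and a one-year-old document about 0.5, so m controls how quickly the boost halves.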

RE: Solr and Garbage Collection

2009-09-25 Thread Walter Underwood
As I said, I was using the IBM JVM, not the Sun JVM. The "concurrent low pause" collector is only in the Sun JVM. I just found this excellent article about the various IBM GC options for a Lucene application with a 100GB heap: http://www.nearinfinity.com/blogs/aaron_mccurry/tuning_the_ibm_jvm_for

Re: Solr and Garbage Collection

2009-09-25 Thread Jonathan Ariel
Ok. I will try with the "concurrent low pause" collector and let you know the results. On Fri, Sep 25, 2009 at 2:23 PM, Walter Underwood wrote: > As I said, I was using the IBM JVM, not the Sun JVM. The "concurrent low > pause" collector is only in the Sun JVM. > > I just found this excellent arti

Re: Solr and Garbage Collection

2009-09-25 Thread Mark Miller
My bad - later, it looks as if you're giving general advice, and that's what I took issue with. Any Collector that is not doing generational collection is essentially from the dark ages and shouldn't be used. Any Collector that doesn't have concurrent options, unless possibly you're running a tiny app

8 for 1.4

2009-09-25 Thread Grant Ingersoll
Y'all, We're down to 8 open issues: https://issues.apache.org/jira/secure/BrowseVersion.jspa?id=12310230&versionId=12313351&showOpenIssuesOnly=true 2 are packaging related, one is dependent on the official 2.9 release (so should be taken care of today or tomorrow I suspect) and then we hav

RE: Solr and Garbage Collection

2009-09-25 Thread Walter Underwood
For batch-oriented computing, like Hadoop, the most efficient GC is probably a non-concurrent, non-generational GC. I doubt that there are many batch-oriented applications of Solr, though. The rest of the advice is intended to be general and it sounds like we agree about sizing. If the nursery is

Solr + Jboss + Custom Transformers

2009-09-25 Thread Papiya Misra
Hi I am trying to use a custom transformer that extends org.apache.solr.handler.dataimport.Transformer. I have the CustomTransformer.jar and DataImportHandler.jar in JBOSS/server/default/lib. I have the solr.war (as is from the distro) in the JBOSS/server/default/deploy. org.apache.solr.handler

Re: Solr and Garbage Collection

2009-09-25 Thread Mark Miller
Walter Underwood wrote: > For batch-oriented computing, like Hadoop, the most efficient GC is probably > a non-concurrent, non-generational GC. Okay - for batch we somewhat agree I guess - if you can stand any length of pausing, non concurrent can be nice, because you don't pay for thread sync com

Re: Solr and Garbage Collection

2009-09-25 Thread Mark Miller
This all applies to having more than one processor though - if you have one processor, then non-concurrent can also make sense. But especially with the young space, you want concurrency - with up to 98% of objects being short lived, and multiple threads generally creating new objects, it's a huge b

Re: FW: Solr and Garbage Collection

2009-09-25 Thread Mark Miller
Fuad, you didn't read the thread right. He is not having a problem with OOM. He got the OOM because he lowered the heap to try and help GC. He normally runs with a heap that can handle his FC. Please re-read the thread. You are confusing the thread. - Mark Fuad Efendi wrote: > Guys, thanks for

RE: FW: Solr and Garbage Collection

2009-09-25 Thread Fuad Efendi
> He is not having a problem with OOM. He got the OOM because he lowered > the heap to try and help GC. That is very confusing!!! Lowering heap helps GC? Someone mentioned it in this thread, but my viewpoint is completely opposite. 1. Some RAM is needed to_be_reserved for FieldCache (it will be

FW: Solr and Garbage Collection

2009-09-25 Thread Fuad Efendi
Guys, thanks for GC discussion; but the root of a problem is FieldCache internals. Not enough RAM for FieldCache will cause unpredictable OOM, and it does not depend on GC. How much RAM FieldCache needs in case of 2 different values for a Field, 200 bytes each (Unicode), and 100M documents? Wh
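The question above can be framed as back-of-the-envelope arithmetic. The sketch below compares two hypothetical storage models for the numbers in the post (100M documents, 2 distinct values of 200 bytes each). Both models, the 4-byte slot size, and the function names are assumptions for illustration; JVM object overhead is ignored, and the snippet does not say which model matches FieldCache internals.

```python
DOCS = 100_000_000
UNIQUE_VALUES = 2
VALUE_BYTES = 200
SLOT_BYTES = 4  # assumed size of one per-document slot (e.g. an int ordinal)


def shared_values_bytes(docs, unique, value_bytes, slot_bytes=SLOT_BYTES):
    """One small slot per document plus a single copy of each distinct value."""
    return docs * slot_bytes + unique * value_bytes


def copied_values_bytes(docs, value_bytes):
    """Worst case: every document holds its own copy of the value."""
    return docs * value_bytes


print(shared_values_bytes(DOCS, UNIQUE_VALUES, VALUE_BYTES) / 1e6, "MB")
print(copied_values_bytes(DOCS, VALUE_BYTES) / 1e9, "GB")
```

Under these assumptions the two models differ by a factor of about fifty (roughly 400 MB versus 20 GB), which is why the answer depends so heavily on how values are shared.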

RE: FW: Solr and Garbage Collection

2009-09-25 Thread Fuad Efendi
Mark, what if piece of code needs 10 contiguous Kb to load a document field? How locked memory pieces are optimized/moved (putting on hold almost whole application)? Lowering heap is _bad_ idea; we will have extremely frequent GC (optimize of live objects!!!) even if RAM is (theoretically) enough.

Hierarchical Facet Field Prefix Not Working

2009-09-25 Thread Nasseam Elkarra
Hello all, We are using the patch from SOLR-64 (http://issues.apache.org/jira/browse/SOLR-64) to implement hierarchical facets for categories. We are trying to use the facet.prefix to prevent all categories from coming back. However, f.category.facet.prefix doesn't work. Using facet.prefix

Re: FW: Solr and Garbage Collection

2009-09-25 Thread Jonathan Ariel
I'm not planning on lowering the heap. I just want to lower the time "wasted" on GC, which is 11% right now. So what I'll try is changing the GC to -XX:+UseConcMarkSweepGC On Fri, Sep 25, 2009 at 4:17 PM, Fuad Efendi wrote: > Mark, > > what if piece of code needs 10 contiguous Kb to load a docume

Re: shards and facet_count

2009-09-25 Thread Paul Rosen
Sorry for the long delay in responding, but I've just gotten back to this problem... I got the solr 1.4 nightly and the problem went away, so I guess it is a solr 1.3 bug. Thanks for all the input! Lance Norskog wrote: Paul, can you create an HTTP url that does this exact query? With multip

RE: FW: Solr and Garbage Collection

2009-09-25 Thread Fuad Efendi
But again, GC is not just "Garbage Collection" as many in this thread think... it is also "memory defragmentation" which is much more costly than "collection" just because it needs to move _live_objects_ somewhere (and wait/lock till such objects get unlocked to be moved...) - obviously more memory helps..

Re: FW: Solr and Garbage Collection

2009-09-25 Thread Yonik Seeley
On Fri, Sep 25, 2009 at 2:52 PM, Fuad Efendi wrote: > Lowering heap helps GC? Yes. In general, lowering the heap can help or hurt. Hurt: if one is running very low on memory, GC will be working harder all of the time trying to find more memory and the % of time that GC takes can go up. Help: i

Re: FW: Solr and Garbage Collection

2009-09-25 Thread Jonathan Ariel
Maybe what's missing here is how I got the 11%. I just ran solr with the following JVM params: -XX:+PrintGCApplicationConcurrentTime -XX:+PrintGCApplicationStoppedTime with that I can measure the amount of time the application runs between collection pauses and the length of the collection pauses
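A figure like this can be derived from the log those two flags produce by summing the running and stopped intervals. The sketch below assumes the usual HotSpot line formats ("Application time: N seconds" and "Total time for which application threads were stopped: N seconds"); the function name and the sample lines are made up, and this is not necessarily how the 11% in the post was computed.

```python
import re

RUNNING = re.compile(r"Application time: ([0-9.]+) seconds")
STOPPED = re.compile(
    r"Total time for which application threads were stopped: ([0-9.]+) seconds"
)


def stopped_fraction(lines):
    """Fraction of wall time during which application threads were stopped."""
    running = 0.0
    stopped = 0.0
    for line in lines:
        match = RUNNING.search(line)
        if match:
            running += float(match.group(1))
            continue
        match = STOPPED.search(line)
        if match:
            stopped += float(match.group(1))
    total = running + stopped
    return stopped / total if total else 0.0


# Made-up sample lines in the assumed log format.
sample_log = [
    "Application time: 0.8900000 seconds",
    "Total time for which application threads were stopped: 0.1100000 seconds",
]
print(f"{stopped_fraction(sample_log):.1%}")
```

Note that the stopped time covers every pause of the application threads, so the result is an upper bound on time spent in stop-the-world collection rather than an exact GC figure.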

Re: Can we point a Solr server to index directory dynamically at runtime..

2009-09-25 Thread Silent Surfer
Hi Michael, We are storing all our data in addition to index, as we need to display those values to the user. So unfortunately we cannot go with the option stored=false, which could have potentially solved our issue. Appreciate any other pointers/suggestions Thanks, sS --- On Fri, 9/25/09, Mi

Re: FW: Solr and Garbage Collection

2009-09-25 Thread Mark Miller
When we talk about Collectors, we are not just talking about "collecting" - whatever that means. There isn't really a "collecting" phase - the whole algorithm is garbage collecting - hence calling the different implementations "collectors". Usually, fragmentation is dealt with using a mark-compact

solr home

2009-09-25 Thread Park, Michael
I already have a handful of solr instances running. However, I'm trying to install solr (1.4) on a new linux server with tomcat using a context file (same way I usually do): However it throws an exception due to the following: SEVERE: Could not start SOLR. Check solr/home propert

Re: Solr and Garbage Collection

2009-09-25 Thread Jonathan Ariel
Ok. I'll first change the GC and see if the time spent decreases. Then I'll try increasing the heap as Fuad recommends. On 9/25/09, Mark Miller wrote: > When we talk about Collectors, we are not just talking about > "collecting" - whatever that means. There isn't really a "collecting" > phase - t

Re: FW: Solr and Garbage Collection

2009-09-25 Thread Mark Miller
>> or IBM has used a mark-sweep-compact collector Never mind - Sun's is also sometimes referred to as mark-sweep-compact. I've just seen it referred to as mark-compact before as well. In either case though, without some sort of sweep phase, there is no reclamation of memory :) It's interesting th

Re: Solr and Garbage Collection

2009-09-25 Thread Grant Ingersoll
On Sep 25, 2009, at 9:30 AM, Jonathan Ariel wrote: Hi to all! Lately my solr servers seem to stop responding once in a while. I'm using solr 1.3. Of course I'm having more traffic on the servers. So I logged the Garbage Collection activity to check if it's because of that. It seems like 11

RE: FW: Solr and Garbage Collection

2009-09-25 Thread Fuad Efendi
> Usually, fragmentation is dealt with using a mark-compact collector (or > IBM has used a mark-sweep-compact collector). > Copying collectors are not only super efficient at collecting young > spaces, but they are also great for fragmentation - when you copy > everything to the new space, you can

Re: Solr and Garbage Collection

2009-09-25 Thread Mark Miller
Jonathan Ariel wrote: > How can I check which is the GC that it is being used? If I'm right JVM > Ergonomics should use the Throughput GC, but I'm not 100% sure. Do you have > any recommendation on this? > > Just to straighten out this one too - Ergonomics doesn't use throughput - throughput is

Re: Solr and Garbage Collection

2009-09-25 Thread Mark Miller
Mark Miller wrote: > Jonathan Ariel wrote: > >> How can I check which is the GC that it is being used? If I'm right JVM >> Ergonomics should use the Throughput GC, but I'm not 100% sure. Do you have >> any recommendation on this? >> >> >> > Just to straighten out this one too - Ergonomic

Re: Solr and Garbage Collection

2009-09-25 Thread Mark Miller
That's a good point too - if you can reduce your need for such a large heap, by all means, do so. However, considering you already need at least 10GB or you get OOM, you have a long way to go with that approach. Good luck :) How many docs do you have ? I'm guessing it's mostly FieldCache type stuff

Re: Solr and Garbage Collection

2009-09-25 Thread Mark Miller
One more point and I'll stop - I've hit my email quota for the day ;) While it's a pain to have to juggle GC params and tune - when you require a heap that's more than a gig or two, I personally believe it's essential to do so for good performance. The (default settings / ergonomics with throughput)

Re: Solr and Garbage Collection

2009-09-25 Thread Jonathan Ariel
I have around 8M documents. I set up my server to use a different collector and it seems like it decreased from 11% to 4%, of course I need to wait a bit more because it is just a 1 hour old log. But it seems like it is much better now. I will tell you on Monday the results :) On Fri, Sep 25, 2009

RE: Solr and Garbage Collection

2009-09-25 Thread Fuad Efendi
Sorry for OFF-topic: Create dummy "Hello, World!" JSP, use Tomcat, execute load-stress simulator(s) from separate machine(s), and measure... don't forget to allocate necessary thread pools in Tomcat (if you have to)... Although such JSP doesn't use any memory, you will see how easy one can go with

Re: problem with HTMLStripStandardTokenizerFactory

2009-09-25 Thread Yonik Seeley
Can you give a small test file that demonstrates the problem? -Yonik http://www.lucidimagination.com On Fri, Sep 25, 2009 at 5:34 AM, Kundig, Andreas wrote: > Hello > > I can't bring HTMLStripStandardTokenizerFactory to remove the content of the > style tag, as the documentation says it shoul

Re: Solr http post performance seems slow - help?

2009-09-25 Thread Lance Norskog
Your indexing project is disk-bound. My modern midrange laptop gets 30MB/s doing "cat > /dev/null" (1 7200rpm disk). The Amazon instances I'm playing with get 50-60 (I really want to know how it fits together). Your laptop might be 10-20? On Thu, Sep 24, 2009 at 11:54 PM, Constantijn Visinescu wr

Re: Showcase: Facetted Search for Wine using Solr

2009-09-25 Thread Lance Norskog
Have you seen this? It is another Solr/Typo3 integration project. http://forge.typo3.org/projects/show/extension-solr Would you consider open-sourcing your Solr/Typo3 integration? On Fri, Sep 25, 2009 at 1:18 AM, Marian Steinbach wrote: > Hi Grant! > > Thanks for the advice, I added the link

Problem changing the default MergePolicy/Scheduler

2009-09-25 Thread Jibo John
Hello, It looks like solr is not allowing me to change the default MergePolicy/Scheduler classes. Even if I change the default MergePolicy/Scheduler (LogByteSizeMergePolicy and ConcurrentMergeScheduler) defined in solrconfig.xml to a different one (LogDocMergePolicy and SerialMergeSchedu

Re: Mixed field types and boolean searching

2009-09-25 Thread Lance Norskog
The DisMax parser essentially creates a set of queries against different fields. These queries are analyzed as per each field. I think this is what you are talking about- "The" in a movie title is different from "the" in the movie description. Would you expect "The Sound Of Music" to fetch every mov

Re: Hierarchical Facet Field Prefix Not Working

2009-09-25 Thread Koji Sekiguchi
Hi Nasseam, From briefly looking at the patch, I think the per-field parameter for facet.prefix should work on hierarchical facet fields. And I can get the same facet results by: &facet=on&facet.field=hiefacet&facet.prefix=A/B/ and &facet=on&facet.field=hiefacet&f.hiefacet.facet.prefix=A/B/ when us
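The two request forms compared above can be generated side by side for testing. This is a minimal sketch using Python's standard urllib; the q=*:* clause and the helper name facet_query are assumptions added for illustration, while the field name hiefacet and the prefix A/B/ come from the message.

```python
from urllib.parse import urlencode


def facet_query(field, prefix, per_field=False):
    """Build a query string using the global or the per-field facet.prefix."""
    prefix_key = f"f.{field}.facet.prefix" if per_field else "facet.prefix"
    params = [
        ("q", "*:*"),  # assumed match-all query, not part of the original message
        ("facet", "on"),
        ("facet.field", field),
        (prefix_key, prefix),
    ]
    return urlencode(params)


print(facet_query("hiefacet", "A/B/"))
print(facet_query("hiefacet", "A/B/", per_field=True))
```

Sending both strings to the same select handler and comparing the facet sections of the responses is a quick way to check whether the per-field form is honored in a given build.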