Re: solr performance for documents with hundreds of fields

2008-04-24 Thread Umar Shah
I am just wondering, because having 200 fields seems like too much (to me). I want to know if people actually have schemas like this and how well they perform. On Thu, Apr 24, 2008 at 5:10 PM, Grant Ingersoll <[EMAIL PROTECTED]> wrote: > Are you actually seeing performance problems or jus

Re: GSA <-> Solr

2008-04-24 Thread Otis Gospodnetic
Lukas, From your description, this looks like a Nutch job, not Solr (no crawling component), though one can also use Nutch with Solr now. I can't share the reasons, unfortunately. But from a personal standpoint, I've seen GSA and it's not all that impressive, it costs a pile of money, and

Re: GSA <-> Solr

2008-04-24 Thread Lukas Vlcek
BTW: Do you think you can share the reasons why your clients are switching from GSA? I am very interested in their experience. On Fri, Apr 25, 2008 at 6:29 AM, Lukas Vlcek <[EMAIL PROTECTED]> wrote: > Hi, > > I posted a related question to Nutch-user yesterday. Here is the post: > Crawling > MOSS

Re: GSA <-> Solr

2008-04-24 Thread Lukas Vlcek
Hi, I posted a related question to Nutch-user yesterday. Here is the post: Crawling MOSS 2007 content using Nutch via GSA connector My specific situation is as follows: We are deploying MOSS 2007 whi

Re: Caching of DataImportHandler's Status Page

2008-04-24 Thread Noble Paul നോബിള്‍ नोब्ळ्
It is caused by the new caching feature in Solr. The caching is done at the browser level. Solr just sends the appropriate headers. We had raised an issue to disable that. BTW, the command is not exactly http://localhost:8983/solr/dataimport?command=status . http://localhost:8983/solr/dataimport itse

Re: Caching of DataImportHandler's Status Page

2008-04-24 Thread Walter Underwood
Status pages should be sent with Pragma: no-cache. That is a bug. wunder On 4/24/08 6:29 PM, "Erik Hatcher" <[EMAIL PROTECTED]> wrote: > The issue is the HTTP caching feature of Solr, for better or worse in > this case. It confuses me often when I hit this myself. Try hitting > that URL with c

Re: GSA <-> Solr

2008-04-24 Thread Otis Gospodnetic
Ask me in about a month. I will likely be converting one *very* large and well-known organization from the expensive GSA to Solr if that's what you are asking about. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Jon Baer <[EMAIL PROT

Re: GSA <-> Solr

2008-04-24 Thread Erik Hatcher
On Apr 24, 2008, at 8:03 PM, Jon Baer wrote: Going to try to persuade my employer to switch away some functions, maybe all from the GSA black box to Solr and was trying to find some (any?) case studies where this was done ... Also what is the similar function to a "KeyMatch" in Solr? Is it

Re: Caching of DataImportHandler's Status Page

2008-04-24 Thread Erik Hatcher
The issue is the HTTP caching feature of Solr, for better or worse in this case. It confuses me often when I hit this myself. Try hitting that URL with curl and you'll see it change since no caching is involved client-side. For sanity's sake you can turn off HTTP caching in solrconfig.xml

GSA <-> Solr

2008-04-24 Thread Jon Baer
Hi, Going to try to persuade my employer to switch away some functions, maybe all from the GSA black box to Solr and was trying to find some (any?) case studies where this was done ... Also what is the similar function to a "KeyMatch" in Solr? Is it elevate.xml? BTW, have been testing

Re: Caching of DataImportHandler's Status Page

2008-04-24 Thread Chris Harris
No luck with control-R, or with F5. I'm on Windows here if you think that's a potential problem. For now I've found a silly workaround: If http://localhost:8983/solr/dataimport?command=status doesn't work, then you can replace "command=status" with almost anything at all and then you'll be a

Re: Caching of DataImportHandler's Status Page

2008-04-24 Thread Otis Gospodnetic
Chris - what happens if you hit ctrl-R (or command-R on OSX)? That should bypass the browser cache. Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Chris Harris <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Thursday, April 2

Re: Multi language, one "body" field, multi stopwords ?

2008-04-24 Thread Otis Gospodnetic
Multilingual indexing and searching is tricky, but I think all approaches require that you know the language of a query one way or the other. Without that, you won't know how to correctly analyze the query. You didn't mention that, so I'm bringing it up. To answer your question, comment for copy

Re: MultiThreaded Document Loader?

2008-04-24 Thread Otis Gospodnetic
Nothing publicly available. Want to contribute one? :) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: oleg_gnatovskiy <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Thursday, April 24, 2008 5:57:25 PM > Subject: MultiThreaded

Caching of DataImportHandler's Status Page

2008-04-24 Thread Chris Harris
I'm playing with the DataImportHandler, which so far seems pretty cool. (I've applied the latest patch from JIRA to a fresh download of trunk revision 651344. I'm using the basic Jetty setup in the example directory.) The thing that's bugging me is that while the handler's status page (http://local

MultiThreaded Document Loader?

2008-04-24 Thread oleg_gnatovskiy
Hello. I was wondering if Solr has some kind of a multi-threaded document loader? I've been using post.sh (curl) to post documents to my Solr server, and it's pretty slow. I know it should be pretty easy to write one up, but I was just wondering if one already existed. -- View this message in con
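A minimal sketch of what such a loader might look like, not an existing Solr tool. It assumes the documents are already Solr update XML files on disk and that the update handler lives at http://localhost:8983/solr/update (an assumption; adjust to your setup). The names post_xml and load_all are hypothetical.

```python
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import Callable, Iterable, List
import urllib.request

# Assumed location of the update handler; not taken from the original post.
UPDATE_URL = "http://localhost:8983/solr/update"


def post_xml(path: Path, url: str = UPDATE_URL) -> int:
    """POST one XML file to the update URL and return the HTTP status code."""
    request = urllib.request.Request(
        url,
        data=path.read_bytes(),  # supplying a body makes this a POST
        headers={"Content-Type": "text/xml; charset=utf-8"},
    )
    with urllib.request.urlopen(request) as response:
        return response.status


def load_all(
    paths: Iterable[Path],
    post: Callable[[Path], int] = post_xml,
    workers: int = 4,
) -> List[int]:
    """Post every file using a pool of worker threads; results keep input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(post, paths))
```

Usage would be along the lines of load_all(sorted(Path("docs").glob("*.xml")), workers=8), followed by a separate commit request. The post callable is a parameter so the threading logic can be exercised without a running server.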

Re: Solr with Auto-suggest

2008-04-24 Thread Ryan McKinley
On Apr 24, 2008, at 12:25 PM, Rantjil Bould wrote: Hi Group, I was asked in my project to implement Google Suggest-style functionality for searching a help system. I have seen one thread http://www.mail-archive.com/solr-user@lucene.apache.org/msg06739.html which deals with the wa

Lucene Modules - LucQE [lucky] Lucene Query Expansion Module

2008-04-24 Thread Lance Norskog
http://lucene-qe.sourceforge.net/ This is a much smarter technique for doing query expansion with synonyms, using "Rocchio's Algorithm". Has anyone tried to shoehorn this into Solr? It's a little weird: it needs an analyser, a searcher, and a similarity function. It should be possible to refactor

Re: Distributed Search Caching

2008-04-24 Thread Yonik Seeley
On Thu, Apr 24, 2008 at 3:16 PM, swarag <[EMAIL PROTECTED]> wrote: > > hey, > I have a distributed search environment with one server hitting 3 shards. > for Example: > > http://server1.cs.tmcs:15100/solr/search/?q=starbucks&shards=server1.cs.tmcs:8983/solr,server2.cs.tmcs:8983/solr,server3.cs

Distributed Search Caching

2008-04-24 Thread swarag
Hey, I have a distributed search environment with one server hitting 3 shards. For example: http://server1.cs.tmcs:15100/solr/search/?q=starbucks&shards=server1.cs.tmcs:8983/solr,server2.cs.tmcs:8983/solr,server3.cs.tmcs:8983/solr&collapse.field=locChainId So, where is the cache stored? Is it dist

Re: More Like This boost

2008-04-24 Thread Jonathan Ariel
Ok. Here it is. https://issues.apache.org/jira/browse/LUCENE-1272 On Tue, Apr 22, 2008 at 2:24 PM, Francisco Sanmartin <[EMAIL PROTECTED]> wrote: > Yep, it would be nice for MLT to have this feature, that's why I am trying > to do it from the querys before sending the query to Solr. These are

Re: Updating in Solr.SOLR-139

2008-04-24 Thread Koji Sekiguchi
I've just tried this again in my environment, but I couldn't reproduce what you pointed out. My schema is: : required="true" /> multiValued="true"/> : id Koji nutchvf wrote: Hi!! Thank you very much,Koji!! Your response has helped me a lot and I have already managed to update the

Re: SOLR-470 & default value in schema with NOW (update)

2008-04-24 Thread Brian Johnson
Ok, I thought the quickest thing to try out would be (B) so now all of my feeds have the same format and I have removed the default value "NOW" from my schema.xml file. ... and ... I rebuilt my index with consistent date formats in all my files but my exception remains unchanged. Ap

Solr with Auto-suggest

2008-04-24 Thread Rantjil Bould
Hi Group, I was asked in my project to implement Google Suggest-style functionality for searching a help system. I have seen one thread http://www.mail-archive.com/solr-user@lucene.apache.org/msg06739.html which deals with how to index when the index is large. But I am not able to get much i

Re: Updating in Solr.SOLR-139

2008-04-24 Thread Yonik Seeley
Apologies, I read too fast and didn't see that the original user was in fact using the ModifiableDocument patch (that's what I was referring to by "update patch"). On Thu, Apr 24, 2008 at 12:11 PM, Koji Sekiguchi <[EMAIL PROTECTED]> wrote: > Yonik, > > I'm afraid but I don't understand what you m

Re: Updating in Solr.SOLR-139

2008-04-24 Thread Koji Sekiguchi
Yonik, I'm afraid I don't understand what you mean by "update patch". I did this last week with Eriks-ModifiableDocument.patch in SOLR-139 and got it working... Koji Yonik Seeley wrote: Koji: perhaps you are working with the "update" patch? I'm pretty sure these things won't work with sto

Re: MoreLikeThis patch to support boost factor

2008-04-24 Thread Jonathan Ariel
Ok. Posted. You'll find a patch with unit test. https://issues.apache.org/jira/browse/LUCENE-1272 Thanks! On Wed, Apr 23, 2008 at 10:25 PM, Jonathan Ariel <[EMAIL PROTECTED]> wrote: > Yes. Sure. I'll do that. Just wanted some feedback before posting it. As > soon as I do it I'll post the issue

Re: Got parseException when search keyword AND on a text field

2008-04-24 Thread Walter Underwood
DisMax preserves a fair amount of syntax. It isn't a pure text query. We have a small client library (written before solrj) that escapes all the stuff that Solr doesn't. If you are already lowercasing queries, then you can fix AND, OR, and NOT by replacing them with their lowercase equivalents. w
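The lowercasing trick described above could be sketched like this. It is only an illustration, not the client library mentioned in the message; the function name neutralize_operators is hypothetical, and it handles only the three operator words, not the other query syntax that may still need escaping.

```python
import re

# Match AND, OR, NOT only as whole uppercase words, so "ANDROID" is untouched.
_OPERATORS = re.compile(r"\b(AND|OR|NOT)\b")


def neutralize_operators(query: str) -> str:
    """Lowercase bare AND/OR/NOT so they are treated as ordinary search words."""
    return _OPERATORS.sub(lambda match: match.group(1).lower(), query)
```

This only makes sense when the index itself is lowercased at analysis time, as the message assumes; otherwise the rewritten words would no longer match the original terms.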

Re: Got parseException when search keyword AND on a text field

2008-04-24 Thread Ryan McKinley
check the dismax handler -- it expects words to search for, not a query syntax On Apr 24, 2008, at 9:53 AM, Geoffrey Young wrote: Otis Gospodnetic wrote: Not in one place and documented. The place to look are query parsers, but things like AND OR NOT TO are the ones to look out for.

Re: Got parseException when search keyword AND on a text field

2008-04-24 Thread Geoffrey Young
Otis Gospodnetic wrote: Not in one place and documented. The place to look are query parsers, but things like AND OR NOT TO are the ones to look out for. this seems like something solr ought to handle gracefully on the backend for me - if I need to write logic to make sure a malicious quer

Re: Updating in Solr.SOLR-139

2008-04-24 Thread Yonik Seeley
Koji: perhaps you are working with the "update" patch? I'm pretty sure these things won't work with stock solr, right? -Yonik On Fri, Apr 18, 2008 at 10:30 AM, Koji Sekiguchi <[EMAIL PROTECTED]> wrote: > You don't need any additional attributes in schema.xml, but the field > should be stored. >

RE: logging through log4j

2008-04-24 Thread Will Johnson
I did the following. It's not perfect but it does let my other logging system fully configure what gets sent over and what doesn't. I think the better approach is to implement the j.u.l.Logger interface with a custom log manager but that required more work at the time which didn't seem to be worth

Re: solr performance for documents with hundreds of fields

2008-04-24 Thread Grant Ingersoll
Are you actually seeing performance problems or just wondering if there will be a performance problem? -Grant On Apr 24, 2008, at 7:08 AM, Umar Shah wrote: Hi, I wanted to know what would be the performance of SOLR for the following scenario: the documents contain say 200 fields with sa

solr performance for documents with hundreds of fields

2008-04-24 Thread Umar Shah
Hi, I wanted to know what the performance of Solr would be for the following scenario: the documents contain, say, 200 fields, with about 100 of the fields containing numbers and the rest containing short strings 40-50 characters long. The sparseness of the data can be assumed to be approximately

Re: MoreLikeThis handler and field collapsing.

2008-04-24 Thread kordi
Could you please describe your solution for this topic? Thanks -- View this message in context: http://www.nabble.com/MoreLikeThis-handler-and-field-collapsing.-tp11866297p16849939.html Sent from the Solr - User mailing list archive at Nabble.com.

Facet display fields

2008-04-24 Thread Nikhil Chhaochharia
Hi, I have an index with a field called 'category'. The data is not clean, so documents may have values such as 'ABC DEF', 'abc-def', 'ABC-def' etc. Now, I want to facet by this field and want all such values to be clubbed together. I created a field called 'category_facet' where I strip ever
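The message is cut off before it says exactly what gets stripped, so the following is only a guess at the kind of normalization meant: lowercase the value and drop everything that is not a letter or digit, then keep the first spelling seen as the label to display. The names facet_key and group_labels are hypothetical.

```python
import re
from typing import Dict, Iterable

# Anything that is not a lowercase letter or digit (applied after lowercasing).
_NON_ALNUM = re.compile(r"[^0-9a-z]+")


def facet_key(value: str) -> str:
    """Collapse spelling variants such as 'ABC DEF' and 'abc-def' to one key."""
    return _NON_ALNUM.sub("", value.lower())


def group_labels(values: Iterable[str]) -> Dict[str, str]:
    """Map each normalized key to the first raw value seen, for display."""
    labels: Dict[str, str] = {}
    for value in values:
        labels.setdefault(facet_key(value), value)
    return labels
```

With this, faceting would run on the normalized key while the stored label is what the user sees.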

Re: Updating in Solr.SOLR-139

2008-04-24 Thread nutchvf
Hi!! Thank you very much, Koji!! Your response has helped me a lot and I have already managed to update the document. Now, I have another problem: Sending the update request to Solr: For example: http://localhost:8389/solr/update?mode=tags:overwrite&commit=true AAA German After that step,

RE: logging through log4j

2008-04-24 Thread Henrib
Will, I'd definitely be interested in your code but mostly in the config & deployment options if you can share. You did not happen to deploy on Websphere 6 by any chance? I can't find a way to configure jul to only log into our application logs (even less so in our log4j logs); I'm not even sure