: Date: Thu, 27 Sep 2007 00:12:48 -0400
: From: Ryan McKinley <[EMAIL PROTECTED]>
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: Re: searching for non-empty fields
:
Your query will work if you make sure the URL field is omitted from the
document at index time when the field is blank.
Adding something like:
to the schema field definition should do it, without needing to ensure it
is not null or "" on the client side.
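If you do handle it client-side instead, a minimal sketch (plain Python; the field names and the idea of hand-building Solr's update XML are illustrative assumptions, not code from this thread) of skipping blank values so the field is simply absent from the indexed document:

```python
from xml.sax.saxutils import escape

def doc_to_update_xml(doc):
    """Build a Solr <add><doc> update message, omitting blank fields.

    Fields that are None or empty strings are left out entirely, so an
    existence query like URL:[* TO *] will not match documents that
    have no URL.
    """
    fields = []
    for name, value in doc.items():
        if value is None or str(value).strip() == "":
            continue  # omit the field instead of indexing ""
        fields.append('<field name="%s">%s</field>' % (name, escape(str(value))))
    return "<add><doc>%s</doc></add>" % "".join(fields)

xml = doc_to_update_xml({"id": "1", "URL": "", "title": "a page"})
```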
ryan
I can't download it from http://jetty.mortbay.org/jetty5/plus/index.html
--
regards
jl
Might want to remove the *'s around that url
http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene.htm
There's actually a downloadable demo:
http://www.nsshutdown.com/solr-example_s1.3_ls0.2.tgz
Start it up as you would a normal Solr example:
$ cd solr-example/apache-solr*/example
$
I've experienced a similar problem before, assuming the field type is
"string" (i.e. not tokenized), there is subtle yet important difference
between a field that is null (i.e. not contained in the document) and one
that is an empty string (in the document but with no value). See
http://www.nabble.
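The distinction can be mimicked outside Solr. A toy sketch (plain Python, not Solr code; documents modeled as dicts is my own simplification) of why an existence query like `URL:[* TO *]` matches an empty string but not a missing field:

```python
# Each "document" is a dict of indexed fields.
docs = [
    {"id": "1", "URL": "http://example.com"},
    {"id": "2", "URL": ""},    # field present, but empty value
    {"id": "3"},               # field absent (null at index time)
]

def field_exists(doc, name):
    """Rough analogue of q=URL:[* TO *]: matches any doc that has the
    field at all, including one indexed with an empty string."""
    return name in doc

matches = [d["id"] for d in docs if field_exists(d, "URL")]
# doc 2 still matches, which is why omitting the field at index time
# is the reliable fix.
```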
Have you guys seen Local Lucene ?
http://www.nsshutdown.com/projects/lucene/whitepaper/*locallucene*.htm
No need for MySQL if you don't want to.
rgrds
Ian
Will Johnson wrote:
With the new/improved value source functions it should be pretty easy to
develop a new best practice. You should be a
: is there an analyzer which automatically converts all German special
: characters to their specific dissected form, such as ü to ue and ä to
: ae, etc.?!
See also the ISOLatin1AccentFilter, which folds accented characters
regardless of language.
: I would also like the search to always be run against
: Faceted search is an approach to search where a taxonomy or categorization
: scheme is visible in addition to document matches.
My ApacheConUS2006 talk went into a little more detail, including the best
definition of faceted searching/browsing I've ever seen...
http://people.apache.org/~hossma
Try the SnowballPorterFilterFactory described here:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
You should use the German2 variant that converts ä and ae to a, ö and oe
to o and so on. More details:
http://snowball.tartarus.org/algorithms/german2/stemmer.html
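A hedged sketch of the corresponding schema.xml fragment (the field type name `text_de` and the tokenizer choice are made up for illustration; the filter class and its `language` attribute are the ones documented on the wiki page above):

```xml
<fieldType name="text_de" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- German2 folds ä/ae, ö/oe, ü/ue together before stemming -->
    <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
  </analyzer>
</fieldType>
```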
Every document in
Faceted search is an approach to search where a taxonomy or categorization
scheme is visible in addition to document matches.
http://www.searchtools.com/info/faceted-metadata.html
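As a toy illustration of the idea (plain Python, hypothetical `category` field, nothing Solr-specific): facet counts are just per-value document counts computed alongside the list of matches.

```python
from collections import Counter

# Documents matching some query, each with a category assigned
# from the taxonomy.
docs = [
    {"id": 1, "category": "books"},
    {"id": 2, "category": "books"},
    {"id": 3, "category": "music"},
]

# In Solr this would be roughly q=...&facet=true&facet.field=category;
# here we just count field values over the matching docs.
facets = Counter(d["category"] for d in docs)
```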
--Ezra.
On 9/26/07 3:47 PM, "Teruhiko Kurosaka" <[EMAIL PROTECTED]> wrote:
> Could someone tell me what facet is?
Dear list,
I have two questions regarding German special characters or umlaute.
is there an analyzer which automatically converts all German special
characters to their specific dissected form, such as ü to ue and ä to
ae, etc.?!
I would also like the search to always be run against t
Could someone tell me what facet is?
I have a vague idea but I am not too clear.
A pointer to a sample web site that uses Solr facet
would be very good.
Thanks.
-Kuro
With the new/improved value source functions it should be pretty easy to
develop a new best practice. You should be able to pull in the lat/lon
values from valuesource fields and then do your greater circle calculation.
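The great-circle step itself can be sketched as follows (the haversine form; the value-source wiring is left out, and the mean Earth radius of 6371 km is an assumption of this sketch):

```python
from math import radians, sin, cos, asin, sqrt

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two lat/lon points in km."""
    r = 6371.0  # mean Earth radius in km (assumed)
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * r * asin(sqrt(a))
```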
- will
-Original Message-
From: Lance Norskog [mailto:[EMAIL PROTECT
On 26-Sep-07, at 5:14 AM, Sandeep Shetty wrote:
Hi Guys,
this question has been asked before, but I was unable to find an answer
that's good for me, so I hope you guys can help again.
I am working on a website where we need to sort the results by
distance
from the location entered by the user. I ha
On 26-Sep-07, at 10:50 AM, Law, John wrote:
Thanks all! One last question...
If I had a collection of 2.5 billion docs and a demand averaging 200
queries per second, what's the confidence that Solr/Lucene could
handle
this volume and execute search with sub-second response times?
No search
My limited experience with larger indexes is:
1) the logistics of copying around and backing up this much data, and
2) indexing is disk-bound. We're on SAS disks and it makes no difference
between one indexing thread and a dozen (we have small records).
Smaller returns are faster. You need to li
It is a "best practice" to store the master copy of this data in a
relational database and use Solr/Lucene as a high-speed cache.
MySQL has a geographical database option, so maybe that is a better option
than Lucene indexing.
Lance
(P.s. please start new threads for new topics.)
-Original M
On 9/26/07, Urvashi Gadi <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am trying to create my own application using Solr, and while trying to
> index my data I get
>
> Server returned HTTP response code: 400 for URL:
> http://localhost:8983/solr/update or
> Server returned HTTP response code: 500 for URL:
No one can answer that, because it depends on how you configure Solr.
How many fields do you want to search? Are you using fuzzy search?
Facets? Highlighting?
We are searching a much smaller collection, about 250K docs, with
great success. We see 80 queries/sec on each of four servers, and
respons
Thanks all! One last question...
If I had a collection of 2.5 billion docs and a demand averaging 200
queries per second, what's the confidence that Solr/Lucene could handle
this volume and execute search with sub-second response times?
-Original Message-
From: Charlie Jackson [mailto:[E
Sorry, I meant that it maxed out in the sense that my maxDoc field on
the stats page was 8.8 million, which indicates that the most docs it
has ever had was around 8.8 million. It's down to about 7.8 million
currently. I have seen no signs of a "maximum" number of docs Solr can
handle.
-Orig
Hi,
I am trying to create my own application using Solr, and while trying to
index my data I get
Server returned HTTP response code: 400 for URL:
http://localhost:8983/solr/update or
Server returned HTTP response code: 500 for URL:
http://localhost:8983/solr/update
Is there a way to get more debu
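Two things usually help here: raising the log level for Solr in the servlet container to see the server-side stack trace, and checking the update XML locally before posting, since a 400 is often just malformed XML. A sketch of the latter (standard-library Python only; the helper name is mine):

```python
import xml.etree.ElementTree as ET

def check_update_xml(xml_body):
    """Return None if xml_body parses as XML, else the parse error.

    A 400 from /solr/update is frequently caused by malformed update
    XML or a stray control character, so validating locally narrows
    things down before blaming the server.
    """
    try:
        ET.fromstring(xml_body)
        return None
    except ET.ParseError as e:
        return str(e)
```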
My experience so far:
200k documents were indexed in 90 minutes (including DB time); the index
size is 200 MB. A keyword query across all 30 string fields takes 0.3-1 sec;
a keyword query on one field takes tens of milliseconds.
-Original Message-
From: Charlie Jackson [mailto:[EMAIL PROTEC
By "maxed out" do you mean that Solr's performance became unacceptable
beyond 8.8M records, or that you only had 8.8M records to index? If
the former, can you share the particular symptoms?
On 9/26/07, Charlie Jackson <[EMAIL PROTECTED]> wrote:
> My experiences so far with this level of data have
I have a large index with a field for a URL. For some reason or
another, sometimes a doc will get indexed with that field blank. This
is fine but I want a query to return only the set URL fields...
If I do a query like:
q=URL:[* TO *]
I get a lot of empty fields back, like:
http://thing.
My experiences so far with this level of data have been good.
Number of records: Maxed out at 8.8 million
Database size: friggin huge (100+ GB)
Index size: ~24 GB
1) It took me about a day to index 8 million docs using a non-optimized
program I wrote. It's non-optimized in the sense that it's not
Oops. I forgot to set the Solr home in the 2 context files:
/opt/tomcat/conf/Catalina/localhost/solr.xml
/opt/tomcat/conf/Catalina/localhost/solr2.xml
Phil
philguillard wrote:
Hi,
I'm new to Solr; sorry if I missed the answer in the docs somewhere...
I need 2 different Solr indexes.
Should I create 2 we
Hi,
I'm new to Solr; sorry if I missed the answer in the docs somewhere...
I need 2 different Solr indexes.
Should I create 2 webapps? In that case I have Tomcat contexts solr and
solr2, but then I can't start solr2; I get this error:
Sep 26, 2007 6:07:25 PM org.apache.catalina.core.StandardContex
That seems well within Solr's capabilities, though you should come up
with a desired queries/sec figure.
Solr's query rate varies widely with the configuration -- how many
fields, fuzzy search, highlighting, facets, etc.
Essentially, Solr uses Lucene, a modern search core. It has performance
and
I am new to the list and new to lucene and solr. I am considering Lucene
for a potential new application and need to know how well it scales.
Following are the parameters of the dataset.
Number of records: 7+ million
Database size: 13.3 GB
Index Size: 10.9 GB
My questions are simply:
1) Appr
On Sep 26, 2007, at 4:04 AM, Doğacan Güney wrote:
NUTCH-442 is one of the issues that I want to really see resolved.
Unfortunately, I haven't received many (as in, none) comments, so I
haven't made further progress on it.
I am probably your target customer but to be honest all we care about
Arisem is a French ISV delivering best-of-breed text analytics software. We
have been using Lucene in our products since 2001 and are in search of a Lucene
expert to complement our R&D team.
Required skills:
- Master's degree in computer science
- 2+ years of experience in working with Lucene
- Stro
Hello,
For the project I'm working on now, it is important to group the results
of a query by a "product" field. Documents
belong to only one product, and there will never be more than 10
different products altogether.
When searching through the archives I identified 3 options:
1) [[Client-si
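The client-side option can be sketched quickly (plain Python; field names, the over-fetching idea, and the per-group cap are illustrative assumptions): fetch more rows than you need, then bucket the score-sorted results by product.

```python
from collections import OrderedDict

def group_by_product(results, per_group=3):
    """Group search results by their 'product' field, preserving the
    original (score-sorted) order and capping each group's size."""
    groups = OrderedDict()
    for doc in results:
        bucket = groups.setdefault(doc["product"], [])
        if len(bucket) < per_group:
            bucket.append(doc)
    return groups

hits = [
    {"id": 1, "product": "A"},
    {"id": 2, "product": "B"},
    {"id": 3, "product": "A"},
]
grouped = group_by_product(hits, per_group=2)
```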
> Hi Guys,
>
> this question has been asked before, but I was unable to find an answer
> that's good for me, so I hope you guys can help again
> I am working on a website where we need to sort the results by distance
> from the location entered by the user. I have indexed the lat and long
> info for ea
On 9/26/07, Brian Whitman <[EMAIL PROTECTED]> wrote:
>
> > Sami has a patch in there which used an older version of the solr
> > client. with the current solr client in the SVN tree, his patch
> > becomes much easier.
> > your job would be to upgrade the patch and mail it back to him so
> > he can u