: Date: Thu, 27 Sep 2007 00:12:48 -0400
: From: Ryan McKinley <[EMAIL PROTECTED]>
: Reply-To: solr-user@lucene.apache.org
: To: solr-user@lucene.apache.org
: Subject: Re: searching for non-empty fields
:
Your query will work if you make sure the URL field is omitted from the
document at index time when the field is blank.
Adding something like:
to the schema field definition should do it, without needing to ensure it
is not null or "" on the client side.
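If you do handle it client-side instead, a minimal sketch (plain Python; the field names and the idea of hand-building Solr's update XML are illustrative assumptions, not code from this thread) of skipping blank values so the field is simply absent from the indexed document:

```python
from xml.sax.saxutils import escape

def doc_to_update_xml(doc):
    """Build a Solr <add><doc> update message, omitting blank fields.

    Fields that are None or empty strings are left out entirely, so an
    existence query like URL:[* TO *] will not match documents that
    have no URL.
    """
    fields = []
    for name, value in doc.items():
        if value is None or str(value).strip() == "":
            continue  # omit the field instead of indexing ""
        fields.append('<field name="%s">%s</field>' % (name, escape(str(value))))
    return "<add><doc>%s</doc></add>" % "".join(fields)

xml = doc_to_update_xml({"id": "1", "URL": "", "title": "a page"})
```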
ryan
I can't download it from http://jetty.mortbay.org/jetty5/plus/index.html
--
regards
jl
Might want to remove the *'s around that url
http://www.nsshutdown.com/projects/lucene/whitepaper/locallucene.htm
There's actually a downloadable demo:
http://www.nsshutdown.com/solr-example_s1.3_ls0.2.tgz
Start it up as you would a normal Solr example:
$ cd solr-example/apache-solr*/example
$
I've experienced a similar problem before, assuming the field type is
"string" (i.e. not tokenized), there is subtle yet important difference
between a field that is null (i.e. not contained in the document) and one
that is an empty string (in the document but with no value). See
http://www.nabble.
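The distinction can be mimicked outside Solr. A toy sketch (plain Python, not Solr code; documents modeled as dicts is my own simplification) of why an existence query like `URL:[* TO *]` matches an empty string but not a missing field:

```python
# Each "document" is a dict of indexed fields.
docs = [
    {"id": "1", "URL": "http://example.com"},
    {"id": "2", "URL": ""},    # field present, but empty value
    {"id": "3"},               # field absent (null at index time)
]

def field_exists(doc, name):
    """Rough analogue of q=URL:[* TO *]: matches any doc that has the
    field at all, including one indexed with an empty string."""
    return name in doc

matches = [d["id"] for d in docs if field_exists(d, "URL")]
# doc 2 still matches, which is why omitting the field at index time
# is the reliable fix.
```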
Have you guys seen Local Lucene ?
http://www.nsshutdown.com/projects/lucene/whitepaper/*locallucene*.htm
No need for MySQL if you don't want to.
rgrds
Ian
Will Johnson wrote:
With the new/improved value source functions it should be pretty easy to
develop a new best practice. You should be a
: is there an analyzer which automatically converts all German special
: characters to their specific dissected form, such as ü to ue and ä to
: ae, etc.?!
See also the ISOLatin1AccentFilter, which folds accented characters
regardless of language.
: I would also like the search to always be run against
: Faceted search is an approach to search where a taxonomy or categorization
: scheme is visible in addition to document matches.
My ApacheConUS2006 talk went into a little more detail, including the best
definition of faceted searching/browsing I've ever seen...
http://people.apache.org/~hossma
Try the SnowballPorterFilterFactory described here:
http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters
You should use the German2 variant that converts ä and ae to a, ö and oe
to o and so on. More details:
http://snowball.tartarus.org/algorithms/german2/stemmer.html
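A hedged sketch of the corresponding schema.xml fragment (the field type name `text_de` and the tokenizer choice are made up for illustration; the filter class and its `language` attribute are the ones documented on the wiki page above):

```xml
<fieldType name="text_de" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- German2 folds ä/ae, ö/oe, ü/ue together before stemming -->
    <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
  </analyzer>
</fieldType>
```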
Every document in
Faceted search is an approach to search where a taxonomy or categorization
scheme is visible in addition to document matches.
http://www.searchtools.com/info/faceted-metadata.html
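As a toy illustration of the idea (plain Python, hypothetical `category` field, nothing Solr-specific): facet counts are just per-value document counts computed alongside the list of matches.

```python
from collections import Counter

# Documents matching some query, each with a category assigned
# from the taxonomy.
docs = [
    {"id": 1, "category": "books"},
    {"id": 2, "category": "books"},
    {"id": 3, "category": "music"},
]

# In Solr this would be roughly q=...&facet=true&facet.field=category;
# here we just count field values over the matching docs.
facets = Counter(d["category"] for d in docs)
```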
--Ezra.
On 9/26/07 3:47 PM, "Teruhiko Kurosaka" <[EMAIL PROTECTED]> wrote:
> Could someone tell me what facet is?
Dear list,
I have two questions regarding German special characters or umlaute.
is there an analyzer which automatically converts all German special
characters to their specific dissected form, such as ü to ue and ä to
ae, etc.?!
I would also like the search to always be run against t
Could someone tell me what facet is?
I have a vague idea but I am not too clear.
A pointer to a sample web site that uses Solr facet
would be very good.
Thanks.
-Kuro
With the new/improved value source functions it should be pretty easy to
develop a new best practice. You should be able to pull in the lat/lon
values from valuesource fields and then do your greater circle calculation.
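The great-circle step itself can be sketched as follows (the haversine form; the value-source wiring is left out, and the mean Earth radius of 6371 km is an assumption of this sketch):

```python
from math import radians, sin, cos, asin, sqrt

def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two lat/lon points in km."""
    r = 6371.0  # mean Earth radius in km (assumed)
    dlat = radians(lat2 - lat1)
    dlon = radians(lon2 - lon1)
    a = (sin(dlat / 2) ** 2
         + cos(radians(lat1)) * cos(radians(lat2)) * sin(dlon / 2) ** 2)
    return 2 * r * asin(sqrt(a))
```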
- will
-Original Message-
From: Lance Norskog [mailto:[EMAIL PROTECT
On 26-Sep-07, at 5:14 AM, Sandeep Shetty wrote:
Hi Guys,
this question has been asked before, but I was unable to find an answer
that's good for me, so I hope you guys can help again.
I am working on a website where we need to sort the results by
distance
from the location entered by the user. I ha
On 26-Sep-07, at 10:50 AM, Law, John wrote:
Thanks all! One last question...
If I had a collection of 2.5 billion docs and a demand averaging 200
queries per second, what's the confidence that Solr/Lucene could
handle
this volume and execute search with sub-second response times?
No search
My limited experience with larger indexes is:
1) the logistics of copying around and backing up this much data, and
2) indexing is disk-bound. We're on SAS disks and it makes no difference
between one indexing thread and a dozen (we have small records).
Smaller returns are faster. You need to li
It is a "best practice" to store the master copy of this data in a
relational database and use Solr/Lucene as a high-speed cache.
MySQL has a geographical database option, so maybe that is a better option
than Lucene indexing.
Lance
(P.s. please start new threads for new topics.)
-Original M
On 9/26/07, Urvashi Gadi <[EMAIL PROTECTED]> wrote:
> Hi,
>
> I am trying to create my own application using Solr, and while trying to
> index my data I get
>
> Server returned HTTP response code: 400 for URL:
> http://localhost:8983/solr/update or
> Server returned HTTP response code: 500 for URL:
No one can answer that, because it depends on how you configure Solr.
How many fields do you want to search? Are you using fuzzy search?
Facets? Highlighting?
We are searching a much smaller collection, about 250K docs, with
great success. We see 80 queries/sec on each of four servers, and
respons
Thanks all! One last question...
If I had a collection of 2.5 billion docs and a demand averaging 200
queries per second, what's the confidence that Solr/Lucene could handle
this volume and execute search with sub-second response times?
-Original Message-
From: Charlie Jackson [mailto:[E
Sorry, I meant that it maxed out in the sense that my maxDoc field on
the stats page was 8.8 million, which indicates that the most docs it
has ever had was around 8.8 million. It's down to about 7.8 million
currently. I have seen no signs of a "maximum" number of docs Solr can
handle.
-Orig
Hi,
I am trying to create my own application using Solr, and while trying to
index my data I get
Server returned HTTP response code: 400 for URL:
http://localhost:8983/solr/update or
Server returned HTTP response code: 500 for URL:
http://localhost:8983/solr/update
Is there a way to get more debu
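Two things usually help here: raising the log level for Solr in the servlet container to see the server-side stack trace, and checking the update XML locally before posting, since a 400 is often just malformed XML. A sketch of the latter (standard-library Python only; the helper name is mine):

```python
import xml.etree.ElementTree as ET

def check_update_xml(xml_body):
    """Return None if xml_body parses as XML, else the parse error.

    A 400 from /solr/update is frequently caused by malformed update
    XML or a stray control character, so validating locally narrows
    things down before blaming the server.
    """
    try:
        ET.fromstring(xml_body)
        return None
    except ET.ParseError as e:
        return str(e)
```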
My experience so far:
200k documents were indexed in 90 minutes (including DB time); the index
size is 200 MB. A keyword query across all 30 string fields takes 0.3-1 sec;
a keyword query on one field takes tens of milliseconds.
-Original Message-
From: Charlie Jackson [mailto:[EMAIL PROTEC
By "maxed out" do you mean that Solr's performance became unacceptable
beyond 8.8M records, or that you only had 8.8M records to index? If
the former, can you share the particular symptoms?
On 9/26/07, Charlie Jackson <[EMAIL PROTECTED]> wrote:
> My experiences so far with this level of data have
I have a large index with a field for a URL. For some reason or
another, sometimes a doc will get indexed with that field blank. This
is fine but I want a query to return only the set URL fields...
If I do a query like:
q=URL:[* TO *]
I get a lot of empty fields back, like:
http://thing.
My experiences so far with this level of data have been good.
Number of records: Maxed out at 8.8 million
Database size: friggin huge (100+ GB)
Index size: ~24 GB
1) It took me about a day to index 8 million docs using a non-optimized
program I wrote. It's non-optimized in the sense that it's not
Oops. I forgot to set the Solr home in the 2 context files:
/opt/tomcat/conf/Catalina/localhost/solr.xml
/opt/tomcat/conf/Catalina/localhost/solr2.xml
Phil
philguillard wrote:
Hi,
I'm new to Solr; sorry if I missed the answer in the docs somewhere...
I need 2 different Solr indexes.
Should I create 2 we
Hi,
I'm new to Solr; sorry if I missed the answer in the docs somewhere...
I need 2 different Solr indexes.
Should I create 2 webapps? In that case I have Tomcat contexts solr and
solr2, but then I can't start solr2; I get this error:
Sep 26, 2007 6:07:25 PM org.apache.catalina.core.StandardContex
That seems well within Solr's capabilities, though you should come up
with a desired queries/sec figure.
Solr's query rate varies widely with the configuration -- how many
fields, fuzzy search, highlighting, facets, etc.
Essentially, Solr uses Lucene, a modern search core. It has performance
and
I am new to the list and new to lucene and solr. I am considering Lucene
for a potential new application and need to know how well it scales.
Following are the parameters of the dataset.
Number of records: 7+ million
Database size: 13.3 GB
Index Size: 10.9 GB
My questions are simply:
1) Appr
On Sep 26, 2007, at 4:04 AM, Doğacan Güney wrote:
NUTCH-442 is one of the issues that I want to really see resolved.
Unfortunately, I haven't received many (as in, none) comments, so I
haven't made further progress on it.
I am probably your target customer but to be honest all we care about
Arisem is a French ISV delivering best-of-breed text analytics software. We
have been using Lucene in our products since 2001 and are in search of a Lucene
expert to complement our R&D team.
Required skills:
- Master's degree in computer science
- 2+ years of experience in working with Lucene
- Stro
Hello,
For the project I'm working on now, it is important to group the results
of a query by a "product" field. Documents
belong to only one product, and there will never be more than 10
different products altogether.
When searching through the archives I identified 3 options:
1) [[Client-si
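The client-side option can be sketched quickly (plain Python; field names, the over-fetching idea, and the per-group cap are illustrative assumptions): fetch more rows than you need, then bucket the score-sorted results by product.

```python
from collections import OrderedDict

def group_by_product(results, per_group=3):
    """Group search results by their 'product' field, preserving the
    original (score-sorted) order and capping each group's size."""
    groups = OrderedDict()
    for doc in results:
        bucket = groups.setdefault(doc["product"], [])
        if len(bucket) < per_group:
            bucket.append(doc)
    return groups

hits = [
    {"id": 1, "product": "A"},
    {"id": 2, "product": "B"},
    {"id": 3, "product": "A"},
]
grouped = group_by_product(hits, per_group=2)
```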
> Hi Guys,
>
> this question has been asked before, but I was unable to find an answer
> that's good for me, so I hope you guys can help again
> I am working on a website where we need to sort the results by distance
> from the location entered by the user. I have indexed the lat and long
> info for ea
On 9/26/07, Brian Whitman <[EMAIL PROTECTED]> wrote:
>
> > Sami has a patch in there which used an older version of the solr
> > client. with the current solr client in the SVN tree, his patch
> > becomes much easier.
> > your job would be to upgrade the patch and mail it back to him so
> > he can u