date:20080311

Re: ranking on Multivalued fields

2008-03-11 Thread Tobias Lohr

What you probably want to achieve is displaying only docs in a certain 
category (maybe filtered) ordered by descending score in the context of 
exactly this category, right?


Well, you could get around this by creating a category-specific score 
field for every category, named following the pattern "cat-X-score", where X is 
the identifier of the category. Then, when you receive a request 
for a category, you programmatically build the sort-by 
condition on the field "cat-Y-score", where Y is the id of the 
requested category.
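
A minimal sketch of building that sort condition on the client side. The "cat-X-score" naming comes from the message above; the function name, the id validation rule, and the example query are assumptions made only for illustration.

```python
import re

def category_sort_param(category_id: str, descending: bool = True) -> str:
    """Build a Solr sort value for the per-category score field."""
    # Assumption: category ids are plain alphanumeric strings, so anything
    # else is rejected rather than pasted into a field name.
    if not re.fullmatch(r"[A-Za-z0-9_]+", category_id):
        raise ValueError(f"unexpected category id: {category_id!r}")
    direction = "desc" if descending else "asc"
    return f"cat-{category_id}-score {direction}"

# Hypothetical request parameters for category 42.
params = {"q": "category:42", "sort": category_sort_param("42")}
print(params)
```

Every document then needs one such score field per category it belongs to, so the number of distinct fields grows with the number of categories.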


*tobi*

Umar Shah wrote:

Hi Otis,

thanks for the reply,

consider a multivalued field name cat

--other fields

 val 1  score2  
 val 3 
Umar,

I'm not sure what you mean by a "subfield", can you explain please?

As for your second question, just add category:X to your query and you'll
get matches ordered/ranked by score by default.

Otis


--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


----- Original Message -----
From: Umar Shah <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Friday, March 7, 2008 1:17:35 AM
Subject: ranking on Multivalued fields

Hi,

I have a problem where I want to rank on multivalued fields.

Suppose a multivalued field "category" has an associated subfield "score".
First, is it possible to have a subfield in a multivalued field?
Second, I want to get the documents ranked by the highest score for, say,
category:X

thanks
Umar Shah

Re: What is default Date time format in Solr

2008-03-11 Thread Mahesh Udupa

Thanks Chris,

My index creation was wrong ;) (I was using the 12-hour format)

Thanks for your support
-kmu

On Sat, Mar 8, 2008 at 1:35 AM, Chris Hostetter <[EMAIL PROTECTED]>
wrote:

>
> : I heard Solr Date time format is 24 hours.
>
> that is correct.
>
> : emf.artist:[2007-12-31T22:20:00Z TO  2007-12-31T22:39:00Z]
> :
> : I am not able to get the content what I expected.
> :
> : But, I tried with following query:-
> :
> : emf.artist:[2007-12-31T10:20:00Z TO  2007-12-31T10:39:00Z]
>
> Is your emf.artist field stored?
> If so what value do you see in the field when you do that second query and
> get the results you are looking for?  if they don't match what you think
> they should be, then the code you have reading dates from your index and
> writing them to Solr isn't doing what you think it's doing.
>
>
>
>
> -Hoss
>
>
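
The mix-up described in this thread is easy to reproduce. The sketch below is not the poster's indexing code; it only shows, using Python's standard strftime directives, how a 12-hour hour field turns 22:20 into 10:20 before the value reaches Solr.

```python
from datetime import datetime, timezone

SOLR_DATE_FORMAT = "%Y-%m-%dT%H:%M:%SZ"   # %H is the 24-hour clock
TWELVE_HOUR_BUG = "%Y-%m-%dT%I:%M:%SZ"    # %I is the 12-hour clock

def to_solr_date(moment: datetime) -> str:
    """Format a timezone-aware datetime as a UTC Solr date string."""
    return moment.astimezone(timezone.utc).strftime(SOLR_DATE_FORMAT)

late_evening = datetime(2007, 12, 31, 22, 20, 0, tzinfo=timezone.utc)
print(to_solr_date(late_evening))                # 2007-12-31T22:20:00Z
print(late_evening.strftime(TWELVE_HOUR_BUG))    # 2007-12-31T10:20:00Z
```

A document indexed with the second format only matches the 10:20 range query, which is the behaviour reported above.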

Re: Accented search

2008-03-11 Thread Peter Cline

I'm not sure about a way to boost scores in this case, but you can 
achieve the basic matching by applying a filter to both the index and the 
queries.  The ISOLatin1AccentFilter seems like it may work for you, 
though I'm not entirely certain that it will cover all the accented 
characters you need.


My approach has been to write new filters, one to normalize the unicode 
into the "decomposed" version, then one to manually strip out all of the 
"add-on" characters (with decimal codepoint greater than 256).  I don't 
know if this will always work, but it's worked well for me so far.
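
A rough sketch of the decompose-then-strip idea, outside of Lucene. Note one substitution: instead of dropping every character above a fixed code point, this version drops only Unicode combining marks, which is an assumption about what "add-on" characters means here.

```python
import unicodedata

def strip_accents(text: str) -> str:
    """Decompose to NFD, drop combining marks, then recompose the rest."""
    decomposed = unicodedata.normalize("NFD", text)
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)

print(strip_accents("Lập Trình Viên"))   # Lap Trinh Vien
```

Letters that have no canonical decomposition, such as the Vietnamese "đ", pass through unchanged, so a mapping table would still be needed for those.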


I would test out adding the ISOLatin1AccentFilter 
to your analyzer.  It might do the trick.  Once again, with this 
approach I'm not sure how to boost either score, so someone else may 
have better ideas.  I'm pretty new to all of this stuff.


Peter

climbingrose wrote:

Hi guys,

I'm running into some problems with accented (UTF-8) languages. I'd love to
hear some ideas about how to use Solr with those languages. Basically, I
want to achieve what Google did with UTF-8 languages.

My requirements include:
1) Accent insensitive search and proper highlighting:
  For example, we have 2 documents:

  Doc A (title:Lập Trình Viên)
  Doc B (title:Lap Trinh Vien)

  if the user enters "Lập Trình Viên", then Doc B is also matched and "Lập
Trình Viên" is highlighted.
  On the other hand, if the query is "Lap Trinh Vien", Doc A is also
matched.
2) Assign proper scores to accented or non-accented searches:
  if the user enters "Lập Trình Viên", then Doc A should be given a higher
score than Doc B.
  if the query is "Lap Trinh Vien", Doc A should be given a higher score.

Any ideas guys? Thanks in advance!

RE: Accented search

2008-03-11 Thread Binkley, Peter

We've done this in a pre-Solr Lucene context by using the position increment: 
when a token contains accented characters, you add a stripped version of that 
token with a zero increment, so that for matching purposes the original and the 
stripped version are at the same position. Accents are not stripped from 
queries. The effect is that an accented search matches your Doc A, and an 
unaccented search matches Docs A and B. We do that after lower-casing the token.

There are some limitations: users might start to expect that they can freely 
add accents to restrict their search to accented hits, but if they don't match 
the accents exactly they won't get any hits: e.g. if a word contains two 
accented characters and the user only accents one of them in their query, they 
won't match the accented or the unaccented version. 
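
The scheme and its limitation can be illustrated without Lucene. The sketch below models tokens as (term, position increment) pairs; the function names are made up for this example, and a real implementation would live in a Lucene TokenFilter rather than in Python.

```python
import unicodedata

def strip_accents(token: str) -> str:
    """Drop combining marks after canonical decomposition."""
    decomposed = unicodedata.normalize("NFD", token)
    kept = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return unicodedata.normalize("NFC", kept)

def index_tokens(text: str) -> list[tuple[str, int]]:
    """Return (term, position_increment) pairs for a whitespace-split text."""
    pairs = []
    for raw in text.split():
        token = unicodedata.normalize("NFC", raw.lower())  # lower-case first
        pairs.append((token, 1))          # original token advances the position
        stripped = strip_accents(token)
        if stripped != token:
            pairs.append((stripped, 0))   # stripped copy shares that position
    return pairs

def term_matches(query_term: str, pairs: list[tuple[str, int]]) -> bool:
    """Single-term lookup; accents are deliberately not stripped from queries."""
    wanted = unicodedata.normalize("NFC", query_term.lower())
    return any(term == wanted for term, _ in pairs)

indexed = index_tokens("Préférence")
print(indexed)                               # [('préférence', 1), ('preference', 0)]
print(term_matches("preference", indexed))   # True
print(term_matches("preférence", indexed))   # False: only one accent typed
```

The last line shows the limitation mentioned above: a query that accents only one of the two characters matches neither indexed form.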

Peter

Peter Binkley
Digital Initiatives Technology Librarian
Information Technology Services
4-30 Cameron Library
University of Alberta Libraries
Edmonton, Alberta
Canada T6G 2J8
Phone: (780) 492-3743
Fax: (780) 492-9243
e-mail: [EMAIL PROTECTED]

~ The code is willing, but the data is weak. ~



schema help

2008-03-11 Thread Geoffrey Young


hi :)

I'm trying to work out a schema for our widgets.  more than "just coming 
up with something" I'd like something idiomatic in solr terms.  any help 
is much appreciated.  here's a similar problem space to what I'm working 
with...


let's say we're talking books.  books are written by authors and held in 
libraries.  a sister company is using lucene+compass and they seem to 
have completely different collections (or whatever the technical term is :)


  authors
  books
  libraries

so that a search for authors hits only the authors dataset.

all of the solr examples I can find don't seem to address this kind of 
data disparity.  what is the standard and idiomatic approach for solr?


for my particular data I'd want to display something like this

  author
    book in library
    book in library

on the same result page, but using a completely flat, single schema 
doesn't seem to scale very well.


collective wisdom most welcome :)

--Geoff

RE: Accented search

2008-03-11 Thread Renaud Waldura

Peter:

Very interesting. To take care of the issue you mention, could you add
multiple "synonyms" with progressively fewer accents? 

E.g. you'd index "préférence" as 4 tokens:
 préférence (unchanged)
 preférence (stripped one accent)
 préference (stripped the other accent)
 preference (stripped both accents)

Or does it yield too many tokens to be useful?

And how does this take care of scoring? Do you get a higher score with a
closer match?
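
To get a feel for the token count, here is a small sketch that enumerates every partially stripped form of a word. It is an illustration only: the function names are invented, and each accented character is either kept whole or fully stripped.

```python
import unicodedata
from itertools import product

def strip_char(ch: str) -> str:
    """Return the character without its combining marks."""
    decomposed = unicodedata.normalize("NFD", ch)
    kept = "".join(c for c in decomposed if not unicodedata.combining(c))
    return unicodedata.normalize("NFC", kept)

def accent_variants(word: str) -> list[str]:
    """All combinations of keeping or stripping each accented character."""
    word = unicodedata.normalize("NFC", word)
    choices = []
    for ch in word:
        plain = strip_char(ch)
        choices.append((ch,) if plain == ch else (ch, plain))
    return ["".join(combo) for combo in product(*choices)]

variants = accent_variants("préférence")
print(len(variants))   # 4
print(variants)
```

A word with n accented characters produces 2**n tokens, so two accents give 4 forms and five accents already give 32.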


 


Re: Accented search

2008-03-11 Thread Walter Underwood

Generally, the accented version will have a higher IDF, so it
will score higher.

wunder
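
As a rough illustration of that point, assuming the classic Lucene idf formula 1 + ln(numDocs / (docFreq + 1)) and made-up document counts: the stripped term is indexed for both accented and unaccented documents, so its document frequency is at least as high as that of the accented term.

```python
import math

def classic_idf(doc_freq: int, num_docs: int) -> float:
    """Assumed classic Lucene idf: 1 + ln(numDocs / (docFreq + 1))."""
    return 1.0 + math.log(num_docs / (doc_freq + 1))

num_docs = 1000          # hypothetical index size
accented_df = 5          # docs containing the accented form
stripped_df = 50         # docs containing the stripped form (superset)

print(classic_idf(accented_df, num_docs))
print(classic_idf(stripped_df, num_docs))
```

With these numbers the accented term gets the larger idf, so an exact accented match contributes more to the score.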


Re: schema help

2008-03-11 Thread Otis Gospodnetic

Geoff,

I'm not sure if I understood your problem correctly, but it sounds like you 
want your search to be restricted to authors, but then you want to list all of 
his/her books when displaying results.  The easiest thing to do would be to 
create an index where each "row"/Document has the author name, the book title, 
etc.  For each author-matching Document you'd pull his/her books out of the 
result set.  Yes, this means the author name would be denormalized in 
RDBMS-speak.  Another option is not to index/store book titles, but rather have 
only an author index to search against.  The book data (mapped to author 
identities) would then be pulled from an external source (e.g. RDBMS: select 
title from books where author_id in (1,2,3)) at search results display time.
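
A small sketch of the first option, where each result document carries both the author name and one book title and the client regroups them for display. The field names "author" and "title" and the sample rows are hypothetical.

```python
def group_by_author(docs: list[dict]) -> dict[str, list[str]]:
    """Collect book titles under each author, keeping result order."""
    grouped: dict[str, list[str]] = {}
    for doc in docs:
        grouped.setdefault(doc["author"], []).append(doc["title"])
    return grouped

# Hypothetical denormalized search results: one document per book.
results = [
    {"author": "A. Writer", "title": "First Book"},
    {"author": "B. Author", "title": "Other Book"},
    {"author": "A. Writer", "title": "Second Book"},
]

for author, titles in group_by_author(results).items():
    print(author)
    for title in titles:
        print("   ", title)
```

The author name is repeated in every book document, which is the denormalization mentioned above.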

Otis

--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch


Re: ranking on Multivalued fields

2008-03-11 Thread Otis Gospodnetic

Umar,

The notion of "subfield" does not exist in Solr (or am I living under a rock?).
Thus,  val 1 http://sematext.com/ -- Lucene - Solr - Nutch


Re: schema help

2008-03-11 Thread Geoffrey Young




Otis Gospodnetic wrote:

> Geoff,
>
> I'm not sure if I understood your problem correctly, but it sounds
> like you want your search to be restricted to authors, but then you
> want to list all of his/her books when displaying results. 


that's about right.  add that I may also want to search on libraries and 
show all the books (and authors) stored there.


in real life, it's not books or authors, of course, but the parallels 
are close enough :)  in fact, the library example is a good one for 
me... or at least a network of public libraries linked together.



> The easiest thing to do would be to create an index where each
> "row"/Document has the author name, the book title, etc.  For each
> author-matching Document you'd pull his/her books out of the result
> set.  Yes, this means the author name would be denormalized in
> RDBMS-speak.  


I think I can live with the denormalization - it seems lucene is flat 
and very different conceptually than a database :)


the trouble I'm having is one of dimension.  an author has many, many 
attributes (name, birthdate, biography in $language, etc).  as does each 
book (title in $language, summary in $language, genre, etc).  as does 
each library (name, address, directions in $language, etc).  so an 
author with N books doesn't seem to scale very well in the flat 
representations I'm finding in all the lucene/solr docs and examples... 
at least not in some way I can wrap my head around.


part of what seemed really appealing about lucene in general was that 
you could stuff all this (unindexed) information into a document and 
retrieve it all based on some search criteria.  but it's seeming very 
difficult for me to wrap my head around the data I need to represent.



> Another option is not to index/store book titles, but
> rather have only an author index to search against.  The book data
> (mapped to author identities) would then be pulled from an external
> source (e.g. RDBMS: select title from books where author_id in
> (1,2,3)) at search results display time.


eew :)  seriously, though, that's what we have now - all rdbms driven. 
if solr could only conceptually handle the initial lookup there wouldn't 
be much point.


maybe I'm thinking about this all wrong (as is to be expected :), but I 
just can't believe that nobody is using solr to represent data a bit 
more complex than the examples out there.


thanks for the feedback.

--Geoff




Re: schema help

2008-03-11 Thread Rachel McConnell

Our Solr use consists of several rather different data types, some of
which have one-to-many relationships with other types.  We don't need
to do any searching of quite the kind you describe, but I have an idea
about it, depending on what you need to do with the book data.  It is
rather hacky, but maybe you can improve it.

If you only need to present a list of books, possibly with links to
fuller data, you could do this:
* store only Authors in solr
* create a field, stored but not indexed (I may be using slightly
wrong terms here) which contains the short text representation of all
their books
* search on authors however you want and make sure you return this
field, and just display it as is

For example, if Jane Doe has written 2 books, How To Garden, and
Fields Of Maine, your special field might contain this:

Fields of Maine published on
DATE.  A brief overview of Maine's woods and fields with special
attention to wildflowers

If your 'authors' 'write' 'books' with great frequency, you'd need to
update a lot...


Another possibility is to do two searches, with this kind of
structure, which sort of mimics an RDBMS:
* everything in Solr has a field, type (book, author, library, etc).
these can be filtered on a search by search basis
* books have a field, authorId, uniquely referencing the author
* your first search will be restricted to just authors, from which you
will extract the IDs.
* your second search will be restricted to just books, whose authorId
field is exactly one of the IDs from the first search
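
The two-step lookup could be sketched like this. The field names "type" and "authorId" come from the list above; the helper names are invented, and putting the type restriction in an fq filter is just one possible choice.

```python
def author_query(user_terms: str) -> dict[str, str]:
    """First search: user terms restricted to author documents."""
    return {"q": user_terms, "fq": "type:author"}

def books_query(author_ids: list[int]) -> dict[str, str]:
    """Second search: books whose authorId is one of the ids found."""
    if not author_ids:
        raise ValueError("no author ids to look up")
    id_clause = " OR ".join(str(author_id) for author_id in author_ids)
    return {"q": f"authorId:({id_clause})", "fq": "type:book"}

print(author_query("doe"))
print(books_query([1, 2, 3]))
```

The second request cannot be built until the first one has returned, so every result page costs two round trips.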


As you have noticed, Lucene is not an RDBMS.  Searching through all
the text of all the books is more the use it was designed around; of
course the analogy might not be THAT strong with your need!

Rachel


Re: Unparseable date

2008-03-11 Thread monkins


I indexed my docs with field : 1995-12-31T23:59:59.000Z
But when i try to search on that field : order_dt:1995-12-31T23:59:59.000Z ,
I get an exception :
Mar 11, 2008 4:13:55 PM org.apache.solr.core.SolrException log
SEVERE: org.apache.solr.core.SolrException: Invalid Date String:'1995-12-31T23'
    at org.apache.solr.schema.DateField.toInternal(DateField.java:108)
    at org.apache.solr.schema.FieldType$DefaultAnalyzer$1.next(FieldType.java:298)
    at org.apache.lucene.queryParser.QueryParser.getFieldQuery(QueryParser.java:437)
    at org.apache.solr.search.SolrQueryParser.getFieldQuery(SolrQueryParser.java:78)
    at org.apache.lucene.queryParser.QueryParser.Term(QueryParser.java:1092)
    at org.apache.lucene.queryParser.QueryParser.Clause(QueryParser.java:979)
    at org.apache.lucene.queryParser.QueryParser.Query(QueryParser.java:907)
    at org.apache.lucene.queryParser.QueryParser.TopLevelQuery(QueryParser.java:896)
    at org.apache.lucene.queryParser.QueryParser.parse(QueryParser.java:146)

Am I missing anything ?

Thanks,
Monica.


Daniel Andersson-5 wrote:
> 
> 
> On Mar 5, 2008, at 11:08 PM, Chris Hostetter wrote:
> 
>> It's ".000" not ":00" ... "2008-02-12T15:02:06.000Z"
>>
>> but like i said: that stack trace is odd, the time doesn't seem like it
>> actually comes from any query params, it looks like it's coming from a
>> previously indexed doc.  To work around this you may need to reindex
>> all of your docs with those optional milliseconds.
> 
> Ah, re-indexing now. Thanks for your help!
> 
> / d
> 
> 


Query Level Boosting

2008-03-11 Thread oleg_gnatovskiy


Hello. I was wondering if anyone knew a way to do query level boosting with
SolrJ. On the http client I could just do something like sku:123^2.3 which
would boost the sku query 2.3 points.

Re: Out of memory in analysis

2008-03-11 Thread Chris Hostetter


: I pasted a modest blob of text into the analysis debug slot on the admin
: app, and am rewarded with this, even with -Xmx1g.

what was the text?  what was the field/fieldtype?  what did the 
analyzers for that fieldtype look like in your schema.xml?


-Hoss

Re: return only sorted Field, but with a different Field Name

2008-03-11 Thread Chris Hostetter

: 
: For example, say I want to sort by the field '162_sortable_s' then I add a
: parameter like so 'sort=162_sortable_s.' I need to change the settings so
: that when the result set is returned from solr, it takes the values of
: '162_sortable_s' and inserts them into a separate field called 'SortedField'
: so that the return doc looks like this:

there is nothing like this in solr right now, it doesn't seem like 
something that should be done in solr, as it would be a simple translation 
that could be done via an XSLT or some client layer code.

: How or where do I change that setting? Do I have to rewrite some part of the
: RequestHandler?

assuming you didn't want to just use an XSLT, writing your own response 
writer that subclasses XmlResponseWriter would probably be the simplest 
way to accomplish this.
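
For the client-layer route, a sketch along these lines may be enough. It assumes the response has already been parsed into a list of dictionaries; the function name is invented, and 'SortedField' is the alias requested in the question.

```python
def add_sorted_field(docs: list[dict], sort_field: str,
                     alias: str = "SortedField") -> list[dict]:
    """Copy each doc and expose the sort field's value under a fixed alias."""
    renamed = []
    for doc in docs:
        copy = dict(doc)                     # leave the parsed response untouched
        if sort_field in copy:
            copy[alias] = copy[sort_field]
        renamed.append(copy)
    return renamed

# Hypothetical parsed results for sort=162_sortable_s
docs = [{"id": "1", "162_sortable_s": "apple"}, {"id": "2"}]
print(add_sorted_field(docs, "162_sortable_s"))
```

Documents that lack the sort field are passed through without the alias.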



-Hoss

Re: How to get incrementPositionGap value from IndexSchema ?

2008-03-11 Thread Chris Hostetter


: I am looking for a way to access the incrementPositionGap value defined for a
: field type in the schema.xml.

I think you mean "positionIncrementGap"

It's a property of the field type in schema.xml, but internally it's 
passed to SolrAnalyzer.setPositionIncrementGap.  if you want to 
programmatically know what the "positionIncrementGap" is for any analyzer of 
any field or fieldtype regardless of whether or not it's a SolrAnalyzer, 
just use Analyzer.getPositionIncrementGap(String fieldName) 

ie: myFieldType.getAnalyzer().getPositionIncrementGap(myFieldName)


If you don't mind me asking:  why do you want/need this information in 
your custom code?


-Hoss

Re: Result based sorting for KWIC?

2008-03-11 Thread Chris Hostetter


: I am investigating using solr for a project that requires presentation of
: search results in a KWIC display, sorted according to either the string
: following the matches or the (reverse) of the characters previous to the
: matches.  Can this be done with Solr?  How would I go about implementing this?

1) if you've got full text search, why would you even want KWIC?  

2) your description of how you'd want the results ordered is extremely 
confusing to me ... can you give a simple concrete example of some 
documents / queries / result-doclists that you would want to see?



-Hoss

Re: Unparseable date

2008-03-11 Thread Chris Hostetter

: I indexed my docs with field : 1995-12-31T23:59:59.000Z
: But when i try to search on that field : order_dt:1995-12-31T23:59:59.000Z ,
: I get an exception :
: Mar 11, 2008 4:13:55 PM org.apache.solr.core.SolrException log
: SEVERE: org.apache.solr.core.SolrException: Invalid Date
: String:'1995-12-31T23'

":" is a special character for the query parser, so it either needs to be 
escaped or the date needs to be quoted...

order_dt:"1995-12-31T23:59:59.000Z"

this isn't something most people typically need to worry about, because 
dates are typically only queried using ranges...

order_dt:[1995-12-31T23:59:59.000Z TO *]
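
Client code that builds such queries can apply either fix before sending the request. The helpers below are a sketch with invented names; they only handle the characters discussed here, not the query parser's full list of special characters.

```python
def quote_term(value: str) -> str:
    """Wrap a value in double quotes, escaping backslashes and quotes inside."""
    escaped = value.replace("\\", "\\\\").replace('"', '\\"')
    return f'"{escaped}"'

def escape_colons(value: str) -> str:
    """Backslash-escape every colon so the parser keeps the value whole."""
    return value.replace(":", "\\:")

def open_range(field: str, start: str) -> str:
    """Range query from a start value with no upper bound."""
    return f"{field}:[{start} TO *]"

date = "1995-12-31T23:59:59.000Z"
print("order_dt:" + quote_term(date))
print("order_dt:" + escape_colons(date))
print(open_range("order_dt", date))
```

Without one of these, the parser splits the value at the first unescaped colon after the field name, which is why the exception above reports only '1995-12-31T23'.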



-Hoss

Re: Accented search

2008-03-11 Thread climbingrose

Hi Peter,

It looks like a very promising approach for us. I'm going to implement a
custom Tokeniser based on your suggestions and see how it goes. Thank you
all for your comments!

Cheers

On Wed, Mar 12, 2008 at 2:37 AM, Binkley, Peter <[EMAIL PROTECTED]>
wrote:

> We've done this in a pre-Solr Lucene context by using the position
> increment: when a token contains accented characters, you add a stripped
> version of that token with a zero increment, so that for matching purposes
> the original and the stripped version are at the same position. Accents are
> not stripped from queries. The effect is that an accented search matches
> your Doc A, and an unaccented search matches Docs A and B. We do that after
> lower-casing the token.
>
> There are some limitations: users might start to expect that they can
> freely add accents to restrict their search to accented hits, but if they
> don't match the accents exactly they won't get any hits: e.g. if a word
> contains two accented characters and the user only accents one of them in
> their query, they won't match the accented or the unaccented version.
>
> Peter
>
> Peter Binkley
> Digital Initiatives Technology Librarian
> Information Technology Services
> 4-30 Cameron Library
> University of Alberta Libraries
> Edmonton, Alberta
> Canada T6G 2J8
> Phone: (780) 492-3743
> Fax: (780) 492-9243
> e-mail: [EMAIL PROTECTED]
>
> ~ The code is willing, but the data is weak. ~
>
>
> -Original Message-
> From: climbingrose [mailto:[EMAIL PROTECTED]
> Sent: Monday, March 10, 2008 10:01 PM
> To: solr-user@lucene.apache.org
> Subject: Accented search
>
> Hi guys,
>
> I'm running into some problems with accented (UTF-8) languages. I'd love to
> hear some ideas about how to use Solr with those languages. Basically, I
> want to achieve what Google did with UTF-8 languages.
>
> My requirements include:
> 1) Accent insensitive search and proper highlighting:
>  For example, we have 2 documents:
>
>  Doc A (title:Lập Trình Viên)
>  Doc B (title:Lap Trinh Vien)
>
>  if the user enters "Lập Trình Viên", then Doc B is also matched and "Lập
> Trình Viên" is highlighted.
>  On the other hand, if the query is "Lap Trinh Vien", Doc A is also
> matched.
> 2) Assign proper scores to accented or non-accented searches:
>  if the user enters "Lập Trình Viên", then Doc A should be given higher
> score than DOC B.
>  if the query is "Lap Trinh Vien", Doc A should be given higher score.
>
> Any ideas guys? Thanks in advance!
>
> --
> Regards,
>
> Cuong Hoang
>



-- 
Regards,

Cuong Hoang

Re: Accented search

2008-03-11 Thread Chris Hostetter

: It looks like a very promising approach for us. I'm going to implement 
: a custom Tokeniser based on your suggestions and see how it goes. Thank 
: you all for your comments!

you don't really need a custom tokenizer -- just a buffered TokenFilter 
that clones the original token if it contains accent chars, mutates the 
clone, and then emits it next with a positionIncrement of 0.
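
A rough sketch of that filter logic, written in Python rather than as a real Lucene TokenFilter. The function names are made up for illustration, and accent stripping here uses Unicode NFD decomposition, which is an assumption on my part and not how ISOLatin1AccentFilter works internally. Tokens are assumed to be lower-cased already.

```python
import unicodedata
from typing import Iterable, Iterator, Tuple


def strip_accents(text: str) -> str:
    """Drop combining marks after NFD decomposition (assumed folding rule)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def accent_folding_filter(tokens: Iterable[str]) -> Iterator[Tuple[str, int]]:
    """Yield (term, position_increment) pairs.

    Every original token is emitted with increment 1. If it contains
    accents, a stripped copy follows with increment 0, so both terms
    occupy the same position in the index.
    """
    for token in tokens:
        yield token, 1
        folded = strip_accents(token)
        if folded != token:
            yield folded, 0


stream = list(accent_folding_filter(["l\u1eadp", "tr\u00ecnh", "vi\u00ean"]))
for term, increment in stream:
    print(term, increment)
```

Because only the index side gets the extra terms, an unaccented query matches both documents while an accented query matches only the accented one, which is the behaviour Peter described.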

i'm kind of surprised ISOLatin1AccentFilter doesn't have an option to do 
this already -- it would certainly be a worthy patch to commit if someone 
wants to submit it back to lucene-java.

: > don't match the accents exactly they won't get any hits: e.g. if a word
: > contains two accented characters and the user only accents one of them in
: > their query, they won't match the accented or the unaccented version.

this could be accounted for by generating all of the permutations of 
unaccented characters when indexing -- it wouldn't solve the problem of a 
source term containing only one accent and the user querying with only one 
accent but on a different character ... you could work around this by 
putting all of the permutations in at index time, but querying on the exact 
term and the no-accent term at query time.
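
A small Python sketch of that idea, under the same assumption as before that "unaccented" means NFD decomposition with combining marks removed. The function names and the sample word are hypothetical. Note that a token with k accented characters produces 2^k index terms.

```python
import itertools
import unicodedata


def strip_accents(text: str) -> str:
    """Drop combining marks after NFD decomposition (assumed folding rule)."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def index_variants(token: str) -> list:
    """All combinations of keeping or stripping each accented character."""
    token = unicodedata.normalize("NFC", token)
    choices = []
    for ch in token:
        plain = strip_accents(ch)
        # Unaccented characters have one option, accented ones have two.
        choices.append((ch,) if plain == ch else (ch, plain))
    return ["".join(combo) for combo in itertools.product(*choices)]


def query_terms(term: str) -> list:
    """At query time, search the term as typed plus its fully stripped form."""
    term = unicodedata.normalize("NFC", term)
    plain = strip_accents(term)
    return [term] if plain == term else [term, plain]


variants = index_variants("\u00e9l\u00e8ve")
print(variants)
print(query_terms("elev\u00e9"))
```

A query with an accent on the "wrong" character still matches through its no-accent form, since the fully stripped variant is always among the indexed terms.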


-Hoss

Cannot start solr

2008-03-11 Thread Vinci


I followed the tutorial on the wiki, but when I go to
http://server_address/solr/admin
I get a Tomcat error message:
HTTP 404

When I checked the Tomcat manager, I saw that the application was not started. When
I attempted to start it, I got this error message:

FAIL - Application at context path /solr could not be started

I am using Tomcat 5.5 on Debian, and I placed the war file outside
/webapps; I also copied everything under /example/solr to the path I pointed
to... I checked that the files are there.

What did I do wrong?
-- 
View this message in context: 
http://www.nabble.com/Cannot-start-solr-tp15997140p15997140.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Cannot start solr

2008-03-11 Thread Vinci

Additional Information:

2008/3/12 上午 11:10:54 org.apache.solr.core.SolrResourceLoader
locateInstanceDir
INFO: Using JNDI solr.home: /var/webapps/solr
2008/3/12 上午 11:10:54 org.apache.solr.servlet.SolrDispatchFilter init
INFO: looking for multicore.xml: /var/webapps/solr/multicore.xml
2008/3/12 上午 11:10:54 org.apache.solr.servlet.SolrDispatchFilter init
FATAL: Could not start SOLR. Check solr/home property
java.lang.ExceptionInInitializerError
at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:104)
at
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
at
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
at
org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:78)
at
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
at
org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
at
org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
at
org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
at
org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
at
org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:626)
at
org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:553)
at
org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:488)
at
org.apache.catalina.startup.HostConfig.check(HostConfig.java:1206)
at
org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:293)
at
org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
at
org.apache.catalina.core.ContainerBase.backgroundProcess(ContainerBase.java:1306)
at
org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1570)
at
org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.processChildren(ContainerBase.java:1579)
at
org.apache.catalina.core.ContainerBase$ContainerBackgroundProcessor.run(ContainerBase.java:1559)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.lang.RuntimeException: XPathFactory#newInstance() failed to
create an XPathFactory for the default object model:
http://java.sun.com/jaxp/xpath/dom with the
XPathFactoryConfigurationException:
javax.xml.xpath.XPathFactoryConfigurati...

2008/3/12 上午 11:10:54 org.apache.catalina.core.StandardContext start
FATAL: Error filterStart
2008/3/12 上午 11:10:54 org.apache.catalina.core.StandardContext start
FATAL: Context [/solr] startup failed due to previous errors

-
Related config:
Solr is located in /var/webapps/solr

tree:
/var/webapps/solr/
|-- README.txt
|-- bin
|   |-- abc
|   |-- abo
|   |-- backup
|   |-- backupcleaner
|   |-- commit
|   |-- optimize
|   |-- readercycle
|   |-- rsyncd-disable
|   |-- rsyncd-enable
|   |-- rsyncd-start
|   |-- rsyncd-stop
|   |-- scripts-util
|   |-- snapcleaner
|   |-- snapinstaller
|   |-- snappuller
|   |-- snappuller-disable
|   |-- snappuller-enable
|   `-- snapshooter
`-- conf
    |-- admin-extra.html
    |-- elevate.xml
    |-- protwords.txt
    |-- schema.xml
    |-- scripts.conf
    |-- solrconfig.xml
    |-- stopwords.txt
    |-- synonyms.txt
    `-- xslt
        |-- example.xsl
        |-- example_atom.xsl
        |-- example_rss.xsl
        `-- luke.xsl

solr.xml:

Can anybody help me? I am not so familiar with tomcat...

Vinci wrote:
> 
> I followed the tutorial on the wiki, but when I go to
> http://server_address/solr/admin
> I get a Tomcat error message:
> HTTP 404
> 
> When I checked the Tomcat manager, I saw that the application was not
> started. When I attempted to start it, I got this error message:
> 
> FAIL - Application at context path /solr could not be started
> 
> I am using Tomcat 5.5 on Debian, and I placed the war file outside
> /webapps; I also copied everything under /example/solr to the path I
> pointed to... I checked that the files are there.
> 
> What did I do wrong?
> 

-- 
View this message in context: 
http://www.nabble.com/Cannot-start-solr-tp15997140p15997330.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: schema help

2008-03-11 Thread Otis Gospodnetic

Geoff, some comments inlined.

- Original Message 
From: Geoffrey Young <[EMAIL PROTECTED]>
To: solr-user@lucene.apache.org
Sent: Tuesday, March 11, 2008 4:55:15 PM
Subject: Re: schema help

Otis Gospodnetic wrote:
> Geoff,
> 
> I'm not sure if I understood your problem correctly, but it sounds
> like you want your search to be restricted to authors, but then you
> want to list all of his/her books when displaying results. 

that's about right.  add that I may also want to search on libraries and 
show all the books (and authors) stored there.

OG: That's fine.  One page (of results) at a time, I imagine.

in real life, it's not books or authors, of course, but the parallels 
are close enough :)  in fact, the library example is a good one for 
me... or at least a network of public libraries linked together.

> The
> easiest thing to do would be to create an index where each
> "row"/Document has the author name, the book title, etc.  For each
> author-matching Document you'd pull his/her books out of the result
> set.  Yes, this means the author name would be denormalized in
> RDBMS-speak.  

I think I can live with the denormalization - it seems lucene is flat 
and very different conceptually than a database :)

OG: Right, it is. :)

the trouble I'm having is one of dimension.  an author has many, many 
attributes (name, birthdate, biography in $language, etc).  as does each 
book (title in $language, summary in $language, genre, etc).  as does 
each library (name, address, directions in $language, etc).  so an 
author with N books doesn't seem to scale very well in the flat 
representations I'm finding in all the lucene/solr docs and examples... 
at least not in some way I can wrap my head around.

OG: I'm not sure why the number of attributes worries you.  Imagine it as a 
wide RDBMS table, if it helps.  Indices with dozens of fields are not uncommon.
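
To make the "wide table" picture concrete, here is a minimal Python sketch that flattens one author and their books into one flat document per book. All field names, the id scheme and the sample data are invented for illustration; nothing here is prescribed by Solr.

```python
def flatten(author: dict, books: list) -> list:
    """Build one flat document per book, repeating the author attributes."""
    docs = []
    for book in books:
        # Hypothetical unique key combining both entity ids.
        doc = {"id": f"author{author['id']}-book{book['id']}"}
        doc.update({f"author_{key}": value for key, value in author.items()})
        doc.update({f"book_{key}": value for key, value in book.items()})
        docs.append(doc)
    return docs


author = {"id": 7, "name": "Jane Doe", "birthdate": "1950-01-01"}
books = [
    {"id": 1, "title": "First Book", "genre": "fiction"},
    {"id": 2, "title": "Second Book", "genre": "essay"},
]

docs = flatten(author, books)
for doc in docs:
    print(doc)
```

Each resulting document is a single wide row: the author fields are duplicated on every book, which is the denormalization discussed earlier in the thread.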

part of what seemed really appealing about lucene in general was that 
you could stuff all this (unindexed) information into a document and 
retrieve it all based on some search criteria.  but it's seeming very 
difficult for me to wrap my head around the data I need to represent.

OG: You certainly can do that.  I'm not sure I understand where the hard part 
is.  You seem to know what attributes each entity has.  Maybe you are confused 
by how to handle N different types of entities in a single index? (I'm assuming 
a single index is what you currently have in mind)

> Another option is not to index/store book titles, but
> rather have only an author index to search against.  The book data
> (mapped to author identities) would then be pulled from an external
> source (e.g. RDBMS: select title from books where author_id in
> (1,2,3)) at search results display time.

eew :)  seriously, though, that's what we have now - all rdbms driven. 
if solr could only conceptually handle the initial lookup there wouldn't 
be much point.

OG: Well, there might or might not be, depending on how much data you have, how 
flexible and fast your RDBMS-powered (full-text?) search is, and so on.  
Lucene/Solr for full-text search plus RDBMS/BDB for display data is a common 
combination.
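
A minimal sketch of that combination in Python, assuming the search step has already returned a list of matching author ids. The table layout, the function name and the sample rows are hypothetical, and an in-memory SQLite database stands in for the real RDBMS.

```python
import sqlite3


def titles_for_authors(conn: sqlite3.Connection, author_ids: list) -> dict:
    """Fetch book titles for the given author ids, grouped by author."""
    if not author_ids:
        return {}
    # One "?" placeholder per id keeps the query parameterized.
    placeholders = ",".join("?" for _ in author_ids)
    rows = conn.execute(
        "SELECT author_id, title FROM books "
        f"WHERE author_id IN ({placeholders}) ORDER BY author_id, title",
        list(author_ids),
    ).fetchall()
    grouped = {}
    for author_id, title in rows:
        grouped.setdefault(author_id, []).append(title)
    return grouped


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE books (author_id INTEGER, title TEXT)")
conn.executemany(
    "INSERT INTO books VALUES (?, ?)",
    [(1, "A"), (1, "B"), (2, "C"), (4, "D")],
)

# Stand-in for the ids returned by the full-text search.
matched_author_ids = [1, 2, 3]
display = titles_for_authors(conn, matched_author_ids)
print(display)
```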

maybe I'm thinking about this all wrong (as is to be expected :), but I 
just can't believe that nobody is using solr to represent data a bit 
more complex than the examples out there.

OG: Oh, lots of people are, it's just that examples are simple, so people new 
to Solr, Lucene, etc. have an easier time learning.

Otis 
--
Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch

> 
> Otis
> 
> -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> 
> - Original Message 
> From: Geoffrey Young <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
> Sent: Tuesday, March 11, 2008 12:17:32 PM
> Subject: schema help
> 
> hi :)
> 
> I'm trying to work out a schema for our widgets.  more than "just
> coming up with something" I'd like something idiomatic in solr terms.
> any help is much appreciated.  here's a similar problem space to what
> I'm working with...
> 
> lets say we're talking books.  books are written by authors and held
> in libraries.  a sister company is using lucene+compass and they seem
> to have completely different collections (or whatever the technical
> term is :)
> 
> authors books libraries
> 
> so that a search for authors hits only the authors dataset.
> 
> all of the solr examples I can find don't seem to address this kind
> of data disparity.  what is the standard and idiomatic approach for
> solr?
> 
> for my particular data I'd want to display something like this
> 
> author
>   book in library
>   book in library
> 
> on the same result page, but using a completely flat, single schema 
> doesn't seem to scale very well.
> 
> collective wisdom most welcome :)
> 
> --Geoff
> 
>

Solr nightly build and the multicore mode

2008-03-11 Thread Vinci


Hi all,

After tracing the log, I found that the Tomcat problem with the nightly build
is related to multicore.xml: if multicore.xml doesn't exist, the application
won't run under Tomcat, whereas under Jetty it does (it runs in single-core
mode if the file doesn't exist).

Q1. I don't know how to set the path... WHERE should I put the core1 and
core0 folders, somewhere in solr/home or somewhere in webapps? And how do I
get the admin panel working?

Q2. How can I disable the multicore function when multicore.xml exists? Just
remove the second core?

Thank you for any reply
-- 
View this message in context: 
http://www.nabble.com/Solr-nightly-build-and-the-multicore-mode-tp15997822p15997822.html
Sent from the Solr - User mailing list archive at Nabble.com.

Re: Result based sorting for KWIC?

2008-03-11 Thread Christian Wittern


Chris Hostetter wrote:
> 1) if you've got full text search, why would you even want KWIC?

Well, KWIC is a way to present the full text search results so that they 
can be easily read. 

> 2) your description of how you'd want the results ordered is extremely 
> confusing to me ... can you give a simple concrete example of some 
> documents / queries / result-doclists that you would want to see?

If you go to http://tkb.mydns.jp:8899/exist/rest/db/new/tkb.xq you will 
see what I currently have.  Just click search to search for the example, 
or maybe delete the last character so that you get more results (this is 
not released yet, so don't be surprised if it breaks...). 
You will see the search term highlighted in the middle, context is 
available from the blue arrow to the right.  The display would be much 
more useful for the users, if this could be sorted on the characters 
following the hit (ignoring punctuation).  Another option would be to 
sort on the characters preceding the hit.  But in this case, the 
sorting has to be reversed, so that if I have:

 ABCDFGHI
the sort-key would be constructed as DCBA for this case.
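
A small Python sketch of how such sort keys could be built. It assumes each hit is known as a (line, start, end) triple, and that "ignoring punctuation" means keeping only alphanumeric characters; both are assumptions for illustration, and the function names are made up. In the sample line the hit is assumed to be the single character between D and F.

```python
def _significant(text: str) -> str:
    """Keep only letters and digits, dropping punctuation and whitespace."""
    return "".join(ch for ch in text if ch.isalnum())


def following_key(line: str, start: int, end: int) -> str:
    """Sort key from the characters after the hit, in reading order."""
    return _significant(line[end:])


def preceding_key(line: str, start: int, end: int) -> str:
    """Sort key from the characters before the hit, nearest character first."""
    return _significant(line[:start])[::-1]


# Each hit: (line, start offset of the hit, end offset of the hit).
hits = [
    ("ABCDEFGHI", 4, 5),
    ("XYZAEQ", 4, 5),
]

by_preceding = sorted(hits, key=lambda hit: preceding_key(*hit))
by_following = sorted(hits, key=lambda hit: following_key(*hit))

print(preceding_key("ABCDEFGHI", 4, 5))
print([line for line, _, _ in by_preceding])
```

Reversing the left context means lines are grouped by the character immediately before the hit, then by the one before that, and so on.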

I know that this can be done by post-processing the results on the 
client (which is what Erik suggested offline), but if I get thousands of 
hits, that would be very slow, so I am looking for other ways.   Erik 
also said that down the road there might be a sort function that could 
be called, which is what I would need here. 


Cheers,

Christian

--
Christian Wittern 
Institute for Research in Humanities, Kyoto University

47 Higashiogura-cho, Kitashirakawa, Sakyo-ku, Kyoto 606-8265, JAPAN

Re: Out of memory in analysis

2008-03-11 Thread Benson Margulies

This turned out to be a side-effect of the since-fixed use of GET in
analysis.jsp, coupled with a mistake in one of my filters.

On Tue, Mar 11, 2008 at 8:31 PM, Chris Hostetter <[EMAIL PROTECTED]>
wrote:

>
> : I pasted a modest blob of text into the analysis debug slot on the admin
> : app, and am rewarded with this, even with -Xmx1g.
>
> what was the text?  what was the field/fieldtype?  what did the
> analyzers for that fieldtype look like in your schema.xml?
>
>
> -Hoss
>
>

[Update] Solr can be started from jetty but not tomcat

2008-03-11 Thread Vinci


Hi all, 

After several hours I got Solr working a little: the Jetty version
works, but the Tomcat version doesn't.

Environment: JRE 1.6, Tomcat 5.5, Ubuntu 7.10, Solr nightly (8 Mar 08)

It looks like multicore.xml causes the problem... does Solr die while loading
the config?

In the localhost log:
org.apache.catalina.core.StandardContext filterStart
SEVERE: Exception starting filter SolrRequestFilter
java.lang.NoClassDefFoundError: Could not initialize class
org.apache.solr.core.SolrConfig
at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:114)
at
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
at
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
at
org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:78)
at
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
at
org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
at
org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
at
org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
at
org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
at
org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:626)
at
org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:553)
at
org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:488)
at
org.apache.catalina.startup.HostConfig.start(HostConfig.java:1138)
at
org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
at
org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
at
org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022)
at
org.apache.catalina.core.StandardHost.start(StandardHost.java:736)
at
org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014)
at
org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)
at
org.apache.catalina.core.StandardService.start(StandardService.java:448)
at
org.apache.catalina.core.StandardServer.start(StandardServer.java:700)
at org.apache.catalina.startup.Catalina.start(Catalina.java:552)
at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
at
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at org.apache.catalina.startup.Bootstrap.start(Bootstrap.java:295)
at org.apache.catalina.startup.Bootstrap.main(Bootstrap.java:433)



Catalina log:
 org.apache.solr.servlet.SolrDispatchFilter init
INFO: SolrDispatchFilter.init()
 org.apache.solr.core.SolrResourceLoader locateInstanceDir
INFO: Using JNDI solr.home: /var/webapps/solr
 org.apache.solr.servlet.SolrDispatchFilter init
INFO: looking for multicore.xml: /var/webapps/solr/multicore.xml
 org.apache.solr.servlet.SolrDispatchFilter init
SEVERE: Could not start SOLR. Check solr/home property
java.lang.ExceptionInInitializerError
at
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:104)
at
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:221)
at
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:302)
at
org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:78)
at
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3635)
at
org.apache.catalina.core.StandardContext.start(StandardContext.java:4222)
at
org.apache.catalina.core.ContainerBase.addChildInternal(ContainerBase.java:760)
at
org.apache.catalina.core.ContainerBase.addChild(ContainerBase.java:740)
at
org.apache.catalina.core.StandardHost.addChild(StandardHost.java:544)
at
org.apache.catalina.startup.HostConfig.deployDescriptor(HostConfig.java:626)
at
org.apache.catalina.startup.HostConfig.deployDescriptors(HostConfig.java:553)
at
org.apache.catalina.startup.HostConfig.deployApps(HostConfig.java:488)
at
org.apache.catalina.startup.HostConfig.start(HostConfig.java:1138)
at
org.apache.catalina.startup.HostConfig.lifecycleEvent(HostConfig.java:311)
at
org.apache.catalina.util.LifecycleSupport.fireLifecycleEvent(LifecycleSupport.java:120)
at
org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1022)
at
org.apache.catalina.core.StandardHost.start(StandardHost.java:736)
at
org.apache.catalina.core.ContainerBase.start(ContainerBase.java:1014)
at
org.apache.catalina.core.StandardEngine.start(StandardEngine.java:443)

Re: ranking on Multivalued fields

2008-03-11 Thread Umar Shah

Thanks Otis,

I am a first-time user of Solr. I understand that my problem calls for a
redesign of the document structure. However, using CatX and Cat-X-Score is
not simple, because cat is not a fixed set: the number of values X can take is
not predetermined. I think dynamic fields might be helpful, though. If you have
any insights, please share.
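
A rough Python sketch of how the per-category score field could be named and sorted on when the set of categories is open-ended. It assumes the schema declares a dynamic field pattern such as *_score so that new categories need no schema change; the field naming scheme (underscores instead of the "cat-X-score" form suggested earlier) and the helper names are my own assumptions.

```python
import re


def score_field(category_id: str) -> str:
    """Derive a per-category score field name, e.g. cat_42_score."""
    # Replace anything that is not a letter or digit to keep the name simple.
    safe = re.sub(r"[^A-Za-z0-9]+", "_", category_id).strip("_")
    return f"cat_{safe}_score"


def category_query_params(category_id: str) -> dict:
    """Request parameters: restrict to the category, sort by its score field."""
    quoted = '"' + category_id.replace("\\", "\\\\").replace('"', '\\"') + '"'
    return {
        "q": f"cat:{quoted}",
        "sort": f"{score_field(category_id)} desc",
    }


params = category_query_params("42")
print(params)
```

At index time each document would carry one such field per category it belongs to, and the client builds the sort parameter from the requested category id.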

thanks again.

umar

On 3/12/08, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
>
> Umar,
>
> The notion of "subfield" does not exist in Solr (or am I living under a
> rock?).
> Thus,  val 1 http://sematext.com/ -- Lucene - Solr - Nutch
>
> - Original Message 
> From: Umar Shah <[EMAIL PROTECTED]>
> To: solr-user@lucene.apache.org
>
> Sent: Saturday, March 8, 2008 7:03:32 AM
> Subject: Re: ranking on Multivalued fields
>
> Hi Otis,
>
> thanks for the reply,
>
> consider a multivalued field name cat
> 
> --other fields
>
>  val 1  score2  
>  val 3  >
> > As for your second question, just add category:X to your query and
> you'll
> > get matches ordered/ranked by score by default.
> >
> > Otis
> >
> >
> > --
> > Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
> >
> >
> > - Original Message 
> > From: Umar Shah <[EMAIL PROTECTED]>
> > To: solr-user@lucene.apache.org
> > Sent: Friday, March 7, 2008 1:17:35 AM
> > Subject: ranking on Multivalued fields
> >
> > Hi,
> >
> > I have a problem where i want to rank multivalued fields
> >
> > suppose a multivalued field "category" having associated subfield
> "score".
> > First Is it possible to have a subfield in the mutlivalued field?
> > Second I want to get the documents ranked with the highest score say for
> > the
> > category:X
> >
> > thanks
> > Umar Shah
> >
> >
> >
> >
>
>
>
>

