Re: Index arbitrary xml-elements in only one field without copying

2007-03-14 Thread Erik Hatcher
Thomas - you will need to do this client-side if you don't want to
use copyField.  The client needs to gather up all the text you want
indexed and send it as the single text field.


Erik


On Mar 14, 2007, at 3:50 AM, thomas arni wrote:


Hello

I'm currently evaluating Solr for our needs. As a first step I used your
example and adapted the “schema.xml”.

In contrast to the example docs provided, I don't have homogeneous
documents, which means I only want to index two fields. These fields
are the uniqueKey (docno) and a text field (text).






Instead of using copyField for other XML elements, to copy (and
duplicate) these fields to my “text” field, I want to specify which
fields should be indexed directly into the “text” field, without copying
or duplicating. I have no need for additional index fields in my
heterogeneous environment. These extra fields only take additional space
in my index, which is a disadvantage for me.


How can I specify arbitrary XML elements which should be indexed in my
one and only field “text”? I have no need for additional fields in my
index.


Any help is appreciated.


Thomas
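
For reference, a minimal sketch of the two-field schema Thomas describes might look like this in schema.xml (the field types and attributes here are assumptions for illustration, not taken from the original post):

```xml
<!-- hypothetical sketch: one uniqueKey field plus one catch-all text field -->
<fields>
  <field name="docno" type="string" indexed="true" stored="true"/>
  <field name="text"  type="text"   indexed="true" stored="true"/>
</fields>
<uniqueKey>docno</uniqueKey>
```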




AW: Index arbitrary xml-elements in only one field without copying

2007-03-14 Thread Burkamp, Christian
You can even put multiple text-field entries into one document. The text
field needs to be defined as multi-valued for this to work.

You can put each chunk of data into its own text-field entry.
Perhaps this approach is best suited for what you want to do?

--Christian
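
Christian's suggestion corresponds, roughly, to schema and update-message fragments like these (field names come from the earlier post; the attributes are assumptions):

```xml
<!-- schema.xml: declare the text field as multi-valued (sketch) -->
<field name="text" type="text" indexed="true" stored="true" multiValued="true"/>

<!-- update message: several text chunks sent as one document -->
<add>
  <doc>
    <field name="docno">doc-001</field>
    <field name="text">first chunk of text</field>
    <field name="text">another chunk of text</field>
  </doc>
</add>
```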

-----Original Message-----
From: Erik Hatcher [mailto:[EMAIL PROTECTED]
Sent: Wednesday, March 14, 2007 11:55 AM
To: solr-user@lucene.apache.org
Subject: Re: Index arbitrary xml-elements in only one field without copying


Thomas - you will need to do this client-side if you don't want to
use copyField.  The client needs to gather up all the text you want
indexed and send it as the single text field.

Erik





Re: Index arbitrary xml-elements in only one field without copying

2007-03-14 Thread thomas arni

Thanks for your reply, Erik. I will use your suggested approach.

IMHO this could be something to add in future versions of Solr. The
Terrier IR framework, for example, and other IR solutions allow you to specify
different XML elements which should be indexed into only one (Lucene) field.


As I said in my previous post, this approach is especially helpful if
you have heterogeneous documents with different XML elements.




Erik Hatcher wrote:
Thomas - you will need to do this client-side if you don't want to use
copyField.  The client needs to gather up all the text you want
indexed and send it as the single text field.


Erik








Restrict Servlet Access

2007-03-14 Thread Gunther, Andrew
What are people doing to restrict UpdateServlet access on production
installs of Solr? Are people removing that option and rotating in a new
index, or restricting access from the Jetty side?

Cheers,
Andrew





Re: Restrict Servlet Access

2007-03-14 Thread Erik Hatcher


On Mar 14, 2007, at 10:12 AM, Gunther, Andrew wrote:

What are people doing to restrict UpdateServlet access on production
installs of Solr? Are people removing that option and rotating in a new
index, or restricting access from the Jetty side?


The recommendation is to firewall off Solr so that only your application
server can access it.  Solr is not at all designed for direct client
(browser, etc.) access.


Erik




Re: Restrict Servlet Access

2007-03-14 Thread Brian Whitman


The recommendation is to firewall off Solr so only your application  
server can access it.   Solr is not at all designed for direct  
client (browser, etc) access.


Assuming you lock down update properly, what's the problem? We are
currently using select directly through the XSLTResponseWriter, right
into a page element via Ajax.Updater. Do you predict pain?








Re: Restrict Servlet Access

2007-03-14 Thread Erik Hatcher


On Mar 14, 2007, at 11:09 AM, Brian Whitman wrote:



The recommendation is to firewall off Solr so only your
application server can access it.  Solr is not at all designed
for direct client (browser, etc.) access.


Assuming you lock down update properly, what's the problem? We are
currently using select directly through the XSLTResponseWriter,
right into a page element via Ajax.Updater. Do you predict pain?


I don't predict pain really, but I don't want to see Solr get bogged  
down in having a lot of security-related code added to it.  I do  
think it would be good for there to be some sort of capability to  
make Solr read-only in some form or another, such that an indexer  
could still work from an authorized environment.


Exposing Solr directly to a client does have appeal in the way you're  
doing it, but it also allows the possibility of hackers tinkering  
with it and perhaps requesting things they shouldn't.  For example,  
we index tags and annotations, and only a logged in user can see  
their own annotations, so exposing Solr directly would subvert that  
protection.


Erik



Re: Casting Exception with Similarity

2007-03-14 Thread Tim Patton



Chris Hostetter wrote:

: I figured out my problem.  My own jar must be in the examples/solr/lib
: directory (which does not exist in the download).  I found a hint to
: this on the mailing list.  The docs don't indicate this anywhere
: prominent.  Perhaps the lib directory should exist in the default
: download in the future?

it's mentioned in both the plugin wiki page I listed, as well as the README for
solr.home (example/solr/README.txt) ... do you have any suggestions about
where else we should document it?

we don't include the lib directory in the example Solr home because adding
"plugins" is considered a little above and beyond basic usage ... we
try to keep the example as simple as possible.


-Hoss




Makes sense. I guess I was looking for a mention in the online
documentation for the XML file where it explains how to specify your own
similarity.  Somehow I never stumbled on the other two spots.




Re: SPAM-LOW: Re: Federated Search

2007-03-14 Thread Tim Patton
I have several indexes now (4 at the moment, 20GB each, and I want to be
able to drop in a new machine easily).  I'm using SQL Server as a DB and
it scales well.  The DB doesn't get hit too hard, mostly doing location
lookups, and the app does some checking to make sure a document has
really changed before updating it in the DB or the index.  When a
new server is added it randomly picks up additions from the message
server (it's approximately round-robin) and the rest of the system
really doesn't even need to know about it.


I've realized partitioned indexing is a difficult, but solvable, problem.
It could be a big project though.  I mean, we have all solved it in our
own way, but no one has a general solution.  Distributed searching might
be a better area to add to Solr, since that should basically be the same
for everyone.  I'm going to mess around with Jini on my own indexes;
there's finally a new book out to go with the newer versions.


How were you planning on using Solr with Hadoop?  Maybe I don't fully
understand how Hadoop works.


Tim

Venkatesh Seetharam wrote:

Hi Tim,

Thanks for your response. Interesting idea. Does the DB scale?  Do you have
one single index which you plan to use Solr for, or do you have multiple
indexes?


But I don't know how big the index will grow and I wanted to be able to

add servers at any point.
I'm thinking of having N partitions with a max of 10 million documents per
partition. Adding a server should not be a problem, but the newly added
server would take time to grow so that the distribution of documents is equal
across the cluster. I've tested with 50 million documents of 10 size each and
it looks very promising.


The hash idea sounds really interesting and if I had a fixed number of

indexes it would be perfect.
I'm in fact looking around for a reverse-hash algorithm where, given a
docId, I am able to find which partition contains the document, so I
can save cycles on broadcasting to slaves.
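
The "reverse hash" being asked for here is just the forward computation again: if the partition is a deterministic function of the docId, any node can recompute which partition owns a document instead of broadcasting. A minimal sketch (all names illustrative, assuming a fixed partition count):

```java
// Sketch: hash-partitioning of document ids across a fixed number of
// index partitions. Given a docId, partitionFor() tells you which
// partition owns it -- no broadcast to all slaves needed.
public class DocPartitioner {
    private final int numPartitions;

    public DocPartitioner(int numPartitions) {
        this.numPartitions = numPartitions;
    }

    /** Partition that owns (indexes and searches) the given document id. */
    public int partitionFor(String docId) {
        // Mask the sign bit rather than Math.abs(), which can overflow
        // for Integer.MIN_VALUE.
        return (docId.hashCode() & 0x7fffffff) % numPartitions;
    }

    public static void main(String[] args) {
        DocPartitioner p = new DocPartitioner(4);
        int part = p.partitionFor("doc-12345");
        // The same id always maps to the same partition.
        System.out.println(part == p.partitionFor("doc-12345")); // prints "true"
        System.out.println(part >= 0 && part < 4);               // prints "true"
    }
}
```

The weakness discussed elsewhere in this thread applies: changing numPartitions remaps almost every document, which is why adding servers later is the hard part.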

I mean, even if you use a DB, how have you solved the problem of
distribution when a new server is added into the mix.

We have the same problem since we get daily updates to documents and
document metadata.


How did you work around not being able to update a lucene index that is

stored in Hadoop?
I do not use HDFS. I use a NetApp mounted on all the nodes in the cluster
and hence did not need any change to Lucene.

I plan to index using Lucene/Hadoop and use Solr as the partition searcher
and a broker which would merge the results and return 'em.

Thanks,
Venkatesh

On 3/5/07, Tim Patton <[EMAIL PROTECTED]> wrote:




Venkatesh Seetharam wrote:
> Hi Tim,
>
> Howdy. I saw your post on Solr newsgroup and caught my attention. I'm
> working on a similar problem for searching a vault of over 100 million
> XML documents. I already have the encoding part done using Hadoop and
> Lucene. It works like a  charm. I create N index partitions and have
> been trying to wrap Solr to search each partition, have a Search broker
> that merges the results and returns.
>
> I'm curious about how you have solved the distribution of additions,
> deletions and updates to each of the indexing servers. I use a
> partitioner based on a hash of the document id. Do you broadcast to the
> slaves as to who owns a document?
>
> Also, I'm looking at Hadoop RPC and ICE ( www.zeroc.com
> ) for distributing the search across these Solr
> servers. I'm not using HTTP.
>
> Any ideas are greatly appreciated.
>
> PS: I did subscribe to solr newsgroup now but  did not receive a
> confirmation and hence sending it to you directly.
>
> --
> Thanks,
> Venkatesh
>
> "Perfection (in design) is achieved not when there is nothing more to
> add, but rather when there is nothing more to take away."
> - Antoine de Saint-Exupéry


I used a SQL database to keep track of which server had which document.
Then I originally used JMS and would use a selector for which server
number the document should go to.  I switched over to a home grown,
lightweight message server since JMS behaves really badly when it backs
up and I couldn't find a server that would simply pause the producers if
there was a problem with the consumers.  Additions are pretty much
assigned randomly to whichever server gets them first.  At this point I
am up to around 20 million documents.

The hash idea sounds really interesting and if I had a fixed number of
indexes it would be perfect.  But I don't know how big the index will
grow and I wanted to be able to add servers at any point.  I would like
to eliminate any outside dependencies (SQL, JMS), which is why a
distributed Solr would let me focus on other areas.

How did you work around not being able to update a lucene index that is
stored in Hadoop?  I know there were changes in Lucene 2.1 to support
this but I haven't looked that far into it yet, I've just been testing
the new IndexWriter.  As an aside, I hope those features can be used by
Solr soon (if they aren't already).

Re: Restrict Servlet Access

2007-03-14 Thread Jed Reynolds

Gunther, Andrew wrote:

What are people doing to restrict UpdateServlet access on production
installs of Solr? Are people removing that option and rotating in a new
index, or restricting access from the Jetty side?

I'm putting Solr on my DMZ without direct WAN access. If I had to put it
on a WAN-facing server, I'd hide it behind Apache and access it using
mod_rewrite with the [P] proxy directive. With mod_rewrite ignoring the
/foo/update URI, there is no external access to updates.


Jed
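
Jed's setup might be sketched in Apache configuration roughly like this (the host, port, and paths are assumptions for illustration):

```apache
# Forbid the update URI outright, and proxy only select requests
# through to the internal Solr instance. Anything else never matches
# a rule, so it is not reachable from outside.
RewriteEngine On
RewriteRule ^/solr/update - [F]
RewriteRule ^/solr/select(.*)$ http://internal-solr-host:8983/solr/select$1 [P]
```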


Re: Commit after how many updates?

2007-03-14 Thread Mike Klaas

On 3/14/07, Maximilian Hütter <[EMAIL PROTECTED]> wrote:


It is the default heap size for the Sun JVM, so I guess 64MB max. The
documents are rather large, but if you manage to index 100,000 docs,
there seems to be some problem with Solr.


The documents are not held in memory until a commit occurs (just some
tracking info), so I'm not sure that that is the appropriate
conclusion.  Lucene keeps a few documents in memory
(maxBufferedDocs--you could lower this setting), and if your documents
are large, this could use a higher maximum amount of memory than in my
case.  Solr does keep all uniqueIds in memory until commit.


What would be the recommended heap size for Solr?


That is difficult to answer, since it depends on so many factors.  64
megs seems on the rather low end.  Remember that you aren't just
trying to avoid OOM errors--more memory means bigger caches and
increased query performance.

-Mike
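
For reference, the maxBufferedDocs setting Mike mentions lives in the index section of solrconfig.xml; a sketch (the value shown is only illustrative):

```xml
<indexDefaults>
  <!-- Lucene flushes its in-memory document buffer after this many
       documents; lowering it reduces peak memory at some cost in
       indexing speed -->
  <maxBufferedDocs>100</maxBufferedDocs>
</indexDefaults>
```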


Re: Index arbitrary xml-elements in only one field without copying

2007-03-14 Thread Chris Hostetter
:
: IMHO this could be something to add for future versions of solr. The
: Terrier IR-framework for example and other IR solutions allow to specify
: different XML-elements, which should be indexed in only one (lucene) field.

I don't know anything about Terrier, but there are lots of simple ways to
achieve things like this with Solr depending on what exactly you want; two
off the top of my head...

1) use an XSLT on the client to extract only the fields you want from your
XML file and build up the text fields you send to Solr (we have the
framework in place for you to even do that XSLT server-side)

2) send each element that you care about as a separate field -- you could
use xpath-like descriptors for the names, ie...
   <field name="AAA.BBB.CCC">body of tag CCC</field>
...and then use copyField with a wildcard in the source to consolidate all
tags into a single text field...

   <copyField source="*" dest="text"/>





-Hoss
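
Option 1 might look roughly like this on the client side (the input document structure and element names here are invented for illustration):

```xml
<!-- hypothetical client-side XSLT: pick out only the interesting
     elements and concatenate them into the single "text" field of a
     Solr <add> message -->
<xsl:stylesheet version="1.0"
                xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
  <xsl:template match="/document">
    <add>
      <doc>
        <field name="docno"><xsl:value-of select="@id"/></field>
        <field name="text">
          <xsl:value-of select="concat(title, ' ', body, ' ', keywords)"/>
        </field>
      </doc>
    </add>
  </xsl:template>
</xsl:stylesheet>
```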



Re: Commit after how many updates?

2007-03-14 Thread Chris Hostetter

: It is the default heap size for the Sun JVM, so I guess 64MB max. The
: documents are rather large, but if you manage to index 100,000 docs,
: there seems to be some problem with Solr.

i think you mean "there DOES NOT seem to be a problem with Solr,"
right? ... why would Mike being able to commit only every 100,000 indicate
a problem with Solr?

: What would be the recommended heap size for Solr?

there isn't one ... it's entirely dependent on how big your documents are,
how many fields in your schema have norms enabled, what types of queries
you process, how big you configure the various Solr caches, etc.

-Hoss



Re: Casting Exception with Similarity

2007-03-14 Thread Chris Hostetter
:
: Makes sense, I guess I was looking for a mention in the online
: documentation for the xml file where it mentions how to specify your own
: similarity.  Somehow I never stumbled on the other two spots.


Hmmm... you mean http://wiki.apache.org/solr/SchemaXml right?

yeah, I can see how that would be a little confusing ... I've made some
updates; feel free to edit the docs further if you think it's still not
clear.


-Hoss



Re: Casting Exception with Similarity

2007-03-14 Thread Tim Patton

Sweet, looks like someone beat me to it.

Tim







Re: Federated Search

2007-03-14 Thread Venkatesh Seetharam

Hi Jed,

Thanks for sharing your thoughts and the link.

Venkatesh

On 3/11/07, Jed Reynolds <[EMAIL PROTECTED]> wrote:


Venkatesh Seetharam wrote:
>
>> The hash idea sounds really interesting and if I had a fixed number of
> indexes it would be perfect.
> I'm infact looking around for a reverse-hash algorithm where in given a
> docId, I should be able to find which partition contains the document
> so I
> can save cycles on broadcasting slaves.

Many large databases partition their data either by load or in another
logical manner, like by alphabet. I hear that Hotmail, for instance,
partitions its users alphabetically. Having a broker will certainly
abstract this mechanism, and of course your application(s) want to be
able to bypass the broker when necessary.

> I mean, even if you use a DB, how have you solved the problem of
> distribution when a new server is added into the mix.

http://www8.org/w8-papers/2a-webserver/caching/paper2.html

I saw this link on the memcached list, and the thread surrounding it
certainly covered some similar ground. Some ideas that have been discussed:
- high availability of memcached, redundant entries
- scaling out clusters and facing the need to rebuild the entire cache
on all nodes depending on your bucketing.
I see some similarities between maintaining multiple indices/Lucene
partitions and having a memcached deployment: mostly, if you are hashing
your keys to partitions (or buckets or machines) then you might be faced
with a) availability issues if there's a machine/partition outage, b)
rebuilding partitions if adding a partition/bucket changes the hash
mapping.

The ways I can think of to scale out to new indexes would be to have your
application maintain two sets of bucket mappings from ids to indexes, or,
second, to key your documents and partition them by date.
The former method would allow you to rebuild a second set of
repartitioned indexes and buckets, and then update your
application to use the new bucket mapping (once all the indexes have been
rebuilt). The latter method would only apply if you could organize your
document ids by date and only added new documents to the 'now' end, or
evenly across most dates. You'd have to add a new partition onto the end
as time progressed, and rarely rebuild old indexes unless your documents
grow unevenly.
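
The rehashing problem the thread keeps circling around is what consistent hashing addresses: adding a partition only remaps the keys that fall between it and its predecessor on the ring, rather than everything. A minimal sketch (names and vnode count are illustrative, not from any post here):

```java
import java.util.SortedMap;
import java.util.TreeMap;

// Sketch of a consistent-hash ring mapping document ids to index
// partitions. Each partition is placed on the ring at several
// "virtual node" positions to smooth out the distribution.
public class PartitionRing {
    private final SortedMap<Integer, String> ring = new TreeMap<>();

    public void addPartition(String name) {
        for (int i = 0; i < 16; i++) {
            ring.put(hash(name + "#" + i), name);
        }
    }

    /** Partition owning the given document id. */
    public String partitionFor(String docId) {
        int h = hash(docId);
        SortedMap<Integer, String> tail = ring.tailMap(h);
        // Wrap around to the start of the ring when past the last node.
        return tail.isEmpty() ? ring.get(ring.firstKey())
                              : tail.get(tail.firstKey());
    }

    private static int hash(String s) {
        return s.hashCode() & 0x7fffffff; // non-negative ring position
    }
}
```

With this scheme the application keeps one mapping instead of two, at the cost of some imbalance while a newly added partition fills up.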

Interesting topic! I don't yet need to run multiple Lucene partitions,
but I have a few memcached servers, and I expect that increasing their
number will force my site to take a performance hit as I am
forced to rebuild the caches. I can see, similarly, that if I had multiple
Lucene partitions and had to split some of them, rebuilding the
resulting partitions would be time-intensive, and I'd want to have
procedures in place for availability, scaling out, and changing
application code as necessary. Just having one fail-over Solr index is
so easy in comparison.

Jed



Reload schema.xml

2007-03-14 Thread Debra

Is there a way to reload schema.xml while solr is running?

TIA
Debra
-- 
View this message in context: 
http://www.nabble.com/Reload-schema.xml-tf3404798.html#a9483346
Sent from the Solr - User mailing list archive at Nabble.com.



Re: Reload schema.xml

2007-03-14 Thread Chris Hostetter

: Is there a way to reload schema.xml while solr is running?

Afraid not ... there's a lot of interdependencies between the way the
IndexSchema, the RequestHandlers, the SolrCore, and updates work ... so I
suspect attempting to add something like that would require a lot of
tricky synchronization to get working safely ... which
would not only be hard to do, but might also have some adverse impacts on
the more common case: a long-running system with a constant IndexSchema.

It also wouldn't address a more fundamental issue: many schema.xml changes
require reindexing.

Your servlet container may provide easy hooks for reloading a webapp
(like Solr) on demand ... you could always trigger that whenever you change
your schema.xml.

(in some containers/configs it's a simple matter of touching the solr.war)


-Hoss



Re: Reload schema.xml

2007-03-14 Thread Debra

That was quick... I was editing my question while you sent the answer.
I would appreciate if you can take a look at the edited question.



-- 
View this message in context: 
http://www.nabble.com/Reload-schema.xml-tf3404798.html#a9483960
Sent from the Solr - User mailing list archive at Nabble.com.



Re: faceting for further MLT/grouping

2007-03-14 Thread Chris Hostetter

I'm not sure if I'm understanding your question ... I was going to mention
SOLR-69 as a possible solution for your goal, but I see you've commented
on that issue, so you are already aware of it ... doesn't that code do what
you are describing (by using the facet fields as the mlt.fl)?

I have to confess, I haven't really followed the MLT stuff very closely,
so forgive me if I'm missing a big disconnect between what it does and what
you describe...

: Which is great as a simple "auto-tagger." But we would love to take
: these results and compare it to others to make a "related people"
: query. I could take the top 10 terms, perform a query like
:
: q=pynchon^9%20barthelme^4&facet=true&facet.field=username&facet.limit=10
:
: and get the 2nd-order facets back, but I imagine this is a common



-Hoss



Re: Reload schema.xml

2007-03-14 Thread Chris Hostetter
: That was quick... I was editing my question while you sent the answer.
: I would appreciate if you can take a look at the edited question.

i'm not sure i understand ... did you send a reply?  i didn't get other
messages from you.

Oh crap ... so you used nabble.com to send your message, right? And
apparently Nabble lets people edit messages they post, in place ... that's
really scary, since everybody else on the planet who is subscribed to the
list (and all of the people reading the list archives on other hosts) will
never have any idea what you are talking about.

can you please send a separate message with your question?


-Hoss



Hierarchical facted search redux - nabble

2007-03-14 Thread Graeme Merrall

From Chris's ApacheCon talk on Solr (nice one BTW) I see that Nabble
is apparently using Solr for search. I noticed that the screenshot of
nabble.com in the presentation and the current site has some form of
hierarchy displayed when you search.

Is this done by Solr? Just looking for a clue as to how it might have been done.

Cheers,
Graeme


Re: faceting for further MLT/grouping

2007-03-14 Thread Brian Whitman


On Mar 14, 2007, at 5:48 PM, Chris Hostetter wrote:



I'm not sure if i'm understanding your question ... i was going to mention
SOLR-69 as a possible solution for your goal, but i see you've commented
on that issue so you are already aware of it ... doesn't that code do what
you are describing (by using the facet fields as the mlt.fl ?)


The problem with SOLR-69 is that faceting does not operate on MLT
results. If this were fixed we could do

q=username:bwhitman&mlt=true&mlt.field=reviewText&facet=true&facet.field=username

...but as of now the faceting only operates on the main query, not the
MLT subsection.












Re: Hierarchical facted search redux - nabble

2007-03-14 Thread Chris Hostetter
:
: >From Chris's ApacheCon talk on Solr (nice one BTW) I see that nabble
: is apparently using Solr for search?  I noticed that the screenshot of
: nabble.com in the presentation and the current site has some form of
: hierarchy displayed when you search.

Nabble does apparently use Lucene, but I was not trying to suggest that
they use Solr ... that screenshot was in a section where I was explaining
what faceted searching is, using them as an example of a UI with a
single facet (which happens to be a hierarchy).


-Hoss



Re: faceting for further MLT/grouping

2007-03-14 Thread Chris Hostetter

: The problem with SOLR-69 is that faceting does not operate on MLT
: results. If this was fixed we could do

ah... I see, you're talking about getting facet counts based on the MLT
results, not the main results ... I think someone (Yonik?) suggested
extracting MLT into a separate request handler ... if something like that
were done, then it would be pretty easy to do.

in general it would probably be pretty easy to write a custom request
handler that does this ... the SimpleFacets API can operate on any DocSet,
so mixing and matching the pieces should be feasible.




-Hoss



Re: Hierarchical facted search redux - nabble

2007-03-14 Thread Graeme Merrall

On 3/15/07, Chris Hostetter <[EMAIL PROTECTED]> wrote:

Nabble does aparently use Lucene, but i was not trying to suggest that
they use Solr ... that screenshot was in a section where i was explaining
what faceted searching is, and using them as an example of a UI with a
single facet (which happens to be a hierarchy)


Gotcha. Thanks Chris.

G


Anyone know How to test CustomerAnalyzer with jsp?

2007-03-14 Thread James liu

I don't know how to test; maybe someone can tell me.

Thanks for the advice, Chris.


--
regards
jl


DisMax Question.

2007-03-14 Thread shai deljo

Hi,
I am trying to use the DisMax handler in order to search multiple fields,
but I don't get results back.

Assume the fields I want to search on are "abc", "def", "ghi" and
"jkl"; then I changed solrconfig.xml to this:

  <requestHandler name="dismax" class="solr.DisMaxRequestHandler">
    <lst name="defaults">
      <str name="echoParams">explicit</str>
      <float name="tie">0.01</float>
      <str name="qf">
        abc^2 def ghi^0.1 jkl
      </str>
    </lst>
  </requestHandler>

When I run my query with qt=dismax I don't get results:

/solr/select/?qt=dismax&q=blablabla%3BRelevanc
+desc&version=2.2&start=0&rows=1&indent=on&fl=*,score

when I remove qt=dismax I do get results back:

/solr/select/?q=blablabla%3BRelevanc
+desc&version=2.2&start=0&rows=1&indent=on&fl=*,score

What am I doing wrong?
Thanks,
S


Re: Anyone know How to test CustomerAnalyzer with jsp?

2007-03-14 Thread Chris Hostetter
: Subject: Anyone know How to test CustomerAnalyzer with jsp?
:
: I don't know how to test,,Maybe someone can tell me.

how about something like...

<%@ page import="java.io.StringReader,
                 org.apache.lucene.analysis.Analyzer,
                 org.apache.lucene.analysis.TokenStream" %>
<%
  Analyzer a = new CustomerAnalyzer();
  TokenStream t = a.tokenStream("f", new StringReader("test"));
%>
<%= t.next().toString() %>


-Hoss



Re: DisMax Question.

2007-03-14 Thread Chris Hostetter

: When i run my query with qt=dismax i don't get results:
:
: /solr/select/?qt=dismax&q=blablabla%3BRelevanc
: +desc&version=2.2&start=0&rows=1&indent=on&fl=*,score
:
: when i remove the qt=dismax i do get result back:
:
: /solr/select/?q=blablabla%3BRelevanc
: +desc&version=2.2&start=0&rows=1&indent=on&fl=*,score

dismax does not use the ";" syntax for sorting ... this is the one useful
piece of documentation I ever managed to put in the wiki about dismax...

http://wiki.apache.org/solr/DisMaxRequestHandler

...if you add debugQuery=true to your URL it will give you a bunch of
great debugging info that would have pointed out you were actually getting
a query for the terms "blablabla;Relevanc" and "desc" across all of those
fields.

...also, is "Relevanc" the name of a field you have? Because if you are
trying to sort by score, that's not right for either handler ... you need
"score desc", either after the ";" for standard, or in the "sort" param
for dismax.


-Hoss
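
Applying that advice, the failing query earlier in this thread would be rewritten with a sort parameter, something like this (assuming the intent was to sort by score; other parameters unchanged):

```
/solr/select/?qt=dismax&q=blablabla&sort=score+desc&version=2.2&start=0&rows=1&indent=on&fl=*,score
```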



Reloading solr schema file

2007-03-14 Thread Debra

Is there a way to reload schema.xml while solr is running? 

As a newbiie Java programmer I'm not sure what happens if I do the
following: 

SolrCore core = new SolrCore(null,null); 

Will it replace the current core? What happens to requests that are running? 


What if I do? 

SolrCore core =SolrCore.getSolrCore(); 
core=null; // first core.close(); ?? 
core =SolrCore.getSolrCore(); 



TIA 
Debra
-- 
View this message in context: 
http://www.nabble.com/Reloading-solr-schema-file-tf3406562.html#a9489139
Sent from the Solr - User mailing list archive at Nabble.com.



Re: DisMax Question.

2007-03-14 Thread shai deljo

Thanks, that did the trick, but now I have another problem (the
documentation is sparse).
It fails when I try to boost terms in the query, i.e.
I get results for:

qt=dismax&q=blabla&version=2.2&start=0&rows=1&indent=on&fl=*,score&debugQuery=true&sort=length_seconds+desc

but no results for:

qt=dismax&q=blabla^1.5&version=2.2&start=0&rows=1&indent=on&fl=*,score&debugQuery=true&sort=length_seconds+desc

Doesn't DisMax support term boosting?

What is the alternative to DisMax then? Something like this:

q=field1:bla^0.1 fo OR
field2:blabla^3;score+desc&version=2.2&start=0&rows=170&indent=on&fl=*,score&debugQuery=on

?
