Re: index browsing with solr
Solr has a pluggable request handling framework that lets you easily write custom logic and takes care of the XML/JSON/etc. writing for you. Check: http://wiki.apache.org/solr/SolrPlugins#head-7c0d03515c496017f6c0116ebb096e34a872cb61 http://wiki.apache.org/solr/SolrRequestHandler

Since the exact term browsing mechanism you asked for is not supported, I suggested writing your own and looking at the IndexInfoRequestHandler as a simple starting place. After more thought (and Yonik pointing it out), you are probably best off using faceted browsing to do what you need. http://wiki.apache.org/solr/SolrFacetingOverview

ryan

Thanks! Solr's plugin capability makes me even happier to have chosen it, but after a little reading, it seems that faceted browsing is the way to go to avoid too much programming... In fact it seems pretty easy; just perhaps a little too time consuming, as I'm forced to run a query in order to get back the indexed terms. But at least it works. Thanks again to everybody for your contribution.

Pierre-Yves Landron
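For readers following the faceting suggestion, here is a minimal sketch of what "running a query to get back the indexed terms" could look like as a request URL. It only builds the URL and does not contact a server. The host, port, and the field name `cat` are assumptions for illustration, and the match-all query `*:*` assumes a Solr version whose query parser accepts it; otherwise substitute any query that matches the documents of interest.

```python
from urllib.parse import urlencode

# Hypothetical endpoint; adjust host, port, and path to your own deployment.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def facet_terms_url(field, query="*:*", limit=100):
    """Build a URL asking Solr for facet counts on `field` and no documents."""
    params = [
        ("q", query),            # facet counts cover the documents matching this query
        ("rows", 0),             # skip the document list; only the counts are wanted
        ("facet", "true"),       # turn faceting on
        ("facet.field", field),  # field whose indexed terms should be counted
        ("facet.limit", limit),  # maximum number of terms to return
    ]
    return SOLR_SELECT_URL + "?" + urlencode(params)


if __name__ == "__main__":
    # "cat" is an assumed field name, not one taken from the poster's schema.
    print(facet_terms_url("cat"))
```

Setting `rows` to 0 keeps the response small, which may help with the cost Pierre-Yves mentions, since only the term counts are returned.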
Multiple instances, wiki out of date?
Hi there,

I've been following the instructions from http://wiki.apache.org/solr/SolrJetty?highlight=%28Multiple%29%7C%28Solr%29%7C%28Webapps%29solr to get a few indexes running under the same instance of Jetty 6.1.2.

If I use the webapp descriptors as specified in the wiki (with correct paths, I'm just pasting the example here):

/solr1/* /your/path/to/the/solr.war true name="defaultsDescriptor">org/mortbay/jetty/servlet/webdefault.xml solr/home /your/path/to/your/solr/home/dir
/solr2/* /your/path/to/the/solr.war true name="defaultsDescriptor">org/mortbay/jetty/servlet/webdefault.xml solr/home /your/path/to/your/alternate/solr/home/dir

Jetty complains that:

2007-02-26 18:36:04.874::INFO: Logging to STDERR via org.mortbay.log.StdErrLog
2007-02-26 18:36:05.066::WARN: Config error at name="addWebApplication">/solr1/*/your/path/to/the/solr.warname="extractWAR">truename="defaultsDescriptor">org/mortbay/jetty/servlet/webdefault.xmlname="addEnvEntry">solr/hometype="String">/your/path/to/your/solr/home/dir
2007-02-26 18:36:05.066::WARN: EXCEPTION java.lang.IllegalStateException: No Method: name="addWebApplication">/solr1/*/your/path/to/the/solr.warname="extractWAR">truename="defaultsDescriptor">org/mortbay/jetty/servlet/webdefault.xmlname="addEnvEntry">solr/hometype="String">/your/path/to/your/solr/home/dir on class org.mortbay.jetty.Server
    at org.mortbay.xml.XmlConfiguration.call(XmlConfiguration.java:548)
    at org.mortbay.xml.XmlConfiguration.configure(XmlConfiguration.java:241)
    at org.mortbay.xml.XmlConfiguration.configure(XmlConfiguration.java:203)
    at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:919)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
    at java.lang.reflect.Method.invoke(Method.java:585)
    at org.mortbay.start.Main.invokeMain(Main.java:183)
    at org.mortbay.start.Main.start(Main.java:497)
    at org.mortbay.start.Main.main(Main.java:115)
2007-02-26 18:36:05.068::INFO: Shutdown hook executing
2007-02-26 18:36:05.068::INFO: Shutdown hook complete

I've been looking at the Jetty API and it looks like those methods are deprecated in the latest versions of Jetty.

Anyway, I can get several instances to run together using the descriptor shown below and several war files:

default="."/>/webapps-plus false true false default="."/>/etc/webdefault.xml

This is good enough for me, but the problem then is that they all point to the same data/index folder, sharing the same index, and I need them to use different indexes.

The question is, how can you configure solr.home differently for each of the Solr instances deployed in the webapps-plus folder? It would be equally valid if there is a way of fixing the XML in the wiki so individual war files can be specified, passing a different solr.home to each.

thanks,
galo.
MoreLikeThis and term vectors - documentation suggestion
Hi all, I was trying out the MoreLikeThis support, and getting some odd results. I realized that unless the fields being used for similarity calculation have a stored term vector, the MoreLikeThis code from Lucene will re-analyze the field using the StandardAnalyzer. Which, in my case, is quite different from what I'm using in the Solr schema. So the first note is just for anybody using MoreLikeThis, make sure you also specify termVectors=true in the Solr schema for any fields being passed to the query as mlt.fl parameters. The second note is that the Wiki page and the example schema might want to include some reference to the termVectors field attribute. For example, the sample schema says:
Re: Multiple instances, wiki out of date?
Galo: are you using plain vanilla Jetty, or are you using Jetty Plus? The examples for "Configuring Solr Home with JNDI" and "Multiple Solr Webapps" both require Jetty Plus (because the JNDI support only exists in the extra libraries Jetty Plus provides). That may explain the missing method call when trying to do addEnvEntry.

If you can't use Jetty Plus to have JNDI support, or if the latest version of Jetty doesn't support JNDI anymore (can't imagine that would be the case), then you might be able to find a way to set the solr.solr.home system property on a per webapp basis ... how to do that in a Jetty config may be a better question for the Jetty user community.

(If you do discover that the config syntax for JNDI has changed significantly in the latest versions of Jetty, by all means please update the wiki ... we'd probably want separate sections for the different versions since not everyone will be running the latest, but it's still good info to have.)

-Hoss
Re: Document boost not reflected in fieldNorm
i just tried this with the example schema:

1) changed the "cat" field to have omitNorms="false"
2) edited solr.xml so there was a second doc with all the same data except a different "id" and a doc boost of 2
3) restarted port, and reindexed solr.xml

...when i search on cat:search, i definitely see the docboost come into play...

2.5622776 = (MATCH) fieldWeight(cat:search in 19), product of:
  1.0 = tf(termFreq(cat:search)=1)
  2.049822 = idf(docFreq=6)
  1.25 = fieldNorm(field=cat, doc=19)

1.2811388 = (MATCH) fieldWeight(cat:search in 18), product of:
  1.0 = tf(termFreq(cat:search)=1)
  2.049822 = idf(docFreq=6)
  0.625 = fieldNorm(field=cat, doc=18)

...perhaps your docboosts (while different) are close enough together that they encode to the same byte encoded value?

-Hoss
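To illustrate how two different boosts can land on the same stored norm, here is a small sketch modeled on the one-byte float encoding (3 mantissa bits, zero exponent 15) that Lucene's default Similarity has used for norms; treat the details as an assumption, since they may differ in the Lucene version bundled with a given Solr release. The sketch also assumes a two-term field, so the length norm is 1/sqrt(2), and a norm computed as boost times length norm.

```python
import math
import struct

FZERO = (63 - 15) << 3  # offset between the float exponent and the byte's exponent


def encode_norm(value):
    """Squeeze a float into one byte, keeping only its exponent and top mantissa bits."""
    bits = struct.unpack(">i", struct.pack(">f", value))[0]  # raw IEEE-754 bits
    small = bits >> 21  # throw away the 21 low-order mantissa bits
    if small <= FZERO:
        return 0 if bits <= 0 else 1  # zero/negative -> 0, underflow -> smallest value
    if small >= FZERO + 0x100:
        return 255  # overflow -> largest representable value
    return small - FZERO


def decode_norm(byte_value):
    """Expand the stored byte back into the float used at query time."""
    if byte_value == 0:
        return 0.0
    bits = (byte_value << 21) + ((63 - 15) << 24)
    return struct.unpack(">f", struct.pack(">i", bits))[0]


if __name__ == "__main__":
    length_norm = 1.0 / math.sqrt(2)  # assumed: the field holds two terms
    for boost in (1.0, 1.05, 1.1, 2.0):  # illustrative boosts, not from the thread
        stored = encode_norm(boost * length_norm)
        print(boost, stored, decode_norm(stored))
```

Under these assumptions, boosts of 1.0 and 1.05 both decode to 0.625, while a boost of 2 decodes to 1.25, which matches the two fieldNorm values in the explain output above. Boosts need to differ by a fairly large factor before the stored byte changes.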
Re: MoreLikeThis and term vectors - documentation suggestion
On 2/26/07, Ken Krugler <[EMAIL PROTECTED]> wrote: ...I was trying out the MoreLikeThis support, and getting some odd results... Thanks for the info, I have added a link to your message at https://issues.apache.org/jira/browse/SOLR-69 -Bertrand
Re: Overriding Ranking in solr
Your question is broad, and has a lot of potential answers...

1) Lucene has very configurable Scoring that allows a lot of customization -- much of the scoring formula can be tweaked just by changing the "Similarity" class used, other more complex things can be achieved by writing your own Query classes

2) Solr allows for a *lot* of customization using "plugins" where just about any class you can imagine (including Similarity, custom RequestHandlers, and new Query classes) can be loaded from a JAR you provide at runtime... http://wiki.apache.org/solr/SolrPlugins

3) Solr has a special type of query called a FunctionQuery which makes writing special Query Scoring based on numeric Document Fields really easy ... some very complicated things can be done right out of the box using the Function Parsing supported by the SolrQueryParser... http://lucene.apache.org/solr/api/org/apache/solr/search/QueryParsing.html#parseFunction(java.lang.String,%20org.apache.solr.schema.IndexSchema) ...but more complicated things (like distance searching) would require you to write a simple ValueSource defining your equation, and using that ValueSource in a FunctionQuery you construct in a custom RequestHandler.

Using FunctionQuery has been discussed on several Lucene lists in the past, there have even been some fairly in depth discussions about using it for Geo based scoring... http://www.nabble.com/forum/Search.jtp?query=FunctionQuery+distance&local=y&forum=44

-Hoss
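As a rough illustration of the kind of equation a distance-based ValueSource might compute, here is a sketch of a great-circle (haversine) distance and a score that decays with it. This is only the arithmetic, written in Python for brevity; it is not Solr or Lucene API code, and the decay formula, the `scale_km` parameter, and the idea of storing latitude and longitude as numeric fields are assumptions for the example.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to 1.0 to guard against floating-point overshoot near antipodal points.
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(min(1.0, a)))


def distance_score(doc_lat, doc_lon, origin_lat, origin_lon, scale_km=10.0):
    """Hypothetical score: 1.0 at the origin, falling toward 0 as distance grows."""
    return 1.0 / (1.0 + haversine_km(doc_lat, doc_lon, origin_lat, origin_lon) / scale_km)


if __name__ == "__main__":
    # Coordinates are arbitrary illustrative values.
    print(distance_score(48.85, 2.35, 48.86, 2.34))
```

In a real plugin, a function like `distance_score` would be evaluated per document from its stored numeric fields, with the resulting value feeding into the FunctionQuery's score.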
Re: MoreLikeThis and term vectors - documentation suggestion
On 2/26/07, Bertrand Delacretaz <[EMAIL PROTECTED]> wrote: On 2/26/07, Ken Krugler <[EMAIL PROTECTED]> wrote: > ...I was trying out the MoreLikeThis support, and getting some odd results... Thanks for the info, I have added a link to your message at https://issues.apache.org/jira/browse/SOLR-69

Is it possible to modify MoreLikeThis to use the schema.xml-defined analyzer? That's the way the highlighting code currently works (it picks the index-time analyzer). It would be nice for as many features as possible to work without term vectors.

I sometimes wonder whether schema.xml exposes the right level of abstraction (it is currently very lucene-guts-y). Options like compressed are nice as we are free to change the implementation. canPerformMoreLikeThis=true gives us more flexibility in the future. Then again, perhaps all that is needed is a nice table... something like http://wiki.apache.org/solr/FieldOptionsByUseCase?

-Mike
Re: MoreLikeThis and term vectors - documentation suggestion
On 2/26/07, Bertrand Delacretaz <[EMAIL PROTECTED]> wrote: On 2/26/07, Ken Krugler <[EMAIL PROTECTED]> wrote: ...I was trying out the MoreLikeThis support, and getting some odd results... Thanks for the info, I have added a link to your message at https://issues.apache.org/jira/browse/SOLR-69

Is it possible to modify MoreLikeThis to use the schema.xml-defined analyzer? That's the way the highlighting code currently works (it picks the index-time analyzer).

I looked at that briefly (passing the analyzer to use down to MoreLikeThis), but for my fields it's a lot more than just what analyzer is used, given all of the filters that are also in play. Also the performance really stunk when I didn't use stored term vectors.

It would be nice for as many features as possible to work without term vectors. I sometimes wonder whether schema.xml exposes the right level of abstraction (it is currently very lucene-guts-y). Options like compressed are nice as we are free to change the implementation. canPerformMoreLikeThis=true gives us more flexibility in the future. Then again, perhaps all that is needed is a nice table... something like http://wiki.apache.org/solr/FieldOptionsByUseCase?

That would be nice, yes.

Thanks,

-- Ken

-- Ken Krugler Krugle, Inc. +1 530-210-6378 "Find Code, Find Answers"