Re[2]: Question about search suggestion
I searched and read about the auto-complete feature. Thanks. It looks nice; I think I should try it first.

NM> On Tue, 26 Aug 2008 15:15:21 +0300
NM> Aleksey Gogolev <[EMAIL PROTECTED]> wrote:
>>
>> Hello.
>>
>> I'm new to Solr and I need to make a search suggest (like Google suggestions).
>>
NM> Hi Aleksey,
NM> please search the archives of this list for subjects containing 'autocomplete'
NM> or 'auto-suggest'. That should give you a few ideas and starting points.
NM> best,
NM> B
NM> {Beto|Norberto|Numard} Meijome

--
Aleksey Gogolev
developer, dev.co.ua
mailto:[EMAIL PROTECTED]
Wrong sort by score
Hi,

I have encountered a weird problem in Solr. In one of my queries (dismax, default sorting) I noticed that the results are not sorted by score (according to debugQuery).

The first 150 results are tied (with score 12.806474), and after those there is a bunch of results with a higher score (12.962835).

What could be the cause? I'm overriding the tf function in my similarity class; could that be related?

Thanks,
Yuri
Re: Wrong sort by score
On Wed, Aug 27, 2008 at 9:10 AM, Yuri Jan <[EMAIL PROTECTED]> wrote:
> I have encountered a weird problem in solr.
> In one of my queries (dismax, default sorting) I noticed that the results
> are not sorted by score (according to debugQuery).
>
> The first 150 results are tied (with score 12.806474), and after those,
> there is a bunch of results with higher score (12.962835).
>
> What can be the cause?
> I'm overriding the tf function in my similarity class. Can it be related?

Do the explain scores in the debug section match the normal scores paired with the documents? (Add "score" to the fl parameter to get a score with each document.)

-Yonik
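Yonik's suggestion can be tried straight from the select URL: adding "score" to the fl parameter returns the score with each document, and debugQuery=true adds the explain output to compare against. A minimal sketch of building such a request URL in plain Java (host, core path, and query below are illustrative assumptions, and the space-to-plus substitution is a crude stand-in for full URL encoding):

```java
public class ScoreFlExample {
    // Build a select URL that asks for all stored fields plus the score,
    // along with explain info for comparison.
    static String buildSelectUrl(String base, String q) {
        return base + "/select?q=" + q.replace(" ", "+")
             + "&fl=*,score"        // return each document's score in the results
             + "&debugQuery=true";  // include the explain section in the response
    }

    public static void main(String[] args) {
        System.out.println(buildSelectUrl("http://localhost:8983/solr", "ipod video"));
    }
}
```

Comparing the score values in the result list against the explain section is exactly the check Yonik asks for.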
Re: Wrong sort by score
Actually, no... The scores in fl are 12.806475 and 10.386531 respectively, so the results according to those are sorted correctly. Is it just a problem with debugQuery?

On Wed, Aug 27, 2008 at 9:21 AM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> Do the explain scores in the debug section match the normal scores
> paired with the documents? (add score to the fl parameter to get a
> score with each document).
>
> -Yonik
Re: SpellCheckComponent bug?
Hmm, sounds like a bug. A test case would be great, but at a minimum file a JIRA. Do those other terms that collate properly have multiple suggestions?

On Aug 25, 2008, at 6:24 PM, Matthew Runo wrote:

Hello folks! I seem to be seeing a bug in the SpellCheckComponent. Search term: Quicksilver. I get two suggestions:

    2    Quicksilver
    220  Quiksilver

...and it's not correctly spelled (false), but the collation is of the first term (Quicksilver), not the one with the highest frequency. This seems to be the opposite of what the docs say collation should do. Other, more popular terms (shoez, runnning, etc.) all seem to collate properly.

I'm hitting Solr via SolrJ and not really doing anything too fancy - using SVN head at the moment. Just wondered if anyone had any ideas. There are no synonyms in this system, so I don't think that could be it. I've rebuilt the search index.

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
[EMAIL PROTECTED] - 702-943-7833
Re: SpellCheckComponent bug?
runnning does have multiple suggestions, Cunning and Running - but it properly picks Running. I have not noticed this for any other term, but I have not exhaustively tested others yet.

Thanks for your time!

Matthew Runo
Software Engineer, Zappos.com
[EMAIL PROTECTED] - 702-943-7833

On Aug 27, 2008, at 7:52 AM, Grant Ingersoll wrote:
> Hmm, sounds like a bug. A test case would be great, but at a minimum file
> a JIRA. Do those other terms that collate properly have multiple
> suggestions?
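For reference, the behavior the docs describe - and that Matthew expects - is picking the suggestion with the highest index frequency. A standalone sketch of that selection rule, using the frequencies from the report; this is illustrative only, not the SpellCheckComponent's actual code:

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class CollatePick {
    // Return the suggested term with the highest index frequency.
    static String mostFrequent(Map<String, Integer> suggestions) {
        String best = null;
        int bestFreq = -1;
        for (Map.Entry<String, Integer> e : suggestions.entrySet()) {
            if (e.getValue() > bestFreq) {
                best = e.getKey();
                bestFreq = e.getValue();
            }
        }
        return best;
    }

    public static void main(String[] args) {
        // Frequencies from Matthew's report
        Map<String, Integer> s = new LinkedHashMap<String, Integer>();
        s.put("Quicksilver", 2);
        s.put("Quiksilver", 220);
        System.out.println(mostFrequent(s)); // prints Quiksilver (freq 220)
    }
}
```

Per the report, the component instead returned the first term (Quicksilver, freq 2), which is what makes it look like a bug.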
Re: Wrong sort by score
On Wed, Aug 27, 2008 at 9:38 AM, Yuri Jan <[EMAIL PROTECTED]> wrote:
> Actually, no...
> The scores in fl are 12.806475 and 10.386531 respectively, so the results
> according to those are sorted correctly.
> Is it just a problem with the debugQuery?

Looks like it... I guess the custom similarity isn't being used when explain() is called.
Did you register this custom similarity in the schema?
If so, can you file a JIRA bug for this?

-Yonik
Re: How does Solr search when a field is not specified?
Thanks Otis! :)

On Tue, Aug 26, 2008 at 10:47 PM, Otis Gospodnetic <[EMAIL PROTECTED]> wrote:
> Jake,
>
> Yes, that field would have to be some kind of analyzed field (e.g. text),
> not string, if you wanted that query to match the "Jake is Testing" input.
> There are no built-in Lucene or Solr-specific limits on field lengths.
> There is one parameter, maxFieldLength in Solr's solrconfig.xml, I think,
> which tells Lucene how many tokens to consider for indexing. If you don't
> want that limit, increase that parameter's value to the max.
>
> Otis
> --
> Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch
>
> ----- Original Message -----
>> From: Jake Conk <[EMAIL PROTECTED]>
>> To: solr-user@lucene.apache.org
>> Sent: Tuesday, August 26, 2008 4:38:09 PM
>> Subject: How does Solr search when a field is not specified?
>>
>> Hello,
>>
>> I was wondering how Solr searches when a field is not specified, just a
>> query? Say for example I have the following:
>>
>> ?q="Jake" AND "Test"
>>
>> I have a mixture of integer, string, and text columns. Some indexed,
>> some stored, and some string fields copied to text fields.
>>
>> Say I have a string field with the value "Jake is Testing" which is
>> also copied to a text field. If I did not copyField that string field
>> to a text field, would the above query return no results if the words
>> "Jake" and "Test" are not found anywhere else, since we cannot do
>> full-text searches on string fields?
>>
>> Lastly, is there a limit on how many characters can be in a string or
>> text field?
>>
>> Thanks,
>> - Jake
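Otis's string-vs-text distinction can be sketched in plain Java: a string field indexes the whole value as one verbatim token, while a text field is analyzed into tokens. The analysis below (lowercasing plus whitespace splitting) is a crude stand-in for Solr's real analyzer chains; matching "Test" against "Testing" would additionally need stemming, which a real text analyzer can provide:

```java
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

public class FieldTypeSketch {
    // Rough stand-in for a "text" field analyzer: lowercase + split on whitespace.
    static List<String> analyzeText(String value) {
        return Arrays.asList(value.toLowerCase(Locale.ROOT).split("\\s+"));
    }

    // A "string" field: the entire value is indexed as a single verbatim term.
    static List<String> analyzeString(String value) {
        return Arrays.asList(value);
    }

    public static void main(String[] args) {
        String doc = "Jake is Testing";
        // Query term "jake" does not match the single untokenized string term...
        System.out.println(analyzeString(doc).contains("jake")); // false
        // ...but it does match one of the analyzed text tokens.
        System.out.println(analyzeText(doc).contains("jake"));   // true
    }
}
```

This is why dropping the copyField to a text field makes the query stop matching: the string field holds only the exact value "Jake is Testing".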
Distributed Search Test
Hello,
   I have been performing some simple distributed search tests and don't understand why distributed search seems to work in some circumstances but not others.

In my setup I have compiled the example server using the Solr trunk downloaded on Aug 22nd. I am running a sample server instance on 2 separate hosts (localhost and "fred"). I've added a portion of the sample docs, [a-n]*.xml, to the localhost Solr server, and the other portion, [m-z]*.xml, to host fred.

Assuming that I have set things up correctly, I would expect to receive a non-zero-length SolrDocumentList for any distributed search that matches text in the example docs.

Specifically, when I test the contents of each server separately (using the included TestCase) the tests pass. This confirms that each server has different documents. However, when I do the distributed tests, it seems the tests pass or fail based on the initial URL passed to createNewSolrServer(String url). I realize a real JUnit test should be self-contained, unlike this one.

The JUnit test testDistrbutedSearch() passes, while testDistrbutedSearch2() fails. Why?

My understanding is that each host should send a query to all shards, collate the responses, and return them to the client. Is this true?
Ron

Here is my TestCase:

    package org.apache.solr.client.solrj.ron;

    import junit.framework.TestCase;

    import org.apache.solr.client.solrj.SolrQuery;
    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer;
    import org.apache.solr.client.solrj.response.QueryResponse;
    import org.apache.solr.client.solrj.response.SolrPingResponse;
    import org.apache.solr.common.SolrDocumentList;
    import org.apache.solr.common.params.ShardParams;

    public class SolrExampleDistributedTest extends TestCase {

        int port = 8983;
        static final String context = "/solr";

        static String SOLR_SHARD1 = "localhost:8983/solr";
        static String SOLR_SHARD2 = "fred:8983/solr";
        static String SOLR_SHARDS = SOLR_SHARD1 + "," + SOLR_SHARD2;
        static String HTTP_PREFIX = "http://";
        static String SOLR_URL1 = HTTP_PREFIX + SOLR_SHARD1;
        static String SOLR_URL2 = HTTP_PREFIX + SOLR_SHARD2;
        static String QUERY1 = "Samsung";
        static String QUERY2 = "solr";

        @Override
        public void setUp() throws Exception {
            super.setUp();
        }

        public SolrExampleDistributedTest(String name) {
            super(name);
        }

        @Override
        public void tearDown() throws Exception {
            super.tearDown();
        }

        protected SolrServer createNewSolrServer(String url) {
            try {
                CommonsHttpSolrServer s = new CommonsHttpSolrServer(url);
                s.setConnectionTimeout(100); // 1/10th sec
                s.setDefaultMaxConnectionsPerHost(100);
                s.setMaxTotalConnections(100);
                return s;
            } catch (Exception ex) {
                throw new RuntimeException(ex);
            }
        }

        public void testLocalhost() {
            try {
                SolrServer server = createNewSolrServer(SOLR_URL1);

                SolrQuery query = new SolrQuery();
                query.setQuery(QUERY1);
                QueryResponse qr = server.query(query);
                SolrDocumentList sdl = qr.getResults();
                assertTrue(sdl.getNumFound() > 0);

                query = new SolrQuery();
                query.setQuery(QUERY2);
                qr = server.query(query);
                sdl = qr.getResults();
                assertTrue(sdl.getNumFound() == 0);
            } catch (Exception ex) {
                ex.printStackTrace();
                fail();
            }
        }

        public void testRemoteHost() {
            try {
                SolrServer server = createNewSolrServer(SOLR_URL2);

                SolrQuery query = new SolrQuery();
                query.setQuery(QUERY1);
                QueryResponse qr = server.query(query);
                SolrDocumentList sdl = qr.getResults();
                assertTrue(sdl.getNumFound() == 0);

                query = new SolrQuery();
                query.setQuery(QUERY2);
                qr = server.query(query);
                sdl = qr.getResults();
                assertTrue(sdl.getNumFound() > 0);
            } catch (Exception ex) {
                ex.printStackTrace();
                fail();
            }
        }

        public void testDistrbutedSearch() {
            try {
                SolrServer server = createNewSolrServer(SOLR_URL1);

                SolrQuery query = new SolrQuery();
                query.setQuery(QUERY1);
                query.setParam(ShardParams.SHARDS, SOLR_SHARDS);
                QueryResponse qr = server.query(query);
                SolrDocumentList sdl = qr.getResults();
                assertTrue(sdl.getNumFound() > 0);

                SolrQuery query2 = new SolrQuery();
                query2.setQuery(QUERY2);
                query2.setParam(ShardParams.SHARDS, SOLR_SHARDS);
                QueryResponse qr2 = server.query(query2);
                SolrDocumentList sdl2 = qr2.getResults();
                assertTrue(sdl2.getNumFound() > 0);
            } catch (Exception ex) {
                ex.printStackTrace();
                fail();
            }
        }
    }
Re: Sorting and also looking at stored fields
Aha! Yep, that's the problem (not set to store in schema.xml)! Thanks!
Re: Distributed Search Test
It fails because you are using "localhost" as part of a shard name. When you send the request to "fred" it uses the "fred" shard and the "localhost" shard (which is the same as fred!)

-Yonik

On Wed, Aug 27, 2008 at 12:07 PM, Ronald Aubin <[EMAIL PROTECTED]> wrote:
> junit test testDistrbutedSearch() passes, while testDistrbutedSearch2()
> fails. Why?
>
> My understanding is that each host should send a query to all shards and
> collate the responses, and return them to the client. Is this true?
Re: Distributed Search Test
Yonik,
   Thanks for your reply. I'm not sure I understand completely. Do you mean that each Solr server should be given a different shard list, and not a list containing all shards? So in my case:

1) host fred should be given a shard list containing only localhost,
2) localhost should be given a shard list of fred.

I'll give it a try.

Thanks again,
Ron

On Wed, Aug 27, 2008 at 12:21 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> It fails because you are using "localhost" as part of a shard name.
> When you send the request to "fred" it uses the "fred" shard and the
> "localhost" shard (which is the same as fred!)
>
> -Yonik
java.io.FileNotFoundException: no segments* file found
Hi all,

I've had a multicore system running for a while now, and I just cycled the Jetty server and all of a sudden I got this error:

    SEVERE: java.lang.RuntimeException: java.io.FileNotFoundException: no segments* file found in
    org.apache.lucene.store.FSDirectory@/opt/cisearch/ci-content-search/solr/cores/0601_0/data/index: files:
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:899)
        at org.apache.solr.core.SolrCore.<init>(SolrCore.java:450)
        at org.apache.solr.core.MultiCore.create(MultiCore.java:255)
        at org.apache.solr.core.MultiCore.load(MultiCore.java:139)
        at org.apache.solr.servlet.SolrDispatchFilter.initMultiCore(SolrDispatchFilter.java:147)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:72)
        at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:99)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:594)
        at org.mortbay.jetty.servlet.Context.startContext(Context.java:139)
        at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1218)
        at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:500)
        at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:448)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
        at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:161)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:147)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:117)
        at org.mortbay.jetty.Server.doStart(Server.java:210)
        at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:40)
        at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:929)
        at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
        at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:39)
        at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
        at java.lang.reflect.Method.invoke(Method.java:585)
        at org.mortbay.start.Main.invokeMain(Main.java:183)
        at org.mortbay.start.Main.start(Main.java:497)
        at org.mortbay.start.Main.main(Main.java:115)
    Caused by: java.io.FileNotFoundException: no segments* file found in
    org.apache.lucene.store.FSDirectory@/opt/cisearch/ci-content-search/solr/cores/0601_0/data/index: files:
        at org.apache.lucene.index.SegmentInfos$FindSegmentsFile.run(SegmentInfos.java:600)
        at org.apache.lucene.index.DirectoryIndexReader.open(DirectoryIndexReader.java:81)
        at org.apache.lucene.index.IndexReader.open(IndexReader.java:209)
        at org.apache.lucene.index.IndexReader.open(IndexReader.java:173)
        at org.apache.solr.search.SolrIndexSearcher.<init>(SolrIndexSearcher.java:94)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:890)
        ... 29 more

Of course, the odd thing is that the segments* file does exist:

    % ls -1 /opt/cisearch/ci-content-search/solr/cores/0601_0/data/index/segments*
    /opt/cisearch/ci-content-search/solr/cores/0601_0/data/index/segments_32i
    /opt/cisearch/ci-content-search/solr/cores/0601_0/data/index/segments.gen

Any ideas on what could cause this? The only thing I can think of off the top of my head is that the core was coming up in the moment between the snapinstaller steps of:

1) /bin/rm -rf ${data_dir}/${index} &&
2) mv -f ${data_dir}/${index}.tmp$$ ${data_dir}/${index}

Any other thoughts / conjectures?

enjoy,
-jeremy
--
Jeremy Hinegardner
[EMAIL PROTECTED]
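If the race between the snapinstaller's rm -rf and mv is indeed the cause, one way to shrink (though not fully close) the window is to swap directories with two fast renames and delete the old copy only afterwards, so there is never a long stretch with no index directory at all. A sketch with illustrative /tmp paths, not the stock snapinstaller script:

```shell
# Demo setup: an "old" index and a freshly pulled "new" index
data_dir=/tmp/solr_data_demo
rm -rf "$data_dir"
mkdir -p "$data_dir/index" "$data_dir/index.tmp"
touch "$data_dir/index.tmp/segments_1"

# Swap via two quick renames instead of rm -rf followed by mv
mv "$data_dir/index" "$data_dir/index.old"   # move old index aside (fast rename)
mv "$data_dir/index.tmp" "$data_dir/index"   # move new index into place (fast rename)
rm -rf "$data_dir/index.old"                 # slow delete happens after the swap

ls "$data_dir/index"
```

There is still a brief instant between the two renames, so this narrows the window Jeremy describes rather than eliminating it.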
Re: Wrong sort by score
It seems like the debug information is using the custom similarity as it should - the bug isn't there. I see the right tf value in the explain information (I modified it to be 1 in my custom similarity). The numbers in the explain seem to add up and make sense. Is it possible that the score itself is wrong (the one that I get from fl)?

Thanks,
Yuri

On Wed, Aug 27, 2008 at 11:44 AM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> Looks like it... I guess the custom similarity isn't being used when
> explain() is called.
> Did you register this custom similarity in the schema?
> If so, can you file a JIRA bug for this?
>
> -Yonik
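As an aside, the large run of tied scores from the original report is at least consistent with a constant-tf similarity: if tf() always returns 1, documents matching the same query terms score identically no matter how often those terms occur. A toy illustration of that effect (deliberately simplified, not Lucene's actual scoring formula):

```java
public class ConstantTfSketch {
    // Simplified tf-idf product where tf is overridden to a constant 1,
    // so termFreq is deliberately ignored.
    static float score(int termFreq, float idf) {
        float tf = 1.0f;   // overridden tf: every match counts the same
        return tf * idf;
    }

    public static void main(String[] args) {
        // A doc with 1 occurrence and a doc with 50 occurrences tie exactly.
        System.out.println(score(1, 2.5f) == score(50, 2.5f)); // true
    }
}
```

That explains the 150-way tie, but not why debugQuery and fl disagree, which is the open question here.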
Re: Distributed Search Test
Yonik,
   I now understand perfectly. Thanks for your help. All my tests now work.

Ron

On Wed, Aug 27, 2008 at 12:21 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote:
> It fails because you are using "localhost" as part of a shard name.
> When you send the request to "fred" it uses the "fred" shard and the
> "localhost" shard (which is the same as fred!)
>
> -Yonik
Re: Distributed Search Test
On Wed, Aug 27, 2008 at 12:33 PM, Ronald Aubin <[EMAIL PROTECTED]> wrote: > Thanks for your reply. I'm not sure if I understand completely. Do you > mean that each solr server should be given a different shard list and not a > list containing all shards? You could use the same shard list (as long as it doesn't contain localhost), or you could use different ones (as long as localhost was correctly substituted for the host you are talking to). I'd recommend avoiding "localhost" in the shard list unless all of your shards happen to be on the local host. Example: If you have hosta, hostb, then querying hosta with shards=hosta,hostb or shards=localhost,hostb will work (they are equivalent) querying hostb with shards=hosta,hostb or shards=hosta,localhost will work (they are equivalent) BUT querying hostb with shards=localhost,hostb is equivalent to shards=hostb,hostb -Yonik > So in my case: > 1) host fred should be given a shard list containing only locahost, > 2) localhost should be given a shard list of fred > > I'll give it a try. > > Thanks again > > Ron > > On Wed, Aug 27, 2008 at 12:21 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote: > >> It fails because you are using "localhost" as part of a shard name. >> When you send the request to "fred" it uses the "fred" shard and the >> "localhost" shard (which is the same as fred!) >> >> -Yonik >> >> On Wed, Aug 27, 2008 at 12:07 PM, Ronald Aubin <[EMAIL PROTECTED]> >> wrote: >> > Hello, >> >I have been performing some simple distributed search tests and don't >> > understand why distributed search seems to work in some circumstances but >> > not others. >> > >> > In my setup I have compiled the example server using the solr trunk >> > downloaded on Aug 22nd. I am running a sample server instance on 2 >> separate >> > hosts (localhost and "fred"). I've added a portion of the sample docs >> > [a-n]*.xml to the local host solr server, and added the other portion, >> > [m-z]*.xml sample docs to host fred. 
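Yonik's rule can be made concrete with a small helper that substitutes the receiving host for "localhost" in a shards list and reports any shard that would end up queried twice. Purely illustrative plain Java, not Solr's actual shard handling:

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;

public class ShardResolve {
    // Resolve "localhost" relative to the host that received the request
    // and return any shards that appear more than once after resolution.
    static List<String> duplicates(String shards, String receivingHost) {
        LinkedHashSet<String> seen = new LinkedHashSet<String>();
        List<String> dups = new ArrayList<String>();
        for (String shard : shards.split(",")) {
            String resolved = shard.replace("localhost", receivingHost);
            if (!seen.add(resolved)) {
                dups.add(resolved); // same shard would be queried twice
            }
        }
        return dups;
    }

    public static void main(String[] args) {
        String shards = "localhost:8983/solr,fred:8983/solr";
        // Querying fred: "localhost" resolves to fred, duplicating that shard.
        System.out.println(duplicates(shards, "fred"));      // [fred:8983/solr]
        // Querying localhost itself: no duplication.
        System.out.println(duplicates(shards, "localhost")); // []
    }
}
```

This mirrors the failing case in the thread: the shards parameter looked fine, but from fred's point of view it named the same shard twice.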
>> > >> > Assuming that I have set up things correctly, I would expect to receive a >> > non-zero length SolrDocumentList for any distributed search that matches >> > syntax in the example docs. >> > >> > Specifically when I test the contents of each server separately ( using >> the >> > included TestCase ) the tests pass. This confirms that each server has >> > different documents. However when I do the distributed tests, it seems >> the >> > tests pass or fail based on the initial URL passed in the >> > createNewSolrServer(String URL). I realize a real junit should be self >> > contained, unlike this one. >> > >> > junit test testDistrbutedSearch() passes, while testDistrbutedSearch2() >> > fails. Why? >> > >> > My understanding is that each host should send a query to all shards and >> > collate the responses, and return them to the client. Is this true? >> > >> > Ron >> > >> > >> > Here is my TestCase; >> > >> > package org.apache.solr.client.solrj.ron; >> > >> > import junit.framework.TestCase; >> > >> > import org.apache.solr.client.solrj.SolrQuery; >> > import org.apache.solr.client.solrj.SolrServer; >> > import org.apache.solr.client.solrj.impl.CommonsHttpSolrServer; >> > import org.apache.solr.client.solrj.response.QueryResponse; >> > import org.apache.solr.client.solrj.response.SolrPingResponse; >> > import org.apache.solr.common.SolrDocumentList; >> > import org.apache.solr.common.params.ShardParams; >> > >> > public class SolrExampleDistributedTest extends TestCase { >> > >> >int port = 8983; >> >static final String context = "/solr"; >> > >> >static String SOLR_SHARD1 = "localhost:8983/solr"; >> >static String SOLR_SHARD2 = "fred:8983/solr"; >> >static String SOLR_SHARDS = SOLR_SHARD1 + "," + SOLR_SHARD2; >> >static String HTTP_PREFIX = "http://"; >> >static String SOLR_URL1 = HTTP_PREFIX + SOLR_SHARD1; >> >static String SOLR_URL2 = HTTP_PREFIX + SOLR_SHARD2; >> >static String QUERY1 = "Samsung"; >> >static String QUERY2 = "solr"; >> 
>@Override >> >public void setUp() throws Exception { >> >super.setUp(); >> > >> >} >> > >> >public SolrExampleDistributedTest(String name) { >> >super(name); >> >} >> > >> >@Override >> >public void tearDown() throws Exception { >> >super.tearDown(); >> >} >> > >> >protected SolrServer createNewSolrServer(String url) { >> >try { >> > >> >CommonsHttpSolrServer s = new CommonsHttpSolrServer(url); >> >s.setConnectionTimeout(100); // 1/10th sec >> >s.setDefaultMaxConnectionsPerHost(100); >> >s.setMaxTotalConnections(100); >> >return s; >> >} catch (Exception ex) { >> >throw new RuntimeException(ex); >> >} >> >} >> > >> >public void testLocalhost() { >> >try { >> >SolrServer server = createNewSolrServer(SOLR_URL1); >> > >> >SolrQuery query = new SolrQuery(); >> >
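Yonik's hosta/hostb rule above can be spelled out as concrete request URLs (the ports and paths here are the example-server defaults; hosta and hostb are placeholders):

```
# querying hosta: both of these work and are equivalent,
# because localhost resolves to hosta here
http://hosta:8983/solr/select?q=video&shards=hosta:8983/solr,hostb:8983/solr
http://hosta:8983/solr/select?q=video&shards=localhost:8983/solr,hostb:8983/solr

# querying hostb with the same localhost-containing list is broken:
# localhost now resolves to hostb, so this is effectively
# shards=hostb,hostb and hosta is never searched
http://hostb:8983/solr/select?q=video&shards=localhost:8983/solr,hostb:8983/solr
```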
Replacing FAST functionality at sesam.no
At sesam.no we want to replace a FAST (fast.no) Query Matching Server with a Solr index. The index we are trying to replace is not a regular index, but specially configured to perform phrase (and sub-phrase) matches against several large lists (like an index with only a 'title' field). I'm not sure of a correct, or logical, name for the behavior we are after, but it is like a combination between Shingles and exact matching. Some examples should explain it well.

Let's say we have the following list:
> one two three
> one two
> two three
> one
> two
> three
> three two
> two one
> one three
> three one

For the query "one two three", we need hits against, and only against:
> one two three
> one two
> two three
> one
> two
> three

For the query "one two", we need hits against, and only against:
> one two
> one
> two

For the query "one three four" (or "four one three"), we need hits against, and only against:
> one three
> one
> three

For the query "one two sesam three", we need hits against, and only against:
> one two
> one
> two
> three

We have been testing out solr with the ShingleFilter for this, but without luck. I am unsure whether the reason is misconfiguration in schema.xml or that the ShingleFilter actually doesn't support this type of behavior. Attached is our current schema.xml (it is different from when I made this post to the solr-dev mailinglist, the shingle "fieldType" is of class "solr.StrField"). Attached are screenshots of solr/admin/analysis.jsp against this configuration. I'd like to know if the ShingleFilter is at all able to do what we want. If it is: how can I configure schema.xml? If not: do there exist any other solutions that we can incorporate into solr which will give us this behavior? If there is no existing solution to this, we will probably end up writing our own methods for it, extending the ShingleFilter, gladly contributing to the solr project =) Thanks for a great product, Glenn-Erik
odd 500 error
Hello - I stumbled across an odd error which my intuition is telling me is a bug. Here is my installation:

Solr Specification Version: 1.2.2008.08.13.13.05.16
Lucene Implementation Version: 2.4-dev 685576 - 2008-08-13 10:55:25

I did the following query today: author:(r*a* AND fisher)

And got the following 500 error:

maxClauseCount is set to 1024

org.apache.lucene.search.BooleanQuery$TooManyClauses: maxClauseCount is set to 1024
    at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:165)
    at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:156)
    at org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:63)
    at org.apache.lucene.search.WildcardQuery.rewrite(WildcardQuery.java:54)
    at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:385)
    at org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:163)
    at org.apache.lucene.search.Query.weight(Query.java:94)
    at org.apache.lucene.search.Searcher.createWeight(Searcher.java:175)
    at org.apache.lucene.search.Searcher.search(Searcher.java:126)
    at org.apache.lucene.search.Searcher.search(Searcher.java:105)
    at org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:966)
    at org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:838)
    at org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:269)
    at org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:160)
    at org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:167)
    at org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
    at org.apache.solr.core.SolrCore.execute(SolrCore.java:1156)
    at org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341)
    at org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:272)
    at org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1088)
    at org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:360)
    at org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216)
    at org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181)
    at org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:729)
    at org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405)
    at org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:206)
    at org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114)
    at org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152)
    at org.mortbay.jetty.Server.handle(Server.java:324)
    at org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:505)
    at org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:829)
    at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513)
    at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:211)
    at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:380)
    at org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:395)
    at org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:488)

Thanks
Andrew
Re: odd 500 error
On Wed, Aug 27, 2008 at 2:21 PM, Andrew Nagy <[EMAIL PROTECTED]> wrote: > Hello - I stumbled across an odd error which my intuition is telling me is a > bug. Unfortunately, wildcard queries can expand to an undefined number of terms. This was the reason ConstantScorePrefixQuery and ConstantScoreRangeQuery were introduced, but I never got around to ConstantScoreWildcardQuery. So this is a known limitation. -Yonik > Here is my installation: > Solr Specification Version: 1.2.2008.08.13.13.05.16 > Lucene Implementation Version: 2.4-dev 685576 - 2008-08-13 10:55:25 > > I did the following query today: > author:(r*a* AND fisher) > > And get the following 500 error: > > maxClauseCount is set to 1024 > > org.apache.lucene.search.BooleanQuery$TooManyClauses: maxClauseCount is set > to 1024 >at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:165) >at org.apache.lucene.search.BooleanQuery.add(BooleanQuery.java:156) >at > org.apache.lucene.search.MultiTermQuery.rewrite(MultiTermQuery.java:63) >at > org.apache.lucene.search.WildcardQuery.rewrite(WildcardQuery.java:54) >at org.apache.lucene.search.BooleanQuery.rewrite(BooleanQuery.java:385) >at > org.apache.lucene.search.IndexSearcher.rewrite(IndexSearcher.java:163) >at org.apache.lucene.search.Query.weight(Query.java:94) >at org.apache.lucene.search.Searcher.createWeight(Searcher.java:175) >at org.apache.lucene.search.Searcher.search(Searcher.java:126) >at org.apache.lucene.search.Searcher.search(Searcher.java:105) >at > org.apache.solr.search.SolrIndexSearcher.getDocListNC(SolrIndexSearcher.java:966) >at > org.apache.solr.search.SolrIndexSearcher.getDocListC(SolrIndexSearcher.java:838) >at > org.apache.solr.search.SolrIndexSearcher.search(SolrIndexSearcher.java:269) >at > org.apache.solr.handler.component.QueryComponent.process(QueryComponent.java:160) >at > org.apache.solr.handler.component.SearchHandler.handleRequestBody(SearchHandler.java:167) >at > 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135) >at org.apache.solr.core.SolrCore.execute(SolrCore.java:1156) >at > org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:341) >at > org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:272) >at > org.mortbay.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1088) >at > org.mortbay.jetty.servlet.ServletHandler.handle(ServletHandler.java:360) >at > org.mortbay.jetty.security.SecurityHandler.handle(SecurityHandler.java:216) >at > org.mortbay.jetty.servlet.SessionHandler.handle(SessionHandler.java:181) >at > org.mortbay.jetty.handler.ContextHandler.handle(ContextHandler.java:729) >at > org.mortbay.jetty.webapp.WebAppContext.handle(WebAppContext.java:405) >at > org.mortbay.jetty.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:206) >at > org.mortbay.jetty.handler.HandlerCollection.handle(HandlerCollection.java:114) >at > org.mortbay.jetty.handler.HandlerWrapper.handle(HandlerWrapper.java:152) >at org.mortbay.jetty.Server.handle(Server.java:324) >at > org.mortbay.jetty.HttpConnection.handleRequest(HttpConnection.java:505) >at > org.mortbay.jetty.HttpConnection$RequestHandler.headerComplete(HttpConnection.java:829) >at org.mortbay.jetty.HttpParser.parseNext(HttpParser.java:513) >at org.mortbay.jetty.HttpParser.parseAvailable(HttpParser.java:211) >at org.mortbay.jetty.HttpConnection.handle(HttpConnection.java:380) >at > org.mortbay.io.nio.SelectChannelEndPoint.run(SelectChannelEndPoint.java:395) >at > org.mortbay.thread.QueuedThreadPool$PoolThread.run(QueuedThreadPool.java:488) > > > Thanks > Andrew >
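(For anyone who hits this limit and simply needs more headroom, the ceiling is configurable in solrconfig.xml. A sketch, with the caveat that raising it just lets larger expanded queries through, at a memory/CPU cost:)

```xml
<!-- solrconfig.xml, inside the <query> section.
     This is the BooleanQuery clause ceiling that the expanded
     wildcard query ran into; 1024 is the default. -->
<maxBooleanClauses>4096</maxBooleanClauses>
```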
Re: Replacing FAST functionality at sesam.no
The screenshot didn't make it (some attachments gets stripped) Otis -- Sematext -- http://sematext.com/ -- Lucene - Solr - Nutch - Original Message > From: Glenn-Erik Sandbakken <[EMAIL PROTECTED]> > To: solr-user@lucene.apache.org > Sent: Wednesday, August 27, 2008 1:44:53 PM > Subject: Replacing FAST functionality at sesam.no > > At sesam.no we want to replace a FAST (fast.no) Query Matching Server > with a Solr index. > > The index we are trying to replace is not a regular index, but specially > configured to perform phrases (and sub-phrases) matches against several > large lists (like an index with only a 'title' field). > > I'm not sure of a correct, or logical, name for the behavior we are > after, but it is like a combination between Shingles and exact matching. > > Some examples should explain it well. > > Lets say we have the following list: > > one two three > > one two > > two three > > one > > two > > three > > three two > > two one > > one three > > three one > > For the query "one two three", we need hits against, and only against: > > one two three > > one two > > two three > > one > > two > > three > > For the query "one two", we need hits against, and only against: > > one two > > one > > two > > For the query "one three four" (or "four one three"), we need hits > against, and only against: > > one three > > one > > three > > For the query "one two sesam three", we need hits against, and only > against: > > one two > > one > > two > > three > > We have been testing out solr with the ShingleFilter for this, but > without luck. > I am unsure whether the reason is misconfiguration in schema.xml or that > the ShingleFilter actually don't support this type of behavior. > Attached our current schema.xml > (it is different from when I made this post to the solr-dev mailinglist, > the shingle "fieldType" is of class "solr.StrField") > Attached is screenshots of the solr/admin/analysis.jsp against this > configuration. 
> > I'd like to know if the ShingleFilter is at all able to do what we > want. > If it is: How can I configure schema.xml? > If not: does there exist any other solutions that we can incorporate > into solr which will give us this behavior? > > If there is no existing solution to this, we will probably end up > writing our own methods for it, extending the ShingleFilter, gladly > contributing to the solr project =) > > Thanks for a great product, > Glenn-Erik
Beginners question: adding a plugin
Hello, I'm pretty new to Solr, and not a Java expert, and trying to create my own plugin according to the instructions given in http://wiki.apache.org/solr/SolrPlugins. I want to integrate an external stemmer for the Dutch language by creating a new FilterFactory that will invoke the external stemmer for a TokenStream. First thing I want to do is just make sure I can get the plugin running. Here's what I did:

- Take a copy of DutchStemFilterFactory.java, rename it to TestStemFilterFactory, rename the class to TestStemFilterFactory
- Successfully compile the java using javac, and add the .class file to a jar file
- Put the jar file in SOLR_HOME/lib
- Put a line in my analyzer definition in schema.xml
- Restart tomcat

In the Tomcat log, there is an indication that the file is found:

27-Aug-2008 20:58:25 org.apache.solr.core.SolrResourceLoader createClassLoader
INFO: Adding 'file:/D:/Programs/Solr/lib/Test.jar' to Solr classloader

But then I get errors reported by Tomcat further down the log file:

SEVERE: org.apache.solr.common.SolrException: Error loading class 'solr.TestStemFilterFactory'
    at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:256)
    at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:261)
    at org.apache.solr.util.plugin.AbstractPluginLoader.create(AbstractPluginLoader.java:83)
    at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:140)
    <>
Caused by: java.lang.ClassNotFoundException: solr.TestStemFilterFactory
    at java.net.URLClassLoader$1.run(URLClassLoader.java:200)
    at java.security.AccessController.doPrivileged(Native Method)
    at java.net.URLClassLoader.findClass(URLClassLoader.java:188)
    <.>

Probably some configuration issue somewhere, but I am in the dark here (as said: not a Java expert...). I've tried to find information in mailing list archives on this, but no luck so far. I'm running the Solr nightly build of 20.08.2008, tomcat 5.5.26 on Windows.
Any help would be much appreciated! Cheers, Jaco.
Re: Replacing FAST functionality at sesam.no
On 27. aug. 2008, at 19.44, Glenn-Erik Sandbakken wrote: At sesam.no we want to replace a FAST (fast.no) Query Matching Server with a Solr index. The index we are trying to replace is not a regular index, but specially configured to perform phrases (and sub-phrases) matches against several large lists (like an index with only a 'title' field). I'm not sure of a correct, or logical, name for the behavior we are after, but it is like a combination between Shingles and exact matching. Some examples should explain it well. In order to do this, you can't use the ShingleFilter during indexing, since a document like "one two three" and a query like "one two four" will match because they have the shingle "one two" in common. You will get what you want, I think, if you don't tokenize during indexing (some normalization will be required if your lists aren't normalized to begin with) and apply the ShingleFilter only to the queries. Svein
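A sketch of Svein's suggestion in schema.xml terms, assuming the ShingleFilter factory available on trunk and list entries of at most three words (the type name and sizes are illustrative):

```xml
<fieldType name="exactPhrase" class="solr.TextField">
  <!-- index side: each list entry becomes a single lowercased token,
       e.g. "one two" is indexed literally as the token "one two" -->
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <!-- query side: "one two three" expands into the shingles
       "one", "two", "three", "one two", "two three", "one two three" -->
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.ShingleFilterFactory" maxShingleSize="3" outputUnigrams="true"/>
  </analyzer>
</fieldType>
```

The query-side shingles are space-joined by default, so they line up with the untokenized index terms; list entries longer than maxShingleSize words would need a larger setting.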
Question about autocomplete feature
Hello. I'm trying to implement the autocomplete feature using the snippet posted by Dan. (http://mail-archives.apache.org/mod_mbox/lucene-solr-user/200807.mbox/[EMAIL PROTECTED]) Here is the snippet: ... First I decided to make it work for the solr example. So I pasted the snippet into schema.xml. Then I edited exampledocs/hd.xml, I added the "ac" field to each doc. The value of the "ac" field is a copy of the name field: SP2514N Samsung SpinPoint P12 SP2514N - hard drive - 250 GB - ATA-133 Samsung SpinPoint P12 SP2514N - hard drive - 250 GB - ATA-133 Samsung Electronics Co. Ltd. electronics hard drive 7200RPM, 8MB cache, IDE Ultra ATA-133 NoiseGuard, SilentSeek technology, Fluid Dynamic Bearing (FDB) motor 92 6 true 6H500F0 Maxtor DiamondMax 11 - hard drive - 500 GB - SATA-300 Maxtor DiamondMax 11 - hard drive - 500 GB - SATA-300 Maxtor Corp. electronics hard drive SATA 3.0Gb/s, NCQ 8.5ms seek 16MB cache 350 6 true Then I cleaned the solr index, posted hd.xml and restarted the solr server. But when I'm trying to search for "samsu" (part of the word "samsung") I still get no result. Seems like solr treats the "ac" field like a regular field. What did I do wrong? Thanks in advance. -- Aleksey Gogolev developer, dev.co.ua Aleksey
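(Dan's snippet itself was stripped by the list archiver. For readers, an autocomplete field type in this spirit is usually built around edge n-grams, roughly like the sketch below; the names and sizes are illustrative, not necessarily the stripped snippet:)

```xml
<fieldType name="autocomplete" class="solr.TextField">
  <analyzer type="index">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- "samsung ..." is indexed as "s", "sa", "sam", ... so a query
         like ac:samsu matches an indexed gram directly -->
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="25"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.KeywordTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
```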
copyField: String vs Text Field
Hello, I was wondering if there was an added advantage in using copyField to copy a string field to a text field? If the field is copied to a text field then why not just make the field a text field and eliminate copying its data? If you are going to use full text searching on that field, which you can't do with string fields, wouldn't it just make sense to keep it a text field since it has the same abilities as a string field and more? ... Or is the reason because string fields have better performance on matching exact strings than text fields? Thanks, - Jake
Re: Beginners question: adding a plugin
Instead of solr.TestStemFilterFactory, put the fully qualified classname for the TestStemFilterFactory, i.e. com.my.great.stemmer.TestStemFilterFactory. The solr.FactoryName notation is just shorthand for org.apache.solr.BlahBlahBlah -Grant On Aug 27, 2008, at 3:27 PM, Jaco wrote: Hello, I'm pretty new to Solr, and not a Java expert, and trying to create my own plug in according to the instructions given in http://wiki.apache.org/solr/SolrPlugins. I want to integrate an external stemmer for the Dutch language by creating a new FilterFactory that will invoke the external stemmer for a TokenStream. First thing I want to do is just make sure I can get the plug in running. Here's what I did: - Take a copy of DutchStemFilterFactory.java, rename it to TestStemFilterFactory, renamed the class to TestStemFilterFactory - Successfully compiled the java using javac, and add the .class file to a jar file - Put the jar file in SOLR_HOME/lib - Put a line in my analyzer definition in schema.xml - Restart tomcat In the Tomcat log, there is an indication that the file is found: 27-Aug-2008 20:58:25 org.apache.solr.core.SolrResourceLoader createClassLoader INFO: Adding 'file:/D:/Programs/Solr/lib/Test.jar' to Solr classloader But then I get errors being reported by Tomcat further down the log file: SEVERE: org.apache.solr.common.SolrException: Error loading class 'solr.TestStemFilterFactory' at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:256) at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:261) at org.apache.solr.util.plugin.AbstractPluginLoader.create(AbstractPluginLoader.java:83) at org.apache.solr.util.plugin.AbstractPluginLoader.load(AbstractPluginLoader.java:140) <> Caused by: java.lang.ClassNotFoundException: solr.TestStemFilterFactory at java.net.URLClassLoader$1.run(URLClassLoader.java:200) at java.security.AccessController.doPrivileged(Native Method) at java.net.URLClassLoader.findClass(URLClassLoader.java:188) <.> Probably some configuration issue somewhere, but I am in the dark here (as said: not a Java expert...). I've tried to find information in mailing list archives on this, but no luck so far. I'm Running Solr nightly build of 20.08.2008, tomcat 5.5.26 on Windows. Any help would be much appreciated! Cheers, Jaco. -- Grant Ingersoll http://www.lucidimagination.com Lucene Helpful Hints: http://wiki.apache.org/lucene-java/BasicsOfPerformance http://wiki.apache.org/lucene-java/LuceneFAQ
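Concretely, with the package name from Grant's example (substitute your own), the filter line in the schema.xml analyzer would read:

```xml
<analyzer>
  <tokenizer class="solr.WhitespaceTokenizerFactory"/>
  <!-- fully qualified class name instead of the solr.* shorthand -->
  <filter class="com.my.great.stemmer.TestStemFilterFactory"/>
</analyzer>
```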
Re: copyField: String vs Text Field
Jake, copyField exists to decouple document values (on the update side) from how they are indexed. From the example schema: -Yonik On Wed, Aug 27, 2008 at 4:46 PM, Jake Conk <[EMAIL PROTECTED]> wrote: > Hello, > > I was wondering if there was an added advantage in using > to copy a string field to a text field? > > If the field is copied to a text field then why not just make the > field a text field and eliminate copying its data? > > If you are going to use full text searching on that field which you > cant do with string fields wouldn't it just make sense to keep it a > text field since it has the same abilities as a string field and more? > > ... Or is the reason because string fields have better performance on > matching exact strings than text fields? > > Thanks, > > - Jake >
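(The schema lines Yonik pasted were stripped by the list archiver; the copyField declarations in the example schema look roughly like this:)

```xml
<!-- approximate excerpt from the example schema: several stored
     fields are funneled into one catch-all indexed "text" field -->
<copyField source="name" dest="text"/>
<copyField source="manu" dest="text"/>
<copyField source="features" dest="text"/>
```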
Re: dataimporthandler and mysql connector jar
: Can you please open a JIRA issue for this? However, we may only be able to : fix this after 1.3 because a code freeze has been decided upon, to release : 1.3 asap. "code freeze" may be overstating it ... the point of the freeze is to hold off on new features and other misc refactorings and focus on bug fixes and documentation improvements. This sounds like a bug, and assuming the fix isn't insanely invasive there's no reason not to make bug fixes on the 1.3 branch (and merge with the trunk). -Hoss
Re: copyField: String vs Text Field
Yonik, Thanks for the reply. Does that mean that if I were to edit the data then the field it was copied to will not be updated? I assume it does get deleted if I delete the record right? I understand how it can make searching simpler by copying fields to one but would that really make it faster? How? Thanks, - Jake On Wed, Aug 27, 2008 at 2:22 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote: > Jake, copyField exists to decouple document values (on the update > size) from how they are indexed. > > From the example schema: > > > -Yonik > > On Wed, Aug 27, 2008 at 4:46 PM, Jake Conk <[EMAIL PROTECTED]> wrote: >> Hello, >> >> I was wondering if there was an added advantage in using >> to copy a string field to a text field? >> >> If the field is copied to a text field then why not just make the >> field a text field and eliminate copying its data? >> >> If you are going to use full text searching on that field which you >> cant do with string fields wouldn't it just make sense to keep it a >> text field since it has the same abilities as a string field and more? >> >> ... Or is the reason because string fields have better performance on >> matching exact strings than text fields? >> >> Thanks, >> >> - Jake >> >
Re: Wrong sort by score
: It seems like the debug information is using the custom similarity as it : should - the bug isn't there. : I see in the explain information the right tf value (I modified it to be 1 : in my custom similarity). : The numbers in the explain seem to add up and make sense. : Is it possible that the score itself is wrong (the one that I get from fl)? the score in the doclist is by definition the correct score - the debug info follows a different code path and sometimes that code path isn't in sync with the actual searching/scoring code for different query types (although i was pretty confident that the test i added to Lucene-Java a while back tested this for anything you can see in Solr without getting into crazy contrib Query classes) it would help if you could post:
1) the full debugQuery output from a query where you see this disconnect, showing all the query toString info, and the score explanations
2) the corresponding scores you see in the doclist
3) some more details about how your custom similarity works (can you post the code)
4) info on how you've configured dismax and what request params you are using (the output from using echoParams=all would be good)
-Hoss
Re: copyField: String vs Text Field
On Wed, Aug 27, 2008 at 7:47 PM, Jake Conk <[EMAIL PROTECTED]> wrote: > Thanks for the reply. Does that mean that if I were to edit the data > then the field it was copied to will not be updated? You can't really "edit" a document in Lucene or Solr, really just overwrite an old document with an entirely new version. > I assume it does > get deleted if I delete the record right? I understand how it can make > searching simpler by copying fields to one but would that really make > it faster? How? Searching a single field for a term is faster than searching multiple fields for a term. That's really only one use case though... the other being to have a single stored field that is analyzed multiple different ways. -Yonik > Thanks, > - Jake > > On Wed, Aug 27, 2008 at 2:22 PM, Yonik Seeley <[EMAIL PROTECTED]> wrote: >> Jake, copyField exists to decouple document values (on the update >> size) from how they are indexed. >> >> From the example schema: >> >> >> -Yonik >> >> On Wed, Aug 27, 2008 at 4:46 PM, Jake Conk <[EMAIL PROTECTED]> wrote: >>> Hello, >>> >>> I was wondering if there was an added advantage in using >>> to copy a string field to a text field? >>> >>> If the field is copied to a text field then why not just make the >>> field a text field and eliminate copying its data? >>> >>> If you are going to use full text searching on that field which you >>> cant do with string fields wouldn't it just make sense to keep it a >>> text field since it has the same abilities as a string field and more? >>> >>> ... Or is the reason because string fields have better performance on >>> matching exact strings than text fields? >>> >>> Thanks, >>> >>> - Jake >>> >> >
Re: copyField: String vs Text Field
On 8/27/08 5:54 PM, "Yonik Seeley" <[EMAIL PROTECTED]> wrote: > > That's really only one use case though... the other being to have a > single stored field that is analyzed multiple different ways. We are the other use case. We take a title and put it in three fields: one merely lowercased, one stemmed and stopped, and one phonetic. At query time, we search all three with decreasing weights. An exact match is weighted more than a stemmed and stopped match, and so on. wunder -- Search Guy, Netflix
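A sketch of that layout (all field and type names here are invented for illustration, not Netflix's actual schema):

```xml
<!-- schema.xml: one incoming title value copied into three analyses -->
<field name="title"          type="text_lowercase" indexed="true" stored="true"/>
<field name="title_stemmed"  type="text_stemmed"   indexed="true" stored="false"/>
<field name="title_phonetic" type="text_phonetic"  indexed="true" stored="false"/>
<copyField source="title" dest="title_stemmed"/>
<copyField source="title" dest="title_phonetic"/>
```

A dismax qf along the lines of title^3 title_stemmed^2 title_phonetic^1 then gives the decreasing weights at query time.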
Re: dataimporthandler and mysql connector jar
On Thu, Aug 28, 2008 at 5:11 AM, Chris Hostetter <[EMAIL PROTECTED]>wrote: > "code freeze" may be overstating it ... the point of the freeze is to hold > off on new fatures and other misc refactorings and focus on bug fixes and > documentation improvements. Ah ok. I was under the impression that only blocker bugs should make it there. > > > This sounds like a bug, and assuming the fix isn't insanely invasive > there's no reason not to make bug fixes on the 1.3 branch (and merge with > the trunk). That's great. There are a couple of small bugs which could make it to 1.3 then. -- Regards, Shalin Shekhar Mangar.
Re: copyField: String vs Text Field
Hi Walter, What do you mean when you say you "stemmed and stopped" your title field? Thanks, - Jake On Wed, Aug 27, 2008 at 7:41 PM, Walter Underwood <[EMAIL PROTECTED]> wrote: > On 8/27/08 5:54 PM, "Yonik Seeley" <[EMAIL PROTECTED]> wrote: >> >> That's really only one use case though... the other being to have a >> single stored field that is analyzed multiple different ways. > > We are the other use case. We take a title and put it in three > fields: one merely lowercased, one stemmed and stopped, and one > phonetic. At query time, we search all three with decreasing > weights. An exact match is weighted more than a stemmed and > stopped match, and so on. > > wunder > -- > Search Guy, Netflix > > >