Re: solr search
Thanks for your replies. My problem has been resolved. It was a SQL Server connection problem: I declared a variable "databasename" in the data-config.xml file and removed the database name from the URL. Can anyone suggest a good link or resource for multiple indexing and spell checking in Solr?

Manish Bawne
Software Engineer
Biz Integra Systems Pvt Ltd
http://www.bizhandel.com

Noble Paul നോബിള് नोब्ळ्-2 wrote:
>
> Please paste the complete stacktrace
>
> On Fri, Nov 6, 2009 at 1:37 PM, manishkbawne wrote:
>>
>> Thanks for the assistance. Actually I installed JDK 6 and my problem was
>> resolved. But now I am getting this exception:
>>
>> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
>> execute query: select PkMenuId from WCM_Menu Processing Document # 1
>>         at org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.(JdbcDataSource.java:186)
>>         at ---
>>
>> The changes in the db-dataconfig.xml file are as follows:
>>
>> fetchSize="1">
>> name="id1" />
>>
>> I don't think there is a problem with a missing hyphen. Can anybody
>> suggest a way to resolve this error?
>>
>> Manish Bawne
>> Software Engineer
>> Biz Integra Systems
>> www.bizhandel.com
>>
>> Chantal Ackermann wrote:
>>>
>>> Hi Manish,
>>>
>>> is this a typo in your e-mail or is your config file really missing a
>>> hyphen? (You're repeating the name without the second hyphen several times.)
>>>
>>> Cheers,
>>> Chantal
>>>
>>> manishkbawne schrieb: db-data-config.xml The changes that I have done in the db-dataconfig.xml file are:
>>
>> --
>> View this message in context:
>> http://old.nabble.com/solr-search-tp26125183p26228077.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>
> --
> -
> Noble Paul | Principal Engineer| AOL | http://aol.com

--
View this message in context: http://old.nabble.com/solr-search-tp26125183p26251669.html
Sent from the Solr - User mailing list archive at Nabble.com.
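For readers hitting the same issue, a hedged sketch of one way such a parameterized data-config can look. The exact mechanism used above isn't shown; this version assumes the database name arrives as a DIH request parameter, and the driver class, host, credentials and entity/field names are all placeholders:

```xml
<dataConfig>
  <!-- assumption: "databasename" is passed as a request parameter,
       e.g. /dataimport?command=full-import&databasename=SomeDb -->
  <dataSource type="JdbcDataSource"
              driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
              url="jdbc:sqlserver://localhost:1433;databaseName=${dataimporter.request.databasename}"
              user="solr" password="secret"/>
  <document>
    <entity name="menu" query="select PkMenuId from WCM_Menu">
      <field column="PkMenuId" name="id1"/>
    </entity>
  </document>
</dataConfig>
```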
schema.jsp is not displaying tint types correctly
I have a field defined as tint with values 100, 200, 300 and -100 only. When I use admin/schema.jsp I see 5 distinct values:

0    666083
100  431176
200  234907
256  33947
300  33947

At first I thought I had posted wrong values; I was expecting 4 distinct values. When I query flag_value:256 I get 0 docs, same as with flag_value:0. 100, 200 and 300 work as expected. So I concluded that there is something wrong with schema.jsp and tint types. I am using solr-2009-11-03.tgz.

By the way, do trie types make sorting faster? Or are they only useful in range queries?
synonym payload boosting
Hi,

I have a field and a weighted synonym map. I have indexed the synonyms with the weight as payload. A code snippet from my filter:

public Token next(final Token reusableToken) throws IOException {
    ...
    Payload boostPayload;

    for (Synonym synonym : syns) {
        Token newTok = new Token(nToken.startOffset(), nToken.endOffset(), "SYNONYM");
        newTok.setTermBuffer(synonym.getToken().toCharArray(), 0, synonym.getToken().length());
        // set the position increment to zero
        // this tells lucene the synonym is
        // in the exact same location as the originating word
        newTok.setPositionIncrement(0);
        boostPayload = new Payload(PayloadHelper.encodeFloat(synonym.getWieght()));
        newTok.setPayload(boostPayload);
    ...

I have put it in the index-time analyzer; this is my field definition:

My similarity class is:

public class BoostingSymilarity extends DefaultSimilarity {

    public BoostingSymilarity() {
        super();
    }

    @Override
    public float scorePayload(String field, byte[] payload, int offset, int length) {
        double weight = PayloadHelper.decodeFloat(payload, 0);
        return (float) weight;
    }

    @Override
    public float coord(int overlap, int maxoverlap) {
        return 1.0f;
    }

    @Override
    public float idf(int docFreq, int numDocs) {
        return 1.0f;
    }

    @Override
    public float lengthNorm(String fieldName, int numTerms) {
        return 1.0f;
    }

    @Override
    public float tf(float freq) {
        return 1.0f;
    }
}

My problem is that the scorePayload method does not get called at search time like the other methods in my similarity class. I tested and verified it with breakpoints. What am I doing wrong? I am using Solr 1.3 and thinking of the payload boost support in Solr 1.4.
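The weight round trip above can be sanity-checked outside Lucene with a small standalone sketch. This assumes (hedged) that PayloadHelper packs the float as four big-endian IEEE-754 bytes; the class and method names here are illustrative, not Lucene's own:

```java
public class PayloadRoundTrip {
    // Mimics PayloadHelper.encodeFloat: 4 big-endian bytes (assumption).
    static byte[] encodeFloat(float f) {
        int bits = Float.floatToIntBits(f);
        return new byte[] {
            (byte) (bits >>> 24), (byte) (bits >>> 16),
            (byte) (bits >>> 8),  (byte) bits
        };
    }

    // Mimics PayloadHelper.decodeFloat: rebuild the int, then the float.
    static float decodeFloat(byte[] payload, int offset) {
        int bits = ((payload[offset]     & 0xFF) << 24)
                 | ((payload[offset + 1] & 0xFF) << 16)
                 | ((payload[offset + 2] & 0xFF) << 8)
                 |  (payload[offset + 3] & 0xFF);
        return Float.intBitsToFloat(bits);
    }

    public static void main(String[] args) {
        byte[] payload = encodeFloat(2.5f);
        System.out.println(decodeFloat(payload, 0)); // prints 2.5
    }
}
```

If the round trip works here, the remaining question is only whether scorePayload is invoked at all, which depends on the query type (see the reply below this message in the archive thread).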
Re: synonym payload boosting
Additionally, you need to modify your query parser to return BoostingTermQuery, PayloadTermQuery, PayloadNearQuery etc. With these query types the scorePayload method is invoked. Hope this helps.

--- On Sun, 11/8/09, David Ginzburg wrote:

> From: David Ginzburg
> Subject: synonym payload boosting
> To: solr-user@lucene.apache.org
> Date: Sunday, November 8, 2009, 4:06 PM
> [...]
Re: schema.jsp is not displaying tint types correctly
Maybe you indexed some documents with value 256, but then deleted them? Try optimizing to get the terms removed.

Erik

On Nov 8, 2009, at 6:11 AM, AHMET ARSLAN wrote:
> [...]
Re: schema.jsp is not displaying tint types correctly
> maybe you indexed some documents with value 256, but then deleted
> them? try optimizing to get the terms removed.

I am running a full-import with DIH. No deletions. And the domain of this field is exactly 100, 200, 300 and -100. I am using this SQL query to fetch it:

SELECT CASE
         WHEN ... THEN 300
         WHEN ... THEN 200
         WHEN ... THEN 100
         ELSE -100
       END AS flag_value
FROM ...

I just optimized, just to make sure. The result is the same:

Distinct: 7
0    987692
100  751304
200  236388
256  33830
300  33830

Interestingly, it says there are 7 distinct values. When I try to see the top 7 terms it always shows the top 5, and it also changes the value of the top-terms textbox back to 5.
Getting started with DIH
I would like to start using DIH to index some RSS feeds and mail folders.

To get started I tried the RSS example from the wiki, but as it is, Solr complains about the missing id field. After some experimenting I found two ways to fill the id:

- In schema.xml. This works but isn't very flexible. Perhaps I have other types of records with a real id or a multivalued link field; then this solution would break.

- Changing the id field to type "uuid". Again, I would like to keep real ids where I have them, and not a random UUID.

What didn't work, but looks like the potentially best solution, is to fill the id in my data-config by using the link twice. This would be a definition just for this single data source, but I don't get any docs (also no error message); no trace of any inserts whatsoever. Is it possible to fill the id that way?

Another question regarding MailEntityProcessor: I found an example, but what is the dataSource (the enclosing tag to document)? That is, how would a minimal but complete data-config.xml look to index mails from an IMAP server?

And finally, is it possible to combine the definitions for several RSS feeds and mail accounts into one data-config? Or do I need a separate config file and request handler for each of them?

-Michael
Re: Getting started with DIH
There is an example of using the mail DIH in the Solr distro.

[]s, Lucas Frare Teixeira .·.
- lucas...@gmail.com
- lucastex.com.br
- blog.lucastex.com
- twitter.com/lucastex

On Sun, Nov 8, 2009 at 1:56 PM, Michael Lackhoff wrote:
> [...]
Re: Getting started with DIH
On 08.11.2009 17:03 Lucas F. A. Teixeira wrote:

> You have an example on using mail dih in solr distro

Don't know where my eyes were. Thanks!

While I was at it I looked at the schema.xml for the RSS example, and it uses "link" as uniqueKey, which is of course good if you only have RSS items, but not so good if you also plan to add other data sources. So I am still interested in a good solution for my id problem:

>> What didn't work but looks like the potentially best solution is to fill
>> the id in my data-config by using the link twice. This would be a
>> definition just for this single data source but I don't get any docs
>> (also no error message). No trace of any inserts whatsoever.
>> Is it possible to fill the id that way?

and this one:

>> And finally, is it possible to combine the definitions for several
>> RSS-Feeds and Mail-accounts into one data-config? Or do I need a
>> separate config file and request handler for each of them?

Thanks
-Michael
Re: Getting started with DIH
If I'm not wrong, you can have several entities in one document, but just one dataSource configured.

[]s, Lucas Frare Teixeira .·.
- lucas...@gmail.com
- lucastex.com.br
- blog.lucastex.com
- twitter.com/lucastex

On Sun, Nov 8, 2009 at 3:36 PM, Michael Lackhoff wrote:
> [...]
Re: tracking solr response time
Thanks Lance for the clear explanation. Are you saying we should give the Solr JVM enough memory so that the OS cache can optimize disk I/O efficiently? In our case we have a 16 GB index, so would it be enough to allocate the Solr JVM 20 GB of memory and rely on the OS cache to optimize disk I/O, i.e. to cache the index in memory?

Below are the stats related to the caches:

name: queryResultCache
class: org.apache.solr.search.LRUCache
version: 1.0
description: LRU Cache(maxSize=512, initialSize=512, autowarmCount=256, regenerator=org.apache.solr.search.solrindexsearche...@67e112b3)
stats: lookups : 0, hits : 0, hitratio : 0.00, inserts : 8, evictions : 0, size : 8, cumulative_lookups : 15, cumulative_hits : 7, cumulative_hitratio : 0.46, cumulative_inserts : 8, cumulative_evictions : 0

name: documentCache
class: org.apache.solr.search.LRUCache
version: 1.0
description: LRU Cache(maxSize=512, initialSize=512)
stats: lookups : 0, hits : 0, hitratio : 0.00, inserts : 0, evictions : 0, size : 0, cumulative_lookups : 744, cumulative_hits : 639, cumulative_hitratio : 0.85, cumulative_inserts : 105, cumulative_evictions : 0

name: filterCache
class: org.apache.solr.search.LRUCache
version: 1.0
description: LRU Cache(maxSize=512, initialSize=512, autowarmCount=256, regenerator=org.apache.solr.search.solrindexsearche...@1e3dbf67)
stats: lookups : 0, hits : 0, hitratio : 0.00, inserts : 20, evictions : 0, size : 12, cumulative_lookups : 64, cumulative_hits : 60, cumulative_hitratio : 0.93, cumulative_inserts : 12, cumulative_evictions : 0

Hits and hit ratio are zero for the document cache, filter cache and query cache; only the cumulative hits and hit ratio have non-zero numbers. Is this how it is supposed to be, or do we need to configure it properly?

Thanks,
Bharath

On Sat, Nov 7, 2009 at 5:47 AM, Lance Norskog wrote:
> The OS cache is the memory used by the operating system (Linux or
> Windows) to store a cache of the data stored on the disk. The cache is
> usually by block numbers and is not correlated to files. Disk blocks
> that are not used by programs are slowly pruned from the cache.
>
> The operating systems are very good at maintaining this cache. It is
> usually better to give the Solr JVM enough memory to run comfortably
> and rely on the OS cache to optimize disk I/O, instead of giving it
> all available RAM.
>
> Solr has its own caches for certain data structures, and there are no
> solid guidelines for tuning those. The solr/admin/stats.jsp page shows
> the number of hits & deletes for the caches and most people just
> reload that over & over.
>
> On Fri, Nov 6, 2009 at 3:09 AM, bharath venkatesh wrote:
> >> I have to state the obvious: you may really want to upgrade to 1.4
> >> when it's out
> >
> > When would Solr 1.4 be released? Is there a beta version available?
> >
> >> We don't have the details, but a machine with 32 GB RAM and a 16 GB
> >> index should have the whole index cached by the OS
> >
> > Do we have to configure Solr for the index to be cached by the OS in an
> > optimised way? How does this caching of the index in memory happen? Are
> > there any docs or links which give details regarding the same?
> >
> >> unless something else is consuming the memory or unless something is
> >> constantly throwing data out of the OS cache (e.g. frequent index
> >> optimization).
> >
> > What are the factors which would cause constantly throwing data out of
> > the OS cache? (We are doing index optimization only once a day, during
> > midnight.)
> >
> > Thanks,
> > Bharath
>
> --
> Lance Norskog
> goks...@gmail.com
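On the zero hit counts: a likely explanation (hedged) is that the non-cumulative stats are per-searcher and reset whenever a commit opens a new searcher, while the cumulative_* numbers span the life of the core, so zeros right after a commit are normal. For reference, these caches are configured in solrconfig.xml; a sketch with illustrative (not recommended) sizes:

```xml
<!-- solrconfig.xml: cache sizes and autowarming; autowarmCount
     repopulates a new searcher's cache from the old one on commit -->
<query>
  <filterCache class="solr.LRUCache"
               size="512" initialSize="512" autowarmCount="256"/>
  <queryResultCache class="solr.LRUCache"
               size="512" initialSize="512" autowarmCount="256"/>
  <documentCache class="solr.LRUCache"
               size="512" initialSize="512"/>
</query>
```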
Re: Developers in Rio de Janeiro - Brazil
> Dear all,
>
> Does any member of this list work as a freelancer, in Rio de Janeiro, on
> developing sites with faceted navigation in Solr 1.4?
>
> Regards,
> Renato.
Re: Solr Replication: How to restore data from last snapshot
: Subject: Solr Replication: How to restore data from last snapshot
: References: <8950e934db69a040a1783438e67293d813da3f6...@delmail.sapient.com>
:  <26230840.p...@talk.nabble.com>
: In-Reply-To:

http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if you change the subject line of your email, other mail headers still track which thread you replied to and your question is "hidden" in that thread and gets less attention. It makes following discussions in the mailing list archives particularly difficult.

See Also: http://en.wikipedia.org/wiki/Thread_hijacking

-Hoss
Re: dismax + wildcard
: Subject: dismax + wildcard
: References: <3c9e9890-e1e9-43b0-bd01-b9fa4a77f...@gmail.com>
: In-Reply-To:

http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to an existing message, instead start a fresh email. Even if you change the subject line of your email, other mail headers still track which thread you replied to and your question is "hidden" in that thread and gets less attention. It makes following discussions in the mailing list archives particularly difficult.

See Also: http://en.wikipedia.org/wiki/Thread_hijacking

-Hoss
Re: schema.jsp is not displaying tint types correctly
: I have a field defined tint with values 100,200,300 and -100 only.

I assume you mean that tint is a solr.TrieIntField, probably with precisionStep="8"?

: When i use admin/schema.jsp i see 5 distinct values.
...
: First i thought that i post wrong values. I was expecting 4 distinct values.

I'm not an expert on Trie fields, but you need to remember that schema.jsp shows you the *indexed* values, and the whole point of TrieFields is to create multiple indexed values at various levels of precision so that range queries can be much faster.

: When i query flag_value:256 i get 0 docs. Same as with flag_value:0.

...which is to be expected since you didn't index any docs with those exact values. If you facet on that field, you should see the 4 values you expect (and no others) because the faceting code for TrieFields knows about the special values, but schema.jsp just tells you exactly what's in the index.

-Hoss
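The extra lower-precision terms can be illustrated with a simplified sketch (an assumption about the encoding: real Lucene trie terms are additionally prefix-coded per shift level). Masking off the low precisionStep bits reproduces exactly the "extra" values from the original report: 100 and 200 both collapse to 0 (and indeed 431176 + 234907 = 666083, the reported count for term 0), while 300 collapses to 256 (count 33947, matching the count for 300):

```java
public class TrieTermSketch {
    // A value indexed with precisionStep=8 also yields lower-precision
    // terms with the bottom 8, 16, 24... bits zeroed out.
    static int maskLowBits(int value, int shift) {
        return (value >> shift) << shift;
    }

    public static void main(String[] args) {
        System.out.println(maskLowBits(100, 8)); // 0
        System.out.println(maskLowBits(200, 8)); // 0
        System.out.println(maskLowBits(300, 8)); // 256
    }
}
```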
Re: schema.jsp is not displaying tint types correctly
> I'm not an expert on Trie fields, but you need to remember that schema.jsp
> shows you the *indexed* values, and the whole point of TrieFields is to
> create multiple indexed values at various levels of precision so that
> range queries can be much faster.
>
> : When i query flag_value:256 i get 0 docs. Same as with flag_value:0.
>
> ...which is to be expected since you didn't index any docs with those
> exact values. If you facet on that field, you should see the 4 values you
> expect (and no others) because the faceting code for TrieFields knows
> about the special values, but schema.jsp just tells you exactly what's in
> the index.

Yes, tint is the default one that comes with schema.xml. I queried *:* and faceted on that field; the result is just as you said:

1132679
207459
27474

So we can say that it is normal to see weird values in schema.jsp for trie types. Thanks for the explanations.
Re: schema.jsp is not displaying tint types correctly
: So we can say that it is normal to see weird values in schema.jsp for
: trie types. Thanks for the explanations.

It's normal to see weird values in schema.jsp for all types: trie, stemmed, etc...

-Hoss
using different field for search and boosting
Hello,

I wanted to know if it's possible to search on one field and provide boosting relevancy on other fields. For example, if I have fields like make, model, description etc. and all are copied to a text field, can I define a handler where I search on the text field but define relevancy boosts on make, model and description, i.e. make^4 model^2?

Any advice?

--
View this message in context: http://old.nabble.com/using-different-field-for-search-and-boosting-tp26260479p26260479.html
Sent from the Solr - User mailing list archive at Nabble.com.
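One common way to get this effect (a hedged sketch; the handler name is made up and the boosts are taken from the question): the dismax parser's qf parameter queries several fields at once with per-field boosts, so the catch-all text field and the boosted fields can be combined:

```xml
<!-- solrconfig.xml sketch: query the catch-all text field while
     weighting matches found in make and model more heavily -->
<requestHandler name="/boosted" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <str name="qf">text make^4 model^2</str>
  </lst>
</requestHandler>
```

Note that this queries make and model directly as well as boosting them, which usually matches the intent since those fields are copied into text anyway.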
Segment file not found error - after replicating
Hi guys,

We use Solr 1.3 for indexing large amounts of data (50 GB on average) in a Linux environment and use the replication scripts to make replicas that live on load-balancing slaves. The issue we face quite often (only on Linux servers) is that they tend not to be able to find the segments file (segments_x etc.) after replication completes. As this has become quite common, we have started hitting a serious issue. Below is a stack trace, if that helps; any help on this matter is greatly appreciated.

Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created /admin/: org.apache.solr.handler.admin.AdminHandlers
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created /admin/ping: org.apache.solr.handler.PingRequestHandler
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created /debug/dump: org.apache.solr.handler.DumpRequestHandler
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created gap: org.apache.solr.highlight.GapFragmenter
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created regex: org.apache.solr.highlight.RegexFragmenter
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created html: org.apache.solr.highlight.HtmlFormatter
Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
SEVERE: Could not start SOLR.
Check solr/home property
java.lang.RuntimeException: java.io.FileNotFoundException: /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
        at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
        at org.apache.solr.core.SolrCore.(SolrCore.java:470)
        at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
        at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
        at org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
        at org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
        at org.apache.catalina.core.ApplicationFilterConfig.(ApplicationFilterConfig.java:108)
        at org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
        at org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
        at org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
        at org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
        at org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
        at org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
        at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
        at org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
        at org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
        at org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
        at org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
        at org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
        at org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
        at org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
        at org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
        at org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
        at org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
        at org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
        at org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
        at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.FileNotFoundException: /solrinstances/solrhome01/data/index/segments_v (No such file or directory)
        at java.io.RandomAccessFile.open(Native Method)
        at java.io.RandomAccessFile.(RandomAccessFile.java:212)
        at org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.(FSDirectory.java:552)
        at org.apache.lucene.store.FSDirectory$FSIndexInput.(FSDirectory.java:582)
        at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
        at org.apache.lucene.store.FSDirectory.openInput(FS
RE: Solr Replication: How to restore data from last snapshot
What happens if it is multiple cores?

Thanks

-----Original Message-----
From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble Paul നോബിള് नोब्ळ्
Sent: Friday, November 06, 2009 10:49 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Replication: How to restore data from last snapshot

if it is a single core you will have to restart the master

On Sat, Nov 7, 2009 at 1:55 AM, Osborn Chan wrote:
> Thanks. But I have the following use cases:
>
> 1) Master index is corrupted, but it didn't replicate to slave servers.
>  - In this case, I only need to restore to the last snapshot.
> 2) Master index is corrupted, and it has replicated to slave servers.
>  - In this case, I need to restore to the last snapshot, and make sure
>    slave servers replicate the restored index from the master as well.
>
> Assuming both cases are in a production environment, and I cannot shut
> down the master and slave servers: is there any REST API call or something
> else I can do without manually using Linux commands and restarting?
>
> Thanks,
>
> Osborn
>
> -----Original Message-----
> From: Matthew Runo [mailto:matthew.r...@gmail.com]
> Sent: Friday, November 06, 2009 12:20 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Replication: How to restore data from last snapshot
>
> If your master index is corrupt and it hasn't been replicated out, you
> should be able to shut down the server and remove the corrupted index
> files. Then copy the replicated index back onto the master and start
> everything back up.
>
> As far as I know, the indexes on the replicated slaves are exactly
> what you'd have on the master, so this method should work.
>
> --Matthew Runo
>
> On Fri, Nov 6, 2009 at 11:41 AM, Osborn Chan wrote:
>> Hi,
>>
>> I have followed the Solr ReplicationHandler setup for index replication
>> to slaves. Does anyone know how to restore a corrupted index from a
>> snapshot on the master, and force replication of the restored index to
>> the slaves?
>>
>> Thanks,
>>
>> Osborn

--
-
Noble Paul | Principal Engineer| AOL | http://aol.com
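For the single-core manual path, a rough sketch of the restore step. The snapshot.* directory naming follows the Solr 1.3 replication scripts, but the exact layout is an assumption; the real data directory is simulated here with a temp dir so the commands can be tried anywhere:

```shell
# Simulate a Solr data dir with an index and two snapshots
# (in real life DATA_DIR would be e.g. /path/to/solrhome/data,
# and the master would be shut down first).
DATA_DIR=$(mktemp -d)
mkdir -p "$DATA_DIR/index" "$DATA_DIR/snapshot.20091106220000" "$DATA_DIR/snapshot.20091107220000"
echo "good segments" > "$DATA_DIR/snapshot.20091107220000/segments_1"

# Restore: replace the (corrupted) index with the newest snapshot.
LATEST=$(ls -d "$DATA_DIR"/snapshot.* | sort | tail -n 1)
rm -rf "$DATA_DIR/index"
cp -r "$LATEST" "$DATA_DIR/index"

ls "$DATA_DIR/index"
```

After restarting the master, the slaves pick up the restored index on their next snappuller run.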
Re: Getting started with DIH
On 08.11.2009 16:56 Michael Lackhoff wrote:

> What didn't work but looks like the potentially best solution is to fill
> the id in my data-config by using the link twice. This would be a
> definition just for this single data source but I don't get any docs
> (also no error message). No trace of any inserts whatsoever.
> Is it possible to fill the id that way?

Found the answer in the list archive: use TemplateTransformer.

Only a minor and cosmetic problem remains: there are brackets around the id field (like [http://somelink/]). For an id this doesn't really matter, but I would like to understand what is going on here. In the wiki I found only this info:

> The rules for the template are same as the templates in 'query', 'url' etc

but I couldn't find any info about those either. Is this documented somewhere?

-Michael
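A hedged sketch of the TemplateTransformer approach, with assumed entity and field names (only the idea of reusing the link as id comes from the thread):

```xml
<dataConfig>
  <dataSource type="HttpDataSource"/>
  <document>
    <entity name="feed"
            url="http://rss.slashdot.org/Slashdot/slashdot"
            processor="XPathEntityProcessor"
            forEach="/RDF/item"
            transformer="TemplateTransformer">
      <field column="title" xpath="/RDF/item/title"/>
      <field column="link" xpath="/RDF/item/link"/>
      <!-- reuse the link as the schema's uniqueKey -->
      <field column="id" template="${feed.link}"/>
    </entity>
  </document>
</dataConfig>
```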
Re: Getting started with DIH
The brackets probably come from it being transformed as an array. Try saying multiValued="false" on your field specifications.

Erik

On Nov 9, 2009, at 12:34 AM, Michael Lackhoff wrote:
> [...]
Re: Getting started with DIH
On 09.11.2009 06:54 Erik Hatcher wrote:

> The brackets probably come from it being transformed as an array. Try
> saying multiValued="false" on your specifications.

Indeed. Thanks Erik, that was it.

My first steps with DIH showed me what a powerful tool this is, but although the DIH wiki page might well be the longest in the whole wiki, there are so many mysteries left for the uninitiated. Is there any other documentation I might have missed?

Thanks
-Michael
Re: Getting started with DIH
This one is kind of a hack, so I have opened an issue:
https://issues.apache.org/jira/browse/SOLR-1547

On Mon, Nov 9, 2009 at 12:43 PM, Michael Lackhoff wrote:
> [...]

--
-
Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Getting started with DIH
On Mon, Nov 9, 2009 at 12:43 PM, Michael Lackhoff wrote:
> My first steps with DIH showed me what a powerful tool this is but
> although the DIH wiki page might well be the longest in the whole wiki
> there are so many mysteries left for the uninitiated. Is there any other
> documentation I might have missed?

There is an FAQ page, and that is it:
http://wiki.apache.org/solr/DataImportHandlerFaq

It just started off as a single page, and the features just got piled up and the page got bigger. We are thinking of cutting it down into smaller, more manageable pages.

--
-
Noble Paul | Principal Engineer| AOL | http://aol.com
Re: Getting started with DIH
On 09.11.2009 08:20 Noble Paul നോബിള് नोब्ळ् wrote:

> It just started of as a single page and the features just got piled up
> and the page just bigger. we are thinking of cutting it down to
> smaller more manageable pages

Oh, I like it the way it is as one page, so that the browser's full-text search can help. It is just that the features and power seem to grow even faster than the wiki page ;-)

E.g. I couldn't find a way to add a second RSS feed. I tried with a second entity parallel to the slashdot one but got an exception: "java.io.IOException: FULL", whatever that means. So I must be doing something wrong but couldn't find a hint.

-Michael
Re: Getting started with DIH
The tried and tested strategy is to post the question on this mailing list with your data-config.xml.

On Mon, Nov 9, 2009 at 1:08 PM, Michael Lackhoff wrote:
> [...]

--
-
Noble Paul | Principal Engineer| AOL | http://aol.com
How to import multiple RSS-feeds with DIH
[A new thread for this particular problem]

On 09.11.2009 08:44 Noble Paul നോബിള് नोब्ळ् wrote:

> The tried and tested strategy is to post the question in this mailing
> list w/ your data-config.xml.

See my data-config.xml below. The first entity is the usual slashdot example with my 'id' addition, the second a very simple additional feed. The second example works if I delete the slashdot feed, but as I said, I would like to have them both.

-Michael

http://rss.slashdot.org/Slashdot/slashdot"
processor="XPathEntityProcessor"
forEach="/RDF/channel | /RDF/item"
transformer="TemplateTransformer,DateFormatTransformer">

http://www.heise.de/newsticker/heise.rdf"
processor="XPathEntityProcessor"
forEach="/RDF/channel | /RDF/item"
transformer="TemplateTransformer">
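From the surviving attribute fragments, a hedged reconstruction of what a two-feed config could look like. Only the URLs, processors, forEach expressions and transformers appear above; the entity names and field mappings here are assumptions:

```xml
<dataConfig>
  <dataSource type="HttpDataSource"/>
  <document>
    <entity name="slashdot"
            url="http://rss.slashdot.org/Slashdot/slashdot"
            processor="XPathEntityProcessor"
            forEach="/RDF/channel | /RDF/item"
            transformer="TemplateTransformer,DateFormatTransformer">
      <field column="link" xpath="/RDF/item/link"/>
      <field column="id" template="${slashdot.link}"/>
    </entity>
    <entity name="heise"
            url="http://www.heise.de/newsticker/heise.rdf"
            processor="XPathEntityProcessor"
            forEach="/RDF/channel | /RDF/item"
            transformer="TemplateTransformer">
      <field column="link" xpath="/RDF/item/link"/>
      <field column="id" template="${heise.link}"/>
    </entity>
  </document>
</dataConfig>
```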