Re: solr search

2009-11-08 Thread manishkbawne

Thanks for your replies. My problem has been resolved. It was a SQL Server
connection problem. I declared a variable "databasename" in the
dataconfig.xml file and removed the database name from the URL.
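For reference, one way to parameterize the JDBC URL along these lines (a guess at the kind of setup described, since the actual file is not shown; the SQL Server driver and URL format below are only an example) is to pass the database name as a DataImportHandler request parameter and reference it with DIH's ${dataimporter.request.*} syntax:

<dataSource type="JdbcDataSource"
            driver="com.microsoft.sqlserver.jdbc.SQLServerDriver"
            url="jdbc:sqlserver://localhost:1433;databaseName=${dataimporter.request.databasename}"
            user="..." password="..."/>
<!-- then invoke e.g. /dataimport?command=full-import&databasename=mydb -->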

Can anyone suggest a good link or URL for multiple indexing and spell
checking in Solr?

Manish Bawne
Software Engineer
Biz Integra Systems Pvt Ltd
http://www.bizhandel.com



Noble Paul നോബിള്‍  नोब्ळ्-2 wrote:
> 
> Please paste the complete stacktrace
> 
> On Fri, Nov 6, 2009 at 1:37 PM, manishkbawne 
> wrote:
>>
>> Thanks for the assistance. Actually I installed JDK 6 and my problem was
>> resolved. But now I am getting this exception:
>> org.apache.solr.handler.dataimport.DataImportHandlerException: Unable to
>> execute query: select PkMenuId from WCM_Menu Processing Document # 1
>>        at
>> org.apache.solr.handler.dataimport.JdbcDataSource$ResultSetIterator.<init>(JdbcDataSource.java:186)
>>        at ---
>>
>> The changes to the db-dataconfig.xml file are as follows:
>> [data-config snippet stripped by the mail archive; only fetchSize="1" and
>> name="id1" survive -- see the reconstruction below]
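The quoted snippet above was stripped by the mail archive; judging from the query in the stack trace and the surviving fetchSize="1" and name="id1" attributes, it was presumably along these lines (a reconstruction, with the entity name and connection details invented):

<dataConfig>
  <dataSource type="JdbcDataSource" driver="..." url="..." user="..." password="..."/>
  <document>
    <entity name="menu" query="select PkMenuId from WCM_Menu" fetchSize="1">
      <field column="PkMenuId" name="id1" />
    </entity>
  </document>
</dataConfig>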
>>
>> I don't think the problem is the missing hyphen. Can anybody please
>> suggest a way to resolve this error?
>>
>> Manish Bawne
>> Software Engineer
>> Biz Integra Systems
>> www.bizhandel.com
>>
>> Chantal Ackermann wrote:
>>>
>>> Hi Manish,
>>>
>>> is this a typo in your e-mail or is your config file really missing a
>>> hyphen? (You're repeating the name without the second hyphen several times.)
>>>
>>> Cheers,
>>> Chantal
>>>
>>> manishkbawne wrote:
   db-data-config.xml

 The changes that I have made in the db-dataconfig.xml file are:
>>>
>>>
>>
>> --
>> View this message in context:
>> http://old.nabble.com/solr-search-tp26125183p26228077.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 
> 
> -- 
> -
> Noble Paul | Principal Engineer| AOL | http://aol.com
> 
> 

-- 
View this message in context: 
http://old.nabble.com/solr-search-tp26125183p26251669.html
Sent from the Solr - User mailing list archive at Nabble.com.



schema.jsp is not displaying tint types correctly

2009-11-08 Thread AHMET ARSLAN
I have a field defined as tint with values 100, 200, 300 and -100 only.
When I use admin/schema.jsp I see 5 distinct values.

0   666083
100 431176
200 234907
256 33947
300 33947

At first I thought that I had posted wrong values; I was expecting 4 distinct values.
When I query flag_value:256 I get 0 docs, same as with flag_value:0.
100, 200 and 300 work as expected, so I concluded that there is something wrong
with schema.jsp and tint types. I am using solr-2009-11-03.tgz.

By the way, do trie types make sorting faster, or are they only useful for range
queries?


  


synonym payload boosting

2009-11-08 Thread David Ginzburg
Hi,
I have a field and a weighted synonym map.
I have indexed the synonyms with the weight as payload.
Here is a code snippet from my filter:

public Token next(final Token reusableToken) throws IOException {
    ...
    Payload boostPayload;

    for (Synonym synonym : syns) {
        Token newTok = new Token(nToken.startOffset(), nToken.endOffset(), "SYNONYM");
        newTok.setTermBuffer(synonym.getToken().toCharArray(), 0, synonym.getToken().length());
        // set the position increment to zero
        // this tells lucene the synonym is
        // in the exact same location as the originating word
        newTok.setPositionIncrement(0);
        boostPayload = new Payload(PayloadHelper.encodeFloat(synonym.getWieght()));
        newTok.setPayload(boostPayload);
        ...
I have put it in the index-time analyzer. This is my field definition:

<fieldType name="..." class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="com.digitaltrowel.solr.DTSynonymFactory"
            FreskoFunction="names_with_scoresPipe23Columns.txt"
            ignoreCase="true" expand="false"/>
    <!-- (one or more further filters were stripped by the mail archive) -->
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <!-- (one or more further filters were stripped by the mail archive) -->
  </analyzer>
</fieldType>
My similarity class is:

public class BoostingSymilarity extends DefaultSimilarity {

    public BoostingSymilarity() {
        super();
    }

    @Override
    public float scorePayload(String field, byte[] payload, int offset, int length) {
        double weight = PayloadHelper.decodeFloat(payload, 0);
        return (float) weight;
    }

    @Override
    public float coord(int overlap, int maxoverlap) {
        return 1.0f;
    }

    @Override
    public float idf(int docFreq, int numDocs) {
        return 1.0f;
    }

    @Override
    public float lengthNorm(String fieldName, int numTerms) {
        return 1.0f;
    }

    @Override
    public float tf(float freq) {
        return 1.0f;
    }
}

My problem is that the scorePayload method does not get called at search time
like the other methods in my similarity class.
I tested and verified this with breakpoints.
What am I doing wrong?
I am using Solr 1.3 and thinking about the payload boost support in Solr 1.4.


Re: synonym payload boosting

2009-11-08 Thread AHMET ARSLAN
Additionally, you need to modify your query parser to return BoostingTermQuery,
PayloadTermQuery, PayloadNearQuery, etc.

Only with these query types is the scorePayload method invoked.

Hope this helps.
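A sketch of what that could look like as a Solr query-parser plugin (assuming Solr 1.4 / Lucene 2.9 class names such as PayloadTermQuery and MaxPayloadFunction; on Solr 1.3 / Lucene 2.4 the equivalent query class is BoostingTermQuery). The plugin name and structure below are illustrative, not an existing Solr parser; the idea is simply that term queries must be built as payload queries so that Similarity.scorePayload() is consulted during scoring:

import org.apache.lucene.index.Term;
import org.apache.lucene.queryParser.ParseException;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.payloads.MaxPayloadFunction;
import org.apache.lucene.search.payloads.PayloadTermQuery;
import org.apache.solr.common.params.SolrParams;
import org.apache.solr.common.util.NamedList;
import org.apache.solr.request.SolrQueryRequest;
import org.apache.solr.search.QParser;
import org.apache.solr.search.QParserPlugin;
import org.apache.solr.search.SolrQueryParser;

public class PayloadQParserPlugin extends QParserPlugin {

  public void init(NamedList args) {}

  @Override
  public QParser createParser(String qstr, SolrParams localParams,
                              SolrParams params, SolrQueryRequest req) {
    return new QParser(qstr, localParams, params, req) {
      @Override
      public Query parse() throws ParseException {
        // use the schema-aware parser, but build payload-scoring term queries
        SolrQueryParser parser = new SolrQueryParser(this, null) {
          @Override
          protected Query newTermQuery(Term term) {
            return new PayloadTermQuery(term, new MaxPayloadFunction());
          }
        };
        return parser.parse(getString());
      }
    };
  }
}

It would be registered in solrconfig.xml with a <queryParser name="payload" class="...PayloadQParserPlugin"/> entry and selected per request with defType=payload.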

--- On Sun, 11/8/09, David Ginzburg  wrote:

> From: David Ginzburg 
> Subject: synonym payload boosting
> To: solr-user@lucene.apache.org
> Date: Sunday, November 8, 2009, 4:06 PM
> [David's original message quoted in full; see the message above]



Re: schema.jsp is not displaying tint types correctly

2009-11-08 Thread Erik Hatcher
maybe you indexed some documents with value 256, but then deleted  
them?  try optimizing to get the terms removed.


Erik

On Nov 8, 2009, at 6:11 AM, AHMET ARSLAN wrote:


I have a field defined tint with values 100,200,300 and -100 only.
When i use admin/schema.jsp i see 5 distinct values.

0   666083
100 431176
200 234907
256 33947
300 33947

First i thought that i post wrong values. I was expecting 4 distinct  
values.

When i query flag_value:256 i get 0 docs. Same as with flag_value:0.
100 200 and 300 works as expected. So i concluded that there is  
something wrong with schema.jsp with tint types. I am using  
solr-2009-11-03.tgz


By the way, trie types makes sorts faster? Or they are only useful  
in range queries?








Re: schema.jsp is not displaying tint types correctly

2009-11-08 Thread AHMET ARSLAN
> maybe you indexed some documents with
> value 256, but then deleted them?  try optimizing to
> get the terms removed.

I am running a full-import with DIH. No deletions. And the domain of this field is
exactly 100, 200, 300 and -100. I am using this SQL query to fetch that field:

SELECT CASE WHEN ... THEN 300 WHEN ... THEN 200 WHEN ... THEN 100 ELSE -100 END 
AS flag_value FROM ...

I just optimized to make sure. Same result:

Distinct:  7

0 987692 
100 751304 
200 236388 
256 33830 
300 33830 

Interestingly, it says there are 7 distinct values, but when I try to see the top 7 terms
it always shows the top 5, and the value of the top-terms textbox changes back to 5.






Getting started with DIH

2009-11-08 Thread Michael Lackhoff
I would like to start using DIH to index some RSS feeds and mail folders.

To get started I tried the RSS example from the wiki but as it is Solr
complains about the missing id field. After some experimenting I found
out two ways to fill the id:

- [tag stripped by the mail archive] in schema.xml
This works but isn't very flexible. Perhaps I have other types of
records with a real id or a multivalued link-field. Then this solution
would break.

- Changing the id field to type "uuid"
Again I would like to keep real ids where I have them and not a random UUID.

What didn't work but looks like the potentially best solution is to fill
the id in my data-config by using the link twice:
  [two <field .../> lines stripped by the mail archive -- see the sketch below]
This would be a definition just for this single data source but I don't
get any docs (also no error message). No trace of any inserts whatsoever.
Is it possible to fill the id that way?
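The two stripped lines above presumably just mapped the same XPath onto both columns, roughly like this (an assumption, since the archive ate the tags):

<field column="link" xpath="/RDF/item/link" />
<field column="id"   xpath="/RDF/item/link" />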

Another question regarding MailEntityProcessor
I found this example:

<document>
   <entity processor="MailEntityProcessor"
           user="someb...@gmail.com"
           password="something"
           host="imap.gmail.com"
           protocol="imaps"
           folders="x,y,z"/>
</document>

But what is the dataSource (the tag enclosing document)? That is, what
would a minimal but complete data-config.xml look like to index mails
from an IMAP server?
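For what it's worth, a minimal complete data-config.xml for the mail case presumably looks about like this (a sketch based on the example shipped with Solr 1.4's contrib, not a tested config; MailEntityProcessor connects to the IMAP server itself, so no <dataSource> element should be needed):

<dataConfig>
  <document>
    <entity processor="MailEntityProcessor"
            user="someb...@gmail.com"
            password="something"
            host="imap.gmail.com"
            protocol="imaps"
            folders="inbox"/>
  </document>
</dataConfig>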

And finally, is it possible to combine the definitions for several
RSS feeds and mail accounts into one data-config? Or do I need a
separate config file and request handler for each of them?

-Michael


Re: Getting started with DIH

2009-11-08 Thread Lucas F. A. Teixeira
There is an example of using the mail DIH in the Solr distribution.

[]s,

Lucas Frare Teixeira .·.
- lucas...@gmail.com
- lucastex.com.br
- blog.lucastex.com
- twitter.com/lucastex


On Sun, Nov 8, 2009 at 1:56 PM, Michael Lackhoff wrote:

> I would like to start using DIH to index some RSS-Feeds and mail folders
>
> To get started I tried the RSS example from the wiki but as it is Solr
> complains about the missing id field. After some experimenting I found
> out two ways to fill the id:
>
> -  in schema.xml
> This works but isn't very flexible. Perhaps I have other types of
> records with a real id or a multivalued link-field. Then this solution
> would break.
>
> - Changing the id field to type "uuid"
> Again I would like to keep real ids where I have them and not a random
> UUID.
>
> What didn't work but looks like the potentially best solution is to fill
> the id in my data-config by using the link twice:
>  
>  
> This would be a definition just for this single data source but I don't
> get any docs (also no error message). No trace of any inserts whatsoever.
> Is it possible to fill the id that way?
>
> Another question regarding MailEntityProcessor
> I found this example:
> 
>  user="someb...@gmail.com"
>   password="something"
>   host="imap.gmail.com"
>   protocol="imaps"
>   folders = "x,y,z"/>
> 
>
> But what is the dataSource (the enclosing tag to document)? That is, how
> would a minimal but complete data-config.xml look like to index mails
> from an IMAP server?
>
> And finally, is it possible to combine the definitions for several
> RSS-Feeds and Mail-accounts into one data-config? Or do I need a
> separate config file and request handler for each of them?
>
> -Michael
>


Re: Getting started with DIH

2009-11-08 Thread Michael Lackhoff
On 08.11.2009 17:03 Lucas F. A. Teixeira wrote:

> You have an example on using mail dih in solr distro

Don't know where my eyes were. Thanks!

While I was at it, I looked at the schema.xml for the RSS example; it
uses "link" as the uniqueKey, which is of course good if you only have RSS
items, but not so good if you also plan to add other data sources.
So I am still interested in a good solution for my id problem:

>> What didn't work but looks like the potentially best solution is to fill
>> the id in my data-config by using the link twice:
>>  
>>  
>> This would be a definition just for this single data source but I don't
>> get any docs (also no error message). No trace of any inserts whatsoever.
>> Is it possible to fill the id that way?

and this one:

>> And finally, is it possible to combine the definitions for several
>> RSS-Feeds and Mail-accounts into one data-config? Or do I need a
>> separate config file and request handler for each of them?

Thanks
-Michael


Re: Getting started with DIH

2009-11-08 Thread Lucas F. A. Teixeira
If I'm not wrong, you can have several entities in one document, but just
one datasource configured.

[]sm


Lucas Frare Teixeira .·.
- lucas...@gmail.com
- lucastex.com.br
- blog.lucastex.com
- twitter.com/lucastex


On Sun, Nov 8, 2009 at 3:36 PM, Michael Lackhoff wrote:

> On 08.11.2009 17:03 Lucas F. A. Teixeira wrote:
>
> > You have an example on using mail dih in solr distro
>
> Don't know where my eyes were. Thanks!
>
> When I was at it I looked at the schema.xml for the rss example and it
> uses "link" as UniqueKey, which is of course good, if you only have rss
> items but not so good if you also plan to add other data sources.
> So I am still interested in a good solution for my id problem:
>
> >> What didn't work but looks like the potentially best solution is to fill
> >> the id in my data-config by using the link twice:
> >>  
> >>  
> >> This would be a definition just for this single data source but I don't
> >> get any docs (also no error message). No trace of any inserts
> whatsoever.
> >> Is it possible to fill the id that way?
>
> and this one:
>
> >> And finally, is it possible to combine the definitions for several
> >> RSS-Feeds and Mail-accounts into one data-config? Or do I need a
> >> separate config file and request handler for each of them?
>
> Thanks
> -Michael
>


Re: tracking solr response time

2009-11-08 Thread bharath venkatesh
Thanks, Lance, for the clear explanation. Are you saying we should give the
Solr JVM enough memory so that the OS cache can optimize disk I/O efficiently?
That is, in our case we have a 16 GB index, so would it be enough to allocate
the Solr JVM 20 GB of memory and rely on the OS cache to optimize disk I/O,
i.e. cache the index in memory?


Below are the stats related to the caches:


name: queryResultCache
class: org.apache.solr.search.LRUCache
version: 1.0
description: LRU Cache(maxSize=512, initialSize=512, autowarmCount=256,
             regenerator=org.apache.solr.search.solrindexsearche...@67e112b3)
stats:
  lookups : 0
  hits : 0
  hitratio : 0.00
  inserts : 8
  evictions : 0
  size : 8
  cumulative_lookups : 15
  cumulative_hits : 7
  cumulative_hitratio : 0.46
  cumulative_inserts : 8
  cumulative_evictions : 0

name: documentCache
class: org.apache.solr.search.LRUCache
version: 1.0
description: LRU Cache(maxSize=512, initialSize=512)
stats:
  lookups : 0
  hits : 0
  hitratio : 0.00
  inserts : 0
  evictions : 0
  size : 0
  cumulative_lookups : 744
  cumulative_hits : 639
  cumulative_hitratio : 0.85
  cumulative_inserts : 105
  cumulative_evictions : 0

name: filterCache
class: org.apache.solr.search.LRUCache
version: 1.0
description: LRU Cache(maxSize=512, initialSize=512, autowarmCount=256,
             regenerator=org.apache.solr.search.solrindexsearche...@1e3dbf67)
stats:
  lookups : 0
  hits : 0
  hitratio : 0.00
  inserts : 20
  evictions : 0
  size : 12
  cumulative_lookups : 64
  cumulative_hits : 60
  cumulative_hitratio : 0.93
  cumulative_inserts : 12
  cumulative_evictions : 0


Hits and hit ratio are zero for the document cache, filter cache and query result
cache; only the cumulative hits and hit ratio have non-zero numbers. Is this how
it is supposed to be, or do we need to configure them properly?
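For reference, these caches are declared in solrconfig.xml; the maxSize/initialSize/autowarmCount values in the stats above map directly onto attributes like these (a sketch of the stock settings, not necessarily the actual file in use):

<filterCache      class="solr.LRUCache" size="512" initialSize="512" autowarmCount="256"/>
<queryResultCache class="solr.LRUCache" size="512" initialSize="512" autowarmCount="256"/>
<documentCache    class="solr.LRUCache" size="512" initialSize="512"/>

The non-cumulative counters are per-searcher and reset whenever a new searcher is opened (e.g. after a commit), which is most likely why only the cumulative numbers are non-zero here.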

Thanks,
Bharath





On Sat, Nov 7, 2009 at 5:47 AM, Lance Norskog  wrote:

> The OS cache is the memory used by the operating system (Linux or
> Windows) to store a cache of the data stored on the disk. The cache is
> usually by block numbers and are not correlated to files. Disk blocks
> that are not used by programs are slowly pruned from the cache.
>
> The operating systems are very good at maintaining this cache. It
> usually better to give the Solr JVM enough memory to run comfortably
> and rely on the OS cache to optimize disk I/O, instead of giving it
> all available ram.
>
> Solr has its own caches for certain data structures, and there are no
> solid guidelines for tuning those. The solr/admin/stats.jsp page shows
> the number of hits & deletes for the caches and most people just
> reload that over & over.
>
> On Fri, Nov 6, 2009 at 3:09 AM, bharath venkatesh
>  wrote:
> >>I have to state the obvious: you may really want to upgrade to 1.4 when
> > it's out
> >
> > when would solr 1.4 be released .. is there any beta version available ?
> >
> >>We don't have the details, but a machine with 32 GB RAM and 16 GB index
> > should have the whole index cached by >the OS
> >
> > do we have to configure solr  for the index to be cached  by OS in a
> > optimised way   . how does this caching of index in memory happens ?  r
> > there  any docs or link which gives details regarding the same
> >
> >>unless something else is consuming the memory or unless something is
> > constantly throwing data out of the OS >cache (e.g. frequent index
> > optimization).
> >
> > what are the factors which would cause constantly throwing data out of
> the
> > OS cache  (we are doing  index optimization only once in a day during
> > midnight )
> >
> >
> > Thanks,
> > Bharath
> >
>
>
>
> --
> Lance Norskog
> goks...@gmail.com
>


Re: Developers in Rio de Janeiro - Brazil

2009-11-08 Thread Renato Alves
>
> Dear all,
>
> Does any member of the list work as a freelancer, in Rio de Janeiro, on
> developing websites with faceted navigation in Solr 1.4?
>
> Best regards,
> Renato.
>


Re: Solr Replication: How to restore data from last snapshot

2009-11-08 Thread Chris Hostetter

: Subject: Solr Replication: How to restore data from last snapshot
: References: <8950e934db69a040a1783438e67293d813da3f6...@delmail.sapient.com>
:  <26230840.p...@talk.nabble.com>
:  
: In-Reply-To: 

http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to 
an existing message, instead start a fresh email.  Even if you change the 
subject line of your email, other mail headers still track which thread 
you replied to and your question is "hidden" in that thread and gets less 
attention.   It makes following discussions in the mailing list archives 
particularly difficult.
See Also:  http://en.wikipedia.org/wiki/Thread_hijacking



-Hoss



Re: dismax + wildcard

2009-11-08 Thread Chris Hostetter

: Subject: dismax + wildcard
: References: <3c9e9890-e1e9-43b0-bd01-b9fa4a77f...@gmail.com>
: 
: 
: In-Reply-To: 

http://people.apache.org/~hossman/#threadhijack
Thread Hijacking on Mailing Lists

When starting a new discussion on a mailing list, please do not reply to
an existing message, instead start a fresh email.  Even if you change the
subject line of your email, other mail headers still track which thread
you replied to and your question is "hidden" in that thread and gets less
attention.   It makes following discussions in the mailing list archives
particularly difficult.
See Also:  http://en.wikipedia.org/wiki/Thread_hijacking




-Hoss



Re: schema.jsp is not displaying tint types correctly

2009-11-08 Thread Chris Hostetter

: I have a field defined tint with values 100,200,300 and -100 only.

i assume you mean that tint is a solr.TrieIntField, probably with 
precisionStep="8" ?

: When i use admin/schema.jsp i see 5 distinct values.
...
: First i thought that i post wrong values. I was expecting 4 distinct values.

I'm not an expert on Trie fields, but you need to remember that schema.jsp 
shows you the *indexed* values, and the whole point of TrieFields is to 
create multiple indexed values at various levels of precision so that 
range queries can be much faster.

: When i query flag_value:256 i get 0 docs. Same as with flag_value:0.

...which is to be expected since you didn't index any docs with those 
exact values.  If you facet on that field, you should see the 4 values you 
expect (and no others) because the faceting code for TrieFields knows 
about the special values, but schema.jsp just tells you exactly what's in 
the index.
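A small standalone illustration of that (not Solr code, and assuming the example schema's tint with precisionStep="8"): besides the full-precision term, TrieIntField also indexes each value with its lowest 8 (then 16, 24) bits zeroed, which is where the extra 0 and 256 terms come from.

public class TriePrefixDemo {
    public static void main(String[] args) {
        int precisionStep = 8;                   // the tint default in the example schema
        int mask = ~((1 << precisionStep) - 1);  // zero out the lowest 8 bits
        for (int v : new int[] {100, 200, 300}) {
            System.out.println(v + " -> lower-precision prefix term " + (v & mask));
        }
        // prints 100 -> 0, 200 -> 0, 300 -> 256: in the stats quoted earlier, the
        // count for term 0 (666083) is exactly the counts for 100 and 200 combined,
        // and 256 has the same count as 300. Negative values such as -100 are
        // sign-adjusted before encoding, so their prefix terms show up as other
        // "weird" values in schema.jsp.
    }
}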


-Hoss



Re: schema.jsp is not displaying tint types correctly

2009-11-08 Thread AHMET ARSLAN

 
> I'm not an expert on Trie fields, but you need to remember
> that schema.jsp 
> shows you the *indexed* values, and the whole point of
> TrieFields is to 
> create multiple indexed values at various levels of
> precision so that 
> range queries can be much faster.
> 
> : When i query flag_value:256 i get 0 docs. Same as with
> flag_value:0.
> 
> ...which is to be expected since you didn't index any docs
> with those 
> exact values.  If you facet on that field, you should
> see the 4 values you 
> expect (and no others) because the faceting code for
> TreiFields knows 
> about the special values, but schema.jsp just tells you
> exactly what's in 
> the index.

Yes, tint is the default one that comes with schema.xml. I queried *:* and faceted on
that field; the result is just as you said:

1132679
207459
27474


So we can say that it is normal to see weird values in schema.jsp for trie
types. Thanks for the explanations.





Re: schema.jsp is not displaying tint types correctly

2009-11-08 Thread Chris Hostetter

: So we can say that it is normal to see weird values in schema.jsp for trie 
types. Thanks for the explanations.

it's normal to see weird values in schema.jsp for all types, trie, stemmed, 
etc...


-Hoss



using different field for search and boosting

2009-11-08 Thread darniz

Hello,
I wanted to know if it is possible to search on one field and provide
relevancy boosting on other fields.

For example, I have fields like make, model and description, and all of them
are copied to the text field.
So can I define a handler where I search on the text field but define
relevancy boosts on make, model and description, i.e. make^4 model^2?

Any advice.
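What the question describes is essentially the dismax qf parameter; a sketch of such a handler in solrconfig.xml (the handler name and exact boosts are only illustrative, assuming Solr 1.4's SearchHandler; in Solr 1.3 the class would be solr.DisMaxRequestHandler):

<requestHandler name="/carsearch" class="solr.SearchHandler">
  <lst name="defaults">
    <str name="defType">dismax</str>
    <!-- search across several fields with per-field boosts instead of only
         the catch-all text field -->
    <str name="qf">make^4 model^2 description text</str>
  </lst>
</requestHandler>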
-- 
View this message in context: 
http://old.nabble.com/using-different-field-for-search-and-boosting-tp26260479p26260479.html
Sent from the Solr - User mailing list archive at Nabble.com.



Segment file not found error - after replicating

2009-11-08 Thread Maduranga Kannangara
Hi guys,

We use Solr 1.3 for indexing large amounts of data (50 GB on average) in a Linux
environment and use the replication scripts to make replicas that live on the
load-balancing slaves.

The issue we face quite often (only on the Linux servers) is that they tend not
to be able to find the segments file (segments_x etc.) after replication has
completed. As this has become quite common, it is now a serious issue for us.

Below is a stack trace, if that helps; any help on this matter is greatly
appreciated.



Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created /admin/: org.apache.solr.handler.admin.AdminHandlers
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created /admin/ping: org.apache.solr.handler.PingRequestHandler
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created /debug/dump: org.apache.solr.handler.DumpRequestHandler
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created gap: org.apache.solr.highlight.GapFragmenter
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created regex: org.apache.solr.highlight.RegexFragmenter
Nov 5, 2009 11:34:46 PM org.apache.solr.util.plugin.AbstractPluginLoader load
INFO: created html: org.apache.solr.highlight.HtmlFormatter
Nov 5, 2009 11:34:46 PM org.apache.solr.servlet.SolrDispatchFilter init
SEVERE: Could not start SOLR. Check solr/home property
java.lang.RuntimeException: java.io.FileNotFoundException: 
/solrinstances/solrhome01/data/index/segments_v (No such file or directory)
at org.apache.solr.core.SolrCore.getSearcher(SolrCore.java:960)
at org.apache.solr.core.SolrCore.<init>(SolrCore.java:470)
at 
org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:119)
at 
org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:69)
at 
org.apache.catalina.core.ApplicationFilterConfig.getFilter(ApplicationFilterConfig.java:275)
at 
org.apache.catalina.core.ApplicationFilterConfig.setFilterDef(ApplicationFilterConfig.java:397)
at 
org.apache.catalina.core.ApplicationFilterConfig.<init>(ApplicationFilterConfig.java:108)
at 
org.apache.catalina.core.StandardContext.filterStart(StandardContext.java:3709)
at 
org.apache.catalina.core.StandardContext.start(StandardContext.java:4363)
at 
org.apache.catalina.core.StandardContext.reload(StandardContext.java:3099)
at 
org.apache.catalina.manager.ManagerServlet.reload(ManagerServlet.java:916)
at 
org.apache.catalina.manager.HTMLManagerServlet.reload(HTMLManagerServlet.java:536)
at 
org.apache.catalina.manager.HTMLManagerServlet.doGet(HTMLManagerServlet.java:114)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:617)
at javax.servlet.http.HttpServlet.service(HttpServlet.java:717)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:290)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at com.jamonapi.JAMonFilter.doFilter(JAMonFilter.java:57)
at 
org.apache.catalina.core.ApplicationFilterChain.internalDoFilter(ApplicationFilterChain.java:235)
at 
org.apache.catalina.core.ApplicationFilterChain.doFilter(ApplicationFilterChain.java:206)
at 
org.apache.catalina.core.StandardWrapperValve.invoke(StandardWrapperValve.java:233)
at 
org.apache.catalina.core.StandardContextValve.invoke(StandardContextValve.java:191)
at 
org.apache.catalina.authenticator.AuthenticatorBase.invoke(AuthenticatorBase.java:525)
at 
org.apache.catalina.core.StandardHostValve.invoke(StandardHostValve.java:128)
at 
org.apache.catalina.valves.ErrorReportValve.invoke(ErrorReportValve.java:102)
at 
org.apache.catalina.core.StandardEngineValve.invoke(StandardEngineValve.java:109)
at 
org.apache.catalina.connector.CoyoteAdapter.service(CoyoteAdapter.java:286)
at 
org.apache.coyote.http11.Http11Processor.process(Http11Processor.java:845)
at 
org.apache.coyote.http11.Http11Protocol$Http11ConnectionHandler.process(Http11Protocol.java:583)
at 
org.apache.tomcat.util.net.JIoEndpoint$Worker.run(JIoEndpoint.java:447)
at java.lang.Thread.run(Thread.java:619)
Caused by: java.io.FileNotFoundException: 
/solrinstances/solrhome01/data/index/segments_v (No such file or directory)
at java.io.RandomAccessFile.open(Native Method)
at java.io.RandomAccessFile.<init>(RandomAccessFile.java:212)
at 
org.apache.lucene.store.FSDirectory$FSIndexInput$Descriptor.<init>(FSDirectory.java:552)
at 
org.apache.lucene.store.FSDirectory$FSIndexInput.<init>(FSDirectory.java:582)
at org.apache.lucene.store.FSDirectory.openInput(FSDirectory.java:488)
at org.apache.lucene.store.FSDirectory.openInput(FS

RE: Solr Replication: How to restore data from last snapshot

2009-11-08 Thread Osborn Chan
What happens if it is a multi-core setup?

Thanks

-Original Message-
From: noble.p...@gmail.com [mailto:noble.p...@gmail.com] On Behalf Of Noble 
Paul നോബിള്‍ नोब्ळ्
Sent: Friday, November 06, 2009 10:49 PM
To: solr-user@lucene.apache.org
Subject: Re: Solr Replication: How to restore data from last snapshot

if it is a single core you will have to restart the master

On Sat, Nov 7, 2009 at 1:55 AM, Osborn Chan  wrote:
> Thanks. But I have following use cases:
>
> 1) Master index is corrupted, but it didn't replicate to slave servers.
>        - In this case, I only need to restore to last snapshot.
> 2) Master index is corrupted, and it has replicated to slave servers.
>        - In this case, I need to restore to last snapshot, and make sure 
> slave servers replicate the restored index from index server as well.
>
> Assuming both cases are in production environment, and I cannot shutdown the 
> master and slave servers.
> Is there any rest API call or something else I can do without manually using 
> linux command and restart?
>
> Thanks,
>
> Osborn
>
> -Original Message-
> From: Matthew Runo [mailto:matthew.r...@gmail.com]
> Sent: Friday, November 06, 2009 12:20 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr Replication: How to restore data from last snapshot
>
> If your master index is corrupt and it hasn't been replicated out, you
> should be able to shut down the server and remove the corrupted index
> files. Then copy the replicated index back onto the master and start
> everything back up.
>
> As far as I know, the indexes on the replicated slaves are exactly
> what you'd have on the master, so this method should work.
>
> --Matthew Runo
>
> On Fri, Nov 6, 2009 at 11:41 AM, Osborn Chan  wrote:
>> Hi,
>>
>> I have followed Solr set up ReplicationHandler for index replication to 
>> slave.
>> Do anyone know how to restore corrupted index from snapshot in master, and 
>> force replication of the restored index to slave?
>>
>>
>> Thanks,
>>
>> Osborn
>>
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: Getting started with DIH

2009-11-08 Thread Michael Lackhoff
On 08.11.2009 16:56 Michael Lackhoff wrote:

> What didn't work but looks like the potentially best solution is to fill
> the id in my data-config by using the link twice:
>   
>   
> This would be a definition just for this single data source but I don't
> get any docs (also no error message). No trace of any inserts whatsoever.
> Is it possible to fill the id that way?

Found the answer in the list archive: use TemplateTransformer:
  [two <field .../> lines stripped by the mail archive -- see the sketch below]
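The two stripped lines presumably looked something like the following; the entity name used here (feed) is only a placeholder, and TemplateTransformer has to be listed in the entity's transformer attribute:

<field column="link" xpath="/RDF/item/link" />
<field column="id"   template="${feed.link}" />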

Only minor and cosmetic problem: there are brackets around the id field
(like [http://somelink/]). For an id this doesn't really matter but I
would like to understand what is going on here. In the wiki I found only
this info:
> The rules for the template are same as the templates in 'query', 'url'
> etc
but I couldn't find any info about those either. Is this documented
somewhere?

-Michael


Re: Getting started with DIH

2009-11-08 Thread Erik Hatcher
The brackets probably come from it being transformed as an array. Try
saying multiValued="false" on your <field> specifications.


Erik

On Nov 9, 2009, at 12:34 AM, Michael Lackhoff wrote:


On 08.11.2009 16:56 Michael Lackhoff wrote:

What didn't work but looks like the potentially best solution is to  
fill

the id in my data-config by using the link twice:
 
 
This would be a definition just for this single data source but I  
don't
get any docs (also no error message). No trace of any inserts  
whatsoever.

Is it possible to fill the id that way?


Found the answer in the list archive: use TemplateTransformer:
 
 

Only minor and cosmetic problem: there are brackets around the id  
field

(like [http://somelink/]). For an id this doesn't really matter but I
would like to understand what is going on here. In the wiki I found  
only

this info:
The rules for the template are same as the templates in 'query',  
'url'

etc

but I couldn't find any info about those either. Is this documented
somewhere?

-Michael




Re: Getting started with DIH

2009-11-08 Thread Michael Lackhoff
On 09.11.2009 06:54 Erik Hatcher wrote:

> The brackets probably come from it being transformed as an array.  Try  
> saying multiValued="false" on your  specifications.

Indeed. Thanks Erik that was it.

My first steps with DIH showed me what a powerful tool this is but
although the DIH wiki page might well be the longest in the whole wiki
there are so many mysteries left for the uninitiated. Is there any other
documentation I might have missed?

Thanks
-Michael


Re: Getting started with DIH

2009-11-08 Thread Noble Paul നോബിള്‍ नोब्ळ्
This one is kind of a hack.

So I have opened an issue.

https://issues.apache.org/jira/browse/SOLR-1547

On Mon, Nov 9, 2009 at 12:43 PM, Michael Lackhoff  wrote:
> On 09.11.2009 06:54 Erik Hatcher wrote:
>
>> The brackets probably come from it being transformed as an array.  Try
>> saying multiValued="false" on your  specifications.
>
> Indeed. Thanks Erik that was it.
>
> My first steps with DIH showed me what a powerful tool this is but
> although the DIH wiki page might well be the longest in the whole wiki
> there are so many mysteries left for the uninitiated. Is there any other
> documentation I might have missed?
>
> Thanks
> -Michael
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: Getting started with DIH

2009-11-08 Thread Noble Paul നോബിള്‍ नोब्ळ्
On Mon, Nov 9, 2009 at 12:43 PM, Michael Lackhoff  wrote:
> On 09.11.2009 06:54 Erik Hatcher wrote:
>
>> The brackets probably come from it being transformed as an array.  Try
>> saying multiValued="false" on your  specifications.
>
> Indeed. Thanks Erik that was it.
>
> My first steps with DIH showed me what a powerful tool this is but
> although the DIH wiki page might well be the longest in the whole wiki
> there are so many mysteries left for the uninitiated. Is there any other
> documentation I might have missed?

There is an FAQ page, and that is it:
http://wiki.apache.org/solr/DataImportHandlerFaq

It just started off as a single page, and the features just got piled up
and the page just got bigger. We are thinking of cutting it down into
smaller, more manageable pages.
>
> Thanks
> -Michael
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


Re: Getting started with DIH

2009-11-08 Thread Michael Lackhoff
On 09.11.2009 08:20 Noble Paul നോബിള്‍ नोब्ळ् wrote:

> It just started of as a single page and the features just got piled up
> and the page just bigger.  we are thinking of cutting it down to
> smaller more manageable pages

Oh, I like it the way it is as one page, so that the browser's full-text
search can help. It is just that the features and power seem to grow
even faster than the wiki page ;-)
E.g. I couldn't find a way to add a second RSS feed. I tried with a
second entity parallel to the slashdot one but got an exception:
"java.io.IOException: FULL", whatever that means, so I must be doing
something wrong but couldn't find a hint.

-Michael


Re: Getting started with DIH

2009-11-08 Thread Noble Paul നോബിള്‍ नोब्ळ्
The tried and tested strategy is to post the question in this mailing
list w/ your data-config.xml.


On Mon, Nov 9, 2009 at 1:08 PM, Michael Lackhoff  wrote:
> On 09.11.2009 08:20 Noble Paul നോബിള്‍ नोब्ळ् wrote:
>
>> It just started of as a single page and the features just got piled up
>> and the page just bigger.  we are thinking of cutting it down to
>> smaller more manageable pages
>
> Oh, I like it the way it is as one page, so that the browser full text
> search can help. It is just that the features and power seem to grow
> even faster than the wike page ;-)
> E.g. I couldn't find a way how to add a second rss feed. I tried with a
> second entity parallel to the slashdot one but got an exception:
> "java.io.IOException: FULL" whatever that means, so I must be doing
> something wrong but couldn't find a hint.
>
> -Michael
>



-- 
-
Noble Paul | Principal Engineer| AOL | http://aol.com


How to import multiple RSS-feeds with DIH

2009-11-08 Thread Michael Lackhoff
[A new thread for this particular problem]

On 09.11.2009 08:44 Noble Paul നോബിള്‍ नोब्ळ् wrote:

> The tried and tested strategy is to post the question in this mailing
> list w/ your data-config.xml.

See my data-config.xml below. The first entity is the usual slashdot example
with my 'id' addition, the second a very simple additional feed. The
second feed works if I delete the slashdot feed, but as I said I would
like to have them both.

-Michael


  

<dataConfig>
  <!-- the dataSource declaration was stripped by the mail archive;
       presumably <dataSource type="HttpDataSource" /> -->
  <document>
    <entity name="slashdot"
            url="http://rss.slashdot.org/Slashdot/slashdot"
            processor="XPathEntityProcessor"
            forEach="/RDF/channel | /RDF/item"
            transformer="TemplateTransformer,DateFormatTransformer">
      <!-- the <field .../> definitions (the usual slashdot example fields plus
           the id filled via TemplateTransformer) were stripped by the mail archive -->
    </entity>
    <entity name="..."
            url="http://www.heise.de/newsticker/heise.rdf"
            processor="XPathEntityProcessor"
            forEach="/RDF/channel | /RDF/item"
            transformer="TemplateTransformer">
      <!-- (the remainder of the message is cut off in the archive) -->