document inside document?

2012-03-26 Thread sam
Hey,

I am making an image search engine where people can tag images with various
items that are themselves tagged.
For example, http://example.com/abc.jpg is tagged with the following three
items:
- item1 that is tagged with: tall blond woman
- item2 that is tagged with: yellow purse
- item3 that is tagged with: gucci red dress

Querying for +yellow +purse  will return the example image. But, querying
for +gucci +purse will not because the image does not have an item tagged
with both gucci and purse.

In addition to "items", each image has various metadata such as alt text,
location, description, photo credit.. etc  that should be available for
search.

How should I write my schema.xml ?
If imageUrl is the primary key, do I implement my own fieldType for items, so
that I can write:
[schema.xml snippet stripped by the archive: a field declaration using the custom item type]
What would myItemType look like so that Solr knows the example image should
not match the query +gucci +purse?
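One approach sometimes used for this kind of "document inside document" shape (an assumption on my part, not something from this thread) is to index each item's tags as a single value of a multiValued field with a large positionIncrementGap, then query with a proximity query whose slop is smaller than the gap, so a match can never span two items:

```xml
<!-- Hypothetical sketch: each item's tags go in as one value, e.g.
     "tall blond woman", "yellow purse", "gucci red dress" -->
<fieldType name="myItemType" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>
<field name="items" type="myItemType" indexed="true" stored="true" multiValued="true"/>
```

With a gap of 100, q=items:"gucci purse"~50 cannot match across item boundaries (the 100-position gap between values exceeds the slop), so the example image is excluded, while q=items:"yellow purse"~50 still matches inside item2.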

If itemId is primary key, I can use result grouping (
http://wiki.apache.org/solr/FieldCollapsing). But, I need to repeat alt
text and other image metadata for each item.

Or, should I create different schema for item search and metadata search?

Thanks.
Sam.


custom query string parsing?

2012-04-10 Thread sam
I would like to transform the following:

/myhandler/?colors=red&colors=blue&materials=leather

to a Query that is similar to:
/select/?fq:colors:(red OR
blue)&fq:materials:leather&facet=on&facet.field=  and various default
query params.

I tried to do this by providing QParserPlugin:
  [solrconfig.xml snippet stripped by the archive: registers a queryParser named "myQueryParser" backed by my.QueryParserPlugin]


In my.QueryParser (returned by my.QueryParserPlugin) I can:
getReq().getParamString();

and parse that string to get non-default SolrParams.
And, build a Query object to my liking.

Am I going in the right direction?  It'd be much easier if I could place
a custom app in front of Solr as a proxy... But, I am required to expose Solr
to the public with no middle layer between the browsers and Solr.

So, my idea is to reject all Solr query params except a handful of custom
params such as colors, materials, ..   and build a Query object myself,
combining with the default SolrParams specified in solrconfig.xml .

Is this feasible?  I don't really want to parse the query string myself
(returned by getReq().getParamString()) ..
getReq().getParams()  returns DefaultSolrParams, which does not provide a way
to get only the non-default params.
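As a plain-Java sketch of the whitelist-and-transform idea (hypothetical helper names, not Solr's API), the parameter handling could look like:

```java
import java.util.*;

public class ParamTransform {
    // Turn whitelisted request params into Solr fq clauses, e.g.
    // colors=red&colors=blue -> colors:(red OR blue); everything else is dropped.
    static List<String> toFilterQueries(Map<String, List<String>> params, Set<String> allowed) {
        List<String> fqs = new ArrayList<>();
        for (Map.Entry<String, List<String>> e : params.entrySet()) {
            if (!allowed.contains(e.getKey())) continue;   // reject non-whitelisted params
            fqs.add(e.getKey() + ":(" + String.join(" OR ", e.getValue()) + ")");
        }
        return fqs;
    }

    public static void main(String[] args) {
        Map<String, List<String>> params = new LinkedHashMap<>();
        params.put("colors", Arrays.asList("red", "blue"));
        params.put("materials", Arrays.asList("leather"));
        params.put("qt", Arrays.asList("/update"));        // must not leak through
        Set<String> allowed = new HashSet<>(Arrays.asList("colors", "materials"));
        System.out.println(toFilterQueries(params, allowed));
        // prints [colors:(red OR blue), materials:(leather)]
    }
}
```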


Re: custom query string parsing?

2012-04-10 Thread sam
Essentially, this is what I want to do  (I'm extending SearchComponent):

@Override
public void process(ResponseBuilder rb) throws IOException {
    final SolrQueryRequest req = rb.req;
    final MultiMapSolrParams requestParams =
        SolrRequestParsers.parseQueryString(req.getParamString());
    final SolrParams allParams = req.getParams();
    final SolrParams defaultParams = allParams - requestParams;              // pseudocode
    req.setParams(defaultParams + transformToSolrParams(requestParams));     // pseudocode
}

where allParams - requestParams  will give me default,append,invariant
query params. And, I manually transform requestParams to proper SolrParams,
and sum them up with defaultParams.
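The "allParams - requestParams" and "+" above are of course pseudocode; the set arithmetic itself is simple (a plain-Java map sketch, not Solr's SolrParams classes):

```java
import java.util.*;

public class ParamMath {
    // "allParams - requestParams": keep only entries the client did not send,
    // i.e. the default/append/invariant params from solrconfig.xml
    static Map<String, String> subtract(Map<String, String> all, Set<String> requestKeys) {
        Map<String, String> out = new LinkedHashMap<>(all);
        out.keySet().removeAll(requestKeys);
        return out;
    }

    // "defaultParams + transformed": overlay, with the transformed request params winning
    static Map<String, String> add(Map<String, String> defaults, Map<String, String> transformed) {
        Map<String, String> out = new LinkedHashMap<>(defaults);
        out.putAll(transformed);
        return out;
    }

    public static void main(String[] args) {
        Map<String, String> all = new LinkedHashMap<>();
        all.put("rows", "10");            // default from solrconfig.xml
        all.put("colors", "red");         // sent by the client
        Map<String, String> defaults = subtract(all, Collections.singleton("colors"));
        Map<String, String> merged = add(defaults, Collections.singletonMap("fq", "colors:red"));
        System.out.println(merged);       // prints {rows=10, fq=colors:red}
    }
}
```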







On Tue, Apr 10, 2012 at 11:46 AM, sam wrote:

> I would like to transform the following:
>
> /myhandler/?colors=red&colors=blue&materials=leather
>
> to a Query that is similar to:
> /select/?fq:colors:(red OR
> blue)&fq:materials:leather&facet=on&facet.field=  and various default
> query params.
>
> I tried to do this by providing QParserPlugin:
>   [solrconfig.xml snippet stripped by the archive: registers a queryParser named "myQueryParser"]
>
>
> In my.QueryParser (returned by my.QueryParserPlugin) I can:
> getReq().getParamString();
>
> and parse that string to get non-default SolrParams.
> And, build a Query object to my liking.
>
> Am I going about the right direction?  It'd be much easier if I could
> place a custom app proxying solr... But, I am required to expose solr to
> the public without middle layer between the browsers and solr.
>
> So, my idea is to reject all solr query params but handful of custom
> params such as colors, materials, ..   and build a Query object myself
> combining with default SolrParams specified in solrconfig.xml .
>
> Is this feasible?  I don't really want to parse the query string myself
> (returned by getReq().getParamString()) ..
> getReq().getParams()  returns DefaultSolrParams, which does not provide a way
> to get only the non-default params.
>
>
>
>


Re: Securing Solr with Tomcat

2012-04-10 Thread sam
http://wiki.apache.org/solr/SolrSecurity

Make sure you block query params such as qt=
https://issues.apache.org/jira/browse/SOLR-3161  is still open.

This could be useful, too:
http://www.nodex.co.uk/blog/12-03-12/installing-solr-debian-squeeze
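A reverse proxy in front of Solr is the usual shape of the fix; a minimal sketch (hypothetical nginx config, ports and paths are assumptions) that exposes only per-core /select and, while SOLR-3161 is open, also rejects qt=:

```nginx
# Hypothetical sketch: expose only /solr/<core>/select, deny everything else.
location ~ ^/solr/[^/]+/select {
    # While SOLR-3161 is open, qt= can reroute /select to another handler
    # (e.g. /update), so reject it outright.
    if ($args ~* "(^|&)qt=") { return 403; }
    proxy_pass http://127.0.0.1:8983;
}
location /solr/ {
    return 403;   # blocks /update, the admin pages, and anything not matched above
}
```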


On Tue, Apr 10, 2012 at 4:25 PM, solruser  wrote:

> Hi All,
> Our web application allows users to query directly from the browser, using
> Solr as a Tomcat application via the AJAX Solr library (using jsonp). I'm
> looking for ways to block internet users from directly updating the index
> or hitting the admin pages. I'd appreciate your input on this.
>
> Thanks in Anticipation.
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Securing-Solr-with-Tomcat-tp3900737p3900737.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: custom query string parsing?

2012-04-11 Thread sam
Yah, RequestHandler is much better. Thanks! I don't know why I started with
QParserPlugin and SearchComponent.


Even with my own RequestHandler that only passes down selected query
params, people can still get around it through the qt parameter:

  ?qt=/update&stream.update=*:*&commit=true

I think I am trying to solve two things at once: security and application
specific query parameter transformation.

Security will have to be handled elsewhere. Query parameter manipulation
can indeed be done by providing RequestHandler...
If I could just introduce a custom application, both will be solved quite
easily.
But, I am required to do all application development using Solr only
(through plugins and velocity templates).


Thanks.


On Tue, Apr 10, 2012 at 10:19 PM, Chris Hostetter
wrote:

>
> : Essentially, this is what I want to do  (I'm extending SearchComponent):
>
> the level of request manipulation you seem to be interested strikes me as
> something that you should do as a custom RequestHandler -- not a
> SearchComponent or a QParserPlugin.
>
> You can always subclass SearchHandler, and override the handleRequest
> method to manipulate the request however you want and then delegate to
> super.
>
>
> -Hoss
>


Re: custom query string parsing?

2012-04-11 Thread sam
Actually, /solr/mycore/myhandler/?qt=/update still uses my handler.

Only /solr/mycore/select/?qt=/update uses the update handler :P



On Wed, Apr 11, 2012 at 11:41 AM, sam wrote:

> Yah, RequestHandler is much better. Thanks! I don't know why I started
> with QParserPlugin and SearchComponent.
>
>
> Even with my own RequestHandler that only passes down selected query
> params, people can  still get around it through qt parameter:
>
>   ?qt=/update&stream.update=*:*&commit=true
>
> I think I am trying to solve two things at once: security and application
> specific query parameter transformation.
>
> Security will have to be handled elsewhere. Query parameter manipulation
> can indeed be done by providing RequestHandler...
> If I could just introduce a custom application, both will be solved quite
> easily.
> But, I am required to do all application development using Solr only
> (through plugins and velocity templates).
>
>
> Thanks.
>
>
>
> On Tue, Apr 10, 2012 at 10:19 PM, Chris Hostetter <
> hossman_luc...@fucit.org> wrote:
>
>>
>> : Essentially, this is what I want to do  (I'm extending SearchComponent):
>>
>> the level of request manipulation you seem to be interested strikes me as
>> something that you should do as a custom RequestHandler -- not a
>> SearchComponent or a QParserPlugin.
>>
>> You can always subclass SearchHandler, and override the handleRequest
>> method to manipulate the request however you want and then delegate to
>> super.
>>
>>
>> -Hoss
>>
>
>


hierarchical faceting?

2012-04-18 Thread sam
I have hierarchical colors:
[schema.xml field definition stripped by the archive: a stored, multiValued "colors" field of type text_path]
text_path is TextField with PathHierarchyTokenizerFactory as tokenizer.

Given these two documents,
Doc1: red
Doc2: red/pink

I want the result to be the following:
?fq=red
==> Doc1, Doc2

?fq=red/pink
==> Doc2

But, with PathHierarchyTokenizer, Doc1 is included for the query:
?fq=red/pink
==> Doc1, Doc2

How can I query for hierarchical facets?
http://wiki.apache.org/solr/HierarchicalFaceting describes facet.prefix..
But it looks too cumbersome to me.

Is there a simpler way to implement hierarchical facets?
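The behaviour follows from how PathHierarchyTokenizer expands a path into one token per ancestor; a small stand-alone Java sketch of that expansion (assuming "/" as the delimiter):

```java
import java.util.*;

public class PathTokens {
    // Mimic PathHierarchyTokenizer: "red/pink" -> ["red", "red/pink"]
    static List<String> tokenize(String path) {
        List<String> tokens = new ArrayList<>();
        int idx = -1;
        while ((idx = path.indexOf('/', idx + 1)) != -1) {
            tokens.add(path.substring(0, idx));   // each ancestor prefix is a token
        }
        tokens.add(path);                          // plus the full path itself
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("red"));       // prints [red]
        System.out.println(tokenize("red/pink"));  // prints [red, red/pink]
    }
}
```

If the query string red/pink is analyzed the same way, its token "red" matches Doc1 (indexed as just "red"), which is why Doc1 shows up; keeping the query side un-tokenized (e.g. KeywordTokenizer) leaves "red/pink" whole so only Doc2 matches.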


Re: hierarchical faceting?

2012-04-18 Thread sam
Yah, that's exactly what PathHierarchyTokenizer does.
[schema.xml fieldType definition stripped by the archive: text_path using PathHierarchyTokenizerFactory]

I think I have a query time tokenizer that tokenizes at /

?q=colors:red
==> Doc1, Doc2

?q=colors:redfoobar
==>

?q=colors:red/foobarasdfoaijao
==> Doc1, Doc2



On Wed, Apr 18, 2012 at 11:10 AM, Darren Govoni  wrote:

> Put the parent term in all the child documents at index time
> and the re-issue the facet query when you expand the parent using the
> parent's term. works perfect.
>
> On Wed, 2012-04-18 at 10:56 -0400, sam wrote:
> > I have hierarchical colors:
> > [schema.xml field definition stripped by the archive]
> > text_path is TextField with PathHierarchyTokenizerFactory as tokenizer.
> >
> > Given these two documents,
> > Doc1: red
> > Doc2: red/pink
> >
> > I want the result to be the following:
> > ?fq=red
> > ==> Doc1, Doc2
> >
> > ?fq=red/pink
> > ==> Doc2
> >
> > But, with PathHierarchyTokenizer, Doc1 is included for the query:
> > ?fq=red/pink
> > ==> Doc1, Doc2
> >
> > How can I query for hierarchical facets?
> > http://wiki.apache.org/solr/HierarchicalFaceting describes
> facet.prefix..
> > But it looks too cumbersome to me.
> >
> > Is there a simpler way to implement hierarchical facets?
>
>
>


Re: hierarchical faceting?

2012-04-18 Thread sam
It looks like TextField is the problem.

This fixed it:
[schema.xml fieldType definition stripped by the archive]


I am assuming the text_path fields won't include whitespace characters.

?q=colors:red/pink
==> Doc2   (Doc1, which has colors = red isn't included!)


Is there a tokenizer that tokenizes the string as one token?
I tried to extend Tokenizer myself but it fails:
public class AsIsTokenizer extends Tokenizer {
    @Override
    public boolean incrementToken() throws IOException {
        // returning true forever never ends the token stream (and sets no term);
        // returning false emits no token at all
        return true; // or false;
    }
}


On Wed, Apr 18, 2012 at 11:33 AM, sam wrote:

> Yah, that's exactly what PathHierarchyTokenizer does.
> [schema.xml fieldType definition stripped by the archive: text_path with positionIncrementGap="100"]
>
> I think I have a query time tokenizer that tokenizes at /
>
> ?q=colors:red
> ==> Doc1, Doc2
>
> ?q=colors:redfoobar
> ==>
>
> ?q=colors:red/foobarasdfoaijao
> ==> Doc1, Doc2
>
>
>
>
> On Wed, Apr 18, 2012 at 11:10 AM, Darren Govoni wrote:
>
>> Put the parent term in all the child documents at index time
>> and the re-issue the facet query when you expand the parent using the
>> parent's term. works perfect.
>>
>> On Wed, 2012-04-18 at 10:56 -0400, sam wrote:
>> > I have hierarchical colors:
>> > [schema.xml field definition stripped by the archive]
>> > text_path is TextField with PathHierarchyTokenizerFactory as tokenizer.
>> >
>> > Given these two documents,
>> > Doc1: red
>> > Doc2: red/pink
>> >
>> > I want the result to be the following:
>> > ?fq=red
>> > ==> Doc1, Doc2
>> >
>> > ?fq=red/pink
>> > ==> Doc2
>> >
>> > But, with PathHierarchyTokenizer, Doc1 is included for the query:
>> > ?fq=red/pink
>> > ==> Doc1, Doc2
>> >
>> > How can I query for hierarchical facets?
>> > http://wiki.apache.org/solr/HierarchicalFaceting describes
>> facet.prefix..
>> > But it looks too cumbersome to me.
>> >
>> > Is there a simpler way to implement hierarchical facets?
>>
>>
>>
>


can I use different tokenizer/analyzer for facet count query?

2012-04-25 Thread sam
I have the following in schema.xml:
[schema.xml snippet stripped by the archive: a text fieldType using PathHierarchyTokenizerFactory with delimiter="$", plus a stored, multiValued "colors" field]
And, I have the following doc:
[add-doc XML stripped by the archive: colors = blues$Teal/Turquoise]

Response of the query:
http://localhost:8983/solr/select/?q=*:*&facet=true&facet.field=colors&rows=100
is:
[facet response XML stripped by the archive: "blues" and "blues$Teal/Turquoise" each appear with count 1]

During index,  blues$Teal/Turquoise  is tokenized into:
blues
blues$Teal/Turquoise

I think that's why facet count includes both blues and blues$Teal/Turquoise.

Can I have facet count only include the whole keyword,
blues$Teal/Turquoise,  not blues?


Re: can I use different tokenizer/analyzer for facet count query?

2012-04-25 Thread sam
From wiki:
http://wiki.apache.org/solr/SimpleFacetParameters

If you want both Analysis (for searching) and Faceting on the full literal
Strings, *use copyField *to create two versions of the field: one Text and
one String. Make sure both are indexed="true"

Is that the only way? Do I need to have another field of type String? I'm
using KeywordTokenizer for query...
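The copyField arrangement the wiki describes would look roughly like this (a sketch; the colors_facet name and types are assumptions):

```xml
<!-- analyzed field, used for searching -->
<field name="colors" type="text_path" indexed="true" stored="true" multiValued="true"/>
<!-- verbatim copy, used for faceting on the full literal string -->
<field name="colors_facet" type="string" indexed="true" stored="false" multiValued="true"/>
<copyField source="colors" dest="colors_facet"/>
```

Faceting then uses facet.field=colors_facet, so only whole values like blues$Teal/Turquoise are counted.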

On Wed, Apr 25, 2012 at 10:41 AM, sam wrote:

> I have the following in schema.xml:
> [schema.xml snippet stripped by the archive: fieldType with PathHierarchyTokenizerFactory delimiter="$" and a stored, multiValued "colors" field]
>
>
>
> And, I have the following doc:
> [add-doc XML stripped by the archive: colors = blues$Teal/Turquoise]
>
>
>
> Response of the query:
> http://localhost:8983/solr/select/?q=*:*&facet=true&facet.field=colors&rows=100
> is:
> [facet response XML stripped by the archive: "blues" and "blues$Teal/Turquoise" each appear with count 1]
>
>
>
>
> During index,  blues$Teal/Turquoise  is tokenized into:
> blues
> blues$Teal/Turquoise
>
> I think that's why facet count includes both blues and
> blues$Teal/Turquoise.
>
> Can I have facet count only include the whole keyword,
> blues$Teal/Turquoise,  not blues?
>
>
>


Re: hierarchical faceting?

2012-05-01 Thread sam
yup.

[schema.xml snippet stripped by the archive: adds a string-typed colors_facet field populated from colors via copyField]

and ?facet.field=colors_facet



On Mon, Apr 30, 2012 at 9:35 PM, Chris Hostetter
wrote:

>
> : Is there a tokenizer that tokenizes the string as one token?
>
> Using KeywordTokenizer at query time should do what you want.
>
>
> -Hoss
>


Solr 7.7.1 indexing failing with analysis error: startOffset must be non-negative, and endOffset must be >= startOffset, and offsets must not go backwards

2019-05-27 Thread SAM
Indexing a message on Solr 7.7.1 is failing with the following error; any
help is appreciated. Attaching schema files.

2019-05-24 19:32:42.010 ERROR (qtp1115201599-17) [c:bn_sample s:shard1
r:core_node2 x:bn_sample_shard1_replica_n1] o.a.s.h.RequestHandlerBase
org.apache.solr.common.SolrException: Exception writing document id 1
to the index; possible analysis error: startOffset must be
non-negative, and endOffset must be >= startOffset, and offsets must
not go backwards startOffset=1,endOffset=3,lastStartOffset=6721 for
field 'message_text'
at 
org.apache.solr.update.DirectUpdateHandler2.addDoc(DirectUpdateHandler2.java:243)
at 
org.apache.solr.update.processor.RunUpdateProcessor.processAdd(RunUpdateProcessorFactory.java:67)
at 
org.apache.solr.update.processor.UpdateRequestProcessor.processAdd(UpdateRequestProcessor.java:55)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.doLocalAdd(DistributedUpdateProcessor.java:1001)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.versionAdd(DistributedUpdateProcessor.java:1222)
at 
org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:693)
at 
org.apache.solr.update.processor.LogUpdateProcessorFactory$LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:103)
at org.apache.solr.handler.loader.JavabinLoader$1.update(JavabinLoader.java:110)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readOuterMostDocIterator(JavaBinUpdateRequestCodec.java:327)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readIterator(JavaBinUpdateRequestCodec.java:280)
at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:333)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:278)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec$StreamingCodec.readNamedList(JavaBinUpdateRequestCodec.java:235)
at org.apache.solr.common.util.JavaBinCodec.readObject(JavaBinCodec.java:298)
at org.apache.solr.common.util.JavaBinCodec.readVal(JavaBinCodec.java:278)
at org.apache.solr.common.util.JavaBinCodec.unmarshal(JavaBinCodec.java:191)
at 
org.apache.solr.client.solrj.request.JavaBinUpdateRequestCodec.unmarshal(JavaBinUpdateRequestCodec.java:126)
at 
org.apache.solr.handler.loader.JavabinLoader.parseAndLoadDocs(JavabinLoader.java:123)
at org.apache.solr.handler.loader.JavabinLoader.load(JavabinLoader.java:70)
at 
org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:97)
at 
org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:68)
at 
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:199)
at org.apache.solr.core.SolrCore.execute(SolrCore.java:2551)
at org.apache.solr.servlet.HttpSolrCall.execute(HttpSolrCall.java:710)
at org.apache.solr.servlet.HttpSolrCall.call(HttpSolrCall.java:516)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:395)
at 
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:341)
at 
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1602)
at org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:540)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:146)
at org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:548)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:257)
at 
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:1588)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextHandle(ScopedHandler.java:255)
at 
org.eclipse.jetty.server.handler.ContextHandler.doHandle(ContextHandler.java:1345)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:203)
at org.eclipse.jetty.servlet.ServletHandler.doScope(ServletHandler.java:480)
at 
org.eclipse.jetty.server.session.SessionHandler.doScope(SessionHandler.java:1557)
at 
org.eclipse.jetty.server.handler.ScopedHandler.nextScope(ScopedHandler.java:201)
at 
org.eclipse.jetty.server.handler.ContextHandler.doScope(ContextHandler.java:1247)
at org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:144)
at 
org.eclipse.jetty.server.handler.ContextHandlerCollection.handle(ContextHandlerCollection.java:220)
at 
org.eclipse.jetty.server.handler.HandlerCollection.handle(HandlerCollection.java:126)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at 
org.eclipse.jetty.rewrite.handler.RewriteHandler.handle(RewriteHandler.java:335)
at 
org.eclipse.jetty.server.handler.HandlerWrapper.handle(HandlerWrapper.java:132)
at org.eclipse.jetty.server.Server.handle(Server.java:502)
at org.eclipse.jetty.server.HttpChannel.handle(HttpChannel.java:364)
at org.eclipse.jetty.server.HttpConnection.onFillable(Http

SolrCloud Config file

2016-04-11 Thread Sam Xia
Hi,

I installed Solr 5.5 in my test server but was having issue updating the 
solrconfig.xml.

Solr is installed in /locm/solr-5.5.0/ folder

1) First I create a topic collection with the following command:

bin/solr create -c topic -d topic_configs_ori

But there is no folder named topic in 
/locm/solr-5.5.0/server/solr/configsets/topic after the above command.

The issue is that when I check the configuration file in the Solr admin, the 
correct solrconfig.xml is not updated to the one in 
/locm/solr-5.5.0/server/solr/configsets/topic_configs_ori. Actually it looked 
to me like the default config files are used.

2) Then I run the following command to try to update

./zkcli.sh -cmd upconfig -zkhost localhost:9983 -confname topic -solrhome 
/locm/solr-5.5.0/ -confdir 
/locm/solr-5.5.0/server/solr/configsets/topic_configs_ori/conf

I got the following error:

./zkcli.sh -cmd upconfig -zkhost localhost:9983 -confname topic -solrhome 
/locm/solr-5.5.0/ -confdir 
/locm/solr-5.5.0/server/solr/configsets/topic_configs_ori/conf

Exception in thread "main" java.io.IOException: Error uploading file 
/locm/solr-5.5.0/server/solr/configsets/topic_configs_ori/conf/get-pip.py to 
zookeeper path /configs/topic/get-pip.py

at 
org.apache.solr.common.cloud.ZkConfigManager$1.visitFile(ZkConfigManager.java:69)

at 
org.apache.solr.common.cloud.ZkConfigManager$1.visitFile(ZkConfigManager.java:59)

at java.nio.file.FileTreeWalker.walk(FileTreeWalker.java:135)

at java.nio.file.FileTreeWalker.walk(FileTreeWalker.java:199)

at java.nio.file.FileTreeWalker.walk(FileTreeWalker.java:69)

at java.nio.file.Files.walkFileTree(Files.java:2602)

at java.nio.file.Files.walkFileTree(Files.java:2635)

at 
org.apache.solr.common.cloud.ZkConfigManager.uploadToZK(ZkConfigManager.java:59)

at 
org.apache.solr.common.cloud.ZkConfigManager.uploadConfigDir(ZkConfigManager.java:121)

at org.apache.solr.cloud.ZkCLI.main(ZkCLI.java:222)

Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: 
KeeperErrorCode = ConnectionLoss for /configs/topic/get-pip.py

at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)

at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)

at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)

at org.apache.solr.common.cloud.SolrZkClient$10.execute(SolrZkClient.java:501)

at 
org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:60)

at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:498)

at org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:408)

at 
org.apache.solr.common.cloud.ZkConfigManager$1.visitFile(ZkConfigManager.java:67)

... 9 more

Please help me as this seems to be very basic. But I followed the document in:
https://cwiki.apache.org/confluence/display/solr/Using+ZooKeeper+to+Manage+Configuration+Files

Is this a bug or am I missing anything? Thanks





Re: SolrCloud Config file

2016-04-11 Thread Sam Xia
I tried solr-6.0 and was able to see the same issue. Please help. Thanks





On 4/11/16, 3:59 PM, "Sam Xia"  wrote:

>Hi,
>
>I installed Solr 5.5 in my test server but was having issue updating the 
>solrconfig.xml.
>
>Solr is installed in /locm/solr-5.5.0/ folder
>
>1) First I create a topic collection with the following command:
>
>bin/solr create -c topic -d topic_configs_ori
>
>But there is no folder named topic in 
>/locm/solr-5.5.0/server/solr/configsets/topic after the above command.
>
>The issue is that when I check the configuration file in Solr admin, the 
>correct solrconfig.xml is not updated to the one in 
>/locm/solr-5.5.0/server/solr/configsets/topic_configs_ori. Actually it 
>looked to me that the default config files are used.
>
>2) Then I run the following command to try to update
>
>./zkcli.sh -cmd upconfig -zkhost localhost:9983 -confname topic -solrhome 
>/locm/solr-5.5.0/ -confdir 
>/locm/solr-5.5.0/server/solr/configsets/topic_configs_ori/conf
>
>I got the following error:
>
>./zkcli.sh -cmd upconfig -zkhost localhost:9983 -confname topic -solrhome 
>/locm/solr-5.5.0/ -confdir 
>/locm/solr-5.5.0/server/solr/configsets/topic_configs_ori/conf
>
>Exception in thread "main" java.io.IOException: Error uploading file 
>/locm/solr-5.5.0/server/solr/configsets/topic_configs_ori/conf/get-pip.py 
>to zookeeper path /configs/topic/get-pip.py
>
>at 
>org.apache.solr.common.cloud.ZkConfigManager$1.visitFile(ZkConfigManager.j
>ava:69)
>
>at 
>org.apache.solr.common.cloud.ZkConfigManager$1.visitFile(ZkConfigManager.j
>ava:59)
>
>at java.nio.file.FileTreeWalker.walk(FileTreeWalker.java:135)
>
>at java.nio.file.FileTreeWalker.walk(FileTreeWalker.java:199)
>
>at java.nio.file.FileTreeWalker.walk(FileTreeWalker.java:69)
>
>at java.nio.file.Files.walkFileTree(Files.java:2602)
>
>at java.nio.file.Files.walkFileTree(Files.java:2635)
>
>at 
>org.apache.solr.common.cloud.ZkConfigManager.uploadToZK(ZkConfigManager.ja
>va:59)
>
>at 
>org.apache.solr.common.cloud.ZkConfigManager.uploadConfigDir(ZkConfigManag
>er.java:121)
>
>at org.apache.solr.cloud.ZkCLI.main(ZkCLI.java:222)
>
>Caused by: org.apache.zookeeper.KeeperException$ConnectionLossException: 
>KeeperErrorCode = ConnectionLoss for /configs/topic/get-pip.py
>
>at org.apache.zookeeper.KeeperException.create(KeeperException.java:99)
>
>at org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
>
>at org.apache.zookeeper.ZooKeeper.create(ZooKeeper.java:783)
>
>at 
>org.apache.solr.common.cloud.SolrZkClient$10.execute(SolrZkClient.java:501
>)
>
>at 
>org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.ja
>va:60)
>
>at 
>org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:498)
>
>at 
>org.apache.solr.common.cloud.SolrZkClient.makePath(SolrZkClient.java:408)
>
>at 
>org.apache.solr.common.cloud.ZkConfigManager$1.visitFile(ZkConfigManager.j
>ava:67)
>
>... 9 more
>
>Please help me as this seems to be very basic. But I followed the 
>document in:
>https://cwiki.apache.org/confluence/display/solr/Using+ZooKeeper+to+Manage
>+Configuration+Files
>
>Is this a bug or am I missing anything? Thanks
>
>
>


Re: SolrCloud Config file

2016-04-11 Thread Sam Xia
Thanks Shawn.

Where is the path of the topic collection's zookeeper config file? Here is 
what the wiki says (see below). But I was not able to find configs/topic 
anywhere in the installation folder. 

"The create command will upload a copy of the data_driven_schema_configs 
configuration directory to ZooKeeper under /configs/mycollection. Refer to 
the Solr Start Script Reference 
<https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Reference> 
page for more details about the create command for creating 
collections."



Here is the command that I run to verify zookeeper is on port 8983. BTW, 
I did not modify anything and the Solr install is clean, so I do not know 
why Python is used in the script. The error looks to me like the config 
folder was not created by the first command, so when you try to update it, 
it gets an IO error.

./solr status

Found 2 Solr nodes: 

Solr process 30976 running on port 7574
{
  "solr_home":"/locm/solr-6.0.0/example/cloud/node2/solr",
  "version":"6.0.0 48c80f91b8e5cd9b3a9b48e6184bd53e7619e7e3 - nknize - 
2016-04-01 14:41:49",
  "startTime":"2016-04-11T23:42:59.513Z",
  "uptime":"0 days, 0 hours, 51 minutes, 43 seconds",
  "memory":"93.2 MB (%19) of 490.7 MB",
  "cloud":{
"ZooKeeper":"localhost:9983",
"liveNodes":"2",
"collections":"2"}}


Solr process 30791 running on port 8983
{
  "solr_home":"/locm/solr-6.0.0/example/cloud/node1/solr",
  "version":"6.0.0 48c80f91b8e5cd9b3a9b48e6184bd53e7619e7e3 - nknize - 
2016-04-01 14:41:49",
  "startTime":"2016-04-11T23:42:54.041Z",
  "uptime":"0 days, 0 hours, 51 minutes, 49 seconds",
  "memory":"78.9 MB (%16.1) of 490.7 MB",
  "cloud":{
"ZooKeeper":"localhost:9983",
"liveNodes":"2",
"collections":"2"}}



If you run the following steps, you would be able to reproduce the issue 
every time.

Step 1) bin/solr start -e cloud -noprompt
Step 2) bin/solr create -c topic -d sample_techproducts_configs
Step 3) ./zkcli.sh -cmd upconfig -zkhost localhost:9983 -confname topic 
-solrhome /locm/solr-5.5.0/ -confdir 
/locm/solr-5.5.0/server/solr/configsets/sample_techproducts_configs/conf








On 4/11/16, 5:29 PM, "Shawn Heisey"  wrote:

>On 4/11/2016 4:59 PM, Sam Xia wrote:
>> Solr is installed in /locm/solr-5.5.0/ folder
>>
>> 1) First I create a topic collection with the following command:
>>
>> bin/solr create -c topic -d topic_configs_ori
>>
>> But there is no folder named topic in 
>>/locm/solr-5.5.0/server/solr/configsets/topic after the above command.
>
>This command does not change anything in configsets.  Since you are in
>cloud mode, it will copy that configset from the indicated directory
>(topic_configs_ori) to zookeeper, to a config named "topic" -- assuming
>that this config does not already exist in zookeeper.  If the named
>config already exists in zookeeper, then it will be used as-is, and not
>updated.  When not in cloud mode, it behaves a little differently, but
>still would not create anything in configsets.
>
>> I got the following error:
>>
>> ./zkcli.sh -cmd upconfig -zkhost localhost:9983 -confname topic 
>>-solrhome /locm/solr-5.5.0/ -confdir 
>>/locm/solr-5.5.0/server/solr/configsets/topic_configs_ori/conf
>>
>> Exception in thread "main" java.io.IOException: Error uploading file 
>>/locm/solr-5.5.0/server/solr/configsets/topic_configs_ori/conf/get-pip.py
>> to zookeeper path /configs/topic/get-pip.py
>
>
>
>> Caused by: 
>>org.apache.zookeeper.KeeperException$ConnectionLossException: 
>>KeeperErrorCode = ConnectionLoss for /configs/topic/get-pip.py
>
>The stacktrace from the "caused by" exception indicates that the zkcli
>command is trying to create the "/configs/topic/get-pip.py" path in the
>zookeeper database and is having a problem connecting to zookeeper.  Are
>you positive that "localhost:9983" is the correct connection string, and
>that there is an active zookeeper server listening on that port?  FYI:
>The embedded zookeeper server should not be used in production.
>
>Side issue:  I'm curious why you have a python script in your config. 
>Nothing explicitly wrong with that, it's just an odd thing to feed to a
>Java program like Solr.
>
>Thanks,
>Shawn
>


Re: SolrCloud Config file

2016-04-12 Thread Sam Xia
Thank you Shawn and Erick. It turns out there was a get-pip.py file in the 
configuration folder (the config files were copied from somewhere), which 
caused the misbehavior. After get-pip.py was removed, everything worked as 
expected. Thanks again.







On 4/11/16, 8:40 PM, "Erick Erickson"  wrote:

>Do note by the way that as of Solr 5.5, the bin/solr script has an
>option for uploading and downloading configsets. Try typing
>
>bin/solr zk -help
>
>Best,
>Erick
>
>On Mon, Apr 11, 2016 at 6:30 PM, Shawn Heisey  wrote:
>> On 4/11/2016 6:40 PM, Sam Xia wrote:
>>> Where is the path of topic collection zookeeper config file? Here is 
>>>from
>>> wiki (see below). But I was not able to find configs/topic anywhere in 
>>>the
>>> installation folder.
>>
>> The /configs/topic path is *inside the zookeeper database*.  It is not a
>> path on the filesystem at all.  Zookeeper is a separate Apache project
>> that Solr happens to use when running in cloud mode.
>>
>> http://zookeeper.apache.org/
>>
>>> "The create command will upload a copy of the 
>>>data_driven_schema_configs
>>> configuration directory to ZooKeeper under /configs/mycollection. 
>>>Refer to
>>> the Solr Start Script Reference
>>> 
>>><https://cwiki.apache.org/confluence/display/solr/Solr+Start+Script+Refe
>>>ren
>>> ce> page for more details about the create command for creating
>>> collections.”
>>>
>>> Here is the command that I run and verify zookeeper is in port 8983. 
>>>BTW,
>>> I did not modify anything and the Solr is a clean install so I do not 
>>>know
>>> why Python is used in the script. The error looks to me that the config
>>> folder was not created at first command. So when you try to update it, 
>>>it
>>> gets an IO error.
>>>
>>> ./solr status
>>>
>>> Found 2 Solr nodes:
>>>
>>> Solr process 30976 running on port 7574
>>> {
>>>   "solr_home":"/locm/solr-6.0.0/example/cloud/node2/solr",
>>>   "version":"6.0.0 48c80f91b8e5cd9b3a9b48e6184bd53e7619e7e3 - nknize -
>>> 2016-04-01 14:41:49",
>>>   "startTime":"2016-04-11T23:42:59.513Z",
>>>   "uptime":"0 days, 0 hours, 51 minutes, 43 seconds",
>>>   "memory":"93.2 MB (%19) of 490.7 MB",
>>>   "cloud":{
>>> "ZooKeeper":"localhost:9983",
>>> "liveNodes":"2",
>>> "collections":"2"}}
>>>
>>>
>>> Solr process 30791 running on port 8983
>>> {
>>>   "solr_home":"/locm/solr-6.0.0/example/cloud/node1/solr",
>>>   "version":"6.0.0 48c80f91b8e5cd9b3a9b48e6184bd53e7619e7e3 - nknize -
>>> 2016-04-01 14:41:49",
>>>   "startTime":"2016-04-11T23:42:54.041Z",
>>>   "uptime":"0 days, 0 hours, 51 minutes, 49 seconds",
>>>   "memory":"78.9 MB (%16.1) of 490.7 MB",
>>>   "cloud":{
>>> "ZooKeeper":"localhost:9983",
>>> "liveNodes":"2",
>>> "collections":"2"}}
>>
>> 8983 is a *Solr* port.  The default embedded zookeeper port is the first
>> Solr port in the cloud example plus 1000, so it usually ends up being 9983.
>>
>>> If you run the following steps, you would be able to reproduce the issue
>>> every time.
>>>
>>> Step 1) bin/solr start -e cloud -noprompt
>>> Step 2) bin/solr create -c topic -d sample_techproducts_configs
>>> Step 3) ./zkcli.sh -cmd upconfig -zkhost localhost:9983 -confname topic
>>> -solrhome /locm/solr-5.5.0/ -confdir
>>> 
>>>/locm/solr-5.5.0/server/solr/configsets/sample_techproducts_configs/conf
>>
>> The "-solrhome" option is not something you need.  I have no idea what
>> it will do, but it is not one of the options for upconfig.
>>
>> I tried this (on Windows) and I'm getting a different problem on the
>> upconfig command trying to connect to zookeeper:
>>
>> https://www.dropbox.com/s/c65zmkhd0le6mzv/upconfig-error.png?dl=0
>>
>> Trying again on Linux, I had zero problems with the commands you used,
>> changing only minor details for the upconfig command (things are in a
>> different place, and I didn't use the unnecessary -solrhome option):
>>
>> https://www.dropbox.com/s/edoa07anmkkep0l/xia-recreate1.png?dl=0
>> https://www.dropbox.com/s/ad5ukuvfvlgwq0z/xia-recreate2.png?dl=0
>> https://www.dropbox.com/s/ay1u3jjuwy5t52s/xia-recreate3.png?dl=0
>>
>> Your stated commands indicate 5.5.0, but the JSON status information
>> above and the paths they contain indicate that it is 6.0.0 that is
>> responding.  I will have to try 6.0.0 later.
>>
>> If nothing has changed, then "get-pip.py" would not be there.  There
>> isn't a configset named "topic_configs_ori" included with Solr, not even
>> in the 6.0.0 version.  This came from somewhere besides the Solr website.
>>
>> Thanks,
>> Shawn
>>


Re: Indexing database in Solr using Data Import Handler

2014-07-17 Thread Sam Barber
Hi,

You have the wrong varname in your sub query.

select favouritedby from filefav where id='${filemetadata.id}'

should be:

select favouritedby from filefav where id='${restaurant.id}'
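
For context, a DIH sub-entity can only reference columns of its enclosing entity via ${parentEntityName.columnName}. A minimal sketch of the nesting this implies (the entity names and the parent query are assumptions, since the original data-config was not quoted in full):

```xml
<entity name="restaurant" query="select id from filemetadata">
  <!-- ${restaurant.id} refers to the parent entity's *name*, not the table name -->
  <entity name="favourites"
          query="select favouritedby from filefav where id='${restaurant.id}'"/>
</entity>
```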


4.6 Core Discovery coreRootDirectory not working

2014-01-29 Thread Sam Batschelet
Hello, this is my first post to your group. I am in the process of setting up a 
development environment using Solr. We will require multiple cores managed by 
multiple users in the following layout. I am running a fairly vanilla version of 
4.6.


/home/camp/example/solr/solr.xml


/home/user1/solr/core.properties
/home/user2/solr/core.properties

If I manually add the core from the admin UI everything works fine and I can index 
etc., but when I kill the server the core information is no longer available. I 
need to delete the core.properties file and recreate the core from the admin UI.

I have since learned that this should be done with Core Discovery, mainly by 
setting coreRootDirectory, which logically in this case should be /home. But Solr 
is not finding the core even if I point at the directory directly, i.e. 
/home/user1/solr/ or /home/user1/. I must be missing another config and was 
hoping for some insight.


## solr.xml

<solr>

  <solrcloud>
    <str name="host">${host:}</str>
    <int name="hostPort">${jetty.port:8883}</int>
    <str name="hostContext">${hostContext:solr}</str>
    <int name="zkClientTimeout">${zkClientTimeout:15000}</int>
    <bool name="genericCoreNodeNames">${genericCoreNodeNames:true}</bool>
  </solrcloud>

  <shardHandlerFactory name="shardHandlerFactory" class="HttpShardHandlerFactory">
    <int name="socketTimeout">${socketTimeout:0}</int>
    <int name="connTimeout">${connTimeout:0}</int>
  </shardHandlerFactory>

</solr>



Thanks
-Sam
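
For reference, the 4.x core discovery setup being described normally combines a coreRootDirectory element in solr.xml with a core.properties file per core. A sketch only, with the path taken from the layout above and the element name assumed from the core discovery docs:

```xml
<!-- solr.xml (sketch): discovery walks this root looking for core.properties -->
<solr>
  <str name="coreRootDirectory">/home</str>
</solr>
```

Each directory under that root containing a core.properties (with at least a name=... entry) would then be discovered and loaded at startup.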

 



Re: 4.6 Core Discovery coreRootDirectory not working

2014-01-29 Thread Sam Batschelet
On Jan 29, 2014, at 4:31 PM, Sam Batschelet wrote:

> Hello this is my 1st post to you group I am in the process of setting up a 
> development environment using solr.  We will require multiple cores managed 
> by multiple users in the following layout.  I am running a fairly vanilla 
> version of 4.6
> 
> 
> /home/camp/example/solr/solr.xml
> 
> 
> /home/user1/solr/core.properties
> /home/user2/solr/core.properties
> 
> If I manually add the core from admin everything works fine I can index etc 
> but when I kill the server the core information is no longer available.  I 
> need to delete the core.properties file and recreate core from admin.
> 
> I since have learned that this should be done with Core Discovery.  Mainly 
> setting coreRootDirectory which logically in this case should be /home.  But 
> solr is not finding the core even if I set the directory directly. ie 
> /home/user1/solr/ or /home/user1/.  I must be missing another config and was 
> hoping for some insight.
> 
> 
> ## solr.xml
> 
>  

Just to point out the obvious before I get 20 responses saying so: I did test this 
with the comment removed :).

An issue with atomic updates?

2013-06-28 Thread Sam Antique
Hi all,

I think I have found an issue (or at least misleading behavior) with
atomic updates.

If I do an atomic update on a field and the operation is nonsense
(anything other than add, set, inc), it still returns success. Say I send:

/update/json?commit=true -d '[{"id":"...", "field1":{"add":"value"}}]'

it adds fine and returns success. But if I continue with:

/update/json?commit=true -d '[{"id":"...",
"field1":{"none-sense":"value"}}]'

it still returns status:0, which is a bit misleading.

Is this a known issue?

Thanks,
Sam
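
A cheap client-side guard illustrates the point: since Solr returns status 0 regardless, the sender can validate the operation names before posting. This is a sketch only; the set of valid operation names (add, set, inc) is taken from the message above, not from any particular Solr version:

```python
# Guard against atomic-update ops that would otherwise be silently ignored.
VALID_OPS = {"add", "set", "inc"}

def check_atomic_update(doc):
    """Raise ValueError if any field value uses an unknown atomic-update op."""
    for field, value in doc.items():
        if not isinstance(value, dict):
            continue  # plain field value, not an atomic-op map
        for op in value:
            if op not in VALID_OPS:
                raise ValueError(f"unknown atomic op {op!r} on field {field!r}")
    return doc

check_atomic_update({"id": "1", "field1": {"add": "value"}})  # passes
try:
    check_atomic_update({"id": "1", "field1": {"none-sense": "value"}})
except ValueError as e:
    print(e)
```

Running the guard before sending each update would have turned the silent status:0 into an explicit client-side error.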


Re: lang.fallback doesn't work when using lang.fallbackFields

2013-07-28 Thread Sam Dillingham
unsubscribe


On Sun, Jul 28, 2013 at 5:59 PM, Jan Høydahl  wrote:

> Hi,
>
> Looking at the code, you are right. Whitelist processing is only done on
> detected languages, not on the fallback or fallbackFields languages, since
> these are assumed to be correct. Thus you should not pass in a fallback
> language, either in the input document or with langid.fallback which cannot
> be handled by your schema.
>
> This is by design. However, I can also see an argument for making
> fallbackFields subject to whitelist logic, especially if you do not control
> the application that populates this field, to safeguard against exception.
> Also, such a change woudl not harm any of the existing functionality, so it
> would be safe to introduce.
>
> Feel free to write a JIRA issue for it.
>
> A workaround could be to write a simple UpdateProcessor which removes any
> illegal value from langid.fallbackFields before the LangId processor.
>
> --
> Jan Høydahl, search solution architect
> Cominvent AS - www.cominvent.com
>
> 7. juli 2013 kl. 18:05 skrev adfel70 :
>
> > Hi
> > I'm trying to index a set of documents with solr's language detection
> > component.
> > I set
> > user_lan
> > en,it
> > en
> >
> > In some documents user_lan has 'sk', solr falls-back to 'sk' ,which is
> not
> > in the whitelist, and instead of falling back to 'en' as stated  here
> > <http://wiki.apache.org/solr/LanguageDetection#langid.fallbackFields>
>  , I
> > get an excpetion regarding not having a text_sk field in the schema.
> >
> > Anyone encountered this behavior?
> >
> > thanks.
> >
> >
> >
> > --
> > View this message in context:
> http://lucene.472066.n3.nabble.com/lang-fallback-doesn-t-work-when-using-lang-fallbackFields-tp4076048.html
> > Sent from the Solr - User mailing list archive at Nabble.com.
>
>


-- 
Sam


filter query by string length or word count?

2013-05-22 Thread Sam Lee
I have schema.xml

...
how can I query docs whose body has more than 80 words (or 80 characters) ?
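
Solr has no built-in filter on text length, so the usual workaround is to compute a numeric length field on the way in and filter on that. A minimal client-side sketch (the field names body and body_word_count are assumptions, not from the schema above):

```python
def with_word_count(doc, source_field="body", count_field="body_word_count"):
    """Return a copy of doc with a whitespace word count added; index
    body_word_count as an integer field and filter on it at query time."""
    doc = dict(doc)
    doc[count_field] = len(doc.get(source_field, "").split())
    return doc

doc = with_word_count({"id": "1", "body": "only three words"})
print(doc["body_word_count"])  # 3
# query-time filter would then be: fq=body_word_count:[80 TO *]
```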


basic solr cloud questions

2011-09-27 Thread Sam Jiang
Hi all

I'm a relatively new solr user, and recently I discovered the interesting
solr cloud feature. I have some basic questions:
(please excuse me if I get the terminologies wrong)

- from my understanding, this is still a work in progress. How mature is it?
Is there any estimate on the official release?

- has the solr_cluster.properties configuration been implemented? it's
mentioned in http://wiki.apache.org/solr/NewSolrCloudDesign. I was trying to
play with it a bit but I couldn't find the file.

- I tried to set up a two-node, 1-shard cluster, e.g. active-active solr
with fault tolerance (this isn't possible with the old replication feature,
right?). I have both instances of solr configured to use , and I started each
instance with its own instance of zookeeper to form an ensemble. From the
zookeeper admin page, I can see both nodes under shard1. I can submit
documents fine. However, when I do a search, it appears that only one node
has the submitted documents (e.g. if I keep refreshing, I get different
results depending on which node gets assigned the work). My search url is
http://localhost:8983/solr/collection1/select?distrib=true&q=*.*. Did I miss
something?

thanks


Re: jetty update

2011-04-13 Thread Sam Granieri
I found this link after googling for a few minutes.
http://wiki.eclipse.org/Jetty/Howto/Upgrade_from_Jetty_6_to_Jetty_7

I hope that helps
Also, a question like this may be more appropriate for a jetty mailing list.

On Wed, Apr 13, 2011 at 8:44 AM, ramires  wrote:
> hi
>
>  how to update jetty 6 to jetty 7 ?
>
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/jetty-update-tp2816084p2816084.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


Re: jetty update

2011-04-13 Thread Sam Granieri
Is your current solr installation with Jetty 6 working well for you in
a production environment?

I dont know enough about Jetty to help you further on this question.
On Wed, Apr 13, 2011 at 10:47 AM, stockii  wrote:
> is it necessary to update for solr ?
>
> -
> --- System 
> 
>
> One Server, 12 GB RAM, 2 Solr Instances, 7 Cores,
> 1 Core with 31 Million Documents other Cores < 100.000
>
> - Solr1 for Search-Requests - commit every Minute  - 5GB Xmx
> - Solr2 for Update-Request  - delta every Minute - 4GB Xmx
> --
> View this message in context: 
> http://lucene.472066.n3.nabble.com/jetty-update-tp2816084p2816650.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>


LUCENE-2899 patch, OpenNLPTokenizer compile error

2012-07-19 Thread sam wu
I am following the instructions at
http://wiki.apache.org/solr/OpenNLP to test the OpenNLP/Solr integration:

1. pull the 4.0 branch from trunk
2. apply the LUCENE-2899 patch
(there are several LUCENE-2899 patch files; I took the 385KB one from
02/Jul/12 08:05. I should only apply that one, correct?)
3. ant compile

and get the following errors:
---

 [javac] warning: [options] bootstrap class path not set in conjunction
with -source 1.6
[javac]
/home/swu/newproject/lucene_4x/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/OpenNLPTokenizer.java:170:
error: method reset in class TokenStream cannot be applied to given types;
[javac] super.reset(input);
[javac]  ^
[javac]   required: no arguments
[javac]   found: Reader
[javac]   reason: actual and formal argument lists differ in length
[javac]
/home/swu/newproject/lucene_4x/lucene/analysis/opennlp/src/java/org/apache/lucene/analysis/opennlp/OpenNLPTokenizer.java:168:
error: method does not override or implement a method from a supertype
[javac]   @Override
[javac]   ^
[javac] 2 errors
[javac] 2 warnings

BUILD FAILED


---

I am running Java 1.7. Does the patch only work with Java 1.6, or am I doing
something wrong?


Thanks


Sam


Re: Solr Swap Function doesn't work when using Solr Cloud Beta

2012-09-20 Thread sam fang
Hi Hoss,

Thanks for your quick reply.

Below is my solr.xml configuration; persistent is already set to true.

The contents of test1 and test1-ondeck were just copied from
example/solr/collection1.

Then I published 1 record to test1 and queried; it's ok at this point.

INFO: [test1] webapp=/solr path=/select
params={distrib=false&wt=javabin&rows=10&version=2&fl=id,score&df=text&NOW=1348195088691&shard.url=host1:18000/solr/test1/&start=0&q=*:*&isShard=true&fsv=true}
hits=1 status=0 QTime=1
Sep 20, 2012 10:38:08 PM org.apache.solr.core.SolrCore execute
INFO: [test1] webapp=/solr path=/select
params={ids=SOLR1000&distrib=false&wt=javabin&rows=10&version=2&df=text&NOW=1348195088691&shard.url=
host1:18000/solr/test1/&q=*:*&isShard=true} status=0 QTime=1
Sep 20, 2012 10:38:08 PM org.apache.solr.core.SolrCore execute
INFO: [test1] webapp=/solr path=/select params={q=*:*&wt=python} status=0
QTime=20


Then I used the core admin console page to swap, and clicked reload for test1 and
test1-ondeck. If I keep refreshing the query page, it sometimes gives 1 record
and sometimes 0 records. I also found the shard.url is different from the log of
the search before the swap:
it is now shard.url=host1:18000/solr/test1-ondeck/|host1:18000/solr/test1/.

This query returns 0:
Sep 20, 2012 10:41:32 PM org.apache.solr.core.SolrCore execute
INFO: [test1] webapp=/solr path=/select
params={fl=id,score&df=text&NOW=1348195292608&shard.url=host1:18000/solr/test1-ondeck/|
host1:18000/solr/test1/&start=0&q=*:*&distrib=false&isShard=true&wt=javabin&fsv=true&rows=10&version=2}
hits=0 status=0 QTime=0
Sep 20, 2012 10:41:32 PM org.apache.solr.core.SolrCore execute
INFO: [test1] webapp=/solr path=/select params={q=*:*&wt=python} status=0
QTime=14

This query returns 1:
Sep 20, 2012 10:42:31 PM org.apache.solr.core.SolrCore execute
INFO: [test1-ondeck] webapp=/solr path=/select
params={fl=id,score&df=text&NOW=1348195351293&shard.url=
host1:18000/solr/test1-ondeck/|
host1:18000/solr/test1/&start=0&q=*:*&distrib=false&isShard=true&wt=javabin&fsv=true&rows=10&version=2}
hits=1 status=0 QTime=1
Sep 20, 2012 10:42:31 PM org.apache.solr.core.SolrCore execute
INFO: [test1-ondeck] webapp=/solr path=/select
params={df=text&NOW=1348195351293&shard.url=
host1:18000/solr/test1-ondeck/|
host1:18000/solr/test1/&q=*:*&ids=SOLR1000&distrib=false&isShard=true&wt=javabin&rows=10&version=2}
status=0 QTime=1
Sep 20, 2012 10:42:31 PM org.apache.solr.core.SolrCore execute
INFO: [test1] webapp=/solr path=/select params={q=*:*&wt=python} status=0
QTime=9

Thanks a lot,
Sam

On Thu, Sep 20, 2012 at 8:27 PM, Chris Hostetter
wrote:

> : In Solr 3.6, core swap function works good. After switch to use Solr 4.0
> : Beta, and found it doesn't work well.
>
> can you elaborate on what exactly you mean by "doesn't work well" ? ..
> what does your solr.xml file look like? what command did you run to do the
> swap? what results did you get from those commands?  what exactly did you
> observe after the swap and how did you observe it?
>
> : I tried to swap two cores, but it still return old core data when do the
> : search. After restart tomat which contain Solr, it will mess up when do
> the
> : search, seems it will use like oldcoreshard|newcoreshard to do the
> search.
> : Anyone hit this issue?
>
> how did you "do the search" ? is it possible you were just seeing your
> browser cache the results?  Do you have persistent="true" in your solr.xml
> file? w/o that changes made via the CoreAdmin commands won't be saved to
> disk.
>
> I just tested using both 4.0-BETA and the HEAD of the 4x branch and
> couldn't see any problems using SWAP  (i tested using 'java
> -Dsolr.solr.home=multicore/ -jar start.jar' and indexing some trivial
> docs, and then tested again after modifying the solr.xml to use
> persistent="true")
>
>
> -Hoss
>


Re: Solr Swap Function doesn't work when using Solr Cloud Beta

2012-09-21 Thread sam fang
Hi Chris,

Thanks for your help. Today I tried again to figure out the reason.

1. set up an external zookeeper server.

2. changed /opt/solr/apache-solr-4.0.0-BETA/example/solr/solr.xml persistent
to true, and ran the commands below to upload the configs to zk (renamed
multicore to solr, and needed to put the zkcli.sh-related jar packages in place):
/opt/solr/apache-solr-4.0.0-BETA/example/cloud-scripts/zkcli.sh -cmd
upconfig -confdir /opt/solr/apache-solr-4.0.0-BETA/example/solr/core0/conf/
-confname
core0 -z localhost:2181
/opt/solr/apache-solr-4.0.0-BETA/example/cloud-scripts/zkcli.sh -cmd
upconfig -confdir /opt/solr/apache-solr-4.0.0-BETA/example/solr/core1/conf/
-confname
core1 -z localhost:2181

3. Start jetty server
cd /opt/solr/apache-solr-4.0.0-BETA/example
java -DzkHost=localhost:2181 -jar start.jar

4. publish message to core0
/opt/solr/apache-solr-4.0.0-BETA/example/solr/exampledocs
cp ../../exampledocs/post.jar ./
java -Durl=http://localhost:8983/solr/core0/update -jar post.jar
 ipod_video.xml

5. queries to core0 and core1 are ok.

6. Clicked "swap" in the admin page; the query results for core0 and core1 keep
changing. Previously I sometimes saw 0 results and sometimes 1 result. Today
core0 still returns 1 result and core1 returns 0 results.

7. Then clicked "reload" in the admin page and queried core0 and core1 again.
Sometimes they return 1 result and sometimes nothing. I can also see that the zk
configuration changed.

8. Restarted the jetty server. Queries behave the same as in step 7.

9. Stopped the jetty server, logged into zkCli.sh, and ran the command "set
/clusterstate.json {}", then started jetty again. Everything is back to normal,
i.e. what swap did previously in solr 3.6 or solr 4.0 w/o cloud.


From my observation, after a swap it puts the shard information into
actualShards; when a user requests a search, it uses all of that shard
information to do the search. But the user can't see the zk update until
clicking the "reload" button in the admin page. When the web server is
restarted, this shard information eventually goes to zk, and the search goes
to all shards.

I found there is an option "distrib"; with a url like
"http://host1:18000/solr/core0/select?distrib=false&q=*%3A*&wt=xml" I only get
the data on core0. I dug into the code (the handleRequestBody method in the
SearchHandler class) and it seems to make sense.
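
The single-core query described above can be sketched as a small URL builder; the host, port, and core names are just the ones from this thread:

```python
from urllib.parse import urlencode

def core_select_url(base, core, q, distrib=False):
    """Build a /select URL; distrib=false keeps the search on this one core
    instead of fanning out to every shard registered in clusterstate.json."""
    params = urlencode({"q": q, "wt": "xml", "distrib": str(distrib).lower()})
    return f"{base}/solr/{core}/select?{params}"

print(core_select_url("http://host1:18000", "core0", "*:*"))
```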

I tried stopping the tomcat server, then used the command "set /clusterstate.json
{}" to clean all cluster state, then used "cloud-scripts/zkcli.sh -cmd upconfig"
to upload the config to the zk server, and started the tomcat server again. It
rebuilt the right shard information in zk, and the search function went back to
normal, like what we saw in 3.6 or 4.0 w/o cloud.

It seems solr always adds shard information into zk.

I tested cloud swap on a single machine: if each core has one shard in zk, then
after a swap zk eventually has 2 slices (shards) for that core, because it only
ever adds, so the search goes to both shards.

I also tested cloud swap with 2 machines, where each core has 1 shard and 2
slices. Below is the configuration in zk. After the swap, zk eventually has 4
entries for that core, and search gets messed up.

  "core0":{"shard1":{
  "host1:18000_solr_core0":{
"shard":"shard1",
"roles":null,
"leader":"true",
"state":"active",
"core":"core0",
"collection":"core0",
"node_name":"host1:18000_solr",
"base_url":"http://host1:18000/solr"},
  "host2:18000_solr_core0":{
"shard":"shard1",
"roles":null,
"state":"active",
"core":"core0",
"collection":"core0",
"node_name":"host2:18000_solr",
"base_url":"http://host2:18000/solr"}}},

For the previous 2 cases, if I stopped the tomcat/jetty server, manually uploaded
the configuration to zk, and then started the tomcat server, zk and search became
normal again.

On Fri, Sep 21, 2012 at 3:34 PM, Chris Hostetter
wrote:

>
> : Below is my solr.xml configuration, and already set persistent to true.
> ...
> : Then publish 1 record to test1, and query. it's ok now.
>
> Ok, first off -- please provide more details on how exactly you are
> running Solr.  Your initial email said...
>
> >>> In Solr 3.6, core swap function works good. After switch to use Solr
> 4.0
> >>> Beta, and found it doesn't work well.
>
> ...but based on your solr.xml file and your logs, it appears you are now
> trying to use some of the ZooKeeper/SolrCloud features that didn't even
> exist in Solr 3.6, so it's kind of an apples to oranges comparison.  i'm
> pretty sure that for a simple multicore setup, SWAP still works exactly as
> it did in Solr 3.6.
>
> Wether SWAP works with ZooKeeper/SolrCloud is something i'm not really
> clear on -- mainly because i'm not sure what it should mean conceptually.
> Should the two SolrCores swap which collections they are apart of? what
> happens if the doc->shard assignment for the two collections means the
> same docs woulnd't wind up im those SolrCores? what if the SolrCores are
> two different shards of the same collectio

Re: Solr Swap Function doesn't work when using Solr Cloud Beta

2012-09-24 Thread sam fang
Hi Mark,

If this can be supported in the future, I think that would be great; it's a
really useful feature. For example, a user can use it to refresh with a totally
new core: build the index on one core and, after the build is done, swap the old
core with the new one to serve search from a completely fresh core.

It can also be used for backups: if one core crashes, you can easily swap in the
backup core and quickly serve search requests again.

Best Regards,
Sam

On Sun, Sep 23, 2012 at 2:51 PM, Mark Miller  wrote:

> FYI swap is def not supported in SolrCloud right now - even though it may
> work, it's not been thought about and there are no tests.
>
> If you would like to see support, I'd add a JIRA issue along with any
> pertinent info from this thread about what the behavior needs to be changed
> to.
>
> - Mark
>
> On Sep 21, 2012, at 6:49 PM, sam fang  wrote:
>
> > Hi Chris,
> >
> > Thanks for your help. Today I tried again and try to figure out the
> reason.
> >
> > 1. set up an external zookeeper server.
> >
> > 2. change /opt/solr/apache-solr-4.0.0-BETA/example/solr/solr.xml
> persistent
> > to true. and run below command to upload config to zk. (renamed multicore
> > to solr, and need to put zkcli.sh related jar package.)
> > /opt/solr/apache-solr-4.0.0-BETA/example/cloud-scripts/zkcli.sh -cmd
> > upconfig -confdir
> /opt/solr/apache-solr-4.0.0-BETA/example/solr/core0/conf/
> > -confname
> > core0 -z localhost:2181
> > /opt/solr/apache-solr-4.0.0-BETA/example/cloud-scripts/zkcli.sh -cmd
> > upconfig -confdir
> /opt/solr/apache-solr-4.0.0-BETA/example/solr/core1/conf/
> > -confname
> > core1 -z localhost:2181
> >
> > 3. Start jetty server
> > cd /opt/solr/apache-solr-4.0.0-BETA/example
> > java -DzkHost=localhost:2181 -jar start.jar
> >
> > 4. publish message to core0
> > /opt/solr/apache-solr-4.0.0-BETA/example/solr/exampledocs
> > cp ../../exampledocs/post.jar ./
> > java -Durl=http://localhost:8983/solr/core0/update -jar post.jar
> > ipod_video.xml
> >
> > 5. query to core0 and core1 is ok.
> >
> > 6. Click "swap" in the admin page, the query to core0 and core1 is
> > changing. Previous I saw sometimes returns 0 result. sometimes return 1
> > result. Today
> > seems core0 still return 1 result, core1 return 0 result.
> >
> > 7. Then click "reload" in the admin page, the query to core0 and core1.
> > Sometimes return 1 result, and sometimes return nothing. Also can see
> the zk
> > configuration also changed.
> >
> > 8. Restart jetty server. If do the query, it's same as what I saw in
> step 7.
> >
> > 9. Stop jetty server, then log into zkCli.sh, then run command "set
> > /clusterstate.json {}". then start jetty again. everything back to
> normal,
> > that is what previous swap did in solr 3.6 or solr 4.0 w/o cloud.
> >
> >
> > From my observation, after swap, seems it put shard information into
> > actualShards, when user request to search, it will use all shard
> > information to do the
> > search. But user can't see zk update until click "reload" button in admin
> > page. When restart web server, this shard information eventually went to
> > zk, and
> > the search go to all shards.
> >
> > I found there is a option "distrib", and used url like "
> > http://host1:18000/solr/core0/select?distrib=false&q=*%3A*&wt=xml";, then
> > only get the data on the
> > core0. Digged in the code (handleRequestBody method in SearchHandler
> class,
> > seems it make sense)
> >
> > I tried to stop tomcat server, then use command "set /clusterstate.json
> {}"
> > to clean all cluster state, then use command "cloud-scripts/zkcli.sh -cmd
> > upconfig" to upload config to zk server, and start tomcat server. It
> > rebuild the right shard information in zk. then search function back to
> > normal like what
> > we saw in 3.6 or 4.0 w/o cloud.
> >
> > Seems solr always add shard information into zk.
> >
> > I tested cloud swap on single machine, if each core have one shard in the
> > zk, after swap, eventually zk has 2 slices(shards) for that core because
> > now only
> > do the add. so the search will go to both 2 shards.
> >
> > and tested cloud swap with 2 machine which each core have 1 shard and 2
> > slices. Below the configuration in the zk. After swap, eventually zk has
> 4
> > for that
> > core. and search will mess up.
> >
> >  "core0":{"shard1":{
> >  "host1:18000_solr_core0":

Null pointer exception on use of ImportDataHandler (useSolrAddSchema="true")

2009-03-18 Thread Sam Keen
I'm attempting to use an XML/HTTP datasource
[http://wiki.apache.org/solr/DataImportHandler#head-13ffe3a5e6ac22f08e063ad3315f5e7dda279bd4].
I went through the RSS example in
apache-solr-1.3.0/example/example-DIH and that all worked for me.

What I am now attempting is to leverage 'useSolrAddSchema="true"'.
I have a URL that responds with well-formatted solr add xml (I'm able
to add it by POSTing). But when I try to import it using
http://localhost:8983/solr/dataimport?command=full-import I get a null
pointer exception.

I am a little unsure whether my data-config.xml is correct (I couldn't find
many examples that use useSolrAddSchema="true"), but I've tried every
alternate setting I could think of.
Any help is very much appreciated (I'm sure I've just missed something simple).

regards,
sam

using solr 1.3.0 on OSX 10.5.6

= solrconfig.xml =

<requestHandler name="/dataimport"
                class="org.apache.solr.handler.dataimport.DataImportHandler">
  <lst name="defaults">
    <str name="config">/Users/sam/src/apache-solr-1.3.0/example/solr/conf/data-config.xml</str>
  </lst>
</requestHandler>

=== data-config.xml ===



<entity url="http://local.smwe.com/factsheets/feed"
        useSolrAddSchema="true"
        dataSource="smwe"/>




== solr add xml returned by http://local.smwe.com/factsheets/feed ===


100
Antinori
Chardonnay
2007
Castello della Sala

Cevaro della Sala
Vibrant aromas of citrus
fruit, pineapple, pears and acacia flowers blend on the nose.The
rounded palate has sweet hints of hazelnut butter but is also
minerally and lingering. This wine will age and develop very
well.
bar



= fields section in schema.xml


..

 

   
   
   
   
   
   
   
   
   

.

 id

 
 all

 
 

  








 ...



= exception =

Mar 18, 2009 12:38:44 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={command=full-import}
status=0 QTime=0
Mar 18, 2009 12:38:44 PM
org.apache.solr.handler.dataimport.DataImporter doFullImport
INFO: Starting Full Import
Mar 18, 2009 12:38:44 PM org.apache.solr.update.DirectUpdateHandler2 deleteAll
INFO: [] REMOVING ALL DOCUMENTS FROM INDEX
Mar 18, 2009 12:38:44 PM
org.apache.solr.handler.dataimport.DataImporter doFullImport
SEVERE: Full Import failed
java.lang.NullPointerException
at java.util.regex.Matcher.getTextLength(Matcher.java:1127)
at java.util.regex.Matcher.reset(Matcher.java:284)
at java.util.regex.Matcher.(Matcher.java:205)
at java.util.regex.Pattern.matcher(Pattern.java:879)
at 
org.apache.solr.handler.dataimport.TemplateString.(TemplateString.java:50)
at 
org.apache.solr.handler.dataimport.TemplateString.replaceTokens(TemplateString.java:72)
at 
org.apache.solr.handler.dataimport.VariableResolverImpl.replaceTokens(VariableResolverImpl.java:77)
at 
org.apache.solr.handler.dataimport.SqlEntityProcessor.nextRow(SqlEntityProcessor.java:74)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
at 
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:136)
at 
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
at 
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)


Re: Null pointer exception on use of ImportDataHandler (useSolrAddSchema="true")

2009-03-18 Thread Sam Keen
that worked perfectly Shalin.  thanks so much for your help!

sam keen


On Wed, Mar 18, 2009 at 1:15 PM, Shalin Shekhar Mangar
 wrote:
> On Thu, Mar 19, 2009 at 1:29 AM, Sam Keen  wrote:
>
>>
>> What I am now attempting to do is leverage 'useSolrAddSchema="true"' .
>> I have a URL the responds with a well formatted solr add xml (I'm able
>> to add it by POSTing).  But when I try to add it using
>> http://localhost:8983/solr/dataimport?command=full-import i get a null
>> pointer exception.
>
>
> You need to use XPathEntityProcessor. If you do not specify a processor, the
> default is SqlEntityProcessor (used for DB imports).
>
> Add the attribute processor="XPathEntityProcessor" to the entity and try.
>
> --
> Regards,
> Shalin Shekhar Mangar.
>


Re: Null pointer exception on use of ImportDataHandler (useSolrAddSchema="true")

2009-03-19 Thread Sam Keen
Guess I spoke too soon. The above setup (with Shalin's fix) works for
a mock run of 2 records, but when I try it with the production data
of about 450 records, I get this error.

again, any help is greatly appreciated

sam keen

Mar 19, 2009 3:59:20 PM
org.apache.solr.handler.dataimport.DataImporter doFullImport
INFO: Starting Full Import
Mar 19, 2009 3:59:20 PM org.apache.solr.core.SolrCore execute
INFO: [] webapp=/solr path=/dataimport params={command=full-import}
status=0 QTime=6
Mar 19, 2009 3:59:20 PM org.apache.solr.update.DirectUpdateHandler2 deleteAll
INFO: [] REMOVING ALL DOCUMENTS FROM INDEX
Mar 19, 2009 3:59:20 PM
org.apache.solr.handler.dataimport.HttpDataSource getData
INFO: Created URL to: http://local.smwe.com/factsheets/feed
Mar 19, 2009 3:59:36 PM
org.apache.solr.handler.dataimport.DataImporter doFullImport
SEVERE: Full Import failed
java.lang.RuntimeException: java.lang.IndexOutOfBoundsException:
Index: 3, Size: 3
at 
org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:85)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:226)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:180)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:163)
at 
org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
at 
org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
at 
org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:136)
at 
org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
at 
org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
at 
org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
at java.util.ArrayList.RangeCheck(ArrayList.java:546)
at java.util.ArrayList.get(ArrayList.java:321)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.readRow(XPathEntityProcessor.java:266)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor.access$100(XPathEntityProcessor.java:53)
at 
org.apache.solr.handler.dataimport.XPathEntityProcessor$1.handle(XPathEntityProcessor.java:229)
at 
org.apache.solr.handler.dataimport.XPathRecordReader$Node.parse(XPathRecordReader.java:149)
at 
org.apache.solr.handler.dataimport.XPathRecordReader$Node.parse(XPathRecordReader.java:174)
at 
org.apache.solr.handler.dataimport.XPathRecordReader$Node.parse(XPathRecordReader.java:174)
at 
org.apache.solr.handler.dataimport.XPathRecordReader$Node.access$000(XPathRecordReader.java:89)
at 
org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:82)
... 9 more




On Wed, Mar 18, 2009 at 2:58 PM, Sam Keen  wrote:
> that worked perfectly Shalin.  thanks so much for your help!
>
> sam keen
>
>
> On Wed, Mar 18, 2009 at 1:15 PM, Shalin Shekhar Mangar
>  wrote:
>> On Thu, Mar 19, 2009 at 1:29 AM, Sam Keen  wrote:
>>
>>>
>>> What I am now attempting to do is leverage 'useSolrAddSchema="true"' .
>>> I have a URL the responds with a well formatted solr add xml (I'm able
>>> to add it by POSTing).  But when I try to add it using
>>> http://localhost:8983/solr/dataimport?command=full-import i get a null
>>> pointer exception.
>>
>>
>> You need to use XPathEntityProcessor. If you do not specify a processor, the
>> default is SqlEntityProcessor (used for DB imports).
>>
>> Add the attribute processor="XPathEntityProcessor" to the entity and try.
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.
>>
>
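
For anyone hitting the same NPE: a minimal DIH data-config.xml along the lines Shalin suggests above might look like the following. The data-source type, entity name, and URL here are illustrative guesses, not taken from the original message:

```xml
<dataConfig>
  <dataSource type="HttpDataSource"/>
  <document>
    <!-- useSolrAddSchema="true" tells DIH the feed is already in
         Solr <add><doc>...</doc></add> format, so no forEach/xpath
         mappings are needed; the processor must still be set. -->
    <entity name="feed"
            processor="XPathEntityProcessor"
            url="http://localhost/factsheets/feed"
            useSolrAddSchema="true"
            stream="true"/>
  </document>
</dataConfig>
```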


Re: Null pointer exception on use of ImportDataHandler (useSolrAddSchema="true")

2009-03-20 Thread Sam Keen
thanks,
I applied the patch in SOLR-1077 and this is now fixed for me (i
updated the bug w/ a comment)

sam keen

2009/3/19 Noble Paul നോബിള്‍  नोब्ळ् :
> it is a bug , I have raised an issue
>
> https://issues.apache.org/jira/browse/SOLR-1077
>
> On Fri, Mar 20, 2009 at 4:41 AM, Sam Keen  wrote:
>> guess I spoke too soon.  The above setup (with Shalin's fix) works for
>> a mock run of 2 records.  But when I try it with the production data
>> of about 450 records, I get this error.
>>
>> again, any help is greatly appreciated
>>
>> sam keen
>>
>> Mar 19, 2009 3:59:20 PM
>> org.apache.solr.handler.dataimport.DataImporter doFullImport
>> INFO: Starting Full Import
>> Mar 19, 2009 3:59:20 PM org.apache.solr.core.SolrCore execute
>> INFO: [] webapp=/solr path=/dataimport params={command=full-import}
>> status=0 QTime=6
>> Mar 19, 2009 3:59:20 PM org.apache.solr.update.DirectUpdateHandler2 deleteAll
>> INFO: [] REMOVING ALL DOCUMENTS FROM INDEX
>> Mar 19, 2009 3:59:20 PM
>> org.apache.solr.handler.dataimport.HttpDataSource getData
>> INFO: Created URL to: http://local.smwe.com/factsheets/feed
>> Mar 19, 2009 3:59:36 PM
>> org.apache.solr.handler.dataimport.DataImporter doFullImport
>> SEVERE: Full Import failed
>> java.lang.RuntimeException: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>>        at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:85)
>>        at org.apache.solr.handler.dataimport.XPathEntityProcessor.initQuery(XPathEntityProcessor.java:226)
>>        at org.apache.solr.handler.dataimport.XPathEntityProcessor.fetchNextRow(XPathEntityProcessor.java:180)
>>        at org.apache.solr.handler.dataimport.XPathEntityProcessor.nextRow(XPathEntityProcessor.java:163)
>>        at org.apache.solr.handler.dataimport.DocBuilder.buildDocument(DocBuilder.java:285)
>>        at org.apache.solr.handler.dataimport.DocBuilder.doFullDump(DocBuilder.java:178)
>>        at org.apache.solr.handler.dataimport.DocBuilder.execute(DocBuilder.java:136)
>>        at org.apache.solr.handler.dataimport.DataImporter.doFullImport(DataImporter.java:334)
>>        at org.apache.solr.handler.dataimport.DataImporter.runCmd(DataImporter.java:386)
>>        at org.apache.solr.handler.dataimport.DataImporter$1.run(DataImporter.java:377)
>> Caused by: java.lang.IndexOutOfBoundsException: Index: 3, Size: 3
>>        at java.util.ArrayList.RangeCheck(ArrayList.java:546)
>>        at java.util.ArrayList.get(ArrayList.java:321)
>>        at org.apache.solr.handler.dataimport.XPathEntityProcessor.readRow(XPathEntityProcessor.java:266)
>>        at org.apache.solr.handler.dataimport.XPathEntityProcessor.access$100(XPathEntityProcessor.java:53)
>>        at org.apache.solr.handler.dataimport.XPathEntityProcessor$1.handle(XPathEntityProcessor.java:229)
>>        at org.apache.solr.handler.dataimport.XPathRecordReader$Node.parse(XPathRecordReader.java:149)
>>        at org.apache.solr.handler.dataimport.XPathRecordReader$Node.parse(XPathRecordReader.java:174)
>>        at org.apache.solr.handler.dataimport.XPathRecordReader$Node.parse(XPathRecordReader.java:174)
>>        at org.apache.solr.handler.dataimport.XPathRecordReader$Node.access$000(XPathRecordReader.java:89)
>>        at org.apache.solr.handler.dataimport.XPathRecordReader.streamRecords(XPathRecordReader.java:82)
>>        ... 9 more
>>
>>
>>
>>
>> On Wed, Mar 18, 2009 at 2:58 PM, Sam Keen  wrote:
>>> that worked perfectly Shalin.  thanks so much for your help!
>>>
>>> sam keen
>>>
>>>
>>> On Wed, Mar 18, 2009 at 1:15 PM, Shalin Shekhar Mangar
>>>  wrote:
>>>> On Thu, Mar 19, 2009 at 1:29 AM, Sam Keen  wrote:
>>>>
>>>>>
>>>>> What I am now attempting to do is leverage 'useSolrAddSchema="true"' .
>>>>> I have a URL that responds with well-formatted Solr add XML (I'm able
>>>>> to add it by POSTing).  But when I try to add it using
>>>>> http://localhost:8983/solr/dataimport?command=full-import i get a null
>>>>> pointer exception.
>>>>
>>>>
>>>> You need to use XPathEntityProcessor. If you do not specify a processor, 
>>>> the
>>>> default is SqlEntityProcessor (used for DB imports).
>>>>
>>>> Add the attribute processor="XPathEntityProcessor" to the entity and try.
>>>>
>>>> --
>>>> Regards,
>>>> Shalin Shekhar Mangar.
>>>>
>>>
>>
>
>
>
> --
> --Noble Paul
>


When searching for !@#$%^&*() all documents are matched incorrectly

2009-05-30 Thread Sam Michaels

Hi,

I'm running Solr 1.3/Java 1.6.  

When I run a query like (activity_type:NAME) AND title:(\!@#$%\^&\*\(\)),
all the documents are returned even though there is not a single match:
no title matches the (escaped) string.

My document structure is as follows


<doc>
  <field name="activity_type">NAME</field>
  <field name="title">Bathing</field>
</doc>

The title field is of type text_title, which is defined as:

<fieldType name="text_title" class="solr.TextField" positionIncrementGap="100">
  <analyzer type="index">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.StopFilterFactory" ignoreCase="true" words="stopwords.txt"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="1" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"/>
    <filter class="solr.WordDelimiterFilterFactory" generateWordParts="1"
            generateNumberParts="1" catenateWords="1" catenateNumbers="1"
            catenateAll="1" splitOnCaseChange="1"/>
    <filter class="solr.LowerCaseFilterFactory"/>
  </analyzer>
</fieldType>

When I run the query against Luke, no results are returned. Any suggestions
are appreciated.
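
As an aside, query-side escaping like the title string above can be done with a one-line helper; this is a hypothetical client-side sketch (`escape` is my own name, not a Solr API), with the character set taken from the Lucene query syntax:

```python
import re

# Lucene/Solr query metacharacters: + - && || ! ( ) { } [ ] ^ " ~ * ? : \
# (& and | are only special when doubled, but escaping singles is harmless.)
def escape(term: str) -> str:
    """Backslash-escape every query metacharacter in a user-supplied term."""
    return re.sub(r'([+\-&|!(){}\[\]^"~*?:\\])', r'\\\1', term)

print(escape("!@#$%^&*()"))  # -> \!@#$%\^\&\*\(\)
```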


-- 
View this message in context: 
http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23797731.html
Sent from the Solr - User mailing list archive at Nabble.com.



Re: When searching for !@#$%^&*() all documents are matched incorrectly

2009-05-31 Thread Sam Michaels

As for relevance: no results should be returned, yet all documents come
back in alphabetical order.


Walter Underwood wrote:
> 
> I'm really curious. What is the most relevant result for that query?
> 
> wunder
> 
> On 5/30/09 7:35 PM, "Ryan McKinley"  wrote:
> 
>> two key things to try (for anyone ever wondering why a query matches
>> documents)
>> 
>> 1.  add &debugQuery=true and look at the explain text below --
>> anything that contributed to the score is listed there
>> 2.  check /admin/analysis.jsp -- this will let you see how analyzers
>> break text up into tokens.
>> 
>> Not sure off hand, but I'm guessing the WordDelimiterFilterFactory has
>> something to do with it...
>> 
>> 
>> On Sat, May 30, 2009 at 5:59 PM, Sam Michaels  wrote:
>>> 
>>> Hi,
>>> 
>>> I'm running Solr 1.3/Java 1.6.
>>> 
>>> When I run a query like  - (activity_type:NAME) AND
>>> title:(\...@#$%\^&\*\(\))
>>> all the documents are returned even though there is not a single match.
>>> There is no title that matches the string (which has been escaped).
>>> 
>>> My document structure is as follows
>>> 
>>> 
>>> NAME
>>> Bathing
>>> 
>>> 
>>> 
>>> 
>>> The title field is of type text_title which is described below.
>>> 
>>> >> positionIncrementGap="100">
>>>      
>>>        
>>>        
>>>        >> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>>        
>>>        
>>>      
>>>      
>>>        
>>>        >> ignoreCase="true" expand="true"/>
>>>        >> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>>        
>>>        
>>> 
>>>      
>>>    
>>> 
>>> When I run the query against Luke, no results are returned. Any
>>> suggestions
>>> are appreciated.
>>> 
>>> 
>>> --
>>> View this message in context:
>>> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents
>>> -are-matched-incorrectly-tp23797731p23797731.html
>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>> 
>>> 
> 
> 
> 




Re: When searching for !@#$%^&*() all documents are matched incorrectly

2009-05-31 Thread Sam Michaels

Upon further experimentation, I found that even @ matches all the documents.
However, when I append the wildcard * to @ (@*), there is no match...

SM


Sam Michaels wrote:
> 
> Hi,
> 
> I'm running Solr 1.3/Java 1.6.  
> 
> When I run a query like  - (activity_type:NAME) AND
> title:(\...@#$%\^&\*\(\)) all the documents are returned even though there
> is not a single match. There is no title that matches the string (which
> has been escaped). 
> 
> My document structure is as follows
> 
> 
> NAME
> Bathing
> 
> 
> 
> 
> The title field is of type text_title which is described below. 
> 
>  positionIncrementGap="100">
>   
> 
> 
>  generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
> 
> 
>   
>   
> 
>  ignoreCase="true" expand="true"/>
>  generateWordParts="1" generateNumberParts="1" catenateWords="1"
> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
> 
> 
> 
>   
> 
> 
> When I run the query against Luke, no results are returned. Any
> suggestions are appreciated.
> 
> 
> 




Re: When searching for !@#$%^&*() all documents are matched incorrectly

2009-05-31 Thread Sam Michaels

Here is the output from the debug query when I'm trying to match the string @
against Bathing (it should not match):


3.2689073 = (MATCH) weight(activity_type:NAME in 0), product of:
  0.9994 = queryWeight(activity_type:NAME), product of:
3.2689075 = idf(docFreq=153, numDocs=1489)
0.30591258 = queryNorm
  3.2689075 = (MATCH) fieldWeight(activity_type:NAME in 0), product of:
1.0 = tf(termFreq(activity_type:NAME)=1)
3.2689075 = idf(docFreq=153, numDocs=1489)
1.0 = fieldNorm(field=activity_type, doc=0)


Looks like the AND clause in the search string is ignored...

SM.


ryantxu wrote:
> 
> two key things to try (for anyone ever wondering why a query matches
> documents)
> 
> 1.  add &debugQuery=true and look at the explain text below --
> anything that contributed to the score is listed there
> 2.  check /admin/analysis.jsp -- this will let you see how analyzers
> break text up into tokens.
> 
> Not sure off hand, but I'm guessing the WordDelimiterFilterFactory has
> something to do with it...
> 
> 
> On Sat, May 30, 2009 at 5:59 PM, Sam Michaels  wrote:
>>
>> Hi,
>>
>> I'm running Solr 1.3/Java 1.6.
>>
>> When I run a query like  - (activity_type:NAME) AND
>> title:(\...@#$%\^&\*\(\))
>> all the documents are returned even though there is not a single match.
>> There is no title that matches the string (which has been escaped).
>>
>> My document structure is as follows
>>
>> 
>> NAME
>> Bathing
>> 
>> 
>>
>>
>> The title field is of type text_title which is described below.
>>
>> > positionIncrementGap="100">
>>      
>>        
>>        
>>        > generateWordParts="1" generateNumberParts="1" catenateWords="1"
>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>        
>>        
>>      
>>      
>>        
>>        > ignoreCase="true" expand="true"/>
>>        > generateWordParts="1" generateNumberParts="1" catenateWords="1"
>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>        
>>        
>>
>>      
>>    
>>
>> When I run the query against Luke, no results are returned. Any
>> suggestions
>> are appreciated.
>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23797731.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 




Re: When searching for !@#$%^&*() all documents are matched incorrectly

2009-06-01 Thread Sam Michaels

Walter,

The analysis link does not produce any matches for either @ or !@#$%^&*()
when I try to match against Bathing. I'm worried that this might be
the symptom of another problem (which has not revealed itself yet) and want
to get to the bottom of this...

Thank you.
sm


Walter Underwood wrote:
> 
> Use the [analysis] link on the Solr admin UI to get more info on
> how this is being interpreted.
> 
> However, I am curious about why this is important. Do users enter
> this query often? If not, maybe it is not something to spend time on.
> 
> wunder
> 
> On 5/31/09 2:56 PM, "Sam Michaels"  wrote:
> 
>> 
>> Here is the output from the debug query when I'm trying to match the
>> String @
>> against Bathing (should not match)
>> 
>> 
>> 3.2689073 = (MATCH) weight(activity_type:NAME in 0), product of:
>>   0.9994 = queryWeight(activity_type:NAME), product of:
>> 3.2689075 = idf(docFreq=153, numDocs=1489)
>> 0.30591258 = queryNorm
>>   3.2689075 = (MATCH) fieldWeight(activity_type:NAME in 0), product of:
>> 1.0 = tf(termFreq(activity_type:NAME)=1)
>> 3.2689075 = idf(docFreq=153, numDocs=1489)
>> 1.0 = fieldNorm(field=activity_type, doc=0)
>> 
>> 
>> Looks like the AND clause in the search string is ignored...
>> 
>> SM.
>> 
>> 
>> ryantxu wrote:
>>> 
>>> two key things to try (for anyone ever wondering why a query matches
>>> documents)
>>> 
>>> 1.  add &debugQuery=true and look at the explain text below --
>>> anything that contributed to the score is listed there
>>> 2.  check /admin/analysis.jsp -- this will let you see how analyzers
>>> break text up into tokens.
>>> 
>>> Not sure off hand, but I'm guessing the WordDelimiterFilterFactory has
>>> something to do with it...
>>> 
>>> 
>>> On Sat, May 30, 2009 at 5:59 PM, Sam Michaels  wrote:
>>>> 
>>>> Hi,
>>>> 
>>>> I'm running Solr 1.3/Java 1.6.
>>>> 
>>>> When I run a query like  - (activity_type:NAME) AND
>>>> title:(\...@#$%\^&\*\(\))
>>>> all the documents are returned even though there is not a single match.
>>>> There is no title that matches the string (which has been escaped).
>>>> 
>>>> My document structure is as follows
>>>> 
>>>> 
>>>> NAME
>>>> Bathing
>>>> 
>>>> 
>>>> 
>>>> 
>>>> The title field is of type text_title which is described below.
>>>> 
>>>> >>> positionIncrementGap="100">
>>>>      
>>>>        
>>>>        
>>>>        >>> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>>>        
>>>>        
>>>>      
>>>>      
>>>>        
>>>>        >>> synonyms="synonyms.txt"
>>>> ignoreCase="true" expand="true"/>
>>>>        >>> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>>>        
>>>>        
>>>> 
>>>>      
>>>>    
>>>> 
>>>> When I run the query against Luke, no results are returned. Any
>>>> suggestions
>>>> are appreciated.
>>>> 
>>>> 
>>>> --
>>>> View this message in context:
>>>> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-document
>>>> s-are-matched-incorrectly-tp23797731p23797731.html
>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>> 
>>>> 
>>> 
>>> 
> 
> 
> 




Re: When searching for !@#$%^&*() all documents are matched incorrectly

2009-06-01 Thread Sam Michaels

So the fix for this problem would be:

1. Stop using WordDelimiterFilter for queries (what is the alternative?), OR
2. Disallow search strings that contain no alphanumeric characters.

SM.
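
Yonik's diagnosis (quoted below) can be simulated outside Solr. This toy sketch — not Solr's actual code path — shows how an analyzer that keeps only alphanumeric tokens makes a punctuation-only clause vanish, leaving a distorted boolean query:

```python
import re

def analyze(text):
    # Stand-in for an analyzer chain ending in WordDelimiterFilter:
    # split on non-alphanumerics and discard the separators entirely.
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", text)]

def parse(clauses):
    # Stand-in for the Lucene/Solr query parser: a clause whose term
    # analyzes to zero tokens is silently dropped.
    return [(occur, field, term) for (occur, field, term) in clauses
            if analyze(term)]

# (activity_type:NAME) AND title:(!@#$%^&*())
q = parse([("+", "activity_type", "NAME"), ("+", "title", "!@#$%^&*()")])
# The title clause is gone, so the query degenerates to
# activity_type:NAME and matches every NAME document.
```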


Yonik Seeley-2 wrote:
> 
> OK, here's the deal:
> 
> -features:foo features:(\!@#$%\^&\*\(\))
> -features:foo features:(\!@#$%\^&\*\(\))
> -features:foo
> -features:foo
> 
> The text analysis is throwing away non alphanumeric chars (probably
> the WordDelimiterFilter).  The Lucene (and Solr) query parser throws
> away term queries when the token is zero length (after analysis).
> Solr then interprets the left over "-features:foo" as "all documents
> not containing foo in the features field", so you get a bunch of
> matches.
> 
> -Yonik
> http://www.lucidimagination.com
> 
> 
> On Mon, Jun 1, 2009 at 10:15 AM, Sam Michaels  wrote:
>>
>> Walter,
>>
>> The analysis link does not produce any matches for either @ or !...@#$%^&*()
>> strings when I try to match against bathing. I'm worried that this might
>> be
>> the symptom of another problem (which has not revealed itself yet) and
>> want
>> to get to the bottom of this...
>>
>> Thank you.
>> sm
>>
>>
>> Walter Underwood wrote:
>>>
>>> Use the [analysis] link on the Solr admin UI to get more info on
>>> how this is being interpreted.
>>>
>>> However, I am curious about why this is important. Do users enter
>>> this query often? If not, maybe it is not something to spend time on.
>>>
>>> wunder
>>>
>>> On 5/31/09 2:56 PM, "Sam Michaels"  wrote:
>>>
>>>>
>>>> Here is the output from the debug query when I'm trying to match the
>>>> String @
>>>> against Bathing (should not match)
>>>>
>>>> 
>>>> 3.2689073 = (MATCH) weight(activity_type:NAME in 0), product of:
>>>>   0.9994 = queryWeight(activity_type:NAME), product of:
>>>>     3.2689075 = idf(docFreq=153, numDocs=1489)
>>>>     0.30591258 = queryNorm
>>>>   3.2689075 = (MATCH) fieldWeight(activity_type:NAME in 0), product of:
>>>>     1.0 = tf(termFreq(activity_type:NAME)=1)
>>>>     3.2689075 = idf(docFreq=153, numDocs=1489)
>>>>     1.0 = fieldNorm(field=activity_type, doc=0)
>>>> 
>>>>
>>>> Looks like the AND clause in the search string is ignored...
>>>>
>>>> SM.
>>>>
>>>>
>>>> ryantxu wrote:
>>>>>
>>>>> two key things to try (for anyone ever wondering why a query matches
>>>>> documents)
>>>>>
>>>>> 1.  add &debugQuery=true and look at the explain text below --
>>>>> anything that contributed to the score is listed there
>>>>> 2.  check /admin/analysis.jsp -- this will let you see how analyzers
>>>>> break text up into tokens.
>>>>>
>>>>> Not sure off hand, but I'm guessing the WordDelimiterFilterFactory has
>>>>> something to do with it...
>>>>>
>>>>>
>>>>> On Sat, May 30, 2009 at 5:59 PM, Sam Michaels 
>>>>> wrote:
>>>>>>
>>>>>> Hi,
>>>>>>
>>>>>> I'm running Solr 1.3/Java 1.6.
>>>>>>
>>>>>> When I run a query like  - (activity_type:NAME) AND
>>>>>> title:(\...@#$%\^&\*\(\))
>>>>>> all the documents are returned even though there is not a single
>>>>>> match.
>>>>>> There is no title that matches the string (which has been escaped).
>>>>>>
>>>>>> My document structure is as follows
>>>>>>
>>>>>> 
>>>>>> NAME
>>>>>> Bathing
>>>>>> 
>>>>>> 
>>>>>>
>>>>>>
>>>>>> The title field is of type text_title which is described below.
>>>>>>
>>>>>> >>>>> positionIncrementGap="100">
>>>>>>      
>>>>>>        
>>>>>>        
>>>>>>        >>>>> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>>>>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>>>>>        
>>>>>>        
>>>>>>      
>>>>>>      
>>>>>>        
>>>>>>        >>>>> synonyms="synonyms.txt"
>>>>>> ignoreCase="true" expand="true"/>
>>>>>>        >>>>> generateWordParts="1" generateNumberParts="1" catenateWords="1"
>>>>>> catenateNumbers="1" catenateAll="1" splitOnCaseChange="1"/>
>>>>>>        
>>>>>>        
>>>>>>
>>>>>>      
>>>>>>    
>>>>>>
>>>>>> When I run the query against Luke, no results are returned. Any
>>>>>> suggestions
>>>>>> are appreciated.
>>>>>>
>>>>>>
>>>>>> --
>>>>>> View this message in context:
>>>>>> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-document
>>>>>> s-are-matched-incorrectly-tp23797731p23797731.html
>>>>>> Sent from the Solr - User mailing list archive at Nabble.com.
>>>>>>
>>>>>>
>>>>>
>>>>>
>>>
>>>
>>>
>>
>> --
>> View this message in context:
>> http://www.nabble.com/When-searching-for-%21%40-%24-%5E-*%28%29-all-documents-are-matched-incorrectly-tp23797731p23815688.html
>> Sent from the Solr - User mailing list archive at Nabble.com.
>>
>>
> 
> 




Re: When searching for !@#$%^&*() all documents are matched incorrectly

2009-06-01 Thread Sam Michaels

Yonik,

Done, here is the link.
https://issues.apache.org/jira/browse/SOLR-1196

SM.
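
Until SOLR-1196 is resolved, workaround 2 from earlier in the thread (rejecting all-punctuation input) is a one-line pre-check before sending the query to Solr; `has_searchable_text` is a hypothetical helper name, not a Solr API:

```python
import re

def has_searchable_text(query: str) -> bool:
    # Reject user input that analysis would reduce to zero tokens:
    # require at least one alphanumeric character.
    return re.search(r"[A-Za-z0-9]", query) is not None

print(has_searchable_text("Bathing"))      # -> True
print(has_searchable_text("!@#$%^&*()"))   # -> False
```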


Yonik Seeley-2 wrote:
> 
> On Mon, Jun 1, 2009 at 10:50 AM, Sam Michaels  wrote:
>>
>> So the fix for this problem would be
>>
>> 1. Stop using WordDelimiterFilter for queries (what is the alternative)
>> OR
>> 2. Not allow any search strings without any alphanumeric characters..
> 
> Short term workaround for you, yes.
> I would classify this surprising behavior as a bug we should
> eventually fix though.  Could you open a JIRA issue for it?
> 
> -Yonik
> http://www.lucidimagination.com
> 
>> SM.
>>
>>
>> Yonik Seeley-2 wrote:
>>>
>>> OK, here's the deal:
>>>
>>> -features:foo
>>> features:(\...@#$%\^&\*\(\))
>>> -features:foo features:(\...@#$%\^&\*\(\))
>>> -features:foo
>>> -features:foo
>>>
>>> The text analysis is throwing away non alphanumeric chars (probably
>>> the WordDelimiterFilter).  The Lucene (and Solr) query parser throws
>>> away term queries when the token is zero length (after analysis).
>>> Solr then interprets the left over "-features:foo" as "all documents
>>> not containing foo in the features field", so you get a bunch of
>>> matches.
>>>
>>> -Yonik
>>> http://www.lucidimagination.com
>>>
>>>
>>> On Mon, Jun 1, 2009 at 10:15 AM, Sam Michaels  wrote:
>>>>
>>>> Walter,
>>>>
>>>> The analysis link does not produce any matches for either @ or
>>>> !...@#$%^&*()
>>>> strings when I try to match against bathing. I'm worried that this
>>>> might
>>>> be
>>>> the symptom of another problem (which has not revealed itself yet) and
>>>> want
>>>> to get to the bottom of this...
>>>>
>>>> Thank you.
>>>> sm
>>>>
>>>>
>>>> Walter Underwood wrote:
>>>>>
>>>>> Use the [analysis] link on the Solr admin UI to get more info on
>>>>> how this is being interpreted.
>>>>>
>>>>> However, I am curious about why this is important. Do users enter
>>>>> this query often? If not, maybe it is not something to spend time on.
>>>>>
>>>>> wunder
>>>>>
>>>>> On 5/31/09 2:56 PM, "Sam Michaels"  wrote:
>>>>>
>>>>>>
>>>>>> Here is the output from the debug query when I'm trying to match the
>>>>>> String @
>>>>>> against Bathing (should not match)
>>>>>>
>>>>>> 
>>>>>> 3.2689073 = (MATCH) weight(activity_type:NAME in 0), product of:
>>>>>>   0.9994 = queryWeight(activity_type:NAME), product of:
>>>>>>     3.2689075 = idf(docFreq=153, numDocs=1489)
>>>>>>     0.30591258 = queryNorm
>>>>>>   3.2689075 = (MATCH) fieldWeight(activity_type:NAME in 0), product
>>>>>> of:
>>>>>>     1.0 = tf(termFreq(activity_type:NAME)=1)
>>>>>>     3.2689075 = idf(docFreq=153, numDocs=1489)
>>>>>>     1.0 = fieldNorm(field=activity_type, doc=0)
>>>>>> 
>>>>>>
>>>>>> Looks like the AND clause in the search string is ignored...
>>>>>>
>>>>>> SM.
>>>>>>
>>>>>>
>>>>>> ryantxu wrote:
>>>>>>>
>>>>>>> two key things to try (for anyone ever wondering why a query matches
>>>>>>> documents)
>>>>>>>
>>>>>>> 1.  add &debugQuery=true and look at the explain text below --
>>>>>>> anything that contributed to the score is listed there
>>>>>>> 2.  check /admin/analysis.jsp -- this will let you see how analyzers
>>>>>>> break text up into tokens.
>>>>>>>
>>>>>>> Not sure off hand, but I'm guessing the WordDelimiterFilterFactory
>>>>>>> has
>>>>>>> something to do with it...
>>>>>>>
>>>>>>>
>>>>>>> On Sat, May 30, 2009 at 5:59 PM, Sam Michaels 
>>>>>>> wrote:
>>>>>>>>
>>>>>>>> Hi,
>>>>>>>>
>>>>>>>> I'm running Solr 1.3/Java 1.6.