Re: Exception when optimizing index
Hi all,

over the last few days I have created a Solr instance on a Windows environment - same Solr as on the Linux machine (Solr 4.0 from 9th June 2012), same Solr configuration, Tomcat 6, Java 6u23. I have also upgraded Java on the Linux machine (1.7.0_05-b05 from Oracle).

Import and optimize on the Windows machine worked without any issue, but on the Linux machine optimize fails with the same exception:

Caused by: java.io.IOException: Invalid vInt detected (too many bits)
        at org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:217)
...

After that I also changed the directory factory (on the Linux machine) to SimpleFSDirectoryFactory. I have reindexed all the documents and run the optimize again - it fails again with the same exception.

As a next step I could try partial insertions (which will be a painful process), but after that I'm out of ideas (and out of time for experimenting).

Many thanks for further suggestions.

Rok

On Wed, Jun 13, 2012 at 1:31 PM, Robert Muir wrote:
> On Thu, Jun 7, 2012 at 5:50 AM, Rok Rejc wrote:
> > - java.runtime.name: OpenJDK Runtime Environment
> > - java.runtime.version: 1.6.0_22-b22
> ...
> >
> > As far as I see from the JIRA issue I have the patch attached (as mentioned
> > I have a trunk version from May 12). Any ideas?
> >
>
> it's not guaranteed that the patch will work around all hotspot bugs
> related to http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5091921
>
> Since you can reproduce, is it possible for you to re-test the
> scenario with a newer JVM (e.g. 1.7.0_04) just to rule that out?
>
> --
> lucidimagination.com
delete by query doesn't work
Hi all. I am using Solr 4.0 and trying to clear the index by query. At first I used a delete-by-query with *:* followed by a commit, but the index is still not empty. I tried other queries, but that did not help either. Then I tried deleting by `id`. That works fine, but I need to clear the whole index. Can anyone help me?

--
View this message in context: http://lucene.472066.n3.nabble.com/delete-by-query-don-t-work-tp3990077.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: Exception when optimizing index
Is it possible that you somehow have some problem with jars and classpath?
I'm wondering because this problem really seems odd, and you've eliminated
a bunch of possibilities. I'm wondering if you've somehow gotten some old
jars mixed in the bunch.

Or, alternately, what about re-installing Solr on the theory that somehow you
got a bad download, and/or files (i.e. the Solr jar files) got corrupted, or
your disk has a bad spot, or...

Really clutching at straws here.

Erick

On Mon, Jun 18, 2012 at 3:44 AM, Rok Rejc wrote:
> Hi all,
>
> over the last few days I have created a Solr instance on a Windows environment
> - same Solr as on the Linux machine (Solr 4.0 from 9th June 2012), same
> Solr configuration, Tomcat 6, Java 6u23.
> I have also upgraded Java on the Linux machine (1.7.0_05-b05 from Oracle).
>
> Import and optimize on the Windows machine worked without any issue, but on
> the Linux machine optimize fails with the same exception:
>
> Caused by: java.io.IOException: Invalid vInt detected (too many bits)
> at
> org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:217)
> ...
>
> After that I also changed the directory factory (on the Linux machine) to
> SimpleFSDirectoryFactory. I have reindexed all the documents and again run
> the optimize - it fails again with the same exception.
>
> As a next step I could try partial insertions (which will be a
> painful process), but after that I'm out of ideas (and out of time for
> experimenting).
>
> Many thanks for further suggestions.
>
> Rok
>
>
>
> On Wed, Jun 13, 2012 at 1:31 PM, Robert Muir wrote:
>
>> On Thu, Jun 7, 2012 at 5:50 AM, Rok Rejc wrote:
>> > - java.runtime.name: OpenJDK Runtime Environment
>> > - java.runtime.version: 1.6.0_22-b22
>> ...
>> >
>> > As far as I see from the JIRA issue I have the patch attached (as
>> mentioned
>> > I have a trunk version from May 12). Any ideas?
>> >
>>
>> it's not guaranteed that the patch will work around all hotspot bugs
>> related to http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5091921
>>
>> Since you can reproduce, is it possible for you to re-test the
>> scenario with a newer JVM (e.g. 1.7.0_04) just to rule that out?
>>
>> --
>> lucidimagination.com
>>
Re: delete by query doesn't work
Well, it would help if you defined what behavior you're seeing. When you say
delete-by-query doesn't work, what is the symptom? What does "empty" mean?

Because if you're just looking at your index directory and expecting to see
files disappear, you'll be disappointed. When you delete documents in Solr,
the docs are just marked as deleted; they aren't physically removed until
segments are merged.

Does a query for *:* return any documents after you delete-by-query?

Running an optimize after you do the delete will force merging to happen, BTW.

If this doesn't help, please post the exact URLs you use, and what your
evidence that "the index isn't empty" is.

Best
Erick

On Mon, Jun 18, 2012 at 5:45 AM, ramzesua wrote:
> Hi all. I am using Solr 4.0 and trying to clear the index by query. At first I
> used a delete-by-query with *:* followed by a commit, but the index is still not
> empty. I tried other queries, but that did not help either. Then I tried deleting by
> `id`. That works fine, but I need to clear the whole index. Can anyone help me?
>
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/delete-by-query-don-t-work-tp3990077.html
> Sent from the Solr - User mailing list archive at Nabble.com.
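To illustrate the steps Erick describes, a minimal command-line sketch (the URL assumes the stock single-core example instance on localhost:8983; adjust the path for your own core):

    # delete everything and commit
    curl 'http://localhost:8983/solr/update?commit=true' -H 'Content-type: text/xml' \
         --data-binary '<delete><query>*:*</query></delete>'

    # a *:* query should now report numFound=0, even though index files remain on disk
    curl 'http://localhost:8983/solr/select?q=*:*&rows=0'

    # optionally force the deleted docs to be physically removed by merging
    curl 'http://localhost:8983/solr/update?commit=true' -H 'Content-type: text/xml' \
         --data-binary '<optimize/>'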
Re: Exception when optimizing index
Is it possible the Linux machine has bad RAM / bad disk?

Mike McCandless

http://blog.mikemccandless.com

On Mon, Jun 18, 2012 at 7:06 AM, Erick Erickson wrote:
> Is it possible that you somehow have some problem with jars and classpath?
> I'm wondering because this problem really seems odd, and you've eliminated
> a bunch of possibilities. I'm wondering if you've somehow gotten some old
> jars mixed in the bunch.
>
> Or, alternately, what about re-installing Solr on the theory that somehow you
> got a bad download, and/or files (i.e. the Solr jar files) got corrupted, or
> your disk has a bad spot, or...
>
> Really clutching at straws here.
>
> Erick
>
> On Mon, Jun 18, 2012 at 3:44 AM, Rok Rejc wrote:
>> Hi all,
>>
>> over the last few days I have created a Solr instance on a Windows environment
>> - same Solr as on the Linux machine (Solr 4.0 from 9th June 2012), same
>> Solr configuration, Tomcat 6, Java 6u23.
>> I have also upgraded Java on the Linux machine (1.7.0_05-b05 from Oracle).
>>
>> Import and optimize on the Windows machine worked without any issue, but on
>> the Linux machine optimize fails with the same exception:
>>
>> Caused by: java.io.IOException: Invalid vInt detected (too many bits)
>> at
>> org.apache.lucene.store.BufferedIndexInput.readVInt(BufferedIndexInput.java:217)
>> ...
>>
>> After that I also changed the directory factory (on the Linux machine) to
>> SimpleFSDirectoryFactory. I have reindexed all the documents and again run
>> the optimize - it fails again with the same exception.
>>
>> As a next step I could try partial insertions (which will be a
>> painful process), but after that I'm out of ideas (and out of time for
>> experimenting).
>>
>> Many thanks for further suggestions.
>>
>> Rok
>>
>>
>>
>> On Wed, Jun 13, 2012 at 1:31 PM, Robert Muir wrote:
>>
>>> On Thu, Jun 7, 2012 at 5:50 AM, Rok Rejc wrote:
>>> > - java.runtime.name: OpenJDK Runtime Environment
>>> > - java.runtime.version: 1.6.0_22-b22
>>> ...
>>> >
>>> > As far as I see from the JIRA issue I have the patch attached (as
>>> mentioned
>>> > I have a trunk version from May 12). Any ideas?
>>> >
>>>
>>> it's not guaranteed that the patch will work around all hotspot bugs
>>> related to http://bugs.sun.com/bugdatabase/view_bug.do?bug_id=5091921
>>>
>>> Since you can reproduce, is it possible for you to re-test the
>>> scenario with a newer JVM (e.g. 1.7.0_04) just to rule that out?
>>>
>>> --
>>> lucidimagination.com
>>>
SolrCloud non-distributed indexing (update.chain)
Hi,

What's happening to the update.chain of SolrCloud?

I am running SolrCloud (compiled from trunk today) with an update.chain
pointing to an updateRequestProcessorChain in solrconfig which omits the
DistributedUpdateProcessorFactory, so that indexing can be done on specific
shards (not distributed).

This worked previously but not in the recent builds (e.g. since 6th June).
I noticed additional update parameters such as "update.distrib" being logged
across the cloud nodes:

...update.distrib=TOLEADER update.chain=notdistributed

I tried update.distrib=NONE. The indexing is still being distributed and
ignores the update.chain (as specified below in the Solr config).

How do I get the above chain and non-distributed indexing to work again?

Regards,

Boon

-
Boon Low
Search UX and Engine Developer (SOLR)
brightsolid Online Publishing
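A minimal sketch of the kind of chain described above - the chain name matches the update.chain value in the log line, while the processors listed are assumptions, since the actual definition was stripped from the message:

    <updateRequestProcessorChain name="notdistributed">
      <!-- deliberately no DistributedUpdateProcessorFactory here -->
      <processor class="solr.LogUpdateProcessorFactory"/>
      <processor class="solr.RunUpdateProcessorFactory"/>
    </updateRequestProcessorChain>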
Re: StreamingUpdateSolrServer Connection Timeout Setting
On Friday, 15.06.2012 18:22 +0100, Kissue Kissue wrote:
> Hi,
>
> Does anybody know what the default connection timeout setting is for
> StreamingUpdateSolrServer? Can i explicitly set one and how?
>
> Thanks.

Use a custom HttpClient to set one (only snippets, should be clear - if not, tell me):

    this.instance = new StreamingUpdateSolrServer(getUrl(), httpClient,
        DOC_QUEUE_SIZE, WORKER_SIZE);

and set up httpClient like this:

    this.connectionManager = new MultiThreadedHttpConnectionManager();
    final HttpClient httpClient = new HttpClient(this.connectionManager);
    httpClient.getParams().setConnectionManagerTimeout(CONN_ACQUIRE_TIMEOUT);
    httpClient.getParams().setSoTimeout(SO_TIMEOUT);

regards
Torsten
Re: StreamingUpdateSolrServer Connection Timeout Setting
Add-on: you can also register a custom protocol socket factory with commons-httpclient
(which is used by StreamingUpdateSolrServer) to influence socket options, for example:

    final Protocol http = new Protocol("http",
        MycustomHttpSocketFactory.getSocketFactory(), 80);

where MycustomHttpSocketFactory is a factory which extends
org.apache.commons.httpclient.protocol.DefaultProtocolSocketFactory and
overrides / implements methods as needed (direct socket access).

Call this e.g. in a ServletListener in contextInitialized and you are done.

regards
Torsten
Re: SolrCloud and split-brain
On Sat, Jun 16, 2012 at 5:33 AM, Otis Gospodnetic wrote:
> And here is one more Q.
> * Imagine a client is adding documents and, for simplicity, imagine SolrCloud
> routes all these documents to the same shard, call it S.
> * Imagine that both the 7-node and the 3-node partition end up with a
> complete index and thus both accept updates.

According to comments from Mark this (having two functioning sides) is not
possible... only one side _can_ continue functioning (taking in updates).
Depending on how the shards are deployed over the nodes, a "side" may still
not accept updates (even if that side has a working zk setup).

> Now imagine if the client sending documents for indexing happened to be
> sending documents to 2 nodes, say in round-robin fashion.

In my understanding all updates are routed through a shard leader.

--
 Sami Siren
Re: delete by query doesn't work
On Mon, 2012-06-18 at 11:45 +0200, ramzesua wrote:
> Hi all. I am using Solr 4.0 and trying to clear the index by query. At first I
> used a delete-by-query with *:* followed by a commit, but the index is still not
> empty. I tried other queries, but that did not help either. Then I tried deleting by
> `id`. That works fine, but I need to clear the whole index. Can anyone help me?

It's a subtle bug/problem in the default schema. Fortunately it is easily
fixable. See https://issues.apache.org/jira/browse/SOLR-3432
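If I read SOLR-3432 correctly, the problem is that delete-by-query can be silently ignored when the update log is enabled but the schema has no _version_ field. A hedged sketch of the workaround - this is the field declaration as it appears in the 4.0 example schema; check the issue itself for the authoritative fix:

    <!-- schema.xml: required when <updateLog/> is enabled in solrconfig.xml -->
    <field name="_version_" type="long" indexed="true" stored="true"/>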
Re: Lock error when indexing with curl
harun sahiner gmail.com> writes:
>
> Hi,
>
> i have a similar "lock" error. Did you find any solution ?
>
> --
> View this message in context:
> http://lucene.472066.n3.nabble.com/Lock-error-when-indexing-with-curl-tp480958p3403119.html
> Sent from the Solr - User mailing list archive at Nabble.com.
>
>

Hi there!

I had the same problem. It seems that "root" (I tried under Debian) was not
accepted as a user that may index files in that folder. Changing the rights
of the folder, or specifying the right user in the curl call, will help :)

Greetings
Heike
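For example, a sketch only - the data directory path and the servlet-container user below are assumptions that depend on your installation - giving the index directory to the user that actually writes the lock file:

    chown -R tomcat6:tomcat6 /path/to/solr/data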
RE: How to update one field without losing the others?
I'm currently playing around with a branch_4x version
(https://builds.apache.org/job/Solr-4.x/5/) but I don't get field updates to work.

A simple GET test request

http://localhost:8983/solr/master/update/json?stream.body={"add":{"doc":{"ukey":"08154711","type":"1","nbody":{"set":"mycontent"}}}}

results in

{
  "ukey":"08154711",
  "type":"1",
  "nbody":"{set=mycontent}"}]
}

All fields are stored.
ukey is the unique key :-)
type is a required field.
nbody is a solr.TextField.

Is there any (wiki/readme) pointer on how to test and use this feature correctly?
What are the restrictions?

Regards,
Kai Gülzau

-Original Message-
From: ysee...@gmail.com [mailto:ysee...@gmail.com] On Behalf Of Yonik Seeley
Sent: Saturday, June 16, 2012 4:47 PM
To: solr-user@lucene.apache.org
Subject: Re: How to update one field without losing the others?

Atomic update is a very new feature coming in 4.0 (i.e. grab a recent nightly
build to try it out). It's not documented yet, but here's the JIRA issue:
https://issues.apache.org/jira/browse/SOLR-139?focusedCommentId=13269007&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13269007

-Yonik
http://lucidimagination.com
Re: StreamingUpdateSolrServer Connection Timeout Setting
You should also call the glue code ;-):

    Protocol.registerProtocol("http", http);

regards
Torsten
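Putting the snippets from this thread together, a self-contained sketch might look roughly as follows. This matches the commons-httpclient-based SolrJ that the snippets above assume; the URL, timeout values and queue/worker sizes are placeholders, and MycustomHttpSocketFactory is the hypothetical factory mentioned above:

    import org.apache.commons.httpclient.HttpClient;
    import org.apache.commons.httpclient.MultiThreadedHttpConnectionManager;
    import org.apache.commons.httpclient.protocol.Protocol;
    import org.apache.solr.client.solrj.impl.StreamingUpdateSolrServer;

    public class SolrClientFactory {
        public StreamingUpdateSolrServer create() throws Exception {
            // optional: register a custom socket factory for plain http
            Protocol.registerProtocol("http",
                    new Protocol("http", MycustomHttpSocketFactory.getSocketFactory(), 80));

            // pooled connection manager plus timeouts
            MultiThreadedHttpConnectionManager connectionManager =
                    new MultiThreadedHttpConnectionManager();
            HttpClient httpClient = new HttpClient(connectionManager);
            httpClient.getParams().setConnectionManagerTimeout(5000L); // max 5s wait for a free connection
            httpClient.getParams().setSoTimeout(30000);                // 30s socket read timeout

            // queue size 20, 4 worker threads - tune for your load
            return new StreamingUpdateSolrServer("http://localhost:8983/solr",
                    httpClient, 20, 4);
        }
    }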
Re: SolrCloud and split-brain
On Jun 15, 2012, at 10:33 PM, Otis Gospodnetic wrote:

> However, if my half brain understands what split brain is then I think that's
> not a completely true claim because one can get unlucky and get a SolrCloud
> cluster partitioned in a way that one or even all partitions reject indexing
> (and update and deletion) requests if they do not have a complete index.

That's not split brain. Split brain means that multiple partitioned clusters
think they are *the* cluster and would keep accepting updates. This is a real
problem because when you unsplit the cluster, you cannot reconcile conflicting
updates easily! In many cases you have to ask the user to resolve the conflict.

Yes, you must have a node to serve a shard in order to index to that shard. You
do not need the whole index - but if an update hashes to a shard that has no
nodes hosting it, it will fail. If there is no node, the document has nowhere
to live. Some systems do interesting things like buffer those updates on other
nodes for a while - we don't plan on anything like that soon. At some point,
you can only survive the loss of so many nodes before it's time to give up
accepting updates in any system. If you need to survive catastrophic loss of
nodes, you have to have enough replicas to handle it. Whether those nodes are
partitioned off from the cluster or simply die, it's all the same. You can
only survive so many node losses, and replicas are your defense.

The lack of split-brain allows your cluster to remain consistent. If you allow
split brain you have to use something like vector clocks and handle conflict
resolution when the splits rejoin, or you will just have a lot of messed up
data. You generally allow split brain when you want to favor write availability
in the face of partitions, like Dynamo. But you must have a strategy for
rejoining splits (like vector clocks or something) or you can never properly go
back to a single, consistent cluster. We favor consistency in the face of
partitions rather than write availability. It seemed like the right choice
for Solr.

- Mark Miller
lucidimagination.com
Re: SolrCloud non-distributed indexing (update.chain)
I think this was changed by https://issues.apache.org/jira/browse/SOLR-2822

Add NoOpDistributingUpdateProcessorFactory to your chain to avoid the distrib
update 'action' being auto-injected.

- Mark Miller
lucidimagination.com

On Jun 18, 2012, at 8:10 AM, Boon Low wrote:

> Hi,
>
> What's happening to the update.chain of SolrCloud?
>
> I am running SolrCloud (compiled from trunk today) with an update.chain
> pointing to an updateRequestProcessorChain in solrconfig which omits the
> DistributedUpdateProcessorFactory, so that indexing can be done on specific
> shards (not distributed).
>
> This worked previously but not in the recent builds (e.g. since 6th June).
> I noticed additional update parameters such as "update.distrib" being
> logged across the cloud nodes:
>
> ...update.distrib=TOLEADER update.chain=notdistributed
>
> I tried update.distrib=NONE. The indexing is still being distributed and
> ignores the update.chain (as specified below in the Solr config).
>
> How do I get the above chain and non-distributed indexing to work again?
>
> Regards,
>
> Boon
>
> -
> Boon Low
> Search UX and Engine Developer (SOLR)
> brightsolid Online Publishing
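A minimal solrconfig.xml sketch of what Mark describes - the chain name is taken from the log line in the original post, and the other processors are assumptions about a typical chain:

    <updateRequestProcessorChain name="notdistributed">
      <!-- opt out of the auto-injected DistributedUpdateProcessorFactory -->
      <processor class="solr.NoOpDistributingUpdateProcessorFactory"/>
      <processor class="solr.LogUpdateProcessorFactory"/>
      <processor class="solr.RunUpdateProcessorFactory"/>
    </updateRequestProcessorChain>

Update requests would then select it with update.chain=notdistributed, as before.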
Re: WordBreak and default dictionary crash Solr
On 06/15/2012 05:16 PM, Dyer, James wrote:
> I'm pretty sure you've found a bug here. Could you tell me whether you're
> using a build from Trunk or Solr_4x? Also, do you know the svn revision or
> the Jenkins build # (or timestamp) you're working from?

I continued to see the problem after updating to the version below (previously
I was running a version built on 06-09):

* solr-spec 4.0.0.2012.06.16.10.22.10
* solr-impl 4.0-2012-06-16_10-02-16 1350899 - hudson - 2012-06-16 10:22:10

> Could you try instead to use DirectSolrSpellChecker instead of
> IndexBasedSpellChecker for your "default" dictionary?

Switching to DirectSolrSpellChecker appears to fix the problem: a query with 2
misspellings, one from each dictionary, does not crash Solr and is correctly
spell-checked.

Thanks!
Carrie Coy
Re: SolrCloud and split-brain
Hi Mark,

Thanks. All that is clear (I think Voldemort does a good job with hinted
handoff, which I think Mark is referring to).

The part that I'm not clear about is maybe not SolrCloud-specific, and that
is - what exactly prevents the two halves of a cluster that's been split from
thinking they are *the* cluster?

Let's say you have a 10-node cluster, say with 10 ZK instances, one instance
on each Solr node. And say 5 of these 10 servers are on switch A and the other
5 are on switch B. Something happens and switch A and the 5 nodes on it get
separated from the 5 nodes on switch B. Say that both A and B happen to have
complete copies of the index.

What in Solr (or ZK) tells either the A or B half that "no, you are not *the*
cluster and thou shalt not accept updates"?

I'm guessing this: https://cwiki.apache.org/confluence/display/ZOOKEEPER/FailureScenarios ?

So then the Q becomes: if we have 10 ZK nodes and they split into 5 & 5 nodes,
does that mean neither side will have quorum, because having 10 ZKs was a bad
number of ZKs to have to begin with?

Thanks,
Otis

Performance Monitoring for Solr / ElasticSearch / HBase - http://sematext.com/spm

----- Original Message -----
> From: Mark Miller
> To: solr-user
> Cc:
> Sent: Monday, June 18, 2012 11:05 AM
> Subject: Re: SolrCloud and split-brain
>
>
> On Jun 15, 2012, at 10:33 PM, Otis Gospodnetic wrote:
>
>> However, if my half brain understands what split brain is then I think
>> that's not a completely true claim because one can get unlucky and get a
>> SolrCloud cluster partitioned in a way that one or even all partitions reject
>> indexing (and update and deletion) requests if they do not have a complete
>> index.
>
> That's not split brain. Split brain means that multiple partitioned clusters
> think they are *the* cluster and would keep accepting updates. This is a real
> problem because when you unsplit the cluster, you cannot reconcile conflicting
> updates easily! In many cases you have to ask the user to resolve the
> conflict.
>
> Yes, you must have a node to serve a shard in order to index to that shard. You
> do not need the whole index - but if an update hashes to a shard that has no
> nodes hosting it, it will fail. If there is no node, the document has nowhere
> to live. Some systems do interesting things like buffer those updates on other
> nodes for a while - we don't plan on anything like that soon. At some point,
> you can only survive the loss of so many nodes before it's time to give up
> accepting updates in any system. If you need to survive catastrophic loss of
> nodes, you have to have enough replicas to handle it. Whether those nodes are
> partitioned off from the cluster or simply die, it's all the same. You can
> only survive so many node losses, and replicas are your defense.
>
> The lack of split-brain allows your cluster to remain consistent. If you allow
> split brain you have to use something like vector clocks and handle conflict
> resolution when the splits rejoin, or you will just have a lot of messed up
> data. You generally allow split brain when you want to favor write availability
> in the face of partitions, like Dynamo. But you must have a strategy for
> rejoining splits (like vector clocks or something) or you can never properly go
> back to a single, consistent cluster. We favor consistency in the face of
> partitions rather than write availability. It seemed like the right choice for
> Solr.
>
> - Mark Miller
> lucidimagination.com
>
Re: SolrCloud and split-brain
>
> So then the Q becomes: if we have 10 ZK nodes and they split in 5 & 5
> nodes, does that mean neither side will have quorum because having 10 ZKs
> was a bad number of ZKs to have to begin with?

Right - from the ZooKeeper admin guide, under Clustered Setup: "Because
Zookeeper requires a majority, it is best to use an odd number of machines."

--
- Mark
http://www.lucidimagination.com
StandardTokenizerFactory behaviour
Hello,

I have been working on Solr for the last few months and am stuck somewhere.

Analyzer in Field Definition:
--

In: "Please, email john@foo.com by 03-09, re: m37-xq."
Expected Out: "Please", "email", "john@foo.com", "by", "03-09", "re", "m37-xq"

but I am not getting this. Is something wrong with my understanding of
StandardTokenizer? I am using Solr 3.6.

Please let me know what is wrong with this.

Thanks

--
View this message in context:
http://lucene.472066.n3.nabble.com/StandardTokenizerFactory-behaviour-tp3990215.html
Sent from the Solr - User mailing list archive at Nabble.com.
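For reference, a typical Solr 3.6 text field using StandardTokenizerFactory - such as the one presumably referenced above, whose XML did not come through - might look roughly like this; the type name and the extra filter are assumptions:

    <fieldType name="text_standard" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.StandardTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
      </analyzer>
    </fieldType>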
Re: StandardTokenizerFactory behaviour
Just to make sure there is no ambiguity: the

In: "Please, email john@foo.com by 03-09, re: m37-xq."

is the input given to this field for indexing, and the

Expected Out: "Please", "email", "john@foo.com", "by", "03-09", "re", "m37-xq"

is the expected output tokens.

--
View this message in context:
http://lucene.472066.n3.nabble.com/StandardTokenizerFactory-behaviour-tp3990215p3990216.html
Sent from the Solr - User mailing list archive at Nabble.com.
Re: How to update one field without losing the others?
On Mon, Jun 18, 2012 at 5:03 PM, Kai Gülzau wrote:
> I'm currently playing around with a branch_4x version
> (https://builds.apache.org/job/Solr-4.x/5/) but I don't get field updates to
> work.
>
> A simple GET test request
> http://localhost:8983/solr/master/update/json?stream.body={"add":{"doc":{"ukey":"08154711","type":"1","nbody":{"set":"mycontent"}}}}
>
> results in
> {
>   "ukey":"08154711",
>   "type":"1",
>   "nbody":"{set=mycontent}"}]
> }
>
> All fields are stored.
> ukey is the unique key :-)
> type is a required field.
> nbody is a solr.TextField.

With the Solr example (4.x), the following seems to work:

URL=http://localhost:8983/solr/update

curl $URL?commit=true -H 'Content-type:application/json' -d '{
  "add": { "doc": { "id": "id", "title": "test", "price_f": 10 }}}'

curl $URL?commit=true -H 'Content-type:application/json' -d '{
  "add": { "doc": { "id": "id", "price_f": {"set": 5}}}}'

If you are using solrj then there's a junit test method, testUpdateField(),
that does something similar:
http://svn.apache.org/viewvc/lucene/dev/trunk/solr/solrj/src/test/org/apache/solr/client/solrj/SolrExampleTests.java?view=markup

--
 Sami Siren
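For SolrJ users, a minimal sketch of the same atomic update, based on the approach used in that test - the server URL and field names are placeholders:

    import java.util.Collections;

    import org.apache.solr.client.solrj.SolrServer;
    import org.apache.solr.client.solrj.impl.HttpSolrServer;
    import org.apache.solr.common.SolrInputDocument;

    public class AtomicUpdateExample {
        public static void main(String[] args) throws Exception {
            SolrServer server = new HttpSolrServer("http://localhost:8983/solr");

            // index an initial document
            SolrInputDocument doc = new SolrInputDocument();
            doc.addField("id", "id");
            doc.addField("title", "test");
            doc.addField("price_f", 10.0f);
            server.add(doc);
            server.commit();

            // atomic update: only price_f is replaced, other stored fields are kept
            SolrInputDocument update = new SolrInputDocument();
            update.addField("id", "id");
            update.addField("price_f", Collections.singletonMap("set", 5.0f));
            server.add(update);
            server.commit();
        }
    }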
Re: StandardTokenizerFactory behaviour
The behaviour of the StandardTokenizerFactory changed with Solr 3.1. The actual
output is now:

"Please", "email", "john.doe", "foo.com", "by", "03", "09", "re", "m37", "xq"

Best regards from Augsburg

Markus Klose
SHI Elektronische Medien GmbH

-----Original Message-----
From: Alok Bhandari [mailto:alokomprakashbhand...@gmail.com]
Sent: Tuesday, 19 June 2012 07:33
To: solr-user@lucene.apache.org
Subject: Re: StandardTokenizerFactory behaviour

Just to make sure there is no ambiguity: the

In: "Please, email john@foo.com by 03-09, re: m37-xq."

is the input given to this field for indexing, and the

Expected Out: "Please", "email", "john@foo.com", "by", "03-09", "re", "m37-xq"

is the expected output tokens.

--
View this message in context:
http://lucene.472066.n3.nabble.com/StandardTokenizerFactory-behaviour-tp3990215p3990216.html
Sent from the Solr - User mailing list archive at Nabble.com.