Re: Stream sstables hosted on a node from client using streaming protocol

2015-05-10 Thread Pierre Devops
OK, so I know a little more now: it's not doable in client mode at the
moment because it relies too much on server-side code.

It needs to initialize a ColumnFamilyStore and use an instance of it
afterwards, which would require too much server-side configuration
initialization.

Secondly, the way it streams is inefficient for my purpose, because it
deserializes the streamed sstable to rebuild a new sstable in
SSTableWriter.appendFromStream (needed to rebuild the index and other
components), while I just need to copy the -Data- file to disk.

So I think I'm going to provide my own IncomingFileMessage and its own
deserializer.
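
For illustration only, the "just copy the -Data- file" idea could look
roughly like the sketch below. RawStreamCopy and copyExactly are
hypothetical names, not part of Cassandra, and the sketch assumes the
payload length is already known from the file message header. It ignores
compression and the section layout of the real streaming protocol.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

/** Hypothetical helper (not a Cassandra class): copies raw bytes to disk. */
final class RawStreamCopy {
    private RawStreamCopy() {}

    /**
     * Copies exactly `length` bytes from `in` to `target` without
     * interpreting them, leaving any following bytes unread on the stream.
     * Returns the number of bytes written.
     */
    static long copyExactly(InputStream in, Path target, long length) throws IOException {
        byte[] buffer = new byte[64 * 1024];
        long remaining = length;
        try (OutputStream out = Files.newOutputStream(target)) {
            while (remaining > 0) {
                // Never read past the announced payload length.
                int toRead = (int) Math.min(buffer.length, remaining);
                int read = in.read(buffer, 0, toRead);
                if (read < 0) {
                    throw new EOFException("stream ended with " + remaining + " bytes missing");
                }
                out.write(buffer, 0, read);
                remaining -= read;
            }
        }
        return length - remaining;
    }
}
```

A custom deserializer would call something like this once it has read the
header, instead of going through SSTableWriter.appendFromStream.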



2015-05-09 23:32 GMT+02:00 Pierre Devops :

> Thanks yuki, copying SSTableLoader was the first thing I tried, but
> without success.
>
> I checked BulkLoadConnectionFactory (
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/tools/BulkLoadConnectionFactory.java)
> and I don't see what it provides over the DefaultConnectionFactory that
> would help me in this case.
>
> Without setting up a custom connection factory, it already manages to
> connect to the node and send a streaming request (I see it in the
> Cassandra logs).
>
>> INFO  21:16:25 [Stream #a630d860-f690-11e4-a2d0-adca0d5ee899 ID#0] Creating new streaming plan for SST Import
>> INFO  21:16:25 [Stream #a630d860-f690-11e4-a2d0-adca0d5ee899, ID#0] Received streaming plan for SST Import
>> INFO  21:16:25 [Stream #a630d860-f690-11e4-a2d0-adca0d5ee899, ID#0] Received streaming plan for SST Import
>> INFO  21:16:25 [Stream #a630d860-f690-11e4-a2d0-adca0d5ee899 ID#0] Prepare completed. Receiving 0 files(0 bytes), sending 2 files(4083518 bytes)
>> INFO  21:16:25 [Stream #a630d860-f690-11e4-a2d0-adca0d5ee899] Session with /127.0.0.1 is complete
>> WARN  21:16:25 [Stream #a630d860-f690-11e4-a2d0-adca0d5ee899] Stream failed
>> ERROR 21:16:25 [Stream #a630d860-f690-11e4-a2d0-adca0d5ee899] Streaming error occurred
>
>
>
> So it looks like my client is receiving two messages in its
> ConnectionHandler loop (
> https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/streaming/ConnectionHandler.java#L251)
> . The first one is of type PREPARE_MESSAGE, with a StreamSummary
> indicating the correct number of files.
>
> But it fails to deserialize the second message it receives. So I debugged
> and dumped what was coming from this socket, and it was the sstables, but
> I don't know why deserialization of the message type fails.
>
>


Re: Proposal: release 2.2 (based on current trunk) before 3.0 (based on 8099)

2015-05-10 Thread tupshin
+1

On Sat, May 9, 2015, at 06:38 PM, Jonathan Ellis wrote:
> With 8099 still weeks from being code complete, and even longer from
> being stable, I’m starting to think we should decouple everything that’s
> already done in trunk from 8099.  That is, ship 2.2 ASAP with
>
> - Windows support
> - UDF
> - Role-based permissions
> - JSON
> - Compressed commitlog
> - Off-heap row cache
> - Message coalescing on by default
> - Native protocol v4
>
> and let 3.0 ship with 8099 and a few things that finish by then (vnode
> compaction, file-based hints, maybe materialized views).
>
> Remember that we had 7 release candidates for 2.1.  Splitting 2.2 and
> 3.0 up this way will reduce the risk in both 2.2 and 3.0 by separating
> most of the new features from the big engine change.  We might still
> have a lot of stabilization to do for either or both, but at the least
> this lets us get a head start on testing the new features in 2.2.
>
> This does introduce a new complication, which is that instead of 3.0
> being an unusually long time after 2.1, it will be an unusually short
> time after 2.2.  The “default” if we follow established practice would
> be to
>
> - EOL 2.1 when 3.0 ships, and maintain 2.2.x and 3.0.x stabilization
>   branches
>
> But, this is probably not the best investment we could make for our
> users since 2.2 and 3.0 are relatively close in functionality.  I see a
> couple other options without jumping to 3 concurrent stabilization
> series:
>
> - Extend 2.1.x series and 2.2.x until 4.0, but skip 3.0.x stabilization
>   series in favor of tick-tock 3.x
> - Extend 2.1.x series until 4.0, but stop 2.2.x when 3.0 ships in favor
>   of developing 3.0.x instead
>
> Thoughts?
>
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder, http://www.datastax.com
> @spyced


Re: Proposal: release 2.2 (based on current trunk) before 3.0 (based on 8099)

2015-05-10 Thread tupshin
To clarify, I'm +1ing the creation of a stable 2.2 branch, prior to
8099, in order to not block certain key features, as mentioned. Neutral
on any additional nuances.

-Tupshin



Re: Proposal: release 2.2 (based on current trunk) before 3.0 (based on 8099)

2015-05-10 Thread Phil Yang
How about naming it 2.9, as a development preview version before 3.0? If
this version and 3.0 are close in functionality, it is not a good idea for
the two version numbers to differ so much. And after 3.0 ships, I think
we should stop maintaining this version because of its similarity to 3.0,
and still maintain 2.1.x, since 2.1.0 shipped 8 months ago and so far
only has a "maybe production-ready" version, 2.1.5.


-- 
Thanks,
Phil Yang


Re: Proposal: release 2.2 (based on current trunk) before 3.0 (based on 8099)

2015-05-10 Thread Aleksey Yeschenko
Releasing a 2.2 now is indeed a good idea, +1 to that.

Regarding EOLs, however, I don’t feel that dropping the planned 3.0.x
stabilisation branch is necessary.

I’d also say that having both 2.1.x and 2.2.x LTS branches is 1) very
cheap for us and 2) not really needed.

Here is why:

1) The new features in 2.2 don’t modify the core heavily - unlike 3.0 would. 
Hence 2.1 patches almost always apply cleanly to trunk, not causing us 
headaches as developers

2) New features being almost entirely opt-in, if you don’t use them, you can 
jump from 2.1 to 2.2 without significant stability degradation. It’s only the 
new features, in this case, that require stabilising. Messaging formats haven’t 
changed, sstable format is the same, the storage engine has had no 
modifications.

So, maintaining both 2.1 and 2.2 LTS branches, while cheap for us, is 
unnecessary, and would cause avoidable fragmentation.

3.0, however, will require a stabilisation period, just by the nature of it. It 
might seem like 2.2 and 3.0 are closer to each other than 2.1 and 2.2 are, if 
you go purely by the feature list, but in fact the opposite is true.

So I’d suggest a third EOL alternative. We leave the planned 3.0.x 
stabilisation branch in place - we are going to need it. And we have the new 
2.2 branch inherit 2.1’s LTS status, and retire 2.1 itself earlier than 
planned. In other words,

1) 2.0.x branch goes EOL when 3.0 is out, as planned
2) 3.0.x LTS branch stays, as planned, and helps us stabilise the new storage 
engine
3) a few months after 2.2 gets released, we EOL 2.1. Users upgrade to 2.2 and
get the same stability as with 2.1.7, plus a few new features

With that addition, +100 to the idea of having a 2.2 ASAP.

-- 
AY



Re: Proposal: release 2.2 (based on current trunk) before 3.0 (based on 8099)

2015-05-10 Thread Robert Stupp
+1 on the idea of releasing what’s already there and what’s possible without 
big effort for 3.0.

Instead of labeling it 2.2, I’d like to propose labeling it 3.0 (so basically
just move 8099 to 3.1).
In the end it’s “only a label”. But there are a lot of new user-facing features
in it that justify a major release.
