Re: /var/tmp in FailureDetector

2010-10-21 Thread Aaron Morton
Too quick for me :)
Aaron


On 21 Oct 2010, at 17:52, Jonathan Ellis  wrote:

> Done in r1025822
> 
> On Wed, Oct 20, 2010 at 12:54 PM, Gary Dusbabek  wrote:
>> You're right!  It looks like dead code that should be removed.
>> 
>> Gary.
>> 
>> On Wed, Oct 20, 2010 at 12:50, aaron morton  wrote:
>>> I should have mentioned the FailureDetectorMBean only has the parameterless 
>>> dumpInterArrivalTimes().
>>> 
>>> The overload that takes InetAddress is not available through JMX.
>>> 
>>> A
>>> On 21 Oct 2010, at 01:55, Gary Dusbabek wrote:
>>> 
>>>> Yes, we should generate it in the right temp directory.  That method
>>>> is an implementation of an interface method (FailureDetectorMBean),
>>>> meant to be invoked by JMX, which is why no other code calls it.
>>>>
>>>> Gary.
>>>>
>>>> On Wed, Oct 20, 2010 at 03:48, aaron morton wrote:
>>>>> I was reading through some code and noticed the following in
>>>>> FailureDetector.dumpInterArrivalTimes()
>>>>>
>>>>>     FileOutputStream fos = new FileOutputStream("/var/tmp/output-"
>>>>>         + System.currentTimeMillis() + ".dat", true);
>>>>>
>>>>> If this is meant to be cross platform I'm happy to create a bug and
>>>>> change it to use File.createTempFile().
>>>>>
>>>>> Also I could not find any use of the dumpInterArrivalTimes(InetAddress
>>>>> ep) overload. Anyone know if it should be kept?
>>>>>
>>>>> thanks
>>>>> Aaron
>>>>>
>>>>>
>>> 
>>> 
>> 
> 
> 
> 
> -- 
> Jonathan Ellis
> Project Chair, Apache Cassandra
> co-founder of Riptano, the source for professional Cassandra support
> http://riptano.com
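
For readers following the thread, here is a minimal sketch of the File.createTempFile() approach suggested in the original message. It is not the change committed in r1025822; the class name, method name, and the "failuredetector-" prefix are hypothetical and chosen only for illustration. The point is that the file lands in the platform's temp directory (java.io.tmpdir) instead of a hard-coded /var/tmp.

```java
import java.io.File;
import java.io.FileOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;

public class TempDumpSketch {
    // Hypothetical helper: writes the given text to a new, uniquely named
    // file in the platform temp directory and returns that file.
    static File dumpToTempFile(String contents) throws IOException {
        // createTempFile picks a unique name, so no timestamp is needed.
        File file = File.createTempFile("failuredetector-", ".dat");
        try (FileOutputStream fos = new FileOutputStream(file, true)) {
            fos.write(contents.getBytes(StandardCharsets.UTF_8));
        }
        return file;
    }

    public static void main(String[] args) throws IOException {
        File f = dumpToTempFile("sample inter-arrival times\n");
        System.out.println("wrote " + f.getAbsolutePath());
    }
}
```

The output path differs per platform, which is the intent: the same code should work on hosts where /var/tmp does not exist.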


Re: Question about ColumnFamily Id's

2010-10-21 Thread Gary Dusbabek
On Wed, Oct 20, 2010 at 15:53, Aaron Morton  wrote:
> I was helping a guy who in the end had a mixed beta1 and beta2
> cluster http://www.mail-archive.com/u...@cassandra.apache.org/msg06661.html
> I had a look around the code and have a couple of questions, just for my
> understanding.
> When ReadResponseSerializer is called to deserialize the response from a
> node, it calls the RowSerializer which uses the ColumnFamilySerializer. If
> the CfId in the row is not known on the node
> an UnserializableColumnFamilyException is thrown. It's an IOException
> subclass and the error is treated as an Internal Error by the thrift generated
> Cassandra server.
> The read message sent to the node contains the Keyspace+CF names, and it
> returns its CfId in the response.
> It looks like if a node somehow has a different/bad schema it can cause
> reads to fail. Is this correct?

Seems like the right thing to do.

> Could its response be ignored if the read
> still meets the CL?

I haven't made up my mind about this one.  On one hand, it would be
good to let the user know that things aren't good in the cluster.  But
on the other hand, if the cluster is still healthy enough to return good
results, maybe we should let it.  I lean towards the latter.
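
A rough sketch of the "ignore the bad reply if the CL is still met" idea, for illustration only. This is not Cassandra's read path: the Reply type, the resolve method, and the blockFor parameter are hypothetical names, and real resolution would also reconcile versions across replicas, which is skipped here.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Optional;

public class TolerantReadSketch {
    // Hypothetical reply: value is null when the reply could not be
    // deserialized (for example, because of an unknown CfId).
    static final class Reply {
        final String value;
        Reply(String value) { this.value = value; }
        boolean usable() { return value != null; }
    }

    // Returns a value if at least blockFor replies were usable,
    // otherwise empty (the caller would then fail the read).
    static Optional<String> resolve(List<Reply> replies, int blockFor) {
        int usable = 0;
        String first = null;
        for (Reply r : replies) {
            if (!r.usable()) {
                continue; // skip the bad reply instead of failing the whole read
            }
            if (first == null) {
                first = r.value;
            }
            usable++;
        }
        return usable >= blockFor ? Optional.ofNullable(first) : Optional.empty();
    }

    public static void main(String[] args) {
        List<Reply> replies = Arrays.asList(new Reply("v1"), new Reply(null), new Reply("v1"));
        System.out.println(resolve(replies, 2).isPresent()); // two usable replies
        System.out.println(resolve(replies, 3).isPresent()); // only two of three usable
    }
}
```

The trade-off raised above still applies: a read that succeeds this way hides the schema problem unless the skipped reply is also logged or surfaced somewhere.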

> Next question was how nodes could ever get to have a different CfId for the
> same Keyspace+CF pair?  It looks like the CfId is never changed, so it
> would only happen if two nodes were each given a schema update and could not
> communicate it with each other.

It would have to be as a result of two migrations.  When migrations
are exported to other nodes, the cfid used is the value that is sent,
not the next counter value on the host receiving the migration.
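
To make that distinction concrete, here is a toy model, not Cassandra's migration code. The class, the method names, and the starting counter value of 1000 are all assumptions made for the example. A locally created column family takes the next value from the node's own counter, while an applied migration keeps the id that was sent.

```java
import java.util.HashMap;
import java.util.Map;

public class CfIdRegistrySketch {
    private final Map<String, Integer> idsByName = new HashMap<>();
    private int nextId = 1000; // arbitrary starting value, assumed for the sketch

    // Local schema change: allocate the id from this node's own counter.
    int createLocally(String keyspaceAndCf) {
        int id = nextId++;
        idsByName.put(keyspaceAndCf, id);
        return id;
    }

    // Migration received from another node: use the id that was sent,
    // not this node's next counter value.
    void applyRemote(String keyspaceAndCf, int sentId) {
        idsByName.put(keyspaceAndCf, sentId);
        // Assumption: keep the local counter ahead of any id seen so far.
        nextId = Math.max(nextId, sentId + 1);
    }

    Integer idOf(String keyspaceAndCf) {
        return idsByName.get(keyspaceAndCf);
    }

    public static void main(String[] args) {
        CfIdRegistrySketch a = new CfIdRegistrySketch();
        CfIdRegistrySketch b = new CfIdRegistrySketch();
        int id = a.createLocally("Keyspace1.Users");
        b.applyRemote("Keyspace1.Users", id);
        System.out.println(a.idOf("Keyspace1.Users").equals(b.idOf("Keyspace1.Users")));

        // Two isolated nodes that each create the CF locally can diverge.
        CfIdRegistrySketch c = new CfIdRegistrySketch();
        c.createLocally("Keyspace1.Other");
        System.out.println(c.createLocally("Keyspace1.Users") == id);
    }
}
```

In this model the ids only disagree when both nodes allocate locally for the same Keyspace+CF pair, which matches the "two migrations" scenario described above.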

> Am guessing the whole scenario is "unsupported", just trying to understand
> what's happening.

Makes me scratch my head too.

Gary.

> Thanks
> Aaron
>
>


Re: Would it be possible to do a ColumnCache?

2010-10-21 Thread Héctor Izquierdo Seliva
On Thu, 21-10-2010 at 00:02 -0500, Jonathan Ellis wrote:
> I don't think it's possible.  Cassandra's data model means you can't
> know what columns are present in a row, so if you query, say, name and
> birthdate from your users column family and only the name column is in
> the cache for the row you are querying, does that mean that birthdate
> doesn't exist in that row?  Or just that it's not cached?  You don't
> know, so you'd have to do a full read every time.  Similar
> difficulties arise with slices.

I was thinking of a column cache to be used when you need a slice of a
row, where you'd ask the cache for all the rows in the slice, and then
read from disk whichever rows weren't in cache. If the columns you want
don't exist, you still hit the disk, but you get a nice speedup for
existing columns.

If you want a write-through cache then things get a bit more complicated,
but it should be doable.

The question is, would the performance improvement justify the
increase in complexity?
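
A small sketch of the read path described above, for by-name reads only. The class name and the function standing in for a disk read are hypothetical, and the code ignores slices, writes, and invalidation. It shows both halves of the argument: cached columns avoid the disk, and a column that does not exist forces a disk read every time.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.function.BiFunction;

public class ColumnCacheSketch {
    // rowKey -> (columnName -> value); only columns known to exist are cached.
    private final Map<String, Map<String, String>> cache = new HashMap<>();
    // Hypothetical stand-in for a disk read: returns the requested columns that exist.
    private final BiFunction<String, List<String>, Map<String, String>> disk;
    int diskReads = 0;

    ColumnCacheSketch(BiFunction<String, List<String>, Map<String, String>> disk) {
        this.disk = disk;
    }

    Map<String, String> read(String rowKey, List<String> names) {
        Map<String, String> cachedRow = cache.computeIfAbsent(rowKey, k -> new HashMap<>());
        Map<String, String> result = new HashMap<>();
        List<String> misses = new ArrayList<>();
        for (String name : names) {
            String value = cachedRow.get(name);
            if (value != null) {
                result.put(name, value);
            } else {
                misses.add(name); // not cached: may or may not exist on disk
            }
        }
        if (!misses.isEmpty()) {
            diskReads++;
            Map<String, String> fromDisk = disk.apply(rowKey, misses);
            cachedRow.putAll(fromDisk);
            result.putAll(fromDisk);
        }
        return result;
    }

    public static void main(String[] args) {
        Map<String, String> stored = new HashMap<>();
        stored.put("name", "hector");
        ColumnCacheSketch c = new ColumnCacheSketch((row, cols) -> {
            Map<String, String> out = new HashMap<>();
            for (String col : cols) {
                if (stored.containsKey(col)) out.put(col, stored.get(col));
            }
            return out;
        });
        c.read("user1", Arrays.asList("name", "birthdate")); // disk read, caches name
        c.read("user1", Arrays.asList("name"));              // served from cache
        c.read("user1", Arrays.asList("name", "birthdate")); // birthdate misses again
        System.out.println(c.diskReads); // prints 2
    }
}
```

Caching "known absent" markers would avoid the repeated miss, but that is part of the extra complexity the question above is asking about.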



> 
> 2010/10/20 Héctor Izquierdo Seliva :
> > Hi. Before wasting time on something that might not be feasible at the
> > moment, I wanted to ask the devs if a column cache would be possible
> > (instead of a whole row cache). This would allow users with fat rows to
> > also use a cache and reduce latency for hot data.
> >
> > If this is possible, I'd appreciate some hints about where to dig.
> >
> > Thanks for your time!
> >
> >
> 
> 
> 




Re: Would it be possible to do a ColumnCache?

2010-10-21 Thread Héctor Izquierdo Seliva
When I say "rows in the slice" I mean "columns in the slice", sorry for
the typo.

On Thu, 21-10-2010 at 17:44 +0200, Héctor Izquierdo Seliva wrote:
> On Thu, 21-10-2010 at 00:02 -0500, Jonathan Ellis wrote:
> > I don't think it's possible.  Cassandra's data model means you can't
> > know what columns are present in a row, so if you query, say, name and
> > birthdate from your users column family and only the name column is in
> > the cache for the row you are querying, does that mean that birthdate
> > doesn't exist in that row?  Or just that it's not cached?  You don't
> > know, so you'd have to do a full read every time.  Similar
> > difficulties arise with slices.
> 
> I was thinking of a column cache to be used when you need a slice of a
> row, where you'd ask the cache for all the rows in the slice, and then
> read from disk whichever rows weren't in cache. If the columns you want
> don't exist, you still hit the disk, but you get a nice speedup for
> existing columns.
>
> If you want a write-through cache then things get a bit more complicated,
> but it should be doable.
>
> The question is, would the performance improvement justify the
> increase in complexity?
> 
> 
> 
> > 
> > 2010/10/20 Héctor Izquierdo Seliva :
> > > Hi. Before wasting time on something that might not be feasible at the
> > > moment, I wanted to ask the devs if a column cache would be possible
> > > (instead of a whole row cache). This would allow users with fat rows to
> > > also use a cache and reduce latency for hot data.
> > >
> > > If this is possible, I'd appreciate some hints about where to dig.
> > >
> > > Thanks for your time!
> > >
> > >
> > 
> > 
> > 
> 
>