Re: Low hanging fruit crew

2016-10-18 Thread Nate McCall
That's a good idea, Ed, thanks for bringing this up. Kurt, if you are offering up resources for review and test coverage, there is work out there. Most immediately in the department of reviews for "Patch Available." While not quite low-hanging fruit, it's helpful to have non-committers look throu

Re: Batch read requests to one physical host?

2016-10-18 Thread Eric Stevens
We've had some luck with bulk known key reads with grouping by replica and doing SELECT... WHERE key IN(...). Not compatible with all data models, but it works well where we can get away with it. As a more general purpose construct it makes sense to me. In our driver layer we have abstracted batch

Batch read requests to one physical host?

2016-10-18 Thread Dikang Gu
Hi there, We have couple use cases that are doing fanout read for their data, means one single read request from client contains multiple keys which live on different physical hosts. (I know it's not recommended way to access C*). Right now, on the coordinator, it will issue separate read command

Re: Low hanging fruit crew

2016-10-18 Thread kurt Greaves
So there are a bunch of us at Instaclustr looking into contributing to the project on a more frequent basis. We'd definitely be interested in some kind of LHF crew under whom our new "contributors" could make tracks on becoming main committers, while having some close by colleagues who can help the

Re: Use of posix_fadvise

2016-10-18 Thread Edward Capriolo
I want to point out something: https://issues.apache.org/jira/browse/CASSANDRA-6846 "I'm definitively -1 on putting any type of contract on the internals. They are called internals for a reason, and if rewriting it all entirely tomorrow is best for Cassandra, we should have the possibility to do

Re: Use of posix_fadvise

2016-10-18 Thread Nate McCall
Benedict mentioned it briefly above, but the earliest / most detailed conversation on this can be found in CASSANDRA-1470. You may also find some stuff from the dev list and other issues around the same time from specifically from Chris G. and Peter S. (names on that ticket) as, IIRC, they were bo

Re: Cleanup after yourselves please

2016-10-18 Thread Edward Capriolo
IMHO, while through the code. A vast majority of the //comments would be better as /** */ comments in method declarations. Many believe that excessive inline comments could indicate a code smell. On Tue, Oct 18, 2016 at 1:21 PM, Josh McKenzie wrote: > > > > tests hastily and messly commented o

Re: Use of posix_fadvise

2016-10-18 Thread Benedict Elliott Smith
I'm not certain this is the best way to go about encouraging people to help you, or generally encourage participation in the project. You have seemed to lash out at the project (and in this case me specifically) in a fairly antagonistic manner a multitude of times in just a couple of hours. Your

Re: Use of posix_fadvise

2016-10-18 Thread Michael Kjellman
Sorry, No. Always document your assumptions. I shouldn't need to git blame a thousand commits and read thru a billion tickets to maybe understand why something was done. Clearly thru the conversations on this topic I've had on IRC and the responses so far on this email thread it's not/still not

Re: Cleanup after yourselves please

2016-10-18 Thread Josh McKenzie
> > tests hastily and messly commented out line by line (*whyy?*) Couldn't we use /* */ comments instead of every single line one by one? When Jake and I were mass porting unit tests for 8099, I know I used idea's shortcut (ctrl + /) to block comment out things that wouldn't compile while port

Re: Use of posix_fadvise

2016-10-18 Thread Benedict Elliott Smith
This is what JIRA is for. It seems to date back to CASSANDRA-1470, where the default became immediately evicting newly compacted files. This results in cold reads for *hot* data after compaction, so CASSANDRA-6916 permitted evicting the *old* data instead, while guaranteeing >= the same amount of

Re: Use of posix_fadvise

2016-10-18 Thread Michael Kjellman
Yeah, it has been there for years -- that being said most of the community is just catching up to 2.1 and 3.0 now where the usage did appear to change over 2.0-- and I'm more trying to figure out what the intent was in the various usages all over the codebase and make sure it's actually doing th

Re: Use of posix_fadvise

2016-10-18 Thread Jake Luciani
Although given we have an in process page cache[1] now this may not be needed anymore? This is only for the data file though. I think its been years? since we showed it helped so perhaps someone should show if this is still working/helping in the real world. [1] https://issues.apache.org/jira/bro

Re: Cleanup after yourselves please

2016-10-18 Thread Michael Kjellman
Cool, as I would have assumed they would need to be. Given they were initially commented out on 6/30/15 maybe cleanup and removal of that dead code is still at least warranted. On Oct 18, 2016, at 9:15 AM, Oleksandr Petrov mailto:oleksandr.pet...@gmail.com>> wrote: Unit tests will be completel

8099 Storage Format Documentation as used with PRIMARY_INDEX

2016-10-18 Thread Michael Kjellman
I'm working on writing Birch for trunk and I noticed the following: https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/db/columniterator/AbstractSSTableIterator.java#L503 Prior to 3.0 the offset was the literal offset into the data file, yet now we seem to be doing the

Re: Cleanup after yourselves please

2016-10-18 Thread Oleksandr Petrov
It makes sense to make them work for backwards compatibility. There was no announcement that if you migrated to 3.x they wouldn't work, so everyone would most likely expect a clear upgrade path. it's not bringing them back from the dead, but rather helping people who want to upgrade to have this op

Re: Use of posix_fadvise

2016-10-18 Thread Ariel Weisberg
Hi, Compaction can merge some very large files together with data that may be completely cold. So yeah caching the whole file just creates pressure to evict useful stuff. In some theories. In other theories the page cache is flush and scan resistant and should just eat this stuff up without inter

Re: Use of posix_fadvise

2016-10-18 Thread Michael Kjellman
Within a single SegmentedFile? On Oct 18, 2016, at 9:02 AM, Ariel Weisberg mailto:ariel.weisb...@datastax.com>> wrote: With compaction there can be hot and cold data mixed together.

Re: Cleanup after yourselves please

2016-10-18 Thread Michael Kjellman
Gotcha, I didn't know we were actually bringing them back from the dead! That being said, won't the unit tests need to be re-writtten (or at least refactored) after your work? Couldn't we use /* */ comments instead of every single line one by one? Given we use source control couldn't we remove

Re: Use of posix_fadvise

2016-10-18 Thread Ariel Weisberg
Hi, With compaction there can be hot and cold data mixed together. So we want to drop the data and then warm it via early opening so only the hot data is in the cache. Some of those cases are for the old sstable that have been rewritten or discarded so the data is entirely defunct. The files migh

Low hanging fruit crew

2016-10-18 Thread Edward Capriolo
I go through the Cassandra jira weekly and I notice a number of tickets which appear to be clear issues or requests for simple metrics. https://issues.apache.org/jira/browse/CASSANDRA-12626 https://issues.apache.org/jira/browse/CASSANDRA-12330 I also have a few jira issues (opinion) would be sim

Re: Use of posix_fadvise

2016-10-18 Thread Michael Kjellman
Specifically regarding the behavior in different kernels, from `man posix_fadvise`: "In kernels before 2.6.6, if len was specified as 0, then this was interpreted literally as "zero bytes", rather than as meaning "all bytes through to the end of the file"." On Oct 18, 2016, at 8:57 AM, Michael

Re: Use of posix_fadvise

2016-10-18 Thread Michael Kjellman
Right, so in SSTableReader#GlobalTidy$tidy it does: // don't ideally want to dropPageCache for the file until all instances have been released CLibrary.trySkipCache(desc.filenameFor(Component.DATA), 0, 0); CLibrary.trySkipCache(desc.filenameFor(Component.PRIMARY_INDEX), 0, 0); It seems to me ever

Re: Use of posix_fadvise

2016-10-18 Thread Jake Luciani
The main point is to avoid keeping things in the page cache that are no longer needed like compacted data that has been early opened elsewhere. On Oct 18, 2016 11:29 AM, "Michael Kjellman" wrote: > We use posix_fadvise in a bunch of places, and in stereotypical Cassandra > fashion no comments we

Re: Use of posix_fadvise

2016-10-18 Thread Michael Kjellman
Sure -- my bad, I aggregated them all of them up for you: https://github.com/apache/cassandra/search?utf8=✓&q=CLibrary.trySkipCache&type=Code https://github.com/apache/cassandra/blob/81f6c784ce967fadb6ed7f5

Re: Use of posix_fadvise

2016-10-18 Thread Benedict Elliott Smith
... and continuing in the fashion of behaviours one might like to disabuse people of, no code link is provided. On 18 October 2016 at 16:28, Michael Kjellman wrote: > We use posix_fadvise in a bunch of places, and in stereotypical Cassandra > fashion no comments were provided. > > There is a c

Use of posix_fadvise

2016-10-18 Thread Michael Kjellman
We use posix_fadvise in a bunch of places, and in stereotypical Cassandra fashion no comments were provided. There is a check the OS is Linux (okay, a start) but it turns out the behavior of providing a length of 0 to posix_fadvise changed in some 2.6 kernels. We don't check the kernel version

Re: Cleanup after yourselves please

2016-10-18 Thread Oleksandr Petrov
I'm currently working on actually making Super Columns work in CQL context. Currently they do not really work[1]. It's not a very small piece of work. It was in the pipeline for some time, although there most likely were more important things that had to be worked on. I understand your disappointm

Cleanup after yourselves please

2016-10-18 Thread Michael Kjellman
There was a bunch of tests hastily and messly commented out line by line (*whyy?*) ColumnFamilyStoreTest with comments that they are pending SuperColumns support post 8099. Could those responsible please cleanup after themselves? It's been a while since 8099 was committed in the first place an