from:"Joel Knighton"

Re: Problems with test DateTieredCompactionStrategyTest for 2.2.5 on RedHat machine

2016-04-15 Thread Joel Knighton

If this is the only problem you're experiencing, I don't think the problem
is on your end. DataStax runs a CI server at cassci.datastax.com, and a
quick look suggests that the same test occasionally fails there (
http://cassci.datastax.com/job/cassandra-3.0_testall/472/testReport/org.apache.cassandra.db.compaction/DateTieredCompactionStrategyTest/testFilterOldSSTables/
).

It is likely that this is a test timing issue exacerbated by your
particular setup. While you're more than welcome to open an issue for this
and try to fix it if it is of particular concern for you, I wouldn't let it
be a blocker to otherwise unrelated work.
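
For anyone who wants to quantify how often a test like this fails before
deciding whether to open an issue, a minimal Python sketch is below. It
reruns a command a fixed number of times and counts the non-zero exits. The
helper name count_failures and the placeholder command are illustrative
assumptions, not part of the Cassandra build; substitute whatever command
runs the single test in your checkout.

```python
import subprocess
import sys


def count_failures(command, runs):
    """Run `command` `runs` times and return how many runs exited non-zero."""
    failures = 0
    for _ in range(runs):
        # Output is discarded; only the exit status matters for the tally.
        result = subprocess.run(
            command,
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
        if result.returncode != 0:
            failures += 1
    return failures


# Placeholder command that always fails, standing in for the real test run.
always_fails = [sys.executable, "-c", "raise SystemExit(1)"]
print(count_failures(always_fails, 3), "of 3 runs failed")
```

A failure count that is neither zero nor equal to the number of runs is a
reasonable hint that the test is flaky rather than broken by a local change.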

In the future, for these sorts of problems (a test is failing and you don't
know if it is related to your development, etc.), you might get a lower
latency reply on #cassandra-dev on Freenode.

Best,
Joel

On Fri, Apr 15, 2016 at 10:16 AM, Giampaolo Trapasso <
giampaolo.trapa...@radicalbit.io> wrote:

> Just an additional info. Repeating the test, I get odd executions as
> failed, even as successful. The error message remains the same:
> [junit] Testcase:
>
> testFilterOldSSTables(org.apache.cassandra.db.compaction.DateTieredCompactionStrategyTest):
> FAILED
> [junit] only the newest 2 sstables should remain expected:<2> but
> was:<42>
> [junit] junit.framework.AssertionFailedError: only the newest 2
> sstables should remain expected:<2> but was:<42>
> [junit] at
>
> org.apache.cassandra.db.compaction.DateTieredCompactionStrategyTest.testFilterOldSSTables(DateTieredCompactionStrategyTest.java:276)
>
> Giampaolo
>
> 2016-04-15 15:38 GMT+02:00 Giampaolo Trapasso <
> giampaolo.trapa...@radicalbit.io>:
>
> > Hi to all,
> >
> > I'm trying to do unit test on cassandra-2.2.5
> > (dd76858c7652541c7b137323f7b9e154686d6fba) and I cannot correctly run
> them.
> > Let me explain. While on my personal laptop (OSX) it's all ok, I'm having
> > some troubles on a RedHat box where I want to put my CI.
> >
> > In particular, I have problems with the DateTieredCompactionStrategyTest.
> > I ran the test twice and, with surprise, the first run failed while the
> > immediate second succeeded. You can find the output of both test here
> > <
> https://gist.github.com/giampaolotrapasso/882f26cef098969cceb65d8143b4c60f#file-test-output-txt
> >
> > and the information about the distro is here
> > <
> https://gist.github.com/giampaolotrapasso/882f26cef098969cceb65d8143b4c60f#file-linux-distro-txt
> >
> >
> >
> > I kindly ask you if you could help me to solve this (for me) strange
> > situation. Any hint is appreciated.
> >
> > Thanks in advance
> >
> > Giampaolo
> >
>



-- 

<http://www.datastax.com/>

Joel Knighton
Cassandra Developer | joel.knigh...@datastax.com

<https://www.linkedin.com/company/datastax>
<https://www.facebook.com/datastax> <https://twitter.com/datastax>
<https://plus.google.com/+Datastax/about>
<http://feeds.feedburner.com/datastax> <https://github.com/datastax/>

<http://cassandrasummit.org/Email_Signature>

Re: UUIDGen : Unsupported major.minor version 52.0

2016-05-10 Thread Joel Knighton

This mailing list is for discussion of the development of Cassandra - since
your mail discusses building a system using Cassandra, it is better suited
for the Users mailing list.

That error indicates that you've built the system using JDK 8 (the minimum
Java version supported on the 3.x series) but are running it on a lower
version of the Java runtime. I'd suggest double-checking the version of
Java used on your test system.
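
As a rough illustration of where the 52.0 comes from: every compiled class
starts with an 8-byte header holding a magic number and the class file minor
and major versions, and major version 52 corresponds to Java 8. The Python
sketch below decodes such a header. The function name and the hand-built
sample header are made up for the example; a real check would read the first
8 bytes of the .class file in question.

```python
import struct


def class_file_java_version(header: bytes) -> int:
    """Return the Java release matching the major version in a class header."""
    # Big-endian layout: u4 magic, u2 minor_version, u2 major_version.
    magic, _minor, major = struct.unpack(">IHH", header[:8])
    if magic != 0xCAFEBABE:
        raise ValueError("not a Java class file")
    # Major version 45 is Java 1.1 and each later release adds one,
    # so 51 maps to Java 7 and 52 maps to Java 8.
    return major - 44


# Hand-built header for a class compiled for Java 8 (minor 0, major 52).
sample_header = struct.pack(">IHH", 0xCAFEBABE, 0, 52)
print(class_file_java_version(sample_header))
```

A runtime older than the release this reports will reject the class with
UnsupportedClassVersionError.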

If you have any followup questions, please start a new thread on the Users
mailing list.

On Tue, May 10, 2016 at 12:07 PM, Ben Vogan wrote:

> Hi all,
>
> I am trying to use Cassandra (v3.5) from spark (v1.6.0 - CDH 5.7) via the
> datastax connector (1.6.0-M2) in scala (2.10.5).  My initial test runs fine
> locally, but when I try to deploy it to a test machine I am getting the
> following exception:
>
> 16/05/09 23:04:25 WARN Lost task 0.0 in stage 1.0 (TID 8,
> hcompute005.internal.shopkick.com):
> java.lang.UnsupportedClassVersionError:
> org/apache/cassandra/utils/UUIDGen : Unsupported major.minor version 52.0
>
> at java.lang.ClassLoader.defineClass1(Native Method)
>
> at java.lang.ClassLoader.defineClass(ClassLoader.java:800)
>
> at
> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>
> at java.net.URLClassLoader.defineClass(URLClassLoader.java:449)
>
> at java.net.URLClassLoader.access$100(URLClassLoader.java:71)
>
> at java.net.URLClassLoader$1.run(URLClassLoader.java:361)
>
> at java.net.URLClassLoader$1.run(URLClassLoader.java:355)
>
> at java.security.AccessController.doPrivileged(Native Method)
>
> at java.net.URLClassLoader.findClass(URLClassLoader.java:354)
>
> at java.lang.ClassLoader.loadClass(ClassLoader.java:425)
>
> at java.lang.ClassLoader.loadClass(ClassLoader.java:358)
>
> at
>
> sandbox.ben.cassandra.UserActivityTransformLoader.extractUserActivity(UserActivityTransformLoader.scala:121)
>
>
>
> The line of code that is failing is just:
>
> UUIDGen.getTimeUUID(timestamp).toString
>
>
>
> The java version in this environment is:
>
> $ java -version
>
> java version "1.8.0_91"
>
> Java(TM) SE Runtime Environment (build 1.8.0_91-b14)
>
> Java HotSpot(TM) 64-Bit Server VM (build 25.91-b14, mixed mode)
>
>
>
> Any advice on how to proceed here?
>
> Thanks,
>
> --Ben
>




Failing tests 2016-07-28 [cassandra-3.9]

2016-07-28 Thread Joel Knighton

It sounds like most found the plain text email that Josh sent out yesterday
palatable, so I'll do the same. As always, feedback is welcome.

testall:
  All pass! (again)

=

dtest:
  scrub_test.TestScrubIndexes.test_standalone_scrub
CASSANDRA-12337, new flaky Linux failure on a test with known failures
on Windows. I created a new ticket to track this likely distinct problem.
Stacktrace in the error message makes this look like a C* problem. If you
are familiar with the standalone scrub utilities and want to take a look at
this, assign yourself. Otherwise, I'll find an assignee tomorrow.

  cql_tracing_test.TestCqlTracing.tracing_default_impl_test
CASSANDRA-11465, known historically flaky test. Stefania Alborghetti as
assignee, Paulo Motta for review. Patches are being reviewed and tested,
forward progress.

  rebuild_test.TestRebuild.simple_rebuild_test
CASSANDRA-11687, Yuki Morishita is assignee. Paulo Motta for review,
dtest runs with the fix look clean. This should be ready to merge soon.

=

novnode_dtest:
  rebuild_test.TestRebuild.simple_rebuild_test
Same failure as in the vnode run. CASSANDRA-11687.

=

dtest_upgrade:
  Still a lot. (64, up from 53). CASSANDRA-12236 is likely responsible for
most of these, delaying further analysis of upgrade tests until it is
resolved. This ticket is patch available and review in progress by Aleksey
Yeschenko.

=

The dtest_upgrade tests had to be restarted because a machine rebooted.
Otherwise, the test environment looked stable today with no timeouts. The
unit tests in testall continue to pass; forward progress is being made on
the known dtest failures. One new flaky dtest failure today that looks like
it might be a Cassandra issue. Keep up the good work! The signal to noise
ratio is definitely improving.

Failing tests 2016-07-29 [cassandra-3.9]

2016-07-29 Thread Joel Knighton

testall:

org.apache.cassandra.cql3.validation.entities.StaticColumnsTest.testStaticColumnPurging-compression
CASSANDRA-12336. Patch available to C*. Assignee Sylvain Lebresne,
reviewer Carl Yeksigian. Test currently fails inconsistently, patch
modifies the test to fail more consistently in addition to fixing the
underlying problem.

org.apache.cassandra.io.sstable.SSTableRewriterTest.basicTest2-compression
CASSANDRA-12348. This test has been flaky for a few runs but hadn't had
a ticket opened yet. I've opened a ticket and need an assignee familiar
with compaction/sstablerewriter if anyone wants to volunteer.

dtest:
rebuild_test.TestRebuild.simple_rebuild_test
CASSANDRA-11687. Patch available. Assignee Yuki Morishita, reviewer
Paulo Motta. Several problems with this test have been fixed, and a dtest
PR was merged. This test failed in a different way after the PR and still
seems prone to race conditions.

cql_tracing_test.TestCqlTracing.tracing_unknown_impl_test
CASSANDRA-11465. Resolved and committed
as 7bd65a129c63091d6885f92afe77a41c4fc46a6f shortly after today's dtest run.

novnode_dtest:
materialized_views_test.TestMaterializedViews.view_tombstone_test
CASSANDRA-12097. Part of a set of historically flaky MV tests. Assignee
Carl Yeksigian, reviewer Joel Knighton. Patch available with fixes to only
the dtests.

dtest_upgrade:
Another bad day for dtest upgrade. 119 failures, up from 64.
CASSANDRA-12236 is still open with assignee Sylvain Lebresne and reviewer
Aleksey Yeschenko. The patch is in progress. When this ticket is addressed,
it should make triage of the remaining failures manageable.

No surprises in the testall/dtest/novnode_dtest runs today, but one new
ticket has been opened for a flaky unit test failure that slipped through
the cracks over the past couple weeks. The upgrade dtests still need to be
stabilized.

Failing tests 2016-08-01 [cassandra-3.9]

2016-08-01 Thread Joel Knighton

testall:

org.apache.cassandra.index.CustomIndexTest.customIndexRejectsExpressionSyntax-compression
CASSANDRA-12353. Patch available. Assignee Sam Tunnicliffe, needs
reviewer. Small fix where the previous fix wasn't exhaustive. Easy review
if you want to get involved!

org.apache.cassandra.db.compaction.BlacklistingCompactionsTest.testBlacklistingWithSizeTieredCompactionStrategy
CASSANDRA-12359. New flaky failure. Google suggests this test has
flaked a few times historically. Needs an assignee who is familiar with
compaction.

dtest:

materialized_views_test.TestMaterializedViews.add_write_survey_node_after_mv_test
CASSANDRA-12140. Open. Assignee Carl Yeksigian. Needs analysis and a
fix.

novnode_dtest:

materialized_views_test.TestMaterializedViews.add_dc_after_mv_simple_replication_test
CASSANDRA-12267. Open. Assigned for triage, looks like the same issue
as CASSANDRA-12140 to me.

materialized_views_test.TestMaterializedViews.add_write_survey_node_after_mv_test
CASSANDRA-12140. Same as the vnode failure above.

repair_tests.incremental_repair_test.TestIncRepair.sstable_marking_test
CASSANDRA-12264. Open. Assignee Marcus Eriksson. Problem is unknown,
still needs analysis and a fix.

dtest_upgrade:
134 failures, up from 119. CASSANDRA-12236 is still in review with the
patch being iterated upon. After that is merged, triage should become more
feasible.

TL;DR: The upgrade dtests are still the highest risk to releasing on
schedule. Failures in materialized_views_test have increased lately; it
looks like they could be the same source issue (discussed on
CASSANDRA-12164) so may be resolved together. Unit tests are looking quite
stable - at this point, we're just working through the backlog of flaky
tests. Thanks for all the hard work on getting all tests passing.

Failing tests 2016-08-02 [cassandra-3.9]

2016-08-02 Thread Joel Knighton

testall:
  All good! Nice work.

=

dtest:
  Rough day - 16 failures. All these failures are from problems creating a
connection to the CCM cluster under test. These failures are also happening
on the 2.2/3.0/3.9/trunk branches, so this is my highest priority to
resolve. I have yet to find the cause.

=

novnode_dtest
  17 failures. Same cause as the dtest failures above.

=

dtest_upgrade
  110 failures. Combination of the connection failures and the existing
upgrade test failures.

=

Today's email is brief because recapping 143 failures wouldn't do much
good. As soon as the connection issue is resolved, the emails will be back
to higher signal content.

Josh McKenzie should be back from vacation, so expect these emails to come
from him in the future.

Re: Failing tests 2016-08-03 [cassandra-3.9]

2016-08-04 Thread Joel Knighton

I can confirm that those noisy novnode_dtest failures are from the Python
driver problem I tracked down. I forgot to trigger a new CI run after that
was fixed. Today is down to one failure, which is much closer to what we
expect.

On Wed, Aug 3, 2016 at 7:45 PM, Josh McKenzie wrote:

> Summary:
>Somewhat messy day, but I believe things should look less noisy tomorrow
> so
>we can get a better view of what we have left to a stable board. Going
> to try
>a slightly new format:
>
> [Failures]
>testall:   1
>dtest: 0
>novnode_dtest: 17
>upgrade:   110
>
> [Details]
>testall (1):
>   CommitLogSegmentManagerTest.testCompressedCommitLogBackpressure
>  CASSANDRA-12283, currently assigned to blerer
>
>novnode_dtest (17):
>   Bulk: Hard to separate signal from noise due to rampant timeouts and
> nodes
>  being marked down. This is possibly due to a problem in the python
> driver
>  that was tracked down today - Joel, can you confirm?
>  An example message:
> ('Unable to complete the operation against any hosts', { 127.0.0.3
> datacenter1>: ConnectionException('Host has been marked down or
> removed',),
> : ConnectionException('Host has
> been marked down
> or removed',), :
> ConnectionException('Host has
> been marked down or removed',)})
>   cql_tests.MiscellaneousCQLTester.prepared_statement_
> invalidation_test
>  **Needs Triage**
>  New failure as of yesterday, tracked by CASSANDRA-12361
>   materialized_views_test.TestMaterializedViews.complex_repair_test
>  Assigned to cyeksigian
>  Intermittent failure
>  Last comment from Philip indicates we want to move this to
> dtest_large. Pinged Jim.
>
>upgrade (110):
>   CASSANDRA-12236 was committed today so that should help with the
> noise
>   Sylvain commented on CASSANDRA-10848 yesterday concerning some
> potential
>  problems with paging tests and whether or not we're setting the
> protocol
>  version for those tests. Should be interesting to see how things
> look
>  after the run with 12236 in.
>
> ~Josh
>




Failing tests 2016-08-05 [cassandra-3.9]

2016-08-05 Thread Joel Knighton

testall:
  All good!

=

dtest:
  bootstrap_test.TestBootstrap.local_quorum_bootstrap_test
CASSANDRA-12393. Assigned for triage. This looks like
a test issue. Failing on multiple branches.

  materialized_views_test.TestMaterializedViews
  .add_dc_after_mv_network_replication_test
CASSANDRA-12140. Known failure that periodically appears
in a variety of materialized view tests. Assignee
Carl Yeksigian, still being investigated.

=

novnode_dtest:
  bootstrap_test.TestBootstrap.local_quorum_bootstrap_test
Same as above.

  repair_tests.repair_test.TestRepair.test_multiple_concurrent_repairs
CASSANDRA-12395. The failure was due to a test refactor.
Fixed today.

=

dtest_upgrade:
  49 failures were present in today's run, which is an improvement.
  Multiple fixes are in flight; recent runs have improved due
  to the driver fix as well as the merge of CASSANDRA-12236. Several
  test and environmental problems are being investigated.

=

Unit tests are looking very good. We're working through the long
tail of flaky dtest failures and dealing with some amount of regular
churn. Upgrade tests are improving but still a problem area.

Re: Failing tests 2016-08-09 [cassandra-3.9]

2016-08-09 Thread Joel Knighton

The nose.failure.Failure.runTest was a missing import in a dtest that was
quickly fixed. We'd ideally have a way to stage dtest changes to catch
these sorts of things, but there seems to be a reasonable consensus that
this is a lower priority issue than environmental stability and other test
fixes.

On Tue, Aug 9, 2016 at 7:30 PM, Josh McKenzie wrote:

> Today's unpleasantry: upgrade tests failed to run. This makes
> failure to run to completion on 4 of the last 10 runs. We
> should probably look into stabilizing this test environment and/or
> the tests if tests are causing the full job to fail as a failed run
> is essentially the worst result we can get here.
>
> ===
> testall: No failures!
>
> ===
> dtest: Down to 1 failure
>topology_test.TestTopology.crash_during_decommission_test
>   CASSANDRA-11611
>   Removed windows label
>   Needs triage
>   Slightly flaky - failed 2 of last 24
>
> ===
> novnode: 5 failures
>2x cdc_test.TestCDC.test_cdc_data_available_in_cdc_raw
>   Addressed by CASSANDRA-11811
>   Assigned to blambov
>   Removed windows label on that ticket
>   Regression on cdc test, first failure
>nose.failure.Failure.runTest
>   Honestly, no clue what's up here.
>   Looks like it's failing to find the known_failure method in tools.py
>   ? Anyone have any ideas?
>bootstrap_test.TestBootstrap.resumable_bootstrap_test
>   Linked to: CASSANDRA-11414 (closed as resolved)
>   Single regression
>   Likely needs triage and a new ticket to track different failure
>repair_tests.repair_test.TestRepair.test_multiple_concurrent_repairs
>   Linked to: CASSANDRA-12395 (closed as resolved)
>   Different error than ticket
>   Likely needs triage and a new ticket to track different failure
>
> I'm going to be out for a few days so Joel will be taking over on the daily
> email again in my stead.
>
> ~Josh
>




Failing tests 2016-08-10 [cassandra-3.9]

2016-08-10 Thread Joel Knighton

===
testall: No failures! Nice work.

===
dtest: No failures! Nice work.

===
novnode: 1 failure

  repair_tests.repair_test.TestRepair.test_multiple_concurrent_repairs
Same failure as yesterday - this seems to be failing consistently
but no ticket has been created yet. I will follow up tomorrow.

===
upgrade: Failed to run - aborted because of environmental problems.

===

Unit tests and dtests are looking great; we've made serious progress
on this front, and I've noticed the effects on day-to-day contribution.
Upgrade tests are having significant work done and look like they
should improve dramatically over the next few days.

Failing tests 2016-08-11 [cassandra-3.9]

2016-08-12 Thread Joel Knighton

===
testall:
  org.apache.cassandra.streaming.StreamingTransferTest
  .testTransferRangeTombstones-compression
New flaky failure. CASSANDRA-12445 created and I'm looking
for an assignee tomorrow.

===
dtest:
  repair_tests.incremental_repair_test.TestIncRepair
  .sstable_marking_test
CASSANDRA-12264. We thought we had solved this; I've
reopened the issue for now.
  materialized_views_test.TestMaterializedViews
  .add_node_after_mv_test
New failure. I've opened CASSANDRA-12446 for triage.

===
novnode:
  repair_tests.repair_test.TestRepair.test_multiple_concurrent_repairs
CASSANDRA-12439. Fixed today.

===
upgrade:
  Great upgrade test run. We're down to four failures being
  addressed in CASSANDRA-12260 and CASSANDRA-12192.

===

Forward progress continues! We're still working through the
long tail of problem unit tests and dtests. Upgrade tests have
seen a dramatic improvement, largely due to CASSANDRA-12236
and CASSANDRA-12249, as well as some environmental changes.

Failing tests 2016-08-12 [cassandra-3.9]

2016-08-12 Thread Joel Knighton

===
testall: All passed!

===
dtest: All passed!

===
novnode: All passed!

===
upgrade:
  Two test failures, both attributable to CASSANDRA-12192.

===

Great runs today. There are a few outstanding flaky tests to fix, but
we're getting really close to a clean test board.

Failing tests 2016-08-15 [cassandra-3.9]

2016-08-15 Thread Joel Knighton

===
testall: 1 failure
  org.apache.cassandra.io.compress
  .CompressedRandomAccessReaderTest.testDataCorruptionDetection
New flaky failure. I've opened CASSANDRA-12465 and assigned
myself.

===
dtest: All passed!

===
novnode: All passed!

===
upgrade: 3 failures
  upgrade_tests.cql_tests
  .TestCQLNodes2RF1_Upgrade_current_3_0_x_To_indev_3_x
  .map_keys_indexing_test
CASSANDRA-12192. Tyler Hobbs as assignee. They have identified
the cause and proposed a test fix. They are also investigating a C*
change here to improve robustness.
  upgrade_tests.cql_tests
  .TestCQLNodes3RF3_Upgrade_current_3_x_To_indev_3_x
  .map_keys_indexing_test
Same as above.
  upgrade_tests.paging_test
  .TestPagingDataNodes2RF1_Upgrade_current_2_2_x_To_indev_3_x
  .static_columns_paging_test
Potentially CASSANDRA-11195, which is open with no clear
progress.  I'll follow up with those on that issue tomorrow and see if
they agree that this is the same problem.

===
Overall, the testing situation continues to look better. The massive
upgrade failures seem to have subsided, so we can continue to target
individual failures.

Since the 3.9 tests are getting to a manageable level, we should focus
on managing test failures on trunk as well. I will soon start tracking
these failures, as well as failures on the large dtest runs, which consist
of tests that have been segmented off due to increased cluster size.

In months when we're maintaining a 3.x bugfix branch as well as trunk,
is there any preference toward a single email or a separate email for
each branch?  Any other feedback is welcome, as always.

Failing tests 2016-08-16 [cassandra-3.9]

2016-08-16 Thread Joel Knighton

===
testall: 1 failure
  org.apache.cassandra.db.commitlog
  .CommitLogSegmentManagerTest
  .testCompressedCommitLogBackpressure
  CASSANDRA-12283. This issue is under investigation and it looks
  like it is understood at this time. It still needs to be fixed.

===
dtest: 1 failure
  consistency_test.TestConsistency.short_read_test
  New flaky failure. I opened CASSANDRA-12475 and assigned
  for analysis.

===
novnode: All passed!

===
upgrade: 2 failures
  upgrade_tests.repair_test.TestUpgradeRepair
  .repair_after_upgrade_test
  CASSANDRA-12469. The cause is unknown at this time.

  upgrade_tests.cql_tests
  .TestCQLNodes2RF1_Upgrade_current_3_0_x_To_indev_3_x
  .map_keys_indexing_test
  CASSANDRA-12192. The dtest fix for this issue has been
  merged, so this should be clear in the next test run.

Failing tests 2016-08-17 [cassandra-3.9]

2016-08-17 Thread Joel Knighton

===
testall: 1 failure
  org.apache.cassandra.service.RemoveTest.testBadHostId
CASSANDRA-12487. Flaky test failure for which I've opened a new
issue. This is almost certainly a test problem, as the failure
occurs in a test utility method.

===
dtest: All passed!

===
novnode: All passed!

===
upgrade: 2 failures
  upgrade_tests.cql_tests
  .TestCQLNodes3RF3_Upgrade_current_3_0_x_To_indev_3_x
  .cql3_non_compound_range_tombstones_test
  CASSANDRA-11195 again, I believe. I'll follow up on this again
  tomorrow.
  upgrade_tests.cql_tests
  .TestCQLNodes3RF3_Upgrade_current_3_0_x_To_indev_3_x
  .cql3_non_compound_range_tombstones_test
  CASSANDRA-12488. New failure, looks flaky. I don't have any
  ideas here.

Failing tests 2016-08-18/19 [cassandra-3.9]

2016-08-19 Thread Joel Knighton

Yesterday's email got delayed by technical difficulties (mine, not the
project's). I've combined the two days' worth of results into today's email.

testall, dtest, and novnode all passed.

upgrade: 2 failures
  upgrade_tests.paging_test
  .TestPagingDataNodes2RF1_Upgrade_current_2_2_x_To_indev_3_x
  .static_columns_paging_test
  CASSANDRA-11195 again, I believe. No ideas here yet.
  upgrade_tests.paging_test
  .TestPagingWithDeletionsNodes3RF3_Upgrade_current_3_x_To_indev_3_x
  .test_single_cell_deletions
  CASSANDRA-12260. I have a patch available for this that I need to
submit.

Overall, the week's test results have been pretty good. We've detected
regressions quickly and seen few new failures this week.

If anyone is looking for a testing-related task, we've seen some warnings
creep into the eclipse-warnings ant target again. Any help in cleaning
these up on any active minor bugfix branch (2.2+) would be greatly
appreciated.

Failing tests 2016-08-22 [cassandra-3.9]

2016-08-22 Thread Joel Knighton

===
testall: All passed!

===
dtest: 2 failures
  upgrade_internal_auth_test.TestAuthUpgrade.upgrade_to_30_test
Looks like a new, flaky failure. I'll follow up on this and get a ticket
created tomorrow.

  materialized_views_test.TestMaterializedViews
  .add_dc_after_mv_network_replication_test
CASSANDRA-12140. Known issue, still needs to be solved.

===
novnode: 6 failures
  6 failures in cql_tests.SlowQueryTester. This was a test regression
  quickly fixed in CASSANDRA-12514.

===
upgrade: 1 failure
  upgrade_tests.cql_tests
  .TestCQLNodes2RF1_Upgrade_current_2_1_x_To_indev_3_x
  .bug_5732_test
  CASSANDRA-12457. A fix is in development.

Failing tests 2016-08-23 [cassandra-3.9]

2016-08-23 Thread Joel Knighton

===
testall: All passed!

===
dtest: 1 failure
  materialized_views_test.TestMaterializedViews
  .add_dc_after_mv_network_replication_test
CASSANDRA-12140. Known issue, still needs to be solved.

===
novnode: All passed!

===
upgrade: 1 failure
  upgrade_tests.paging_test
  .TestPagingDataNodes2RF1_Upgrade_current_2_2_x_To_indev_3_x
  .static_columns_paging_test
  CASSANDRA-11195. This issue still needs to be analyzed and
  fixed.


Overall, today looked very good. We're seeing a fairly static long tail
of challenging issues that are still in progress. I opened
CASSANDRA-12528 to fix the outstanding eclipse-warning
problems that are presently failing testall jobs on 2.2, 3.0, 3.9, and
trunk. If you are interested, feel free to assign the issue to yourself.

Failing tests 2016-08-24 [cassandra-3.9]

2016-08-24 Thread Joel Knighton

===
testall: All passed!

===
dtest: 2 failures
  scrub_test.TestScrubIndexes.test_standalone_scrub
CASSANDRA-12337. I've root-caused this; the failure is cosmetic
but user-facing, so I plan on fixing this soon.

  commitlog_test.TestCommitLog.test_commitlog_replay_on_startup
CASSANDRA-12213. This is still being analyzed.

===
novnode: All passed!

===
upgrade: All passed!

While it is somewhat due to the stars aligning such that our flaky tests
all didn't fail this run, it is very exciting to see an upgrade test run
with 0 failures. This is 50+ fewer failures than two weeks ago.

Failing tests 2016-08-25 [cassandra-3.9]

2016-08-25 Thread Joel Knighton

The testall, novnode, and upgrade jobs all look good.

dtest: 1 failure
  paging_test.TestPagingDatasetChanges
  .test_cell_TTL_expiry_during_paging
This is a new flaky failure. It looks like the test
didn't correctly wait for schema agreement. I'll
make sure an issue is created for this tomorrow.

Failing tests 2016-08-26 [cassandra-3.9]

2016-08-26 Thread Joel Knighton

No tests were run today since no commits were made to the 3.9 branch. 3.9
is looking stable and very close to being ready for release; only a few
outstanding flaky test failures remain.

Starting next week, I will focus on including trunk test failures in these
digests, while also including 3.9 failures when the tests are run.

Failing tests 2016-08-29

2016-08-29 Thread Joel Knighton

3.9 hasn't had any new runs. Today's results are for trunk.

===
testall: All passed!

===
dtest: Only one meta-test failure. This has been resolved.

===
novnode: Same as dtests above.

===
upgrade: 1 failure
  upgrade_tests.cql_tests
  .TestCQLNodes2RF1_Upgrade_current_2_1_x_To_indev_3_x
  .bug_5732_test
  CASSANDRA-12457. This issue is still in progress.

From looking at 3.0.x and trunk more lately, it looks like the test fixing
efforts toward 3.9 have had a positive effect across all branches.

Failing tests 2016-08-30

2016-08-30 Thread Joel Knighton

Today's results are again for trunk - no new 3.9 runs have occurred.

Dtest and testall runs passed. The novnode dtest run failed due to an
environmental instability and has been restarted. This specific
environmental instability on harvesting artifacts should be somewhat
mitigated by a recent change to make artifact collection more robust
through retries.
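
To make the retry idea concrete, here is a minimal sketch of the general
pattern in Python. It is not the actual change made to the CI jobs; the
retry helper, its parameters, and the simulated flaky_fetch operation are
all assumptions for illustration.

```python
import time


def retry(operation, attempts=3, delay_seconds=0.0):
    """Call `operation` until it succeeds or `attempts` calls have failed."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for attempt in range(attempts):
        try:
            return operation()
        except OSError as error:
            # Only I/O-style errors are treated as transient in this sketch.
            last_error = error
            if attempt < attempts - 1:
                time.sleep(delay_seconds)
    raise last_error


# Simulated artifact fetch that fails twice before succeeding.
calls = {"count": 0}


def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise OSError("transient failure")
    return "artifacts fetched"


print(retry(flaky_fetch, attempts=5))
```

Retries like this hide transient failures but not persistent ones: once
the attempts are exhausted, the last error is raised to the caller.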

upgrade: 1 failure
  upgrade_tests.paging_test
  .TestPagingDataNodes2RF1_Upgrade_current_2_2_x_To_indev_3_x
  .static_columns_paging_test
  CASSANDRA-11195. This is a known failure and still needs to be
  understood and solved.


Since the inception of these test results emails, the results they have
reported are the results of tests run on the CassCI Jenkins instance. Work
is underway to stabilize testing on the ASF Jenkins available at
builds.apache.org. The most recent trunk testall run on ASF infrastructure
experienced 27 failures. Two significant failure sources appear to be
issues with Cassandra inconsistently being unable to bind certain ports and
test timeouts when the Jenkins instances are under significant resource
contention. A variety of other failures regularly occur as well.
Stabilization of unit tests on ASF CI infrastructure needs more work.
Individual issues have not yet been created.
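
For the port binding failures, one quick check on a build machine is whether
a given port can be bound at all at that moment. The sketch below is a
generic Python probe, not something taken from the Cassandra test harness;
the can_bind name and the default loopback address are assumptions, and 7000
in the usage line is only an example port number.

```python
import socket


def can_bind(port, host="127.0.0.1"):
    """Return True if a TCP socket can currently be bound to host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
        try:
            probe.bind((host, port))
        except OSError:
            # Typically "address already in use" or a permissions problem.
            return False
        return True


print("port 7000 free:", can_bind(7000))
```

The probe releases the port as soon as it returns, so another process can
still grab it before a test does. It is useful for spotting leftover
processes between runs, not as a guarantee.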

Failing tests 2016-08-31

2016-08-31 Thread Joel Knighton

cassandra-3.9
===
testall: All passed!

===
dtest: 2 failures
  repair_tests.repair_test.TestRepair.nonexistent_table_repair_test
CASSANDRA-12578. New failure, looks like a test problem.
  cql_tracing_test.TestCqlTracing.tracing_default_impl_test
CASSANDRA-12579. New failure, looks like a test problem.

===
novnode: 2 failures - same as vnode dtests above

===
upgrade: 1 failure
  upgrade_tests.paging_test
  .TestPagingDataNodes2RF1_Upgrade_current_2_2_x_To_indev_3_x
  .static_columns_paging_test
  CASSANDRA-11195. Work is under way to understand this failure.


trunk
===
testall: All passed!

===
dtest: All passed!

===
novnode: 1 failure
  cdc_test.TestCDC.test_cdc_data_available_in_cdc_raw
CASSANDRA-11811.  Known issue under investigation.

===
upgrade:
  Failed due to environmental problems that are under investigation.


ASF Infra
  No significant updates. 13 failures in today's trunk testall run, all due
  to timeouts or port binding issues.

Failing tests 2016-09-01

2016-09-01 Thread Joel Knighton

cassandra-3.9 - No new runs


trunk
===
testall: All passed!

===
dtest: Failed, likely due to test environment/configuration issues.

===
novnode: 1 failure
  cdc_test.TestCDC.test_cdc_data_available_in_cdc_raw
CASSANDRA-11811.  Known issue under investigation.

===
upgrade: 1 failure
  Failure due to a new test committed with problems.
  (bootstrap_upgrade_test.py, to be specific)

ASF Infra
  No significant updates. 30 failures in today's trunk testall run, all due
  to timeouts or port binding issues.

Failing tests 2016-09-02

2016-09-02 Thread Joel Knighton

cassandra-3.9 - No new runs


trunk
===
testall: All passed!

===
dtest: 1 failure
  cqlsh_tests.cqlsh_tests.TestCqlsh.test_pep8_compliance
CASSANDRA-12599. Test problem due to changes in how
dependencies are installed. This is already fixed.

===
novnode: 1 failure
  Same as the dtest run above.

===
upgrade: 1 failure
  Failure due to a new test committed with problems.
  (bootstrap_upgrade_test.py, to be specific). This has been fixed since
  the latest run.

ASF Infra
  Today's trunk testall run passed. It looks like all the passing runs we've
  had occur on the ubuntu-1 build machine. It may be worth looking into
  how this machine differs from others to understand what kinds of
  resource starvation are causing timeouts on other build machines.

Failing tests 2016-09-12

2016-09-12 Thread Joel Knighton

cassandra-3.9
===
testall: 1 failure
  org.apache.cassandra.db.lifecycle.LogTransactionTest
  .testUnparsableFirstRecord-compression
  CASSANDRA-12632. New failure, needs triage. I've never seen this
  fail on any branch.

===
dtest && novnode:
  Failed to fetch test results due to environmental issues.

===
upgrade: 3 failures
  upgrade_tests.storage_engine_upgrade_test
  .TestBootstrapAfterUpgrade
  .upgrade_with_range_tombstone_eoc_0_test
  New failure, yet to be triaged.

  upgrade_tests.storage_engine_upgrade_test
  .TestStorageEngineUpgrade
  .upgrade_with_range_tombstone_eoc_0_test
  New failure, yet to be triaged. Looks very similar to the failure
  above.

  upgrade_tests.paging_test
  .TestPagingWithDeletionsNodes2RF1_Upgrade_current_2_1_x_To_indev_3_x
  .test_failure_threshold_deletions
  New failure, yet to be triaged. Looks like it is due to a mistake
  in recent revisions to the test.


trunk
===
testall: 10 failures
  All timeouts in a string of tests that most frequently time out when the
  system has degraded I/O performance.

===
dtest: All passed!

===
novnode: 2 failures
  repair_tests.repair_test.TestRepairDataSystemTable
  .repair_parent_table_test
  New failure. It looks like stress failed to create a keyspace.

  user_types_test.TestUserTypes.test_type_as_part_of_pkey
  New failure. It looks like it was broken by CASSANDRA-11031 and
  a PR fixing the test is in progress.

===
upgrade: 1 failure
  upgrade_tests.paging_test
  .TestPagingDataNodes2RF1_Upgrade_current_2_2_x_To_indev_3_x
  .static_columns_paging_test
  CASSANDRA-11195. The fix has been committed since this run.


ASF Infra
  No significant updates. I think there's been discussion of somehow
  contributing Cassandra specific resources to the ASF CI pool, but
  I don't know much about this. If anyone does, feel free to reply!

Failing tests 2016-09-13

2016-09-13 Thread Joel Knighton

cassandra-3.9
===
testall: No new runs

===
dtest: 6 failures
  thrift_tests.TestMutations.test_range_tombstone_eoc_0

  repair_tests.repair_test.TestRepair.nonexistent_table_repair_test

  replace_address_test.TestReplaceAddress
  .insert_data_during_replace_different_address_test

  replace_address_test.TestReplaceAddress
  .insert_data_during_replace_same_address_test

  cql_tracing_test.TestCqlTracing.tracing_default_impl_test
  These 5 tests needed to be version-gated so that they do not run on 3.9. Fixed.

  cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest
  .test_writing_with_token_boundaries
  CASSANDRA-12430. Flaky failure.

===
novnode: 5 failures
  The same 5 version-gated test failures as above.

===
upgrade: No new runs


trunk
===
testall: No new runs

===
dtest: 2 failures
  user_types_test.TestUserTypes.test_type_as_part_of_pkey
  CASSANDRA-12639. The test needs to be updated to reflect a
  change in the error message returned.

  cdc_test.TestCDC.test_cdc_data_available_in_cdc_raw
  CASSANDRA-11811. This failure is not yet understood.

===
novnode: No new runs

===
upgrade: All passed!

Failing tests 2016-09-14

2016-09-14 Thread Joel Knighton

cassandra-3.9
===
testall: 8 failures
  org.apache.cassandra.cql3.ViewFilteringTest
  .testPartitionKeyAndClusteringKeyFilteringRestrictions

  org.apache.cassandra.cql3.ViewFilteringTest
  .testMVCreationSelectRestrictions

  org.apache.cassandra.cql3.ViewTest.testCompoundPartitionKey

  org.apache.cassandra.cql3.validation.entities.UFTest.testEmptyString

  org.apache.cassandra.cql3.validation.operations.AggregationTest
  .testFunctionsWithCompactStorage

  org.apache.cassandra.cql3.validation.operations.SelectTest
  .testAllowFiltering
  These six test failures are due to environmental timeouts.

  org.apache.cassandra.db.compaction
  .TimeWindowCompactionStrategyTest
  .testDropExpiredSSTables-compression
  New flaky failure. CASSANDRA-12645 opened.

  org.apache.cassandra.service.RemoveTest.testBadHostId
  CASSANDRA-12487. Flaky failure in a test utility setup method.

===
dtest: 1 failure
  user_types_test.TestUserTypes.test_type_as_part_of_pkey
  Should have been fixed as part of CASSANDRA-11031. The version
  gating is still incorrect; I'll follow up and get this fixed tomorrow.

===
novnode: 4 failures
  user_types_test.TestUserTypes.test_type_as_part_of_pkey
  Same as above.

  cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest
  .test_bulk_round_trip_with_single_core
  New failure - looks like a schema agreement problem. A JIRA
  hasn't been created yet.

  cqlsh_tests.cqlsh_copy_tests.CqlshCopyTest
  .test_reading_max_insert_errors
  New failure - looks like Netty detected a leak. A JIRA hasn't been
  created yet.

  batch_test.TestBatch.logged_batch_doesnt_throw_uae_test
  CASSANDRA-12383. Flaky failure.

===
upgrade: 1 failure
  upgrade_tests.cql_tests
  .TestCQLNodes3RF3_Upgrade_current_2_1_x_To_indev_3_x
  .bug_5732_test
  CASSANDRA-12457. Patch available that needs a reviewer.


Since there are a few open opportunities based on 3.9 failures, I'm only
covering 3.9 in today's email.

Failing tests 2016-09-15

2016-09-15 Thread Joel Knighton

cassandra-3.9: No new runs


trunk
===
testall: 6 failures
  org.apache.cassandra.cql3.KeyCacheCqlTest
  .test2iKeyCachePathsShallowIndexEntry

  org.apache.cassandra.cql3.KeyCacheCqlTest
  .test2iKeyCachePathsShallowIndexEntry-compression
  CASSANDRA-12650 for the two failures above. New flaky failure.

  org.apache.cassandra.cql3.validation.entities.SecondaryIndexTest
  .testAllowFilteringOnPartitionKeyWithSecondaryIndex

  org.apache.cassandra.cql3.validation.entities.SecondaryIndexTest
  .testAllowFilteringOnPartitionKeyWithSecondaryIndex-compression
  CASSANDRA-12651 for the two failures above. New flaky failure.

  org.apache.cassandra.index.sasi.SASIIndexTest
  .testMultiExpressionQueriesWhereRowSplitBetweenSSTables
  Looks like an environmental problem where the forked JVM exited.
  I'm holding off on creating a JIRA for now.

  org.apache.cassandra.index.sasi.SASIIndexTest
  .testStaticIndex-compression
  CASSANDRA-12652. New flaky failure.

===
dtest: 4 failures
  cdc_test.TestCDC.test_cdc_data_available_in_cdc_raw
  CASSANDRA-11811. Known flaky failure.

  materialized_views_test.TestMaterializedViews
  .add_node_after_mv_test
  CASSANDRA-12140. Known flaky failure.

  materialized_views_test.TestMaterializedViews
  .really_complex_repair_test
  CASSANDRA-12475. Known flaky failure.

  snitch_test.TestGossipingPropertyFileSnitch
  .test_prefer_local_reconnect_on_listen_address
  A typo fix was committed to trunk without updating the test
  that looks for the log message.

===
novnode: 6 failures
  paging_test.TestPagingData
  .test_paging_with_filtering_on_partition_key

  paging_test.TestPagingData
  .test_paging_with_filtering_on_partition_key_on_clustering_columns

  paging_test.TestPagingData
  .test_paging_with_filtering_on_partition_key_on_clustering_columns_with_contains

  paging_test.TestPagingData
  .test_paging_with_filtering_on_partition_key_on_counter_columns
  Four new failures; bisect suggests they are due to
  CASSANDRA-11031. Only failed on novnode. I've asked Alex
  Petrov to take a look. No JIRA yet.

  snitch_test.TestGossipingPropertyFileSnitch
  .test_prefer_local_reconnect_on_listen_address
  Same as the vnode failure above.

  replication_test.SnitchConfigurationUpdateTest
  .test_rf_collapse_property_file_snitch
  New flaky failure. No JIRA created yet.

===
upgrade: All passed!