spark git commit: [SPARK-9281] [SQL] use decimal or double when parsing SQL

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 6309b9346 -> 15667a0af [SPARK-9281] [SQL] use decimal or double when parsing SQL Right now, we use double to parse all floating-point numbers in SQL. When such a number is used in an expression together with DecimalType, it will turn the decimal into double a
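
The precision concern described in this entry can be illustrated outside Spark. The following is a minimal sketch using Python's standard `decimal` module, not Spark's SQL parser; the variable names are chosen for illustration only. It shows how the same literals behave when parsed as binary doubles versus as decimals.

```python
from decimal import Decimal

# The same textual literals, parsed two different ways.
a_text, b_text = "0.1", "0.2"

# Parsed as binary doubles: neither value is exactly representable.
double_sum = float(a_text) + float(b_text)

# Parsed as decimals: the values are stored exactly as written.
decimal_sum = Decimal(a_text) + Decimal(b_text)

print(double_sum == 0.3)              # False: rounding error from binary floating point
print(decimal_sum == Decimal("0.3"))  # True: decimal arithmetic stays exact here
```

Once a decimal value is converted to a double, an exact comparison like the second one can no longer be relied on.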

spark git commit: [SPARK-9398] [SQL] Datetime cleanup

2015-07-28 Thread davies
Repository: spark Updated Branches: refs/heads/master ea49705bd -> 6309b9346 [SPARK-9398] [SQL] Datetime cleanup JIRA: https://issues.apache.org/jira/browse/SPARK-9398 Author: Yijie Shen Closes #7725 from yjshen/date_null_check and squashes the following commits: b4eade1 [Yijie Shen] inlin

spark git commit: [SPARK-9419] ShuffleMemoryManager and MemoryStore should track memory on a per-task, not per-thread, basis

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 429b2f0df -> ea49705bd [SPARK-9419] ShuffleMemoryManager and MemoryStore should track memory on a per-task, not per-thread, basis Spark's ShuffleMemoryManager and MemoryStore track memory on a per-thread basis, which causes problems in th

spark git commit: [SPARK-8608][SPARK-8609][SPARK-9083][SQL] reset mutable states of nondeterministic expression before evaluation and fix PullOutNondeterministic

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 3744b7fd4 -> 429b2f0df [SPARK-8608][SPARK-8609][SPARK-9083][SQL] reset mutable states of nondeterministic expression before evaluation and fix PullOutNondeterministic We will do local projection for LocalRelation, and thus reuse the same

spark git commit: [SPARK-9422] [SQL] Remove the placeholder attributes used in the aggregation buffers

2015-07-28 Thread yhuai
Repository: spark Updated Branches: refs/heads/master e78ec1a8f -> 3744b7fd4 [SPARK-9422] [SQL] Remove the placeholder attributes used in the aggregation buffers https://issues.apache.org/jira/browse/SPARK-9422 Author: Yin Huai Closes #7737 from yhuai/removePlaceHolder and squashes the fol

spark git commit: [SPARK-9421] Fix null-handling bugs in UnsafeRow.getDouble, getFloat(), and get(ordinal, dataType)

2015-07-28 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 6662ee212 -> e78ec1a8f [SPARK-9421] Fix null-handling bugs in UnsafeRow.getDouble, getFloat(), and get(ordinal, dataType) UnsafeRow.getDouble and getFloat() return NaN when called on columns that are null, which is inconsistent with the b

spark git commit: [SPARK-9418][SQL] Use sort-merge join as the default shuffle join.

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master b7f54119f -> 6662ee212 [SPARK-9418][SQL] Use sort-merge join as the default shuffle join. Sort-merge join is more robust in Spark since sorting can be done using the Tungsten sort operator. Author: Reynold Xin Closes #7733 from rxin/smj

spark git commit: [SPARK-9420][SQL] Move expressions in sql/core package to catalyst.

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master c5ed36953 -> b7f54119f [SPARK-9420][SQL] Move expressions in sql/core package to catalyst. Since catalyst package already depends on Spark core, we can move those expressions into catalyst, and simplify function registry. This is a follow

spark git commit: [STREAMING] [HOTFIX] Ignore ReceiverTrackerSuite flaky test

2015-07-28 Thread tdas
Repository: spark Updated Branches: refs/heads/master 59b92add7 -> c5ed36953 [STREAMING] [HOTFIX] Ignore ReceiverTrackerSuite flaky test Author: Tathagata Das Closes #7738 from tdas/ReceiverTrackerSuite-hotfix and squashes the following commits: 00f0ee1 [Tathagata Das] ignore flaky test

spark git commit: [SPARK-9393] [SQL] Fix several error-handling bugs in ScriptTransform operator

2015-07-28 Thread joshrosen
Repository: spark Updated Branches: refs/heads/master 21825529e -> 59b92add7 [SPARK-9393] [SQL] Fix several error-handling bugs in ScriptTransform operator SparkSQL's ScriptTransform operator has several serious bugs which make debugging fairly difficult: - If exceptions are thrown in the wr

spark git commit: [SPARK-9247] [SQL] Use BytesToBytesMap for broadcast join

2015-07-28 Thread davies
Repository: spark Updated Branches: refs/heads/master 198d181df -> 21825529e [SPARK-9247] [SQL] Use BytesToBytesMap for broadcast join This PR introduces BytesToBytesMap to UnsafeHashedRelation and uses it in the executor for better performance. It serializes all the keys and values from the Java HashMap,

spark git commit: [SPARK-7105] [PYSPARK] [MLLIB] Support model save/load in GMM

2015-07-28 Thread meng
Repository: spark Updated Branches: refs/heads/master b88b868eb -> 198d181df [SPARK-7105] [PYSPARK] [MLLIB] Support model save/load in GMM This PR introduces save / load for GMMs in the Python API. I also refactored `GaussianMixtureModel` and inherited it from `JavaModelWrapper` with model bein

spark git commit: [SPARK-8003][SQL] Added virtual column support to Spark

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 8d5bb5283 -> b88b868eb [SPARK-8003][SQL] Added virtual column support to Spark Added virtual column support by adding a new resolution rule to the query analyzer. Additional virtual columns can be added by adding case expressions to [the

spark git commit: [SPARK-9391] [ML] Support minus, dot, and intercept operators in SparkR RFormula

2015-07-28 Thread meng
Repository: spark Updated Branches: refs/heads/master 6cdcc21fe -> 8d5bb5283 [SPARK-9391] [ML] Support minus, dot, and intercept operators in SparkR RFormula Adds '.', '-', and intercept parsing to RFormula. Also splits RFormulaParser into a separate file. Umbrella design doc here: https://

spark git commit: [SPARK-9196] [SQL] Ignore test DatetimeExpressionsSuite: function current_timestamp.

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 31ec6a871 -> 6cdcc21fe [SPARK-9196] [SQL] Ignore test DatetimeExpressionsSuite: function current_timestamp. This test is flaky. https://issues.apache.org/jira/browse/SPARK-9196 will track the fix of it. For now, let's disable this test.

spark git commit: [SPARK-9327] [DOCS] Fix documentation about classpath config options.

2015-07-28 Thread vanzin
Repository: spark Updated Branches: refs/heads/master 614323406 -> 31ec6a871 [SPARK-9327] [DOCS] Fix documentation about classpath config options. Author: Marcelo Vanzin Closes #7651 from vanzin/SPARK-9327 and squashes the following commits: 2923e23 [Marcelo Vanzin] [SPARK-9327] [docs] Fix

spark git commit: Use vector-friendly comparison for packages argument.

2015-07-28 Thread shivaram
Repository: spark Updated Branches: refs/heads/branch-1.4 c103c99d2 -> 8dfdca46d Use vector-friendly comparison for packages argument. Otherwise, `sparkR.init()` with multiple `sparkPackages` results in this warning: ``` Warning message: In if (packages != "") { : the condition has length

spark git commit: Use vector-friendly comparison for packages argument.

2015-07-28 Thread shivaram
Repository: spark Updated Branches: refs/heads/master 35ef853b3 -> 614323406 Use vector-friendly comparison for packages argument. Otherwise, `sparkR.init()` with multiple `sparkPackages` results in this warning: ``` Warning message: In if (packages != "") { : the condition has length > 1

spark git commit: [SPARK-9397] DataFrame should provide an API to find source data files if applicable

2015-07-28 Thread marmbrus
Repository: spark Updated Branches: refs/heads/master 9bbe0171c -> 35ef853b3 [SPARK-9397] DataFrame should provide an API to find source data files if applicable Certain applications would benefit from being able to inspect DataFrames that are straightforwardly produced by data sources that

spark git commit: [SPARK-8196][SQL] Fix null handling & documentation for next_day.

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master c740bed17 -> 9bbe0171c [SPARK-8196][SQL] Fix null handling & documentation for next_day. The original patch didn't handle nulls correctly for next_day. Author: Reynold Xin Closes #7718 from rxin/next_day and squashes the following commit

spark git commit: [SPARK-9373][SQL] follow up for StructType support in Tungsten projection.

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 5a2330e54 -> c740bed17 [SPARK-9373][SQL] follow up for StructType support in Tungsten projection. Author: Reynold Xin Closes #7720 from rxin/struct-followup and squashes the following commits: d9757f5 [Reynold Xin] [SPARK-9373][SQL] foll

spark git commit: [SPARK-9402][SQL] Remove CodegenFallback from Abs / FormatNumber.

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master 4af622c85 -> 5a2330e54 [SPARK-9402][SQL] Remove CodegenFallback from Abs / FormatNumber. Both expressions already implement code generation. Author: Reynold Xin Closes #7723 from rxin/abs-formatnum and squashes the following commits: 31

spark git commit: [SPARK-8919] [DOCUMENTATION, MLLIB] Added @since tags to mllib.recommendation

2015-07-28 Thread meng
Repository: spark Updated Branches: refs/heads/master ac8c549e2 -> 4af622c85 [SPARK-8919] [DOCUMENTATION, MLLIB] Added @since tags to mllib.recommendation Author: vinodkc Closes #7325 from vinodkc/add_since_mllib.recommendation and squashes the following commits: 93156f2 [vinodkc] Changed

spark git commit: [EC2] Cosmetic fix for usage of spark-ec2 --ebs-vol-num option

2015-07-28 Thread srowen
Repository: spark Updated Branches: refs/heads/master 15724fac5 -> ac8c549e2 [EC2] Cosmetic fix for usage of spark-ec2 --ebs-vol-num option The last line of the usage seems ugly. ``` $ spark-ec2 --help --ebs-vol-num=EBS_VOL_NUM Number of EBS volumes to attach to eac

spark git commit: [SPARK-9394][SQL] Handle parentheses in CodeFormatter.

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master fc3bd96bc -> 15724fac5 [SPARK-9394][SQL] Handle parentheses in CodeFormatter. Our CodeFormatter currently does not handle parentheses, and as a result in code dump, we see code formatted this way: ``` foo( a, b, c) ``` With this patch, i
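
The idea of indenting by parenthesis depth as well as brace depth can be sketched as follows. This is a hypothetical illustration in Python, not Spark's CodeFormatter (which is not shown in this entry); the function name `format_code` and the two-space indent are assumptions.

```python
def format_code(code: str, indent: str = "  ") -> str:
    """Re-indent code line by line, nesting on both braces and parentheses."""
    depth = 0
    out = []
    for raw in code.splitlines():
        line = raw.strip()
        if not line:
            out.append("")
            continue
        opens = sum(line.count(c) for c in "({")
        closes = sum(line.count(c) for c in ")}")
        # A line that begins with a closer is printed one level shallower.
        lead = 1 if line[0] in ")}" else 0
        out.append(indent * max(depth - lead, 0) + line)
        # Net openers on this line determine the depth of the next line.
        depth = max(depth + opens - closes, 0)
    return "\n".join(out)


formatted = format_code("foo(\na,\nb,\nc)")
print(formatted)
```

The sketch counts brackets naively, so brackets inside string literals or comments would be miscounted; a real formatter would need to skip those.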

spark git commit: Closes #6836 since Round has already been implemented.

2015-07-28 Thread rxin
Repository: spark Updated Branches: refs/heads/master d93ab93d6 -> fc3bd96bc Closes #6836 since Round has already been implemented. Project: http://git-wip-us.apache.org/repos/asf/spark/repo Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/fc3bd96b Tree: http://git-wip-us.apache.or