[jira] [Comment Edited] (SPARK-3947) Support Scala/Java UDAF

Milad Bourhani (JIRA) Fri, 20 Nov 2015 00:39:27 -0800

    [ https://issues.apache.org/jira/browse/SPARK-3947?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15013886#comment-15013886 ]


Milad Bourhani edited comment on SPARK-3947 at 11/20/15 8:38 AM:
-----------------------------------------------------------------

I managed to run your code against both 1.5.2 and 1.5.0:
- on 1.5.2 it works fine
- on 1.5.0 the result differs: when run on a cluster, the following is computed:
{noformat}
+-------------+------------+----+--------------------+--------------------+
|store_country|store_region| _c2|                 _c3|                 _c4|
+-------------+------------+----+--------------------+--------------------+
|        italy|      emilia|87.5|175.0000000000000...|6.18028512604618E...|
|        italy|     toscana|50.5|50.50000000000000...|                50.5|
|        italy|      puglia|  70|70.00000000000000...|                70.0|
+-------------+------------+----+--------------------+--------------------+
{noformat}

*EDIT* -- After running the example on 1.5.2 multiple times, the error 
*sometimes* showed up there too. Every time this happened, there was an error 
in the Worker's log identical to the one in SPARK-9844, so it looks like that 
logged error makes the cluster computation fail (and the result changes, too):
{noformat}
+-------------+------------+----+--------------------+--------------------+
|store_country|store_region| _c2|                 _c3|                 _c4|
+-------------+------------+----+--------------------+--------------------+
|        italy|      emilia|87.5|175.0000000000000...|7.136378562382805...|
|        italy|     toscana|50.5|50.50000000000000...|                50.5|
|        italy|      puglia|  70|70.00000000000000...|                70.0|
+-------------+------------+----+--------------------+--------------------+
{noformat}

Note also that my {{conf/spark-env.sh}} contains this single line:
{code}
export SPARK_WORKER_INSTANCES=1
{code}
so, with only one worker instance configured, I find it strange that a race 
condition occurs (SPARK-9844 talks about race conditions).
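
For context on why a cluster run can differ from a local one at all: on a cluster, a UDAF computes partial aggregates per partition and then merges them, so the merge path is exercised. Below is a minimal plain-Java sketch of that initialize/update/merge/evaluate contract. It is not Spark's API and does not reproduce the failure above; the class {{MeanAggSketch}}, its methods, and the input values are all hypothetical and chosen only for illustration.

```java
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; this is NOT Spark's UserDefinedAggregateFunction API.
// All names and input values here are hypothetical.
final class MeanAggSketch {

    // Aggregation buffer: running sum and row count for one group.
    static final class Buffer {
        double sum;
        long count;
    }

    // initialize: every partial buffer must start from a known state.
    static Buffer initialize() {
        Buffer b = new Buffer();
        b.sum = 0.0;
        b.count = 0L;
        return b;
    }

    // update: fold one input value into a partial buffer.
    static void update(Buffer b, double value) {
        b.sum += value;
        b.count += 1;
    }

    // merge: combine a partial buffer from another partition into this one.
    static void merge(Buffer into, Buffer other) {
        into.sum += other.sum;
        into.count += other.count;
    }

    // evaluate: produce the final mean for the group (NaN if there were no rows).
    static double evaluate(Buffer b) {
        return b.count == 0 ? Double.NaN : b.sum / b.count;
    }

    // Aggregate one group whose rows are split across partitions:
    // one partial buffer per partition, then merge, then evaluate.
    static double aggregate(List<List<Double>> partitions) {
        Buffer result = initialize();
        for (List<Double> partition : partitions) {
            Buffer partial = initialize();
            for (double v : partition) {
                update(partial, v);
            }
            merge(result, partial);
        }
        return evaluate(result);
    }

    public static void main(String[] args) {
        // Made-up inputs: the same two rows, in one partition and in two.
        List<List<Double>> onePartition =
                Arrays.asList(Arrays.asList(80.0, 95.0));
        List<List<Double>> twoPartitions =
                Arrays.asList(Arrays.asList(80.0), Arrays.asList(95.0));
        System.out.println(aggregate(onePartition));   // prints 87.5
        System.out.println(aggregate(twoPartitions));  // prints 87.5
    }
}
```

With a correct initialize and merge, the result does not depend on how the rows are partitioned, which is the property the tables above appear to violate for the {{emilia}} group.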



> Support Scala/Java UDAF
> -----------------------
>
>                 Key: SPARK-3947
>                 URL: https://issues.apache.org/jira/browse/SPARK-3947
>             Project: Spark
>          Issue Type: Sub-task
>          Components: SQL
>            Reporter: Pei-Lun Lee
>            Assignee: Yin Huai
>             Fix For: 1.5.0
>
>         Attachments: spark-udaf.zip
>
>
> Right now only Hive UDAFs are supported. It would be nice to have UDAF 
> similar to UDF through SQLContext.registerFunction.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
