nabilkazi27 opened a new issue, #2779:
URL: https://github.com/apache/sedona/issues/2779

   When running Apache Sedona in **AWS Glue 5.0** with **Fine-Grained Access 
Control (FGAC)** enabled (via the `--enable-lakeformation-fine-grained-access` 
parameter), `SedonaContext.create` fails with 
`IllegalArgumentException: Cannot parse synthetic types.`
   The error appears to be caused by Lake Formation's strict validation engine, 
which identifies Sedona’s User-Defined Types (UDTs), such as `GeometryUDT`, as 
"synthetic types" that are not natively supported by the Glue Data Catalog's 
restricted execution profile. This prevents the `SedonaContext` from being 
created, even with the recommended `spark-shaded` and `geotools-wrapper` jars.
   
   ## Steps to Reproduce
   
   1. Set up an AWS Glue 5.0 job.
   2. Provide the following job parameters:
      - `--enable-lakeformation-fine-grained-access: true`
      - `--extra-jars: s3://path/to/sedona-spark-shaded-3.5_2.12-1.5.1.jar,s3://path/to/geotools-wrapper-1.5.1-28.2.jar`
   3. Attempt to initialize Sedona in the script:
   ```python
   from sedona.spark.SedonaContext import SedonaContext
   from pyspark.sql import SparkSession
   
   
   spark = SparkSession.builder.getOrCreate()
   sedona = SedonaContext.create(spark) # Failure occurs here
   ```
   4. Observe the `Cannot parse synthetic types.` error in the driver logs.
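
The job parameters from step 2 can be kept in one place as a plain mapping, for example when the job is defined programmatically. This is a minimal sketch: `build_job_arguments` is a hypothetical helper (not part of Glue or Sedona), and the S3 paths are the placeholders from step 2.

```python
def build_job_arguments(extra_jars):
    """Return the Glue job parameters used in this report.

    Hypothetical helper for illustration; not part of Glue or Sedona.
    """
    return {
        # Enables Lake Formation fine-grained access control for the job.
        "--enable-lakeformation-fine-grained-access": "true",
        # The jar paths are passed as a single comma-separated string.
        "--extra-jars": ",".join(extra_jars),
    }


job_arguments = build_job_arguments([
    "s3://path/to/sedona-spark-shaded-3.5_2.12-1.5.1.jar",
    "s3://path/to/geotools-wrapper-1.5.1-28.2.jar",
])
```

A mapping of this shape is what a Glue job definition takes as its default arguments, so the same dictionary can be reused when toggling FGAC on and off to compare runs.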
   
   ## Environment
   
   - **Sedona Version**: 1.8.1
   - **Spark Version**: 3.5 (Glue 5.0)
   - **Deployment**: AWS Glue
   - **Connector**: `spark-shaded` and `geotools-wrapper` jars.
   
   ## Expected Behavior
   `SedonaContext` should initialize successfully even when Lake Formation FGAC 
is active, or there should be a supported method to register Sedona UDTs so 
they are not flagged as "synthetic" by the AWS security layer.
   
   ## Actual Behavior
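   The job fails during context creation. The driver log first reports a 
`SparkIllegalConfigModificationException` for 
`spark.sql.optimizer.nestedPredicatePushdown.supportedFileSources`, and then, 
right after a `RegisterFunctionRequest`, the Python call to 
`SedonaContext.create` raises 
`IllegalArgumentException: Cannot parse synthetic types.`, which suggests that 
the custom spatial types are unsupported in this restricted security mode.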
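
Until there is a fix, a script can at least make this failure easier to spot. The sketch below wraps context creation and re-raises with added context when the `Cannot parse synthetic types` message from the Logs section is present. The names `is_synthetic_type_failure` and `create_with_diagnostics` are hypothetical, and the check assumes the message text appears in `str(exc)`.

```python
SYNTHETIC_TYPE_MARKER = "Cannot parse synthetic types"


def is_synthetic_type_failure(exc):
    """Return True if the exception text carries the message from the logs."""
    return SYNTHETIC_TYPE_MARKER in str(exc)


def create_with_diagnostics(create_context):
    """Call create_context() and add context to the known FGAC failure.

    Any other exception is re-raised unchanged.
    """
    try:
        return create_context()
    except Exception as exc:
        if is_synthetic_type_failure(exc):
            raise RuntimeError(
                "SedonaContext creation failed under Lake Formation FGAC: "
                + str(exc)
            ) from exc
        raise


# Intended use inside the Glue script (not executed here):
# sedona = create_with_diagnostics(lambda: SedonaContext.create(spark))
```

Passing a callable keeps the wrapper independent of Spark, so it can be exercised outside Glue with a stub that raises the same message.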
   
   ## Logs
   ```
   INFO  2026-03-18T09:47:43,865  12581  
com.amazonaws.services.glue.launch.helpers.OpenTableFormatManager$  [main]  32  
Setup data lake format 'iceberg'.
   INFO  2026-03-18T09:48:13,925  8703  
org.apache.spark.emr.EMRParamSideChannel  [Thread-7]  177  Setting FGAC mode to 
true
   WARN  2026-03-18T09:48:13,927  8705  org.apache.spark.SparkContext  
[Thread-7]  72  When executing lakeformation enabled jobs, some user-provided 
spark configurations may not be applied. Please refer to AWS documentation for 
the full list of supported configurations.
   INFO  2026-03-18T09:48:13,940  8718  org.apache.spark.SparkContext  
[Thread-7]  60  Running Spark version 3.5.4-amzn-0
   INFO  2026-03-18T09:48:13,941  8719  org.apache.spark.SparkContext  
[Thread-7]  60  OS info Linux, 5.10.248-247.988.amzn2.x86_64, amd64
   INFO  2026-03-18T09:48:13,942  8720  org.apache.spark.SparkContext  
[Thread-7]  60  Java version 17.0.18
   INFO  2026-03-18T09:48:14,471  9249  
org.apache.spark.fgac.network.plugins.tls.SslOptions  [Thread-7]  60  
createNettySslContext: SslOptions{enabled=true, 
keyStore=Some(/var/certificates/keystore.p12), keyStorePassword=Some(xxx), 
trustStore=Some(/var/certificates/truststore.p12), 
trustStorePassword=Some(xxx), 
crl=Some(/var/certificates/root.crl)protocol=None, enabledAlgorithms=Set()}, 
endpointIdentificationAlgorithm=None
   INFO  2026-03-18T09:48:15,461  10239  
org.apache.spark.fgac.network.plugins.tls.TlsTransportPlugin  [Thread-7]  60  
Initializing TlsChannelInitializer with ssl context: 
org.apache.spark.fgac.network.plugins.tls.NettySslContext@21e95207
   INFO  2026-03-18T09:48:15,623  10401  
org.apache.spark.broadcast.BroadcastManager  [Thread-7]  60  BroadcastManager - 
initialized using FGAC client
   INFO  2026-03-18T09:48:16,042  10820  
org.apache.spark.fgac.client.SparkContextClientImpl  [Thread-7]  110  Created 
SparkContextClient ClientId = 123.
   INFO  2026-03-18T09:48:18,749  13527  
org.apache.spark.fgac.server.UserDriverServer  [Thread-7]  25  
==================== Starting User Driver Server ====================
   INFO  2026-03-18T09:48:19,222  14000  
org.apache.spark.fgac.server.UserDriverServer  [Thread-7]  50  
==================== User Driver Server started ====================
   INFO  2026-03-18T09:48:19,250  14028  
org.apache.spark.fgac.network.plugins.tls.SslOptions  [Thread-7]  60  
createNettySslContext: SslOptions{enabled=true, 
keyStore=Some(/var/certificates/keystore.p12), keyStorePassword=Some(xxx), 
trustStore=Some(/var/certificates/truststore.p12), 
trustStorePassword=Some(xxx), 
crl=Some(/var/certificates/root.crl)protocol=None, enabledAlgorithms=Set()}, 
endpointIdentificationAlgorithm=None
   INFO  2026-03-18T09:48:19,822  14600  
org.apache.spark.fgac.network.plugins.tls.TlsTransportPlugin  [Thread-7]  60  
Initializing TlsChannelInitializer with ssl context: 
org.apache.spark.fgac.network.plugins.tls.NettySslContext@64655827
   INFO  2026-03-18T09:48:42,426  37204  
org.apache.spark.fgac.network.plugins.tls.TlsChannelInitializer  
[rpc-server-4-1]  60  initializeServer for connection with
                           client address: ip-XX-XXX-XX-XXX.ec2.internal
   INFO  2026-03-18T09:49:01,511  56289  
org.apache.spark.fgac.server.SystemDriverInfoReceiverImpl  
[grpc-default-executor-0]  48  registerSystemDriver: Received registration of 
System Driver address: host=ip-XX-XXX-XX-XXX.ec2.internal, port=34119
   INFO  2026-03-18T09:49:01,511  56289  
org.apache.spark.fgac.client.SparkContextClientImpl  [Thread-7]  381  Creating 
channel for host ip-XX-XXX-XX-XXX.ec2.internal, port 34119
   INFO  2026-03-18T09:49:01,802  56580  
org.apache.spark.fgac.client.SparkSessionClientImpl  [Thread-7]  172  Created 
SparkSessionClient ClientId = 123.
   INFO  2026-03-18T09:49:05,287  60065  
org.apache.spark.fgac.client.SparkSessionClientImpl  [Thread-7]  277  [XXX] 
executeCollect.
   ERROR  2026-03-18T09:49:59,465  114243  
org.apache.spark.fgac.error.SparkFGACExceptionSanitizer  [Thread-7]  76  Client 
received error with id = XXXX-XXX-XXXX-XXXX-XXXXX, reason = 
SparkIllegalConfigModificationException, message = 
spark.sql.optimizer.nestedPredicatePushdown.supportedFileSources
   WARN  2026-03-18T09:49:59,466  114244  
org.apache.spark.sql.hive.SecureSQLConf  [Thread-7]  72  Config 
spark.sql.optimizer.nestedPredicatePushdown.supportedFileSources not allowed to 
be modified on system driver
   INFO  2026-03-18T09:50:00,371  115149  
org.apache.spark.fgac.client.SparkSessionClientImpl  [Thread-7]  359  [XXX] 
RegisterFunctionRequest (function=<function1>, 
expressionInfo=org.apache.spark.sql.catalyst.expressions.ExpressionInfo@4e0d57fd).
   ERROR  2026-03-18T09:50:00,879  115657  
com.amazonaws.services.glue.ProcessLauncher  [main]  76  Error from 
Python:Traceback (most recent call last):
     File "<frozen runpy>", line 291, in run_path
     File "<frozen runpy>", line 98, in _run_module_code
     File "<frozen runpy>", line 88, in _run_code
     File "/tmp/glue-job-14352982340482138070/Icerberg Sedona Testing.py", line 
54, in <module>
       raise e
     File "/tmp/glue-job-14352982340482138070/Icerberg Sedona Testing.py", line 
50, in <module>
       sedona = SedonaContext.create(spark)
                ^^^^^^^^^^^^^^^^^^^^^^^^^^^
     File 
"/home/hadoop/.local/lib/python3.11/site-packages/sedona/spark/SedonaContext.py",
 line 48, in create
       spark._jvm.SedonaContext.create(spark._jsparkSession, "python")
     File 
"/usr/lib/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py", line 
1322, in __call__
       return_value = get_return_value(
                      ^^^^^^^^^^^^^^^^^
     File 
"/usr/lib/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py", 
line 185, in deco
       raise converted from None
   pyspark.errors.exceptions.captured.IllegalArgumentException: Cannot parse 
synthetic types.
   ERROR  2026-03-18T09:50:00,918  115696  
com.amazonaws.services.glueexceptionanalysis.GlueExceptionAnalysisListener  
[main]  9  [Glue Exception Analysis] {
       "Event": "GlueETLJobExceptionEvent",
       "Timestamp": 1773827400916,
       "Failure Reason": "Traceback (most recent call last):\n  File \"<frozen 
runpy>\", line 291, in run_path\n  File \"<frozen runpy>\", line 98, in 
_run_module_code\n  File \"<frozen runpy>\", line 88, in _run_code\n  File 
\"/tmp/glue-job-14352982340482138070/Icerberg Sedona Testing.py\", line 54, in 
<module>\n    raise e\n  File \"/tmp/glue-job-14352982340482138070/Icerberg 
Sedona Testing.py\", line 50, in <module>\n    sedona = 
SedonaContext.create(spark)\n             ^^^^^^^^^^^^^^^^^^^^^^^^^^^\n  File 
\"/home/hadoop/.local/lib/python3.11/site-packages/sedona/spark/SedonaContext.py\",
 line 48, in create\n    spark._jvm.SedonaContext.create(spark._jsparkSession, 
\"python\")\n  File 
\"/usr/lib/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py\", line 
1322, in __call__\n    return_value = get_return_value(\n                   
^^^^^^^^^^^^^^^^^\n  File 
\"/usr/lib/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py\",
 line 185, in deco\n    raise converted 
 from None\npyspark.errors.exceptions.captured.IllegalArgumentException: Cannot 
parse synthetic types.",
       "Stack Trace": [
           {
               "Declaring Class": "deco",
               "Method Name": "raise converted from None",
               "File Name": 
"/usr/lib/spark/python/lib/pyspark.zip/pyspark/errors/exceptions/captured.py",
               "Line Number": 185
           },
           {
               "Declaring Class": "__call__",
               "Method Name": "return_value = get_return_value(",
               "File Name": 
"/usr/lib/spark/python/lib/py4j-0.10.9.7-src.zip/py4j/java_gateway.py",
               "Line Number": 1322
           },
           {
               "Declaring Class": "create",
               "Method Name": 
"spark._jvm.SedonaContext.create(spark._jsparkSession, \"python\")",
               "File Name": 
"/home/hadoop/.local/lib/python3.11/site-packages/sedona/spark/SedonaContext.py",
               "Line Number": 48
           },
           {
               "Declaring Class": "<module>",
               "Method Name": "sedona = SedonaContext.create(spark)",
               "File Name": "/tmp/glue-job-14352982340482138070/Icerberg Sedona 
Testing.py",
               "Line Number": 50
           },
           {
               "Declaring Class": "<module>",
               "Method Name": "raise e",
               "File Name": "/tmp/glue-job-14352982340482138070/Icerberg Sedona 
Testing.py",
               "Line Number": 54
           }
       ],
       "Last Executed Line number": 54,
       "script": "Icerberg Sedona Testing.py"
   }
   ERROR  2026-03-18T09:50:00,919  115697  
com.amazonaws.services.glueexceptionanalysis.GlueExceptionAnalysisListener  
[main]  9  [Glue Exception Analysis] Last Executed Line number from script 
Icerberg Sedona Testing.py: 54
   INFO  2026-03-18T09:50:01,012  115790  
org.apache.spark.fgac.client.SparkContextClientImpl  [shutdown-hook-0]  308  
[XXX] shutdownServer client starts.
   ```
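
The log above contains a rejected config modification before the `Cannot parse synthetic types.` exception, so it can help to list which config keys the system driver refused. This is a minimal sketch: `rejected_configs` is a hypothetical helper, and the pattern is based only on the `SecureSQLConf` warning shown in the log, so other Glue versions may word it differently.

```python
import re

# Matches the SecureSQLConf warning; \s+ also spans line breaks, so
# hard-wrapped log lines (as in this message) still match.
REJECTED_CONFIG = re.compile(
    r"Config\s+(\S+)\s+not\s+allowed\s+to\s+be\s+modified\s+on\s+system\s+driver"
)


def rejected_configs(log_text):
    """Return the rejected config keys in first-seen order, without duplicates."""
    seen = []
    for key in REJECTED_CONFIG.findall(log_text):
        if key not in seen:
            seen.append(key)
    return seen


sample = """WARN  2026-03-18T09:49:59,466  114244
org.apache.spark.sql.hive.SecureSQLConf  [Thread-7]  72  Config
spark.sql.optimizer.nestedPredicatePushdown.supportedFileSources not allowed to
be modified on system driver"""

print(rejected_configs(sample))
# ['spark.sql.optimizer.nestedPredicatePushdown.supportedFileSources']
```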


