korbit-ai[bot] commented on code in PR #34519:
URL: https://github.com/apache/superset/pull/34519#discussion_r2249267636


##########
superset/common/utils/dataframe_utils.py:
##########
@@ -43,11 +43,19 @@ def left_join_df(
 def full_outer_join_df(
     left_df: pd.DataFrame,
     right_df: pd.DataFrame,
+    join_keys: list[str] | None = None,
     lsuffix: str = "",
     rsuffix: str = "",
 ) -> pd.DataFrame:
-    df = left_df.join(right_df, lsuffix=lsuffix, rsuffix=rsuffix, how="outer")
-    df.reset_index(inplace=True)
+    if join_keys:
+        df = left_df.set_index(join_keys).join(
+            right_df.set_index(join_keys), lsuffix=lsuffix, rsuffix=rsuffix, how="outer"
+        )
+        df.reset_index(inplace=True)
+
+    else:
+        df = left_df.join(right_df, lsuffix=lsuffix, rsuffix=rsuffix, how="outer")

Review Comment:
   ### Unsafe default join behavior for time series alignment <sub>![category Functionality](https://img.shields.io/badge/Functionality-0284c7)</sub>
   
   <details>
     <summary>Tell me more</summary>
   
   ###### What is the issue?
   When `join_keys` is not provided, the function falls back to `DataFrame.join`, 
which aligns rows on the DataFrame index. That index may not be the right key 
for aligning time series data.
   
   
   ###### Why this matters
   If the indices of the DataFrames are not set to the time column, the join can 
produce misaligned rows or missing dates, which contradicts the developer's 
intent to fix misalignment issues.
   
   ###### Suggested change ∙ *Feature Preview*
   Make the `join_keys` parameter mandatory so that the join criteria for time 
series alignment are always explicit:
   ```python
   def full_outer_join_df(
       left_df: pd.DataFrame,
       right_df: pd.DataFrame,
       join_keys: list[str],  # Remove None default
       lsuffix: str = "",
       rsuffix: str = "",
   ) -> pd.DataFrame:
       df = left_df.set_index(join_keys).join(
           right_df.set_index(join_keys), lsuffix=lsuffix, rsuffix=rsuffix, how="outer"
       )
       df.reset_index(inplace=True)
       return df
   ```
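
A possible usage of the suggested signature is sketched below, joining on more than one key. The function body repeats the suggestion above; the sample frames, the `country` and `metric` columns, and the `__offset` suffix are assumptions made only for illustration.

```python
import pandas as pd


def full_outer_join_df(
    left_df: pd.DataFrame,
    right_df: pd.DataFrame,
    join_keys: list[str],
    lsuffix: str = "",
    rsuffix: str = "",
) -> pd.DataFrame:
    # Join on explicit key columns instead of whatever index the frames carry.
    df = left_df.set_index(join_keys).join(
        right_df.set_index(join_keys), lsuffix=lsuffix, rsuffix=rsuffix, how="outer"
    )
    df.reset_index(inplace=True)
    return df


# Hypothetical frames: a main series and a comparison series shifted by one day.
main_df = pd.DataFrame(
    {"ds": ["2024-01-01", "2024-01-02"], "country": ["US", "US"], "metric": [10, 11]}
)
offset_df = pd.DataFrame(
    {"ds": ["2024-01-02", "2024-01-03"], "country": ["US", "US"], "metric": [7, 8]}
)

joined = full_outer_join_df(
    main_df, offset_df, join_keys=["ds", "country"], rsuffix="__offset"
)
print(joined)
```

Rows that exist in only one frame are kept, with missing values in the columns that come from the other frame.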
   
   
   </details>
   



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

