michael-s-molina commented on PR #32520:
URL: https://github.com/apache/superset/pull/32520#issuecomment-2754560510

   > I've made the API more compact, requiring only the UUIDs for metrics and 
columns.
   
   Thanks for the improvements @betodealmeida.
   
   > Do you think that would be enough for your use case?
   
   I think the best way to test the feature is to generate a dataset with 
thousands of columns and metrics and simulate the interactions.
   
   The script below creates a CSV file with 500 text columns (`c_*`) and 500 
numeric columns (`m_*`, intended to serve as metrics) that can be imported into 
Superset.
   
   ```python
   import csv
   
   # Define the number of columns, metrics, and rows
   num_columns = 500
   num_metrics = 500
   num_rows = 10
   
   # Create column headers
   column_headers = [f"c_{i}" for i in range(1, num_columns + 1)] + [
       f"m_{i}" for i in range(1, num_metrics + 1)
   ]
   
   # Generate sample data
   data = [
       [
           (
               f"{row}_{col}"
               if col < num_columns
               else row * num_metrics + (col - num_columns)
           )
           for col in range(num_columns + num_metrics)
       ]
       for row in range(num_rows)
   ]
   
   # Write to CSV file
   with open("large_dataset.csv", "w", newline="") as csvfile:
       writer = csv.writer(csvfile)
       writer.writerow(column_headers)
       writer.writerows(data)
   
   print("CSV file 'large_dataset.csv' created successfully.")
   ```
   
   It's still only 1,000 columns, which is far fewer than our 30k, but it's a 
start. If you have another way of creating a dataset in Superset with more 
columns, that would be even better.
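
   One possible way to get closer to 30k is to parameterize the script. The sketch below is only an illustration: the function name `write_wide_csv` and the output path are made up for this example, and it writes one row at a time so memory use stays flat as the width grows. Whether a file this wide can actually be imported depends on the target database, since many backends cap the number of columns per table.

```python
import csv


def write_wide_csv(path, num_columns, num_metrics, num_rows=10):
    """Write a CSV with num_columns text columns (c_*) and num_metrics numeric columns (m_*).

    Hypothetical helper for illustration; returns the total number of columns written.
    """
    headers = [f"c_{i}" for i in range(1, num_columns + 1)] + [
        f"m_{i}" for i in range(1, num_metrics + 1)
    ]
    with open(path, "w", newline="") as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(headers)
        for row in range(num_rows):
            # Build and write a single row at a time instead of holding all rows in memory
            text_cells = [f"{row}_{col}" for col in range(num_columns)]
            numeric_cells = [row * num_metrics + col for col in range(num_metrics)]
            writer.writerow(text_cells + numeric_cells)
    return len(headers)
```

   For example, `write_wide_csv("wide_dataset.csv", 15_000, 15_000)` would produce a file with 30,000 columns and 10 rows, using the same cell values as the script above.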
   
   Once it's ready, I'll test the feature using our real data, but the test 
dataset can help make sure the payload and the algorithms that edit the folders 
can handle a lot of data efficiently.
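
   As a rough way to reason about payload size before the real test, something like the sketch below could be used. The payload shape here (a list of folders, each with a `uuid`, a `name`, and a `children` list of item UUIDs) is purely an assumption for illustration and is not the schema from this PR; the function names are hypothetical too. It only measures how large such a structure gets and how long a JSON round trip takes.

```python
import json
import time
import uuid


def build_folder_payload(num_items, num_folders=10):
    """Build a HYPOTHETICAL folder payload: folders that list member UUIDs."""
    item_uuids = [str(uuid.uuid4()) for _ in range(num_items)]
    folders = [
        {"uuid": str(uuid.uuid4()), "name": f"folder_{i}", "children": []}
        for i in range(num_folders)
    ]
    # Distribute the items round-robin across the folders
    for index, item_uuid in enumerate(item_uuids):
        folders[index % num_folders]["children"].append(item_uuid)
    return folders


def measure_payload(num_items):
    """Return (encoded size in bytes, seconds for a JSON round trip)."""
    payload = build_folder_payload(num_items)
    start = time.perf_counter()
    encoded = json.dumps(payload).encode("utf-8")
    json.loads(encoded)
    elapsed = time.perf_counter() - start
    return len(encoded), elapsed
```

   Calling `measure_payload(30_000)` gives a ballpark figure for a dataset of our size under these assumptions; the real numbers will depend on the actual schema and on what the frontend does with it.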


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

