richardstartin opened a new issue #8009:
URL: https://github.com/apache/pinot/issues/8009


   Run this query against the hybrid quick start:
   
   ```sql
   explain plan for select count(*) from airlineStats
   where insubquery(OriginAirportID, 'select idset(DestAirportID) from airlineStats') = 1
   ```
   
   It prints:
   ```json
   {
     "resultTable": {
       "dataSchema": {
         "columnNames": [
           "Operator",
           "Operator_Id",
           "Parent_Id"
         ],
         "columnDataTypes": [
           "STRING",
           "INT",
           "INT"
         ]
       },
       "rows": [
         [
           "BROKER_REDUCE(limit:10)",
           0,
           -1
         ],
         [
           "COMBINE_AGGREGATE",
           1,
           0
         ],
         [
           "AGGREGATE(aggregations:count(*))",
           2,
           1
         ],
         [
           "TRANSFORM_PASSTHROUGH()",
           3,
           2
         ],
         [
           "PROJECT()",
           4,
           3
         ],
         [
           
"FILTER_EXPRESSION(operator:EQ,predicate:inidset(OriginAirportID,'ATowAAABAAAAAAAZARAAAACXJ5gnnCeiJ6snrSe6J8kn4CcRKBwoJyg7KF0oeSiEKJ0oqCi3KL8oISk3KUEpVSlnKXwpgymHKZMpqim9KcUp2SnhKegp6ynsKfMp+ykCKhsqHSohKigqMCpFKmEqdCp6KqYqriqyKuQq7iryKvsqISsiKykrMSs6KzsrRCtZK2UrZytyK4Qriiu5K8Mr9Cv7KwMsCiwOLBwsIiwsLDMsSSyVLJ8sqSzPLNks7ywFLREtFC1TLVwtYS1iLWgtbi11LXYteS2ALbEtyS3OLf8tAi4vLlkuWy5sLoEukS6xLsUuyS7MLs4u0i7bLtwu4y7nLuwu8C4+L40vny+lL64vuS/oL+ov9i/4LyAwIzAvMDMwNzBlMGcwcjCZMKAwozC+MN8w6zDWMRMyVDJYMlkyWzJcMmAyczKRMpcymTKaMrYywDLgMuUyBTNHM2YzgDOOM5QzrjOwM7wzyDPQM90z6jPwM/czHjQgNDA0NzRBNEw0bjRwNHk0pDStNK40rzS3NL40CTXjNeQ1BjYbNi82MTZDNmo2azZtNow2kja2Nss26TYSNxQ3GzccNyE3KjdxN6w3rjewN7Y34zfxN3k4lziZOJw4uDi8OM846jjuOPA4KzlSOVc5WzldOWE5aDlqOXU5ijmbObM5vznKOd457DnvOfo5+zkVOi06OTo8Omg6cDqKOqg6sDqzOsE6yDreOvg6kTu/O8g7/DsKPBA8FDwdPCk8Mzw0PPc8CD3hPS8+Wj8=')
 = '1')",
           5,
           4
         ]
       ]
     },
     "exceptions": [],
     "numServersQueried": 1,
     "numServersResponded": 1,
     "numSegmentsQueried": 1,
     "numSegmentsProcessed": 0,
     "numSegmentsMatched": 0,
     "numConsumingSegmentsQueried": 0,
     "numDocsScanned": 0,
     "numEntriesScannedInFilter": 0,
     "numEntriesScannedPostFilter": 0,
     "numGroupsLimitReached": false,
     "totalDocs": 289,
     "timeUsedMs": 22,
     "offlineThreadCpuTimeNs": 0,
     "realtimeThreadCpuTimeNs": 0,
     "offlineSystemActivitiesCpuTimeNs": 0,
     "realtimeSystemActivitiesCpuTimeNs": 0,
     "offlineResponseSerializationCpuTimeNs": 0,
     "realtimeResponseSerializationCpuTimeNs": 0,
     "offlineTotalCpuTimeNs": 0,
     "realtimeTotalCpuTimeNs": 0,
     "segmentStatistics": [],
     "traceInfo": {},
     "minConsumingFreshnessTimeMs": 0,
     "numRowsResultSet": 6
   }
   ```
   
   Printing function parameters in an explain plan leaks data. The
   base64-encoded idsets can be deserialised to reveal every distinct value of
   a column, and anyone who can read the source code can work out how to decode
   these parameters:
   
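   ```java
   import java.io.IOException;
   import java.nio.ByteBuffer;
   import java.nio.ByteOrder;
   import java.util.Arrays;
   import java.util.Base64;
   import org.roaringbitmap.RoaringBitmap;

   public class DecodeIdSet {
     // args[0] is the base64 idset literal copied from the explain plan
     public static void main(String... args) throws IOException {
       // skip the leading byte, then read the rest as a little-endian RoaringBitmap
       ByteBuffer idset = ByteBuffer.wrap(Base64.getDecoder().decode(args[0]))
           .position(1).slice().order(ByteOrder.LITTLE_ENDIAN);
       RoaringBitmap bitmap = new RoaringBitmap();
       bitmap.deserialize(idset);
       System.err.println(Arrays.toString(bitmap.toArray()));
     }
   }
   ```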
   
   This prints the airport IDs; the subquery could just as easily have selected
   the social security numbers of users satisfying some condition:
   
   ```
   [10135, 10136, 10140, 10146, 10155, 10157, 10170, 10185, 10208, 10257, 
10268, 10279, 10299, 10333, 10361, 10372, 10397, 10408, 10423, 10431, 10529, 
10551, 10561, 10581, 10599, 10620, 10627, 10631, 10643, 10666, 10685, 10693, 
10713, 10721, 10728, 10731, 10732, 10739, 10747, 10754, 10779, 10781, 10785, 
10792, 10800, 10821, 10849, 10868, 10874, 10918, 10926, 10930, 10980, 10990, 
10994, 11003, 11041, 11042, 11049, 11057, 11066, 11067, 11076, 11097, 11109, 
11111, 11122, 11140, 11146, 11193, 11203, 11252, 11259, 11267, 11274, 11278, 
11292, 11298, 11308, 11315, 11337, 11413, 11423, 11433, 11471, 11481, 11503, 
11525, 11537, 11540, 11603, 11612, 11617, 11618, 11624, 11630, 11637, 11638, 
11641, 11648, 11697, 11721, 11726, 11775, 11778, 11823, 11865, 11867, 11884, 
11905, 11921, 11953, 11973, 11977, 11980, 11982, 11986, 11995, 11996, 12003, 
12007, 12012, 12016, 12094, 12173, 12191, 12197, 12206, 12217, 12264, 12266, 
12278, 12280, 12320, 12323, 12335, 12339, 12343, 12389, 12391, 12402, 12441, 
 12448, 12451, 12478, 12511, 12523, 12758, 12819, 12884, 12888, 12889, 12891, 
12892, 12896, 12915, 12945, 12951, 12953, 12954, 12982, 12992, 13024, 13029, 
13061, 13127, 13158, 13184, 13198, 13204, 13230, 13232, 13244, 13256, 13264, 
13277, 13290, 13296, 13303, 13342, 13344, 13360, 13367, 13377, 13388, 13422, 
13424, 13433, 13476, 13485, 13486, 13487, 13495, 13502, 13577, 13795, 13796, 
13830, 13851, 13871, 13873, 13891, 13930, 13931, 13933, 13964, 13970, 14006, 
14027, 14057, 14098, 14100, 14107, 14108, 14113, 14122, 14193, 14252, 14254, 
14256, 14262, 14307, 14321, 14457, 14487, 14489, 14492, 14520, 14524, 14543, 
14570, 14574, 14576, 14635, 14674, 14679, 14683, 14685, 14689, 14696, 14698, 
14709, 14730, 14747, 14771, 14783, 14794, 14814, 14828, 14831, 14842, 14843, 
14869, 14893, 14905, 14908, 14952, 14960, 14986, 15016, 15024, 15027, 15041, 
15048, 15070, 15096, 15249, 15295, 15304, 15356, 15370, 15376, 15380, 15389, 
15401, 15411, 15412, 15607, 15624, 15841, 15919, 16218]
   ```
   
   This would make two things impossible. A business user could not take an
   explain plan from a production database on behalf of an operator and share it
   to diagnose a performance problem. Nor could an enterprise create the common
   role that gives technical users the ability to run diagnostic commands but
   not to access production data, because by combining explain plans and idsets
   those users can access essentially any data they like.
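
For illustration only, a plan could be made shareable by masking quoted literals in the operator descriptions before they are printed. The sketch below is an assumption about what that might look like, not how Pinot handles it: the class `PlanRedactor` and the `'<redacted>'` placeholder are hypothetical, and it masks every single-quoted literal, including harmless ones such as `'1'`.

```java
import java.util.regex.Pattern;

public class PlanRedactor {
  // Matches a single-quoted SQL literal, treating '' as an escaped quote.
  private static final Pattern LITERAL = Pattern.compile("'(?:[^']|'')*'");

  /** Replaces every quoted literal in an operator description with a placeholder. */
  public static String redact(String operator) {
    return LITERAL.matcher(operator).replaceAll("'<redacted>'");
  }

  public static void main(String[] args) {
    String op = "FILTER_EXPRESSION(operator:EQ,predicate:inidset(OriginAirportID,'ATow==') = '1')";
    System.out.println(redact(op));
  }
}
```

The operator name, the column name and the shape of the predicate survive, which is usually what is needed to diagnose a performance problem, while the serialized idset does not.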
   
   




