soumitra-st opened a new issue, #10566: URL: https://github.com/apache/pinot/issues/10566
There are 6 servers, assigned into 2 pools by creating tags as below:

- Group1: Server_192.168.1.160_8001, Server_192.168.1.160_8002, and Server_192.168.1.160_8003 have tag "DefaultTenant_OFFLINE": "0"
- Group2: Server_192.168.1.160_8004, Server_192.168.1.160_8005, and Server_192.168.1.160_8006 have tag "DefaultTenant_OFFLINE": "1"

Group1 servers are in FD1, and Group2 servers are in FD2. Now creating a table with FD_AWARE_INSTANCE_PARTITION_SELECTOR; the table config is:

```
% cat transcript-table-offline.json.rg
{
  "tableName": "transcript",
  "segmentsConfig": {
    "replicaGroupStrategyConfig": {
      "partitionColumn": "timestamp",
      "numInstancesPerPartition": 2
    },
    "timeColumnName": "timestamp",
    "timeType": "MILLISECONDS",
    "replication": "2",
    "schemaName": "transcript"
  },
  "tableIndexConfig": {
    "invertedIndexColumns": [],
    "loadMode": "MMAP"
  },
  "tenants": {
    "broker": "DefaultTenant",
    "server": "DefaultTenant"
  },
  "tableType": "OFFLINE",
  "metadata": {},
  "instanceAssignmentConfigMap": {
    "OFFLINE": {
      "partitionSelector": "FD_AWARE_INSTANCE_PARTITION_SELECTOR",
      "tagPoolConfig": {
        "tag": "DefaultTenant_OFFLINE",
        "poolBased": true
      },
      "replicaGroupPartitionConfig": {
        "replicaGroupBased": true,
        "numReplicaGroups": 2,
        "numPartitions": 2,
        "numInstancesPerPartition": 2
      }
    }
  }
}
```

Adding a table with the above table config to do FD-aware instance assignment:

```
% bin/pinot-admin.sh AddTable -tableConfigFile $BASE_DIR/transcript-table-offline.json.rg -schemaFile $BASE_DIR/transcript-schema.json -controllerPort 9001 -exec
…
Executing command: AddTable -tableConfigFile /Users/soumitra/pinot-tutorial/transcript/transcript-table-offline.json.rg -offlineTableConfigFile null -realtimeTableConfigFile null -schemaFile /Users/soumitra/pinot-tutorial/transcript/transcript-schema.json -controllerProtocol http -controllerHost 192.168.1.160 -controllerPort 9001 -user null -password [hidden] -exec
{"unrecognizedProperties":{},"status":"TableConfigs transcript successfully added"}
```

Instance assignment as per the logs:

```
% grep PartitionSelector logs/pinot-all.log
2023/03/28 20:57:10.807 INFO [FDAwareInstancePartitionSelector] [grizzly-http-server-9] Assigning 2 replica groups to 2 fault domains
2023/03/28 20:57:10.807 INFO [FDAwareInstancePartitionSelector] [grizzly-http-server-9] Warning, normalizing isn't finished yet
2023/03/28 20:57:10.807 INFO [FDAwareInstancePartitionSelector] [grizzly-http-server-9] Selecting 2 partitions, 2 instances per partition within a replica-group for table: transcript_OFFLINE
2023/03/28 20:57:10.807 INFO [FDAwareInstancePartitionSelector] [grizzly-http-server-9] Selecting instances: [Server_192.168.1.160_8001, Server_192.168.1.160_8004] for replica-group: 0, partition: 0 for table: transcript_OFFLINE
2023/03/28 20:57:10.807 INFO [FDAwareInstancePartitionSelector] [grizzly-http-server-9] Selecting instances: [Server_192.168.1.160_8002, Server_192.168.1.160_8001] for replica-group: 0, partition: 1 for table: transcript_OFFLINE
2023/03/28 20:57:10.807 INFO [FDAwareInstancePartitionSelector] [grizzly-http-server-9] Selecting instances: [Server_192.168.1.160_8005, Server_192.168.1.160_8003] for replica-group: 1, partition: 0 for table: transcript_OFFLINE
2023/03/28 20:57:10.807 INFO [FDAwareInstancePartitionSelector] [grizzly-http-server-9] Selecting instances: [Server_192.168.1.160_8006, Server_192.168.1.160_8005] for replica-group: 1, partition: 1 for table: transcript_OFFLINE
```

As per the above logs:

- [Server_192.168.1.160_8001, Server_192.168.1.160_8004] for replica-group: 0, partition: 0
- [Server_192.168.1.160_8002, Server_192.168.1.160_8001] for replica-group: 0, partition: 1
- [Server_192.168.1.160_8005, Server_192.168.1.160_8003] for replica-group: 1, partition: 0
- [Server_192.168.1.160_8006, Server_192.168.1.160_8005] for replica-group: 1, partition: 1

If the entire FD1 is down, then Server_192.168.1.160_8001, Server_192.168.1.160_8002, and Server_192.168.1.160_8003 are down, hence "replica-group: 0, partition: 0" and
"replica-group: 1, partition: 0", both are not available. If the segments have two replicas, then no replica of partition 0 is available. Does it mean that the queries to "partition 0" will fail? This is a simple scenario, most likely not a bug. What am I missing in understanding the FD aware instance assignment? -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: commits-unsubscr...@pinot.apache.org For additional commands, e-mail: commits-h...@pinot.apache.org