FYI: https://issues.apache.org/jira/browse/KNOX-2990
On Thu, Nov 30, 2023 at 9:05 AM Sandor Molnar <[email protected]> wrote: > Hello folks! > > After an offline discussion with Larry, we agreed on the following (as an > extension to the action plan I listed above): > - the migration tool will be run automatically when the Knox Gateway > starts, and it will run on the main thread (i.e. not in the background). > - there will be a config to control this step: in case of an error/bug, > this automated migration could be turned off > - when the first version of this newly configured DerbyDB JDBC TSS is > implemented, I'll run some performance tests to see if encryption should be > enabled by default > - we'll make sure to protect the DerbyDB data folder with proper file > permissions > > I'll submit the required JIRAs soon. > > Cheers, > Sandor > > > On Thu, Nov 23, 2023 at 11:59 PM larry mccay <[email protected]> wrote: > >> If we can determine whether there is already an alias based TSS in place >> and continue to use that for upgrades but derby for new clusters, I would >> be in favor of that. >> On whether to enable encryption, if we are only storing a hash of the >> passcode token then that should be okay. >> The persistence should be protected appropriately with file permissions >> for >> the knox user. >> >> NOTE: We will need to have some idea of how this may affect management >> applications like Cloudera Manager and Ambari, if at all, and get out in >> front of it. >> >> On Fri, Nov 17, 2023 at 8:27 AM Sandor Molnar >> <[email protected]> >> wrote: >> >> > Hi folks! >> > >> > Let me try to summarize what we have so far: >> > 1. we are all in favor of removing the JournalBased and Zookeeper TSS >> > implementations >> > 2. we all agreed that removing the AliasBasedTSS implementation requires >> > extra caution >> > 3. Larry raised the following concerns >> > 3.1 clear data storage in Derby -> ANSWER: Attila and I also >> indicated >> > Derby provides data encryption OOTB >> > 3.2 token hashes -> ANSWER: we do not store JWTs, only metadata. We >> > persist the passcode tokens though. It's hashed and stored using the >> > "knox.token.hash.key" secret and "gateway.knox.token.hash.algorithm" >> HMAC >> > algorithm which defaults to HmacSHA256. >> > 3.3 token synchronization across multiple Knox instances. -> ANSWER: >> > Derby has data replication capabilities. However, in HA environments, >> I'd >> > strongly recommend using Postgres/MySQL in those Knox instances >> > 4. Sandeep and Phil articulated the importance of deprecation -> we all >> > agree on this point >> > 5. Phil asked whether data encryption should be the default in the >> > Derby-configured JDBC TSS --> IMO, encryption should be turned on by >> > default. The required "bootPassword" should be auto-generated and >> stored in >> > __gateway-credentials.jks >> > 6. I recommended that the migration tool should be automated: when token >> > state service is initiated and it's using the pre-configured Derby >> > database, we may check if there is any token stored in __gateway.jks and >> > migrate them. This way it'd be seamless for existing tokens. >> > >> > Action plan: >> > - waiting for additional inputs on the above >> > - implement the DerbyDB configuration using encryption >> > - implement the migration tool in KnoxCLI and wire it in as a startup >> step >> > for the DerbyDB default implementation >> > - make sure end-users will not need to change anything when switching to >> > the new DerbyDB configured JDBC TSS >> > - make those three TSS implementations deprecated in v2.1.0, but leave >> the >> > AliasBasedTokenState service the default implementation >> > - release v2.1.0 and document the changes in this area. It's crucial to >> > emphasize we are going to remove them in the upcoming release (v2.2.0?) >> and >> > encourage end-users to switch to the DerbyDB JDBC TSS ASAP >> > - once v2.1.0 is released, remove the deprecated implementations and >> have >> > the new DerbyDB JDBC TSS the default one >> > >> > As always, feel free to add your comments and insights on the above >> > subject. >> > >> > Cheers, >> > Sandor >> > >> > On Tue, Nov 14, 2023 at 3:41 PM Phil Zampino <[email protected]> >> wrote: >> > >> > > First and foremost, I'll echo the comments about deprecation. IMO, we >> > must >> > > deprecate these implementations in a release before completely >> removing >> > > them in a subsequent release. >> > > >> > > I agree that the ZK and Journal implementations should be deprecated >> for >> > > the reasons stated. >> > > >> > > Concerning replacing the alias-based implementation with Derby, I >> > > share some of the same concerns expressed by Larry: >> > > - Attila has mentioned that Derby supports data encryption, but do we >> > > enable it by default? Should we require it always? >> > > - The questions around copying Derby data remains unanswered, at least >> > > partially if the migration utility proposal was intended to address >> this >> > > topic. >> > > >> > > >> > > On Mon, Nov 13, 2023 at 8:02 AM Sandor Molnar >> > <[email protected] >> > > > >> > > wrote: >> > > >> > > > Hello folks! >> > > > >> > > > I'm starting this thread because I am convinced we should remove the >> > > > following TokenStateService implementations: >> > > > - AliasBasedtokenStateService >> > > > - ZookeeperTokenStateService >> > > > - JournalBasedTokenStateService >> > > > >> > > > The reason behind this idea for the last two implementations in the >> > above >> > > > list is quite simple: >> > > > >> > > > 1. ZookeeperTokenStateService was our first approach to provide HA >> > > support >> > > > for Knox Token Integration. However, our internal tests have shown >> that >> > > ZK >> > > > is just simply not the right tool for that feature. Eventual >> > consistency >> > > is >> > > > only one part of this issue (we could make this work with re-tried >> ZK >> > > > queries). Performance-wise ZK proved to be a wrong decision. In our >> > test >> > > > environment, where hundreds of tokens were generated in every >> minute, >> > ZK >> > > > was not enough to scale. >> > > > >> > > > 2. JournalBasedTokensSateService is >> > > > 2.1 insecure (it stores plain data on the FS), >> > > > 2.2 missing features (no impersonation or SSO Cookie support) >> > > > >> > > > In the case of the AliasBasedtokenStateService, the reason is not >> that >> > > > simple. It's true, that keystore-related operations are expensive, >> but >> > > the >> > > > background thread that actually persists the token state improved a >> lot >> > > in >> > > > this respect. However, it's still slow compared to the supported >> > > databases >> > > > we added for the JDBC implementation when it comes to token >> > verification. >> > > > In addition to that, the current implementation creates at least 3 >> > > aliases >> > > > per token, which makes the __gateway really big in case of lots of >> > > tokens. >> > > > Even worse, we try to read all tokens into memory from __gateway >> > > credential >> > > > store in a background thread that also consumes memory, CPU which we >> > > could >> > > > avoid. >> > > > To be honest, I don't see any reason why could not we achieve the >> same >> > > > functionality with a pre-configured Derby database that stores its >> data >> > > in >> > > > a dedicated sub-folder within the KNOX_DATA_DIR. This would be the >> > > default >> > > > choice, so users will still not need to configure everything for the >> > > > KnoxToken service even if token state management is enabled. >> > > > >> > > > We could also write a small KNOX CLI command to migrate existing >> tokens >> > > > from keystores to Derby upon upgrade. >> > > > >> > > > Advantages of the above: >> > > > - only one implementation will be kept (JDBCTokenStateService) >> which is >> > > > proven to be robust enough and can scale well >> > > > - easier to maintain the product >> > > > - easier to troubleshoot in PROD environments (Derby has very >> powerful >> > > > tools to connect and run SQL queries) >> > > > - eliminate background threads which make debugging hard, >> > > > resource-consuming, and adds complexity >> > > > - the non-desired side effects of reading lots of tokens into memory >> > from >> > > > __gateway credential store that may make the >> > > > >> > > > I'm curious about what you think of the above and I'd like to hear >> back >> > > > from you with your suggestions and ideas. >> > > > >> > > > Cheers, >> > > > Sandor >> > > > >> > > >> > >> >
