Thanks for all the developers' opinions. I will complete the proposal in the near future.
Best,
Huajie Wang

gongzhongqiang <[email protected]> wrote on Tue, Apr 2, 2024 at 15:55:

> Hi Ben,
>
> I agree with the points you've made. It will be clearer for users.
>
> Best regards,
> Zhongqiang Gong

Huajie Wang <[email protected]> wrote on Sat, Mar 30, 2024 at 18:08:

Hi devs:

Currently, the StreamPark platform provides multiple configuration files for user configuration, such as application.yml, application-pgsql.yml, application-mysql.yml, kerberos.yml... We can improve these configuration files. Many of their entries are internal system configurations; for example, in application.yml a large number of entries are internal platform settings, such as the Jackson config for Spring Boot integration, the swagger-ui config, and the 'allow-circular-references' parameter for Spring. These do not need user configuration and should not be exposed to users.

application.yml:
```yaml
server:
  port: 10000
  undertow:
    buffer-size: 1024
    direct-buffers: true
    threads:
      io: 4
      worker: 20

logging:
  level:
    root: info

knife4j:
  enable: true
  basic:
    # basic authentication, used to access swagger-ui and doc
    enable: false
    username: admin
    password: streampark

springdoc:
  api-docs:
    enabled: true
  swagger-ui:
    path: /swagger-ui.html
  packages-to-scan: org.apache.streampark.console

spring:
  profiles.active: h2 #[h2,pgsql,mysql]
  application.name: StreamPark
  devtools.restart.enabled: false
  mvc.pathmatch.matching-strategy: ant_path_matcher
  servlet:
    multipart:
      enabled: true
      max-file-size: 500MB
      max-request-size: 500MB
  aop.proxy-target-class: true
  messages.encoding: utf-8
  jackson:
    date-format: yyyy-MM-dd HH:mm:ss
    time-zone: GMT+8
    deserialization:
      fail-on-unknown-properties: false
  main:
    allow-circular-references: true
    banner-mode: off
  mvc:
    converters:
      preferred-json-mapper: jackson

management:
  endpoints:
    web:
      exposure:
        include: [ 'health', 'httptrace', 'metrics' ]
  endpoint:
    health:
      enabled: true
      show-details: always
      probes:
        enabled: true
  health:
    ldap:
      enabled: false

streampark:
  proxy:
    # knox process address, e.g: https://cdpsit02.example.cn:8443/gateway/cdp-proxy/yarn
    yarn-url:
    # lark alert proxy, default https://open.feishu.cn
    lark-url:
  yarn:
    # default simple, or kerberos
    http-auth: simple

  # HADOOP_USER_NAME
  hadoop-user-name: hdfs
  # local workspace, used to store source code and build dir etc.
  workspace:
    local: /opt/streampark_workspace
    remote: hdfs:///streampark  # support hdfs:///streampark/, /streampark, hdfs://host:ip/streampark/

  # remote docker register namespace for streampark
  docker:
    # instantiating DockerHttpClient
    http-client:
      max-connections: 10000
      connection-timeout-sec: 10000
      response-timeout-sec: 12000
    docker-host: ""

  # flink-k8s tracking configuration
  flink-k8s:
    tracking:
      silent-state-keep-sec: 10
      polling-task-timeout-sec:
        job-status: 120
        cluster-metric: 120
      polling-interval-sec:
        job-status: 2
        cluster-metric: 3
    # If you need to specify an ingress controller, you can use this.
    ingress:
      class: nginx

  # packer garbage resources collection configuration
  packer-gc:
    # maximum retention time for temporary build resources
    max-resource-expired-hours: 120
    # gc task running interval hours
    exec-cron: 0 0 0/6 * * ?

shiro:
  # token timeout, unit second
  jwtTimeOut: 86400
  # backend authentication-free resources url
  anonUrl:

ldap:
  # Is ldap enabled? If so, please modify the urls
  enable: false
  ## AD server IP, default port 389
  urls: ldap://99.99.99.99:389
  ## Login Account
  base-dn: dc=streampark,dc=com
  username: cn=Manager,dc=streampark,dc=com
  password: streampark
  user:
    identity-attribute: uid
    email-attribute: mail
```

So, I propose that we improve these configurations by providing users with only one configuration file (only one). The configurations in this file should be completely user-focused, clear, and core configurations.

e.g:
```yaml
# logging level
logging.level.root: info
# server port
server.port: 10000
# The user's login session has a validity period. If it exceeds this time,
# the user will be automatically logged out.
# unit: s|m|h|d, s: second, m: minute, h: hour, d: day
server.session.ttl: 2h  # unit [s|m|h|d], e.g: 24h, 2d...

# see: https://github.com/undertow-io/undertow/blob/master/core/src/main/java/io/undertow/Undertow.java
server.undertow.direct-buffers: true
server.undertow.buffer-size: 1024
server.undertow.threads.io: 16
server.undertow.threads.worker: 256

# system database, default h2, mysql|pgsql|h2
datasource.dialect: h2  # h2, pgsql, mysql
# ------- if datasource.dialect is mysql or pgsql, it is necessary to set -------
datasource.username:
datasource.password:
# mysql jdbc url example:
# datasource.url: jdbc:mysql://localhost:3306/streampark?useUnicode=true&characterEncoding=UTF-8&useJDBCCompliantTimezoneShift=true&useLegacyDatetimeCode=false&serverTimezone=GMT%2B8
# postgresql jdbc url example:
# datasource.url: jdbc:postgresql://localhost:5432/streampark?stringtype=unspecified
datasource.url:
# --------------------------------------------------------------------------------

# Directory for storing locally built projects
streampark.workspace.local: /tmp/streampark
# The root hdfs path of the jars, same as yarn.provided.lib.dirs for flink on yarn-application
# and same as --jars for spark on yarn
streampark.workspace.remote: hdfs:///streampark/
# hadoop yarn proxy path, e.g: knox process address https://streampark.com:8443/proxy/yarn
streampark.proxy.yarn-url:
# lark proxy address, default https://open.feishu.cn
streampark.proxy.lark-url:
# flink on yarn or spark on yarn: to monitor job status from yarn,
# it is necessary to set hadoop.http.authentication.type
streampark.yarn.http-auth: simple  # default simple, or kerberos
# flink on yarn or spark on yarn, it is necessary to set
streampark.hadoop-user-name: hdfs
# flink on k8s ingress setting. If an ingress controller is specified in the configuration,
# the ingress class kubernetes.io/ingress.class must be specified when creating the ingress,
# since there are often multiple ingress controllers in a production environment.
streampark.flink-k8s.ingress.class: nginx

# sign in to streampark with ldap.
ldap.enable: false  # ldap enabled
ldap.urls: ldap://99.99.99.99:389  # AD server IP, default port 389
ldap.base-dn: dc=streampark,dc=com  # Login Account
ldap.username: cn=Manager,dc=streampark,dc=com
ldap.password: streampark
ldap.user.identity-attribute: uid
ldap.user.email-attribute: mail

# flink on yarn or spark on yarn: when the hadoop cluster enables kerberos authentication,
# it is necessary to set up the Kerberos authentication related parameters.
security.kerberos.login.enable: false
security.kerberos.login.debug: false
# kerberos principal path
security.kerberos.login.principal:
security.kerberos.login.krb5:
security.kerberos.login.keytab:
security.kerberos.ttl: 2h  # unit [s|m|h|d]
```

This is the issue: https://github.com/apache/incubator-streampark/issues/3641

What's your opinion on this? Welcome to discuss.

Best,
Huajie Wang
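The proposal introduces unit-suffixed durations such as `server.session.ttl: 2h` and `security.kerberos.ttl: 2h` (units s|m|h|d). As a minimal sketch of how such values could be parsed, assuming nothing about StreamPark's actual implementation (the `TtlParser` class and `parse` method below are invented for illustration):

```java
import java.time.Duration;

/** Hypothetical helper: parse unit-suffixed TTL strings like "2h" or "30d". */
public class TtlParser {

    public static Duration parse(String ttl) {
        String trimmed = ttl.trim();
        if (trimmed.length() < 2) {
            throw new IllegalArgumentException("invalid ttl: " + ttl);
        }
        // Last character is the unit, the rest is the numeric value.
        char unit = trimmed.charAt(trimmed.length() - 1);
        long value = Long.parseLong(trimmed.substring(0, trimmed.length() - 1));
        switch (unit) {
            case 's': return Duration.ofSeconds(value);
            case 'm': return Duration.ofMinutes(value);
            case 'h': return Duration.ofHours(value);
            case 'd': return Duration.ofDays(value);
            default:
                throw new IllegalArgumentException("unknown ttl unit: " + unit);
        }
    }

    public static void main(String[] args) {
        // 2 hours = 7200 seconds
        System.out.println(TtlParser.parse("2h").toSeconds());
    }
}
```

Rejecting unknown suffixes up front means a typo like `2hr` fails at startup rather than silently falling back to a default session length.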
