Modified: kylin/site/docs23/release_notes.html
URL: http://svn.apache.org/viewvc/kylin/site/docs23/release_notes.html?rev=1836252&r1=1836251&r2=1836252&view=diff
==============================================================================
--- kylin/site/docs23/release_notes.html (original)
+++ kylin/site/docs23/release_notes.html Thu Jul 19 07:35:15 2018
@@ -5404,428 +5404,400 @@ there are source code package, binary pa
 <p>or send to Apache Kylin mailing list:</p>
 <ul>
- <li>User relative: <a href="mailto:u...@kylin.apache.org">u...@kylin.apache.org</a></li>
- <li>Development relative: <a href="mailto:d...@kylin.apache.org">d...@kylin.apache.org</a></li>
+ <li>User relative: <a href="mailto:user@kylin.apache.org">user@kylin.apache.org</a></li>
+ <li>Development relative: <a href="mailto:dev@kylin.apache.org">dev@kylin.apache.org</a></li>
 </ul>
 <h2 id="v232---2018-07-08">v2.3.2 - 2018-07-08</h2>
 <p><em>Tag:</em> <a href="https://github.com/apache/kylin/tree/kylin-2.3.2">kylin-2.3.2</a><br />
 This is a bug fix release after 2.3.1, with 12 bug fixes and enhancement. Check <a href="/docs23/howto/howto_upgrade.html">How to upgrade</a>.</p>
-<p><strong>Improvement</strong></p>
-<ul>
- <li>[KYLIN-3345] - Use Apache Parent POM 19</li>
- <li>[KYLIN-3372] - Upgrade jackson-databind version due to security concerns</li>
- <li>[KYLIN-3415] - Remove “external” module</li>
-</ul>
-
-<p><strong>Bug</strong></p>
-<ul>
- <li>[KYLIN-3115] - Incompatible RowKeySplitter initialize between build and merge job</li>
- <li>[KYLIN-3336] - java.lang.NoSuchMethodException: org.apache.kylin.tool.HBaseUsageExtractor.execute([Ljava.lang.String;)</li>
- <li>[KYLIN-3348] - “missing LastBuildJobID” error when building new cube segment</li>
- <li>[KYLIN-3352] - Segment pruning bug, e.g.
date_col > “max_date+1”</li>
- <li>[KYLIN-3363] - Wrong partition condition appended in JDBC Source</li>
- <li>[KYLIN-3388] - Data may become not correct if mappers fail during the redistribute step, “distribute by rand()”</li>
- <li>[KYLIN-3400] - WipeCache and createCubeDesc causes deadlock</li>
- <li>[KYLIN-3401] - The current using zip compress tool has an arbitrary file write vulnerability</li>
- <li>[KYLIN-3404] - Last optimized time detail was not showing after cube optimization</li>
-</ul>
+<p><strong>Improvement</strong><br />
+* [KYLIN-3345] - Use Apache Parent POM 19<br />
+* [KYLIN-3372] - Upgrade jackson-databind version due to security concerns<br />
+* [KYLIN-3415] - Remove “external” module</p>
+
+<p><strong>Bug</strong><br />
+* [KYLIN-3115] - Incompatible RowKeySplitter initialize between build and merge job<br />
+* [KYLIN-3336] - java.lang.NoSuchMethodException: org.apache.kylin.tool.HBaseUsageExtractor.execute([Ljava.lang.String;)<br />
+* [KYLIN-3348] - “missing LastBuildJobID” error when building new cube segment<br />
+* [KYLIN-3352] - Segment pruning bug, e.g. date_col > “max_date+1”<br />
+* [KYLIN-3363] - Wrong partition condition appended in JDBC Source<br />
+* [KYLIN-3388] - Data may become not correct if mappers fail during the redistribute step, “distribute by rand()”<br />
+* [KYLIN-3400] - WipeCache and createCubeDesc causes deadlock<br />
+* [KYLIN-3401] - The current using zip compress tool has an arbitrary file write vulnerability<br />
+* [KYLIN-3404] - Last optimized time detail was not showing after cube optimization</p>
 <h2 id="v231---2018-03-28">v2.3.1 - 2018-03-28</h2>
 <p><em>Tag:</em> <a href="https://github.com/apache/kylin/tree/kylin-2.3.1">kylin-2.3.1</a><br />
 This is a bug fix release after 2.3.0, with 12 bug fixes and enhancement.
Check <a href="/docs23/howto/howto_upgrade.html">How to upgrade</a>.</p>
-<p><strong>Improvement</strong></p>
-<ul>
- <li>[KYLIN-3233] - CacheController can not handle if cacheKey has “/”</li>
- <li>[KYLIN-3278] - Kylin should not distribute hive table by random at Step1</li>
- <li>[KYLIN-3300] - Upgrade jackson-databind to 2.6.7.1 with security issue fixed</li>
- <li>[KYLIN-3301] - Upgrade opensaml to 2.6.6 with security issue fixed</li>
-</ul>
-
-<p><strong>Bug</strong></p>
-<ul>
- <li>[KYLIN-3270] - Fix the blocking issue in Cube optimizing job</li>
- <li>[KYLIN-3276] - Fix the query cache bug with dynamic parameter</li>
- <li>[KYLIN-3288] - “Sqoop To Flat Hive Table” step should specify “mapreduce.queue.name”</li>
- <li>[KYLIN-3306] - Fix the rarely happened unit test exception of generic algorithm</li>
- <li>[KYLIN-3287] - When a shard by column is in dict encoding, dict building error.</li>
- <li>[KYLIN-3280] - The delete button should not be enabled without any segment in cube segment delete confirm dialog</li>
- <li>[KYLIN-3119] - A few bugs in the function “massageSql” of “QueryUtil.java”</li>
- <li>[KYLIN-3236] - The function “reGenerateAdvancedDict()” has an error logical judgment, which will cause an exception when you edit the cube.</li>
-</ul>
+<p><strong>Improvement</strong><br />
+* [KYLIN-3233] - CacheController can not handle if cacheKey has “/”<br />
+* [KYLIN-3278] - Kylin should not distribute hive table by random at Step1<br />
+* [KYLIN-3300] - Upgrade jackson-databind to 2.6.7.1 with security issue fixed<br />
+* [KYLIN-3301] - Upgrade opensaml to 2.6.6 with security issue fixed</p>
+
+<p><strong>Bug</strong><br />
+* [KYLIN-3270] - Fix the blocking issue in Cube optimizing job<br />
+* [KYLIN-3276] - Fix the query cache bug with dynamic parameter<br />
+* [KYLIN-3288] - “Sqoop To Flat Hive Table” step should specify “mapreduce.queue.name”<br />
+* [KYLIN-3306] - Fix the rarely happened unit test exception of generic algorithm<br />
+*
[KYLIN-3287] - When a shard by column is in dict encoding, dict building error.<br />
+* [KYLIN-3280] - The delete button should not be enabled without any segment in cube segment delete confirm dialog<br />
+* [KYLIN-3119] - A few bugs in the function “massageSql” of “QueryUtil.java”<br />
+* [KYLIN-3236] - The function “reGenerateAdvancedDict()” has an error logical judgment, which will cause an exception when you edit the cube.</p>
 <h2 id="v230---2018-03-04">v2.3.0 - 2018-03-04</h2>
 <p><em>Tag:</em> <a href="https://github.com/apache/kylin/tree/kylin-2.3.0">kylin-2.3.0</a><br />
 This is a major release after 2.2, with more than 250 bug fixes and enhancement. Check <a href="/docs23/howto/howto_upgrade.html">How to upgrade</a>.</p>
-<p><strong>New Feature</strong></p>
-<ul>
- <li>[KYLIN-3125] - Support SparkSql in Cube building step “Create Intermediate Flat Hive Table”</li>
- <li>[KYLIN-3052] - Support Redshift as data source</li>
- <li>[KYLIN-3044] - Support SQL Server as data source</li>
- <li>[KYLIN-2999] - One click migrate cube in web</li>
- <li>[KYLIN-2960] - Support user/group and role authentication for LDAP</li>
- <li>[KYLIN-2902] - Introduce project-level concurrent query number control</li>
- <li>[KYLIN-2776] - New metric framework based on dropwizard</li>
- <li>[KYLIN-2727] - Introduce cube planner able to select cost-effective cuboids to be built by cost-based algorithms</li>
- <li>[KYLIN-2726] - Introduce a dashboard for showing kylin service related metrics, like query count, query latency, job count, etc</li>
- <li>[KYLIN-1892] - Support volatile range for segments auto merge</li>
-</ul>
-
-<p><strong>Improvement</strong></p>
-<ul>
- <li>[KYLIN-3265] - Add “jobSearchMode” as a condition to “/kylin/api/jobs” API</li>
- <li>[KYLIN-3245] - Searching cube support fuzzy search</li>
- <li>[KYLIN-3243] - Optimize the code and keep the code consistent in the access.html</li>
- <li>[KYLIN-3239] - Refactor the ACL code about “checkPermission” and
“hasPermission”</li>
- <li>[KYLIN-3215] - Remove “drop” option when job status is stopped and error</li>
- <li>[KYLIN-3214] - Initialize ExternalAclProvider when starting kylin</li>
- <li>[KYLIN-3209] - Optimize job partial statistics path be consistent with existing one</li>
- <li>[KYLIN-3196] - Replace StringUtils.containsOnly with Regex</li>
- <li>[KYLIN-3194] - Tolerate broken job metadata caused by executable ClassNotFoundException</li>
- <li>[KYLIN-3193] - No model clone across projects</li>
- <li>[KYLIN-3182] - Update Kylin help menu links</li>
- <li>[KYLIN-3181] - The submit button status of refreshing cube is not suitable when the start time is equal or more than the end time.</li>
- <li>[KYLIN-3162] - Fix alignment problem of “Save Query” pop-up box</li>
- <li>[KYLIN-3159] - Remove unnecessary cube access request</li>
- <li>[KYLIN-3158] - Metadata broadcast should only retry failed node</li>
- <li>[KYLIN-3157] - Enhance query timeout to entire query life cycle</li>
- <li>[KYLIN-3151] - Enable “Query History” to show items filtered by different projects</li>
- <li>[KYLIN-3150] - Support different compression in PercentileCounter measure</li>
- <li>[KYLIN-3145] - Support Kafka JSON message whose property name includes “_”</li>
- <li>[KYLIN-3144] - Adopt Collections.emptyList() for empty list values</li>
- <li>[KYLIN-3129] - Fix the joda library conflicts during Kylin start on EMR 5.8+</li>
- <li>[KYLIN-3128] - Configs for allowing export query results for admin/nonadmin user</li>
- <li>[KYLIN-3127] - In the Insights tab, results section, make the list of Cubes hit by the query either scrollable or multiline</li>
- <li>[KYLIN-3124] - Support horizontal scroll bar in “Insight”</li>
- <li>[KYLIN-3117] - Hide project config in cube level</li>
- <li>[KYLIN-3114] - Enable kylin.web.query-timeout for web query request</li>
- <li>[KYLIN-3113] - Editing Measure supports fuzzy search in web</li>
- <li>[KYLIN-3108] - Change IT embedded Kafka broker path to
/kylin/streaming_config/UUID</li>
- <li>[KYLIN-3105] - Interface Scheduler’s stop method should be removed</li>
- <li>[KYLIN-3100] - Building empty partitioned cube with rest api supports partition_start_date</li>
- <li>[KYLIN-3098] - Enable kylin.query.max-return-rows to limit the maximum row count returned to user</li>
- <li>[KYLIN-3092] - Synchronize read/write operations on Managers</li>
- <li>[KYLIN-3090] - Refactor to consolidate all caches and managers under KylinConfig</li>
- <li>[KYLIN-3088] - Spell Error of isCubeMatch</li>
- <li>[KYLIN-3086] - Ignore the intermediate tables when loading Hive source tables</li>
- <li>[KYLIN-3079] - Use Docker for document build environment</li>
- <li>[KYLIN-3078] - Optimize the estimated size of percentile measure</li>
- <li>[KYLIN-3076] - Make kylin remember the choices we have made in the “Monitor>Jobs” page</li>
- <li>[KYLIN-3074] - Change cube access to project access in ExternalAclProvider.java</li>
- <li>[KYLIN-3073] - Automatically refresh the “Saved Queries” tab page when new query saved.</li>
- <li>[KYLIN-3070] - Enable “kylin.source.hive.flat-table-storage-format” for flat table storage format</li>
- <li>[KYLIN-3067] - Provide web interface for dimension capping feature</li>
- <li>[KYLIN-3065] - Add “First” and “Last” button in case “Query History” is too much</li>
- <li>[KYLIN-3064] - Turn off Yarn timeline-service when submit mr job</li>
- <li>[KYLIN-3048] - Give warning when merge with holes, but allow user to force proceed at the same time</li>
- <li>[KYLIN-3043] - Don’t need create materialized view for lookup tables without snapshot</li>
- <li>[KYLIN-3039] - Unclosed hbaseAdmin in ITAclTableMigrationToolTest</li>
- <li>[KYLIN-3036] - Allow complex column type when loading source table</li>
- <li>[KYLIN-3024] - Input Validator for “Auto Merge Thresholds” text box</li>
- <li>[KYLIN-3019] - The pop-up window of “Calculate Cardinality” and “Load Hive Table” should have the same hint</li>
- <li>[KYLIN-3009] -
Rest API to get Cube join SQL</li>
- <li>[KYLIN-3008] - Introduce “submit-patch.py”</li>
- <li>[KYLIN-3006] - Upgrade Spark to 2.1.2</li>
- <li>[KYLIN-2997] - Allow change engineType even if there are segments in cube</li>
- <li>[KYLIN-2996] - Show DeployCoprocessorCLI Log failed tables info</li>
- <li>[KYLIN-2993] - Add special mr config for base cuboid step</li>
- <li>[KYLIN-2992] - Avoid OOM in CubeHFileJob.Reducer</li>
- <li>[KYLIN-2990] - Add warning window of exist model names for other project selected</li>
- <li>[KYLIN-2987] - Add “auto.purge=true” when creating intermediate hive table or redistribute a hive table</li>
- <li>[KYLIN-2985] - Cache temp json file created by each Calcite Connection</li>
- <li>[KYLIN-2984] - Only allow delete FINISHED or DISCARDED job</li>
- <li>[KYLIN-2982] - Avoid upgrade column in OLAPTable</li>
- <li>[KYLIN-2981] - Typo in Cube refresh setting page.</li>
- <li>[KYLIN-2980] - Remove getKey/Value setKey/Value from Kylin’s Pair.</li>
- <li>[KYLIN-2975] - Unclosed Statement in test</li>
- <li>[KYLIN-2966] - push down jdbc column type id mapping</li>
- <li>[KYLIN-2965] - Keep the same cost calculation logic between RealizationChooser and CubeInstance</li>
- <li>[KYLIN-2947] - Changed the Pop-up box when no project selected</li>
- <li>[KYLIN-2941] - Configuration setting for SSO</li>
- <li>[KYLIN-2940] - List job restful throw NPE when time filter not set</li>
- <li>[KYLIN-2935] - Improve the way to deploy coprocessor</li>
- <li>[KYLIN-2928] - PUSH DOWN query cannot use order by function</li>
- <li>[KYLIN-2921] - Refactor DataModelDesc</li>
- <li>[KYLIN-2918] - Table ACL needs GUI</li>
- <li>[KYLIN-2913] - Enable job retry for configurable exceptions</li>
- <li>[KYLIN-2912] - Remove “hfile” folder after bulk load to HBase</li>
- <li>[KYLIN-2909] - Refine Email Template for notification by freemarker</li>
- <li>[KYLIN-2908] - Add one option for migration tool to indicate whether to migrate segment data</li>
- <li>[KYLIN-2905] -
Refine the process of submitting a job</li>
- <li>[KYLIN-2884] - Add delete segment function for portal</li>
- <li>[KYLIN-2881] - Improve hbase coprocessor exception handling at kylin server side</li>
- <li>[KYLIN-2875] - Cube e-mail notification Validation</li>
- <li>[KYLIN-2867] - split large fuzzy Key set</li>
- <li>[KYLIN-2866] - Enlarge the reducer number for hyperloglog statistics calculation at step FactDistinctColumnsJob</li>
- <li>[KYLIN-2847] - Avoid doing useless work by checking query deadline</li>
- <li>[KYLIN-2846] - Add a config of hbase namespace for cube storage</li>
- <li>[KYLIN-2809] - Support operator “+” as string concat operator</li>
- <li>[KYLIN-2801] - Make default precision and scale in DataType (for hive) configurable</li>
- <li>[KYLIN-2764] - Build the dict for UHC column with MR</li>
- <li>[KYLIN-2736] - Use multiple threads to calculate HyperLogLogPlusCounter in FactDistinctColumnsMapper</li>
- <li>[KYLIN-2672] - Only clean necessary cache for CubeMigrationCLI</li>
- <li>[KYLIN-2656] - Support Zookeeper ACL</li>
- <li>[KYLIN-2649] - Tableau could send “select *” on a big table</li>
- <li>[KYLIN-2645] - Upgrade Kafka version to 0.11.0.1</li>
- <li>[KYLIN-2556] - Switch Findbugs to Spotbugs</li>
- <li>[KYLIN-2363] - Prune cuboids by capping number of dimensions</li>
- <li>[KYLIN-1925] - Do not allow cross project clone for cube</li>
- <li>[KYLIN-1872] - Make query visible and interruptible, improve server’s stablility</li>
-</ul>
-
-<p><strong>Bug</strong></p>
-<ul>
- <li>[KYLIN-3268] - Tomcat Security Vulnerability Alert.
The version of the tomcat for kylin should upgrade to 7.0.85.</li>
- <li>[KYLIN-3263] - AbstractExecutable’s retry has problem</li>
- <li>[KYLIN-3247] - REST API “GET /api/cubes/{cubeName}/segs/{segmentName}/sql” should return a cube segment sql</li>
- <li>[KYLIN-3242] - export result should use alias too</li>
- <li>[KYLIN-3241] - When refresh on “Add Cube Page”, a blank page will appear.</li>
- <li>[KYLIN-3228] - Should remove the related segment when deleting a job</li>
- <li>[KYLIN-3227] - Automatically remove the blank at the end of lines in properties files</li>
- <li>[KYLIN-3226] - When user logs in with only query permission, “N/A” is displayed in the cube’s action list.</li>
- <li>[KYLIN-3224] - data can’t show when use kylin pushdown model</li>
- <li>[KYLIN-3223] - Query for the list of hybrid cubes results in NPE</li>
- <li>[KYLIN-3222] - The function of editing “Advanced Dictionaries” in cube is unavailable.</li>
- <li>[KYLIN-3219] - Fix NPE when updating metrics during Spark CubingJob</li>
- <li>[KYLIN-3216] - Remove the hard-code of spark-history path in “check-env.sh”</li>
- <li>[KYLIN-3213] - Kylin help has duplicate items</li>
- <li>[KYLIN-3211] - Class IntegerDimEnc shuould give more exception information when the length is exceed the max or less than the min</li>
- <li>[KYLIN-3210] - The project shows “_null” in result page.</li>
- <li>[KYLIN-3205] - Allow one column is used for both dimension and precisely count distinct measure</li>
- <li>[KYLIN-3204] - Potentially unclosed resources in JdbcExplorer#evalQueryMetadata</li>
- <li>[KYLIN-3199] - The login dialog should be closed when ldap user with no permission login correctly</li>
- <li>[KYLIN-3190] - Fix wrong parameter in revoke access API</li>
- <li>[KYLIN-3184] - Fix “_null” project on the query page</li>
- <li>[KYLIN-3183] - Fix the bug of the “Remove” button in “Query History”</li>
- <li>[KYLIN-3178] - Delete table acl failed will cause the wabpage awalys shows “Please wait…”</li>
-
<li>[KYLIN-3177] - Merged Streaming cube segment has no start/end time</li>
- <li>[KYLIN-3175] - Streaming segment lost TSRange after merge</li>
- <li>[KYLIN-3173] - DefaultScheduler shutdown didn’t reset field initialized.</li>
- <li>[KYLIN-3172] - No such file or directory error with CreateLookupHiveViewMaterializationStep</li>
- <li>[KYLIN-3167] - Datatype lost precision when using beeline</li>
- <li>[KYLIN-3165] - Fix the IllegalArgumentException during segments auto merge</li>
- <li>[KYLIN-3164] - HBase connection must be closed when clearing connection pool</li>
- <li>[KYLIN-3143] - Wrong use of Preconditions.checkNotNull() in ManagedUser#removeAuthoritie</li>
- <li>[KYLIN-3139] - Failure in map-reduce job due to undefined hdp.version variable when using HDP stack and remote HBase cluster</li>
- <li>[KYLIN-3136] - Endless status while subtask happens to be the illegal RUNNING</li>
- <li>[KYLIN-3135] - Fix regular expression bug in SQL comments</li>
- <li>[KYLIN-3131] - After refresh the page,the cubes can’t sort by “create_time”</li>
- <li>[KYLIN-3130] - If we add new cube then refresh the page,the page is blank</li>
- <li>[KYLIN-3116] - Fix cardinality caculate checkbox issue when loading tables</li>
- <li>[KYLIN-3112] - The job “Pause” operation has logic bug in the kylin server.</li>
- <li>[KYLIN-3111] - Close of HBaseAdmin instance should be placed in finally block</li>
- <li>[KYLIN-3110] - The dashboard page has some display problems.</li>
- <li>[KYLIN-3106] - DefaultScheduler.shutdown should use ExecutorService.shutdownNow instead of ExecutorService.shutdown</li>
- <li>[KYLIN-3104] - When the user log out from “Monitor” page, an alert dialog will pop up warning “Failed to load query.”</li>
- <li>[KYLIN-3102] - Solve the problems for incomplete display of Hive Table tree.</li>
- <li>[KYLIN-3101] - The “search” icon will separate from the “Filter” textbox when click the “showSteps” button of a job in the jobList</li>
- <li>[KYLIN-3097] - A few spell error
in partials directory</li>
- <li>[KYLIN-3087] - Fix the DistributedLock release bug in GlobalDictionaryBuilder</li>
- <li>[KYLIN-3085] - CubeManager.updateCube() must not update the cached CubeInstance</li>
- <li>[KYLIN-3084] - File not found Exception when processing union-all in TEZ mode</li>
- <li>[KYLIN-3083] - potential overflow in CubeHBaseRPC#getCoprocessorTimeoutMillis</li>
- <li>[KYLIN-3082] - Close of GTBuilder should be placed in finally block in InMemCubeBuilder</li>
- <li>[KYLIN-3081] - Ineffective null check in CubeController#cuboidsExport</li>
- <li>[KYLIN-3077] - EDW.TEST_SELLER_TYPE_DIM_TABLE is not being created by the integration test, but it’s presence in the Hive is expected</li>
- <li>[KYLIN-3069] - Add proper time zone support to the WebUI instead of GMT/PST kludge</li>
- <li>[KYLIN-3063] - load-hive-conf.sh should not get the commented configuration item</li>
- <li>[KYLIN-3061] - When we cancel the Topic modification for “Kafka Setting” of streaming table, the “Cancel” operation will make a mistake.</li>
- <li>[KYLIN-3060] - The logical processing of creating or updating streaming table has a bug in server, which will cause a NullPointerException.</li>
- <li>[KYLIN-3058] - We should limit the integer type ID and Port for “Kafka Setting” in “Streaming Cluster” page</li>
- <li>[KYLIN-3056] - Fix “Cannot find segment null” bug when click “SQL” in the cube view page</li>
- <li>[KYLIN-3055] - Fix NullPointerException for intersect_count</li>
- <li>[KYLIN-3054] - The drop-down menu in the grid column of query results missing a little bit.</li>
- <li>[KYLIN-3053] - When aggregation group verification failed, the error message about aggregation group number does not match with the actual on the Advanced Setting page</li>
- <li>[KYLIN-3049] - Filter the invalid zero value of “Auto Merge Thresholds” parameter when you create or upate a cube.</li>
- <li>[KYLIN-3047] - Wrong column type when sync hive table via beeline</li>
- <li>[KYLIN-3042] - In query
results page, the results data table should resize when click “fullScreen” button</li>
- <li>[KYLIN-3040] - Refresh a non-partitioned cube changes the segment name to “19700101000000_2922789940817071255”</li>
- <li>[KYLIN-3038] - cannot support sum of type-converted column SQL</li>
- <li>[KYLIN-3034] - In the models tree, the “Edit(JSON)” option is missing partly.</li>
- <li>[KYLIN-3032] - Cube size shows 0 but actually it isn’t empty</li>
- <li>[KYLIN-3031] - KeywordDefaultDirtyHack should ignore case of default like other database does</li>
- <li>[KYLIN-3030] - In the cubes table, the options of last column action are missing partly.</li>
- <li>[KYLIN-3029] - The warning window of existing cube name does not work</li>
- <li>[KYLIN-3028] - Build cube error when set S3 as working-dir</li>
- <li>[KYLIN-3026] - Can not see full cube names on insight page</li>
- <li>[KYLIN-3020] - Improve org.apache.hadoop.util.ToolRunner to be threadsafe</li>
- <li>[KYLIN-3017] - Footer covers the selection box and some options can not be selected</li>
- <li>[KYLIN-3016] - StorageCleanup job doesn’t clean up all the legacy fiels in a in Read/Write seperation environment</li>
- <li>[KYLIN-3004] - Update validation when deleting segment</li>
- <li>[KYLIN-3001] - Fix the wrong Cache key issue</li>
- <li>[KYLIN-2995] - Set SparkContext.hadoopConfiguration to HadoopUtil in Spark Cubing</li>
- <li>[KYLIN-2994] - Handle NPE when load dict in DictionaryManager</li>
- <li>[KYLIN-2991] - Query hit NumberFormatException if partitionDateFormat is not yyyy-MM-dd</li>
- <li>[KYLIN-2989] - Close of BufferedWriter should be placed in finally block in SCCreator</li>
- <li>[KYLIN-2974] - zero joint group can lead to query error</li>
- <li>[KYLIN-2971] - Fix the wrong “Realization Names” in logQuery when hit cache</li>
- <li>[KYLIN-2969] - Fix the wrong NumberBytesCodec cache in Number2BytesConverter</li>
- <li>[KYLIN-2968] - misspelled word in table_load.html</li>
- <li>[KYLIN-2967] - Add the
dependency check when deleting a project</li>
- <li>[KYLIN-2962] - drop error job not delete segment</li>
- <li>[KYLIN-2959] - SAML logout issue</li>
- <li>[KYLIN-2956] - building trie dictionary blocked on value of length over 4095</li>
- <li>[KYLIN-2953] - List readable project not correct if add limit and offset</li>
- <li>[KYLIN-2939] - Get config properties not correct in UI</li>
- <li>[KYLIN-2933] - Fix compilation against the Kafka 1.0.0 release</li>
- <li>[KYLIN-2930] - Selecting one column in union causes compile error</li>
- <li>[KYLIN-2929] - speed up Dump file performance</li>
- <li>[KYLIN-2922] - Query fails when a column is used as dimension and sum(column) at the same time</li>
- <li>[KYLIN-2917] - Dup alias on OLAPTableScan</li>
- <li>[KYLIN-2907] - Check if a number is a positive integer</li>
- <li>[KYLIN-2901] - Update correct cardinality for empty table</li>
- <li>[KYLIN-2887] - Subquery columns not exported in OLAPContext allColumns</li>
- <li>[KYLIN-2876] - Ineffective check in ExternalAclProvider</li>
- <li>[KYLIN-2874] - Ineffective check in CubeDesc#getInitialCuboidScheduler</li>
- <li>[KYLIN-2849] - duplicate segment, cannot be deleted and data cannot be refreshed and merged</li>
- <li>[KYLIN-2837] - Ineffective call to toUpperCase() in MetadataManager</li>
- <li>[KYLIN-2836] - Lack of synchronization in CodahaleMetrics#close</li>
- <li>[KYLIN-2835] - Unclosed resources in JdbcExplorer</li>
- <li>[KYLIN-2794] - MultipleDictionaryValueEnumerator should output values in sorted order</li>
- <li>[KYLIN-2756] - Let “LIMIT” be optional in “Inspect” page</li>
- <li>[KYLIN-2470] - cube build failed when 0 bytes input for non-partition fact table</li>
- <li>[KYLIN-1664] - Harden security check for “/kylin/api/admin/config” API</li>
-</ul>
-
-<p><strong>Task</strong></p>
-<ul>
- <li>[KYLIN-3207] - Blog for Kylin Superset Integration</li>
- <li>[KYLIN-3200] - Enable SonarCloud for Code Analysis</li>
- <li>[KYLIN-3198] - More Chinese Howto
Documents</li>
- <li>[KYLIN-3195] - Kylin v2.3.0 Release</li>
- <li>[KYLIN-3191] - Remove the deprecated configuration item kylin.security.acl.default-role</li>
- <li>[KYLIN-3189] - Documents for kylin python client</li>
- <li>[KYLIN-3080] - Kylin Qlik Sense Integration Documentation</li>
- <li>[KYLIN-3068] - Rename deprecated parameter for HDFS block size in HiveColumnCardinalityJob</li>
- <li>[KYLIN-3062] - Hide RAW measure</li>
- <li>[KYLIN-3010] - Remove v1 Spark engine code</li>
- <li>[KYLIN-2843] - Upgrade nvd3 version</li>
- <li>[KYLIN-2797] - Remove MR engine V1</li>
- <li>[KYLIN-2796] - Remove the legacy “statisticsenabled” codes in FactDistinctColumnsJob</li>
-</ul>
-
-<p><strong>Sub-Task</strong></p>
-<ul>
- <li>[KYLIN-3235] - add null check for SQL</li>
- <li>[KYLIN-3202] - Doc directory for 2.3</li>
- <li>[KYLIN-3155] - Create a document for how to use dashboard</li>
- <li>[KYLIN-3154] - Create a document for cube planner</li>
- <li>[KYLIN-3153] - Create a document for system cube creation</li>
- <li>[KYLIN-3018] - Change maxLevel for layered cubing</li>
- <li>[KYLIN-2946] - Introduce a tool for batch incremental building of system cubes</li>
- <li>[KYLIN-2934] - Provide user guide for KYLIN-2656(Support Zookeeper ACL)</li>
- <li>[KYLIN-2822] - Introduce sunburst chart to show cuboid tree</li>
- <li>[KYLIN-2746] - Separate filter row count & aggregated row count for metrics collection returned by coprocessor</li>
- <li>[KYLIN-2735] - Introduce an option to make job scheduler consider job priority</li>
- <li>[KYLIN-2734] - Introduce hot cuboids export & import</li>
- <li>[KYLIN-2733] - Introduce optimize job for adjusting cuboid set</li>
- <li>[KYLIN-2732] - Introduce base cuboid as a new input for cubing job</li>
- <li>[KYLIN-2731] - Introduce checkpoint executable</li>
- <li>[KYLIN-2725] - Introduce a tool for creating system cubes relating to query & job metrics</li>
- <li>[KYLIN-2723] - Introduce metrics collector for query & job metrics</li>
-
<li>[KYLIN-2722] - Introduce a new measure, called active reservoir, for actively pushing metrics to reporters</li>
-</ul>
+<p><strong>New Feature</strong><br />
+* [KYLIN-3125] - Support SparkSql in Cube building step “Create Intermediate Flat Hive Table”<br />
+* [KYLIN-3052] - Support Redshift as data source<br />
+* [KYLIN-3044] - Support SQL Server as data source<br />
+* [KYLIN-2999] - One click migrate cube in web<br />
+* [KYLIN-2960] - Support user/group and role authentication for LDAP<br />
+* [KYLIN-2902] - Introduce project-level concurrent query number control<br />
+* [KYLIN-2776] - New metric framework based on dropwizard<br />
+* [KYLIN-2727] - Introduce cube planner able to select cost-effective cuboids to be built by cost-based algorithms<br />
+* [KYLIN-2726] - Introduce a dashboard for showing kylin service related metrics, like query count, query latency, job count, etc<br />
+* [KYLIN-1892] - Support volatile range for segments auto merge</p>
+
+<p><strong>Improvement</strong><br />
+* [KYLIN-3265] - Add “jobSearchMode” as a condition to “/kylin/api/jobs” API<br />
+* [KYLIN-3245] - Searching cube support fuzzy search<br />
+* [KYLIN-3243] - Optimize the code and keep the code consistent in the access.html<br />
+* [KYLIN-3239] - Refactor the ACL code about “checkPermission” and “hasPermission”<br />
+* [KYLIN-3215] - Remove “drop” option when job status is stopped and error<br />
+* [KYLIN-3214] - Initialize ExternalAclProvider when starting kylin<br />
+* [KYLIN-3209] - Optimize job partial statistics path be consistent with existing one<br />
+* [KYLIN-3196] - Replace StringUtils.containsOnly with Regex<br />
+* [KYLIN-3194] - Tolerate broken job metadata caused by executable ClassNotFoundException<br />
+* [KYLIN-3193] - No model clone across projects<br />
+* [KYLIN-3182] - Update Kylin help menu links<br />
+* [KYLIN-3181] - The submit button status of refreshing cube is not suitable when the start time is equal or more than the end
time.<br />
+* [KYLIN-3162] - Fix alignment problem of “Save Query” pop-up box<br />
+* [KYLIN-3159] - Remove unnecessary cube access request<br />
+* [KYLIN-3158] - Metadata broadcast should only retry failed node<br />
+* [KYLIN-3157] - Enhance query timeout to entire query life cycle<br />
+* [KYLIN-3151] - Enable “Query History” to show items filtered by different projects<br />
+* [KYLIN-3150] - Support different compression in PercentileCounter measure<br />
+* [KYLIN-3145] - Support Kafka JSON message whose property name includes “_”<br />
+* [KYLIN-3144] - Adopt Collections.emptyList() for empty list values<br />
+* [KYLIN-3129] - Fix the joda library conflicts during Kylin start on EMR 5.8+<br />
+* [KYLIN-3128] - Configs for allowing export query results for admin/nonadmin user<br />
+* [KYLIN-3127] - In the Insights tab, results section, make the list of Cubes hit by the query either scrollable or multiline<br />
+* [KYLIN-3124] - Support horizontal scroll bar in “Insight”<br />
+* [KYLIN-3117] - Hide project config in cube level<br />
+* [KYLIN-3114] - Enable kylin.web.query-timeout for web query request<br />
+* [KYLIN-3113] - Editing Measure supports fuzzy search in web<br />
+* [KYLIN-3108] - Change IT embedded Kafka broker path to /kylin/streaming_config/UUID<br />
+* [KYLIN-3105] - Interface Scheduler’s stop method should be removed<br />
+* [KYLIN-3100] - Building empty partitioned cube with rest api supports partition_start_date<br />
+* [KYLIN-3098] - Enable kylin.query.max-return-rows to limit the maximum row count returned to user<br />
+* [KYLIN-3092] - Synchronize read/write operations on Managers<br />
+* [KYLIN-3090] - Refactor to consolidate all caches and managers under KylinConfig<br />
+* [KYLIN-3088] - Spell Error of isCubeMatch<br />
+* [KYLIN-3086] - Ignore the intermediate tables when loading Hive source tables<br />
+* [KYLIN-3079] - Use Docker for document build environment<br />
+* [KYLIN-3078] - Optimize the estimated size of
percentile measure<br />
+* [KYLIN-3076] - Make kylin remember the choices we have made in the “Monitor>Jobs” page<br />
+* [KYLIN-3074] - Change cube access to project access in ExternalAclProvider.java<br />
+* [KYLIN-3073] - Automatically refresh the “Saved Queries” tab page when new query saved. <br />
+* [KYLIN-3070] - Enable “kylin.source.hive.flat-table-storage-format” for flat table storage format<br />
+* [KYLIN-3067] - Provide web interface for dimension capping feature<br />
+* [KYLIN-3065] - Add “First” and “Last” button in case “Query History” is too much<br />
+* [KYLIN-3064] - Turn off Yarn timeline-service when submit mr job<br />
+* [KYLIN-3048] - Give warning when merge with holes, but allow user to force proceed at the same time<br />
+* [KYLIN-3043] - Don’t need create materialized view for lookup tables without snapshot<br />
+* [KYLIN-3039] - Unclosed hbaseAdmin in ITAclTableMigrationToolTest<br />
+* [KYLIN-3036] - Allow complex column type when loading source table<br />
+* [KYLIN-3024] - Input Validator for “Auto Merge Thresholds” text box<br />
+* [KYLIN-3019] - The pop-up window of “Calculate Cardinality” and “Load Hive Table” should have the same hint<br />
+* [KYLIN-3009] - Rest API to get Cube join SQL<br />
+* [KYLIN-3008] - Introduce “submit-patch.py”<br />
+* [KYLIN-3006] - Upgrade Spark to 2.1.2<br />
+* [KYLIN-2997] - Allow change engineType even if there are segments in cube<br />
+* [KYLIN-2996] - Show DeployCoprocessorCLI Log failed tables info<br />
+* [KYLIN-2993] - Add special mr config for base cuboid step<br />
+* [KYLIN-2992] - Avoid OOM in CubeHFileJob.Reducer<br />
+* [KYLIN-2990] - Add warning window of exist model names for other project selected<br />
+* [KYLIN-2987] - Add “auto.purge=true” when creating intermediate hive table or redistribute a hive table<br />
+* [KYLIN-2985] - Cache temp json file created by each Calcite Connection<br />
+* [KYLIN-2984] - Only allow delete FINISHED or DISCARDED job<br />
+*
[KYLIN-2982] - Avoid upgrade column in OLAPTable<br /> +* [KYLIN-2981] - Typo in Cube refresh setting page.<br /> +* [KYLIN-2980] - Remove getKey/Value setKey/Value from Kylin's Pair.<br /> +* [KYLIN-2975] - Unclosed Statement in test<br /> +* [KYLIN-2966] - push down jdbc column type id mapping<br /> +* [KYLIN-2965] - Keep the same cost calculation logic between RealizationChooser and CubeInstance<br /> +* [KYLIN-2947] - Changed the Pop-up box when no project selected<br /> +* [KYLIN-2941] - Configuration setting for SSO<br /> +* [KYLIN-2940] - List job restful throw NPE when time filter not set<br /> +* [KYLIN-2935] - Improve the way to deploy coprocessor<br /> +* [KYLIN-2928] - PUSH DOWN query cannot use order by function<br /> +* [KYLIN-2921] - Refactor DataModelDesc<br /> +* [KYLIN-2918] - Table ACL needs GUI<br /> +* [KYLIN-2913] - Enable job retry for configurable exceptions<br /> +* [KYLIN-2912] - Remove 'hfile' folder after bulk load to HBase<br /> +* [KYLIN-2909] - Refine Email Template for notification by freemarker<br /> +* [KYLIN-2908] - Add one option for migration tool to indicate whether to migrate segment data<br /> +* [KYLIN-2905] - Refine the process of submitting a job<br /> +* [KYLIN-2884] - Add delete segment function for portal<br /> +* [KYLIN-2881] - Improve hbase coprocessor exception handling at kylin server side <br /> +* [KYLIN-2875] - Cube e-mail notification Validation<br /> +* [KYLIN-2867] - split large fuzzy Key set<br /> +* [KYLIN-2866] - Enlarge the reducer number for hyperloglog statistics calculation at step FactDistinctColumnsJob<br /> +* [KYLIN-2847] - Avoid doing useless work by checking query deadline<br /> +* [KYLIN-2846] - Add a config of hbase namespace for cube storage<br /> +* [KYLIN-2809] - Support operator '+' as string concat operator<br /> +* [KYLIN-2801] - Make default precision and scale in DataType (for hive) configurable<br /> +* [KYLIN-2764] - Build the dict for UHC column with MR<br /> +* [KYLIN-2736] - Use 
multiple threads to calculate HyperLogLogPlusCounter in FactDistinctColumnsMapper<br /> +* [KYLIN-2672] - Only clean necessary cache for CubeMigrationCLI<br /> +* [KYLIN-2656] - Support Zookeeper ACL<br /> +* [KYLIN-2649] - Tableau could send 'select *' on a big table<br /> +* [KYLIN-2645] - Upgrade Kafka version to 0.11.0.1<br /> +* [KYLIN-2556] - Switch Findbugs to Spotbugs<br /> +* [KYLIN-2363] - Prune cuboids by capping number of dimensions<br /> +* [KYLIN-1925] - Do not allow cross project clone for cube<br /> +* [KYLIN-1872] - Make query visible and interruptible, improve server's stablility</p> + +<p><strong>Bug</strong><br /> +* [KYLIN-3268] - Tomcat Security Vulnerability Alert. The version of the tomcat for kylin should upgrade to 7.0.85.<br /> +* [KYLIN-3263] - AbstractExecutable's retry has problem<br /> +* [KYLIN-3247] - REST API 'GET /api/cubes/{cubeName}/segs/{segmentName}/sql' should return a cube segment sql<br /> +* [KYLIN-3242] - export result should use alias too<br /> +* [KYLIN-3241] - When refresh on 'Add Cube Page', a blank page will appear.<br /> +* [KYLIN-3228] - Should remove the related segment when deleting a job<br /> +* [KYLIN-3227] - Automatically remove the blank at the end of lines in properties files<br /> +* [KYLIN-3226] - When user logs in with only query permission, 'N/A' is displayed in the cube's action list.<br /> +* [KYLIN-3224] - data can't show when use kylin pushdown model <br /> +* [KYLIN-3223] - Query for the list of hybrid cubes results in NPE<br /> +* [KYLIN-3222] - The function of editing 'Advanced Dictionaries' in cube is unavailable.<br /> +* [KYLIN-3219] - Fix NPE when updating metrics during Spark CubingJob<br /> +* [KYLIN-3216] - Remove the hard-code of spark-history path in 'check-env.sh'<br /> +* [KYLIN-3213] - Kylin help has duplicate items<br /> +* [KYLIN-3211] - Class IntegerDimEnc shuould give more exception information when the length is exceed the max or less than the min<br /> +* [KYLIN-3210] - The 
project shows '_null' in result page.<br /> +* [KYLIN-3205] - Allow one column is used for both dimension and precisely count distinct measure<br /> +* [KYLIN-3204] - Potentially unclosed resources in JdbcExplorer#evalQueryMetadata<br /> +* [KYLIN-3199] - The login dialog should be closed when ldap user with no permission login correctly<br /> +* [KYLIN-3190] - Fix wrong parameter in revoke access API<br /> +* [KYLIN-3184] - Fix '_null' project on the query page<br /> +* [KYLIN-3183] - Fix the bug of the 'Remove' button in 'Query History'<br /> +* [KYLIN-3178] - Delete table acl failed will cause the wabpage awalys shows 'Please wait…'<br /> +* [KYLIN-3177] - Merged Streaming cube segment has no start/end time<br /> +* [KYLIN-3175] - Streaming segment lost TSRange after merge<br /> +* [KYLIN-3173] - DefaultScheduler shutdown didn't reset field initialized.<br /> +* [KYLIN-3172] - No such file or directory error with CreateLookupHiveViewMaterializationStep <br /> +* [KYLIN-3167] - Datatype lost precision when using beeline<br /> +* [KYLIN-3165] - Fix the IllegalArgumentException during segments auto merge<br /> +* [KYLIN-3164] - HBase connection must be closed when clearing connection pool<br /> +* [KYLIN-3143] - Wrong use of Preconditions.checkNotNull() in ManagedUser#removeAuthoritie<br /> +* [KYLIN-3139] - Failure in map-reduce job due to undefined hdp.version variable when using HDP stack and remote HBase cluster<br /> +* [KYLIN-3136] - Endless status while subtask happens to be the illegal RUNNING<br /> +* [KYLIN-3135] - Fix regular expression bug in SQL comments<br /> +* [KYLIN-3131] - After refresh the page,the cubes can't sort by 'create_time'<br /> +* [KYLIN-3130] - If we add new cube then refresh the page,the page is blank<br /> +* [KYLIN-3116] - Fix cardinality caculate checkbox issue when loading tables<br /> +* [KYLIN-3112] - The job 'Pause' operation has logic bug in the kylin server.<br /> +* [KYLIN-3111] - Close of HBaseAdmin instance should be 
placed in finally block<br /> +* [KYLIN-3110] - The dashboard page has some display problems.<br /> +* [KYLIN-3106] - DefaultScheduler.shutdown should use ExecutorService.shutdownNow instead of ExecutorService.shutdown<br /> +* [KYLIN-3104] - When the user log out from 'Monitor' page, an alert dialog will pop up warning 'Failed to load query.'<br /> +* [KYLIN-3102] - Solve the problems for incomplete display of Hive Table tree.<br /> +* [KYLIN-3101] - The 'search' icon will separate from the 'Filter' textbox when click the 'showSteps' button of a job in the jobList<br /> +* [KYLIN-3097] - A few spell error in partials directory<br /> +* [KYLIN-3087] - Fix the DistributedLock release bug in GlobalDictionaryBuilder<br /> +* [KYLIN-3085] - CubeManager.updateCube() must not update the cached CubeInstance<br /> +* [KYLIN-3084] - File not found Exception when processing union-all in TEZ mode<br /> +* [KYLIN-3083] - potential overflow in CubeHBaseRPC#getCoprocessorTimeoutMillis<br /> +* [KYLIN-3082] - Close of GTBuilder should be placed in finally block in InMemCubeBuilder<br /> +* [KYLIN-3081] - Ineffective null check in CubeController#cuboidsExport<br /> +* [KYLIN-3077] - EDW.TEST_SELLER_TYPE_DIM_TABLE is not being created by the integration test, but it's presence in the Hive is expected<br /> +* [KYLIN-3069] - Add proper time zone support to the WebUI instead of GMT/PST kludge<br /> +* [KYLIN-3063] - load-hive-conf.sh should not get the commented configuration item<br /> +* [KYLIN-3061] - When we cancel the Topic modification for 'Kafka Setting' of streaming table, the 'Cancel' operation will make a mistake.<br /> +* [KYLIN-3060] - The logical processing of creating or updating streaming table has a bug in server, which will cause a NullPointerException.<br /> +* [KYLIN-3058] - We should limit the integer type ID and Port for 'Kafka Setting' in 'Streaming Cluster' page<br /> +* [KYLIN-3056] - Fix 'Cannot find segment null' bug when click 'SQL' in the cube view page<br 
/> +* [KYLIN-3055] - Fix NullPointerException for intersect_count<br /> +* [KYLIN-3054] - The drop-down menu in the grid column of query results missing a little bit.<br /> +* [KYLIN-3053] - When aggregation group verification failed, the error message about aggregation group number does not match with the actual on the Advanced Setting page<br /> +* [KYLIN-3049] - Filter the invalid zero value of 'Auto Merge Thresholds' parameter when you create or upate a cube.<br /> +* [KYLIN-3047] - Wrong column type when sync hive table via beeline<br /> +* [KYLIN-3042] - In query results page, the results data table should resize when click 'fullScreen' button<br /> +* [KYLIN-3040] - Refresh a non-partitioned cube changes the segment name to '19700101000000_2922789940817071255'<br /> +* [KYLIN-3038] - cannot support sum of type-converted column SQL<br /> +* [KYLIN-3034] - In the models tree, the 'Edit(JSON)' option is missing partly.<br /> +* [KYLIN-3032] - Cube size shows 0 but actually it isn't empty<br /> +* [KYLIN-3031] - KeywordDefaultDirtyHack should ignore case of default like other database does<br /> +* [KYLIN-3030] - In the cubes table, the options of last column action are missing partly.<br /> +* [KYLIN-3029] - The warning window of existing cube name does not work<br /> +* [KYLIN-3028] - Build cube error when set S3 as working-dir<br /> +* [KYLIN-3026] - Can not see full cube names on insight page<br /> +* [KYLIN-3020] - Improve org.apache.hadoop.util.ToolRunner to be threadsafe<br /> +* [KYLIN-3017] - Footer covers the selection box and some options can not be selected<br /> +* [KYLIN-3016] - StorageCleanup job doesn't clean up all the legacy fiels in a in Read/Write seperation environment<br /> +* [KYLIN-3004] - Update validation when deleting segment<br /> +* [KYLIN-3001] - Fix the wrong Cache key issue <br /> +* [KYLIN-2995] - Set SparkContext.hadoopConfiguration to HadoopUtil in Spark Cubing<br /> +* [KYLIN-2994] - Handle NPE when load dict in 
DictionaryManager<br /> +* [KYLIN-2991] - Query hit NumberFormatException if partitionDateFormat is not yyyy-MM-dd<br /> +* [KYLIN-2989] - Close of BufferedWriter should be placed in finally block in SCCreator<br /> +* [KYLIN-2974] - zero joint group can lead to query error<br /> +* [KYLIN-2971] - Fix the wrong 'Realization Names' in logQuery when hit cache<br /> +* [KYLIN-2969] - Fix the wrong NumberBytesCodec cache in Number2BytesConverter <br /> +* [KYLIN-2968] - misspelled word in table_load.html<br /> +* [KYLIN-2967] - Add the dependency check when deleting a project<br /> +* [KYLIN-2962] - drop error job not delete segment<br /> +* [KYLIN-2959] - SAML logout issue<br /> +* [KYLIN-2956] - building trie dictionary blocked on value of length over 4095 <br /> +* [KYLIN-2953] - List readable project not correct if add limit and offset<br /> +* [KYLIN-2939] - Get config properties not correct in UI<br /> +* [KYLIN-2933] - Fix compilation against the Kafka 1.0.0 release<br /> +* [KYLIN-2930] - Selecting one column in union causes compile error<br /> +* [KYLIN-2929] - speed up Dump file performance<br /> +* [KYLIN-2922] - Query fails when a column is used as dimension and sum(column) at the same time<br /> +* [KYLIN-2917] - Dup alias on OLAPTableScan<br /> +* [KYLIN-2907] - Check if a number is a positive integer <br /> +* [KYLIN-2901] - Update correct cardinality for empty table<br /> +* [KYLIN-2887] - Subquery columns not exported in OLAPContext allColumns<br /> +* [KYLIN-2876] - Ineffective check in ExternalAclProvider<br /> +* [KYLIN-2874] - Ineffective check in CubeDesc#getInitialCuboidScheduler<br /> +* [KYLIN-2849] - duplicate segment, cannot be deleted and data cannot be refreshed and merged<br /> +* [KYLIN-2837] - Ineffective call to toUpperCase() in MetadataManager<br /> +* [KYLIN-2836] - Lack of synchronization in CodahaleMetrics#close<br /> +* [KYLIN-2835] - Unclosed resources in JdbcExplorer<br /> +* [KYLIN-2794] - MultipleDictionaryValueEnumerator 
should output values in sorted order<br /> +* [KYLIN-2756] - Let 'LIMIT' be optional in 'Inspect' page<br /> +* [KYLIN-2470] - cube build failed when 0 bytes input for non-partition fact table<br /> +* [KYLIN-1664] - Harden security check for '/kylin/api/admin/config' API</p> + +<p><strong>Task</strong><br /> +* [KYLIN-3207] - Blog for Kylin Superset Integration<br /> +* [KYLIN-3200] - Enable SonarCloud for Code Analysis<br /> +* [KYLIN-3198] - More Chinese Howto Documents<br /> +* [KYLIN-3195] - Kylin v2.3.0 Release<br /> +* [KYLIN-3191] - Remove the deprecated configuration item kylin.security.acl.default-role<br /> +* [KYLIN-3189] - Documents for kylin python client<br /> +* [KYLIN-3080] - Kylin Qlik Sense Integration Documentation<br /> +* [KYLIN-3068] - Rename deprecated parameter for HDFS block size in HiveColumnCardinalityJob<br /> +* [KYLIN-3062] - Hide RAW measure<br /> +* [KYLIN-3010] - Remove v1 Spark engine code<br /> +* [KYLIN-2843] - Upgrade nvd3 version<br /> +* [KYLIN-2797] - Remove MR engine V1<br /> +* [KYLIN-2796] - Remove the legacy 'statisticsenabled' codes in FactDistinctColumnsJob</p> + +<p><strong>Sub-Task</strong><br /> +* [KYLIN-3235] - add null check for SQL<br /> +* [KYLIN-3202] - Doc directory for 2.3<br /> +* [KYLIN-3155] - Create a document for how to use dashboard<br /> +* [KYLIN-3154] - Create a document for cube planner<br /> +* [KYLIN-3153] - Create a document for system cube creation<br /> +* [KYLIN-3018] - Change maxLevel for layered cubing<br /> +* [KYLIN-2946] - Introduce a tool for batch incremental building of system cubes<br /> +* [KYLIN-2934] - Provide user guide for KYLIN-2656(Support Zookeeper ACL)<br /> +* [KYLIN-2822] - Introduce sunburst chart to show cuboid tree<br /> +* [KYLIN-2746] - Separate filter row count & aggregated row count for metrics collection returned by coprocessor<br /> +* [KYLIN-2735] - Introduce an option to make job scheduler consider job priority<br /> +* [KYLIN-2734] - Introduce hot cuboids 
export & import<br /> +* [KYLIN-2733] - Introduce optimize job for adjusting cuboid set<br /> +* [KYLIN-2732] - Introduce base cuboid as a new input for cubing job<br /> +* [KYLIN-2731] - Introduce checkpoint executable<br /> +* [KYLIN-2725] - Introduce a tool for creating system cubes relating to query & job metrics<br /> +* [KYLIN-2723] - Introduce metrics collector for query & job metrics<br /> +* [KYLIN-2722] - Introduce a new measure, called active reservoir, for actively pushing metrics to reporters</p> <h2 id="v220---2017-11-03">v2.2.0 - 2017-11-03</h2> <p><em>Tag:</em> <a href="https://github.com/apache/kylin/tree/kylin-2.2.0">kylin-2.2.0</a><br /> This is a major release after 2.1, with more than 70 bug fixes and enhancements. Check <a href="/docs21/howto/howto_upgrade.html">How to upgrade</a>.</p> -<p><strong>New Feature</strong></p> -<ul> - <li>[KYLIN-2703] - Manage ACL through Apache Ranger</li> - <li>[KYLIN-2752] - Make HTable name prefix configurable</li> - <li>[KYLIN-2761] - Table Level ACL</li> - <li>[KYLIN-2775] - Streaming Cube Sample</li> -</ul> - -<p><strong>Improvement</strong></p> -<ul> - <li>[KYLIN-2535] - Use ResourceStore to manage ACL files</li> - <li>[KYLIN-2604] - Use global dict as the default encoding for precise distinct count in web</li> - <li>[KYLIN-2606] - Only return counter for precise count_distinct if query is exactAggregate</li> - <li>[KYLIN-2622] - AppendTrieDictionary support not global</li> - <li>[KYLIN-2623] - Move output(Hbase) related code from MR engine to outputside</li> - <li>[KYLIN-2653] - Spark Cubing read metadata from HDFS</li> - <li>[KYLIN-2717] - Move concept Table under Project</li> - <li>[KYLIN-2790] - Add an extending point to support other types of column family</li> - <li>[KYLIN-2795] - Improve REST API document, add get/list jobs</li> - <li>[KYLIN-2803] - Pushdown non 'select' query</li> - <li>[KYLIN-2818] - Refactor dateRange & sourceOffset on CubeSegment</li> - <li>[KYLIN-2819] - Add 
'kylin.env.zookeeper-base-path' for zk path</li> - <li>[KYLIN-2823] - Trim TupleFilter after dictionary-based filter optimization</li> - <li>[KYLIN-2844] - Override 'max-visit-scanrange' and 'max-fuzzykey-scan' at cube level</li> - <li>[KYLIN-2854] - Remove duplicated controllers</li> - <li>[KYLIN-2856] - Log pushdown query as a kind of BadQuery</li> - <li>[KYLIN-2857] - MR configuration should be overwritten by user specified parameters when resuming MR jobs</li> - <li>[KYLIN-2858] - Add retry in cache sync</li> - <li>[KYLIN-2879] - Upgrade Spring & Spring Security to fix potential vulnerability</li> - <li>[KYLIN-2891] - Upgrade Tomcat to 7.0.82.</li> - <li>[KYLIN-2963] - Remove Beta for Spark Cubing</li> -</ul> - -<p><strong>Bug</strong></p> -<ul> - <li>[KYLIN-1794] - Enable job list even some job metadata parsing failed</li> - <li>[KYLIN-2600] - Incorrectly set the range start when filtering by the minimum value</li> - <li>[KYLIN-2705] - Allow removing model's 'partition_date_column' on web</li> - <li>[KYLIN-2706] - Fix the bug for the comparator in SortedIteratorMergerWithLimit</li> - <li>[KYLIN-2707] - Fix NPE in JobInfoConverter</li> - <li>[KYLIN-2716] - Non-thread-safe WeakHashMap leading to high CPU</li> - <li>[KYLIN-2718] - Overflow when calculating combination amount based on static rules</li> - <li>[KYLIN-2753] - Job duration may become negative</li> - <li>[KYLIN-2766] - Kylin uses default FS to put the coprocessor jar, instead of the working dir</li> - <li>[KYLIN-2773] - Should not push down join condition related columns are compatible while not consistent</li> - <li>[KYLIN-2781] - Make 'find-hadoop-conf-dir.sh' executable</li> - <li>[KYLIN-2786] - Miss 'org.apache.kylin.source.kafka.DateTimeParser'</li> - <li>[KYLIN-2788] - HFile is not written to S3</li> - <li>[KYLIN-2789] - Cube's last build time is wrong</li> - <li>[KYLIN-2791] - Fix bug in readLong function in BytesUtil</li> - <li>[KYLIN-2798] - Can't rearrange the order of rowkey columns though 
web UI</li> - <li>[KYLIN-2799] - Building cube with percentile measure encounter with NullPointerException</li> - <li>[KYLIN-2800] - All dictionaries should be built based on the flat hive table</li> - <li>[KYLIN-2806] - Empty results from JDBC with Date filter in prepareStatement</li> - <li>[KYLIN-2812] - Save to wrong database when loading Kafka Topic</li> - <li>[KYLIN-2814] - HTTP connection may not be released in RestClient</li> - <li>[KYLIN-2815] - Empty results with prepareStatement but OK with KylinStatement</li> - <li>[KYLIN-2824] - Parse Boolean type in JDBC driver</li> - <li>[KYLIN-2832] - Table meta missing from system diagnosis</li> - <li>[KYLIN-2833] - Storage cleanup job could delete the intermediate hive table used by running jobs</li> - <li>[KYLIN-2834] - Bug in metadata sync, Broadcaster lost listener after cache wipe</li> - <li>[KYLIN-2838] - Should get storageType in changeHtableHost of CubeMigrationCLI</li> - <li>[KYLIN-2862] - BasicClientConnManager in RestClient can't do well with syncing many query severs</li> - <li>[KYLIN-2863] - Double caret bug in sample.sh for old version bash</li> - <li>[KYLIN-2865] - Wrong fs when use two cluster</li> - <li>[KYLIN-2868] - Include and exclude filters not work on ResourceTool</li> - <li>[KYLIN-2870] - Shortcut key description is error at Kylin-Web</li> - <li>[KYLIN-2871] - Ineffective null check in SegmentRange</li> - <li>[KYLIN-2877] - Unclosed PreparedStatement in QueryService#execute()</li> - <li>[KYLIN-2906] - Check model/cube name is duplicated when creating model/cube</li> - <li>[KYLIN-2915] - Exception during query on lookup table</li> - <li>[KYLIN-2920] - Failed to get streaming config on WebUI</li> - <li>[KYLIN-2944] - HLLCSerializer, RawSerializer, PercentileSerializer returns shared object in serialize()</li> - <li>[KYLIN-2949] - Couldn't get authorities with LDAP in RedHat Linux</li> -</ul> - -<p>Task</p> -<ul> - <li>[KYLIN-2782] - Replace DailyRollingFileAppender with RollingFileAppender to 
allow log retention</li> - <li>[KYLIN-2925] - Provide document for Ranger security integration</li> -</ul> - -<p>Sub-task</p> -<ul> - <li>[KYLIN-2549] - Modify tools that related to Acl</li> - <li>[KYLIN-2728] - Introduce a new cuboid scheduler based on cuboid tree rather than static rules</li> - <li>[KYLIN-2729] - Introduce greedy algorithm for cube planner</li> - <li>[KYLIN-2730] - Introduce genetic algorithm for cube planner</li> - <li>[KYLIN-2802] - Enable cube planner phase one</li> - <li>[KYLIN-2826] - Add basic support classes for cube planner algorithms</li> - <li>[KYLIN-2961] - Provide user guide for Ranger Kylin Plugin</li> -</ul> +<p><strong>New Feature</strong><br /> +* [KYLIN-2703] - Manage ACL through Apache Ranger<br /> +* [KYLIN-2752] - Make HTable name prefix configurable<br /> +* [KYLIN-2761] - Table Level ACL<br /> +* [KYLIN-2775] - Streaming Cube Sample</p> + +<p><strong>Improvement</strong><br /> +* [KYLIN-2535] - Use ResourceStore to manage ACL files<br /> +* [KYLIN-2604] - Use global dict as the default encoding for precise distinct count in web<br /> +* [KYLIN-2606] - Only return counter for precise count_distinct if query is exactAggregate<br /> +* [KYLIN-2622] - AppendTrieDictionary support not global<br /> +* [KYLIN-2623] - Move output(Hbase) related code from MR engine to outputside<br /> +* [KYLIN-2653] - Spark Cubing read metadata from HDFS<br /> +* [KYLIN-2717] - Move concept Table under Project<br /> +* [KYLIN-2790] - Add an extending point to support other types of column family<br /> +* [KYLIN-2795] - Improve REST API document, add get/list jobs<br /> +* [KYLIN-2803] - Pushdown non 'select' query<br /> +* [KYLIN-2818] - Refactor dateRange & sourceOffset on CubeSegment<br /> +* [KYLIN-2819] - Add 'kylin.env.zookeeper-base-path' for zk path<br /> +* [KYLIN-2823] - Trim TupleFilter after dictionary-based filter optimization<br /> +* [KYLIN-2844] - Override 'max-visit-scanrange' and 'max-fuzzykey-scan' at cube level<br /> +* 
[KYLIN-2854] - Remove duplicated controllers<br /> +* [KYLIN-2856] - Log pushdown query as a kind of BadQuery<br /> +* [KYLIN-2857] - MR configuration should be overwritten by user specified parameters when resuming MR jobs<br /> +* [KYLIN-2858] - Add retry in cache sync<br /> +* [KYLIN-2879] - Upgrade Spring & Spring Security to fix potential vulnerability<br /> +* [KYLIN-2891] - Upgrade Tomcat to 7.0.82.<br /> +* [KYLIN-2963] - Remove Beta for Spark Cubing</p> + +<p><strong>Bug</strong><br /> +* [KYLIN-1794] - Enable job list even some job metadata parsing failed<br /> +* [KYLIN-2600] - Incorrectly set the range start when filtering by the minimum value<br /> +* [KYLIN-2705] - Allow removing model's 'partition_date_column' on web<br /> +* [KYLIN-2706] - Fix the bug for the comparator in SortedIteratorMergerWithLimit<br /> +* [KYLIN-2707] - Fix NPE in JobInfoConverter<br /> +* [KYLIN-2716] - Non-thread-safe WeakHashMap leading to high CPU<br /> +* [KYLIN-2718] - Overflow when calculating combination amount based on static rules<br /> +* [KYLIN-2753] - Job duration may become negative<br /> +* [KYLIN-2766] - Kylin uses default FS to put the coprocessor jar, instead of the working dir<br /> +* [KYLIN-2773] - Should not push down join condition related columns are compatible while not consistent<br /> +* [KYLIN-2781] - Make 'find-hadoop-conf-dir.sh' executable<br /> +* [KYLIN-2786] - Miss 'org.apache.kylin.source.kafka.DateTimeParser'<br /> +* [KYLIN-2788] - HFile is not written to S3<br /> +* [KYLIN-2789] - Cube's last build time is wrong<br /> +* [KYLIN-2791] - Fix bug in readLong function in BytesUtil<br /> +* [KYLIN-2798] - Can't rearrange the order of rowkey columns though web UI<br /> +* [KYLIN-2799] - Building cube with percentile measure encounter with NullPointerException<br /> +* [KYLIN-2800] - All dictionaries should be built based on the flat hive table<br /> +* [KYLIN-2806] - Empty results from JDBC with Date filter in prepareStatement<br /> +* 
[KYLIN-2812] - Save to wrong database when loading Kafka Topic<br /> +* [KYLIN-2814] - HTTP connection may not be released in RestClient<br /> +* [KYLIN-2815] - Empty results with prepareStatement but OK with KylinStatement<br /> +* [KYLIN-2824] - Parse Boolean type in JDBC driver<br /> +* [KYLIN-2832] - Table meta missing from system diagnosis<br /> +* [KYLIN-2833] - Storage cleanup job could delete the intermediate hive table used by running jobs<br /> +* [KYLIN-2834] - Bug in metadata sync, Broadcaster lost listener after cache wipe<br /> +* [KYLIN-2838] - Should get storageType in changeHtableHost of CubeMigrationCLI<br /> +* [KYLIN-2862] - BasicClientConnManager in RestClient can't do well with syncing many query severs<br /> +* [KYLIN-2863] - Double caret bug in sample.sh for old version bash<br /> +* [KYLIN-2865] - Wrong fs when use two cluster<br /> +* [KYLIN-2868] - Include and exclude filters not work on ResourceTool<br /> +* [KYLIN-2870] - Shortcut key description is error at Kylin-Web<br /> +* [KYLIN-2871] - Ineffective null check in SegmentRange<br /> +* [KYLIN-2877] - Unclosed PreparedStatement in QueryService#execute()<br /> +* [KYLIN-2906] - Check model/cube name is duplicated when creating model/cube<br /> +* [KYLIN-2915] - Exception during query on lookup table<br /> +* [KYLIN-2920] - Failed to get streaming config on WebUI<br /> +* [KYLIN-2944] - HLLCSerializer, RawSerializer, PercentileSerializer returns shared object in serialize()<br /> +* [KYLIN-2949] - Couldn't get authorities with LDAP in RedHat Linux</p> + +<p>Task<br /> +* [KYLIN-2782] - Replace DailyRollingFileAppender with RollingFileAppender to allow log retention<br /> +* [KYLIN-2925] - Provide document for Ranger security integration</p> + +<p>Sub-task<br /> +* [KYLIN-2549] - Modify tools that related to Acl<br /> +* [KYLIN-2728] - Introduce a new cuboid scheduler based on cuboid tree rather than static rules<br /> +* [KYLIN-2729] - Introduce greedy algorithm for cube planner<br /> 
+* [KYLIN-2730] - Introduce genetic algorithm for cube planner<br /> +* [KYLIN-2802] - Enable cube planner phase one<br /> +* [KYLIN-2826] - Add basic support classes for cube planner algorithms<br /> +* [KYLIN-2961] - Provide user guide for Ranger Kylin Plugin</p> <h2 id="v210---2017-08-17">v2.1.0 - 2017-08-17</h2> @@ -7100,7 +7072,7 @@ This version includes many bug fixs/enha <li>[KYLIN-1396] - minor bug in BigDecimalSerializer - avoidVerbose should be incremented each time when input scale is larger than given scale</li> <li>[KYLIN-1419] - NullPointerException occurs when query from subqueries with order by</li> <li>[KYLIN-1445] - Kylin should throw error if HIVE_CONF dir cannot be found</li> - <li>[KYLIN-1466] - Some environment variables are not used in bin/kylin.sh <RUNNABLE_CLASS_NAME></RUNNABLE_CLASS_NAME></li> + <li>[KYLIN-1466] - Some environment variables are not used in bin/kylin.sh <runnable_class_name></runnable_class_name></li> <li>[KYLIN-1469] - Hive dependency jars are hard coded in test</li> <li>[KYLIN-1471] - LIMIT after having clause should not be pushed down to storage context</li> <li>[KYLIN-1473] - Cannot have comments in the end of New Query textbox</li> @@ -7199,7 +7171,7 @@ This version includes many bug fixs/enha <li>[KYLIN-1443] - For setting Auto Merge Time Ranges, before sending them to backend, the related time ranges should be sorted increasingly</li> <li>[KYLIN-1445] - Kylin should throw error if HIVE_CONF dir cannot be found</li> <li>[KYLIN-1456] - Shouldn't use '1970-01-01' as the default end date</li> - <li>[KYLIN-1466] - Some environment variables are not used in bin/kylin.sh <RUNNABLE_CLASS_NAME></RUNNABLE_CLASS_NAME></li> + <li>[KYLIN-1466] - Some environment variables are not used in bin/kylin.sh <runnable_class_name></runnable_class_name></li> <li>[KYLIN-1469] - Hive dependency jars are hard coded in test</li> </ul>
Modified: kylin/site/feed.xml URL: http://svn.apache.org/viewvc/kylin/site/feed.xml?rev=1836252&r1=1836251&r2=1836252&view=diff ============================================================================== --- kylin/site/feed.xml (original) +++ kylin/site/feed.xml Thu Jul 19 07:35:15 2018 @@ -19,11 +19,249 @@ <description>Apache Kylin Home</description> <link>http://kylin.apache.org/</link> <atom:link href="http://kylin.apache.org/feed.xml" rel="self" type="application/rss+xml"/> - <pubDate>Mon, 02 Jul 2018 06:59:10 -0700</pubDate> - <lastBuildDate>Mon, 02 Jul 2018 06:59:10 -0700</lastBuildDate> + <pubDate>Thu, 19 Jul 2018 00:27:24 -0700</pubDate> + <lastBuildDate>Thu, 19 Jul 2018 00:27:24 -0700</lastBuildDate> <generator>Jekyll v2.5.3</generator> <item> + <title>Use Star Schema Benchmark for Apache Kylin</title> + <description><h2 id="background">Background</h2> + +<p>Many Apache Kylin users deploying Kylin in production face the problem of how to measure Kylin's performance before delivering it to the business. A performance benchmark helps uncover potential performance issues, so you can tune the configuration to improve overall performance. Tuning may cover Kylin's own job and query settings, concurrent cube building, HBase reads and writes, MapReduce or Spark parameters, and more.</p> + +<h2 id="ssb-introduction">SSB Introduction</h2> +<p>Kyligence Inc provides an SSB (Star Schema Benchmark) project called <a href="https://github.com/Kyligence/ssb-kylin">ssb-kylin</a> on GitHub. SSB is derived from the TPC-H benchmark and specifically targets tools in star-schema OLAP scenarios.</p> + +<p>The test process generates 5 tables, and the data volume can be adjusted through parameters. The table structure of SSB is shown below:</p> + +<p><img src="/images/blog/1. The table structure of SSB.png" alt="" /></p> + +<p>The table 'lineorder' is the fact table; the other four are dimension tables. 
Each dimension table joins to the fact table through its primary key, forming a standard star schema.</p> + +<p>The environment for this test is CDH 5.13.3 with Kerberos and OpenLDAP authentication enabled, and Sentry providing fine-grained, role-based authorization and multi-tenant management. However, the official 'ssb-kylin' does not handle permissions and authentication, so I have modified it slightly. For details, see my fork <a href="https://github.com/jiangshouzhuang/ssb-kylin">jiangshouzhuang/ssb-kylin</a>.</p> + +<h2 id="prerequisites">Prerequisites</h2> + +<p><strong>Here is a description of the Kylin deployment:</strong><br /> +1. Kylin is deployed with OpenLDAP for unified user authentication<br /> +2. The Kylin deployment user kylin_manager_user is added in OpenLDAP (user group is kylin_manager_group)<br /> +3. The Kylin version is apache-kylin-2.4.0<br /> +4. Kylin cluster configuration (VM):<br /> +Kylin Job 1 node: 16 GB, 8 cores<br /> +Kylin Query 2 nodes: 32 GB, 8 cores<br /> +<strong>A few steps before the SSB stress test:</strong><br /> +1 Create a database named ssb in Hive.</p> +<pre name="code" class="java"> +# Log in to the hive database as a super administrator. +CREATE DATABASE ssb; +CREATE ROLE ssb_write_role; +GRANT ALL ON DATABASE ssb TO ROLE ssb_write_role; +GRANT ROLE ssb_write_role TO GROUP ssb_write_group; +# Then add kylin_manager_user to kylin_manager_group in OpenLDAP, so kylin_manager_user has access to the ssb database. 
+</pre> +<p>2 Grant the kylin_manager_user user read and write permissions on the HDFS directory /user/kylin_manager_user.<br /> +3 Set the HADOOP_STREAMING_JAR environment variable in the kylin_manager_user user's home directory.<br /> +<code class="highlighter-rouge"> +export HADOOP_STREAMING_JAR=/opt/cloudera/parcels/CDH/lib/hadoop-mapreduce/hadoop-streaming.jar +</code></p> + +<h2 id="download-the-ssb-tool-and-compile">Download the SSB tool and compile</h2> + +<p>You can quickly download and compile the SSB test tool by entering the following commands in a Linux terminal.</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>git clone https://github.com/jiangshouzhuang/ssb-kylin.git +cd ssb-kylin +cd ssb-benchmark +make clean +make +</code></pre> +</div> + +<h2 id="adjust-the-ssb-parameters">Adjust the SSB parameters</h2> + +<p>The ssb-kylin project has an ssb.conf file under the bin directory that defines the base data volume of the fact table and the dimension tables. When generating test data, we can specify a scale factor so that the actual data volume is base * scale.</p> + +<p>Part of the ssb.conf file is:</p> + +<div class="highlighter-rouge"><pre class="highlight"><code> # customer base, default value is 30,000 + customer_base = 30000 + # part base, default value is 200,000 + part_base = 200000 + # supply base, default value is 2,000 + supply_base = 2000 + # date base (days), default value is 2,556 + date_base = 2556 + # lineorder base (purchase record), default value is 6,000,000 + lineorder_base = 6000000 +</code></pre> +</div> + +<p>Of course, the base parameters above can be adjusted to your actual needs; I use the defaults.<br /> +The ssb.conf file also contains the following parameters.</p> + +<div class="highlighter-rouge"><pre class="highlight"><code># manufacturer max. The value range is (1 .. manu_max) +manu_max = 5 +# category max. The value range is (1 .. 
cat_max) +cat_max = 5 +# brand max. The value range is (1 .. brand_max) +brand_max = 40 +</code></pre> +</div> + +<p><strong>The explanation is as follows:</strong> <br /> +manu_max, cat_max and brand_max define the scale of the hierarchy. For example, manu_max=10, cat_max=10, and brand_max=10 mean that there are 10 manufacturers in total, each manufacturer has at most 10 categories of parts, and each category has up to 10 brands. Therefore, the cardinality of manufacturer is 10, the cardinality of category is 10 * 10 = 100, and the cardinality of brand is 10 * 10 * 10 = 1000.</p> + +<div class="highlighter-rouge"><pre class="highlight"><code># customer: num of cities per country, default value is 100 +cust_city_max = 9 +# supplier: num of cities per country, default value is 100 +supp_city_max = 9 +</code></pre> +</div> + +<p><strong>The explanation is as follows:</strong> <br /> +cust_city_max and supp_city_max define the number of cities per country in the customer and supplier tables. If the total number of countries is 30, cust_city_max=100 and supp_city_max=10, then the customer table will have 30 * 100 = 3000 distinct cities, and the supplier table will have 30 * 10 = 300 distinct cities.</p> + +<p><strong>Tip:</strong><br /> +In this stress test, the test data is generated using resources allocated by Yarn. 
If you run into memory problems while generating the data, increase the memory that Yarn allocates to each container.</p> + +<h2 id="generate-test-data">Generate test data</h2> + +<p>Before running the <code class="highlighter-rouge">ssb-kylin/bin/run.sh</code> script, a few settings in run.sh need to be explained:<br /> +1 Configure HDFS_BASE_DIR as the path for the table data. Because I granted kylin_manager_user read and write permissions on the /user/kylin_manager_user directory, it is configured here as:</p> +<pre name="code" class="java"> +HDFS_BASE_DIR=/user/kylin_manager_user/ssb +</pre> +<p>The temporary and actual data will be generated under this directory when you run run.sh.<br /> +2 Configure the LDAP user and password used to deploy Kylin, and the keytab file used for operations such as accessing HDFS.</p> +<pre name="code" class="java"> +KYLIN_INSTALL_USER=kylin_manager_user +KYLIN_INSTALL_USER_PASSWD=xxxxxxxx +KYLIN_INSTALL_USER_KEYTAB=/home/${KYLIN_INSTALL_USER}/keytab/${KYLIN_INSTALL_USER}.keytab +</pre> +<p>3 Configure how beeline accesses Hive.</p> +<pre name="code" class="java"> +BEELINE_URL=jdbc:hive2://hiveserve2_ip:10000 +HIVE_BEELINE_COMMAND="beeline -u ${BEELINE_URL} -n ${KYLIN_INSTALL_USER} -p +${KYLIN_INSTALL_USER_PASSWD} -d org.apache.hive.jdbc.HiveDriver" +</pre> +<p>If your CDH or other big data platform uses the Hive CLI instead of beeline, modify this setting accordingly.<br /> +Once everything is ready, we run the program to generate the test data:</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>cd ssb-kylin +bin/run.sh --scale 20 +</code></pre> +</div> + +<p>We set the scale to 20. The program runs for a while, and the largest table, lineorder, ends up with more than 100 million rows. 
After the program finishes, we check the tables in the Hive database and their row counts:</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>use ssb; +show tables; +select count(1) from lineorder; +select count(1) from p_lineorder; +</code></pre> +</div> + +<p><img src="/images/blog/2.1 generated tables.png" alt="" /></p> + +<p><img src="/images/blog/2.2 the volume of data.png" alt="" /></p> + +<p>As you can see, a total of five tables and one view were created.</p> + +<h2 id="load-the-cubes-metadata-and-build-the-cube">Load the cube's metadata and build the cube</h2> + +<p>The ssb-kylin project has already defined the project, model, and cube for us, so we only need to import them into Kylin, just like the learn_kylin example. The cube metadata directory is cubemeta. Because our Kylin is integrated with OpenLDAP, there is no ADMIN user, so the owner parameter in cubemeta/cube/ssb.json is set to null.<br /> +Execute the following command to import cubemeta:</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>cd ssb-kylin +$KYLIN_HOME/bin/metastore.sh restore cubemeta +</code></pre> +</div> + +<p>Then log in to Kylin and execute the Reload Metadata operation. This creates the new project, model and cube in Kylin. Before building the cube, first Disable it and then Purge it to delete old temporary files.</p> + +<p>The results of building with MapReduce are as follows:</p> + +<p><img src="/images/blog/3 build with mapReduce.png" alt="" /></p> + +<p>Next I test the performance of building the Cube with Spark: disable the previously created Cube, and then Purge it. Since the Cube has been purged, the HBase tables and HDFS files it no longer uses need to be deleted. Here we clean up these junk files manually. First execute the following command:</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete false +</code></pre> +</div> + +<p>Then check whether the listed HBase tables and HDFS files are really no longer needed. 
After confirming that they are no longer needed, perform the delete operation:</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>${KYLIN_HOME}/bin/kylin.sh org.apache.kylin.tool.StorageCleanupJob --delete true +</code></pre> +</div> + +<p>Building a cube with Spark consumes a lot of memory; after all, it is the use of memory that speeds up the build. Here are some of the Spark parameters in the kylin.properties configuration file:</p> + +<div class="highlighter-rouge"><pre class="highlight"><code>kylin.engine.spark-conf.spark.master=yarn +kylin.engine.spark-conf.spark.submit.deployMode=cluster +kylin.engine.spark-conf.spark.yarn.queue=root.kylin_manager_group +# config Dynamic resource allocation +kylin.engine.spark-conf.spark.dynamicAllocation.enabled=true +kylin.engine.spark-conf.spark.dynamicAllocation.minExecutors=10 +kylin.engine.spark-conf.spark.dynamicAllocation.maxExecutors=1024 +kylin.engine.spark-conf.spark.dynamicAllocation.executorIdleTimeout=300 + +kylin.engine.spark-conf.spark.shuffle.service.enabled=true +kylin.engine.spark-conf.spark.shuffle.service.port=7337 + +kylin.engine.spark-conf.spark.driver.memory=4G +kylin.engine.spark-conf.spark.executor.memory=4G +kylin.engine.spark-conf.spark.executor.cores=1 +kylin.engine.spark-conf.spark.network.timeout=600 +</code></pre> +</div> + +<p>The above parameters meet most requirements, so users generally do not need to configure anything further when designing a Cube. Of course, in special cases you can still set Spark-related tuning parameters at the Cube level.</p> + +<p>Before building a Cube with Spark, you need to set the Cube Engine value to Spark in Advanced Setting and then execute Build. 
After the build is completed, the results are as follows:</p> + +<p><img src="/images/blog/4 build completely.png" alt="" /></p> + +<p>For comparison, the time taken by MapReduce and Spark to build the Cube (Scale=20) is as follows:</p> + +<p><img src="/images/blog/5 the results of comparing Spark and MapReduce.png" alt="" /></p> + +<p>You can see that the build is almost 1x faster with Spark, that is, close to twice as fast. In fact, Spark offers many other tuning options (performance can be improved by 1-4 times or more), which are not covered here.</p> + +<h2 id="query">Query</h2> + +<p>ssb-kylin provides 13 SSB SQL queries. The query conditions may vary with the scale factor, so you can modify them according to your actual situation. The following examples show the test results for scale 10 and 20:<br /> +The query result of Scale=10 is as follows:</p> + +<p><img src="/images/blog/6.1 scale 10.png" alt="" /></p> + +<p>The query result of Scale=20 is as follows:</p> + +<p><img src="/images/blog/6.2 scale 20.png" alt="" /></p> + +<p>As can be seen from the results, all the queries complete within 1 s, which strongly demonstrates Apache Kylin's sub-second query capability. 
In addition, the average query performance did not decrease significantly when the amount of data doubled, which follows from the principle of Cube precomputation.</p> + +<p>Note: For details on each query statement, see the README.md description in the ssb-kylin project.</p> + +<p>At this point, Kylin's SSB stress test is complete, but for you who are reading this article, everything is just beginning.</p> + +<h2 id="references">References</h2> + +<ol> + <li>Jiang Shouzhuang. <a href="https://juejin.im/post/5b46d0606fb9a04fd6593d31">How to stress-test Apache Kylin with Star Schema Benchmark</a> (in Chinese)</li> +</ol> + +</description> + <pubDate>Mon, 16 Jul 2018 05:28:00 -0700</pubDate> + <link>http://kylin.apache.org/blog/2018/07/16/Star-Schema-Benchmark-on-Apache-Kylin/</link> + <guid isPermaLink="true">http://kylin.apache.org/blog/2018/07/16/Star-Schema-Benchmark-on-Apache-Kylin/</guid> + + + <category>blog</category> + + </item> + + <item> <title>Redash-Kylin plugin from Strikingly</title> <description><p>At Strikingly, we are using Apache Kylin as our OLAP engine. Kylin is very powerful and it supports our big data business well. 
We've chosen Apache Kylin because it fits our needs: it handles a huge amount of data, serves multiple concurrent queries and has sub-second response time.</p> @@ -805,53 +1043,6 @@ The time only cover the building cube st <category>blog</category> - - </item> - - <item> - <title>Apache Kylin v1.6.0 Release Announcement</title> - <description><p>The Apache Kylin community is pleased to announce the release of Apache Kylin v1.6.0.</p> - -<p>Apache Kylin is an open source Distributed Analytics Engine designed to provide SQL interface and multi-dimensional analysis (OLAP) on Hadoop supporting extremely large datasets.</p> - -<p>This is a major release after 1.5.4, with reliable and scalable support for using Apache Kafka as a data source; this enables users to build cubes directly from streaming data (without loading to Apache Hive), reducing the data latency from days/hours to minutes.</p> - -<p>Apache Kylin 1.6.0 resolved 102 issues including bug fixes, improvements, and new features. 
All of the changes can be found in the <a href="https://kylin.apache.org/docs16/release_notes.html">release notes</a>.</p> - -<h2 id="change-highlights">Change Highlights</h2> - -<ul> - <li>Scalable streaming cubing <a href="https://issues.apache.org/jira/browse/KYLIN-1726">KYLIN-1726</a></li> - <li>TopN counter merge performance improvement <a href="https://issues.apache.org/jira/browse/KYLIN-1917">KYLIN-1917</a></li> - <li>Support Embedded Structure JSON Message <a href="https://issues.apache.org/jira/browse/KYLIN-1919">KYLIN-1919</a></li> - <li>More robust approach to hive schema changes <a href="https://issues.apache.org/jira/browse/KYLIN-2012">KYLIN-2012</a></li> - <li>TimedJsonStreamParser should support other time format <a href="https://issues.apache.org/jira/browse/KYLIN-2054">KYLIN-2054</a></li> - <li>Add an encoder for Boolean type <a href="https://issues.apache.org/jira/browse/KYLIN-2055">KYLIN-2055</a></li> - <li>Allow concurrent build/refresh/merge <a href="https://issues.apache.org/jira/browse/KYLIN-2070">KYLIN-2070</a></li> - <li>Support to change streaming configuration <a href="https://issues.apache.org/jira/browse/KYLIN-2082">KYLIN-2082</a></li> -</ul> - -<p>To download Apache Kylin v1.6.0 source code or binary package, visit the <a href="http://kylin.apache.org/download">download</a> page.</p> - -<p><strong>Upgrade</strong></p> - -<p>Follow the <a href="/docs16/howto/howto_upgrade.html">upgrade guide</a>.</p> - -<p><strong>Support</strong></p> - -<p>Any issue or question,<br /> -open JIRA to Apache Kylin project: <a href="https://issues.apache.org/jira/browse/KYLIN/">https://issues.apache.org/jira/browse/KYLIN/</a><br /> -or<br /> -send mail to Apache Kylin dev mailing list: <a 
href="&#109;&#097;&#105;&#108;&#116;&#111;:&#100;&#101;&#118;&#064;&#107;&#121;&#108;&#105;&#110;&#046;&#097;&#112;&#097;&#099;&#104;&#101;&#046;&#111;&#114;&#103;">&#100;&#101;&#118;&#064;&#107;&#121;&#108;&#105;&#110;&#046;&#097;&#112;&#097;&#099;&#104;&#101;&#046;&#111;&#114;&#103;</a></p> - -<p><em>Great thanks to everyone who contributed!</em></p> -</description> - <pubDate>Sun, 04 Dec 2016 12:00:00 -0800</pubDate> - <link>http://kylin.apache.org/blog/2016/12/04/release-v1.6.0/</link> - <guid isPermaLink="true">http://kylin.apache.org/blog/2016/12/04/release-v1.6.0/</guid> - - - <category>blog</category> </item> Added: kylin/site/images/blog/1. The table structure of SSB.png URL: http://svn.apache.org/viewvc/kylin/site/images/blog/1.%20The%20table%20structure%20of%20SSB.png?rev=1836252&view=auto ============================================================================== Binary file - no diff available. Propchange: kylin/site/images/blog/1. The table structure of SSB.png ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: kylin/site/images/blog/2.1 generated tables.png URL: http://svn.apache.org/viewvc/kylin/site/images/blog/2.1%20generated%20tables.png?rev=1836252&view=auto ============================================================================== Binary file - no diff available. Propchange: kylin/site/images/blog/2.1 generated tables.png ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: kylin/site/images/blog/2.2 the volume of data.png URL: http://svn.apache.org/viewvc/kylin/site/images/blog/2.2%20the%20volume%20of%20data.png?rev=1836252&view=auto ============================================================================== Binary file - no diff available. 
Propchange: kylin/site/images/blog/2.2 the volume of data.png ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: kylin/site/images/blog/3 build with mapReduce.png URL: http://svn.apache.org/viewvc/kylin/site/images/blog/3%20build%20with%20mapReduce.png?rev=1836252&view=auto ============================================================================== Binary file - no diff available. Propchange: kylin/site/images/blog/3 build with mapReduce.png ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: kylin/site/images/blog/4 build completely.png URL: http://svn.apache.org/viewvc/kylin/site/images/blog/4%20build%20completely.png?rev=1836252&view=auto ============================================================================== Binary file - no diff available. Propchange: kylin/site/images/blog/4 build completely.png ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: kylin/site/images/blog/5 the results of comparing Spark and MapReduce.png URL: http://svn.apache.org/viewvc/kylin/site/images/blog/5%20the%20results%20of%20comparing%20Spark%20and%20MapReduce.png?rev=1836252&view=auto ============================================================================== Binary file - no diff available. Propchange: kylin/site/images/blog/5 the results of comparing Spark and MapReduce.png ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: kylin/site/images/blog/6.1 scale 10.png URL: http://svn.apache.org/viewvc/kylin/site/images/blog/6.1%20scale%2010.png?rev=1836252&view=auto ============================================================================== Binary file - no diff available. 
Propchange: kylin/site/images/blog/6.1 scale 10.png ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream Added: kylin/site/images/blog/6.2 scale 20.png URL: http://svn.apache.org/viewvc/kylin/site/images/blog/6.2%20scale%2020.png?rev=1836252&view=auto ============================================================================== Binary file - no diff available. Propchange: kylin/site/images/blog/6.2 scale 20.png ------------------------------------------------------------------------------ svn:mime-type = application/octet-stream