diff --git a/docs/EN_US/ECLLanguageReference/ECLR_mods/BltInFunc-BUILD.xml b/docs/EN_US/ECLLanguageReference/ECLR_mods/BltInFunc-BUILD.xml index a8baa1a8599..0a396a1c675 100644 --- a/docs/EN_US/ECLLanguageReference/ECLR_mods/BltInFunc-BUILD.xml +++ b/docs/EN_US/ECLLanguageReference/ECLR_mods/BltInFunc-BUILD.xml @@ -39,9 +39,9 @@ - + - + @@ -241,9 +241,9 @@ - + - + @@ -256,8 +256,8 @@ written to disk is always determined by the number of nodes in the cluster on which the workunit executes, regardless of the number of nodes on the target cluster(s) unless the WIDTH option - is also specified. Use this option for bare-metal deployments. - + is also specified. Use this option for bare-metal + deployments. @@ -292,7 +292,7 @@ names of the plane(s) to write the indexfile to. The targetPlane names must be listed as they - are defined in the deployment. + are defined in the deployment. @@ -856,17 +856,17 @@ BUILD(FilterDsLib1); - + - + LZW - The default compression. It is a variant of the - Lempel-Ziv-Welch algorithm. It remains the default for backward - compatibility. + A variant of the Lempel-Ziv-Welch algorithm. This was the + default compression prior to versions 9.6.90, 9.8.66, and + 9.10.12. @@ -894,34 +894,113 @@ BUILD(FilterDsLib1); compression on the payload. The resulting index can be smaller than using lz4. + + + 'inplace:lz4s' + + Causes inplace compression on the key fields and lz4s + compression on the payload. This uses the stream LZ4 API to avoid + recompressing the data and reduce the index build times. + + + + 'inplace:lz4shc' + + The default compression for inplace indexes in versions + after 9.6.90, 9.8.66, and 9.10.12. Causes inplace + compression on the key fields and lz4shc compression on the + payload. This uses the stream LZ4 API to avoid recompressing the + data and reduce the index build times. 
+ - The inplace index compression format (introduced in version 9.2.0) - improves compression of keyed fields and allows them to be searched - without decompression. The original index compression implementation - decompresses the rows when they are read from disk. + The lz4s and lz4shc inplace index compression formats (introduced in + versions 9.6.90, 9.8.66, and 9.10.12) improve compression + and reduce build time. These formats require an engine that supports them. + In other words, if you build an index using the lz4s + or lz4shc formats, you must use a platform later than 9.6.90, 9.8.66, and + 9.10.12 to read those indexes. + + If you attempt to read an index with the inplace compression format + on a system that does not support it, you will receive an error + message. Because the branch nodes can be searched without decompression more branch nodes fit into memory which can improve search performance. The lz4 compression used for the payload is significantly faster at decompressing - leaf pages than the previous LZW compression. Whether performance is + better with lz4hc (a high-compression variant of lz4) on the payload + fields depends on the access characteristics of the data and how much of + the index is cached in memory. - Whether performance is better with lz4hc (a high-compression variant - of lz4) on the payload fields depends on the access characteristics of the - data and how much of the index is cached in memory. + Compression Levels : - If you attempt to read an index with the inplace compression format - on a system that does not support them, you will receive an error - message. + + + + + + + + + hclevel + + An integer between 2 and 12 to specify the level of + compression. The default is 3. Higher levels increase the + compression, but also increase the compression times. 
This may be + cost effective depending on the length of time the data is stored, + and the storage costs compared to the compute costs to build the + index. + - See Also: INDEX, JOIN, FETCH, MODULE, INTERFACE, LIBRARY, DISTRIBUTE, #WORKUNIT + + maxcompression + + The maximum desired compression ratio. This avoids the leaf + nodes getting too large when expanded, but increases the size of + some indexes. The default is 20. + + + + maxrecompress + + Specifies the number of times the entire input dataset + should be recompressed to free up space. Increasing the number + decreases the size of the indexes, and will probably decrease the + decompress time slightly (because there are fewer stream blocks), + but will increase the build time. The default is 1. + + + + + + + + Example: + + Vehicles := DATASET('vehicles', + {STRING2 st,STRING20 city,STRING20 lname},FLAT); + +SearchTerms := RECORD + Vehicles.st; + Vehicles.city; +END; +Payload := RECORD + Vehicles.lname; +END; +VehicleKey := INDEX(Vehicles,SearchTerms,Payload,'vkey::st.city', + COMPRESSED('inplace:lz4shc,compressopt(hclevel=9, + maxcompression=25, + maxrecompress=4)')); +BUILD(VehicleKey); + + See Also: DATASET, BUILDINDEX, JOIN, FETCH, KEYED/WILD diff --git a/docs/EN_US/ECLLanguageReference/ECLR_mods/Recrd-Index.xml b/docs/EN_US/ECLLanguageReference/ECLR_mods/Recrd-Index.xml index a1b9358527d..11aa09f726f 100644 --- a/docs/EN_US/ECLLanguageReference/ECLR_mods/Recrd-Index.xml +++ b/docs/EN_US/ECLLanguageReference/ECLR_mods/Recrd-Index.xml @@ -49,9 +49,9 @@ - + - + @@ -266,7 +266,7 @@ All STRINGs must be fixed length. - + @@ -365,17 +365,17 @@ BUILD(VehicleKey3); - + - + LZW - The default compression. It is a variant of the - Lempel-Ziv-Welch algorithm. It remains the default for backward - compatibility. + A variant of the Lempel-Ziv-Welch algorithm. This was the + default compression prior to versions 9.6.90, 9.8.66, and + 9.10.12. @@ -403,27 +403,109 @@ BUILD(VehicleKey3); compression on the payload. 
The resulting index can be smaller than using lz4. + + + 'inplace:lz4s' + + Causes inplace compression on the key fields and lz4s + compression on the payload. This uses the stream LZ4 API to avoid + recompressing the data and reduce the index build times. + + + + 'inplace:lz4shc' + + The default compression for inplace indexes in versions + after 9.6.90, 9.8.66, and 9.10.12. Causes inplace + compression on the key fields and lz4shc compression on the + payload. This uses the stream LZ4 API to avoid recompressing the + data and reduce the index build times. + - The inplace index compression format (introduced in version 9.2.0) - improves compression of keyed fields and allows them to be searched - without decompression. The original index compression implementation - decompresses the rows when they are read from disk. + The lz4s and lz4shc inplace index compression formats (introduced in + versions 9.6.90, 9.8.66, and 9.10.12) improve compression + and reduce build time. These formats require an engine that supports them. + In other words, if you build an index using the lz4s + or lz4shc formats, you must use a platform later than 9.6.90, 9.8.66, and + 9.10.12 to read those indexes. + + If you attempt to read an index with the inplace compression format + on a system that does not support it, you will receive an error + message. Because the branch nodes can be searched without decompression more branch nodes fit into memory which can improve search performance. The lz4 compression used for the payload is significantly faster at decompressing - leaf pages than the previous LZW compression. Whether performance is + better with lz4hc (a high-compression variant of lz4) on the payload + fields depends on the access characteristics of the data and how much of + the index is cached in memory. 
- Whether performance is better with lz4hc (a high-compression variant - of lz4) on the payload fields depends on the access characteristics of the - data and how much of the index is cached in memory. + Compression Levels : - If you attempt to read an index with the inplace compression format - on a system that does not support them, you will receive an error - message. + + + + + + + + + hclevel + + An integer between 2 and 12 to specify the level of + compression. The default is 3. Higher levels increase the + compression, but also increase the compression times. This may be + cost effective depending on the length of time the data is stored, + and the storage costs compared to the compute costs to build the + index. + + + + maxcompression + + The maximum desired compression ratio. This avoids the leaf + nodes getting too large when expanded, but increases the size of + some indexes. The default is 20. + + + + maxrecompress + + Specifies the number of times the entire input dataset + should be recompressed to free up space. Increasing the number + decreases the size of the indexes, and will probably decrease the + decompress time slightly (because there are fewer stream blocks), + but will increase the build time. The default is 1. + + + + + + + + Example: + + Vehicles := DATASET('vehicles', + {STRING2 st,STRING20 city,STRING20 lname},FLAT); + +SearchTerms := RECORD + Vehicles.st; + Vehicles.city; +END; +Payload := RECORD + Vehicles.lname; +END; +VehicleKey := INDEX(Vehicles,SearchTerms,Payload,'vkey::st.city', + COMPRESSED('inplace:lz4shc,compressopt(hclevel=9, + maxcompression=25, + maxrecompress=4)')); +BUILD(VehicleKey); See Also: DATASET, BUILDINDEX, JOIN,