LZMA SDK 9.20
-------------

LZMA SDK provides the documentation, samples, header files, libraries,
and tools you need to develop applications that use LZMA compression.

LZMA is the default and general compression method of the 7z format
in the 7-Zip compression program (www.7-zip.org). LZMA provides a high
compression ratio and very fast decompression.

LZMA is an improved version of the well-known LZ77 compression algorithm.
It was improved to maximize the compression ratio while
keeping high decompression speed and low memory requirements for
decompressing.



LICENSE
-------

LZMA SDK is written and placed in the public domain by Igor Pavlov.

Some code in LZMA SDK is based on public domain code from other developers:
  1) PPMd var.H (2001): Dmitry Shkarin
  2) SHA-256: Wei Dai (Crypto++ library)


LZMA SDK Contents
-----------------

LZMA SDK includes:

  - ANSI-C/C++/C#/Java source code for LZMA compressing and decompressing
  - Compiled file->file LZMA compressing/decompressing program for Windows system


UNIX/Linux version
------------------
To compile the C++ version of the file->file LZMA encoder, go to the directory
CPP/7zip/Bundles/LzmaCon
and call make to recompile it:
  make -f makefile.gcc clean all

In some UNIX/Linux versions you must compile LZMA with static libraries.
To compile with static libraries, you can use
LIB = -lm -static


Files
-----
lzma.txt     - LZMA SDK description (this file)
7zFormat.txt - 7z Format description
7zC.txt      - 7z ANSI-C Decoder description
methods.txt  - Compression method IDs for .7z
lzma.exe     - Compiled file->file LZMA encoder/decoder for Windows
7zr.exe      - 7-Zip with 7z/lzma/xz support.
history.txt  - history of the LZMA SDK


Source code structure
---------------------

C/  - C files
        7zCrc*.*   - CRC code
        Alloc.*    - Memory allocation functions
        Bra*.*     - Filters for x86, IA-64, ARM, ARM-Thumb, PowerPC and SPARC code
        LzFind.*   - Match finder for LZ (LZMA) encoders
        LzFindMt.* - Match finder for LZ (LZMA) encoders for multithreading encoding
        LzHash.h   - Additional file for LZ match finder
        LzmaDec.*  - LZMA decoding
        LzmaEnc.*  - LZMA encoding
        LzmaLib.*  - LZMA Library for DLL calling
        Types.h    - Basic types for other .c files
        Threads.*  - The code for multithreading.

    LzmaLib  - LZMA Library (.DLL for Windows)

    LzmaUtil - LZMA Utility (file->file LZMA encoder/decoder).

    Archive - files related to archiving
      7z     - 7z ANSI-C Decoder

CPP/ -- CPP files

  Common  - common files for C++ projects
  Windows - common files for Windows related code

  7zip    - files related to 7-Zip Project

    Common   - common files for 7-Zip

    Compress - files related to compression/decompression

    Archive  - files related to archiving

      Common   - common files for archive handling
      7z       - 7z C++ Encoder/Decoder

    Bundles    - Modules that are bundles of other modules

      Alone7z           - 7zr.exe: Standalone version of 7z.exe that supports only 7z/LZMA/BCJ/BCJ2
      LzmaCon           - lzma.exe: LZMA compression/decompression
      Format7zR         - 7zr.dll: Reduced version of 7za.dll: extracting/compressing to 7z/LZMA/BCJ/BCJ2
      Format7zExtractR  - 7zxr.dll: Reduced version of 7zxa.dll: extracting from 7z/LZMA/BCJ/BCJ2.

    UI        - User Interface files

      Client7z - Test application for 7za.dll, 7zr.dll, 7zxr.dll
      Common   - Common UI files
      Console  - Code for console archiver



CS/ - C# files
  7zip
    Common   - some common files for 7-Zip
    Compress - files related to compression/decompression
      LZ         - files related to LZ (Lempel-Ziv) compression algorithm
      LZMA       - LZMA compression/decompression
      LzmaAlone  - file->file LZMA compression/decompression
      RangeCoder - Range Coder (special code of compression/decompression)

Java/  - Java files
  SevenZip
    Compression - files related to compression/decompression
      LZ         - files related to LZ (Lempel-Ziv) compression algorithm
      LZMA       - LZMA compression/decompression
      RangeCoder - Range Coder (special code of compression/decompression)


C/C++ source code of LZMA SDK is part of 7-Zip project.
7-Zip source code can be downloaded from 7-Zip's SourceForge page:

  http://sourceforge.net/projects/sevenzip/



LZMA features
-------------
  - Variable dictionary size (up to 1 GB)
  - Estimated compressing speed: about 2 MB/s on 2 GHz CPU
  - Estimated decompressing speed:
      - 20-30 MB/s on 2 GHz Core 2 or AMD Athlon 64
      - 1-2 MB/s on 200 MHz ARM, MIPS, PowerPC or other simple RISC
  - Small memory requirements for decompressing (16 KB + DictionarySize)
  - Small code size for decompressing: 5-8 KB

LZMA decoder uses only integer operations and can be
implemented on any modern 32-bit CPU (or on a 16-bit CPU with some conditions).

Some critical operations that affect the speed of LZMA decompression:
  1) 32*16 bit integer multiply
  2) Mispredicted branches (penalty mostly depends on pipeline length)
  3) 32-bit shift and arithmetic operations

The speed of LZMA decompressing mostly depends on CPU speed.
Memory speed matters much less. But if your CPU has a small data cache,
the overall weight of memory speed will slightly increase.


How To Use
----------

Using LZMA encoder/decoder executable
-------------------------------------

Usage:  LZMA <e|d> inputFile outputFile [<switches>...]

  e: encode file

  d: decode file

  b: Benchmark. There are two tests: compressing and decompressing
     with LZMA method. Benchmark shows rating in MIPS (million
     instructions per second). Rating value is calculated from
     measured speed and it is normalized with Intel's Core 2 results.
     Benchmark also checks for possible hardware errors (RAM
     errors in most cases). Benchmark uses these settings:
     (-a1, -d21, -fb32, -mfbt4). You can change only the -d parameter.
     You can also change the number of iterations. Example for 30 iterations:
       LZMA b 30
     Default number of iterations is 10.

<Switches>


  -a{N}:  set compression mode 0 = fast, 1 = normal
          default: 1 (normal)

  -d{N}:  Sets Dictionary size - [0, 30], default: 23 (8MB)
          The maximum value for dictionary size is 1 GB = 2^30 bytes.
          Dictionary size is calculated as DictionarySize = 2^N bytes.
          To decompress a file compressed by LZMA method with dictionary
          size D = 2^N you need about D bytes of memory (RAM).

  -fb{N}: set number of fast bytes - [5, 273], default: 128
          Usually a bigger number gives a slightly better compression ratio
          and a slower compression process.

  -lc{N}: set number of literal context bits - [0, 8], default: 3
          Sometimes lc=4 gives gain for big files.

  -lp{N}: set number of literal pos bits - [0, 4], default: 0
          lp switch is intended for periodical data when the period is
          equal to 2^N. For example, for 32-bit (4 bytes)
          periodical data you can use lp=2. Often it's better to set lc0,
          if you change lp switch.

  -pb{N}: set number of pos bits - [0, 4], default: 2
          pb switch is intended for periodical data
          when the period is equal to 2^N.

  -mf{MF_ID}: set Match Finder. Default: bt4.
              Algorithms from the hc* group don't provide a good compression
              ratio, but they often work pretty fast in combination with
              fast mode (-a0).

              Memory requirements depend on dictionary size
              (parameter "d" in table below).

               MF_ID     Memory                   Description

                bt2    d *  9.5 + 4MB  Binary Tree with 2 bytes hashing.
                bt3    d * 11.5 + 4MB  Binary Tree with 3 bytes hashing.
                bt4    d * 11.5 + 4MB  Binary Tree with 4 bytes hashing.
                hc4    d *  7.5 + 4MB  Hash Chain with 4 bytes hashing.

  -eos:   write End Of Stream marker. By default LZMA doesn't write
          eos marker, since LZMA decoder knows uncompressed size
          stored in .lzma file header.

  -si:    Read data from stdin (it will write End Of Stream marker).
  -so:    Write data to stdout


Examples:

1) LZMA e file.bin file.lzma -d16 -lc0

compresses file.bin to file.lzma with 64 KB dictionary (2^16=64K)
and 0 literal context bits. -lc0 reduces memory requirements
for decompression.


2) LZMA e file.bin file.lzma -lc0 -lp2

compresses file.bin to file.lzma with settings suitable
for 32-bit periodical data (for example, ARM or MIPS code).

3) LZMA d file.lzma file.bin

decompresses file.lzma to file.bin.


Compression ratio hints
-----------------------

Recommendations
---------------

To increase the compression ratio for LZMA compressing it's desirable
to have aligned data (if it's possible) and also to arrange
data so that code is grouped in one place and data is
grouped in another place (it's better than such mixing: code, data, code,
data, ...).


Filters
-------
You can increase the compression ratio for some data types by using
special filters before compressing. For example, it's possible to
increase the compression ratio by 5-10% for code for these CPU ISAs:
x86, IA-64, ARM, ARM-Thumb, PowerPC, SPARC.

You can find C source code of such filters in C/Bra*.* files.

You can check the compression ratio gain of these filters with the following
7-Zip commands (example for ARM code):
No filter:
  7z a a1.7z a.bin -m0=lzma

With filter for little-endian ARM code:
  7z a a2.7z a.bin -m0=arm -m1=lzma

It works as follows:
Compressing   = Filter_encoding + LZMA_encoding
Decompressing = LZMA_decoding + Filter_decoding

Compressing and decompressing speed of such filters is very high,
so it will not increase decompressing time too much.
Moreover, it reduces decompression time for LZMA_decoding,
since compression ratio with filtering is higher.

These filters convert CALL (calling procedure) instructions
from relative offsets to absolute addresses, so such data becomes more
compressible.

For some ISAs (for example, for MIPS) it's impossible to get gain from such a filter.
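
As a rough illustration of the CALL conversion described above, the following
C sketch rewrites the 32-bit operand after every byte that looks like an x86
CALL opcode (0xE8) from a relative offset to an absolute address, and back.
It is a simplified example, not the code from C/Bra*.*: the name call_filter
and its parameters are made up for this sketch, and the real x86 filter
applies additional checks before it converts an operand.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
   buf       - code to convert in place
   len       - number of bytes in buf
   start_pos - address assigned to buf[0]
   encoding  - nonzero: relative -> absolute, zero: absolute -> relative */
void call_filter(uint8_t *buf, size_t len, uint32_t start_pos, int encoding)
{
  size_t i = 0;
  assert(buf != NULL || len == 0);
  while (i + 5 <= len)
  {
    uint32_t v, next;
    if (buf[i] != 0xE8)
    {
      i++;
      continue;
    }
    /* read the little-endian 32-bit operand that follows the opcode */
    v = (uint32_t)buf[i + 1]
      | ((uint32_t)buf[i + 2] << 8)
      | ((uint32_t)buf[i + 3] << 16)
      | ((uint32_t)buf[i + 4] << 24);
    /* x86 relative offsets are counted from the next instruction */
    next = start_pos + (uint32_t)i + 5u;
    v = encoding ? v + next : v - next;
    buf[i + 1] = (uint8_t)v;
    buf[i + 2] = (uint8_t)(v >> 8);
    buf[i + 3] = (uint8_t)(v >> 16);
    buf[i + 4] = (uint8_t)(v >> 24);
    i += 5; /* skip the operand so both directions scan the same positions */
  }
}
```

Two calls to the same target from different places have different relative
offsets but the same absolute address, which is why converted code tends to
contain more repeated byte sequences. Calling call_filter(buf, len, 0, 1) and
then call_filter(buf, len, 0, 0) restores the original bytes.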


LZMA compressed file format
---------------------------
Offset Size Description
  0     1   Special LZMA properties (lc, lp, pb in encoded form)
  1     4   Dictionary size (little endian)
  5     8   Uncompressed size (little endian). -1 means unknown size
 13         Compressed data


ANSI-C LZMA Decoder
~~~~~~~~~~~~~~~~~~~

Please note that interfaces for ANSI-C code were changed in LZMA SDK 4.58.
If you want to use the old interfaces you can download a previous version of LZMA SDK
from the sourceforge.net site.

To use ANSI-C LZMA Decoder you need the following files:
1) LzmaDec.h + LzmaDec.c + Types.h
LzmaUtil/LzmaUtil.c is an example application that uses these files.


Memory requirements for LZMA decoding
-------------------------------------

Stack usage of LZMA decoding function for local variables is not
larger than 200-400 bytes.

LZMA Decoder uses a dictionary buffer and an internal state structure.
Internal state structure consumes
  state_size = (4 + 1.5 * 2^(lc + lp)) KB
by default (lc=3, lp=0), state_size = 16 KB.
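
The header layout and the state_size formula above can be tied together in a
short C sketch. It reads the 13-byte .lzma header and computes the state size
in bytes. The names LzmaHeaderInfo, ParseLzmaHeader and LzmaStateSizeBytes are
made up for this example and are not part of the SDK. The sketch assumes the
properties byte is packed as (pb * 5 + lp) * 9 + lc, which is how
LzmaDec.c is understood to decode it; check the SDK source before relying
on that.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical structure, for illustration only. */
typedef struct
{
  unsigned lc, lp, pb;  /* decoded from the properties byte */
  uint32_t dictSize;    /* dictionary size in bytes */
  uint64_t unpackSize;  /* all bits set (-1) means unknown size */
} LzmaHeaderInfo;

/* Returns 1 on success, 0 if the properties byte is out of range. */
int ParseLzmaHeader(const uint8_t *header, LzmaHeaderInfo *info)
{
  unsigned d, i;
  assert(header != NULL && info != NULL);
  d = header[0];
  if (d >= 9 * 5 * 5)
    return 0;
  info->lc = d % 9;
  d /= 9;
  info->lp = d % 5;
  info->pb = d / 5;
  info->dictSize = 0;
  for (i = 0; i < 4; i++)          /* offset 1, 4 bytes, little endian */
    info->dictSize |= (uint32_t)header[1 + i] << (8 * i);
  info->unpackSize = 0;
  for (i = 0; i < 8; i++)          /* offset 5, 8 bytes, little endian */
    info->unpackSize |= (uint64_t)header[5 + i] << (8 * i);
  return 1;
}

/* state_size = (4 + 1.5 * 2^(lc + lp)) KB, expressed in bytes:
   4 KB = 4096 bytes and 1.5 KB = 1536 bytes. */
uint32_t LzmaStateSizeBytes(unsigned lc, unsigned lp)
{
  return 4096u + (1536u << (lc + lp));
}
```

For the default settings (lc=3, lp=0) LzmaStateSizeBytes returns 16384, which
matches the 16 KB figure given above.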


How To decompress data
----------------------

LZMA Decoder (ANSI-C version) now supports 2 interfaces:
1) Single-call Decompressing
2) Multi-call State Decompressing (zlib-like interface)

You must use an external allocator:
Example:
void *SzAlloc(void *p, size_t size) { p = p; return malloc(size); }
void SzFree(void *p, void *address) { p = p; free(address); }
ISzAlloc alloc = { SzAlloc, SzFree };

The p = p; statement is there only to disable compiler warnings about the unused parameter.


Single-call Decompressing
-------------------------
When to use: RAM->RAM decompressing
Compile files: LzmaDec.h + LzmaDec.c + Types.h
Compile defines: no defines
Memory Requirements:
  - Input buffer: compressed size
  - Output buffer: uncompressed size
  - LZMA Internal Structures: state_size (16 KB for default settings)

Interface:
  int LzmaDecode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen,
      const Byte *propData, unsigned propSize, ELzmaFinishMode finishMode,
      ELzmaStatus *status, ISzAlloc *alloc);
  In:
    dest     - output data
    destLen  - output data size
    src      - input data
    srcLen   - input data size
    propData - LZMA properties  (5 bytes)
    propSize - size of propData buffer (5 bytes)
    finishMode - It has meaning only if the decoding reaches output limit (*destLen).
         LZMA_FINISH_ANY - Decode just destLen bytes.
         LZMA_FINISH_END - Stream must be finished after (*destLen).
                           You can use LZMA_FINISH_END, when you know that
                           current output buffer covers last bytes of stream.
    alloc    - Memory allocator.

  Out:
    destLen  - processed output size
    srcLen   - processed input size

  Output:
    SZ_OK
      status:
        LZMA_STATUS_FINISHED_WITH_MARK
        LZMA_STATUS_NOT_FINISHED
        LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK
    SZ_ERROR_DATA - Data error
    SZ_ERROR_MEM  - Memory allocation error
    SZ_ERROR_UNSUPPORTED - Unsupported properties
    SZ_ERROR_INPUT_EOF - It needs more bytes in input buffer (src).

  If LZMA decoder sees end_marker before reaching output limit, it returns OK result,
  and output value of destLen will be less than output buffer size limit.

  You can use multiple checks to test data integrity after full decompression:
    1) Check Result and "status" variable.
    2) Check that output(destLen) = uncompressedSize, if you know real uncompressedSize.
    3) Check that output(srcLen) = compressedSize, if you know real compressedSize.
       You must use correct finish mode in that case.


Multi-call State Decompressing (zlib-like interface)
----------------------------------------------------

When to use: file->file decompressing
Compile files: LzmaDec.h + LzmaDec.c + Types.h

Memory Requirements:
 - Buffer for input stream: any size (for example, 16 KB)
 - Buffer for output stream: any size (for example, 16 KB)
 - LZMA Internal Structures: state_size (16 KB for default settings)
 - LZMA dictionary (dictionary size is encoded in LZMA properties header)

1) read LZMA properties (5 bytes) and uncompressed size (8 bytes, little-endian) to header:
   unsigned char header[LZMA_PROPS_SIZE + 8];
   ReadFile(inFile, header, sizeof(header));

2) Allocate CLzmaDec structures (state + dictionary) using LZMA properties

  CLzmaDec state;
  LzmaDec_Constr(&state);
  res = LzmaDec_Allocate(&state, header, LZMA_PROPS_SIZE, &g_Alloc);
  if (res != SZ_OK)
    return res;

3) Init LzmaDec structure before any new LZMA stream. And call LzmaDec_DecodeToBuf in loop

  LzmaDec_Init(&state);
  for (;;)
  {
    ...
    int res = LzmaDec_DecodeToBuf(CLzmaDec *p, Byte *dest, SizeT *destLen,
        const Byte *src, SizeT *srcLen, ELzmaFinishMode finishMode);
    ...
  }


4) Free all allocated structures
  LzmaDec_Free(&state, &g_Alloc);

For a full code example, look at the C/LzmaUtil/LzmaUtil.c code.


How To compress data
--------------------

Compile files: LzmaEnc.h + LzmaEnc.c + Types.h +
LzFind.c + LzFind.h + LzFindMt.c + LzFindMt.h + LzHash.h

Memory Requirements:
  - (dictSize * 11.5 + 6 MB) + state_size

Lzma Encoder can use two memory allocators:
1) alloc - for small arrays.
2) allocBig - for big arrays.

For example, you can use Large RAM Pages (2 MB) in allocBig allocator for
better compression speed. Note that Windows has a bad implementation of
Large RAM Pages.
It's OK to use the same allocator for alloc and allocBig.
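
To make the encoder memory formula above concrete, here is a small C sketch
that evaluates (dictSize * 11.5 + 6 MB) + state_size with integer arithmetic.
The function name LzmaEncMemEstimate is made up for this example; the result
is only the estimate given by the formula, not a measurement of what the
encoder actually allocates.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper, for illustration only.
   dictSize  - dictionary size in bytes (at most 1 GB = 2^30)
   stateSize - internal state size in bytes (16384 for default settings) */
uint64_t LzmaEncMemEstimate(uint32_t dictSize, uint32_t stateSize)
{
  const uint64_t sixMB = 6u * 1024u * 1024u;
  assert(dictSize <= (1u << 30));
  /* dictSize * 11.5 is computed as dictSize * 23 / 2 to stay in integers */
  return (uint64_t)dictSize * 23 / 2 + sixMB + stateSize;
}
```

For an 8 MB dictionary and the default 16 KB state the estimate is
102776832 bytes, a little over 98 MB.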


Single-call Compression with callbacks
--------------------------------------

Check C/LzmaUtil/LzmaUtil.c as an example.

When to use: file->file compressing

1) you must implement callback structures for interfaces:
ISeqInStream
ISeqOutStream
ICompressProgress
ISzAlloc

static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); }
static void SzFree(void *p, void *address) { p = p; MyFree(address); }
static ISzAlloc g_Alloc = { SzAlloc, SzFree };

  CFileSeqInStream inStream;
  CFileSeqOutStream outStream;

  inStream.funcTable.Read = MyRead;
  inStream.file = inFile;
  outStream.funcTable.Write = MyWrite;
  outStream.file = outFile;


2) Create CLzmaEncHandle object;

  CLzmaEncHandle enc;

  enc = LzmaEnc_Create(&g_Alloc);
  if (enc == 0)
    return SZ_ERROR_MEM;


3) initialize CLzmaEncProps properties;

  CLzmaEncProps props;
  LzmaEncProps_Init(&props);

  Then you can change some properties in that structure.

4) Send LZMA properties to LZMA Encoder

  res = LzmaEnc_SetProps(enc, &props);

5) Write encoded properties to header

    Byte header[LZMA_PROPS_SIZE + 8];
    size_t headerSize = LZMA_PROPS_SIZE;
    UInt64 fileSize;
    int i;

    res = LzmaEnc_WriteProperties(enc, header, &headerSize);
    fileSize = MyGetFileLength(inFile);
    for (i = 0; i < 8; i++)
      header[headerSize++] = (Byte)(fileSize >> (8 * i));
    MyWriteFileAndCheck(outFile, header, headerSize);

6) Call encoding function:
      res = LzmaEnc_Encode(enc, &outStream.funcTable, &inStream.funcTable,
        NULL, &g_Alloc, &g_Alloc);

7) Destroy LZMA Encoder Object
  LzmaEnc_Destroy(enc, &g_Alloc, &g_Alloc);


If a callback function returns an error code, LzmaEnc_Encode also returns that code,
or it can return a code like SZ_ERROR_READ, SZ_ERROR_WRITE or SZ_ERROR_PROGRESS.
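
Step 5 above stores the uncompressed size as 8 little-endian bytes right after
the encoded properties. The following self-contained C sketch isolates just
that byte packing, together with the matching read that a decoder would do.
The names AppendSizeLE and ReadSizeLE are made up for this example and use
standard C types instead of the SDK's Byte and UInt64.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes size as 8 little-endian bytes to
   header[pos] .. header[pos + 7] and returns the new header length. */
size_t AppendSizeLE(uint8_t *header, size_t pos, uint64_t size)
{
  int i;
  assert(header != NULL);
  for (i = 0; i < 8; i++)
    header[pos++] = (uint8_t)(size >> (8 * i));
  return pos;
}

/* Hypothetical helper: reads the same 8-byte field back. */
uint64_t ReadSizeLE(const uint8_t *header, size_t pos)
{
  uint64_t size = 0;
  int i;
  assert(header != NULL);
  for (i = 0; i < 8; i++)
    size |= (uint64_t)header[pos + i] << (8 * i);
  return size;
}
```

With a 5-byte properties block, AppendSizeLE(header, 5, fileSize) returns 13,
the offset at which the compressed data starts in a .lzma file.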


Single-call RAM->RAM Compression
--------------------------------

Single-call RAM->RAM Compression is similar to Compression with callbacks,
but you provide pointers to buffers instead of pointers to stream callbacks:

SRes LzmaEncode(Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen,
  CLzmaEncProps *props, Byte *propsEncoded, SizeT *propsSize, int writeEndMark,
  ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);

Return code:
  SZ_OK               - OK
  SZ_ERROR_MEM        - Memory allocation error
  SZ_ERROR_PARAM      - Incorrect parameter
  SZ_ERROR_OUTPUT_EOF - output buffer overflow
  SZ_ERROR_THREAD     - errors in multithreading functions (only for Mt version)



Defines
-------

_LZMA_SIZE_OPT - Enable some optimizations in LZMA Decoder to get smaller executable code.

_LZMA_PROB32   - It can increase the speed on some 32-bit CPUs, but memory usage for
                 some structures will be doubled in that case.

_LZMA_UINT32_IS_ULONG  - Define it if int is 16-bit on your compiler and long is 32-bit.

_LZMA_NO_SYSTEM_SIZE_T  - Define it if you don't want to use size_t type.


_7ZIP_PPMD_SUPPPORT - Define it if you want to support PPMD method in ANSI-C .7z decoder.


C++ LZMA Encoder/Decoder
~~~~~~~~~~~~~~~~~~~~~~~~
C++ LZMA code uses COM-like interfaces. So if you want to use it,
you can study the basics of COM/OLE.
C++ LZMA code is just a wrapper over the ANSI-C code.


C++ Notes
~~~~~~~~~
If you use some C++ code folders in 7-Zip (for example, C++ code for .7z handling),
you must check that you work correctly with the "new" operator.
7-Zip can be compiled with MSVC 6.0, which doesn't throw an exception from the "new" operator.
So 7-Zip uses "CPP\Common\NewHandler.cpp" that redefines the "new" operator:
void *operator new(size_t size)
{
  void *p = ::malloc(size);
  if (p == 0)
    throw CNewException();
  return p;
}
If you use an MSVC version that throws an exception from the "new" operator, you can compile without
"NewHandler.cpp". So the standard exception will be used. Actually some code of
7-Zip catches any exception in internal code and converts it to an HRESULT code.
So you don't need to catch CNewException, if you call COM interfaces of 7-Zip.

---

http://www.7-zip.org
http://www.7-zip.org/sdk.html
http://www.7-zip.org/support.html