LZMA SDK 9.20
-------------

LZMA SDK provides the documentation, samples, header files, libraries,
and tools you need to develop applications that use LZMA compression.

LZMA is the default and general-purpose compression method of the 7z format
in the 7-Zip compression program (www.7-zip.org). LZMA provides a high
compression ratio and very fast decompression.

LZMA is an improved version of the well-known LZ77 compression algorithm.
It was improved to maximize the compression ratio while
keeping high decompression speed and low memory requirements for
decompressing.



LICENSE
-------

LZMA SDK is written and placed in the public domain by Igor Pavlov.

Some code in LZMA SDK is based on public domain code from other developers:
  1) PPMd var.H (2001): Dmitry Shkarin
  2) SHA-256: Wei Dai (Crypto++ library)


LZMA SDK Contents
-----------------

LZMA SDK includes:

  - ANSI-C/C++/C#/Java source code for LZMA compressing and decompressing
  - Compiled file->file LZMA compressing/decompressing program for Windows system


UNIX/Linux version
------------------
To compile the C++ version of the file->file LZMA encoder, go to the directory
CPP/7zip/Bundles/LzmaCon
and call make to recompile it:
  make -f makefile.gcc clean all

In some UNIX/Linux versions you must compile LZMA with static libraries.
To compile with static libraries, you can use
LIB = -lm -static


Files
-----
lzma.txt     - LZMA SDK description (this file)
7zFormat.txt - 7z Format description
7zC.txt      - 7z ANSI-C Decoder description
methods.txt  - Compression method IDs for .7z
lzma.exe     - Compiled file->file LZMA encoder/decoder for Windows
7zr.exe      - 7-Zip with 7z/lzma/xz support.
history.txt  - history of the LZMA SDK


Source code structure
---------------------

C/  - C files
        7zCrc*.*   - CRC code
        Alloc.*    - Memory allocation functions
        Bra*.*     - Filters for x86, IA-64, ARM, ARM-Thumb, PowerPC and SPARC code
        LzFind.*   - Match finder for LZ (LZMA) encoders
        LzFindMt.* - Match finder for LZ (LZMA) encoders for multithreading encoding
        LzHash.h   - Additional file for LZ match finder
        LzmaDec.*  - LZMA decoding
        LzmaEnc.*  - LZMA encoding
        LzmaLib.*  - LZMA Library for DLL calling
        Types.h    - Basic types for other .c files
        Threads.*  - The code for multithreading.

    LzmaLib  - LZMA Library (.DLL for Windows)

    LzmaUtil - LZMA Utility (file->file LZMA encoder/decoder).

    Archive - files related to archiving
      7z     - 7z ANSI-C Decoder

CPP/ -- CPP files

  Common  - common files for C++ projects
  Windows - common files for Windows related code

  7zip    - files related to 7-Zip Project

    Common   - common files for 7-Zip

    Compress - files related to compression/decompression

    Archive - files related to archiving

      Common   - common files for archive handling
      7z       - 7z C++ Encoder/Decoder

    Bundles    - Modules that are bundles of other modules

      Alone7z           - 7zr.exe: Standalone version of 7z.exe that supports only 7z/LZMA/BCJ/BCJ2
      LzmaCon           - lzma.exe: LZMA compression/decompression
      Format7zR         - 7zr.dll: Reduced version of 7za.dll: extracting/compressing to 7z/LZMA/BCJ/BCJ2
      Format7zExtractR  - 7zxr.dll: Reduced version of 7zxa.dll: extracting from 7z/LZMA/BCJ/BCJ2.

    UI        - User Interface files

      Client7z - Test application for 7za.dll, 7zr.dll, 7zxr.dll
      Common   - Common UI files
      Console  - Code for console archiver



CS/ - C# files
  7zip
    Common   - some common files for 7-Zip
    Compress - files related to compression/decompression
      LZ           - files related to LZ (Lempel-Ziv) compression algorithm
      LZMA         - LZMA compression/decompression
      LzmaAlone    - file->file LZMA compression/decompression
      RangeCoder   - Range Coder (special code of compression/decompression)

Java/  - Java files
  SevenZip
    Compression    - files related to compression/decompression
      LZ           - files related to LZ (Lempel-Ziv) compression algorithm
      LZMA         - LZMA compression/decompression
      RangeCoder   - Range Coder (special code of compression/decompression)


C/C++ source code of LZMA SDK is part of 7-Zip project.
7-Zip source code can be downloaded from 7-Zip's SourceForge page:

  http://sourceforge.net/projects/sevenzip/



LZMA features
-------------
  - Variable dictionary size (up to 1 GB)
  - Estimated compressing speed: about 2 MB/s on 2 GHz CPU
  - Estimated decompressing speed:
      - 20-30 MB/s on 2 GHz Core 2 or AMD Athlon 64
      - 1-2 MB/s on 200 MHz ARM, MIPS, PowerPC or other simple RISC
  - Small memory requirements for decompressing (16 KB + DictionarySize)
  - Small code size for decompressing: 5-8 KB

LZMA decoder uses only integer operations and can be
implemented on any modern 32-bit CPU (or on a 16-bit CPU with some conditions).

Some critical operations that affect the speed of LZMA decompression:
  1) 32*16 bit integer multiply
  2) Mispredicted branches (penalty mostly depends on pipeline length)
  3) 32-bit shift and arithmetic operations

The speed of LZMA decompressing mostly depends on CPU speed.
Memory speed matters little. But if your CPU has a small data cache,
the overall weight of memory speed will slightly increase.


How To Use
----------

Using LZMA encoder/decoder executable
-------------------------------------

Usage:  LZMA <e|d> inputFile outputFile [<switches>...]

  e: encode file

  d: decode file

  b: Benchmark. There are two tests: compressing and decompressing
     with LZMA method. Benchmark shows rating in MIPS (million
     instructions per second). Rating value is calculated from
     measured speed and is normalized with Intel's Core 2 results.
     Benchmark also checks for possible hardware errors (RAM
     errors in most cases). Benchmark uses these settings:
     (-a1, -d21, -fb32, -mfbt4). You can change only the -d parameter.
     You can also change the number of iterations. Example for 30 iterations:
       LZMA b 30
     Default number of iterations is 10.

<Switches>


  -a{N}:  set compression mode 0 = fast, 1 = normal
          default: 1 (normal)

  -d{N}:  set dictionary size - [0, 30], default: 23 (8MB)
          The maximum value for dictionary size is 1 GB = 2^30 bytes.
          Dictionary size is calculated as DictionarySize = 2^N bytes.
          To decompress a file compressed by LZMA method with dictionary
          size D = 2^N you need about D bytes of memory (RAM).

  -fb{N}: set number of fast bytes - [5, 273], default: 128
          Usually a larger number gives a slightly better compression ratio
          and a slower compression process.

  -lc{N}: set number of literal context bits - [0, 8], default: 3
          Sometimes lc=4 gives gain for big files.

  -lp{N}: set number of literal pos bits - [0, 4], default: 0
          lp switch is intended for periodical data when the period is
          equal to 2^N. For example, for 32-bit (4 bytes)
          periodical data you can use lp=2. Often it's better to set lc0,
          if you change lp switch.

  -pb{N}: set number of pos bits - [0, 4], default: 2
          pb switch is intended for periodical data
          when the period is equal to 2^N.

  -mf{MF_ID}: set Match Finder. Default: bt4.
              Algorithms from the hc* group don't provide a good compression
              ratio, but they often work pretty fast in combination with
              fast mode (-a0).

              Memory requirements depend on dictionary size
              (parameter "d" in table below).

               MF_ID     Memory                   Description

                bt2    d *  9.5 + 4MB  Binary Tree with 2 bytes hashing.
                bt3    d * 11.5 + 4MB  Binary Tree with 3 bytes hashing.
                bt4    d * 11.5 + 4MB  Binary Tree with 4 bytes hashing.
                hc4    d *  7.5 + 4MB  Hash Chain with 4 bytes hashing.

  -eos:   write End Of Stream marker. By default LZMA doesn't write
          eos marker, since LZMA decoder knows uncompressed size
          stored in .lzma file header.

  -si:    Read data from stdin (it will write End Of Stream marker).
  -so:    Write data to stdout

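The match finder memory table can be turned into a quick estimate. The
sketch below is only an illustration: the function name
lzma_mf_memory_estimate is made up for this example, and it assumes that
"4MB" in the table means 4 * 2^20 bytes.

```c
#include <stdint.h>
#include <string.h>

/* Estimate encoder memory (in bytes) for a match finder, using the
   "d * factor + 4MB" figures from the -mf table.
   Returns 0 for an unknown MF_ID or an out-of-range dictionary exponent. */
uint64_t lzma_mf_memory_estimate(const char *mf_id, unsigned dict_bits)
{
    /* Factors are stored doubled so that 9.5, 11.5 and 7.5 stay integers. */
    static const struct { const char *id; unsigned twice_factor; } table[] = {
        { "bt2", 19 }, { "bt3", 23 }, { "bt4", 23 }, { "hc4", 15 },
    };
    uint64_t d;
    size_t i;

    if (dict_bits > 30)
        return 0;                     /* -d accepts [0, 30] */
    d = (uint64_t)1 << dict_bits;     /* DictionarySize = 2^N bytes */

    for (i = 0; i < sizeof(table) / sizeof(table[0]); i++)
        if (strcmp(mf_id, table[i].id) == 0)
            /* Assumption: "4MB" is taken as 4 * 2^20 bytes. */
            return d * table[i].twice_factor / 2 + ((uint64_t)4 << 20);
    return 0;
}
```

With the defaults (-mfbt4 -d23) this gives 96 MB, which is the 8 MB
dictionary times 11.5 plus the fixed 4 MB.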

Examples:

1) LZMA e file.bin file.lzma -d16 -lc0

compresses file.bin to file.lzma with 64 KB dictionary (2^16=64K)
and 0 literal context bits. -lc0 reduces memory requirements
for decompression.


2) LZMA e file.bin file.lzma -lc0 -lp2

compresses file.bin to file.lzma with settings suitable
for 32-bit periodical data (for example, ARM or MIPS code).

3) LZMA d file.lzma file.bin

decompresses file.lzma to file.bin.

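The -d{N} switch used in example 1 maps N to a dictionary size of 2^N
bytes. A minimal sketch of that mapping follows; parse_dict_switch is a
hypothetical helper written for this note, not a function of the lzma
executable or the SDK.

```c
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Parse a "-d{N}" switch and store the dictionary size (2^N bytes).
   Returns 1 on success, 0 if the text is not a valid -d switch. */
int parse_dict_switch(const char *arg, uint32_t *dict_size)
{
    char *end;
    unsigned long n;

    /* Require the "-d" prefix followed by at least one digit. */
    if (strncmp(arg, "-d", 2) != 0 || arg[2] < '0' || arg[2] > '9')
        return 0;
    n = strtoul(arg + 2, &end, 10);
    if (*end != '\0' || n > 30)       /* valid range is [0, 30] */
        return 0;
    *dict_size = (uint32_t)1 << n;
    return 1;
}
```

For "-d16" the result is 65536 bytes (64 KB), and the default "-d23"
gives 8 MB.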

Compression ratio hints
-----------------------

Recommendations
---------------

To increase the compression ratio for LZMA compressing, it's desirable
to have aligned data (if possible) and to arrange the data so that
code is grouped in one place and data is grouped in another place
(this is better than mixing them: code, data, code,
data, ...).


Filters
-------
You can increase the compression ratio for some data types by using
special filters before compressing. For example, it's possible to
increase the compression ratio by 5-10% for code for these CPU ISAs:
x86, IA-64, ARM, ARM-Thumb, PowerPC, SPARC.

You can find C source code of such filters in C/Bra*.* files.

You can check the compression ratio gain of these filters with the
following 7-Zip commands (example for ARM code):
No filter:
  7z a a1.7z a.bin -m0=lzma

With filter for little-endian ARM code:
  7z a a2.7z a.bin -m0=arm -m1=lzma

It works as follows:
Compressing    = Filter_encoding + LZMA_encoding
Decompressing  = LZMA_decoding + Filter_decoding

Compressing and decompressing speed of such filters is very high,
so it will not increase decompressing time too much.
Moreover, it reduces decompression time for LZMA_decoding,
since the compression ratio with filtering is higher.

These filters convert CALL (calling procedure) instructions
from relative offsets to absolute addresses, so such data becomes more
compressible.

For some ISAs (for example, for MIPS) such a filter gives no gain.

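To illustrate the relative-to-absolute conversion described above, here is
a deliberately simplified sketch for x86-style "E8 rel32" CALL
instructions. The name toy_call_filter is hypothetical, and this is not
the algorithm used in the C/Bra*.* files, which are more selective about
what they convert.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy CALL filter for "E8 rel32" instructions.
   encoding != 0: relative offset  -> absolute address
   encoding == 0: absolute address -> relative offset
   start_pos is the address assigned to buf[0]. */
void toy_call_filter(uint8_t *buf, size_t size, uint32_t start_pos, int encoding)
{
    size_t i = 0;

    while (size >= 5 && i <= size - 5) {
        uint32_t v, next;

        if (buf[i] != 0xE8) {         /* not a CALL opcode */
            i++;
            continue;
        }
        /* Read the 32-bit little-endian operand. */
        v = (uint32_t)buf[i + 1] | ((uint32_t)buf[i + 2] << 8) |
            ((uint32_t)buf[i + 3] << 16) | ((uint32_t)buf[i + 4] << 24);
        /* Address of the instruction that follows the CALL. */
        next = start_pos + (uint32_t)i + 5;
        v = encoding ? v + next : v - next;

        buf[i + 1] = (uint8_t)v;
        buf[i + 2] = (uint8_t)(v >> 8);
        buf[i + 3] = (uint8_t)(v >> 16);
        buf[i + 4] = (uint8_t)(v >> 24);
        i += 5;                       /* skip the operand just rewritten */
    }
}
```

Two calls to the same procedure from different places have different
relative offsets but the same absolute address, so after encoding the
operand bytes repeat and LZMA can match them. Decoding walks the buffer
the same way and restores the original bytes.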

LZMA compressed file format
---------------------------
Offset Size Description
  0     1   Special LZMA properties (lc, lp, pb in encoded form)
  1     4   Dictionary size (little endian)
  5     8   Uncompressed size (little endian). -1 means unknown size
 13         Compressed data

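The 13-byte header in the table above can be read with a few shifts. The
sketch below is an illustration only: LzmaFileHeader and
lzma_parse_header are names invented here, and the packing of the
properties byte as (pb * 5 + lp) * 9 + lc is not stated in this file; it
is an assumption based on how the SDK decoder unpacks it, so check it
against LzmaDec.c before relying on it.

```c
#include <stdint.h>

typedef struct {
    unsigned lc, lp, pb;
    uint32_t dict_size;
    uint64_t unpack_size;   /* all bits set (-1) means unknown size */
} LzmaFileHeader;

/* Parse the 13-byte .lzma header.
   Returns 1 on success, 0 if the properties byte is out of range. */
int lzma_parse_header(const uint8_t h[13], LzmaFileHeader *out)
{
    unsigned d = h[0];
    int i;

    /* Assumed packing: props = (pb * 5 + lp) * 9 + lc. */
    if (d >= 9 * 5 * 5)
        return 0;
    out->lc = d % 9;
    d /= 9;
    out->lp = d % 5;
    out->pb = d / 5;

    out->dict_size = 0;               /* offset 1, 4 bytes, little endian */
    for (i = 0; i < 4; i++)
        out->dict_size |= (uint32_t)h[1 + i] << (8 * i);

    out->unpack_size = 0;             /* offset 5, 8 bytes, little endian */
    for (i = 0; i < 8; i++)
        out->unpack_size |= (uint64_t)h[5 + i] << (8 * i);
    return 1;
}
```

Under that assumption, the default settings (lc=3, lp=0, pb=2) give a
first byte of 0x5D.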

ANSI-C LZMA Decoder
~~~~~~~~~~~~~~~~~~~

Please note that interfaces for ANSI-C code were changed in LZMA SDK 4.58.
If you want to use the old interfaces, you can download a previous version of LZMA SDK
from the sourceforge.net site.

To use ANSI-C LZMA Decoder you need the following files:
1) LzmaDec.h + LzmaDec.c + Types.h
LzmaUtil/LzmaUtil.c is an example application that uses these files.


Memory requirements for LZMA decoding
-------------------------------------

Stack usage of LZMA decoding function for local variables is not
larger than 200-400 bytes.

LZMA Decoder uses dictionary buffer and internal state structure.
Internal state structure consumes
  state_size = (4 + 1.5 * 2^(lc + lp)) KB
by default (lc=3, lp=0), state_size = 16 KB.

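The state_size formula can be checked with integer arithmetic, since
1.5 KB is 1536 bytes. The helper names below are made up for this
sketch; the second one simply adds the dictionary, following the
"16 KB + DictionarySize" figure from the feature list.

```c
#include <stddef.h>

/* state_size in bytes: 4 KB + 1.5 KB * 2^(lc + lp). */
size_t lzma_state_size(unsigned lc, unsigned lp)
{
    return 4096 + ((size_t)1536 << (lc + lp));
}

/* Approximate decoder RAM: internal state plus dictionary buffer. */
size_t lzma_decoder_memory(unsigned lc, unsigned lp, size_t dict_size)
{
    return lzma_state_size(lc, lp) + dict_size;
}
```

For the defaults (lc=3, lp=0) this is 4096 + 12288 = 16384 bytes, the
16 KB quoted above. With -lc0 and lp=0 it drops to 5632 bytes, which is
why -lc0 helps on small-memory targets.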

How To decompress data
----------------------

LZMA Decoder (ANSI-C version) now supports 2 interfaces:
1) Single-call Decompressing
2) Multi-call State Decompressing (zlib-like interface)

You must use an external allocator.
Example:
void *SzAlloc(void *p, size_t size) { p = p; return malloc(size); }
void SzFree(void *p, void *address) { p = p; free(address); }
ISzAlloc alloc = { SzAlloc, SzFree };

The p = p; statement is there to suppress compiler warnings about the unused parameter.

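On targets without malloc (a boot loader, for example), the allocator
callbacks can hand out memory from a fixed pool instead. The following is
a sketch under assumptions: DemoAlloc is a stand-in with the same two
callbacks as the example above (real code would use ISzAlloc from
Types.h), and the pool size and all names are arbitrary.

```c
#include <stddef.h>

/* Stand-in for ISzAlloc; the member names are assumptions of this sketch. */
typedef struct {
    void *(*Alloc)(void *p, size_t size);
    void (*Free)(void *p, void *address);
} DemoAlloc;

static unsigned char g_pool[64 * 1024];
static size_t g_pool_used;

static void *PoolAlloc(void *p, size_t size)
{
    void *res;
    (void)p;                                   /* unused, like p = p; above */
    if (size > sizeof(g_pool))
        return NULL;
    size = (size + 15) & ~(size_t)15;          /* round up to a multiple of 16 */
    if (size > sizeof(g_pool) - g_pool_used)
        return NULL;                           /* pool exhausted */
    res = g_pool + g_pool_used;
    g_pool_used += size;
    return res;
}

static void PoolFree(void *p, void *address)
{
    (void)p;
    (void)address;                             /* nothing to do; see PoolReset */
}

/* Release everything at once, e.g. after the decoder has been freed. */
static void PoolReset(void)
{
    g_pool_used = 0;
}

static DemoAlloc g_PoolAlloc = { PoolAlloc, PoolFree };
```

This trades flexibility for simplicity: individual frees are ignored and
the whole pool is reclaimed with PoolReset. The pool must be large enough
for state_size plus, for the multi-call interface, the dictionary.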

Single-call Decompressing
-------------------------
When to use: RAM->RAM decompressing
Compile files: LzmaDec.h + LzmaDec.c + Types.h
Compile defines: no defines
Memory Requirements:
  - Input buffer: compressed size
  - Output buffer: uncompressed size
  - LZMA Internal Structures: state_size (16 KB for default settings)

Interface:
  int LzmaDecode(Byte *dest, SizeT *destLen, const Byte *src, SizeT *srcLen,
      const Byte *propData, unsigned propSize, ELzmaFinishMode finishMode,
      ELzmaStatus *status, ISzAlloc *alloc);
  In:
    dest     - output data
    destLen  - output data size
    src      - input data
    srcLen   - input data size
    propData - LZMA properties  (5 bytes)
    propSize - size of propData buffer (5 bytes)
    finishMode - It has meaning only if the decoding reaches output limit (*destLen).
         LZMA_FINISH_ANY - Decode just destLen bytes.
         LZMA_FINISH_END - Stream must be finished after (*destLen).
                           You can use LZMA_FINISH_END, when you know that
                           current output buffer covers last bytes of stream.
    alloc    - Memory allocator.

  Out:
    destLen  - processed output size
    srcLen   - processed input size

  Returns:
    SZ_OK
      status:
        LZMA_STATUS_FINISHED_WITH_MARK
        LZMA_STATUS_NOT_FINISHED
        LZMA_STATUS_MAYBE_FINISHED_WITHOUT_MARK
    SZ_ERROR_DATA - Data error
    SZ_ERROR_MEM  - Memory allocation error
    SZ_ERROR_UNSUPPORTED - Unsupported properties
    SZ_ERROR_INPUT_EOF - It needs more bytes in input buffer (src).

  If LZMA decoder sees end_marker before reaching output limit, it returns OK result,
  and output value of destLen will be less than output buffer size limit.

  You can use multiple checks to test data integrity after full decompression:
    1) Check Result and "status" variable.
    2) Check that output(destLen) = uncompressedSize, if you know real uncompressedSize.
    3) Check that output(srcLen) = compressedSize, if you know real compressedSize.
       You must use correct finish mode in that case.

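The three integrity checks can be combined into one helper that runs
after the decode call returns. This is a sketch only: the DEMO_ names are
local stand-ins so that the example compiles without the SDK headers,
their numeric values are placeholders rather than the SDK's values, and
decode_looks_complete is a hypothetical name.

```c
#include <stddef.h>

/* Placeholders for SZ_OK and the ELzmaStatus values listed above. */
enum { DEMO_SZ_OK = 0 };
typedef enum {
    DEMO_STATUS_FINISHED_WITH_MARK,
    DEMO_STATUS_NOT_FINISHED,
    DEMO_STATUS_MAYBE_FINISHED_WITHOUT_MARK
} DemoStatus;

/* res, status     - result and status from the decode call
   dest_done       - value of *destLen after the call
   src_done        - value of *srcLen after the call
   unpack_size     - known uncompressed size
   pack_size       - known compressed size
   Returns 1 if all three checks pass, 0 otherwise. */
int decode_looks_complete(int res, DemoStatus status,
                          size_t dest_done, size_t unpack_size,
                          size_t src_done, size_t pack_size)
{
    if (res != DEMO_SZ_OK)
        return 0;                              /* check 1: result */
    if (status == DEMO_STATUS_NOT_FINISHED)
        return 0;                              /* check 1: status */
    if (dest_done != unpack_size)
        return 0;                              /* check 2: output size */
    if (src_done != pack_size)
        return 0;                              /* check 3: input size */
    return 1;
}
```

If only one of the two sizes is known, drop the corresponding comparison.
Real code would pass the SDK's own SZ_OK and LZMA_STATUS_* values instead
of the placeholders.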

Multi-call State Decompressing (zlib-like interface)
----------------------------------------------------

When to use: file->file decompressing
Compile files: LzmaDec.h + LzmaDec.c + Types.h

Memory Requirements:
 - Buffer for input stream: any size (for example, 16 KB)
 - Buffer for output stream: any size (for example, 16 KB)
 - LZMA Internal Structures: state_size (16 KB for default settings)
 - LZMA dictionary (dictionary size is encoded in LZMA properties header)

1) Read LZMA properties (5 bytes) and uncompressed size (8 bytes, little-endian) to header:
   unsigned char header[LZMA_PROPS_SIZE + 8];
   ReadFile(inFile, header, sizeof(header));

2) Allocate CLzmaDec structures (state + dictionary) using LZMA properties

  CLzmaDec state;
  LzmaDec_Constr(&state);
  res = LzmaDec_Allocate(&state, header, LZMA_PROPS_SIZE, &g_Alloc);
  if (res != SZ_OK)
    return res;

3) Init the LzmaDec structure before each new LZMA stream, then call LzmaDec_DecodeToBuf in a loop

  LzmaDec_Init(&state);
  for (;;)
  {
    ...
    /* destLen and srcLen are SizeT: buffer sizes on input, processed sizes on output */
    res = LzmaDec_DecodeToBuf(&state, dest, &destLen,
        src, &srcLen, finishMode);
    ...
  }


4) Free all allocated structures
  LzmaDec_Free(&state, &g_Alloc);

For full code example, look at C/LzmaUtil/LzmaUtil.c code.

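The buffer handling inside the loop of step 3 is the part that is easy to
get wrong. The sketch below shows only that bookkeeping, under loud
assumptions: demo_decode is a stand-in that copies bytes unchanged (it is
not LzmaDec_DecodeToBuf), the MemReader/MemWriter "files" are hypothetical
in-memory helpers, and the tiny buffer sizes are chosen to exercise
partial consumption.

```c
#include <stddef.h>
#include <string.h>

#define IN_BUF_SIZE  16   /* any size works; the text suggests e.g. 16 KB */
#define OUT_BUF_SIZE 8

/* Hypothetical in-memory "files" so the sketch needs no real I/O. */
typedef struct { const unsigned char *data; size_t size, pos; } MemReader;
typedef struct { unsigned char *data; size_t cap, pos; } MemWriter;

static size_t mem_read(MemReader *r, unsigned char *buf, size_t n)
{
    size_t left = r->size - r->pos;
    if (n > left)
        n = left;
    memcpy(buf, r->data + r->pos, n);
    r->pos += n;
    return n;
}

static int mem_write(MemWriter *w, const unsigned char *buf, size_t n)
{
    if (n > w->cap - w->pos)
        return 0;
    memcpy(w->data + w->pos, buf, n);
    w->pos += n;
    return 1;
}

/* Stand-in for the decoder call: copies bytes unchanged. Like the call in
   step 3, it takes buffer sizes in *dest_len / *src_len and overwrites
   them with the amounts actually produced / consumed. */
static int demo_decode(unsigned char *dest, size_t *dest_len,
                       const unsigned char *src, size_t *src_len)
{
    size_t n = *src_len < *dest_len ? *src_len : *dest_len;
    memcpy(dest, src, n);
    *dest_len = n;
    *src_len = n;
    return 0;
}

/* Returns 0 on success, nonzero on a decode or write error. */
int stream_decode(MemReader *in, MemWriter *out)
{
    unsigned char in_buf[IN_BUF_SIZE], out_buf[OUT_BUF_SIZE];
    size_t in_pos = 0, in_size = 0;

    for (;;) {
        size_t src_len, dest_len = OUT_BUF_SIZE;
        int res;

        if (in_pos == in_size) {          /* input drained: refill */
            in_size = mem_read(in, in_buf, IN_BUF_SIZE);
            in_pos = 0;
            if (in_size == 0)
                return 0;                 /* no more input */
        }
        src_len = in_size - in_pos;
        res = demo_decode(out_buf, &dest_len, in_buf + in_pos, &src_len);
        if (res != 0)
            return res;
        in_pos += src_len;                /* keep unconsumed input for next call */
        if (!mem_write(out, out_buf, dest_len))
            return -1;
    }
}
```

A real loop also has to track the remaining uncompressed size to choose
the finish mode and to decide when the stream is complete; see
C/LzmaUtil/LzmaUtil.c for how the SDK does it.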

How To compress data
--------------------

Compile files: LzmaEnc.h + LzmaEnc.c + Types.h +
LzFind.c + LzFind.h + LzFindMt.c + LzFindMt.h + LzHash.h

Memory Requirements:
  - (dictSize * 11.5 + 6 MB) + state_size

LZMA Encoder can use two memory allocators:
1) alloc - for small arrays.
2) allocBig - for big arrays.

For example, you can use Large RAM Pages (2 MB) in allocBig allocator for
better compression speed. Note that Windows has a poor implementation of
Large RAM Pages.
It's OK to use the same allocator for alloc and allocBig.


Single-call Compression with callbacks
--------------------------------------

See C/LzmaUtil/LzmaUtil.c for an example.

When to use: file->file compressing

1) You must implement callback structures for interfaces:
ISeqInStream
ISeqOutStream
ICompressProgress
ISzAlloc

static void *SzAlloc(void *p, size_t size) { p = p; return MyAlloc(size); }
static void SzFree(void *p, void *address) {  p = p; MyFree(address); }
static ISzAlloc g_Alloc = { SzAlloc, SzFree };

  CFileSeqInStream inStream;
  CFileSeqOutStream outStream;

  inStream.funcTable.Read = MyRead;
  inStream.file = inFile;
  outStream.funcTable.Write = MyWrite;
  outStream.file = outFile;


2) Create CLzmaEncHandle object;

  CLzmaEncHandle enc;

  enc = LzmaEnc_Create(&g_Alloc);
  if (enc == 0)
    return SZ_ERROR_MEM;


3) Initialize CLzmaEncProps properties;

  CLzmaEncProps props;
  LzmaEncProps_Init(&props);

  Then you can change some properties in that structure.

4) Send LZMA properties to LZMA Encoder

  res = LzmaEnc_SetProps(enc, &props);

5) Write encoded properties to header

    Byte header[LZMA_PROPS_SIZE + 8];
    size_t headerSize = LZMA_PROPS_SIZE;
    UInt64 fileSize;
    int i;

    res = LzmaEnc_WriteProperties(enc, header, &headerSize);
    fileSize = MyGetFileLength(inFile);
    for (i = 0; i < 8; i++)
      header[headerSize++] = (Byte)(fileSize >> (8 * i));
    MyWriteFileAndCheck(outFile, header, headerSize);

6) Call encoding function:
      res = LzmaEnc_Encode(enc, &outStream.funcTable, &inStream.funcTable,
        NULL, &g_Alloc, &g_Alloc);

7) Destroy LZMA Encoder Object
  LzmaEnc_Destroy(enc, &g_Alloc, &g_Alloc);


If a callback function returns an error code, LzmaEnc_Encode also returns that code,
or it can return a code like SZ_ERROR_READ, SZ_ERROR_WRITE or SZ_ERROR_PROGRESS.

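Step 5 above appends the uncompressed size to the encoded properties as
8 little-endian bytes. The same byte layout can be shown without the SDK;
in this sketch lzma_build_header and DEMO_PROPS_SIZE are illustrative
names, with DEMO_PROPS_SIZE standing in for LZMA_PROPS_SIZE (5 bytes
according to the text).

```c
#include <stddef.h>
#include <stdint.h>

#define DEMO_PROPS_SIZE 5   /* stand-in for LZMA_PROPS_SIZE */

/* Build the 13-byte .lzma header: encoded properties followed by the
   uncompressed size as a 64-bit little-endian value.
   Returns the number of header bytes written. */
size_t lzma_build_header(uint8_t header[DEMO_PROPS_SIZE + 8],
                         const uint8_t props[DEMO_PROPS_SIZE],
                         uint64_t file_size)
{
    size_t n = 0;
    int i;

    for (i = 0; i < DEMO_PROPS_SIZE; i++)
        header[n++] = props[i];
    for (i = 0; i < 8; i++)           /* least significant byte first */
        header[n++] = (uint8_t)(file_size >> (8 * i));
    return n;
}
```

Passing UINT64_MAX as file_size writes eight 0xFF bytes, which is the
"-1 means unknown size" case from the file format table.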

Single-call RAM->RAM Compression
--------------------------------

Single-call RAM->RAM Compression is similar to Compression with callbacks,
but you provide pointers to buffers instead of pointers to stream callbacks:

SRes LzmaEncode(Byte *dest, SizeT *destLen, const Byte *src, SizeT srcLen,
    CLzmaEncProps *props, Byte *propsEncoded, SizeT *propsSize, int writeEndMark,
    ICompressProgress *progress, ISzAlloc *alloc, ISzAlloc *allocBig);

Return code:
  SZ_OK               - OK
  SZ_ERROR_MEM        - Memory allocation error
  SZ_ERROR_PARAM      - Incorrect parameter
  SZ_ERROR_OUTPUT_EOF - output buffer overflow
  SZ_ERROR_THREAD     - errors in multithreading functions (only for Mt version)



Defines
-------

_LZMA_SIZE_OPT - Enable some optimizations in LZMA Decoder to get smaller executable code.

_LZMA_PROB32   - It can increase the speed on some 32-bit CPUs, but memory usage for
                 some structures will be doubled in that case.

_LZMA_UINT32_IS_ULONG  - Define it if int is 16-bit on your compiler and long is 32-bit.

_LZMA_NO_SYSTEM_SIZE_T  - Define it if you don't want to use size_t type.


_7ZIP_PPMD_SUPPPORT - Define it if you don't want to support PPMD method in ANSI-C .7z decoder.


C++ LZMA Encoder/Decoder
~~~~~~~~~~~~~~~~~~~~~~~~
C++ LZMA code uses COM-like interfaces. So if you want to use it,
you should study the basics of COM/OLE.
C++ LZMA code is just a wrapper over the ANSI-C code.


C++ Notes
~~~~~~~~~
If you use some C++ code folders in 7-Zip (for example, C++ code for .7z handling),
you must check that you work correctly with the "new" operator.
7-Zip can be compiled with MSVC 6.0, which doesn't throw an exception from the "new" operator.
So 7-Zip uses "CPP\Common\NewHandler.cpp", which redefines the "new" operator:
void *operator new(size_t size)
{
  void *p = ::malloc(size);
  if (p == 0)
    throw CNewException();
  return p;
}
If you use a version of MSVC that throws an exception from the "new" operator, you can compile without
"NewHandler.cpp", and the standard exception will be used. Some 7-Zip code
catches any exception in internal code and converts it to an HRESULT code.
So you don't need to catch CNewException if you call COM interfaces of 7-Zip.

---

http://www.7-zip.org
http://www.7-zip.org/sdk.html
http://www.7-zip.org/support.html