2018-09-09 21:20:36 +02:00

23 KiB

Raw Blame History

lzma-native

Node.js interface to the native liblzma compression library (.xz file format, among others)

This package provides interfaces for compression and decompression of .xz (and legacy .lzma) files, both stream-based and string-based.

Example usage

Installation

Simply install lzma-native via npm:

$ npm install --save lzma-native

Note: As of version 1.0.0, this module provides pre-built binaries for multiple Node.js versions and all major OS using node-pre-gyp, so for 99 % of users no compiler toolchain is necessary. Please create an issue here if you have any trouble installing this module.

Note: lzma-native@2.x requires a Node version >= 4. If you want to support Node 0.10 or 0.12, you can feel free to use lzma-native@1.x.

For streams

If you don’t have any fancy requirements, using this library is quite simple:

var lzma = require('lzma-native');

var compressor = lzma.createCompressor();
var input = fs.createReadStream('README.md');
var output = fs.createWriteStream('README.md.xz');

input.pipe(compressor).pipe(output);

For decompression, you can simply use lzma.createDecompressor().

Both functions return a stream where you can pipe your input in and read your (de)compressed output from.

For simple strings/Buffers

If you want your input/output to be Buffers (strings will be accepted as input), this even gets a little simpler:

lzma.compress('Banana', function(result) {
    console.log(result); // <Buffer fd 37 7a 58 5a 00 00 01 69 22 de 36 02 00 21 ...>
});

Again, replace lzma.compress with lzma.decompress and you’ll get the inverse transformation.

lzma.compress() and lzma.decompress() will return promises and you don’t need to provide any kind of callback (Example code).

API

Compatibility implementations

Apart from the API described here, lzma-native implements the APIs of the following other LZMA libraries so you can use it nearly as a drop-in replacement:

node-xz via lzma.Compressor and lzma.Decompressor
LZMA-JS via lzma.LZMA().compress and lzma.LZMA().decompress, though without actual support for progress functions and returning Buffer objects instead of integer arrays. (This produces output in the .lzma file format, not the .xz format!)

Multi-threaded encoding

Since version 1.5.0, lzma-native supports liblzma’s built-in multi-threading encoding capabilities. To make use of them, set the threads option to an integer value: lzma.createCompressor({ threads: n });. You can use value of 0 to use the number of processor cores. This option is only available for the easyEncoder (the default) and streamEncoder encoders.

Note that, by default, encoding will take place in Node’s libuv thread pool regardless of this option, and setting it when multiple encoders are running is likely to affect performance negatively.

Reference

Encoding strings and Buffer objects

compress() – Compress strings and Buffers
decompress() – Decompress strings and Buffers
LZMA().compress() (LZMA-JS compatibility)
LZMA().decompress() (LZMA-JS compatibility)

Creating streams for encoding

createCompressor() – Compress streams
createDecompressor() – Decompress streams
createStream() – (De-)Compression with advanced options
Compressor() (node-xz compatibility)
Decompressor() (node-xz compatibility)

.xz file metadata

isXZ() – Test Buffer for .xz file format
parseFileIndex() – Read .xz file metadata
parseFileIndexFD() – Read .xz metadata from a file descriptor

Miscellaneous functions

crc32() – Calculate CRC32 checksum
checkSize() – Return required size for specific checksum type
easyDecoderMemusage() – Expected memory usage
easyEncoderMemusage() – Expected memory usage
rawDecoderMemusage() – Expected memory usage
rawEncoderMemusage() – Expected memory usage
versionString() – Native library version string
versionNumber() – Native library numerical version identifier

Encoding strings and Buffer objects

`lzma.compress()`, `lzma.decompress()`

lzma.compress(string, [opt, ]on_finish)
lzma.decompress(string, [opt, ]on_finish)

Param	Type	Description
`string`	Buffer / String	Any string or buffer to be (de)compressed (that can be passed to `stream.end(…)`)
[`opt`]	Options / int	Optional. See options
`on_finish`	Callback	Will be invoked with the resulting Buffer as the first parameter when encoding is finished, and as `on_finish(null, err)` in case of an error.

These methods will also return a promise that you can use directly.

Example code:

lzma.compress('Bananas', 6, function(result) {
    lzma.decompress(result, function(decompressedResult) {
        assert.equal(decompressedResult.toString(), 'Bananas');
    });
});

Example code for promises:

lzma.compress('Bananas', 6).then(function(result) {
    return lzma.decompress(result);
}).then(function(decompressedResult) {
    assert.equal(decompressedResult.toString(), 'Bananas');
}).catch(function(err) {
    // ...
});

`lzma.LZMA().compress()`, `lzma.LZMA().decompress()`

lzma.LZMA().compress(string, mode, on_finish[, on_progress])
lzma.LZMA().decompress(string, on_finish[, on_progress])

(Compatibility; See LZMA-JS for the original specs.)

Note that the result of compression is in the older LZMA1 format (.lzma files). This is different from the more universally used LZMA2 format (.xz files) and you will have to take care of possible compatibility issues with systems expecting .xz files.

Param	Type	Description
`string`	Buffer / String / Array	Any string, buffer, or array of integers or typed integers (e.g. `Uint8Array`)
`mode`	int	A number between 0 and 9, indicating compression level
`on_finish`	Callback	Will be invoked with the resulting Buffer as the first parameter when encoding is finished, and as `on_finish(null, err)` in case of an error.
`on_progress`	Callback	Indicates progress by passing a number in [0.0, 1.0]. Currently, this package only invokes the callback with 0.0 and 1.0.

These methods will also return a promise that you can use directly.

This does not work exactly as described in the original LZMA-JS specification:

The results are Buffer objects, not integer arrays. This just makes a lot more sense in a Node.js environment.
on_progress is currently only called with 0.0 and 1.0.

Example code:

lzma.LZMA().compress('Bananas', 4, function(result) {
    lzma.LZMA().decompress(result, function(decompressedResult) {
        assert.equal(decompressedResult.toString(), 'Bananas');
    });
});

For an example using promises, see compress().

Creating streams for encoding

`lzma.createCompressor()`, `lzma.createDecompressor()`

lzma.createCompressor([options])
lzma.createDecompressor([options])

Param	Type	Description
[`options`]	Options / int	Optional. See options

Return a duplex stream, i.e. a both readable and writable stream. Input will be read, (de)compressed and written out. You can use this to pipe input through this stream, i.e. to mimick the xz command line util, you can write:

var compressor = lzma.createCompressor();

process.stdin.pipe(compressor).pipe(process.stdout);

The output of compression will be in LZMA2 format (.xz files), while decompression will accept either format via automatic detection.

`lzma.Compressor()`, `lzma.Decompressor()`

lzma.Compressor([preset], [options])
lzma.Decompressor([options])

(Compatibility; See node-xz for the original specs.)

These methods handle the .xz file format.

Param	Type	Description
[`preset`]	int	Optional. See options.preset
[`options`]	Options	Optional. See options

var compressor = lzma.Compressor();

process.stdin.pipe(compressor).pipe(process.stdout);

`lzma.createStream()`

lzma.createStream(coder, options)

Param	Type	Description
[`coder`]	string	Any of the supported coder names, e.g. `"easyEncoder"` (default) or `"autoDecoder"`.
[`options`]	Options / int	Optional. See options

Return a duplex stream for (de-)compression. You can use this to pipe input through this stream.

The available coders are (the most interesting ones first):

easyEncoder Standard LZMA2 (.xz file format) encoder. Supports options.preset and options.check options.
autoDecoder Standard LZMA1/2 (both .xz and .lzma) decoder with auto detection of file format. Supports options.memlimit and options.flags options.
aloneEncoder Encoder which only uses the legacy .lzma format. Supports the whole range of LZMA options.

Less likely to be of interest to you, but also available:

aloneDecoder Decoder which only uses the legacy .lzma format. Supports the options.memlimit option.
rawEncoder Custom encoder corresponding to lzma_raw_encoder (See the native library docs for details). Supports the options.filters option.
rawDecoder Custom decoder corresponding to lzma_raw_decoder (See the native library docs for details). Supports the options.filters option.
streamEncoder Custom encoder corresponding to lzma_stream_encoder (See the native library docs for details). Supports options.filters and options.check options.
streamDecoder Custom decoder corresponding to lzma_stream_decoder (See the native library docs for details). Supports options.memlimit and options.flags options.

Options

Option name	Type	Description
`check`	check	Any of `lzma.CHECK_CRC32`, `lzma.CHECK_CRC64`, `lzma.CHECK_NONE`, `lzma.CHECK_SHA256`
`memlimit`	float	A memory limit for (de-)compression in bytes
`preset`	int	A number from 0 to 9, 0 being the fastest and weakest compression, 9 the slowest and highest compression level. (Please also see the xz(1) manpage for notes – don’t just blindly use 9!) You can also OR this with `lzma.PRESET_EXTREME` (the `-e` option to the `xz` command line utility).
`flags`	int	A bitwise or of `lzma.LZMA_TELL_NO_CHECK`, `lzma.LZMA_TELL_UNSUPPORTED_CHECK`, `lzma.LZMA_TELL_ANY_CHECK`, `lzma.LZMA_CONCATENATED`
`synchronous`	bool	If true, forces synchronous coding (i.e. no usage of threading)
`bufsize`	int	The default size for allocated buffers
`threads`	int	Set to an integer to use liblzma’s multi-threading support. 0 will choose the number of CPU cores.
`blockSize`	int	Maximum uncompressed size of a block in multi-threading mode
`timeout`	int	Timeout for a single encoding operation in multi-threading mode

options.filters can, if the coder supports it, be an array of filter objects, each with the following properties:

.id Any of lzma.FILTERS_MAX, lzma.FILTER_ARM, lzma.FILTER_ARMTHUMB, lzma.FILTER_IA64, lzma.FILTER_POWERPC, lzma.FILTER_SPARC, lzma.FILTER_X86 or lzma.FILTER_DELTA, lzma.FILTER_LZMA1, lzma.FILTER_LZMA2

The delta filter supports the additional option .dist for a distance between bytes (see the xz(1) manpage).

The LZMA filter supports the additional options .dict_size, .lp, .lc, pb, .mode, nice_len, .mf, .depth and .preset. See the xz(1) manpage for meaning of these parameters and additional information.

Miscellaneous functions

`lzma.crc32()`

lzma.crc32(input[, encoding[, previous]])

Compute the CRC32 checksum of a Buffer or string.

Param	Type	Description
`input`	string / Buffer	Any string or Buffer.
[`encoding`]	string	Optional. If `input` is a string, an encoding to use when converting into binary.
[`previous`]	int	The result of a previous CRC32 calculation so that you can compute the checksum per each chunk

Example usage:

lzma.crc32('Banana') // => 69690105

`lzma.checkSize()`

lzma.checkSize(check)

Return the byte size of a check sum.

Param	Type	Description
`check`	check	Any supported check constant.

Example usage:

lzma.checkSize(lzma.CHECK_SHA256) // => 16
lzma.checkSize(lzma.CHECK_CRC32)  // => 4

`lzma.easyDecoderMemusage()`

lzma.easyDecoderMemusage(preset)

Returns the approximate memory usage when decoding using easyDecoder for a given preset.

Param	Type	Description
`preset`	preset	A compression level from 0 to 9

Example usage:

lzma.easyDecoderMemusage(6) // => 8454192

`lzma.easyEncoderMemusage()`

lzma.easyEncoderMemusage(preset)

Returns the approximate memory usage when encoding using easyEncoder for a given preset.

Param	Type	Description
`preset`	preset	A compression level from 0 to 9

Example usage:

lzma.easyEncoderMemusage(6) // => 97620499

`lzma.rawDecoderMemusage()`

lzma.rawDecoderMemusage(filters)

Returns the approximate memory usage when decoding using rawDecoder for a given filter list.

Param	Type	Description
`filters`	array	An array of filters

`lzma.rawEncoderMemusage()`

lzma.rawEncoderMemusage(filters)

Returns the approximate memory usage when encoding using rawEncoder for a given filter list.

Param	Type	Description
`filters`	array	An array of filters

`lzma.versionString()`

lzma.versionString()

Returns the version of the underlying C library.

Example usage:

lzma.versionString() // => '5.2.3'

`lzma.versionNumber()`

lzma.versionNumber()

Returns the version of the underlying C library.

Example usage:

lzma.versionNumber() // => 50020012

.xz file metadata

`lzma.isXZ()`

lzma.isXZ(input)

Tells whether an input buffer is an XZ file (.xz, LZMA2 format) using the file format’s magic number. This is not a complete test, i.e. the data following the file header may still be invalid in some way.

Param	Type	Description
`input`	string / Buffer	Any string or Buffer (integer arrays accepted).

Example usage:

lzma.isXZ(fs.readFileSync('test/hamlet.txt.xz')); // => true
lzma.isXZ(fs.readFileSync('test/hamlet.txt.lzma')); // => false
lzma.isXZ('Banana'); // => false

(The magic number of XZ files is hex fd 37 7a 58 5a 00 at position 0.)

`lzma.parseFileIndex()`

lzma.parseFileIndex(options[, callback])

Read .xz file metadata.

options.fileSize needs to be an integer indicating the size of the file being inspected, e.g. obtained by fs.stat().

options.read(count, offset, cb) must be a function that reads count bytes from the underlying file, starting at position offset. If that is not possible, e.g. because the file does not have enough bytes, the file should be considered corrupt. On success, cb should be called with a Buffer containing the read data. cb can be invoked as cb(err, buffer), in which case err will be passed along to the original callback argument when set.

callback will be called with err and info as its arguments.

If no callback is provided, options.read() must work synchronously and the file info will be returned from lzma.parseFileIndex().

Example usage:

fs.readFile('test/hamlet.txt.xz', function(err, content) {
  // handle error

  lzma.parseFileIndex({
    fileSize: content.length,
    read: function(count, offset, cb) {
      cb(content.slice(offset, offset + count));
    }
  }, function(err, info) {
    // handle error
    
    // do something with e.g. info.uncompressedSize
  });
});

`lzma.parseFileIndexFD()`

lzma.parseFileIndexFD(fd, callback)

Read .xz metadata from a file descriptor.

This is like parseFileIndex(), but lets you pass an file descriptor in fd. The file will be inspected using fs.stat() and fs.read(). The file descriptor will not be opened or closed by this call.

Example usage:

fs.open('test/hamlet.txt.xz', 'r', function(err, fd) {
  // handle error

  lzma.parseFileIndexFD(fd, function(err, info) {
    // handle error
    
    // do something with e.g. info.uncompressedSize
    
    fs.close(fd, function(err) { /* handle error */ });
  });
});

Installation

This package includes the native C library, so there is no need to install it separately.

Licensing

The original C library package contains code under various licenses, with its core (liblzma) being public domain. See its contents for details. This wrapper is licensed under the MIT License.

Other implementations of the LZMA algorithms for node.js and/or web clients include:

Note that LZMA has been designed to have much faster decompression than compression, which is something you may want to take into account when choosing an compression algorithm for large files. Almost always, LZMA achieves higher compression ratios than other algorithms, though.

Acknowledgements

Initial development of this project was financially supported by Tradity.

23 KiB Raw Blame History Unescape Escape

lzma-native

Example usage

Installation

For streams

For simple strings/Buffers

API

Compatibility implementations

Multi-threaded encoding

Reference

Encoding strings and Buffer objects

lzma.compress(), lzma.decompress()

lzma.LZMA().compress(), lzma.LZMA().decompress()

Creating streams for encoding

lzma.createCompressor(), lzma.createDecompressor()

lzma.Compressor(), lzma.Decompressor()

lzma.createStream()

Options

Miscellaneous functions

lzma.crc32()

lzma.checkSize()

lzma.easyDecoderMemusage()

lzma.easyEncoderMemusage()

lzma.rawDecoderMemusage()

lzma.rawEncoderMemusage()

lzma.versionString()

lzma.versionNumber()

.xz file metadata

lzma.isXZ()

lzma.parseFileIndex()

lzma.parseFileIndexFD()

Installation

Licensing

Related projects

Acknowledgements

23 KiB

Raw Blame History

`lzma.compress()`, `lzma.decompress()`

`lzma.LZMA().compress()`, `lzma.LZMA().decompress()`

`lzma.createCompressor()`, `lzma.createDecompressor()`

`lzma.Compressor()`, `lzma.Decompressor()`

`lzma.createStream()`

`lzma.crc32()`

`lzma.checkSize()`

`lzma.easyDecoderMemusage()`

`lzma.easyEncoderMemusage()`

`lzma.rawDecoderMemusage()`

`lzma.rawEncoderMemusage()`

`lzma.versionString()`

`lzma.versionNumber()`

`lzma.isXZ()`

`lzma.parseFileIndex()`

`lzma.parseFileIndexFD()`