java.lang.Object
  java.io.InputStream
    org.tukaani.xz.SeekableInputStream
      org.tukaani.xz.SeekableXZInputStream

public class SeekableXZInputStream extends SeekableInputStream
Decompresses a .xz file in random access mode. This supports decompressing concatenated .xz files.
Each .xz file consists of one or more Streams. Each Stream consists of zero or more Blocks. Each Stream contains an Index of that Stream's Blocks. The Indexes from all Streams are loaded in RAM by a constructor of this class. A typical .xz file has only one Stream, and parsing its Index will need only three or four seeks.
To make random access possible, the data in a .xz file must be split into multiple Blocks of reasonable size. Decompression can only start at a Block boundary. When seeking to an uncompressed offset that is not at a Block boundary, decompression starts at the beginning of the Block and throws away data until the target offset is reached. Thus, smaller Blocks mean faster seeks to arbitrary uncompressed offsets. On the other hand, smaller Blocks mean worse compression, so one has to make a compromise between random access speed and compression ratio.
Implementation note: This class uses linear search to locate the correct Stream from the data structures in RAM. It was the simplest to implement and should be fine as long as there aren't too many Streams. The correct Block inside a Stream is located using binary search and thus is fast even with a huge number of Blocks.
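The Block lookup and the discard cost described above can be illustrated with a small standalone sketch. It is not the library's code: the class `BlockLookupSketch`, its method names, and the `blockEnds` array (cumulative uncompressed end offsets of the Blocks in one Stream, assumed strictly increasing) are hypothetical and exist only for this example.

```java
/** Hypothetical illustration; not part of org.tukaani.xz. */
final class BlockLookupSketch {
    /**
     * Returns the index of the Block that contains uncompressed offset pos.
     * blockEnds[i] is the uncompressed offset just past the end of Block i
     * (assumed strictly increasing for this sketch).
     */
    static int findBlock(long[] blockEnds, long pos) {
        if (blockEnds.length == 0 || pos < 0
                || pos >= blockEnds[blockEnds.length - 1]) {
            throw new IndexOutOfBoundsException("pos outside data: " + pos);
        }
        int lo = 0;
        int hi = blockEnds.length - 1;
        // Binary search for the first Block whose end offset is > pos.
        while (lo < hi) {
            int mid = (lo + hi) >>> 1;
            if (blockEnds[mid] <= pos) {
                lo = mid + 1;
            } else {
                hi = mid;
            }
        }
        return lo;
    }

    /**
     * Number of uncompressed bytes that must be decompressed and thrown
     * away before pos is reached, because decoding can only start at the
     * beginning of the containing Block.
     */
    static long bytesToDiscard(long[] blockEnds, long pos) {
        int i = findBlock(blockEnds, pos);
        long blockStart = (i == 0) ? 0 : blockEnds[i - 1];
        return pos - blockStart;
    }
}
```

With Blocks ending at uncompressed offsets 100, 250, and 400, a seek to offset 260 lands in the third Block and discards 10 bytes. With larger Blocks the same seek would on average discard more data, which is the trade-off described above.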
The amount of memory needed for the Indexes is taken into account when checking the memory usage limit. Each Stream is calculated to need at least 1 KiB of memory and each Block 16 bytes of memory, rounded up to the next kibibyte. So unless the file has a huge number of Streams or Blocks, these don't take a significant amount of memory.
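To get a feel for these numbers, the following sketch computes an estimate under one reading of the rule above: each Stream costs 1 KiB plus 16 bytes per Block, with the per-Block part rounded up to whole kibibytes for each Stream. The class and method names are hypothetical, and the exact accounting inside the library may differ.

```java
/** Hypothetical illustration; not part of org.tukaani.xz. */
final class IndexMemorySketch {
    /**
     * Estimates Index memory usage in KiB.
     * blocksPerStream[i] is the number of Blocks in Stream i.
     * Assumption: 1 KiB base cost per Stream plus 16 bytes per Block,
     * rounded up to the next KiB separately for each Stream.
     */
    static long estimateKiB(long[] blocksPerStream) {
        long totalKiB = 0;
        for (long blocks : blocksPerStream) {
            long blockBytes = 16 * blocks;
            totalKiB += 1 + (blockBytes + 1023) / 1024;
        }
        return totalKiB;
    }
}
```

Under this reading, a single Stream with 1000 Blocks is estimated at 17 KiB, and a single Stream with one million Blocks at 15626 KiB (about 15 MiB), which shows why only files with a very large number of Blocks make the Index noticeable.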
When using XZOutputStream, a new Block can be started by calling its endBlock method. If you know that the decompressor will need to seek only to certain offsets, it can be a good idea to start a new Block at (some of) these offsets (and perhaps only at these offsets to get a better compression ratio).
liblzma in XZ Utils supports starting a new Block with LZMA_FULL_FLUSH. XZ Utils 5.1.1alpha added threaded compression which creates multi-Block .xz files. XZ Utils 5.1.1alpha also added the option --block-size=SIZE to the xz command line tool.
See Also:
SeekableFileInputStream, XZInputStream, XZOutputStream
Constructor Summary | |
---|---|
SeekableXZInputStream(SeekableInputStream in) | Creates a new seekable XZ decompressor without a memory usage limit. |
SeekableXZInputStream(SeekableInputStream in, int memoryLimit) | Creates a new seekable XZ decompressor with an optional memory usage limit. |
Method Summary | |
---|---|
int | available() - Returns the number of uncompressed bytes that can be read without blocking. |
void | close() - Closes the stream and calls in.close(). |
int | getCheckTypes() - Gets the types of integrity checks used in the .xz file. |
int | getIndexMemoryUsage() - Gets the amount of memory in kibibytes (KiB) used by the data structures needed to locate the XZ Blocks. |
long | getLargestBlockSize() - Gets the uncompressed size of the largest XZ Block in bytes. |
long | length() - Gets the uncompressed size of this input stream. |
long | position() - Gets the uncompressed position in this input stream. |
int | read() - Decompresses the next byte from this input stream. |
int | read(byte[] buf, int off, int len) - Decompresses into an array of bytes. |
void | seek(long pos) - Seeks to the specified absolute uncompressed position in the stream. |
Methods inherited from class org.tukaani.xz.SeekableInputStream |
---|
skip |
Methods inherited from class java.io.InputStream |
---|
mark, markSupported, read, reset |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Constructor Detail |
---|
public SeekableXZInputStream(SeekableInputStream in) throws java.io.IOException

Parameters:
in - seekable input stream containing one or more XZ Streams; the whole input stream is used
Throws:
XZFormatException - input is not in the XZ format
CorruptedInputException - XZ data is corrupt or truncated
UnsupportedOptionsException - XZ headers seem valid but they specify options not supported by this implementation
java.io.EOFException - less than 6 bytes of input was available from in, or (unlikely) the size of the underlying stream got smaller while this was reading from it
java.io.IOException - may be thrown by in

public SeekableXZInputStream(SeekableInputStream in, int memoryLimit) throws java.io.IOException

Parameters:
in - seekable input stream containing one or more XZ Streams; the whole input stream is used
memoryLimit - memory usage limit in kibibytes (KiB) or -1 to impose no memory usage limit
Throws:
XZFormatException - input is not in the XZ format
CorruptedInputException - XZ data is corrupt or truncated
UnsupportedOptionsException - XZ headers seem valid but they specify options not supported by this implementation
MemoryLimitException - decoded XZ Indexes would need more memory than allowed by the memory usage limit
java.io.EOFException - less than 6 bytes of input was available from in, or (unlikely) the size of the underlying stream got smaller while this was reading from it
java.io.IOException - may be thrown by in

Method Detail |
---|
public int getCheckTypes()

The returned value has a bit set for every check type that is present. For example, if CRC64 and SHA-256 were used, the return value is (1 << XZ.CHECK_CRC64) | (1 << XZ.CHECK_SHA256).
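As a rough sketch of how such a bit mask might be decoded, the example below tests individual bits. To stay self-contained it defines its own constants, which are assumed to mirror the Check IDs of the .xz format (None = 0, CRC32 = 1, CRC64 = 4, SHA-256 = 10); real code should use the `XZ.CHECK_*` constants from the library instead. The class `CheckTypesSketch` and its methods are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical illustration; not part of org.tukaani.xz. */
final class CheckTypesSketch {
    // Assumed to match XZ.CHECK_NONE, XZ.CHECK_CRC32, XZ.CHECK_CRC64
    // and XZ.CHECK_SHA256; prefer the library constants in real code.
    static final int CHECK_NONE = 0;
    static final int CHECK_CRC32 = 1;
    static final int CHECK_CRC64 = 4;
    static final int CHECK_SHA256 = 10;

    /** Returns true if the bit for checkId is set in mask. */
    static boolean uses(int mask, int checkId) {
        return (mask & (1 << checkId)) != 0;
    }

    /** Lists the names of the known check types present in mask. */
    static List<String> describe(int mask) {
        List<String> names = new ArrayList<>();
        if (uses(mask, CHECK_NONE)) names.add("None");
        if (uses(mask, CHECK_CRC32)) names.add("CRC32");
        if (uses(mask, CHECK_CRC64)) names.add("CRC64");
        if (uses(mask, CHECK_SHA256)) names.add("SHA-256");
        return names;
    }
}
```

A caller would pass the value returned by getCheckTypes() as `mask`, for example to refuse files that contain Streams without any integrity check.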
public int getIndexMemoryUsage()
public long getLargestBlockSize()
public int read() throws java.io.IOException

Specified by:
read in class java.io.InputStream
Returns:
the next decompressed byte, or -1 to indicate the end of the compressed stream
Throws:
CorruptedInputException
UnsupportedOptionsException
MemoryLimitException
XZIOException - if the stream has been closed
java.io.IOException - may be thrown by in

public int read(byte[] buf, int off, int len) throws java.io.IOException

If len is zero, no bytes are read and 0 is returned. Otherwise this will try to decompress len bytes of uncompressed data. Less than len bytes may be read only in the following situations:
- ... len bytes have already been successfully decompressed. The next call with non-zero len will immediately throw the pending exception.

Overrides:
read in class java.io.InputStream
Parameters:
buf - target buffer for uncompressed data
off - start offset in buf
len - maximum number of uncompressed bytes to read
Returns:
number of bytes read, or -1 to indicate the end of the compressed stream
Throws:
CorruptedInputException
UnsupportedOptionsException
MemoryLimitException
XZIOException - if the stream has been closed
java.io.IOException - may be thrown by in

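Because a single call may return fewer than len bytes, callers that need a fixed amount of data usually loop. The following is a minimal sketch written against plain `java.io.InputStream`, so it applies to any stream with this contract; the class `ReadFullySketch` and the method `readUpTo` are hypothetical names used only here.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical illustration; not part of org.tukaani.xz. */
final class ReadFullySketch {
    /**
     * Reads until len bytes have been stored in buf or the end of the
     * stream is reached. Returns the number of bytes actually read,
     * which is less than len only if the stream ended first.
     * Exceptions thrown by in.read are passed on to the caller.
     */
    static int readUpTo(InputStream in, byte[] buf, int off, int len)
            throws IOException {
        int total = 0;
        while (total < len) {
            int n = in.read(buf, off + total, len - total);
            if (n == -1) {
                break; // end of stream
            }
            total += n;
        }
        return total;
    }
}
```

If an error was detected after a partial read, the loop's next call to read throws the pending exception, so the caller sees the failure instead of silently receiving truncated data.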
public int available() throws java.io.IOException

Returns the number of uncompressed bytes that can be read without blocking. CorruptedInputException may get thrown before the number of bytes claimed to be available have been read from this input stream.

Overrides:
available in class java.io.InputStream
Throws:
java.io.IOException

public void close() throws java.io.IOException

Closes the stream and calls in.close(). If the stream was already closed, this does nothing.

Specified by:
close in interface java.io.Closeable
Overrides:
close in class java.io.InputStream
Throws:
java.io.IOException - if thrown by in.close()

public long length()

Specified by:
length in class SeekableInputStream

public long position() throws java.io.IOException

Specified by:
position in class SeekableInputStream
Throws:
XZIOException - if the stream has been closed
java.io.IOException

public void seek(long pos) throws java.io.IOException

Seeks to the specified absolute uncompressed position in the stream. The actual seek is done when read is called to read at least one byte.
Seeking past the end of the stream is possible. In that case read will return -1 to indicate the end of the stream.

Specified by:
seek in class SeekableInputStream
Parameters:
pos - new uncompressed read position
Throws:
XZIOException - if pos is negative, or if stream has been closed
java.io.IOException - if pos is negative or if a stream-specific I/O error occurs