public class ParquetFileReader extends Object implements Closeable
Constructor and Description |
---|
ParquetFileReader(org.apache.hadoop.conf.Configuration configuration,
org.apache.hadoop.fs.Path filePath,
List<BlockMetaData> blocks,
List<ColumnDescriptor> columns) |
Modifier and Type | Method and Description |
---|---|
void |
close() |
static List<Footer> |
readAllFootersInParallel(org.apache.hadoop.conf.Configuration configuration,
org.apache.hadoop.fs.FileStatus fileStatus) |
static List<Footer> |
readAllFootersInParallel(org.apache.hadoop.conf.Configuration configuration,
List<org.apache.hadoop.fs.FileStatus> partFiles) |
static List<Footer> |
readAllFootersInParallelUsingSummaryFiles(org.apache.hadoop.conf.Configuration configuration,
List<org.apache.hadoop.fs.FileStatus> partFiles)
For the files provided, checks whether a summary file exists.
|
static ParquetMetadata |
readFooter(org.apache.hadoop.conf.Configuration configuration,
org.apache.hadoop.fs.FileStatus file)
Reads the metadata block in the footer of the file.
|
static ParquetMetadata |
readFooter(org.apache.hadoop.conf.Configuration configuration,
org.apache.hadoop.fs.Path file)
Reads the metadata block in the footer of the file.
|
static List<Footer> |
readFooters(org.apache.hadoop.conf.Configuration configuration,
org.apache.hadoop.fs.FileStatus pathStatus) |
static List<Footer> |
readFooters(org.apache.hadoop.conf.Configuration configuration,
org.apache.hadoop.fs.Path file) |
PageReadStore |
readNextRowGroup()
Reads all the columns requested from the row group at the current file position.
|
static List<Footer> |
readSummaryFile(org.apache.hadoop.conf.Configuration configuration,
org.apache.hadoop.fs.FileStatus summaryStatus) |
public ParquetFileReader(org.apache.hadoop.conf.Configuration configuration, org.apache.hadoop.fs.Path filePath, List<BlockMetaData> blocks, List<ColumnDescriptor> columns) throws IOException
filePath
- the Parquet file (will be opened for reading in this constructor)
blocks
- the blocks to read
columns
- the columns to read (their path)
codecClassName
- the codec used to compress the blocks
IOException
- if the file cannot be opened
public static List<Footer> readAllFootersInParallelUsingSummaryFiles(org.apache.hadoop.conf.Configuration configuration, List<org.apache.hadoop.fs.FileStatus> partFiles) throws IOException
configuration
- the Hadoop configuration used to connect to the file system
partFiles
- the part files to read
IOException
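
The summary-file check described for `readAllFootersInParallelUsingSummaryFiles` can be pictured with plain `java.nio.file` calls. This is only a sketch of the idea, not the library's implementation: it assumes the summary file is named `_metadata` and sits in the same directory as the part files, and the class and method names (`SummaryFileLookup`, `summaryAvailability`) are hypothetical.

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper; not part of ParquetFileReader. */
final class SummaryFileLookup {
    // Assumption: the summary file is called "_metadata" and lives next to the part files.
    static final String SUMMARY_FILE_NAME = "_metadata";

    /**
     * Maps each parent directory of the given part files to a flag that says
     * whether a summary file exists there. Each directory is checked once.
     */
    static Map<Path, Boolean> summaryAvailability(List<Path> partFiles) {
        Map<Path, Boolean> result = new LinkedHashMap<>();
        for (Path part : partFiles) {
            Path parent = part.toAbsolutePath().getParent();
            if (parent == null) {
                continue; // a root path has no directory that could hold a summary file
            }
            result.computeIfAbsent(
                parent, dir -> Files.isRegularFile(dir.resolve(SUMMARY_FILE_NAME)));
        }
        return result;
    }
}
```

A caller could then read the summary file for directories flagged `true` and fall back to the individual part-file footers for the rest.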
public static List<Footer> readAllFootersInParallel(org.apache.hadoop.conf.Configuration configuration, List<org.apache.hadoop.fs.FileStatus> partFiles) throws IOException
IOException
public static List<Footer> readAllFootersInParallel(org.apache.hadoop.conf.Configuration configuration, org.apache.hadoop.fs.FileStatus fileStatus) throws IOException
IOException
public static List<Footer> readFooters(org.apache.hadoop.conf.Configuration configuration, org.apache.hadoop.fs.FileStatus pathStatus) throws IOException
IOException
public static List<Footer> readSummaryFile(org.apache.hadoop.conf.Configuration configuration, org.apache.hadoop.fs.FileStatus summaryStatus) throws IOException
IOException
public static final ParquetMetadata readFooter(org.apache.hadoop.conf.Configuration configuration, org.apache.hadoop.fs.Path file) throws IOException
configuration
- the Hadoop configuration used to connect to the file system
file
- the Parquet file
IOException
- if an error occurs while reading the file
public static final List<Footer> readFooters(org.apache.hadoop.conf.Configuration configuration, org.apache.hadoop.fs.Path file) throws IOException
IOException
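
To make "the metadata block in the footer of the file" concrete, the sketch below locates that block in an in-memory copy of a file. It relies on the Parquet file layout, in which a file ends with the serialized metadata, a 4-byte little-endian metadata length, and the 4-byte magic `PAR1`. It is an illustration of the layout only, not the code `readFooter` runs, and `FooterLocator` is a hypothetical name.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

/** Hypothetical helper that finds the footer metadata inside a Parquet byte array. */
final class FooterLocator {
    static final byte[] MAGIC = "PAR1".getBytes(StandardCharsets.US_ASCII);

    /** Returns {footerStart, footerLength} in bytes for the given file contents. */
    static long[] locateFooter(byte[] file) {
        int tail = 4 + MAGIC.length; // 4-byte length field followed by the trailing magic
        if (file.length < MAGIC.length + tail) {
            throw new IllegalArgumentException("too small to be a Parquet file");
        }
        byte[] trailingMagic =
            Arrays.copyOfRange(file, file.length - MAGIC.length, file.length);
        if (!Arrays.equals(MAGIC, trailingMagic)) {
            throw new IllegalArgumentException("missing trailing magic");
        }
        // The metadata length is stored little-endian just before the trailing magic.
        int footerLength = ByteBuffer.wrap(file, file.length - tail, 4)
            .order(ByteOrder.LITTLE_ENDIAN)
            .getInt();
        long footerStart = (long) file.length - tail - footerLength;
        if (footerLength < 0 || footerStart < MAGIC.length) {
            throw new IllegalArgumentException("corrupt footer length");
        }
        return new long[] {footerStart, footerLength};
    }
}
```

The bytes in the range starting at `footerStart` and spanning `footerLength` bytes are the serialized metadata that a footer reader would then decode.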
public static final ParquetMetadata readFooter(org.apache.hadoop.conf.Configuration configuration, org.apache.hadoop.fs.FileStatus file) throws IOException
configuration
- the Hadoop configuration used to connect to the file system
file
- the Parquet file
IOException
- if an error occurs while reading the file
public PageReadStore readNextRowGroup() throws IOException
IOException
- if an error occurs while reading
public void close() throws IOException
Specified by: close in interface Closeable
Specified by: close in interface AutoCloseable
IOException
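
Because the reader implements `Closeable`, the usual way to combine `readNextRowGroup()` and `close()` is a try-with-resources loop. The sketch below shows that pattern against a stand-in interface so that it runs without Parquet on the classpath. `RowGroupSource`, `RowGroupLoop`, and `InMemoryRowGroups` are hypothetical names, and the loop assumes `readNextRowGroup()` returns `null` once no row groups remain, which this page does not state.

```java
import java.io.Closeable;
import java.io.IOException;
import java.util.Iterator;
import java.util.List;

/** Hypothetical stand-in for the reader's two instance methods. */
interface RowGroupSource<T> extends Closeable {
    // Assumption: returns null once every row group has been read.
    T readNextRowGroup() throws IOException;
}

final class RowGroupLoop {
    /** Reads every remaining row group and closes the source, even if a read fails. */
    static <T> int countRowGroups(RowGroupSource<T> source) throws IOException {
        int count = 0;
        try (RowGroupSource<T> reader = source) {
            while (reader.readNextRowGroup() != null) {
                count++; // a real caller would process the row group's pages here
            }
        }
        return count;
    }
}

/** In-memory fake used only to exercise the loop. */
final class InMemoryRowGroups implements RowGroupSource<String> {
    private final Iterator<String> groups;
    boolean closed = false;

    InMemoryRowGroups(List<String> groups) {
        this.groups = groups.iterator();
    }

    @Override
    public String readNextRowGroup() {
        return groups.hasNext() ? groups.next() : null;
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

With the real class, the same shape would apply: construct the reader inside the `try (...)` header, loop until the assumed `null` sentinel, and let the block close the underlying file.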
Copyright © 2015. All rights reserved.