java.lang.Object
  com.bigdata.util.CSVReader

public class CSVReader
A helper class to read CSV (comma separated value) and similar kinds of
delimited data. Files may use commas or tabs to delimit columns. If you have
to parse other kinds of delimited data then you should override
split(String).
Note: The default parsing of column values will provide Long integers
and Double precision floating point values rather than
Integer or Float. If you want to change this you need to
customize the CSVReader.Header class since that is responsible for interpreting
column values.
Note: If the caller neither defines headers nor reads them from the file, default headers named by the origin ONE column indices {1,2,3,4,...} are used.
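The default interpretation of column values described in the first note can be sketched as follows. This is a minimal illustration, not the code of CSVReader.Header: the class name ValueInterpretationSketch and the method interpret are hypothetical, and the sketch leaves out the date and time handling that the real header performs.

```java
// Hypothetical stand-in; not part of com.bigdata.util.
class ValueInterpretationSketch {

    /** Interprets text as a Long, then as a Double, and otherwise returns it unchanged. */
    static Object interpret(String text) {
        try {
            return Long.valueOf(text); // integers become Long, not Integer
        } catch (NumberFormatException notAnInteger) {
            // Not an integer; try floating point next.
        }
        try {
            return Double.valueOf(text); // floating point becomes Double, not Float
        } catch (NumberFormatException notANumber) {
            return text; // uninterpreted character data
        }
    }

    public static void main(String[] args) {
        for (String s : new String[] {"42", "3.5", "abc"}) {
            Object v = interpret(s);
            // Prints "42 -> Long", "3.5 -> Double", "abc -> String".
            System.out.println(s + " -> " + v.getClass().getSimpleName());
        }
    }
}
```

Trying Long before Double keeps integral values exact. A caller who wants Integer or Float instead would change the two valueOf calls, which corresponds to customizing CSVReader.Header in the real class.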
| Nested Class Summary | |
|---|---|
| static class | CSVReader.Header: A header for a column that examines its values and interprets them as floating point numbers, integers, dates, or times when possible and as uninterpreted character data otherwise. |
| Field Summary | |
|---|---|
| protected static int | BUF_SIZE: The #of characters to buffer in the reader. |
| protected CSVReader.Header[] | headers: The header definitions (initially null). |
| protected static boolean | INFO |
| protected static org.apache.log4j.Logger | log |
| protected BufferedReader | r: The source. |
| Constructor Summary | |
|---|---|
| CSVReader(InputStream is, String charSet) | |
| CSVReader(Reader r) | |
| Method Summary | |
|---|---|
| CSVReader.Header[] | getHeaders(): Return the current headers (by reference). |
| boolean | getSkipBlankLines() |
| boolean | getSkipCommentLines() |
| long | getTailDelayMillis(): The #of milliseconds that the CSVReader should wait before attempting to read another line from the source (when reading from a pipe) -or- 0L if the CSVReader should NOT continue reading once it has reached the end of the input (default 0L). |
| boolean | getTrimWhitespace() |
| boolean | hasNext() |
| int | lineNo(): The current line number (origin one). |
| Map<String,Object> | next() |
| protected Map<String,Object> | parse(String[] values): Parse the line into column values. |
| protected CSVReader.Header[] | parseHeaders(String line): Parse a line containing headers. |
| void | readHeaders(): Interpret the next row as containing headers. |
| void | remove(): Unsupported operation. |
| protected void | setDefaultHeaders(int ncols): Creates default headers named by the origin ONE column indices {1,2,3,4,...}. |
| void | setHeader(int index, CSVReader.Header header): Re-define the CSVReader.Header at the specified index. |
| void | setHeaders(CSVReader.Header[] headers): Explicitly set the headers. |
| boolean | setSkipBlankLines(boolean skipBlankLines) |
| boolean | setSkipCommentLines(boolean skipCommentLines) |
| long | setTailDelayMillis(long tailDelayMillis) |
| boolean | setTrimWhitespace(boolean trimWhitespace) |
| protected String[] | split(String line): Split the line into columns based on tabs or commas. |
| protected String[] | trim(String[] cols): Trim whitespace and optional quotes from each value iff getTrimWhitespace() is true. |
| Methods inherited from class java.lang.Object |
|---|
| clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
protected static final org.apache.log4j.Logger log

protected static final boolean INFO

protected static final int BUF_SIZE
The #of characters to buffer in the reader.

protected final BufferedReader r
The source.

protected CSVReader.Header[] headers
The header definitions (initially null).
See Also: readHeaders(), setHeaders(CSVReader.Header[])

| Constructor Detail |
|---|
public CSVReader(InputStream is, String charSet) throws IOException
Throws: IOException

public CSVReader(Reader r) throws IOException
Throws: IOException

| Method Detail |
|---|
public int lineNo()
The current line number (origin one).

public boolean setSkipCommentLines(boolean skipCommentLines)

public boolean getSkipCommentLines()

public boolean setSkipBlankLines(boolean skipBlankLines)

public boolean getSkipBlankLines()

public boolean setTrimWhitespace(boolean trimWhitespace)

public boolean getTrimWhitespace()

public long getTailDelayMillis()
The #of milliseconds that the CSVReader should wait before attempting to read another line from the source (when reading from a pipe) -or- 0L if the CSVReader should NOT continue reading once it has reached the end of the input (default 0L).

public long setTailDelayMillis(long tailDelayMillis)

public boolean hasNext()
Specified by: hasNext in interface Iterator<Map<String,Object>>

public Map<String,Object> next()
Specified by: next in interface Iterator<Map<String,Object>>

protected String[] split(String line)
Split the line into columns based on tabs or commas.
Parameters: line - The line.
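
Overriding split(String) for another delimiter might look like the sketch below. Because the real CSVReader is not available here, a hypothetical stand-in class SplitSketch mirrors the protected hook. Its default behavior (splitting on every tab or comma and keeping trailing empty columns) is an assumption, not the documented implementation. PipeSplitSketch is likewise a made-up name.

```java
import java.util.Arrays;

// Hypothetical stand-in for the split(String) hook; the real CSVReader is not used.
class SplitSketch {

    /** Assumed default: split on every tab or comma, keeping trailing empty columns. */
    protected String[] split(String line) {
        return line.split("[,\\t]", -1);
    }

    /** Overrides the hook to handle pipe-delimited data instead. */
    static class PipeSplitSketch extends SplitSketch {
        @Override
        protected String[] split(String line) {
            // The pipe is a regex metacharacter, so it must be escaped.
            return line.split("\\|", -1);
        }
    }

    public static void main(String[] args) {
        // Prints [a, b, c]
        System.out.println(Arrays.toString(new SplitSketch().split("a,b\tc")));
        // Prints [a, b,c] because commas are ordinary characters for the override
        System.out.println(Arrays.toString(new PipeSplitSketch().split("a|b,c")));
    }
}
```

The limit argument of -1 keeps empty trailing columns so that each row yields the same number of values; neither variant handles delimiters inside quoted values.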
protected String[] trim(String[] cols)
Trim whitespace and optional quotes from each value iff getTrimWhitespace() is true.
Parameters: cols - The column values.
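
A rough sketch of this trimming step is given below. The class TrimSketch is hypothetical, the trimWhitespace parameter stands in for getTrimWhitespace(), and the choice to strip exactly one pair of enclosing double quotes is an assumption, since the page does not say which quote characters are recognized.

```java
import java.util.Arrays;

// Hypothetical illustration; not the CSVReader implementation.
class TrimSketch {

    /**
     * Trims whitespace and then one pair of enclosing double quotes from each
     * column, but only when trimWhitespace is true.
     */
    static String[] trim(String[] cols, boolean trimWhitespace) {
        if (!trimWhitespace) {
            return cols; // leave the values untouched
        }
        String[] out = new String[cols.length];
        for (int i = 0; i < cols.length; i++) {
            String s = cols[i].trim();
            if (s.length() >= 2 && s.startsWith("\"") && s.endsWith("\"")) {
                s = s.substring(1, s.length() - 1);
            }
            out[i] = s;
        }
        return out;
    }

    public static void main(String[] args) {
        String[] cols = {" a ", "\"b c\"", " \"d\" "};
        // Prints [a, b c, d]
        System.out.println(Arrays.toString(trim(cols, true)));
    }
}
```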
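protected Map<String,Object> parse(String[] values)
Parse the line into column values. If no headers are defined, default headers are created via setDefaultHeaders(int).
Parameters: values - The column values of the line.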
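
How a row of values could be keyed by header name, with a fallback to origin ONE default names, is sketched below. ParseSketch and both of its methods are hypothetical. The sketch uses plain String header names in place of CSVReader.Header, keeps every value as a String, and rejects rows whose length differs from the header count, which the real class may or may not do.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical illustration; header names are plain Strings here.
class ParseSketch {

    /** Default header names are the origin ONE column indices: "1", "2", "3", ... */
    static String[] defaultHeaderNames(int ncols) {
        String[] names = new String[ncols];
        for (int i = 0; i < ncols; i++) {
            names[i] = Integer.toString(i + 1);
        }
        return names;
    }

    /** Maps each value to its header name, using default names when none are defined. */
    static Map<String, Object> parse(String[] headerNames, String[] values) {
        String[] names = headerNames != null ? headerNames : defaultHeaderNames(values.length);
        if (names.length != values.length) {
            // Assumption: mismatched rows are rejected in this sketch.
            throw new IllegalArgumentException(
                    "expected " + names.length + " columns but found " + values.length);
        }
        Map<String, Object> row = new LinkedHashMap<>(); // preserves column order
        for (int i = 0; i < values.length; i++) {
            row.put(names[i], values[i]);
        }
        return row;
    }

    public static void main(String[] args) {
        // Prints {1=a, 2=b}
        System.out.println(parse(null, new String[] {"a", "b"}));
    }
}
```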
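protected void setDefaultHeaders(int ncols)
Creates default headers named by the origin ONE column indices {1,2,3,4,...}.
Parameters: ncols - The #of columns.

protected CSVReader.Header[] parseHeaders(String line)
Parse a line containing headers.
Parameters: line - The line.

public void readHeaders() throws IOException
Interpret the next row as containing headers.
Throws: IOException

public CSVReader.Header[] getHeaders()
Return the current headers (by reference).

public void setHeaders(CSVReader.Header[] headers)
Explicitly set the headers.
Parameters: headers - The headers.

public void setHeader(int index, CSVReader.Header header)
Re-define the CSVReader.Header at the specified index.
Parameters: index - The index in [0:#headers-1]. header - The new CSVReader.Header definition.

public void remove()
Unsupported operation.
Specified by: remove in interface Iterator<Map<String,Object>>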