public class CsvTokenizer extends Object
Note that this tokenizer consumes an Iterator<String> iteratively and returns columns one by one through nextColumn() or nextColumnOrNull().
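The record-then-column call pattern implied above can be sketched as follows. SimpleTokenizer is a hypothetical stand-in written only for this illustration, not the real CsvTokenizer: it borrows the method names nextRecord(), hasNextColumn(), and nextColumn(), handles unquoted fields only, and ignores nullString and the documented exceptions.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical stand-in, NOT the real CsvTokenizer: it only mimics the
// documented method names and supports unquoted fields.
class TokenizerLoopSketch {
    static final class SimpleTokenizer {
        private final Iterator<String> lines;
        private final String delimiter;
        private String[] columns = new String[0];
        private int cursor = 0;

        SimpleTokenizer(Iterator<String> lines, String delimiter) {
            this.lines = lines;
            this.delimiter = delimiter;
        }

        // Advances to the next non-empty line; returns false once the input is exhausted.
        boolean nextRecord() {
            while (lines.hasNext()) {
                String line = lines.next();
                if (!line.isEmpty()) {
                    // Pattern.quote treats the delimiter literally; -1 keeps trailing empty columns.
                    columns = line.split(Pattern.quote(delimiter), -1);
                    cursor = 0;
                    return true;
                }
            }
            return false;
        }

        boolean hasNextColumn() {
            return cursor < columns.length;
        }

        // Returns the current column and moves the cursor forward.
        String nextColumn() {
            return columns[cursor++];
        }
    }

    // Reads every record by driving the tokenizer one column at a time.
    static List<List<String>> readAll(Iterator<String> lines, String delimiter) {
        SimpleTokenizer tokenizer = new SimpleTokenizer(lines, delimiter);
        List<List<String>> records = new ArrayList<>();
        while (tokenizer.nextRecord()) {
            List<String> record = new ArrayList<>();
            while (tokenizer.hasNextColumn()) {
                record.add(tokenizer.nextColumn());
            }
            records.add(record);
        }
        return records;
    }

    public static void main(String[] args) {
        Iterator<String> lines = Arrays.asList("id,name", "", "1,alice", "2,bob").iterator();
        System.out.println(readAll(lines, ",")); // prints [[id, name], [1, alice], [2, bob]]
    }
}
```

The outer loop advances per record and the inner loop drains the columns of that record, so no full line or full file needs to be held as parsed columns at once.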
Modifier and Type | Class and Description |
---|---|
static class | CsvTokenizer.Builder: A builder of CsvTokenizer. |
Modifier and Type | Method and Description |
---|---|
static CsvTokenizer.Builder | builder(String delimiter): Creates a builder instance for CsvTokenizer. |
long | getCurrentLineNumber(): Returns the number of the line that the tokenizer is currently tokenizing. |
boolean | hasNextColumn(): Checks if the line has a remaining column to read. |
String | nextColumn(): Reads the next tokenized column and moves the tokenizing cursor forward. |
String | nextColumnOrNull(): Reads the next tokenized column, taking nullString into account, and moves the tokenizing cursor forward. |
boolean | nextRecord(): Moves the tokenizing cursor forward to the next record, skipping empty lines. |
boolean | nextRecord(boolean skipEmptyLine): Moves the tokenizing cursor forward to the next record. |
String | skipCurrentLine(): Skips the line that the tokenizer is currently tokenizing. |
boolean | skipHeaderLine(): Skips the line that the tokenizer is currently tokenizing, treating it as a "header line". |
boolean | wasQuotedColumn(): Checks if the last column was a quoted column. |
public static CsvTokenizer.Builder builder(String delimiter)
Creates a builder instance for CsvTokenizer.
Parameters:
delimiter - a mandatory column delimiter (e.g. "," for CSV, "\t" for TSV)
Throws:
NullPointerException - if delimiter is null
IllegalArgumentException - if delimiter is empty

public long getCurrentLineNumber()
Returns the number of the line that the tokenizer is currently tokenizing.

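The argument contract documented for builder(String delimiter) can be sketched as a small check. requireValidDelimiter is a hypothetical helper written for this illustration; it mirrors only the two documented failure cases and says nothing about how the library performs the check internally.

```java
import java.util.Objects;

// Hypothetical helper, not part of the library: it mirrors the documented
// contract that a null delimiter raises NullPointerException and an empty
// delimiter raises IllegalArgumentException.
class DelimiterCheckSketch {
    static String requireValidDelimiter(String delimiter) {
        Objects.requireNonNull(delimiter, "delimiter must not be null");
        if (delimiter.isEmpty()) {
            throw new IllegalArgumentException("delimiter must not be empty");
        }
        return delimiter;
    }

    public static void main(String[] args) {
        // A comma for CSV and a tab for TSV both pass the check.
        requireValidDelimiter(",");
        requireValidDelimiter("\t");
        try {
            requireValidDelimiter("");
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```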
public boolean skipHeaderLine()
Skips the line that the tokenizer is currently tokenizing, treating it as a "header line".
Note that this method does not check whether the line really is a header line. It simply assumes that it is and skips one line unconditionally.
Returns:
false if the tokenizer finds no more lines, true otherwise

public String skipCurrentLine()
Skips the line that the tokenizer is currently tokenizing.
If the tokenizer is in the "quoted" state with newline(s), it skips only the first line in the quote, and then unreads the following lines.

public boolean nextRecord()
Moves the tokenizing cursor forward to the next record, skipping empty lines.
Returns:
true if the tokenizer still has a line to read
Throws:
RecordHasUnexpectedTrailingColumnException - if called before the current line has reached its end-of-line

public boolean nextRecord(boolean skipEmptyLine)
Moves the tokenizing cursor forward to the next record.
Parameters:
skipEmptyLine - true to skip empty lines
Returns:
true if the tokenizer still has a line to read
Throws:
RecordHasUnexpectedTrailingColumnException - if called before the current line has reached its end-of-line

public boolean hasNextColumn()
Checks if the line has a remaining column to read.
Returns:
true if the line has a remaining column

public String nextColumn()
Reads the next tokenized column and moves the tokenizing cursor forward.
Returns:
the next column as a String
Throws:
EndOfFileInQuotedFieldException - if the tokenizer reaches the end-of-file while in the "quoted" state
InvalidCharacterAfterQuoteException - if the tokenizer encounters an invalid character immediately after a quoted field
QuotedFieldLengthLimitExceededException - if the tokenizer encounters a quoted field that is too long
RecordDoesNotHaveExpectedColumnException - if called after the line has already reached its end-of-line

public String nextColumnOrNull()
Reads the next tokenized column, taking nullString into account, and moves the tokenizing cursor forward.
If the tokenizer is configured with nullString, it returns null for a column that matches nullString.
If the tokenizer is not configured with nullString, it returns an empty string "" for an empty quoted field, or null for an empty unquoted field.
Returns:
the next column as a String, or null as described above
Throws:
EndOfFileInQuotedFieldException - if the tokenizer reaches the end-of-file while in the "quoted" state
InvalidCharacterAfterQuoteException - if the tokenizer encounters an invalid character immediately after a quoted field
QuotedFieldLengthLimitExceededException - if the tokenizer encounters a quoted field that is too long
RecordDoesNotHaveExpectedColumnException - if called after the line has already reached its end-of-line

public boolean wasQuotedColumn()
Checks if the last column was a quoted column.
Returns:
true if the last column was a quoted column
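
The nullString rule documented for nextColumnOrNull() combines the raw column value with the quoted flag reported by wasQuotedColumn(). A minimal sketch of that decision follows. The resolve helper and its parameters are hypothetical, and it assumes that, when nullString is not configured, only empty columns are affected and non-empty columns are returned unchanged. That assumption is an interpretation of the description, not something it states outright.

```java
// Hypothetical helper, not the library's implementation: it sketches the
// documented nullString rule under the assumption that, without a
// configured nullString, only empty columns are mapped to "" or null.
class NullStringRuleSketch {
    // column:     the raw value, as nextColumn() would return it
    // wasQuoted:  whether the column was quoted, as wasQuotedColumn() would report
    // nullString: the configured null marker, or null if none is configured
    static String resolve(String column, boolean wasQuoted, String nullString) {
        if (nullString != null) {
            // Configured: a column equal to nullString becomes null.
            return column.equals(nullString) ? null : column;
        }
        if (column.isEmpty()) {
            // Not configured: an empty quoted field stays "", an empty unquoted field becomes null.
            return wasQuoted ? "" : null;
        }
        return column;
    }

    public static void main(String[] args) {
        System.out.println(resolve("NULL", false, "NULL")); // prints null
        System.out.println(resolve("", false, null));       // prints null
        System.out.println(resolve("", true, null).isEmpty()); // prints true
        System.out.println(resolve("abc", false, null));    // prints abc
    }
}
```

Under this reading, a caller that needs to tell an explicit empty string ("") apart from a missing value can leave nullString unset and rely on quoting in the input.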