Package org.jumpmind.symmetric.csv
Class CsvReader
java.lang.Object
org.jumpmind.symmetric.csv.CsvReader
- All Implemented Interfaces:
AutoCloseable
A stream-based parser for delimited text data read from a file or a stream.
-
Field Summary
Fields (modifier and type, field, description):
static final int ESCAPE_MODE_BACKSLASH
    Use a backslash character before the text qualifier to represent an occurrence of the text qualifier.
static final int ESCAPE_MODE_DOUBLED
    Double up the text qualifier to represent an occurrence of the text qualifier.
-
Constructor Summary
Constructors (constructor, description):
CsvReader(InputStream inputStream, char delimiter, Charset charset)
    Constructs a CsvReader object using an InputStream object as the data source.
CsvReader(InputStream inputStream, Charset charset)
    Constructs a CsvReader object using an InputStream object as the data source. Uses a comma as the column delimiter.
Creates a CsvReader object using a file as the data source.
-
Method Summary
Methods (modifier and type, method, description):
void close()
    Closes and releases all related resources.
protected void expandColumnBuffer()
String get(int columnIndex)
    Returns the current column value for a given column index.
String get(String headerName)
    Returns the current column value for a given column header name.
boolean getCaptureRawRecord()
int getColumnCount()
    Gets the count of columns found in this record.
char getComment()
    Gets the character being used as a comment signal.
long getCurrentRecord()
    Gets the index of the current record.
char getDelimiter()
    Gets the character being used as the column delimiter.
int getEscapeMode()
    Gets the current way to escape an occurrence of the text qualifier inside qualified data.
String getHeader(int columnIndex)
    Returns the column header value for a given column index.
int getHeaderCount()
    Gets the count of headers read in by a previous call to readHeaders().
String[] getHeaders()
    Returns the header values as a string array.
int getIndex(String headerName)
    Gets the corresponding column index for a given column header name.
String getRawRecord()
char getRecordDelimiter()
boolean getSafetySwitch()
    Safety precaution that keeps the parser from using large amounts of memory when parsing settings, such as the file encoding, do not match the actual format of a file.
boolean getSkipEmptyRecords()
char getTextQualifier()
    Gets the character to use as a text qualifier in the data.
boolean getTrimWhitespace()
    Gets whether leading and trailing whitespace characters are being trimmed from non-textqualified column data.
boolean getUseComments()
    Gets whether comments are being looked for while parsing or not.
boolean getUseTextQualifier()
    Whether text qualifiers will be used while parsing or not.
String[] getValues()
boolean isQualified(int columnIndex)
static CsvReader parse(String data)
boolean readHeaders()
    Reads the first record of data as column headers.
boolean readRecord()
    Reads another record.
void setCaptureRawRecord(boolean captureRawRecord)
void setComment(char comment)
    Sets the character to use as a comment signal.
void setDelimiter(char delimiter)
    Sets the character to use as the column delimiter.
void setEscapeMode(int escapeMode)
    Sets the current way to escape an occurrence of the text qualifier inside qualified data.
void setHeaders(String[] headers)
void setRecordDelimiter(char recordDelimiter)
    Sets the character to use as the record delimiter.
void setSafetySwitch(boolean safetySwitch)
    Safety precaution that keeps the parser from using large amounts of memory when parsing settings, such as the file encoding, do not match the actual format of a file.
void setSkipEmptyRecords(boolean skipEmptyRecords)
void setTextQualifier(char textQualifier)
    Sets the character to use as a text qualifier in the data.
void setTrimWhitespace(boolean trimWhitespace)
    Sets whether leading and trailing whitespace characters should be trimmed from non-textqualified column data or not.
void setUseComments(boolean useComments)
    Sets whether comments are being looked for while parsing or not.
void setUseTextQualifier(boolean useTextQualifier)
    Sets whether text qualifiers will be used while parsing or not.
boolean skipLine()
    Skips the next line of data using the standard end of line characters and does not do any column delimited parsing.
boolean skipRecord()
    Skips the next record of data by parsing each column. Does not increment getCurrentRecord().
-
Field Details
-
ESCAPE_MODE_DOUBLED
public static final int ESCAPE_MODE_DOUBLED
Double up the text qualifier to represent an occurrence of the text qualifier.
-
ESCAPE_MODE_BACKSLASH
public static final int ESCAPE_MODE_BACKSLASH
Use a backslash character before the text qualifier to represent an occurrence of the text qualifier.
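The difference between the two escape modes can be illustrated with a minimal, JDK-only sketch. It is not CsvReader's implementation: the class and method names (EscapeModeSketch, unescapeDoubled, unescapeBackslash) are hypothetical, and each method is assumed to receive only the text found between the outer text qualifiers.

```java
final class EscapeModeSketch {
    /** Doubled mode: two consecutive qualifiers stand for one literal qualifier. */
    static String unescapeDoubled(String body, char qualifier) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            boolean doubled = c == qualifier
                    && i + 1 < body.length()
                    && body.charAt(i + 1) == qualifier;
            out.append(c);
            if (doubled) {
                i++; // skip the second qualifier of the pair
            }
        }
        return out.toString();
    }

    /** Backslash mode: a backslash before a qualifier stands for one literal qualifier. */
    static String unescapeBackslash(String body, char qualifier) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < body.length(); i++) {
            char c = body.charAt(i);
            boolean escaped = c == '\\'
                    && i + 1 < body.length()
                    && body.charAt(i + 1) == qualifier;
            if (escaped) {
                out.append(qualifier);
                i++; // skip the qualifier that followed the backslash
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

With a double quote as the qualifier, both say ""hi"" in doubled mode and say \"hi\" in backslash mode decode to say "hi" in this sketch.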
-
-
Constructor Details
-
CsvReader
Creates a CsvReader object using a file as the data source.
- Parameters:
fileName - The path to the file to use as the data source.
delimiter - The character to use as the column delimiter.
charset - The Charset to use while parsing the data.
- Throws:
FileNotFoundException
-
CsvReader
- Parameters:
fileName - The path to the file to use as the data source.
delimiter - The character to use as the column delimiter.
- Throws:
FileNotFoundException
-
CsvReader
Creates a CsvReader object using a file as the data source. Uses a comma as the column delimiter and ISO-8859-1 as the Charset.
- Parameters:
fileName - The path to the file to use as the data source.
- Throws:
FileNotFoundException
-
CsvReader
- Parameters:
inputStream - The stream to use as the data source.
delimiter - The character to use as the column delimiter.
-
CsvReader
Constructs a CsvReader object using a Reader object as the data source. Uses a comma as the column delimiter.
- Parameters:
inputStream - The stream to use as the data source.
-
CsvReader
Constructs a CsvReader object using an InputStream object as the data source.
- Parameters:
inputStream - The stream to use as the data source.
delimiter - The character to use as the column delimiter.
charset - The Charset to use while parsing the data.
-
CsvReader
Constructs a CsvReader object using an InputStream object as the data source. Uses a comma as the column delimiter.
- Parameters:
inputStream - The stream to use as the data source.
charset - The Charset to use while parsing the data.
-
-
Method Details
-
getCaptureRawRecord
public boolean getCaptureRawRecord()
-
setCaptureRawRecord
public void setCaptureRawRecord(boolean captureRawRecord)
-
getRawRecord
-
getTrimWhitespace
public boolean getTrimWhitespace()
Gets whether leading and trailing whitespace characters are being trimmed from non-textqualified column data. Default is true.
- Returns:
- Whether leading and trailing whitespace characters are being trimmed from non-textqualified column data.
-
setTrimWhitespace
public void setTrimWhitespace(boolean trimWhitespace)
Sets whether leading and trailing whitespace characters should be trimmed from non-textqualified column data or not. Default is true.
- Parameters:
trimWhitespace - Whether leading and trailing whitespace characters should be trimmed from non-textqualified column data or not.
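The distinction between qualified and non-textqualified data can be sketched with the JDK alone. This is an illustration rather than the parser's own logic: TrimSketch and columnValue are hypothetical names, and a field is assumed to count as qualified only when it both starts and ends with the text qualifier.

```java
final class TrimSketch {
    /**
     * Returns the value of one raw field. Qualified text is returned verbatim
     * (minus the qualifiers); unqualified text is trimmed only when requested.
     */
    static String columnValue(String rawField, char qualifier, boolean trimWhitespace) {
        int n = rawField.length();
        boolean qualified = n >= 2
                && rawField.charAt(0) == qualifier
                && rawField.charAt(n - 1) == qualifier;
        if (qualified) {
            // Whitespace inside the qualifiers is data and is never trimmed.
            return rawField.substring(1, n - 1);
        }
        return trimWhitespace ? rawField.trim() : rawField;
    }
}
```

In this sketch an unqualified field surrounded by spaces loses them when trimming is on, while the same text wrapped in qualifiers keeps its spaces.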
-
getDelimiter
public char getDelimiter()
Gets the character being used as the column delimiter. Default is comma, ','.
- Returns:
- The character being used as the column delimiter.
-
setDelimiter
public void setDelimiter(char delimiter)
Sets the character to use as the column delimiter. Default is comma, ','.
- Parameters:
delimiter - The character to use as the column delimiter.
-
getRecordDelimiter
public char getRecordDelimiter()
-
setRecordDelimiter
public void setRecordDelimiter(char recordDelimiter)
Sets the character to use as the record delimiter.
- Parameters:
recordDelimiter - The character to use as the record delimiter. Default is a combination of the standard end-of-line characters for Windows, Unix, or Mac.
-
getTextQualifier
public char getTextQualifier()
Gets the character to use as a text qualifier in the data.
- Returns:
- The character to use as a text qualifier in the data.
-
setTextQualifier
public void setTextQualifier(char textQualifier)
Sets the character to use as a text qualifier in the data.
- Parameters:
textQualifier - The character to use as a text qualifier in the data.
-
getUseTextQualifier
public boolean getUseTextQualifier()
Whether text qualifiers will be used while parsing or not.
- Returns:
- Whether text qualifiers will be used while parsing or not.
-
setUseTextQualifier
public void setUseTextQualifier(boolean useTextQualifier)
Sets whether text qualifiers will be used while parsing or not.
- Parameters:
useTextQualifier - Whether to use a text qualifier while parsing or not.
-
getComment
public char getComment()
Gets the character being used as a comment signal.
- Returns:
- The character being used as a comment signal.
-
setComment
public void setComment(char comment)
Sets the character to use as a comment signal.
- Parameters:
comment - The character to use as a comment signal.
-
getUseComments
public boolean getUseComments()
Gets whether comments are being looked for while parsing or not.
- Returns:
- Whether comments are being looked for while parsing or not.
-
setUseComments
public void setUseComments(boolean useComments)
Sets whether comments are being looked for while parsing or not.
- Parameters:
useComments - Whether comments are being looked for while parsing or not.
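How the comment character and the useComments flag might interact can be sketched as follows. The page does not say where a comment character must appear, so this JDK-only example assumes a comment is a line whose first character is the comment character; CommentSketch and dataLines are hypothetical names.

```java
import java.util.ArrayList;
import java.util.List;

final class CommentSketch {
    /** Returns the lines that would be treated as data rather than comments. */
    static List<String> dataLines(List<String> lines, char comment, boolean useComments) {
        List<String> kept = new ArrayList<>();
        for (String line : lines) {
            // Assumption: only a comment character in the first position marks a comment.
            boolean isComment = useComments
                    && !line.isEmpty()
                    && line.charAt(0) == comment;
            if (!isComment) {
                kept.add(line);
            }
        }
        return kept;
    }
}
```

With useComments set to false, every line is kept, including those that begin with the comment character.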
-
getEscapeMode
public int getEscapeMode()
Gets the current way to escape an occurrence of the text qualifier inside qualified data.
- Returns:
- The current way to escape an occurrence of the text qualifier inside qualified data.
-
setEscapeMode
Sets the current way to escape an occurrence of the text qualifier inside qualified data.
- Parameters:
escapeMode - The way to escape an occurrence of the text qualifier inside qualified data.
- Throws:
IllegalArgumentException - When an illegal value is specified for escapeMode.
-
getSkipEmptyRecords
public boolean getSkipEmptyRecords()
-
setSkipEmptyRecords
public void setSkipEmptyRecords(boolean skipEmptyRecords)
-
getSafetySwitch
public boolean getSafetySwitch()
Safety precaution that keeps the parser from using large amounts of memory when parsing settings, such as the file encoding, do not match the actual format of a file. This switch can be turned off if the file format is known and tested. With the switch off, the maximum column length and maximum column count per record supported by the parser greatly increase. Default is true.
- Returns:
- The current setting of the safety switch.
-
setSafetySwitch
public void setSafetySwitch(boolean safetySwitch)
Safety precaution that keeps the parser from using large amounts of memory when parsing settings, such as the file encoding, do not match the actual format of a file. This switch can be turned off if the file format is known and tested. With the switch off, the maximum column length and maximum column count per record supported by the parser greatly increase. Default is true.
- Parameters:
safetySwitch - The new setting of the safety switch.
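The idea behind a safety switch can be sketched as a length guard on the column buffer. This is a JDK-only illustration under stated assumptions, not the parser's code: SafetySwitchSketch, append, and the maxLength parameter are hypothetical, and the actual limits enforced by CsvReader are not given on this page.

```java
final class SafetySwitchSketch {
    /**
     * Appends one character to the column being built. When the safety switch
     * is on, growth past maxLength is rejected instead of consuming more memory.
     */
    static void append(StringBuilder column, char c, boolean safetySwitch, int maxLength) {
        if (safetySwitch && column.length() >= maxLength) {
            throw new IllegalStateException(
                    "Column length limit of " + maxLength + " exceeded; "
                    + "check the delimiter, qualifier, and encoding settings.");
        }
        column.append(c);
    }
}
```

A mismatched qualifier or encoding can make an entire file look like one column, which is the kind of runaway growth such a guard is meant to stop early.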
-
getColumnCount
public int getColumnCount()
Gets the count of columns found in this record.
- Returns:
- The count of columns found in this record.
-
getCurrentRecord
public long getCurrentRecord()
Gets the index of the current record.
- Returns:
- The index of the current record.
-
getHeaderCount
public int getHeaderCount()
Gets the count of headers read in by a previous call to readHeaders().
- Returns:
- The count of headers read in by a previous call to readHeaders().
-
getHeaders
Returns the header values as a string array.
- Returns:
- The header values as a String array.
- Throws:
IOException - Thrown if this object has already been closed.
-
setHeaders
-
getValues
- Throws:
IOException
-
get
Returns the current column value for a given column index.
- Parameters:
columnIndex - The index of the column.
- Returns:
- The current column value.
- Throws:
IOException - Thrown if this object has already been closed.
-
get
Returns the current column value for a given column header name.
- Parameters:
headerName - The header name of the column.
- Returns:
- The current column value.
- Throws:
IOException - Thrown if this object has already been closed.
-
parse
- Parameters:
data - The String of data to use as the source.
- Returns:
- A CsvReader object using the String of data as the source.
-
readRecord
Reads another record.
- Returns:
- Whether another record was successfully read or not.
- Throws:
IOException - Thrown if an error occurs while reading data from the source stream.
-
readHeaders
Reads the first record of data as column headers.
- Returns:
- Whether the header record was successfully read or not.
- Throws:
IOException - Thrown if an error occurs while reading data from the source stream.
-
getHeader
Returns the column header value for a given column index.
- Parameters:
columnIndex - The index of the header column being requested.
- Returns:
- The value of the column header at the given column index.
- Throws:
IOException - Thrown if this object has already been closed.
-
isQualified
- Throws:
IOException
-
expandColumnBuffer
protected void expandColumnBuffer()
-
getIndex
Gets the corresponding column index for a given column header name.
- Parameters:
headerName - The header name of the column.
- Returns:
- The column index for the given column header name. Returns -1 if not found.
- Throws:
IOException - Thrown if this object has already been closed.
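The name-to-index lookup with a -1 result for unknown names can be sketched with a plain map. This is a JDK-only illustration, not the class's internals: HeaderIndexSketch is a hypothetical name, and how CsvReader treats duplicate header names is not stated on this page.

```java
import java.util.HashMap;
import java.util.Map;

final class HeaderIndexSketch {
    private final Map<String, Integer> indexByName = new HashMap<>();

    HeaderIndexSketch(String[] headers) {
        for (int i = 0; i < headers.length; i++) {
            // In this sketch a later duplicate name overwrites an earlier one.
            indexByName.put(headers[i], i);
        }
    }

    /** Returns the column index for headerName, or -1 if it is not a known header. */
    int getIndex(String headerName) {
        Integer index = indexByName.get(headerName);
        return index == null ? -1 : index;
    }
}
```

Callers that look up optional columns can test the result against -1 before reading a value by index.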
-
skipRecord
Skips the next record of data by parsing each column. Does not increment getCurrentRecord().
- Returns:
- Whether another record was successfully skipped or not.
- Throws:
IOException - Thrown if an error occurs while reading data from the source stream.
-
skipLine
Skips the next line of data using the standard end of line characters and does not do any column delimited parsing.
- Returns:
- Whether a line was successfully skipped or not.
- Throws:
IOException - Thrown if an error occurs while reading data from the source stream.
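The practical difference between skipping a line and skipping a record shows up when a qualified column contains a line break. The following JDK-only sketch illustrates that difference on a String; it is not how CsvReader is implemented, SkipSketch and its methods are hypothetical, and only the doubled escape style is considered.

```java
final class SkipSketch {
    /** Returns the index just past a line break starting at i, or -1 if none starts there. */
    private static int pastLineBreak(String data, int i) {
        char c = data.charAt(i);
        if (c == '\n') {
            return i + 1;
        }
        if (c == '\r') {
            boolean crlf = i + 1 < data.length() && data.charAt(i + 1) == '\n';
            return crlf ? i + 2 : i + 1;
        }
        return -1;
    }

    /** Line-based skip: stops at the first line break, ignoring qualifiers. */
    static int skipLine(String data) {
        for (int i = 0; i < data.length(); i++) {
            int next = pastLineBreak(data, i);
            if (next >= 0) {
                return next;
            }
        }
        return data.length();
    }

    /** Record-based skip: line breaks inside qualified text do not end the record. */
    static int skipRecord(String data, char qualifier) {
        boolean inQualified = false;
        for (int i = 0; i < data.length(); i++) {
            if (data.charAt(i) == qualifier) {
                // A doubled qualifier toggles twice, leaving the state unchanged.
                inQualified = !inQualified;
            } else if (!inQualified) {
                int next = pastLineBreak(data, i);
                if (next >= 0) {
                    return next;
                }
            }
        }
        return data.length();
    }
}
```

For input whose first record holds a qualified column spanning two lines, the line-based skip in this sketch lands in the middle of that record, while the record-based skip lands at the start of the next one.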
-
close
public void close()
Closes and releases all related resources.
- Specified by:
close in interface AutoCloseable
-