Class CsvReader

java.lang.Object
org.jumpmind.symmetric.csv.CsvReader
All Implemented Interfaces:
AutoCloseable

public class CsvReader extends Object implements AutoCloseable
A stream-based parser for delimited text data read from a file or a stream.
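A minimal usage sketch, assuming the library that provides org.jumpmind.symmetric.csv.CsvReader is on the classpath. It uses only members documented on this page (parse, readHeaders, readRecord, get, close). The class name CsvReaderQuickStart, the helper readNames, and the sample data are made up for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.jumpmind.symmetric.csv.CsvReader;

public class CsvReaderQuickStart {

    /** Hypothetical helper: collects the "name" column of every data record. */
    static List<String> readNames(String csv) throws IOException {
        List<String> names = new ArrayList<>();
        // CsvReader implements AutoCloseable, so try-with-resources calls close().
        try (CsvReader reader = CsvReader.parse(csv)) {
            // Consume the first record as column headers.
            if (!reader.readHeaders()) {
                return names; // no header record could be read
            }
            // readRecord() returns false once no further record can be read.
            while (reader.readRecord()) {
                names.add(reader.get("name"));
            }
        }
        return names;
    }

    public static void main(String[] args) throws IOException {
        String csv = "id,name\n1,Alice\n2,Bob\n"; // made-up sample data
        System.out.println(readNames(csv));
    }
}
```

Because close() is declared without a throws clause, the try-with-resources statement only has to deal with the IOException raised by the read methods.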
  • Field Summary

    Fields
    Modifier and Type
    Field
    Description
    static final int
    ESCAPE_MODE_BACKSLASH
    Use a backslash character before the text qualifier to represent an occurrence of the text qualifier.
    static final int
    ESCAPE_MODE_DOUBLED
    Double up the text qualifier to represent an occurrence of the text qualifier.
  • Constructor Summary

    Constructors
    Constructor
    Description
    CsvReader(InputStream inputStream, char delimiter, Charset charset)
    Constructs a CsvReader object using an InputStream object as the data source.
    CsvReader(InputStream inputStream, Charset charset)
    Constructs a CsvReader object using an InputStream object as the data source. Uses a comma as the column delimiter.
    CsvReader(Reader inputStream)
    Constructs a CsvReader object using a Reader object as the data source. Uses a comma as the column delimiter.
    CsvReader(Reader inputStream, char delimiter)
    Constructs a CsvReader object using a Reader object as the data source.
    CsvReader(String fileName)
    Creates a CsvReader object using a file as the data source. Uses a comma as the column delimiter and ISO-8859-1 as the Charset.
    CsvReader(String fileName, char delimiter)
    Creates a CsvReader object using a file as the data source. Uses ISO-8859-1 as the Charset.
    CsvReader(String fileName, char delimiter, Charset charset)
    Creates a CsvReader object using a file as the data source.
  • Method Summary

    Modifier and Type
    Method
    Description
    void
    close()
    Closes and releases all related resources.
    protected void
    expandColumnBuffer()
     
    String
    get(int columnIndex)
    Returns the current column value for a given column index.
    String
    get(String headerName)
    Returns the current column value for a given column header name.
    boolean
    getCaptureRawRecord()
     
    int
    getColumnCount()
    Gets the count of columns found in this record.
    char
    getComment()
    Gets the character being used as a comment signal.
    long
    getCurrentRecord()
    Gets the index of the current record.
    char
    getDelimiter()
    Gets the character being used as the column delimiter.
    int
    getEscapeMode()
    Gets the current way to escape an occurrence of the text qualifier inside qualified data.
    String
    getHeader(int columnIndex)
    Returns the column header value for a given column index.
    int
    getHeaderCount()
    Gets the count of headers read in by a previous call to readHeaders().
    String[]
    getHeaders()
    Returns the header values as a string array.
    int
    getIndex(String headerName)
    Gets the corresponding column index for a given column header name.
    String
    getRawRecord()
     
    char
    getRecordDelimiter()
     
    boolean
    getSafetySwitch()
    Gets the safety switch setting, which guards against excessive memory use when parsing settings such as the file encoding do not match the actual format of a file.
    boolean
    getSkipEmptyRecords()
     
    char
    getTextQualifier()
    Gets the character to use as a text qualifier in the data.
    boolean
    getTrimWhitespace()
    Gets whether leading and trailing whitespace characters are being trimmed from non-textqualified column data.
    boolean
    getUseComments()
    Gets whether comments are being looked for while parsing or not.
    boolean
    getUseTextQualifier()
    Gets whether text qualifiers will be used while parsing or not.
    String[]
    getValues()
     
    boolean
    isQualified(int columnIndex)
     
    static CsvReader
    parse(String data)
    Creates a CsvReader object using a string of data as the source. Uses ISO-8859-1 as the Charset.
    boolean
    readHeaders()
    Reads the first record of data as column headers.
    boolean
    readRecord()
    Reads another record.
    void
    setCaptureRawRecord(boolean captureRawRecord)
     
    void
    setComment(char comment)
    Sets the character to use as a comment signal.
    void
    setDelimiter(char delimiter)
    Sets the character to use as the column delimiter.
    void
    setEscapeMode(int escapeMode)
    Sets the current way to escape an occurrence of the text qualifier inside qualified data.
    void
    setHeaders(String[] headers)
     
    void
    setRecordDelimiter(char recordDelimiter)
    Sets the character to use as the record delimiter.
    void
    setSafetySwitch(boolean safetySwitch)
    Sets the safety switch, which guards against excessive memory use when parsing settings such as the file encoding do not match the actual format of a file.
    void
    setSkipEmptyRecords(boolean skipEmptyRecords)
     
    void
    setTextQualifier(char textQualifier)
    Sets the character to use as a text qualifier in the data.
    void
    setTrimWhitespace(boolean trimWhitespace)
    Sets whether leading and trailing whitespace characters should be trimmed from non-textqualified column data or not.
    void
    setUseComments(boolean useComments)
    Sets whether comments are being looked for while parsing or not.
    void
    setUseTextQualifier(boolean useTextQualifier)
    Sets whether text qualifiers will be used while parsing or not.
    boolean
    skipLine()
    Skips the next line of data using the standard end of line characters and does not do any column delimited parsing.
    boolean
    skipRecord()
    Skips the next record of data by parsing each column. Does not increment getCurrentRecord().

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Field Details

    • ESCAPE_MODE_DOUBLED

      public static final int ESCAPE_MODE_DOUBLED
      Double up the text qualifier to represent an occurrence of the text qualifier.
    • ESCAPE_MODE_BACKSLASH

      public static final int ESCAPE_MODE_BACKSLASH
      Use a backslash character before the text qualifier to represent an occurrence of the text qualifier.
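      To make the two escaping conventions concrete, here is a small standalone sketch that decodes the text found between an opening and a closing text qualifier. It does not use CsvReader and is not the library's implementation. The class EscapeModeSketch, the Mode enum, and the decode method are hypothetical names, and the enum merely stands in for the two int constants, whose values are not shown on this page.

```java
public class EscapeModeSketch {

    // Hypothetical stand-ins for ESCAPE_MODE_DOUBLED and ESCAPE_MODE_BACKSLASH.
    enum Mode { DOUBLED, BACKSLASH }

    /**
     * Decodes the content of one qualified field (without the outer qualifiers),
     * turning each escaped qualifier back into a single qualifier character.
     */
    static String decode(String inner, char qualifier, Mode mode) {
        StringBuilder out = new StringBuilder();
        for (int i = 0; i < inner.length(); i++) {
            char c = inner.charAt(i);
            boolean nextIsQualifier =
                    i + 1 < inner.length() && inner.charAt(i + 1) == qualifier;
            if (mode == Mode.DOUBLED && c == qualifier && nextIsQualifier) {
                out.append(qualifier); // "" stands for one "
                i++;                   // skip the second qualifier
            } else if (mode == Mode.BACKSLASH && c == '\\' && nextIsQualifier) {
                out.append(qualifier); // \" stands for one "
                i++;                   // skip the qualifier after the backslash
            } else {
                out.append(c);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Field content as it appears in the data: say ""hi""
        System.out.println(decode("say \"\"hi\"\"", '"', Mode.DOUBLED));
        // Field content as it appears in the data: say \"hi\"
        System.out.println(decode("say \\\"hi\\\"", '"', Mode.BACKSLASH));
    }
}
```

      Both calls in main yield the same decoded value; only the encoding in the source data differs.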
  • Constructor Details

    • CsvReader

      public CsvReader(String fileName, char delimiter, Charset charset) throws FileNotFoundException
      Creates a CsvReader object using a file as the data source.
      Parameters:
      fileName - The path to the file to use as the data source.
      delimiter - The character to use as the column delimiter.
      charset - The Charset to use while parsing the data.
      Throws:
      FileNotFoundException
    • CsvReader

      public CsvReader(String fileName, char delimiter) throws FileNotFoundException
      Creates a CsvReader object using a file as the data source. Uses ISO-8859-1 as the Charset.
      Parameters:
      fileName - The path to the file to use as the data source.
      delimiter - The character to use as the column delimiter.
      Throws:
      FileNotFoundException
    • CsvReader

      public CsvReader(String fileName) throws FileNotFoundException
      Creates a CsvReader object using a file as the data source. Uses a comma as the column delimiter and ISO-8859-1 as the Charset.
      Parameters:
      fileName - The path to the file to use as the data source.
      Throws:
      FileNotFoundException
    • CsvReader

      public CsvReader(Reader inputStream, char delimiter)
      Constructs a CsvReader object using a Reader object as the data source.
      Parameters:
      inputStream - The stream to use as the data source.
      delimiter - The character to use as the column delimiter.
    • CsvReader

      public CsvReader(Reader inputStream)
      Constructs a CsvReader object using a Reader object as the data source. Uses a comma as the column delimiter.
      Parameters:
      inputStream - The stream to use as the data source.
    • CsvReader

      public CsvReader(InputStream inputStream, char delimiter, Charset charset)
      Constructs a CsvReader object using an InputStream object as the data source.
      Parameters:
      inputStream - The stream to use as the data source.
      delimiter - The character to use as the column delimiter.
      charset - The Charset to use while parsing the data.
    • CsvReader

      public CsvReader(InputStream inputStream, Charset charset)
      Constructs a CsvReader object using an InputStream object as the data source. Uses a comma as the column delimiter.
      Parameters:
      inputStream - The stream to use as the data source.
      charset - The Charset to use while parsing the data.
  • Method Details

    • getCaptureRawRecord

      public boolean getCaptureRawRecord()
    • setCaptureRawRecord

      public void setCaptureRawRecord(boolean captureRawRecord)
    • getRawRecord

      public String getRawRecord()
    • getTrimWhitespace

      public boolean getTrimWhitespace()
      Gets whether leading and trailing whitespace characters are being trimmed from non-textqualified column data. Default is true.
      Returns:
      Whether leading and trailing whitespace characters are being trimmed from non-textqualified column data.
    • setTrimWhitespace

      public void setTrimWhitespace(boolean trimWhitespace)
      Sets whether leading and trailing whitespace characters should be trimmed from non-textqualified column data or not. Default is true.
      Parameters:
      trimWhitespace - Whether leading and trailing whitespace characters should be trimmed from non-textqualified column data or not.
    • getDelimiter

      public char getDelimiter()
      Gets the character being used as the column delimiter. Default is comma, ','.
      Returns:
      The character being used as the column delimiter.
    • setDelimiter

      public void setDelimiter(char delimiter)
      Sets the character to use as the column delimiter. Default is comma, ','.
      Parameters:
      delimiter - The character to use as the column delimiter.
    • getRecordDelimiter

      public char getRecordDelimiter()
    • setRecordDelimiter

      public void setRecordDelimiter(char recordDelimiter)
      Sets the character to use as the record delimiter.
      Parameters:
      recordDelimiter - The character to use as the record delimiter. The default is a combination of the standard end-of-line characters for Windows, Unix, and Mac.
    • getTextQualifier

      public char getTextQualifier()
      Gets the character to use as a text qualifier in the data.
      Returns:
      The character to use as a text qualifier in the data.
    • setTextQualifier

      public void setTextQualifier(char textQualifier)
      Sets the character to use as a text qualifier in the data.
      Parameters:
      textQualifier - The character to use as a text qualifier in the data.
    • getUseTextQualifier

      public boolean getUseTextQualifier()
      Gets whether text qualifiers will be used while parsing or not.
      Returns:
      Whether text qualifiers will be used while parsing or not.
    • setUseTextQualifier

      public void setUseTextQualifier(boolean useTextQualifier)
      Sets whether text qualifiers will be used while parsing or not.
      Parameters:
      useTextQualifier - Whether to use a text qualifier while parsing or not.
    • getComment

      public char getComment()
      Gets the character being used as a comment signal.
      Returns:
      The character being used as a comment signal.
    • setComment

      public void setComment(char comment)
      Sets the character to use as a comment signal.
      Parameters:
      comment - The character to use as a comment signal.
    • getUseComments

      public boolean getUseComments()
      Gets whether comments are being looked for while parsing or not.
      Returns:
      Whether comments are being looked for while parsing or not.
    • setUseComments

      public void setUseComments(boolean useComments)
      Sets whether comments are being looked for while parsing or not.
      Parameters:
      useComments - Whether comments are being looked for while parsing or not.
    • getEscapeMode

      public int getEscapeMode()
      Gets the current way to escape an occurrence of the text qualifier inside qualified data.
      Returns:
      The current way to escape an occurrence of the text qualifier inside qualified data.
    • setEscapeMode

      public void setEscapeMode(int escapeMode) throws IllegalArgumentException
      Sets the current way to escape an occurrence of the text qualifier inside qualified data.
      Parameters:
      escapeMode - The way to escape an occurrence of the text qualifier inside qualified data.
      Throws:
      IllegalArgumentException - When an illegal value is specified for escapeMode.
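      The setters above are typically combined before the first record is read. The following sketch assumes the library is on the classpath and configures a reader for pipe-delimited data with '#' comment lines and backslash-escaped qualifiers. The class name CsvReaderConfigSketch, the helper readPipeDelimited, and the sample data are made up for illustration.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.jumpmind.symmetric.csv.CsvReader;

public class CsvReaderConfigSketch {

    /** Hypothetical helper: reads every record of pipe-delimited text into lists. */
    static List<List<String>> readPipeDelimited(String data) throws IOException {
        List<List<String>> rows = new ArrayList<>();
        try (CsvReader reader = new CsvReader(new StringReader(data), '|')) {
            reader.setUseComments(true);   // look for comment lines
            reader.setComment('#');        // character that signals a comment
            reader.setUseTextQualifier(true);
            reader.setTextQualifier('"');
            // setEscapeMode throws IllegalArgumentException for unknown values,
            // so pass one of the documented constants.
            reader.setEscapeMode(CsvReader.ESCAPE_MODE_BACKSLASH);
            while (reader.readRecord()) {
                // Copy the values so later reads cannot affect the stored row.
                rows.add(new ArrayList<>(Arrays.asList(reader.getValues())));
            }
        }
        return rows;
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample: one comment line, one qualified field with escaped quotes.
        String data = "# exported rows\n1|\"say \\\"hi\\\"\"\n2|plain\n";
        System.out.println(readPipeDelimited(data));
    }
}
```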
    • getSkipEmptyRecords

      public boolean getSkipEmptyRecords()
    • setSkipEmptyRecords

      public void setSkipEmptyRecords(boolean skipEmptyRecords)
    • getSafetySwitch

      public boolean getSafetySwitch()
      Gets the safety switch setting. The switch guards against the parser using large amounts of memory when parsing settings, such as the file encoding, do not match the actual format of a file. It can be turned off if the file format is known and tested. With the switch off, the maximum column length and the maximum column count per record supported by the parser greatly increase. Default is true.
      Returns:
      The current setting of the safety switch.
    • setSafetySwitch

      public void setSafetySwitch(boolean safetySwitch)
      Sets the safety switch. The switch guards against the parser using large amounts of memory when parsing settings, such as the file encoding, do not match the actual format of a file. It can be turned off if the file format is known and tested. With the switch off, the maximum column length and the maximum column count per record supported by the parser greatly increase. Default is true.
      Parameters:
      safetySwitch - The new setting of the safety switch.
    • getColumnCount

      public int getColumnCount()
      Gets the count of columns found in this record.
      Returns:
      The count of columns found in this record.
    • getCurrentRecord

      public long getCurrentRecord()
      Gets the index of the current record.
      Returns:
      The index of the current record.
    • getHeaderCount

      public int getHeaderCount()
      Gets the count of headers read in by a previous call to readHeaders().
      Returns:
      The count of headers read in by a previous call to readHeaders().
    • getHeaders

      public String[] getHeaders() throws IOException
      Returns the header values as a string array.
      Returns:
      The header values as a String array.
      Throws:
      IOException - Thrown if this object has already been closed.
    • setHeaders

      public void setHeaders(String[] headers)
    • getValues

      public String[] getValues() throws IOException
      Throws:
      IOException
    • get

      public String get(int columnIndex) throws IOException
      Returns the current column value for a given column index.
      Parameters:
      columnIndex - The index of the column.
      Returns:
      The current column value.
      Throws:
      IOException - Thrown if this object has already been closed.
    • get

      public String get(String headerName) throws IOException
      Returns the current column value for a given column header name.
      Parameters:
      headerName - The header name of the column.
      Returns:
      The current column value.
      Throws:
      IOException - Thrown if this object has already been closed.
    • parse

      public static CsvReader parse(String data)
      Creates a CsvReader object using a string of data as the source. Uses ISO-8859-1 as the Charset.
      Parameters:
      data - The String of data to use as the source.
      Returns:
      A CsvReader object using the String of data as the source.
    • readRecord

      public boolean readRecord() throws IOException
      Reads another record.
      Returns:
      Whether another record was successfully read or not.
      Throws:
      IOException - Thrown if an error occurs while reading data from the source stream.
    • readHeaders

      public boolean readHeaders() throws IOException
      Reads the first record of data as column headers.
      Returns:
      Whether the header record was successfully read or not.
      Throws:
      IOException - Thrown if an error occurs while reading data from the source stream.
    • getHeader

      public String getHeader(int columnIndex) throws IOException
      Returns the column header value for a given column index.
      Parameters:
      columnIndex - The index of the header column being requested.
      Returns:
      The value of the column header at the given column index.
      Throws:
      IOException - Thrown if this object has already been closed.
    • isQualified

      public boolean isQualified(int columnIndex) throws IOException
      Throws:
      IOException
    • expandColumnBuffer

      protected void expandColumnBuffer()
    • getIndex

      public int getIndex(String headerName) throws IOException
      Gets the corresponding column index for a given column header name.
      Parameters:
      headerName - The header name of the column.
      Returns:
      The column index for the given column header name. Returns -1 if not found.
      Throws:
      IOException - Thrown if this object has already been closed.
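      Since getIndex returns -1 for an unknown header, it can be used to probe for optional columns before reading them. A minimal sketch, assuming the library is on the classpath; the class HeaderLookupSketch, its helper column, and the sample data are hypothetical.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

import org.jumpmind.symmetric.csv.CsvReader;

public class HeaderLookupSketch {

    /**
     * Hypothetical helper: returns the values of one column, or the fallback
     * for every record when the header does not exist.
     */
    static List<String> column(String csv, String headerName, String fallback)
            throws IOException {
        List<String> values = new ArrayList<>();
        try (CsvReader reader = CsvReader.parse(csv)) {
            if (!reader.readHeaders()) {
                return values; // no header record could be read
            }
            // -1 signals that no header with this name was read.
            int index = reader.getIndex(headerName);
            while (reader.readRecord()) {
                values.add(index == -1 ? fallback : reader.get(index));
            }
        }
        return values;
    }

    public static void main(String[] args) throws IOException {
        String csv = "id,name\n1,Alice\n"; // made-up sample data
        System.out.println(column(csv, "name", "n/a"));
        System.out.println(column(csv, "email", "n/a"));
    }
}
```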
    • skipRecord

      public boolean skipRecord() throws IOException
      Skips the next record of data by parsing each column. Does not increment getCurrentRecord().
      Returns:
      Whether another record was successfully skipped or not.
      Throws:
      IOException - Thrown if an error occurs while reading data from the source stream.
    • skipLine

      public boolean skipLine() throws IOException
      Skips the next line of data using the standard end of line characters and does not do any column delimited parsing.
      Returns:
      Whether a line was successfully skipped or not.
      Throws:
      IOException - Thrown if an error occurs while reading data from the source stream.
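      One plausible use of skipLine is to step over a free-text preamble before the header record, since the line is not parsed into columns. A minimal sketch, assuming the library is on the classpath; the class PreambleSkipSketch, the helper headersAfterPreamble, and the sample data are made up for illustration.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

import org.jumpmind.symmetric.csv.CsvReader;

public class PreambleSkipSketch {

    /** Hypothetical helper: skips one preamble line, then returns the headers. */
    static List<String> headersAfterPreamble(String data) throws IOException {
        try (CsvReader reader = CsvReader.parse(data)) {
            // skipLine() does no column parsing, so delimiters in the
            // preamble are irrelevant.
            if (!reader.skipLine()) {
                return new ArrayList<>(); // nothing to skip
            }
            if (!reader.readHeaders()) {
                return new ArrayList<>(); // no header record after the preamble
            }
            // Read the headers before the reader is closed.
            return new ArrayList<>(Arrays.asList(reader.getHeaders()));
        }
    }

    public static void main(String[] args) throws IOException {
        // Made-up sample: the first line is free text that contains a comma.
        String data = "Report generated nightly, do not edit\nid,name\n1,Alice\n";
        System.out.println(headersAfterPreamble(data));
    }
}
```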
    • close

      public void close()
      Closes and releases all related resources.
      Specified by:
      close in interface AutoCloseable