In this article we will look at how to read CSV files into Java objects. We will use the OpenCSV library to do the conversion, and look at some examples of how to customize it to suit different requirements.

1. Pre-requisite

In this article we will make use of Gradle to import the package dependencies into our Java project. However, you can use Maven or any other dependency management tool you want.

In your build.gradle, add the following dependency for OpenCSV:

implementation group: 'com.opencsv', name: 'opencsv', version: '5.2'

OpenCSV is a simple library for reading and writing to CSV files.

2. Reading from a CSV file (simplest scenario)

To start with, we will consider the simplest possible scenario:

  • The CSV file contains only simple data types such as numbers and strings
  • Every column in the CSV file has a header, and the header names exactly match the field names in the Java Bean

Consider the following sample CSV file, which lists a few product details:

id,name,retailPrice,discountedPrice,brand
1000,Double Sofa Bed,32157,22646,FabHomeDecor
1001,AW Bellies,999,499,AW
1002,Women's Cycling Shorts,699,267,Alisha

This is the target class into which we want the data to be loaded.

@Data
public class Product {
    private String id;
    private String name;
    private String brand;
    private BigDecimal retailPrice;
    private BigDecimal discountedPrice;
}

Note that in this example the column headers in the CSV file are exactly the same as the field names in the Java Bean.

Note also that we are using Lombok annotations in our Java Beans, so we don’t have to write boilerplate code such as constructors, getters, setters, toString, etc.

To load the data, we first need to create an instance of com.opencsv.bean.CsvToBean, which will read and parse the CSV file into a List<Product>.

The first step is to open the file and get a java.io.Reader:

Reader reader = new BufferedReader(new FileReader("file.csv"));
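
The reader opened this way is never closed, and FileReader decodes the file with the platform default charset. A minimal sketch of one way to handle both points with try-with-resources and an explicit UTF-8 charset is shown here. The names ReaderDemo and headerLine, and the temporary file standing in for file.csv, are assumptions made for illustration; they are not part of the original code.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;

class ReaderDemo {

    // Reads the header line and closes the reader when done, even if reading fails.
    static String headerLine(Reader source) {
        try (BufferedReader reader = new BufferedReader(source)) {
            return reader.readLine();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        // Temporary file standing in for file.csv (illustrative only)
        Path csvFile = Files.createTempFile("products", ".csv");
        Files.write(csvFile, List.of(
                "id,name,retailPrice,discountedPrice,brand",
                "1000,Double Sofa Bed,32157,22646,FabHomeDecor"), StandardCharsets.UTF_8);

        // Decode explicitly as UTF-8 rather than relying on the platform default charset
        Reader source = new InputStreamReader(Files.newInputStream(csvFile), StandardCharsets.UTF_8);
        System.out.println(headerLine(source));

        Files.delete(csvFile);
    }
}
```

The same try-with-resources block could wrap the CSV parsing itself, so that the file is released as soon as the records have been read.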

Now that we have a Reader, we will create an instance of CsvToBean using a CsvToBeanBuilder, supplying:

  • Reader we created above
  • Reference to the Java Bean class which is to be created
  • Separator for the CSV file (a comma by default)

CsvToBean<Product> csvReader = new CsvToBeanBuilder<Product>(reader)
                .withType(Product.class)
                .withSeparator(',')
                .withIgnoreLeadingWhiteSpace(true)
                .withIgnoreEmptyLine(true)
                .build();

Once we have created an instance of CsvToBean, we can use it to read and parse the CSV file into a List<Product>:

List<Product> results = csvReader.parse();

If we don’t want to load all the results at once, we can instead get an Iterator or a Stream of Product and process the records one by one.

Iterator<Product> iterator = csvReader.iterator();
Stream<Product> stream = csvReader.stream();

Simple, isn’t it?

Real-life situations are rarely so simple. In the next few sections we will add complexities to the CSV file and see how to handle them while still reading the file into a list of Java objects.

3. Reading from a CSV file when header names do not match

Field names in Java classes mostly follow the camel case convention.

In real-world scenarios, the header names in a CSV file often follow a different convention. They may even have completely different names.
Consider the following sample CSV file, with different header names.

uniq_id,product_name,retail_price,discounted_price,brand
1000,Double Sofa Bed,32157,22646,FabHomeDecor
1001,AW Bellies,999,499,AW
1002,Women's Cycling Shorts,699,267,Alisha

If we read this file as is using the code above, we will get null values in every field whose name does not match a header (all except the brand field).

So we need to modify our code to handle the difference in names between the CSV file and the Java Bean. Let’s first create such a mapping.

Map<String, String> columnMappings = Map.of(
                "uniq_id", "id",
                "product_name", "name",
                "retail_price", "retailPrice",
                "discounted_price", "discountedPrice",
                "brand", "brand"
        );

Next we will create a CsvToBean using the column name mappings:

HeaderColumnNameTranslateMappingStrategy<Product> mappingStrategy =
        new HeaderColumnNameTranslateMappingStrategy<>();
mappingStrategy.setColumnMapping(columnMappings);
mappingStrategy.setType(Product.class);


CsvToBean<Product> csvReader = new CsvToBeanBuilder<Product>(reader)
                .withType(Product.class)
                .withSeparator(',')
                .withIgnoreLeadingWhiteSpace(true)
                .withIgnoreEmptyLine(true)
                .withMappingStrategy(mappingStrategy)
                .build();

Now that we have given CsvToBean a mapping strategy, we can call the parse or stream methods to read the CSV file into a List<Product>, even though the header names do not match the field names.
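
To make the role of the column mapping concrete, the following plain-JDK sketch translates a header row into bean field names. It only illustrates the idea of header translation and is not how OpenCSV implements it. The names HeaderTranslationSketch and translate, and the "<unmapped>" placeholder, are assumptions made for this example.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

class HeaderTranslationSketch {

    // Translates each CSV header into the bean field name it is mapped to.
    // Headers without a mapping are reported as "<unmapped>" in this sketch.
    static List<String> translate(String headerLine, Map<String, String> columnMappings) {
        return Arrays.stream(headerLine.split(","))
                .map(String::trim)
                .map(header -> columnMappings.getOrDefault(header, "<unmapped>"))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        Map<String, String> columnMappings = Map.of(
                "uniq_id", "id",
                "product_name", "name",
                "retail_price", "retailPrice",
                "discounted_price", "discountedPrice",
                "brand", "brand");

        String headerLine = "uniq_id,product_name,retail_price,discounted_price,brand";
        System.out.println(translate(headerLine, columnMappings));
        // prints [id, name, retailPrice, discountedPrice, brand]
    }
}
```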

4. Parsing CSV to complex Java Bean Fields

So far the examples have been simple mappings: String and numeric columns in the CSV file to String and numeric fields in the Java Bean.

The mapping was also verbose, written in Java code by creating a mapping strategy object.

Let’s start by adding a few complex fields to the Product Java Bean:

@Data
public class ProductOne {
    private String id;
    private String name;
    private String brand;
    private BigDecimal retailPrice;
    private BigDecimal discountedPrice;
    private List<String> images;
    private ZonedDateTime addedOn;
}

Note that we have added two new fields, of types List<String> and ZonedDateTime.

We will also use a different version of sample csv file now.

uniq_id,crawl_timestamp,product_name,retail_price,discounted_price,images,brand
1000,2016-03-25 21:59:23 +0000,Double Sofa Bed,32157,22646,"[""http://image1.jpg"", ""http://image2.jpg""]",FabHomeDecor
1001,2016-03-26 22:59:23 +0000,AW Bellies,999,499,"[""http://image4.jpg""]",AW
1002,2016-03-27 23:59:23 +0000,Women's Cycling Shorts,699,267,[],Alisha

This adds a layer of complexity to the CSV parsing.

  • crawl_timestamp
    • String in the form "2016-03-25 21:59:23 +0000"
    • This string needs to be converted to the field addedOn of type ZonedDateTime
  • images
    • String containing a JSON array of image URLs, e.g. "[""http://image1.jpg"", ""http://image2.jpg""]"
    • Needs to be converted into the images field of type List<String>
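
The doubled quotes in the images column come from standard CSV quoting: a field that contains commas or quotes is wrapped in double quotes, and each embedded quote is doubled. The CSV parser undoes this before any converter sees the value, so the converter receives a plain JSON array string. The sketch below illustrates the rule for a single field in plain Java; CsvQuotingSketch and unquote are hypothetical names used for illustration, not OpenCSV methods.

```java
class CsvQuotingSketch {

    // Strips the surrounding quotes from one CSV field and collapses doubled quotes.
    static String unquote(String rawField) {
        if (rawField.length() >= 2 && rawField.startsWith("\"") && rawField.endsWith("\"")) {
            String inner = rawField.substring(1, rawField.length() - 1);
            return inner.replace("\"\"", "\"");
        }
        return rawField; // unquoted fields are taken as-is
    }

    public static void main(String[] args) {
        // The images value exactly as it appears in the CSV file
        String raw = "\"[\"\"http://image1.jpg\"\", \"\"http://image2.jpg\"\"]\"";
        System.out.println(unquote(raw));
        // prints ["http://image1.jpg", "http://image2.jpg"]
    }
}
```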

Both scenarios add complexity because the default converters of the CSV parser might not be able to do the date conversion or the JSON conversion for us.

For this we need to write our own converters, and provide these converters to CsvToBean while parsing CSV files.

This will require us to change how we write the CSV parsing code.

  • Currently we are writing verbose code: we create a map of CSV header names to Java field names, manually create a HeaderColumnNameTranslateMappingStrategy, and pass it to the CsvToBeanBuilder
  • Unfortunately this approach does not allow us to customize the field-level conversions.
  • To do so we need to use the annotation-based approach supported by OpenCSV. With this approach we add annotations to each field of the Java Bean and reference our custom converters inside the annotations.

Let’s look at the annotation based approach in more detail.

4.1 Creating Converters

First of all, we need to create the converters for the ZonedDateTime conversion and the JSON array String to List<String> conversion.

  • Each converter needs to extend the abstract class com.opencsv.bean.AbstractBeanField
  • Each converter needs to implement the method:
protected abstract Object convert(String value)
 throws CsvDataTypeMismatchException, CsvConstraintViolationException;

This is how the two converter classes look.

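import com.opencsv.bean.AbstractBeanField;
import com.opencsv.exceptions.CsvConstraintViolationException;
import com.opencsv.exceptions.CsvDataTypeMismatchException;
import lombok.NoArgsConstructor;

import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

@NoArgsConstructor
public class ZoneDateTimeConverter extends AbstractBeanField {

    // Matches values such as "2016-03-25 21:59:23 +0000"
    private static final DateTimeFormatter FORMATTER = DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss X");

    @Override
    protected Object convert(String value) throws CsvDataTypeMismatchException, CsvConstraintViolationException {
        return ZonedDateTime.parse(value, FORMATTER);
    }

}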

In the above converter we use the Java 8 DateTimeFormatter class to convert a String to a ZonedDateTime object.
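
The date pattern can be tried out on its own, without OpenCSV. The following stand-alone sketch parses one of the sample crawl_timestamp values using the same pattern; CrawlTimestampDemo and its parse helper are illustrative names that do not appear in the original code.

```java
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;

class CrawlTimestampDemo {

    // Same pattern as the converter: date, time, then a zone offset such as +0000
    private static final DateTimeFormatter FORMATTER =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss X");

    static ZonedDateTime parse(String value) {
        return ZonedDateTime.parse(value, FORMATTER);
    }

    public static void main(String[] args) {
        ZonedDateTime addedOn = parse("2016-03-25 21:59:23 +0000");
        System.out.println(addedOn.toLocalDateTime()); // the local date and time part
        System.out.println(addedOn.getOffset());       // the parsed zone offset
    }
}
```

A value that does not match the pattern makes ZonedDateTime.parse throw a DateTimeParseException, which is worth keeping in mind when the CSV file may contain malformed timestamps.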

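import com.fasterxml.jackson.databind.ObjectMapper;
import com.opencsv.bean.AbstractBeanField;
import com.opencsv.exceptions.CsvConstraintViolationException;
import com.opencsv.exceptions.CsvDataTypeMismatchException;
import lombok.NoArgsConstructor;

import java.io.IOException;
import java.util.List;

@NoArgsConstructor
public class ListConverter extends AbstractBeanField {

    // ObjectMapper is thread-safe once configured, so one shared instance is enough
    private static final ObjectMapper OBJECT_MAPPER = new ObjectMapper();

    @Override
    protected Object convert(String value) throws CsvDataTypeMismatchException, CsvConstraintViolationException {
        try {
            return OBJECT_MAPPER.readValue(value, List.class);
        } catch (IOException e) {
            // Report the bad value to OpenCSV instead of silently returning null
            throw new CsvDataTypeMismatchException(value, List.class, e.getMessage());
        }
    }
}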

In the above converter we use the Jackson library’s ObjectMapper class to convert a JSON array String to a List<String> object.

4.2 Adding the Annotations to the Bean

Now that we have created the converters, we will add annotations to the Product class for field mapping and field conversion.
There are two annotations that we will use in this example:

  • @CsvBindByName
    • This is used for field name mapping, where the name differs between the CSV header and the Java Bean.
    • E.g. the following annotation maps the “uniq_id” column of the CSV file to the “id” field of the Java Bean.
@CsvBindByName(column = "uniq_id")
private String id;
  • @CsvCustomBindByName
    • This is used for field name mapping, as well as for providing the converter for the field.
    • E.g. the following annotation maps the “crawl_timestamp” column of the CSV file to the “addedOn” field of the Java Bean, and uses the ZoneDateTimeConverter class to do the conversion during parsing.
@CsvCustomBindByName(
  column = "crawl_timestamp", converter = ZoneDateTimeConverter.class)
private ZonedDateTime addedOn;

After we have added all the annotations, this is how our annotated bean class, ProductTwo, looks:

@Data
public class ProductTwo {

    @CsvBindByName(column = "uniq_id")
    private String id;

    @CsvBindByName(column = "product_name")
    private String name;

    @CsvBindByName(column = "brand")
    private String brand;

    @CsvBindByName(column = "retail_price")
    private BigDecimal retailPrice;

    @CsvBindByName(column = "discounted_price")
    private BigDecimal discountedPrice;

    @CsvCustomBindByName(column = "images", converter = ListConverter.class)
    private List<String> images;

    @CsvCustomBindByName(column = "crawl_timestamp", converter = ZoneDateTimeConverter.class)
    private ZonedDateTime addedOn;
}

Now that we have provided all the mappings and converters as annotations in the bean class itself, we no longer need to pass mappings or a mapping strategy to CsvToBean.

We can do the parsing the simple way, as below:

Reader reader = new BufferedReader(new FileReader("file.csv"));

CsvToBean<ProductTwo> csvReader = new CsvToBeanBuilder<ProductTwo>(reader)
                .withType(ProductTwo.class)
                .withSeparator(',')
                .withIgnoreLeadingWhiteSpace(true)
                .withIgnoreEmptyLine(true)
                .build();

List<ProductTwo> results = csvReader.parse();

To summarize, we looked at different ways of using OpenCSV to read and parse a CSV file into Java Bean objects.
We also looked at examples of how to handle different real-world scenarios while parsing CSV.

There can be more complex scenarios, such as nested objects in the Java Bean or combining multiple CSV fields into a single Java Bean field, but most of them can be handled by OpenCSV.
You can find more details in the OpenCSV documentation: http://opencsv.sourceforge.net/#reading_into_beans

You can find the source code and JUnit 5 test cases for the above code in CSVToJavaBeanUsingOpenCSV.java and CsvToBeanWithCsvMapperTest.java.

The unit tests can be found in this folder.