Parse CSV Comma or Pipe Delimited File with Linq


Many times I have been tasked with importing, reading and parsing text files in different formats. Most of the time I have had to read in a comma or pipe delimited file. A library like CsvHelper is worth having if you build solutions like this all the time. If you need something quick, or don’t want the added overhead of testing and deploying a third-party assembly, then Linq comes to the rescue.

Here is a quick code snippet to parse a comma or pipe delimited file with Linq. You can do this very easily without using a special library or framework.

Parse CSV Comma or Pipe Delimited File with Linq

As long as you are targeting .NET Framework 3.5 or higher, you can use this parsing method. It has the added advantage of mapping each row to a concrete class.

Example Delimited Data

employeeId,firstName,lastName,hireDate,salary
123,Joe,Doe,8/5/2015,50000
123,Joe1,Doe1,8/5/2015,40000
123,Joe2,Doe2,8/5/2015,60000
123,Joe3,Doe3,8/5/2015,70000

Notice that we are not using double-quote text qualifiers in our data file.

Concrete Schema Class

public class Schema
{
    public string employeeId { get; set; }
    public string firstName { get; set; }
    public string lastName { get; set; }
    public string hireDate { get; set; }
    public string salary { get; set; }
}

Parsing with Linq

// Requires: using System; using System.Collections.Generic; using System.Linq;
// _data is assumed to be a static string field holding the raw contents of the file.
public static List<Schema> ParseDelimitedData()
{
    var items = from line in _data
                    .TrimEnd('\r', '\n') //Remove the trailing line break
                    .Split(new string[] { "\r\n", "\n" }, StringSplitOptions.None) //Split each row (CRLF or LF)
                    .Skip(1) //Skip column names
                let col = line.Split(',') //Split each column; use '|' for a pipe delimited file
                select new Schema()
                {
                    employeeId = col[0],
                    firstName = col[1],
                    lastName = col[2],
                    hireDate = col[3],
                    salary = col[4]
                };

    return items.ToList();
}
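
The method above hard-codes the comma, so a pipe delimited file needs a small change. Here is a minimal sketch of one way to do it, passing the text and the delimiter in as parameters. The DelimitedParser class and its Parse method are names I made up for illustration, and the Schema class is repeated only so the sketch compiles on its own. Rows with fewer than five columns are skipped, which is an assumption about how you want bad rows handled.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Repeated from above so the sketch compiles on its own.
public class Schema
{
    public string employeeId { get; set; }
    public string firstName { get; set; }
    public string lastName { get; set; }
    public string hireDate { get; set; }
    public string salary { get; set; }
}

public static class DelimitedParser
{
    // Hypothetical helper: parses text whose first row holds the column names,
    // splitting each row on the given delimiter (',' or '|').
    public static List<Schema> Parse(string data, char delimiter)
    {
        var items = from line in data
                        .Split(new[] { "\r\n", "\n" }, StringSplitOptions.RemoveEmptyEntries) // CRLF or LF, blank lines dropped
                        .Skip(1) // Skip column names
                    let col = line.Split(delimiter) // Split each column
                    where col.Length >= 5 // Assumption: skip rows that are too short to map
                    select new Schema
                    {
                        employeeId = col[0],
                        firstName = col[1],
                        lastName = col[2],
                        hireDate = col[3],
                        salary = col[4]
                    };

        return items.ToList();
    }
}
```

Calling DelimitedParser.Parse(text, '|') handles the pipe delimited case, and DelimitedParser.Parse(text, ',') behaves like the original method for well-formed rows.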

The implementation

var importData = ParseDelimitedData();
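
Once the rows are mapped to the concrete class, you can query them with Linq like any other collection. The sketch below is only an illustration: ImportQueries and HighEarners are hypothetical names, and it assumes the salary column always holds a plain number, so decimal.Parse will throw on anything else.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Repeated from above so the sketch compiles on its own.
public class Schema
{
    public string employeeId { get; set; }
    public string firstName { get; set; }
    public string lastName { get; set; }
    public string hireDate { get; set; }
    public string salary { get; set; }
}

public static class ImportQueries
{
    // Hypothetical helper: first names of employees earning at least minSalary,
    // highest salary first. Assumes salary is always a plain number.
    public static List<string> HighEarners(IEnumerable<Schema> rows, decimal minSalary)
    {
        return rows
            .Select(r => new { r.firstName, Salary = decimal.Parse(r.salary, CultureInfo.InvariantCulture) })
            .Where(r => r.Salary >= minSalary)
            .OrderByDescending(r => r.Salary)
            .Select(r => r.firstName)
            .ToList();
    }
}
```

For example, ImportQueries.HighEarners(importData, 50000m) would pick out the rows from the sample data with a salary of 50000 or more.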

Conclusion

This is not without its downsides. If you have text qualifiers in your data file you will need to split and handle them accordingly. You can implement a splitting solution found here. This will handle splitting text qualified columns where the delimiter might be used within the text. Also, there has been some question about the performance of Linq when used in this manner.
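
To give an idea of what handling text qualifiers involves, here is a rough sketch of a splitter that ignores delimiters inside double quotes. It is not the linked solution, just one simple way to do it. QualifiedSplitter and SplitLine are hypothetical names, and I am assuming a doubled quote ("") inside a qualified field stands for a literal quote. It works on one line at a time, so line breaks inside a qualified field are not handled.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

public static class QualifiedSplitter
{
    // Hypothetical helper: splits one line on the delimiter, ignoring
    // delimiters that appear inside double-quoted fields.
    public static string[] SplitLine(string line, char delimiter)
    {
        var fields = new List<string>();
        var current = new StringBuilder();
        bool inQuotes = false;

        for (int i = 0; i < line.Length; i++)
        {
            char c = line[i];
            if (c == '"')
            {
                if (inQuotes && i + 1 < line.Length && line[i + 1] == '"')
                {
                    current.Append('"'); // Doubled quote inside a field is a literal quote
                    i++;
                }
                else
                {
                    inQuotes = !inQuotes; // Entering or leaving a qualified field
                }
            }
            else if (c == delimiter && !inQuotes)
            {
                fields.Add(current.ToString()); // End of column
                current.Length = 0; // Reset the buffer (works on .NET 3.5 too)
            }
            else
            {
                current.Append(c);
            }
        }

        fields.Add(current.ToString()); // Last column
        return fields.ToArray();
    }
}
```

In the parsing method you would then swap line.Split(',') for QualifiedSplitter.SplitLine(line, ',').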

Thanks for reading.

Let me know if you have any questions or comments below.
