Solution 1 - ConvertFrom-CSV:
Save the header and use ConvertFrom-Csv to parse it for you. Haven't tested it on large files, but it won't have to load the entire file in memory, so it should work at least. Ex:
#Create samplefile
@"
ID,Desc,Obs
1234,"Some text, More Text, Text again","Text"
5678,"Some text, More Text, Text again and again",Text2
$(1..100000 | % { "$_,`"Some text$_, More Text$_, Text again and again$_`",Text$_`n" })
"@ -split "`n" | % { $_.trim() } | Set-Content D:\Downloads\test.txt
$datatable = New-Object System.Data.DataTable
$src = "D:\Downloads\test.txt"
$reader = New-Object IO.StreamReader($src)
#Get header and split to columns
$columns = $reader.ReadLine() -split ','
foreach ($column in $columns) {
$datatable.columns.add($column)
}
while(($line = $reader.ReadLine()) -ne $null){
#Let ConvertFrom-CSV do the heavy-lifting by making it convert one "csv-file" per line using a known header
$obj = $line | ConvertFrom-Csv -Header $columns
$row = $datatable.NewRow()
$row.ID = $obj.ID
$row.Desc = $obj.Desc
$row.Obs = $obj.Obs
$datatable.Rows.Add($row)
}
Test:
#Show available columns
$datatable.Columns.Caption
ID
Desc
Obs
#Show datatable
$datatable
ID Desc Obs
-- ---- ---
1234 Some text, More Text, Text again Text
5678 Some text, More Text, Text again and again Text2
Solution 2 - TextFieldParser:
The VisualBasic-assembly has a TextFieldParser-class that understands quoted fields. This will perform faster (50% faster in my 100k csv-test) due to less overhead when working directly with .NET. Try:
Add-Type -AssemblyName Microsoft.VisualBasic
$datatable = New-Object System.Data.DataTable
$src = "D:\Downloads\test.txt"
$reader = New-Object -TypeName Microsoft.VisualBasic.FileIO.TextFieldParser -ArgumentList $src
$reader.Delimiters = @(",")
#Default values, but wanted to show the options
$reader.HasFieldsEnclosedInQuotes = $true
$reader.TrimWhiteSpace = $true
#Get header as array
$columns = $reader.ReadFields()
foreach ($column in $columns) {
$datatable.columns.add($column)
}
while($fields = $reader.ReadFields()) {
$row = $datatable.NewRow()
#Insert value in property using field(column) index
for ($i = 0; $i -lt $columns.Count; $i++) {
$row.($columns[$i]) = $fields[$i]
}
$datatable.Rows.Add($row)
}
$reader.Close()