I have a function that takes a stream from an Excel file (.xls) and parses each row into fields, one per column:
public static List<DancerTest> ParseTestsFromString(Stream stream, string filename, char delimiter = '\t')
{
    List<DancerTest> results = new List<DancerTest>();
    using (TextFieldParser parser = new TextFieldParser(stream))
    {
        parser.TextFieldType = FieldType.Delimited;
        parser.SetDelimiters(delimiter.ToString());
        parser.HasFieldsEnclosedInQuotes = true;
        if (parser.EndOfData)
        {
            throw new Exception("No data found in file.");
        }
        int i = 0;
        while (!parser.EndOfData)
        {
            i++;
            // Process row
            string[] fields = parser.ReadFields(); // Getting unicode output somehow, which is not expected
        }
    }
    return results;
}
In my unit test, I have the following:
// Arrange
var fullPathToFile = Path.GetFullPath(@"Utilities\file.xls");
var bytes = File.ReadAllBytes(fullPathToFile);
var memoryStream = new MemoryStream(bytes);
// Act
var parseOutput = DancerTestParser.ParseTestsFromString(memoryStream, "Testing");
// Assert
// To continue
At the ReadFields() call, instead of the expected 15 fields, I get only three, and they look like garbled Unicode characters.
The encoding of the Excel file is Western European, so I tried passing that encoding explicitly, but to no avail:
using (TextFieldParser parser = new TextFieldParser(stream, Encoding.GetEncoding(1252)))
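One check that might narrow this down: a true binary .xls workbook is stored as an OLE2 compound file, which starts with the signature bytes D0 CF 11 E0 A1 B1 1A E1, whereas a tab-delimited text file that merely carries an .xls extension does not. The sketch below inspects the first eight bytes of the stream. The class name XlsSniffer and method name LooksLikeBinaryXls are hypothetical, not part of my project or of any library, and the method assumes a seekable stream such as the MemoryStream in the test.

```csharp
using System;
using System.IO;

// Hypothetical helper for illustration only; not part of TextFieldParser or any library.
public static class XlsSniffer
{
    // Signature at the start of every OLE2 compound file (the container used by binary .xls).
    private static readonly byte[] Ole2Signature =
        { 0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1 };

    // Returns true if the stream starts with the OLE2 signature.
    // Assumes a seekable stream; the original position is restored before returning.
    public static bool LooksLikeBinaryXls(Stream stream)
    {
        if (!stream.CanSeek)
        {
            throw new ArgumentException("Stream must be seekable.", nameof(stream));
        }

        long start = stream.Position;
        byte[] header = new byte[Ole2Signature.Length];
        int read = 0;

        // Read may return fewer bytes than requested, so loop until the header is full or the stream ends.
        while (read < header.Length)
        {
            int n = stream.Read(header, read, header.Length - read);
            if (n == 0)
            {
                break;
            }
            read += n;
        }

        stream.Position = start;

        if (read < header.Length)
        {
            return false; // Too short to contain the signature.
        }

        for (int i = 0; i < header.Length; i++)
        {
            if (header[i] != Ole2Signature[i])
            {
                return false;
            }
        }
        return true;
    }
}
```

If XlsSniffer.LooksLikeBinaryXls(memoryStream) returns true for the test file, the content is binary rather than delimited text, which would be consistent with ReadFields() producing a few garbled fields no matter which text encoding is passed.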