I have the following problem. I have a list of strings that are URLs to .xls Excel files, and I am trying to download them all and convert them to .xlsx. I am using the Microsoft Compatibility Pack for the conversion, so I can't simply start the converter as soon as each file has been downloaded, because I don't want that many processes running at the same time. There are about 1600 files, so launching one converter process per file all at once is out of the question, and doing everything sequentially would take far too long.
I tried to improve my code with TPL Dataflow, because this situation looked ideal for a producer-consumer pattern and what I found online suggested that TPL Dataflow is what I need. I have probably misunderstood something in the tutorials I was reading, though, because the following code is not working. What am I doing wrong?
var pathsBuffer = new BufferBlock<string>(new DataflowBlockOptions
{
    BoundedCapacity = 12
});
var converterOptions = new ExecutionDataflowBlockOptions
{
    MaxDegreeOfParallelism = 4
};
var converter = new ActionBlock<string>((filePath) =>
{
    Process.Start(@"c:\Program Files (x86)\Microsoft Office\Office12\excelcnv.exe",
        string.Format(@" -nme -oice {0} {1}", filePath, filePath + "x")).WaitForExit();
}, converterOptions);
pathsBuffer.LinkTo(converter);
pathsBuffer.Completion.ContinueWith(task => converter.Complete());
Parallel.ForEach(FileAdress, async (file) =>
{
    using (var webClient = new WebClient())
    {
        string OutputDirectory = ConfigurationManager.AppSettings["RootDirectory"] +
            FolderIndex;
        if (!Directory.Exists(OutputDirectory))
        {
            Directory.CreateDirectory(OutputDirectory);
        }
        string filePath = Path.Combine(OutputDirectory, AdressIndex[file]);
        await webClient.DownloadFileTaskAsync(new Uri(file), filePath);
        while (!pathsBuffer.Post(filePath)) {}
    }
});
pathsBuffer.Complete();
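
For comparison, here is a minimal sketch of how the same pipeline might be restructured. It assumes the failure comes from `Parallel.ForEach` not waiting for the `async` lambda: the lambda is treated as `async void`, so the loop returns at the first `await`, `pathsBuffer.Complete()` runs before the downloads finish, and every later `Post` returns `false`. The class name `XlsPipeline`, the method `RunAsync`, its parameters, and the parallelism and capacity values on the download block are hypothetical. Quoting the paths in the converter arguments is also an assumption, added so that paths containing spaces survive.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;
using System.Net;
using System.Threading.Tasks;
using System.Threading.Tasks.Dataflow;

static class XlsPipeline
{
    // Hypothetical helper: urlToFileName plays the role of AdressIndex,
    // outputDirectory the role of RootDirectory + FolderIndex.
    public static async Task RunAsync(
        IEnumerable<string> urls,
        IDictionary<string, string> urlToFileName,
        string outputDirectory,
        string converterExe)
    {
        // CreateDirectory does nothing if the directory already exists.
        Directory.CreateDirectory(outputDirectory);

        // Download stage: the block itself awaits the async delegate,
        // so it is not complete until every download has finished.
        var downloader = new TransformBlock<string, string>(async url =>
        {
            string filePath = Path.Combine(outputDirectory, urlToFileName[url]);
            using (var webClient = new WebClient())
            {
                await webClient.DownloadFileTaskAsync(new Uri(url), filePath);
            }
            return filePath;
        }, new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = 8, // assumed value
            BoundedCapacity = 12
        });

        // Conversion stage: at most 4 converter processes at a time.
        var converter = new ActionBlock<string>(filePath =>
        {
            // Paths are quoted here (an assumption, see above).
            string args = string.Format("-nme -oice \"{0}\" \"{1}\"",
                filePath, filePath + "x");
            using (var process = Process.Start(converterExe, args))
            {
                process.WaitForExit();
            }
        }, new ExecutionDataflowBlockOptions
        {
            MaxDegreeOfParallelism = 4,
            BoundedCapacity = 12
        });

        // PropagateCompletion replaces the manual ContinueWith call.
        downloader.LinkTo(converter,
            new DataflowLinkOptions { PropagateCompletion = true });

        // SendAsync waits asynchronously while the bounded block is full,
        // instead of spinning on Post.
        foreach (string url in urls)
        {
            await downloader.SendAsync(url);
        }

        downloader.Complete();
        await converter.Completion;
    }
}
```

The idea is that completion flows from the download block to the converter only after all downloads are done, and the caller awaits `converter.Completion` instead of returning early.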