
I would like to parse a file line by line and replace one fixed punctuation character with another (e.g. slashes "/" with periods "."), but only when the string in the line contains a certain variable.

Example: Replace only if the string contains Fx

line1: test1/test2
line2: .test + e/y + Fx/var1/var2

Output:

line1: test1/test2
line2: .test + e/y + Fx.var1.var2
 

How can I go about doing this? Here is my code so far, but I know it doesn't work:

import os

textToFind = '/'
textToReplace = '.'
sourcepath = os.listdir('InputFiles/')

def lines_that_contain(string, fp):
    return [line for line in fp if string in line]

for file in sourcepath:
    inputFile = 'InputFiles/'+ file
    print('Conversion is ongoing for:' +inputFile)
    with open(inputFile, 'r') as inputFile:
        for line in lines_that_contain("Fx.", inputFile):
            print('found Fx.')
            fileData = fileData.replace(textToFind, textToReplace)
            freq2 = 0
            freq2 = fileData.count(textToFind)
            
            destinationPath = 'OutputFile/' + file
            with open(destinationPath, 'w') as file:
                file.write(fileData)
                print ('Total %d Record Replaced' %freq2)
        else:
            print('Did not find selected strings')
eyllanesc
ruawzrd
  • There are various examples on Stack Overflow that explain how to accomplish this, e.g. https://stackoverflow.com/questions/17140886/how-to-search-and-replace-text-in-a-file – akaur Apr 07 '20 at 20:59
  • _but I know it doesn't work_ Can you be more specific? As an aside, variable and function names should follow the `lower_case_with_underscores` style. – AMC Apr 07 '20 at 21:25

1 Answer


There are multiple problems with your code:

  • fileData is used before it is set.
  • Your loop only runs over the lines that contain the "trigger" string, so you will not be able to output the other lines unmodified.
  • If fileData is supposed to contain all the data read up to this point, the replacement will affect every line, regardless of whether it contains the trigger or not.
  • The output will probably be "Total 0 Record Replaced" followed by "Did not find selected strings": you count the occurrences of the text to be replaced right after replacing it, so the count is always zero. And since your loop does not contain a break statement, the else clause of the for loop always runs.
  • You are recreating and rewriting the output file for every matching line, overwriting the previous contents each time.
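
The for/else point is easy to miss, so here is a minimal sketch of that behaviour in isolation. The function name `report` and the sample lines are made up for illustration; only the loop shape mirrors the code in the question.

```python
def report(lines, trigger):
    """Mimic the question's loop shape: a for/else with no break in the body."""
    messages = []
    for line in lines:
        if trigger in line:
            messages.append('found ' + trigger)
    else:
        # The else clause runs whenever the loop finishes without `break`,
        # even if matches were found above.
        messages.append('Did not find selected strings')
    return messages


print(report(['a/b', 'Fx.c/d'], 'Fx.'))
# → ['found Fx.', 'Did not find selected strings']
```

To report "not found" only when nothing matched, track a counter or flag inside the loop and test it afterwards instead of relying on else.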

To fix these issues, collect all the lines in a list, modifying them if they contain the trigger. After reading the whole file, open the output file and dump the lines you have collected.

import os

textToFind = '/'
textToReplace = '.'
# Taken from the question's code. Note that the sample input shows "Fx/",
# which this trigger would not match; adjust it to fit the real data.
trigger = "Fx."

sourcepath = os.listdir('InputFiles/')

for file in sourcepath:
    inputFile = 'InputFiles/' + file
    print('Conversion is ongoing for: ' + inputFile)
    # Read the whole file, modifying only the lines that contain the trigger.
    with open(inputFile, 'r') as infile:
        fileData = []
        replacements = 0  # counts lines that contain the trigger
        for line in infile:
            if trigger in line:
                fileData.append(line.replace(textToFind, textToReplace))
                replacements += 1
            else:
                fileData.append(line)

    # Write the output once, after the input file has been fully read.
    destinationPath = 'OutputFile/' + file
    with open(destinationPath, 'w') as outfile:
        # The lines already contain terminating \n characters.
        outfile.write(''.join(fileData))

    if replacements > 0:
        print('Total %d Record Replaced' % replacements)
    else:
        print('Did not find selected strings')
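
Note that the loop above replaces every "/" on a matching line, so the sample line would come out as ".test + e.y + Fx.var1.var2", whereas the expected output in the question keeps "e/y" intact. If only the part of the line that contains the variable should change, something like the following sketch may be closer. It is an assumption on my part that the parts are separated by single spaces and that the trigger is plain "Fx"; the function name is hypothetical.

```python
def replace_in_matching_tokens(line, trigger="Fx", old="/", new="."):
    """Replace `old` with `new` only inside space-separated tokens containing `trigger`."""
    # Splitting on a single space (rather than split()) keeps the original
    # spacing and the trailing newline intact when the tokens are re-joined.
    tokens = line.split(" ")
    return " ".join(
        tok.replace(old, new) if trigger in tok else tok
        for tok in tokens
    )


print(replace_in_matching_tokens("line2: .test + e/y + Fx/var1/var2\n"), end="")
# → line2: .test + e/y + Fx.var1.var2
```

In the loop above, this function could be applied to every line in place of the `if trigger in line` branch, since lines without the trigger pass through unchanged.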

Since each line is independently processed, you can also implement a streaming version where you open the input and output files first, and then read, process, and write one line at a time. This is what the sed program does -- invoking sed '/Fx\./ s#/#.#g' inputFile > outputFile in a shell performs the same task on a single file.
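
A rough sketch of that streaming variant in Python could look like this. The function names `convert_stream` and `convert_file` are made up for illustration, and the defaults simply reuse the values from the code above; the demo at the end uses in-memory files so it runs without touching the disk.

```python
import io


def convert_stream(infile, outfile, trigger="Fx.", old="/", new="."):
    """Copy infile to outfile line by line, replacing `old` with `new`
    on lines that contain `trigger`. Returns the number of such lines."""
    matched = 0
    for line in infile:
        if trigger in line:
            line = line.replace(old, new)
            matched += 1
        # Each line is written immediately, so only one line is held in memory.
        outfile.write(line)
    return matched


def convert_file(input_path, output_path, **kwargs):
    """Open both files up front, then stream from one to the other."""
    with open(input_path, 'r') as infile, open(output_path, 'w') as outfile:
        return convert_stream(infile, outfile, **kwargs)


# Small in-memory demonstration.
src = io.StringIO("test1/test2\nFx.a/b\n")
dst = io.StringIO()
print(convert_stream(src, dst))  # → 1
print(dst.getvalue(), end="")
```

Compared with collecting the lines in a list, this keeps memory use constant regardless of file size, at the cost of creating the output file even when nothing matches.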

Roland W