search & replace within

Question

I would like to parse a file line by line and replace one fixed punctuation character with another (e.g. slashes "/" with periods "."), but only if the line contains a certain string.

Example: Replace only if the string contains Fx

line1: test1/test2
line2: .test + e/y + Fx/var1/var2

Output:

line1: test1/test2
line2: .test + e/y + Fx.var1.var2

How can I go about doing this? Here is my code so far, but I know it doesn't work:

import os

textToFind = '/'
textToReplace = '.'
sourcepath = os.listdir('InputFiles/')

def lines_that_contain(string, fp):
    return [line for line in fp if string in line]

for file in sourcepath:
    inputFile = 'InputFiles/'+ file
    print('Conversion is ongoing for:' +inputFile)
    with open(inputFile, 'r') as inputFile:
        for line in lines_that_contain("Fx.", inputFile):
            print('found Fx.')
            fileData = fileData.replace(textToFind, textToReplace)
            freq2 = 0
            freq2 = fileData.count(textToFind)
            
            destinationPath = 'OutputFile/' + file
            with open(destinationPath, 'w') as file:
                file.write(fileData)
                print ('Total %d Record Replaced' %freq2)
        else:
            print('Did not find selected strings')

There are various examples on Stack Overflow that explain a way to accomplish this, e.g. https://stackoverflow.com/questions/17140886/how-to-search-and-replace-text-in-a-file — akaur, Apr 07 '20 at 20:59
_but I know it doesn't work_ Can you be more specific? As an aside, variable and function names should follow the `lower_case_with_underscores` style. — AMC, Apr 07 '20 at 21:25

score 0 · Accepted Answer · answered Apr 07 '20 at 21:39

There are multiple problems with your code:

fileData is used before it is set.
Your loop only runs over the lines that contain the "trigger" string, so you will not be able to output the other lines unmodified.
If fileData is supposed to contain all the data read up to this point, the replacement will affect every line, regardless of whether it contains the trigger or not.
The output will probably be "Total 0 Record Replaced" followed by "Did not find selected strings": you are counting the occurrences of the text to be replaced right after replacing it, so none are left to count. And since your loop does not contain a break statement, the else clause always runs once the loop finishes.
You are recreating and writing the output file for every line read.
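
The counting and for/else points can be checked in isolation. This is only a minimal sketch: the sample string is taken from the question's example line, and the variable names (sample, count_after, count_before, else_ran) are made up for illustration.

```python
# Illustrative sample, based on the question's example line.
sample = "Fx/var1/var2"

# Counting after replacing always yields 0: no '/' is left to count.
replaced = sample.replace('/', '.')
count_after = replaced.count('/')

# Counting before replacing gives the number of substitutions.
count_before = sample.count('/')

# The else clause of a for loop runs whenever the loop finishes
# without hitting a break, even if the body ran for every item.
else_ran = False
for line in [sample]:
    pass
else:
    else_ran = True

print(count_after, count_before, else_ran)  # prints: 0 2 True
```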

To fix these issues, collect all the lines in a list, modifying them if they contain the trigger. After reading the whole file, open the output file and dump the lines you have collected.

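import os

textToFind = '/'
textToReplace = '.'
# Trigger taken from the question's code. Note that the question's sample
# line contains "Fx/", not "Fx.", so use trigger = "Fx" to match that sample.
trigger = "Fx."

sourcepath = os.listdir('InputFiles/')
# Make sure the output directory exists before writing to it.
os.makedirs('OutputFile', exist_ok=True)

for file in sourcepath:
    inputFile = os.path.join('InputFiles', file)
    print('Conversion is ongoing for: ' + inputFile)
    with open(inputFile, 'r') as infile:
        fileData = []
        replacements = 0  # number of lines that were modified
        for line in infile:
            if trigger in line:
                # Replace every occurrence of textToFind on this line.
                fileData.append(line.replace(textToFind, textToReplace))
                replacements += 1
            else:
                # Keep lines without the trigger unchanged.
                fileData.append(line)

    # Write the output once, after the whole input file has been read.
    destinationPath = os.path.join('OutputFile', file)
    with open(destinationPath, 'w') as outfile:
        # The lines already contain terminating \n characters.
        outfile.write(''.join(fileData))

    if replacements > 0:
        print('Total %d Record Replaced' % replacements)
    else:
        print('Did not find selected strings')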

Since each line is processed independently, you can also implement a streaming version where you open the input and output files first, and then read, process, and write one line at a time. This is what the sed program does: invoking sed '/Fx\./ s#/#.#g' inputFile > outputFile in a shell performs the same task on a single file.
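
A streaming version in Python could look roughly like the sketch below. The function name convert_stream and its parameters are hypothetical, not part of the question's code, and the demo uses in-memory streams in place of real files. The trigger is set to "Fx" so that it matches the question's sample line. Like the code above, it replaces every '/' on a matching line, so "e/y" also becomes "e.y".

```python
import io


def convert_stream(infile, outfile, trigger, text_to_find='/', text_to_replace='.'):
    """Copy infile to outfile one line at a time.

    Lines containing trigger get every text_to_find replaced by
    text_to_replace; all other lines are written unchanged.
    Returns the number of lines that contained the trigger.
    """
    matched = 0
    for line in infile:
        if trigger in line:
            line = line.replace(text_to_find, text_to_replace)
            matched += 1
        # Each line is written immediately, so nothing accumulates in memory.
        outfile.write(line)
    return matched


# Demo on in-memory streams holding the question's sample lines.
source = io.StringIO("test1/test2\n.test + e/y + Fx/var1/var2\n")
target = io.StringIO()
matched = convert_stream(source, target, trigger="Fx")
print(target.getvalue(), end='')
print('Lines modified:', matched)
```

With real files, both can be opened in a single with statement, for example with open(inputFile) as infile, open(destinationPath, 'w') as outfile:, and passed to the function in the same way.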
