2

I am writing some text (which includes \n and \t characters) taken from a source file to a text file. For example:

source file (test.cpp):

/*
 * test.cpp
 *
 *    2013.02.30
 *
 */

is taken from the source file and stored in a string variable like so

test_str = "/*\n test.cpp\n *\n *\n *\n\t2013.02.30\n *\n */\n"

which when I write onto a file using

    with open('test.cpp', 'a') as out:
        print(test_str, file=out)

is written with the newline and tab characters rendered as actual line breaks and tabs (exactly as test.cpp had them), whereas I want them to remain \n and \t exactly like the test_str variable holds them in the first place.

Is there a way in Python to write these 'special characters' to a file without them being translated?

Yannis
  • Did you try adding a backslash to the special character, e.g. `"\n"` --> `"\\n"`? See this [post](http://stackoverflow.com/questions/4245709/how-do-you-write-special-characters-n-b-to-a-file-in-python) – terence hill May 01 '16 at 20:58
  • @terencehill I was aware that such a string manipulation could meet my needs, but I was hoping for something more subtle and/or built-in; the `encode` method suggested by Jon [below](http://stackoverflow.com/a/36971942/3286832) seems perfect for this. – Yannis May 02 '16 at 08:58

3 Answers

2

Use replace(). Since you need to call it several times, you might want to look at this.

test_str = "/*\n test.cpp\n *\n *\n *\n\t2013.02.30\n *\n */\n"
with open("somefile", "w") as f:
    # Replace each control character with its two-character escape sequence.
    test_str = test_str.replace('\n', '\\n')
    test_str = test_str.replace('\t', '\\t')
    f.write(test_str)
quapka
  • Very useful. I was hoping for something more subtle and/or built-in. The regex approach in particular might be overkill for my case, but it is still useful. – Yannis May 02 '16 at 08:47
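As a side note on the chained replace() calls above, here is a minimal sketch of doing the same substitutions in a single pass with a translation table. It is not from the answer itself; the names `ESCAPES` and `escape_whitespace` are made up for illustration, and it assumes only newlines and tabs need escaping.

```python
# Hypothetical helper (not from the answer): escape newline and tab
# characters in one pass using a translation table.
ESCAPES = str.maketrans({"\n": "\\n", "\t": "\\t"})


def escape_whitespace(text):
    """Return text with each newline/tab replaced by its two-character escape."""
    return text.translate(ESCAPES)


test_str = "/*\n test.cpp\n *\n *\n *\n\t2013.02.30\n *\n */\n"
escaped = escape_whitespace(test_str)
print(escaped)
```

Adding another character to escape then only means adding one entry to the dictionary, rather than another replace() call.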
2

You can use str.encode:

with open('test.cpp', 'a') as out:
    print(test_str.encode('unicode_escape').decode('utf-8'), file=out)

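This escapes every character that Python would write as an escape sequence in a string literal, such as newlines, tabs, and backslashes; non-ASCII characters are escaped as well.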

Given your example:

>>> test_str = "/*\n test.cpp\n *\n *\n *\n\t2013.02.30\n *\n */\n"
>>> test_str.encode('unicode_escape')
b'/*\\n test.cpp\\n *\\n *\\n *\\n\\t2013.02.30\\n *\\n */\\n'
Jon Clements
  • This seems to be exactly what I was hoping Python had built-in, and it fits my needs. Could you please explain the purpose of `decode('utf-8')`, especially since in your example `encode('unicode_escape')` alone gives the solution? – Yannis May 02 '16 at 08:49
  • And for clarity, if I wanted to reverse the effect and get the newline and tab characters back as they were originally (before the `str.encode`), how would I achieve that? – Yannis May 02 '16 at 09:07
  • @Yannis the encoding gives you a byte string (notice the `b` prefix and the output in the file when printed) - decoding it gives you back a unicode string. – Jon Clements May 02 '16 at 13:47
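On the question of reversing the effect, which the thread leaves open: a minimal sketch of the round trip, assuming the escaped text was produced by the `unicode_escape` step above (so it is pure ASCII). The variable names `escaped` and `restored` are only illustrative.

```python
test_str = "/*\n test.cpp\n *\n *\n *\n\t2013.02.30\n *\n */\n"

# Forward: str -> escaped bytes -> str. The escaped bytes are pure ASCII.
escaped = test_str.encode("unicode_escape").decode("ascii")

# Reverse: str -> bytes -> str, interpreting the escape sequences again.
restored = escaped.encode("ascii").decode("unicode_escape")

print(escaped)
print(restored == test_str)
```

If the escaped text comes from somewhere else and may contain non-ASCII characters, the `encode("ascii")` step will raise an error, so this sketch should not be treated as a general-purpose unescaper.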
1

I want them to remain \n and \t exactly like the test_str variable holds them in the first place.

test_str does NOT contain a backslash followed by t (two characters). It contains a single character, ord('\t') == 9 (the same character as in test.cpp). Backslash is special in Python string literals, e.g., u'\U0001f600' is NOT ten characters; it is a single character. Don't confuse a string object in memory at runtime with its text representation as a string literal in Python source code.
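A quick way to check the distinction in an interpreter (an illustrative sketch, not part of the original answer; the variable names are arbitrary):

```python
tab = "\t"          # escape sequence in the literal: one character
two_chars = "\\t"   # escaped backslash: a backslash followed by 't'
raw = r"\t"         # raw literal: the same two characters

print(len(tab), ord(tab))        # 1 9
print(len(two_chars), len(raw))  # 2 2
print(two_chars == raw)          # True
print(len("\U0001f600"))         # 1
print(repr(tab))                 # '\t'
```

The last line shows why the confusion arises: repr() displays the single tab character using the same backslash notation as the source literal.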

JSON could be a better (more portable) alternative to the unicode-escape encoding for storing text, i.e., use:

import json

with open('test.json', 'w') as file:
    json.dump({'test.cpp': test_str}, file)

instead of test_str.encode('unicode_escape').decode('ascii').

To read json back:

with open('test.json') as file:
    test_str = json.load(file)['test.cpp']
jfs
  • Explaining the difference between string object in memory during runtime and Python's string literal is greatly appreciated. JSON's portability is good, but for my case not an issue. – Yannis May 02 '16 at 11:09
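To see what the JSON approach above actually stores, here is a small in-memory sketch using `json.dumps` and `json.loads` instead of a file (the variable names `encoded` and `decoded` are just for illustration):

```python
import json

test_str = "/*\n test.cpp\n *\n *\n *\n\t2013.02.30\n *\n */\n"

# Serialise to JSON text: newlines and tabs become the two-character
# sequences \n and \t inside the JSON string.
encoded = json.dumps({"test.cpp": test_str})
print(encoded)

# Parsing the JSON text gives back the original string object.
decoded = json.loads(encoded)["test.cpp"]
print(decoded == test_str)
```

So the text on disk contains the visible \n and \t sequences the question asks for, and loading it restores the real newline and tab characters.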