Is there some string class in Python like StringBuilder in C#?
- This is a duplicate of [Python equivalent of Java StringBuffer](https://stackoverflow.com/questions/19926089/python-equivalent-of-java-stringbuffer). **CAUTION: The answers here are way out of date and have, in fact, become misleading.** See [that other question](https://stackoverflow.com/questions/19926089/python-equivalent-of-java-stringbuffer) for answers that are more relevant to modern Python versions (certainly 2.7 and above). – Jean-François Corbett Nov 20 '17 at 08:52
8 Answers
There is no one-to-one correlation. For a really good article, please see *Efficient String Concatenation in Python*:
Building long strings in the Python programming language can sometimes result in very slow running code. In this article I investigate the computational performance of various string concatenation methods.
TL;DR: the fastest method is below. It's extremely compact and also pretty understandable:
def method6():
    return ''.join([`num` for num in xrange(loop_count)])
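For context: that snippet is Python 2 (`xrange`, and the backticks are the old repr syntax). A rough Python 3 translation of the same idea, offered as a sketch rather than a quote from the article (`loop_count` here is just a stand-in for whatever iteration count you benchmark with):

def method6_py3(loop_count=100000):
    # join over a generator expression; str() replaces the old backtick repr
    return ''.join(str(num) for num in range(loop_count))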
- Note that this article was written based on Python 2.2. The tests would likely come out somewhat differently in a modern version of Python (CPython usually successfully optimizes concatenation, but you don't want to depend on this in important code), and a generator expression where he uses a list comprehension would be worthy of consideration. – Mike Graham Mar 10 '10 at 06:35
- It would be good to pull in some highlights from that article, at the least a couple of the implementations (to avoid link rot problems). – jpmc26 Jul 29 '14 at 22:22
- Method 1: `resultString += appendString` is the fastest according to tests by @Antoine-tran below – Justas Dec 31 '15 at 17:47
- Your quote doesn't at all answer the question. Please include the relevant parts in your answer itself, to comply with new guidelines. – Nic Oct 21 '16 at 16:48
Relying on compiler optimizations is fragile. The benchmarks linked in the accepted answer and numbers given by Antoine-tran are not to be trusted. Andrew Hare makes the mistake of including a call to repr in his methods. That slows all the methods equally but obscures the real penalty in constructing the string.
Use join. It's very fast and more robust.
$ ipython3
Python 3.5.1 (default, Mar 2 2016, 03:38:02)
IPython 4.1.2 -- An enhanced Interactive Python.
In [1]: values = [str(num) for num in range(int(1e3))]
In [2]: %%timeit
...: ''.join(values)
...:
100000 loops, best of 3: 7.37 µs per loop
In [3]: %%timeit
...: result = ''
...: for value in values:
...: result += value
...:
10000 loops, best of 3: 82.8 µs per loop
In [4]: import io
In [5]: %%timeit
...: writer = io.StringIO()
...: for value in values:
...: writer.write(value)
...: writer.getvalue()
...:
10000 loops, best of 3: 81.8 µs per loop
- Yes, the `repr` call dominates the runtime, but there's no need to make the mistake personal. – Alex Reinking Aug 17 '18 at 21:43
- @AlexReinking sorry, nothing personal meant. I'm not sure what made you think it was personal. But if it was the use of their names, I used those only to refer to the user's answers (matches usernames, not sure if there's a better way). – GrantJ Aug 18 '18 at 19:15
- Good timing example that separates data initialization and concatenation operations – aiodintsov Jun 29 '19 at 22:37
I used the code from Oliver Crow (the link given by Andrew Hare) and adapted it slightly for Python 2.7.3 (using the timeit package). I ran it on my personal computer, a Lenovo T61 with 6 GB RAM, running Debian GNU/Linux 6.0.6 (squeeze).
Here are the results for 10,000 iterations:
method1: 0.0538418292999 secs, process size 4800 kb
method2: 0.22602891922 secs, process size 4960 kb
method3: 0.0605459213257 secs, process size 4980 kb
method4: 0.0544030666351 secs, process size 5536 kb
method5: 0.0551080703735 secs, process size 5272 kb
method6: 0.0542731285095 secs, process size 5512 kb
and for 5,000,000 iterations (method 2 was ignored because it ran too slowly, like forever):
method1: 5.88603997231 secs, process size 37976 kb
method3: 8.40748500824 secs, process size 38024 kb
method4: 7.96380496025 secs, process size 321968 kb
method5: 8.03666186333 secs, process size 71720 kb
method6: 6.68192911148 secs, process size 38240 kb
It is quite obvious that the Python developers have done a pretty great job of optimizing string concatenation, and as Hoare said: "premature optimization is the root of all evil" :-)
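For reference, a minimal sketch of how such a comparison can be timed with `timeit` (this is not the author's exact harness, and the method bodies are only illustrative stand-ins for the article's numbered methods):

import timeit

def method1(loop_count=10000):
    # naive += concatenation in a loop
    out_str = ''
    for num in range(loop_count):
        out_str += str(num)
    return out_str

def method6(loop_count=10000):
    # single join over a list comprehension
    return ''.join([str(num) for num in range(loop_count)])

for fn in (method1, method6):
    # run each method a few times and report the total wall-clock time
    print(fn.__name__, timeit.timeit(fn, number=10), "secs")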
- Apparently Hoare does not accept that: http://hans.gerwitz.com/2004/08/12/premature-optimization-is-the-root-of-all-evil.html – Pimin Konstantin Kefaloukos Dec 11 '12 at 13:13
- It is not a premature optimization to avoid fragile, interpreter-dependent optimizations. If you ever want to port to PyPy or risk hitting [one of the many subtle failure cases](http://stackoverflow.com/questions/24040198/cpython-string-addition-optimisation-failure-case) for the optimization, do things the right way. – Veedrac Nov 03 '14 at 21:46
Python has several things that fulfill similar purposes:
- One common way to build large strings from pieces is to grow a list of strings and join it when you are done. This is a frequently-used Python idiom (see the sketch after this list).
- To build strings incorporating data with formatting, you would do the formatting separately.
- For insertion and deletion at a character level, you would keep a list of length-one strings. (To make this from a string, you'd call `list(your_string)`.) You could also use a `UserString.MutableString` for this.
- `(c)StringIO.StringIO` is useful for things that would otherwise take a file, but less so for general string building.
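A minimal sketch of the first and third ideas (the variable names here are purely illustrative):

# Grow a list of pieces, then join once at the end.
pieces = []
for i in range(5):
    pieces.append('line %d' % i)
text = '\n'.join(pieces)

# Character-level insertion/deletion via a list of length-one strings.
chars = list('helo world')
chars.insert(3, 'l')      # insert the missing 'l'
del chars[-1]             # drop the trailing 'd'
edited = ''.join(chars)   # 'hello worl'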
Using method 5 from above (the Pseudo File), we can get very good performance and flexibility:
from cStringIO import StringIO

class StringBuilder:
    _file_str = None

    def __init__(self):
        self._file_str = StringIO()

    def Append(self, str):
        self._file_str.write(str)

    def __str__(self):
        return self._file_str.getvalue()
Now using it:
sb = StringBuilder()
sb.Append("Hello\n")
sb.Append("World")
print sb
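Note that `cStringIO` only exists in Python 2. A rough Python 3 equivalent of the same class, offered as a sketch rather than part of the original answer, swaps in `io.StringIO` and the `print()` function:

import io

class StringBuilder:
    def __init__(self):
        # io.StringIO plays the role of the in-memory text buffer
        self._file_str = io.StringIO()

    def Append(self, text):
        self._file_str.write(text)

    def __str__(self):
        return self._file_str.getvalue()

sb = StringBuilder()
sb.Append("Hello\n")
sb.Append("World")
print(sb)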
There is no explicit analogue. I think you are expected to use plain string concatenation (likely optimized, as noted above) or a third-party class (and I doubt those are much more efficient; lists in Python are dynamically typed, so there is no fast char[] buffer underneath, as far as I can tell).
StringBuilder-like classes are not a premature optimization, because of an innate feature of strings in many languages: immutability. Immutability allows many optimizations (for example, referencing the same buffer for slices/substrings). StringBuilder/StringBuffer/stringstream-like classes work a lot faster than concatenating strings (which produces many small temporary objects that still need allocation and garbage collection), and even faster than printf-like formatting tools, since they avoid the overhead of interpreting the formatting pattern, which adds up over many format calls.
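As a tiny illustration of the immutability point (purely illustrative, not from the answer above): Python strings cannot be changed in place, so every "modification" produces a new object, which is exactly the cost that join()- or StringIO-based builders amortize.

s = "hello"
try:
    s[0] = "H"           # str supports no in-place mutation
except TypeError as err:
    print(err)           # 'str' object does not support item assignment

t = s.replace("h", "H")  # "modifying" a string really builds a new one
print(s, t)              # hello Hello -- the original is unchanged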
In case you are here looking for a fast string concatenation method in Python, then you do not need a special StringBuilder class. Simple concatenation works just as well without the performance penalty seen in C#.
resultString = ""
resultString += "Append 1"
resultString += "Append 2"
See Antoine-tran's answer for performance results.