
I've been trying to learn file I/O in Python, but I've run into what looks like a memory leak that I can't explain.

file = "D:\\babelStorage\\Testing"
x = 1000000
while (x > 0):
    with open("".join([file, "\\", "junk", str(x), ".txt"]), "wt") as trash:
        trash.write("garbage")
    x = x - 1

The same issue occurs even when I explicitly call trash.close(). What exactly am I doing wrong that's causing huge chunks of memory to accumulate?
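
For reference, the explicit-close variant I tried looks roughly like this (same path, just without the with block):

file = "D:\\babelStorage\\Testing"
x = 1000000
while (x > 0):
    trash = open("".join([file, "\\", "junk", str(x), ".txt"]), "wt")
    trash.write("garbage")
    trash.close()  # closed explicitly instead of via the with block
    x = x - 1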

None of the memory shows up as a process in Task Manager. If I run it long enough I can get 10GB which are... somewhere. Closing the Python shell doesn't recover the memory either; I have to reboot.

Rayalot72
  • If the memory doesn't show up in Task Manager, how are you detecting a memory leak? – afarley Jul 16 '20 at 18:54
  • Memory usage doesn't change at all until I run this sort of file I/O. When I do, it very steadily climbs and doesn't go back down. Started at under 3GB with this machine running overnight not doing anything. Just testing file I/O within the past hour now has it at a constant 20GB. Highest memory usage displayed is Google Chrome at ~1,200 MB. – Rayalot72 Jul 16 '20 at 18:58
  • Oh, I see what you're reading. When I said "none of the memory shows up on task manager" I meant it doesn't show up as a process; overall memory usage is still very high. Edited to be clearer. – Rayalot72 Jul 16 '20 at 19:02
  • Probably just the operating system's block cache storing the data you wrote to disk so it can be retrieved from memory if/when something needs it. It's *normal* for the block cache to not be empty -- in fact, it's *better* for it to not be empty: An empty block cache is a block cache that isn't doing anything to make your system faster. As soon as something else has a use for that memory, the OS will free it up. – Charles Duffy Jul 16 '20 at 19:03
  • It's possible that the Python interpreter is deciding not to release that memory, even though it should no longer be necessary after each iteration of the loop. You could try calling gc.collect() to confirm, as discussed here: https://stackoverflow.com/questions/1316767/how-can-i-explicitly-free-memory-in-python – afarley Jul 16 '20 at 19:04
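  (A minimal sketch of that check, not from the thread: gc.collect() forces a full collection and returns the number of unreachable objects it found.)

    import gc

    unreachable = gc.collect()  # run a full garbage collection
    print(unreachable)          # count of unreachable objects found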
  • @afarley, the OP says the Python interpreter is _exiting_. A process that has exited cannot decide not to release memory (unless it's something exotic like a SHM block, but Python doesn't do anything like that automatically). – Charles Duffy Jul 16 '20 at 19:05
  • @CharlesDuffy good point, my previous suggestion is probably wrong. – afarley Jul 16 '20 at 19:06
  • @Rayalot72, ...if this _isn't_ a cache in the OS, it's an operating system or driver bug. In no scenario is it Python doing anything wrong; as soon as a process exits, it's the operating system's job to release its memory and other resources, so if the OS fails to do so, it's an OS bug... _or_, as described above (and far, _far_ more likely to be the case), an intentional and desired behavior. – Charles Duffy Jul 16 '20 at 19:08
  • @Rayalot72, ...if you gave us details about how you were measuring "used" memory, we could speak to whether that measure counts data that's storing a transient cache as "used". – Charles Duffy Jul 16 '20 at 19:09
  • @CharlesDuffy Not sure how to answer that? I had just seen very high numbers in Task Manager that I couldn't account for and thought it was a memory leak. If that includes caches, then it's probably not something to worry about, as you said (although I'll try to run out of memory to make sure). – Rayalot72 Jul 16 '20 at 19:16
  • @afarley Doesn't seem to be holding on to memory; gc.collect() in the shell returned ~400, but memory usage didn't change. Closing and reopening the shell didn't change anything. – Rayalot72 Jul 16 '20 at 19:18
  • @Rayalot72 just out of curiosity, does the behaviour still happen if you run it on the same drive that python and the script are stored on? I see you using a `D:\` drive, and since this sounds far more likely to be an OS issue I wonder if it is just "windows sucks" or "your disk driver has an issue." – Tadhg McDonald-Jensen Jul 16 '20 at 19:19
  • I'd probably head over to our sister site [Super User](https://superuser.com/) and review [Why is my memory at 65% usage when I'm not running any programs?](https://superuser.com/questions/1338367) -- the answers go into figuring out if there's _really_ a program using that memory (and exactly what that program is), or if it's just cache. Also, the form in which the question is asked (including details about _where_ in the task manager it's displaying that the memory is used) is a pattern that's worth following; "available" is not an exact inverse of "used", so details matter. – Charles Duffy Jul 16 '20 at 19:38
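  (Aside, not from the thread: one quick way to read those system-wide counters from Python is the third-party psutil package, assuming it is installed -- a minimal sketch:)

    import psutil  # third-party: pip install psutil

    vm = psutil.virtual_memory()
    # "available" is what processes can claim without pushing the system
    # into swap; it is not simply total minus used, because cached data
    # is counted differently on each platform.
    print(vm.total, vm.available, vm.used, vm.percent)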
  • @TadhgMcDonald-Jensen I don't believe so? I've just tried similar file I/O on my C and D drives, and it seems to increase no matter what. – Rayalot72 Jul 17 '20 at 18:53
  • @CharlesDuffy Thanks for the tip, the RAMMAP application was very useful. Seems the memory is being taken up by "mapped files." Is that normal, or something I should fix? – Rayalot72 Jul 17 '20 at 18:56
  • This depends on the details of how Windows works. Over in UNIX land, it's typical for memory-mapped files to just be a special case in the block cache code (where writes to the memory are allowed but need to be written back to the underlying file). You might ask a question over at Super User to get folks who are subject-matter experts on Windows. BTW, if there _is_ an application holding a file handle open on a file that the original application already closed, I'd consider something like an antivirus a likely suspect. – Charles Duffy Jul 18 '20 at 18:13
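  (For illustration, not from the thread: a minimal Python sketch of a memory-mapped file, using a hypothetical scratch file example.bin. Writes land in memory pages that the OS flushes back to the file later, which is why mapped files can show up as memory in tools like RAMMap:)

    import mmap

    # Create a one-page scratch file to map (hypothetical name).
    with open("example.bin", "wb") as f:
        f.write(b"\x00" * 4096)

    with open("example.bin", "r+b") as f:
        with mmap.mmap(f.fileno(), 0) as mm:  # map the whole file
            mm[:7] = b"garbage"  # the write lands in memory pages first
            mm.flush()           # ask the OS to write the pages back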

0 Answers