I am writing a Python script for securing locally available OneDrive files. As some or most of you already know, you can opt to use placeholders for your OneDrive files. Your files are then shown on your local filesystem, including their size, but they only get downloaded when you click them. So if OneDrive is installed and configured with this functionality, then when doing forensic work chances are that some of the OneDrive files are available locally and some aren't.
So I decided to write a script to copy all locally available files to a destination. Obviously the first step is to determine whether a file really exists locally. As the size of the real file is reported both by the OS and by os.path.getsize and os.stat, this cannot be done by looking at the size.
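To illustrate the size point, here is a minimal sketch that compares the two lookups. The temporary file is only a stand-in for a OneDrive file; for a placeholder both calls would, as described above, still report the full size.

```python
import os
import tempfile

# Stand-in for a OneDrive file: a temporary file with known content.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 1024)
    sample_path = tmp.name

# Both lookups read the size from the file's metadata, not from its data.
size_via_getsize = os.path.getsize(sample_path)
size_via_stat = os.stat(sample_path).st_size

print(size_via_getsize, size_via_stat)

os.remove(sample_path)
```

Since neither call touches the file's data, they cannot tell a downloaded file from a placeholder.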
I also tried reading the file. I inspected the file with FTK Imager, and when viewing it I saw the real file size but nothing except zeroes. So I thought I could make a simple condition like
if filecontent.read(16) != b'\x00' * 16:
However, I never got to testing and tuning this. If a file is locally available I can print the first 16 bytes of the file with
print(filecontent.read(16))
But if the file is only a placeholder then I get this error:
Traceback (most recent call last):
  File "C:\Users\Jeffrey\PycharmProjects\ODEOF\main.py", line 44, in <module>
    filecontent.read(64)
OSError: [Errno 22] Invalid argument
I once again inspected the file with FTK Imager, only to find out that apparently FTK can't tell me in which cluster and sector the file is located, only showing -1 for both. For an available file it does show where it's available.
I decided to use this method as a way of diagnosing what was going on. The weird thing is that it doesn't work: the value returned by file.tell() is well into the tens of thousands. So this method would indicate a file filled with data, while read() fails on it and FTK shows an empty file. I have searched (that's how I found that particular method) for other ways of reading the file's content, or for solutions for OneDrive placeholders, but I have found absolutely nothing.
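For reference, this is roughly the kind of check I mean, assuming the method boils down to seeking to the end of the file and calling tell(). The helper name size_by_seek and the temporary file are only for illustration.

```python
import os
import tempfile

def size_by_seek(path):
    """Return the offset of the end of the file; no data is read."""
    with open(path, "rb") as handle:
        handle.seek(0, os.SEEK_END)  # move to the end without reading
        return handle.tell()         # offset of the end == logical size

# Stand-in file filled with zero bytes, like the content FTK Imager displayed.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 20000)
    sample_path = tmp.name

reported = size_by_seek(sample_path)
print(reported)

os.remove(sample_path)
```

Seeking only moves the file position and never reads data, which would explain why this reports the full logical size even when read() fails.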
One thing I did do that makes it work, but imho not very robustly, is to catch the OSError with errno 22 and treat it as a sign that the file is a placeholder. But if that error ever shows up for any other reason while the file is locally available, I will miss a file that I should have copied. So I was hoping someone else might have an idea.
Here's my rough code:
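import os

source_dir = r"<insert path>"  # folder to scan

with os.scandir(source_dir) as files:
    for file in files:
        # Skip symlinks and anything that is not a regular file
        if not file.is_symlink() and file.is_file():
            print(file.name)
            # The with block closes the handle again after the probe
            with open(file, mode='rb') as filecontent:
                try:
                    filecontent.read(16)
                    print("Copy file")
                except OSError as OS_error:
                    if OS_error.errno == 22:
                        print("OSError 22, Placeholder?")
                        # Trigger error: print(filecontent.read(64))
                    else:
                        # str() is needed: an exception cannot be concatenated to a string
                        print("Other OSError: " + str(OS_error))
                #empty_test = filecontent.read(16)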
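
A possible way to depend less on errno 22 alone, sketched under assumptions that are not verified in this post: Windows defines the file attribute bits FILE_ATTRIBUTE_OFFLINE, FILE_ATTRIBUTE_RECALL_ON_OPEN and FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS, and the assumption here is that a OneDrive placeholder carries at least one of them and that os.stat exposes it through st_file_attributes. The helper names and return labels are hypothetical.

```python
import errno
import os
import tempfile

# Windows file attribute bits (values from the Windows file attribute constants).
FILE_ATTRIBUTE_OFFLINE = 0x00001000
FILE_ATTRIBUTE_RECALL_ON_OPEN = 0x00040000
FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS = 0x00400000

# Assumption: a OneDrive placeholder has at least one of these bits set.
PLACEHOLDER_MASK = (
    FILE_ATTRIBUTE_OFFLINE
    | FILE_ATTRIBUTE_RECALL_ON_OPEN
    | FILE_ATTRIBUTE_RECALL_ON_DATA_ACCESS
)

def has_placeholder_attributes(path):
    """Check the attribute bits without opening the file."""
    # st_file_attributes only exists on Windows; fall back to 0 elsewhere.
    attributes = getattr(os.stat(path), "st_file_attributes", 0)
    return bool(attributes & PLACEHOLDER_MASK)

def classify(path):
    """Return 'placeholder', 'suspect', 'error' or 'local' (hypothetical labels)."""
    if has_placeholder_attributes(path):
        return "placeholder"
    try:
        with open(path, "rb") as handle:
            handle.read(16)
    except OSError as error:
        # errno.EINVAL is the errno 22 from the traceback above.
        return "suspect" if error.errno == errno.EINVAL else "error"
    return "local"

# Usage on a throwaway local file, which should be classified as local.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"local data")
    sample_path = tmp.name

result = classify(sample_path)
print(result)

os.remove(sample_path)
```

Checking the attributes first means the read probe only runs on files that do not look like placeholders, and an errno 22 on such a file is reported as "suspect" instead of being silently skipped.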