The other day, someone asked me if I could write a script that would clean up a directory by removing every file that is X or more days old. I ended up using only Python's core modules for the task. Let's spend some time looking at one way to do it.
FAIR WARNING: The code in this article is designed to delete files. Use at your own risk!
Here's the code I came up with:
    import os
    import sys
    import time

    #----------------------------------------------------------------------
    def remove(path):
        """
        Remove the file or directory
        """
        if os.path.isdir(path):
            try:
                # os.rmdir only succeeds on an empty directory
                os.rmdir(path)
            except OSError:
                print("Unable to remove folder: %s" % path)
        else:
            try:
                if os.path.exists(path):
                    os.remove(path)
            except OSError:
                print("Unable to remove file: %s" % path)

    #----------------------------------------------------------------------
    def cleanup(number_of_days, path):
        """
        Removes files from the passed in path that are older than or equal
        to the number_of_days
        """
        # Cutoff timestamp: anything modified at or before this is removed
        time_in_secs = time.time() - (number_of_days * 24 * 60 * 60)
        for root, dirs, files in os.walk(path, topdown=False):
            for file_ in files:
                full_path = os.path.join(root, file_)
                stat = os.stat(full_path)

                if stat.st_mtime <= time_in_secs:
                    remove(full_path)

            # Once the files are handled, drop the folder if it is now empty
            if not os.listdir(root):
                remove(root)

    #----------------------------------------------------------------------
    if __name__ == "__main__":
        days, path = int(sys.argv[1]), sys.argv[2]
        cleanup(days, path)
Let's spend a few minutes looking at how this code works. In the cleanup function, we take the number_of_days parameter and convert it into seconds. Then we subtract that amount from the current time to get a cutoff timestamp, time_in_secs. Next we use the os module's walk function to walk through the directories. We set topdown to False to tell walk to traverse the directories from the innermost to the outermost. Then we loop over the files in each folder and check each file's last modification time (st_mtime). If that time is less than or equal to time_in_secs (i.e. the file was last modified X or more days ago), we try to remove the file. When that loop finishes, we check whether root (the folder currently being visited) still contains anything. If it doesn't, we delete the folder too.
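Since the real script deletes things, it can help to see the cutoff arithmetic in a version that only reports what it would remove. The following is a minimal dry-run sketch, not part of the original script; the function name find_old_files and its optional now parameter are my own additions for illustration.

```python
import os
import time


def find_old_files(path, number_of_days, now=None):
    """Return the files under path whose mtime is at or before the cutoff."""
    if now is None:
        now = time.time()
    # Same arithmetic as cleanup(): days -> seconds, subtracted from "now"
    cutoff = now - (number_of_days * 24 * 60 * 60)
    old_files = []
    # topdown=False visits the innermost folders first
    for root, dirs, files in os.walk(path, topdown=False):
        for file_ in files:
            full_path = os.path.join(root, file_)
            if os.stat(full_path).st_mtime <= cutoff:
                old_files.append(full_path)
    return old_files
```

Running this against a folder first and reading the list it returns is a cheap way to confirm the number of days is what you intended before anything is deleted.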
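The remove function is extremely straightforward. All it does is check whether the path that is passed in is a directory or not. Then it attempts to delete the path using the appropriate function (i.e. os.rmdir or os.remove). Note that os.rmdir only removes empty directories, which is why cleanup checks the folder's contents first.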
There are a couple of other ways to remove folders and files which should be mentioned. If you know you have a set of nested directories that are all empty, you could use os.removedirs() to remove them all in one fell swoop. Another, more extreme way of doing this would be to use Python's shutil module. It has a function called rmtree that removes a folder along with all the files and folders inside it!
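Here is a small sketch of both calls, run inside a throwaway temporary folder so nothing real is touched. The folder and file names (a, b, c, full, keep.txt) are made up for the example.

```python
import os
import shutil
import tempfile

# Throwaway sandbox so nothing real is deleted
base = tempfile.mkdtemp()

# A file in base keeps it non-empty, so os.removedirs stops pruning there
open(os.path.join(base, "keep.txt"), "w").close()

# os.removedirs: removes the leaf folder, then each empty parent in turn
nested = os.path.join(base, "a", "b", "c")
os.makedirs(nested)
os.removedirs(nested)
empty_chain_gone = not os.path.exists(os.path.join(base, "a"))

# shutil.rmtree: removes a folder and everything inside it
full = os.path.join(base, "full", "deeper")
os.makedirs(full)
open(os.path.join(full, "data.txt"), "w").close()
shutil.rmtree(os.path.join(base, "full"))
full_tree_gone = not os.path.exists(os.path.join(base, "full"))

print(empty_chain_gone, full_tree_gone)

# Remove the sandbox itself
shutil.rmtree(base)
```

One thing to watch with os.removedirs is that it keeps climbing and removing parent folders until it hits one that is not empty, so it can remove more than the path you named. shutil.rmtree does not care whether folders are empty at all, which is what makes it the more dangerous of the two.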
I've used both methods to great effect in other scripts. I have also found that sometimes I cannot delete a particular file on Windows unless I do it through Windows Explorer. To get around this, I have used Python's subprocess module to call the Windows del command with its /F flag to force the delete. You can probably do something similar on Linux with the rm command and its -f flag. Occasionally you will still run into files that you can't delete because they are locked or protected, or because you just don't have the correct permissions.
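A rough sketch of that subprocess fallback might look like the following. The function name force_delete is hypothetical, and the exact commands are my assumption of a reasonable setup (del /F run through cmd on Windows, rm -f elsewhere), not the code from my other scripts.

```python
import os
import subprocess
import sys


def force_delete(path):
    """Ask the operating system's own delete command to remove a file."""
    if sys.platform.startswith("win"):
        # del is a cmd.exe built-in, so it has to be run through cmd
        command = ["cmd", "/c", "del", "/F", path]
    else:
        # "--" stops rm from treating an odd file name as an option
        command = ["rm", "-f", "--", path]
    subprocess.call(command)
    # Check the result directly instead of trusting the exit code
    return not os.path.exists(path)
```

The function returns True only if the file is really gone afterwards, so the calling code can still report the files that refused to be deleted.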
If you've spent any time thinking about the script above, you've probably already thought of some improvements or features to add. Here are some that I thought would be nice:
I'm sure you have thought of other fun ideas or solutions. Feel free to share them in the comments below.
Copyright © 2024 Mouse Vs Python | Powered by Pythonlibrary