CSiEra

CSiEra

I know I know nothing.

Bulk delete certain files

Background#

A few days ago, I downloaded a 100GB dataset online, which contained tens of thousands of files. However, when I extracted it, I found that each file had a duplicate starting with ._, for example, if there was a file named sub_12345, there would be a corresponding ._sub_12345 file. This duplicate file is useless, but it is visible in Windows. It not only looks unpleasant but also affects the subsequent program's file reading.

Python script for batch deletion#

The core is to use the os.walk module for processing:

import os

data_dir = './test/'
for root, subdir, filename in os.walk(data_dir, topdown=False):
  if filename.startswith('._'):
    os.remove(os.path.join(root, filename))

The above is the script, which is very simple. Because os.walk implements recursive folder traversal, the task becomes much simpler.

Loading...
Ownership of this post data is guaranteed by blockchain and smart contracts to the creator alone.