A relative of mine asked me to restore a couple of files from backup CDs created by the native Windows backup utility.
It turned out to be more difficult than I initially expected.
First, unlike other tools, the Windows backup utility zips the files using the backslash as the directory separator, and none of the Linux decompression tools I tried treat it as one: every file ends up in a single directory with the backslashes as part of its name.
Second, the filenames were encoded with one of the native Windows encodings (cp852 in this case), and the Linux decompression tools do not convert them either (apparently unzip on Ubuntu is patched to add support for filename encodings, but Debian, as usual, is lacking).
What a mess ...
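Finding the right source encoding is mostly trial and error. A minimal sketch, assuming Python 3 and that you have the raw bytes of a scrambled name at hand (for example from `os.listdir(b'.')`): decode the same bytes with a few candidate code pages and pick the one that reads correctly. The function name `candidate_decodings` and the list of code pages are only illustrative choices.

```python
def candidate_decodings(raw, encodings=("cp437", "cp850", "cp852", "cp1250", "cp1252")):
    """Decode the same raw filename bytes with several candidate code pages."""
    # errors="replace" keeps going when a byte is undefined in a code page
    return {enc: raw.decode(enc, errors="replace") for enc in encodings}


# Example: a Polish filename as it would be stored in cp852
raw = "zażółć.txt".encode("cp852")
for enc, text in candidate_decodings(raw).items():
    print(enc, repr(text))
```

Only one of the printed lines should look like a real filename; that encoding is the one to pass to the conversion step.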
1. extract with bsdtar, as unzip would scramble the non-ASCII filenames
$ bsdtar xf ../Backup\ files\ 7.zip
2. convert filenames to utf-8
$ convmv -r -f cp852 -t utf-8 --notest .
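For readers without convmv, the conversion amounts to reading each filename's bytes as cp852 and writing them back as UTF-8. The following is a rough Python 3 sketch of that idea, not a reimplementation of convmv; `reencode_name` and `reencode_tree` are hypothetical names, and skipping names that already decode as UTF-8 is a design choice of this sketch. Like convmv without `--notest`, it only reports what it would do unless `dry_run=False` is passed.

```python
import os


def reencode_name(raw, src="cp852", dst="utf-8"):
    """Return raw re-encoded from src to dst, or None if it is already valid dst."""
    try:
        raw.decode(dst)
        return None  # plain ASCII or already converted: leave untouched
    except UnicodeDecodeError:
        return raw.decode(src).encode(dst)


def reencode_tree(top, src="cp852", dst="utf-8", dry_run=True):
    """Rename entries below top; returns the list of (old, new) byte paths."""
    plan = []
    # Walk bottom-up on bytes paths so children are renamed before their parent.
    for root, dirs, files in os.walk(os.fsencode(top), topdown=False):
        for name in dirs + files:
            new_name = reencode_name(name, src, dst)
            if new_name is None:
                continue
            old_path = os.path.join(root, name)
            new_path = os.path.join(root, new_name)
            plan.append((old_path, new_path))
            if not dry_run:
                os.rename(old_path, new_path)
    return plan
```

Calling `reencode_tree('.')` first and inspecting the returned pairs is a cheap way to check that cp852 was the right guess before renaming anything.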
3. recreate the directory structure
$ cat y.py
#! /usr/bin/env python
import os
import errno

# already created directories, walk works topdown, so a child dir
# never creates a directory if there is a parent dir with a file.
made_dirs = set()

for root, dir_names, file_names in os.walk('.'):
    for file_name in file_names:
        if '\\' not in file_name:
            continue
        alt_file_name = file_name.replace('\\', '/')
        if alt_file_name.startswith('/'):
            alt_file_name = alt_file_name[1:]  # cut off starting dir separator
        # dirname is '' if no separator is left (e.g. '\file'), which
        # a plain rsplit('/', 1) would fail to unpack
        alt_dir_name = os.path.dirname(alt_file_name)
        print('alt_dir', alt_dir_name)
        full_dir_name = os.path.join(root, alt_dir_name)
        if alt_dir_name and full_dir_name not in made_dirs:
            try:
                os.makedirs(full_dir_name)
            except OSError as exc:
                if exc.errno == errno.EEXIST and os.path.isdir(full_dir_name):
                    # the path already exists and is a folder, let's just ignore it
                    pass
                else:
                    raise
            made_dirs.add(full_dir_name)
        # move the file from its flat backslash name into the new directory
        os.rename(os.path.join(root, file_name),
                  os.path.join(root, alt_file_name))
$ python y.py
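The three steps could probably be folded into a single pass with Python's `zipfile` module instead of bsdtar, convmv and y.py. This is a different technique from the one used above and only a sketch: it assumes Python 3, assumes the names are cp852, and relies on `zipfile` decoding names as cp437 when the entry's UTF-8 flag (bit 0x800) is not set. `fix_entry_name` and `extract_backup` are made-up names.

```python
import os
import shutil
import zipfile


def fix_entry_name(name, flag_bits, encoding="cp852"):
    """Return a '/'-separated relative path for a zip entry name."""
    if not flag_bits & 0x800:
        # No UTF-8 flag: zipfile decoded the raw bytes as cp437, so undo
        # that and decode with the assumed code page instead.
        name = name.encode("cp437").decode(encoding)
    # Treat backslashes as separators and drop empty or '.' components.
    parts = [p for p in name.replace("\\", "/").split("/") if p not in ("", ".")]
    if ".." in parts:
        raise ValueError("refusing entry that escapes the target: %r" % name)
    return "/".join(parts)


def extract_backup(zip_path, dest=".", encoding="cp852"):
    """Extract zip_path into dest, fixing separators and filename encoding."""
    with zipfile.ZipFile(zip_path) as zf:
        for info in zf.infolist():
            rel = fix_entry_name(info.filename, info.flag_bits, encoding)
            if not rel:
                continue
            target = os.path.join(dest, *rel.split("/"))
            if info.filename.endswith(("/", "\\")):
                os.makedirs(target, exist_ok=True)  # directory entry
                continue
            os.makedirs(os.path.dirname(target) or ".", exist_ok=True)
            with zf.open(info) as src, open(target, "wb") as out:
                shutil.copyfileobj(src, out)
```

With the archive from step 1 this would be called as `extract_backup('../Backup files 7.zip', 'restored')`, where `restored` is an arbitrary target directory.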