Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support bagging to a destination other than the source #92

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

runderwood
Copy link
Contributor

See #35.

@coveralls
Copy link

coveralls commented Jun 20, 2017

Coverage Status

Coverage decreased (-0.03%) to 84.774% when pulling 936c0ff on runderwood:35-other-destinations into af21e2a on LibraryOfCongress:master.

@coveralls
Copy link

coveralls commented Jun 20, 2017

Coverage Status

Coverage decreased (-0.1%) to 84.665% when pulling 01d1ede on runderwood:35-other-destinations into af21e2a on LibraryOfCongress:master.

@coveralls
Copy link

coveralls commented Jun 20, 2017

Coverage Status

Coverage increased (+0.1%) to 84.923% when pulling be38f43 on runderwood:35-other-destinations into af21e2a on LibraryOfCongress:master.

@johnscancella
Copy link
Contributor

Looks good to me. @acdha is on vacation, but I am sure once he gets back he will take a look at it.

@runderwood
Copy link
Contributor Author

@johnscancella @acdha

Ping.

@@ -132,7 +133,9 @@ def find_locale_dir():
UNICODE_BYTE_ORDER_MARK = u'\uFEFF'


def make_bag(bag_dir, bag_info=None, processes=1, checksums=None, checksum=None, encoding='utf-8'):
def make_bag(bag_dir,
bag_info=None, processes=1, checksums=None, checksum=None,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer not to have formatting changes in this PR or to follow PEP-8 if we do. In this case, the existing code has been using the aligned indent style.

@@ -148,28 +151,57 @@ def make_bag(bag_dir, bag_info=None, processes=1, checksums=None, checksum=None,
checksums = DEFAULT_CHECKSUMS

bag_dir = os.path.abspath(bag_dir)
LOGGER.info(_("Creating bag for directory %s"), bag_dir)
if dest_dir:
LOGGER.info(_("Creating bag for directory %s"), dest_dir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would prefer a single logging call rather than two near-duplicate lines. In this case, I think it might be easier to follow if we set dest_dir using bag_dir when called with dest_dir=None so most of the code could assume that dest_dir is what you use when assembling paths

# FIXME: we should do the permissions checks before changing directories
old_dir = os.path.abspath(os.path.curdir)
if dest_dir is not None:
dest_dir = os.path.abspath(dest_dir)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd prefer to normalize dest_dir earlier in the function

old_dir = os.path.abspath(os.path.curdir)
if dest_dir is not None:
dest_dir = os.path.abspath(dest_dir)
if os.path.exists(dest_dir):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This block of code is somewhat complicated. As a minor point, OSError or EnvironmentError is probably more appropriate than RuntimeError but I'm leaning towards simplifying the entire block to:

if not os.path.isdir(dest_dir):
    os.makedirs(dest_dir)

That will raise OSError if the path already exists but has a component which is not a directory.

data_subdir = os.path.relpath(
bag_dir, os.path.join(bag_dir, os.pardir)
)
if os.path.abspath(os.path.join(bag_dir, data_subdir)) == bag_dir:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be os.path.samefile(os.path.join(bag_dir, data_subdir), bag_dir)

if not os.path.exists(data_dir):
os.mkdir(data_dir)
elif os.path.exists(os.path.join(data_dir, data_subdir)):
shutil.rmtree(os.path.join(data_dir, data_subdir))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems risky at first read – shouldn't we just throw an error if the data directory already exists?

os.chmod('data', os.stat(cwd).st_mode)
os.chmod(data_dir, os.stat(bag_dir).st_mode)

if dest_dir:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm wondering whether this should be one call with src_dir either being bag_dir or None for clarity


LOGGER.info(_("Creating bagit.txt"))
txt = """BagIt-Version: 0.97\nTag-File-Character-Encoding: UTF-8\n"""
with open_text_file('bagit.txt', 'w') as bagit_file:
bagit_file.write(txt)
if dest_dir:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This entire section is where I was thinking it might be cleaner to always have dest_dir defined since I'm not happy about so many if statements in a row

@runderwood
Copy link
Contributor Author

@acdha Not sure if this branch is worth keeping around. I'm going to be out of pocket for at least a few months and don't have time to tighten it up before.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants