Implement initial capability to accept pre-generated checksum manifest file #4

Open
jordanpadams opened this issue Jan 24, 2020 · 0 comments
jordanpadams commented Jan 24, 2020

  • Create flag for accepting 1 or more input checksum manifest files - these files can be used in lieu of generating a checksum

  • Create flag for manifest base directory (base_dir) - the directory that the paths in the checksum manifest are relative to. For example, for this manifest the user would need to specify a base directory so that we can locate:
    $base_dir/./data_hkl1/orbit_c/20190805T133001S610_ocm_hkL1.csv

  • Create flag for specifying manifest checksum type - MD5 (default), SHA-1, SHA-256

  • Read manifest file into memory (should we put a cap on file size so we make sure we don't crash the system?)

  • Perform exactly the same functionality as "offline" mode, except that when a file matches a path identified in the checksum manifest, do not generate a checksum; use the value from the checksum manifest instead

  • Manifest(s) will look something like this, this, or this

  • Manifest format: assume the format is MD5Deep 4.n (https://pds.nasa.gov/datastandards/documents/im/current/index_1D00.html#value_pds_checksum_manifest_pds_parsing_standard_id_md5deep%204.n) which basically points to http://md5deep.sourceforge.net/
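
The manifest handling described in the list above could be sketched roughly as follows. This is only an illustration, not the project's implementation: it assumes md5deep-style lines of the form `<digest>  <path>`, and the names `read_manifest`, `checksum_for`, and `MAX_MANIFEST_BYTES` (including the 50 MB value) are hypothetical.

```python
import hashlib
import os

# Hypothetical cap on manifest size; the issue only asks whether a cap is needed.
MAX_MANIFEST_BYTES = 50 * 1024 * 1024

# Checksum types named in the issue, mapped to hashlib algorithm names.
ALGORITHMS = {"MD5": "md5", "SHA-1": "sha1", "SHA-256": "sha256"}


def read_manifest(manifest_path, base_dir, max_bytes=MAX_MANIFEST_BYTES):
    """Return {normalized path: hex digest} from a "<digest>  <path>" manifest."""
    if os.path.getsize(manifest_path) > max_bytes:
        raise ValueError(f"manifest {manifest_path} exceeds {max_bytes} bytes")
    digests = {}
    with open(manifest_path, encoding="utf-8") as f:
        for line in f:
            # Split on the first run of whitespace so paths may contain spaces.
            parts = line.strip().split(None, 1)
            if len(parts) != 2:
                continue  # skip blank or malformed lines
            digest, rel_path = parts
            # Manifest paths are assumed to be relative to base_dir.
            full = os.path.normpath(os.path.join(base_dir, rel_path))
            digests[full] = digest.lower()
    return digests


def checksum_for(path, digests, checksum_type="MD5"):
    """Use the manifest digest when the path is listed; otherwise compute one."""
    key = os.path.normpath(path)
    if key in digests:
        return digests[key]
    h = hashlib.new(ALGORITHMS[checksum_type])
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()
```

Keying the lookup on the normalized joined path means an entry such as `./data_hkl1/orbit_c/...` matches the same file regardless of the leading `./`.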

See internal GitHub issues for additional comments.

jordanpadams added a commit that referenced this issue Jan 24, 2020
An attempt at #4 (adds `--digestdir` command line option in order to use pre-computed digests)

There are a number of assumptions here that'll probably need revisiting:

- Issue #4 says to add a true/false flag for checksum manifest; skipping that since it's implied by providing a flag for the manifest base directory
- Since we're providing a directory and not a single file, we're assuming that the directory contains a tree of possible manifest files
- Assuming manifest files are named `manifest.tsv`; all of them are read into memory
- Treating them as tab separated values of "filename" TAB "digest" (checksum)
- They're actually whitespace separated values
- No commenting or other syntax specified for them
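
A minimal sketch of what these assumptions amount to, for illustration only: it walks a directory tree, reads every `manifest.tsv`, and treats each line as whitespace-separated "filename" and "digest". The function name `load_digest_dir` is hypothetical, and resolving each filename relative to the directory holding its manifest is an assumption the commit message does not state.

```python
import os


def load_digest_dir(digest_dir, manifest_name="manifest.tsv"):
    """Collect {normalized path: digest} from every manifest under digest_dir."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(digest_dir):
        if manifest_name not in filenames:
            continue
        with open(os.path.join(dirpath, manifest_name), encoding="utf-8") as f:
            for line in f:
                # The digest is the last whitespace-separated field, so split
                # from the right; this tolerates spaces inside the filename.
                fields = line.strip().rsplit(None, 1)
                if len(fields) != 2:
                    continue  # blank or malformed line; no comment syntax is defined
                filename, digest = fields
                # Assumption: filenames are relative to the manifest's directory.
                key = os.path.normpath(os.path.join(dirpath, filename))
                digests[key] = digest
    return digests
```

Because all manifests are read into memory at once, a very large tree of manifests would make the size question raised in the issue relevant here as well.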

The `docs/samples` directory now includes a sample tree of `manifest.tsv` files.

NOTE: These changes were rebased in the internal repo, so the functionality will most likely need to be modified and tested.
@jordanpadams jordanpadams added the enhancement New feature or request label Mar 10, 2020
Status: ToDo