
Flag to avoid backing up long-stopped containers #94

Open
MMauro94 opened this issue Apr 19, 2022 · 14 comments
Labels
enhancement (New feature or request), pr welcome

Comments

@MMauro94
Contributor

It would be nice to be able to add a flag (either in the config or in the container labels) to avoid making a backup if the container is stopped AND it was last stopped before the latest backup.

My use case is this: I have a game server container that I start on demand. There may be periods when it's used almost every day and periods of months when it stays permanently stopped. I would like to run backups only when there are actual changes to the container's state. Used in conjunction with backup pruning, this also avoids having N backups of the exact same thing and allows keeping the last N distinct states.

By running docker inspect I found that there are StartedAt and FinishedAt properties, so it should be relatively easy to pull this info and compare it with the last backup date.
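As a quick illustration (needsBackup is a hypothetical helper, not existing docker-volume-backup code), comparing the RFC3339 timestamps that docker inspect reports against a last-backup time is straightforward in Go:

```go
package main

import (
	"fmt"
	"time"
)

// needsBackup reports whether a stopped container has changed since the
// last backup. Both arguments are RFC3339 strings, as returned by
// `docker inspect` under .State.FinishedAt and recorded for the backup.
func needsBackup(finishedAt, lastBackup string) (bool, error) {
	finished, err := time.Parse(time.RFC3339Nano, finishedAt)
	if err != nil {
		return false, err
	}
	backup, err := time.Parse(time.RFC3339Nano, lastBackup)
	if err != nil {
		return false, err
	}
	// The container stopped after the last backup ran, so its state
	// may have changed and a new backup is warranted.
	return finished.After(backup), nil
}

func main() {
	changed, _ := needsBackup("2022-04-18T21:03:12.000000001Z", "2022-04-01T00:00:00Z")
	fmt.Println(changed) // prints true: the container stopped after the last backup
}
```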

@m90
Member

m90 commented Apr 19, 2022

This is an interesting proposal, but I'm not sure it's possible to do this in a configuration-friendly way, for the following reason: right now, a docker-volume-backup container does not really know how to map a volume that is mounted for backup to another container running on the host. Starting / stopping containers during backup happens using labels, but this still does not create any kind of logical connection to the volume that is being backed up.

So if we were to implement such a feature, we'd need to come up with an API that allows linking a volume to a service definition. Right now I have a hard time coming up with something that would work here, but I'll turn it over in the back of my head for a little while. Do you have an idea of what such an API could look like?


Looking at your underlying use case, another solution could be to introduce an option like CONSIDER_STALE_AFTER (a value would be something like 168h): in case no file in the volume has changed (judging by the respective mtime) after the deadline defined by the option, the backup is skipped. What do you think about this approach?

@m90 m90 added the enhancement (New feature or request) label Apr 19, 2022
@MMauro94
Contributor Author

You're right: since I'm using a single, dedicated docker-volume-backup container for this case, the problem of mapping volumes didn't occur to me.

An option could be to make this a global config instead of a label on a container and apply the rule to all matching containers. The condition for skipping a backup would then be: all containers are stopped AND max(FinishedAt) < latest_backup_date.


In my case CONSIDER_STALE_AFTER would definitely work, but I have two points:

  1. Why define a fixed amount of time instead of simply checking last_changed_file_date < latest_backup_date?
  2. It doesn't apply to my case, but there may be a significant performance hit in directories with a lot of small files.

@m90
Member

m90 commented Apr 19, 2022

I think I prefer the second approach in the variation you describe. It's just very easy to understand and would cover a lot of cases. The option could be called something like SKIP_IF_UNCHANGED.

Why define a fixed amount of time instead of simply checking last_changed_file_date < latest_backup_date?

Yeah, that's a very good point. It would be a lot more intuitive, and also more useful, if it worked as you describe. One problem still to solve: how do we know about the "last" backup? Pruning has BACKUP_PRUNING_PREFIX, and while replicating (or even reusing) it would work, it's not very elegant. Ideas are very welcome.

It doesn't apply to my case, but there may be a significant performance hit in directories with a lot of small files.

This is correct, but the backup mechanism already has to walk the entire file tree so it can create the tar archive here:

if err := filepath.WalkDir(inPath, func(path string, di fs.DirEntry, err error) error {
	paths = append(paths, path)
	return err
}); err != nil {
	return fmt.Errorf("compress: error walking filesystem tree: %w", err)
}
which means we would get the info for free in that case.

@MMauro94
Contributor Author

but the backup mechanism already has to walk the entire file tree so it can create the tar archive here ... which means we would get the info for free in that case

That's great!


Ok, so if we don't want to reuse the BACKUP_PRUNING_PREFIX logic this leaves us with the task of listing backups (being careful to exclude any file that is not a backup).

I have a few ideas in mind, I'll list them here in no particular order:

  1. Generate a regex from BACKUP_FILENAME that will match all backup files. We'll need to be careful about % strftime expansion and $ env expansion, but it should be doable.
  2. Save a JSON file that acts as a small DB for the backup info. This prompts a few questions: what do we call it? What if two instances of docker-volume-backup are pointed at the same dir? What about old backups that won't appear in this file?
  3. Explicitly forbid pointing two instances at the same backup folder, but this would be a pretty big breaking change.

Personally I'd go with option 1 as that is 100% backwards compatible and keeps the logic and filesystem pretty clean. What do you think?

Note: if we go through with this, it could also be used for the pruning mechanism.
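To sketch how option 1 could work, here is a hypothetical templateToRegexp helper (not existing docker-volume-backup code) covering only a handful of strftime directives; note how $VAR references, whose content is unknown, have to become the overly permissive .*:

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// patternFor maps a few strftime directives to regexp fragments;
// a real implementation would need the full set.
var patternFor = map[byte]string{
	'Y': `\d{4}`, 'm': `\d{2}`, 'd': `\d{2}`,
	'H': `\d{2}`, 'M': `\d{2}`, 'S': `\d{2}`,
}

func isWord(c byte) bool {
	return c == '_' || ('0' <= c && c <= '9') ||
		('a' <= c && c <= 'z') || ('A' <= c && c <= 'Z')
}

// templateToRegexp turns a BACKUP_FILENAME template into a regexp
// matching every file the template could have produced.
func templateToRegexp(template string) (*regexp.Regexp, error) {
	var b strings.Builder
	b.WriteString(`^`)
	for i := 0; i < len(template); i++ {
		switch {
		case template[i] == '%' && i+1 < len(template):
			if p, ok := patternFor[template[i+1]]; ok {
				b.WriteString(p)
				i++ // skip the directive character
				continue
			}
			b.WriteString(regexp.QuoteMeta(string(template[i])))
		case template[i] == '$':
			// Consume the variable name and emit a wildcard, since
			// the env value is unknown at matching time.
			j := i + 1
			for j < len(template) && isWord(template[j]) {
				j++
			}
			b.WriteString(`.*`)
			i = j - 1
		default:
			b.WriteString(regexp.QuoteMeta(string(template[i])))
		}
	}
	b.WriteString(`$`)
	return regexp.Compile(b.String())
}

func main() {
	re, _ := templateToRegexp("backup-$ENV-%Y-%m-%dT%H-%M-%S.tar.gz")
	fmt.Println(re.MatchString("backup-prod-2022-04-19T10-30-00.tar.gz")) // prints true
}
```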

@m90
Member

m90 commented Apr 20, 2022

Option 1 sounds enticing, but to be honest I'm not entirely sure how a regexp derived from BACKUP_FILENAME could be guaranteed to always work (this is very sensitive, as it could potentially cause accidental deletion). Maybe I'm wrong though, and it's easier than I think?

Maybe using the prefix is dumb, but it's predictable and lets people sleep well (which should be an explicit design goal for a backup tool). I'm not sure right now what's the best way to go. At the moment I would lean towards soft-deprecating BACKUP_PRUNING_PREFIX and introducing something like BACKUP_SELECTION_PREFIX that works both for this feature and for pruning. I'll need to think about this a little further though.

In any case I think having a BACKUP_FILENAME mechanism that also allows for failsafe matching of existing backups is something to consider in case a v3 is ever released (see #80).

I think options 2 and 3 would work but have too many drawbacks.

@MMauro94
Contributor Author

The regex is pretty simple to generate, but now that I think about it, the real problem is the env expansion.
Imagine having two instances pointed to the same dir with these settings:

BACKUP_FILENAME                        Would-be regex
backup-$ENV-%Y-%m-%dT%H-%M-%S.tar.gz   backup-.*-\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}\.tar\.gz
$ENV-%Y-%m-%dT%H-%M-%S.tar.gz          .*-\d{4}-\d{2}-\d{2}T\d{2}-\d{2}-\d{2}\.tar\.gz

Notice how the second regex will match any backup from the first instance. This is just an example, but these kinds of slip-ups are pretty easy to make, especially since we don't know the content of the env variable and so must use .*.


Probably your BACKUP_SELECTION_PREFIX is the easiest and safest thing to do. I'll propose, however, a little twist: what if we make a BACKUP_SELECTION_REGEX instead? This would allow for greater flexibility while still maintaining predictability.

@m90
Member

m90 commented Apr 20, 2022

I'll propose, however, a little twist: what if we make a BACKUP_SELECTION_REGEX instead? This would allow for greater flexibility while still maintaining predictability.

This is a brilliant idea, considering it's a new option anyway.


So what we'd need to do to implement this feature:

  • add bool SKIP_IF_UNCHANGED to the config
  • add string BACKUP_SELECTION_REGEX to the config
  • factor out the call to filepath.WalkDir from archive.go to happen upfront so we know about the latest mtime in the backup sources
  • in case SKIP_IF_UNCHANGED is true get the latest backups from all storages
  • in case the latest mtime is older than the latest backup, stop doing anything, in case it's newer, continue as usual
  • separate task: print a deprecation warning in case BACKUP_PRUNING_PREFIX is used and transform it to be a regex like ^existing-prefix
  • make backup pruning use the regex instead of the prefix when selecting candidates. This likely means we need to stop using native filtering when working against S3 storages (which is why this works as it does right now in the first place), but I guess it's ok to fetch everything and then do the comparison ourselves

@m90 m90 added the help wanted (Extra attention is needed) label Apr 20, 2022
@MMauro94
Contributor Author

That looks reasonable. I'll start working on a PR as soon as I get a bit of spare time. I'll probably need some help when dealing with S3 storages, but I'll let you know when I get there.

@m90
Member

m90 commented May 5, 2022

@MMauro94 Just to coordinate with you: I am planning to move things around here a bit in the next few days so that a) we have an easier way to exclude files from the backup (this is being requested again and again) and b) the code has a saner structure, so that we could maybe do #95 at some point.

This would probably collide with what you are planning to do here (but also provide synergies, e.g. making the list of files available upfront for filtering), so I wanted to check whether you have started working on this already. It would be a bit disappointing if you were close to opening that PR only to run into a big merge conflict against my recent changes.

@MMauro94
Contributor Author

MMauro94 commented May 5, 2022

Luckily I've been quite busy lately, so I haven't started working on a PR yet. I'd say go ahead with your changes, and I'll start working on this once you've finished.

@m90
Member

m90 commented May 10, 2022

@MMauro94 I merged #100, so you can now a) decode regular expressions from configuration and b) access the list of files to be backed up outside of the archive mechanism, which might be helpful for the task at hand here.

All other refactoring I might be doing won't collide with your feature, so feel free to start whenever it works for you (no obligations, but I guess you know that). Thanks again!

@MMauro94
Contributor Author

@m90 Great! Quick question: there is quite a bit of similar code for the different types of storages. Would you be open to generalizing the code a bit? In an OOP language I would create a common abstract class with abstract functions like listFiles, uploadFile, removeFile, and so on; then it's simply a matter of implementing each storage in its own class. This would allow, for instance, a single purge function. I'm guessing the Go equivalent would be something similar to this.

I know it's a bit out of scope for this issue, but I'd like to hear your thoughts on it. There's also the possibility to consider that a single jack-of-all-trades storage like rclone will be the way to go in the future (as per #65); in that case abstracting the current storages like this would probably be a waste of time.

@m90
Member

m90 commented May 11, 2022

Would you be open to generalize the code a bit?

Definitely. I already attempted this in this branch, but never really finished it. The idea was to make each storage implement the same interface, something like this:

type storage interface {
	id() storageID
	list(prefix string) ([]backupInfo, error)
	symlink(file string) error
	copy(file string) error
	delete(file string) error
}

and then the script just iterates over a list of configured storages, doing the same thing over and over again. IIRC the code on that branch works (all tests pass), but I never got around to testing it thoroughly enough. If you want to work on this, maybe it's a good starting point? The branch existing doesn't mean it can't be improved, btw, so if you think something can be done better, feel invited to do so.


About rclone I am not sure. It sounds tempting but right now it doesn't check all boxes it would need to check for me:

  • I wouldn't want to start installing a binary in the container and have the Go code shell out to it (then we could go back to a shell script again, which brings other problems)
  • You can theoretically use rclone as a Go library, but it's not a supported use case, i.e. there is no API stability. This means that whenever a bug in an exotic storage backend supported by rclone requires an update, things might break elsewhere because of some unrelated change in rclone

If there were an rclone-like tool designed to be used as a library I would be open to looking into it, but right now I don't know of any.

@m90
Member

m90 commented Aug 18, 2022

Would you be open to generalize the code a bit?

This refactoring is now on main, so it's probably a bit easier to implement this feature now.

@m90 m90 added the pr welcome label and removed the help wanted (Extra attention is needed) label Sep 28, 2022