Management of the DB #496

Open
permcody opened this issue Mar 24, 2021 · 3 comments
@permcody
Member

The production CIVET database at the INL keeps growing. Some of what is in there is considered a record, so we can't just delete it. However, we also can't let it grow unbounded. We need to work out a way for the server to delete large portions of the DB that aren't records. I believe the DB uses keys, so data integrity will mostly handle itself. What we need to do is engineer a way for the server to selectively delete entries, and then work out what to delete. I know you need more details, but let's first see if we can engineer how deletion could work (e.g., probably a daily or weekly triggered task).

@brianmoose
Contributor

Django has "management" commands, see ci/management/commands for some examples. A couple of these are routinely called via cron. This is how civet recipes are loaded into the DB. It should be easy to add a new command to delete stuff, as long as you can precisely specify what you want to delete.
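As a rough illustration of what such a command's interface might look like, here is a minimal sketch written with plain `argparse` so it runs without a Django project. In a real Django command the same arguments would be registered in `BaseCommand.add_arguments` and the body would live in `handle`. The names `--days`, `--delete`, `find_candidates`, and `delete` are hypothetical and are not taken from the CIVET code base. The sketch defaults to a dry run, which seems prudent for anything cron calls unattended.

```python
import argparse


def build_parser():
    # Hypothetical flags; a Django command would add these in add_arguments().
    parser = argparse.ArgumentParser(description="Prune old CI data (sketch)")
    parser.add_argument("--days", type=int, default=60,
                        help="only consider items older than this many days")
    parser.add_argument("--delete", action="store_true",
                        help="actually delete; without this flag it is a dry run")
    return parser


def run(argv, find_candidates, delete):
    """Parse argv, look up candidates, and delete them only when asked to.

    find_candidates(days) -> list and delete(items) -> None are
    caller-supplied callbacks (assumptions standing in for DB queries).
    """
    opts = build_parser().parse_args(argv)
    candidates = find_candidates(opts.days)
    if opts.delete:
        delete(candidates)
        action = "deleted"
    else:
        action = "would delete"
    return f"{action} {len(candidates)} item(s) older than {opts.days} days"
```

Running it from cron without `--delete` first would give a report of what a real run would remove, before anything is actually dropped.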

@socratesgorilla
Contributor

What exactly constitutes a record entry vs. a non-record entry? Is there an easy way to identify this programmatically?

@socratesgorilla socratesgorilla self-assigned this Jan 6, 2022
@permcody
Member Author

permcody commented Jan 6, 2022

Back when I wrote this, I believe I had a good grasp of what was a record versus what was not. I'm less sure now that we are nine months out from the original ticket. If I had to guess, I'd say any record linked to one of our releases and that's it!

However, we don't want to delete records from a few days ago either in case we need to go back in time a bit as part of our normal development cycle.

I would propose that we delete records older than a month or two (of course this will be a variable that we can change) that are NOT part of a release. We could probably be generous and keep a whole year's worth of build results if we want. So what's a release? Well, these are linked on our webpage, so we'll have to think about how to formalize them in CIVET land so we don't accidentally dump anything. This will be critical! For now, let's skip that part and assume I've got a hash of what I'm calling a release (see the releases for MOOSE here: https://mooseframework.inl.gov/sqa/index.html). Can you come up with a way to not remove anything that's browsable from one of those links? That might be pretty difficult!
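To make the proposed rule concrete, here is a minimal sketch of the selection step only: keep anything newer than a configurable cutoff, and keep anything whose hash is in a set of release hashes. `BuildRecord`, `select_deletable`, and `keep_days` are hypothetical names for illustration and do not correspond to CIVET models. The sketch also assumes each entry carries a single commit hash, which sidesteps the harder question of everything browsable from a release page.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class BuildRecord:
    # Hypothetical stand-in for whatever DB row holds a build result.
    sha: str
    created: datetime


def select_deletable(records, release_shas, now, keep_days=60):
    """Return records older than keep_days whose sha is not a release hash."""
    cutoff = now - timedelta(days=keep_days)
    protected = set(release_shas)
    return [r for r in records
            if r.created < cutoff and r.sha not in protected]


# Usage: one old non-release entry, one old release entry, one recent entry.
now = datetime(2022, 1, 6)
records = [
    BuildRecord("aaa", datetime(2021, 3, 24)),
    BuildRecord("bbb", datetime(2021, 3, 24)),  # treated as a release below
    BuildRecord("ccc", datetime(2022, 1, 3)),
]
print([r.sha for r in select_deletable(records, {"bbb"}, now)])  # ['aaa']
```

Keeping the rule as a pure function makes it easy to test against known release hashes before it is wired to real deletions.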
