Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

📦 Automatic submodule state upgrade on PR #1730

Closed
wants to merge 2 commits into from

Conversation

daquinteroflex
Copy link
Collaborator

Github bot will commit to PR rather than failing the PR commit

@daquinteroflex daquinteroflex marked this pull request as draft May 31, 2024 14:24
@tylerflex
Copy link
Collaborator

@daquinteroflex would this make the developer's local branch diverge from their branch on origin? would it make another commit? just curious how it would work

@tylerflex
Copy link
Collaborator

We could also try just adding the submodule state check as a pre-commit? if it isn't already?

@daquinteroflex
Copy link
Collaborator Author

daquinteroflex commented May 31, 2024

Hi Tyler! I explored a few options to do this:

  • In this PR, the bot will add an extra commit onto the PR that just updates the submodule to the target branch automatically. This means that they can rebase that bot commit in before merging or the PR will get merged with the latest submodule state commit as required.
  • I actually tried implementing the pre-commit but it doesn't actually work logically, because it doesn't know which is the target PR branch to update the submodules for (could be develop or pre/). I could try to make it a parameter that can be changed but don't know what you think about that.

@daquinteroflex daquinteroflex marked this pull request as ready for review May 31, 2024 15:25
@daquinteroflex
Copy link
Collaborator Author

image

Still thinking how to streamline this best

@tylerflex
Copy link
Collaborator

ok that sounds good. I'll approve this PR but dont really understand all the details of the GitHub action so trust you on it. Feel free to assign Yannick if you need another pair of eyes on it.

@daquinteroflex
Copy link
Collaborator Author

Hi @yaugenst-flex, thinking how to best streamline this, what do you think? Any suggestions?

@daquinteroflex
Copy link
Collaborator Author

Actually thinking about it the pre-commit approach can work assuming that it just fetches the latest commit from the current branch of the submodule, although it doesn't actually fix the issue of it being the wrong branch for a corresponding PR. Can implement that if we like based on the previous approach?

@yaugenst-flex
Copy link
Contributor

yaugenst-flex commented Jun 3, 2024

I'm personally not a big fan of having commits added automatically to a PR, mostly because of the issue that @tylerflex mentioned, making you have to rebase against your own PR (or pull at least), which is a bit opaque. What do you think about adding a pre-commit hook that essentially does the same thing as that runner script, but without adding the commit? I just wrote one that I think would work:

scripts/check_latest_submodules.sh:

#!/bin/bash

check_submodule_latest_commit() {
    local submodule_path=$1
    local branch=$2

    cd $submodule_path || { echo "Submodule path not found: $submodule_path"; exit 1; }

    git fetch origin

    current_commit=$(git rev-parse HEAD)
    latest_commit=$(git rev-parse origin/$branch)

    if [ "$current_commit" != "$latest_commit" ]; then
        echo "Make sure to check out the latest commit for $submodule_path on branch $branch."
        exit 1
    fi

    cd - > /dev/null
}

BRANCH=$(\
    git show-branch \
    | grep '*' \
    | grep -v "$(git rev-parse --abbrev-ref HEAD)" \
    | head -n1 \
    | sed 's/.*\[\([^\^~]*\).*\].*/\1/' \
)

check_submodule_latest_commit docs/notebooks $BRANCH
check_submodule_latest_commit docs/faq develop

And the corresponding .pre-commit-config.yaml:

repos:
  - repo: local
    hooks:
      - id: check-submodules
        name: check latest submodules
        entry: scripts/check_latest_submodules.sh
        language: script
        pass_filenames: false

If the submodule HEAD is not latest, the hook will then output the following:
image

In principle, the script could also be modified to automatically update the submodule. I'm just a hesitating a bit on this since that would add a bunch of edge cases that need to be handled in the hook since you might have local changes in the notebooks for example.
What do you think @daquinteroflex?

@yaugenst-flex
Copy link
Contributor

yaugenst-flex commented Jun 3, 2024

I'm still not sure whether I understand the submodule branches fully. I assumed that if we branch off of pre/*, then the notebook submodule state should also be on the latest commit of the submodule's pre/* branch. So the script above first gets the parent of the current branch and then does the check on that. This would fail however if I am branching off from something that is not a branch of the submodule, e.g. if I'm submitting a patch to another PR, where I branch off of that PR's branch. Although the runner in this PR would also not handle this case I think?

@daquinteroflex
Copy link
Collaborator Author

daquinteroflex commented Jun 3, 2024

Hi Yannick, thanks for the good suggestions! You are right in how the submodule branch check works. I've also been debating how to approach this best too. It sounds like there are two approaches: one that just checks the submodules, and one that updates it.

The main issue as you suggest with the pre-commit is that it does not know the target branch (either to update or to check), it only knows what it has branched from. I guess I was aiming to make sure the submodule check would always pass (by adding the correct updated submodule commit since we do know the target branch from the PR).

The pre-commit approach can work in terms of updating the PR branch submodules HEAD to the latest origin submodules, beyond just checking. However, we just recognise that the submodule check action might fail if the PR branch submodules have been updated from an original incorrect checked-out submodule branch. The benefit of the pre-commit is that it could automatically update the submodules for most of the trivial cases which maybe is what we want - whilst we mantain the submodule action check. What do you think, shall we go with this approach then? Good to know on the preference on not having automatic commits too, it's always tricky how to make these decisions depending how we all develop differently.

@yaugenst-flex
Copy link
Contributor

Hmmm... I was going to write that if we have a local one the catches 90% of errors, and then a GH action that fixes the rest of them, that might be fine. But tbh, if the action can fix it in 100% of cases, maybe there is really no point in having a local check. Just let the runner do it and never worry about submodule updates again? I guess that would probably be the easiest solution for most people. And if someone (like me) doesn't like to have those commits being added, well then they just need to make sure that it's not necessary in the first place 😆

@daquinteroflex
Copy link
Collaborator Author

Why don't we have both? Can't hurt too much and I think they'll improve this for most people in any case

@yaugenst-flex
Copy link
Contributor

Why don't we have both?

Unnecessary complexity, mostly. But sure, let's do it.

@daquinteroflex daquinteroflex changed the base branch from pre/2.7 to develop June 7, 2024 15:43
@daquinteroflex
Copy link
Collaborator Author

daquinteroflex commented Jun 7, 2024

Hmm we can trial it to see how we feel about it, can always revert if we don't like it

± |dario/submodule_update_bot {3} U:1 ✗| → git commit -am "status"
ruff.................................................(no files to check)Skipped
ruff-format..........................................(no files to check)Skipped
Update latest submodules.................................................Failed
- hook id: update-submodules
- exit code: 1

Initializing and updating submodules...
Checking docs/notebooks for updates...
Fetching latest commits for docs/notebooks...
error: unable to load config blob object '2ce36b88a26a7c6a56dceaf57b0434576ea11919'
error: unable to load config blob object '2ce36b88a26a7c6a56dceaf57b0434576ea11919'
LATEST_COMMIT: 594fc5f80e55d196bf9f23811821390ebafb3114
CURRENT_COMMIT: 5e5ea61a117b651592f8fe8ada3dc608253d27ac
::error::Submodule docs/notebooks is not up to date with the develop branch. Attempting automatic update.
error: unable to load config blob object '2ce36b88a26a7c6a56dceaf57b0434576ea11919'
From github.com:flexcompute/tidy3d-notebooks
 * branch            develop    -> FETCH_HEAD
error: unable to load config blob object '2ce36b88a26a7c6a56dceaf57b0434576ea11919'
Updating 5e5ea61..594fc5f
error: Your local changes to the following files would be overwritten by merge:
        .gitignore
        AdjointPlugin1Intro.ipynb
        AdjointPlugin2GradientChecking.ipynb
        AdjointPlugin3InverseDesign.ipynb
        AdjointPlugin4MultiObjective.ipynb
        AdjointPlugin5BoundaryGradients.ipynb
        AdjointPlugin6GratingCoupler.ipynb
        AdjointPlugin8WaveguideBend.ipynb
        AdjointPlugin9WDM.ipynb
        AllDielectricStructuralColor.ipynb
        AndersonLocalization.ipynb
        Fitting.ipynb
        HeatSolver.ipynb
        MetalHeaterPhaseShifter.ipynb
        ThermallyTunedRingResonator.ipynb
        docs/case_studies/index.rst
        docs/features/adjoint.rst
        docs/features/data_visualisation.rst
        docs/features/index.rst
        misc/grating_coupler_history.pkl
        misc/import_file_mapping.json
        misc/inv_des_mode_conv.gds
        misc/inverse_designed_gc.gds
        misc/y_branch_fab.pkl
Please commit your changes or stash them before you merge.
Aborting

or

± |dario/submodule_update_bot {3} S:1 ✗| → git commit -am "update action"
ruff.................................................(no files to check)Skipped
ruff-format..........................................(no files to check)Skipped
Update latest submodules.................................................Passed
[dario/submodule_update_bot 76a322ed] update action
 1 file changed, 1 insertion(+), 1 deletion(-)

@dmarek-flex
Copy link
Contributor

Is there an issue with simply setting submodules to track their corresponding develop branches? When versions of tidy3d are released we could then freeze the submodules at a specific commit. At least I think that would work, just floating out a thought balloon.

@daquinteroflex
Copy link
Collaborator Author

Yeah, we can do that if we want. The benefit of having the pre/ branches track the pre/ notebook branch is that it will build the documentation including the new functionality examples which would not be included if we tracked develop. In any case this PR will automate this correctly for an open PR.

@tylerflex
Copy link
Collaborator

The benefit of having the pre/ branches track the pre/ notebook branch is that it will build the documentation including the new functionality examples which would not be included if we tracked develop. In any case this PR will automate this correctly for an open PR.

A little confused, would it be possible to have pre branches track pre notebook branches? and develop track develop notebook branch?

@daquinteroflex
Copy link
Collaborator Author

Yep so this PR implements exactly that logic by adding the automatic submodule update commit by a bot on a given PR.

Does someone fancy trying it out before merging or shall we go ahead with it?

@tylerflex
Copy link
Collaborator

tylerflex commented Jun 28, 2024

@daquinteroflex

How would this work then when developing? if I'm working on feature branch feature to merge into x.y.z, would the submodule state just not be tracked in my feature branch? and when we want to release x.y.z, someone would need to just set the submodule state manually?

@daquinteroflex
Copy link
Collaborator Author

daquinteroflex commented Jun 28, 2024

So I think it works like:

Open a PR:

feature -> pre/target

Then within the PR, automatically the bot will add a commit to update the notebook submodule to the pre/target submodule state and to the faq develop latest state as per their remotes.

Hence, opening a PR would mean that the bot activates to the submodule state automatically to the target branch. However, you've just raised a good point that PRs from pre/target -> develop or develop -> latest should be excluded which are currently not. Will update on this. We probably don't want a bot adding automatic commits on the base branches.

I think this would work independently of tag releases.

Not sure if this is what we want.

@momchil-flex
Copy link
Collaborator

@dmarek-flex had the suggestion that instead of tracking a submodule state, we can be tracking a submodule branch. I guess this would mean though that e.g. in pre/2.8 the corresponding 2.8 branch is tracked, and then this is changed to develop once it is merged there? Might still be easier though and avoid the whole need to check latest submodule state...

@daquinteroflex
Copy link
Collaborator Author

daquinteroflex commented Jun 28, 2024 via email

@momchil-flex
Copy link
Collaborator

Hm maybe this doesn't make that much sense after all. If I want to get back to some time in the past, I won't be able to check out the submodules at that state they were at the time of a given commit. Is this why you say that we would still have to check out a specific commit when making a release? But sometimes we may want to go back to some intermediate un-tagged state too... (generally probably very rare, but feels bad to not be able to)

So maybe we're back to this PR as it stands...

@momchil-flex
Copy link
Collaborator

Now that I had another experience with this, I think what this PR is trying to do is a bit backwards to what we may want to happen.

The most common scenario we've been having issues with is this: an update happens to tidy3d-notebooks/develop. Now our develop branch is out of sync, technically. Any PR into it, which isn't modifying the tidy3d-notebooks state at all, will now also error, but the fundamental reason is that the develop branch is out of sync. Adding a commit to the PR can actually be pretty bad in some cases. Say such a commit is done. The PR doesn't get merged for a while and another update to tidy3d-notebooks/develop happens. The code is never updated though so the tests are never re-run. Then the PR is merged - it will revert the tidy3d-notebooks state even if it had been synced correctly on develop.

In short, is the correct solution then to make sure that whenever tidy3d-notebooks/develop is updated, a commit is pushed to tidy3d/develop updating the state? That's what I've been doing recently basically to fix these issues, rather than updating the PRs themselves... A bit of a downside is that technically we have branch protection for tidy3d/develop that changes should be made through a PR.

@yaugenst-flex
Copy link
Contributor

In short, is the correct solution then to make sure that whenever tidy3d-notebooks/develop is updated, a commit is pushed to tidy3d/develop updating the state? That's what I've been doing recently basically to fix these issues, rather than updating the PRs themselves... A bit of a downside is that technically we have branch protection for tidy3d/develop that changes should be made through a PR.

I like this a lot more than magically adding commits to user PRs! I think we can actually have a bot open a PR on tidy3d when tidy3d-notebooks is updated instead of directly adding a commit, kind of like dependabot does it. That would also satisfy branch protection rules.

@momchil-flex
Copy link
Collaborator

Yeah, I think that sounds good to me, let's see if others agree and we can hopefully fix this finally! :)

@tylerflex
Copy link
Collaborator

Yea i like the idea of separate submodule update commits on the public branch whenever they are out of sync

@tylerflex
Copy link
Collaborator

And as PRs sounds good

@daquinteroflex
Copy link
Collaborator Author

Ahh I also really like this approach.

Before I close this PR to implement that strategy. Do we want to retain the pre-commit check on the submodule state for the branch in the HEAD? Sounds like it could still be useful on top of this to streamline the updates accordingly.

@momchil-flex
Copy link
Collaborator

I think the only time a feature PR should be explicitly moving the submodule state is if there really is an associated notebook, and even then it might be easier to just merge that notebook in notebooks/develop and then merge the automatically generated PR into develop here, rather than update the submodule state in the feature PR. So.. no need for commit hook?

@daquinteroflex
Copy link
Collaborator Author

Yeah you're right. As long as the notebook commit bot updates the pre/, develop, and latest branches then anytime people rebase into them for a given feature the submodules should update automatically, and unlikely there's a scenario where they should be out of sync. It's kind of cool that the changelog between the notebooks and main frontend code will also be linked through this commits so much easier to see what's been added as well.

Will implement that linking PR then.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants