2021/02/28

How to keep a git repo hierarchy constantly in sync locally

In the past few years I have been working more as an architect than as a developer, mostly with teams on the other side of the sea. So by the time I wake up and get to work, there are usually a bunch of updates in the project. And if the project is big enough, it can involve dozens of repos - microservices for the win.

Now, I do have notifications configured to my liking for the repos I am interested in, but since I am still trying to stay hands-on, I try to spend a few minutes running the code locally. (In other cases I need to support the business in real time, where a running local version is really handy.)

TLDR: I need to keep a lot of repos updated, and I have found myself re-inventing the same 10 lines of bash script over and over again.

pull-all-v1.sh


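  #!/usr/bin/env bash
  
  # Repos to update, relative to the current directory.
  folders=("folder" "list" "of" "repos" "of" "interest")
  for folder in "${folders[@]}"; do
      # Skip missing folders instead of running git in the wrong directory.
      pushd "$folder" || continue
      git stash push      # park local changes (they stay in the stash)
      git reset --hard
      git checkout master
      git pull -p origin  # -p prunes remote-tracking branches deleted upstream
      popd
  done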


Problems

Now while this does the job, there are a few problems with it. First, I hate double administration, and maintaining the folders list when it already exists in the directory hierarchy is just wasteful. Second, I sometimes experiment in the code, then forget about it, and because of the stashing, the idea might lie there, forgotten. So today I spent some time and came up with a better solution for my use case.
Note: I tend to mirror the bitbucket/github/gitlab project structure in my folders as well, so if two repos have the same name, they are kept in different folders and I avoid the risk of losing track of what is what. For example, if I clone these repos:
  • https://github.com/maxence-charriere/go-app
  • https://github.com/maxence-charriere/vector
  • https://github.com/casualjim/go-app
  • https://github.com/casualjim/httprouter-rs
I'd have the following folder structure:
  • ~/work/maxence-charriere
    • go-app
    • vector
  • ~/work/casualjim
    • go-app
    • httprouter-rs
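With this layout every repo sits exactly two levels below the work directory, so its .git folder is always at depth 3, which is what the find call in pull-all-v2.sh relies on. A minimal sketch of just that lookup, assuming a helper called list_repos (a made-up name for this illustration, not part of the scripts in this post):

```shell
# list_repos is a hypothetical helper name used only for this illustration.
# Prints every repo found at <workdir>/<owner>/<repo>, one path per line.
list_repos() {
    local workdir="${1:-$PWD}"
    # Depth 0 is workdir, 1 is the owner, 2 is the repo, 3 is its .git folder.
    find "$workdir" -mindepth 3 -maxdepth 3 -type d -name .git \
        -exec dirname {} \; | sort
}
```

Calling list_repos ~/work on the structure above should print the four repo folders and skip anything nested shallower or deeper.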

pull-all-v2.sh

  
  #!/usr/bin/env bash
  { set -euxo pipefail; } 2>/dev/null
  
  function pull_all () {
  
      local workdir=${1:-$(pwd)}
      pushd "${workdir}"
  
      # Repos live at <workdir>/<owner>/<repo>, so each .git folder sits at depth 3.
      find . -mindepth 3 -maxdepth 3 -type d -name .git -exec dirname {} \; | while IFS= read -r folder; do
          echo "        >>>> cd $folder <<<<"
          pushd "$folder"
          if [[ -d .git ]]
          then
              local branch_name="$(git branch --show-current)"
              local gstatus="$(git status --porcelain=2)"
              # Default branch of origin (master vs main); this contacts the remote.
              local head_name="$(git remote show origin | grep 'HEAD branch' | sed 's/.*: //')"
              # Park local changes, if any (--all includes untracked and ignored files).
              if [[ -n "${gstatus}" ]]
              then
                  git stash push --all
              fi
              if [[ "${branch_name}" != "${head_name}" ]]
              then
                  git checkout "${head_name}"
              fi
              # -p prunes remote-tracking branches that were deleted on origin.
              git pull -p origin
              # Go back to where we were and restore the parked changes.
              if [[ "${branch_name}" != "${head_name}" ]]
              then
                  git checkout "${branch_name}"
              fi
              if [[ -n "${gstatus}" ]]
              then
                  git stash pop -q
              fi
          fi
          popd
      done
      popd
  }


Now, I have a function defined as pull_all which does the following:
  • Searches for .git folders under the "project structure" - hence the -mindepth/-maxdepth limits. If you want to search all folders, you can drop both options, or if you're using a flat structure, set both to 2 etc.
  • Saves changes - if any - to the stash and switches to the remote's default branch (master or main) - if not already there.
  • Pulls from origin + prunes remote-tracking branches that were deleted remotely - PRs for the win
  • Then restores the earlier state (branch change-back + stash pop if needed)
    • Now be careful here: if you get a merge conflict, you'll want to clean it up before issuing a pull in that repo again
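The merge-conflict caveat can be checked up front. A minimal sketch, assuming a helper called has_unmerged_paths (a made-up name, not part of pull_all) that reports whether a repo still has unresolved conflicts in its index:

```shell
# has_unmerged_paths is a hypothetical helper name, not part of pull_all.
# Succeeds (exit status 0) when the repo in the given folder still has
# unresolved merge conflicts, e.g. after a conflicting "git stash pop".
has_unmerged_paths() {
    local repo="${1:-.}"
    # "git ls-files --unmerged" prints one line per conflicted index entry.
    [ -n "$(git -C "$repo" ls-files --unmerged)" ]
}
```

Inside the loop, a guard along the lines of "if has_unmerged_paths .; then popd; continue; fi" placed before the stash step would be one way to skip such repos until they are cleaned up.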
If you find yourself using this script a lot, you might want to add the function definition to the end of your .bashrc file. (Leave out the set -euxo pipefail line there: it would apply to your whole interactive shell, which would then exit on the first failing command.) Once that is done, you'll be able to call the pull_all function from anywhere you'd like. For example, you can call it right there in .bashrc, making sure that every time you log in, you get the latest updates. Or you can schedule a cron job that runs the script every x hours etc. The sky is the limit.
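
As a rough illustration of the .bashrc idea, the call can be wrapped so that a failing pull never disturbs the login shell. This is only a sketch: pull_all_on_login, the ~/work default and the ~/pull-all.log file are made-up names for the example, and it assumes pull_all has been defined earlier in the same file.

```shell
# Hypothetical wrapper for the end of ~/.bashrc; all names and paths here
# are assumptions made for this example.
pull_all_on_login() {
    local workdir="${1:-$HOME/work}"
    local log="${2:-$HOME/pull-all.log}"
    # Do nothing if pull_all has not been defined in this shell.
    command -v pull_all >/dev/null 2>&1 || return 0
    # The subshell keeps directory changes and failures away from the login
    # shell; all output goes to the log file instead of the terminal.
    ( pull_all "$workdir" ) >"$log" 2>&1 || echo "pull_all failed, see $log" >&2
}

# Hypothetical crontab entry (edit with "crontab -e") for a run every 4 hours.
# The script path is an assumption, and the script would need a final line
# such as: pull_all "$@"
# 0 */4 * * * /bin/bash "$HOME/bin/pull-all-v2.sh" "$HOME/work"
```

A single pull_all_on_login line after the function definitions would then trigger the update at every login.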

Enjoy!
