Revisiting the deployment workflow #235

Open
grahamb opened this issue Oct 16, 2015 · 0 comments

As mentioned in the Tuesday meeting on Oct 13, it's time to step back and revisit our deployment strategy for Canvas. Keith handed this to me, but it's not my decision alone and there really needs to be more than one person with a good degree of familiarity with the process. I'm going to start with describing the current process and then open it up for brainstorming on how to move forward. I've also emailed Cody & Simon at Instructure to see if they can provide some more information on how they manage things on their side, as well as sharing any tooling if they're able to do so.

Current state

Code Repositories

Code required to deploy canvas.sfu.ca is currently spread out amongst many git repos. This does not include any ancillary services, such as Predoc, Parti, or any other LTIs; we're talking core Canvas only.

  • sfu/canvas-lms: the core repository, a fork of instructure/canvas-lms. This contains code authored by Instructure, our mods to that code, our custom JS & CSS includes, several gems/plugins (sfu_api, sfu_course_form, sfu_copyright, sfu_stats), as well as the 'test cluster' plugin in lib/aaa_sfu_misc. In other words, there's a lot going on there.
    • branches:
      • sfu-deploy: our "clean" branch, should be deployable to production at any time
      • sfu-develop: our development branch; features or fixes are developed in individual developer forks (e.g. grahamb/canvas-lms) and merged here. Auto-deployed to canvas-test whenever it changes.
      • edge: branch where treesame commits from Instructure (representing a deployed release to their cloud platform) are merged into the SFU repo and deployed to canvas-edge.
  • sfu/canvas-spaces: the backend of Canvas Spaces (provides the routing, API, etc). Originally written by @patchin, ongoing development by @grahamb. Cloned into gems/plugins by Bamboo.
    • branches:
      • master: development, deployed to canvas-edge and canvas-test
      • stable: clean branch, deployed to canvas production
  • sfu/canvas-spaces-client-app: the front-end for Canvas Spaces. Written by @grahamb. Cloned into client_apps by Bamboo.
    • branches:
      • master: deployed to all environments
  • sfu/canvas_auth: plugin to authenticate non-SFU users by CAS. Written by @patchin. Cloned into gems/plugins by Bamboo.
    • branches:
      • master: deployed to all environments
  • instructure/QTIMigrationTool: Instructure-maintained tool. Cloned into vendor by Bamboo.
    • branches:
      • master: deployed to all environments
  • instructure/analytics: Instructure's analytics module. Cloned into gems/plugins by Bamboo.
    • branches:
      • stable: deployed to all environments
      • master: not deployed
      • beta: not deployed
    • tags: contains tags matching instructure/canvas-lms release tags. We currently don't use these for anything, because we can't actually use the canvas release tags.
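
The branch-to-environment mapping above can also be written down as data, which makes it easier to see which branch of each repository a given environment receives. This is a minimal sketch, not the actual Bamboo configuration: the `Checkout` type, the `CHECKOUTS` table, the `plan_for` helper and the short environment names are hypothetical, and only mappings stated in this issue are included (so sfu/canvas-lms has no entry for stage).

```python
from dataclasses import dataclass

# Short environment names are an assumption made for this sketch.
ENVIRONMENTS = ("edge", "test", "stage", "production")


@dataclass
class Checkout:
    repo: str       # GitHub repository
    dest: str       # directory in the release tree it is cloned into ("." = root)
    branches: dict  # environment -> branch deployed there


def same_branch(branch):
    """One branch deployed to every environment."""
    return {env: branch for env in ENVIRONMENTS}


CHECKOUTS = [
    Checkout("sfu/canvas-lms", ".",
             {"production": "sfu-deploy", "test": "sfu-develop", "edge": "edge"}),
    Checkout("sfu/canvas-spaces", "gems/plugins",
             {"edge": "master", "test": "master",
              "stage": "stable", "production": "stable"}),
    Checkout("sfu/canvas-spaces-client-app", "client_apps", same_branch("master")),
    Checkout("sfu/canvas_auth", "gems/plugins", same_branch("master")),
    Checkout("instructure/QTIMigrationTool", "vendor", same_branch("master")),
    Checkout("instructure/analytics", "gems/plugins", same_branch("stable")),
]


def plan_for(environment):
    """List (repo, branch, dest) for everything deployed to one environment."""
    if environment not in ENVIRONMENTS:
        raise ValueError(f"unknown environment: {environment}")
    return [(c.repo, c.branches[environment], c.dest)
            for c in CHECKOUTS if environment in c.branches]
```

Calling `plan_for("production")` lists six checkouts, each of which has to be in the right state at deploy time.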

Deployment workflow

This diagram shows how Canvas is deployed from sfu-deploy to production (and from other branches to their respective environments as well). The stages are the three stages defined in the Bamboo plan:

  1. Checkout and install dependencies
  2. Compile assets
  3. Deploy code to servers

Blue boxes represent tasks being handled by Bamboo; yellow boxes are tasks being handled by Capistrano (kicked off by Bamboo).

[Diagram: canvasdeploy]

It's worth noting that the Capistrano stuff is split up: generic Canvas tasks are pulled in from grahamb/capistrano-canvas, which was my attempt to create a generic 'use Capistrano to deploy Canvas' gem. It is required in Gemfile.d/sfu_gems.rb. SFU-specific tasks are in the sfu/canvas-lms repo in lib/capistrano/tasks/canvas.rake (along with the stuff in config/deploy.rb and config/deploy/*). I personally think that splitting them up was a mistake, and I would like to fix it in the future.

What's the problem?

Since the beginning, we've always had two repos to deal with: sfu/canvas-lms and instructure/QTIMigrationTool. Over time, we've accumulated the others. This adds complexity to the build process. This was evidenced by the deploy on Tuesday, October 13:

  • The sfu_api plugin was moved from vendor/plugins to gems/plugins
  • canvas-spaces has a hard-coded require path to part of the API in it (which in and of itself is a problem that needs to be addressed).
  • In order to be able to deploy both canvas-lms and canvas-spaces to canvas-test during the testing cycle, I created the stable branch on canvas-spaces and made the change to point at the new location of sfu_api on the master branch. The Bamboo configuration was thus changed to deploy the master branch to edge and test, and the stable branch to stage and production. This allowed canvas-spaces to work with the new sfu_api location on edge and test (with sfu-develop), while allowing it to continue to be deployable to production should the need arise for a mid-cycle deploy.
  • On Tuesday morning, before the production deploy, I merged sfu-develop into sfu-deploy. I did not, however, merge the canvas-spaces master branch into stable, meaning that production received a copy of canvas-spaces with the old path to the sfu_api plugin (vendor/plugins/sfu_api/...).
  • Fortunately, Passenger Enterprise did its job; when it started up the new release, it caught the error from canvas-spaces and resisted the deploy. I was able to make the changes in place; however, we did experience some downtime in the process.
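
The failure above came down to one checkout expecting a directory that was no longer where it used to be. A check like the following, run against the assembled release tree before Capistrano ships it, would flag that mismatch without relying on Passenger to catch it at startup. This is only a sketch under assumptions: `missing_paths` and the idea of each plugin declaring the directories it expects are hypothetical, and nothing like this exists in the current Bamboo plan.

```python
from pathlib import Path


def missing_paths(release_root, required):
    """Return the entries of `required` (paths relative to the release
    root) that do not exist in the assembled release tree."""
    root = Path(release_root)
    return [rel for rel in required if not (root / rel).exists()]


def check_release(release_root, expectations):
    """`expectations` maps a plugin name to the relative paths it needs.
    Returns {plugin: [missing paths]} for plugins with anything missing."""
    problems = {}
    for plugin, required in expectations.items():
        missing = missing_paths(release_root, required)
        if missing:
            problems[plugin] = missing
    return problems
```

For Tuesday's deploy, the stable branch of canvas-spaces would have declared `vendor/plugins/sfu_api`, and the check would have reported it missing because sfu_api now lives in `gems/plugins`.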

How do we fix this?

There are a few questions that we need to address:

  1. canvas-stage: if we were doing things "right", we would have deployed to canvas-stage first, and caught the problem. We've never used stage as a true staging environment; we use it more as a backup and confirmation of problems system (the same is true for SFU Connect). We absolutely need to start using stage as a true staging system. If we want to continue having a "backup" (not in the DR sense, but more of a "restore an accidentally deleted thing" sense), then that should be a separate system (probably similar to canvas-edge). Setting that up is outside the scope of this discussion, but it does need to get done.
  2. How do we manage all these repos? Do we continue to put new plugins into their own repositories? Should the existing externals be folded into sfu/canvas-lms? Should the plugins in sfu/canvas-lms be spun out into their own repos?
  3. If we do keep multiple repositories, how do we prevent what happened on Tuesday with a branch not being merged? Deploying to stage first would have caught it, but it shouldn't have happened in the first place. Relying on my memory is not a good solution; if we keep externals then we'll need some tooling to manage it for us.
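
As one possible shape for the tooling mentioned in question 3, a pre-deploy step could ask git whether each development branch still has commits that its deploy branch lacks, and refuse to continue if so. The sketch below assumes the policy is "the deploy branch must contain everything on the development branch at deploy time", which may not always hold (a mid-cycle deploy is exactly the case where it doesn't). `count_unmerged` and `stale_branches` are hypothetical names; the only git call used is `git rev-list --count <target>..<source>`.

```python
import subprocess


def count_unmerged(repo_path, source, target):
    """Number of commits reachable from `source` but not from `target`."""
    result = subprocess.run(
        ["git", "-C", repo_path, "rev-list", "--count", f"{target}..{source}"],
        check=True, capture_output=True, text=True,
    )
    return int(result.stdout.strip())


def stale_branches(pairs, counter=count_unmerged):
    """`pairs` is an iterable of (repo_path, source, target).
    Returns (repo_path, source, target, n) for every pair where `target`
    is missing n > 0 commits from `source`. `counter` is injectable so
    the logic can be exercised without a real repository."""
    stale = []
    for repo_path, source, target in pairs:
        n = counter(repo_path, source, target)
        if n > 0:
            stale.append((repo_path, source, target, n))
    return stale
```

Run with pairs such as `("canvas-lms", "sfu-develop", "sfu-deploy")` and `("canvas-spaces", "master", "stable")`, a non-empty result would have stopped Tuesday's deploy before it reached production.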

How Instructure does it

Instructure has multiple repositories that make up Canvas. If you look in gems/, you'll see a bunch of stuff prefixed canvas_. They also have plugins in gems/plugins. It's my understanding that each of those is a separate repository (plus QTIMigrationTool and analytics). I asked in IRC about this:

10/13 11:10 (grahamb-sfu_) simonista / ccutrer / etc: do you guys have stuff in external repositories that get pulled into canvas during a deploy? Not things like gems (e.g. canvas-jobs), but other repos that get checked out into a point in the tree (similar to instructure/QTIMigration and instructure/analytics for open-source)?
10/13 11:10 ccutrer: yes. we have a repo we call `canvas_deploy` that has canvas-lms, QTIMigration, and each of our custom plugins as git submodules
10/13 11:20 (grahamb-sfu_) The canvas_deploy repo approach is interesting. Do you guys have any processes for making sure that those plugins are at the right state before they get brought in?
10/13 11:26 simonista: yeah, they follow the same branching strategy as we use for canvas-lms (branch named for each deploy date), and we have a bit of custom tooling that creates all the new branches at the same time from each master branch, and updates where the submodules are tracking before building a package
10/13 11:41 (grahamb-sfu) Thanks. I'll add that to our discussion here.
10/13 11:44 ccutrer: yep. and that makes it impossible to mismatch versions, even when jumping back and forth between releases, cause git keeps them all in sync when starting a new build
10/13 11:45 (grahamb-sfu) when developing, are you using that canvas_deploy repo as your base?
10/13 11:46 ccutrer: not really
10/13 11:46 ccutrer: I keep everything on master in an ad-hoc "plugin repos are just where they are supposed to be"
10/13 11:46 ccutrer: probably partly because our build scripts require that everything be clean before starting a build
10/13 11:46 ccutrer: and my working directories are _never_ completely clean

I've emailed Cody and Simon privately to ask if they're able to provide more information. If I get anything back I'll add it here. Submodules don't have a good reputation, and kind of scare me, but I'm open to it. We kind of get part way there with how Bamboo does the cloning of the various repos. I do like the approach of having a per-release branch on every repo.
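
To make the per-release branch idea concrete, here is a rough sketch of what "create all the new branches at the same time" could look like. It is not Instructure's tooling, which hasn't been shared: the `release/` prefix, the function names and the choice of base branch per repo are all assumptions. The functions only build the git command lines (`git fetch`, `git branch --no-track`, `git push`) so they can be reviewed before anything is run.

```python
from datetime import date


def release_branch_name(deploy_date, prefix="release/"):
    """Branch named for the deploy date, e.g. release/2015-10-13.
    The prefix is a made-up convention for this sketch."""
    return f"{prefix}{deploy_date.isoformat()}"


def branch_commands(repos, deploy_date, remote="origin"):
    """`repos` is a list of (repo_path, base_branch). Returns the git
    commands that would cut the same release branch in every repo."""
    name = release_branch_name(deploy_date)
    commands = []
    for repo_path, base in repos:
        # Make sure the remote-tracking ref for the base branch is current.
        commands.append(["git", "-C", repo_path, "fetch", remote])
        # Create the release branch from the remote's copy of the base branch.
        commands.append(["git", "-C", repo_path, "branch", "--no-track",
                         name, f"{remote}/{base}"])
        # Publish it so the build server can check it out.
        commands.append(["git", "-C", repo_path, "push", remote, name])
    return commands
```

Because every repo gets a branch with the same name, the deploy for a given date can check out that one name everywhere, which removes the "did I merge master into stable?" step entirely.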

Discuss

This needs to be a collaborative effort; whatever we choose has to work for everyone. It also needs to be in the head of more than one person. Let's use this issue to brainstorm ideas and come up with a solution. There's no need to rush this; we need to do it right, not fast.
