How To Introduce New Tools During a Software Development Project?

Nowadays, most software development projects use several tools to guarantee a certain level of code and system quality. We're talking here about formatters, linters and other static code analyzers. These tools are generally configured at the start of a project for simplicity's sake.

But what happens when you discover a new tool after months of development? You can integrate it into the project, but it's likely to generate hundreds, if not thousands, of errors. Fixing all these errors at once is often far too big a task to be achievable in a reasonable timeframe. In other words, these improvements are often put aside because the initial obstacle seems insurmountable, to the detriment of the project's long-term health.

What if, rather than trying to fix the whole project, we focused solely on future developments? In this article, we'll explore a technique I've used with great success to gradually integrate a new linting tool into a software project.

Applying a tool gradually

One technique I've used recently to introduce a new tool into a project is to apply it only to the files modified by a merge request, validated of course by the CI. This way, we enforce our quality standard on all future development, while the rest of the project gets corrected retroactively as files are modified.

The following steps detail an implementation for a JavaScript project in which we're going to introduce a new ESLint rule, with everything validated by a GitLab CI pipeline. They should be easily adaptable to most software projects. The only prerequisites are the following:

The tool you've chosen must allow you to specify the exact list of files to be validated, either via a command line parameter or a configuration file;
Your CI system must allow you to obtain the list of files modified by the merge request (or pull request, depending on your repository).

Setting up the project

For our example, the first step is to write a new ESLint configuration that includes our new rule. If you're introducing a completely new tool, simply configure it as you wish for your project.

// strict.eslintrc

{
    "rules": {
      "no-empty-function": "error"
    }
}

Throughout the example, we'll use the word "strict" to identify the configuration containing the new rule.

In the case of ESLint, it's also possible to extend your existing configuration if you want the new rules to be added to the existing ones rather than replace them. That way, you can swap out the lint step of your existing CI pipeline rather than adding a new step, as we'll see later.

Once the configuration has been created, you'll need a way of identifying the files to apply it to. To enhance the developer's experience, it's important that this works both in CI and on a workstation, regardless of whether or not the current branch contains files being modified.

For the CI, we'll use a JavaScript script to obtain the list of modified files directly from the GitLab API.

To get the list of files modified on GitLab, the script must be run in a merge request pipeline.

// getChangedFiles.js

const { spawnSync } = require('child_process')
const fetch = require('node-fetch')
const { exit } = require('process')

// By default, "fetch" does not throw an exception when the server
// responds with an HTTP error status. This function lets us chain
// the promises a little more easily.
function handleErrors(response) {
  if (!response.ok) {
    throw new Error(response.status)
  }
  return response
}

// This environment variable is set automatically by most CI systems
if (process.env.CI === 'true') {
  // These three variables come directly from GitLab CI:
  // https://docs.gitlab.com/13.6/ee/ci/variables/predefined_variables.html
  const apiURL = process.env.CI_API_V4_URL
  const projectID = process.env.CI_PROJECT_ID
  const mrIID = process.env.CI_MERGE_REQUEST_IID
  // This variable is set through the GitLab interface, as described
  // further down.
  const accessToken = process.env.CI_ACCESS_TOKEN

  fetch(
    // The "access_raw_diffs" option is important to prevent GitLab
    // from limiting the size of the response:
    // https://docs.gitlab.com/ee/api/merge_requests.html#get-single-mr-changes
    `${apiURL}/projects/${projectID}/merge_requests/${mrIID}/changes?access_raw_diffs=true`,
    {
      headers: { 'PRIVATE-TOKEN': accessToken },
    }
  )
    .then(handleErrors)
    .then((response) => response.json())
    .then((json) => {
      const changedFilePaths = json.changes
        .filter((change) => !change.deleted_file)
        .map((change) => change.new_path)
        // Anchor the whole alternation so that only paths ending in
        // ".ts" or ".tsx" are kept
        .filter((path) => /\.tsx?$/.test(path))

      console.log(changedFilePaths.join(' '))
    })
    .catch((e) => {
      console.error(e)
      exit(1)
    })
} else {
  // See the implementation of "getLocalDiff.sh" below
  console.log(String(spawnSync('./getLocalDiff.sh').stdout))
}
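
To see what the filtering step keeps and discards, here is a small self-contained sketch that isolates it in a function and runs it on a made-up payload. The function name getLintablePaths and the sample data are hypothetical; only the deleted_file and new_path fields are taken from the script. The pattern is anchored at the end of the path so that a file such as report.tsv is not mistaken for a TypeScript file.

```javascript
// Keep only the paths ESLint should check: files that still exist
// and whose name ends in ".ts" or ".tsx".
function getLintablePaths(changes) {
  return changes
    .filter((change) => !change.deleted_file)
    .map((change) => change.new_path)
    .filter((path) => /\.tsx?$/.test(path))
}

// Hypothetical payload, shaped like the "changes" array of the response
const sampleChanges = [
  { new_path: 'src/app.ts', deleted_file: false },
  { new_path: 'src/Button.tsx', deleted_file: false },
  { new_path: 'src/old.ts', deleted_file: true },
  { new_path: 'data/report.tsv', deleted_file: false },
  { new_path: 'README.md', deleted_file: false },
]

console.log(getLintablePaths(sampleChanges).join(' '))
// prints "src/app.ts src/Button.tsx"
```

Extracting the filter this way also makes it easy to unit test without calling the GitLab API.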

For developers' workstations, a simple bash script and a few git commands will do the trick:

# getLocalDiff.sh

#!/usr/bin/env bash

# Get the current branch
CURRENT_BRANCH=$(git rev-parse --abbrev-ref HEAD)
# Make sure we are up to date with the server
git fetch
# Get the commit where the current branch and its base diverged
HASH=$(git merge-base origin/develop "$CURRENT_BRANCH")

# Get the list of modified, but not deleted, files
MODIFIED=$(git diff --name-only --diff-filter=d "$HASH" -- "*.ts" "*.tsx")
# Get the list of files that were added but are not yet tracked by git
# (this gives the same result before and after "git add")
NEW=$(git ls-files --others --exclude-standard "*.ts" "*.tsx")

# Print both lists together and replace line breaks with spaces
echo "$MODIFIED $NEW" | tr '\n' ' '
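
If you would rather not depend on bash (on Windows workstations, for instance), the same steps could be sketched in Node.js. This is an illustrative port, not part of the original setup: the function names are hypothetical, and it assumes that git is available on the PATH and that origin/develop is the base branch, as in the bash script.

```javascript
const { spawnSync } = require('child_process')

// Split a command output into a list of non-empty, trimmed lines.
function toLines(output) {
  return output
    .split('\n')
    .map((line) => line.trim())
    .filter(Boolean)
}

// Combine both lists and drop duplicates while keeping the order.
function mergeFileLists(modified, untracked) {
  return [...new Set([...modified, ...untracked])]
}

// Run a git command and return its standard output as a list of lines.
function gitLines(args) {
  const result = spawnSync('git', args, { encoding: 'utf8' })
  if (result.error || result.status !== 0) {
    throw new Error(`git ${args.join(' ')} failed: ${result.stderr}`)
  }
  return toLines(result.stdout)
}

// Assumption: "origin/develop" is the base branch of the project.
function getLocalChangedFiles(baseBranch = 'origin/develop') {
  // Make sure we are up to date with the server
  gitLines(['fetch'])
  // Commit where the current branch and its base diverged
  const [hash] = gitLines(['merge-base', baseBranch, 'HEAD'])
  // Modified, but not deleted, files
  const modified = gitLines([
    'diff', '--name-only', '--diff-filter=d', hash, '--', '*.ts', '*.tsx',
  ])
  // Files that were added but are not yet tracked by git
  const untracked = gitLines([
    'ls-files', '--others', '--exclude-standard', '*.ts', '*.tsx',
  ])
  return mergeFileLists(modified, untracked)
}

module.exports = { toLines, mergeFileLists, getLocalChangedFiles }
```

Calling console.log(getLocalChangedFiles().join(' ')) from inside a git repository would then print the same space-separated list as the bash version.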

Finally, all that's missing is a simple npm script to run this configuration:

// package.json

"scripts": {
  "eslint:strict": "FORCE_COLOR=true eslint -c strict.eslintrc $(node getChangedFiles.js)"
}

"FORCE_COLOR=true" simply tells ESLint to print its results in color even if the execution environment does not appear to support it (e.g. in CI).

At this point, you should be able to run the npm script locally to validate the files modified by your branch.

Setting up the CI

First of all, as mentioned above, it's important to ensure that pipelines for merge requests are activated for the project.

Next, we need to create a job to run our new script:

# .gitlab-ci.yml

eslint-strict:
    stage: test
    script:
        - npm run eslint:strict
    rules:
        # This job can only run if the current pipeline is
        # associated with a merge request.
        - if: $CI_MERGE_REQUEST_IID

Finally, you'll need to enable the script to authenticate itself to GitLab. In our case, we'll use a personal or project access token, which the previously configured job reads from an environment variable (CI_ACCESS_TOKEN in the script above) defined in the GitLab interface.

Project tokens are to be favored, but are currently only available for self-hosted GitLab instances or paid gitlab.com teams. If you don't have access to them, a personal token can also do the trick, but it's important to make sure you replace the token if the person who created it ever leaves the team.

It should also be noted that GitLab offers a CI pipeline token authentication method, but this only works for a limited list of API routes, of which the merge request change route is not one at the time of writing. This may change in the future.
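
Since the job now depends on four environment variables, a missing one is an easy mistake to make, for example when the token has not been defined yet or when the pipeline is not a merge request pipeline. The following is a minimal sketch of a check that could run before the API call. The function name readGitLabSettings is hypothetical; the variable names are the ones read by getChangedFiles.js.

```javascript
// Names of the environment variables read by getChangedFiles.js
const REQUIRED_VARIABLES = [
  'CI_API_V4_URL',
  'CI_PROJECT_ID',
  'CI_MERGE_REQUEST_IID',
  'CI_ACCESS_TOKEN',
]

// Return the settings needed to call the API, or throw an error that
// lists every variable that is missing or empty.
function readGitLabSettings(env) {
  const missing = REQUIRED_VARIABLES.filter((name) => !env[name])
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`)
  }
  return {
    apiURL: env.CI_API_V4_URL,
    projectID: env.CI_PROJECT_ID,
    mrIID: env.CI_MERGE_REQUEST_IID,
    accessToken: env.CI_ACCESS_TOKEN,
  }
}

module.exports = { readGitLabSettings }
```

In the script, this would be called as readGitLabSettings(process.env), so that a misconfigured job fails with an explicit message rather than an opaque HTTP error.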

Alternatives

It's worth mentioning that this idea is not new, and that simpler solutions do exist. lint-staged is a good example. However, when it came to applying this to my project, I couldn't find any solution that integrated perfectly at the CI level, let alone one that was independent of the underlying technologies. My solution doesn't meet that last criterion either, since it depends on Node.js to access the GitLab API, but at least the script should be very easy to adapt to another language, including bash.

The Results

These scripts are all well and good, but do they really work? That’s the question!

In the case of my project, I'd say yes, it worked like a charm! We used this technique to introduce the typescript-eslint plugin into our project. At the time of integration, this plugin was finding 1933 problems in our code (errors and warnings combined). After four months, we're down to 745, a reduction of roughly 61%. The best part is that we accomplished this without ever creating a commit exclusively to fix these errors. It's all been done as we've been developing the project, with no additional friction whatsoever (apart from a few speed bumps encountered when fine-tuning the various scripts).

ESLint result when integrating typescript-eslint

ESLint results after 4 months

If you try this technique, I'd love to hear about your results. Feel free to leave a comment!
