~ 8 min read

Best Practices for Bootstrapping a Node.js Application Configuration

share this story on
Follow these best practices to bootstrap a Node.js application configuration in a robust and maintainable way using env-schema.

Bootstrapping a Node.js application often requires loading configuration, whether from environment variables, or static configuration files. In a previous article, I shared my opinions on environment variables and configuration anti patterns in Node.js applications and in this article I’d like to share my approach to avoid those anti patterns.

Traits of robust Node.js configuration

What makes a good configuration architecture for a Node.js application? Regardless if your choice is working with process.env environment variables or other mechanisms, here are some traits I personally look for when evaluating an approach for configuration management:

  • Async - I want to be able to load configuration asynchronously. This is especially important when loading configuration from external sources, such as a database or a remote API. However most configuration patterns I see in Node.js applications are synchronous, such as:
const config = require('./config.json')
  • Schema defined - I want to be able to define the types of my configuration values, and have them validated at application bootstrap, before the application is entered into a ready state. Without a schema, a Node.js app might bootstrap successfully, but fail at runtime due to a missing or invalid configuration value.

  • Configuration manager - I want to be able to access my configuration values from anywhere in my application, without having to expose implementation details such as the configuration file path or environment variable names. The following is an example of Node.js configuration anti-pattern:

const db = require('db')

db.connect({
  host: process.env.DB_HOST,
  port: process.env.DB_PORT,
  user: process.env.DB_USER,
  password: process.env.DB_PASSWORD,
  database: process.env.DB_NAME
})
  • Configuration factory - I want to be able to create multiple configuration instances, for example for different environments, or for different parts of my application. This is especially important when writing tests, where I might want to create a configuration instance with different values than the production configuration. Often the anti pattern followed by Node.js developers is to use a CJS configuration module and passing it around, which forces a singleton pattern.

Practical approach to Node.js configuration

In this second part of the article I’ll demonstrate the proposed configuration manager approach in a step-by-step process which outlines the libraries in-use, and outlines which anti-patterns to avoid.

Step 1: Use configuration data sourced from .env files

The first step is to load configuration data from a .env file. This is a common pattern in Node.js applications, so any of you whom are fans of the dotenv library will be familiar with this approach. However, note, I won’t be using dotenv directly.

The .env file is a simple text file, which contains key-value pairs, such as:

PORT=3000
LOG_LEVEL=debug
DB_HOST=localhost
DB_PORT=5432

Important to note:

  • ✅ the .env file is not committed to source control
  • ✅ the .env file is added to .gitignore (along with its permutations such as .env.local, .env.development, .env.test, etc)
  • ✅ In production, the configuration values are sourced from environment variables, not from the .env file.
  • ✅ I do not have any other environment designation such as .env.development, .env.test, etc. I only have a single .env file.

Step 2: Use configuration schema with env-schema

Configuration details are defined, validated, and enforced using a schema.

I’m using the npm package env-schema to the define the configuration schema. For example:

import EnvSchema from "env-schema";

const schema = {
  type: "object",
  required: [
    "PORT",
    "LOG_LEVEL",
    "DB_HOST",
    "DB_PORT",
  ],
  properties: {
    PORT: {
      type: "number",
      default: 3000,
    },
    LOG_LEVEL: {
      type: "string",
      default: "info",
    },
    DB_HOST: {
      type: "string",
      default: "localhost",
    },
    DB_PORT: {
      type: "number",
      default: 5432,
    },
  },
};

This allows to define configuration details and enforce:

  1. A specific strongly typed value, such as number or string for a configuration item.
  2. The configuration item is required, and must be present in the configuration data. Otherwise, the application crashes at bootstrap when configuration is initialized. This is a good practice, often known as fail fast.

I previously written an article titled Configuration Decoded: Lesser-Known Tips for Working with env-schema in Node.js which goes into more details about the env-schema library and how to use custom validators in env-schema.

Step 3: Use a configuration manager instead of a singleton

Node.js module system, commonly known as CommonJS (CJS), follows a singleton pattern. This means that when a module is imported, it is cached and subsequent imports of the same module return the same instance. This could lead to challenges introduced during tests, where you might want to create a new instance of the configuration for each test, with different values.

Avoid the following anti-pattern:

// config.js
const config = require('./config.json')
module.exports = config

Instead, create an abstraction wrapper around the configuration data, ideally a class, which can be instantiated multiple times and hold its state internally instead of holding it via global variables.

The following is a simple class blueprint for a configuration manager that is responsible for loading the configuration data, validating it against the schema, and returning the configuration data.

class Config {
  constructor() {
    this.config = null;
  }

  async load({ envSchemaOptions = {}, overrideConfigData = null } = {}) {

    // load configuration..
    // ..

    return this.config;
  }
}

export { Config };

Step 4: Load configuration asynchronously

It is common for Node.js applications to load configuration synchronously, because developers rely on the Node.js module system (config = require('./config')). This is also quite popular in codebases that use dotenv which recommends a similar pattern: require('dotenv').config().

There’s nothing inherently wrong with configuration data that is loaded synchronously, but as you’ve probably experienced with JavaScript, a single function’s update to asynchronous pattern (Hi async/await!) can lead to a cascading effect of code refactors across the rest of the codebase.

Therefore, I recommend to always load configuration asynchronously, even if it is a simple .env file. If in the future you’d need to load secrets via remote AWS KMS (Key Management System) or similar external resource use-cases, you wouldn’t have to pay the cost of code refactor.

Whether your choice is class-based or function-based, make sure you load configuration asynchronously. For example:

class Config {
  
  // [...]

  async load() {

    // load configuration..
    // ..

    return this.config;
  }
}

Step 4: Don’t leak configuration properties

If you use dotenv with .env files, you might be familiar with the following pattern:

// Load the configuration:
// (which really means it populates process.env with the configuration)
require('dotenv').config()

// Then later:
const dbHost = process.env.DB_HOST

No matter what, never access configuration data via process.env directly. This is a common anti-pattern in Node.js applications, and it is dangerous because it exposes the implementation details of the configuration data, such as the environment variable names. It also exposes the entire process.env object which might contain sensitive data that you don’t want to leak.

In this example, the configuration is loaded via dotenv and the configuration values are accessed via process.env. This is dangerous because the data in process.env holds more than just the configuration data, it also holds the environment variables of the current process and potentially other sensitive data that might end up leaking.

Instead, make sure that you are always forcing that the configuration data you have in the application’s state is only scoped to the configuration data that you defined in the schema, and not the entire process.env object.

If you work with the env-schema library, you get this by default when working with a defined schema. If you work with dotenv, make sure you return the config object from dotenv.config() such as:

import dotenv from 'dotenv'
const config = dotenv.config()

// your configuration data is now available via
// `config.parsed` property

Note, that config.parsed in the above configuration loading example doesn’t enforce or validated the configuration data against a schema, it only scopes the configuration data to the .env file.

Step 5: Rely on feature flags instead of environment designation

I’ve seen many Node.js applications that use environment designation to load different configuration data. For example, a .env.development file is used to load configuration data for the development environment, and then the application code makes use of this check to toggle on or off a specific feature.

Following is an example of an anti pattern that uses environment designation to toggle on or off a feature:

if (process.env.NODE_ENV === 'development') {
  apm.disable()
  tracing.disable()
}

Instead of the above, prefer feature flags. For example, the following is a better approach:

if (process.env.APP_MONITORING === 'true') {
  apm.enable()
}

if (process.env.USER_EVENTS_TRACKING === 'true') {
  productAnalytics.enable()
}

This approach promotes coding conventions that decouple your application code from the environment designation, and instead rely on feature flags. This is a better approach because it allows you to toggle on or off a feature without having to change the environment designation.

Summary

In concluding thoughts, if you are using a different library than env-schema and your own configuration wrappers by upholding the above configuration best practices then you are doing it right. My personal choice is env-schema with Fastify.

As an addendum, I also use Awilix for dependency injection and here’s how I would use it to inject the configuration manager into my application:

// Load configuration
const configManager = new Config();
const config = await configManager.load();

// Other initializations [...]

// Initialize DI
const diContainer = await initDI({ config, database });