Module systems in JavaScript

Hello everyone! 👋

If you have ever had the pleasure of developing something in JavaScript, I believe at least once in your life you have come across something like:

text

SyntaxError: Cannot use import statement outside a module

(if you haven't, it's still ahead of you 😉)

In particular, it can be challenging to understand what happens when the source code undergoes transpilation during the build process, and one module system transforms into another.

So today, I will talk about JavaScript/TypeScript module systems, how it happened that there are several of them, and personally, what I would recommend using in 2023.

A bit of theory

The principle of modularity makes it possible to simplify software design and to distribute the development process among groups of developers.

It is the need to develop large-scale software systems that led to the emergence of modular programming, where the entire program is divided into components called modules, each with a manageable size, a clear purpose, and a well-defined interface aka API.

A module in programming is a sequence of logically related code fragments, organized as a separate part of a program. In many languages, it is encapsulated in a separate file or named continuous portion of code.

Of course, the principle of modularity did not bypass JavaScript. On the contrary, JavaScript has implemented several approaches to code organization, but let's take it step by step.

IIFE

Let's start with IIFE, even though it is not a module system but a native language mechanism. I believe that for the sake of completeness, we need to consider it. IIFE stands for Immediately Invoked Function Expression. It is a function that is declared and immediately invoked. This technique is used for isolation: variables declared within the body of such a function are not accessible from the outside, which helps to avoid naming conflicts and pollution of the global namespace. An example of such a function can be seen below:

javascript

;(function () {
  const a = 5;

  console.log(a) // 5
})()

console.log(a) // Uncaught ReferenceError: a is not defined

If you prefer to use JS without ;, then I recommend placing a ; before declaring the IIFE, as shown above, so that its opening parenthesis is not parsed as a call of whatever expression ends the previous line. However, if you use ; consistently throughout your code, you won't encounter such issues.

By the way, an IIFE can be asynchronous, which means that the body of such a function can use the await keyword. However, this is not directly relevant to the topic of this article.
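For completeness, here is a minimal sketch of an asynchronous IIFE. The fetchVoice helper is a hypothetical stand-in for any real asynchronous operation; it is not part of any library.

```javascript
// Hypothetical stand-in for a real asynchronous operation (e.g. a network call).
const fetchVoice = () =>
  new Promise((resolve) => setTimeout(() => resolve('Woof!'), 10));

// An async IIFE returns a promise, so failures can be handled with .catch().
const done = (async () => {
  const voice = await fetchVoice();
  console.log(voice); // Woof!
  return voice;
})();

done.catch((error) => console.error('IIFE failed:', error));
```

Before top-level await existed, this was the usual way to use await at the top of a script.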

Although IIFE provides a convenient (and native) way to isolate code, it does not provide a convenient mechanism for splitting code into files. Of course, you can use as many script tags as you want on a page (if you develop frontend), but it becomes very difficult to track module dependencies on larger projects. While, from a programming perspective, as I mentioned earlier in the module definition, a module doesn't necessarily have to be a separate file, practice shows that working with separate files is much easier. As a result, more powerful tools and approaches have emerged over time.
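Before moving on, here is a rough sketch of how an IIFE can act as a makeshift module: the function body keeps its state private and returns an object that serves as the public API. The counter name and its methods are made up for illustration.

```javascript
// `counter` is the only name that leaks into the enclosing scope.
const counter = (function () {
  let count = 0; // private: not reachable from outside the IIFE

  return {
    increment() {
      count += 1;
      return count;
    },
    reset() {
      count = 0;
    },
  };
})();

console.log(counter.increment()); // 1
console.log(counter.increment()); // 2
console.log(typeof count); // undefined
```

Note that nothing here describes dependencies between such "modules"; the order of script tags still has to be managed by hand.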

AMD

AMD (Asynchronous Module Definition) is a module system designed for use in the browser environment. It provides a mechanism for discovering and resolving module dependencies, enabling automatic loading and execution of modules in the correct order.

AMD was born out of a group of developers that were displeased with the direction adopted by CommonJS. In fact, AMD was split from CommonJS early in its development.

One of the key features of AMD is its ability to load modules asynchronously. When a module is requested for loading, the AMD loader asynchronously loads the module's dependencies, ensuring the proper loading order.

Here's an example of how it can be used:

javascript

define('sounds', 
  ['dog', 'audio'], 
  function (dog, audio) {
    return {
      bark: function() {
        return audio.play(dog.getVoice());
      }
    };
  }
);

Here, we are implementing the sounds module and explicitly stating its dependencies. One of the most popular tools that implements AMD is RequireJS. Here’s an example of how it can be used, taken from the official documentation:

javascript

requirejs(
  ['helper/util'],
  function(util) {
    // This function is called when scripts/helper/util.js is loaded.
    // If util.js calls define(), then this function is not fired until
    // util's dependencies have loaded, and the util argument will hold
    // the module value for "helper/util".
  }
);
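To make the dependency-resolution idea more concrete, here is a toy sketch of a define-style registry. It is not how RequireJS is implemented: the toyDefine and toyResolve names are invented, nothing is fetched over the network, and circular dependencies are not handled. It only shows that a module's factory runs after the factories of its dependencies, regardless of registration order.

```javascript
// Toy registry: module name -> { deps, factory, value, ready }.
const registry = new Map();

// Register a module without running it (hypothetical helper).
function toyDefine(name, deps, factory) {
  registry.set(name, { deps, factory, value: undefined, ready: false });
}

// Resolve a module: initialize its dependencies first, then run its factory once.
function toyResolve(name) {
  const entry = registry.get(name);
  if (!entry) throw new Error(`Module "${name}" is not defined`);
  if (!entry.ready) {
    const args = entry.deps.map(toyResolve);
    entry.value = entry.factory(...args);
    entry.ready = true;
  }
  return entry.value;
}

const initOrder = [];

// "sounds" is registered before its dependencies on purpose.
toyDefine('sounds', ['dog', 'audio'], function (dog, audio) {
  initOrder.push('sounds');
  return { bark: () => audio.play(dog.getVoice()) };
});
toyDefine('dog', [], function () {
  initOrder.push('dog');
  return { getVoice: () => 'Woof!' };
});
toyDefine('audio', [], function () {
  initOrder.push('audio');
  return { play: (sound) => `Playing: ${sound}` };
});

console.log(toyResolve('sounds').bark()); // Playing: Woof!
console.log(initOrder); // [ 'dog', 'audio', 'sounds' ]
```

A real AMD loader does the same bookkeeping, but it also downloads the script for each dependency asynchronously before running the factory.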

CommonJS (CJS)

While AMD was used in browser environments, another module system called CommonJS was used in Node.js environments. The main difference between the two lies in how modules are loaded: AMD loads them asynchronously, whereas CommonJS loads them synchronously.

While AMD is somewhat difficult to come across nowadays, CommonJS is literally everywhere. It's important to note that CommonJS is still the default module system in Node.js.

Here's an example of code using CommonJS:

javascript

const fs = require('fs');
const dog = require('./dog');

module.exports = {
  barkToFile: function () {
    fs.writeFileSync('./bark.txt', dog.voiceToString());
  }
};

To import modules in CommonJS, the require function is used. It takes either a path to the module (the file extension may be omitted) or a module name, and returns the exported values of that module.

Any properties added to exports or assigned to module.exports become the exported values of the module.

In CommonJS, modules are loaded synchronously and resolved at runtime. When a module is initially loaded, its code is executed, and its exported values become available for import in other modules.

In the browser, there is no built-in require function or access to the file system, so CommonJS is not supported. However, a similar API can be achieved using the AMD approach and the RequireJS library:

javascript

define(
  function(require, exports) {
    const dog = require("dog");

    exports.barkToConsole = function() {
      console.log(dog.voiceToString());
    }
  }
);

ES Modules (ESM)

And here we finally come to the first module system described in the ECMAScript standard. ES modules were introduced with ES6 in 2015. And yes, you understood it correctly, for 20 years there was no standardized module system in JavaScript. Over time, the standard has evolved and I believe you are also familiar with this syntax:

javascript

import api from 'api.js';
import dog from './dog.js';

export function makeBarkRequest () {
  api.post('/bark', (err) => {
    if (err) return;

    dog.bark();
  });
}

Despite being standardized, ES modules are not enabled by default. In a Node.js environment, you need to set type to 'module' in your package.json or use the .mjs file extension. In browser environments, to use ESM syntax within a script tag, you need to set the type attribute to 'module'. In Node.js, support for ES modules first appeared behind the --experimental-modules flag (as early as Node.js 8.5); the flag was dropped in Node.js 13.2 and, on the 12.x line, in 12.17.
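The Node.js rules above can be summarized as a small decision function. This is a simplified sketch, not Node.js code: detectFormat is a made-up name, packageType stands for the type field of the nearest package.json, and everything else Node.js may take into account (other file types, command-line flags) is ignored.

```javascript
// Simplified sketch of how a file's module format is chosen.
// `packageType` is the "type" field of the nearest package.json (may be undefined).
function detectFormat(filename, packageType) {
  if (filename.endsWith('.mjs')) return 'module'; // always ESM
  if (filename.endsWith('.cjs')) return 'commonjs'; // always CJS
  if (filename.endsWith('.js')) {
    // Plain .js files follow the package-wide setting; CJS is the fallback.
    return packageType === 'module' ? 'module' : 'commonjs';
  }
  return 'unknown';
}

console.log(detectFormat('index.js')); // commonjs
console.log(detectFormat('index.js', 'module')); // module
console.log(detectFormat('index.mjs', 'commonjs')); // module
console.log(detectFormat('index.cjs', 'module')); // commonjs
```

The last two calls show why the .mjs and .cjs extensions are useful: they override the package-wide type for a single file.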

That concludes the discussion of module systems, but there are a couple more approaches that are important to look at.

UMD

UMD (Universal Module Definition) is a template or approach for creating modules that can work in both CommonJS and AMD environments, as well as be accessible as global variables if no module loader is present.

The main idea behind UMD is to create a module that can automatically adapt to different runtime environments and module loaders.

UMD uses conditional constructs to determine which existing module system is available and selects the appropriate method for exporting and importing the module. In many cases, it uses AMD as a base and adds special handling to ensure compatibility with CommonJS.

Here's an example of how it typically looks:

javascript

(function (root, factory) {
  if (typeof define === 'function' && define.amd) {
    // AMD env
    define(['dependency'], factory);
  } else if (typeof exports === 'object') {
    // CommonJS env
    module.exports = factory(require('dependency'));
  } else {
    // global
    root.ModuleName = factory(root.Dependency);
  }
}(this, function (dependency) {
  // module logic
  return {
    // exports
  };
}));

If the runtime environment supports AMD (checked using define), the module is defined using define and the dependencies are specified for the module loader.

If the runtime environment supports CommonJS (checked using exports), the module is exported using module.exports and the dependencies are resolved through require.

If none of the checks pass, it is assumed that the module is running in a global environment, and it is exported by assigning it to a property of the global object (root), which represents the global namespace in this case.

UMD allows developers to create modules that can be used in different runtime environments, providing maximum flexibility and code portability.
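To see all three branches in action from a single script, the checks can be rewritten against an explicitly passed environment object. This is only a sketch for experimentation: registerUmd and the fake environments are invented here, and a real UMD wrapper inspects the actual define, exports, and global object instead.

```javascript
// Same branching as a UMD wrapper, but the environment is passed in explicitly.
function registerUmd(env, globalName, factory) {
  if (typeof env.define === 'function' && env.define.amd) {
    env.define([], factory); // AMD: hand the factory to the loader
    return 'amd';
  }
  if (typeof env.exports === 'object' && env.module) {
    env.module.exports = factory(); // CommonJS: export the factory's result
    return 'commonjs';
  }
  env.root[globalName] = factory(); // no loader: attach to the global object
  return 'global';
}

const factory = () => ({ bark: () => 'Woof!' });

// Fake AMD environment: define() just records the module value.
const amdEnv = { defined: null };
amdEnv.define = (deps, moduleFactory) => {
  amdEnv.defined = moduleFactory();
};
amdEnv.define.amd = {};

// Fake CommonJS environment.
const cjsModule = { exports: {} };
const cjsEnv = { module: cjsModule, exports: cjsModule.exports };

// Plain script environment: only a global object is available.
const globalEnv = { root: {} };

console.log(registerUmd(amdEnv, 'Dog', factory)); // amd
console.log(registerUmd(cjsEnv, 'Dog', factory)); // commonjs
console.log(registerUmd(globalEnv, 'Dog', factory)); // global
console.log(globalEnv.root.Dog.bark()); // Woof!
```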

SystemJS

SystemJS is a universal JavaScript module loader designed for use in the browser and Node.js runtime. It is designed to support and load various module formats, including AMD, CommonJS, UMD, and ESM, allowing developers to use different module formats within a single project.

Here's an example of how it looks:

javascript

System.import('dog.js').then(function(module) {
  console.log(module.bark());
}).catch(function(error) {
  console.error('Failed to load module:', error);
});

And keep in mind that the 'dog.js' module can be written in ANY of the formats mentioned above. However, I would say that if you find yourself needing to use different module systems within a single project, there might be something wrong with your project.

In real life, you are unlikely to deal with AMD, UMD, or SystemJS. ESM and CJS, on the other hand, are widely used, and you will encounter them regardless of the environment you're developing in.

Bundlers and transpilers

Module systems themselves are not complicated, and it is usually easy to tell which one is in use by examining the source code. However, in real-world development, the code we write doesn't go into production as-is. On the backend, a transpiler is usually sufficient. The most common scenario is transpiling TypeScript code to JavaScript (of course, there are other solutions like ts-node or Deno, but that's a topic for another article). On the frontend, things are even more complex: the code goes through not only a transpiler but also other tools such as bundlers and minifiers.

Transpilers (for example, tsc) can generate code for various module systems. tsc supports all of the module systems we've looked at: ESM, CJS, AMD, UMD and SystemJS. You can also experiment with the tsc transpiler yourself!

Let's see how it works. For example, here's the code:

javascript

import dog from './dog';
import audio from './audio';

export const bark = () => {
  audio.play(dog.bark())
}

It will not change if you build it with tsc with the module option set to 'esnext', but if you build the code with the same tool with the module option set to 'commonjs', the code will look like this:

javascript

"use strict";
var __importDefault = (this && this.__importDefault) || function (mod) {
    return (mod && mod.__esModule) ? mod : { "default": mod };
};
Object.defineProperty(exports, "__esModule", { value: true });
exports.bark = void 0;
const dog_1 = __importDefault(require("./dog"));
const audio_1 = __importDefault(require("./audio"));
const bark = () => {
    audio_1.default.play(dog_1.default.bark());
};
exports.bark = bark;

If we delve into what's happening here and remove the generated helpers, we can notice that the code we wrote using ESM has been transformed into code using CJS, and this is the root cause of all the errors!
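As for the generated helper itself, the sketch below copies the logic of __importDefault into a standalone function to show what it does. The two sample module objects are made up for illustration.

```javascript
// Same logic as the __importDefault helper emitted by tsc in the output above.
function importDefault(mod) {
  return mod && mod.__esModule ? mod : { default: mod };
}

// A module written by hand in CommonJS: the whole exports object acts as the default.
const handWrittenCjs = { bark: () => 'Woof!' };

// A module compiled from ESM: it carries the __esModule marker
// and keeps its default export under the `default` key.
const compiledFromEsm = { __esModule: true, default: { bark: () => 'Woof!' } };

console.log(importDefault(handWrittenCjs).default === handWrittenCjs); // true
console.log(importDefault(compiledFromEsm) === compiledFromEsm); // true
console.log(importDefault(compiledFromEsm).default.bark()); // Woof!
```

In both cases the calling code can read .default, which is why the compiled output accesses dog_1.default and audio_1.default.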

This is where the confusion arises. Since you are using ESM in your source files, it seems logical to set the type to 'module' in package.json. But it's not that simple. In reality, Node.js will run the code already compiled by tsc, which, as we have seen, does not use ESM, and you will encounter an error such as:

text

ReferenceError: require is not defined

The reverse situation is also possible. For example, if we build the code and it still contains ESM syntax, but we haven't set the type to 'module' in package.json, we will get the error mentioned at the beginning of the article:

text

SyntaxError: Cannot use import statement outside a module

All of this is further complicated by the fact that some packages are discontinuing support for CJS. One such package is chalk. Starting from major version 5, they stopped supporting CJS, and if you try to use require, you will encounter the error:

text

Error [ERR_REQUIRE_ESM]: require() of ES Module

Now, imagine a situation where you have TypeScript code:

javascript

import chalk from 'chalk'; // >= 5
import dog from './dog';

export const colorfulBark = () => {
  console.log(chalk.green(dog.barkToString()));
}

And you decided to build it with tsc with module option set to 'commonjs'. You will get this code:

javascript

"use strict";
var __importDefault = (this && this.__importDefault) || function (mod) {
    return (mod && mod.__esModule) ? mod : { "default": mod };
};
Object.defineProperty(exports, "__esModule", { value: true });
exports.colorfulBark = void 0;
const chalk_1 = __importDefault(require("chalk")); // <-- require('chalk')
const dog_1 = __importDefault(require("./dog"));
const colorfulBark = () => {
    console.log(chalk_1.default.green(dog_1.default.barkToString()));
};
exports.colorfulBark = colorfulBark;

And when you attempt to run it, you encounter the error

text

Error [ERR_REQUIRE_ESM]: require() of ES Module

indicating that one of our modules (in this case, chalk) does not support CJS. This can be quite confusing because we don't have any require statements in our source code! It becomes particularly challenging to troubleshoot in a large project when examining the already bundled code.

In this case, there are two options:

Set type to 'module', which would also require setting module to 'esnext' in the tsconfig (and possibly break other things 😁).
Downgrade chalk to version 4.1.2, which is the latest version in the 4.x major release that still supports CJS. This means that the module confusion forces you to use older package versions, which is not ideal, of course.

I also want to note that setting type to 'module' is a very significant project-wide configuration that affects everything, so changing it is usually extremely difficult. For example, if you are using Jest with type set to 'module' and then suddenly decide to switch to commonjs for some reason, your top-level await in test files will stop working.

So, what should I do?

I can't give recommendations on already existing projects, because, as I said, changing the type field is usually very difficult and each case needs to be analyzed separately. Of course, it will be better if you can upgrade to ESM.

As for new projects, I would recommend starting them with type set to 'module'. This syntax is standard and modern. More and more public packages are dropping support for CJS. ES modules also give you extra features such as top-level await.

Of course, there is a caveat. A file extension must be provided when using the import keyword to resolve relative or absolute specifiers. Directory indexes (for example, ./startup/index.js) must also be fully specified. This can be a problem if you're using tsc because TypeScript doesn't modify import paths during transpilation. This issue can be resolved using special tooling or simply by using the .js extension in imports. TypeScript will still understand that it refers to the corresponding TypeScript file (.ts), and after transpilation, the code will have a valid path. By the way, my IDE (VSCode) was even able to correctly suggest the path with the .js extension. And code navigation also works correctly.
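The difference between the two lookup styles can be sketched as follows. This is a deliberately simplified model with made-up names (emittedFiles, resolveLikeCjs, resolveLikeEsm) and a hypothetical file layout; the real CommonJS algorithm tries more candidates than shown here.

```javascript
// Hypothetical project layout after compilation.
const emittedFiles = new Set(['/app/dog.js', '/app/startup/index.js']);

// CommonJS-style lookup (simplified): try the path as is, then with ".js",
// then as a directory containing index.js.
function resolveLikeCjs(path) {
  const candidates = [path, `${path}.js`, `${path}/index.js`];
  return candidates.find((candidate) => emittedFiles.has(candidate)) ?? null;
}

// ESM-style lookup: the specifier must point at the file exactly.
function resolveLikeEsm(path) {
  return emittedFiles.has(path) ? path : null;
}

console.log(resolveLikeCjs('/app/dog')); // /app/dog.js
console.log(resolveLikeEsm('/app/dog')); // null
console.log(resolveLikeEsm('/app/dog.js')); // /app/dog.js
console.log(resolveLikeCjs('/app/startup')); // /app/startup/index.js
console.log(resolveLikeEsm('/app/startup')); // null
```

This is why an import written as './dog' compiles fine but fails at runtime under ESM, while './dog.js' works in both the TypeScript source and the emitted JavaScript.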

As an alternative for TypeScript projects, you can consider using ts-node, but it needs to be run with the --transpile-only flag. Of course, you need to remember that ts-node is a different runtime from the standard Node.js, and it has its own interesting nuances.

Another solution to this problem could be bare specifiers combined with moduleResolution set to 'nodenext'. However, you'll also need to solve the issue of copying package.json files to the build folder and it’s important to preserve the directory structure.

Alternatively, you can customize the ESM specifier resolution algorithm with the loaders API so that modules can be resolved without extensions, as in CJS. But this feature is experimental (as of Node.js 20).

Nevertheless, for new JavaScript projects, I would still recommend using type set to 'module'. In the case of TypeScript, it's not as straightforward as we saw above, but I would still recommend using ESM and specifying the .js extension. Yes, it may seem a bit strange, but it's the best solution among the available options in my opinion.

We discussed the usage of module systems in Node.js environments. But what about browser environments? Usually, these issues don't arise there because tools like webpack are used, which support both ESM and CJS syntax. The output is typically a bundle where all the code is combined into one file, eliminating the need for a module system at runtime (and thus avoiding related problems). webpack also provides a way to split applications into files (chunks) for on-demand asynchronous loading, which in itself resembles a module system, but that's a topic for another article.

Newer tools like vite are built on ESM and use script tags in the browser with the type attribute set to 'module'. vite’s performance is based on ESM. As far as I know, there are no alternatives to this approach, so there is no need to make a choice.

Conclusion

While writing this article, I came across numerous other articles on the topic of comparing module systems in JavaScript. However, all of these articles only scratch the surface in terms of examining the differences and similarities of these systems. In contrast, I made an effort to not only explore the module systems but also address the common challenges developers face and even proposed a few solutions. I hope you find this information useful in your projects.

Thank you for your attention, and see you next time!

References

Modular programming, viewed 19 Jun 2023, https://en.wikipedia.org/wiki/Modular_programming
IIFE, viewed 19 Jun 2023, https://developer.mozilla.org/en-US/docs/Glossary/IIFE
RequireJS, viewed 19 Jun 2023, https://requirejs.org/
ESM, viewed 19 Jun 2023, https://nodejs.org/docs/latest-v18.x/api/esm.html
UMD, viewed 19 Jun 2023, https://github.com/umdjs/umd
AMD, viewed 19 Jun 2023, https://en.wikipedia.org/wiki/Asynchronous_module_definition