Developing fast & built-in task runner in Node.js core
With this blog post, I’m going to explain and analyze the steps I’ve taken to land a super-fast, built-in task runner in Node.js core. For those who are not familiar with the term “task runner”, it’s a command-line tool to run a task specified in a package.json file in node.js projects.
{
"scripts": {
"start": "node index.js"
}
}From a user perspective
Whenever, a person executes npm run start on this project, it makes a series of executions that makes the operation slow. Here are the steps taken to spawn a process and executes your command:
- User types
npm run startin the terminal. npmtries to find the closest package.json file- It starts from current directory and traverses up to the root of your operating-system, while making
statcalls in every folder to search forpackage.jsonfile. This means if your current path is/home/username/projects/my-project, it will look forpackage.jsonin the following directories:/home/username/projects/my-project/home/username/projects/home/username/home/
- It starts from current directory and traverses up to the root of your operating-system, while making
- Every parent directory of your current directory gets added to the
PATHenvironment variable with a suffix ofnode_modules/.bin.- This is done to make sure that you can have a command like
biome check .in yourpackage.jsoneven though Biome binary is available atnode_modules/.binfolder.
- This is done to make sure that you can have a command like
- It reads the
package.jsonand looks intoscripts[key]field to find the command to execute. - It spawns a process and executes in the shell.
- This is the slowest part of the operation because it involves spawning a new process and executing the command in the shell.
Some package managers like npm adds several npm specific environment variables into the newly spawned process, such as npm_lifecycle_event, npm_config_user_agent and npm_package_json.
The problem with npm task runner
Before diving into the technical implementation, let’s analyze the problems with the current task runner in npm:
npmdoes not dynamically load it’s subcommands likerun,install,test, etc. This means that every time you runnpm, it has to load all the subcommands, even though you are only interested in running a script.- For those who are unfamiliar with
nodeinternals, whenever yourequire()a module in a project, it makes a filesystem call to the operating system to read the file and parse it. This gets cached on common.js applications (but not on ESM), but regardless separating and having multipel files results in slower startup times. - This will likely be optimized in the future, but for the time being it’s a problem that effects the execution time.
- For those who are unfamiliar with
npmruns in the context of Node.js, which means it has to load the Node.js runtime and execute the command in the shell.- By default, in order to execute a shell command, you have to initialize a Node.js process, that initializes V8 engine, parses the
npmJS code, and then executes the command in the shell. This is a slow process. (Even writing this sentence is slow, imagine how slow it is to execute a shell command on a Node.js library that is not optimized for this purpose.)
- By default, in order to execute a shell command, you have to initialize a Node.js process, that initializes V8 engine, parses the
Solution
The solution to the problem is to create a new command in Node.js that is optimized for running tasks in a project. This command should be able to run a task specified in a package.json file, without having to load the entire Node.js runtime.
The original implementation was written in JavaScript to make sure that the Node.js project, contributors and technical steering committee members are open to the idea of having a built-in task runner in Node.js core. After the initial pull-request landed, I’ve re-written the implementation in C++ to make sure that the performance is optimal.
There is still some things that needed to be done to reduce the overhead to less than 10ms, but the current implementation is already faster than all alternatives. If you’re interested in contributing and optimizing this implementation even more, please let me know. I’m more than happy to help you get started.
Benchmarks
JavaScript solution inside Node.js project
This implementation is available at this pull-request .
❯ hyperfine './out/Release/node run test' 'npm run test' -i
Benchmark 1: ./out/Release/node run test
Time (mean ± σ): 29.3 ms ± 1.1 ms [User: 23.2 ms, System: 3.1 ms]
Range (min … max): 27.6 ms … 33.2 ms 97 runs
Warning: Ignoring non-zero exit code.
Benchmark 2: npm run test
Time (mean ± σ): 185.7 ms ± 9.2 ms [User: 136.7 ms, System: 30.3 ms]
Range (min … max): 174.7 ms … 212.9 ms 15 runs
Warning: Ignoring non-zero exit code.
Summary
./out/Release/node run test ran
6.34 ± 0.40 times faster than npm run test
C++ re-write of the task runner
Almost 10ms faster than the JavaScript solution, which doesn’t require V8 to load and execute the original JavaScript implementation.
❯ hyperfine '../node/main-branch --run test' '../node/cpp-rewrite --run test' 'npm run test' -i
Benchmark 1: ../node/main-branch --run test
Time (mean ± σ): 28.9 ms ± 0.9 ms [User: 24.2 ms, System: 3.4 ms]
Range (min … max): 27.5 ms … 31.7 ms 96 runs
Warning: Ignoring non-zero exit code.
Benchmark 2: ../node/cpp-rewrite --run test
Time (mean ± σ): 18.3 ms ± 0.6 ms [User: 16.0 ms, System: 1.5 ms]
Range (min … max): 17.5 ms … 20.8 ms 139 runs
Warning: Ignoring non-zero exit code.
Timeline
For those interested in the technical implementation, here’s a list of series of pull-requests that lead to make node --run task runner from scratch to a stable release candidate. In overall, the whole implementation process took almost 3 months (March 22 to June 12).
- cli: implement
node --run <script-in-package-json> - src: rewrite task runner in c++
- src: fix positional args in task runner
- test: add env variable test for —run
- cli: add
NODE_RUN_SCRIPT_NAMEenv tonode --run - cli: add
NODE_RUN_PACKAGE_JSON_PATHenv - src: traverse parent folders while running
--run - doc: move
node --runstability to release candidate