JavaScript

6 posts with the tag “JavaScript”

Type Safety and Runtime Safety Part.1/2: model, validator & data

TypeScript strengthens JavaScript with a static type system, which is a real benefit for developers. TypeScript checks our code statically, before it runs. This is what we call type safety. Runtime safety, by contrast, means checking our code at runtime against real data. At runtime, TypeScript files have already been compiled to JavaScript, so static type checks are no longer available.

In other words: type safe does not mean runtime safe.

The post is divided into two parts.

In part 1 here, I want to discuss runtime validation. In part 2, I will share some use cases, with Nuxt3 as the example.

  • Basic knowledge: JavaScript, TypeScript, zod

It starts with a story. Here we have two developers, John & Bill.

John wrote a utility function and shared it with his team:

utils.ts
export function printList(list: (string | number)[]): void {
  list.forEach((item) => {
    console.log(item);
  });
}

The parameter list is restricted to an array, so TypeScript keeps an eye on everyone who calls this function and checks whether the type of list is correct. This seemed fine to John, so he transpiled it into JavaScript. Here is the result:

utils.js
"use strict";
Object.defineProperty(exports, "__esModule", { value: true });
exports.printList = void 0;
function printList(list) {
  list.forEach(function (item) {
    console.log(item);
  });
}
exports.printList = printList;

One day, Bill imported this file to use the printList function:

product.js
const { printList } = require("./utils.js");
const myList = [1, 2, 3, "a", "b", "c"];
printList(myList); // prints each item
const myIllegalData = "hello"; // <-- this is illegal!
printList(myIllegalData); // <-- throws TypeError: list.forEach is not a function

Without type definitions, Bill accidentally passed an illegal argument, and the function broke when he ran it:

runtime error

When John wrote the function, he expected the parameter to be an array. But someone may call it the wrong way (for example, without a type definition file, *.d.ts).

The expected parameter differs from the actual one, which means the type (or model) differs from the data. Personally, I call this a mismatch (or misalignment) between type and data. If John had added a runtime check, the function would not have broken.
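
A minimal sketch of such a runtime check, assuming a plain Array.isArray guard is acceptable. The name printListSafe and the choice to return the number of printed items are illustrative and not part of John's original code:

```typescript
// Hypothetical guarded variant of printList. The parameter is typed as
// unknown on purpose, because callers from plain JavaScript bypass the types.
function printListSafe(list: unknown): number {
  if (!Array.isArray(list)) {
    // Fail with a clear message instead of "list.forEach is not a function".
    throw new TypeError(`printListSafe expects an array, got ${typeof list}`);
  }
  let printed = 0;
  for (const item of list) {
    // Skip anything that is neither a string nor a number.
    if (typeof item === "string" || typeof item === "number") {
      console.log(item);
      printed += 1;
    }
  }
  return printed;
}
```

Whether to throw or to skip silently on bad input is a design choice; throwing early makes the mismatch visible at the call site.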

In my opinion, runtime checks are crucial, even more so than type checks. As frontend developers, we always get data from outside our code, mostly from APIs. External sources are likely to change without notifying us, so it is important to validate data as soon as we receive it.

I recently started using a validation library called zod, which is well suited to runtime validation.

As its documentation describes it:

TypeScript-first schema validation with static type inference

It works well with TypeScript and can validate data while inferring types.

Personally, I like to define a model first, because the model is the blueprint of the data. Second, we need a validator to check the data at runtime. Since zod is very type-friendly, it is easy to build a validator that conforms to the model. Then, when we receive data, we use the validator to validate it.

Here we have three parts:

  1. Model: Using type definition in TypeScript.
  2. Validator: Using zod to build up validators.
  3. Data: Validate the data using validators.

In fact, both the model and the validator restrict the data. The difference is that models restrict data statically (before runtime), while validators restrict data dynamically (at runtime). Besides, validators can impose stricter restrictions than models can. I will explain this later.

In terms of Single Source of Truth (SSOT), our truth here is the model. The shape of a validator is constrained by the model: no matter what additional strict validation we add, it still stays within the restrictions of the model. The data, in turn, should follow the rules that the validator provides.

The model is the blueprint, or origin, of the data. Individual values may vary, but the data type should remain the same.

For example, we have a model User that defines what a user should look like:

enum Gender {
  Male,
  Female,
  Other,
}
type User = {
  name: string;
  age: number;
  gender: Gender;
};

So a user includes name, age, and gender, as above:

  1. Name is defined as a string, without a doubt.
  2. Age is a number. It is actually a non-negative integer, but TypeScript only provides the number type, so we have no choice.
  3. Gender is defined as an enum called Gender, which is also a number under the hood.

Here we use zod to build the validator. zod will check the validator against the type we provided:

import { z } from 'zod';
const userValidator: z.ZodSchema<User> = z.object({
  name: z.string().min(1),
  age: z.number().int().nonnegative(),
  gender: z.nativeEnum(Gender),
});

As we can see, the validator defines our values precisely:

  1. Name: An empty string is not allowed.
  2. Age: Only a non-negative integer is allowed. No -5 year old, no 15.5 year old.
  3. Gender: Here we only have male (0), female (1), and other (2). Other values are not allowed.

This is what I meant by saying that a validator is stricter than a model.
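
To see the gap between model and validator without any library, here is a hand-written sketch of the same three rules. It is not zod: validateUser is a hypothetical helper, and Gender is modeled as a numeric union rather than an enum to keep the example short:

```typescript
// The model: this is all the compiler knows about a user.
// Gender is a numeric union here (0 = male, 1 = female, 2 = other).
type Gender = 0 | 1 | 2;
type User = { name: string; age: number; gender: Gender };

// Hypothetical hand-written validator mirroring the rules listed above.
// It returns a list of problems; an empty list means the data is valid.
function validateUser(data: User): string[] {
  const problems: string[] = [];
  if (data.name.length < 1) {
    problems.push("name must not be empty");
  }
  if (!Number.isInteger(data.age) || data.age < 0) {
    problems.push("age must be a non-negative integer");
  }
  if ([0, 1, 2].indexOf(data.gender) === -1) {
    problems.push("gender must be 0, 1 or 2");
  }
  return problems;
}

const validUser: User = { name: "John", age: 30, gender: 0 };
// Data from outside the program can carry any number, so the cast below
// imitates an unchecked API response.
const untrustedGender: number = 7;
const oddUser: User = { name: "", age: 15.5, gender: untrustedGender as Gender };

console.log(validateUser(validUser)); // no problems
console.log(validateUser(oddUser)); // one problem per broken rule
```

Both objects compile as User, yet only the first one passes the runtime rules.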

After that, we can validate data with it anywhere we want:

const userValidation = userValidator.safeParse(userData);
if(userValidation.success){
// validation is passed, congrats!
// ...
} else {
// validation is not passed, throw error, print logs or do anything to handle errors
// ...
}

So we can be much more confident in our code now! No more fear of being bombarded by illegal data!

In the next part, I want to discuss some use cases.

Hope this post is helpful to you!

Happy coding.

Type Safety and Runtime Safety Part.2/2: page, route guard & API

The post is divided into two parts. If you want to read part 1 first, please go here.

Part 1 was about runtime validation; part 2 focuses on use cases. I will use Nuxt3 as the example.

Here I have three use cases: page, route guard (Vue route), and API.

  • Basic knowledge: JavaScript, TypeScript, Nuxt3, Vue3, zod

Suppose we have a post list page with route: /posts?page=3&sort=desc&limit=20.

Then let’s define the route parameter model:

enum Sort {
  ASC = 'asc',
  DESC = 'desc',
}
// PS. RP means route parameter, which is my naming preference.
interface RPPostList {
  limit?: number;
  page?: number;
  sort?: Sort;
}

And validator:

const RPPostListValidator: z.ZodSchema<RPPostList> = z.object({
  limit: z.coerce.number().int().positive().max(50).optional(),
  page: z.coerce.number().int().positive().optional(),
  sort: z.nativeEnum(Sort).optional(),
});

The data we plan to validate comes from the URL, which means every value arrives as a string. So the values of page and limit must be converted to numbers before validation. Otherwise, we would get "3" instead of 3 for page. Thankfully, recent versions of zod provide coerce, so we can convert them easily.

  • page: page means the current page number, so it should be a positive integer.
  • limit: limit means how many items to show per page, which is also a positive integer. Besides, it is capped at 50 for performance reasons. Of course, we can also enforce this in the backend.
  • sort: sort is limited to desc or asc.
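
To make the coercion step concrete, here is a dependency-free sketch of roughly what the page rule does to a raw query value. It assumes coercion behaves like calling Number() on the input. The helper name parsePage and the fallback to 1 are illustrative choices, not zod's API:

```typescript
// Hypothetical helper: turn a raw query-string value into a page number.
// Query values arrive as strings (or are missing), so we convert first,
// then apply the same rule as the validator: a positive integer.
function parsePage(raw: string | undefined, fallback = 1): number {
  if (raw === undefined) return fallback;
  const coerced = Number(raw); // "3" becomes 3, "hello" becomes NaN
  const isValid = Number.isInteger(coerced) && coerced > 0;
  return isValid ? coerced : fallback;
}

console.log(parsePage("3")); // 3
console.log(parsePage("-1")); // 1
console.log(parsePage("hello")); // 1
console.log(parsePage(undefined)); // 1
```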

In the page component, here is the process:

  1. We get the query string from the URL with route.query in Nuxt3.
  2. Validate the query string.
  3. After safely parsing the data, we can do whatever we want.

Here is an example:

src/pages/posts.vue
<script setup lang="ts">
import type { SafeParseReturnType } from 'zod';
const route = useRoute();
const queryStringValidation = computed<SafeParseReturnType<RPPostList, RPPostList>>(() =>
  RPPostListValidator.safeParse(route.query),
);
const currentPage = computed<number>(() => {
  if (queryStringValidation.value.success) {
    // if query string has page, return page, otherwise, return 1
    return queryStringValidation.value.data.page ?? 1;
  } else {
    // if validation failed, return 1 whatsoever
    return 1;
  }
});
const currentSortType = computed<Sort>(() => {
  if (queryStringValidation.value.success) {
    // if query string has sort, return sort, otherwise, return desc
    return queryStringValidation.value.data.sort ?? Sort.DESC;
  } else {
    // if validation failed, return desc as default sorting
    return Sort.DESC;
  }
});
</script>

For currentPage, if the query string has no page value, or contains a strange one such as ?page=-1 or ?page=hello, we fall back to page 1 as the default. The same goes for currentSortType.

Consider a post page with the route /post/:id.

Route middleware in Nuxt3 is triggered before entering the page, so it is well suited to building a route guard.

/src/middleware/postGuard.ts
import { z } from 'zod';
// PS. RP means route param, which is my naming preference.
interface RPPost {
  id: number;
}
const RPPostValidator: z.ZodSchema<RPPost> = z.object({
  id: z.coerce.number().int().positive(),
});
export default defineNuxtRouteMiddleware((to, _from) => {
  // safeParse returns a result object, so read its success flag
  const isValid: boolean = RPPostValidator.safeParse(to.params).success;
  // if validation failed, redirect to home page
  if (!isValid) {
    return navigateTo('/');
  }
});

As the validator above shows, id is treated as the post id. The post id in our database is the primary key (PK), and primary keys are positive integers. So there is no need to visit pages like /post/-123, /post/2.35, or /post/hello. Without asking the database, we can say with confidence that no such post exists. So we can redirect directly to the home page or an error page.

Incoming URLs can be varied and unpredictable. (Or should I say untrustworthy? I sound like a skeptic, LOL.) So it is safer to check them strictly before we use them.

Speaking of untrustworthy, it reminds me of a quote from the great assassin Altair, who once said:

Nothing is true, everything is permitted.

Altair

Here we can say:

Nothing is true, everything should be validated. 🤘

When building an API, I ask myself:

  • Which data is unsafe and needs validation?
  • When a validation fails, how should the error be handled?

So far, the process below is what I consider good practice:

  1. Validate incoming payload: from route params & query string
  2. If failed, throw 400 bad request
  3. Query data, DB connection
  4. Validate raw data
  5. If failed, throw 500 internal server error
  6. Return data as response
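
The six steps above can be sketched as a plain function, independent of any framework. Everything here is hypothetical and for illustration only: the Post shape, the ApiResult type, the synchronous fetchPost stand-in for a database query, and the convention that null means "not found". A real handler would be asynchronous:

```typescript
// Hypothetical shapes for illustration; not the Nuxt3 or zod API.
type Post = { id: number; title: string };
type ApiResult =
  | { status: 200; body: Post | null }
  | { status: 400 | 500; message: string };

// Stand-in for a database query; it may return dirty data.
type FetchPost = (id: number) => unknown;

// Runtime check that a raw value really has the shape of a Post.
function isPost(value: unknown): value is Post {
  if (typeof value !== "object" || value === null) return false;
  const record = value as Record<string, unknown>;
  return (
    typeof record.id === "number" &&
    Number.isInteger(record.id) &&
    record.id > 0 &&
    typeof record.title === "string" &&
    record.title.length > 0
  );
}

function handleGetPost(rawId: string, fetchPost: FetchPost): ApiResult {
  // Steps 1-2: validate the incoming payload, answer 400 on failure.
  const id = Number(rawId);
  if (!Number.isInteger(id) || id <= 0) {
    return { status: 400, message: "Request is invalid." };
  }
  // Step 3: query the data source.
  const raw = fetchPost(id);
  if (raw === null) {
    return { status: 200, body: null }; // nothing found, but nothing dirty
  }
  // Steps 4-5: validate the raw data, answer 500 on failure.
  if (!isPost(raw)) {
    return { status: 500, message: "Data and model mismatched." };
  }
  // Step 6: return the validated data.
  return { status: 200, body: raw };
}
```

Returning a result object instead of throwing keeps the sketch easy to test; in a server framework the 400 and 500 branches would typically throw the framework's error type.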

Here’s a Nuxt3 server API for querying a single post. The endpoint is /api/post/:id, using the GET HTTP method. For the demo, I gathered all models and validators together for easy reading. In real-world use, they would be placed in their own directories.

/src/server/api/post/[id].get.ts
import { z } from 'zod';
interface RPPost {
  id: number;
}
const RPPostValidator: z.ZodSchema<RPPost> = z.object({
  id: z.coerce.number().int().positive(),
});
// PS. M means model, which is my naming preference.
interface MPost {
  id: number;
  content: string;
  publishAt: Date;
  title: string;
}
const MPostValidator: z.ZodSchema<MPost> = z.object({
  id: z.number().int().positive(),
  content: z.string(),
  publishAt: z.date(),
  title: z.string().min(1),
});
// Nuxt3 server API
export default defineEventHandler(async (event) => {
  // Step 1 - get payload from router, query string or request body
  const params = getRouterParams(event);
  // Step 2 - payload validation
  const routerParamValidation = RPPostValidator.safeParse(params);
  // Step 3 - if validation failed, would throw 400 bad request
  if (!routerParamValidation.success) {
    throw createError({ message: 'Request is invalid.', statusCode: 400 });
  }
  // Step 4 - db connection query method defined at another place
  const rawData = await getPostById(routerParamValidation.data.id);
  // Step 5 - raw data validation
  const rawDataValidation = MPostValidator.nullable().safeParse(rawData);
  // Step 6 - if validation failed, would throw 500 internal error
  if (!rawDataValidation.success) {
    throw createError({ message: 'Data and model mismatched.', statusCode: 500 });
  }
  // Step 7 - everything is fine, then transform data into view model, then return it
  return rawDataValidation.data;
});

An API receives a URL just as route middleware does, so they can share the same route param model and validator. Besides, if the API accepts a request body, like a POST API, it should validate the request body too.

If the route params are illegal, throw a 400 bad request error to the user. If they pass validation, continue to the next step: connect to an external source (such as a database or other APIs) to get data.

We may get data from several external sources, listed below:

  1. From our database directly.
  2. From APIs created by our backend colleagues.
  3. From third party APIs.
  4. …any other external sources you might need.

There are some concerns about the data from each source:

  1. Database: The database might store dirty data during development or for other reasons.
  2. APIs from backend: Your backend colleagues updated their data model or adjusted their APIs but forgot to inform you. Humans make mistakes.
  3. Third party APIs: Third party APIs might change their responses, and of course the provider is not obligated to notify everyone who uses them, especially for free APIs.

So it is much safer to validate the raw data before using it.

If validation fails, throw a 500 internal server error to the user if necessary. Throwing a 500 error might be too radical for some. If so, another choice is to just log the error without throwing.

But my concern here is that any kind of dirty data (no matter how small) risks breaking the frontend page. A frontend developer might mutter something like: The page was fine yesterday, why is it broken today? I didn’t touch anything… WHY?!

So, the stricter the better.

After all validations pass, we are good to go: return the data as the response.

The validation process might be tedious, but it is worth it in the long term. Maybe it is frustrating. But it is more frustrating when a bomb (dirty data) crashes our page at some unexpected time… (Say, when we are about to go to sleep.)

validation everywhere

Since I have played Minesweeper and gotten GAME OVER so many times while developing, it is time to face the music. That led me to the ideas mentioned above. I hope they are helpful!

Happy coding.

Create Hugo Post With NPM Script

  • Basic knowledge: NPM, Hugo, JavaScript, shell script
  • Pre-installed: VS Code, NPM CLI, Hugo CLI

Creating a post with the hugo CLI is tedious work for me, because I always create posts from an archetype and place them in a nested folder. For example, to create this post, I have to type the command below in the terminal:

Terminal window
hugo new --kind develop posts/_developer/create-hugo-post-with-npm-script

The problem is that I always forget how many kinds of archetypes I already have, and what my folder structure currently looks like. The folder structure is dynamic and can change frequently. Furthermore, I really like the NPM SCRIPTS feature that VS Code provides in the Explorer side menu, shown in the screenshot below:

npm script in side menu

This feature, which I personally call click to run script, is very convenient when the user cannot memorize or has forgotten the scripts. But as far as I know, it only supports Node package manager (NPM) scripts. To combine the “click to run script” feature with the Hugo CLI, it is necessary to use NPM as a middleman, even though a Hugo blog does not need NPM or any Node packages at all. So let’s get started.

First, initialize NPM with npm init.

Then let’s try running the hugo dev server through NPM, after adding this script to package.json:

package.json
"scripts": {
  "dev": "hugo serve -D"
}

Type npm run dev in your terminal, or just click the script in the side bar to run it:

npm run dev

Works like a charm! ✨

So the NPM script called the Hugo CLI perfectly. Now let’s try to achieve the final goal: creating a post.

First we have to install these packages:

  1. @inquirer/prompts, which is used to build a user-friendly interface in the terminal.
  2. inquirer-directory, which makes choosing a directory easier.
  3. inquirer, which the script below requires in order to register the directory prompt.

Then I create a JavaScript file, createPost.js, in the root directory and build the post creation process. Here’s the code for your reference:

"use strict";
const inquirer = require("inquirer");
const { input, select, Separator, confirm } = require("@inquirer/prompts");
const { execSync } = require("child_process");
const inquirerDirectory = require("inquirer-directory");
const BASE_PATH = "./content";
inquirer.registerPrompt("directory", inquirerDirectory);
const exec = (commands) => {
execSync(commands, { stdio: "inherit", shell: true });
};
/**
* Create post script
*
* @see https://github.com/SBoudrias/Inquirer.js
* @see https://github.com/nicksrandall/inquirer-directory
*/
(async function () {
const archeType = await select({
message: "Select an archetype",
choices: [
{
name: "Basic",
value: "basic",
description: "Basic post",
},
{
name: "Dev",
value: "dev",
description: "Post for developer.",
},
new Separator(),
{
name: "Garden",
value: "garden",
description: "Note for digital garden.",
},
],
});
const title = await input({ message: "Enter your post title" });
const directory = await inquirer.prompt({
type: "directory",
name: "path",
message: "Please choose post directory.",
basePath: BASE_PATH,
});
const answer = await confirm({ message: "Confirm create the post?", default: false });
if (answer) {
exec(`hugo new --kind ${archeType} ${directory.path}/${title}`);
exec(`open ${BASE_PATH}/${directory.path}/${title}/index.md`);
}
})();

In the script, I provide four prompts and two actions:

  1. Select an archetype.
  2. Enter the post title.
  3. Choose a directory.
  4. Confirm the creation.
  5. Execute the hugo post creation command.
  6. Finally, open the file we created.

After finishing the script, add this to package.json:

package.json
"scripts": {
"create": "node createPost.js"
}

Then run npm run create, here’s the execution result:

npm run create

That’s it! Happy coding.

[Note] Implementation of State Machine and Multi-step Form

Basic knowledge: Vue, TypeScript, XState, vee-validate, zod

We often see “multi-step” forms that break down a lengthy form into several separate sections for completion. This approach reduces the psychological burden on users compared to a single long-page form.

Among the numerous form-related open-source packages available, I’ve selected the following tools to manage form-related tasks:

  • Form state management: Using Vue for this example, vee-validate manages the rendering states of all form elements including input, select, and other form components
  • Form validation: For validation tasks, we use zod (officially recommended by vee-validate) to handle the validation logic

Now that we’ve covered the form components, the key question remains: how should we design this “multi-step” architecture?

If we combine all fields from each stage into a single form, as shown in the diagram:

When switching between stages, we need to determine which fields to display, but when moving to the next stage, we need to validate only the current set of fields, which leads to complex validation logic.

So I thought of making each stage an independent form, meaning all validation is no longer partial, but rather validates all fields within a form (for example, all fields in stage one):

By splitting into multiple forms, we’ve simplified the form validation logic, eliminating the need for additional checks (partial field validation). The responsibilities of each form component are as follows:

  1. Field state management
  2. All field validation

After delegating the above tasks to the form components, the remaining logic to handle is:

  1. Display the state of which stage form component is currently active
  2. Submit form data and execute asynchronous requests

Since “Stage One” can only proceed to “Stage Two” and not to “Stage Three” or “Confirmation Stage (Step Confirm)”, I thought of using a finite state machine to solve these two issues, and decided to try XState, a well-known package for implementing finite state machines, to handle these tasks.
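
The idea that each stage only allows certain moves can be sketched without any library, as a plain transition table. This is not the XState API. The step and event names below are simplified, hypothetical stand-ins, and the asynchronous submission is collapsed into a single SUBMIT move:

```typescript
// Hypothetical step and event names, chosen for illustration only.
type Step = "step1" | "step2" | "step3" | "stepConfirm" | "complete";
type StepEvent = "NEXT" | "PREV" | "SUBMIT" | "RESTART";

// Transition table: for each step, the events it accepts and where they lead.
const transitions: Record<Step, Partial<Record<StepEvent, Step>>> = {
  step1: { NEXT: "step2" },
  step2: { NEXT: "step3", PREV: "step1" },
  step3: { NEXT: "stepConfirm", PREV: "step2" },
  stepConfirm: { SUBMIT: "complete", PREV: "step3" },
  complete: { RESTART: "step1" },
};

// Events that a step does not list are ignored, so the current step is kept.
function transition(current: Step, event: StepEvent): Step {
  return transitions[current][event] ?? current;
}

console.log(transition("step1", "NEXT")); // step2
console.log(transition("step1", "SUBMIT")); // step1
```

Because illegal moves are simply absent from the table, there is no way to jump from the first stage straight to confirmation.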

Therefore, the responsibility distribution diagram is as follows:

The previous section outlined the initial ideas. Here, let’s organize the planned assignment of responsibilities, which is divided into two parts. The first part belongs to the state machine:

  1. Control flow between stages (determining which form to display at each stage)
  2. Management of all form data
  3. Data submission and asynchronous request handling
  4. Handling of asynchronous request states (loading and error handling)

All of these features are implemented using XState. The second part belongs to the form components:

  1. Form field state management (using vee-validate)
  2. Form field validation (using zod)
  3. Form submission events and form data (using vee-validate)

For the form component, taking the first stage Form1.vue as an example, the structure is as follows:

<template>
<div>
<h2 class="formTitle">Choose channels you like</h2>
<form class="form" @submit="onSubmit">
<div>
<input type="checkbox" id="discovery" :value="1" v-model="channels" />
<label class="label" for="discovery">Discovery</label>
</div>
<!-- other inputs... -->
<div v-if="errors.channels" class="error">{{ errors.channels }}</div>
<div class="buttonGroup">
<button class="button" type="submit">next step</button>
</div>
</form>
</div>
</template>
<script setup lang="ts">
import type { Form1Model } from "@/types";
import { toTypedSchema } from "@vee-validate/zod";
import { useForm, useField } from "vee-validate";
import z from "zod";
interface Props {
initialValues: Form1Model;
}
interface Emits {
(event: "next", values: Form1Model): void;
}
// Props declares no optional fields, so no defaults are needed here
const props = defineProps<Props>();
const emits = defineEmits<Emits>();
const validationSchema = toTypedSchema(
z.object({
channels: z.number().array().nonempty("Please choose at least one channel."),
})
);
const { handleSubmit, errors, values } = useForm<Form1Model>({
initialValues: props.initialValues,
validationSchema,
});
const { value: channels } = useField<number[]>("channels");
const onSubmit = handleSubmit((values) => emits("next", values));
</script>

The form’s initial values are provided by the state machine and passed in via props; then, when the form is submitted, it emits an event for the state machine to handle. Each part is responsible for one thing, embodying the spirit of the “Single Responsibility Principle.”

The display control for forms at each stage is implemented in the outer MultiStepForm.vue, which imports the state machine and determines the display logic. Each form emits various events (next step, previous step, submit, etc.), which are then handed over to the state machine for execution.

<template>
<div :class="`h-full w-full ${props.class}`">
<h1 class="title">Multi Step Form Example</h1>
<div class="grid grid-cols-2 gap-x-6">
<div>
<h2 class="subTitle">Form Component</h2>
<div class="p-4 border border-slate-700 rounded-lg">
<Form1
v-if="state.matches('step1')"
@next="send('NEXT_TO_STEP_2', { formValues: $event })"
@prev="send('PREV')"
:initial-values="state.context.form1Values"
/>
<Form2
v-if="state.matches('step2')"
@next="send('NEXT_TO_STEP_3', { formValues: $event })"
@prev="send('PREV')"
:initial-values="state.context.form2Values"
/>
<Form3
v-if="state.matches('step3')"
@next="send('NEXT_TO_STEP_CONFIRM', { formValues: $event })"
@prev="send('PREV')"
:initial-values="state.context.form3Values"
/>
<FormConfirm
v-if="state.matches('stepConfirm')"
@prev="send('PREV')"
@submit="send('SUBMIT')"
:is-submitting="state.matches('stepConfirm.submitting')"
:error="state.context.error"
:machine-context="state.context"
:payload="state.context.payload"
/>
<FormComplete v-if="state.matches('complete')" @restart="send('RESTART')" />
</div>
</div>
<div>
<p class="subTitle">Current Machine Context</p>
<pre class="preBlock">{{ state.context }}</pre>
</div>
</div>
</div>
</template>
<script setup lang="ts">
import { useMachine } from "@xstate/vue";
import Form1 from "@/components/Form1.vue";
import Form2 from "@/components/Form2.vue";
import Form3 from "@/components/Form3.vue";
import FormConfirm from "@/components/FormConfirm.vue";
import FormComplete from "@/components/FormComplete.vue";
import { multiStepFormMachine } from "@/multiStepFormMachine";
interface Props {
class?: string;
}
interface Emits {
(event: "click"): void;
}
const props = withDefaults(defineProps<Props>(), {
class: "",
});
const emits = defineEmits<Emits>();
const { state, send } = useMachine(multiStepFormMachine);
</script>

Next is the state machine, which I’ve separated into a standalone file multiStepFormMachine.ts for easier management:

import { assign, createMachine } from "xstate";
import type { Form1Model, Form2Model, Form3Model, SubmitData } from "./types";
import { FORM_1_INITIAL_VALUES, FORM_2_INITIAL_VALUES, FORM_3_INITIAL_VALUES } from "./default";
import { sendFormData } from "./utils";
type MachineEvent =
| { type: "NEXT_TO_STEP_2"; formValues: Form1Model }
| { type: "NEXT_TO_STEP_3"; formValues: Form2Model }
| { type: "NEXT_TO_STEP_CONFIRM"; formValues: Form3Model }
| { type: "PREV" }
| { type: "SUBMIT" }
| { type: "RESTART" };
export type MachineContext = {
form1Values: Form1Model;
form2Values: Form2Model;
form3Values: Form3Model;
payload: SubmitData | null;
error: string | null;
};
const INITIAL_MACHINE_CONTEXT: MachineContext = {
form1Values: FORM_1_INITIAL_VALUES,
form2Values: FORM_2_INITIAL_VALUES,
form3Values: FORM_3_INITIAL_VALUES,
payload: null,
error: null,
};
type MachineState =
| { context: MachineContext; value: "step1" }
| { context: MachineContext; value: "step2" }
| { context: MachineContext; value: "step3" }
| { context: MachineContext; value: "stepConfirm" }
| { context: MachineContext; value: "stepConfirm.submitting" }
| { context: MachineContext; value: "complete" };
export const multiStepFormMachine = createMachine<MachineContext, MachineEvent, MachineState>(
{
id: "multiStepForm",
initial: "step1",
context: INITIAL_MACHINE_CONTEXT,
states: {
step1: {
on: {
NEXT_TO_STEP_2: {
target: "step2",
actions: assign({
form1Values: (context, event) => event.formValues,
}),
},
},
},
step2: {
on: {
NEXT_TO_STEP_3: {
target: "step3",
actions: assign({
form2Values: (context, event) => event.formValues,
}),
},
PREV: {
target: "step1",
},
},
},
step3: {
on: {
NEXT_TO_STEP_CONFIRM: {
target: "stepConfirm",
actions: assign({
form3Values: (context, event) => event.formValues,
}),
},
PREV: {
target: "step2",
},
},
},
stepConfirm: {
initial: "preSubmit",
states: {
preSubmit: {
entry: assign({
payload: (context, event) => ({
...context.form1Values,
...context.form2Values,
...context.form3Values,
}),
}),
on: {
SUBMIT: {
target: "submitting",
},
},
},
submitting: {
invoke: {
src: "formSubmit",
onDone: {
target: "#multiStepForm.complete",
actions: "resetContext",
},
onError: {
target: "errored",
actions: assign({
error: (context, event) => event.data.error,
}),
},
},
},
errored: {
on: {
SUBMIT: {
target: "submitting",
},
},
},
},
on: {
PREV: {
target: "step3",
},
},
},
complete: {
entry: "resetContext",
on: {
RESTART: {
target: "step1",
},
},
},
},
},
{
actions: {
resetContext: assign(INITIAL_MACHINE_CONTEXT),
},
services: {
formSubmit: async (context, event) => {
if (context.payload) {
return await sendFormData(context.payload);
} else {
return await new Promise((resolve, reject) => reject("Context cannot be null."));
}
},
},
}
);

For details of the state machine’s operation flow, please refer to this visualization page: multi-step-form | Stately

Above is the state machine code. Sorry that it is so lengthy. 🙏

The main responsibilities:

  1. Form1 emits a NEXT_TO_STEP_2 event to proceed to Form2, or Form2 emits a PREV event to return to Form1, and so on.
  2. FormConfirm emits a SUBMIT event to tell the state machine to execute an asynchronous request, sending out the form data.
  3. The state machine’s context stores the field data for Form1 and other form components, as well as the payload for the final request submission, along with the status of asynchronous requests (loading, error)

Below is the final implementation, with the form components on the left and the current context on the right, clearly showing when the context data gets updated.

Page

Please refer here for the full code.

Above are some ideas for multi-stage forms, using state machines to achieve single responsibility for forms while clearly separating and delegating logic to different parts.

This is my first time seriously writing a state machine on my own, and I’m still in the learning and exploration phase. If you have any questions, feel free to leave a comment. 😎

Happy coding. 🙏

The First Dive in Multi-Threaded Patterns

Here’s a little side note on Chapter 6, Multithreaded Patterns, of the book Multithreaded JavaScript.

This chapter introduces some multithreaded patterns:

  1. Thread Pool
  2. Mutex
  3. Ring Buffers
  4. Actor Model

Thread Pool

  • The thread pool is a very popular pattern that is used in most multithreaded applications in some form or another.
  • A thread pool is a collection of homogeneous worker threads that are each capable of carrying out CPU-intensive tasks that the application may depend on.
  • The libuv library that Node.js depends on provides a thread pool, defaulting to four threads, for performing low-level I/O operations.
  • This pattern might feel similar to distributed systems.
  • The thread pool discussion has two parts: pool size and dispatch strategies.
  • Typically, the size of a thread pool won’t need to dynamically change throughout the lifetime of an application.
  • With most operating systems there is not a direct correlation between a thread and a CPU core.
  • Having too many threads compared to the number of CPU cores can cause a loss of performance.
  • The constant context switching will actually make an application slower.
  • Threads in play: worker threads, the main thread, and a garbage collection thread (if using libuv).
  • Getting the number of available cores:
// browser
cores = navigator.hardwareConcurrency;
// Node.js
cores = require("os").cpus().length;
  • Don’t forget the main thread, so the total number of threads is n + 1.
  • Decide how many threads to use based on the workload:
    • Cryptocurrency miner that does 99.9% of the work in each thread and almost no I/O and no work in the main thread. Using the number of available cores as the size of the thread pool might be OK.
    • Video streaming and transcoding service that performs heavy CPU and heavy I/O. You may want to use the number of available cores minus two.
  • Reasonable starting point might be to use the number of available cores minus one and then tweak when necessary.
  • A naive approach might be to just collect tasks to be done, then pass them in once the number of tasks ready to be performed meets the number of worker threads and continue once they all complete.
  • However, each task isn’t guaranteed to take the same amount of time to complete.
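
The pool-size rules of thumb from the notes above can be written down as a tiny helper. This is only a sketch: suggestPoolSize and its reserved parameter are hypothetical names, and the right number still depends on measuring the actual workload:

```typescript
// Hypothetical helper: pick a worker count from the number of logical cores.
// "reserved" is how many cores to leave for the main thread and other work.
function suggestPoolSize(cores: number, reserved = 1): number {
  // Always keep at least one worker, even on a single-core machine.
  return Math.max(1, Math.floor(cores) - reserved);
}

console.log(suggestPoolSize(8)); // 7, the "cores minus one" starting point
console.log(suggestPoolSize(8, 2)); // 6, heavy CPU plus heavy I/O
console.log(suggestPoolSize(8, 0)); // 8, almost no work on the main thread
console.log(suggestPoolSize(1)); // 1
```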

Here’s a list of the most common strategies:

  • Round robin
    • Each task is given to the next worker in the pool, wrapping around to the beginning once the end has been hit.
    • The benefit of this is that each thread gets the exact same number of tasks to perform.
    • The drawback is an unfair distribution of work when tasks differ in cost.
    • The HAProxy reverse proxy refers to this as roundrobin.
  • Random
    • Each task is assigned to a random worker in the pool.
    • Possibly unfair distribution of work.
  • Least busy
    • When a new task comes along, it is given to the least busy worker.
    • When two workers tie for the least amount of work, one can be chosen randomly.
    • HAProxy refers to this as leastconn.
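
Two of the dispatch strategies above, round robin and least busy, can be sketched as pure selection functions that return the index of the worker to use. The function names are hypothetical, and the least-busy version breaks ties by taking the first candidate rather than a random one:

```typescript
// Round robin: hand out worker indices 0, 1, ..., n - 1, then wrap around.
function createRoundRobin(workerCount: number): () => number {
  let next = 0;
  return () => {
    const chosen = next;
    next = (next + 1) % workerCount;
    return chosen;
  };
}

// Least busy: pick the worker with the fewest in-flight tasks.
// inFlight[i] is the number of tasks worker i is currently running.
function pickLeastBusy(inFlight: number[]): number {
  let best = 0;
  for (let i = 1; i < inFlight.length; i++) {
    if (inFlight[i] < inFlight[best]) best = i;
  }
  return best;
}

const nextWorker = createRoundRobin(3);
console.log(nextWorker(), nextWorker(), nextWorker(), nextWorker()); // 0 1 2 0
console.log(pickLeastBusy([3, 1, 2])); // 1
```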

Mutex

  • Mutex means mutually exclusive lock.
  • A mechanism for controlling access to some shared data.
  • It ensures that only one task may use that resource at any given time.
  • A task acquires the lock in order to run code that accesses the shared data, and then releases the lock once it’s done.
  • The code between the acquisition and the release is called the critical section.
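
A minimal sketch of the acquire and release steps, assuming an environment where SharedArrayBuffer and Atomics are available (such as Node.js). createMutex is a hypothetical helper. It only offers a non-blocking tryAcquire; a blocking lock would additionally sleep with Atomics.wait inside a worker thread:

```typescript
const UNLOCKED = 0;
const LOCKED = 1;

// Hypothetical helper: the lock state lives in the first 32-bit slot of a
// SharedArrayBuffer, which in a real program is handed to every thread.
function createMutex(buffer: SharedArrayBuffer) {
  const state = new Int32Array(buffer);
  return {
    // Try to take the lock without blocking; returns true on success.
    tryAcquire(): boolean {
      // Atomically swap UNLOCKED for LOCKED; the old value tells us who won.
      return Atomics.compareExchange(state, 0, UNLOCKED, LOCKED) === UNLOCKED;
    },
    // Give the lock back; only valid while the lock is held.
    release(): void {
      if (Atomics.compareExchange(state, 0, LOCKED, UNLOCKED) !== LOCKED) {
        throw new Error("release() called on a lock that was not held");
      }
      // Wake one thread that may be sleeping in Atomics.wait on this slot.
      Atomics.notify(state, 0, 1);
    },
  };
}

const mutex = createMutex(new SharedArrayBuffer(4));
let sharedCounter = 0;
if (mutex.tryAcquire()) {
  try {
    sharedCounter += 1; // critical section
  } finally {
    mutex.release();
  }
}
```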

Ring Buffers

  • A ring buffer is an implementation of a first-in-first-out (FIFO) queue, implemented using a pair of indices into an array of data in memory.
  • The array is treated as if one end is connected to the other, creating a ring of data. This means that if these indices are incremented past the end of the array, they’ll go back to the beginning.
  • An analog in the physical world is the restaurant order wheel, commonly found in North American diners.
  • head index: The head index refers to the next position to add data into the queue.
  • tail index: The tail index refers to the next position to read data out of the queue from.
  • buffer capacity (length): The capacity of the buffer.
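Before moving on to the ring buffer chart, here is a minimal sketch of the mutex described earlier in this list, built on `SharedArrayBuffer` and `Atomics`. The class name `Mutex` and its method names are my own choices for illustration. Also note that `Atomics.wait` blocks the calling thread, and browsers do not allow it on the main thread, so `acquire()` is meant for worker threads (or Node.js).

```javascript
// The lock lives in one Int32 slot of shared memory.
const UNLOCKED = 0;
const LOCKED = 1;

class Mutex {
  // `shared` is an Int32Array backed by a SharedArrayBuffer,
  // `index` is the slot used as the lock.
  constructor(shared, index = 0) {
    this.shared = shared;
    this.index = index;
  }

  // Try once without blocking. Returns true if the caller now holds the lock.
  tryAcquire() {
    const previous = Atomics.compareExchange(this.shared, this.index, UNLOCKED, LOCKED);
    return previous === UNLOCKED;
  }

  // Block until the lock is free, then take it.
  acquire() {
    while (!this.tryAcquire()) {
      // Sleeps only while the slot still holds LOCKED.
      Atomics.wait(this.shared, this.index, LOCKED);
    }
  }

  release() {
    const previous = Atomics.compareExchange(this.shared, this.index, LOCKED, UNLOCKED);
    if (previous !== LOCKED) {
      throw new Error("release() called on a mutex that was not locked");
    }
    // Wake up one waiting thread, if there is any.
    Atomics.notify(this.shared, this.index, 1);
  }

  // Run `fn` as the critical section: acquire, run, always release.
  exec(fn) {
    this.acquire();
    try {
      return fn();
    } finally {
      this.release();
    }
  }
}

const lock = new Mutex(new Int32Array(new SharedArrayBuffer(4)));
const counter = { value: 0 };
lock.exec(() => {
  counter.value += 1; // only one thread at a time may run this part
});
```

In a real program the same `SharedArrayBuffer` would be passed to every worker, and each worker would wrap it in its own `Mutex` instance.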

Refer to the chart from the book:

[Figure: ring buffer]

  • When data is written into the buffer, the head index moves to the next position.
  • When data is read from the buffer, the tail index moves to the next position.
  • When the head or tail index is at the last position of the buffer, its next move goes back to the first position of the buffer.
  • Since it’s a RING buffer, there’s no start and end point. The starting position of the head and tail indices does not matter.
  • The tail index is always behind the head index, or at the same position as it.
  • When the buffer is FULL, there are two strategies for this situation:
    • Overwrite the oldest: Overwrite the oldest data in the buffer. It means that newer data is more important.
    • Prevent from writing: Throw an error, banning the new data from writing into the buffer.
  • It is ALWAYS necessary to get the oldest data in the buffer correctly.
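To make the notes above concrete, here is a small single-threaded sketch of a ring buffer. It follows the naming used in this post (head is the next write position, tail is the next read position). The class name `RingBuffer` and the `overwrite` option are my own choices, and the extra `size` counter is just one way to tell a full buffer from an empty one.

```javascript
class RingBuffer {
  constructor(capacity, { overwrite = false } = {}) {
    this.data = new Array(capacity);
    this.capacity = capacity;
    this.overwrite = overwrite; // which "buffer is full" strategy to use
    this.head = 0; // next position to write
    this.tail = 0; // next position to read (the oldest item)
    this.size = 0; // number of items currently stored
  }

  write(item) {
    if (this.size === this.capacity) {
      if (!this.overwrite) {
        throw new Error("buffer is full");
      }
      // Drop the oldest item, so that tail keeps pointing at the oldest remaining one.
      this.tail = (this.tail + 1) % this.capacity;
      this.size -= 1;
    }
    this.data[this.head] = item;
    this.head = (this.head + 1) % this.capacity; // wrap around at the end
    this.size += 1;
  }

  read() {
    if (this.size === 0) return undefined; // nothing to read
    const item = this.data[this.tail];
    this.data[this.tail] = undefined;
    this.tail = (this.tail + 1) % this.capacity; // wrap around at the end
    this.size -= 1;
    return item;
  }
}

const orders = new RingBuffer(3, { overwrite: true });
["A", "B", "C", "D"].forEach((order) => orders.write(order)); // "D" overwrites "A"
console.log(orders.read()); // "B", the oldest item that is still in the buffer
```

With `overwrite: false` (the default here), the fourth `write` would throw instead of replacing "A".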

Refer to wikis:

  • The useful property of a circular buffer is that it does not need to have its elements shuffled around when one is consumed.
  • The circular buffer is well-suited as a FIFO (first in, first out) buffer.
  • The non-circular buffer is well suited as a LIFO (last in, first out) buffer.
  • The idea of the stack in JavaScript matches the concept of LIFO.
  • Circular buffering makes a good implementation strategy for a queue that has fixed maximum size.
  • For arbitrarily expanding queues, a linked list approach may be preferred instead.
  • The actor model is a programming pattern for performing concurrent computation.
  • An actor is a primitive container that allows for executing code.
  • An actor is a first-class citizen in the Erlang programming language, but it can certainly be emulated using JavaScript.
  • An actor is capable of running logic, creating more actors, sending messages to other actors, and receiving messages.
  • No two actors are able to write to the same piece of shared memory, but they are free to mutate their own memory.
  • An actor is like a function in a functional language, accepting inputs and avoiding access to global state.
  • Actors are single-threaded.
  • A system that uses actors should be resilient to delays and out-of-order delivery, especially since actors can be spread across a network.
  • Individual actors can also have the concept of an address. For example, tcp://127.0.0.1:1234/3 might refer to the third actor running in a program on the local computer listening on port 1234.
  • With the actor pattern, you shouldn’t think of the joined actors as external APIs. Instead, think of them as an extension of the program itself.
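The notes above can be emulated in plain JavaScript. This is a very small, single-process sketch, not how Erlang or any real actor library works: `Actor`, `registry`, `send`, and the `local/...` addresses are all names I invented for illustration. Each actor owns its state, has an address, and only reacts to messages from its mailbox, one at a time.

```javascript
// Hypothetical address book: address -> actor.
const registry = new Map();

function send(address, message) {
  const actor = registry.get(address);
  if (actor) actor.receive(message); // unknown addresses are silently dropped
}

class Actor {
  // behavior: (state, message, ctx) => next state
  constructor(address, behavior, initialState) {
    this.address = address;
    this.behavior = behavior;
    this.state = initialState; // only this actor ever touches its state
    this.mailbox = [];
    this.scheduled = false;
    registry.set(address, this);
  }

  // Messages are only queued here; they are handled later, one at a time.
  receive(message) {
    this.mailbox.push(message);
    if (!this.scheduled) {
      this.scheduled = true;
      queueMicrotask(() => this.processMailbox());
    }
  }

  processMailbox() {
    try {
      while (this.mailbox.length > 0) {
        const message = this.mailbox.shift();
        this.state = this.behavior(this.state, message, { send, self: this.address });
      }
    } finally {
      this.scheduled = false;
    }
  }
}

// A collector actor that just remembers the values it is told about.
const received = [];
new Actor("local/collector", (state, message) => {
  received.push(message.value);
  return state;
}, null);

// A counter actor: its count can only be changed by sending it messages.
new Actor("local/counter", (count, message, ctx) => {
  if (message.type === "inc") return count + 1;
  if (message.type === "get") ctx.send(message.replyTo, { value: count });
  return count;
}, 0);

send("local/counter", { type: "inc" });
send("local/counter", { type: "inc" });
send("local/counter", { type: "get", replyTo: "local/collector" });
```

Once the queued messages have been processed, `received` holds the counter value. Across threads or a network, `send` would post the message to a worker or a socket instead of calling `receive` directly.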

Refer to the chart from the book:

[Figure: actor model]

E2E Testing Oriented Developing Process

One day, we get a mockup from the designer.

Then we start crafting the page. At first, we mostly work on the UI and UX of the components. That’s the main part. E2E testing is not a concern at this moment.

During development, if we are lucky enough to get the test cases from the QAs (or whoever wrote them), should we take them into account while crafting our components? (And yeah, only if we have enough time.)

By the way, the test cases I mention here are for E2E testing (automated testing), not for manual testing by humans.

To make writing E2E tests go smoothly, it is good to add some HTML attributes that the test cases plan to query. The ACTIONS in a test case, like clicking a button, typing something into an input, or getting text from a div element, tell us which elements we will want to query in the future.

But what if we haven’t got the test cases yet? Then we don’t know clearly which elements will be queried. And here comes the thought in my mind:

We might need some rules, so that we don’t have to guess every time. With rules, the coverage can be kept at an acceptable level.

So, I made some rules for myself. I will discuss them in several parts:

  1. Component specific
  2. Location specific
  3. State specific
  4. Invisible data specific
  5. Structure specific

In this post, I will use React components to explain the concept. In fact, it doesn’t matter which front-end framework you are using. It always comes down to HTML itself, whether it is JSX or not.

Use the data-test-page attribute on the outermost HTML tag: data-test-page='<PAGE_NAME>'. With the data-test-page attribute, we can easily know the BOUNDARY of the page component.

PS. The attribute name is up to you! I am just showing my own favorite choice. 🙃

For example, if we have a home page component:

function HomePage() {
  return <div data-test-page="home">{/* ... */}</div>;
}

Use the data-test-comp attribute on the outermost HTML tag: data-test-comp='<COMPONENT_NAME>'. With the data-test-comp attribute, we can easily know the BOUNDARY of the basic component.

function Button() {
  return <div data-test-comp="button">{/* ... */}</div>;
}

The component might be used many times. (That’s why we make it a component.) Thus, data-test-comp will be duplicated and not easy to tell apart. Say we use 5 button components on the same page, but we just want to query one specific button and click it. At this point, all we can do is get all the button elements, search for the label in each button, and finally click the one we want. The process might be tedious. So I wondered if there was a better way to do it. If we had a unique attribute (or a nearly unique one, meaning it is rarely used), it would be easier.

Therefore, we still need another FEATURE attribute for this case. (The attribute value had better be UNIQUE.) Use the data-test-feat attribute to tell us what the feature (or purpose) of the component is.

In my opinion, this attribute is optional. We can still query the right element with a CSS selector. Also, overusing it can lead to duplicate values very easily. We never remember every line of code we write, right? So strike a good balance.

There are 2 ways to do it. The first is to wrap the component in a container element:

function HomePage() {
  return (
    <>
      {/* ... */}
      <div data-test-feat="confirmBtn">
        <Button />
      </div>
      <div data-test-feat="cancelBtn">
        <Button />
      </div>
      {/* ... */}
    </>
  );
}

In the example above, we clearly know that there are two buttons on the home page: a confirm button and a cancel button.

If wrapping the component is annoying for you, the other way is to drop the container element and pass the attribute value into the component through props:

function HomePage() {
  return (
    <>
      {/* ... */}
      <Button dataTestFeat="confirmBtn" />
      <Button dataTestFeat="cancelBtn" />
      {/* ... */}
    </>
  );
}

In the Button component:

function Button({ dataTestFeat, btnText, labelText }) {
  return (
    <div data-test-comp="button">
      <label>
        {labelText}
        <button data-test-feat={dataTestFeat}>{btnText}</button>
      </label>
    </div>
  );
}
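On the test side, these attributes turn into plain CSS attribute selectors. Here is a small sketch of how I might build them. The helper names (`testSel`, `page`, `comp`, `feat`, `within`) are made up for this post, and the values are assumed to be simple names without quotes in them.

```javascript
// Build a CSS attribute selector such as [data-test-feat="confirmBtn"].
// Without a value it matches any element that has the attribute.
function testSel(kind, value) {
  const attr = `data-test-${kind}`;
  return value === undefined ? `[${attr}]` : `[${attr}="${value}"]`;
}

const page = (name) => testSel("page", name);
const comp = (name) => testSel("comp", name);
const feat = (name) => testSel("feat", name);

// Join selectors with the descendant combinator: "outer inner".
function within(...selectors) {
  return selectors.join(" ");
}

const confirmBtn = within(page("home"), feat("confirmBtn"));
console.log(confirmBtn);
// [data-test-page="home"] [data-test-feat="confirmBtn"]
```

The resulting string can be handed to whatever query function the E2E tool offers. In a browser console, `document.querySelector(confirmBtn)` is enough to check it.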

Sometimes, we may want to know the state of a component. Say we have a toggle button component like this:

[Figure: toggle button]

In a test case, we want to know, for example, that after clicking the toggle button it correctly changes to the OFF state.

We can still tell what the state is from its style. If you use SASS, that might not be a big problem. But as I have been using tailwindcss very often recently, it becomes not quite straightforward… Here is what the component looks like:

import { useState } from "react";

function ToggleButton() {
  const [isOn, setIsOn] = useState(false);
  const handleChange = () => setIsOn((prev) => !prev);
  return (
    <button
      type="button"
      className="rounded-full overflow-hidden relative w-16 h-7 shrink-0 text-white font-semibold text-sm uppercase ml-2.5 bg-neutral-500"
      onClick={handleChange}
    >
      {/* this is the only part that differs by state; the ternary avoids a stray "false" class when OFF */}
      <div className={`absolute transition left-1.5 top-1/2 -translate-y-1/2 ${isOn ? "translate-x-9" : ""}`}>
        <div className="rounded-full bg-white w-4 h-4 relative">
          <div className="absolute right-full top-1/2 -translate-y-1/2 px-2 whitespace-nowrap">On</div>
          <div className="absolute left-full top-1/2 -translate-y-1/2 px-2 whitespace-nowrap">Off</div>
        </div>
      </div>
    </button>
  );
}

Here, the CSS class translate-x-9 is the only style difference between the ON and OFF states. It is still possible to recognize, but it is hell to query a long class list like this… This happens especially when you use a utility-first CSS library like tailwindcss.

So in this situation, it is better to have a state attribute:

import { useState } from "react";

function Button({ btnText, labelText }) {
  const [isOn, setIsOn] = useState(false);
  const handleChange = () => setIsOn((prev) => !prev);
  return (
    <div data-test-comp="button">
      <label>
        {labelText}
        {/* the current state is exposed as an attribute for the tests */}
        <button data-test-state={isOn} onClick={handleChange}>
          {btnText}
        </button>
      </label>
    </div>
  );
}
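One thing to keep in mind when reading this attribute in a test: attribute values in the DOM are always strings, and as far as I know React writes a boolean passed to a data-* attribute as the text "true" or "false". So the test has to compare against strings. A tiny sketch, where `readTestState` is a name I made up and `el` can be any object with a `getAttribute` method (a real DOM element, for example):

```javascript
// Turn the data-test-state attribute back into a boolean.
// Returns null when the attribute is missing or holds something else.
function readTestState(el) {
  const raw = el.getAttribute("data-test-state");
  if (raw === "true") return true;
  if (raw === "false") return false;
  return null;
}

// A stand-in element, only so that this sketch runs outside a browser.
const fakeToggle = {
  attrs: { "data-test-state": "false" },
  getAttribute(name) {
    return name in this.attrs ? this.attrs[name] : null;
  },
};

console.log(readTestState(fakeToggle)); // false
```

In a test, this check would run once before the click and once after it, expecting the value to flip.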

Another common case is the loading state. Suppose that we have a search bar and a list component here:

function SearchPage() {
  // ...some state here
  return (
    <div data-test-page="searchPage">
      <div data-test-comp="searchBar">
        <input value={searchValue} data-test-feat="searchInput" />
        <button onClick={handleSearch} data-test-feat="searchBtn">
          Search
        </button>
      </div>
      <div data-test-comp="list" data-test-loading={isLoading}>
        {/* list data here */}
      </div>
    </div>
  );
}

PS. For readability, I used raw HTML instead of wrapped components.

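After clicking the search button, the page starts loading to fetch new data from the API. When the loading finishes, the list component is updated with the new data. With data-test-loading, the test can tell when the list is ready to be checked.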
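A test can use the loading attribute to wait before it reads the list. Most E2E tools come with their own waiting helpers, so the following is only a sketch of the idea. `waitFor` and `waitUntilLoaded` are hypothetical names, and `el` is anything with a `getAttribute` method.

```javascript
// Poll until check() returns true, or reject after `timeout` milliseconds.
function waitFor(check, { timeout = 2000, interval = 50 } = {}) {
  return new Promise((resolve, reject) => {
    const startedAt = Date.now();
    const tick = () => {
      if (check()) return resolve();
      if (Date.now() - startedAt >= timeout) {
        return reject(new Error("timed out while waiting"));
      }
      setTimeout(tick, interval);
    };
    tick();
  });
}

// Resolve once data-test-loading has become "false" on the given element.
function waitUntilLoaded(el, options) {
  return waitFor(() => el.getAttribute("data-test-loading") === "false", options);
}
```

In a test, the flow would be: click the element with data-test-feat="searchBtn", call `await waitUntilLoaded(listElement)`, then check the list content.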

Suppose that we have a card component with product info. To find a specific product quickly, it is nice to have the product id on the component. But the product id is not shown to the user, so we need to add it to the element as an attribute:

function ProductCard({ productId, ...restProps }) {
  return (
    <div data-test-comp="productCard" data-test-product-id={productId}>
      {/* product card content */}
    </div>
  );
}
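Because both attributes sit on the same element, the test can chain them into one compound selector (no space in between). A small sketch, where `byTestAttrs` and the id "p-1024" are made up for illustration, and the values are assumed not to contain quotes:

```javascript
// Build one compound selector from several data-test-* attributes
// that are expected on the same element.
function byTestAttrs(attrs) {
  return Object.entries(attrs)
    .map(([name, value]) => `[data-test-${name}="${value}"]`)
    .join("");
}

const card = byTestAttrs({ comp: "productCard", "product-id": "p-1024" });
console.log(card);
// [data-test-comp="productCard"][data-test-product-id="p-1024"]
```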

As for the naming rule here, I think any SEMANTIC name will be fine.

In some special situations, we use div elements to simulate specific HTML elements.

For example, for table-simulated elements, use the data-test-el attribute: data-test-el='<ELEMENT_NAME>'

function Table() {
  return (
    <div data-test-el="table">
      <div data-test-el="tbody">{/* ... */}</div>
    </div>
  );
}

This case might not be familiar to everyone, so let me explain why I had to use it.

I ran into some CSS issues with table-related elements (namely <table>, <thead>, <tbody>, etc.) in the Safari browser, so I had to use divs to rebuild the same structure as a table. Adding attributes makes the whole structure easier for me to read.

Sometimes, we need to understand a component’s structure quickly. Inserting attributes into the elements as flags improves the readability. Say we have a dialog component here:

function Dialog() {
  return (
    <div data-test-comp="dialog">
      <div data-test-el="dialogHeader">this is dialog header</div>
      <div data-test-el="dialogBody">this is dialog body</div>
      <div data-test-el="dialogFooter">this is dialog footer</div>
    </div>
  );
}

After adding the data-test-el attributes, the component structure is easier to understand at a glance.

Adding a lot of attributes while developing is a long journey. Besides, at that moment, test cases are not the biggest concern. With some rules, we can do the tedious tasks without thinking too much. After the development is finished, writing test cases should go smoothly. We might be thankful to our past selves.

And after all, these are just my personal thoughts, NOT an industry standard. 😎 I will be glad if someone benefits from them.

Cheers.