Note

2 posts with the tag “Note”

The First Dive in Multi-Threaded Patterns

Here’s a little side note on Chapter 6, Multithreaded Patterns, of the book Multithreaded JavaScript.

The chapter introduces several multithreaded patterns:

  1. Thread Pool
  2. Mutex
  3. Ring Buffers
  4. Actor Model

Thread Pool

  • The thread pool is a very popular pattern that is used in most multithreaded applications in some form or another.
  • A thread pool is a collection of homogeneous worker threads that are each capable of carrying out CPU-intensive tasks that the application may depend on.
  • The libuv library that Node.js depends on provides a thread pool, defaulting to four threads, for performing low-level I/O operations.
  • This pattern might feel similar to distributed systems.
  • The thread pool discussion is split into two parts: pool size and dispatch strategies.
  • Typically, the size of a thread pool won’t need to dynamically change throughout the lifetime of an application.
  • With most operating systems there is not a direct correlation between a thread and a CPU core.
  • Having too many threads compared to the number of CPU cores can cause a loss of performance.
  • The constant context switching will actually make an application slower.
  • Threads to account for: the worker threads, the main thread, and, depending on the runtime, extra ones such as a garbage collection thread (for example when using libuv).
// browser
cores = navigator.hardwareConcurrency;
// Node.js
cores = require("os").cpus().length;
  • Don’t forget the main thread: with n worker threads, the total is n + 1 threads.
  • Decide how many threads to use by purpose:
    • A cryptocurrency miner does 99.9% of the work in each thread, with almost no I/O and no work in the main thread. Using the number of available cores as the size of the thread pool might be OK.
    • A video streaming and transcoding service performs heavy CPU work and heavy I/O. You may want to use the number of available cores minus two.
  • A reasonable starting point might be to use the number of available cores minus one and then tweak when necessary.
  • A naive approach might be to just collect tasks to be done, then pass them in once the number of tasks ready to be performed meets the number of worker threads and continue once they all complete.
  • However, each task isn’t guaranteed to take the same amount of time to complete.
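
Before moving on to the strategies, the pool-size heuristics above can be written down as a minimal sketch. The helper name workerPoolSize and the workload labels are my own assumptions, not from the book. The core count would come from navigator.hardwareConcurrency or os.cpus().length.

```javascript
// Hypothetical helper: how many cores to leave free for each kind of workload.
const RESERVED_CORES = new Map([
  ["cpuOnly", 0],  // e.g. a miner: nearly all work happens in the workers
  ["default", 1],  // starting point: leave one core for the main thread
  ["cpuAndIo", 2], // e.g. transcoding: heavy CPU plus heavy I/O
]);

function workerPoolSize(cores, workload = "default") {
  const reserved = RESERVED_CORES.get(workload);
  if (reserved === undefined) {
    throw new Error(`unknown workload: ${workload}`);
  }
  // Always keep at least one worker, even on a single-core machine.
  return Math.max(1, cores - reserved);
}

console.log(workerPoolSize(8));             // 7
console.log(workerPoolSize(8, "cpuAndIo")); // 6
```

These numbers are only a starting point. The notes above suggest tweaking them after measuring the real application.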

Here’s a list of the most common strategies:

  • Round robin
    • Each task is given to the next worker in the pool, wrapping around to the beginning once the end has been hit.
    • The benefit of this is that each thread gets the exact same number of tasks to perform.
    • The drawback is that work can still be distributed unfairly, since tasks differ in how long they take.
    • The HAProxy reverse proxy refers to this as roundrobin.
  • Random
    • Each task is assigned to a random worker in the pool.
    • Possibly unfair distribution of work.
  • Least busy
    • When a new task comes along it is given to the least busy worker.
    • When two workers have a tie for the least amount of work, then one can be chosen randomly.
    • HAProxy refers to this as leastconn.
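
The three strategies above can be sketched as plain selection functions. This is only an illustration under my own assumptions: each worker is modelled as an object with an activeTasks counter, and the function names are hypothetical. A real pool would hold Worker instances and update the counters as tasks start and finish.

```javascript
// Round robin: hand out workers in order, wrapping around at the end.
function createRoundRobin(workers) {
  let next = 0;
  return () => {
    const worker = workers[next];
    next = (next + 1) % workers.length;
    return worker;
  };
}

// Random: pick any worker. `random` is injectable so the choice can be tested.
function pickRandom(workers, random = Math.random) {
  return workers[Math.floor(random() * workers.length)];
}

// Least busy: pick the worker with the fewest active tasks, breaking ties randomly.
function pickLeastBusy(workers, random = Math.random) {
  const fewest = Math.min(...workers.map((w) => w.activeTasks));
  const candidates = workers.filter((w) => w.activeTasks === fewest);
  return pickRandom(candidates, random);
}

const pool = [
  { id: 0, activeTasks: 2 },
  { id: 1, activeTasks: 0 },
  { id: 2, activeTasks: 5 },
];
const nextWorker = createRoundRobin(pool);
console.log(nextWorker().id, nextWorker().id, nextWorker().id, nextWorker().id); // 0 1 2 0
console.log(pickLeastBusy(pool).id); // 1
```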

Mutex

  • Mutex is short for mutually exclusive lock.
  • A mechanism for controlling access to some shared data.
  • It ensures that only one task may use that resource at any given time.
  • A task acquires the lock in order to run code that accesses the shared data, and then releases the lock once it’s done.
  • The code between the acquisition and the release is called the critical section.
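
As a rough sketch of the acquire / critical section / release cycle described above, here is a small mutex built on Atomics over an Int32Array backed by a SharedArrayBuffer. The class shape and the names UNLOCKED and LOCKED are my own assumptions, not necessarily the book’s code. Note that Atomics.wait blocks the calling thread, and browsers do not allow it on the main thread.

```javascript
const UNLOCKED = 0;
const LOCKED = 1;

class Mutex {
  // `shared` is an Int32Array backed by a SharedArrayBuffer; `index` is the lock slot.
  constructor(shared, index = 0) {
    this.shared = shared;
    this.index = index;
  }

  acquire() {
    // Atomically flip UNLOCKED -> LOCKED. If another thread holds the lock,
    // sleep until the slot changes, then try again.
    while (Atomics.compareExchange(this.shared, this.index, UNLOCKED, LOCKED) !== UNLOCKED) {
      Atomics.wait(this.shared, this.index, LOCKED);
    }
  }

  release() {
    if (Atomics.compareExchange(this.shared, this.index, LOCKED, UNLOCKED) !== LOCKED) {
      throw new Error("release() called on a mutex that is not locked");
    }
    // Wake one waiting thread, if any.
    Atomics.notify(this.shared, this.index, 1);
  }

  // Runs `criticalSection` while holding the lock and always releases afterwards.
  exec(criticalSection) {
    this.acquire();
    try {
      return criticalSection();
    } finally {
      this.release();
    }
  }
}

const lockSlot = new Int32Array(new SharedArrayBuffer(4));
const mutex = new Mutex(lockSlot);
const result = mutex.exec(() => "done inside the critical section");
console.log(result);
```

In a real program the same SharedArrayBuffer would be passed to each worker thread, and every thread would wrap it in its own Mutex instance.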

Ring Buffers

  • A ring buffer is an implementation of a first-in-first-out (FIFO) queue, implemented using a pair of indices into an array of data in memory.
  • The array is treated as if one end is connected to the other, creating a ring of data. This means that if these indices are incremented past the end of the array, they’ll go back to the beginning.
  • An analog in the physical world is the restaurant order wheel, commonly found in North American diners.
  • head index: The head index refers to the next position to add data into the queue.
  • tail index: The tail index refers to the next position to read data out of the queue from.
  • buffer capacity (length): The capacity of the buffer.

Refer to the chart from the book:

ring buffer

  • When data is written into the buffer, the head index moves to the next position.
  • When data is read from the buffer, the tail index moves to the next position.
  • When the head or tail index is at the last position of the buffer, its next move wraps around to the first position.
  • Since it’s a RING buffer, there’s no start or end point. The starting position of the head and tail indices does not matter.
  • The tail index is always behind, or at the same position as, the head index.
  • When the buffer is FULL, there are two strategies:
    • Overwrite the oldest: Overwrite the oldest data in the buffer. It means that newer data is more important.
    • Prevent writing: Throw an error, refusing to write the new data into the buffer.
  • A read must ALWAYS return the oldest data in the buffer correctly.
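
The rules above fit into a short single-threaded sketch. It follows the naming in these notes (head is the next write position, tail is the next read position). The class itself, its size counter, and the overwrite option are my own assumptions; the book’s version works on shared memory across threads, which this sketch does not attempt.

```javascript
class RingBuffer {
  constructor(capacity, { overwrite = false } = {}) {
    this.data = new Array(capacity);
    this.capacity = capacity;
    this.head = 0; // next position to write
    this.tail = 0; // next position to read
    this.size = 0; // number of unread items
    this.overwrite = overwrite;
  }

  write(value) {
    if (this.size === this.capacity) {
      if (!this.overwrite) {
        throw new Error("ring buffer is full");
      }
      // Overwrite strategy: drop the oldest item by advancing the tail.
      this.tail = (this.tail + 1) % this.capacity;
      this.size--;
    }
    this.data[this.head] = value;
    this.head = (this.head + 1) % this.capacity; // wrap around at the end
    this.size++;
  }

  // Returns the oldest unread item, or undefined when the buffer is empty.
  read() {
    if (this.size === 0) return undefined;
    const value = this.data[this.tail];
    this.tail = (this.tail + 1) % this.capacity; // wrap around at the end
    this.size--;
    return value;
  }
}

const orders = new RingBuffer(3, { overwrite: true });
[1, 2, 3, 4].forEach((n) => orders.write(n)); // 4 overwrites 1
console.log(orders.read(), orders.read(), orders.read()); // 2 3 4
```

Tracking size separately is one simple way to tell a full buffer from an empty one, since head and tail are equal in both cases.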

Notes from Wikipedia:

  • The useful property of a circular buffer is that it does not need to have its elements shuffled around when one is consumed.
  • The circular buffer is well-suited as a FIFO (first in, first out) buffer.
  • The non-circular buffer is well suited as a LIFO (last in, first out) buffer.
  • The idea of a stack in JavaScript matches the concept of LIFO.
  • Circular buffering makes a good implementation strategy for a queue that has fixed maximum size.
  • For arbitrarily expanding queues, a linked list approach may be preferred instead.

Actor Model

  • The actor model is a programming pattern for performing concurrent computation.
  • An actor is a primitive container that allows for executing code.
  • An actor is a first-class citizen in the Erlang programming language, but it can certainly be emulated using JavaScript.
  • An actor is capable of running logic, creating more actors, sending messages to other actors, and receiving messages.
  • No two actors are able to write to the same piece of shared memory, but they are free to mutate their own memory.
  • An actor is like a function in a functional language, accepting inputs and avoiding access to global state.
  • Actors are single-threaded.
  • A system that uses actors should be resilient to delays and out-of-order delivery, especially since actors can be spread across a network.
  • Individual actors can also have the concept of an address. For example, tcp://127.0.0.1:1234/3 might refer to the third actor running in a program on the local computer listening on port 1234.
  • With the actor pattern, you shouldn’t think of the joined actors as external APIs. Instead, think of them as an extension of the program itself.
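
To make the idea of private state plus message passing more tangible, here is a tiny single-threaded emulation. Everything in it (the ActorSystem class, the actor/N address format, the message shapes) is a hypothetical illustration of mine. A real setup would run actors in separate threads or processes and deliver messages asynchronously.

```javascript
class ActorSystem {
  constructor() {
    this.actors = new Map(); // address -> { behavior, state }
    this.mailbox = [];       // pending { to, message } entries
    this.nextId = 1;
  }

  // Creates an actor and returns its address.
  spawn(behavior, initialState) {
    const address = `actor/${this.nextId++}`;
    this.actors.set(address, { behavior, state: initialState });
    return address;
  }

  // Sending only queues the message; nothing runs until run() delivers it.
  send(to, message) {
    this.mailbox.push({ to, message });
  }

  // Delivers queued messages one at a time until the mailbox is empty.
  run() {
    while (this.mailbox.length > 0) {
      const { to, message } = this.mailbox.shift();
      const actor = this.actors.get(to);
      if (!actor) continue; // unknown address: the message is dropped
      const context = {
        self: to,
        send: (address, msg) => this.send(address, msg),
        spawn: (behavior, state) => this.spawn(behavior, state),
      };
      // Only the actor's own behavior ever touches its state.
      actor.state = actor.behavior(actor.state, message, context);
    }
  }
}

const system = new ActorSystem();
const log = [];

const printer = system.spawn((state, message) => {
  log.push(message.text);
  return state;
}, null);

const counter = system.spawn((count, message, context) => {
  if (message.type === "add") return count + message.amount;
  if (message.type === "report") {
    context.send(message.replyTo, { text: `count=${count}` });
  }
  return count;
}, 0);

system.send(counter, { type: "add", amount: 2 });
system.send(counter, { type: "add", amount: 3 });
system.send(counter, { type: "report", replyTo: printer });
system.run();
console.log(log); // [ 'count=5' ]
```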

Refer to the chart from the book:

actor model

E2E Testing Oriented Developing Process

One day, we get a mockup from the designer.

Then we start crafting the page. At first, we might work on the UI and UX of the components. That’s the main part. E2E testing is not a concern at this point.

During development, if we are lucky enough to get the test cases from QA (or whoever wrote them), should we take them into account when crafting our components? (And yeah, only if we have enough time.)

By the way, the test cases I mention here are for E2E testing (automated testing), not for manual testing.

To make writing E2E tests go smoothly, it helps to add some HTML attributes that the test cases plan to query. The ACTIONS in a test case, like clicking a button, typing into an input, or getting the text of a div element, tell us which elements we will want to query later.

But what if we haven’t got the test cases yet? Then we don’t know exactly which elements will be queried. And here comes the thought in my mind:

We might need some rules, so that we don’t have to guess every time. With rules, coverage stays at an acceptable level.

So, I made some rules for myself. I will break them into several parts:

  1. Component specific
  2. Location specific
  3. State specific
  4. Invisible data specific
  5. Structure specific

In this post, I will use React components to explain the concept. In fact, it doesn’t matter which front-end framework you are using. It always comes down to HTML itself, whether it is JSX or not.

Component specific

Use the data-test-page attribute on the outermost HTML tag: data-test-page='<PAGE_NAME>'. With the data-test-page attribute, we can easily see the BOUNDARY of the page component.

PS. The attribute name is up to you! I will show my own preferred choice. 🙃

For example, if we have a home page component:

function HomePage() {
  return <div data-test-page="home">{/* ... */}</div>;
}

Use the data-test-comp attribute on the outermost HTML tag: data-test-comp='<COMPONENT_NAME>'. With the data-test-comp attribute, we can easily see the BOUNDARY of a basic component.

function Button() {
  return <div data-test-comp="button">{/* ... */}</div>;
}

Location specific

A component might be used many times. (That’s why we make it a component.) Thus, data-test-comp values will be duplicated and hard to tell apart. Say we use 5 button components on the same page, but we only want to query a specific one and click it. At this point, all we can do is get every button element, search for the label inside each button, and finally click the one we want. That process is tedious, so I wondered whether there was a better way. If we had a unique attribute (or a nearly unique one, meaning rarely reused), it would be easier.

Therefore, we need another FEATURE attribute for this case. (The feature name should be UNIQUE.) Use the data-test-feat attribute to tell what the feature (or purpose) of the component is.

In my opinion, this attribute is optional. We can still query the right element with a CSS selector. Also, overusing it can easily lead to duplicate feature names. We never remember every line of code we write, right? So strike a good balance.

There are 2 ways to do it. The first is to wrap the component in a container element:

function HomePage() {
  return (
    <>
      {/* ... */}
      <div data-test-feat="confirmBtn">
        <Button />
      </div>
      <div data-test-feat="cancelBtn">
        <Button />
      </div>
      {/* ... */}
    </>
  );
}

In the example above, we can clearly see that the home page has two buttons: a confirm button and a cancel button.

If wrapping the component is annoying for you, the other way, without the container element, is to pass the feature name into the component through props:

function HomePage() {
  return (
    <>
      {/* ... */}
      <Button dataTestFeat="confirmBtn" />
      <Button dataTestFeat="cancelBtn" />
      {/* ... */}
    </>
  );
}

In the button component:

function Button({ dataTestFeat, btnText, labelText }) {
  return (
    <div data-test-comp="button">
      <label>
        {labelText}
        <button data-test-feat={dataTestFeat}>{btnText}</button>
      </label>
    </div>
  );
}
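
To see what the two approaches mean on the test side, here is a small sketch that builds the CSS selectors a test could pass to document.querySelector or to an E2E tool’s query function. The helpers testAttr and within are hypothetical names of mine, not part of any library.

```javascript
// Builds a CSS attribute selector such as [data-test-feat="confirmBtn"].
function testAttr(kind, value) {
  // Escape quotes and backslashes so the value is safe inside the selector.
  const escaped = String(value).replace(/["\\]/g, "\\$&");
  return `[data-test-${kind}="${escaped}"]`;
}

// Joins selectors with the descendant combinator (a space).
function within(...selectors) {
  return selectors.join(" ");
}

// Wrapper approach: the feature attribute sits on a container around <Button />.
const confirmWrapped = within(testAttr("feat", "confirmBtn"), testAttr("comp", "button"));

// Props approach: the feature attribute sits on the <button> inside the component.
const confirmViaProps = within(testAttr("comp", "button"), testAttr("feat", "confirmBtn"));

console.log(confirmWrapped);  // [data-test-feat="confirmBtn"] [data-test-comp="button"]
console.log(confirmViaProps); // [data-test-comp="button"] [data-test-feat="confirmBtn"]
```

Either way, the test no longer has to collect all buttons and compare labels. It can go straight to the one it wants.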

State specific

Sometimes, we may want to know the state of a component. Say we have a toggle button component like this:

toggle button

In a test case, we want to know that, for example, after the toggle button is clicked, it correctly changes to the OFF state.

We can still tell the state from its style. If you use SASS, that might not be a big problem. But as I have been using tailwindcss a lot recently, it becomes not quite straightforward… Here is what the component looks like:

import { useState } from "react";

function ToggleButton() {
  const [isOn, setIsOn] = useState(false);
  const handleChange = () => setIsOn((prev) => !prev);
  return (
    <button
      type="button"
      className="rounded-full overflow-hidden relative w-16 h-7 shrink-0 text-white font-semibold text-sm uppercase ml-2.5 bg-neutral-500"
      onClick={handleChange}
    >
      {/* the knob position is the only style difference between the states */}
      {/* a ternary avoids adding a literal "false" class in the OFF state */}
      <div className={`absolute transition left-1.5 top-1/2 -translate-y-1/2 ${isOn ? "translate-x-9" : ""}`}>
        <div className="rounded-full bg-white w-4 h-4 relative">
          <div className="absolute right-full top-1/2 -translate-y-1/2 px-2 whitespace-nowrap">On</div>
          <div className="absolute left-full top-1/2 -translate-y-1/2 px-2 whitespace-nowrap">Off</div>
        </div>
      </div>
    </button>
  );
}

Here, the CSS class translate-x-9 is the only style difference between the ON and OFF states. It is still possible to recognize, but querying a long class list like this is hell… This happens especially when you use a utility-first CSS library like tailwindcss.

So in this situation, it is better to have a state attribute:

function Button({ btnText, labelText }) {
  const [isOn, setIsOn] = useState(false);
  const handleChange = () => setIsOn((prev) => !prev);
  return (
    <div data-test-comp="button">
      <label>
        {labelText}
        <button data-test-state={isOn} onClick={handleChange}>
          {btnText}
        </button>
      </label>
    </div>
  );
}

Another common case is the loading state. Suppose that we have a search bar and a list component here:

function SearchPage() {
  // ...some state here
  return (
    <div data-test-page="searchPage">
      <div data-test-comp="searchBar">
        <input value={searchValue} data-test-feat="searchInput" />
        <button onClick={handleSearch} data-test-feat="searchBtn">
          Search
        </button>
      </div>
      <div data-test-comp="list" data-test-loading={isLoading}>
        {/* list data here */}
      </div>
    </div>
  );
}

PS. For readability, I used raw HTML instead of wrapped components.

After the search button is clicked, the list starts loading to fetch new data from the API. Once loading finishes, the list component is updated with the new data.
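
On the test side, this means waiting until the loading attribute flips back before reading the list. Most E2E tools ship their own waiting utilities, so the following is only a sketch of the idea. The helper name waitForValue and its options are my own assumptions, and I assume the attribute is rendered as the string "false" once loading is done.

```javascript
// Polls `readValue` until it returns `expected`, or rejects after `timeout` ms.
async function waitForValue(readValue, expected, { timeout = 2000, interval = 50 } = {}) {
  const deadline = Date.now() + timeout;
  while (true) {
    if (readValue() === expected) return;
    if (Date.now() >= deadline) {
      throw new Error(`Timed out waiting for value "${expected}"`);
    }
    // Sleep for one polling interval before checking again.
    await new Promise((resolve) => setTimeout(resolve, interval));
  }
}

// Hypothetical usage in a browser test, after clicking the search button:
//
//   const list = document.querySelector('[data-test-comp="list"]');
//   await waitForValue(() => list.getAttribute("data-test-loading"), "false");
//   // the list now holds the new data and is safe to assert on
```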

Invisible data specific

Suppose that we have a card component with product info. To find a specific product quickly, it is nice to have the product id on the component. But the product id is not exposed to the user, so we need to add it to the element as an attribute:

function ProductCard({ productId, ...restProps }) {
  return (
    <div data-test-comp="productCard" data-test-product-id={productId}>
      {/* product card content */}
    </div>
  );
}
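
A test can then read the hidden id back. In the DOM, a data-* attribute is also exposed through element.dataset under a camelCased key, so data-test-product-id becomes testProductId. The two helpers below are hypothetical names of mine that sketch this.

```javascript
// Converts "data-test-product-id" to its element.dataset key, "testProductId".
function datasetKey(attributeName) {
  if (!attributeName.startsWith("data-")) {
    throw new Error(`not a data attribute: ${attributeName}`);
  }
  // Drop the "data-" prefix, then turn each "-x" into "X".
  return attributeName.slice(5).replace(/-([a-z])/g, (_, letter) => letter.toUpperCase());
}

// Collects the product ids of all product cards under `root`
// (a document or an element in a browser test).
function productIds(root) {
  const cards = root.querySelectorAll('[data-test-comp="productCard"]');
  const key = datasetKey("data-test-product-id");
  return Array.from(cards, (card) => card.dataset[key]);
}

console.log(datasetKey("data-test-product-id")); // testProductId
```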

As for the naming rule here, I think any SEMANTIC name will be fine.

Structure specific

In some special situations, we use div elements to simulate specific HTML elements.

For example, for table-simulating elements, use the data-test-el attribute: data-test-el='<ELEMENT_NAME>'

function Table() {
  return (
    <div data-test-el="table">
      <div data-test-el="tbody">{/* ... */}</div>
    </div>
  );
}

This case might not be familiar to everyone, but I can explain why I had to use it.

I ran into some CSS issues with table-related elements (namely <table>, <thead>, <tbody>, etc.) in Safari, so I had to use divs to rebuild the same structure as a table. Adding the attributes makes the whole structure easier for me to read.

Sometimes, we need to understand a component’s structure quickly. Inserting attributes into elements as flags improves readability. Say we have a dialog component here:

function Dialog() {
  return (
    <div data-test-comp="dialog">
      <div data-test-el="dialogHeader">this is dialog header</div>
      <div data-test-el="dialogBody">this is dialog body</div>
      <div data-test-el="dialogFooter">this is dialog footer</div>
    </div>
  );
}

After adding the data-test-el attributes, the component structure can be understood at a glance.

Adding a lot of attributes during development is a long journey. Besides, at that moment, test cases are not the biggest concern. With some rules, we can do the tedious work without thinking too much. After development is finished, writing the test cases should go smoothly. We might be thankful to our past selves.

After all, these are just my personal thoughts, NOT an industry standard. 😎 I will be glad if someone benefits from them.

Cheers.