Optimizing REST API calls

Recently, we refactored our codebase at Basedash to fetch our server data with React Query and optimize our REST API calls in the process. The transition to React Query allowed for better code readability and the optimization of our API calls resulted in half the number of data fetching API calls and a 3x reduction in the amount of data loaded on initial page load.

This post describes what prompted the move to use React Query and the optimizations that were made to our REST API calls and routes in the process.

Problems with the current data fetching logic

Our data fetching logic was hard to follow, which led to a lot of bugs. We were using Redux and Redux thunks to coordinate fetching our data and storing it in our Redux store. The following pattern was commonly used to fetch data:

  • On component load, dispatch a thunk action in a useEffect to trigger the data fetching
  • The thunk will first dispatch an action indicating the request has been initiated
  • A reducer will update the store with a loading state in response to the dispatched action
  • Any components that should show a UI loading indication will be connected to the store with a useSelector and show a spinner accordingly
  • The thunk will make the API call to fetch the data
  • The thunk will dispatch an action if the API call is successful or if the API call returns an error
  • A reducer will update the store with the API data or the error and reset the loading state to false
  • Any components that rely on the API data will have a useSelector that gets the API data from the store and will update the UI accordingly

https://basedash-blog.s3.amazonaws.com/Optimizing%20REST%20API%20calls%2070f49e7eaa274e1eb8b07deff63fe315/FA7461B8-1AD5-48C4-9354-B4D5A776FDB5.jpeg
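The pattern above can be sketched roughly as follows. This is a minimal, dependency-free illustration rather than our actual store code: the action names, the TableState shape, and the fetchColumnsThunk helper are all hypothetical, and a hand-written Dispatch type stands in for Redux's own.

```typescript
// Hypothetical state and action shapes, for illustration only.
type TableState = { loading: boolean; data: string[] | null; error: string | null };

type TableAction =
  | { type: 'table/fetchStarted' }
  | { type: 'table/fetchSucceeded'; payload: string[] }
  | { type: 'table/fetchFailed'; error: string };

const initialState: TableState = { loading: false, data: null, error: null };

// Reducer: sets the loading flag, then stores either the data or the error.
function tableReducer(state: TableState = initialState, action: TableAction): TableState {
  switch (action.type) {
    case 'table/fetchStarted':
      return { ...state, loading: true, error: null };
    case 'table/fetchSucceeded':
      return { loading: false, data: action.payload, error: null };
    case 'table/fetchFailed':
      return { ...state, loading: false, error: action.error };
    default:
      return state;
  }
}

// Stand-in for Redux's dispatch function.
type Dispatch = (action: TableAction) => void;

// Thunk: announces the request, makes the API call, then reports the outcome.
// `load` is whatever function performs the network request.
const fetchColumnsThunk =
  (load: () => Promise<string[]>) =>
  async (dispatch: Dispatch): Promise<void> => {
    dispatch({ type: 'table/fetchStarted' });
    try {
      dispatch({ type: 'table/fetchSucceeded', payload: await load() });
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      dispatch({ type: 'table/fetchFailed', error: message });
    }
  };
```

Even in this stripped-down form, a single request touches three action types, a reducer, and a thunk, which gives a sense of how much coordination every new API call required.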

💡 If you follow the above pattern, check out createAsyncThunk from Redux Toolkit, which dispatches pending, fulfilled, and rejected actions for you. All you need to write is the data fetching and reject logic.

Things got more complicated if some API calls needed to happen before others.

We were also making a ton of API calls on initial page load to get all the data a page needed. In many cases it did not make sense to have those calls split up, so we were making unnecessary API calls.

We decided to take a shot at using React Query for our data fetching needs, since it seemed to have a nice API for data fetching and I'd been reading good things about how React Query makes it easy to keep your server data in sync on the client side.

While migrating over to React Query we also decided to optimize the number of API calls we made and also tried to prevent sending any unnecessary data from the server if it was not needed for the UI.

Analyzing current state of affairs

The refactor to React Query started with analyzing the API calls each page of the application was making and thinking about what the optimal data fetching scenario would be.

Let's take the following Basedash page as an example:

https://basedash-blog.s3.amazonaws.com/Optimizing%20REST%20API%20calls%2070f49e7eaa274e1eb8b07deff63fe315/Untitled.png

This page has 23 API calls related to data fetching. Some of the API calls were requesting data that's not required for the page's UI (e.g. billing information and user activities used on the activity page). Some of this data was saved in a normalized Redux store which we could then take advantage of to save API calls in the future whenever that data is required in the UI.

As you can see in the screenshot, there is a table being shown, and the data for that table was gathered through 4 separate API calls:

  • columns: GET request that fetches all the columns for the table
  • foreign-keys: GET request that fetches all the foreign keys for the table
  • enum-values: GET request that fetches all the enum values for columns that are of an enum type
  • records: POST request that fetches the table records

We saw that the first 3 API calls could be combined into 1, so we reworked the API so that the following API calls are used to get the table data:

  • table: GET request that fetches the table columns, foreign keys, and enum values
  • records: POST request that fetches the table records
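To make the combined route more concrete, here is a rough sketch of how a server might assemble a single table payload out of the three pieces that used to be separate requests. The type names, field names, and the buildTableResponse helper are assumptions made for illustration, not our actual API.

```typescript
// Hypothetical shapes; the field names are illustrative.
interface Column { name: string; type: string }
interface ForeignKey { column: string; referencesTable: string; referencesColumn: string }
type EnumValues = Record<string, string[]>; // enum column name -> allowed values

interface TableResponse {
  columns: Column[];
  foreignKeys: ForeignKey[];
  enumValues: EnumValues;
}

// Before: three GET requests (columns, foreign-keys, enum-values).
// After: one payload returned by a single GET request to the table route.
function buildTableResponse(
  columns: Column[],
  foreignKeys: ForeignKey[],
  allEnumValues: EnumValues
): TableResponse {
  // Only include enum values for columns that are actually of an enum type.
  const enumValues: EnumValues = {};
  for (const column of columns) {
    const values = allEnumValues[column.name];
    if (column.type === 'enum' && values !== undefined) {
      enumValues[column.name] = values;
    }
  }
  return { columns, foreignKeys, enumValues };
}
```

The client then needs one round trip and one loading state for the table metadata instead of three.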

Following the same process as above we combined API routes that could logically be grouped into a single route. We also took note of which API calls needed to happen on which pages so we could avoid fetching data that was not necessary for the currently loaded page.

ℹ️ It's actually not a bad idea if you fetch data that isn't used on the current page if you intend to cache it to prevent API calls further down the line. This is especially true for data that is highly likely to be requested by the user during their session. You should be able to use React.lazy and/or react-loadable to preload some pages/components.

One cool thing to look out for is the prefers-reduced-data media query, which will allow developers to decide whether to do this kind of preloading based on a user's preference for loading data. If you have unlimited internet and Gigabit download speeds, then you'll surely not mind preloading data, but if you are on a 3G connection with a limited data plan, then you would prefer not to preload data you might not need.
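A check along these lines could gate any preloading. This is a small sketch: shouldPreload is a hypothetical helper, and the matchMedia function is passed in as a parameter so the logic can also run where there is no window object. Browser support for prefers-reduced-data has been limited, and browsers that do not recognize the query report no match, in which case this sketch falls back to preloading.

```typescript
// Minimal shape of what window.matchMedia returns; only `matches` is used here.
type MatchMedia = (query: string) => { matches: boolean };

// Returns true when it seems acceptable to preload data the user may not need.
function shouldPreload(matchMedia: MatchMedia | undefined): boolean {
  // No way to ask (for example on the server): keep the default behaviour.
  if (matchMedia === undefined) return true;
  // Skip preloading when the user has asked for reduced data usage.
  return !matchMedia('(prefers-reduced-data: reduce)').matches;
}
```

In a browser, this could be called as shouldPreload((query) => window.matchMedia(query)) before kicking off any speculative requests.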

React Query structure

The majority of the React Query data fetching code makes use of the useQuery and useMutation hooks. Since these hooks can be re-used within different components and we wanted to reduce the amount of duplicate code (especially with TypeScript types), we've created custom hooks that wrap useQuery and useMutation and make sure to properly type the expected params, options, errors, and data related to the hooks.

Here's an example of one of those custom hooks:

export const useApiTable = (
  params: FetchTableParams,
  options?: UseQueryOptions<
    FetchTableResponse,
    ApiError,
    FetchTableResponse,
    [string, FetchTableParams]
  >
) =>
  useQuery<
    FetchTableResponse,
    ApiError,
    FetchTableResponse,
    [string, FetchTableParams]
  >(
    ['table', params],
    async ({ queryKey }) => {
      const [_key, params] = queryKey;
      const response = await fetchTable(params);
      if (!response.ok) {
        if (response.status === 400) {
          throw new ApiError(await response.json());
        }
        throw new ApiError('Network response was not ok');
      }
      return response.json();
    },
    options
  );

In some cases, we also use the queryClient.fetchQuery method to fetch queries, so we sometimes go a step further and extract the query function into its own function that can be used with queryClient.fetchQuery.

Refactoring the above custom hook to use an extracted query function looks like this:

export const useApiTableQueryFunction: QueryFunction<
  FetchTableResponse,
  [string, FetchTableParams]
> = async ({ queryKey }) => {
  const [_key, params] = queryKey;
  const response = await fetchTable(params);
  if (!response.ok) {
    if (response.status === 400) {
      throw new ApiError(await response.json());
    }
    throw new ApiError('Network response was not ok');
  }
  return response.json();
};

export const useApiTable = (
  params: FetchTableParams,
  options?: UseQueryOptions<
    FetchTableResponse,
    ApiError,
    FetchTableResponse,
    [string, FetchTableParams]
  >
) =>
  useQuery<
    FetchTableResponse,
    ApiError,
    FetchTableResponse,
    [string, FetchTableParams]
  >(['table', params], useApiTableQueryFunction, options);

The process for fetching data with React Query looks like this:

  • Create a custom hook that uses useQuery and give it a query key and a query function that specifies the network request that must be made
  • Use the custom hook in a component
  • The custom hook returns an isLoading property that will be true while the data is being fetched and false once the API call has completed
  • The custom hook returns a data property that will contain the API data once the request has successfully completed and also returns an error property if the API call failed or returned an error.
  • Perform actions that should happen after a query/mutation using the onSuccess and onError callbacks.

https://basedash-blog.s3.amazonaws.com/Optimizing%20REST%20API%20calls%2070f49e7eaa274e1eb8b07deff63fe315/39A5D709-68E4-472F-A8CC-428A46064B33.jpeg

React Query makes it easy to do some cool things related to data fetching, such as retrying failed API calls a set number of times, refetching queries when a user re-focuses the window, cancelling queries, and more.
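Retries and refetch-on-focus are configured through query options. The snippet below is a sketch of what such an options object might look like: retry, retryDelay, and refetchOnWindowFocus are React Query option names, while the tableQueryOptions name and the specific values are only illustrative.

```typescript
// Option names are React Query's; the values are illustrative.
const tableQueryOptions = {
  // Retry a failed request up to 3 times before surfacing the error.
  retry: 3,
  // Exponential backoff between attempts: 1s, 2s, 4s, ... capped at 30s.
  retryDelay: (attemptIndex: number) => Math.min(1000 * 2 ** attemptIndex, 30000),
  // Refetch the query when the user comes back to the window.
  refetchOnWindowFocus: true,
};
```

An object like this could be passed as the options argument of a custom hook such as useApiTable.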

When updating API data using React Query mutations, we often make optimistic updates by using queryClient.setQueryData in the onMutate option passed to the useMutation hook. This allows us to update the UI data immediately without having to wait for the API call to complete. If the API call fails, we revert the optimistic update in the onError callback function.
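The snapshot-and-rollback idea behind an optimistic update can be sketched without any library. In the example below, a plain Map stands in for the React Query cache (real code would read and write through queryClient.getQueryData and queryClient.setQueryData), and applyOptimisticUpdate and rollbackOptimisticUpdate are hypothetical helpers playing the roles of the onMutate and onError callbacks.

```typescript
// A Map stands in for the React Query cache in this sketch.
type Cache = Map<string, unknown>;

// What onMutate hands to onError so the change can be undone.
interface MutationContext<T> { previous: T | undefined }

// Role of onMutate: snapshot the current value, then write the optimistic one.
function applyOptimisticUpdate<T>(
  cache: Cache,
  key: string,
  update: (current: T | undefined) => T
): MutationContext<T> {
  const previous = cache.get(key) as T | undefined;
  cache.set(key, update(previous));
  return { previous };
}

// Role of onError: restore the snapshot taken before the optimistic write.
function rollbackOptimisticUpdate<T>(cache: Cache, key: string, context: MutationContext<T>): void {
  if (context.previous === undefined) {
    cache.delete(key);
  } else {
    cache.set(key, context.previous);
  }
}
```

The important detail is that the snapshot is taken before the optimistic write, so a failed request can put the cache back exactly as it was.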

In some cases, we don't manually patch the React Query state after a mutation, but instead invalidate queries so that they get refetched. A rule of thumb we use is to patch the React Query state if the data is currently visible in the UI and invalidate the queries for data that is found on pages that aren't currently being viewed. This way, the UI updates instantly on the current page and we don't have to worry about manually patching the React Query state for a bunch of queries (which is a lot of work).
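This rule of thumb can be written down as a small helper. It is only a sketch: planCacheUpdates and the string query names are hypothetical. In real code, the patch list would be handled with queryClient.setQueryData and the invalidate list with queryClient.invalidateQueries.

```typescript
type CachePlan = { patch: string[]; invalidate: string[] };

// Splits the queries affected by a mutation into those to patch by hand
// (currently visible) and those to simply invalidate (not on screen).
function planCacheUpdates(affectedQueries: string[], visibleQueries: string[]): CachePlan {
  const plan: CachePlan = { patch: [], invalidate: [] };
  for (const query of affectedQueries) {
    if (visibleQueries.indexOf(query) !== -1) {
      plan.patch.push(query);
    } else {
      plan.invalidate.push(query);
    }
  }
  return plan;
}
```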

Beware of combining too much data into one API route

We ran into an issue with one of the API calls we had refactored to return a large amount of data required on initial page load (such as all the items needed in the sidebar). In some cases, the API had trouble resolving a subset of the data returned from that route, and because of a timeout/retry mechanism we had on the server, the API call took 40+ seconds to resolve.

This problem meant that users saw a loading screen for 40+ seconds because the API call had trouble resolving a subset of the data it was supposed to return.

The more data you move into an API call, the more points of failure you introduce for a single API call, which can be problematic if a large portion of your UI depends on that API call.

Also, in terms of error handling, it is not obvious what part of the API response caused the error. You would need to return a more precise error message explaining what part of the API response was causing the problem and the client would have to know how to parse that error to show appropriate errors in the UI.

If you split up your API calls, you're better positioned to display data in your UI and error messages for sections of your UI that are unable to display data due to API errors.
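As a rough sketch of what this looks like on the client, each request can be settled independently so that one failing section does not block the others. Everything here (settle, toSectionState, loadPage, and the section names) is hypothetical; in modern runtimes Promise.allSettled provides the same behaviour as the settle helper.

```typescript
type Settled<T> =
  | { status: 'fulfilled'; value: T }
  | { status: 'rejected'; reason: unknown };

type SectionState<T> =
  | { kind: 'ready'; data: T }
  | { kind: 'error'; message: string };

// Never rejects: converts a promise into a value describing how it ended.
function settle<T>(promise: Promise<T>): Promise<Settled<T>> {
  return promise.then(
    (value): Settled<T> => ({ status: 'fulfilled', value }),
    (reason: unknown): Settled<T> => ({ status: 'rejected', reason })
  );
}

// Maps one request's outcome to what that section of the UI should render.
function toSectionState<T>(result: Settled<T>): SectionState<T> {
  if (result.status === 'fulfilled') {
    return { kind: 'ready', data: result.value };
  }
  const message = result.reason instanceof Error ? result.reason.message : 'Something went wrong';
  return { kind: 'error', message };
}

// Hypothetical page loader: the sidebar and the table are requested separately,
// so a failure in one still lets the other render.
async function loadPage(
  fetchSidebar: () => Promise<string[]>,
  fetchTable: () => Promise<string[]>
) {
  const [sidebar, table] = await Promise.all([settle(fetchSidebar()), settle(fetchTable())]);
  return { sidebar: toSectionState(sidebar), table: toSectionState(table) };
}
```

With one large combined route there is a single outcome for the whole page; with separate calls each section gets its own ready or error state.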

Another benefit of not serving too much data in a single API call, specifically in the context of React Query, is that if you need to invalidate queries, you can be more precise in which query needs to be invalidated and so you can help prevent too much "overfetching" of data.

ℹ️ GraphQL APIs are quite nice with respect to being able to pick and choose what data you want to request in a single API call. There are also a number of ways to handle errors in GraphQL, although it is not really standardized.

Normalized cache and over-fetching data

With Redux, you're able to have a normalized cache for your data. This means that you can save your data in the cache in a way that avoids duplication and gives you a single point of reference for your data "entities".

To give an example of what I mean, say we take a Twitter-like application that will show a list of "tweets" on one page and then you can click on individual tweets to see replies to a tweet.

https://basedash-blog.s3.amazonaws.com/Optimizing%20REST%20API%20calls%2070f49e7eaa274e1eb8b07deff63fe315/136F41CD-0A13-478F-AF33-64AE81917345.jpeg

If you load the app and view the list of tweets, your normalized cache would look something like this after the data has been fetched:

const store = {
  tweets: {
    ids: [1, 2, 3],
    entities: {
      1: {
        message: 'Hello world',
        replyCount: 8,
        likes: 30,
      },
      2: {
        message: 'Goodby world',
        replyCount: 12,
        likes: 28,
      },
      3: {
        message: 'YOLO',
        replyCount: 32,
        likes: 1003,
      },
    },
  },
};

Now, let's say a user clicks on the tweet with ID 1 to open it up on its own page. You would be able to show the tweet immediately in the UI since the tweet with ID 1 is already in the cache. You would just need to make an API call to load the replies to the tweet. Now let's say you were to "like" the tweet. This would increase its like count from 30 to 31.

const store = {
  tweets: {
    ids: [1, 2, 3],
    entities: {
      1: {
        message: 'Hello world',
        replyCount: 8,
        likes: 31,
      },
      2: {
        message: 'Goodby world',
        replyCount: 12,
        likes: 28,
      },
      3: {
        message: 'YOLO',
        replyCount: 32,
        likes: 1003,
      },
    },
  },
};

After "liking" the tweet, the like count will update instantly in the UI. Now if the user navigates back to the list of tweets page, you could skip making an API call to fetch the data since there is already a list of tweets in the cache. Also, the like count for the tweet with ID 1 will also be correctly displayed with 31 likes since the tweet data references the same cache entity as the one that was used on the individual tweet page.

This idea of having a single point of reference for an entity is what I mean by a normalized cache and it is incredibly useful for updating your application's UI without refetching data from your server.
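Here is a small sketch of that single point of reference in code, using the tweet store shape from above. The likeTweet, selectTweetList, and selectTweet helpers are hypothetical names for what would be a reducer case and two selectors in a Redux app.

```typescript
interface Tweet { message: string; replyCount: number; likes: number }
interface Store { tweets: { ids: number[]; entities: Record<number, Tweet> } }

// Returns a new store with one more like on the given tweet (no mutation).
function likeTweet(store: Store, id: number): Store {
  const tweet = store.tweets.entities[id];
  if (tweet === undefined) return store;
  return {
    tweets: {
      ...store.tweets,
      entities: { ...store.tweets.entities, [id]: { ...tweet, likes: tweet.likes + 1 } },
    },
  };
}

// The list page and the single tweet page both read from the same entities.
const selectTweetList = (store: Store): Tweet[] =>
  store.tweets.ids
    .map((id) => store.tweets.entities[id])
    .filter((tweet): tweet is Tweet => tweet !== undefined);

const selectTweet = (store: Store, id: number): Tweet | undefined => store.tweets.entities[id];
```

Because both selectors resolve to the same entity, a single update is enough for every page that displays the tweet.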

With React Query, you lose out on the ability to have a normalized cache, so you either need to invalidate any query related to the data you update in your UI, or you need to update all the different references to the same entity in your React Query store.
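To illustrate the second option, the sketch below uses a plain Map as a stand-in for the React Query cache, with the same tweet stored under two different query keys. The key names and the patchTweetEverywhere helper are hypothetical; real code would go through queryClient.setQueryData for each affected query.

```typescript
interface Tweet { id: number; message: string; likes: number }

// A Map keyed by a serialized query key stands in for the React Query cache.
// The same tweet can live in a list query and in a single-tweet query.
type QueryCache = Map<string, Tweet | Tweet[]>;

// Patch every cached copy of the tweet, whether stored alone or inside a list.
function patchTweetEverywhere(cache: QueryCache, updated: Tweet): void {
  cache.forEach((value, key) => {
    if (Array.isArray(value)) {
      cache.set(key, value.map((tweet) => (tweet.id === updated.id ? updated : tweet)));
    } else if (value.id === updated.id) {
      cache.set(key, updated);
    }
  });
}
```

Every query that might hold a copy has to be visited, which is the manual bookkeeping a normalized cache would otherwise do for you.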

Having to invalidate many queries will lead to over-fetching data, but it does guarantee that your client data is in sync with your server data. You also don't need to re-implement any fancy server-side logic you might have in order to update your client-side store correctly.

The creator of React Query, Tanner Linsley, has shared some thoughts on this issue on Twitter.
