Report: The 2023 State of Databases
Databases power much of the modern tech industry. And, by extension, the global economy.
You’d think that would make them the subject of countless industry studies, but the conversation around databases is actually relatively quiet.
We’re working hard to change that. Last year we released the first ever State of Databases report, where we surveyed thousands of developers about their favorite database tech.
Today I’m excited to release our 2023 edition. This year we partnered with our friends at Prisma, Timescale, Airplane, PlanetScale, SingleStore and Weaviate. The vast majority of respondents are developers, founders and engineering leaders.
What did we find? To start, it’s clear the world of databases is as diverse as ever, with old products existing alongside new ones. For instance, many developers still work with apps powered by OracleDB - a forty-year-old product - while thousands of others have flocked to a Postgres setup on Supabase, a company that’s less than five years old.
This year’s edition has fresh data on some familiar questions (SQL vs. NoSQL), along with topics that feel very 2023 (looking at you, vector databases).
We also cover related products like analytics, internal & admin tools, ORMs and hosting providers.
If you’re looking to learn what backend to use for your new project, thinking about how to make your company more productive, or merely curious to learn what developers value in data products, this report is for you.
Read on to learn about the State of Databases in 2023. You can also access the full report, including most of the raw data, at this link.
Methodology and biases
This year we collected data and ranked products based on the feedback respondents gave on each tool. Namely: how they would rate it, and whether they had used it, had heard of it, and planned to use it again in the future. We also collected demographic data (e.g. years of experience, company size, industry) and opinion data on categories as a whole (e.g. “How happy are you with the state of databases?”). Results were collected through a public online survey shared on social media.
Since some products are far better known than others, we noticed a popularity bias when compiling the data. Products with a low sample size had average ratings that were disproportionately high or low relative to other tools in the survey, which skewed the rankings. We corrected for this by applying a Bayesian average with C=30 and m=3.7 (the average rating across all tools). Each final rating is therefore a balance between the tool’s own average rating and the average rating of all tools in the survey, with the adjustment having more of an effect on tools with fewer ratings.
This is the same way IMDb ranks movies, which is why beloved classics like Batman outrank the art house movies your film buff friends are obsessed with.
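To make the adjustment concrete, here is a minimal Python sketch of a Bayesian average. It assumes the textbook form of the formula, (C × m + sum of ratings) / (C + number of ratings), with the C=30 and m=3.7 values quoted above. The function name and the sample ratings are hypothetical, and the exact calculation used for the report may differ in detail.

```python
def bayesian_average(ratings, prior_weight=30, prior_mean=3.7):
    """Shrink a tool's mean rating toward the survey-wide mean.

    prior_weight is C (how many "phantom" ratings at the prior mean to add)
    and prior_mean is m (the average rating across all tools).
    Assumes the standard Bayesian average formula, which is an assumption
    about how the report's numbers were computed.
    """
    n = len(ratings)
    return (prior_weight * prior_mean + sum(ratings)) / (prior_weight + n)


# Hypothetical data: a niche tool with five perfect scores and a
# widely used tool with 500 ratings averaging 4.5.
niche_tool = [5.0] * 5
popular_tool = [4.5] * 500

print(round(bayesian_average(niche_tool), 2))
print(round(bayesian_average(popular_tool), 2))
```

Under these assumptions the niche tool’s raw 5.0 is pulled down to roughly 3.89, while the popular tool’s 4.5 barely moves (about 4.45), which is the behavior described above: the fewer ratings a tool has, the more its score leans on the overall average.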
Our partners for this year shared the survey with their users, and we obviously asked our users to complete the survey as well (10% of respondents were existing Basedash users). Rankings and stats in the survey results were compiled as averages and percentages, meaning that tools with more respondents (including us and our partners) weren’t inherently favored.
We also published the full aggregated response data in a public Basedash workspace which you can access here.
Databases
Postgres dominates, but PlanetScale is not far behind
PostgreSQL remains the clear winner in usage with over half of all respondents currently using Postgres in production. It also has the highest rating of all tools (across all categories that we surveyed) at 4.49, but PlanetScale (based on MySQL) is right behind at 4.48. PlanetScale still has a lot of room to grow, so we’ll be keeping a close eye on this rivalry in next year’s report.
Consistent with last year, newcomers like Timescale, ClickHouse, CockroachDB, and SurrealDB are becoming much more widespread. One interesting trend is that many of these newer database companies are becoming more opinionated around hosting and management. Some companies like Timescale and ClickHouse offer official managed hosting as an option, while others like PlanetScale are completely built around managed hosting.
More specialized databases like Redis and SQLite remain popular with high usage and rating.
2023 is the year of the vector
This year we’re also seeing an explosion in the usage of vector databases. These data sources are used to store and index massive unstructured datasets, making them ideally suited for working with LLMs. Given the rise of AI models and related products this year, it’s no surprise that these databases are growing in popularity.
Many newer vector databases like Pinecone and Weaviate made huge gains in popularity this year, and raised some massive funding rounds. Established databases also expanded their support for vector data through tools like pgvector, a Postgres extension that adds vector storage and similarity search. Among the vector-specialized databases, Weaviate is by far the most highly rated, coming in at 4th out of all databases surveyed.
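For readers new to the category, the core operation a vector database performs is finding the stored vectors closest to a query vector. The sketch below is a toy illustration in plain Python, not how any of the products above are implemented: the three-dimensional "embeddings" and document labels are made up, and the search is a brute-force scan using cosine similarity, whereas production systems typically rely on approximate nearest-neighbor indexes over much larger, higher-dimensional data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def nearest(query, items, k=2):
    """Return the labels of the k stored vectors most similar to the query."""
    scored = [(cosine_similarity(query, vec), label) for label, vec in items.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [label for _, label in scored[:k]]


# Hypothetical embeddings; real ones come from an embedding model.
documents = {
    "postgres tuning guide": [0.9, 0.1, 0.0],
    "mysql replication notes": [0.8, 0.2, 0.1],
    "banana bread recipe": [0.0, 0.1, 0.9],
}

print(nearest([1.0, 0.0, 0.0], documents))
```

With these made-up vectors, the two database documents rank ahead of the recipe, because their vectors point in nearly the same direction as the query.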
SQL or NoSQL?
SQL databases are still the most popular database type, with over half of respondents stating that they would only work with SQL databases if given the choice.
NoSQL still has some fans, but the average developer is much more likely to prefer working with both SQL and NoSQL than to prefer NoSQL exclusively: 40% of respondents said they like both, whereas only 6% prefer NoSQL over a relational database.
What makes a database worth using?
We asked developers what factors they care most about when selecting a database. Respondents told us that performance, developer experience and quality of documentation are the three most important factors they take into account, with cost not far behind.
Other tools
Since developers will naturally turn to a host of related tools when working with databases, we surveyed respondents about their favorite data warehouses, ORMs, analytics products and internal tools.
Data warehouses
Many respondents came from large or rapidly scaling companies, where data teams write massive multi-table queries as part of their analytical work. That’s where data warehouses come in.
Perhaps unsurprisingly, Snowflake continues to rise in both usage and overall popularity. Google BigQuery is an ever-reliable alternative, whereas Amazon’s Redshift has experienced a lag in user satisfaction.
ORMs & clients
We also asked about the tools and libraries used to access and query the data in databases. We grouped these under the category of “ORMs”, but included some other major database querying tools which are not strictly ORMs (e.g. Supabase’s instant APIs).
Supabase is the most highly rated tool in this category, which matches the breathtaking surge in usage you hear about when talking to developers spinning up new projects.
That being said, Prisma by far had the most usage among developers, with nearly 30% of all respondents (across all backend languages) currently using it in production.
A new rivalry?
This year also saw a new rivalry between Prisma and newcomer Drizzle. With the rise of Next.js and serverless architectures, Prisma’s cold start performance became a huge talking point. Drizzle took the opportunity to hop in the ring and offer an alternative with incredibly fast cold start performance. As a newer product, Drizzle is still early in its usage, but the fact that its rating (4.11) edges out Prisma (4.06) says a lot.
Hosting providers
What good is all our data and tooling if we don't have a reliable place to host and serve our data and apps?
It’s no surprise that AWS was the most widely heard of, but it’s interesting to see PlanetScale and Supabase take the number 1 and 2 spots in terms of overall ratings. Cloudflare comes in a close third.
Many of the newer database companies offer managed hosting alongside their database products. Products like MongoDB Atlas, Timescale Cloud, and ClickHouse Cloud are optional managed hosting products, while PlanetScale is primarily designed to be used as a managed, hosted database.
Internal & admin tools
Building tools on top of databases is inevitable in many companies. So is the need for running analysis for business purposes. Odds are your colleagues do operational work with company data, and these tools make it easier.
Basedash was the number one rated tool, which isn’t surprising since this is a Basedash survey. One takeaway: many developers might prefer to cover admin panel-like use cases with a SaaS product rather than building custom tools.
Airplane had the second highest rating, suggesting that many respondents value full Git control over no-code editors if they choose to go the route of building custom tools.
In fact, the biggest takeaway from this data might be that the no-code movement is somewhat overblown, at least when it comes to internal tooling. Instead, the data points to two competing trends when it comes to database-related tooling: ready-made SaaS products on the one hand and complete, code-enabled platforms on the other.
Both certainly have their virtues and we’ll be watching closely to see how this plays out in the long run.
BI tools
If databases are the engine for modern software, analytics is the pit crew. What good is all of that precious user data if you don’t try to optimize your end product?
From detecting trends and understanding user behavior to optimizing performance, effective use of BI tools can give your application, and by extension your business, a significant edge over competing products.
Hex was the highest rated tool, followed by Apache Superset and June. It’s interesting to see a newer product like Hex completely overtake established alternatives like Metabase and Mode. One guess: people we’ve spoken to who swear by Hex tend to praise how well it works for collaborative workflows.
This is also interesting given that Mode was acquired by another company this year. Maybe there’s a shift in developer preferences that will cause even more consolidation in the space. We’ll have to keep an eye on this for next year’s survey.