Database Design Diagrams
These days, there are many options available to persist your data, but the most common is still some form of database. Whether that be relational (e.g. Postgres), document (e.g. MongoDB) or other, you should be using schemas to structure your data so that it's clear what is stored where, and that data is accurate and reliable.
In the majority of systems you work on, you will likely have a need to store data in a database.
Creating a database design diagram is not necessary for every service, especially if the service is relatively small or persists a small amount of data. However, it can be really helpful for new engineers joining the team, or other colleagues, to understand how your data is modelled in your database.
In this article, we will learn about the most common way to create database design diagrams. It's called an Entity Relationship Diagram, often shortened to ER Diagram or ERD. At the end of the article, we'll take a look at some ERD examples.
How To Use ER Diagrams
To avoid confusion with domain entites, such as the ones used in domain models, the entity in an ER diagram is a database table. An ER diagram's purpose is to define the database schemas for a given set of entities, such as their fields and data types, as well as how the entities link together. ER diagrams are most commonly used with relational databases, but can be used with document databases too, to represent collections such as in MongoDB.
Unlike the domain model or architecture diagrams which should be regularly updated, I find ER diagrams have a different lifecycle. They are more like snapshots of a point in time. I have found three main scenarios where I use an ER diagram:
- When writing an Architecture Decision Record (ADR) for a brand new service, I may optionally include an Entity Relationship Diagram to show a snapshot of the database design at that point in time.
- When making significant database design changes to an existing service, it's often helpful to include a snapshot of the database schema to help understand in the future why that decision was made. This might be another ADR, or included on pull requests that make the change.
- When making small database schema changes, it can be helpful to draw out the change using a database design diagram to quickly see if the change makes sense, or to simply help explain the change to a colleague.
I've highlighted the word snapshot above, as that's what an ER diagram should be in my opinion. The database is going to evolve, and the ER diagram will get out of date. You don't necessarily need to update the ER diagram every time you update the database schema, because the database schemas themselves are the source of truth. However, it can be very useful to include snapshots of what the database design looked like at that point in time, to aid with rationalising changes, or explaining changes to others. Whenever I use an ER diagram, I make sure to call out that it's simply a snapshot, and not the source of truth.
ERD Examples
Now that we understand what an ERD is, and when to use an ERD, let's take a look at a small example:
erDiagram TITLE { int title_id int type_id string name datetime release_date } TITLE_TYPE { int type_id string type } GENRE { int genre_id PK string name } TITLE_GENRE { int title_id int genre_id } TITLE }|--|| TITLE_TYPE: has TITLE ||--o{ TITLE_GENRE: "belongs to" TITLE_GENRE }o--|| GENRE: references
This diagram was created using diagrams as code.
ERD's are one of the more complex types of diagrams, due to their unique syntax. ERD's use what are often called "crow's feet" to identify relationships between entities in your database design.
In the ERD example above, we can see four entities: TITLE, TITLE_TYPE, GENRE and TITLE_GENRE and their respective fields and data types. A TITLE has just one TITLE_TYPE (e.g. a TV show or movie) and is associated with many genres, as it has a join table TITLE_GENRE which itself has a relationship to GENRE.
To understand the crow's feet in the ERD example, we need to break down the notation. At each point an entity connects to another, there are a set of symbols. The first symbol, closest to the entity, denotes how many of one entity exist in another. A single line means just one, multiple lines means many. The second symbol, which is always either a line or a circle, indicates whether it's mandatory or not. A line means it's mandatory, a circle means it's optional.
Taking a closer look at the ERD example, a TITLE must be associated with a single TITLE_TYPE, and a TITLE_TYPE must associated to a TITLE, but a single type can be associated to many TITLES.
Learn To Create ER Diagrams
That rounds out this introduction to database design diagrams using ER diagrams. There's a lot more to cover, but this should serve as a good entry point.
The ERD example above was created using a tool called MermaidJS. You can create ER diagrams, and other diagram types, using its simple Markdown-like syntax, which can then be rendered easily as an image.
Want to learn more? I cover how to create database design diagram, using ER diagrams, along with a number of other use cases and diagrams, in my book: Creating Software Using Modern Diagramming Techniques, published via The Pragmatic Programmers.