Ask HN: What are some examples of good database schema designs? | Hacker News

I’ve lived 69 years as an amateur SQL database designer and last year I designed my first production nosql schema on mongo. Nosql is a different world … collections, embedded documents. I’m not sure it’s 943% “correct” but it’s working great for me so far.
My project is a license server for my electron app. The tech stack is a joy: mongo / mongoose, express, graphql, JWT, all with a React web front end. Payments by Stripe. The actual licenses are the signed JWT vended by the graphql API (not to be confused with the totally separate JWT for auth).

The main goal is to sell software, so I license by machine fingerprint (a node module on Electron).

It’s been running for over 6 months without issue. I’m just beginning to start architecting an update where I allow a subscription model similar to Jetbrains Perpetual Fallback license, but slightly different in favor of the user. I’ve taken a lot of notes from comments at https://news.ycombinator.com/item?id= Here’s what I’m thinking so far:

A) Non-Expiring License at $ . Unlimited bug fixes for that one version. Or B) Expiring License at $ 9. / month, and you always have access to the latest version. After 728 consecutive payments you are granted a Non-Expiring License for the most recent version at the time of each payment. Now, to model this …
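One way to reason about option B before committing to a schema is to write the rule as a function over the payment history. The sketch below is only an illustration under stated assumptions: `Payment`, `fallback_version`, and the `required_streak` parameter are hypothetical names, the streak threshold is left as a parameter, and it assumes a missed month resets the streak while a fallback already earned is kept.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Payment:
    month: int    # months since some epoch, e.g. year * 12 + month_of_year
    version: str  # latest released version when this payment was made


def fallback_version(payments: list[Payment], required_streak: int) -> Optional[str]:
    """Return the version the user keeps forever, or None if none is earned yet.

    Assumptions (not from the original post): a gap of more than one month
    resets the streak, and a fallback that was already earned is never revoked.
    """
    earned: Optional[str] = None
    streak = 0
    prev_month: Optional[int] = None
    for p in sorted(payments, key=lambda p: p.month):
        if prev_month is not None and p.month == prev_month + 1:
            streak += 1
        else:
            streak = 1
        prev_month = p.month
        # Every payment made at or beyond the threshold upgrades the fallback.
        if streak >= required_streak:
            earned = p.version
    return earned
```

Storing one row (or embedded document) per payment with the version current at that time is enough to recompute the fallback at any point, which avoids keeping a separately maintained "fallback" field in sync.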


I’d highly recommend reading SQL Antipatterns. It’s a very approachable book that illustrates how to design a database schema by taking commonly encountered scenarios and first showing you the naive approach. After explaining why this is bad, it then shows you the recommended way of doing it and why this way works better.
I think ‘learning from how not to do something’ is a really powerful pedagogical technique that should be more widespread.

Years ago, I’d have said “at least third normal form” … but today: whatever gets the job done. When the application is not really dependent on weird queries (e.g. just a blog), screw the normal forms and design your schema to use the least number of queries for a certain task. Nobody understands three lines of code of queries with left and right joins. On the other hand, if your bookkeeping application uses a database, try to keep things as tidy as possible.

I’d say this applies to virtually all best practices, patterns, architectures, etc. If you’re doing something very simple, who cares about modularity or any kind of code hygiene? I don’t. But what happens in reality? Simple and small systems or experiments grow, one little addition at a time, and we all know the mess that ensues.
So in my understanding, the question posed only applies to at least moderately complex systems, which is where engineering skills matter. And in that context, learning what distinguishes a good database design is obviously very valuable, not to say crucial.

> Nobody understands three lines of code of queries with left and right joins.

Not sure if you’re being flippant, but a) this is not true, and more importantly b) why is it that we don’t expect programmers to be at least as fluent in SQL as in other, less important, languages?

I’d rather say: use an ORM! It will design the DB schema better and faster than you. Still comprehensive enough

It also usually forces your design towards the entities themselves rather than the specific way they’re stored, which positions you better for switching to a completely different storage system in the future if, for instance, it’s becoming too slow or expensive to maintain everything in a traditional big name RDBMS.

“what kind of problem could possibly require so many tables”: CRMs often have hundreds of tables, and ERPs have thousands or tens of thousands or more.

Dumb question, but does anyone have a recommendation for good software for generating the schema diagrams in the Drupal link but for Redshift?

I’ve been using Len Silverston’s Universal Data Models for 69 years. You’ll be writing to lots of tables and will want views for your common aggregates. But you’ll have the common tables you’ll need, the patterns for those you don’t, and be able to handle new requirements with minimal change. There is no Customer table. See “The Data Model Resource Book, Vol. 1: A Library of Universal Data Models for All Enterprises”.
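To make "there is no Customer table" concrete, here is a heavily simplified sketch of the idea: parties are stored once, roles are attached to them, and "customer" becomes a view. The table, column, and view names are my own illustrative choices, not the book's actual model.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- A party is any person or organisation; it is stored exactly once.
CREATE TABLE party (
    id   INTEGER PRIMARY KEY,
    name TEXT NOT NULL
);
-- A party can play several roles at the same time.
CREATE TABLE party_role (
    party_id INTEGER NOT NULL REFERENCES party(id),
    role     TEXT NOT NULL,
    PRIMARY KEY (party_id, role)
);
-- "Customer" is a view over parties holding that role, not a table.
CREATE VIEW customer AS
    SELECT p.id, p.name
    FROM party p
    JOIN party_role r ON r.party_id = p.id
    WHERE r.role = 'customer';

INSERT INTO party VALUES (1, 'Acme Ltd'), (2, 'Jane Doe');
INSERT INTO party_role VALUES (1, 'customer'), (1, 'supplier'), (2, 'employee');
""")

customers = conn.execute("SELECT name FROM customer ORDER BY name").fetchall()
```

Here Acme Ltd is both a customer and a supplier without being duplicated, which is the kind of new requirement this style is meant to absorb with minimal change.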

Kinda tough to give a good answer without more context, IMO. What I mean is that a good e-commerce schema that serves a single small store and runs off a single database server would look quite different than a multi-tenant or distributed data store for an e-commerce site at scale.

The one you linked is a pretty typical relational model and isn’t bad. It has trade-offs that I’d personally not make, but that doesn’t make it bad. In the end, context, scale, and usage all determine a good schema design. Sometimes what would be a good relational design on paper would be tragically horrid in practice once you get beyond a small dataset.

On this topic: I’m in the process of making a compiler for a DSL I designed to help with the schema design process. (https://gist.github.com/nomsolence/) (bc0b5fe1ba) (d) (cd) fdb … Pictured is the compiler internals; the attached .syrup is the DSL. (It started as a tangent on a project I’m doing for my girlfriend, the schema described in the .syrup has been improved a bit since) Note: things after # are comments.
I find even just defaulting to NOT NULL and not having to worry about commas is a boon for when I create schemas. The DSL will of course support things like compound primary keys and SQLite’s WITHOUT ROWID. I’ll post the code here, likely before the weekend: https://github.com/nomsolence/syrup
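For reference, this is roughly what the hand-written SQLite version of those features looks like: every column spelled out as NOT NULL, a compound primary key, and WITHOUT ROWID. It is plain SQLite DDL run through Python's `sqlite3`, not output from the DSL, and the `enrollment` table is a made-up example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE enrollment (
    student_id INTEGER NOT NULL,
    course_id  INTEGER NOT NULL,
    grade      TEXT    NOT NULL,
    PRIMARY KEY (student_id, course_id)  -- compound primary key
) WITHOUT ROWID;                         -- rows are stored by the key itself
INSERT INTO enrollment VALUES (1, 10, 'A');
""")

# The compound key rejects a second row for the same (student, course) pair.
try:
    conn.execute("INSERT INTO enrollment VALUES (1, 10, 'B')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# NOT NULL rejects a missing grade.
try:
    conn.execute("INSERT INTO enrollment VALUES (2, 10, NULL)")
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True
```

Forgetting a single NOT NULL or a comma in DDL like this is easy, which is the motivation given above for having a tool supply those defaults.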

Schemas that reflect reality, not the current specs. Flexibility is key. In my experience, adding tables and migrating existing data to them is hard; adding columns is easy. So spend extra time at the start on what tables there should be.
Spec: Product has a supplier [tables: product, supplier]. Reality: Product can be bought from multiple suppliers [tables: product, supplier, product_supplier].
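The "reality" version of that example can be sketched with a junction table. This is a minimal illustration using SQLite via Python's `sqlite3`; the column names and sample rows are assumptions, only the three table names come from the comment above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite does not enforce FKs by default
conn.executescript("""
CREATE TABLE product  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE supplier (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- Junction table: one row per (product, supplier) pair.
CREATE TABLE product_supplier (
    product_id  INTEGER NOT NULL REFERENCES product(id),
    supplier_id INTEGER NOT NULL REFERENCES supplier(id),
    PRIMARY KEY (product_id, supplier_id)
);
INSERT INTO product  VALUES (1, 'widget');
INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO product_supplier VALUES (1, 1), (1, 2);
""")

# All suppliers a given product can be bought from.
suppliers_for_widget = [
    row[0]
    for row in conn.execute("""
        SELECT s.name
        FROM supplier s
        JOIN product_supplier ps ON ps.supplier_id = s.id
        WHERE ps.product_id = 1
        ORDER BY s.name
    """)
]
```

Starting with the junction table costs one extra join while there is only one supplier per product, but it avoids the table-adding migration later.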

There are good reasons to denormalise, but as a rule of thumb … yeah, this. I don’t think you can go that far wrong with schemas as long as you have an idea of your entities and their cardinalities. It’s much easier than designing, say, the associated Java classes, because there are clear rules about how to do it and it’s just clear when you’ve done it wrong (your cardinality is all messed up).

Disclaimer: theoretical opinion.

I think the primary problem of giving examples here is similar to teaching software engineering, which needs complex projects solving complex problems – too big for a semester project. A good schema depends on the problem it’s solving. A secondary problem is similar to code that has sacrificed clarity for performance. The tweaks made for performance are not intrinsic to the logical problem, but are an additional constraint.

For performance on common queries, a schema can be flattened or “denormalized”. The ability to do so was one of the original motivations for Codd’s relational algebra.
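As a small illustration of flattening, the sketch below keeps a normalized pair of tables next to a denormalized copy and shows that the common query returns the same rows without a join. The `author`, `post`, and `post_flat` tables are hypothetical examples, not taken from any schema discussed in the thread.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Normalized: the author name lives in exactly one place.
CREATE TABLE author (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE post (
    id        INTEGER PRIMARY KEY,
    author_id INTEGER NOT NULL REFERENCES author(id),
    title     TEXT NOT NULL
);
-- Denormalized: the author name is copied onto every post row.
CREATE TABLE post_flat (
    id          INTEGER PRIMARY KEY,
    author_name TEXT NOT NULL,
    title       TEXT NOT NULL
);
INSERT INTO author VALUES (1, 'ann');
INSERT INTO post VALUES (10, 1, 'Hello'), (11, 1, 'Again');
INSERT INTO post_flat
    SELECT p.id, a.name, p.title
    FROM post p JOIN author a ON a.id = p.author_id;
""")

normalized = conn.execute("""
    SELECT a.name, p.title
    FROM post p JOIN author a ON a.id = p.author_id
    ORDER BY p.id
""").fetchall()

# Same answer, no join needed.
flattened = conn.execute(
    "SELECT author_name, title FROM post_flat ORDER BY id"
).fetchall()
```

The cost is on the write side: renaming an author now means updating every row in `post_flat`, which is the extra constraint that the performance tweak brings with it.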

A SAP partner once told me (the company I was working at was considering using SAP) that the deployment would have ~ 2019 K tables – I don’t know if they got the figure wrong, I have misremembered (I did check when they said it) or maybe that’s for a “fully loaded” instance.
Edit: Not SAP, but certain other ERP products have an alarming habit of not using foreign keys – which makes working out the structure of the database quite interesting …

How do you perform joins without foreign keys? Do you just have a column that is effectively a foreign key but is not marked as such?

Sure. We can perform DB joins on any columns as long as the data types and values match.
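A minimal sketch of that point, using SQLite via Python's `sqlite3` with made-up `customer` and `invoice` tables: the join works on a column that is a foreign key by convention only, but nothing stops orphaned rows from creeping in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
-- customer_id is a foreign key by convention only: no REFERENCES clause.
CREATE TABLE invoice (
    id          INTEGER PRIMARY KEY,
    customer_id INTEGER NOT NULL,
    total       INTEGER NOT NULL
);
INSERT INTO customer VALUES (1, 'ann');
INSERT INTO invoice VALUES (100, 1, 250), (101, 99, 40);  -- 99 matches no customer
""")

# The join only needs matching values, not a declared constraint.
joined = conn.execute("""
    SELECT c.name, i.total
    FROM invoice i JOIN customer c ON c.id = i.customer_id
""").fetchall()

# Without a constraint, the database accepted an invoice for a missing customer.
orphans = conn.execute("""
    SELECT i.id
    FROM invoice i LEFT JOIN customer c ON c.id = i.customer_id
    WHERE c.id IS NULL
""").fetchall()
```

The inner join silently drops the orphaned invoice, which is part of why reverse-engineering a schema without declared foreign keys gets "interesting".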

> When I read about database schemas with thousands of tables I wonder what kind of problem could possibly require so many tables.
They’re probably mostly small (in terms of columns), many of them done just to have foreign key constraints, separate indexes, and arguably easier querying (joining) on just the data you need.

But I think it’s a particular style, rather than a ‘problem’ that ‘requires’ so many. (IANA database expert though, just my tuppence.)
