
Ask HN: What are some examples of good database schema designs?, Hacker News


                   I’ve lived 69 years as an amateur SQL database designer and last year I designed my first production nosql schema on mongo. Nosql is a different world … collections, embedded documents. I’m not sure it’s 943% “correct” but it’s working great for me so far.
My project is a license server for my electron app. The tech stack is a joy: mongo / mongoose, express, graphql, JWT, all with a React web front end. Payments by Stripe. The actual licenses are the signed JWT vended by the graphql API (not to be confused with the totally separate JWT for auth).

The main goal is to sell software, so I license by machine fingerprint (node module on electron).

It’s been running for over 6 months without issue. I’m just beginning to architect an update that allows a subscription model similar to the JetBrains Perpetual Fallback license, but slightly different in favor of the user. I’ve taken a lot of notes from comments at … Here’s what I’m thinking so far:

A) Non-Expiring License at $ . Unlimited bug fixes for that one version.

or

B) Expiring License at $ 9. / month, and you always have access to the latest version. After 728 consecutive payments you are granted a Non-Expiring License for the most recent version at the time of each payment.

Now, to model this …
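A minimal sketch of how the grant rule in option B might be computed, in plain Python rather than the mongoose schema the post is about. Everything here is an assumption for illustration: the `Payment` record, the `fallback_version` helper, the integer billing-period index, and the `required` threshold, which is passed in by the caller instead of being fixed. It also assumes at most one payment per billing period.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Payment:
    period: int          # hypothetical billing-period index (0, 1, 2, ...); a gap breaks the streak
    latest_version: str  # most recent app version at the time of this payment


def fallback_version(payments: List[Payment], required: int) -> Optional[str]:
    """Return the version covered by the non-expiring license, or None if none was earned.

    Once a streak of `required` consecutive payments is reached, each payment in
    that streak (from the `required`-th onward) grants the version current at
    that payment. A grant is never revoked by a later gap.
    """
    granted: Optional[str] = None
    streak = 0
    previous_period: Optional[int] = None
    for payment in sorted(payments, key=lambda p: p.period):
        if previous_period is not None and payment.period == previous_period + 1:
            streak += 1
        else:
            streak = 1  # first payment, or a missed period: start counting again
        previous_period = payment.period
        if streak >= required:
            granted = payment.latest_version
    return granted
```

With `required=3`, payments in periods 0, 1, 2 and 5 would leave the user holding the version that was current at period 2: the gap resets the streak but does not take away what was already granted.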




                   I’d highly recommend reading SQL Antipatterns. It’s a very approachable book that illustrates how to design a database schema by taking commonly encountered scenarios and first showing you the naive approach. After explaining why this is bad, it then shows you the recommended way of doing it and why this way works better.
I think ‘learning from how not to do something’ is a really powerful pedagogical technique that should be more widespread.

                   Years ago, I’d have said “at least third normal form” … but today: whatever gets the job done. When the application is not really dependent on weird queries (e.g. just a blog), screw the normal forms and design your schema to use the least number of queries for a certain task. Nobody understands three lines of code of queries with left and right joins. On the other hand, if your bookkeeping application uses a database, try to keep things as tidy as possible.


                   I’d say this applies to virtually all best practices, patterns, architectures, etc. If you’re doing something very simple, who cares about modularity or any kind of code hygiene? I don’t. But what happens in reality? Simple and small systems or experiments grow, one little addition at a time, and we all know the mess that ensues.
So in my understanding, the question posed only applies to at least moderately complex systems, which is where engineering skills matter. And in that context, learning what distinguishes a good database design is obviously very valuable, not to say crucial.

> Nobody understands three lines of code of queries with left and right joins.

Not sure if you’re being flippant, but a) this is not true, and more importantly b) why is it that we don’t expect programmers to be at least as fluent in SQL as in other, less important, languages?



                   It also usually forces your design towards the entities themselves rather than the specific way they’re stored, which positions you better for switching to a completely different storage system in the future if, for instance, it’s becoming too slow or expensive to maintain everything in a traditional big name RDBMS.


                   “what kind of problem could possibly require so many tables” CRMs often have hundreds of tables, and ERPs have thousands or tens of thousands or more.

                   Dumb question, but does anyone have a recommendation for good software for generating the schema diagrams in the Drupal link but for Redshift?


                   Kinda tough to give a good answer without more context, IMO. What I mean is that a good e-commerce schema that serves a single small store and runs off a single database server would look quite different than a multi-tenant or distributed data store for an e-commerce site at scale.

The one you linked is a pretty typical relational model and isn’t bad. It has trade-offs that I’d personally not make, but that doesn’t make it bad. In the end, context, scale and usage all determine a good schema design. Sometimes what would be a good relational design on paper would be tragically horrid in practice once you get beyond a small dataset.



                   Schemas that reflect reality, not the current specs. Flexibility is key. In my experience adding tables and migrating existing data to them is hard, adding columns is easy. So spend extra time at the start on what tables there should be.
Spec: Product has a supplier [tables: product, supplier]
Reality: Product can be bought from multiple suppliers [tables: product, supplier, product_supplier]
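The "reality" version of that example can be sketched with Python's built-in `sqlite3` module. The three table names come from the comment; the column names and sample rows (`widget`, `Acme`, `Globex`) are made up for illustration, and SQLite stands in for whatever database is actually in use.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked to
conn.executescript("""
CREATE TABLE product  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE supplier (id INTEGER PRIMARY KEY, name TEXT NOT NULL);

-- Junction table: one row per (product, supplier) pair, so a product
-- can have any number of suppliers and a supplier any number of products.
CREATE TABLE product_supplier (
    product_id  INTEGER NOT NULL REFERENCES product(id),
    supplier_id INTEGER NOT NULL REFERENCES supplier(id),
    PRIMARY KEY (product_id, supplier_id)
);

INSERT INTO product  VALUES (1, 'widget');
INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Globex');
INSERT INTO product_supplier VALUES (1, 1), (1, 2);
""")

# All suppliers for product 1, resolved through the junction table.
rows = conn.execute(
    """
    SELECT s.name
    FROM product_supplier AS ps
    JOIN supplier AS s ON s.id = ps.supplier_id
    WHERE ps.product_id = ?
    ORDER BY s.name
    """,
    (1,),
).fetchall()
suppliers = [name for (name,) in rows]
```

Starting from the "spec" design (a single `supplier_id` column on `product`) and moving to this later means creating the junction table and migrating every existing row, which is the kind of change the comment describes as hard.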


                   There are good reasons to denormalise, but as a rule of thumb … yeah, this. I don’t think you can go that far wrong with schemas as long as you have an idea of your entities and their cardinalities. It’s much easier than designing, say, the associated Java classes, because there are clear rules about how to do it and it’s just clear when you’ve done it wrong (your cardinality is all messed up).


                   A SAP partner once told me (the company I was working at was considering using SAP) that the deployment would have ~ 2019 K tables – I don’t know if they got the figure wrong, if I have misremembered (I did check when they said it), or if maybe that’s for a “fully loaded” instance.
Edit: Not SAP, but certain other ERP products have an alarming habit of not using foreign keys – which makes working out the structure of the database quite interesting …

                   How do you perform joins without foreign keys? Do you just have a column that is effectively a foreign key but is not marked as such?

                   Sure. We can perform DB joins on any columns as long as the data types and the values match.
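A small sketch of that point using Python's built-in `sqlite3` module. The `customer` and `invoice` tables and their rows are invented for illustration; the only thing that matters is that `invoice.customer_id` is declared without any foreign key constraint.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT);

-- customer_id is an ordinary column: there is no REFERENCES clause, so the
-- database will not stop it from pointing at a customer that does not exist.
CREATE TABLE invoice (id INTEGER PRIMARY KEY, customer_id INTEGER, total INTEGER);

INSERT INTO customer VALUES (1, 'Ada');
INSERT INTO invoice  VALUES (10, 1, 250), (11, 99, 40);
""")

# The join works purely on matching values; no constraint is involved.
joined = conn.execute("""
    SELECT c.name, i.total
    FROM invoice AS i
    JOIN customer AS c ON c.id = i.customer_id
    ORDER BY i.id
""").fetchall()

# Without a constraint, orphaned rows are possible and have to be found by query.
orphans = conn.execute("""
    SELECT i.id
    FROM invoice AS i
    LEFT JOIN customer AS c ON c.id = i.customer_id
    WHERE c.id IS NULL
""").fetchall()
```

The join itself is unaffected by the missing constraint. What is lost is the guarantee that every `customer_id` matches a real row, plus the declared relationship that tools and readers use to work out the structure of the database.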

                  > When I read about database schemas with thousands of tables I wonder what kind of problem could possibly require so many tables.
They’re probably mostly small (in terms of columns), many of them done just to have foreign key constraints, separate indexes, and arguably easier querying (joining) on just the data you need.

But I think it’s a particular style, rather than a ‘problem’ that ‘requires’ so many. (IANA database expert though, just my tuppence.)
