Building a CRUD app with Datomic Cloud Ions

May 2019

Update: published source code.

I just released FlexBudget, the website version of a script I wrote several months ago to handle our budgeting needs. [1] I built it using Datomic Cloud ions. I started using Datomic On-Prem sometime last year, but this was my first time using Datomic Cloud (let alone ions). [2]

I think Clojure + Datomic opens the doors for some great innovations in web application architecture (e.g. the ideas discussed in this classic), not to mention that Datomic alone has spoiled me and I don’t know if I could ever go back to SQL now. But even with ions, I think there’s still a lot of work left to be done on Clojure’s web dev story. What follows is a description of my experience in setting up FlexBudget.

Note: This post assumes you are already familiar with Clojure and Datomic.

Routing

I ended up using a mono-Lambda that forwarded all requests to the same ion, then that ion used Compojure to route the request to the appropriate handler. For local development, I just ran a local web server with that handler.

    (def handler'
      (-> routes
          ; various middleware
          wrap-catchall))

    (def handler (ionize handler'))

    (defn start-immutant []
      (imm/run handler' {:port 8080}))

(I started out using Aleph, but I was getting a weird bug where it wasn’t passing requests to my handler. I’m not sure what was going on there, but things started working again when I used Immutant. I also tried Jetty, but it had dependency conflicts with Datomic Cloud.)

This worked great for my little project, though maybe in the future I’ll need to stop using a mono-Lambda.
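To make the single-handler shape concrete, here is a dependency-free sketch. A plain map stands in for the Compojure routes, and the route bodies and the wrap-catchall implementation are assumptions made for illustration, not the code FlexBudget uses.

```clojure
(defn wrap-catchall
  "Middleware: turn any uncaught exception into a 500 response."
  [handler]
  (fn [req]
    (try
      (handler req)
      (catch Exception e
        {:status 500 :body (str "Internal error: " (.getMessage e))}))))

;; Hypothetical routes; a plain map stands in for Compojure here.
(def routes
  {[:get "/init"] (fn [_req] {:status 200 :body "datoms go here"})
   [:post "/tx"]  (fn [_req] (throw (ex-info "not implemented" {})))})

(defn route
  "Dispatch on method and path, returning 404 for unknown routes."
  [{:keys [request-method uri] :as req}]
  (if-let [h (get routes [request-method uri])]
    (h req)
    {:status 404 :body "Not found"}))

;; The same function can be handed to a local web server or wrapped for Lambda.
(def handler' (wrap-catchall route))
```

Because every request goes through one function, the only thing that differs between the laptop and the cloud is the thin wrapper around it.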

Transaction Functions

This was more complicated. To use a custom transaction function, it has to be deployed. You can’t tell the transactor (which runs in the cloud) to somehow use a local transaction function that’s only defined on your laptop. The official advice is:

Transaction functions are pure functions, so you do not need to deploy them anywhere for testing. You can simply invoke them as ordinary code in your REPL or test suite.

I think that leaves a little to be desired, though. I also want to actually use my website while I’m developing it. I need my site, running on localhost, to be able to send requests to my ions, also running on localhost, and then have those ions run transaction functions that are only defined on my laptop (i.e. haven’t been deployed yet).

So, I wrote a function eval-tx-fns which takes a transaction function and applies it locally. Then the “plain” transaction can be sent to the transactor.

    (def transact
      (if (:local-tx-fns? config)
        (let [lock (Object.)]
          (fn [conn arg-map]
            (locking lock
              (->> #(u/eval-tx-fns (d/with-db conn) %)
                   (update arg-map :tx-data)
                   (d/transact conn)))))
        d/transact))

Out of laziness, I created this transact function and used that whenever I needed to transact something. I also had to write a similar replacement function for with. It would probably be cleaner to create my own implementation of the Datomic client protocol as is done here.
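The expansion step can be sketched without Datomic at all. This is a simplified illustration, not the real eval-tx-fns: it assumes transaction function calls appear in tx-data as forms whose first element is a symbol, it takes a hypothetical tx-fns map from those symbols to local functions as an extra argument, and it uses a plain map in place of a database value.

```clojure
(defn eval-tx-fns
  "Replace every form in tx-data whose first element is a key of tx-fns
  with the data returned by calling that local function on db and the
  remaining elements. Expansion is recursive, so a transaction function
  may itself return further transaction function calls."
  [tx-fns db tx-data]
  (vec
   (mapcat (fn [form]
             (if-let [f (and (sequential? form) (get tx-fns (first form)))]
               (eval-tx-fns tx-fns db (apply f db (rest form)))
               [form]))
           tx-data)))

;; Hypothetical transaction function; db is a plain map for illustration.
(defn increment [db eid attr amount]
  [[:db/add eid attr (+ amount (get-in db [eid attr] 0))]])

(def tx-fns {'my.app/increment increment})

(eval-tx-fns tx-fns
             {1 {:balance 10}}
             [[:db/add 2 :name "x"]
              '(my.app/increment 1 :balance 5)])
;; => [[:db/add 2 :name "x"] [:db/add 1 :balance 15]]
```

The result contains only plain assertions, which is what makes it safe to send to a transactor that has never seen the function.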

The locking call is used to make sure that transactions stay serialized. This works only because during dev, all transactions to the dev database I’m using go through a single machine (my laptop). And being a single developer using the Solo topology, that’s fine. However, it could be an issue for a production system with query groups for different stages.

The staging query group can only use transaction functions that have been deployed to the primary group. If we update a transaction function and want staging to run the updated version before it gets deployed to the primary group, we’ll have to apply the transaction locally in the manner I just described. If the staging query group is limited to one instance, then we can keep the transactions serialized by using locking as we did before. Otherwise, we’d have to add some kind of external lock to ensure that only one instance is executing transaction functions at a time.

This strategy may lower transaction throughput, but that’s probably acceptable for dev stages. An alternate solution would be running a separate production topology for each stage, but I’m guessing that would be more expensive. [3]

Deployment

I wrote this Planck script to automate the steps of “push, deploy, run the deploy status command until it succeeds or fails.”
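The “run until it succeeds or fails” part is a small polling loop. The sketch below is JVM Clojure rather than Planck, and status-fn, the status keywords and the option names are all hypothetical; a real script would shell out to the deploy status command and parse its output.

```clojure
(defn poll-deploy
  "Call status-fn until it returns :succeeded or :failed, sleeping
  interval-ms between attempts. Gives up with :timed-out after
  max-attempts calls."
  [status-fn {:keys [interval-ms max-attempts]}]
  (loop [attempt 1]
    (let [status (status-fn)]
      (cond
        (#{:succeeded :failed} status) status
        (>= attempt max-attempts)      :timed-out
        :else (do (Thread/sleep interval-ms)
                  (recur (inc attempt)))))))
```

Capping the number of attempts matters here, since a deploy that hangs would otherwise keep the script running forever.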

I had some trouble with my deploys failing, even after I resolved dependency conflicts and tested locally. The problem was always that I had some piece of initialization code that started running as soon as the code was loaded. This caused the deploy to hang and time out. (Specifically, the ValidateService step would hang, like in this question.)

For example, you shouldn’t load configuration with datomic.ion/get-params until a request comes in. You can memoize the retrieval like so:

    (def get-params
      (memoize
        (fn [env]
          (-> {:path (str "/datomic-shared/" (name env) "/bud/")}
              ion/get-params
              keywordize-keys))))

And then don’t call it until you have to. I used Firebase for authentication, and it requires some initialization code that fetches a secret from get-params:

    (let [options ...] ; includes a call to get-params
      (FirebaseApp/initializeApp options))

One of my deployment failures was happening because I had the Firebase init code running immediately. Deployment worked again after I wrapped it in an init-firebase! function which I then called only when verifying tokens:

    (defn verify-token [token]
      (when (= 0 (count (FirebaseApp/getApps)))
        (init-firebase!))
      ...)
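The effect of deferring initialization can be seen in a self-contained sketch. Everything here is a stand-in: fetch-params replaces the real parameter lookup and merely counts its calls, firebase-app is an atom rather than a Firebase object, and verify-token does no actual verification.

```clojure
(def fetch-count (atom 0))

(defn fetch-params
  "Hypothetical stand-in for the parameter lookup: counts calls
  instead of talking to AWS."
  [env]
  (swap! fetch-count inc)
  {:firebase-secret (str "secret-for-" (name env))})

;; Memoized, so the lookup runs at most once per env.
(def get-params (memoize fetch-params))

;; Stand-in for the Firebase app; nil until the first token is verified.
(def firebase-app (atom nil))

(defn init-firebase! []
  (reset! firebase-app {:secret (:firebase-secret (get-params :dev))}))

(defn verify-token
  "Initializes lazily on first use. The check itself is a placeholder."
  [token]
  (when (nil? @firebase-app)
    (init-firebase!))
  (some? token))
```

Loading this code performs no lookups at all, which is the property the deploy needs; the first call to verify-token pays the cost once.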

I also wrapped my calls to d/client and d/connect in memoized functions like in the ion starter project, but I found that they didn’t get redefined when I ran clojure.tools.namespace.repl/refresh. So I instead defined them as mount states:

    (mount.core/defstate client
      :start (d/client (:client-cfg config)))

And then I added some ring middleware to start mount on the first request:

    (defn wrap-start-mount [handler]
      (fn [req]
        (when (contains? #{mount.core.NotStartedState
                           mount.core.DerefableState}
                         (type client))
          (mount.core/start))
        (handler req)))

Interlude

Throw in some logs with datomic.ion.cast and that about covers my experiences directly with Datomic Cloud ions. It took a while to figure things out, but I’m happy with it now, even though the transaction function thing seems a little hacky (I’m not sure what else to do about that).

The rest of this post is about the way I set up the frontend / backend interaction.

DataScript

I recently wrote a Clue web app that used a single atom for storing frontend state (with Datomic on the backend). Board games tend to have complicated data models, and I definitely felt the impedance mismatch pains of having to project my Datomic data onto a hierarchical atom. So with this project (even though the data model here is much simpler right now), I definitely wanted to use DataScript.

After the user logs in, the frontend hits an /init endpoint which returns their datoms:

    (defn datoms-for [db uid]
      (let [user-eid (:db/id (d/pull db [:db/id] [:user/uid uid]))]
        (->>
          (conj
            (vec (d/q '[:find ?e ?attr ?v
                        :in $ ?user
                        :where
                        [?ent :auth/owner ?user]
                        (or
                          [(identity ?ent) ?e]
                          [?ent :entry/asset ?e])
                        [?e ?a ?v]
                        [?a :db/ident ?attr]]
                      db user-eid))
            [user-eid :user/uid uid])
          (u/stringify-eids ds-schema))))

Basically, datoms-for looks for values of :auth/owner that correspond to the current user. I’m not using any DB filters, but a better approach might be to do that and then allow the frontend to send an arbitrary query.

The datoms also go through a stringify-eids function that I wrote. This function takes e.g. [[1 :foo 2] [3 :some/ref 1]] and turns it into [["1" :foo 2] ["3" :some/ref "1"]]. That way, DataScript will treat the entity IDs as temporary, and new IDs will be assigned. This is important because Datomic entity IDs can be larger than JavaScript’s Number.MAX_SAFE_INTEGER. So instead of using Datomic’s entity IDs on the frontend, I let DataScript assign its own IDs and then maintain a mapping between DataScript’s IDs and Datomic’s IDs (which are stored on the frontend as strings).

To be exact, they’re actually stored as tagged literals, e.g. #eid "123456789". I’ll get to this later, but this allows the frontend to send a transaction like [[#eid "12345789" :bar "hello"]] to the backend, and then I simply include an entry for eid in my data_readers.clj file.
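A minimal version of the conversion might look like the following. It is a sketch under two assumptions: the schema argument is in DataScript’s format, where reference attributes carry :db/valueType :db.type/ref, and the IDs become plain strings rather than tagged literals.

```clojure
(defn stringify-eids
  "Turn the entity position of each [e a v] datom into a string, and the
  value position too when the attribute is a reference. Other values are
  left untouched, so an ordinary number is never mistaken for an ID."
  [ds-schema datoms]
  (mapv (fn [[e a v]]
          [(str e)
           a
           (if (= :db.type/ref (get-in ds-schema [a :db/valueType]))
             (str v)
             v)])
        datoms))

(stringify-eids {:some/ref {:db/valueType :db.type/ref}}
                [[1 :foo 2] [3 :some/ref 1]])
;; => [["1" :foo 2] ["3" :some/ref "1"]]
```

The schema lookup is what distinguishes the 2 in [1 :foo 2], which is data, from the 1 in [3 :some/ref 1], which points at another entity.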

I’m also passing a ds-schema (“DataScript schema”) argument to stringify-eids. This comes from a library that I share between the frontend and backend:

    (def schema
      {:user/uid [:db.type/string :db.unique/identity]
       :user/email [:db.type/string :db.unique/identity]
       :auth/owner [:db.type/ref]
       :entry/date [:db.type/instant]
       :entry/draft [:db.type/boolean]
       ; etc
       })

    (def datomic-schema (u/datomic-schema schema))
    (def ds-schema (u/datascript-schema schema))
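Deriving both schemas from the compact form could work roughly like this. These are not the real u/datomic-schema and u/datascript-schema: the sketch assumes every attribute has cardinality one, handles only the :db.unique/identity flag, and keeps just the entries DataScript needs (references and uniqueness).

```clojure
(def schema
  {:user/uid    [:db.type/string :db.unique/identity]
   :auth/owner  [:db.type/ref]
   :entry/draft [:db.type/boolean]})

(defn datomic-schema
  "Expand the compact schema into a vector of Datomic attribute maps.
  Assumption: every attribute is cardinality one."
  [schema]
  (mapv (fn [[ident [value-type & flags]]]
          (cond-> {:db/ident       ident
                   :db/valueType   value-type
                   :db/cardinality :db.cardinality/one}
            (some #{:db.unique/identity} flags)
            (assoc :db/unique :db.unique/identity)))
        schema))

(defn datascript-schema
  "Keep only what the frontend needs: which attributes are references
  and which are unique. Attributes with neither property are omitted."
  [schema]
  (into {}
        (keep (fn [[ident [value-type & flags]]]
                (let [m (cond-> {}
                          (= :db.type/ref value-type)
                          (assoc :db/valueType :db.type/ref)
                          (some #{:db.unique/identity} flags)
                          (assoc :db/unique :db.unique/identity))]
                  (when (seq m) [ident m]))))
        schema))
```

Keeping one source of truth means a new attribute only has to be declared once for both databases to know about it.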

Materialized views

I’ve never actually used re-frame much. Although I’m using DataScript instead of a normal atom for storing frontend state, there is Re-posh, which combines re-frame with Posh, a library that lets you define reactive DataScript queries. I’ve used Posh a little bit, but:

It breaks on some edge cases, including a case that I ran into myself.
You can’t use pull inside of queries.

So instead I wrote a macro, defq:

    (defq entries
      (->> @conn
           (d/q '[:find [(pull ?e [*]) ...]
                  :where
                  (or [?e :entry/draft]
                      [?e :entry/date])])))

defq takes some arbitrary code and stores it in a function. It creates a reactive atom (entries in this case) and populates it with the results of the function. Whenever I run a transaction, the function is run again (and the atom is repopulated with the results).

Obviously this won’t be fast when you have lots of queries, but it’s good enough for now. I’ll revisit it later.
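A macro with that behavior can be sketched in a few lines. This version is an approximation in JVM Clojure: it uses ordinary atoms where the frontend would use Reagent atoms, the registry and the exact shape of invalidate! are assumptions, and a vector of numbers stands in for the DataScript connection.

```clojure
;; Maps each query atom to the function that recomputes it.
(defonce registry (atom {}))

(defmacro defq
  "Define sym as an atom holding the result of body, and remember
  how to recompute it."
  [sym & body]
  `(do
     (def ~sym (atom nil))
     (let [f# (fn [] ~@body)]
       (swap! registry assoc ~sym f#)
       (reset! ~sym (f#)))
     (var ~sym)))

(defn invalidate!
  "Re-run the given registered queries, or all of them if none are given."
  [& qs]
  (doseq [q (or (seq qs) (keys @registry))]
    (reset! q ((get @registry q)))))

;; Usage: a vector of numbers stands in for the database.
(def conn (atom [1 2 3]))
(defq total (reduce + @conn))
```

After (swap! conn conj 4), the value of @total stays stale until (invalidate! total) is called, which is why the custom transact function has to trigger the refresh itself.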

Besides defq, I’ve found that using plain old reagent.ratom/reaction is nice and succinct:

    (def entry (reaction (last @entries)))
    (def draft? (reaction (:entry/draft @entry)))

I store all of these in a single namespace, so I can reference them from my Reagent views with e.g. @db/entries or @db/draft?.

Components

I’ve been mainly using re-com and it’s really nice. I had two minor annoyances though. First, all of the parameters are defined with map destructuring. This means that when you use container elements, you have to write [rc/h-box :children [foo bar baz]] instead of just [rc/h-box foo bar baz]. Containers are used pretty often and having all these :children can add up.

That’s not too bad though; I simply defined my own h-box and v-box components that didn’t use map destructuring.

The other thing I ran into was when I used the horizontal-tabs component and I wasn’t able to change the colors using inline styles; I had to include a separate CSS file to override the Bootstrap styles.

Going forward it’d be nice to have everything be fully customizable with inline styles, so I’ll need to decide if I want to keep using re-com and/or Bootstrap and make some modifications or if I should roll my own. I’ll admit that I’m not much of a UI person, but it would be nice to figure out a system that works for me (and makes it easy for me to make websites that look nice. I guess there are people who care about that).

Transactions on the frontend

On the frontend I also defined a custom transact! function:

    (defn transact! [persist-fn conn tx & queries]
      (let [tx-result (d/transact! conn tx)]
        (apply invalidate! queries)
        (go (let [tx (u/translate-eids (:schema @conn) (::eids @conn) tx)
                  eids (<! (persist-fn tx))
                  tempids (reverse-tempids tx-result eids)]
              (swap! conn update ::eids merge tempids)))
        tx-result))

It does several things:

It applies the transaction to the frontend database immediately. Currently I don’t have anything in place to roll back if the transaction fails on the backend; that’s part of my future work.
invalidate! is what updates the queries that I defined earlier with defq.
translate-eids traverses the transaction, replacing DataScript’s entity IDs with the tagged-literal Datomic IDs like I mentioned before. For example, given a transaction of [[:db/add 1 :foo "bar"]] and an entity ID mapping of {1 #eid "12345"}, the return value would be [[:db/add #eid "12345" :foo "bar"]] (surprise). Unfortunately we can’t do something simple with clojure.walk/postwalk like “if an element is a key in the entity ID map, replace it with the value” because we don’t know if the number is actually an entity ID or just a number. The only way to know is to traverse the transaction according to the grammar and replace entity IDs along the way. It was a little tedious to write but not super complicated.
persist-fn takes the transaction and sends it to the backend. The backend returns the entity IDs of any newly created entities. For example, if you transacted [[:db/add "tmp-id" :foo "bar"]] and the new entity ID assigned by Datomic was 12345, the backend (and thus persist-fn) would return {"tmp-id" #eid "12345"}.
reverse-tempids will use that return value to map the entity IDs assigned by DataScript to the ones assigned by Datomic. Continuing the previous example, if DataScript assigned an entity ID of 4, then the return value of reverse-tempids would be {4 #eid "12345"}.
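The two ID-mapping steps in the list above can be sketched as follows. This is a deliberately reduced version: it only handles four-element :db/add and :db/retract forms (map forms and other operations pass through untouched), the Eid record is a hypothetical stand-in for the #eid tagged literal, and the schema is assumed to be in DataScript’s format.

```clojure
(defrecord Eid [id]) ; hypothetical stand-in for the #eid tagged literal

(defn translate-eids
  "Swap DataScript IDs for backend IDs, but only in positions the
  grammar says hold an entity ID: the entity slot, and the value slot
  when the attribute is a reference."
  [schema eids tx]
  (let [tr   (fn [x] (get eids x x))
        ref? (fn [a] (= :db.type/ref (get-in schema [a :db/valueType])))]
    (mapv (fn [form]
            (if (and (vector? form)
                     (= 4 (count form))
                     (#{:db/add :db/retract} (first form)))
              (let [[op e a v] form]
                [op (tr e) a (if (ref? a) (tr v) v)])
              form))
          tx)))

(defn reverse-tempids
  "Map the IDs DataScript assigned to the IDs the backend assigned,
  joining the two on the temporary ID. Entries the backend did not
  report are skipped."
  [tx-result backend-eids]
  (into {}
        (keep (fn [[tempid ds-eid]]
                (when-let [backend-eid (get backend-eids tempid)]
                  [ds-eid backend-eid])))
        (:tempids tx-result)))
```

With a mapping of {1 (->Eid "12345")}, the transaction [[:db/add 1 :count 1]] becomes [[:db/add (->Eid "12345") :count 1]]: the entity slot is translated while the value 1 is left alone, which is exactly the case a blind postwalk would get wrong.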

Transactions on the backend

This is one of the nicest parts of the architecture in my opinion. I’ve set things up so that the frontend can send arbitrary transactions and the backend will analyze them to find out if the current user is authorized to make them. That way I didn’t have to code up a new endpoint for each kind of edit the user can make.

I set up a single endpoint, /tx, to receive transactions. Upon receipt, it first makes sure the transaction doesn’t include any transaction functions that haven’t been whitelisted. Then we run the transaction through a transaction function called authorize. This function speculatively runs the transaction using d/with. Then it analyzes the result to find out which entities were affected.

Each entity that was changed must pass an app-specific authorizer function. Here’s an example of an authorizer function; it will allow a user to create a message entity as long as they are listed as the sender of that message:

    (s/def ::message (u/ent-spec :req [:message/text :message/sender]))

    (def authorizers
      {[nil ::message]
       (fn [{:keys [uid eid datoms db-before db-after before after]}]
         (not-empty
           (d/q '[:find ?e
                  :in $ ?e ?user
                  :where
                  [?e :message/sender ?user]]
                db-after eid [:user/uid uid])))})

I’ll dissect this now:

ent-spec is basically a custom version of s/keys that works with Datomic entities. Also, keys are only allowed if they are listed in either :req or :opt. In other words, the frontend can’t attach an attribute to an entity unless we give them explicit permission to do so.
The keys in authorizers are pairs of specs. The first spec defines what type the entity had before running the transaction, and the second spec defines what type it had after. I call this the entity’s “signature.” nil means that the entity didn’t exist. So in this case, we’re saying that this function only applies to newly created ::message entities.
For each entity in the transaction, authorize will look for an authorizer function that has a matching signature. If it finds one, it’ll pass the entity along with some other information to the function. If the function returns truthy, then the change is authorized. If there aren’t any matching authorizer functions that return truthy, then the change is unauthorized and authorize will throw an exception.

The authorizer function receives an argument that includes the following keys:

uid: the ID of the authenticated user. In my case, this is an ID assigned by Firebase Authentication.
eid: an entity ID from the result of the current transaction.
datoms: the subset of datoms added or retracted by the current transaction that apply to eid.
db-before and db-after: these are taken from the transaction result.
before and after: these are the result of (d/pull db '[*] eid) with db-before and db-after, respectively (or nil if the entity didn’t exist).

So, if you provide the specs and the authorizer functions, then authorize can take care of the rest. It separates the logic of what changes are allowed from how those changes are delivered to the backend; so for the latter we can say “Send them all to the same place, and send them in whatever form you want.”
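The signature-matching logic can be illustrated with plain data. In this sketch the “specs” are ordinary predicates held in a hypothetical specs map instead of clojure.spec, ent-spec is a simplified closed-keys check, and the authorizer compares values on the after map rather than querying db-after.

```clojure
(defn ent-spec
  "Closed-keys check: every :req key must be present, and no key outside
  :req and :opt is allowed (ignoring :db/id)."
  [& {:keys [req opt]}]
  (let [allowed (set (concat req opt))]
    (fn [ent]
      (and (map? ent)
           (every? #(contains? ent %) req)
           (every? allowed (keys (dissoc ent :db/id)))))))

(def specs
  {::message (ent-spec :req [:message/text :message/sender])})

(defn matches?
  "A nil spec means the entity must not exist on that side of the change."
  [spec-key ent]
  (if (nil? spec-key)
    (nil? ent)
    (boolean (and ent ((get specs spec-key) ent)))))

(def authorizers
  {[nil ::message]
   (fn [{:keys [uid after]}]
     (= uid (:message/sender after)))})

(defn authorized?
  "True if some authorizer with a matching signature approves the change."
  [authorizers {:keys [before after] :as change}]
  (boolean
   (some (fn [[[before-spec after-spec] f]]
           (and (matches? before-spec before)
                (matches? after-spec after)
                (f change)))
         authorizers)))

(defn authorize!
  "Throw on the first change that no authorizer approves."
  [authorizers changes]
  (doseq [c changes]
    (when-not (authorized? authorizers c)
      (throw (ex-info "Unauthorized change" {:eid (:eid c)})))))

;; Sample changes: a legitimate new message, and one with a forged sender.
(def ok
  {:uid "alice" :eid 1 :before nil
   :after {:db/id 1 :message/text "hi" :message/sender "alice"}})
(def forged (assoc-in ok [:after :message/sender] "bob"))
```

Because the default is to reject anything without a matching signature, editing an existing message or attaching an unlisted attribute fails without any extra code.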

Future work

One of the key takeaways here is that the vast majority of my time was not spent focusing on just the application logic. Eric Normand has described the need for a “boring web framework.” I think it’s a good analysis. As far as I can tell, Clojure has had solid adoption among innovators and early adopters. You can do some cool things if you take the time to set it up yourself, and this is more-or-less fine for people who already know Clojure. But if we automate this process, Clojure will have a much better chance at crossing the Chasm into the early majority. [4]

As I’ve built FlexBudget, I’ve tried to keep as many things separated into libraries as possible. My plan is to continue this process and try to create a web framework that allows even Clojure beginners to get up and running with a Clojure + Datomic stack that has all these architectural components I’ve described. I’m also going to add more components like realtime communications. Especially if/when reactive Datalog becomes available, I think this framework could be a great boon for web app development.

Notes

[1] I think most approaches to budgeting, e.g. zero-sum budgeting, force you to go into too much detail. When I’m monitoring resource usage, I don’t care exactly how the resource is being used – I just want a high-level “is it OK or is it not OK.” If there’s a problem, then I’ll use a profiler/disk usage analyzer/etc. to dig deeper. I use FlexBudget to give me the high-level “is it OK,” but it’s not meant as a profiler.

[2] Before I actually read the documentation for ions, I thought they were just some kind of hack to let you define transaction functions in Cloud. For anyone who doesn’t already know, they’re actually much more than that. They allow you to reuse the Datomic Cloud infrastructure for deploying your application, so you don’t have to deal with setting up your own infrastructure. It’s a big step towards the Holy Grail of only having to think about your application logic.

Given that, I’m much more OK with the >= $30/month price tag of Datomic Cloud. I used to think it was too much just for a side project that isn’t making any money, but now I’m fine with it because of the time it saves me. Or rather, the time it will save me now that I know how to use it.

[3] I was about to talk about the possibility of Cognitect making it so each query group could have its own transactor, but then I realized that would simply reduce to running separate production topologies. So I’m doubtful if there are any possible solutions that are better than what I’ve described.

[4] With respect to the “libraries vs. frameworks” debate: the problem with frameworks isn’t that they do a lot for you, it’s that they’re hard to change if you want something different. With enough care, we could make a library/framework that sets up a lot of defaults for you but still allows you to customize it however you need.
