[Attention conservation notice: machine learning framework shop talk / whining that will read like gibberish if you are lucky enough to have never used a thing called “tensorflow”]

I’ve probably spent solid hours this week trying (for “fun,” not work) to get some simple tensorflow 1.x code to run on a cloud TPU in the Google-approved manner

By which I mean, it runs okay albeit slowly and inefficiently if I just throw it in a tf.Session() like I’m used to, but I wanted to actually utilize the TPU, so I’ve been trying to use all the correct™ stuff like, uh…

… “Datasets” and “TFRecords” containing “tf.Examples” (who knew serializing dicts of ints could be so painful?) and “Estimators” / “Strategies” (which do overlapping things but are mutually exclusive!) and “tf.functions” with “GradientTapes,” because the “Strategies” apparently require lazily-defined eagerly-executed computations instead of eagerly-defined lazily-executed computations, and “object-based checkpoints,” which are the new official™ thing to do instead of the old Saver checkpoints, except the equally official™ “Estimators” do the old checkpoints by default, and oh by the way, if you have code that just defines tensorflow ops directly instead of getting them via tf.keras objects (which do all sorts of higher-level management and thus can’t serve as safe drop-in equivalents for “legacy” code using raw ops, and by “legacy” I mean “early 2019”), then fuck you, because every code example of a correct™ feature gets its ops from tf.keras, and aaaaaaaaaaaaaargh!!
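(For the record, the “serializing dicts of ints” part concretely looks something like this – a minimal sketch, assuming tf 1.13-era names like tf.io.TFRecordWriter; the filename and keys are made up:)

    import tensorflow as tf

    # one dict of ints -> an Int64List per value, wrapped in a Feature,
    # wrapped in a Features, wrapped in an Example
    def dict_to_example(d):
        return tf.train.Example(features=tf.train.Features(feature={
            k: tf.train.Feature(int64_list=tf.train.Int64List(value=[v]))
            for k, v in d.items()
        }))

    # write it out as a TFRecord file (filename hypothetical)
    with tf.io.TFRecordWriter("stuff.tfrecord") as writer:
        writer.write(dict_to_example({"x": 1, "y": 2}).SerializeToString())

    # and to read it back, you re-declare the schema you just wrote
    spec = {"x": tf.io.FixedLenFeature([], tf.int64),
            "y": tf.io.FixedLenFeature([], tf.int64)}
    dataset = tf.data.TFRecordDataset("stuff.tfrecord").map(
        lambda rec: tf.io.parse_single_example(rec, spec))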

This solidifies the impression I got last time I tried trusting Google and using fancy official™ tensorflow features. That was with “tensorflow-probability,” a fancy new part of tensorflow which had been officially released and included cool stuff like Bayesian keras layers … which were impossible to save to disk and then load again … and this was a known issue, and the closest thing to an official reaction was from a dev who’d moved off the project and was now re-implementing the same thing in some newly-or-differently official™ tensorflow tentacle called “tensor2tensor,” and was like “uh yeah the version here does work, you can try tensor2tensor if you want”
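(Schematically, the failure looked like this – a hypothetical sketch, with tfp.layers.DenseFlipout standing in for whichever Bayesian layer it was:)

    import tensorflow as tf
    import tensorflow_probability as tfp

    # a tiny keras model with Bayesian ("Flipout") layers
    model = tf.keras.Sequential([
        tfp.layers.DenseFlipout(8, activation="relu", input_shape=(4,)),
        tfp.layers.DenseFlipout(1),
    ])
    model.compile(optimizer="adam", loss="mse")

    # the save -> load round trip below was the part that didn't work
    # (the known issue with the probabilistic layers at the time)
    model.save("bayes.h5")
    restored = tf.keras.models.load_model("bayes.h5")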

(I still don’t know what “tensor2tensor” is. I refuse to learn what “tensor2tensor” is. They’re not going to get me again, dammit)

I don’t know whether the relevant category is “popular neural net frameworks,” or “large open-sourced projects from the big 5 tech companies,” or what, but there’s a certain category of currently popular software that is frustrating in this distinctive way. (Cloud computing stuff that doesn’t involve ML is often kind of like this too.) There’s a bundle of frustrating qualities like:

  • They keep releasing new abstractions that are hard to port old code into, and their documentation advocates constantly porting everything to keep up
  • The new abstractions always have (misleading) generic English names like “Example” or “Estimator” or “Dataset” or “Model,” giving them a spurious aura of legitimacy and standardization while also fostering namespace collisions in the user’s brain
  • The thing is massive and complicated but never feels done, or even stable – a hallmark of such software is that there is no such thing as “an expert user” but merely “an expert user ca. 2017” and the very different “an expert user ca. 2019,” etc. Everything is half-broken because it’s very new, and if it’s old enough to have a chance at not being half-broken, it’s no longer official™ (and possibly even deprecated)

  • Documentation is a chilly API reference plus a disorganized, decontextualized collection of demos / tutorials for specific features written in an excited “it’s so easy!” tone, lacking the conventional “User’s Manual” level that strings the features together into mature workflows
  • Built to do really fancy cutting-edge stuff and also to make common workflows look very easy, but without a middle ground, so either you are doing something very ordinary and your code is 2 lines that magically work, or you’re lost in cryptic error messages coming from mysterious middleware objects that, you learn 5 hours later, exist so the code can run on a steam-powered deep-sea quantum computer cluster or something
  • Actually, you know what it reminds me of, in some ways? With the profusion of backwards-incompatible wheel-reinventing features, and the hard-won platform-specific knowledge you just know will be out of date in two years? Microsoft Office. I just want to make a neural net with something that doesn’t remind me of Microsoft Office. Is that too much to ask?
