
Programmer's critique of missing structure of operating systems

Some time ago, I came across a contemplation on the topic of the necessity of operating systems. I've been doing a sort of "research" on this for almost two years, so I've decided to write up the central thoughts, along with links to some relevant sources of information on the topic.

Be aware

      

This blog contains a wall of text.

      

      

I would like to declare that I am talking about myself. When I write that something is "necessary", "possible", or "viable", what I mean is that it's "necessary for me", "possible for me", or "viable for me".

I've seen too much discussion drama caused by reading between the lines and by applying the reader's presuppositions and opinions about the author's character. I know that this notice probably won't help, but please, try to read and think about the presented ideas with an open mind.

If you find the topics presented here controversial, nonsensical, downright outrageous or stupid – please, calm down. Remember the declaration in the previous paragraph. I am not talking about you; I don't have any need or urge to push something on you, break your workflow or slander your system. Take it as my strangeness, or maybe as a possible direction of research. Please don't apply it to yourself.

      

If you don’t want to comment in discussion threads linked at the end of the article, you can send me an email to bystrousak [] kitakitsune.org with some meaningful subject. I've written a lot of articles like this over the years and I still talk to people who read them to these days, even seven years after the publication. I value your opinion.       

             

In recent years, I've worked for several companies that make software. In some cases, I was part of a bigger team; in other cases, I started with a few other programmers from scratch. I was often involved in designing the system's architecture, especially when I was the one creating it from the beginning.

I work as a "backend programmer". My job description is to create systems which read, process and store all kinds of data, parse all kinds of formats, call all kinds of programs, and interact with other systems and devices.

      

In the Czech National Library, I was part of a three-man team working on a system for processing electronic publications. I wrote almost the whole backend, from data storage to communication with existing library systems: Aleph, Kramerius, or LTP (Long Time Preservation). For storage, we were using ext4 on CentOS and the ZODB object storage forced on us by the Plone frontend. For communication, the system used all kinds of protocols, from REST and uploads of XML files, to FTP, to zipping files with a generated XML manifest and uploading them over SMB. The backend microservices and the frontend communicated over RabbitMQ.

In another company, which shall not be named, I worked on a system for DDoS protection. I can't talk about specific details because of the NDA I've signed, but you can get all kinds of buzzwords from the job ads of that time: Linux, Python, SQL and NoSQL databases, all kinds of existing open-source software, and RabbitMQ for communication. Most of the communication backend was written by me.

I also worked on the backend at Nubium, a company which operates probably the largest Czech file hosting website, uloz.to. I refactored and partially designed the pieces of software that handle the storage and redistribution of files across different servers, and also the software that handles processing of user data. The thumbnail icons shown in previews of compressed and video files are one example of the stuff I've done there, among a good deal of other stuff you can't see directly.

I also did websites and REST services for several other companies, and I've touched more exotic stuff as well. For example, I was part of a three-man team working on a video recognition system that detected objects in live video feeds from all kinds of public tunnels (metro, road tunnels). There, I learned a lot from the machine learning optimization tasks I was solving, from the low-level image processing, and from the architecture of the whole backend.

I'm telling you this not only to establish my credibility, but also because I've seen the same pattern everywhere I worked: an independently created, but approximately similar piece of software, which has naturally grown from several requirements:

• Reliable storage of a large number of small files (millions or billions of files, from megabytes to gigabytes per file), or a smaller number of large files (terabytes per file).
• Reading and storing configuration in some kind of structured format (INI, JSON, XML, YAML, ...).
• Structured communication between internal services, but also with external programs.
• Distributed architecture with multiple physical machines, which allows good scaling.

When I was a kid, I thought I knew what an operating system is; Windows, that's for sure! That's the stuff everyone has on their computer; there is a button with "Start" on it in the bottom left corner, and if you start a computer and only a black screen comes on, you have to type "win" to make it appear.

High school taught me a definition, by which "the operating system is equipment in the form of a program, which makes the input-output devices of the computer available".

Later, I found out that there are all kinds of operating systems, and almost all of them do the same stuff; they create a more or less unified abstraction on top of the hardware, allow running different types of programs, offer a more or less advanced filesystem and networking, and also handle memory, multitasking and user permissions.

          

Today, the operating system is, for most people, that thing which they use to run a browser where they watch YouTube and Facebook and send messages to others. Sometimes, it can run games and work with all kinds of hardware, from CD-ROM to keyboard.

          

For advanced users, the OS is a specific form of a database and API, which allows them to run programs and also provides standardized mechanisms for different operations, such as, for example, printing a character on the screen, storing something under a name in permanent storage, creating a connection over the Internet, and so on.

          

We usually tend to think about the OS as something that is given and more or less changeless, something that we pick from the Holy Trinity (Windows, Mac, Linux), around which huddles a range of small alternatives (BSD, Plan9, BeOS, ...).

          

For some reason, almost all operating systems in use are, with a few exceptions, almost the same. I would like to explore whether that is really a desired property, or an artifact of historical development.

Short history of operating systems

The first computers did not have operating systems and were generally everything but user friendly. The only way to program them was by directly changing hardware switches.

Then came computers programmed by punch cards and tapes. In order to use them, operators still had to manually enter a bootloader directly into the memory to tell the computer how to use the punch tape hardware, but the rest of the work was handled by the machine.

The system worked in so-called "batches". A programmer delivered a medium with data and a program to work on this data; a system operator then loaded the given "batch" into the computer and ran the program. When the program finished, the operator returned the results or an error message to the programmer.

Since the whole process required quite a lot of manual work, it was only natural that parts of it were automated in the form of tools and libraries of useful functions; for example, a loader which is loaded automatically when the computer starts and loads the programmer's program at the press of a specific button. Or subroutines which print the content of the memory in case of an error. These bundles of tools were called monitors.

          

As computers at that time were really, really expensive, there was pressure to share them amongst multiple users at the same time. This gave birth to the first operating systems, which merged the functionality of the monitors with support for running multiple programs at once, and also added user management and concurrent user sessions.

Increasingly complex hardware and the complexity of direct control put pressure on universality and reuse of code across different machines and their versions. Operating systems began to support standard devices. It was no longer necessary to give a specific address of memory on the disc drive; you could just store the data under some name, and later, also in a directory. The first file systems were born. With support for concurrent access by multiple users, it was also quite obvious that it was necessary to separate their processes and programs, so that they couldn't read and overwrite each other's data. This gave rise to memory allocators, virtual memory, and to multitasking as we know it.

Operating systems became a layer which stands between the user, who was no longer required to be a programmer, and the hardware. They provide standardized ways to store data, print out characters, handle keyboard input, print on a printer, and run batches or programs.

Later on, graphical interfaces and networking were added. Personal computers offered plenty of devices, all of which had to be supported by the operating system. With a few exceptions, it was all iterative development which did not bring anything new. Everything just got better and improved, but the paradigm itself did not change.

I have no problem with the concept of an operating system as a hardware abstraction. What I have a problem with is the concept of the OS as a user interface. And I don't mean the graphical interface, but everything else you are interacting with, thus everything that has some kind of "shape".

Concept of the file system

          

Think about what it really is: a limited hierarchical key-value database.

It is limited because it limits not only the allowed subset and size of the key, which is in better cases stored as UTF, but also the value itself. You can only store a stream of bytes. That seems reasonable only until you realize that it is a hierarchical (tree-based) database which does not allow you to store structured data directly. The number of inodes (branches in the structure) is also limited: to tens of thousands in the worst cases, and to several million items in one directory at most.
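To make the analogy concrete, here is a toy Python sketch (paths and keys are made up) of the filesystem viewed as the key-value store it effectively is:

    from pathlib import Path

    root = Path("/tmp/fs-as-db")  # illustrative location

    def put(key: str, value: bytes) -> None:
        path = root / key                 # the key is a restricted path
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(value)           # the value can only be raw bytes

    def get(key: str) -> bytes:
        return (root / key).read_bytes()  # no types, no schema, no transactions

    put("users/alice/age", b"31")         # structure must be faked via paths and bytes
    print(get("users/alice/age"))         # -> b'31'; the fact it was a number is gone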

In two of my previous jobs, we had to work around the inode limitation by using stuff like BalancedDiscStorage, or by storing the files into three levels of subdirectories named by the first letters of the MD5 hash of the file.

The filesystem is such a shitty database that for most operations, atomicity is not supported, and transactions are not supported at all. Parallel writes and reads work differently in different operating systems, if at all, and in fact, nothing is guaranteed at all. Compare that to the world of databases, where ACID is considered a matter of course.

The file system is a specifically restricted database. Everyone I know who tried to use it for anything non-trivial shot himself in the foot sooner or later. It doesn't matter whether it is a programmer who thinks he'll just store that sequence into files, or a basic user who can't make sense of his own data and has to create elaborate schemes for storing personal files in groups of directories and use specialized tools to search for stuff in order to find anything. The bleeding edge is when the filesystem can recognize that it has corrupted data and can recover it using error-correcting algorithms, but only if you have the discs in a RAID.
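For the record, a minimal sketch of that MD5-sharding workaround (the helper names are mine):

    import hashlib
    from pathlib import Path

    def sharded_path(root: Path, filename: str) -> Path:
        # Fan files out over three levels of subdirectories named by the
        # first letters of the MD5 hash, so that no single directory
        # accumulates millions of entries.
        digest = hashlib.md5(filename.encode("utf-8")).hexdigest()
        return root / digest[0] / digest[1] / digest[2] / filename

    def store(root: Path, filename: str, data: bytes) -> Path:
        path = sharded_path(root, filename)
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_bytes(data)
        return path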

Don't get me wrong. I get that the point is somewhere else. You want to solve the low-level stuff: disc sectors, journaling, disk platters and partitions and RAIDs, and it is all really great progress compared to how we stored data before. But – from the point of view of the user interface – isn't this progress in the wrong direction?

Original filesystems were a metaphor for storing files in folders. It was meant to be a mental tool, an idea given to people who were used to working with paper, so they could understand what they were doing. And beyond the many technical limitations of present-day filesystems, this metaphor restricts us even more, because it restricts our thinking. I can't help but wonder whether we shouldn't drop the direction of "how can we improve this fifty-year-old metaphor" (by using tags, for example) and focus instead on the question "is this really the best metaphor for storing data?".

Programs

Purely physically, programs as such are nothing but sequences of stored bytes. For the operating system, there can basically be nothing else, as the operating system's file-database does not know how to work with anything else.

Let me give you an overwhelmingly exaggerated explanation of how programs work.

Programs are at first written as source code in a chosen programming language, which is then compiled and linked into one block of data. Then, it is punched onto punch cards (binary data) and plugged into an appropriate box (file) in the correct section of a file cabinet.

When a user wants to run the program, he writes its name on the command line or clicks on an icon somewhere in the system. The stacked punch cards are then taken from the box stored somewhere in the file cabinet and loaded into the memory. For a program, memory looks like one huge linear block, and thanks to virtual memory, it seems like the program is the only thing running on the system. The binary code is then run from the first punched card to the last, but it can conditionally jump and choose a different punched card by its address.

(Structure of the PE executable file. Source: https://github.com/corkami/pics/)
Programs can also read parameters from the command line, use shared libraries, call the API of the operating system, work with the filesystem, send (numeric) signals to other programs, react to such signals, return (numeric) values, or open sockets.

What's wrong with that?
I don't want to say that the concept itself is wrong. BUT. Again, it is the same old metaphor extended a few steps into the future by iterative progress. Everything is really, really low-level. The whole system has changed only minimally over the past decades, and I can't get rid of the idea that rather than arriving at absolute perfection, we are stuck in the twist of a local maximum.

Lisp machines, Smalltalk, and Self's environment have taught me that it can be done differently. Programs do not necessarily have to be a collection of bytes; they can be small separate objects that are (via the metaphor of sending messages) callable from other parts of the system, and that are dynamically compiled as needed.

Are you familiar with the Unix philosophy which says that you should compose solutions to your problems from small utilities; do one thing, focus on it, and do it well? Why don't we take the idea to the next level and make small programs from every function and method of our programs? This would allow them to communicate with the methods and functions of other programs in the same way microservices work.
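As a toy illustration, here is a single function exposed as a small callable service, using Python's standard-library XML-RPC as the nearest off-the-shelf stand-in (the function, port and names are made up):

    from xmlrpc.server import SimpleXMLRPCServer

    def greet(name):
        """Say hello; callable from any other program on the system."""
        return f"Hello, {name}!"

    server = SimpleXMLRPCServer(("localhost", 8000), allow_none=True)
    server.register_function(greet)
    server.register_introspection_functions()  # reflection: callers can list methods
    server.serve_forever()

Any other process can then call xmlrpc.client.ServerProxy("http://localhost:8000").greet("world") as if it were a local function.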

It works in Smalltalk; it can be versioned, it can use and be part of a dependency specification system, handle exceptions, provide help, and who knows what else. Can't we do this with programs in general?

Data without a structure, called Files

The whole UNIX ethos has been a huge drag on industry progress. UNIX's pitch is essentially: how about a system of small functions each doing discrete individual things … but the functions must all have the signature char *function(const char *)? Structured data is for fools, we'll just reductively do everything as text. What we had in the 1970s is fine. Let's stick with that.

          

Source: a comment on http://www.righto.com/…/the-xerox-alto-smalltalk-and-rewriting.html

          

And here we are again, with the binary data. At first glance, it is a brilliant idea. Nothing can be more universal.

So where is the problem?

Virtually all data has its own internal structure. Whenever programmers work with data and don't just move it around, they have to parse it; that means slicing it up and creating trees of structures out of it. Even streams of bytes, for example audio or video data, have their own structure of chunks to iterate over.

(Structure of the WAV file. Source: https://github.com/corkami/pics/)

This happens over and over again. Each program takes the raw data, gives it structure, works with it, and then throws the structure away in the act of collapsing the data back to raw binary.

Current computer culture is obsessed with parsers and external descriptions of data which could carry the structure itself, in the form of metadata. Every day, vast amounts of CPU cycles are unnecessarily wasted on the conversion of raw bytes to structures and back. And every program does it differently, sometimes even differently between versions of the same program. A considerable part of my time as a programmer is spent parsing and converting data which, if it had structure, would be editable by simple transformations. Take a part of the tree here and move it there. Add this to this part of the graph, remove something else here.

The current situation is analogous to sending a Lego castle as individual bricks by post, with a reference to the instructions on how to put it back into the form of the castle. The recipient then has to obtain the instructions somewhere (remember, I put only the reference to the instructions into the package, not the instructions themselves) and then laboriously compose the castle by hand. The absurdity is even more evident when we realize that this is not just about Lego, but about everything. Want to send a glass bottle? Grind it into tiny grains of sand and tell the recipient that he has to make the bottle himself. I understand that in the end, you always have to send raw bytes on some level, just as you have to send Lego bricks eventually. But why don't you send them already in the form of the castle?

It is not just about the absurdity of it all. The point is that at the same time, it is also worse. Worse for the user and definitely worse for the programmers. The data you are using could be self-descriptive, but it is not. Individual items could carry data types as well as documentation, but they don't. Why? Because it is fashionable to keep a bunch of raw binary data and its description separately. Do we really want that, does it really have advantages?

In the past decades, we have seen massive growth in the use of formats such as XML, JSON, and YAML. While that is certainly better than a poke in the eye, it is still not what I'm talking about. I am not concerned about the specific format, but about the structure itself. Why not have all of the data structured?

I'm not talking about parsing with an XML parser here; I'm talking about loading directly into memory, in the style of MessagePack, SBE, FlatBuffers, or Cap'n Proto. Without the need to evaluate text and resolve escape sequences and unicode formats. I'm talking about using people[0].name instead of doc.getElementByTagName("person")[0].name.value in one format and doc["people"][0]["name"] in another. About reading help by simply calling help(people) instead of looking into the documentation.

I am talking about the absence of the need to parse WAV files, because you can naturally iterate over the chunks which were there the whole time. About data describing itself by its structure, not by an external description or a parser. I am talking about direct serialization of "objects", about a unified system supported by all languages, even those that don't have objects, but just some kind of structure.

Would it really be impossible? Why?
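In the small, this already works today. A minimal sketch using the msgpack package (one stand-in for the binary structured formats named above):

    import msgpack

    people = [{"name": "Alice", "age": 31}, {"name": "Bob", "age": 27}]

    # serialization keeps the structure; no ad-hoc text format is invented
    wire = msgpack.packb(people)

    # deserialization restores the same tree directly; there is no text
    # to scan, no escape sequences to resolve, no hand-written parser
    assert msgpack.unpackb(wire) == people
    print(msgpack.unpackb(wire)[0]["name"])  # -> Alice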

Structured communication

Do you notice the pattern in my criticism, anger and passion? The uniting theme is subconscious structure.

We do not perceive filesystems as databases. We do not realize that their structure is hierarchical key-value data. We take for granted that programs are binary blobs, instead of bundles of connected functions, structures and objects that need to communicate with each other. We understand data as dead series of raw bytes, instead of tree and graph structures.

What else has a structure that I haven't mentioned yet? Communication. With the operating system. With programs. Between programs and computers.

/sys

Plan9 was a beautiful step in the direction of a more structured approach to communication with computers. However, after exploring it, I came to the conclusion that although its creators had a general awareness of what they were doing, they did not realize it in its entirety. Maybe partly because they were still too influenced by Unix and the concept of communication by byte streams.

Plan9 is amazing in how you can interact with the system using just file operations on special kinds of files. That's brilliant! Revolutionary! So amazing that Linux now uses half of its features, for example in the /sys subsystem and in FUSE.

Do you know what is missing? Reflection and structured data. If you don't read the manual, you don't know what to write where and what to expect. Things just don't carry explicit semantic meaning.

Here is the code for blinking an LED on a Raspberry Pi (with <N> standing for the GPIO pin number):

    echo <N> > /sys/class/gpio/export
    echo out > /sys/class/gpio/gpio<N>/direction
    echo 1 > /sys/class/gpio/gpio<N>/value

How are errors handled? What happens when you write the string „vánočka" to /sys/class/gpio/gpio<N>/value? Do you get an error code back from the echo command? Or is there information in some other file? How does the system handle parallel writes? And what about reading? What happens when I write "in" instead of "out" to /sys/class/gpio/gpio<N>/direction and then try to read from value?
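For illustration, the same dance from Python; a sketch assuming the sysfs GPIO interface and an arbitrary pin number:

    GPIO = "/sys/class/gpio"
    PIN = "18"  # arbitrary example pin

    with open(f"{GPIO}/export", "w") as f:
        f.write(PIN)          # "export" expects a number, but as text
    with open(f"{GPIO}/gpio{PIN}/direction", "w") as f:
        f.write("out")        # a magic string; "in"/"out" only by convention
    with open(f"{GPIO}/gpio{PIN}/value", "w") as f:
        f.write("1")          # "1"/"0" as text; no types, no declared errors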

The description and the expectations are stored in a different place. Maybe in a man page. Maybe in the description of the module somewhere on the net. The real expectation exists only in the form of machine code, and maybe in source code stored, of course, separately in a completely different place. Everything is stringly typed and there is no standardization. One module may use "in" and "out", a second "master" and "slave", and a third "1" and "0". Data types are not specified or checked. If you really think about it, it's just a really primitive communication framework built on top of files, without any of the advantages of modern communication frameworks.

It is also probably the greatest effort to create something like objects I've seen, without the authors actually realizing that this is what they were trying to do. The structure of /sys is very object-like. The difference between sys.class.gpio.diode and this naive file protocol is that the file implementation is an undescribed key-value set, similar to JSON. It does not explicitly specify the structure, the set of properties, the format of more complicated messages, help, or the format and mechanism of raising exceptions.
    Sockets

          

I understand why sockets were created the way they work. Really. At the time of their creation, this was the best and completely rational option. But why do we still use an unstructured format of binary data transmission even today, when all communication is structured? Even seemingly binary stuff like streamed audio is structured.

Have you ever created an IRC bot? You have to create a connection. Great. Of course, you'll use select, because you don't want to eat the whole CPU just by waiting. Data is read in fixed-size blocks. You then transform them to strings in memory and look for the "\r\n" sequences. You have to create buffers and process only lines terminated with "\r\n", otherwise you end up with incomplete messages. Then, you parse the text structure and transform it into a message. Different messages have different formats, and you have to parse them differently. People still do this, all the time. Every protocol is re-implemented, ad-hoc parsers are written, a thousand times every day. Even though messages could have structure by themselves, the same as everything else.
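A minimal sketch of that buffering dance (the server address is made up):

    import socket

    sock = socket.create_connection(("irc.example.org", 6667))  # hypothetical host
    buffer = b""

    while True:
        chunk = sock.recv(4096)       # data arrives in arbitrary blocks
        if not chunk:
            break
        buffer += chunk
        # only lines terminated by \r\n are complete IRC messages
        while b"\r\n" in buffer:
            line, buffer = buffer.split(b"\r\n", 1)
            message = line.decode("utf-8", errors="replace")
            # the ad-hoc parsing of the textual protocol begins here
            print(message)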

Or HTTP, for example. It is used to transport structured (X)HTML data, right? You have a clearly specified serialization format, its description, and how to parse it. Great. What else could you wish for? But do you really think that HTTP uses HTML for the data transfer? Of course not. It specifies its own protocol, with a key-value serialization that is completely different from every other protocol, with chunks of specification and whatnot.

Email? I don't even want to rant on the topic of the perversion that is email, its protocol and format. A vaguely defined structure in something like five specifications, melded into a result that is implemented differently by each piece of software. If you have ever parsed email headers from a mailing list, you know what I mean. If not, I can only suggest trying it as an exercise. I can guarantee that it'll change your view of the world.

And everything is like that. You almost never really need to transfer a stream of bytes. You need messages. Most of the time it is data in some kind of key-value structure, or an array. Why do we, decades after the invention of the socket, still transport data as streams of bytes and reinvent text protocols all the time? Shouldn't we use something better?

We think it's normal to create a structure here, then serialize it using some kind of serialization protocol, often invented ad hoc. Then we transfer it to a system which needs to have a codec for deserialization and reconstruction of the original structure, which is basically just an external description of the data in the form of code. If we are lucky, this codec is close enough to our serialization format and can deserialize it without loss of information. And why? Why don't we send the structures directly?

ZeroMQ was, IMHO, a step in the right direction, but for the time being, I don't think it has received much warmth.
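For contrast, a minimal sketch of structure-first messaging with ZeroMQ (pyzmq), assuming a peer listening on a made-up local endpoint:

    import zmq

    ctx = zmq.Context()
    sock = ctx.socket(zmq.REQ)
    sock.connect("tcp://localhost:5555")  # hypothetical endpoint

    # the message is a structure from the start; no hand-rolled framing,
    # no \r\n scanning, no ad-hoc text protocol
    sock.send_json({"command": "store", "key": "people", "value": ["Alice"]})
    reply = sock.recv_json()
    print(reply)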

          

You have a program which does something. If you ignore the fact that you can find it and click on it with the mouse and then fill in some kind of form manually, then command line parameters are one of the most common ways to tell the program what you want from it. And almost every program uses a different syntax for parsing them.

Theoretically, there are standards and best practices, but in reality, you just don't know the syntax until you read the man page. Some programs use -param. Others --param. And then there are programs that use just param. Sometimes you separate lists by spaces, other times with commas, or even colons or semicolons. I've seen programs that required parameters encoded in JSON, mixed with "normal" parameters.

If you call the program from some kind of "command line", that is, some kind of shell like bash, then you also get a mixture of the shell's scripting language and its ways of defining strings and variables, and god knows what else (I can think of escape sequences, reserved names of functions, eval sequences using random characters like `, wildcard characters, escapes for stopping the evaluation of wildcards, and so on). This is just one gigantic mess where everyone uses whatever he thinks is nice and parses it without any internal consistency or logic, just because he can.

And then there is the topic of calling programs from other programs. I don't even want to recall how many times I was forced to use some piece of code like

    import subprocess

    sp = subprocess.Popen(
        ['7z', 'a', 'Test.7z', 'Test', '-mx9'],
        stderr=subprocess.STDOUT,
        stdout=subprocess.PIPE,
    )
    stdout, stderr = sp.communicate()
          

Usually followed by a wide range of parsing of the free-form output. And I'm really tired of this shit.

If the arguments are a bit more complicated, then it quickly degrades into a masturbation of string concatenation, where you are never sure whether it is actually safe and really doesn't allow command injection; you don't have any guarantees of supported character encodings; there are pipe and tty behaviors you wouldn't dream of in your worst nightmares; and everything is super complex. For example, your buffer stops responding for larger outputs. Or the program reacts differently when run in interactive mode than in non-interactive mode, and there is no way of forcing it to behave correctly. Or it puts random escape sequences and tty formatting into the output, but only sometimes. And how do you actually transfer structured data to the program and back?

I just don't get this mess. Command line parameters are usually just a list, or a dictionary with nested structures. We could really use a unified and simple language for the specification of such structures. Something easier to write than JSON, and also more expressive.

I completely omit the fact that the necessity of command-line parameters is suppressed entirely once you can send structured messages to a program, just like calling a function in a programming language. After all, running a program is really nothing more than a call to the appropriate function/method with specific parameters, so why do it so strangely and indirectly?

Env variables

Env variables are a dictionary. They literally map data to this structure and behave like that. But because of the missing structure, they are just a one-dimensional dictionary with keys and values as strings. In D, they would be declared as string[string] env;. This is often not enough, because you need to transfer nested or more complicated structures than just simple strings. So you are forced to use serialization.

My soul is screaming, terrorized by statements like "You need to store structured data in an env variable? Just use JSON, or a link to a file!" Why, for God's sake, can't we manage a uniform way of passing and storing data, so that we don't have to mix the syntax of env variables in bash with JSON?
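A minimal sketch of exactly that workaround (the variable name is made up):

    import json
    import os

    # env variables are flat string-to-string maps, so any nested
    # structure has to be smuggled through as serialized text
    os.environ["SERVICE_CONFIG"] = json.dumps(
        {"host": "localhost", "ports": [8080, 8081], "debug": True}
    )

    # every consumer must know, out of band, that this string is JSON
    config = json.loads(os.environ["SERVICE_CONFIG"])
    print(config["ports"][0])  # -> 8080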

Configuration files

Whether you realize it or not, virtually every non-trivial program on your computer requires some kind of configuration. This usually takes the form of some kind of configuration file. Do you know where the file is located? In Linux, it is customary to place them in /etc, but they can also be in your $HOME, or in $HOME/.config, or in a random dot subfolder (like $HOME/.thunderbird/).

And what about the format? I think you can guess it. It can be literally anything anyone has ever thought of. It can be a (pseudo) INI file, or XML, JSON, YAML. Sometimes it is actually a programming language, like Lua, or some kind of hybrid language, like postfix's. Anything goes.

There is a joke that the complexity of each configuration file format increases with time, until it resembles a poorly implemented half of Lisp. My favorite example is Ansible and its half-assed, incomplete parody of a programming language built on top of YAML.
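To make the fragmentation concrete, here is the same trivial configuration read from two of the formats above; every program picks its own format, loader and access idiom (the data is made up):

    import configparser
    import json

    ini = configparser.ConfigParser()
    ini.read_string("[server]\nhost = localhost\nport = 8080\n")

    js = json.loads('{"server": {"host": "localhost", "port": 8080}}')

    # same data, two parsers, two access idioms, and only one of them
    # preserved the type of "port"
    print(ini["server"]["port"])  # '8080' (a string)
    print(js["server"]["port"])   # 8080 (an int)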

I understand where it comes from. I wandered in this direction too. But why can't it be standardized and the same across the whole system? Ideally the same data format that is also the programming language used everywhere. Why can't it be just the objects themselves, stored in a proper location in a configuration namespace?

          

    Logs

          

Every production service needs to log information somewhere. And every single one has to solve the same set of problems:

Anyone familiar with Greenspun's Tenth Rule probably knows how this evolved after that. A definition of ENV variables and their substitution via template systems came. A definition of escape characters, a .dockerignore file. Commands such as CMD received an alternative syntax, so it may be not only CMD program param, but also CMD ["program", "param"].

Things like LABEL made it possible to define additional key=value structures. And of course, of course, someone needed conditional build sections, so hacks were created to get around it by using template systems and setting key values from the outside: https://stackoverflow.com/questions/…/conditional-env-in-dockerfile

It is only a matter of time before someone adds full-fledged conditions and functions and turns it into another dirty programming language. No debugger, no profiler, no nice tracebacks.

Why? Because there is no standard, and the operating system does not provide anything to reach for. The fragmentation thus continues.

It's not so long since I was forced to work with OpenShift at work. I must say I quite liked it at the beginning, and I think it has a bright future. It allows you to create a nice hardware abstraction over computer clusters, and to perform relatively painless deployment of applications on your own corporate cloud.

Nevertheless, during the process of porting several packages from the old RHEL 6 format that ran on physical servers to the new RHEL 7 spec format run inside OpenShift, I was constantly shaking my head at the setup and configuration specifics.

          

To understand this, you have to know that OpenShift's creators allow users to configure it via a web interface, simply by clicking. In addition, they offer a REST API, as well as a command line utility oc that can do the same as the other two interfaces.

I was shaking my head because, whether via the web, the REST API, or oc, the configuration is done by uploading and editing objects described in YAML or JSON (the formats can be interchanged freely).

These objects can be defined in a so-called template, which functions as a kind of Makefile: it executes each block sequentially and at the end should result in a running system. Within a template, it is possible to use a template system that allows you to define and expand variables.

All of this is built on top of YAML, which is a somewhat less chatty brother of JSON. For example, a template might look like this:

    apiVersion: v1
    kind: Template
    metadata:
      name: redis-template
      annotations:
        description: Description of this pod
        iconClass: icon-redis
        tags: database,nosql
    objects:
    - apiVersion: v1
      kind: Pod
      metadata:
        name: redis-master
      spec:
        containers:
        - env:
          - name: REDIS_PASSWORD
            value: ${REDIS_PASSWORD}
          image: dockerfile/redis
          name: master
          ports:
          - containerPort: 6379
            protocol: TCP
    parameters:
    - description: Password used for Redis authentication
      from: '[A-Z0-9]{8}'
      generate: expression
      name: REDIS_PASSWORD
    labels:
      redis: master

Variables are expanded from the outside via command line parameters.

So far, so good. But as you might guess, I have a lot of reservations.

There are absolutely no conditional statements. For example, to conditionally execute some of the code, one must use a template system (Jinja2, for example) on top of this template system.

Of course, definitions of functions (often repeated blocks) and cycles are also missing. If it were anything else, I would probably forgive it, but let's go over a completely practical use case that is really used in our company. We use four environments for each language version of our product, one for each stage of the project: dev, test, stage, and prod. First, developers test their deployment on dev, then testers on test, businessmen on stage, and customers eventually use the environment on prod.

When I deploy a new version of the project, it has to go through all of these environments. Therefore, it would be good to have some way, for example, to run a virtual machine within OpenShift by simply saying "start this four times" on four different environments. Of course, OpenShift does not know how to do this, and it has to be done manually. This quickly becomes a very large pain, because the environments do not differ very much; a little configuration is the only thing that needs to be dynamically modified.

Do you remember that I wrote about different language mutations earlier? Because there are four instances per language, and we currently have four language mutations, together that results in sixteen instances. And we have projects where there are ten instances per development environment, per language. These instances must be instantiated manually. For one project. And there are something like twenty projects just for our team.

This is obviously impossible to manage manually, so we were forced to build our own automation in the form of Python scripts, shell scripts and Ansible. I am not happy about it. And here I even skip over the fact that OpenShift uses Docker, so it is also necessary to handle the Dockerfile format and command line arguments along with OpenShift's YAML format.
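A sketch of the kind of automation those missing loops force on you: expanding one template across environments and language mutations by shelling out to oc (the file name and parameter names are made up):

    import itertools
    import subprocess

    ENVIRONMENTS = ["dev", "test", "stage", "prod"]
    LANGUAGES = ["cs", "en", "de", "pl"]  # hypothetical language mutations

    for env, lang in itertools.product(ENVIRONMENTS, LANGUAGES):
        # `oc process` expands the template parameters; `oc apply`
        # uploads the resulting objects
        processed = subprocess.run(
            ["oc", "process", "-f", "template.yaml",
             "-p", f"ENV={env}", "-p", f"LANGUAGE={lang}"],
            check=True, capture_output=True,
        )
        subprocess.run(["oc", "apply", "-f", "-"],
                       input=processed.stdout, check=True)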

All of this just because there is no uniform, widely accepted configuration language format that is also a scripting language. Something like a Lisp.

Edit: I've spent a good deal of time since then writing a utility that automates management of Docker builds, Docker images, OpenShift environments and deployments of the various language mutations, which sometimes run in completely separate environments and sometimes don't.

Edit 2: And there is also the problem of logging, as the containers don't really have their own disc space and should ideally run as stateless machines, which was a lot of fun. We were forced by our infra guys to emit each log line as a single-line (!) JSON message to stdout. I spent several hours arguing against this solution. I tried to explain to them that this is not a good idea for several reasons, one of which is that multiple threads will mangle messages longer than several kilobytes. It didn't help.

                 

Ansible is a beautiful example of how it turns out when someone just ad-hoc tries to cook up a language like the one I am calling for, without thinking much about it, and without really any theory of programming languages.

          

It began as a YAML-based configuration declaration language describing what to do. Here is an example of an nginx installation:

    - name: Install nginx
      hosts: <host name / ip>
      become: true
      tasks:
      - name: Add epel-release repo
        yum:
          name: epel-release
          state: present
      - name: Install nginx
        yum:
          name: nginx
          state: present
      - name: Insert Index Page
        template:
          src: index.html
          dest: /usr/share/nginx/html/index.html
      - name: Start NGiNX
        service:
          name: nginx
          state: started

It is quite easy to read as a YAML key-value structure. But of course, OF COURSE (!), it couldn't just stay like this. Someone thought that it would be great to add conditions and cycles. Of course, as YAML:

              

https://docs.ansible.com/ansible/2.5/user_guide/playbooks_conditionals.html
https://docs.ansible.com/ansible/2.5/user_guide/playbooks_blocks.html

    tasks:
    - command: echo {{ item }}
      loop: [0, 2, 4, 6, 8, 10]
      when: item > 5

This, of course, created a programming language without any consistency, internal logic or sense. A language that was forced to go on and on, to define blocks, and exceptions, and error handling, and functions. All built on top of YAML. All of it without a debugger, IDE, tooling, autocomplete or meaningful stacktraces, completely ignoring the sixty years of development of the programming language user interface.

Do you know that thin, high-pitched sound, kind of like a high-frequency screech, which you can hear in perfect silence just before bedtime? Those are the angels roaring in frustration. It is the roar of my soul over all the human idiocy that is piled and piled on top of itself.

Can't we just agree on something that would make sense, at least for once?

How about we take away all the weird string formats, whether it's passing command line parameters when launching programs or communication between programs, and replace them with a concise, easy-to-write language for structure definitions? A language that would at the same time describe the data, making use of data types such as dict, list and string, and delegation (inheritance). Neither the user nor the program would need to do most of the parsing and guesswork over the structure any longer, because it would be handled by the OS itself, and programs would get the data directly in their native format.

I think that there has been enough criticism. Let's have a look at some ideas on how to get further.

Concerning objects

          

              

http://bitsavers.trailing-edge.com/pdf/symbolics/software/genera_8/Genera_Concepts.pdf

You will find that decades ago, there was a system that fulfilled much of everything I wrote about. A system so good that it still gathers groups of enthusiasts around it.

          

A direct inspiration is also Self, about which I've written a lot here: 📂 Series about Self

          

Also on similar topics:

• Squeak is like an operating system
• The Operating System: Should There Be One? (article, reddit discussion)
• Systems, not Programs
• Things UNIX can do atomically
• Buffering in standard streams
• A dream of an ultimate OS
• Kaitai Struct
• Tree Notation

Since I've published this article, there have been several other relevant articles:

• Files, Formats and Byte Arrays
• Programs are a prison: Rethinking the fundamental building blocks of computing interfaces
• Unix, Plan 9, and Lurking Smalltalk

Discussion: lobste.rs thread, /r/programming