
Datamining Bandersnatch, Hacker News


    

You may have heard about Bandersnatch, an interactive film released on Netflix as part of the Black Mirror series. I heard about it when it was released, but didn’t get around to watching it until recently, and I was surprised at how deep and thorough the implementation is. The film consists of:

  • segments
  • variables (mostly boolean, some having 3 or 4 possible values)
  • choices
  • segment groups (points at which variables are used to decide the outcome)
  • and preconditions (boolean expressions used in segment groups and elsewhere)

That’s more data than you can shake a stick at, so I spent a few days down the rabbit hole and wrote some code to pick it apart. The fruits of this endeavor can be summarized as:

  1. An understanding of the data format used by the film.
  2. A working stand-alone player for the game’s data files.
  3. An actually complete yet mostly-readable flowchart containing all of the movie’s scenes and logic.
  4. An assortment of interesting observations, such as bugs in the script.

Let’s go over these in order:

    The Data

    There are two relevant metadata files, SegmentMap.json and bandersnatch.json. To parse these, I used my JSON library, which parses JSON into D types (structs, arrays, and associative arrays). This allowed defining and validating the data against a schema, which is a good way to ensure you haven’t missed some field that occurs in only a few places. Some parts of the JSON data do not fit into the above model (due to e.g. heterogeneous arrays); these are saved to JSONFragments, which can later be parsed using e.g. std.json into dynamically-typed variant structs.
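To illustrate the strict-schema idea, here is a minimal sketch in Python rather than D. The `Segment` fields and the sample JSON are made up for the example; the real layout of SegmentMap.json is not shown in this post. The point is only that unknown fields cause a loud failure instead of being silently ignored.

```python
import json
from dataclasses import dataclass, fields
from typing import Any


@dataclass
class Segment:
    # Hypothetical fields, chosen for illustration only.
    start_ms: int
    end_ms: int
    extra: Any = None  # heterogeneous data is kept as a raw, unparsed fragment


def parse_strict(cls, raw: dict):
    """Build `cls` from `raw`, rejecting fields the schema does not know about."""
    known = {f.name for f in fields(cls)}
    unknown = set(raw) - known
    if unknown:
        raise ValueError(f"{cls.__name__}: unexpected fields {sorted(unknown)}")
    return cls(**raw)  # a missing required field raises TypeError here


doc = json.loads('{"intro": {"start_ms": 0, "end_ms": 5000}}')
segments = {name: parse_strict(Segment, raw) for name, raw in doc.items()}
```

Running this over the whole file surfaces any rarely used field the first time it appears, which is the benefit described above.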

    The metadata content can be summarized as follows:

    • Definitions of segments, which subdivide the video file into parts. Every frame of the movie is covered by exactly one segment.
      • Each segment can have a number of moments, which have a start and end timestamp. Moments may overlap, and may have a precondition which decides whether they are activated. A moment's effect is generally to set a variable, or ask the viewer to make a choice.
        • A choice contains the actions to perform when it is selected. This obviously includes which part of the video to play next, but it can also set a variable, or decide the next action by consulting a segment group.
    • Segment groups are branching points dependent on the current state of persistent variables. Choices can point to a segment group instead of a segment; and, if a segment does not end with a choice, its corresponding segment group is used to decide where to go next.
    • There is also various metadata for cosmetic purposes, e.g. how to lay out the choices on the screen, or titles and thumbnails for choice points when navigating past decisions.
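The relationships in the list above can be sketched as a small data model. This is an illustration under stated assumptions, not the actual format: all class, field, and segment names are hypothetical, and it assumes a segment group is an ordered list of (condition, segment) candidates where the first match wins.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class Choice:
    label: str
    target: str                                # segment id or segment group id
    sets: dict = field(default_factory=dict)   # variables written when selected


@dataclass
class Moment:
    start_ms: int
    end_ms: int
    precondition: Optional[Callable[[dict], bool]] = None
    sets: dict = field(default_factory=dict)
    choices: list = field(default_factory=list)

    def active(self, state: dict) -> bool:
        """A moment without a precondition is always active."""
        return self.precondition is None or self.precondition(state)


def resolve(target, groups, state):
    """Turn a target into a concrete segment id, consulting a group if needed."""
    if target not in groups:
        return target  # already a plain segment id
    for condition, segment_id in groups[target]:
        if condition is None or condition(state):
            return segment_id
    raise LookupError(f"no candidate in group {target!r} matches the state")


def next_segment(segment_id, picked, groups, state):
    """Apply the picked choice (if any) and return the segment to play next."""
    if picked is not None:
        state.update(picked.sets)
        return resolve(picked.target, groups, state)
    if segment_id not in groups:
        return None  # no choice and no group: nothing follows
    return resolve(segment_id, groups, state)


# Hypothetical data: one group whose outcome depends on a variable.
groups = {
    "after_kitchen": [
        (lambda s: s.get("cereal") == "a", "seg_a"),
        (None, "seg_b"),  # unconditional fallback
    ],
}
pick = Choice("Cereal A", target="after_kitchen", sets={"cereal": "a"})
print(next_segment("kitchen", pick, groups, {}))  # seg_a
```

Keeping conditions as plain callables keeps the sketch short; a real player would evaluate the precondition expressions from the metadata instead.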

    As part of understanding the data format, I wrote code to dump it to a human-readable HTML file.

    The Player

    As far as I know, there was no fully working player for the film’s data files other than the one provided by Netflix. That is obviously fine, notwithstanding things such as personal preferences or Netflix not being available in all countries, but a standalone player which allows jumping to arbitrary segments certainly makes thoroughly exploring the film much easier.

    The closest thing I found to a working player was the one created by joric, which was later adapted into a stand-alone web page by mehotkhan. However, it was still extremely lacking: the UI was buggy, and the logic used to interpret the data files ranged from flawed to utterly wrong to completely missing, thus rendering many parts of the film inaccessible. I needed to rewrite nearly all of the code to get it to a proper level of function.

    I was also able to convince all authors involved to release the code under an open source license, which means that it is now possible to fully enjoy Bandersnatch using only Free Software. Hurray? Well, the catch is, of course, that, as far as I know, there is no way to obtain these data files (particularly the video and audio tracks) legally, whether you have a Netflix subscription or not. If you know a way, please let me know!

    The player source code, released under the Unlicense, can be found here: HTML, JS, CSS.

    mehotkhan's version comes, for better or for worse, bundled with subtitles and all the metadata necessary for playback; all that’s missing is a video file, which you can provide by dragging and dropping it onto the browser window. The fork containing my fixes and improvements can be accessed here:

    cybershadow.github.io/BandersnatchInteractive

    The Flowchart

      You may have seen some flowcharts of the film; some of their authors may even claim that these flowcharts are fully complete. Well, perhaps they are, for a certain definition of “complete”, but that’s certainly not a definition I would use! But, using the data files, it should be possible to simply generate a full flowchart, right?

      Well, easier said than done. The first attempts were a disaster, due to the sheer number of the film’s segments and the complexity of the logic deciding what should be played next.

      Long story short, after a bunch of research, code, graph theory, and tweaking, here is the final version (spoiler warning!):

      [Figure: the generated flowchart (spoiler warning)]

        Notes:

    • When opened in its own tab, the chart is searchable (with Ctrl+F), and nodes have tooltips detailing exactly which segments they correspond to.
    • Variables whose descriptions start with “Watched…” are flags that can only be set, and are never cleared other than starting from scratch. All other variables can be cleared somewhere.
      • Note how a certain ending is accessible only if you haven't watched something, so it's possible to permanently screw up your “save file” by making the wrong decision.
    • The graph is divided into story and non-story nodes. The story nodes on the left occur during normal playback; the non-story ones on the right consist of abridged versions of segments, rewind / fast-forward segments, as well as all the logic that deals with where to suggest returning upon reaching a dead-end or ending.
    • In case it’s not clear, thick green lines mean “yes” and dotted red lines mean “no”.

    Implementation details:

    • The flowchart graph is heavily optimized. Which is to say, it still covers every frame and every conditional in the film, but some of them have been optimized out or folded together in ways that don’t change their meaning. (For comparison, here is an only partially optimized version. Zoom out or scroll around!)
    • The chart is generated completely by software (my code and GraphViz). Unfortunately, GraphViz is not perfect, and sometimes lays out the nodes in a way that causes edges to overlap, making them difficult to follow. Sorry about that.
    • Conditional statements used to query variables are represented rather differently than in the flowchart. In the data files, the conditionals are a nested tree of boolean expressions, which doesn’t fit too neatly in a flowchart. My implementation evaluates them into a full truth table covering all relevant variables and values, then extracts a tree of yes-or-no questions from it.
    • The source code can be found here (also under the Unlicense). My descriptions for the segments and variables are there too; feel free to send improvements.
    • The tool also generates a number of other files, including an HTML file containing a human-readable dump of the metadata.
    • My biggest regret is, of course, not being able to ask GraphViz to lay out the chart using**************** to connect the nodes.
    • If you notice something that looks wrong in the flowchart, it’s probably that way in the script! There are a few things that don’t seem right, which I’ll cover in the next section.
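The truth-table step mentioned in the implementation details can be sketched as follows. This is one simple way to do it, not necessarily how the actual tool works: the tuple-based expression format, the variable names, and the "first question that splits the rows" heuristic are all assumptions made for the example.

```python
from itertools import product


def evaluate(expr, state):
    """Evaluate a nested boolean expression against a variable assignment."""
    op = expr[0]
    if op == "eq":
        return state[expr[1]] == expr[2]
    if op == "not":
        return not evaluate(expr[1], state)
    if op == "and":
        return all(evaluate(e, state) for e in expr[1:])
    if op == "or":
        return any(evaluate(e, state) for e in expr[1:])
    raise ValueError(f"unknown operator {op!r}")


def truth_table(expr, domains):
    """List (state, result) for every combination of variable values."""
    names = sorted(domains)
    rows = []
    for values in product(*(domains[n] for n in names)):
        state = dict(zip(names, values))
        rows.append((state, evaluate(expr, state)))
    return rows


def question_tree(rows, questions):
    """Turn truth-table rows into nested yes/no questions.

    Returns True/False for a leaf, else (question, yes_subtree, no_subtree).
    """
    outcomes = {result for _, result in rows}
    if len(outcomes) == 1:
        return outcomes.pop()
    for i, (name, value) in enumerate(questions):
        yes = [r for r in rows if r[0][name] == value]
        no = [r for r in rows if r[0][name] != value]
        if yes and no:  # only ask questions that actually split the rows
            rest = questions[:i] + questions[i + 1:]
            return ((name, value), question_tree(yes, rest), question_tree(no, rest))
    raise ValueError("rows cannot be separated by the given questions")


# Hypothetical variables: one boolean, one with three possible values.
domains = {"a": [True, False], "b": ["x", "y", "z"]}
expr = ("and", ("eq", "a", True), ("not", ("eq", "b", "x")))
questions = [(n, v) for n in sorted(domains) for v in domains[n]]
tree = question_tree(truth_table(expr, domains), questions)
print(tree)  # (('a', True), (('b', 'x'), False, True), False)
```

Each inner tuple reads as "is `a` equal to `True`?" with a yes branch and a no branch, which maps directly onto the yes/no edges of a flowchart. The table grows exponentially with the number of variables, so a real tool would restrict it to the variables an expression actually mentions.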

    Odds and ends

    • The first thing to point out would be that the video seems to contain a lot of duplicate data:
      • There are many rewind segments and abridged versions of segments, which are played instead of the full ones on successive replays.
      • Many long segments are near duplicates of others, except for minor differences.
      • Some segments are complete duplicates of others, except for accompanying metadata (which choices are available at their end).
      • Some segments seem to be complete duplicates of others, including metadata. Even though the video file is over five hours long, there is actually less “unique” content than that.
    • There is exactly one unreferenced segment group (shown in the chart from the “UNREACHABLE” node).
    • There is one variable which is only written, never read.
    • Some segment groups have listed possible outcomes guarded by conditions that can never actually be true. Considering that some segment groups have conditional expressions involving as many as 19 variables, bugs in the logic are not at all surprising!
    • Some segments are unreachable through only making decisions from the start of the movie; this includes decisions to go back from a game-over screen. Specifically, some segment groups have possible outcomes that can only be true if it was possible to jump to that point from later in the movie (and not just from a game-over screen). However, I don’t see a way to do this in the official player (in Google Chrome), which is strange, as there is information in the metadata for each choice containing a description and a picture, as if for a UI to select which decision to go back to.
    • Bugs in the script:
      • In one place (Stefan about to take his pills), a choice is presented; given certain conditions, playback jumps directly to the credits instead of the character performing the selected action. This oddity actually causes the flowchart to be considerably messier in that place than it would have been otherwise. There is a sequence of steps that reproduces the bug, and I’ve confirmed it happens with the official player, too.
      • In another place, the selected action does not match what the protagonist actually does. However, this can only happen when coming from one of the “unreachable” segments described above, so I haven’t been able to confirm this bug.

  • So, how to get ending X?

    Just follow the flowchart backwards. Enjoy!
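Following the chart backwards can also be done mechanically. The sketch below is a simplification under stated assumptions: it treats the film as a plain directed graph with made-up node names and ignores variable state, so a path it reports may still be blocked by a condition in the real script. It searches backwards from the ending and returns `None` when the target cannot be reached from the start, which is also how unreachable segments would show up.

```python
from collections import deque


def path_to(edges, start, target):
    """Search backwards from `target`; return a start-to-target path or None."""
    # Build the reversed graph: for each node, which nodes lead into it.
    reverse = {}
    for src, dsts in edges.items():
        for dst in dsts:
            reverse.setdefault(dst, []).append(src)

    toward = {target: None}  # node -> next node on the way to the target
    queue = deque([target])
    while queue:
        node = queue.popleft()
        if node == start:
            path = []
            while node is not None:
                path.append(node)
                node = toward[node]
            return path
        for prev in reverse.get(node, []):
            if prev not in toward:
                toward[prev] = node
                queue.append(prev)
    return None


# Hypothetical graph; "orphan" has no incoming edges.
edges = {
    "start": ["a", "b"],
    "a": ["ending_x"],
    "b": ["game_over"],
    "orphan": ["ending_x"],
}
print(path_to(edges, "start", "ending_x"))  # ['start', 'a', 'ending_x']
print(path_to(edges, "start", "orphan"))    # None
```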

      
