import React from "react";
import styled from "styled-components";

const Figure = styled.img`
  max-width: 100%;
  width: 100%;
`;

export const ProjectReality = () => {
  return (
    <>
      <h2>Overview</h2>
      <p>
        Reality is a game engine designed to allow developers to create large multiplayer in-browser experiences and games
        quickly, without worrying about creating the boilerplate and plumbing for handling players, client synchronization,
        and rendering. Reality is capable of scaling itself automatically to host hundreds or thousands of concurrent players,
        without requiring any human intervention.
      </p>
      <p>
        I created it along with one other member of the <a href={"http://www.dubhacks.co"}>DubHacks</a> technology team as part of an effort to take
        the DubHacks hackathon events online during the pandemic in 2020. In 2020, DubHacks hosted two virtual hackathons,
        each of which was attended by nearly 1000 participants. In an effort to create a community space for participants
        to meet each other and share ideas, we created Reality (the engine) to allow us to create Reality (the online
        multiplayer meeting space). In this post, I'll talk about some of our design goals and technical decisions behind
        Reality (the engine), what it can do, and how we built it.
      </p>
      <h2>Design Goals</h2>
      <p>
        In creating Reality, we sought to build an engine that could host a wide variety of in-browser experiences while
        streamlining the development process to allow us to iterate quickly on feedback from the DubHacks team as well as
        other interested third parties. For example, during our main event, we were able to take feedback from some
        of our participants, implement an entirely new feature from that feedback, and deploy it to the system within a few
        hours.
      </p>
      <p>
        In support of this primary goal, we wanted to create a system that had a significant amount of "development safety"
        built in. I'm a huge advocate of static analysis tools and type systems, so the decision to write Reality in TypeScript
        (as opposed to JavaScript) was an early choice that really paid off in bugs that were easier to solve or never created at all.
      </p>
      <p>
        Also in support of both system safety and iteration speed, we chose to divide the system into a large number of loosely coupled
        modules, both within the internal engine code and in the "client code" that was used to implement our specific
        hackathon experience on the engine. Internal code was divided into a number of packages that were managed and could
        be developed separately, and client code was structured as a grouping of different entities and subsystems. The set of
        entities and subsystems loaded (along with some additional metadata) defines what particular game or experience is being
        implemented on the engine.
      </p>
      <p>
        Note - in this post, I'll use the terms "client" and "internal" to differentiate between the code written by the user
        of the engine versus the code that actually makes up the engine itself. To differentiate between the code that actually
        runs in a browser versus the code that runs on a server, I'll use the terms "client-side" and "server-side."
      </p>
      <h2>Technical Overview</h2>
      <p>
        The Reality engine was built entirely in TypeScript. On the client side, we used React, but only for managing the more
        "traditional" elements of the user interface (e.g. the "log in" page). For the elements of the interface used for the
        actual game experience, we broke out of React in favor of a layer of code managing and rendering to an HTML canvas. On
        the server side, we ran TypeScript (transpiled) on Node, deployed (for our hackathon events) on Google Cloud Platform.
        The client-side code was served using GitHub Pages, proxied by Cloudflare. For persistence, we used a MongoDB instance
        provided by MongoDB Atlas.
      </p>
      <h3>System Organization</h3>
      <p>
        The Reality system, once deployed, consists of several related services. At its most basic, the system consists of
        two main components - the client-side code running in a user's browser (served to them statically through GitHub Pages) and
        a particular collection of server-side code running on a Linux box in the cloud. I chose to call this server-side unit
        "gaia." The gaia instance is responsible for owning the shared state of the world that the players are in, and notifying
        all users of changes to that state. As part of joining the world, the client-side code establishes a WebSocket connection
        to the gaia instance. It maintains this connection throughout the time the user is online, and passes messages through it
        for all client/server communications.
      </p>
      <p>
        The description above is heavily over-simplified, so I'll now describe additional problems that we considered, and the
        additional elements of the system that we created to solve them. The first issue is one of population - there comes a
        point where there are too many players present for the server to be able to reliably handle them all. To resolve this,
        we chose to shard the world across a number of gaia instances. When a user connects to the world, they first connect
        to a coordination server that load balances players across a number of gaia instances. Only once the user receives an assignment
        to a particular gaia instance do they connect to that instance and formally "join" the world. I call this
        coordination server "artemis." Artemis is also responsible for authenticating users when they connect, and it issues
        one-time-use authentication tokens to allow them to connect to the gaia instances in the cluster. Each gaia
        instance is entirely independent of the others - they are running parallel copies of the same world, and users connected
        to one cannot interact with (and are not aware of) users connected to the others. When the existing gaia instances
        have reached their capacities, new gaia processes can be spawned to increase the overall capacity of the cluster.
      </p>
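      <p>
        To make the assignment step concrete, here is a minimal sketch of how a coordinator could pick a gaia instance for a
        new player. The <code>GaiaInstance</code> shape, the <code>assignPlayer</code> function, and the least-loaded policy are all
        assumptions made for illustration; they are not taken from Reality's code.
      </p>
      <pre><code>{`// Hypothetical view of a gaia instance as tracked by the coordinator.
interface GaiaInstance {
  id: string;
  players: number;
  capacity: number;
}

// Pick the least-loaded instance that still has room and reserve a slot on it.
// Returns null when every instance is full, so the caller knows it must
// spawn a new gaia process before retrying.
function assignPlayer(instances: GaiaInstance[]): GaiaInstance | null {
  let best: GaiaInstance | null = null;
  for (const instance of instances) {
    if (instance.players >= instance.capacity) continue;
    if (best === null || instance.players < best.players) best = instance;
  }
  if (best !== null) best.players += 1;
  return best;
}
`}</code></pre>
      <p>
        In a sketch like this, a <code>null</code> result is the signal that the cluster is at capacity and another gaia process
        is needed.
      </p>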
      <p>
        The final piece of the puzzle is again concerned with scale. If all the gaia instances as well as artemis were run on the
        same cloud machine, then it'd be possible for artemis to manage and spawn gaia instances itself. However, this constrains us
        to the limitations of the machine we run on - in particular, its outbound network capacity. (Outbound bandwidth
        is the primary limiting factor of the current system's scalability, mostly because much of the networking code is fairly
        unoptimized: we favored simplicity and rapid development while creating Reality on a tight timeline.) To resolve
        this, a third service was created: "apollo." One apollo instance runs on each cloud machine that will host gaia instances.
        Each apollo instance maintains a WebSocket connection to the artemis service, through which it reports statistics about
        the machine's overall health (CPU load averages, network load, etc.). Over this same channel, artemis can request that
        the apollo instance create a new gaia process or kill an existing (rogue) gaia process.
      </p>
      <Figure src={"/img/reality-arch.png"} alt={"Diagram of the architecture of Reality"}/>
      <p>
        The above diagram demonstrates the majority of the system at work - each machine that hosts gaia instances has an
        apollo instance that creates them. Each apollo maintains a WebSocket connection to artemis to facilitate the management
        and monitoring of these remote machines. Users fetch the client-side code through a request to GitHub Pages. Artemis
        provides an HTTP API to users to allow them to authenticate and request assignment to a particular gaia instance.
        Artemis can also communicate with gaia instances throughout this process through an open WebSocket connection it has
        to each gaia. When the user is assigned to a gaia, it opens a WebSocket connection to the gaia through which the majority
        of the work is done while the user is connected. Both artemis and gaia have connections to the database - artemis for
        user authentication and management, and gaia to allow the game elements to store data persistently (e.g. a user's high
        score). There are a few elements of the system that are not shown, mostly related to parts of Reality's internal
        operation that I am not discussing here.
      </p>
      <p>
        The above structure gives us a lot of room to scale. Artemis can spawn a gaia instance in under 2
        seconds, which means in the worst case it can be spawned on-demand, when a user is in the process of signing in. In
        general, though, artemis attempts to predict when it will need new gaia instances long before it actually begins
        using them. In the future, it would also be possible to add a "cloud service provider" layer to artemis to allow it to
        actually commission new cloud machines in their entirety, meaning the system could scale itself to an effectively
        unlimited degree.
      </p>
      <h3>Code Organization</h3>
      <p>
        The structure of the code itself was another major decision we made early on. Reality, the engine, is actually a
        collection of 24 separate TypeScript projects that each define an individual portion of the overall system. These components
        are divided into three groups: (a) packages, (b) services, and (c) utilities. Utilities are projects that are related
        to the development, deployment, and operation of the Reality system itself. Services are the top-level runnable
        units: apollo, artemis, gaia, and the client-side code are all examples. Finally, packages are everything else: subcomponents
        of the system like the renderer, the database driver, or HTTP APIs.
      </p>
      <p>
        Separate from the internal code, described above, is the "client" code - what we used to implement the specific experience
        that DubHacks used for our events. This includes: (a) regions, (b) entities, and (c) subsystems. Regions are collections
        of metadata that describe the game world: what sprites exist in the world and their locations, whether certain tiles
        can be walked on by the player, and so on. Entities are any elements of the world that have some sort of dynamic behavior.
        They can change their appearance programmatically, maintain internal state, and respond to input events. Examples of
        entities would be the player, or a door that can be opened and closed. Subsystems allow generic extensions to the existing
        behavior of the engine to be implemented. This may be used to provide additional functionality, manage and coordinate
        groups of entities, or render additional UI elements on the webpage. Examples of subsystems would be: our implementation
        of voice chat, text chat (which injects additional UI elements into the React tree), or a puzzle that users solve cooperatively
        by interacting with entities in the world.
      </p>
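      <p>
        As a rough picture of what an entity looks like, the sketch below models the door example from above. The
        <code>Entity</code> interface, the <code>InputEvent</code> type, and the sprite names are hypothetical and chosen only for
        this example; Reality's real entity API is not shown in this post.
      </p>
      <pre><code>{`// Hypothetical input event and entity contract, for illustration only.
type InputEvent = { kind: "interact"; playerId: string };

interface Entity {
  // Name of the sprite the renderer should currently draw.
  sprite(): string;
  // Called when a player directs an input event at this entity.
  onInput(event: InputEvent): void;
}

// A door keeps internal state and changes its appearance when a player interacts.
class Door implements Entity {
  private open = false;

  sprite(): string {
    return this.open ? "door-open" : "door-closed";
  }

  onInput(event: InputEvent): void {
    if (event.kind === "interact") this.open = !this.open;
  }
}
`}</code></pre>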
      <p>
        Both entities and subsystems use existing APIs created by the engine to define their behaviors - there's quite a
        lot they can do with a comparatively small API surface. Our eventual goal is to allow developers to install and
        depend on all the engine code transparently, while providing their own custom regions, entities, and subsystems to create
        the actual content of the particular game they want to implement.
      </p>
      <h3>Everything Else</h3>
      <p>
        There are huge sections of Reality that I didn't choose to discuss in this overview. An incomplete list,
        in no particular order, follows:
      </p>
      <ul>
        <li>
          I created a management portal for performing administrative actions like issuing new accounts, banning or
          purging users, and monitoring the status of the cluster.
        </li>
        <li>
          My partner in creating Reality built a simple Photoshop-like tool for designing and building the game world
          from the individual sprites in spritesheets.
        </li>
        <li>
          I've designed (and partially implemented, currently) a system whereby only the code relevant to a particular
          world region is loaded at any given time, and that code (along with other resources) is dynamically fetched
          from static buckets when the user joins the world.
        </li>
        <li>
          I have designed a dynamic configuration system to be used for modifying or toggling the behavior of the system
          while it's still running. While the configuration system itself hasn't been implemented fully, portions of it
          already exist in the code to allow it to be integrated easily.
        </li>
        <li>
          I created a Slack bot that was used to distribute accounts to all the participants in the DubHacks events where
          we used Reality.
        </li>
      </ul>
      <p>
        I'm more than happy to talk about any of these in great detail, and I may choose to write about them at some
        point. If you're interested, send me an email (andrew AT andrewgies.com) and say hello. :)
      </p>
      <h2>Sub-Projects</h2>
      <p>
        A few of the individual TypeScript projects that make up the larger system are actually themselves fairly robust and
        independent of Reality in general. I mention two interesting examples here.
      </p>
      <h3>Messenger</h3>
      <p>
        Messenger is one of the very first packages that was written during Reality's development. It provides a layer of
        TypeScript abstractions on top of WebSockets, which allows users to define servers and clients, and open a series
        of independent messaging channels that are automatically multiplexed over a single WebSocket connection. It also
        brokers and manages the WebSocket connection itself and, on the server, stores per-client data and provides it to
        message handlers.
      </p>
      <p>
        On top of the basic WebSocket management, Messenger provides ways to describe the valid shapes of messages that may
        appear over a channel, and performs typechecking on all messages as they pass. In particular, it does so in such
        a way that the typechecking can be done statically, so (if Messenger is used correctly) it becomes nearly impossible
        to accidentally send incorrect content over the wire, or to the incorrect place. Furthermore, it provides primitives
        for "message chaining" - the idea that Message A will expect an associated response Message B, which will receive
        a response Message C, and so forth. Message chains get a fully type-safe API that's very reminiscent of Redux sagas,
        allowing the multi-step asynchronous messaging code to be written in a synchronous fashion, with automatic association
        between messages and their responses, and error checking throughout.
      </p>
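      <p>
        The static typechecking idea can be sketched in a few lines of TypeScript. This is not Messenger's actual API:
        the <code>Channel</code> class, the <code>ChatChannel</code> message map, and the in-memory <code>sent</code> array are
        stand-ins invented for this example, and the real package would write to a shared WebSocket instead.
      </p>
      <pre><code>{`// Hypothetical channel definition: message name mapped to payload shape.
interface ChatChannel {
  say: { text: string };
  typing: { active: boolean };
}

// A tiny in-memory stand-in for a channel. It records serialized messages
// instead of writing them to a WebSocket.
class Channel<Messages> {
  readonly sent: string[] = [];

  // The payload type is looked up from the message name, so a mismatched
  // name and payload is a compile-time error rather than a runtime surprise.
  send<K extends keyof Messages & string>(type: K, payload: Messages[K]): void {
    this.sent.push(JSON.stringify({ type, payload }));
  }
}

const chat = new Channel<ChatChannel>();
chat.send("say", { text: "hello" });
// chat.send("say", { active: true }); // would be rejected by the compiler
`}</code></pre>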
      <p>
        Messenger has the potential of being used basically anywhere WebSockets are used, and provides significant static analysis
        and developer friendliness advantages over using traditional WebSockets.
      </p>
      <h3>Eye</h3>
      <p>
        Eye is a tool born of necessity, not desire. Since our code is structured as a series of interconnected TypeScript
        projects that import each other, we needed a way to recursively build those projects in the correct order so that the
        transpiled JavaScript (as well as TS definition files) was available to the downstream projects that depend on it.
        Furthermore, due to the size of the codebase, we needed a way to iterate quickly and only rebuild the parts of the
        project that were changed, without rebuilding the entire codebase on every change. Eye was the solution to this problem.
      </p>
      <p>
        In its current incarnation, Eye is a tool that, when run, scans the repository and builds a dependency graph of all the
        projects that need to be built. It then spawns TypeScript compilers for those projects and monitors their status, creating
        additional compilers as projects have their dependencies built. Finally, it spawns and manages instances of all the
        main services in a Reality cluster, allowing a full running version of Reality to be run locally (with hot reloading)
        for development. Eye also maintains file watchers on all source files, and rebuilds the minimum necessary set of code
        required whenever a change is made while it's running.
      </p>
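      <p>
        The ordering problem at the heart of a tool like this is a topological sort of the dependency graph. The sketch
        below uses a depth-first traversal; it is an assumption about one reasonable way to do it, not a description of
        Eye's implementation, and the <code>DependencyGraph</code> type and <code>buildOrder</code> function are hypothetical names.
      </p>
      <pre><code>{`// Map from project name to the projects it depends on.
type DependencyGraph = Record<string, string[]>;

// Return the projects in an order where every dependency comes before the
// projects that import it. Throws if the graph contains a cycle.
function buildOrder(graph: DependencyGraph): string[] {
  const order: string[] = [];
  const state = new Map<string, "visiting" | "done">();

  const visit = (project: string): void => {
    if (state.get(project) === "done") return;
    if (state.get(project) === "visiting") {
      throw new Error("Dependency cycle at " + project);
    }
    state.set(project, "visiting");
    for (const dep of graph[project] || []) visit(dep);
    state.set(project, "done");
    order.push(project);
  };

  for (const project of Object.keys(graph)) visit(project);
  return order;
}
`}</code></pre>
      <p>
        A watcher-based tool could reuse the same graph in reverse to find which downstream projects need rebuilding after
        a change.
      </p>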
      <h2>What's Next</h2>
      <p>
        There is a very long list of things that we'd still like to change, add, or fix in Reality. This is by no means a completed
        project, and we'd love to continue development and create a complete, ready-to-use engine for other developers. This section
        is a brief, incomplete overview of a few of the gripes I have with the existing system, and where we could improve it
        in the future.
      </p>
      <ol>
        <li>
          <strong>Finish writing the damn thing.</strong> This one's a bit obvious, but there are huge portions of the codebase
          that I've had designed and ready to implement for months, but that haven't been important enough to implement yet. These
          are features that really take Reality from "the engine that was appropriate for DubHacks' use case" to "the engine
          that is appropriate for a vast variety of different applications." Along those lines, there are also a handful of things
          in the engine that were hardcoded to support specific "client" features, but were added directly to the engine as shortcuts
          for features that weren't yet implemented, because we needed to get the system working before participants started
          showing up at our events.
        </li>
        <li>
          <strong>Artemis is a single point of failure, and a bottleneck.</strong> This was a problem that I chose to ignore during
          our development of the original system, because I knew that, for DubHacks' use case, a single artemis could handle the load
          and we could tolerate the risk of failure that comes with that weakness. In the future, it probably makes sense
          to distribute artemis across a number of physical machines, put it behind a cloud provider's load balancer, and have it
          implement a distributed consensus algorithm (like Raft, for example) for managing its internal state.
        </li>
        <li>
          <strong>Package and deploy the engine independently.</strong> We need to come up with a plan for eventually packaging just
          the "internal" portions of the codebase to allow them to be easily imported and used by developers creating their own
          games. This will also require a suite of development, build, and packaging tools.
        </li>
        <li>
          <strong>Ditch eye.</strong> Eye was a tool that came out of necessity, but it has caused so, so much pain. I'd love to transition
          the codebase to use something like TypeScript project references, but there's a decent amount of re-jiggering (technical term)
          needed to have that work smoothly.
        </li>
        <li>
          <strong>WTFM - Write the F*cking Manual.</strong> If this is ever going to become a project that other developers can use to build
          their own games, we'll need to write a very thorough manual, along with a suite of example projects and best practices for
          using the engine to its fullest potential.
        </li>
        <li>
          <strong>Expand development tooling.</strong> While the existing set of tools for developing games with the Reality
          engine (like the Photoshop-clone editor) was sufficient for us, it needs to be expanded significantly before it's
          ready for a general audience.
        </li>
        <li>
          <strong>Make it (more) performant.</strong> I have a long list of "planned optimizations" that I've avoided so far because
          nearly all of them introduce additional code complexity (read: more bugs to worry about). For a truly complete engine,
          I need to invest the time in implementing these and tuning the performance overall. We may also look into a speedier,
          GPU-accelerated (WebGL) renderer for the client-side presentation, to improve performance and/or add graphics features
          to the engine.
        </li>
      </ol>
      <p>
        In all, Reality has been an amazing project to work on, and I've really learned a ton in creating it. There's a lot of room
        for continued growth, and I'm excited to pick it back up and keep improving it. If you're interested in chatting about Reality,
        or just want to share your thoughts about this article, please shoot me an email at andrew AT andrewgies.com anytime.
      </p>
    </>
  );
}
