A study diary of interactive fiction (IF) design theory and techniques, with a special focus on choice-based narratives. On-going.
Author: Vương Cẩm Vy
Interactive fiction (IF) seems self-explanatory enough – it is fiction (i.e. stories) with interaction (i.e. required input) from the audience. Perhaps, to guard against pedantry, we must also add that said input goes a bit beyond the mere flipping of a page or pressing “play” on the remote. Interactivity here involves freedom to engage with the material within an intended choice-space that is significantly more multi-dimensional than what’s found in so-called “passive media”. It is intended that the audience may choose to do A instead of B, thus missing out on B altogether, and it will still make a cohesive experience, whereas one might be pretty lost if whole chapters were to be ripped out from a book.
Early digital IFs appeared alongside the popularization of the home computer and PC gaming. These IFs were parser-based – they used software to parse user input, often a simple [verb][noun] phrase, so that the IF and the user could enter into a dialog: “Here is a room, and there’s a cup in the middle.” “>I [pick up][cup].” “Ah, you have a cup now.” A large part of the challenge for the audience was to figure out the right keywords and the order in which to use them. Their minimal presentation resembled a terminal console because that was the dominant visual language at the time, prior to widespread GUIs, but with very little, such software could simulate exploring physical spaces quite well. While parser-based IFs used to be synonymous with the genre as a whole, they are less popular these days.
Another well-known type of IF is the choice-based kind, where options are presented to the player, taking them to different outcomes of the story. In contrast to parser-based IFs, choice-based IFs do not make the audience figure out the keywords. All interactions at any point in the story are presented as discrete choices, hence the name. Like Kleenex or Band-Aid, Choose Your Own Adventure (CYOA) gamebooks have been genericized into meaning any type of IF where choices are presented (“Option A, go to page 12”), the story branches into different paths, and there are multiple endings depending on the audience’s choices.
IF is still a relatively niche genre of fiction (or game), so underground that some consider it dead. However, while media explicitly labeled as IF will likely not get the commercial attention that, I think, they deserve, IF as an approach to storytelling has seeped so deep into all forms of media, especially video games, that it has become the standard. Most video games with a story these days carry with them some expectation of narrative branches (perhaps to many projects’ detriment), player agency in the story, narrative responsiveness to player choice, and multiple endings. They are more or less IFs with gameplay from other genres attached. We see this a lot in adventure games and visual novel games. And when the RPG genre, often associated with a lot of improvisation, is translated to the digital medium – itself a deterministic space – IF becomes an acceptable “pidgin” to bridge the gap between freeform player agency and machine logic. Most RPGs, at least in the “Western” sense, are IFs.
Outside of video games, IF – especially the CYOA variety – has become its own form of mass literacy. The invention of hypertext – text with hyperlinks that allow users to immediately access different text – has contributed greatly to making this kind of IF more accessible. Not only is it easier to make; hypertext – due to the internet – has also become one of the primary ways modern people engage with textual content. Its rhizomatic logic has become more than familiar. I think it is a truly novel way of navigating text, perhaps only experimented with before (Vladimir Nabokov and his system of notes come to mind) but never so widely implemented, let alone accepted.
The hypertext and IFs have converged into a revival of the genre. The free software Twine has made it far easier for those with little coding knowledge to start creating interactive fiction via hypertext, thus producing a plethora of new entries (much to the oldheads’ chagrin), some quite experimental and genre-defining. Twine is seeing a lot of use as an educational tool and a prototyping tool. Ren’Py is beyond popular; many of its visual novels have seen great commercial success and critical acclaim. Even the parser-based games are coming back with the release of the Inform 7 engine.
I find all of this extremely exciting, and I’d like to get in on it, for no other reason than to learn.
It’s obvious that I think about IFs fairly often. I read up on them, play them, and have made a few as well. While that’s no indication of the quality of what I can make, if I want to make a half-decent IF at all (and I do want to), then I should have a place to jot down what I’ve learned, done, or imagined. This “study diary” is the place for it. Else, I’ll keep forgetting.
Right now, I am primarily focused on learning Twine (version 2, story format Harlowe 3.3.9). I'll eventually make time for the other stuff as well.
Does interactive fiction (IF) have to be text-based? Historically, this has been the case, but that would be a contingent description, not a definitional necessity. In fact, modern IFs are almost always a multimedia affair. The various productions from IF giants like Failbetter Games or Inkle Studios can attest to this. Genres that are descended from IF, such as the visual novel, are text-based to the same extent that comic books and graphic novels are text-based – that is, the textual element is almost incidental. Genre definitions are always playing catch-up with a porous and evolving practical understanding. It’s an “I know it when I see it” type of deal. So, is the association between IFs and text-based-ness only a historical artifact?
We should look at the term “fiction” – a literary term. IF software has variously been called “adventure games”, but this sounds quite different from “fiction”. A “game” carries with it a certain connotation – that it will be played instead of closely read, critically interpreted, and appreciated as a form of literature. Perhaps quite unfortunately, it lacks a certain prestige. “Fiction”, on the other hand, places it squarely in peerage and competition with works of “serious” art, conferring on it both an aura and a responsibility, even though an IF is quite free to be as pulpy as the author wants, and a so-called “adventure game” can be so innovative and masterful that it uplifts the entire genre to its own field of critical scholarship. Sometimes, we forget these things, and having a literary name can be an aspirational reminder that we are not so bound by genre conventions (that a “game” must be “game-like”, that an “adventure” must deliver upon its dramatic promise).
Thus a more critical defense of the name “IF” might be: So long as an IF is interactive (that is, it uses a certain interactive language to evoke a specific emotional experience in its audience, much like how film language does to moviegoers) and is a fiction (though not even necessarily untrue or made-up, only that it makes use of the literary syntax and semantics established by our long history of artistic communication), then it is an IF. It being text-based is not needed.
Given this definition, it might be the case that all games with narratives are in fact IFs, and we, shackled by historical contingencies, have yet to admit this.
That is certainly a very permissive definition. It admits almost all games with a narrative, thereby voiding itself. But are definitions just playing catch-up? Or is there a reason beyond the merely historical as to why IFs are so often text-based? To dial it back a bit, let’s examine a few ways in which this definition isn’t so helpful:
So, it is quite easy to say that IFs do not have to be text-based, but once we venture out of the domain of text and its literary baggage, we quickly find ourselves in strange waters. This ambivalence is a necessary and unavoidable part of IFs being subsumed under multimedia projects: Each medium has its own “will” or tendency towards its own ends.
For example, anytime a mechanic exposes number-states, such as combat or commerce, the player is invited to optimize those numbers. Does it make narrative sense or have a narrative impact? Most of the time, failure to adequately regulate these numbers results in quite a sudden end to the narrative – a “game over”. A book being yanked out of one’s hands is hardly a satisfying end to a story, nor is it intended (the game expects the player to return with better knowledge of the number-states, so that they may survive long enough to receive the next narrative breadcrumb). The medium of numbers in this case threatens to interrupt the narrative. If the game were to be considered fully an IF, then it would be in big trouble. But thankfully, it’s not, and that’s because not all narratives, no matter how branching or IF-like, can be considered an IF if other mechanics or media are too prominent. In such a case, it doesn’t matter if the story structure is linear or if the player has no narrative agency; the game is being played not like an IF but much more like an engine being tuned. Despite my preference that all narratives be critically examined using the toolbox of an IF designer, so that stories are well-integrated with gameplay, the definition cannot be too permissive.
IF being text-based isn’t only a historical artifact; there is something inherently valuable in an IF being text-based, or at least literary. IFs traffic in the sense of narrative agency – that is, the player understands the story and wants to affect it a certain way, even if it’s just a momentary expression of individuality (either the player’s or the player character’s) that ultimately does not change the overall story, and to have the fiction respond in kind. At the core of this experience is a literacy – comprehending tropes, archetypes, motivations, in-world consequences, paratextual comparisons, etc. Without such comprehension, the player does not meaningfully anticipate, and therefore, they do not meaningfully interact with the fiction. IFs then must devote a certain share of their engagement time to this task of exploring the text, comprehending it, and making informed narrative decisions. If too many non-narrative elements interject, then the player finds themselves forgetting where they are in a story, what they ought to be doing, and whom they are playing as.
Thus, being text-based confers a unique advantage in creating nuances and context for informed narrative choices (the design and writing skill of the author notwithstanding). Other media within a game – such as cutscenes, voiceovers, etc. – may also accomplish this goal, but the prohibitive cost and difficulty of producing them place a limit on how intricate and subtle the fiction can be.
We can look to the From Software formula for narrative subtlety – so subtle that the narrative detaches completely from the gameplay, and the way players make narrative choices (or even start and complete a quest) can be so fragmentary as to preclude informed narrative engagement. This formula works very well for From Software precisely because their games are not IFs, no matter how much text there is or how many branches or endings there are. From Software fills the gaps between these story fragments with engaging, action-oriented gameplay based on an explore-theorycraft/build-try loop, with impressive art direction that does a lot of narrativizing via environmental design. These things, not IF-like engagement with the story, take up by far the larger share of the playtime. One cannot have it all.
Thus, I propose the following model that might explain IFs’ association with being text-based: Given a triangle with the three corners...
… Each work can only specialize in one or two things, but not all three before sacrificing its quality.
Through the above model, we see one way in which IF needs to be text-based, not just because it is tradition. Of course, this is not to be taken as an authoritative conclusion; I am operating on very limited information here, and I may change my mind in the future. But it is useful for an aspiring IF designer to think about what exactly is so special about IFs. They have an obligation to be “good literature” (even a children’s book still needs to be compared to the best), but they also have to produce a meaningful choice-space beyond the curiosity of seeing what’s next. Text is often the most accessible and the most well-studied way to accomplish this.
(Although, with changes in tech, education, and social standards rapidly occurring, this may not always be the case; text may turn out to also be a mere historical artifact of a still-literate society).
Stories have a temporal element: chronological order establishes a causal relationship between events – we might call this diegetic time. But even non-chronological narration has a latent time-perception. For example: jumping between different events, perspectives, or factoids evokes a thematic or logical through-line between them. Their chronology might be only implied or non-existent; what matters more is the audience’s own time-perception, or rather the timing of their exposure to various pieces of the narrative. This is non-diegetic time.
Both of these forms of time play unique roles in the narrative: Diegetic time is the backbone of the narrative, allowing the author to organize the basic relationship between various events and their consequences. Non-diegetic time is the stylistic and ultimately most affecting kind of time in any piece of art because in some sense, diegetic time is only “latent time”, constructed and reconstructed per the audience’s exposure to the stylistic ordering of events, i.e. non-diegetic time or “manifest time”, which is inexorable insofar as we can assume that the audience perceives and acts in acceptably linear time.
Such stylistic (or more accurately, artistic) decisions hinge greatly on the author’s understanding of and control over the underlying latent time, or canonically diegetic time. Namely, literary devices such as the “plot twist” or “red herring”, or even whole genres like mystery, only work because the author has a clearer view of the underlying timeline, while the audience is intentionally fed information piecemeal, which may or may not (mis)lead them into constructing false timelines, and the revelation of their falseness is part of the narrative’s thrill. Or, in speculative genres where “lore” is a large part of the appeal, much of the diegetic time (i.e. world history) prior to the story’s main plot has already been articulated outside of the text, and the text forms a series of lore disclosures that is artfully incomplete. The interplay between the latent diegesis and manifest presentation forms the art.
Therefore, for an author, discounting projects where chronology is experimented with (or done away with altogether), diegetic time or latent time should be designed pretty early on since it is the “backbone”, without which seemingly artful approaches to storytelling will quickly devolve into flailing. The most obvious aspects of this diegetic time design to prioritize are basic plotting, arcs, world-building, timeline, etc.; these things come hand-in-hand with fiction writing. However, IF brings with it an extra consideration compared to “traditional” fiction writing: The interaction.
IF is characterized by interaction – that is, the audience input shaping the story. The story prompts, the audience gives a command (either via a text command seen in parser-based IFs, or via choosing one option among many in choice-based IFs, AKA “CYOA”), and the story replies; the reply itself is a prompt for the next set of choices or commands. The question is: From one prompt/reply to the next, how has the IF moved on the diegetic timeline?
Have they moved forward a minute of in-game time? Or are they now in a flashback, where in-game time pauses so that narrative time can move back a decade? Does the audience interact with the flashback?
We see that in IFs, the boundary between diegetic and non-diegetic time sometimes wavers: If a flashback is only in narration and does not allow for interaction, then the flashback acts only in non-diegetic time, leaving the diegetic timeline intact, thus effectively functioning as a commonplace literary device. However, if the flashback prompts the audience’s input, and said input may change the course of the story, then that flashback is now a part of the diegetic timeline, functionally a time travel device. IFs’ interactive nature challenges the distinction between diegetic and non-diegetic time, which is all the more reason to approach it systematically. The author’s self-imposed rules, especially through design and the subsequent code, will go a long way in clarifying which elements are latent and canonical, and which elements are manifest and subject to artistic play.
A person may sit before a text for any given amount of time and that has very little to do with how much time has passed in-story. As such, when engaging with fiction, all perceptions of time and its passage should be properly understood as illusory; the perception is constructed via a system of consistently applied symbols: A clock in the corner of the screen, a readout of the in-story date, the mention of “yesterday” within the narration, the orange-and-purple tint of the UI reminiscent of an afternoon’s light, etc. All these symbols are only capable of keeping up the illusion because they are consistent: consistent with an external schema and internally consistent with themselves.
At the heart of external consistency is a kind of skeuomorphism, where the symbols mimic those seen elsewhere, whose meanings have been established prior. An example would be the hourglass or a stopwatch. They may convey superficial information (e.g. that this symbol has to do with time) or actual, deeper information (e.g. it is three o’clock). The extent to which an IF uses skeuomorphism can vary, but much of it is only a “foot in the door” – a familiar entry point to acclimate the audience to a different world of meaning (but there is no need to simulate the real world one-to-one). Once inside the door, however, the audience enters into a sort of “magic circle”, to borrow a term from ludology, where they can effectively pretend that the rules are real because those rules are applied consistently and have their own internal coherence.
Internal consistency here is quite literal: Since IFs are almost always computer programs, or at least are programmatic in some other, analog way, the display and manipulation of their time-related signs are also in some way procedural. When the computer image of a clock strikes three, it is because somewhere in the program, there is a variable that tells the pixel arms of the clock where to rest, and that variable stands for three o’clock or an equivalent value. Absent a computer program, the participants must do their best to impose a program onto their analog materials in a “seriously pretending” way common to all play. Absent rules, the author must impose consistency in the narration in the same way that a novel must keep track of its own time.
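To make the clock example concrete, here is a minimal sketch in Python (used purely for illustration; it is not Harlowe, and the names minutes_since_midnight and hand_angles are my own assumptions). A single variable is the source of truth, and the position of each hand is derived from it.

```python
# Hypothetical sketch: one integer is the single source of truth for the clock.
minutes_since_midnight = 15 * 60  # stands for three o'clock in the afternoon


def hand_angles(total_minutes: int) -> tuple[float, float]:
    """Return (hour_hand, minute_hand) angles in degrees, clockwise from 12."""
    hours, minutes = divmod(total_minutes % (12 * 60), 60)
    hour_angle = hours * 30 + minutes * 0.5   # 360 / 12 = 30 degrees per hour
    minute_angle = minutes * 6.0              # 360 / 60 = 6 degrees per minute
    return hour_angle, minute_angle


print(hand_angles(minutes_since_midnight))  # (90.0, 0.0)
```

Whatever draws the clock only ever reads the variable; nothing about the picture itself "knows" the time.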
(Note: Even if the program’s time is pegged to realtime, e.g. synchronized with an atomic clock, our formal analysis should consider this correspondence merely incidental – a particularly convincing illusion.)
Let us proceed with the assumption that a program – digital or analog – is required to uphold this illusion of time. What is the internal logic here? What are the units and intervals? What constitutes or triggers the passage of time? And how should breaks between diegetic and non-diegetic time (e.g. flashbacks, flashforwards, non-linear storytelling) be treated?
In other words: How do we design time?
In discussing design techniques, I must necessarily narrow the scope to only hypertext IFs or at least choice-based IFs. Parallels in implementation may exist in other kinds of programs, genres, or media.
First, simulated time has several primary properties:
The above are sufficient in forming an IF’s time mechanic. But what is it for? In many senses, what the interval stands for is mechanically irrelevant; it is only a kind of “flavor text”. To a lesser extent, the same thing could be said about triggers, the resultant granularity, and the entire time mechanic if there is no mechanical integration with the rest of the IF – that is, if there is no interaction between time, choices, and story.
So, we arrive at the second aspect of IF’s time: its integration into and use in the rest of the program.
The time mechanic is a state machine in two senses:
Boiling it down to the two most important considerations for time in IFs:
Programmatic time is discrete – that is, it proceeds by predefined intervals. The frequency (in relation to real, non-diegetic time) by which this happens is the granularity of time. Very finely granular time (for example, proceeding by seconds, pegged to a system clock) is still discrete, and it is not continuous in the same way realtime is.
First, we discuss several ways to define time intervals, the spectrum of which varies from abstract to realistic.
Abstract time represents rough chunks of time. In its most extreme, most abstract forms, abstract time is not recognizably “time”, but it still functions as such (i.e. it keeps a diegetic “timeline” to sequentially display narrative content, and it regulates non-diegetic time perception by pacing out said narrative content). Time here may look like states or flags; the variable “worldState = 1” is sequentially prior to “worldState = 2”, which corresponds to no real-world unit of timekeeping but can still perform the same job as more realistic time units. Abstract time signals a sequence but not necessarily fluid continuity, meaning that a break of undetermined length may have occurred in between two otherwise contiguous time-states.
Realistic time is minutes, hours, days, fortnights, years, centuries, eons, etc. There is a spectrum between realistic and abstract (is an “eon” realistic or abstract?), and the difference is purely in flavor, just as it is between the different units of time. A simulated minute can last up to an hour in stories where time slows down to a crawl, and every thought and heartbeat is registered. A century may last only a single click in stories that span eons, and interactions represent civilizational movements instead of personal experience. However, it is a two-way street: Realistic time brings with it the expectation that each concrete passage is strictly sequential and representational (therefore restricting narrative potential; one should only be able to do so much within a minute), with few to no breaks, and if there are, such breaks can and should be calculated.
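As a rough sketch of abstract time, the “worldState” idea can be written as an ordered set of states. The state names, passage text, and helper functions below are hypothetical, and Python stands in for whatever the IF is actually built with; the only property that matters is that the states are ordered.

```python
from enum import IntEnum


class WorldState(IntEnum):
    """Hypothetical story states; only their order matters, not their length."""
    ARRIVAL = 1
    INVESTIGATION = 2
    RECKONING = 3


def advance(state: WorldState) -> WorldState:
    """Move to the next state, staying put at the final one."""
    return WorldState(min(state + 1, max(WorldState)))


def passage_for(state: WorldState) -> str:
    """Pick narrative content by state, the way a timeline would."""
    passages = {
        WorldState.ARRIVAL: "You step off the train.",
        WorldState.INVESTIGATION: "The town has grown used to your questions.",
        WorldState.RECKONING: "There is no one left to ask.",
    }
    return passages[state]


state = advance(WorldState.ARRIVAL)
print(state.name, "-", passage_for(state))
```

Nothing here says how long the gap between ARRIVAL and INVESTIGATION is; it could be an hour or a year, which is exactly the looseness abstract time allows.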
An IF may keep only one layer of time, or it may keep several layers in tandem. A calendar is one familiar example of multilayered time, where each day, month, year, etc. is its own state machine, which takes input from the layer beneath it and outputs into the layer above. Abstract and realistic time can also combine: There might be a clock, but there might also be a general, irreversible “story state”, which may or may not proceed with said clock. This brings us to the central question: How should time in IFs proceed?
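The calendar idea can be sketched as a chain of small counters, each passing a “carry” to the layer above when it wraps around. This is only one possible arrangement; the TimeLayer class and the sizes chosen (three phases per day, seven days per week) are assumptions made for illustration.

```python
class TimeLayer:
    """One layer of time (hypothetical): counts ticks and carries upward."""

    def __init__(self, name, size=None, parent=None):
        self.name = name
        self.size = size      # ticks before wrapping; None means it never wraps
        self.value = 0
        self.parent = parent  # the layer above, which receives the carry

    def tick(self):
        self.value += 1
        if self.size is not None and self.value >= self.size:
            self.value = 0
            if self.parent is not None:
                self.parent.tick()


# Assumed sizes for illustration: 3 phases per day, 7 days per week.
week = TimeLayer("week")
day = TimeLayer("day", size=7, parent=week)
phase = TimeLayer("phase", size=3, parent=day)

for _ in range(22):
    phase.tick()

print(week.value, day.value, phase.value)  # 1 0 1
```

Only the lowest layer is ever ticked directly. An irreversible “story state” could sit beside this chain as a separate layer with its own triggers, rather than being fed by the clock.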
As said, programmatic time is discrete, since it is a simulation of time and not actual time. This means that there must also be discrete triggers for time’s passage; the frequency of these triggers forms the subjective perception of granularity. Such triggers should be designed prior to coding so that consistent rules are established, making it easier for the author to control granularity. The kind of time (realistic or abstract) and the interval unit should give some clue as to what these rules might be.
For example, in Dungeons and Dragons, combat proceeds in turns, and a full series of turns (when all characters have used up their turns) forms a round. We have a double-layered abstract time mechanic. But how much can be convincingly accomplished in a turn? While realistic time does not factor into actual game mechanics, it is generally understood that each turn lasts for six seconds, and so a character might cast a spell, run a few meters, stab an enemy several times, but not, for example, read a book. A character is allotted several (kinds of) action points, expenditure of which constitutes a passage of time. That is the general rule, but mechanistically, the character might wait in place or end their turn early. In terms of computer programming, the trigger for time’s passage (a turn’s end) is then either the expenditure of all action points, or a button which manually passes a turn. All of these things are discrete triggers.
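The two triggers described above can be sketched as follows. This is a simplified illustration, not the tabletop rules themselves: the Turn class, its single pool of action points, and the default budget of three are all assumptions.

```python
class Turn:
    """A single turn with a hypothetical budget of action points."""

    def __init__(self, action_points=3):
        self.remaining = action_points
        self.over = False

    def act(self, cost):
        """Try to spend points; return False if the action is refused."""
        if self.over or cost > self.remaining:
            return False            # nothing happens, so no time passes
        self.remaining -= cost
        if self.remaining == 0:
            self.over = True        # trigger 1: all points have been spent
        return True

    def end_early(self):
        self.over = True            # trigger 2: the player passes manually


turn = Turn()
turn.act(2)        # e.g. moving
turn.act(1)        # e.g. attacking; this spends the last point
print(turn.over)   # True
```

A round would then be a second layer on top: once every character’s turn is over, the round counter advances and fresh turns are handed out.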
In many visual novels, especially the slice-of-life variety, where day-to-day living is simulated, a day may be broken down into phases (e.g. morning, afternoon, evening). Each phase permits a “costly” action, which advances a phase after its conclusion, or a “costless” action, which does not. Because phases of the day are half-realistic, half-abstract intervals, they are permissive in terms of what constitutes a costly or costless action; it is far easier to tell and therefore more restrictive if an action costs six seconds than whether it costs an entire afternoon. As such, the rule must be established outside of the program, at the authored fiction level. An example of such a rule would be: Talking to NPCs should not cost a phase, but if an NPC gets the player character to do something in the course of that interaction, then a phase is passed. What it means to “do something” is decided outside of the program, since the program might not be able to intuit what action should be costly. A “chunkier” variety of this is when time (or story state) advances forward only when the player has reached a narrative milestone.
Another method is to set a time passage trigger at all interactions, even in each dialogue choice, with each choice advancing time by, say, a minute. This is the method used in Disco Elysium and Esoteric Ebb, both of which have an in-game clock that only advances with the player’s action, i.e. talking to NPCs and interacting with objects. In these games, talking to people, investigating scenes, and discovering things is at the heart of the gameplay. As such, making each interaction costly presents a tension: Does the player spend more time (and potentially waste it) with one NPC or scene? Or do they proceed to the next bit of their investigation? Such tension should be carefully balanced and tested; too tight, and the player is put into an optimizing mindset, afraid to engage with the fiction per their curiosity, thus breaking the game’s intended experience; too permissive, and the player loses their sense of momentum and purpose, ending up clicking on every object and dialogue choice just to pass the time. It is worth noting that the two aforementioned games only advance time when the interaction is novel; interactions that have already been chosen do not advance time upon subsequent visits.
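A small sketch of the phase system might look like the following. The important detail is that the program is simply told whether an action is costly; that judgment is made by the author. The DayClock class and the example actions are hypothetical.

```python
PHASES = ["morning", "afternoon", "evening"]


class DayClock:
    """Hypothetical day/phase clock for a slice-of-life IF."""

    def __init__(self):
        self.day = 1
        self.phase_index = 0

    @property
    def phase(self):
        return PHASES[self.phase_index]

    def perform(self, action_name, costly):
        """The author, not the program, decides whether an action is costly."""
        if costly:
            self.phase_index += 1
            if self.phase_index == len(PHASES):
                self.phase_index = 0   # the day rolls over after the evening
                self.day += 1
        return f"Day {self.day}, {self.phase}: after {action_name}"


clock = DayClock()
print(clock.perform("chatting with Sue", costly=False))
print(clock.perform("running Sue's errand", costly=True))
```

The first call leaves the clock at Day 1, morning; the second moves it to Day 1, afternoon. In a hypertext IF, the costly flag would typically be attached to individual links or passages by hand.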
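The “only novel interactions cost time” rule can be sketched with a set of already-seen choices. To be clear, this is not how either game is actually implemented; it is a minimal illustration, and the class name, choice identifiers, and one-minute step are assumptions.

```python
class InvestigationClock:
    """Hypothetical clock that advances only on first-time interactions."""

    def __init__(self, start_minutes=8 * 60, minutes_per_choice=1):
        self.minutes = start_minutes
        self.step = minutes_per_choice
        self.seen = set()

    def choose(self, choice_id):
        """Advance time only the first time a given choice is picked."""
        if choice_id not in self.seen:
            self.seen.add(choice_id)
            self.minutes += self.step
        return self.display()

    def display(self):
        hours, minutes = divmod(self.minutes % (24 * 60), 60)
        return f"{hours:02d}:{minutes:02d}"


clock = InvestigationClock()
print(clock.choose("ask_about_boots"))  # 08:01
print(clock.choose("ask_about_boots"))  # 08:01 (repeat visit is free)
print(clock.choose("inspect_tree"))     # 08:02
```

Tuning minutes_per_choice, or giving different choices different costs, is one of the knobs for balancing the tension between curiosity and momentum.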
Perhaps not an IF, strictly speaking, Outer Wilds uses an in-game clock pegged to realtime; each run lasts for twenty-two real minutes, which invariably ends in what is understood as a “game over” in any other game, perhaps with the exception of several scenes with permissive time, existing outside of this twenty-two-minute regimen. Time in this game passes inexorably and fluidly, as fluid as simulated time can be, and the play experience is frantic and desperate, occasionally giving way to resignation once a wrong, run-ruining move has been made, momentarily opening up a brief window of doomed calm where the player has time to look around and witness the beauty of this constructed universe. Relating this back to IFs, we might imagine how such pressure can lead a resigned audience to take a seemingly sub-optimal route and discover new things, even if those things are useless, right before the run’s end.
An interesting example of an IF that combines several of the above methods is The Master of the Land by Pseudavid. This game uses the run-based, Groundhog Day treatment of time seen in Outer Wilds, but instead of being pegged to a clock, time is dependent on player movement and interaction, as in Disco Elysium. For hypertext IFs, it is far easier to hide triggers for time’s passage inside each link so that each click corresponds to a discrete interval being advanced. Using a live-update game clock is technically doable, but it is far more challenging to make it enjoyable. After all, the player character’s movement speed in Outer Wilds is consistent, and optimizing their routes is part of the game’s challenge. Meanwhile, being a fast reader does not make one a better player in The Master of the Land; skimming or skipping text can be antithetical to a text-based IF. Therefore, it is generally advisable that hypertext IFs use a trigger that corresponds to the audience’s interaction, which is done at their leisure.
The range of methods for designing time can be daunting, but it is not yet the most important aspect of simulating time in IFs. Now that we have a functioning pretend-clock, what do we use it for? How does it contribute to the meaningful interaction – the informed decision-making, the ability to see ahead and anticipate, or more plainly, the “playing” that these programs elicit? It is here that we must discuss the purpose of time.
Diegetic time is an in-universe force, which acts upon the various objects found within it. If something does not have an effect, it may as well not exist. Time in IF is a state machine, and its purpose is to output the input for subsequent state machines – the other mechanics.
To describe in the most general and genre-agnostic terms that I can manage: time should change the interactions in other mechanics. It opens up new interactions, disables some, or changes the logic and behaviors inside those other mechanics.
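As one hedged illustration of this principle, time can act as a filter on which choices are offered at all. The Choice structure, the opening hours, and the labels below are made up for the example.

```python
from dataclasses import dataclass


@dataclass
class Choice:
    label: str
    opens: int   # first hour (0-23) at which the choice is offered
    closes: int  # first hour at which it is no longer offered


# Hypothetical choices and hours, for illustration only.
CHOICES = [
    Choice("Visit the bakery", opens=6, closes=12),
    Choice("Knock on Sue's door", opens=17, closes=22),
    Choice("Walk along the river", opens=0, closes=24),
]


def available(hour):
    """The clock's output becomes the input of the choice mechanic."""
    return [c.label for c in CHOICES if c.opens <= hour < c.closes]


print(available(9))   # ['Visit the bakery', 'Walk along the river']
print(available(18))  # ["Knock on Sue's door", 'Walk along the river']
```

The same pattern extends beyond enabling and disabling: the hour could just as well select between variants of a passage or change what a given choice does.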
More concrete examples might be:
But mechanical interactions alone are only the building block of an experience; time, as a mechanic, must also contribute to the intended experience. What is time meant to do for the audience? To pressure them? To frustrate them? To reward them? To inculcate a sense of realism? If there is no clear reason why there must be a time mechanic, then the mechanic should not exist. No-one likes reading under pressure; it is distracting. If the time mechanic is designed first, and other mechanics are designed around it, the author must self-reflect and ask whether they have a solution in need of a problem, not the other way around.
In terms of player experience, a time mechanic can serve several purposes, of which the following is a non-exhaustive list:
This has been briefly discussed, but it is important enough to revisit: Since time is a mechanic, and all mechanics are to be learned as part of the challenge and enjoyment of a game (insofar as an IF can be considered a “game”, more novelistic or experimental IFs notwithstanding), then it is vitally important that the audience be able to anticipate some part of time’s effect. This entails exposing the underlying state information to the audience (e.g. a time display) and communicating consistently time’s (expected) effect on the various mechanics (e.g. “Sue is not home right now. Come back later in the day.”).
Of course, there are scenarios where some element of time ought to be kept hidden. For example, in an IF with multiple layers of time – let’s say a day-to-day phase system and a story state (act) system – it is the phase that is clearly exposed, but the story state is hidden. This is because, as a fiction, the work should consider what information should be available to the player character. A person is typically aware of the time of the day, but not “which act” they are in “their story”. Like “chapters”, story states are not part of the diegetic awareness, even if they govern a diegetic timeline. If knowing does not lead to a meaningful change in audience interaction, then the information should not be exposed at all.
The mechanical use of time in IFs is a vast and varied topic, which cannot be exhaustively covered in this entry alone. Thus far, we have covered some important considerations in designing time mechanics, but we will leave the discussion open and without a conclusion. More will be discussed in the future.
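The split between an exposed phase and a hidden story state can be sketched like this. The StoryClock class, its fields, and the sample text are assumptions for illustration; the point is only that one layer reaches the display while the other silently changes the fiction.

```python
class StoryClock:
    """Hypothetical two-layer clock: one layer shown, one kept hidden."""

    def __init__(self):
        self.phase = "morning"   # diegetic: the player character would know this
        self._act = 1            # hidden: governs content but is never displayed

    def status_line(self):
        """Only the exposed layer reaches the audience."""
        return f"Time: {self.phase}"

    def next_act(self):
        self._act += 1

    def greeting(self):
        """The hidden layer still changes what the fiction says."""
        if self._act == 1:
            return "Sue waves from her porch."
        return "Sue's porch is empty."


clock = StoryClock()
print(clock.status_line(), "|", clock.greeting())
clock.next_act()
print(clock.status_line(), "|", clock.greeting())
```

The status line reads the same before and after the act changes, yet the scene itself is different, so the audience feels the story state without ever being shown it.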