Electronic Literature

Finding and Evaluating the Code

Lev Manovich’s first Principle of New Media, numerical representation, begins with the axiom “All new media objects, whether they are created from scratch on computers or converted from analog media sources, are composed of digital code” (49). Though this coded representation is sometimes invisible to the user, the complexity of digital representation can be pushed to the foreground; for example, carrier (becoming symborg) switches between different platforms–HTML, Java, Shockwave, and VRML–for different purposes, raising the question of what such a heterogeneous collage shows about the essential unity of a work. Code can be experienced subtly but still have an impact on the user’s experience. Understanding the relationship between these levels can seem like a daunting prospect, but it can yield insights into a work. This essay serves as a guide to levels of code and their components, beginning with some general principles that will apply to any work and ending with an exploration of these ideas in three works that use three different languages.

Executed, Source, and Structure Layers

In carrier, the character sHe is simultaneously represented by the words displayed in Java applets, the Java source code that constructs the applets, and the process that turns that source code into binary code executed by the computer’s processor. Manovich's fifth principle of New Media, transcoding, becomes evident: a digital work is represented simultaneously by multiple signification systems and interpreted by different agents. Though users may think they only perceive the first of these, the other levels create the user’s experience.

Generally, we may think of three different layers at which a digital work can signify; the boundaries between them are permeable, but the three terms provide a useful entry point for thinking about a work. The executed layer is the work experienced in the browser window, and the process that drives it; it is not only visual and auditory experience, but also the work’s process as it is experienced. Underlying the executed layer is the source code, the original text created by the author. Between these two layers is the execution of the code, based on the language structure. Depending on the language, code may have multiple layers of interpretation before it reaches the binary code executed by the processor; though this process is mostly invisible, the structures within a language and the consequent choices that a programmer makes yield meaning. This section will explain some general principles for understanding how these layers interact.

The Executed Layer

The executed layer is what the user experiences, and it is here that we see what the other two layers are directed toward producing. To understand the executed layer, we must define the work’s process, that component of the work that produces what the user experiences insofar as the user is aware of this process. I use the term “algorithm” to refer to this process, as understood by Ian Bogost, who defines it as "sets of operations of varying complexity" (Unit Operations, 40). Implicit in this definition is that an algorithm can contain simpler algorithms, the lines between which are ambiguous. Every work, even a static web page, contains algorithms, whether they are as simple as coloring text or as complicated as playing a game of poker.

An easy way to begin with this layer is to develop a schema of the work by simply describing what happens, as if writing a step-by-step summary of a novel. After summarizing the work briefly from beginning to end, break the steps you have identified down into simpler steps, exploring how each of them might be an algorithm itself. As part of this schema, notice any significant values that change over the course of the program, understanding values broadly as any kind of data, from numbers to the position of a window. Note anything that can't be explained simply by the executed work, such as any decision-making criteria the work leaves unclear. The schema developed should remain provisional, pending exploration of the code. Not every step in this schema will provide fruitful interpretive fodder, but this is a necessary place to start.

Note too that execution is a spectrum rather than an either-or, and the code and executed layers are permeable. Many factors stand between source code and execution. Some works consciously interrupt execution with bad code, causing code to rise to the surface. Furthermore, not all work that uses code executes; some work, called "codework," exists only as source code, which invites questions about why the code does not function. Other work, in its executed content, uses the vocabulary of code. In both of these, though, it still makes sense to start with the executed level because these works foreground the question of execution.

Source Code

Once you have a schema of the program's execution, the next step is to acquire the source code. The steps of this process depend on the language, and I will explain them in later sections. Once the source code has been obtained, read it in parallel with the work, mapping your schema onto the source code to answer questions raised by the program's execution. This process may yield new insights about the execution; source code can provide a clearer understanding of the methods behind the program. If you do not have any experience with programming, this section will introduce some basic concepts common in most languages and how they translate to the executed layer.

Variables: These represent data the program uses. These can often be mapped, not necessarily one-to-one, to the values in a schema of the program. Considering what kinds of variables exist, how they are named, and how they are modified over the course of a program answers some questions about how an algorithm is implemented. Variables fall into a few general types; the ones of most concern are primitives (numbers that usually are called simply "variables"), arrays (collections of variables, usually denoted by [] at the end of the name), and strings (sequences of text characters).

Operators: These are non-alphanumeric symbols used to accomplish tasks; for example, the = operator sets a variable equal to a value as in: i = 2. These are important to conditional statements.

Functions: These are commands that accomplish tasks. Most languages allow programmers to divide their programs into these smaller units if the tasks they accomplish happen in many different circumstances. In all the languages explored here, functions are written in the format function(value1, value2), and all the commands in a function are contained in curly braces {}. You may consider how these are named and designed (what steps they include) and how they map to the steps of your schema.

Statements: These are combinations of variables, operators, and functions that are executed at one time. The statement var abc = 3 in Javascript creates a variable 'abc' and assigns it the value 3. Like functions, these can be mapped, alone or in groups, to steps in your schema.

Conditional Statements: These are used by the program to make decisions; they can be used to direct the program's flow based on whether or not a condition is true, or to execute certain steps several times until a condition is met. These can be mapped to the decisions and choices identified in your schema.

Objects: This is a slightly more difficult concept to explain. Objects are collections of data with functions that operate on that data; they are intended to make programming easier by representing things in the real world. For example, a camera object might have an array that represents a series of pictures inside it and a button function that takes a picture and writes it to that array. The variables and functions within an object are called its members and are referred to by putting the variable or function name after the object name with a dot, for example Camera.TakePicture(). Javascript, Flash, and Java are all object-oriented languages. Because these contain both functions and data, objects can be mapped to values and steps in your schema.

Classes: These are blueprints for different kinds of objects, usually defined with "class" statements. A programmer defines a class and then creates individual objects from that class; for example, a programmer could create two cameras, one called MyCamera and one called YourCamera and then use them to take different pictures. They would have the same blueprint but would have different pictures inside them and be distinct objects.

Extending: Classes, in many languages, can be reused to build new classes; in Flash and Java, the statement would look like "class CameraWithTelePhotoLens extends Camera." The CameraWithTelePhotoLens class could add a function to focus the lens, but retain the other variables and functions of the other class. All Java applets use class extensions in different forms.

Comments: These are notes that programmers write to themselves as reminders for how elements of programs work. Comments are not executed and are generally removed if a program is compiled; for this reason, you will not find these in Flash or Java if decompiled, but you may in HTML or Javascript. They can sometimes make reading the code easier.
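The concepts above can be seen together in a short sketch. It is written in Javascript only for convenience, and every name in it (the camera, its pictures, the word subjects) is invented for illustration rather than taken from any work in the collection. One assumption should be stated plainly: the "class ... extends" form is the Flash and Java wording quoted above, and the sketch relies on present-day Javascript (since the 2015 edition of the language) accepting a similar form.

```javascript
// Hypothetical camera example; no name here comes from a work in the collection.

// Variables: a primitive, a string, and an array.
var shotsLeft = 3;       // primitive (a number)
var owner = "reader";    // string
var pictures = [];       // array, empty to begin with

// Function: a named unit that performs one task.
function takePicture(subject) {
  // Conditional statement: decide whether the task can run.
  if (shotsLeft > 0) {
    pictures.push(subject);     // statement: add the subject to the array
    shotsLeft = shotsLeft - 1;  // operators: = assigns, - subtracts
    return true;
  }
  return false;
}

// Loop: repeat a step until a condition is no longer true.
var subjects = ["temple", "nature", "river", "cloud"];
var i = 0;
while (i < subjects.length && takePicture(subjects[i])) {
  i = i + 1;
}

// Class: a blueprint from which separate objects are made.
class Camera {
  constructor() {
    this.pictures = [];         // member variable
  }
  takePicture(subject) {        // member function
    this.pictures.push(subject);
  }
}

// Extending: the new class keeps the members of Camera and adds one.
class CameraWithTelePhotoLens extends Camera {
  focus(distance) {
    this.focusDistance = distance;
  }
}

// Two distinct objects: same blueprint family, different contents.
var myCamera = new Camera();
var yourCamera = new CameraWithTelePhotoLens();
myCamera.takePicture("temple");
yourCamera.focus(50);
yourCamera.takePicture("nature");
```

Read against a schema, the loop corresponds to a repeated step, the if statement to a decision, and shotsLeft to a value that changes over the course of the program.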

Language Structure

Once we understand how code constructs its algorithms, it's useful to start to figure out how the structure of the language bears on the way the work is constructed with code. Programmers and artists take these into account when they select a language.

Executing agent: Though some languages are turned into binary code, the simplest machine code that is run directly by the microprocessor, all languages used in Internet art require another program that stands between the code and the microprocessor. This makes them slower to run but accommodates the differences between computers more easily. In some of these, it will be the browser itself while others require another program called a plug-in.

Interpreted and Compiled Languages: This is a practical concern. Interpreted languages are distributed as source code and can be read directly. Many other languages are "compiled," or turned into a simpler set of instructions (called "bytecode") that an executing agent can convert to binary code more quickly but that humans find difficult to read. To understand the code, bytecode must be turned back into source code; this requires another program called a decompiler. For any works that require decompilation, it may be best for the teacher to decompile them and distribute the files to students instead of having each student do their own decompilation.

Design: Languages generally are written with certain purposes in mind, which affect the designer's decisions about a language's structure and how easy different tasks are to accomplish with it. Many languages do not use objects at all, while even some object-oriented languages do not allow creating classes. The two kinds of languages most common in the collection are markup and object-oriented programming languages. Programming languages can perform complex calculations and are very powerful, while markup languages are limited to telling an executing agent, like your browser, how to display certain kinds of information.

Understanding Languages

There are at least ten different languages represented in the ELC:1; I will discuss three of the most common. The first step in interpreting the code is determining the language. This will be fairly clear from which browser plug-ins load, if any, what file extensions the files use, and what the code itself looks like. File extensions can be found in the HTML source code of a page, using your browser's "View Source" command, because works written in these languages are generally included in web pages. Following the sections on each language is a demonstration of reading a specific work with the principles previously outlined. These are not meant to exclude other interpretations and serve only to provide examples that might inspire further interest.

Javascript

Javascript is a language developed to supplement HTML, the language used in web pages. HTML can only make static web pages, but Javascript can make them dynamic and interactive; for example, Stud Poetry is a poker game written in HTML and Javascript. Javascript is an interpreted language distributed as source code and executed by a browser, so it requires no special tools. Different browsers interpret Javascript differently, so if the title page of a work recommends using a certain browser, it is important to follow those recommendations.

Javascript can either be included in an HTML page itself, in a <script> tag, or in .js files linked from the page, which can be viewed in your browser by overwriting the filename of the page that includes them (the part after the last slash) with the .js file’s name. The similarity between the names "Java" and "Javascript" has led many people to assume that the two languages are related, but they have very little in common beyond surface syntax, and knowing one does not make learning the other easier.

The syntax of Javascript looks very similar to that of Actionscript (the programming language in Flash), though the two have different capabilities. Variables in Javascript are declared with the "var" command, arrays with the var name = new Array() statement, and functions with the function command, so identifying these language elements is quite easy. Functions and arrays in Javascript are objects; unlike most other object-oriented languages, the Javascript of this period has no class statement, and objects are instead built from existing objects, to which methods or variables can be added at any time. Comments in Javascript are marked with double slashes (//) or begin with (/*) and end with (*/).

Stud Poetry: Algorithm Explained in Source

One of the collection's more light-hearted works, Stud Poetry, demonstrates the power of Javascript. This is a game in which the user plays poker against famous French poets, in which all the cards are ranked with words instead of numbers. The game begins with a dealer being selected and all players receiving two cards and being allowed to bet or stand in order. This routine continues as more cards are dealt. In the end, all the cards are revealed and the player with the best hand wins; the situation repeats until all players run out of money. This description raises questions about two algorithms in particular: how the best hand is determined (a set of values that is unclear) and how the computer-controlled players make decisions. With these two questions, we can turn to the code to find answers.

The code shows that both of these are random. studpoetry.js includes two new values to consider, two arrays called tightness and aggression at lines 38-39. The program uses these arrays to determine each computer-controlled player's playing style; at line 302, in the function WantToFold, if the player's tightness is greater than a random number, then the player folds; otherwise, he stays in. These two arrays are filled with random numbers in the Initialize function; the player's name is merely decoration. The word-cards are selected randomly from the AllWords array, which is filled before the game begins, and their value is determined by their position in the deck; nothing intrinsically makes "temple" worth more than "nature", and no routine determines what makes a straight aside from these random values. Suggesting that not only meaning but also poetic skill is the product of chance, Stud Poetry makes language a game whose rules one figures out by playing. In fact, checking to see what words are worth in each specific game is impossible by viewing the code; the randomness of the selection process precludes this. There is also no Royal Flush hand; no set of words is best. The game therefore portrays language both as a system that one can never entirely control or understand and as a generator of free-floating signification.
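The logic described above can be approximated in a few lines. This is a reconstruction for illustration, not a quotation from studpoetry.js: the function names echo the ones mentioned (Initialize, WantToFold), but their bodies, their parameters, the shuffle, and the extra placeholder words are assumptions made for this sketch.

```javascript
// Illustrative reconstruction only; studpoetry.js is not quoted here.
// The function bodies, parameters, and shuffle are assumptions.

// Fill an array with one random number (0 to 1) per computer player.
function initialize(playerCount, random) {
  var tightness = [];
  for (var p = 0; p < playerCount; p++) {
    tightness.push(random());
  }
  return tightness;
}

// A player folds when its tightness exceeds a fresh random number.
function wantToFold(tightness, player, random) {
  return tightness[player] > random();
}

// Shuffle a copy of the word list (Fisher-Yates); in this sketch a
// word's worth is nothing more than its position in the result.
function rankWords(allWords, random) {
  var deck = allWords.slice();
  for (var i = deck.length - 1; i > 0; i--) {
    var j = Math.floor(random() * (i + 1));
    var swap = deck[i];
    deck[i] = deck[j];
    deck[j] = swap;
  }
  return deck;
}

// "azure" and "abyss" are placeholder words, not taken from the game.
var tightness = initialize(4, Math.random);
var folds = wantToFold(tightness, 0, Math.random);
var deck = rankWords(["temple", "nature", "azure", "abyss"], Math.random);
```

Nothing in such a routine consults the player's name or the meaning of a word; each run produces a different ordering, which is why reading the code cannot reveal what "temple" is worth in a given game.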

Flash

Flash is a graphical programming language usually used to create animations, banner ads, or video games. A Flash program is called an SWF (Shockwave Flash) file, and has an .swf extension. A Flash movie is opened with a browser plug-in called Adobe Flash Player (formerly Macromedia Flash player, before Adobe acquired Macromedia) and run in the browser window, though the browser does little beyond provide a space for the movie to run in. Flash movies are created with graphics that can be drawn in Flash or other programs, and animated using graphical commands (for example, drawing a path for a graphic to follow) or using a programming language called Actionscript.

Flash movies run more or less the same on every system, assuming that the Flash Player is equivalent. Having the latest version of Flash Player is generally a good idea, as later versions add new features but remain backwards compatible; Flash Player 9 can easily run movies written for Flash Player 6. Because Flash is a compiled language, it must be decompiled before it can be viewed, requiring a decompiler such as Eltima Flash Decompiler (for Windows) or Trillix (for Macintosh), though there are many different options. These can resurrect the code more or less intact, but the decompiled files will still have some differences from the original source files. Developing and decompiling Flash has a high cost of entrance; Flash Professional, Adobe’s program for making Flash movies, costs around $700.

Like Java and Javascript, Flash (because it is called Shockwave Flash) invites confusion with Shockwave, another Macromedia platform used for similar purposes. The two are incompatible. Shockwave uses a different programming language called Lingo. While it is common in ELC:1, Shockwave cannot be decompiled by any available program that reproduces the Lingo.

Actionscript uses much of the same syntax as Javascript; variables are declared with the var statement, and functions with the function() statement. It adds the ability to define classes and extend them with other classes. More significantly, Flash deviates from Javascript and Java in the addition of a separate but incorporated layer for graphics. This layer has its own conventions that are as much code as Actionscript, and each element has some analogue that reproduces a function of textual code. In Flash, one can create an entire movie without any code; in Java, everything must be done in code.

Timeline: Because Flash is used for animation, objects are placed on a timeline in the order that things occur. The timeline is divided diachronically into frames and synchronically into layers.

Symbols: Flash graphics, whether images or text, are called symbols; these visual elements can be composed of other symbols and are treated as objects in Actionscript. Elements of symbols can also be changed during execution, so these serve some of the same purposes as variables. A symbol can be a graphic, movie clip, or button; buttons and movie clips can contain Actionscript code. Though enclosing Actionscript in the symbols it affects makes programming easier, Flash documentation recommends putting all code in a single frame on a single layer of the timeline to make it easy to find.

Tweens: In Flash, symbols are manipulated by the use of tweens, which allow objects to change state or location between two moments on a timeline. A tween defines how much time it takes for a transition to occur; what happens during the tween is defined by the different states an object is in at the beginning and ending frames, which are called keyframes. Tweens affect every object on a layer, so if multiple tweens happen at once, they must be put on separate layers.
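Although a tween is drawn on the timeline rather than typed, the work it does resembles a small calculation. The sketch below is a hypothetical illustration in Javascript, assuming the simplest case of a straight-line change at constant speed; the names startKey, endKey, and tween are invented here, and Flash's own tweens may behave differently.

```javascript
// Hypothetical sketch of a motion tween; Flash stores tweens on the
// timeline, not as code like this. A constant-speed change is assumed.

// Work out an object's position at a frame between two keyframes.
function tween(startKey, endKey, frame) {
  var span = endKey.frame - startKey.frame;
  var t = (frame - startKey.frame) / span; // 0 at first keyframe, 1 at last
  return {
    x: startKey.x + (endKey.x - startKey.x) * t,
    y: startKey.y + (endKey.y - startKey.y) * t
  };
}

var startKey = { frame: 1, x: 0, y: 100 };
var endKey = { frame: 11, x: 200, y: 100 };
var midway = tween(startKey, endKey, 6); // halfway between the keyframes
```

The two keyframes act like variables, and the tween acts like a function applied to them at every frame in between, which is one sense in which the graphical layer reproduces the functions of textual code.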

Code Movie 1: Graphics as Source

In the introduction to Code Movie 1, author Giselle Beiguelman calls it a work that questions how conversions between file formats affect an image; the movie suggests that file formats are not transparent and disputes the idea of perfect reproduction, or, in Beiguelman's term, "WYSIWYG utopias." The work animates transitions of image data between different formats while foregrounding the multiple codes both as objects to represent and as representations of the object. In demonstrating how file format constructs the image it displays, the work provides an example of how the structure of a graphical programming language can create signification, simultaneously showing how source code can be a text independent of the work's algorithm.

It may sound strange to speak of a narrative in a movie whose only characters are numbers that signify data; this narrative, however, depicts an image being scanned into digital data, its storage, and its appearance on a screen. The movie begins with an image of characters while other layers of characters sweep across it with two brilliant white lines in the middle that represent the lights on a scanner. This animation is replaced by layers of numbers that represent the image in a different format, stored in RAM or written to disk as a JPEG file. The characters pass through multiple stages, and are occasionally overwritten by other code that represents program instructions manipulating them, and occasionally sorted into streams for ordering. Eventually, the numbers are separated into streams that slide across the screen like the stream of a cathode-ray tube, or an application processing multiple parts of an image to display at once. At last the numbers dissolve and fade into the white background; the code is replaced by an image.

I've already developed a schema of the work somewhat in describing the narrative: an image is displayed, other images overlaid on it, then numbers and letters are manipulated as shapes and moved around. The movie's description mentions the constraints of file format, raising the question of how the Flash platform affects this movie, and in turn of how the graphics are generated. There are, roughly speaking, two ways to store graphics digitally: raster graphics, which represent an image as a grid of single-colored pixels, and vector graphics, which represent it as a set of commands to draw shapes. Flash can use both, but generally uses the latter because they can be manipulated easily. QuickTime or Windows Media Player, on the other hand, use the former.
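The difference between the two ways of storing graphics can be made concrete with a toy example. The sketch below is hypothetical and greatly simplified: real raster formats such as JPEG and Flash's own vector format are far more elaborate, and the names raster, vector, and rasterize are invented for illustration.

```javascript
// Toy comparison of raster and vector storage; real formats are far
// more elaborate, and every name here is invented for illustration.

// Raster: the image is a grid of single-colored pixels (0 white, 1 black).
var raster = [
  [0, 0, 0, 0],
  [0, 1, 1, 0],
  [0, 1, 1, 0],
  [0, 0, 0, 0]
];

// Vector: the same image is a command to draw a shape.
var vector = [{ shape: "rect", x: 1, y: 1, width: 2, height: 2 }];

// Carry out the drawing commands on a blank grid.
// Only rectangles are handled in this sketch.
function rasterize(commands, width, height) {
  var grid = [];
  for (var row = 0; row < height; row++) {
    grid.push([]);
    for (var col = 0; col < width; col++) {
      grid[row].push(0);
    }
  }
  commands.forEach(function (c) {
    for (var row = c.y; row < c.y + c.height; row++) {
      for (var col = c.x; col < c.x + c.width; col++) {
        grid[row][col] = 1;
      }
    }
  });
  return grid;
}

var drawn = rasterize(vector, 4, 4); // the same picture as raster
```

Moving or resizing the vector square means changing one or two numbers in its command, while the raster version would need every affected pixel rewritten; this is the ease of manipulation that makes the vector approach the usual one in Flash.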

Flash also has a less obvious third option for producing graphics: text. Text in Flash can be stored as characters instead of images of text, making it a kind of vector graphic, but one with the additional component of having a value. Viewing the movie in a decompiler, if one clicks on the 6, one can see that it is a character from the Tahoma font. To produce the bending 6 in layer 5278, frames 609-655, Flash Player calls the Tahoma font on the user's computer, pulls the character '6' from it, and displays the results. This 6 could be, with a single line of code, replaced with a 7 or multiplied by 713. One could also run a utility to pull the text from the SWF into a separate file and save the new file as a JPEG, which would at least partially reconstruct the original image. Flash allows the preservation of the original JPEG image’s data, which affirmatively answers the artist's question of whether an image's format affects the user's experience. These numbers are unstable and easily changed, illustrating in the source code layer the narrative layer's depiction of how an image's code fluctuates. The source code offers signs not apparent in the executed layer.
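The claim that a character on screen keeps its value can be illustrated with a hypothetical sketch. The object below is not how an SWF file stores text; its field names are invented, and it only shows that a stored character, unlike a picture of one, can be replaced or used in arithmetic.

```javascript
// Hypothetical text element; the field names are invented for this
// sketch and do not reflect how an SWF file actually stores text.
var glyph = { font: "Tahoma", text: "6" };

// Replacing the character is a one-line change to its value.
var replaced = { font: glyph.font, text: "7" };

// Because the character is stored as a value, it can also be read as
// a number, multiplied, and written back as text.
var multiplied = {
  font: glyph.font,
  text: String(Number(glyph.text) * 713)
};
```

A picture of a 6 offers no such handle: to turn it into a 7, its pixels or shapes would have to be redrawn.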

Java

Of all of these languages, Java is the most sophisticated; it is the first language most computer science students learn in college and is used to develop programs as complex as NeoOffice, an office suite, and a version of the video game Quake. Understanding the entire Java language takes years of study, but I will sketch some frameworks for interpreting the works in the ELO collection. A program written in Java is called an applet, or an application. Most Java applets in the ELO collection are conceptually simple (in terms of programming, at least), but even these may extend across several files, while a SWF is often self-contained.

Java programs run in a program called a Java Virtual Machine (JVM), much as Flash movies run in the Flash Player. When Sun Microsystems first introduced Java in the mid-1990s, the intent was to make it possible to run a Java program on any computer by simulating another computer with the JVM. The JVM translates the bytecode's instructions, which remain constant on all computers, into processor instructions, which change from computer to computer.

Java is a compiled language; it is distributed in .class files, or in compressed class files called JAR (Java Archive) files. The Java decompiler Jad can return any class file to the source code; a JAR file must first be decompressed with a decompression utility like WinZip or Stuffit Expander.

Though it was originally developed by Sun Microsystems, a for-profit corporation, the language was released to the open-source development community in 2006, and even before this, Sun did not maintain exclusive control over Java compilers, so it was quite simple for an experienced programmer to write a Java applet with free software, in marked contrast to Flash's high entry cost. Java was designed so that code could be stored across multiple files and reused in different applications, and also so that code written by different users for different purposes could be reused by others.

Java, like Actionscript, allows the programmer to create new classes, but in Java the program itself is a class. When it runs, the JVM uses the class file to create an object that is the program itself. Multiple JVMs can run multiple copies of the same program, each as an independent system on a computer at the same time. As all Java applets extend the class "Applet", a surefire way to identify an applet is that the source code will contain the line "class [appletname] extends Applet."

Each applet has four basic methods: init(), start(), stop(), and destroy(). When an applet loads, init() and start() are called: init() initializes the application for the first time, and start() is called then and each time the application moves "into view." For example, if the user has multiple tabs open, switches away from the tab, and then switches back, start() is called again. stop() is called when the application moves out of view, and destroy() is called when the user navigates away from the page.
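The order of these calls can be traced with a small simulation. It is written in Javascript rather than Java, to match the other sketches in this essay, and it is only a model: in a real applet the browser or plug-in decides when each method runs, and makeApplet and log are names invented here. The sketch also assumes that stop() runs once more just before destroy(), as the Java documentation describes.

```javascript
// Simulation only: real applets are Java classes, and the browser or
// plug-in (not code like this) decides when each method runs.
function makeApplet(log) {
  return {
    init: function () { log.push("init"); },       // one-time setup
    start: function () { log.push("start"); },     // applet comes into view
    stop: function () { log.push("stop"); },       // applet goes out of view
    destroy: function () { log.push("destroy"); }  // applet is discarded
  };
}

var log = [];
var applet = makeApplet(log);

applet.init();    // the page loads
applet.start();
applet.stop();    // the user switches to another tab
applet.start();   // the user switches back
applet.stop();    // the user leaves the page (assumed to precede destroy)
applet.destroy();
```

After these calls the log reads init, start, stop, start, stop, destroy: setup happens once, while start and stop may alternate any number of times in between.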

carrier: Language Structure as Writing

In carrier (becoming symborg) the structures of the Java language become signifying elements of the work, as does the vocabulary of code on the executed level. Because the work includes more than simply Java applets, providing a complete interpretation of carrier is beyond my purview in this essay, so I will briefly discuss how some elements of the code in this work play into an interpretation: the use of class files to construct the agent sHe, the metaphor of the body mapped onto the JVM, and the allusion to code language.

"The boundaries of separate identity collapsed long ago," says sHe, an entity representing the Hepatitis C virus that controls the program's execution. Provocatively, despite this role, sHe is not represented by its own class but by an amorphous set of classes that depict its actions–infect.jar, carrier.class, and love.class. There is no actual sHe.class. sHe is not a single, independent agent, but rather a collective mass of semi-independent Java applets made one in the player's eyes by his or her experience of the work. sHe's collective voice is constructed as the player navigates through the text and experiences different class files. As a virus, sHe lives only in execution and propagation within other individuals, not as an independent entity.

The function of a JVM plays into the work's statements about the fluidity of identity and of the separation between entities. Its multiple Java applets are safely run on the computer-within-a-computer of the JVM. The computer is itself revealed to be a multiplicity of semi-autonomous systems, like the body. Even after the player leaves the work, though, the computer is still contaminated with sHe because the browser does not delete the class files until it does a routine cache cleaning.

In its text, carrier alludes to the Java language that it is composed of. infect.jar produces the line "sHe extends [user's name]." The virus becomes something that includes and envelops the player rather than something that the player has. Similarly, the work shows how a biological agent may draw people together and take precedence over their individual identity. The web postings illustrate how these anonymous authors are brought together under the banner of HCV. The work uses code language as ordinary language, as these users begin to speak in medical terms that have become commonplace to them. The work then illustrates how specialized languages, whether programming languages or medical languages, influence the human languages from which they are derived.

Conclusion

This essay has only been able briefly to sketch some starting points for accessing and interpreting code. I hope that it will serve as an inspiration for thought rather than a definitive statement on practice or interpretation. Much remains to be said even about the pieces mentioned, let alone the other works in ELC:1.

A former web designer and programmer, David Shepard came to New Media because he was never able to lose either his interest in computers or his interest in literature. He is a UCLA PhD student whose interests focus on code and the relationship between print literature and electronic texts, as well as making reading code from a critical point of view less threatening. He is currently working on a dissertation on the changes that take place in literature when the idea of a programmer supplants the idea of a writer.