Fig. 1: An old meme chart

This is a diagram I made back in 2015 illustrating the stylistic differences between different eras and media of visual humor. I took the same joke premise, which comes from an old newspaper comic, and embedded it into different styles of memes for comparison. The initial purpose of using the same exact joke in each image was to standardize the narrative and more clearly illustrate the stylistic evolution, eliminating distractions for an audience that was potentially unfamiliar with memes.

While my chart was done for educational purposes, this type of meme-ification of conventional jokes is actually a really common phenomenon. In the early internet days, people often made compilations of jokes in text form in order to share them online. Nowadays, more often than not, what could be considered conventional, ordinary jokes are packaged and told in meme form and shared on social media instead. So this got me thinking, what about the reverse process? What about memes that originate online and then migrate outwards? Why is it so often the case that meme-originated jokes retold in meatspace are met with awkward silence and averted eyes? Perhaps this bad reception isn’t simply the fault of the clumsy joke re-teller or of the circumstances of social appropriateness, but to a certain extent, an emergent property of the medium itself? In order to better understand the inner workings of visual humor, I decided to do some investigation, starting with puns.

1. Linguistic vs Visual Puns

There are many types of puns, from the more common homophonic, or “same-sound” puns, all the way to some weird ones like recursive or paradigmatic puns which involve more than just individual words but instead play on the listener’s understanding of concepts alluded to in another, usually earlier, part of the joke. A lot of memes feature puns, but are the puns in those memes any different from a regular old “dad” pun? In essence, is there a distinction between puns told verbally, and puns told through images?

For example, here is one of the first things you get if you google visual puns:


Fig. 2: A “Visual” Pun, according to Google.

However, is this even a visual pun at all? You could totally have just relayed the joke verbally and it wouldn’t lose anything in the translation! It’s simply illustrating a normal, homophonic pun—a nice background, if you will. I think that most memes which involve puns are like this. For example:


Fig. 3: “Kerchoo” (2017)

Here it’s just a bowl (of cereal)/bowl (weed pipe) pun. While the funny “kerchoo” image at the bottom is sort of the punchline of the meme, it isn’t really a part of the actual pun itself. It just humorously illustrates how a person might feel/act when they’re super high. I could say the same pun out loud to someone and it’d still be the same joke, sans kerchoo.

But what about something like this?


Fig. 4: “Can Eye Fist Uranus” (2017)

This is what is called a “rebus,” from the Latin non verbis, sed rebus, which means “not by words but by things.” In essence, a picture of something is substituted for something else when conveying a written message. A rebus can be a type of pun—instead of substituting a word with a similar sound but different meaning, you’re substituting a picture. It’s all the same thing basically, except that we are starting to depart from language in the conventional, words-and-letters sense. And that’s exactly where I’m headed!


Fig. 5: “One must imagine Sisyphus happy.” (2017)

This one is also a rebus pun and it even features a rhyming component, but because the picture doesn’t literally represent what it stands for, it becomes a bit more than that. The answer isn’t merely a literal “man rolling a huge boulder” but a very particular line from Camus’ famous essay. It’s more of a recursive/paradigmatic rebus than just a simple substitution/homophonic rebus, as it requires you to rack your brains a bit for the right answer, guided only by the expected rhyming scheme of the poem.

The main point of distinction between figures 4 and 5 is the set-up. Figure 4 has no set-up or structure to it, and there are no pre-existing expectations for the joke. It’s a one-off that is contingent on the literalness of the rebus conveying its message. In the Sisyphus image, the format itself is a meme: it starts out with a well-known cheesy poem, followed by a picture substitution meant to invoke a particular phrase as the punchline, which rhymes with the initial poem set-up. This requires niche background knowledge on the part of the viewer in order to decode the meme. It includes an element of recall, searching and synthesis, since the answer lies beyond the literal content of the image. It is precisely because it is pre-structured in this way that its punchline can fall so much farther from its face-value content than it can in the simple rebus. The rhyme scheme narrows the pool of possible solutions down to a very small number.

With the rebus puns, we started moving into the territory of what I term “trans-linguistic” puns. Rebus puns transcend conventional spoken or written language, relying on a blend of linguistic and non-linguistic components that interact semantically and phonetically. They often follow ad-hoc conventions, and are inherently a visual medium. Unlike our typical homophonic pun, which can convey its full content when said aloud, a rebus cannot be conveyed without its visual form. In this way we are finally starting to understand the relationship between visuals and language within memes. However, are all visual puns simply substitutes for regular words and phrases, or is there a structure that is unique to visuals?

2. A Pun without Language

While the previous examples of memes move beyond conventional linguistic puns and therefore become inextricable from their visual embodiments, they nevertheless rely on language to tell their story. Both of the examples contain a phonetic component, and still deal with normal English words, just encoded in a more complicated way. But if there can be purely auditory puns, and combination (audio + visual) puns, then surely there can be purely visual puns, too.

Here is what I consider a purely visual pun:


Fig. 6: “10 Rubin’s Vases”

This type of drawing is called a Rubin’s Vase, also known as a figure–ground vase. The reason it is called figure-ground is that the black alternates in visual dominance with the white—your brain identifies it as either a black vase on a white background, or two white faces on a black background, and these visual perceptions alternate as you shift your focus. It takes the defining feature of a pun—the witty combination of two unexpected objects presented ambiguously & simultaneously—and translates it into a purely visual form.

Now let’s take a look at how this concept might work as a meme:


Fig. 7: “Monkey Haircut Collage” (2017)

This meme collage has no real linguistic component. It’s purely visual imagery which is nevertheless meaningful and has structure, maybe even what we might call a kind of rudimentary “grammar” or rules about how it works, yet it never intersects with anything that we might conventionally deem “language.” Because it’s a purely visual form, there is no way to convey the joke other than to describe it as you would a painting. However, that would not be “translating” the joke; it would be equivalent to saying “hey do you remember that one really funny joke about the monkey getting a haircut?” That describes what the joke might be about, but doesn’t actually convey the joke itself.

I still consider this a pun, though. There’s a punning happening in the melding and substitution of the images being conjoined, much the way that homophones or other types of puns combine language semantically or phonetically. The images create dissonance and ambiguity but also fit together in a chimeric way, and their inclusion in a seemingly unfitting surrounding structure recontextualizes the content of other parts of the “sentence” (in this case, the picture).

3. Categories of Collage Memes

Fig. 7 seems like a rather complex image; however, its complexity arises out of the lack of an overarching structure. Just because a visual pun doesn’t have any text in it doesn’t mean it has to be an unstructured, chaotic mess of collaged elements.

Another thing that arises from the unstructured form is that the elements which comprise the image and their interactions with each other are pretty literal. The arm is an arm, it does what an arm does. We recognize it as an arm independent of any memetic background knowledge. A person unfamiliar with memes would probably be pretty confused by it, but they would still mostly get it. It’s a weird collage but most of the parts don’t really require understanding beyond being able to visually recognize that those are arms from one picture, which are then substituted with hands from another picture, clearly unrelated.

It operates purely on gestalt principles of perception (grouping, continuity, etc.), and while some of the components might individually have some semantic content, I don’t believe it’s particularly relevant to the “interpretation” of the meme as a whole. There isn’t much to interpret in the first place: it’s pure visual play, and the only meaningful relationships between the characters are their physical interactions within the image. It’s like a group of people passing a ball around, but there aren’t really any rules to the game; there are just physical interactions.


Fig. 8: Syntactical Structure in a Collage Meme

With Figure 8, we introduce a level of structure into the image. The relationship between figures 7 & 8 is analogous to the one previously described between 4 & 5. The first image in each pair (4 and 7) is constrained by its lack of structure and forced to work with ad-hoc conventions contingent on its face-value content, while the structured second image (5 and 8) has room to play with more complex meanings and relationships. Because Figure 8 has an overarching structure, it can have more complex themes. The meaning behind this spidey meme is that the featured meme templates, despite their seemingly different appearances, are all the same. But how are they the same? It doesn’t really go into detail, leaving the viewers to come to their own conclusions.

Meme templates give structure and define relationships between the elements within them. Swinging back towards a linguistic approach, we could call this a kind of “syntax.” The spidey template does exactly that—it literally points to a relationship between the inner components.

However, it’s a pretty simple relationship. The images being compared are meme templates themselves, with their own complex internal systems, yet they are treated as mere objects in the grand scheme of the meme. They don’t have any particular role in the system beyond just being “one template, among many.” Their order is arbitrary, and they are mutually interchangeable.

We must ascend higher.

Fig. 9: Semantic Structure in a Collage Meme

Besides illustrating relationships between its components, a template has one final function: it confers valuations on the components themselves. The blank spots function as value placeholders. For example, this drake meme template constructs a simple comparison of A vs B. In the standard version of the template, the top panel (Drake recoiling) confers a negative value, claiming that the image in the corresponding placeholder represents a bad thing, and the bottom panel (Drake pointing approvingly) confers a positive value, claiming that the image in the corresponding placeholder represents a good thing. This comparison can also be done ironically, reversing the implied values. Whenever an image is inserted into a value placeholder, it acquires the value embedded in that cell.
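For readers who think better in code, here is a toy model of the value-placeholder idea. It is only a sketch under my own assumptions: the names `Slot`, `Template`, `fill`, and `DRAKE`, and the choice of +1/-1 as "values," are all hypothetical illustrations, not an established formalism. Each blank spot in a template carries a built-in valuation, and whatever gets dropped into that spot inherits it (or the opposite, if the meme is being used ironically).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Slot:
    """A blank spot in a template, with the valuation it confers."""
    name: str
    value: int  # +1 = "good thing", -1 = "bad thing" (illustrative scale)


@dataclass(frozen=True)
class Template:
    """A meme template: a named, fixed arrangement of value placeholders."""
    name: str
    slots: tuple

    def fill(self, images, ironic=False):
        """Insert one image per slot and return the value each image acquires.

        `images` maps slot names to image descriptions. With ironic=True,
        the implied values are reversed.
        """
        expected = {slot.name for slot in self.slots}
        if set(images) != expected:
            raise ValueError(f"expected images for slots {sorted(expected)}")
        sign = -1 if ironic else 1
        return {images[slot.name]: sign * slot.value for slot in self.slots}


# Hypothetical encoding of the drake template: one rejecting cell, one approving cell.
DRAKE = Template("drake", (Slot("rejected", -1), Slot("approved", +1)))

sincere = DRAKE.fill({"rejected": "text joke compilations", "approved": "meme jokes"})
reversed_values = DRAKE.fill(
    {"rejected": "text joke compilations", "approved": "meme jokes"},
    ironic=True,
)
```

The point of the sketch is that the images themselves carry no value in this model. All of the valuation lives in the template's cells, which is why the same two images can be read sincerely or ironically without changing anything about the images.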


Fig. 10: Value Placeholders in Memes

Through this notion of cell values, individual parts of a template carry different meanings. Since it takes these values into consideration, Figure 9 moves beyond simple syntactical relations towards a semantically grounded structure. The individual values are actually meaningfully used and incorporated into the macro-structure of the image. The elements’ individual meanings are rooted in memetic literacy, background knowledge, established usage conventions, and not simply based on the gestalt or the syntax of a meme. The individual meanings of the subcomponents are separated and structured in a way that integrates them into the subcomponents of the other template elements of the meme.

Here is a diagram summarizing the points:


Fig. 11: Structural Organization in Visual Collage Memes

While I haven’t quite gotten to the bottom of the question of how memes compare with conventional jokes, and there’s much left to explore in that direction, I hope I’ve provided a basic framework for discussing the interplay between visual structure, conceptual organization, visual perception and meaning-making in memes.