Chapter Six: Silence Is Funny
The squirrel has committed.
You can see it in the photograph — or rather, you can see what remains of it, which is the back half. The front half is gone, plunged headfirst into a hole in an old tree in a park in Ravenna, Italy. White flowers bloom on the branches overhead. The tree is dappled with spring light. The hole is perfectly sized for a squirrel, as it happens, because squirrels have been using this particular hollow as a hide for years, which Milko Marchetti knows because he has been lying in the grass watching them do it.
What Marchetti's camera caught — at the precise instant when the squirrel's head, shoulders, and front legs had vanished inside but its back legs and tail had not — is the visual equivalent of a freeze-frame at exactly the wrong moment. The tail extends horizontally in the air. Perfectly level. Motionless. It has the geometry of a diver who has plunged halfway through the water surface and somehow stopped. The flowers bloom. The squirrel does not move. Everything about the composition says: stuck.
The squirrel was not stuck. It went in fine and ran off. Marchetti knows this. But the camera doesn't, and neither do you, and it doesn't matter. Something happened to your face just now, reading that description, something you did not decide to do. The corners of your mouth moved. Maybe a sound came out.
No caption was necessary. No knowledge of Italy was required. You didn't need to know anything about red squirrels, or Ravenna, or photography competitions, or the particular tree. The image — or rather, the prose description of the image — did something to you before you had time to think about why.
That's the question. Why?
Every other chapter in this book has had to swim upstream against the problem of translation. Culture shapes the punchline. Language loads the gun. A Sumerian fart joke crosses four thousand years, but only if you already share the basic primate opinion about bodily functions. A Yoruba riddle requires the setup of a call-and-response. Even in-group humor — which is about as universal a social mechanism as laughing itself — only works once you know which group you're in.
Visual humor, it turns out, is playing a different game entirely.
In 2007, a team of researchers led by K.K. Watson at Caltech put subjects into fMRI machines and showed them two things: sight gags and verbal jokes. Everyone expects the brain to have a "humor circuit," a single mechanism that fires when you encounter something funny, regardless of the delivery format. Comedy is comedy, right? The researchers expected to find the same network activating in both conditions, maybe with some extra activity in the visual cortex when the trigger was an image rather than a sentence.
What they actually found was something more interesting and considerably more inconvenient for that assumption. The two conditions lit up different parts of the brain. Sight gags activated high-level visual areas: the extrastriate cortex and its neighbors, regions that specialize in processing complex patterns, biological motion, and spatial anomaly. Verbal jokes activated the classic language areas: Broca's region, left temporal cortex, the circuitry that processes words. These were not variations on a single theme. They were separate operations.
They converged at the end. The reward system — the nucleus accumbens, the ventromedial prefrontal cortex, the same regions that respond to food and music and the pleasure of solving a puzzle — lit up in both cases with equal enthusiasm. The brain, whatever else it might disagree about, was unanimous: funny is funny, the reward is the same, the signal of resolution and relief is identical. But the routes there were distinct.
This matters for the same reason that different roads to the same destination matter: one of them has tolls, and one doesn't. The visual humor route is, in cognitive terms, the toll-free highway. It runs through perception rather than language, through the visual system rather than the linguistic one, through a network that has been processing incongruity — identifying things that are in the wrong place, the wrong size, the wrong context — since long before language showed up to complicate matters. The verbal route requires a learned system. The visual route requires only eyes.
A follow-up study by Samson, Zysset, and Huber in 2008 refined this further by removing language from the experiment entirely — they used only nonverbal cartoons, not a spoken or written word in the whole protocol — and found that even within purely visual humor, there were multiple distinct subtypes, each recruiting its own additional circuitry. A visual pun, where something looks like two things simultaneously, activated the extrastriate cortex — the brain was treating the pun as a genuine perceptual ambiguity, not just a conceptual one. A cartoon where the humor came from understanding a character's mistaken belief activated the theory-of-mind network, the same circuitry you use to model what other people know and don't know. The brain was not running one process and dressing it differently; it was running different processes for different kinds of funny, each of which is also different from what happens with a verbal joke.
What all of these processes share is the endpoint: the nucleus accumbens lights up, the ventromedial prefrontal cortex fires, the same reward signal arrives that arrives when you eat something delicious or hear a piece of music that hits just right. Vinod Goel and Raymond Dolan established this in 2001 with a beautiful experiment in which subjects rated how funny various stimuli were while being scanned, and the intensity of the reward-system signal correlated directly with the funniness ratings. The brain was not being polite. It was not performing amusement. The reward signal was proportional to the laugh. And that reward pathway, the mesolimbic dopaminergic system, is not culturally constructed. It is neurological bedrock. You were born with it. Every human was. It predates, evolutionarily speaking, the capacity for language by a considerable margin.
Which is why a photograph of a squirrel stuck halfway into a tree hole works in Tokyo and Nairobi and São Paulo and everywhere else Marchetti's image traveled after it won the 2024 Comedy Wildlife Photography Awards. And which is why, once you start pulling on that thread, you find it unspooling back through seven hundred years of illuminated margins and three thousand years of papyrus and every silent film that ever made a Berlin audience demand an encore.
Let's go back a bit.
The year is somewhere around 1260. A monk (or possibly a professional illuminator employed by a monk, working for a patron who might have been a lawyer or a cleric or a wealthy merchant) is ruling out the margins of a manuscript and reaching for his paints. What is he about to draw? We don't know his name. We have no record of what he was thinking. What we have, preserved in the British Library under the catalog number Add MS 62925 and known as the Rutland Psalter, is what he drew: a figure in full knightly armor, seated on a dog, leveling a spindle at a garden snail.
Let's be specific about this, because specificity is what makes it funny.
The "knight" is not a man. He's a hybrid ape-creature, hirsute where the armor leaves off, with the expression of someone who has committed entirely to this enterprise regardless of its logic. He's riding not a warhorse but a dog, which is doing its best to look warlike and mostly succeeding in looking confused. The weapon leveled at the snail is not a lance. It is a spindle: the wooden rod women used for spinning thread, domestic and slender and approximately as threatening as a kitchen utensil, which is what it is. Everything about the composition is wrong. Every element has been deliberately degraded one grade below what it's supposed to be.
And then there's the snail.
The snail is painted with evident care: the spiral of the shell, the stalked eyes, the deliberate extension of the body across the ground. It is not fleeing. It is not cowering. It is not doing anything, in fact, that suggests it has noticed the ape-knight with the spindle bearing down on it from an extremely short distance on his deeply skeptical dog. The snail's posture is one of complete, unruffled indifference. It is going wherever it was going. The ape-knight is its problem.
In dozens of other versions of this image — Lilian Randall counted seventy of them, in twenty-nine separate manuscripts, when she sat down in the early 1960s to do what the field of medieval art history had been politely avoiding — the knight is losing. Not just facing a challenge. Actively, humiliatingly losing. In some images he kneels in submission before the snail. In others he's already retreating. In at least one version, a woman has taken his place and isn't doing any better. In perhaps the most satisfying variant, there's no knight at all — just the snail, processing toward the viewer with the serene confidence of a creature that has never needed to worry.
Randall's 1962 article in the journal Speculum was the first to take these images seriously as images rather than dismissing them as the idle scribbling of bored monks. She counted. She mapped the distribution. She identified that the motif clustered between roughly 1290 and 1310 and crossed national boundaries — these weren't one monastery's in-joke but a visual tradition spread across prayer books, psalm books, and legal compendia from England to Flanders to the Rhine Valley. Someone in 1260 wanted to draw a knight losing to a snail. Two decades later, someone in another city wanted one too. And then someone else. Twenty-nine someones across half of Europe, all commissioning their own version of the same joke.
Randall proposed that the snail represented the Lombards — moneylenders from northern Italy who were notorious across Europe for, as the imagery suggested, advancing against people much more powerful than themselves and somehow always winning. It's a tidy interpretation, and it's probably at least partly right. But here's the thing: you don't need it. You can look at an armored knight kneeling in submission before a garden snail and find it funny without any knowledge of medieval Italian finance. The humor lives in the disparity — the cosmic, visual, undeniable disparity between a trained warrior in full kit and the creature most associated with slowness, softness, and the garden path.
Scale did it. Posture did it. The snail's magnificent indifference did it. No Lombard scholarship required.
One Auguste de Bastard d'Estang, a 19th-century French aristocrat and amateur medievalist, found a snail-combat image in a manuscript near a miniature depicting the Raising of Lazarus, and proposed — with apparent confidence and genuine intellectual effort — that the snail symbolized the Resurrection. The shell, he argued: the creature emerges from it, as Christ emerged from the tomb. It was an elegant theory, symmetrical and scholarly, and it was completely wrong. When Randall checked, she found snails in positions with no conceivable Resurrection symbolism: secular manuscripts, law books, hunting guides, scenes with no theological content whatsoever. The Comte had found meaning in proximity rather than structure. The snail was next to the Lazarus scene purely by probability. The joke had become so common that it had ended up next to everything. De Bastard found symbolism where there was merely abundance.
The lesson: when a gag is funny enough, people put it everywhere, including places where an over-earnest medievalist will later find it and construct a theory.
Across six consecutive folios of a different manuscript — a 14th-century law book, Pope Gregory IX's legal rulings with scholarly commentary, known as the Smithfield Decretals — an anonymous London illuminator inserted something that had no business being there and is one of the most quietly perfect jokes in the history of illustrated books.
The illuminator drew rabbits operating a criminal justice system.
In sequence, unfolding as you turn the pages: a rabbit archer draws a longbow and shoots a human hunter in the back. The hunter falls. Muscular rabbits — painted with obvious relish, with the thick limbs and purposeful expressions of creatures who have been waiting for this — bind the man and drag him before a rabbit judge seated at a proper tribunal. The court is in full session. Procedure is observed. A verdict is rendered with all due formality. The hunter is led away and beheaded. A parallel sequence handles the hunting hound: the dog is hanged rather than beheaded, the rabbits apparently aware that legal distinctions about method apply by rank and species.
The joke has layers. At the surface, it is the food chain reversed — the hunted become the prosecutors, the hunters become the convicted. This joke has been funny since at least 1150 BC and will be funny until the species ends. But it's funnier here because of where it is. The rabbits are conducting their parody of justice in the margins of a book that, in the text above their heads, is enunciating the real thing. A reader consulting the Decretals on ecclesiastical procedure would find, on the same page, the official procedures and, below them, small illustrated creatures who know those procedures well enough to mock them. Someone — almost certainly a canon of a London priory — commissioned this. A professional illuminator executed it. Both of them presumably kept straight faces about it. The humor has survived every century since.
Someone with a sense of irony was reading law books in 14th-century London and wanted to leave evidence.
Before we go further back, a brief detour to Paris in 1932, because there is something worth clarifying about how visual humor works in still images — something that feels paradoxical until you see it demonstrated and then seems obvious.
Verbal humor is sequential. The setup comes first, the punchline second, the timing lives in the gap between them. A still photograph cannot be sequential. It presents everything at once. The question is whether a photograph can therefore contain a joke, and if so, how, given that jokes are supposed to unfold in time.
Henri Cartier-Bresson had been shooting with a small camera around Paris for about three years in 1932 when he pushed his lens through a gap in a fence behind the Gare Saint-Lazare — Paris's massive railway terminus — and captured one of the most analyzed photographs in the history of the medium. The image shows a man at the precise instant of leaping across a flooded area of the station's rear courtyard. His feet are not yet touching the standing water beneath him — he is airborne, mid-stride, his body in full forward extension. In the flat surface of the flood below, his reflection mirrors his position perfectly, a duplicate figure arcing upward from below to meet the downward arc of his feet. And on the wall behind him, a circus poster shows an acrobat frozen in an identical mid-air posture.
Three things in the same frame, wearing the same pose: the leaping man, his reflection, the acrobat on the poster.
The poster is the setup — it establishes the acrobatic silhouette as a visual reference point before the viewer's eye has consciously registered it. The leaping man is the punchline — he has accidentally, in this one unplanned moment, become the poster. The reflection is the button, the comic kicker that doubles the punchline and adds a layer of inadvertent symmetry that the man clearly didn't know he was producing. The whole joke fires in the order the eye naturally scans the image, and it fires instantaneously — you don't read it so much as receive it.
Cartier-Bresson spent years articulating what he was after, and the phrase he kept returning to was "the decisive moment": the instant when the subject, the composition, and "a geometry that surprises" all align. He meant something technical and formal by this, but the description maps precisely onto the structure of a visual joke. The funny photograph is the one where the incongruity has been compressed into a single frame — where the setup and the violation exist simultaneously, where the timing is spatial rather than sequential, where the viewer's gaze supplies the beat that a verbal joke would put into words.
A photograph cannot be slow. The timing is in the choice of which moment to freeze, and then in the structure of what the frozen moment contains. Cartier-Bresson chose, instinctively or deliberately, the one moment when the man behind the Gare Saint-Lazare was a joke.
Go back further. Much further.
Around 1150 BC, in the village of Deir el-Medina on the west bank of the Nile — the settlement built to house the artisans who were constructing the royal tombs in the Valley of the Kings — a literate craftsman unrolled 8.5 feet of papyrus and began to draw. What he drew has survived. It lives now in the Museo Egizio in Turin, catalog number 55001, and it is recognizable across three thousand years as exactly what it is: a comic strip.
The images that make up the first section of the scroll are a sustained riff on a single punchline, repeated with variations. A cat dressed as a shepherd herds geese. The geese walk in a line ahead of her. She carries a walking staff. She does not eat the geese; she tends them. A hippopotamus — improbably, magnificently, requiring the viewer to accept that physics is taking the day off — sits in a fruit tree. A lion and a gazelle face each other across a game board, both apparently absorbed in play, neither apparently concerned by what the other is. Mice in military formation, bearing siege ladders and scaling equipment, besiege a cat fortress. Cats wait on a mouse dressed in the manner of a noble, carrying her food on formal service trays, submitting to her inspection.
Each image is a variation on the same structure: the powerful serve the powerless, the hunters are hunted, the rulers are ruled. Scholars read the scroll as political satire aimed at the royal household — the artisans who built the tombs were not always paid on time, and the reversal of hierarchy had obvious appeal. There may well be targets in these images. But the jokes don't need those targets. The hippo in the fruit tree works as pure visual absurdity, independent of any political subtext. The cat waiting on the mouse works because everyone who has ever seen a cat understands what a cat is supposed to do, and it isn't this.
The joke structure is identical to the ape-knight. Identical to the rabbits. The powerful brought low, the dignified made ridiculous, the natural order stood on its head. Someone in a pharaoh's village was drawing the same joke as the illuminator in 1260. The creatures change. The structure is the constant. And the structure needs no words.
Here is where the argument needs to pause and be honest with itself, because the chapter is about to make a large claim and large claims require calibration.
Not all visual humor travels. Some visual humor is as culturally dependent as any pun — it simply arrives through the eyes rather than the ears, and falls flat in exactly the same way a pun falls flat in translation: with a polite smile and the faint sense that everyone else in the room got something you didn't.
In the late 1970s and early 1980s, Procter & Gamble was trying to sell Pampers disposable diapers in Japan. The product was good — technically sound, carefully manufactured. The advertising campaign was imported wholesale from the American market. It featured a cartoon stork.
In the United States, the stork delivers babies. This is the kind of cultural fact that feels like common sense to anyone raised inside the tradition, and feels like nothing whatsoever to anyone outside it. The stork as birth symbol belongs to a specific Northern European folkloric lineage — a tradition with roots in Germanic countries, carried through immigrant communities into North American popular culture, solidified over generations into something that feels like it simply must be true, the way "ring around the rosy" feels like it must be about the plague (it isn't, but everyone believes it).
In Japan, the stork carries no birth symbolism. None. The traditional Japanese story about where children come from — the one that's culturally loaded, narratively rich, the one that would make a baby product advertisement make emotional sense — involves a giant peach floating down a river. This is the birth of Momotaro, the Peach Boy, a folk hero. Storks are, in Japan, wild birds. Large ones, with red heads, occasionally visible at wetland habitats. Not messengers. Not magical. Birds.
Japanese consumers were not offended by the stork. Offense would have required finding meaning in it. They were confused. The image was competent, legible, warmly rendered, and entirely meaningless as an emotional symbol. The slight whimsy of the cartoon stork — and it is a slightly charming image — also failed, because whimsy requires shared recognition. The stork is endearing only if you already know it's supposed to be delivering a baby. Without that layer, it is a bird carrying a diaper. Why is a bird carrying a diaper? Why is this bird smiling about it? Is the bird all right?
By 1983, domestic competitors had nearly driven Pampers from the Japanese market. Uni-Charm and its rivals, understanding that Japanese parents related to their products differently — with more anxiety, more aspiration, less folksy whimsy — had taken market leadership. P&G eventually redesigned the campaign and recovered. But the stork was gone.
This matters for what the chapter is arguing, so let's be precise: the stork failed not because visual humor doesn't travel, but because symbolic visual humor — humor that encodes cultural meaning in an image — only works when the culture travels with it. The stork is a symbol. Symbols are compressed cultural arguments. You can only decompress a symbol if you already know the argument.
A lion cub falling out of a tree is not a symbol. It is a physical event. A squirrel stuck halfway into a hole is not a symbol. It is a body doing something physics permits and dignity does not. These images make you laugh before you read the label because the incongruity is available to any set of eyes: this is what the creature is supposed to do, this is what it is actually doing, these two things are not the same. The humor lives in the gap between expectation and outcome, and the gap is visible.
The test is simple: does the image make you laugh before you read the caption? The lion cub: yes. The squirrel: yes. The stork outside its folkloric home: no. You need the caption to understand what you're supposed to find charming, and by then it's already too late. Comedy explained is comedy anatomized, which is to say, comedy dead.
The chapter's claim is not that all visual humor is universal. The claim is that a specific kind — humor rooted in what bodies can and cannot do, in the gap between dignity and its spectacular absence, in the violated hierarchies of size and power — travels the way physics travels. Which is to say: everywhere.
There is a thought experiment implied by all of this that is worth making explicit. Imagine you are an archaeologist three thousand years from now, and you have found a fragment of the Turin papyrus scroll, and you cannot read the hieratic script in the corner — you have no Rosetta Stone for this dialect, no cultural context for the Twentieth Dynasty, no knowledge of Deir el-Medina or the Valley of the Kings or any of the political tensions the images may encode. You have only the image: a hippo, sitting in a tree, eating fruit.
You would still get it.
Not the political meaning. Not the specific satire. Not the names of any of the royal figures who may or may not be implied. But the joke — the pure visual architecture of the joke, which is this creature is not supposed to be here doing this — you would get instantly, because it requires nothing beyond the visual system's capacity to recognize physical anomaly and the nervous system's capacity to find safe anomaly rewarding. The hippo is in the tree. The tree is not for hippos. Everything else is secondary.
This is what travels. This is what has always traveled. This is what the illuminated manuscripts and the papyrus scroll and the silent films and the wildlife photograph competition are all demonstrating, if you arrange them in a line and look at what they share: not political opinion, not religious belief, not linguistic convention, not technological sophistication. Just the gap. Just the distance between what's supposed to happen and what is happening, rendered visible, rendered immediate, rendered funny before the mind has time to explain to itself why.
In 1925, The Gold Rush was playing in Berlin.
The title cards had been translated. The few words Chaplin spoke on screen (none, as it happened, because the film was silent) required no translation at all. The Berlin audience understood the plot through gesture, through expression, through the physical logic of what they were watching. They had been understanding Charlie Chaplin through physical logic for a decade. By 1925, his films had traveled farther than almost any English-language text in history, partly because they never depended on English to begin with; what traveled was the body.
At the center of The Gold Rush is a sequence that has been analyzed, replayed, and written about more than almost any other three minutes in cinema. The setup is simple and devastating: the Tramp has invited Georgia, the dance-hall girl he loves, to his cabin for New Year's Eve dinner. He has cooked. He has set the table. He has made everything as right as a man with no money and considerable ingenuity can make things. She does not come. She forgot, or she never really intended to come, or both — the film is quietly honest about the asymmetry. The Tramp is alone.
He has two dinner rolls and two forks.
What he does next is this: he impales each roll on a fork, holds the forks by their handles so the rolls sit at the bottom like a pair of tiny feet, and performs a little dance. The rolls are feet. The forks are legs. They step and shuffle and turn, delicate and precise, executing a full vaudeville routine on the tabletop in front of an audience of no one except the Tramp himself, who watches his own performance with an expression of genuine private pleasure, as if the rolls were dancing for him rather than because of him. The dance goes on long enough to develop character (the rolls have rhythm, confidence, a certain lightness) and then it stops, and the Tramp is alone again at his set table, and no one has come.
The sequence is, depending on how you look at it, either the funniest thing in the film or the saddest, and in Chaplin's best work those categories are not as separate as they sound.
When this scene played in Berlin, the audience laughed, and then laughed more, and then laughed more than that, and the laughter built and extended and would not stop. The theater manager left his office. He went to the projection booth. He told the projectionist: rewind the film. Play it again.
It was played again.
No word of English had been spoken. No cultural knowledge was required. Two dinner rolls had made a Berlin audience demand an encore.
This is the natural experiment that cinema accidentally ran when it was invented: when you make a visual medium and give it to the world before you figure out sound, you find out which kind of humor needs language and which kind doesn't. The answer, in 1925, was demonstrated empirically and repeatedly across every language barrier in the developed world. Films traveled. Chaplin traveled. The Tramp — that specific body, that specific way of moving, that singular technique of communicating an inner life through external gesture — was legible from Shanghai to São Paulo, from Berlin to Bombay, without a syllable of explanation.
In China, between 1919 and 1924, twenty-nine of Chaplin's films were released. In 1922, while he was still very much alive and at work in California and had not been anywhere near China, the Mingxing studio in Shanghai produced a film called Huaji Dawang You Hu — The King of Comedy Visits Shanghai. It was a fiction. Chaplin had not visited Shanghai. He would not visit until 1936. But a Chinese studio, having watched a foreign comedian through silent films, had found him sufficiently culturally available — his physical persona sufficiently recognizable, without words, without translation — that they could premise a local comedy on the fiction of his imaginary presence. The Little Tramp had become a known cultural quantity in a city he'd never seen, through pantomime alone. The studio could write a title and an audience would understand the joke from the title alone, because the character was already there, fully formed, built from nothing but a body and the way it moved.
Charles Chaplin Jr., writing in his 1960 memoir, offered the most precise explanation anyone has given for why Japanese audiences in particular responded to his father the way they did. The obvious explanation — the one Western critics tended to reach for — was emotional universalism: Chaplin was universally funny because everyone understands poverty and dignity and longing. The content was universal.
Chaplin Jr. pointed elsewhere. Japanese people, he wrote, "understand and love his pantomime, which has so much in common with the tradition of their own Kabuki theater." Not emotional universalism. Formal equivalence. The Japanese audience already possessed a sophisticated performance grammar built on physical precision, on stylized gesture, on the communication of character and emotion through movement rather than dialogue. They didn't just "get" Chaplin — they recognized him. They had the vocabulary. The silence was not a limitation he'd had to overcome; it was the mode he was working in, and it was a mode Japanese audiences had been reading for centuries.
This is more interesting, and more accurate, than the emotional universalism story. Chaplin didn't bypass culture. He plugged into an existing socket. But the deeper point holds: the socket was there. A performance tradition built on the physical reading of bodies existed in Japan entirely independently of European vaudeville, and the two were compatible because both of them rested on the same foundation: a human nervous system that finds physical incongruity funny, that reads bodies for information, that has been interpreting gesture and posture for its whole evolutionary history. The Kabuki tradition and the vaudeville tradition are cousins who never met, both descended from the same ancestor.
Remove language from comedy and you expose the bedrock that was always there underneath. The bedrock is: bodies doing unexpected things. The bedrock is: the gap between dignity and physics. The bedrock is: what eyes recognize before minds name.
There is one more thing worth saying about Chaplin and what his global reception actually proves, which is a subtler point than "physical comedy is universal" and more interesting.
The easy version of the Chaplin story is: he was funny everywhere because he was funny about universal things. Hunger. Love. The little man against the system. Dignity under poverty. The Tramp's pathos is recognizable across cultures because those themes are recognizable across cultures. This explanation is not wrong, but it's not quite the one that matters for the chapter's argument.
Film theorist Noël Carroll made a useful distinction in 1991 about what he called the "sight gag" — a category of visual comedy distinct from physical comedy in general, distinguished by the fact that it presents the setup and the violation simultaneously rather than sequentially. The sight gag does not build toward a punchline. It crystallizes the incongruity in a single image or moment, complete, requiring no development. The roll dance is not exactly a sight gag in Carroll's strict sense — it unfolds in time — but it shares the sight gag's key property: its funniness is architectural, not narrative. You can describe it in a sentence and it is immediately funny. You don't need to explain it.
This is what crosses borders. Not the narrative content. Not the emotional themes, though they help. The architecture — the formal structure of the incongruity, the way the joke is built in space and time and body rather than in words. Chaplin Jr.'s observation about Kabuki is the key: Japanese audiences already possessed a sophisticated grammar for reading physical precision, stylized gesture, the communication of character through movement. They didn't need a translation. They needed a socket, and the socket was already there. The grammar of physical comedy is old enough and widespread enough that most developed performance traditions have a version of it, because all of them are built on the same raw material: a human body, doing something, in front of other human bodies watching.
The body is the oldest visual medium. Before papyrus, before parchment, before film, before photography, before the internet, there was the body — and there were other bodies watching, and finding things about it funny.
The Comedy Wildlife Photography Awards started in 2015 because a man in Tanzania was sitting at his desk laughing at his own photographs.
Paul Joynson-Hicks had been photographing wildlife in East Africa for more than thirty years. One afternoon he came across two pictures in his files. The first was an eagle he'd shot from below, lying flat on the ground with his camera aimed upward, so that the bird was framed looking down at him directly through its own back legs — the great predator caught in the most undignified compositional relationship possible with its own body. The second was a warthog's bottom.
He called Tom Sullam.
There should be a competition for this, he said.
He was right. By 2024, the Awards were receiving nearly ten thousand entries from more than a hundred countries; by 2025, the count of countries had reached one hundred and eight. Every winning image is funny without a caption. Every year, in the week the finalists are announced, the global press covers them: the BBC, CNN, the Smithsonian, outlets in French and German and Japanese and Portuguese, all running the same images with different text underneath, and the text is always secondary, because the image has already done its work.
Joynson-Hicks has said that the founding insight behind the competition was about more than humor for its own sake. There was a conservation argument at the heart of it, a belief that laughter was a better lever for engaging people with wildlife than guilt or fear. If you find something funny, you care about it; if you care about it, you protect it; the funny photograph is therefore conservation in disguise. Whether or not that chain of logic holds up to scrutiny (and it probably does, though the data is complicated), it rests on an intuition that the Awards themselves bear out year after year: that a photograph of a pelican looking personally offended, or an owl that has just realized it landed on the wrong branch, or a sea otter dramatically covering its face, crosses every barrier of language and culture and draws the same response from every viewer. Not because the joke was translated. Because the joke didn't need to be.
The 2022 overall winner was Jennifer Hadley's photograph of a three-month-old lion cub in the Serengeti. He had apparently decided, that morning, to climb down a tree by himself for the first time.
Lions are defined by grace. They are the apex predators of the African savanna, the creatures that feature on royal crests and football club badges and national flags, the animals whose dignified authority is invoked by everyone who needs a symbol of dignified authority. Cats, more broadly, are defined by grace — by the effortless control of the body, the smooth economy of motion, the famous ability to land on all fours from any height. A falling cat is not supposed to be a spectacle. A falling cat is supposed to be a demonstration.
The cub was mid-fall. His body was at an angle that communicated, visually, in a single frame, everything you needed to know: the angle was wrong, the limbs were wrong, the whole posture was wrong, wrong, wrong. Each of the four legs was deployed separately, at different angles, with different intentions, as though they had formed separate committees about this and were awaiting coordination. His expression — to the extent that a lion cub falling very quickly has a legible expression — was the expression of someone encountering the concept of gravity for the first time in a professional context.
"It didn't even occur to me that he would make a go of getting down by himself," Hadley said, "in the most un-cat-like fashion. I mean, how often do cats fall out of trees?"
He landed on all fours and ran off. Of course he did. The injury to his dignity was the whole point; the body was fine.
You don't need to know anything about lions to find this funny. You need to know that cats are supposed to be graceful, which is, roughly speaking, universal knowledge. The gap between what the creature is — the bearing, the implied authority, the royal feline heritage — and what the creature is currently doing is the entire joke, fully realized, delivered instantaneously, requiring no setup beyond the fraction of a second it takes to read an image.
The 2023 winner was Jason Moore's photograph of a western grey kangaroo standing in a field of yellow wildflowers outside Perth. Moore had been lying in position for some time, accumulating tick bites, shooting frame after frame of a perfectly ordinary kangaroo. He shot perhaps forty frames. In one of them — one, out of forty — the kangaroo had its front limbs at an angle that no rational person can look at without thinking: air guitar.
Both readings are completely legible from a single frame. The kangaroo is a kangaroo, doing something unremarkable with its limbs. The kangaroo is also a rock musician at peak performance, arms raised, wrists angled, posture communicating that the solo is going well and the crowd is responding. The yellow wildflowers have become a stage. The joke is entirely in the coincidence of the geometry — two incompatible images occupying the same moment, both entirely real, both visible to anyone with a set of eyes and some familiarity with rock music, which is to say anyone from the hundred-plus countries the competition was receiving entries from. No annotation required. The species is irrelevant. The setting is irrelevant. The limbs are everything.
Then there was the 2021 finalist, Arthur Trevino's image of a bald eagle making a strike on a prairie dog in the American plains.
The eagle is America's national symbol, apex avian predator, featured on the Presidential Seal, the creature that soars while you plod. The prairie dog is a small burrowing rodent that lives in holes and gets eaten by eagles. This is the established relationship. The image shows the prairie dog, caught in mid-air, launching itself directly at the eagle. Full extension. All four limbs deployed for assault. Every element of the posture communicates: I am attacking this eagle.
The eagle is pulling back.
Two sentences. The joke is done.
The image works if you know what a bald eagle symbolizes to Americans, and it works without that knowledge too, because predator flees prey is its own joke, culture-independent, legible from the same visual architecture that makes the ape-knight funny and the rabbits arresting the hunter funny and the mice besieging the cat fortress funny and the hippo in the tree funny. The small things revolt. The strong things retreat. Somewhere in the brainstem, something registers this as anomalous, and somewhere slightly above the brainstem, something finds the anomaly delightful.
The Egyptian artisan drawing the mice besieging the cat fortress would have understood Trevino's photograph without a word of explanation. Three thousand years and a medium that wouldn't exist for another three millennia, and the joke is the same joke: the one where the prey turns around.
You can draw a line from that papyrus to that prairie dog and find the same structure running all the way through, unbroken, needing nothing but eyes to read it.
Before we get to what the catalogers of the Harley manuscripts found in 1808, there's a question worth sitting with for a moment, because the chapter is about to make its final turn and the question explains why the turn goes where it does.
What makes the visual joke durable?
Verbal jokes age badly. They depend on linguistic usage, on shared cultural reference, on topicality. A political joke from 1310 requires a footnote; a linguistic pun from 1260 requires a translation; a topical gag from 1150 BC is basically archaeological evidence. But the visual jokes in the manuscripts Randall cataloged (the jousting snail, the rabbit judge, the fox preaching to the birds) don't need a footnote to be funny. They need a reader who has eyes and can recognize a snail and knows, somewhere in the animal-knowledge-processing parts of the brain, that the snail is not a legitimate military opponent. They need, in other words, the same reader who has always existed and always will: a primate with binocular vision, a nervous system calibrated to identify anomaly, and an evolved capacity to find safe violations pleasurable.
The verbal joke encodes its payload in language. Language changes, erodes, becomes opaque. The visual joke encodes its payload in the image itself. The image may fade, but the structure persists: the disproportionate adversaries, the upended hierarchy, the creature out of its proper context. These are not conventions. They are features of the world that the visual system has been trained, over millions of years of evolution, to notice. A small animal challenging a large one is anomalous. A creature occupying the wrong place in the natural order is anomalous. A body failing to do what bodies are supposed to do is anomalous. Anomaly triggers attention. In a safe context, attention to anomaly produces the cognitive signature we experience as laughter.
The visual joke, at its most basic, is a direct address to that machinery. And the machinery is the same across time.
There's a book of hours in the British Library — a devotional object, made for a female patron in London around 1320 to 1330, cataloged as Harley MS 6563 — that had something done to it sometime in the early modern period. Someone, at some point, took scissors or a knife to the illuminated miniatures: the full-page religious paintings, the carefully executed portraits of saints, the devotional images that formed the serious content of the manuscript. They cut them out. They wanted the pictures. They left the text.
They also left, apparently, the marginalia.
The catalogers of the British Museum, where the Harley Collection was then held, worked through it in 1808, noting the contents of each manuscript in the formal, compressed prose of official catalog entry. When they reached Harley MS 6563, they encountered what was left: the religious text, and the marginal drolleries. Fox preachers, delivering sermons to credulous birds seated in attentive rows. Animal orchestras, with rabbits on strings and dogs on percussion. Knights surrendering to snails. Cats and mice in combat, the cats perhaps not having it all their own way. Something usually described as a butt trumpet, which the manuscript's original patron presumably encountered during prayer and which I will leave to your imagination, or perhaps you've already been to Tumblr.
The catalogers had no access to 14th-century English devotional culture. They didn't know about the Lombard symbolism. They had no way to interpret the fox-preacher satirical tradition, or the theological joke of animals playing sacred music, or the political specifics of any of the images. Nearly five centuries of religious upheaval, linguistic change, and social transformation had dissolved every layer of specific meaning these images once carried. The Counter-Reformation had happened. The English Civil War had happened. The Enlightenment had happened. The world in which these jokes made pointed cultural arguments had been entirely replaced by a different world.
The catalogers looked at the fox preaching to the birds and called it ludicrous.
Five words. "Ludicrous figures in the margin." Official catalog prose, neutral and compressed. The humor had survived a complete cultural rupture: not preserved behind glass, not annotated into comprehensibility, not explained by a docent. Just still there, doing its work, making 1808 British librarians respond the same way 1320 Londoners had responded, through the same machinery, via the same visual structure: creatures doing what they shouldn't do, the dignified made absurd, the hierarchy inverted until it tips over. The specific targets were gone. What remained was the shape of the joke.
The eye didn't need the explanation, and neither does the chapter.
But it's worth saying, before leaving the manuscript behind: the 14th-century manuscripts are still in the British Library. High-resolution scans are freely available online, which means that for the first time in seven hundred years, the ape-knight on his dog can reach approximately every person on earth with an internet connection and fifteen seconds to spare. The snail's indifference has lost nothing in the transmission. The rabbit judge's authority is undiminished. The anonymous London illuminator's joke is still working on audiences the illuminator could not have imagined, in a medium the illuminator could not have conceived.
Languages die. Religions change. Empires fall. The illuminated miniatures (the serious ones, the devotional images, the theological content) were cut out of Harley MS 6563 at some point by someone who wanted the pictures and left the rest. They are gone. Someone carried them away, and they were presumably lost or destroyed, or absorbed into some other collection without their context. We have no idea where they are.
The ludicrous figures in the margin are still there.
Back to the squirrel.
Milko Marchetti reports that when he shows this photograph at photography seminars — and by now he has shown it many times, to many different audiences — the laughter comes within seconds. Every time. It doesn't matter what nationality the audience is. It doesn't matter whether they know Ravenna or know about squirrels or have any previous acquaintance with the Comedy Wildlife Photography Awards. The tail is extended horizontally in the air. The white flowers bloom. The squirrel appears to be stuck. Something fires. Something lands.
What fires is not a cultural reference. What fires is not a verbal structure. What fires is the visual system encountering an anomaly and delivering a verdict before the rest of the brain has had time to weigh in: this is wrong in a way that is also fine, and the combination of wrong and fine is what you have been calling funny since you were old enough to understand that gravity usually wins.
Three thousand years of evidence, from papyrus to parchment to photographic emulsion, says the same thing. The ape-knight on his dog, leveling his spindle. The hippo in the fruit tree. The gazelle and the lion at their board game. The rabbit judge presiding. The Tramp's dinner rolls dancing in the empty cabin. The lion cub's catastrophic four-limb descent. The kangaroo in the flowers. The squirrel, frozen, halfway in.
The brain looking at any of these images does the same thing: builds the expectation from context, encounters the violation, resolves it into the new and absurd and perfectly clear sense, and laughs. No translation required. No caption needed. No cultural footnote attached.
The joke arrived. The eye got it.
That has been true since someone in a pharaoh's village drew a hippo in a tree. It will be true as long as there are eyes to see it.