Depth in Anime – Photography, Compositing and Animation

So I watched the first episode of A-1 Pictures’ ‘Granblue Fantasy the Animation’ last night. Not sure I’m a fan. Like so many other anime these days, Granblue appears to be a victim of its own ambition. On the surface of it, it has all the hallmarks of a big-win production – ornate, beautiful characters, battle sequences, lots of lavish detail. But put into practice, these building blocks of greatness don’t fall into place. There are signs of production stress throughout – symptoms of the issues that caused them to delay the release of the anime by a season. This is all too common in TV anime today, but the reason I am picking on Granblue Fantasy is that its issues are harder to put a finger on. It’s not like there are blatantly disfigured drawings of the characters or incomplete cuts. Rather, there’s just this jarring sense of something not being right – it doesn’t feel like quality animation.

This is because the detail in the raw drawings is not the issue. As drawings, they are fine, but by the time they hit our screen they often come off looking flat and awkward against their backdrops. This is an issue that’s often a lot harder for people to pinpoint than shoddy pencil draftsmanship. It’s the product of a web of processes and techniques, of approaches to animation, and the art of compositing in photography. In this case, the drawings come off looking flat and out of place because these factors have failed to produce a sense of depth in the scenes, or of natural distance between layers. When the opposite occurs and skillful photography seamlessly binds artful animation, anime can take your breath away with rich, cinematic depth.

Photography in Anime Production

Most of the sense of depth in anime is injected in the photography stage of production. But before we get ahead of ourselves, let’s make sure we’re all across the basics – what is photography in anime? To convey this, I’ll run through the whole process briefly to show how it fits in. If you already know how things work, feel free to skip this section! In the first instance, I’m also going to talk about the process as it was back in the days when anime was produced by physically filming cels. It is easier to think about the concepts of photography by relating them back to the analogue era and then seeing how digital technologies now replicate the same approach within computer software.

The storyboard is broken up into a series of cuts (generally marked by a change in camera angle or transition). The key animator usually draws the layout for a cut, which is like the blue-print for the composition of the shot/sequence – what actors/objects will be in it, where they will be placed in what layers, and what the background is going to be. From this point, the fine arts team work on the backgrounds, while the animation department works on the key frame drawings and the in-betweens.

In anime nowadays, there is further work on the key animation, with varying amounts of touch-up and animation direction (senior animators correcting the drawings to the character designs and tweaking the movements). Once the drawings are done, they are handed over to painters. Traditionally, they meticulously painted each frame onto celluloid (clear plastic sheets), cleaning them up at the same time. These finished products are referred to as cels. Nowadays, this painting is done digitally after scanning the drawings. Either way, these cels are then delivered to photography.

Originally the photography department loaded the cels into the animation stand. The animation stand is a production apparatus and system that allows the cels to be systematically loaded into a rack over the top of each other, forming layers. A camera is mounted above the stand, facing down, to capture them on film. Between each layer, lighting can be applied to stop shadowing creeping in or for other effects. In the most primitive form, you load the background sheet on the bottom layer, and then have one cel in a layer directly above it (say a person standing at a bus stop). If the bus needed to pull up in front of the person, the bus would be added in the rack over the top of the other two, creating a third layer. The work of utilising these layers and their interactions is called compositing.

Each frame is then captured on film with a mounted camera. Between each shot, the cels can be re-ordered, swapped in or out, or simply moved horizontally or vertically. The two sources of motion that can be seen in anime are therefore changes in pose with different cels, or relative movement of cels and/or background. Again, this work is referred to as compositing.
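To make the layering idea concrete, here is a rough sketch in Python of the bus stop example. It is only a toy model: the character grid, the `composite` and `make_layer` helpers and the layer sizes are all my own assumptions for illustration, not how any real animation stand or compositing software works. Each layer is a grid where an empty cell stands in for clear celluloid, and the layers are stacked bottom to top so that whatever is painted on a higher layer covers what sits below it.

```python
from typing import List, Optional

# A layer is a grid of cells; None stands in for clear, unpainted celluloid.
Layer = List[List[Optional[str]]]


def make_layer(width: int, height: int, fill: Optional[str] = None) -> Layer:
    """Create a layer that is either fully transparent or fully painted."""
    return [[fill] * width for _ in range(height)]


def composite(layers: List[Layer]) -> List[List[str]]:
    """Stack layers bottom to top; the topmost painted cell is what we see."""
    height, width = len(layers[0]), len(layers[0][0])
    frame = [[" "] * width for _ in range(height)]
    for layer in layers:  # the first layer in the list is the bottom one
        for y in range(height):
            for x in range(width):
                if layer[y][x] is not None:
                    frame[y][x] = layer[y][x]
    return frame


# Bottom layer: the painted background sheet.
background = make_layer(6, 2, ".")

# Middle layer: a person standing at the bus stop.
person = make_layer(6, 2)
person[1][2] = "P"

# Top layer: the bus pulling up in front of the person.
bus = make_layer(6, 2)
for x in range(1, 4):
    bus[1][x] = "B"

before_bus = composite([background, person])
with_bus = composite([background, person, bus])
```

In this sketch the person is visible in `before_bus` and hidden in `with_bus`, purely because of the order of the layers in the list. Re-ordering or swapping entries in that list between frames is the toy equivalent of re-racking cels between shots.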

In addition, there is a whole suite of effects that are applied at this photography stage, such as making more distant layers blurrier than others, and adding other digital effects. One example might be overlaying everything with a pale white colour during a snow scene, or enhancing the shine of the sun on metallic surfaces or the glimmer on the water. These effects are handled by the photography team because they must work across all layers, bridging them together with holistic consideration for lighting and distance.

Although it may appear to be 3DCG at first glance, this effect from Mushishi was achieved by applying effects to hand-drawn cels during photography.

To summarise, the photography department takes on the completed, coloured drawings and all other elements that are featured in a particular cut (such as background art and 3DCG) and combines them into a recording, adding any effects that can act across all the layers. These days, the elements are combined in computer software rather than an animation stand, but the approach and scope have largely carried through – dealing with the various layers, moving them between frames, and handling lighting and effects.

One of the new challenges in photography these days is compositing with both 2D and 3DCG animation and not creating an uneven sense of space and depth between them. This is getting better and better. Take a look at Fuuka, in this band scene. Coordinating the 3DCG of the instruments and 2D animation of the characters would likely have been difficult, but even a relatively poorly produced anime like this can pull it off.

Anime has started to become proficient in having 2D and 3D layers interact as shown by the characters playing 3DCG instruments.

Photography as Animation

Part of the final product we call animation is actually the direct outcome of photography – movement achieved by shifting the layers relative to each other to produce motion.

Take this walking shot. The key animation defines the convincing walk cycle, but it is the photography work that actually depicts them as moving forward by pulling the background across behind them. More specifically, this cut implies that the camera is panning along, following them. Going back to the example of the animation stand, the camera does not need to be moved, just the layers it is filming.

On the flip side, the background can be left static, and the cels or other layers can be shifted frame-by-frame to indicate that they are moving. This gives the effect that the camera is fixed while the actors or objects are moving.

Either way, it is the work of photography that creates the real motion by shifting layers, while the key animation creates the pose cycles that make it convincing. Clouds parting, doors opening, objects falling, mouths moving: many small pieces of movement within a scene are brought to life not by an animator but by photography.

The movement of the castle in this cut from Howl’s Moving Castle is done by moving static 2D drawings.

Only through careful compositing can you pull off all of these kinds of camera movements and layer movements in a convincing way. If the audience perceives an incongruence in the relative movement of layers or the space between them, the intended effect can be off-putting and feel cheap. Picture the example of a car driving along a road – in the worst case the viewer might not get the sense that the car is moving. Sure, you can work out that that’s probably what’s happening by the images involved, but it certainly won’t feel realistic or natural.

In addition to hilariously bad key animation, this cut from Higurashi Kai feels wrong because the relative movement of layers is unnatural – the cel layer feels detached, as though it’s just floating.

Another thing that compositing can get right to bring out the space between layers is the relativity of movement during pans. If all layers moved together it would look like they were sliding awkwardly together across the backdrop. Instead, the pan of the camera is implied by the layers moving, and how fast one layer moves over another creates space between them. Notice in the Inuyasha gif above, there are two background layers that are moving at different speeds to imply depth – if the fence and the cityscape moved together it would not have felt sufficiently three-dimensional to be believable.
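Here is a minimal sketch of that idea in Python, assuming a very simple rule of my own choosing: each layer slides against the direction of the pan by an amount inversely proportional to how far away it is meant to be. The function names and the depth values for the fence and the cityscape are made up for illustration and are not measured from any actual cut.

```python
def parallax_shift(camera_pan: float, layer_depth: float) -> float:
    """Horizontal shift of one layer for a given camera pan.

    Assumption: the shift is inversely proportional to the layer's depth,
    so near layers slide quickly and far layers barely move.
    """
    if layer_depth <= 0:
        raise ValueError("layer_depth must be positive")
    return -camera_pan / layer_depth


# Hypothetical depths: the fence is close, the cityscape is far away.
LAYER_DEPTHS = {"fence": 2.0, "cityscape": 8.0}


def shifts_for_pan(camera_pan: float) -> dict:
    """Shift every background layer for the same camera pan."""
    return {
        name: parallax_shift(camera_pan, depth)
        for name, depth in LAYER_DEPTHS.items()
    }


# A pan of 16 units to the right moves the fence four times as far
# across the frame as the cityscape behind it.
shifts = shifts_for_pan(16.0)
```

With these numbers the fence moves 8 units while the cityscape moves only 2. Giving both layers the same depth would make them slide as one sheet, which is exactly the flat look described above.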

Cinematic Depth

The keyword that I just reached is depth. While there are certainly contrarian examples that I’ll get to later, generally speaking anime aims to replicate a sense of reality and a cinematic flavour. In other words, it wants its shots to feel like they are occurring in a full-bodied, three-dimensional natural world and to present them so that the audience can feel drawn in. Anime striving for cinematic tones will attempt to imbue their shots with visual depth – when you look into them you feel like your glance can penetrate deeper and deeper into the cut, into infinity.

Orange (top) has very good compositing that gives a cinematic sense of depth while Granblue Fantasy (bottom) feels flat and unrealistic at times due to poor compositing. When looking at the top example your eyes feel like they can penetrate the shot, whereas Granblue doesn’t feel natural to look deeply into.

Through effective compositing, photography must then create visual depth with only a few flat layers. Unfortunately, human visual perception is a funny thing: it’s easy to trick but also hard to convince. The first way to get around it is using proportions – obviously layers that are meant to be further away should be proportionally smaller. Getting this balance right to portray correct distances is important to the viewer feeling that the layers are in a believable spatial relationship.

Another often-used trick is lighting: by creating differentials of vibrancy between the layers, depth is very quickly established. Going back to the animation stand, this could be controlled with the actual lighting in the machine. These days, digital lighting can be easily tweaked in similar fashion. A common way to instill depth that I’ve observed is to have exaggerated lighting, with say a diagonal ray of light hitting half the room. This allows you to easily cast characters in certain amounts of light to produce a palpable sense of space. Much more common now is the use of blur at different layers to simulate camera focus and therefore imply depth.
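As a rough illustration of those two cues, the Python sketch below gives each layer a blur amount based on how far it sits from the plane in focus, and mutes its colour the deeper it is. The linear rules, the `strength` and `falloff` parameters and the depth numbers are all assumptions I have picked for the example; real compositing software exposes far richer controls.

```python
import colorsys
from typing import Tuple

RGB = Tuple[float, float, float]


def blur_radius(layer_depth: float, focus_depth: float, strength: float = 1.5) -> float:
    """Blur grows with distance from the focal plane (assumed linear)."""
    return strength * abs(layer_depth - focus_depth)


def mute_colour(rgb: RGB, layer_depth: float, falloff: float = 0.15) -> RGB:
    """Lower the saturation of a colour the deeper its layer sits.

    rgb components are floats in the range 0.0 to 1.0.
    """
    hue, saturation, value = colorsys.rgb_to_hsv(*rgb)
    saturation *= max(0.0, 1.0 - falloff * layer_depth)
    return colorsys.hsv_to_rgb(hue, saturation, value)


# Hypothetical cut: the character layer (depth 1.0) is in focus,
# and a distant background layer sits at depth 4.0.
character_blur = blur_radius(1.0, focus_depth=1.0)
background_blur = blur_radius(4.0, focus_depth=1.0)

pure_red = (1.0, 0.0, 0.0)
red_up_close = mute_colour(pure_red, layer_depth=0.0)
red_far_away = mute_colour(pure_red, layer_depth=4.0)
```

Here the character layer stays sharp while the background picks up blur, and the same red reads as a washed-out pink once it is pushed back. Setting `falloff` to zero keeps every layer equally vibrant, which removes this particular depth cue.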

One of the best ways to make something feel cinematic is to have the camera move forward, pulling the audience into the scene. This is called movement into depth, and is a lot harder to nail in 2D animation than in full 3D animation or when filming a movie. Returning again to the animation stand, the camera is fixed above the mounted layers. You can’t simply move the camera down towards the layers, or the space between them will instantly leap out as being unnatural.

Thomas LaMarre discusses this in his fantastic book ‘The Anime Machine: A Media Theory of Animation’ and uses a great example:

Say that you want to create the sensation of a person walking toward a barn under the full moon. You begin with a background sheet with the barn and moon drawn on it. You might try changing the focus of the camera (zooming in or out), or try moving the camera closer or farther away from the picture. The problem is that, as the barn gets bigger, so does everything around it in the picture. The moon, for instance, also grows larger— rather than remaining the same size, as our conventional sense of the world dictates. Piling on additional layers doesn’t help with this problem. You might try drawing the moon on a separate sheet. But the same problem will arise. The problem does not lie in the number of layers but in the relation between layers

The camera essentially stays fixed and you need separate layers for the distinct levels of distance. You would then need to move these layers closer to the camera at different rates to portray the right sense of distance and speed of movement. This movement into depth is something that Walt Disney was apparently obsessed with early in Disney’s leap into cinema, going as far as to patent (though arguably not invent) the multi-planar animation stand, which allowed for the layers to be shifted not just horizontally but also vertically for this very purpose.
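The barn and moon problem can be put into numbers with a small Python sketch. I am assuming a basic pinhole-style model in which apparent size is inversely proportional to distance from the camera, and the distances for the barn and the moon are invented purely for illustration. The point is only to show why each layer needs its own rate of change, and why a plain zoom gets it wrong.

```python
def growth_when_approaching(distance: float, camera_advance: float) -> float:
    """How many times larger an object looks after the camera moves toward it.

    Assumption: apparent size is proportional to 1 / distance.
    """
    remaining = distance - camera_advance
    if remaining <= 0:
        raise ValueError("camera has reached or passed the object")
    return distance / remaining


def growth_when_zooming(zoom_factor: float) -> float:
    """A plain zoom enlarges every layer by the same factor, whatever its depth."""
    return zoom_factor


# Hypothetical distances in arbitrary units.
BARN_DISTANCE = 10.0
MOON_DISTANCE = 1000.0
ADVANCE = 5.0  # the walker covers half the distance to the barn

barn_growth = growth_when_approaching(BARN_DISTANCE, ADVANCE)
moon_growth = growth_when_approaching(MOON_DISTANCE, ADVANCE)
zoomed_moon_growth = growth_when_zooming(barn_growth)
```

Under these assumptions the barn doubles in size while the moon grows by only about half a percent. A zoom that doubles the barn also doubles the moon, which is the unnatural result LaMarre describes.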

Even with this stand, it is a difficult effect to achieve, and requires especially precise compositing to impart the proper sense of space. Due to limited resources, anime has traditionally shied away from these movement-into-depth shots. This has begun to change recently with the exploration of 3D backgrounds and improved integration of 3D and 2D layers in compositing. One anime film to seriously explore this potential was Ghost in the Shell 2: Innocence. Mamoru Oshii has a clear cinematic approach to animation production and it is plain to see he relished the opportunity to break through the surfaces of his layers in his compositing.

Other applications are starting to sneak into everyday TV animation. K-ON! had a strikingly well-executed 360-degree pan around the band as they played their instruments, and examples like this are becoming much more common.

K-ON! impressed with a technically challenging 360-pan in the second season’s OP, using a 3DCG backdrop.

In general though, anime looks to other avenues to deal with space and depth in compositing.

Animating Space

This isn’t all the magic of the photography department, of course; it is clear that the animators play a key role in suggesting depth. Although a layer is just a 2D drawing, the way that drawing is posed, and the way the model changes between each frame, impart depth as well as motion. The first principle of course is to animate movements multi-dimensionally. For example, if you have a character walk, don’t have them walk along a flat x-axis, but also change their proportions so that they are moving slightly towards and/or away from the screen. This obviously adds an extra level of complexity in animating, but immediately gives the cut depth.

Yasuo Otsuka, famous for being a linchpin figure in the formative years of the anime industry and for bringing animation to life with dynamic timing and expert detail, used what is called the ‘peg hole’ technique (so named because he literally rotated subsequent genga around the hole at the top of the sheet). This technique added a roughness to the arc of movement of a character – instead of running in a straight line they would pivot into and out of the motion. It adds both a sense of energy and weight to his sequences, with the feeling that his characters’ vitality is only barely bounded by gravity. The other effect is that it looks like his cels are grounded to the backgrounds, placing them nicely into the natural world and thereby delivering an innate kind of depth.

Yasuo Otsuka’s ‘peg hole technique’ adds both energy and a sense of natural, grounded relationship between layers.
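Geometrically, the peg hole idea boils down to rotating a drawing about a fixed pivot rather than sliding it along a straight line. The Python sketch below shows just that rotation for a single point; the coordinates, the angles and the `rotate_about_peg` helper are hypothetical, and it says nothing about how Otsuka actually chose his angles or timing.

```python
import math
from typing import Tuple

Point = Tuple[float, float]


def rotate_about_peg(point: Point, peg: Point, degrees: float) -> Point:
    """Rotate a point on the sheet around the peg hole by the given angle."""
    angle = math.radians(degrees)
    dx, dy = point[0] - peg[0], point[1] - peg[1]
    return (
        peg[0] + dx * math.cos(angle) - dy * math.sin(angle),
        peg[1] + dx * math.sin(angle) + dy * math.cos(angle),
    )


# Hypothetical sheet: the peg hole sits at the top, the character's feet
# are drawn 10 units directly below it.
PEG = (0.0, 10.0)
FEET = (0.0, 0.0)

# Rotating each successive drawing a little further swings the feet along
# an arc instead of a straight horizontal line.
arc = [rotate_about_peg(FEET, PEG, step) for step in (0.0, 5.0, 10.0, 15.0)]
```

Every position in `arc` stays the same distance from the peg, so the figure swings like a pendulum through the motion rather than gliding flatly across the background.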

Yoshinori Kanada is famous for the cool poses and playful timing he uses in his key-frames, but what’s sometimes overlooked is the fact that those poses included exaggerated perspective, often referred to as the ‘Kanada Perspective’. Wherever possible, the poses would have arms and legs spread out towards or away from the camera, going from one extreme to the other throughout the motion. These poses worked with wide-angle lens and fish-eye distortions to expand the stage. This perspective made for wildly dynamic action sequences because they felt like they were frenetically moving through a space.

Yoshinori Kanada’s drawings create their own space with exaggerated, angular poses and perspectives.

Where I’ve discussed depth perception previously as being the feel of space between layers, addressed through compositing, here, Kanada’s layers forcefully create their own space. By their perspective posing, the layers have carved out depth within themselves, avoiding the need for careful compositing.

The eponymous Itano circus is another avenue for animating space. Popularised by Ichiro Itano, these sequences have become a staple in anime. Trailing schools of ballistics traverse the full breadth of the scene, each with a self-propelled trajectory and speed. The geometric patterns these trails form etch out their own fields of space, as deep and vast as the animator wills. A reason these scenes are so great to watch is the way the ballistics drive the photography; their geometries and paths very easily establish both depth and speed. The physical camera may be still, but the viewer is carried rapidly and wildly through the trajectories of the missiles.

The Itano circus elicits space and speed through the ballistic pathways.

Shinya Ohira is a master at animating with a view to compositing, using a myriad of layers in complex interaction with distinct timing and multi-planar movements to give his shots an unparalleled cinematic quality. When he draws a character running, they don’t just follow a run cycle across the screen; they lunge to and fro in multiple dimensions, coming closer to the camera and moving farther away. He also favours characters running into the shot from behind the camera, or into the camera. These kinds of shots not only serve to place you in the scene but can also implicitly portray movement into depth.

Shinya Ohira’s animation creates a world of depth through many layers in complex interaction

Ohira is one of the only animators I have seen who animates a whole world within his cuts, a world of space infinitely deep and wide.

Ohira creates depth through multi-dimensional movement of many layers, but, as I have discussed, true sense of movement into or out of depth is always going to be extremely difficult to obtain while you have a static background layer. One way to get around this is to do away with the static background and animate every layer. This is known as background animation.

One of the first people to really start unveiling the potential of background animation in anime was Masahito Yamashita, whose part in the climax of Urusei Yatsura 2: Beautiful Dreamer grabbed a lot of people’s attention. The sequence followed Lum flying through the school, with the feel of the camera following. The fact that you felt like you were zooming into the world along with her gave the sequence the depth and cinematic feel typically missing from anime.

Masahito Yamashita turned heads with his thrilling background animation in Urusei Yatsura 2: Beautiful Dreamer

Others have built on this over the years and it has become a go-to tool in anime’s repertoire to deliver wow-factor sequences. Interestingly, 3DCG backgrounds are starting to replace this particular art-form. While I admit they are probably better suited to most such applications, there’ll always be something special about this kind of cut. I suppose the fact that every line and shape is drawn frame-by-frame means there’s an unconscious energy in the unpredictability of it all; at any time, our perception of space can be turned on its head – the edges of the walls or the stairs could bend and warp into new perspectives. When we see a background we can trust that it’s going to be static, but in these shots there’s nothing you can trust to do what you expect; it’s all in the hands of the animator.

Background animation can deliver movement into depth but it also seriously undermines the potential for depth between layers. Instead of the detailed, painted backgrounds, suddenly the background has to be simplified into looking like a cel (for all practical, commercial purposes anyway). This means there’s no obvious distinction between background and foreground. In one sense, this serves to make the shots feel flatter. Although the camera is moving into depth, our eyes don’t penetrate into depth in the same way.

Flat Compositing

When the feel of depth between layers is suppressed, this can be referred to as flat compositing. This is an intentional style in which background, foreground, and all layers in between are given equal prominence on the screen. Instead of aiming to draw your eye in to some point of depth, your eyes are encouraged to wander and take everything in holistically. Background animation usually implies flat compositing because the background feels like a cel in the same way as the characters acting over it might (in fact in many cases they are the one layer). In other cases, it’s about harmonising background and foreground.

Urara Meirochou brings its background to the fore with a harmonised vibrancy

Flattening in composition minimises the sense that the background is further away than the foreground, one of the fundamental notions underpinning the more traditionally cinematic approach. A key facet of this depth is colour. As Urara Meirochou (and many other anime in recent years) attest, when the background art is coloured with equal vibrancy to the foreground it removes the most intrinsic sense of depth and brings both into a single layer of perception, flattened. Many other anime carry this look very well.

There’s flat, and there’s superflat. The term was coined by the artist Takashi Murakami, who drew from a number of Japanese sources to define an art movement that highlights the beauty of flattened depth. Hopefully I can explain what that means in the context of animation! One of the first things he cited was animation by Yoshinori Kanada – his famous fire dragon erupting from the volcano in Harmagedon.

Kanada’s fire dragon defines form through shapes of colour rather than clear linework, a facet of Murakami’s superflat art movement.

The style of Kanada’s fire dragon feeds into a major element of Murakami’s superflat look, and that is the suppression of outlines that define depth. The painted colours of his dragon are drawn with geometries that signify body and form without the use of clear lines. It’s a beautiful abstraction but our minds can still unpack the relative colours into the three-dimensional figure.

This flat kind of compositing is very explicitly used in the Dirty Pair movie opening. Essentially the idea is to portray the scene as flat, drawing your eyes to patterns and colours to unpack the space between layers that were projected onto the flat surface of the screen.

Super-flattening in the sense of Murakami’s work goes a step further by breaking the rules of perception and flattening multiple perspectives of an object into a single orthogonal viewpoint. SHAFT’s Bakemonogatari exhibits striking compositing that follows this superflat style – notice the sheer flatness of the imagery. Even though there are clearly layers from a functional anime perspective, there is no inherent sense of depth. Furthermore, even though we are looking dead-on at the shot, it hints at diagonal perspectives all throughout its artwork. Objects like houses and desks do not feel oriented in a real 3D space but exhibit three-dimensional traits in their flattened form.

Bakemonogatari’s compositing and artwork flatten a sense of different perspectives in a ‘superflat’ manner

Both Bakemonogatari’s superflat, schematic art and Urara Meirochou’s equalised background and foreground feel very different to look at than your typical anime. That’s because most anime chase that cinematic perspective, setting your eyes up for a journey into the depths of the shot, whereas this flat compositing has your eyes drifting and meandering across the image, taking it in laterally.

Parting Words

With the healthy growth of the sakuga community over the last year or so, there has been a kind of awakening in the western anime community. Suddenly, people understand the talent behind animation and appreciate the value of creative and technically difficult movements. From my experience though, the discourse around the final presentation of an anime, the gravitas of its visual appeal, can lack the same sophistication. The visual side of anime production tends to be talked about as either ‘art’ or ‘animation’, but the overarching approach to tying the two together is just as important.

Both animation and art need to be consciously tackled with the goal of producing a sense of depth or an attractive kind of flat aesthetic, and then photography must harmonise all of the elements with well-crafted compositing. That’s how you get anime that pack the most powerful visual punch, when animation, art, 3DCG are all singing in chorus.

Frankly, this is where the role of the director steps into the limelight. With the sakuga community’s general focus on key animation, it may often seem as though the director is more of a paper-pushing producer than anything else. However, the best directors can exert their creative power by harnessing all of these elements to reach a final vision for the visuals.

If anything, I hope this post might prompt someone to think further about the interplay between art, animation and photography rather than focusing on them independently.