Many of the loyal readers of this website are probably wondering where I’m at in my mini-sub-career as a photographer. The short answer to that is that I am waiting. Waiting not only for something to really sink my teeth into, but also for new equipment to be released. Why is this important? I’ve never been hugely concerned with equipment in the past, and in any case it seems that I have everything I might ever need. To really understand why I am waiting on equipment, one must first understand where photography equipment is at the moment.
Curiously enough, it was Nikon that threw the first punch. When the D90 was announced as the successor to the D80, it seemed like a pretty regular incremental upgrade except that it was the first DSLR to shoot video. Compact cameras have been able to shoot video (of fairly low quality) for as long as digital compacts have been common, but since DSLRs are much more complex machines, with mirrors and shutters and whatnot, even “live view”, where one can preview the image in real-time on the screen, is a relatively recent addition to the DSLR feature list. As live view became more and more widespread, the obvious question arose as to what was now stopping video from coming to the DSLR market. The answer: not much.
Video came in the form of 720/24p on the Nikon D90. After spending some time explaining much of the technical terminology in my previous entry “Photo Gear“, I should probably give similar treatment to the technical jargon of the video world, which I have been slowly learning over the last few months. The big number (720 in this case) is the number of horizontal lines of resolution. 720 usually means 1280×720 pixels. The “p” refers to “progressive scan” as opposed to “interlaced scan” which is usually indicated by an “i”. What does all of this mean? A movie is like a flip-book: your moving picture consists of lots of still pictures being shown to you in a sequence. Progressive scan means that every single line is scanned for every frame, so every single picture in the sequence represents the full resolution. Interlaced scan means that all the odd-numbered horizontal lines are scanned for one frame, and the even-numbered lines for the next frame, and this process alternates, meaning that each frame contains half the resolution. Progressive scan therefore requires twice as much data throughput as interlaced scan at the same resolution and framerate. (More info on resolution can be found in the post “Notes on Resolution“.)
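To make the "twice as much data" point concrete, here is a minimal Python sketch. The function names are my own illustration and not anything taken from a camera spec; it simply counts the pixels scanned per frame and shows which lines an interlaced frame would pick up.

```python
# Illustrative sketch only: function names are hypothetical.

def pixels_per_frame(width, height, interlaced=False):
    """Pixels actually scanned for one frame in the sequence."""
    lines = height // 2 if interlaced else height
    return width * lines

def field_lines(height, frame_index):
    """1-based line numbers scanned on a given interlaced frame:
    odd lines on even-indexed frames, even lines on odd-indexed frames."""
    start = 1 if frame_index % 2 == 0 else 2
    return list(range(start, height + 1, 2))

progressive = pixels_per_frame(1280, 720)
interlaced = pixels_per_frame(1280, 720, interlaced=True)

print(progressive, interlaced, progressive / interlaced)
print(field_lines(8, 0), field_lines(8, 1))  # tiny 8-line "frame" for readability
```

At the same resolution and framerate, the progressive frame carries exactly twice the pixels of the interlaced one.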
A curious decision on Nikon’s part was to set the framerate of capture for the D90 at 24 frames per second. To the best of my knowledge, 24fps is only used on motion pictures. The standard for TV is 30 (that’s HD; with standard definition TV it gets a bit complicated… PAL is 768×576 at 25 fps, while NTSC is 640×480 at 30 fps). Anyway, I thought it strange because, if anything, a DSLR being used for video would have applications more immediately in the realm of TV than in motion pictures. The reason I felt this way was simply a matter of resolution. TV resolution is fairly low, while film resolution is… the resolution of film, which I previously estimated to be approximately 24 megapixels in digital-speak. Even 1080p (1920×1080) is 2,073,600 pixels, which is less than a tenth the resolution of film. In fact, the closest equivalent standard in digital video is called “6K” and is simply 6000×4000 pixels. Unfortunately, there don’t yet exist digital motion picture cameras in those resolutions. The problem is data throughput.
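The resolution comparison is easy to check with a few lines of Python. This is just a sketch of the arithmetic, using my rough 24-megapixel estimate for film as the baseline; the constant and function names are made up for illustration.

```python
# Rough comparison of video formats against the ~24 MP film estimate.
FILM_EQUIVALENT_PIXELS = 24_000_000  # the estimate used in this post, not a measured value

formats = {
    "720p": (1280, 720),
    "1080p": (1920, 1080),
    "6K (as described here)": (6000, 4000),
}

def pixel_count(width, height):
    """Total pixels in one frame."""
    return width * height

for name, (w, h) in formats.items():
    n = pixel_count(w, h)
    share = n / FILM_EQUIVALENT_PIXELS
    print(f"{name}: {n:,} pixels, {share:.1%} of the film estimate")
```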
With so many DSLRs now offering resolutions to match high quality film, some people wonder why shooting motion with these is so difficult. A digital stills camera captures an image by first filtering the light into red, green and blue components using filters on top of the pixel sensors as can be seen in the diagram above. There are twice as many green filters as any other colour simply because dividing things into three evenly is difficult to do on a rectilinear grid. Our eyes are also more sensitive to green light. Anyway, it takes four of those little squares up above to count as one “pixel”, which will have a number for red, green, and blue (RGB) which is how computers store pixel information. When you hear high-end DSLR salesmen talk about 12 vs 14-bit colour depth, what they really mean (whether they understand this or not) is the amount of space given for storing the numbers associated with each individual pixel. More bits is better because it allows for more “steps” in the luminosity for each individual R, G, or B component, resulting in a finer degree of colour gradation. Also, the number of steps increases exponentially with the number of bits – 14-bit colour has four times as many steps per component as 12-bit colour (16,384 versus 4,096). The point I’m trying to make is that every time you take a picture, a whole bunch of numbers need to be recorded and stored, and a typical 24-megapixel camera will need about 24 megabytes per photo.
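The bit-depth claim is just powers of two. A tiny Python sketch (the function name is mine) shows how many luminosity steps each colour component gets:

```python
def levels(bits):
    """Number of distinct steps a component can take at a given bit depth."""
    return 2 ** bits

for bits in (8, 12, 14):
    print(f"{bits}-bit: {levels(bits):,} steps per component")

# Two extra bits means 2**2 = 4 times as many steps.
ratio_14_to_12 = levels(14) // levels(12)
print(ratio_14_to_12)
```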
24 megabytes is a lot of information. Try writing down a single 0 or a 1. That’s one bit. Eight of them make up a byte. So 24 megabytes is about 192 million 1s and 0s to write down. Obviously camera technology has advanced to the level where we can do that, but doing it 24 or 30 times a second is another matter: at 24 frames per second, that’s a throughput of 576 megabytes a second. That comes to about 35 gigabytes per minute, which means that a typical 2-hour film would represent roughly 4,200 gigabytes of information. Of course, anyone who’s ever edited film knows that the production team of a typical 2-hour film goes through a lot more than 2 hours’ worth of film. The way storage and data manipulation technologies are going, it won’t be too long before we have storage media and applications that can reasonably handle that amount of data, but at the moment, we’re still a touch behind.
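For anyone who wants to play with these numbers, here is a small Python sketch of the same arithmetic. It assumes the rough figure of 24 megabytes per frame from above and decimal units (1 GB = 1000 MB); the names are my own. Without the rounding to 35 gigabytes per minute, the 2-hour total comes out slightly lower than 4,200.

```python
MB_PER_FRAME = 24  # rough figure for one 24-megapixel frame, as assumed in this post

def video_data_rate(fps, mb_per_frame=MB_PER_FRAME, film_minutes=120):
    """Return (MB per second, GB per minute, GB per film) using decimal units."""
    mb_per_second = fps * mb_per_frame
    gb_per_minute = mb_per_second * 60 / 1000
    gb_per_film = gb_per_minute * film_minutes
    return mb_per_second, gb_per_minute, gb_per_film

bits_per_frame = MB_PER_FRAME * 1_000_000 * 8  # 1s and 0s in a single frame

for fps in (24, 30):
    per_sec, per_min, per_film = video_data_rate(fps)
    print(f"{fps} fps: {per_sec} MB/s, {per_min:.1f} GB/min, {per_film:.0f} GB per 2-hour film")
```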
mind the gap
At the moment, the gold standard for DSLR filming is the Canon 5D Mark 2. It captures video at 1080p at either 24, 25, or 30 fps. Other Canon cameras such as the T2i, 1D Mark 4, and 7D all capture 1080p as well, but the important difference with the 5D is that it is a full-frame camera. Recall from my earlier article that full-frame sensors are the same size as old-school film used to be, that being 36x24mm. This is significant because, frame rates and resolutions aside, the main thing that makes films look so much better than normal TV is the shallow depth of field – the amount of the image that is in focus. Even high quality HDTV cameras that TV networks use to broadcast have 2/3″ sensors which have about a fifteenth of the area of a full-frame sensor. Anyone who has ever tried to purchase a high-end video camera knows that even cameras with 1/3″ and 1/2″ sensors can be very expensive. For reasons of optics, you want the sensor’s dimensions to be as large as possible and in the world of digital video, only the RED One had anything near this.
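The "fifteenth of the area" figure can be checked with a quick Python sketch. The 36×24mm full-frame size is from above; the small-sensor dimensions are the commonly quoted nominal sizes for those formats, which I am assuming here, and actual sensors vary a little between manufacturers.

```python
# Sensor dimensions in millimetres (width, height).
# Small-sensor sizes are assumed nominal values, not taken from any specific camera.
SENSORS_MM = {
    "full frame": (36.0, 24.0),
    '2/3"': (8.8, 6.6),
    '1/2"': (6.4, 4.8),
    '1/3"': (4.8, 3.6),
}

def area(name):
    """Sensor area in square millimetres."""
    w, h = SENSORS_MM[name]
    return w * h

def full_frame_ratio(name):
    """How many times larger a full-frame sensor is than the named sensor."""
    return area("full frame") / area(name)

for name in SENSORS_MM:
    print(f"{name}: {area(name):.1f} mm^2, full frame is {full_frame_ratio(name):.1f}x larger")
```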
The nice thing about this is that the RED One reached the top of that ladder by approaching the problem from a completely different direction. Regular digital video capture works by splitting the incoming image into three separate beams which are then filtered with red, green, and blue filters. Expanding the size of those sensors generally meant making three larger sensors, along with all the filters and prisms that went along with it (you’ll notice “3 CCD” advertised as a feature of higher-end video cameras – that is what they mean – that those cameras bother to split the beam, resulting in better colour reproduction). The RED, on the other hand, doesn’t split the beam, but instead has one sensor of 4096×2304 pixels and each and every one of those pixels is recorded for every frame of capture. This camera was a bit of a game-changer at the time of its release and many successful motion pictures have been shot with it (such as District 9).
Of course, the RED was aimed at professional film makers and although cheap by their standards at $17,000 not including lenses, it didn’t exactly jump out as an option for small-time independent film makers. Once you start throwing in lenses, follow focus, eyepiece, batteries, monitors, matte boxes, and tripods, the whole setup can become very expensive very quickly. Of course, this is a setup for producing feature film quality footage (albeit still at not-quite film resolution). Still, 9.4 megapixels (4096×2304) is impressive – imagine a 10 megapixel DSLR shooting 24 shots every second continuously for however long a sequence lasts. Such a feat is, in a technical sense, still impossible in the current world of DSLR cameras (of course, in a few years I will re-read this article and laugh). A DSLR filming at 1080/24p is basically shooting a 2 megapixel image 24 times a second.
The Canon 5D shoots 21 megapixel stills at a maximum of about 4 frames per second. What really happens with motion picture shooting is that some fraction, let’s say a tenth, of the pixels in the sensor takes a photo for the first 24th of a second, then a different tenth of the pixels takes another shot in the second 24th of a second, and so on (I think this process is called “downsampling”). If you’re doing the maths in your head, you might be thinking that the 5D should be able to shoot at 1080 resolution at 40 frames per second, but it’s a little more complicated than that. Having to process all of those megapixels is a rather severe bottleneck in the digital workflow and I wouldn’t be surprised if 30 frames per second is already a struggle with current technology.
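Here is the "maths in your head" written out as a Python sketch. It takes the 5D's stills figures from above (21 megapixels at about 4 frames per second) and naively asks how many 1080p frames that pixel budget would cover. The names are mine, and the naive result ignores all of the processing bottlenecks just mentioned.

```python
STILLS_PIXELS = 21_000_000  # 5D stills resolution, as quoted above
STILLS_FPS = 4              # approximate maximum burst rate, as quoted above

def pixel_rate(width, height, fps):
    """Pixels recorded per second for a given video format."""
    return width * height * fps

stills_rate = STILLS_PIXELS * STILLS_FPS        # pixels per second in stills burst
naive_1080_fps = stills_rate / (1920 * 1080)    # frames/s if only pixel count mattered

red_rate = pixel_rate(4096, 2304, 24)           # RED One recording every pixel
dslr_video_rate = pixel_rate(1920, 1080, 24)    # DSLR at 1080/24p

print(f"Stills burst: {stills_rate:,} pixels/s")
print(f"Naive 1080p framerate: {naive_1080_fps:.1f} fps")
print(f"RED One: {red_rate:,} pixels/s vs 1080/24p: {dslr_video_rate:,} pixels/s")
```

Seen this way, the RED is pushing several times more pixels per second than either the 5D's stills burst or its video mode.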
So with all this new DSLR technology, what is the point of having all these old sensors? They are surely completely obsolete by now. Well, not quite. Even though the sensors are relatively small, and they don’t give you the “film look”, the technology is very mature and has been in use in TV broadcasting for a very long time. One of the biggest advantages that the small-sensor cameras have at the moment is their auto focus. (Just as a comparison, motion picture film cameras are all manually-focused). Although it is generally more appreciated by those who are old enough to have used cameras that didn’t have auto focus, auto focus is a very obviously useful thing, and in some cases essential: basically any situation that involves either a crew that is limited in size (e.g. one camera man + reporter) or events which are difficult to predict (like sports). Actually, the only situations where auto focus isn’t essential are ones that take place in controlled settings, like film shoots, and on the set of TV shows, where everything is scripted and rehearsed and planned out in advance (and repeated many times).
Even in the world of small-sensor video cameras, technology has been improving in leaps and bounds (probably as a response to the whole DSLR-video thing). My camera, very new technology when I bought it, shoots in 1080p at 24, 25, or 30 frames per second. The sensors are 1/4″, and the big feature of this camera is that it shoots straight onto SD card media. This is significant because until very recently, all video cameras recorded onto tapes. Tapes are all well and good, but transferring video on a tape to a computer can only be done in real-time, that is – an hour on the tape takes an hour to transfer to your computer. With the SD card, you just pop it out, put it in a card reader, and hours of footage take only minutes to download AND it is already in a readable electronic format and requires no extra time or computer wizardry to convert to something usable. Of course another advantage of my JVC is that it is built to take video, and everything I might need to do that is already included. The Canon 5D (which I don’t own btw) is a stills camera first and foremost; it just happens to also shoot video – it can’t yet auto focus during the shooting of video.
The question of “the look” and focus is actually quite an important one. The reason films look the way they do is because the lenses they shoot with have very wide apertures, and therefore give a very shallow depth of field. The downside to this is that if you are even slightly out of focus, your picture is going to be blurry. With a small sensor camera, I can be several inches out of focus and you might not notice; on a large sensor camera, that kind of error would make a shot unusable. In fact, very high quality lenses for motion picture film cameras are even able to control their “focus falloff” which is the degree to which the image becomes out of focus and how quickly that happens. When you combine this shallow depth of field with the fact that all the cameras used to shoot films are manual focus, you might wonder how any of the shots stay in focus at all. Well, that is a job for the “focus puller” who, in America, is known as the first assistant cameraman (yes, there is more than one assistant cameraman).
This is how it works – the director goes through the shot with all the actors and effects people and set people and the camera operator (the guy who points the camera at the stuff). The focus puller makes a note of where everyone is and when, and measures the distance from the camera to those marks. He then has a fiddle with the lenses that he’s going to use for the shot because they will have focusing distances marked on them, and he will practice moving the lens from one distance marker to another. It’s a pretty difficult job, and much harder than the job of the second assistant camera who only has to worry about loading the film and clapping the clapperboard (then there are the people who move the camera around, and the crane operators if the camera is on a crane…). The focus puller also has to make small compensations on the fly because things don’t always go quite according to plan, and until recently he has also had to do it with no feedback, but thankfully these days we have high-res digital monitors to assist the focus puller.
So where is all of this going? The only thing that delivers the resolution of film is still actual film, digital cameras for cinematography are hideously expensive, dedicated broadcast-quality video equipment is (sometimes) cheaper but doesn’t look as good as film, and DSLRs look like film, but aren’t ergonomic, lack features, and can’t auto focus. Of course, there are always compromises, but let’s consider the resolution. The highest resolution that a TV will offer is 1080/60p and not a lot of broadcasts are made in that resolution; most HD broadcasts are 1080/30p, and the vast majority of people’s TVs don’t go anywhere near this resolution. The only place where you could make a good argument for needing more than 1920×1080 resolution is in a cinema, where the image is projected onto a very large screen and let’s face it, very few of us will ever produce a film to be shown on a cinema screen and even fewer of us will ever produce a film to be shown on a cinema screen which will actually end up being shown on a cinema screen.
What’s on the horizon then? RED has been talking about two new cameras, the Epic and the Scarlet, for quite some time now and one hopes that they will be released soon. They have made an unfortunate error in their timing however, because the lower-end versions of these cameras (read: more affordable) were aimed at precisely the people who now use DSLRs like the Canon 5D to shoot things like the season finale of House M.D. Of course, there are the real groundbreaking versions, like the full-frame (35mm-sized) RED Epic which should be the first digital camera to shoot 6K (or 6000×4000 pixels, or 24 megapixels, i.e. the resolution of film), but RED will be making its money off the much larger volume purchases of the cameras further down the line.
Another exciting development I believe is on its way is auto focus. High quality video cameras have never really had auto focus and, as a result, require at least two people to operate at any one time if there is any movement in the scene. But now there is a completely different segment of the digital photography market, the one of micro four-thirds cameras: compact cameras with large sensors that do away with the use of the reflex mirror (where the “R” in SLR comes from). These cameras have pushed the development of very fast contrast-detect auto focus (as opposed to the already-fast phase-detect which is what all SLR cameras use these days). Nikon recently filed a patent for a fast continuously-focusing contrast-detect auto focus which is potentially a game-changer for the world of large-sensor videography.
In any case, I’m really waiting for Nikon to release its answer to the Canon 5D, possibly in the form of an updated D700 (recall that the camera body I currently own is a D700) which can shoot 1080p video. This way, I can do away with my video camera and be able to travel with only one camera body doubling as two cameras. One of the reasons that Nikon is particularly attractive, is that, if forced to manual focus, it is much better to do so with a lens that is designed for manual focusing than one that is not. The only current lens models that do this are the Zeiss primes (some are in the picture above) which are obscenely expensive (and beautiful) things. The wonderful thing about the Nikon lens mount is that it is compatible with very old fully-manual focus lenses (if you stick to prime lenses (fixed focal lengths, not zooms) the optical design is essentially the same). Sadly, old Canon lenses aren’t similarly compatible with the current EF mount. Lens manufacturers have also noticed this important trend and Zeiss for example released a set of “Compact Primes” (pictured above) which have interchangeable lens mounts, so they are able to mount onto either Canon EF, Nikon F, or the PL (the current standard for cinematic lenses). They are also clearly targeting the film makers on a lower budget because, as you may have noticed from the photo, all the lenses are roughly the same size, and importantly, have the same diameter, meaning you only need one “matte box” on all of them (the matte box is the thing that looks like an overgrown lens hood and is used to minimize lens flare, and also for holding filters).
My long term plan with all of this is to sell my old (a whole 9 months!) video camera and perhaps even my D700 camera body, buy the replacement camera body (which I imagine will cost a little bit more) and all those little extra bits that are required such as a matte box, manual follow focus equipment, some better microphones, and if there’s any money left, one or two of those sweet Zeiss lenses. Why? Because making films is something that I want to eventually get good at, even if I don’t make a career out of it (I almost certainly won’t) because it would be a useful skill. Of course, my upcoming documentary about the story of the Australian long track speed skating team [1] has been the inspiration for my recent interest in the technical aspects of film making, although to be honest, I’ve been interested in it for a while – I’m that crazy dude who watches Lawrence of Arabia on DVD, then watches it again with the director’s commentary (did you know that all of their shots of the sun were paintings because they kept burning the film every time they pointed the cameras at the sun?). Nikon recently released a more entry-level DSLR, the D3100, which has an APS-C sized sensor and shoots 1080/24p, and which does feature continuous autofocus in video mode, although it remains to be seen how well it performs in real life. It also has no external microphone jack, which will make it difficult to mitigate the sounds of the lens focusing. In any case, the rumour mill has it that the D700 replacement won’t be out until early next year, so I guess I’ll just have to wait.
[1] By the way, if anyone has any suggestions for the title of this documentary, I’m all ears. The best I’ve come up with so far has been “Another Kind of Oval”.