DIY segmented capture film scanner (Part 1)
In this article I'll cover the design and construction of a high performance segmented capture film scanner, made from an off the shelf 3D printer, 3D printed parts, a Raspberry Pi HQ/GS camera, Raspberry Pi 5, and a high performance industrial lens.
Spoiler warning: A future article will cover how I replaced the camera, Pi 5, and lens with better, though more expensive, options.
Table of Contents
- Introduction
- What it do?
- Concept of execution
- Export scripts
- Summary
- Timeline
- Initial idea
- Sketching a concept
- A reasonable concept
- Camera prototype
- Fine tuning monochrome
- Film mounts and backlight, round 1
- Flat field correction, round 1
- Planning a better backlight
- A better backlight
- Milestone: first colour image!
- End of colour-honeymoon, even flatter fields
- Charting a path to greatness
Introduction
The scanner design presented is a first-pass solution to a problem most people don't have, which is a convenient, very high resolution, semi-automatic film scanner that isn't a DSLR scanner.
What it do?
The scanner solves a key problem faced by dozens of photographers all over the world: how to get my film pictures into the computer, without using my normal flatbed scanner, or my DSLR and a copy-stand.
The problem with flatbeds is they're slow and the optics usually aren't that great (even a V850 needs a lot of sharpening to give decent results). Further, the V850's sensitivity isn't actually that great and it will often struggle with dense colour negatives. Scanner software is another thing, Silverfast isn't bad but it's not super stable and has a lot of useless settings and slightly eccentric behaviour.
The problem with DSLR scanning is that it's even more fiddly and very manual; at least the flatbed can be left to work for an hour per half-roll. Additionally, a DSLR scanner is resolution limited by the DSLR you have (if you have a GFX 50 or 100 then just use that), and I just don't want to do that because everyone else is doing it.
Concept of execution
If we consider a flatbed scanner, those are basically obsolete tech and use a line scan sensor (usually two sets with an offset to fake more resolution; "real" performance is in the 2400 DPI range at best). Film stays in place and we slide the scanner over it, keeping the motion really smooth. One benefit is these can usually do IR-scan with automatic dust removal for colour film. Overall it's slow as shit and as mentioned even a current V850 is basically obsolete from the factory.
Nikon CoolScans are better from what I'm told. While I don't doubt their performance, these seem to at best use a strip-loader that needs babysitting. These scanners are all 20+ years old and quite expensive, so not really an option.
If we then consider a DSLR-scanner setup we keep the film in place and point a camera at it (with the best macro lens you can find), and capture the whole film using a 2D sensor. For larger film sizes you'll probably be doing a segmented capture and merging manually. This is pretty good, the camera is pricey but most people use the DSLR for non-scanner things too. Disadvantage is you have to either move the film or camera, then align it and probably refocus per frame. IR scanning is technically possible but rarely done since most cameras have IR-cut filters.
The initial concept I had in mind was using a 9⁄125 µm fiber coupled light source as a mechanical flying-spot scanner, tracing it across the film plane and capturing the image using APD's on the other side. This was ruled out after a day or so due to the obvious issue of insane mechanical precision requirements and how incredibly slow it would be. This is kind of how a drum scanner works though.
This solution is a hybrid approach of the first two: instead of using a huge sensor to capture each frame, use a smaller and cheaper sensor to capture segments of the film. Then automatically move the camera around to get a whole frame, then merge those segments automatically into a huge image of a film frame.
This approach requires several pieces to work:
- Camera with decent dynamic range and suitable pixel count/size/price
- Lens with zero distortion and incredible sharpness
- Motion stage (X/Y/Z)
- Film mounts that can keep the film very flat
- Backlight, ideally RGB+IR
- Image capture + motion control software
- Image merging software
We'll go through the considerations for each one below:
Camera & lens
The lens is the most important piece, and I figured that if I got the best lens for the job then I could always find a sensor to match. You can generally get CMOS sensors with pixels in the 0.8-8 µm range with varying sizes from 1⁄4" up to a low-cost maximum of 1" (1" sensors are pricey but can often be found used).
To simplify the merging later we really want absolutely no distortion or vignetting. The lens should have a resolution in the object-plane (i.e. the film's size) of around 5-10 µm if possible, this works out to around 4800 dpi in scanner terms or around 30-50 megapixels per 35 mm frame.
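As a rough sanity check of those figures, the conversion from object-plane sample size to scanner dpi and per-frame pixel count can be sketched as below. The function names are made up for illustration and a 36⨉24 mm frame is assumed. Run as-is, it suggests the quoted 4800 dpi and 30-50 megapixels correspond to roughly 4-5 µm per sample, i.e. the fine end of the range.

```python
MM_PER_INCH = 25.4

def dpi_from_sample_um(sample_um: float) -> float:
    """Samples per inch when one sample covers sample_um micrometres of film."""
    return MM_PER_INCH * 1000.0 / sample_um

def frame_megapixels(sample_um: float, width_mm: float = 36.0,
                     height_mm: float = 24.0) -> float:
    """Pixel count (in megapixels) needed to cover a frame at that sample size."""
    px_w = width_mm * 1000.0 / sample_um
    px_h = height_mm * 1000.0 / sample_um
    return px_w * px_h / 1e6

# Assumed sample sizes spanning the range discussed in the text.
for sample_um in (10.0, 5.3, 4.2):
    print(f"{sample_um:4.1f} um -> {dpi_from_sample_um(sample_um):5.0f} dpi, "
          f"{frame_megapixels(sample_um):4.1f} MP per 36x24 mm frame")
```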
Lens considerations
I started with the lens: after some thinking about it I decided that telecentric machine vision lenses seemed like a good idea. These lenses are perspective-free (i.e. give an orthographic projection) and are typically optimised for sharpness, low distortion, and to work at a specific fairly short distance. They're usually specified with a magnification, which is the ratio of the image size on the sensor plane to the object size at the optimal working distance. A common use case for this type of lens might be component identification in an electronics pick-and-place machine.
Obviously we want the best mix of big object size, large image circle (to fit a large sensor) and resolution. What we don't really care about is f-number beyond the diffraction limits it imposes, we control the backlight power and the integration time can be as long as we need within reason.
I initially tried a Daheng DH-MML1-HR65 "1⨉" telecentric lens, which isn't a terrible choice. This lens has a resolution of around 7 µm which isn't the best but also perfectly adequate really. I encountered two issues with this lens: 1) the image was slightly brighter in the corners which was pretty noticeable when viewing the image at lower magnifications, 2) the lens is really 0.5⨉ so I was getting twice the FoV that I expected.
The DH-MML1-HR65 is still good considering it's less than $50 used, but assume it's 0.5⨉ and expect to crop out around 20% of the image to avoid vignetting. Do note that I switched to the EO lens before implementing flat-field correction (see later chapters); with flat field correction the DH-MML1-HR65 may well be completely usable.
The real big boi is the Edmund Optics SilverTL™ 0.6⨉ (56-678) bi-telecentric lens. This is a fair bit more expensive and longer, but it's good. The 0.6⨉ magnification is actually true, there's no vignetting or distortion, and the sharpness is divine at around 3 µm spot size. I think this lens is close to the best possible lens for this project (for up to 1⁄2" sensors), I'm very happy with it! I generally run this lens at f⁄6, though slightly more performance could be squeezed out at f⁄10.


As shown above the selected lens has a very good sharpness both for coarse details (low frequency MTF; often technically erroneously called "micro-contrast") and fine details. The lens will be able to reproduce the fine details of T-Max/Ektapan 100 well, though not perfectly.
Lens alternatives might include other SilverTL™ products or another series from Edmund Optics. I think a bi-telecentric lens is preferable to avoid vignetting artifacting. One fun alternative might be to use a low-magnification lens and just image the entire frame at once using a higher resolution sensor. The alternatives I found seemed to cap out at around 5 megapixels so a pretty significant resolution drop.
Camera
I initially intended to pick up a GigE Vision industrial camera thing, these are available from various Chinese manufacturers and usually pack some kind of Sony CMOS sensor. These do have a benefit in that you have a good selection of sensors to use, but they're a little annoying to talk to (usually some precompiled driver you have to use).
The optimal sensor for the EO lens is something with around 3 µm pixels up to around 1/2" format.
As a quick start I had a Raspberry Pi Global Shutter camera sitting around, this has an IMX296 sensor with 3.45 µm square pixels, 1440⨉1080 format, and 10 bit output. I did the initial testing with this camera and it's not a bad choice at all. What I didn't realise at first was that this sensor is colour only, and I really didn't want the bayer pattern blurring out my monochrome images.
Once I realised the colour issue I switched to the Raspberry Pi HQ Camera, which uses a rolling shutter IMX477 sensor with 1.55 µm pixels in a 4056⨉3040 format, and it does 12 bit output. This is oversampling the lens, and so pre-processing is to de-bayer then downscale to 2028⨉1520 (4⨉ drop in pixel count), this way each final pixel is represented by 2 green pixels and 1 each of red/blue, and should match the lens very well.
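To illustrate the "2 green, 1 red, 1 blue per final pixel" idea, here is a simplified numpy sketch that collapses each 2⨉2 Bayer cell directly into one RGB pixel. This is a stand-in for the de-bayer-then-downscale step, not the actual pipeline, and the BGGR pattern order is an assumption that should be checked against what the camera reports.

```python
import numpy as np

def bayer_to_half_res_rgb(mosaic: np.ndarray, pattern: str = "BGGR") -> np.ndarray:
    """Collapse each 2x2 Bayer cell into one RGB pixel (half resolution).

    pattern lists the colours of the cell in row-major order (assumed BGGR).
    The mosaic must have even height and width.
    """
    m = mosaic.astype(np.float32)
    offsets = ((0, 0), (0, 1), (1, 0), (1, 1))
    sums = {"R": 0.0, "G": 0.0, "B": 0.0}
    counts = {"R": 0, "G": 0, "B": 0}
    for (row, col), colour in zip(offsets, pattern):
        sums[colour] = sums[colour] + m[row::2, col::2]
        counts[colour] += 1
    # Green has two sites per cell, so it ends up as their average.
    return np.stack([sums[c] / counts[c] for c in "RGB"], axis=-1)
```

A 4056⨉3040 mosaic comes out as a 2028⨉1520⨉3 float image with this approach.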
Sony's info on the sensor performance is a bit limited but Strolls with my Dog has done some work calculating performance characteristics here: Pi HQ Cam Sensor Performance. As such we know the sensor is capable of around 11 stops of dynamic range, and if we want to we could disable the defect pixels management (though I've left it on). This is comparable to many cameras used for DSLR scanning, to extend the dynamic range I use a deterministic HDR capture (described later).
The IMX296/GS camera had one big benefit: very high sensitivity, I needed around 5x the exposure time per shot with equal lighting for the IMX477. Given the ratio of pixel area is around 5⨉ this isn't a big shock I guess. "Fortunately" the readout and processing time is much longer on the IMX477 too, so the increased exposure time isn't felt as such. Since there are two extra bits per pixel I could turn down the frame averaging to end up at around the same total time per frame.
Obviously I disable as much of the in-camera and in-library processing as possible, including minimising analogue gain.
Dynamic range & Dmax
The IMX477 sensor natively has a dynamic range of 11 stops. The relationship to Dmax isn't particularly complex[0], it's simply $\log_{10}(2^n)$ where n is the number of bits or stops. An 11 stop sensor therefore has a Dmax of 3.3, by setting our exposure to the base fog instead of the backlight we can maybe add 0.15 or so to that. This is sufficient for a standard B&W negative, but we don't have huge margins.
In the image capture/processing section below I explain how two captures can be used to extend the dynamic range somewhat. By using a HDR gain of 64 our dynamic range is in theory extended by 6 stops, to a new Dmax>5. This corresponds to >17 stops of dynamic range for reference, we'll call this a theoretical Dmax since I don't have the equipment to prove it's real.
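The stops-to-Dmax arithmetic used in the two paragraphs above can be checked with a few lines of Python. The function names are illustrative only.

```python
import math

def dmax_from_stops(stops: float) -> float:
    """Density range for a linear sensor: log10(2**stops)."""
    return stops * math.log10(2.0)

def stops_from_hdr_gain(hdr_gain: float) -> float:
    """Extra stops gained from an HDR gain factor (64 is 6 stops)."""
    return math.log2(hdr_gain)

native_stops = 11
total_stops = native_stops + stops_from_hdr_gain(64)
print(f"native:   {native_stops} stops, Dmax {dmax_from_stops(native_stops):.2f}")
print(f"with HDR: {total_stops:.0f} stops, Dmax {dmax_from_stops(total_stops):.2f}")
```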
In practice the dynamic range is scene dependent: imagine a pair of lines of density 5.0 with full transparency in between. Depending on the width and spacing of the lines there will obviously be a glow effect around the lines. This is what the lens MTF chart above is showing, for finely spaced lines we simply can't see the true density in between them due to glow from the transparent areas: we have low contrast for finely spaced line pairs (= high spatial frequency). This is not something generally talked about with film scanners but is a limiting factor in basically all optical systems, and is analogous to bandwidth in an electrical system.
I'm pretty sure the "clarity" slider in most image editing programs works to increase low-medium frequency contrast; it would probably improve my scans, though I'm hesitant to do processing in a scanner that might degrade image quality in edge cases.
The ability to punch through a dense-ass negative is quite useful in general, even if fine detail contrast is still limited by the lens.
[0]: This is the case for "raw" linear light sensor data; an 8 bit gamma-encoded image is not limited to 8 stops of dynamic range.
Alignment
It's pretty important that the camera sensor plane is perfectly parallel to the film plane, otherwise you'll get blurry corners and any vignetting is magnified. I use a kinematic lens mount (kind of annoying to mount but very easy to adjust), this lets me adjust the lens and camera plane in the X/Y axis.
As for how to optimally align it, the use of a telecentric lens actually makes this easier than you'd expect. Just place a precision pin on top of the film plane and look down at it.
Since the pin is reflective on the sides it makes a ghost image around it, and adjusting the kinematic mount to center this reflection around the pin body ensures the optical axis is parallel to the pin outer walls. Since the pin is precision, the pin should be normal to the film plane and we've achieved alignment.
This is significantly more precise and faster than tricks like trying to compare corner sharpness etc.
A secondary alignment that's more obvious how to do is aligning the sensor's X/Y axis with the motion stage, just twist the camera until horizontal moves don't also cause vertical moves.
Motion stage
The motion stage needs to translate the camera vs. the film in the X/Y plane, and move it up and down (Z axis). There's only one obvious choice here: a 3D printer. If you have a higher budget a desktop CNC-router would also be a good choice, though backlash in lead screws might be more of an issue.
I considered buying the cheapest printer available new but those look awful. I found a Creality Ender 3 V3 SE lightly used locally so picked that up. A used 3D printer is by a very wide margin the cheapest way to get a motion stage. 3D printers are generally pretty damn precise and have very little of e.g. backlash due to being belt driven. Since I already have a slightly nicer Ender 3 model I felt comfortable using this as a starting point.
To control the printer I obviously ditched the built in control software and flashed it with Klipper, and went with a moonraker + fluidd control UI. Connecting to this from my Python script went very painlessly and it's just worked since I got it working. Waiting for motions to finish was also quite simple, just issue an M400 command after a motion.
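The article doesn't show how the Python script talks to Moonraker, so the following is only one plausible way to do it. Moonraker's HTTP API accepts G-code on the /printer/gcode/script endpoint, and appending M400 should make the request return only once the move has finished. The host name, port 7125 (Moonraker's default), feed rate, and function names are placeholders.

```python
import json
import urllib.request

def move_script(x_mm: float, y_mm: float, feed_mm_min: float = 3000.0) -> str:
    """G-code for an absolute X/Y move followed by a wait-for-idle (M400)."""
    return f"G90\nG1 X{x_mm:.3f} Y{y_mm:.3f} F{feed_mm_min:.0f}\nM400"

def send_gcode(base_url: str, script: str, timeout_s: float = 60.0) -> dict:
    """POST a G-code script to Moonraker and return the decoded JSON reply."""
    request = urllib.request.Request(
        f"{base_url}/printer/gcode/script",
        data=json.dumps({"script": script}).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        return json.loads(response.read().decode("utf-8"))

# Example (hypothetical host, not executed here):
# send_gcode("http://scanner.local:7125", move_script(120.0, 85.5))
```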
Klipper makes it very easy to manage the changes I make to the printer, e.g. my machine dimensions and probe offsets are different than stock, but this just means working out the new parameters and updating the configuration file. One funny thing is that Klipper for some reason isn't very happy with my machine if I remove the extruder details, so I just pretend we still have those parts. This is similar to how most chess playing engines don't like it if you give them a board without a king on it.
Now a 3D printer doesn't really need to be very precise in a lot of the ways that we care about, but overall it's perfectly acceptable and with the steppers configured for optimal resolution it's surprisingly good. The X-axis seems to tilt up/down a bit vs. position but this only amounts to approx. 50 pixels alignment error over a 35 mm frame which is handled in software.
Configuring a suitable acceleration is very important to avoid shaking the machine during image capture, and is fortunately trivial to do.
As mentioned above keeping the film plane aligned to the sensor is very important, but it's also important to keep the film-sensor distance constant. What may not be obvious is that bed flatness is relatively unimportant (which is good, the E3 V3 SE lacks the adjustment screws). Using Klipper's bed mesh mapping feature and the honestly quite good optical-pin based CRTouch probe we can have Klipper compute a Z-correction mesh that is transparently applied. If the bed is a tilted plane then the pin-alignment described above will make the sensor parallel to it, and the bed mesh will adjust the Z-offset automatically as we move around, in theory this perfectly cancels out any bed levelling issues.
I do notice some backlash in the Z-axis movement, on the order of 50 µm or so. This is mostly solved by fine tuning the autofocus in the middle of each frame before starting a capture, which also ensures that any remaining bed mesh error is minimised in the scan-area.
Film mount
I initially intended to use a glass sandwich with ANR+clear glass to squeeze the film between. This was a good way of making newton rings, future work will be to improve this situation. Double ANR-glass might be one way to solve this, though sharpness could be an issue.
For initial testing I used some photo paper with a hole cut into it as a film guide and mask, photo paper is a bit thicker than typical films so if the film was naturally flat this was fine. Curved films that touched the top glass consistently caused newton rings.
I still had the V850 film holders so I used these for most scans, these don't hold curved film very flat but with per-frame autofocus it works out ok.
If you wanted to you could probably use the extruder motor drive to run a film-puller to scan a whole uncut roll. This is a whole project in itself, but could be super neat and a 35 mm-only scanner could be made much smaller if this were done.
A fun side-project could be an 8/16 mm film scanner; this would be easier since with a reasonable lens and sensor combo you wouldn't need any stitching. I don't currently work with motion picture stuff unfortunately.
Backlight
A set of 3 pcs 8⨉32 (10 mm pitch) WS2812B ("Neopixel") flexible light panels are used as the backlight. A future upgrade will be to add IR capabilities, which I expect to do by making some super thin IR LED strips to go in between the Neopixels.
The use of a spatially controlled backlight has a huge thermal benefit and makes tweaking uniformity easier. Lighting the entire film-area up evenly to a level where ambient light won't affect the scanning would require on the order of 100 W of LED's, while <10 W is used to light up a given 35 mm frame.
For one-shot colour the R/G/B PWM-balance is adjusted to compensate for orange-masked negatives. (An alternate approach is to do field sequential capture with variable exposure time)
Future work will probably involve making a custom backlight assembly with more precise PWM control and IR emitters. If the backlight is wired to the camera hardware trigger lines then PWM control could be synced to the camera exposure.
Capture and image processing software
I wrote a basic capture program in Python using a variety of libraries. The bulk of the camera work is done by Picamera2/libcamera, which while slightly lacking in documentation is a very nice project once I got my head around it.
For actual image crunching it's the obvious choice: numpy and OpenCV. I use a variety of OpenCV functions to do filtering in the capture system. Unless otherwise noted the entire processing chain is 32-bit floating point up until where TIFF's or similar are generated.
The control software is a basic terminal curses application that configures the scan size/start point, exposure control, and autofocus. When set up it scans the image and outputs 32-bit float images (actually compressed numpy arrays) as well as some JPEG's and histogram plots, these can be viewed in a basic HTML site as the scan happens. After a completed scan a JSON file is written out containing the frame parameters like number of segments, overlap in pixels, filenames, frame/roll ID to name the output file, and parameters that go into the EXIF data later.
For colour scanning the colour captures are merged during scanning; with monochrome scanning the output is still RGB with equal R/G/B values. IR data is stored separately.
The control software calculates segment positions using the sensor/lens configuration, and ensures a 50 px per side overlap is present in each frame for later merging.
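A minimal sketch of how such segment positions could be computed. It assumes the downscaled 2028⨉1520 frame, an object-plane pixel size of 2 ⨉ 1.55 µm / 0.6 (about 5.2 µm), and that neighbouring segments share 50 pixels. The function name and the row-major ordering are illustrative, not taken from the actual control software.

```python
import math

def segment_origins(scan_w_mm, scan_h_mm, frame_px=(2028, 1520),
                    um_per_px=2 * 1.55 / 0.6, overlap_px=50):
    """Return (x, y) offsets in mm, relative to the scan start, per segment."""
    px_mm = um_per_px / 1000.0
    per_axis = []
    for scan_mm, n_px in ((scan_w_mm, frame_px[0]), (scan_h_mm, frame_px[1])):
        fov_mm = n_px * px_mm                    # field of view of one segment
        step_mm = (n_px - overlap_px) * px_mm    # move so neighbours share overlap_px
        if scan_mm <= fov_mm:
            count = 1
        else:
            count = math.ceil((scan_mm - fov_mm) / step_mm) + 1
        per_axis.append([i * step_mm for i in range(count)])
    xs, ys = per_axis
    return [(x, y) for y in ys for x in xs]

origins = segment_origins(36.0, 24.0)
print(f"{len(origins)} segments for a 36x24 mm area")
```

With these assumptions each segment covers roughly 10.5⨉7.9 mm of film.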
It also has a concept of a "roll ID" and frame number that are used to name the output files, usually I just go with a format of "YYYYMMDD_n" using the date of development and n just incrementing for each roll that day.
The frame index is at present used to automatically step over to the next frame when using my 3x6-strip 35 mm stencils, though in future it will need to be configuration controlled to support medium format (or 4x5/8x10 if I ever go down that road).
I also make a real-time image view with overlays to help alignment and monitoring, this is shown on a 5" DSI LCD on top of the printer, or it can be viewed in a webpage remotely. To make the real-time viewer I use the CLI image viewer 'feh' which is great for this, I just launch it as DISPLAY=:0 feh /dev/shm/cam.bmp -F -Z and it will automatically update the view every time a new file is written by my program.
The overlays include a histogram with min/max indicator, gamma/HDR info, Z coordinate, a view of the number of segments to be captured and location (bottom right, currently showing a capture in progress), and the autofocus indicator (Green/Blue/Red bars upper right). The green box is the ROI/colour match/AE-zone and is placed over the film base, the blue box is the AF area, and the red box shows where the overlap will be between segments. The yellow arrow in the upper right changes every time a frame is generated to make it easy to tell if it's live or not.
Bayer sensors & monochrome light sources
One issue you'll run into using an RGB sensor with monochrome light is that the saturation level of the sensor will seem to be dramatically reduced. This happens because a normal RGB-greyscale conversion uses weighting of the RGB-values that assumes you have a pretty grey scene.
When using monochrome backlighting I only use the intensity value from the corresponding channel and the rest are discarded after debayering. This seems to work ok.
Autofocus & auto-exposure
The autofocus system is fairly simple. It can be triggered manually as an operator aid, and it normally runs automatically on the center patch of each frame before that frame is scanned.
To evaluate sharpness I do a laplacian transform over the image (a sort of edge-filtering operation), and compute the variance of the laplacian transformed image. This becomes a unitless number that is used as the AF score (grainier film stocks obviously end up with a higher variance). The autofocus algorithm performs a trinary search about the selected distance, this means it checks the AF score at Z positions +/0/- the selected range and goes to the best position of these, repeating about the new optimal value for each iteration. The range to search is divided by 2 per round and typically I search ±0.15 mm down to around 25 µm step size in the automatic frame-center mode. The manual activation searches a larger range.
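The scoring and search described above can be sketched as follows. The Laplacian here is a numpy-only 4-neighbour stand-in for OpenCV's cv2.Laplacian, and score_at is a hypothetical callback that would move Z, capture a frame, and return its score. The default range and minimum step are illustrative.

```python
import numpy as np

def laplacian_variance(img: np.ndarray) -> float:
    """Variance of a 4-neighbour Laplacian; higher means sharper."""
    f = img.astype(np.float64)
    lap = (f[:-2, 1:-1] + f[2:, 1:-1] + f[1:-1, :-2] + f[1:-1, 2:]
           - 4.0 * f[1:-1, 1:-1])
    return float(lap.var())

def autofocus(score_at, z_start_mm, search_range_mm=0.15, min_step_mm=0.025):
    """Three-point search: test -step/0/+step, keep the best, halve the step."""
    z_best = z_start_mm
    best_score = score_at(z_best)
    step = search_range_mm
    while step >= min_step_mm:
        candidates = [(score_at(z), z) for z in (z_best - step, z_best + step)]
        candidates.append((best_score, z_best))
        best_score, z_best = max(candidates)
        step /= 2.0
    return z_best

# Simulated focus curve peaking at Z = 1.03 mm (made-up numbers).
print(autofocus(lambda z: -(z - 1.03) ** 2, z_start_mm=1.0))
```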
Auto-exposure is a pretty simple integrator regulator that can be activated on demand. It takes a couple of averaged images, reverses the gamma correction to work in linear units, works out the percentage-error and scales the exposure time by the error. This converges quite quickly, so there's an early out if the 8-bit equivalent error is less than 0.5 levels. The green patch in the display shows the ROI which is averaged for auto-exposure, it's small enough to fit between sprocket holes easily. A simple template matcher runs in parallel and updates the position based on a sprocket-hole template to ensure it's always positioned sensibly.
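A minimal sketch of that regulator, assuming ROI values normalised to 0..1 and a gamma of 2.4. The measure callback (capture, average, return the gamma-encoded ROI mean) and the simulated sensor are hypothetical.

```python
def next_exposure(exposure_us, measured, target, gamma=2.4):
    """Scale exposure by the linear-light ratio of target to measured level."""
    measured_lin = max(measured, 1e-6) ** gamma   # undo gamma, avoid divide by zero
    target_lin = target ** gamma
    return exposure_us * target_lin / measured_lin

def auto_expose(measure, exposure_us, target, max_rounds=10, tol_levels=0.5):
    """Iterate until the error is below tol_levels on an 8-bit scale."""
    for _ in range(max_rounds):
        measured = measure(exposure_us)
        if abs(measured - target) * 255.0 < tol_levels:
            break  # early out, close enough
        exposure_us = next_exposure(exposure_us, measured, target)
    return exposure_us

def fake_roi(exposure_us):
    """Simulated linear sensor that clips at full scale (made-up response)."""
    linear = min(1.0, exposure_us * 5e-5)
    return linear ** (1.0 / 2.4)

print(auto_expose(fake_roi, exposure_us=1000.0, target=0.8))
```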
In monochrome mode autofocus and auto-exposure are done using the green light source, since this is the wavelength where the lens has best performance.
Colour negative film base correction
To simplify my mostly manual colour negative inversion process I use the adjustable RGB backlight to adjust the film base colour to neutral.
This works by positioning the green ROI marker over the film base, the blue backlight is set to around 80% of maximum and the exposure time is set as usual based on this. The average value for blue is then used as a target for green and red, adjusting the backlight intensity for these channels to match.
By doing this the inverted image at the end of the chain will have the film base at a neutral grey level, and in theory all that's needed is contrast adjustment to have a "precise" colour response.
In general this adjustment is only done once per roll if at all. A set of presets can be loaded as starting points and refined if desired.
Image averaging
To reduce readout noise I accumulate somewhere between 4-16 images depending on how I feel. This is done using a simple flat weight so e.g. if I wanted to merge 16 images then I'll multiply each image by 1⁄16 and add them together. Readout noise is as expected reduced by √N for N captures, while the film being captured is unaltered since it never changes.
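The flat-weight accumulation and the √N noise reduction can be demonstrated with synthetic data. The frame size and noise level below are arbitrary.

```python
import numpy as np

def average_frames(frames):
    """Accumulate frames with a flat weight of 1/N each."""
    weight = 1.0 / len(frames)
    acc = np.zeros_like(frames[0], dtype=np.float32)
    for frame in frames:
        acc += frame.astype(np.float32) * weight
    return acc

# Synthetic test: a constant "film" plus random readout noise per capture.
rng = np.random.default_rng(0)
truth = np.full((64, 64), 0.5, dtype=np.float32)
frames = [truth + rng.normal(0.0, 0.01, truth.shape).astype(np.float32)
          for _ in range(16)]
single_noise = float((frames[0] - truth).std())
averaged_noise = float((average_frames(frames) - truth).std())
print(f"noise ratio: {single_noise / averaged_noise:.2f} (sqrt(16) = 4)")
```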
Flat field correction
The IMX477 seems to be a very stable sensor with low drift and readout noise. I apply the standard darkframe/flat field correction principle to correct for a slight non-uniformity that leaves the lower right corner slightly darker than the rest.
For the two-exposure HDR mode separate corrections are used since exposure time and analogue gain are quite different. A third flat-field is applied to the merged image to further refine the image quality.
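For reference, the standard darkframe/flat-field principle looks like this in numpy. This is the textbook form, not necessarily the exact variant used in the scanner, and the eps guard is an addition to avoid division by zero.

```python
import numpy as np

def flat_field_correct(raw, dark, flat, eps=1e-6):
    """(raw - dark) / (flat - dark), rescaled by the mean flat level."""
    signal = raw.astype(np.float32) - dark
    gain = np.maximum(flat.astype(np.float32) - dark, eps)
    return signal / gain * float(gain.mean())
```

Here dark is a capture with the backlight off and flat is a capture of the bare backlight, both taken with the same exposure time and analogue gain as the image being corrected.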
HDR capture
While various HDR techniques exist, I don't care for the standard ones since I need the HDR to be applied equally to all the composite images. I also don't care for tone mapping algorithms in an automatic pipeline, they tend to produce obvious aberrations.
I use a simple deterministic HDR technique where each sub-frame is captured using different analogue gains up to the maximum, adding exposure time if required.
I then mix these with a scaling factor (OpenCV's addWeighted): with a factor of e.g. 64 the weight is simply 1/64 since the sensor provides linear-light values.
When averages are called for I average high/low captures then merge these. If e.g. an HDR gain of 64 was requested then the analogue gain will be capped at 8 (16 is sensor max), and the exposure time would be multiplied by 8 to make up the difference.
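The split of an HDR gain into analogue gain and exposure time can be written out as below. The cap of 8 comes from the example above, while the function names are illustrative and the rule for mixing the two captures is deliberately left out.

```python
def split_hdr_gain(hdr_gain: float, analogue_gain_cap: float = 8.0):
    """Return (analogue_gain, exposure_multiplier) for the high capture."""
    analogue_gain = min(hdr_gain, analogue_gain_cap)
    exposure_multiplier = hdr_gain / analogue_gain
    return analogue_gain, exposure_multiplier

def high_capture_weight(hdr_gain: float) -> float:
    """Weight that brings the high capture back to the low capture's scale."""
    return 1.0 / hdr_gain

print(split_hdr_gain(64.0), high_capture_weight(64.0))
```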
As I discovered in the docs, there is no guarantee for how many images need to be read out before a change in exposure time or analogue gain is effected, but I found stopping then starting the picamera2 instance made it apply the settings on the next frame. This was way faster than waiting for a correctly set frame to pop out of the pipeline.
Gamma-management
I elected to use a continuous[0] gamma (𝛾) of 2.4 for my system to sort of match Rec.2020, this became a necessity in the HDR modes to give presentable images that didn't need a ton of post processing.
For real-time use the preview-image is converted to 16-bit and a LUT is used, during stitching a floating-point gamma is used.
[0]: Continuous meaning my gamma correction stays a true gamma down to 0, as opposed to switching to a linear scaling near 0. My offset-correction keeps the black level under control despite this.
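A sketch of a continuous 2.4 gamma with a 16-bit LUT for the preview path. The 16-bit-in, 16-bit-out LUT layout is an assumption, as are all the names.

```python
import numpy as np

GAMMA = 2.4

def encode(linear):
    """Pure power-law encode, with no linear segment near zero."""
    return np.clip(linear, 0.0, 1.0) ** (1.0 / GAMMA)

def decode(encoded):
    """Inverse of encode: back to linear light."""
    return np.clip(encoded, 0.0, 1.0) ** GAMMA

def build_lut16():
    """65536-entry table mapping linear 16-bit codes to gamma-encoded codes."""
    linear = np.arange(65536, dtype=np.float64) / 65535.0
    return np.round(encode(linear) * 65535.0).astype(np.uint16)

LUT16 = build_lut16()

def apply_lut16(img16: np.ndarray) -> np.ndarray:
    """Look up every pixel of a uint16 image in the table."""
    return LUT16[img16]
```

The LUT is built once, so the per-frame cost in the preview is a single indexing operation, while the stitching path can call encode directly on float data.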
Capture time
With the configuration discussed here, each sub-frame covers around 10⨉7 mm with some overlap. Using 64⨉ HDR merging and averaging around 2-8 frames the capture time per sub-frame is around 2-5 seconds. Motion time is typically 1-2 seconds between frames.
Some optimisation has been done to avoid slowdowns such as using multiple threads to handle e.g. display updates, talking to the camera in general, and saving output files. This means the capture cycle is effectively time limited by 1) printer motion, acceleration has to be low to avoid shaking or you have to wait a bunch after motion, and 2) actually capturing the images where 2a) multiple exposure averaging takes a bit of time and 2b) switching exposure time and analogue gain takes a bit of time.
Image merging software
The image merging software is written in Python as usual for my image processing. It's designed to run on a much more powerful computer than even the Pi 5.
I chose this split since I expected the amount of number crunching and RAM needed to be fairly large with medium format scans, and waiting for merging to happen between scans would slow down the capture process. A script running on a server polls the output folder of the Pi and automatically grabs and stitches each scan.
Merging
To merge the mosaic images I initially tried the obvious solution of just asking OpenCV and Affinity Photo to put the frames together for me. This approach works sometimes but failed tremendously in general.
I think the issue with standard image merging algorithms is they're very clever and optimised to handle a bewildering array of lens and camera distortions and to reject noise/fine detail.
I had the basic idea down before starting, getting it working wasn't too difficult and mostly boiled down to figuring out how to make OpenCV do what I want and learning the basics of image processing again.
The idea is this: we know that all our images have an amount of pixels overlapping on each side, and we know their approximate orientation. I decided to only handle orienting two images at a time along one axis. In the right hand image, we select a rectangle on the left side which is our template (cropped to less than our expected pixel-offset error in the X/Y axis), then we grab a fairly large chunk of the rightmost edge of the left hand image and call this our reference.
Then I use a standard OpenCV template matcher, using the Square-difference comparison method, this returns the pixel offset in the left image subset where our right-image template is the closest match. It does this by picking a point, "placing" the template over the image at that point, then it adds the square of the difference in pixel value for each pixel in the template vs. the reference image. This is the "score" for putting that template at that position, this is repeated for each offset the template can fit into. It's kind of like putting two transparent images on top of each other and trying to line them up as best you can.
This matching technique would be sensitive to intensity variations, optical distortions, noise etc., but since I built a system with the best lens the idea is that the images will always contain film grain patterns in the overlap. Film grain is basically random, and by selecting the biggest template that will fit it's all but impossible to get a false positive.
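To make the scoring concrete, here is a brute-force numpy version of square-difference matching. In OpenCV the equivalent would be cv2.matchTemplate with cv2.TM_SQDIFF followed by cv2.minMaxLoc, which is far faster; this loop is only meant to show what the score is. The random test data stands in for film grain.

```python
import numpy as np

def match_sqdiff(reference: np.ndarray, template: np.ndarray):
    """Return ((x, y), score) of the lowest sum of squared differences."""
    ref = reference.astype(np.float64)
    tpl = template.astype(np.float64)
    ref_h, ref_w = ref.shape
    tpl_h, tpl_w = tpl.shape
    best_xy, best_score = (0, 0), float("inf")
    for y in range(ref_h - tpl_h + 1):
        for x in range(ref_w - tpl_w + 1):
            diff = ref[y:y + tpl_h, x:x + tpl_w] - tpl
            score = float(np.sum(diff * diff))
            if score < best_score:
                best_xy, best_score = (x, y), score
    return best_xy, best_score

# Random "grain": a template cut from a known offset is found again.
rng = np.random.default_rng(42)
reference = rng.random((40, 60))
template = reference[7:27, 11:31]
print(match_sqdiff(reference, template))
```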
During development the false-matches I've had were caused by misalignment of the camera, incorrect overlap settings etc. that combined to make it impossible for the simple alignment to work. I also noticed that humans are pretty bad at quickly spotting a poor merge job, I had to zoom in on details to spot some errors.
With the template position found it's a simple matter of calculating the X/Y offsets to apply when putting the two images next to each other. I crossfade a few pixel region between the frames in the overlap, which can sometimes be seen at 100% zoom as a slightly blurrier band. The blur-band doesn't seem to be visible at normal magnifications. For the Y-axis offset I simply crop both images, allowing images to contain dead areas would complicate processing and isn't necessary when the scanner is well aligned.
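A linear crossfade over the shared columns could look like this, assuming two single-channel strips that are already aligned and cropped to the same height. The width of the fade and the function name are illustrative.

```python
import numpy as np

def crossfade_join(left: np.ndarray, right: np.ndarray, overlap_px: int) -> np.ndarray:
    """Join two aligned 2D images that share overlap_px columns (must be > 0)."""
    left = left.astype(np.float32)
    right = right.astype(np.float32)
    # Weight of the right image ramps from 0 to 1 across the overlap.
    w = np.linspace(0.0, 1.0, overlap_px, dtype=np.float32)
    blend = left[:, -overlap_px:] * (1.0 - w) + right[:, :overlap_px] * w
    return np.hstack([left[:, :-overlap_px], blend, right[:, overlap_px:]])
```

Because the two sources never line up to a perfect sub-pixel match, averaging them in the fade region softens the grain slightly, which would explain the faint blur band mentioned above.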
The merging works its way along the X-axis making strips, then later loads and rotates the X-strips to merge along the Y-axis.
The Y-crop is a good indicator of how well aligned the camera is vs. the printer, the Y-axis error should be adjusted to +0-10 pixels per sub-frame. If the Y-crop is negative the Y-axis overlap region tends to be cropped out, causing issues with the later Y-axis merging.
For my own reference: rotating the camera clock-wise decreases the Y-error. For those wondering: the focus-adjust ring on the HQ/GS camera is M1.6x8 mm, I replaced mine with a DIN912 style hex-head screw since the flat head was damn near impossible to precisely turn.
The processing is fairly amenable to multi-threading, each X-strip is processed in parallel, and rotations/gamma/sharpening are applied before merging for the same reason. Typical wall-clock time for a 35 mm frame is 30 seconds for a 50 MP sprocket-hole scan.
Sharpening
I do a basic unsharp-mask over the image to improve sharpness. The parameters that seem to work best are a strength of 1.3 and a pixel-dimension of 2. I obviously want the sharpening to never be excessive, but I also prefer not to have to apply sharpening in post processing.
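For reference, a basic unsharp mask can be sketched as below. The strength of 1.3 comes from the text, but reading the "pixel-dimension of 2" as a Gaussian sigma of 2 px is an assumption, as are the function names and the NumPy-only blur.

```python
import numpy as np

def gaussian_kernel(sigma):
    """Normalised 1D Gaussian, truncated at three sigma."""
    radius = max(1, int(round(3 * sigma)))
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x * x) / (2.0 * sigma * sigma))
    return k / k.sum()

def gaussian_blur(img, sigma):
    """Separable Gaussian blur with reflected borders (greyscale input)."""
    k = gaussian_kernel(sigma)
    r = len(k) // 2
    padded = np.pad(img.astype(np.float64), r, mode="reflect")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def unsharp_mask(img, strength=1.3, sigma=2.0, max_value=65535):
    """Add back `strength` times the detail removed by the blur."""
    blurred = gaussian_blur(img, sigma)
    sharpened = img + strength * (img - blurred)
    return np.clip(sharpened, 0, max_value)
```

A flat area comes out unchanged, while an edge gains a small overshoot on either side; that overshoot is what reads as extra sharpness, and what turns into halos if the strength is pushed too far.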
I did briefly experiment with using a Wiener filter but found this to be unnecessary in this system: it leads to very visible artifacting if not perfectly tuned, and my "point spread function" (PSF) is very close to 1 px anyway. A Wiener filter basically takes a guess at what the PSF is, then uses this as a filter to reverse the system blur. It can give very impressive results for unsharp or motion blurred images but it doesn't seem suited to general purpose photographic use due to the hideous artifacting. GitHub user michal2229 has a nice repo you can run to try it out yourself: dft-wiener-deconvolution-with-psf, though note that you have to change cv2.CV_AA to cv2.LINE_AA to make it not crash with circular PSF's.
IR dust removal
Making a good IR dust removal system will require a lot of testing; at the time of publication my backlight lacks IR emitters so I haven't been able to start on this. The minimum viable product as I see it is creating a dust-map to highlight image areas with dust for manual painting. This in itself is probably a worthwhile feature.
I expect the sequence of operations will be to threshold the IR image, then perform some morphological operations to create a slightly larger mask around each object found to paint out. These object-masks can then be applied to guide an image infill algorithm of some sort.
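The expected sequence (threshold, then grow the mask a little) could look something like the sketch below. Nothing here has been tested on real IR captures; the threshold value, the square structuring element and the function names are all assumptions, and the infill step is left out.

```python
import numpy as np

def dilate(mask, radius):
    """Grow a boolean mask by `radius` pixels using a square structuring element."""
    padded = np.pad(mask, radius, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    h, w = mask.shape
    for dy in range(2 * radius + 1):
        for dx in range(2 * radius + 1):
            out |= padded[dy:dy + h, dx:dx + w]
    return out

def dust_mask(ir, threshold, grow=2):
    """Dust blocks the IR light, so pixels darker than `threshold` are flagged."""
    return dilate(ir < threshold, grow)
```

The resulting mask can be overlaid on the visible-light image as a dust-map for manual painting, or handed to an infill routine later.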
Other functions
My merge script also performs a number of other functions, including sharpening and generating EXIF metadata to embed in the pictures. For posterity this scanner is called the "LA2YUA E3V3SE Scanner" in the EXIF data, named after the printer used as a base.
I decided to limit the output image to be less than 10'000x10'000 pixels since images exceeding 100 megapixels are pretty problematic to work with.
The standard output format is a 16-bit greyscale or colour TIFF file. Exporting DNG files is entirely possible (and experimented with) but I've found DNG files to be annoying to handle in a scanner workflow. Note that to output 16-bit TIFF's in Python, use the tifffile library instead of OpenCV, it supports way more esoteric TIFF features.
Future work is improving colour management and applying ICC profiles generated by e.g. imaging an IT8 chart for colour modes.
Export scripts
My workflow is a bit primitive in some ways, I use Affinity Photo for editing, touching the original TIFF's generated by the scanner. I typically add adjustment layers for every adjustment including dust removal, so the original image data is retained.
After editing each file, I save it and use macOS tags to quickly identify which images I consider worth keeping (red tag is Control+1, very quick to do).
I then run a simple shell script to convert the TIFF's to JPEG and AVIF outputs, keeping the tags for easy sorting. JPEG's are copied into an export folder, the AVIF's are only used for my iCloud photo library. AVIF's are pretty good if slow and a bit buggy when loading 50+ MP images, but macOS Preview.app doesn't support writing them so JPEG's are preferred.
Report generator
Since each roll is typically 10+ GB of image data after conversion, I don't want to keep the original files in online storage forever.
For posterity my storage tiers are: local SSD replicated to server SSD in real-time, older files deleted to keep space free, an online HDD copy manually rsynced, and a set of LTO-5 tapes (approximately 1 tape per year's worth of pictures). I'm considering moving to LTO-7 as prices come down.
To quickly let me reference what images are on a given roll, and to keep a record of e.g. what camera/lens/film/processing was done to a given roll, I make a one-page "contact sheet" report. This is a PDF that I retain in online storage to let me quickly look up any roll I've scanned.
I originally did this using some templates in Affinity Publisher, but since that can't be scripted it was a bit tedious. As part of this project (while waiting for some long 3D prints) I wrote a simple Python program to load a simple text file containing roll information I input manually, and the JSON files for each frame.
The text file contains all the information shown in the table at the top, except the roll reference (inferred from the as-scanned roll ID) and the film type (can be overridden if desired). There's a free-text area which is put into the large area in the table; this will usually contain information such as what kind of event the photos are of, notes regarding development where I tried something new, or comments on issues in the scanned images.
The Python script downscales images to an appropriate resolution (~600 DPI is reasonable for a digitally viewed PDF) and a coloured border is added to each image based on the macOS tag (red border = not tagged, probably rejected). Finally a LaTeX file is written out containing the information.
This is a work in progress, I'd like to make the actual image grid a bit nicer looking and put the frame number as an overlay over the images, but LaTeX isn't the easiest thing to fine tune.
Summary
The presented system is a nice starting point, though some issues are present that make it less than optimal. At the time of publication this system has been partially dismantled to start work on the follow-up, which will be the subject of a future article.
The main issue with using CSI-cameras on a Pi is the very limited selection of sensors, and their poor speed as a result of this. There's no particular upgrade path available either, it's more of a take-it-or-leave-it situation. Due to the low speed of the HQ camera I ended up with a parallel-colour image, which has moderate colour separation without a colour correction matrix. The use of a faster monochrome camera and field-sequential colour should give better uncalibrated performance.
The choice of a bi-telecentric lens seems to stand the test of time: the exceptional distortion performance and resolving power is real, and stitching images captured with this type of lens is trivial compared to more standard lenses. As noted at the end of the timeline below some lens issues were found with stray light, but this can be worked around. The biggest issue is where the film strip is shorter than the mask opening, leaving a large clear area. The workaround is basically to put some tape over that area.
The film mount is not an optimal solution: initial tests with the ANR-glass stack suggest Newton rings are very much a problem if the film is curled. This is future work.
The choice of a 3D printer with belt-driven axes and stepper motors seems sensible: it is a highly affordable motion control system that seems to offer acceptable performance. The choice of my specific Ender 3 V3 SE wasn't optimal, the motion range is slightly limiting and makes comfortably scanning full 35 mm rolls impractical. Further, the system speed is limited by the "bed-slinger" approach since the bed now contains a fairly large assembly with a lot of inertia and poor stiffness. A gantry-type CNC router base, or some type of Core-XY printer would be a better choice since the camera-lens assembly will always be easier to move than a giant light-box.
The Pi 5 is a capable little computer, but it's at its limits here and is easily outclassed by an SFF computer from 2018 that costs around the same. For reference the Mk. II system uses an HP ProDesk 600 G3, which can take a 10 Gbit⁄s network card.
Future work
As mentioned above I'm publishing this article after starting work on a Mk. II system.
Key points expected for Mk. II include:
- Use a GigE-Vision camera for speed and to give better control over the image capture
  - 10GigE-Vision as a future upgrade path
- Use a normal computer to support higher-speed I/O
- Replace the SilverTL lens with a CobaltTL lens that can support a 1" sensor, giving a ~4⨉ increase in FoV while retaining the same spatial resolution, a major speed improvement
- Camera will likely be an IDS uEye variant, since their software stack seems to be pretty good and a wide array of used cameras are available
  - I did some experiments with MDVision cameras but found the software to be somewhat unstable and the camera didn't really behave as expected according to the docs. I think this could be worked around but an IDS camera was available to borrow.
- Backlight will likely be a custom panel optimised for this use case:
  - RGB+IR LED's as tightly packed as practical
  - Each pixel only needs on/off control, not per-pixel PWM like a Neopixel
  - Array update speed doesn't need to be all that fast
  - Each colour channel needs a high speed PWM that can be synchronised to the camera flash trigger output line
  - I expect the Pi Pico PIO-functionality will be well suited to implementing this control scheme
  - If the backlight is only on exactly when the camera is exposing, the power consumption should be significantly lower than the current system by only enabling the LED's when their output is useful
- Nicer film holders will be a mechanical engineering challenge
Timeline
Initial idea
Late February 2026 I was scanning some slightly cooked pushed Vision3 250D and noticed that the blue channel was close to saturated in my V850 scans. I was also generally unhappy with the sharpness of my scans, especially when dealing with Kodak's classic curvy 35 mm base.
Sketching a concept
In early March 2026 I started thinking of a silly idea, looking at using a 9⁄125 µm fiber-end as a flying spot scanner. This idea was sketched out and I did determine that I could get the required laser sources to make an RGB+IR signal to drive the emitter. Obviously this was ruled out once I started thinking of the mechanical requirements to move such a tiny spot over the negative, and the required detector performance would be pretty challenging too. I never bothered calculating the scan time (hours?).
A reasonable concept
Later in March 2026 I started thinking along the lines of a segmented capture scanner, and started looking at motion stage components. After looking at really cheap 3D printers on AliExpress and checking reviews I decided those are basically useless. Looking locally (via Finn.no) I found that a lot of people were and are still selling barely used 3D printers for a solid discount, and the E3 V3 SE was available barely used in box for less than half the new price. A real "for sale: baby shoes, never used" moment. I picked this specifically since it's a fairly pro looking closed design instead of looking like a collection of loose parts in formation.
At this point I was looking at Chinese made GigE Vision cameras, and had selected a reasonable monochrome camera, with the idea being that I'd use a PYTHON1300 based camera I had bought previously for a prototype. These cameras would be more optimal since I could get monochrome sensor variants, but once I remembered I had a RPi GS camera sitting around I decided to use this, figuring that this was a decent monochrome camera.
For lenses I remembered thinking telecentric lenses looked cool previously, and looking into what was available on AliExpress I fairly quickly found the Edmund Optics lens I rave about above. When I showed it to some optics designers I know, they seemed impressed by the stated performance figures. For concept-proving I ordered up the DH-MML1-HR65 since it's a lot cheaper.
I also spent some time thinking about backlights and film mounts, after discussing the concept with some older colleagues who had clear memories of mounting slides I figured that the ANR glass + top cover glass might be viable.
Camera prototype
In the middle of March 2026 I had the 3D printer, the 1⨉ lens, and the GS camera in house. I did some initial testing and was quickly impressed by the achieved sharpness. I had issues with sensor/film plane alignment that were solved by the kinematic mount later, and I had to work out how to actually verify my alignment (pin-method described above is the best).
3D printer interfacing was pretty unproblematic: the Moonraker example code was lightly modified to take G-code input via a thread-safe queue and it's stayed that way.
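The queue pattern itself is small enough to sketch with the standard library. The class name and the `send` callable are hypothetical; in the real program `send` would be whatever function hands a line of G-code to Moonraker, which is not shown here.

```python
import queue
import threading

class GcodeWorker:
    """Serialise G-code lines from any thread onto a single sender thread."""

    def __init__(self, send):
        self._send = send                 # callable that delivers one line to the printer
        self._queue = queue.Queue()
        self._thread = threading.Thread(target=self._run, daemon=True)
        self._thread.start()

    def submit(self, line):
        """Safe to call from the UI, capture, or any other thread."""
        self._queue.put(line)

    def close(self):
        """Drain the queue, then stop the sender thread."""
        self._queue.put(None)             # sentinel
        self._thread.join()

    def _run(self):
        while True:
            line = self._queue.get()
            if line is None:
                break
            self._send(line)
```

For example, `worker = GcodeWorker(print)` followed by `worker.submit("G28")` would home the axes on a real printer once `print` is swapped for the actual sender; commands are always delivered in the order they were submitted.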
I also realised I had to buy a Raspberry Pi 5 since my old CM4's all had 8 GB of storage which just isn't enough for this kind of development work.
I went through a number of iterations of the lens+camera mount, 3D printers aren't really designed to take precision optics but I think I ended up with a pretty sensible mount that doesn't seem to move around too much.
Merging artifacts were an issue during this time, caused by both misalignment of the sensor plane and the cheap lens vignetting. Cropping the image helped with the cheap lens but the real fix was switching to the expensive EO lens. I think the bi-telecentric design is part of why it doesn't have field flatness issues with these sensors, since both should be vignette-free.
The merging software was started on 2026-03-22, the day after a 4 hour photo walk and 8 hour post-walk beer session (results: 1.5 rolls Double-X, 0.8 roll Tri-X, 0.5 roll Fuji 400, indeterminate beer amount, 1 midnight burger). It only took a few hours to get usable results, though tweaking of parameters went on for a few days after.
When I switched to using raw data from the camera (one thing the Pi 5 does better than Pi 4's, it always outputs left aligned 16 bit, the only rational way to supply 9-16 bit image data) I discovered the Bayer pattern of the GS camera. Having previously ordered the HQ camera just in case, I switched to this. I'd already prepped the code so switching to a new camera with different pixel geometry was relatively straightforward, an hour or two of tweaking.
With this kind of telecentric lens sensor dust tends to be pretty visible, especially if stopped down a bit. If blowing dust off isn't enough, I pull the C-mount off the camera board and stick some standard office tape on the sensor and pull it off slowly. This picks up most marks including fingerprints. I picked up that trick from an old telecom R&D engineer where they used to clean fiber optic connector ends that way before the purpose made products were available.
IR filter removal was harder than the documentation suggests, I can inform the reader that hitting the filter glass with a 300°C hot air gun will crack it in around 1 second. Heating the C-mount piece up to maybe 50-60°C made it possible to cleanly remove the glue as well, and a hand-vac was used to pick up all the fairly sharp glass shards.
Fine tuning monochrome
The last week of March 2026 was spent fine tuning the capture process and supporting aspects. I added the HDR-mode and implemented gamma correction, which required some iteration to give results that looked ok and were possible to work with later. This mode was successful enough that I now use it as the default, since the only downside is a straight export looks pretty flat without applying a tone curve.
Initially I used a gamma of 2 since floating point power functions are too slow for real-time preview, and square root and a power of 2 are relatively performant. After a while I switched to using a 16-bit LUT for real-time preview (less precise, around as fast as a floating point square root) and a pure power gamma of 2.4 for capture.
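A 16-bit lookup table for gamma encoding can be sketched as follows. It assumes that "gamma of 2.4" means raising normalised linear values to the power 1/2.4, consistent with the square-root case mentioned above; the function names are made up for the example.

```python
import numpy as np

def build_gamma_lut(gamma=2.4, bits=16):
    """Table mapping every linear code value to its gamma-encoded value."""
    top = (1 << bits) - 1
    linear = np.arange(top + 1, dtype=np.float64) / top
    return np.round(top * linear ** (1.0 / gamma)).astype(np.uint16)

def apply_lut(img, lut):
    """One table lookup per pixel instead of one power function per pixel."""
    return lut[img]
```

The table costs 128 KiB for 16-bit data and is built once; the loss of precision comes from rounding every output to an integer code value.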
JSON outputs were added to provide "frame info" to the merger used in determining what files are there, how they're aligned, exposure parameters, type of scan etc. A lot of the parameters are just formatted and put into the EXIF tags of the output image for posterity.
Since the CSI & DSI cables showed up I could mount the Pi and 5" display where the spool holder is meant to go on the printer. 50 cm long CSI cables seem to work fine, and the X-axis can traverse the entire width at minimum Z without snagging.
A nice heat sink for the Pi 5 with a big fan showed up, an instant 30°C temperature drop. Mounting the Pi 5 behind the screen severely degraded Wi-Fi performance so I decided to run a fiber from my main switch and put a media converter into the final design.
Film mounts and backlight, round 1
The first week of April 2026 was spent getting a prototype film holder and backlight operational. In the same bag as the RPi HQ camera there were also two pieces of glass inside an extremely well padded box (shoutout to my AliExpress glass-store shipping department bros, you rule).
I set up a private Gitea-server, since I develop the capture software straight on the Pi 5 it was nice to have somewhere else to push to.
Since I had realised that I'd need to build a custom backlight controller to properly support colour scanning, I decided to initially use a set of monochrome LED strips. Green was the obvious choice for a monochrome scan, since the Bayer sensor has twice as many green photodiodes per output pixel as red or blue ones.
A fairly immediate issue I noticed in the first test scans was the awful quality. After some testing I realised that using green-only lighting with a pipeline tuned for white light meant my green channel was way overexposed, since the RGB-Greyscale-RGB pipeline was giving me the RGB average as the greyscale value. This meant that when only the green was responding it would saturate quickly, but the greyscale value would be at most 70-80% of max. I tested this by just underexposing the scan a lot and it looked fine again. It's a simple fix: just use the channel corresponding to the backlight colour and ignore the rest.
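The difference between the two approaches is easy to show on a single pixel. The sketch assumes RGB channel order and a plain mean for the greyscale conversion; the actual weights in the pipeline may differ.

```python
import numpy as np

def to_mono(rgb, backlight="green"):
    """Use only the channel that matches the backlight colour."""
    index = {"red": 0, "green": 1, "blue": 2}[backlight]
    return rgb[..., index]

def to_mono_average(rgb):
    """Average of R, G and B, which hides a clipped green channel."""
    return rgb.mean(axis=-1)

# Green is clipped at full scale, yet the average still looks comfortably exposed.
pixel = np.array([[[20000, 65535, 30000]]], dtype=np.uint16)
clipped = to_mono(pixel)[0, 0] == 65535
looks_fine = to_mono_average(pixel)[0, 0] < 65535
```

With the averaging version, auto-exposure keeps pushing the exposure up because the reported value never reaches full scale, even though the only channel carrying image information is already saturated.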
The backlight prototype had a lot of vignetting due to simply not being big enough, but I was able to get a couple of scans of the last roll I'd shot. Stacking more diffusor plates and increasing the LED-diffusor distance to 100 mm helped a fair bit. As part of this process I also added support for frame-indexing to X/Y positions corresponding to negative frames, and worked out that for 6-long strips my Y-axis margin was technically -10 mm, though in practice I had 5 mm to spare.
Above is a comparison of a lightly processed scan vs. a "final" V850 scan at 3200 dpi, I think the difference makes it fairly clear why I wanted a better scanner.
After scanning a couple of fresh rolls by moving an old LED-panel around to get the illumination reasonably flat, I implemented a template based automatic sprocket hole finder to make auto-exposing faster. I was doing it for basically every frame so this was worthwhile. Over 3 rolls with straight glass I encountered a few cases of Newton rings during this round, and I suspect my film/sensor planes weren't fully aligned.
Flat field correction, round 1
The above image suggests that the lower right corner of each sub-frame is darker than the rest. I ruled out reflections by temporarily mounting a polariser on the lens, figuring that e.g. a bounce off the sensor/lens, back to the glass then in again would probably be affected.
After a bit of testing I determined that this effect gets worse with increasing f-number so leaving the lens wide open helps. I spent an embarrassing amount of time trying to invent flat field correction from first principles before I just implemented the Wikipedia formula, which worked. I decided to capture a separate flat field for the long-exposure HDR merge capture, since this uses a different analogue gain.
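The usual textbook form of flat field correction is C = (R - D) * m / (F - D), where R is the raw image, F the flat field, D the dark frame and m the mean of (F - D). A minimal sketch of that formula follows; the function name and the divide-by-zero guard are additions for the example, not necessarily what the capture code does.

```python
import numpy as np

def flat_field_correct(raw, flat, dark):
    """Apply C = (R - D) * m / (F - D), with m the mean of (F - D)."""
    raw = raw.astype(np.float64)
    signal = flat.astype(np.float64) - dark
    signal = np.maximum(signal, 1e-6)     # guard against division by zero
    gain = signal.mean() / signal
    return (raw - dark) * gain
```

Because the gain map depends on the analogue gain and exposure used for the flat, a separate flat (and therefore a separate gain map) is needed for each capture mode, which matches the decision above to shoot a second flat field for the high-gain HDR capture.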
Below is the result of doing a flat field correction with separate HDR, but using a shorter integration time for the high-gain capture to avoid needing an attenuator.
Same shot using an attenuator (film-leader) for the high-gain HDR flat field:
The capture-highlight side of the correction works quite well:
I also ordered a tube-style shade for the EO lens, just in case stray light was causing these effects. I don't believe this is the case but using a hood can't exactly hurt either. Future work is to e.g. attenuate the backlight using the backlight controller to make it faster to update the flat field.
While waiting for the backlight components below I made a start on the report generator code, which initially was an HTML output that I figured I'd print through WebKit. CSS is a very capable way of describing a page layout, and you can do incredible things with it. Later I rewrote the output to be LaTeX format since this is a simpler way of making a PDF with somewhat predictable behaviour.
Later I also optimised the display output to improve performance by e.g. only doing gamma-correction on data going to the display instead of at full resolution. Turns out the M2 Mac I use is pretty fast at floating point gamma.
Planning a better backlight
Making a good backlight that would fit inside the printer area was challenging. After the initial diffusor + LED-strip concept proved too uneven I reassessed.
I decided to order the following, hoping it would let me solve the problem:
- PE thin diffusor film (30 cm by 3 meters of it, type no. LGT125J)
  - To hopefully provide more diffusion than the acrylic plates I started with
- A couple of A4 size fresnel-lens panels
  - Hoping to improve efficiency, letting me stack more diffusors
- A set of 50 W RGB-LED and 10 W 850 nm LED modules
  - Hard to drive, probably a mistake
- A set of 8x32 WS2812B flexible LED panels (these are also around 8 x 32 cm)
  - While adding IR would be challenging, by using a pixel-grid array I could reduce power consumption by a lot by only lighting up the area under the current frame.
  - IR could be added by making a set of thin panels that fill the gaps between the RGB LED's
  - Using these in on/off mode should avoid PWM-flicker issues; I figured I could adjust the density of on/off pixels to improve backlight uniformity
  - Also gave me a use case for one of several mil-spec isolated 28 to 5 V at 15-20 A modules I had sitting around waiting for a use case.
- A Pi Pico module would work as the USB-bridge to let me drive the backlight from my capture program.
A challenge with this setup was achieving good uniformity along the X-axis, since I could only go 8 mm out from the bed before hitting the X/Z posts. Along the Y axis margins were plenty. A fairly simple fix is to just move the film strips closer to the centre. I had initially spaced the 35 mm strips out a fair bit, but since I could only fit 3 anyway there was no reason to spread them out as much as I was doing.
I did some modelling of the printer Y-carriage and made the frame setup shown above. This uses a set of base-plates that bolt on to the carriage and extend past the scanner area, a set of L-brackets that hold the glass are mounted on these plates. A set of 4 mm steel rods are used to make the L-pieces self-supporting.
The L-pieces have a set of M3 holes in them to make it practical to install diffusor and fresnel panels, the nominal build height is 130 mm up to the film.
An old box-lid was cut down to form the backlight base-plate. On the underside a DE-9 connector block and a thickened riveted section is installed to allow mounting of the DC/DC converter and Pi Pico for control. The RGB-panels have connection points on the bottom, so a set of holes will be drilled to access these from the bottom.
A better backlight
Upon receipt of the backlight panels I installed them, a process that took a couple of hours to get all the wiring in place. Did I mention these panels need a lot of current?
One annoyance with the panels is that they go in a zig-zag configuration, and to make the wiring easier I installed one backwards. A lot of messing around to make that code work…
Actually getting the panels to work was really easy though: I used CircuitPython on the Pi Pico and enabled the data USB endpoint to receive and transmit display commands without the REPL stuff interfering. My capture program sends an ASCII string, where the value is a simple bitmap of bitfields corresponding to R/G/B on/off. The Pico works out the details of what order the actual pixels go in.
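To make the two halves of that scheme concrete, here is a sketch of a host-side encoder and a Pico-side index mapping. The wire format (one ASCII digit per LED, rows separated by newlines) and the column-wise zig-zag wiring are assumptions for illustration; the real protocol and panel wiring are not documented here.

```python
def encode_frame(pixels):
    """Pack rows of (r, g, b) on/off flags into one ASCII digit per LED ('0'..'7')."""
    return "\n".join(
        "".join(str(r << 2 | g << 1 | b) for r, g, b in row) for row in pixels
    )

def serpentine_index(x, y, width=32, height=8, flipped=False):
    """Position along the LED chain for grid coordinate (x, y) on one panel."""
    if flipped:                           # panel mounted rotated by 180 degrees
        x, y = width - 1 - x, height - 1 - y
    # Even columns run top to bottom, odd columns bottom to top.
    row = y if x % 2 == 0 else height - 1 - y
    return x * height + row
```

Keeping the mapping on the Pico means the capture program only ever deals with a plain X/Y grid, and a panel installed backwards is handled by a single flag.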
To generate the backlight image, I get the X/Y position of the camera whenever something changes. I make an image with 1 mm pitch (320x240 pixels; the LEDs are spaced 10 mm apart), and draw an ellipse centred on the camera position, sized to be ~4⨉ larger than the image field of view (larger lit up area = more light, due to the diffusor). The image is then blurred, thresholded, decimated to the actual LED grid pitch, and sent to the Pico. Once the entire backlight image has been uploaded the Pico sends back an acknowledgement to allow blocking calls.
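The draw, blur, threshold and decimate sequence can be sketched in NumPy as below. The field of view, blur radius and threshold are placeholder values, and a box blur stands in for whichever blur the capture program actually uses.

```python
import numpy as np

def box_blur(img, radius):
    """Separable box blur with replicated edges."""
    k = np.ones(2 * radius + 1) / (2 * radius + 1)
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="valid"), 0, rows)

def backlight_pattern(cam_x, cam_y, fov=(7.0, 5.0), scale=4.0,
                      size=(320, 240), pitch=10, blur=5, threshold=0.5):
    """Return a boolean LED grid (rows x columns) lit around the camera position.

    All distances are in mm; the working image has one pixel per mm.
    """
    width, height = size
    yy, xx = np.mgrid[0:height, 0:width]
    # Semi-axes of the ellipse: half the field of view, scaled up.
    ax, ay = scale * fov[0] / 2, scale * fov[1] / 2
    ellipse = ((xx - cam_x) / ax) ** 2 + ((yy - cam_y) / ay) ** 2 <= 1.0
    soft = box_blur(ellipse.astype(np.float64), blur)
    lit = soft >= threshold
    # Decimate: sample the 1 mm image at the centre of each LED cell.
    return lit[pitch // 2::pitch, pitch // 2::pitch]
```

Calling `backlight_pattern(160, 120)` gives a 24 by 32 grid with only the LEDs nearest the centre of the bed switched on; blurring before the threshold rounds off the lit region, so small camera moves change the pattern gradually rather than in large jumps.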
The Pico has a 30 second timeout to ensure the backlight won't stay on forever, the capture program refreshes the last configured backlight image slightly faster than this ensuring it won't go off during capture.
To support colour preview I extended a number of functions such as AE to calculate a set of R/G/B exposure times, capture was extended to capture R/G/B if configured, and the preview mode was extended to support approximate colour balance in white light mode (since previewing R/G/B sequential would be too slow and annoying).
I also discovered that despite working in R/G/B order in most of my software I had a couple of bugs: using the red channel when illuminating blue led to some very long exposure times. The display output was also done in RGB but the bitmap output is natively BGR so some conversion had to happen there.
Milestone: first colour image!
Presented for your viewing pleasure is the first colour image I scanned, it was selected randomly from the last roll of colour I shot before starting this project. It's scanned with white light, with no cover-glass, in the storage foil, with no concept of colour management. It still outperforms the V850! Highlight retention is significantly better, which was a major issue with this roll.
Since the "white" output of the WS2812B "NeoPixel" is quite blue, the white light mode isn't too far off from the mix you'd want to counter an orange-based negative.
Another comparison from the same roll, this time using field-sequential mode with auto-adjusted colour balance:
Obviously these Vision3 250D frames are pretty flat and "raw" so finding how to best work with them will require some experimenting.
End of colour-honeymoon, even flatter fields
Up until this point I had targeted sequential colour capture, but after testing it out I found that since my exposure times were in the hundreds of ms anyway my PWM-flicker concerns were irrelevant. As such I reworked my system to PWM-control the backlight to white-balance off the rebate area, and to grab direct captures. After a bit of tweaking the achieved white-balance is quite good, giving reasonable colours right out of the scanner.
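One way to derive the per-channel PWM settings from a rebate-area measurement is sketched below. It assumes the sensor response is linear in duty cycle and that the channels are independent, which the crosstalk only approximately allows; the function name and interface are hypothetical.

```python
def balance_duties(measured, duties=(1.0, 1.0, 1.0)):
    """Scale per-channel PWM duty so the rebate area reads equal in R, G and B.

    `measured` holds the mean R, G, B values seen with the backlight at `duties`.
    """
    if min(measured) <= 0 or min(duties) <= 0:
        raise ValueError("need a positive reading and duty for every channel")
    # Response per unit of duty cycle for each channel.
    gains = [m / d for m, d in zip(measured, duties)]
    # The weakest channel runs at full duty; the others are dimmed to match it.
    target = min(gains)
    return tuple(target / g for g in gains)
```

For example, `balance_duties((40000, 20000, 10000))` returns `(0.25, 0.5, 1.0)`. In practice the measurement would be repeated with the new duties and the result refined once or twice, since crosstalk makes the first estimate slightly off.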
I also wired the backlight power supply in to the heater PWM output, I had to reconfigure the Klipper configuration since heater outputs have hard-coded self-check features (a good idea, but in the way here). Fortunately you can configure "generic" outputs and sensor inputs that are still easy to control.
Capturing all three colours at once does add a slight colour error due to crosstalk between the different colour channels, but since I expected to use an IT8 target for calibration anyway this would be compensated later.
I ran into some weird colour patchiness that was hard to track down, it was entirely deterministic for a given setup and seemed somewhat related to whether the sprocket holes were in frame. Scanning through the foils seemed to make this issue more apparent but going to normal glass didn't entirely remove it. Curiously the issue seemed to follow the frame, so one possibility is some automatic adjustment in the camera I haven't managed to disable.
I ended up removing the lens hood since it seemed to make no difference whatsoever. I also reworked the flat field correction to actively use the backlight control system, and I was able to stop down to f⁄10 again for slightly improved sharpness.
I verified that the flat field performance seemed to be unaffected by the glass plates, so the easy way to do a flat field is to pull them and look straight at the diffusor.
While thinking about these issues I added a couple of other nice-to-haves that will be useful to make batch scanning possible:
- Using the sprocket-hole detector to automatically align the Y-axis when at the starting area for a new frame — a useful feature to have if negatives aren't perfectly aligned (X-alignment is less critical and harder to do)
- A simple exposure-ok check based on the sprocket detector having found a match + the AE area having an acceptable exposure.
- Slightly modified the merging software to run on my old home server instead of my poor M2 Mini
  - It ran perfectly well but the RAM usage in particular made itself felt when multitasking
  - Since my home server is kind of meh, made the merging software multithreaded and made sure as much processing as practical was done in threaded form (e.g. doing gamma conversion on each sub-frame instead of at the end, converting to greyscale as soon as possible etc.).
  - Net result: around 30 seconds per 35 mm frame, on a 12 year old Xeon
- Since these files are around 300 MB per 35 mm frame before any processing, I started a side project to upgrade my home network to 10 Gbit⁄s. This was long overdue; obviously I did have some 10 Gbit⁄s gear already but SFP+-only 10GbE switches are now quite affordable. 2.5GBASE-T switches with a 10 Gbit⁄s uplink are also quite affordable.
Later, I started looking more into the magical tuning JSON file that's loaded on every startup. I swapped to the stock _scientific.json file and started chewing on wires, leaving nothing but the lux-estimator (since it complains without it), DPC-disable, and the basic auto-exposure definition since it apparently can't set analogue gain without it. Notable removed or disabled features included: dead pixel correction, automatic lens-shading, colour correction matrix, automatic white-balance, green-equalisation, any kind of noise reduction, and sharpening. I repeated my flat-field correction after each change.
I did notice that some residual flat field error now showed up, so I implemented a third flat field correction that is applied to the merged HDR-image instead of just to the individual components. This flat field image ended up slightly countering the previous two wrt. lens shading though at fairly low magnitude, suggesting some residual error was present (if the other two were perfect the flat field shouldn't have any texture). A direct comparison is shown below.
If you're wondering: the colour balance being different was caused by the second flat field causing an RGB balance shift which would only show up in final captures and not preview modes. To avoid this I normalise the RGB-gain individually (which may negatively affect the correction…)
A residual error that might be a lens limitation is where large parts of the image are heavily saturated, such as this edge where the backlight is visible due to an oversized mask:
These frames are slightly shifted towards orange after inversion which seems to make sense as a purely optical crosstalk since this is the colour of the backlight. I tried disabling the black-level clamping but this made no difference.
As shown above blocking the light seems to fix the issue and it's repeatable between different sensors, but this does point to a limitation of this type of system. It also seems to imply that including sprocket holes in the scan may be detrimental, though I haven't really seen any clear indications that this actually causes issues.
As it stands, using a more precisely made mask and ensuring that the mask blocks most of the light seems to be sufficient. In cases where backlight-exposure will happen, any kind of light-proof mask can just be placed on top of the top glass as shown above to resolve it.
Charting a path to greatness
After these last tests, I started evaluating what options were available. As mentioned I upgraded my home network to 10 Gbit⁄s and as part of the acquisition process I ended up with one more computer than I needed. From here, I re-evaluated the whole system and, saving you a lot of words, ended up switching to my PYTHON1300 global shutter camera using GigE-Vision. This camera isn't as good in most aspects but there are a lot of GigE cameras available with optimal sensors.
This marks the end of this article, as I feel the Mk. II system using higher end components deserves a separate article.
My future upgrade path involves a CobaltTL™ lens with higher optical performance (though less available on the second-hand market), 10 Gbit⁄s GigE-Vision cameras, and probably the same 3D printer chassis.
The 3D printer wasn't the optimal choice for performance, though it was extremely affordable. A Core-XY machine or gantry-style CNC router would make a more sensible base, moving the entire light-box and film assembly isn't the best solution for speed.