This post is about the effect of re-sampling in image registration methods on local pixel cross correlation for analyzing calcium imaging data.
In awake 2p imaging, animal motion causes brain tissue motion and image motion. While z-motion that can not be recovered from only one imaging plane is usually fairly small, x/y-motion within the imaging plane is usually large enough to make further analysis impossible without first correcting it by shifting/deforming all images back to a common position, usually by aligning them to a template image.
We mostly use a simple, translation only, fft based registration method for this (matlab code by Manuel Guizar, see also Dario Ringach). This works well enough for our needs (imaging cell bodies with GCaMP in awake behaving mice, on a treadmill) as long as frame rates are >10Hz, ensuring that images are mostly just uniformly translated.
I recently wanted to image larger regions at <10Hz, and increase the registration precision. Because images are acquired line by line, different parts of the frame are translated differently, resulting in non-uniform scaling and shear. Correcting this deformation requires estimating the translation independently for parts of the image (typically either for each (fast) x-direction scan line, tied together with some sort of regularization/splines, or at least for a few vertically stacked sub-regions of the frame) and then applying a non-affine transformation to cancel the estimated deformation.
The main choices in this method are selection of the method of estimating translations (fft/Lucas-Kanade/…) interpolation or smoothing used for deformation coordinates across the entire image (image sub-regions via nearest neighbor/linear/splines/…), and type of pixel re-sampling (nearest neighbor/bilinear/quadratic/…). A popular method is the spline-based deformation by Greenberg and Kerr (here’s their code, see also Patrick Mineault’s blog entry).
In all cases other than integer valued translations and shear, the deformation results in having to re-sample the image, and now output (corrected) frame pixels contain information from more than one source image pixel and vice-versa. While this is not a problem for extracting the brightness value time series for analysis, it can cause issues with methods that rely on neighboring pixels being independent. In our cell body detection for instance we rely heavily on computing the cross correlation between neighboring pixels to detect GCaMP driven pixels. Here’s a nice demonstration of a very similar method in action. Image re-sampling, which mixes up neighboring pixels, causes problems with this method, as neighboring pixels that originally contained uncorrelated noise, now share brightness values from the same source pixels. This means that even background noise becomes somewhat correlated in 3×3 neighborhoods. This is a problem because this is the exact measure we’d like to use to distinguish background signal from neurites, which in our case are ~1-2px wide.
Up-sampling all images with a sufficient factor to make all features in the image bigger than the affected 1 pixel scale won’t solve this issue either, as the spatial noise correlation in the source image would be maintained, and in the limit, one source pixel would contribute 50% of the value of a region of the same size in the output.
We also can’t fully solve the issue by going with nearest neighbor resampling, because occasionally, two neighboring output pixels will be identical. In my data (~250x250px, ~500μm field of view, ~8Hz) this happens on ~1-3 x-lines/frame across the image, whenever parts of the image are ‘pulled apart’ causing one line of source data to stretch over 2 lines of aligned output data. This doesn’t sound like a lot, but the occasionally identical pixels still cause unnecessary smoothing of the cross-correlation across the stack, and the issue would be even worse for more complicated deformations.
Deforming images while retaining independent pixel values
Here’s a quick (and somewhat dirty) solution for this:
- Run a nearest-neighbor resampling, but keep track of which pixels were assigned more than once in the output image.
- Now we want to replace the doubled pixels (2nd or more pixels that are identical to the 1st) with independent draws from the same distribution as the first pixel. Because image statistics vary wildly across the frame and across time, we can’t just add back the correct amount of noise to achieve this. Instead, the simplest solution seems to be to just grab these pixels from the preceding (already aligned) frame, which has approximately the same image statistics as the doubled pixels, but contains independent noise.
- This method will occasionally mix a very small amount of data from preceding frames into the current frame. For ~5-10Hz and GCaMP6s data this should be a negligible issue though. Also, if this could cause problems, one could simply switch back to regular NN data, or bilinear data for the extracted time series once the ROIs are identified.
Here’s the complete demo code for this. The crucial step is very simple:
%start by identifying all pixels that have fan-out > 1
[C,IA,IC] = unique(ii_nn,'stable');
% this identifies all unique source pixels
% (first occurrences in IA)
fanouts=1:numel(Iout); % start with list of all pixels
fanouts(IA)=; % remove all 1sts,
% this leaves only the 2nd and further copies of pixels
% from the original to the deformed image
% set doubles to independent samples from
% prev. image to maintain spatial xcorr
Here’s some simulated data (the demo code should generate this figure) showing the xcorr from a random pixel to neighboring pixels in a 5000 frames stack of noise, deformed with pretty exaggerated shear and y-stretch. The NN image shows some spurious correlation to the source pixel in the y direction, caused by pixel doubling in frames where the image was stretched vertically. The fan-out replacement method eliminates this artifact.
Comparison of bilinear, NN and the proposed method. Images show xcorr of a randomly chosen pixel to neighboring pixels in aligned/resampled images. All input images were uncorrelated noise. The bilinear method introduces correlations across all neighbors, the NN method (in this case) only in the vertical direction, and less so. Re-sampling doubled pixels avoids inflating neighboring pixel’s xcorr.
In real data the effect is much less pronounced, but depending on the analysis, this trick might be somewhat useful.
Correcting local pixel cross-correlation
A pretty simple way to achieve a similar effect ‘in post’ is to just estimate the overall spatial cross-correlation (average from ~1000 random positions in the image in a 3×3 or 5×5 neighborhood) and then dividing it out later:
Local xcorr to a ‘source’ pixel in a region without active cells.
Same region/source pixel after dividing out average spatial correlation.
This can make detecting neighboring small cells much cleaner and should work even when significant spatial correlation structure is present in the image stack (such as due to imperfect option correction):
ROIs and correlation to source pixel – increased correlation around source pixel is clearly visible.
Same ROIs and source pixel – correcting for baseline correlation makes cell boundaries more easily visible.