This repository includes software to recover infrared interferometric images based on Compressed Sensing (CASSINI-LASSO)
Link to the GitHub repository with the code, click HERE
1. Brief Introduction to Compressed Sensing
Compressed Sensing (CS) allows us to recover a signal with less samples that the ones established from the Nyquist/Shannon theorem. For the technique to work, the signal must be sparse and compressible on a given basis. It means that the signal can be represented by a linear combination of functions with a small number of non-zero coefficients. In CS, a set of measurements, y, of a given signal, x, can be encoded by a multiplication of the matrices Φ, Ψ, and the sparse vector α. Ψ is the transformation basis where the full signal, x, is sparse, and only a few coefficients in the vector α are non-zero. Φ is, thus, the system of measurements under which the data are taken. For a visual representation of the matrices involved in CS see Fig. 1. It is important to remark that the number of measurements in y is considerably smaller than the number of features/columns in in Ψ, therefore, the inverse problem to find α is "ill-posed". CS establishes that if the product Θ = ΦΨ satisfies the Restricted Isometry Property (RIP), we will be able to recover the signal from the sub-sampled measurements. Therefore, compressed Sensing offers us a framework to solve the "ill-posed" inverse problem by a regularized optimization, using as prior the sparsity of α and/or the degree of compressibility of the signal.
This repository includes code to recover infrared interferometric images using CS from simulated Aperture Masking data. The Aperture Masking data is simulated as expected to be recorded by the near-infrared imager NIRISS on-board of the James Webb Space Telescope (JWST).
2. James-Webb Space Telescope Simulations
NIRISS (Near Infrared Imager and Slitless Spectrograph) is an infrared (band-pass = 0.8 - 5μm) high-resolution camera which allows us to observe an object using Fizeau interferometry in the form of Sparse Aperture Masking (SAM). We obtained the simualted data from our collaboration with the NIRISS team at the Space Telescope Science Institute (STScI). In order to reduce the interferograms of the simulatiosn, we fitted the fringes directly in the image plane using a model of the mask geometry and filter bandwidth, using our software CASSINI-SAMPip.
The SAM data included as example to run the code consisted in the simulation of an inclined and asymmetric proto-planetary disk observed at three different filters (see Table 1) with the following central wavelengths: 3.8μm, 4.3μm and 4.8μm. Given the pointing limitations of the JWST, we considered a maximum of three pointing positions at a position angle(E->N) of -10◦, 0◦ and 10◦. To make the JWST/SAM simulations as realistic as possible, we included piston errors between 10 and 50 nm. These are typical expected error values of the instrumental transfer function. The simulated science data were calibrated with simulated interferograms of point-like objects with similar pistonerrors as the science data. The u-v coverage employed for image reconstruction includes 318 data points (V2 + Fourier phases + CPs) and combines the different simulated pointing positions and wavelengths (see Fig. 2).
3. Image reconstruction based on Compressed Sening
The code for the reconstruction is included in the sub-directory LASSO of the main CASSINI repository. To solve the image optimization problem, the python scikit-learn
library was used. More explicitly, the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm was selected. This LASSO implementation uses a regularized minimization of the following form:
where N is the total number of elements in the sampled signal, y, and λ is the value of the hyperparameter that weights the regularizer. It is important to remark that the constraint region of the l1-norm has the form of an hypercube with several corners, which ensure sparsity of α for a convex optimization problem. This is not the case by using, for example a Ridge regression with ‖α‖22, where the constraint region is a rotational invariant n-sphere. This can also be interpreted as LASSO being a linear regression model with a Laplace prior distribution, with a sharp peak at its mean. In contrast, Gaussian prior distribution of coefficients in a Ridge regression has a more soften peak around its mean.
Before performing the minimization, a precomputed Dictionary (Θ) with 104 different disk-like structures was created. The random images of the disks were created using a pixel grid of 71×71 pixels with a pixelscale of 10 milliarcseconds (mas). To transform those images into the system of measurements of our data, their Fourier transform were performed using a proprietary implementation of the regularly spaced Fast Fourier Transform (FFT) and, the observables (squared visibilities, Fourier phases and closure phases) were obtained for the sampled u-v frequencies. Together with the code in this repository, we included a sample of the dictionary build from our database of models. The code create_dict.py
is an example of how can we read a set of images and transform them into the Fourier Space sampled with our data. This script imports the script called oitools.py
.
oitools.py
contains a small library of routines to perform transformations and data extraction on both, the visibility domain and the image one. One of the most interesting and, indeed, the one that is used to extract the interferometric observables from the images is oitools.compute_vis_matrix
. This piece of code uses a dedicated Direct Fourier Transform to get specific amplitudes and phases from the images at the specific spatial frequencies sampled with an interferometric array. The code is the following one:
def compute_vis_matrix(im, u, v, scale):
import numpy as np
import math
import pdb
sz = np.shape(im)
if sz[0] % 2 == 0:
x, y = np.mgrid[-np.floor(sz[1] / 2 - 1):np.floor(sz[1] / 2):sz[1] * 1j,
np.floor(sz[0] / 2 - 1):-np.floor(sz[0] / 2):sz[0] * 1j]
else:
x, y = np.mgrid[-np.floor(sz[1] / 2):np.floor(sz[1] / 2):sz[1] * 1j,
np.floor(sz[0] / 2):-np.floor(sz[0] / 2):sz[0] * 1j]
x = x * scale
y = y * scale
xx = x.reshape(-1)
yy = y.reshape(-1)
im = im.reshape(-1)
arg = -2.0 * math.pi * (u * yy + v * xx)
reales = im.dot(np.cos(arg))
imaginarios = im.dot(np.sin(arg))
visib = np.linalg.norm([reales, imaginarios])
phase_temp = np.arctan2(imaginarios, reales)
phase = np.rad2deg((phase_temp + np.pi) % (2 * np.pi) - np.pi)
return visib, phase
The input parameters in oitools.compute_vis_matrix
are: (i) the image from which the observables are computed; (ii) the u and v coordinates of the interferometer and; (iii) the pixel scale used in the image (this input is in units of radians).
create_dict.py
also centered and scaled each set of observables to have a mean equals to zero and standard deviation equals to the unity. Finally, the different observables were merged into a single column vector (or atom). The different atoms were stacked to create the different columns of the final Dictionary and stored into a Python binary file. Once the Dictionary was integrated, the assembled dictionary of protoplanetary disks is included in the repository enconded into a numpy binary file. (filename = Dict3.npy).
Once the Dictionary was integrated, LASSO was used to solve for the non-zero coefficients of α that fit the observables and reconstruct the image. LASSO worked over 103 iterations with a pre-defined value of the hyperparameter λ. Figure 4 displays a schematic representation of the described algorithm.
This is the main part of the code for recovering the images from the interferometric data. The code to run LASSO is called `CS_JWST_v1.py`. This routines uses the dictionary (in this case Dict3.npy) together with the combined data set to recover an image. The user has to modify the following entry parameters in the script:
oi_data = 'COMB_JWST_SAM_tot.fits' #This is the filename of the input (COMBINED) data set
dict_filename = 'Dict3.npy' #Dictionary filename
input_path = ['dataset/'] # Input path for the galery of model images
scale = 10.0 # Pixel scale of the generated images in milliarcseconds
hyperparameter = 0.5 # Hyperparameter lambda for the LASSO minimization
maxit = 10000 # Maximum number of iterations for the reconstruction
output_filename = 'recovered_im_lasso.fits' #Output filename
IMPORTANT: In case you want to run LASSO on your local machine, you need the DATABASE of models in the dataset/ directory. The repository includes a copy of this DATABASE but it is partially truncated because of space in the GitHub repository. Before running the code, please download the complete DATABASE from this Dropbox link and put it inside the LASSO directory before running the code.
The script CS_JWST_v1.py
returns the recovered image in .fits format. Figure 4 shows the best-fit image obtained with our CS LASSO algorithm together with the original model image. For the example showed in this simulation, the general structure of the disk is reproduced. The reconstructed morphology shows the correct inclination and position angle. It also shows the brighter emission of the ring along the semi-major axis. The inner cavity is also clearly observed. However, the size of the semi-minor axis is larger than the one of the model image. This can be appreciated in the map of the residuals formed by subtracting the image model from the reconstructed one. Figure 5 shows that the observables are well reproduced by the reconstructed image.
To verify the quality of the reconstructions with the original set of observables, the repository includes the script called plot_obs.py
. This piece of code produces a PNG file called "observables.png". This plot is similar to Figure 5 and it shows a comparison between the extracted obsevables from the reconstructed image and the original data. The input parameters of plot_obs.py
are:
oi_data = 'COMB_JWST_SAM_tot.fits' #Data filename
im_rec = np.squeeze(pyfits.getdata('reconvered_im_lasso_0.001.fits')) #Recovered image filename
wave_range = [3.0e-6, 5.0e-6] #Wavelength range of the observations
scale = 10.0 ## Pixel scale in milliarcseconds
To evaluate the quality of our reconstructions, we also generated images using two other codes available in the community: BSMEM and SQUEEZE. The first one uses maximum entropy for the regularization and a gradient descent method with a trust region for the minimization. The second one could use different types of regularizations, including sparsity (in the form of the l0-norm). For the minimization a Markov-Chain Monte-Carlo (MCMC) algorithm is employed. Similar pixel scale and grid parameters between CS and the reconstructions using BSMEM and SQUEEZE were used. Fig. 6 shows the reconstructions obtained with each one of the different software. Notice that the three algorithms managed to recover the general (position angle and size) structure of the target. However, different artifacts are observed. For example, the BSMEM image shows the two brightest spots of the disk. However, it does not recover the central cavity. This can be easily explained because the Maximum Entropy enforces a smooth brightness transition between the different pixels in the image. The SQUEEZE reconstruction using the l0-norm shows a map with a ”granular” structure, which does not provide well defined loci for the maximum. This image does not show a clear cavity. We remark that the SQUEEZE image can be improved by using additional regularizers. Nevertheless, this is obtained at the cost of being slower. Also, the selection of the hyperaparameters becomes more complicated for more regularizers involved in the reconstruction. Both the SQUEEZE and BSMEM images show additional artifacts around the central source. This is not the case of the CS LASSO image, which shows a uniform background. To estimate the signal-to-noise ratio (SNR) between the peak of the emission and the noise floor of the images, we computed the mean value of the background using all the pixels outside a circular mask with a radius of 15 pixels centered at the middle of the image. The SNR values are: 3.7 x 104, 1.0 x 102 and 0.8 x 102 for the CS LASSO, SQUEEZE and BSMEM images, respectively. These values suggest that the CS LASSO reconstruction achieved a contrast two orders of magnitude larger than the other reconstructions. This is particularly encouraging for the case of high-contrast observations as the ones expected with the JWST.
SUMMARY
-
To extract observables from SAM data
- Run CASSINI-SAMPip Modify the scripttest.py
or create a new one according with the data to be reduced. A description of the parameters used for the setup is included in this webpage. -
To combine SAM data
- Run theoi_combine.py
script inside the SAMpip directory. Modify the script according with the data to be combined. An oifits file is produced. A description of the parameters used for the setup is included in this webpage. -
To create a dictionary of structures for imaging
- From a given galery of images with the desired structures, use the scriptcreate_dict.py
. This code converts the images into atoms of the dictionary in the Fourier Space. For this, it uses the transformation routines included inoitools.py
. The user can runcreate_dict.py
at the root directory of the repository. This step should create a .npy binary file with the dictionary of structures to be used for the reconstruction. -
To recover an image
- With the dictionary, runCS_JWST_v1.py
. An example of the input parameters is included in this webpage. This code will produce the reconstructed image. To evaluate the quality of the reconstruction, the user can conplot_obs.py
, in order to produce a plot of the synthetic observables extracted from the best-reconstructed image and the dataset. -
Additional tools
- We recommed the user to exploreoitools.py
. This script contains a series of transformations and data handling routines that could be useful for evaluating the quality of the reconstructions and/or to vizualize the data.