This repository includes software to recover infrared interferometric images based on Compressed Sensing (CASSINI-LASSO)

1. Brief Introduction to Compressed Sensing

Compressed Sensing (CS) allows us to recover a signal with less samples that the ones established from the Nyquist/Shannon theorem. For the technique to work, the signal must be sparse and compressible on a given basis. It means that the signal can be represented by a linear combination of functions with a small number of non-zero coefficients. In CS, a set of measurements, y, of a given signal, x, can be encoded by a multiplication of the matrices Φ, Ψ, and the sparse vector α. Ψ is the transformation basis where the full signal, x, is sparse, and only a few coefficients in the vector α are non-zero. Φ is, thus, the system of measurements under which the data are taken. For a visual representation of the matrices involved in CS see Fig. 1. It is important to remark that the number of measurements in y is considerably smaller than the number of features/columns in in Ψ, therefore, the inverse problem to find α is "ill-posed". CS establishes that if the product Θ = ΦΨ satisfies the Restricted Isometry Property (RIP), we will be able to recover the signal from the sub-sampled measurements. Therefore, compressed Sensing offers us a framework to solve the "ill-posed" inverse problem by a regularized optimization, using as prior the sparsity of α and/or the degree of compressibility of the signal.

This repository includes code to recover infrared interferometric images using CS from simulated Aperture Masking data. The Aperture Masking data is simulated as expected to be recorded by the near-infrared imager NIRISS on-board of the James Webb Space Telescope (JWST).

Dummy image

Figure 1. Schematic Representation of the Compressed Sensing algorithm

2. James-Webb Space Telescope Simulations

NIRISS (Near Infrared Imager and Slitless Spectrograph) is an infrared (band-pass = 0.8 - 5μm) high-resolution camera which allows us to observe an object using Fizeau interferometry in the form of Sparse Aperture Masking (SAM). We obtained the simualted data from our collaboration with the NIRISS team at the Space Telescope Science Institute (STScI). In order to reduce the interferograms of the simulatiosn, we fitted the fringes directly in the image plane using a model of the mask geometry and filter bandwidth, using our software CASSINI-SAMPip.

The SAM data included as example to run the code consisted in the simulation of an inclined and asymmetric proto-planetary disk observed at three different filters (see Table 1) with the following central wavelengths: 3.8μm, 4.3μm and 4.8μm. Given the pointing limitations of the JWST, we considered a maximum of three pointing positions at a position angle(E->N) of -10, 0 and 10. To make the JWST/SAM simulations as realistic as possible, we included piston errors between 10 and 50 nm. These are typical expected error values of the instrumental transfer function. The simulated science data were calibrated with simulated interferograms of point-like objects with similar pistonerrors as the science data. The u-v coverage employed for image reconstruction includes 318 data points (V2 + Fourier phases + CPs) and combines the different simulated pointing positions and wavelengths (see Fig. 2).

Dummy image2

Figure 2. right: u-v coverage of our JWST-SAM simulations. left: Simulated inteferogram of a proto-planetary disk as observed with the SAM mode of the JWST.

Dummy image2

Table 1. Simulated NIRISS filters.

3. Image reconstruction based on Compressed Sening

The code for the reconstruction is included in the sub-directory LASSO of the main CASSINI repository. To solve the image optimization problem, the python scikit-learn library was used. More explicitly, the Least Absolute Shrinkage and Selection Operator (LASSO) algorithm was selected. This LASSO implementation uses a regularized minimization of the following form:

Dummy image2

where N is the total number of elements in the sampled signal, y, and λ is the value of the hyperparameter that weights the regularizer. It is important to remark that the constraint region of the l1-norm has the form of an hypercube with several corners, which ensure sparsity of α for a convex optimization problem. This is not the case by using, for example a Ridge regression with ‖α‖22, where the constraint region is a rotational invariant n-sphere. This can also be interpreted as LASSO being a linear regression model with a Laplace prior distribution, with a sharp peak at its mean. In contrast, Gaussian prior distribution of coefficients in a Ridge regression has a more soften peak around its mean.

Dummy image2

Figure 3. The diagram shows a visual representation of the CS LASSO implementation of our work. A Dictionary of models (θ = ΦΨ) is created with a group of images (Ψ), which are transformed into the measured observables atthe simulated u-v plane (Φ). Then, the Dictionary is compared with the data (y) and a set of non-zero coefficients (α) are selected. This process is repeated over a given number of iterations until the best-fit reconstructed image (x) is obtained.

Before performing the minimization, a precomputed Dictionary (Θ) with 104 different disk-like structures was created. The random images of the disks were created using a pixel grid of 71×71 pixels with a pixelscale of 10 milliarcseconds (mas). To transform those images into the system of measurements of our data, their Fourier transform were performed using a proprietary implementation of the regularly spaced Fast Fourier Transform (FFT) and, the observables (squared visibilities, Fourier phases and closure phases) were obtained for the sampled u-v frequencies. Together with the code in this repository, we included a sample of the dictionary build from our database of models. The code create_dict.py is an example of how can we read a set of images and transform them into the Fourier Space sampled with our data. This script imports the script called oitools.py.

oitools.py contains a small library of routines to perform transformations and data extraction on both, the visibility domain and the image one. One of the most interesting and, indeed, the one that is used to extract the interferometric observables from the images is oitools.compute_vis_matrix. This piece of code uses a dedicated Direct Fourier Transform to get specific amplitudes and phases from the images at the specific spatial frequencies sampled with an interferometric array. The code is the following one:


def compute_vis_matrix(im, u, v, scale):
    import numpy as np
    import math
    import pdb
    sz = np.shape(im)
    if sz[0] % 2 == 0:
        x, y = np.mgrid[-np.floor(sz[1] / 2 - 1):np.floor(sz[1] / 2):sz[1] * 1j,
               np.floor(sz[0] / 2 - 1):-np.floor(sz[0] / 2):sz[0] * 1j]
    else:
        x, y = np.mgrid[-np.floor(sz[1] / 2):np.floor(sz[1] / 2):sz[1] * 1j,
               np.floor(sz[0] / 2):-np.floor(sz[0] / 2):sz[0] * 1j]
    x = x * scale
    y = y * scale
    xx = x.reshape(-1)
    yy = y.reshape(-1)
    im = im.reshape(-1)
    arg = -2.0 * math.pi * (u * yy + v * xx)
    reales = im.dot(np.cos(arg))
    imaginarios = im.dot(np.sin(arg))
    visib = np.linalg.norm([reales, imaginarios])
    phase_temp = np.arctan2(imaginarios, reales)
    phase = np.rad2deg((phase_temp + np.pi) % (2 * np.pi) - np.pi)
    return visib, phase

The input parameters in oitools.compute_vis_matrix are: (i) the image from which the observables are computed; (ii) the u and v coordinates of the interferometer and; (iii) the pixel scale used in the image (this input is in units of radians).

create_dict.py also centered and scaled each set of observables to have a mean equals to zero and standard deviation equals to the unity. Finally, the different observables were merged into a single column vector (or atom). The different atoms were stacked to create the different columns of the final Dictionary and stored into a Python binary file. Once the Dictionary was integrated, the assembled dictionary of protoplanetary disks is included in the repository enconded into a numpy binary file. (filename = Dict3.npy).

Once the Dictionary was integrated, LASSO was used to solve for the non-zero coefficients of α that fit the observables and reconstruct the image. LASSO worked over 103 iterations with a pre-defined value of the hyperparameter λ. Figure 4 displays a schematic representation of the described algorithm.

This is the main part of the code for recovering the images from the interferometric data. The code to run LASSO is called `CS_JWST_v1.py`. This routines uses the dictionary (in this case Dict3.npy) together with the combined data set to recover an image. The user has to modify the following entry parameters in the script:


oi_data = 'COMB_JWST_SAM_tot.fits'  #This is the filename of the input (COMBINED) data set
dict_filename = 'Dict3.npy' #Dictionary filename
input_path = ['dataset/']  # Input path for the galery of model images
scale = 10.0 # Pixel scale of the generated images in milliarcseconds
hyperparameter = 0.5 # Hyperparameter lambda for the LASSO minimization
maxit = 10000 # Maximum number of iterations for the reconstruction
output_filename = 'recovered_im_lasso.fits' #Output filename

IMPORTANT: In case you want to run LASSO on your local machine, you need the DATABASE of models in the dataset/ directory. The repository includes a copy of this DATABASE but it is partially truncated because of space in the GitHub repository. Before running the code, please download the complete DATABASE from this Dropbox link and put it inside the LASSO directory before running the code.

The script CS_JWST_v1.py returns the recovered image in .fits format. Figure 4 shows the best-fit image obtained with our CS LASSO algorithm together with the original model image. For the example showed in this simulation, the general structure of the disk is reproduced. The reconstructed morphology shows the correct inclination and position angle. It also shows the brighter emission of the ring along the semi-major axis. The inner cavity is also clearly observed. However, the size of the semi-minor axis is larger than the one of the model image. This can be appreciated in the map of the residuals formed by subtracting the image model from the reconstructed one. Figure 5 shows that the observables are well reproduced by the reconstructed image.

Dummy image2

Figure 4. Left: Model image from which the simulated data were obtained. Middle: Reconstructed CS LASSO image. Right: Map of residuals. The left and right panels are normalized to the unity and the color map scale is the same for an easy comparison between the two of them.

Dummy image2

Figure 5. Comparison between the data (black dots) and the recovered Squared visibilities and Closure Phases from the reconstructed CS LASSO images. The left panel displays the squared visibilities versus spatial frequency while the right panel shows the closure phases versus spatial frequencies.

To verify the quality of the reconstructions with the original set of observables, the repository includes the script called plot_obs.py. This piece of code produces a PNG file called "observables.png". This plot is similar to Figure 5 and it shows a comparison between the extracted obsevables from the reconstructed image and the original data. The input parameters of plot_obs.py are:


oi_data = 'COMB_JWST_SAM_tot.fits' #Data filename
im_rec = np.squeeze(pyfits.getdata('reconvered_im_lasso_0.001.fits')) #Recovered image filename
wave_range = [3.0e-6, 5.0e-6] #Wavelength range of the observations
scale = 10.0 ## Pixel scale in milliarcseconds

To evaluate the quality of our reconstructions, we also generated images using two other codes available in the community: BSMEM and SQUEEZE. The first one uses maximum entropy for the regularization and a gradient descent method with a trust region for the minimization. The second one could use different types of regularizations, including sparsity (in the form of the l0-norm). For the minimization a Markov-Chain Monte-Carlo (MCMC) algorithm is employed. Similar pixel scale and grid parameters between CS and the reconstructions using BSMEM and SQUEEZE were used. Fig. 6 shows the reconstructions obtained with each one of the different software. Notice that the three algorithms managed to recover the general (position angle and size) structure of the target. However, different artifacts are observed. For example, the BSMEM image shows the two brightest spots of the disk. However, it does not recover the central cavity. This can be easily explained because the Maximum Entropy enforces a smooth brightness transition between the different pixels in the image. The SQUEEZE reconstruction using the l0-norm shows a map with a ”granular” structure, which does not provide well defined loci for the maximum. This image does not show a clear cavity. We remark that the SQUEEZE image can be improved by using additional regularizers. Nevertheless, this is obtained at the cost of being slower. Also, the selection of the hyperaparameters becomes more complicated for more regularizers involved in the reconstruction. Both the SQUEEZE and BSMEM images show additional artifacts around the central source. This is not the case of the CS LASSO image, which shows a uniform background. To estimate the signal-to-noise ratio (SNR) between the peak of the emission and the noise floor of the images, we computed the mean value of the background using all the pixels outside a circular mask with a radius of 15 pixels centered at the middle of the image. The SNR values are: 3.7 x 104, 1.0 x 102 and 0.8 x 102 for the CS LASSO, SQUEEZE and BSMEM images, respectively. These values suggest that the CS LASSO reconstruction achieved a contrast two orders of magnitude larger than the other reconstructions. This is particularly encouraging for the case of high-contrast observations as the ones expected with the JWST.

Dummy image2

Figure 7. Left: Reconstructed CS LASSO image. Middle: Reconstructed SQUEEZE image. Right: Reconstructed BSMEM image.

SUMMARY

  • To extract observables from SAM data - Run CASSINI-SAMPip Modify the script test.py or create a new one according with the data to be reduced. A description of the parameters used for the setup is included in this webpage.

  • To combine SAM data - Run the oi_combine.py script inside the SAMpip directory. Modify the script according with the data to be combined. An oifits file is produced. A description of the parameters used for the setup is included in this webpage.

  • To create a dictionary of structures for imaging - From a given galery of images with the desired structures, use the script create_dict.py. This code converts the images into atoms of the dictionary in the Fourier Space. For this, it uses the transformation routines included in oitools.py. The user can run create_dict.py at the root directory of the repository. This step should create a .npy binary file with the dictionary of structures to be used for the reconstruction.

  • To recover an image - With the dictionary, run CS_JWST_v1.py. An example of the input parameters is included in this webpage. This code will produce the reconstructed image. To evaluate the quality of the reconstruction, the user can con plot_obs.py, in order to produce a plot of the synthetic observables extracted from the best-reconstructed image and the dataset.

  • Additional tools - We recommed the user to explore oitools.py. This script contains a series of transformations and data handling routines that could be useful for evaluating the quality of the reconstructions and/or to vizualize the data.