Direct Exoplanet Imaging with Tensor Decompositions

lwelzel/tide

Direct Exoplanet Imaging with Tensor Decompositions.

post-processed HR 8799 observations (IRDIS) with overlaid tensor ring diagram

High-contrast imaging (HCI) is an essential pillar in the search for exoplanets. It relies heavily on advanced image processing techniques to differentiate planetary flux from the coronagraphic point spread function (PSF) of the host star and nuisance components. Angular-spectral differential imaging (ASDI) is a key method to induce diversity in HCI observations by utilizing integral field spectrographs. Traditional methods like Principal Component Analysis (PCA) have been instrumental but are limited by their inability to fully capture the complex, multi-modal relationships of the diversity exploited by ASDI data processing.

This study introduces tensor decomposition methods as robust alternatives to PCA for direct imaging of exoplanets, aiming to preserve and leverage the multi-modal structure of the observational data more effectively.

This project interprets the ASDI data as high-order tensors with cross-couplings in the spectral, temporal and spatial modes, which are disrupted when the tensors are flattened into a matrix for factorization. Instead of using PCA, the observation tensors are decomposed into low-rank factors using tensor decompositions, including the Canonical-Polyadic, Tucker, Tensor Train, and Tensor Ring decompositions. These decompositions compute a low-rank PSF model while preserving the structure and higher-order relationships of the modes. The new methods are assessed on synthetic and real observations from the SPHERE instrument on the Very Large Telescope.

The tensor-based methods demonstrated a capacity to maintain the integrity of the ASDI data's multi-modal structure and to capture cross-modal interactions. They provide a more adaptable framework for PSF modeling and subtraction. Evaluation against traditional PCA shows comparable performance. Tensor decomposition methods increase the flexibility as well as the interpretability of factorizations and extend existing methods in direct exoplanet imaging. These findings suggest that tensor decompositions can further advance HCI post-processing and make deep learning on HCI datasets tenable.

Tensor Ring Decomposition STIM Map

IFS observations of HR 8799 reduced with the Tensor Ring Decomposition, residuals (left) and STIM map (right).

Angular-Spectral Differential Imaging

Angular-spectral differential imaging (ASDI), also called Combined Differential Imaging (CODI), is an observational technique that exploits angular and spectral diversity to differentiate the stellar PSF from the companion. The observations are scaled so that the stellar PSF is aligned along the angular and spectral dimensions while the companion is misaligned along both. This process is shown in the figure below. The stellar PSF is then modeled as the quasi-static part of the observations. Finally, the PSF model is subtracted from the observations, ideally leaving only the companion signal.

original/rescaled ASDI observations scaled ASDI observations

Schematic observations obtained by angular-spectral differential imaging. Each frame (gray border) in the 3-by-3 grids is an image of the observed planetary system at wavelength λ and parallactic angle θ. The coordinate system in each frame is not wavelength and angle, but instead (projected) distances, e.g. right ascension and declination. The PSF and speckle pattern are shown in red, an off-axis source (like an exoplanet) is shown in blue, and its trajectory through the observation cube is shown as a dashed blue arc. The center of each frame, coinciding with the position of the star, is indicated by a black circle. While only 9 frames are shown here, full observations typically consist of thousands of frames. Left: pre-processed observations. The PSF and speckle pattern spread out with increasing wavelength due to the diffraction of light. Due to the rotation of the Earth, the off-axis source moves on an arc through the observations. Right: scaled observations. Each frame is scaled by the ratio of a reference wavelength λ0 (typically the largest wavelength in an observation) to the wavelength of the frame, called the scale factor s = λ0 / λ, so that the PSF and speckle pattern are aligned throughout the entire cube. This also misaligns off-axis sources, both radially and azimuthally. Ideally, the PSF and speckle pattern are now the same in every frame and can be easily modeled.
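The effect of the scale factor can be checked with a few lines of arithmetic. The sketch below is only an illustration: the wavelengths and pixel radii are made-up values, and it tracks radii rather than resampling actual frames.

```python
import numpy as np

# Hypothetical wavelengths of three spectral channels, in micrometres.
wavelengths = np.array([1.0, 1.25, 1.5])

# Reference wavelength: the largest wavelength in the observation.
lambda_0 = wavelengths.max()

# Scale factor s = lambda_0 / lambda for each channel (always >= 1).
scale = lambda_0 / wavelengths

# Assumption: a speckle sits at a radius proportional to wavelength
# (10 px in the first channel), while the planet sits at a fixed
# radius of 20 px in every channel.
speckle_radius = 10.0 * wavelengths / wavelengths[0]
planet_radius = np.full_like(wavelengths, 20.0)

# Magnifying a frame by s multiplies every radius in it by s.
scaled_speckle = scale * speckle_radius  # same radius in every channel
scaled_planet = scale * planet_radius    # radius now depends on the channel

print(scaled_speckle)
print(scaled_planet)
```

After scaling, the speckle radius no longer depends on the channel while the planet radius does, which is the misalignment the PSF model relies on.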

Typically, the PSF model is found using matrix Principal Component Analysis (PCA), which relies on the truncated matrix Singular Value Decomposition (SVD). The figure below illustrates this method on the simpler case of spectral differential imaging.

SDI with PCA

Post-processing spectral differential imaging observations with matrix principal component analysis (PCA) using the truncated matrix singular value decomposition (SVD).

First, the observations are prepared by aligning the coronagraphic point spread function in the observations $X_{Obs}$. This misaligns the planet (P). In this schematic, the observations of the star and planet are frames with $3 \times 3$ pixels and cover $5$ spectral bands from the optical (blue) to the infrared (red). The observations are thus a $5 \times 3 \times 3$ data cube, which is then flattened into a $9 \times 5$ matrix. This is necessary to use matrix PCA to find a PSF model, but flattening the observations disrupts the relationships between the data points. The flattened observation matrix $X$ is factorized using the SVD. This factorization is exact, so that $X = U \Sigma V^{\top}$ holds. The columns of $U$ are the principal components (PCs) and the rows of $V^{\top}$ are the principal directions, ordered from most "important" (dark) to least "important" (light). The "importance" of each PC is given by its singular value on the diagonal of $\Sigma$. The unimportant PCs typically contain only noise and the planet. The white cells are zero. To model the PSF without including the planet, only the important PCs are kept, which truncates the matrices. In this example, 4 of the 5 PCs are included in the model. This is equivalent to finding a rank-4 model $M_{PSF}$ that optimally approximates the rank-5 observations $X_{Obs}$ in a least-squares sense. The matrix PSF model $M_{PSF}$ can then be tensorized into its original data-cube shape. Because only the leading PCs are included, the planet is left out of the PSF model. Subtracting the PSF model from the original observations leaves only the flux from the planet and noise.
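The steps in this caption can be sketched with NumPy on a toy cube of the same shape. This is a minimal illustration, not the repository's implementation: the PSF, the stellar spectrum, the planet brightness and the planet pixel positions are all invented for the example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy cube matching the schematic: 5 spectral bands of 3 x 3 pixel frames.
n_bands, n_y, n_x = 5, 3, 3
psf = rng.random((n_y, n_x))                  # aligned PSF, same in every band
brightness = np.linspace(1.0, 2.0, n_bands)   # hypothetical stellar spectrum
cube = brightness[:, None, None] * psf

# Misaligned planet: one faint pixel at a different (made-up) position per band.
planet_pixels = [(0, 0), (0, 1), (0, 2), (1, 2), (2, 2)]
for band, (y, x) in enumerate(planet_pixels):
    cube[band, y, x] += 0.05

# Flatten the cube into a (pixels x bands) = 9 x 5 matrix.
X = cube.reshape(n_bands, n_y * n_x).T

# Exact SVD: X = U @ diag(S) @ Vt, singular values sorted in descending order.
U, S, Vt = np.linalg.svd(X, full_matrices=False)

# Keep the k most important components (4 of 5, as in the schematic).
k = 4
M_psf = (U[:, :k] * S[:k]) @ Vt[:k, :]

# Tensorize the model back into the cube shape and subtract it.
model_cube = M_psf.T.reshape(n_bands, n_y, n_x)
residual = cube - model_cube

print(X.shape, model_cube.shape, np.linalg.norm(residual))
```

The norm of the residual equals the root sum of squares of the discarded singular values, which is what "optimal in a least-squares sense" means here. The value of `k` is a tuning choice; keeping fewer components leaves more signal, and more of the PSF, in the residual.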

Because reshaping the observations into a matrix disrupts the relationships between the modes, we propose to model the observations as higher-order tensors using tensor decompositions. Tensor decompositions generalize PCA and SVD to higher-order data. The idea behind this method is illustrated below with the canonical-polyadic decomposition (CPD), applied to an observation cube obtained by spectral differential imaging (only spectral diversity).

SDI with CPD

Post-processing spectral differential imaging observations with tensor methods using the Canonical-Polyadic decomposition (CPD).

The observations are pre-processed as before. However, instead of modeling the PSF as a matrix, the tensor methods find a model that has the same shape as the original data. In the case of the CPD, this model is a sum of vector outer products. Each outer product has one factor for the spectrum $u^{(\lambda)}_{n}$, one for the right ascension $u^{(x)}_{n}$ and one for the declination $u^{(y)}_{n}$. These outer products are conceptually similar to the principal components of the singular value decomposition. However, because the CPD "keeps track" of the original data shape, it is able to distinguish between features in the spectral and spatial modes.
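How a CPD model is assembled from its factor vectors can be shown directly. The sketch below only evaluates the model for given factors; it does not fit them to data, and the random factors, the rank and the array names are assumptions made for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Cube shape from the schematic: 5 spectral bands, 3 x 3 pixels.
n_lambda, n_x, n_y = 5, 3, 3
rank = 2  # number of rank-1 terms (hypothetical choice)

# One factor matrix per mode; column n holds u_n^(lambda), u_n^(x), u_n^(y).
U_lambda = rng.random((n_lambda, rank))
U_x = rng.random((n_x, rank))
U_y = rng.random((n_y, rank))

# CPD model: sum over n of the outer product of the three factor vectors.
model = np.einsum("ln,xn,yn->lxy", U_lambda, U_x, U_y)

# The same model written as an explicit sum of outer products.
explicit = sum(
    np.multiply.outer(np.multiply.outer(U_lambda[:, n], U_x[:, n]), U_y[:, n])
    for n in range(rank)
)

# The model keeps the cube shape and needs far fewer numbers than the cube.
n_parameters = rank * (n_lambda + n_x + n_y)
print(model.shape, n_parameters, model.size)
```

Each mode has its own factor, so a feature of the spectrum and a feature of the image plane are stored separately instead of being mixed in one flattened axis.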

Example

Decomposing the scaled observation tensor $\mathcal{X} \in \mathbb{R}^{I_\lambda \times I_\theta \times I_x \times I_y}$ using the Tensor Ring Decomposition, see the equations below, approximates the low-rank components of the coronagraphic PSF using four order-$3$ factors, $\mathcal{G}^{(i)}$, under the tensor trace operation. Subtracting the low-rank PSF model from the observations, then rescaling and de-rotating the result, yields the residual frame and STIM map shown in the figure above.

TRD Eq. 1

TRD Eq. 2
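The two equations above are rendered as images in the repository and are not reproduced in this text. For orientation, the tensor ring decomposition is commonly written as follows; the rank symbols $R_k$ and the core ordering are notation assumed here, not taken from the images:

$$
\mathcal{X}(i_\lambda, i_\theta, i_x, i_y) \approx \mathrm{Tr}\left( \mathbf{G}^{(1)}(i_\lambda)\, \mathbf{G}^{(2)}(i_\theta)\, \mathbf{G}^{(3)}(i_x)\, \mathbf{G}^{(4)}(i_y) \right)
$$

$$
\mathcal{X}(i_\lambda, i_\theta, i_x, i_y) \approx \sum_{r_1=1}^{R_1} \sum_{r_2=1}^{R_2} \sum_{r_3=1}^{R_3} \sum_{r_4=1}^{R_4} \mathcal{G}^{(1)}(r_1, i_\lambda, r_2)\, \mathcal{G}^{(2)}(r_2, i_\theta, r_3)\, \mathcal{G}^{(3)}(r_3, i_x, r_4)\, \mathcal{G}^{(4)}(r_4, i_y, r_1)
$$

Here each core $\mathcal{G}^{(k)} \in \mathbb{R}^{R_k \times I_k \times R_{k+1}}$ is an order-$3$ tensor with $R_5 = R_1$, and $\mathbf{G}^{(k)}(i)$ denotes its $R_k \times R_{k+1}$ slice at index $i$ of the middle mode.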

Equivalently in tensor network notation:

TRD TND Eq. 3
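The ring contraction can also be written as a single `numpy.einsum` call. This sketch only rebuilds a tensor from given cores and checks one entry against the trace form; the dimensions, ring ranks and random cores are hypothetical, and fitting the cores to real observations is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)

dims = (4, 6, 5, 5)    # hypothetical I_lambda, I_theta, I_x, I_y
ranks = (2, 3, 2, 3)   # hypothetical ring ranks R_1..R_4; R_5 wraps to R_1

# Four order-3 cores, core k with shape (R_k, I_k, R_{k+1}).
cores = [
    rng.random((ranks[k], dims[k], ranks[(k + 1) % 4]))
    for k in range(4)
]

# Contract neighbouring rank indices; the index "a" closes the ring.
model = np.einsum("aib,bjc,ckd,dla->ijkl", *cores)

# Check one entry against the trace of the product of core slices.
idx = (1, 2, 3, 4)
slices = [core[:, i, :] for core, i in zip(cores, idx)]
entry = np.trace(slices[0] @ slices[1] @ slices[2] @ slices[3])

print(model.shape, np.isclose(model[idx], entry))
```

A PSF-subtracted cube would then be the scaled observations minus such a model, followed by rescaling and de-rotation as described above.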