
Microarray Image Denoising

Leveraging Autoencoders
and Attention-Based Architectures with Synthetic Training Data

Chris Czarnecki, Krish Shah, Alexander Wong

Outline

  • What are microarrays?
  • How are microarrays read?
  • The problem we're addressing – noise
  • Prior works
  • Synthetic data generation
  • Discussion on metrics and alternative denoising methods
  • New metric for microarray denoising models
  • New state-of-the-art model for microarray denoising
  • Q&A

What are microarrays?

microarray_example.svg.png

transcription-initiation.png


microarrays-general-principle.png

Reading information from microarray images



Example

Suppose you are given a virus with just 4 genes in its genome, and you want to study which genes are active in the early vs. late stages of infection.

hypothetical-virus.drawio.png

Things to remember

  • Each dot position corresponds to a specific sequence (e.g. a gene)
  • The intensity of each dot can be measured and compared against a control sample
  • The relative intensity of each dot corresponds to the quantity of a given sequence in a sample (e.g. gene mRNA in a particular virus)

But what's the problem?

Noise:

  • Dust particles and dirt that get onto the glass slide during preparation
  • Noise from the scanning process


Problem

Repeating experiments in a wet lab is expensive

derisi-speck.png

How did prior works approach this?

  • Up to 2020, only classical methods had been used
  • 2020 marks the first and only (to date) application of a deep-learning denoising method (Mohandas et al. [1])

Example of classical denoising – Wavelet Transform Denoising

Wavelet transform denoising decomposes an image into frequency components that are localized in the spatial domain; noise is then suppressed by thresholding the resulting coefficients before inverting the transform.

The discrete wavelet transform (DWT) of a signal can be expressed as:

$$W_{j,k} = \langle f(t), \psi_{j,k}(t) \rangle$$

Where:

  • \(\psi_{j,k}(t)\) are the wavelet basis functions at scale j and position k
  • \(\langle \cdot, \cdot \rangle\) denotes the inner product

The noisy signal f(t) is decomposed into wavelet coefficients using the DWT:

$$\{W_{j,k}\} = \text{DWT}(f(t))$$
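For illustration, here is a minimal sketch of this pipeline using the PyWavelets library. The wavelet family, decomposition level, and threshold below are arbitrary choices for the sketch, not settings from any cited work.

```python
import pywt  # PyWavelets

def wavelet_denoise(image, wavelet="db2", level=2, thr=0.04):
    # Decompose into approximation + detail coefficients (the W_{j,k}).
    coeffs = pywt.wavedec2(image, wavelet, level=level)
    # Soft-threshold the detail coefficients, where most of the noise lives.
    denoised = [coeffs[0]] + [
        tuple(pywt.threshold(c, value=thr, mode="soft") for c in details)
        for details in coeffs[1:]
    ]
    # Invert the transform to obtain the denoised image.
    return pywt.waverec2(denoised, wavelet)
```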

But it's not great...

uarr_wavelet_diff.gif

Denoising Autoencoder

uarr-autoencoder-no-res.drawio.png
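As a rough sketch of the idea (not the exact architecture in the diagram above; layer sizes are assumptions), a convolutional denoising autoencoder compresses the noisy image and reconstructs a clean one:

```python
import torch.nn as nn

class DenoisingAE(nn.Module):
    """Illustrative convolutional denoising autoencoder for
    single-channel microarray crops; layer sizes are assumptions."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Conv2d(1, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
        )
        self.decoder = nn.Sequential(
            nn.ConvTranspose2d(64, 32, 3, stride=2, padding=1,
                               output_padding=1), nn.ReLU(),
            nn.ConvTranspose2d(32, 1, 3, stride=2, padding=1,
                               output_padding=1), nn.Sigmoid(),
        )

    def forward(self, x):
        # Trained with a reconstruction loss (e.g. MSE) against the clean target.
        return self.decoder(self.encoder(x))
```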

But is it good?

  • State-of-the-art according to the Peak Signal-to-Noise Ratio criterion
  • But... it was trained on real microarray images, which can never be guaranteed to be noise-free

Idea: why don't we generate our own data?

microarray_generation.gif

hypothetical-virus.drawio.png
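To make the idea concrete, here is a purely illustrative generator: a grid of Gaussian dots with random intensities, plus additive scanner noise and a few dust specks. The dot shape, intensity distribution, and noise model are assumptions for the sketch, not the actual generation pipeline.

```python
import numpy as np

def synth_microarray(n=8, m=8, cell=16, seed=0):
    """Illustrative synthetic microarray: n x m grid of Gaussian dots
    with random intensities, degraded by Gaussian noise and dust."""
    rng = np.random.default_rng(seed)
    h, w = n * cell, m * cell
    yy, xx = np.mgrid[0:cell, 0:cell] - (cell - 1) / 2
    dot = np.exp(-(xx**2 + yy**2) / (2 * (cell / 6) ** 2))  # Gaussian spot
    clean = np.zeros((h, w))
    for i in range(n):
        for j in range(m):
            clean[i*cell:(i+1)*cell, j*cell:(j+1)*cell] = rng.uniform(0, 1) * dot
    noisy = clean + rng.normal(0, 0.05, clean.shape)  # scanner noise
    for _ in range(5):                                # dust specks
        y, x = rng.integers(0, h), rng.integers(0, w)
        noisy[max(0, y - 2):y + 2, max(0, x - 2):x + 2] = 1.0
    return clean, np.clip(noisy, 0, 1)
```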

Can we keep the same autoencoder architecture and improve its denoising power simply by training it on a large corpus of synthetic data?

PSNR on synthetic dataset (higher is better)

| Method      | PSNR (dB, Synth test) |
|-------------|-----------------------|
| AE Baseline | 17.7472               |
| AE Synth    | 28.1942               |

f-PSNR on DeRisi dataset (lower is better)

| Method      | f-PSNR (dB, DeRisi crop test) |
|-------------|-------------------------------|
| AE Baseline | 23.4593                       |
| AE Synth    | 22.0562                       |

But what are PSNR and f-PSNR?

$$ \begin{aligned} PSNR[\mathrm{dB}] = 10 \cdot \log_{10} \left(\frac{(\text{Max possible pixel value})^2}{MSE}\right) \end{aligned} $$

MSE stands for Mean Square Error.

MSE:

  • In PSNR: \(MSE = \frac{1}{mn} \sum\limits_{i=0}^{m-1}\sum\limits_{j=0}^{n-1} (G(i,j) - O(i,j))^2\)
  • In f-PSNR: \(MSE = \frac{1}{mn} \sum\limits_{i=0}^{m-1}\sum\limits_{j=0}^{n-1} (I_\eta(i,j) - O(i,j))^2\)

Where:

  • \(O\) -> the cleaned/output image
  • \(I_{\eta}\) -> the noise-added image (input image)
  • \(G\) -> the ground-truth image
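The two metrics translate directly into NumPy; the sketch below assumes 8-bit images (max pixel value 255):

```python
import numpy as np

def psnr(g, o, max_val=255.0):
    """PSNR in dB between ground truth G and denoised output O."""
    mse = np.mean((g.astype(float) - o.astype(float)) ** 2)
    return 10 * np.log10(max_val**2 / mse)

def f_psnr(i_eta, o, max_val=255.0):
    """f-PSNR: the same formula, but the MSE compares the noisy input
    I_eta against the output O, so lower values mean the model changed
    the input more (i.e. removed more noise)."""
    return psnr(i_eta, o, max_val)
```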

Job done?

We kept asking questions...

Is an autoencoder the optimal architecture for denoising microarray images?

Are PSNR and f-PSNR appropriate metrics for microarrays, given the semantic meaning of each position in the grid?

Is it possible that the denoising model removes relevant information?

Better denoising architectures?

Restormer [2]

restormer-diagram.drawio.png

Results were so good we couldn't believe our eyes

input_ex0_test.png output_restormer.png gt_ex0_test.png

Left-to-right: (1) input, (2) Restormer output, (3) ground-truth

Quantitative results were also stellar

PSNR on synthetic dataset (higher is better)

| Method      | PSNR (dB, Synth test) |
|-------------|-----------------------|
| AE Baseline | 17.7472               |
| AE Synth    | 28.1942               |
| Restormer   | 29.4749               |

f-PSNR on DeRisi dataset (lower is better)

| Method      | f-PSNR (dB, DeRisi crop test) |
|-------------|-------------------------------|
| AE Baseline | 23.4593                       |
| AE Synth    | 22.0562                       |
| Restormer   | 22.5572                       |

Then we took a closer look...

uarr-closeup.png

(a) Restormer, (b) our best-performing model, (c) ground-truth

We need a domain-specific metric!

Introducing SADGE

Standard Assessment of Denoising in Gene-chip Evaluation (SADGE)

$$ \begin{aligned} SADGE = \log \left( \frac{1}{mn}\sum\limits_{i=1}^{n} \sum\limits_{j=1}^{m} \lvert \operatorname{imeasure}(R_{ij}, G_{ij}) - \operatorname{imeasure}(R_{ij}, O_{ij}) \rvert \right) \end{aligned} $$

Where:

  • \(n\) and \(m\) are the numbers of rows and columns of the gridded microarray image, respectively
  • \(\operatorname{imeasure}\) is the pixel intensity measurement operator
  • \(R\) is the reference image (all dots at 50% pixel intensity)
  • \(G\) is the ground-truth image
  • \(O\) is the denoised (output) image

Results so far

SADGE (lower is better)

| Method      | SADGE (MP, Synth test) |
|-------------|------------------------|
| AE Baseline | -0.6397                |
| AE Synth    | -1.1172                |
| Restormer   | -1.0804                |

OK, but how does SADGE work?

maxpool.gif
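A hedged sketch of how the max-pooling (MP) variant might be computed, assuming \(\operatorname{imeasure}\) takes the per-cell maximum intensity relative to the reference; the exact operator is not spelled out in these slides.

```python
import numpy as np

def sadge_mp(gt, out, n, m, ref_level=0.5):
    """Illustrative SADGE sketch with max pooling as imeasure.
    gt/out are (H, W) arrays in [0, 1]; the grid has n rows, m columns.
    With a constant 50%-intensity reference R, the reference term cancels
    in the absolute difference, leaving per-cell intensity disagreement."""
    h, w = gt.shape
    ch, cw = h // n, w // m
    total = 0.0
    for i in range(n):
        for j in range(m):
            cell = (slice(i * ch, (i + 1) * ch), slice(j * cw, (j + 1) * cw))
            meas_gt = gt[cell].max() - ref_level    # imeasure(R_ij, G_ij)
            meas_out = out[cell].max() - ref_level  # imeasure(R_ij, O_ij)
            total += abs(meas_gt - meas_out)
    return np.log(total / (m * n))
```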

We asked yet another question...

Now that we can measure intensities of dots on the synthetic microarrays, can we use them to further condition the training?

Introducing EATME

Elementwise Attention-like Transform for Microarray Enhancement (EATME)

uarr-autoencoder-attn.drawio.png
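The diagram carries the details; purely as a guess at what an elementwise attention-like transform could look like, here is a generic gating block that reweights each feature-map element. This is a speculative sketch, not the actual EATME module.

```python
import torch.nn as nn

class ElementwiseGate(nn.Module):
    """Speculative sketch of an elementwise attention-like transform:
    a learned sigmoid gate rescales each feature-map element."""
    def __init__(self, channels):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x):
        return x * self.gate(x)  # elementwise reweighting of features
```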

Dot Expression Loss

We also add the following loss term to the total loss:

$$ \begin{aligned} \mathcal L_{DEL} = MSE(\mathbf e_{G}, \mathbf e_{O}) \end{aligned} $$

Where:

  • \(\mathbf e_G\) indicates the expression value vector collected from the ground-truth image
  • \(\mathbf e_O\) indicates the expression value vector collected from the denoised image
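A minimal PyTorch sketch of this loss, assuming the expression vectors are collected by max pooling each of the n × m grid cells (consistent with the MP measurement above); the grid dimensions and tensor layout are assumptions.

```python
import torch.nn.functional as F

def dot_expression_loss(gt, out, n, m):
    """Hedged sketch of DEL. Assumes (B, 1, H, W) image tensors and that
    e_G, e_O are collected by max pooling the n x m grid cells."""
    e_g = F.adaptive_max_pool2d(gt, (n, m)).flatten(1)   # e_G
    e_o = F.adaptive_max_pool2d(out, (n, m)).flatten(1)  # e_O
    return F.mse_loss(e_o, e_g)
```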

Results (last time, I swear)

Test-stage PSNR on the synthetic dataset (higher is better)

| Method                    | PSNR (dB, Synth test) |
|---------------------------|-----------------------|
| AE Baseline               | 17.7472               |
| AE Synth                  | 28.1942               |
| Restormer                 | 29.4749               |
| AE Synth (with residuals) | 28.1569               |
| AE Synth (normal)         | 17.4355               |
| AE Synth (EATME, DEL)     | 28.3126               |

Test-stage SADGE (lower is better)

| Method                    | SADGE (MP, Synth test) |
|---------------------------|------------------------|
| AE Baseline               | -0.6397                |
| AE Synth                  | -1.1172                |
| Restormer                 | -1.0804                |
| AE Synth (with residuals) | -1.1012                |
| AE Synth (normal)         | -0.4674                |
| AE Synth (EATME, DEL)     | -1.1225                |

Test-stage f-PSNR on the DeRisi dataset (lower is better)

| Method                    | f-PSNR (dB, DeRisi crop test) |
|---------------------------|-------------------------------|
| AE Baseline               | 23.4593                       |
| AE Synth                  | 22.0562                       |
| Restormer                 | 22.5572                       |
| AE Synth (with residuals) | 22.0000                       |
| AE Synth (normal)         | 20.6842                       |
| AE Synth (EATME, DEL)     | 22.1591                       |

Summary

  • We addressed the problem of denoising microarray images
  • We proposed a synthetic data generation pipeline for microarray image datasets
  • We introduced a new domain-specific metric for assessing the power of denoising models (SADGE)
  • We introduced a new state-of-the-art microarray image denoising model: an autoencoder trained on the synthetic dataset, extended with the EATME module and an additional Dot Expression Loss (DEL) term that penalizes inaccurate expression readouts

References

[1]: A. Mohandas, S. M. Joseph, and P. S. Sathidevi, ‘An Autoencoder based Technique for DNA Microarray Image Denoising’, in 2020 International Conference on Communication and Signal Processing (ICCSP), Chennai, India: IEEE, Jul. 2020, pp. 1366–1371. doi: 10.1109/ICCSP48568.2020.9182265.

[2]: S. W. Zamir, A. Arora, S. Khan, M. Hayat, F. S. Khan, and M.-H. Yang, ‘Restormer: Efficient Transformer for High-Resolution Image Restoration’, Mar. 11, 2022, arXiv:2111.09881. doi: 10.48550/arXiv.2111.09881.