
Inverse problems: From regularization to Bayesian inference


Inverse problems deal with the quest for unknown causes of observed consequences, based on predictive models, known as forward models, that associate the former quantities to the latter in the causal order. Forward models are usually well‐posed, as causes determine consequences in a unique and stable way. Inverse problems, on the other hand, are usually ill‐posed: the data may be insufficient to identify the cause unambiguously, an exact solution may not exist, and, as in a mystery story, the discovery of the cause without extra information tends to be highly sensitive to measurement noise and modeling errors. The Bayesian methodology provides a versatile and natural way of incorporating extra information to supplement the noisy data, by modeling the unknown as a random variable so as to make the uncertainty about its value explicit. Presenting the solution in the form of a posterior distribution opens a wide range of possibilities to compute useful estimates. Inverse problems are traditionally approached from the point of view of regularization, a process whereby the ill‐posed problem is replaced by a nearby well‐posed one. While many regularization techniques can be reinterpreted in the Bayesian framework through prior design, the Bayesian formalism provides new techniques to enrich the paradigm of traditional inverse problems. In particular, inaccuracies and inadequacies of the forward model are naturally handled in the statistical framework. Similarly, qualitative information about the solution may be reformulated in the form of priors with unknown parameters that can be successfully handled in the hierarchical Bayesian context.
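In the linear Gaussian case, the reinterpretation of regularization as prior design can be made concrete: the Tikhonov solution coincides with the Bayesian maximum a posteriori (MAP) estimate, with the regularization parameter given by the noise-to-prior variance ratio. The following minimal sketch illustrates the correspondence; the forward matrix, dimensions, and parameter values are assumptions for illustration, not taken from the article.

```python
import numpy as np

# For a linear model y = A x + e with Gaussian noise e ~ N(0, sigma^2 I)
# and Gaussian prior x ~ N(0, gamma^2 (L^T L)^{-1}), the MAP estimate
# minimizes  ||A x - y||^2 / sigma^2 + ||L x||^2 / gamma^2,
# which is Tikhonov regularization with alpha = sigma^2 / gamma^2.

rng = np.random.default_rng(0)
m, n = 20, 50
A = rng.standard_normal((m, n))          # underdetermined forward model (ill-posed)
x_true = np.sin(np.linspace(0, np.pi, n))
sigma = 0.05                             # noise standard deviation
y = A @ x_true + sigma * rng.standard_normal(m)

L = np.eye(n)                            # standard-form Tikhonov penalty
gamma = 1.0                              # prior standard deviation (assumed)
alpha = sigma**2 / gamma**2              # regularization parameter = variance ratio

# Normal equations of the penalized least-squares problem = MAP estimate
x_map = np.linalg.solve(A.T @ A + alpha * (L.T @ L), A.T @ y)
```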

This article is categorized under:

  • Statistical and Graphical Methods of Data Analysis > Bayesian Methods and Theory
  • Algorithms and Computational Methods > Numerical Methods
  • Applications of Computational Statistics > Computational Mathematics
One‐dimensional deconvolution problem with different prior assumptions. In the top row, the data are generated using the smooth profile shown in the panel on the left. The convolution kernel is plotted in the inset, and the noisy convolution data are indicated by the red dots. The noise level is 0.5% of the maximum of the noiseless signal. The second panel from the left shows the maximum a posteriori (MAP) estimate corresponding to a second‐order smoothness prior. The third panel shows the MAP estimate with a hierarchical conditionally Gaussian prior, in which the hyperparameter θ_j represents the prior variance of the difference between adjacent signal values x_j and x_{j+1} (Calvetti & Somersalo). The panel on the right is the corresponding estimate of the variance of the components. The bottom row shows the results corresponding to the same choices of prior models when the data come from the piecewise constant profile shown in the left panel.
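A hedged sketch of the smoothness-prior part of this experiment follows; the kernel width, grid size, and prior standard deviation are assumptions, with only the 0.5% noise level taken from the caption.

```python
import numpy as np

# 1D deconvolution with a Gaussian blurring kernel and a second-order
# smoothness prior (illustrative parameter values).

rng = np.random.default_rng(1)
n = 200
t = np.linspace(0, 1, n)
A = np.exp(-0.5 * ((t[:, None] - t[None, :]) / 0.03) ** 2)
A /= A.sum(axis=1, keepdims=True)        # row-normalized convolution matrix

x_true = np.exp(-50 * (t - 0.5) ** 2)    # smooth test profile
sigma = 0.005 * np.max(A @ x_true)       # noise std = 0.5% of noiseless maximum
y = A @ x_true + sigma * rng.standard_normal(n)

# Second-order finite differences encode the smoothness prior: the MAP
# objective penalizes ||L x||^2 / gamma^2, i.e., large second derivatives.
L = np.diff(np.eye(n), n=2, axis=0)      # (n-2) x n second-difference matrix
gamma = 0.01                             # prior std of second differences (assumed)

x_map = np.linalg.solve(A.T @ A / sigma**2 + L.T @ L / gamma**2,
                        A.T @ y / sigma**2)
```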
Illustration of the effect of structural prior information in a borehole tomography inverse problem. The panels on the left show the two densities used for generating the tomography data. The green squares on the left edge of each panel mark the locations of the transmitters, and the black dots on the right edge the locations of the receivers. In the upper row, the density is a smooth sinusoidal wave, while in the lower one a structural discontinuity is added. The tomography data consist of integrals of the density function along each line joining a transmitter–receiver pair, contaminated by noise with standard deviation equal to 1% of the maximum noiseless data entry. The panels in the center show numerical approximations of the maximum a posteriori (MAP) estimates obtained with a second‐order smoothness prior, and those on the right the corresponding estimates with a structural prior informing of possible discontinuities in the vertical direction at the locations of the discontinuities in the second dataset. When the data indeed come from a density distribution with discontinuities, the structural prior improves the estimate; and while a hint of the prior structure can still be seen in the estimate based on data arising from the continuous density, the discontinuity is not forced.
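The forward model in this experiment is linear: each datum is a line integral of the density. A hedged sketch of assembling the forward matrix, with geometry, resolution, and quadrature chosen for illustration rather than matching the article:

```python
import numpy as np

# Each datum approximates the integral of a pixelized density along a
# transmitter-receiver ray, via a Riemann sum over points sampled along
# the ray and binned to pixels.

n = 50                                    # pixels per side of the unit square
K = 400                                   # quadrature points per ray
tx = np.column_stack([np.zeros(10), np.linspace(0.05, 0.95, 10)])  # left edge
rx = np.column_stack([np.ones(10), np.linspace(0.05, 0.95, 10)])   # right edge

rows = []
for a in tx:
    for b in rx:
        s = np.linspace(0.0, 1.0, K)
        pts = a + s[:, None] * (b - a)                  # points on the ray
        ij = np.minimum((pts * n).astype(int), n - 1)   # pixel indices
        row = np.zeros(n * n)
        np.add.at(row, ij[:, 1] * n + ij[:, 0],
                  np.linalg.norm(b - a) / K)            # length per sample point
        rows.append(row)
F = np.vstack(rows)                       # linear forward map: data = F @ density
```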
Schematic of Bayesian inversion
Structural priors that facilitate, but do not force, jumps in the solution. The top row shows the prior standard deviations γ_j. In the left panel, all components are equal, γ_j = 0.1. In the middle, one component differs from the others by a factor of 100. On the right, three components differ from the baseline by factors of 200, 100, and 250, respectively. Bottom row: five independent realizations drawn from each structural prior. The standard deviations of the jumps are equal to the values γ_j shown in the upper row. Observe that when the standard deviations are larger than the background, some of the realizations, but not all, show a significant jump.
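A minimal sketch of one plausible construction of such a prior, assumed from the caption rather than taken from the article: the increments x_{j+1} − x_j are independent zero-mean Gaussians with standard deviations γ_j, so a realization is a cumulative sum of scaled white noise.

```python
import numpy as np

# A large gamma_j permits, but does not force, a jump at location j,
# because the increment there merely has a large variance.

rng = np.random.default_rng(2)
n = 100
gamma = np.full(n - 1, 0.1)              # baseline: all increment stds equal
gamma[40] = 10.0                         # one component 100x larger than baseline

# Five independent realizations, pinned to x_0 = 0
increments = gamma * rng.standard_normal((5, n - 1))
draws = np.hstack([np.zeros((5, 1)), np.cumsum(increments, axis=1)])
```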
Eight random draws from the Whittle–Matérn prior over a unit square discretized into 100 × 100 pixels. The parameters are set to γ = β = 1 in all draws, while the correlation length, which controls the size of the inhomogeneities, is λ = 0.05 in the top row and λ = 0.5 in the bottom row. The Laplacian is discretized using a finite difference approximation, and the boundary condition is the homogeneous Dirichlet condition.
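A hedged sketch of producing one such draw, using a common parametrization of the Whittle–Matérn family (the article's exact scaling convention may differ): for β = 1, a sample solves (I − λ²Δ)x = γw with white noise w, the Laplacian Δ discretized by finite differences with a homogeneous Dirichlet boundary condition, as in the figure.

```python
import numpy as np
from scipy.sparse import diags, identity, kron
from scipy.sparse.linalg import spsolve

rng = np.random.default_rng(3)
n, lam, gamma = 100, 0.05, 1.0           # grid size, correlation length, scale
h = 1.0 / (n + 1)                        # grid spacing on the unit square

# 1D Dirichlet Laplacian and its 2D Kronecker-sum extension
D2 = diags([1.0, -2.0, 1.0], [-1, 0, 1], shape=(n, n)) / h**2
Lap = kron(identity(n), D2) + kron(D2, identity(n))

# For beta = 1 the operator below is the square root of the prior precision
# (up to the scale gamma); applying its inverse to white noise gives a draw.
P = identity(n * n) - lam**2 * Lap
x = spsolve(P.tocsc(), gamma * rng.standard_normal(n * n)).reshape(n, n)
```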
Schematic of the regularization procedure. The column on the left outlines the steps of Tikhonov regularization, while the column on the right follows the procedure for iterative methods.
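For the iterative column, a standard construction is regularization by early stopping: truncating a slowly converging iteration before it begins to fit the noise. The sketch below uses Landweber iteration with Morozov's discrepancy principle as the stopping rule; this is an illustrative instance, not necessarily the scheme depicted in the figure.

```python
import numpy as np

# Landweber iteration x <- x + omega * A^T (y - A x) converges to a
# least-squares solution; stopping it early acts as regularization.

def landweber(A, y, sigma, max_iter=10_000, tau=1.1):
    """Landweber iteration stopped when the residual matches the noise level."""
    m, n = A.shape
    omega = 1.0 / np.linalg.norm(A, 2) ** 2   # step size, guarantees convergence
    x = np.zeros(n)
    target = tau * sigma * np.sqrt(m)         # expected noise norm, with margin
    for _ in range(max_iter):
        r = y - A @ x
        if np.linalg.norm(r) <= target:       # stop before fitting the noise
            break
        x += omega * (A.T @ r)
    return x
```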
Effect of the value of the hyperparameter η on the focality of the solution. The figure shows brain activity estimates from simulated magnetoencephalography (MEG) data using different values of the hyperparameter η, which controls the focality of the maximum a posteriori (MAP) estimate. The results shown in the center row were obtained with η = 0.1 (low focality) and those in the bottom row with η = 0.005 (high focality). The location of the simulated activity is indicated by the yellow circle. The red lines in the top row show the locations of the brain slices displayed in the corresponding columns. For details, refer to the article by Calvetti et al.
