Skip to content

Commit

Permalink
Merge pull request #24874 from zachmprince/stm_poly_chaos
Browse files Browse the repository at this point in the history
Using least squares for polynomial chaos
  • Loading branch information
zachmprince authored Jul 10, 2023
2 parents a8316c2 + a931e26 commit abeacbc
Show file tree
Hide file tree
Showing 11 changed files with 963 additions and 26 deletions.
Original file line number Diff line number Diff line change
Expand Up @@ -6,7 +6,7 @@

Polynomial chaos is a surrogate modeling technique where a quantity of interest (QoI) that is dependent on input parameters is expanded as a sum of orthogonal polynomials. Given a QoI $Q$ dependent on a set of parameters $\vec{\xi}$, the polynomial chaos expansion (PCE) is:

!equation
!equation id=eq:expansion
Q(\vec{\xi}) = \sum_{i=1}^{P}q_i\Phi_i(\vec{\xi}) ,


Expand Down Expand Up @@ -42,20 +42,47 @@ The weighting functions are defined by the probability density function of the p
| Exponential | $e^{-\xi}$ | Laguerre | $[0,\infty]$ |
| Gamma | $\frac{\xi^{\alpha}e^{-\xi}}{\Gamma(\alpha+1)}$ | Generalized Laguerre | $[0,\infty]$ |

The expression in [eq:coeff] can be integrated using many different techniques. One is performing a Monte Carlo integration,
## Computing Coefficients

!equation
q_i = \frac{1}{\left\langle\Phi_i(\vec{\xi}),\Phi_i(\vec{\xi})\right\rangle} \frac{1}{N_{\mathrm{mc}}}\sum_{n=1}^{N_{\mathrm{mc}}} Q(\vec{\xi}_n)\Phi_i(\vec{\xi}_n) ,
There are currently two techniques to compute the coefficients in [!eqref](eq:expansion), the first performs the integration described by [!eqref](eq:coeff). For Monte Carlo sampling, the integration is in the form

!equation
q_i = \frac{1}{\left\langle\Phi_i(\vec{\xi}),\Phi_i(\vec{\xi})\right\rangle} \frac{1}{N_{\mathrm{mc}}}\sum_{n=1}^{N_{\mathrm{mc}}} Q(\vec{\xi}_n)\Phi_i(\vec{\xi}_n) .

or using numerical quadrature,
And for using a [QuadratureSampler.md]

!equation
q_i = \frac{1}{\left\langle\Phi_i(\vec{\xi}),\Phi_i(\vec{\xi})\right\rangle}\sum_{n=1}^{N_q} w_n Q(\vec{\xi}_n)\Phi_i(\vec{\xi}_n) .


The numerical quadrature method is typically much more efficient that than the Monte Carlo method and has the added benefit of exactly integrating the polynomial basis. However, the quadrature suffers from the curse of dimensionality. The naive approach uses a Cartesian product of one-dimensional quadratures, which results in $(\max(k^d_i) + 1)^D$ quadrature points to be sampled. Sparse grids can help mitigate the curse of dimensionality significantly.

The other technique is using ordinary least-squares (OLS) regression, like in [PolynomialRegressionTrainer.md]. In the majority of cases, OLS is more accurate than integration for Monte Carlo sampling, as shown in the figure below.

!plot scatter caption=Comparison between OLS regression and integration methods for computing polynomial chaos coefficients
data=[{'name': 'OLS',
'type': 'scatter',
'x': [16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192],
'y': [81.6696267803207, 8.454300942177698, 4.926398159556231,
2.874495194734781, 2.2416013478899863, 1.3479621256735075,
1.1574930187341494, 0.6939367594300845, 0.5817680465237121,
0.417989501880757]},
{'name': 'OLS with Regularization (penalty = 1)',
'type': 'scatter',
'x': [16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192],
'y': [4.91600612112163, 3.94911986807379, 3.4242180431008484,
2.7094356794174708, 1.6871974555746947, 1.3923862418279123,
1.1994658425158664, 1.0696251119444582, 0.9079419204733286,
0.8692326803241464]},
{'name': 'Integration',
'type': 'scatter',
'x': [16, 32, 64, 128, 256, 512, 1024, 2048, 4096, 8192],
'y': [80.62685305548119, 57.810722889054325, 45.68405898596861,
35.58856802819979, 19.925532186155436, 15.879940134036879,
11.862525598455559, 8.760052389181865, 4.967900255917786,
3.9340295519499127]}]
layout={'xaxis': {'title': 'Number of Samples', 'type': 'log'},
'yaxis': {'title': 'Coefficient RRMSE', 'type': 'log'}}

## Generating a Tuple

In polynomial chaos, a tuple describes the combination of polynomial orders representing the expansion basis ($k_{d,i}$). Again, the naive approach would be to do a tensor product of highest polynomial order, but this is often wasteful since generating a complete monomial basis is usually optimal. Below demonstrates the difference between a tensor basis and a complete monomial basis:
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,7 @@
type = PolynomialChaosTrainer
execute_on = timestep_end
order = 10
regression_type = integration
distributions = 'k_dist q_dist L_dist Tinf_dist'
sampler = sample
response = results/data:avg:value
Expand All @@ -79,6 +80,7 @@
type = PolynomialChaosTrainer
execute_on = timestep_end
order = 10
regression_type = integration
distributions = 'k_dist q_dist L_dist Tinf_dist'
sampler = sample
response = results/data:max:value
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -71,6 +71,7 @@
type = PolynomialChaosTrainer
execute_on = timestep_end
order = 10
regression_type = integration
distributions = 'k_dist q_dist L_dist Tinf_dist'
sampler = sample
response = results/data:avg:value
Expand All @@ -79,6 +80,7 @@
type = PolynomialChaosTrainer
execute_on = timestep_end
order = 10
regression_type = integration
distributions = 'k_dist q_dist L_dist Tinf_dist'
sampler = sample
response = results/data:max:value
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -13,9 +13,12 @@
#include "PolynomialQuadrature.h"
#include "QuadratureSampler.h"
#include "MultiDimPolynomialGenerator.h"
#include "Calculators.h"

#include "Distribution.h"

typedef StochasticTools::Calculator<std::vector<Real>, Real> RealCalculator;

class PolynomialChaosTrainer : public SurrogateTrainer
{
public:
Expand Down Expand Up @@ -47,6 +50,22 @@ class PolynomialChaosTrainer : public SurrogateTrainer
/// The distributions used for sampling
std::vector<std::unique_ptr<const PolynomialQuadrature::Polynomial>> & _poly;

/// The method in which to perform the regression (0=integration, 1=OLS)
unsigned int _rtype;

/// The penalty parameter for Ridge regularization
const Real & _ridge_penalty;

/// QuadratureSampler pointer, necessary for applying quadrature weights
QuadratureSampler * _quad_sampler;

/// Calculators used for standardization in linear regression
std::vector<std::unique_ptr<RealCalculator>> _calculators;
Real _r_sum;

///@{
/// Matrix and rhs for the regression problem
DenseMatrix<Real> _matrix;
DenseVector<Real> _rhs;
///@}
};
147 changes: 127 additions & 20 deletions modules/stochastic_tools/src/surrogates/PolynomialChaosTrainer.C
Original file line number Diff line number Diff line change
Expand Up @@ -21,6 +21,12 @@ PolynomialChaosTrainer::validParams()
params.addRequiredParam<unsigned int>("order", "Maximum polynomial order.");
params.addRequiredParam<std::vector<DistributionName>>(
"distributions", "Names of the distributions samples were taken from.");
MooseEnum rtype("integration ols auto", "auto");
params.addParam<MooseEnum>(
"regression_type",
rtype,
"The type of regression to perform for finding polynomial coefficents.");
params.addParam<Real>("penalty", 0.0, "Ridge regularization penalty factor for OLS regression.");

params.suppressParameter<MooseEnum>("response_type");
return params;
Expand All @@ -37,60 +43,161 @@ PolynomialChaosTrainer::PolynomialChaosTrainer(const InputParameters & parameter
_coeff(declareModelData<std::vector<Real>>("_coeff")),
_poly(declareModelData<std::vector<std::unique_ptr<const PolynomialQuadrature::Polynomial>>>(
"_poly")),
_ridge_penalty(getParam<Real>("penalty")),
_quad_sampler(dynamic_cast<QuadratureSampler *>(&_sampler))
{
// Check if number of distributions is correct
if (_ndim != _sampler.getNumberOfCols())
paramError("distributions",
"Sampler number of columns does not match number of inputted distributions.");

if (_quad_sampler && (!_pvals.empty() || _pcols.size() != _sampler.getNumberOfCols()))
auto rtype_enum = getParam<MooseEnum>("regression_type");
if (rtype_enum == "auto")
_rtype = _quad_sampler ? 0 : 1;
else
_rtype = rtype_enum == "integration" ? 0 : 1;

if (_rtype == 0 && _quad_sampler &&
(!_pvals.empty() || _pcols.size() != _sampler.getNumberOfCols()))
paramError("sampler",
"QuadratureSampler must use all Sampler columns for training, and cannot be"
" used with other Reporters - otherwise, quadrature integration does not work.");
if (_rtype == 0 && _ridge_penalty != 0.0)
paramError("penalty",
"Ridge regularization penalty is only relevant if 'regression_type = ols'.");

// Make polynomials
for (const auto & nm : getParam<std::vector<DistributionName>>("distributions"))
_poly.push_back(PolynomialQuadrature::makePolynomial(&getDistributionByName(nm)));

// Create calculators for standardization
if (_rtype == 1)
{
_calculators.resize(_ncoeff * 3);
for (const auto & term : make_range(_ncoeff))
{
_calculators[3 * term] = StochasticTools::makeCalculator(MooseEnumItem("mean"), *this);
_calculators[3 * term + 1] = StochasticTools::makeCalculator(MooseEnumItem("stddev"), *this);
_calculators[3 * term + 2] = StochasticTools::makeCalculator(MooseEnumItem("sum"), *this);
}
}
}

void
PolynomialChaosTrainer::preTrain()
{
_coeff.clear();
_coeff.resize(_ncoeff, 0.0);
_coeff.assign(_ncoeff, 0.0);

if (_rtype == 1)
{
if (getCurrentSampleSize() < _ncoeff)
paramError("order",
"Number of data points (",
getCurrentSampleSize(),
") must be greater than the number of terms in the polynomial (",
_ncoeff,
").");
_matrix.resize(_ncoeff, _ncoeff);
_rhs.resize(_ncoeff);
for (auto & calc : _calculators)
calc->initializeCalculator();
_r_sum = 0.0;
}
}

void
PolynomialChaosTrainer::train()
{
DenseMatrix<Real> poly_val(_ndim, _order);

// Evaluate polynomials to avoid duplication
DenseMatrix<Real> poly_val(_ndim, _order);
for (unsigned int d = 0; d < _ndim; ++d)
for (unsigned int i = 0; i < _order; ++i)
poly_val(d, i) = _poly[d]->compute(i, _predictor_row[d]);
poly_val(d, i) = _poly[d]->compute(i, _predictor_row[d], /*normalize=*/_rtype == 0);

// Evaluate multi-dimensional polynomials
std::vector<Real> basis(_ncoeff, 1.0);
for (const auto i : make_range(_ncoeff))
for (const auto d : make_range(_ndim))
basis[i] *= poly_val(d, _tuple[i][d]);

// Loop over coefficients
for (std::size_t i = 0; i < _ncoeff; ++i)
// For integration
if (_rtype == 0)
{
Real val = *_rval;
// Loop over parameters
for (std::size_t d = 0; d < _ndim; ++d)
val *= poly_val(d, _tuple[i][d]);

if (_quad_sampler)
val *= _quad_sampler->getQuadratureWeight(_row);
_coeff[i] += val;
const Real fact = (*_rval) * (_quad_sampler ? _quad_sampler->getQuadratureWeight(_row) : 1.0);
for (const auto i : make_range(_ncoeff))
_coeff[i] += fact * basis[i];
}
// For least-squares
else
{
// Loop over coefficients
for (const auto i : make_range(_ncoeff))
{
// Matrix is symmetric, so we'll add the upper diagonal later
for (const auto j : make_range(i + 1))
_matrix(i, j) += basis[i] * basis[j];
_rhs(i) += basis[i] * (*_rval);

for (unsigned int c = i * 3; c < (i + 1) * 3; ++c)
_calculators[c]->updateCalculator(basis[i]);
}
_r_sum += (*_rval);

if (_ridge_penalty != 0.0)
for (const auto i : make_range(_ncoeff))
_matrix(i, i) += _ridge_penalty;
}
}

void
PolynomialChaosTrainer::postTrain()
{
gatherSum(_coeff);
if (_rtype == 0)
{
gatherSum(_coeff);
if (!_quad_sampler)
for (std::size_t i = 0; i < _ncoeff; ++i)
_coeff[i] /= getCurrentSampleSize();
}
else
{
gatherSum(_matrix.get_values());
gatherSum(_rhs.get_values());
for (auto & calc : _calculators)
calc->finalizeCalculator(true);
gatherSum(_r_sum);

std::vector<Real> mu(_ncoeff);
std::vector<Real> sig(_ncoeff);
std::vector<Real> sum_pf(_ncoeff);
for (const auto i : make_range(_ncoeff))
{
mu[i] = i > 0 ? _calculators[3 * i]->getValue() : 0.0;
sig[i] = i > 0 ? _calculators[3 * i + 1]->getValue() : 1.0;
sum_pf[i] = _calculators[3 * i + 2]->getValue();
}

const Real n = getCurrentSampleSize();
for (const auto i : make_range(_ncoeff))
{
for (const auto j : make_range(i + 1))
{
_matrix(i, j) -= (mu[j] * sum_pf[i] + mu[i] * sum_pf[j]);
_matrix(i, j) += n * mu[i] * mu[j];
_matrix(i, j) /= (sig[i] * sig[j]);
_matrix(j, i) = _matrix(i, j);
}
_rhs(i) = (_rhs(i) - mu[i] * _r_sum) / sig[i];
}

if (!_quad_sampler)
for (std::size_t i = 0; i < _ncoeff; ++i)
_coeff[i] /= getCurrentSampleSize();
DenseVector<Real> sol;
_matrix.lu_solve(_rhs, sol);
_coeff = sol.get_values();

for (unsigned int i = 1; i < _ncoeff; ++i)
{
_coeff[i] /= sig[i];
_coeff[0] -= _coeff[i] * mu[i];
}
}
}
Loading

0 comments on commit abeacbc

Please sign in to comment.