| Title: | Active Recovery of a Constrained and Hidden Set by Stepwise Uncertainty Reduction Strategy |
|---|---|
| Description: | Stepwise Uncertainty Reduction criterion and algorithm for sequentially learning a Gaussian Process Classifier as described in Menz et al. (2025). |
| Authors: | Morgane Menz [aut, cre], Miguel Munoz-Zuniga [aut], Delphine Sinoquet [aut], Naoual Serraji [ctb] |
| Maintainer: | Morgane Menz <[email protected]> |
| License: | GPL-3 |
| Version: | 0.0.1 |
| Built: | 2026-05-25 08:19:32 UTC |
| Source: | https://github.com/cran/ARCHISSUR |
archissur adaptively enriches the Gaussian Process Classifier (GPC) using a learning criterion to achieve a precise approximation of the feasible area contour. This is done by iteratively adding the best learning point that minimizes future uncertainty over the feasible domain, following the Stepwise Uncertainty Reduction strategy (SUR).
archissur( design.init = NULL, cst.init = NULL, model = NULL, cst_function, lower = NULL, upper = NULL, n.ite = 10, seed = NULL, nb.integration = NULL, plot_2D_pn = FALSE, batchsize = 1, n_update = 1, gpc.options = NULL, optimcontrol = NULL, verbose = 1 )archissur( design.init = NULL, cst.init = NULL, model = NULL, cst_function, lower = NULL, upper = NULL, n.ite = 10, seed = NULL, nb.integration = NULL, plot_2D_pn = FALSE, batchsize = 1, n_update = 1, gpc.options = NULL, optimcontrol = NULL, verbose = 1 )
design.init |
optional matrix representing the initial design of experiments (DoE). If not provided, you must provide a |
cst.init |
optional vector of binary observations {0,1} corresponding to the initial class labels. If not provided, it will be calculated using |
model |
optional object of type |
cst_function |
constraint function with binary outputs {0,1} to be learn. |
lower |
inputs lower bound of |
upper |
inputs upper bound of |
n.ite |
number of iterations of |
seed |
to fix the seed. |
nb.integration |
number of integration points. Default is |
plot_2D_pn |
if |
batchsize |
number of points to be learned at each iteration. Default is 1. |
n_update |
number of iterations between hyperparameter updates by likelihood maximization. Default is 1. |
gpc.options |
list with GPC model options: covariance kernel, noise variance, number of initial points for MLE optimization, standardization of inputs, constrained latent GP mean sign.If |
optimcontrol |
optional list of control parameters for enrichment criterion optimization. Default is |
verbose |
Level of verbosity for printing information during iterations. 0: No printing. 1: Print iteration number and best point found. 2: Print iteration number, best point found, criterion value, and model hyperparameters. Default is 1. |
If the field "method" is set to "genoud", one can set some parameters of this algorithm:
pop.size (default: 50d), max.generations (10d), wait.generations (2), BFGSburnin (2) and the mutations P1, P2, up to P9 (see genoud). Numbers into brackets are the default values.
If the field method is set to "discrete", one can set the field optim.points: p * d matrix corresponding to the p points where the criterion will be evaluated. If nothing is specified, 100*d points are chosen randomly.
Finally, one can control the field optim.option in order to decide how to optimize the sampling criterion.
If optim.option is set to 2 (default), batchsize sequential optimizations in dimension d are performed to find the optimum.
If optim.option is set to 1, only one optimization in dimension batchsize*d is performed. This option is only available with "genoud". This option might provide more global and accurate solutions, but is a lot more expensive.
A list containing:
Xf |
DoE |
f |
binary observations corresponding to the class labels. |
alpha |
a scalar representing the Vorob'ev threshold. |
vorob_expect |
Vorob’ev expectation. |
vorob_dev |
current Vorob’ev deviation. |
model |
an object of class |
An '.Rds' file model_gp.Rds containing an object of class gpcm corresponding to current GPC model.
The class 1 probability map. Plots are available in '2D_plots' directory.
Menz, M., Munoz-Zuniga, M., Sinoquet, D. Estimation of simulation failure set with active learning based on Gaussian Process classifiers and random set theory (2023). https://hal.science/hal-03848238.
Bachoc, F., Helbert, C. & Picheny, V. Gaussian process optimization with failures: classification and convergence proof. J Glob Optim 78, 483–506 (2020). doi:10.1007/s10898-020-00920-0.
#------------------------------------------------------------------- #------------------------- archissur ------------------------------- #------------------------------------------------------------------- # 20-points DoE, and the corresponding response d <- 2 nb_PX <- 20 x <- matrix(c(0.205293785978832, 0.0159983370750337, 0.684774733109666, 0.125251417595962, 0.787208786290006, 0.700475706055049, 0.480507717105934, 0.359730889653793, 0.543665267336735, 0.565974761807069, 0.303412043992361, 0.471502352650857, 0.839505250127309, 0.504914690245002, 0.573294917143728, 0.784444726564573, 0.291681289223421, 0.255053812451938, 0.87233450888786, 0.947168337730927, 0.648262257638515, 0.973264712407035, 0.421877310273815, 0.0686662506387988, 0.190976166753807, 0.810964668176754, 0.918527262507395, 0.161973686467513, 0.0188128700859558, 0.43522031347403, 0.99902788789426, 0.655561821513544, 0.741113863862512, 0.321050086076934, 0.112003007565305, 0.616551317575545, 0.383511473487687, 0.886611679106771, 0.0749211435982952, 0.205805968972305), byrow = TRUE, ncol = d) require(DiceKriging) cst_function <- function(z){ fx <- apply(z, 1, branin) f <- ifelse(fx < 14, 0, 1) return(f)} ## constraint function s <- cst_function(x) # archissur parameters design.init <- x cst.init <- s n.ite <- 2 n_update <- 5 lower <- rep(0,d) upper <- rep(1,d) ### GPC model options gpc.options <- list() gpc.options$noise.var <- 1e-6 gpc.options$multistart <- 1 res <- archissur(design.init = design.init, cst.init = cst.init, cst_function = cst_function, lower = lower, upper = upper, n.ite = n.ite, n_update = n_update, gpc.options = gpc.options) unlink('model_gp.Rds')#------------------------------------------------------------------- #------------------------- archissur ------------------------------- #------------------------------------------------------------------- # 20-points DoE, and the corresponding response d <- 2 nb_PX <- 20 x <- matrix(c(0.205293785978832, 0.0159983370750337, 0.684774733109666, 0.125251417595962, 0.787208786290006, 0.700475706055049, 0.480507717105934, 0.359730889653793, 0.543665267336735, 0.565974761807069, 0.303412043992361, 0.471502352650857, 0.839505250127309, 0.504914690245002, 0.573294917143728, 0.784444726564573, 0.291681289223421, 0.255053812451938, 0.87233450888786, 0.947168337730927, 0.648262257638515, 0.973264712407035, 0.421877310273815, 0.0686662506387988, 0.190976166753807, 0.810964668176754, 0.918527262507395, 0.161973686467513, 0.0188128700859558, 0.43522031347403, 0.99902788789426, 0.655561821513544, 0.741113863862512, 0.321050086076934, 0.112003007565305, 0.616551317575545, 0.383511473487687, 0.886611679106771, 0.0749211435982952, 0.205805968972305), byrow = TRUE, ncol = d) require(DiceKriging) cst_function <- function(z){ fx <- apply(z, 1, branin) f <- ifelse(fx < 14, 0, 1) return(f)} ## constraint function s <- cst_function(x) # archissur parameters design.init <- x cst.init <- s n.ite <- 2 n_update <- 5 lower <- rep(0,d) upper <- rep(1,d) ### GPC model options gpc.options <- list() gpc.options$noise.var <- 1e-6 gpc.options$multistart <- 1 res <- archissur(design.init = design.init, cst.init = cst.init, cst_function = cst_function, lower = lower, upper = upper, n.ite = n.ite, n_update = n_update, gpc.options = gpc.options) unlink('model_gp.Rds')
Computes conditional covariance matrix between some new points and many integration points.
computeQuickgpccov(object, integration.points, X.new, precalc.data, c.newdata)computeQuickgpccov(object, integration.points, X.new, precalc.data, c.newdata)
object |
an object of class |
integration.points |
|
X.new |
|
precalc.data |
list containing precalculated data. This list can be generated using the |
c.newdata |
the (unconditional) covariance between |
Conditional covariance matrix between integration.points and X.new.
Morgane MENZ, Delphine SINOQUET, Miguel MUNOZ-ZUNIGA. Contributors: Naoual SERRAJI.
Menz, M., Munoz-Zuniga, M., Sinoquet, D. Estimation of simulation failure set with active learning based on Gaussian Process classifiers and random set theory (2023). https://hal.science/hal-03848238.
Bachoc, F., Helbert, C. & Picheny, V. Gaussian process optimization with failures: classification and convergence proof. J Glob Optim 78, 483–506 (2020). doi:10.1007/s10898-020-00920-0.
#------------------------------------------------------------------- #------------------- computeQuickgpccov ---------------------------- #------------------------------------------------------------------- ## 20-points DoE, and the corresponding response d <- 2 nb_PX <- 20 x <- matrix(c(0.205293785978832, 0.0159983370750337, 0.684774733109666, 0.125251417595962, 0.787208786290006, 0.700475706055049, 0.480507717105934, 0.359730889653793, 0.543665267336735, 0.565974761807069, 0.303412043992361, 0.471502352650857, 0.839505250127309, 0.504914690245002, 0.573294917143728, 0.784444726564573, 0.291681289223421, 0.255053812451938, 0.87233450888786, 0.947168337730927, 0.648262257638515, 0.973264712407035, 0.421877310273815, 0.0686662506387988, 0.190976166753807, 0.810964668176754, 0.918527262507395, 0.161973686467513, 0.0188128700859558, 0.43522031347403, 0.99902788789426, 0.655561821513544, 0.741113863862512, 0.321050086076934, 0.112003007565305, 0.616551317575545, 0.383511473487687, 0.886611679106771, 0.0749211435982952, 0.205805968972305), byrow = TRUE, ncol = d) require(DiceKriging) fx <- apply(x, 1, branin) f <- ifelse(fx < 14, -1, 1) Xf <- as.matrix(x) require(future) # load future package for parallelization while building gpcm model plan(multisession) # activate parallel calculations (with available cores automatic detection) ## gpcm object require(GPCsign) model <- gpcm(f, Xf, coef.m = -1.25, coef.cov = c(1.17,0.89)) ## prediction at X.new X.new <- matrix(c(0.1,0.2),ncol=2,byrow=TRUE) gpc <- predict(object = model, newdata = X.new) c.newdata <- gpc$c ## prediction at integration points require(randtoolbox) nb.integration <- d * 100 integration.points <- sobol(n = nb.integration, dim = d, scrambling = 0) gpc_int <- predict(object = model, newdata = integration.points) precalc.data <- list(c.K = crossprod(gpc_int$c , model@invK)) c.newdata <- gpc$c integration.points <- scale(x = integration.points, center = [email protected], scale = [email protected]) kn <- computeQuickgpccov(object = model, integration.points = integration.points, X.new = X.new, precalc.data = precalc.data, c.newdata = c.newdata ) plan(sequential) # deactivate parallel calculations: back to sequential mode#------------------------------------------------------------------- #------------------- computeQuickgpccov ---------------------------- #------------------------------------------------------------------- ## 20-points DoE, and the corresponding response d <- 2 nb_PX <- 20 x <- matrix(c(0.205293785978832, 0.0159983370750337, 0.684774733109666, 0.125251417595962, 0.787208786290006, 0.700475706055049, 0.480507717105934, 0.359730889653793, 0.543665267336735, 0.565974761807069, 0.303412043992361, 0.471502352650857, 0.839505250127309, 0.504914690245002, 0.573294917143728, 0.784444726564573, 0.291681289223421, 0.255053812451938, 0.87233450888786, 0.947168337730927, 0.648262257638515, 0.973264712407035, 0.421877310273815, 0.0686662506387988, 0.190976166753807, 0.810964668176754, 0.918527262507395, 0.161973686467513, 0.0188128700859558, 0.43522031347403, 0.99902788789426, 0.655561821513544, 0.741113863862512, 0.321050086076934, 0.112003007565305, 0.616551317575545, 0.383511473487687, 0.886611679106771, 0.0749211435982952, 0.205805968972305), byrow = TRUE, ncol = d) require(DiceKriging) fx <- apply(x, 1, branin) f <- ifelse(fx < 14, -1, 1) Xf <- as.matrix(x) require(future) # load future package for parallelization while building gpcm model plan(multisession) # activate parallel calculations (with available cores automatic detection) ## gpcm object require(GPCsign) model <- gpcm(f, Xf, coef.m = -1.25, coef.cov = c(1.17,0.89)) ## prediction at X.new X.new <- matrix(c(0.1,0.2),ncol=2,byrow=TRUE) gpc <- predict(object = model, newdata = X.new) c.newdata <- gpc$c ## prediction at integration points require(randtoolbox) nb.integration <- d * 100 integration.points <- sobol(n = nb.integration, dim = d, scrambling = 0) gpc_int <- predict(object = model, newdata = integration.points) precalc.data <- list(c.K = crossprod(gpc_int$c , model@invK)) c.newdata <- gpc$c integration.points <- scale(x = integration.points, center = model@X.mean, scale = model@X.std) kn <- computeQuickgpccov(object = model, integration.points = integration.points, X.new = X.new, precalc.data = precalc.data, c.newdata = c.newdata ) plan(sequential) # deactivate parallel calculations: back to sequential mode
computeVorobTerms compute the future uncertainty.
computeVorobTerms( i, object, intpoints.oldmean, krig2, sk.new, alpha, gpc, X.new, seed = NULL )computeVorobTerms( i, object, intpoints.oldmean, krig2, sk.new, alpha, gpc, X.new, seed = NULL )
i |
the index for which conditional probabilities are computed. It ranges from 1 to |
object |
an object of class |
intpoints.oldmean |
vectors of size |
krig2 |
a list containing the output of the function |
sk.new |
conditional standard deviations vector at the points |
alpha |
a scalar representing the Vorob'ev threshold. |
gpc |
a list containing the output of the predict function |
X.new |
input vector of size |
seed |
to fix the seed. |
a list with:
term2 |
Vector equal to |
term1 |
Vector equal to the product of |
term3 |
Vector equal to the product of |
Morgane MENZ, Delphine SINOQUET, Miguel MUNOZ-ZUNIGA. Contributors: Naoual SERRAJI.
Menz, M., Munoz-Zuniga, M., Sinoquet, D. Estimation of simulation failure set with active learning based on Gaussian Process classifiers and random set theory (2023). https://hal.science/hal-03848238.
#------------------------------------------------------------------- #--------------------------- computeVorobTerms ----------------------------- #------------------------------------------------------------------- ## 20-points DoE, and the corresponding response d <- 2 nb_PX <- 20 x <- matrix(c(0.205293785978832, 0.0159983370750337, 0.684774733109666, 0.125251417595962, 0.787208786290006, 0.700475706055049, 0.480507717105934, 0.359730889653793, 0.543665267336735, 0.565974761807069, 0.303412043992361, 0.471502352650857, 0.839505250127309, 0.504914690245002, 0.573294917143728, 0.784444726564573, 0.291681289223421, 0.255053812451938, 0.87233450888786, 0.947168337730927, 0.648262257638515, 0.973264712407035, 0.421877310273815, 0.0686662506387988, 0.190976166753807, 0.810964668176754, 0.918527262507395, 0.161973686467513, 0.0188128700859558, 0.43522031347403, 0.99902788789426, 0.655561821513544, 0.741113863862512, 0.321050086076934, 0.112003007565305, 0.616551317575545, 0.383511473487687, 0.886611679106771, 0.0749211435982952, 0.205805968972305), byrow = TRUE, ncol = d) require(DiceKriging) fx <- apply(x, 1, branin) f <- ifelse(fx < 14, -1, 1) Xf <- as.matrix(x) require(future) # load future package for parallelization plan(multisession) # activate parallel calculations (with available cores automatic detection) ## gpcm object require(GPCsign) model <- gpcm(f, Xf, coef.m = -1.25, coef.cov = c(1.17,0.89)) ## n.grid <- 20 x.grid <- seq(0,1,length=n.grid) newdata <- expand.grid(x.grid,x.grid) newdata <- as.matrix(newdata) pred1 <- predict(object=model,newdata=newdata) precalc.data <- list() precalc.data$c.K <- crossprod(pred1$c, model@invK) newdata.oldsd <- sqrt(pred1$Zsimu_var) # new points added new.x <- matrix(c(0.1,0.2),ncol=2,byrow=TRUE) # predicion at new points pred2 <- predict(object=model,newdata=new.x) Sigma.r <- pred2$cov newdata <- scale(x = newdata, center = [email protected], scale = [email protected]) new.x <- scale(x = new.x, center = [email protected], scale = [email protected]) kn <- computeQuickgpccov(object = model, integration.points = newdata, X.new = new.x, precalc.data = precalc.data, c.newdata = pred2$c) updated.predictions <- predict_update_gpc_parallel(Sigma.r = Sigma.r, newdata.oldsd = newdata.oldsd, kn = kn) # parameters for comp_term function i <- 1 intpoints.oldmean <- pred1$Zsimu_mean sk.new <- updated.predictions$sd pn <- pred1$prob alpha <- KrigInv::vorob_threshold(pn) result <- computeVorobTerms(i = i, object = model, intpoints.oldmean = intpoints.oldmean, krig2 = updated.predictions, sk.new = sk.new, alpha = alpha, gpc = pred2, X.new = new.x) plan(sequential) # deactivate parallel calculations: back to sequential mode#------------------------------------------------------------------- #--------------------------- computeVorobTerms ----------------------------- #------------------------------------------------------------------- ## 20-points DoE, and the corresponding response d <- 2 nb_PX <- 20 x <- matrix(c(0.205293785978832, 0.0159983370750337, 0.684774733109666, 0.125251417595962, 0.787208786290006, 0.700475706055049, 0.480507717105934, 0.359730889653793, 0.543665267336735, 0.565974761807069, 0.303412043992361, 0.471502352650857, 0.839505250127309, 0.504914690245002, 0.573294917143728, 0.784444726564573, 0.291681289223421, 0.255053812451938, 0.87233450888786, 0.947168337730927, 0.648262257638515, 0.973264712407035, 0.421877310273815, 0.0686662506387988, 0.190976166753807, 0.810964668176754, 0.918527262507395, 0.161973686467513, 0.0188128700859558, 0.43522031347403, 0.99902788789426, 0.655561821513544, 0.741113863862512, 0.321050086076934, 0.112003007565305, 0.616551317575545, 0.383511473487687, 0.886611679106771, 0.0749211435982952, 0.205805968972305), byrow = TRUE, ncol = d) require(DiceKriging) fx <- apply(x, 1, branin) f <- ifelse(fx < 14, -1, 1) Xf <- as.matrix(x) require(future) # load future package for parallelization plan(multisession) # activate parallel calculations (with available cores automatic detection) ## gpcm object require(GPCsign) model <- gpcm(f, Xf, coef.m = -1.25, coef.cov = c(1.17,0.89)) ## n.grid <- 20 x.grid <- seq(0,1,length=n.grid) newdata <- expand.grid(x.grid,x.grid) newdata <- as.matrix(newdata) pred1 <- predict(object=model,newdata=newdata) precalc.data <- list() precalc.data$c.K <- crossprod(pred1$c, model@invK) newdata.oldsd <- sqrt(pred1$Zsimu_var) # new points added new.x <- matrix(c(0.1,0.2),ncol=2,byrow=TRUE) # predicion at new points pred2 <- predict(object=model,newdata=new.x) Sigma.r <- pred2$cov newdata <- scale(x = newdata, center = model@X.mean, scale = model@X.std) new.x <- scale(x = new.x, center = model@X.mean, scale = model@X.std) kn <- computeQuickgpccov(object = model, integration.points = newdata, X.new = new.x, precalc.data = precalc.data, c.newdata = pred2$c) updated.predictions <- predict_update_gpc_parallel(Sigma.r = Sigma.r, newdata.oldsd = newdata.oldsd, kn = kn) # parameters for comp_term function i <- 1 intpoints.oldmean <- pred1$Zsimu_mean sk.new <- updated.predictions$sd pn <- pred1$prob alpha <- KrigInv::vorob_threshold(pn) result <- computeVorobTerms(i = i, object = model, intpoints.oldmean = intpoints.oldmean, krig2 = updated.predictions, sk.new = sk.new, alpha = alpha, gpc = pred2, X.new = new.x) plan(sequential) # deactivate parallel calculations: back to sequential mode
Minimization of the Vorob'ev criterion for a batch of candidate sampling points.
max_vorob_parallel_gpc( lower, upper, optimcontrol = NULL, batchsize, integration.param, object, new.noise.var = 0, seed = NULL )max_vorob_parallel_gpc( lower, upper, optimcontrol = NULL, batchsize, integration.param, object, new.noise.var = 0, seed = NULL )
lower |
vector containing the lower bounds of the design space. |
upper |
vector containing the upper bounds of the design space. |
optimcontrol |
optional list of control parameters for the optimization of the sampling criterion. The field " |
batchsize |
number of points to sample simultaneously. The sampling criterion will return batchsize points at a time for sampling. |
integration.param |
optional list of control parameter for the computation of integrals, containing the fields |
object |
an object of class |
new.noise.var |
optional scalar value of the noise variance of the new observations. Default is 0. |
seed |
to fix the seed. |
If the field method is set to "genoud", one can set some parameters of this algorithm:
pop.size (default: 50d), max.generations (10d), wait.generations (2), BFGSburnin (2) and the mutations P1, P2, up to P9 (see genoud). Numbers into brackets are the default values.
If the field method is set to "discrete", one can set the field optim.points: p * d matrix corresponding to the p points where the criterion will be evaluated. If nothing is specified, 100*d points are chosen randomly.
Finally, one can control the field optim.option in order to decide how to optimize the sampling criterion.
If optim.option is set to 2 (default), batchsize sequential optimizations in dimension d are performed to find the optimum.
If optim.option is set to 1, only one optimization in dimension batchsize*d is performed. This option is only available with "genoud". This option might provide more global and accurate solutions, but is a lot more expensive.
a list with components:
par |
the best set of parameters found. |
value |
the value of the Vorob'ev criterion at |
allvalues |
if an optimization on a discrete set of points is chosen, the value of the criterion at all these points. |
current.vorob |
current vorob’ev deviation. |
alpha |
the Vorob'ev thresold. |
Morgane MENZ, Delphine SINOQUET, Miguel MUNOZ-ZUNIGA. Contributors: Naoual SERRAJI.
Menz, M., Munoz-Zuniga, M., Sinoquet, D. Estimation of simulation failure set with active learning based on Gaussian Process classifiers and random set theory (2023). https://hal.science/hal-03848238.
Chevalier, C. Fast uncertainty reduction strategies relying on Gaussian process models PhD Thesis. University of Bern (2013).
Bachoc, F., Helbert, C. & Picheny, V. Gaussian process optimization with failures: classification and convergence proof. J Glob Optim 78, 483–506 (2020). doi:10.1007/s10898-020-00920-0.
#------------------------------------------------------------------- #------------------- max_vorob_optim_parallel_gpc------------------- #------------------------------------------------------------------- ## 20-points DoE, and the corresponding response d <- 2 nb_PX <- 20 x <- matrix(c(0.205293785978832, 0.0159983370750337, 0.684774733109666, 0.125251417595962, 0.787208786290006, 0.700475706055049, 0.480507717105934, 0.359730889653793, 0.543665267336735, 0.565974761807069, 0.303412043992361, 0.471502352650857, 0.839505250127309, 0.504914690245002, 0.573294917143728, 0.784444726564573, 0.291681289223421, 0.255053812451938, 0.87233450888786, 0.947168337730927, 0.648262257638515, 0.973264712407035, 0.421877310273815, 0.0686662506387988, 0.190976166753807, 0.810964668176754, 0.918527262507395, 0.161973686467513, 0.0188128700859558, 0.43522031347403, 0.99902788789426, 0.655561821513544, 0.741113863862512, 0.321050086076934, 0.112003007565305, 0.616551317575545, 0.383511473487687, 0.886611679106771, 0.0749211435982952, 0.205805968972305), byrow = TRUE, ncol = d) require(DiceKriging) fx <- apply(x, 1, branin) f <- ifelse(fx < 14, -1, 1) Xf <- as.matrix(x) require(future) # load future package for parallelization plan(multisession) # activate parallel calculations (with available cores automatic detection) ## gpcm object require(GPCsign) model <- gpcm(f, Xf, coef.m = -1.25, coef.cov = c(1.17,0.89)) # parameters for max_vorob_parallel_gpc function lower <- rep(0,d) upper <- rep(1,d) batchsize = 1 integration.param <- list() require(randtoolbox) nb.integration <- d*100 integration.points <- sobol(n = nb.integration, dim = d, scrambling = 0) integration.param$integration.points <- rep(upper-lower,each=nb.integration) * matrix(integration.points, nrow=nb.integration) + matrix(rep(lower,each=nb.integration), nrow=nb.integration) integration.param$integration.weights <- NULL integration.param$alpha <- NULL optimcontrol <- list() optimcontrol$method <- "multistart" crit <- max_vorob_parallel_gpc(lower = lower, upper = upper, batchsize = batchsize, integration.param = integration.param, object = model, optimcontrol = optimcontrol, seed=1) plan(sequential) # deactivate parallel calculations: back to sequential mode#------------------------------------------------------------------- #------------------- max_vorob_optim_parallel_gpc------------------- #------------------------------------------------------------------- ## 20-points DoE, and the corresponding response d <- 2 nb_PX <- 20 x <- matrix(c(0.205293785978832, 0.0159983370750337, 0.684774733109666, 0.125251417595962, 0.787208786290006, 0.700475706055049, 0.480507717105934, 0.359730889653793, 0.543665267336735, 0.565974761807069, 0.303412043992361, 0.471502352650857, 0.839505250127309, 0.504914690245002, 0.573294917143728, 0.784444726564573, 0.291681289223421, 0.255053812451938, 0.87233450888786, 0.947168337730927, 0.648262257638515, 0.973264712407035, 0.421877310273815, 0.0686662506387988, 0.190976166753807, 0.810964668176754, 0.918527262507395, 0.161973686467513, 0.0188128700859558, 0.43522031347403, 0.99902788789426, 0.655561821513544, 0.741113863862512, 0.321050086076934, 0.112003007565305, 0.616551317575545, 0.383511473487687, 0.886611679106771, 0.0749211435982952, 0.205805968972305), byrow = TRUE, ncol = d) require(DiceKriging) fx <- apply(x, 1, branin) f <- ifelse(fx < 14, -1, 1) Xf <- as.matrix(x) require(future) # load future package for parallelization plan(multisession) # activate parallel calculations (with available cores automatic detection) ## gpcm object require(GPCsign) model <- gpcm(f, Xf, coef.m = -1.25, coef.cov = c(1.17,0.89)) # parameters for max_vorob_parallel_gpc function lower <- rep(0,d) upper <- rep(1,d) batchsize = 1 integration.param <- list() require(randtoolbox) nb.integration <- d*100 integration.points <- sobol(n = nb.integration, dim = d, scrambling = 0) integration.param$integration.points <- rep(upper-lower,each=nb.integration) * matrix(integration.points, nrow=nb.integration) + matrix(rep(lower,each=nb.integration), nrow=nb.integration) integration.param$integration.weights <- NULL integration.param$alpha <- NULL optimcontrol <- list() optimcontrol$method <- "multistart" crit <- max_vorob_parallel_gpc(lower = lower, upper = upper, batchsize = batchsize, integration.param = integration.param, object = model, optimcontrol = optimcontrol, seed=1) plan(sequential) # deactivate parallel calculations: back to sequential mode
Useful precomputations to quickly update GPC mean and variance
precomputeUpdateData(model, integration.points)precomputeUpdateData(model, integration.points)
model |
an object of class |
integration.points |
|
a list with components:
c.K |
Vector equal to |
lambda.intpoints |
Vector equal to |
pn.intpoints |
the (averaged) probability of class 1 at |
intpoints.oldmean |
conditional mean matrix of the latent GP at |
intpoints.oldmean |
conditional variance vector of the latent GP at |
Morgane MENZ, Delphine SINOQUET, Miguel MUNOZ-ZUNIGA. Contributors: Naoual SERRAJI.
Menz, M., Munoz-Zuniga, M., Sinoquet, D. Estimation of simulation failure set with active learning based on Gaussian Process classifiers and random set theory (2023). https://hal.science/hal-03848238.
Bachoc, F., Helbert, C. & Picheny, V. Gaussian process optimization with failures: classification and convergence proof. J Glob Optim 78, 483–506 (2020). doi:10.1007/s10898-020-00920-0.
#------------------------------------------------------------------- #------------------- precomputeUpdateData ---------------------------- #------------------------------------------------------------------- ## 20-points DoE, and the corresponding response d <- 2 nb_PX <- 20 x <- matrix(c(0.205293785978832, 0.0159983370750337, 0.684774733109666, 0.125251417595962, 0.787208786290006, 0.700475706055049, 0.480507717105934, 0.359730889653793, 0.543665267336735, 0.565974761807069, 0.303412043992361, 0.471502352650857, 0.839505250127309, 0.504914690245002, 0.573294917143728, 0.784444726564573, 0.291681289223421, 0.255053812451938, 0.87233450888786, 0.947168337730927, 0.648262257638515, 0.973264712407035, 0.421877310273815, 0.0686662506387988, 0.190976166753807, 0.810964668176754, 0.918527262507395, 0.161973686467513, 0.0188128700859558, 0.43522031347403, 0.99902788789426, 0.655561821513544, 0.741113863862512, 0.321050086076934, 0.112003007565305, 0.616551317575545, 0.383511473487687, 0.886611679106771, 0.0749211435982952, 0.205805968972305), byrow = TRUE, ncol = d) require(DiceKriging) fx <- apply(x, 1, branin) f <- ifelse(fx < 14, -1, 1) Xf <- as.matrix(x) ## gpcm object require(GPCsign) require(randtoolbox) model <- gpcm(f, Xf, coef.m = -1.25, coef.cov = c(1.17,0.89)) nb.integration <- d * 100 integration.points <- sobol(n = nb.integration, dim = d, scrambling = 0) integration.points <- scale(x = integration.points, center = [email protected], scale = [email protected]) precalc.data <- precomputeUpdateData(model = model, integration.points = integration.points)#------------------------------------------------------------------- #------------------- precomputeUpdateData ---------------------------- #------------------------------------------------------------------- ## 20-points DoE, and the corresponding response d <- 2 nb_PX <- 20 x <- matrix(c(0.205293785978832, 0.0159983370750337, 0.684774733109666, 0.125251417595962, 0.787208786290006, 0.700475706055049, 0.480507717105934, 0.359730889653793, 0.543665267336735, 0.565974761807069, 0.303412043992361, 0.471502352650857, 0.839505250127309, 0.504914690245002, 0.573294917143728, 0.784444726564573, 0.291681289223421, 0.255053812451938, 0.87233450888786, 0.947168337730927, 0.648262257638515, 0.973264712407035, 0.421877310273815, 0.0686662506387988, 0.190976166753807, 0.810964668176754, 0.918527262507395, 0.161973686467513, 0.0188128700859558, 0.43522031347403, 0.99902788789426, 0.655561821513544, 0.741113863862512, 0.321050086076934, 0.112003007565305, 0.616551317575545, 0.383511473487687, 0.886611679106771, 0.0749211435982952, 0.205805968972305), byrow = TRUE, ncol = d) require(DiceKriging) fx <- apply(x, 1, branin) f <- ifelse(fx < 14, -1, 1) Xf <- as.matrix(x) ## gpcm object require(GPCsign) require(randtoolbox) model <- gpcm(f, Xf, coef.m = -1.25, coef.cov = c(1.17,0.89)) nb.integration <- d * 100 integration.points <- sobol(n = nb.integration, dim = d, scrambling = 0) integration.points <- scale(x = integration.points, center = model@X.mean, scale = model@X.std) precalc.data <- precomputeUpdateData(model = model, integration.points = integration.points)
The functions uses Gaussian Process Classification (GPC) update formulas to quickly compute conditional variances at points newdata, when r new points newX are added.
predict_update_gpc_parallel(Sigma.r, newdata.oldsd, kn)predict_update_gpc_parallel(Sigma.r, newdata.oldsd, kn)
Sigma.r |
an r*r conditional covariance matrix at r points |
newdata.oldsd |
conditional standard deviations vector at the points |
kn |
conditional covariances between the points |
a list with:
sd |
updated conditional standard deviation at points |
lambda |
new GPC weight of x_(n+1),...,x_(n+r) for the prediction at points |
Morgane MENZ, Delphine SINOQUET, Miguel MUNOZ-ZUNIGA. Contributors: Naoual SERRAJI.
Menz, M., Munoz-Zuniga, M., Sinoquet, D. Estimation of simulation failure set with active learning based on Gaussian Process classifiers and random set theory (2023). https://hal.science/hal-03848238.
Chevalier, C., Ginsbourger, D. (2014). Corrected Kriging update formulae for batch-sequential data assimilation, in Pardo-Iguzquiza, E., et al. (Eds.) Mathematics of Planet Earth, pp 119-122
#------------------------------------------------------------------- #------------------- predict_update_gpc_parallel ------------------- #------------------------------------------------------------------- ## 20-points DoE, and the corresponding response d <- 2 nb_PX <- 20 x <- matrix(c(0.205293785978832, 0.0159983370750337, 0.684774733109666, 0.125251417595962, 0.787208786290006, 0.700475706055049, 0.480507717105934, 0.359730889653793, 0.543665267336735, 0.565974761807069, 0.303412043992361, 0.471502352650857, 0.839505250127309, 0.504914690245002, 0.573294917143728, 0.784444726564573, 0.291681289223421, 0.255053812451938, 0.87233450888786, 0.947168337730927, 0.648262257638515, 0.973264712407035, 0.421877310273815, 0.0686662506387988, 0.190976166753807, 0.810964668176754, 0.918527262507395, 0.161973686467513, 0.0188128700859558, 0.43522031347403, 0.99902788789426, 0.655561821513544, 0.741113863862512, 0.321050086076934, 0.112003007565305, 0.616551317575545, 0.383511473487687, 0.886611679106771, 0.0749211435982952, 0.205805968972305), byrow = TRUE, ncol = d) require(DiceKriging) fx <- apply(x, 1, branin) f <- ifelse(fx < 14, -1, 1) Xf <- as.matrix(x) require(future) plan(multisession) ## gpcm object require(GPCsign) model <- gpcm(f, Xf, coef.m = -1.25, coef.cov = c(1.17,0.89)) ## n.grid <- 20 x.grid <- seq(0,1,length=n.grid) newdata <- expand.grid(x.grid,x.grid) newdata <- as.matrix(newdata) pred1 <- predict(object=model,newdata=newdata) precalc.data <- list() precalc.data$c.K <- crossprod(pred1$c, model@invK) newdata.oldsd <- sqrt(pred1$Zsimu_var) # new points added new.x <- matrix(c(0.1,0.2),ncol=2) # predicion at new points pred2 <- predict(object=model,newdata=new.x) Sigma.r <- pred2$cov newdata <- scale(x = newdata, center = [email protected], scale = [email protected]) new.x <- scale(x = new.x, center = [email protected], scale = [email protected]) kn <- computeQuickgpccov(object = model, integration.points = newdata, X.new = new.x, precalc.data = precalc.data, c.newdata = pred2$c) updated.predictions <- predict_update_gpc_parallel(Sigma.r = Sigma.r, newdata.oldsd = newdata.oldsd, kn = kn) plan(sequential)#------------------------------------------------------------------- #------------------- predict_update_gpc_parallel ------------------- #------------------------------------------------------------------- ## 20-points DoE, and the corresponding response d <- 2 nb_PX <- 20 x <- matrix(c(0.205293785978832, 0.0159983370750337, 0.684774733109666, 0.125251417595962, 0.787208786290006, 0.700475706055049, 0.480507717105934, 0.359730889653793, 0.543665267336735, 0.565974761807069, 0.303412043992361, 0.471502352650857, 0.839505250127309, 0.504914690245002, 0.573294917143728, 0.784444726564573, 0.291681289223421, 0.255053812451938, 0.87233450888786, 0.947168337730927, 0.648262257638515, 0.973264712407035, 0.421877310273815, 0.0686662506387988, 0.190976166753807, 0.810964668176754, 0.918527262507395, 0.161973686467513, 0.0188128700859558, 0.43522031347403, 0.99902788789426, 0.655561821513544, 0.741113863862512, 0.321050086076934, 0.112003007565305, 0.616551317575545, 0.383511473487687, 0.886611679106771, 0.0749211435982952, 0.205805968972305), byrow = TRUE, ncol = d) require(DiceKriging) fx <- apply(x, 1, branin) f <- ifelse(fx < 14, -1, 1) Xf <- as.matrix(x) require(future) plan(multisession) ## gpcm object require(GPCsign) model <- gpcm(f, Xf, coef.m = -1.25, coef.cov = c(1.17,0.89)) ## n.grid <- 20 x.grid <- seq(0,1,length=n.grid) newdata <- expand.grid(x.grid,x.grid) newdata <- as.matrix(newdata) pred1 <- predict(object=model,newdata=newdata) precalc.data <- list() precalc.data$c.K <- crossprod(pred1$c, model@invK) newdata.oldsd <- sqrt(pred1$Zsimu_var) # new points added new.x <- matrix(c(0.1,0.2),ncol=2) # predicion at new points pred2 <- predict(object=model,newdata=new.x) Sigma.r <- pred2$cov newdata <- scale(x = newdata, center = model@X.mean, scale = model@X.std) new.x <- scale(x = new.x, center = model@X.mean, scale = model@X.std) kn <- computeQuickgpccov(object = model, integration.points = newdata, X.new = new.x, precalc.data = precalc.data, c.newdata = pred2$c) updated.predictions <- predict_update_gpc_parallel(Sigma.r = Sigma.r, newdata.oldsd = newdata.oldsd, kn = kn) plan(sequential)
Evaluation of the parallel Vorob'ev criterion for some candidate points. To be used in optimization routines, like in max_vorob_parallel_gpc. To avoid numerical instabilities, the new points are evaluated only if they are not too close to an existing observation, or if there is some observation noise. The criterion is the integral of the posterior Vorob'ev uncertainty.
vorob_optim_parallel_gpc( x, integration.points, integration.weights = NULL, intpoints.oldmean, intpoints.oldsd, precalc.data, object, new.noise.var = NULL, batchsize, alpha, current.vorob, seed = NULL )vorob_optim_parallel_gpc( x, integration.points, integration.weights = NULL, intpoints.oldmean, intpoints.oldsd, precalc.data, object, new.noise.var = NULL, batchsize, alpha, current.vorob, seed = NULL )
x |
input vector of size |
integration.points |
matrix of points for numerical integration in the design space. |
integration.weights |
vector of size |
intpoints.oldmean |
(see below). |
intpoints.oldsd |
vectors of size |
precalc.data |
list containing precalculated data. This list can be generated using the |
object |
object of class |
new.noise.var |
optional scalar value of the noise variance for the new observations. |
batchsize |
number of points to sample simultaneously. The sampling criterion will return batchsize points at a time for sampling. |
alpha |
a scalar representing the Vorob'ev threshold |
current.vorob |
current value of the vorob criterion (before adding new observations). |
seed |
to fix the seed. |
Parallel vorob value
Morgane MENZ, Delphine SINOQUET, Miguel MUNOZ-ZUNIGA. Contributors: Naoual SERRAJI.
Menz, M., Munoz-Zuniga, M., Sinoquet, D. Estimation of simulation failure set with active learning based on Gaussian Process classifiers and random set theory (2023). https://hal.science/hal-03848238.
Chevalier, C. Fast uncertainty reduction strategies relying on Gaussian process models PhD Thesis. University of Bern (2013).
El Amri, M.R. Analyse d’incertitudes et de robustesse pour les modèles à entrées et sorties fonctionnelles. Theses, Université Grenoble Alpes (2019), https://theses.hal.science/tel-02433324.
El Amri, M.R., Helbert, C., Munoz-Zuniga, M. Feasible set estimation under functional uncertainty by gaussian process modelling (2023). 455:133,893. doi:10.1016/j.physd.2023.133893.
Bachoc, F., Helbert, C. & Picheny, V. Gaussian process optimization with failures: classification and convergence proof. J Glob Optim 78, 483–506 (2020). doi:10.1007/s10898-020-00920-0.
#------------------------------------------------------------------- #-------------------vorob_optim_parallel_gpc------------------------ #------------------------------------------------------------------- ## 20-points DoE, and the corresponding response d <- 2 nb_PX <- 20 x_ <- matrix(c(0.205293785978832, 0.0159983370750337, 0.684774733109666, 0.125251417595962, 0.787208786290006, 0.700475706055049, 0.480507717105934, 0.359730889653793, 0.543665267336735, 0.565974761807069, 0.303412043992361, 0.471502352650857, 0.839505250127309, 0.504914690245002, 0.573294917143728, 0.784444726564573, 0.291681289223421, 0.255053812451938, 0.87233450888786, 0.947168337730927, 0.648262257638515, 0.973264712407035, 0.421877310273815, 0.0686662506387988, 0.190976166753807, 0.810964668176754, 0.918527262507395, 0.161973686467513, 0.0188128700859558, 0.43522031347403, 0.99902788789426, 0.655561821513544, 0.741113863862512, 0.321050086076934, 0.112003007565305, 0.616551317575545, 0.383511473487687, 0.886611679106771, 0.0749211435982952, 0.205805968972305), byrow = TRUE, ncol = d) require(DiceKriging) fx <- apply(x_, 1, branin) f <- ifelse(fx < 14, -1, 1) Xf <- as.matrix(x_) require(future) plan(multisession) ## gpcm object require(GPCsign) model <- gpcm(f, Xf, coef.m = -1.25, coef.cov = c(1.17,0.89)) ## parameteres for vorob_optim_parallel_gpc function batchsize <- 1 x <- matrix(c(0.1,0.2),ncol=2,byrow=TRUE) require(randtoolbox) nb.integration <- d * 1000 integration.points <- sobol(n = nb.integration, dim = d, scrambling = 0) integration.points <- scale(x = integration.points, center = [email protected], scale = [email protected]) precalc.data <- precomputeUpdateData(model=model, integration.points=integration.points) intpoints.oldmean <- precalc.data$intpoints.oldmean intpoints.oldsd <- precalc.data$intpoints.oldsd pn <- precalc.data$pn require(KrigInv) alpha <- KrigInv::vorob_threshold(pn) pn_bigger_than_alpha <- (pn>alpha)+0 pn_lower_than_alpha <- 1-pn_bigger_than_alpha penalisation <- 1 current.vorob <- mean(pn*pn_lower_than_alpha + penalisation*(1-pn)*pn_bigger_than_alpha) x <- scale(x = x, center = [email protected], scale = [email protected]) criter <- vorob_optim_parallel_gpc(x = x, integration.points = integration.points, intpoints.oldmean = intpoints.oldmean, intpoints.oldsd = intpoints.oldsd, precalc.data = precalc.data, object = model, batchsize = batchsize, alpha = alpha, current.vorob = current.vorob) plan(sequential)#------------------------------------------------------------------- #-------------------vorob_optim_parallel_gpc------------------------ #------------------------------------------------------------------- ## 20-points DoE, and the corresponding response d <- 2 nb_PX <- 20 x_ <- matrix(c(0.205293785978832, 0.0159983370750337, 0.684774733109666, 0.125251417595962, 0.787208786290006, 0.700475706055049, 0.480507717105934, 0.359730889653793, 0.543665267336735, 0.565974761807069, 0.303412043992361, 0.471502352650857, 0.839505250127309, 0.504914690245002, 0.573294917143728, 0.784444726564573, 0.291681289223421, 0.255053812451938, 0.87233450888786, 0.947168337730927, 0.648262257638515, 0.973264712407035, 0.421877310273815, 0.0686662506387988, 0.190976166753807, 0.810964668176754, 0.918527262507395, 0.161973686467513, 0.0188128700859558, 0.43522031347403, 0.99902788789426, 0.655561821513544, 0.741113863862512, 0.321050086076934, 0.112003007565305, 0.616551317575545, 0.383511473487687, 0.886611679106771, 0.0749211435982952, 0.205805968972305), byrow = TRUE, ncol = d) require(DiceKriging) fx <- apply(x_, 1, branin) f <- ifelse(fx < 14, -1, 1) Xf <- as.matrix(x_) require(future) plan(multisession) ## gpcm object require(GPCsign) model <- gpcm(f, Xf, coef.m = -1.25, coef.cov = c(1.17,0.89)) ## parameteres for vorob_optim_parallel_gpc function batchsize <- 1 x <- matrix(c(0.1,0.2),ncol=2,byrow=TRUE) require(randtoolbox) nb.integration <- d * 1000 integration.points <- sobol(n = nb.integration, dim = d, scrambling = 0) integration.points <- scale(x = integration.points, center = model@X.mean, scale = model@X.std) precalc.data <- precomputeUpdateData(model=model, integration.points=integration.points) intpoints.oldmean <- precalc.data$intpoints.oldmean intpoints.oldsd <- precalc.data$intpoints.oldsd pn <- precalc.data$pn require(KrigInv) alpha <- KrigInv::vorob_threshold(pn) pn_bigger_than_alpha <- (pn>alpha)+0 pn_lower_than_alpha <- 1-pn_bigger_than_alpha penalisation <- 1 current.vorob <- mean(pn*pn_lower_than_alpha + penalisation*(1-pn)*pn_bigger_than_alpha) x <- scale(x = x, center = model@X.mean, scale = model@X.std) criter <- vorob_optim_parallel_gpc(x = x, integration.points = integration.points, intpoints.oldmean = intpoints.oldmean, intpoints.oldsd = intpoints.oldsd, precalc.data = precalc.data, object = model, batchsize = batchsize, alpha = alpha, current.vorob = current.vorob) plan(sequential)
Evaluation of the Vorob'ev criterion for candidate points x, assuming that some other points are also going to be evaluated. To be used in optimization routines, like in max_vorob_parallel_gpc. To avoid numerical instabilities, the new points are evaluated only if they are not too close to an existing observation, or if there is some observation noise. The criterion is the integral of the posterior Vorob'ev uncertainty.
vorob_optim_parallel2_gpc( x, other.points, integration.points, integration.weights = NULL, intpoints.oldmean, intpoints.oldsd, precalc.data, object, new.noise.var = NULL, batchsize, alpha, current.vorob, seed = NULL )vorob_optim_parallel2_gpc( x, other.points, integration.points, integration.weights = NULL, intpoints.oldmean, intpoints.oldsd, precalc.data, object, new.noise.var = NULL, batchsize, alpha, current.vorob, seed = NULL )
x |
input vector of size d at which one wants to evaluate the criterion. |
other.points |
vector giving the other |
integration.points |
p*d matrix of points for numerical integration in the design space. |
integration.weights |
vector of size p corresponding to the weights of these integration points. |
intpoints.oldmean |
vector of size p corresponding to the latent GP mean at the integration points before adding |
intpoints.oldsd |
vector of size p corresponding to the latent GP standard deviation at the integration points before adding |
precalc.data |
list containing precalculated data. This list can be generated using the |
object |
object of class |
new.noise.var |
optional scalar value of the noise variance of the new observations. |
batchsize |
number of points to sample simultaneously. The sampling criterion will return batchsize points at a time for sampling. |
alpha |
a scalar representing the Vorob'ev threshold. |
current.vorob |
current value of the vorob criterion (before adding new observations). |
seed |
to fix the seed. |
Parallel Vorob'ev value
Morgane MENZ, Delphine SINOQUET, Miguel MUNOZ-ZUNIGA. Contributors: Naoual SERRAJI.
Menz, M., Munoz-Zuniga, M., Sinoquet, D. Estimation of simulation failure set with active learning based on Gaussian Process classifiers and random set theory (2023). https://hal.science/hal-03848238.
Chevalier, C. Fast uncertainty reduction strategies relying on Gaussian process models PhD Thesis. University of Bern (2013).
Bachoc, F., Helbert, C. & Picheny, V. Gaussian process optimization with failures: classification and convergence proof. J Glob Optim 78, 483–506 (2020). doi:10.1007/s10898-020-00920-0.