
An S4 Class implementing Dimensionality Reduction via Regression (DRR).

Details

DRR is a non-linear extension of PCA that uses Kernel Ridge regression.

Slots

fun

A function that does the embedding and returns a dimRedResult object.

stdpars

The standard parameters for the function.

General usage

Dimensionality reduction methods are S4 Classes that can be used in two ways. They can be used directly, in which case they have to be initialized and a full list of parameters has to be handed to the @fun() slot. Alternatively, the method name can be passed to the embed function and parameters can be given via the ..., in which case missing parameters will be replaced by the ones in @stdpars.
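The two calling styles can be sketched as follows. This is a minimal sketch, assuming the dimRed, kernlab, and DRR packages are installed; the sample size n = 100 and ndim = 3 are arbitrary illustrative choices, and the names emb_embed, emb_direct, and pkgs_ok are made up for this example.

```r
# Check each required package separately; requireNamespace() handles one name per call.
pkgs_ok <- all(vapply(c("dimRed", "kernlab", "DRR"), requireNamespace,
                      logical(1), quietly = TRUE))

if (pkgs_ok) {
  library(dimRed)

  dat <- loadDataSet("variable Noise Helix", n = 100)

  # Style 1: pass the method name to embed(); parameters that are not
  # supplied are filled in from the @stdpars slot.
  emb_embed <- embed(dat, "DRR", ndim = 3)

  # Style 2: initialize the class and hand a full parameter list to @fun().
  drr <- DRR()
  pars <- drr@stdpars   # start from the standard parameters
  pars$ndim <- 3        # override only what should differ
  emb_direct <- drr@fun(dat, pars)
}
```

Starting from @stdpars in the second style is one convenient way to guarantee that the list handed to @fun() is complete.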

Parameters

DRR can take the following parameters:

ndim

The number of dimensions.

lambda

The regularization parameter for the ridge regression.

kernel

The kernel to use for KRR, defaults to "rbfdot".

kernel.pars

A list with kernel parameters, elements depend on the kernel used, "rbfdot" uses "sigma".

pca

logical, should an initial pca step be performed, defaults to TRUE.

pca.center

logical, should the data be centered before the pca step. Defaults to TRUE.

pca.scale

logical, should the data be scaled before the pca step. Defaults to FALSE.

fastcv

logical, should fastCV from the CVST package be used instead of normal cross-validation.

fastcv.test

If fastcv = TRUE, separate test data set for fastcv.

cv.folds

If fastcv = FALSE, specifies the number of folds for cross-validation.

fastkrr.nblocks

integer, higher values trade numerical accuracy for speed and lower memory use, see below for details.

verbose

logical, should the cross-validation results be printed out.
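As an illustration of how these parameters can be supplied, the sketch below first prints the standard parameters and then overrides a few of them through embed. It assumes the dimRed, kernlab, and DRR packages are installed; the candidate values for lambda and sigma are made up for this example and are not recommendations or package defaults.

```r
# Check each required package separately; requireNamespace() handles one name per call.
pkgs_ok <- all(vapply(c("dimRed", "kernlab", "DRR"), requireNamespace,
                      logical(1), quietly = TRUE))

if (pkgs_ok) {
  library(dimRed)

  # Inspect the standard parameters stored in the @stdpars slot.
  str(DRR()@stdpars)

  dat <- loadDataSet("variable Noise Helix", n = 100)

  # Override selected parameters; anything omitted falls back to @stdpars.
  # lambda and sigma are given as small candidate sets (illustrative values)
  # so that the cross-validation has only a few combinations to try.
  emb_pars <- embed(dat, "DRR",
                    ndim        = 2,
                    lambda      = c(1e-3, 1e-1),
                    kernel      = "rbfdot",
                    kernel.pars = list(sigma = c(0.1, 1)),
                    fastcv      = FALSE,
                    cv.folds    = 5,
                    verbose     = FALSE)
}
```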

Implementation

Wraps around drr, see there for details. DRR is a non-linear extension of principal components analysis using Kernel Ridge Regression (KRR; for details see constructKRRLearner and constructFastKRRLearner). Non-linear regression is used to explain more variance than PCA. DRR provides an out-of-sample extension and a backward projection.
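To give an idea of the regression building block, here is a textbook kernel ridge regression in base R. It is a sketch under stated assumptions and not the code used by drr or CVST: the helper names rbf_kernel, krr_fit, and krr_predict are hypothetical, the toy data are made up, and the exact scaling of lambda in constructKRRLearner may differ from the plain form used here. The kernel has the same functional form as kernlab's "rbfdot", k(x, y) = exp(-sigma * ||x - y||^2).

```r
# RBF kernel matrix between the rows of A and the rows of B.
rbf_kernel <- function(A, B, sigma) {
  # Squared Euclidean distances, clamped at 0 to guard against rounding.
  sq <- outer(rowSums(A^2), rowSums(B^2), "+") - 2 * tcrossprod(A, B)
  exp(-sigma * pmax(sq, 0))
}

# Fit: solve the regularized linear system (K + lambda * I) alpha = y.
krr_fit <- function(X, y, lambda, sigma) {
  K <- rbf_kernel(X, X, sigma)
  alpha <- solve(K + lambda * diag(nrow(X)), y)
  list(X = X, alpha = alpha, sigma = sigma)
}

# Predict: kernel between new and training points times the coefficients.
krr_predict <- function(fit, Xnew) {
  drop(rbf_kernel(Xnew, fit$X, fit$sigma) %*% fit$alpha)
}

# Toy data: a noisy sine curve that a straight line cannot follow.
set.seed(1)
X <- matrix(seq(-3, 3, length.out = 50), ncol = 1)
y <- sin(X[, 1]) + rnorm(50, sd = 0.1)

fit   <- krr_fit(X, y, lambda = 0.1, sigma = 1)
y_krr <- krr_predict(fit, X)
y_lin <- fitted(lm(y ~ X[, 1]))

# Training error of the non-linear fit versus the linear fit.
mse_krr <- mean((y - y_krr)^2)
mse_lin <- mean((y - y_lin)^2)
```

Solving the n x n linear system is the step that dominates the run time, which is why the cost grows quickly with the number of samples.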

The most expensive computations are matrix inversions; therefore, the implementation profits a lot from a multithreaded BLAS library. The best parameters for each KRR are determined by cross-validation over all parameter combinations of lambda and kernel.pars, so using fewer parameter values will reduce computation time. Calculation of KRR can be accelerated by increasing fastkrr.nblocks, at the cost of some accuracy; it should be smaller than n^(1/3), for details see constructFastKRRLearner. Another way to speed up the computation is to use pars$fastcv = TRUE, which might provide a more efficient way to search the parameter space but may also miss the global maximum; I have not run tests on the accuracy of this method.
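The effect of the parameter grid on run time can be made concrete with a little base R arithmetic. The candidate values and the sample size below are hypothetical, chosen only to show how the number of combinations grows and how the n^(1/3) bound on fastkrr.nblocks can be evaluated.

```r
# Hypothetical candidate values for the two tuned parameters.
lambda <- c(0, 10^(-3:2))   # 7 values
sigma  <- 10^(-3:4)         # 8 values

# Every combination is evaluated by the cross-validation.
full_grid <- expand.grid(lambda = lambda, sigma = sigma)
nrow(full_grid)             # 7 * 8 = 56 combinations

# A coarser grid means far fewer models to fit.
small_grid <- expand.grid(lambda = 10^(-2:0), sigma = 10^(-1:1))
nrow(small_grid)            # 3 * 3 = 9 combinations

# Largest integer strictly smaller than n^(1/3), for a hypothetical n.
n <- 2000
max_nblocks <- ceiling(n^(1/3)) - 1
max_nblocks                 # 12, since 12^3 = 1728 < 2000 < 13^3 = 2197
```

With k cross-validation folds, each combination is fitted k times per regression, so trimming the grid reduces the total work proportionally.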

References

Laparra, V., Malo, J., Camps-Valls, G., 2015. Dimensionality Reduction via Regression in Hyperspectral Imagery. IEEE Journal of Selected Topics in Signal Processing 9, 1026-1036. doi:10.1109/JSTSP.2015.2417833

Examples

if (FALSE) {
if (all(vapply(c("kernlab", "DRR"), requireNamespace, logical(1), quietly = TRUE))) {

dat <- loadDataSet("variable Noise Helix", n = 200)[sample(200)]

emb <- embed(dat, "DRR", ndim = 3)

plot(dat, type = "3vars")
plot(emb, type = "3vars")

# There is also a function to reconstruct the data, which also works with only the first few dimensions
rec <- inverse(emb, getData(getDimRedData(emb))[, 1, drop = FALSE])
plot(rec, type = "3vars")
}

}