API
SDR
- class pySDR.SDR.SDR(path, data)
Class for applying either DR or SDR.
Parameters
- path : str
Storage path for the outputs of the LGC algorithm.
- data : np.ndarray
Data to apply either DR or SDR on.
- apply_DR(seed=None, **kwargs)
Function that applies dimensionality reduction to the currently loaded dataset.
Note
The currently available dimensionality reduction techniques are:
KLLE, NPE, Kernel LTSA, Linear LTSA, Hessian LLE, Laplacian Eigenmaps, LPP, Diffusion Map, Isomap, Landmark Isomap, MDS, LMDS, SPE, Kernel PCA, PCA, RP, Factor Analysis, tSNE & Manifold Sculpting from the Tapkee library. Please consult the Tapkee documentation for the appropriate keyword arguments for each DR method.
UMAP from umap-learn. NB: Only the keyword arguments seed, num_neighbors, target_dimension, metric, umap_init and min_dist are currently implemented. Please consult the umap-learn documentation for more information about their meaning.
LTSA from sklearn. One can also use the sklearn backend for applying LLE or Hessian LLE by setting backend='sklearn'.
The DR method can be set by providing, e.g. method='LMDS'. By default this function will apply RP (i.e. random projection).
Parameters
- seed : int, default = None
Random seed to use for the given DR method. By default no random seed will be set.
- **kwargs :
Additional DR method specific keyword arguments.
Returns
- data : np.ndarray, shape (n_samples, target_dimension)
The reduced dataset.
- apply_LGC(alpha, T=10, k=100)
Function that applies local gradient clustering (LGC) to the currently loaded dataset.
Parameters
- alpha : float
Learning rate of the LGC algorithm.
- T : int, default = 10
Number of clustering iterations LGC takes.
- k : int, default = 100
Number of nearest neighbors to consider for computing the local gradient.
Returns
- data : np.ndarray, shape (n_samples, n_features)
The clustered dataset.
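As a rough usage sketch of the class above, the function below sharpens a dataset with apply_LGC() and then reduces it with apply_DR(). It assumes that apply_DR() operates on the sharpened data once apply_LGC() has been called, which is what the note on pySDR.DR.DR further down suggests. The helper name sharpen_then_reduce, the default alpha of 0.1 and the choice of LMDS are illustrative assumptions, not recommendations from the library.

```python
def sharpen_then_reduce(data, path, alpha=0.1, T=10, k=100,
                        method="LMDS", seed=None, **dr_kwargs):
    """Sharpen `data` with LGC, then reduce it with the chosen DR method.

    `data` is expected to be an np.ndarray of shape (n_samples, n_features).
    The helper itself is hypothetical; only the SDR calls come from the
    documentation above.
    """
    # Imported lazily so that this sketch can be loaded without pySDR installed.
    from pySDR.SDR import SDR

    sdr = SDR(path, data)
    # Sharpening step; skip this call to apply plain DR instead of SDR.
    sdr.apply_LGC(alpha, T=T, k=k)
    # Method-specific keyword arguments are forwarded unchanged.
    return sdr.apply_DR(seed=seed, method=method, **dr_kwargs)
```

Keyword arguments beyond seed and method depend on the selected DR method and its backend, so they are passed through rather than spelled out here.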
LGC
- pySDR.LGC.sharpening_for_dr(data, alpha, T=10, k=100)
Function that applies sharpening by means of local gradient clustering (LGC) to the data provided.
Note
This function serves as a Python interface between pySDR and SDR. Please use the pySDR.SDR.SDR class for SDR unless your objective is to just sharpen the high-dimensional data.
Parameters
- alpha : float
Learning rate of the LGC algorithm.
- T : int, default = 10
Number of clustering iterations LGC takes.
- k : int, default = 100
Number of nearest neighbors to consider for computing the local gradient.
Returns
- data : np.ndarray, shape (n_samples, n_features)
The clustered dataset.
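To give a feel for how alpha, T and k interact, here is a deliberately simplified stand-in written in plain Python. It is not the LGC update rule used by pySDR: it is a nearest-neighbour mean shift in which every point moves a fraction alpha toward the mean of its k nearest neighbours, repeated T times. The function name knn_mean_shift and the update rule are assumptions made purely for illustration.

```python
import math


def knn_mean_shift(data, alpha, T=10, k=100):
    """Simplified stand-in for LGC (not the pySDR algorithm).

    Moves each point a fraction `alpha` toward the mean of its `k`
    nearest neighbours, repeated for `T` iterations.
    """
    points = [list(p) for p in data]
    n = len(points)
    k = min(k, n - 1)  # a point cannot have more than n - 1 neighbours
    if k <= 0:
        return points
    for _ in range(T):
        updated = []
        for i, p in enumerate(points):
            # Indices of the k nearest neighbours of p, excluding p itself.
            neighbours = sorted(
                (j for j in range(n) if j != i),
                key=lambda j: math.dist(p, points[j]),
            )[:k]
            # Mean of the neighbours, computed per coordinate.
            mean = [
                sum(points[j][d] for j in neighbours) / k
                for d in range(len(p))
            ]
            # Step toward the local mean; alpha acts as the learning rate.
            updated.append([x + alpha * (m - x) for x, m in zip(p, mean)])
        points = updated  # all points are updated synchronously
    return points
```

In this sketch a larger alpha or T contracts clusters more strongly, while k controls how local the estimate is. The output keeps the input shape (n_samples, n_features), matching the return value documented above.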
DR
- class pySDR.DR.DR(**kwargs)
Class for applying dimensionality reduction.
Note
This class serves as a common DR interface for Tapkee DR methods as well as sklearn and umap-learn. It is not recommended to use this class for DR. Please use pySDR.SDR.SDR without an apply_LGC() call instead.
Parameters
- **kwargs
A number of keyword arguments specifying the DR method to use and its configuration. Note that a method should always be provided.
- apply_DR(data, filepath)
Function that applies DR on the data provided and stores the results in a *.txt file.
Parameters
- data : np.ndarray, shape (n_samples, n_features)
Feature space data.
- filepath : str
File in which to store the results, written with 6 significant digits (i.e. roughly float32 precision).
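The remark about 6 significant digits can be checked with the standard library alone. The sketch below rounds a value to 6 significant digits and compares the result with a float32 round trip. The '.6g' format is an assumption used for illustration; the documentation does not state which format string pySDR uses when writing the file.

```python
import struct


def to_six_significant(value):
    """Round `value` to 6 significant digits, as a text file would store it."""
    return float(f"{value:.6g}")


def to_float32(value):
    """Round-trip `value` through a 32-bit float."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


value = 0.123456789
print(f"{value:.6g}")  # prints 0.123457
print(abs(to_six_significant(value) - value))  # error after text round trip
print(abs(to_float32(value) - value))          # error after float32 round trip
```

float32 carries about 7 significant decimal digits, so results read back from the file are slightly coarser than float32 values.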
- get(arg)
General interface for all getter methods.
Parameters
- arg : str
A string specifying the setting that needs to be retrieved.
Returns
- value :
The value of the setting.
- get_backend()
Function that gets the backend of the DR method.
Returns
- backend : str
Backend of the DR method (can be either tapkee, sklearn or umap-learn).
- get_gaussian_kernel_width()
Function that gets the value of the gaussian_kernel_width parameter of the DR method. Used by the Laplacian Eigenmaps, LPP and Diffusion Map algorithms in the Tapkee library.
Returns
- gaussian_kernel_width : float
Width of the Gaussian kernel used by the DR method.
- get_landmark_ratio()
Function that gets the value of the landmark_ratio parameter of the landmark algorithms LMDS and Landmark Isomap.
Returns
- landmark_ratio : float, between [0,1]
Ratio of landmark points that is used by the DR method.
- get_max_iteration()
Function that gets the value of the max_iteration parameter of the DR method. Used by: SPE, Factor Analysis and Manifold Sculpting.
Returns
- max_iteration : int
Maximum number of iterations that can be reached by the DR method.
- get_method()
Function that gets the DR method ID.
Returns
- method : int
ID of the DR method that is currently set. The correspondence between ID and DR method is as follows: KLLE = 0, NPE = 1, Kernel LTSA = 2, Linear LTSA = 3, Hessian LLE = 4, Laplacian Eigenmaps = 5, LPP = 6, Diffusion Map = 7, Isomap = 8, Landmark Isomap = 9, MDS = 10, LMDS = 11, SPE = 12, Kernel PCA = 13, PCA = 14, RP = 15, Factor Analysis = 16, tSNE = 17, Manifold Sculpting = 18, UMAP = 19 & LTSA = 20.
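Because get_method() returns a bare integer, a small lookup table makes the result easier to read. The sketch below transcribes the ID table from the description above. The names METHOD_IDS and method_name are hypothetical helpers and are not part of pySDR.

```python
# DR method IDs, transcribed from the table in the documentation.
METHOD_IDS = {
    "KLLE": 0, "NPE": 1, "Kernel LTSA": 2, "Linear LTSA": 3,
    "Hessian LLE": 4, "Laplacian Eigenmaps": 5, "LPP": 6,
    "Diffusion Map": 7, "Isomap": 8, "Landmark Isomap": 9,
    "MDS": 10, "LMDS": 11, "SPE": 12, "Kernel PCA": 13, "PCA": 14,
    "RP": 15, "Factor Analysis": 16, "tSNE": 17,
    "Manifold Sculpting": 18, "UMAP": 19, "LTSA": 20,
}

# Reverse mapping, for decoding the integer returned by get_method().
ID_TO_METHOD = {method_id: name for name, method_id in METHOD_IDS.items()}


def method_name(method_id):
    """Return the DR method name for a numeric method ID."""
    return ID_TO_METHOD[method_id]


print(method_name(15))  # prints RP
```

The same table can be used in the other direction, for example METHOD_IDS["LMDS"], to obtain the integer expected by set_method().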
- get_metric()
Function that gets the value of the metric parameter of UMAP.
Returns
- metric : str
Metric parameter of the UMAP algorithm.
- get_min_dist()
Function that gets the value of the min_dist parameter of UMAP.
Returns
- min_dist : float, between [0,1]
Value of the min_dist parameter of UMAP controlling how tightly UMAP packs points together.
- get_num_neighbors()
Function that gets the value of the num_neighbors parameter of the DR method. Used by: KLLE, NPE, Kernel LTSA, Linear LTSA, Hessian LLE, Laplacian Eigenmaps, LPP, Isomap, Landmark Isomap, Manifold Sculpting and LTSA.
Returns
- num_neighbors : int
Number of nearest neighbors used by the DR method.
- get_sne_perplexity()
Function that gets the value of the sne_perplexity parameter of tSNE.
Returns
- sne_perplexity : float
Perplexity parameter of tSNE.
- get_sne_theta()
Function that gets the value of the sne_theta parameter of tSNE.
Returns
- sne_theta : float
Theta parameter of the tSNE algorithm.
- get_squishing_rate()
Function that gets the value of the squishing_rate parameter of the Manifold Sculpting algorithm.
Returns
- squishing_rate : float
Squishing rate parameter of the Manifold Sculpting algorithm.
- get_target_dimension()
Function that gets the value of the target_dimension parameter of the DR method. NB: umap-learn and scikit-learn call this parameter n_components.
Returns
- target_dimension : int
Number of dimensions the DR method will reduce to.
- get_umap_init()
Function that gets the value of the init parameter of UMAP.
Returns
- init : str
Initialization used by the UMAP algorithm.
- set(**kwargs)
General interface for all setter methods.
Parameters
- **kwargs
A number of keyword arguments specifying the configuration of the DR method that need to be set.
- set_backend(backend)
Function that sets the backend of the DR method.
Parameters
- backend : str
Backend of the DR method (can be either tapkee, sklearn or umap-learn).
- set_gaussian_kernel_width(gaussian_kernel_width)
Function that sets the value of the gaussian_kernel_width parameter of the DR method. Used by the Laplacian Eigenmaps, LPP and Diffusion Map algorithms in the Tapkee library.
Parameters
- gaussian_kernel_width : float
Width of the Gaussian kernel to be used by the DR method.
- set_landmark_ratio(landmark_ratio)
Function that sets the value of the landmark_ratio parameter of the landmark algorithms LMDS and Landmark Isomap.
Parameters
- landmark_ratio : float, between [0,1]
Ratio of landmark points that needs to be used by the DR method.
- set_max_iteration(max_iteration)
Function that sets the value of the max_iteration parameter of the DR method. Used by: SPE, Factor Analysis and Manifold Sculpting.
Parameters
- max_iteration : int
Maximum number of iterations that can be reached by the DR method.
- set_method(method)
Function that sets the DR method ID.
Parameters
- method : int
ID of the DR method. The correspondence between ID and DR method is as follows: KLLE = 0, NPE = 1, Kernel LTSA = 2, Linear LTSA = 3, Hessian LLE = 4, Laplacian Eigenmaps = 5, LPP = 6, Diffusion Map = 7, Isomap = 8, Landmark Isomap = 9, MDS = 10, LMDS = 11, SPE = 12, Kernel PCA = 13, PCA = 14, RP = 15, Factor Analysis = 16, tSNE = 17, Manifold Sculpting = 18, UMAP = 19 & LTSA = 20.
- set_metric(metric)
Function that sets the value of the metric parameter of UMAP.
Parameters
- metric : str
Metric parameter of the UMAP algorithm.
- set_min_dist(min_dist)
Function that sets the value of the min_dist parameter of UMAP.
Parameters
- min_dist : float, between [0,1]
Value of the min_dist parameter of UMAP controlling how tightly UMAP packs points together.
- set_num_neighbors(num_neighbors)
Function that sets the value of the num_neighbors parameter of the DR method. Used by: KLLE, NPE, Kernel LTSA, Linear LTSA, Hessian LLE, Laplacian Eigenmaps, LPP, Isomap, Landmark Isomap, Manifold Sculpting and LTSA.
Parameters
- num_neighbors : int
Number of nearest neighbors to be used by the DR method.
- set_seed(seed)
Function that sets the random seed of the DR method instance.
Parameters
- seed : int
Random seed to use for the DR method.
- set_sne_perplexity(sne_perplexity)
Function that sets the value of the sne_perplexity parameter of tSNE.
Parameters
- sne_perplexity : float
Perplexity parameter of tSNE.
- set_sne_theta(sne_theta)
Function that sets the value of the sne_theta parameter of tSNE.
Parameters
- sne_theta : float
Theta parameter of the tSNE algorithm.
- set_squishing_rate(squishing_rate)
Function that sets the value of the squishing_rate parameter of the Manifold Sculpting algorithm.
Parameters
- squishing_rate : float
Squishing rate parameter of the Manifold Sculpting algorithm.