# Introduction to Random Distributions

More properly called random variables, random distributions are useful in hyperparameter tuning.

For each hyperparameter, a range can be defined as a statistical distribution, which makes the hyperparameter a random variable. This random variable defines which values the hyperparameter is likely to take.

Let’s explore the hyperparameter distributions by plotting the following graphs:

• Probability density function (pdf) or probability mass function (pmf)

• Cumulative distribution function (cdf)

• Histogram of samples
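
As a rough, library-independent illustration of how these three views relate, the sketch below builds them by hand for a fair four-valued discrete variable, using only the Python standard library. The names `values`, `pmf`, `cdf`, and `histogram` are made up for this example and are not part of Neuraxle.

```python
import random
from collections import Counter
from itertools import accumulate

# Hypothetical discrete variable: each of four values is equally likely.
values = [1, 2, 3, 4]

# pmf: probability of each individual value.
pmf = {v: 1 / len(values) for v in values}

# cdf: running sum of the pmf, i.e. P(X <= v).
cdf = dict(zip(values, accumulate(pmf[v] for v in values)))

# Histogram of samples: empirical frequencies from repeated draws.
rng = random.Random(42)
num_trials = 100_000
counts = Counter(rng.choice(values) for _ in range(num_trials))
histogram = {v: counts[v] / num_trials for v in values}

for v in values:
    print(v, pmf[v], round(cdf[v], 2), round(histogram[v], 3))
```

With enough samples, the normalized histogram approaches the pmf, and the cdf reaches 1 at the largest value.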

## Plotting Each Hyperparameter Distribution

Let’s import the plotting functions and the Neuraxle hyperparameter classes.

from neuraxle.hyperparams.distributions import *
from neuraxle.hyperparams.space import HyperparameterSpace
from neuraxle.plotting import plot_histogram, plot_pdf_cdf, plot_distribution_space
import matplotlib.pyplot as plt
import numpy as np
%matplotlib inline

DISCRETE_NUM_BINS = 40
CONTINUOUS_NUM_BINS = 1000
NUM_TRIALS = 100000
X_DOMAIN = np.array(range(-100, 600)) / 100


## Discrete Distributions

• Here are the standard discrete distributions, which sample discrete values or categories.

• For example, the Boolean distribution yields either true or false.

### RandInt

discrete_hyperparameter_space = HyperparameterSpace({
"randint": RandInt(1, 4)
})

plot_distribution_space(discrete_hyperparameter_space, num_bins=DISCRETE_NUM_BINS)


### Boolean

discrete_hyperparameter_space = HyperparameterSpace({
"boolean": Boolean()
})
plot_distribution_space(discrete_hyperparameter_space, num_bins=DISCRETE_NUM_BINS)

### Choice

discrete_hyperparameter_space = HyperparameterSpace({
"choice": Choice([0, 1, 3])
})
plot_distribution_space(discrete_hyperparameter_space, num_bins=DISCRETE_NUM_BINS)

### Priority Choice

discrete_hyperparameter_space = HyperparameterSpace({
"priority_choice": PriorityChoice([0, 1, 3])
})
plot_distribution_space(discrete_hyperparameter_space, num_bins=DISCRETE_NUM_BINS)

## Continuous Distributions

• Here are the continuous distributions, which sample from a continuous range of values. These are probably the ones you’ll use most.

### Continuous Uniform

continuous_hyperparameter_space = HyperparameterSpace({
"uniform": Uniform(2., 4.)
})
plot_distribution_space(continuous_hyperparameter_space, num_bins=CONTINUOUS_NUM_BINS)

### Continuous Loguniform

continuous_hyperparameter_space = HyperparameterSpace({
"loguniform": LogUniform(1., 4.)
})
plot_distribution_space(continuous_hyperparameter_space, num_bins=CONTINUOUS_NUM_BINS)
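
To give an intuition for what a log-uniform distribution does, here is a minimal standard-library sketch. It assumes the two arguments are bounds in the original (not logarithmic) space, and it is not Neuraxle's implementation: `sample_loguniform` is a hypothetical helper that draws uniformly between the logarithms of the bounds and then exponentiates.

```python
import math
import random


def sample_loguniform(low, high, rng):
    """Draw one value whose logarithm is uniform between log(low) and log(high)."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))


rng = random.Random(0)
samples = [sample_loguniform(1.0, 4.0, rng) for _ in range(50_000)]

# The median of a log-uniform variable is the geometric mean of its bounds,
# here sqrt(1 * 4) = 2, so about half of the samples should fall below 2.
fraction_below_2 = sum(s < 2.0 for s in samples) / len(samples)
print(fraction_below_2)
```

Because the sampling is uniform in log space, small values are drawn more densely than large ones, which suits hyperparameters such as learning rates that span several orders of magnitude.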

### Continuous Normal

continuous_hyperparameter_space = HyperparameterSpace({
"normal": Normal(3.0, 1.0)
})
plot_distribution_space(continuous_hyperparameter_space, num_bins=CONTINUOUS_NUM_BINS)

### Continuous Lognormal

continuous_hyperparameter_space = HyperparameterSpace({
"lognormal": LogNormal(1.0, 0.5)
})
plot_distribution_space(continuous_hyperparameter_space, num_bins=CONTINUOUS_NUM_BINS)

### Continuous Normal Clipped

continuous_hyperparameter_space = HyperparameterSpace({
"normal_clipped": Normal(3.0, 1.0, hard_clip_min=1., hard_clip_max=5.)
})
plot_distribution_space(continuous_hyperparameter_space, num_bins=CONTINUOUS_NUM_BINS)

### Continuous Lognormal Clipped

continuous_hyperparameter_space = HyperparameterSpace({
"lognormal_clipped": LogNormal(1.0, 0.5, hard_clip_min=2., hard_clip_max=4.)
})
plot_distribution_space(continuous_hyperparameter_space, num_bins=CONTINUOUS_NUM_BINS)

## Quantized Hyperparameter Distributions

• Here are the quantized hyperparameter distributions. They yield integers or other specific, precise values.

• Notice the border effects at the left and right of the charts when Quantized(...) is used as a distribution wrapper to round the numbers. These border effects would not appear if the bounds of the distribution were set at half-integers instead of whole numbers.

• Say you have a Quantized(Uniform(-10, 10)): the samples from approximately -9.5 to -8.5 are rounded to the bin of the number -9, but only the values from -10 to -9.5 are rounded to the bin -10. Half of that bin's interval is missing, so the -10 bin is sampled half as often as the -9 bin. That explains the border effect, and you can fix it easily by taking the uniform range from -10.49999 to 10.49999.
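
The border effect can be checked with a small simulation. This is only a sketch using the Python standard library, assuming the quantization step is plain rounding to the nearest integer; it does not call Neuraxle, and `edge_to_inner_ratio` is a hypothetical helper name.

```python
import random
from collections import Counter


def edge_to_inner_ratio(low, high, num_trials=200_000, seed=0):
    """Round uniform samples and compare how often -10 and -9 are drawn."""
    rng = random.Random(seed)
    counts = Counter(round(rng.uniform(low, high)) for _ in range(num_trials))
    return counts[-10] / counts[-9]


# Whole-number bounds: the -10 bin only covers [-10, -9.5), half a unit wide.
naive_ratio = edge_to_inner_ratio(-10, 10)

# Bounds just inside the half-integers: the -10 bin covers almost a full unit.
repaired_ratio = edge_to_inner_ratio(-10.49999, 10.49999)

print(naive_ratio, repaired_ratio)
```

The first ratio should come out close to one half and the second close to one, matching the explanation above.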

### Quantized Uniform

quantized_hyperparameter_space = HyperparameterSpace({
"quantized uniform": Quantized(Uniform(1., 5.))
})
plot_distribution_space(quantized_hyperparameter_space, num_bins=DISCRETE_NUM_BINS)

### Repaired Quantized Uniform

quantized_hyperparameter_space = HyperparameterSpace({
# Bounds sit just inside the half-integers (0.49999 itself would round down to 0).
"repaired quantized uniform": Quantized(Uniform(0.50001, 5.49999))
})

plot_distribution_space(quantized_hyperparameter_space, num_bins=DISCRETE_NUM_BINS)
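
The difference between the plain and the repaired quantized uniform can also be worked out exactly rather than sampled. The sketch below is an illustration under the assumption that quantization means rounding to the nearest integer; `quantized_uniform_pmf` is a hypothetical helper, not a Neuraxle function, and the repaired case uses the exact half-integer bounds 0.5 and 5.5 for simplicity.

```python
import math


def quantized_uniform_pmf(low, high):
    """Probability that round(X) == k for X uniform on [low, high]."""
    width = high - low
    pmf = {}
    for k in range(math.floor(low + 0.5), math.floor(high + 0.5) + 1):
        # Part of [low, high] that rounds to k is its overlap with [k - 0.5, k + 0.5].
        left = max(low, k - 0.5)
        right = min(high, k + 0.5)
        if right > left:
            pmf[k] = (right - left) / width
    return pmf


plain = quantized_uniform_pmf(1.0, 5.0)      # edge bins get half the mass
repaired = quantized_uniform_pmf(0.5, 5.5)   # every bin gets the same mass

print(plain)
print(repaired)
```

With whole-number bounds, the bins for 1 and 5 each receive half the probability of the inner bins; with half-integer bounds, all five bins are equally likely.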

### Quantized Log Uniform

quantized_hyperparameter_space = HyperparameterSpace({
"quantized loguniform": Quantized(LogUniform(1.0, 4.0))
})

plot_distribution_space(quantized_hyperparameter_space, num_bins=DISCRETE_NUM_BINS)

### Quantized Normal

quantized_hyperparameter_space = HyperparameterSpace({
"quantized normal": Quantized(Normal(3.0, 1.0))
})

plot_distribution_space(quantized_hyperparameter_space, num_bins=DISCRETE_NUM_BINS)

### Quantized Lognormal

quantized_hyperparameter_space = HyperparameterSpace({
"quantized lognormal": Quantized(LogNormal(1.0, 0.5))
})
plot_distribution_space(quantized_hyperparameter_space, num_bins=DISCRETE_NUM_BINS)
