The clusteror.nn Module

This module contains classes for neural networks.

class clusteror.nn.SdA(n_ins, hidden_layers_sizes, np_rs=None, theano_rs=None, field_importance=None, input_data=None)

Bases: object

Stacked Denoising Autoencoder (SdA) class.

An SdA model is obtained by stacking several dAs. The hidden layer of the dA at layer i becomes the input of the dA at layer i+1. The first-layer dA takes the input of the SdA as its input, and the hidden layer of the last dA represents the output. Note that after pretraining, the SdA is treated as a normal MLP; the dAs are only used to initialize the weights.
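
The stacking described above can be sketched in plain NumPy. This is a minimal illustration, not the module's Theano implementation: the sigmoid activation, the helper name `final_hidden_layer`, and the weight initialisation range are all assumptions made for the example.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation (assumed here; the source does not name the activation)."""
    return 1.0 / (1.0 + np.exp(-z))


def final_hidden_layer(x, layers):
    """Feed x through each (W, bhid) pair in order and return the last hidden values."""
    hidden = x
    for W, bhid in layers:
        # The hidden layer of layer i becomes the input of layer i+1.
        hidden = sigmoid(hidden @ W + bhid)
    return hidden


rng = np.random.RandomState(0)
n_ins, hidden_layers_sizes = 6, [4, 2]
sizes = [n_ins] + hidden_layers_sizes

# One (W, bhid) pair per hidden layer; W has dimension (n_visible, n_hidden).
layers = [
    (rng.uniform(-0.1, 0.1, size=(n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]

x = rng.rand(3, n_ins)  # three rows of input data
out = final_hidden_layer(x, layers)
print(out.shape)
```

The output has one column per unit of the last hidden layer, which is what the SdA exposes as its output.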

Parameters:
  • n_ins (int) – Input dimension.
  • hidden_layers_sizes (list of int) – Size of each hidden layer. One hidden layer is created per entry in the list.
  • np_rs (Numpy function) – Numpy random state.
  • theano_rs (Theano function) – Theano random generator that gives symbolic random values.
  • field_importance (list or Numpy array) – Weight put on each field when calculating the cost. If not given, all fields are given an equal weight of one.
  • input_data (Theano symbolic variable) – Variable for input data.
theano_rs

Theano function – Theano random generator that gives symbolic random values.

field_importance

list or Numpy array – Weight put on each field when calculating the cost. If not given, all fields are given an equal weight of one.

W

Theano shared variable – Weight matrix. Dimension (n_visible, n_hidden).

W_prime

Theano shared variable – Transposed weight matrix. Dimension (n_hidden, n_visible).

bhid

Theano shared variable – Bias on output side. Dimension n_hidden.

bvis

Theano shared variable – Bias on input side. Dimension n_visible.

x

Theano symbolic variable – Used as input to build graph.

params

list – List that packs the neural network parameters.

dA_layers

list – List that keeps dA instances.

n_layers

int – Number of hidden layers, len(dA_layers).

get_final_hidden_layer(input_data)

Computes the values of the last hidden layer.

Parameters: input_data (Theano symbolic variable) – Data input to neural network.
Returns: A graph with output as the hidden layer values.
Return type: Theano graph

get_first_reconstructed_input(hidden)

Computes the reconstructed input given the values of the last hidden layer.

Parameters: hidden (Theano symbolic variable) – Data input to neural network at the hidden layer side.
Returns: A graph with output as the reconstructed data at the visible side.
Return type: Theano graph

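
Reconstructing the first-layer input from the last hidden layer amounts to walking the stack in reverse. The NumPy sketch below illustrates that idea under stated assumptions: sigmoid activations, decoding weights equal to the transpose of each layer's W, and a hypothetical helper name `first_reconstructed_input`. The module itself builds a Theano graph instead.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation (an assumption for this sketch)."""
    return 1.0 / (1.0 + np.exp(-z))


def first_reconstructed_input(hidden, layers):
    """Decode from the last hidden layer back to the visible side.

    layers is a list of (W, bvis) pairs ordered from the first layer to the last.
    """
    reconstructed = hidden
    for W, bvis in reversed(layers):
        # W.T plays the role of W_prime, dimension (n_hidden, n_visible).
        reconstructed = sigmoid(reconstructed @ W.T + bvis)
    return reconstructed


rng = np.random.RandomState(0)
sizes = [6, 4, 2]  # visible size followed by two hidden layer sizes
layers = [
    (rng.uniform(-0.1, 0.1, size=(n_in, n_out)), np.zeros(n_in))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]

hidden = rng.rand(3, sizes[-1])  # values at the last hidden layer
recon = first_reconstructed_input(hidden, layers)
print(recon.shape)
```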
pretraining_functions(train_set, batch_size)

This function creates, for each dA layer, a Theano function that computes the cost and applies the updates for one training step of that dA.

Parameters:
  • train_set (Theano shared variable) – The complete training dataset.
  • batch_size (int) – Number of rows for each mini-batch.
Returns:

Theano functions, one per dA layer, each running one training step on its layer.

Return type:

List
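
A greedy layer-wise pretraining schedule that consumes such a list of functions might look like the sketch below. The source does not state which arguments the returned Theano functions accept, so the single `batch_index` argument, the `pretrain` helper, and the stand-in step functions are all hypothetical and exist only to make the loop runnable.

```python
def pretrain(step_functions, n_rows, batch_size, n_epochs):
    """Train layer 0 for all epochs, then layer 1, and so on.

    Returns a list of (layer_index, epoch, mean_cost) tuples.
    """
    n_batches = n_rows // batch_size
    if n_batches == 0:
        raise ValueError("batch_size is larger than the number of rows")
    history = []
    for layer_index, step in enumerate(step_functions):
        for epoch in range(n_epochs):
            # One call per mini-batch; each call is one training step.
            costs = [step(batch_index) for batch_index in range(n_batches)]
            history.append((layer_index, epoch, sum(costs) / len(costs)))
    return history


# Hypothetical stand-ins for the per-layer training functions.
calls = []


def make_step(layer_index):
    def step(batch_index):
        calls.append((layer_index, batch_index))
        return 1.0 / (1 + batch_index)  # dummy cost value
    return step


step_functions = [make_step(i) for i in range(2)]
history = pretrain(step_functions, n_rows=10, batch_size=5, n_epochs=3)
print(len(calls), len(history))
```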

class clusteror.nn.dA(n_visible, n_hidden, np_rs=None, theano_rs=None, field_importance=None, initial_W=None, initial_bvis=None, initial_bhid=None, input_data=None)

Bases: object

Denoising Autoencoder (dA) class.

Parameters:
  • n_visible (int) – Input dimension.
  • n_hidden (int) – Output dimension.
  • np_rs (Numpy function) – Numpy random state.
  • theano_rs (Theano function) – Theano random generator that gives symbolic random values.
  • field_importance (list or Numpy array) – Weight put on each field when calculating the cost. If not given, all fields are given an equal weight of one.
  • initial_W (Numpy matrix) – Initial weight matrix. Dimension (n_visible, n_hidden).
  • initial_bvis (Numpy array) – Initial bias on input side. Dimension n_visible.
  • initial_bhid (Numpy array) – Initial bias on output side. Dimension n_hidden.
  • input_data (Theano symbolic variable) – Variable for input data.
theano_rs

Theano function – Theano random generator that gives symbolic random values.

field_importance

list or Numpy array – Weight put on each field when calculating the cost. If not given, all fields are given an equal weight of one.

W

Theano shared variable – Weight matrix. Dimension (n_visible, n_hidden).

W_prime

Theano shared variable – Transposed weight matrix. Dimension (n_hidden, n_visible).

bhid

Theano shared variable – Bias on output side. Dimension n_hidden.

bvis

Theano shared variable – Bias on input side. Dimension n_visible.

x

Theano symbolic variable – Used as input to build graph.

params

list – List that packs the neural network parameters.

get_corrupted_input(input_data, corruption_level)

Corrupts the input by multiplying input with an array of zeros and ones that is generated by binomial trials.

Parameters:
  • input_data (Theano symbolic variable) – Data input to neural network.
  • corruption_level (float or Theano symbolic variable) – Probability of corrupting a bit in the input data. Must be between 0 and 1.
Returns:

A graph with output as the corrupted input.

Return type:

Theano graph
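
The masking step can be illustrated with NumPy in place of Theano's symbolic random generator. This is a sketch of the technique only: the helper name `corrupt_input` is hypothetical, and drawing the mask with success probability `1 - corruption_level` is an assumption consistent with `corruption_level` being the probability of corruption.

```python
import numpy as np


def corrupt_input(x, corruption_level, rng):
    """Zero out entries of x, each with probability corruption_level."""
    # Each mask entry is a single binomial trial: 1 keeps the value, 0 drops it.
    mask = rng.binomial(n=1, p=1.0 - corruption_level, size=x.shape)
    return mask * x


rng = np.random.RandomState(42)
x = np.ones((4, 5))

untouched = corrupt_input(x, 0.0, rng)  # nothing is corrupted
wiped = corrupt_input(x, 1.0, rng)      # everything is corrupted
partial = corrupt_input(x, 0.3, rng)    # roughly 30% of entries set to zero
print(partial)
```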

get_cost_updates(corruption_level, learning_rate)

This function computes the cost and the updates for one training step of the dA.

Parameters:
  • corruption_level (float or Theano symbolic variable) – Probability of corrupting a bit in the input data. Must be between 0 and 1.
  • learning_rate (float or Theano symbolic variable) – Step size for Gradient Descent algorithm.
Returns:

  • cost (Theano graph) – A graph with output as the cost.
  • updates (List of tuples) – Instructions of how to update parameters. Used in training stage to update parameters.
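
To make the two return values concrete, the sketch below shows one plausible shape for them in NumPy. The exact cost used by the module is not stated here, so the field-weighted squared error is an assumption, as are the helper names `weighted_cost` and `sgd_updates`. In the module the gradients would come from Theano's symbolic differentiation; in this sketch they are supplied by hand.

```python
import numpy as np


def weighted_cost(x, reconstructed, field_importance):
    """Mean over rows of the squared error, weighted per field (assumed cost form)."""
    per_row = np.sum(field_importance * (x - reconstructed) ** 2, axis=1)
    return per_row.mean()


def sgd_updates(params, grads, learning_rate):
    """Return a list of (parameter, new_value) tuples for one gradient descent step."""
    return [(p, p - learning_rate * g) for p, g in zip(params, grads)]


x = np.array([[1.0, 0.0], [0.0, 1.0]])
reconstructed = np.array([[0.5, 0.0], [0.0, 0.0]])
field_importance = np.array([1.0, 2.0])  # second field counts twice as much

cost = weighted_cost(x, reconstructed, field_importance)

params = [np.array([1.0, 2.0])]
grads = [np.array([0.5, -0.5])]  # hand-written gradient, for illustration only
updates = sgd_updates(params, grads, learning_rate=0.1)
print(cost, updates[0][1])
```

Raising an entry of `field_importance` makes reconstruction errors on that field more expensive, which is the role the parameter description gives it.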

get_hidden_values(input_data)

Computes the values of the hidden layer.

Parameters: input_data (Theano symbolic variable) – Data input to neural network.
Returns: A graph with output as the hidden layer values.
Return type: Theano graph

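
As a NumPy analogue of the encoding step, the hidden values can be sketched as an affine map followed by a nonlinearity. The sigmoid is an assumption (the activation is not named in this reference), and `hidden_values` is a hypothetical helper; the dimensions follow the documented shapes of W and bhid.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation (an assumption for this sketch)."""
    return 1.0 / (1.0 + np.exp(-z))


def hidden_values(x, W, bhid):
    """Map rows of x from the visible side (n_visible) to the hidden side (n_hidden)."""
    return sigmoid(x @ W + bhid)


n_visible, n_hidden = 5, 3
W = np.zeros((n_visible, n_hidden))  # dimension (n_visible, n_hidden)
bhid = np.zeros(n_hidden)            # dimension n_hidden

x = np.ones((2, n_visible))
h = hidden_values(x, W, bhid)
print(h)  # every entry is sigmoid(0) = 0.5 because W and bhid are zero
```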
get_reconstructed_input(hidden)

Computes the reconstructed input given the values of the hidden layer.

Parameters: hidden (Theano symbolic variable) – Data input to neural network at the hidden layer side.
Returns: A graph with output as the reconstructed data at the visible side.
Return type: Theano graph
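
The decoding step mirrors the encoding step, using W_prime (the transpose of W) and the visible-side bias. The NumPy sketch below assumes a sigmoid activation and uses a hypothetical helper `reconstructed_input`; it is meant to show the shapes involved rather than reproduce the module's Theano graph.

```python
import numpy as np


def sigmoid(z):
    """Logistic activation (an assumption for this sketch)."""
    return 1.0 / (1.0 + np.exp(-z))


def reconstructed_input(hidden, W, bvis):
    """Map rows of hidden (n_hidden) back to the visible side (n_visible)."""
    W_prime = W.T  # transposed weight matrix, dimension (n_hidden, n_visible)
    return sigmoid(hidden @ W_prime + bvis)


n_visible, n_hidden = 5, 3
rng = np.random.RandomState(1)
W = rng.uniform(-0.1, 0.1, size=(n_visible, n_hidden))
bvis = np.zeros(n_visible)  # dimension n_visible

hidden = rng.rand(2, n_hidden)
recon = reconstructed_input(hidden, W, bvis)
print(recon.shape)
```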