Entries Tagged as ''

Microsoft Neural Network — Step-by-step Predictions

A recent post on the MSDN Forums raises the issue of reconstructing the Microsoft Neural Network and reproducing the prediction behavior based on the model content. The forum did not allow a very detailed reply and, as I believe this is an interesting topic, I will give it another try in this post. As an example, I use a small neural network which is trained on two inputs, X (having the states A and B) and Y (C and D) in order to predict a Z discrete variable, having the states E and F.

This post is a bit large, so here is what you will find:

  • - a description of the topology of the Microsoft Neural Network, particularized for my example
  • - a step-by-step description of the prediction phases
  • - a set of DMX statements that exemplify how to extract network properties from a Microsoft Neural Network mining model content
  • - a spreadsheet containing the sample data I used as well as a sheet which contains the model content and uses cell formulae to reproduce the calculations that lead to predictions (the sheet can be used to execute predictions for this network)

The network topology

Once trained, a Microsoft Neural Network modelĀ  looks more or less like below:

image

During prediction, the input nodes are populated with values deriving from the input data. The values are linearly combined with the edges leading from input nodes to the middle (hidden) layer of nodes and the input vector is translated to a new vector, which has the same dimension as the hidden layer. The translated vector is then “activated” using the tanh function for each component. The resulting vector goes through a similar transformation, this time from the hidden layer to the output layer. Therefore, it is linearly converted to the output layer dimensionality (using the weight of the edges linking hidden nodes to output nodes). The result of this transformation is activated using the sigmoid function and the final result is the set of output probabilities. These probabilities are normalized before being returned as a result (in a call like PredictHistogram)

[Read more →]