## Testing the statistics functions

To enable repeatable testing of the statistical functions I’m creating, I decided to write a test script.

```
$data1 = @(1,2,3,4,5,6,7,8,9,10)
$data2 = @(21,22,23,24,25,26,27,28,29,30)

get-mean -numbers $data1
get-mean -numbers $data2

get-standarddeviation -numbers $data1
get-standarddeviation -numbers $data2

get-correlation -numbers1 $data1 -numbers2 $data2
```

Because the script always uses the same values and calls the functions in the same way, it generates repeatable results.

I’ll add extra lines to the script as I add more functions.
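One way to make the repeatable results self-checking is to compare each result against a value worked out by hand. The sketch below is a minimal illustration, not part of the original script: `Assert-Close` is a hypothetical helper name, and the tolerance is an assumed default. It uses `Measure-Object` directly so that it runs on its own.

```powershell
# Hypothetical helper (not part of the original script): throws if a result
# differs from the expected value by more than a small tolerance.
function Assert-Close {
    param (
        [double]$Actual,
        [double]$Expected,
        [double]$Tolerance = 1e-6   # assumed default, adjust as needed
    )
    if ([math]::Abs($Actual - $Expected) -gt $Tolerance) {
        throw "Expected $Expected but got $Actual"
    }
}

# Self-contained check: the mean of 1..10 is 55 / 10 = 5.5
$mean = (1..10 | Measure-Object -Average).Average
Assert-Close -Actual $mean -Expected 5.5
```

With the statistics functions loaded, the same helper could wrap the calls in the test script, for example `Assert-Close -Actual (get-mean -numbers $data1).Average -Expected 5.5`.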

## Calculating the correlation coefficient

This measures the degree of linear dependence between two sets of values:

+1 indicates perfect positive correlation

0 indicates no correlation

-1 indicates perfect negative correlation

We can calculate the correlation coefficient using this function:

```
function get-correlation {
    [CmdletBinding()]
    param (
        [double[]]$numbers1,
        [double[]]$numbers2
    )

    # Both samples must contain the same number of values
    $count1 = $numbers1.Length
    $count2 = $numbers2.Length
    if ($count1 -ne $count2) {
        throw "Samples are not of equal length"
    }

    # Mean and sample standard deviation of each dataset
    $avg1 = (get-mean -numbers $numbers1).Average
    $avg2 = (get-mean -numbers $numbers2).Average

    $sd1 = get-standarddeviation -numbers $numbers1
    $sd2 = get-standarddeviation -numbers $numbers2

    # Sum of the products of the paired deviations from the means
    $varsum = 0
    for ($i = 0; $i -lt $count1; $i++) {
        $varsum += ($numbers1[$i] - $avg1) * ($numbers2[$i] - $avg2)
    }

    # Divide by (n - 1) times the product of the standard deviations
    $correlation = $varsum / (($count1 - 1) * $sd1 * $sd2)
    $correlation
}
```

First, get the mean and standard deviation of the two datasets using our existing functions.

Next, calculate the sum of the products of the differences between each data point and the mean of its dataset.

Finally, divide that value by the product of the two standard deviations multiplied by the number of samples minus 1.
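Written as a formula, the calculation in the function is the sample correlation coefficient. The symbols are notation introduced here for illustration, not names from the code: $x_i$ and $y_i$ are the paired values in `$numbers1` and `$numbers2`, $\bar{x}$ and $\bar{y}$ are their means, $s_x$ and $s_y$ are their sample standard deviations, and $n$ is the number of pairs.

$$
r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{(n - 1)\, s_x\, s_y}
$$

For `$data1` and `$data2` in the test script, every value in the second set is the matching value in the first plus 20, so the deviations from the mean are identical in both sets. The numerator is then $82.5$ and the denominator is $9 \times s_x^2 = 9 \times \frac{82.5}{9} = 82.5$, so the result should be 1 (allowing for floating-point rounding).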

## Standard Deviation

Another simple calculation in PowerShell:

```
function get-standarddeviation {
    [CmdletBinding()]
    param (
        [double[]]$numbers
    )

    # Count and mean of the values
    $avg = $numbers | Measure-Object -Average | Select-Object Count, Average

    # Sum of the squared differences from the mean
    $popdev = 0
    foreach ($number in $numbers) {
        $popdev += [math]::Pow(($number - $avg.Average), 2)
    }

    # Divide by n - 1 (sample standard deviation), then take the square root
    $sd = [math]::Sqrt($popdev / ($avg.Count - 1))
    $sd
}
```

Take the numbers and calculate the average, as we saw last time.

Sum the squares of the differences between each value and the mean. Divide by the number of samples minus 1 (Bessel’s correction, which applies because we assume we are dealing with a sample rather than the whole population) and then take the square root.
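As a formula, this is the sample standard deviation. The symbols are notation introduced here for illustration: $x_i$ are the values in `$numbers`, $\bar{x}$ is their mean, and $n$ is the count.

$$
s = \sqrt{\frac{1}{n - 1} \sum_{i=1}^{n} (x_i - \bar{x})^2}
$$

As a hand check, for the values 1 to 10 the mean is $5.5$ and the squared differences sum to $82.5$, so

$$
s = \sqrt{\frac{82.5}{9}} \approx 3.0277
$$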

## Mean and moody

I’ve been looking at some simple statistical calculations. First off is calculating the mean (the arithmetic mean, or “average” in layman’s terms).

For this we can use Measure-Object:

```
function get-mean {
    [CmdletBinding()]
    param (
        [double[]]$numbers
    )

    # Measure-Object supplies both the count and the average
    $result = $numbers | Measure-Object -Average | Select-Object Count, Average
    $result
}
```

We can use the function like this:

```
get-mean -numbers $(1..10)
get-mean -numbers $(1..100)
(get-mean -numbers $(1..100)).Average
```
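For reference, the arithmetic mean that `Measure-Object -Average` reports can be written as follows, where $x_i$ and $n$ are notation introduced here for the values and their count:

$$
\bar{x} = \frac{1}{n} \sum_{i=1}^{n} x_i
$$

The two ranges used above are easy to check by hand:

$$
\frac{1 + 2 + \cdots + 10}{10} = \frac{55}{10} = 5.5
\qquad
\frac{1 + 2 + \cdots + 100}{100} = \frac{5050}{100} = 50.5
$$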