Monday, February 27, 2017

Anomaly detection - Andrew Ng

Anomaly detection vs supervised learning - when anomalous examples are
too few to train a classifier on, go for anomaly detection

Anomaly detection - choosing features - each feature should look
roughly Normal (Gaussian). Plot a histogram to check. If it is skewed,
try a transform such as log(x), log(x+c), x^0.5, x^0.2 etc. Also try
combinations of features: CPU/Network traffic, CPU^2/Network traffic
etc
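A quick way to compare the transforms above without eyeballing every
histogram is to look at the skewness, which is close to 0 for a
symmetric bell shape. This is only a sketch: the data is synthetic
(log-normal, chosen because it is right-skewed), and skewness() is a
helper written for this note, not something from the course.

```python
import numpy as np

rng = np.random.default_rng(0)


def skewness(x):
    """Sample skewness: roughly 0 for a symmetric, bell-shaped histogram."""
    x = np.asarray(x, dtype=float)
    centered = x - x.mean()
    return (centered ** 3).mean() / (centered ** 2).mean() ** 1.5


# Hypothetical right-skewed feature (synthetic data, strictly positive).
x = rng.lognormal(mean=0.0, sigma=1.0, size=5000)

# Candidate transforms from the notes; c = 1 in log(x + c) is an arbitrary choice.
candidates = {
    "x": x,
    "log(x)": np.log(x),
    "log(x + 1)": np.log(x + 1.0),
    "x^0.5": x ** 0.5,
    "x^0.2": x ** 0.2,
}

for name, values in candidates.items():
    print(f"{name:>10}: skewness = {skewness(values):+.2f}")
```

Pick the transform whose skewness is closest to 0, then confirm with a
histogram. For real features with zeros or negatives, log(x) is not
defined, which is where log(x+c) comes in.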

Multivariate Normal distribution - let's say memory is unusually high
for a given CPU load. Each value on its own has a good enough
probability of occurring, but the two sit on opposite sides of their
respective bell curves, so the combination is unusual. Modelling each
feature separately misses this, so we would go for the multivariate
Normal distribution.
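A small sketch of the situation described above, on synthetic data. The
two columns stand in for standardized CPU load and memory use with a
correlation of 0.9 (an assumption made for illustration), and both
density functions are helpers written for this note.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic, strongly correlated features (hypothetical cpu and memory).
cov_true = np.array([[1.0, 0.9],
                     [0.9, 1.0]])
X = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov_true, size=2000)

# Fit both models on the same data.
mu = X.mean(axis=0)
var = X.var(axis=0)                         # per-feature variances
Sigma = np.cov(X, rowvar=False, bias=True)  # full covariance matrix


def independent_density(x, mu, var):
    """Product of one-dimensional Gaussian densities, one per feature."""
    per_feature = np.exp(-(x - mu) ** 2 / (2 * var)) / np.sqrt(2 * np.pi * var)
    return np.prod(per_feature)


def multivariate_density(x, mu, Sigma):
    """Multivariate Gaussian density with a full covariance matrix."""
    n = mu.shape[0]
    diff = x - mu
    maha = diff @ np.linalg.solve(Sigma, diff)  # (x-mu)^T Sigma^-1 (x-mu)
    norm = np.sqrt((2 * np.pi) ** n * np.linalg.det(Sigma))
    return np.exp(-0.5 * maha) / norm


typical = np.array([1.5, 1.5])      # high cpu, high memory: follows the trend
suspicious = np.array([-1.5, 1.5])  # low cpu, high memory: against the trend

for name, point in [("typical", typical), ("suspicious", suspicious)]:
    print(name,
          "independent:", independent_density(point, mu, var),
          "multivariate:", multivariate_density(point, mu, Sigma))
```

The independent model gives the two points nearly the same density,
because each coordinate is equally far from its own mean. The
multivariate model gives the second point a far smaller density, so a
threshold epsilon can separate them.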

Modelling each feature independently as a Gaussian and multiplying the
densities is the same as a multivariate Gaussian whose contours are
axis-aligned, i.e. all off-diagonal entries of the covariance matrix
are zero.
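To see why, write out the multivariate density and plug in a diagonal
covariance matrix. The notation (n features, means mu_j, variances
sigma_j^2) is the usual one and is assumed here, since the notes do not
spell it out.

```latex
p(x;\mu,\Sigma)
  = \frac{1}{(2\pi)^{n/2}\,|\Sigma|^{1/2}}
    \exp\!\left(-\tfrac{1}{2}(x-\mu)^{T}\Sigma^{-1}(x-\mu)\right)

% With \Sigma = \mathrm{diag}(\sigma_1^2,\dots,\sigma_n^2):
|\Sigma|^{1/2} = \prod_{j=1}^{n}\sigma_j,
\qquad
(x-\mu)^{T}\Sigma^{-1}(x-\mu) = \sum_{j=1}^{n}\frac{(x_j-\mu_j)^2}{\sigma_j^2}

% so the density factorizes into one Gaussian per feature:
p(x;\mu,\Sigma)
  = \prod_{j=1}^{n}\frac{1}{\sqrt{2\pi}\,\sigma_j}
    \exp\!\left(-\frac{(x_j-\mu_j)^2}{2\sigma_j^2}\right)
```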

The multivariate model captures correlations between features
automatically. With the original model you have to create features for
those unusual combinations manually.

But the original model is computationally cheaper and scales to a
large number of features. In MV, you have to invert the n x n
covariance matrix, which gets expensive as n grows.

In MV you need m > n, i.e. the number of examples should be more than
the number of features, since otherwise the covariance matrix cannot be
inverted. Not so in the original model.

In MV, the covariance matrix (Sigma) should be invertible. It will not
be invertible if there are redundant features, i.e. linearly dependent
features like x2 = x1 or x3 = x4 + x5 etc.
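Both ways of ending up with a non-invertible covariance matrix (too few
examples, or a redundant feature) can be checked by looking at the rank
of Sigma. A rough sketch on random data: covariance() and describe()
are helpers written for this note, and the tolerance of 1e-10 is an
arbitrary choice for deciding what counts as numerically zero.

```python
import numpy as np

rng = np.random.default_rng(0)


def covariance(X):
    """Maximum-likelihood covariance estimate (divides by m)."""
    centered = X - X.mean(axis=0)
    return centered.T @ centered / X.shape[0]


def describe(label, X):
    """Print and return the rank of the covariance matrix of X."""
    Sigma = covariance(X)
    m, n = X.shape
    rank = int(np.linalg.matrix_rank(Sigma, tol=1e-10))
    print(f"{label}: m={m}, n={n}, rank(Sigma)={rank}, invertible={rank == n}")
    return rank


# Case 1: fewer examples than features (m = 3, n = 5).
X_few = rng.normal(size=(3, 5))
rank_few = describe("m <= n", X_few)

# Case 2: plenty of examples, but the third feature is x1 + x2.
base = rng.normal(size=(200, 2))
X_redundant = np.column_stack([base, base[:, 0] + base[:, 1]])
rank_redundant = describe("redundant feature", X_redundant)

# Case 3: plenty of examples, no redundant features.
X_good = rng.normal(size=(200, 3))
rank_good = describe("m > n, independent", X_good)
```

Sigma is invertible only when its rank equals n. In the first two cases
the rank falls short, so the fix is to collect more examples or drop
the redundant feature.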
