Maximum margin classifiers: Also known as the optimal separating hyperplane. The margin is the distance between the hyperplane and the closest training data point. We want to select the hyperplane for which this distance is maximum. Once we identify the optimal separating hyperplane, there can be several training points that are equidistant from it at the shortest distance. Such points are… Continue reading Support Vector Machines
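
To make the margin definition concrete, here is a minimal sketch in plain Python. The training points and the two candidate lines are made up for illustration, and the helper names are my own; the code only compares margins and does not search for the optimal hyperplane.

```python
import math

def distance_to_hyperplane(w, b, x):
    """Perpendicular distance from point x to the hyperplane w.x + b = 0."""
    dot = sum(wi * xi for wi, xi in zip(w, x))
    norm = math.sqrt(sum(wi * wi for wi in w))
    return abs(dot + b) / norm

def margin(w, b, points):
    """Margin: distance from the hyperplane to the closest training point."""
    return min(distance_to_hyperplane(w, b, p) for p in points)

# Hypothetical 2-D training points (first two in one class, last two in the other).
points = [(1.0, 1.0), (2.0, 3.0), (4.0, 4.0), (5.0, 6.0)]

# Two hypothetical separating lines with the same direction but different offsets.
w = (1.0, 1.0)
margin_a = margin(w, -6.0, points)   # line x1 + x2 - 6.0 = 0
margin_b = margin(w, -6.5, points)   # line x1 + x2 - 6.5 = 0

# The second line sits midway between (2, 3) and (4, 4), so both points are
# equally close to it and its margin is the larger of the two.
closest_to_b = [p for p in points
                if math.isclose(distance_to_hyperplane(w, -6.5, p), margin_b)]
```

Here `closest_to_b` holds the two points that tie for the shortest distance to the second line, one from each side.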

# Comparing LDA and LR

A few points: LR models the probability with the logistic function. LDA models the probability with a multivariate Gaussian density. LR finds the maximum likelihood solution. LDA finds the maximum a posteriori class using Bayes' formula. When classes are well separated: When the classes are well-separated, the parameter estimates for logistic regression are surprisingly unstable. Coefficients may go to infinity. LDA doesn't… Continue reading Comparing LDA and LR
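
The remark that coefficients may go to infinity can be seen on a toy example. The sketch below is an assumption-laden illustration, not the post's own code: a one-feature logistic regression without intercept, fitted by plain gradient ascent on made-up, perfectly separable data.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def fit_weight(xs, ys, steps, lr=0.1):
    """Gradient ascent on the log-likelihood of a 1-D, no-intercept LR model."""
    w = 0.0
    for _ in range(steps):
        # Gradient of the log-likelihood with respect to w.
        grad = sum((y - sigmoid(w * x)) * x for x, y in zip(xs, ys))
        w += lr * grad
    return w

# Hypothetical, perfectly separated data: negatives on the left, positives on the right.
xs = [-2.0, -1.0, 1.0, 2.0]
ys = [0, 0, 1, 1]

w_after_100 = fit_weight(xs, ys, 100)
w_after_10000 = fit_weight(xs, ys, 10_000)
```

On separable data the gradient never reaches zero, so the weight keeps growing the longer we train instead of settling on a finite estimate.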

# On LDA, QDA

Linear Discriminant Analysis (LDA): In LR, we estimate the posterior probability directly. In LDA, we estimate the likelihood and then use Bayes' theorem. Calculating the posterior using Bayes' theorem is easy in the case of classification because the hypothesis space is limited (a finite set of classes). Equation 4 is derived from equation 3 only. Probability(k) would be highest for the class for which… Continue reading On LDA, QDA
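
As a rough sketch of "estimate the likelihood, then apply Bayes' theorem", the snippet below computes class posteriors for a single feature with Gaussian likelihoods that share one variance. The priors, means and variance are hypothetical values picked for illustration, not estimates from any data set.

```python
import math

def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def lda_posteriors(x, priors, means, var):
    """Posterior P(class k | x) via Bayes' theorem with a shared variance."""
    joint = {k: priors[k] * gaussian_pdf(x, means[k], var) for k in priors}
    evidence = sum(joint.values())   # sum over the finite set of classes
    return {k: joint[k] / evidence for k in joint}

# Hypothetical parameters for two classes.
priors = {"A": 0.5, "B": 0.5}
means = {"A": 0.0, "B": 4.0}

posteriors = lda_posteriors(1.0, priors, means, var=1.0)
predicted = max(posteriors, key=posteriors.get)   # class with the highest posterior
```

Because the sum in the denominator runs over only a handful of classes, the normalisation is cheap, which is the sense in which the posterior is easy to compute for classification.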

# Classification – One vs Rest and One vs One

In the blog post on Cost Function And Hypothesis for LR, we noted that LR (Logistic Regression) inherently models binary classification. Here we describe two approaches used to extend it to multiclass classification. The One vs Rest approach takes one class as positive and all the rest as negative, and trains a classifier. So for the data having… Continue reading Classification – One vs Rest and One vs One
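
The relabelling behind both approaches can be sketched without any classifier at all. The labels below are made up, and the function names are hypothetical helpers: One vs Rest builds one binary target per class, while One vs One (in its usual formulation) builds one binary problem per unordered pair of classes.

```python
from itertools import combinations

def one_vs_rest_labels(labels):
    """For each class, relabel the data as 1 (that class) or 0 (all the rest)."""
    classes = sorted(set(labels))
    return {c: [1 if y == c else 0 for y in labels] for c in classes}

def one_vs_one_pairs(labels):
    """Each unordered pair of classes gets its own binary problem."""
    return list(combinations(sorted(set(labels)), 2))

# Hypothetical three-class labels.
labels = ["cat", "dog", "bird", "dog", "cat"]

ovr_targets = one_vs_rest_labels(labels)   # one binary target vector per class
ovo_pairs = one_vs_one_pairs(labels)       # one class pair per binary classifier
```

With K classes this gives K binary targets for One vs Rest and K(K-1)/2 pairs for One vs One; a binary model such as LR would then be trained on each.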

# On Classification Accuracy

Some scenarios: In finance, default is the failure to meet the legal obligations of a loan. Given some data, we want to classify whether a person will default or not. Suppose our training dataset is imbalanced: out of 10k samples, only 300 (3%) are defaulters. The classifier in the following table is good at classifying non-defaulters… Continue reading On Classification Accuracy
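
To see why accuracy alone can mislead on this kind of data, consider a hypothetical classifier (my own assumption, not the one from the post's table) that never predicts a default. The counts below reuse the 10k samples and 300 defaulters from the scenario.

```python
n_samples = 10_000
n_defaulters = 300

# 1 = defaulter, 0 = non-defaulter.
y_true = [1] * n_defaulters + [0] * (n_samples - n_defaulters)

# Hypothetical classifier: always predicts "non-defaulter".
y_pred = [0] * n_samples

correct = sum(t == p for t, p in zip(y_true, y_pred))
accuracy = correct / n_samples

# Recall on the defaulter class: fraction of actual defaulters that were caught.
true_positives = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
recall = true_positives / n_defaulters
```

The accuracy comes out at 97% even though the recall on defaulters is zero, so the class we actually care about is missed entirely.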

# Cost Function And Hypothesis for LR

Hypothesis: We want a hypothesis that is bounded between zero and one; the linear regression hypothesis line extends beyond these limits. The hypothesis here also represents the probability of observing an outcome. Hypothesis by ISLR and Andrew Ng: Odds and log-odds/logit: In regression, beta1 gives the average change in y for a unit change in x. But here it says… Continue reading Cost Function And Hypothesis for LR
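
A small sketch of the bounded hypothesis and its link to odds and log-odds, using the standard logistic (sigmoid) form. The coefficients `beta0` and `beta1` are hypothetical values chosen only for illustration.

```python
import math

def sigmoid(z):
    """Logistic function: maps any real z into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def odds(p):
    """Odds of an event with probability p."""
    return p / (1.0 - p)

def logit(p):
    """Log-odds, the inverse of the sigmoid."""
    return math.log(odds(p))

# Hypothetical coefficients.
beta0, beta1 = -1.0, 0.5

def hypothesis(x):
    """Modelled probability of the positive outcome at x."""
    return sigmoid(beta0 + beta1 * x)

# Change in log-odds when x increases by one unit.
log_odds_step = logit(hypothesis(3.0)) - logit(hypothesis(2.0))
```

In this form the log-odds are linear in x, so `log_odds_step` equals `beta1` (up to floating-point error) while the probability itself stays between zero and one.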

# Hypothesis and T-Distribution

We calculate the t-score from the hypothesis data. We also get the degrees of freedom from the hypothesis data. We supply these values to a function that gives us the p-value: the probability of seeing a result at least this extreme if the null hypothesis were true. We can see that the t-test is a ratio, something like a signal-to-noise ratio. The numerator allows us to center it around zero. The denominator… Continue reading Hypothesis and T-Distribution
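
The signal-to-noise reading can be sketched for one common case, the one-sample t-test, which is my assumption about the setting here. The sample values and the hypothesised mean `mu0` are made up.

```python
import math
import statistics

def one_sample_t(sample, mu0):
    """Return the t-score and degrees of freedom for a one-sample t-test."""
    n = len(sample)
    mean = statistics.mean(sample)
    sd = statistics.stdev(sample)   # sample standard deviation (n - 1 denominator)
    signal = mean - mu0             # zero when the data agree with mu0
    noise = sd / math.sqrt(n)       # standard error of the mean
    return signal / noise, n - 1

# Hypothetical measurements and hypothesised mean.
sample = [5.1, 4.9, 5.6, 5.8, 6.0, 5.2]
t_score, dof = one_sample_t(sample, mu0=5.0)
```

The pair `(t_score, dof)` is what would be handed to a t-distribution routine (for example `scipy.stats.t.sf`, if SciPy is available) to obtain the tail probability.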

# Q – Q Plots

For the past few days I kept coming across Q-Q plots and decided to learn more about them. Many times we want our data to be normal, because normality is an assumption behind many statistical models. Now, how do we test for normality? Wikipedia has an article about this which lists many… Continue reading Q – Q Plots
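
A Q-Q plot pairs the sorted sample values with the quantiles a reference distribution would produce at the same positions. The sketch below computes those pairs against a normal distribution using only the standard library; the simulated sample and the `(i - 0.5) / n` plotting positions are my own choices, and the actual plotting is left out.

```python
from statistics import NormalDist, mean, stdev

def qq_points(sample):
    """Pair theoretical normal quantiles with the sorted sample values."""
    n = len(sample)
    ordered = sorted(sample)
    # Reference normal with the sample's own mean and standard deviation.
    reference = NormalDist(mean(sample), stdev(sample))
    # Plotting positions (i - 0.5) / n stay strictly inside (0, 1).
    theoretical = [reference.inv_cdf((i - 0.5) / n) for i in range(1, n + 1)]
    return list(zip(theoretical, ordered))

# Hypothetical data: 200 draws from a normal distribution (seeded for repeatability).
sample = NormalDist(10, 2).samples(200, seed=1)
points = qq_points(sample)
```

If the data are close to normal, the `(theoretical, observed)` pairs fall near the line y = x when drawn as a scatter plot; systematic bends away from that line hint at skew or heavy tails.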

# Probability Distribution

We have learned various probability distributions during high school and engineering courses. However, at times we forget them, so here I am providing simple practical scenarios for each distribution, with no theory involved. Bernoulli Distribution: When the random variable has just two outcomes. The probability that a drug/medicine will be approved by the government is p = 0.65… Continue reading Probability Distribution
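
For the drug-approval scenario with p = 0.65, a Bernoulli variable can be sketched in a few lines. The simulation size and seed are arbitrary choices of mine.

```python
import random

p = 0.65   # probability of approval, from the scenario above

def bernoulli_pmf(k, p):
    """P(X = k) for a Bernoulli variable: k is 1 (approved) or 0 (not approved)."""
    if k not in (0, 1):
        raise ValueError("a Bernoulli variable only takes the values 0 and 1")
    return p if k == 1 else 1 - p

mean = p                  # expected value of a Bernoulli variable
variance = p * (1 - p)    # its variance

# Simulate many independent approval decisions with a fixed seed.
rng = random.Random(0)
draws = [1 if rng.random() < p else 0 for _ in range(10_000)]
observed_rate = sum(draws) / len(draws)
```

Over many simulated decisions, `observed_rate` should land close to 0.65.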

# Interpreting Statistical Values

In the previous post we started exploring the statistical domain, and today we will dive in more deeply. Basically, we will try to see what all the values in summary(model) in R suggest. Here is a screenshot of how this summary looks: Significance of residuals? We want our residuals to be normally distributed and centered around… Continue reading Interpreting Statistical Values