Like logistic regression, probit regression is called a regression only to highlight its similarity to linear regression; it is in fact a classification method. This post gives a brief introduction to probit regression.
Logistic regression can be thought of as a generalisation of the linear model to classification. This article provides a quick overview and a Python example of binary and multi-class logistic regression.
This blog post summarizes the most often used evaluation metrics for binary classification.
Maximum a posteriori (MAP) estimation is yet another method of density estimation. Unlike maximum likelihood estimation, however, it is a Bayesian method, because it is based on the posterior probability. This post gives a brief introduction to MAP estimation.
Maximum likelihood estimation (MLE) is a method of estimating parameters in a probabilistic setting. It is based on finding the parameters of a probability distribution that maximise the likelihood function of the observed data. The idea is to find the probability density function under which the observed data is most probable, that is, most likely. This post gives a brief overview of MLE.
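To make the idea concrete, here is a minimal sketch assuming the data follows a Gaussian distribution (my choice for illustration, as are the function names and the synthetic data). For a Gaussian, the parameters that maximise the likelihood have a closed form: the sample mean and the sample standard deviation computed with a divisor of n.

```python
import numpy as np


def gaussian_log_likelihood(x, mu, sigma):
    """Log-likelihood of i.i.d. samples x under a Gaussian N(mu, sigma^2)."""
    n = x.size
    return (-0.5 * n * np.log(2 * np.pi * sigma ** 2)
            - np.sum((x - mu) ** 2) / (2 * sigma ** 2))


def gaussian_mle(x):
    """Closed-form maximum likelihood estimates for a Gaussian."""
    mu_hat = x.mean()
    sigma_hat = x.std()  # ddof=0 (the NumPy default) is the ML estimate
    return mu_hat, sigma_hat


# Synthetic data, assumed here purely for illustration.
rng = np.random.default_rng(0)
data = rng.normal(loc=2.0, scale=1.5, size=1000)

mu_hat, sigma_hat = gaussian_mle(data)
ll_at_mle = gaussian_log_likelihood(data, mu_hat, sigma_hat)
ll_elsewhere = gaussian_log_likelihood(data, mu_hat + 0.5, sigma_hat * 1.2)
```

Any other parameter pair, such as the perturbed one above, gives a log-likelihood no higher than the one at the estimates.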
This short post explains why there is no such thing as a free lunch in the ML world.
Some ML techniques work fine when you have only a few dimensions (d) but break down when you increase the dimensionality. The phenomena that arise only when analysing data in high-dimensional spaces (and that do not occur in low-dimensional settings) are referred to as the curse of dimensionality. This short post explains what the curse is and why it occurs.
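One commonly cited symptom is that distances concentrate: in high dimensions, the nearest and farthest neighbours of a point end up almost equally far away. The sketch below is only an illustration under assumed settings (uniform random points in the unit hypercube, a hypothetical `relative_contrast` helper), not a general proof.

```python
import numpy as np


def relative_contrast(dim, n_points=500, seed=0):
    """(max distance - min distance) / min distance from a random query point."""
    rng = np.random.default_rng(seed)
    points = rng.random((n_points, dim))  # uniform samples in [0, 1]^dim
    query = rng.random(dim)
    dists = np.linalg.norm(points - query, axis=1)
    return (dists.max() - dists.min()) / dists.min()


# The contrast shrinks as the number of dimensions d grows.
contrasts = {d: relative_contrast(d) for d in (2, 10, 100, 1000)}
for d, c in contrasts.items():
    print(f"d={d:5d}  relative contrast={c:.3f}")
```

When the contrast gets close to zero, "nearest neighbour" carries little information, which is one reason distance-based methods can struggle as d grows.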
Linear algebra is essential to understanding ML for three main reasons. First, when you read a book or an article on ML, models are very often explained with linear algebra, a consequence of its mathematical convenience, as the article explains. Second, many models are built on linear algebra methods. Third, deep learning makes extensive use of vectors. Either way, if ML interests you, you need to get through linear algebra. This article covers its most important notions with NumPy examples.
This post explains what happens when you get underflow and overflow errors. I think it is important to keep this in mind when you look at different models, in order to understand the difference between theory and implementation. I guess it is also an easy question to answer at interviews!
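As a quick illustration of the gap between theory and implementation, here is a sketch using the softmax function (my choice of example; the function names are made up for this snippet). On paper, both versions compute the same thing. In floating point, the naive one overflows for large inputs, while subtracting the maximum first keeps every exponent at or below zero.

```python
import numpy as np


def softmax_naive(z):
    """Textbook softmax; np.exp overflows to inf for large inputs."""
    with np.errstate(over="ignore", invalid="ignore"):
        e = np.exp(z)
        return e / e.sum()  # inf / inf yields nan


def softmax_stable(z):
    """Same function mathematically, shifted so the largest exponent is 0."""
    e = np.exp(z - np.max(z))
    return e / e.sum()


logits = np.array([1000.0, 1001.0, 1002.0])
naive = softmax_naive(logits)    # every entry is nan
stable = softmax_stable(logits)  # a valid probability vector

# Underflow: a value too small to represent is silently rounded to zero.
tiny = np.exp(-1000.0)
```

The shift does not change the result in exact arithmetic, because the factor exp(-max(z)) cancels between numerator and denominator.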
We mentioned in previous articles that probability theory can be used together with decision theory to predict a target. What is the link between the two, and where does decision theory fit in ML? This article tries to answer these questions.