Bayes’ theorem is used for classification and for predicting values from given data sets. A classifier built on it is known as the Naive Bayes classifier, which is widely used in machine learning.

Before learning the Bayes’ theorem formula, you should have some conceptual knowledge of what the theorem says:

In probability theory, **Bayes’ theorem** describes the probability of an event based on prior knowledge of conditions that might be related to the event. For example, if a disease is related to age, then, using **Bayes’ theorem**, a person’s age can be used to assess the probability that they have the disease more accurately than a judgement made without knowledge of the person’s age.

One of the many applications of Bayes’ theorem is updating the probability of a hypothesis as more evidence or information becomes available. This is called **Bayesian inference**. Bayesian inference is an important method in **mathematical statistics**.

The Bayesian inference technique can be applied in almost all fields, such as *science, engineering, sports, and medicine*.

Bayesian inference is closely related to *subjective probability*, often called **Bayesian probability**.

**Subjective probability** is a probability derived from personal judgement about whether a particular outcome is likely to occur. It involves no formal calculation and only reflects a person’s opinions and past experience, so it may differ from person to person. For example, ask a person flipping a coin what the probability of getting tails is over 5 tosses, and they might answer 30%; if the coin then lands tails 4 times, they might revise that probability to 80% or more.

Bayes’ theorem is stated mathematically as the following equation:

**P(A|B) = P(B|A)*P(A)/P(B)**

where

- **A** and **B** are events and **P(B) ≠ 0**.
- **P(A|B)** is a conditional probability: the likelihood of event **A** occurring given that **B** is true.
- **P(B|A)** is also a conditional probability: the likelihood of event **B** occurring given that **A** is true.
- **P(A)** and **P(B)** are the probabilities of observing **A** and **B** independently of each other; each is known as a **marginal probability**.
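As a quick sketch, the formula can be evaluated directly in Python. The disease-testing numbers below (1% prevalence, 90% sensitivity, 9.6% false-positive rate) are assumed purely for illustration and are not from the article:

```python
def bayes(p_b_given_a, p_a, p_b):
    """Return P(A|B) via Bayes' theorem: P(B|A) * P(A) / P(B)."""
    if p_b == 0:
        raise ValueError("P(B) must be non-zero")
    return p_b_given_a * p_a / p_b

# Assumed illustrative numbers: a disease with 1% prevalence, and a test
# with 90% sensitivity and a 9.6% false-positive rate on healthy people.
p_disease = 0.01
p_pos_given_disease = 0.90
# P(positive) by the law of total probability:
p_pos = p_pos_given_disease * p_disease + 0.096 * (1 - p_disease)

p_disease_given_pos = bayes(p_pos_given_disease, p_disease, p_pos)
print(p_disease_given_pos)  # roughly 0.087: a positive test leaves only ~8.7% chance
```

Note how small the posterior is even though the test is quite accurate: the low prior P(A) dominates, which is exactly the kind of update Bayes’ theorem captures.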

**Marginal probability:** The marginal distribution of a subset of a collection of random variables is the probability distribution of the variables contained in that subset. It gives the probabilities of the various values of the variables in the subset without reference to the values of the other variables.

**Real world example**

Suppose we want the probability that a person gets wet while walking down the lane. Let **R** be a random variable taking one value from {**wet, dry**}, and let **S** (the weather) be a discrete random variable taking one value from {**winter, summer, rainy**}.

Here **R** is dependent on **S**: **P(R = wet)** takes different values depending on whether **S** is winter, rainy, or summer (and likewise **P(R = dry)**). A person, for example, is far more likely to get wet walking in the rainy season than in summer or winter. In other words, for any given pair of values for **R** and **S**, one must consider the joint probability distribution of **R** and **S** to find the probability of that pair of events occurring together.

However, in calculating the **marginal probability** **P(R = wet)**, what we are asking for is the probability that **R = wet** when we do not know the particular value of **S**, i.e. when the state of the weather is ignored.
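Marginalization is just summing the joint distribution over the variable we ignore. The joint probabilities below are invented for illustration (they only need to sum to 1 over all pairs):

```python
# Joint distribution P(R, S) over rain status and season.
# These numbers are made up for illustration only.
joint = {
    ("wet", "winter"): 0.05,
    ("wet", "summer"): 0.05,
    ("wet", "rainy"):  0.30,
    ("dry", "winter"): 0.20,
    ("dry", "summer"): 0.25,
    ("dry", "rainy"):  0.15,
}

# Marginal: P(R = wet) = sum over every season s of P(R = wet, S = s)
p_wet = sum(p for (r, s), p in joint.items() if r == "wet")
print(round(p_wet, 2))  # 0.4
```

The season never appears in the answer: it has been “marginalized out,” which is exactly what ignoring the state of the weather means.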

Let’s take a mathematical problem to understand the workflow of Bayes’ theorem.

Let’s look at the steps for calculating the probabilities we need: the individual probabilities and the conditional probabilities.

Step 1: (A) **Find the individual (prior) probabilities**

Suppose we have 2 boxes, **A** and **B**. We have to calculate the probability of selecting each box, i.e. **P(A)** and **P(B)**.

This is simple, since we have to select one box out of two, so

**P(A) = 1/2 **and** P(B) = 1/2**

The individual probabilities may also be given directly, for example with three boxes:

**P(A) = 60%, P(B) = 30% **and** P(C) = 10% [ P(A) = 0.6, P(B) = 0.3 **and** P(C) = 0.1]**

Step 2: (B) **Find the conditional probability**

This is represented as **P(x|A)**, where **x** is the event of selecting an element from the set/collection **A**.

Suppose box **A** contains 5 red and 3 white balls. Calculate the probability of selecting a red ball from it.

**P(x|A) = (number of red balls)/(total number of balls) = 5/8**

Bayes’ formula: **P(A|x) = P(x|A)·P(A) / [P(x|A)·P(A) + P(x|B)·P(B)]**

In the above problem we calculated the probability of selecting a red ball from box **A**, which is **P(x|A)**. Bayes’ theorem answers the **opposite** question, **P(A|x)**: given that a red ball was selected, what is the probability it came from box **A**?
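Putting the two steps together, the whole workflow fits in a few lines. The article does not state box B’s contents, so the composition of box B below (2 red, 6 white) is an assumption made purely to complete the example:

```python
# Priors from the article: one box selected out of two.
p_a, p_b = 0.5, 0.5

# Conditional probabilities of drawing a red ball (event x) from each box.
p_x_given_a = 5 / 8  # box A: 5 red, 3 white (from the article)
p_x_given_b = 2 / 8  # box B: 2 red, 6 white (ASSUMED for illustration)

# Bayes' formula: P(A|x) = P(x|A)P(A) / [P(x|A)P(A) + P(x|B)P(B)]
p_a_given_x = (p_x_given_a * p_a) / (p_x_given_a * p_a + p_x_given_b * p_b)
print(p_a_given_x)  # 5/7 ≈ 0.714
```

So under this assumed setup, a red ball is more than twice as likely to have come from box A as from box B, and the posterior P(A|x) reflects that.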

Here I am attaching a handwritten solved example of the Naive Bayes classifier.
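To connect this back to the Naive Bayes classifier mentioned at the start, here is a minimal from-scratch sketch. The toy weather data set is invented for illustration (it is not the handwritten example), and the classifier uses a single categorical feature with Laplace smoothing:

```python
from collections import Counter, defaultdict

# Toy training data, invented for illustration: (outlook, did they play?).
data = [
    ("rainy", "no"), ("rainy", "no"), ("sunny", "yes"),
    ("sunny", "yes"), ("overcast", "yes"), ("rainy", "yes"),
]

class_counts = Counter(y for _, y in data)       # counts -> priors P(y)
feature_counts = defaultdict(Counter)            # counts -> likelihoods P(x|y)
for x, y in data:
    feature_counts[y][x] += 1
vocab_size = len({x for x, _ in data})           # number of distinct outlooks

def predict(x):
    """Pick the class maximizing P(y) * P(x|y), with Laplace smoothing."""
    def score(y):
        prior = class_counts[y] / len(data)
        likelihood = (feature_counts[y][x] + 1) / (class_counts[y] + vocab_size)
        return prior * likelihood
    return max(class_counts, key=score)

print(predict("rainy"))  # "no"  (rainy days are mostly no-play in this data)
print(predict("sunny"))  # "yes"
```

This is exactly Bayes’ formula applied per class, with the denominator dropped since it is the same for every class being compared.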


If you have any doubts, please mention them in the comment section or shoot me an email at khanirfan.khan21@gmail.com.

Categories: Deep learning, Machine Learning, R