
Difference Between Conditional and Marginal Distribution (Explained)

by Logan

Probability is a branch of mathematics that quantifies how likely a certain event is to occur for a given set of data. It gives a numerical interpretation to the likelihood of obtaining the desired result.

The probability of any event occurring falls between zero and one. Zero denotes that the event cannot occur, and one denotes that the event is certain to occur (a likelihood of 100%).

The study of probability enables us to predict or judge the chances of success or failure of any desired event and take measures to improve it.

For example, when testing a new product, a high probability of failure signifies a low-quality product. Quantifying chances of failure or success can help the manufacturers improve their product quality and experience.

In data analytics, marginal and conditional distributions are used to find the probability in bivariate data. But before we jump into that, let’s go through some basics.

Basics of Probability

A frequently used term in probability is ‘random variable’. A random variable is used to quantify the outcomes of a random event taking place.

For example, a school conducts research to predict the performance of its students in Mathematics in the upcoming exams, based on their previous performance. The research is confined to a total of 110 students from the 6th to 8th standard. Let a random variable “X” denote the grade obtained. The following table shows the collected data:

| Grades | Number of students |
| --- | --- |
| A+ | 14 |
| A- | 29 |
| B | 35 |
| C | 19 |
| D | 8 |
| E | 5 |
| Total students | 110 |
Data Sample

P(X=A+) = 14/110 = 0.1273

0.1273 × 100 = 12.7%

This suggests that, based on past performance, about 12.7% of the students are expected to score an A+ in their upcoming exams.
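The same calculation can be repeated for every grade. The following is a minimal sketch in Python, assuming the counts from the table above; the names `grade_counts` and `grade_probs` are only illustrative.

```python
# Grade counts copied from the table above.
grade_counts = {"A+": 14, "A-": 29, "B": 35, "C": 19, "D": 8, "E": 5}

total = sum(grade_counts.values())  # 110 students in the sample

# Relative frequency of each grade, used as an estimate of P(X = grade).
grade_probs = {grade: count / total for grade, count in grade_counts.items()}

for grade, p in grade_probs.items():
    print(f"P(X={grade}) = {grade_counts[grade]}/{total} = {p:.4f} ({p:.1%})")
```

Because every student falls into exactly one grade, the estimated probabilities add up to 1.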

What if the school also wants to analyze the grades of students with respect to their classes? For instance, how many of the 12.7% of students scoring an A+ belong to the 8th standard?

Dealing with a single random variable is pretty simple, but when your data is distributed with respect to two random variables, the calculations can be a bit complex.

Two of the simplest ways of extracting relevant information from bivariate data are marginal and conditional distributions.

To visually explain the basics of probability, here’s a video from Math Antics:

Math Antics – Basic Probability

What Is Meant By Marginal Distribution?

Marginal distribution, or marginal probability, is the distribution of one variable regardless of the value of the other variable. It depends on only one of the two events occurring, while summing over all the possibilities of the other event.

It’s easier to understand the concept of marginal distribution when data is represented in a tabular form. The term marginal refers to the fact that these totals appear along the margins of the table.

The following table shows the grades of 110 students from the 6th to 8th standard. We can use this information to predict grades in their upcoming mathematics exam:

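| Grades | 6th standard | 7th standard | 8th standard | Total no. of students |
| --- | --- | --- | --- | --- |
| A+ | 7 | 5 | 2 | 14 |
| A- | 11 | 8 | 10 | 29 |
| B | 6 | 18 | 11 | 35 |
| C | 4 | 7 | 8 | 19 |
| D | 1 | 3 | 4 | 8 |
| E | 0 | 3 | 2 | 5 |
| SUM | 29 | 44 | 37 | 110 |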
Data Sample

Using this table or sample data, we can calculate the marginal distribution of the grades with respect to the total number of students or the marginal distribution of students in a specific standard.

We disregard the occurrence of a second event while calculating marginal distribution.

For instance, while calculating the marginal distribution of students who obtained a C with respect to the total number of students, we simply sum the number of students who obtained a C across the row (one value per standard) and divide the result by the total number of students.

The total number of students who obtained a C in all the standards combined is 19.

Dividing it by the total number of students in the 6-8th standard: 19/110=0.1727

Multiplying the value with 100 gives 17.27%.

17.27% of the total students achieved a C.

We can also use this table to determine the marginal distribution of students across each standard. For example, the marginal distribution of students in the 6th standard is 29/110, which gives 0.2636. Multiplying this value by 100 gives 26.36%.

 Similarly, the marginal distribution of students in the 7th and 8th standard is 40% and 33.6%, respectively.
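The row and column calculations above can be sketched in a few lines of Python. This is only an illustration, assuming the counts from the table; the names `counts`, `grade_marginal`, and `standard_marginal` are not from the original article.

```python
# Counts from the table above: each row is a grade, each column a standard.
standards = ["6th", "7th", "8th"]
counts = {
    "A+": [7, 5, 2],
    "A-": [11, 8, 10],
    "B": [6, 18, 11],
    "C": [4, 7, 8],
    "D": [1, 3, 4],
    "E": [0, 3, 2],
}

total = sum(sum(row) for row in counts.values())  # 110 students

# Marginal distribution of grades: row totals divided by the grand total.
grade_marginal = {grade: sum(row) / total for grade, row in counts.items()}

# Marginal distribution of standards: column totals divided by the grand total.
standard_marginal = {
    std: sum(row[i] for row in counts.values()) / total
    for i, std in enumerate(standards)
}

print(f"P(grade=C) = {grade_marginal['C']:.4f}")
for std, p in standard_marginal.items():
    print(f"P(standard={std}) = {p:.4f}")
```

Each marginal distribution ignores the other variable entirely: the grade marginal does not care which standard a student is in, and the standard marginal does not care which grade was obtained.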

What Is Meant By Conditional Distributions?

Conditional distribution, as the name suggests, is based on a preexisting condition. It’s the probability distribution of one variable while the other variable is held at a given value or condition.

Conditional distributions enable you to analyze your sample concerning two variables. In data analytics, the likelihood of an event occurring is often influenced by another factor.

Conditional probabilities are often calculated from a tabular representation of the data, which makes the sample easier to visualize and analyze.

For example, if you’re surveying the average life span of a population, two variables to take into account can be the daily average calorie intake and the frequency of physical activity. Conditional probability can help you figure out the impact of physical activity on the average life span of the population if their daily calorie intake is above 2500 kcal, or vice versa.

By setting the daily calorie intake above 2500 kcal, we have placed a condition. Based on this condition, the impact of physical activity on the average life span can be determined.

Or, while observing the sales deviation of two prevailing brands of energy drinks, two variables that influence the sales of these energy drinks are their presence and price. We can use conditional probability to determine the influence of price and presence of two energy drinks on the customers’ intent of purchase.

To understand better, let’s look into the same example used in marginal distribution:

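| Grades | 6th standard | 7th standard | 8th standard | Total no. of students |
| --- | --- | --- | --- | --- |
| A+ | 7 | 5 | 2 | 14 |
| A- | 11 | 8 | 10 | 29 |
| B | 6 | 18 | 11 | 35 |
| C | 4 | 7 | 8 | 19 |
| D | 1 | 3 | 4 | 8 |
| E | 0 | 3 | 2 | 5 |
| SUM | 29 | 44 | 37 | 110 |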
Data Sample

For instance, suppose you want to find how the students who scored a C are distributed across the three standards. Here the condition is "scored a C", so the denominator is the number of students who scored a C, not the total number of students. For the 6th standard, you simply divide the number of students in the 6th standard who scored a C by the total number of students in all three standards who scored a C.

So the answer will be 4/19 = 0.21

Multiplying it by 100 gives 21%

The probability that a student who scored a C is in the 7th standard is 7/19 = 0.37

Multiplying it by 100 gives 37%

And the probability that a student who scored a C is in the 8th standard is 8/19 = 0.421

Multiplying it by 100 gives 42.1%
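The conditioning step amounts to picking one row (or one column) of the table and dividing by that row's (or column's) total. Here is a hedged sketch in Python, assuming the same counts as the table; the helper names `standard_given_grade` and `grade_given_standard` are hypothetical.

```python
# Counts from the table above: each row is a grade, each column a standard.
standards = ["6th", "7th", "8th"]
counts = {
    "A+": [7, 5, 2],
    "A-": [11, 8, 10],
    "B": [6, 18, 11],
    "C": [4, 7, 8],
    "D": [1, 3, 4],
    "E": [0, 3, 2],
}


def standard_given_grade(grade):
    """P(standard | grade): normalise one row by its own total."""
    row = counts[grade]
    row_total = sum(row)
    return {std: n / row_total for std, n in zip(standards, row)}


def grade_given_standard(std):
    """P(grade | standard): normalise one column by its own total."""
    col = standards.index(std)
    col_total = sum(row[col] for row in counts.values())
    return {grade: row[col] / col_total for grade, row in counts.items()}


# Distribution of standards among students who scored a C (denominator 19).
for std, p in standard_given_grade("C").items():
    print(f"P(standard={std} | grade=C) = {p:.3f}")

# Conditioning the other way round: grades within the 6th standard (denominator 29).
for grade, p in grade_given_standard("6th").items():
    print(f"P(grade={grade} | standard=6th) = {p:.3f}")
```

Note that the two directions give different answers: 4/19 of the students who scored a C are in the 6th standard, while 4/29 of the 6th standard students scored a C.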

Difference Between Conditional and Marginal Distribution


Marginal distribution is the distribution of a variable with respect to the total sample, while conditional distribution is the distribution of a variable given a particular value of another variable.

Marginal distribution is independent of the outcomes of the other variable. In other words, it is simply unconditional.
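The two ideas are linked by the standard identity P(A|B) = P(A and B) / P(B): a conditional probability is a joint probability divided by a marginal one. As a small check, assuming the C row of the grade table used earlier (4, 7, and 8 students out of 110), a Python sketch with exact fractions might look like this; the variable names are illustrative.

```python
from fractions import Fraction

total = 110                            # all students in the grade table
c_row = {"6th": 4, "7th": 7, "8th": 8}  # students who scored a C, by standard

# Marginal probability of scoring a C: 19/110.
p_c = Fraction(sum(c_row.values()), total)

# Joint probability of being in the 6th standard AND scoring a C: 4/110.
p_6th_and_c = Fraction(c_row["6th"], total)

# Conditional probability of being in the 6th standard GIVEN a C.
p_6th_given_c = p_6th_and_c / p_c

print(p_c, p_6th_and_c, p_6th_given_c)
```

The total of 110 cancels out in the division, which is why the conditional value can be read directly from the table as 4/19.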

For example, if a random variable “X” is assigned to the gender of children in a summer camp and another random variable “Y” is assigned to the age of these children, then the marginal distribution of boys in the summer camp is given by P(X=boys), whereas the proportion of boys among the children under the age of 8 is given by the conditional distribution P(X=boys|Y<8).

Final thoughts

Marginal distribution shows the probabilities of different values of one variable without reference to the other variable.

Conditional distribution, however, is the probability of a variable calculated for a given value of another variable.

Both concepts are valid tools in probability; which one to apply depends on the problem, case, or scenario at hand.
