Probability Distributions: Discrete, Continuous, Means, Variances & CDFs

Expert reviewed • 04 March 2025 • 11 minute read

HSC Maths Advanced Syllabus

use relative frequencies and histograms obtained from data to estimate probabilities associated with a continuous random variable
understand and use the concepts of a probability density function of a continuous random variable

Review of Discrete Distributions

Before we begin this module, we must first review the mean and variance of a distribution, which were covered in the Year 11 course and the previous module.

A discrete distribution describes the probabilities of the possible values of a discrete random variable. A random variable is 'discrete' if it has a countable number of distinct values. This means that its possible values can be listed one by one, such as the faces of a die or the counts $0, 1, 2, \dots$, rather than filling a continuous interval.

Additionally, a discrete distribution is typically defined by a probability mass function $p(x)=P(X=x)$. This gives the probability that the discrete random variable $X$ takes the value $x$, where $x$ is one of its possible values. It is also important to note that $p(x)$ has two properties:

$p(x)$ cannot be negative. This means that $p(x)\geq 0$ .
The sum of the probabilities of all possible values of $X$ equals $1$ . That is: $\sum p(x)=1$
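
The two properties above can be checked mechanically. The following is a minimal sketch, assuming the probability mass function is stored as a Python dictionary that maps each value $x$ to $p(x)$; the helper name `is_valid_pmf` and the example distributions are illustrative only.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Return True if every probability is non-negative and they sum to 1."""
    non_negative = all(p >= 0 for p in pmf.values())
    sums_to_one = sum(pmf.values()) == 1
    return non_negative and sums_to_one

# Fair six-sided die: each face has probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}

# Hypothetical table that is not a valid pmf: the probabilities sum to 9/10.
broken = {0: Fraction(1, 2), 1: Fraction(2, 5)}

print(is_valid_pmf(die))     # True
print(is_valid_pmf(broken))  # False
```

Exact fractions are used here so that the check "sums to 1" is not affected by floating-point rounding.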

How to Calculate the Mean and Variance of a Discrete Distribution

Let $X$ be a discrete random variable, and let $p(x)=P(X=x)$.

The formula to determine the expected value of a discrete distribution is as follows:

\mu=E(X)=\sum xp(x)

where,

$x$ represents each possible value of the random variable
$p(x)$ represents the probability that $X$ equals some value of $x$

There are two formulae for the variance of a discrete distribution. Both give the same result, so the choice between them depends on the scenario and personal preference. The formulae are as follows:

\text{Var}(X)=E\left((X-\mu)^2\right)=\sum(x-\mu)^2p(x)\\ \text{or}\\ \text{Var}(X)=E(X^2)-\mu^2=\sum x^2p(x)-\mu^2

where,

$x$ represents each possible value of the random variable
$\mu$ is the expected value or mean of $X$
$p(x)$ represents the probability that $X$ equals some value of $x$
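
To see the formulae for the mean and variance in action, here is a short sketch that applies them to a fair six-sided die. It assumes the distribution is stored as a dictionary mapping each value $x$ to $p(x)$; the function names are illustrative, not part of any syllabus or library.

```python
from fractions import Fraction

def mean(pmf):
    """E(X): sum of x * p(x) over all possible values."""
    return sum(x * p for x, p in pmf.items())

def variance_definition(pmf):
    """Var(X) as the sum of (x - mu)^2 * p(x)."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def variance_shortcut(pmf):
    """Var(X) as E(X^2) - mu^2."""
    mu = mean(pmf)
    return sum(x ** 2 * p for x, p in pmf.items()) - mu ** 2

# Fair six-sided die: each face has probability 1/6.
die = {x: Fraction(1, 6) for x in range(1, 7)}

print(mean(die))                 # 7/2
print(variance_definition(die))  # 35/12
print(variance_shortcut(die))    # 35/12
```

Both variance functions return the same value, which illustrates that the two formulae are equivalent.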

These formulae, relating to the mean and variance of a discrete distribution, will be important in solving problems in the coming chapters.

The Cumulative Distribution Function

The cumulative distribution function (CDF) $F(x)$ of any numerical probability distribution is the probability that the random variable takes a value less than or equal to $x$. In this chapter, we are exploring the CDF only in relation to discrete distributions. In this case, $F(x)$ is found by adding the probabilities of every possible value up to and including $x$. The CDF is defined by:

F(x)=P(X\leq x)

Practice Question 1

Suppose $X$ is a discrete random variable representing the roll of a fair six-sided die. Determine the value of the cumulative distribution function at $x=3$.

Solution

The first step in solving this question is to determine the probability of landing on a single face of the die. As the die is fair, each of its six faces is equally likely, so the probability of any particular face is $\frac{1}{6}$.

Now, we have to solve the given expression:

F(3)=P(X\leq 3)

We can rewrite this expression as follows to easily determine the solution to this problem:

F(3)=P(X=1)+P(X=2)+P(X=3)\\=\frac{1}{6}+\frac{1}{6}+\frac{1}{6}\\=\frac{1}{2}

$\therefore$ The value of the CDF at $x=3$ is $F(3)=\frac{1}{2}$
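
The same calculation can be sketched in code. This example assumes the die's distribution is stored as a dictionary mapping each face to its probability; the helper name `cdf` is illustrative.

```python
from fractions import Fraction

def cdf(pmf, x):
    """F(x) = P(X <= x): add p(k) over every possible value k with k <= x."""
    return sum((p for k, p in pmf.items() if k <= x), Fraction(0))

# Fair six-sided die: each face has probability 1/6.
die = {k: Fraction(1, 6) for k in range(1, 7)}

print(cdf(die, 3))  # 1/2
print(cdf(die, 6))  # 1
print(cdf(die, 0))  # 0
```

Note that $F(6)=1$ because every possible value is included, and $F(0)=0$ because no face is less than or equal to $0$.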

What are Relative and Cumulative Frequencies?

Relative frequency is the proportion of times a particular value, or set of values, occurs relative to the total number of observations in a dataset. Expressing frequencies as proportions that sum to 1 shows how the data is distributed across the different values. In simple terms, relative frequencies obtained from data are estimates of the corresponding probabilities.

Cumulative frequency is the sum of the frequencies accumulated up to a certain point in a dataset. It provides a running total, found by adding each frequency to the cumulative frequency of the row before it in a frequency distribution table. To put it simply, cumulative frequencies, once divided by the total frequency, are estimates of the cumulative distribution function.

Thus, the relative frequency of a score is found by dividing its frequency by the total frequency, and dividing a cumulative frequency by the total frequency gives the cumulative relative frequency. Both kinds of frequency are often set out in tables, as shown below, which makes them easier to calculate.

This table is an example of a relative frequency table:

Score (x)	Frequency (f)	Relative Frequency
70	5	0.25
75	3	0.15
80	6	0.30
85	3	0.15
90	3	0.15
Total	20	1

As we can see, the relative frequency column represents each frequency as a proportion of the total frequency. For example, the relative frequency in the first row is $0.25$, which is the same as $\frac{5}{20}$.
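
The relative frequency column can be reproduced with a few lines of code. This is a minimal sketch using the scores and frequencies from the table above; the variable names are illustrative.

```python
from fractions import Fraction

# Scores and frequencies taken from the relative frequency table.
frequencies = {70: 5, 75: 3, 80: 6, 85: 3, 90: 3}

total = sum(frequencies.values())  # total frequency, 20

# Relative frequency = frequency / total frequency.
relative = {score: Fraction(f, total) for score, f in frequencies.items()}

for score, r in relative.items():
    print(score, float(r))

# The relative frequencies always sum to 1.
print(sum(relative.values()))  # 1
```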

This table is an example of a cumulative frequency table:

Score (x)	Frequency (f)	Cumulative Frequency
70	5	5
75	3	5 + 3 = 8
80	6	8 + 6 = 14
85	3	14 + 3 = 17
90	3	17 + 3 = 20
Total	20	20

As we can see, the cumulative frequency column keeps a running total: each entry is the frequency of that score added to the cumulative frequency of the row above it.
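
The running total can be sketched in code using the frequencies from the table above. The final step, dividing each cumulative frequency by the total, is added here as an illustration of how the table leads to an estimate of the cumulative distribution function; the variable names are illustrative.

```python
from itertools import accumulate

# Scores and frequencies taken from the cumulative frequency table.
scores = [70, 75, 80, 85, 90]
frequencies = [5, 3, 6, 3, 3]

# Running total of the frequencies.
cumulative = list(accumulate(frequencies))
print(cumulative)  # [5, 8, 14, 17, 20]

# Dividing by the total frequency estimates F(x) = P(X <= x) at each score.
total = cumulative[-1]
cumulative_relative = [c / total for c in cumulative]

for score, c in zip(scores, cumulative_relative):
    print(score, c)
```

For instance, the estimate at a score of $80$ is $\frac{14}{20}=0.7$, meaning $70\%$ of the recorded scores are $80$ or below.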

Understanding Continuous Distributions
