Expert reviewed • 22 November 2024 • 11 minute read
Before we begin this module, we first review the mean and variance of a distribution, discussed in the Year 11 course and the previous module.
A discrete distribution describes the probabilities of the possible values of a discrete random variable. A random variable is ‘discrete’ if it takes a countable number of distinct values. This means that the values can be listed one by one (for example 0, 1, 2, ...), rather than filling a continuous interval.
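Additionally, a discrete distribution is typically defined by a probability mass function $p(x)$. This gives the probability that the discrete random variable $X$ takes the value $x$, where $x$ is some value. It is also important to note that $p(x)$ has two distinct qualities. If $X$ is a discrete random variable, let $p(x) = P(X = x)$. Then every probability lies between 0 and 1, and the probabilities of all possible values sum to 1:
$$0 \le p(x) \le 1 \quad \text{for every value } x$$
$$\sum p(x) = 1$$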
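The two qualities of a probability mass function can be checked mechanically. The sketch below is illustrative only: `is_valid_pmf` is a hypothetical helper name, and the example distribution (the number of heads in two tosses of a fair coin) is an assumption chosen for illustration, not something taken from this module.

```python
from fractions import Fraction

def is_valid_pmf(pmf):
    """Return True if every probability lies in [0, 1] and they sum to 1."""
    probabilities = list(pmf.values())
    in_range = all(0 <= p <= 1 for p in probabilities)
    return in_range and sum(probabilities) == 1

# Assumed example: number of heads in two tosses of a fair coin.
two_coins = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

print(is_valid_pmf(two_coins))                               # True
print(is_valid_pmf({0: Fraction(1, 2), 1: Fraction(1, 3)}))  # False: sums to 5/6
```

Exact fractions are used instead of decimals so that the "sums to 1" check is not affected by floating-point rounding.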
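The formula to determine the expected value of a discrete distribution is as follows:
$$E(X) = \sum x\,p(x)$$
where $x$ is a value the random variable $X$ can take, $p(x)$ is the probability of that value, and the sum runs over all possible values of $X$.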
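To make the calculation concrete, the expected value is found by multiplying each possible value by its probability and adding the results. The sketch below assumes a small example distribution (the number of heads in two tosses of a fair coin); both that distribution and the function name `expected_value` are illustrative choices rather than part of the course material.

```python
from fractions import Fraction

def expected_value(pmf):
    """E(X): add up x * p(x) over every value x the variable can take."""
    return sum(x * p for x, p in pmf.items())

# Assumed example: number of heads in two tosses of a fair coin.
two_coins = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

print(expected_value(two_coins))  # 1
```

Here the working is $0 \times \frac{1}{4} + 1 \times \frac{1}{2} + 2 \times \frac{1}{4} = 1$, so on average one head is expected.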
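There are two formulas to determine the variance of a discrete distribution. Both give the same result; the choice between them depends on the scenario and personal preference. The formulae are as follows:
$$\mathrm{Var}(X) = E\big((X - \mu)^2\big) = \sum (x - \mu)^2\,p(x)$$
$$\mathrm{Var}(X) = E(X^2) - \mu^2 = \sum x^2\,p(x) - \mu^2$$
where $\mu = E(X)$ is the mean (expected value) of the distribution.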
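A quick way to see that the two approaches agree is to compute the variance both ways on the same distribution: once as the probability-weighted average of squared deviations from the mean, and once as the expected value of the square minus the square of the mean. The sketch below again assumes the two-coin-toss distribution as an example; the function names are hypothetical.

```python
from fractions import Fraction

def mean(pmf):
    """Expected value: sum of x * p(x)."""
    return sum(x * p for x, p in pmf.items())

def variance_by_definition(pmf):
    """Sum of (x - mu)^2 * p(x), where mu is the mean."""
    mu = mean(pmf)
    return sum((x - mu) ** 2 * p for x, p in pmf.items())

def variance_by_shortcut(pmf):
    """Sum of x^2 * p(x), minus mu^2."""
    mu = mean(pmf)
    return sum(x ** 2 * p for x, p in pmf.items()) - mu ** 2

# Assumed example: number of heads in two tosses of a fair coin.
two_coins = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

print(variance_by_definition(two_coins))  # 1/2
print(variance_by_shortcut(two_coins))    # 1/2
```

The second form is often quicker by hand because it avoids subtracting the mean from every value before squaring.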
These formulae, relating to the mean and variance of a discrete distribution, will be important in solving problems in the coming chapters.
The cumulative distribution function (CDF) of any numerical probability distribution is the probability that the score is less than or equal to $x$. In this chapter, we explore the CDF only in relation to discrete distributions. In this case, the CDF gives the combined probability of every value of $X$ up to and including a specified point $x$. The formula to determine the CDF is given by:
$$F(x) = P(X \le x) = \sum_{t \le x} p(t)$$
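For a discrete distribution, evaluating the CDF at a point amounts to adding the probabilities of every possible value that does not exceed that point. The sketch below shows this under an assumed example distribution (the number of heads in two tosses of a fair coin); the function name `cdf` and the chosen points are illustrative only.

```python
from fractions import Fraction

def cdf(pmf, x):
    """P(X <= x): add p(t) for every possible value t with t <= x."""
    return sum(p for t, p in pmf.items() if t <= x)

# Assumed example: number of heads in two tosses of a fair coin.
two_coins = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}

print(cdf(two_coins, 0))  # 1/4
print(cdf(two_coins, 1))  # 3/4
print(cdf(two_coins, 2))  # 1
```

The CDF never decreases as the point moves right, and it reaches 1 at the largest possible value.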
Suppose $X$ is a discrete random variable representing the roll of a fair six-sided die. Determine the cumulative distribution function, when .
The first step in solving this question is to determine the probability of landing on any single side of the die. As each side is equally likely and there are six sides, the probability of each side is $\frac{1}{6}$.
Now, we have to solve the given expression:
We can rewrite this expression as follows to easily determine the solution to this problem:
We see that the CDF
Relative frequency refers to the proportion or fraction of times a particular value or a set of values occurs, relative to the total number of observations in a dataset. Relative frequencies show how data is distributed across different values by expressing the frequencies as proportions that sum to 1. In simple terms, relative frequencies are estimates of the probabilities of the values, based on the dataset.
Cumulative frequency refers to the sum of frequencies accumulated up to a certain point in a dataset. It provides a running total by adding each frequency to the cumulative frequency that precedes it in a frequency distribution table. To put it simply, cumulative frequencies, once divided by the total frequency, are estimates of the cumulative distribution function.
Thus, the relative frequency of a score is found by dividing its frequency by the total frequency, and the cumulative relative frequency is found by dividing its cumulative frequency by the total frequency. Both are often represented in tables, as shown below, which makes the calculations easier to set out.
This table is an example of a relative frequency table:
Score (x) | Frequency (f) | Relative Frequency |
---|---|---|
70 | 5 | 0.25 |
75 | 3 | 0.15 |
80 | 6 | 0.30 |
85 | 3 | 0.15 |
90 | 3 | 0.15 |
Total | 20 | 1 |
As we can see, the relative frequency column represents each frequency as a proportion of the total frequency. For example, the relative frequency of the first row is $\frac{5}{20}$, which is the same as $0.25$.
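The relative frequency column can be reproduced with a few lines of code. This is a minimal sketch using the scores and frequencies from the relative frequency table above; the function name `relative_frequencies` is a hypothetical choice.

```python
from fractions import Fraction

# Scores and frequencies copied from the relative frequency table.
frequencies = {70: 5, 75: 3, 80: 6, 85: 3, 90: 3}

def relative_frequencies(freqs):
    """Divide each frequency by the total frequency."""
    total = sum(freqs.values())
    return {score: Fraction(f, total) for score, f in freqs.items()}

rel = relative_frequencies(frequencies)
for score, r in rel.items():
    print(score, f"{float(r):.2f}")
# 70 0.25
# 75 0.15
# 80 0.30
# 85 0.15
# 90 0.15

print(sum(rel.values()))  # 1
```

The final line confirms that the relative frequencies sum to 1, matching the total row of the table.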
This table is an example of a cumulative frequency table:
Score (x) | Frequency (f) | Cumulative Frequency |
---|---|---|
70 | 5 | 5 |
75 | 3 | 5 + 3 = 8 |
80 | 6 | 8 + 6 = 14 |
85 | 3 | 14 + 3 = 17 |
90 | 3 | 17 + 3 = 20 |
Total | 20 | 20 |
As we can see, the cumulative frequency column keeps a running total as each score's frequency is added.
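The running total in the cumulative frequency column can be sketched with Python's standard `itertools.accumulate`. The data is taken from the cumulative frequency table above; dividing each running total by the total frequency is shown as an extra step to give the proportion of scores at or below each value.

```python
from itertools import accumulate

# Scores and frequencies copied from the cumulative frequency table.
scores = [70, 75, 80, 85, 90]
frequencies = [5, 3, 6, 3, 3]

# Running total: each entry is the previous total plus the next frequency.
cumulative = list(accumulate(frequencies))
print(cumulative)  # [5, 8, 14, 17, 20]

# Dividing by the total frequency gives the cumulative relative frequency.
total = cumulative[-1]
cumulative_relative = [c / total for c in cumulative]
print(cumulative_relative)  # [0.25, 0.4, 0.7, 0.85, 1.0]
```

For instance, the value 0.7 beside the score 80 means that 70% of the scores are 80 or lower.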