Earlier we entered the world of data and learned a rich collection of descriptive statistics, and then developed a solid understanding of probability theory. The material in this chapter will now allow us to understand and practice probability theory by means of the standard tools of calculus.
Formally, a random variable is a function from a sample space S to the set of real numbers R. Note at the very beginning that we denote random variables with uppercase letters and their particular values with the corresponding lowercase letters. So, a random variable X can take a value x.
When we define the outcomes of a random experiment via a random variable, we can generalize the very structure of the experiment and get rid of our context-dependence.
After internalizing the knowledge of this chapter, we will be able to state and solve a long array of problems with more formalism. In the rest of this chapter, we will first study the concepts of 'Cumulative distribution function' (CDF) and 'Probability distribution function' (PDF). At the cost of a spoiler, we can say that the CDF and PDF are the theoretical counterparts of the ogive and the histogram, respectively. Secondly, we will study the concepts of the Expected value and Variance along with their key properties. While doing so, we will create and refer to a number of ad hoc random variables. Ad hoc is a Latin phrase meaning literally 'to this'. In English, it is used to describe 'something that has been formed or used for a special and immediate purpose, without previous planning'. That is, until we reach the section entitled 'Random variables and distributions: Discrete probability laws', we will be creating, using and disposing of several random variables that serve our specific scientific/technical purposes.
On one hand, our discussion and use of those ad hoc random variables and distributions will prove quite useful for handling a long list of probabilistic or statistical cases/problems. On the other hand, staying 'ad hoc' is not good for a full-fledged practice of science, as our journey will reveal. As a matter of fact, a rich set of probability laws (Discrete probability laws and Continuous probability laws) will allow us to categorize, model and solve a variety of real-world statistical problems in a sound as well as practical fashion. Note that the use of the term 'Law' may not be the best alternative available in scientific nomenclature; yet, it is part of the tradition. Those who are not comfortable with the term 'Law' may replace it with the term 'Distribution'. As an example, 'Uniform probability law' and 'Uniform probability distribution' are simply the same thing.
Now, we can proceed with our quest to learn things. Recall our repeatedly used random experiment of 'tossing a fair coin'. Head and Tail (or, H and T) being the two sides of a coin, we already know the following:
Upon these, we are allowed to define and study everything that is relevant. Despite its simplicity, such an approach lacks one important feature: mathematical generalization. Indeed, the real world hosts a bunch of random experiments with two basic outcomes: a student passes or fails an exam, a patient survives or does not survive a sickness, an asteroid hits or does not hit our planet Earth, and so on. Notice that each of these experiments looks like tossing a coin. Furthermore, if the probability of passing the exam, the probability of surviving and the probability of hitting the Earth are all equal to 1/2, these random experiments are 'identical' to tossing a coin, except for the details of naming. So, why not suggest a random variable X along with its probability distribution to address all these random experiments?
Consider X ∈ {0, 1} (i.e., x can be 0 or 1) for which P(X = 0) = 1/2 and P(X = 1) = 1/2. This is nothing but a direct equivalent of the random experiment of tossing a coin, without referring to a coin explicitly. Let us leave this discussion for a while to consider another random experiment.
Now consider (or recall from our in-class discussions) the random experiment of drawing a number from the interval [1, 5] in a fully blindfolded fashion; there are infinitely many basic outcomes, namely the real numbers from 1 to 5 (as one cannot guarantee to pick integers only when blindfolded). With regard to this case, we already know the following:
or simply,
and
The final statement should trivially follow from chapter 4 (and should not sound weird to your ears anymore).
Following an agenda similar to the one we used in the case of tossing a fair coin above, we can say that the random experiments of picking a real number from [1, 5], from [2, 6], from [3, 7] or from [1001, 1005] should not differ. You may confirm this expectation once you have measured the length (size) of each of these intervals as 4.
Define now Y ∈ [1, 5] and leave this discussion aside until we cover the following definitions. Each of these definitions is crucial for our subsequent study of probability theory and statistics. Combining/pairing them with your in-class notes, use these definitions to come up with a holistic picture of the things (objects) involved.
The cumulative distribution function or CDF of a random variable X is denoted by F_X (or simply F), and is defined as:

F_X(x) = P(X ≤ x), for all x

or

F(x) = P(X ≤ x), for all x

A random variable X is continuous if F_X is a continuous function of x. A random variable X is discrete if F_X is a step function of x.
Additionally, the random variables X and Y are identically distributed if, for every set A,

P(X ∈ A) = P(Y ∈ A)

where this does not necessarily mean X = Y. If X and Y are identically distributed,

F_X(x) = F_Y(x), for every x
The probability distribution function (PDF) of a discrete random variable X is given by:

f_X(x) = P(X = x), for all x

or

f(x) = P(X = x), for all x

For discrete random variables, the probability distribution function is also called the probability mass function.
The probability distribution function (PDF) of a continuous random variable X is the function f_X that satisfies:

F_X(x) = ∫_{−∞}^{x} f_X(t) dt, for all x

For continuous random variables, the probability distribution function is also called the probability density function.
Using the Fundamental Theorem of Calculus, if f_X is continuous,

(d/dx) F_X(x) = f_X(x)

The analogy with the discrete case is almost exact. We «add up» the point probabilities f to obtain interval probabilities F. With a slight abuse of the notation:

F(x) = Σ_{t ≤ x} f(t) (discrete case)  and  F(x) = ∫_{−∞}^{x} f(t) dt (continuous case)
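To make the CDF-PDF link concrete, here is a minimal numeric sketch (our own illustration in Python; the notes themselves point to Excel files) that differentiates the CDF of the blindfolded draw from [1, 5] introduced above and recovers its constant density value of 1/4:

```python
# Numeric check of the CDF-PDF link for the blindfolded draw from [1, 5]:
# the CDF is G(y) = (y - 1)/4 on [1, 5], and its numeric derivative
# recovers the constant density 1/4, as the Fundamental Theorem promises.

def G(y):
    if y < 1:
        return 0.0
    if y > 5:
        return 1.0
    return (y - 1) / 4

h = 1e-6
for y in (1.5, 2.0, 3.7, 4.9):
    slope = (G(y + h) - G(y - h)) / (2 * h)   # numeric d/dy of the CDF
    print(f"y = {y}: G'(y) ≈ {slope:.4f} (density is 0.25)")
```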
Having been exposed to the formal definitions of the functions and operators involved, we will now reconsider the random experiment of tossing a fair coin (a discrete random variable case) and the random experiment of picking a number from [1, 5] (a continuous random variable case), in that order.
Using our newly acquired knowledge, we can now define the following:

f(x) = 1/2 for x ∈ {0, 1}, with CDF F(x) = 0 for x < 0, F(x) = 1/2 for 0 ≤ x < 1, and F(x) = 1 for x ≥ 1

Here, X is nothing but the random variable that describes the outcomes of the random experiment of tossing a fair coin.
Consider also:

g(y) = 1/4 for 1 ≤ y ≤ 5, with CDF G(y) = 0 for y < 1, G(y) = (y − 1)/4 for 1 ≤ y ≤ 5, and G(y) = 1 for y > 5

You must have noticed that Y is the random variable that describes the outcomes of the random experiment of picking a number from [1, 5].
Since F(x) is a step function, X is a discrete random variable, and since G(y) is a continuous function, Y is a continuous random variable.
In the cases of X ∼ f(x) and Y ∼ g(y) above, observe that:
Also, notice that:
and
while
Your mind should be crystal clear on this distinction between probabilities and likelihoods for continuous random variables.
But how do we define/refer to probabilities and calculate them in the case of continuous random variables? The answer should be trivial to you: since the point probabilities are all zero for a continuous random variable, we can talk about the 'probabilities of intervals' only. Then, the following calculation for the random variable Y above is legitimate:
Alternatively,
yields the same solution. Now, make an effort to show these solutions on the graphs of g(y) and G(y).
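As a computational companion, the sketch below carries out both calculations for an illustrative interval, [2, 3] (the interval itself is our assumption, since the original numbers are not shown): integrating g(y) and differencing G(y) agree.

```python
# P(2 <= Y <= 3) two ways for Y ~ Uniform(1, 5): integrate the PDF,
# or difference the CDF. Both give 1/4.

def g(y):                                   # PDF
    return 0.25 if 1 <= y <= 5 else 0.0

def G(y):                                   # CDF
    return min(max((y - 1) / 4, 0.0), 1.0)

n = 10_000
width = 1 / n                               # interval [2, 3] has length 1
p_integral = sum(g(2 + (i + 0.5) * width) for i in range(n)) * width
p_cdf = G(3) - G(2)
print(p_integral, p_cdf)                    # 0.25 0.25
```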
The expected value or mean of a random variable X is:

E(X) = Σ_x x·f(x) (discrete case)

and

E(X) = ∫_{−∞}^{∞} x·f(x) dx (continuous case)

provided that the sum or integral exists.
The variance of a random variable X is defined as:

Var(X) = E[(X − μ)²]

For a discrete random variable X:

Var(X) = Σ_x (x − μ)²·f(x)

where

μ = E(X)

For a continuous random variable X:

Var(X) = ∫_{−∞}^{∞} (x − μ)²·f(x) dx

where

μ = E(X)

The positive square root of Var(X) is the standard deviation of X. If X is a random variable with finite variance, then for any constants a and b:

Var(aX + b) = a²·Var(X)
Regarding the same X and Y defined above, we can now study/compute the Expected value and the Variance:
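The following short sketch computes these quantities for our two ad hoc variables: the coin X from its PMF, and Y ∼ Uniform(1, 5) by a simple numeric integral. The exact values E(X) = 0.5, Var(X) = 0.25, E(Y) = 3 and Var(Y) = 4/3 follow from the definitions above.

```python
# E and Var from the definitions: a sum for the discrete coin X,
# a midpoint-rule integral for the continuous Y ~ Uniform(1, 5).

pmf = {0: 0.5, 1: 0.5}                      # the coin X
EX = sum(x * p for x, p in pmf.items())
VarX = sum((x - EX) ** 2 * p for x, p in pmf.items())
print(EX, VarX)                             # 0.5 0.25

n, a, b = 100_000, 1.0, 5.0                 # Y ~ Uniform(1, 5), g(y) = 1/4
w = (b - a) / n
ys = [a + (i + 0.5) * w for i in range(n)]
EY = sum(y * 0.25 for y in ys) * w
VarY = sum((y - EY) ** 2 * 0.25 for y in ys) * w
print(EY, VarY)                             # ≈3.0 ≈1.3333 (= 4/3)
```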
3.1 EXERCISES ___________________________________________________________
Let X be a random variable with the following cumulative distribution function F:
Calculate the following probabilities:
Solution:
Let X be a discrete random variable with the following PDF, f:
x | 1 | 3 | 5 | 7 | 9 |
f | 0.4 | 0.1 | 0.2 | 0.2 | 0.1 |
Solution:
Explain why each of the following is or is not a valid probability distribution for a discrete random variable X:
iv.
x | 2 | 3 | 5 | 6 |
f | 0.15 | 0.15 | 0.45 | 0.35 |
Solution:
The random variable X has the following discrete probability distribution:
x | 1 | 3 | 5 | 7 | 9 |
f | 0.1 | 0.2 | 0.4 | 0.2 | 0.1 |
vii. Find E
Solution:
Consider the probability distributions,
x | 0 | 1 | 2 |
f | 0.3 | 0.4 | 0.3 |
and
y | 0 | 1 | 2 |
f | 0.1 | 0.8 | 0.1 |
ii. Which distribution appears to be more variable? Why?
Solution:
Every morning, my mother gives me a random amount of money according to the following PDF, where X is the random variable that measures the amount of money:
x | 20 | 30 | 40 | 50 |
f | 0.10 | 0.20 | 0.30 | 0.40 |
Right after that, my sister takes out of my pocket a random amount of money according to the following CDF, where Y is the random variable that measures the amount of money:
y | 5 | 10 | 15 |
F | 0.30 | 0.70 | 1.00 |
Then I leave home and spend all my money before the day ends. Create a random variable W which shows the net amount of money right before I leave home in the morning. Calculate f(w) and F(w) and present them in tabular format. Using these functions:
v. Calculate Var
Solution: i.
ii. First, we need to find g(y) :
iii. First, we need to find the PDF of W, call it h(w). Find the possible values of W and calculate the probability for each w. Those values are
Then,
From the previous parts we know that E(X) = 40 and E(Y) = 10. In this part, we found E(W) = 30. So, E(X) −E(Y) = 40 − 10 = 30 = E(W) → verification done.
iv. Do on your own.
v. Calculate Var(W) as:
As an alternative:
(As a follow-up exercise: calculate Var(X) and Var(Y) on your own, and verify that Var(W) = Var(X) + Var(Y)).
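For readers who like to double-check, a small enumeration sketch (assuming, as the story implies, that X and Y are independent) reproduces h(w) and both verifications:

```python
# Build h(w) for W = X - Y by enumerating all (x, y) pairs, then verify
# E(W) = E(X) - E(Y) = 30 and Var(W) = Var(X) + Var(Y).

fX = {20: 0.10, 30: 0.20, 40: 0.30, 50: 0.40}
fY = {5: 0.30, 10: 0.40, 15: 0.30}          # PMF recovered from the CDF table

h = {}
for x, px in fX.items():
    for y, py in fY.items():
        h[x - y] = h.get(x - y, 0.0) + px * py   # independence assumed

def mean(f):
    return sum(v * p for v, p in f.items())

def var(f):
    m = mean(f)
    return sum((v - m) ** 2 * p for v, p in f.items())

print(sorted(h.items()))
print(mean(h), mean(fX) - mean(fY))         # 30.0 30.0
print(var(h), var(fX) + var(fY))            # 115.0 115.0
```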
Consider X ∼ f(x) = , 4 ≤ x ≤ 8, Y ∼ g(y) = , 0 ≤ y ≤ 3, and another random variable W which is defined as W = X − Y. Calculate E(X), E(Y), E(W), Var(X), Var(Y), Var(W).
Solution:
Without finding h(w), the following can be written:
Another way to deal with W is:
To calculate Var(W) you do the following:
Calculate E
Then, find Var(W)
If 'double integrals' were not in the curriculum of MATH 105 or MATH 106 and if you do not have prior knowledge of them, you may safely skip this last part.
We consider here four (one being optional) discrete probability laws
Bernoulli distribution is also called a Bernoulli trial or Bernoulli process. Consider an experiment that consists of a single trial with two possible outcomes, success and failure.
For X ∼ Bernoulli(P), where P is the probability of success:

f(x) = P^x·(1 − P)^(1−x), x = 0, 1
Despite its simplicity, the Bernoulli distribution is a stunningly useful one, as a building block of some other distributions.
Observe below the PDF of X ∼ Bernoulli(0.80):
Expected value and Variance: E(X) = P and Var(X) = P(1 − P).
Go to the Teaching page & experiment with Bernoulli(P) using the file named 'Statistical distributions.xlsx'.
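A quick simulation sketch (Python is our own tool choice here) shows the sample mean and variance of Bernoulli(0.80) draws settling near P = 0.8 and P(1 − P) = 0.16:

```python
import random

# Sample mean and variance of Bernoulli(0.80) draws versus the theory:
# E(X) = P = 0.8 and Var(X) = P(1 - P) = 0.16.
random.seed(1)
draws = [1 if random.random() < 0.80 else 0 for _ in range(100_000)]
m = sum(draws) / len(draws)
v = sum((d - m) ** 2 for d in draws) / len(draws)
print(m, v)                                 # ≈0.8 ≈0.16
```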
Consider an experiment which consists of n independent and identical Bernoulli trials; i.e., the probability of success (P) is the same across all the trials and a trial's outcome does not alter the outcomes of the subsequent trials. X being the number of successes in n trials, X ∼ Binomial(n, P), i.e., X has a Binomial distribution with parameters n and P:

f(x) = C(n, x)·P^x·(1 − P)^(n−x), x = 0, 1, …, n
Observe below the PDF of X ∼ Binomial(8, 0.80):
Then the PDF of X ∼ Binomial(8, 0.20):
And finally the PDF of X ∼ Binomial(8, 0.50):
Having compared the PDFs of Binomial(8, 0.80), Binomial(8, 0.20), Binomial(8, 0.50), can you identify the source of asymmetry of Binomial PDFs?
Expected value and Variance:
is not practical to work with. So, consider:
This means:
Then,
Go to the Teaching page & experiment with Binomial(n, P) using the file named 'Statistical distributions.xlsx'.
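The PMF is easy to tabulate programmatically; the sketch below (using Python's math.comb) checks that the Binomial(8, 0.80) probabilities sum to 1 and that the mean and variance match nP and nP(1 − P):

```python
from math import comb

# Binomial(n, P) PMF; check total probability, mean nP, variance nP(1-P).
def binom_pmf(x, n, p):
    return comb(n, x) * p**x * (1 - p) ** (n - x)

n, p = 8, 0.80
pmf = {x: binom_pmf(x, n, p) for x in range(n + 1)}
EX = sum(x * q for x, q in pmf.items())
VarX = sum((x - EX) ** 2 * q for x, q in pmf.items())
print(sum(pmf.values()), EX, VarX)          # 1.0 6.4 1.28
```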
Consider an experiment which consists of counting the number of times a certain event occurs during a given unit of time or in a given area or volume. The probability that an event occurs in a given unit of time, area or volume is the same for all units. The number of events that occur in one unit of time, area or volume is independent of the number that occur in any other mutually exclusive unit. The mean (or expected, or typical) number of events in each unit is denoted by λ. For X ∼ Poisson(λ):

f(x) = e^(−λ)·λ^x / x!, x = 0, 1, 2, …
Recall that e is Euler's number, e = 2.71828…
Observe below the PDF of X ∼ Poisson(3):
Then the PDF of X ∼ Poisson(6):
And finally the PDF of X ∼ Poisson(10):
Is the last graph symmetric? Is it possible to have a Poisson(λ) PDF which is symmetric? Why?
Expected Value and Variance: again, the direct sum E(X) = Σ_x x·e^(−λ)·λ^x/x! is not useful to work with. So, consider,

E(X) = λ·Σ_{x ≥ 1} e^(−λ)·λ^(x−1)/(x − 1)!

This means,

E(X) = λ·1 = λ, since the remaining sum runs over the entire Poisson(λ) PMF.

Then,

applying the same trick to E[X(X − 1)] = λ² gives Var(X) = E(X²) − [E(X)]² = λ² + λ − λ² = λ.
Go to the Teaching page & experiment with Poisson(λ) using the file named 'Statistical distributions.xlsx'.
Consider a Binomial process with:

f(x) = C(n, x)·P^x·(1 − P)^(n−x), where n → ∞ and P → 0 while nP stays constant.

Define

λ = nP, i.e., P = λ/n

So,

f(x) = C(n, x)·(λ/n)^x·(1 − λ/n)^(n−x)

Rewriting f as:

f(x) = [n!/((n − x)!·n^x)] · [λ^x/x!] · [(1 − λ/n)^n·(1 − λ/n)^(−x)] = A · B · C

the derivation proceeds. Now, consider the parts A, B and C separately: as n → ∞, A → 1, B = λ^x/x! does not depend on n, and C → e^(−λ) (since (1 − λ/n)^n → e^(−λ) and (1 − λ/n)^(−x) → 1).

Combining the limits:

f(x) = e^(−λ)·λ^x / x!, x = 0, 1, 2, …

is found.
The intuition is as follows: when we consider a Binomial process in which a success occurs with an infinitesimal probability in every infinitesimal time period, and when there are infinitely many time periods as such, what results for a finite time period is nothing but the Poisson distribution. The derivation can be carried out in reference to space rather than time, if you wish.
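A numeric illustration of this limit (with λ = 3 and x = 2 as illustrative choices of ours) shows Binomial(n, λ/n) probabilities approaching the Poisson(λ) probability as n grows:

```python
from math import comb, exp, factorial

# Binomial(n, lam/n) pmf at x versus the Poisson(lam) pmf as n grows.
lam, x = 3.0, 2
poisson = exp(-lam) * lam**x / factorial(x)

for n in (10, 100, 1000, 10000):
    p = lam / n
    binom = comb(n, x) * p**x * (1 - p) ** (n - x)
    print(f"n={n:>6}: Binomial={binom:.6f}  Poisson={poisson:.6f}")
```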
Consider an experiment which consists of randomly drawing n elements without replacement from a set of N elements, r of which are successes and N − r of which are failures. X being the number of successes among the n drawn elements, X has a Hypergeometric distribution with parameters N, r and n:

f(x) = C(r, x)·C(N − r, n − x) / C(N, n)
Expected Value and Variance:

E(X) = n·(r/N) and Var(X) = n·(r/N)·(1 − r/N)·((N − n)/(N − 1))

Notice that: r/N plays the role of the success probability P of a Binomial experiment, and that: (N − n)/(N − 1) acts as a finite population correction factor, reflecting the drawing without replacement.
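The following sketch evaluates the Hypergeometric PMF for the illustrative values N = 20, r = 8, n = 5 (our own numbers) and confirms the mean n·(r/N):

```python
from math import comb

# Hypergeometric PMF for N = 20, r = 8, n = 5; mean should be n*r/N = 2.
N, r, n = 20, 8, 5

def hyper_pmf(x):
    return comb(r, x) * comb(N - r, n - x) / comb(N, n)

support = range(max(0, n - (N - r)), min(n, r) + 1)
pmf = {x: hyper_pmf(x) for x in support}
EX = sum(x * p for x, p in pmf.items())
print(sum(pmf.values()), EX, n * r / N)     # 1.0 2.0 2.0
```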
Consider an experiment which consists of a sequence of independent and identical Bernoulli trials; the experiment ends when a (one) success is observed. X being the number of trials until one success, X ∼ Geometric(P), i.e., X has a geometric distribution with parameter P:

f(x) = (1 − P)^(x−1)·P, x = 1, 2, 3, …
The construction of f(x) is intuitive, as the experiment will yield x − 1 failures before the 'one and only' success, which occurs at the end, by definition.
Observe below the PDF of X ∼ Geometric(0.80):
Observe below the PDF of X ∼ Geometric(0.50):
Observe below the PDF of X ∼ Geometric(0.20) for 1 ≤ x ≤ 10
Observe below the PDF of X ∼ Geometric(0.20) for 1 ≤ x ≤ 20
Expected Value and Variance: denoting 1 − P = q,

E(X) = 1/P and Var(X) = q/P²
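Summing the series far enough for the tail to vanish gives a quick check of these formulas; the sketch below uses P = 0.2 (an illustrative value of ours):

```python
# Geometric(P): E(X) = 1/P and Var(X) = q/P^2, checked by summing the
# series; terms beyond x = 500 are negligible for P = 0.2.
P = 0.2
q = 1 - P
pmf = {x: q ** (x - 1) * P for x in range(1, 501)}
EX = sum(x * p for x, p in pmf.items())
VarX = sum(x * x * p for x, p in pmf.items()) - EX**2
print(EX, 1 / P)                            # ≈5.0 5.0
print(VarX, q / P**2)                       # ≈20.0 20.0
```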
Consider an experiment which consists of a sequence of independent and identical Bernoulli trials; the experiment ends when r successes are observed. X being the number of trials until r successes, X ∼ NegBin(r, P), i.e., X has a Negative Binomial distribution with parameters r and P:

f(x) = C(x − 1, r − 1)·P^r·(1 − P)^(x−r), x = r, r + 1, …
To develop an intuition for f(x), notice that the last Bernoulli trial yields a success with probability P, and that the x − 1 trials before it yield r − 1 successes with probability C(x − 1, r − 1)·P^(r−1)·(1 − P)^(x−r), according to a Binomial(x − 1, P) distribution; the product of the two probabilities yields the Negative Binomial PDF.
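The sketch below implements f(x) exactly as this intuition prescribes and compares it with a brute-force simulation for r = 4 and P = 0.50 (illustrative values of ours):

```python
from math import comb
import random

# Negative Binomial PMF as 'Binomial(x-1, P) for r-1 successes, times P',
# checked against simulated trial counts.
r, P = 4, 0.50

def negbin_pmf(x):                          # x = trials until the rth success
    return comb(x - 1, r - 1) * P**r * (1 - P) ** (x - r)

def one_run():
    trials = successes = 0
    while successes < r:
        trials += 1
        successes += random.random() < P    # bool counts as 0/1
    return trials

random.seed(2)
sims = [one_run() for _ in range(100_000)]
x = 8
print(negbin_pmf(x), sims.count(x) / len(sims))   # ≈0.137 on both counts
```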
Observe below the PDF of X ∼ NegativeBinomial(4, 0.80):
Observe below the PDF of X ∼ NegativeBinomial(4, 0.65):
Observe below the PDF of X ∼ NegativeBinomial(4, 0.50):
Observe below the PDF of X ∼ NegativeBinomial(4, 0.35):
Observe below the PDF of X ∼ NegativeBinomial(4, 0.20):
We consider here three continuous probability laws.

For X ∼ Uniform(a, b):

f(x) = 1/(b − a), a ≤ x ≤ b
The graph of the PDF of Uniform(0, 4) looks like:
Expected Value and Variance: E(X) = (a + b)/2 and Var(X) = (b − a)²/12.
Go to the Teaching page & experiment with Uniform(a, b) using the file named 'Statistical distributions.xlsx'.
X ∼ Triangular(a, b, c)

a: lower limit, b: mode, c: upper limit
The Triangular distribution is a practical model, mostly useful in business what-if analysis. A symmetric triangular random variable is the sum of two independent, identically distributed uniform random variables.
f(x) = 2(x − a)/((c − a)(b − a)) for a ≤ x ≤ b, and f(x) = 2(c − x)/((c − a)(c − b)) for b < x ≤ c
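The 'sum of two uniforms' claim is easy to see by simulation. In the sketch below, two independent Uniform(0, 2) draws are added and their empirical density is compared with the symmetric Triangular(0, 2, 4) PDF (these particular parameter values are our own illustration):

```python
import random

# Add two independent Uniform(0, 2) draws and compare the empirical
# density against the symmetric Triangular(0, 2, 4) PDF.
random.seed(3)
sums = [random.uniform(0, 2) + random.uniform(0, 2) for _ in range(200_000)]

def tri_pdf(x, a=0.0, b=2.0, c=4.0):        # a: lower, b: mode, c: upper
    if a <= x <= b:
        return 2 * (x - a) / ((c - a) * (b - a))
    if b < x <= c:
        return 2 * (c - x) / ((c - a) * (c - b))
    return 0.0

for lo in (0.5, 1.5, 2.5, 3.5):
    mid = lo + 0.1
    density = sum(lo <= s < lo + 0.2 for s in sums) / len(sums) / 0.2
    print(f"x ≈ {mid}: empirical {density:.3f} vs PDF {tri_pdf(mid):.3f}")
```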
For X ∼ Exponential(λ):

f(x) = λ·e^(−λx), x ≥ 0

Graphs of the Exponential(0.5), Exponential(1.0), Exponential(2.0) and Exponential(4.0) PDFs can be seen in the following figure:
Expected Value and Variance: E(X) = 1/λ and Var(X) = 1/λ².
Go to the Teaching page & experiment with Exponential(λ) using the file named 'Statistical distributions.xlsx'.
To gain some computational insight, consider the completion of a repetitive/routine task by an office employee. Suppose that every repetition of a task takes a random duration which is governed by an Exponential(1/4) distribution. As λ = 1/4, one task is, on average, completed in 4 time units (let’s say, days). Using this information, let’s calculate the following:
|
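Since F(x) = 1 − e^(−λx) for the Exponential(λ), any such probability is a one-liner. The events in the sketch below are illustrative choices of ours, not necessarily the ones asked in class:

```python
from math import exp

# X ~ Exponential(1/4): F(x) = 1 - e^(-x/4). Three illustrative events.
lam = 1 / 4
F = lambda x: 1 - exp(-lam * x)

print(F(4))                                 # within 4 days: 1 - e^(-1) ≈ 0.632
print(1 - F(8))                             # more than 8 days: e^(-2) ≈ 0.135
print(F(8) - F(4))                          # between 4 and 8 days: ≈ 0.233
```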
Given below is the graph of the Normal(4, 1) PDF:
When we add the guidelines that show μ− 3σ, μ− 2σ, μ−σ, μ + σ, μ + 2σ and μ + 3σ, the previous figure looks like:
Displaying the PDFs of Normal(4, 1) and Normal(4, 0.25) together, we notice that the latter has a higher peak:
Displaying the PDFs of Normal(4, 1), Normal(4, 0.25) and Normal(4, 0.09) together, we notice that the last has an even higher peak, the area under each PDF integrating to 1.
Keeping the variance σ2 the same, a change in mean μ results in a shift of the PDF. Compare Normal(4, 1) and Normal(6, 1) below:
Expected Value and Variance: E(X) = μ and Var(X) = σ².

For the variance of the standard normal, consider ∫_{−∞}^{∞} z²·e^(−z²/2) dz. By symmetry, this equals 2·∫_{0}^{∞} z²·e^(−z²/2) dz. Set ω = z²/2, so that z = √(2ω) and dz = dω/√(2ω); the integral becomes √2·∫_{0}^{∞} ω^(1/2)·e^(−ω) dω = √2·Γ(3/2). For α = 3/2, Γ(α) = √π/2, so the full integral equals 2·√2·(√π/2) = √(2π). Dividing by the normalizing constant √(2π) yields Var(Z) = 1, and hence Var(X) = σ²·Var(Z) = σ².
Go to the Teaching page & experiment with Normal(μ, σ²) using the file named 'Statistical distributions.xlsx'.
Z ∼ Normal(0, 1) has the standard normal distribution. If X ∼ Normal(μ, σ²), the random variable Z defined as:

Z = (X − μ)/σ

has a Normal(0, 1) distribution. A casual name for it is the z-distribution, and

f(z) = (1/√(2π))·e^(−z²/2), −∞ < z < ∞
Recall that e is Euler's number, e = 2.71828…, and π = 3.14159…
Notice/recall that the PDF of the Standard Normal (Z) random variable has a unique parametrization. Its PDF with the guidelines that show μ − 3σ = −3, μ − 2σ = −2, μ − σ = −1, μ = 0, μ + σ = 1, μ + 2σ = 2 and μ + 3σ = 3 looks like:
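Computationally, standardization means every Normal probability reduces to the standard normal CDF Φ, which Python can express through math.erf. The sketch below evaluates P(X ≤ 5.5) for the Normal(4, 1) of the figures above (the point 5.5 is our illustrative choice):

```python
from math import erf, sqrt

def phi(z):                                 # standard normal CDF Φ(z)
    return 0.5 * (1 + erf(z / sqrt(2)))

mu, sigma = 4.0, 1.0                        # the Normal(4, 1) of the figures
x = 5.5
z = (x - mu) / sigma                        # standardize
print(z, phi(z))                            # 1.5, ≈0.9332
```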
Go to the Teaching page & experiment with Normal(0, 1) using the file named 'Statistical distributions.xlsx'. Is there anything in Z to experiment with?
To see how/why the Normal approximation to Binomial works, consider the PDF of X ∼ Binomial(80, 0.80) over the domains of 0, 1, ..., 80 and 55, 56, ..., 75 below:
PDF of X ∼ Binomial(80, 0.80) plotted over 0, 1, ..., 80 looks like:
PDF of X ∼ Binomial(80, 0.80) plotted over 55, 56, ..., 75 looks like:
Do you see the Normal-like behavior of X ∼ Binomial(80, 0.80) around its mean, i.e., nP = 80 ⋅ 0.80 = 64? Can you obtain the same with X ∼ Binomial(8, 0.80)? Why?
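A numeric sketch makes the approximation visible: the exact Binomial(80, 0.80) CDF is compared with Φ((k + 0.5 − nP)/√(nP(1 − P))). The continuity correction (the +0.5) is our added refinement, not something the notes insist on:

```python
from math import comb, erf, sqrt

n, p = 80, 0.80                             # mean nP = 64, variance 12.8
phi = lambda z: 0.5 * (1 + erf(z / sqrt(2)))

for k in (60, 64, 68):
    exact = sum(comb(n, x) * p**x * (1 - p) ** (n - x) for x in range(k + 1))
    approx = phi((k + 0.5 - n * p) / sqrt(n * p * (1 - p)))
    print(f"k={k}: exact {exact:.4f}  normal approx {approx:.4f}")
```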
For each integer k, the kth moment of X is denoted by μ′_k and is defined as:

μ′_k = E(X^k)

The kth central moment of X is denoted by μ_k and is defined as:

μ_k = E[(X − μ)^k]

Notice that μ = μ′₁ = E(X). In addition to the mean (expected value) of a random variable, another important moment is the second central moment, μ₂ = E[(X − μ)²], which you know as the variance.
X being a random variable with CDF F_X, the moment generating function (MGF) of X is denoted by M_X(t) and is defined as:

M_X(t) = E(e^(tX))

provided that the expected value exists for t in some neighborhood of zero. That is, there exists h > 0 such that for all −h < t < h, E(e^(tX)) exists. Otherwise, the MGF is said not to exist. Explicitly,

M_X(t) = Σ_x e^(tx)·f(x) (discrete case)

or

M_X(t) = ∫_{−∞}^{∞} e^(tx)·f(x) dx (continuous case)
Distribution | M_X(t) |
Bernoulli(P) | 1 − P + P·e^t |
Binomial(n, P) | (1 − P + P·e^t)^n |
Poisson(λ) | e^(λ(e^t − 1)) |
χ²_n | (1 − 2t)^(−n/2), t < 1/2 |
Exponential(λ) | λ/(λ − t), t < λ |
F_(n1, n2) | Does not exist |
Normal(μ, σ²) | e^(μt + σ²t²/2) |
t_n | Does not exist |
Uniform(a, b) | (e^(tb) − e^(ta))/(t(b − a)) |
If a random variable X has the MGF M_X(t), then

E(X^n) = M_X^(n)(0)

That is, the nth moment of X is equal to the nth derivative of M_X(t) evaluated at t = 0. See after five years: convergence of MGFs.
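As a symbolic check of this rule (using sympy, our own tool choice), differentiating the Poisson MGF from the table at t = 0 recovers E(X) = λ and Var(X) = λ:

```python
import sympy as sp

t, lam = sp.symbols("t lam", positive=True)
M = sp.exp(lam * (sp.exp(t) - 1))           # Poisson(λ) MGF from the table

m1 = sp.diff(M, t).subs(t, 0)               # first moment E(X)
m2 = sp.diff(M, t, 2).subs(t, 0)            # second moment E(X²)
print(sp.simplify(m1))                      # lam
print(sp.simplify(m2 - m1**2))              # lam, i.e., the variance
```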
3.2 EXERCISES ___________________________________________________________
We roll a pair of fair dice. Let X be the random variable that assigns the minimum of the two numbers that turn up to each outcome.
ii. If we know that one of the dice turned up a number less than or equal to 3, what is the probability that X takes a value greater than or equal to 2?
iii. If we know that one of the dice turned up a number less than or equal to 3, what is the probability that X takes a value equal to 3?
v. Find the variance of X.
Solution:
x | f(x) |
1 | 11/36 |
2 | 9/36 |
3 | 7/36 |
4 | 5/36 |
5 | 3/36 |
6 | 1/36 |
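This table can be verified by brute force: enumerate all 36 equally likely rolls and count the minima (a small sketch of ours; fractions print in lowest terms, e.g. 9/36 appears as 1/4):

```python
from collections import Counter
from fractions import Fraction

# Enumerate all 36 rolls; the PMF of the minimum matches the table.
counts = Counter(min(d1, d2) for d1 in range(1, 7) for d2 in range(1, 7))
pmf = {x: Fraction(c, 36) for x, c in sorted(counts.items())}
print(pmf)
```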
Two balls are simultaneously chosen (i.e., chosen without replacement) from an urn containing 3 white, 2 black, and 1 red ball. You are given 2TL for each white ball chosen, you have to pay 1TL for each black ball chosen, and you neither pay nor receive any money for a red ball that is chosen. For example, if you have chosen 1 white and 1 black ball, your net winnings are 2 + (−1) = 1 TL. Let X be the random variable that gives your net winnings.
i. Construct a table that shows the possible values of X and the probabilities associated with each value, i.e., tabulate the probability density (mass) function of X.
ii. Find the expected value of X.
Solution:
x | f(x) |
−2 | 2/30 |
−1 | 4/30 |
1 | 12/30 |
2 | 6/30 |
4 | 6/30 |
A class in statistics has 20 students. In the first midterm 2 students scored 50, 10 scored 60, 1 scored 70, 5 scored 80, and 2 scored 100. Three students are selected at random without replacement. Let X be the median score of the three students.
iii. Given that the median of the scores of the three students selected is greater than or equal to 70, what is the probability that their median is equal to 80?
iv. Find the expected value and variance of X.
Solution: Try on your own if you have time, and just for fun.
We have three coins such that when coin 1 is tossed the probability of observing a head is 0.4, when coin 2 is tossed the probability of observing a head is 0.7, and when coin 3 is tossed the probability of observing a head is 0.2. We first toss coin 1. If we observe a head we toss coin 2 otherwise we choose coin 1 or coin 3 at random and toss it.
ii. Are the events of observing a head on the second toss and observing a head on the first toss independent?
Solution: This question is reserved for in-class discussions.
A fair die is rolled ten times. We are interested in the number of times 6 is obtained.
i. Given our interest, can we think of this experiment as a binomial experiment? If so, describe each Bernoulli trial, i.e., verbally describe the Bernoulli trial, state the outcome that you will call success and the probability of success in each trial.
ii. Let X be the random variable which assigns, to each outcome, the number of times 6 is obtained in the outcome. What is the distribution of X?
iv. With what probability will X take a value greater than or equal to 4?
Solution:
that is, each roll is a Bernoulli(1/6) trial, and X ∼ Binomial(10, 1/6).
It is known that 40% of all students of Economics are male. Independent observers note the gender of 12 random Economics students (a student’s gender might be noted more than once) and we count the number of males observed.
iii. You have been told that at least 2 of the students that have been observed are female. What is the probability that the number of male students, in the observed group, is 5 or less?
Solution:
Consider a game where a round of the game consists of rolling a fair die 10 times. Each time a 1 or 6 comes you win 1TL.
iii. You have learned that two of the rolls of the die resulted with a number different than 1 or 6, but you do not know what the result of the other rolls of the die was. What is the probability that you will win more than 5TL?
iv. What would your average (mean) winnings be if you played this game indefinitely?
Solution:
Based on past data, we know that, on average, 6 customers enter Coffee Break every 20 minutes.
i. What is the probability that at least 2 customers will enter Coffee Break during a given 20-minute time period?
ii. Define the probability of k customers entering Coffee Break in 20 minutes as a mathematical function. Describe what is what in your function clearly.
Solution:
Let X be the random variable that shows the number of customers arriving every 20 minutes. The rate of arriving customers is λ = 6, and X is a Poisson random variable.
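For part i, P(X ≥ 2) = 1 − f(0) − f(1) = 1 − e^(−6)·(1 + 6); the sketch below evaluates it:

```python
from math import exp, factorial

lam = 6                                     # customers per 20 minutes
f = lambda x: exp(-lam) * lam**x / factorial(x)
print(1 - f(0) - f(1))                      # P(X >= 2) ≈ 0.9826
```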
On an ordinary day, on average 3 white cars and 1 blue car pass through a certain cross-section of a road every 5 minutes.
iii. What is the probability that 3 cars (blue or white) will pass in a 5-minute interval?
Solution:
As Poisson λ ’s are additive, X ∼ Poisson (4).
A Hypergeometric story: In a corporation, promotion decisions for employees are made by a committee of 5 people. The decision making procedure has the following steps:
Consider Employee A for whom the chance of a promotion is P in the eyes of each committee member. That is, each committee member has a chance of P to promote Employee A. Also, preferences of committee members are independent from each other’s. Is there a chance to be accidentally or unfairly promoted (or not promoted) in this kind of scheme?
Solution: The solution involves some steps:
First, a ’Promotion’ vote being marked as Success, each committee member’s vote is a Bernoulli trial:
Then, total votes (total of successes) (Y) is a Binomial process:
Then, W being the number of 'Promotion' votes among the final 3, W has a Hypergeometric distribution:
W ∼ Hypergeometric(5, Y, 5 −Y)
So,
Now, your task is to find g(y) for each value of y. Then you will calculate h(w) for each different value of y. At the end, you will compare Employee A's chance of promotion with and without Steps 2 & 3 of the promotion procedure. Note that the result may be a little surprising.
Let X1 be the random variable that gives the number of phone calls that you get between 1 PM and 2 PM. Let X2 be the random variable that gives the number of phone calls that you get between 2PM and 4PM. Assume that X1 is Poisson distributed with parameter 5 and X2 is Poisson distributed with parameter 12. Let X be the random variable that gives the number of phone calls that you get between 1PM and 4PM. Find the PDF of X.
Solution: X1 ∼ Poisson (5) and X2 ∼ Poisson (12). X = X1 + X2, X ∼ Poisson (17). Make sure you have obtained this result by following the chapter’s instructions.
Let X1 be the random variable that gives the number of phone calls that you get between 1PM and 2PM. Let X2 be the random variable that gives the number of phone calls that your friend gets between 1PM and 2PM. Assume that X1 is Poisson distributed with parameter λ1 and X2 is Poisson distributed with parameter λ2. Find the distribution of X = X1 + X2, i.e., the PDF of the random variable that gives the total number of phone calls that you and your friend receive between 1PM and 2PM.
Solution: The solution method is already available in the chapter.
Suppose that you buy 40 lottery tickets. Using the Poisson approximation find the probability of having at least 2 winning tickets, given that the probability of any ticket being a winning ticket is 0.02.
Solution: This is self-study for those who are interested. Not to appear in any examination.
Let X be a random variable that is uniformly distributed over .
Answer the following questions:
Solution:
A potato chips producer starts a promotion program in an effort to boost its sales. Gift tickets are placed in 25 out of every 100 chip bags on sale, and customers are required to collect two tickets to win a free soft drink. By the nature of such promotions, gift tickets are invisible from the outside prior to purchase. In order to attain a probability of at least 90% of winning a soft drink, how many bags of potato chips should an average customer buy? Notes:
Solution: This is to be discussed in class only along with a computer demo.
Let X be a random variable with the following PDF:
x | −3 | −1 | 0 | 1 | 2 | 3 |
f | 0.25 | 0.10 | 0.05 | 0.20 | 0.30 | 0.10 |
Define a new random variable Y as
|
iv. Find the variance of Y
Solution: This question is reserved for in-class discussions.
Let X be a random variable normally distributed with expected value of 2 and variance of 9. Answer the following questions:
Solution:
Reveal how symmetry property is used here.
Study this solution by drawing proper graphs of the PDF of the Standard normal distribution.
It is estimated that 45% of the freshmen entering a particular college will graduate from that college in four years.
i. For a random sample of 5 entering freshmen, what is the probability that exactly 3 will graduate in four years?
ii. For a random sample of 5 entering freshmen, what is the probability that a majority (more than half) will graduate in four years?
iii. 80 entering freshmen are chosen at random. Find the mean and variance of the number of these 80 that will graduate in four years.
Solution:
Bags of flour packed by a particular machine have weights which are normally distributed with a mean of 500gr and a standard deviation of 20gr.
ii. If 2% of the bags are rejected for being underweight, what is the maximum weight for a bag to be rejected as underweight?
iii. Find an interval, symmetric around the mean, such that the probability of the weight of a randomly selected bag being in the interval is 0.90.
Solution:
Use your z table; the answer is 0.30853 + 0.22662, i.e., 0.53515.
X is distributed as Bin. Describe the steps to calculate P for a given k by using a Poisson approximation.
Solution: This is self-study for those who are interested. Not to appear in any examination.
We know that the number of vampires killed by Dean in a typical fight has a Poisson distribution and the number of vampires killed by Sam in a typical fight has a Poisson distribution. Show that the total number of vampires killed in a typical fight follows a Poisson distribution.
Solution: The solution method is already available in the chapter.
X has a Uniform distribution. Calculate P, E and Var.
Solution: X ∼ Uniform(0, 100)
Assume that the number of phone calls that you receive in a day is governed by a Poisson process. Answer the following questions assuming that on average you receive 3.4 phone calls in a day.
ii. Given that you have already received a phone call, what is the probability that you will receive at least 3 phone calls?
The probability density function for a random variable X is defined as:
|
Solution:
Try on your own.
I roll a die repeatedly. In each roll, if the outcome is 3, my score increases by 1; nothing happens otherwise. Knowing that my score was initially zero, what are the expected value and variance of my score right after the 10000th roll?
Solution: Reveal that this physical experiment generates a random variable X with X ∼ Binomial(10000, 1/6). Then, E(X) = 10000 ⋅ (1/6) = 1666.7 and Var(X) = 10000 ⋅ (1/6) ⋅ (5/6) = 1388.9
Suppose that you're in charge of marketing airline seats for a major carrier. Four days before the flight date you have 16 seats remaining on the aircraft. You know from past experience that 80% of the people that purchase tickets in this time period will actually show up for the flight.
A machine that produces stampings for automobile engines is malfunctioning and producing 10% defectives. The defective and nondefective stampings proceed from the machine in a random manner. If the next five stampings are tested, find the probability that three of them are defective.
Solution: This question is reserved for in-class discussions.
The variance of a Poisson random variable X is known to be 4. Calculate manually the probability that X takes a value of at least 2. For ease, take e = 3.
Solution: X ∼ Poisson(λ) with Var(X) = 4. Since Var(X) = λ for a Poisson random variable, λ = 4 here. Then P(X ≥ 2) = 1 − f(0) − f(1) = 1 − e^(−4)·(1 + 4); taking e = 3, e^(−4) = 1/81, so P(X ≥ 2) = 1 − 5/81 = 76/81 ≈ 0.94.
The median and the coefficient of variation for a random variable X ∼ N(μ, σ²) are given as 100 and 0.25, respectively. Given F(−0.84) = 0.20 for the standard normal distribution, calculate the 80th percentile of X.
Solution: Median of a Normal random variable is equal to μ, so μ = 100.
The coefficient of variation is σ/μ = 0.25, so σ = 25 and σ² = 625; hence X ∼ Normal(100, 625). F(−0.84) = 0.20 implies F(0.84) = 1 − 0.20 = 0.80 by the symmetry of the Standard normal distribution. Based on these, the 80th percentile of X is found as: x = μ + 0.84σ = 100 + 0.84·25 = 121.
Self-practice this solution by drawing the Normal and Standard normal PDFs.
Find the value of

∫_{−2}^{1} (1/√(2π))·e^(−z²/2) dz

with proper explanations.
Solution: Notice that the question requires the calculation of F(1) −F(−2) for the Standard normal random variable. It is nothing but the area under the standard normal PDF from −2 to 1. The answer is 0.8186.
Calculate P(X ≤ 8) for X ∼ Bin(1000, 0.010) using the Normal approximation to the Binomial distribution.
Solution: X ∼ Bin(1000, 0.010) has E(X) = 10 and Var(X) = 9.9. So, X can be approximated by X ∼ Normal(10, 9.9). Calculation of P(X ≤ 8) is then:
We make 100 independent observations from a normal population with mean 40 and standard deviation 20. Approximately, what is the probability that the mean of these observations will be less than or equal to 37?
Solution: Indeed, this question is an early reference to sampling distributions. Each of the 100 observations has a Normal(40, 400) distribution; name them X1, X2, …, X100. Try to see that the sample mean X̄ ∼ Normal(40, 4). The calculation of P(X̄ ≤ 37) is then straightforward.
Suppose the PDF of a logistic random variable X is given by
|
Among its many parametrizations, this simple function is something you are familiar with from your lab work during the semester.
iii. Using the graph of F only, find the value of E
Solution:
The class grades after an exam have a normal distribution with a mean of 50 and a variance of 144. If a student is known to have a grade less than 70, what is the probability that she has received a grade between 40 and 60?
Solution: Except for the use of conditional probabilities, this is a trivial question. Try on your own.
An experimenter tosses a coin (with P(Tail) = P) until obtaining r successes (tails). What is the distribution of the number of tosses (X) to get r successes? Derive its PDF. Hint: X = x can occur only if there are exactly r− 1 successes in the first x− 1 trials. When you notice the first x− 1 trials have a Binomial structure, the rest is trivial.
Solution: Pr(x) = P(r − 1 successes in the first x − 1 trials) × P(a success at the xth trial). The first term is nothing but the Binomial PDF and the second term is simply P. So,

f(x) = C(x − 1, r − 1)·P^r·(1 − P)^(x−r), x = r, r + 1, …
Consider a random variable X with f(x) = k·e^(−x²/2), 0 ≤ x < ∞. Notice the resemblance of this PDF to the standard Normal PDF; however, the domain of f(x) spans the nonnegative real numbers. What should k be to make f(x) a proper PDF? Plot f(x) using your solution for k.
Solution: See the past exam questions for a solution.
Earlier we studied bivariate probabilities, yet we have not described them by referring to random variables and distribution functions. This chapter provides a calculus-based treatment of the same topic in an attempt to complete our knowledge of it.
In our earlier study, we discussed probability models and the computation of probabilities mostly for events involving one variable. These were univariate models. Now, we are diving into models that involve more than one random variable, called multivariate models. As we are talking about more than one random variable, they are best represented as an n-dimensional random vector. This random vector is a function from a sample space S into Rⁿ, i.e., n-dimensional Euclidean space.
Let (X, Y) be a random vector. The function f : R² → R defined by

f(x, y) = P(X = x, Y = y)

is called the joint probability distribution function, or joint PDF, of (X, Y) if X and Y are discrete. We denote the function as f_{X,Y}.
If (X, Y) is a continuous random vector and if, for every A ⊂ R²,

P((X, Y) ∈ A) = ∫∫_A f(x, y) dx dy

then f : R² → R is called a joint probability density function or joint PDF of (X, Y).
Note that the following must hold for a properly defined joint PDF. Discrete case:

f(x, y) ≥ 0 for all (x, y)

Σ_x Σ_y f(x, y) = 1

Continuous case:

f(x, y) ≥ 0 for all (x, y)

∫_{−∞}^{∞} ∫_{−∞}^{∞} f(x, y) dx dy = 1
Let (X, Y) be a discrete bivariate random vector with joint PDF f_{X,Y}. Then, the marginal PDFs of X and Y are:

f_X(x) = Σ_y f_{X,Y}(x, y)

and

f_Y(y) = Σ_x f_{X,Y}(x, y)
Let (X, Y) be a continuous bivariate random vector with joint PDF f_{X,Y}. Then, the marginal PDFs of X and Y are:

f_X(x) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dy

and

f_Y(y) = ∫_{−∞}^{∞} f_{X,Y}(x, y) dx
Let (X, Y) be a discrete bivariate random vector with f_{X,Y}, f_X, and f_Y. Then, the conditional PDFs are:

f(y | x) = f_{X,Y}(x, y) / f_X(x), for f_X(x) > 0

and

f(x | y) = f_{X,Y}(x, y) / f_Y(y), for f_Y(y) > 0
Let (X, Y) be a continuous bivariate random vector with f_{X,Y}, f_X, and f_Y. Then, the conditional PDFs are:

f(y | x) = f_{X,Y}(x, y) / f_X(x), for f_X(x) > 0

and

f(x | y) = f_{X,Y}(x, y) / f_Y(y), for f_Y(y) > 0
Let (X, Y) be a bivariate random vector with f_{X,Y}, f_X and f_Y. Then, X and Y are called independent random variables if, for every x ∈ R and y ∈ R,

f_{X,Y}(x, y) = f_X(x)·f_Y(y)

If X and Y are independent,

f(y | x) = f_Y(y)

and

f(x | y) = f_X(x)
Notice that, except for the minor changes in notation, these definitions are the same as before.
The covariance of X and Y is the number defined by:

Cov(X, Y) = E[(X − μ_X)(Y − μ_Y)]
The correlation of X and Y is the number defined by:

ρ_XY = Cov(X, Y)/(σ_X·σ_Y)
This value is also called the correlation coefficient. Note that, −1 ≤ ρXY ≤ 1.
For any random variables X and Y,

Cov(X, Y) = E(XY) − E(X)·E(Y)

So,

Var(X + Y) = Var(X) + Var(Y) + 2·Cov(X, Y)
If X and Y are independent random variables, then

Cov(X, Y) = 0

and

ρ_XY = 0
Let X and Y be any two random variables, and let a and b be any two constants; then

Var(aX + bY) = a²·Var(X) + b²·Var(Y) + 2ab·Cov(X, Y)

or

Var(aX − bY) = a²·Var(X) + b²·Var(Y) − 2ab·Cov(X, Y)
If X and Y are independent random variables with moment generating functions M_X(t) and M_Y(t), then the moment generating function of X + Y is given by:

M_{X+Y}(t) = M_X(t)·M_Y(t)
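A symbolic sketch (sympy, our own tool choice) uses this product rule to re-derive the chapter's Poisson additivity result: the product of Poisson(λ1) and Poisson(λ2) MGFs is exactly the Poisson(λ1 + λ2) MGF.

```python
import sympy as sp

t, l1, l2 = sp.symbols("t l1 l2", positive=True)
MX = sp.exp(l1 * (sp.exp(t) - 1))           # MGF of Poisson(l1)
MY = sp.exp(l2 * (sp.exp(t) - 1))           # MGF of Poisson(l2)
Msum = sp.exp((l1 + l2) * (sp.exp(t) - 1))  # MGF of Poisson(l1 + l2)
print(sp.simplify(MX * MY - Msum))          # 0: the product rule checks out
```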
3.3 EXERCISES ___________________________________________________________
Fill the empty cells in the tables below, which are the tables of the joint CDF and joint PDF of the random variables (X, Y).
f | 0.0 | 0.5 | 1.0 |
1 | 0.1 | ||
2 | 0.1 | ||
3 | |||
F | 0.0 | 0.5 | 1.0 |
1 | 0.2 | ||
2 | 0.5 | 0.65 | 0.75 |
3 | 0.5 | 0.8 | |
Let X and Y be two discrete random variables with joint density function f : R² → R such that
|
First we pick a number, at random, from the interval , then we pick a number, at random, from the interval . Let X1 be the random variable that gives the value of the first number and X2 the random variable that gives the value of the second number. The distribution of X1 is uniform over and the distribution of X2, given that X1 = x1, is uniform over