In order to solve and understand problems pertaining to probability you must know some vocabulary:

- An
**experiment**, such as rolling a die or tossing a coin, is a set of trials designed to study a physical occurrence. - An
**outcome**is any result of a particular experiment. For example, the possible outcomes for flipping a coin are heads or tails. - A
**sample space**is a list of all the possible outcomes of an experiment. - An
**event**is a subset of the sample space. For example, getting heads is an event.

#### Counting Principles

- Theorem (The basic principle of counting): If the set
*E*contains*n*elements and the set*F*contains*m*elements, there are*nm*ways in which we can choose, first, an element of*E*and then an element of*F*. - Theorem (The generalized basic principle of counting): If
*r*experiments that are to be performed are such that the first one may result in any of n_{1}possible outcomes, and if for each of these n_{1}possible outcomes there are n_{2}possible outcomes of the second experiment, and if for each of the possible outcomes of the first two experiments there are n_{3}possible outcomes of the third experiment, and if …, then there is a total of n_{1}.n_{2}…n_{r}, possible outcomes of the r experiments. - Theorem: A set with
*n*elements has 2^{n}subsets. - Tree diagrams

#### Permutations

- Permutation: n!

The number of permutations of*n*things taken*r*at a time: - Theorem: The number of distinguishable permutations of
*n*objects of*k*different types, where n_{1}are alike, n_{2}are alike, …, n_{k}are alike and n=n_{1}+n_{2}+…+n_{k}, is

#### Combinations

- Combination: The number of combinations of
*n*things taken*r*at a time: (combinatorial coefficient; binomial coefficient) - Binomial theorem:
- Multinomial expansion: In the expansion of (
*x*)_{1}+x_{2}+…+x_{k}, the coefficient of the term , n^{n}_{1}+n_{2}+…+n_{k}=n, is . Therefore, (x_{1}+x_{2}+…+x_{k})^{n }= .**Note**that the sum is taken over all nonnegative integers*n*such that_{1}, n_{2},…,n_{k}*n*._{1}+n_{2}+…+n_{k}=n

#### The Number of Integer Solutions of Equations

- There are distinct positive integer-valued vectors (
*x*) satisfying_{1},x_{2},…,x_{y}*x*+_{1}*x*+…+_{2}*x*= n,_{r }*x*>0,_{i}*i*=1,2,…,r. - There are distinct nonnegative integer-valued vectors (
*x*) satisfying_{1},x_{2},…,x_{y}*x*+_{1}*x*+…+_{2}*x*= n._{r }

### Axioms of Probability

#### Sample Space and Events

**Experiment:**An experiment (strictly speaking, a random experiment) is an operation which can result in two or more ways. The possible results of the experiment are known as outcomes of the experiment. These outcomes are known in advance. We cannot predict which of these possible outcomes will appear when the experiment is done.**Sample Space:**The set containing all the possible outcomes of an experiment as its element is known as sample space. It is usually denoted by S.**Event:**An event is a subset of the sample space S.

**Illustration 1:**

Let us consider the experiment of tossing a coin. This experiment has two possible outcomes: heads (H) or tails (T)

- sample space (S) = {H, T}

We can define one or more events based on this experiment. Let us define the two events A and B as:

A: heads appears

B: tails appears

It is easily seen that set A (corresponding to event A) contains outcomes that are **favorable to event A and set B contains outcomes favorable to event B.** Recalling that n (A) represents the number of elements in set A, we can observe that

n (A) = number of outcomes favorable to event A

n (B) = number of outcomes favorable to event B

n (S) = number of possible outcomes

Here, in this example, n (A) = 1, n (B) = 1, and n (S) = 2.

- Set theory concepts: set, element, roster method, rule method, subset, null set (empty set).
- Complement: The complement of an event
*A*with respect to*S*is the subset of all elements of S that are not in*A*. We denote the complement of*A*by the symbol*A’*(*A*).^{c} - Intersection: The intersection of two events
*A*and*B*, denoted by the symbol*A**∩**B*, is the event containing all elements that are common to*A*and*B*.

Two events*A*and*B*are mutually exclusive, or disjoint, if*A**∩**B = φ*that is, if*A*and*B*have no elements in common. - The union of the two events
*A*and*B*, denoted by the symbol*A*∪*B*, is the event containing all the elements that belong to*A*or*B*or both.

- Venn diagram:
- Sample space of an experiment: All possible outcomes (points)
- Events: subsets of the sample space

Impossible events (impossibility): Φ; sure events (certainty): S.

- DeMorgan’s laws:

#### Axioms of Probability

- Probability axioms:

(1) 0≤P(A)≤1;

(2) P(S)=1;

(3) P(A_{1}∪A_{2}∪…) = P(A_{1})+P(A_{2})+… If {A_{1}, A_{2}, …} is a sequence of mutually exclusive events. - Equally likely outcomes: the probabilities of the single-element events are all equal.

A number of sample events are said to be equally likely if there is no reason for one event to occur in preference to any other event.

#### Basic Theorems

(1) 0≤P(A)≤1;

(2)

(3) Complementary events:

(4) P(A∪B) = P(A) + P(B) – P(A∩B): inclusion-exclusion principle

(5) If A_{1}, A_{2}, A_{n} is a partition of sample space *S*, then

(6) if a and a’ are complementary events, then P(A) + P(A’)=1.

### Conditional Probability and Independence

#### Conditional Probability

1. Conditional probability:

Consider two events A and B defined on a sample space S. The probability of occurrence of event A given that event B has already occurred is known as the conditional probability of A relative to B.

**P(A/B) = P(A given B) = probability of occurrence of A assuming that B has occurred**

It implies that the outcomes favorable to B become the possible outcomes and hence outcomes favorable to P(A/B) are outcomes common to A and B.

Let n(B)=a and n(A∩B)=b. Suppose that the event B occurs. Then there are exactly a sample points and these points can be considered to be the sample space for the other event A. The event A in this sample space consists of b sample points common to A and B.

Therefore, the probability of A in this sample space = b/a.

Thus the probability of A under the assumption that event B has occurred is:

and similarly

The above result is known as **conditional probability theorem.**

2. If in an experiment the events A and B can both occur, then P(A∩B) = P(A)P(B|A) = P(B)P(A|B).

P(A∩B∩C) = P(A∩B)P(C|A∩B) = P(A)P(B|A)P(C|A∩B),

The multiplication rule: P(A_{1}∩A_{2} ∩…∩ A_{n}) = P(A_{1})P(A_{2}|A_{1})P(A_{3}|A_{1}∩A_{2}) … P(A_{n}|A_{1}∩A_{2}∩…∩A_{n-1}).

3. Partition: Let {B_{1}, B_{2}, …, B_{n}} be a set of nonempty subsets of the sample space *S* of an experiment. If the events B_{1}, B_{2}, …, B_{n} are mutually exclusive and B_{1}∪B_{2}∪…∪B_{n} = S, the set {B_{1}, B_{2}, …, B_{n}} is called a partition of *S*.

4. Theorem of total probability: If B_{1}, B_{2}, … is a partition of *S*, and *A* is any event, then

Total Probability Theorem: The probability that one of several mutually exclusive events A_{1}, A_{2}, .... A_{n} will happen is the sum of the probabilities of the separate events. i.e.

P(A_{1}∪A_{2}∪A_{n}) = P(A_{1})+P(A_{2})+.... + P(A_{n})

5. Bayes’ Theorem: If B_{1}, B_{2}, … is a partition of *S*, and *A* is any event, then

#### Independence

- Independent events: If
*A*,*B*are independent events ↔ P(A∩B) = P(A)P(B). - Theorem: If
*A*and*B*are independent, then are independent. - The events
*A*,*B*, and*C*are called independent if P(A∩B) = P(A)P(B), P(A∩C) = P(A)P(C), P(B∩C) = P(B)P(C), P(A∩B∩C) = P(A)P(B)P(C). If*A*,*B*, and*C*are independent events, we say that {A, B, C} is an independent set of events. - The set of events {A
_{1}, A_{2}, …, A_{n}} is called independent if for every subset of {A_{1}, A_{2},…, A_{n}}, .

#### Important Results:

Following results are easily derived from the definition of independent events.

- A and B are independent if

P(B/A) = P(B) and P(A/B) = P(A) - If A and B are independent, then P (A∩B) = P (A) P (B)
- A set of events A
_{1}, A_{2},... A_{n}are said to be pair-wise independent if

P(A_{i}n A_{j}) = P(A_{i}) P(A_{j}) ∀i ≠j. - The events A
_{1}, A_{2}, ... A_{n}are mutually independent (or simply independent) if and only if the multiplication rule:

P(A_{1}∩A_{2}∩**....**A_{k}) = P(A_{1}) P(A_{2}) .... P(A_{k}) ... (1)

holds for every t triples of events, k = 2, 3, .... n. If (1) holds for k = 2 and may not hold for k = 3, 4, ....... n then events A_{1}, A_{2}, ... A_{n}are said to be pair wise independent. Thus mutually independent events are pair wise independent but converse is not true. - If A and B are mutually independent events such that P (A) ≠ 0 and P (B) ≠ 0, then the events A and B have at least one common sample point (i.e. they cannot be mutually exclusive). Or in general,
**mutually exclusive events are dependent events.**

### Distribution Functions and Discrete Random Variables

#### Distribution Functions

1. Cumulative Distribution Functions (*cafe*): F_{X}(t) = P(X≤t), -∞ < t < ∞.

2. F_{X}(t) is non-decreasing; 0≤*F _{X}*(

*t*)≤1;

If c<d, then *F _{X}*(c)≤

*F*(

_{X}*d*);

*P*(

*c*<

*X*≤

*d*) =

*F*(

_{X}*d*)-

*F*(

_{X}*c*);

*P*(

*X*>

*c*) = 1-

*F*(

_{X}*c*).

The *cdf* of a discrete random variable: a step function.

### Special Discrete Distributions

#### Bernoulli and Binomial Random Variables

1. Bernoulli trials: an experiment with two different possible outcomes

Bernoulli random variable *X* with parameter *p*, *p* is the probability of a success.

*p _{x}*(

*x*) =

*p*(1-

^{x}*p*)

^{1-x}=

*p*

^{x}q^{1-x}, for

*x*∈

*R*= {0,1}

_{X}Expected value: *E[X] = μ _{x} = p*; variance:

2. Binomial distribution: number of successes to occur in *n* repeated, independent Bernoulli trials

Binomial random variable *Y* with parameters *n* and *p*

Expected value: *E(Y) = μ _{Y} = np*; Variance:

#### Multinomial Random Variables

1. Multinomial trials: an experiment with k≥2 different possible outcomes

2. Multinomial distribution: *n* independent multinomial trials

Multinomial random variable *X _{1}, X_{2}, …, X_{k}* with parameters

*n, p*: the number of

_{1}, p_{2}, …p_{k}; X_{t}*i*th outcomes; ; ;

#### Geometric distribution: trial number of the first success to occur in a sequence of independent Bernoulli trials

Geometric random variable *N* with parameter *p*

Geometric series:

Expected value: ; variance:

#### Negative binomial distribution: trial number of the *r*th success to occur

Negative binomial random variable *N*, with parameters *r*, *p*

Expected value: ; variance:

#### Poisson Distribution

1. The Poisson probability function:

Poisson random variable *K* with parameter λ

Expected value: *E(K) = μ _{K} = λ*; variance:

2. he Poison approximation to the binomial: If* X* is a binomial random variable with parameters *n* and *p=λ/n*, then

### Continuous Random Variables

#### Probability Density Functions

1. Densities

2. Probability density function (*pdf*) for a continuous random variable *X*: f_{x}(x)

*f _{x}(x)≥0*

For all

#### Cumulative Distribution Functions (*cdf*)

1.

2. The *cdf* of a discrete random variable: a step function; the *cdf* of a continuous random variable: a continuous function

3. The probability function of a discrete random variable: size of the jump in *F _{X}(t)*; the

#### Expectations and Variances

1. Definition: If *X* is a continuous random variable with *pdf f _{x}(x)*, the expected value of

*X*is defined by .

Example: In a group of adult males, the difference between the uric acid value and 6, the standard value, is a random variable X with the following *pdf*: *f _{x}(x) *= (3x

^{2}– 2x) if 2/3<x<3. Calculate the mean of these differences for the group.

2. Theorem: Let *X* be a continuous random variable with *pdf* *f _{x}(x)*; then for any function

*h*: .

3. Corollary: Let *X* be a continuous random variable with *pdf f _{x}(x)*. Let

*h*be real-valued functions and α

_{1}, h_{2}, …, h_{n}_{1}, α

_{2}, …, α

_{n}be real numbers. Then

*E[α*.

_{1}h_{1}(X) + α_{2}h_{2}(X) + … + α_{n}h_{n}(X)] = α_{1}E[h_{1}(X)] + α_{2}E[h_{2}(X)] + … + α_{n}E[h_{n}(X)]4. Definition: If *X* is a continuous random variable with *E[X] = μ _{x}*, then

*Var[X]*and σ

_{x}, called the variance and standard deviation of

*X*, respectively, are defined by , .

### Special Continuous Distributions

#### Uniform Random Variable

1. Density of a uniformly distributed random variable:

2. The *cdf* of a uniformly distributed random variable:

3. The expected value and variance of the uniform random variable: ; .

#### The Exponential Distribution

1. The exponential probability law with parameter λ:

*T*_{1} is the time of occurrence of the first event in a Poisson process with parameter λ, starting at an arbitrary origin t=0; {*X _{λt} *= 0} ≡ {T

_{1}>t}, Poisson process

*X*is the number of events to occur with parameter

_{λt}*μ=λt*.

2. The expected value and variance of the exponential random variable:

#### The Erlang Distribution

1. The *cdf* and *pdf* for *T*_{2}:

*T*_{2} Is the time of occurrence of the second event in a Poisson process with parameter λ, starting at an arbitrary origin t=0; {*X _{λt} ≤ *1} ≡ {T

_{2}>t}, Poisson process

*X*is the number of events to occur with parameter

_{λt}*μ=λt*?

2. The Erlang probability law with parameters *r* and λ:

*T*_{r} is the time of occurrence of the *r*th event in a Poisson process with parameter λ, starting at an arbitrary origin t=0; {*X _{λt} ≤ r - *1} ≡ {T

_{r}>t},

*X*is the number of events to occur.

_{λt}3. The expected value and variance of the Erlang random variable:

#### The Gamma Distribution

1. The gamma probability law with parameters *n* and λ: for u>0, n>0, λ>0.

Gamma function: ; Γ(n) = (n-1)Γ(n-1); Γ(n) = (n-1)!, if *n* is a positive integer.

The Erlang random variable is a particular case of a gamma random variable, where *n* is restricted to the integer values r=1,2,3,….

2. The expected value and variance of the gamma random variable:

#### The Normal (Gaussian) Distribution

1. The normal probability law with parameters μ and σ: , for -∞<x<∞,σ>0.

2. The expected value and variance of the normal random variable: .

3. Standard normal random variable (the unit normal): , μ=0, σ=1.

4. The new random variable: If *X* is normal with mean μ and variance σ^{2}, then *X+b* is normal with mean μ+b and variance σ^{2}; *aX* is normal with mean *aμ* and variance *a ^{2}σ^{2}; aX+b *is normal with mean

*aμ+b*and variance

*a*.

^{2}σ^{2}5. Switching from a non-unit normal to the unit normal: , *μ _{z}*=0,

*σ*=1.

_{z}6. The probability that a normal random variable is with *k* standard deviations of the mean: *P*(*μ-kσ≤X≤μ+kσ*) = *P*(*-k≤X*≤k*), *X** is the unit normal.

7. The normal approximation to the binomial: A binomial distribution *X* with parameters *n*, *p* where *n* is large, then *X* is approximately normal with μ = np, σ^{2} = npq.

### Jointly Distributed Random Variables

#### Joint Densities

1. Jointly distributed random variable: If the observed values for two or more random variables are simultaneously determined by the same random mechanism.

2. The discrete random variables X_{1}, X_{2}: the probability function is for all x_{1}, x_{2}; . The continuous random variables X_{1}, X_{2}: the *pdf* is positive; for all x_{1}, x_{2}; ; the probability is given by integrating .

3. The joint density of independent random variables: The random variables *X*, *Y* are independent if and only if p_{X,Y}(x,y) = p_{X}(x)p_{Y}(y), when *X*, *Y* are discrete; *f _{X,Y}(x,y) = f_{X}(x)f_{Y}(y)*, when

*X*,

*Y*are continuous.

4. Uniform joint densities: If *X* and *Y* are jointly uniform on a region, the joint *pdf* is , for (*x*, *y*) in the region.

The joint density of independent uniform random variables:

#### Marginal Densities

1. The discrete random variables *X _{1}, X_{2}* with the probability function and range , the marginal probability function for

*X*

_{1}is for is the marginal range for

*X*

_{1}, the set of first elements of .

2. The continuous random variables *X _{1}, X_{2 }*with

*X*is for . is the marginal range for

_{1}*X*, the set of first elements of .

_{1}#### Sums of Independent Random Variables

1. The sum of independent binomials with a common *p*: *X _{1}* with parameters n

_{1}, p, X

_{2}with parameters

*n*

_{2},

*p*, then

*X*with parameters n

_{1}+X_{2}_{1}+n

_{2},p

2. The sum of independent Poissons: *X*_{1} with parameter *λ*_{1}, *X*_{2} with parameters *λ*_{2}, then *X _{1}+X_{2}* with parameters

*λ*

_{1}+

*λ*

_{2}

3. The sum of independent exponentials with a common *λ: X _{1}, …, X_{n}* with parameter

*λ*, then

*X*

_{1}+…+

*X*

_{n}has a gamma(Erlang) distribution with parameters n,

*λ*.

4. The density of the sum of two arbitrary independent random variables: the convolution of the individual *pdf*,

5. The sum of independent normals: *X*_{1} with mean *μ*_{1} and variance with mean μ_{2} and variance , then X_{1}+X_{2} with mean μ_{1}+μ_{2} and variance

** ==========================================**

## Comments

write a comment