3.1 - Introduction to probability

Introduction to Probability

Conducting Inferential Statistics

Population - {A,B,C,D,E,F}
Interest - proportion of vowels in population
Sampling method - simple random sampling with n=3

Sample space
Definition

The sample space is the set of every possible outcome

ABC, ABD, ABE, ABF, ACD, ACE, ACF, ADE, ADF, AEF, BCD, BCE, BCF, BDE, BDF, BEF, CDE, CDF, CEF, DEF

$$|S| = 20$$
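The list above can be rebuilt mechanically. This is a minimal sketch, assuming Python and its standard `itertools` module (neither is part of the original notes); the names `population` and `sample_space` are only illustrative.

```python
from itertools import combinations

population = "ABCDEF"

# Every unordered sample of size 3 drawn without repetition (SRS).
sample_space = ["".join(s) for s in combinations(population, 3)]

print(sample_space)
print(len(sample_space))  # 20
```

Because `combinations` never repeats a letter and ignores order, each sample shows up exactly once.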
Definitions

A random experiment is a phenomenon in which a single outcome will occur from a set of possible outcomes - sample space

If we are using SRS, by definition every sample of size 3 has an equal likelihood of occurring. When every outcome is equally likely, we are in the realm of classical probability

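What makes a sample unbiased? - The sample statistic is equal to the population parameter. Here the population proportion of vowels is p = 2/6 = 1/3, so the unbiased samples are the ones with exactly one vowel

$$p = \hat{p}$$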
Probability notation
Definition

If A is an event for some random experiment, the probability of A occurring, that is, of the outcome being one of the outcomes in A, is denoted as P(A)

Examples

$$P(\text{only one vowel}) = \frac{12}{20} = 60\%$$
$$P(\text{unbiased}) = \frac{12}{20} = 60\%$$
$$P(\text{3 vowels}) = 0$$ - impossible event
$$P(\text{3 consonants}) = \frac{4}{20} = 20\%$$
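These probabilities can be checked by counting outcomes directly. The sketch below assumes Python; `prob` and `vowel_count` are hypothetical helper names introduced here, not something from the notes.

```python
from fractions import Fraction
from itertools import combinations

VOWELS = set("AE")
sample_space = list(combinations("ABCDEF", 3))


def vowel_count(sample):
    """Number of vowels in one sample."""
    return sum(letter in VOWELS for letter in sample)


def prob(event):
    """Classical probability: outcomes in the event over all outcomes."""
    favourable = [s for s in sample_space if event(s)]
    return Fraction(len(favourable), len(sample_space))


p_one_vowel = prob(lambda s: vowel_count(s) == 1)
p_three_vowels = prob(lambda s: vowel_count(s) == 3)
p_three_consonants = prob(lambda s: vowel_count(s) == 0)

print(p_one_vowel, p_three_vowels, p_three_consonants)  # 3/5 0 1/5
```

`Fraction` keeps the results exact, so 12/20 is shown in lowest terms as 3/5.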

Complement
Definition

If A is an event for some random experiment, the complement of A, denoted as $\bar{A}$, consists of all the outcomes in the sample space that are not in A

$$P(A) + P(\bar{A}) = 100\%$$

Complements are SETS. The probability of a complement is a number (a percentage)

Examples

A - sample contains no vowels
$\bar{A} = \{ABC, ABD, \dots, DEF\}$ - all outcomes that have at least one vowel
B - sample contains at least 2 vowels
$\bar{B}$ - sample contains no vowels or 1 vowel
C - sample contains one or three consonants
$\bar{C}$ - sample contains no consonants or two consonants
D - sample contains more than 2 consonants
$\bar{D}$ - sample contains 2 or fewer consonants, i.e. it has at least one vowel

Probability rules

Given any event A

$$0 \le P(A) \le 1$$
$$P(\bar{A}) = 1 - P(A)$$

If P(A)=1, A is a certain event
If $P(\bar{A}) = 1$, A is an impossible event

Examples

A - Sample contains both vowels and consonants
$\bar{A}$ - sample contains only vowels or only consonants
$$|\bar{A}| = 4$$
$$P(A) = 1 - P(\bar{A}) = 1 - \frac{4}{20} = \frac{16}{20} = 80\%$$
B - Sample contains equal numbers of vowels and consonants
P(B)=0
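The complement rule can be compared against a direct count. This is only a sketch, assuming Python; the variable names are illustrative and not from the notes.

```python
from fractions import Fraction
from itertools import combinations

VOWELS = set("AE")
sample_space = [set(s) for s in combinations("ABCDEF", 3)]

# Complement of A: the sample is all vowels or all consonants.
# (s <= VOWELS is the subset test, s & VOWELS is the intersection.)
not_a = [s for s in sample_space if s <= VOWELS or not (s & VOWELS)]
p_not_a = Fraction(len(not_a), len(sample_space))
p_a = 1 - p_not_a

# Direct count of A: at least one vowel and at least one consonant.
a = [s for s in sample_space if (s & VOWELS) and (s - VOWELS)]

print(len(not_a), p_a, len(a))  # 4 4/5 16
```

Counting the 4 outcomes of the complement is less work than counting the 16 outcomes of A, which is the usual reason to reach for the rule.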

Methods of probability

Classical probability

We utilize classical probability when every outcome in our sample space is reasonably understood to be equally likely

$$P(A) = \frac{\text{\# of outcomes in } A}{\text{\# of outcomes in sample space}}$$
Empirical probability

We utilize empirical probability when we estimate probabilities through repeated trials of the random experiment

$$P(A) \approx \frac{\text{\# of times } A \text{ occurs}}{\text{\# of times the random experiment is run}}$$

Nobody tells us that every face of a die is equally likely, but after rolling it many times we can deduce it from how often each face comes up

Subjective Probability

We engage in subjective probability when we assign a numerical value to the probability of an event occurring based on personal judgment, using past experiences and opinions

Foundations of Empirical Probability

The probability of an event A remains constant across all previous, current, and future trials

The computed relative frequency of an event A approaches the 'true' probability of the event as the number of trials increases
This is called the Law of Large Numbers
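The Law of Large Numbers can be watched in a small simulation. The sketch below assumes Python's `random` module as a stand-in for a fair die; the function name `relative_frequency` and the fixed seed are illustrative choices, not part of the notes.

```python
import random


def relative_frequency(trials, seed=0):
    """Share of simulated fair-die rolls that come up six."""
    rng = random.Random(seed)  # seeded so that a run can be repeated
    hits = sum(rng.randint(1, 6) == 6 for _ in range(trials))
    return hits / trials


# The relative frequency should drift towards 1/6 (about 0.1667)
# as the number of trials grows.
for n in (10, 100, 10_000, 100_000):
    print(n, relative_frequency(n))
```

With few trials the relative frequency can be far from 1/6; with many trials it tends to settle close to it.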

Examples

For each random experiment, identify the sample space and determine which method of probability would be most appropriate to utilize

A married couple planning on having 4 children
sample space size - $$|S| = 2^4 = 16$$
Technically boys and girls are not born in exactly equal proportions, so the outcomes are not equally likely and it should be empirical probability

The Chiefs playing out the 2026 preseason games
sample space size - $$|S| = 2^3 = 8$$
Assuming the Chiefs have a 50% chance of winning each game, it's classical probability

Rolling a pair of standard, fair, six-sided dice
sample space size - $$|S| = 6^2 = 36$$
A - getting at least one 1 - $$P(A) = \frac{11}{36}$$
B - getting a prime sum - $$P(B) = \frac{15}{36}$$
C - getting a different number on each die - $$P(C) = 1 - \frac{6}{36} = \frac{30}{36} = \frac{5}{6}$$
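D - sum being an odd number, or product > 34 - $$P(D) = \frac{18}{36} + \frac{1}{36} = \frac{19}{36}$$ (only the roll (6,6) has a product above 34, and its sum is even, so the two cases do not overlap)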
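Events A, B and C for the two dice can be verified by listing all 36 rolls. This is a sketch under the assumption that Python is available; `rolls`, `PRIMES` and `prob` are names made up for the illustration.

```python
from fractions import Fraction
from itertools import product

rolls = list(product(range(1, 7), repeat=2))  # 36 ordered pairs
PRIMES = {2, 3, 5, 7, 11}                     # prime sums possible with two dice


def prob(event):
    """Classical probability over the 36 equally likely rolls."""
    return Fraction(sum(1 for r in rolls if event(r)), len(rolls))


p_a = prob(lambda r: 1 in r)            # at least one 1
p_b = prob(lambda r: sum(r) in PRIMES)  # prime sum
p_c = prob(lambda r: r[0] != r[1])      # different numbers

print(p_a, p_b, p_c)  # 11/36 5/12 5/6
```

`Fraction` reduces 15/36 to 5/12, which is the same value.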

Counting

Basic Counting of Events

Fundamental Counting Principle

Identical and Independent Random Sampling

A random sample is formed using identical and independent random sampling (i.i.d.) when each member of a sample is chosen one-by-one with replacement where each member is equally likely for each selection

Basically - i.i.d. is SRS allowing repetition

Exercise

Sample space from i.i.d. of 3 letters from the first 6 letters of the alphabet
AAA, AAB, ABA, ABC, ABD, ...
Each sample is created by choosing one of the 6 letters three times, so the sample space size is $$|S| = 6 \cdot 6 \cdot 6 = 6^3 = 216$$
Is the classical probability method appropriate?
Yes, it is appropriate, as every one of the 216 outcomes is equally likely and we can easily create the whole sample space and work on it

Determine the size of the sample space for a random experiment consisting of conducting i.i.d. random sampling of size n from a population of size N $$|S| = N^n$$
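The i.i.d. sample space can be generated in the same spirit as the SRS one. A minimal sketch, assuming Python; `itertools.product` is used here as a stand-in for choosing with replacement where order matters.

```python
from itertools import product

population = "ABCDEF"
n = 3

# Each of the n positions can hold any of the N letters.
samples = ["".join(s) for s in product(population, repeat=n)]

print(samples[:5])   # ['AAA', 'AAB', 'AAC', 'AAD', 'AAE']
print(len(samples))  # 216
```

The length matches N^n = 6^3, since each position is filled independently.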

Counting and order

Determine if the selection order matters
Musical pieces for a dance - YES
Starting players on a basketball team - NO
Members of a simple random sample - NO
Members of an i.i.d. sample - YES

Examples

A band has 20 songs and is planning a 10-song concert with an encore piece, so 11 pieces are played in order. How many different concerts can they choose from?
$$20 \cdot 19 \cdots 10 \approx 6.7 \times 10^{12}$$

Permutation

Definition

The number of ways that r objects can be selected from n objects when the order of selection matters is called the number of permutations

$$P(n,r) = {}_nP_r = n(n-1)(n-2) \cdots (n-r+1)$$
$$P(n,n) = {}_nP_n = n(n-1) \cdots 3 \cdot 2 \cdot 1 = n!$$
$$P(n,r) = {}_nP_r = \frac{n!}{(n-r)!}$$

$${}_{20}P_{11} = \frac{20!}{9!}$$

In Excel, we can use =PERMUT(n, r) to calculate the permutation value
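Outside Excel, the same value can be computed in three equivalent ways. The sketch assumes Python 3.8 or newer, where `math.perm` and `math.prod` are available; the variable names are illustrative.

```python
import math

n, r = 20, 11  # 20 songs, 11 ordered slots (10 songs plus the encore)

by_product = math.prod(range(n - r + 1, n + 1))             # 10 * 11 * ... * 20
by_factorials = math.factorial(n) // math.factorial(n - r)  # 20! / 9!
by_builtin = math.perm(n, r)

print(by_product, by_factorials, by_builtin)
# 6704425728000 6704425728000 6704425728000
```

All three agree with the concert example, about 6.7 * 10^12.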

Combinations

Counting when order does not matter
Determine the size of the sample space for SRS of size 3 from a population of 5

$$\frac{{}_5P_3}{6} = \frac{5 \cdot 4 \cdot 3}{6} = \frac{60}{6} = 10$$

We divide by 6 = 3!, because counting with order would treat {ABC,ACB,BAC,BCA,CAB,CBA} as six different samples, while in SRS they are all the same sample

Formula

$$C(n,r) = \binom{n}{r} = \frac{P(n,r)}{r!} = \frac{n!}{r!(n-r)!}$$

In Excel, we can use =COMBIN(n, r)
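The link between permutations and combinations can be seen by listing both for a population of 5. This is a sketch assuming Python 3.8+ (`math.comb`); the population letters are just placeholders.

```python
import math
from itertools import combinations, permutations

population = "ABCDE"

ordered = list(permutations(population, 3))    # order matters
unordered = list(combinations(population, 3))  # order does not matter

print(len(ordered), len(unordered))            # 60 10
print(len(ordered) // math.factorial(3))       # 10
print(math.comb(5, 3))                         # 10
```

Every unordered sample appears 3! = 6 times among the ordered ones, which is why dividing by 6 gives the SRS count.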

Examples

A co-ed softball team with 5 men and 5 women. League requires that all members of the team must bat, and batters must alternate by gender.
Determine the number of possible batting orders

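$$\underbrace{5}_{\text{men for 1st}} \cdot \underbrace{5}_{\text{women for 1st}} \cdot \underbrace{4}_{\text{men for 2nd}} \cdot 4 \cdot 3 \cdot 3 \cdot 2 \cdot 2 \cdot 1 \cdot 1 \cdot \underbrace{2}_{\text{man or woman batting first}} = 5! \cdot 5! \cdot 2 = 28800$$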
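The batting-order count can be sanity-checked by brute force on a smaller team and then compared with the formula. This sketch assumes Python; `alternating_orders` and `formula` are hypothetical helpers, and the 3-and-3 team is chosen only to keep the enumeration small.

```python
import math
from itertools import permutations


def alternating_orders(men, women):
    """Count batting orders in which neighbours always differ in gender."""
    players = [("M", i) for i in range(men)] + [("W", i) for i in range(women)]
    count = 0
    for order in permutations(players):
        if all(a[0] != b[0] for a, b in zip(order, order[1:])):
            count += 1
    return count


def formula(k):
    """k men and k women: k! * k! orders, times 2 for who bats first."""
    return math.factorial(k) ** 2 * 2


print(alternating_orders(3, 3), formula(3))  # 72 72
print(formula(5))                            # 28800
```

The brute-force count agrees with the formula for the small team, which supports using 5! * 5! * 2 for the full one.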

Number of passwords of length 5 consisting of only lower case letters
$$26^5 \approx 12{,}000{,}000$$
Number of passwords of length 5 with both lower and upper case
$$52^5 \approx 380{,}000{,}000$$
Number of passwords of length 5 with lower, upper case and digits
$$62^5 \approx 916{,}000{,}000$$

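A small university of 250 students has 150 STEM major students. An SRS of 75 students is taken
Determine the probability of the event that the sample is unbiased with respect to the proportion of STEM students
$$|S| = {}_{250}C_{75} \approx 1.16 \times 10^{65}$$
Event - $\hat{p} = p$, which means 45 STEM and 30 non-STEM students
$$p = \frac{150}{250} = \frac{3}{5} = 60\%$$
$$\hat{p} = \frac{3}{5} = \frac{45}{75}$$
So to choose 45 STEM students - $${}_{150}C_{45}$$
30 non-STEM students - $${}_{100}C_{30}$$
So the complete probability
$$P = \frac{{}_{150}C_{45} \cdot {}_{100}C_{30}}{{}_{250}C_{75}} \approx 0.112$$
Determine the probability that the sample contains no STEM majors
$$P = \frac{{}_{100}C_{75}}{{}_{250}C_{75}} \approx 2.09 \times 10^{-42}$$
Determine the probability that the sample contains only STEM majors
$$P = \frac{{}_{150}C_{75}}{{}_{250}C_{75}} \approx 8.01 \times 10^{-22}$$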
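The three probabilities in this example follow one pattern, so they can be computed with a single helper. A sketch assuming Python 3.8+ (`math.comb`); `p_stem_count` is a name invented here for illustration.

```python
import math

N, STEM, n = 250, 150, 75  # population, STEM majors, sample size
NON_STEM = N - STEM
total = math.comb(N, n)    # size of the sample space


def p_stem_count(k):
    """P(the sample holds exactly k STEM majors), by counting samples."""
    return math.comb(STEM, k) * math.comb(NON_STEM, n - k) / total


print(f"{p_stem_count(45):.3f}")  # unbiased sample, about 0.112
print(f"{p_stem_count(0):.2e}")   # no STEM majors, about 2.09e-42
print(f"{p_stem_count(75):.2e}")  # only STEM majors, about 8.01e-22
```

Python integers are exact at any size, so the only rounding happens in the final division.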

A volunteer needs 20 unique prizes for a ring toss. A quarter of the prizes should be soda, another quarter Pringles, and the remaining half candy. The local store has 12 types of soda, 10 flavors of Pringles, and 20 different candy bars.
Determine the number of possible prize combinations
Soda - $$|S| = \binom{12}{5} = 792$$
Pringles - $$|P| = \binom{10}{5} = 252$$
Candy - $$|C| = \binom{20}{10} = 184756$$
Total combinations - $$|A| = |S| \cdot |P| \cdot |C| = 36874341504$$

Counting with Indistinguishable Objects

What happens when we want to arrange 3 silver and 2 golden coins, and we cannot distinguish ones made from the same material?
Our sample space would look something like this

{GGSSS,GSGSS,GSSGS,GSSSG,SGGSS,SGSGS,SGSSG,SSGGS,SSGSG,SSSGG}

Notice that we don't distinguish between separate golden or silver coins
To count the size of that sample space, we have to use the following formula

Formula

$$|A| = \frac{n!}{n_1! \, n_2! \cdots n_k!}$$

Where n is the total number of objects, and $n_i$ are the counts of the repeating, indistinguishable objects

So for our situation, we have 5 coins in total, 3 silver and 2 gold

$$|A| = \frac{5!}{2! \, 3!} = \frac{5 \cdot 4 \cdot 3 \cdot 2 \cdot 1}{2 \cdot 1 \cdot 3 \cdot 2 \cdot 1} = 5 \cdot 2 = 10$$
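The coin count can be double-checked by listing the distinct arrangements and comparing with the formula. This sketch assumes Python; `distinct_arrangements` is a hypothetical helper written for this illustration.

```python
import math
from itertools import permutations

coins = "GGSSS"  # 2 gold, 3 silver

# permutations() treats equal letters as different objects,
# so a set is used to collapse the duplicates.
arrangements = sorted(set("".join(p) for p in permutations(coins)))


def distinct_arrangements(*counts):
    """n! / (n1! * n2! * ... * nk!) for groups of identical objects."""
    total = math.factorial(sum(counts))
    for c in counts:
        total //= math.factorial(c)
    return total


print(arrangements)
print(len(arrangements), distinct_arrangements(2, 3))  # 10 10
```

The set holds the same 10 strings as the sample space written out earlier.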