This article collects R code snippets that you can use as a reference when working with the normal distribution. R makes these calculations easy with built-in functions such as pnorm and qnorm. Use them, improve them, and please share your comments.

#The 68-95-99.7% rule for the normal distribution with regard to standard deviations

#How to work with standard normal tables

#What is the normal distribution?

#The Normal Distribution is a theoretical probability distribution that is perfectly symmetric about its mean

#Normal Distributions are uniquely defined by two quantities: a mean (µ), and standard deviation (σ)

#The entire distribution of values described by a normal distribution can be

#completely specified by knowing just the mean and standard deviation

#The 68-95-99.7 Rule for the Normal Distribution

#68% of the observations fall within one standard deviation of the mean

#The probability that any randomly selected value is within one standard deviation of the mean is 0.68 or 68%

#In R that can be found with the pnorm function as below:

#Just remember that pnorm(x) gives you the cumulative area to the left of the value x that you provide.

area_under_1SD=pnorm(1)-pnorm(-1)

print(area_under_1SD)

print(100*area_under_1SD)

#For the area above +1 SD, remember to use 1-pnorm

area_above_1SD=1-pnorm(1)

area_above_1SD

#The area below -1 SD comes directly from the pnorm function

area_below_1SD=pnorm(-1)

area_below_1SD

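#95% of the area lies within 1.96 SD of the mean, which the rule rounds to 2 SD

area_under_2SD=pnorm(1.96)-pnorm(-1.96)

print(area_under_2SD)

print(100*area_under_2SD)

#For the area above +1.96 SD, remember to use 1-pnorm

area_above_2SD=1-pnorm(1.96)

area_above_2SD

#The area below -1.96 SD comes directly from the pnorm function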

area_below_2SD=pnorm(-1.96)

area_below_2SD
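The three figures in the 68-95-99.7 rule can be reproduced in one call, because pnorm is vectorized. This is a small sketch; the names k and within_k_sd are illustrative and not part of the original snippets.

```r
# Number of standard deviations on either side of the mean
k <- 1:3

# Area within k standard deviations of the mean of a standard normal
within_k_sd <- pnorm(k) - pnorm(-k)

# Express as percentages, rounded to one decimal place
round(100 * within_k_sd, 1)
# [1] 68.3 95.4 99.7
```

Note that exactly 2 SD covers about 95.4%, while the 1.96 used above covers 95%.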

#Applying the Principles of the Normal Distribution to Sample Data to Estimate

#Characteristics of Population Data

#Given that blood sugar has mean=123.6 and standard deviation=12.9

#Using only the sample mean and standard deviation, and assuming normality,

#let’s estimate the 2.5th and 97.5th percentiles of blood sugar

qnorm(.025,123.6,12.9)

qnorm(.5,123.6,12.9)# Check point, 50th percentile should give you the mean back.

qnorm(.975,123.6,12.9)

#By hand calculations (using 2 SD as a rounded version of 1.96 SD)

#2.5th %ile: = 123.6 –(2×12.9) = 97.8

#97.5th %ile: = 123.6 +(2×12.9) = 149.4
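The hand calculation uses 2 as a rounded multiplier. As a quick check, and assuming the same mean and standard deviation, the exact multiplier can be taken from qnorm(0.975) and the result compared with the direct qnorm call. The names mu, sigma, z, by_hand and direct are illustrative.

```r
mu <- 123.6    # mean blood sugar from the example above
sigma <- 12.9  # standard deviation from the example above

# Exact multiplier for the central 95% of a normal distribution (about 1.96)
z <- qnorm(0.975)

# Percentiles rebuilt by hand as mean -/+ z * sd
by_hand <- c(mu - z * sigma, mu + z * sigma)

# Percentiles taken directly from qnorm
direct <- qnorm(c(0.025, 0.975), mean = mu, sd = sigma)

by_hand
direct
all.equal(by_hand, direct)  # TRUE when both routes agree
```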

##################

#Problem type 2

#mean and sd are given

#Test an individual observation relative to the rest of population

sample=130

n=113

mean=123.6

sd=12.9

# A patient has blood sugar =130

#What is the proportion of men with blood sugar higher than 130?

#First find the difference between the mean and the sample

diff=sample-mean

diff

#[1] 6.4

#determine difference in terms of sd

diff_by_sd=diff/sd

diff_by_sd

#[1] 0.496124

#Now we can determine what % of the normal curve is more than 0.5 SD above its mean

pnorm(1) #cumulative area below +1 SD

pnorm(0) #cumulative area to the left of the mean (0.5)

pnorm(1)-pnorm(0) # area ABOVE the mean within 1 sd

pnorm(0)-pnorm(-1) #area BELOW the mean within 1 sd

pnorm(1)-pnorm(-1) # total area within 1 sd

pnorm(.5)-pnorm(0) # area ABOVE the mean within .5 sd

# But the question asks what % of the population lies to the right of the

#area between the mean and .5 sd, which can be found by:

.5-(pnorm(.5)-pnorm(0))

# or, in short, we can get the same answer using

1-pnorm(.5)

# pnorm(3)-pnorm(.5) is only an approximation: 99.7% of the population is within 3 sd,

# so it leaves out the small tail above +3 sd

pnorm(3)-pnorm(.5)
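pnorm also accepts the mean and standard deviation directly, so the standardization step can be skipped. The sketch below computes the proportion above 130 without rounding the z-score to 0.5; the variable names obs, mu, sigma, p_above and p_above_z are illustrative.

```r
obs <- 130     # the patient's blood sugar
mu <- 123.6    # population mean
sigma <- 12.9  # population standard deviation

# Upper-tail area: proportion of the population above obs
p_above <- pnorm(obs, mean = mu, sd = sigma, lower.tail = FALSE)

# Same quantity via the z-score
z <- (obs - mu) / sigma
p_above_z <- 1 - pnorm(z)

p_above
all.equal(p_above, p_above_z)  # TRUE when both routes agree
```

The result is about 31%, close to the figure obtained with the rounded 0.5 SD.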

#####################################

# ANOTHER EXAMPLE

mean=7.1 #kg

sd=1.2 #kg

n=236 # no. of children / observation

#assume normal distribution

#calculate: Range of weights of children in the population

#we will use the qnorm function

#we know that about 2 sd on either side of the mean covers roughly 95% of the population, from the 2.5th to the 97.5th percentile

#

q2.5=qnorm(0.025,mean,sd)

q2.5

q97.5=qnorm(.975,mean,sd)

q97.5

# so, the range is between about 4.7 and 9.5 kg

# now, if one child has a weight of 5 kg, how do we interpret this?

sample=5 #kg

#First find the difference between the mean and the sample

#Let’s go back to our previously defined equations

diff=sample-mean

diff

#determine difference in terms of sd

diff_by_sd=diff/sd

diff_by_sd

#so, the sample is 1.75 sd below the mean

# So, we can determine what percent of the population lies below

#-1.75 sd using pnorm

pnorm(-1.75)

#so only 4% of children would weigh less than 5 kg

pnorm(0)-pnorm(-1.75)

#so, 46% of the children will be between 5 kg and 7.1 kg

#area under -1.75 and 1.75 sd

pnorm(1.75)-pnorm(-1.75)

#so, 92% of the population is within 1.75 sd of the mean
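One way to sanity-check these figures is to simulate weights from the assumed normal distribution and count how many fall below 5 kg. This is only an illustrative sketch: the seed, the number of draws and the variable names are arbitrary choices, and the simulated proportion will only approximate pnorm(-1.75).

```r
set.seed(42)  # arbitrary seed so the run is repeatable

# Draw simulated weights from a normal distribution with mean 7.1 and sd 1.2
sim_weights <- rnorm(100000, mean = 7.1, sd = 1.2)

# Proportion of simulated children weighing less than 5 kg
sim_below_5 <- mean(sim_weights < 5)

# Theoretical proportion from the normal curve
theory_below_5 <- pnorm(5, mean = 7.1, sd = 1.2)

sim_below_5
theory_below_5
```

With this many draws the two numbers should agree to roughly two decimal places.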

#####################

#Example 3:

####################

n=1860

mean=23.6

sd=4.9

# first assume normal distribution

# find the range for 95% of the data

q2.5=qnorm(0.025,mean,sd)

q2.5

q97.5=qnorm(.975,mean,sd)

q97.5

#find the 95th percentile

q95=qnorm(.95,mean,sd)

q95

#note that the 95th percentile is often used as a cut-off value for a study

# to find the % of the population within a given interval (a,b) do this:

a=18.5

b=24.9

mean=23.6

sd=4.9

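#convert the interval limits to z-scores (kept in new variables so a and b are not overwritten)

z_a=(a-mean)/sd

z_b=(b-mean)/sd

z_a

z_b

#% area between a and b

pnorm(z_b)-pnorm(z_a)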
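The same interval probability can be obtained without converting to z-scores, by passing the mean and standard deviation to pnorm. The helper name prob_between and its argument names are hypothetical and only illustrate the pattern.

```r
# Hypothetical helper: probability that a normal variable falls between lower and upper
prob_between <- function(lower, upper, mu, sigma) {
  pnorm(upper, mean = mu, sd = sigma) - pnorm(lower, mean = mu, sd = sigma)
}

# Interval and parameters from Example 3
p <- prob_between(18.5, 24.9, mu = 23.6, sigma = 4.9)
p

# Same quantity via the z-score route
z_lower <- (18.5 - 23.6) / 4.9
z_upper <- (24.9 - 23.6) / 4.9
all.equal(p, pnorm(z_upper) - pnorm(z_lower))  # TRUE when both routes agree
```

Wrapping the calculation in a function avoids reusing a and b for two different meanings.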