About
Developed by Andy Runquist
This set of exercises guides the student in exploring how to use a computer algebra system to determine the propagated error of a calculated parameter based on measured quantities with known uncertainties. This approach is based on the Monte Carlo approach.
As is detailed very thoroughly here, there are many methods for doing error propagation. Probably the most common is the calculus approach, which assumes not only that all variables follow a normal distribution but that any quantity calculated from them does as well. Note that the examples described here do not satisfy that latter assumption. The link above carefully describes how the Monte Carlo method is the most accurate way of doing error propagation. From a numeric perspective, it's also the easiest (not counting the simple crank-three-times method). Certainly, a Computer Algebra System allows for coding up the calculus approach, but if the Monte Carlo approach is more accurate, why not use it and skip the Computer Algebra System altogether?
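Subject Area | Mathematical/Numerical Methods |
---|---|
Level | First Year |
Available Implementations | Mathematica and Python |
Learning Objectives | Students will be able to: generate normally distributed random numbers; plot histograms; calculate mean, median, and standard deviation for a distribution; generate a new distribution from previously generated random numbers; compare the analytical (calculus) approach to the Monte Carlo approach (Exercises 2 and 3). |
Time to Complete | 30 min |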
- Produce a large set of numbers that obey a normal distribution with a mean of 5.4 and a standard deviation of 0.2. Many computer programming languages have a weighted random number generator built in. You can also build your own using the Box-Muller transformation, which takes two uniformly distributed random numbers between 0 and 1 and returns two normally distributed random numbers with a mean of zero and a standard deviation of 1. These can then be scaled and shifted to match the requested mean and standard deviation. Plot a histogram of the large data set, confirming that the peak and width are what you expected.
- Determine the histogram for the speed example above by using the calculus error propagation approach.
a) Assuming that all values are distributed according to a normal distribution, the calculus approach is given by
$$\sigma_f = \sqrt{\left(\frac{\partial f}{\partial x}\sigma_x\right)^2 + \left(\frac{\partial f}{\partial y}\sigma_y\right)^2} \qquad (2)$$
Show that in the case of the speed calculation, this becomes
$$\sigma_v = \frac{x}{t}\sqrt{\left(\frac{\sigma_x}{x}\right)^2 + \left(\frac{\sigma_y}{y}\right)^2} \qquad (3)$$
(here y denotes the time t).
b) Using Eq. 2, determine the error for the speed example from Exercise 1.
- Now use the Monte Carlo method. Generate several hundred or several thousand values for both position and time according to their respective distributions. Calculate the speed that corresponds to each pair of values, and plot a histogram of the speeds. Compare with the results of Exercise 2.
- Determine the mean, median, and standard deviation of the histogram from (3) and compare with (2).
- Extend the Monte Carlo approach to lab data of your own.
#!/usr/bin/env python
'''
montecarlo.py
Eric Ayars
June 2016
Python solutions to PICUP "Monte Carlo error propagation"
exercise set.
'''
from pylab import *
import random
#########################
# Exercise 1
#########################
# Generate the Gaussian distribution
N = 1000 # How many to generate
mu = 5.4 # mean
sigma = 0.2 # standard deviation
dist = array([random.gauss(mu, sigma) for j in range(N)])
# Plot a histogram of the distribution
hist(dist, 30)
show()
#########################
# Exercise 3
#########################
# characteristics of d and t
avg_d = 5.4
sigma_d = 0.2
avg_t = 6.2
sigma_t = 1.5
N = 1000 # How many to generate
# Generate distributions of distances and times
d = array([random.gauss(avg_d, sigma_d) for j in range(N)])
t = array([random.gauss(avg_t, sigma_t) for j in range(N)])
# Since d and t are numpy arrays, we can just divide; the
# resulting element-wise division will generate another
# numpy array of speeds.
v = d/t
# plot histogram of speed
hist(v, 30)
show()
#########################
# Exercise 4
#########################
# report average and sigma
print('mean speed = %0.3f' % mean(v))
print('median speed = %0.3f' % median(v))
print('standard deviation = %0.3f' % std(v))
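The script above has no section for Exercise 2. A minimal sketch of the calculus result, assuming Eq. 3 from the exercise text and the same means and uncertainties the script uses for Exercise 3 (the variable names are illustrative and not part of the original solution):

```python
from math import sqrt

# Same values the script uses for Exercise 3
avg_d, sigma_d = 5.4, 0.2   # distance and its uncertainty
avg_t, sigma_t = 6.2, 1.5   # time and its uncertainty

# Central value of the speed
v = avg_d / avg_t

# Eq. 3: relative uncertainties add in quadrature for a quotient
sigma_v = v * sqrt((sigma_d / avg_d)**2 + (sigma_t / avg_t)**2)

print('speed = %0.3f +/- %0.3f' % (v, sigma_v))
```

With these inputs the sketch gives a speed of about 0.871 with an uncertainty of about 0.213, which is the number to compare against the Monte Carlo standard deviation from Exercise 4.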
Credits
Fremont Teng; Loo Kang Wee; based on codes by Andy Runquist
1. Introduction:
This briefing document reviews the "PICUP Monte Carlo error propagation JavaScript Simulation Applet HTML5" available through Open Educational Resources / Open Source Physics @ Singapore. The resource provides a set of exercises designed to teach students how to use computational tools, specifically a computer algebra system, to determine the propagated error of a calculated parameter based on measured quantities with known uncertainties. The core approach taught is the Monte Carlo method, which is compared to the traditional analytical (calculus-based) approach.
2. Main Themes and Important Ideas:
The primary focus of this resource is to introduce and illustrate the Monte Carlo method for error propagation in scientific measurements and calculations. Key themes and ideas include:
- Error Propagation: Understanding how uncertainties in measured quantities affect the uncertainty in a calculated quantity that depends on those measurements.
- Monte Carlo Method: Utilizing repeated random sampling to estimate the distribution and statistical properties (mean, standard deviation) of a calculated parameter based on the distributions of the input measured quantities. The "About" section explicitly states: "This approach is based on the Monte Carlo approach."
- Normal Distribution: The exercises assume that the measured quantities follow a normal (Gaussian) distribution. Exercise 1 focuses on generating and visualizing normally distributed random numbers.
- Histograms: Visualizing the distribution of generated random numbers and calculated parameters using histograms is a crucial component of the learning process. Students are asked to "Plot a histogram of the large data set, confirming that the peak and width are what you expected." (Exercise 1) and "plot a histogram of speeds" (Exercise 3).
- Comparison of Analytical and Monte Carlo Approaches: The resource explicitly aims to have students "Compare the analytical (calculus) approach to the Monte Carlo approach (Exercises 2 and 3)." This allows students to see the strengths and weaknesses of each method.
- Analytical (Calculus) Error Propagation: Exercise 2 introduces the standard formula for error propagation using partial derivatives: "Assuming that all values are distributed according to a normal distribution, the calculus approach is given by $\sigma_f = \sqrt{\left(\frac{\partial f}{\partial x}\sigma_x\right)^2 + \left(\frac{\partial f}{\partial y}\sigma_y\right)^2}$ (2)". This equation shows how the uncertainty in a function f (σf) is related to the uncertainties in its variables x (σx) and y (σy), and the partial derivatives of f with respect to those variables. The exercise then guides students to apply this to a speed calculation (v = x/t): "Show that in the case of the speed calculation, this becomes: $\sigma_v = \frac{x}{t}\sqrt{\left(\frac{\sigma_x}{x}\right)^2 + \left(\frac{\sigma_y}{y}\right)^2}$ (3)", where y stands for the time t.
- Computational Implementation: The resource mentions available implementations in "Mathematica and Python," indicating that students are expected to use programming tools to perform the simulations and calculations. The included montecarlo.py script provides a Python example for generating normally distributed numbers and performing the Monte Carlo simulation for a speed calculation.
- Statistical Analysis: Students are expected to calculate and compare statistical properties of the generated distributions, such as mean, median, and standard deviation. Exercise 4 in the Python script prints these values for the simulated speed distribution:
- print('mean speed = %0.3f' % mean(v))
- print('median speed = %0.3f' % median(v))
- print('standard deviation = %0.3f' % std(v))
- Application to Lab Data: Exercise 5 encourages students to "Extend the Monte Carlo approach to lab data of your own," promoting the practical application of the learned concepts.
3. Key Exercises and Learning Objectives:
The resource is structured around a series of exercises, each with specific learning objectives:
- Exercise 1: Focuses on generating normally distributed random numbers, plotting histograms, and understanding the characteristics of a normal distribution (mean and standard deviation). Students will be able to: "Generate normally distributed random numbers".
- Exercise 2: Introduces the analytical (calculus) approach to error propagation and asks students to apply it to a specific example (speed calculation). Students will be able to: "Compare the analytical (calculus) approach to the Monte Carlo approach".
- Exercise 3: Implements the Monte Carlo method by generating distributions for measured quantities (position and time), calculating the dependent parameter (speed) for each set of values, and plotting a histogram of the results. Students will be able to: "Plot histograms. Calculate mean, median, and standard deviation for a distribution. Generate a new distribution from previously generated random numbers." and "Compare the analytical (calculus) approach to the Monte Carlo approach".
- Implicit Objectives: Through the exercises, students will also develop skills in using computational tools (like Python or Mathematica) for numerical simulations and data analysis.
4. Provided Code Example (montecarlo.py):
The included Python script montecarlo.py demonstrates how to perform the Monte Carlo error propagation for a speed calculation (distance/time). It shows how to:
- Generate normally distributed random numbers using random.gauss().
- Create NumPy arrays to store the generated values for distance (d) and time (t).
- Perform element-wise division to calculate the speed (v).
- Plot histograms of the generated distributions using pylab.hist().
- Calculate and print the mean, median, and standard deviation of the resulting speed distribution using NumPy functions (mean(), median(), std()).
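The steps listed above can also be sketched with NumPy's own random generator instead of a list comprehension over random.gauss(). This is an alternative formulation, not the code shipped with the exercise; the seed and the variable names are assumptions chosen for illustration, and the plotting step is omitted.

```python
import numpy as np

rng = np.random.default_rng(seed=1)   # seeded only so that runs are repeatable
N = 1000                              # number of samples, as in the script

# Draw N normally distributed values for distance and time
d = rng.normal(loc=5.4, scale=0.2, size=N)
t = rng.normal(loc=6.2, scale=1.5, size=N)

# Element-wise division gives one speed per (d, t) pair
v = d / t

print('mean speed = %0.3f' % np.mean(v))
print('median speed = %0.3f' % np.median(v))
print('standard deviation = %0.3f' % np.std(v))
```

Drawing the whole array in one call avoids the Python-level loop, which starts to matter once N grows well beyond a few thousand.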
5. Additional Information:
- The resource is designed for a "First Year" level, suggesting it's appropriate for introductory undergraduate science or engineering courses.
- The estimated "Time to Complete" listed for the exercise set is 30 minutes.
- The applet itself can be embedded in a webpage using the provided <iframe> code.
- The resource provides links to earlier versions of the exercise and credits the developers.
- It is licensed under the Creative Commons Attribution-Share Alike 4.0 Singapore License.
6. Conclusion:
The PICUP Monte Carlo Error Propagation JavaScript Simulation Applet HTML5 provides a valuable and practical resource for students to learn about error propagation using the Monte Carlo method. By combining interactive exercises, analytical comparisons, and computational implementation examples, it offers a comprehensive approach to understanding how uncertainties propagate through calculations. The use of freely available tools like Python further enhances its accessibility for educational purposes.
Study Guide: Monte Carlo Error Propagation
Key Concepts
- Error Propagation: The process of determining the uncertainty in a calculated quantity based on the uncertainties of the input measurements.
- Monte Carlo Method: A computational technique that relies on repeated random sampling to obtain numerical results. In the context of error propagation, this involves generating many random values for the measured quantities based on their distributions and then calculating the resulting distribution of the calculated parameter.
- Normal Distribution (Gaussian Distribution): A common probability distribution characterized by a bell-shaped curve, defined by its mean (central value) and standard deviation (width of the distribution).
- Mean: The average value of a set of numbers.
- Median: The middle value in a sorted set of numbers.
- Standard Deviation: A measure of the dispersion or spread of a set of data from its mean. A low standard deviation indicates that the data points tend to be close to the mean, while a high standard deviation indicates that the data points are spread out over a wider range of values.
- Histogram: A graphical representation of the distribution of numerical data, where the data is grouped into bins and the height of each bar corresponds to the frequency of values within that bin.
- Analytical (Calculus) Approach to Error Propagation: A method using partial derivatives to estimate the uncertainty in a calculated quantity based on the uncertainties of the input variables. This approach assumes that the uncertainties are small and that the function relating the variables is reasonably linear over the range of uncertainty.
- Box-Muller Transformation: An algorithm for generating pairs of independent, standard normally distributed (mean 0, standard deviation 1) random numbers, given a source of uniformly distributed random numbers.
- Weighted Random Number: A random number generated from a specific probability distribution, where some values are more likely to occur than others.
Quiz
- Explain the fundamental principle behind Monte Carlo error propagation in the context of a calculated parameter derived from measured quantities with uncertainties.
- What are the key advantages of using the Monte Carlo approach for error propagation compared to the analytical (calculus) approach, especially when dealing with complex functions or non-normal distributions?
- Describe the process of generating normally distributed random numbers using the Box-Muller transformation. Why might this be necessary?
- What information can be gleaned from plotting a histogram of the results obtained from a Monte Carlo error propagation simulation? Mention at least three statistical measures that can be derived from this histogram.
- According to the provided material, what is the analytical (calculus) formula (Equation 2) used to determine the propagated error (σf) of a function f(x, y) based on the uncertainties (σx, σy) of the measured quantities x and y?
- For the specific case of speed (v = x/t), where x is distance and t is time, derive the analytical formula (Equation 3) for the error in speed (σv) based on the errors in distance (σx) and time (σt).
- In Exercise 3, the provided Python code generates distributions for both distance and time. How are these distributions created, and what parameters define their characteristics?
- Explain how the Python code calculates the distribution of speed (v) once the distributions for distance (d) and time (t) have been generated. What operation is performed on the arrays?
- What statistical measures are calculated and reported in Exercise 4 of the Python code for the resulting speed distribution obtained through the Monte Carlo method?
- Briefly outline the steps involved in extending the Monte Carlo approach to analyze error propagation in your own lab data. What key pieces of information would you need?
Quiz Answer Key
- Monte Carlo error propagation involves generating a large number of random values for each measured quantity based on their known probability distributions (often assumed to be normal). For each set of these random input values, the calculated parameter of interest is computed, resulting in a distribution of possible values for that parameter. The uncertainty in the calculated parameter is then estimated from this distribution (e.g., its standard deviation).
- The Monte Carlo approach can handle non-linear relationships between variables and non-normal probability distributions more accurately than the analytical approach, which often relies on linear approximations and the assumption of normality. It also provides a full distribution of the calculated parameter, offering more insight than just a standard deviation.
- The Box-Muller transformation takes two independent, uniformly distributed random numbers between 0 and 1 and uses trigonometric functions to produce two independent, normally distributed random numbers with a mean of zero and a standard deviation of one. This is useful when a programming language doesn't have a built-in function for generating normally distributed random numbers directly. These standard normal numbers can then be scaled and shifted to match a desired mean and standard deviation.
- A histogram visually represents the frequency of different values of the calculated parameter. From a histogram, one can determine the approximate shape of the distribution, the location of the peak (which may correspond to the mean or median), the spread of the data (related to the standard deviation), and identify any potential skewness or multiple peaks in the distribution.
- The analytical (calculus) formula (Equation 2) is: σf = √((∂f/∂x σx)² + (∂f/∂y σy)²), where ∂f/∂x and ∂f/∂y are the partial derivatives of the function f with respect to x and y, and σx and σy are the standard deviations (uncertainties) of x and y, respectively.
- For speed v = x/t, the partial derivative of v with respect to x is 1/t and with respect to t is -x/t². Substituting these into Equation 2, with t in place of y, gives σv = √((σx/t)² + (x σt/t²)²). Factoring x/t out of the square root leads to Equation 3: σv = (x/t) * √((σx/x)² + (σt/t)²). Note that the source material writes 'y' where 't' is meant, so for speed y should be read as t.
- In Exercise 3, the distributions for distance (d) and time (t) are created using the random.gauss() function from the random library in Python. This function generates normally distributed random numbers. The characteristics of the distance distribution are defined by avg_d = 5.4 (mean) and sigma_d = 0.2 (standard deviation), while the time distribution is defined by avg_t = 6.2 (mean) and sigma_t = 1.5 (standard deviation).
- The Python code calculates the speed (v) by performing element-wise division of the numpy array d (distances) by the numpy array t (times). This means that for each pair of randomly generated distance and time values (at the same index in their respective arrays), the corresponding speed is calculated and stored in the v array.
- In Exercise 4, the Python code calculates and reports the mean speed using mean(v), the median speed using median(v), and the standard deviation of the speed using std(v). These statistical measures describe the central tendency and spread of the speed distribution obtained from the Monte Carlo simulation.
- To extend the Monte Carlo approach to your own lab data, you would first need to identify the measured quantities that contribute to the calculated parameter and estimate their probability distributions (including their means and standard deviations or other relevant parameters). Then, you would use a programming language or software to generate many random samples from these distributions, calculate the parameter of interest for each set of samples, and finally analyze the resulting distribution of the calculated parameter to estimate its uncertainty (e.g., by calculating its standard deviation or determining a confidence interval).
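The last answer above mentions a confidence interval. One common way to read an interval off a Monte Carlo sample is to take percentiles of the simulated values. The sketch below assumes the inputs of the speed example and a central 68% interval; the helper name `central_interval` is hypothetical.

```python
import random
import statistics

random.seed(2)  # seeded only so that runs are repeatable
N = 10000

# Simulated speeds from normally distributed distance and time
v = [random.gauss(5.4, 0.2) / random.gauss(6.2, 1.5) for _ in range(N)]

def central_interval(values, coverage=0.68):
    """Return (low, high) bounds enclosing the central `coverage` fraction of values."""
    ordered = sorted(values)
    tail = (1.0 - coverage) / 2.0
    low = ordered[int(tail * len(ordered))]
    high = ordered[int((1.0 - tail) * len(ordered)) - 1]
    return low, high

low, high = central_interval(v)
print('median speed = %0.3f' % statistics.median(v))
print('68%% interval = [%0.3f, %0.3f]' % (low, high))
```

Unlike a single standard deviation, the two bounds need not sit at equal distances from the median, so a skewed distribution shows up directly in the reported interval.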
Essay Format Questions
- Discuss the theoretical underpinnings of both the analytical (calculus-based) and the Monte Carlo methods for error propagation. Compare and contrast their assumptions, strengths, and limitations in the context of experimental data analysis.
- Explain how the Monte Carlo method leverages random sampling to approximate the uncertainty in a calculated quantity. Describe the key steps involved in a Monte Carlo error propagation simulation and discuss the factors that can influence the accuracy and reliability of the results.
- The provided material highlights a speed calculation (v = distance/time) as an example. Elaborate on how both the analytical and Monte Carlo approaches are applied to determine the uncertainty in speed given uncertainties in distance and time. Discuss the relationship between the results obtained from these two methods.
- Consider a scenario in a laboratory setting where a calculated result depends on three measured variables with known uncertainties and potentially non-normal distributions. Argue for the suitability of either the analytical or the Monte Carlo approach for error propagation in this scenario, justifying your choice based on the characteristics of the data and the complexity of the relationship between the variables.
- The Open Educational Resources material emphasizes the learning objectives of generating normally distributed random numbers, plotting histograms, and comparing analytical and Monte Carlo approaches. Discuss the pedagogical value of these exercises in developing a student's understanding of error analysis and computational methods in science and engineering.
Glossary of Key Terms
- Analytical Solution: A solution to a problem obtained through mathematical derivation and exact formulas.
- Computer Algebra System (CAS): Software programs that manipulate mathematical expressions symbolically.
- Confidence Interval: A range of values within which the true value of a parameter is estimated to lie with a certain level of probability.
- Distribution (Probability Distribution): A function that describes the likelihood of different outcomes or values for a random variable.
- Partial Derivative: The derivative of a multivariable function with respect to one variable, keeping the other variables constant.
- Random Sampling: The process of selecting a subset of individuals (or values) randomly from within a whole (a population or a distribution).
- Skewness: A measure of the asymmetry of a probability distribution.
- Uncertainty: The range within which the true value of a measurement is believed to lie, often quantified by the standard deviation.
Version:
- https://www.compadre.org/PICUP/exercises/exercise.cfm?I=112&A=MCerrorprop
- http://weelookang.blogspot.com/2018/06/monte-carlo-error-propagation.html
What is Monte Carlo error propagation?
Monte Carlo error propagation is a method used to estimate the uncertainty in a calculated quantity that depends on one or more measured quantities, each with its own known uncertainty. It involves generating a large number of random values for each measured quantity based on their probability distributions, using these random values to calculate the quantity of interest many times, and then analyzing the distribution of the calculated values to determine its uncertainty.
How does the Monte Carlo approach differ from the analytical (calculus) approach to error propagation?
The analytical (calculus) approach uses partial derivatives and the variances of the measured quantities to estimate the variance of the calculated quantity. It relies on a first-order Taylor series approximation and assumes the uncertainties are small and the relationships are approximately linear. The Monte Carlo approach, on the other hand, does not rely on these assumptions. It directly samples from the probability distributions of the input variables and can handle non-linear relationships and non-Gaussian error distributions more effectively.
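To see how far the two approaches can drift apart, the following sketch compares the first-order formula with a Monte Carlo estimate for the speed example at two levels of time uncertainty. The time uncertainties of 0.15 and 1.5, the sample size, and the function names are illustrative assumptions.

```python
import random
import statistics
from math import sqrt

def analytic_sigma(d, sd, t, st):
    """First-order (calculus) uncertainty of v = d / t."""
    return (d / t) * sqrt((sd / d)**2 + (st / t)**2)

def monte_carlo_sigma(d, sd, t, st, n=20000):
    """Standard deviation of v = d / t from n random draws."""
    speeds = [random.gauss(d, sd) / random.gauss(t, st) for _ in range(n)]
    return statistics.pstdev(speeds)

random.seed(3)  # seeded only so that runs are repeatable
for st in (0.15, 1.5):
    a = analytic_sigma(5.4, 0.2, 6.2, st)
    m = monte_carlo_sigma(5.4, 0.2, 6.2, st)
    print('sigma_t = %0.2f: analytic %0.3f, Monte Carlo %0.3f' % (st, a, m))
```

For the small time uncertainty the two numbers should agree closely. For the large one the Monte Carlo spread is typically larger and more sensitive to rare draws with small t, an effect the first-order formula cannot capture.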
What is involved in implementing the Monte Carlo method for error propagation?
Implementing the Monte Carlo method typically involves the following steps:
- Determine the probability distribution (e.g., normal distribution) for each measured quantity, including its mean and standard deviation.
- Generate a large number (hundreds or thousands) of random values for each measured quantity according to their respective distributions.
- For each set of generated random values, calculate the value of the quantity of interest using the given formula.
- Collect all the calculated values and analyze their distribution, typically by plotting a histogram and calculating statistical properties such as the mean, median, and standard deviation. The standard deviation of this distribution is then taken as the estimate of the propagated error.
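The steps above can be sketched as a small general-purpose function. The name `propagate`, its signature, and the sample size are assumptions made for illustration; every input is taken to be independent and normally distributed.

```python
import random
import statistics

def propagate(func, means, sigmas, n=5000):
    """Monte Carlo error propagation for func(*inputs).

    means, sigmas: sequences giving the mean and standard deviation
    of each independent, normally distributed input.
    Returns (mean, median, standard deviation) of the simulated results.
    """
    results = []
    for _ in range(n):
        # Draw one random value per measured quantity
        sample = [random.gauss(m, s) for m, s in zip(means, sigmas)]
        # Evaluate the quantity of interest for this draw
        results.append(func(*sample))
    # Summarize the resulting distribution
    return (statistics.mean(results),
            statistics.median(results),
            statistics.pstdev(results))

random.seed(4)  # seeded only so that runs are repeatable
mean_v, median_v, sigma_v = propagate(lambda d, t: d / t, [5.4, 6.2], [0.2, 1.5])
print('speed: mean %0.3f, median %0.3f, sigma %0.3f' % (mean_v, median_v, sigma_v))
```

Passing the formula in as a function means the same loop can be reused for any calculated quantity, with only the lists of means and standard deviations changing.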
How can normally distributed random numbers be generated for Monte Carlo simulations?
Normally distributed random numbers can be generated in several ways. Many computer programming languages have built-in functions to generate these numbers directly, requiring the user to specify the desired mean and standard deviation. Alternatively, one can use the Box-Muller transformation, which takes two uniformly distributed random numbers between 0 and 1 and returns two independent normally distributed random numbers with a mean of zero and a standard deviation of one. These can then be transformed to match any desired mean and standard deviation.
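A minimal sketch of the Box-Muller transformation described here, using only Python's standard library. The function name `box_muller` and the sample size are illustrative choices; the mean of 5.4 and standard deviation of 0.2 are the values from Exercise 1.

```python
import math
import random

def box_muller(mu=0.0, sigma=1.0):
    """Return two independent normal deviates with the given mean and sigma."""
    u1 = 1.0 - random.random()   # shift [0, 1) to (0, 1] so log(u1) is defined
    u2 = random.random()
    r = math.sqrt(-2.0 * math.log(u1))
    z1 = r * math.cos(2.0 * math.pi * u2)
    z2 = r * math.sin(2.0 * math.pi * u2)
    # Scale and shift the standard normal values
    return mu + sigma * z1, mu + sigma * z2

random.seed(5)  # seeded only so that runs are repeatable
values = []
for _ in range(5000):
    values.extend(box_muller(5.4, 0.2))

mean = sum(values) / len(values)
std = math.sqrt(sum((x - mean)**2 for x in values) / len(values))
print('mean = %0.3f, standard deviation = %0.3f' % (mean, std))
```

The printed mean and standard deviation should land close to the requested 5.4 and 0.2, which is the same check Exercise 1 asks for with a histogram.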
What is the purpose of plotting a histogram in Monte Carlo error propagation?
Plotting a histogram of the calculated values in a Monte Carlo simulation serves several purposes. It visually represents the distribution of the results, allowing one to see the central tendency, spread, and shape of the distribution. This can help in understanding the uncertainty in the calculated quantity and in checking if the distribution is approximately normal or if it has any skewness or other non-ideal features. It also allows for a visual comparison with the results obtained from analytical error propagation methods.
How are the mean, median, and standard deviation used in analyzing the results of a Monte Carlo error propagation?
The mean and median of the distribution of calculated values provide estimates of the central tendency of the quantity of interest. For a symmetric distribution, these values should be close to each other and to the value obtained by directly using the mean values of the input quantities in the formula. The standard deviation of the distribution of calculated values provides an estimate of the propagated error or uncertainty in the quantity of interest. A larger standard deviation indicates a larger uncertainty. Comparing these statistical measures with the results from analytical methods can help validate the Monte Carlo simulation.
Can the Monte Carlo approach be applied to real lab data?
Yes, the Monte Carlo approach is highly applicable to real lab data. If you have measured quantities with associated uncertainties (which can often be estimated from the precision of your instruments or through repeated measurements), you can model these uncertainties using appropriate probability distributions. Then, you can apply the Monte Carlo method as described earlier to estimate the uncertainty in any calculated quantity that depends on these measurements. This allows for a more realistic and potentially more accurate assessment of error propagation in experimental results.
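As one possible illustration, the sketch below estimates each input's mean and uncertainty from repeated measurements and then propagates them to the acceleration of gravity from a simple pendulum, g = 4π²L/T². The measurement lists are invented numbers, not data from the exercise, and using the standard error of the mean as each input's uncertainty is an assumption.

```python
import math
import random
import statistics

# Hypothetical repeated measurements (made-up values for illustration)
lengths = [0.995, 1.002, 0.998, 1.004, 1.001]   # pendulum length in m
periods = [2.01, 1.99, 2.02, 2.00, 1.98]        # period in s

def mean_and_error(data):
    """Mean and standard error of the mean of repeated measurements."""
    return statistics.mean(data), statistics.stdev(data) / math.sqrt(len(data))

L_mean, L_err = mean_and_error(lengths)
T_mean, T_err = mean_and_error(periods)

random.seed(6)  # seeded only so that runs are repeatable
g_values = []
for _ in range(5000):
    L = random.gauss(L_mean, L_err)
    T = random.gauss(T_mean, T_err)
    g_values.append(4.0 * math.pi**2 * L / T**2)

g_median = statistics.median(g_values)
g_sigma = statistics.pstdev(g_values)
print('g = %0.2f +/- %0.2f m/s^2' % (g_median, g_sigma))
```

Because T enters squared, its relative uncertainty counts roughly twice as much as that of L, so the period measurements dominate the spread in this made-up data set.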
What programming tools are mentioned for implementing Monte Carlo error propagation?
The sources mention that the PICUP Monte Carlo error propagation exercises have available implementations in Mathematica and Python. The provided Python code (montecarlo.py) demonstrates how to use the pylab and random libraries to generate normally distributed random numbers, perform calculations, and plot histograms for Monte Carlo simulations. This suggests that Python is a suitable and accessible tool for implementing this method.
- Written by Loo Kang Wee