I did this for no better reason than curiosity and procrastination. I'm not sure it will be of much interest, since it is about how quickly estimated probability distributions converge to the true distribution as the sample size increases, but I guess I'll post it anyway. The curiosity began, in part, with Brad DeLong's recent (re)post showing how quickly the estimated probabilities from coin flips converge to 50-50 as the number of flips increases (a response to a dumb statement from Don Luskin about how well samples can represent the underlying population). The procrastination is driven by the fact that I have exams to grade.
This is a very simple exercise. To construct one of the lines in the first graph shown below, first draw 200 observations from a standard normal distribution (mean of zero, variance of one), then compute a frequency distribution for the draws. That is, find the fraction of the observations that are less than -4.0 (simply count the number of draws in this range and divide by 200), the fraction in the range -4.0 to -3.8, the fraction in the range -3.8 to -3.6, and so on up to the fraction in the range 3.6 to 3.8, the fraction in the range 3.8 to 4.0, and the fraction greater than 4.0. Counting the two open-ended tails, there are 42 ranges, or "nodes", with the interior ranges each .2 wide. The graphs plot each frequency at the midpoint of its range, with the two tails placed at -4.1 and 4.1, i.e. the graphs show the frequencies at -4.1, -3.9, -3.7, ..., 3.7, 3.9, and 4.1. This gives one line on the graph (consisting of 42 points estimated from one draw of 200 observations). Repeat this four more times (for a total of five) to complete the graph.
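The steps above can be sketched in a few lines of Python. This is only an illustration of the procedure as described, not the code used to make the graphs. The function names, the seed, and the choice to put a draw that lands exactly on a boundary into the range above it are all my own assumptions.

```python
import random

def frequency_distribution(sample, lo=-4.0, hi=4.0, n_interior=40):
    """Fraction of `sample` in each of 42 ranges: one below `lo`,
    40 equal-width ranges from `lo` to `hi`, and one at or above `hi`.
    Assumption: a draw exactly on a boundary counts in the range above it."""
    width = (hi - lo) / n_interior
    counts = [0] * (n_interior + 2)
    for x in sample:
        if x < lo:
            counts[0] += 1
        elif x >= hi:
            counts[-1] += 1
        else:
            # min() guards against floating-point rounding at the top edge
            idx = min(int((x - lo) / width), n_interior - 1)
            counts[idx + 1] += 1
    return [c / len(sample) for c in counts]

def plot_positions(lo=-4.0, hi=4.0, n_interior=40):
    """Horizontal positions -4.1, -3.9, ..., 3.9, 4.1 used for plotting."""
    width = (hi - lo) / n_interior
    return [round(lo - width / 2 + i * width, 10) for i in range(n_interior + 2)]

rng = random.Random(0)  # fixed seed (arbitrary) so the run is repeatable
lines = []
for _ in range(5):  # five lines per graph
    sample = [rng.gauss(0.0, 1.0) for _ in range(200)]
    lines.append(frequency_distribution(sample))

positions = plot_positions()
```

Each entry of `lines` is one line on the graph: 42 frequencies that sum to one, to be plotted against `positions`.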
Thus, the first graph shows five estimates of a Normal(0,1), each based upon 200 observations and 42 nodes. Note how low the ratio of observations to nodes is in this case. With a smaller number of nodes, each node would be estimated more precisely, but with fewer nodes the distribution is not resolved as clearly. I didn't make any attempt to try different levels of resolution, i.e. a smaller or larger number of nodes, or to try a non-uniform distribution of nodes rather than having them equally spaced at .2 apart. The graph also shows a standard normal for reference.
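To put rough numbers on the precision versus resolution tradeoff, here is a small sketch that computes the fraction a true Normal(0,1) puts in each range, along with the binomial standard error of an estimated fraction, sqrt(p(1-p)/n). I am assuming the reference curve corresponds to these expected fractions, which the post does not actually state. The function names and the 22-node comparison are made up for illustration.

```python
import math

def normal_cdf(x):
    """Standard normal CDF written with the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def expected_fractions(lo=-4.0, hi=4.0, n_interior=40):
    """True Normal(0,1) probability of each range, including both tails."""
    width = (hi - lo) / n_interior
    edges = [lo + i * width for i in range(n_interior + 1)]
    cdf = [0.0] + [normal_cdf(e) for e in edges] + [1.0]
    return [b - a for a, b in zip(cdf, cdf[1:])]

def standard_error(p, n):
    """Binomial standard error of an estimated fraction p from n draws."""
    return math.sqrt(p * (1.0 - p) / n)

n = 200
for n_interior in (40, 20):  # 42 nodes as in the post, 22 as a hypothetical
    p_peak = max(expected_fractions(n_interior=n_interior))
    se = standard_error(p_peak, n)
    print(n_interior + 2, "nodes: peak fraction", round(p_peak, 4),
          "relative error", round(se / p_peak, 3))
```

With 42 nodes the tallest range holds a bit under 8 percent of the probability, so only about 16 of the 200 draws are expected to land there. Halving the number of nodes roughly doubles that count and shrinks the relative error, at the cost of a coarser picture.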
To construct the second graph, do exactly the same thing, but draw five samples of 400 observations instead of 200. Similarly, the remaining graphs show the outcome for five samples of sizes 600, 800, 1000, and 2000.
Even with 42 points to estimate, the underlying distribution is revealed fairly quickly as the number of observations increases. With 800 observations, an average of about 19 per node (though the 800 observations aren't, of course, distributed equally across the nodes), the distribution is resolved pretty clearly.
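For anyone who would rather see a number than eyeball the graphs, one way to summarize the convergence is the largest gap between an estimated frequency and the true Normal(0,1) probability across the 42 nodes, averaged over repeated samples. The sketch below does that for the six sample sizes. The error measure, the 100 repetitions, and the seed are my own choices for illustration and are not part of the original exercise.

```python
import bisect
import math
import random

# Interior boundaries -4.0, -3.8, ..., 4.0 (41 edges define 42 ranges)
EDGES = [round(-4.0 + 0.2 * i, 10) for i in range(41)]

def normal_cdf(x):
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

# True probability of each of the 42 ranges under Normal(0,1)
_cdf = [0.0] + [normal_cdf(e) for e in EDGES] + [1.0]
EXPECTED = [b - a for a, b in zip(_cdf, _cdf[1:])]

def max_abs_error(n, rng):
    """Largest |estimated frequency - true probability| for one sample of n."""
    counts = [0] * 42
    for _ in range(n):
        # bisect_right returns 0 for the lower tail and 41 for the upper tail
        counts[bisect.bisect_right(EDGES, rng.gauss(0.0, 1.0))] += 1
    return max(abs(c / n - p) for c, p in zip(counts, EXPECTED))

def average_error(n, repeats=100, seed=0):
    """Average of max_abs_error over `repeats` independent samples."""
    rng = random.Random(seed)
    return sum(max_abs_error(n, rng) for _ in range(repeats)) / repeats

for n in (200, 400, 600, 800, 1000, 2000):
    print(n, "obs,", round(n / 42, 1), "per node, avg max error",
          round(average_error(n), 4))
```

Since the sampling error of each frequency shrinks in proportion to one over the square root of the sample size, the average error should fall steadily but with diminishing returns as the sample grows.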