Lab 4: Smoothing

# Initialize Otter
import otter
grader = otter.Notebook("lab4-smoothing.ipynb")
import numpy as np
import pandas as pd
import altair as alt
pd.options.mode.chained_assignment = None  # default='warn'
# disable row limit for plotting
alt.data_transformers.disable_max_rows()
# uncomment to ensure graphics display with pdf export
# alt.renderers.enable('mimetype')


So far, you’ve encountered a number of visualization techniques for displaying tidy data. In those visualizations, all graphic elements represent the values of a dataset – they are visual displays of actual data.

In general, smoothing means evening out. Visualizations of actual data are often irregular – points are distributed widely in scatterplots, line plots are jagged, bars are discontinuous. When we look at such visuals, we tend to attempt to look past these irregularities in order to discern patterns – for example, the overall shape of a histogram or the general trend in a scatterplot. Showing what a graphic might look like with irregularities evened out often aids the eye in detecting pattern. This is what smoothing is: evening out irregularities in graphical displays of actual data.

For our purposes, usually smoothing will consist in drawing a line or a curve on top of an existing statistical graphic. From a technical point of view, this amounts to adding derived geometric objects to a graphic that have fewer irregularities than the displays of actual data.

In this lab, you’ll learn some basic smoothing techniques – kernel density estimation, LOESS, and linear smoothing via regression – and how to implement them in Altair.

In Altair, smoothing is implemented via what Altair describes as transforms – operations that modify a dataset. Try not to get too attached to this terminology – ‘transform’ and ‘transformation’ are used to mean a variety of things in other contexts. You’ll begin with a brief introduction to Altair transforms before turning to smoothing techniques.

The sections of the lab are divided as follows:

  1. Introduction to Altair transforms
  2. Histogram smoothing: kernel density estimation
  3. Scatterplot smoothing: LOESS and linear smoothing
  4. A neat graphic

And our main goals are:

  • Get familiar with Altair transforms for dataframe operations: filter, bin, aggregate, calculate.
  • ‘Handmade’ histograms: step-by-step construction
  • Implement kernel density estimation via .transform_density(...)
  • Implement LOESS via .transform_loess(...)
  • Implement linear smoothing via .transform_regression(...)

You’ll use the same data as last week to stick to a familiar example:

# import tidied lab 3 data
data = pd.read_csv('data/lab3-data.csv')
data.head()

Transforms in Altair

In Altair, operations that modify a dataset are referred to as transforms. Mostly, these are operations that could be performed manually with ease – the utility of transforms is that they wrap common operations within plotting commands, although they also make the plotting code more verbose.

Transforms encompass a broad range of types of operations, from relatively simple ones like filtering to more complex ones like smoothing. Here you’ll see a few intuitive transforms in Altair that integrate simple dataframe manipulations into the plotting process.

You’ll focus on the construction of histograms as a sort of case study. This will be a useful primer for histogram smoothing in the next section.

Filter transform

Last week you saw a way to make histograms. As a quick refresher, to make a histogram of life expectancies across the globe in 2010, one can filter the data and then plot using the following commands:

# filter
data2010 = data[data.Year == 2010]

# plot
alt.Chart(data2010).mark_bar().encode(
    x = alt.X('Life Expectancy', 
              bin = alt.Bin(step = 2), 
              title = 'Life Expectancy at Birth'),
    y = 'count()'
)

However, the filtering step can be handled within the plotting commands using .transform_filter().

This uses a helper command to specify the filtering condition – in the above example, the filtering condition is that Year is equal to 2010. A filtering condition is referred to in Altair as a ‘field predicate’. In the above example:

  • filtering field: Year
  • field predicate: equals 2010

There are different helpers for different types of field predicates – you can find a complete list in the documentation.
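
For instance, there are helpers for selecting a range of values or a set of values. The short sketch below shows two of these other predicate helpers (the particular years are arbitrary); either predicate can be passed to .transform_filter() in place of FieldEqualPredicate.

# sketch: two other field-predicate helpers (arbitrary example values)
# keep rows whose Year falls between 2000 and 2010 (inclusive)
in_range = alt.FieldRangePredicate(field = 'Year', range = [2000, 2010])

# keep rows whose Year is one of the listed values
in_set = alt.FieldOneOfPredicate(field = 'Year', oneOf = [2010, 2019])

# e.g.: alt.Chart(data).transform_filter(in_range) ...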

Here is how to use .transform_filter() to make the same histogram shown above, but skipping the step of storing a subset of the data under a separate name:

# filter and plot
alt.Chart(data).transform_filter(
    alt.FieldEqualPredicate(field = 'Year', 
                            equal = 2010)
).mark_bar().encode(
    x = alt.X('Life Expectancy', 
              bin = alt.Bin(step = 2), 
              title = 'Life Expectancy at Birth'),
    y = 'count()'
)

Question 1: Filter transform

Construct a histogram of life expectancies across the globe in 2019 using a filter transform as shown above to filter the appropriate rows of the dataset. Use a bin size of three (not two) years.

# filter and plot
alt.Chart(data).transform_filter(
    ...
).mark_bar().encode(
    x = ...
    y = ...
)

Bin transform

The code above provides a sleek way to construct the histogram that handles binning via arguments to alt.X(...). However, binning actually involves an operation: creating a new variable that is a discretization of an existing variable into contiguous intervals of a specified width.

To illustrate, have a look at how the histogram could be constructed ‘manually’ with the following operations:

  1. Bin life expectancies.
  2. Count values in each bin.
  3. Make a bar plot of counts against bin centers.

Here’s step 1:

# bin life expectancies into 20 contiguous intervals
data2010['Bin'] = pd.cut(data2010["Life Expectancy"], bins = 20)
data2010.head()

Here’s step 2:

# count values in each bin and store midpoints
histdata = data2010.loc[:, ['Life Expectancy', 'Bin']].groupby('Bin').count()
histdata['Bin midpoint'] = histdata.index.values.categories.mid.values
histdata

And finally, step 3:

# plot histogram
alt.Chart(histdata).mark_bar(width = 10).encode(
    x = 'Bin midpoint',
    y = alt.Y('Life Expectancy', title = 'Count')
)

These operations can be expressed as a transform in Altair using .transform_bin():

# filter, bin, and plot
alt.Chart(
    data
).transform_filter(
    alt.FieldEqualPredicate(field = 'Year', 
                            equal = 2010)
).transform_bin(
    'Life Expectancy at Birth', # name to give binned variable
    field = 'Life Expectancy', # variable to bin
    bin = alt.Bin(step = 2) # binning parameters
).mark_bar(size = 10).encode(
    x = 'Life Expectancy at Birth:Q', 
    y = 'count()'
)

The plotting code is a little more verbose, but it is more convenient than performing the manipulations separately in pandas, since no intermediate dataframes need to be created and named.

Question 2: Bin transform

Follow the example above and make a histogram of life expectancies across the globe in 2019 using an explicit bin transform to create bins spanning three years.

# filter, bin, and plot
alt.Chart(
    data
).transform_filter(
    alt.FieldEqualPredicate(field = 'Year', 
                            equal = 2019)
).transform_bin(
    ...
    field = ...
    bin = ...
).mark_bar(size = 10).encode(
    x = ...
    y = 'count()'
)

Aggregate transform

Now, the counting of observations in each bin (implemented via y = count()) is also an under-the-hood operation in constructing the histogram. You already saw how this was done ‘manually’ in the example above before introducing the bin transform.

Grouped counting is a form of aggregation in the sense discussed in lecture: it produces output that has fewer values than the input by combining multiple values (in this case rows) into one value (in this case a count of the number of rows).
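
As a point of reference, the same grouped count could be produced directly in pandas; the one-line sketch below reuses the data2010 frame and the Bin column created in the manual construction above.

# sketch: grouped counting in pandas -- number of observations per bin
data2010.groupby('Bin').size()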

This operation can also be made explicit using .transform_aggregate(). This makes use of Altair’s aggregation shorthands for common aggregation functions; see the documentation on Altair encodings for a full list of shorthands.

Here is how .transform_aggregate() would be used to perform the counting:

# filter, bin, count, and plot
alt.Chart(
    data
).transform_filter(
    alt.FieldEqualPredicate(field = 'Year', 
                            equal = 2010)
).transform_bin(
    'Life Expectancy at Birth', 
    field = 'Life Expectancy',
    bin = alt.Bin(step = 2) 
).transform_aggregate(
    Count = 'count()', # altair shorthand operation -- see docs for full list
    groupby = ['Life Expectancy at Birth'] # grouping variable(s)
).mark_bar(size = 10).encode(
    x = 'Life Expectancy at Birth:Q', 
    y = 'Count:Q'
)

Calculate transform

By default, Altair’s histograms are displayed on the count scale rather than the density scale.

The count scale means that the y-axis shows counts of observations in each bin.

By contrast, on the density scale, the y-axis shows counts rescaled so that the total area of the plotted bars sums to 1.

It might seem like a silly distinction – after all, the two scales differ simply by a proportionality constant (the sample size times the bin width) – but as you will see shortly, the density scale is more useful for statistical thinking about the distribution of values and for direct comparisons of distributions approximated from samples of different sizes.

The scale conversion can be done using .transform_calculate(), which computes derived variables using arithmetic operations. In this case, one only needs to divide each count by the sample size times the bin width.
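
As a small aside, the scaling factor for the example below could be computed rather than hardcoded; here is a sketch for the 2010 histogram with a bin width of 2.

# sketch: compute the scaling factor (sample size x bin width) for the 2010 example
n_2010 = (data.Year == 2010).sum()   # number of observations in 2010
bin_width = 2
print('scaling factor =', n_2010 * bin_width)   # the example below hardcodes 2*157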

# filter, bin, count, convert scale, and plot
alt.Chart(
    data
).transform_filter(
    alt.FieldEqualPredicate(field = 'Year', 
                            equal = 2010)
).transform_bin(
    'Life Expectancy at Birth', 
    field = 'Life Expectancy', 
    bin = alt.Bin(step = 2)
).transform_aggregate(
    Count = 'count()',
    groupby = ['Life Expectancy at Birth']
).transform_calculate(
    Density = 'datum.Count/(2*157)' # divide counts by sample size x binwidth
).mark_bar(size = 10).encode(
    x = 'Life Expectancy at Birth:Q', 
    y = 'Density:Q'
)
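
If you'd like to check that density-scale bars really do enclose a total area of one, here is a quick sketch using the histdata frame from the manual construction earlier (those bins came from pd.cut with 20 intervals rather than a fixed width of 2, but the check is the same).

# sketch: verify that density-scale bars enclose a total area of one
counts = histdata['Life Expectancy']                       # bin counts from the manual construction
widths = histdata.index.values.categories.length.values    # widths of the pd.cut bins
n = counts.sum()                                           # sample size
density = counts.values / (n * widths)                     # density scale: count / (n * width)
print('total area =', (density * widths).sum())            # should print 1.0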

Question 3: Density scale histogram

Follow the example above and convert your histogram from Question 2 (with the year 2019, the step size of 3, and the usage of .transform_bin(...)) to the density scale.

    1. First, calculate the sample size and store the value as sample_size. Store the desired step size as bin_width.
    2. Then convert the histogram to the density scale: calculate the count explicitly using .transform_aggregate(...) and then convert it to a density using .transform_calculate(...). Multiply sample_size by bin_width to obtain the scaling factor and hardcode it into your implementation.
# find scaling factor
sample_size = ...
bin_width = ...
print('scaling factor = ', sample_size*bin_width)

# construct histogram
alt.Chart(data).transform_filter(
    alt.FieldEqualPredicate(field = 'Year', 
                            equal = 2019)
).transform_bin(
    ...
    field = ...
    bin = ...
).transform_aggregate(
    Count = ...
    groupby = ...
).transform_calculate(
    # use sample_size*bin_width to rescale - you will need to hardcode this value
    Density = ...
).mark_bar(size = 20).encode(
    x = 'Life Expectancy at Birth:Q', 
    y = ...
)
grader.check("q3")

Density estimation

Now that you have a sense of how transforms work, we can explore transforms that perform more sophisticated operations. We’re going to focus on a technique known as kernel density estimation.

Histograms show the distribution of values in the sample. Let’s call the density-scale histogram the empirical density. A kernel density estimate is simply a smoothing of the empirical density. (It’s called an ‘estimate’ because it’s often construed as an approximation of the distribution of population values that the sample came from.)
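
Concretely, a kernel density estimate places a small smooth bump (the ‘kernel’, typically Gaussian) at each observed value and averages the bumps. The cell below is a bare-bones NumPy sketch of that idea for the 2010 life expectancies, with a bandwidth of 3 to match the Altair example further down; it illustrates the concept rather than reproducing Altair’s exact computation.

# sketch: a Gaussian kernel density estimate computed 'by hand' with NumPy
x_obs = data.loc[data.Year == 2010, 'Life Expectancy'].dropna().values
h = 3                                  # bandwidth: standard deviation of each Gaussian bump
grid = np.linspace(30, 85, 500)        # points at which to evaluate the estimate

# place a Gaussian bump at each observation and average the bumps
bumps = np.exp(-0.5 * ((grid[:, None] - x_obs[None, :]) / h)**2) / (h * np.sqrt(2 * np.pi))
kde_byhand = pd.DataFrame({'Life Expectancy at Birth': grid,
                           'Estimated Density': bumps.mean(axis = 1)})

alt.Chart(kde_byhand).mark_line(color = 'black').encode(
    x = 'Life Expectancy at Birth:Q',
    y = 'Estimated Density:Q'
)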

Often the point of visualizing the distribution of a variable is to discern the shape, spread, center, and tails of the distribution in order to answer certain questions:

  • What’s a typical value?
  • Are there multiple typical values (is the distribution multi-modal)?
  • Are there outliers?
  • Is the distribution skewed?

Density estimates are often easier to work with in exploratory analysis because it is visually easier to distinguish the shape of a smooth curve than the shape of a bunch of bars (unless you’re really far away).

Kernel density estimates are easy to plot using .transform_density(). The cell below generates a density estimate of life expectancies across the globe in 2010. Notice the commented lines explaining the syntax.

# plot kernel density estimate of life expectancies in 2010
alt.Chart(
    data
).transform_filter(
    alt.FieldEqualPredicate(field = 'Year', 
                            equal = 2010)
).transform_density(
    density = 'Life Expectancy', # variable to smooth
    as_ = ['Life Expectancy at Birth', 'Estimated Density'], # names of outputs
    bandwidth = 3, # how smooth?
    extent = [30, 85], # domain on which the smooth is defined
    steps = 1000 # for plotting: number of points to generate for plotting line
).mark_line(color = 'black').encode(
    x = 'Life Expectancy at Birth:Q',
    y = 'Estimated Density:Q'
)

This estimate can be layered onto the empirical density to get a better sense of the relationship between the two. The cell below accomplishes this. Notice that the plot elements are constructed as separate layers.

# base plot
base = alt.Chart(data).transform_filter(
    alt.FieldEqualPredicate(field = 'Year', 
                            equal = 2010)
)
 
# empirical density
hist = base.transform_bin(
    as_ = 'Life Expectancy at Birth', 
    field = 'Life Expectancy', 
    bin = alt.Bin(step = 2)
).transform_aggregate(
    Count = 'count()',
    groupby = ['Life Expectancy at Birth']
).transform_calculate(
    Density = 'datum.Count/(2*157)'
).mark_bar(size = 10, opacity = 0.8).encode(
    x = 'Life Expectancy at Birth:Q', 
    y = 'Density:Q'
)

# kernel density estimate
smooth = base.transform_density(
    density = 'Life Expectancy', 
    as_ = ['Life Expectancy at Birth', 'Estimated density'],
    bandwidth = 3, 
    extent = [30, 85],
    steps = 1000 
).mark_line(color = 'black').encode(
    x = 'Life Expectancy at Birth:Q',
    y = 'Estimated density:Q'
)

# layer
hist + smooth

What if you want a different amount of smoothing? That’s what the bandwidth parameter is for. The smoothing is local, in the following sense: at any given point, the kernel density estimate averages nearby values, weighting them according to how close they are to the point in question.

The bandwidth parameter specifies the size of the smoothing neighborhood: it is the standard deviation of the (Gaussian) kernel used for the weighting. For instance, above bandwidth = 3, which means that values within a few years of a given point contribute most of the weight to the estimate at that point.

  • If the bandwidth is increased, averaging is more global, so the density estimate will get smoother.
  • If the bandwidth is decreased, averaging is more local, so the density estimate will get wigglier.

There are some methods out there for automating bandwidth choice, but often it is done by hand. Arguably this is preferable, as it allows the analyst to see a few possibilities and decide what best captures the shape of the distribution.
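
To see the effect, the sketch below layers two estimates with deliberately extreme bandwidths on the base chart defined above (the particular values 1 and 8 are arbitrary; they just bracket the bandwidth of 3 used earlier).

# sketch: compare a wigglier and a smoother density estimate (arbitrary bandwidths)
wiggly = base.transform_density(
    density = 'Life Expectancy',
    as_ = ['Life Expectancy at Birth', 'Estimated density'],
    bandwidth = 1,   # small bandwidth: more local averaging, wigglier curve
    extent = [30, 85],
    steps = 1000
).mark_line(color = 'blue').encode(
    x = 'Life Expectancy at Birth:Q',
    y = 'Estimated density:Q'
)

smoother = base.transform_density(
    density = 'Life Expectancy',
    as_ = ['Life Expectancy at Birth', 'Estimated density'],
    bandwidth = 8,   # large bandwidth: more global averaging, smoother curve
    extent = [30, 85],
    steps = 1000
).mark_line(color = 'red').encode(
    x = 'Life Expectancy at Birth:Q',
    y = 'Estimated density:Q'
)

wiggly + smoother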

Question 4: Selecting a bandwidth

Modify the plotting code by decreasing the bandwidth parameter. Try several values, then choose one that you feel captures the shape of the distribution well without getting too wiggly.

hist + base.transform_density(
    density = 'Life Expectancy', 
    as_ = ['Life Expectancy at Birth', 'Estimated density'],
    bandwidth = ...
    extent = [30, 85],
    steps = 1000 
).mark_line(color = 'black').encode(
    x = 'Life Expectancy at Birth:Q',
    y = 'Estimated density:Q'
)

Comparing distributions

The visual advantage of a kernel density estimate for discerning shape is even more apparent when comparing distributions.

A major task in exploratory analysis is understanding how the distribution of a variable of interest changes depending on other variables – for example, you have already seen in the last lab that life expectancy seems to change over time. We can explore this phenomenon from a different angle by comparing distributions in different years.

Multiple density estimates can be displayed on the same plot by passing a grouping variable (or set of variables) to .transform_density(...). For example, the cell below computes density estimates of life expectancies for each of two years.

alt.Chart(data).transform_filter(
    alt.FieldOneOfPredicate(field = 'Year',
                            oneOf = [2010, 2019])
).transform_density(
    density = 'Life Expectancy',
    groupby = ['Year'],
    as_ = ['Life Expectancy at Birth', 'Estimated Density'],
    bandwidth = 1.8, 
    extent = [25, 90],
    steps = 1000 
).mark_line().encode(
    x = 'Life Expectancy at Birth:Q',
    y = 'Estimated Density:Q',
    color = 'Year:N'
)

Often the area beneath each density estimate is filled in. This can be done by layering a copy of the chart that uses .mark_area() on top of the line chart, as below.

p = alt.Chart(data).transform_filter(
    alt.FieldOneOfPredicate(field = 'Year',
                            oneOf = [2010, 2019])
).transform_density(
    density = 'Life Expectancy',
    groupby = ['Year'],
    as_ = ['Life Expectancy at Birth', 'Estimated Density'],
    bandwidth = 1.8, 
    extent = [25, 90],
    steps = 1000 
).mark_line().encode(
    x = 'Life Expectancy at Birth:Q',
    y = 'Estimated Density:Q',
    color = 'Year:N'
)

p + p.mark_area(opacity = 0.1)

Notice that this makes it much easier to compare the distributions between years – you can see a pronounced rightward shift of the smooth for 2019 compared with 2010.

We could make the same comparison based on the histograms, but the shift is a lot harder to make out; for this reason, overlaid histograms are usually best avoided.

alt.Chart(data).transform_filter(
    alt.FieldOneOfPredicate(field = 'Year',
                            oneOf = [2010, 2019])
).mark_bar(opacity = 0.5).encode(
    x = alt.X('Life Expectancy', bin = alt.Bin(maxbins = 30), title = 'Life Expectancy at Birth'),
    y = alt.Y('count()', stack = None),
    color = 'Year:N'
)

Question 5: Multiple density estimates

Follow the example above to construct a plot showing separate density estimates of life expectancy for each region in 2010. You can choose whether or not to fill in the area beneath the smooth curves. Be sure to play with the bandwidth parameter and choose a value that seems sensible to you.

# construct density estimates
p = alt.Chart(data).transform_filter(
    ...
).transform_density(
    density = ...
    groupby = ...
    as_ = ...
    bandwidth = ...
    extent = ...
    steps = ...
).mark_line(
).encode(
    x = ...
    y = ...
    color = ...
)

# add shaded area underneath curves
...

Question 6: Interpretation

Do the distributions of life expectancies seem to differ by region? If so, what is one difference that you notice? Answer in 1-2 sentences.

Type your answer here, replacing this text.

Question 7: Outlier

Notice that little peak way off to the left in the distribution of life expectancies in the Americas. That’s an outlier.

    1. Which country is it? Check by filtering the data appropriately and using .sort_values(...) to find the lowest life expectancy in the Americas. Save the outlying observation as a one-row dataframe called lowest_Americas and print the row.
    2. What was the life expectancy for that country in other years? Filter the data to examine the life expectancy of the country you identified as the outlier across all years. Save the resulting dataframe as outlier_country.
    3. What happened in 2010? Can you explain why the life expectancy was so low in that country for that particular year? (Hint: if you don’t remember, Google the country name and year in question.)

Type your answer here, replacing this text.

# examine outlier
lowest_Americas = data[
    ...
    ].sort_values(
    by = ...
    ).head(1)
lowest_Americas
# show all observations for country of interest
outlier_country = ...
outlier_country
grader.check("q7")

Scatterplot smoothing

In this brief section you’ll see two techniques for smoothing scatterplots: LOESS, which produces a curve; and regression, which produces a linear smooth.

The next parts will add a column to the dataframe. Create a copy data_mod1 of the original dataframe data and modify that copy, so as not to lose track of previous work:

data_mod1 = data.copy()

LOESS

Locally weighted scatterplot smoothing (LOESS) is a flexible smoothing technique for visualizing trends in scatterplots. The technical details are a little involved but quite similar conceptually to kernel density estimation; we’ll just look at the implementation for now.
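
If you're curious how such a curve could be computed outside of Altair, the optional sketch below uses the lowess function from statsmodels (not otherwise used in this lab, so treat the import as an assumption about your environment); its frac argument plays a role analogous to Altair's bandwidth.

# optional sketch: a LOESS fit computed with statsmodels (assumes statsmodels is installed)
from statsmodels.nonparametric.smoothers_lowess import lowess

df2000 = data[data.Year == 2000].dropna(subset = ['GDP per capita', 'Life Expectancy'])
loess_fit = lowess(df2000['Life Expectancy'],          # y variable
                   np.log(df2000['GDP per capita']),   # x variable
                   frac = 0.25)                        # fraction of data used in each local fit
loess_fit[:5]  # rows are (sorted x value, fitted y value) pairs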

To illustrate, consider the scatterplots you made in lab 3 showing the relationship between life expectancy and GDP per capita. The plot for 2000 looked like this:

# log transform gdp explicitly
data_mod1['log(GDP per capita)'] = np.log(data_mod1['GDP per capita'])

# scatterplot
scatter = alt.Chart(data_mod1).transform_filter(
    alt.FieldEqualPredicate(field = 'Year', equal = 2000)
).mark_circle(opacity = 0.5).encode(
    x = alt.X('log(GDP per capita)', scale = alt.Scale(zero = False)),
    y = alt.Y('Life Expectancy', title = 'Life Expectancy at Birth', scale = alt.Scale(zero = False)),
    size = alt.Size('Population', scale = alt.Scale(type = 'sqrt'))
)

# show
scatter

To add a LOESS curve, simply append .transform_loess() to the base plot:

# compute smooth
smooth = scatter.transform_loess(
    on = 'log(GDP per capita)', # x variable
    loess = 'Life Expectancy', # y variable
    bandwidth = 0.25 # how smooth?
).mark_line(color = 'black')

# add as a layer to the scatterplot
scatter + smooth

Just as with kernel density estimates, LOESS curves have a bandwidth parameter that controls how smooth or wiggly the curve is. In Altair, the LOESS bandwidth is a unitless parameter between 0 and 1; roughly speaking, it controls the fraction of the data used for each local fit, so larger values produce smoother curves.

Question 8: LOESS bandwidth selection

Tinker with the bandwidth parameter to see its effect in the cell below. Then choose a value that produces a smoothing you find appropriate for indicating the general trend shown in the scatter.

# compute smooth
smooth = scatter.transform_loess(
    on = 'log(GDP per capita)', 
    loess = 'Life Expectancy', 
    bandwidth = ...
).mark_line(color = 'black')

# add as a layer to the scatterplot
scatter + smooth

LOESS curves can also be computed groupwise. For instance, to display separate curves for each region, one need only pass a groupby = ... argument to .transform_loess():

# scatterplot
scatter = alt.Chart(data_mod1).transform_filter(
    alt.FieldEqualPredicate(field = 'Year', equal = 2000)
).mark_circle(opacity = 0.5).encode(
    x = alt.X('log(GDP per capita)', scale = alt.Scale(zero = False)),
    y = alt.Y('Life Expectancy', title = 'Life Expectancy at Birth', scale = alt.Scale(zero = False)),
    size = alt.Size('Population', scale = alt.Scale(type = 'sqrt')),
    color = 'region'
)

# compute smooth
smooth = scatter.transform_loess(
    groupby = ['region'], # add groupby
    on = 'log(GDP per capita)', 
    loess = 'Life Expectancy', 
    bandwidth = 0.8 
).mark_line(color = 'black')

# add as a layer to the scatterplot
scatter + smooth

The curves are a little jagged because there aren’t very many countries in each region.

data_mod1[data_mod1.Year == 2000].groupby('region').count().iloc[:, [0]]

Regression

You will be learning more about linear regression later in the course, but we can introduce regression lines now as a visualization technique. As with LOESS, you don’t need to concern yourself with the mathematical details (yet). From this perspective, regression is a form of linear smoothing – a regression smooth is a straight line. By contrast, LOESS smooths have curvature – they are not straight lines.

In the example above, the LOESS curves don’t have much curvature. So it may be a cleaner choice visually to show linear smooths. This can be done using .transform_regression(...) with a similar argument structure.
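
As a point of reference, the short sketch below computes a least-squares line for the same year-2000 data with NumPy; this is the kind of straight-line smooth that .transform_regression(...) draws (the actual fitting is handled by Altair/Vega-Lite). The Altair version follows right after.

# sketch: an ordinary least-squares line fit with NumPy, for intuition
sub = data_mod1[data_mod1.Year == 2000].dropna(subset = ['log(GDP per capita)', 'Life Expectancy'])
slope, intercept = np.polyfit(sub['log(GDP per capita)'], sub['Life Expectancy'], deg = 1)
print('fitted line: Life Expectancy ~', round(intercept, 2), '+', round(slope, 2), '* log(GDP per capita)')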

# scatterplot
scatter = alt.Chart(data_mod1).transform_filter(
    alt.FieldEqualPredicate(field = 'Year', equal = 2000)
).mark_circle(opacity = 0.5).encode(
    x = alt.X('log(GDP per capita)', scale = alt.Scale(zero = False)),
    y = alt.Y('Life Expectancy', title = 'Life Expectancy at Birth', scale = alt.Scale(zero = False)),
    size = alt.Size('Population', scale = alt.Scale(type = 'sqrt')),
    color = 'region'
)

# compute smooth
smooth = scatter.transform_regression(
    groupby = ['region'],
    on = 'log(GDP per capita)', 
    regression = 'Life Expectancy'
).mark_line(color = 'black')

# add as a layer to the scatterplot
scatter + smooth

Question 9: Simple regression line

Based on the example immediately above, construct a scatterplot of life expectancy against log GDP per capita in 2010 with points sized according to population (and no distinction between regions). Layer a single linear smooth on the scatterplot using .transform_regression(...).

(Hint: remove the color aesthetic and grouping from the previous plot.)

# construct scatterplot
scatter = alt.Chart(data_mod1).transform_filter(
    ...
).mark_circle(opacity = 0.5).encode(
    x = ...
    y = ...
    size = ...
)

# construct smooth
smooth = scatter.transform_regression(
    on = ...
    regression = ...
).mark_line(color = 'black')

# layer
...

A neat trick

Let’s combine the scatterplot-with-smooth from the scatterplot smoothing section with the density estimates from the density estimation section. This is an example of combining multiple plots into one visual.

Why combine? Well, sometimes it’s useful to visualize the distribution of the variable of interest together with its relationship to another variable. Imagine, for example, that you’re interested in seeing both:

  • the relationship between life expectancy and GDP per capita by region; and
  • the distributions of life expectancies by region.

We can flip the density estimates on their side and concatenate them as a panel to the right-hand side of the scatterplot as follows:

# scatterplot with linear smooth
scatter = alt.Chart(data_mod1).transform_filter(
    alt.FieldEqualPredicate(field = 'Year', equal = 2000)
).mark_circle(opacity = 0.5).encode(
    x = alt.X('log(GDP per capita)', scale = alt.Scale(zero = False)),
    y = alt.Y('Life Expectancy', title = 'Life Expectancy at Birth', scale = alt.Scale(zero = False)),
    size = alt.Size('Population', scale = alt.Scale(type = 'sqrt')),
    color = 'region'
)

smooth = scatter.transform_regression(
    groupby = ['region'],
    on = 'log(GDP per capita)', 
    regression = 'Life Expectancy'
).mark_line(color = 'black')

# density estimates
p = alt.Chart(data_mod1).transform_filter(
    alt.FieldEqualPredicate(field = 'Year',
                            equal = 2000)
).transform_density(
    density = 'Life Expectancy',
    groupby = ['region'], # compute a separate estimate for each region
    as_ = ['Life Expectancy at Birth', 'Estimated density'],
    bandwidth = 2, 
    extent = [40, 85],
    steps = 1000 
).mark_line(order = False).encode(
    y = alt.Y('Life Expectancy at Birth:Q', 
              scale = alt.Scale(domain = (40, 85)), 
              title = '', 
              axis = None),
    x = alt.X('Estimated density:Q',
              title = '',
              axis = None),
    color = alt.Color('region:N')
).properties(width = 60)

# facet structure
(scatter + smooth) | (p + p.mark_area(order = False, opacity = 0.1))

Submission

  1. Save the notebook.
  2. Restart the kernel and run all cells. (CAUTION: if your notebook is not saved, you will lose your work.)
  3. Carefully look through your notebook and verify that all computations execute correctly and all graphics are displayed clearly. You should see no errors; if there are any errors, make sure to correct them before you submit the notebook.
  4. Download the notebook as an .ipynb file. This is your backup copy.
  5. Export the notebook as PDF and upload to Gradescope.