---
title: "[WEEK 3 TITLE]"
subtitle: "[WEEK 3 SUBTITLE]"
date: last-modified
date-format: "[Updated ]MMM D, YYYY"
format: 
  revealjs:
    theme: brownslides.scss
    logo: images/pols1140_hex.png
    footer: "[COURSE CODE]"
    multiplex: false
    transition: fade
    html-math-method: mathjax
    slide-number: c
    incremental: true
    center: false
    menu: true
    scrollable: true
    highlight-style: github
    progress: true
    code-overflow: wrap
    chalkboard: true
    # include-after-body: title-slide.html
    title-slide-attributes:
      align: left
      data-background-image: images/pols1140_hex.png
      data-background-position: 90% 50%
      data-background-size: 40%
filters:
  - openlinksinnewpage
execute: 
  eval: true
  echo: true
  warning: false
  message: false
  cache: true
---



<!--# {{< fa map-location>}} Tuesday {.inverse}
-->


# {{< fa map-location>}} Thursday {.inverse}


```{r}
#| label: init
#| echo: false
#| results: hide
#| warning: false 
#| message: false

library(tidyverse)
library(labelled)
library(haven)
library(DeclareDesign)
library(easystats)
library(texreg)
library(kableExtra)
library(dagitty)

the_packages <- c(
  ## R Markdown
  "kableExtra","DT","texreg",
  ## Tidyverse
  "tidyverse", "lubridate", "forcats", "haven", "labelled",
  ## Extensions for ggplot
  "ggmap","ggrepel", "ggridges", "ggthemes", "ggpubr", 
  "GGally", "scales", "dagitty", "ggdag", "ggforce",
  # Data 
  "COVID19","maps","mapdata","qss","tidycensus", "dataverse", 
  # Analysis
  "DeclareDesign", "easystats", "zoo"
)

## Define a function to load (and if needed install) packages


ipak <- function(pkg){
    new.pkg <- pkg[!(pkg %in% installed.packages()[, "Package"])]
    if (length(new.pkg)) 
        install.packages(new.pkg, dependencies = TRUE)
    sapply(pkg, require, character.only = TRUE)
}

## Install (if needed) and load libraries in the_packages
ipak(the_packages)


```


<!-- # {{< fa map-location>}} Monday {.inverse} -->

<!-- # {{< fa map-location>}} Friday {.inverse} --> 

<!-- ## Review {.smaller} -->

<!-- - Challenges to the Folk Theory of Democracy -->
<!--   - What is the folk theory of democracy -->
<!--   - What are some theoretical challenges -->
<!--   - What are some empirical challenges -->

<!-- - Statistical Foundations for POLS 1140 -->
<!--   - How do we describe what's typical? -->
<!--   - How do we quantify uncertainty? -->
<!--   - How do we make causal claims? -->

<!-- ## Plan for the Week {.smaller} -->

<!-- **Monday** -->

<!-- - Finish up Polls and Forecasting -->
<!-- - @Converse1964-zo -->

<!-- **Wednesday** -->

<!-- - @Ansolabehere2008-ma -->
<!-- - @Freeder2019-lu -->
<!-- - Survey groups assigned -->

<!-- **Friday** -->

<!-- - No class -->
<!-- - I'll record some comments on material we haven't covered -->

<!-- ## What's the effect of the Harris-Trump Debate? {.smaller} -->

<!-- Last week we talked: -->

<!-- - What's the effect of the debate on the 2024 election? -->

<!-- - Why might the debate have an effect? -->

<!-- - Why might the debate not have an effect? -->

<!-- - How would you know? What types of comparisons would you make? -->

<!-- ## The Polls So Far {.smallero} -->

<!-- :::{.panel-tabset} -->

<!-- ## Overview -->

<!-- What to expect when: -->

<!-- - Day of: -->
<!--   - Snap polls (CNN) -->
<!--   - Online non-probability polls (ignore) -->
<!-- - This week: -->
<!--   - Online Panels (YouGov/IPSOS) -->
<!-- - Next week:  -->
<!--   - More traditional RDD (NYT/Sienna) -->

<!-- - It will take a while to know the "effect" of the debates. -->

<!-- ## FiveThirtyEight -->

<!-- ![](images/03_538_polls.png) -->
<!-- [(Source)](https://projects.fivethirtyeight.com/polls/president-general/2024/national/?ex_cid=abcpromo) -->

<!-- ## RealClear -->

<!-- ![](images/03_realclear_polls.png) -->

<!-- [(Source)](https://www.realclearpolitics.com/epolls/latest_polls/national_president/index.html) -->

<!-- ## NYT -->

<!-- ![](images/03_nyt_polls.png) -->

<!-- [(Source)](https://www.nytimes.com/interactive/2024/us/elections/polls-president.html) -->
<!-- ::: -->

<!-- ::: -->

<!-- ## Reading poll results -->

<!-- ![](images/03_yougov.png) -->

<!-- [Source](https://today.yougov.com/politics/articles/50498-harris-wins-the-presidential-debate-poll) [Crosstabs](https://ygo-assets-websites-editorial-emea.yougov.net/documents/Reactions_to_Harris_Trump_Debate_poll_results.pdf#page=6) -->

## Plan for the Week {.smaller}

**Tuesday**

- Stats and POLS 1140
- Finish up discussion of polling and forecasting
- Discuss @Converse1964-zo

**Thursday**

- Begin:
  - Review @Converse1964-zo
  - @Ansolabehere2008-ma
  - @Freeder2019-lu
- Survey groups assigned after class


## Let's do brunch?

```{r}
#| echo: false

df <- haven::read_spss("surveys/wk03.sav")
df <- df %>% filter(share == 1)

df %>% 
  mutate(
    Brunch = fct_rev(fct_inorder(as_factor(brunch)))
  ) %>% 
    ggplot(aes(Brunch, fill=Brunch))+
  geom_bar()+
  geom_text(stat='count', aes(label=after_stat(count)), hjust=-0.25)+
  coord_flip()+
  ylim(0,30)+
  scale_fill_bluebrown()+
  theme_minimal()+
  labs(
    y="Count",
    x="",
    fill = "Brunch?",
    title = "Is brunch an acceptable first date?"
  )

```

## Eggsellent Choice

```{r}
#| echo: false

df %>%
  mutate(
    `Brunch Date?` = as_factor(brunch),
    `Why?`= brunch_why
  )%>%
  select(`Brunch Date?`, `Why?`)%>%
  DT::datatable(
    options = list(
              "pageLength" = 5)
  )
```



<!--## Groups{.smaller}

```{r}
#| echo: false

groups_df <- readr::read_csv("../files/students/students.csv")
groups_df %>% 
  group_by(group) %>% 
  summarize(
    Meeting = paste0("<a href='", unique(when2meet),"' target='_blank'> Availability </a>"),
    Members = paste(Name,collapse = ", ")
  ) %>% 
  rename(
    Group = group
  )-> group_tab

DT::datatable(group_tab,escape = F)

```
-->
# {{< fa lightbulb >}}  Statistics and POLS 1140 Part III {.inverse}

## Causal claims involve counterfactual comparisons

-   Causal claims imply claims about counterfactuals
    -   What would have happened if we were to change some aspect of the world?

- We can represent counterfactuals in terms of [potential outcomes]{.blue}

## Individual Causal Effects{.smaller}

Let $Y$ measure outcomes and $D \in \{0,1\}$ denote the presence or absence of some treatment

For any individual, we can imagine different potential outcomes:

$$
\begin{align}
Y_i(D_i = 1) & & \text{Outcome under treatment}\\
Y_i(0) & & \text{Outcome under control}\\
\end{align}
$$
The [individual causal effect]{.blue} is simply the difference in these potential outcomes

$$
\begin{align}
\tau_i = Y_i(1) - Y_i(0) && \text{Individual Causal Effect}
\end{align}
$$

The fundamental problem of causal inference is that individual causal effects are [unknowable]{.blue} because we only observe one of many potential outcomes

- A problem of missing data
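
To make the missing-data framing concrete, here is a minimal sketch with four hypothetical people. The numbers and the names `po_y1`, `po_y0`, and `po_d` are made up for illustration. Each person reveals only the potential outcome that matches the treatment they actually received, so every individual effect computed from observed data comes out missing.

```{r}
# Hypothetical potential outcomes for four people (illustrative values only)
po_y1 <- c(7, 5, 6, 8)   # outcome each person would have under treatment
po_y0 <- c(4, 5, 9, 6)   # outcome each person would have under control
po_d  <- c(1, 0, 1, 0)   # treatment each person actually received

# We only ever observe the potential outcome that matches treatment status
obs_y1 <- ifelse(po_d == 1, po_y1, NA)
obs_y0 <- ifelse(po_d == 0, po_y0, NA)

# Individual effects need both outcomes, so they are all missing
tau_obs <- obs_y1 - obs_y0
data.frame(d = po_d, y1 = obs_y1, y0 = obs_y0, tau = tau_obs)
```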
  

## A statistical solution to the FPoCI {.smaller}

Rather than focus on individual causal effects:

$$
\tau_i \equiv Y_i(1) - Y_i(0)
$$

We focus on average causal effects (Average Treatment Effects \[ATEs\]):

$$
E[\tau_i] = \overbrace{E[Y_i(1) - Y_i(0)]}^{\text{Average of a difference}} = \overbrace{E[Y_i(1)] - E[Y_i(0)]}^{\text{Difference of Averages}}
$$
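
The second equality is just linearity of expectation: the average of the differences equals the difference of the averages. A quick numeric check, using made-up potential outcomes (`lin_y1` and `lin_y0` are hypothetical names and values):

```{r}
# Hypothetical potential outcomes (illustrative values only)
lin_y1 <- c(7, 5, 6, 8)
lin_y0 <- c(4, 5, 9, 6)

avg_of_diff  <- mean(lin_y1 - lin_y0)        # average of a difference
diff_of_avgs <- mean(lin_y1) - mean(lin_y0)  # difference of averages

c(avg_of_diff = avg_of_diff, diff_of_avgs = diff_of_avgs)
```

With both columns of potential outcomes in hand, the two quantities agree exactly. The catch is that real data never show both columns for the same person.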

When does the difference of averages provide us with a good estimate of the average difference?

Let's consider a simple example

## Does eating chocolate make you happy?

-   $Y_i$ happiness measured on a 0-10 scale

-   $D_i$ whether a person ate [chocolate]{style="color: chocolate"} $(D=1)$ or [fruit]{style="color: purple"} $(D = 0)$

-   $Y_i(1)$ a person's happiness eating [chocolate]{style="color: chocolate"}

-   $Y_i(0)$ a person's happiness eating [fruit]{style="color: purple"}

-   $X_i$ a person's self-reported preference $(X_i \in$ {[chocolate]{style="color: chocolate"}, [fruit]{style="color:purple"} })

```{r}
#| echo: false
candy_df <- tibble(
  y1 = c(7, 8, 5, 4, 6, 8, 5, 7, 4, 6),
  y0 = c(3, 6, 4, 3, 10, 9, 4, 8, 3, 0),
  tau = y1 - y0,
  x = c("chocolate", "chocolate", "chocolate","chocolate",
            "fruit","fruit",
            "chocolate",
            "fruit",
            "chocolate","chocolate"),
  d = if_else(x == "chocolate", 1, 0),
  y = if_else(d ==  1,y1,y0)
)

candy_tab <- candy_df
estimand_df <- tibble(
  y1_mn = mean(candy_df$y1, na.rm = T),
  y0_mn = mean(candy_df$y0,na.rm = T),
  tau_mn = mean(candy_df$tau,na.rm = T)
)

effects_df <- tibble(
  y1_mn = mean(candy_df$y1[candy_df$d == 1], na.rm = T),
  y0_mn = mean(candy_df$y0[candy_df$d == 0],na.rm = T),
  ate = y1_mn - y0_mn
)

candy_tab$y1 <- cell_spec(
  candy_df$y1, color = "chocolate"
)
candy_tab$y0 <- cell_spec(
  candy_df$y0, color = "purple"
)
candy_tab$x <- cell_spec(
  candy_df$x, color = ifelse(candy_df$x == "fruit", "purple", "chocolate"
)
)
candy_tab$d <- cell_spec(
  candy_df$d, color = ifelse(candy_df$d == 0, "purple", "chocolate"
)
)
candy_tab$y <- cell_spec(
  candy_df$y, color = ifelse(candy_df$d == 0, "purple", "chocolate"
)
)



```

##  {.smaller}

::: columns
::: {.column width="45%"}
#### Potential Outcomes:

```{r}
#| echo: false
#| results: asis

kable(candy_tab |> 
        select(y1, y0,tau)
      ,escape = FALSE,
      # format = "markdown",
      col.names = c(
        "$Y_i(1)$",
        "$Y_i(0)$",
        "$\\tau_i$"
      ))
```

```{r}
#| echo: false
#| results: asis

kable(estimand_df,escape = F,
      format = "markdown",
      col.names = c(
        "$E[Y_i(1)]$",
        "$E[Y_i(0)]$",
        "$E[\\tau_i]$"
      )
      )  
```
:::

::: {.column width="45%"}
-   If we could observe everyone's potential outcomes, we could calculate each individual causal effect (ICE)

-   On average, eating chocolate increases happiness by 1 point on our 0-10 scale (ATE = 1)

-   Suppose we conducted a study and let folks [select]{.blue} what they wanted to eat.
:::
:::

##  {.smaller}

::: columns
::: {.column width="45%"}
#### Potential Outcomes:

```{r}
#| echo: false

kable(candy_tab |> 
        select(y1, y0,tau)
      ,escape = F,
      format = "markdown",
      col.names = c(
        "$Y_i(1)$",
        "$Y_i(0)$",
        "$\\tau_i$"
      )) 
```

```{r}
#| echo: false

kable(estimand_df,escape = F,
      format = "markdown",
      col.names = c(
        "$E[Y_i(1)]$",
        "$E[Y_i(0)]$",
        "$ATE$"
      )
      ) 
```
:::

::: {.column width="45%"}
#### Observed Treatment:

```{r}
#| echo: false

kable(candy_tab |> 
        select(x, d,y)
      ,escape = F,
      format = "markdown",
      col.names = c(
        "$x_i$",
        "$d_i$",
        "$y_i$"
      )) 
```

```{r}
#| echo: false

kable(effects_df,escape = F,
      format = "markdown",
      digits = 2,
      col.names = c("$\\bar{y}_{d=1}$",
        "$\\bar{y}_{d=0}$",
        "$\\hat{ATE}$")
      ) 
```
:::
:::

##  {.smaller}

::: columns
::: {.column width="45%"}
#### Observed Treatment:

```{r}
#| echo: false

kable(candy_tab |> 
        select(x, d,y)
      ,escape = F,
      format = "markdown",
      col.names = c(
        "$x_i$",
        "$d_i$",
        "$y_i$"
      )) 
```

```{r}
#| echo: false

kable(effects_df,escape = F,
      format = "markdown",
      digits = 2,
      col.names = c("$\\bar{y}_{d=1}$",
        "$\\bar{y}_{d=0}$",
        "$\\hat{ATE}$")
      ) 
```
:::

::: {.column width="45%"}
#### Selection Bias

-   Our estimate of the ATE is [biased]{.blue} by the fact that folks who prefer fruit seem to be happier than folks who prefer chocolate in this example

-   In general, [selection bias]{.blue} occurs when folks who receive the treatment differ systematically from folks who don't

-   What if, instead of letting people pick and choose, we [randomly assigned]{.blue} half our respondents to receive [chocolate]{style="color: chocolate"} and half to receive [fruit]{style="color: purple"}?
:::
:::
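
One way to see where the bias comes from is to split the naive difference of means into two pieces: the average effect among those who chose chocolate, plus a baseline gap in fruit-happiness between the two groups. The sketch below re-enters the example's numbers. The `sel_` names and the two-part split are added here for illustration and are not part of the slides' own code.

```{r}
# Potential outcomes and stated preferences from the chocolate example
sel_y1 <- c(7, 8, 5, 4, 6, 8, 5, 7, 4, 6)
sel_y0 <- c(3, 6, 4, 3, 10, 9, 4, 8, 3, 0)
sel_x  <- c("chocolate", "chocolate", "chocolate", "chocolate",
            "fruit", "fruit", "chocolate", "fruit",
            "chocolate", "chocolate")
sel_d  <- as.numeric(sel_x == "chocolate")  # people pick what they prefer

# What we would estimate from the observed data alone
naive_diff <- mean(sel_y1[sel_d == 1]) - mean(sel_y0[sel_d == 0])

# Pieces we could only compute with full potential outcomes
effect_on_treated <- mean((sel_y1 - sel_y0)[sel_d == 1])
baseline_gap <- mean(sel_y0[sel_d == 1]) - mean(sel_y0[sel_d == 0])
true_ate <- mean(sel_y1 - sel_y0)

round(c(naive = naive_diff,
        effect_on_treated = effect_on_treated,
        baseline_gap = baseline_gap,
        true_ate = true_ate), 2)
```

The baseline gap is negative because the people who chose chocolate would have been less happy eating fruit than the fruit choosers are. That gap, not the effect of chocolate, drives the naive estimate below zero.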

##  {.smaller}

```{r}
#| echo: false

set.seed(12)
candy_df |> 
  mutate(
    d = randomizr::complete_ra(10),
    y = if_else(d ==  1,y1,y0)
  ) -> candy_df

candy_tab <- candy_df
estimand_df <- tibble(
  y1_mn = mean(candy_df$y1, na.rm = T),
  y0_mn = mean(candy_df$y0,na.rm = T),
  tau_mn = mean(candy_df$tau,na.rm = T)
)

effects_df <- tibble(
  y1_mn = mean(candy_df$y1[candy_df$d == 1], na.rm = T),
  y0_mn = mean(candy_df$y0[candy_df$d == 0],na.rm = T),
  ate = y1_mn - y0_mn
)

candy_tab$y1 <- cell_spec(
  candy_df$y1, color = "chocolate"
)
candy_tab$y0 <- cell_spec(
  candy_df$y0, color = "purple"
)
candy_tab$x <- cell_spec(
  candy_df$x, color = ifelse(candy_df$x == "fruit", "purple", "chocolate"
)
)
candy_tab$d <- cell_spec(
  candy_df$d, color = ifelse(candy_df$d == 0, "purple", "chocolate"
)
)
candy_tab$y <- cell_spec(
  candy_df$y, color = ifelse(candy_df$d == 0, "purple", "chocolate"
)
)

```

::: columns
::: {.column width="45%"}
#### Potential Outcomes:

```{r}
#| echo: false

kable(candy_tab |> 
        select(y1, y0,tau)
      ,escape = F,
      format = "markdown",
      col.names = c(
        "$Y_i(1)$",
        "$Y_i(0)$",
        "$\\tau_i$"
      )) 
```

```{r}
#| echo: false

kable(estimand_df,escape = F,
      format = "markdown",
      col.names = c(
        "$E[Y_i(1)]$",
        "$E[Y_i(0)]$",
        "$ATE$"
      )
      ) 
```
:::

::: {.column width="45%"}
#### Randomly Assigned Treatment:

```{r}
#| echo: false

kable(candy_tab |> 
        select(x, d,y)
      ,escape = F,
      format = "markdown",
      col.names = c(
        "$x_i$",
        "$d_i$",
        "$y_i$"
      )) 
```

```{r}
#| echo: false

kable(effects_df,escape = F,
      format = "markdown",
      digits = 2,
      col.names = c("$\\bar{y}_{d=1}$",
        "$\\bar{y}_{d=0}$",
        "$\\hat{ATE}$")
      )
```
:::
:::

##  {.smaller}

::: columns
::: {.column width="45%"}
#### Randomly Assigned Treatment:

```{r}
#| echo: false

kable(candy_tab |> 
        select(x, d,y)
      ,escape = F,
      format = "markdown",
      col.names = c(
        "$x_i$",
        "$d_i$",
        "$y_i$"
      ))
```

```{r}
#| echo: false

kable(effects_df,escape = F,
      format = "markdown",
      digits = 2,
      col.names = c("$\\bar{y}_{d=1}$",
        "$\\bar{y}_{d=0}$",
        "$\\hat{ATE}$")
      ) 
```
:::

::: {.column width="45%"}
#### Random Assignment

-   When treatment has been [randomly assigned]{.blue}, a difference in sample means provides an [unbiased]{.blue} estimate of the [ATE]{.blue}

-   The fact that our $\hat{ATE} = ATE$ in this example is pure coincidence.

-   If we randomly assigned treatment a different way, we'd get a different estimate.

-   In general, unbiased estimators will tend to be neither too high nor too low (i.e., $E[\hat{\theta} - \theta] = 0$)
:::
:::

## Estimating an Average Treatment Effect {.smaller}

If treatment has been randomly assigned, we can estimate the ATE by taking the difference of means between treatment and control:

$$
\begin{align*}
E \left[ \frac{\sum_1^m Y_i}{m}-\frac{\sum_{m+1}^N Y_i}{N-m}\right]&=\overbrace{E \left[ \frac{\sum_1^m Y_i}{m}\right]}^{\substack{\text{Average outcome}\\
\text{among treated}\\ \text{units}}}
-\overbrace{E \left[\frac{\sum_{m+1}^N Y_i}{N-m}\right]}^{\substack{\text{Average outcome}\\
\text{among control}\\ \text{units}}}\\
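&= E [Y_i(1)|D_i=1] -E[Y_i(0)|D_i=0]\\
&= E [Y_i(1)] -E[Y_i(0)] = ATE
\end{align*}
$$

The last step holds because random assignment makes $D_i$ independent of the potential outcomes. That is, the ATE is causally identified by the **difference of means** estimator in an experimental design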

##  {.smaller}

:::: columns
::: {.column width="30%"}
#### Random Assignment 1

```{r}
#| echo: false

kable(candy_tab |> 
        select(x, d,y)
      ,escape = F,
      format = "markdown",
      col.names = c(
        "$x_i$",
        "$d_i$",
        "$y_i$"
      )) 
```

```{r}
#| echo: false

kable(effects_df,escape = F,
      format = "markdown",
      digits = 2,
      col.names = c("$\\bar{y}_{d=1}$",
        "$\\bar{y}_{d=0}$",
        "$\\hat{ATE}$")
      ) 
```
:::

::: {.column width="30%"}
#### Random Assignment 2

```{r}
#| echo: false

set.seed(123)
candy_df |> 
  mutate(
    d = randomizr::complete_ra(10),
    y = if_else(d ==  1,y1,y0)
  ) -> candy_df

candy_tab <- candy_df
estimand_df <- tibble(
  y1_mn = mean(candy_df$y1, na.rm = T),
  y0_mn = mean(candy_df$y0,na.rm = T),
  tau_mn = mean(candy_df$tau,na.rm = T)
)

effects_df <- tibble(
  y1_mn = mean(candy_df$y1[candy_df$d == 1], na.rm = T),
  y0_mn = mean(candy_df$y0[candy_df$d == 0],na.rm = T),
  ate = y1_mn - y0_mn
)

candy_tab$y1 <- cell_spec(
  candy_df$y1, color = "chocolate"
)
candy_tab$y0 <- cell_spec(
  candy_df$y0, color = "purple"
)
candy_tab$x <- cell_spec(
  candy_df$x, color = ifelse(candy_df$x == "fruit", "purple", "chocolate"
)
)
candy_tab$d <- cell_spec(
  candy_df$d, color = ifelse(candy_df$d == 0, "purple", "chocolate"
)
)
candy_tab$y <- cell_spec(
  candy_df$y, color = ifelse(candy_df$d == 0, "purple", "chocolate"
)
)


kable(candy_tab |> 
        select(x, d,y)
      ,escape = F,
      format = "markdown",
      col.names = c(
        "$x_i$",
        "$d_i$",
        "$y_i$"
      )) 
```

```{r}
#| echo: false

kable(effects_df,escape = F,
      format = "markdown",
      digits = 2,
      col.names = c("$\\bar{y}_{d=1}$",
        "$\\bar{y}_{d=0}$",
        "$\\hat{ATE}$")
      ) 
```
:::

::: {.column width="30%"}
#### Random Assignment 3

```{r}
#| echo: false

set.seed(123456)
candy_df |> 
  mutate(
    d = randomizr::complete_ra(10),
    y = if_else(d ==  1,y1,y0)
  ) -> candy_df

candy_tab <- candy_df
estimand_df <- tibble(
  y1_mn = mean(candy_df$y1, na.rm = T),
  y0_mn = mean(candy_df$y0,na.rm = T),
  tau_mn = mean(candy_df$tau,na.rm = T)
)

effects_df <- tibble(
  y1_mn = mean(candy_df$y1[candy_df$d == 1], na.rm = T),
  y0_mn = mean(candy_df$y0[candy_df$d == 0],na.rm = T),
  ate = y1_mn - y0_mn
)

candy_tab$y1 <- cell_spec(
  candy_df$y1, color = "chocolate"
)
candy_tab$y0 <- cell_spec(
  candy_df$y0, color = "purple"
)
candy_tab$x <- cell_spec(
  candy_df$x, color = ifelse(candy_df$x == "fruit", "purple", "chocolate"
)
)
candy_tab$d <- cell_spec(
  candy_df$d, color = ifelse(candy_df$d == 0, "purple", "chocolate"
)
)
candy_tab$y <- cell_spec(
  candy_df$y, color = ifelse(candy_df$d == 0, "purple", "chocolate"
)
)

kable(candy_tab |> 
        select(x, d,y)
      ,escape = F,
      format = "markdown",
      col.names = c(
        "$x_i$",
        "$d_i$",
        "$y_i$"
      )) 
```

```{r}
#| echo: false

kable(effects_df,escape = F,
      format = "markdown",
      digits = 2,
      col.names = c("$\\bar{y}_{d=1}$",
        "$\\bar{y}_{d=0}$",
        "$\\hat{ATE}$")
      ) 
```
:::
::::

## Distribution of Sample ATEs

```{r}
#| echo: false

ate_fn <- function(df){
  df |> 
    mutate(
    d = randomizr::complete_ra(10),
    y = if_else(d ==  1,y1,y0)
  ) -> df
  
  ate <- mean(df$y[df$d == 1]) - mean(df$y[df$d == 0])
  return(ate)
}
set.seed(123)
plot_df <- tibble(
ate = replicate(5000,ate_fn(candy_df),simplify = "array")
)
plot_df |> 
  ggplot(aes(ate))+
  geom_histogram(bins=100)+
  theme_minimal()+
  geom_vline(aes(xintercept =1),
             col = "red") +
  labs(
    title = "Distribution of Difference of Means under Different Randomizations of Treatment",
    x = "Difference of Means",
    y = "Count (5000 Sims)"
  )+
  xlim(-3,4)


```
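
The simulation above approximates this distribution by drawing random assignments. With only 10 units we can also list every possible assignment and check unbiasedness exactly. This is a sketch that assumes complete random assignment treats exactly 5 of the 10 units; the `enum_` names are introduced here for illustration.

```{r}
# Potential outcomes from the chocolate example
enum_y1 <- c(7, 8, 5, 4, 6, 8, 5, 7, 4, 6)
enum_y0 <- c(3, 6, 4, 3, 10, 9, 4, 8, 3, 0)

# Every way to choose 5 treated units out of 10 (one column per assignment)
treated_sets <- combn(10, 5)

# Difference of means under each possible assignment
enum_est <- apply(treated_sets, 2, function(treated) {
  mean(enum_y1[treated]) - mean(enum_y0[-treated])
})

c(n_assignments = length(enum_est),
  mean_estimate = mean(enum_est),
  true_ate = mean(enum_y1 - enum_y0))
```

Any single assignment can land above or below the true ATE, but the estimates average out to it across all equally likely assignments. That is what unbiasedness means here.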


## Observational vs Experimental Designs {.smaller}

-   [Experimental designs]{.blue} are studies in which a causal variable of interest, the *treatment*, is [manipulated by the researcher]{.blue} to examine its causal effects on some *outcome* of interest

-   [Observational designs]{.blue} are studies in which a causal variable of interest is determined by someone/thing [other than the researcher]{.blue} (nature, governments, people, etc.)

## Two Kinds of Bias

:::{.nonincremental}
- **Confounder bias:** Failing to control for a common cause of `D` **and** `Y` (aka Omitted Variable Bias)

- **Collider bias:** Controlling for a common consequence of `D` **and** `Y`

:::

## {.smaller}
#### Confounding Bias: The Coffee Example

:::: panel-tabset

## Confounding Bias
:::{.nonincremental}
- Drinking coffee doesn't cause lung cancer, but we might find a correlation between them because they share a [common cause:]{.blue} smoking.

- Smoking is a [confounding]{.blue} variable that, if [omitted]{.blue}, will [bias our results]{.blue}, producing a [spurious]{.blue} relationship

- [Adjusting]{.blue} for [confounders]{.blue} removes this source of bias

:::{.callout-note}
When scholars include "control variables" in a regression, often they are trying to adjust for confounding variables that if omitted would bias their results
:::

:::

```{r}
#| label: confounding_day
#| echo: false
set.seed(123) # fix the seed so the simulated figures are reproducible
n <- 1000
coffee_df <- tibble(
  smoking = ifelse(rnorm(n)>.5,.75,0),
  Smoker = ifelse(smoking >0, "Smoker","Non-Smoker"),
  Coffee = smoking + rnorm(n),
  Cancer = 2*smoking+ rnorm(n),
)

coffee_df %>% 
ggplot(aes(Coffee,Cancer))+
  geom_point()+
  stat_smooth(method = "lm")+
  labs(title = "Positive relationship between\ncoffee and cancer")+
  theme_minimal()-> coffee_lm1_fig

coffee_df %>% 
ggplot(aes(Coffee,Cancer,col = Smoker))+
  geom_point()+
  stat_smooth(method = "lm") +
  labs(title = "No relationship between coffee\nand cancer adjusting for smoking")+
  theme_minimal()-> coffee_lm2_fig

coffee_dag1 <- dagify(
  y ~ x,
  labels = c(
    "y" = "Cancer",
    "x" = "Coffee"
  ),
  outcome = "y",
  exposure = "x"
)
coffee_dag1 %>% tidy_dagitty(layout = "linear") %>% 
  ggplot(aes(x,y, xend = xend, yend = yend))+
  geom_dag_point()+
  geom_dag_edges(edge_linetype = "dashed")+
  geom_dag_label(aes(label =label),nudge_y =.4) +
  ylim(1,-1)+
  theme_dag() +
  labs(title ="Spurious association between\ncoffee and cancer") -> coffee_dag1_fig



coffee_dag2 <- dagify(
  x ~z,
  y ~ z,
  labels = c(
    "y" = "Cancer",
    "x" = "Coffee",
    "z" = "Smoking"
     
  ),
  outcome = "y",
  exposure = "x",
  coords = list(
    x = c(x = -1, y = 1, z = 0),
    y = c(x = 0, y = 0, z = 1)

  )
) |> tidy_dagitty() |> 
  mutate(
    fill_col = ifelse(name == "z","grey","black")
  )
coffee_dag2 |> 
  ggplot(aes(x,y, xend = xend, yend = yend))+
  geom_dag_point(aes(color = fill_col))+
  geom_dag_edges()+
  geom_dag_label(aes(label =label),nudge_y =.2) +
  guides(fill="none",color = "none")+
  theme_dag()+
  labs(title ="Adjusting for smoking, no relationship\nbetween coffee and cancer")+
  scale_color_manual(values=c("black","grey"))-> coffee_dag2_fig

confounded_fig1 <- ggarrange(coffee_dag1_fig,coffee_lm1_fig)
confounded_fig2 <- ggarrange(coffee_dag2_fig, coffee_lm2_fig)

```

## Coffee and Cancer
```{r}
#| label: confounded_fig1
#| echo: false

confounded_fig1

```


## Adjusting for Smoking

```{r}
#| label: confounded_fig2
#| echo: false

confounded_fig2

```



::::
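
The same point can be made with regression coefficients rather than figures. The sketch below mirrors the structure of the simulation behind the plots (smoking drives both coffee and cancer, and coffee has no effect of its own), but it uses its own seed, a larger sample, and `conf_` variable names that are assumptions made here for illustration.

```{r}
set.seed(1140)
conf_n <- 10000  # larger than the figures' simulation, to reduce noise

# Smoking raises both coffee drinking and cancer risk; coffee has no effect
conf_smoking <- ifelse(rnorm(conf_n) > 0.5, 0.75, 0)
conf_coffee  <- conf_smoking + rnorm(conf_n)
conf_cancer  <- 2 * conf_smoking + rnorm(conf_n)

# Omitting the confounder vs. adjusting for it
b_naive    <- unname(coef(lm(conf_cancer ~ conf_coffee))["conf_coffee"])
b_adjusted <- unname(coef(lm(conf_cancer ~ conf_coffee + conf_smoking))["conf_coffee"])

round(c(naive = b_naive, adjusted = b_adjusted), 2)
```

The naive coefficient on coffee is positive even though coffee does nothing in this simulated world. Once smoking enters the model, the coefficient should shrink toward zero.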

## {.smaller} 
#### Collider Bias: The Dating Example

:::: panel-tabset

## Collider bias

:::{.nonincremental}



- Why are attractive people such jerks?


- Suppose [dating]{.blue} is a function of [looks]{.blue} and [personality]{.blue}

- Dating is a [common consequence]{.blue} of [looks]{.blue} and [personality]{.blue}

- Basing our claim off of who we date is an example of [selection bias]{.blue} created by [controlling for a collider]{.blue}

::::{.fragment}

:::{.callout-note}
If you see a regression model that controls for everything and the kitchen sink without theoretical justification, you might worry about the potential for collider bias
::: 

::::

:::

```{r}
#| label: collidercode
#| echo: false
dating_dag <- collider_triangle(
  x = "Looks",
  y = "Personality",
  m = "Dateability"
)
dating_dag %>% 
  tidy_dagitty() ->
  dating_dag
dating_dag %>% 
  mutate(colour = ifelse(name == "m", "Collider","Non-Collider"))->dating_dag
ggdag(dating_dag, 
      text = F,
      use_labels = "label")+
  theme_void() +labs(
    title = "Dating is a collider"
  ) -> collider_dag_fig1




ggdag_dseparated(dating_dag, 
                 text = F, 
                 controlling_for = "m",
                 use_labels = "label")+
  theme_void()+
  guides(color="none",shape="none")+
  labs(title="Selection bias creates\nspurious relationship") -> collider_dag_fig2

n <- 100
set.seed(123)
collider_df <- tibble(
  Looks = rnorm(n),
  Personality = rnorm(n)
) %>% 
  mutate(
    date = case_when(
      Looks > .5  ~ 1,
      Looks < .5 & Personality >.75 ~ 1,
      T ~ 0
    ),
    Swipe = ifelse(date == 1, "Right","Left")
  )
collider_df %>% 
  filter(date==1) %>% 
  ggplot(aes(Looks, Personality, col=Swipe))+
  geom_point(alpha=.5)+
  stat_smooth(
    method = "lm",
    col = "red"
  )+
  guides(color=guide_legend(title= "Swipe"))+
  theme_minimal()+
  labs(title = "It looks like you date jerks")+
  scale_color_manual(values = "red")-> collider_lm_fig1
 
collider_lm_fig1+
  geom_point(
    data = collider_df,
    alpha = .5
  )+
  stat_smooth(
    data = collider_df,
    col = "black",
    method = "lm"
  )+scale_color_manual(values = c("grey","red"))+
  labs(title = "No relationship between looks\nand personality overall")->collider_lm_fig2

collider_fig1 <- ggarrange(collider_dag_fig2,collider_lm_fig1)
collider_fig2 <- ggarrange(collider_dag_fig1, collider_lm_fig2)



```

## Selection bias 

```{r}
#| echo: false
collider_fig1
```


## No relationship in population

```{r}
#| echo: false
collider_fig2
```

::::
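
Here is a numeric version of the same story, as a sketch under assumptions: looks and personality are drawn independently, and the dating rule matches the one used for the figures (good looks, or a great personality). The seed, the larger sample, and the `coll_` names are choices made here for illustration.

```{r}
set.seed(1140)
coll_n <- 10000  # larger than the figures' n = 100, to reduce noise

coll_looks       <- rnorm(coll_n)
coll_personality <- rnorm(coll_n)   # independent of looks by construction

# Dating rule: good looks, or a great personality
coll_date <- coll_looks > 0.5 | coll_personality > 0.75

cor_everyone <- cor(coll_looks, coll_personality)
cor_dated    <- cor(coll_looks[coll_date], coll_personality[coll_date])

round(c(everyone = cor_everyone, dated_only = cor_dated), 2)
```

In the full sample the correlation should sit near zero. Among the people we date it turns clearly negative, purely because of how they were selected.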


## When to control for a variable:

![](https://book.declaredesign.org/figures/figure-16-3.svg)

[@Blair2023-yg] [(Chap. 6.2)](https://book.declaredesign.org/declaration-diagnosis-redesign/specifying-model.html#types-of-variables-in-models)


## Causal Inference {.smaller}

- Causal inference is about making credible counterfactual comparisons

- In an experiment, researchers create these comparisons through random assignment
  - Pro: Addresses concerns about [selection bias]{.blue}
  - Con: Do results generalize?

- In an observational study, researchers also attempt to make credible counterfactual comparisons through how they design their studies and analyze their data.
  - Pro: May generalize better/greater ecological validity
  - Cons: Greater potential for confounder and collider bias

- In general, characteristics of the design are more important than the specific variables in a given model for addressing bias



# {{< fa lightbulb >}}  Using polls to forecast elections {.inverse}

## Forecasting Elections

- Election forecasts reflect varying combinations of:

  - Expert Opinion
  - Fundamentals
  - Polling

- Forecasts differ in the extent to which they rely on these components and how they integrate them in their final predictions


## FiveThirtyEight's Approach to Forecasting 
#### Under Nate Silver...

![](images/forecast538.png)


## Forecasting Elections with Polls {.smaller}

- The preeminence of polling in modern forecasts reflects the success of Nate Silver and FiveThirtyEight in correctly predicting the 2008 (49/50 states correct) and 2012 (50/50) presidential elections
  
  - Any one poll is likely to deviate from the true outcome


  - Averaging over multiple polls $\to$ more accurate predictions than any one poll, provided...


  - the polls aren't **systematically** biased


- Concerns about the polls reflect the failure of such approaches to predict:
  
  - Trump's Victory in 2016
  
  - Strength of Trump's support in 2020 and 2024
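
The logic of poll averaging, and its limit, can be sketched with a small simulation. Everything here is hypothetical: the true vote share, the number and size of polls, the 3-point shared bias, and the `poll_` names are assumptions chosen only to illustrate the argument.

```{r}
set.seed(1140)
poll_truth  <- 0.52   # hypothetical true vote share
poll_n_resp <- 1000   # respondents per poll
poll_n      <- 20     # polls in the average
poll_sims   <- 2000   # simulated election seasons

# One season of polls; `bias` shifts every poll in the same direction
poll_season <- function(bias = 0) {
  rbinom(poll_n, size = poll_n_resp, prob = poll_truth + bias) / poll_n_resp
}

# Root mean squared error relative to the true vote share
poll_rmse <- function(est) sqrt(mean((est - poll_truth)^2))

polls_fair   <- replicate(poll_sims, poll_season())              # poll_n x poll_sims
polls_skewed <- replicate(poll_sims, poll_season(bias = -0.03))

rmse_single   <- poll_rmse(polls_fair[1, ])
rmse_avg      <- poll_rmse(colMeans(polls_fair))
rmse_avg_bias <- poll_rmse(colMeans(polls_skewed))

round(c(single_poll = rmse_single,
        poll_average = rmse_avg,
        average_with_bias = rmse_avg_bias), 3)
```

Averaging shrinks the random sampling error of any one poll. It cannot remove an error that every poll shares, so the biased average stays roughly as far off as the bias itself.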



# {{< fa lightbulb >}} Polling in Recent Elections {.inverse}



## Polling the 2016 Election:

::::{.columns}

:::{.column width="40%"}
- The polls missed bigly
  - National polls were reasonably accurate (Clinton wins Popular Vote)
  - State polls overstated Clinton's lead / understated Trump support
:::

:::{.column width="60%"}

![](images/nyt2016.png)

[New York Times](https://www.nytimes.com/interactive/2016/11/13/upshot/putting-the-polling-miss-of-2016-in-perspective.html)
:::
::::

## How did we get it so wrong in 2016?{.smaller}

![](images/forecast2016.png)

Some likely explanations

- Likely voter models overstated Clinton's support

- Large number of undecided voters broke decisively for Trump

- White voters without a college degree underrepresented in pre-election surveys

A full autopsy from [AAPOR](https://www.aapor.org/Education-Resources/Reports/An-Evaluation-of-2016-Election-Polls-in-the-U-S.aspx)
[Image](https://www.nytimes.com/interactive/2016/upshot/presidential-polls-forecast.html?_r=0#other-forecasts)

## Weighting for education

::::{.columns}
:::{.column width="45%"}
![](images/nyted1.png)

:::

:::{.column width="45%"}
![](images/nyted2.png)
:::
::::

[New York Times](https://www.nytimes.com/2017/05/31/upshot/a-2016-review-why-key-state-polls-were-wrong-about-trump.html)


## 2018: A brief reprieve?

- Polls did a better job

  - Most state polls weighted by education
  - Underestimated Democrats in House and Gubernatorial races
  - No partisan bias in Senate Races

- Forecasts correctly call:

  - Democratic House
  - Republican Senate



However...

##

![](images/vox.png)

[Vox](https://www.vox.com/2022/9/23/23353634/polls-bias-democrats-midterms)



## 2020: Historic Problems, Unclear Solutions {.smaller}

- Average polling errors for the national popular vote were 4.5 percentage points, the **highest in 40 years**

- Polls overstated Biden's support by 3.9 points in national polls (4.3 points in state polls)

- Polls overstated Democratic support in Senate and Gubernatorial races by about 6 points

- [Forecasts predicted](https://projects.fivethirtyeight.com/2020-election-forecast/senate/) Democrats would hold 

  - 48-55 seats in the Senate (actual: 50 seats)
  - 225-254 seats in the House (actual: 222 seats)


## 2020: What Went Wrong{.smaller}

Unlike 2016, there are no clear-cut explanations for what went wrong

::::{.columns}

:::{.column width="45%"}

**Not a cause:**

- Undecided voters
- Failing to weight for education
- Other demographic imbalances
- "Shy Trump Voters"
- Polling early vs election day voters
:::


:::{.column width="45%"}

**Potential Explanations**

- Covid-19
  - Democrats more likely to take polls
- Unit non-response
  - Between parties
  - Within parties
  - Across new and unaffiliated voters


:::
::::


[AAPOR Report](https://www.aapor.org/Education-Resources/Reports/2020-Pre-Election-Polling-An-Evaluation-of-the-202.aspx)


## How the polls did in 2022


::::{.columns}

:::{.column width="50%"}

- Overall, pretty good

- Average error close to 0

- Average absolute error ~ 4.5 percentage points

- Some polls tended to overstate Republican support (e.g. Trafalgar)

:::


:::{.column width="45%"}

![](images/02_2022_error.jpeg)

:::
::::
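
An average error near 0 and an average absolute error of several points are not in tension. A minimal sketch with made-up signed errors (the values and the name `err_pts` are hypothetical, not the 2022 data):

```{r}
# Hypothetical signed errors (poll margin minus actual margin, in points)
# Positive = overstated Republicans, negative = overstated Democrats
err_pts <- c(-6, 5, -4, 4, -3, 4)

c(mean_error = mean(err_pts),            # bias: misses in opposite directions cancel
  mean_abs_error = mean(abs(err_pts)))   # typical size of a miss, ignoring direction
```

The mean error speaks to partisan bias, while the mean absolute error speaks to how far off a typical poll was.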



## How the polls did in 2024{.smaller}


::::{.panel-tabset}

## Overview

- Good and bad news ([Silver Bulletin](https://www.natesilver.net/p/so-how-did-the-polls-do-in-2024-its))

- Good: 

  - Average Polling Error within historical norms

- Bad: 

  - Consistently underestimate Trump/Republican support in Presidential election years
  
  - Bad job of calling close races

## Average Error

![](images/02_average_ns.png)

## Persistent Bias

![](images/02_bias_ns.png)


## Calling Races

![](images/02_correct_ns.png)


::::

## What to expect for 2026 Midterms

::::{.columns}
:::{.column width="55%"}

-  Democrats hold a 3-5 point lead in generic ballots

- Polling traditionally better when Trump's not on the ballot

- Lots can change between now and November (Temporal Error...)

:::


:::{.column width="45%"}

![](images/02_nyt_2026.png)

[Source: NYT](https://www.nytimes.com/interactive/polls/congressional-vote-2026.html)

:::
::::


# {{< fa lightbulb >}}  Converse (1964) {.inverse}


## Goals

This week's readings are **HARD**

:::{.fragment}
Our goal is to answer the following:

:::

- What's the research question?
- What's the theoretical framework?
- What's the empirical design?
- What are the results?
- What are the conclusions?

## The Structure of Converse (1964){.smaller}

0. Introduction 
1. Some Clarification of Terms
2. Sources of Constraint on Idea Elements
3. Active Use of Ideological Dimensions of Judgement
4. Recognition of Ideological Dimensions of Judgement
5. Constraints among idea-elements
6. Social Groupings as central objects in belief systems
7. The Stability of belief elements over time
8. Issue Publics
9. Summary
10. Conclusion

## Introduction 

- Converse introduces the concept of [belief systems]{.blue} and tells us this article is about the contrast between the belief systems held by [political elites and the mass public]{.blue}

- He gestures towards a [hierarchy of belief strata]{.blue} and the importance of belief systems for democratic theory

- Kind of a slow start


## Some Clarification of Terms {.smaller}

Converse defines his core concepts



- Belief systems $\sim$ Ideology



- Idea elements $\sim$ Attitudes



- Constraint:
	- The interdependence of ideas in a belief system
	- A sense of what goes with what



- Centrality:
    - How resistant a belief is to change



- Range:
    - The diversity of topics a belief system covers
    

## Sources of Constraint{.smaller} 
#### (Theoretical Framework)

Converse lays out some plausible sources of ideological constraint:



- [Logical:]{.blue} More spending + Lower taxes -> Bigger deficits



- [Psychological:]{.blue} "the quasi-logic of cogent arguments"



- [Social:]{.blue} Social diffusion of information -> creates perceptions of what goes with what


:::{.fragment}
Converse also defines the well-informed person as someone who not only understands what goes with what but can also articulate why.
:::

## Consequences of declining information for belief systems{.smaller}

Converse argues that as we move from the well-informed to the uninformed, several things happen:

- Belief systems lose constraint
- Social groups replace ideological principles in centrality

:::{.fragment}
So how does he go about showing this?
:::

## Active Use of Ideological Dimensions of Judgement 

Converse considers people's open-ended responses to questions about what they like or dislike about the 1956 presidential candidates and the political parties. These responses are discussed in detail in Chapter 10 of *The American Voter*.

## Active Use of Ideological Dimensions of Judgement

![](images/03_converse/c1.png)


## The American Voter{.smaller}


Ideologues:

> *Well, the Democratic Party tends to favor socialized medicine  and I'm being influenced in that because I came from a doctor's family.* 

Group Benefits:

> *Well I just don't believe they're for the common people*

Nature of the times:

> *My husband's job is better. ... My husband is a furrier and when people get money they buy furs*

No Content:

> *I hate the darned backbiting*


## {.smaller}
#### Recognition of Ideological Dimensions of Judgement

::::{.panel-tabset}

## {{< fa lightbulb >}}
Next, Converse compares these 1956 levels of conceptualization with people's ability to attach the correct ideological labels to the political parties

- Overall, most respondents label Democrats as the liberal and Republicans as the conservative party (Table 2)

- But the depth of this understanding appears quite shallow (e.g. spend vs save) (Table 3)

- Recognition varies with education (Table 4)

- Those with greater levels of recognition are more active in politics (Table 5)

## Labels

![](images/03_converse/c2.png)


## Strata

![](images/03_converse/c3.png)


## Education

![](images/03_converse/c4.png)



## Participation

![](images/03_converse/c5.png)

::::


## Constraints among idea-elements

Next, Converse compares the degree of constraint (measured by correlations) among idea-elements in an elite sample (congressional candidates) and in the mass public

- People who took a liberal position on one issue did not necessarily take a liberal position on another

- The correlations between elites' issue attitudes were higher than those in the mass public


## Individuals lack a sense of what goes with what

![](images/03_converse/c6.png)


## What are these measures

Read the footnotes!

![](images/03_converse/q1.png)



## What are these measures

![](images/03_converse/q2.png)

## What's a tau-gamma coefficient?{.smaller}

:::{.nonincremental}
I believe Converse is using a measure of association for ordinal data that's built off the cross tabs of variables. The estimate of gamma, $G$, depends on two quantities:

- $N_s$, the number of pairs of cases ranked in the same order on both variables (number of concordant pairs),
- $N_d$, the number of pairs of cases ranked in reversed order on both variables (number of reversed pairs),

where "ties" (cases where either of the two variables in the pair are equal) are dropped.
Then

$$G=\frac{N_s-N_d}{N_s+N_d}$$

:::

:::{.callout-note}
As long as you have a basic sense of what **correlations** are trying to tell us you don't need to know the technical details of a specific estimator
:::
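
For the curious, here is a minimal sketch of the $G$ formula above in base R. The helper `gk_gamma()` and the two response vectors are hypothetical illustrations, not Converse's data or code, and the footnotes suggest his exact coefficient may differ in its details.

```{r}
# Gamma for two ordinal variables, computed by comparing every pair of cases
gk_gamma <- function(x, y) {
  # sign of the difference between every pair of cases on each variable
  dx <- sign(outer(x, x, "-"))
  dy <- sign(outer(y, y, "-"))
  agree <- dx * dy            # +1 concordant, -1 reversed, 0 tied on x or y
  n_s <- sum(agree > 0) / 2   # each pair appears twice in the matrix
  n_d <- sum(agree < 0) / 2
  (n_s - n_d) / (n_s + n_d)   # ties are dropped from the denominator
}

# Made-up 3-point responses on two issues (1 = liberal, 3 = conservative)
attitude_a <- c(1, 1, 2, 2, 3, 3)
attitude_b <- c(1, 2, 1, 3, 2, 3)
gk_gamma(attitude_a, attitude_b)
```

Values near 1 mean respondents who are more conservative on one issue tend to be more conservative on the other; values near 0 mean little constraint.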

## Elites show higher degrees of constraint

![](images/03_converse/c7.png)


## Social Groupings as central objects in belief systems {.smaller}

- Converse takes a brief detour to discuss the role of social groups in belief systems.
- Essentially he argues that: 
    - For all but the most informed, groups play a large role
      - What evidence does Converse provide for this claim?
    - The centrality of groups (race, religion) declines with levels of political information

## Social groupings as central objects in belief systems

![](images/03_converse/c8.png)

## {.smaller}
#### The stability of belief elements over time

::::{.columns}

:::{.column width="45%"}

Converse considers the stability of responses over time. Looking at data from 1958-1960, he finds variation in the strength of temporal correlations across surveys and attributes it to the centrality of groups and the party system for the mass public.

:::

:::{.column width="55%"}

![](images/03_converse/c9.png)

:::
::::



## The stability of belief elements over time{.smaller}

The final piece of Converse's argument concerns the stability of a single belief over time (1956, 1958, 1960):

>*The government should leave things like electrical power and housing for private businessmen to handle*

- A limiting case: an issue not in the public debate of this period

- People appear to answer the question at random

- Only a small proportion (~20%, fn. 39) held stable attitudes across all three periods


## Issue Publics, Summary, Conclusion

Converse wraps up his argument by

- Allowing for the possibility of small "issue publics" on more narrow issues

- Offering some comments on cross-national and historical comparisons

- Summarizing the "continental shelf" that exists between elites and masses.


## What do we think?

- What's the research question?
- What's the theoretical framework?
- What's the empirical design?
- What are the results?
- What are the conclusions?

## Summary of Converse (1964)

- Converse (1964) remains one of the most influential articles in American Political Behavior



- Framed decades of research on questions of ideology and citizen competence

  - Why?

- In the absence of coherent and stable worldviews, how does democracy function?

## Responses to Converse (1964)

![](images/03_converse/legacy.png)

## Responses to Converse (1964)

- **Measurement error (Today)**
- Revised definitions of citizen competence (Next Week)
- The Miracle of Aggregation
- Source Cues and Heuristics (Weeks 5 and 6)
- Revised models of Survey Response (Weeks 5 and 6)
- Revised models of what democracy requires (ongoing)


## The Miracle of Aggregation

![](https://pbs.twimg.com/media/D9DJpYdXsAAswmy.jpg){width=50%}

More: Stimson (2018), *Public Opinion in America*

## Responses to Converse (1964)

- **Measurement error (Today)**
- Revised definitions of citizen competence (Next Week)
- The Miracle of Aggregation
- Source Cues and Heuristics (Weeks 5 and 6)
- Revised models of Survey Response (Weeks 5 and 6)
- Revised models of what democracy requires (ongoing)





# {{< fa lightbulb >}}  Ansolabehere et al. (2008) {.inverse}

## Goals

- What's the research question?
- What's the theoretical framework?
- What's the empirical design?
- What are the results?
- What are the conclusions?

## What's the research question

- Are issue preferences as unstable and incoherent as Converse suggests, or can accounting for measurement error reveal a more ideologically consistent mass public?

## What's the theoretical framework

- Ansolabehere et al. pick up a critique made by Achen (1975) and others that the lack of constraint is primarily caused by measurement error



- They show that measurement error tends to decrease with the number of items one uses



- They offer a simple solution:  measure concepts with scales constructed from multiple items

## Measurement Error

Classic measurement error models assume what we observe is a measure of some unobserved (latent) truth, plus measurement error that has mean 0 and is uncorrelated with the latent truth, X.

$$\underbrace{W}_{\text{What we observe}} = \overbrace{X}^{\text{The latent truth}} +\underbrace{\epsilon}_{\text{Measurement Error}}$$


## Measurement Error

One can show that:

$$\begin{aligned}
Var(W) &= Var(X) + Var(\epsilon)\\
&= \sigma^2_X + \sigma^2_\epsilon
\end{aligned}$$

And the covariance between our observed and unobserved variables is:

$$Cov(X, W) = \sigma^2_X$$
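
Both results follow from the assumption that the error is uncorrelated with the latent truth, i.e. $Cov(X, \epsilon) = 0$. A sketch of the intermediate steps:

$$\begin{aligned}
Var(W) &= Var(X + \epsilon) = Var(X) + Var(\epsilon) + 2\,Cov(X, \epsilon) = \sigma^2_X + \sigma^2_\epsilon\\
Cov(X, W) &= Cov(X, X + \epsilon) = Var(X) + Cov(X, \epsilon) = \sigma^2_X
\end{aligned}$$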



## Correlations and Reliability

- With some assumptions and transformations we can show that the squared correlation describes the reliability of a measure

- Reliability is the proportion of the variance in the observed variable that comes from the latent variable of interest, and not from random error.

- This motivates Ansolabehere et al.'s approach


## Correlations and Reliability

$$\begin{aligned}
\rho^2 &= \left(\frac{Cov(X, W)}{SD(X)SD(W)} \right)^2\\
&=\left(\frac{\sigma_X^2}{\sqrt{\sigma_X^2}\sqrt{\sigma_X^2+\sigma_\epsilon^2}} \right)^2\\
&=\frac{\sigma_X^4}{\sigma_X^2(\sigma_X^2+\sigma_\epsilon^2)}\\
&=\frac{\sigma_X^2}{\sigma_X^2+\sigma_\epsilon^2}\\
&=\frac{\text{True Variance}}{\text{Total Variance}}
\end{aligned}$$
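
A quick simulation can check the algebra. This is a sketch under assumed values (latent variance 1, error variance 0.5, normal draws), none of which come from the paper. It also shows why over-time correlations are informative: if the latent attitude does not change, the correlation between two noisy measurements of it equals the reliability.

```{r}
set.seed(1140)
n <- 100000
sigma2_x <- 1     # assumed variance of the latent truth
sigma2_e <- 0.5   # assumed variance of the measurement error

x  <- rnorm(n, sd = sqrt(sigma2_x))       # latent truth, never observed
w1 <- x + rnorm(n, sd = sqrt(sigma2_e))   # what we observe in wave 1
w2 <- x + rnorm(n, sd = sqrt(sigma2_e))   # same truth, new error, in wave 2

reliability_theory <- sigma2_x / (sigma2_x + sigma2_e)
rho_squared <- cor(x, w1)^2   # squared correlation with the truth
test_retest <- cor(w1, w2)    # correlation between the two waves

round(c(theory = reliability_theory,
        rho_squared = rho_squared,
        test_retest = test_retest), 3)
```

All three numbers should land close to one another, which is the sense in which low over-time correlations could reflect noisy measures rather than unstable attitudes.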


## Measurement error reduces reliability

![](images/03_ansolabehere/a1.png)

## Multiple items reduce measurement error

![](images/03_ansolabehere/a2.png)
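
The logic can be sketched with a small simulation: average $k$ noisy items that share one latent attitude and see how the reliability of the average changes. The error SD of 2, the sample size, and the helper `scale_reliability()` are assumptions for illustration, not values from Ansolabehere et al.

```{r}
set.seed(1140)
n <- 20000
x <- rnorm(n)   # latent attitude with variance 1 (assumed)

# Reliability of the average of k items, each = truth + independent error
scale_reliability <- function(k, truth, sigma_e = 2) {
  n_obs  <- length(truth)
  errors <- matrix(rnorm(n_obs * k, sd = sigma_e), nrow = n_obs, ncol = k)
  items  <- truth + errors          # truth is recycled down each column
  cor(truth, rowMeans(items))^2     # squared correlation with the truth
}

k_items   <- c(1, 2, 5, 10, 20)
simulated <- sapply(k_items, scale_reliability, truth = x)
expected  <- 1 / (1 + 2^2 / k_items)  # sigma_X^2 / (sigma_X^2 + sigma_e^2 / k)

round(rbind(k = k_items, simulated = simulated, expected = expected), 2)
```

Averaging divides the error variance by $k$ while leaving the true variance untouched, so reliability climbs toward 1 as items are added.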

## With some caveats


::::{.columns}

:::{.column width="40%"}
- No autocorrelation
    - Error on one item doesn't predict error on another
- Additional items can't be too noisy

:::
:::{.column width="40%"}
![](images/03_ansolabehere/a3.png)
:::
::::
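
To see why the first caveat matters, here is a sketch that compares a 20-item scale whose errors are independent with one whose errors share a common component. The correlation of 0.5 between item errors, the error SD of 2, and the helper `reliability_with_rho()` are all assumed for illustration.

```{r}
set.seed(1140)
n <- 20000
k <- 20          # number of items in the scale
sigma_e <- 2     # assumed error SD for every item
x <- rnorm(n)    # latent attitude with variance 1 (assumed)

# rho = correlation between the errors of any two items
reliability_with_rho <- function(rho) {
  shared     <- rnorm(n)                    # error component common to all items
  unique_err <- matrix(rnorm(n * k), n, k)  # item-specific error
  errors <- sigma_e * (sqrt(rho) * shared + sqrt(1 - rho) * unique_err)
  cor(x, rowMeans(x + errors))^2
}

independent <- reliability_with_rho(0)
correlated  <- reliability_with_rho(0.5)
round(c(independent = independent, correlated = correlated), 2)
```

The shared part of the error does not average away, so adding items helps much less when errors are correlated.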

## What's the empirical design{.smaller}

- Panel data from the NES



- Principal component factor analysis to scale items together
    - Basically averaging (weighting by the variance each item contributes)



- Correlational analysis of the same items across time and of different items within surveys



- Simulations



- Sub-group analysis by political sophistication



- Regression analysis of issue voting

## Principal Components Analysis

- Find dimensions that explain the maximum variance with the minimum error

- A useful tool for data reduction

![](https://intoli.com/blog/pca-and-svd/img/basic-pca.png)

[(Source)](https://intoli.com/blog/pca-and-svd/)
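
As a rough sketch of what scaling items together looks like in practice, the chunk below runs `prcomp()` on six simulated survey items that share one latent attitude. The data, item names, and noise level are made up, and plain PCA is used here as a stand-in; the paper's exact factor-analytic procedure may differ.

```{r}
set.seed(1140)
n <- 5000
x <- rnorm(n)   # latent attitude (assumed)

# Six hypothetical items: the same latent attitude plus independent noise
items <- sapply(1:6, function(j) x + rnorm(n, sd = 1.5))
colnames(items) <- paste0("item", 1:6)

pca <- prcomp(items, center = TRUE, scale. = TRUE)
pc1 <- pca$x[, 1]   # each respondent's score on the first component

# Share of total variance captured by the first component
pc1_share <- pca$sdev[1]^2 / sum(pca$sdev^2)

# The sign of a component is arbitrary, so compare absolute correlations
cor_with_mean  <- abs(cor(pc1, rowMeans(items)))
cor_with_truth <- abs(cor(pc1, x))

round(c(pc1_share = pc1_share,
        cor_with_mean = cor_with_mean,
        cor_with_truth = cor_with_truth), 2)
```

When items are about equally noisy, the first component is nearly identical to a simple average of the items.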

## What are the results {.smaller}

- The over-time reliability of scales increases with the number of items used
    - Table 1, Figure 1  
- The correlations are higher between scales within surveys
    - Table 2, Figure 2
- Issue scales are more stable than policy predispositions
    - Table 3
- Little variation across political sophistication
    - Table 4
- Issue scales predict vote choice
    - Table 5


## The over-time reliability of scales increases with the number of items used

![](images/03_ansolabehere/a4.png)

## The over-time reliability of scales increases with the number of items used


![](images/03_ansolabehere/a5.png)

## The correlations are higher between scales within surveys

![](images/03_ansolabehere/a6.png)

## Issue scales are more stable than policy predispositions

![](images/03_ansolabehere/a7.png)

## Little variation across political sophistication

![](images/03_ansolabehere/a8.png)

- This is in contrast to what Converse's "Black and White" model would predict and consistent with general arguments about measurement error

- The entry point for Freeder et al.'s critique

## Issue scales predict vote choice

![](images/03_ansolabehere/a9.png)

## What are the conclusions

- Democracy is saved?



- It's the measures, not the public, that are the problem



- Use multiple measures and scale them together to study the concepts we're interested in



- The importance of political sophistication may be overstated

# Break

## Class Survey

 Please click [here](https://brown.co1.qualtrics.com/jfe/form/SV_3rbWjzTQ3SrgNHU) to take our periodic attendance survey 

# {{< fa lightbulb>}} Freeder et al. (2019) {.inverse}

## Goals

- What's the research question?
- What's the theoretical framework?
- What's the empirical design?
- What are the results?
- What are the conclusions?

## What's the research question

- Is the lack of ideological constraint really just a function of measurement error, or is it a product of citizens' ignorance of "what goes with what"?


## What's the theoretical framework

- Measurement error critiques of Converse like Ansolabehere et al. can't distinguish between error due to:
    - The vagaries of the question (classical measurement error)
    - The vagaries of the person (lack of knowledge)
    - The vagaries of survey response (more on this later)

:::{.fragment}

$$\overbrace{y}^{\text{Observed}} = \underbrace{\hat{y}}_{\text{True}} + 
\overbrace{u_i}^{\text{Error}} + 
\underbrace{v_i}_{\text{Survey Response}} + 
\overbrace{p_iw_i}^{\text{WGWW}}$$

:::

- Averaging reduces error from all of these sources

## What's the theoretical framework

- Knowledge of what goes with what (WGWW), measured by awareness of which party is more liberal or conservative, explains lack of constraint, even after accounting for measurement error

## What's the empirical design

- Panel data with multiple issue items, measures of general knowledge/sophistication, and specific measures of WGWW proxied by candidate and party placements



- Correlations and scale properties



- Sub-group analysis



- Regression analysis



- Simulations

## What are the results



- More items reduce measurement error



- Constraint doesn't vary with general knowledge, but does vary with WGWW



- True of scales and individual items



- WGWW predicts attitude stability



- But only for people who agree with their party's positions

- More items won't fix the problem

## More items reduce measurement error

![](images/03_freeder/f3.png)


## Constraint doesn't vary with general knowledge, but does vary with WGWW

![](images/03_freeder/f4.png)



## True of scales and individual items

![](images/03_freeder/f2.png)



## WGWW predicts attitude stability

![](images/03_freeder/f1.png)


## But only for people who agree with their party's positions

![](images/03_freeder/f5.png)


## More items won't fix the problem

![](images/03_freeder/f6.png)

## What are the conclusions

- Correcting for measurement error alone won't save democracy

- Multiple items are still useful
    - But what does the first principal component of a multi-item scale really mean?



- Where does knowledge of what goes with what come from?
    - Are parties the only source of constraint?


# Next week

## Overview {.smaller}

[Tuesday:]{.blue}

- Recap discussion of Ideology and Issues

- General overview of political knowledge

- Discussion of Jerit, Barabas and Bolsen (2006)

[Thursday:]{.blue}

- Alternative conceptions of political knowledge (Weaver, Prowse and Piston 2019)

- Misinformation (Jerit and Zhao 2020)

- Begin discussion of [A1](https://pols1140.paultesta.org/assignments/a1) 



## References