First Differencing By Group

A bit of practice taking first differences when the data are grouped (by participant or by group) rather than laid out as a single time series.

The first set of data.

library(tidyverse)
library(kableExtra)
dff <- tibble(
  'id' = c('a', 'a', 'b', 'b', 'c', 'c'),
  'survey' = c(1, 2, 1, 2, 1, 2),
  'score' = c(4, 4, 2, 4, 5, 2),
  'team' = c('a', 'a', 'a', 'a', 'a', 'a')
)
dff %>% kable() %>% kable_styling()
id  survey  score  team
a   1       4      a
a   2       4      a
b   1       2      a
b   2       4      a
c   1       5      a
c   2       2      a

The goal is to subtract each participant's score on the first survey from their score on the second survey. In other words, what is the change score across surveys for each participant?

# Within each id, subtract the previous row's score (rows are assumed sorted by survey)
dff %>% 
  group_by(id) %>% 
  mutate(diffscore = score - lag(score))
# A tibble: 6 × 5
# Groups:   id [3]
  id    survey score team  diffscore
  <chr>  <dbl> <dbl> <chr>     <dbl>
1 a          1     4 a            NA
2 a          2     4 a             0
3 b          1     2 a            NA
4 b          2     4 a             2
5 c          1     5 a            NA
6 c          2     2 a            -3
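
The lag() call above takes the previous row within each id, so it only gives the right change score when the rows are already sorted by survey. A minimal sketch of one way to guard against that, assuming the rows could arrive out of order: dff_shuffled is a hypothetical reordering of the same data, and order_by is an argument of dplyr's lag() that looks up the previous value by survey rather than by row position.

```r
library(dplyr)

# Hypothetical reordering of the first data set: survey 2 is listed before survey 1
dff_shuffled <- tibble(
  id = c('a', 'a', 'b', 'b', 'c', 'c'),
  survey = c(2, 1, 2, 1, 2, 1),
  score = c(4, 4, 4, 2, 2, 5)
)

# order_by = survey makes lag() return the score from the previous survey,
# not the score from the previous row
diff_ordered <- dff_shuffled %>%
  group_by(id) %>%
  mutate(diffscore = score - lag(score, order_by = survey)) %>%
  ungroup()
```

The change scores land on the survey 2 rows (0, 2, and -3) and the survey 1 rows get NA, matching the result from the sorted data. Sorting first with arrange(id, survey) would be another option.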

The second set of data.

score <- c(10, 30, 14, 20, 6)
group <- c(rep(1001, 2), rep(1005, 3))
df <- data.frame(score, group)

df %>% kable() %>% kable_styling()
score  group
10     1001
30     1001
14     1005
20     1005
6      1005

Group 1001 has two scores whereas group 1005 has three. I want the change from one score to the next within each group.

# group_by() makes lag() restart within each group, so the first row of every group is NA
df %>%
  group_by(group) %>%
  mutate(first_diff = score - lag(score))
# A tibble: 5 × 3
# Groups:   group [2]
  score group first_diff
  <dbl> <dbl>      <dbl>
1    10  1001         NA
2    30  1001         20
3    14  1005         NA
4    20  1005          6
5     6  1005        -14
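
For comparison, the same grouped first difference can be sketched in base R without dplyr. This is only an alternative under the same assumption that rows are already in order within each group. It uses ave() to apply a function within groups and diff() to take consecutive differences.

```r
# Same data as above, rebuilt so the block runs on its own
score <- c(10, 30, 14, 20, 6)
group <- c(rep(1001, 2), rep(1005, 3))
df <- data.frame(score, group)

# diff() returns one fewer value than it is given, so NA is prepended
# to keep one value per row; ave() applies this separately within each group
df$first_diff <- ave(df$score, df$group, FUN = function(x) c(NA, diff(x)))
```

Because NA is prepended inside each group, groups of different sizes (two rows for 1001, three for 1005) are handled without any extra bookkeeping.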

Bo\(^2\)m =)