Unequal Time Stamps

Quick note on adding a “time” column when participants differ in the number of responses they offer. Let’s say my data are as follows:

library(tidyverse)
library(kableExtra)

df <- data.frame(
  'id' = c(1, 1, 2, 2, 2, 3, 4, 4),
  'score' = c(6, 5, 3, 4, 2, 8, 7, 7)
)

head(df, 8) %>% 
  kable() %>% 
  kable_styling()
id score
1 6
1 5
2 3
2 4
2 2
3 8
4 7
4 7

where person 1 responded twice, person 2 three times, person 3 once, and person 4 twice. I want to add a column that numbers each person's responses in order, restarting at 1 for every id.

Identify the number of times each id appears in the dataframe.

table(df$id)

1 2 3 4 
2 3 1 2 

Save the values.

id_appear_times <- unname(table(df$id))

Create a sequence from 1 to i for each i in the vector.

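# Assumes the rows of df are sorted by id, because table() orders its counts by id.
timer <- c()
for (i in id_appear_times) {
  # Within-person counter: 1, 2, ..., i
  new_time <- seq_len(i)
  timer <- c(timer, new_time)
}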
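The loop can also be written as a single vectorized call. A minimal sketch, assuming the rows are already sorted by id as in the example: base R's sequence() builds 1:n for each count and concatenates the results. The names id_counts and timer_vectorized are only illustrative.

```r
# Same example data as above.
df <- data.frame(
  id = c(1, 1, 2, 2, 2, 3, 4, 4),
  score = c(6, 5, 3, 4, 2, 8, 7, 7)
)

# Number of rows per id, as a plain integer vector.
id_counts <- as.integer(table(df$id))

# sequence() returns 1:n for each n in id_counts, concatenated.
timer_vectorized <- sequence(id_counts)
timer_vectorized
# [1] 1 2 1 2 3 1 1 2
```

This gives the same values as the loop and avoids growing a vector one piece at a time.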

Add it to my data.

head(df, 8) %>% 
  mutate(time = timer) %>% 
  select(time, id, everything()) %>% 
  kable() %>% 
  kable_styling()
time id score
1 1 6
2 1 5
1 2 3
2 2 4
3 2 2
1 3 8
1 4 7
2 4 7
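The counting approach above relies on the rows being sorted by id. If they might not be, one alternative is to count within groups. The sketch below uses dplyr's group_by() and row_number(). The data frame df_unsorted is a hypothetical reshuffling of the example data, made up only to show the behavior.

```r
library(dplyr)

# Hypothetical variant of the example data with rows out of id order.
df_unsorted <- data.frame(
  id = c(2, 1, 2, 4, 1, 3, 2, 4),
  score = c(3, 6, 4, 7, 5, 8, 2, 7)
)

df_timed <- df_unsorted %>%
  group_by(id) %>%               # count separately within each id
  mutate(time = row_number()) %>% # 1, 2, ... in order of appearance
  ungroup() %>%
  select(time, id, everything())

df_timed$time
# [1] 1 1 2 1 2 1 3 2
```

Here time follows the order in which each id's rows appear, so it only means "first response, second response, ..." if the rows within each id are in chronological order.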

Miscellaneous Afterthought

While playing with the code above, I considered how to generate the id column itself from the counts with rep or seq. Here’s one way, using rep:

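# Repeat each element of x the matching number of times.
rep_each <- function(x, times) {
  # Recycle times so it has one count per element of x
  times <- rep(times, length.out = length(x))
  rep(x, times = times)
}

# The result holds ids, not times, so name it accordingly.
id_vec <- rep_each(c(1, 2, 3, 4), times = id_appear_times)
id_vec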
[1] 1 1 2 2 2 3 4 4
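When there is already one count per id, base rep() handles this directly through its times argument, and seq_along() can supply the ids when they are simply 1 to k. A small sketch, with the counts typed in by hand to match the table() output above. The names ids, counts, id_direct, and id_from_seq are only illustrative.

```r
ids <- c(1, 2, 3, 4)
counts <- c(2, 3, 1, 2)  # responses per id, as reported by table(df$id)

# rep() with a vector for times repeats each element by its own count.
id_direct <- rep(ids, times = counts)
id_direct
# [1] 1 1 2 2 2 3 4 4

# If the ids are just 1..k, seq_along() generates them from the counts.
id_from_seq <- rep(seq_along(counts), times = counts)
all(id_direct == id_from_seq)
# [1] TRUE
```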

Bo²m =)