Quick note on adding a “time” column when participants differ in the number of responses they offer. Let’s say my data are as follows:
library(tidyverse)
library(kableExtra)
df <- data.frame(
'id' = c(1, 1, 2, 2, 2, 3, 4, 4),
'score' = c(6, 5, 3, 4, 2, 8, 7, 7)
)
head(df, 8) %>%
kable() %>%
kable_styling()
id | score |
---|---|
1 | 6 |
1 | 5 |
2 | 3 |
2 | 4 |
2 | 2 |
3 | 8 |
4 | 7 |
4 | 7 |
where person 1 responded twice, person 2 three times, person 3 once, and person 4 twice. I want to add a time column that numbers each person's responses in order: 1, 2, ... within each id.
Identify the number of times each id appears in the dataframe.
table(df$id)
1 2 3 4
2 3 1 2
Save the values.
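A minimal sketch of this step. The name id_appear_times matches the one used in the rep example further down; wrapping the table in as.vector() is an assumption on my part, used to drop the table names and keep a plain vector of counts.

```r
# Same toy data as above
df <- data.frame(
  id = c(1, 1, 2, 2, 2, 3, 4, 4),
  score = c(6, 5, 3, 4, 2, 8, 7, 7)
)

# Counts per id, in sorted id order; as.vector() drops the table names
id_appear_times <- as.vector(table(df$id))
id_appear_times
# [1] 2 3 1 2
```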
Create a sequence from 1 to i for each i in the vector.
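One way to sketch this step is with base R's sequence(), which concatenates seq_len(i) for each i in its argument. The name timer is the one the mutate() call below expects; the counts are typed in here (taken from the table() output above) so the snippet runs on its own.

```r
# Counts per id, as reported by table(df$id)
id_appear_times <- c(2L, 3L, 1L, 2L)

# sequence() builds 1..i for each count and concatenates the pieces
timer <- sequence(id_appear_times)
timer
# [1] 1 2 1 2 3 1 1 2

# Equivalent, more explicit version
timer_alt <- unlist(lapply(id_appear_times, seq_len))
```

Note that this only lines up with the data when the rows are already sorted by id, because table() reports its counts in sorted id order.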
Add it to my data.
head(df, 8) %>%
mutate(time = timer) %>%
select(time, id, everything()) %>%
kable() %>%
kable_styling()
time | id | score |
---|---|---|
1 | 1 | 6 |
2 | 1 | 5 |
1 | 2 | 3 |
2 | 2 | 4 |
3 | 2 | 2 |
1 | 3 | 8 |
1 | 4 | 7 |
2 | 4 | 7 |
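As an aside, the same column can probably be built without the intermediate vector by numbering rows within each id using dplyr. This is an alternative sketch, not the approach used above; df_timed is a hypothetical name, and the code assumes rows already appear in response order within each id.

```r
library(dplyr)

df <- data.frame(
  id = c(1, 1, 2, 2, 2, 3, 4, 4),
  score = c(6, 5, 3, 4, 2, 8, 7, 7)
)

# row_number() restarts at 1 inside each id group
df_timed <- df %>%
  group_by(id) %>%
  mutate(time = row_number()) %>%
  ungroup() %>%
  select(time, id, everything())

df_timed
```

Unlike the table() route, this does not need the data to be sorted by id, since the numbering happens inside each group.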
While playing with the code above, I considered how to generate the id column with rep or seq. Here’s how:
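# Repeat each element of x according to the matching entry in times
rep_each <- function(x, times) {
  # Recycle times so it has one entry per element of x
  times <- rep(times, length.out = length(x))
  rep(x, times = times)
}
# id_appear_times holds the counts saved from table(df$id): 2 3 1 2
id_vec <- rep_each(c(1, 2, 3, 4), times = id_appear_times)
id_vec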
[1] 1 1 2 2 2 3 4 4
Bo\(^2\)m =)