Christopher R. Dishop: Thomas Schelling's Model: Long Form

The Schelling model that I created in my last script used matrix operations. It required me to think in terms of patches housed in a matrix. Consider the following 3x3 grid.

     [,1] [,2] [,3]
[1,]    1    1    0
[2,]    0    2    2
[3,]    1    0    0

Each cell can be thought of as a patch. When a given patch is 0, it is unoccupied. When a given patch is 1 or 2, it is occupied by a hockey or soccer player, respectively. When I implement a Schelling model using a matrix, it puts me in a certain frame of mind. I have to consider patches as locations specified by rows and columns. At row 1 column 2, for example, sits patch XXX that is either occupied or unoccupied.

Another way to implement the Schelling model is to use long data. The same information conveyed in the 3x3 matrix is shown in long form below.

  xcoord ycoord type
1      1      1    1
2      2      1    0
3      3      1    1
4      1      2    1
5      2      2    2
6      3      2    0
7      1      3    0
8      2      3    2
9      3      3    0

Coordinates are now viewed as information that can be stored in respective columns. The type column represents whether the patch is unoccupied (0), houses a hockey player (1), or houses a soccer player (2). The goal of this post is to re-create the Schelling model using long data.

I’m going to present the code in two sections. The first demonstrates the behavior of 1 agent within a single time point. The second reveals the full model: it iterates over 50 time points using all agents.

Basic Idea

The model uses two data frames to store (the most important) information. One holds the coordinates of the living location of each agent. Susan, for example, lives at xcoord = 3 & ycoord = 10, whereas Johnny lives at xcoord = 2 & ycoord = 15. The other specifies the object located at each patch on the grid (0 = unoccupied, 1 = hockey player, 2 = soccer player), and this second data frame will be used for plotting.

To reiterate, one data frame stores agent coordinates:

library(kableExtra)

head(agent_df) %>% 
  kable() %>% 
  kable_styling()

xcoord	ycoord	type	agent
1	1	1	1
2	1	1	2
5	1	1	3
6	1	1	4
7	1	2	5
8	1	1	6

and the other, which will be used for plotting, stores patch information.

head(patch_df) %>% 
  kable() %>% 
  kable_styling()

xcoord	ycoord	type
1	1	1
2	1	1
3	1	0
4	1	0
5	1	1
6	1	1

The pseudocode for the model is as follows.

"
for each period
      
    for each agent i
                
        identify i's 8 surrounding neighbors
        count the number of similar and dissimilar neighbors
        -- e.g., if i is a hockey player and is surrounded by soccer players...
        -- then he has mostly dissimilar neighbors
        
        if agent i has more dissimilar neighbors than he desires, then label him as unhappy
        if agent i has more similar neighbors than he desires, then label him as happy
        
        repeat for all agents
        
    for each unhappy agent j
    
        randomly select a new patch to possibly move to
        if the patch is unoccupied, move there
        
        repeat for all unhappy agents
        
    
    plot the grid of patches for this period
    save the plot
    
end
"

Of course, the model is much more complex in syntax. But the basic idea is straightforward: people move if they have many dissimilar neighbors, and they stay if they have similar neighbors. Moreover, new patches, which are selected when agents want to move, are pulled randomly.

Let’s pretend we are at the first period and are beginning to iterate across agents. The code works as follows.

Starting with agent 1, identify her coordinates and type (type meaning hockey or soccer player).

    agent_coords <- agent_df %>% 
      filter(agent == 1) %>% 
      select(xcoord, ycoord)
    
    agent_type <- agent_df %>% 
      filter(agent == 1) %>% 
      select(type) %>% 
      pull()

    
glimpse(agent_coords)

Rows: 1
Columns: 2
$ xcoord <int> 1
$ ycoord <int> 1

glimpse(agent_type)

 num 1

Using agent i’s coordinates, identify her 8 surrounding neighbors.

neigh <- get_neigh(agent_coords)

The function get_neigh is prespecified (I will show you the syntax below). It’s too complicated to pick apart now. Just know that it returns the coordinates of her 8 surrounding neighbors.

neigh %>% kable() %>% kable_styling

xcoord	ycoord
2	1
2	51
1	51
51	51
51	1
51	2
1	2
2	2

I’m about to start counting similar neighbors, so I need to initialize a few counters.

total_neighs <- 0
similar_neighs <- 0

Count similar neighbors. Go through each row in the neigh matrix to find a given neighbor’s coordinates. Use those coordinates to pull all information about neighbor n from the agent data frame. Then, increment similar_neighs by 1 if neighbor n is the same type as agent i. Increment total_neighs if the patch isn’t empty.

Stated simply, if agent i is a hockey player then count the number of other hockey players. Also count the number of non-empty patches.

    for(n in 1:nrow(neigh)){
    
      # save neighbor 
      neigh_agent <- patch_df %>% 
        filter(xcoord == neigh$xcoord[n],
               ycoord == neigh$ycoord[n])
        
      # increment similar neighbors by 1 if agent is same as neighbor
      # increment total neighbors by 1 if patch isn't empty
      if(agent_type == neigh_agent$type){similar_neighs <- similar_neighs + 1}
      if(neigh_agent$type != 0){total_neighs <- total_neighs + 1}
      
    }

Now my counters have values.

cat(
  paste("Total Neighbors =", total_neighs,
        "\nSimilar Neighbors =", similar_neighs)
)

Total Neighbors = 4 
Similar Neighbors = 3

Calculate a similarity ratio. Take the number of similar neighbors and divide it by the total number of neighbors (i.e., non-empty patches).

sim_ratio <- similar_neighs / total_neighs

I won’t show the code here, but all of the information calculated so far then gets stored in a master “results” data frame.

So far, we have identified agent i’s neighbors and calculated her similarity ratio. We now need to determine whether she wants to move. If she does, we then need to find a new place for her to move to.

Schelling’s original model used inequalities to generate agent happiness (or satisfaction). When one’s similarity ratio is greater than some innate preference for similar others (say, 0.6), then the agent is happy and stays put. When one’s similarity ratio is lower than 0.6, the agent is unhappy and moves. Here is that idea embodied in code.

empty <- is.nan(sim_ratio)
happy <- NULL
    
if(empty == TRUE){happy <- TRUE} # if the agent has no neighbors, he is happy
if(empty == FALSE && sim_ratio > 0.6){happy <- TRUE}
if(empty == FALSE && sim_ratio < 0.6){happy <- FALSE}
if(empty == FALSE && sim_ratio == 0.6){happy <- FALSE}

The inequalities are located in the if statements. What makes the syntax a bit tricky is that I also included an empty object. I did that because not all agents have neighbors. It is possible for an agent to be surrounded by all empty patches. When this (unlikely) case happens, then total_neighs is equal to 0, and we all know that dividing by 0 doesn’t work. So, the code above asks whether sim_ratio has an actual value, and it only moves forward if so. Said differently, empty would equal TRUE when agent i has no neighbors. If the agent is happy, the code moves forward. If the agent is unhappy, she gets stored (not shown).

The steps above then repeat for every agent. Once it iterates over all agents, storing the unhappy agents when they arise, it finds new patches for the unhappy agents. First, randomly select new coordinates.

  new_x <- sample(51, 1)
  new_y <- sample(51, 1)

Is the patch located at those coordinates occupied?

  agent_type_at_new <- patch_df %>% 
    filter(xcoord == new_x,
           ycoord == new_y) %>% 
    select(type) %>% 
    pull()


  # 0 = unoccupied
  # 1 = hockey player
  # 2 = soccer player

  occupied <- FALSE
  if(agent_type_at_new != 0){occupied <- TRUE}

If the patch is unoccupied, then we can work with our unhappy agent. If the patch is occupied, we need to continue to sample patches until we find one that is unoccupied.

  while(occupied == TRUE){
    
    new_x <- sample(51, 1)
    new_y <- sample(51, 1)
    agent_type_at_new <- patch_df %>% 
      filter(xcoord == new_x,
             ycoord == new_y) %>% 
      select(type) %>% 
      pull()
    
    if(agent_type_at_new == 0){occupied <- FALSE}
  }

Once selected, we change the new patch to occupied within the patch_df, change the old patch to unoccupied within the patch_df, and update the agent data frame with unhappy agent i’s new coordinates. The code to do so is something like the following.

  # change new patch to unhappy agent i's type
  patch_df[patch_df$xcoord == new_x & patch_df$ycoord == new_y, "type"] <- current_unhappy$type[1]
  # go to the old patch where unhappy agent i used to live and change it to 0
  patch_df[patch_df$xcoord == current_unhappy$xcoord[1] & patch_df$ycoord == current_unhappy$ycoord[1], "type"] <- 0
  
  # update the agent_df to reflect unhappy agent i's new coordinates
  agent_df[agent_df$agent == current_unhappy$agent[1], "xcoord"] <- new_x
  agent_df[agent_df$agent == current_unhappy$agent[1], "ycoord"] <- new_y

Full Model

Here is the full model.

library(tidyverse)
library(reshape2)
library(ggplot2)

# initial grid
#
#
#
#
#
#


dims <- 51*51
empty_patches <- 781
peeps <- c(rep(1, (dims-empty_patches) / 2),
           rep(2, (dims-empty_patches) / 2),
           rep(0, empty_patches))

num_agents <- dims - empty_patches

mat <- matrix(sample(peeps, dims, replace = F), 51, 51, byrow = T)

patch_df <- melt(mat) %>% 
  mutate(xcoord = Var1,
         ycoord = Var2,
         type = value) %>% 
  select(xcoord, 
         ycoord,
         type)

agent_df <- patch_df %>% 
  filter(type %in% c(1,2)) %>% 
  mutate(agent = 1:num_agents)

plotfirst <- patch_df

alike_preference <- 0.6



# get neighbors function
#
#
#
#
#
#
#
#

get_neigh <- function(xy){
  
  # starting from right and going clockwise, I want neighbors a,b,c,d,e,f,g,h
  
  ax <- xy[1 , "xcoord"] + 1
  ay <- xy[1, "ycoord"]
  # xcoord, ycoord
  a <- c(ax, ay)
  
  bx <- xy[1 , "xcoord"] + 1
  by <- xy[1, "ycoord"] - 1
  # xcoord, ycoord
  b <- c(bx, by)
  
  cx <- xy[1 , "xcoord"]
  cy <- xy[1, "ycoord"] - 1
  # xcoord, ycoord
  c <- c(cx, cy)
  
  dx <- xy[1 , "xcoord"] - 1
  dy <- xy[1, "ycoord"] - 1
  # xcoord, ycoord
  d <- c(dx, dy)
  
  ex <- xy[1 , "xcoord"] - 1
  ey <- xy[1, "ycoord"]
  # xcoord, ycoord
  e <- c(ex, ey)
  
  fx <- xy[1 , "xcoord"] - 1
  fy <- xy[1, "ycoord"] + 1
  # xcoord, ycoord
  f <- c(fx, fy)
  
  gx <- xy[1 , "xcoord"]
  gy <- xy[1, "ycoord"] + 1
  # xcoord, ycoord
  g <- c(gx, gy)
  
  hx <- xy[1 , "xcoord"] + 1
  hy <- xy[1, "ycoord"] + 1
  # xcoord, ycoord
  h <- c(hx, hy)
  
  
  dff <- data.frame(
    'xcoord' = c(a[1], b[1], c[1], d[1], e[1], f[1], g[1], h[1]),
    'ycoord' = c(a[2], b[2], c[2], d[2], e[2], f[2], g[2], h[2])
  )
  
  dff <- dff %>% 
    mutate(xcoord = ifelse(xcoord == 0, 51, xcoord),
           xcoord = ifelse(xcoord == 52, 1, xcoord),
           ycoord = ifelse(ycoord == 0, 51, ycoord),
           ycoord = ifelse(ycoord == 52, 1, ycoord))
  
  return(dff)
  
}












# initialize stores
#
#
#
#
#
#
#
#
#

time <- 40
result_df <- data.frame(
  "time" = numeric(time*num_agents),
  "agent" = numeric(time*num_agents),
  "simratio" = numeric(time*num_agents)
)
count <- 0
save_plots <- list()








# begin iterations over periods
#
#
#
#
#
#
#
for(i in 1:time){


unhappy_store <- list()
unhappy_counter <- 0




# for each agent
for(ag in 1:num_agents){

    
  count <- count + 1
  
  # save agent's coords
  # save agent's type
    agent_coords <- agent_df %>% 
      filter(agent == ag) %>% 
      select(xcoord, ycoord)
    
    agent_type <- agent_df %>% 
      filter(agent == ag) %>% 
      select(type) %>% 
      pull()
  
  
    
    
    # identify neighbors - save their coordinates
    neigh <- get_neigh(agent_coords)
    total_neighs <- 0
    similar_neighs <- 0

    # for each neighbor
    for(n in 1:nrow(neigh)){
    
      # save neighbor 
      neigh_agent <- patch_df %>% 
        filter(xcoord == neigh$xcoord[n],
               ycoord == neigh$ycoord[n])
        
      # increment similar neighbors by 1 if agent is same as neighbor
      # increment total neighbors by 1 if patch isn't empty
      if(agent_type == neigh_agent$type){similar_neighs <- similar_neighs + 1}
      if(neigh_agent$type != 0){total_neighs <- total_neighs + 1}
      
    }
    
    
    # save his sim/total (time, agent, simratio)
    sim_ratio <- similar_neighs / total_neighs
    
    result_df[count, "time"] <- i
    result_df[count, "agent"] <- ag
    result_df[count, "simratio"] <- sim_ratio
    
    # if the agent has empty patches around him (is.nan(sim_ratio) == T)
    # or
    # if sim/total > then alike_preferences, 
    # then the agent is happy
    # otherwise, he is unhappy
    empty <- is.nan(sim_ratio)
    happy <- NULL
    
    if(empty == TRUE){happy <- TRUE}
    if(empty == FALSE && sim_ratio > alike_preference){happy <- TRUE}
    if(empty == FALSE && sim_ratio < alike_preference){happy <- FALSE}
    if(empty == FALSE && sim_ratio == alike_preference){happy <- FALSE}
    
    # if the agent is unhappy, store him
    if(happy == FALSE){
      unhappy_counter <- unhappy_counter + 1
      unhappy_store[[unhappy_counter]] <- ag
    }
    
}

# after going through all agents, have the unhappy agents move

unhappy_agents <- unlist(unhappy_store)

for(q in 1:length(unhappy_agents)){
  if(is.null(unhappy_agents) == TRUE){break}
  
  # randomly select a new patch
  new_x <- sample(51, 1)
  new_y <- sample(51, 1)
  
  # is the new patch occupied?
  agent_type_at_new <- patch_df %>% 
    filter(xcoord == new_x,
           ycoord == new_y) %>% 
    select(type) %>% 
    pull()
  
  occupied <- FALSE
  if(agent_type_at_new != 0){occupied <- TRUE}
  
  while(occupied == TRUE){
    
    new_x <- sample(51, 1)
    new_y <- sample(51, 1)
    agent_type_at_new <- patch_df %>% 
      filter(xcoord == new_x,
             ycoord == new_y) %>% 
      select(type) %>% 
      pull()
    
    if(agent_type_at_new == 0){occupied <- FALSE}
    
  }
  
  # unhappy agent
  current_unhappy <- agent_df %>% 
    filter(agent == unhappy_agents[q])
  
  # go to the new x and y position in the patch and place the agent type there
  patch_df[patch_df$xcoord == new_x & patch_df$ycoord == new_y, "type"] <- current_unhappy$type[1]
  # go to the old x and y position in the patch and change it to 0
  patch_df[patch_df$xcoord == current_unhappy$xcoord[1] & patch_df$ycoord == current_unhappy$ycoord[1], "type"] <- 0
  
  # change the agent_df to reflect the agent's new position
  agent_df[agent_df$agent == current_unhappy$agent[1], "xcoord"] <- new_x
  agent_df[agent_df$agent == current_unhappy$agent[1], "ycoord"] <- new_y
  

  
}

# create plot
# save and store plot

gp <- ggplot(patch_df, aes(x = xcoord, y = ycoord, fill = factor(type))) + 
  geom_tile() + 
  ggtitle(paste("Period =", i)) +
  scale_fill_brewer(palette = "Greens",
                    name = "Type of Patch")

save_plots[[i]] <- gp


}

ggplot(plotfirst, aes(x = xcoord, y = ycoord, fill = factor(type))) + 
  geom_tile() +
  ggtitle("Period = 0") +
  scale_fill_brewer(palette = "Greens",
                    name = "Type of Patch")

for(l in 1:time){
  
  print(save_plots[[l]])
  
}

ps, here is a ggplot tool for creating waffle plots.

Bo\(^2\)m =)

Thomas Schelling’s Model: Long Form

Basic Idea

Full Model