Take-home Exercise 4

Visualising and Analysing Daily Routines

Author: Ding Yanmu

Published: May 21, 2022

1 Introduction

The task of Take-home Exercise 4 is to reveal the daily routines of two selected participants from the city of Engagement, Ohio, USA, using appropriate static and interactive statistical graphics methods.

In this exercise, I use the ViSiElse package to display the daily routines of the two selected participants.

2 Data Description

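The data file used for this exercise is ParticipantStatusLogs1.csv. This table records each participant’s status over time, with columns including timestamp, currentLocation, participantId, currentMode, hungerStatus, sleepStatus, apartmentId, availableBalance, jobId, financialStatus, dailyFoodBudget and weeklyExtraBudget.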

3 Data Preparation

3.1 Installing and launching R packages

For this exercise, I used nine packages: sf, tmap, ViSiElse, lubridate, clock, sftime, rmarkdown, tidyverse and qdapTools. The following code chunk installs any that are missing and loads them all into the R session.

# Packages needed for this exercise
packages <- c("sf", "tmap", "ViSiElse",
              "lubridate", "clock", "sftime",
              "rmarkdown", "tidyverse", "qdapTools")

# Install any package that is missing, then attach it
for (p in packages) {
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}

3.2 Importing the dataset

The data was imported with read_sf() from the sf package. This function reads the file through GDAL and returns the result as a tibble. The GEOM_POSSIBLE_NAMES option tells GDAL's CSV driver which column name may hold geometry.

raw_data <- read_sf("data/ParticipantStatusLogs1.csv", options="GEOM_POSSIBLE_NAMES=location")

Because the original CSV file exceeds 200 MB, it cannot be committed to the GitHub repository. I therefore keep only the records needed for this exercise and save them in RDS format, which reduces the size of the source data file. Saving as RDS does not lose any data, and the smaller file is also faster to read.

The R code in the following code chunk is used to save source data as rds file.

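logs_data <- raw_data %>%
  # Parse the timestamp string into a date-time (clock package)
  mutate(Timestamp = date_time_parse(timestamp,
                                     zone = "",
                                     format = "%Y-%m-%dT%H:%M:%S")) %>%
  # Extract the day of the month from the parsed date-time
  mutate(day = get_day(Timestamp)) %>%
  # Keep only records from day 2 (lubridate's day() on the raw timestamp)
  filter(day(timestamp) == 2)

# Save the reduced data set as an RDS file
write_rds(logs_data, "data/rds/ParticipantStatusLogs1.rds")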

Once the data has been saved in RDS format, it can be read directly with read_rds() in later sessions.

# A tibble: 291,168 x 14
   timestamp    currentLocation participantId currentMode hungerStatus
   <chr>        <chr>           <chr>         <chr>       <chr>       
 1 2022-03-02T~ POINT (-2724.6~ 0             AtHome      BecomingHun~
 2 2022-03-02T~ POINT (-2619.0~ 1             Transport   JustAte     
 3 2022-03-02T~ POINT (-1360.9~ 2             AtHome      Hungry      
 4 2022-03-02T~ POINT (-1558.5~ 3             AtHome      BecomingHun~
 5 2022-03-02T~ POINT (976.240~ 4             AtHome      BecomingHun~
 6 2022-03-02T~ POINT (-927.27~ 5             Transport   BecameFull  
 7 2022-03-02T~ POINT (1795.12~ 6             AtHome      BecameFull  
 8 2022-03-02T~ POINT (-93.629~ 7             Transport   Hungry      
 9 2022-03-02T~ POINT (616.295~ 8             AtHome      Starving    
10 2022-03-02T~ POINT (-2034.6~ 9             AtHome      BecomingHun~
# ... with 291,158 more rows, and 9 more variables:
#   sleepStatus <chr>, apartmentId <chr>, availableBalance <chr>,
#   jobId <chr>, financialStatus <chr>, dailyFoodBudget <chr>,
#   weeklyExtraBudget <chr>, Timestamp <dttm>, day <int>
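
The save-and-reload round trip described above can be sketched with base R's saveRDS() and readRDS(), which readr's write_rds() and read_rds() wrap. The toy data frame and the temporary file path below are placeholders for illustration, not the exercise data.

```r
# Toy stand-in for the filtered logs (hypothetical values)
toy_logs <- data.frame(
  participantId = c("0", "1", "2"),
  currentMode = c("AtHome", "Transport", "AtHome"),
  stringsAsFactors = FALSE
)

# Write to a temporary .rds file, then read it back
rds_path <- tempfile(fileext = ".rds")
saveRDS(toy_logs, rds_path)
restored <- readRDS(rds_path)

# Check that the restored object matches the original, column types included
print(identical(toy_logs, restored))
```

Unlike a CSV round trip, column types survive unchanged, so no re-parsing is needed after loading.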

3.3 Data Preprocessing

This exercise covers only participants 1 and 2 and displays their daily routines from 00:00:00 to 23:59:59 on March 2, 2022, so their records need to be selected first.
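
The preprocessing code itself is not shown here. As a rough sketch of one possible approach (not necessarily the one used in this exercise), consecutive log rows with the same currentMode can be collapsed into start and end times with base R's rle(). The toy log, the 5-minute step, the function name mode_intervals and the convention that an interval ends one step after its last tick are all assumptions made for illustration.

```r
# Toy status log: one row per 5-minute tick (hypothetical values)
toy_log <- data.frame(
  minute = seq(0, 55, by = 5),
  currentMode = c("Transport", "Transport", "Transport",
                  "AtHome", "AtHome", "AtHome", "AtHome",
                  "Transport", "Transport",
                  "AtWork", "AtWork", "AtWork"),
  stringsAsFactors = FALSE
)

# Collapse runs of identical modes into (mode, start, end) intervals
mode_intervals <- function(minute, mode, step = 5) {
  runs <- rle(as.character(mode))
  last <- cumsum(runs$lengths)       # index of the last tick in each run
  first <- last - runs$lengths + 1   # index of the first tick in each run
  data.frame(
    mode = runs$values,
    start = minute[first],
    end = minute[last] + step,       # assumed: a run ends one step after its last tick
    stringsAsFactors = FALSE
  )
}

intervals <- mode_intervals(toy_log$minute, toy_log$currentMode)
print(intervals)
```

Each row of the result is one uninterrupted stay in a mode, which is the information the start and end columns in the tables below carry.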

After data preprocessing, the mode data for participant 1 are shown below. The values run from 0 to 1440 and appear to be minutes since midnight.

  id start_home_list end_home_list start_recr_list end_recr_list
1  1              15           455            1085          1130
2  1               0             0               0             0
3  1               0             0               0             0
4  1               0             0               0             0
5  1               0             0               0             0
6  1               0             0               0             0
  start_rest_list end_rest_list start_tran_list end_tran_list
1             630           650               0            15
2               0             0             455           490
3               0             0             620           630
4               0             0             650           660
5               0             0            1010          1045
6               0             0            1050          1085
  start_work_list end_work_list
1             490           620
2               0             0
3               0             0
4               0             0
5               0             0
6               0             0

After data preprocessing, the mode data for participant 2 are shown below.

  id start_home_list end_home_list start_recr_list end_recr_list
1  2               0           360            1080          1195
2  2               0             0               0             0
3  2             980           985               0             0
4  2            1290          1440               0             0
  start_rest_list end_rest_list start_tran_list end_tran_list
1               0             0             360           430
2               0             0             910           980
3               0             0             985          1080
4               0             0            1195          1290
  start_work_list end_work_list
1             430           910
2               0             0
3               0             0
4               0             0
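
Both tables above use a wide layout: one pair of start and end columns per mode, with shorter columns filled with 0. One way such a table might be assembled is sketched below in base R. The lists starts and ends, the helper pad_to and the choice of 0 as the fill value are assumptions for illustration; the sample values are taken from the first rows of the participant 1 table.

```r
# Start and end minutes per mode (subset of the participant 1 values above)
starts <- list(home = c(15),  tran = c(0, 455, 620),  work = c(490))
ends   <- list(home = c(455), tran = c(15, 490, 630), work = c(620))

# Pad a numeric vector with `fill` up to length n
pad_to <- function(x, n, fill = 0) c(x, rep(fill, n - length(x)))

# The longest list decides how many rows the wide table needs
n_rows <- max(lengths(starts))

wide <- data.frame(id = rep(1, n_rows))
for (mode in names(starts)) {
  wide[[paste0("start_", mode, "_list")]] <- pad_to(starts[[mode]], n_rows)
  wide[[paste0("end_", mode, "_list")]]   <- pad_to(ends[[mode]], n_rows)
}
print(wide)
```

Padding is only needed because a data frame requires columns of equal length; the padded cells do not correspond to real activity.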

4 Plotting daily routines

The following two charts show the start and end times of each mode in the routines of participant 1 and participant 2 over the whole day of March 2.

ViSiElse distinguishes two types of actions: punctual and long. A punctual action is a single point in time, while a long action has a duration bounded by two punctual actions.

Because it is difficult to grasp a participant’s daily routine from the start and end moments of each state alone, all the punctual actions are combined into long actions. This makes the duration of each state visible and makes the two participants’ routines easier to compare.

# Add five long actions (typeA = "l"), each bounded by a start and an end column
book1[11,] <- c("At home", "At home", "l", 11, "start_home_list", "end_home_list")
book1[12,] <- c("Enjoyment", "Enjoyment", "l", 12, "start_recr_list", "end_recr_list")
book1[13,] <- c("At Restaurant", "At Restaurant", "l", 13, "start_rest_list", "end_rest_list")
book1[14,] <- c("Transportation", "Transportation", "l", 14, "start_tran_list", "end_tran_list")
book1[15,] <- c("At work", "At work", "l", 15, "start_work_list", "end_work_list")
# Hide the ten punctual actions (NA) and show only the long actions
book1$showorder <- c(rep(NA, 10), 11:15)
# Sort so that the displayed actions come first
book1 <- book1[order(as.numeric(book1$showorder)), ]
book1
              vars           label typeA showorder             deb
11         At home         At home     l        11 start_home_list
12       Enjoyment       Enjoyment     l        12 start_recr_list
13   At Restaurant   At Restaurant     l        13 start_rest_list
14  Transportation  Transportation     l        14 start_tran_list
15         At work         At work     l        15 start_work_list
1  start_home_list start_home_list     p        NA            <NA>
3    end_home_list   end_home_list     p        NA            <NA>
4  start_recr_list start_recr_list     p        NA            <NA>
5    end_recr_list   end_recr_list     p        NA            <NA>
6  start_rest_list start_rest_list     p        NA            <NA>
7    end_rest_list   end_rest_list     p        NA            <NA>
8  start_tran_list start_tran_list     p        NA            <NA>
9    end_tran_list   end_tran_list     p        NA            <NA>
10 start_work_list start_work_list     p        NA            <NA>
2    end_work_list   end_work_list     p        NA            <NA>
             fin
11 end_home_list
12 end_recr_list
13 end_rest_list
14 end_tran_list
15 end_work_list
1           <NA>
3           <NA>
4           <NA>
5           <NA>
6           <NA>
7           <NA>
8           <NA>
9           <NA>
10          <NA>
2           <NA>
# Same long-action definitions for participant 2's book
book2[11,] <- c("At home", "At home", "l", 11, "start_home_list", "end_home_list")
book2[12,] <- c("Enjoyment", "Enjoyment", "l", 12, "start_recr_list", "end_recr_list")
book2[13,] <- c("At Restaurant", "At Restaurant", "l", 13, "start_rest_list", "end_rest_list")
book2[14,] <- c("Transportation", "Transportation", "l", 14, "start_tran_list", "end_tran_list")
book2[15,] <- c("At work", "At work", "l", 15, "start_work_list", "end_work_list")
# Hide the ten punctual actions (NA) and show only the long actions
book2$showorder <- c(rep(NA, 10), 11:15)
# Sort so that the displayed actions come first
book2 <- book2[order(as.numeric(book2$showorder)), ]
book2
              vars           label typeA showorder             deb
11         At home         At home     l        11 start_home_list
12       Enjoyment       Enjoyment     l        12 start_recr_list
13   At Restaurant   At Restaurant     l        13 start_rest_list
14  Transportation  Transportation     l        14 start_tran_list
15         At work         At work     l        15 start_work_list
1  start_home_list start_home_list     p        NA            <NA>
3    end_home_list   end_home_list     p        NA            <NA>
4  start_recr_list start_recr_list     p        NA            <NA>
5    end_recr_list   end_recr_list     p        NA            <NA>
6  start_rest_list start_rest_list     p        NA            <NA>
7    end_rest_list   end_rest_list     p        NA            <NA>
8  start_tran_list start_tran_list     p        NA            <NA>
9    end_tran_list   end_tran_list     p        NA            <NA>
10 start_work_list start_work_list     p        NA            <NA>
2    end_work_list   end_work_list     p        NA            <NA>
             fin
11 end_home_list
12 end_recr_list
13 end_rest_list
14 end_tran_list
15 end_work_list
1           <NA>
3           <NA>
4           <NA>
5           <NA>
6           <NA>
7           <NA>
8           <NA>
9           <NA>
10          <NA>
2           <NA>
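
The five long-action rows are the same in both books and follow a regular naming pattern, so they could also be generated rather than typed out. The sketch below builds them as a standalone data frame in base R, using the column names visible in the printed books (vars, label, typeA, showorder, deb, fin). The names stems, labels and long_rows are illustrative, and how the rows are attached to an existing book is left as in the code above.

```r
# Short stems used in the data column names, and display labels for each long action
stems  <- c("home", "recr", "rest", "tran", "work")
labels <- c("At home", "Enjoyment", "At Restaurant", "Transportation", "At work")

long_rows <- data.frame(
  vars = labels,
  label = labels,
  typeA = "l",                             # "l" marks a long action
  showorder = 11:15,
  deb = paste0("start_", stems, "_list"),  # column holding the start time
  fin = paste0("end_", stems, "_list"),    # column holding the end time
  stringsAsFactors = FALSE
)
print(long_rows)
```

Generating the rows from one pair of vectors keeps the two books consistent if a mode is renamed or added later.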

The following two charts show how the long actions of participant 1 and participant 2 are distributed over the whole day of March 2.

visi1 <- visielse(X = X1,  book = book1, informer = NULL)

visi2 <- visielse(X = X2,  book = book2, informer = NULL)

For a more intuitive comparison, I combined the routines of participant 1 and participant 2 on March 2 into one graph.

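# Stack the two participants' data
X <- rbind(X1, X2)
# One group label per row: "group1" for participant 1's rows, "group2" for participant 2's
# (12 and 4 rows respectively in the original hard-coded vector)
group <- rep(c("group1", "group2"), times = c(nrow(X1), nrow(X2)))

# Plot both groups in one chart; method = "cut" draws each group in its own band
visi <- visielse(X, group = group, book = book1, informer = NULL, method = "cut")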

5 Summary

Comparing the two routines, we can conclude that on March 2, participant 2’s schedule was more compact than participant 1’s.
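
One way to put numbers behind such a comparison is to total the minutes spent in each mode. The sketch below does this in base R for participant 2, using the non-zero start and end minutes from the participant 2 table in Section 3.3. The data frame name p2 and its long layout are illustrative choices. Participant 1 is left out because the table shown for that participant may be only a partial listing.

```r
# Non-zero intervals for participant 2, copied from the table in Section 3.3
p2 <- data.frame(
  mode  = c("home", "home", "home", "recr",
            "tran", "tran", "tran", "tran", "work"),
  start = c(0, 980, 1290, 1080, 360, 910, 985, 1195, 430),
  end   = c(360, 985, 1440, 1195, 430, 980, 1080, 1290, 910),
  stringsAsFactors = FALSE
)

# Duration of each interval, then total minutes per mode
p2$minutes <- p2$end - p2$start
totals <- tapply(p2$minutes, p2$mode, sum)
print(totals)

# The intervals should cover the full day (24 * 60 minutes)
print(sum(totals))
```

For participant 2 the totals are 515 minutes at home, 115 in recreation, 330 in transport and 480 at work, which together account for all 1440 minutes of the day.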