Visualising and Analysing Daily Routines
The task of Take-home Exercise 4 is to reveal the daily routines of two selected participants of the city of Engagement, Ohio, USA, using appropriate static and interactive statistical graphics methods.
In this exercise, I use the ViSiElse package to display the daily routines of the two selected participants.
The data file used for this exercise is ParticipantStatusLogs1.csv. This table contains information about each participant’s daily routine. Following are the definitions of each column of data:
For this exercise, I used nine packages: sf, tmap, ViSiElse, lubridate, clock, sftime, rmarkdown, tidyverse and qdapTools. The R code in the following code chunk installs any required package that is missing and loads all of them into the RStudio environment.
# Install any package that is missing, then attach it
packages <- c("sf", "tmap", "ViSiElse",
              "lubridate", "clock", "sftime",
              "rmarkdown", "tidyverse", "qdapTools")
for (p in packages) {
  if (!require(p, character.only = TRUE)) {
    install.packages(p)
  }
  library(p, character.only = TRUE)
}
Data import was done with read_sf(), a function in the sf package. It reads the file through GDAL and returns the result as a tibble; the GEOM_POSSIBLE_NAMES option names the columns that may hold geometry.
raw_data <- read_sf("data/ParticipantStatusLogs1.csv", options="GEOM_POSSIBLE_NAMES=location")
Since the original csv file exceeds 200MB and therefore cannot be committed to the GitHub repository, I keep only the records for the day of interest and save them in rds format to reduce the size of the source data file. Converting to rds format does not lose any of the retained data, and the smaller file is also faster to read.
The R code in the following code chunk saves the filtered source data as an rds file.
# Parse the timestamp, derive the day of the month, and keep day 2 only
logs_data <- raw_data %>%
  mutate(Timestamp = date_time_parse(timestamp,
                                     zone = "",
                                     format = "%Y-%m-%dT%H:%M:%S")) %>%
  mutate(day = get_day(Timestamp)) %>%
  filter(day(timestamp) == 2)
# Save the reduced data set in rds format
write_rds(logs_data, "data/rds/ParticipantStatusLogs1.rds")
Once the data file has been converted to rds format, I can use read_rds() to read it directly the next time.
# A tibble: 291,168 x 14
timestamp currentLocation participantId currentMode hungerStatus
<chr> <chr> <chr> <chr> <chr>
1 2022-03-02T~ POINT (-2724.6~ 0 AtHome BecomingHun~
2 2022-03-02T~ POINT (-2619.0~ 1 Transport JustAte
3 2022-03-02T~ POINT (-1360.9~ 2 AtHome Hungry
4 2022-03-02T~ POINT (-1558.5~ 3 AtHome BecomingHun~
5 2022-03-02T~ POINT (976.240~ 4 AtHome BecomingHun~
6 2022-03-02T~ POINT (-927.27~ 5 Transport BecameFull
7 2022-03-02T~ POINT (1795.12~ 6 AtHome BecameFull
8 2022-03-02T~ POINT (-93.629~ 7 Transport Hungry
9 2022-03-02T~ POINT (616.295~ 8 AtHome Starving
10 2022-03-02T~ POINT (-2034.6~ 9 AtHome BecomingHun~
# ... with 291,158 more rows, and 9 more variables:
# sleepStatus <chr>, apartmentId <chr>, availableBalance <chr>,
# jobId <chr>, financialStatus <chr>, dailyFoodBudget <chr>,
# weeklyExtraBudget <chr>, Timestamp <dttm>, day <int>
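The save-and-reload step described above can be sketched as a round trip through a temporary file. This is only an illustration: the small data frame `logs_small` and the temporary path are hypothetical, and base R's saveRDS() and readRDS() are used so the snippet runs without extra packages (write_rds() and read_rds() from readr are wrappers around them).

```r
# Hypothetical stand-in for the filtered status logs
logs_small <- data.frame(
  participantId = c("1", "2"),
  currentMode   = c("Transport", "AtHome"),
  stringsAsFactors = FALSE
)

# Write the object to a temporary rds file, then read it back
rds_path <- tempfile(fileext = ".rds")
saveRDS(logs_small, rds_path)
logs_back <- readRDS(rds_path)

# Check that the object read back matches the one written
round_trip_ok <- identical(logs_small, logs_back)
round_trip_ok
```

Because rds stores the R object itself, column types survive the round trip, which is not guaranteed when going back through csv.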
This exercise looks only at participants 1 and 2 and displays their day from 00:00:00 to 23:59:59 on March 2, 2022, so their records need to be selected first.
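The selection code is not shown in this write-up. A minimal sketch of one way to do it, assuming participantId is stored as text (as in the tibble printed above) and using a hypothetical miniature of the logs, could look like this:

```r
# Hypothetical miniature of the status logs; participantId is character
logs <- data.frame(
  timestamp     = c("2022-03-02T00:00:00", "2022-03-02T00:00:00",
                    "2022-03-02T00:00:00", "2022-03-02T00:05:00"),
  participantId = c("0", "1", "2", "1"),
  currentMode   = c("AtHome", "Transport", "AtHome", "Transport"),
  stringsAsFactors = FALSE
)

# Keep only participants 1 and 2
# (dplyr equivalent: filter(logs, participantId %in% c("1", "2")))
selected <- logs[logs$participantId %in% c("1", "2"), ]

# One data frame per participant, named by id
by_participant <- split(selected, selected$participantId)
names(by_participant)
```

Comparing against the strings "1" and "2" rather than the numbers avoids relying on implicit type conversion.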
After data preprocessing, the mode data for participant 1 are as follows.
id start_home_list end_home_list start_recr_list end_recr_list
1 1 15 455 1085 1130
2 1 0 0 0 0
3 1 0 0 0 0
4 1 0 0 0 0
5 1 0 0 0 0
6 1 0 0 0 0
start_rest_list end_rest_list start_tran_list end_tran_list
1 630 650 0 15
2 0 0 455 490
3 0 0 620 630
4 0 0 650 660
5 0 0 1010 1045
6 0 0 1050 1085
start_work_list end_work_list
1 490 620
2 0 0
3 0 0
4 0 0
5 0 0
6 0 0
After data preprocessing, the mode data for participant 2 are as follows.
id start_home_list end_home_list start_recr_list end_recr_list
1 2 0 360 1080 1195
2 2 0 0 0 0
3 2 980 985 0 0
4 2 1290 1440 0 0
start_rest_list end_rest_list start_tran_list end_tran_list
1 0 0 360 430
2 0 0 910 980
3 0 0 985 1080
4 0 0 1195 1290
start_work_list end_work_list
1 430 910
2 0 0
3 0 0
4 0 0
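The preprocessing that turns the status logs into start and end minutes is not shown here. One possible approach, sketched below, is to run-length encode the mode column so that each run of identical consecutive modes becomes one spell. The 5-minute sampling step, the sample values and the names `minute`, `mode` and `spells` are all assumptions made for illustration, not the author's code.

```r
# Hypothetical log for one participant: minutes since midnight and mode
minute <- seq(0, 55, by = 5)
mode   <- c(rep("Transport", 3), rep("AtHome", 5),
            rep("Transport", 2), rep("AtWork", 2))

# Run-length encode consecutive identical modes
runs      <- rle(mode)
end_idx   <- cumsum(runs$lengths)          # index of last record in each run
start_idx <- end_idx - runs$lengths + 1    # index of first record in each run

# Assumption: a spell ends one sampling step after its last record,
# which is also the moment the next spell starts
step <- 5
spells <- data.frame(
  mode  = runs$values,
  start = minute[start_idx],
  end   = minute[end_idx] + step,
  stringsAsFactors = FALSE
)
spells
```

Each row of `spells` is one uninterrupted stay in a mode; the start and end columns for a given mode could then be collected into columns such as start_tran_list and end_tran_list.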
The following two charts show the start and end times of each condition in the routine of Participant 1 and Participant 2 for the whole day of March 2, respectively.
ViSiElse distinguishes two types of actions: punctual and long. A punctual action is a single time point, while a long action spans the interval between two punctual actions.
Since it is difficult to grasp the participants' daily routines accurately from the start and end moments of each state alone, the punctual actions are combined into long actions. This makes the duration of each state visible and makes it easier to compare the two participants' daily routines.
# Add five long ("l") actions, each bounded by its start and end columns
book1[11,] <- c("At home", "At home", "l", 11, "start_home_list", "end_home_list")
book1[12,] <- c("Enjoyment", "Enjoyment", "l", 12, "start_recr_list", "end_recr_list")
book1[13,] <- c("At Restaurant", "At Restaurant", "l", 13, "start_rest_list", "end_rest_list")
book1[14,] <- c("Transportation", "Transportation", "l", 14, "start_tran_list", "end_tran_list")
book1[15,] <- c("At work", "At work", "l", 15, "start_work_list", "end_work_list")
# Give the punctual actions NA so that only the long actions keep a display order
book1$showorder <- c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 11, 12, 13, 14, 15)
# Sort by display order; rows with NA go last
book1 <- book1[order(as.numeric(book1$showorder)), ]
book1
vars label typeA showorder deb
11 At home At home l 11 start_home_list
12 Enjoyment Enjoyment l 12 start_recr_list
13 At Restaurant At Restaurant l 13 start_rest_list
14 Transportation Transportation l 14 start_tran_list
15 At work At work l 15 start_work_list
1 start_home_list start_home_list p NA <NA>
3 end_home_list end_home_list p NA <NA>
4 start_recr_list start_recr_list p NA <NA>
5 end_recr_list end_recr_list p NA <NA>
6 start_rest_list start_rest_list p NA <NA>
7 end_rest_list end_rest_list p NA <NA>
8 start_tran_list start_tran_list p NA <NA>
9 end_tran_list end_tran_list p NA <NA>
10 start_work_list start_work_list p NA <NA>
2 end_work_list end_work_list p NA <NA>
fin
11 end_home_list
12 end_recr_list
13 end_rest_list
14 end_tran_list
15 end_work_list
1 <NA>
3 <NA>
4 <NA>
5 <NA>
6 <NA>
7 <NA>
8 <NA>
9 <NA>
10 <NA>
2 <NA>
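The row order in the printed book follows from how order() treats NA. A toy version makes this visible; the three-row `book` below is hypothetical and only mimics the column layout printed above (vars, label, typeA, showorder, deb, fin), without calling ViSiElse itself.

```r
# Hypothetical two-row book holding punctual ("p") actions only
book <- data.frame(
  vars      = c("start_home_list", "end_home_list"),
  label     = c("start_home_list", "end_home_list"),
  typeA     = c("p", "p"),
  showorder = c(1, 2),
  deb       = NA_character_,
  fin       = NA_character_,
  stringsAsFactors = FALSE
)

# Append one long ("l") action bounded by the two punctual actions
book[3, ] <- c("At home", "At home", "l", 3, "start_home_list", "end_home_list")

# Mirror the source: punctual rows get NA, the long action keeps its number
book$showorder <- c(NA, NA, 3)

# order() puts NA last by default, so the long action moves to the top
book_sorted <- book[order(as.numeric(book$showorder)), ]
book_sorted
```

The original row names are kept after sorting, which is why the printed book lists rows 11 to 15 before the lower-numbered punctual rows.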
# Same long actions and ordering for participant 2's book
book2[11,] <- c("At home", "At home", "l", 11, "start_home_list", "end_home_list")
book2[12,] <- c("Enjoyment", "Enjoyment", "l", 12, "start_recr_list", "end_recr_list")
book2[13,] <- c("At Restaurant", "At Restaurant", "l", 13, "start_rest_list", "end_rest_list")
book2[14,] <- c("Transportation", "Transportation", "l", 14, "start_tran_list", "end_tran_list")
book2[15,] <- c("At work", "At work", "l", 15, "start_work_list", "end_work_list")
book2$showorder <- c(NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, 11, 12, 13, 14, 15)
book2 <- book2[order(as.numeric(book2$showorder)), ]
book2
vars label typeA showorder deb
11 At home At home l 11 start_home_list
12 Enjoyment Enjoyment l 12 start_recr_list
13 At Restaurant At Restaurant l 13 start_rest_list
14 Transportation Transportation l 14 start_tran_list
15 At work At work l 15 start_work_list
1 start_home_list start_home_list p NA <NA>
3 end_home_list end_home_list p NA <NA>
4 start_recr_list start_recr_list p NA <NA>
5 end_recr_list end_recr_list p NA <NA>
6 start_rest_list start_rest_list p NA <NA>
7 end_rest_list end_rest_list p NA <NA>
8 start_tran_list start_tran_list p NA <NA>
9 end_tran_list end_tran_list p NA <NA>
10 start_work_list start_work_list p NA <NA>
2 end_work_list end_work_list p NA <NA>
fin
11 end_home_list
12 end_recr_list
13 end_rest_list
14 end_tran_list
15 end_work_list
1 <NA>
3 <NA>
4 <NA>
5 <NA>
6 <NA>
7 <NA>
8 <NA>
9 <NA>
10 <NA>
2 <NA>
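Since long actions are meant to expose durations, it may help to see the arithmetic once. The sketch below re-enters the start and end minutes printed earlier for participant 2 and sums end minus start per mode; the object name `p2_modes` is hypothetical, and rows filled with zeros contribute nothing to the totals.

```r
# Start/end minutes for participant 2, copied from the table printed earlier
p2_modes <- data.frame(
  id = 2,
  start_home_list = c(0, 0, 980, 1290),   end_home_list = c(360, 0, 985, 1440),
  start_recr_list = c(1080, 0, 0, 0),     end_recr_list = c(1195, 0, 0, 0),
  start_rest_list = 0,                    end_rest_list = 0,
  start_tran_list = c(360, 910, 985, 1195),
  end_tran_list   = c(430, 980, 1080, 1290),
  start_work_list = c(430, 0, 0, 0),      end_work_list = c(910, 0, 0, 0)
)

# Total minutes per mode: sum of (end - start) over all spells
modes <- c("home", "recr", "rest", "tran", "work")
minutes_in_mode <- sapply(modes, function(m) {
  sum(p2_modes[[paste0("end_", m, "_list")]] -
        p2_modes[[paste0("start_", m, "_list")]])
})
minutes_in_mode
```

For these values the totals are 515 minutes at home, 115 in recreation, 0 at a restaurant, 330 in transport and 480 at work, which add up to the 1440 minutes of the day.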
The following two charts show how the long actions of participant 1 and participant 2, respectively, are distributed over the whole day on March 2.
visi1 <- visielse(X = X1, book = book1, informer = NULL)
visi2 <- visielse(X = X2, book = book2, informer = NULL)
For a more intuitive comparison, I combined the routines of participant 1 and participant 2 on March 2 into one graph.
# Stack both participants' records into one data set
X <- rbind(X1, X2)
# One group label per row: "group1" for participant 1, "group2" for participant 2
group <- c(rep("group1", nrow(X1)), rep("group2", nrow(X2)))
visi <- visielse(X, group = group, book = book1, informer = NULL, method = "cut")
Comparing the two, we can conclude that on March 2 participant 2's schedule was more compact than participant 1's.