I'm new to R; please help me understand how to convert seconds to minutes.
I'm doing a travel time analysis, and the results come out in seconds, which is inconvenient for descriptive analysis and later visualizations. There are many calculations across different time periods; here is a small sample:
trip_stats <- cyclistic_df %>%
  group_by(member_casual) %>%
  summarise(average_ride_length = round(mean(ride_length), 2),   # average ride length (total ride time / trips)
            median_ride_length = round(median(ride_length), 2),  # median ride length
            min_ride_length = round(min(ride_length), 2),        # minimum ride length
            max_ride_length = round(max(ride_length), 2))        # maximum ride length
head(trip_stats)
Output:
# A tibble: 2 × 5
  member_casual average_ride_length median_ride_length min_ride_length max_ride_length
  <chr>         <drtn>              <drtn>             <drtn>          <drtn>
1 casual        22.72 secs          785 secs           1 secs          1922127 secs
2 member        12.19 secs          525 secs           1 secs          89872 secs
Or here's another example:
# Average ride length (ride_length):
ride_lengt_avg <- round(mean(cyclistic_df$ride_length), 2)
print(ride_lengt_avg)
Output:
Time difference of 977.28 secs
I tried several variants with as_hms, format "%M:%S", and minutes(), but the results are still reported in seconds. For example, I created a number of columns, including one that holds only the hour:
# Format time as HH:MM:SS:
cyclistic_df$time <- format(as.Date(cyclistic_df$date), "%H:%M:%S")

# Create new column for time:
cyclistic_df$time <- as_hms(cyclistic_df$started_at)

# Create new column for hour:
cyclistic_df$hour <- hour(cyclistic_df$time)
Output:
              0      1      2      3      4      5      6      7      8      9     10     11     12     13     14     15     16     17
  casual  32053  20813  12246   6763   4515   8707  23390  40346  55066  56190  72233  93437 109905 114531 121187 134839 153614 170534
  member  25582  15645   8736   5268   6097  26049  81660 151664 180585 120035 109540 129717 149403 147804 148800 183330 249039 297724

             18     19     20     21     22     23
  casual 148452 111916  81388  69348  61476  44808
  member 233303 165938 115161  88668  65505  41332
Unfortunately, I can't apply the same approach to get the output in minutes instead of seconds.
Dividing by 60 doesn't help either; it is misleading because the "secs" label is retained.
I have added the following columns:
# Default format is yyyy-mm-dd, use start date:
cyclistic_df$date <- as.Date(cyclistic_df$started_at) # the default format is yyyy-mm-dd

# Create column for year:
cyclistic_df$year <- format(as.Date(cyclistic_df$date), "%Y")

# Create column for month:
cyclistic_df$month <- format(as.Date(cyclistic_df$date), "%m")

# Create column for day:
cyclistic_df$day <- format(as.Date(cyclistic_df$date), "%d")

# Calculate the day of the week:
cyclistic_df$day_of_week <- wday(cyclistic_df$started_at)

# Create column for day of week:
cyclistic_df$day_of_week <- format(as.Date(cyclistic_df$date), "%A") # wday(cyclistic_df$started_at, label = T, abbr = T)

# Format time as HH:MM:SS:
cyclistic_df$time <- format(as.Date(cyclistic_df$date), "%H:%M:%S")

# Create new column for time:
cyclistic_df$time <- as_hms(cyclistic_df$started_at)

# Create new column for hour:
cyclistic_df$hour <- hour(cyclistic_df$time)

# Calculate & Create ride length column by subtracting ended_at time from started_at time and converted it to minutes:
cyclistic_df$ride_length <- as_hms(difftime(cyclistic_df$ended_at, cyclistic_df$started_at))
If I understand correctly, the year, month, day, hour, and ride length columns could be created as numeric types from the start, so they would not have to be converted to numeric every time later on.
It would be better not to use format(, "%__"), right?
I would be very grateful for your help!
as.numeric accepts a units= argument for difftime objects:
set.seed(42)
d <- (Sys.time()) - (Sys.time() - runif(10, 0, 99))
d
# Time differences in secs
# [1] 90.56578 92.77045 28.32780 82.21430 63.53279 51.39048 72.92223 13.33198 65.04222 69.80140
as.numeric(d, units = "mins")
# [1] 1.5094297 1.5461741 0.4721299 1.3702383 1.0588798 0.8565080 1.2153704 0.2221996 1.0840370 1.1633566
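This is also why dividing by 60 keeps the "secs" label: arithmetic on a difftime rescales the numbers but leaves the units attribute alone. A minimal base R sketch, where the vector `x` and its values are made up for illustration and are not taken from the question's data:

```r
# Hypothetical ride lengths in seconds (illustrative values only)
x <- as.difftime(c(90, 600, 1800), units = "secs")

# Division changes the numbers but keeps the "secs" label, which is misleading
wrong <- x / 60
units(wrong)

# as.numeric(units = ) converts to minutes and drops the difftime class
mins_num <- as.numeric(x, units = "mins")

# units<- converts to minutes and keeps the difftime class
mins_dt <- x
units(mins_dt) <- "mins"
```

Use `as.numeric(..., units = "mins")` when plain numbers are wanted for plotting, and `units<-` when the "mins" label should stay attached in printed output.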
You can apply that to all of those columns with something like this (assuming dplyr):
trip_stats <- trip_stats %>%
  mutate(across(ends_with("_length"), ~ as.numeric(., units = "mins")))
Or, more generally:
trip_stats <- trip_stats %>%
  mutate(across(where(~ inherits(., "difftime")), ~ as.numeric(., units = "mins")))
In base R, you could do this instead:
isdifftm <- sapply(trip_stats, inherits, "difftime")
trip_stats[,isdifftm] <- lapply(trip_stats[,isdifftm,drop=FALSE], as.numeric, units = "mins")
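To try the base R approach without the full dataset, here is a small self-contained sketch. The data frame `toy_stats` and its values are hypothetical stand-ins for `trip_stats`, not the real summary:

```r
# Toy summary table standing in for trip_stats (values are made up)
toy_stats <- data.frame(
  member_casual = c("casual", "member"),
  median_ride_length = as.difftime(c(780, 540), units = "secs"),
  max_ride_length = as.difftime(c(7200, 3600), units = "secs")
)

# Flag the difftime columns, then convert only those to numeric minutes
is_difftime <- sapply(toy_stats, inherits, "difftime")
toy_stats[, is_difftime] <- lapply(
  toy_stats[, is_difftime, drop = FALSE],
  as.numeric, units = "mins"
)
```

The `drop = FALSE` keeps the selection a data frame even when only one column matches, so `lapply` still iterates over columns rather than over the elements of a single vector.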
Edit
Perhaps this is the complete version?
trip_stats <- cyclistic_df %>%
  group_by(member_casual) %>%
  summarize(across(ride_length, .names = "{.fn}", list(
    average_ride_length = mean,
    median_ride_length = median,
    min_ride_length = min,
    max_ride_length = max))) %>%
  mutate(across(ends_with("_length"), ~ round(as.numeric(., units = "mins"), 2)))
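On the side question about creating the date parts as numbers from the start: format() always returns character, so one option is to wrap it in as.integer(), and to ask difftime() for minutes directly. The sketch below uses base R only; the vectors `started_at` and `ended_at` hold made-up values, and the fixed "UTC" time zone is an assumption chosen for reproducibility. lubridate's year(), month(), day(), and hour() accessors also return numbers and could be used instead.

```r
# Hypothetical start times (illustrative values, fixed time zone)
started_at <- as.POSIXct(c("2023-07-04 08:15:30", "2023-12-31 23:59:59"), tz = "UTC")
ended_at <- started_at + c(600, 90)  # rides of 600 and 90 seconds

# format() yields character, so convert each part to integer right away
year  <- as.integer(format(started_at, "%Y"))
month <- as.integer(format(started_at, "%m"))
day   <- as.integer(format(started_at, "%d"))
hour  <- as.integer(format(started_at, "%H"))

# Ask difftime() for minutes, then drop the class to get plain numbers
ride_length_mins <- as.numeric(difftime(ended_at, started_at, units = "mins"))
```

Storing `ride_length_mins` as a plain numeric column means later summaries such as mean() or median() need no further unit conversion.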