R - Converting fractions in text to numeric

Fra*_* B. 7 string r

I'm trying to convert '9¼"' to '9.25', but I can't seem to get the fraction read correctly.

Here is the data I'm working with:

library(XML)

url <- "http://mockdraftable.com/players/2014/"
combine <- readHTMLTable(url, which = 1, header = FALSE, stringsAsFactors = FALSE)

names(combine) <- c("Name", "Pos", "Hght", "Wght", "Arms", "Hands",
                    "Dash40yd", "Dash20yd", "Dash10yd", "Bench", "Vert", "Broad", 
                    "Cone3", "ShortShuttle20")

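For example, the Hands column in the first row is '9¼"'. How would I turn combine$Hands into 9.25? The same goes for all the other fractions, 1/8 through 7/8.

Any help would be appreciated.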

Nic*_*icE 7

You can try converting the Unicode characters directly to ASCII while reading the XML, by passing a custom element function:

library(stringi)
readHTMLTable(url, which = 1, header = FALSE, stringsAsFactors = FALSE,
              elFun = function(node) {
                val <- xmlValue(node)
                stri_trans_general(val, "latin-ascii")
              })

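You can then convert the result to numeric using @Metrics's suggestion.

For example, you can clean up the Arms data using @G. Grothendieck's function from this post: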

library(XML)
library(stringi)
library(gsubfn)
#the calc function is by @G. Grothendieck
calc <- function(s) {
        x <- c(if (length(s) == 2) 0, as.numeric(s), 0:1)
        x[1] + x[2] / x[3]
}

url <- "http://mockdraftable.com/players/2014/"

# transliterate each cell to ASCII as the table is read
combine <- readHTMLTable(url, which = 1, header = FALSE, stringsAsFactors = FALSE,
                         elFun = function(node) {
                           val <- xmlValue(node)
                           stri_trans_general(val, "latin-ascii")
                         })

names(combine) <- c("Name", "Pos", "Hght", "Wght", "Arms", "Hands",
                    "Dash40yd", "Dash20yd", "Dash10yd", "Bench", "Vert", "Broad", 
                    "Cone3", "ShortShuttle20")

sapply(strapplyc(gsub('\"',"",combine$Arms), "\\d+"), calc)

#[1] 30.000 31.500 30.000 31.750 31.875 29.875 31.000 31.000 30.250 33.000 32.500 31.625 32.875
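To see how calc treats the different cases without hitting the website, here is a small sketch on hand-made strings. It assumes the transliterated cells look like "31 1/2" (whole number, then numerator/denominator); the sample vector and the helper extract_digits are hypothetical, and regmatches/gregexpr stand in for gsubfn::strapplyc so that only base R is needed.

```r
# calc is @G. Grothendieck's function, unchanged
calc <- function(s) {
  # pad a bare fraction with a leading 0, and a bare whole number with 0/1
  x <- c(if (length(s) == 2) 0, as.numeric(s), 0:1)
  x[1] + x[2] / x[3]
}

# hypothetical base-R stand-in for strapplyc(x, "\\d+")
extract_digits <- function(x) regmatches(x, gregexpr("[0-9]+", x))

# made-up inputs: whole number, mixed number, bare fraction
arms <- c("30", "31 1/2", "7/8")
parsed <- sapply(extract_digits(arms), calc)
parsed
# [1] 30.000 31.500  0.875
```

Note that calc interprets any two runs of digits as numerator and denominator, so a decimal such as "4.45" would be parsed as 4/45. It should only be applied to the columns that hold fractions.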

There may be some encoding issues depending on your machine (see the comments).
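If the transliteration step misbehaves on a given machine, one possible alternative is to skip it and map the Unicode fraction characters to their values directly. The following is a minimal base-R sketch, not part of the original answer: the names frac_values and frac_to_numeric are made up for illustration, and it assumes each cell is an optional whole number followed by at most one fraction character and an optional inch or foot mark.

```r
# Unicode vulgar fractions for quarters and eighths, written as escapes
# so the source file itself stays ASCII
frac_values <- setNames(
  c(1/4, 1/2, 3/4, 1/8, 3/8, 5/8, 7/8),
  c("\u00bc", "\u00bd", "\u00be", "\u215b", "\u215c", "\u215d", "\u215e")
)

# hypothetical helper: "9¼\"" -> 9.25
frac_to_numeric <- function(x) {
  x <- trimws(gsub("[\"']", "", x))              # drop inch/foot marks
  last <- substring(x, nchar(x), nchar(x))       # final character of each cell
  has_frac <- last %in% names(frac_values)
  whole <- ifelse(has_frac, substring(x, 1, nchar(x) - 1), x)
  whole[!nzchar(whole)] <- "0"                   # bare fraction, no whole part
  frac <- ifelse(has_frac, frac_values[last], 0)
  as.numeric(whole) + unname(frac)
}

frac_to_numeric(c("9\u00bc\"", "30\"", "31\u215e\"", "\u00be"))
# [1]  9.250 30.000 31.875  0.750
```

This would be applied to a column read without elFun, for example frac_to_numeric(combine$Hands), provided the cells really do contain the raw Unicode characters.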