How to replace each NA individually using a linear model in R

zhi*_* li 4 r dataframe dplyr

I have looked at several web pages, but their results do not fit my needs.

I want to write a function that can do this:

Say there is a vector a:

a = c(100000, 137862, NA, NA, NA, 178337, NA, NA, NA, NA, NA, 295530)

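First, find the values immediately before and after each single NA or run of consecutive NAs. In this case the segments are 137862, NA, NA, NA, 178337 and 178337, NA, NA, NA, NA, NA, 295530.

Second, compute the slope of each segment, then replace the NAs.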

# 137862, NA, NA, NA, 178337
slope_1 = (178337 - 137862)/4

137862 + slope_1*1 # 1st NA replace with 147980.8
137862 + slope_1*2 # 2nd NA replace with 158099.5
137862 + slope_1*3 # 3rd NA replace with 168218.2

# 178337, NA, NA, NA, NA, NA, 295530

slope_2 = (295530 - 178337)/6

178337 + slope_2*1 # 4th NA replace with 197869.2
178337 + slope_2*2 # 5th NA replace with 217401.3
178337 + slope_2*3 # 6th NA replace with 236933.5
178337 + slope_2*4 # 7th NA replace with 256465.7
178337 + slope_2*5 # 8th NA replace with 275997.8

Finally, the expected vector should be:

a_without_NA = c(100000, 137862, 147980.8, 158099.5, 168218.2, 178337, 197869.2, 217401.3, 
                 236933.5, 256465.7, 275997.8, 295530)
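The two steps described above can be sketched in base R roughly as follows. This is only an illustration of the procedure in the question, not a tested library routine: the function name fill_na_linear is hypothetical, and it assumes a numeric vector in which positions are equally spaced. NAs before the first known value or after the last one are left untouched.

```r
# Hypothetical helper (name is an assumption, not from any package):
# fill interior NA runs on a straight line between the nearest known
# neighbours; leading and trailing NAs are left as NA.
fill_na_linear <- function(x) {
  known <- which(!is.na(x))           # positions of the observed values
  if (length(known) < 2) return(x)    # nothing to interpolate between
  for (k in seq_len(length(known) - 1)) {
    left  <- known[k]
    right <- known[k + 1]
    gap   <- right - left             # number of steps between the two anchors
    if (gap > 1) {                    # at least one NA lies in between
      slope <- (x[right] - x[left]) / gap
      steps <- seq_len(gap - 1)
      x[left + steps] <- x[left] + slope * steps
    }
  }
  x
}

a <- c(100000, 137862, NA, NA, NA, 178337, NA, NA, NA, NA, NA, 295530)
a_filled <- fill_na_linear(a)
print(a_filled)

print(fill_na_linear(c(NA, NA, 1, 3, NA, 5, 7)))  # NA NA 1 3 4 5 7
print(fill_na_linear(c(1, 3, NA, 5, 7, NA, NA)))  # 1 3 4 5 7 NA NA
```

For the vector a, the filled values agree with the hand-computed ones once rounded to one decimal place. The loop visits each pair of neighbouring known values, so its cost grows with the number of known points rather than the number of NAs.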

If there is a single NA or a run of consecutive NAs at the beginning (or at the end) of the vector, keep them as NA.

# NA at beginning
b = c(NA, NA, 1, 3, NA, 5, 7)

# 3, NA, 5
slope_1 = (5-3)/2
3 + slope_1*1 # 3rd NA replace with 4
b_without_NA = c(NA, NA, 1, 3, 4, 5, 7)

# NA at ending
c = c(1, 3, NA, 5, 7, NA, NA)

# 3, NA, 5
slope_1 = (5-3)/2
3 + slope_1*1 # 1st NA replace with 4
c_without_NA = c(1, 3, 4, 5, 7, NA, NA)

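Note: in my real data, the vector is strictly increasing (vector[n + 1] > vector[n]).

I understand the principle, but I don't know how to write a custom function that implements it.

Any help would be greatly appreciated!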

Ron*_*hah 5

na.approx from the zoo package helps:

a = c(100000, 137862, NA, NA, NA, 178337, NA, NA, NA, NA, NA, 295530)
zoo::na.approx(a, na.rm = FALSE)

# [1] 100000.0 137862.0 147980.8 158099.5 168218.2 178337.0 197869.2 217401.3
# [9] 236933.5 256465.7 275997.8 295530.0

b = c(NA, NA, 1, 3, NA, 5, 7)

zoo::na.approx(b, na.rm = FALSE)
#[1] NA NA  1  3  4  5  7

c = c(1, 3, NA, 5, 7, NA, NA)
zoo::na.approx(c, na.rm = FALSE)
#[1]  1  3  4  5  7 NA NA
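If adding zoo as a dependency is not an option, a similar result can probably be obtained with approx() from base R's stats package. The sketch below is an assumption-laden alternative, not part of the answer above: the wrapper name na_interp_base is hypothetical, and it relies on rule = 1, which makes approx() return NA for positions outside the range of the known points.

```r
# Hypothetical wrapper around stats::approx(); the name is an assumption.
na_interp_base <- function(x) {
  ok <- !is.na(x)
  if (sum(ok) < 2) return(x)  # approx() needs at least two known points
  # Interpolate over positions 1..length(x); rule = 1 yields NA outside
  # the range of known positions, so leading/trailing NAs are preserved.
  approx(x = which(ok), y = x[ok], xout = seq_along(x),
         method = "linear", rule = 1)$y
}

print(na_interp_base(c(NA, NA, 1, 3, NA, 5, 7)))  # NA NA 1 3 4 5 7
print(na_interp_base(c(1, 3, NA, 5, 7, NA, NA)))  # 1 3 4 5 7 NA NA
```

Unlike zoo::na.approx, this sketch only handles plain numeric vectors indexed by position; unequally spaced time indexes would need the actual index values passed as x.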