Mat*_*ock 18 python numpy r scipy
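R has a very useful function for working out the parameters of a two-sided test (here, a two-sample comparison of proportions) needed to reach a target statistical power.
The function is called power.prop.test.
http://stat.ethz.ch/R-manual/R-patched/library/stats/html/power.prop.test.html
You can call it like this: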
power.prop.test(p1 = .50, p2 = .75, power = .90)
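It tells you the sample size needed to reach that power. This is very useful for determining sample sizes for tests.
Is there a similar function in the scipy package?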
Mat*_*ock 21
I managed to replicate the function using the formula for n below and the inverse survival function norm.isf from scipy.stats.
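Read off the code in this answer, the formula for n appears to be the normal-approximation sample size built on the pooled proportion. The symbols $\bar p$, $z_{\alpha/2}$ and $z_{\beta}$ are notation introduced here for illustration, not the author's; in the code they correspond to `(p1+p2)/2`, `norm.isf(sig/2)` and `-norm.isf(power)`.

```latex
n = \frac{2\,\bar p\,(1-\bar p)\,\left(z_{\alpha/2} + z_{\beta}\right)^{2}}{(p_1 - p_2)^{2}},
\qquad \bar p = \frac{p_1 + p_2}{2}
```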

from scipy.stats import norm

def sample_power_probtest(p1, p2, power=0.8, sig=0.05):
    z = norm.isf([sig/2])  # two-sided test
    zp = -1 * norm.isf([power])
    d = (p1-p2)
    s = 2*((p1+p2)/2)*(1-((p1+p2)/2))  # twice the pooled variance
    n = s * ((zp + z)**2) / (d**2)
    return int(round(n[0]))

def sample_power_difftest(d, s, power=0.8, sig=0.05):
    z = norm.isf([sig/2])
    zp = -1 * norm.isf([power])
    n = s * ((zp + z)**2) / (d**2)
    return int(round(n[0]))

if __name__ == '__main__':
    n = sample_power_probtest(0.1, 0.11, power=0.8, sig=0.05)
    print(n)  # 14752
    n = sample_power_difftest(0.1, 0.5, power=0.8, sig=0.05)
    print(n)  # 392
Jos*_*sef 10
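Some basic power calculations are now available in statsmodels:
http://statsmodels.sourceforge.net/devel/stats.html#power-and-sample-size-calculations
http://jpktd.blogspot.ca/2013/03/statistical-power-in-statsmodels.html
The blog post does not yet reflect the latest changes to the statsmodels code. Also, I haven't decided how many wrapper functions to provide, since many power calculations simply reduce to the basic distributions.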
>>> import statsmodels.stats.api as sms
>>> es = sms.proportion_effectsize(0.5, 0.75)
>>> sms.NormalIndPower().solve_power(es, power=0.9, alpha=0.05, ratio=1)
76.652940372066908
In R stats:
> power.prop.test(p1 = .50, p2 = .75, power = .90)
Two-sample comparison of proportions power calculation
n = 76.7069301141077
p1 = 0.5
p2 = 0.75
sig.level = 0.05
power = 0.9
alternative = two.sided
NOTE: n is number in *each* group
Using R's pwr package:
> library(pwr)
> h<-ES.h(0.5,0.75)
> pwr.2p.test(h=h, power=0.9, sig.level=0.05)
Difference of proportion power calculation for binomial distribution (arcsine transformation)
h = 0.5235987755982985
n = 76.6529406106181
sig.level = 0.05
power = 0.9
alternative = two.sided
NOTE: same sample sizes
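The two R results above can be cross-checked by hand. The following is a minimal sketch, not either package's actual code: it assumes the usual normal-approximation formulas (pooled and unpooled variances for the proportions test, the arcsine effect size h for pwr), ignores the negligible far rejection tail, and uses the standard library's `statistics.NormalDist` in place of `scipy.stats.norm`. The function names are hypothetical.

```python
from math import asin, sqrt
from statistics import NormalDist

def _z(upper_tail_prob):
    """Upper-tail normal quantile, the same quantity scipy's norm.isf returns."""
    return NormalDist().inv_cdf(1.0 - upper_tail_prob)

def n_pooled_variance(p1, p2, power=0.8, sig=0.05):
    """Per-group n using pooled variance under H0, unpooled under H1 (assumed formula)."""
    z_alpha = _z(sig / 2)       # two-sided critical value
    z_beta = _z(1 - power)      # quantile matching the target power
    p_bar = (p1 + p2) / 2
    null_sd = sqrt(2 * p_bar * (1 - p_bar))
    alt_sd = sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    return ((z_alpha * null_sd + z_beta * alt_sd) / (p1 - p2)) ** 2

def n_arcsine(p1, p2, power=0.8, sig=0.05):
    """Per-group n using the arcsine-transformed effect size h (assumed formula)."""
    h = 2 * asin(sqrt(p1)) - 2 * asin(sqrt(p2))
    return 2 * ((_z(sig / 2) + _z(1 - power)) / h) ** 2

print(n_pooled_variance(0.5, 0.75, power=0.9))  # close to 76.707
print(n_arcsine(0.5, 0.75, power=0.9))          # close to 76.653
```

Under these assumptions, the small gap between 76.71 and 76.65 comes from the choice of variance model (pooled proportions versus arcsine transformation), not from a numerical error in either tool.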
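Matt's answer for the required n (per group) is almost right, but it contains a small error.
Given d (difference in means), s (standard deviation), sig (significance level, typically .05), and power (typically .80), the formula for the number of observations per group is:
n = 2 * s^2 * (z_(sig/2) + z_power)^2 / d^2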
In his formula, you can see that he has
n = s * ((zp + z)**2) / (d**2)
The "s" part is wrong. A correct function that reproduces the R functionality is:
from scipy.stats import norm

def sample_power_difftest(d, s, power=0.8, sig=0.05):
    z = norm.isf([sig/2])  # two-sided critical value
    zp = -1 * norm.isf([power])
    n = (2*(s**2)) * ((zp + z)**2) / (d**2)
    return int(round(n[0]))
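One detail worth checking: the earlier answer's example call uses s = 0.5, and for that value 2*s^2 equals s, so both formulas return 392 there. The sketch below compares the two for a second value of s. It uses the standard library's `statistics.NormalDist` instead of scipy, and the function names `n_original` and `n_corrected` are made up for this comparison.

```python
from statistics import NormalDist

def _z(upper_tail_prob):
    """Upper-tail normal quantile, the same quantity scipy's norm.isf returns."""
    return NormalDist().inv_cdf(1.0 - upper_tail_prob)

def n_original(d, s, power=0.8, sig=0.05):
    """Formula from the earlier answer: s enters linearly."""
    return s * (_z(sig / 2) + _z(1 - power)) ** 2 / d ** 2

def n_corrected(d, s, power=0.8, sig=0.05):
    """Formula from this answer: s is a standard deviation, so 2*s**2."""
    return 2 * s ** 2 * (_z(sig / 2) + _z(1 - power)) ** 2 / d ** 2

for s in (0.5, 1.0):
    print(s, round(n_original(0.1, s)), round(n_corrected(0.1, s)))
# 0.5 392 392
# 1.0 785 1570
```

The earlier formula is still consistent if its s is read as twice the variance, which is how sample_power_probtest computes it; the two only disagree when s is passed as a standard deviation.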
Hope this helps.
There is also:
from statsmodels.stats.power import tt_ind_solve_power
and pass None for the value you want to solve for. For example, to get the number of observations with effect_size = 0.1, power = 0.8, and so on, you would call:
tt_ind_solve_power(effect_size=0.1, nobs1 = None, alpha=0.05, power=0.8, ratio=1, alternative='two-sided')
and get 1570.7330663315456 as the required number of observations. Or, to get the power achievable with the other values fixed:
tt_ind_solve_power(effect_size= 0.2, nobs1 = 200, alpha=0.05, power=None, ratio=1, alternative='two-sided')
You get: 0.5140816347005553
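As a rough sanity check on these two numbers, a normal approximation should land close to the t-based results. This is only a sketch under that assumption, using the standard library's `statistics.NormalDist`; the helper names `approx_nobs` and `approx_power` are hypothetical and are not part of statsmodels.

```python
from math import sqrt
from statistics import NormalDist

norm = NormalDist()

def approx_nobs(effect_size, alpha=0.05, power=0.8):
    """Per-group n for a two-sided two-sample test, normal approximation."""
    z_alpha = norm.inv_cdf(1 - alpha / 2)
    z_beta = norm.inv_cdf(power)
    return 2 * ((z_alpha + z_beta) / effect_size) ** 2

def approx_power(effect_size, nobs1, alpha=0.05):
    """Two-sided power with equal group sizes, normal approximation."""
    z_alpha = norm.inv_cdf(1 - alpha / 2)
    shift = effect_size * sqrt(nobs1 / 2)
    # Both rejection tails are counted; the far tail is tiny here.
    return norm.cdf(shift - z_alpha) + norm.cdf(-shift - z_alpha)

print(approx_nobs(0.1))        # about 1569.8, vs. 1570.73 from the t-based solver
print(approx_power(0.2, 200))  # about 0.516, vs. 0.514 from the t-based solver
```

The approximation gives a slightly smaller n and slightly higher power because it ignores the extra uncertainty from estimating the variance, which the t-based solver accounts for.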