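I have a Python script that cleans a large panel data set (2,000,000+ observations) and performs basic statistical calculations on it.
I found that some of these tasks are better suited to Stata, and I wrote a do-file with the necessary commands. I would therefore like to run the .do file from within my Python code. How can I call a .do file from Python?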
Rob*_*rer 11
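I think @user229552 points in the right direction. Python's subprocess module can be used. Below is an example that works on Linux.
Suppose you have a Python file named pydo.py with the following contents: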
import subprocess
## Do some processing in Python
## Set do-file information
dofile = "/home/roberto/Desktop/pyexample3.do"
cmd = ["stata", "do", dofile, "mpg", "weight", "foreign"]
## Run do-file
subprocess.call(cmd)
and a Stata do-file named pyexample3.do with the following contents:
clear all
set more off
local y `1'
local x1 `2'
local x2 `3'
display `"first parameter: `y'"'
display `"second parameter: `x1'"'
display `"third parameter: `x2'"'
sysuse auto
regress `y' `x1' `x2'
exit, STATA clear
Running pydo.py in a terminal window then works as expected.
You can also define a Python function and use that:
import subprocess

## Define a Python function to launch a do-file
def dostata(dofile, *params):
    ## Launch a do-file, given the full path to the do-file
    ## and a list of parameters.
    cmd = ["stata", "do", dofile]
    for param in params:
        cmd.append(param)
    return subprocess.call(cmd)

## Do some processing in Python
## Run a do-file
dostata("/home/roberto/Desktop/pyexample3.do", "mpg", "weight", "foreign")
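If the Python side needs to work with what Stata prints, one option is to capture the console output instead of letting it go straight to the terminal. The following is a minimal sketch, not part of the original answer: run_and_capture and build_stata_cmd are hypothetical helper names, and the sketch assumes Python 3.7+ and a console Stata that writes its results to standard output and exits at the end of the do-file (as pyexample3.do does with exit, STATA clear).

```python
import subprocess


def build_stata_cmd(dofile, *params):
    """Build the same command list that dostata uses."""
    return ["stata", "do", dofile, *params]


def run_and_capture(cmd):
    """Run a command list and return (return code, captured stdout text).

    Hypothetical helper for illustration. Assumption: the program
    writes its results to standard output and terminates on its own.
    """
    result = subprocess.run(cmd, capture_output=True, text=True)
    return result.returncode, result.stdout


# Example call (requires Stata on the system path, so it is left commented out):
# code, output = run_and_capture(
#     build_stata_cmd("/home/roberto/Desktop/pyexample3.do", "mpg", "weight", "foreign")
# )
```

The return code may not reflect errors raised inside the do-file, so inspecting the captured text is probably the safer check.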
The complete call from the terminal, with results:
roberto@roberto-mint ~/Desktop
$ python pydo.py
  ___  ____  ____  ____  ____ (R)
 /__    /   ____/   /   ____/
___/   /   /___/   /   /___/   12.1   Copyright 1985-2011 StataCorp LP
  Statistics/Data Analysis            StataCorp
                                      4905 Lakeway Drive
                                      College Station, Texas 77845 USA
                                      800-STATA-PC        http://www.stata.com
                                      979-696-4600        stata@stata.com
                                      979-696-4601 (fax)
Notes:
1. Command line editing enabled
. do /home/roberto/Desktop/pyexample3.do mpg weight foreign
. clear all
. set more off
.
. local y `1'
. local x1 `2'
. local x2 `3'
.
. display `"first parameter: `y'"'
first parameter: mpg
. display `"second parameter: `x1'"'
second parameter: weight
. display `"third parameter: `x2'"'
third parameter: foreign
.
. sysuse auto
(1978 Automobile Data)
. regress `y' `x1' `x2'
      Source |       SS       df       MS              Number of obs =      74
-------------+------------------------------           F(  2,    71) =   69.75
       Model |   1619.2877     2  809.643849           Prob > F      =  0.0000
    Residual |  824.171761    71   11.608053           R-squared     =  0.6627
-------------+------------------------------           Adj R-squared =  0.6532
       Total |  2443.45946    73  33.4720474           Root MSE      =  3.4071
------------------------------------------------------------------------------
         mpg |      Coef.   Std. Err.      t    P>|t|     [95% Conf. Interval]
-------------+----------------------------------------------------------------
      weight |  -.0065879   .0006371   -10.34   0.000    -.0078583   -.0053175
     foreign |  -1.650029   1.075994    -1.53   0.130      -3.7955    .4954422
       _cons |    41.6797   2.165547    19.25   0.000     37.36172    45.99768
------------------------------------------------------------------------------
.
. exit, STATA clear
Sources:
http://www.reddmetrics.com/2011/07/15/calling-stata-from-python.html
http://docs.python.org/2/library/subprocess.html
http://www.stata.com/support/faqs/unix/batch-mode/
Other routes to using Python and Stata together can be found at:
http://ideas.repec.org/c/boc/bocode/s457688.html
http://www.stata.com/statalist/archive/2013-08/msg01304.html
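This answer extends @Roberto Ferrer's answer and addresses a few problems I ran into.
Stata in the system path
For the stata command to run, Stata must be set up correctly in the system path (at least on Windows). For me, at least, this was not done automatically when Stata was installed, and I found the simplest fix was to put in the full path (for me, "C:\Program Files (x86)\Stata12\Stata-64"), i.e.: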
cmd = [r"C:\Program Files (x86)\Stata12\Stata-64", "do", dofile]
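One way to avoid hard-coding the path in every command is to look the executable up on the system path first and fall back to an explicit location only when that fails. This is a minimal sketch, assuming Python 3: find_stata is a hypothetical helper name, and the fallback is simply the path from this answer, so adjust it for your own installation.

```python
import shutil


def find_stata(name="stata", fallback=r"C:\Program Files (x86)\Stata12\Stata-64"):
    """Return the Stata executable to call.

    shutil.which returns the full path if `name` is found on the system
    path, or None otherwise. In the latter case the explicit fallback
    (an assumption, taken from this answer) is returned unchanged.
    """
    found = shutil.which(name)
    return found if found is not None else fallback


# The result can be used as the first element of the command list:
# cmd = [find_stata(), "do", dofile]
```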
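How to run the code quietly in the background
By adding the /e option, you can have the code run quietly in the background (i.e. without opening Stata each time):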
cmd = [r"C:\Program Files (x86)\Stata12\Stata-64", "/e", "do", dofile]
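Where the log file is stored
Finally, if you run quietly in the background, Stata will want to save a log file. It does this in cmd's working directory. That directory depends on where the code is run from; in my case, because I was executing Python from Notepad++, Stata tried to save the log file in C:\Program Files (x86)\Notepad++, to which it did not have write access. This can be changed by specifying the working directory when calling the subprocess.
These modifications to Roberto Ferrer's code result in: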
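import subprocess

def dostata(dofile, *params):
    ## Full path to the Stata executable; /e runs it quietly in the background
    cmd = [r"C:\Program Files (x86)\Stata12\Stata-64", "/e", "do", dofile]
    for param in params:
        cmd.append(param)
    ## cwd sets the directory in which Stata saves its log file
    return subprocess.call(cmd, cwd=r'C:\location_to_save_log_files')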
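After a quiet run the results end up in the log file rather than on screen, so the Python side will usually want to read that file. The sketch below assumes that Stata names the log after the do-file (for example pyexample3.log for pyexample3.do) and writes it to the working directory passed as cwd. That naming is my understanding of Stata's batch mode rather than something stated in this answer, so verify it on your own setup. log_path_for and read_log are hypothetical helper names.

```python
from pathlib import Path


def log_path_for(dofile, workdir):
    """Guess where Stata wrote the log for `dofile` when run with cwd=workdir.

    Assumption: the log is named <do-file name without extension>.log.
    """
    return Path(workdir) / (Path(dofile).stem + ".log")


def read_log(dofile, workdir):
    """Return the log text, or None if no log exists at the guessed path."""
    path = log_path_for(dofile, workdir)
    if not path.is_file():
        return None
    # errors="replace" avoids a crash if the log contains unexpected bytes
    return path.read_text(errors="replace")
```

Returning None instead of raising keeps the caller free to decide whether a missing log means the run failed or the log simply went somewhere else.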