rob*_*ntw 5 python text idioms
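I have an output file from a piece of legacy software, shown below. I want to extract values from it so that, for example, I can set a variable called direct_solar_irradiance to 648.957 and another for the target ground pressure to 1013.00.
So far I have been pulling out individual lines and processing them as shown below (repeated many times, once for each value I want to extract):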
values = lines[97].split()
self.irradiance_direct, self.irradiance_diffuse, self.irradiance_env = values
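However, I have now found that when certain parameters are selected, extra lines are added in the middle of the output. That of course means line 97 will no longer contain the values I need.
Given that extra lines may be added to the output in some cases, is there a nice Pythonic way to extract these values? I think I need to search the file for known pieces of text and then extract the numbers they refer to, but the only ways I can think of to do that are very clunky.
So:
Is there a nice Pythonic way to search for these strings and extract the values I want?
If not, is there some other sensible way to do this? (For example, some cool text-file parsing library that I know nothing about.)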
******************************* 6sV version 1.0B ******************************
* *
* geometrical conditions identity *
* ------------------------------- *
* user defined conditions *
* *
* month: 14 day : 1 *
* solar zenith angle: 10.00 deg solar azimuthal angle: 20.00 deg *
* view zenith angle: 30.00 deg view azimuthal angle: 40.00 deg *
* scattering angle: 159.14 deg azimuthal angle difference: 20.00 deg *
* *
* atmospheric model description *
* ----------------------------- *
* atmospheric model identity : *
* midlatitude summer (uh2o=2.93g/cm2,uo3=.319cm-atm) *
* aerosols type identity : *
* Maritime aerosol model *
* optical condition identity : *
* visibility : 8.49 km opt. thick. 550 nm : 0.5000 *
* *
* spectral condition *
* ------------------ *
* monochromatic calculation at wl 0.400 micron *
* *
* Surface polarization parameters *
* ---------------------------------- *
* *
* *
* Surface Polarization Q,U,Rop,Chi 0.00000 0.00000 0.00000 0.00 *
* *
* *
* target type *
* ----------- *
* homogeneous ground *
* monochromatic reflectance 1.000 *
* *
* target elevation description *
* ---------------------------- *
* ground pressure [mb] 1013.00 *
* ground altitude [km] 0.000 *
* *
* plane simulation description *
* ---------------------------- *
* plane pressure [mb] 1013.00 *
* plane altitude absolute [km] 0.000 *
* atmosphere under plane description: *
* ozone content 0.000 *
* h2o content 0.000 *
* aerosol opt. thick. 550nm 0.000 *
* *
* atmospheric correction activated *
* -------------------------------- *
* BRDF coupling correction *
* input apparent reflectance : 0.500 *
* *
*******************************************************************************
*******************************************************************************
* *
* integrated values of : *
* -------------------- *
* *
* apparent reflectance 1.1287696 appar. rad.(w/m2/sr/mic) 588.646 *
* total gaseous transmittance 1.000 *
* *
*******************************************************************************
* *
* coupling aerosol -wv : *
* -------------------- *
* wv above aerosol : 1.129 wv mixed with aerosol : 1.129 *
* wv under aerosol : 1.129 *
*******************************************************************************
* *
* integrated values of : *
* -------------------- *
* *
* app. polarized refl. 0.0000 app. pol. rad. (w/m2/sr/mic) 0.000 *
* direction of the plane of polarization 0.00 *
* total polarization ratio 0.000 *
* *
*******************************************************************************
* *
* int. normalized values of : *
* --------------------------- *
* % of irradiance at ground level *
* % of direct irr. % of diffuse irr. % of enviro. irr *
* 0.351 0.354 0.295 *
* reflectance at satellite level *
* atm. intrin. ref. background ref. pixel reflectance *
* 0.000 0.000 1.129 *
* *
* int. absolute values of *
* ----------------------- *
* irr. at ground level (w/m2/mic) *
* direct solar irr. atm. diffuse irr. environment irr *
* 648.957 655.412 544.918 *
* rad at satel. level (w/m2/sr/mic) *
* atm. intrin. rad. background rad. pixel radiance *
* 0.000 0.000 588.646 *
* *
* *
* sol. spect (in w/m2/mic) *
* 1663.594 *
* *
*******************************************************************************
*******************************************************************************
* *
* integrated values of : *
* -------------------- *
* *
* downward upward total *
* global gas. trans. : 1.00000 1.00000 1.00000 *
* water " " : 1.00000 1.00000 1.00000 *
* ozone " " : 1.00000 1.00000 1.00000 *
* co2 " " : 1.00000 1.00000 1.00000 *
* oxyg " " : 1.00000 1.00000 1.00000 *
* no2 " " : 1.00000 1.00000 1.00000 *
* ch4 " " : 1.00000 1.00000 1.00000 *
* co " " : 1.00000 1.00000 1.00000 *
* *
* *
* rayl. sca. trans. : 0.84422 1.00000 0.84422 *
* aeros. sca. " : 0.94572 1.00000 0.94572 *
* total sca. " : 0.79616 1.00000 0.79616 *
* *
* *
* *
* rayleigh aerosols total *
* *
* spherical albedo : 0.23410 0.12354 0.29466 *
* optical depth total: 0.36193 0.55006 0.91199 *
* optical depth plane: 0.00000 0.00000 0.00000 *
* reflectance I : 0.00000 0.00000 0.00000 *
* reflectance Q : 0.00000 0.00000 0.00000 *
* reflectance U : 0.00000 0.00000 0.00000 *
* polarized reflect. : 0.00000 0.00000 0.00000 *
* degree of polar. : nan 0.00 nan *
* dir. plane polar. : -45.00 -45.00 -45.00 *
* phase function I : 1.38819 0.27621 0.71751 *
* phase function Q : -0.09117 -0.00856 -0.04134 *
* phase function U : -1.34383 0.02142 -0.52039 *
* primary deg. of pol: -0.06567 -0.03099 -0.05762 *
* sing. scat. albedo : 1.00000 0.98774 0.99261 *
* *
* *
*******************************************************************************
*******************************************************************************
*******************************************************************************
* atmospheric correction result *
* ----------------------------- *
* input apparent reflectance : 0.500 *
* measured radiance [w/m2/sr/mic] : 260.747 *
* atmospherically corrected reflectance *
* Lambertian case : 0.52995 *
* BRDF case : 0.52995 *
* coefficients xa xb xc : 0.00241 0.00000 0.29466 *
* y=xa*(measured radiance)-xb; acr=y/(1.+xc*y) *
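You can define your own mini-language to drive the extraction automatically. I did the following to parse the output of a proprietary program: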
# Tokens are matched in the order written here.
tokens = ["num_ref_frames", "Max QP", "Min QP", "Avg QP", "I4x4",
          "I16x16", "SkipZero", "SkipMV", "16x16", "16x8", "8x16",
          "8x8", "8x4", "4x8", "4x4"]
# Variables whose values span several lines and need separate handling.
special = ["Quarterpel MVs"]
# Map each search string from `tokens` to a list whose first element is the
# index of the field to extract from a matching line
# (0 = 1st field, 1 = 2nd field, 2 = 3rd field, etc.).
# Named `field_index` rather than `dict` to avoid shadowing the built-in.
field_index = {token: [1] for token in tokens[:4]}
field_index.update({token: [2] for token in tokens[4:]})
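Then I simply loop over the input and check each line against the entries in tokens. If a match is found, I split the line and use the dictionary entry to pick out the correct field.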
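The loop described above could look roughly like the following sketch. It is not the code from the gist: the function name `extract_fields`, the sample lines, and the choice to count field indices from the text that follows the token are all assumptions made for illustration, using lines in the style of the 6S output from the question.

```python
def extract_fields(lines, field_index):
    """Scan lines for known tokens and pull out one numeric field per token.

    field_index maps a search string to the index of the wanted field among
    the whitespace-separated fields that FOLLOW the token on the same line
    (an assumption of this sketch). The first match that yields a value wins.
    """
    found = {}
    for raw in lines:
        # Collapse runs of whitespace so tokens match regardless of padding.
        line = " ".join(raw.split())
        for token, index in field_index.items():
            if token in found or token not in line:
                continue
            rest = line.split(token, 1)[1].split()
            if index < len(rest):
                # float() raises ValueError if the field is not numeric.
                found[token] = float(rest[index])
    return found


# Hypothetical token table for a few values in the 6S output.
field_index = {
    "ground pressure [mb]": 0,
    "solar zenith angle:": 0,
    "solar azimuthal angle:": 0,
}

sample = [
    "*   ground pressure  [mb]     1013.00                  *",
    "*   solar zenith angle:  10.00 deg  solar azimuthal angle:  20.00 deg  *",
]

values = extract_fields(sample, field_index)
ground_pressure = values["ground pressure [mb]"]
```

Because the lookup is keyed on text rather than on a line number, extra lines inserted elsewhere in the output do not affect it.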
The special list defined earlier handles a special variable that has to be read across multiple lines.
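How the multi-line case is handled is not shown here, so the following is only one possible sketch. It assumes the layout seen in the 6S output from the question, where a header line is followed directly by a line of numbers; the helper name `values_after` and its `offset` parameter are hypothetical.

```python
def values_after(lines, header, offset=1):
    """Find the first line containing `header` and parse the numbers on the
    line `offset` lines below it (assumed layout: header row, then values)."""
    for i, raw in enumerate(lines):
        # Collapse whitespace so the header matches regardless of padding.
        if header in " ".join(raw.split()):
            target = lines[i + offset]
            # Drop the decorative '*' border before splitting into fields.
            return [float(x) for x in target.strip("* \t\n").split()]
    raise ValueError(f"header not found: {header!r}")


sample = [
    "*   irr. at ground level (w/m2/mic)                     *",
    "*   direct solar irr.   atm. diffuse irr.   environment irr  *",
    "*        648.957            655.412             544.918      *",
]

direct, diffuse, env = values_after(sample, "direct solar irr.")
```

Searching for the header text instead of a fixed line number keeps this working when optional lines appear earlier in the file.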
Update
Clone git://gist.github.com/1037403.git to get a copy of the code.
usage:
./parser.py all_dec.txt
Hope this helps!