Naz*_*nin 2 python xml terminal semantics syntaxnet
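I have downloaded and trained SyntaxNet, and I am trying to write a program that can open new or existing files (for example, AutoCAD files) and save them in a specific directory by analyzing a text command such as "open LibreOffice file X". Consider this SyntaxNet output: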
echo "save AUTOCAD file X in directory Y" | ./test.sh > output.txt
Input: save AUTOCAD file X in directory Y
Parse:
save VB ROOT
 +-- X NNP dobj
 |   +-- file NN compound
 |   +-- AUTOCAD CD nummod
 +-- directory NN nmod
     +-- in IN case
     +-- Y CD nummod
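First, I thought about converting the parsed text to XML, then parsing the XML with semantic analysis (for example, SPARQL) to find ROOT = save, dobj = X and nummod = Y, and writing a Python program that carries out the action described in the text.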
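As a rough illustration of that lookup, the tree printed by SyntaxNet can be scanned directly, without an XML step, by stripping the branch markers and reading each line as word, tag, relation. This is only a minimal sketch: `parse_tree_lines` and `find_by_relation` are hypothetical helper names, and the sample text is the parse shown above.

```python
# The parse from the question, as printed by SyntaxNet.
TREE = """\
save VB ROOT
 +-- X NNP dobj
 |   +-- file NN compound
 |   +-- AUTOCAD CD nummod
 +-- directory NN nmod
     +-- in IN case
     +-- Y CD nummod
"""


def parse_tree_lines(text):
    """Return (word, tag, relation) triples, ignoring the tree-drawing characters."""
    triples = []
    for line in text.splitlines():
        fields = line.replace("|", " ").replace("+--", " ").split()
        if len(fields) == 3:
            triples.append(tuple(fields))
    return triples


def find_by_relation(triples, relation):
    """Return every word whose dependency relation equals `relation`."""
    return [word for word, _tag, rel in triples if rel == relation]


triples = parse_tree_lines(TREE)
root = find_by_relation(triples, "ROOT")
dobj = find_by_relation(triples, "dobj")
nummod = find_by_relation(triples, "nummod")
print(root, dobj, nummod)
```

Both AUTOCAD and Y carry the nummod label here, so the relation label alone does not identify the directory; the head of each word is needed as well.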
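What I don't know is whether this is the right approach: converting the parsed text to XML and then querying it so that the ROOT is matched to a corresponding function or script that saves the dobj in the directory named by the nummod.
I have some idea of how to connect Python to the terminal with the subprocess package, but I have not found anything that helps me save, for example, an AUTOCAD file (or any other file) from the terminal. Or do I need to write a .sh script with the help of Python?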
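On the subprocess point, Python can start a program directly from an argument list, so a separate .sh script is not strictly needed for that step. A minimal sketch, assuming a hypothetical helper `run_command`; the child process here is the Python interpreter itself so that the example runs anywhere, and in practice the list would hold the application name, its options and the file path.

```python
import subprocess
import sys


def run_command(argv):
    """Run a program given as an argument list (no shell) and return (exit code, stdout)."""
    completed = subprocess.run(argv, stdout=subprocess.PIPE, universal_newlines=True)
    return completed.returncode, completed.stdout


# Stand-in command: ask the current Python interpreter to print one word.
code, out = run_command([sys.executable, "-c", "print('saved')"])
print(code, out.strip())
```

Passing a list rather than a single string avoids shell quoting problems when a file name contains spaces.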
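I have done a lot of research on syntactic and semantic analysis of text, such as Christian Chiarcos (2011), Hunter and Cohen (2006) and Verspoor et al. (2015), and I have also looked at Microsoft Cortana, Sirius and Google, but none of them explains in detail how the parsed text is turned into an executable command. This leads me to conclude that the task is considered too easy to be worth discussing, but since I do not have a computer science background, I cannot figure out what to do.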
Naz*_*nin 10
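I am a beginner in computer science and in SyntaxNet. I wrote a simple SyntaxNet-Python algorithm that uses SyntaxNet to analyze a text command entered by the user, "open the file book which I wrote with the lab writer with LibreOffice writer", and then analyzes the SyntaxNet output with a Python algorithm to turn it into an executable command. In this case the command opens a file, in any supported format, with LibreOffice in a Linux (Ubuntu 14.04) environment. LibreOffice defines different command-line options for launching the different applications in the package (linked in the original post).
After installing and running SyntaxNet (the installation process is linked in the original post), open the shell script demo.sh in the ~/models/syntaxnet/syntaxnet/ directory and delete the conll2tree function (lines 54 to 56) to get tab-delimited output from SyntaxNet instead of tree-format output.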
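Type this command in the terminal window:
echo 'open the file book which I wrote with the lab writer with libreOffice writer' | syntaxnet/demo.sh > output.txt
The output.txt document is saved in the directory where demo.sh is located, and it looks something like the figure below:
[Figure: tab-delimited SyntaxNet output in output.txt]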
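For orientation, the tab-delimited output follows the CoNLL column layout, and the script relies on columns 0 (token ID), 1 (word), 3 (coarse POS tag), 6 (head ID) and 7 (dependency relation). The sketch below reads rows of that shape; the three sample rows are invented for illustration and are not actual SyntaxNet output, and `read_conll` is a hypothetical name.

```python
import csv
import io

# Invented sample rows in CoNLL layout (NOT real SyntaxNet output).
SAMPLE = (
    "1\topen\t_\tVERB\tVB\t_\t0\tROOT\t_\t_\n"
    "2\tthe\t_\tDET\tDT\t_\t3\tdet\t_\t_\n"
    "3\tbook\t_\tNOUN\tNN\t_\t1\tdobj\t_\t_\n"
    "\n"
)

# Column positions used by the script.
ID, FORM, CPOS, HEAD, DEPREL = 0, 1, 3, 6, 7


def read_conll(stream):
    """Read tab-delimited rows and skip the blank line that ends a sentence."""
    reader = csv.reader(stream, delimiter="\t", quoting=csv.QUOTE_NONE)
    return [row for row in reader if row]


rows = read_conll(io.StringIO(SAMPLE))
root = next(r for r in rows if r[DEPREL] == "ROOT")
# Direct objects are the rows whose head is the root token.
objects = [r[FORM] for r in rows if r[HEAD] == root[ID] and r[DEPREL] == "dobj"]
print(root[FORM], objects)
```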
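Take output.txt as the input file and use the Python algorithm below to analyze the SyntaxNet output and identify the file name, the target application from the LibreOffice package, and the command the user wants to run.
#!/usr/bin/env python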
import csv
import subprocess
import sys
import os
#get SyntaxNet output as the Python algorithm input file
filename='/home/username/models/syntaxnet/work/output.txt'
# all possible commands for opening a file of any supported format with LibreOffice
commands={
    ('open', 'libreoffice', 'writer'): ('libreoffice', '--writer'),
    ('open', 'libreoffice', 'calculator'): ('libreoffice', '--calc'),
    ('open', 'libreoffice', 'draw'): ('libreoffice', '--draw'),
    ('open', 'libreoffice', 'impress'): ('libreoffice', '--impress'),
    ('open', 'libreoffice', 'math'): ('libreoffice', '--math'),
    ('open', 'libreoffice', 'global'): ('libreoffice', '--global'),
    ('open', 'libreoffice', 'web'): ('libreoffice', '--web'),
    ('open', 'libreoffice', 'show'): ('libreoffice', '--show'),
}
# all possible synonyms for each LibreOffice application
comments={
    'writer': ['word','text','writer'],
    'calculator': ['excel','calc','calculator'],
    'draw': ['paint','draw','drawing'],
    'impress': ['powerpoint','impress'],
    'math': ['mathematic','calculator','math'],
    'global': ['global'],
    'web': ['html','web'],
    'show': ['presentation','show']
}
root='ROOT' # ROOT of the sentence
noun='NOUN' # noun tag
verb='VERB' # verb tag
adjmod='amod' # adjectival modifier
dirobj='dobj' # direct object
apposmod='appos' # appositional modifier
prepos_obj='pobj' # object of a preposition
app='libreoffice' # name of the package
preposition='prep' # preposition
noun_modi='nn' # noun compound modifier
# read the tab-delimited SyntaxNet output into a list of rows
def readata(filename):
    with open(filename,'r') as f:
        lines=f.readlines()
    lines=lines[:-1] # drop the last line of the output
    data=csv.reader(lines,delimiter='\t')
    lol=list(data)
    return lol
# identifies the action, the name of the file, and whether the user mentioned the application implicitly
def exe(root,noun,verb,adjmod,dirobj,apposmod,commands,noun_modi):
    interprete='null'
    action='null'
    direct_object='null'
    dep_num_obj=None
    lists=readata(filename)
    for sublist in lists:
        if sublist[7]==root and sublist[3]==verb: # when the ROOT is a verb, the dobj is probably the name of the file
            action=sublist[1]
            dep_num=sublist[0]
            for sublist in lists:
                if sublist[6]==dep_num and sublist[7]==dirobj:
                    direct_object=sublist[1]
                    dep_num=sublist[0]
                    dep_num_obj=sublist[0]
            for sublist in lists:
                if direct_object=='file' and sublist[6]==dep_num_obj and sublist[7]==apposmod:
                    direct_object=sublist[1]
                elif direct_object=='file' and sublist[6]==dep_num_obj and sublist[7]==adjmod:
                    direct_object=sublist[1]
            for sublist in lists:
                if sublist[6]==dep_num_obj and sublist[7]==adjmod:
                    for key, v in comments.items():
                        if sublist[1] in v:
                            interprete=key
            for sublist in lists:
                if sublist[6]==dep_num_obj and sublist[7]==noun_modi:
                    dep_num_nn=sublist[0]
                    for key, v in comments.items():
                        if sublist[1] in v:
                            interprete=key
                    print(interprete)
                    if interprete=='null':
                        for sublist in lists:
                            if sublist[6]==dep_num_nn and sublist[7]==noun_modi:
                                for key, v in comments.items():
                                    if sublist[1] in v:
                                        interprete=key
        elif sublist[7]==root and sublist[3]==noun: # the ROOT is a noun: look for the adjective-form word that depends on it
            dep_num=sublist[0]
            dep_num_obj=sublist[0]
            direct_object=sublist[1]
            for sublist in lists:
                if sublist[6]==dep_num and sublist[7]==adjmod:
                    actionis=any(t1==sublist[1] for (t1, t2, t3) in commands)
                    if actionis:
                        action=sublist[1]
                elif sublist[6]==dep_num and sublist[7]==noun_modi:
                    dep_num=sublist[0]
                    for sublist in lists:
                        if sublist[6]==dep_num and sublist[7]==adjmod:
                            if any(t1==sublist[1] for (t1, t2, t3) in commands):
                                action=sublist[1]
            for sublist in lists:
                if direct_object=='file' and sublist[6]==dep_num_obj and sublist[7]==apposmod and sublist[1]!=action:
                    direct_object=sublist[1]
                if direct_object=='file' and sublist[6]==dep_num_obj and sublist[7]==adjmod and sublist[1]!=action:
                    direct_object=sublist[1]
            for sublist in lists:
                if sublist[6]==dep_num_obj and sublist[7]==noun_modi:
                    dep_num_obj=sublist[0]
                    for key, v in comments.items():
                        if sublist[1] in v:
                            interprete=key
            else: # for/else: runs once the loop above has finished (it has no break)
                for sublist in lists:
                    if sublist[6]==dep_num_obj and sublist[7]==noun_modi:
                        for key, v in comments.items():
                            if sublist[1] in v:
                                interprete=key
    return action, direct_object, interprete
action, direct_object, interprete = exe(root,noun,verb,adjmod,dirobj,apposmod,commands,noun_modi)
# find the application (we assume the user wants LibreOffice, but we do not know which sub-application should be used)
def application(app,prepos_obj,preposition,noun_modi):
    lists=readata(filename)
    subapp='not mentioned'
    for sublist in lists:
        if sublist[1]==app:
            dep_num=sublist[6]
            for sublist in lists:
                if sublist[0]==dep_num and sublist[7]==prepos_obj:
                    actioni=any(t3==sublist[1] for (t1, t2, t3) in commands)
                    if actioni:
                        subapp=sublist[1]
                    else:
                        for sublist in lists:
                            if sublist[6]==dep_num and sublist[7]==noun_modi:
                                actioni=any(t3==sublist[1] for (t1, t2, t3) in commands)
                                if actioni:
                                    subapp=sublist[1]
                elif sublist[0]==dep_num and sublist[7]==preposition:
                    sublist[6]=dep_num
                    for subline in lists:
                        if subline[0]==dep_num and subline[7]==prepos_obj:
                            if any(t3==sublist[1] for (t1, t2, t3) in commands):
                                subapp=sublist[1]
                            else:
                                for subline in lists:
                                    if subline[0]==dep_num and subline[7]==noun_modi:
                                        if any(t3==sublist[1] for (t1, t2, t3) in commands):
                                            subapp=sublist[1]
    return subapp
sub_application=application(app,prepos_obj,preposition,noun_modi)
# fall back to the implicitly mentioned application (which may itself be 'null')
if sub_application=='not mentioned':
    sub_application=interprete
# append the file extension that matches the sub-application
def format_function(sub_application):
    subapp=sub_application
    Dobj=exe(root,noun,verb,adjmod,dirobj,apposmod,commands,noun_modi)[1]
    if subapp!='null':
        if subapp=='writer':
            a='.odt'
            Dobj=Dobj+a
        elif subapp=='calculator':
            a='.ods'
            Dobj=Dobj+a
        elif subapp=='impress':
            a='.odp'
            Dobj=Dobj+a
        elif subapp=='draw':
            a='.odg'
            Dobj=Dobj+a
        elif subapp=='math':
            a='.odf'
            Dobj=Dobj+a
        elif subapp=='web':
            a='.html'
            Dobj=Dobj+a
    else:
        Dobj='null'
    return Dobj
def get_filepaths(directory):
    myfile=format_function(sub_application)
    file_paths = [] # List which will store all of the full filepaths.
    # Walk the tree.
    for root, directories, files in os.walk(directory):
        for filename in files:
            # Keep only files whose name matches, and store their full path.
            if filename==myfile:
                filepath = os.path.join(root, filename)
                file_paths.append(filepath) # Add it to the list.
    return file_paths
# Run the above function and store its results in a variable.
full_file_paths = get_filepaths("/home/ubuntu/")
path=None
if full_file_paths==[]:
    print('No file with name %s is found' % format_function(sub_application))
else:
    path=full_file_paths[0]
    prompt='> '
    if len(full_file_paths) >1:
        print(full_file_paths)
        print('which %s do you mean?' % sub_application)
        inputname=raw_input(prompt) # raw_input is Python 2; use input() on Python 3
        if inputname in full_file_paths:
            path=inputname
# the main code structure
if sub_application=='null':
    print("The sub application is not mentioned clearly")
elif path is not None:
    command=commands[action,app,sub_application]
    subprocess.call([command[0],command[1],path])
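The extension step in format_function can also be expressed as a table lookup. The following is a minimal sketch of that variant, not the code above: `EXTENSIONS` and `with_extension` are hypothetical names, and the extensions are the ones the script already uses.

```python
# Extension for each sub-application, taken from the branches of format_function.
EXTENSIONS = {
    "writer": ".odt",
    "calculator": ".ods",
    "impress": ".odp",
    "draw": ".odg",
    "math": ".odf",
    "web": ".html",
}


def with_extension(name, sub_application):
    """Append the extension for the sub-application; 'null' means none was recognised."""
    if sub_application == "null":
        return "null"
    # Sub-applications without an entry leave the name unchanged, as in the script.
    return name + EXTENSIONS.get(sub_application, "")


print(with_extension("book", "writer"))
```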
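Again, I am a beginner, and the code may not look tidy or professional; I simply tried to put what I know about this fascinating SyntaxNet to practical use.
This simple algorithm:
- can open a file in any format supported by LibreOffice, e.g. .odt, .odf, .ods, .html, .odp;
- can understand implicit references to the different LibreOffice applications, for example "open the text file book with libreoffice" instead of "open the file book with libreoffice writer";
- can work around the problem of SyntaxNet interpreting a file name as an adjective.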
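The implicit-reference behaviour rests on the synonym table: a word such as "text" is mapped back to the sub-application it stands for. A minimal sketch using a reversed index, assuming the same synonym lists as in the script; `build_index` and `resolve` are hypothetical names.

```python
# Synonym lists copied from the script.
comments = {
    "writer": ["word", "text", "writer"],
    "calculator": ["excel", "calc", "calculator"],
    "draw": ["paint", "draw", "drawing"],
    "impress": ["powerpoint", "impress"],
    "math": ["mathematic", "calculator", "math"],
    "global": ["global"],
    "web": ["html", "web"],
    "show": ["presentation", "show"],
}


def build_index(synonyms):
    """Map each synonym to the list of sub-applications it can refer to."""
    index = {}
    for application, words in synonyms.items():
        for word in words:
            index.setdefault(word, []).append(application)
    return index


def resolve(word, index):
    """Return the candidate sub-applications for a word, sorted by name."""
    return sorted(index.get(word.lower(), []))


index = build_index(comments)
print(resolve("text", index))
print(resolve("calculator", index))
```

The word "calculator" is listed under both calculator and math, so this lookup returns two candidates, whereas the script keeps whichever matching key it visits last.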