Mat*_*tan 4 parsing pcap python-2.7
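I want to produce a list of all domain names and their corresponding IP addresses from a pcap file, using the dpkt library.
My code is mostly based on this: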
import socket
import dpkt

filename = raw_input('Type filename of pcap file (without extension): ')
path = 'c:/temp/PcapParser/' + filename + '.pcap'
f = open(path, 'rb')
pcap = dpkt.pcap.Reader(f)

for ts, buf in pcap:
    # make sure we are dealing with IP traffic
    try:
        eth = dpkt.ethernet.Ethernet(buf)
    except Exception:
        continue
    if eth.type != 2048:  # 0x0800, EtherType for IPv4
        continue
    # make sure we are dealing with UDP protocol
    try:
        ip = eth.data
    except Exception:
        continue
    if ip.p != 17:  # IP protocol number for UDP
        continue
    # filter on UDP assigned ports for DNS
    try:
        udp = ip.data
    except Exception:
        continue
    if udp.sport != 53 and udp.dport != 53:
        continue
    # make the dns object out of the udp data and
    # check for it being a RR (answer) and for opcode QUERY
    try:
        dns = dpkt.dns.DNS(udp.data)
    except Exception:
        continue
    if dns.qr != dpkt.dns.DNS_R:
        continue
    if dns.opcode != dpkt.dns.DNS_QUERY:
        continue
    if dns.rcode != dpkt.dns.DNS_RCODE_NOERR:
        continue
    if len(dns.an) < 1:
        continue
    # process and print responses based on record type
    for answer in dns.an:
        if answer.type == 1:  # DNS_A
            print 'Domain Name: ', answer.name, '\tIP Address: ', socket.inet_ntoa(answer.rdata)
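The problem is that answer.name is not good enough for me, because I need the original domain name that was requested, not its CNAME representation. For example, one of the original DNS requests was for www.paypal.com, but its CNAME representation is paypal.112.2o7.net.
I went over the code more carefully and realized that it actually extracts its information from the DNS response, not from the query. I then looked at the response packet in Wireshark and saw that the original domain appears under both "Queries" and "Answers", so my question is: how do I extract it?
Thanks!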
To get the name from the "Questions" section of the DNS response, which dpkt.dns exposes through the dns.qd object, all I had to do was this:
for qname in dns.qd: print qname.name
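To tie this back to the original goal (a domain name paired with its IP addresses), here is a minimal sketch of how the queried name from dns.qd could be combined with the A records in dns.an. It assumes the response carries a single question, which is the usual case. The Question, Answer and Response tuples and the queried_name_to_ips helper are hypothetical stand-ins so the snippet runs without a capture file; they only mimic the name, type, rdata, qd and an attributes used above, and the address is a documentation placeholder rather than real data.

```python
import socket
from collections import namedtuple

DNS_A = 1      # record type for an IPv4 address
DNS_CNAME = 5  # record type for a canonical-name alias

# Hypothetical stand-ins that mimic only the attributes used in this thread.
Question = namedtuple('Question', 'name')
Answer = namedtuple('Answer', 'name type rdata')
Response = namedtuple('Response', 'qd an')


def queried_name_to_ips(dns):
    """Return (queried_name, [ip, ...]) for one DNS response object."""
    if not dns.qd:
        return None, []
    # Assumption: one question per response, so take the first entry.
    queried = dns.qd[0].name
    # Keep only A records; CNAME answers carry a name, not an address.
    ips = [socket.inet_ntoa(a.rdata) for a in dns.an if a.type == DNS_A]
    return queried, ips


# Made-up response shaped like the www.paypal.com example in the question.
response = Response(
    qd=[Question('www.paypal.com')],
    an=[Answer('www.paypal.com', DNS_CNAME, b''),
        Answer('paypal.112.2o7.net', DNS_A, b'\xc0\x00\x02\x01')],
)

name, ips = queried_name_to_ips(response)
print('Domain Name: %s\tIP Address: %s' % (name, ', '.join(ips)))
```

In the loop from the question, the same helper could be called on the parsed dns object in place of printing answer.name, so that the CNAME target never shows up in the output.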