joc*_*ode 6 python linux udp amazon-ec2 docker
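When sending UDP packets from Docker containers on EC2, I sometimes get this strange error (not every message sent raises the exception). It never happens on our internal cluster running OpenNebula. I have allowed all inbound/outbound traffic on every port for all EC2 instances. Here is the exception: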
2017-01-19 10:01:53,170 - ERROR: Exception caught for address: 10.99.0.153
Traceback (most recent call last):
  File "./server.py", line 56, in <module>
    sock.sendto(bytes('{}'.format(i), "utf-8"), (address, PORT))
OSError: [Errno 22] Invalid argument
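I run 5 c4.xlarge instances with Ubuntu Server 16.04 and Docker 1.12.6. They are all in the same Docker swarm.
I create the service and a network subnet using the overlay driver. The service has a mount point used to collect the logs from each peer. I run 150 peers, each with a memory limit of 300MB.
My Dockerfile: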
FROM debian:jessie
RUN echo 'deb http://mirror.switch.ch/ftp/mirror/debian/ jessie-backports main' >> /etc/apt/sources.list && \
    apt-get -yqq update && \
    apt-get -yqq dist-upgrade && \
    apt-get -yqq install --no-install-recommends dnsutils wget curl ntp python3 && \
    apt-get -yqq clean
CMD ["/opt/epto/container-start-script.sh"]
I use the following shell script as my CMD:
#!/usr/bin/env bash
MY_IP_ADDR=$(/bin/hostname -i)
MY_IP_ADDR=($MY_IP_ADDR)
./server.py ${MY_IP_ADDR[0]}
Here is the actual Python script that runs:
#!/usr/bin/env python3
import socketserver
import sys
import logging
import threading
import urllib.request
import time
import socket
from random import randint

PORT = 15342


class MyUDPHandler(socketserver.BaseRequestHandler):
    """
    This class works similar to the TCP handler class, except that
    self.request consists of a pair of data and client socket, and since
    there is no connection the client address must be given explicitly
    when sending data back via sendto().
    """

    def handle(self):
        data = self.request[0].strip().decode("utf-8")
        logging.info("Message received from {} during loop {}".format(self.client_address[0], data))


class ThreadedUDPServer(socketserver.ThreadingMixIn, socketserver.UDPServer):
    pass


if __name__ == "__main__":
    HOST = sys.argv[1]
    logging.basicConfig(format='%(asctime)s - %(levelname)s: %(message)s', level=logging.INFO,
                        filename='/data/{}.test'.format(HOST))
    server = ThreadedUDPServer((HOST, PORT), MyUDPHandler)
    server.allow_reuse_address = True
    logging.info("Create server listening on {}:{}".format(HOST, PORT))
    logging.info("Server allow_reuse_address: {}".format(server.allow_reuse_address))
    server_thread = threading.Thread(target=server.serve_forever)
    server_thread.daemon = True
    server_thread.start()

    sleep_delay = randint(10, 180)
    logging.info("Sleeping for {}s".format(sleep_delay))
    time.sleep(sleep_delay)
    logging.info("Finished sleeping")

    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    content = urllib.request.urlopen('http://epto-tracker:4321/REST/v1/admin/get_view').read()
    content = content.decode("utf-8")
    addresses = content.split('|')
    logging.info("View size: {}".format(len(addresses)))

    i = 0
    while True:
        logging.info("Loop {}".format(i))
        for address in addresses:
            try:
                logging.info("Sending to {}".format(address))
                sock.sendto(bytes('{}'.format(i), "utf-8"), (address, PORT))
            except:
                logging.exception("Exception caught for address: {}".format(address))
        time.sleep(5)
        i += 1
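To narrow down which destinations fail and with which error code, the send in the loop above could be wrapped in a small helper. This is only a minimal sketch for diagnosis: `send_with_diagnostics` is a hypothetical name that is not part of the original server.py, and it changes nothing about why the kernel rejects the datagram.

```python
import errno
import logging


def send_with_diagnostics(sock, payload, address, port):
    """Send one UDP datagram.

    Returns None on success, or the errno value if sendto() raised OSError.
    Hypothetical helper for illustration; not part of the original script.
    """
    try:
        sock.sendto(payload, (address, port))
        return None
    except OSError as exc:
        # errno.errorcode maps numeric codes to symbolic names, e.g. 22 -> 'EINVAL'.
        name = errno.errorcode.get(exc.errno, "UNKNOWN")
        logging.error("sendto(%s:%d) failed: errno=%s (%s): %s",
                      address, port, exc.errno, name, exc.strerror)
        return exc.errno
```

In the sending loop, the `try`/`except` body would then become a single call such as `send_with_diagnostics(sock, bytes('{}'.format(i), "utf-8"), address, PORT)`, and the returned codes could be counted per address to see whether the failures cluster on particular peers.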
I created a second service on the same overlay network. It contains the tracker, which the nodes contact to get a view of the network:
Dockerfile:
FROM python:3.5.2-alpine
RUN pip install pydevd
COPY tracker.py /code/
WORKDIR /code
EXPOSE 4321
CMD [ "python", "./tracker.py" ]
The code file:
# import pydevd
import random
import logging
import time
from http.server import HTTPServer, BaseHTTPRequestHandler

available_peers = {}
K = 25
logging.basicConfig(format='%(levelname)s: %(message)s', level=logging.INFO)


def florida_string(ip):
    available_peers[ip] = int(time.time())
    to_choose = list(available_peers.keys())
    logging.info("View size: {:d}".format(len(to_choose)))
    to_choose.remove(ip)
    if len(to_choose) > K:
        to_send = random.sample(to_choose, K)
    else:
        to_send = to_choose
    # Note: as posted, to_send is computed but the full to_choose list is returned.
    return '|'.join(to_choose).encode()


class FloridaHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == '/REST/v1/admin/get_view':
            self.send_response(200)
            self.send_header("Content-type", "text/plain")
            self.end_headers()
            self.wfile.write(florida_string(self.client_address[0]))
        elif self.path == '/terminate':
            if self.client_address[0] in available_peers:
                del available_peers[self.client_address[0]]
                logging.info("Removed {:s}".format(self.client_address[0]))
                logging.info("View size: {:d}".format(len(available_peers)))
            else:
                logging.error("IP already removed or was never here")
            self.send_response(200)
            self.send_header("Content-type", "text/plain")
            self.end_headers()
            self.wfile.write(b"Success")
        else:
            self.send_response(404)
            self.send_header("Content-type", "text/plain")
            self.end_headers()
            self.wfile.write(b"Nothing here, content is at /REST/v1/admin/get_view\n")


class FloridaServer:
    def __init__(self):
        self.server = HTTPServer(('', 4321), FloridaHandler)
        self.server.serve_forever()


FloridaServer()
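For reference, the tracker's view selection can be sketched as pure functions that are easy to check in isolation. The names `select_view`, `encode_view` and `decode_view` are hypothetical and not in the original tracker.py. Returning the sampled list of at most K peers is an assumption about the intended behaviour, since the posted `florida_string` computes `to_send` but joins `to_choose`. The decoder also drops empty entries, because `''.split('|')` yields `['']` rather than an empty list.

```python
import random


def select_view(peers, requester, k=25):
    """Return at most k peer addresses, never including the requester."""
    candidates = [p for p in peers if p != requester]
    if len(candidates) > k:
        return random.sample(candidates, k)
    return candidates


def encode_view(view):
    """Serialize a view in the same '|'-separated format the tracker sends."""
    return "|".join(view).encode()


def decode_view(raw):
    """Parse a '|'-separated view, skipping empty entries (e.g. an empty body)."""
    return [addr for addr in raw.decode("utf-8").split("|") if addr]
```

A node could then build its address list with `decode_view(content)` instead of `content.split('|')`, so that an empty response from the tracker does not turn into a single empty destination address.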
Has anyone run into the same error on EC2?