I am getting the following error during a Python 2.7 64-bit Windows installation. I previously installed Python 3.5 64-bit, which works fine, but during the Python 2.7 installation I get this error:
Traceback (most recent call last):
File "C:\Anaconda2\Lib\_nsis.py", line 164, in <module> main()
File "C:\Anaconda2\Lib\_nsis.py", line 150, in main
mk_menus(remove=False)
File "C:\Anaconda2\Lib\_nsis.py", line 94, in mk_menus
err("Traceback:\n%s\n" % traceback.format_exc(20))
IOError: [Errno 9] Bad file descriptor
Please help me.
I am trying to run a Kafka Streams application in Kubernetes. When I start the pod, I get the following exception:
Exception in thread "streams-pipe-e19c2d9a-d403-4944-8d26-0ef27ed5c057-StreamThread-1"
java.lang.UnsatisfiedLinkError: /tmp/snappy-1.1.4-5cec5405-2ce7-4046-a8bd-922ce96534a0-libsnappyjava.so:
Error loading shared library ld-linux-x86-64.so.2: No such file or directory
(needed by /tmp/snappy-1.1.4-5cec5405-2ce7-4046-a8bd-922ce96534a0-libsnappyjava.so)
at java.lang.ClassLoader$NativeLibrary.load(Native Method)
at java.lang.ClassLoader.loadLibrary0(ClassLoader.java:1941)
at java.lang.ClassLoader.loadLibrary(ClassLoader.java:1824)
at java.lang.Runtime.load0(Runtime.java:809)
at java.lang.System.load(System.java:1086)
at org.xerial.snappy.SnappyLoader.loadNativeLibrary(SnappyLoader.java:179)
at org.xerial.snappy.SnappyLoader.loadSnappyApi(SnappyLoader.java:154)
at org.xerial.snappy.Snappy.<clinit>(Snappy.java:47)
at org.xerial.snappy.SnappyInputStream.hasNextChunk(SnappyInputStream.java:435)
at org.xerial.snappy.SnappyInputStream.read(SnappyInputStream.java:466)
at java.io.DataInputStream.readByte(DataInputStream.java:265)
at org.apache.kafka.common.utils.ByteUtils.readVarint(ByteUtils.java:168)
at org.apache.kafka.common.record.DefaultRecord.readFrom(DefaultRecord.java:292)
at org.apache.kafka.common.record.DefaultRecordBatch$1.readNext(DefaultRecordBatch.java:264)
at org.apache.kafka.common.record.DefaultRecordBatch$RecordIterator.next(DefaultRecordBatch.java:563)
at org.apache.kafka.common.record.DefaultRecordBatch$RecordIterator.next(DefaultRecordBatch.java:532)
at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.nextFetchedRecord(Fetcher.java:1060)
at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.fetchRecords(Fetcher.java:1095)
at org.apache.kafka.clients.consumer.internals.Fetcher$PartitionRecords.access$1200(Fetcher.java:949)
at org.apache.kafka.clients.consumer.internals.Fetcher.fetchRecords(Fetcher.java:570)
at org.apache.kafka.clients.consumer.internals.Fetcher.fetchedRecords(Fetcher.java:531)
at org.apache.kafka.clients.consumer.KafkaConsumer.pollOnce(KafkaConsumer.java:1146)
at org.apache.kafka.clients.consumer.KafkaConsumer.poll(KafkaConsumer.java:1103)
at org.apache.kafka.streams.processor.internals.StreamThread.pollRequests(StreamThread.java:851)
at org.apache.kafka.streams.processor.internals.StreamThread.runOnce(StreamThread.java:808)
at org.apache.kafka.streams.processor.internals.StreamThread.runLoop(StreamThread.java:774)
at org.apache.kafka.streams.processor.internals.StreamThread.run(StreamThread.java:744)
Previously I tried starting Kafka and the kafka-streams-app using Docker containers and they worked perfectly well. This is my first attempt with Kubernetes.
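For what it is worth, ld-linux-x86-64.so.2 is the glibc dynamic loader, so this error usually means the container image is musl-based (for example Alpine) while the libsnappyjava.so that snappy-java extracts to /tmp was built against glibc. A hedged workaround, assuming an Alpine base image, is to install the glibc compatibility layer in the image:

# Alpine only: libc6-compat supplies glibc compatibility symlinks,
# including the ld-linux-x86-64.so.2 loader the snappy library asks for
RUN apk add --no-cache libc6-compat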
Here is my Dockerfile for the StreamsApp …
I have Kafka installed on my machine and it was running fine.
Then I installed NetBeans, which I think messed things up. Now my ZooKeeper does not start.
After installing NetBeans, I got an error at runtime:
JDK is missing and is required to run some NetBeans modules
I fixed that issue using this article.
Now, when I start ZooKeeper with the following command:
sudo bin/zookeeper-server-start.sh config/zookeeper.properties
I get the following traceback:
[2018-01-31 17:30:09,953] INFO Reading configuration from: config/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig)
[2018-01-31 17:30:09,959] INFO autopurge.snapRetainCount set to 3 (org.apache.zookeeper.server.DatadirCleanupManager)
[2018-01-31 17:30:09,959] INFO autopurge.purgeInterval set to 0 (org.apache.zookeeper.server.DatadirCleanupManager)
[2018-01-31 17:30:09,959] INFO Purge task is not scheduled. (org.apache.zookeeper.server.DatadirCleanupManager)
[2018-01-31 17:30:09,959] WARN Either no config or no quorum defined in config, running in standalone mode (org.apache.zookeeper.server.quorum.QuorumPeerMain)
[2018-01-31 17:30:09,979] INFO Reading configuration from: config/zookeeper.properties (org.apache.zookeeper.server.quorum.QuorumPeerConfig) …

I have a pandas dataframe like this:
import pandas as pd

d = {'dollar_amount': ['200.25', '350.00', '120.00', '400.50', '1231.25', '700.00', '350.00', '200.25', '2340.00'], 'date': ['22-01-2010','22-01-2010','23-01-2010','15-02-2010','27-02-2010','07-03-2010','14-01-2011','09-10-2011','28-07-2012']}
df = pd.DataFrame(data=d)
df['date'] = pd.to_datetime(df['date'], format='%d-%m-%Y')
pd.options.display.float_format = '{:,.4f}'.format
df['dollar_amount'] = df['dollar_amount'].astype(float)
df
date dollar_amount
0 22-01-2010 200.25
1 22-01-2010 350.00
2 23-01-2010 120.00
3 15-02-2010 400.50
4 27-02-2010 1231.25
5 07-03-2010 700.00
6 14-01-2011 350.00
7 09-10-2011 200.25
8 11-11-2011 2340.00
9 12-12-2011 144.50
10 12-09-2012 760.00
11 22-10-2012 255.00
12 28-07-2012 650.00
I want to sum up the amounts for each day of every year, so this is how I split out the years:
date1 = df[(df['date'] >= '2010-01-01') & (df['date'] < '2011-01-01')] …
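Instead of slicing out one frame per year by hand (date1, date2, …), pandas can group on the date column directly. A minimal sketch using the same column names as above; the groupby keys are my choice, not from the question:

import pandas as pd

# A shortened rebuild of the frame above, so the snippet runs on its own
d = {'dollar_amount': ['200.25', '350.00', '120.00', '350.00', '200.25'],
     'date': ['22-01-2010', '22-01-2010', '23-01-2010', '14-01-2011', '09-10-2011']}
df = pd.DataFrame(data=d)
df['date'] = pd.to_datetime(df['date'], format='%d-%m-%Y')
df['dollar_amount'] = df['dollar_amount'].astype(float)

# One row per calendar day, amounts summed; no per-year slicing needed
daily = df.groupby(df['date'].dt.date)['dollar_amount'].sum()
print(daily)

# Keeping the year as its own index level gives per-year subtotals for free
per_year_day = df.groupby([df['date'].dt.year, df['date'].dt.date])['dollar_amount'].sum()
print(per_year_day)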
I want to completely uninstall Confluent. I installed it following the instructions on their website, which were three simple steps:
$ wget -qO - https://packages.confluent.io/deb/4.0/archive.key | sudo apt-key add -
$ sudo add-apt-repository "deb [arch=amd64] https://packages.confluent.io/deb/4.0 stable main"
$ sudo apt-get update && sudo apt-get install confluent-platform-oss-2.11
Now, how do I remove/uninstall it? I cannot find anything related to it.
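Since all three steps went through apt, one plausible reversal (a sketch; I have not verified every package name Confluent ships) is to purge the package, then drop the repository that was added:

# Purge the meta-package and anything installed only because of it
sudo apt-get purge --auto-remove confluent-platform-oss-2.11

# Check whether any confluent-* packages remain
dpkg -l | grep confluent

# Remove the repository added during installation, then refresh the index
sudo add-apt-repository --remove "deb [arch=amd64] https://packages.confluent.io/deb/4.0 stable main"
sudo apt-get update

Any data or configuration directories the platform created (for example under /var/lib or /etc) are not touched by apt and may need removing by hand.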
I am trying to read 2 Kafka topics using the Cassandra sink connector and insert into 2 Cassandra tables. How can I go about doing this?
Here is my connector.properties file:
name=cassandra-sink-orders
connector.class=com.datamountaineer.streamreactor.connect.cassandra.sink.CassandraSinkConnector
tasks.max=1
topics=topic1,topic2
connect.cassandra.kcql=INSERT INTO ks.table1 SELECT * FROM topic1;INSERT INTO ks.table2 SELECT * FROM topic2
connect.cassandra.contact.points=localhost
connect.cassandra.port=9042
connect.cassandra.key.space=ks
connect.cassandra.contact.points=localhost
connect.cassandra.username=cassandra
connect.cassandra.password=cassandra
Am I doing everything right? Is this the best approach, or should I create two separate connectors?
After that I created a cassandra-sink connector, and then I made some changes in the connector.properties file. After stopping the worker and starting it again, now when I add the connector using the following command:
java -jar kafka-connect-cli-1.0.6-all.jar create cassandra-sink-orders < cassandra-sink-distributed-orders.properties
I get the following error:
Error: the Kafka Connect API returned: Connector cassandra-sink-orders already exists (409)
How do I delete the existing connector?
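Kafka Connect exposes a REST interface for exactly this. Assuming the worker listens on the default localhost:8083, the connector can be deleted by name and then re-created:

# Delete the existing connector by name
curl -X DELETE http://localhost:8083/connectors/cassandra-sink-orders

# Verify it is gone before re-running the create command
curl http://localhost:8083/connectors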
I have a Kafka Streams application waiting for records to be published on the topic user_activity. It will receive JSON data, and depending on the value of a key I want to push that stream to different topics.
Here is my streams application code:
KStream<String, String> source_user_activity = builder.stream("user_activity");
source_user_activity.flatMapValues(new ValueMapper<String, Iterable<String>>() {
@Override
public Iterable<String> apply(String value) {
System.out.println("value: " + value);
ArrayList<String> keywords = new ArrayList<String>();
try {
JSONObject send = new JSONObject();
JSONObject received = new JSONObject(value);
send.put("current_date", getCurrentDate().toString());
send.put("activity_time", received.get("CreationTime"));
send.put("user_id", received.get("UserId"));
send.put("operation_type", received.get("Operation"));
send.put("app_name", received.get("Workload"));
keywords.add(send.toString());
// apply regex to value and for each match add it to keywords
} catch (Exception e) {
// TODO: handle exception
System.err.println("Unable to convert to json");
e.printStackTrace();
} …
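To actually fan records out to different topics based on what is in the value, one option in this API generation is KStream#branch, which splits a stream with an ordered list of predicates. A sketch only: the field value, topic names, and class scaffolding below are assumptions, and the serde/startup configuration is left out:

import org.apache.kafka.streams.StreamsBuilder;
import org.apache.kafka.streams.kstream.KStream;

public class RoutingSketch {
    public static void main(String[] args) {
        StreamsBuilder builder = new StreamsBuilder();
        KStream<String, String> source = builder.stream("user_activity");

        // First matching predicate wins; the last one catches everything else
        @SuppressWarnings("unchecked")
        KStream<String, String>[] branches = source.branch(
                (key, value) -> value.contains("\"Operation\":\"FileUploaded\""),
                (key, value) -> true
        );
        branches[0].to("file_upload_activity");   // hypothetical topic name
        branches[1].to("other_activity");         // hypothetical topic name
    }
}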
I am setting some environment variables in docker-compose that are used by a Python application run from a cron job.

docker-compose.yaml:
version: '2.1'
services:
zookeeper:
container_name: zookeeper
image: zookeeper:3.3.6
restart: always
hostname: zookeeper
ports:
- "2181:2181"
environment:
ZOO_MY_ID: 1
kafka:
container_name: kafka
image: wurstmeister/kafka:1.1.0
hostname: kafka
links:
- zookeeper
ports:
- "9092:9092"
environment:
KAFKA_ADVERTISED_HOST_NAME: kafka
KAFKA_CREATE_TOPICS: "topic:1:1"
KAFKA_LOG_MESSAGE_TIMESTAMP_TYPE: LogAppendTime
KAFKA_MESSAGE_TIMESTAMP_TYPE: LogAppendTime
KAFKA_ZOOKEEPER_CONNECT: zookeeper:2181
data-collector:
container_name: data-collector
#image: mystreams:0.1
build:
context: /home/junaid/eMumba/CASB/Docker/data_collector/
dockerfile: Dockerfile
links:
- kafka
environment:
- KAFKA_HOST=kafka
- OFFICE_365_APP_ID=98aff1c5-7a69-46b7-899c-186851054b43
- OFFICE_365_APP_SECRET=zVyS/V694ffWe99QpCvYqE1sqeqLo36uuvTL8gmZV0A=
- OFFICE_365_APP_TENANT=2f6cb1a6-ecb8-4578-b680-bf84ded07ff4
- KAFKA_CONTENT_URL_TOPIC=o365_activity_contenturl
- KAFKA_STORAGE_DATA_TOPIC=o365_storage
- KAFKA_PORT=9092
- POSTGRES_DB_NAME=casb …
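The question text is cut off above, but a frequent pitfall with exactly this setup is that cron starts jobs with an almost empty environment, so the variables docker-compose injects are invisible to the Python app. One common workaround, sketched here assuming a Debian-based image and a hypothetical /app/collector.py, is to save the environment at container start and source it in the crontab line:

#!/bin/sh
# entrypoint.sh -- persist the injected environment where cron jobs can read it
printenv | sed 's/^\([^=]*\)=\(.*\)$/export \1="\2"/' > /etc/container_env.sh

# Example crontab entry (load the saved variables, then run the app):
#   * * * * * . /etc/container_env.sh; python /app/collector.py

# Keep cron in the foreground as the container's main process
exec cron -f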
I have searched a lot about this, but there does not seem to be a good guide on it.
Based on my searching, there are a few things to consider:
The question: do these topics even need to be reset?
Is --reset-offsets with --to-earliest the best way to restart the sink and source connectors so that they read from the beginning?
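For the sink side this is straightforward, because a sink connector consumes through an ordinary consumer group named connect-<connector name>, so the standard reset tool applies (source connectors track their positions in Connect's offsets topic instead, which this command does not touch). A sketch with an assumed broker address and an assumed connector name my-sink; the connector must be stopped first, since the group has to be inactive:

# Reset the sink connector's consumer group to the earliest offsets
bin/kafka-consumer-groups.sh --bootstrap-server localhost:9092 \
  --group connect-my-sink \
  --reset-offsets --to-earliest --all-topics --execute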