Posts by fit*_*ida

Cassandra error - ORDER BY is only supported when the partition key is restricted by EQ or IN

This is the table I am creating. It contains information about the players who played in the last cup.

CREATE TABLE players ( 
      group text, equipt text, number int, position text, name text,
      day int, month int, year int, 
      club text, liga text, capitan text,
PRIMARY key (name, day, month, year));

I want to run the following query:

Get the names of the 5 oldest players who were captain of their selection team.

This is my query:

SELECT name FROM players WHERE captain='YES' ORDER BY year DESC LIMIT 5;

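I get this error:

ORDER BY is only supported when the partition key is restricted by an EQ or an IN.

I think the problem is in the table I am creating, but I don't know how to solve it.

Thanks.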
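
A note on why this happens: Cassandra applies ORDER BY only to clustering columns, and only inside partitions selected with = or IN on the partition key. In this table the partition key is name, while the query filters on a non-key column (declared as capitan in the table but written captain in the query). One possible workaround, sketched below in plain Python, is to fetch the candidate rows and sort them on the client. The rows list and the top_captains helper are hypothetical stand-ins for whatever a driver would return; the sort mirrors the query's ORDER BY year DESC.

```python
from operator import itemgetter

# Hypothetical rows, standing in for the result of something like
# SELECT name, year, capitan FROM players;
rows = [
    {"name": "Ana", "year": 1980, "capitan": "YES"},
    {"name": "Luis", "year": 1992, "capitan": "NO"},
    {"name": "Marta", "year": 1988, "capitan": "YES"},
    {"name": "Pau", "year": 1975, "capitan": "YES"},
]


def top_captains(rows, limit=5):
    """Keep captains only, sort by year descending, return up to `limit` names."""
    captains = [r for r in rows if r["capitan"] == "YES"]
    captains.sort(key=itemgetter("year"), reverse=True)
    return [r["name"] for r in captains[:limit]]


print(top_captains(rows))
```

Client-side sorting only suits small result sets. For larger data, the usual direction is a table whose partition key matches the filter and whose clustering column matches the desired order.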

select cql sql-order-by cassandra

fit*_*ida

2019 02-16

Score: 10
Answers: 1
Views: 8864

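Reading words from a txt file - Python

I have written code that reads the words of a txt file, in my case "elquijote.txt", and then uses a dictionary {key: value} to show each word and the number of times it occurs.

For example, for a file "test1.txt" with the following words: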

hello hello hello good bye bye

The output of my program is:

 hello 3
 good  1
 bye   2

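Another option of the program is to show only the words that occur more times than a number passed as an argument.

If we enter the command "python readingwords.py text.txt 2" in the shell, the program shows the words in the file "test1.txt" that occur more times than the number we entered, in this case 2.

Output: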

hello 3

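We can also pass a third argument with common words, such as determiners and conjunctions, which are so generic that we do not want them shown or added to the dictionary.

My code works fine. The problem is that with large files such as "elquijote.txt" it takes a long time to finish the whole process.

I have been thinking that this is because I use auxiliary lists to remove the words.

As a solution, I thought of never adding to my list the words that appear in the txt file passed as an argument, which contains the words to discard.

Here is my code: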

import sys

def contar(aux):
  # Count how many times each word appears, ignoring case.
  counts = {}
  for palabra in aux:
    palabra = palabra.lower()
    if palabra not in counts:
      counts[palabra] = 0
    counts[palabra] += 1
  return counts

def main():

  characters = '!?¿-.:;-,><=*»¡'
  aux = []
  counts = {}

  with open(sys.argv[1],'r') as f:
    aux = ''.join(c for c in f.read() if c not in characters) …
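
The idea described above (never adding the discarded words in the first place) can be sketched as follows. This is a minimal illustration, not the original program: count_words, min_count and stopwords are hypothetical names. It assumes the slowdown comes from membership tests against a list, so the discard words go into a set, where lookups take roughly constant time, and the counting uses collections.Counter.

```python
from collections import Counter

# Same punctuation characters as in the question's code.
CHARACTERS = '!?¿-.:;-,><=*»¡'


def count_words(text, min_count=0, stopwords=()):
    """Count words, skip stopwords, keep words seen more than min_count times."""
    skip = {w.lower() for w in stopwords}  # set membership is O(1) on average
    cleaned = text.translate(str.maketrans('', '', CHARACTERS)).lower()
    counts = Counter(w for w in cleaned.split() if w not in skip)
    return {w: n for w, n in counts.items() if n > min_count}


print(count_words("hello hello hello good bye bye", min_count=2))
```

With the file contents read once via f.read(), the whole pass stays linear in the number of words, whatever the size of the discard list.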

python dictionary

fit*_*ida

lucky-day

Score: 5
Answers: 1
Views: 1178

Pyspark - TypeError: 'float' object is not subscriptable when calculating mean using reduceByKey

My "asdasd.csv" file has the following structure.

 Index,Arrival_Time,Creation_Time,x,y,z,User,Model,Device,gt
0,1424696633908,1424696631913248572,-5.958191,0.6880646,8.135345,a,nexus4,nexus4_1,stand
1,1424696633909,1424696631918283972,-5.95224,0.6702118,8.136536,a,nexus4,nexus4_1,stand
2,1424696633918,1424696631923288855,-5.9950867,0.6535491999999999,8.204376,a,nexus4,nexus4_1,stand
3,1424696633919,1424696631928385290,-5.9427185,0.6761626999999999,8.128204,a,nexus4,nexus4_1,stand

From it I build the following {key, value} tuples to operate on.

#                                 x           y        z
[(('a', 'nexus4', 'stand'), ((-5.958191, 0.6880646, 8.135345)))]
#           part A (key)               part B (value)

My code to calculate the mean is below. I have to calculate the mean of each column X, Y, Z for each key.

rdd_ori = sc.textFile("asdasd.csv") \
        .map(lambda x: ((x.split(",")[6], x.split(",")[7], x.split(",")[9]),(float(x.split(",")[3]),float(x.split(",")[4]),float(x.split(",")[5]))))

meanRDD = rdd_ori.mapValues(lambda x: (x,1)) \
            .reduceByKey(lambda a, b: (a[0][0] + b[0][0], a[0][1] + b[0][1], a[0][2] + b[0][2], a[1] + b[1]))\
            .mapValues(lambda a : (a[0]/a[3], a[1]/a[3],a[2]/a[3]))

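My problem is that I tried this code and it works fine on another PC, using the same VM (PySpark Py3) that I use to develop it.

Here is an example showing that this code is correct:

But I don't know why I am getting this error. The important part is in bold.

--------------------------------------------------------------------------- Py4JJavaError Traceback (most recent call last) in …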
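
A likely cause, offered as an assumption since the traceback is cut off: the function passed to reduceByKey receives values shaped ((x, y, z), 1) but returns a flat (x, y, z, count). As soon as a key is reduced more than once, an already reduced flat tuple is fed back in, and a[0][0] then indexes a float. With few rows per key or per partition the second reduction may never happen, which would explain why the same code appears to work elsewhere. The sketch below simulates the pipeline in plain Python (no Spark needed) with a combiner whose output has the same shape as its inputs; add_sums and mean_per_key are hypothetical names.

```python
from functools import reduce


def add_sums(a, b):
    """Combine two (sum_x, sum_y, sum_z, count) tuples into one of the same shape."""
    return (a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3])


def mean_per_key(pairs):
    """Plain-Python stand-in for mapValues -> reduceByKey -> mapValues."""
    grouped = {}
    for key, (x, y, z) in pairs:
        # Flat value with a count of 1, playing the role of the first mapValues.
        grouped.setdefault(key, []).append((x, y, z, 1))
    result = {}
    for key, values in grouped.items():
        sx, sy, sz, n = reduce(add_sums, values)  # same shape in and out
        result[key] = (sx / n, sy / n, sz / n)
    return result


key = ("a", "nexus4", "stand")
pairs = [(key, (1.0, 2.0, 3.0)), (key, (3.0, 4.0, 5.0)), (key, (5.0, 6.0, 7.0))]
print(mean_per_key(pairs))
```

In PySpark terms this would correspond to mapValues(lambda v: (v[0], v[1], v[2], 1)) followed by reduceByKey(lambda a, b: (a[0] + b[0], a[1] + b[1], a[2] + b[2], a[3] + b[3])), leaving the final mapValues from the question unchanged.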

python apache-spark pyspark

fit*_*ida

2018 03-08

Score: 1
Answers: 1
Views: 4280

Split csv file by the value of a column - Apache Nifi

I have a CSV file with the following structure.

ERP,J,JACKSON,8388 SOUTH CALIFORNIA ST.,TUCSON,AZ,85708,267-3352,,ALLENTON,MI,48002,810,710-0470,369-98-6555,462-11-4610,1953-05-00,F,
MARKETING,J,JACKSON,8388 SOUTH CALIFORNIA ST.,TUCSON,AZ,85708,267-3352,,ALLENTON,MI,48002,810,710-0470,369-98-6555,462-11-4610,1953-05-00,F,

As you can see, there is no header. For your information, the first column represents the sector the data comes from.

Depending on the value of the first column (for example MARKETING or ERP), I have to send all matching rows to a different output directory.

For example, all rows with ERP to …
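
The routing logic described here can be illustrated outside NiFi. The following is a plain Python sketch of grouping rows by their first column, not a NiFi flow; split_by_first_column is a hypothetical helper, and the sample rows are shortened versions of the ones above.

```python
import csv
import io
from collections import defaultdict


def split_by_first_column(csv_text):
    """Group the rows of a header-less CSV by the value in their first column."""
    groups = defaultdict(list)
    for row in csv.reader(io.StringIO(csv_text)):
        if row:  # skip blank lines
            groups[row[0]].append(row)
    return dict(groups)


sample = "ERP,J,JACKSON\nMARKETING,J,JACKSON\nERP,A,SMITH\n"
for sector, sector_rows in split_by_first_column(sample).items():
    print(sector, len(sector_rows))
```

Each group could then be written to its own output directory, named after the key.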

csv apache apache-nifi

fit*_*ida

lucky-day

Score: 1
Answers: 1
Views: 667