Posts by tod*_*ysm

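pkg-config cannot find .pc files even though they are in the path

I am seeing a strange problem with pkg-config on Mac OS X Lion. When I run the Python setup script for a module I downloaded, I get the following error: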

aspen:python toddysm$ sudo ./setup.py install
Password:
`pkg-config --libs --cflags cld` returns in error: 
Package cld was not found in the pkg-config search path.
Perhaps you should add the directory containing `cld.pc'
to the PKG_CONFIG_PATH environment variable
No package 'cld' found

The `cld` C++ library is absent from this system. Please install it.

But when I check the /usr/local/lib folder, I can see that the libraries are there and that the .pc files are in the pkgconfig subfolder:

aspen:~ toddysm$ cd /usr/local/lib/
aspen:lib toddysm$ ls -al
total 2640
drwxr-xr-x  6 root  wheel      204 Jul  2 17:38 .
drwxr-xr-x  9 root  wheel      306 …
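
A minimal Python sketch for narrowing this down, assuming the .pc file is named cld.pc as the error message suggests. It lists which candidate directories actually contain the file and builds a PKG_CONFIG_PATH value that includes them. The helper names `find_pc_dirs` and `env_with_pkg_config_path` and the second candidate directory are illustrative assumptions, not part of pkg-config or of the module being installed. It may also be worth checking whether `sudo` passes PKG_CONFIG_PATH through, since many sudo configurations reset the environment.

```python
import os


def find_pc_dirs(package, candidate_dirs):
    """Return the candidate directories that contain <package>.pc."""
    pc_name = package + ".pc"
    return [d for d in candidate_dirs
            if os.path.isfile(os.path.join(d, pc_name))]


def env_with_pkg_config_path(extra_dirs, base_env=None):
    """Copy an environment and prepend extra_dirs to PKG_CONFIG_PATH."""
    env = dict(os.environ if base_env is None else base_env)
    # Keep whatever was already set, dropping empty entries and duplicates.
    existing = [p for p in env.get("PKG_CONFIG_PATH", "").split(os.pathsep) if p]
    merged = list(extra_dirs) + [p for p in existing if p not in extra_dirs]
    env["PKG_CONFIG_PATH"] = os.pathsep.join(merged)
    return env


if __name__ == "__main__":
    # Candidate directories are assumptions; the first one matches the
    # pkgconfig subfolder mentioned in the question.
    candidates = ["/usr/local/lib/pkgconfig", "/usr/lib/pkgconfig"]
    found = find_pc_dirs("cld", candidates)
    print("directories containing cld.pc:", found)
    print("PKG_CONFIG_PATH to try:",
          env_with_pkg_config_path(found)["PKG_CONFIG_PATH"])
```

The returned dictionary can be passed as the `env` argument of `subprocess.run` when invoking pkg-config or the setup script, so the shell environment itself stays untouched.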

macos pkg-config python-2.7 osx-lion

Score: 10 | Answers: 2 | Views: 20k

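Filtering rows with empty arrays in PySpark

We are trying to use PySpark to filter rows that contain an empty array in a field. Here is the schema of the DataFrame: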

root
 |-- created_at: timestamp (nullable = true)
 |-- screen_name: string (nullable = true)
 |-- text: string (nullable = true)
 |-- retweet_count: long (nullable = true)
 |-- favorite_count: long (nullable = true)
 |-- in_reply_to_status_id: long (nullable = true)
 |-- in_reply_to_user_id: long (nullable = true)
 |-- in_reply_to_screen_name: string (nullable = true)
 |-- user_mentions: array (nullable = true)
 |    |-- element: struct (containsNull = true)
 |    |    |-- id: long (nullable = true)
 |    |    |-- id_str: string (nullable = true)
 | …
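
One possible approach, sketched here under assumptions, is to filter on the array length with Spark SQL's `size` function, passed to `DataFrame.filter` as a SQL expression string. The helper name `array_size_condition` is hypothetical, and the column name is assumed to be a simple top-level identifier such as `user_mentions` from the schema above. The block only builds the condition string, so it runs without a Spark session.

```python
def array_size_condition(column, keep_empty=False):
    """Build a Spark SQL condition string on the size of an array column.

    keep_empty=False -> matches rows whose array has at least one element
    keep_empty=True  -> matches rows whose array is empty (size 0)

    Rows where the column is NULL satisfy neither condition.
    The column name is inserted as-is, so it is assumed to be a plain
    identifier that needs no quoting.
    """
    operator = "=" if keep_empty else ">"
    return "size({}) {} 0".format(column, operator)


# Intended use with an existing DataFrame `df` (not created here):
#   non_empty = df.filter(array_size_condition("user_mentions"))
#   only_empty = df.filter(array_size_condition("user_mentions", keep_empty=True))
print(array_size_condition("user_mentions"))
```

Using a string condition keeps the sketch free of imports; the same test can also be written with the column functions in `pyspark.sql.functions`.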

apache-spark apache-spark-sql pyspark spark-dataframe

Score: 4 | Answers: 1 | Views: 4495

Column is not iterable in pySpark

So, we are a bit confused. In a Jupyter Notebook, we have the following DataFrame:

+--------------------+--------------+-------------+--------------------+--------+-------------------+ 
|          created_at|created_at_int|  screen_name|            hashtags|ht_count|     single_hashtag|
+--------------------+--------------+-------------+--------------------+--------+-------------------+
|2017-03-05 00:00:...|    1488672001|     texanraj|  [containers, cool]|       1|         containers|
|2017-03-05 00:00:...|    1488672001|     texanraj|  [containers, cool]|       1|               cool|
|2017-03-05 00:00:...|    1488672002|   hubskihose|[automation, future]|       1|         automation|
|2017-03-05 00:00:...|    1488672002|   hubskihose|[automation, future]|       1|             future|
|2017-03-05 00:00:...|    1488672002|    IBMDevOps|            [DevOps]|       1|             devops|
|2017-03-05 00:00:...|    1488672003|SoumitraKJana|[VoiceOfWipro, Cl...|       1|       voiceofwipro|
|2017-03-05 00:00:...|    1488672003|SoumitraKJana|[VoiceOfWipro, Cl...|       1|              cloud|
|2017-03-05 00:00:...|    1488672003|SoumitraKJana|[VoiceOfWipro, Cl...|       1|             leader|
|2017-03-05 00:00:...|    1488672003|SoumitraKJana|      [Cloud, Cloud]|       1|              cloud|
|2017-03-05 00:00:...|    1488672003|SoumitraKJana|      [Cloud, Cloud]|       1|              cloud|
|2017-03-05 00:00:...| …

apache-spark apache-spark-sql pyspark spark-dataframe

Score: 2 | Answers: 1 | Views: 5245