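I have some simulation results that I want to pair with static information I hold for their specific coordinates.
I am using pandas, and the key dataframe looks like this: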
Orig_lat Orig_lng Dest_lat Dest_lng Site Lane_1
51.4410925 -0.0913334 51.4431736 -0.0681643 6 E
51.4431736 -0.0681643 51.4410925 -0.0913334 6 W
51.6300955 -0.0781079 51.6489284 -0.0602954 7 N
51.648917 -0.0600521 51.6299841 -0.0779832 7 S
51.4648078 -0.301316 51.4573656 -0.3219232 9 S
51.4573656 -0.3219232 51.4649063 -0.3013827 9 N
51.412392 0.0743042 51.4088694 0.0800096 11 S
51.4088694 0.0800096 51.412392 0.0743042 11 N
51.4728599 -0.0235216 51.4804927 -0.0231821 14 N
The results dataframe looks like this:
distance duration duration_in_traffic Orig_lat Orig_lng Dest_lat Dest_lng
1456736402 1670 186 337 51.4431736 -0.0681643 …
I am new to Scala. I have a JSON file, named scala_input.json, containing two items:
{
"edges_file": "/path/edges.json.gz",
"seed_file": "/path/seed.json.gz"
}
I want to open this file, parse it, and assign two vals from it. I have tried:
val input_file = "/path/scala_input.json"
val json_data = JSON.parseFull(input_file)
val edges_file = json_data.get.asInstanceOf[Map[String, Any]]("edges_file").asInstanceOf[String]
val seeds_file = json_data.get.asInstanceOf[Map[String, Any]]("seed_file").asInstanceOf[String]
However, this returns java.util.NoSuchElementException: None.get. What have I failed to define? I am sure that input_file is correct and that edges_file and seed_file exist.
Suppose I have a dataframe df like this:
Date Time Black Carbon Carbon monoxide PM10 Particulate matter
0 19/10/2015 01:00:00 No data No data No data
1 19/10/2015 02:00:00 No data No data No data
2 19/10/2015 03:00:00 10 No data No data
3 19/10/2015 04:00:00 No data 11 No data
4 19/10/2015 05:00:00 No data No data No data
I can drop all columns that are entirely NA with:
tmp_df= df.dropna(axis=1,how='all')
However, I want to drop a column on the condition that every row in it contains the string No data.
In this case, we would drop the Particulate matter column.
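One possible approach, offered as a sketch rather than a definitive answer, is to compare the whole frame against the placeholder string and keep only the columns in which not every row matches. The small frame below is a made-up stand-in for df with shortened column names; only the "No data" placeholder comes from the question.

```python
import pandas as pd

# Hypothetical stand-in for the df in the question (column names shortened).
df = pd.DataFrame({
    "Date Time": ["19/10/2015 01:00:00", "19/10/2015 02:00:00", "19/10/2015 03:00:00"],
    "Black Carbon": ["No data", "No data", "10"],
    "Carbon monoxide": ["No data", "11", "No data"],
    "Particulate matter": ["No data", "No data", "No data"],
})

# True for each column in which every row equals the placeholder string.
all_placeholder = (df == "No data").all(axis=0)

# Keep only the columns that hold at least one real value.
tmp_df = df.loc[:, ~all_placeholder]
print(list(tmp_df.columns))
```

An alternative under the same assumptions would be to replace "No data" with NaN first and then reuse the dropna(axis=1, how='all') call that already works.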
I have been using the gdal command line to convert asc files to GeoJSON output. I can do this successfully:
gdal_polygonize.py input.asc -f "GeoJSON" output.json
Now I want to use Python and follow this process for a series of files.
import gdal
import glob
for file in glob.glob("dir/*.asc"):
    new_name = file[:-4] + ".json"
    gdal.Polygonize(file, "-f", "GeoJSON", new_name)
However, for exactly the same files, I get the following error: TypeError: in method 'Polygonize', argument 1 of type 'GDALRasterBandShadow *'
Why does the command-line version work while the Python version does not?
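The error text says that argument 1 must be a raster band (GDALRasterBandShadow), so gdal.Polygonize appears to expect an opened band object rather than a file name followed by command-line style flags. One low-risk workaround, shown only as a sketch, is to keep the command line that already works and call it once per file through subprocess. The helper names polygonize_command and polygonize_all are hypothetical, and the sketch assumes gdal_polygonize.py is on the PATH.

```python
import glob
import subprocess


def polygonize_command(asc_path):
    """Build the same command line that works in the shell for one .asc file."""
    new_name = asc_path[:-4] + ".json"
    return ["gdal_polygonize.py", asc_path, "-f", "GeoJSON", new_name]


def polygonize_all(pattern):
    """Run the command for every matching file; raises if any call fails."""
    for asc_path in sorted(glob.glob(pattern)):
        subprocess.run(polygonize_command(asc_path), check=True)
```

Calling polygonize_all("dir/*.asc") would then process the directory from the question one file at a time.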
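Edit - I found that the book was written for scala 1.6, but the rest is 2.11.
I am trying to implement the weighted shortest path algorithm from the book Spark GraphX in Action by Michael Malak and Robin East. The part in question is Listing 6.4, "Executing the shortest path algorithm that uses breadcrumbs", here in Chapter 6.
I have my own graph, which I created from two RDDs. It has 344436 vertices and 772983 edges. I can run an unweighted shortest path computation using the native GraphX library, and I am confident in the graph construction.
In this case, I use their Dijkstra implementation as follows: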
val my_graph: Graph[(Long),Double] = Graph.apply(verticesRDD, edgesRDD).cache()
def dijkstra[VD](g:Graph[VD,Double], origin:VertexId) = {
  var g2 = g.mapVertices(
    (vid,vd) => (false, if (vid == origin) 0 else Double.MaxValue, List[VertexId]())
  )
  for (i <- 1L to g.vertices.count-1) {
    val currentVertexId = g2.vertices
      .filter(!_._2._1)
      .fold((0L, (false, Double.MaxValue, List[VertexId]())))(
        (a,b) => if (a._2._2 < b._2._2) a else b)
      )
      ._1
    val newDistances = …
I am looking at a piece of code that contains the following:
graph.vertices.filter(!_._2._1)
I know that _ is a wildcard in scala, but I don't know what ! is supposed to do.
What does ! mean in scala?
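I have a .json.gz file that I want to load into Elasticsearch.
My first attempt involved using the json module to convert the JSON into a list of dictionaries.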
import gzip
import json
from pprint import pprint
from elasticsearch import Elasticsearch
nodes_f = gzip.open("nodes.json.gz")
nodes = json.load(nodes_f)
Example dictionary:
pprint(nodes[0])
{u'index': 1,
u'point': [508163.122, 195316.627],
u'tax': u'fehwj39099'}
Using Elasticsearch:
es = Elasticsearch()
data = es.bulk(index="index",body=nodes)
However, this returns:
elasticsearch.exceptions.RequestError: TransportError(400, u'illegal_argument_exception', u'Malformed action/metadata line [1], expected START_OBJECT or END_OBJECT but found [VALUE_STRING]')
Beyond this, I want to be able to query for the tax of a given point, in case that affects how I should index the data in Elasticsearch.
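The error message complains about a malformed action/metadata line. The bulk format expects every document to be preceded by an action line such as {"index": {...}}, whereas the code above passes the bare documents. The following sketch builds such a body using only the standard library; the helper name to_bulk_body, the index name "nodes", and the second document are made up for illustration, and any version-specific metadata fields are not covered.

```python
import json


def to_bulk_body(docs, index_name):
    """Interleave an action line before every document, as the bulk format expects."""
    body = []
    for doc in docs:
        body.append({"index": {"_index": index_name}})  # action/metadata line
        body.append(doc)                                # the document itself
    return body


# Made-up documents shaped like the example dictionary in the question.
nodes = [
    {"index": 1, "point": [508163.122, 195316.627], "tax": "fehwj39099"},
    {"index": 2, "point": [508170.0, 195320.5], "tax": "example"},
]

body = to_bulk_body(nodes, "nodes")

# The newline-delimited form of the same body, one JSON object per line.
ndjson = "\n".join(json.dumps(line) for line in body) + "\n"
print(ndjson)
```

Assuming the client accepts a list of dictionaries for body, as the call in the question suggests, the result could then be passed as es.bulk(body=body).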
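I am trying to connect remotely to a MySQL server. I followed the advice in (1) and set up a user for the IP address from which I will access it remotely.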
user$ mysql -u TestUser -p -h 129.169.66.149
Enter password:
ERROR 2003 (HY000): Can't connect to MySQL server on '129.169.66.149' (60)
I have checked that the port (the default, 3306) is correct and that the IP address is correct. MySQL is also running.
From the remote machine, I can successfully ping the server:
ping 129.169.66.149
64 bytes from 129.169.66.149: icmp_seq=48 ttl=63 time=1.010 ms
But when I use Telnet:
TELNET 129.169.66.149
Trying 129.169.66.149...
telnet: connect to address 129.169.66.149: Operation timed out
telnet: Unable to connect to remote host
Can anyone advise? Is this a firewall problem?
(1) - https://superuser.com/questions/826896/access-wordpress-mysql-database-remotely
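A successful ping only shows that the host answers ICMP, and telnet without a port number probes the default telnet port (23) rather than 3306. As a rough diagnostic sketch, the function below checks whether a TCP connection to a specific port can be opened at all. The name port_is_open and the 3 second timeout are arbitrary choices for illustration.

```python
import socket


def port_is_open(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable hosts.
        return False


# Example call (address from the question, default MySQL port):
# port_is_open("129.169.66.149", 3306)
```

A False result does not by itself say whether a firewall, the server's bind address, or something else is responsible; it only confirms that the port cannot be reached from this machine.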
val label = Try("here_a").getOrElse("here_b")
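When here_a is not found, this does not fall back to here_b. Why does .getOrElse not work?
Thanks @jwvh. These values are string file paths, so the exception is as follows: Exception in thread "main" java.io.FileNotFoundException:
Following Andrew James Ramirez's comment, I tried this, but the problem persists.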
Try(throw new Exception("FAIL here_a")).getOrElse("here_b")
I also tried
Try(throw new Exception("FileNotFoundException here_a")).getOrElse("here_b")
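Edit
It seems I may have oversimplified the problem for SO. More background: the string is actually a file path. Perhaps that makes a difference?
In fact, the json file can be found in one of two possible locations. So I want to try the first location and, if java.io.FileNotFoundException is returned, fall back to the second location. This is what I have now: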
val input_file = Try(throw new Exception("FAIL location_a/file_a.json")).getOrElse("location_b/file_a.json")
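Edit V2
I am embarrassed to say that I found the simple mistake. I am running this scala code on spark, and I forgot to repackage between tests. sbt package was all it took. :-/
I have ~5k files that I want to decompress.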
12:13:35 2017-01-16 $ unpigz *.gz
-bash: /usr/local/bin/unpigz: Argument list too long
12:13:40 2017-01-16 $ unpigz -r *.gz
-bash: /usr/local/bin/unpigz: Argument list too long
12:15:45 2017-01-16 $ gunzip *.gz
-bash: /usr/bin/gunzip: Argument list too long
12:17:56 2017-01-16 $ cp *.gz ~/Desktop/
-bash: /bin/cp: Argument list too long
Is there a limit on the number of files that bash can handle?
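The message comes from the operating system's limit on the total size of the arguments handed to a new process, which is exceeded once *.gz expands to thousands of names, so the limit is on combined length rather than strictly on file count. One way to sidestep it, sketched here under the assumption that a Python loop is acceptable, is to decompress the files one at a time with the standard gzip module. The function name gunzip_all is hypothetical, and unlike gunzip this sketch leaves the original .gz files in place.

```python
import gzip
import shutil
from pathlib import Path


def gunzip_all(directory):
    """Decompress every *.gz file in directory, one at a time; return the count."""
    count = 0
    for gz_path in sorted(Path(directory).glob("*.gz")):
        out_path = gz_path.with_suffix("")  # drops the trailing ".gz"
        with gzip.open(gz_path, "rb") as src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
        count += 1
    return count
```

Because the file names never pass through a single command line, the number of files no longer matters.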
I am reading a JSON file, adding a field, and then writing a new JSON file.
The JSON file I read, links.json, looks like this:
[{"negativeNode":"osgb4000000023183407","toid":"osgb4000000023296573","term":"Private Road - Restricted Access","polyline":[492019.481,156567.076,492028,156567,492041.667,156570.536,492063.65,156578.067,492126.5,156602],"positiveNode":"osgb4000000023183409","index":1,"nature":"Single Carriageway"}
,{"negativeNode":"osgb4000000023763485","toid":"osgb4000000023296574","term":"Private Road - Restricted Access","polyline":[492144.493,156762.059,492149.35,156750,492195.75,156630],"positiveNode":"osgb4000000023183408","index":2,"nature":"Single Carriageway"}
,{"negativeNode":"osgb4000000023183650","toid":"osgb4000000023296638","term":"Private Road - Restricted Access","polyline":[492835.25,156873.5,493000,156923,493018.061,156927.938],"positiveNode":"osgb4000000023183652","index":3,"nature":"Single Carriageway"}
,{"negativeNode":"osgb4000000023181163","toid":"osgb4000000023388466","term":"Local Street","polyline":[498136.506,149148.313,498123.784,149143.969,498119.223,149143.411,498116.43,149143.318,498113.638,149145.179],"positiveNode":"osgb4000000023806248","index":4,"nature":"Single Carriageway"}
]
I open the JSON file, read it, create a new field, and then dump it to a new file:
import json
links_file = open('links.json')
links = json.load(links_file)
for link in links:
    link['length'] = 10
with open('links_new.json','w') as outfile:
    json.dump(links, outfile)
This exports successfully, and I can check it with a text editor (Sublime Text):
[{"index": 1, "term": "Private Road - Restricted Access", "nature": "Single Carriageway", "negativeNode": "osgb4000000023183407", "toid": "osgb4000000023296573", "length": 10, "polyline": [492019.481, 156567.076, 492028, 156567, 492041.667, 156570.536, 492063.65, 156578.067, 492126.5, 156602], "positiveNode": "osgb4000000023183409"}, …