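I'm trying to find a way to download the entire PyPI index, and only the index, with no code files. I want to analyze license types so that I can rule out libraries whose licenses are too restrictive. I've looked through the user guides online, but if the answer is in there, I couldn't make sense of it.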
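One possible starting point, offered as a sketch rather than a confirmed answer: PyPI serves per-project metadata as JSON at `https://pypi.org/pypi/<name>/json`, and the list of project names appears at `https://pypi.org/simple/`. The helper names `fetch_metadata` and `license_info` below are hypothetical, and the sample dictionary is made up so the snippet runs without network access.

```python
import json
import urllib.request


def fetch_metadata(name):
    """Download the JSON metadata for one project (no code files)."""
    url = "https://pypi.org/pypi/{}/json".format(name)
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.load(resp)


def license_info(metadata):
    """Return the free-text license field and any license classifiers."""
    info = metadata.get("info", {})
    classifiers = [c for c in info.get("classifiers") or []
                   if c.startswith("License ::")]
    return info.get("license"), classifiers


# Made-up metadata in the shape the endpoint returns, so this runs offline.
sample = {"info": {"license": "MIT",
                   "classifiers": ["License :: OSI Approved :: MIT License",
                                   "Programming Language :: Python :: 3"]}}
print(license_info(sample))
```

For a real project one would call `license_info(fetch_metadata("some-project"))`. The license field is free text supplied by each maintainer, so the classifiers are often the easier thing to filter on.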
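OK, I'm at half-wit's end. I'm geocoding a dataframe with geopy. I wrote a simple function that takes an input (a country name) and returns the latitude and longitude. I use apply to run the function, and it returns a pandas Series object. I can't seem to convert it to a dataframe. I'm sure I'm missing something obvious, but I'm new to Python and still RTFMing. BTW, the geocoder function works great.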
# Import libraries
import os
import pandas as pd
import numpy as np
from geopy.geocoders import Nominatim
def locate(x):
    geolocator = Nominatim()
    # print(x) # debug
    try:
        #Get geocode
        location = geolocator.geocode(x, timeout=8, exactly_one=True)
        lat = location.latitude
        lon = location.longitude
    except:
        #didn't work for some reason that I really don't care about
        lat = np.nan
        lon = np.nan
    # print(lat,lon) #debug
    return lat, lon # Note: also tried return { 'LAT': lat, 'LON': lon }
…
So I'm running the following code with the psycopg2 driver in Python 3.5 and pandas 0.19.x.
buf = io.StringIO()
cursor = conn.cursor()
sql_query = 'COPY ('+ base_sql + ' limit 100) TO STDOUT WITH CSV HEADER'
cursor.copy_expert(sql_query, buf)
df = pd.read_csv(buf.getvalue(),engine='c')
buf.close()
read_csv blows chunks when reading from the memory buffer:
pandas\parser.pyx in pandas.parser.TextReader.__cinit__ (pandas\parser.c:4175)()
pandas\parser.pyx in pandas.parser.TextReader._setup_parser_source (pandas\parser.c:8333)()
C:\Users\....\AppData\Local\Continuum\Anaconda3\lib\genericpath.py in exists(path)
17 """Test whether a path exists. Returns False for broken symbolic links"""
18 try:
---> 19 os.stat(path)
20 except OSError:
21 return False
ValueError: stat: path too long for Windows
Uh.. wot …
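A likely explanation, judging only from the traceback: `read_csv` treats a plain string as a file path, so the entire CSV text returned by `buf.getvalue()` ends up in `os.stat`. A minimal sketch of the alternative is to rewind the buffer and pass the buffer object itself. The CSV text below is made up and stands in for what `cursor.copy_expert` would write.

```python
import io

import pandas as pd

# Stand-in for what cursor.copy_expert(sql_query, buf) would write;
# this CSV text is invented for illustration.
buf = io.StringIO()
buf.write("id,name\n1,alpha\n2,beta\n")

# Rewind so read_csv starts at the beginning, then pass the buffer itself.
# Passing buf.getvalue() hands read_csv a plain string, which it treats
# as a path.
buf.seek(0)
df = pd.read_csv(buf, engine="c")
buf.close()

print(df)
```

The `seek(0)` matters because writing leaves the stream position at the end of the buffer, and reading from there would yield no data.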
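I'm having trouble getting pandas 0.19.2 to give me the expected result. Columns a through g contain ['yes', 'no', '', NaN]. If any of these columns has 'yes', I want that row returned (there are other columns not shown). Here is my code.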
xdf2 = xdf[((xdf['a'] == 'yes').all() or
(xdf['b'] == 'yes').all() or
(xdf['c'] == 'yes').all() or
(xdf['d'] == 'yes' ).all() or
(xdf['e'] == 'yes').all() or
(xdf['f'] == 'yes').all() or
(xdf['g'] =='yes').all()) ]
This gives me the following error:
2134 return self._engine.get_loc(key)
2135 except KeyError:
-> 2136 return self._engine.get_loc(self._maybe_cast_indexer(key))
2137
2138 indexer = self.get_indexer([key], method=method, tolerance=tolerance)
pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4433)()
pandas\index.pyx in pandas.index.IndexEngine.get_loc (pandas\index.c:4279)()
pandas\src\hashtable_class_helper.pxi in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13742)()
pandas\src\hashtable_class_helper.pxi in pandas.hashtable.PyObjectHashTable.get_item (pandas\hashtable.c:13696)()
KeyError: False
Without '.all', I get
ValueError: The truth value of a Series is ambiguous. Use a.empty, a.bool(), …
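Both errors seem to come from the same place. `.all()` collapses each column to a single boolean, `or` then yields one plain `False`, and `xdf[False]` is looked up as a column label, hence `KeyError: False`. Without `.all()`, `or` asks each Series for a single truth value, which pandas refuses to give. One way to build a per-row mask instead is sketched below. The sample frame is hypothetical, since the real `xdf` is not shown, and it uses only three of the seven columns.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; the real xdf is not shown in the question.
xdf = pd.DataFrame({
    "a": ["yes", "no", "", np.nan],
    "b": ["no", "no", "yes", np.nan],
    "c": ["", "no", "", "no"],
    "other": [1, 2, 3, 4],
})

cols = ["a", "b", "c"]  # the question uses columns a through g

# Element-wise comparison gives a boolean DataFrame; any(axis=1) reduces
# it to one boolean per row, which is a valid row mask.
mask = (xdf[cols] == "yes").any(axis=1)
xdf2 = xdf[mask]

print(xdf2)
```

The same mask could be written by joining the per-column comparisons with `|` instead of `or`, but listing the columns once tends to be easier to read.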