I use the code below to get the titles of websites.
from bs4 import BeautifulSoup
import urllib2
line_in_list = ['www.dailynews.lk','www.elpais.com','www.dailynews.co.zw']
for websites in line_in_list:
    url = "http://" + websites
    page = urllib2.urlopen(url)
    soup = BeautifulSoup(page.read())
    site_title = soup.find_all("title")
    print site_title
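If the list contains "bad" (non-existent) websites or pages, or a site returns some kind of error such as "404 Page Not Found", the script breaks and stops.
How can I make the script ignore/skip the "bad" (non-existent) and problematic websites/pages?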
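One possible approach is to wrap the download in try/except and move on to the next site when it fails. The sketch below is written for Python 3 (the question's code is Python 2, where the module is urllib2), and it makes a few assumptions of its own: the helper names fetch_html and collect_titles are hypothetical, and the title is extracted with the standard library's HTMLParser rather than BeautifulSoup so that the example has no third-party dependency.

```python
from html.parser import HTMLParser
from urllib.error import URLError
from urllib.request import urlopen


class TitleParser(HTMLParser):
    """Collects the text inside the first <title> element."""

    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = None

    def handle_starttag(self, tag, attrs):
        if tag == "title" and self.title is None:
            self.in_title = True
            self.title = ""

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data


def fetch_html(url, timeout=10):
    # Hypothetical helper: download a page and decode it leniently.
    with urlopen(url, timeout=timeout) as page:
        return page.read().decode("utf-8", errors="replace")


def collect_titles(sites, fetch=fetch_html):
    """Return {site: title}, skipping every site that cannot be fetched."""
    titles = {}
    for site in sites:
        url = "http://" + site
        try:
            html = fetch(url)
        except (URLError, OSError, ValueError) as exc:
            # HTTP errors such as 404 are raised as HTTPError, a subclass of
            # URLError; timeouts and refused connections are OSError.
            print("skipping", url, "->", exc)
            continue
        parser = TitleParser()
        parser.feed(html)
        titles[site] = parser.title
    return titles
```

Calling collect_titles(line_in_list) would then return only the sites that responded. The same try/except/continue pattern should carry over to the original loop with urllib2.URLError in Python 2.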
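When doing network analysis, I want to plot a network graph on a map. ggmap seems to be the preferred choice, but it requires API access.
Are there any free, equivalent alternatives to ggmap that do not require API access?
Thank you.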
Given a table like the one shown below, I want to create a new table from it (using the values in the "Color" column).
I tried:
import pandas as pd
import functools
data = {'Seller': ["Mike","Mike","Mike","Mike","David","David","Pete","Pete","Pete"],
        'Code' : ["9QBR1","9QBR1","9QBW2","9QBW2","9QD1X","9QD1X","9QEBO","9QEBO","9QEBO"],
        'From': ["2020-01-03","2020-01-03","2020-01-03","2020-01-03","2020-01-03","2020-01-03","2020-01-03","2020-01-03","2020-01-03"],
        'Color_date' : ["2020-02-14","2020-02-14","2020-05-18","2020-05-18","2020-01-04","2020-01-04","2020-03-04","2020-03-13","2020-01-28"],
        'Color' : ["Blue","Red","Red","Grey","Red","Grey","Blue","Orange","Red"],
        'Delivery' : ["Nancy","Nancy","Kate","Kate","Lilly","Lilly","John","John","John"]}
df = pd.DataFrame(data)
df_1 = df.set_index([df.index, 'Color'])['Color_date'].unstack()
df_1['Code'] = df['Code']
final_df = functools.reduce(lambda left,right: pd.merge(left,right,on='Code'), [df, df_1])
"df_1" looks fine, but "final_df" is much longer than expected.
What went wrong, and how can I fix it? Thank you.
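A likely cause is that "Code" is not unique in either frame, so merging on it pairs every row with every other row that shares the same code (2×2 + 2×2 + 2×2 + 3×3 = 21 rows instead of 9). The sketch below reproduces that count and shows two possible alternatives. Since the expected table is not shown, both are assumptions about the desired shape: joining on the row index keeps the original 9 rows, while pivot_table gives one row per code. The names wide, merged, joined and per_code are illustrative.

```python
import pandas as pd

data = {
    "Seller": ["Mike", "Mike", "Mike", "Mike", "David", "David", "Pete", "Pete", "Pete"],
    "Code": ["9QBR1", "9QBR1", "9QBW2", "9QBW2", "9QD1X", "9QD1X", "9QEBO", "9QEBO", "9QEBO"],
    "From": ["2020-01-03"] * 9,
    "Color_date": ["2020-02-14", "2020-02-14", "2020-05-18", "2020-05-18", "2020-01-04",
                   "2020-01-04", "2020-03-04", "2020-03-13", "2020-01-28"],
    "Color": ["Blue", "Red", "Red", "Grey", "Red", "Grey", "Blue", "Orange", "Red"],
    "Delivery": ["Nancy", "Nancy", "Kate", "Kate", "Lilly", "Lilly", "John", "John", "John"],
}
df = pd.DataFrame(data)

# Same reshaping as in the question: one column per colour, one row per original row.
wide = df.set_index([df.index, "Color"])["Color_date"].unstack()

# Merging on the non-unique "Code" column multiplies the rows within each code.
merged = pd.merge(df, wide.assign(Code=df["Code"]), on="Code")

# Option 1: align on the row index instead, which keeps one row per original row.
joined = df.join(wide)

# Option 2: one row per code, with the colour dates spread into columns.
per_code = (
    df.pivot_table(index=["Seller", "Code", "From", "Delivery"],
                   columns="Color", values="Color_date", aggfunc="first")
      .reset_index()
)

print(len(df), len(merged), len(joined), len(per_code))
```

Which option fits depends on whether the target table should keep one row per colour entry or collapse to one row per code.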
I have two lists with a one-to-one correspondence.
names = ["David", "Peter", "Kate", "Lucy", "Kit", "Jason", "Judy"]
scores = [1,1,0.8,0.2,0.4,0.1,0.6]
I want to show the people who scored above 0.5, all on one line:
Peter (1 point), David (1 point), Kate (0.8 point), Judy (0.6 point)
What I tried:
import operator
names = ["David", "Peter", "Kate", "Lucy", "Kit", "Jason", "Judy"]
scores = [1,1,0.8,0.2,0.4,0.1,0.6]
dictionary = dict(zip(names, scores))
dict_sorted = sorted(dictionary.items(), key=operator.itemgetter(1), reverse=True)
print dict_sorted
It gives:
[('Peter', 1), ('David', 1), ('Kate', 0.8), ('Judy', 0.6), ('Kit', 0.4), ('Lucy', 0.2), ('Jason', 0.1)]
How can I go further and get the desired result? Note: the result needs to be sorted from largest to smallest.
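A minimal sketch of one way to get there, written for Python 3: sort the (name, score) pairs by score in descending order, keep those above 0.5, and join the formatted pieces with ", ". The variable names pairs and line are illustrative. Note that sorted() is stable, so people with equal scores keep their original list order (David before Peter here), whereas the Python 2 dictionary in the question happened to yield Peter first.

```python
names = ["David", "Peter", "Kate", "Lucy", "Kit", "Jason", "Judy"]
scores = [1, 1, 0.8, 0.2, 0.4, 0.1, 0.6]

# Pair each name with its score and sort by score, largest first.
pairs = sorted(zip(names, scores), key=lambda pair: pair[1], reverse=True)

# Keep only scores above 0.5 and format everything on one line.
line = ", ".join(
    "{} ({} point)".format(name, score) for name, score in pairs if score > 0.5
)
print(line)
# prints: David (1 point), Peter (1 point), Kate (0.8 point), Judy (0.6 point)
```

Zipping the two lists directly avoids the intermediate dictionary, which would silently drop entries if two people shared a name.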
Two longer lists for testing purposes:
names = ["Olivia","Charlotte","Khaleesi","Cora","Isla","Isabella","Aurora","Amelia","Amara","Penelope","Audrey","Rose","Imogen","Alice","Evelyn","Ava","Irma","Ophelia","Violet"]
scores = [1.0, 1.0, 0.8, 0.2, 0.2, 0.4, …
I want to filter the rows that meet the following criteria:
What I have:
the_list = ['C TEE edBore 1 1/4200;',
            'Cylinder SingleVerticalB HHJ e 1 1/8Cooling 1',
            'EngineBore 11/1; TDT 8Length 3Width 3',
            'EngineCy HEE Inline2008Bore 1',
            'Height 4TheChallen TET e 1Stroke 1P 305',
            'Height 8C ;0;Wall15ccG QBG ccGasEngineJ 142',
            'Height EQE C ;0150ccGas2007',
            'Length 10Wid ETQ Length 10Width ',
            'Stro EHT oke 1 1/8Length ',
            'Stroke 1 1/4HP JII Stroke 1 1/4HP ',
            'Stroke 1Cy QTH 7Weight ; 1/2LBS',
            'Weight 18LBSLength 1 DQT …
In Python 2.7.6, I have the list below. How can I pick out the items that start with "4" and have a length of 4, i.e. 4646 and 4648 below?
aaa = [2013, 2014, 2002, 4646, 4648, 20, 456, 5623, 'abc']
So far I can only select the items of length 4, like this:
results = []
for number in aaa:
    if len(str(number)) == 4:
        results.append(number)
print results
Thanks.
Everything is great. But I'm a newbie, so I picked the simplest one. :)
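For reference, one simple way to add the second condition is to convert each item to a string once and test both its length and its first character. This is a sketch in Python 3 syntax (in Python 2.7 only the print line would differ); the helper name starts_with_4_and_len_4 is made up for illustration.

```python
aaa = [2013, 2014, 2002, 4646, 4648, 20, 456, 5623, 'abc']


def starts_with_4_and_len_4(item):
    # Hypothetical helper: compare on the string form so that ints and strs
    # are treated the same way.
    text = str(item)
    return len(text) == 4 and text.startswith("4")


results = [item for item in aaa if starts_with_4_and_len_4(item)]
print(results)
# prints: [4646, 4648]
```

The same two checks can be joined with "and" inside the original for loop if a list comprehension feels less readable.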