I am trying to read from a JSON file.
This is how I created the file:
import requests
import json
import time
from pprint import pprint

BASE_URL = "https://www.wikiart.org/en/api/2/UpdatedArtists"
artist_json_data = requests.get(BASE_URL).json()
with open('artistdata.json', 'w') as outfile:
    while artist_json_data['hasMore']:
        print(artist_json_data['paginationToken'])
        url = BASE_URL + "?paginationToken=" + artist_json_data['paginationToken']
        artist_json_data = requests.get(url).json()
        json.dump(artist_json_data, outfile, indent=4)
        time.sleep(1)
This is the beginning of my output:
{
    "data": [
        {
            "id": "57726da5edc2cb3880b4ca54",
            "artistName": "Paul Feeley",
            "url": "paul-feeley",
            "lastNameFirst": "Feeley Paul",
            "birthDay": "/Date(-1893456000000)/",
            "deathDay": "/Date(-126230400000)/",
When I try to read the same file with the following code:
from pprint import pprint
with open('artistdata.json', 'r', encoding='utf-8') as data_file:
    data = json.loads(data_file)
    pprint(data)
I get the error:
TypeError: the JSON object must be …
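The error above is consistent with json.loads being handed a file object: json.loads expects a string (or bytes), while json.load reads from an open file. The following is a minimal sketch of the difference, assuming a small hypothetical file demo_artists.json and a made-up sample record rather than the real artistdata.json. Note also that a file produced by several json.dump calls contains several concatenated documents, so it would likely still fail to parse as a single JSON value.

```python
import json
import os
import tempfile

# Hypothetical sample standing in for one page of the API response.
sample = {"data": [{"artistName": "Paul Feeley", "url": "paul-feeley"}], "hasMore": False}

# Write a single JSON document to a temporary file (the path is illustrative).
path = os.path.join(tempfile.mkdtemp(), "demo_artists.json")
with open(path, "w", encoding="utf-8") as outfile:
    json.dump(sample, outfile, indent=4)

# json.load accepts a file object directly.
with open(path, "r", encoding="utf-8") as data_file:
    data = json.load(data_file)

# json.loads accepts the file's text, not the file object itself.
with open(path, "r", encoding="utf-8") as data_file:
    same_data = json.loads(data_file.read())

print(data["data"][0]["artistName"])
```

When collecting several pages, one option is to append each page's records to a Python list and call json.dump once at the end, so that the file holds exactly one JSON document.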

I have a pandas DataFrame:
df['total_price'].describe()
which returns:
count    24895.000000
mean       216.377369
std        161.246931
min          0.000000
25%        109.900000
50%        174.000000
75%        273.000000
max       1355.900000
Name: total_price, dtype: float64
When I apply preprocessing.StandardScaler():
x = df[['total_price']]
standard_scaler = preprocessing.StandardScaler()
x_scaled = standard_scaler.fit_transform(x)
df['new_col'] = pd.DataFrame(x_scaled)
my new column with the standardized values contains some NaNs:
df[['total_price', 'new_col']].head()
total_price new_col
0 241.95 0.158596
1 241.95 0.158596
2 241.95 0.158596
3 81.95 -0.833691
4 81.95 -0.833691
df[['total_price', 'new_col']].tail()
total_price new_col
28167 264.0 NaN
28168 264.0 NaN
28176 94.0 NaN
28177 166.0 NaN
28178 166.0 NaN …
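One plausible cause, judging from the index labels in the tail (28167 and up, versus a count of 24895), is index misalignment: a pandas object built from the scaled array gets a fresh 0..n-1 index, and assignment aligns on labels, so rows whose labels fall outside that range receive NaN. The following is a small sketch under that assumption, using made-up prices and a hypothetical non-contiguous index; the column name bad_col is illustrative only.

```python
import pandas as pd
from sklearn.preprocessing import StandardScaler

# Hypothetical data with a non-contiguous index, as left behind by row filtering.
df = pd.DataFrame(
    {"total_price": [241.95, 81.95, 264.0, 94.0, 166.0]},
    index=[0, 3, 28167, 28176, 28177],
)

scaler = StandardScaler()
x_scaled = scaler.fit_transform(df[["total_price"]])  # NumPy array, shape (5, 1)

# Label-aligned assignment: the Series has index 0..4, so only labels 0 and 3
# find a match (and label 3 picks up the value of a different row).
df["bad_col"] = pd.Series(x_scaled.ravel())

# Positional assignment: a plain array is written row by row, ignoring labels.
df["new_col"] = x_scaled.ravel()

print(df)
```

Passing index=df.index when wrapping the array in a Series or DataFrame should have the same effect as the positional assignment.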

I am having trouble accessing Google Storage from Python 3.6. I install the client with:
pip install --upgrade google-cloud-storage
This is my Python script:
from google.cloud import storage

def main():
    client = storage.Client()
    bucket = client.get_bucket('my_bucket')
    blob1 = bucket.blob('my_file.json')
    blob1.upload_from_filename(filename='my_file.json')

if __name__ == "__main__":
    main()
pip show google-cloud-storage gives me the following output:
Name: google-cloud-storage
Version: 1.6.0
Summary: Python Client for Google Cloud Storage
Home-page: https://github.com/GoogleCloudPlatform/google-cloud-python
Author: Google Cloud Platform
Author-email: googleapis-publisher@google.com
License: Apache 2.0
Location: /usr/local/lib/python3.6/dist-packages
Requires: google-api-core, google-auth, google-cloud-core, requests, google-resumable-media
Any idea what is wrong here?

This is my pandas DataFrame lots_not_preprocessed_usd:
<class 'pandas.core.frame.DataFrame'>
Index: 78718 entries, 2017-09-12T18-38-38-076065 to 2017-10-02T07-29-40-245031
Data columns (total 20 columns):
created_year 78718 non-null float64
price 78718 non-null float64
........
decade 78718 non-null int64
dtypes: float64(8), int64(1), object(11)
memory usage: 12.6+ MB
head(1):
artist_name_normalized house created_year description exhibited_in exhibited_in_museums height images max_estimated_price min_estimated_price price provenance provenance_estate_of sale_date sale_id sale_title style title width decade
key
2017-09-12T18-38-38-076065 NaN c11 1862.0 An Album and a small Quantity of unframed Draw... NaN NaN NaN NaN 535.031166 267.515583 845.349242 NaN …

I am trying to repeat the rows of a DataFrame. Here is my original data:
pd.DataFrame([
{'col1': 1, 'col2': 11, 'col3': [1, 2] },
{'col1': 2, 'col2': 22, 'col3': [1, 2, 3] },
{'col1': 3, 'col2': 33, 'col3': [1] },
{'col1': 4, 'col2': 44, 'col3': [1, 2, 3, 4] },
])
This gives me
col1 col2 col3
0 1 11 [1, 2]
1 2 22 [1, 2, 3]
2 3 33 [1]
3 4 44 [1, 2, 3, 4]
I want to repeat the rows according to the length of the array in col3, i.e. I want to get a DataFrame like this:
col1 col2
0 1 11
1 1 11
2 2 22
3 2 22
4 2 …
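One way this kind of repetition can be sketched in pandas is to repeat each index label by the length of its col3 list and then select with .loc. This is only an illustration of that idea, built from the data shown above; the variable names lengths and repeated are made up here.

```python
import pandas as pd

df = pd.DataFrame([
    {'col1': 1, 'col2': 11, 'col3': [1, 2]},
    {'col1': 2, 'col2': 22, 'col3': [1, 2, 3]},
    {'col1': 3, 'col2': 33, 'col3': [1]},
    {'col1': 4, 'col2': 44, 'col3': [1, 2, 3, 4]},
])

# Number of elements in each row's list.
lengths = df['col3'].map(len)

# Repeat every index label lengths[i] times, select those rows,
# keep only col1 and col2, and renumber the result from 0.
repeated = (
    df.loc[df.index.repeat(lengths), ['col1', 'col2']]
    .reset_index(drop=True)
)

print(repeated)
```

If the list elements themselves are needed as a column, DataFrame.explode('col3') (pandas 0.25 or later) may be a more direct alternative.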

Running
import logging
import azure.functions as func

def main(req: func.HttpRequest) -> func.HttpResponse:
    logging.info('Python HTTP trigger function processed a request.')
    name = req.params.get('name')
    if not name:
        try:
            req_body = req.get_json()
        except ValueError:
            pass
        else:
            name = req_body.get('name')
    if name:
        return func.HttpResponse(f"Hello, {name}. This HTTP triggered function executed successfully.")
    else:
        return func.HttpResponse(
            "This HTTP triggered function executed successfully. Pass a name in the query string or in the request body for a personalized response.",
            status_code=200
        )
in VS Code
pyenv shell 3.9.12 …

How can I set the user agent for Scrapy with Splash in a way equivalent to the following:
import requests
from bs4 import BeautifulSoup
ua = {"User-Agent":"Mozilla/5.0"}
url = "http://www.example.com"
page = requests.get(url, headers=ua)
soup = BeautifulSoup(page.text, "lxml")
My spider looks like this:
import scrapy
from scrapy_splash import SplashRequest

class ExampleSpider(scrapy.Spider):
    name = "example"
    allowed_domains = ["example.com"]
    start_urls = ["https://www.example.com/"]

    def start_requests(self):
        for url in self.start_urls:
            yield SplashRequest(
                url,
                self.parse,
                args={'wait': 0.5}
            )

I want to sort objects of type AVLNode by their key (of type String). I instantiated a Comparator and want to apply the compareTo method to the String attribute. However, my IDE shows the error Cannot resolve method compareTo. I don't understand why I cannot use the compareTo method on a String here.
import java.util.*;
public class AVLTreeTest {
    public static void main(String[] args){
        Comparator<AVLNode> myComp2 = new Comparator<AVLNode>() {
            @Override public int compare(AVLNode n1, AVLNode n2) {
                return n1.getKey().compareTo(n2.getKey());
            }
        };
        AVLNode<String, AVLNode> a1 = new AVLNode( "test3", new Cuboid (2,3,4,5,6,7) );
        AVLNode<String, AVLNode> a2 = new AVLNode( "test2", new Cuboid (2,3,4,5,6,7) );
        AVLNode<String, AVLNode> a3 = new AVLNode( "test8", new Cuboid (2,3,4,5,6,7) );
        AVLNode<String, AVLNode> a4 = new AVLNode( "test1", new …

Here is my input data:
Here is the desired output, where columns r, f and m are transformed and the results are appended to the original data
Here is the code:
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
from sklearn.compose import ColumnTransformer
from sklearn.preprocessing import PowerTransformer
df = pd.DataFrame(np.random.randint(0,100,size=(10, 3)), columns=list('rfm'))
column_trans = ColumnTransformer(
[('r_std', StandardScaler(), ['r']),
('f_std', StandardScaler(), ['f']),
('m_std', StandardScaler(), ['m']),
('r_boxcox', PowerTransformer(method='box-cox'), ['r']),
('f_boxcox', PowerTransformer(method='box-cox'), ['f']),
('m_boxcox', PowerTransformer(method='box-cox'), ['m']),
])
transformed = column_trans.fit_transform(df)
new_cols = ['r_std', 'f_std', 'm_std', 'r_boxcox', 'f_boxcox', 'm_boxcox']
transformed_df = pd.DataFrame(transformed, columns=new_cols)
pd.concat([df, transformed_df], axis = 1)
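I also need additional transformers, so I need to keep the original columns in the pipeline. Is there a better way to handle this? In particular, can the concatenation and the column naming be done inside the pipeline?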
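
I want to style a table with CSS as shown in the picture: in odd rows the cell in the first column, and in even rows the cell in the second column, should get a different background color.
I know how to alternate whole rows or columns: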
tr {
    border-top: 0px solid $input-border-col;
    &:first-child {
        border-top: none;
    }
    &:nth-child(even) {background: #CCC;}
}
but I have not yet found a way to alternate within each row.