Good morning everyone,
I'm trying to install the pymongo library but get the following error:
(C:\Users\xxxxxxx\AppData\Local\Continuum\anaconda3) C:\Users\xxxxxxx> conda install -c anaconda pymongo
Fetching package metadata ...
CondaHTTPError: HTTP 000 CONNECTION FAILED for url <https://conda.anaconda.org/anaconda/win-64/repodata.json>
Elapsed: -
An HTTP error occurred when trying to retrieve this URL.
HTTP errors are often intermittent, and a simple retry will get you on your way.
ConnectTimeout(MaxRetryError("HTTPSConnectionPool(host='conda.anaconda.org', port=443): Max retries exceeded with url: /anaconda/win-64/repodata.json (Caused by ConnectTimeoutError(<urllib3.connection.VerifiedHTTPSConnection object at 0x00000000054D6128>, 'Connection to conda.anaconda.org timed out. (connect timeout=9.15)'))",),)
Steps taken to resolve:
1. Update the C:\Users\xxxxxxx\.condarc file with the …

I'd like to create my pandas startup options for any project in PyCharm. I have a project called Test that contains three modules.
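I created Startup_Panda_Options.py with the settings I need, then created __init__.py so that it would be loaded when the Test project starts. When I run some test data in test.py, it fails with NameError: name 'pd' is not defined, which I take to mean that __init__.py never ran.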
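One possible reading of that NameError, offered as an assumption rather than a diagnosis: a name imported in one module (such as pd in Startup_Panda_Options.py) is not visible in another module, so test.py would need its own import of pandas whether or not __init__.py ran. The sketch below shows one way the options could be applied explicitly with pd.set_option. The helper name apply_pandas_options is hypothetical, and the two option values are the ones visible in the question's dictionary.

```python
import pandas as pd


def apply_pandas_options(options):
    """Apply a nested {category: {name: value}} dict through pd.set_option."""
    for category, settings in options.items():
        for name, value in settings.items():
            # Builds keys such as "display.max_columns"
            pd.set_option(f"{category}.{name}", value)


# Hypothetical usage: call this once at the top of any script that needs the
# settings, after that script has imported pandas itself.
apply_pandas_options({
    "display": {
        "max_columns": None,  # None means no limit on displayed columns
        "max_colwidth": 25,   # maximum width of a displayed column
    }
})
```

With this layout, test.py would import pandas and the helper explicitly instead of relying on code that runs at project startup.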
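The second option was to put Startup_Panda_Options.py in the folder C:\Users\peter\Documents\PyCharm\venv\Lib\site-packages\IPython\core\profile, but the same problem occurred.
The third option: in PyCharm, via File | Settings | Tools | Startup Tasks, I referenced the Startup_Panda_Options.py file.
So that is three attempts, based on suggestions on here and the documentation. Any ideas on how to get these options to work? The three modules are below: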
Startup_Panda_Options.py:
import pandas as pd


def Panda_Options():
    options = {
        'display': {
            'max_columns': None,  # used in __repr__() methods; a value of None means unlimited.
            'max_colwidth': 25,  # sets the maximum width of columns. Cells of this length or …

Afternoon all,
I'm looking for the number of years between two dates, to four decimal places. My data:
df_Years = df[
    df['state'].str.contains('Done')
][['maturity_date']].copy()
# Parse the dates from df_Years itself (df_Date is not defined in this snippet)
df_Years['maturity_date'] = pd.to_datetime(df_Years['maturity_date'])
df_Years['Today'] = pd.to_datetime('today')
display(df_Years.head(6))
     maturity_date  Today
13   2022-12-15     2018-03-21
81   2028-02-15     2018-03-21
82   2045-12-01     2018-03-21
100  2025-08-18     2018-03-21
115  2019-01-16     2018-03-21
116  2018-12-21     2018-03-21
display(df_Years.dtypes)
maturity_date    datetime64[ns]
Today            datetime64[ns]
dtype: object
#Dataframe types
Attempt 1:
df_Years['Year_To_Maturity'] = df_Years['maturity_date'] - df_Years['Today']
df_Years['Year_To_Maturity'] = df_Years['Year_To_Maturity'].apply(lambda x: float(x.item().days)/365)
Error:
AttributeError: 'Timedelta' object has no attribute 'item'
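The error suggests that each value handed to the lambda is already a pandas Timedelta, which has a .days attribute but no .item() method. A minimal sketch of one way around it, assuming a fixed 365-day year as in the attempt above and using three of the sample dates from the question: the .dt accessor reads the day count for the whole column at once, so no apply is needed. The lowercase name df_years is only for this illustration.

```python
import pandas as pd

# Three of the sample rows from the question; "Today" is pinned to the date
# shown in the displayed output so the result is reproducible.
df_years = pd.DataFrame({
    "maturity_date": pd.to_datetime(["2022-12-15", "2028-02-15", "2018-12-21"]),
})
df_years["Today"] = pd.Timestamp("2018-03-21")

# Subtracting two datetime columns gives a timedelta64 column.
delta = df_years["maturity_date"] - df_years["Today"]

# .dt.days extracts whole days; dividing by 365 is an approximation that
# ignores leap years. Round to four decimal places as requested.
df_years["Year_To_Maturity"] = (delta.dt.days / 365).round(4)

print(df_years)
```

If leap years matter, dividing by 365.25 is a common alternative approximation; which convention is appropriate depends on the use case.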
Attempt 2:
df_Years['Year_To_Maturity'] = df_Years['maturity_date'] - df_Years['Today']
df_Years['Year_To_Maturity'] = df_Years['Year_To_Maturity'].apply(lambda x: float(x.item().days)/365)
Output:
maturity_date Today Year_To_Maturity
13 …

I want to select the rows where state contains the word "Traded" and trading_book does not start with the letters "E", "L", or "N".
Test_Data = [
    ('originating_system_id', ['RBCL', 'RBCL', 'RBCL', 'RBCL']),
    ('rbc_security_type1', ['CORP', 'CORP', 'CORP', 'CORP']),
    ('state', ['Traded', 'Traded Away', 'Traded', 'Traded Away']),
    ('trading_book', ['LCAAAAA', 'NUBBBBB', 'EDFGSFG', 'PDFEFGR']),
]
# DataFrame.from_items was removed in pandas 1.0; building from a dict keeps the column order
dfTest_Data = pd.DataFrame(dict(Test_Data))
display(dfTest_Data)
originating_system_id  rbc_security_type1  state        trading_book
RBCL                   CORP                Traded       LCAAAAA
RBCL                   CORP                Traded Away  NUBBBBB
RBCL                   CORP                Traded       EDFGSFG
RBCL                   CORP                Traded Away  PDFEFGR
Desired output:
originating_system_id  rbc_security_type1  state        trading_book
RBCL                   CORP                Traded Away  PDFEFGR
I thought this would do the trick:
prefixes = ['E', 'L', 'N']
df_Traded_Away_User = dfTest_Data[
    dfTest_Data[~dfTest_Data['trading_book'].str.startswith(tuple(prefixes))] &
    (dfTest_Data['state'].str.contains('Traded'))
][['originating_system_id', 'rbc_security_type1', 'state', 'trading_book']]
display(df_Traded_Away_User)
But I get:
ValueError: Must pass DataFrame with boolean values only
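The error appears to come from the inner dfTest_Data[~...] lookup: it returns a filtered DataFrame, not a boolean Series, and that DataFrame is then combined with the second condition and used as an indexer. A minimal sketch of the usual pattern, which builds two boolean Series and combines them before indexing once. It assumes every prefix is a single character, so the first letter is tested with .str[0].isin(...) in place of startswith; the names df, starts_with_prefix and contains_traded are introduced only for this illustration.

```python
import pandas as pd

# Same sample data as in the question
df = pd.DataFrame({
    "originating_system_id": ["RBCL", "RBCL", "RBCL", "RBCL"],
    "rbc_security_type1": ["CORP", "CORP", "CORP", "CORP"],
    "state": ["Traded", "Traded Away", "Traded", "Traded Away"],
    "trading_book": ["LCAAAAA", "NUBBBBB", "EDFGSFG", "PDFEFGR"],
})

prefixes = ["E", "L", "N"]

# Boolean Series: True where trading_book begins with one of the prefixes.
# Assumes single-character prefixes, so only the first letter is compared.
starts_with_prefix = df["trading_book"].str[0].isin(prefixes)

# Boolean Series: True where state contains the substring "Traded"
contains_traded = df["state"].str.contains("Traded")

# Combine the two masks first, then index the DataFrame a single time
result = df[~starts_with_prefix & contains_traded]

print(result)
```

On this sample only the PDFEFGR row remains, because every other trading_book value starts with E, L, or N.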

I've arrived at the grouping I want, but would like to add a percentage column based on each month's total, i.e. regardless of the string in originating_system_id.
d = [('Total_RFQ_For_Month', 'size')]
df_RFQ_Channel = df.groupby(['Year_Month','originating_system_id'])['state'].agg(d)
#df_RFQ_Channel['RFQ_Pcent_For_Month'] = ?
display(df_RFQ_Channel)
Year_Month  originating_system_id  Total_RFQ_For_Month  RFQ_Pcent_For_Month
2017-11     BBT                    59                   7.90%
            EUCR                   33                   4.42%
            MAXL                   6                    0.80%
            MXUS                   649                  86.88%
2017-12     BBT                    36                   73.47%
            EUCR                   7                    14.29%
            MAXL                   6                    12.24%
2018-01     BBT                    88                   9.52%
            EUCR                   26                   2.81%
            MAXL                   4                    0.43%
            MXUS                   800                  86.58%
            VOIX                   6                    0.65%
Example:
7.90% is BBT's Total_RFQ_For_Month (59) divided by the sum of all for 2017-11 (747)
2.81% is EUCR's Total_RFQ_For_Month (26) divided by the …
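One way to get such a column, sketched under the assumption that the grouped result is indexed by Year_Month and originating_system_id as displayed above: group again on the Year_Month index level and use transform("sum") to broadcast each month's total back to every row, then divide. The frame named counts below is rebuilt by hand from the 2017-11 and 2017-12 figures in the question and stands in for df_RFQ_Channel.

```python
import pandas as pd

# Hand-built stand-in for the grouped result (2017-11 and 2017-12 rows only)
counts = pd.DataFrame({
    "Year_Month": ["2017-11"] * 4 + ["2017-12"] * 3,
    "originating_system_id": ["BBT", "EUCR", "MAXL", "MXUS", "BBT", "EUCR", "MAXL"],
    "Total_RFQ_For_Month": [59, 33, 6, 649, 36, 7, 6],
}).set_index(["Year_Month", "originating_system_id"])

# transform("sum") returns a Series aligned with the original rows, where each
# row holds the total for its own month (747 for 2017-11, 49 for 2017-12).
month_total = counts.groupby(level="Year_Month")["Total_RFQ_For_Month"].transform("sum")

# Share of the month's total, as a percentage rounded to two decimals
counts["RFQ_Pcent_For_Month"] = (
    counts["Total_RFQ_For_Month"] / month_total * 100
).round(2)

print(counts)
```

The values are kept numeric here; formatting them with a trailing % sign is a display step that can be done separately.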