I tested the theorem A = Q * Lambda * Q_inverse, where Q is the matrix of eigenvectors and Lambda is the diagonal matrix with the eigenvalues on the diagonal.
My code is as follows:
import numpy as np
from numpy import linalg as lg
Eigenvalues, Eigenvectors = lg.eigh(np.array([
[1, 3],
[2, 5]
]))
Lambda = np.diag(Eigenvalues)
Eigenvectors @ Lambda @ lg.inv(Eigenvectors)
It returns:
array([[ 1., 2.],
[ 2., 5.]])
Shouldn't the returned matrix be the same as the original matrix that was decomposed?
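The likely cause (my reading, based on the output shown): lg.eigh is intended for symmetric/Hermitian matrices and, by default, reads only the lower triangle of its input, so it effectively decomposed [[1, 2], [2, 5]], which is exactly the matrix that came back. A minimal sketch using lg.eig, which handles general non-symmetric matrices:

import numpy as np
from numpy import linalg as lg

A = np.array([[1, 3],
              [2, 5]])

# eig handles general square matrices; eigh assumes Hermitian input
eigenvalues, eigenvectors = lg.eig(A)
Lambda = np.diag(eigenvalues)

# Reconstruct A = Q @ Lambda @ inv(Q)
reconstructed = eigenvectors @ Lambda @ lg.inv(eigenvectors)
print(np.allclose(reconstructed, A))  # expected: True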
I have a DataFrame containing some np.inf values, and I want to isolate the rows in which np.inf occurs and inspect them. However, the DataFrame has many columns, so they are not easy to check one by one, although that could be done in a loop.
I tried this, but it failed:
rows_with_inf = [df1[column][df1[column] == np.inf] for column in df1.columns if ((df1[column].isin([np.inf])).sum() !=0)]
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-94-768652e951ec> in <module>
----> 1 rows_with_inf = [df1[column][df1[column] == np.inf] for column in df1.columns if ((df1[column].isin([np.inf])).sum() !=0)]
<ipython-input-94-768652e951ec> in <listcomp>(.0)
----> 1 rows_with_inf = [df1[column][df1[column] == np.inf] for column in df1.columns if ((df1[column].isin([np.inf])).sum() !=0)]
~\Anaconda3\envs\tf2\lib\site-packages\pandas\core\generic.py in __nonzero__(self)
1553 "The truth value of a {0} is ambiguous. "
1554 "Use a.empty, a.bool(), a.item(), a.any() or a.all().".format( …Run Code Online (Sandbox Code Playgroud) 我正在尝试对中等大小的数据框(~100,000 行)进行插补,其中 30 列中有 5 列具有 NA(很大比例,大约 60%)。
I am trying to impute a moderately sized data frame (~100,000 rows) in which 5 of the 30 columns contain NAs (a large proportion, around 60%).
I tried mice with the following code:
library(mice)
data_3 = complete(mice(data_2))
After the first iteration, I get the following error:
iter imp variable
1 1 Existing_EMI Loan_Amount Loan_Period
Error in solve.default(xtx + diag(pen)): system is computationally singular: reciprocal condition number = 1.08007e-16
Is there another package better suited to this situation? How should I handle this problem?
When I run the following code I get an exception that I cannot explain:
import datetime
datetime.datetime.strptime("2018-04-02", format = "%Y-%m-%d")
TypeError: strptime() takes no keyword arguments
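As the TypeError says, strptime accepts its format string only positionally, not as a keyword argument. A minimal fix:

import datetime

# Pass the format positionally; strptime takes no keyword arguments
dt = datetime.datetime.strptime("2018-04-02", "%Y-%m-%d")
print(dt)  # 2018-04-02 00:00:00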
I want to place 3 plots using subplots: two plots in the top row, and one plot occupying the entire second row.
My code creates a gap between the two top plots and the bottom plot. How can I correct this?
df_CI
Country China India
1980 5123 8880
1981 6682 8670
1982 3308 8147
1983 1863 7338
1984 1527 5704
fig = plt.figure() # create figure
ax0 = fig.add_subplot(221) # add subplot 1 (2 row, 2 columns, first plot)
ax1 = fig.add_subplot(222) # add subplot 2 (2 row, 2 columns, second plot).
ax2 = fig.add_subplot(313) # a 3 digit number where the hundreds represent nrows, the tens represent ncols
# and the units represent plot_number.
# Subplot …Run Code Online (Sandbox Code Playgroud) 这是代码示例:
Here is a code sample:
data.timestamp = pd.to_datetime(data.timestamp, infer_datetime_format = True, utc = True)
data.timestamp.dtype
CategoricalDtype(categories=['2016-01-10 06:00:00+00:00', '2016-01-10 07:00:00+00:00',
'2016-01-10 08:00:00+00:00', '2016-01-10 09:00:00+00:00',
'2016-01-10 10:00:00+00:00', '2016-01-10 11:00:00+00:00',
'2016-01-10 12:00:00+00:00', '2016-01-10 13:00:00+00:00',
'2016-01-10 14:00:00+00:00', '2016-01-10 15:00:00+00:00',
...
'2016-12-31 13:00:00+00:00', '2016-12-31 14:00:00+00:00',
'2016-12-31 15:00:00+00:00', '2016-12-31 16:00:00+00:00',
'2016-12-31 17:00:00+00:00', '2016-12-31 18:00:00+00:00',
'2016-12-31 19:00:00+00:00', '2016-12-31 20:00:00+00:00',
'2016-12-31 21:00:00+00:00', '2016-12-31 23:00:00+00:00'],
ordered=False)
How can I fix this?
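One guess (an assumption, since the column's origin isn't shown): timestamp is stored as a Categorical of date strings, and the parsed result is not replacing it. Casting the values to plain strings before parsing, and assigning back with bracket notation, should yield a proper datetime64 dtype. A sketch with a toy frame:

import pandas as pd

# Toy frame mimicking the situation: timestamps stored as a Categorical
data = pd.DataFrame({"timestamp": pd.Categorical(
    ["2016-01-10 06:00:00+00:00", "2016-01-10 07:00:00+00:00"])})

# Cast to str first so to_datetime sees plain strings, then assign back
data["timestamp"] = pd.to_datetime(data["timestamp"].astype(str), utc=True)
print(data["timestamp"].dtype)  # expected: datetime64[ns, UTC]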
I have a pyspark DataFrame containing some columns with the suffix _24.
df.columns = ['timestamp',
'air_temperature_median_24',
'air_temperature_median_6',
'wind_direction_mean_24',
'wind_speed',
'building_id']
I tried to select them using the colRegex method, but the code below raises an exception:
df.select(ashrae.colRegex(".+'_24'")).show()
---------------------------------------------------------------------------
Py4JJavaError Traceback (most recent call last)
<ipython-input-103-a8189f0298e6> in <module>
----> 1 ashrae.select(ashrae.colRegex(".+'_24'")).show()
C:\spark\spark-3.0.0-preview-bin-hadoop2.7\python\pyspark\sql\dataframe.py in colRegex(self, colName)
957 if not isinstance(colName, basestring):
958 raise ValueError("colName should be provided as string")
--> 959 jc = self._jdf.colRegex(colName)
960 return Column(jc)
961
C:\spark\spark-3.0.0-preview-bin-hadoop2.7\python\lib\py4j-0.10.8.1-src.zip\py4j\java_gateway.py in __call__(self, *args)
1284 answer = self.gateway_client.send_command(command)
1285 return_value = get_return_value(
-> 1286 answer, self.gateway_client, self.target_id, self.name)
1287
1288 for temp_arg in temp_args: …
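Two things stand out: the code mixes the names df and ashrae, and colRegex expects the regex wrapped in backticks rather than single quotes. A sketch with a toy frame, assuming the intent is to select every column ending in _24:

from pyspark.sql import SparkSession

spark = SparkSession.builder.getOrCreate()

# Toy frame with the same kind of suffixed column names
df = spark.createDataFrame(
    [(1.0, 2.0, 3.0)],
    ["air_temperature_median_24", "wind_direction_mean_24", "wind_speed"])

# colRegex takes an unquoted regex wrapped in backticks
df.select(df.colRegex("`.*_24`")).show()

# Equivalent without regex: plain string matching on df.columns
df.select([c for c in df.columns if c.endswith("_24")]).show()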