I don't understand why I get KeyError: '[ 1351 1352 1353 ... 13500 13501 13502] not in index'
when I run this code:
from sklearn.model_selection import KFold
cv = KFold(n_splits=10)
for train_index, test_index in cv.split(X):
    f_train_X, f_valid_X = X[train_index], X[test_index]
    f_train_y, f_valid_y = y[train_index], y[test_index]
X, which I split with cv.split(X), is a pandas DataFrame.
X.shape
Out: (13503, 17)
y.shape
Out: (13503,)
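A likely cause, sketched here under assumptions: indexing a DataFrame as X[train_index] selects columns by label, so the integer positions are looked up as column names and are "not in index". Selecting rows by position goes through .iloc. The small synthetic X and y and the hand-built fold indices below are illustrative stand-ins (the fold arrays play the role of what cv.split(X) yields); they are not the data from the question.

```python
import numpy as np
import pandas as pd

# Small synthetic stand-ins for X (DataFrame) and y (Series); sizes are illustrative.
X = pd.DataFrame(np.arange(40).reshape(20, 2), columns=["a", "b"])
y = pd.Series(np.arange(20))

# Hypothetical fold indices: integer position arrays, like those KFold.split(X) yields.
folds = np.array_split(np.arange(len(X)), 4)

fold_shapes = []
for i, test_index in enumerate(folds):
    train_index = np.concatenate([f for j, f in enumerate(folds) if j != i])
    # .iloc selects rows by integer position. X[train_index] would instead look
    # the integers up as column labels, which raises the KeyError.
    f_train_X, f_valid_X = X.iloc[train_index], X.iloc[test_index]
    f_train_y, f_valid_y = y.iloc[train_index], y.iloc[test_index]
    fold_shapes.append((f_train_X.shape, f_valid_X.shape))
```

Converting the frame to a NumPy array first (for example with X.to_numpy()) would also make positional indexing work, at the cost of losing the column labels.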

Given a pandas DataFrame, how can I add the suffix "_old" to all columns except the two columns Id and Name?
import pandas as pd
data = [[1,'Alex',22,'single'],[2,'Bob',32,'married'],[3,'Clarke',23,'single']]
df = pd.DataFrame(data,columns=['Id','Name','Age','Status'])
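One possible approach, as a minimal sketch: DataFrame.rename accepts a callable for columns, so a small function can leave Id and Name alone and append the suffix to everything else. The names keep and renamed are introduced here for illustration only.

```python
import pandas as pd

data = [[1, "Alex", 22, "single"], [2, "Bob", 32, "married"], [3, "Clarke", 23, "single"]]
df = pd.DataFrame(data, columns=["Id", "Name", "Age", "Status"])

# Columns that should keep their original names (illustrative helper name).
keep = {"Id", "Name"}

# rename applies the callable to every column label and returns a new DataFrame;
# the original df is left unchanged.
renamed = df.rename(columns=lambda c: c if c in keep else f"{c}_old")
```

DataFrame.add_suffix would rename every column, including Id and Name, which is why the sketch uses rename with a condition instead.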

How can I select multiple columns of a Dataset ds in Spark 2.3 (Java) by passing a list as the argument?
For example, this works fine:
ds.select("col1","col2","col3").show();
However, this fails:
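List<String> columns = Arrays.asList("col1","col2","col3");
// columns.toString() produces the single string "[col1, col2, col3]", which select treats as one column name
ds.select(columns.toString()).show();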

I have a DataFrame df that is the result of a computation. I then store this DataFrame in a database for later use.
For example:
import org.apache.spark.rdd.RDD
import org.apache.spark.sql.Row
import org.apache.spark.sql.types._
val rowsRDD: RDD[Row] = sc.parallelize(
  Seq(
    Row("first", 2.0, 7.0),
    Row("second", 3.5, 2.5),
    Row("third", 7.0, 5.9)
  )
)
val schema = new StructType()
  .add(StructField("id", StringType, true))
  .add(StructField("val1", DoubleType, true))
  .add(StructField("val2", DoubleType, true))
val df = spark.createDataFrame(rowsRDD, schema)
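I need to check that every column in the final DataFrame has a specific data type. One way, of course, is to create the DataFrame with a schema (as in the example above). However, data types sometimes change during the computation, after the initial DataFrame has been created (for example, when a formula applied to the DataFrame is changed).
So I want to double-check that the final DataFrame still matches the initial schema, and apply the corresponding casts if it does not. Is there a way to do this?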

I have the following DataFrame df:
                  TIME  DELAY
0  2016-01-01 06:30:00      0
1  2016-01-01 14:10:00      2
2  2016-01-01 07:05:00      2
3  2016-01-01 11:00:00      1
4  2016-01-01 10:40:00      0
5  2016-01-01 08:10:00      7
6  2016-01-01 11:35:00      2
7  2016-01-02 13:50:00      2
8  2016-01-02 14:50:00      4
9  2016-01-02 14:05:00      1
Note that the rows are not sorted by datetime.
For each row, I want to know the average delay over the previous 2 hours. To do this, I ran the following code:
df.index = pd.DatetimeIndex(df["TIME"])
df["DELAY_LAST2HOURS"] = df["DELAY"].rolling("2H").mean()
But I got this error:
ValueError: index must be monotonic
How can I solve this task correctly?
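One way this could be handled, as a sketch rather than a definitive answer: time-based rolling windows need a monotonic time axis, so the frame can be sorted by TIME before rolling, and the result can be assigned back so that it aligns with the original row labels. The names df_sorted and rolled are illustrative; the data is copied from the table above.

```python
import pandas as pd

df = pd.DataFrame({
    "TIME": pd.to_datetime([
        "2016-01-01 06:30:00", "2016-01-01 14:10:00", "2016-01-01 07:05:00",
        "2016-01-01 11:00:00", "2016-01-01 10:40:00", "2016-01-01 08:10:00",
        "2016-01-01 11:35:00", "2016-01-02 13:50:00", "2016-01-02 14:50:00",
        "2016-01-02 14:05:00",
    ]),
    "DELAY": [0, 2, 2, 1, 0, 7, 2, 2, 4, 1],
})

# Sort by time so the rolling axis is monotonic; the original integer
# row labels are carried along with the rows.
df_sorted = df.sort_values("TIME")

# Roll over the TIME column. By default the 2-hour window is (t - 2h, t],
# so each row's own delay is included in its mean.
rolled = df_sorted.rolling("2h", on="TIME")["DELAY"].mean()

# Assignment aligns on the row labels, so df keeps its original order.
df["DELAY_LAST2HOURS"] = rolled
```

If the current row should be excluded from its own average, passing closed="left" to rolling is one option to consider; that changes which rows fall in each window, so the results would differ from the ones this sketch produces.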

I installed gdal on a Mac as follows:
brew install -v gdal
However, when I run my program (which runs successfully on Linux), it fails with the following error:
  File "/usr/local/Cellar/python@2/2.7.15_1/Frameworks/Python.framework/Versions/2.7/lib/python2.7/importlib/__init__.py", line 37, in import_module
    __import__(name)
  File "/Users/tesor/Desktop/test/api-server/api-server/lib/python2.7/site-packages/django/contrib/gis/admin/__init__.py", line 5, in <module>
    from django.contrib.gis.admin.options import GeoModelAdmin, OSMGeoAdmin
  File "/Users/tesor/Desktop/test/api-server/api-server/lib/python2.7/site-packages/django/contrib/gis/admin/options.py", line 2, in <module>
    from django.contrib.gis.admin.widgets import OpenLayersWidget
  File "/Users/tesor/Desktop/test/api-server/api-server/lib/python2.7/site-packages/django/contrib/gis/admin/widgets.py", line 4, in <module>
    from django.contrib.gis.geos import GEOSException, GEOSGeometry
  File "/Users/tesor/Desktop/test/api-server/api-server/lib/python2.7/site-packages/django/contrib/gis/geos/__init__.py", line 18, in <module>
    HAS_GEOS = geos_version_info()['version'] >= '3.3.0'
  File "/Users/tesor/Desktop/test/api-server/api-server/lib/python2.7/site-packages/django/contrib/gis/geos/libgeos.py", line 196, in geos_version_info
    raise GEOSException('Could not parse version info string "%s"' % ver)
django.contrib.gis.geos.error.GEOSException: Could not parse …

I created a very simple Django API. It returns a fixed numeric value (for testing purposes only):
views.py
from django.http import HttpResponse
def index(request):
    return HttpResponse(0)
I also have a simple front end built with React JS. To develop the back end and the front end, I followed these two tutorials:
ReactJS: https://mherman.org/blog/dockerizing-a-react-app/
Django Python API: https://semaphoreci.com/community/tutorials/dockerizing-a-python-django-web-application
Now I want to send a POST request from ReactJS to the Django API and pass the parameters name and email. How can I do that?
This is my App.js:
import React, { Component } from 'react';
import logo from './logo.svg';
import './App.css';
class App extends Component {
  constructor(props) {
    super(props);
    this.state = {
      fullname: "",
      emailaddress: ""
    };
    this.handleChange = this.handleChange.bind(this);
    this.handleSubmit = this.handleSubmit.bind(this);
  }
  handleChange(event) {
    const target = …
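
I cannot solve a CORS problem in my Django API. When I call this API, I get the error:
Access to fetch at 'http://localhost:8000/' from origin 'http://localhost' has been blocked by CORS policy: Response to preflight request doesn't pass access control check: No 'Access-Control-Allow-Origin' header is present on the requested resource. If an opaque response serves your needs, set the request's mode to 'no-cors' to fetch the resource with CORS disabled.
To enable CORS, I ran pip install django-cors-headers and added the following to settings.py: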
INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'django.contrib.contenttypes',
    'django.contrib.sessions',
    'django.contrib.messages',
    'django.contrib.staticfiles',
    'corsheaders',
]
MIDDLEWARE_CLASSES = [
    'corsheaders.middleware.CorsMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.security.SecurityMiddleware',
    'django.contrib.sessions.middleware.SessionMiddleware',
    'django.middleware.common.CommonMiddleware',
    'django.middleware.csrf.CsrfViewMiddleware',
    'django.contrib.auth.middleware.AuthenticationMiddleware',
    'django.contrib.auth.middleware.SessionAuthenticationMiddleware',
    'django.contrib.messages.middleware.MessageMiddleware',
    'django.middleware.clickjacking.XFrameOptionsMiddleware',
]
CORS_ORIGIN_WHITELIST = [
    'localhost:80',
    'localhost:8000',
    '127.0.0.1:8000'
]
I should mention that I run my project on Docker. This is docker-compose.yml:
version: '2'
services:
  django-docker:
    build:
      context: .
      dockerfile: Dockerfile.django
    container_name: my.django
    image: my-django
    ports:
      - 8000:8000
  webapp-docker:
    build:
      context: .
      dockerfile: Dockerfile.webapp
    container_name: my.webapp
    image: my-web
    ports: …
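Two things in the settings above may be worth checking; the fragment below sketches them under explicit assumptions, since the question states neither the Django nor the django-cors-headers version. It assumes Django 1.10 or later (which reads MIDDLEWARE; MIDDLEWARE_CLASSES was removed in Django 2.0) and django-cors-headers 3.5 or later (which names the setting CORS_ALLOWED_ORIGINS and expects each origin to include its scheme). The elided entries stand for whatever the project already lists.

```python
# settings.py (fragment). Assumes Django >= 1.10 and django-cors-headers >= 3.5;
# neither version is stated in the question.
INSTALLED_APPS = [
    # ... the existing apps ...
    "corsheaders",
]

# Django >= 1.10 reads MIDDLEWARE; MIDDLEWARE_CLASSES was removed in Django 2.0.
MIDDLEWARE = [
    "corsheaders.middleware.CorsMiddleware",  # placed before CommonMiddleware
    "django.middleware.common.CommonMiddleware",
    # ... the remaining middleware ...
]

# Origins include the scheme. A browser omits the default port 80 in the
# Origin header, so the front end's origin is "http://localhost".
CORS_ALLOWED_ORIGINS = [
    "http://localhost",
    "http://localhost:8000",
    "http://127.0.0.1:8000",
]
```

On older django-cors-headers releases the setting is still called CORS_ORIGIN_WHITELIST, so the installed version is worth checking before renaming anything.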

When I try to install mysql-community-release, I get the following error:
# yum install mysql-community-release
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: mirror.uv.es
* extras: mirror.uv.es
* updates: mirror.uv.es
Resolving Dependencies
--> Running transaction check
---> Package mysql-community-release.noarch 0:el7-7 will be installed
--> Processing Conflict: mysql57-community-release-el7-11.noarch conflicts mysql-community-release
--> Finished Dependency Resolution
Error: mysql57-community-release conflicts with mysql-community-release-el7-7.noarch
This is the list of installed packages:
# yum list installed mysql\*
Loaded plugins: fastestmirror
Loading mirror speeds from cached hostfile
* base: mirror.uv.es
* extras: mirror.uv.es
* updates: mirror.uv.es
Installed Packages
mysql-community-client.x86_64 …
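
This seems like a simple task, but I can't figure out how to do it with Scala in Spark (not PySpark). I have a DataFrame df with various columns. One of the columns, of type String, should be changed to Long. How can I do that?
If I run this code, I get the compilation error Cannot resolve symbol col: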
df.withColumn("timestamp", col("timestamp").cast(LongType))