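I need to import functions from other Python scripts that are used inside the preprocessing.py file. I cannot find a way to pass the dependent files to the SKLearnProcessor object, so I get a ModuleNotFoundError.
Code: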
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.processing import ProcessingInput, ProcessingOutput

sklearn_processor = SKLearnProcessor(framework_version='0.20.0',
                                     role=role,
                                     instance_type='ml.m5.xlarge',
                                     instance_count=1)
sklearn_processor.run(code='preprocessing.py',
                      inputs=[ProcessingInput(
                          source=input_data,
                          destination='/opt/ml/processing/input')],
                      outputs=[ProcessingOutput(output_name='train_data',
                                                source='/opt/ml/processing/train'),
                               ProcessingOutput(output_name='test_data',
                                                source='/opt/ml/processing/test')],
                      arguments=['--train-test-split-ratio', '0.2']
                      )
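I would like to pass dependent_files = ['file1.py', 'file2.py', 'requirements.txt'] so that preprocessing.py can access all the dependent modules.
I also need to install the libraries listed in requirements.txt.
Can you share a workaround or the right way to do this?
Update, 25 November 2021:
Q1. (Answered, but I am looking for a solution that uses FrameworkProcessor)
Here, the get_run_args function of FrameworkProcessor handles the dependencies, source_dir and code parameters. Is there any way to set these parameters from ScriptProcessor, SKLearnProcessor or any other Processor?
Q2.
Can you also show some examples of using our Processor as a sagemaker.workflow.steps.ProcessingStep and then using it in …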
When I try to run my Django application with sslserver as shown below,
python manage.py runsslserver
Error:
Traceback:
Validating models...
System check identified no issues (0 silenced).
November 08, 2019 - 11:17:26
Django version 2.0.7, using settings 'dashboard_channels.settings'
Starting development server at https://127.0.0.1:8000/
Using SSL certificate: \lib\site-packages\sslserver\certs\development.crt
Using SSL key: \lib\site-packages\sslserver\certs\development.key
Quit the server with CTRL-BREAK.
[08/Nov/2019 11:18:33] "GET / HTTP/1.1" 200 1299
[08/Nov/2019 11:18:34] "GET / HTTP/1.1" 200 1299
[08/Nov/2019 11:18:35] "GET /static/js/jquery.js HTTP/1.1" 200 270575
Not Found: /ws/home
[08/Nov/2019 11:18:36] "GET /ws/home HTTP/1.1" 404 2134
Browser console: …
The provided answers need more detail about Qlik server authentication.
I am trying to connect to Qlik over WebSockets using certificates.
Error:
websocket._exceptions.WebSocketProxyException: failed CONNECT via proxy status: 503
Code:
from websocket import create_connection
import ssl

senseHost = "dummy.xyz.com"
privateKeyPath = "C:\\ProgramData\\Qlik\\Sense\\Repository\\Exported Certificates\\"
## userDirectory and userId can be found at QMC -> Users
userDirectory, userId = "DIRECTORY_OF_SERVER", "QlikServerUserId"
url = "wss://" + senseHost + ":4747/app/"  # valid
certs = ({"ca_certs": privateKeyPath + "root.pem",
          "certfile": privateKeyPath + "client.pem",
          "keyfile": privateKeyPath + "client_key.pem",
          "cert_reqs": ssl.CERT_REQUIRED,
          "server_side": False
          })
ssl.match_hostname = lambda cert, hostname: True
ws = create_connection(url, sslopt=certs,
                       http_proxy_host="xyz.corp.company.com", …
I have a Python project and I am configuring the latest version of ruff for it, for linting and formatting. I have the following settings in my pyproject.toml file:
[tool.ruff]
select = ["E", "F", "W", "Q", "I"]
ignore = ["E203"]
# Allow autofix for all enabled rules (when `--fix` is provided).
fixable = ["ALL"]
unfixable = []
# Restrict line length to 99
line-length = 99
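The ruff check command with ruff's autofix option (--fix) recognises that lines are too long and reports E501 errors, but it does not reformat the code by wrapping it onto the next line to respect the line-length limit. Is there something I need to enable or do to make ruff fix this? Or is this currently not possible in ruff? Please help.
I went through the documentation looking for anything relevant, but I don't know what to do here.
I ran into this behaviour and I am not sure whether it is a bug. When I use a connection created with cx_Oracle, it works as expected. However, when I use Django DB connections, the result is not what I expect.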
import cx_Oracle
from django.db import connections
import pandas as pd
dsn_tns = cx_Oracle.makedsn('xx.x.xx.xxx', 'port', 'dbname')
cx_Oracle_conn = cx_Oracle.connect('user', 'pass', dsn_tns)
django_conn = connections["DB2"] # In django settings, I have a created "DB2" and passed the same parameters.
query = '''
SELECT (T1.ACCEPTED- T2.CANCELLED) AS "NET" FROM
(
SELECT ID, COUNT(1) AS ACCEPTED FROM ACCEPTED_TABLE
GROUP BY ID
) T1
LEFT JOIN
(SELECT ID, COUNT(1) AS CANCELLED FROM CANCELLED_TABLE
GROUP BY ID) T2
ON …
I have a csv file containing some float data. The code is simple:
df = pd.read_csv(my_csv_vile)
print(df.iloc[:2,:4])
600663.XSHG 000877.XSHE 600523.XSHG 601311.XSHG
2016-01-04 09:31:00 49.40 8.05 22.79 21.80
2016-01-04 09:32:00 49.55 8.03 22.79 21.75
Then I converted it to float32 to save memory.
short_df = df.astype(np.float32)
print(short_df.iloc[:2,:4])
600663.XSHG 000877.XSHE 600523.XSHG 601311.XSHG
2016-01-04 09:31:00 49.400002 8.05 22.790001 21.799999
2016-01-04 09:32:00 49.549999 8.03 22.790001 21.750000
The values just changed! How can I keep the data unchanged?
(I also tried short_df.round(2), but printing still gives the same output.)
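The values change because a 32-bit float carries only about 7 significant decimal digits, so 49.40 is stored as the nearest value float32 can represent. Rounding does not help, since the rounded result is stored as float32 again. A minimal sketch of the effect, using the standard library's struct module as a stand-in for np.float32 (the helper name to_float32 is hypothetical):

```python
import struct


def to_float32(x: float) -> float:
    """Round-trip a 64-bit Python float through IEEE 754 single precision."""
    return struct.unpack("f", struct.pack("f", x))[0]


original = 49.40
as_f32 = to_float32(original)

# The nearest float32 to 49.40 is a different number from the nearest float64.
print(as_f32 == original)   # False
print(f"{as_f32:.6f}")      # 49.400002

# Formatting for display recovers the two-decimal value read from the file.
print(f"{as_f32:.2f}")      # 49.40
```

If exact two-decimal values matter more than memory, keeping float64 is the simplest option. Otherwise the float32 data can be kept and only the printed precision reduced, for example with pd.set_option('display.precision', 2).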
Hello, I have some data and I want to rename one of the columns and select the columns whose names start with the string t.
raw_data = {'patient': [1, 1, 1, 2, 2],
'obs': [1, 2, 3, 1, 2],
'treatment': [0, 1, 0, 1, 0],
'score': ['strong', 'weak', 'normal', 'weak', 'strong'],
'tr': [1,2,3,4,5],
'tk': [6,7,8,9,10],
'ak': [11,12,13,14,15]
}
df = pd.DataFrame(raw_data, columns = ['patient', 'obs', 'treatment', 'score','tr','tk','ak'])
df
patient obs treatment score tr tk ak
0 1 1 0 strong 1 6 11
1 1 2 1 weak 2 7 12
2 1 3 0 normal 3 8 13
3 2 1 …
My DataFrame has a column 'datedif':
exposuredate min_exposure_date datedif
2014-10-08 2014-09-27 11 days
2014-10-09 2014-09-27 12 days
2014-09-27 2014-09-27 0 days
2014-09-27 2014-09-27 0 days
2014-10-22 2014-09-27 25 days
data.exposuredate = pd.to_datetime(data.exposuredate)
data.min_exposure_date = pd.to_datetime(data.min_exposure_date)
data['datedif'] = ((data.exposuredate)-(data.min_exposure_date))
The columns are of type datetime64[ns]. I want to extract the number of days from the 'datedif' field. I cannot find anything that helps me extract the difference in days.
I tried:
data['datedif_day'] = data['datedif'].dt.days
Error:
AttributeError: 'Series' object has no attribute 'dt'
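The .dt accessor exists only in newer pandas releases, so this error most likely points to an outdated installation. With a recent pandas, data['datedif'].dt.days returns the day counts as integers. The underlying idea is the .days attribute of a time difference, sketched here with the standard library's datetime module instead of pandas (the variable names rows and datedif_day are assumptions; the dates are copied from the question):

```python
from datetime import date

# Sample rows copied from the question: (exposuredate, min_exposure_date).
rows = [
    (date(2014, 10, 8), date(2014, 9, 27)),
    (date(2014, 10, 9), date(2014, 9, 27)),
    (date(2014, 9, 27), date(2014, 9, 27)),
    (date(2014, 10, 22), date(2014, 9, 27)),
]

# Subtracting two dates gives a timedelta; its .days attribute is a plain int.
datedif_day = [(exposure - first).days for exposure, first in rows]
print(datedif_day)  # [11, 12, 0, 25]
```

The result matches the 'datedif' column shown in the question, with the " days" suffix dropped.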
I can't install pandas from pip on Python 3.8.
It gives me the following long error:
PS C:\Users\Admin> pip install pandas
Collecting pandas
Using cached https://files.pythonhosted.org/packages/07/cf/1b6917426a9a16fd79d56385d0d907f344188558337d6b81196792f857e9/pandas-0.25.1.tar.gz
ERROR: Command errored out with exit status 1:
command: 'c:\users\admin\appdata\local\programs\python\python38-32\python.exe' -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'C:\\Users\\Admin\\AppData\\Local\\Temp\\pip-install-mjaqrdny\\pandas\\setup.py'"'"'; __file__='"'"'C:\\Users\\Admin\\AppData\\Local\\Temp\\pip-install-mjaqrdny\\pandas\\setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base 'C:\Users\Admin\AppData\Local\Temp\pip-install-mjaqrdny\pandas\pip-egg-info'
cwd: C:\Users\Admin\AppData\Local\Temp\pip-install-mjaqrdny\pandas\
Complete output (190 lines):
Could not locate executable g77
Could not locate executable f77
Could not locate executable ifort
Could not locate executable ifl
Could not locate executable f90
Could not locate executable DF
Could not locate executable …
I am very new to Python, so I created a list of elements like:
main_list = [1,2,3]
I want this list to have a name, but I don't want to use a dictionary, so I created a class with the name as an attribute:
class NamedList:
    def __init__(self, name, obj):
        self.name = name
        self.object = obj
When I try to access the length of the first list:
len(main_list) #works fine
But for the second one, it gives me this error: NamedList instance has no attribute '__len__':
new_main_list = NamedList('numbers', main_list)
len(new_main_list) #This line gives me the error
I would like to know why the basic attributes of the list class are not available on my class. All my instances are initially a list instance.
Thanks in advance
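len() looks up __len__ on the object's class. NamedList only stores the list as an attribute, so it inherits nothing from list. A minimal sketch of two possible fixes, assuming Python 3: forward __len__ to the wrapped list, or subclass list directly. The name NamedListSubclass is hypothetical and not from the question.

```python
class NamedList:
    """Wraps a list and forwards len() to it."""

    def __init__(self, name, obj):
        self.name = name
        self.object = obj

    def __len__(self):
        # len(instance) calls this method.
        return len(self.object)


class NamedListSubclass(list):
    """Alternative (hypothetical name): a real list that also carries a name."""

    def __init__(self, name, items=()):
        super().__init__(items)
        self.name = name


main_list = [1, 2, 3]
wrapped = NamedList("numbers", main_list)
subclassed = NamedListSubclass("numbers", main_list)

print(len(wrapped))     # 3
print(len(subclassed))  # 3
print(subclassed[0])    # 1
```

The wrapper exposes only what is forwarded explicitly, while the subclass gets indexing, iteration and every other list method for free. Which one fits depends on how much list behaviour the named object should have.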
a = 'abhishek'
count = 0
for x in a:
    if x in a:
        count += 1
print(count)
I have tried this, but it gives me the total number of letters. I only want the unique letters that occur exactly once.
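The loop counts every letter because x in a is always true for a letter taken from a. A minimal sketch of one way to count occurrences first and then keep the letters seen exactly once, using collections.Counter (the variable names counts and once are assumptions):

```python
from collections import Counter

a = "abhishek"
counts = Counter(a)  # maps each letter to how many times it occurs

# Letters that occur exactly once, in order of first appearance.
once = [letter for letter, n in counts.items() if n == 1]
print(once)          # ['a', 'b', 'i', 's', 'e', 'k']
print(len(once))     # 6

# Number of distinct letters, counting 'h' once even though it repeats.
print(len(set(a)))   # 7
```

The two results differ because 'h' appears twice: it is one distinct letter, but it is not a letter that occurs only once.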