Mik*_*e C 3 python theano pylearn
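I'm having trouble loading a custom dataset into pylearn2. I'm trying to train a simple MLP on a small XOR dataset. The dataset is a file named xor.csv that sits in the same directory as my YAML file, which is not the directory containing pylearn2's train.py script.
Here are the full contents of xor.csv: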
label,x,y
0,0,0
1,0,1
1,1,0
0,1,1
Here are the full contents of my YAML file:
!obj:pylearn2.train.Train {
    dataset: &train !obj:pylearn2.datasets.csv_dataset.CSVDataset {
        path: 'xor.csv',
        task: 'classification'
    },
    model: !obj:pylearn2.models.mlp.MLP {
        layers: [
            !obj:pylearn2.models.mlp.Sigmoid {
                layer_name: 'h0',
                dim: 10,
                irange: 0.05,
            },
            !obj:pylearn2.models.mlp.Softmax {
                layer_name: 'y',
                n_classes: 1,
                irange: 0.
            }
        ],
        nvis: 2,
    },
    algorithm: !obj:pylearn2.training_algorithms.sgd.SGD {
        learning_rate: 1e-2,
        batch_size: 1,
        monitoring_dataset:
            {
                'train' : *train
            },
        termination_criterion:
            !obj:pylearn2.termination_criteria.EpochCounter {
                max_epochs: 10000
            },
    },
    extensions: [
        !obj:pylearn2.train_extensions.best_params.MonitorBasedSaveBest {
            channel_name: 'valid_y_misclass',
            save_path: "best.pkl"
        },
    ]
}
When I run pylearn2's train.py script, it fails before training starts (presumably while compiling the Theano functions). Here is the full output:
[COMPUTER_NAME]:some_folder [MY_NAME]$ python [PATH_TO_PYLEARN2_SCRIPTS]/train.py example_mlp.yml
/Users/[MY_NAME]/anaconda/lib/python2.7/site-packages/nose/plugins/manager.py:418: UserWarning: Module argparse was already imported from /Users/[MY_NAME]/anaconda/lib/python2.7/argparse.pyc, but /Users/[MY_NAME]/anaconda/lib/python2.7/site-packages is being added to sys.path
import pkg_resources
Parameter and initial learning rate summary:
h0_W: 0.01
h0_b: 0.01
softmax_b: 0.01
softmax_W: 0.01
Compiling sgd_update...
Compiling sgd_update done. Time elapsed: 1.109511 seconds
compiling begin_record_entry...
compiling begin_record_entry done. Time elapsed: 0.090133 seconds
Monitored channels:
learning_rate
total_seconds_last_epoch
train_h0_col_norms_max
train_h0_col_norms_mean
train_h0_col_norms_min
train_h0_max_x_max_u
train_h0_max_x_mean_u
train_h0_max_x_min_u
train_h0_mean_x_max_u
train_h0_mean_x_mean_u
train_h0_mean_x_min_u
train_h0_min_x_max_u
train_h0_min_x_mean_u
train_h0_min_x_min_u
train_h0_range_x_max_u
train_h0_range_x_mean_u
train_h0_range_x_min_u
train_h0_row_norms_max
train_h0_row_norms_mean
train_h0_row_norms_min
train_objective
train_y_col_norms_max
train_y_col_norms_mean
train_y_col_norms_min
train_y_max_max_class
train_y_mean_max_class
train_y_min_max_class
train_y_misclass
train_y_nll
train_y_row_norms_max
train_y_row_norms_mean
train_y_row_norms_min
training_seconds_this_epoch
Compiling accum...
graph size: 115
Compiling accum done. Time elapsed: 1.647879 seconds
Traceback (most recent call last):
File "/Users/[MY_NAME]/pylearn2/pylearn2/scripts/train.py", line 252, in <module>
args.verbose_logging, args.debug)
File "/Users/[MY_NAME]/pylearn2/pylearn2/scripts/train.py", line 242, in train
train_obj.main_loop(time_budget=time_budget)
File "/Users/[MY_NAME]/pylearn2/pylearn2/train.py", line 196, in main_loop
self.run_callbacks_and_monitoring()
File "/Users/[MY_NAME]/pylearn2/pylearn2/train.py", line 242, in run_callbacks_and_monitoring
self.model.monitor()
File "/Users/[MY_NAME]/pylearn2/pylearn2/monitor.py", line 254, in __call__
for X in myiterator:
File "/Users/[MY_NAME]/pylearn2/pylearn2/utils/iteration.py", line 859, in next
for data, fn in safe_izip(self._raw_data, self._convert))
File "/Users/[MY_NAME]/pylearn2/pylearn2/utils/iteration.py", line 859, in <genexpr>
for data, fn in safe_izip(self._raw_data, self._convert))
File "/Users/[MY_NAME]/pylearn2/pylearn2/utils/iteration.py", line 819, in fn
return dspace.np_format_as(batch, sp)
File "/Users/[MY_NAME]/pylearn2/pylearn2/space/__init__.py", line 458, in np_format_as
space=space)
File "/Users/[MY_NAME]/pylearn2/pylearn2/space/__init__.py", line 513, in _format_as
self._validate(is_numeric, batch)
File "/Users/[MY_NAME]/pylearn2/pylearn2/space/__init__.py", line 617, in _validate
self._validate_impl(is_numeric, batch)
File "/Users/[MY_NAME]/pylearn2/pylearn2/space/__init__.py", line 984, in _validate_impl
super(IndexSpace, self)._validate_impl(is_numeric, batch)
File "/Users/[MY_NAME]/pylearn2/pylearn2/space/__init__.py", line 796, in _validate_impl
(batch.dtype, self.dtype))
TypeError: Cannot safely cast batch dtype float64 to space's dtype int64.
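What exactly does this mean? I looked at the CSVDataset code, and it loads the data with np.loadtxt, which should bring the values in as floats. If I edit xor.csv so the values look like floats (1 -> 1.0, for example), nothing changes.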
小智 6
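This happens because CSVDataset stores its y attribute as float64, while the classification target space expects an integer dtype.
I patched __init__() in csv_dataset.py as shown below, and it now works.
I don't know whether this is a bug in pylearn2.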
if self.task == 'regression':
    super(CSVDataset, self).__init__(X=X, y=y)
else:
    super(CSVDataset, self).__init__(X=X, y=y.astype(int),
                                     y_labels=np.max(y) + 1)
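To see why the error appears, and why rewriting 1 as 1.0 in the CSV makes no difference, here is a minimal NumPy-only sketch. It does not call pylearn2. The names `CSV_TEXT` and `load_xor` are made up for illustration, the label-in-first-column split is an assumption based on the CSV header, and NumPy's "safe" casting rule is used as a stand-in for the check that raises in the traceback.

```python
import io

import numpy as np

# Same contents as xor.csv in the question (hypothetical in-memory copy).
CSV_TEXT = """label,x,y
0,0,0
1,0,1
1,1,0
0,1,1
"""


def load_xor(text):
    """Parse the CSV the way np.loadtxt does by default (as floats)."""
    data = np.loadtxt(io.StringIO(text), delimiter=",", skiprows=1)
    y = data[:, 0:1]  # assumption: first column holds the label
    X = data[:, 1:]   # remaining columns are the features
    return X, y


X, y = load_xor(CSV_TEXT)

# np.loadtxt returns float64 regardless of whether the file says 1 or 1.0.
labels_are_float = y.dtype == np.float64

# float64 -> int64 is not a "safe" cast, which matches the error message.
safe_cast_allowed = np.can_cast(y.dtype, np.int64, casting="safe")

# The patch above converts explicitly, which yields integer labels.
y_int = y.astype(int)
n_classes = int(y_int.max()) + 1

print(labels_are_float, safe_cast_allowed, y_int.dtype.kind, n_classes)
```

The explicit `astype(int)` is what the patched `__init__()` relies on: the values are already whole numbers, so the conversion loses nothing, but it has to be requested rather than happen implicitly.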
By the way, you should also fix your YAML.
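The answer does not say which parts of the YAML need fixing. Judging only from the question's own config and log, two things look inconsistent: the save-best extension watches 'valid_y_misclass' although only a 'train' monitoring dataset is configured, and the Softmax layer has n_classes: 1 although the labels take two values. The sketch below just checks those two observations in plain Python; the variable names are illustrative, and the values are copied from the question rather than read from pylearn2.

```python
# Values copied from the question's YAML and log output.
save_best_channel = "valid_y_misclass"  # channel_name in the YAML
softmax_n_classes = 1                   # n_classes in the YAML
labels = [0, 1, 1, 0]                   # label column of xor.csv

# A few of the channels listed under "Monitored channels:" in the log.
monitored_channels = [
    "train_objective",
    "train_y_misclass",
    "train_y_nll",
]

# The watched channel is not among the monitored ones.
channel_exists = save_best_channel in monitored_channels

# The data has two distinct labels, but the layer is sized for one class.
distinct_labels = len(set(labels))
classes_match = softmax_n_classes == distinct_labels

print(channel_exists, distinct_labels, classes_match)
```

If these are indeed what the answer had in mind, the likely changes would be channel_name: 'train_y_misclass' (or adding a 'valid' monitoring dataset) and n_classes: 2, but that is an inference, not something the answer states.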