小编Mat*_*hio的帖子

ImageDataGenerator dtype 不会改变数据类型

我正在尝试将 my 的输出数据转换ImageDataGenerator为整数。这是我的发电机：

train_mask_data_gen = ImageDataGenerator(rotation_range=10,
                                         width_shift_range=10,
                                         height_shift_range=10,
                                         zoom_range=0.3,
                                         horizontal_flip=True,
                                         vertical_flip=True,
                                         fill_mode='nearest',#interpolation used for augmenting the image
                                         cval=0,
                                         rescale=1./255,
                                         dtype='int32')

Run Code Online (Sandbox Code Playgroud)

进而：

train_mask_gen = train_mask_data_gen.flow_from_directory(os.path.join(training_dir, 'masks'),
                                                     target_size=(img_h, img_w),
                                                     batch_size=bs,
                                                     class_mode=None, # Because we have no class subfolders in this case
                                                     shuffle=True,
                                                     interpolation='nearest',#interpolation used for resizing
                                                     #color_mode='grayscale',
                                                     seed=SEED)

Run Code Online (Sandbox Code Playgroud)

输入是二进制图像（0 或 255 作为值），我将它们归一化并且我想获得整数值（例如，每个像素只能有 0 或 1 个值）。

到目前为止，如果我“测试”我获得的生成数据的类型，'numpy.float32'但我是dtype='int32在我的生成器中写入的。似乎该dtype条目被完全忽略了，我只是得到了默认值（如文档中所述：Keras Image Generator 为什么会发生这种情况？如何“强制”我的数据为整数？

python casting generator keras

Mat*_*hio

lucky-day

9
推荐指数

1
解决办法

519
查看次数

Optuna 从“外部”传递参数字典

我正在使用 Optuna 来优化一些目标函数。我想创建“包装”标准 Optuna 代码的自定义类。

举个例子，这是我的课程（它仍在进行中！）：

class Optimizer(object):
    
    def __init__(self, param_dict, model, train_x, valid_x, train_y, valid_y):
        self.model = model
        self.param_dict = param_dict
        self.train_x, self.valid_x, self.train_y, self.valid_y = train_x, valid_x, train_y, valid_y
        
    def optimization_function(self, trial):
        self.dtrain = lgb.Dataset(self.train_x, label=self.train_y)
        gbm = lgb.train(param, dtrain)
        
        preds = gbm.predict(self.valid_x)
        pred_labels = np.rint(preds)
        accuracy = sklearn.metrics.accuracy_score(self.valid_y, pred_labels)
        return accuracy
    
    
    def optimize(self, direction, n_trials):
        study = optuna.create_study(direction = direction)
        study.optimize(self.optimization_function, n_trials = n_trials)    
        return study.best_trial

Run Code Online (Sandbox Code Playgroud)

我试图将 optuna 优化的所有“逻辑”包装在此类中，而不是每次都编写一些代码如下（来自文档）：

import optuna


class Objective(object):
    def __init__(self, min_x, max_x): …

Run Code Online (Sandbox Code Playgroud)

python dictionary class lightgbm optuna

Mat*_*hio

lucky-day

9
推荐指数

3
解决办法

4022
查看次数

Squarify - 自动调整树状图中标签的大小

我正在使用Squarify在 Python 中实现一个简单的树形图。

我正在绘制艺术家姓名及其在所考虑的歌曲图表中的流百分比，正方形越大/越暗，值越高。

我的代码如下：

dataGoals = sort_by_streams[sort_by_streams["Streams"]>1]

#Utilise matplotlib to scale our stream number between the min and max, then assign this scale to our values.
norm = matplotlib.colors.Normalize(vmin=min(dataGoals.Streams), vmax=max(dataGoals.Streams))
colors = [matplotlib.cm.Blues(norm(value)) for value in dataGoals.Streams]

#Create our plot and resize it.
fig1 = plt.figure()
ax = fig1.add_subplot()
fig1.set_size_inches(16, 4.5)

#Use squarify to plot our data, label it and add colours. We add an alpha layer to ensure black labels show through
labels = ["%s\n%.2f" % …

Run Code Online (Sandbox Code Playgroud)

python matplotlib treemap squarify

Mat*_*hio

2020 09-19

7
推荐指数

1
解决办法

2780
查看次数

批量归一化会破坏验证性能

我按照一些教程向我的模型添加了一些批量归一化，以缩短训练时间。这是我的模型：

model = Sequential()

model.add(Conv2D(16, kernel_size=(3, 3), activation='relu', input_shape=(64,64,3)))
model.add(BatchNormalization())

model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(64, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())

model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())

model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(256, kernel_size=(3, 3), activation='relu'))
model.add(BatchNormalization())

model.add(MaxPooling2D(pool_size=(2, 2)))


model.add(Flatten())

model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))

#NB: adding more parameters increases the probability of overfitting!! Try to cut instead of adding neurons!! 
model.add(Dense(units=512, activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(units=20, activation='softmax'))

Run Code Online (Sandbox Code Playgroud)

如果没有批量归一化，我的数据准确率约为 50%。添加批量归一化会破坏我的性能，验证准确度降低至 10%。

为什么会发生这种情况？

conv-neural-network keras tensorflow batch-normalization

Mat*_*hio

lucky-day

6
推荐指数

1
解决办法

4829
查看次数

python假设中相同函数参数的多种策略

我正在使用Hypothesis包用 Python 编写一个简单的测试代码。有没有办法对同一个函数参数使用多种策略？例如，使用integers()和floats()来测试我的values参数而不编写两个单独的测试函数？

from hypothesis import given
from hypothesis.strategies import lists, integers, floats, sampled_from
@given(
    integers() ,floats()
)
def test_lt_operator_interval_bin_numerical(values):
    interval_bin = IntervalBin(10, 20, IntervalClosing.Both)
    assert (interval_bin < values) == (interval_bin.right < values)

Run Code Online (Sandbox Code Playgroud)

上面的代码不起作用，但它代表了我想要实现的目标。

注意：我已经尝试过使用两种不同策略创建两个不同测试的简单解决方案：

def _test_lt(values):
    interval_bin = IntervalBin(10, 20, IntervalClosing.Both)
    assert (interval_bin < values) == (interval_bin.right < values)

test_lt_operator_interval_bin_int = given(integers())(_test_lt)

test_lt_operator_interval_bin_float = given(floats())(_test_lt)

Run Code Online (Sandbox Code Playgroud)

然而我想知道是否有更好的方法来做到这一点：当策略的数量变大时，它作为代码是相当冗余的。

python pytest python-hypothesis

Mat*_*hio

2021 10-28

5
推荐指数

1
解决办法

757
查看次数

Anaconda 安装 TensorFlow 1.15 而不是 2.0

我一直在尝试在 Anaconda 中安装 Tensorflow 2.0。

到目前为止，tensorflow 工作正常（我可以在我的代码中使用该库），但是当我安装它时，它说“安装版本：2.0”，然后我得到了 1.15 版。

整个包结果可更新（因为 versione 2.0 确实存在并且应该可用），但即使我尝试更新它，我也会不断得到 1.15，它又回到了可更新状态，我又回到了循环中。

我从 Anaconda Navigator 界面和使用提示都尝试过conda update tensorflow，但没有成功。

这是链接包：tensorflow 2.0 Anaconda

我该如何解决问题？

编辑：

我成功安装了 TensorFlow 2.0 使用conda install -c anaconda tensorflow. 然后我回到 Anaconda Navigator 并尝试安装 Keras 2.2.4。

查看已安装的软件包，似乎安装 Keras 会用 1.15 覆盖 TensorFlow 2.0 版，我错了吗？这是一个错误吗？

这是图片：

python anaconda tensorflow

Mat*_*hio

2019 11-20

4
推荐指数

2
解决办法

3万
查看次数

从图像生成器绘制图像

我正在尝试绘制由图像生成器创建的图像。到目前为止，这是我提供给生成器的数据代码：

train_img_gen = train_img_data_gen.flow_from_directory(os.path.join(training_dir, 'images'),
                                                   target_size=(img_h, img_w),
                                                   batch_size=bs, 
                                                   class_mode=None, # Because we have no class subfolders in this case
                                                   shuffle=True,
                                                   interpolation='bilinear',
                                                   seed=SEED)
#edited part following the already existing answer on stackoverflow
x_batch, y_batch = next(train_img_gen)
for i in range (0,32):
    image = x_batch[i]
    plt.imshow(image.transpose(2,1,0))
    plt.show()

Run Code Online (Sandbox Code Playgroud)

我遵循了这个问题：Keras images，但没有任何成功。

我如何绘制（例如）由我生成的前 n 个图像imageGenerator？

编辑：

我添加了上述问题中使用的代码，但出现此错误：

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-7-1a18ce1c1a76> in <module>
     54 valid_gen = zip(valid_img_gen, valid_mask_gen)
     55 
---> 56 x_batch, y_batch = next(train_img_gen)
     57 …

Run Code Online (Sandbox Code Playgroud)

python image-processing keras data-augmentation

Mat*_*hio

2019 12-07

3
推荐指数

1
解决办法

1万
查看次数

使用带有scanf的strlen(%ms)

是否可以使用strlen()动态分配的字符串？

例如:

#include <stdio.h>
#include <string.h>

int main ()
{
  char *input=NULL;
  printf ("Enter a sentence: ");
  scanf("%ms", &input);
  //Is this legit?
  printf ("The sentence entered is %u characters long.\n",(unsigned)strlen(input));
  return 0;
}

Run Code Online (Sandbox Code Playgroud)

c scanf c-strings strlen dynamic-memory-allocation

Mat*_*hio

2018 08-05

2
推荐指数

1
解决办法

248
查看次数

将线图添加到 Facetgrid 图中

我RelPlot使用 Seaborn 绘制了一个简单的图，它返回了一个Facetgrid对象。我使用的代码如下：

import seaborn as sns

palette = sns.color_palette("rocket_r")

g1 = sns.relplot(
    data=df,
    x="number_of_weeks", y="avg_streams",
    hue="state", col="state",
    kind="line", palette=palette,
    height=5, aspect=1, facet_kws=dict(sharex=False), col_wrap=3, 
)

Run Code Online (Sandbox Code Playgroud)

显示了以下情节：

我正在使用的数据框具有以下结构：

   avg_streams  date    year number_of_weeks state
4   0.104011    31-01   2020    4   it
5   1.211951    07-02   2020    5   it
6   0.559374    14-02   2020    6   it
7   0.304257    21-02   2020    7   it
8   0.199218    28-02   2020    8   it
... ... ... ... ... ...
175 -0.938890   26-06   2020    25  br
176 -0.483821 …

Run Code Online (Sandbox Code Playgroud)

python matplotlib subplot line-plot seaborn

Mat*_*hio

2020 09-19

2
推荐指数

1
解决办法

3037
查看次数