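Posts by use*_*223

How do I correctly read a CSV file when each line contains a different (and fairly large) number of fields?

I have a text file from Amazon that contains the following information: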

 #      user        item     time   rating     review text (the header is added by me for explanation; it is not in the text file)
  disjiad123    TYh23hs9     13160032    5     I love this phone as it is easy to use
  hjf2329ccc    TGjsk123     14423321    3     Suck restaurant

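As you can see, the data is separated by spaces and each line has a different number of columns, because the variable part is the review text. Here is the code I tried: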

pd.read_csv(filename, sep = " ", header = None, names = ["user","item","time","rating", "review"], usecols = ["user", "item", "rating"])#I'd like to skip the text review part

and I got this error:

ValueError: Passed header names mismatches usecols

When I try to read all the columns:

pd.read_csv(filename, sep = " ", header …
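
One way to skip the free-text review is to split each line on whitespace only a limited number of times, so that the review stays together as one trailing field. The sketch below is a minimal illustration rather than a pandas feature: read_reviews is a hypothetical helper, and it assumes the first four fields never contain spaces.

```python
import io

import pandas as pd


def read_reviews(handle):
    """Parse whitespace-separated lines whose fifth field is free text.

    Hypothetical helper: assumes user, item, time and rating contain no spaces.
    """
    rows = []
    for line in handle:
        # Split on runs of whitespace at most 4 times, so everything after
        # the four fixed fields stays together as the review text.
        parts = line.split(None, 4)
        if len(parts) < 4:
            continue  # skip blank or malformed lines
        user, item, _time, rating = parts[:4]
        rows.append((user, item, int(rating)))
    return pd.DataFrame(rows, columns=["user", "item", "rating"])


# The two sample lines from the question, held in memory instead of a file.
sample = io.StringIO(
    "  disjiad123    TYh23hs9     13160032    5     I love this phone as it is easy to use\n"
    "  hjf2329ccc    TGjsk123     14423321    3     Suck restaurant\n"
)
reviews = read_reviews(sample)
print(reviews)
```

With a real file, an open file object can be passed in the same way, for example inside a with open(filename) as handle: block.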

python csv pandas

use*_*223 | 2017-05-23 | score 10 | 2 answers | 10k views

How do I recursively construct a DataFrame column with pandas?

Given a DataFrame df like this:

id_      val     
11111    12
12003    22
88763    19
43721    77
...

I want to add a column diff to df, where each row equals val in that row minus diff in the previous row, multiplied by 0.4, plus diff in the previous row:

diff = (val - diff_previousDay) * 0.4 + diff_previousDay

The diff in the first row equals val * 0.4 for that row (12 * 0.4 = 4.8). That is, the expected df should be:

id_      val     diff   
11111    12      4.8
12003    22      11.68
88763    19      14.608
43721    77      ...

I tried:

mul = 0.4
df['diff'] = df.apply(lambda row: (row['val'] - df.loc[row.name, 'diff']) * mul + df.loc[row.name, 'diff'] if int(row.name) > 0 else …
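
Because each diff depends on the previous diff, a row-wise apply cannot see the value it needs; an explicit loop that carries the running value is one straightforward alternative. The sketch below assumes the recurrence stated above; recursive_diff is a hypothetical helper, and seeding the running value with 0 makes the first row come out as val * mul.

```python
import pandas as pd


def recursive_diff(values, mul=0.4):
    """Return diff_t = (val_t - diff_{t-1}) * mul + diff_{t-1}, with diff_{-1} = 0."""
    result = []
    previous = 0.0  # seeding with 0 makes the first row equal val * mul
    for val in values:
        previous = (val - previous) * mul + previous
        result.append(previous)
    return result


# First three rows of the example frame from the question.
df = pd.DataFrame({"id_": [11111, 12003, 88763], "val": [12, 22, 19]})
df["diff"] = recursive_diff(df["val"])
print(df)
```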

python recursion multiple-columns dataframe pandas

use*_*223 | 2016-06-24 | score 6 | 2 answers | 5425 views

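How do I remove adjacent duplicate values in a numpy array?

Given a numpy array, I want to remove adjacent duplicate nonzero values and all zero values. For example, for an array like [0,0,1,1,1,2,2,0,1,3,3,3], I want to convert it to [1,2,1,3]. Do you know how to do this? I only know np.unique(arr), but it removes all duplicate values and keeps the zero value. Thanks in advance!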
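
One possible approach is a boolean mask that keeps the first element of every run and then drops zeros. This is a minimal sketch with a hypothetical function name; it assumes runs are collapsed before zeros are removed, so two equal values separated only by zeros are both kept.

```python
import numpy as np


def drop_adjacent_duplicates_and_zeros(arr):
    """Keep the first element of each run of equal values, then drop zeros."""
    arr = np.asarray(arr)
    if arr.size == 0:
        return arr
    keep = np.ones(arr.shape, dtype=bool)
    # An element starts a new run when it differs from its left neighbour.
    keep[1:] = arr[1:] != arr[:-1]
    # Zeros are removed regardless of where they appear.
    keep &= arr != 0
    return arr[keep]


result = drop_adjacent_duplicates_and_zeros([0, 0, 1, 1, 1, 2, 2, 0, 1, 3, 3, 3])
print(result)
```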

python arrays numpy

use*_*223 | 2016-06-28 | score 5 | 2 answers | 2475 views

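Unable to implement layer normalization with Keras

I am trying to implement layer normalization in a fully connected neural network with Keras. The problem I have is that all the losses are NaN and the network does not learn. Here is my code: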

class DenseLN(Layer):
    def __init__(self, output_dim, init='glorot_uniform', activation='linear', weights=None,
                 W_regularizer=None, b_regularizer=None, activity_regularizer=None,
                 W_constraint=None, b_constraint=None, bias=True, input_dim=None, **kwargs):
        self.init = initializations.get(init)
        self.activation = activations.get(activation)
        self.output_dim = output_dim
        self.input_dim = input_dim
        self.epsilon = 1e-5        

        self.W_regularizer = regularizers.get(W_regularizer)
        self.b_regularizer = regularizers.get(b_regularizer)
        self.activity_regularizer = regularizers.get(activity_regularizer)

        self.W_constraint = constraints.get(W_constraint)
        self.b_constraint = constraints.get(b_constraint)

        self.bias = bias
        self.initial_weights = weights
        self.input_spec = [InputSpec(ndim=2)]

        if self.input_dim:
            kwargs['input_shape'] = (self.input_dim,)
        super(DenseLN, self).__init__(**kwargs)

    def ln(self, x):
        # layer normalization function
        m = K.mean(x, axis=0) …
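
For comparison, layer normalization is usually defined per sample over the feature axis, whereas the truncated ln method above starts by averaging over axis=0. The NumPy sketch below only illustrates that definition; it is not the asker's Keras layer, it omits the learned gain and bias, and whether the axis choice explains the NaN losses is not established by the post.

```python
import numpy as np


def layer_norm(x, epsilon=1e-5):
    """Normalize each row (sample) of x over its last axis (features)."""
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    # epsilon keeps the division finite when a row has zero variance.
    return (x - mean) / np.sqrt(var + epsilon)


batch = np.array([[1.0, 2.0, 3.0], [4.0, 4.0, 4.0]])
normalized = layer_norm(batch)
print(normalized)
```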

python neural-network keras

use*_*223 | 2016-08-23 | score 5 | 1 answer | 1754 views

Strange bug: the same PHP code gives different results on Mac and Windows machines

I ran into a very strange bug. Given the same code:

<?php session_start(); ?>
<?php if (!isset($_SESSION['email'])): ?>
<p><a href="admin_reg.php">Register as admin</a></p>
<p><a href="student_reg.php">Register as student</a></p>
<p><a href="login.php">Log in</a></p>
<? else: ?>
<p><a href="logout.php">Log out</a></p>
<p><a href="group_create.php">Create group</a></p>
<p><a href="group_join.php">Join group</a></p>
<?php endif; ?>

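My teammates and I run the same project on different machines (they use Windows, I use a Mac). We all run it in XAMPP. On my machine the result is as expected: before authentication, only the first three links are shown. On their machines, however, all six links appear on the page, which should be impossible. Our PHP versions are also the same: 5.6.1*. Do you have any idea? Thanks in advance!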

php windows xampp macos

use*_*223 | lucky-day | score 4 | 1 answer | 154 views

Cannot get right slice bound for non-unique label when indexing a DataFrame with pandas

I have a DataFrame df like this:

a         b
10        2
3         1
0         0
0         4
....
# about 50,000+ rows

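I want to select df[:5, 'a']. But when I call df.loc[:5, 'a'], I get an error: KeyError: 'Cannot get right slice bound for non-unique label: 5'. When I call df.loc[5], the result contains 250 rows, whereas df.iloc[5] returns a single row. Why does this happen, and how do I index it correctly? Thanks in advance!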
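
The difference between the two accessors is that loc selects by index label while iloc selects by position, which matters when labels repeat. The small frame below is made-up illustration data with a duplicated label 5, not the asker's 50,000-row frame.

```python
import pandas as pd

# Made-up data: the index label 5 appears twice.
df = pd.DataFrame(
    {"a": [10, 3, 0, 0, 7, 8], "b": [2, 1, 0, 4, 5, 6]},
    index=[0, 5, 1, 5, 2, 3],
)

# Label-based: returns every row whose index label is 5.
rows_labelled_5 = df.loc[5]

# Position-based: returns the single row at position 5.
sixth_row = df.iloc[5]

# First five rows of column 'a' by position, independent of the labels.
first_five_a = df["a"].iloc[:5]

# Alternatively, renumber the index so labels are unique again.
# Label slices with loc include the end label, hence :4 for five rows.
renumbered = df.reset_index(drop=True)
first_five_again = renumbered.loc[:4, "a"]
```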

python dataframe pandas

use*_*223 | lucky-day | score 4 | 2 answers | 10k views

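How do I resolve a conflict between anaconda and virtualenv?

I have been using virtualenv, and later I also installed anaconda. Just now I tried to activate a virtual environment the anaconda way, with source activate helloworld. (In fact, I don't remember whether this is exactly the command I typed.) The environment was then activated. But when I tried to run a notebook, it said that some libraries did not exist, even though I had installed them in that environment. Only then did I realize that I had activated the wrong environment. I closed that tab, cd'd to hellowworld and ran source bin/activate. But it was too late. I got this output, prepending /home/lcc/anaconda3/envs/bin to PATH, and the expected environment was not activated. Do you know how to fix this? Thanks!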

Here is the full content of the activate file in /hello/world:

#!/bin/bash

# Determine the directory containing this script
if [[ -n $BASH_VERSION ]]; then
    _SCRIPT_LOCATION=${BASH_SOURCE[0]}
    SHELL="bash"
elif [[ -n $ZSH_VERSION ]]; then
    _SCRIPT_LOCATION=${funcstack[1]}
    SHELL="zsh"
else
    echo "Only bash and zsh are supported"
    return 1
fi
_CONDA_DIR=$(dirname "$_SCRIPT_LOCATION")

if [ $# -gt 1 ]; then
    (>&2 echo "Error: did not expect more than one argument.")
    (>&2 echo "    (Got …
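
When it is unclear which environment ended up active, it can help to ask the running interpreter directly. The sketch below is only a diagnostic, not a fix: describe_environment is a hypothetical helper, and it relies on the VIRTUAL_ENV variable that virtualenv's activate script exports and the CONDA_DEFAULT_ENV variable that conda's activation exports.

```python
import os
import sys


def describe_environment():
    """Collect hints about which Python environment is currently active."""
    return {
        "executable": sys.executable,  # interpreter actually running
        "prefix": sys.prefix,          # root of the active installation
        "virtual_env": os.environ.get("VIRTUAL_ENV"),      # set by virtualenv
        "conda_env": os.environ.get("CONDA_DEFAULT_ENV"),  # set by conda
        # The first PATH entries show which bin directory takes precedence.
        "path_head": os.environ.get("PATH", "").split(os.pathsep)[:3],
    }


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running this in both the terminal and the notebook shows whether they use the same interpreter.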

python virtualenv anaconda ubuntu-14.04

use*_*223 | 2016-07-06 | score 4 | 1 answer | 2610 views

__init__ files do not work as expected in Python

I have some folders and .py files with the following structure:

parent/
       __init__.py
       test.ipynb
       code/
            __init__.py
            common.py
            subcode/
                    __init__.py
                    get_data.py

In the __init__ file under the parent folder I have import code, and in the one under code I have import subcode. But when I try import code.subcode, I get this error:

ImportError: No module named 'code.subcode'; 'code' is not a package

When I just import code, no error is thrown. However, when I then access code.subcode, this error occurs:

AttributeError: module 'code' has no attribute 'subcode'

I tried everything mentioned above in test.ipynb, which is located in the root of the directory.

Do you know what the cause is and how to fix it? Thanks!
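
One thing worth checking is which module the name code actually resolves to, since the Python standard library also ships a module called code, and the message 'code' is not a package is what a plain module (rather than a folder with __init__.py) would produce. The helper below is a hypothetical diagnostic sketch; it does not prove that this is the cause here.

```python
import importlib.util


def describe_module(name):
    """Report where a top-level module name resolves and whether it is a package."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return None  # the name cannot be imported at all
    return {
        "origin": spec.origin,  # file the import would load
        # Packages have search locations for submodules; plain modules do not.
        "is_package": spec.submodule_search_locations is not None,
    }


if __name__ == "__main__":
    print(describe_module("code"))
```

If the origin points into the standard library instead of the project folder, renaming the local package or adjusting sys.path are the usual options.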

python init

use*_*223 | lucky-day | score 4 | 1 answer | 3733 views

Correct use of `tf.scatter_nd` in tensorflow-r1.2

Given indices of shape [batch_size, sequence_len], updates of shape [batch_size, sequence_len, sampled_size], and to_shape of shape [batch_size, sequence_len, vocab_size], where vocab_size >> sampled_size, I want to use tf.scatter to map updates into a huge tensor of shape to_shape, such that to_shape[bs, indices[bs, sz]] = updates[bs, sz]. That is, I want to map updates into to_shape row by row. Note that sequence_len and sampled_size are scalar tensors, while the others are fixed. I tried the following:

new_tensor = tf.scatter_nd(tf.expand_dims(indices, axis=2), updates, to_shape)

But I got an error:

ValueError: The inner 2 dimension of output.shape=[?,?,?] must match the inner 1 dimension of updates.shape=[80,50,?]: Shapes must be equal rank, but are 2 and 1 for .... with …

python python-3.4 deep-learning tensorflow tensor

use*_*223 | 2017-07-19 | score 3 | 1 answer | 6207 views

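How do I handle missing data in Stan?

I am new to Stan, and I am implementing a probabilistic matrix factorization model.

Given a user-item rating matrix: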

                       item
 user     1    3   NA   4     5    NA
          2    0    3   NA    1     5
          1    1    NA  NA    NA    0
          ....

How should I represent the observed data in the data block and the missing values to be predicted in the parameters block?

Thanks in advance!

Edit:

I am now implementing the model as follows:

pmf_code = """
data {

int<lower=0> K; //number of factors
int<lower=0> N; //number of user
int<lower=0> M; //number of item
int<lower=0> D; //number of observation
int<lower=0> D_new; //number of pridictor 
int<lower=0, upper=N> ii[D]; //item 
int<lower=0, upper=M> jj[D]; //user
int<lower=0, upper=N> ii_new[D_new]; // item
int<lower=0, upper=N> jj_new[D_new]; // user
real<lower=0, …
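
On the Python side, a data block of this index-list style needs the rating matrix flattened into observed and missing (row, column) pairs. The sketch below is one way to do that under stated assumptions: to_long_format is a hypothetical helper, ii is taken to be the row (user) index and jj the column (item) index, indices are 1-based as Stan expects, and the key ratings is a placeholder because the corresponding name in the truncated Stan code is not visible.

```python
import numpy as np


def to_long_format(matrix):
    """Split a rating matrix with NaN gaps into observed and missing index lists."""
    matrix = np.asarray(matrix, dtype=float)
    observed = ~np.isnan(matrix)
    obs_rows, obs_cols = np.nonzero(observed)    # row-major order
    new_rows, new_cols = np.nonzero(~observed)
    return {
        "N": matrix.shape[0],                    # number of users (rows)
        "M": matrix.shape[1],                    # number of items (columns)
        "D": int(obs_rows.size),                 # number of observed ratings
        "D_new": int(new_rows.size),             # number of ratings to predict
        "ii": (obs_rows + 1).tolist(),           # 1-based row index, observed
        "jj": (obs_cols + 1).tolist(),           # 1-based column index, observed
        "ii_new": (new_rows + 1).tolist(),       # 1-based row index, missing
        "jj_new": (new_cols + 1).tolist(),       # 1-based column index, missing
        "ratings": matrix[observed].tolist(),    # placeholder name (assumption)
    }


nan = float("nan")
# The three visible rows of the matrix from the question, with NA as NaN.
stan_data = to_long_format([
    [1, 3, nan, 4, 5, nan],
    [2, 0, 3, nan, 1, 5],
    [1, 1, nan, nan, nan, 0],
])
print(stan_data["D"], stan_data["D_new"])
```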

python stan

use*_*223 | 2016-02-09 | score 2 | 1 answer | 2035 views

Cannot use boxplot in ggplot2

Given a data frame like this:

dt       val
02-09     0.1
02-09     0.2
02-09     0.15
02-10     0.3
02-10     -0.1
...

I want to use a boxplot to show the median and variance of val within each dt:

 ggplot(data = df,aes(y=val,x=dt)) + geom_boxplot()

But what I get is:

It can be seen that there is only one box. When I tried outlier.colour = "red", all the points were red. Why? All values are within the interval (-1, 1).

r ggplot2

use*_*223 | lucky-day | score 1 | 1 answer | 54 views

Tag statistics

python ×9

pandas ×3

dataframe ×2

anaconda ×1

arrays ×1

csv ×1

deep-learning ×1

ggplot2 ×1

init ×1

keras ×1

macos ×1

multiple-columns ×1

neural-network ×1

numpy ×1

php ×1

python-3.4 ×1

r ×1

recursion ×1

stan ×1

tensor ×1

tensorflow ×1

ubuntu-14.04 ×1

virtualenv ×1

windows ×1

xampp ×1
