Posts by Avi*_*Avi

Extract the upper-triangle values in row order

I have the following matrix:

mat <- matrix(1:16, 4, 4)

> mat
     [,1] [,2] [,3] [,4]
[1,]    1    5    9   13
[2,]    2    6   10   14
[3,]    3    7   11   15
[4,]    4    8   12   16

I want to convert the upper triangle (excluding the diagonal) into a vector in row order. If I do this:

> mat1<-as.vector(mat[upper.tri(mat)])
> mat1
[1]  5  9 10 13 14 15

I get the values in column order. I want the vector (mat1) in row order instead, like this: 5, 9, 13, 10, 14, 15
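The difference between the two orderings can be sketched outside R. The following is a minimal illustration in plain Python (not the asker's code, and not an R solution); the helper names `upper_by_column` and `upper_by_row` are hypothetical, and the matrix is assumed to be stored as a list of rows holding the same values as `mat`.

```python
# The 4x4 matrix from the question, written row by row.
# R fills matrix(1:16, 4, 4) column by column, hence this layout.
mat = [
    [1, 5, 9, 13],
    [2, 6, 10, 14],
    [3, 7, 11, 15],
    [4, 8, 12, 16],
]


def upper_by_column(m):
    """Strict upper triangle, walking column by column (the order seen in mat1)."""
    n = len(m)
    return [m[i][j] for j in range(n) for i in range(j)]


def upper_by_row(m):
    """Strict upper triangle, walking row by row (the order the question asks for)."""
    n = len(m)
    return [m[i][j] for i in range(n) for j in range(i + 1, n)]


print(upper_by_column(mat))
print(upper_by_row(mat))
```

Putting the row index in the outer loop is what yields 5, 9, 13, 10, 14, 15 rather than the column-wise order.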

r rows

Score: 2, Answers: 1, Views: 1367

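Why doesn't a random seed make the results reproducible in Python

I am using the following code. With the same random seed I expect to get the same results, but I use the same seed (1 in this case) and get different results. Here is the code: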

import pandas as pd
import numpy as np
from random import seed
# Load scikit's random forest classifier library
from sklearn.ensemble import RandomForestClassifier

from sklearn.model_selection import train_test_split
seed(1) ### <-----

file_path = 'https://archive.ics.uci.edu/ml/machine-learning-databases/undocumented/connectionist-bench/sonar/sonar.all-data'
dataset2 = pd.read_csv(file_path, header=None, sep=',')

from sklearn import preprocessing
le = preprocessing.LabelEncoder()

#Encoding
y = le.fit_transform(dataset2[60])
dataset2[60] = y
train, test = train_test_split(dataset2, test_size=0.1)
y = train[60] 
y_test = test[60] 
clf = RandomForestClassifier(n_jobs=100, random_state=0)
features = train.columns[0:59] 
clf.fit(train[features], y)

# Apply the Classifier we trained to …
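One likely cause, offered as an assumption rather than a confirmed diagnosis: `seed` from Python's `random` module seeds only the standard-library generator, while `train_test_split` called without `random_state` draws from NumPy's global generator. The sketch below uses only `random` and `numpy` (no scikit-learn) to show that the two streams are independent; the helper names are hypothetical.

```python
import random

import numpy as np


def draw_after_stdlib_seed(seed_value):
    """Seed Python's stdlib generator, then draw from NumPy's global generator."""
    random.seed(seed_value)
    return np.random.rand(3)


def draw_after_numpy_seed(seed_value):
    """Seed NumPy's global generator, then draw from it."""
    np.random.seed(seed_value)
    return np.random.rand(3)


# Seeding the stdlib generator does not reset NumPy's stream,
# so two consecutive draws differ.
stdlib_a = draw_after_stdlib_seed(1)
stdlib_b = draw_after_stdlib_seed(1)

# Seeding NumPy's global generator does reset it, so the draws repeat.
numpy_a = draw_after_numpy_seed(1)
numpy_b = draw_after_numpy_seed(1)

print(np.array_equal(stdlib_a, stdlib_b))
print(np.array_equal(numpy_a, numpy_b))
```

Under that assumption, passing an integer `random_state` to `train_test_split`, or calling `np.random.seed` before the split, would be the usual ways to pin the train/test partition.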

random seed python-2.7

Score: 2, Answers: 1, Views: 1402

Remove rows below a threshold from a data frame

I have the following data frame (data6):

data6

   n   S_ID      EID VO
1: 1   41883100   1 A1
2: 2   41883100   2 B22
3: 3   41883100   3 C13
4: 4   41883100   4 D18
5: 5   41883100   5 T5-7
6: 6   41883098   1 HJ89
7: 7   41883098   2 I982
8: 8   41884555   1 ZX567
9: 9   41997896   1 TYU12

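I want to keep in data6 all rows whose S_ID has a maximum EID above the threshold (that is, drop every S_ID whose EID values are only 1 or 2). The result would then be:

data6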

   n   S_ID      EID VO
1: 1   41883100   1 A1
2: 2   41883100   2 B22
3: 3   41883100   3 C13
4: 4   41883100   4 D18
5: 5   41883100 …

r

Score: 1, Answers: 1, Views: 1178

Add a number to each node in data.tree

I have the following tree:

 library (data.tree) 
    data (acme)
    t1<-acme
    > acme
                              levelName
    1  Acme Inc.                       
    2   ¦--Accounting                  
    3   ¦   ¦--New Software            
    4   ¦   °--New Accounting Standards
    5   ¦--Research                    
    6   ¦   ¦--New Product Line        
    7   ¦   °--New Labs                
    8   °--IT                          
    9       ¦--Outsource               
    10      ¦--Go agile                
    11      °--Switch to R 

I want to enumerate the tree's node names by appending the row number to each node name, like this:

> t1
                          levelName
1  Acme Inc._1                       
2   ¦--Accounting_2
3   ¦   ¦--New Software_3
4   ¦   °--New Accounting Standards_4
5   ¦--Research_5                    
6   ¦   ¦--New Product Line_6        
7   ¦   °--New Labs_7      
8   °--IT_8                          
9       ¦--Outsource_9              
10      ¦--Go …

tree r

Score: 0, Answers: 1, Views: 511

Paired t-test returns p-value = NA and t = NaN

I want to run the following paired t-test:

str1<-' ENSEMBLE 0.934 0.934 0.934 0.934 '
  str2<-' J48 0.934 0.934 0.934 0.934 '

  df1 <- read.table(text=scan(text=str1, what='', quiet=TRUE), header=TRUE)
  df2 <- read.table(text=scan(text=str2, what='', quiet=TRUE), header=TRUE)

t.test ( df1$ENSEMBLE, df2$J48, mu=0 , alt="two.sided", paired = T, conf.level = 0.95)

I get the following result:

Paired t-test

data:  df1$ENSEMBLE and df2$J48
t = NaN, df = 3, p-value = NA
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 NaN NaN
sample estimates:
mean of the differences 
                      0 …
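The NaN follows from the arithmetic of the paired statistic: every pairwise difference is 0, so both the mean difference and its standard error are 0, and t becomes 0/0. A minimal sketch in plain Python (not R's `t.test` implementation; `paired_t_statistic` is a hypothetical helper) that reproduces the degenerate case:

```python
import math
import statistics

# The four values per classifier from the question.
ensemble = [0.934, 0.934, 0.934, 0.934]
j48 = [0.934, 0.934, 0.934, 0.934]


def paired_t_statistic(x, y):
    """Paired t statistic: mean of the differences over its standard error."""
    diffs = [a - b for a, b in zip(x, y)]
    mean_diff = statistics.mean(diffs)
    standard_error = statistics.stdev(diffs) / math.sqrt(len(diffs))
    if standard_error == 0:
        # Python raises on division by zero, so mimic floating-point rules here:
        # 0/0 is NaN, and a nonzero mean over 0 is signed infinity.
        if mean_diff == 0:
            return float("nan")
        return math.copysign(float("inf"), mean_diff)
    return mean_diff / standard_error


t_value = paired_t_statistic(ensemble, j48)
print(t_value)
```

With identical samples there is no variation in the differences, so no t statistic or p-value is defined; the test only becomes meaningful once at least some pairs differ.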

r p-value

Score: -1, Answers: 1, Views: 2520

Tag statistics

r ×4

p-value ×1

python-2.7 ×1

random ×1

rows ×1

seed ×1

tree ×1