Posts by Pet*_*rov

Pandas: convert the type of a column

I have a dataframe with a column

                                         category  
0          [???????/Hi-Tech/????????/?????????????/ ]  
1   [/???????/??????/????????????/???? ???????????...  
2   [] 
3   [/???????/??????/????????????/???? ???????????...  
4          [???????/Hi-Tech/????????/?????????????/ ]  
5   [] 
6          [???????/Hi-Tech/????????/?????????????/ ]  
7   [/???????/??????/????????????/???? ???????????...  
8          [???????/Hi-Tech/????????/?????????????/ ]  
9   [/???????/??????/????????????/???? ???????????...  
10         [???????/Hi-Tech/????????/?????????????/ ]  
11  [/???????/??????/????????????/???? ???????????...  
12  []  
13  [/???????/??????/????????????/???? ???????????...  
14         [???????/Hi-Tech/????????/?????????????/ ] 

The column contains lists. I need to get the first string from each list, but some of the lists are empty, and when I try to use

df.category.iloc[0]

I get

ValueError: Length of values does not match length of index

How can I fix that error and get a string instead of a list?
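A minimal sketch of one way to take the first string from each list while tolerating empty lists. The data below is a made-up stand-in for the real column, and `first_category` is a hypothetical column name. This particular ValueError usually appears when a sequence of a different length is assigned to a column, so the sketch builds a result with exactly one value per row.

```python
import pandas as pd

# Hypothetical stand-in data: each cell holds a list, and some lists are empty.
df = pd.DataFrame({"category": [["a/Hi-Tech/b/"], [], ["/c/d/e/"]]})

# Take the first element of each list; empty lists map to None,
# so the result always has the same length as the index.
df["first_category"] = df["category"].apply(
    lambda items: items[0] if len(items) > 0 else None
)

print(df)
```

Because `apply` returns one value per row, the new column always lines up with the index, and the rows that had empty lists can be dropped afterwards with `dropna` if they are not needed.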

python string list dataframe pandas

1 vote · 1 answer · 74 views

Docker: pull and run a container

I need to run a Docker container.

First, I pulled it

docker pull [OPTIONS] NAME[:TAG|@DIGEST]

Next I tried to run it

docker run [OPTIONS] IMAGE[:TAG|@DIGEST] [COMMAND] [ARG...]

But I got an error

docker: Error response from daemon: driver failed programming external connectivity on endpoint youthful_bhaskara (47fae1c2ecd6245d127801729b80276aeb3858526a9441760925d904ce1565ff): Error starting userland proxy: listen tcp 0.0.0.0:8888: bind: address already in use.
ERRO[0000] error waiting for container: context canceled 

With sudo I get the same error.

How can I fix this? Maybe I missed some intermediate step?
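The message `listen tcp 0.0.0.0:8888: bind: address already in use` says that something on the host is already bound to port 8888, so Docker cannot publish that port. A small sketch for checking whether a host port is free before starting the container; `port_is_free` is a hypothetical helper name, not part of Docker or any library.

```python
import socket


def port_is_free(port, host="0.0.0.0"):
    """Return True if a TCP socket can be bound to (host, port) right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Typically "address already in use": another process or
            # container is holding the port.
            return False
        return True


print("port 8888 free:", port_is_free(8888))
```

If the port turns out to be taken, the usual options are to stop whatever is holding it (possibly an earlier container, visible in `docker ps`) or to publish the container on a different host port, for example `-p 8889:8888`.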

ubuntu docker

1 vote · 1 answer · 150 views

Doc2Vec: get the text for a tag

I have trained a Doc2Vec model and am trying to get predictions.

I use

test_data = word_tokenize("????? ?????? ???????? ?.?.".lower())
model = Doc2Vec.load(model_path)
v1 = model.infer_vector(test_data)
sims = model.docvecs.most_similar([v1])
print(sims)

It returns

[('624319', 0.7534812092781067), ('566511', 0.7333904504776001), ('517382', 0.7264763116836548), ('523368', 0.7254455089569092), ('494248', 0.7212602496147156), ('382920', 0.7092794179916382), ('530910', 0.7086726427078247), ('513421', 0.6893941760063171), ('196931', 0.6776881814002991), ('196947', 0.6705600023269653)]

Next I tried to find out which text each of these numbers refers to

model.docvecs['624319']

But it only returns the vector representation

array([ 0.36298314, -0.8048847 , -1.4890883 , -0.3737898 , -0.00292279,
   -0.6606688 , -0.12611026, -0.14547637,  0.78830665,  0.6172428 ,
   -0.04928801,  0.36754376, -0.54034036,  0.04631123,  0.24066721,
    0.22503968,  0.02870891,  0.28329515,  0.05591608,  0.00457001],
  dtype=float32)

So, is there any way to get the text for that tag from the model? Loading the training dataset takes a lot of time, so I am trying to find another way.
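A trained Doc2Vec model keeps the document vectors and their tags, not the original texts, so the text has to come from somewhere outside the model. One possible workaround, sketched below with made-up documents, is to build a tag-to-text lookup once, save it to disk, and load only that small file at query time. The `corpus` contents and the file name `tag_to_text.json` are assumptions for illustration.

```python
import json
import os
import tempfile

# Hypothetical (tag, text) pairs; in practice they would be collected once
# while streaming the training data.
corpus = [
    ("624319", "first example document"),
    ("566511", "second example document"),
]

# Build the lookup and persist it, so the full dataset is read only once.
tag_to_text = dict(corpus)
path = os.path.join(tempfile.mkdtemp(), "tag_to_text.json")
with open(path, "w", encoding="utf-8") as fh:
    json.dump(tag_to_text, fh, ensure_ascii=False)

# Later: load the lookup and resolve the tags returned by most_similar.
with open(path, encoding="utf-8") as fh:
    lookup = json.load(fh)

sims = [("624319", 0.7534812092781067), ("566511", 0.7333904504776001)]
for tag, score in sims:
    print(tag, round(score, 4), lookup.get(tag, "<unknown tag>"))
```

For a very large corpus, the same idea works with an on-disk key-value store instead of a single JSON file.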

python gensim doc2vec

1 vote · 1 answer · 419 views

Python: error when writing a class

I want to write a function and add it to a class. I use

import pandas as pd
import tldextract

domain = []
df = pd.DataFrame()
df['urls'] = ['ru.vk.com', 'eng.facebook.com', 'ru.ya.ru']
urls = df.urls.values.tolist()
class csv:
    def get_domain(self, list_url, list, df):
        self.list_url = list_url
        self.list = list
        self.df = df
        for i, url in enumerate(list_url):
            get_domain = tldextract.extract(url)
            subdomain = get_domain[0] + '.' + get_domain[1] + '.' + get_domain[2]
            if subdomain.startswith('.'):
                subdomain = subdomain[1:]
            elif subdomain.endswith('.'):
                subdomain = subdomain[:-1]
            elif subdomain.startswith('www.'):
                subdomain = subdomain[4:]
            list.append(subdomain)
        df['subdomain'] = list

df = csv()
df.get_domain(urls, domain, …
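The snippet is cut off, so the exact error is not shown. Some things are visible in the code as posted: the class name `csv` shadows the standard-library module of the same name, the parameter `list` shadows the built-in, `df = csv()` rebinds `df` so the DataFrame is no longer reachable under that name, and the `elif` chain strips at most one of the leading dot, trailing dot, or `www.` prefix. A possible restructuring, as a sketch only: `DomainExtractor`, `join_host`, and `naive_extract` are hypothetical names, and `naive_extract` is a crude stand-in that does not reproduce what tldextract does.

```python
def join_host(parts):
    """Join (subdomain, domain, suffix), skipping empty parts and a leading 'www.'."""
    host = ".".join(part for part in parts if part)
    if host.startswith("www."):
        host = host[4:]
    return host


class DomainExtractor:
    """Hypothetical replacement for the class named `csv` in the question."""

    def __init__(self, extract):
        # `extract` is any callable returning a (subdomain, domain, suffix) tuple.
        self.extract = extract

    def get_domains(self, urls):
        # Return a new list instead of mutating arguments passed in.
        return [join_host(self.extract(url)) for url in urls]


def naive_extract(url):
    """Crude stand-in for illustration: treats the last label as the suffix."""
    labels = url.split(".")
    return (".".join(labels[:-2]), labels[-2], labels[-1])


extractor = DomainExtractor(naive_extract)
print(extractor.get_domains(["ru.vk.com", "www.ya.ru", "ya.ru"]))
```

Returning a list leaves the caller free to assign it to a column, for example `df['subdomain'] = extractor.get_domains(urls)`, and keeps the DataFrame under its own name.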

python class

0 votes · 1 answer · 74 views

Tag statistics

python ×3

class ×1

dataframe ×1

doc2vec ×1

docker ×1

gensim ×1

list ×1

pandas ×1

string ×1

ubuntu ×1