I am working through a gensim tutorial and ran into something I don't understand. texts is a nested list of strings:
In [37]: texts
Out[37]:
[['human', 'machine', 'interface', 'lab', 'abc', 'computer', 'applications'],
['survey', 'user', 'opinion', 'computer', 'system', 'response', 'time'],
['eps', 'user', 'interface', 'management', 'system'],
['system', 'human', 'system', 'engineering', 'testing', 'eps'],
['relation', 'user', 'perceived', 'response', 'time', 'error', 'measurement'],
['generation', 'random', 'binary', 'unordered', 'trees'],
['intersection', 'graph', 'paths', 'trees'],
['graph', 'minors', 'iv', 'widths', 'trees', 'well', 'quasi', 'ordering'],
['graph', 'minors', 'survey']]
and sum(texts, []) gives:
Out[38]:
['human',
'machine',
'interface',
'lab',
'abc',
'computer',
'applications',
'survey',
'user',
'opinion',
'computer',
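The list continues for several more lines, but I have omitted the rest to save space. I have two questions:
1) Why does sum(texts, []) produce this result (i.e. flatten the nested list)?
2) Why is the output displayed so strangely, with one element per line? Is there something special about this output (... or, as I suspect, is it my IPython behaving oddly)? Please confirm whether you see this too.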
This is because adding lists together concatenates them.
sum([a, b, c, d, ..., z], start)
is equivalent to
start + a + b + c + d + ... + z
So
sum([['one', 'two'], ['three', 'four']], [])
is equivalent to
[] + ['one', 'two'] + ['three', 'four']
which gives you
['one', 'two', 'three', 'four']
Note that start defaults to 0, because sum is designed to work with numbers, so if you were to try
sum([['one', 'two'], ['three', 'four']])
then it would try the equivalent of
0 + ['one', 'two'] + ['three', 'four']
and it would fail because you can't add integers to lists.
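To check both cases yourself, here is a minimal sketch using the same two-list example. The variable names (nested, flat, manual, error) are made up for illustration.

```python
nested = [['one', 'two'], ['three', 'four']]

# sum(iterable, start) adds each item onto start using +.
flat = sum(nested, [])

# The same computation written out by hand.
manual = [] + nested[0] + nested[1]

print(flat)            # ['one', 'two', 'three', 'four']
print(flat == manual)  # True

# Without an explicit start, sum begins at 0, and 0 + list raises TypeError.
error = None
try:
    sum(nested)
except TypeError as exc:
    error = exc
    print(type(exc).__name__)  # TypeError
```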
The one-per-line thing is just how IPython's pretty-printer chooses to display a list of strings that is too long to fit on one line.
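You can see a similar effect outside IPython with the standard library's pprint module. This is only an analogy: IPython uses its own pretty-printer, not pprint, and the variable names below are made up for illustration.

```python
from pprint import pformat

short_list = ['graph', 'minors', 'survey']
long_list = ['human', 'machine', 'interface', 'lab', 'abc', 'computer',
             'applications', 'survey', 'user', 'opinion', 'computer']

# A list whose repr fits within the default width of 80 stays on one line.
short_text = pformat(short_list)

# A list whose repr is too wide is split with one element per line.
long_text = pformat(long_list)

print(short_text)
print(long_text)
```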