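I have a list of 50K strings (city names), and I need the smallest list of character trigrams (n-grams in general) such that every string is hit at least once by some trigram. Consider the following list: ['amsterdam', 'rotterdam', 'haarlem', 'utrecht', 'groningen']
The list of identifying trigrams has length 4 and should be (alternatives are possible):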
['ter', 'haa', 'utr', 'gro']
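The requirement above can be read as a set-cover problem: each trigram "covers" the strings that contain it, and the goal is a small set of trigrams covering every string. The following is only a minimal sketch of the standard greedy heuristic under that reading. The names `greedy_identifying_grams` and `covers` are made up for this illustration, and the greedy choice is not guaranteed to produce the true minimum on every input.

```python
from collections import defaultdict


def ngrams(text, n=3):
    """Return the set of character n-grams contained in text."""
    return {text[i:i + n] for i in range(len(text) - n + 1)}


def greedy_identifying_grams(names, n=3):
    """Greedy set cover: keep picking the n-gram that hits the most uncovered names.

    Hypothetical helper for illustration; it may return more n-grams than
    the true minimum on some inputs.
    """
    # Map each n-gram to the indices of the names that contain it.
    covers = defaultdict(set)
    for idx, name in enumerate(names):
        for gram in ngrams(name, n):
            covers[gram].add(idx)

    # Names shorter than n have no n-gram and can never be covered, so skip them.
    uncovered = {idx for idx, name in enumerate(names) if len(name) >= n}
    chosen = []
    while uncovered:
        # Prefer the n-gram covering the most uncovered names;
        # ties are broken alphabetically so the result is deterministic.
        best = min(covers, key=lambda g: (-len(covers[g] & uncovered), g))
        chosen.append(best)
        uncovered -= covers[best]
    return chosen


cities = ["amsterdam", "rotterdam", "haarlem", "utrecht", "groningen"]
grams = greedy_identifying_grams(cities)
print(grams)
```

For the five sample cities this yields four trigrams, one shared by 'amsterdam' and 'rotterdam' plus one for each remaining city. The specific trigrams differ from the list above because of the alphabetical tie-breaking. The loop rescans all n-grams on every pick, which is simple but may be slow for 50K strings.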
I thought my solution found the correct answer, but it gives wrong answers when used on other lists.
from collections import Counter

def identifying_grams(list, n=3):

    def f7(seq):
        # Remove duplicates while preserving order.
        seen = set()
        seen_add = seen.add
        return [x for x in seq if not (x in seen or seen_add(x))]

    def ngrams(text, n=3):
        return [text[i:i + n] for i in range(len(text) - n + 1)]

    hits = []
    trigrams = []
    for item in list:
        # trigrams += ngrams(item)
        trigrams += f7(ngrams(item))

    counts = Counter(trigrams).most_common()

    for trigram, count in counts:
        items = []
        for item …

I am trying to make a grid that does not depend on a preset number of columns. I created a small sample to show the situation:
<!DOCTYPE HTML>
<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=utf-8">
<title>Grid in HTML5 and CSS3</title>
<style>
* {margin:0;padding:0;}
.row {display:block;position:relative;clear:both;}
.row>* {display:block;position:relative;clear:both;float:left;clear:none;width:100%;}
.row>*:empty {width:0px;}
/* one column in the row */
.row>:nth-last-child(1):nth-child(1) {width:100%;}
/* two columns in the row */
.row>:nth-last-child(2):nth-child(1) {width:50%;}
.row>:nth-last-child(1):nth-child(2) {width:50%;}
/* three columns in the row */
.row>:nth-last-child(3):nth-child(1) {width:33.33%;}
.row>:nth-last-child(2):nth-child(2) {width:33.33%;}
.row>:nth-last-child(1):nth-child(3) {width:33.34%;}
.row>:empty:nth-last-child(3):nth-child(1)+:not(:empty) {width:66.66%;}
.row>:empty:nth-last-child(2):nth-child(2)+:not(:empty) {width:66.67%;}
article {margin:.5em;border:1px solid green;border-radius:.3em;padding:.5em; }
</style>
</head>
<body>
<section class="row">
<div>
<article>
<p>This row has only one child.</p>
</article>
</div> …

I am running some tests on a new HTML5/CSS3 framework, and I would like to test it against some classless HTML files to see how the framework behaves.
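Does anyone know of one or more static HTML files that contain all HTML5 tags in a variety of configurations (preferably without classes) that I could use to test my framework? That would save me a lot of time.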