ber*_*zie 20 algorithm dynamic-programming python-3.x
我想了解动态规划的概念,通过对MIT OCW课程在这里.关于OCW视频的解释很棒,但我觉得在我将解释实现到代码之前我并不理解它.在实施时,我会参考这里的讲义中的一些注释,特别是注释的第3页.
问题是,我不知道如何将一些数学符号转换为代码.这是我实施的解决方案的一部分(并认为它是正确实现的):
import math
paragraph = "Some long lorem ipsum text."
words = paragraph.split(" ")
# Count total length for all strings in a list of strings.
# This function will be used by the badness function below.
def total_length(str_arr):
total = 0
for string in str_arr:
total = total + len(string)
total = total + len(str_arr) # spaces
return total
# Calculate the badness score for a word.
# str_arr is assumed be send as word[i:j] as in the notes
# we don't make i and j as argument since it will require
# global vars then.
def badness(str_arr, page_width):
line_len = total_length(str_arr)
if line_len > page_width:
return float('nan')
else:
return math.pow(page_width - line_len, 3)
Run Code Online (Sandbox Code Playgroud)
现在我不理解的部分是在讲义中的第3点到第5点.我实际上不明白,也不知道从哪里开始实现这些.到目前为止,我已经尝试迭代单词列表,并计算每个所谓的行尾的坏处,如下所示:
def justifier(str_arr, page_width):
paragraph = str_arr
par_len = len(paragraph)
result = [] # stores each line as list of strings
for i in range(0, par_len):
if i == (par_len - 1):
result.append(paragraph)
else:
dag = [badness(paragraph[i:j], page_width) + justifier(paragraph[j:], page_width) for j in range(i + 1, par_len + 1)]
# Should I do a min(dag), get the index, and declares it as end of line?
Run Code Online (Sandbox Code Playgroud)
但是,我不知道如何继续这个功能,说实话,我不明白这一行:
dag = [badness(paragraph[i:j], page_width) + justifier(paragraph[j:], page_width) for j in range(i + 1, par_len + 1)]
Run Code Online (Sandbox Code Playgroud)
以及如何我将回到justifier
作为一个int
(因为我已经决定来存储返回值result
,这是一个名单,我应该从那里另一个功能和递归?如果有任何递归呢?
你能告诉我下一步该做什么,并解释这是如何动态编程的吗?我真的看不出递归的位置,以及子问题是什么.
谢谢你.
Joo*_*wan 21
如果您无法理解动态编程本身的核心思想,请参考以下内容:
动态编程本质上牺牲了时间复杂度的空间复杂度(但是与您节省的时间相比,您使用的额外空间通常非常少,如果正确实现,动态编程总是值得的).您可以随时存储每个递归调用的值(例如,在数组或字典中),这样当您在递归树的另一个分支中遇到相同的递归调用时,可以避免第二次计算.
而且不,你不必须使用递归.以下是我正在使用循环的问题的实现.我非常密切地关注了由AlexSilva链接的TextAlignment.pdf.希望你觉得这很有帮助.
def length(wordLengths, i, j):
return sum(wordLengths[i- 1:j]) + j - i + 1
def breakLine(text, L):
# wl = lengths of words
wl = [len(word) for word in text.split()]
# n = number of words in the text
n = len(wl)
# total badness of a text l1 ... li
m = dict()
# initialization
m[0] = 0
# auxiliary array
s = dict()
# the actual algorithm
for i in range(1, n + 1):
sums = dict()
k = i
while (length(wl, k, i) <= L and k > 0):
sums[(L - length(wl, k, i))**3 + m[k - 1]] = k
k -= 1
m[i] = min(sums)
s[i] = sums[min(sums)]
# actually do the splitting by working backwords
line = 1
while n > 1:
print("line " + str(line) + ": " + str(s[n]) + "->" + str(n))
n = s[n] - 1
line += 1
Run Code Online (Sandbox Code Playgroud)
对于任何人在此仍然有兴趣:关键是从文本的末尾向后移动(如提到这里)。如果这样做,您只需比较已经记忆的元素。
说,words
是要包装的字符串列表textwidth
。然后,在讲义中,任务减少为三行代码:
import numpy as np
textwidth = 80
DP = [0]*(len(words)+1)
for i in range(len(words)-1,-1,-1):
DP[i] = np.min([DP[j] + badness(words[i:j],textwidth) for j in range(i+1,len(words)+1)])
Run Code Online (Sandbox Code Playgroud)
带有:
def badness(line,textwidth):
# Number of gaps
length_line = len(line) - 1
for word in line:
length_line += len(word)
if length_line > textwidth: return float('inf')
return ( textwidth - length_line )**3
Run Code Online (Sandbox Code Playgroud)
他提到,可以添加第二个列表来跟踪中断位置。您可以通过将代码更改为:
DP = [0]*(len(words)+1)
breaks = [0]*(len(words)+1)
for i in range(len(words)-1,-1,-1):
temp = [DP[j] + badness(words[i:j],args.textwidth) for j in range(i+1,len(words)+1)]
index = np.argmin(temp)
# Index plus position in upper list
breaks[i] = index + i + 1
DP[i] = temp[index]
Run Code Online (Sandbox Code Playgroud)
要恢复文本,只需使用中断位置列表:
def reconstruct_text(words,breaks):
lines = []
linebreaks = []
i = 0
while True:
linebreaks.append(breaks[i])
i = breaks[i]
if i == len(words):
linebreaks.append(0)
break
for i in range( len(linebreaks) ):
lines.append( ' '.join( words[ linebreaks[i-1] : linebreaks[i] ] ).strip() )
return lines
Run Code Online (Sandbox Code Playgroud)
结果:(text = reconstruct_text(words,breaks)
)
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy
eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam
voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet
clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit
amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam
nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed
diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet
clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
Run Code Online (Sandbox Code Playgroud)
可能会想添加一些空格。这是非常棘手的(因为可能会提出各种美学规则),但是天真的尝试可能是:
import re
def spacing(text,textwidth,maxspace=4):
for i in range(len(text)):
length_line = len(text[i])
if length_line < textwidth:
status_length = length_line
whitespaces_remain = textwidth - status_length
Nwhitespaces = text[i].count(' ')
# If whitespaces (to add) per whitespace exeeds
# maxspace, don't do anything.
if whitespaces_remain/Nwhitespaces > maxspace-1:pass
else:
text[i] = text[i].replace(' ',' '*( 1 + int(whitespaces_remain/Nwhitespaces)) )
status_length = len(text[i])
# Periods have highest priority for whitespace insertion
periods = text[i].split('.')
# Can we add a whitespace behind each period?
if len(periods) - 1 + status_length <= textwidth:
text[i] = '. '.join(periods).strip()
status_length = len(text[i])
whitespaces_remain = textwidth - status_length
Nwords = len(text[i].split())
Ngaps = Nwords - 1
if whitespaces_remain != 0:factor = Ngaps / whitespaces_remain
# List of whitespaces in line i
gaps = re.findall('\s+', text[i])
temp = text[i].split()
for k in range(Ngaps):
temp[k] = ''.join([temp[k],gaps[k]])
for j in range(whitespaces_remain):
if status_length >= textwidth:pass
else:
replace = temp[int(factor*j)]
replace = ''.join([replace, " "])
temp[int(factor*j)] = replace
text[i] = ''.join(temp)
return text
Run Code Online (Sandbox Code Playgroud)
是什么让您:(text = spacing(text,textwidth)
)
Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam nonumy
eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed diam
voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet
clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit
amet. Lorem ipsum dolor sit amet, consetetur sadipscing elitr, sed diam
nonumy eirmod tempor invidunt ut labore et dolore magna aliquyam erat, sed
diam voluptua. At vero eos et accusam et justo duo dolores et ea rebum. Stet
clita kasd gubergren, no sea takimata sanctus est Lorem ipsum dolor sit amet.
Run Code Online (Sandbox Code Playgroud)
归档时间: |
|
查看次数: |
15774 次 |
最近记录: |