zen*_*nna 6 algorithm performance clojure cartesian-product
假设我有A_1,...A_n,例如[[a b c][d e][f]].我想找到这些集合的笛卡尔积,但不包括任何忽略列表元素的超集.
例如,如果我的忽略列表是[[a e][c]],笛卡尔积的结果将是[[a d f][b d f][b e f]].注意任何术语c都不在那里,也不是[a e f].
当然,我可以做到这一点的方法是找到完整的笛卡尔积,然后删除有问题的物品,但我想要一种更有效的方法,这样我就可以避免首先检查解决方案.
我有一个初步的解决方案,涉及逐步建立购物车产品中的每个术语,并在每个阶段我删除任何元素,A_i如果将它们添加到我正在构建的术语将导致它成为任何一个忽略的超集.这样工作正常,并且比天真的解决方案更好,但仍然有大量的冗余检查,这也取决于集合的呈现顺序.例如,如果[f]在我的忽略列表中,我仍会继续尝试创建术语,直到我到达[f]然后丢弃.
具体来说,我的clojure实现是
(defn first-elements
"Get the first elements of a set of sets, unless ignored"
[sets ignores in-ignore?]
(loop [product-tuple [] sets sets]
(println "sets " sets)
(cond
(or (nil? sets) (nil? (first sets)))
product-tuple
:else
(if-let [set-op (remove #(in-ignore? product-tuple ignores %) (first sets))]
(if (and (coll? set-op) (empty? set-op))
product-tuple
(recur (conj product-tuple (first set-op)) (next sets)))
product-tuple))))
(defn in-ignore?
"if I add elem to this build will it become a superset of any of the ignores"
[build ignores elem]
(some #(clojure.set/superset? (conj (set build) elem) %) ignores))
(defn cartesian-product-ignore
"All the ways to take one item from each sequence, except for ignore"
[ignores original-sets]
(loop [cart-prod #{} sets original-sets]
(let [firsts (first-elements sets ignores in-ignore?)]
(print "firsts " firsts "-cart-prod " cart-prod " sets " sets "\n")
(cond
(zero? (count firsts))
cart-prod
(= (count sets) (count firsts))
(recur (conj cart-prod firsts) (update-in sets [(dec (count sets))] next))
:else
(recur cart-prod (assoc
(update-in sets [(dec (count firsts))] next)
(count firsts)
(original-sets (count firsts))))))))
Run Code Online (Sandbox Code Playgroud)
我认为可以对您当前的方法进行一些改进.但首先,让我们实现一个基本的cartisian-product.然后我们可以调整它以接受忽略列表.这很容易使用for和一些递归:
(defn cartesian-product [colls]
(if (empty? colls)
(list ())
(for [e (first colls)
sub-product (cartesian-product (rest colls))]
(cons e sub-product))))
;; Quick test run
(cartesian-product [[:a :b :c] [:d :e] [:f]])
=> ((:a :d :f) (:a :e :f) (:b :d :f) (:b :e :f) (:c :d :f) (:c :e :f))
Run Code Online (Sandbox Code Playgroud)
好.自从我们使用以来for,我们就有了懒惰的优势.如果您需要的结果不是序列序列,那么将其转换为其他内容就足够了.
现在,困难的部分 - 实现忽略集.根据您的描述,您当前的方法是从A_i中删除元素,如果将它们添加到您正在构建的术语中会导致该术语成为任何忽略集的超集.正如您的代码所示,这不仅效率低下(例如,superset?最差情况下的线性时间与其第一个参数的大小相同),但它也使代码比它需要的更复杂.
所以让我们采用不同的方法.我们不是从A_i中删除元素,而是从ignore集中删除我们添加到术语中的任何元素.然后,如果任何忽略集都为空,我们可以修剪一个术语.作为奖励,它所需要的只是对我们之前cartesian-product实施的一些改变:
(defn cartesian-product-ignore [ignore-sets colls]
(cond (some empty? ignore-sets) () ; prune
(empty? colls) (list ()) ; base case
:else ; recursive case
(for [e (first colls)
sub-product (cartesian-product-ignore (map (fn [s]
(disj s e))
ignore-sets)
(rest colls))]
(cons e sub-product))))
;; test without any ignore sets
(cartesian-product-ignore [] [[:a :b :c] [:d :e] [:f]])
=> ((:a :d :f) (:a :e :f) (:b :d :f) (:b :e :f) (:c :d :f) (:c :e :f))
;; Now the moment of truth
(cartesian-product-ignore [(set [:a :e]) (set [:c])] [[:a :b :c] [:d :e] [:f]])
=> ((:a :d :f) (:b :d :f) (:b :e :f))
Run Code Online (Sandbox Code Playgroud)
当然,可能需要进行细微更改以满足您的确切需求.例如,您可能希望将忽略集接受为矢量或序列,并在内部将它们转换为集合.但这是算法的本质..