Comparing file contents in F#

Bar*_*ski 3 .net f# filestream

I wrote a quick and dirty function to compare file contents (BTW, I have already checked that the files are the same size):

open System.IO

let eqFiles f1 f2 =
  let bytes1 = Seq.ofArray (File.ReadAllBytes f1)
  let bytes2 = Seq.ofArray (File.ReadAllBytes f2)
  let res = Seq.compareWith (fun x y -> (int x) - (int y)) bytes1 bytes2
  res = 0

I'm not happy about reading everything into arrays. I would rather have a lazy sequence of bytes, but I couldn't find a suitable API in F#.
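For illustration, a lazy byte sequence can be built with a sequence expression over `FileStream.ReadByte`. This is only a minimal sketch: `readBytesLazily` and `eqFilesLazy` are hypothetical names, not an existing F# API, and it relies on `FileStream` buffering internally so that reading one byte at a time does not hit the disk for every byte.

```fsharp
open System.IO

/// Hypothetical helper: yields the bytes of a file one at a time, on demand.
let readBytesLazily (path: string) : seq<byte> =
  seq {
    // The stream is disposed when enumeration finishes or is abandoned early
    use stream = File.OpenRead path
    let next = ref (stream.ReadByte())
    while next.Value <> -1 do
      yield byte next.Value
      next.Value <- stream.ReadByte()
  }

/// Hypothetical variant of eqFiles built on the lazy sequences.
/// Seq.compareWith stops at the first non-zero comparison and also
/// reports a difference when one sequence is shorter than the other.
let eqFilesLazy (f1: string) (f2: string) =
  let cmp =
    Seq.compareWith (fun (x: byte) (y: byte) -> compare x y) (readBytesLazily f1) (readBytesLazily f2)
  cmp = 0
```

Because both sequences are enumerated in lockstep, at most one buffered block of each file is held in memory at a time.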

Tom*_*cek 9

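If you want to use the full power of F#, you can also do this asynchronously. The idea is to asynchronously read blocks of a specified size from both files and then compare the blocks (using the standard, simple comparison of byte arrays).

This is actually an interesting problem, because you need to generate something like an asynchronous sequence (a sequence of Async<T> values generated on demand, but without blocking threads the way a plain seq<T> or iteration would). The declaration of the asynchronous sequence and the function that reads the data could look like this:

EDIT: I also posted the snippet to http://fssnip.net/1k, which has nicer F# formatting :-)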

open System.IO

/// Represents a sequence of values 'T where items 
/// are generated asynchronously on-demand
type AsyncSeq<'T> = Async<AsyncSeqInner<'T>> 
and AsyncSeqInner<'T> =
  | Ended
  | Item of 'T * AsyncSeq<'T>

/// Read file 'fn' in blocks of size 'size'
/// (returns on-demand asynchronous sequence)
let readInBlocks fn size = async {
  let stream = File.OpenRead(fn)
  let buffer = Array.zeroCreate size

  /// Returns next block as 'Item' of async seq
  let rec nextBlock() = async {
    let! count = stream.AsyncRead(buffer, 0, size)
    if count = 0 then return Ended
    else 
      // Create buffer with the right size
      let res = 
        if count = size then buffer
        else buffer |> Seq.take count |> Array.ofSeq
      return Item(res, nextBlock()) }

  return! nextBlock() }

The asynchronous workflow that performs the comparison is then quite simple:

let rec compareBlocks seq1 seq2 = async {
  let! item1 = seq1
  let! item2 = seq2
  match item1, item2 with 
  | Item(b1, ns1), Item(b2, ns2) when b1 <> b2 -> return false
  | Item(b1, ns1), Item(b2, ns2) -> return! compareBlocks ns1 ns2
  | Ended, Ended -> return true
  | _ -> return failwith "Size doesn't match" }

let s1 = readInBlocks "f1" 1000
let s2 = readInBlocks "f2" 1000
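// compareBlocks returns an Async<bool>, so run it to get the result
let equal = compareBlocks s1 s2 |> Async.RunSynchronously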
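To try the approach end to end in F# Interactive, here is a self-contained sketch that restates the definitions in compact form and exercises them on temporary files. The names `readBlocks`, `pathA`, `pathB` and `pathC` are hypothetical. Unlike the snippet above, this variant copies each block, closes the stream at end of file, and returns false instead of throwing when one file ends first. It assumes every read fills the buffer until end of file, which the general `Stream` contract does not guarantee.

```fsharp
open System.IO

type AsyncSeq<'T> = Async<AsyncSeqInner<'T>>
and AsyncSeqInner<'T> =
  | Ended
  | Item of 'T * AsyncSeq<'T>

/// Hypothetical variant of readInBlocks: copies each block and closes
/// the stream once the end of the file is reached.
/// Note: if the consumer stops early, the stream is left to the finalizer.
let readBlocks (path: string) (size: int) : AsyncSeq<byte[]> = async {
  let stream = File.OpenRead path
  let buffer : byte[] = Array.zeroCreate size
  let rec nextBlock () : AsyncSeq<byte[]> = async {
    let! count = stream.AsyncRead(buffer, 0, size)
    if count = 0 then
      stream.Dispose()
      return Ended
    else
      return Item(Array.sub buffer 0 count, nextBlock ()) }
  return! nextBlock () }

/// Compares two block sequences; stops at the first differing block.
let rec compareBlocks (seq1: AsyncSeq<byte[]>) (seq2: AsyncSeq<byte[]>) = async {
  let! item1 = seq1
  let! item2 = seq2
  match item1, item2 with
  | Item(b1, _), Item(b2, _) when b1 <> b2 -> return false
  | Item(_, ns1), Item(_, ns2) -> return! compareBlocks ns1 ns2
  | Ended, Ended -> return true
  | _ -> return false }   // one file ended before the other

// Usage on temporary files: A and B are identical, C differs in its last byte
let pathA = Path.GetTempFileName()
let pathB = Path.GetTempFileName()
let pathC = Path.GetTempFileName()
let data : byte[] = Array.init 2500 byte
File.WriteAllBytes(pathA, data)
File.WriteAllBytes(pathB, data)
File.WriteAllBytes(pathC, Array.append (Array.sub data 0 2499) [| 0uy |])

let sameAB = compareBlocks (readBlocks pathA 1000) (readBlocks pathB 1000) |> Async.RunSynchronously
let sameAC = compareBlocks (readBlocks pathA 1000) (readBlocks pathC 1000) |> Async.RunSynchronously
printfn "A = B: %b, A = C: %b" sameAB sameAC
```

`Async.RunSynchronously` blocks the calling thread until the result is available; inside another workflow you would bind the result with `let!` instead.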

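  • Got to tell you, F# Web Snippets is great. I find it really exciting; it's a real gem that shows what software can do (credit to the F# compiler team as well, for exposing the compiler services). By the way, the tooltip implementation is very nice (it shows instantly, stays for as long as you hover, and is easy on the eyes). Did you hand-roll it, or is it third party? If you're taking feature requests, it would be nice if snippets with line numbers were easier to copy and paste (although you can generate snippets without line numbers, which partly solves this). (2 upvotes)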

Run*_* FS 6

This compares the files byte by byte and short-circuits as soon as it finds a difference. It also handles files of different sizes.

open System.IO

let rec compareFiles (fs1: FileStream) (fs2: FileStream) =
  match fs1.ReadByte(), fs2.ReadByte() with
  | -1, -1 -> true  // all bytes have been enumerated and were all equal
  | _, -1 -> false  // the files are of different length
  | -1, _ -> false  // the files are of different length
  | x, y when x <> y -> false
  // only continue to the next bytes when the present two are equal
  | _ -> compareFiles fs1 fs2
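The function above takes streams that are already open, so the caller has to open and close them. A minimal sketch of a wrapper that does this with `use` bindings follows; `filesEqual` is a hypothetical name, and the inner loop applies the same byte-by-byte logic with the two length cases folded into the inequality check.

```fsharp
open System.IO

/// Hypothetical wrapper: opens both files, compares them byte by byte,
/// and closes both streams when done (even if an exception is thrown).
let filesEqual (path1: string) (path2: string) =
  use fs1 = File.OpenRead path1
  use fs2 = File.OpenRead path2
  let rec loop () =
    match fs1.ReadByte(), fs2.ReadByte() with
    | -1, -1 -> true              // both files ended together
    | x, y when x <> y -> false   // differing bytes, or one file ended first
    | _ -> loop ()                // equal bytes: keep going
  loop ()
```

The inner `loop` is tail-recursive, so large files should not grow the stack.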