I found an "interesting" F# exercise online. The idea behind it is to find the number of substrings in a given string.
Here is the prompt:
Description:
You are given a DNA sequence:
a string that contains only characters 'A', 'C', 'G', and 'T'.
Your task is to calculate the number of substrings of sequence,
in which each of the symbols appears the same number of times.
Example 1:
For sequence = "ACGTACGT", the output should be 6
All substrings of length 4 contain each symbol exactly once (+5),
and the whole sequence contains each symbol twice (+1).
Example 2:
For sequence = "AAACCGGTTT", the output should be 1
Only substring "AACCGGTT" satisfies the criterion above: it contains each symbol twice.
Input: String, a sequence that consists only of symbols 'A', 'C', 'G', and 'T'.
Length constraint: 0 < sequence.length < 100000.
Output: Integer, the number of substrings where each symbol appears equally many times.
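I'm not sure where to start, or rather how to go about it. I looked around the internet trying to work out what to do, and all I found was the following code (I added the input variable and the var variable, changed the displayed "things" to the input, and then put in the substrings to search for (hopefully)):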
open System

let countSubstring (where : string) (what : string) =
    match what with
    | "" -> 0
    | _ -> (where.Length - where.Replace(what, @"").Length) / what.Length

[<EntryPoint>]
let main argv =
    let input = System.Console.ReadLine()
    let var = input.Length
    Console.WriteLine(var)
    let show where what =
        printfn @"countSubstring(""%s"", ""%s"") = %d" where what (countSubstring where what)
    show input "ACGT"
    show input "CGTA"
    show input "GTAC"
    show input "TACG"
    0
Anyway, if anyone could help me with this, it would be much appreciated.
Thanks in advance.
小智 3
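First declare a function numberACGT that returns 1 if the string contains the character A as many times as C, G and T, and 0 otherwise. To do this, declare an array N of 4 integers initialized to 0, then run through the string, incrementing the corresponding counter. Finally, compare the array elements with one another.
Then call numberACGT on every substring whose length is a multiple of 4 (a substring with equal counts of the four symbols necessarily has such a length) and add the result to a counter count, initialized to 0.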
let numberACGT (aString: string) =
    // N.[0], N.[1], N.[2], N.[3] count the occurrences of A, C, G and T
    let N = Array.create 4 (0:int)
    let last = aString.Length - 1
    for i = 0 to last do
        match aString.[i] with
        | 'A' -> N.[0] <- N.[0] + 1
        | 'C' -> N.[1] <- N.[1] + 1
        | 'G' -> N.[2] <- N.[2] + 1
        | _ -> N.[3] <- N.[3] + 1
    if (N.[0] = N.[1]) && (N.[1] = N.[2]) && (N.[2] = N.[3]) then 1 else 0

let numberSubStrings (aString: string) =
    let mutable count = 0
    let len = aString.Length
    for k = 1 to len / 4 do // only lengths that are multiples of 4
        for pos = 0 to len - 4*k do
            count <- count + numberACGT (aString.[pos..pos+4*k-1])
    count

I hope it is fast enough.

open System // needed for Console

[<EntryPoint>]
let main argv =
    let stopWatch = System.Diagnostics.Stopwatch.StartNew()
    let input = Console.ReadLine()
    printf "%i " (numberSubStrings input)
    stopWatch.Stop()
    Console.ReadLine() |> ignore // wait for Enter before exiting
    0

Result:

62 4.542700

New version in O(n²):

let numberSubStringsBis (aString: string) =
    let mutable count = 0
    let len = aString.Length
    for pos = 0 to len - 1 do
        // running symbol counts for the substrings that start at pos
        let mutable a = 0
        let mutable c = 0
        let mutable g = 0
        let mutable t = 0
        let mutable k = pos
        while k + 3 <= len - 1 do
            // extend the current substring by four characters
            for i = k to k + 3 do
                match aString.[i] with
                | 'A' -> a <- a + 1
                | 'C' -> c <- c + 1
                | 'G' -> g <- g + 1
                | _ -> t <- t + 1
            k <- k + 4
            if a=c && c=g && g=t then count <- count + 1
    count
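
With the length limit of 100000 from the prompt, a quadratic scan may still be slow, so here is a rough sketch of a different technique that is not part of the code above: counting pairs of prefixes with matching count differences. A substring is balanced exactly when the triple (A - C, A - G, A - T) of running counts is the same just before it starts and just after it ends, so it should be enough to count how often each triple repeats. The name numberSubStringsLinear and the int64 result type are my own assumptions, and like the code above it treats any character other than A, C and G as T.

```fsharp
open System.Collections.Generic

// Hypothetical helper (not from the original answer): one pass over the string.
let numberSubStringsLinear (aString: string) : int64 =
    // seen maps a difference triple to the number of prefixes that produced it
    let seen = Dictionary<(int * int * int), int64>()
    seen.[(0, 0, 0)] <- 1L // the empty prefix
    let mutable a = 0
    let mutable c = 0
    let mutable g = 0
    let mutable t = 0
    let mutable count = 0L
    for ch in aString do
        match ch with
        | 'A' -> a <- a + 1
        | 'C' -> c <- c + 1
        | 'G' -> g <- g + 1
        | _ -> t <- t + 1
        let key = (a - c, a - g, a - t)
        match seen.TryGetValue key with
        | true, n ->
            // every earlier prefix with the same triple starts a balanced substring ending here
            count <- count + n
            seen.[key] <- n + 1L
        | _ -> seen.[key] <- 1L
    count
```

On the two examples from the prompt, numberSubStringsLinear "ACGTACGT" gives 6 and numberSubStringsLinear "AAACCGGTTT" gives 1. The result is an int64 only as a precaution against overflow on long, highly repetitive inputs.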