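I have come across several blurbs online arguing that bufio.Scanner should be used instead of bufio.Reader.
I don't know whether my test case is relevant, but I decided to test one against the other by reading 1,000,000 lines from a text file: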
package main

import (
	"bufio"
	"fmt"
	"os"
	"strconv"
	"time"
	//"bytes"
)

func main() {
	fileName := "testfile.txt"

	// Create 1,000,000 integers as strings
	numItems := 1000000
	startInitStringArray := time.Now()
	var input [1000000]string
	//var input []string
	for i := 0; i < numItems; i++ {
		input[i] = strconv.Itoa(i)
		//input = append(input,strconv.Itoa(i))
	}
	elapsedInitStringArray := time.Since(startInitStringArray)
	fmt.Printf("Took %s to populate string array.\n", elapsedInitStringArray)

	// Write to a file
	fo, _ := os.Create(fileName)
	for i := 0; i < numItems; i++ {
		fo.WriteString(input[i] + "\n")
	}
	fo.Close()

	// Use reader
	openedFile, _ := os.Open(fileName)
	startReader := time.Now()
	reader := bufio.NewReader(openedFile)
	for i := 0; i < numItems; i++ {
		reader.ReadLine()
	}
	elapsedReader := time.Since(startReader)
	fmt.Printf("Took %s to read file using reader.\n", elapsedReader)
	openedFile.Close()

	// Use scanner
	openedFile, _ = os.Open(fileName)
	startScanner := time.Now()
	scanner := bufio.NewScanner(openedFile)
	for i := 0; i < numItems; i++ {
		scanner.Scan()
		scanner.Text()
	}
	elapsedScanner := time.Since(startScanner)
	fmt.Printf("Took %s to read file using scanner.\n", elapsedScanner)
	openedFile.Close()
}
A fairly typical set of timings looks like this:
Took 44.1165ms to populate string array.
Took 17.0465ms to read file using reader.
Took 23.0613ms to read file using scanner.
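I'm curious: when is it better to use a Reader rather than a Scanner, and should the choice be based on performance or on functionality?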
This is a flawed benchmark. The two loops are not doing the same thing.
func (b *Reader) ReadLine() (line []byte, isPrefix bool, err error)
returns a []byte.
func (s *Scanner) Text() string
returns string([]byte), that is, the token converted to a newly allocated string.
For a like-for-like comparison, use
func (s *Scanner) Bytes() []byte
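This is a flawed benchmark. It reads short strings: the integers from "0\n" to "999999\n". What does a real-world data set look like?
In the real world, we read Shakespeare: http://www.gutenberg.org/ebooks/100 (Plain Text UTF-8, pg100.txt).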
Took 2.973307ms to read file using reader. size: 5340315 lines: 124787
Took 2.940388ms to read file using scanner. size: 5340315 lines: 124787