How can I use filter_map() instead of filter() combined with map() without a performance decrease?

0x4*_*x42 6 iterator hashmap rust

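I want to use filter_map() instead of unwrap() in the map() and filter(), but I see a performance decrease when doing so. How do I write the code using filter_map() without losing performance? Why is there a loss of performance in the first place?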

src/lib.rs

use std::collections::HashMap;

pub enum Kind {
    Square(Square),
    Circle(Circle),
}

#[derive(Default, Copy, Clone)]
pub struct Circle {
    a: u32,
    b: u32,
    c: u32,
    d: u32,
}

#[derive(Default)]
pub struct Square {
    a: u32,
    b: Option<u32>,
    c: Option<u32>,
    d: Option<u32>,
    e: Option<u32>,
}

impl Kind {
    pub fn get_circle(&self) -> Option<&Circle> {
        if let Kind::Circle(b) = self {
            return Some(b);
        }
        None
    }
}

benches/test.rs

#![feature(test)]
extern crate test;

#[cfg(test)]
mod tests {
    use std::collections::HashMap;
    use std::net::{IpAddr, Ipv4Addr, SocketAddr};
    use test::Bencher;
    use testing::Circle;
    use testing::Kind;
    use testing::Square;

    fn get_bencher() -> HashMap<SocketAddr, Kind> {
        let mut question = HashMap::new();
        let square: Square = Default::default();
        question.insert(
            SocketAddr::new(IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1)), 0),
            Kind::Square(square),
        );

        let circle: Circle = Default::default();
        for n in 1..=10000 {
            let socket = SocketAddr::new(IpAddr::V4(Ipv4Addr::new(127, 0, 0, 1)), n);
            question.insert(socket, Kind::Circle(circle));
        }
        question
    }

    #[bench]
    fn bencher01(b: &mut Bencher) {
        let question = get_bencher();

        b.iter(|| {
            question
                .iter()
                .map(|a| (a.0, a.1.get_circle()))
                .filter_map(|(&a, b)| Some((a, b?)))
                .collect::<Vec<_>>()
        })
    }

    #[bench]
    fn bencher02(b: &mut Bencher) {
        let question = get_bencher();

        b.iter(|| {
            question
                .iter()
                .map(|a| (a.0, a.1.get_circle()))
                .filter(|c| c.1.is_some())
                .map(|d| (*d.0, d.1.unwrap()))
                .collect::<Vec<_>>()
        })
    }

    #[bench]
    fn bencher03(b: &mut Bencher) {
        let question = get_bencher();

        b.iter(|| {
            question
                .iter()
                .filter_map(|a| Some((*a.0, a.1.get_circle()?)))
                .collect::<Vec<_>>()
        })
    }
}

Run these benchmarks with nightly Rust and cargo bench, which forces release mode.

Output


I am using rustc 1.44.0-nightly (6dee5f112 2020-04-06) on an Intel(R) Core(TM) i7-7700K CPU @ 4.20GHz, Linux #### 5.6.4-arch1-1 #1 SMP PREEMPT Mon, 13 Apr 2020 12:21:19 +0000 x86_64 GNU/Linux

Jmb*_*Jmb 5

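The difference is that in your filter_map implementation, you copy the SocketAddr before checking whether the shape is a circle, so when the shape is not a circle you waste time copying it and then discarding it. See: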

#[bench]
fn bencher04(b: &mut Bencher) {
    let question = get_bencher();

    b.iter(|| {
        question
            .iter()
            .filter_map(|a| {
                let c = a.1.get_circle()?;
                Some((*a.0, c))
            })
            .collect::<Vec<_>>()
    })
}

This gives me:


Note: the difference is smaller for me than for you because I am running on a 32-bit platform, where copying a SocketAddr is much faster.
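The ordering argument above can be checked with a small counting experiment. The sketch below is not the benchmark code: Shape, copy_key, copy_first, check_first, and sample are hypothetical names, and a u16 key stands in for SocketAddr. It only counts key copies and says nothing about timing.

```rust
use std::cell::Cell;
use std::collections::HashMap;

/// Hypothetical stand-in for the question's `Kind`, reduced to what the example needs.
pub enum Shape {
    Square,
    Circle(u32),
}

impl Shape {
    pub fn get_circle(&self) -> Option<&u32> {
        match self {
            Shape::Circle(r) => Some(r),
            Shape::Square => None,
        }
    }
}

/// Copies the key and bumps `copies`, so every key copy can be observed.
fn copy_key(key: &u16, copies: &Cell<usize>) -> u16 {
    copies.set(copies.get() + 1);
    *key
}

/// Same shape as bencher03: tuple fields are evaluated left to right,
/// so the key is copied before `?` gets a chance to bail out.
pub fn copy_first(map: &HashMap<u16, Shape>, copies: &Cell<usize>) -> Vec<(u16, u32)> {
    map.iter()
        .filter_map(|(k, v)| Some((copy_key(k, copies), *v.get_circle()?)))
        .collect()
}

/// Same shape as bencher04: `?` runs first, so a square never copies its key.
pub fn check_first(map: &HashMap<u16, Shape>, copies: &Cell<usize>) -> Vec<(u16, u32)> {
    map.iter()
        .filter_map(|(k, v)| {
            let c = v.get_circle()?;
            Some((copy_key(k, copies), *c))
        })
        .collect()
}

/// One square and two circles.
pub fn sample() -> HashMap<u16, Shape> {
    let mut map = HashMap::new();
    map.insert(0, Shape::Square);
    map.insert(1, Shape::Circle(10));
    map.insert(2, Shape::Circle(20));
    map
}
```

With the three-entry map from sample(), copy_first records three key copies and check_first records two, while both return the same pairs.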


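On a 64-bit platform, bencher04 does not perform well. Looking at the generated assembly, bencher04 looks very similar to bencher02, but for some reason it does move more data around.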

However:

#[bench]
fn bencher05(b: &mut Bencher) {
    let question = get_bencher();

    b.iter(|| {
        question
            .iter()
            .flat_map(|a| {
                a.1.get_circle().map(|c| (*a.0, c))
            })
            .collect::<Vec<_>>()
    })
}

comes much closer in performance:

running 4 tests
test tests::bencher01 ... bench:     339,720 ns/iter (+/- 23,464)
test tests::bencher02 ... bench:     329,727 ns/iter (+/- 12,212)
test tests::bencher03 ... bench:     335,785 ns/iter (+/- 16,195)
test tests::bencher04 ... bench:     327,622 ns/iter (+/- 20,807)
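flat_map can stand in for filter_map in bencher05 because Option implements IntoIterator and yields zero or one item. Below is a minimal sketch of that equivalence, using a hypothetical even helper in place of get_circle and a u16 key in place of SocketAddr. It compares results only and makes no claim about speed.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for `get_circle`: yields the value only when it is even.
fn even(v: &u32) -> Option<&u32> {
    if v % 2 == 0 {
        Some(v)
    } else {
        None
    }
}

/// `filter_map` keeps the `Some` payloads and drops every `None`.
pub fn with_filter_map(map: &HashMap<u16, u32>) -> Vec<(u16, u32)> {
    map.iter()
        .filter_map(|(k, v)| even(v).map(|c| (*k, *c)))
        .collect()
}

/// `flat_map` flattens each `Option` as an iterator of zero or one item,
/// so it produces the same pairs as `with_filter_map`.
pub fn with_flat_map(map: &HashMap<u16, u32>) -> Vec<(u16, u32)> {
    map.iter()
        .flat_map(|(k, v)| even(v).map(|c| (*k, *c)))
        .collect()
}
```

In both functions the key is copied inside the inner map closure, so the copy only happens once the check has succeeded.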