Hashmap slower after deserialization - Why?

Question

Tsc*_*ger 5 java hashmap deserialization

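I have a pretty large Hashmap (~250 MB). Creating it takes about 50-55 seconds, so I decided to serialize it and save it to a file. Reading it back from the file takes about 16-17 seconds.

The only problem is that lookups seem to be slower this way. I always thought the hashmap is read from the file into memory, so performance should be the same as when I create the hashmap myself, right? Here is the code I use to read the hashmap from the file: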

File file = new File("omaha.ser");
FileInputStream f = new FileInputStream(file);
ObjectInputStream s = new ObjectInputStream(new BufferedInputStream(f));
omahaMap = (HashMap<Long, Integer>) s.readObject();
s.close();


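When I create the hashmap myself, 300 million lookups take about 3.1 seconds; when I read the same hashmap from the file, they take about 8.5 seconds. Does anybody have an idea why? Am I overlooking something obvious?

EDIT:

I "measured" the time by just taking timestamps with System.nanoTime(), so no proper benchmarking method was used. Here is the code: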

public class HandEvaluationTest
{
    public static void Test()
    {

        HandEvaluation.populate5Card();
        HandEvaluation.populate9CardOmaha();


        Card[] player1cards = {new Card("4s"), new Card("2s"), new Card("8h"), new Card("4d")};
        Card[] player2cards = {new Card("As"), new Card("9s"), new Card("6c"), new Card("2h")};
        Card[] player3cards = {new Card("9h"), new Card("7h"), new Card("Kc"), new Card("Kh")};
        Card[] table = {new Card("2d"), new Card("2c"), new Card("3c"), new Card("5c"), new Card("4h")};


        int j=0, k=0, l=0;
        long startTime = System.nanoTime();
        for(int p=0; p<100000000; p++)    {
            j = HandEvaluation.handEval9Hash(player1cards, table);
            k = HandEvaluation.handEval9Hash(player2cards, table);
            l = HandEvaluation.handEval9Hash(player3cards, table);
        }
        long estimatedTime = System.nanoTime() - startTime;
        System.out.println("Time needed: " + estimatedTime*Math.pow(10,-6) + "ms");
        System.out.println("Handstrength Player 1: " + j);
        System.out.println("Handstrength Player 2: " + k);
        System.out.println("Handstrength Player 3: " + l);
    }
}


The work for the large hashmap is done in HandEvaluation.populate9CardOmaha(). The 5-card one is small. The code for the large one:

 public static void populate9CardOmaha()
        {

            //Check if the hashmap is already there- then just read it and exit
            File hashmap = new File("omaha.ser");
            if(hashmap.exists())
            {
                try
                {
                    File file = new File("omaha.ser");
                    FileInputStream f = new FileInputStream(file);
                    ObjectInputStream s = new ObjectInputStream(new BufferedInputStream(f));
                    omahaMap = (HashMap<Long, Integer>) s.readObject();
                    s.close();
                }
                catch(IOException ioex) {ioex.printStackTrace();}
                catch(ClassNotFoundException cnfex)
                {
                    System.out.println("Class not found");
                    cnfex.printStackTrace();
                    return;
                }
                return;
            }

            // if it's not there, populate it yourself
            ... Code for populating hashmap ...
            // and then save it to file
            try
            {
                File file = new File("omaha.ser");
                FileOutputStream f = new FileOutputStream(file);
                ObjectOutputStream s = new ObjectOutputStream(new BufferedOutputStream(f));
                s.writeObject(omahaMap);
                s.close();
            }
            catch(IOException ioex) {ioex.printStackTrace();}
        }


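When I populate it myself (i.e. the file is not there), the lookups in HandEvaluationTest.Test() take about 8 seconds instead of 3. Maybe it is just my very naive way of measuring elapsed time?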
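A single pass timed with System.nanoTime() mixes JIT compilation and heap effects into the number. The following is a minimal sketch of a slightly more careful hand-rolled measurement: it discards a few warm-up rounds and then reports several timed rounds separately. The class name `LookupTiming`, the map size, the key pattern, and the round counts are all assumptions made for illustration, not part of the original code; a dedicated benchmarking harness would be more reliable still.

```java
import java.util.HashMap;
import java.util.Map;

public class LookupTiming {
    // Written after each round so the JIT cannot treat the lookup loop as dead code.
    public static long sink;

    // Builds a small stand-in map (hypothetical size and key pattern).
    public static Map<Long, Integer> buildMap(int size) {
        Map<Long, Integer> map = new HashMap<Long, Integer>();
        for (int i = 0; i < size; i++) {
            map.put((long) i * 31L, i);
        }
        return map;
    }

    // Performs `lookups` calls to get() and returns the elapsed time in nanoseconds.
    public static long timeLookups(Map<Long, Integer> map, int lookups, int size) {
        long sum = 0;
        long start = System.nanoTime();
        for (int i = 0; i < lookups; i++) {
            Integer value = map.get((long) (i % size) * 31L);
            if (value != null) {
                sum += value;
            }
        }
        long elapsed = System.nanoTime() - start;
        sink += sum;
        return elapsed;
    }

    public static void main(String[] args) {
        int size = 100000;
        Map<Long, Integer> map = buildMap(size);

        // Warm-up rounds: results are thrown away on purpose.
        for (int round = 0; round < 3; round++) {
            timeLookups(map, 1000000, size);
        }
        // Timed rounds: printed individually so outliers are visible.
        for (int round = 0; round < 5; round++) {
            long nanos = timeLookups(map, 1000000, size);
            System.out.println("round " + round + ": " + (nanos / 1e6) + " ms");
        }
    }
}
```

If the rounds differ widely from each other, the spread itself suggests that something other than HashMap.get() is being measured.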

Answer 1

Dee*_*ala 3

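The question was interesting, so I wrote my own test case to verify it. I found no difference in lookup speed between a live map and one loaded from a serialized file. The program is at the end of this post for anyone interested in running it.

The methods were monitored with JProfiler.
The serialized file is comparable to yours: ~230 MB.
In-memory lookups without any serialization took 1210 ms.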


After serializing the map and reading it back, the lookup cost stayed the same (well, almost: 1224 ms).


The profiler was tuned to add minimal overhead in both scenarios.
This was measured on Java(TM) SE Runtime Environment (build 1.6.0_25-b06) / 4 CPUs running at 1.7 GHz / 4 GB RAM at 800 MHz.
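The result is plausible because Java serialization writes a HashMap's entries and the map is rebuilt from them when it is read back, so the object you get is a regular in-memory HashMap. A minimal sketch of a round trip through a byte array that checks this; the class name `RoundTripCheck` and the sample data are assumptions for illustration only:

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;

public class RoundTripCheck {
    // Serializes the map to memory and reads it back as a new object.
    @SuppressWarnings("unchecked")
    public static HashMap<Long, Integer> roundTrip(HashMap<Long, Integer> map) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            ObjectOutputStream out = new ObjectOutputStream(bytes);
            out.writeObject(map);
            out.close();

            ObjectInputStream in =
                new ObjectInputStream(new ByteArrayInputStream(bytes.toByteArray()));
            try {
                return (HashMap<Long, Integer>) in.readObject();
            } finally {
                in.close();
            }
        } catch (IOException e) {
            throw new IllegalStateException(e);
        } catch (ClassNotFoundException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        HashMap<Long, Integer> original = new HashMap<Long, Integer>();
        for (int i = 0; i < 1000; i++) {
            original.put((long) i, i * 2);
        }
        HashMap<Long, Integer> copy = roundTrip(original);

        System.out.println("same contents: " + copy.equals(original));
        System.out.println("same class:    " + (copy.getClass() == original.getClass()));
        System.out.println("same instance: " + (copy == original));
    }
}
```

This only shows that the two maps are equivalent objects; it says nothing about how the rest of the heap looks after a large deserialization, which is where the timing differences may come from.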

Measuring is tricky. I myself noticed the 8-second lookup time that you describe, but guess what else I noticed when that happened.

GC activity.


Your measurements are probably picking that up too. If you isolate the measurement of Map.get() alone, you will see that the results are comparable.

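import java.io.BufferedInputStream;
import java.io.BufferedOutputStream;
import java.io.File;
import java.io.FileInputStream;
import java.io.FileOutputStream;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class GenericTest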
{
    public static void main(String... args)
    {
        // Call the methods as you please for a live Vs ser <-> de_ser run
    }

    private static Map<Long, Integer> generateHashMap()
    {
        Map<Long, Integer> map = new HashMap<Long, Integer>();
        final Random random = new Random();
        for(int counter = 0 ; counter < 10000000 ; counter++)
        {
            final int value = random.nextInt();
            final long key = random.nextLong();
            map.put(key, value);
        }
        return map;
    }

    private static void lookupItems(int n, Map<Long, Integer> map)
    {
        final Random random = new Random();
        for(int counter = 0 ; counter < n ; counter++)
        {
            final long key = random.nextLong();
            final Integer value = map.get(key);
        }
    }

    private static void serialize(Map<Long, Integer> map)
    {
        try
        {
            File file = new File("temp/omaha.ser");
            FileOutputStream f = new FileOutputStream(file);
            ObjectOutputStream s = new ObjectOutputStream(new BufferedOutputStream(f));
            s.writeObject(map);
            s.close();
        }
        catch (Exception e)
        {
            e.printStackTrace();
        }
    }

    private static Map<Long, Integer> deserialize()
    {
        try
        {
            File file = new File("temp/omaha.ser");
            FileInputStream f = new FileInputStream(file);
            ObjectInputStream s = new ObjectInputStream(new BufferedInputStream(f));
            HashMap<Long, Integer> map = (HashMap<Long, Integer>) s.readObject();
            s.close();
            return map;
        }
        catch (Exception e)
        {
            throw new RuntimeException(e);
        }
    }
}
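To check without a profiler whether garbage collection overlaps a timed region, one option is to read the JVM's collector counters before and after the lookups. The sketch below uses the standard `java.lang.management` API for that. The class name `GcAwareLookup`, the map size, and the fixed random seed are assumptions for illustration; keys are boxed ahead of time so that the timed loop itself allocates as little as possible.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.HashMap;
import java.util.Map;
import java.util.Random;

public class GcAwareLookup {
    // Total collections reported so far by all collectors in this JVM.
    public static long totalGcCount() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 if the collector does not report it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    // Approximate accumulated collection time in milliseconds, across all collectors.
    public static long totalGcTimeMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 if the collector does not report it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    // Looks up every key and returns the number of hits, so the loop has a used result.
    public static int countHits(Map<Long, Integer> map, Long[] keys) {
        int hits = 0;
        for (Long key : keys) {
            if (map.get(key) != null) {
                hits++;
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        Random random = new Random(42);
        Map<Long, Integer> map = new HashMap<Long, Integer>();
        Long[] keys = new Long[500000];
        for (int i = 0; i < keys.length; i++) {
            keys[i] = random.nextLong();
            map.put(keys[i], random.nextInt());
        }

        long gcCountBefore = totalGcCount();
        long gcTimeBefore = totalGcTimeMillis();
        long start = System.nanoTime();
        int hits = countHits(map, keys);
        long elapsedNanos = System.nanoTime() - start;

        System.out.println("hits: " + hits);
        System.out.println("lookup time: " + (elapsedNanos / 1e6) + " ms");
        System.out.println("collections during lookups: " + (totalGcCount() - gcCountBefore));
        System.out.println("GC time during lookups: " + (totalGcTimeMillis() - gcTimeBefore) + " ms");
    }
}
```

A non-zero collection count for the timed region means the elapsed time includes GC work and is not a clean measurement of Map.get().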

