Cache coherency: Threads vs Cores

Ima*_*mak 3 concurrency multithreading caching

I am currently studying concurrent systems, and I've become a little confused with the concept of cache coherency when working with multiple threads and multiple cores at the same time.

Some assumptions as I understand:

  • Cores have caches
  • Cores may have multiple threads at one time (if hyperthreaded)
  • A thread is a single line of commands that are getting processed
  • Thus, threads are not physical hardware and threads don't have caches and use the core's cache

Suppose we have two threads, and x is a shared variable with value five. Both want to execute:

my_y = x;

Where my_y is a private variable defined by both threads. Now suppose thread 0 executes:

x++;

Finally, suppose that thread 1 now executes:

my_z = x;

Where my_z is another private variable.

My book says the following:

What's the value in my_z? Is it five? Or is it six? The problem is that there are (at least) three copies of x: the one in main memory, the one in thread 0's cache, and the one in thread 1's cache.

How does this work? How are there at least three copies of x and why does the book specify that each thread has its own cache? To me, it would make sense that the core which is running the two threads has the value of x in its cache and thus both threads have the value in "their" (shared) cache.

In other words, when x++ is updated, the value in the core's cache would be updated. Then, thread 1 would execute my_z = x; which is still in the core's cache and it is up to date. Thus, there would be no coherency issue because the two threads basically share the cache.

It could be that the book assumes that each core has only one thread, but the book did previously mention something about "if there are more threads than cores". Does "if there are more threads than cores" imply that a core has more than one thread (hyperthreading) or is there some sort of thread scheduling happening so that each core only has one thread at a time?

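Even if this is the case (the cores are scheduled, and a core can only have one thread at a time), suppose a core has thread 0, executes x++, and then gets thread 1, which tries to execute my_z = x;. If I'm not mistaken, the value of x is still in that core's cache.

Bonus question: how are threads' private variables stored in memory? Are they stored the same way as any other variable, being copied into the core's cache when they are used? If so, wouldn't it be a problem to have a private variable in the core's cache when multiple threads are using that cache (whether simultaneously or by scheduling)?

As requested by @biziclop, the book states the following assumptions:

  • We use MIMD systems, that is, the nodes have the same architecture. (The book doesn't specify which architecture this is, though.)
  • Our programs are SPMD. Thus, we write a single program that can use branching to have multiple different behaviors.
  • We assume the cores are identical but operate asynchronously.
  • We program in the C language, and in this section we focus on Pthreads.

Any help would be much appreciated!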

Sol*_*low 5

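Why does the book specify that each thread has its own cache?

The author is being sloppy. Threads don't have caches. The processor cores that the threads run on have caches.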
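To make the question's scenario concrete, here is a minimal Pthreads sketch. It assumes cache-coherent hardware and uses a mutex and condition variable to force the increment to happen before the read; with that ordering in place, thread 1 reads six no matter which core either thread runs on. The names `run_scenario`, `incremented`, `lock`, and `cond` are made up for illustration and do not come from the book, and thread 1's initial `my_y = x;` is left out for brevity.

```c
#include <assert.h>
#include <pthread.h>

static int x = 5;               /* the shared variable from the question */
static int incremented = 0;     /* predicate: has thread 0 done x++ yet? */
static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t cond = PTHREAD_COND_INITIALIZER;

static void *thread0(void *arg) {
    (void)arg;
    pthread_mutex_lock(&lock);
    int my_y = x;               /* private: a local on thread 0's stack */
    (void)my_y;
    x++;
    incremented = 1;
    pthread_cond_signal(&cond);
    pthread_mutex_unlock(&lock);
    return NULL;
}

static void *thread1(void *arg) {
    int *my_z = arg;
    pthread_mutex_lock(&lock);
    while (!incremented)        /* wait until thread 0 has done x++ */
        pthread_cond_wait(&cond, &lock);
    *my_z = x;
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Runs the scenario once; returns thread 1's my_z, or -1 on failure. */
int run_scenario(void) {
    pthread_t t0, t1;
    int my_z = -1;

    x = 5;                      /* reset so the function can be called again */
    incremented = 0;

    if (pthread_create(&t0, NULL, thread0, NULL) != 0)
        return -1;
    if (pthread_create(&t1, NULL, thread1, &my_z) != 0) {
        pthread_join(t0, NULL);
        return -1;
    }
    pthread_join(t0, NULL);
    pthread_join(t1, NULL);
    assert(incremented == 1);   /* sanity check: thread 0 really ran */
    return my_z;
}
```

Calling `run_scenario()` from `main` (compiled with `-pthread`) returns 6. Without the lock and the wait, the program would have a data race and the result would not be defined by C or POSIX, which is a separate issue from whether the hardware keeps its caches coherent.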

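the book did previously [say] "if there are more threads than cores". Does [that] imply that a core has more than one thread (hyperthreading), or is there some sort of thread scheduling happening so that each core only has one thread at a time?

Either of those things could be true. We've already established that the author's language is a bit sloppy, so with that quote taken out of context, there is no way to tell whether it is talking about more hardware threads than cores or more software threads than cores.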
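The software-thread reading is easy to try out. The sketch below asks the OS how many logical CPUs (hardware threads) are online and then starts a few more software threads than that, leaving it to the scheduler to time-slice them. `_SC_NPROCESSORS_ONLN` is widely supported (glibc and macOS, for example) but is not guaranteed on every POSIX system, so treat it as an assumption; `oversubscribe` and `work` are hypothetical names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>
#include <unistd.h>

static void *work(void *arg) {
    int *done = arg;
    *done = 1;                  /* each thread writes only its own slot */
    return NULL;
}

/* Starts `extra` more software threads than there are logical CPUs.
 * Returns how many threads ran to completion, or -1 on failure. */
long oversubscribe(long extra) {
    assert(extra >= 0);

    long ncpu = sysconf(_SC_NPROCESSORS_ONLN);  /* logical CPUs online */
    if (ncpu < 1)
        ncpu = 1;               /* fall back if the query is unsupported */
    long n = ncpu + extra;

    pthread_t *tids = malloc((size_t)n * sizeof *tids);
    int *done = calloc((size_t)n, sizeof *done);
    if (tids == NULL || done == NULL) {
        free(tids);
        free(done);
        return -1;
    }

    long started = 0;
    while (started < n &&
           pthread_create(&tids[started], NULL, work, &done[started]) == 0)
        started++;

    long finished = 0;
    for (long i = 0; i < started; i++) {
        pthread_join(tids[i], NULL);
        finished += done[i];
    }

    free(tids);
    free(done);
    return started == n ? finished : -1;
}
```

Every thread completes even though there are more of them than logical CPUs, because the kernel moves software threads on and off the hardware threads as needed.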

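How are threads' private variables stored in memory?

All of the threads in a process see exactly the same virtual address space. In the broadest sense, "private" merely describes a memory location that is used by only one thread, and it doesn't matter why the location is used by only one thread.

In a narrower sense, each thread has a stack of function activation records (a.k.a. the "call stack") that contains the args and local variables of all active function calls. In many programming languages, it is impossible for one thread to share its args or locals with any other thread, so those memory locations are automatically "private". In other programming languages, it is possible to share an arg or a local, but the programmer has to write explicit code to share it, and in any case, it's probably not a good idea.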
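A small Pthreads sketch can show both senses at once: every thread sees the global `x` at the same address, while each thread's local `my_y` lives at a different address on that thread's own stack. The `report` struct, `compare_threads`, and the rendezvous variables are hypothetical names for this illustration. The rendezvous only keeps both threads alive at the same moment, so that neither stack can be reused before the addresses are compared.

```c
#include <assert.h>
#include <pthread.h>
#include <stdint.h>

static int x = 5;   /* shared: one copy, visible to every thread */

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t all_here = PTHREAD_COND_INITIALIZER;
static int arrived = 0;

struct report {
    uintptr_t shared_addr;  /* where this thread sees x */
    uintptr_t local_addr;   /* where this thread's my_y lives */
    int local_value;
};

static void *worker(void *arg) {
    struct report *r = arg;
    int my_y = x;           /* "private": a local on this thread's stack */

    r->shared_addr = (uintptr_t)&x;
    r->local_addr = (uintptr_t)&my_y;
    r->local_value = my_y;

    /* Rendezvous: do not return until both threads have reached here. */
    pthread_mutex_lock(&lock);
    arrived++;
    pthread_cond_broadcast(&all_here);
    while (arrived < 2)
        pthread_cond_wait(&all_here, &lock);
    pthread_mutex_unlock(&lock);
    return NULL;
}

/* Fills out[0] and out[1]; returns 0 on success, -1 on failure. */
int compare_threads(struct report out[2]) {
    assert(out != NULL);
    pthread_t t[2];

    arrived = 0;
    if (pthread_create(&t[0], NULL, worker, &out[0]) != 0)
        return -1;
    if (pthread_create(&t[1], NULL, worker, &out[1]) != 0) {
        /* Release thread 0, which would otherwise wait forever. */
        pthread_mutex_lock(&lock);
        arrived = 2;
        pthread_cond_broadcast(&all_here);
        pthread_mutex_unlock(&lock);
        pthread_join(t[0], NULL);
        return -1;
    }
    pthread_join(t[0], NULL);
    pthread_join(t[1], NULL);
    return 0;
}
```

After a successful call, `out[0].shared_addr` equals `out[1].shared_addr`, while the two `local_addr` values differ. Nothing in the hardware marks `my_y` as private: it is ordinary memory that passes through the caches like any other, and it is private only because no other thread was handed its address.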

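would it be a problem to have a private variable in the core's cache if multiple threads are using the cache (whether simultaneously or by scheduling)?

When two different memory locations both hash to the same cache location, that's called a collision. And yes! Collisions sometimes happen. If some cache line contains variable X, and thread T wants to access variable Y, which happens to use the same cache line, then the memory system will make thread T wait while it fetches the data from main memory.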

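A related phenomenon, in which threads on different cores repeatedly invalidate each other's copy of a cache line because their separate variables happen to share that line, is called "false sharing" (usually when it becomes a problem). If you determine that it really is degrading your program's performance, you can search for strategies to avoid it.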
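One common strategy against false sharing is to align per-thread data so that each item gets a cache line to itself. The sketch below assumes C11 and a 64-byte cache line; that size is an assumption, typical of current x86-64 parts but not universal. `CACHE_LINE`, `ITERS`, the two structs, and the `run_*` functions are illustrative names.

```c
#include <assert.h>
#include <pthread.h>
#include <stdalign.h>
#include <stddef.h>

#define CACHE_LINE 64       /* ASSUMPTION: line size in bytes */
#define ITERS 1000000L

/* Two counters that will very likely share one cache line. */
struct packed_counters {
    long a;
    long b;
};

/* The same two counters, each aligned to the start of its own line. */
struct padded_counters {
    alignas(CACHE_LINE) long a;
    alignas(CACHE_LINE) long b;
};

static_assert(offsetof(struct padded_counters, b) -
              offsetof(struct padded_counters, a) >= CACHE_LINE,
              "padded counters must be at least one line apart");

static void *bump(void *arg) {
    /* volatile so that every increment really goes through memory */
    volatile long *counter = arg;
    for (long i = 0; i < ITERS; i++)
        (*counter)++;
    return NULL;
}

/* Increments *a and *b from two threads; returns their sum or -1. */
static long run_pair(long *a, long *b) {
    pthread_t ta, tb;
    *a = 0;
    *b = 0;
    if (pthread_create(&ta, NULL, bump, a) != 0)
        return -1;
    if (pthread_create(&tb, NULL, bump, b) != 0) {
        pthread_join(ta, NULL);
        return -1;
    }
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);
    return *a + *b;
}

long run_packed(void) {
    struct packed_counters c;
    return run_pair(&c.a, &c.b);
}

long run_padded(void) {
    struct padded_counters c;
    return run_pair(&c.a, &c.b);
}
```

Both functions return the same total, since each thread touches only its own counter and there is no data race. Only the running time can differ, so any speedup from the padded layout should be measured on the target machine rather than assumed.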