The objectives of this module are to discuss the performance of symmetric shared memory multiprocessors in terms of true sharing and false sharing misses, and to elaborate on the directory based cache coherence protocol.

In the previous module, we discussed the cache coherence problem and pointed out that there are basically two types of cache coherence protocols. As a recap, the two types are given below:

1. Directory based: The sharing status of a block of physical memory is kept in just one location, called the directory. The directory can also be distributed to improve scalability. Communication is established using point-to-point requests through the interconnection network.
2. Snoop based: Every cache that has a copy of the data from a block of physical memory also has a copy of the sharing status of the block, but no centralized state is kept. The caches are all accessible via some broadcast medium (a bus or switch), and all cache controllers monitor or snoop on the medium to determine whether or not they have a copy of a block that is requested on a bus or switch access. This approach requires broadcast, since the caching information is held at the processors, and it is useful for small scale machines (most of the market).

The previous module discussed the snoop based protocol in detail. In this module, we will focus on the performance of symmetric shared memory multiprocessors and then elaborate on the directory based approach.

Performance of symmetric shared memory multiprocessors: In a multiprocessor system, several factors affect the performance. We have already looked at the three Cs that contribute to the misses in a uniprocessor system: capacity, conflict and compulsory. In addition to these, a multiprocessor system has a fourth category of misses, called coherence misses. These are the misses caused by inter-processor communication that is needed to maintain coherence. Coherence misses can be broken into two separate sources.

The first source is true sharing misses, which arise from the communication of data through the cache coherence mechanism. In an invalidation based protocol, the first write by a processor to a shared cache block causes an invalidation to establish ownership of that block. Additionally, when another processor attempts to read a modified word in that cache block, a miss occurs and the resultant block is transferred. Both these misses are classified as true sharing misses, since they arise directly from the sharing of data among processors.

The second effect, called false sharing, arises from the use of an invalidation based coherence algorithm with a single valid bit per cache block. False sharing occurs when a block is invalidated (and a subsequent reference causes a miss) because some word in the block, other than the one being read, is written into. If the word written into is actually used by the processor that received the invalidate, then the reference was a true sharing reference and would have caused a miss independent of the block size. If, however, the word being written and the word read are different, and the invalidation does not cause a new value to be communicated but only causes an extra cache miss, then it is a false sharing miss. In a false sharing miss, the block is shared, but no word in the block is actually shared, and the miss would not occur if the block size were a single word.

The following example in Figure 34.1 makes the sharing patterns clear. Let us assume that both X1 and X2 are in the same cache block, and that processors P1 and P2 have both read X1 and X2 before. We shall see what happens for the sequence of operations shown in the figure and classify each of them as a true sharing miss or a false sharing miss.

In the first instance, processor P1 modifies X1. This event is a true sharing miss, since X1 was read by P2 and needs to be invalidated from P2. In the second instance, P2 reads X2, which was earlier invalidated by P1. This event is a false sharing miss, since X2 was invalidated by the write of X1 in P1, but that value of X1 is not used in P2. The third instance is again a false sharing miss, since the block containing X1 is marked shared due to the read in P2, but P2 did not read X1.
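The classification of coherence misses into true sharing and false sharing described in this module can be illustrated with a small simulation. The sketch below is not a full coherence protocol: it models one cache block and an invalidation based scheme, and every name in it (`Copy`, `classify`, `used`, `stale`) is hypothetical. It also assumes that the third operation in the X1/X2 example is a second write to X1 by P1, which is consistent with the explanation given for that event but is not spelled out in the text.

```python
from dataclasses import dataclass, field


@dataclass
class Copy:
    """One processor's copy of the single cache block being modelled."""
    valid: bool = True        # does this cache currently hold the block?
    exclusive: bool = False   # does it own the block (may write without a miss)?
    # Words this processor touched since it last fetched the block or
    # acquired ownership of it.
    used: set = field(default_factory=set)
    # Words written by other processors since this copy was invalidated.
    stale: set = field(default_factory=set)


def classify(initial_words, ops, processors=("P1", "P2")):
    """Label each (processor, 'read'|'write', word) operation.

    Assumed rule: an event is a true sharing miss if it fetches a word
    another processor wrote, or invalidates a word another processor has
    actually used; any other coherence event is a false sharing miss.
    Only coherence events are modelled (no cold, capacity or conflict misses).
    """
    # Every processor starts with a valid shared copy, having read all words.
    copies = {p: Copy(used=set(initial_words)) for p in processors}
    labels = []
    for proc, kind, word in ops:
        me = copies[proc]
        others = [c for p, c in copies.items() if p != proc]
        if me.valid and (kind == "read" or me.exclusive):
            label = "hit"
        else:
            fetches_new_value = (not me.valid) and word in me.stale
            kills_used_word = kind == "write" and any(
                c.valid and word in c.used for c in others
            )
            if fetches_new_value or kills_used_word:
                label = "true sharing"
            else:
                label = "false sharing"
            # The block (or its ownership) is freshly acquired.
            me.valid, me.used, me.stale = True, set(), set()
        me.used.add(word)
        if kind == "write":
            me.exclusive = True
            for c in others:
                if c.valid:
                    c.valid, c.stale = False, set()  # invalidate remote copy
                c.exclusive = False
                c.stale.add(word)  # remember which word changed
        else:
            for c in others:
                c.exclusive = False  # a remote read downgrades the owner
        labels.append(label)
    return labels


# X1 and X2 share a block; both processors have read both words before.
events = [("P1", "write", "X1"), ("P2", "read", "X2"), ("P1", "write", "X1")]
labels = classify({"X1", "X2"}, events)
for (proc, kind, word), label in zip(events, labels):
    print(f"{proc} {kind} {word}: {label}")
```

Under these assumptions the sketch labels the three events as true sharing, false sharing and false sharing, which agrees with the classification in the text. Tracking `used` and `stale` per word is what separates the two cases: with a block size of one word, the `stale` and `used` checks would always concern the referenced word itself, so no event could be labelled false sharing.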