% Intel Cache Allocation Technology and Code and Data Prioritization Features
% Revision 1.17

\clearpage

# Basics

---------------- ----------------------------------------------------
         Status: **Tech Preview**

Architecture(s): Intel x86

   Component(s): Hypervisor, toolstack

       Hardware: L3 CAT: Haswell and beyond CPUs
                 CDP   : Broadwell and beyond CPUs
                 L2 CAT: Atom codename Goldmont and beyond CPUs
---------------- ----------------------------------------------------

# Terminology

* CAT         Cache Allocation Technology
* CBM         Capacity BitMasks
* CDP         Code and Data Prioritization
* CMT         Cache Monitoring Technology
* COS/CLOS    Class of Service
* MSRs        Model Specific Registers
* PSR         Intel Platform Shared Resource

# Overview

Intel provides a set of allocation capabilities including Cache Allocation
Technology (CAT) and Code and Data Prioritization (CDP).

CAT allows an OS or hypervisor to control allocation of a CPU's shared cache
based on application/domain priority or Class of Service (COS). Each COS is
configured using capacity bitmasks (CBMs) which represent cache capacity and
indicate the degree of overlap and isolation between classes. Once CAT is
configured, the processor allows access to portions of cache according to the
established COS. The Intel Xeon processor E5 v4 family (and some others)
introduces capabilities to configure and make use of the CAT mechanism on the
L3 cache. The Intel Goldmont processor provides support for control over the
L2 cache.

Code and Data Prioritization (CDP) Technology is an extension of CAT. CDP
enables isolation and separate prioritization of code and data fetches to
the L3 cache in a SW configurable manner, which can enable workload
prioritization and tuning of cache capacity to the characteristics of the
workload. CDP extends CAT by providing separate code and data masks per Class
of Service (COS). When SW enables CDP, L3 CAT is disabled.

# User details

* Feature Enabling:

    Add "psr=cat" to the boot line parameters to enable CAT at every supported
    cache level. Add "psr=cdp" to enable L3 CDP, which disables L3 CAT by SW.

* xl interfaces:

    1. `psr-cat-show [OPTIONS] domain-id`:

        Show L2 CAT or L3 CAT/CDP CBM of the domain designated by Xen domain-id.

        Option `-l`:

        `-l2`: Show cbm for L2 cache.

        `-l3`: Show cbm for L3 cache.

        If `-lX` is specified and LX is not supported, an error is printed.
        If no `-l` is specified, level 3 is the default option.

    2. `psr-cat-set [OPTIONS] domain-id cbm`:

        Set L2 CAT or L3 CAT/CDP CBM of the domain designated by Xen domain-id.

        Option `-s`: Specify the socket to process, otherwise all sockets are
        processed.

        Option `-l`:

        `-l2`: Specify cbm for L2 cache.

        `-l3`: Specify cbm for L3 cache.

        If `-lX` is specified and LX is not supported, an error is printed.
        If no `-l` is specified, level 3 is the default option.

        Option `-c` or `-d`:

        `-c`: Set L3 CDP code cbm.

        `-d`: Set L3 CDP data cbm.

    3. `psr-hwinfo [OPTIONS]`:

        Show CMT & L2 CAT & L3 CAT/CDP HW information on every socket.

        Option `-m, --cmt`: Show Cache Monitoring Technology (CMT) hardware
        info.

        Option `-a, --cat`: Show CAT/CDP hardware info.

# Technical details

L3 CAT/CDP and L2 CAT are all members of the Intel PSR features; they share the
base PSR infrastructure in Xen.

## Hardware perspective

CAT/CDP defines a range of MSRs that hold cache access patterns known as CBMs.
Each CBM is associated with a COS.

E.g. L2 CAT:

                            +----------------------------+----------------+
       IA32_PQR_ASSOC       | MSR (per socket)           |    Address     |
     +----+---+-------+     +----------------------------+----------------+
     |    |COS|       |     | IA32_L2_QOS_MASK_0         |     0xD10      |
     +----+---+-------+     +----------------------------+----------------+
            +-------------> | ...                        |  ...           |
                            +----------------------------+----------------+
                            | IA32_L2_QOS_MASK_n         | 0xD10+n (n<64) |
                            +----------------------------+----------------+

L3 CAT/CDP uses a range of MSRs from 0xC90 ~ 0xC90+n (n<128).

L2 CAT uses a range of MSRs from 0xD10 ~ 0xD10+n (n<64), following the L3
CAT/CDP MSRs. Setting L2 cache access patterns that differ from those of the
L3 cache is supported.
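
The two MSR ranges above amount to a simple address calculation. The sketch
below is illustrative only: the macro and function names are hypothetical (they
are not Xen identifiers), and the bases and limits are the ones quoted in this
section.

```c
#include <stdint.h>

/* Base addresses and range limits quoted in the text above. */
#define L3_QOS_MASK_BASE  0xC90u
#define L3_QOS_MASK_COUNT 128u
#define L2_QOS_MASK_BASE  0xD10u
#define L2_QOS_MASK_COUNT 64u

/*
 * Hypothetical helpers: return the mask MSR address for index n,
 * or 0 when n lies outside the range given in the text.
 */
static inline uint32_t l3_qos_mask_msr(unsigned int n)
{
    return n < L3_QOS_MASK_COUNT ? L3_QOS_MASK_BASE + n : 0;
}

static inline uint32_t l2_qos_mask_msr(unsigned int n)
{
    return n < L2_QOS_MASK_COUNT ? L2_QOS_MASK_BASE + n : 0;
}
```

The last L3 mask MSR (index 127) is 0xD0F, so the L2 range starting at 0xD10
follows it directly.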

Every MSR stores a CBM value. A capacity bitmask (CBM) provides a hint to the
hardware indicating the cache space a domain should be limited to as well as
providing an indication of overlap and isolation in the CAT-capable cache from
other domains contending for the cache.

Sample cache capacity bitmasks for a bitlength of 8 are shown below. Please
note that all (and only) contiguous '1' combinations are allowed (e.g. FFFFH,
0FF0H, 003CH, etc.).

           +----+----+----+----+----+----+----+----+
           | M7 | M6 | M5 | M4 | M3 | M2 | M1 | M0 |
           +----+----+----+----+----+----+----+----+
      COS0 | A  | A  | A  | A  | A  | A  | A  | A  | Default Bitmask
           +----+----+----+----+----+----+----+----+
      COS1 | A  | A  | A  | A  | A  | A  | A  | A  |
           +----+----+----+----+----+----+----+----+
      COS2 | A  | A  | A  | A  | A  | A  | A  | A  |
           +----+----+----+----+----+----+----+----+

           +----+----+----+----+----+----+----+----+
           | M7 | M6 | M5 | M4 | M3 | M2 | M1 | M0 |
           +----+----+----+----+----+----+----+----+
      COS0 | A  | A  | A  | A  | A  | A  | A  | A  | Overlapped Bitmask
           +----+----+----+----+----+----+----+----+
      COS1 |    |    |    |    | A  | A  | A  | A  |
           +----+----+----+----+----+----+----+----+
      COS2 |    |    |    |    |    |    | A  | A  |
           +----+----+----+----+----+----+----+----+

           +----+----+----+----+----+----+----+----+
           | M7 | M6 | M5 | M4 | M3 | M2 | M1 | M0 |
           +----+----+----+----+----+----+----+----+
      COS0 | A  | A  | A  | A  |    |    |    |    | Isolated Bitmask
           +----+----+----+----+----+----+----+----+
      COS1 |    |    |    |    | A  | A  |    |    |
           +----+----+----+----+----+----+----+----+
      COS2 |    |    |    |    |    |    | A  | A  |
           +----+----+----+----+----+----+----+----+
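
The contiguity rule and the overlapped/isolated cases in the tables above can
be checked with a few bit operations. This is a minimal sketch, not the Xen
implementation: the function names are hypothetical, and treating an empty
mask as invalid is an assumption made here.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * True if cbm is non-zero, fits within cbm_len bits, and its set bits
 * form a single contiguous run.
 */
static inline bool cbm_is_contiguous(uint32_t cbm, unsigned int cbm_len)
{
    uint32_t full, lowest;

    if (cbm_len == 0 || cbm_len > 32)
        return false;

    full = (cbm_len == 32) ? UINT32_MAX : ((1u << cbm_len) - 1);
    if (cbm == 0 || (cbm & ~full))
        return false;

    /*
     * Adding the lowest set bit to a contiguous run carries through the
     * whole run and clears it; any bit of cbm that survives the addition
     * belongs to a second, separate run.
     */
    lowest = cbm & (~cbm + 1u);
    return ((cbm + lowest) & cbm) == 0;
}

/* Two masks overlap when they share at least one cache portion. */
static inline bool cbm_overlaps(uint32_t a, uint32_t b)
{
    return (a & b) != 0;
}
```

With the masks from the tables, COS1 and COS2 of the overlapped case (0x0f and
0x03) share bits, while those of the isolated case (0x0c and 0x03) do not.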

We can get the CBM length through CPUID. The default value of CBM is calculated
by `(1ull << cbm_len) - 1`, which is a fully open (all ones) bitmask. COS\[0\]
always stores the default value and is never changed.
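
As an illustration of the calculation above, the sketch below decodes the CBM
length and maximum COS from raw CPUID register values and derives the default
CBM. It assumes the CPUID leaf 0x10 layout described in the Intel SDM
(EAX\[4:0\] holds the CBM length minus one, EDX\[15:0\] holds the highest COS
number); the struct and function names are hypothetical and not Xen types.

```c
#include <stdint.h>

/* Hypothetical container for the decoded values. */
struct cat_hw_info {
    unsigned int cbm_len;   /* number of bits in a CBM */
    unsigned int cos_max;   /* highest COS number supported */
    uint64_t default_cbm;   /* fully open mask, all ones */
};

/*
 * Decode the EAX and EDX values returned by the CAT sub-leaf of CPUID
 * leaf 0x10 (layout assumed from the Intel SDM, see the lead-in).
 */
static inline struct cat_hw_info cat_decode_cpuid(uint32_t eax, uint32_t edx)
{
    struct cat_hw_info info;

    info.cbm_len = (eax & 0x1f) + 1;
    info.cos_max = edx & 0xffff;
    info.default_cbm = (1ull << info.cbm_len) - 1;

    return info;
}
```

Taking the register values as parameters, rather than executing CPUID inside
the helper, keeps the sketch portable and easy to exercise with sample values.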

There is an `IA32_PQR_ASSOC` register which stores the COS ID of the VCPU. HW
enforces cache allocation according to the corresponding CBM.
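
A minimal sketch of how a COS ID could be packed into an `IA32_PQR_ASSOC`
value follows. It assumes the layout given in the Intel SDM, where the COS
occupies the upper 32 bits and the lower 32 bits (which carry the monitoring
RMID) must be preserved; the helper names are hypothetical.

```c
#include <stdint.h>

#define PQR_ASSOC_COS_SHIFT 32   /* COS field position, assumed from the SDM */

/* Return old_val with its COS field replaced and its low half untouched. */
static inline uint64_t pqr_assoc_with_cos(uint64_t old_val, uint32_t cos)
{
    return (old_val & 0xffffffffull) | ((uint64_t)cos << PQR_ASSOC_COS_SHIFT);
}

/* Extract the COS ID from a register value. */
static inline uint32_t pqr_assoc_get_cos(uint64_t val)
{
    return (uint32_t)(val >> PQR_ASSOC_COS_SHIFT);
}
```

On a context switch, a hypervisor would write the resulting value to the MSR
of the thread that is about to run the VCPU.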

## The relationship between L3 CAT/CDP and L2 CAT

HW may support all features. By default, CDP is disabled on the processor.
If the L3 CAT MSRs are used without enabling CDP, the processor operates in
a traditional CAT-only mode. When CDP is enabled:

* the CAT mask MSRs are re-mapped into interleaved pairs of mask MSRs for
  data or code fetches.

* the range of COS for CAT is re-indexed, with the lower-half of the COS
  range available for CDP.
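
The re-mapping described in the two points above can be sketched as follows.
The data-first ordering within each pair (even MSR for data, odd MSR for code)
is taken from the Intel SDM rather than from this document, and all names are
hypothetical.

```c
#include <stdint.h>

#define L3_QOS_MASK_BASE 0xC90u   /* first L3 mask MSR */

/* With CDP enabled, COS n uses the MSR pair at offsets 2n (data), 2n+1 (code). */
static inline uint32_t cdp_data_mask_msr(unsigned int cos)
{
    return L3_QOS_MASK_BASE + 2 * cos;
}

static inline uint32_t cdp_code_mask_msr(unsigned int cos)
{
    return L3_QOS_MASK_BASE + 2 * cos + 1;
}

/* Each COS consumes two mask MSRs, so only half as many COS remain usable. */
static inline unsigned int cdp_cos_count(unsigned int cat_cos_count)
{
    return cat_cos_count / 2;
}
```

For example, hardware offering 16 COS in CAT-only mode would offer 8 COS with
CDP enabled, and COS 3 would then use the MSRs at 0xC96 (data) and 0xC97
(code).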

L2 CAT is independent of L3 CAT/CDP, which means L2 CAT can be enabled while
L3 CAT/CDP is disabled, or L2 CAT and L3 CAT/CDP can both be enabled.

As a requirement, the bits of a CAT/CDP CBM must be contiguous.

N.B. L2 CAT and L3 CAT/CDP share the same COS field in the same association
register `IA32_PQR_ASSOC`, which means one COS is associated with a pair of
L2 CAT CBM and L3 CAT/CDP CBM.

Besides, the max COS of L2 CAT may be different from that of L3 CAT/CDP (or
other PSR features in future). In some cases, a domain is permitted to have a
COS that is beyond the max COS of one (or more) PSR features but within that
of the others. For instance, assume the max COS of L2 CAT is 8 but the max COS
of L3 CAT is 16. When a domain is assigned COS 9, the L3 CAT CBM associated
with COS 9 is enforced, but for L2 CAT the HW behaves as if the default value
were set, since COS 9 is beyond the max COS (8) of L2 CAT.
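
The out-of-range behaviour in the example above can be modelled with a small
lookup. This is only a model of the described hardware behaviour, using a
hypothetical structure loosely based on the `cos_max` and `cos_reg_val` fields
introduced later in this document.

```c
#include <stdint.h>

/* Hypothetical per-feature view: max COS plus the CBM stored for each COS. */
struct feat_sketch {
    unsigned int cos_max;
    const uint32_t *cos_reg_val;   /* indexed by COS ID, entry 0 is the default */
};

/*
 * CBM that takes effect for this feature when a domain runs with `cos`.
 * A COS beyond the feature's max COS falls back to the default CBM.
 */
static inline uint32_t effective_cbm(const struct feat_sketch *feat,
                                     unsigned int cos)
{
    return cos <= feat->cos_max ? feat->cos_reg_val[cos]
                                : feat->cos_reg_val[0];
}
```

With an L2 feature whose max COS is 8 and an L3 feature whose max COS is 16,
looking up COS 9 yields the stored CBM for L3 and the default CBM for L2.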

## Design Overview

* Core COS/CBM association

    When enforcing CAT/CDP, all cores of domains have the same default COS
    (COS0) which is associated with the fully open CBM (all ones bitmask) to
    access all cache. The default COS is used only in the hypervisor and is
    transparent to the tool stack and the user.

    The system administrator can change the PSR allocation policy at runtime
    through the tool stack. Since L2 CAT shares COS with L3 CAT/CDP, a COS
    corresponds to a 2-tuple, like \[L2 CBM, L3 CBM\], when only CAT is
    enabled. When CDP is enabled, one COS corresponds to a 3-tuple, like
    \[L2 CBM, L3 Code_CBM, L3 Data_CBM\]. If neither L3 CAT nor L3 CDP is
    enabled, things are simpler: one COS corresponds to one L2 CBM.

* VCPU schedule

    When a context switch happens, the COS of the VCPU is written to the
    per-thread MSR `IA32_PQR_ASSOC`, and then hardware enforces cache
    allocation according to the corresponding CBM.

* Multi-sockets

    Different sockets may have different CAT/CDP capabilities (e.g. max COS),
    although the capability is consistent within a socket. So the CAT/CDP
    capability is specified per socket.

    'psr-cat-set' can set the CBM for one domain per socket. On each socket, we
    maintain a COS array for all domains. One domain uses one COS at a time,
    and that COS stores the CBM in effect for the domain. So, when a VCPU of
    the domain is migrated from socket 1 to socket 2, it follows the
    configuration on socket 2.

    E.g. user sets domain 1 CBM on socket 1 to 0x7f which uses COS 9 but sets
    domain 1 CBM on socket 2 to 0x3f which uses COS 7. When a VCPU of this
    domain is migrated from socket 1 to 2, the COS ID used is 7, which means
    0x3f is now the CBM in effect for domain 1.
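
The multi-socket example above can be sketched with a per-domain COS ID for
each socket and a per-socket CBM table. The types, sizes and names below are
hypothetical simplifications and do not mirror the actual Xen data structures.

```c
#include <stdint.h>

#define SKETCH_NR_SOCKETS 2u    /* hypothetical sizes for the example */
#define SKETCH_NR_COS     16u

/* Per-socket table: CBM programmed for each COS ID. */
struct sketch_socket_cat {
    uint32_t cbm[SKETCH_NR_COS];
};

/* Per-domain state: the COS ID the domain uses on each socket. */
struct sketch_domain {
    unsigned int cos_id[SKETCH_NR_SOCKETS];
};

/* CBM in effect for a domain's VCPU while it runs on the given socket. */
static inline uint32_t domain_cbm_on_socket(
    const struct sketch_socket_cat *sockets,
    const struct sketch_domain *d,
    unsigned int socket)
{
    return sockets[socket].cbm[d->cos_id[socket]];
}
```

Indexing the two sockets of the example as 0 and 1, a domain holding COS 9
(CBM 0x7f) on the first and COS 7 (CBM 0x3f) on the second resolves to 0x3f
once its VCPU has migrated to the second socket.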

## Implementation Description

* Hypervisor interfaces:

    1. Boot line parameter "psr=cat" enables L2 CAT and L3 CAT if supported by
       hardware. "psr=cdp" enables CDP if supported by hardware.

    2. SYSCTL:

        * XEN_SYSCTL_PSR_CAT_get_l3_info: Get L3 CAT/CDP information.
        * XEN_SYSCTL_PSR_CAT_get_l2_info: Get L2 CAT information.

    3. DOMCTL:

        * XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM: Get L3 CBM for a domain.
        * XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM: Set L3 CBM for a domain.
        * XEN_DOMCTL_PSR_CAT_OP_GET_L3_CODE: Get CDP Code CBM for a domain.
        * XEN_DOMCTL_PSR_CAT_OP_SET_L3_CODE: Set CDP Code CBM for a domain.
        * XEN_DOMCTL_PSR_CAT_OP_GET_L3_DATA: Get CDP Data CBM for a domain.
        * XEN_DOMCTL_PSR_CAT_OP_SET_L3_DATA: Set CDP Data CBM for a domain.
        * XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM: Get L2 CBM for a domain.
        * XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM: Set L2 CBM for a domain.

* xl interfaces:

    1. psr-cat-show -lX domain-id

        Show LX cbm for a domain.

                => XEN_SYSCTL_PSR_CAT_get_l3_info    /
                   XEN_SYSCTL_PSR_CAT_get_l2_info    /
                   XEN_DOMCTL_PSR_CAT_OP_GET_L3_CBM  /
                   XEN_DOMCTL_PSR_CAT_OP_GET_L3_CODE /
                   XEN_DOMCTL_PSR_CAT_OP_GET_L3_DATA /
                   XEN_DOMCTL_PSR_CAT_OP_GET_L2_CBM

    2. psr-cat-set -lX domain-id cbm

        Set LX cbm for a domain.

                => XEN_DOMCTL_PSR_CAT_OP_SET_L3_CBM  /
                   XEN_DOMCTL_PSR_CAT_OP_SET_L3_CODE /
                   XEN_DOMCTL_PSR_CAT_OP_SET_L3_DATA /
                   XEN_DOMCTL_PSR_CAT_OP_SET_L2_CBM

    3. psr-hwinfo

        Show PSR HW information, including L3 CAT/CDP/L2 CAT

                => XEN_SYSCTL_PSR_CAT_get_l3_info /
                   XEN_SYSCTL_PSR_CAT_get_l2_info

* Key data structure:

    1. Feature properties

            static const struct feat_props {
                unsigned int cos_num;
                enum cbm_type type[PSR_MAX_COS_NUM];
                enum cbm_type alt_type;
                bool (*get_feat_info)(const struct feat_node *feat,
                                      uint32_t data[], unsigned int array_len);
                void (*write_msr)(unsigned int cos, uint32_t val,
                                  enum cbm_type type);
            } *feat_props[PSR_SOCKET_FEAT_NUM];

        Every feature has its own properties, e.g. some data and actions. A
        feature property pointer array is declared to save every feature's
        properties.

        * Member `cos_num`

            `cos_num` is the number of COS registers the feature uses, e.g.
            L3/L2 CAT uses 1 register but CDP uses 2 registers.

        * Member `type`

            `type` is an array to save all 'enum cbm_type' values of the
            feature. It is used together with `cos_num` to get/write a
            feature's COS register values one by one.

        * Member `alt_type`

            `alt_type` is 'alternative type'. When this 'alt_type' is input,
            the feature does some special operations.

        * Member `get_feat_info`

            `get_feat_info` is used to return feature HW info through sysctl.

        * Member `write_msr`

            `write_msr` is used to write out the feature's MSR register.

    2. Feature node

            struct feat_node {
                unsigned int cos_max;
                unsigned int cbm_len;
                uint32_t cos_reg_val[MAX_COS_REG_CNT];
            };

        When a PSR enforcement feature is enabled, it will be added into a
        feature array.

        * Member `cos_max`

            `cos_max` is part of the hardware info of CAT. It means the max
            number of COS registers. As L3 CAT/CDP/L2 CAT all have it, it is
            declared in `feat_node`.

        * Member `cbm_len`

            `cbm_len` is part of the hardware info of CAT. It means the max
            number of bits to set.

        * Member `cos_reg_val`

            `cos_reg_val` is an array to maintain the value set in all COS
            registers of the feature. The array is indexed by COS ID.

    3. Per-socket PSR features information structure

            struct psr_socket_info {
                bool feat_init;
                struct feat_node *features[PSR_SOCKET_FEAT_NUM];
                spinlock_t ref_lock;
                unsigned int cos_ref[MAX_COS_REG_CNT];
                DECLARE_BITMAP(dom_ids, DOMID_IDLE + 1);
            };

        We collect all PSR allocation features information of a socket in this
        `struct psr_socket_info`.

        * Member `feat_init`

            `feat_init` is a flag, to indicate whether the CPU init on a socket
            has been done.

        * Member `features`

            `features` is a pointer array to save all enabled features pointers
            according to feature position defined in `enum psr_feat_type`.

        * Member `ref_lock`

            `ref_lock` is a spin lock to protect `cos_ref`.

        * Member `cos_ref`

            `cos_ref` is an array which maintains the reference count of each
            COS. It maps to cos_reg_val\[MAX_COS_REG_CNT\] in
            `struct feat_node`. If one COS is used by one domain, the
            corresponding reference will increase by one. If a domain releases
            the COS, the reference will decrease by one. The array is indexed
            by COS ID.

        * Member `dom_ids`

            `dom_ids` is a bitmap, every bit corresponds to a domain. Index is
            domain_id. It is used to help restore the cos_id of the domain to 0
            when a socket is offline and then online again.
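
To illustrate how `cos_reg_val` and `cos_ref` can work together, the sketch
below hands out COS IDs with reference counting for a single feature. It is a
simplified illustration rather than the Xen algorithm: the struct, the
constant and both functions are hypothetical, and locking (the role of
`ref_lock`) is left out.

```c
#include <stdint.h>

#define SKETCH_COS_CNT 4u   /* hypothetical, stands in for MAX_COS_REG_CNT */

struct sketch_socket {
    uint32_t cos_reg_val[SKETCH_COS_CNT];   /* CBM stored per COS ID */
    unsigned int cos_ref[SKETCH_COS_CNT];   /* domains using each COS ID */
};

/* Return a COS ID that holds `cbm` on this socket, or -1 if none is free. */
static inline int sketch_cos_acquire(struct sketch_socket *s, uint32_t cbm)
{
    unsigned int cos;

    /* COS 0 always keeps the default CBM and is shared without counting. */
    if (cbm == s->cos_reg_val[0])
        return 0;

    /* Share a COS that is in use and already holds the same CBM. */
    for (cos = 1; cos < SKETCH_COS_CNT; cos++) {
        if (s->cos_ref[cos] != 0 && s->cos_reg_val[cos] == cbm) {
            s->cos_ref[cos]++;
            return (int)cos;
        }
    }

    /* Otherwise claim a COS that no domain references. */
    for (cos = 1; cos < SKETCH_COS_CNT; cos++) {
        if (s->cos_ref[cos] == 0) {
            s->cos_reg_val[cos] = cbm;
            s->cos_ref[cos] = 1;
            return (int)cos;
        }
    }

    return -1;
}

/* Drop one reference; the COS becomes reusable when the count reaches 0. */
static inline void sketch_cos_release(struct sketch_socket *s, unsigned int cos)
{
    if (cos != 0 && cos < SKETCH_COS_CNT && s->cos_ref[cos] > 0)
        s->cos_ref[cos]--;
}
```

Sharing a COS between domains that request the same CBM matters because the
number of COS registers is small compared with the number of possible domains.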

# Limitations

CAT/CDP can only work on HW which enables it (check by CPUID). So far, there is
no HW which enables both L2 CAT and L3 CAT/CDP. But the SW implementation has
considered the scenario where both L2 CAT and L3 CAT/CDP are enabled.

# Testing

We can execute the above xl commands to verify L2 CAT and L3 CAT/CDP on HW
that supports them.

For example:

    root@:~$ xl psr-hwinfo --cat
    Cache Allocation Technology (CAT): L2
    Socket ID       : 0
    Maximum COS     : 3
    CBM length      : 8
    Default CBM     : 0xff

    root@:~$ xl psr-cat-set -l2 1 0x7f

    root@:~$ xl psr-cat-show -l2 1
    Socket ID       : 0
    Default CBM     : 0xff
       ID                     NAME             CBM
        1                 ubuntu14            0x7f

# Areas for improvement

A hexadecimal number is used to set/show CBM for a domain now. Although this
is convenient for expressing overlapped/isolated bitmask requirements, it is
not user-friendly.

To improve this, the libxl interfaces can be wrapped in libvirt to provide more
user-friendly interfaces to the user, e.g. a percentage of the cache to set
and show.
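
As a rough idea of what such a percentage interface might compute, the sketch
below converts a percentage into a contiguous CBM. It is purely illustrative:
the function is hypothetical, rounding up and anchoring the mask at bit 0 are
arbitrary choices made here, and a percentage alone cannot express which
portions of the cache overlap with those of other domains.

```c
#include <stdint.h>

/*
 * Build a contiguous CBM covering roughly `percent` of a cache whose CBM
 * is `cbm_len` bits long. Returns 0 for invalid input. At least one bit
 * is always set, since an empty mask would grant no cache at all.
 */
static inline uint32_t cbm_from_percent(unsigned int percent,
                                        unsigned int cbm_len)
{
    unsigned int bits;

    if (cbm_len == 0 || cbm_len > 32 || percent > 100)
        return 0;

    bits = (percent * cbm_len + 99) / 100;   /* round up */
    if (bits == 0)
        bits = 1;

    return bits == 32 ? UINT32_MAX : (1u << bits) - 1;
}
```

With the 8-bit CBM from the Testing section, 50 percent maps to 0x0f and 100
percent maps to the default CBM 0xff.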

# Known issues

N/A

# References

"INTEL RESOURCE DIRECTOR TECHNOLOGY (INTEL RDT) ALLOCATION FEATURES" [Intel 64 and IA-32 Architectures Software Developer Manuals, vol3](http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html)

# History

------------------------------------------------------------------------
Date       Revision Version  Notes
---------- -------- -------- -------------------------------------------
2016-08-12 1.0      Xen 4.9  Design document written

2017-02-13 1.7      Xen 4.9  Changes:

                             1. Modify the design document to cover L3
                                CAT/CDP and L2 CAT;

                             2. Fix typos;

                             3. Amend description of `feat_mask` to make
                                it clearer;

                             4. Other minor changes.

2017-02-15 1.8      Xen 4.9  Changes:

                             1. Add content in 'Areas for improvement';

                             2. Adjust revision number.

2017-03-16 1.9      Xen 4.9  Changes:

                             1. Add 'CMT' in 'Terminology';

                             2. Change 'feature list' to 'feature array'.

                             3. Modify data structure descriptions.

                             4. Adjust revision number.

2017-05-03 1.11     Xen 4.9  Changes:

                             1. Modify data structure descriptions.

                             2. Adjust revision number.

2017-07-13 1.14     Xen 4.10 Changes:

                             1. Fix a typo.

2017-08-01 1.15     Xen 4.10 Changes:

                             1. Add 'alt_type' in 'feat_props' structure.

2017-08-04 1.16     Xen 4.10 Changes:

                             1. Remove special character which may cause
                                html creation failure.

2018-07-10 1.17     Xen 4.12 Changes:

                             1. Reformat complete document to enable PDF
                                creation.

---------- -------- -------- -------------------------------------------