Junhao He b6e902b3d7 Add HBM Memory ACLS support for HiSilicon
When a hardware error occurs in a cell of the HBM memory, the internal
SRAM of the memory controller is used to replace the faulty memory. This
method is called ACLS (Adaptive Cache Line Sparing). The IMU reports the
ACLS RAS error, and rasdaemon records it and runs ACLS to replace the
faulty memory.

HBM ACLS can repair one 258-bit memory cell at a time. The HBM can
check which HBM cell the physical address belongs to and filter invalid
HBM addresses. Multiple RAS errors are reported if memory errors occur
in different HBM cells.

The feature depends on the Linux kernel configuration options
CONFIG_HISI_MEM_RAS and CONFIG_PAGE_EJECT.
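
As a rough illustration of checking this dependency, the sketch below
scans kernel configuration text for an option. The helper name
config_enabled is hypothetical and is not part of rasdaemon. Treating
both "=y" and "=m" as enabled is an assumption; the source does not say
whether either option can be built as a module.

```c
#include <string.h>

/*
 * Hypothetical helper (not part of rasdaemon): return 1 if `option`
 * appears in the kernel config text `cfg` as "<option>=y" or
 * "<option>=m" at the start of a line, otherwise 0.
 */
static int config_enabled(const char *cfg, const char *option)
{
	size_t len = strlen(option);
	const char *p = cfg;

	while (p && *p) {
		/* Lines such as "# CONFIG_X is not set" start with '#'
		 * and therefore never match here. */
		if (strncmp(p, option, len) == 0 && p[len] == '=' &&
		    (p[len + 1] == 'y' || p[len + 1] == 'm'))
			return 1;

		/* Advance to the start of the next line, if any. */
		p = strchr(p, '\n');
		if (p)
			p++;
	}
	return 0;
}
```

The config text would typically come from a file such as
/boot/config-<release> on systems that ship one. A caller would run the
check once for CONFIG_HISI_MEM_RAS and once for CONFIG_PAGE_EJECT.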

Signed-off-by: Junhao He <hejunhao3@huawei.com>
2024-08-31 19:02:05 +08:00