DaB*_*ler 6 linux gentoo hardware linux-kernel amd
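I recently installed Gentoo Linux on my AMD Ryzen 7 1700X. Now I am getting segmentation faults under heavy compile loads and random reboots while the machine is idle.
As a first step, I checked the current microcode version: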
grep -m 1 microcode /proc/cpuinfo
microcode : 0x8001126
However, according to this table, the latest microcode should be 0x08001129. Updating the CPU's microcode therefore seems like a good idea.
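The comparison between the running revision and the expected one can be sketched in bash. This is only an illustration: the function names `current_microcode` and `microcode_is_older` and the optional file argument are assumptions made for this example, not part of any existing tool.

```bash
#!/bin/bash
# Sketch: compare the running microcode revision with an expected one.
# Both function names are hypothetical helpers invented for this example.

# Print the first "microcode" revision found in a cpuinfo-style file.
current_microcode () {
    local cpuinfo="${1:-/proc/cpuinfo}"
    awk -F': *' '/^microcode/ { print $2; exit }' "${cpuinfo}"
}

# Succeed (exit status 0) when the first revision is older than the second.
microcode_is_older () {
    local current="$1" expected="$2"
    # Bash arithmetic accepts 0x-prefixed hex, so 0x8001126 and 0x08001126
    # denote the same number even though they are printed differently.
    (( current < expected ))
}
```

With the values from this post, `microcode_is_older "$(current_microcode)" 0x08001129` would succeed, because 0x8001126 is numerically smaller than 0x08001129.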
To do that, I emerged =sys-kernel/linux-firmware-20180730 (which contains /lib/firmware/amd-ucode/microcode_amd_fam17h.bin). In addition, I enabled the following options in the kernel:
CONFIG_MICROCODE=y
CONFIG_MICROCODE_AMD=y
After rebooting, I tried to load the microcode manually (a late microcode update):
echo 1 > /sys/devices/system/cpu/microcode/reload
However, when I do this, no new lines appear in dmesg:
dmesg | grep microcode
[ 0.465121] microcode: CPU0: patch_level=0x08001126
[ 0.465514] microcode: CPU1: patch_level=0x08001126
[ 0.465932] microcode: CPU2: patch_level=0x08001126
[ 0.466394] microcode: CPU3: patch_level=0x08001126
[ 0.466772] microcode: CPU4: patch_level=0x08001126
[ 0.467159] microcode: CPU5: patch_level=0x08001126
[ 0.467537] microcode: CPU6: patch_level=0x08001126
[ 0.467908] microcode: CPU7: patch_level=0x08001126
[ 0.468268] microcode: CPU8: patch_level=0x08001126
[ 0.468653] microcode: CPU9: patch_level=0x08001126
[ 0.468999] microcode: CPU10: patch_level=0x08001126
[ 0.469409] microcode: CPU11: patch_level=0x08001126
[ 0.469744] microcode: CPU12: patch_level=0x08001126
[ 0.470136] microcode: CPU13: patch_level=0x08001126
[ 0.470455] microcode: CPU14: patch_level=0x08001126
[ 0.470757] microcode: CPU15: patch_level=0x08001126
[ 0.471092] microcode: Microcode Update Driver: v2.2.
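I expected something like microcode: CPU0: new patch_level=0x08001129. What am I missing here? Some kernel CONFIG_ option? Can I turn on some kind of debug output? Or, even better, how can I list the microcode versions provided in microcode_amd_fam17h.bin?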
小智 3
You could try something like this:
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FIRMWARE_MEMMAP=y
CONFIG_MICROCODE=y
# CONFIG_MICROCODE_INTEL is not set
CONFIG_MICROCODE_AMD=y
CONFIG_MICROCODE_OLD_INTERFACE=y
CONFIG_FW_LOADER=y
CONFIG_EXTRA_FIRMWARE="amd-ucode/microcode_amd_fam17h.bin"
CONFIG_EXTRA_FIRMWARE_DIR="/lib/firmware"
# CONFIG_FW_LOADER_USER_HELPER is not set
(Note that if you want to list multiple files in CONFIG_EXTRA_FIRMWARE, they should be separated by spaces, and their paths should be relative to CONFIG_EXTRA_FIRMWARE_DIR.)
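Before rebuilding the kernel, it may help to confirm that every listed file really exists under the firmware directory. The following is a minimal sketch; `check_extra_firmware` is a hypothetical helper written for this example and is not part of the kernel build system.

```bash
#!/bin/bash
# Sketch: report entries of a CONFIG_EXTRA_FIRMWARE-style list that are
# missing under the firmware directory.
# check_extra_firmware is a hypothetical helper, not a kernel tool.
check_extra_firmware () {
    local fwdir="$1" list="$2" entry missing=0
    # The list is space separated, so word splitting is intentional here.
    for entry in ${list}; do
        if [[ ! -f "${fwdir}/${entry}" ]]; then
            echo "missing: ${fwdir}/${entry}" >&2
            missing=1
        fi
    done
    # Exit status 0 when every entry exists, 1 otherwise.
    return "${missing}"
}
```

For the configuration above, the call would be `check_extra_firmware /lib/firmware "amd-ucode/microcode_amd_fam17h.bin"`.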
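But that might not work (it worked for me with only graphics and network firmware; I have not tried it with CPU firmware). So try another approach: ignore the CONFIG_EXTRA_FIRMWARE value above (that is, leave it unset; the other options may still be needed, I am not sure) and instead try early microcode loading by prepending the CPU microcode file to the initramfs file, perhaps like this (on Gentoo):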
/etc/kernel/postinst.d/25-glue_cpu_microcode_to_kernel:
#!/bin/bash
bootdir='/bewt'
initramfsfname="initramfs"
initramfs="$( realpath -- "/${bootdir}/${initramfsfname}" )"
vmlinuz="/${bootdir}/kernel"
prepend_microcode () {
    echo "prepending CPU microcode to ${initramfs}"
    local destfirst="/tmp/initrd/"
    local destmc="${destfirst}/kernel/x86/microcode/"
    # mkdir -p "${destmc}"
    # directories need the execute bit to be traversable, so use 755 rather than 644
    install -dm755 "${destmc}"
    #this will replace the symlink /bewt/initramfs (on gentoo) with the file!
    #but this makes genkernel fail as such:
    #ln: failed to create symbolic link 'initramfs.old' -> '': No such file or directory
    #even though it doesn't touch the .old file!
    # so to fix this, we'll use realpath above!
    (
        cp -f "/lib/firmware/amd-ucode/microcode_amd.bin" "${destmc}/AuthenticAMD.bin" &&
        cd "${destfirst}" &&
        find . | cpio -o -H newc > "../ucode.cpio" 2>/dev/null &&
        cd .. &&
        cat "ucode.cpio" "${initramfs}" > "/tmp/${initramfsfname}" &&
        chmod a-rwx "/tmp/${initramfsfname}" &&
        mv -f "/tmp/${initramfsfname}" "${initramfs}"
    )
    local ec=$?
    if [[ $ec -eq 0 ]]; then
        echo "success."
    else
        #TODO: make errors be red so it's more obvious
        echo "failed!"
    fi
    return $ec
}
prepend_microcode
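After running a script like this, one rough way to check the result is to look at the start of the initramfs file. The sketch below is a heuristic under stated assumptions: it relies on the prepended archive being an uncompressed "newc" cpio (which starts with the ASCII magic 070701 and stores member names as plain text). `has_early_amd_ucode` is a hypothetical helper, and passing the check does not prove that the kernel actually applied the microcode.

```bash
#!/bin/bash
# Sketch: heuristic check that an initramfs starts with an uncompressed
# "newc" cpio archive carrying AMD microcode.
# has_early_amd_ucode is a hypothetical helper invented for this example.
has_early_amd_ucode () {
    local image="$1" magic
    # A newc cpio archive begins with the six ASCII characters "070701".
    magic="$(head -c 6 -- "${image}")" || return 1
    [[ "${magic}" == "070701" ]] || return 1
    # Member names are stored as plain text in an uncompressed cpio archive,
    # so a fixed-string search in binary mode is enough for a rough check.
    grep -a -q -F 'kernel/x86/microcode/AuthenticAMD.bin' -- "${image}"
}
```

A call such as `has_early_amd_ucode /boot/initramfs && echo "microcode archive found"` would use your own boot directory and file name; the path shown here is only an example.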
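However, genkernel may (still, 3 years later?) ignore the files in /etc/kernel/postinst.d/ (or maybe this only happened in 2015 and has since been fixed, or it happened for some other reason). That means you have to run genkernel manually yourself (to compile the kernel) and then, afterwards, manually run all the scripts present in /etc/kernel/postinst.d/. Doing so looks something like this: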
echo "!! Running genkernel..."
time genkernel all --bootdir="/bewt" --install --symlink --no-splash --no-mountboot --makeopts="-j4 V=0" --no-keymap --lvm --no-mdadm --no-dmraid --no-zfs --no-multipath --no-iscsi --disklabel --luks --no-gpg --no-netboot --no-unionfs --no-firmware --no-integrated-initramfs --compress-initramfs --compress-initrd --compress-initramfs-type=best --loglevel=5 --color --no-mrproper --no-clean --no-postclear --oldconfig
ec="$?"
if test "$ec" -ne "0"; then
    echo "!! genkernel failed $ec"
    exit "$ec"
fi
echo "!! Done genkernel"
list=( `find /etc/kernel/postinst.d -type f -executable | sort --general-numeric-sort` )
echo "!! Found executables: ${list[@]}"
for i in ${list[@]}; do
    ec="-1"
    while test "0" -ne "$ec"; do
        echo "!! Executing: '$i'"
        time $i
        ec="$?"
        echo "!! Exit code: $ec"
        if test "$ec" -ne "0"; then
            echo "!! something went wrong, fix it then press Enter to retry executing '$i' or press C-c now."
            #exit $ec
            # -p must be followed directly by the prompt text; "-p -s" would use "-s" as the prompt
            time read -r -s -p "!! Press Enter to re-execute that or C-c to cancel"
        fi
    done
done
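(Note: the bootdir used above is /bewt, not /boot, so you will probably need to change at least that. The string microcode_amd.bin above should also be replaced with yours: microcode_amd_fam17h.bin.)
The list= and for constructs above are not the correct way to handle file names unless the names contain no spaces, newlines, and so on, which is clearly what the code above assumes.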
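If the file names might contain spaces or newlines, a NUL-separated listing is one way around that limitation. The sketch below assumes bash 4.4 or newer (for `mapfile -d`) plus GNU find and sort. `run_hooks` is a hypothetical function name, and unlike the loop above it stops at the first failing script instead of offering a retry.

```bash
#!/bin/bash
# Sketch: run executable hook scripts in sorted order, keeping file names
# with spaces or newlines intact by using NUL as the separator.
# run_hooks is a hypothetical helper; it needs bash >= 4.4, GNU find and sort.
run_hooks () {
    local dir="$1" hook
    local -a hooks=()
    # -print0 and sort -z keep each path as one NUL-terminated record;
    # mapfile -d '' reads those records into the array unchanged.
    mapfile -d '' -t hooks < <(find "${dir}" -type f -executable -print0 | sort -z)
    for hook in "${hooks[@]}"; do
        echo "!! Executing: '${hook}'"
        # Stop at the first failure and pass its exit code to the caller.
        "${hook}" || return $?
    done
}
```

It would be called as `run_hooks /etc/kernel/postinst.d` once genkernel has finished.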
If you want to see the .config of an old 4.1.7 kernel that loaded CPU firmware early, see this one.