My last post was about properly determining the number of hugepages needed with the newest version of the Unbreakable Enterprise Kernel (UEK R3) in Oracle Linux. Aside from the little issue with kernel version detection in the script, the process is quite simple and well documented on the web:
- turn off automatic memory management (AMM) by setting the parameter MEMORY_TARGET to 0
- determine the number of hugepages needed by running the hugepages_settings.sh script
- add an entry to /etc/sysctl.conf
- reload the config with sysctl -p or reboot the server
- edit /etc/security/limits.conf
- bounce the instance(s)
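To make the arithmetic behind the sysctl.conf and limits.conf entries concrete, here is a minimal sketch. It is not the hugepages_settings.sh script: the helper names are made up for illustration, and the SGA size, the 2048 kB page size default and the OS user "oracle" are assumptions. The real script sums the existing shared memory segments and may come out slightly higher.

```shell
#!/bin/bash
# Hypothetical helpers, not part of hugepages_settings.sh.

# Number of huge pages needed to hold an SGA of the given size.
# $1 = SGA size in MB (assumed input), $2 = huge page size in kB (default 2048)
hugepages_needed() {
    local sga_mb=$1
    local page_kb=${2:-2048}
    local sga_kb=$(( sga_mb * 1024 ))
    # Round up so the whole SGA fits into huge pages.
    echo $(( (sga_kb + page_kb - 1) / page_kb ))
}

# memlock limit in kB that covers the given number of huge pages.
memlock_kb() {
    local pages=$1
    local page_kb=${2:-2048}
    echo $(( pages * page_kb ))
}

# Print the entries for review; nothing is written to any file.
# The first line belongs in /etc/sysctl.conf, the other two in
# /etc/security/limits.conf. The user name "oracle" is an assumption.
print_settings() {
    local pages=$1
    local user=${2:-oracle}
    local lock_kb
    lock_kb=$(memlock_kb "$pages")
    echo "vm.nr_hugepages = $pages"
    echo "$user soft memlock $lock_kb"
    echo "$user hard memlock $lock_kb"
}
```

For a 4 GB SGA, `print_settings "$(hugepages_needed 4096)"` prints vm.nr_hugepages = 2048 and a memlock limit of 4194304 kB.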
Unfortunately, this was not it for me. You can check the effect of the setting like this:
[root@linuxbox ~]# grep Huge /proc/meminfo
Hugepagesize: 2048 kB
This should have shown more than 0 hugepages configured. There is, however, an error logged in dmesg:
[root@linuxbox ~]# dmesg | tail -1
Guest pages are not properly aligned to use hugepages
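The check above can be wrapped in a small function, for example to run after every reboot. This is only a sketch: the function names are hypothetical, and the only thing it relies on is the HugePages_Total field of /proc/meminfo.

```shell
#!/bin/bash
# Hypothetical helpers for checking the huge page allocation.

# Print the value of HugePages_Total from a meminfo-style file.
# $1 = file to read (defaults to /proc/meminfo)
hugepages_total() {
    local meminfo=${1:-/proc/meminfo}
    awk '/^HugePages_Total:/ {print $2}' "$meminfo"
}

# Return 0 if at least one huge page is configured, 1 otherwise.
check_hugepages() {
    local total
    total=$(hugepages_total "$1")
    if [ "${total:-0}" -gt 0 ]; then
        echo "OK: $total huge pages configured"
    else
        echo "WARNING: no huge pages configured, check dmesg"
        return 1
    fi
}
```

Calling `check_hugepages` without an argument reads /proc/meminfo; passing a file name is mainly useful for testing.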
Alright, the problem is that while Oracle VM may be engineered to run the Oracle database best, someone actually forgot to add support for huge pages in paravirtualized (PVM) guests. This should work for HVM guests (I have not tried it), but the general recommendation is to use PVM for Linux guests. The workaround for OVM 3.2.4 and later is described in Metalink note 1529373.1. You have to add the parameter ‘allowsuperpage’ to the grub kernel boot options in /etc/grub.conf on the Oracle VM server, like this (the added parameter is the last one on the kernel line):
title Oracle VM Server (2.6.39-300.32.5.el5uek)
kernel /xen.gz dom0_mem=1160M allowsuperpage
module /vmlinuz-2.6.39-300.32.5.el5uek ro root=UUID=d15aad1f-668b-4358-a54a-d7163a15198d
You need to do this on every node on which you may want to run a VM with huge pages, or to which you may want to live-migrate one, and then reboot the servers. I had to update to 3.2.7 anyway, so a reboot was necessary in any case.
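Since the option has to be present on every server in the pool, a quick check per node can save some head-scratching later. The sketch below uses a hypothetical function name and only inspects the config file: it does not tell you whether the running hypervisor was actually booted with the option, and it matches any xen.gz entry, not just the default one.

```shell
#!/bin/bash
# Hypothetical helper: is allowsuperpage set on a Xen kernel line?
# $1 = grub config to inspect (defaults to /etc/grub.conf)
xen_allows_superpages() {
    local grub_conf=${1:-/etc/grub.conf}
    # Match "kernel ... xen.gz ... allowsuperpage" with the option as a
    # separate word, either followed by whitespace or at the end of the line.
    grep -Eq '^[[:space:]]*kernel[[:space:]]+.*xen\.gz.*[[:space:]]allowsuperpage([[:space:]]|$)' "$grub_conf"
}
```

On each Oracle VM server this can be used as `xen_allows_superpages && echo "allowsuperpage is set" || echo "allowsuperpage is missing"`.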
Next, you need to add the parameter “superpages=1” at the end of the vm.cfg file of the VM and then stop and start the VM (a restart is not enough).
But you are not supposed to edit vm.cfg manually, and any edit of the VM in the manager (like changing the number of vCPUs, memory or boot options) will remove that line again. The VM will then not allocate any hugepages after its next boot. The note says they are working on it and I really hope they will fix this sooner rather than later.
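Because the line can silently disappear, it may be worth checking for it after every change made in the manager. A minimal sketch with a hypothetical function name follows; the path to vm.cfg is passed in as an argument because it depends on your repository layout, and the loop in the comment is only an illustration.

```shell
#!/bin/bash
# Hypothetical helper: does this vm.cfg still contain superpages=1?
# $1 = path to the vm.cfg file of the VM (no default, depends on the repository)
vm_has_superpages() {
    local vm_cfg=$1
    # Accept optional whitespace around the equals sign.
    grep -Eq '^[[:space:]]*superpages[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$vm_cfg"
}

# Example use, with the config file paths given on the command line:
#   for cfg in "$@"; do
#       vm_has_superpages "$cfg" || echo "superpages=1 missing in $cfg"
#   done
```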
And in case you are wondering what all this fuss is about and why I do not simply want to live with good old standard shared memory, let Mark Bobak convince you that if you are not using Huge Pages, you are doing it wrong.