revisiting Solaris 11 Automatic Installer without DHCP

A while ago I blogged about my experience with the new automatic installer in Solaris 11 and my special setup, in which I refused to use a DHCP service because DHCP is simply something we do not use in our datacenter. That particular blog post has been one of my most popular (by visits), so this must be something that other admins experience as well.

One of the issues I came across was that while you can easily specify the IP address, gateway and installation server on the command line without DHCP, the install breaks at the point where it needs to install some packages from ‘’ but cannot resolve that name because no DNS servers are set up. I described a rather tedious workaround in that post but came up with something a little better today.

firewall port for ZFS Appliance with Oracle VM Storageconnect

I just stumbled across this and could not find it anywhere else on the net. I set up a ZFS Appliance with Oracle VM and their storageconnect plugin according to the documentation pdf (which has pretty easy step-by-step instructions), but in this case the OVM Server and the ZFS Appliance were not in the same network, and access is denied by default in the firewall between those nets. So trying to register the appliance as an FC storage led to an error that just told me that the connection timed out.

upgrading OPatch through rpm

Every DBA has been there. You “just” want to apply a PSU, a one-off patch or almost anything else in Oracle and quickly browse through the release notes or Readme when you encounter this:

You must use the OPatch utility version or later to apply this patch. Oracle recommends that you use the latest released OPatch version for 12.1, which is available for download from My Oracle Support patch 6880880 by selecting the release.

Almost every patch requires you to patch OPatch first. Which really is not a big deal: download patch 6880880, move the old $ORACLE_HOME/OPatch out of the way (or back it up) and unzip the new one in its place. A very boring task, especially if you maintain a number of Homes on a box or multiple database servers. I am tired of performing these tasks and came up with a way to do it through rpm.
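The manual routine described above can be sketched as a small shell function. This is only an illustration of the move-and-unzip steps, not the rpm-based approach; the function name replace_opatch, the backup naming scheme and the argument order are assumptions made for this example, and the zip file is whatever you downloaded for patch 6880880.

```shell
#!/bin/bash
# Hypothetical helper: replace the OPatch directory of one Oracle home.
# $1 = path to the Oracle home, $2 = path to the downloaded OPatch zip
replace_opatch() {
    local home="$1" zip="$2"
    local backup="$home/OPatch.$(date +%Y%m%d%H%M%S).bak"

    [ -d "$home" ] || { echo "no such home: $home" >&2; return 1; }
    [ -f "$zip" ]  || { echo "no such zip: $zip" >&2; return 1; }

    # Move the old OPatch out of the way (if present) instead of deleting it.
    if [ -d "$home/OPatch" ]; then
        mv "$home/OPatch" "$backup" || return 1
    fi

    # The zip contains a top-level OPatch directory, so extract into the home.
    unzip -q -o "$zip" -d "$home"
}
```

Calling it in a loop over all homes on a box (the paths are up to you) removes most of the boredom, and the timestamped backup keeps the old version around in case a patch insists on it.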

OTN interview about #RACAttack at OpenWorld 2013

We had so much fun with RAC Attack during OpenWorld in San Francisco that people soon started to ask questions. Luckily, none of those were about why we were wearing bandanas, carrying an inflatable kangaroo and taking jumping pictures with all the participants all day. For most other questions, see this interview between OTN’s Laura Ramsey and me.

Linux Huge Pages in Oracle VM 3

My last post was about properly determining the number of hugepages needed with the newest version of the unbreakable enterprise kernel UEK R3 in Oracle Linux. Aside from the little issue with the proper kernel version detection in the script, the process is quite simple and well documented on the web:

  • turn off automatic memory management (AMM) by setting the parameter MEMORY_TARGET to 0
  • determine the number of hugepages needed by running the script
  • add an entry to /etc/sysctl.conf
  • reload the config with sysctl -p or reboot the server
  • edit /etc/security/limits.conf
  • bounce the instance(s)
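The two config file entries from the list can be derived from the number of pages the script recommends. The following is a minimal sketch under a few assumptions: the function name hugepage_config is made up, the instance owner is assumed to be the user oracle, and memlock is simply set to the size of the whole hugepage pool in kB (pages times page size), which is the lower bound it has to cover.

```shell
#!/bin/bash
# Print the config lines for a given number of hugepages.
# $1 = number of hugepages, $2 = huge page size in kB (default 2048),
# $3 = OS user owning the instance (assumed to be "oracle")
hugepage_config() {
    local pages="$1" size_kb="${2:-2048}" user="${3:-oracle}"
    # memlock is given in kB and must cover at least the whole hugepage pool
    local memlock_kb=$(( pages * size_kb ))

    echo "# /etc/sysctl.conf"
    echo "vm.nr_hugepages = $pages"
    echo "# /etc/security/limits.conf"
    echo "$user soft memlock $memlock_kb"
    echo "$user hard memlock $memlock_kb"
}

hugepage_config 516
```

For 516 pages of 2048 kB this prints a memlock value of 1056768 for both the soft and the hard limit.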

Unfortunately, this was not it for me. You can check the effect of the setting like this:

[root@linuxbox ~]# grep Huge /proc/meminfo
HugePages_Total:       0
HugePages_Free:        0
HugePages_Rsvd:        0
HugePages_Surp:        0
Hugepagesize:       2048 kB

This should have shown more than 0 hugepages configured. There is, however, an error logged by dmesg.

[root@linuxbox ~]# dmesg | tail -1
Guest pages are not properly aligned to use hugepages
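If you want to run the /proc/meminfo check on several guests, it can be wrapped into a small function. This is just a sketch; the function names are made up for this example, and the optional file argument only exists so the parsing can be tried against a saved copy of /proc/meminfo.

```shell
#!/bin/bash
# Print the number of configured huge pages.
# $1 = meminfo-style file to read (defaults to /proc/meminfo)
hugepages_total() {
    awk '/^HugePages_Total:/ { print $2 }' "${1:-/proc/meminfo}"
}

# Return 0 if at least one huge page is configured, 1 otherwise.
hugepages_configured() {
    local total
    total=$(hugepages_total "$1")
    [ "${total:-0}" -gt 0 ]
}

if ! hugepages_configured; then
    echo "no hugepages configured, check dmesg for hints" >&2
fi
```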

Alright, the problem is that while Oracle VM may be engineered to run the Oracle database best, someone actually forgot to add support for huge pages in Para-Virtualized (PVM) guests. This should work for HVM guests (I have not tried it) but the general recommendation is to use PVM for Linux guests. The workaround for OVM 3.2.4 and later is described in metalink note 1529373.1. You have to add the parameter ‘allowsuperpage’ to the grub kernel boot options in /etc/grub.conf like this (the added parameter is at the end of the kernel line):

title Oracle VM Server (2.6.39-300.32.5.el5uek)
        root (hd0,0)
        kernel /xen.gz dom0_mem=1160M allowsuperpage
        module /vmlinuz-2.6.39-300.32.5.el5uek ro root=UUID=d15aad1f-668b-4358-a54a-d7163a15198d
        module /initrd-2.6.39-300.32.5.el5uek.img

You need to do this on all nodes on which you may want to run a VM with huge pages, or to which you may want to live-migrate one, and then reboot the servers. I had to perform an update to 3.2.7 anyway, so a reboot was necessary in any case.

Next step, you need to add the parameter “superpages=1” at the end of the vm.cfg file of this VM and stop/start (restart is not enough) the VM.

But you are not supposed to edit the vm.cfg manually, and an edit of the VM in the manager (like changing the number of vcpus, memory or boot options) will remove that line again. This will then lead to the VM not actually allocating any hugepages after the next boot. The note says they are working on it and I really hope they will fix this sooner rather than later.
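Until that is fixed, it helps to check whether the line has survived the last edit in the manager. A minimal sketch, assuming you pass the path of the vm.cfg yourself; both function names are made up for this example, and the VM still needs a stop/start after the line is put back.

```shell
#!/bin/bash
# Return 0 if the given vm.cfg still contains an active superpages=1 line.
# $1 = path to the vm.cfg file
has_superpages() {
    grep -Eq '^[[:space:]]*superpages[[:space:]]*=[[:space:]]*1[[:space:]]*$' "$1"
}

# Append the parameter only when it is missing. The leading newline guards
# against a file whose last line is not terminated.
ensure_superpages() {
    has_superpages "$1" || printf '\nsuperpages=1\n' >> "$1"
}
```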

And in case you are wondering what all this fuss is about and why I do not simply want to live with good old standard shared memory, let Mark Bobak convince you that if you are not using Huge Pages, you are doing it wrong.

setting up Huge Pages with UEK R3 (kernel 3.8)

I came across a little hiccup when configuring huge pages on my Oracle Linux 6.5 playground machine with the latest Unbreakable Enterprise Kernel. Part of the documentation and also this blog post by Tim Hall have a nifty script that outputs the value one should use to set the nr_hugepages parameter correctly. Unfortunately, this script fails on the most recent version of the kernel shipped with Oracle Linux:

[root@linuxbox ~]# /usr/local/bin/ 
Unrecognized kernel version 3.8. Exiting.

Fortunately, the fix is really easy. There is a wrapper script that will fake the output of uname back to 2.6 and the method of setting the parameter with vm.nr_hugepages has stayed the same.

[root@linuxbox ~]# yum install uname26
[root@linuxbox ~]# uname26 /usr/local/bin/ 
Recommended setting: vm.nr_hugepages = 516

But even better than faking your way around this would be to modify the script so that it accepts 3.8 as a valid kernel version, like this:

#!/bin/bash
# Linux bash script to compute values for the
# recommended HugePages/HugeTLB configuration
# Note: This script does calculation for all shared memory
# segments available when the script is run, no matter it
# is an Oracle RDBMS shared memory segment or not.
# Check for the kernel version
KERN=`uname -r | awk -F. '{ printf("%d.%d\n",$1,$2); }'`
# Find out the HugePage size
HPG_SZ=`grep Hugepagesize /proc/meminfo | awk {'print $2'}`
# Start from 1 pages to be on the safe side and guarantee 1 free HugePage
NUM_PG=1
# Cumulative number of pages required to handle the running shared memory segments
for SEG_BYTES in `ipcs -m | awk {'print $5'} | grep "[0-9][0-9]*"`
do
   MIN_PG=`echo "$SEG_BYTES/($HPG_SZ*1024)" | bc -q`
   if [ $MIN_PG -gt 0 ]; then
      NUM_PG=`echo "$NUM_PG+$MIN_PG+1" | bc -q`
   fi
done
# Finish with results
case $KERN in
   '2.4') HUGETLB_POOL=`echo "$NUM_PG*$HPG_SZ/1024" | bc -q`;
          echo "Recommended setting: vm.hugetlb_pool = $HUGETLB_POOL" ;;
   '2.6' | '3.8' ) echo "Recommended setting: vm.nr_hugepages = $NUM_PG" ;;
      *) echo "Unrecognized kernel version $KERN. Exiting." ;;
esac

Btw: my test returned only 1 when I first ran it. After some advice by Frits Hoogland on twitter and a little study of this blog post by Tanel Poder it became quite obvious. I forgot to unset the MEMORY_TARGET parameter in my spfile and thus the shared memory was allocated in chunks visible in /dev/shm but not through ‘ipcs -m’. I do not trust AMM anyway, so I set MEMORY_TARGET to 0 and went on.
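To see why the result stays at 1 in that situation, here is the arithmetic of the script reduced to a function that takes the segment sizes as arguments instead of reading them from ‘ipcs -m’. It is a sketch for illustration only: the function name pages_needed and the example segment sizes are made up, and the tiny 4096 byte segments merely stand in for a case where no segment is as large as one huge page.

```shell
#!/bin/bash
# Same arithmetic as the script, but fed with explicit segment sizes.
# $1 = huge page size in kB, remaining args = segment sizes in bytes
pages_needed() {
    local hpg_kb="$1" num_pg=1 seg min_pg
    shift
    for seg in "$@"; do
        min_pg=$(( seg / (hpg_kb * 1024) ))
        if [ "$min_pg" -gt 0 ]; then
            num_pg=$(( num_pg + min_pg + 1 ))
        fi
    done
    echo "$num_pg"
}

pages_needed 2048 4096 4096 4096     # tiny segments only: prints 1
pages_needed 2048 1073741824         # one 1 GiB segment: prints 514
```

Segments smaller than one huge page contribute nothing, so the counter never moves past its starting value of 1.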

top 3 posts and review of 2013

I did a similar post last year and in trying to start a bit of a tradition looked at the same statistics again this year. And the result was quite a surprise. While I thought that 2013 was the year of the 12c database it actually was still the year of the infrastructure. At least here at this blog.

2013 has been an awesome ride for me and did not leave much more to wish for. I was able to attend and speak at a number of conferences and user group meetings around the world: IOUG Collaborate in Denver, OUGN vårseminar on a boat from Oslo to Kiel, OUG Scotland in Edinburgh, Oracle OpenWorld in San Francisco, the OTN APAC tour in Auckland and INSYNC in Perth, DOAG in Nuremberg and UKOUG in Manchester. I was awarded ACE Directorship and was elected vice president of the RAC special interest group. How much better could it get?

troubleshooting a CPU eating java process

Just yesterday we had a situation where a Java process (a JBoss app server) started to raise the CPU load on a machine significantly. Load average increased and top showed that the process consumed a lot of CPU. The planned “workaround”? Blame something (garbage collection is always an easy victim), restart and hope for the best. After all, there is not much to diagnose here if there is nothing in the logs, right? Wrong! There are a few things that we can do with a running Java VM to diagnose these issues from a Solaris or Linux command line and find the root cause.

Oracle acronym madness

I delivered a private RAC training class a few weeks ago and one of the things I was asked during introductions was to limit the amount of weird acronyms to a minimum. “OK”, I said and thought, there are maybe a handful of three-letter abbreviations that I’ll be using but they should not be more than a dozen. To check on myself, I made a point of writing down most of the acronyms I used together with their meaning. I was more than surprised to have collected more than 50 of these after just two days. Crazy.

Upgrading to Oracle Linux 6.5 and UEK3

OL 6.5 was released a few days ago to ULN and public yum, and of course I was going to upgrade some machines immediately. But first: Have I mentioned before that I really love Oracle Linux for providing ISOs and patches for free? One of our customers insists on using Red Hat instead of OL. No problem so far, but recently we needed to clone that VM for some testing and staging. Only problem: to do this we needed another support subscription from RH, which the customer is paying for, but more importantly it delayed the project since we now had to wait for procurement. This is why I appreciate the license model of OL: you can use it for free and have the option of purchasing support for the machines that are most important to you. Also, licensing is per (hardware) server, not VM. Read more about OL licensing straight from the horse's, or rather the Belgian’s, mouth.