OTN Grid Computing Thread of the Month

I have been hanging around at OTN’s grid computing forums for quite a while now. I enjoy reading about other people’s problems with RAC technology, learn from the advice of other smart DBAs and draw some satifcation from helping out others in those areas I feel confident about.
And while the grid computing part of the OTN message board is rather small (compared to ‘Database/general’ for example) it is still enough to keep you occupied for a while when reading all threads. That’s why I came up with the idea of compiling a list with my favourite thread(s) each month. Please bear in mind that I do not claim any authority, a fair rating system or even having read all threads. I am not affiliated with OTN, just another fellow member. These are just my favourites, nothing more. There is no reward or benefit in being listed here. The only thing I can offer for those that contribute to my monthly favourites is to buy you a beer if you manage to track me down at a conference or SIG meeting.
I thought this would be easy and fun until I got to actually pick a single great thread. Anyway, this is what I came up with for the month of October:

ASM benefit in SAN environment is quite a general discussion about why you might prefer ASM over raw partitions or filesystems and scratches on the surface of recommendations for a good stripe size. I also liked the fact that quite a few people participated, a lot of the other threads are really mostly two-way discussions.

Another great answer was given in “Possible Splitbrain scenario” where Syed Zaheer links to an excellent series of articles on OCR mirroring by Geert De Paep.

Sun Ray 3 just arrived

I have ordered a few of the brand new Sun Ray 3 devices for our lab and internal development. When I opened the box I first thought they must have forgotten something. SUN would have shipped three seperate boxes. One for the Ray, one for the keyboard and mouse and another one for the localized power cord. But this shipment had only one box, looking like it could very well hold a family size pizza.
Inside was no pizza but the device with all peripherals and power cord. This is propably the first real big improvement since Oracle took over Sun which I could actually see and touch.
Setup was just as easy as expected, it feels like videos (flash from youtube) plays a little smoother than on the older devices but I cannot back that up with scientific evidence yet. And while we were playing around with the new toys, we also enabled ipv6 on our server which was quite easy since the Sun Ray server has had a v6 address for quite a while now. All we needed to do is switch on v6 on our shared subnet and perform a (warm) restart of the server

/opt/SUNWut/sbin/utadm -m both
/opt/SUNWut/sbin/utrestart

Since we also always use the GUI firmware on our rays, we were able to simply switch the Rays to v6. We still have to work out some kinks though. Some of the devices keep rebooting every minute when set to v6 and the whole startup cycle or initial connect appears to be a bit longer than with ipv4.

insecure by default?

I recently attended an Oracle Sales event. One of the main selling points and strong Solaris features being stressed was the “Secure by default” nature of the operating system. And while I really don’t want to argue with the basic idea or the point made (that solaris is a secure os), it started to get me thinking about some of the things that annoy me and that are not in line with the marketing claim.

I love the fact that the Solaris installer has the (default) option of disabling all network services except for SSH. I stick with this default all the time since it is easy to just reenable needed services later. But the bitter (and annoying) truth is that whenever you set up a new zone in Solaris, all those stupid services will be enabled in the zone. Telnet, http, webconsole: you name it! This can easily be remedied by issuing the command ‘netservices limited’ but that should not be needed, secure should be the default! But it really does look like this is the default behavior and not a bug.

But maybe this is something that could change with the upcoming Solaris 11.

zone cloning without shutdown

In my presentation at Oracle Openworld, I showed a demo of how to easily clone a zone to make a copy for development or testing purposes. These were the steps, given that the zone oow01 was already set up and running.

zonecfg -z oow01 export > oow43.cfg -- dump the existing config to a file
# edit the file oow43 for a new ip address and a new zonepath
zonecfg -z oow43 -f oow43.cfg -- import the new config
zoneadm -z oow01 halt -- this is neccessary for the 'clone' command
zoneadm -z oow43 clone oow01 -- this will take care of the rest

One of the questions I received from the audience after the presentation was: “Is it possible to clone a running zone?”. Sure, this is possible, but unfortunately not with the zoneadm command, so you will have to do some manual work. I did not have time to do it right there but promised to follow-up on my blog. But first a word of caution: Though I have done this with running zones multiple times with success, this does not mean it is guaranteed to work. You freeze the ZFS filesystem with a running zone midflight which is similar to turning off the power off an on and hoping that your computer works fine afterwards. From my experience, it does work fine almost all the time (and Oracle itself should be protected by instance recovery), just don’t count on it or blame me if the clone won’t start up or behave unexpectedly.
So basically, what ‘zoneadm clone’ will do for you is create a ZFS clone from the source zone’s filesystem and then modify a text file while ignoring a ‘DO NOT EDIT THIS FILE’-hint.

# the first two steps are exactly the same as above
zonecfg -z oow01 export > oow43.cfg -- dump the existing config to a file
# edit the file oow43 for a new ip address and a new zonepath
zonecfg -z oow43 -f oow43.cfg -- import the new config
zfs snapshot rpool/zones/oow42@clone_oow43 -- snapshot running zone
zfs clone rpool/zones/oow42@clone_oow43 rpool/zones/oow43 -- make a clone
# now, edit the file /etc/zones/index
# change the status of your zone from configured to installed
# and set a unique id

When I choose a new id for the zone, I just copy an existing one and modify a few digits. I don’t know if this is the ‘right’ way to do it but it works for me. Now, you can boot up this newly cloned zone and that is it. The hostname inside the zone is still the same as the original, but I usually don’t care or simply correct this.

Be careful with autotrace in SQLDeveloper

I was tuning a query today that involved a couple of nested loops. I wanted to see the effect of different values for my bind variable(s) on the actual work being performed, so naturally autotrace was going to be the weapon of choice, especially since it is so convenient to use from SQLDeveloper. So I was stuffing my query with different variables, checked that the execution plan stayed the same (of course it did) and observed autotrace’s value for ‘consistent gets’. But something was odd. When I used a value that yielded no (or just very few) results for the driving part of the nested loop, the consistent gets were higher than with a value that yielded many results. After all, the work being done or blocks needed to be read should be dependant on how often I execute that loop. For an easy (and similar) example, look at this query with bind values of either ‘%x%’ or ‘%@%’:

SELECT *
FROM oe.orders o,
oe.customers c
WHERE c.customer_id = o.customer_id
AND c.cust_email LIKE :1

If I look for order by customers with an ‘x’ in their email address, I get 2 results and it takes 27 consistent gets. When looking for customers with an ‘@’ (all of them) I get 105 results but only 19 consistent gets are needed. I spent the next 60 minutes trying find an explanation for this behavior and really started to doubt my knowledge about Oracle. So I double-checked with autotrace from sqlplus which yielded the expected results: more rows – more CGs. So the problem had to be with SQLDeveloper and it was a rather easy find from there on. SQLDeveloper has a pagination with the default page size being set to 50. This basically means that it will only fetch the first 50 rows and wait for the user to scroll down before going on with the query. Which is a very good idea for most general purpose database work because you might now want to wait for all results to be fetched and transferred over the net.
But this poses a problem in combination with the built-in autotrace function because SQLDeveloper will stop (or rather pause) executing the query after the first 50 results. And autotrace can only output the work being done until that point. In the example above, this meant that with the ‘%x%’ predicate the outer or driving loop had to go through the whole full scan of the customers table. When given the other predicate, it had to execute the inner loop 50 times (a simple index lookup) and stopped after that, not even halfway ‘done’. The workaround for me was to increase the pagination size. You will find this at Tools->Preferences->Database->Advanced under ‘SQL Array Fetch Size’ but there seems to be an upper limit and I would also wish that they would either not display the wrong autotrace results or introduce an option that disables pagination when using autotrace.