Diagnosing “random” connection resets in 11g

This was a pretty weird problem that I dealt with over the past few days. We migrated a database system from 10g to 11g a while back and almost everything worked just fine. Of course, we also rolled out new clients to the app servers, and those pretty much worked too. But occasionally, servers would get a “Connection reset” exception when trying to connect. All the information we had on this issue was the stack trace from the driver, which really does not tell you a whole lot.

Caused by: java.net.SocketException: Connection reset
    at java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:96)
    at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
    at oracle.net.ns.DataPacket.send(DataPacket.java:219)
    at oracle.net.ns.NetOutputStream.flush(NetOutputStream.java:208)
    at oracle.net.ns.NetInputStream.getNextPacket(NetInputStream.java:224)
    at oracle.net.ns.NetInputStream.read(NetInputStream.java:172)
    at oracle.net.ns.NetInputStream.read(NetInputStream.java:97)
    at oracle.net.ns.NetInputStream.read(NetInputStream.java:82)

So eventually this landed on my desk. We checked the sqlnet log files and settings, but things looked good there. We also checked network settings and statistics in the OS, and things looked good there as well. I would have liked to blame this on the networking guys, but all systems are on the same subnet, so this could not be an issue with a firewall or router.

So in all my desperation I asked the mighty Google machine, which came up with this OTN forum thread suggesting that this could be related to the implementation of random number generation on Linux systems.

At some point during connection establishment, the driver requires some random numbers, which by default are read from /dev/random on Linux systems. But this pseudo-device blocks when there is not enough entropy in the system to ensure “real randomness”. Entropy is generated by mouse and keyboard input as well as some network drivers. It looks like our troubled machines were at times not generating entropy as fast as they needed it, and this caused the JDBC driver to fail to connect. Anyway, the suggested workaround of setting the default source of randomness to the non-blocking /dev/urandom helped.
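For reference, here is a minimal sketch of how that workaround is often applied on the command line. It assumes a Linux box and a JVM that honors the java.security.egd system property; the helper names and app.jar are placeholders of mine, not something from the forum thread. The odd-looking /dev/./urandom spelling is a commonly cited trick, since some JDK versions reportedly treat the plain file:/dev/urandom value specially.

```shell
#!/bin/sh
# Sketch only: check the kernel entropy pool and build a JVM option that
# points SecureRandom seeding at the non-blocking device.
# entropy_avail, urandom_jvm_opt and app.jar are hypothetical names.

# Print the currently available entropy if the kernel exposes it (Linux).
entropy_avail() {
    if [ -r /proc/sys/kernel/random/entropy_avail ]; then
        cat /proc/sys/kernel/random/entropy_avail
    else
        echo "unknown"
    fi
}

# JVM option that selects /dev/urandom as the entropy gathering device.
urandom_jvm_opt() {
    echo "-Djava.security.egd=file:/dev/./urandom"
}

echo "entropy_avail: $(entropy_avail)"
# Print the command line instead of running it; app.jar is a placeholder.
echo "java $(urandom_jvm_opt) -jar app.jar"
```

A low number from the first line while connections are failing would at least be consistent with the blocking /dev/random theory. The same setting can usually also be made permanent through the securerandom.source entry in the JVM's java.security file.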

There is also a rather well-hidden Metalink note on this, which also links to this really good blog article.

I remember that this bit me in the ass once before, about 10 years ago, when an SSL-enabled Apache web server refused to start under certain conditions. Our first workaround was to send someone to the box to move the mouse or type stuff on the keyboard…

Speaking at UKOUG in Birmingham

This year’s event calendar is quickly filling up, with my presentation “Setting up RAC for planned downtime” being accepted at the UK user group’s conference in December. I have not been to this conference before and am thrilled to finally check out Europe’s largest English-speaking Oracle event. I have heard only good things about it, and I am sure that there will be lots of smart, nice and interesting people to meet and exchange ideas with.

This might also be a good chance to get together with other RAC SIG members from the UK and Europe. Let me know if you are interested in setting something up.

VDI Windows on iPad

Yesterday, I took Oracle’s OVDC for iPad for a test drive. As expected, the installation was free and just as easy as with any other app. The setup offered a choice between automatic discovery and manual setup. I chose to enter the server IP manually since our wireless is on a different (but routed) network. There was another option for VPN, but I did not take a closer look at that. I am wondering, though, whether the app has its own VPN client or uses the iPad’s system-wide VPN.

Things were great from there. The app connected to my server and presented a sharp and crisp image, and everything ran smoothly. A few things were a little annoying: I was not able to do a double-click. I can only assume that double-tapping too fast does not register, while tapping too slowly makes Windows think that you want to rename the file or shortcut or whatever.
Also, the keyboard driver did not work as expected. Special characters came out weird: with a US keyboard layout on the iPad, the keys were actually interpreted as on a German keyboard. Or the other way around.
Playing Flash videos did not go very well; playback was pretty slow, so I wouldn’t want to watch a full movie on it. I also tried the stream of a webcam at the office, and that worked really well.

As a first impression, I would say that the OVDC app is great for showing off your VDI setup and also for the occasional emergency task when you really need access to an Excel file on the road. But it will probably not revolutionize the way you perform everyday desktop work. I also wonder if this will spark a new ecosystem where Oracle partners rent virtual machines to regular people.
If I find the time, we will set up a demo system at the data center to see how this works over the internet, and give test accounts to friends and family to see whether this is actually useful or just something that only true geeks appreciate.

Good news from VDI and Sun Ray

Today seems to be an excellent day for users of the desktop virtualization software from Sun/Oracle. VDI has been released in version 3.3, and we have already had the opportunity to test it. The administration interface has been tidied up a bit. Above all, though, performance has been tuned: the administration UI responds faster, and the virtual machines now run more smoothly and quickly too. In addition, Oracle Linux is now supported as a virtualization platform alongside Solaris. That may lower the barrier for customers who have had little experience with Solaris so far and still want to get to know this exciting technology.

Also brand new is the software client for the iPad! From now on you can bring your VDI or Sun Ray desktop to the iPad as well. That not only smells like extreme geek fun but will surely also make quite an impression in presentations of the VDI environment.

And then I also read that Oracle has won a design award for the third generation of Sun Ray clients. The devices really do look good, are easy to set up (the stand is now just clipped on instead of screwed on) and now have a button that switches the DTU from its already low power consumption into a standby mode.

Modify service property in Solaris

This is nothing too exciting, but it is something that I always seem to forget. So I am hoping that by writing it down once I might have a better chance of remembering it. Or at least of remembering where to look for pointers next time.

I was fiddling with ZFS auto-snapshots on a server. They were set up so that daily snapshots were kept for a month, and I simply wanted to reduce that to a week. I knew that this is configured through SMF service properties, which can be listed like this:

bl3:~# svcprop auto-snapshot:daily
zfs/auto-include boolean true
zfs/avoidscrub boolean false
zfs/backup astring none
zfs/backup-lock astring unlocked
zfs/backup-save-cmd astring not\ set
zfs/fs-name astring //
zfs/interval astring days
zfs/label astring daily
zfs/offset astring 0
zfs/period astring 1
zfs/sep astring _
zfs/keep astring 31
zfs/snapshot-children boolean false
zfs/verbose boolean true
general/action_authorization astring solaris.smf.manage.zfs-auto-snapshot
general/value_authorization astring solaris.smf.manage.zfs-auto-snapshot
general/enabled boolean true
...

I knew I had to modify the zfs/keep property, but I just could not remember how to modify these properties through svccfg. After 5 minutes of googling I found this nice summary and was able to put the pieces and syntax together:

bl3:~# svccfg -s auto-snapshot:daily
svc:/system/filesystem/zfs/auto-snapshot:daily> setprop zfs/keep=7
svc:/system/filesystem/zfs/auto-snapshot:daily> exit
bl3:~# svcadm refresh auto-snapshot:daily

Now I can only hope I remember the ‘-s’ flag to svccfg and setprop.
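As a note to my future self, the same change can probably be scripted without the interactive svccfg prompt. This is only a sketch: the set_smf_prop and run helpers and the RUN switch are hypothetical names of mine, and it assumes svccfg accepts the setprop subcommand with an explicit type directly on the command line. By default it only prints the commands so they can be reviewed first.

```shell
#!/bin/sh
# Sketch only: change one SMF property non-interactively.
# run and set_smf_prop are hypothetical helpers, not Solaris tools.
# Commands are printed by default; set RUN=1 to actually execute them.

run() {
    if [ "${RUN:-0}" = "1" ]; then
        "$@"
    else
        echo "$*"
    fi
}

# usage: set_smf_prop <service instance> <pg/property> <type> <value>
set_smf_prop() {
    fmri=$1
    prop=$2
    type=$3
    value=$4
    # Set the property, naming its type explicitly.
    run svccfg -s "$fmri" setprop "$prop" = "$type:" "$value"
    # Make the running service pick up the changed configuration.
    run svcadm refresh "$fmri"
    # Read the property back to confirm the new value.
    run svcprop -p "$prop" "$fmri"
}

set_smf_prop auto-snapshot:daily zfs/keep astring 7
```

Running it with RUN=1 on the server should perform the same three steps as the transcript above, plus a final svcprop check of the new value.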

Speaking in San Francisco again

I was extremely excited to learn that Oracle has invited me to speak at the Oracle OpenWorld conference again in 2011. It is the largest event for database professionals and, in combination with JavaOne, attracts more than 42,000 professionals to the beautiful city of San Francisco each fall. It is a week of learning through more interesting sessions than anyone could fit into their schedule, a chance to interact with Oracle engineers at the demogrounds and exhibition halls, and of course a lot of networking with old friends and meeting new ones at many great parties like the famous OTN night and the appreciation event, which is going to feature Sting and Tom Petty and the Heartbreakers. But there are also a ton of fun smaller events like the blogger meetup and OPN partner activities as well.

My presentation will cover how Real Application Clusters can be used to make planned downtime more pleasant for everybody involved. With the right preparation, setup and practice, a lot of maintenance work does not need to be performed during late-night or weekend windows. Many of these techniques and preparations also apply to unplanned downtime due to hardware failure, but in my experience that is much less likely to occur than patching the database or operating system or swapping hardware parts. So I will be talking about how to set up database services, drivers and connection pools, and explain what to avoid in your application development. I am also planning to include a live demo. Not just because I strongly believe in the power of real numbers, code snippets and demos, but also because it adds excitement: so much more can go wrong.