Another Successful Go-Live for SAP HANA on POWER!

Another SAP HANA deployment on IBM Power delivers a lower-cost, highly virtualized, flexible, no-compromise solution versus the alternative.

My client is now live, running Suite on HANA 2.0 on an IBM POWER & Storwize solution after a successful weekend migration.

The focus of this blog is what led to my client successfully migrating their SAP ECC environment to SAP HANA using capabilities inherent to IBM Power servers. These capabilities are available to every SAP HANA client who chooses IBM Power over the only other platform supported for SAP HANA workloads. That alternative is built on Intel processors running either bare-metal or virtualized, and if virtualized it will most likely be VMware, which I refer to as a "compromise" solution full of gotchas, limitations, restrictions and constraints. IBM Power using PowerVM, on the other hand, is a "no compromise" option. I'll give some examples to back up that bold statement below.

Back to my client. In the Fall of 2018, the client chose an IBM Power solution supporting four environments in a fully virtualized two-site design. They chose to deploy HANA in parallel to their existing SAP ECC environment, which runs on IBM i. For storage, each site uses new IBM Storwize all-flash arrays. For high availability and resiliency, the solution uses SuSE HA clustering across a pair of Production servers, with SAP HANA System Replication running locally from Primary Prod to Failover Prod and then from Primary Prod to the DR server.

DR consists of a (very) large Scale-up IBM POWER8 server hosting the HANA DB and App (NetWeaver) VMs for every environment (Sandbox, Dev, QAS, etc.). Production uses a pair of smaller versions of the DR server just for the HANA DB VMs, plus a pair of POWER9 Scale-out servers hosting redundant App servers.

Each of the IBM Power systems in this SAP HANA environment uses Dual VIOS, or Virtual I/O Servers. For the uninitiated, Dual VIOS means there are two special VMs which virtualize and manage the I/O for every other VM. A VM can technically use any combination of dedicated or virtual I/O, but when a VIOS is used it typically manages both network and storage I/O, with one common exception: I often see a client use a dedicated Fibre Channel adapter for physical tape connections.

The benefits of implementing Dual VIOS are many. They require fewer adapters, leading to smaller servers and/or less I/O expansion, and they provide I/O path redundancy for network and storage while also increasing serviceability, since the client can do just about any kind of maintenance on the I/O path transparently to the workloads. Very little downtime is ever required to service and maintain the I/O subsystem, including adding, removing, upgrading and configuring adapters and ports, updating drivers, and so on. If a port or adapter fails, or if something were to happen to a VIOS (very rare), the redundant ports, adapters and VIOS automatically service the I/O from the remaining resources. There are many options for deploying redundant VIOS, from active/passive to active/active, for both network and storage I/O. Another benefit of virtualizing the I/O is that it enables features such as Live Partition Mobility and Simplified Remote Restart ... no compromises, remember?!

I should disclose that my company has an SAP migration, consulting and managed services practice. We were selected to provide both the infrastructure implementation and the SAP ECC to HANA migration services. Starting with the lower environments late last year (2018), my SAP services team concluded with Prod in May 2019. This client wanted each environment to be a full copy of the HANA DB, whereas it is common for clients to make the lower environments smaller. Our migration and infrastructure teams worked together during every step, creating additional VMs, adding storage and mount points, and dialing in cores and memory for every HANA DB and App VM in each environment.

With IBM POWER8 and POWER9 servers, SAP states that Production VMs are required to use dedicated (or dedicated-donating) cores, while non-Prod environments may use dedicated cores or Shared Processor Pools (SPP). This means clients can use every square inch of their IBM POWER servers, dialing in the cores and memory. For non-Prod, clients get finer granularity by sharing cores, leading to even greater resource efficiency. That leads to smaller and fewer servers (say it with me: "lower cost!") and makes for very happy clients.

Contrast this with the alternative Intel solution and its two choices: bare-metal, meaning no virtualization benefits, or virtualization. Bare-metal means one OS image per physical server; hopefully your infrastructure provider or SAP consultant does not under-size the cores and memory, as remediation can be very costly (possibly new servers if the current system is already maxed out). If the market-leading virtualization product (i.e. VMware) is chosen, its VMs do not offer the granularity available from IBM Power systems with their ultra-secure and rock-solid Power Hypervisor (PHYP).

The alternative virtualization product requires (i.e. limits or restricts) each VM to allocate cores in increments of half or full sockets. Let's say the HANA DB system is a 4-socket Intel server using 22-core processors, totaling 88 cores, with 1,536 GB of RAM per socket or 6 TB in total. If the HANA DB sizing called for 46 cores, you would be required to assign 3 sockets, or 66 cores, to a VM which only needs 46, wasting 20 cores plus all of the excess memory attached to that 3rd socket. Memory produces the same kind of waste: if the HANA DB VM requires 3,200 GB of memory, which is 128 GB more than is physically attached to 2 sockets, you must allocate all 4,608 GB of memory attached to 3 sockets, as well as all 66 cores on those sockets as previously described. That leaves 1,408 GB of memory stranded, unable to be used by any other VM on that server. Fortunately, the large DIMMs used to reach these capacities are cheap, so this waste is a drop in the bucket (in reality, these large DIMMs are NOT cheap at all!). SAP also states there is overhead incurred from this market-leading virtualization product. And if security is important, don't overlook the many vulnerabilities across the Intel Management Engine, VMware vSphere and Linux, plus the recent Meltdown, Spectre, Foreshadow and Zombieload side-channel threats.
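
To make the socket-granularity math above concrete, here is a minimal back-of-the-envelope sketch in bash. It is purely illustrative; the figures are the hypothetical 4-socket, 22-core, 1,536 GB-per-socket example above, not output from any sizing tool.

```bash
#!/bin/bash
# Back-of-the-envelope math for the full-socket allocation example above.
# Hypothetical figures: 4-socket server, 22 cores and 1536 GB per socket.
CORES_PER_SOCKET=22
MEM_PER_SOCKET_GB=1536
NEEDED_CORES=46      # what the HANA sizing asked for
NEEDED_MEM_GB=3200

# Round up to whole sockets for both cores and memory, then take the larger.
SOCKETS_FOR_CORES=$(( (NEEDED_CORES + CORES_PER_SOCKET - 1) / CORES_PER_SOCKET ))
SOCKETS_FOR_MEM=$(( (NEEDED_MEM_GB + MEM_PER_SOCKET_GB - 1) / MEM_PER_SOCKET_GB ))
SOCKETS=$(( SOCKETS_FOR_CORES > SOCKETS_FOR_MEM ? SOCKETS_FOR_CORES : SOCKETS_FOR_MEM ))

echo "Sockets allocated: $SOCKETS"
echo "Cores allocated  : $(( SOCKETS * CORES_PER_SOCKET )) (needed $NEEDED_CORES, wasted $(( SOCKETS * CORES_PER_SOCKET - NEEDED_CORES )))"
echo "Memory allocated : $(( SOCKETS * MEM_PER_SOCKET_GB )) GB (needed $NEEDED_MEM_GB GB, wasted $(( SOCKETS * MEM_PER_SOCKET_GB - NEEDED_MEM_GB )) GB)"
```

Running it reports 3 sockets, 66 cores (20 wasted) and 4,608 GB of memory (1,408 GB wasted) for a VM that only needed 46 cores and 3,200 GB.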

Some of these security vulnerabilities come with a performance penalty. SAP fully supports IBM POWER8 and POWER9 using SMT8, while many recommend disabling hyper-threading when using Intel servers. The SMT options are SMT8, SMT4, SMT2 and single-thread (SMT off). To view the SMT level on SuSE or RedHat, use `ppc64_cpu --smt`, and to change the SMT level to SMT4, for example, use `ppc64_cpu --smt=4`. Note the switch is a double dash ("--smt"), which many editors will silently convert into a single long dash. The default for Linux on POWER8 should be SMT8, although there are some situations where the default is SMT4; for POWER9, all supported Linux distributions should default to SMT8. Clients are also able to change the SMT level dynamically, per VM (yes, I said per "VM"). This is a huge feature, unavailable on Intel.
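
For reference, here is what those commands look like with the double dash intact; a minimal sketch for a SuSE or RedHat LPAR (ppc64_cpu ships in the powerpc-utils package):

```bash
# Show the current SMT mode for this LPAR/VM (note the double dash).
ppc64_cpu --smt

# Drop to SMT4 and back to SMT8; the change is dynamic, no reboot required.
ppc64_cpu --smt=4
ppc64_cpu --smt=8

# Per-core view of which hardware threads are online.
ppc64_cpu --info
```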

UPDATE (6/18/2019): SAP Note 2393917 now includes the following statement: "Due to the security vulnerability identified in CVE-2018-3646 VMware strongly advices customers to review and enable the recommendations indicated in VMware KB 55806. In particular, VMWare recommends that customers must ensure that after enablement, the maximum number of vCPUs per VM must be less than or equal to the number of total cores available on the system or the VM will fail to power on. The number of vCPUs on the VM may need to be reduced if it is to run on existing hardware. The number of vCPUs should be a factor of two. VMware is providing a tool to assist customers with the analysis of their VM configuration." This SAP Note is very explicit. Though they are not declaring that clients MUST disable hyper-threading, when a vendor "strongly advises" you to do something, they are essentially telling you to do it.

Here are a couple of articles on the performance impact, here and here, but do your own internet research as well. SAP has been remediating their own cloud Intel environment, with details in SAP Note 2709955, but they are copping out on what clients should do with their on-premise Intel servers, deferring instead to the relevant vendors (Intel, VMware, RedHat, SuSE, etc.) to determine the right course. It's not like hyper-threading is known for raw performance, or throughput for that matter, but anything that increases efficiency is a good thing. Lose hyper-threading and all you have left for threads are physical cores. That means less efficiency for the application, and HANA loves threads, which is why it scales so well on IBM Power. For Intel sizing, this likely requires more cores with associated memory, which leads to more sockets with more memory, which leads to larger, more expensive servers to obtain the desired scale. Using TDI Phase 5 based on SAPS values, any sizing would need to be adjusted to compensate for not having hyper-threading. With IBM Power, size it, dial in the cores and memory, tune the OS and re-use spare capacity for other VMs running Linux, AIX and IBM i (if supported by the chosen model) as needed. No compromises!

I thought I would create the table below comparing IBM's PowerVM against VMware's vSphere using a couple of SAP Notes. It gets really complicated trying to explain Intel and VMware capabilities (and, to a lesser degree, IBM Power and PowerVM), as what is supported varies by CPU architecture (Ivy Bridge, Haswell, Broadwell, Skylake and Cascade Lake for Intel; POWER8 vs POWER9 for IBM) as well as by VMware generation (vSphere pre-6.5, 6.5 and 6.7). I'll try to annotate the differences, but please reference the SAP Notes for the specifics: 2652670, 2718982 and 2393917 for VMware vSphere on Intel, and 2055470, 2230704, 2188482 and 2535891 for IBM Power using PowerVM.

Disclaimer: I am verifying some of the values shown in the table below and will update the table as needed.  With new features being supported regularly, it can be a challenge to remain current.

 

REMINDER: This chart ONLY applies to Business Suite (OLTP) and not to BW (SAP Business Warehouse). I'd have to build another table for those differences, as BW was not germane to this client or this blog.

 

| Feature | VMware vSphere (OLTP) | PowerVM (OLTP) |
| --- | --- | --- |
| Max VMs per system | 16** | POWER9 E950 & E980: 1-16 Production VMs*, up to 1,008 VMs total (15 Prod + 993 non-Prod). POWER8 E870(C) & E880(C): 1-8 Production VMs*, up to 1,000 VMs total (7 Prod + 993 non-Prod). POWER8 E850C: 1-6 Production VMs*, up to 920 VMs total (5 Prod + 915 non-Prod). POWER8 & POWER9 2-socket Scale-out: 1-4 Production VMs*, up to 426 VMs total (3 Prod + 423 non-Prod) |
| Max VM size | Up to 4 sockets (BW, SL, CL) | |
| VM size increments*** | 1, 2, 3 and 4 full sockets; or ½ socket (no multiples such as 1½, though 2, 3, 4, 5, 6, 8 and 8½ sockets are supported) | Dedicated & dedicated-donating: 1-core increments. Shared Processor Pool (non-Prod workloads): rule of thumb is 20 VMs per core |
| Threading | 2 threads per core (Hyper-Threading) | SMT8 per core or virtual processor |
| Max vCPUs per VM | 128 (6.5, 6.7) (BW); 192 (6.7) (BW); 128 (6.5, 6.7) (SL, CL); 224 (6.5, 6.7) (SL, CL) | Max cores per VM x SMT level = threads (see the thread math below the table) |
| Max cores | N/A | 176 cores (POWER8); 192 cores (POWER9) |
| Max memory | 4 TB (6.5, 6.7) (BW); 6 TB (6.5, 6.7) (SL, CL) | 16 TB (POWER8); 24 TB (POWER9) |
| Memory allocation | Only memory attached to ½ or full sockets; if more memory is required, the underlying ½ or full sockets go with it | As long as the VM has its minimum memory allocated, memory increments can be as small as 1 MB |
| SAP support may require reproducing an issue on bare-metal | Yes | No |
| Min performance degradation from virtualization per SAP | 14% for ½-socket VMs; avg of 10% over bare-metal | 0% |

PowerVM thread math (dedicated cores: max cores per VM x SMT level = threads):

  • POWER8: if a VM uses more than 96 cores, set SMT=4; otherwise set SMT=8. Ex 1: 176 x 4 = 704 threads. Ex 2: VM1 = 96 x 8 = 768 threads and VM2 = 80 x 8 = 640 threads, 1,408 threads in total.
  • POWER9: if a VM uses more than 48 cores, set SMT=4; otherwise set SMT=8. Ex 1: 128 x 4 = 512 threads plus 64 x 4 = 256 threads, 768 threads in total. Ex 2: VM1-VM4 at 48 x 8 = 384 threads each, 1,536 threads in total.
  • Shared Processor Pool: cores x up to 20 vCPUs per core x SMT level. POWER8: 176 x 20 x 8 = 28,160 threads. POWER9: 192 x 20 x 8 = 30,720 threads.

* VIOS VM’s do not count toward these totals

** Requires an 8-socket server to achieve 16 VM’s, with each using ½ sockets per VM. If using full socket VM’s the most possible would be 8 using the 8-socket server example.

*** You can mix ½ and full socket VM’s on the same server. Example would be 4 x ½ socket VM’s which would consume 2 sockets and 6 x 1 socket VMs consuming 6 sockets totaling 8 sockets.

BW = Broadwell

SL = Skylake

CL = Cascade Lake
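
As a sanity check on the PowerVM thread math in the table (dedicated cores: max cores per VM x SMT level; shared pool: cores x 20 vCPUs per core x SMT level), here is a small illustrative bash sketch using the same figures:

```bash
#!/bin/bash
# Thread math from the table above (illustrative only).
threads_dedicated() { echo $(( $1 * $2 )); }        # cores x SMT level
threads_spp()       { echo $(( $1 * 20 * $2 )); }   # cores x 20 vCPUs per core x SMT level

echo "POWER8, 176 dedicated cores @ SMT4  : $(threads_dedicated 176 4) threads"
echo "POWER9, 48-core VM @ SMT8           : $(threads_dedicated 48 8) threads"
echo "POWER8 shared pool, 176 cores @ SMT8: $(threads_spp 176 8) threads"
echo "POWER9 shared pool, 192 cores @ SMT8: $(threads_spp 192 8) threads"
```

That lines up with the 704, 384, 28,160 and 30,720 thread figures above.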

Back to the migration story. A chain is only as strong as its weakest link, and this client's environment is no different: their weak link happens to be their network. During the prep work for the Prod migration, they decided to move the primary Prod App server to the same frame hosting the primary Prod HANA DB VM. Using Live Partition Mobility, they dynamically moved the App VM onto that frame. This provided added network stability (because of their weak link) while reducing the chance of external network latency. It is difficult to coordinate downtime among the various stakeholders of a multi-billion-dollar company, not to mention the cost of downtime. Since they were migrating the database from the current ECC system over the network, the client liked having the option to granularly allocate resources and move VMs where they needed them. With IBM Power, clients have flexibility leading to fewer scheduled outages, as most maintenance and administration can be performed concurrently. Is anyone keeping score of all the advantages of IBM Power? I've put many hash marks in its column while placing many X's in the column for the alternative platform.

Regarding the network traffic, the network adapters are 10 GbE optical, configured in the VIOS using Shared Ethernet Adapters (SEA), which provide a virtual switch. Traffic enters and leaves the server through the SEA, while network packets that stay within the server are sent and received over the system's memory bus using a Power Hypervisor technology called Virtual Ethernet. This makes VM-to-VM data transfers within the frame very fast, with ultra-low latency and high efficiency. That is why the client wanted the App server sitting (logically, and I suppose physically as well) millimeters away from the HANA DB server.

The export of the 24 TB database from the source system began just after midnight Friday night and took approximately 6 hours. They next imported the data into the new environment, which took 24 hours. At the successful conclusion of the migration, they used LPM to move the App VM back to its original home on one of the POWER9 Scale-out servers. During the migration, the client chose to stuff more cores and memory into the App VM while it ran on the Scale-up server: it was originally sized at 8 cores and 256 GB RAM, but they called an audible and bumped it to 12 cores and 384 GB RAM. For those familiar (or not) with Power systems and common workloads, this is a lot of cores and memory for an App server, but since they had spare resources on the Scale-up server, they chose to "use 'em since they got 'em!" The LPM of the App VM from its temporary residence on the Scale-up server back to the Scale-out server took approximately 15 minutes from start to finish. During the LPM process, memory pages are copied from the source system to the target, including dirty memory pages, which on an active system is an ongoing activity; the more memory a VM has and the more actively that memory is used, the longer an LPM event can take to complete (Mr. Obvious is not needed, we got this!). When all memory pages are copied to the target system, the final cutover occurs in less than 2 seconds. The VM is no longer "on" the source frame and is running "on" the target frame as if nothing changed. I am curious to see whether the client has since reduced the cores and memory on the App VM back to its post-migration design sizing of 4 cores and 64 GB RAM. Normally, both actions could be performed dynamically, and though LPM is supported by SuSE and RedHat running on IBM Power, SAP doesn't yet support it, and I can verify the HANA DB doesn't like it.

The client has been working through their post-migration punch list, as the system went live on schedule at 4:30 pm that Sunday afternoon. Starting with a kick-off call Friday night at 8:30 pm and going live Sunday at 4:30 pm, they successfully moved their entire business's Production environment from SAP ECC to Suite on HANA in 44 hours (31 hours from the start of export to the finish of import).

Beginning last Fall, my team implemented the infrastructure starting with the DR environment. Over the following months, we received many requests from the SAP Basis team and our SAP migration team to create new environments for testing, add resources or mount points, or make some other change to a VM. The only feature which would have been beneficial to have, and which is still unsupported by SAP on either supported platform, is the ability to dynamically add and remove cores and memory. I do expect this to be supported on IBM Power with PowerVM shortly; these capabilities, especially dynamic memory add/remove, have been around for a decade and a half with IBM Power. The technology is very reliable, very consistent and very convenient. I'm sure purists for Intel solutions using VMware might argue their product works just as well. I believe SAP's own guidance says otherwise, and of course if someone would like to have some fun, we could set up a two-server solution and run through a battery of tests to compare virtualization features on both platforms. We'd have to run each under a heavy load, as it would be unfair to our audience to do these tests in a vacuum; that isn't real world. While at it, maybe we could run some informal Oracle database testing (sorry, can't help myself; read my previous blog to know of my Oracle obsession) along with a TCA/TCO analysis comparing how both platforms perform. We'll refer to it as "using a leading enterprise RDBMS product" so we don't upset the lawyers.

In summary, I’m obviously very proud how this solution performed as it took a strong, capable team to design, deploy and support this client for 4 separate migrations.  This no-compromise solution was >35% less costly vs a competing solution making the lives of this client much better from beginning to go-live.

Kudo’s to IBM as they have  the best platform for SAP HANA and also  tremendous SAP talent available to partners and clients for pre-sales support, IBM Lab Services for HANA installation assistance and IBM Linux for SAP HANA support.

 

Excellent Resources:

IBM Systems Magazine http://ibmsystemsmag.com/power/systems-management/data-management/sap-hana-landscapes/

 

SAP on Power blog by Alfred Freudenberger https://saponpower.wordpress.com

 

Linux on Power system tuning https://developer.ibm.com/linuxonpower/docs/linux-on-power-system-tuning/

 

Interesting article discussing the use of SMT8 on IBM POWER9 servers running DB2 https://developer.ibm.com/linuxonpower/2018/04/19/ibm-power9-smt-performance-db2/

Get more for less with POWER9

Who doesn’t expect more from a new product, let alone if it is the next generation of that product. Whether it is the “All New 2019 Brand Model” Car/Truck/SUV or, being a Macbook fan, the latest Macbook Pro and IOS (just keep the magnetic power cord)?

We want and expect more, and IBM POWER8 delivered more: built-in virtualization on the Enterprise systems, mobile capacity to share capacity between like Enterprise servers, a more robust reliability and availability subsystem, and improved serviceability features from the low end to the high end, all while dramatically improving performance over previous generations.

How do you improve upon something that is already really good? (I'm purposefully avoiding the word "great", as it would make me sound like a sycophant who would accept a rock with a Power badge and call it "great".) No, I am talking about actual, verifiable features and capabilities delivering real value to businesses.

Since the POWER9 Enterprise systems have yet to be announced and I only know what I know through my secret sources, I’ll limit my statements to just the currently available POWER9 Scale-out systems.

  • POWER9 Scale-out systems now include PowerVM Enterprise Edition licenses
  • Workload Optimized Frequency now delivers frequencies up to 20% above the nominal or marketed clock frequency
  • PCIe Gen4 slots to support higher-speed, higher-bandwidth adapters
  • From 2X to 4X greater memory capacity on most systems
  • New "bootable" internal NVMe support
  • Enhanced vTPM for improved Secure Boot and Trusted Remote Attestation
  • SR-IOV improvements
  • CAPI 2.0 and OpenCAPI capability; the latter, though I'm unaware of any supported features yet, is exciting in what it is designed and able to do
  • Improved price points using industry-standard (IS) memory

The servers also shed some legacy features that were getting long in the tooth.

  • Internal DVD drives, replaced by USB drive support
  • The S924 with the 18-drive backplane no longer includes the add-on 8 x 1.8″ SSD slots

As consumers, we expect more from our next-generation purchases, and the same holds true with POWER9: get more capability, features and performance for less money.

Contact me if you would like a quote to upgrade to POWER9, or if you are running x86 workloads and would like to hear how you may be able to do far more with less. My services team can also ease any concerns or burdens that may be keeping you from upgrading your aging and likely higher-cost servers to POWER9.


Have it your way with POWER9

IBM POWER offers system footprint and capabilities to meet any client requirement.

Henry Ford is credited with saying "you can have any color you want, as long as it is black". Consumers, whether on the retail or the enterprise side, like options and want to buy products the way they want them.

IBM’s recently announced AIX, IBMi and Linux capable POWER9 Scale-out servers as seen below or learn more about each here.

[Image: P9-portfolio]

These 6 systems join the AC922, the AI and cognitive beast that uses NVLink 2.0 and supports up to 6 water-cooled NVIDIA Volta GPUs.

With the 6 POWER9-based systems announced February 13, 2018, IBM is offering clients choice: virtually "any color you want". These systems come in 2 RU (rack unit) and 4 RU models with 1 or 2 sockets each, cores ranging from 4 to 24, and from 16 GB to 4 TB of system memory. Internal storage options range from HDD and SSD to NVMe, plus all of the connectivity options expected with PCIe adapters, except now we see newer adapters with more ports running at higher speeds.

Run AIX, IBM i and Linux on a 1 or 2-socket S922 or H922, a 1-socket S914, or a 1 or 2-socket S924 or H924. Need Linux only? You can choose any of the previously mentioned servers or the cost-optimized L922, with 1 or 2 sockets supporting from 32 GB up to 4 TB of RAM.

As part of a broader announcement, IBM issued a Statement of Direction indicating its intention to offer AIX on the Power-based Nutanix solution. It is reasonable to conclude there will be a POWER9-based Nutanix option as well. Expecting a POWER9 solution isn't surprising, but being able to run AIX on a non-PowerVM hypervisor is a big deal.

Looking at the entire POWER portfolio available to clients today, it ranges from the POWER8-based hyper-converged Nutanix systems to the mid-range and Enterprise-class POWER8 systems, which complement the POWER9 Scale-out and specialty systems.

 

[Image: POWER_portfolio_Feb2018]

Whether the solution is Nutanix running AIX and Linux, an Enterprise server with 192 cores, or a 1-socket L922 running PostgreSQL or MongoDB in a lab, businesses can "have it their way".


Upgrade to POWER9 – Never been easier!

POWER9 delivers more features and performance at a lower cost, and the ease and options available for upgrading have never been more compelling.

With an outstanding family of products in IBM's POWER8 portfolio, it seemed impossible for IBM to deliver a successor with more features, increased performance and greater value at a lower price point. On February 13th, IBM announced the POWER9 Scale-out products supporting AIX, IBM i and Linux, while the first POWER9 announcement came on December 5, 2017 with the AC922, an HPC and AI beast.

These newly announced PowerVM-based systems are 1- and 2-socket servers supporting up to 4 TB of DDR4 memory, starting with the robust 1-socket S914 and accelerating to the 2 RU 2-socket S922 and the 4 RU 2-socket S924. IBM also announced sister systems to the S-models purpose-built for SAP HANA: the H922 and H924, nearly identical to the S922 and S924. The H-models might be considered hybrid systems, as they come bundled with key software used with HANA while allowing a smaller AIX and IBM i footprint; sort of a hybrid between an S- and L-model system. There is also a Linux-only model, just as there was with POWER8: the L922, a 2-socket server that is also available in a 1-socket configuration. Each of these systems supports up to 4 TB of memory except the S914, which supports up to 1 TB.

Why should businesses consider upgrading to POWER9? If they are running on POWER7 or older systems, clients will save significantly by lowering hardware and software maintenance costs. Moreover, with the increased performance, clients will be able to consolidate more VMs than ever and reduce enterprise software licensing as well as its exorbitant maintenance cost.

While Intel cancels Knights Landing and struggles to deliver innovation and performance on their 10nm and 7nm platforms, remaining in a perpetual state of treading water at 14nm, what they are delivering seems to mostly benefit ISVs rather than businesses.

Traditional workloads such as Oracle, DB2, WebSphere, SAP (ECC and HANA), Oracle EBS, PeopleSoft, JD Edwards, Infor, EPIC and more all benefit. For businesses looking to develop and deploy 21st-century technologies, these purpose-built systems deliver new innovations ideally suited to workloads geared toward cognitive (analytics) and the web: NoSQL products such as Redis Labs, Cassandra, neo4j or Scylla, through open-source relational database products like PostgreSQL or MariaDB.

With the increased performance and higher efficiencies, all software boats will rise running on POWER9.

My team of architects and engineers at Ciber Global is prepared to help migrate workloads from your POWER5, POWER6, POWER7 and even POWER8 systems running AIX 5.3, 6.1, 7.1 and 7.2, as well as IBM i 6.1, 7.1, 7.2 and 7.3, to POWER9.

POWER9 supports AIX 6.1, 7.1 and 7.2; for IBM i, it supports 7.2 and 7.3. For client systems not at these levels, our consultants are available to guide them through the requirements and their upgrade options. Whether using Live Partition Mobility (aka the Easy Button) to move workloads from POWER6, POWER7 or POWER8 systems to POWER9, or using more traditional methods such as AIX NIM or IBM i Full System Save/Restore, there is likely an approach that meets the business's needs.

Rest assured, if you have doubts or concerns, reach out to my team at Ciber to discuss. And if you don't already have the Easy Button, IBM is offering a 60-day trial key to upgrade PowerVM Standard Edition licenses to Enterprise Edition on P6, P7 or P8 systems, making the upgrade to POWER9 not only financially easy but technically easy as well.

 

Does your IT shop use a combination wrench?

More and more, IT shops seem inclined to consolidate and simplify their infrastructure onto one platform: a mindset that all workloads can or should run on a single platform wrapped in "software-defined this" and "software-defined that". It tantalizes decision makers' senses as vendors claim to reduce complexity and cost.

Technology has become Ford vs Chevy, or John Deere vs Case International. While these four vendors each have some unique capabilities and offerings, they are all leaders in innovation and reliability. For IT shops, though, there is a perception that only Intel and VMware are viable infrastructure options for every workload type: mission- and life-critical workloads in healthcare, high-frequency financial transactions, HPC, Big Data, analytics, emerging cognitive and AI, but also the traditional ERP workloads that run entire businesses. SAP ECC, SAP HANA and Oracle EBS are probably the most common that I see, along with some industry-specific ones for industrial and automotive companies (I'm thinking of Infor).

When a new project comes up, there is little thought given to the platform. Either the business or the ISV will state what, and how many, of server X should be ordered. The parts arrive and eventually get deployed. Little consideration is given to the total cost of ownership or to the impact on the business caused by the system complexity.

I’ve watched a client move their Oracle workloads to IBM POWER several years ago. This allowed them to reduce their software licensing and annual maintenance cost as well as to redeploy licensing to other projects – cost avoidance by not having to add net new licensing.  As it happens in business, people moved on, out and up. New people came in whose answer to everything was Intel + VMware.  Yes, a combination wrench.

If any of you have used a combination wrench, you know there are a few times it is the proper tool. However, it can also strip or round over the head of a bolt or nut if too much pressure or torque is applied. Sometimes the proper tool is an SAE or metric box wrench, possibly a socket, maybe even an impact wrench. In this client's case, they have started to move their Oracle workloads from POWER to Intel: workloads currently running on standalone servers or, at most, 2-node PowerHA clusters are moving to 6-node VMware Oracle RAC clusters that have now grown to 8 nodes. Because we all know that Oracle RAC scales really well (please tell me you picked up on the sarcasm).

I heard from the business earlier this year that they had to buy over $5M of net-new Oracle licensing for this new environment. Because of this unforeseen expense, they are moving other commercial products to open source to offset the Oracle cost, since we all know that open source is "free".

Oh, I forgot to mention: that 8-node VMware Oracle RAC cluster is crashing virtually every day. I guess they are putting too much pressure on the combination wrench!

Oracle is a mess & customers pay the price!

Chaos that is Oracle

Clients are rapidly adopting open-source technologies in support of purpose-built applications while also shifting portions of on-premises workloads to major cloud providers like Amazon's AWS, Microsoft's Azure and IBM's SoftLayer. These changes are sending Oracle's licensing revenue into the tank, forcing them to re-tool; I'm being kind saying it this way.

What do we see Oracle doing these days?

  • Aggressively going after VMware environments that use Oracle Enterprise products for licensing infractions
  • Pushing each of their clients toward Oracle's public cloud
  • Drastically changing how Oracle is licensed in Authorized Cloud Environments using Intel servers
  • The latest evidence indicates they are set to abandon Solaris and SPARC technology
  • Ongoing staff layoffs as they shift resources, priorities and funding from on-premises to cloud initiatives

VMware environments

I’ve previously discussed for running Oracle on Intel (vs IBM POWER), Intel & VMware have an Oracle problem. This was acknowledged by Chad Sakac, Dell EMC’s President Converged Division in his August 17, 2016 blog in what really amounted to an Open Letter to King Larry Ellison, himself. I doubt most businesses using Oracle with VMware & Intel servers fully understand the financial implications this has to their business.  Allow me to paraphrase the essence of the note “Larry, take your boot off the necks of our people”.

This is a very contentious topic, so I'll not take a position but will try to briefly explain both sides. Oracle's position is simply stated even though its implications are complex: Oracle does not recognize VMware as an approved partitioning method (it views it as soft partitioning) for limiting Oracle licensing. As such, clients running Oracle in a VMware environment, regardless of how little or much is used, must license every Intel server under that client's enterprise (assuming vSphere 6+). They really do go beyond a rational argument, IMHO. Since Oracle owns the software and authored the rules, they use these subtleties to lean on clients and extract massive profits, despite what the contract may say. An example that comes to mind is how Oracle suddenly changed the licensing configurations for Oracle Standard Edition and Standard Edition One. They sunset both of these products as of December 31, 2015, replacing them with Standard Edition 2. In what can only be described as screwing clients, they halved the number of sockets allowed on a server or in a RAC cluster and limited the number of CPU threads per DB instance while doubling the minimum number of Named User Plus (NUP) licenses. On behalf of Larry, he apologizes to any 4-socket Oracle Standard Edition users, but if you don't convert to a 2-socket configuration (2 sockets for 1 server, or 1 socket each for 2 servers using RAC), then be prepared to license the server under the Oracle Enterprise Edition licensing model.

The Intel server vendors and VMware have a different interpretation of how Oracle should be licensed; I'll boil their position down to using host or CPU affinity rules. House of Bricks published a paper that does a good job of trying to defend Intel+VMware's licensing position. In the process, they also show how fragile the ground is under that approach, highlighting the risks businesses take if they hitch their wagons to HoB, VMware and, at least, Dell's recommendations.

This picture, which I believe House of Bricks gets the credit for creating, captures the Oracle licensing model for Intel+VMware environments quite well. When you pull your car into a parking garage, you expect to pay for one spot, yet Oracle says you must pay for every spot because you could technically park in any of them. VMware asserts you should pay for a single floor at most, because your vehicle may not be a compact car, may not have the clearance for all levels, and there are reserved and handicapped spots you can't use. You get the idea.

[Image: oracle_parking_garage]

It is simply a disaster for any business to run Oracle on Intel servers. Oracle wins if you do not virtualize and run each workload on standalone servers. Oracle wins if you use VMware, regardless of how little or much you actually use. Be prepared to pay, or to litigate!

Oracle and the “Cloud”

It is harder to provide sources for this topic, so I'll stick to anecdotal evidence; take it or leave it. At contract renewal, when adding products to contracts, or for new projects like migrating JD Edwards "World" to "Enterprise One" or a new Oracle EBS deployment, a business will be subject to an offer like this: "Listen Bob, you can buy 1000 licenses of XYZ for $10M, or you can buy 750 licenses of XYZ for $6M, buy 400 Cloud units for $3M and we will generously throw in 250 licenses ... you'll still have to pay support, of course. You won't get a better deal Bob, act now!" Yes, Oracle is willing to take a hit on on-premises license revenue while bolstering their cloud sales, simply shuffling the Titanic's deck chairs. These clients, for the most part, are not interested in the Oracle cloud and will never use it other than to get a better deal during negotiations. Oracle then reports to Wall Street that they are having tremendous cloud growth. Just google "oracle cloud fake bookings" to read plenty of evidence supporting this.

Licensing in the Cloud

Leave it to Oracle marketing to find a way to get even deeper into clients' wallets; congratulations, they've found a new way in the "Cloud". Oracle charges at least 2X more for Oracle licenses on Intel servers that run in Authorized Cloud Environments (ACE); you do not license Oracle in the cloud using the on-premises licensing factor table. The more VMs running in an ACE, the more you will pay versus an on-premises deployment. Properly licensing an on-premises Intel server (remember, the underlying point is always that Oracle on POWER servers is the better solution), regardless of whether virtualization is used and assuming a 40-core server, would equal 20 Oracle licenses (the licensing factor for Intel servers is 0.5 per core). Assume one VMware server, ignoring that it is probably part of a larger vSphere cluster. Once licensed, clients using VMware could theoretically run as many Oracle VMs as desired or as supported by that server; over-provision the hell out of it, it doesn't matter. Run that same workload in an ACE and you pay for what amounts to every core. Remember, on-premises it is 1 Oracle license for every 2 Intel cores, but in an ACE it is 1 OL per core.

AWS
Putting your Oracle workload in the cloud? Oracle's license rules stipulate that AWS labels both the physical core and its hyper-thread as vCPUs, so 2 vCPUs = 1 Oracle license (OL). Using the same 40-core Intel server mentioned above, with hyper-threading that is 80 threads, or 80 vCPUs. Under Oracle's new cloud licensing guidelines, that would be 40 OL. If this same server were on-premises, those 40 physical cores (regardless of threads) would be 20 OL ... do you see it? The licensing is double! If your AWS vCPU consumption is less than your on-premises consumption you may be OK, but as soon as your consumption goes above that point, break out your checkbook. Let your imagination run wild thinking of the scenarios where you will pay for more licenses in the cloud than on-prem.

Azure
Since Azure does not use hyperthreading, 1 vCPU = 1 core. The licensing method for Azure, or any other ACE where hyperthreading is not used, is 1 vCPU = 1 OL. If a workload requires 4 vCPUs, it requires 4 OL, versus the 2 OL it would need on-premises.
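
To put the on-premises and ACE math side by side, here is a minimal sketch using the same hypothetical 40-core Intel server and the rules described above (0.5 licenses per core on-premises, 2 AWS vCPUs per license, 1 Azure vCPU per license):

```bash
#!/bin/bash
# Hypothetical 40-core Intel server from the example above (illustrative only).
CORES=40

# On-premises: Intel licensing factor of 0.5 per core.
ONPREM_OL=$(( CORES / 2 ))

# AWS: hyper-threading exposed, so 2 vCPUs per core and 2 vCPUs = 1 license.
AWS_VCPUS=$(( CORES * 2 ))
AWS_OL=$(( AWS_VCPUS / 2 ))

# Azure (per the policy described above): 1 vCPU = 1 core = 1 license.
AZURE_OL=$CORES

echo "On-premises: $ONPREM_OL Oracle licenses"
echo "AWS        : $AWS_VCPUS vCPUs -> $AWS_OL Oracle licenses"
echo "Azure      : $CORES vCPUs -> $AZURE_OL Oracle licenses"
```

Same workload, same silicon: 20 licenses on-premises versus 40 in either cloud.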

Three excellent references to review: the first is Oracle's cloud licensing document, the second is an article by Silicon Angle giving their take on this change, and the last is a blog by Tim Hall, a DBA and Oracle ACE Director, sharing his concerns. Just search for this topic starting from January 2017 and read until you fall asleep.

Oracle
Oracle offers their own cloud and, as you might imagine, they do everything they can to favor it through licensing, contract negotiations and other means. Across SaaS, IaaS and PaaS, their marketing machine says they are second to none, whether the competition is Salesforce, Workday, AWS, Azure or anyone else. Of course, neither analysts, the media, the internet nor Oracle's own earnings reports show them having any meaningful success to the degree they claim.

Most recently, Oracle gained attention for updating how clients can license Oracle products in ACEs, as mentioned above. As you might imagine, Oracle licenses its products slightly differently in its own cloud than in competitors' clouds, but they still penalize Intel and even SPARC clients, whom they'll try to migrate into their cloud running Intel (since it appears Oracle is abandoning SPARC). The Oracle Cloud offers clients access to its products hourly or monthly, in metered and non-metered formats, across up to 4 different levels of software. Focusing on Oracle DB, the general tiers are the Standard, Enterprise, High-Performance and Extreme-Performance packages; think of them as Oracle Standard Edition, Enterprise Edition, EE + tools, and EE + RAC + tools. Oracle also defines the hardware tiers as "Compute Shapes"; the three tiers are General Purpose, High-Memory and Dedicated compute.

Comparing the cost of an on-premises perpetual license for Oracle Enterprise Edition against a non-metered monthly license for the Enterprise tier is apples to apples, since both use the Oracle Enterprise Edition Database. Remember, a perpetual license is a one-time purchase: $47,500 list price for EE DB, plus 22% per year in annual maintenance. The Enterprise tier using a High-Memory compute shape in the Oracle Cloud is $2,325 per month. This compute shape consists of 1 OCPU (Oracle CPU), or 2 vCPUs (2 threads on 1 core). Yes, just like AWS and Azure, Intel licensing in the cloud is at best 1.0 per core versus 0.5 per core on-premises. Depending on how far a server might be over-provisioned, and given that an on-premises server is fully licensed at half of its installed cores, there are several ways clients will vastly overpay for Oracle products in any cloud.

The break-even point for a perpetual license plus support versus a non-metered Enterprise subscription on a High-Memory compute shape is roughly 34 months.

  • Perpetual license
    • 1 x Oracle EE DB license = $47,500
    • 22% annual maintenance = $10,450
    • 3-year cost: $78,850
  • Oracle Cloud – non-metered Enterprise using the High-Memory shape
    • 1 x OCPU for the Enterprise Package on High-Memory compute = $2,325/mo
    • 1-year cloud cost = $27,900
    • 36-month cost: $83,700
  • The cross-over point is at roughly 34 months
    • $79,050 is the 34-month cost in the Cloud, versus $78,850 for 3 years of the perpetual license
  • An Oracle Cloud license becomes significantly more expensive after this
    • Year 4 for a perpetual license would be $10,450
    • 12 months in year 4 for the Cloud license would be $27,900
    • Annual cost increase for a single cloud license over the perpetual license = $17,450
  • Please make your checks payable to "Larry Ellison"
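
A quick sketch of that break-even math, assuming list prices, maintenance invoiced at the start of each contract year, and the $2,325/month non-metered rate quoted above:

```bash
#!/bin/bash
# Perpetual license + annual maintenance vs non-metered cloud subscription (illustrative).
EE_LICENSE=47500       # Oracle EE DB perpetual license, list price
ANNUAL_MAINT=10450     # 22% of list, assumed invoiced at the start of each year
CLOUD_MONTHLY=2325     # Enterprise package, High-Memory shape, 1 OCPU

for MONTH in 12 24 33 34 36 48; do
  YEARS_BILLED=$(( (MONTH + 11) / 12 ))          # maintenance years already invoiced
  ONPREM=$(( EE_LICENSE + ANNUAL_MAINT * YEARS_BILLED ))
  CLOUD=$(( CLOUD_MONTHLY * MONTH ))
  echo "Month $MONTH: perpetual \$$ONPREM vs cloud \$$CLOUD"
done
```

The cloud subscription passes the cumulative perpetual cost in month 34 ($79,050 vs $78,850) and pulls away from there.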

Oracle revenue’s continue to decline as clients move to purpose-built NoSQL solutions such as MongoDB, RedisLabs, Neo4j, OrientDB, Couchbase as well as SQL based solutions from MariaDB, PostgreSQL (I like EnterpriseDB) even DB2 is a far better value.  Oracle’s idea isn’t to re-tool by innovating, listening to clients to move with the market. No, they get out their big stick – follow the classic mistake so many great clients have done before them which is not evolve while pushing clients until something breaks.   Yes, Boot Hill is full of dead technology companies who failed to innovate and adapt. This is why Oracle is in complete chaos.  Clients beware – you are on their radar!


HPE, there you go again! Part 1

Updated Sept 05, 2016: Split the blog into 2 parts (Part 2). Fixed several typos and sentence-structure problems. Updated the description of the Superdome X blades to indicate they are 2-socket blades using Intel E7 chips.

It must be the season, as I find myself focused a bit on HPE. Maybe it's because they seem to be looking for their identity as they now consider selling their software business. This time, though, it is self-inflicted, as there has been a series of conflicting marketing actions. Their recent HPE RAS whitepaper discusses poor Intel server memory reliability, stating in the introductory section that memory is far and away the highest source of component failures in a system. Shortly after that RAS paper was released, they posted a blog written by the HPE Server Memory Product Manager stating "Memory Errors aren't the end of the World". Tell that to SAP HANA and Oracle Database customers; the latter I will be discussing in this blog.

HPE dares to step into the lion's den on a topic where it has little standing to imply it is an authority: how Oracle Enterprise software products are licensed on IBM Power servers. As a matter of fact, thanks go to the President of VCE, Chad Sakac, for acknowledging that VMware has an Oracle problem. On August 17th, Chad penned what amounts to an open letter to Larry and Oracle, begging them ... no, demanding that Larry leave his people alone. And by "his people", I mean customers who run Oracle Enterprise software products licensed by the core on Intel servers using VMware.

Enter HPE with a recent blog by Jeff Kyle, Director of Mission Critical Solutions. He doesn't say whether he is in a product development, marketing or sales role; I would bet it is one of the latter two, as I do not think a product developer would put themselves out there like Jeff just did. What he did is what all Intel marketing teams and sellers have done since the beginning of compute time, when the first customer thought of running Oracle on a server that wasn't "Big Iron".

Jeff sets up a straw man about "software licensing and support being one of the top cost items in any data center", followed by the obligatory claim that moving it to "advanced" yet "industry-standard x86 servers" will deliver the ROI to achieve every customer's goals while coming damn close to solving world hunger.

Next is where he enters the world of FUD while also stepping into the land of make-believe. Yes, Jeff talks about IBM Power technology as if it were treated by Oracle for licensing purposes the same as an Intel server, which it is not. You will have to judge whether he did this on purpose or simply out of ignorance. He does throw the UNIX platforms a bone by saying they have "excellent stability and performance", but stops there, only to claim they cost more than their industry-standard x86 counterparts.

He goes on to state UNIX servers <hold please> Attention: for purposes of this discussion, let's go with the definition that future UNIX references = AIX and RISC references = IBM POWER unless otherwise stated. As I was saying, Jeff next claims AIX and POWER are not well positioned for forward-looking cloud deployments, continuing his diminutive descriptors by suggesting proper clients wouldn't want to work with "proprietary RISC chips like IBM Power". But the granddaddy of all his statements, and the one that is completely disingenuous, is: <low monotone voice> "The Oracle license charge per CPU core for IBM Power is twice (2X) the amount charged for Intel x86 servers" </low monotone voice>.

In his next paragraph, he uses some sleight of hand by altering the presentation of the traditional full list-price cost for Oracle RAC that is associated with Oracle Enterprise Edition Database. Oracle EE DB is $47,500 per license plus 22% maintenance per year, starting with year 1. Oracle RAC for Oracle EE DB is $23,000 per license plus 22% maintenance per year, starting with year 1. If you have Oracle RAC, then by definition you also have corresponding Oracle EE DB licenses. The author uses a price of $11,500 per x86 CPU core, and although he isn't wrong per se (that is the $23,000 license times the 0.5 Intel core factor), I do not like that he does not disclose the full $23,000 license cost up front, as it looks like he is trying to minimize the cost of Oracle on x86.

A quick licensing review: Oracle maintains a license factor table that determines, by platform, how its core-based products are licensed. Most modern Intel servers carry a factor of 0.5 per core; IBM Power carries a factor of 1.0; and HP Itanium 95xx-based servers, so you know, also carry a factor of 1.0. Oracle, since they own both the table and the software in question, can manipulate it to favor their own platforms, as they do, especially with the SPARC servers, which range from 0.25 to 0.75 while Oracle's own Intel servers are consistent with other Intel servers at 0.5. Let's exclude the Oracle Intel servers for the purposes of this discussion for the reason I just gave: they manipulate the situation to favor themselves. All other Intel servers "MUST" license ALL cores in the server, with very, very limited exceptions, times the licensing factor of 0.5. Thus, a server with 2 x 18-core sockets has 36 cores: 2s x 18c = 36c x 0.5 license factor = 18 licenses. That equals 18 Oracle licenses for whatever product is being used.
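
The same arithmetic as a short sketch, using the example server above (2 sockets x 18 cores) and the 0.5 Intel / 1.0 Power factors:

```bash
#!/bin/bash
SOCKETS=2
CORES_PER_SOCKET=18
TOTAL_CORES=$(( SOCKETS * CORES_PER_SOCKET ))

# Factors expressed in tenths to stay in integer math (0.5 -> 5, 1.0 -> 10).
echo "Intel (factor 0.5)    : $(( TOTAL_CORES * 5 / 10 )) Oracle licenses"
echo "IBM Power (factor 1.0): $(( TOTAL_CORES * 10 / 10 )) Oracle licenses"
```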

What Jeff does next was a bit surprising to me. He suggests customers not bother with 1- and 2-socket Intel "scale-out" servers, which generally rely on the Intel E5 (aka EP) chips. By the way, Oracle's own Exadata and Oracle Database Appliance now ONLY use 2-socket servers with E5 processors; let that sink in as to why. The EP chips tend to have features that on paper mean less performance, such as less memory bandwidth and fewer cores, while other features such as clock frequency are higher, which is good for Oracle DB. These chips also have lower RAS capabilities, such as missing the MCA (Machine Check Architecture) recovery features found only in the E7 chips. He instead suggests clients look at "scale-up" servers, commonly classified as 4-socket and larger systems. This is where I need to clarify a few things. The HP Superdome X system, although it scales to 16 sockets, does so using 2-socket blades. Each socket uses the Intel E7 processor, which, given this is a 2-socket blade, runs counter to what I described at the beginning of this paragraph, where 1- and 2-socket servers use E5 processors. The design of the HP SD-X is meant to scale from 1 blade to 8 blades, or 2 to 16 sockets, which requires the E7 processor.

With the latest Intel Broadwell EX (E7) chips, the number of cores available for the HP SD-X ranges from 4 to 24 per socket. Configuring a blade with the 24-core E7 v4 (v4 indicates Broadwell) equals 48 cores, or 24 Oracle licenses; reference the discussion two paragraphs above. His assertion is that by moving to a larger server you get larger memory capacity for those "in-memory compute models", and it is this combination that will dramatically improve your database performance while lowering your overall Total Cost of Ownership (TCO).

He uses a customer success story for Pella (the window manufacturer), who avoided $200,000 in Oracle licensing fees after moving off a UNIX (not AIX in this case) platform to 2 x HPE Superdome X servers running Linux. This HPE customer case study says the UNIX platform Pella moved off 9 years ago was actually an HP Superdome with Intel Itanium processors running HP-UX. Did you get that? HP migrated off their own 9-year-old server while implying it might be from a competitor, maybe even AIX on Power, since that was referenced earlier in the story. That circa-2006 Itanium may have used a Montecito-class processor, and all of the early models before Tukwila were pigs, in my estimation: a lot of bluff and hyperbole but rarely delivering on the claims. That era of Superdome would also have used an Oracle license factor of 0.5, as Oracle didn't change it until 2010 and only on the newer 95xx-series chips. Older systems were grandfathered and, as I recall, as long as they didn't add new licenses they would remain under the 0.5 license model. I would expect a 2014/2015-era Intel processor to outperform a 2006-era chip, although if it had been up against a POWER5 1.9 or 2.2 GHz chip I might call it 50-50. :)

We have to spend some time discussing HP server technology, because Jeff is doing some major-league sleight of hand here. The Superdome X supports a special hardware partitioning capability (more details below) that DOES allow for reduced licensing, and that capability IS NOT available on non-Superdome x86 servers or from most other Intel vendors, unless they also have an 8-socket or larger system like SGI; oh wait, HP just bought them. Huh, wonder why they did that if the HPE Superdome X is so good.

Jeff then mentions an IDC research study; big deal, here is a note from my Pastor that says the HPE Superdome is not very good; who are you going to believe?

Moving the rest of the blog to Part 2.