SAP HANA on … #ChooseRight

Window of Opportunity

IBM Power systems started late in the SAP HANA market, on standby if you will, while IBM still had its System x business. Selling off that x86 business opened the door for IBM to work with SAP to offer clients a second platform choice, especially for the many ECC shops coming from enterprise platforms whose only option until then was to deploy their most critical business application on Intel.

With thousands of clients running SAP ECC using Oracle or DB2 on AIX, or running IBM i, there is a large and experienced install base. IBM's move to support little endian Linux natively, beginning with POWER8, eased any development concerns SAP may have had.

Rapid Growth

After the initial Ramp-up program, SAP announced the first GA of HANA for IBM Power in 2015. Since then, IBM Power has been the fastest-adopted platform for running SAP HANA.

Whether deploying a Greenfield or Brownfield SAP HANA solution, what makes IBM Power a better platform for SAP HANA than Intel-based systems? It starts with its DNA. IBM Power was born an enterprise system, living in the data center and running mission-critical workloads. Read the Forrester Total Economic Impact of IBM Power Systems for SAP HANA study to see how they rate the platform.

Flexibility, Performance & Resiliency

  • SAP Certified HANA Prod OLTP – 24 TB Scale-up
  • SAP Certified HANA Prod OLAP – 24 TB Scale-up
  • SAP Certified HANA Prod OLAP – 24 TB Scale-out
  • SAP exceptions available upon request
  • Up to 16 Production VMs on the E950 & E980
  • POWER9 servers can scale up to 64 TB of Memory
  • Highly resilient memory offering DDDC+1+1
  • Memory sparing, spare chips, ChipKill
  • HANA is always virtualized using the integrated IBM PowerVM hypervisor
  • Live Partition Mobility
  • Dynamically add / remove cores and memory
  • Supports TDI 5 delivering greater SAPS per core, up to 2X+
  • Offers Elastic Capacity on Demand activations of cores & memory
  • Highest reliability, excluding IBM Z, for 11 years running per ITIC
  • Concurrent maintenance features for firmware, drives, PCIe adapters, fans and power supplies
  • Concurrent maintenance for the I/O path from VM to SAN and network when using Dual Virtual I/O Servers
  • Dynamic tuning & optimization
  • Supports SAP Native Storage Extension and Fast Restart
  • IBM Storwize storage is optimized for SAP HANA on POWER

Additional features are expected to be announced any day (today's timestamp: November 5, 2019).

  • Persistent Memory at no additional cost
  • Persistent Memory with no performance degradation
  • Use of Shared Processor Pools for Production in addition to existing support for non-Prod
  • RHEL 8

Clients will deploy fewer systems while hosting more workloads per system, whether those are legacy SAP ECC, SolMan or non-SAP workloads such as legacy Oracle databases or new cognitive workloads.

Bringing it ALL together

SAP HANA: whether Suite on HANA, BW on HANA, S/4HANA or BW/4HANA, businesses tend to focus on the application, discounting the infrastructure as commodity, as if it were all the same. Because SAP HANA is designed as a scale-up, in-memory technology, IBM Power is the optimal platform to host it.

  • Primary benefits such as fewer systems with greater utilization.
  • Secondary benefits such as less infrastructure and fewer data center services required, i.e. fewer network & SAN ports, fewer power plugs, lower electrical consumption and less to cool.
  • Tertiary benefits, often more difficult to quantify, such as the downtime the business did NOT have to take for maintenance actions like updating firmware or adding an adapter for additional capacity.
  • Other downstream activities impacting the I/O paths, like a network switch service event, can all be accommodated with a properly architected and deployed Power solution.

These foundational capabilities allow the business to remain on schedule and keep consultants working rather than sitting idle.

#ChooseRight

There are only two options for SAP HANA. One option is the platform that forces you to trade one feature for another, making every decision a compromise. The other option is the platform offering complete flexibility, scalability and resiliency with no compromises, as even IDC states in this whitepaper. No one wants to go back to their board asking for more money, admitting they made a mistake, undersized or failed to anticipate something, so #ChooseRight!

HANA – Winning with IBM POWER

An IBM Power + IBM Storwize solution beat an Intel-based solution + competitive storage platform to migrate a client's SAP ECC environment to Suite on HANA … for less money!

The Business Challenge: A $4B manufacturing company decided to migrate their AnyDB to Suite on HANA, but on which platform?

The Competition: The client evaluated a TDI solution from an Intel-based vendor + major storage vendor as part of a converged infrastructure solution, as well as a TDI solution from IBM Power + IBM Storage. The SAP Basis Manager favored an Intel solution, either in the cloud or on-premises, based on the perception that it was the lowest-cost, optimal platform for HANA with features comparable to the other choice.

The Evaluation: I was the Client Executive and Executive Architect for my team, leading the design and competitive effort. The client's initial objection to IBM Power stemmed from their view of the current platform running SAP ECC. Though it had done so with virtually no issues for 20 years, they viewed it as expensive, inflexible and legacy. This was largely because they had not implemented all of the virtualization capabilities, having allowed the system to grow with dedicated resources. Also, after some in-house mistakes with the current storage, their answer had been to buy more hardware rather than tightening up their internal procedures and having key individuals take ownership of the mistake which led to the problem.

The client's SAP team thought this would be a simple exercise: get the Intel solution price and the IBM Power solution price, present both to management for the rubber stamp, and move off to the next phase, migration prep. Only there was a problem. The Intel solution costing was coming in higher than expected. The AnyDB size of 24 TB reduced down to ~4.1 TB, even to 3.6 TB with cleanup, per the SAP Quicksizer. True story: the Intel team proposed they start with 3 TB systems to get them on the floor, then wait and see if they truly needed more. Thankfully, the client didn't accept this generous offer, requesting they quote 6 TB systems as that was the next increment. With IBM Power, you can configure DIMMs to more closely match the required capacity, as the platform does not have the memory placement requirements (and limitations) that Intel platforms do. This client further required all environments be sized for the full HANA DB copy, which had grown from 4.1 TB to 5.1 TB (plus OS, taking it to 5.3 TB). Good thing they didn't go with those 3 TB Intel systems, eh?
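To put the granularity point in concrete terms, here is a minimal sketch in Python. Only the 5.3 TB requirement comes from this engagement; the configuration increments are assumptions for illustration, not vendor rules:

```python
# Illustrative only: compare the memory actually purchased when configs come
# in coarse fixed increments vs. a finer DIMM-level granularity. The 5.3 TB
# requirement is from the engagement above; the increments are assumptions.

SIZED_TB = 5.3                       # full HANA DB copy plus OS overhead

intel_increments_tb = [3, 6, 9, 12]  # assumed fixed configuration points
power_step_tb = 0.5                  # assumed DIMM-level granularity

intel_buy = next(t for t in intel_increments_tb if t >= SIZED_TB)
power_buy = -(-SIZED_TB // power_step_tb) * power_step_tb  # ceil to step

print(f"Sized requirement: {SIZED_TB} TB")
print(f"Coarse increments: buy {intel_buy} TB, strand {intel_buy - SIZED_TB:.1f} TB")
print(f"Finer increments : buy {power_buy} TB, strand {power_buy - SIZED_TB:.1f} TB")
```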

The Intel solution configured every HANA DB environment as bare-metal because VMware wasn't an option for virtualization: its maximum VM memory was less than the client required. The memory sizing also pushed the Intel solution into larger servers with more sockets. The environments consisted of Sandbox, Dev, QA and Production. Each had a full-memory-sized HANA DB on a bare-metal server, while VMware did host the NetWeaver application servers; again, there were multiple app servers for each environment. I don't know the exact numbers, but believe they were north of 16 Intel servers, with 10 of those configured as 6 TB bare-metal servers.

The IBM Power solution consisted of 3 x POWER9 servers. Yes, 3. Everything was fully virtualized, designed for maximum resiliency and serviceability: 2 smaller POWER9 servers for Production, each hosting 1 HANA DB VM + 2 NetWeaver VMs. DR hosted a server with 3X the capacity of a Production server. This single, highly performant and reliable server was configured for 17 VMs: Sandbox, Dev and QA each had 1 VM for the HANA DB and 2 VMs for their NetWeaver app servers; Prod failover had 1 HANA DB VM and 2 NetWeaver app server VMs, plus 2 VMs for the dual Virtual I/O Servers (VIOS). At each site, the servers were connected to an IBM flash storage solution using redundant IBM-branded SAN switches.

The Decision: Management was presented with both options. Feedback was given to the SAP team. My team didn't do anything but wait, as the word was the Intel team was scrambling to “fix” their numbers. Updated pricing was presented to management and a decision was made. They chose the IBM Power solution running SUSE Linux with IBM Storwize all-flash storage for their Suite on HANA solution. Their justification was simple. The incredible reliability and performance record of the existing IBM Power spoke for itself; they had actual experience running SAP on it, albeit not HANA, but who was I to split hairs. Secondly, and probably most important, the IBM Power solution was at least 35% less costly than the Intel solution. By the way, I had submitted a proposal for the migration services as well, going up against a couple of big players. I won … and beat them by 30% as well.

In Conclusion: Management didn't know at the time, or at least couldn't fully comprehend, the benefits they would obtain from the #NoCompromise virtualization capabilities which come with IBM Power servers. These played a key role during the 6-month migration window, giving their consultants and business leaders the flexibility to provision, change, update and modify (you name it) through the many requests which came up suddenly, without downtime, added cost or delays.

I oversaw the implementation and migration effort, which started by ensuring the solution was properly designed with all of its pieces, ordered and the client environment prepared. Then I worked closely with the migration team, ensuring we understood each other's roles, and not just our roles but the platform's capabilities, as well as the current and future timelines. We took this all the way to the final Go-Live migration, which went off like clockwork. Down at 10 pm Friday night. Migration done by 7 am Sunday morning. Clean-up and other details tended to, then Go-Live by 4 pm Sunday afternoon. And they haven't had to take an outage since … at least not for anything hardware related.

Reach out if you want to learn how my team designs world-class systems, using world-class assessment tools and migration techniques which make our solutions optimized, faster, more efficient and ultimately lower cost.

Another Successful Go-Live for SAP HANA on POWER!

Another SAP HANA deployment on IBM Power delivers a lower-cost, highly virtualized, flexible, no-compromise solution vs the alternative.

My client is now live, running Suite on HANA 2.0 on an IBM POWER & Storwize solution after a successful weekend migration.

The focus of this blog is to discuss what led to my client successfully migrating their SAP ECC environment to SAP HANA using capabilities inherent in IBM Power servers. These capabilities are available to every SAP HANA client who chooses IBM Power over the only other platform supported for SAP HANA workloads. The alternative option is built with Intel processors, running either bare-metal or virtualized. If virtualized, it will likely be VMware, which I refer to as a “compromise” solution full of gotchas, limitations, restrictions and constraints. IBM Power using PowerVM, on the other hand, is a “No Compromise” option. I'll give some examples of this bold statement below.

Back to my client. In the Fall of 2018, the client chose an IBM Power solution supporting four environments using a fully virtualized two-site design. The client chose to deploy HANA in parallel with their existing SAP ECC environment, which runs on IBM i. Regarding storage, each site uses new IBM Storwize AFA products. For high availability and resiliency, the solution uses SUSE HA clustering between a pair of Production servers, with SAP HANA System Replication locally from Primary Prod to Failover Prod and then from Primary Prod to the DR server.

DR consists of a (very) large scale-up IBM POWER server hosting the HANA DB & app (NetWeaver) VMs for every environment (Sandbox, Dev, QAS, etc.). Production uses a pair of smaller versions of the DR server, hosting just the HANA DB VMs plus VMs for redundant app servers.

Each of the IBM Power systems in this SAP HANA environment uses Dual VIOS, or Virtual I/O Servers. For the uninitiated, Dual VIOS means there are 2 special VMs which virtualize and manage the I/O for every VM. A VM can technically use any combination of dedicated or virtual I/O, but typically when a VIOS is used it manages both network and storage I/O, with one exception: I often see a client use a dedicated Fibre Channel adapter for physical tape connections. The benefits of implementing Dual VIOS are many. They require fewer adapters, leading to smaller servers and/or less I/O expansion; they provide I/O path redundancy to network & storage; and they increase serviceability, as the client can do just about any kind of maintenance on the I/O path transparently to the workloads. This means very little downtime is ever required to service and maintain the I/O subsystem, including adding, removing, upgrading and configuring adapters and ports, updating drivers, etc. If a port or adapter fails, or if something were to happen to a VIOS (very rare), the redundant ports, adapters and VIOS are configured to automatically service the I/O from the remaining resources. There are many options for deploying redundant VIOS, from active/passive to active/active, for both network and storage I/O. Another benefit of virtualizing the I/O is enabling features such as Live Partition Mobility and Simplified Remote Restart … no compromises, remember?!

I should disclose that my company has an SAP migration, consulting and managed services practice. We were selected to provide both the infrastructure implementation and the SAP ECC to HANA migration services. Starting with the lower environments, my SAP services team began late last year (2018), concluding with Prod in May 2019. This client wanted each environment to be a full copy of the HANA DB, whereas it is common for clients to make the lower environments smaller. Our migration and infrastructure teams worked together during every step, creating additional VMs, adding storage and mount points, and dialing in cores and memory for every HANA DB and app VM in each environment.

With IBM POWER9 servers, SAP states Production VMs are required to use dedicated (and dedicated-donating) cores, while non-Prod environments may use dedicated cores or Shared Processor Pools (SPP). This means clients can use every square inch of their IBM POWER servers, dialing in the cores and memory. For non-Prod, clients get finer granularity sharing cores, leading to even greater resource efficiency. This leads to smaller and fewer servers; say it with me, “lower cost!” That makes for very happy clients!

Contrast this with the alternative Intel solution and its two choices: bare-metal, meaning no virtualization benefits, or virtualization. Bare-metal means 1 OS image per physical server; hopefully your infrastructure provider or SAP consultant does not undersize the cores & memory, as remediation can be very costly (i.e. possibly new servers if the current system is already maxed out). If the market-leading virtualization product (i.e. VMware) is chosen, its VMs do not offer the granularity available from IBM Power systems with their ultra-secure and rock-solid Power Hypervisor (PHYP).

The alternative virtualization product requires (i.e. limits or restricts) each VM to allocate cores in increments of full or ½ sockets. Let's say the HANA DB system is a 4-socket Intel server using 22-core processors, totaling 88 cores, with 1,536 GB RAM per socket or 6 TB in total. If the HANA DB sizing called for 46 cores, you would be required to assign 3 sockets, or 66 cores, to a VM which only needs 46, wasting 20 cores plus all of the excess memory attached to that 3rd socket. Here is another example of waste when virtualizing on the alternative option. Say the HANA DB VM requires 3,200 GB of memory. Because this is 128 GB more than is physically connected to 2 sockets, you must allocate all 4,608 GB of memory attached to 3 sockets, as well as all 66 cores on those sockets, as previously described. 1,408 GB of memory is wasted, unable to be used by any other VM on that server. Fortunately, the large DIMMs used to achieve these capacities are cheap, so this waste is a drop in the bucket (in reality, these large DIMMs are NOT cheap at all!). SAP states there is overhead incurred from this market-leading virtualization product. Also, if security is important, don't overlook the many vulnerabilities in the stack: the Intel Management Engine, VMware vSphere and Linux itself, plus the recent Meltdown, Spectre, Foreshadow and Zombieload side-channel threats come to mind.
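The socket-granularity arithmetic above is mechanical enough to capture in a few lines. This is a simplified sketch (whole sockets only, ignoring the ½-socket case) of the rule as described, using the example server from this paragraph:

```python
# Sketch of the full-socket allocation rule described above, using the example
# server: 4 sockets x 22 cores (88 cores), 1,536 GB per socket (6 TB total).
import math

CORES_PER_SOCKET = 22
GB_PER_SOCKET = 1536

def sockets_for(cores_needed=0, gb_needed=0):
    """Smallest whole-socket count covering both the core and memory ask."""
    by_cores = math.ceil(cores_needed / CORES_PER_SOCKET)
    by_mem = math.ceil(gb_needed / GB_PER_SOCKET)
    return max(by_cores, by_mem)

# Example 1: VM needs 46 cores -> 3 sockets = 66 cores, 20 cores stranded.
s = sockets_for(cores_needed=46)
print(s, "sockets,", s * CORES_PER_SOCKET - 46, "cores stranded")

# Example 2: VM needs 3,200 GB -> 3 sockets = 4,608 GB, 1,408 GB stranded,
# and all 66 cores on those 3 sockets go with it.
s = sockets_for(gb_needed=3200)
print(s, "sockets,", s * GB_PER_SOCKET - 3200, "GB stranded")
```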

Some of these security vulnerabilities come with a performance penalty. SAP fully supports IBM POWER8 & POWER9 using SMT8, while many recommend disabling hyper-threading on Intel servers. SMT options are SMT8, SMT4, SMT2 and single-thread (ST) mode. To view the SMT level on SUSE or Red Hat, use `ppc64_cpu --smt`, and to change the SMT level, to SMT4 for example, use `ppc64_cpu --smt=4`. Note the switch is a double dash, “--smt”, as many editors will change it to a single long dash. The default for Linux on POWER8 should be SMT8, but there are some situations where the default is SMT4. For POWER9, all supported Linux distributions should default to SMT8. Also, clients are able to change SMT from one level to another dynamically, per VM (yes, I said per “VM”). This is a huge feature, unavailable on Intel.

UPDATE (6/18/2019): SAP Note 2393917 states the following: “Due to the security vulnerability identified in CVE-2018-3646 VMware strongly advices customers to review and enable the recommendations indicated in VMware KB 55806. In particular, VMWare recommends that customers must ensure that after enablement, the maximum number of vCPUs per VM must be less than or equal to the number of total cores available on the system or the VM will fail to power on. The number of vCPUs on the VM may need to be reduced if it is to run on existing hardware. The number of vCPUs should be a factor of two. VMware is providing a tool to assist customers with the analysis of their VM configuration.” This SAP Note is very explicit. Though they are not declaring clients MUST disable hyper-threading, when a vendor states they “strongly advise” you to do something, they are essentially telling you to do it.

Here are a couple of articles on the performance impact, here and here, but do your own internet research as well. SAP has been remediating their own cloud Intel environment, with details in SAP Note 2709955, but they are copping out on what clients should do about their on-premises Intel servers. Instead, they defer to the relevant vendors like Intel, VMware, Red Hat, SUSE, etc. to determine what clients should do. It's not like hyper-threading is known for performance, or throughput for that matter, but if it delivers something to increase efficiency, that is a good thing. Lose hyper-threading and all you have left for threads are physical cores. This means less efficiency for the application, as HANA loves threads, which is why it scales so well on IBM Power. For Intel sizing, this likely requires more cores with associated memory, which leads to more sockets, leading to larger, more expensive servers to obtain the desired scale. Using TDI Phase 5 based on SAPS values, any sizing would need to be adjusted to compensate for not having hyper-threading. With IBM Power: size it, dial in the cores and memory, tune the OS and re-use spare capacity for other VMs running Linux, AIX & IBM i (if supported by the chosen model) as needed. No compromises!
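As a purely hypothetical illustration of that sizing adjustment, with every number below being a placeholder rather than a vendor rating, removing hyper-threading's contribution grows the core count needed to hit the same SAPS target:

```python
# Hypothetical illustration only: if disabling hyper-threading costs some
# fraction of per-core throughput, the same SAPS target needs more cores.
# All values here are placeholders, not published benchmark or sizing data.
import math

TARGET_SAPS = 120_000
SAPS_PER_CORE_HT = 2_000    # assumed per-core rating with HT enabled
HT_CONTRIBUTION = 0.15      # assumed share of throughput lost without HT

cores_with_ht = math.ceil(TARGET_SAPS / SAPS_PER_CORE_HT)
cores_without = math.ceil(TARGET_SAPS / (SAPS_PER_CORE_HT * (1 - HT_CONTRIBUTION)))

print(cores_with_ht, "cores with HT vs", cores_without, "cores without")  # 60 vs 71
```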

I thought I would create the table below to compare IBM's PowerVM vs VMware's vSphere using the relevant SAP Notes. It gets really complicated trying to explain Intel and VMware capabilities, and even IBM Power & PowerVM to a lesser degree, as what is supported varies by CPU architecture (Ivy Bridge, Haswell, Broadwell, Skylake and Cascade Lake in the case of Intel; POWER8 vs POWER9 for IBM) as well as by VMware generation (vSphere pre-6.5, 6.5 and 6.7). I'll try to annotate differences, but please reference the SAP Notes for specific details: for VMware vSphere on Intel, 2652670, 2718982 and 2393917; for IBM Power using PowerVM, 2055470, 2230704, 2188482 and 2535891.

Disclaimer: I am verifying some of the values shown in the table below and will update the table as needed.  With new features being supported regularly, it can be a challenge to remain current.

 

REMINDER: This chart ONLY applies to Business Suite and not BW. I’d have to build another table for those differences as BW was not germane to this client or this blog.

 

| | VMware vSphere (OLTP) | PowerVM (OLTP) |
| --- | --- | --- |
| Max VMs per system | 16** | POWER9 E950 & E980: 1–16 Production VMs*; 1–1008 total VMs (15 Prod + 993 non-Prod). POWER8 E870(C) & E880(C): 1–8 Production VMs*; 1–1000 total VMs (7 Prod + 993 non-Prod). POWER8 E850C: 1–6 Production VMs*; 1–920 total VMs (5 Prod + 915 non-Prod). POWER8 & POWER9 2-socket Scale-out: 1–4 Production VMs*; 1–426 total VMs (3 Prod + 423 non-Prod) |
| Max VM size | Up to 4 sockets (BW, SL, CL) | |
| VM size increments*** | 1, 2, 3 and 4 full sockets; ½ socket (no multiples like 1.5), with 2, 3, 4, 5, 6, 8 and 8 ½-socket VMs supported | Dedicated & dedicated-donating: 1-core increments. Shared Processor Pool (non-Prod workloads): rule of thumb is 20 VMs per core |
| Threading | 2 threads per core (Hyper-Threading) | SMT8 per core or virtual processor |
| Max vCPUs | 128 (6.5, 6.7) (BW); 192 (6.7) (BW); 128 (6.5, 6.7) (SL, CL); 224 (6.5, 6.7) (SL, CL) | Dedicated cores: max cores per VM × SMT level = threads. POWER8: if a VM uses more than 96 cores set SMT=4, otherwise SMT=8 (Ex 1: 176 × 4 = 704 threads; Ex 2: VM1 = 96 × 8 = 768 threads and VM2 = 80 × 8 = 640 threads, 1,408 threads total). POWER9: if a VM uses more than 48 cores set SMT=4, otherwise SMT=8 (Ex 1: 128 × 4 = 512 threads and 64 × 4 = 256 threads, 768 threads total; Ex 2: 4 VMs × 48 cores × 8 = 1,536 threads total). Shared Processor Pool: cores × 20 vCPUs per core × SMT level (POWER8: 176 × 20 × 8 = 28,160 threads; POWER9: 192 × 20 × 8 = 30,720 threads) |
| Max cores per VM | N/A | 176 cores (POWER8); 192 cores (POWER9) |
| Max memory per VM | 4 TB (6.5, 6.7) (BW); 6 TB (6.5, 6.7) (SL, CL) | 16 TB (POWER8); 24 TB (POWER9) |
| Memory allocation | Only memory attached to the assigned ½ or full sockets; if more memory is required, the underlying ½ or full sockets go with it | As long as the VM has its minimum memory allocated, memory increments can be as small as 1 MB |
| SAP could require reproducing issues on bare-metal | Yes | No |
| Min performance degradation from virtualization per SAP | 14% for ½-socket VMs; average of 10% over bare-metal | 0% |

* VIOS VMs do not count toward these totals

** Requires an 8-socket server to achieve 16 VMs, each using ½ socket. Using full-socket VMs, the most possible on that same 8-socket server would be 8.

*** You can mix ½- and full-socket VMs on the same server. For example, 4 x ½-socket VMs would consume 2 sockets and 6 x 1-socket VMs would consume 6 sockets, totaling 8 sockets.

BW = Broadwell

SL = Skylake

CL = Cascade Lake
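For convenience, here is the thread arithmetic from the table as a small helper, following the SMT guidance in the SAP Notes cited above:

```python
# The thread arithmetic from the table. Per the guidance above: on POWER9,
# VMs over 48 cores run SMT4, otherwise SMT8 (the threshold is 96 on POWER8).

def dedicated_threads(cores, power9=True):
    smt = 4 if cores > (48 if power9 else 96) else 8
    return cores * smt

def spp_threads(pool_cores, smt=8, vcpu_per_core=20):
    # Shared Processor Pool rule of thumb from the table: 20 VMs per core.
    return pool_cores * vcpu_per_core * smt

print(dedicated_threads(176, power9=False))  # 176 * 4 = 704
print(dedicated_threads(48))                 # 48 * 8 = 384
print(spp_threads(192))                      # 192 * 20 * 8 = 30,720
```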

Back to the migration story. It is difficult to coordinate downtime among the various stakeholders of a multi-billion-dollar company, not to mention the cost of downtime. Since they were migrating the database from the current ECC system over the network, the client liked having the option to allocate resources granularly and move VMs where they needed them. With IBM Power, clients have flexibility leading to fewer scheduled outages, as most maintenance and administration can be performed concurrently. Is anyone keeping score of all the advantages obtained by IBM Power? I've put many hash marks in its column while placing many X's in the column for the alternative platform.

Regarding the network traffic, the network adapters are 10 GbE optical, configured in the VIOS using Shared Ethernet Adapters (SEA), which provide a virtual switch. Traffic enters and leaves the server through the SEA, while network packets within the server are sent and received over the system's memory bus using a Power Hypervisor technology called Virtual Ethernet. This makes data transfers from VM to VM within the frame very fast, with ultra-low latency and high efficiency. This is why the client prefers the app servers to reside (logically and physically) millimeters away from the HANA DB server.

The export of the 24 TB database from the source system began just after midnight Friday night. This took approximately 6 hours. They next moved to importing the data into the new environment, which took 24 hours. During the migration, the client chose to stuff more cores and memory into the app VM running on the scale-up server. The app VM was originally sized to use 4 cores and 64 GB RAM, but they called an audible and bumped it to 12 cores and 384 GB RAM. For those familiar (or not) with Power systems and common workloads, this is a lot of cores and memory for an app server, but since they had spare resources on the scale-up server, they chose to “use 'em since they got 'em!” After the migration, the app VMs were reduced from their inflated values to the go-forward (lower) values, removing cores 1 at a time and memory 16 GB at a time, performing this dynamically. Though dynamic add/remove of cores and memory is technically supported by SUSE and Red Hat on IBM Power, SAP doesn't yet support it on either IBM Power or Intel. I can verify it can be performed successfully when the changes are made in these small increments rather than in one action.
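For anyone curious what that step-wise reduction looks like operationally, here is a hedged sketch that simply emits one HMC `chhwres` command per step. The managed system and LPAR names are placeholders, and you should verify the exact flags against your HMC release before using anything like this:

```python
# Hedged sketch of the step-wise reduction described above: emit one HMC
# chhwres command per step (1 core / 16 GB at a time) rather than one large
# change. Names are placeholders; verify flags against your HMC level.

MANAGED_SYSTEM = "Server-9009-42A-SN1234567"   # placeholder
LPAR = "prod_app1"                              # placeholder

def stepdown_commands(cores_to_remove, gb_to_remove):
    cmds = []
    for _ in range(cores_to_remove):            # one dedicated core per step
        cmds.append(f"chhwres -m {MANAGED_SYSTEM} -r proc -o r -p {LPAR} --procs 1")
    for _ in range(gb_to_remove // 16):          # 16 GB (16384 MB) per step
        cmds.append(f"chhwres -m {MANAGED_SYSTEM} -r mem -o r -p {LPAR} -q 16384")
    return cmds

# App VM goes from 12 cores / 384 GB back down to 4 cores / 64 GB:
for cmd in stepdown_commands(cores_to_remove=8, gb_to_remove=320):
    print(cmd)
```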

The client has been working through their post-migration punch-list since the system went live on schedule at 4:30 pm that Sunday afternoon. Starting with a kick-off call Friday night at 8:30 pm and going live Sunday at 4:30 pm, they successfully moved their entire business's Production environment from SAP ECC to Suite on HANA in 44 hours (31 hours from the start of export to the finish of import).

Beginning last fall, my team began implementing the infrastructure, starting with the DR environment. Over the months, we received many requests from the SAP Basis team and our SAP migration team to create new environments for testing, add resources or mount points, or make some type of change to a VM. The only feature which would have been beneficial to have, and is still unsupported by SAP on either supported platform, is the ability to dynamically add/remove cores and memory. I do expect this feature to be supported on IBM Power with PowerVM shortly. These capabilities, especially dynamic memory add/removal, have been around for a decade and a half on IBM Power. The technology is very reliable, very consistent and very convenient. I'm sure purists for Intel solutions using VMware might argue their product works just as well. I believe SAP's own guidance says otherwise, and of course if someone would like to have some fun, we could set up a 2-server solution and run through a battery of tests to compare virtualization features on both platforms. We'd have to run each under a heavy load, as it would be unfair to our audience to do these tests in a vacuum; that isn't real world. While at it, maybe we could run some informal Oracle database testing (sorry, can't help myself; read my previous blog to know of my Oracle obsession) along with a TCA/TCO analysis comparing how both platforms perform. We'll refer to it as “using a leading enterprise RDBMS product” so we don't upset lawyers.

In summary, I'm obviously very proud of how this solution performed, as it took a strong, capable team to design, deploy and support this client through 4 separate migrations. This no-compromise solution was >35% less costly than a competing solution, making life better for this client from beginning to go-live.

Kudos to IBM, as they have the best platform for SAP HANA and also tremendous SAP talent available to partners and clients: pre-sales support, IBM Lab Services for HANA installation assistance and IBM Linux support for SAP HANA.

 

Excellent Resources:

IBM Systems Magazine http://ibmsystemsmag.com/power/systems-management/data-management/sap-hana-landscapes/

 

SAP on Power blog by Alfred Freudenberger https://saponpower.wordpress.com

 

Linux on Power system tuning https://developer.ibm.com/linuxonpower/docs/linux-on-power-system-tuning/

 

Interesting article discussing the use of SMT8 on IBM POWER9 servers running DB2 https://developer.ibm.com/linuxonpower/2018/04/19/ibm-power9-smt-performance-db2/

Get more for less with POWER9

Who doesn't expect more from a new product, let alone the next generation of that product? Whether it is the “All New 2019 Brand Model” car/truck/SUV or, for a MacBook fan, the latest MacBook Pro and macOS (just keep the magnetic power cord).

We want and expect more. IBM POWER8 delivered more: more performance, built-in virtualization on the Enterprise systems, mobile capacity to share compute between like Enterprise servers, a more robust reliability and availability subsystem, and improved serviceability features from the low end to the high end. Yes, all while dramatically improving performance over previous generations.

How do you improve upon something that is already really good? I'm purposefully avoiding the word “great”, as it would make me sound like a sycophant who would accept a rock with a Power badge and call it “great”. No, I am talking about actual, verifiable features and capabilities delivering real value to businesses.

Since the POWER9 Enterprise systems have yet to be announced and I only know what I know through my secret sources, I’ll limit my statements to just the currently available POWER9 Scale-out systems.

  • POWER9 Scale-out servers now include PowerVM Enterprise Edition licenses
  • Workload Optimized Frequency now delivers frequencies up to 20% above the nominal (marketed) clock frequency
  • PCIe4 slots to support higher speed and bandwidth adapters
  • From 2 to 4X greater memory capacity on most systems
  • New “bootable” internal NVMe support
  • Enhanced vTPM for improved Secure Boot & Trusted Remote Attestation
  • SR-IOV improvements
  • CAPI 2.0 and OpenCAPI capability; the latter, though I'm unaware of any supported features yet, is exciting in what it is designed and able to do.
  • Improved price points using industry-standard (IS) memory

The servers also shed some legacy features that were getting long in the tooth.

  • Internal DVD players; USB drives take their place
  • S924 with 18 drive backplane no longer includes add-on 8 x 1.8″ SSD slots

As consumers, we expect more from our next-generation purchases; the same holds true with POWER9. Get more capability, features and performance for less money.

Contact me if you would like a quote to upgrade to POWER9, or if you are running x86 workloads and would like to hear how you may be able to do far more with less, as well as how my services team can ease any concerns or burdens of moving off your aging, and likely higher-cost, servers by upgrading to POWER9.

 

 

Have it your way with POWER9

IBM POWER offers system footprint and capabilities to meet any client requirement.

Henry Ford is credited with saying “you can have any color you want, as long as it is black”. Consumers, whether on the retail or enterprise side, like options and want to buy products the way they want them.

IBM recently announced the AIX, IBM i and Linux capable POWER9 Scale-out servers, as seen below; learn more about each here.

P9-portfolio

These 6 systems join the AC922, an AI & cognitive beast using NVLink 2.0 and supporting up to 6 water-cooled NVIDIA Volta GPUs.

With the 6 POWER9-based systems announced February 13, 2018, IBM is offering clients choice, virtually “any color you want”. With these systems, get a 2 RU (rack unit) or 4 RU model with 1 or 2 sockets, cores ranging from 4 to 24 and system memory from 16 GB to 4 TB. Internal storage options run from HDD and SSD to NVMe, plus all of the connectivity options expected with PCIe adapters, except now we see newer adapters with more ports running at higher speeds.

Run AIX, IBM i and Linux on a 1- or 2-socket S922 or H922, a 1-socket S914, or a 1- or 2-socket S924 or H924. Need Linux only? You can choose any of the previously mentioned servers, or choose the cost-optimized L922 with 1 or 2 sockets supporting 32 GB up to 4 TB of RAM.

IBM issued a Statement of Direction, as part of a broader announcement, signaling the intention to offer AIX to clients on the Power-based Nutanix solution. It is reasonable to conclude there will be a POWER9-based Nutanix option as well. Expecting a POWER9 solution isn't surprising, but being able to run AIX on a non-PowerVM hypervisor is a big deal.

Looking at the entire POWER portfolio available today, it ranges from the POWER8-based hyper-converged Nutanix systems to mid-range & Enterprise-class POWER8 systems, which complement the POWER9 Scale-out and specialty systems.

 

POWER_portfolio_Feb2018

Whether the solution is Nutanix running AIX & Linux, an Enterprise server with 192 cores or a 1-socket L922 running PostgreSQL or MongoDB in a lab, businesses can “have it your way”.

 

 

 

Oracle is a mess & customers pay the price!

Chaos that is Oracle

Clients are rapidly adopting open source technologies in support of purpose-built applications while also shifting portions of on-premises workloads to major cloud providers like Amazon's AWS, Microsoft's Azure and IBM's SoftLayer. These changes are sending Oracle's licensing revenue into the tank, forcing them to re-tool … and I'm being kind saying it this way.

What do we see Oracle doing these days?

  • Aggressively going after VMware environments who use Oracle Enterprise products for licensing infractions
  • Pushing each of their clients toward Oracle’s public cloud
  • Drastically changing how Oracle is licensed for Authorized Cloud Environments using Intel servers
  • Latest evidence indicates they are set to abandon Solaris and SPARC technology
  • On-going staff layoffs as they shift resources, priorities & funding from on-premises to cloud initiatives

VMware environments

I've previously discussed how, for running Oracle on Intel (vs IBM POWER), Intel & VMware have an Oracle problem. This was acknowledged by Chad Sakac, Dell EMC's President of the Converged Division, in his August 17, 2016 blog, which really amounted to an open letter to King Larry Ellison himself. I doubt most businesses using Oracle with VMware & Intel servers fully understand the financial implications this has for their business. Allow me to paraphrase the essence of the note: “Larry, take your boot off the necks of our people”.

This is a very contentious topic, so I'll not take a position but will try to briefly explain both sides. Oracle's position is simple, even though its effects are very complex. Oracle does not recognize VMware as an approved partitioning method to limit Oracle licensing (it views VMware as soft partitioning). As such, clients running Oracle in a VMware environment, regardless of how little or much is used, must license it for every Intel server under that client's enterprise (assume vSphere 6+). They really do go beyond a rational argument, IMHO. Since Oracle owns the software and authored the rules, they use these subtleties to lean on clients, extracting massive profits despite what the contract may say. An example that comes to mind is how Oracle suddenly changed licensing configurations for Oracle Standard Edition and Standard Edition One. They sunset both products as of December 31, 2015, replacing them with Standard Edition 2. In what can only be described as screwing clients, they halved the number of sockets allowed on a server or in a RAC cluster and limited the number of CPU threads per DB instance, while doubling the minimum number of Named User Plus (NUP) licenses. On behalf of Larry, he apologizes to any 4-socket Oracle Standard Edition users, but if you don't convert to a 2-socket configuration (2 sockets for 1 server, or 1 socket each for 2 servers using RAC), be prepared to license the server under the Oracle Enterprise Edition licensing model.

The Intel server vendors and VMware have a different interpretation of how Oracle should be licensed. I'll boil their position down to using host or CPU affinity rules. House of Bricks published a paper that does a good job trying to defend Intel+VMware's licensing position. In the effort, they also show how fragile the ground under that approach is, highlighting the risks businesses take if they hitch their wagons to HoB, VMware and, at the least, Dell's recommendations.

This picture, which I believe House of Bricks gets the credit for creating, captures the Oracle licensing model for Intel+VMware environments quite well. When you pull your car into a parking garage, you expect to pay for 1 spot, yet Oracle says you must pay for every one, as you could technically park in any of them. VMware asserts you should pay for a single floor at most, because your vehicle may not be a compact car, may not have the clearance for all levels, and there are reserved & handicapped spots which you can't use. You get the idea.

oracle_parking_garage

It is simply a disaster for any business to run Oracle on Intel servers. Oracle wins if you do not virtualize, running each workload on standalone servers. Oracle wins if you use VMware, regardless of how little or much you actually use. Be prepared to pay, or to litigate!

Oracle and the “Cloud”

This topic is more difficult to source, so I'll stick to anecdotal evidence. Take it or leave it. At contract renewal, when adding products to contracts, or for new projects like migrating JD Edwards “World” to “EnterpriseOne” or a new Oracle EBS deployment, a business would be subject to an offer like this: “Listen Bob, you can buy 1000 licenses of XYZ for $10M, or you can buy 750 licenses of XYZ for $6M, buy 400 Cloud units for $3M and we will generously throw in 250 licenses … you'll still have to pay support, of course. You won't get a better deal Bob, act now!” Yes, Oracle is willing to take a hit on the on-premises license revenue while bolstering their cloud sales by simply shuffling the Titanic's deck chairs. These clients, for the most part, are not interested in the Oracle cloud and will never use it other than to get a better deal during negotiations. Oracle then reports to Wall Street they are having tremendous cloud growth. Just google “oracle cloud fake bookings” to read plenty of evidence supporting this.

Licensing in the Cloud

Leave it to Oracle marketing to find a way to get even deeper into clients' wallets; congratulations, they've found a new way in the “Cloud”. Oracle charges at least 2X more for Oracle licenses on Intel servers running in Authorized Cloud Environments (ACEs). You do not license Oracle in the cloud using the on-premises licensing factor table. The more VMs running in an ACE, the more you will pay vs an on-premises deployment. Properly licensing an on-premises Intel server (remember, the underlying proof is always that Oracle on POWER servers is the best solution), regardless of whether virtualization is used and assuming a 40-core server, would equal 20 Oracle licenses (the licensing factor for Intel servers is 0.5 per core). Assume 1 VMware server, ignoring that it is probably part of a larger vSphere cluster. Once licensed, a client using VMware could theoretically run as many Oracle VMs as desired or supported by that server. Over-provision the hell out of it; it doesn't matter. For that same workload in an ACE, you pay for what amounts to every core. Remember, if the core resides on-premises it is 1 Oracle license for every 2 Intel cores, but in an ACE it is 1 OL per core.

AWS
Putting your Oracle workload in the cloud? Oracle's license rules stipulate that in AWS, both the physical core and its hyperthread are labeled as vCPUs; thus, 2 vCPUs = 1 Oracle license (OL). Using the same 40-core Intel server mentioned above, with hyper-threading it presents 80 threads, or 80 vCPUs. Under Oracle's new cloud licensing guidelines, that is 40 OL. If this same server were on-premises, those 40 physical cores (regardless of threads) would be 20 OL … do you see it? The licensing is double!!! If your AWS vCPU consumption is less than your on-premises consumption, you may be OK. As soon as your consumption goes above that point, well, break out your checkbook. Let your imagination run wild thinking of the scenarios where you will pay for more licenses in the cloud vs on-prem.

Azure
Since Azure does not use hyper-threading, 1 vCPU = 1 core. The licensing method for Azure, or any other ACE where hyper-threading is not used, is 1 vCPU = 1 OL. If a workload requires 4 vCPUs, it requires 4 OL vs the 2 OL it would need on-premises.
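The licensing arithmetic from the last few paragraphs reduces to a few one-liners. This sketch encodes the factors as described in this post (my reading of the policy, not legal or contractual advice):

```python
# Factors as described above: on-premises Intel counts 0.5 license per core;
# AWS counts 2 vCPUs (1 core + its hyperthread) per license; Azure (no
# hyper-threading) counts 1 vCPU = 1 license.
import math

def onprem_licenses(cores):
    return math.ceil(cores * 0.5)

def aws_licenses(vcpus):       # vCPU = hardware thread
    return math.ceil(vcpus / 2)

def azure_licenses(vcpus):     # vCPU = core
    return vcpus

cores = 40
print(onprem_licenses(cores))  # 20 OL on-premises
print(aws_licenses(cores * 2)) # 40 OL for the same 40 cores in AWS
print(azure_licenses(4))       # 4 OL for a 4-vCPU Azure workload
```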

Three excellent references to review: the first is Oracle's cloud licensing document, the second is an article by SiliconANGLE giving their take on this change, and the last is a blog by Tim Hall, a DBA and Oracle ACE Director, sharing his concerns. Just search for this topic starting from January 2017 and read until you fall asleep.

Oracle
Oracle offers their own cloud, and as you might imagine, they do everything they can to favor it through licensing, contract negotiations and other means. From SaaS to IaaS to PaaS, their marketing machine says they are second to none, whether the competition is Salesforce, Workday, AWS, Azure or any other. Of course, neither analysts, media, the internet nor Oracle's earnings reports show they are having any meaningful success, at least to the degree they claim.

Most recently, Oracle gained attention for updating how clients can license Oracle products in ACEs, as mentioned above. As you might imagine, Oracle licenses its products slightly differently in its own cloud than in competitors' clouds, but they still penalize Intel and even SPARC clients, whom they'll try to migrate into the cloud running Intel (since it appears Oracle is abandoning SPARC). The Oracle Cloud offers clients access to its products hourly or monthly, in metered and non-metered formats, at up to 4 different levels of software. Focusing on Oracle DB, the general tiers are the Standard, Enterprise, High-Performance and Extreme-Performance packages. Think of them like Oracle Standard Edition, Enterprise Edition, EE+tools and EE+RAC+tools. Oracle also defines the hardware tiers as “Compute Shapes”. The three tiers are General Purpose, High-Memory and Dedicated compute.

Comparing the cost of an on-premises perpetual license for Oracle Enterprise vs a non-metered monthly license for the Enterprise tier is fair, because both use Oracle Enterprise Edition Database. Remember, a perpetual license is a one-time purchase: $47,500 list price for EE DB, plus 22% per year in annual maintenance. The Enterprise tier using a High-Memory compute shape in the Oracle cloud is $2,325 per month. This compute shape consists of 1 OCPU (Oracle CPU), or 2 vCPUs (2 threads / 1 core). Yes, just like AWS and Azure, Intel licensing in the cloud is at best 1.0 per core vs 0.5 per core on-premises. Between over-provisioning and the fact that an on-premises server is fully licensed at 1/2 of its installed cores, there are a couple of ways clients will vastly overpay for Oracle products in any cloud.

The break-even point for a perpetual license + support vs a non-metered Enterprise license on a High-Memory compute shape is roughly 34 months.

  • Perpetual license
    • 1 x Oracle EE DB license = $47,500
    • 22% annual maintenance = $10,450
    • 3 year cost: $78,850
  • Oracle Cloud – non-metered Enterprise using High-Memory shape
    • 1 x OCPU for Enterprise Package for High-Compute = $2325/mo
    • 1 year cloud cost = $27,900
    • 36 month cost: $83,700
  • Cross-over point is at roughly 34 months (see the sketch after this list)
    • $79,050 is the 34-month cost in the cloud, vs $78,850 for the perpetual license with 3 years of support
  • An Oracle Cloud license becomes significantly more expensive after this.
    • year 4 for a perpetual license would be $10,450
    • 12 months in year 4 for the Cloud license would be $27,900
    • Annual cost increase for a single cloud license over the perpetual license = $17,450
  • Please make your checks payable to “Larry Ellison”
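Here is the break-even sketch referenced in the list above. Support is modeled as billed annually in advance, which is an assumption on my part; shift the billing model and the cross-over moves by a few months:

```python
# Break-even: perpetual license + 22% annual support vs. the non-metered
# Enterprise/High-Memory shape at $2,325 per OCPU per month. Support is
# modeled as billed annually in advance (an assumption).

LICENSE = 47_500
SUPPORT = round(LICENSE * 0.22)      # $10,450 per year
CLOUD_MONTHLY = 2_325

def perpetual_cost(months):
    years_billed = -(-months // 12)  # ceil: annual prepay
    return LICENSE + SUPPORT * years_billed

def cloud_cost(months):
    return CLOUD_MONTHLY * months

for m in range(1, 61):
    if cloud_cost(m) > perpetual_cost(m):
        print(f"Cloud overtakes perpetual at month {m}: "
              f"${cloud_cost(m):,} vs ${perpetual_cost(m):,}")  # month 34
        break
```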

Oracle's revenues continue to decline as clients move to purpose-built NoSQL solutions such as MongoDB, Redis Labs, Neo4j, OrientDB and Couchbase, as well as SQL-based solutions from MariaDB and PostgreSQL (I like EnterpriseDB); even DB2 is a far better value. Oracle's idea isn't to re-tool by innovating and listening to clients to move with the market. No, they get out their big stick, following the classic mistake so many great companies have made before them: refusing to evolve while pushing clients until something breaks. Yes, Boot Hill is full of dead technology companies who failed to innovate and adapt. This is why Oracle is in complete chaos. Clients beware: you are on their radar!

 

 

C is for Performance!

E850C is a compact power-packed “sweet spot” server!

“C” makes the E850 a BIG deal!

IBM delivered a modest upgrade to the entry-level POWER8 Enterprise server, going from the E850 to the E850C. The new features show up in the processors, memory, Capacity on Demand and bundled software.

The most exciting features of the new E850C, which by the way comes with a new MTM of 8408-44E, are the processors. You might expect me to say that, but here is why the E850C is the new “sweet spot” server for AIX & Linux workloads that require a mix of performance, scalability and reliability features.

A few things remain the same from the E850 to the E850C:

  • Classified as a “small” tier server
  • Available with a 3 year 24 x 7 warranty
  • PVU for IBM software is 100 when using AIX
  • PVU for IBM software is 70 when using Linux
  • Supports IFLs, or Integrated Facility for Linux
  • Offers CuOD, Trial, Utility and Elastic CoD
  • Does NOT offer mobile cores or mobile memory (boo hiss)
  • Does NOT support Enterprise Pools (boo hiss)

The original 8408-E8E, aka E850, was available with 32 cores at 3.72 GHz, 40 cores at 3.35 GHz and 48 cores at 3.02 GHz; it initially supported 2 TB of DDR3 memory and eventually up to 4 TB. Using up to 4 x 1400W power supplies and dense packaging, what it did not offer was the option to exploit EnergyScale, which lets users decrease or increase processor clock speeds. Clock speeds were capped at the nominal 3.72, 3.35 and 3.02 GHz, with no option to lower them to a set point, vary them with utilization or, more importantly, increase them to a higher rate. That is free performance, measured as rPerf in the case of AIX.

Focusing on the processor increase, because who the hell wants to run their computers slower, the E850C has a modest clock increase ranging from 2.5% to 4.6%. I say modest because the other POWER8 models range from 4% up to 11% <play Tim Allen “grunt” from Home Improvement>. This modest increase doesn't matter, though, because the new C model delivers 32 cores at 4.22 GHz nominal increasing to 4.32 GHz, 40 cores at 3.95 GHz nominal increasing to 4.12 GHz, and 48 cores at 3.65 GHz nominal increasing to 3.82 GHz. These speeds are at the high end for every Scale-out server and on par with the E870C/E880C models.

Putting these performance increases into perspective by comparing nominal rPerf values for the E850 vs the E850C: the 32-core E850C gains 59 rPerf, the 40-core E850C gains 88 rPerf and the 48-core E850C gains 113 rPerf. By doing nothing but increasing the clock speed, the 48-core E850C delivers an rPerf increase equivalent to a 16-core POWER6 570.

It hasn't been mentioned yet, but the E850 & E850C use a 4U chassis. The 48-core E850C just mentioned delivers an rPerf level of 859. Compare this to the 16U POWER7+ 770 (9117-MMD) with 64 cores, which delivers only 729 rPerf, or going back to the initial 770 model, the 9117-MMB with 48 cores in a 16U footprint delivering 464 rPerf. Using the MMD values, this is a 4:1 footprint reduction and an 18% increase in rPerf with a 25% reduction in cores. Why does that matter? Greater core strength means fewer OS & virtualization licenses & SWMA, but more importantly, less enterprise software licensing such as Oracle Enterprise DB.
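Those ratios are easy to sanity-check from the figures quoted in this post:

```python
# Quick check of the density/consolidation claims above, using the rPerf and
# rack-unit figures quoted in this post.

e850c = {"rperf": 859, "cores": 48, "ru": 4}    # 48-core E850C
mmd   = {"rperf": 729, "cores": 64, "ru": 16}   # POWER7+ 770 (9117-MMD)

print(f"Footprint reduction: {mmd['ru'] / e850c['ru']:.0f}:1")                   # 4:1
print(f"rPerf increase     : {(e850c['rperf'] / mmd['rperf'] - 1) * 100:.0f}%")  # ~18%
print(f"Core reduction     : {(1 - e850c['cores'] / mmd['cores']) * 100:.0f}%")  # 25%
print(f"rPerf per core     : {e850c['rperf'] / e850c['cores']:.1f} vs "
      f"{mmd['rperf'] / mmd['cores']:.1f}")
```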

IBM achieved this in a couple of ways. Not being an IBMer, I don't have all of the techniques, but they include increasing chip efficiency, increasing the power supplies to 2000W each and moving to DDR4 memory, which uses less power.

What else?

Besides the improvement in clock speeds and the bump to DDR4 memory, the E850C reduces the number of minimum active cores. Every E850C must have a minimum of 2 processor books (2×8, 2×10 or 2×12 cores) while only requiring 8, 10 or 12 active cores, depending on the processor book used. The E850 required all cores in the first 2 processor books to be active. This change in the E850C is another benefit, letting clients get into the “sweet spot” server at a lower entry price. Memory activations are the same: 50% of installed memory or 128 GB, whichever is more.

A couple of nice upgrades from the E850: Active Memory Mirroring and PowerVM Enterprise Edition are now standard, while still offering a 3-year 24x7 warranty (except Japan).

The E850C does not support IBM i, but it does support AIX 6.1, 7.1 and 7.2 (research specific versions at System Software Maps) and the usual Linux distros.

Software bundle enhancements over the E850 are:

  • Starter pack for SoftLayer
  • IBM Cloud HMC apps
  • IBM Power to Cloud Rewards
  • PowerVM Enterprise Edition

Even though it isn’t bundled in, consider using IBM Cloud PowerVC Manager, which is included with the AIX Enterprise Edition bundle or à la carte with AIX Standard Edition or any Linux distro.

In summary

The E850C is a power-packed compact package. With up to 48 cores and 4 TB of RAM in a 4U footprint, it is denser than 2 x 2U S822s with 20 cores / 1 TB RAM each, or the 1 x 4U S824 with 24 cores / 2 TB RAM. Yes, the E870C with 40 cores or the E880C with 48 cores, both with 8 TB of RAM in a single node, still require 7U to start with. If clients require the greatest scalability, performance, flexibility and reliability, they should look at the E870C or E880C, but for a lower entry price delivering high performance in a compact solution, the E850C delivers the complete package.

 

Not on the Dell/EMC Bandwagon. More of the same. OpenPOWER changes the game!

Reading articles around social media yesterday about the two companies' consummation on 9/7/16, one would think the marriage included a new product or solution revolutionizing the industry. I haven't heard of any, but I do know that both companies have continued to shed employees and sell off assets not core to the go-forward business to capture critical capital to fund the massive $63B deal. They will also continue to evaluate products from both Dell & EMC's traditional portfolios to phase out, merge, sell or kill due to redundancies and other reasons. It just happens. For them to say otherwise is misleading at best. Frankly, it hurts their credibility to deny this, as there are already examples of it occurring.

Going forward, I do not see how the combined products of Dell, which at its core sells commodity Intel servers that are not even best of breed but rather the low-cost leader, paired with the high-end, high-development-cost products from EMC, will be any different on 9/8/16 than they were on 9/6/16. EMC's problem of customers moving away from high-margin, high-end storage systems to the highly competitive, lower-margin All Flash Array products will not get any better for the newly combined company. The AFA space has many good competitors offering “good enough” features: 1) lower cost, 2) comparable or better features, and 3) not a tier-1 player, which some customers prefer after feeling they overpay for the privilege of working with one.

About 2 years ago, EMC absorbed VCE and its converged infrastructure offering, vBlock; a term I argue it doesn't deserve, as it is really an integrated infrastructure built on VMware, Cisco UCS and EMC storage. VMware & EMC storage offer nothing unique. UCS is unique in the Intel space, but after the messy split of the VCE tri-union, VCE is placing a lot of emphasis on its own hyper-converged offerings as well as products from Dell thanks to this newfound marriage. It only makes sense to de-emphasize Cisco in a VCE solution and start promoting Dell products. That means going from the leader in Intel blade solutions to “me-too” Dell products, average in a field of “good enough” technology whose most notable feature is low cost.

As I listened to the IBM announcement today of 3 new OpenPOWER servers, I couldn't help but wonder how much longer Dell's low-cost advantage will remain. I'm not sure what they will use for SAP HANA workloads requiring >4-socket Intel servers, since HPE just bought SGI, primarily for its 32-socket Intel server technology. I guess they could partner with Lenovo on the x3950, or with Cisco on the C880, which I believe is actually an OEM from Hitachi. Dell servers are woefully inadequate with regard to RAS features; not just against POWER servers but even against other Intel competitors like Lenovo (thanks to their purchase of IBM's xSeries), Hitachi and Fujitsu, who all have stronger offerings than Dell. RAS features simply cost more, which is why you didn't see IBM with its xSeries, Hitachi or Fujitsu be volume leaders. This is also why you are seeing more software-defined solutions built to mask hardware deficiencies, which brings its own problems.

Here is a quick review of today's announcements. The first server is a 2-socket 2U server built for Big Data, hosting 12 internal front-facing drive slots. The next is a 2-socket 1U server offering almost 7K threads in a 42U rack; tremendous performance for clients looking for data-rich, dense computing. The 3rd is a 2-socket 2U server that is the first commercial system to offer NVIDIA's NVLink technology, connecting 2 or 4 GPUs directly to each other as well as to the CPUs. Every connection is 160 GB/s bi-directional, roughly 5X what is available on Intel servers with GPUs connected to PCIe3 adapter slots.

openpower_family_sept2016

These OpenPOWER systems allow clients to build their own solution, or act as part of an integrated product with a storage and management stack built on OpenStack. They are ideal for Big Data, analytics, HPC, cloud, DevOps and open source workloads like SugarCRM, NoSQL, MariaDB, PostgreSQL (I like EnterpriseDB for support), or even IBM's vast software portfolio such as DB2 v11.1.

Pricing for the 3 new OpenPOWER models, as well as the first 2 announced earlier in the year, is available on the Scale-out Linux page. I recently did a pricing comparison for a customer with several 2-socket Dell servers vs a comparable 2-socket S822LC. Both the list and web prices for the Dell solution were more expensive than OpenPOWER: the Dell list price was approximately 35% more and the web price 10% more, using the prices shown on the IBM OpenPOWER page linked in this paragraph. Clients looking to deploy large clusters, compute farms, or simply wanting to start lowering infrastructure cost should take a hard look at OpenPOWER. If you can install Linux on an Intel server, you have the skills to manage an OpenPOWER server. Rocket scientists need not apply!

If you have questions, I encourage you to contact your local or favorite business partner.  If you do not have one, I would be happy to work with you.

HPE; there you go again! Part 2

Update on Sept 05, 2016: I split up the original blog (Part 1) into two parts to allow for easier reading.

The topic that started the discussion is a blog by Jeff Kyle, HPE Director of Mission Critical Systems, promoting his Superdome X server at IBM Power’s expense, using straw men and factually incorrect information as the basis for his arguments.

Now it’s my turn to respond to all of Jeff’s FUD.

  • Let’s start with my favorite topic right now: we finally have an acknowledgement that Intel customers using VMware to run Oracle Enterprise Edition software products licensed by core have a problem.
    • VCE President Chad Sakac penned an open letter to Larry at Oracle, asking him to take his jackboot off the necks of his VMware people.
    • Read my blog response
  • VMware’s Oracle problem is this: Oracle’s position is essentially that if customers run any Oracle Enterprise product licensed by core on a server running VMware and managed by vCenter, then ALL (yes, ALL) servers under that vCenter environment MUST be licensed for ALL of the Oracle products running on that one server. Preposterous or not, it is not my fight. Obviously, VMware and the Intel server vendors whose sales are impacted by this are not happy. Oracle, which offers x86 alternatives in the form of Exadata and the Oracle Database Appliance, provides its own virtualization capabilities that are NOT VMware based and under which clients do NOT have to license all of the cores, only those being used.
  • VCE & House of Bricks, via a VCE sponsored whitepaper, are encouraging customers to stand up to Oracle during contract negotiations and audits, and in general to take the position that your business will only pay for the cores, and thus the licenses, that are actually running Oracle Enterprise products. Of course, neither VCE, nor HoB, nor any other Intel vendor I have read about is providing any indemnification to customers who stand up to Oracle and are found out of compliance, facing fines, penalties and fees.  They have the choice to pay up or fight in court.  Yes, it’s the right thing to do, but keep in mind that Oracle is a company predisposed to litigate.
  • Yes, I agree that software licensing & maintenance costs are among the largest cost items for a business; far higher than infrastructure, though Intel vendors would have you believe otherwise.
  • IBM Power servers have several “forward looking Cloud deployment” technologies:
    • PowerVC, an open source product built on OpenStack, manages an enterprise wide Power environment and integrates into a customer’s OpenStack environment.
    • IBM Cloud PowerVC Manager, also built on OpenStack, provides clients with enterprise wide private cloud capabilities.
    • Both PowerVC and Cloud PowerVC Manager integrate with VMware’s vRealize, allowing vCenter to manage a Power environment.
    • If that isn’t enough, using IBM Cloud Orchestrator clients can manage heterogeneous compute & storage platforms, whether on-prem, hybrid or exclusively in the cloud.
    • IBM will announce additional capabilities on September 8, 2016 that deliver more features to Power environments.
  • “Proprietary chips” – so boring. What does that mean?
    • Let’s look at Intel as they continue to close their own ecosystem. They bought an FPGA company with plans to integrate it into their silicon. Instead of working with GPU partners like NVIDIA & AMD, they developed their own many-core accelerator, Knights Landing (Xeon Phi), to compete with GPUs.  Long ago they integrated Ethernet controllers into their chips and, depending on the chip model, graphics capability. They build SSDs, attempted to build mobile processors, and my last example of them closing their ecosystem is Intel’s effort to build their own high-speed, low latency communication transport called OmniPath instead of working with high-speed Ethernet & InfiniBand partners like Mellanox.  Of course, unlike Mellanox, which provides offload processing capabilities on their adapters, true to form Intel’s OmniPath does not, thus requiring the main processor cores to service the high-speed fabric traffic.  Wow, that should make for some unpredictable utilization and increased core consumption…..which simply drives more servers and more licenses.
    • Now let’s look at what Power is doing. IBM has opened up the Power Architecture specification to the OpenPOWER Foundation. Power.org is still the governing body, but OpenPOWER is designed to encourage development & collaboration under a liberal license to build a broad ecosystem.  Unlike Intel, which is excluding more and more partners, OpenPOWER has partner companies building Power chips and systems, not to mention peripherals and software.
    • I’ll spare you from looking up the definition of Open & Proprietary as I think it is clear who is really open and who is proprietary.
  • Here is how the typical Intel “used car” salesman sells Oracle on x86: “Hey Steve, did you know that Oracle has a licensing factor of .5 on Intel while Power is 1.0? Yep, Power is twice as much. You know that means you will pay twice as much for your Oracle licenses! Hey, let’s go get a beer!”
    • What Jeff forgets to tell you, or simply does not know, is that the Pella example is unique: Oracle runs on a Superdome X server, whose nPAR capability allows for reduced licensing. Most customers do not run Oracle on larger Intel servers like that which offer a hardware partitioning feature; they typically run it on 2 or 4 socket servers.
    • The Superdome X server supports two types of partitioning carried over from the original Superdome (Itanium) servers. vPARs and nPARs are both considered Hard Partitioning, and both allow the system to be configured into smaller groups of resources.  This allows only those cores to be licensed while still adhering to Oracle licensing rules.
    • HPE provides the Pella case study, which states they have a 40 core partition separated from other cores on the server using nPAR technology; it appears as a single server although it is made up of 2 blades. nPARs separate resources along “cell board” boundaries, which are the equivalent of an entire 2 socket blade.  Pella’s primary Oracle environment runs on 2 blades, each with 2 x 10 cores, totaling 40 cores. These two production blades with 40 cores & 20 Oracle licenses sit alongside 2 other blades in one data center, while another HP SD-X chassis sits at the failover site. I wonder if Pella realizes the inefficiency of the Superdome X solution. Every Intel server has a compromise. Traditional scale-out 1 & 2 socket servers compromise on scalability, performance & RAS. Traditional scale-up 4 socket and larger Intel servers carry compromises in the same areas.  Each Superdome X blade has an XBar controller plus the SX3000 fabric bus. For this 4 socket NUMA server to act like one server, it will require 8 hops for every off-blade remote memory access.  Further, if the 2nd blade isn’t in the matching slot scheme, such as blade 1 in slot 1 and blade 2 in slot 3, performance would be further degraded. Do you see what I mean about Intel servers having land mines with every step?
    • The Pella case study says the failover database server uses a single blade consisting of 30 cores.  I am not sure how they are doing that with E7 v3 or E7 v4 processors, as there is no 15 core option.  There is an E7 v2 (Ivy Bridge) 15 core option, but I doubt they would use it.  This single Oracle DB failover blade sits alongside 2 additional blades.  The fewest Oracle licenses you could pay for on the combined 4 socket, 40 core blade pair, assuming 2 x 10 core chips per blade, is 20 Oracle licenses.  So even if the workload ONLY requires 8, 12, 16 or 18 cores, the customer would still pay for 20 licenses.
    • This so-called $200,000 in Oracle licensing savings really is nothing.  I just showed a customer running Oracle EBS with Linux on Dell how they would save $2.5M per year in Oracle maintenance costs by moving the workload to AIX on POWER8.  Had they deployed the solution on AIX to begin with, factoring in the 5 year TCO difference between what they are paying with Intel and POWER, this customer would have avoided spending $21M. Let that sink in.
    • I do not intend to be disrespectful to Pella, but had you put the Oracle workloads running on the older HP Superdome onto POWER8 in 2015, you would not have bought a single Oracle license. I can guarantee you would have given licenses back, if desired. Not only would you avoid buying licenses; for each returned license you would also save the 22% annual maintenance.
    • Look at one of my old blogs, where I give some Oracle licensing examples comparing licensing costs for Intel vs Power. It is typical of what I regularly see with clients, if not an even greater consolidation ratio with correspondingly larger license reductions.
    • The Pella case study does not mention whether the new Superdome X solution uses any virtualization technology.  I can only assume it does not, since it was not mentioned.  With IBM Power servers running AIX, all POWER servers come with virtualization (note I said “running AIX”).  With Power, the customer could add/remove cores & memory, add & remove I/O to any LPAR (LPAR = VM), perform concurrent maintenance on the I/O path out-of-band via dual VIOS, and move a VM from one server to another live…maybe only when upgrading to the next generation of server …. you know, POWER9; the next generation that would deliver to Pella even more performance per core, allowing them to return more Oracle licenses, saving even more money.
  • This comes back to the “Granddaddy” statement Jeff made. Power servers have a license factor of 1.0, but with POWER server technology customers ONLY license the cores used by Oracle. You can create dedicated VMs where you only license those cores, regardless of how many are in the server. Another option is to create a Shared Processor Pool (SPP); without going into all of the permutations, let’s simply say you ONLY license the cores in the pool, not to exceed the number of cores in the SPP.  What differs from the dedicated VM is that within the SPP there can be 1 to N VMs sharing those cores, and thus sharing the Oracle licenses.
  • I did some analysis, which I also use in my SAP HANA on POWER discussions, showing that processor performance per core has increased with each generation from POWER5 all the way to POWER8. With POWER9 details just discussed at Hot Chips 2016 earlier this month (August), we can expect a healthy increase over POWER8 as well.  Thus, P5 to P5+ saw an increase in per core performance, and P5+ to P6 to P6+ to P7 to P7+ to P8 all saw successive increases. Contrast that with Intel, and reference a recent blog I wrote titled “Intel, the Great Charade”.  The first generation Nehalem, called Gainestown, delivered a per core performance rating (as provided by Intel) of .29. Next came Westmere at .33, then Sandy Bridge at .32, Ivy Bridge at .31, Haswell at .29, and the latest rating is again .29.  What does this mean? In 2016, per core performance is the same as it was on a chip from nearly a decade ago. Yes, they have more cores per socket, but I’ll ask you: how are you paying for your Oracle, by core or by socket?
  • Next, IBM Power servers that run AIX, which is what would primarily run Oracle Enterprise Edition Database, use PowerVM, the software suite that exploits the Power Hypervisor, a highly efficient and reliable piece of firmware.  Part of this efficiency is how it shares and dispatches unused processor cycles between VMs, not to mention 8 hardware threads per core, clock speeds up to 4.5 GHz, at least 2X greater L1 & L2 caches, 3.5X greater L3 cache and 100% more L4 cache than Intel.  What does this mean? Power does more than just beat Intel by 2X; that raw speed is merely the “foot race”.   When you factor in the virtualization efficiency, you start to see processing efficiencies approaching 5:1, 10:1, even higher.
  • I like to tell the story of a customer running Oracle EBS across two sites. It had additional Oracle products, RAC and WebLogic, but this example focuses just on Location 1 and on Oracle Enterprise Edition Database.  The customer was evaluating a Cisco UCS that was part of a vBlock, an Oracle Exadata and an IBM POWER7+ server. I’ll exclude Exadata, again because of complexities with its licensing that skew the results against other Intel servers; just know the POWER7+ solution kicked its ass.  With the vBlock, a VCE engineer sized the server & core requirements for the 2 site solution.  Looking just at Location 1 for Oracle EE DB, the VCE engineer determined 300 Intel cores were required.  The workloads required varying numbers of cores: 7 cores in one server rounded up to 10, another server requiring 4 cores rounded up to 6 or maybe 8, and so on across dozens of workloads.  To reiterate, VCE did that sizing; I did the POWER7+ sizing independently, completing mine first for that matter.  My sizing came to only 30 POWER7+ cores.  That is 300 Intel cores, or 150 Oracle licenses, compared to 30 POWER cores, or 30 Oracle licenses.  If my memory serves me correctly, the hard core requirement for the Intel workload on the vBlock was around 168 cores, which still would have been 84 Oracle licenses.  This customer was receiving a 75% discount from Oracle, and even with that, the difference in licensing cost (Oracle EE DB, RAC & WebLogic for 2 sites) was somewhere around $10-12M.  Factor in the 22% annual maintenance, and the 5 year TCO for the Intel solution, ONLY for the Oracle software, was around $20M vs $5-6M on POWER.  By the way, the hardware solution costs were relatively close, within several $100K.  The sketch below walks through the licensing arithmetic.
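
To make that arithmetic concrete, here is a minimal sketch in Python. The $47,500 list price and 22% maintenance rate are the Oracle figures discussed in Part 1 below; the core counts and 75% discount come from the story above; everything else is simplified illustration, not a licensing tool.

```python
# Illustrative sketch of the Oracle license math in the story above.
# Simplified on purpose; Oracle's contract terms govern the real calculation.

EE_DB_LIST = 47_500          # Oracle EE Database list price per license
MAINT_RATE = 0.22            # annual maintenance, starting year 1
DISCOUNT = 0.75              # this customer's negotiated Oracle discount

def five_year_cost(n_licenses: float) -> float:
    """License paid once, maintenance paid in each of the 5 years."""
    per_license = EE_DB_LIST * (1 - DISCOUNT)
    return n_licenses * (per_license + 5 * MAINT_RATE * per_license)

intel_licenses = 300 * 0.5   # VCE sizing: 300 cores at the 0.5 x86 factor
power_licenses = 30 * 1.0    # my sizing: 30 cores at the 1.0 Power factor

print(f"Intel: {intel_licenses:.0f} licenses, ${five_year_cost(intel_licenses):,.0f} over 5 years")
print(f"POWER: {power_licenses:.0f} licenses, ${five_year_cost(power_licenses):,.0f} over 5 years")
```

Even at a 75% discount, the EE DB line item alone differs by roughly $3M for one site; add RAC and WebLogic across two sites and the $10-12M gap above is no surprise.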

I know we are discussing Oracle on Intel, but I wanted to share this SAP 2-tier S&D performance comparison between 4, 8 and 16 socket Intel servers and 2 & 8 socket POWER8 servers.  I use this benchmark because I find it one of the more reliable system-wide tests.  Many benchmarks focus on specific areas such as floating point or integer, not transactional work involving compute, memory movement and active I/O.

[Image: SAP 2-tier S&D benchmark comparison, Intel vs POWER8]

Note in the results that the 4 socket Haswell server outperforms the newer 4 socket Broadwell server. Next, notice the 8 socket Haswell server outperforms the newer 8 socket Broadwell server. Lastly, notice the two 16 socket results, both on HP Superdome X servers.  Using the SAP benchmark measurement of SAPS, they show the lowest SAPS per core of any of the Intel servers shown. Do you notice another pattern? The 4 socket servers show greater efficiency than the 8 socket servers, which in turn show greater efficiency than the 16 socket servers.

Contrast that with the 2 socket POWER8 server, which by the way delivers 2X the best Intel result.  If the trend we just reviewed with the Intel servers held true, we would expect the 8 socket POWER8 result to show fewer SAPS per core than the 2 socket POWER8 server. We already know the answer, because it was highlighted in green as the highest result, roughly 13% greater than the 2 socket POWER8.   The 8 socket POWER8 was also 2X+ greater than any of the Intel servers, and 2.8X greater than the 16 socket HP Superdome X specifically.
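
If you want to reproduce the per-core comparison from any published result, the math is simple division: total SAPS over total cores. A minimal sketch follows; the SAPS figures and core counts below are placeholders of my own, not the published numbers in the chart above.

```python
# SAPS per core = benchmark result / total cores.
# The values below are illustrative placeholders; substitute published results.

results = [
    # (system, total_cores, total_SAPS)
    ("4-socket Intel Haswell",      72, 100_000),
    ("16-socket HP Superdome X",   288, 280_000),
    ("2-socket POWER8",             24, 115_000),
]

for name, cores, saps in sorted(results, key=lambda r: r[2] / r[1], reverse=True):
    print(f"{name:28s} {saps / cores:8,.0f} SAPS/core")
```

Run it against the actual certified results and the pattern above falls out immediately: per-core efficiency drops as Intel socket count grows, while POWER8 actually improves with scale, per the chart above.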

Here comes my close – let’s see if I do a better job than Jeff!

  • My last point is in response to Jeff’s statement that “There’s a compelling alternative. A “scale-up” (high capacity) x86 architecture with a large memory capacity for in-memory compute models can dramatically improve database performance and lower TCO.”
    • I’ve already debunked the myth, and the simply false statement, that running Oracle on POWER costs more than on Intel. In fact, it is just the opposite, and by a significant amount.
    • Also, in the HPE whitepaper “How memory RAS technologies can enhance the uptime of HPE ProLiant servers”, they state: “It might surprise you to know that memory device failures are far and away the most frequent type of failure for scale-up servers.” It is amazing how HPE talks out of both sides of its mouth.  Memory fails more than any other component in HPE servers, yet they suggest you buy these large scale-up servers that hold more memory to host and run more “in-memory” workloads from SAP HANA, Oracle 12c In-Memory or DB2 with BLU Acceleration, while their own publications acknowledge it is the part most likely to fail in the solution.
    • UPDATE: There is a better alternative to HPE Superdome X, scale-up, scale-out or any other Intel based server.  That alternative has higher processor performance, greater memory bandwidth, a (much) more reliable memory subsystem and stronger overall system RAS, along with a full suite of virtualization capabilities. That alternative is an IBM server, specifically POWER8, available in open source 1 & 2 socket configurations (look at the LC models), scale-out 1 & 2 socket models (look at the L models) and scale-up 4 to 16 socket Enterprise models.  I’ll discuss more about HPE & IBM memory features in my next blog.

Your Honor, members of the jury, these are the facts as presented to you.  I leave it to you to come back with the correct decision: Jeff Kyle and HPE are guilty of misleading customers and propagating untruths about IBM POWER.

Case closed!


HPE; there you go again! Part 1

Updated Sept 05, 2016: Split the blog into 2 parts (Part 2). Fixed several typos and sentence structure problems. Updated the description of the Superdome X blades to indicate they are 2 socket blades using Intel E7 chips.

It must be the season, as I find myself focused a bit on HPE.  Maybe it’s because they seem to be searching for their identity as they now consider selling their software business.  This time, though, it is self-inflicted, as there has been a series of conflicting marketing actions. Their recent HPE RAS whitepaper on Intel server memory reliability states in its introductory section that memory is far and away the highest source of component failures in a system.  Shortly after that RAS paper was released, they posted a blog written by the HPE Server Memory Product Manager stating “Memory Errors aren’t the end of the World”.  Tell that to the SAP HANA and Oracle Database customers, the latter of which I will be discussing in this blog.

HPE dares to step into the lion’s den on a topic where it has little standing to claim authority: how Oracle Enterprise software products are licensed on IBM Power servers.  As a matter of fact, thanks go to the President of VCE, Chad Sakac, for acknowledging that VMware has an Oracle problem.  On August 17th, Chad penned what amounts to an open letter to Larry & Oracle begging them …. No, demanding that Larry leave his people alone.  And by “his people”, I mean customers who run Oracle Enterprise software products licensed by the core on Intel servers using VMware.

Enter HPE with a recent blog by Jeff Kyle, Director of Mission Critical Solutions.  He doesn’t say whether he is in a product development, marketing or sales role.  I would bet it is one of the latter two, as I do not think a product developer would put himself out there like Jeff just did.  What he did is what all Intel marketing teams and sellers have done from the beginning of compute time, when the first customer thought of running Oracle on a server that wasn’t “Big Iron”.

Jeff sets up a straw man, stating “software licensing and support being one of the top cost items in any data center”, followed by the obligatory claim that moving it to “advanced” yet “industry-standard x86 servers” will deliver the ROI to achieve every customer’s goals while coming damn close to solving world hunger.

Next is where he enters the world of FUD while stepping into the land of make-believe.  Yes, Jeff talks about IBM Power technology as if Oracle treats it for licensing purposes the same as an Intel server, which it does not.  You will have to judge whether he did this on purpose or simply out of ignorance.  He does throw the UNIX platforms a bone by saying they have “excellent stability and performance”, but stops there, claiming only that they cost more than their industry-standard x86 counterparts.

He goes on to make claims about UNIX servers. <Hold Please> Attention: for purposes of this discussion, let’s go with the definition that future UNIX references = AIX and RISC references = IBM POWER unless otherwise stated.  As I was saying, Jeff next claims AIX & POWER are not well positioned for forward-looking Cloud deployments, continuing his diminutive descriptors by suggesting proper clients wouldn’t want to work with “proprietary RISC chips like IBM Power”. But the granddaddy of all his statements, and the one that is completely disingenuous, is:  <low monotone voice> “The Oracle license charge per CPU core for IBM Power is twice (2X) the amount charged for Intel x86 servers” </low monotone voice>.

In his next paragraph, he uses some sleight of hand, altering the presentation of the traditional full list price for Oracle RAC associated with Oracle Enterprise Edition Database.  Oracle EE DB is $47,500 per license + 22% maintenance per year, starting with year 1.  Oracle RAC for Oracle EE DB is $23,000 per license + 22% maintenance per year, starting with year 1.  If you have Oracle RAC, then by definition you also have corresponding Oracle EE DB licenses.  The author uses a price of $11,500 per x86 CPU core, and although he isn’t wrong per se, I do not like that he fails to disclose the full license cost of $23,000 up front; it looks like he is trying to minimize the cost of Oracle on x86.
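
To see why the full price matters, consider the 5-year cost of a single license at list, with maintenance paid every year. This is a simplified sketch of my own; real deals carry discounts and negotiated terms.

```python
# 5-year list cost of one Oracle license with 22% annual maintenance
# starting in year 1 (simplified; ignores discounts and escalation).

def five_year_list_cost(license_price: float, years: int = 5) -> float:
    return license_price + 0.22 * license_price * years

print(five_year_list_cost(47_500))   # 99750.0 -> EE DB costs $99,750 over 5 years
print(five_year_list_cost(23_000))   # 48300.0 -> RAC adds another $48,300
# RAC is licensed on top of EE DB, so each licensable core carries both.
```

Over 5 years the maintenance stream alone exceeds the license itself, which is exactly why quoting $11,500 per core understates what a RAC core really costs.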

A quick licensing review. Oracle publishes a License Factor Table for different platforms to determine how to license its products that are licensed by core. Most modern Intel servers carry a factor of 0.5 per core.  IBM Power is 1.0 per core.  HP Itanium 95XX chip based servers, so you know, also carry a factor of 1.0.  Oracle, since it owns both the table and the software in question, can manipulate it to favor its own platforms, as it does, especially with the SPARC servers, which range from 0.25 to 0.75, while Oracle’s Intel servers are consistent with other Intel servers at 0.5.  Let’s exclude the Oracle Intel servers for purposes of this discussion for the reason I just gave: they manipulate the situation to favor themselves. All other Intel servers “MUST” license ALL cores in the server, with very, very limited exceptions, times the licensing factor of 0.5.  Thus, a server with 2 x 18 core sockets has 36 cores: 2s x 18c = 36c x 0.5 license factor = 18 licenses for whatever product is being used.  The sketch below scripts this calculation.
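
Here is that calculation as a small script, using the core factors cited above. Treat it as illustrative only; Oracle’s core factor table and your contract govern the real count.

```python
import math

# Core factors cited above, from Oracle's core factor table.
CORE_FACTOR = {"intel_x86": 0.5, "ibm_power": 1.0, "hp_itanium_95xx": 1.0}

def oracle_licenses(sockets: int, cores_per_socket: int, platform: str) -> int:
    """License ALL cores in the server, times the platform core factor."""
    cores = sockets * cores_per_socket
    return math.ceil(cores * CORE_FACTOR[platform])

# The example from the text: 2 sockets x 18 cores = 36 cores x 0.5 = 18 licenses.
print(oracle_licenses(2, 18, "intel_x86"))   # 18
```

Remember the key asymmetry: on Intel this formula applies to every core in the box, while on Power (as covered in Part 2 above) you can carve out a dedicated VM or shared processor pool and license only those cores.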

What Jeff does next was a bit surprising to me.  He suggests customers not bother with 1 & 2 socket Intel “scale-out” servers, which generally rely on the Intel E5 (aka EP) chips.  By the way, Oracle’s own Exadata & Oracle Database Appliance now ONLY use 2 socket servers with E5 processors; let that sink in as to why.  The EP chips have features that on paper deliver less performance, such as lower memory bandwidth and fewer cores, while other features such as clock frequency are higher, which is good for Oracle DB.   These chips also have lower RAS capabilities, such as missing the MCA (Machine Check Architecture) feature found only in the E7 chips.  He instead suggests clients look at “scale-up” servers, commonly classified as 4 sockets and larger.  This is where I need to clarify a few things.  The HP Superdome X system, although it scales to 16 sockets, does so using 2 socket blades.  Each socket uses the Intel E7 processor, which, given this is a 2 socket blade, is counter to what I described at the beginning of this paragraph, where 1 & 2 socket servers use E5 processors.  The design of the HP SD-X is meant to scale from 1 blade to 8 blades, or 2 to 16 sockets, which requires the E7 processor.

With the latest Intel Broadwell EX (E7) chips, the number of cores available for the HP SD-X ranges from 4 to 24 per socket.  Configuring a blade with the 24 core E7 v4 (v4 indicates Broadwell) equals 48 cores, or 24 Oracle licenses; reference the discussion two paragraphs above.  His assertion is that by moving to a larger server you get a larger memory capacity for those “in-memory compute models”, and it is this combination that will dramatically improve your database performance while lowering your overall Total Cost of Ownership (TCO).

He uses a customer success story for Pella (windows), who avoided $200,000 in Oracle licensing fees after moving off a UNIX (not AIX in this case) platform to 2 x HPE Superdome X servers running Linux.  The HPE case study says the UNIX platform Pella moved off 9 years ago was actually an HP Superdome server with Intel Itanium processors running HP-UX.  Did you get that? HPE migrated the customer off HPE’s own 9-year-old server while implying it might be from a competitor, maybe even AIX on Power since it was referenced earlier in the story.  That circa 2006 Itanium may have used a Montecito class processor; all of the early models before Tukwila were pigs, in my estimation, with a lot of bluff and hyperbole but rarely delivering on the claims.  That era of Superdome would also have carried an Oracle license factor of 0.5, as Oracle didn’t change it until 2010, and then only on the newer 95xx series chips.  Older systems were grandfathered, and as I recall, as long as they didn’t add new licenses they remained under the 0.5 license model.  I would expect a 2014/2015 era Intel processor to outperform a 2006 era chip, although had it been up against a POWER5 1.9 or 2.2 GHz chip I might call it 50-50 :)

We have to spend some time discussing HP server technology, as Jeff is doing some major league sleight of hand here: the Superdome X supports a special hardware partitioning capability (more details below) that DOES allow for reduced licensing and that IS NOT available on non-Superdome x86 servers, or from most other Intel vendors unless they also have an 8 socket or larger system like SGI. Oh wait, HPE just bought SGI.  Huh, wonder why they did that if the HPE Superdome X is so good.

Jeff then mentions an IDC research study. Big deal; here is a note from my Pastor that says the HPE Superdome is not very good. Who are you going to believe?

Moving the rest of the blog to Part 2.