The SAP JVM
The SAP JVM supports Java 1.4, 5, 6, 7 and runs on 15 platforms:
- Linux on x86, x86_64, IA64, PPC64, zSeries; Windows on x86, x86_64, IA64;
Solaris on x86_64, SPARC; HPUX on IA64, PARISC; AIX and AS400 on PPC64; MacOS X on x86_64
... and we provide support for any SAP JVM version until the end of days :)
The SAP JVM is derived from the Sun/Oracle code base:
- with custom ports to the platforms not supported by Oracle
- with enhancements mainly in the supportability area
- with special addons (e.g. SAP JVM Profiler)
We constantly integrate Oracle changes:
- leading to an increasing code divergence between our version and Oracle's
- initially we only got "source drops" from Sun/Oracle:
- i.e. the source code of every released Java version and every JDK update
- after more than 7 years, merging has become a nightmare
The OpenJDK Project
Announced at JavaOne 2006
- open source implementation of Java SE
- licensed under GPLv2 (with Classpath exception)
SAP can't use OpenJDK directly:
- because its customers expect a commercially licensed JDK
- because it also has to deliver JDKs 1.4 and 5
It took 5 years until SAP "officially" joined the OpenJDK project:
- convincing SAP executives/developers to join an open source project was not easy
- Oracle's Sun acquisition was not helpful either:)
- we had to ensure that we get contributed code back under our commercial license
Today, the OpenJDK is a playground and collaboration space for different implementers:
- IBM, RedHat, Apple, Twitter, Azul, SAP, ..
The OpenJDK Source Tree
The OpenJDK consists of two major building blocks
- the HotSpot Virtual Machine (~1.700 files, ~340.000 loc)
- the Java class library (~11.000 files, ~235.000 loc)
The HotSpot VM
The HotSpot VM first appeared in 2000 with Java 1.3 and has been evolving constantly ever since:
- mostly architecture dependent parts:
  - Bytecode Interpreters
    - Template Interpreter
    - C++ Interpreter
  - JIT Compilers
    - C1 aka "Client compiler"
    - C2 aka "Server compiler"
- mostly OS dependent parts:
  - Runtime system
  - Memory handling (VM Heap, Java Heap, CodeCache)
  - Process/Thread/Signal handling
- mostly generic parts:
  - Garbage collectors
  - Class loader/verifiers
Porting the HotSpot VM - Effort
- Taking the Linux/x86_64 version as reference implementation:
  hotspot/src/share             (~1.100 files, ~100.000 loc)
  hotspot/src/os/linux          (~25 files, ~9.000 loc)
  hotspot/src/os_cpu/linux_x86  (~20 files, ~3.500 loc)
  hotspot/src/cpu/x86           (~100 files, ~90.000 loc)
- these numbers include both interpreters and both JIT compilers
- We are currently working on:
  - the C2 JIT compiler:
    hotspot/src/os_cpu/linux_ppc  (+ ~6 files, + ~400 loc)
    hotspot/src/cpu/ppc           (+ ~20 files, + ~25.000 loc)
  - the AIX port:
    hotspot/src/os/aix            (~30 files, + ~14.000 loc)
    hotspot/src/os_cpu/aix_ppc    (~15 files, + ~2.000 loc)
- we already have this code in the SAP JVM - we just have to bring it to the OpenJDK
The C++ Interpreter
- consists of a huge interpreter loop written in C++
- and a so-called "frame manager" written in assembler
- the "frame manager" is a frameless method which handles Java method invocations
- this keeps the Java frames contiguous on the mixed Java/Native stack
- and we only have one activation of the C++ interpreter loop on top of the stack
- see hotspot/src/cpu/ppc64/vm/cppInterpreter_ppc64.cpp (~3000 lines of assembler code)
+------------------+
|                  |  C++ interpreter loop
+------------------+
|xxxxxxxxxxxxxxxxxx|  java frame n
+------------------+
:       ....       :
+------------------+
|xxxxxxxxxxxxxxxxxx|  java frame 0
+------------------+
|//////////////////|  vm
|//////////////////|
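To give a feeling for what "a huge interpreter loop written in C++" means, here is a minimal sketch of a switch-based dispatch loop. It is not HotSpot code: the four-instruction bytecode set and the function name interpret are hypothetical and only illustrate the technique of one loop activation interpreting a whole program.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical mini bytecode set (not the real Java bytecodes).
enum Bytecode : std::uint8_t { PUSH, ADD, MUL, HALT };

// A single activation of this loop interprets the whole program.
inline std::int32_t interpret(const std::vector<std::uint8_t>& code) {
    std::vector<std::int32_t> stack;  // operand stack

    // Pops one value and reports malformed programs instead of crashing.
    auto pop = [&stack]() {
        if (stack.empty()) throw std::runtime_error("operand stack underflow");
        std::int32_t v = stack.back();
        stack.pop_back();
        return v;
    };

    std::size_t pc = 0;  // index of the next bytecode
    for (;;) {
        switch (code.at(pc++)) {  // at() throws if we run off the end
        case PUSH:  // operand: one signed byte following the opcode
            stack.push_back(static_cast<std::int8_t>(code.at(pc++)));
            break;
        case ADD: {
            std::int32_t b = pop();
            std::int32_t a = pop();
            stack.push_back(a + b);
            break;
        }
        case MUL: {
            std::int32_t b = pop();
            std::int32_t a = pop();
            stack.push_back(a * b);
            break;
        }
        case HALT:
            return pop();
        default:
            throw std::runtime_error("unknown bytecode");
        }
    }
}
```

A program such as PUSH 2, PUSH 3, ADD, PUSH 4, MUL, HALT can be passed in as a std::vector<std::uint8_t>. A real interpreter additionally needs something like the frame manager described above to set up and tear down frames on method calls.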
One big challenge when porting the C++ Interpreter is that you first have to implement
a macro assembler for your architecture!
- the OpenJDK contains macro assemblers for x86 (~12.000 loc), SPARC (~8.000 loc) and ppc64 (~9.000 loc)
- see hotspot/src/cpu/<arch>/vm/assembler_<arch>.{hpp,inline.hpp,cpp}
- the Macro Assembler can be reused for the JIT compilers
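The idea behind a macro assembler can be sketched in a few lines: a plain assembler class offers one method per machine instruction, and a macro assembler layers convenience operations on top that expand to one or more instructions. The class and method names below (Assembler, MacroAssembler, load_const32) are hypothetical and much simpler than the real HotSpot classes; only the PowerPC D-form encodings of addi, addis and ori are taken from the architecture.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Plain assembler: one method per machine instruction (tiny PowerPC subset).
class Assembler {
public:
    // addi RT,RA,SI (primary opcode 14)
    void addi(int rt, int ra, std::int16_t si) {
        emit(d_form(14, rt, ra, static_cast<std::uint16_t>(si)));
    }
    // addis RT,RA,SI (primary opcode 15); imm16 holds the raw 16 immediate bits
    void addis(int rt, int ra, std::uint16_t imm16) {
        emit(d_form(15, rt, ra, imm16));
    }
    // ori RA,RS,UI (primary opcode 24); RS sits in the first register field
    void ori(int ra, int rs, std::uint16_t ui) {
        emit(d_form(24, rs, ra, ui));
    }
    const std::vector<std::uint32_t>& code() const { return code_; }

private:
    void emit(std::uint32_t insn) { code_.push_back(insn); }
    static std::uint32_t d_form(std::uint32_t opcode, int r1, int r2,
                                std::uint16_t imm) {
        return (opcode << 26) | (static_cast<std::uint32_t>(r1) << 21) |
               (static_cast<std::uint32_t>(r2) << 16) | imm;
    }
    std::vector<std::uint32_t> code_;
};

// Macro assembler: convenience operations built from plain instructions.
class MacroAssembler : public Assembler {
public:
    void li(int rt, std::int16_t si) { addi(rt, 0, si); }  // load immediate
    void nop() { ori(0, 0, 0); }                           // canonical no-op
    // Hypothetical helper: materialize a 32-bit constant with two instructions.
    // On a 64-bit CPU the upper half ends up sign-extended from the high part.
    void load_const32(int rt, std::uint32_t value) {
        addis(rt, 0, static_cast<std::uint16_t>(value >> 16));  // lis rt, hi16
        ori(rt, rt, static_cast<std::uint16_t>(value & 0xffff));
    }
};
```

Calling li(3, 1) on a MacroAssembler appends the single word 0x38600001 to the code buffer, while load_const32 appends two words. The same buffer-filling interface is what makes such a class reusable from both an interpreter generator and a JIT compiler.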
The C2 "Server" JIT Compiler
The C2 "Server" JIT Compiler is the biggest (and most complicated) part of the HotSpot VM.
It consists of three main parts:
- the generic optimizer written in C++ (under src/share/vm/opto)
- an "Architecture Definition Language" and Compiler written in C++ (under src/share/vm/adlc)
- the "Architecture Definition" file written in ADL (under src/cpu/<arch>/vm/<arch>.ad)
hotspot/src/share/vm/opto ( ~110 files, ~128.000 loc)
hotspot/src/share/vm/adlc ( ~23 files, ~26.000 loc)
hotspot/src/cpu/x86/vm/x86_32.ad ( ~14.000 loc)
hotspot/src/cpu/x86/vm/x86_64.ad ( ~13.000 loc)
hotspot/src/cpu/sparc/vm/sparc.ad ( ~10.000 loc)
hotspot/src/cpu/ia64/vm/ia64.ad ( ~26.000 loc)
hotspot/src/cpu/ppc/vm/ppc_64.ad ( ~14.000 loc)
For every new architecture, the corresponding AD file has to be written, which means:
- defining the different registers
- defining the different calling conventions
- defining concrete "encodings" (i.e. assembler instructions) for the abstract optimizer nodes
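The last point, mapping abstract optimizer nodes to concrete encodings, can be pictured as a lookup table from node kind to an encoder function. The sketch below is plain C++ and deliberately not ADL syntax; the node kinds, the Node struct and the encode function are hypothetical stand-ins, and only the PowerPC encodings of addi and add are real.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical abstract node kinds produced by an optimizer.
enum class IdealOp { ConI, AddI };

// Hypothetical node after register allocation: registers are plain numbers.
struct Node {
    IdealOp op;
    int dst;
    int src1;
    int src2;
    std::int16_t imm;
};

using Encoder = std::uint32_t (*)(const Node&);

// ConI -> "li dst, imm", i.e. addi dst, 0, imm (primary opcode 14).
inline std::uint32_t enc_li(const Node& n) {
    return (14u << 26) | (static_cast<std::uint32_t>(n.dst) << 21) |
           static_cast<std::uint16_t>(n.imm);
}

// AddI -> "add dst, src1, src2" (primary opcode 31, extended opcode 266).
inline std::uint32_t enc_add(const Node& n) {
    return (31u << 26) | (static_cast<std::uint32_t>(n.dst) << 21) |
           (static_cast<std::uint32_t>(n.src1) << 16) |
           (static_cast<std::uint32_t>(n.src2) << 11) | (266u << 1);
}

// The "architecture definition": one encoding rule per abstract node kind.
inline std::uint32_t encode(const Node& n) {
    static const std::map<IdealOp, Encoder> rules = {
        {IdealOp::ConI, enc_li},
        {IdealOp::AddI, enc_add},
    };
    return rules.at(n.op)(n);  // throws std::out_of_range for unknown kinds
}
```

Porting to another architecture would, in this simplified picture, mean replacing the register numbering and the encoder functions while the optimizer that produces the nodes stays untouched.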
Platform Specifics
The PowerPC architecture:
- we support Power 5, 6, 7
- because Power6 is "in-order", we had to implement a scheduler to avoid performance degradation
- we extended the AD-file to support "Late Expansion" of nodes
- PowerPC is a "weak memory" architecture:
- we had to use additional memory barriers (e.g. InitNode)
- we implemented "Trap Based" Null-checks for PowerPC
- it was not easy to merge this change with the "compressed oops" feature
- we extended the ConstantPool in JITed methods
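Why a "weak memory" architecture needs extra barriers can be illustrated with standard C++ atomics. The sketch below is not HotSpot code: the Publisher class and its methods are hypothetical. It shows the usual pattern of initializing an object first and publishing the reference with release ordering, so that a reader using acquire ordering never sees the reference without the initialized fields. Without such ordering, a weakly ordered CPU is allowed to make the two writes visible in the opposite order.

```cpp
#include <atomic>
#include <cassert>

struct Point {
    int x;
    int y;
};

// Hypothetical one-slot mailbox shared between a writer and a reader thread.
class Publisher {
public:
    // Writer side: fill in the fields, then publish the pointer.
    // The release store keeps the field writes ordered before the publication.
    void publish(Point* p, int x, int y) {
        p->x = x;
        p->y = y;
        slot_.store(p, std::memory_order_release);
    }

    // Reader side: the acquire load pairs with the release store above, so a
    // non-null result implies that the fields written before it are visible.
    const Point* try_get() const {
        return slot_.load(std::memory_order_acquire);
    }

private:
    std::atomic<Point*> slot_{nullptr};
};
```

In real use publish and try_get would be called from different threads. On x86 this ordering typically comes almost for free, whereas on PowerPC the compiler has to emit explicit barrier instructions, which is the kind of cost a port has to account for.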
The AIX operating system:
- different memory system
- mmap has different semantics, need to use shmget
- "large page support" works differently, different page sizes (stack, malloc, heap)
- thread memory is not necessarily page aligned
- this complicates the implementation of guard pages
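The guard page problem comes down to address arithmetic: memory protection works on whole pages, so if a thread stack does not start on a page boundary, the guard region has to begin at the next boundary inside the stack. The helper below is a minimal sketch under stated assumptions (the stack grows downward, so the guard sits at the low end; the page size is a power of two); the names GuardRange and guard_range are hypothetical and not taken from the AIX port.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Half-open address range [begin, end); begin == end means "no guard fits".
struct GuardRange {
    std::uintptr_t begin;
    std::uintptr_t end;
};

// Round addr up to the next multiple of page (page must be a power of two).
constexpr std::uintptr_t align_up(std::uintptr_t addr, std::uintptr_t page) {
    return (addr + page - 1) & ~(page - 1);
}

// Computes where guard pages can be placed for a stack occupying
// [low, low + size). Assumes the stack grows downward toward low.
inline GuardRange guard_range(std::uintptr_t low, std::size_t size,
                              std::uintptr_t page, std::size_t guard_pages) {
    const std::uintptr_t begin = align_up(low, page);  // first full page
    const std::uintptr_t end = begin + guard_pages * page;
    if (end > low + size) {
        return {0, 0};  // stack too small for the requested guard region
    }
    return {begin, end};
}
```

For a stack starting at 0x10100 with 4 KB pages, the guard region can only start at 0x11000, so the bytes between 0x10100 and 0x11000 stay unprotected. With several page sizes in play, the same calculation has to use the page size that actually backs the stack.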
Porting-Lessons Learned
Over the last few years we've ported HotSpot to quite a few new platforms and we learned:
- HotSpot has a very steep learning curve
- there is not much documentation available
- building is hard (especially on "exotic" platforms)
- testing is hard (there's neither a standard, open test suite nor a test framework)
- tool support is bad
- low-level tools like OProfile are just not mature on "exotic" platforms
- debugging is hard on "exotic" platforms (e.g. hardware watchpoints don't work)
"Convergent Evolution" is evil:
- CodeCache-Sweeper
- Tiered Compilation
- Disassembler library
- Logging/Tracing/Flightrecorder
- numerous bug-fixes
Status and Next Steps
Finish bringing our port to the OpenJDK:
- the AIX port should be able to bootstrap itself within a "few weeks"
- the C2 "Server" compiler should be available by the end of the year
Integrate our port into the main code line as quickly as possible:
- Start with shared changes which don't affect other platforms:
- C++ Interpreter (also used by the Zero port)
- building a core (interpreter-only) VM
- generating SafeFetch stubs
- extending the ADLC (LateExpand nodes)
- make stack growth direction configurable
- Propose extensions:
- Tiered Compilation with C2
- Profiling Safepoints
http://openjdk.java.net/projects/ppc-aix-port