6865265

Code Review for 6865265

Prepared by:	Volker H. Simonis on Mon Jul 25 18:45:43 CEST 2011
Workspace:	/share/software/Java/OpenJDK/hsx/hotspot-main/hotspot
Compare against:	http://hg.openjdk.java.net/hsx/hotspot-main/hotspot
Summary of changes:	103 lines changed: 100 ins; 0 del; 3 mod; 4208 unchg
Patch of changes:	6865265.patch
Bug id:	6865265 JVM crashes with "missing exception handler" error
Author comments:	There is a problem in the runtime exception handling code for C2-compiled methods with exception handler table entries which contain no entry with a '`-1`' `handler_bci`. Consider the following little `run()` method from the test program `test/compiler/6865265/StackOverflowBug.java`: `public static int run() { try { try { return run(); } catch (Throwable e) { return 42; } } finally { } }` `javac` compiles it to the following bytecode: `public static int run(); Code: 0: invokestatic #2 // Method run:()I 3: istore_0 4: iload_0 5: ireturn 6: astore_0 7: bipush 42 9: istore_1 10: iload_1 11: ireturn 12: astore_2 13: aload_2 14: athrow Exception table: from to target type 0 4 6 Class java/lang/Throwable 0 4 12 any 6 10 12 any 12 13 12 any` and the C2 JIT compiler creates the following compiled exception handler table for this method: ExceptionHandlerTable (size = 96 bytes) catch_pco = 20 (1 entries) bci -1 at scope depth 0 -> pco 32 catch_pco = 40 (2 entries) bci 6 at scope depth 0 -> pco 40 bci 12 at scope depth 0 -> pco 61 catch_pco = 72 (2 entries) bci 12 at scope depth 0 -> pco 72 bci 6 at scope depth 0 -> pco 85 Notice that the two table entries for the `catch_pco` 40 and 72 have no slot with a '`-1`' `handler_bci`. This leads to a problem in `SharedRuntime::compute_compiled_exc_handler()` if we try to find the right exception handler for the `StackOverflowError` exception which will eventually occur because of the unbound recursive call to `run()`. `compute_compiled_exc_handler()` computes the `bci` which corresponds to the native exception `pc` and calls `methodOopDesc::fast_exception_handler_bci_for()` with this `bci` and the corresponding exception class. `fast_exception_handler_bci_for()` then iterates over the methods exception table (the one from the class file shown above, not the compiled exception handler table) and tries to find the right exception handler `bci` for the actual exception class. During this search class loading or resolution may occur which can lead to a secondary exception being thrown. If that happens, `fast_exception_handler_bci_for()` will return the `handler_bci` which is currently under inspection together with a pending exception. If `fast_exception_handler_bci_for()` returns with a pending exception, `compute_compiled_exc_handler()` will reset `handler_bci` to `-1` and will then try to find an entry in the compiled exception handler table for the current `pco` (i.e. the offset of the current native pc relative to the beginning of the method). But if the compiled exception handler table entry for that `pco` will have no `-1` (i.e. '`catch_all`') entry the lookup will fail badly and the VM will crash with a "missing exception handler" guarantee. Question: is it correct that we implicitly create '`catch_all`' `CatchProjNode`s in `Parse::catch_call_exceptions()` with an `handler_bci` which is not '`-1`'? My fix retries the call to `fast_exception_handler_bci_for()` after it returned with a pending exception. It does this by supplying the `handler_bci` returned by that method as its new `bci input parameter and it retries this as long as no new nested excepetion occurs. Notice that this will always be the case because every method has an artificial "catch all" exception handler which is able to catch all kind of exceptions so no nested exception can be thrown if fast_exception_handler_bci_for() will finally consider this exception handler.` There's one additional thing we have to consider with this proposed change: compute_compiled_exc_handler() is called from OptoRuntime::handle_exception_C_helper() which caches the resulting compiled exception handler addresses together with the native exception pc in the corresponding nmethod. The next time an exception happens in the same nmethod at the same native pc, handle_exception_C_helper() immediately reuses the cached compiled exception handler addresses without calling compute_compiled_exc_handler() again. However, in the case where the first call to compute_compiled_exc_handler() resulted in the throwing of another nested exception, the returned exception handler pc isn't necessarily the right one for the new occurrence of this exception (because this time compute_compiled_exc_handler() could succeed without throwing another nested exception). So to cut a long story short, we shouldn't cache the exception handler pc computed by compute_compiled_exc_handler() if the handler is really for another (nested) exception than the original one. Unfortunately we can not check that reliably in handle_exception_C_helper() because we only have a handle to the original exception and the exception object can be potentially moved around during a GC which may happen during the call to compute_compiled_exc_handler(). However, if this unusual case really happens, this just means that we will not update the exception handler cache immediately, but only after the next call to compute_compiled_exc_handler() for the corresponding native exception pc. And I think that should really have no performance impact at all. Side Note There's one interesting side note to this problem. C2 only seems to create compiled exception handler table without -1 entry, if the catched exception class is not resolved (in the code it checks this by calling ciInstanceKlass::is_loaded() in Parse::catch_inline_exceptions()). This however doesn't mean that the corresponding class is not loaded at all. In my example java.lang.Throwable is loaded for sure before StackOverflowBug.run() gets compiled. It does only mean that there's no entry for the java.lang.Throwable/ApplicationClassLoader pair in the global SystemDictionary for the application class loader which loaded the StackOverflowBug class. Notice the the SystemDictionary is a hash table which uses (class name / class loader) pairs (i.e. Symbol*/oop pairs) as keys for storing class objects (klassOop). The class loader used in the key is the so called initiating class loader (i.e. the class loader of the class which initiated the class loading) while the class objects themselves have a reference to their defining class loader (i.e. the class loader which actually loaded the class). For our example this means that java.lang.Throwable is loaded quite early during VM startup by the bootstrap (native) class loader as initiating and defining class loader. Later, the so called application class loader loads the StackOverflowBug class but the loading of java.lang.Throwable with the application class loader as initiating class loader is deferred until its first usage. And only then the actual loading will take place trough the application class loader by delegating to the native class loader which will finally return the same class object which is already referenced by (java.lang.Throwable/NativeClassLoader). However this step will finally enter (java.lang.Throwable/ApplicationClassLoader) as new key into the SystemDictionary. Nevertheless, I first had a hard time to write a method with an exception class which is not already resolved during compilation, because class verification usually resolves the exception classes during the verification process. But than I realized the during the verification process, class 'assignability' is only checked by name comparison if the names are equal (see VerificationType::is_assignable_from() which is called from ClassVerifier::verify_exception_handler_table()). That is strange, because if the two classes have different names, is_assignable_from() calls is_reference_assignable_from() which resolves both classes by calling SystemDictionary::resolve_or_fail(). resolve_or_fail() correctly takes into account the current, initiating class loader and would place a corresponding new entry into the system dictionary for (class name / class loader) pairs which are not already present in the dictionary. So for my little test program, replacing 'catch (Throwable e)' by 'catch (Exception e)' would not trigger the crash anymore, because (java.lang.Exception/ApplicationClassLoader) would be resolved by the Verifier at the load time of the StackOverflowBug class. Later, when StackOverflowBug.run() will eventually be JIT-compiled, the compiler will see java.lang.Exception as resolved (with respect to the application class loader and will simply create an abbreviated catch table (i.e. one with a '-1' handler_bci entry) which will not trigger the crash. So to cut a long story short again, I don't understand, why the class verifier only checks for class name equality in VerificationType::is_assignable_from() without taking into account the initiating class loader and if this is correct. But perhaps I've just missed something here?
Legend:	Modified file Deleted file New file


 ------ ------ ------
 ------
 ---
New
Patch
Raw

test/compiler/6865265/StackOverflowBug.java

This code review page was prepared using ../../make/scripts/webrev.ksh (vers 23.18-hg).

Code Review for 6865265

Side Note