Duchess: silky smooth Java integration
Duchess is a Rust crate that makes it safe, ergonomic, and efficient to interoperate with Java code.
TL;DR
Duchess permits you to reflect Java classes into Rust and easily invoke methods on Java objects. For example, the following Java code...
Logger logger = new log.Logger();
logger.addEvent(
    Event.builder()
        .withTime(new Date())
        .withName("foo")
        .build()
);
...could be executed in Rust as follows:
let logger: Java<log::Logger> = log::Logger::new().execute()?;
logger
    .add_event(
        log::Event::builder()
            .with_time(java::util::Date::new())
            .with_name("foo")
            .build(),
    )
    .execute()?;
Curious to learn more?
Check out the...
Curious to get involved?
Look for issues tagged with good first issue and join the Zulip.
Tutorials and examples
Examples
- The examples directory on github contains some self-contained examples; the corresponding Java code is in the java directory.
- The test-crates directory contains some other standalone tests.
- Duchess itself uses duchess to mirror the classes from the JVM.
Tutorials
Be sure to follow the setup instructions.
Setup instructions
TL;DR
You need to...
- Install the JDK
- Install the cargo-duchess CLI tool with cargo install cargo-duchess
- Run cargo duchess init in your package, which will add Duchess to your build.rs file and your Cargo.toml
Prerequisites
JDK and JAVA_HOME
You'll need to have a modern JDK installed. We recommend JDK17 or higher. Any JDK distribution will work. Here are some recommended options:
- Ubuntu: Install one of the following packages...
java-20-amazon-corretto-jdk/stable
openjdk-17-jre/stable-security
openjdk-17-jdk-headless
- Other:
- Download Amazon Corretto
- Download a pre-built openjdk package suitable for your operating system
You'll need the javap tool from the JDK to build with Duchess. You'll want to configure the JAVA_HOME environment variable to point to your JDK installation. Duchess will use it to locate javap. Otherwise, Duchess will search for it on your PATH. You can configure the environment variables used at build time via Cargo by creating a .cargo/config.toml file (see this example from duchess itself).
Duchess relies on javap to reflect Java type information at build time. It will not be invoked at runtime.
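The lookup order described above (JAVA_HOME first, then PATH) can be sketched as a small function. This is only an illustration of the ordering, not Duchess's actual lookup code; the function name javap_candidates is made up for this example, and the only outside fact it relies on is that the JDK ships javap in its bin directory.

```rust
use std::path::PathBuf;

/// Candidate locations for `javap`, in the order described above:
/// `$JAVA_HOME/bin/javap` first (if `JAVA_HOME` is set), then one candidate
/// per `PATH` entry. Hypothetical helper, not part of Duchess.
fn javap_candidates(java_home: Option<&str>, path_var: &str) -> Vec<PathBuf> {
    let mut candidates = Vec::new();
    if let Some(home) = java_home {
        candidates.push(PathBuf::from(home).join("bin").join("javap"));
    }
    // `split_paths` applies the platform's PATH separator rules.
    for dir in std::env::split_paths(path_var) {
        candidates.push(dir.join("javap"));
    }
    candidates
}
```

A real lookup would then probe each candidate on disk and take the first one that exists.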
Basic setup
To use Duchess, your project requires a build.rs as well as a proc-macro crate. The build.rs does the heavy lifting, invoking javap and doing other reflection. The proc-macro crates then do final processing to generate the code.
You can
Other details
Configuring the CLASSPATH
You will likely want to configure the CLASSPATH for your Rust project as well. Like with JAVA_HOME, you can do that via Cargo by creating a .cargo/config.toml file.
If your Rust project uses external JAR files, you may want to configure it to download them as part of the build. The viper test crate gives an example of how to do that. It uses a build.rs file.
Libjvm and linking
By default, the dylibjvm feature is enabled and Duchess will dynamically load and link libjvm at runtime. Like with javap, it will first search for libjvm in JAVA_HOME if set. Otherwise it will look for java on your PATH to locate the JRE installation. Non-standard installations can also be configured using JvmBuilder.
Without dylibjvm, libjvm must be statically linked.
JNI Versions
By default, we attempt to load JNI 1.6 when compiling for Android, and JNI 1.8 in all other cases. The JNI version can be selected by enabling a feature named jni_ followed by the JNI version, with underscores replacing periods (for example, jni_1_8 for JNI 1.8).
Duchess currently only supports JNI versions 1.6 and 1.8, and only supports 1.6 on Android (the compile will fail if JNI > 1.6 is attempted on Android). Duchess sets the version to the newest version specified by features if features are specified.
If you want Duchess to support a newer JNI API version or locking behavior, open an issue describing your use case, and it may be added to Duchess's next release.
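The selection rules above can be written down as a small model. This is a sketch of the rules as stated in this section, not Duchess's build logic; the names JniVersion, feature_name and select_version are invented for the example.

```rust
/// The two JNI versions this section says are supported (hypothetical enum).
/// Deriving `Ord` orders variants by declaration, so `V1_6 < V1_8`.
#[derive(Debug, Clone, Copy, PartialEq, Eq, PartialOrd, Ord)]
enum JniVersion {
    V1_6,
    V1_8,
}

/// Feature name for a version: "jni_" plus the version with '.' replaced by '_'.
fn feature_name(version: &str) -> String {
    format!("jni_{}", version.replace('.', "_"))
}

/// Pick the version to load: the newest requested one, or the platform
/// default when no feature is enabled. Android rejects anything above 1.6.
fn select_version(requested: &[JniVersion], android: bool) -> Result<JniVersion, String> {
    let chosen = match requested.iter().max() {
        Some(v) => *v,
        None if android => JniVersion::V1_6,
        None => JniVersion::V1_8,
    };
    if android && chosen > JniVersion::V1_6 {
        return Err(format!("{:?} is not supported on Android", chosen));
    }
    Ok(chosen)
}
```

In the real crate the failure on Android happens at compile time; the Result here only models that outcome.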
Tutorial: Call Java from Rust
Setup
Be sure to follow the setup instructions.
The Java class we would like to use from Rust
Imagine we have a Java class Factory that we would like to use from Rust, defined like so:
package com.widgard;
public class Factory {
    public Factory() { /* ... */ }
    public Widget produceWidget() { /* ... */ }
    public void consumeWidget(Widget w) { /* ... */ }
}
public class Widget { /* ... */ }
Using a package from Rust
Using duchess, we can declare a Rust version of this class with the java_package! macro:
duchess::java_package! {
    // First, identify the package you are mirroring,
    // and the visibility level that you want.
    package com.widgard;
    // Next, identify classes whose methods you would like to call.
    // The `*` indicates "reflect all methods".
    // You can also name methods individually (see below).
    class Factory { * }
    // For Widget, we choose not to mirror any methods.
    class Widget { }
}
Generated code
This macro will expand to a module hierarchy matching the Java package name:
pub mod com {
    pub mod widgard {
        // One struct per Java class:
        pub struct Factory { /* ... */ }
        // The inherent impl defines the constructor
        // and any static methods:
        impl Factory { /* ... */ }
        // The extension trait defines the methods
        // on the struct, like `produceWidget`
        // and `consumeWidget`.
        pub trait FactoryExt { /* ... */ }
        // There is also a struct for other classes
        // in the same package if they appear in
        // the signature of the reflected methods.
        //
        // In this case, `Factory#produceWidget`
        // returns a `Widget`, so we get this struct here.
        //
        // Since we did not tell duchess to reflect any
        // methods, there is no `WidgetExt` trait,
        // nor an inherent impl.
        pub struct Widget { /* ... */ }
    }
}
NB: The java_package macro relies on the javap tool to reflect Java signatures. You will need to have the Java Development Kit (JDK) installed for it to work. You will also need to help us find the Java code by setting CLASSPATH appropriately. Note that you can configure the environment via Cargo (in a .cargo/config.toml file) if desired.
Using the generated code
Once you've created the Java package, you can create Java objects and invoke their methods. This should mostly just work as you would expect, with one twist. Invoking a Java method doesn't immediately cause it to execute. Instead, like an iterator or an async function, it returns a JvmOp, which is like a suspended JVM operation that is ready to execute. To actually cause the method to execute, you call execute.
use duchess::prelude::*;
use com::widgard::{Factory, Widget};
// Constructors are `Type::new`...
let f: Java<Factory> = Factory::new().execute()?;
// ...method names are converted to snake-case...
let w: Option<Java<Widget>> = f.produce_widget().execute()?;
// ...use `assert_not_null` to assert that return values are not null...
let w: Java<Widget> = f.produce_widget().assert_not_null().execute()?;
// ...references to Java objects are passed with `&`.
f.consume_widget(&w).execute()?;
Passing null values
If you want to pass a null value as a parameter, you can use duchess::Null:
use com::widgard::Factory;
let f = Factory::new().execute()?;
f.consume_widget(duchess::Null).execute()?;
//               ^^^^^^^^^^^^^ like this!
Another option is to use Option types:
use com::widgard::{Factory, Widget};
let f = Factory::new().execute()?;
let j: Option<Java<Widget>> = None;
f.consume_widget(j).execute()?;
//               ^ like this!
Launching the JVM
Note that to call methods on the JVM, we first had to start it. You do that via duchess::Jvm::with. This method will launch a JVM if it hasn't already started and attach it to the current thread. OpenJDK only supports one JVM per process, so the JVM is global. You can learn more about launching a JVM (including how to set options like the classpath) in the JVM chapter of the reference.
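The "launch once, then reuse the global JVM" behavior can be modeled with a process-wide OnceLock. This is a minimal sketch of the pattern only; FakeJvm, LAUNCHES and global_jvm are hypothetical stand-ins and say nothing about how Duchess implements Jvm::with internally.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;

/// Hypothetical stand-in for a process-wide JVM handle.
struct FakeJvm {
    id: usize,
}

/// Counts how many times the "JVM" was actually launched.
static LAUNCHES: AtomicUsize = AtomicUsize::new(0);
static JVM: OnceLock<FakeJvm> = OnceLock::new();

/// Returns the global handle, launching it on first use. `OnceLock`
/// guarantees the initializer runs at most once, even under contention.
fn global_jvm() -> &'static FakeJvm {
    JVM.get_or_init(|| {
        let n = LAUNCHES.fetch_add(1, Ordering::SeqCst);
        FakeJvm { id: n }
    })
}
```

Calling global_jvm from several threads yields the same handle every time, which mirrors the "one JVM per process" constraint described above.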
Combining steps into one
Because jvm-ops are lazy, you can also chain them together:
use com::widgard::Factory;
let f: Java<Factory> = Factory::new().execute()?;
// Consume and produce the widget in one step:
f.consume_widget(f.produce_widget()).execute()?;
In terms of efficiency, combining steps is currently equivalent to invoking them individually. However, the plan is for it to become more efficient by reducing the number of times we invoke JNI methods.
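To make the "lazy operation" idea concrete, here is a small self-contained model of suspended operations that only run when execute is called. It is a sketch of the general pattern, not the real JvmOp trait: the Op trait, the log parameter, and the ProduceWidget/ConsumeWidget types are all invented for illustration.

```rust
/// A suspended operation: building one does nothing; `execute` runs it.
/// (Hypothetical trait; the `log` records the order in which steps run.)
trait Op: Sized {
    type Output;
    fn execute(self, log: &mut Vec<&'static str>) -> Self::Output;
}

#[derive(Debug, PartialEq)]
struct Widget(u32);

/// Models `f.produce_widget()`.
struct ProduceWidget;

/// Models `f.consume_widget(arg)`, where `arg` is itself an unexecuted op.
struct ConsumeWidget<A>(A);

impl Op for ProduceWidget {
    type Output = Widget;
    fn execute(self, log: &mut Vec<&'static str>) -> Widget {
        log.push("produce");
        Widget(1)
    }
}

impl<A: Op<Output = Widget>> Op for ConsumeWidget<A> {
    type Output = u32;
    fn execute(self, log: &mut Vec<&'static str>) -> u32 {
        // The argument op is evaluated first, then the outer call runs.
        let widget = self.0.execute(log);
        log.push("consume");
        widget.0
    }
}
```

Constructing ConsumeWidget(ProduceWidget) performs no work; a single execute call then runs both steps in order, which is what gives an implementation room to batch the underlying calls.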
Tutorial: implementing native methods
WARNING: This support is not yet implemented.
Duchess also supports implementing Java native methods, making it easy to call Rust code from Java.
Setup
Be sure to follow the setup instructions.
Example
Given a Java class
package me.ferris;
public class ClassWithNativeMethod {
int data() { return 22; }
native String compute(Object o);
}
you can provide an implementation for compute like so:
// First, reflect the class, as described in the "calling Java from Rust" tutorial:
duchess::java_package! {
    package me.ferris;
    class ClassWithNativeMethod { * }
}
use duchess::{java, IntoJava};
use me::ferris::ClassWithNativeMethod;
// Next, provide a decorated Rust function.
// The arguments are translated from Java, including the `this`.
// The return type is either a scalar or `impl IntoJava<J>`
// where `J` is the Java type.
#[duchess::native(me.ferris.ClassWithNativeMethod::compute)]
fn compute(
    this: &ClassWithNativeMethod,
    object: &java::lang::Object,
) -> impl IntoJava<java::lang::String> {
    // in here you can call back to JVM too
    let data = this.data().execute();
    format!("Hello from Rust {data}")
}
Memory safety requirements
Duchess provides a safe abstraction atop the Java Native Interface (JNI). This means that, as long as you are using Duchess to interact with the JVM, you cannot cause memory unsafety. However, there are edge cases that can "void" this guarantee and which Duchess cannot control.
Memory safety requirements
Duchess will guarantee memory safety within your crate, but there are two conditions that it cannot by itself guarantee:
- Duchess does not "sandbox" or "make safe" the Java code it executes. You MUST ensure that Java code being invoked is safe and trusted. You MUST NOT invoke untrusted Java code with Duchess.
- You SHOULD build with the same Java class files that you will use when you deploy:
  - We believe that no loss of memory-safety is possible if incorrect .class files are loaded; however, if interfaces change you will experience failures at runtime.
- You must be careful when mixing Duchess with other Rust JNI libraries (e.g., the jni crate or robusta_jni):
  - For the most part, interop between Duchess and other JNI crates should be no problem. But there are some particular things that can cause issues:
    - The JVM cannot be safely started from multiple threads at once. Duchess uses a lock to avoid contending with itself but we cannot protect from other libraries starting the JVM in parallel with us. It is generally best to start the JVM yourself (via any means) in the main function or some other central place so that you are guaranteed it happens once and exactly once. Duchess should work just fine if the JVM has been started by another crate.
Threat model
This page analyzes Duchess's use of the JNI APIs to explain how it guarantees memory safety. Sections:
- Assumptions: requirements for safe usage of Duchess which Duchess itself cannot enforce.
- Code invariants: invariants that Duchess maintains
- Threat vectors that cause UB: ways to create undefined behavior using JNI, and how Duchess prevents them (references code invariants and assumptions)
- Threat vectors that do not cause UB: suboptimal uses of the JNI that do not create UB; duchess prevents some of these but not all (references code invariants and assumptions)
Assumptions
We assume three things:
- The user does not attempt to start the JVM via some other crate in parallel with using duchess methods.
- The user does not use the JNI PushLocalFrame method to introduce "local variable frames" within the context of a Jvm::with call.
- The Java .class files that are present at build time have the same type signatures and public interfaces as the class files that will be present at runtime. Although there are no known memory-safety vulnerabilities stemming from failure to maintain this invariant, failing to provide matching classfiles will result in failures at runtime.
Code invariants
This section introduces invariants maintained by Duchess using Rust's type system as well as careful API design.
Possessing a &mut Jvm<'jvm> implies attached thread
Jvm references are obtained with Jvm::with. This codepath guarantees:
- The JVM has been started (see the assumptions), using default settings if needed
- The current thread is attached
  - We maintain a thread-local variable tracking the current thread status.
  - If the thread is recorded as attached, nothing happens.
  - Otherwise the JNI method to attach the current thread, AttachCurrentThread, is invoked; in that case, the thread will be detached once with returns.
  - Users can also invoke Jvm::attach_thread_permanently to avoid the overhead of attaching/detaching, which simply sets the thread-local variable to a permanent state and avoids detaching.
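The attach/detach bookkeeping described in the list above can be sketched with a thread-local state machine. This is an illustrative model only: AttachState, with_attached and attach_permanently are hypothetical names, the JNI calls are represented by comments, and unwinding on panic is ignored for brevity.

```rust
use std::cell::Cell;

/// Attachment status of the current thread (hypothetical model).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum AttachState {
    Detached,
    Scoped,    // attached for the duration of the outermost `with_attached`
    Permanent, // attached and never detached
}

thread_local! {
    static STATE: Cell<AttachState> = Cell::new(AttachState::Detached);
}

fn current_state() -> AttachState {
    STATE.with(|s| s.get())
}

/// Models a permanent attach: later scopes never detach.
fn attach_permanently() {
    // A real implementation would call AttachCurrentThread here if needed.
    STATE.with(|s| s.set(AttachState::Permanent));
}

/// Runs `f` with the thread attached, detaching afterwards only if this
/// call was the one that attached it.
fn with_attached<R>(f: impl FnOnce() -> R) -> R {
    let was_detached = current_state() == AttachState::Detached;
    if was_detached {
        // A real implementation would call AttachCurrentThread here.
        STATE.with(|s| s.set(AttachState::Scoped));
    }
    let result = f();
    if was_detached {
        // ...and DetachCurrentThread here.
        STATE.with(|s| s.set(AttachState::Detached));
    }
    result
}
```

Nested calls reuse the outer attachment, and a permanently attached thread skips both the attach and the detach.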
The 'jvm lifetime &mut Jvm<'jvm> is the innermost scope for local variables
References to Java objects of type J are stored in a Local<'jvm, J> holder. Local references can come from the arguments to native functions or from JvmOp::do_jni calls. do_jni calls use the 'jvm lifetime found on the Jvm<'jvm> argument. This allows the Local to be used freely within that scope. It is therefore important that 'jvm be constrained to the innermost valid scope.
Inductive argument that this invariant is maintained:
- Base case: Users can only obtain a Jvm<'jvm> value via Jvm::with, which takes a closure argument of type for<'jvm> impl FnMut(&mut Jvm<'jvm>). Therefore, this closure cannot assume that 'jvm will outlive the closure call and all local values cannot escape the closure body.
- Inductive case: All operations performed within a Jvm::with maintain the invariant. Violating the invariant would require introducing a new JNI local frame, which can happen in two ways:
  - invoking PushLocalFrame: Duchess does not expose this operation, and we assume users do not do this via some other crate.
  - calling into Java code which in turn calls back into Rust code via a native method: In this case, we would have a stack with Rust code R1, then Java code J, then a Rust function R2 that implements a Java native method. R1 must have invoked Jvm::with to obtain a &mut Jvm<'jvm>. If R1 could somehow give this Jvm<'jvm> value to R2, R2 could create locals that would outlive its dynamic extent, violating the invariant. However, for R1 to invoke Java code J, R1 had to invoke a duchess method with &mut Jvm<'jvm> as argument, which means that it has given the Java code unique access to the (unique) Jvm<'jvm> value, lent out its only reference, and the Java code does not give this value to R2.
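The base case relies on a higher-ranked closure bound. The sketch below shows the shape of that argument with stand-in types: Jvm, Local and with here are simplified, hypothetical versions written for this example (and it uses FnOnce rather than FnMut), not the real Duchess definitions.

```rust
use std::marker::PhantomData;

/// Hypothetical stand-in for a JVM handle scoped to `'jvm`.
struct Jvm<'jvm> {
    next_id: u32,
    _scope: PhantomData<&'jvm ()>,
}

/// Hypothetical stand-in for a local reference; it carries the same `'jvm`.
struct Local<'jvm> {
    id: u32,
    _scope: PhantomData<&'jvm ()>,
}

impl<'jvm> Jvm<'jvm> {
    /// Every local created through this handle is tied to `'jvm`.
    fn new_local(&mut self) -> Local<'jvm> {
        let id = self.next_id;
        self.next_id += 1;
        Local { id, _scope: PhantomData }
    }
}

/// The closure must work for *every* `'jvm`, and `R` is chosen outside
/// that quantifier, so `R` cannot mention `'jvm`: locals cannot escape.
fn with<R, F>(f: F) -> R
where
    F: for<'jvm> FnOnce(&mut Jvm<'jvm>) -> R,
{
    let mut jvm = Jvm { next_id: 0, _scope: PhantomData };
    f(&mut jvm)
}
```

Under these definitions, with(|jvm| jvm.new_local().id) compiles, while with(|jvm| jvm.new_local()) is rejected by the borrow checker because the returned Local would have to name 'jvm.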
Flaw:
It is theoretically possible to do something like this...
- Jvm::with(|jvm1| ...)
  - stash the jvm1 somewhere in thread-local data using unsafe code
  - Jvm::with(|jvm2| ...)
    - invoke jvm code that calls back into Rust
      - from inside that call, recover the Jvm<'jvm1>, allocate a new Local with it, and store the result back (unsafely)
  - recover the pair of jvm1 and the object that was created
...it is difficult to write the code that would do this and it requires unsafe code, but that unsafe code doesn't seem to be doing anything that should not theoretically work. Avoiding this is difficult, but if we focus on execute, we can make it so that users never directly get their hands on a Jvm and make this safe.
All references to impl JavaObject types are JNI local or global references
The JavaObject trait is an unsafe trait. When implemented on a struct S, it means that every &S reference must be a JNI local or global reference. This trait is implemented for all the structs that duchess creates to represent Java types, e.g., duchess::java::lang::Object. This invariant is enforced by the following pattern:
- Each such struct has a private field of type Infallible, ensuring it could never be constructed via safe code.
- To "construct" an instance of this struct you would use a constructor like Object::new which returns an impl JavaConstructor; when evaluated it will yield a Local wrapper. Locals are only constructed for pointers we get from JNI. Globals can be created from Locals (and hence come from JNI too).
1:1 correspondence between JNI global/local references and Global/Local
Every time we create a Global value (resp. Local), it is created with a new global or local reference on the JNI side as well. The Drop for Global releases the global (resp., local) reference.
Threat vectors that cause UB
What follows is a list of specific threat vectors identified based on the JNI documentation as well as a checklist of common JNI failures found in the IBM documentation.
PushLocalFrame invoked by a mechanism external to Duchess
Outcome of nonadherence: UB
Duchess does not expose PushLocalFrame, but it is possible to invoke this method via unsafe code or from other crates (e.g., the jni crate's push_local_frame method). This method will cause local variables created within its dynamic scope to be released when PopLocalFrame is invoked. The 'jvm lifetime mechanism used to ensure local variables do not escape their scope could be invalidated by these methods. See the section on the jvm lifetime for more details.
How Duchess avoids this: Duchess carefully controls use of this method internally. We explicitly assume that users do not invoke this method directly via alternative means.
Multiple JVMs started in the same process
Outcome of nonadherence: UB
There can only be one JVM per process. If multiple JVMs are started concurrently, crashes or UB can occur.
How Duchess avoids this: Documentation and synchronization: Within the Duchess library, all JVM accesses are internally synchronized. Duchess will lazily start the JVM if it has not been started already. However, Duchess cannot control the behavior of other libraries (or another major version of the package). Duchess mitigates this with documentation recommending users start the JVM explicitly in main. Future work may improve this with a centralized "start-jvm" crate that is shared between jni, duchess and any other JNI based Rust libraries. Duchess may also add mitigations to prevent multiple major versions of Duchess from being used in the same dependency closure.
When you update a Java object in native code, ensure synchronization of access.
Outcome of nonadherence: Memory corruption
How Duchess avoids this: We do not support updating objects in native code.
Cached method and field IDs
From the JNI documentation:
A field or method ID does not prevent the VM from unloading the class from which the ID has been derived. After the class is unloaded, the method or field ID becomes invalid. The native code, therefore, must make sure to:
- keep a live reference to the underlying class, or
- recompute the method or field ID
if it intends to use a method or field ID for an extended period of time.
Duchess caches method and field IDs in various places. In all cases, the id is derived from a Class reference obtained by invoking JavaObject::class. The JavaObject::class method is defined to permanently (for the lifetime of the process) cache a global reference to the class object, fulfilling the first criterion ("keep a live reference to the underlying class").
Local references are tied to the lifetime of a JNI method call
The JNI manual documents that local references are "valid for the duration of a native method call. Once the method returns, these references will be automatically out of scope." In Duchess, each newly created local reference is assigned to a Local<'jvm, T>. This type carries a lifetime ('jvm) that derives from the duchess::Jvm<'jvm> argument provided to the JvmOp::do_jni method. Therefore, the local cannot escape the 'jvm lifetime on the Jvm<'jvm> value; duchess maintains an invariant that 'jvm is the innermost JNI local scope.
Local references cannot be saved in global variables.
Outcome of nonadherence: Random crashes
How Duchess avoids this: See discussion here and the jvm invariant.
Always check for exceptions (or return codes) on return from a JNI function. Always handle a deferred exception immediately you detect it.
Outcome of nonadherence: Unexplained exceptions or undefined behavior, crashes
How Duchess avoids this: End-users do not directly invoke JNI functions. Within Duchess, virtually all calls to JNI functions use the EnvPtr::invoke helper function which checks for exceptions. A small number use invoke_unchecked:
- array.rs
  - invokes invoke_unchecked on GetArrayLength, which is not documented as having failure conditions
  - invokes primitive setter with known-valid bounds
  - invokes primitive getter with known-valid bounds
- cast.rs
  - invokes infallible method IsInstanceOf
- find.rs
  - invokes GetMethodID and GetStaticMethodID "unchecked" but checks the return value for null and handles exception that occurs
- raw.rs
  - invokes invoke_unchecked in the implementation of invoke :)
- ref_.rs
  - invokes NewLocalRef with a known-non-null argument
  - invokes NewLocalRef with a known-non-null argument
- str.rs
  - invokes GetStringLength (infallible)
  - invokes GetStringUTFLength (infallible)
  - invokes GetStringUTFRegion with known-valid bounds
- jvm.rs
Usage of Throw and ThrowNew
The native method can choose to return immediately, causing the exception to be thrown in the Java code that initiated the native method call.
Outcome of nonadherence: Undefined behavior.
How Duchess avoids this: Throw and ThrowNew are invoked from the java_function macro as part of native_function_returning_object and native_function_returning_scalar. After duchess raises the exception, no more JNI calls are made and null or 0 is returned, forcing the Java caller to handle the exception.
Clear exceptions before invoking other JNI calls
After an exception has been raised, the native code must first clear the exception before making other JNI calls.
Outcome of nonadherence: Undefined behavior.
How Duchess avoids this: When we detect an exception, we always clear the exception immediately before returning a Result.
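The "check, clear, then return a Result" discipline can be illustrated with a mock environment. This is a sketch of the pattern only: FakeEnv and its methods are hypothetical and merely imitate the roles that the checked and unchecked invoke helpers play; no real JNI call is made.

```rust
/// Hypothetical mock of a JNI environment with a pending-exception slot.
#[derive(Default)]
struct FakeEnv {
    pending: Option<String>,
}

impl FakeEnv {
    /// Models a raw JNI call that may leave an exception pending.
    fn raw_call(&mut self, fail: bool) -> i32 {
        if fail {
            self.pending = Some("java.lang.RuntimeException".to_string());
            0
        } else {
            42
        }
    }

    /// Checked wrapper: after every raw call, look for a pending exception,
    /// clear it, and surface it as a Rust `Err`.
    fn invoke(&mut self, fail: bool) -> Result<i32, String> {
        let value = self.raw_call(fail);
        // `take()` both reads and clears the pending exception, so the
        // environment is clean before any further call is made.
        match self.pending.take() {
            Some(exception) => Err(exception),
            None => Ok(value),
        }
    }
}
```

Because the exception is cleared inside the wrapper, a failed call never leaves the environment in a state where the next call would be invalid.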
Illegal argument types
Reporting Programming Errors
The JNI does not check for programming errors such as passing in NULL pointers or illegal argument types.
The programmer must not pass illegal pointers or arguments of the wrong type to JNI functions. Doing so could result in arbitrary consequences, including a corrupted system state or VM crash.
How Duchess avoids this: We generate strongly typed interfaces based on the signatures found in the class files and we assume that the same class files are present at runtime.
Example tests: type_mismatch_*.rs in the test directory
Native references crossing threads
Local references are only valid in the thread in which they are created. The native code must not pass local references from one thread to another.
Outcome of nonadherence: Undefined Behavior
How Duchess avoids this: Duchess prevents this because Local is !Sync.
Example tests:
- doctest in ref_.rs
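One common way to make a Rust type thread-confined is a raw-pointer marker field, which removes the automatically derived Send and Sync impls. The sketch below shows that technique with hypothetical Local and Global structs; it is not a copy of how Duchess defines its own types.

```rust
use std::marker::PhantomData;

/// Thread-confined handle: the `*mut ()` marker is neither `Send` nor
/// `Sync`, so this struct is neither as well. (Hypothetical type.)
struct Local {
    id: u32,
    _not_send_sync: PhantomData<*mut ()>,
}

/// Handle that may cross threads. (Hypothetical type.)
struct Global {
    id: u32,
}

/// Compiles only when `T` can be sent to and shared between threads.
fn assert_send_sync<T: Send + Sync>() {}

fn compile_time_checks() {
    assert_send_sync::<Global>();
    // assert_send_sync::<Local>(); // rejected: `*mut ()` is not Send/Sync
}
```

With this marker, an attempt to move a Local into std::thread::spawn fails at compile time, while a Global can be moved freely.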
Threat vectors that do not cause UB
Invoke ExceptionOccurred regularly
Native methods should insert ExceptionOccurred() checks in necessary places (such as in a tight loop without other exception checks) to ensure that the current thread responds to asynchronous exceptions in a reasonable amount of time.
Outcome of nonadherence: Asynchronous exceptions won't be detected.
How Duchess avoids this: We check this flag at every interaction with the JVM but not other times; it is possible for Rust code to execute for arbitrary amounts of time without checking the flag. Asynchronous exceptions are not recommended in modern code and the outcome of not checking is not undefined behavior.
Local variable capacity
Each JNI frame has a guaranteed capacity which can be extended via EnsureLocalCapacity. This limit is largely advisory, and exceeding it does not cause UB. The documentation states:
For backward compatibility, the VM allocates local references beyond the ensured capacity. (As a debugging support, the VM may give the user warnings that too many local references are being created. In the JDK, the programmer can supply the -verbose:jni command line option to turn on these messages.) The VM calls FatalError if no more local references can be created beyond the ensured capacity.
Outcome of nonadherence: Slower performance or, in extreme cases, aborting the process via reporting a Fatal Error.
How Duchess avoids this:
- Duchess is not aware of this limit and does not limit the number of local variables that will be created. If needed, we could support annotations or other means.
- However, if using Duchess in its recommended configuration (with execute calls), all local variables will be cleaned up in between operations, and operations always create a finite (and statically known) number of locals.
Ensure that every global reference created has a path that deletes that global reference.
Outcome of nonadherence: Memory leak
How Duchess avoids this: There is a 1:1 correspondence between JNI global references and Global values. Every time we create a global reference, we store it in a Global type. The destructor on this type will free the reference.
Memory exhaustion from too many local references
However, there are times when the programmer should explicitly free a local reference. Consider, for example, the following situations:
- A native method accesses a large Java object, thereby creating a local reference to the Java object. The native method then performs additional computation before returning to the caller. The local reference to the large Java object will prevent the object from being garbage collected, even if the object is no longer used in the remainder of the computation.
- A native method creates a large number of local references, although not all of them are used at the same time. Since the VM needs a certain amount of space to keep track of a local reference, creating too many local references may cause the system to run out of memory. For example, a native method loops through a large array of objects, retrieves the elements as local references, and operates on one element at each iteration. After each iteration, the programmer no longer needs the local reference to the array element.
The JNI allows the programmer to manually delete local references at any point within a native method.
Outcome of nonadherence: Memory exhaustion.
How Duchess avoids this: We do not expect users to do fine-grained interaction with Java objects in this fashion and we do not provide absolute protection from memory exhaustion. However, we do mitigate the likelihood, as the Local type has a destructor that deletes local references. Therefore common usage patterns where a Local is created and then dropped within a loop (but not live across loop iterations) would result in intermediate locals being deleted.
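The effect of a destructor on per-iteration locals can be shown with a counter that tracks how many references are live at once. This is a model of the pattern under made-up names (Frame, Local, sum_elements); the increments and decrements stand in for creating and deleting JNI local references.

```rust
use std::cell::Cell;

/// Hypothetical model of a JNI local frame that counts live references.
#[derive(Default)]
struct Frame {
    live: Cell<usize>,
    peak: Cell<usize>,
}

/// Hypothetical local reference; creation and drop update the frame.
struct Local<'f> {
    frame: &'f Frame,
    value: u32,
}

impl<'f> Local<'f> {
    fn new(frame: &'f Frame, value: u32) -> Self {
        let live = frame.live.get() + 1;
        frame.live.set(live);
        frame.peak.set(frame.peak.get().max(live));
        Local { frame, value }
    }
}

impl Drop for Local<'_> {
    fn drop(&mut self) {
        // Stands in for deleting the JNI local reference.
        self.frame.live.set(self.frame.live.get() - 1);
    }
}

/// Visits every element through a fresh local, one per iteration.
fn sum_elements(frame: &Frame, elements: &[u32]) -> u32 {
    let mut total = 0;
    for &element in elements {
        let local = Local::new(frame, element);
        total += local.value;
        // `local` is dropped here, before the next iteration begins.
    }
    total
}
```

However long the input is, at most one local is live at a time, so the loop does not accumulate references.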
Ensure that you use the isCopy and mode flags correctly. See Copying and pinning.
Outcome of nonadherence: Memory leaks and/or heap fragmentation
How Duchess avoids this: Duchess does not currently make use of the methods to gain direct access to Java array contents, so this is not relevant.
Ensure that array and string elements are always freed.
Outcome of nonadherence: Memory leak
How Duchess avoids this: Unclear what this exactly means, to be honest, but we make no special effort to prevent it. However, memory leaks are largely unlikely in Duchess due to having a destructor on Global.
Mismatch of .class files
Duchess cannot guarantee that the .class files used during compilation match the classfiles at runtime.
How Duchess avoids this: Duchess will load JNI methods at runtime using their method descriptor. The method descriptor fully captures the arguments and return type of a method (with the exception of generics). If a correct classfile has not been provided, this will result in a safe runtime error. In the case of generics, we operate on pointers instead and do not directly transmute memory returned from the JVM. These values are only useful when interacting with the JVM. Since we are operating on pointers and interacting with the JVM, there is no avenue to create memory unsafety.
Outcome of nonadherence: Errors returned from .execute()
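For readers unfamiliar with JNI method descriptors, the sketch below builds one from a small model of Java types. The descriptor syntax is the standard JVM format; the JavaType enum and both function names are hypothetical and cover only a few types.

```rust
/// A small subset of Java types, enough to illustrate descriptors.
/// (Hypothetical enum; generic parameters are erased and so never appear.)
enum JavaType {
    Void,
    Byte,
    Int,
    Object(&'static str), // fully qualified name, e.g. "java.lang.String"
    Array(Box<JavaType>),
}

/// Descriptor for a single type.
fn type_descriptor(ty: &JavaType) -> String {
    match ty {
        JavaType::Void => "V".to_string(),
        JavaType::Byte => "B".to_string(),
        JavaType::Int => "I".to_string(),
        JavaType::Object(name) => format!("L{};", name.replace('.', "/")),
        JavaType::Array(element) => format!("[{}", type_descriptor(element)),
    }
}

/// Descriptor for a method: argument descriptors in parentheses,
/// followed by the return type descriptor.
fn method_descriptor(args: &[JavaType], ret: &JavaType) -> String {
    let args: String = args.iter().map(type_descriptor).collect();
    format!("({}){}", args, type_descriptor(ret))
}
```

A method declared as void methodName(byte[], int) has the descriptor ([BI)V. If the class found at runtime declares different argument or return types, the descriptors differ and the lookup fails with an error instead of calling the wrong method.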
Reference
Features
dylibjvm
libjvm can be either statically or dynamically linked. If the dylibjvm feature is enabled, duchess will dynamically load libjvm when trying to create or find a JVM. Unless the lib path is specified in JvmBuilder::load_libjvm_at(), it uses the java-locator crate to find the likely location of libjvm on the platform.
The java_package macro
The java_package macro creates Rust structures to interact with a Java package -- we call this oxidizing the Java class. Here is an example input to java_package that shows off the various features.
package my.package;
// Oxidize a class, ignoring all its methods or other details. This is useful
// for classes that the Rust code needs to pass around opaquely but doesn't have to
// actually use.
class SimpleClass { }
// Oxidize a class with all details inferred via Java reflection. This will cause
// compilation errors if the class employs Java features that can't be supported
// by duchess in Rust, such as overloaded functions or some of the richer uses
// of Java wildcards (e.g., `ArrayList<Class<?>>`)
//
// Careful: since Java's semver rules are different from Rust's rules,
// this can cause breakage if you update the Java package without updating
// to a new Rust major version. For example, the java package might add a new
// overloaded function; this is not a breaking change in Java, but it will cause
// an error in your Rust crate.
//
// Therefore, we recommend that libraries which wish to maintain a semver guarantee
// avoid this form.
class ReflectedClass { * }
// The preferred form is to specify exactly which parts of the Java API you wish
// to include in the oxidized Rust type. This format is precisely the same as the
// one generated by `javap -public`, so we recommend that you simply run that tool
// and copy-and-paste the output in. You can then remove any methods or other details that
// are causing trouble.
class SpecifiedClass
    extends some.SuperClass // Must be some superclass of `SpecifiedClass`
    implements some.Interface1,
               some.Interface2<Type> // Must be interfaces implemented by `SpecifiedClass` or some superclass
{
    // Mirror a constructor by using the name of the class, along with types
    // for its arguments. Note that we use full types. You can generate these signatures
    // with `javap -public`.
    SpecifiedClass(java.lang.String, java.util.List<String>);
    // Mirror a method with the given signature.
    void methodName(byte[], int);
}
Notes on Java generics and erasure
We do our best to reflect Java generics in Rust,
but the two systems are not fully compatible.
In particular, Java wildcards (e.g., Class<?>) are only supported in limited scenarios.
You may have to remove methods that make use of them.
When you oxidize a class, you can choose to oxidize it in an erased fashion, meaning that you omit all of its generic parameters. This is generally discouraged but sometimes useful.
Generated Rust code
This will generate a Rust module structure containing:
- a module for each Java package
- for each oxidized Java class Foo:
  - a struct Foo and a trait FooExt
    - the trait defines methods on Foo that can be invoked on any JVM operation that returns a Foo.
  - impls of the JRef trait for each superclass and interface, to permit upcasting
For the example above we would get
pub mod my {
pub mod package {
// References to java types branch to duchess; other references
// make use of whatever names you have in scope.
use duchess::java;
use super::super::*;
// For `SimpleClass`, we didn't oxidize any methods or supertype,
// so we get an empty trait and the ability to upcast to `Object`:
pub struct SimpleClass { /* ... */ }
pub trait SimpleClassExt { }
impl JRef<java::lang::Object> for SimpleClass { }
// For `ReflectedClass`, the methods/upcasts are derived from
// whatever we found in the Java code.
pub struct ReflectedClass { /* ... */ }
pub trait ReflectedClassExt { /* ... */ }
impl JRef<java::lang::Object> for ReflectedClass { }
impl JRef</* ... */> for ReflectedClass { }
// For `SpecifiedClass`, the methods/upcasts are generated from
// the details we included. Note that `some.SuperClass` and `some.Interface1`
// refer to the package `some`, which was not part of the `java_package`
// invocation. Therefore, you must have imported `some` via a `use`
// statement or the Rust compiler will give errors about an undeclared `some`
// module.
pub struct SpecifiedClass { /* ... */ }
pub trait SpecifiedClassExt { /* ... */ }
impl JRef<some::SuperClass> for SpecifiedClass { }
impl JRef<some::Interface1> for SpecifiedClass { }
impl JRef<some::Interface2<Type>> for SpecifiedClass { }
impl JRef<java::lang::Object> for SpecifiedClass { }
// Oxidizing a generic Java class yields a generic Rust struct:
pub struct MyList<E> { /* ... */ }
pub trait MyListExt<E> { /* ... */ }
impl<E> JRef<java::util::AbstractList<E>> for MyList<E> { }
}
}
Multiple packages
You can (and should) declare multiple packages together:
package foo.bar;
class C1 { }
package foo.baz;
class C2 { }
This allows the macro to generate combined Rust modules:
pub mod foo {
pub mod bar {
pub struct C1 { .. }
pub trait C1Ext { ... }
}
pub mod baz {
pub struct C2 { .. }
pub trait C2Ext { ... }
}
}
References from one class to another
When oxidizing a class C, duchess checks its interface for validity.
First, when the class details were manually specified, duchess will check that they match the Java classes that are available. If the class details are derived automatically from reflection, this isn't necessary.
Next, duchess checks the other classes that are referenced from the oxidized methods of C or via extends/implements. If those classes are part of a package that we are oxidizing, then you get an error if those classes are not being oxidized.
For example, this code would create an error because p.C1 extends p.C2 but p.C2 is not oxidized:
package p;
class C1 extends p.C2 { }
// ----
//
// ERROR: `C2` is part of package `p`, but not oxidized!
To fix it, either remove the extends declaration or reflect C2 as well:
package p;
class C1 extends p.C2 { }
class C2 { /* you don't have to oxidize any further details */ }
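The check described above can be sketched in plain Rust. This is only an illustration under stated assumptions: `package_of` and `missing_classes` are hypothetical helpers, class names are assumed to be dotted and package-qualified, and the real macro works on parsed tokens rather than strings.

```rust
use std::collections::BTreeSet;

/// Returns the package part of a dotted class name (`p.C2` gives `p`).
/// Hypothetical helper; unqualified names map to the empty package.
fn package_of(class: &str) -> &str {
    class.rsplit_once('.').map(|(pkg, _)| pkg).unwrap_or("")
}

/// Lists referenced classes that live in a package being oxidized
/// but are not themselves oxidized.
pub fn missing_classes(oxidized: &BTreeSet<&str>, referenced: &[&str]) -> Vec<String> {
    // Every package that contains at least one oxidized class.
    let packages: BTreeSet<&str> = oxidized.iter().copied().map(package_of).collect();
    referenced
        .iter()
        .copied()
        .filter(|class| packages.contains(package_of(class)) && !oxidized.contains(*class))
        .map(|class| class.to_string())
        .collect()
}
```

With only `p.C1` oxidized, a reference to `p.C2` is reported, while a reference to `q.C2` is not, because package `q` is outside the set being oxidized.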
References to other packages
When your classes reference other packages that are not currently being oxidized, duchess will simply generate a reference to those classes. It's your responsibility to bring them into scope.
// Bring `q` into scope from somewhere else
use some_rust_crate::q;
duchess::java_package! {
package p;
class C1 extends q.C2 { }
// ----
//
// Package `q` is not being oxidized,
// so duchess just generates a reference
// to `q::C2`. This will get errors if you have
// not brought `q` into scope somehow.
}
Translating Java method signatures to Rust
The java_package macro translates Java methods into Rust methods.
The method argument types are translated as follows:
Java argument type | Rust argument type |
---|---|
byte | impl duchess::IntoScalar<i8> |
short | impl duchess::IntoScalar<i16> |
int | impl duchess::IntoScalar<i32> |
long | impl duchess::IntoScalar<i64> |
Java object type J | impl duchess::IntoJava<J> |
e.g., java.lang.String | impl duchess::IntoJava<java::lang::String> |
The Rust version of the Java method will return one of the following traits. These are not the actual Rust value, but rather the JVM operation that will yield the value when executed:
Java return type | Rust return type |
---|---|
void | impl duchess::VoidMethod |
byte | impl duchess::ScalarMethod<i8> |
... | impl duchess::ScalarMethod<...> |
long | impl duchess::ScalarMethod<i64> |
Java object type J | impl duchess::JavaMethod<J> |
e.g., java.lang.String | impl duchess::JavaMethod<java::lang::String> |
The java_function macro
The java_function macro is used to implement native functions. Make sure to read about how you link these native functions into the JVM.
Examples
Just want to see the code? Read the greeting
example to see the setup in action.
Specifying which function you are defining
The #[java_function(X)] takes an argument X that specifies which Java function is being defined.
This argument X can have the following forms:
- java.class.Name::method, identifying a native method `method` defined in the class java.class.Name. There must be exactly one native method with the given name.
- a partial class definition like class java.class.Name { native void method(int i); } which identifies the method name along with its complete signature. This class definition must contain exactly one method as its member, and the types must match what is declared in the Java class.
Expected function arguments and their type
#[java_function] requires the decorated function to have the following arguments:
- If not static, a this parameter -- can have any name, but we recommend this
- One parameter per Java argument -- can have any name, but we recommend matching the names used in Java
For the this and other Java arguments, their type can be:
- i32, i16, etc for Java scalars
- &J where J is the Java type
- R where R is some Rust type that corresponds to the Java type
Expected return type
If the underlying Java function returns a scalar value, your Rust function must return that same scalar value.
Otherwise, if the underlying Java function returns an object of type J, the value returned from your function will be converted to J by invoking the to_java method. This means your function can return:
- a reference to a Java object of type J (e.g., Java<J>)
- a Rust value that can be converted to J via to_java::<J>
Linking your native function into the JVM
This is covered under a dedicated page.
Linking native functions into the JVM
Using the #[java_function] decorator you can write Rust implementations for Java native functions. To get the JVM to invoke these methods, it has to know how to find them. The way you do this depends on whether the "top-level" program is written in Rust or in Java.
Rust program that creates a JVM
If your Rust program is launching the JVM, then you can configure that JVM to link to your native method definitions through methods on the JVM builder.
use duchess::prelude::*; // 👈 You'll need this.
#[java_function(...)]
fn foo(...) { }
fn main() -> duchess::Result<()> {
    Jvm::builder()
        .link(foo::java_fn()) // 👈 Note the `::java_fn()`.
        .try_launch()?;
    Ok(())
}
How it works. The call foo::java_fn() returns a duchess::JavaFunction struct. The java_fn method is defined in the duchess JavaFn trait; that trait is implemented on a struct type foo that is created by the #[java_function] decorator. This trait is in the duchess prelude, which is why you need to use duchess::prelude::*.
Java function suites
Invoking the link method for every java function you wish to implement is tedious and error-prone. If you have java functions spread across crates and modules, it also presents a maintenance hazard, since each time you add a new #[java_function] you would also have to remember to add it to the Jvm builder invocation, which is likely located in some other part of the code.
To avoid this, you can create suites of java functions. The idea is that the link method accepts not only individual JavaFunction structs but also Vec<JavaFunction> suites. You can then write a function in your module that returns a Vec<JavaFunction> with all the java functions defined locally:
use duchess::prelude::*;
#[java_function(...)]
fn foo(...) { }
#[java_function(...)]
fn bar(...) { }
fn java_functions() -> Vec<JavaFunction> {
vec![
foo::java_fn(),
bar::java_fn(),
]
}
You can also compose suites from other crates or modules:
fn java_functions() -> Vec<duchess::JavaFunction> {
crate_a::java_functions()
.into_iter()
.chain(crate_b::java_functions())
.collect()
}
And finally you can invoke link() to link them all at once:
fn main() -> duchess::Result<()> {
    Jvm::builder()
        .link(java_functions())
        .try_launch()?;
    Ok(())
}
JVM that calls into Rust
If the JVM is the main process, then you have to use a different method to link into Rust. First, you have to compile your Rust binary as a cdylib by configuring Cargo.toml with a new [lib] section:
[lib]
crate_type = ["cdylib"]
Then in your Java code you have to invoke System.loadLibrary. Typically you do this in a static section on the class with the native method:
class HelloWorld {
// This declares that the static `hello` method will be provided
// by a native library.
private static native String hello(String input);
static {
// This actually loads the shared object that we'll be creating.
// The actual location of the .so or .dll may differ based on your
// platform.
System.loadLibrary("mylib");
}
}
Finally, you need to run cargo build and put the dylib that is produced into the right place. The details differ by platform. On Linux, you can export LD_LIBRARY_PATH=/path/to/mylib/target/debug to link the dylib directly from the Cargo build directory.
These instructions were based on the excellent docs from the jni crate; you can read more there.
Deriving Java/Rust conversions
JVM Operations
JVM operations correspond to code that will execute on the JVM. Like futures and iterators, JVM operations are lazy. This means that you compose them together using a series of method calls and, once you've built up the entire thing that you want to do, you invoke the execute method, giving it a &mut Jvm to execute on. This lazy style is convenient to use, because you only have to supply the jvm argument once, but it also gives duchess a chance to optimize for fewer JNI invocations, making your code run faster.
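To make the lazy style concrete, here is a toy model in plain Rust. It is a sketch only: `ToyJvm`, `Op`, `Const`, and `Map` are hypothetical names invented for illustration and are not Duchess types. The point is that building an operation does no work, and the environment is supplied once, at `execute`.

```rust
/// Stand-in for the JVM: it only counts how many steps it has run.
pub struct ToyJvm {
    pub calls: u32,
}

/// A lazy operation: constructing one does nothing until `execute` is called.
pub trait Op: Sized {
    type Output;

    fn execute(self, jvm: &mut ToyJvm) -> Self::Output;

    /// Composes a follow-up step without running anything yet.
    fn map<F, T>(self, f: F) -> Map<Self, F>
    where
        F: FnOnce(Self::Output) -> T,
    {
        Map { inner: self, f }
    }
}

/// An operation that yields a fixed value when executed.
pub struct Const(pub i32);

impl Op for Const {
    type Output = i32;

    fn execute(self, jvm: &mut ToyJvm) -> i32 {
        jvm.calls += 1;
        self.0
    }
}

/// An operation that runs `inner` and then applies `f` to its result.
pub struct Map<O, F> {
    inner: O,
    f: F,
}

impl<O, F, T> Op for Map<O, F>
where
    O: Op,
    F: FnOnce(O::Output) -> T,
{
    type Output = T;

    fn execute(self, jvm: &mut ToyJvm) -> T {
        let value = self.inner.execute(jvm);
        jvm.calls += 1;
        (self.f)(value)
    }
}
```

In this model, `Const(20).map(|x| x + 1).map(|x| x * 2)` leaves `jvm.calls` at zero until `execute(&mut jvm)` is called. Because the whole chain is visible as one value before it runs, a real implementation has room to batch or reorder the underlying calls.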
The ToJava trait
The ToJava trait is part of the Duchess prelude.
It defines an &self method to_java that can be used to convert Rust values into Java objects;
if those Rust types are references to a Java object, then the result is just an identity operation.
The result of to_java is not the Java object itself but rather a JvmOp that produces the Java object.
In some cases, the same Rust type can be converted into multiple Java types.
For example, a Rust Vec can be converted into a Java ArrayList but also a Java List or Vector.
The to_java method takes a type parameter for these cases that can be specified with turbofish,
e.g., vec.to_java::<java::util::List<_>>().
Examples
String
The Rust String type converts to the Java string type.
One could compute the Java hashCode for a string as follows:
use duchess::prelude::*;
use duchess::java;
let data = format!("Hello, Duchess!");
let hash_code: i32 =
data.to_java::<java::lang::String>() // Returns a `JvmOp` producing a `java::lang::String`
.hash_code() // Returns a `JvmOp` invoking `hashCode` on this string
.execute()?; // Execute the jvmop
Java<java::lang::String>
Converting a Rust reference to a Java object, such as a Global reference, is an identity operation.
use duchess::prelude::*;
use duchess::java;
// Produce a Global reference from a Rust string
let data: Java<java::lang::String> =
format!("Hello, Duchess!").execute()?;
// Invoke `to_java` on the `Global` reference
let hash_code: i32 =
    data.to_java::<java::lang::String>() // Returns a `JvmOp` producing a `java::lang::String`
        .hash_code() // Returns a `JvmOp` invoking `hashCode` on this string
        .execute()?; // Execute the jvmop
Deriving ToJava for your own types
Duchess provides a derive for ToJava that you can apply to structs or enums.
Details can be found in the dedicated book section covering derive.
Java/Rust type conversions
The Jvm type
The Jvm type represents a running Java Virtual Machine (JVM). It is mostly used to execute JVM operations, but it also has some methods for interacting with the JVM that you may find useful. The way you get access to a Jvm instance depends on the language of the primary application:
- If your main process is Rust, then use Jvm::with to start the global JVM instance.
- If your main process is Java, then when your Rust code is invoked via JNI, you will be given a Jvm instance.
Starting multiple JVMs
As long as a thread has access to a Jvm, either by invoking Jvm::with or by getting called via JNI, you cannot get access to another one. Invoking Jvm::with on a thread that already has access to a Jvm is an error. This is required to ensure safety, because it allows us to be sure that mutably borrowing a Jvm instance blocks the thread from performing other Jvm operations until that borrow is complete. Sequential invocations of Jvm::with are allowed and will all be attached to that same underlying JVM instance.
Multiple threads can invoke Jvm::with, but only one underlying JVM can ever be active at a time. If multiple threads invoke Jvm::with, one of them will succeed in starting the JVM, and the others will be attached to that same underlying JVM instance as additional active threads.
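The two rules above (one shared instance per process, no nested access on a single thread) can be modeled with the standard library alone. The sketch below is an assumption-laden illustration, not how Duchess implements Jvm::with: `ToyVm`, `with_vm`, and `Detach` are hypothetical names, and the callback receives a shared reference for simplicity.

```rust
use std::cell::Cell;
use std::sync::OnceLock;

/// Hypothetical stand-in for the single underlying JVM.
pub struct ToyVm {
    pub id: u32,
}

/// Process-wide instance: only the first caller initializes it.
static GLOBAL_VM: OnceLock<ToyVm> = OnceLock::new();

thread_local! {
    /// True while the current thread is inside `with_vm`.
    static ATTACHED: Cell<bool> = Cell::new(false);
}

/// Clears the per-thread flag when dropped, even if the callback panics.
struct Detach;

impl Drop for Detach {
    fn drop(&mut self) {
        ATTACHED.with(|flag| flag.set(false));
    }
}

/// Runs `f` with the shared VM, starting it on first use.
/// A nested call on the same thread is rejected with an error.
pub fn with_vm<R>(f: impl FnOnce(&ToyVm) -> R) -> Result<R, &'static str> {
    if ATTACHED.with(|flag| flag.replace(true)) {
        return Err("this thread already has access to the VM");
    }
    let _detach = Detach;
    let vm = GLOBAL_VM.get_or_init(|| ToyVm { id: 1 });
    Ok(f(vm))
}
```

Sequential calls and calls from other threads all observe the same instance, while `with_vm(|_| with_vm(|vm| vm.id))` yields an inner error.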
Starting the JVM: setting options
When you start the JVM from your Rust code, you can set various options by using the jvm builder:
Jvm::builder()
.add_classpath("foo")
.add_classpath("bar")
.memory()
.custom("-X foobar")
.launch_or_use_existing()
Local vs global object references
Internals
How the generated code works and why.
Tracking the JNI environment
Representing Java objects
Java objects are represented by a dummy struct:
pub struct MyObject {
_dummy: ()
}
which implements the JavaObject
trait:
unsafe impl JavaObject for MyObject { }
References to java objects
This unsafe impl asserts that every reference &MyObject
is actually a sys::jobject
. This allows us to create a sys::jobject
simply by casting the &MyObject
. We maintain that invariant by never allowing users to own a MyObject
directly; they can only get various kinds of pointers to MyObject
types (covered below).
Given a reference &'l MyObject
, the lifetime 'l
is tied to the JVM's "local frame" length. If this Rust code is being invoked via the JNI, then 'l
is the duration of the outermost JNI call.
Important: Our design does not support nested local frames and thus we don't expose those in our API. This simplifying assumption means that we can connect the lifetimes of local variables to one another, rather than having to tie them back to some jni
context.
Local Java objects
Whenever we invoke a JNI method, or execute a constructor, it creates a new local handle. These are returned to the user as a Local<'jni, MyObject> struct, where the 'jni is (again) the lifetime of the local frame. Internally, the Local struct is actually just a jobject pointer, though we cast it to *mut MyObject; it supports deref to &'jni MyObject in the natural way. Note that this maintains the representation invariant for &MyObject (i.e., it is still a jobject pointer).
Local has a Drop impl that deletes the local handle. This is important because there is a limit to the number of references you can have in the JNI, so you may have to ensure that you drop locals in a timely fashion. Also note that all JNI function calls that return Java objects implicitly create a local ref!
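The ownership pattern behind Local can be sketched without any JNI at all. In the toy model below, `ToyFrame` and `ToyLocal` are hypothetical names; the frame merely counts live handles, where a real implementation would delete the JNI local reference in `drop`.

```rust
use std::cell::Cell;
use std::marker::PhantomData;

/// Hypothetical local frame that tracks how many handles are live.
pub struct ToyFrame {
    live: Cell<usize>,
}

impl ToyFrame {
    pub fn new() -> Self {
        ToyFrame { live: Cell::new(0) }
    }

    pub fn live_handles(&self) -> usize {
        self.live.get()
    }

    /// Creates a handle whose lifetime is tied to this frame.
    pub fn new_local<T>(&self) -> ToyLocal<'_, T> {
        self.live.set(self.live.get() + 1);
        ToyLocal { frame: self, marker: PhantomData }
    }
}

/// A handle that releases its slot in the frame when dropped.
pub struct ToyLocal<'frame, T> {
    frame: &'frame ToyFrame,
    marker: PhantomData<T>,
}

impl<T> Drop for ToyLocal<'_, T> {
    fn drop(&mut self) {
        self.frame.live.set(self.frame.live.get() - 1);
    }
}
```

Because the handle borrows the frame, the compiler rejects any attempt to keep a handle alive past the frame, and letting a handle go out of scope early frees its slot immediately.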
Global Java objects
The jdk object offers a method to create a Global reference to a Java object. Global references can outlive the current frame. They are represented by a Java<MyObject> type, which is also a newtype'd sys::jobject, in this case representing a global handle. This type has a Drop impl which deletes the global reference and supports Deref in the same way as Local.
null
The underlying sys::jobject can be null, but we maintain the invariant that this is never the case, instead using Option<&R> etc.
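As a minimal sketch of that invariant, a possibly-null raw pointer can be turned into an Option of a reference at the boundary, so that safe code never sees a null. `ToyObject` and `from_raw` are hypothetical names used only for illustration.

```rust
/// Hypothetical opaque object type; only ever used behind a pointer.
#[repr(C)]
pub struct ToyObject {
    _private: [u8; 0],
}

/// Converts a possibly-null raw pointer into `Option<&ToyObject>`.
///
/// # Safety
/// A non-null `ptr` must point to a valid `ToyObject` that lives for `'a`.
pub unsafe fn from_raw<'a>(ptr: *const ToyObject) -> Option<&'a ToyObject> {
    // `as_ref` returns `None` for null and `Some(&*ptr)` otherwise.
    unsafe { ptr.as_ref() }
}
```

Rust guarantees that Option<&T> has the same size as a raw pointer, with None represented as null, so this encoding adds no runtime overhead.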
Exceptions
The JNI exposes Java exception state via
- ExceptionCheck() returning true if an unhandled exception has been thrown
- ExceptionOccurred() returning a local reference to the thrown object
- ExceptionClear() clearing the exception (if any)
If an exception has occurred and isn't cleared before the next JNI call, the invoked Java code will immediately "see" the exception. Since this can cause an exception to propagate outside of the normal stack bubble-up, we must always call duchess::EnvPtr::check_exception()? after any JNI call that could throw. It will return Err(duchess::Error::Thrown) if one has occurred. The duchess::EnvPtr::invoke() will both ensure the exception check occurred and that it was done in a way that any created local ref will be dropped correctly.
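The "check after every call" discipline can be illustrated with a toy environment. Everything here is hypothetical (`ToyEnv`, `ToyError`, `raw_call`); it mirrors only the shape of the pattern, with a wrapper that makes the check impossible to forget, and it is not the Duchess implementation.

```rust
/// Hypothetical error type mirroring "an exception was thrown".
#[derive(Debug, PartialEq)]
pub enum ToyError {
    Thrown(String),
}

/// Hypothetical environment holding at most one pending exception.
#[derive(Default)]
pub struct ToyEnv {
    pending: Option<String>,
}

impl ToyEnv {
    /// Simulates a raw call that may leave an exception pending.
    pub fn raw_call(&mut self, input: i32) -> i32 {
        if input < 0 {
            self.pending = Some(format!("negative input: {input}"));
        }
        input * 2
    }

    /// Clears and reports the pending exception, if any.
    pub fn check_exception(&mut self) -> Result<(), ToyError> {
        match self.pending.take() {
            Some(message) => Err(ToyError::Thrown(message)),
            None => Ok(()),
        }
    }

    /// Wraps `raw_call` so that the check always runs before the value escapes.
    pub fn invoke(&mut self, input: i32) -> Result<i32, ToyError> {
        let value = self.raw_call(input);
        self.check_exception()?;
        Ok(value)
    }
}
```

Since `check_exception` takes the pending exception, a failed `invoke` does not poison later calls on the same environment.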
Frequently asked questions
Covers various bits of rationale.
Why do you not support nested frames in the JNI?
We do not want users to have to supply a context object on every method call, so instead we take the lifetime of the returned java reference and tie it to the inputs:
// from Java, and ignoring exceptions / null for clarity:
//
// class MyObject { ReturnType some_method(); }
impl MyObject {
pub fn some_method<'jvm>(&'jvm self) -> Local<'jvm, ReturnType> {
// ---- ----
// Lifetime in the return is derived from `self`.
...
}
}
This implies though that every
We have a conflict:
- Either we make every method take a jdk pointer context.
- Or... we go into a suspended mode...
MyObject::new(x, y, z)
.execute(jdk);
MyObject::new(x, y, z)
.blah(something)
.blah(somethingElse)
.execute(jdk);
MyObject::new(x, y, z)
.blah(something)
.blah(somethingElse)
.map(|x| {
x.someMethod()
})
.execute(jdk);
...this can start by compiling to jdk calls... and then later we can generate byte code and a custom class, no?
If we supported nested frames, we would have to always take a "context" object and use that to derive the lifetime of each Local<'l, MyObject> reference. But that is annoying for users, who then have to add an artificial-seeming environment as a parameter to various operations. (As it is, we still need it for static methods and constructors, which is unfortunate.)
The JavaObject trait
Upcasts
The Upcast trait encodes extends / implements relations between classes and interfaces.
It is implemented both for direct supertypes as well as indirect (transitive) ones.
For example, if you have this Java class:
class Foo extends Bar { }
class Bar implements Serializable { }
then the Foo type in Rust would have several Upcast impls:
- Foo: Upcast<Bar> -- because Foo extends Bar
- Foo: Upcast<java::io::Serializable> -- because Bar implements Serializable
- Foo: Upcast<java::lang::Object> -- because Bar extends Object
There is however one caveat. We can only inspect the tokens presented to us. And, while we could reflect on the Java classes directly, we don't know what subset of the supertypes the user has chosen to reflect into Rust. Therefore, we stop our transitive upcasts at the "water's edge" -- i.e., at the point where we encounter classes that are outside our package.
Computing transitive upcasts
Transitive upcasts are computed in upcasts.rs.
The process is very simple.
A map is seeded with each type C that we know about along with its direct upcasts.
So, for the example above, this map would initially contain:
Foo => {Bar, Object}
Bar => {Serializable, Object}
We then iterate over the map and grow the entry for each class C with the supertypes of each class D that is extended by C.
So, for the example above, we would iterate to Foo, fetch the superclasses of Bar, and then union them into the set for Foo.
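The iteration described above is a small fixed-point computation. The following is a sketch under stated assumptions, not the code in upcasts.rs: types are modeled as plain strings, and `Upcasts`, `upcasts_from_pairs`, and `transitive_upcasts` are hypothetical names.

```rust
use std::collections::{BTreeMap, BTreeSet};

/// Maps each known type to the set of types it can be upcast to.
pub type Upcasts = BTreeMap<String, BTreeSet<String>>;

/// Seeds the map from (class, direct supertype) pairs.
pub fn upcasts_from_pairs(pairs: &[(&str, &str)]) -> Upcasts {
    let mut map = Upcasts::new();
    for (class, supertype) in pairs {
        map.entry(class.to_string())
            .or_default()
            .insert(supertype.to_string());
    }
    map
}

/// Grows each entry with the supertypes of its supertypes until nothing changes.
/// Types without an entry (those outside our packages) contribute nothing,
/// which is where the transitive walk stops.
pub fn transitive_upcasts(direct: &Upcasts) -> Upcasts {
    let mut result = direct.clone();
    loop {
        let mut changed = false;
        let snapshot = result.clone();
        for (class, supertypes) in &snapshot {
            for supertype in supertypes {
                if let Some(inherited) = snapshot.get(supertype) {
                    let entry = result.get_mut(class).expect("class is a key");
                    for ty in inherited {
                        changed |= entry.insert(ty.clone());
                    }
                }
            }
        }
        if !changed {
            return result;
        }
    }
}
```

Seeded with the map from the example, the entry for Foo grows to contain Bar, Object, and Serializable, while the entry for Bar is unchanged.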
Substitution
One caveat on the above is that we have to account for substitution.
If Foo extends Baz<X>, then we substitute X for the generic parameter of Baz.
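A sketch of what such a substitution might look like follows. The `Ty` enum is a hypothetical, simplified model of a Java type reference, and the class names `Qux` and `List` used in the note after the code are invented for the example.

```rust
use std::collections::HashMap;

/// Hypothetical, simplified model of a Java type reference.
#[derive(Clone, Debug, PartialEq)]
pub enum Ty {
    /// A generic parameter such as `T`.
    Param(String),
    /// A class with type arguments, such as `Baz<X>`.
    Class(String, Vec<Ty>),
}

/// Replaces every generic parameter that has an entry in `subst`;
/// parameters without an entry are left as they are.
pub fn substitute(ty: &Ty, subst: &HashMap<String, Ty>) -> Ty {
    match ty {
        Ty::Param(name) => subst.get(name).cloned().unwrap_or_else(|| ty.clone()),
        Ty::Class(name, args) => Ty::Class(
            name.clone(),
            args.iter().map(|arg| substitute(arg, subst)).collect(),
        ),
    }
}
```

For instance, if Baz<T> were declared to extend Qux<List<T>> and Foo extends Baz<X>, applying the substitution T := X to Qux<List<T>> gives the upcast target Qux<List<X>> for Foo.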
Methods
When you use duchess, you invoke methods via nice syntax like
java_list.get(0).to_string().execute()
// ^^^^^^^^^
// This is the method we are discussing here
How does this actually work (and why)?
Complication #1: Methods are invokable on more than just the object
Part of our setup is that we define relatively ordinary looking inherent methods on the type that defines the method, e.g.:
impl Object {
fn to_string(&self) -> impl JavaMethod<String> { /* tbd */ }
}
This method will be invoked when people have a variable o: &Object or o: Java<Object> and they write o.to_string().
But it won't support our example of java_list.get(0).to_string(),
because java_list.get(0) returns a JvmOp, not an Object.
So, to define a method on Object,
we need a way to put methods onto any JvmOp that outputs an Object.
Complication #2: Overridden or implemented methods create ambiguity
There are some complications in getting . syntax to work.
We want users to be able to write m.foo() but, in Java,
the same method foo is often defined in multiple places,
particularly when it is overridden:
- on the class type itself
- maybe on supertypes, if it is overridden
- maybe on interfaces, if it is an interface method
We don't want users to get ambiguity errors when calling foo.
We want them to get the most specific version of the method.
This is important not because we'll call the wrong thing -- the JVM handles the virtual dispatch.
But it can impact the return type.
Complication #3: We don't know the reflected signatures of all methods on every type
When generating code for one class X, it may have a supertype Y that is outside our java_package macro invocation.
Or, it may have methods that return a value of type Z that is outside our java_package macro invocation.
While we can leverage Java reflection to know the Java methods of Y and Z, that doesn't tell us what the Rust methods are.
This is because users can subset the methods of Y and Z as well as making other changes,
such as renaming them to avoid overload conflicts.
So we have to support method dispatch with an incomplete view of the methods of Y and Z.
As one example of how this can be tricky, suppose that we attempted to resolve complication #2 by only generating methods on the "root location" that defined them.
Complication #4: Extension traits are not ergonomic
Ideally, we would define all methods as some kind of inherent method so that users do not need to import extension traits or deal with special-case preludes.
Outline and TL;DR
This section gives a brief overview of all the pieces of our solution and how they fit together. In the following sections, we are going to walk through each part of the solution step by step.
- Inherent associated functions on the object types
  - To support fully qualified dispatch, we add inherent associated functions (not methods, so no self parameter) to each Java class/interface. These are used if you write something like Object::to_string(o). The parameter o must be an impl IntoJava<Object>.
- Concept: newtyped references and FromRef
  - The design below leans heavily on a pattern of newtyped references.
  - The idea is that given some reference &X, we define types like struct Wrapper<X> { x: X } with #[repr(transparent)]. The transparent representation ensures that X and Wrapper<X> have the same layout in memory and are treated equivalently in ABIs and the like.
  - Now we can safely transmute from &X to &Wrapper<X>.
- Concept: method resolution order
  - Method resolution order is defined using Python's C3 algorithm. It is an ordering of the transitive supertypes (classes, interfaces) of C such that, if X extends Y, then X appears before Y in the MRO.
- Modeling method resolution order (MRO) for a class/interface C with ViewAs structs
  - For each class/interface C, define a "view-as struct" that looks like ViewAsC<J, N>
    - A reference of type &ViewAsC<J, N> indicates a reference of type &J that is being "viewed as" a reference of type &C
    - The ViewAs struct is a "newtyped reference" from J, and so ViewAsC<J, N>: FromRef<J>.
    - The N parameter indicates the "view" struct for the next type in the method resolution order when upcasting from J
      - FIXME: We could probably refactor N away so that we just have AsC<J> and we use an auxiliary trait like J: MRO<C, Next = N>.
    - ViewAsC<J, N> derefs to N.
  - The ViewAsC structs are not nameable directly; instead the JavaObject trait includes an associated type <C as JavaObject>::ViewOn<J> that maps to ViewAsC<J, M> where M is the default MRO.
  - Define deref from C to C::ViewOn<C> (i.e., ViewAsC<C, M>).
  - Example:
    - Given Foo extends Bar, Baz, the type Foo would deref to ViewAsFoo<Foo, ViewAsBar<Foo, ViewAsBaz<Foo, ()>>>, which in turn derefs to ViewAsBar<Foo, ViewAsBaz<Foo, ()>>, which in turn derefs to ViewAsBaz<Foo, ()>, which in turn derefs to ()
- Inherent methods on ViewAs structs
  - Next we add inherent methods like fn to_string(&self) -> impl JavaMethod<java::lang::String> + '_ to the view-as structs in which those methods are defined.
    - In the case of to_string, this would appear on ViewAsObject<J, N>, but also other classes that override toString
    - The definition of this function just calls the inherent associated function Object::to_string
  - Rust's method dispatch will walk through the MRO, selecting the best method to use and invoking it
- Invocations on other JvmOp values with OfOpAs structs
  - To support chained dispatch, we also need to support invocations on other JvmOp values.
  - We create a "view op as" struct that works exactly like ViewAs, e.g., OfOpAsC<O, N>
    - The difference is that O here is not a JavaObject type but rather an impl IntoJava<J> for some J
  - The N parameter models the MRO in an analogous way to ViewAs structs
  - Inherent methods are defined on the OfOpAs structs
  - Example:
    - Given Foo extends Bar, Baz, and some op O that produces a Foo, O would deref to
      - OfOpAsFoo<O, OfOpAsBar<O, OfOpAsBaz<O, ()>>>
      - OfOpAsBar<O, OfOpAsBaz<O, ()>>
      - OfOpAsBaz<O, ()>
      - ()
    - ...and thus users can invoke produce_foo().some_foo_method()
Inherent associated functions on the object types
The first step is to create a "fully qualified" notation for each Java method:
impl Object {
fn to_string(
this: impl IntoJava<Object>
) -> impl JavaMethod<java::lang::String> {
...
}
}
This function does not take a self parameter and so it can only be invoked using fully qualified form, e.g., Object::to_string(foo).
Concept: newtyped references and FromRef
The next step is that we define a trait FromRef that we will use to define a pattern called 'newtyped references'.
The idea is that we want to be able to take a reference &J and convert it into a view on that reference &View<J>,
where View<J> has the same data as J but defines inherent methods.
We'll create a trait FromRef to use for this pattern,
where View<J>: FromRef<J> indicates that a view &View<J> can be constructed from a &J reference:
pub trait FromRef<J> {
fn from_ref(t: &J) -> &Self;
}
A view struct is just a newtype on the underlying J type but with #[repr(transparent)]:
#[repr(transparent)]
pub struct View<J> {
this: J,
}
The #[repr(transparent)] attribute ensures that J and View<J> have the same layout in memory
and are treated equivalently in ABIs and the like.
Thanks to this, we can implement FromRef like so:
impl<J> FromRef<J> for View<J> {
fn from_ref(t: &J) -> &Self {
// Safe because of the `#[repr(transparent)]` attribute
unsafe { std::mem::transmute(t) }
}
}
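Putting the pieces of this section together gives a small program that compiles on its own. It is a sketch: the `shout` method is a hypothetical stand-in for a reflected Java method, and the conversion uses a pointer cast instead of transmute, which relies on the same #[repr(transparent)] layout guarantee.

```rust
/// Converts `&J` into a reference to a view type with the same layout.
pub trait FromRef<J> {
    fn from_ref(t: &J) -> &Self;
}

/// A newtyped reference target: same layout as `J`, different methods.
#[repr(transparent)]
pub struct View<J> {
    this: J,
}

impl<J> FromRef<J> for View<J> {
    fn from_ref(t: &J) -> &Self {
        // SAFETY: `View<J>` is `#[repr(transparent)]` over `J`, so a pointer
        // to a valid `J` is also a pointer to a valid `View<J>`.
        unsafe { &*(t as *const J as *const View<J>) }
    }
}

/// Inherent methods live on the view, not on `J` itself.
impl View<String> {
    pub fn shout(&self) -> String {
        self.this.to_uppercase()
    }
}
```

The view is never constructed by value; it only ever exists as a reference that points at the original `J`, which is why no data is copied.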
Concept: Method resolution order (MRO)
The method resolution order for a type T is an ordered list of its transitive supertypes such that, given two types X and Y in the list, if X extends Y then X appears before Y. This ensures that if we search linearly down the list, we will find the "most refined" version of a method first. We define the MRO for a type T using Python's C3 algorithm.
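For readers unfamiliar with C3, here is a compact sketch of the linearization over string type names. It is an illustration, not the Duchess implementation: `c3_mro` is a hypothetical function, the hierarchy is assumed to be acyclic, and the implicit Object supertype is left out.

```rust
use std::collections::HashMap;

/// Computes the C3 linearization of `class`. `parents` maps each type to its
/// direct supertypes in declaration order. Returns `None` if no consistent
/// order exists. Assumes the hierarchy has no cycles.
pub fn c3_mro(class: &str, parents: &HashMap<&str, Vec<&str>>) -> Option<Vec<String>> {
    let direct: Vec<&str> = parents.get(class).cloned().unwrap_or_default();

    // Sequences to merge: the MRO of each parent, then the parent list itself.
    let mut sequences: Vec<Vec<String>> = Vec::new();
    for parent in &direct {
        sequences.push(c3_mro(parent, parents)?);
    }
    sequences.push(direct.iter().map(|p| p.to_string()).collect());

    let mut result = vec![class.to_string()];
    loop {
        sequences.retain(|seq| !seq.is_empty());
        if sequences.is_empty() {
            return Some(result);
        }
        // Pick the first head that does not appear in the tail of any sequence.
        let head = sequences
            .iter()
            .map(|seq| &seq[0])
            .find(|candidate| sequences.iter().all(|seq| !seq[1..].contains(candidate)))?
            .clone();
        // Remove the chosen head wherever it leads a sequence.
        for seq in &mut sequences {
            if seq[0] == head {
                seq.remove(0);
            }
        }
        result.push(head);
    }
}
```

For a diamond where D extends B and C, and both B and C extend A, the result is D, B, C, A: every type precedes its supertypes, and the declaration order of B and C is preserved.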
Modeling method resolution order (MRO) for a class/interface C with ViewAs structs
For each class X, we define a ViewAsObj struct ViewAsXObj<J, N>:
#[repr(transparent)]
struct ViewAsXObj<J, N> {
this: J,
phantom: PhantomData<N>,
}
The struct has two type parameters:
- The parameter J identifies the original type from which we created the view; this will always be some subtype of X.
- The N parameter represents the remainder of J's method resolution order.
Deref chain
Each ViewAsObj struct includes a Deref that derefs to N:
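impl<J, N: FromRef<J>> Deref for ViewAsXObj<J, N> {
    type Target = N;
    fn deref(&self) -> &N {
        // Presumably: re-view the underlying `J` as the next entry in the MRO.
        N::from_ref(&self.this)
    }
}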
Chaining ViewAsObj structs
So given interface Foo extends Bar, Baz, the type Foo would deref to
ViewAsFooObj<Foo, ViewAsBarObj<Foo, ViewAsBazObj<Foo, ()>>>
// --- ----------- ----------------------------------
// X J N
Each ViewAs struct derefs to its N parameter,
so ViewAsFooObj<Foo, ViewAsBarObj<Foo, ...>> would deref to ViewAsBarObj<Foo, ...> and so forth.
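The deref chain can be tried out in a self-contained program. The sketch below makes several simplifying assumptions: it uses a two-entry MRO (Foo, then Bar), the `view_struct!` macro and the `name` and `only_on_bar` methods are hypothetical, and the unit type marks the end of the chain.

```rust
use std::marker::PhantomData;
use std::ops::Deref;

pub trait FromRef<J> {
    fn from_ref(t: &J) -> &Self;
}

/// The unit type ends every chain.
impl<J> FromRef<J> for () {
    fn from_ref(_: &J) -> &() {
        &()
    }
}

/// Defines a view struct with its `FromRef` and `Deref` impls.
macro_rules! view_struct {
    ($name:ident) => {
        #[repr(transparent)]
        pub struct $name<J, N> {
            this: J,
            phantom: PhantomData<N>,
        }

        impl<J, N> FromRef<J> for $name<J, N> {
            fn from_ref(t: &J) -> &Self {
                // SAFETY: transparent over `J`; `PhantomData` is zero-sized.
                unsafe { &*(t as *const J as *const Self) }
            }
        }

        impl<J, N: FromRef<J>> Deref for $name<J, N> {
            type Target = N;
            fn deref(&self) -> &N {
                // Step to the next view in the MRO over the same `J`.
                N::from_ref(&self.this)
            }
        }
    };
}

view_struct!(ViewAsFoo);
view_struct!(ViewAsBar);

pub struct Foo;

/// Foo's MRO: Foo, then Bar, then the end of the chain.
impl Deref for Foo {
    type Target = ViewAsFoo<Foo, ViewAsBar<Foo, ()>>;
    fn deref(&self) -> &Self::Target {
        FromRef::from_ref(self)
    }
}

impl<J, N> ViewAsFoo<J, N> {
    pub fn name(&self) -> &'static str {
        "Foo::name"
    }
}

impl<J, N> ViewAsBar<J, N> {
    pub fn name(&self) -> &'static str {
        "Bar::name"
    }
    pub fn only_on_bar(&self) -> &'static str {
        "Bar::only_on_bar"
    }
}
```

Calling `Foo.name()` resolves to the method on ViewAsFoo because autoderef reaches it first, while `Foo.only_on_bar()` continues one more step down the chain to ViewAsBar. No ambiguity error arises even though `name` is defined in both places.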
The FromRef trait
Each op struct implements a trait FromRef<J>:
trait FromRef<J> {
fn from_ref(r: &J) -> &Self;
}
The from_ref method allows constructing an op struct from an &J reference.
Implementing this method requires a small bit of unsafe code,
leveraging the repr(transparent) attribute on each op struct:
impl<J> FromRef<J> for ObjectOp<J>
where
J: IntoJava<Foo>,
{
fn from_ref(r: &J) -> &Self {
// Safe because ObjectOp<J> shares representation with J:
unsafe { std::mem::transmute(r) }
}
}
Methods on ViewAsObj structs
The ViewAsObj struct for a given Java type
also has inherent methods for each Java method.
These are implemented by invoking the fully qualified inherent functions.
For example, the ViewAsObj struct for Object includes a to_string method like so:
impl<J, N> ViewAsObjectObj<J, N>
where
J: Upcast<Object>,
{
pub fn to_string(&self) -> impl JavaMethod<java::lang::String> + '_ {
java::lang::Object::to_string(&self.this)
}
}
Naming op structs: the JavaObject::OfOp associated type
We don't want ViewAsObj structs to be publicly visible.
So we create them inside of a const _: () = { .. } block.
But we do need some way to name them.
We expose them via associated types of the JavaView trait:
trait JavaView {
type OfObj<J>: FromRef<J>;
type OfObjWith<J, N>: FromRef<J>
where
N: FromRef<J>;
}
The OfObj associated type in particular provides the "default value" for N that defines the MRO.
The OfObjWith is used to supply an explicit N value. For example:
const _: () = {
struct ViewAsFooObj<J, C> { ... }
impl JavaView for Foo {
type OfObj<J> = ViewAsFooObj<Foo, Bar::OfObjWith<Foo, Baz::OfObjWith<Foo, ()>>>;
// ------------ --- --------------------------------------------
// | | Method resolution order
// | Original type we are viewing onto (i.e., Self)
// The ViewAsFoo object
}
};
ViewAsOp structs
The ViewAsObj structs allow you to invoke methods on a java object reference like s: &java::lang::String.
But they do not allow you to invoke methods on some random JvmOp that happens to return a string.
For that, we create a very similar set of ViewAsOp structs:
#[repr(transparent)]
struct ViewAsXOp<J, N> {
this: J,
phantom: PhantomData<N>,
}
These ViewAsOp structs look exactly the same, but the J here is not a java object but rather a JvmOp.
Like the ViewAsObj structs, they have inherent methods that call to the fully qualified inherent methods.
But the signature is slightly different; it is a &self method, but the impl JavaMethod that is returned does not capture the self reference.
Instead, it copies the self.this out.
This relies on the fact that all JvmOp values are Copy.
impl<J, N> ViewAsObjectOp<J, N>
where
J: IntoJava<Object>,
{
pub fn to_string(&self) -> impl JavaMethod<java::lang::String> {
java::lang::Object::to_string(self.this)
}
}
ViewAsOp structs are exposed through associated types on JavaView just like ViewAsObj structs.
Q: Why not a self method?
You might wonder why we take a &self method and then copy out rather than just taking self.
The reason is that the ViewAsOp structs are the output from Deref impls, and a Deref impl only hands out a reference (&Self::Target), so there is no owned value for a by-value self method to consume.
Deref impls on ops
We also have to add a Deref impl to each of the op structs.
struct SomeOp { }
impl JvmOp for SomeOp {
type Output<'jvm> = Local<'jvm, Foo>;
}
impl Deref for SomeOp {
type Target = <Foo as JavaView>::OfOp<SomeOp>;
fn deref(&self) -> &Self::Target {
FromRef::from_ref(self)
}
}
Logo
The duchess logo combines Java's Duke (specifically the Surfing version) and Ferris, both of which are released under open source licenses that permit (and encourage) duplication and modification.