TLDR: This article explores integrating three different languages to perform coverage-guided, in-process differential fuzzing using LibAFL. Three approaches are attempted: invoking as a command, embedding the interpreter, and shared memory.

Differential fuzzing is one of the most exciting forms of fuzzing. The essence is to test competing implementations of a library or an application with the same test input, with the hope of finding a difference in the execution outcome.

Suppose you have two JSON parsing libraries: both should behave the same given the same JSON string to parse. Often, however, they do not. Some may have heard of the infamous differential assessments of JSON parsers or crypto libraries, with the latter conducted through differential fuzzing. Differences will, unfortunately, always be present in competing implementations, and differential fuzzing can help identify these cases.

Recently, having attended Devconnect, I was inspired to further explore differential fuzzing. I picked the Ethereum Virtual Machine (EVM) as my target because it scratched a research itch - the three most popular clients, go-ethereum, Nethermind and Besu, are written in three different languages: Go, C# and Java. Two of them are interpreted! How would one approach such a fuzzing campaign? Compiled languages interoperate with C/C++ and are usually trivial to integrate into a campaign. Interpreted languages, on the other hand? Could I embed them? Maybe they are fast enough to just call as a process?

I decided to find out, and write about it along the way.

The ingredients required for a differential fuzzing campaign.

A source of code coverage

Coverage-guided fuzzing remains one of the most effective ways to test applications. Coverage allows a fuzzer to infer application state and control flow to better guide itself and its mutation strategies. In my opinion, you only need one implementation to provide coverage. I chose go-ethereum as my coverage source, not just because it’s the canonical implementation of the EVM, but also because my colleagues recently worked on a LibAFL-based shim for Go fuzzing, allowing me to plug in any LibAFL-based fuzzer. Besides that, Java and C#, in my opinion, do not have a fuzzing ecosystem whose maturity is on par with that of Go.

A unified input format.

If we were fuzzing JSON, the input would simply be an array of bytes representing JSON. However, we’re fuzzing a rather complex application with complex state. Luckily, most Ethereum implementations support two standard test formats: State Tests and Blockchain Tests.

A State Test defines a test structure which enables:

  1. Creating an initial state.
  2. Executing a series of transactions.
  3. Making assertions on expected execution outcomes.

As each client implements this functionality, we do not have to create complicated fuzzing harnesses for each implementation.

For this campaign, I decided to go with State Tests, as I was more interested in the EVM than the blockchain implementation.
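
Roughly, a single state test has the shape sketched below. The values here are illustrative and the real format has more fields (fork configuration, multiple transaction variants, and so on); I’m building it with serde_json purely to show the structure we feed to every client.

// A rough sketch of the state test structure (illustrative values only).
// Assumes the serde_json crate; the real format is richer than this.
use serde_json::json;

fn main() {
    let state_test = json!({
        "my-fuzz-test": {
            // Block environment the transaction executes in.
            "env": {
                "currentCoinbase":  "0x2adc25665018aa1fe0e6bc666dac8fc2697ff9ba",
                "currentGasLimit":  "0x05f5e100",
                "currentNumber":    "0x01",
                "currentTimestamp": "0x03e8"
            },
            // Initial world state: accounts with balance, nonce, code and storage.
            "pre": {
                "0x0000000000000000000000000000000000001000": {
                    "balance": "0x0",
                    "nonce": "0x0",
                    "code": "0x600160015500",
                    "storage": {}
                }
            },
            // The transaction(s) to execute: sender, to, gas, calldata, value, ...
            "transaction": {},
            // Expected post-state (e.g. the stateRoot) per fork, such as "Osaka".
            "post": {}
        }
    });
    println!("{}", serde_json::to_string_pretty(&state_test).unwrap());
}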

A unified output format.

This is where it can get complicated, and we must remember to keep it simple. It could be tempting to compare the precise execution result, but during fuzzing, we should NOT care about why the implementations differ, only that they do differ. Triage, which comes after, would be the place to investigate differences.

Thus, we need to find the simplest way to see if the execution outcome was the same across all clients. Successful execution of the contract is insufficient: a bug where the same contract ends up writing to two different storage slots in two implementations would be missed. So, we use the stateRoot. Think of it as a representation of the current state of the entire chain. This must be the same across all implementations after executing the same transaction on the same initial state.
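
In other words, the oracle for the whole campaign reduces to a single comparison. Here’s a minimal sketch (the stateRoot values are illustrative; in the real harness each one would come from a different client executing the same test):

// Differential oracle sketch: an input is interesting if any two clients
// disagree on the stateRoot after executing the same state test.
fn roots_diverge(state_roots: &[String]) -> bool {
    state_roots.windows(2).any(|pair| pair[0] != pair[1])
}

fn main() {
    // Illustrative roots from go-ethereum, Besu and Nethermind respectively.
    let roots = [
        "0xaaaa".to_string(),
        "0xaaaa".to_string(),
        "0xbbbb".to_string(),
    ];
    assert!(roots_diverge(&roots)); // this test case would be kept for triage
}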

Speed. A lot of it.

Fuzzing must be fast to be effective. Depending on the application, you should aim for hundreds, or even thousands, of executions per second, per core. For us, this will be difficult and will come to be a pain point. Nevertheless, investing in speeding up your target will reap large rewards. For example, for any given test case, you may want to compare only two implementations instead of three at a time, and allow the fuzzer to mutate which two implementations to test (see the sketch below). Maybe you can mock certain data? Maybe you can remove unnecessary hashing operations?
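
For instance, letting one byte of the test case select the pair of clients to compare could look like this (a sketch; the client names and selection scheme are just illustrative):

// Sketch: derive the pair of clients to compare from a single input byte,
// so each execution pays for two EVMs instead of three.
#[derive(Clone, Copy, Debug)]
enum Client {
    Geth,
    Besu,
    Nethermind,
}

fn pick_pair(selector: u8) -> (Client, Client) {
    match selector % 3 {
        0 => (Client::Geth, Client::Besu),
        1 => (Client::Geth, Client::Nethermind),
        _ => (Client::Besu, Client::Nethermind),
    }
}

fn main() {
    println!("{:?}", pick_pair(7)); // (Geth, Nethermind)
}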

A fuzzer

I used a grammar fuzzer I built, called Autarkie. Autarkie was born out of the need for a modern grammar fuzzing solution whose grammars are easy to define and to debug. Here’s my presentation on Autarkie at WHY2025.

Autarkie supports AFL++’s forkserver, libfuzzer’s in-process execution and now, partly thanks to my colleagues’ efforts, Go!

Let’s get started.

Setting up differential fuzzing

I like to get started fuzzing quickly, and then work on speeding up the target while the fuzzer does its job. Since we can stop and resume the campaign at any point, there is no reason to delay starting.

Since go-ethereum will be our coverage source, and thus must be part of a typical harness, there are a few approaches we can try for the other two:

  1. Invoking as a command
  2. Embedding
  3. Shared Memory

Let’s now go through these approaches.

Invoking as a command

Maybe the implementations are fast enough so that we can just invoke them as a process. Let’s find out.

# install dotnet 10
git clone https://github.com/NethermindEth/nethermind --depth 1
cd nethermind/src/Nethermind/Nethermind.Test.Runner/
dotnet build Nethermind.Test.Runner.csproj -c Release
cd ../../../..

time ./nethermind/src/Nethermind/artifacts/bin/Nethermind.Test.Runner/release/nethtest \
    --input test.json
________________________________________________________
Executed in    2.18 secs    fish           external
   usr time    2.11 secs  779.00 micros    2.11 secs
   sys time    0.07 secs  122.00 micros    0.07 secs

We were talking about thousands of executions per second! We can barely get one!

Let’s try Besu.

# install openjdk-21
git clone https://github.com/hyperledger/besu --depth 1
cd besu
./gradlew :ethereum:evmtool:installDist
cd ..

time ./besu/ethereum/evmtool/build/install/evmtool/bin/evmtool state-test ./test.json
________________________________________________________
Executed in    3.03 secs    fish           external
   usr time    5.57 secs    0.00 millis    5.57 secs
   sys time    0.27 secs    1.11 millis    0.26 secs

Both appear to be no-go for command invocation. For the sake of it, let’s try the Go one too, even though we will be embedding the EVM since it will be our coverage source.

# install go 1.25
git clone https://github.com/ethereum/go-ethereum
cd go-ethereum/cmd/evm
go build
cd ../../..
time ./go-ethereum/cmd/evm/evm statetest ./test.json
________________________________________________________
Executed in   21.53 millis    fish           external
   usr time    6.83 millis  716.00 micros    6.11 millis
   sys time   20.46 millis   96.00 micros   20.36 millis

That’s more like it.

We now know that simply calling the commands will not work. My suspicion is that it takes too long for the interpreters to start.

What about embedding?

Embedding the interpreters

Interpreted languages typically support embedding their interpreters. Lua, for example, is famous for that; Python, to some extent, too. Perhaps we could gain significant speed by embedding? One of the main advantages is that we would only need to start the interpreter once, and we would have no process invocation or file system overhead.

Embedding Besu (Java)

Let’s start with Besu. Since Besu’s StateTestRunner is actually a SubCommand, we need to extract the core logic so it can be invoked as a “library function”.

We then need to expose this function through the JNI (Java Native Interface), which allows Java code to be invoked from other languages.

Claude helped a lot here as I was aiming for something that “just worked”. Remember, we do the performance improvements later.

Here’s some pseudocode that roughly represents the Java harness.

// Fuzz.java
public class Fuzz {  
  public static String fuzzStateTest(final String testJson) {
        final String stateRoot;
        stateRoot = Fuzz.executeSingleTest(testJson, "Osaka");
        return stateRoot;
  }

  public static String executeSingleTest(final String testJson, final String fork) {
    // copy the core logic to run the test
    // .......
  }
}

Let’s build, and then we’ll create a Rust integration using the jni crate. I chose Rust since I am not a particularly experienced C/C++ developer.

# openjdk-21
cd besu
./gradlew --parallel ethereum:evmtool:installDist
mkdir bindings
cd bindings
cargo init
cargo add jni --features invocation

The following code initializes a minimal JVM with the required classpath libraries. Then, it invokes our fuzzStateTest function with a simple state test. We do this in an infinite loop just to see the execution time.

In src/main.rs

use std::time::SystemTime;

use jni::objects::{JClass, JObject, JString, JValue};
use jni::JNIVersion;
use jni::JavaVM;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Find all the built .jar files
    let lib_dir = std::fs::read_dir("../ethereum/evmtool/build/install/evmtool/lib/");
    let mut jar_paths = vec![];
    if let Ok(entries) = lib_dir {
        for entry in entries.flatten() {
            let entry_path = entry.path();
            if entry_path.extension().and_then(|s| s.to_str()) == Some("jar") {
                jar_paths.push(entry_path.to_string_lossy().to_string());
            }
        }
    }
    // Add them as options
    let jvm_args = jni::InitArgsBuilder::new()
        .version(JNIVersion::V8) // Java 8 style; choose version matching your JDK
        .option(format!("-Djava.class.path={}", jar_paths.join(":")))
        .build()?;

    // Create the JVM
    let jvm = JavaVM::new(jvm_args)?;
    // Find our harness function
    let mut env = jvm.attach_current_thread()?;
    let class = env.find_class("org/hyperledger/besu/evmtool/Fuzz")?;
    // For benchmarking's sake, let's just run this in a loop
    loop {
        let json = env.new_string(std::fs::read_to_string("./test.json").unwrap())?;
        let now = SystemTime::now();
        let result = env.call_static_method(
            &class,
            "fuzzStateTest",
            "(Ljava/lang/String;)Ljava/lang/String;",
            &[JValue::Object(&JObject::from(json))],
        )?;
        println!("{:?}", now.elapsed());
    }
    Ok(())
}
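
For the benchmark I ignore the return value, but the real harness needs the stateRoot back as a Rust String. With the jni crate (assuming the same 0.21-style API used above), that conversion looks roughly like this; jvalue_to_string is a helper I’m introducing here, not part of the code above:

use jni::objects::{JString, JValueOwned};
use jni::JNIEnv;

// Sketch: unwrap the JValue returned by call_static_method into a Rust String,
// i.e. the stateRoot reported by the Java harness.
// Usage inside the loop above: let root = jvalue_to_string(&mut env, result)?;
fn jvalue_to_string<'local>(
    env: &mut JNIEnv<'local>,
    value: JValueOwned<'local>,
) -> jni::errors::Result<String> {
    let jstr = JString::from(value.l()?);
    Ok(env.get_string(&jstr)?.into())
}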

Now, let’s run our experiment to see if embedding led to a speed up.

cargo run
Ok(2.594644601s)
Ok(37.659974ms)
Ok(18.633532ms)                                
Ok(22.066574ms)                                
Ok(13.567659ms)                                
Ok(15.196381ms)                                
Ok(11.666704ms)                                
Ok(10.193306ms)                                
Ok(9.886665ms)                                 
Ok(8.426703ms)                                 
Ok(6.817455ms)                                 
Ok(8.034997ms)             

That’s a drastic speed up! Initially, it’s a bit slow, but it’s miles ahead of 2 seconds! I’m certain that further optimizations could be made but that’s for later… We now know that we can reasonably fuzz with this.

Embedding Nethermind (.NET)

Now, let’s try .NET, which happens to support AOT (Ahead-Of-Time) compilation. According to the docs:

“Native AOT apps have faster startup time and smaller memory footprints.”

Perfect! Let’s try!

cd nethermind/src/Nethermind/Nethermind.Test.Runner/
dotnet build Nethermind.Test.Runner.csproj -c Release
vim Nethermind.Test.Runner.csproj

Add the following lines inside the <PropertyGroup>...</PropertyGroup> section.

<PublishAot>true</PublishAot>
<ValidateExecutableReferencesMatchSelfContained>
    false
</ValidateExecutableReferencesMatchSelfContained>

Let’s build

Nethermind.Test.Runner net10.0 linux-x64 failed with 4 error(s) (3.5s) → /tmp/nethermind/src/Nethermind/artifacts/bin/Nethermind.Test.Runner/release_linux-x64/nethtest.dll
 /tmp/nethermind/src/Nethermind/Nethermind.Serialization.Json/EthereumJsonSerializer.cs(59): Trim analysis error IL2026: Nethermind.Serialization.Json.EthereumJsonSerializer.Serialize<T>(T,Boolean): Using member 'System.Text.Json.JsonSerializer.Serialize<T>(T,JsonSerializerOptions)' which has 'RequiresUnreferencedCodeAttribute' can break functionality when trimming application code. JSON serialization and deserialization might require types that cannot be statically analyzed. Use the overload that takes a JsonTypeInfo or JsonSerializerContext, or make sure all of the required types are preserved.

 /tmp/nethermind/src/Nethermind/Nethermind.Serialization.Json/EthereumJsonSerializer.cs(59): AOT analysis error IL3050: Nethermind.Serialization.Json.EthereumJsonSerializer.Serialize<T>(T,Boolean): Using member 'System.Text.Json.JsonSerializer.Serialize<T>(T,JsonSerializerOptions)' which has 'RequiresDynamicCodeAttribute' can break functionality when AOT compiling. JSON serialization and deserialization might require types that cannot be statically analyzed and might need runtime code generation. Use System.Text.Json source generation for native AOT applications.
 EXEC : error Index was outside the bounds of the array.

/home/aarnav/.nuget/packages/microsoft.dotnet.ilcompiler/10.0.0/build/Microsoft.NETCore.Native.targets(330,5): error MSB3073: The command ""/home/aarnav/.nuget/packages/runtime.linux-x64.microsoft.dotnet.ilcompiler/10.0.0/tools/ilc" @"/tmp/nethermind/src/Nethermind/artifacts/obj/Nethermind.Test.Runner/release_linux-x64/native/nethtest.ilc.rsp"" exited with code 1.

Build failed with 4 error(s) in 44.6s

It turns out that reflection-heavy code like this is not supported in .NET’s AOT mode.

Fixing the JSON reflection errors led to other reflection errors in Nethermind’s EVM runtime. The extensive usage of runtime reflection meant that AOT compilation was not possible without significant changes.

Since we cannot compile it as an Ahead of Time library, we must embed the runtime, similar to Besu.

Similar to what we did for Java, we first need to abstract the test runner into another function, since it is tightly coupled with the CLI.

Let’s try a very simple approach. The (IntPtr args, int sizeBytes) signature below matches the default entry-point delegate expected by the .NET hosting API, which is what we will load from Rust:

// FuzzRunner.cs
public static unsafe int FuzzRunTest(IntPtr args, int sizeBytes)
{
    try
    {
        string testJson = System.Text.Encoding.UTF8.GetString(
            new ReadOnlySpan<byte>((void*)args, sizeBytes)
        );
        ulong chainId = MainnetSpecProvider.Instance.ChainId;

        IEnumerable<EthereumTest> tests = JsonToEthereumTest.ConvertStateTest(testJson);

        StateTestsRunner runner = new(
            new InMemoryTestSourceLoader(tests),
            WhenTrace.Never,
            traceMemory: false,
            traceStack: false,
            chainId,
            filter: null,
            enableWarmup: false,
            outputWriter: null);

        var result = runner.RunTests();
        // TODO: return stateRoot
        return 0;
    }
    catch (Exception ex)
    {
        Console.Error.WriteLine($"Error in FuzzRunTest: {ex}");
        return 1;
    }
}

Let’s try building and embedding.

# dotnet 10
cd nethermind/src/Nethermind/Nethermind.Test.Runner/
dotnet build Nethermind.Test.Runner.csproj -c Release
cd ../../../..
mkdir bindings
cd bindings
cargo init
cargo add netcorehost

Here’s the Rust shim.

use netcorehost::pdcstr;
use std::env;

fn main() {
    env::set_current_dir("<path_to_dlls>")
        .expect("Failed to set working directory");

    let hostfxr = netcorehost::nethost::load_hostfxr().unwrap();
    let context = hostfxr
        .initialize_for_runtime_config(pdcstr!("Nethermind.Test.Runner.Lib.runtimeconfig.json"))
        .unwrap();

    let fn_loader = context
        .get_delegate_loader_for_assembly(pdcstr!("Nethermind.Test.Runner.Lib.dll"))
        .unwrap();

    let run_test = fn_loader
        .get_function_with_default_signature(
            pdcstr!("Nethermind.Test.Runner.Lib.FuzzTestRunner, Nethermind.Test.Runner.Lib"),
            pdcstr!("FuzzRunTest"),
        )
        .unwrap();
    let json = std::fs::read_to_string("../test.json").unwrap();
    let result = unsafe {
        run_test(json.as_str().as_ptr() as *const std::ffi::c_void, json.len() as i32)
    };
    println!("Test result: {}", result);
}

Running cargo run, I got a strange error:

Error in FuzzRunTest: System.Collections.Generic.KeyNotFoundException: The given key was not present in the dictionary.
   at NonBlocking.ConcurrentDictionary`2.ThrowKeyNotFoundException()
   at NonBlocking.ConcurrentDictionary`2.get_Item(TKey key)
   at Nethermind.Config.ConfigProvider.GetConfig(Type configType) in /home/aarnav/projects/evm-differential-fuzzer/nethermind/src/Nethermind/Nethermind.Config/ConfigProvider.cs:line 92
   at Nethermind.Config.ConfigProvider.GetConfig[T]() in /home/aarnav/projects/evm-differential-fuzzer/nethermind/src/Nethermind/Nethermind.Config/ConfigProvider.cs:line 77
   at Nethermind.Core.Test.Modules.PseudoNethermindModule.Load(ContainerBuilder builder) in /home/aarnav/projects/evm-differential-fuzzer/nethermind/src/Nethermind/Nethermind.Core.Test/Modules/PseudoNethermindModule.cs:line 37
   at Autofac.Module.Configure(IComponentRegistryBuilder componentRegistry)
   at Autofac.Core.Registration.ModuleRegistrar.<.ctor>b__1_0(IComponentRegistryBuilder reg)
   at Autofac.ContainerBuilder.Build(IComponentRegistryBuilder componentRegistry, Boolean excludeDefaultModules)
   at Autofac.ContainerBuilder.UpdateRegistry(IComponentRegistryBuilder componentRegistry)
   at Autofac.Module.Configure(IComponentRegistryBuilder componentRegistry)
   at Autofac.Core.Registration.ModuleRegistrar.<.ctor>b__1_0(IComponentRegistryBuilder reg)
   at Autofac.ContainerBuilder.Build(IComponentRegistryBuilder componentRegistry, Boolean excludeDefaultModules)
   at Autofac.ContainerBuilder.Build(ContainerBuildOptions options)
   at Ethereum.Test.Base.GeneralStateTestBase.RunTest(GeneralStateTest test, ITxTracer txTracer) in /home/aarnav/projects/evm-differential-fuzzer/nethermind/src/Nethermind/Ethereum.Test.Base/GeneralTestBase.cs:line 80
   at Nethermind.Test.Runner.StateTestsRunner.RunTests() in /home/aarnav/projects/evm-differential-fuzzer/nethermind/src/Nethermind/Nethermind.Test.Runner/StateTestRunner.cs:line 110
   at Nethermind.Test.Runner.Program.FuzzRunTest(IntPtr args, Int32 sizeBytes) in /home/aarnav/projects/evm-differential-fuzzer/nethermind/src/Nethermind/Nethermind.Test.Runner/Program.cs:line 202
Result: 1

I sank several hours into trying to fix this, to no avail. If someone knows how to embed .NET, please let me know. I suspect that the issue is either:

  1. The order in which the .dll files are loaded.
  2. A type-registration issue at runtime.

Frankly, debugging this was a difficult undertaking for someone who has no clue about the .NET ecosystem. LLMs were of no assistance. I decided my time was better spent fuzzing instead of getting this to work. Let’s use the shared memory approach.

This gives me the opportunity to explore another method! It is bound to be useful later, as I’m sure I will encounter other targets that cannot be embedded easily.

Shared Memory

Since we could not get embedding to work for .NET, we can instead implement a server that communicates over IPC. This way, we only need to start the runtime once.

The architecture is the following:

testdata = create_shared_memory_segment("nethermind__<core_id>");
signal = create_shared_memory_segment("nethermind__<core_id>__signal");

loop {
   if read_data(signal) == "START\0" {
        json = read_data(testdata);
        result = do_state_test(json);
        if result.is_err() {
            write_data(testdata, result.error);
            write_data(signal, "DONE\0\0");
        } else {
            write_data(testdata, result.stateRoot);
            write_data(signal, "DONE\0\0");
        }
    } else {
        sleep(10ms);
    }
}

We need the signal to avoid race conditions when reading the test data.
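
On the fuzzer side, the client half of this protocol is simple. Here’s a rough Rust sketch (assuming the memmap2 crate; the core id, buffer sizes and polling interval are illustrative and mirror the server shown below):

// Fuzzer-side client sketch for the shared-memory protocol (illustrative).
use std::{fs::OpenOptions, thread, time::Duration};

use memmap2::MmapMut;

const DATA_SIZE: usize = 8092; // matches SharedMemorySize in the server
const SIGNAL_SIZE: usize = 6;  // matches SharedMemorySizeSignal

fn map_shm(path: &str, len: usize) -> std::io::Result<MmapMut> {
    let file = OpenOptions::new().read(true).write(true).create(true).open(path)?;
    file.set_len(len as u64)?;
    unsafe { MmapMut::map_mut(&file) }
}

fn run_state_test(data: &mut MmapMut, signal: &mut MmapMut, test_json: &[u8]) -> String {
    assert!(test_json.len() < DATA_SIZE);
    // Write the NUL-terminated test case, then raise the START signal.
    data[..test_json.len()].copy_from_slice(test_json);
    data[test_json.len()] = 0;
    signal[..SIGNAL_SIZE].copy_from_slice(b"START\0");
    // Poll until the server flips the signal to DONE.
    while &signal[..SIGNAL_SIZE] != b"DONE\0\0" {
        thread::sleep(Duration::from_millis(1));
    }
    signal[..SIGNAL_SIZE].copy_from_slice(b"\0\0\0\0\0\0"); // reset for the next run
    // Read back the result (assumes the server NUL-terminates what it writes).
    let end = data.iter().position(|&b| b == 0).unwrap_or(DATA_SIZE);
    String::from_utf8_lossy(&data[..end]).into_owned()
}

fn main() -> std::io::Result<()> {
    let mut data = map_shm("/dev/shm/nethermind__0", DATA_SIZE)?;
    let mut signal = map_shm("/dev/shm/nethermind__0__signal", SIGNAL_SIZE)?;
    let json = std::fs::read("./test.json")?;
    println!("stateRoot: {}", run_state_test(&mut data, &mut signal, &json));
    Ok(())
}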

Implementing this was very quick with Claude, since I had no .NET knowledge. Here’s the code. Sorry for the big blob! Speed results right after.

// src/Nethermind/Nethermind.Test.Runner/Program.cs
private static int RunServerMode(ParseResult parseResult, string memoryId, CancellationToken cancellationToken)
{
    ulong chainId = parseResult.GetValue(Options.GnosisTest) ? GnosisSpecProvider.Instance.ChainId : MainnetSpecProvider.Instance.ChainId;

    const int SharedMemorySize = 8092;
    string shmPath = $"/dev/shm/{memoryId}";
    using FileStream fs = new(shmPath, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.ReadWrite);
    fs.SetLength(SharedMemorySize);

    using MemoryMappedFile mmf = MemoryMappedFile.CreateFromFile(fs, null, SharedMemorySize, MemoryMappedFileAccess.ReadWrite, HandleInheritability.None, leaveOpen: false);
    using MemoryMappedViewAccessor accessor = mmf.CreateViewAccessor(0, SharedMemorySize, MemoryMappedFileAccess.ReadWrite);

    const int SharedMemorySizeSignal = 6;
    string shmSignalPath = $"/dev/shm/{memoryId}__signal";

    using FileStream fsSignal = new(shmSignalPath, FileMode.OpenOrCreate, FileAccess.ReadWrite, FileShare.ReadWrite);
    fsSignal.SetLength(SharedMemorySizeSignal);

    using MemoryMappedFile mmfSignal = MemoryMappedFile.CreateFromFile(fsSignal, null, SharedMemorySizeSignal, MemoryMappedFileAccess.ReadWrite, HandleInheritability.None, leaveOpen: false);
    using MemoryMappedViewAccessor accessorSignal = mmfSignal.CreateViewAccessor(0, SharedMemorySizeSignal, MemoryMappedFileAccess.ReadWrite);

    while (!cancellationToken.IsCancellationRequested)
    {
        Thread.Sleep(10);

        byte[] signalBuffer = new byte[SharedMemorySizeSignal];
        accessorSignal.ReadArray(0, signalBuffer, 0, SharedMemorySizeSignal);

        if (signalBuffer.SequenceEqual(Encoding.UTF8.GetBytes("START\0")))
        {
            byte[] buffer = new byte[SharedMemorySize];
            accessor.ReadArray(0, buffer, 0, SharedMemorySize);

            int jsonLength = Array.IndexOf(buffer, (byte)0);
            if (jsonLength == -1) jsonLength = SharedMemorySize;

            string testJson = Encoding.UTF8.GetString(buffer, 0, jsonLength);

            try
            {
                IEnumerable<EthereumTest> tests;
                tests = JsonToEthereumTest.ConvertStateTest(testJson);
                StateTestsRunner runner = new(
                    new InMemoryTestSourceLoader(tests),
                    WhenTrace.Never,
                    traceMemory: false,
                    traceStack: false,
                    chainId,
                    filter: null,
                    enableWarmup: false,
                    outputWriter: null);

                var result = runner.RunTests().First();
                string resultString = result?.StateRoot?.ToString() ?? "null";
                byte[] resultBytes = Encoding.UTF8.GetBytes(resultString);
                /// write bytes to shared memory
            }
            catch (Exception ex)
            {
                /// report error
            }
        }
    }

    return 0;
}

Let’s look at the speed results. I wrote a quick script to measure Nethermind’s execution times (in milliseconds):

# all in milliseconds
[Nethermind-1] 2611
[Nethermind-1] 15
[Nethermind-1] 13
[Nethermind-1] 8
[Nethermind-1] 6
[Nethermind-1] 6

After the first few cases, both Java and C# work at expected speeds! Great! We have two approaches, both working reasonably well.

We can begin fuzzing!

Actually! There will be a part two to this article! The full harness will be shared and I will talk about the grammar fuzzing strategies I’ve implemented and how they fared. Stick around.

To Conclude

It was a challenge to differentially fuzz implementations in two languages I had almost no experience with. Overall, this campaign involved four languages: Go, Rust, C#, and Java.

Both the shared memory and embedding approaches provide a significant speed-up compared to process invocation. I will probably just go with the shared memory approach for any other interpreted target because it’s trivial to implement and “just works”™.

Lessons learnt

Patching is inevitable

The sooner you get comfortable diving into unknown code, the better your results will be. Hack it, and always measure!

Use the magic strace command.

The ultimate debugging command (credit to the Fuzzing Discord).

strace -tt -yy -y -f -e trace=openat,open,read,write,pipe,socket,dup2,clone,close -s 10000 -o /tmp/strace.log <your command>

Always keep it simple.

Fuzzing is about tradeoffs. Find out what’s best for your target. Measure, and do not over-optimize at first.