Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

NullReferenceException is thrown if CommonDialog.ShowDialog is patched #640

Open
mingpepe opened this issue Dec 19, 2024 · 7 comments
Open

Comments

@mingpepe
Copy link

mingpepe commented Dec 19, 2024

Describe the bug
NullReferenceException is thrown on calling target function.

To Reproduce
Steps to reproduce the behavior:

  1. Patch CommonDialog.ShowDialog on (Microsoft.Win32)
  2. Call ShowDialog

Expected behavior
Not throw exception

Screenshots / Code

[HarmonyPatch(typeof(CommonDialog))]
[HarmonyPatch(nameof(CommonDialog.ShowDialog))]
[HarmonyPatch(new Type[] { })]
class Patch
{
    static bool Prefix(ref bool? __result)
    {
        __result = true;
        return false;
    }
}
internal class Program
{
    static void Main(string[] args)
    {
        Harmony.DEBUG = true;
        var harmony = new Harmony("a");
        harmony.PatchAll();
        var dialog = new OpenFileDialog();
        dialog.ShowDialog();
    }
}

Runtime environment (please complete the following information):

  • OS: Windows 10, 64bit
  • .NET version: 4.6.1
  • Harmony version: 2.3.3

Additional context

  1. It works properly for Harmony 2.2.2, but throw exception for 2.3.3
  2. With harmony.Patch I can get replacement MethodInfo, invoke it works properly
  3. Pathed function called as expected.
  4. Calling original function or not does not affect this issue.
  5. It works properly for x64, throw exception for x86
  6. The generated IL code for x86 & x64 are the same
  7. The generated IL code for v2.33 & v2.22 are the same
  8. The issue occured starting from v2.3-prerelease 2

Log

Harmony id=a, version=2.3.3.0, location=, env/clr=4.0.30319.42000, platform=Win32NT

Started from static System.Void ConsoleApp2.Program::Main(System.String[] args), location

At 2024-12-19 10.38.46

Patch: virtual System.Nullable`1<System.Boolean> Microsoft.Win32.CommonDialog::ShowDialog()

Replacement: static System.Nullable`1<System.Boolean> Microsoft.Win32.CommonDialog::Microsoft.Win32.CommonDialog.ShowDialog_Patch1(Microsoft.Win32.CommonDialog this)

IL_0000: Local var 0: System.IntPtr
IL_0000: Local var 1: MS.Win32.HwndWrapper
IL_0000: Local var 2: System.Nullable1<System.Boolean> IL_0000: Local var 3: System.Nullable1<System.Boolean>
IL_0000: Local var 4: System.Boolean
IL_0000: ldloca 3 (System.Nullable1[System.Boolean]) IL_0004: initobj System.Nullable1[System.Boolean]
IL_000A: ldc.i4 0
IL_000F: stloc 4 (System.Boolean)
IL_0013: ldc.i4.1
IL_0014: stloc 4 (System.Boolean)
IL_0018: ldloc 4 (System.Boolean)
IL_001C: brfalse => Label1
IL_0021: ldloca 3 (System.Nullable1[System.Boolean]) IL_0025: call static System.Boolean ConsoleApp2.Patch::Prefix(System.Nullable1& __result)
IL_002A: stloc 4 (System.Boolean)
IL_002E: Label1
IL_002E: nop
IL_002F: ldloc 4 (System.Boolean)
IL_0033: brfalse => Label0
IL_0038: // start original
IL_0038: ldarg.0
IL_0039: callvirt virtual System.Void Microsoft.Win32.CommonDialog::CheckPermissionsToShowDialog()
IL_003E: call static System.Boolean System.Environment::get_UserInteractive()
IL_0043: brtrue => Label2
IL_0048: ldstr "CantShowModalOnNonInteractive"
IL_004D: call static System.String System.Windows.SR::Get(System.String id)
IL_0052: newobj System.Void System.InvalidOperationException::.ctor(System.String message)
IL_0057: throw
IL_0058: Label2
IL_0058: call static System.IntPtr MS.Win32.UnsafeNativeMethods::GetActiveWindow()
IL_005D: stloc.0
IL_005E: ldloc.0
IL_005F: ldsfld System.IntPtr System.IntPtr::Zero
IL_0064: call static System.Boolean System.IntPtr::op_Equality(System.IntPtr value1, System.IntPtr value2)
IL_0069: brfalse => Label3
IL_006E: call static System.Windows.Application System.Windows.Application::get_Current()
IL_0073: brfalse => Label4
IL_0078: call static System.Windows.Application System.Windows.Application::get_Current()
IL_007D: callvirt System.IntPtr System.Windows.Application::get_ParkingHwnd()
IL_0082: stloc.0
IL_0083: Label3
IL_0083: Label4
IL_0083: ldnull
IL_0084: stloc.1
.try
{
IL_0085: ldloc.0
IL_0086: ldsfld System.IntPtr System.IntPtr::Zero
IL_008B: call static System.Boolean System.IntPtr::op_Equality(System.IntPtr value1, System.IntPtr value2)
IL_0090: brfalse => Label5
IL_0095: ldc.i4.0
IL_0096: ldc.i4.0
IL_0097: ldc.i4.0
IL_0098: ldc.i4.0
IL_0099: ldc.i4.0
IL_009A: ldc.i4.0
IL_009B: ldc.i4.0
IL_009C: ldstr ""
IL_00A1: ldsfld System.IntPtr System.IntPtr::Zero
IL_00A6: ldnull
IL_00A7: newobj System.Void MS.Win32.HwndWrapper::.ctor(System.Int32 classStyle, System.Int32 style, System.Int32 exStyle, System.Int32 x, System.Int32 y, System.Int32 width, System.Int32 height, System.String name, System.IntPtr parent, MS.Win32.HwndWrapperHook[] hooks)
IL_00AC: stloc.1
IL_00AD: ldloc.1
IL_00AE: callvirt System.IntPtr MS.Win32.HwndWrapper::get_Handle()
IL_00B3: stloc.0
IL_00B4: Label5
IL_00B4: ldarg.0
IL_00B5: ldloc.0
IL_00B6: stfld System.IntPtr Microsoft.Win32.CommonDialog::_hwndOwnerWindow
.try
{
IL_00BB: call static System.Void System.Windows.Interop.ComponentDispatcher::CriticalPushModal()
IL_00C0: ldarg.0
IL_00C1: ldloc.0
IL_00C2: callvirt abstract virtual System.Boolean Microsoft.Win32.CommonDialog::RunDialog(System.IntPtr hwndOwner)
IL_00C7: newobj System.Void System.Nullable1<System.Boolean>::.ctor(System.Boolean value) IL_00CC: stloc.2 IL_00CD: leave => Label6 IL_00D2: leave => (autogenerated) } // end try .finally { IL_00D7: call static System.Void System.Windows.Interop.ComponentDispatcher::CriticalPopModal() IL_00DC: endfinally IL_00DD: leave => (autogenerated) } // end handler IL_00DE: leave => (autogenerated) } // end try .finally { IL_00E3: ldloc.1 IL_00E4: brfalse => Label7 IL_00E9: ldloc.1 IL_00EA: callvirt virtual System.Void MS.Win32.HwndWrapper::Dispose() IL_00EF: Label7 IL_00EF: endfinally IL_00F0: leave => (autogenerated) } // end handler IL_00F1: Label6 IL_00F1: ldloc.2 IL_00F2: // end original IL_00F2: stloc 3 (System.Nullable1[System.Boolean])
IL_00F6: Label0
IL_00F6: ldloc 3 (System.Nullable`1[System.Boolean])
IL_00FA: ret
DONE

@mingpepe
Copy link
Author

mingpepe commented Dec 19, 2024

Trying to figure out what trigger this issue in CommonDialog, find out patching A.f can cause same issue.

class A
{
    public bool? f()
    {
        return false;
    }
    // Any reference field will trigger this issue
    private string s = "";
    // No value is assigned, will not trigger exception
    // private string s;
}

It needs
a) Return type is bool? (int? works fine, do not know why)
b) Class A has reference field with value

pardeike added a commit that referenced this issue Dec 19, 2024
@pardeike
Copy link
Owner

Something is weird in your test. The field has no way to influence the patch. I just added a test that is essentially the same as your case and it does not fail: 4852c9c

@pardeike
Copy link
Owner

With certain conditions, the test actually fails. I have to investigate more.

@pardeike
Copy link
Owner

I have more details, in case someone can help to understand why this happens:

image

Fatal crashes on
x86 ReleaseThin
netcoreapp3.0 / 3.1, net5-9

System.NullReferenceException on
x86 DotNet ReleaseThin / ReleaseFat
x86 Mono ReleaseThin / ReleaseFat
net35 - net48

@pardeike
Copy link
Owner

For now this seems like a WONTFIX until we maybe get an idea on what exactly the runtime does with Nullable. Let me cite some of the things devs from the .NET team wrote:

  • Nullable itself is extremely special and has custom layout; its why its blocked from various constrained scenarios, from interop in general, etc
  • *it gets special boxing and other behavior too; so you can't always observe it and sometimes its erased`

@mingpepe
Copy link
Author

After conducting research, it was found that the issue lies in MonoMod

CLR ABI says,

When the return type is a value type that cannot be passed in a register, the caller shall create a buffer to hold the result and pass the address of this buffer as a hidden parameter.
The only values that can be passed in registers are managed and unmanaged pointers, object references, and the built-in integer types int8, unsigned int8, int16, unsigned int16, int32, unsigned it32, native int, native unsigned int, and enums and value types with only one 4-byte integer primitive-type field.

That said, reference types do not have return buffer. Removing the ABI fix wrapper makes it work.

So, the root cause is that the caller does not prepare return buffer, but callee expects a return buffer, which causes an access violation, then throw NullReferenceException.

Why dosen't int? trigger this issue, but bool? does?

The generated IL code is similar:

// Nullable<int>
IL_0000: ldarg VAR OR ARG 1
IL_0004: nop 
IL_0005: nop 
IL_0006: ldarg VAR OR ARG 0
IL_000a: nop 
IL_000b: nop 
IL_000c: call 00020000 is not a MethodDesc
6000002 
IL_0011: stobj 2000003 "System.Nullable`1[[System.Int32, mscorlib]]"
IL_0016: ldarg VAR OR ARG 1
IL_001a: nop 
IL_001b: nop 
IL_001c: ret 

// Nullable<bool>
// DynamicClass.Glue:AbiFixup<System.Nullable`1[[System.Boolean, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]] WpfApp1.A:f(),System.Nullable`1[[System.Boolean, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]] WpfApp1.A.f_Patch1(WpfApp1.A)>(),System.Nullable`1[[System.Boolean, mscorlib, Version=4.0.0.0, Culture=neutral, PublicKeyToken=b77a5c561934e089]] WpfApp1.A.f_Patch1(WpfApp1.A)>(WpfApp1.A, System.Nullable`1<Boolean> ByRef)
IL_0000: ldarg VAR OR ARG 1
IL_0004: nop 
IL_0005: nop 
IL_0006: ldarg VAR OR ARG 0
IL_000a: nop 
IL_000b: nop 
IL_000c: call 00020000 is not a MethodDesc
6000002 
IL_0011: stobj 2000003 "System.Nullable`1[[System.Boolean, mscorlib]]"
IL_0016: ldarg VAR OR ARG 1
IL_001a: nop 
IL_001b: nop 
IL_001c: ret 

However, the x86 assembly is different because bool? only takes 2 bytes, so instead of just replace pointer(mov eax, esi), it writes to pointer(mov word ptr [esi], ax). So, if edx happened point to a writable address, this issue won't be triggered.(This explains why adding a reference field makes a difference, because the constructor may use edx which could result in it does not point to a wriable address)

// Nullable<int>
 06fa00b0 56         push    esi
06fa00b1 8bf2       mov     esi, edx
06fa00b3 8bd1       mov     edx, ecx
06fa00b5 8bce       mov     ecx, esi
06fa00b7 e87459fcff call    06f65a30
06fa00bc 8bc6       mov     eax, esi
06fa00be 5e         pop     esi
06fa00bf c3         ret     

// Nullable<bool>
06fd00b0 56             push    esi
06fd00b1 8bf2           mov     esi, edx
06fd00b3 e87859fcff     call    06f95a30
06fd00b8 668906         mov     word ptr [esi], ax
06fd00bb 8bc6           mov     eax, esi
06fd00bd 5e             pop     esi
06fd00be c3             ret    

Change nullable to following struct can also reproduce this issue.

struct Crash
{
    bool a;
    bool b;
}

struct Ok
{
    bool a;
    int b;
}

@pardeike
Copy link
Owner

pardeike commented Dec 30, 2024

I already opened an issue yesterday in MonoMod: MonoMod/MonoMod#206

and my first guess was the return argument being passed as a pointer (there are other cases besides bool? that require special handling).

The tricky part here is to fix this systematically, for all .net versions, all operating systems and all processors as well as for all return types.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants