A hang analysis of a .NET medical device cleaning system
One: Background
1. The story
Some time ago, I helped a friend from the training camp analyze a program that kept hanging. Looking back, the case is a fairly classic one, so I will summarize it briefly in this article to help newcomers avoid the same pitfall.
Two: WinDbg analysis
1. Why is it stuck?
Because this is a WinForms program, the natural first step is to see what the main thread is doing at this moment. Switch to it with ~0s and dump its stack with k.
0:000>k
#ChildEBP RetAddr
00 00aff168 75e3bb0a win32u!NtUserPeekMessage+0xc
01 00aff168 75e3ba7e USER32!_PeekMessage+0x2a
02 00aff1a4 6a5d711c USER32!PeekMessageW+0x16e
03 00aff1f0 6a5841a6 System_Windows_Forms_ni+0x23711c
...
17 00afffbc 00000000 ntdll!_RtlUserThreadStart+0x1b
Judging from the thread stack, the main thread is sitting in win32u!NtUserPeekMessage. Anyone familiar with Windows Forms message handling knows this is the routine logic for pulling messages from the message queue. The next step of this method is to enter Windows kernel mode through Wow64SystemServiceCall, which can be verified with the ub command.
0:000>ub win32u!NtUserPeekMessage+0xc
761d1010 b801100000 mov eax,1001h
761d1015 ba10631d76 mov edx,offset win32u!Wow64SystemServiceCall (761d6310)
761d101a ffd2 call edx
My friend also sent me a screenshot, and the thread was indeed stuck there. The next question is: what is the current thread doing in kernel mode?
2. Is it really stuck in kernel mode?
Fortunately, my friend was able to install WinDbg on the affected machine, so he could use Attach to kernel to observe the kernel state while the program was stuck. The screenshot is as follows:
After attaching successfully, use !process 0 f xxxx.exe to see the thread stack of the main thread.
lkd> !process 0 f xxxx.exe
PROCESS ffffab8ebea75080
SessionId: 1 Cid: 0f78 Peb: 009f1000 ParentCid: 1134
...
THREAD ffffab8ecad14540 Cid 0f78.38f8 Teb: 00000000009f3000 Win32Thread: ffffab8ecd5dabc0 WAIT: (WrUserRequest) UserMode Non-Alertable
ffffab8ecb31bcc0 QueueObject
IRP List:
ffffab8ecad82b20: (0006,0478) Flags: 00060000 Mdl: 00000000
Not impersonating
DeviceMap ffffd400aa7eed50
Owning Process ffffab8ebea75080 Image: xxxx.exe
Attached Process N/A Image: N/A
Wait Start TickCount 1117311 Ticks: 9265 (0:00:02:24.765)
Context Switch Count 60628 IdealProcessor: 2 NoStackSwap
UserTime 00:00:10.796
KernelTime 00:00:06.593
Win32 Start Address 0x00000000006e16aa
Stack Init ffffa88b5b18fb90 Current ffffa88b5b18e780
Base ffffa88b5b190000 Limit ffffa88b5b189000 Call 0000000000000000
Priority 10 BasePriority 8 PriorityDecrement 0 IoPriority 2 PagePriority 5
Child-SP RetAddr Call Site
ffffa88b`5b18e7c0 fffff806`6627e370 nt!KiSwapContext+0x76
ffffa88b`5b18e900 fffff806`6627d89f nt!KiSwapThread+0x500
ffffa88b`5b18e9b0 fffff806`6627d143 nt!KiCommitThreadWait+0x14f
ffffa88b`5b18ea50 fffff806`6628679b nt!KeWaitForSingleObject+0x233
ffffa88b`5b18eb40 ffffa9d4`bdd32b12 nt!KeWaitForMultipleObjects+0x45b
ffffa88b`5b18ec50 ffffa9d4`bdd352d9 win32kfull!xxxRealSleepThread+0x362
ffffa88b`5b18ed70 ffffa9d4`bdd33f8a win32kfull!xxxInterSendMsgEx+0xdd9
ffffa88b`5b18eee0 ffffa9d4`bdd37870 win32kfull!xxxSendTransformableMessageTimeout+0x3ea
ffffa88b`5b18f030 ffffa9d4`bdf1e088 win32kfull!xxxSendMessage+0x2c
ffffa88b`5b18f090 ffffa9d4`bdf1e0e9 win32kfull!xxxCompositedTraverse+0x40
ffffa88b`5b18f0e0 ffffa9d4`bdf1e0e9 win32kfull!xxxCompositedTraverse+0xa1
ffffa88b`5b18f130 ffffa9d4`bdf1e0e9 win32kfull!xxxCompositedTraverse+0xa1
ffffa88b`5b18f180 ffffa9d4`bdf1e0e9 win32kfull!xxxCompositedTraverse+0xa1
ffffa88b`5b18f1d0 ffffa9d4`bdf1e2a7 win32kfull!xxxCompositedTraverse+0xa1
ffffa88b`5b18f220 ffffa9d4`bde5a013 win32kfull!xxxCompositedPaint+0x37
ffffa88b`5b18f2b0 ffffa9d4`bdd2e438 win32kfull!xxxInternalDoPaint+0x12bce3
ffffa88b`5b18f300 ffffa9d4`bdd2e03a win32kfull!xxxInternalDoPaint+0x108
ffffa88b`5b18f350 ffffa9d4`bdd30f1c win32kfull!xxxDoPaint+0x52
ffffa88b`5b18f3b0 ffffa9d4`bdd2ff08 win32kfull!xxxRealInternalGetMessage+0xfac
ffffa88b`5b18f880 ffffa9d4`be1871ce win32kfull!NtUserPeekMessage+0x158
ffffa88b`5b18f940 fffff806`6640d8f5 win32k!NtUserPeekMessage+0x2a
ffffa88b`5b18f990 00007ffe`1816ff74 nt!KiSystemServiceCopyEnd+0x25 (TrapFrame @ ffffa88b`5b18fa00)
00000000`0077e558 00000000`00000000 0x00007ffe`1816ff74
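The Ticks: 9265 (0:00:02:24.765) field in the thread dump above tells us how long the main thread has been waiting. As a sanity check, here is a minimal sketch that converts the tick count into that duration. It assumes the common default timer interval of 15.625 ms per tick (64 ticks per second), which the dump itself does not print, and the helper names are hypothetical.

```python
from datetime import timedelta

# Assumption: one tick is 15.625 ms (the usual default clock interval).
TICK_MS = 15.625


def ticks_to_duration(ticks, tick_ms=TICK_MS):
    """Convert a kernel tick count into a timedelta."""
    return timedelta(milliseconds=ticks * tick_ms)


def format_like_windbg(duration):
    """Format a timedelta as d:hh:mm:ss.mmm, truncating to whole milliseconds."""
    total_ms = duration // timedelta(milliseconds=1)
    total_s, ms = divmod(total_ms, 1000)
    total_min, seconds = divmod(total_s, 60)
    total_h, minutes = divmod(total_min, 60)
    days, hours = divmod(total_h, 24)
    return f"{days}:{hours:02d}:{minutes:02d}:{seconds:02d}.{ms:03d}"


waited = ticks_to_duration(9265)
print(format_like_windbg(waited))  # 0:00:02:24.765
```

Under that assumption the numbers agree: the main thread had been parked in this wait for almost two and a half minutes, which is far too long for a healthy message loop.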
If the output contains very little thread information, use .process to set this process as the current context and then reload symbols so that the user symbols are loaded. The output is as follows:
lkd> .process ffffab8ebea75080
Implicit process is now ffffab8e`bea75080
lkd> .reload
Connected to Windows 10 19041 x64 target at (Tue Mar 21 13:21:21.213 2023 (UTC + 8:00)), ptr64 TRUE
Loading Kernel Symbols
................................................................. .............
................................................................. ............
................................................................. ............
.....................
Loading User Symbols
PEB is paged out (Peb.Ldr = 00000000`009f1018). Type ".hh dbgerr001" for details
Loading unloaded module list
The thread stack above clearly contains a win32kfull!xxxSendMessage+0x2c frame. Anyone familiar with SendMessage knows it is used to send a message to a window. So which window is it?
3. Which window is the message being sent to?
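When there are many threads to look through, this kind of stack reading can be partly automated. The sketch below parses a few frames copied from the kernel stack above and flags the pattern seen here: xxxInterSendMsgEx followed by xxxRealSleepThread. Treating that pair as the sign of a blocked cross-thread send is my own heuristic for illustration, and the function names are hypothetical.

```python
import re

# Frames copied from the kernel stack shown earlier in this article.
STACK = """\
ffffa88b`5b18eb40 ffffa9d4`bdd32b12 nt!KeWaitForMultipleObjects+0x45b
ffffa88b`5b18ec50 ffffa9d4`bdd352d9 win32kfull!xxxRealSleepThread+0x362
ffffa88b`5b18ed70 ffffa9d4`bdd33f8a win32kfull!xxxInterSendMsgEx+0xdd9
ffffa88b`5b18eee0 ffffa9d4`bdd37870 win32kfull!xxxSendTransformableMessageTimeout+0x3ea
ffffa88b`5b18f030 ffffa9d4`bdf1e088 win32kfull!xxxSendMessage+0x2c
ffffa88b`5b18f090 ffffa9d4`bdf1e0e9 win32kfull!xxxCompositedTraverse+0x40
"""

# Child-SP, RetAddr, then module!symbol with an optional +0xoffset.
FRAME = re.compile(r"^\S+\s+\S+\s+(?P<module>[^!\s]+)!(?P<symbol>[^+\s]+)")


def parse_frames(text):
    """Return (module, symbol) pairs for every line that looks like a frame."""
    frames = []
    for line in text.splitlines():
        match = FRAME.match(line.strip())
        if match:
            frames.append((match.group("module"), match.group("symbol")))
    return frames


def looks_like_blocked_send(frames):
    """Heuristic (an assumption): a sleeping inter-thread send is on the stack."""
    symbols = [symbol for _, symbol in frames]
    return "xxxInterSendMsgEx" in symbols and "xxxRealSleepThread" in symbols


frames = parse_frames(STACK)
print(looks_like_blocked_send(frames))  # True
```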
To get the handle of the target window, you need to extract the first parameter of the win32kfull!xxxSendMessage method. Under the x64 calling convention it is passed in rcx, so you have to read the assembly code, and if rcx was never saved to the stack it cannot be recovered.
To save trouble, I asked my friend to check whether the problem also reproduces on a 32-bit operating system, where arguments are normally passed on the stack. The feedback was that it does. Use !thread xxx to locate the target thread, then use kb to read the value of the first parameter, which is 00010598. The screenshot is as follows:
I sent my friend the sdbgext plug-in to look up the window handle information, but it turned out to be a 64-bit build. Besides that plug-in, you can also use Spy++ to inspect a window handle. The key point is that it can tell you which thread of which process created this mysterious window. After entering the handle value, it really did find the window: a light in the darkness. The screenshot is as follows:
From Spy++, we can see that the window was created by thread 0000109C in process 000016E0. After comparing the IDs, this thread does belong to our own process.
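When comparing IDs across tools, keep in mind that the values shown here (000016E0, 0000109C, and the Cid fields in the kernel debugger) are hexadecimal, while tools such as Task Manager show decimal. A small sketch of the conversion, with hypothetical helper names:

```python
def hex_id(text):
    """Convert a hexadecimal ID string such as '000016E0' to an integer."""
    return int(text, 16)


def parse_cid(cid):
    """Split a debugger Cid such as '0f78.38f8' into (process_id, thread_id)."""
    pid_hex, tid_hex = cid.split(".")
    return hex_id(pid_hex), hex_id(tid_hex)


# IDs reported by Spy++ in this case.
print(hex_id("000016E0"), hex_id("0000109C"))  # 5856 4252

# Cid of the main thread in the earlier kernel dump (a different session).
print(parse_cid("0f78.38f8"))  # (3960, 14584)
```

With both values in the same base, it is easy to confirm that the creating thread belongs to the process under investigation.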
At this point the picture is quite clear. Thread 0000109C created a user control, so under certain circumstances the kernel sends that control a message and waits for it. The next step is to find out which control was created.
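To see why such a send can hang the main thread, here is a toy model in Python. It is not the Win32 API: ToyWindow, send_message and pump are hypothetical names, and a queue stands in for the owner thread's message queue. The point it illustrates is that a cross-thread send blocks until the owning thread processes the message, so a window whose owner never pumps messages leaves the sender waiting.

```python
import queue
import threading


class ToyWindow:
    """Toy stand-in for a window: messages land in its owner thread's queue."""

    def __init__(self):
        self.inbox = queue.Queue()


def send_message(window, message, timeout):
    """Toy cross-thread send: wait until the owner handles it or timeout expires."""
    handled = threading.Event()
    window.inbox.put((message, handled))
    return handled.wait(timeout)


def pump(window, count):
    """Toy message loop: the owner thread handles `count` messages, then exits."""
    for _ in range(count):
        _message, handled = window.inbox.get()
        handled.set()


# Case 1: the owner thread runs a message loop, so the send returns.
pumped = ToyWindow()
owner = threading.Thread(target=pump, args=(pumped, 1))
owner.start()
ok_pumped = send_message(pumped, "WM_PAINT", timeout=2.0)
owner.join()

# Case 2: the owner never pumps; only the timeout saves the sender here.
orphan = ToyWindow()
ok_orphan = send_message(orphan, "WM_PAINT", timeout=0.2)

print(ok_pumped, ok_orphan)  # True False
```

In the real program there is no such short timeout protecting the main thread, which matches the long wait seen in the kernel stack.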
4. The culprit
I feel I have already said everything there is to say about hangs caused by creating user controls outside the main thread, yet plenty of people still make this mistake, which leaves me speechless. The way to track it down is to use bp to break on the System.Windows.Forms.Application+MarshalingControl..ctor method. For the specific steps, please refer to my article: [Rethinking a super classic WinForm stuck problem] https://www.cnblogs.com/huangxincheng/p/16868486.html
After that came some hard debugging by my friend, and he finally found it. The screenshot is as follows:
Yes, it is the IntPtr handle = this.Handle code. Reading the Handle property forces the window handle to be created on the calling thread, so the window takes root on that thread.
Three: Summary
It came down to just this one line of code. After several rounds of back and forth over a few weeks, it was finally solved, which counts as a good result. This case required live observation of both kernel mode and user mode. A dump was of little use, which is why so much time was spent.
I believe this case also earned my friend some admiration from his boss.