
I'm trying to set up kernel debugging on a physical desktop to test the new WinDbg Preview. Here are the steps:

  1. I'm using a desktop with Intel DG41TY board.

  2. Installed Windows 10 Pro build 1803. (Off USB, created using media creation tool.)

  3. The board has a supported network card:

    PCI\VEN_10EC&DEV_8168&SUBSYS_D6128086&REV_03
    
  4. Prepared the debuggee for kernel debugging over an Ethernet cable by running the following from an elevated cmd (where 192.168.1.29 is the debugger machine's IP address):

    bcdedit /debug on
    
    bcdedit /dbgsettings net hostip:192.168.1.29 port:50000 key:1.2.3.4
    
    bcdedit /set "{dbgsettings}" busparams 3.0.0
    

    enter image description here

    I applied all of this to the default boot entry because that PC was not supposed to have a monitor, mouse or keyboard, so it had to boot into debugging mode by default. (For further control I was planning to remote-desktop into it.)

  5. Connected it via an Ethernet cable to my Windows 10 Pro laptop.

  6. On the debugger laptop, I was using the new Windows Store version of WinDbg Preview:

    Debugger client version: 1.0.1805.17002
    Debugger engine version: 10.0.17674.1000
    
  7. In WinDbg I set up network kernel debugging as such:

    I'm not sure what the new "Target" field means. I assumed it was the target machine (the debuggee), so I gave it that desktop's IP address:

    enter image description here

  8. Then I rebooted the debuggee desktop ... and nothing happened. Windows 10 hung during the boot process. I assumed that I hadn't set something right on the debuggee side, closed WinDbg and tried to reboot the debuggee machine. But it hung during boot again.

  9. At this point I disconnected the Ethernet cable and had to reboot the desktop using its power button. It failed one more time, and then the blue Windows recovery menu came up saying that it had failed to recover automatically and needed a reset. (I can't remember the exact wording.)

  10. So about 2 hours later, it recovered and that desktop (debuggee) can now boot, but it wiped out everything that I installed on it. (I can recover all of my installed software since it was a brand new installation.)

So I'm wondering whether I was doing something wrong, and whether anyone else has dealt with the same issue?

PS. I'm just trying to avoid wasting 2+ hrs for such a reset in the future.


EDIT: I was able to replace an HDD in this test PC with an SSD, then reinstall Windows 10 from scratch, and repeat the steps I described above. When I enabled kernel network debugging, that PC started booting visibly slower (about 2 minutes vs. original 15-20 sec.)

After that as soon as I connected an Ethernet cable from that test PC to my Windows 10 laptop with WinDbg Preview waiting for connection, the booting process never completed. WinDbg Preview never connected to that remote PC either.

After a while I disconnected the Ethernet cable and forced the reboot by holding down the power button. This time the boot process froze up after about 2-3 minutes of seeing the spinning dots. Here's the exact screen:

enter image description here

Then when I force-rebooted it again, it showed:

enter image description here

then:

enter image description here

and eventually:

enter image description here

(Luckily this time I created a restore point before doing the tests above. Restore points were off by default in Windows 10. So after clicking "Advanced options" I was able to restore from a restore point.)

The C:\Windows\System32\LogFiles\Srt\SrtTrail.txt file mentioned in the screenshot above contains the following:

Startup Repair diagnosis and repair log
---------------------------
Number of repair attempts: 1

Session details
---------------------------
System Disk = \Device\Harddisk0
Windows directory = C:\Windows
AutoChk Run = 0
Number of root causes = 1

Test Performed: 
---------------------------
Name: Check for updates
Result: Completed successfully. Error code =  0x0
Time taken = 0 ms

Test Performed: 
---------------------------
Name: System disk test
Result: Completed successfully. Error code =  0x0
Time taken = 0 ms

Test Performed: 
---------------------------
Name: Disk failure diagnosis
Result: Completed successfully. Error code =  0x0
Time taken = 16 ms

Test Performed: 
---------------------------
Name: Disk metadata test
Result: Completed successfully. Error code =  0x0
Time taken = 296 ms

Test Performed: 
---------------------------
Name: Disk metadata test
Result: Completed successfully. Error code =  0x0
Time taken = 16 ms

Test Performed: 
---------------------------
Name: Target OS test
Result: Completed successfully. Error code =  0x0
Time taken = 0 ms

Test Performed: 
---------------------------
Name: Volume content check
Result: Completed successfully. Error code =  0x0
Time taken = 63 ms

Test Performed: 
---------------------------
Name: Boot manager diagnosis
Result: Completed successfully. Error code =  0x0
Time taken = 0 ms

Test Performed: 
---------------------------
Name: System boot log diagnosis
Result: Completed successfully. Error code =  0x0
Time taken = 15 ms

Root cause found: 
---------------------------
Boot critical file c:\efi\microsoft\boot\resources\custom\bootres.dll is corrupt.

Repair action: File repair
Result: Failed. Error code =  0x57
Time taken = 2328 ms

---------------------------
---------------------------

Additionally, if anyone at Microsoft wants me to email you the entire C:\Windows\System32\LogFiles\Srt folder, I can do so upon request.

c00000fd
  • If you're trying to avoid those extra hours, make sure you have a backup handy. And I mean it, no kidding and not meant in any condescending way. – 0xC0000022L Jun 15 '18 at 19:34
  • @0xC0000022L: Sure. Thanks. I already put in an SSD into that PC to make it boot quicker. Btw, what type of backup is the fastest on Win10? – c00000fd Jun 15 '18 at 19:41
  • Personally I am using solutions from both Acronis and Paragon, but you should be able to get away even with the built-in ("Windows 7") backup or free-of-charge solutions such as CloneZilla (Linux-based). Unfortunately I never tried Ethernet debugging with Windows. So far using Firewire, USB or Serial was sufficient and then of course VirtualKD. But none of that may be an option for you, because what you can do in a VM is limited. Windows "hanging" up may make sense when the system brings up the NIC. Did you ever see a successful connection at all? I presume you know how that looks. – 0xC0000022L Jun 15 '18 at 19:48
  • @0xC0000022L: No, it never connected. I posted additional details above. – c00000fd Jun 17 '18 at 03:25
  • I don't know about the new WinDbg from the store, but the old one had an option to sync the connection. Could you try to use that? Also, I typically turn off the graphical boot logo (sos yes) to see what gets loaded. I'd also recommend to enable /bootdebug - at least temporarily - until you figure out what's wrong (it's not strictly needed for ordinary kernel mode debugging, though). – 0xC0000022L Jun 17 '18 at 12:21
  • Wait a minute. I just stumbled over this again and noticed the odd value you have for the key in your screenshot. Have you ever even used kdnet to generate one? Sorry, this is probably no longer of interest, but chances are that's the missing step (and I missed it, too, when commenting originally 4.5 years back). – 0xC0000022L Jan 16 '23 at 08:42
  • @0xC0000022L that's been a really long time ago. I don't remember. Follow this tutorial - it works for me. I do it all the time. – c00000fd Jan 16 '23 at 08:51
  • Well, it describes it for a VM on the same host, though (with kdnet less of a discrepancy than with other communication protocols, however). Your use case back then appears to have been against real hardware. And I don't need this, I just stumbled over this, because the SE algorithm pushed it back to the top and I noticed that I commented back then, but missed out on that small detail regarding kdnet. Side-note: I almost wanted to link to that same tutorial (in my previous comment), but didn't because it was for VMs – 0xC0000022L Jan 16 '23 at 08:55
  • @0xC0000022L yes, as you reminded me, I was trying to set it up on physical hardware. Tbh, it was so long ago that I don't remember whether I succeeded or not. What I do remember is that I switched away from using a COM port for kernel debugging because it makes the process slow af. That connection is also very unstable. So whenever I can (sometimes it's impossible, if you have to debug a real hardware device), I set it up in a VM via the new WinDbg Next and its "Net" connection type. The experience is so much better than the old-school COM port. – c00000fd Jan 16 '23 at 09:28
  • Absolutely agree, COM was always the worst option (but initially also the only one, IIRC). They used to offer Firewire in between (now retired), which I loved for its speed, but it required supported hardware (the Firewire adapters). After that came USB, which was (and still is) sort of okay, but the best by far regarding speed appears to be Ethernet or the third-party solution I mentioned: VirtualKD, which uses host/guest communication for kernel-debugging VMs (so that'd be a VM-only solution as opposed to Ethernet). – 0xC0000022L Jan 16 '23 at 09:33
  • @0xC0000022L yes for sure. Or, if you have access to a JTAG debugger that's the best. – c00000fd Jan 16 '23 at 09:48
  • Really, for Windows? Never seen that in action. For Linux and other embedded OSs (and boot loaders), sure. But not for Windows. I'm really intrigued now. If you find the time, perhaps you can write your own Q&A (emphasis on "and answer") showcasing that. But interesting that this seems possible even with Windows. – 0xC0000022L Jan 16 '23 at 09:53
  • @0xC0000022L: yes. Although it's been a while since I did it. I am not sure if you'd be able to fit Win10/11 on an IOT device these days. Back then it was mostly for WinXP embedded. And those JTAG hardware debuggers were proprietary too. – c00000fd Jan 16 '23 at 10:47
  • At least there is an edition of Windows 10 for the RPi, so could be. – 0xC0000022L Jan 16 '23 at 10:48

1 Answer


You need to use the correct Ethernet cable. If you are connecting the target directly to your debugger host, you need a crossover cable, not a regular Ethernet cable. Alternatively, if you don't have a crossover cable, connect the two machines through a switch or hub. If possible, I'd also check that TCP/IP communication works between the two machines before enabling debugging.

In short, check the following:

  1. On the debugger host, run ipconfig and identify the IP address of the Ethernet adapter you will be using.
  2. On the target computer, make sure ping -4 hostip works. You may need to allow ICMP traffic through the firewall for this to work.
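
Ping only exercises ICMP. As a rough additional check, the sketch below (Python, standard library only) listens on the debugger host for any UDP datagram on the debug port, on the assumption that KDNET sends UDP to hostip:port. The helper names open_listener and wait_for_datagram are made up for this illustration, and the payload is not interpreted, so a hit only shows that something reached the port.

```python
import socket
from typing import Optional, Tuple


def open_listener(port: int, bind_addr: str = "0.0.0.0") -> socket.socket:
    """Bind a UDP socket on the debugger host (port 0 lets the OS pick a port)."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sock.bind((bind_addr, port))
    return sock


def wait_for_datagram(sock: socket.socket, timeout: float) -> Optional[Tuple[str, int]]:
    """Return (sender_ip, payload_length) of the first datagram, or None on timeout."""
    sock.settimeout(timeout)
    try:
        data, (sender_ip, _sender_port) = sock.recvfrom(65535)
    except socket.timeout:
        # Nothing arrived in time: cable, IP settings or firewall are suspects.
        return None
    return sender_ip, len(data)
```

With WinDbg closed (otherwise the port is already in use), calling wait_for_datagram(open_listener(50000), 120.0) on the host while the target boots should return the target's address if its packets get through. 50000 is the port from the bcdedit /dbgsettings line in the question.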

In addition, on the debugger host you will need to allow the debugging traffic through Windows Firewall (or any third-party firewall you have installed):

When you first attempt to establish a network debugging connection, you might be prompted to allow the debugging application (WinDbg or KD) access through the firewall. Client versions of Windows display the prompt, but Server versions of Windows do not display the prompt. You should respond to the prompt by checking the boxes for all three network types: domain, private, and public. If you do not get the prompt, or if you did not check the boxes when the prompt was available, you must use Control Panel to allow access through the firewall. Open Control Panel > System and Security and select Allow an app through Windows Firewall. In the list of applications, locate Windows GUI Symbolic Debugger and Windows Kernel Debugger. Use the check boxes to allow those two applications through the firewall. Restart your debugging application (WinDbg or KD).
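
If the per-application check boxes are awkward to reach, an alternative I would consider is an inbound rule for the debug port itself. This is only a sketch: it assumes the KDNET traffic is UDP on the configured port, and kdnet_firewall_rule is a hypothetical helper that merely builds the netsh command line without running it.

```python
from typing import List


def kdnet_firewall_rule(port: int, rule_name: str = "KDNET-debug") -> List[str]:
    """Build (not run) a netsh command that allows inbound UDP on the debug port."""
    if not 1 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    # The rule name is kept free of spaces so no extra quoting is needed.
    return [
        "netsh", "advfirewall", "firewall", "add", "rule",
        f"name={rule_name}",
        "dir=in",
        "action=allow",
        "protocol=UDP",
        f"localport={port}",
    ]
```

Joining the result of kdnet_firewall_rule(50000) with spaces gives a line that can be pasted into an elevated cmd on the debugger host. The rule can be deleted again once debugging works.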

chentiangemalc