New enhanced connection troubleshoot for Azure Networking

On 1 March 2023, Microsoft announced that “New enhanced connection troubleshoot” for Azure Network Watcher had reached GA (General Availability). Previously, Azure Network Watcher provided specialised stand-alone tools for network troubleshooting; these have now been consolidated into one place, with additional tests and actionable insights to assist with troubleshooting.

With customers migrating advanced, high-performance workloads to Azure, it’s essential to have better oversight and management of the intricate networks that support these workloads. A lack of visibility can make it challenging to diagnose issues, leaving customers with limited control and feeling trapped in a “black box.” To enhance your network troubleshooting experience, Azure Network Watcher combines these tools with the following features:

  • A unified solution for troubleshooting NSG (Network Security Group), user-defined route, and blocked-port issues
  • Actionable insights with a step-by-step guide to resolve issues
  • Identification of configuration issues impacting connectivity, such as:
      • NSG rules that are blocking traffic
      • Inability to open a socket at the specified source port
      • No servers listening on designated destination ports
      • Misconfigured or missing routes

These enhanced results are not yet available via the portal:

Connection troubleshooting via the portal

The portal will show that there are connectivity issues but will not provide the enhanced detail; that detail is accessible via PowerShell, the Azure CLI and the REST API. Below, I use PowerShell to surface the real reason the connection is failing.
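
For reference, the Azure CLI offers a roughly equivalent check via az network watcher test-connectivity; a minimal sketch, assuming the resource group and VM names used later in this post:

# Azure CLI equivalent of the PowerShell connectivity test shown in the next section
az network watcher test-connectivity --resource-group ConnectivityTest --source-resource Machine1 --dest-resource Machine2 --dest-port 445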

Accessing “enhanced connection troubleshoot” output via PowerShell

I am using the following PowerShell to test the connection between the two machines:

# Get the regional Network Watcher instance and the source/destination VMs
$nw  = Get-AzNetworkWatcher -Location australiaeast
$svm = Get-AzVM -Name Machine1
$dvm = Get-AzVM -Name Machine2

# Test connectivity from Machine1 to Machine2 on port 445 (SMB)
Test-AzNetworkWatcherConnectivity -NetworkWatcher $nw -SourceId $svm.Id -DestinationId $dvm.Id -DestinationPort 445
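
Note that Test-AzNetworkWatcherConnectivity relies on the Network Watcher Agent VM extension being installed on the source VM. Something like the following should confirm it is present and install it if not (the extension name and handler version below are typical values, not taken from this environment):

# Check whether the Network Watcher Agent extension is already on the source VM
Get-AzVMExtension -ResourceGroupName ConnectivityTest -VMName Machine1 |
    Where-Object { $_.Publisher -eq 'Microsoft.Azure.NetworkWatcher' }

# Install it if missing (Windows VMs; Linux VMs use the NetworkWatcherAgentLinux type)
Set-AzVMExtension -ResourceGroupName ConnectivityTest -VMName Machine1 `
    -Location australiaeast `
    -Name 'AzureNetworkWatcherExtension' `
    -Publisher 'Microsoft.Azure.NetworkWatcher' `
    -ExtensionType 'NetworkWatcherAgentWindows' `
    -TypeHandlerVersion '1.4'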

The connectivity test returns the following output:

ConnectionStatus : Unreachable
AvgLatencyInMs   :
MinLatencyInMs   :
MaxLatencyInMs   :
ProbesSent       : 30
ProbesFailed     : 30
Hops             : [
                     {
                       "Type": "Source",
                       "Id": "a49b4961-b82f-49da-ae2c-8470a9f4c8a6",
                       "Address": "10.0.0.4",
                       "ResourceId": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/CONNECTIVITYTEST/providers/Microsoft.Compute/virtualMachines/Machine1",
                       "NextHopIds": [
                         "6c6f06de-ea3c-45e3-8a1d-372624475ced"
                       ],
                       "Issues": [
                         {
                           "Origin": "Local",
                           "Severity": "Error",
                           "Type": "GuestFirewall",
                           "Context": []
                         }
                       ]
                     },
                     {
                       "Type": "VirtualMachine",
                       "Id": "6c6f06de-ea3c-45e3-8a1d-372624475ced",
                       "Address": "172.16.0.4",
                       "ResourceId": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/CONNECTIVITYTEST/providers/Microsoft.Compute/virtualMachines/Machine2",
                       "NextHopIds": [],
                       "Issues": []
                     }
                   ]

As you can see, the issues discovered are explained in more detail; in this case, the local firewall is affecting communication. If we check the local Defender firewall, we can see there is a specific rule blocking this traffic:

Blocked outbound protocols
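
For reference, here is a rough sketch of how you could track down (and remove) a rule like this on the guest itself with the built-in NetSecurity cmdlets; the rule name in the last line is a placeholder:

# Run on the source VM: list enabled outbound block rules that cover TCP 445
Get-NetFirewallRule -Direction Outbound -Action Block -Enabled True |
    Where-Object { ($_ | Get-NetFirewallPortFilter).RemotePort -contains '445' } |
    Format-Table DisplayName, Direction, Action

# Remove the offending rule once identified ('Block SMB outbound' is a placeholder name)
Remove-NetFirewallRule -DisplayName 'Block SMB outbound'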

If we remove the local firewall rule, connectivity is restored:

ConnectionStatus : Reachable
AvgLatencyInMs   : 1
MinLatencyInMs   : 1
MaxLatencyInMs   : 2
ProbesSent       : 66
ProbesFailed     : 0
Hops             : [
                     {
                       "Type": "Source",
                       "Id": "f1b763a1-f7cc-48b6-aec7-f132d3fdadf8",
                       "Address": "10.0.0.4",
                       "ResourceId": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/CONNECTIVITYTEST/providers/Microsoft.Compute/virtualMachines/Machine1",
                       "NextHopIds": [
                         "7c9c103c-44ab-4fd8-9444-22354e5f9672"
                       ],
                       "Issues": []
                     },
                     {
                       "Type": "VirtualMachine",
                       "Id": "7c9c103c-44ab-4fd8-9444-22354e5f9672",
                       "Address": "172.16.0.4",
                       "ResourceId": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/CONNECTIVITYTEST/providers/Microsoft.Compute/virtualMachines/Machine2",
                       "NextHopIds": [],
                       "Issues": []
                     }
                   ]

The enhanced connection troubleshoot can detect 6 fault types:

  • Source high CPU utilisation
  • Source high memory utilisation
  • Source Guest firewall
  • DNS resolution
  • Network security rule configuration
  • User-defined route configuration

The first four faults are returned by the Network Watcher Agent extension for Windows, as demonstrated above. The remaining two faults come from the Azure fabric. As you can see below, when a Network Security Group is misconfigured on the source or destination, our issue returns, but the output clearly shows where the traffic is blocked and which network security group is at fault:

ConnectionStatus : Unreachable
AvgLatencyInMs   :
MinLatencyInMs   :
MaxLatencyInMs   :
ProbesSent       : 30
ProbesFailed     : 30
Hops             : [
                     {
                       "Type": "Source",
                       "Id": "3cbcbdbe-a6ec-454f-ad2e-946d6731278a",
                       "Address": "10.0.0.4",
                       "ResourceId": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/CONNECTIVITYTEST/providers/Microsoft.Compute/virtualMachines/Machine1",
                       "NextHopIds": [
                         "29e33dac-45ae-4ea3-8a9d-83dccddcc0eb"
                       ],
                       "Issues": []
                     },
                     {
                       "Type": "VirtualMachine",
                       "Id": "29e33dac-45ae-4ea3-8a9d-83dccddcc0eb",
                       "Address": "172.16.0.4",
                       "ResourceId": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/CONNECTIVITYTEST/providers/Microsoft.Compute/virtualMachines/Machine2",
                       "NextHopIds": [],
                       "Issues": [
                         {
                           "Origin": "Inbound",
                           "Severity": "Error",
                           "Type": "NetworkSecurityRule",
                           "Context": [
                             {
                               "key": "RuleName",
                               "value": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/ConnectivityTest/providers/Microsoft.Network/networkSecurityGroups/Ma
                   chine2-nsg/SecurityRules/DenyAnyCustom445Inbound"
                             }
                           ]
                         }
                       ]
                     }
                   ]
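
Because the output names the exact NSG and rule, clearing it from the same PowerShell session is straightforward; something along these lines should do it (the resource group, NSG and rule names are taken from the output above):

# Fetch the flagged NSG, drop the deny rule and push the change back to Azure
$nsg = Get-AzNetworkSecurityGroup -ResourceGroupName ConnectivityTest -Name Machine2-nsg
Remove-AzNetworkSecurityRuleConfig -NetworkSecurityGroup $nsg -Name DenyAnyCustom445Inbound
$nsg | Set-AzNetworkSecurityGroup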

In addition to fault detection, IP flow tracing is also part of enhanced connection troubleshoot, providing a hop-by-hop list of the path to a service. An excerpt of a trace to a public storage account is below:

PS C:\temp> Test-AzNetworkWatcherConnectivity -NetworkWatcher $nw -SourceId $svm.Id -DestinationAddress https://announcementtest.blob.core.windows.net/test1 -DestinationPort 443

ConnectionStatus : Reachable
AvgLatencyInMs   : 1
MinLatencyInMs   : 1
MaxLatencyInMs   : 1
ProbesSent       : 66
ProbesFailed     : 0
Hops             : [
                     {
                       "Type": "Source",
                       "Id": "23eb09fd-b5fa-4be1-83f2-caf09d18ada0",
                       "Address": "10.0.0.4",
                       "ResourceId": "/subscriptions/xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx/resourceGroups/CONNECTIVITYTEST/providers/Microsoft.Compute/virtualMachines/Machine1",
                       "NextHopIds": [
                         "78f3961c-9937-4679-97a7-4a19f4d1232a"
                       ],
                       "Issues": []
                     },
                     {
                       "Type": "PublicLoadBalancer",
                       "Id": "78f3961c-9937-4679-97a7-4a19f4d1232a",
                       "Address": "20.157.155.128",
                       "NextHopIds": [
                         "574ad521-7ab7-470c-b5aa-f1b4e6088888",
                         "e717c4bd-7916-45bd-b3d1-f8eecc7ed1e3",
                         "cbe6f6a6-4281-402c-a81d-c4e3d30d2247",
                         "84769cde-3f92-4134-8d48-82141f2d9bfd",
                         "aa7c2b73-0892-4d15-96c6-45b9b033829c",
                         "1c3e3043-98f2-4510-b37f-307d3a98a55b",
                         "b97778cb-9ece-4e87-bf6d-71b90fac3847",
                         "cb92d16d-d4fe-4233-b958-a4d3dbe78303",
                         "ec9a2753-3a60-4fce-9d92-7dbbc0d0219d",
                         "df2b1a3e-6555-424c-8e48-5cc0feba3623"
                       ],
                       "Issues": []
                     },
                     {
                       "Type": "VirtualNetwork",
                       "Id": "574ad521-7ab7-470c-b5aa-f1b4e6088888",
                       "Address": "10.124.144.2",
                       "NextHopIds": [],
                       "Issues": []
                     },
                     {
                       "Type": "VirtualNetwork",
                       "Id": "e717c4bd-7916-45bd-b3d1-f8eecc7ed1e3",
                       "Address": "10.124.146.2",
                       "NextHopIds": [],
                       "Issues": []
                     },

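If you want to work with the trace programmatically rather than read the raw output, you can capture the result and filter the hops; a small sketch, assuming the returned object exposes Hops as a collection of hop objects (it is rendered as JSON above):

# Capture the trace and list any hops that reported issues
$result = Test-AzNetworkWatcherConnectivity -NetworkWatcher $nw -SourceId $svm.Id `
    -DestinationAddress 'https://announcementtest.blob.core.windows.net/test1' -DestinationPort 443

$result.Hops | Where-Object { $_.Issues.Count -gt 0 } | Select-Object Type, Address
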
Centralising the troubleshooting tools under one command is a great enhancement in itself; the added visibility into configuration and system performance makes this a valuable addition to your troubleshooting toolbox.

