DOing Harm

Apr 18, 2022

25 mins read

There’s this thing called Windows Delivery Optimization which allows “you to get Windows updates and Microsoft Store apps from sources in addition to Microsoft, like other PCs on your local network, or PCs on the internet that are downloading the same files.”

This makes a lot of sense for traffic congestion and low-bandwidth environments. If you’re in a remote location, the ability to have a single PC download a 1GB update and distribute it to the other 100 PC’s on your LAN is a really cool concept. This modern functionality started appearing in Win10 1511+ and piqued my interest way back in 2019.

When I last messed with this functionality, it required a mess of PowerShell and a lot of guessing. Modern versions of Windows now have better introspection, so this is ripe for investigation.

This is awesome! I wonder if I can…

  • Take a look at how this works internally?
  • Use this functionality to identify all the public IP’s of hosts on the internet missing a security patch?

After all, why spray the internet with sploits when you can shoot fish in a barrel.

Getting Set Up

Microsoft provides a number of free test VM’s for a variety of VM platforms here. We’re gonna grab a copy of “MSEdge on Win10 (x64) Stable 1809” and go from there.

Ideally, I want 2 VM’s with the following attributes

Seeder

  • This VM is fully patched
  • This VM has cached the update files

Leech

  • This VM is mostly patched
  • This VM is missing at least one update that the seeder has
  • This VM is prevented from downloading updates from the internet
  • This VM prioritizes P2P updates from the seeder VM

Setting up seeder

The DO Reference has a number of registry keys we can mess with. These will be more important for the leech VM, but for the seeder we want to figure out how to cache an update for a long period of time and make sure that we’re seeding it.

The configurations are found under Configuration\Policies\Administrative Templates\Windows Components\Delivery Optimization in Group Policy.

A default configuration looks like this:

[Image: default settings]

Delivery Optimization P2P over LAN appears to be enabled already and should work without issue, but we’re gonna flip some switches to give us some more debugging control.

Let’s make the following changes:

Setting | Value
Minimum Peer Caching Content File Size (in MB) | 1
Absolute Max Cache Size (in GB) | 10
Max Cache Age (in seconds) | 0
Allow uploads while the device is on battery while under set Battery level (percentage) | 0
Minimum disk size allowed to use Peer Caching (in GB) | 1
Minimum RAM capacity (inclusive) required to enable use of Peer Caching (in GB) | 1
Monthly Upload Data Cap (in GB) | 0

In theory, this should cache all update chunks >1MB in size for an unlimited period of time as long as there’s more than 1GB of RAM available, have no peer upload limitations, and run as long as the battery measures higher than 1%.

Welp. Let’s find out.

I save settings, shutdown, change the VM network interface to allow internet connectivity, and let Windows update run…

[Image: cached files]

Running Get-DeliveryOptimizationStatus | FT shows that we’ve successfully downloaded a number of files and have a ton of FileId values.

This is probably a step in the right direction? I throw Wireshark on the seeder box as well and shut it down. On to the leecher.

Setting up leecher

Again, we start by mucking about in group policy, but this time we want to try to force this machine to only download from a LAN peer and ignore any HTTP or CDN peers.

Setting | Value
Delay background download from http (in secs) | 4294967295
Delay Foreground download from http (in secs) | 4294967295
Download Mode | LAN (1)
Maximum Download Bandwidth (in KB/s) | 0
Maximum Background Download Bandwidth (percentage) | 0
Maximum Download Bandwidth (percentage) | 0
Maximum Foreground Download Bandwidth (percentage) | 0

In theory, this should download updates with HTTP blended with peering behind NAT, unlimited bandwidth, and delay checking HTTP/CDN download sources for 4294967295 seconds. Hopefully this is plenty of time, but who knows, I’ve procrastinated this research since 2019…

Next we download a copy of Wireshark on our desktop hosting these VM’s and set the leecher network adapter to only support host-only networking so we can get the installer onto the VM without letting it talk to our seeder VM, or any of the other Windows machines I have on my network.

Then we shut down, and create a backup OVA of the leecher VM.

Did it work?

After booting both VM’s attached to an internal network with no internet access, I confirmed network connectivity and attempted to trigger an update and… nothing. Naturally, it looks like internet is required for peer discovery. This corresponds with the documentation which states:

It relies on the cloud service for peer discovery, resulting in a list of peers and their IP addresses. Client devices then connect to their peers to obtain download files over TCP/IP.

So let’s flip these VM’s to a bridged adapter state and try again…

[Image: unexpected update]

We see an update! Unfortunately, this isn’t being pulled from the seeder VM. It does however appear that the HTTP delay is working because this download hung at 10% forever.

Eventually, I realized I’d stumbled into a stupid state with VirtualBox and bridged adapters where the 2 VM’s could talk to every other device on the network except for each other. They couldn’t even ping each other.

So we take a moment to configure a NAT Network in VirtualBox and swap both VM’s NIC to use it. Both VM’s should have internet access, but be NAT’d within the 10.0.2.0/24 range and be able to talk to each other.

This… doesn’t work either. I have no idea what I’m doing wrong.

Windows Store

As stated at the top of this blog, DO supports Microsoft Store apps as well. Let’s grab a big file (Roblox) from the store on seeder and see if we can get leecher to pick it up. Maybe there’s a difference in behavior.

And indeed! There is! With very little fiddling about, we can see that leecher picks up the Roblox file from seeder. Note the “LanConnectionCount”.

[Image: powershell roblox]

Perhaps more importantly, I captured the conversation in Wireshark from the seeder side. The leecher begins the conversation by connecting to TCP port 7680 on seeder.

leecher –> seeder

0000   0e 53 77 61 72 6d 20 70 72 6f 74 6f 63 6f 6c 00   .Swarm protocol.
0010   00 00 00 00 10 00 00 d9 a5 89 6d 90 26 67 c8 75   ..........m.&g.u
0020   bc 5d 7c fe 87 32 36 f3 9c e5 a0 1e 11 f2 7b fe   .]|..26.......{.
0030   5f 18 fe b7 fe 23 f4 ef 56 43 9f a1 87 67 49 bf   _....#..VC...gI.
0040   98 4b 3e 2f 73 40 47 00 00 00 00                  .K>/s@G....

leecher <– seeder

0000   0e 53 77 61 72 6d 20 70 72 6f 74 6f 63 6f 6c 00   .Swarm protocol.
0010   00 00 00 00 10 00 00 d9 a5 89 6d 90 26 67 c8 75   ..........m.&g.u
0020   bc 5d 7c fe 87 32 36 f3 9c e5 a0 1e 11 f2 7b fe   .]|..26.......{.
0030   5f 18 fe b7 fe 23 f4 3c aa 43 f5 49 5f a7 4b 97   _....#.<.C.I_.K.
0040   59 7c 7c 69 4f 1d a2 00 00 00 00                  Y||iO......

leecher –> seeder

0000   00 00 00 12 05 00 00 00 00 00 00 00 00 00 00 00   ................
0010   00 00 00 00 00 00                                 ......

leecher <– seeder

0000   00 00 00 12 05 ff ff ff ff ff ff ff ff ff ff ff   ................
0010   ff ff ff ff ff fc                                 ......

leecher –> seeder

0000   00 00 00 01 02                                    .....

leecher <– seeder

0000   00 00 00 01 01                                    .....

leecher –> seeder

0000   00 00 00 0d 06 00 00 00 85 00 00 00 00 00 0b b2   ................
0010   98                                                .

So that’s neat! We’ve found some sort of custom protocol: Swarm protocol.

You can grab the PCAP and look around yourself here: roblox_swarm.pcapng

Taking the Protocol apart

Let’s grab the first packet from the swarm protocol and throw it at the server with netcat to see if we can replay packets to get a response.

# -n drops echo's trailing newline. There is no space before each line-continuation
# backslash, so the quoted strings are joined without stray 0x20 bytes between them.
echo -n -e "\x0e\x53\x77\x61\x72\x6d\x20\x70\x72\x6f\x74\x6f\x63\x6f\x6c\x00"\
"\x00\x00\x00\x00\x10\x00\x00\xd9\xa5\x89\x6d\x90\x26\x67\xc8\x75"\
"\xbc\x5d\x7c\xfe\x87\x32\x36\xf3\x9c\xe5\xa0\x1e\x11\xf2\x7b\xfe"\
"\x5f\x18\xfe\xb7\xfe\x23\xf4\xa0\x5b\xec\xf6\x12\x66\xc0\x41\xbb"\
"\x0b\xc4\xba\xcf\x17\xab\x61\x00\x00\x00\x00" \
| nc -vvv -q 1 192.168.8.242 7680 | xxd

Luckily, this works! We get a response from the server.

[Image: netcat replay]

With a reasonable level of confidence we can assert that this packet doesn’t have any timestamps or other cryptographic requirements in order to elicit a response.

We can also reasonably assert that the first 16 bytes are probably a static protocol header, albeit kinda big for a protocol header but w/e. String + 00 (NULL Terminated) is a fairly common pattern, as is keeping to a size of 2, 4, 8, or 16 bytes.

00000000: 0e53 7761 726d 2070 726f 746f 636f 6c00  .Swarm protocol.

As I have a fairly high level of confidence I understand this protocol header except for the first byte 0E, let’s try the values on either side 0D/0F.

echo -n -e "\x0d\x53\x77\x61\x72\x6d\x20\x70\x72\x6f\x74\x6f\x63\x6f\x6c\x00"\
"\x00\x00\x00\x00\x10\x00\x00\xd9\xa5\x89\x6d\x90\x26\x67\xc8\x75"\
"\xbc\x5d\x7c\xfe\x87\x32\x36\xf3\x9c\xe5\xa0\x1e\x11\xf2\x7b\xfe"\
"\x5f\x18\xfe\xb7\xfe\x23\xf4\xa0\x5b\xec\xf6\x12\x66\xc0\x41\xbb"\
"\x0b\xc4\xba\xcf\x17\xab\x61\x00\x00\x00\x00" \
| nc -vvv -q 1 192.168.8.242 7680 | xxd

This results in the connection being closed with no response.

echo -n -e "\x0f\x53\x77\x61\x72\x6d\x20\x70\x72\x6f\x74\x6f\x63\x6f\x6c\x00"\
"\x00\x00\x00\x00\x10\x00\x00\xd9\xa5\x89\x6d\x90\x26\x67\xc8\x75"\
"\xbc\x5d\x7c\xfe\x87\x32\x36\xf3\x9c\xe5\xa0\x1e\x11\xf2\x7b\xfe"\
"\x5f\x18\xfe\xb7\xfe\x23\xf4\xa0\x5b\xec\xf6\x12\x66\xc0\x41\xbb"\
"\x0b\xc4\xba\xcf\x17\xab\x61\x00\x00\x00\x00" \
| nc -vvv -q 1 192.168.8.242 7680 | xxd

This results in the connection being closed with no response.

Sometimes the first few bytes of a protocol correspond to a protocol version, so it’s usually reasonable to try values on both sides just to quickly see what will happen.

So where do we go from here? Well, there’s 59 more bytes to figure out. Let’s do a lil differential analysis against the response since it looked very similar.

[Image: hxd diff]

The client request and server response are identical for the first 55 bytes. Knowing that the protocol header is 16 bytes, that leaves 39 bytes that are unknown and appear in the client request and response.
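The same comparison can be done without a hex editor. A minimal sketch, using the two handshake packets from the capture above; the helper name common_prefix_len is mine and not part of any tool mentioned here:

```python
# Handshake bytes copied from the capture above.
request = bytes.fromhex(
    "0e537761726d2070726f746f636f6c00"
    "00000000100000d9a5896d902667c875"
    "bc5d7cfe873236f39ce5a01e11f27bfe"
    "5f18feb7fe23f4ef56439fa1876749bf"
    "984b3e2f73404700000000"
)
response = bytes.fromhex(
    "0e537761726d2070726f746f636f6c00"
    "00000000100000d9a5896d902667c875"
    "bc5d7cfe873236f39ce5a01e11f27bfe"
    "5f18feb7fe23f43caa43f5495fa74b97"
    "597c7c694f1da200000000"
)


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Count how many leading bytes two byte strings share."""
    count = 0
    for x, y in zip(a, b):
        if x != y:
            break
        count += 1
    return count


shared = common_prefix_len(request, response)
print(len(request), shared, shared - 16)
```

For these two packets it prints 75 55 39: the total length, the shared prefix, and the shared bytes left over once the 16 byte header is subtracted.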

Using some quick Python because I’m lazy, I replaced each byte of the request in turn with FF as a marker and checked whether the server closed the connection or returned a different response. Bytes that can be swapped without changing the reply stay as FF, which leaves a minimal client payload that still elicits an identical response.

import socket

# Address of the seeder VM running Delivery Optimization
server_address = ('192.168.8.242', 7680)
print('connecting to {} port {}'.format(*server_address))

# Known-good 75 byte client handshake (bytes.fromhex ignores the spaces)
message = bytes.fromhex(
    '0E 53 77 61 72 6D 20 70 72 6F 74 6F 63 6F 6C 00'
    '00 00 00 00 10 00 00 D9 A5 89 6D 90 26 67 C8 75'
    'BC 5D 7C FE 87 32 36 F3 9C E5 A0 1E 11 F2 7B FE'
    '5F 18 FE B7 FE 23 F4 A0 5B EC F6 12 66 C0 41 BB'
    '0B C4 BA CF 17 AB 61 00 00 00 00'
)

# Server reply to the unmodified handshake
expected_reply = '0e537761726d2070726f746f636f6c0000000000100000d9a5896d902667c875bc5d7cfe873236f39ce5a01e11f27bfe5f18feb7fe23f43caa43f5495fa74b97597c7c694f1da200000000'

minimal_payload = []

for i in range(len(message)):
    temp = bytearray(message)
    temp[i] = 0xFF  # overwrite one byte with the marker value
    try:
        # Fresh connection per probe; the context manager closes the socket
        with socket.create_connection(server_address, timeout=5) as sock:
            sock.sendall(bytes(temp))
            reply = sock.recv(75).hex()
    except OSError:
        reply = ''  # a reset or timeout counts as a rejected handshake
    if reply == expected_reply:
        # Byte is not validated by the server, keep the marker
        minimal_payload.append(0xFF)
    else:
        # Byte matters, keep the original value
        minimal_payload.append(message[i])

print(bytes(minimal_payload).hex())

The result of which is:

00000000: 0eff ffff ffff ffff ffff ffff ffff ffff  ................
00000010: ffff ffff ffff ffd9 a589 6d90 2667 c875  ..........m.&g.u
00000020: bc5d 7cfe 8732 36f3 9ce5 a01e 11f2 7bfe  .]|..26.......{.
00000030: 5f18 feb7 fe23 f4ff ffff ffff ffff ffff  _....#..........
00000040: ffff ffff ffff ffff ffff ff              ...........

Listen, I’mma be honest with you. I was not expecting that. The Swarm protocol doesn’t matter at all for this purpose apparently? Only 0e and a run of the following 32 bytes seem to matter:

D9 A5 89 6D 90 26 67 C8 75 BC 5D 7C FE 87 32 36 F3 9C E5 A0 1E 11 F2 7B FE 5F 18 FE B7 FE 23 F4

What do these 32 bytes mean? Well, I could give you like… a million guesses. Luckily, chances are pretty high that in the process of sending all of those invalid packets I angered the process gods and they’ll happily yell at me in the logs exactly what I want to know.

Running a bit of PowerShell on the seeder VM we get some interesting output:

Get-DeliveryOptimizationLogs | where {$_.LevelName -eq "Warning"} | Select Message

[Image: infohash logs]

Neat, so now we know those 32 bytes are an infohash.

That’s a cool little command I used isn’t it? Get-DeliveryOptimizationLogs. Throwing ProcMon at the process quickly tells us where the service is writing the logs to:

C:\Windows\ServiceProfiles\NetworkService\AppData\Local\Microsoft\Windows\DeliveryOptimization which contains 3 folders

  • Cache
  • Logs
  • State

Inside the logs folder are a variety of .etl files with prefixes dosvc and domgmt. We can easily view and parse these with PerfView, which is a fantastic tool if you’ve ever needed to debug a Windows application without access to source code, but I digress. We’ll take a closer look at the logs in a moment.

What’s in the Cache folder?

PS C:\Windows\ServiceProfiles\NetworkService\AppData\Local\Microsoft\Windows\DeliveryOptimization\Cache> tree /F
Folder PATH listing for volume Windows 10
Volume serial number is B4A6-FEC6
C:.
├───3839ed3f9a311ea4b80a7ef69c2d0d7ff81095f3
│       3839ed3f9a311ea4b80a7ef69c2d0d7ff81095f3.pieceshash
│       f520b585233003b78b54ed99bba40252b9bf646ed8dfae63bb8e8e0414a30046
│
├───503b057c1f8779829ec231c7f02f730d1546b24f
├───690b54039b9273d7ecb32d6b93e61c3fba97e895
├───698672b2e586d1a7bbed01cb0977d66404310d1e
│       698672b2e586d1a7bbed01cb0977d66404310d1e.pieceshash
│       78257372ab623ae7f8f06408e47ba6b7476b517196e2ecd5a7155f53be587dff
│
├───79457aec037ab97ae2798ea236d49e234e0934b0
├───87099b6535d3a7698167ced4c25913da64f83410
│       87099b6535d3a7698167ced4c25913da64f83410.pieceshash
│       911a81e4b919326203a604a39a16c5ceab1587d10df16d65334b073dcdb78f14
│
├───9b80960d74044c50960b60224137a136610dc16e
│       4149cc82a6b2a47b55e881b8ea5c5f8eddf68135342d50999726690e6915cc58
│       9b80960d74044c50960b60224137a136610dc16e.pieceshash
│
└───f662e498db811941a97d4c08e47641069c14cce7
        d9a5896d902667c875bc5d7cfe873236f39ce5a01e11f27bfe5f18feb7fe23f4
        f662e498db811941a97d4c08e47641069c14cce7.pieceshash

Do you see what I see? Look closer at the last entry in the tree: d9a5896d902667c875bc5d7cfe873236f39ce5a01e11f27bfe5f18feb7fe23f4

That’s our infohash! It’s accompanied in the folder by a .pieceshash file with the same name as the folder itself.

Let’s take a peek at f662e498db811941a97d4c08e47641069c14cce7.pieceshash

{
    "MajorVersion": 1,
    "MinorVersion": 0,
    "HashOfHashes": "2aWJbZAmZ8h1vF18/ocyNvOc5aAeEfJ7/l8Y/rf+I/Q=",
    "ContentLength": 140227224,
    "PieceSize": 1048576,
    "Pieces": [
        "UBwuCmq+MW5OCVcODhrkc+z9bB/UeEuSQ7AVQ98x9fc=",
        "ZJfCAaca3Fe8INWN3ycKgbg2+yWEEcdudduvuokRYoI=",
        "dx3Tz/IoNbdJnw1BHkO5GKx7U4HGrH61gaqaZmhh2bM=",
        "4JDQgVQwok9GYka7Ngf/mSUAoptPmfFdP9e4BZXm2ec=",
        "oxibeUKZ3bqOKc9PKZHt1jYvPmOrOWLRyvSZHicSEA0=",
        "...snip..."
    ]
}

Descriptive key names are super useful and we see some Base64 Encoding there. Taking HashOfHashes we can do the following…

echo "2aWJbZAmZ8h1vF18/ocyNvOc5aAeEfJ7/l8Y/rf+I/Q=" | base64 -d | xxd

which returns

00000000: d9a5 896d 9026 67c8 75bc 5d7c fe87 3236  ...m.&g.u.]|..26
00000010: f39c e5a0 1e11 f27b fe5f 18fe b7fe 23f4  .......{._....#.

That’s our infohash again!

We also see PieceSize and an array of Pieces. We can guess that each entry is probably a base64 encoded SHA256 hash of one chunk of the file. We can quickly check this with:

head -c 1048576 d9a5896d902667c875bc5d7cfe873236f39ce5a01e11f27bfe5f18feb7fe23f4 \
| sha256sum | xxd -r -p | base64

which returns the same value as the first Piece : UBwuCmq+MW5OCVcODhrkc+z9bB/UeEuSQ7AVQ98x9fc=.
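The same check can be run across every piece instead of just the first. A minimal sketch, assuming the .pieceshash JSON has the shape shown above and that every entry in Pieces is the base64 SHA-256 of one PieceSize chunk (only confirmed above for the first piece). The function names verify_pieces and infohash_hex are mine:

```python
import base64
import hashlib


def verify_pieces(metadata: dict, content: bytes) -> list:
    """Return the indexes of pieces whose SHA-256 does not match the metadata."""
    size = metadata["PieceSize"]
    mismatched = []
    for index, expected in enumerate(metadata["Pieces"]):
        chunk = content[index * size:(index + 1) * size]
        digest = base64.b64encode(hashlib.sha256(chunk).digest()).decode()
        if digest != expected:
            mismatched.append(index)
    return mismatched


def infohash_hex(metadata: dict) -> str:
    """Decode HashOfHashes to the hex form used as the cache file name."""
    return base64.b64decode(metadata["HashOfHashes"]).hex()
```

Load the .pieceshash file with json.load, read the cached content file as bytes, and pass both to verify_pieces; an empty list means every listed piece matched. For the file above, infohash_hex returns the d9a5896d… value seen in the handshake.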

Quick Review

Delivery Optimization listens on TCP port 7680 and a client sends a payload of 75 bytes to start the conversation.

  • The 1st byte must be 0e.
  • The next 15 bytes “should” be “Swarm protocol” followed by a null byte: 53 77 61 72 6d 20 70 72 6f 74 6f 63 6f 6c 00
    • These don’t appear to be validated and can be anything.
  • The 24th byte starts a section of 32 bytes which is the infohash used to identify the package the client wants to download.

We have 2 blocks of the protocol which are still unknown.

  • Bytes 17 –> 23
  • Bytes 56 –> 75
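The layout so far can be written down as a small parser. This is a sketch of the review above and nothing more: the field names start, header, unknown1, and unknown2 are my own labels, and the byte ranges are the 1-indexed ones from the lists.

```python
from typing import NamedTuple


class Handshake(NamedTuple):
    start: int        # byte 1, must be 0x0e
    header: bytes     # bytes 2-16, "Swarm protocol" + NUL in real traffic
    unknown1: bytes   # bytes 17-23, meaning not yet known
    infohash: bytes   # bytes 24-55, identifies the requested content
    unknown2: bytes   # bytes 56-75, meaning not yet known


def parse_handshake(data: bytes) -> Handshake:
    """Slice a 75 byte client handshake into the fields identified so far."""
    if len(data) != 75:
        raise ValueError(f"expected 75 bytes, got {len(data)}")
    return Handshake(
        start=data[0],
        header=data[1:16],
        unknown1=data[16:23],
        infohash=data[23:55],
        unknown2=data[55:75],
    )


# First leecher -> seeder packet from the capture above.
captured = bytes.fromhex(
    "0e537761726d2070726f746f636f6c00"
    "00000000100000d9a5896d902667c875"
    "bc5d7cfe873236f39ce5a01e11f27bfe"
    "5f18feb7fe23f4ef56439fa1876749bf"
    "984b3e2f73404700000000"
)
fields = parse_handshake(captured)
print(fields.header, fields.infohash.hex())
```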

We know that we can change arbitrary values in these blocks without causing the DO server to close the connection. But what if rather than changing the values, we changed the length?

The last block is the biggest and if the server is reading the payload sequentially for parsing reasons, we can shorten the payload by 1 byte and replace that block with identifiers that would show up in the logs.

Error Based Oracle

We’re going to take our known valid 75 byte payload and shorten it by 1 byte and replace bytes 56 –> 75 with values we can use as oracle values. 00 –> 13

00000000: 0e53 7761 726d 2070 726f 746f 636f 6c00  .Swarm protocol.
00000010: 0000 0000 1000 00d9 a589 6d90 2667 c875  ..........m.&g.u
00000020: bc5d 7cfe 8732 36f3 9ce5 a01e 11f2 7bfe  .]|..26.......{.
00000030: 5f18 feb7 fe23 f400 0102 0304 0506 0708  _....#..........
00000040: 090a 0b0c 0d0e 0f10 1112 13              ...........

And turns out, you can literally throw that entire chunk in the trash and the handshake still replies just fine.

00000000: 0e53 7761 726d 2070 726f 746f 636f 6c00  .Swarm protocol.
00000010: 0000 0000 1000 00d9 a589 6d90 2667 c875  ..........m.&g.u
00000020: bc5d 7cfe 8732 36f3 9ce5 a01e 11f2 7bfe  .]|..26.......{.
00000030: 5f18 feb7 fe23 f4                        _....#.

In fact, the server only enters into unexpected behavior when the infohash itself is truncated to 31 bytes like so:

00000000: 0e53 7761 726d 2070 726f 746f 636f 6c00  .Swarm protocol.
00000010: 0000 0000 1000 00d9 a589 6d90 2667 c875  ..........m.&g.u
00000020: bc5d 7cfe 8732 36f3 9ce5 a01e 11f2 7bfe  .]|..26.......{.
00000030: 5f18 feb7 fe23                           _....#

Rather than its normal behavior of closing the connection, it waits indefinitely for the last byte. After sending an arbitrary 32nd byte of the infohash such as FF the server will close the connection.

This is an interesting behavior that can be used to check and help identify if WUDO is running on TCP port 7680.

In summary, if you wanted to check whether a Windows machine on your network has recently installed the latest version of Roblox from the Windows Store, the following payload works just fine…

00000000: 0e50 4953 5350 4953 5350 4953 5350 4953  .PISSPISSPISSPIS
00000010: 5350 4953 5350 49d9 a589 6d90 2667 c875  SPISSPI...m.&g.u
00000020: bc5d 7cfe 8732 36f3 9ce5 a01e 11f2 7bfe  .]|..26.......{.
00000030: 5f18 feb7 fe23 f450 4953 5350 4953 5350  _....#.PISSPISSP
00000040: 4953 5350 4953 5350 4953 53              ISSPISSPISS
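A payload like this is easy to generate for any infohash. A minimal sketch, assuming only what was established above (first byte 0e, 22 bytes that are not validated, the 32 byte infohash, then 20 more bytes that are not validated); build_probe and its filler parameter are names I made up:

```python
from itertools import cycle, islice


def build_probe(infohash_hex: str, filler: bytes = b"PISS") -> bytes:
    """Build a 75 byte handshake with filler in every non-validated position."""
    infohash = bytes.fromhex(infohash_hex)
    if len(infohash) != 32:
        raise ValueError("infohash must be 32 bytes")
    if not filler:
        raise ValueError("filler must not be empty")

    def pad(length: int) -> bytes:
        # Repeat the filler pattern until `length` bytes are produced.
        return bytes(islice(cycle(filler), length))

    return b"\x0e" + pad(22) + infohash + pad(20)


probe = build_probe("d9a5896d902667c875bc5d7cfe873236f39ce5a01e11f27bfe5f18feb7fe23f4")
print(probe.hex())
```

With the default filler this reproduces the dump above byte for byte.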

Another interesting behavior

If you hadn’t noticed already from the P2P behavior and the use of terms such as infohash, the Delivery Optimization service is modeled after the behavior of most BitTorrent clients.
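The resemblance goes further than the vocabulary. What follows is my own reading of the capture, not something the DO documentation states. A BitTorrent handshake is a length byte, a protocol string, 8 reserved bytes, an infohash, and a 20 byte peer ID. Here 0e is 14, the length of “Swarm protocol”, and 1 + 14 + 8 + 32 + 20 is 75. The packets after the handshake also fit BitTorrent framing (4 byte big-endian length, 1 byte message id, payload). A sketch that decodes them under that assumption, with message names taken from the BitTorrent spec (BEP 3):

```python
import struct

# Message ids from the BitTorrent peer wire protocol (BEP 3).
# Applying them to Delivery Optimization traffic is an assumption.
MESSAGE_NAMES = {0: "choke", 1: "unchoke", 2: "interested", 3: "not interested",
                 4: "have", 5: "bitfield", 6: "request", 7: "piece", 8: "cancel"}


def split_frames(stream: bytes) -> list:
    """Split a byte stream into (message_id, payload) pairs."""
    frames = []
    offset = 0
    while offset + 4 <= len(stream):
        (length,) = struct.unpack_from(">I", stream, offset)
        offset += 4
        body = stream[offset:offset + length]
        offset += length
        if body:  # a zero-length frame would be a keep-alive in BitTorrent
            frames.append((body[0], body[1:]))
    return frames


# Post-handshake packets from the capture above, concatenated per direction.
leecher = bytes.fromhex("0000001205" + "00" * 17 + "0000000102"
                        + "0000000d06" + "00000085" + "00000000" + "000bb298")
seeder = bytes.fromhex("0000001205" + "ff" * 16 + "fc" + "0000000101")

for direction, stream in (("leecher", leecher), ("seeder", seeder)):
    for message_id, payload in split_frames(stream):
        print(direction, MESSAGE_NAMES.get(message_id, "unknown"), payload.hex())

# Values from the .pieceshash file shown earlier.
content_length, piece_size = 140227224, 1048576
piece_count = -(-content_length // piece_size)  # ceiling division
last_piece_size = content_length - (piece_count - 1) * piece_size

# Under the BitTorrent reading, a request payload is (index, begin, length).
index, begin, length = struct.unpack(">III", split_frames(leecher)[-1][1])

# Number of bits set in the seeder's bitfield, one bit per piece it holds.
advertised = sum(bin(byte).count("1") for byte in split_frames(seeder)[0][1])

print(piece_count, last_piece_size, (index, begin, length), advertised)
```

The last line prints 134 766616 (133, 0, 766616) 134. So the seeder advertises exactly as many pieces as the .pieceshash arithmetic predicts, and the leecher’s request is for piece 133 (the final one) at offset 0 with a length equal to the size of that last, shorter piece. That is consistent with the BitTorrent interpretation, though it does not prove it.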

A funny aspect of mirroring how torrents work is that there’s this concept of a “tracker”, or a web service that keeps track of all the peers in a group who have parts of the same file they want to share. There are some interesting sites like iKnowWhatYouDownloaded that constantly scrape torrent trackers and keep track of which content different IP’s have been seen downloading.

Clicking the link above will tell you if your IP has been spotted downloading a torrent and what the file is!

…so what if we could do the same thing, but for Microsoft Store downloads and Windows Updates?

According to some documentation

This workflow allows Delivery Optimization to securely and efficiently deliver requested content to the calling device. Delivery Optimization uses content metadata to determine all available locations to pull content from, as well as content verification.

  1. When a download starts, the Delivery Optimization client attempts to get its content metadata. This content metadata is a hash file containing the SHA-256 block-level hashes of each piece in the file (typically one piece = 1 MB).
  2. The authenticity of the content metadata file itself is verified prior to any content being downloaded using a hash that is obtained via an SSL channel from the Delivery Optimization service. The same channel is used to ensure the content is curated and authorized to leverage peer-to-peer.
  3. When Delivery Optimization pulls a certain piece of the hash from another peer, it verifies the hash against the known hash in the content metadata file.
  4. If a peer provides an invalid piece, that piece is discarded. When a peer sends multiple bad pieces, it’s banned and will no longer be used as a source by the Delivery Optimization client performing the download.
  5. If Delivery Optimization is unable to obtain the content metadata file, or if the verification of the hash file itself fails, the download will fall back to “simple mode” (pulling content only from an HTTP source) and peer-to-peer won’t be allowed.
  6. Once downloading is complete, Delivery Optimization uses all retrieved pieces of the content to put the file together. At that point, the Delivery Optimization caller (for example, Windows Update) checks the entire file to verify the signature prior to installing it.

So let’s recreate that and take some notes on what actually happens since some of the important parts have been left out:

Step 1

GET /geo?doClientVersion=10.0.17763.1697 HTTP/1.1
Host: geo.prod.do.dsp.mp.microsoft.com

which responds

{
    "ExternalIpAddress": "123.123.123.123",
    "CountryCode": "US",
    "KeyValue_EndpointFullUri": "https://kv801.prod.do.dsp.mp.microsoft.com/all",
    "Version": "5B36157A03CF0500DA3C2D8238E6005F3469E237734C2046890202DB6F874840",
    "CacheId": "7",
    "CompactVersion": "10.0.17763.1697",
    "ContentCert": false,
    "DownloadModeFailSafe": ""
}

This service reports the client’s public “external” IP and responds with a KeyValue_EndpointFullUri which is needed for the next request.

Step 2

The KeyValue_EndpointFullUri is requested

GET /all?doClientVersion=10.0.17763.1697&countryCode=US&profile=3211262&CacheId=7 HTTP/1.1
Host: kv801.prod.do.dsp.mp.microsoft.com

which responds

{
    "KeyValue_EndpointUri": "https://kv801.prod.do.dsp.mp.microsoft.com/",
    "KeyValue2_EndpointUri": "https://kv801.prod.do.dsp.mp.microsoft.com/",
    "Discovery_EndpointUri": "https://disc801.prod.do.dsp.mp.microsoft.com/",
    "Discovery2_EndpointUri": "https://disc801.prod.do.dsp.mp.microsoft.com/",
    "ContentPolicy_EndpointUri": "https://cp801.prod.do.dsp.mp.microsoft.com/",
    "ContentPolicy2_EndpointUri": "https://cp801.prod.do.dsp.mp.microsoft.com/",
    "KeyValue_EndpointFullUri": "https://kv801.prod.do.dsp.mp.microsoft.com/all",
    "KeyValue2_EndpointFullUri": "https://kv801.prod.do.dsp.mp.microsoft.com/all",
    "Discovery_EndpointFullUri": "https://disc801.prod.do.dsp.mp.microsoft.com/v2/content/{contentId}",
    "Discovery2_EndpointFullUri": "https://disc801.prod.do.dsp.mp.microsoft.com/content/{contentId}",
    "ContentPolicy_EndpointFullUri": "https://cp801.prod.do.dsp.mp.microsoft.com/content/{contentId}/contentpolicy",
    "ContentPolicy2_EndpointFullUri": "https://cp801.prod.do.dsp.mp.microsoft.com/content/{contentId}/contentpolicy",
    "Geo_EndpointFullUri": "https://geo.prod.do.dsp.mp.microsoft.com/geo",
    "GeoVersion_EndpointFullUri": "https://geover.prod.do.dsp.mp.microsoft.com/geoversion",
    "Client_MaxCDNConnections": "4",
    "Client_CDNConnSpeedBps": "174762",
    "Client_UpRateAutoLimitEnabled": "0",
    "Client_DownloadRateAutoLimit": "1",
    "Client_PerfSnapParticipationRate": "0.01",
    "Client_TraceRouteTargets": "[\"download.windowsupdate.com\",\"tlu.dl.delivery.mp.microsoft.com\",\"win10-trt.msedge.net\"]",
    "Client_HttpBlocksizeErrParams": "30-5-1000",
    "Client_ClusterCount": "5",
    "Client_ServicesCertValidationCn": "1",
    "Client_ServicesCertValidationGeo": "1",
    "Client_DnsPeerDiscoveryConsumerParticipationRate": "0",
    "Client_VpnKeywords": "[\"VPN\",\"Secure\",\"Virtual Private Network\",\"Juniper\",\"PANGP\",\"Citrix\"]",
    "Client_RegisteredCallersFilterList": "BeginLoadRange.*,.*CheckReachable,DoLoadFile,EdgeUpdate DO Job,IntuneAppDownload,MDMSW Job,Microsoft Component Updater DO Job,Microsoft Office Click-to-Run,MLModelDownloadJob,MSIX HttpsDataSource Download,Msk8sDownloadAgent,Windows Dlp Manager,WSXExperiencePackDownloadJob,WU Client Download,Xbox XVC Streaming",
    "Client_ProgressHungForegrndTimeoutMsecs": "1800000",
    "Client_TraceRouteParticipationRate": "0.1",
    "Client_MaxBackgroundDownloads": "100",
    "Client_CachedSvcCallAttemptCount": "5",
    "Client_ClusterMaxSizeBytes": "65536",
    "Client_MaxForegroundDownloads": "100",
    "Client_MetadataFileGetTimeoutMsecs": "90000",
    "Client_MetadataFileGetTimeoutmSec": "90000",
    "Client_ProgressHungBackgrndTimeoutMsecs": "10800000",
    "Client_OSSMaxUploadSwarms": "50",
    "ParticipationRate": "1",
    "PublicParticipationRate": "1",
    "UploadLimitGBMonth": "20",
    "Discovery_MaxBucketId": "13",
    "Discovery_MaxPartitionId": "13",
    "Version": "EA4448A76A276C1E85BD3278C35E80AEF80F85F779BBD2FDBC0BF7BB38A1D25A"
}

The ContentPolicy_EndpointUri and Discovery_EndpointUri are used in the following requests.
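The {contentId} templates in that response are filled in and given a query string to form the next requests. A sketch of that step, assuming plain string substitution plus standard URL encoding; fill_endpoint is a hypothetical helper, and the content ID, altCatalogId, and other parameter values are simply the ones that appear in the Step 3 request:

```python
from urllib.parse import urlencode


def fill_endpoint(template: str, content_id: str, params: dict) -> str:
    """Substitute {contentId} into an endpoint template and append the query string."""
    return template.replace("{contentId}", content_id) + "?" + urlencode(params)


url = fill_endpoint(
    "https://cp801.prod.do.dsp.mp.microsoft.com/content/{contentId}/contentpolicy",
    "074a9355101ddb601b93946302e214028394d70a",
    {
        "doClientVersion": "10.0.17763.1697",
        "altCatalogId": "http://tlu.dl.delivery.mp.microsoft.com/filestreamingservice/files/f021a6c3-c792-46d2-ac5a-548d3c6b1b46",
        "countryCode": "US",
        "profile": "3211262",
        "CacheId": "7",
    },
)
print(url)
```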

Step 3

Using the ContentPolicy_EndpointUri:

GET /content/074a9355101ddb601b93946302e214028394d70a/contentpolicy?doClientVersion=10.0.17763.1697&altCatalogId=http%3A%2F%2Ftlu.dl.delivery.mp.microsoft.com%2Ffilestreamingservice%2Ffiles%2Ff021a6c3-c792-46d2-ac5a-548d3c6b1b46&countryCode=US&profile=3211262&CacheId=7 HTTP/1.1
Host: cp801.prod.do.dsp.mp.microsoft.com

which responds

{
    "ContentId": "DO-2aWJbZAmZ8h1vF18-ocyNvOc5aAeEfJ7-l8Y-rf_I-Q=",
    "HashOfHashes": "2aWJbZAmZ8h1vF18/ocyNvOc5aAeEfJ7/l8Y/rf+I/Q=",
    "PiecesHashFileCdnUrls": [
        "http://dl.delivery.mp.microsoft.com/filestreamingservice//files/f021a6c3-c792-46d2-ac5a-548d3c6b1b46/pieceshash"
    ],
    "ContentCdnUrls": [
        ""
    ],
    "IsSecure": "True",
    "IsInternal": "False",
    "Policies": {
        "ForegroundQosBps": "6710886",
        "BackgroundQosBps": "2621440",
        "MaxCacheAgeSecs": "259200",
        "ExpireAtSecsSinceEpoch": "",
        "DownloadToExpire": "86400"
    },
    "Rank": 0.36026386048657066
}

Cool. So the ContentId is DO- followed by the HashOfHashes with every / swapped for - and every + swapped for _, and the PiecesHashFileCdnUrls is where the .pieceshash file we saw further up the page came from. The PiecesHashFileCdnUrls is used in the next request.
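Comparing the two strings character by character, the ContentId is not a straight copy of the HashOfHashes. A sketch of the mapping as observed in this single response (I have not checked it against other content); content_id_from_hash is my own name for it:

```python
def content_id_from_hash(hash_of_hashes: str) -> str:
    """Derive the ContentId string from a base64 HashOfHashes value."""
    # Observed mapping: "/" -> "-" and "+" -> "_". Note this is NOT the standard
    # URL-safe base64 alphabet, which maps "+" -> "-" and "/" -> "_".
    return "DO-" + hash_of_hashes.translate(str.maketrans("/+", "-_"))


print(content_id_from_hash("2aWJbZAmZ8h1vF18/ocyNvOc5aAeEfJ7/l8Y/rf+I/Q="))
```

Because the substitution is the reverse of the usual URL-safe alphabet, base64.urlsafe_b64encode would not reproduce the ContentId here.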

Step 4

GET /v2/content/074a9355101ddb601b93946302e214028394d70a?partitionId=0&doClientVersion=10.0.17763.1697&altCatalogId=http%3A%2F%2Ftlu.dl.delivery.mp.microsoft.com%2Ffilestreamingservice%2Ffiles%2Ff021a6c3-c792-46d2-ac5a-548d3c6b1b46&profile=3211262&CacheId=7 HTTP/1.1
Host: disc801.prod.do.dsp.mp.microsoft.com

which responds

[
    {
        "CollectiveArray": "https://array802.prod.do.dsp.mp.microsoft.com/",
        "Weight": 100.0
    }
]

The CollectiveArray is used in the next request.

Step 5

Using the UUID from PiecesHashFileCdnUrls from step 3 (ContentId can be complete garbage, it defaults to use AltCatalogId), form the following request

POST /join/ HTTP/1.1
Host: array802.prod.do.dsp.mp.microsoft.com
Content-Type: application/json
Content-Length: 727

{
    "ContentId": "074a9355101ddb601b93946302e214028394d70a",
    "AltCatalogId": "http://tlu.dl.delivery.mp.microsoft.com/filestreamingservice/files/f021a6c3-c792-46d2-ac5a-548d3c6b1b46",
    "PeerId": "e74bb69eac74e246843b8f1d5741fa7000000000",
    "ReportedIp": "0.0.0.0",
    "SubnetMask": "255.255.255.0",
    "Ipv6": "",
    "IsBackground": "0",
    "ClientCompactVersion": "10.0.17763.1697",
    "Uploaded": "0",
    "Downloaded": "0",
    "DownloadedCdn": "0",
    "DownloadedDoinc": "0",
    "Left": "0",
    "JoinRequestEvent": "3",
    "RestrictedUpload": "0",
    "PeersWanted": "50",
    "GroupId": "",
    "Scope": "1",
    "UploadedBPS": "0",
    "DownloadedBPS": "0",
    "Profile": "3211262",
    "Seq": "0"
}

which responds

{
    "FailureReason": null,
    "NextJoinTimeIntervalInMs": 2754185,
    "Complete": 0,
    "Incomplete": 0,
    "Rediscover": false,
    "KVVersion": "EA4448A76A276C1E85BD3278C35E80AEF80F85F779BBD2FDBC0BF7BB38A1D25A",
    "GeoVersion": "5B36157A03CF0500DA3C2D8238E6005F3469E237734C2046890202DB6F874840",
    "Peers": [
        {
            "PeerId": "db1c326c09414042a92a17119efe79d600000000",
            "Type": 128,
            "Ip": "68.47.174.154",
            "Port": 7680,
            "Ipv6": "2001:0000:349E:D136:3C44:001F:BBD0:5165",
            "InternalIp": "",
            "ExternalIp": "68.47.174.154",
            "Ipv6Port": 7680,
            "InternalPort": 0,
            "ExternalPort": 7680
        },
        {
            "PeerId": "6100b101f0884546b576a5d0bff594ff00000000",
            "Type": 128,
            "Ip": "69.136.117.9",
            "Port": 7680,
            "Ipv6": "2001:0000:349E:D136:2C23:205F:BA77:8AF6",
            "InternalIp": "",
            "ExternalIp": "69.136.117.9",
            "Ipv6Port": 7680,
            "InternalPort": 0,
            "ExternalPort": 7680
        },
        {
            "PeerId": "fd8592d8d23a58498133f67d789c659f00000000",
            "Type": 128,
            "Ip": "73.241.185.224",
            "Port": 7680,
            "Ipv6": "2001:0000:0D5B:9458:38ED:25D7:B60E:461F",
            "InternalIp": "",
            "ExternalIp": "73.241.185.224",
            "Ipv6Port": 7680,
            "InternalPort": 0,
            "ExternalPort": 7680
        }
        //Truncated for length purposes of this blog
    ],
    "Leave": false
}

If you’re thinking “Wow! It’s not great that there’s that much info readily available, but at least the InternalIp is blank!”… think again.

I removed the InternalIp for the purposes of this blog, but I promise they’re there. Additionally, you can just send the request again and get a new list.

Eventually, if you scrape the “tracker” for popular files distributed with Delivery Optimization and use a tool like Cytoscape to build a network graph of all of the Ip, InternalIp, and ExternalIp values, you get a nice detailed view of the internal network topology of some really interesting companies… but I digress. MS should probably rate limit the scraping of these endpoints.
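The grouping step behind such a graph is small. A sketch, assuming a /join/ response already parsed into a dict with the Peers shape shown above; group_by_external_ip is a hypothetical helper and the addresses in the sample are made-up documentation ranges, not real peers:

```python
from collections import defaultdict


def group_by_external_ip(join_response: dict) -> dict:
    """Map each ExternalIp to the sorted InternalIp values reported behind it."""
    groups = defaultdict(set)
    for peer in join_response.get("Peers", []):
        external = peer.get("ExternalIp")
        internal = peer.get("InternalIp")
        if external and internal:  # skip peers with either field blank
            groups[external].add(internal)
    return {external: sorted(internals) for external, internals in groups.items()}


# Hypothetical sample data in the same shape as the response above.
sample = {
    "Peers": [
        {"ExternalIp": "203.0.113.10", "InternalIp": "10.0.0.5"},
        {"ExternalIp": "203.0.113.10", "InternalIp": "10.0.0.9"},
        {"ExternalIp": "198.51.100.7", "InternalIp": ""},
    ]
}
print(group_by_external_ip(sample))
```

Each key with its list of values becomes one hub and its spokes when exported as an edge list for a graphing tool.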

Fuzzing and other shenanigans

Using what we’ve learned from the Swarm Protocol above, this is more than enough to construct a basic fuzzer using Boofuzz.

#!/usr/bin/env python3

from boofuzz import *


def main():
    session = Session(target=Target(connection=TCPSocketConnection("192.168.8.242", 7680)))
    define_proto(session=session)
    session.fuzz()

def define_proto(session):
    s_initialize("swarm")
    s_static("\x0e", name="start")
    s_string("Swarm protocol\x00", name="swarm_header")
    s_string("\x00\x00\x00\x00\x10\x00\x00", name="unknown1")
    s_string("\xd9\xa5\x89m\x90&g\xc8u\xbc]|\xfe\x8726\xf3\x9c\xe5\xa0\x1e\x11\xf2{\xfe_\x18\xfe\xb7\xfe#\xf4", name="infohash")
    s_string("\xefVC\x9f\xa1\x87gI\xbf\x98K>/s@G\x00\x00\x00\x00", name="unknown2")

    session.connect(s_get("swarm"))

if __name__ == "__main__":
    main()

Or we can use another fuzzer, Radamsa, with the start of the client handshake (roblox_start.bin):

radamsa -o 192.168.8.242:7680 -n inf roblox_start.bin

This leads to some interesting findings and behavior which deserve their own blog post.

Fin

Coercing the Windows OS to actually use the Swarm protocol when wanted was fairly hard to do. I thought I’d kept the environment controlled enough several times, but it kept being weird. Even now, the seeding/caching state of the service is hard to pin down, but I was able to successfully capture some traffic from it and provide a little documentation, along with an information leak about organizations which may be using it.

If you’re trying to reproduce this research and can’t get it to work, keep trying! It’s the kind of thing that seems to misbehave for literally any reason.

Perhaps this blog will be followed in the future with an additional blog “DOing MORE Harm” which goes beyond just the initial handshake, but that all depends on how much time I have.

With that, I’ll call it a wrap. Here lies a protocol mystified for years, hiding in plain sight on damn near every Windows machine. Finally; Observed and documented.
