Write TCP/IP Stack by Yourself (1): TCP Packet Parsing

TunTap

Because the Linux kernel owns the network interfaces, applications normally do not process the raw packets on them directly. Linux provides the TUN/TAP virtual network interface mechanism, which delivers raw network packets to a user-space program and accepts packets written back by it.

Example of Using TUN

TunTap can create two types of virtual network interfaces: TUN and TAP. TAP is a layer 2 network interface that provides Ethernet (MAC) frames. TUN is a layer 3 network interface that provides IP packets. Since TCP runs on top of IP, the TUN interface is all we need to handle TCP/IP; ICMP is also carried inside IP packets, so it arrives on a TUN interface as well. If you need to handle layer 2 protocols such as ARP, you’ll need to use the TAP interface. Here we’ll demonstrate using the TUN interface.

test tun

func Test_tun(t *testing.T) {
    args := struct {
        cidr string
        name string
    }{
        cidr: "11.0.0.1/24",
        name: "testtun1",
    }
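    // Create the TUN device; IFF_NO_PI omits the extra packet-information header,
    // so every read returns a bare IP packet.
    fd, err := CreateTunTap(args.name, syscall.IFF_TUN|syscall.IFF_NO_PI)
    if err != nil {
        t.Fatal(err)
    }

    // Assign an address to the new interface.
    out, err := exec.Command("ip", "addr", "add", args.cidr, "dev", args.name).CombinedOutput()
    if err != nil {
        t.Fatalf("ip addr add: %v: %s", err, out)
    }
    fmt.Println(string(out)) // out is a []byte; convert it so the text is printed

    // Bring the interface up so the kernel starts delivering packets to it.
    out, err = exec.Command("ip", "link", "set", args.name, "up").CombinedOutput()
    if err != nil {
        t.Fatalf("ip link set up: %v: %s", err, out)
    }
    fmt.Println(string(out))

    // Read raw IP packets forever and print each one as a hex dump.
    buf := make([]byte, 1024)
    for {
        n, err := syscall.Read(fd, buf)
        if err != nil {
            t.Fatal(err)
        }
        fmt.Println(hex.Dump(buf[:n]))
    }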
}

Let’s test it by sending a simple request:

curl -v http://11.0.0.2/hello

You’ll get output similar to this, which is a raw IP packet:

00000000  45 00 00 3c 80 40 40 00  40 06 a4 79 0b 00 00 01  |E..<.@@.@..y....|
00000010  0b 00 00 02 bb f8 00 50  08 a8 4a 04 00 00 00 00  |.......P..J.....|
00000020  a0 02 fa f0 67 67 00 00  02 04 05 b4 04 02 08 0a  |....gg..........|
00000030  bf b6 00 fa 00 00 00 00  01 03 03 07              |............|

Parsing IP Packets

Let’s look at the IP packet format defined in RFC 791:

rfc791#section-3.1

0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|Version|  IHL  |Type of Service|          Total Length         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|         Identification        |Flags|      Fragment Offset    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Time to Live |    Protocol   |         Header Checksum       |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                       Source Address                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Destination Address                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Options                    |    Padding    |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Let’s analyze this packet:

00000000  45 00 00 3c 80 40 40 00  40 06 a4 79 0b 00 00 01  |E..<.@@.@..y....|
00000010  0b 00 00 02 bb f8 00 50  08 a8 4a 04 00 00 00 00  |.......P..J.....|
00000020  a0 02 fa f0 67 67 00 00  02 04 05 b4 04 02 08 0a  |....gg..........|
00000030  bf b6 00 fa 00 00 00 00  01 03 03 07              |............|

The breakdown is as follows:

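Each offset is the byte position at which the field ends, counted from the start of the IP packet; n/8 denotes n bits.

| IP Offset | TCP Offset | Byte Value | Description |
| --- | --- | --- | --- |
| 4/8 | | 0x4 | IP Version: IPv4 |
| 1 | | 0x5 | IP Header Length: 5 * 4 = 20 bytes |
| 2 | | 0x00 | Type of Service |
| 4 | | 0x003c | Total Length: 60 bytes |
| 6 | | 0x8040 | IP Identification |
| 6 + 3/8 | | 010 | Flags: Reserved (must be 0): 0, Don’t Fragment (DF): 1, More Fragments (MF): 0 |
| 8 | | 0 0000 0000 0000 | Fragment Offset: 0 |
| 9 | | 0x40 | Time to Live: 64 (RFC 791 defines the unit as seconds, but in practice it is a hop count) |
| 10 | | 0x06 | Protocol: 0x06 indicates TCP |
| 12 | | 0xa479 | Header Checksum |
| 16 | | 0x0b 00 00 01 | Source IP: 11.0.0.1 |
| 20 | | 0x0b 00 00 02 | Destination IP: 11.0.0.2 |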
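
As a cross-check of the table, the following standalone sketch decodes the same 20 header bytes with encoding/binary and prints the fields. The names `dumpHeader`, `ipFields`, and `decodeIPv4Fields` are illustrative and are not part of the article's code.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"net"
)

// dumpHeader is the first 20 bytes of the hex dump above (the IPv4 header).
var dumpHeader = []byte{
	0x45, 0x00, 0x00, 0x3c, 0x80, 0x40, 0x40, 0x00,
	0x40, 0x06, 0xa4, 0x79, 0x0b, 0x00, 0x00, 0x01,
	0x0b, 0x00, 0x00, 0x02,
}

// ipFields holds the decoded values; the type is illustrative only.
type ipFields struct {
	Version, HeaderLen int
	TotalLen, ID       uint16
	DontFragment       bool
	MoreFragments      bool
	FragOffset         uint16
	TTL, Protocol      uint8
	Src, Dst           net.IP
}

// decodeIPv4Fields assumes b holds at least the 20-byte fixed header.
func decodeIPv4Fields(b []byte) ipFields {
	// Flags (3 bits) and Fragment Offset (13 bits) share bytes 6 and 7.
	flagsAndOffset := binary.BigEndian.Uint16(b[6:8])
	return ipFields{
		Version:       int(b[0] >> 4),
		HeaderLen:     int(b[0]&0x0f) * 4,
		TotalLen:      binary.BigEndian.Uint16(b[2:4]),
		ID:            binary.BigEndian.Uint16(b[4:6]),
		DontFragment:  flagsAndOffset&0x4000 != 0,
		MoreFragments: flagsAndOffset&0x2000 != 0,
		FragOffset:    flagsAndOffset & 0x1fff,
		TTL:           b[8],
		Protocol:      b[9],
		Src:           net.IP(b[12:16]),
		Dst:           net.IP(b[16:20]),
	}
}

func main() {
	f := decodeIPv4Fields(dumpHeader)
	fmt.Printf("version=%d headerLen=%d totalLen=%d id=0x%04x\n",
		f.Version, f.HeaderLen, f.TotalLen, f.ID)
	fmt.Printf("DF=%t MF=%t fragOffset=%d ttl=%d protocol=%d\n",
		f.DontFragment, f.MoreFragments, f.FragOffset, f.TTL, f.Protocol)
	fmt.Printf("src=%s dst=%s\n", f.Src, f.Dst)
}
```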

Here’s the code to parse it:

ip.go

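// Decode parses an IPv4 packet into the header and its payload.
// It uses the "encoding/binary", "errors", and "net" packages.
func (i *IPPack) Decode(data []byte) (*IPPack, error) {
    // The fixed part of an IPv4 header is 20 bytes; reject anything shorter
    // instead of panicking on an out-of-range index.
    if len(data) < 20 {
        return nil, errors.New("ip: packet shorter than the minimum header")
    }
    header := &IPHeader{
        Version:        data[0] >> 4,
        HeaderLength:   (data[0] & 0x0f) * 4, // IHL counts 32-bit words
        TypeOfService:  data[1],
        TotalLength:    binary.BigEndian.Uint16(data[2:4]),
        Identification: binary.BigEndian.Uint16(data[4:6]),
        Flags:          data[6] >> 5, // top 3 bits of byte 6
        FragmentOffset: binary.BigEndian.Uint16(data[6:8]) & 0x1fff, // low 13 bits
        TimeToLive:     data[8],
        Protocol:       data[9],
        HeaderChecksum: binary.BigEndian.Uint16(data[10:12]),
        SrcIP:          net.IP(data[12:16]),
        DstIP:          net.IP(data[16:20]),
    }
    // The header length must cover the fixed header and fit inside the buffer.
    if header.HeaderLength < 20 || int(header.HeaderLength) > len(data) {
        return nil, errors.New("ip: invalid header length")
    }
    header.Options = data[20:header.HeaderLength]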
    i.IPHeader = header
    payload, err := i.Payload.Decode(data[header.HeaderLength:])
    if err != nil {
        return nil, err
    }
    i.Payload = payload
    return i, nil
}
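
Decode stores the HeaderChecksum field but does not verify it. As a sketch of how verification could work (the function name `checksum16` is an assumption, not part of the article's code), the Internet checksum sums the header as 16-bit big-endian words in one's complement arithmetic and inverts the result. Computed over a valid header with the checksum field included, the result is 0.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// checksum16 returns the one's complement of the one's complement sum of b,
// taken as 16-bit big-endian words. An odd trailing byte is padded with zero.
func checksum16(b []byte) uint16 {
	var sum uint32
	for i := 0; i+1 < len(b); i += 2 {
		sum += uint32(binary.BigEndian.Uint16(b[i : i+2]))
	}
	if len(b)%2 == 1 {
		sum += uint32(b[len(b)-1]) << 8
	}
	// Fold the carries back into the low 16 bits.
	for sum>>16 != 0 {
		sum = (sum & 0xffff) + (sum >> 16)
	}
	return ^uint16(sum)
}

func main() {
	// The 20-byte IPv4 header from the captured packet.
	header := []byte{
		0x45, 0x00, 0x00, 0x3c, 0x80, 0x40, 0x40, 0x00,
		0x40, 0x06, 0xa4, 0x79, 0x0b, 0x00, 0x00, 0x01,
		0x0b, 0x00, 0x00, 0x02,
	}
	fmt.Printf("verify: 0x%04x\n", checksum16(header)) // 0x0000 means valid

	// Recompute the field: zero bytes 10 and 11, then checksum again.
	zeroed := append([]byte(nil), header...)
	zeroed[10], zeroed[11] = 0, 0
	fmt.Printf("computed: 0x%04x\n", checksum16(zeroed)) // 0xa479, as in the packet
}
```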

Important points to note:

Network Byte Order

Network byte order is always big-endian: the most significant byte is transmitted first. For example, the 16-bit value 0x1234 is laid out as the bytes 12 34 in big-endian and as 34 12 in little-endian. Big-endian matches the order in which we normally write numbers.

The implementation in Go’s encoding/binary package is straightforward:

func (bigEndian) Uint16(b []byte) uint16 {
    _ = b[1] // bounds check hint to compiler; see golang.org/issue/14808
    return uint16(b[1]) | uint16(b[0])<<8
}

func (littleEndian) Uint16(b []byte) uint16 {
    _ = b[1] // bounds check hint to compiler; see golang.org/issue/14808
    return uint16(b[0]) | uint16(b[1])<<8
}

IP Header Length

The IP header length is measured in 32-bit words, so we multiply by 4 to get the byte count. As stated in the RFC:

IHL: 4 bits Internet Header Length is the length of the internet header in 32 bit words, and thus points to the beginning of the data. Note that the minimum value for a correct header is 5.
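
The minimum value mentioned in the RFC is worth checking before slicing the options. A minimal sketch, assuming a hypothetical helper named `headerLength` that takes the first byte of the packet:

```go
package main

import "fmt"

// headerLength converts the IHL nibble of the first header byte into bytes.
// It rejects values below 5, the minimum for a correct header.
func headerLength(first byte) (int, error) {
	ihl := int(first & 0x0f) // low 4 bits: length in 32-bit words
	if ihl < 5 {
		return 0, fmt.Errorf("ip: IHL %d is below the minimum of 5", ihl)
	}
	return ihl * 4, nil
}

func main() {
	for _, first := range []byte{0x45, 0x4f, 0x44} {
		n, err := headerLength(first)
		if err != nil {
			fmt.Printf("0x%02x: %v\n", first, err)
			continue
		}
		// Anything beyond the 20-byte fixed header is options.
		fmt.Printf("0x%02x: header %d bytes, options %d bytes\n", first, n, n-20)
	}
}
```

For the captured packet the first byte is 0x45, so the header is 20 bytes long and carries no options; the largest possible value, 0x4f, gives a 60-byte header with 40 bytes of options.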

Parsing TCP Packets

rfc9293#name-header-format

0                   1                   2                   3
0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|          Source Port          |       Destination Port        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                        Sequence Number                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Acknowledgment Number                      |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|  Data |       |C|E|U|A|P|R|S|F|                               |
| Offset| Rsrvd |W|C|R|C|S|S|Y|I|            Window             |
|       |       |R|E|G|K|H|T|N|N|                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|           Checksum            |         Urgent Pointer        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                           [Options]                           |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                                                               :
:                             Data                              :
:                                                               |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+

Let’s analyze the TCP portion of our packet:

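Offsets mark the byte position at which each field ends, counted from the start of the IP packet and from the start of the TCP segment; n/8 denotes n bits.

| IP Offset | TCP Offset | Byte Value | Description |
| --- | --- | --- | --- |
| 22 | 2 | 0xbbf8 | Source Port: 48120 |
| 24 | 4 | 0x0050 | Destination Port: 80 |
| 28 | 8 | 0x08a84a04 | Sequence Number: 145246724 |
| 32 | 12 | 0x00000000 | Acknowledgment Number: 0 |
| 32 + 4/8 | 12 + 4/8 | 0xa | Data Offset (header length): 10 * 4 = 40 bytes |
| 33 | 13 | 0000 | Reserved |
| 34 | 14 | 0000 0010 | Flags: CWR:0 ECE:0 URG:0 ACK:0 PSH:0 RST:0 SYN:1 FIN:0 (SYN packet) |
| 36 | 16 | 0xfaf0 | Window Size: 64240 |
| 38 | 18 | 0x6767 | Checksum |
| 40 | 20 | 0x0000 | Urgent Pointer |
| 60 | 40 | | TCP Options and Padding |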

Here’s the code to parse it:

tcp.go

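// Decode parses a TCP segment into the header and its payload.
// It uses the "encoding/binary" and "errors" packages.
func (t *TcpPack) Decode(data []byte) (NetworkPacket, error) {
    // The fixed part of a TCP header is 20 bytes; reject anything shorter
    // instead of panicking on an out-of-range index.
    if len(data) < 20 {
        return nil, errors.New("tcp: segment shorter than the minimum header")
    }
    header := &TcpHeader{
        SrcPort:        binary.BigEndian.Uint16(data[0:2]),
        DstPort:        binary.BigEndian.Uint16(data[2:4]),
        SequenceNumber: binary.BigEndian.Uint32(data[4:8]),
        AckNumber:      binary.BigEndian.Uint32(data[8:12]),
        DataOffset:     (data[12] >> 4) * 4, // Data Offset counts 32-bit words
        Reserved:       data[12] & 0x0F,
        Flags:          data[13], // CWR, ECE, URG, ACK, PSH, RST, SYN, FIN
        WindowSize:     binary.BigEndian.Uint16(data[14:16]),
        Checksum:       binary.BigEndian.Uint16(data[16:18]),
        UrgentPointer:  binary.BigEndian.Uint16(data[18:20]),
    }
    // The data offset must cover the fixed header and fit inside the buffer.
    if header.DataOffset < 20 || int(header.DataOffset) > len(data) {
        return nil, errors.New("tcp: invalid data offset")
    }
    header.Options = data[20:header.DataOffset]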
    t.TcpHeader = header
    payload, err := t.Payload.Decode(data[header.DataOffset:])
    if err != nil {
        return nil, err
    }
    t.Payload = payload
    return t, nil
}

The parsing principles are similar to IP packet parsing, so we won’t repeat the details.

Summary

In this article, we learned how to use TUN in TunTap and how to parse IP and TCP packets. This is our first step in implementing our own TCP/IP stack. The code discussed in this article can be found here.