Data Structures
The previous article was relatively simple, so we skipped the data structure design. The following articles get progressively more complex, so let's introduce the design first. Computer networks are layered, and every layer except the physical layer has its own packet format. Going down the stack from the application layer to the link layer, each layer encapsulates the packet of the layer above it, so we design our data structures in the same nested fashion. The basic construction looks like this:
packet_test.go
```go
pack := NewIPPack(NewTcpPack(&RawPack{}))
```
The IP object wraps the TCP object, which wraps the raw object, resulting in an IP object. The constructor functions take interfaces as parameters, so if you want, you can even wrap another IP object inside the TCP object:
```go
pack := NewIPPack(NewTcpPack(NewIPPack(&RawPack{})))
```
This is not just theoretically possible; it is useful in practice. Some special network tools implement features such as network proxying by carrying raw packets inside TCP.
The network packet interface is defined as follows:
packet.go
```go
type NetworkPacket interface {
	Decode(data []byte) (NetworkPacket, error)
	Encode() ([]byte, error)
}
```
The constructor is defined as:
tcp.go
```go
func NewTcpPack(payload NetworkPacket) *TcpPack {
	return &TcpPack{Payload: payload}
}
```
The network packet interface definition is very simple: the Decode function decodes data into an object, and the Encode function encodes an object into data.
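To make the nesting concrete, here is a minimal, self-contained sketch of how `Encode` calls chain from the outermost layer inward. The `RawPack` shown here is only a guess at a payload-only type, and `tagPack` (with its `newTagPack` constructor) is a hypothetical layer with a one-byte header, invented for illustration; neither is the project's real IP or TCP type.

```go
package main

import (
	"errors"
	"fmt"
)

// NetworkPacket mirrors the interface shown in the article.
type NetworkPacket interface {
	Decode(data []byte) (NetworkPacket, error)
	Encode() ([]byte, error)
}

// RawPack is an assumed innermost layer that carries bytes unchanged.
type RawPack struct {
	Data []byte
}

func (r *RawPack) Decode(data []byte) (NetworkPacket, error) {
	r.Data = append([]byte(nil), data...) // copy so the caller's buffer is not shared
	return r, nil
}

func (r *RawPack) Encode() ([]byte, error) {
	return r.Data, nil
}

// tagPack is a hypothetical layer with a one-byte header, standing in for IP or TCP.
type tagPack struct {
	Tag     byte
	Payload NetworkPacket
}

func newTagPack(tag byte, payload NetworkPacket) *tagPack {
	return &tagPack{Tag: tag, Payload: payload}
}

// Encode writes this layer's header, then appends the encoded payload.
func (t *tagPack) Encode() ([]byte, error) {
	out := []byte{t.Tag}
	if t.Payload != nil {
		inner, err := t.Payload.Encode()
		if err != nil {
			return nil, err
		}
		out = append(out, inner...)
	}
	return out, nil
}

// Decode reads this layer's header and hands the remaining bytes to the payload.
func (t *tagPack) Decode(data []byte) (NetworkPacket, error) {
	if len(data) < 1 {
		return nil, errors.New("tagPack: data too short")
	}
	t.Tag = data[0]
	if t.Payload != nil {
		if _, err := t.Payload.Decode(data[1:]); err != nil {
			return nil, err
		}
	}
	return t, nil
}

func main() {
	pack := newTagPack(0xA1, newTagPack(0xB2, &RawPack{Data: []byte("hi")}))
	data, err := pack.Encode()
	if err != nil {
		panic(err)
	}
	fmt.Printf("% x\n", data) // a1 b2 68 69
}
```

Because each constructor accepts the interface rather than a concrete type, the layers can be stacked in any order, which is what makes the IP-inside-TCP construction above possible.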
IP Packet Generation
Here’s the complete implementation:
ip encode
```go
import "encoding/binary"

// Encode serializes the IP header, followed by the encoded payload.
func (i *IPPack) Encode() ([]byte, error) {
	var (
		payload []byte
		err     error
	)
	// Encode the payload first; its length is needed for TotalLength.
	if i.Payload != nil {
		payload, err = i.Payload.Encode()
		if err != nil {
			return nil, err
		}
	}
	data := make([]byte, 0)
	// HeaderLength is stored in bytes; default to 20 plus any options.
	if i.HeaderLength == 0 {
		i.HeaderLength = uint8(20 + len(i.Options))
	}
	// High nibble: version. Low nibble: header length in 32-bit words.
	data = append(data, i.Version<<4|i.HeaderLength/4)
	data = append(data, i.TypeOfService)
	if i.TotalLength == 0 {
		i.TotalLength = uint16(i.HeaderLength) + uint16(len(payload))
	}
	data = binary.BigEndian.AppendUint16(data, i.TotalLength)
	data = binary.BigEndian.AppendUint16(data, i.Identification)
	// Top 3 bits: flags. Lower 13 bits: fragment offset.
	data = binary.BigEndian.AppendUint16(data, uint16(i.Flags)<<13|i.FragmentOffset)
	data = append(data, i.TimeToLive)
	data = append(data, i.Protocol)
	// Written as 0 here when the checksum has not been computed yet.
	data = binary.BigEndian.AppendUint16(data, i.HeaderChecksum)
	data = append(data, i.SrcIP...)
	data = append(data, i.DstIP...)
	data = append(data, i.Options...)
	// The checksum covers the header only, with its own field zeroed.
	if i.HeaderChecksum == 0 {
		i.HeaderChecksum = calculateIPChecksum(data)
	}
	binary.BigEndian.PutUint16(data[10:12], i.HeaderChecksum)
	data = append(data, payload...)
	return data, nil
}
```
Most field conversions are basic bit operations, so we won't go through them one by one. The checksum, however, deserves attention.
Checksum calculation is a little fiddly and is not central to understanding TCP/IP. If you want a working stack quickly, you can skip the details for now and copy an existing implementation.
You cannot leave the checksum out, though: packets with an invalid checksum are discarded.
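For readers who do want to see the bit operations spelled out, the following sketch packs the two IPv4 header fields that share bytes: the version and header length byte, and the flags and fragment offset word. The function names are illustrative and are not part of the project.

```go
package main

import "fmt"

// packVersionIHL puts the version in the high nibble and the header length,
// converted from bytes to 32-bit words, in the low nibble.
func packVersionIHL(version, headerLenBytes uint8) byte {
	return version<<4 | headerLenBytes/4
}

// packFlagsOffset puts the 3-bit flags in the top bits of the 16-bit word
// and the 13-bit fragment offset in the remaining bits.
func packFlagsOffset(flags uint8, fragmentOffset uint16) uint16 {
	return uint16(flags)<<13 | fragmentOffset&0x1fff
}

func main() {
	// IPv4 with a 20-byte header (5 words).
	fmt.Printf("%#x\n", packVersionIHL(4, 20)) // 0x45
	// Flags 0b010 is "Don't Fragment" with a zero offset.
	fmt.Printf("%#x\n", packFlagsOffset(0b010, 0)) // 0x4000
}
```

The familiar `0x45` at the start of most IPv4 packets is exactly this byte: version 4 and a header of five 32-bit words.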
Checksum
To calculate the checksum, we first build the IP header with the checksum field set to 0, then compute the checksum over those header bytes.
Here's the original text from the RFC (RFC 1071):
checksum
```text
In outline, the Internet checksum algorithm is very simple:

(1)  Adjacent octets to be checksummed are paired to form 16-bit
     integers, and the 1's complement sum of these 16-bit integers is
     formed.
```
Implementation:
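```go
// calculateIPChecksum returns the complement of the one's complement sum of
// the header. The checksum field inside headerData must be zero.
func calculateIPChecksum(headerData []byte) uint16 {
	// Pad to an even length so the data splits into 16-bit words.
	if len(headerData)%2 == 1 {
		headerData = append(headerData, 0)
	}
	return ^OnesComplementSum(headerData)
}
```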
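As a worked example, the sketch below runs the algorithm on a sample 20-byte IPv4 header and then performs the receiver-side check. The helper names `onesComplementSum` and `ipv4Checksum` are local to this example, and the header bytes are sample data chosen for illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// onesComplementSum adds data as big-endian 16-bit words with end-around carry.
// It assumes len(data) is even, which always holds for an IPv4 header.
func onesComplementSum(data []byte) uint16 {
	var sum uint32
	for i := 0; i+1 < len(data); i += 2 {
		sum += uint32(binary.BigEndian.Uint16(data[i : i+2]))
	}
	for sum > 0xffff {
		sum = (sum & 0xffff) + (sum >> 16)
	}
	return uint16(sum)
}

// ipv4Checksum expects the checksum field (bytes 10-11) to be zero.
func ipv4Checksum(header []byte) uint16 {
	return ^onesComplementSum(header)
}

func main() {
	// Sample header: 192.168.0.1 -> 192.168.0.199, UDP, checksum field zeroed.
	header := []byte{
		0x45, 0x00, 0x00, 0x73, 0x00, 0x00, 0x40, 0x00,
		0x40, 0x11, 0x00, 0x00, 0xc0, 0xa8, 0x00, 0x01,
		0xc0, 0xa8, 0x00, 0xc7,
	}
	checksum := ipv4Checksum(header)
	fmt.Printf("checksum: %#x\n", checksum) // checksum: 0xb861

	// Receiver side: with the checksum filled in, the words sum to 0xffff.
	binary.BigEndian.PutUint16(header[10:12], checksum)
	fmt.Printf("verify: %#x\n", onesComplementSum(header)) // verify: 0xffff
}
```

The second print shows why a receiver can validate a header without knowing which bytes hold the checksum: summing every word, checksum included, must give `0xffff`.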
TCP Packet Generation
tcp encode
```go
import (
	"encoding/binary"
	"errors"
)

// Encode serializes the TCP header, options, and encoded payload.
func (t *TcpPack) Encode() ([]byte, error) {
	data := make([]byte, 0)
	data = binary.BigEndian.AppendUint16(data, t.SrcPort)
	data = binary.BigEndian.AppendUint16(data, t.DstPort)
	data = binary.BigEndian.AppendUint32(data, t.SequenceNumber)
	data = binary.BigEndian.AppendUint32(data, t.AckNumber)
	// DataOffset is stored in bytes; default to 20 plus any options.
	if t.DataOffset == 0 {
		t.DataOffset = uint8(20 + len(t.Options))
	}
	// High nibble: header length in 32-bit words. Low nibble: reserved bits.
	data = append(data, ((t.DataOffset>>2)<<4)|t.Reserved)
	data = append(data, t.Flags)
	data = binary.BigEndian.AppendUint16(data, t.WindowSize)
	// Written as 0 here when the checksum has not been computed yet.
	data = binary.BigEndian.AppendUint16(data, t.Checksum)
	data = binary.BigEndian.AppendUint16(data, t.UrgentPointer)
	data = append(data, t.Options...)
	if t.Payload != nil {
		payload, err := t.Payload.Encode()
		if err != nil {
			return nil, err
		}
		data = append(data, payload...)
	}
	if t.Checksum == 0 {
		// The checksum covers the pseudo-header, TCP header, and payload.
		if t.PseudoHeader == nil {
			return nil, errors.New("pseudo header is required to calculate tcp checksum")
		}
		t.Checksum = calculateTcpChecksum(t.PseudoHeader, data)
		// The checksum field occupies bytes 16-17 of the TCP header.
		binary.BigEndian.PutUint16(data[16:18], t.Checksum)
	}
	return data, nil
}
```
As with IP, the only tricky part of TCP packet generation is the checksum. Again, if you want a working stack quickly, you can skip this part for now and copy an existing implementation.
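The one piece of bit packing worth a second look is byte 12, where a header length in bytes becomes a count of 32-bit words in the high nibble. The sketch below shows that conversion next to the flag byte; the constant and function names are illustrative assumptions, not the project's identifiers.

```go
package main

import "fmt"

// TCP flag bits as they appear in byte 13 of the header.
const (
	flagFIN = 1 << 0
	flagSYN = 1 << 1
	flagRST = 1 << 2
	flagPSH = 1 << 3
	flagACK = 1 << 4
)

// packDataOffset converts a header length in bytes to 32-bit words and
// places it in the high nibble of byte 12; the low nibble holds reserved bits.
func packDataOffset(headerLenBytes, reserved uint8) byte {
	return (headerLenBytes>>2)<<4 | reserved&0x0f
}

func main() {
	fmt.Printf("%#x\n", packDataOffset(20, 0))   // 0x50: 20-byte header, no options
	fmt.Printf("%#x\n", packDataOffset(24, 0))   // 0x60: 4 bytes of options
	fmt.Printf("%#x\n", byte(flagSYN|flagACK))   // 0x12: SYN+ACK
}
```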
Checksum
The TCP checksum covers more than the TCP segment itself: a pseudo-header is conceptually prefixed to the TCP header before the checksum is calculated. Here's the original text from the RFC (RFC 9293):
pseudo-header
```text
The checksum also covers a pseudo-header (Figure 2) conceptually prefixed to the TCP header.

    +--------+--------+--------+--------+
    |           Source Address          |
    +--------+--------+--------+--------+
    |         Destination Address       |
    +--------+--------+--------+--------+
    |  zero  |  PTCL  |    TCP Length   |
    +--------+--------+--------+--------+

          Figure 2: IPv4 Pseudo-header

Pseudo-header components for IPv4:

    Source Address: the IPv4 source address in network byte order
    Destination Address: the IPv4 destination address in network byte order
    zero: bits set to zero
    PTCL: the protocol number from the IP header
    TCP Length: the TCP header length plus the data length in octets (this is not an explicitly transmitted quantity but is computed), and it does not count the 12 octets of the pseudo-header.
```
So we first build the pseudo-header, then calculate the checksum over the pseudo-header followed by the TCP header and payload. All of the pseudo-header fields are available from the IP packet. Once the combined data is assembled, we can reuse the same one's complement sum used for the IP checksum. Here's the final implementation:
tcp checksum
```go
import "encoding/binary"

// SetPseudoHeader records the IP addresses needed for the TCP checksum.
func (t *TcpPack) SetPseudoHeader(srcIP, dstIP []byte) {
	t.PseudoHeader = &PseudoHeader{SrcIP: srcIP, DstIP: dstIP}
}

// calculateTcpChecksum computes the checksum over the pseudo-header followed
// by the TCP header and payload. The checksum field in the data must be zero.
func calculateTcpChecksum(pseudo *PseudoHeader, headerPayloadData []byte) uint16 {
	// 12-byte IPv4 pseudo-header: source, destination, zero, protocol, TCP length.
	pseudoHeader := make([]byte, 0, 12)
	pseudoHeader = append(pseudoHeader, pseudo.SrcIP...)
	pseudoHeader = append(pseudoHeader, pseudo.DstIP...)
	pseudoHeader = append(pseudoHeader, 0, byte(ProtocolTCP))
	pseudoHeader = binary.BigEndian.AppendUint16(pseudoHeader, uint16(len(headerPayloadData)))
	sumData := make([]byte, 0, len(pseudoHeader)+len(headerPayloadData)+1)
	sumData = append(sumData, pseudoHeader...)
	sumData = append(sumData, headerPayloadData...)
	// Pad to an even length so the data splits into 16-bit words.
	if len(sumData)%2 == 1 {
		sumData = append(sumData, 0)
	}
	return ^OnesComplementSum(sumData)
}
```
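To check a TCP checksum routine without capturing real traffic, one option is a round trip: compute the checksum, write it into bytes 16-17, and recompute, which should yield 0. The sketch below does this for a hand-built SYN segment using the 12-byte pseudo-header layout from the RFC figure. The names `sum16`, `tcpChecksum`, and `protocolTCP` and the sample addresses and ports are assumptions made for this example.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

const protocolTCP = 6 // IANA protocol number for TCP

// sum16 computes the one's complement sum of data as big-endian 16-bit words,
// treating a trailing odd byte as if it were padded with a zero byte.
func sum16(data []byte) uint16 {
	var sum uint32
	for i := 0; i+1 < len(data); i += 2 {
		sum += uint32(binary.BigEndian.Uint16(data[i : i+2]))
	}
	if len(data)%2 == 1 {
		sum += uint32(data[len(data)-1]) << 8
	}
	for sum > 0xffff {
		sum = (sum & 0xffff) + (sum >> 16)
	}
	return uint16(sum)
}

// tcpChecksum prefixes the 12-byte IPv4 pseudo-header to the segment and
// returns the complement of the one's complement sum.
func tcpChecksum(srcIP, dstIP [4]byte, segment []byte) uint16 {
	buf := make([]byte, 0, 12+len(segment))
	buf = append(buf, srcIP[:]...)
	buf = append(buf, dstIP[:]...)
	buf = append(buf, 0, protocolTCP) // zero byte, then protocol
	buf = binary.BigEndian.AppendUint16(buf, uint16(len(segment)))
	buf = append(buf, segment...)
	return ^sum16(buf)
}

func main() {
	src := [4]byte{10, 0, 0, 1}
	dst := [4]byte{10, 0, 0, 2}

	// A 20-byte SYN segment; the checksum field (bytes 16-17) starts as zero.
	seg := make([]byte, 20)
	binary.BigEndian.PutUint16(seg[0:2], 12345) // source port
	binary.BigEndian.PutUint16(seg[2:4], 80)    // destination port
	binary.BigEndian.PutUint32(seg[4:8], 1)     // sequence number
	seg[12] = 5 << 4                            // data offset: 5 words
	seg[13] = 0x02                              // SYN
	binary.BigEndian.PutUint16(seg[14:16], 65535)

	sum := tcpChecksum(src, dst, seg)
	binary.BigEndian.PutUint16(seg[16:18], sum)
	fmt.Printf("checksum: %#x\n", sum)
	fmt.Println("valid:", tcpChecksum(src, dst, seg) == 0) // valid: true
}
```

Changing any byte of the segment or either address after the checksum is written makes the second call return a non-zero value, which is how a receiver detects corruption.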
There are many ways to optimize checksum calculation. One simple approach is to accumulate the sum in a uint32.
With a uint32 accumulator, every carry out of the low 16 bits collects in the high 16 bits. After summing, we add the high 16 bits back into the low 16 bits. If that addition carries again, we repeat until the high 16 bits are zero.
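```go
import "encoding/binary"

// OnesComplementSum returns the 16-bit one's complement sum of data.
// The length of data must be even; callers pad with a zero byte first.
func OnesComplementSum(data []byte) uint16 {
	var sum uint32
	// Accumulate big-endian 16-bit words; carries collect in the high 16 bits.
	for i := 0; i < len(data); i += 2 {
		sum += uint32(binary.BigEndian.Uint16(data[i : i+2]))
	}
	// Add the carry bits back in
	for sum > 0xffff {
		sum = (sum & 0xffff) + (sum >> 16)
	}
	return uint16(sum)
}
```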
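The folding step is easier to see on concrete numbers. This small sketch isolates it in a hypothetical `fold` helper and shows one input that needs a single pass and one that needs two, which is why the fold is written as a loop.

```go
package main

import "fmt"

// fold adds the carries above bit 15 back into the low 16 bits
// until no carry remains.
func fold(sum uint32) uint16 {
	for sum > 0xffff {
		sum = (sum & 0xffff) + (sum >> 16)
	}
	return uint16(sum)
}

func main() {
	// 0xffff + 0x0001 = 0x10000: the carry wraps around to bit 0.
	fmt.Printf("%#x\n", fold(0xffff+0x0001)) // 0x1
	// 0xffff + 0xffff = 0x1fffe: one pass gives 0xfffe + 1.
	fmt.Printf("%#x\n", fold(0xffff+0xffff)) // 0xffff
	// 0x1ffff: the first pass gives 0x10000, which carries again.
	fmt.Printf("%#x\n", fold(0x1ffff)) // 0x1
}
```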
Important Notes
- My TCP/IP stack project is primarily for educational purposes, so I prioritize code readability over performance. Many implementations are not optimal. Production-level code would include extensive performance optimizations, error handling, and boundary checks, sacrificing some readability for higher performance and security.
- In the current implementation, the IP ID is always 0 for simplicity. The ID is mainly used for IP fragmentation, so we can ignore it for now; this works fine for small packets.
- Both IP and TCP have options fields that support extended network functionality. These can also be ignored in an initial implementation.
Summary
At this point, we have completed TCP packet generation. In the next article, we will start implementing the TCP three-way handshake.