The ipaddress R package was heavily influenced by the design of the ipaddress module in the Python Standard Library. For this reason, the package is centered around 3 data classes:
This vignette is styled after the official tutorial for the Python module. It introduces you to these classes, laying the groundwork for the rest of the package functionality.
IP addresses are used to facilitate communications between computers connected to the internet. At this highest level, an IP address is analogous to a mailing address.
It’s important to know there are two versions of the Internet Protocol in wide usage today. IPv4 stores addresses using 32 bits, which provides 4,294,967,296 unique addresses. But given the rapid growth of the internet, this address space was quickly depleted. The replacement protocol (known as IPv6) stores addresses using 128 bits, which provides a far greater number of unique addresses (sufficient for the foreseeable future). The transition to IPv6 is currently ongoing, so it is still very common to see IPv4 addresses.
To make IP addresses easier for humans to interpret, they are usually represented as character strings.
255separated by periods (e.g.
192.168.0.1). Each group corresponds to 8 bits.
ffffseparated by colons (e.g.
2001:0db8:85a3:0000:0000:8a2e:0370:7334). Each group corresponds to 16 bits. This representation can also be compressed by removing leading zeros and replacing consecutive groups of zeros with double-colon (e.g.
ip_address() vector is constructed from a character vector of these human-readable strings. It can handle IPv4 and IPv6 addresses simultaneously:
IP addresses are often stored as integers for convenience, so you might need to encode or decode this format. The
integer_to_ip() functions are provided for this purpose. We recommend reading the documentation before use, because these functions can be slightly counter-intuitive (due to circumventing limitations of the R integer data type).
The above example looks like we’ve simply stored the character vector. However, the constructor has actually validated each input and stored the native bit representation of each address. The
print() function has then converted them back to the human-readable character representation. We can see this in action by passing an invalid address:
There are two main advantages to storing IP data in their native bit representation:
An IP network is a contiguous range of IP addresses (also known as an IP block). These networks are very important to how address allocation and routing work.
The size of a network is determined by its prefix length. This indicates how many bits are reserved (counting from the left) for the routing prefix address (i.e. the start of the address range). The remaining bits are available for allocation to hosts (so all hosts on a network will share the same prefix bits). This means a network with a larger prefix length is actually a smaller network.
The most common representation of an IP network is called CIDR notation. This shows the routing prefix address and the prefix length, separated by a forward slash. This notation is used by both IPv4 and IPv6 networks.
As an example, the
192.168.0.0/24 network represents the address range from
ip_network() vector is constructed from a character vector of these CIDR strings. For example:
An IP network cannot have any host bits set. If host bits were set, this would refer to a specific host on the network and not the network as a whole – this is the purpose of the
ip_interface() class (see below).
ip_network() constructor enforces this rule during input validation. If an input has host bits set, a warning is emitted and
NA is returned. However, you can mask out the host bits using
strict = FALSE.
We’ve learned about host addresses and how they are grouped into networks. Unsurprisingly then, people often think about an IP address within the context of its network (i.e. storing both pieces of information simultaneously). The ipaddress package refers to this concept as an IP interface.
An IP interface could be represented in many different ways (e.g. two addresses containing the network bits and host bits separately). However, the most common representation is CIDR notation again.
ip_interface() vector is constructed from a character vector of CIDR strings, just like an
ip_network() vector. However, unlike
ip_interface() class retains the host bits.
Since this class represents a host on a specific network, most functions will treat an
ip_interface() vector like an
ip_address() vector. Some exceptions are listed under
The address and network components can be extracted using the
as_ip_network() functions, respectively.
The ipaddress package provides the
ip_network() data classes, which represent the most fundamental aspects of IP networking. The majority of functions contained in this package use these classes.
ip_interface() class is also provided, which is a hybrid class describing a specific host on a specific network. Although most functions treat this class like an
ip_address(), the constituent address and network components can be extracted using