Network Working Group                                          M. Lottor
Request for Comments: 1296                             SRI International
                                      Network Information Systems Center
                                                            January 1992

                      Internet Growth (1981-1991)

Status of this Memo

 This memo provides information for the Internet community.  It does
 not specify an Internet standard.  Distribution of this memo is
 unlimited.

Abstract

 This document illustrates the growth of the Internet by examination
 of entries in the Domain Name System (DNS) and pre-DNS host tables.
 DNS entries are collected by a program called ZONE, which searches
 the Internet and retrieves data from all known domains.  Pre-DNS host
 table data were retrieved from system archive tapes.  Various
 statistics are presented on the number of hosts and domains.

Table of Contents

 Introduction
 How ZONE Works
 Problems with Data Collection
 Scope of the Study
 N. Results
   N.1 Number of Internet Hosts
   N.2 Number of Domains
   N.3 Distribution of IP Addresses per Host
   N.4 Distribution of Hosts by Top-level Domain
   N.5 Distribution of Hosts by Host Name
 Future Issues
 RFC References
 Security Considerations
 Author's Address

Introduction

 This document provides statistics on the growth of the Internet by
 examining the number of Internet hosts and domains over a 10-year
 period.  Before the Domain Name System was established, practically
 all hosts on the Internet were registered with the Network
 Information Center (SRI-NIC) and entries were placed in the Official
 Host Table for each one.  Data on the number of hosts for pre-DNS
 years comes from copies of the host table at selected times.  The DNS
 was introduced around 1984, but almost 4 years passed before it was
 fully implemented on the Internet.  By that time, however, many hosts
 were no longer registered in the Host Table.

 In 1986, the ZONE (Zealot Of Name Edification) program was written.
 ZONE was originally intended to be used during the host-table-to-DNS
 transition period.  ZONE would "walk" the DNS tree and build a host
 table of all the information it collected.  This host table could
 then be used by sites that had not yet made the DNS transition.
 However, ZONE was never used for this purpose.  Instead, it was found
 to be useful for collecting statistics on the size of the domain
 system and the Internet.

 ZONE could not collect complete data on the DNS until around 1988,
 because early versions of BIND (the popular Unix DNS implementation)
 had major problems with the zone transfer function of the DNS
 protocol.  ZONE has been used in varying ways ever since to collect
 this information.  In the first few years, it was used to produce a
 wall-size chart of the domain tree.  However, the number of domains
 quickly outgrew the size of the wall and the charts were abandoned.
 In later years, statistics on the number of hosts and domains were
 extracted from the resulting host table, sometimes categorizing data
 based on top-level domain names or on computer system type or
 manufacturer.

 The time to gather the data also grew from hours to a week, and the
 size of the host table produced soon reached 50 megabytes.  In order
 to reduce the amount of data collected, ZONE is now run in a mode
 collecting only host names and IP addresses, ignoring protocol, host
 information and MX record data.  The host table is then groveled over
 by some utilities (such as sort, uniq and grep) to produce the
 statistics required.  ZONE is currently run every 3 months at SRI.

How ZONE Works

 ZONE maintains a list of domains and their servers and a flag
 indicating whether information for a domain has been successfully
 loaded from one of the servers.  Because of another bug in BIND, ZONE
 must be primed with a list of all the top-level domains and their
 name servers.  It then cycles through the domain list, attempting to
 contact one of the servers for each domain not yet transferred.  When
 a server is contacted (via TCP), a Start of Authority (SOA) query is
 first sent to make sure the server is authoritative for the domain
 being requested.  If so, a zone transfer query (AXFR) is then sent to
 retrieve all the resource records for the domain.  When a name server
 record (NS) is received, the referenced domain and server are added
 to the list of domains to process.  When host records (A, CNAME,
 HINFO, MX) are received, they are added to an in-core table of host
 information.  The program ends when it has cycled through the entire
 list of domains without receiving any new information.  It then dumps
 the table of host information to a HOSTS.TXT format file.
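
 The cycle described above can be sketched in Python.  This is only
 an illustration under stated assumptions, not ZONE itself: the names
 walk, is_authoritative, transfer_zone and FAKE_ZONES are hypothetical,
 the two callables stand in for the SOA query and the AXFR transfer
 (neither is implemented here), and records are assumed to arrive as
 (name, type, value) tuples.

```python
HOST_TYPES = {"A", "CNAME", "HINFO", "MX"}


def walk(top_level, is_authoritative, transfer_zone):
    """Walk the domain tree starting from a primed top-level list.

    top_level maps each top-level domain to its name servers.
    is_authoritative(server, domain) stands in for the SOA check;
    transfer_zone(server, domain) stands in for AXFR and returns a list
    of (name, type, value) records, or None on failure.  Both callables
    are assumptions of this sketch.
    """
    servers = {domain: list(names) for domain, names in top_level.items()}
    loaded = {domain: False for domain in servers}
    hosts = []  # in-memory table of host records

    progress = True
    while progress:  # stop after a full pass that loads nothing new
        progress = False
        for domain in list(servers):  # snapshot: the dict grows below
            if loaded[domain]:
                continue
            records = None
            for server in servers[domain]:
                if not is_authoritative(server, domain):
                    continue
                records = transfer_zone(server, domain)
                if records is not None:
                    break
            if records is None:
                continue  # not transferred; retried only if another pass runs
            loaded[domain] = True
            progress = True
            for name, rtype, value in records:
                if rtype == "NS":
                    # Queue the referenced domain and remember its server.
                    known = servers.setdefault(name, [])
                    loaded.setdefault(name, False)
                    if value not in known:
                        known.append(value)
                elif rtype in HOST_TYPES:
                    hosts.append((name, rtype, value))
    return hosts, loaded


# Hypothetical zone data keyed by (server, domain), for demonstration only.
FAKE_ZONES = {
    ("ns.edu-root", "edu"): [
        ("example.edu", "NS", "ns.example.edu"),
        ("closed.edu", "NS", "ns.closed.edu"),
    ],
    ("ns.example.edu", "example.edu"): [
        ("venus.example.edu", "A", "10.0.0.1"),
        ("www.example.edu", "CNAME", "venus.example.edu"),
    ],
}

hosts, loaded = walk(
    {"edu": ["ns.edu-root"]},
    lambda server, domain: (server, domain) in FAKE_ZONES,
    lambda server, domain: FAKE_ZONES.get((server, domain)),
)
```

 In this run the hypothetical closed.edu domain stays unloaded because
 no server will transfer it, so its hosts never reach the table.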

Problems with Data Collection

 For various reasons, some Internet sites do not allow zone transfers
 from their domain servers.  ZONE also eventually gives up trying to
 transfer a domain after too many failures.  The number of domains
 that could not be zone transferred during the 1-Jan-92 ZONE run was
 around 800 out of 17,000.  Additionally, it is assumed that not all
 hosts on the Internet are registered in a domain server.  These
 problems cause the statistics gathered by ZONE to be lower than the
 actual amounts.

 Manual review of some of the data collected by ZONE also shows a lot
 of random entries in the DNS.  Misformatted entries may cause bogus
 server or host records to appear.  Often a server is found not to be
 authoritative for the domain listed.  Sometimes entire domains are
 renamed and their old entries left in place for a transition period,
 thus causing each host within that domain to be counted twice.  These
 problems cause the results of ZONE to be higher than the actual
 amounts.

 Manual scanning of the data indicates that the additional entries are
 insignificant compared to the missing entries discussed earlier.
 ZONE data can thus be viewed as a lower bound on the number of
 Internet hosts rather than the actual figure.

 A final problem with data collection is that of expense.  Downloading
 domain information from every domain on the Internet generates a
 large amount of network traffic.  It also puts an extra CPU load on
 each domain server it must contact.  An organized effort might be
 considered in which only one such program runs on the Internet, at
 regularly scheduled intervals, so that multiple data collectors do
 not each impose this load.

Scope of the Study

 A problem with counting hosts and domains on the Internet is defining
 what the Internet really is.  Finding a host entry in the DNS does
 not necessarily indicate that the host is reachable from the
 Internet.  Many companies have mail gateways between the Internet and
 their local nets, thus disallowing direct access.  However, some of
 these companies advertise all their hosts, and some advertise only
 the gateway.  Are these hosts on the Internet or not?

 Furthermore, many domains in the DNS are just mail-forwarding (MX)
 entries for off-Internet (such as Usenet) sites.  Are these domains
 really part of the Internet and should they be counted in an Internet
 size study?

 For the purposes of this study, a host has been defined as a
 [name(s),IP-address(es)] grouping discovered from the DNS.  This
 prevents us from counting a host with multiple names or addresses
 more than once.  However, this definition does not consider whether
 the host is directly accessible.  When ZONE counts domains, it
 includes all domains referenced by an NS record in the DNS, thus
 including MX-only domain sites in the final results.
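
 One way to form such [name(s),IP-address(es)] groupings is sketched
 below.  ZONE's own matching logic is not described in this memo, so
 this is an assumption: a union-find structure merges every name and
 address linked by an A or CNAME record, and groups without any
 address are dropped.  The function name group_hosts and the sample
 records are hypothetical.

```python
def group_hosts(records):
    """Group (name, type, value) records into hosts.

    Returns a list of (names, addresses) pairs of sets, one per host.
    Only 'A' and 'CNAME' records are considered in this sketch.
    """
    parent = {}

    def find(key):
        parent.setdefault(key, key)
        while parent[key] != key:
            parent[key] = parent[parent[key]]  # path halving
            key = parent[key]
        return key

    def union(a, b):
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    for name, rtype, value in records:
        if rtype == "A":
            union(("name", name), ("addr", value))
        elif rtype == "CNAME":
            union(("name", name), ("name", value))  # nickname -> official

    groups = {}
    for key in list(parent):
        names, addresses = groups.setdefault(find(key), (set(), set()))
        (names if key[0] == "name" else addresses).add(key[1])

    # Keep only groups with at least one IP address assigned.
    return [group for group in groups.values() if group[1]]


# Hypothetical records, for demonstration only.
SAMPLE_RECORDS = [
    ("venus.example.edu", "A", "10.0.0.1"),
    ("venus.example.edu", "A", "10.0.1.1"),
    ("www.example.edu", "CNAME", "venus.example.edu"),
    ("mars.example.edu", "A", "10.0.0.2"),
    ("old.example.edu", "CNAME", "gone.example.edu"),
]

grouped = group_hosts(SAMPLE_RECORDS)
```

 The five sample records yield two hosts: venus with two names and two
 addresses is counted once, and the dangling nickname is not counted.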

N. Results

 This section presents data from archive tapes of SRI-NIC from 1981 to
 1986, and statistics gathered by runs of ZONE from 1986 to 1992.

N.1 Number of Internet Hosts

 The chart below shows the number of IP hosts on the Internet.  These
 are hosts with at least one IP address assigned.  Data was collected
 by ZONE except where noted.  The two graphs that follow plot the data
 in this chart.

      Date             Hosts          Source
      08/81              213          Host table #152
      05/82              235          Host table #166
      08/83              562          Host table #300
      10/84            1,024          Host table #392
      10/85            1,961          Host table #485
      02/86            2,308          Host table #515
      11/86            5,089
      12/87           28,174
      07/88           33,000
      10/88           56,000
      01/89           80,000
      07/89          130,000
      10/89          159,000
      10/90          313,000
      01/91          376,000
      07/91          535,000
      10/91          617,000
      01/92          727,000

 [Graph: Number of Internet Hosts (linear).  Vertical axis: thousands
 of hosts, 0 to 800.  Horizontal axis: date, 1981 to 1992.  "*" = data
 point, "." = estimate.]

This graph is a linear plot of the number of Internet hosts.

 [Graph: Number of Internet Hosts (logarithmic).  Vertical axis:
 hosts, 100 to 1,000,000 on a logarithmic scale.  Horizontal axis:
 date, 1981 to 1992.  "*" = data point, "." = estimate.]

This graph is a logarithmic plot of the number of Internet hosts.
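
 A roughly straight line on a logarithmic plot corresponds to
 exponential growth, which can be summarized as a doubling time.  The
 sketch below computes one from the host counts in N.1.  It assumes
 steady exponential growth between the two chosen rows, which the memo
 does not claim; the function name doubling_time_months is
 hypothetical.

```python
import math

# (year, month, hosts) rows copied from the chart in N.1.
HOST_COUNTS = [
    (1981, 8, 213), (1982, 5, 235), (1983, 8, 562), (1984, 10, 1024),
    (1985, 10, 1961), (1986, 2, 2308), (1986, 11, 5089), (1987, 12, 28174),
    (1988, 7, 33000), (1988, 10, 56000), (1989, 1, 80000), (1989, 7, 130000),
    (1989, 10, 159000), (1990, 10, 313000), (1991, 1, 376000),
    (1991, 7, 535000), (1991, 10, 617000), (1992, 1, 727000),
]


def doubling_time_months(start, end):
    """Months needed to double, assuming steady exponential growth
    between two (year, month, hosts) rows."""
    months = (end[0] - start[0]) * 12 + (end[1] - start[1])
    return months * math.log(2) / math.log(end[2] / start[2])


# January 1991 to January 1992, and the whole period covered.
last_year = doubling_time_months(HOST_COUNTS[14], HOST_COUNTS[17])
whole_period = doubling_time_months(HOST_COUNTS[0], HOST_COUNTS[-1])
```

 With these figures the doubling time for January 1991 to January 1992
 comes out at roughly 12.6 months, and between 10 and 11 months when
 averaged over the whole period.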

N.2 Number of Domains

 This chart shows the number of domains existing in the Internet
 Domain Name System as collected by ZONE.
      Date           Domains
      07/88              900
      10/88            1,280
      01/89            2,600
      07/89            3,900
      10/89            4,800
      10/90            9,300
      01/91           11,200
      07/91           16,000
      10/91           18,000
      01/92           17,000

N.3 Distribution of IP Addresses per Host

 This chart shows the number of hosts having each number of IP
 addresses.  This data was collected on 1-Jan-92 and only the first 10
 entries are shown.
   Addresses           Hosts
       1              715143
       2                9015
       3                1027
       4                 556
       5                 314
       6                 213
       7                 100
       8                  85
       9                  58
      10                  71
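
 A distribution of this kind can be derived from a host table by
 counting the addresses held by each host.  The sketch below shows the
 idea on a hypothetical three-host table and then computes, from the
 published counts, the share of listed hosts with a single address.
 The function names and SAMPLE data are assumptions, and the share
 covers only the 10 entries shown above.

```python
from collections import Counter

# Counts copied from the chart above (first 10 entries only).
PUBLISHED = {1: 715143, 2: 9015, 3: 1027, 4: 556, 5: 314,
             6: 213, 7: 100, 8: 85, 9: 58, 10: 71}


def address_distribution(host_addresses):
    """Map 'number of addresses' to 'number of hosts with that many'."""
    return Counter(len(addresses) for addresses in host_addresses.values())


def single_address_share(distribution):
    """Fraction of the listed hosts that have exactly one address."""
    return distribution[1] / sum(distribution.values())


# Hypothetical host table: host name -> set of IP addresses.
SAMPLE = {
    "venus.example.edu": {"10.0.0.1"},
    "mars.example.edu": {"10.0.0.2"},
    "gw.example.edu": {"10.0.0.3", "10.0.1.3"},
}

sample_distribution = address_distribution(SAMPLE)
published_share = single_address_share(PUBLISHED)
```

 For the published counts the single-address share is about 98
 percent of the hosts listed.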

N.4 Distribution of Hosts by Top-level Domain

 This chart shows the number of hosts per top-level domain (top 40
 only) on 1-Jan-92.  The percentage listed is the change since
 1-Oct-91; a negative value indicates a decrease.  Large variations
 are probably due to problems and variations in the collection
 process; these figures are not meant to be authoritative, but serve
 as reasonable estimates.
 243020 edu 13%     13011 fr    4%     1791 dk   4%     357 be  -5%
 181361 com 12%     12770 nl   21%     1662 es  15%     334 gr  14%
  46463 gov 13%     12647 ch   10%     1506 kr   9%     308 br  26%
  31622 au  19%     11994 fi   15%     1111 nz -16%     284 mx  -5%
  31016 de  20%     10228 no    9%     1016 tw  n/a     207 is   0%
  27492 mil 26%      8579 jp    6%      929 za  n/a     146 pl  97%
  27052 ca  22%      4109 net -49%      784 pt  n/a     127 us  25%
  19117 org 10%      3324 at   19%      484 sg 251%      25 tn   0%
  18984 uk 139%      2719 it  197%      448 hk  78%      24 hu  71%
  18473 se  34%      2020 il   14%      374 ie  -7%       6 arpa 0%
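
 The per-domain counts and percentages above can be reproduced in
 outline as follows.  This is a sketch, not the procedure used for the
 chart: the function names and sample host names are hypothetical, and
 returning None where a domain has no earlier count is an assumption
 about how an "n/a" entry might arise.

```python
from collections import Counter


def hosts_by_tld(hostnames):
    """Count host names by their last (top-level) label."""
    return Counter(
        name.rstrip(".").rsplit(".", 1)[-1].lower() for name in hostnames
    )


def percent_change(previous, current):
    """Rounded percent change per top-level domain between two runs.

    Domains with no earlier count map to None.
    """
    changes = {}
    for tld, now in current.items():
        before = previous.get(tld, 0)
        changes[tld] = round(100 * (now - before) / before) if before else None
    return changes


# Hypothetical host name lists from two successive runs.
OLD_RUN = ["a.example.edu", "b.example.edu"]
NEW_RUN = ["a.example.edu", "b.example.edu", "c.example.edu", "x.example.sg"]

old_counts = hosts_by_tld(OLD_RUN)
new_counts = hosts_by_tld(NEW_RUN)
changes = percent_change(old_counts, new_counts)
```

 Here edu grows from 2 to 3 hosts, a 50 percent increase, while sg has
 no earlier figure to compare against.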

N.5 Distribution of Hosts by Host Name

 This chart shows the distribution of hosts by their host name on
 1-Jan-92.  The host name is defined to be the first part of a fully
 qualified domain name.  Only the top 100 names are shown.

 384 venus      204 mac4       172 mac9       155 pollux     138 chaos
 356 pluto      201 hobbes     172 mac11      155 frodo      136 bart
 323 mars       201 hermes     170 mac8       153 helios     135 pc5
 288 jupiter    198 thor       169 phoenix    152 mac17      135 larry
 286 saturn     198 sirius     169 mac12      151 vega       135 cs
 285 pc1        196 gw         169 hal        151 mac18      133 odin
 282 zeus       195 calvin     168 snoopy     150 falcon     131 tiger
 262 iris       194 mac5       168 mac13      150 bach       131 sparky
 260 mercury    191 mac10      167 mac15      146 castor     131 ariel
 259 mac1       190 fred       167 mac14      145 sol        130 sneezy
 258 orion      189 titan      167 grumpy     145 dopey      128 mac
 254 mac2       189 pc3        163 gandalf    144 mac20      127 sun1
 240 newton     186 opus       162 pc4        144 mac19      127 rocky
 234 neptune    186 mac6       160 uranus     142 spock      126 pc6
 233 pc2        185 charon     159 mac16      142 euler      125 hydra
 224 gauss      185 apollo     158 sleepy     141 mickey     125 homer
 222 eagle      179 mac7       158 io         141 atlas      124 isis
 213 mac3       179 athena     157 earth      140 maxwell    123 moe
 209 merlin     177 alpha      156 europa     140 happy      123 delta
 207 cisco      172 mozart     155 rigel      140 doc        122 pc10
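
 The host name definition above (the first part of a fully qualified
 domain name) makes this tally easy to sketch.  The function name
 top_host_names and the sample names are hypothetical; the memo itself
 only mentions utilities such as sort, uniq and grep.

```python
from collections import Counter


def top_host_names(fqdns, n=100):
    """Return the n most common first labels as (name, count) pairs."""
    first_labels = (name.split(".", 1)[0].lower() for name in fqdns)
    return Counter(first_labels).most_common(n)


# Hypothetical fully qualified domain names.
SAMPLE_FQDNS = [
    "venus.example.edu",
    "venus.example.com",
    "pluto.example.edu",
]

top_two = top_host_names(SAMPLE_FQDNS, 2)
```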

Future Issues

 ZONE currently runs on a DECsystem-20 and is written in assembler.
 The amount of data is quickly reaching the limits of the DEC-20
 section address space, and the chances of the hardware surviving get
 slimmer each day.  ZONE assembles all its data in core before dumping
 it to disk.  The implementation does this in order to be able to
 match host nicknames with official names before dumping complete host
 records.  Sometimes a nickname can be in a different domain than the
 official name, complicating simpler methods.

 A new version of ZONE needs to be written to run on a modern computer
 system.  A completely new architecture should be designed to handle
 the enormous amount of data collected and expected in the future.
 Data should be kept on disk so that a system crash will not wipe out
 days of collection.  Multiple zone transfers could run in parallel to
 reduce the time needed for data gathering.  A new ZONE might run
 continuously, working through the domain system in a cycle lasting
 from weeks to a month, updating a local database with statistics
 collected for each domain.  In this way, current statistics on the
 size of the Internet would always be known.  The resulting database
 may also be useful for other network information services.
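
 Two of the ideas above, parallel zone transfers and statistics kept
 on disk, might be combined as in the sketch below.  No such design is
 specified in this memo, so everything here is an assumption: the
 function collect, the stats table layout, the use of threads and
 SQLite, and the transfer_zone callable, which stands in for a real
 zone transfer and is expected to return (name, type, value) records
 or raise an exception on failure.

```python
import sqlite3
from concurrent.futures import ThreadPoolExecutor, as_completed


def collect(domains, transfer_zone, db_path, workers=4):
    """Transfer domains in parallel, storing a host count per domain.

    Returns the total number of hosts recorded in the database.
    """
    db = sqlite3.connect(db_path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS stats "
        "(domain TEXT PRIMARY KEY, hosts INTEGER)"
    )
    with ThreadPoolExecutor(max_workers=workers) as pool:
        futures = {pool.submit(transfer_zone, d): d for d in domains}
        for future in as_completed(futures):
            domain = futures[future]
            try:
                records = future.result()
            except Exception:
                continue  # failed transfer: keep any earlier row as is
            names = {name for name, rtype, _ in records if rtype == "A"}
            db.execute(
                "INSERT OR REPLACE INTO stats VALUES (?, ?)",
                (domain, len(names)),
            )
            db.commit()  # commit per domain so finished work survives a crash
    total = db.execute(
        "SELECT COALESCE(SUM(hosts), 0) FROM stats"
    ).fetchone()[0]
    db.close()
    return total


# Hypothetical zone contents, for demonstration only.
FAKE_TRANSFERS = {
    "a.example": [
        ("x.a.example", "A", "10.0.0.1"),
        ("x.a.example", "A", "10.0.0.2"),
        ("y.a.example", "A", "10.0.0.3"),
    ],
    "b.example": [("z.b.example", "A", "10.0.1.1")],
}


def fake_transfer(domain):
    """Stand-in for a zone transfer; unknown domains fail."""
    if domain not in FAKE_TRANSFERS:
        raise ConnectionError(domain)
    return FAKE_TRANSFERS[domain]


total_hosts = collect(
    ["a.example", "b.example", "c.example"], fake_transfer, ":memory:"
)
```

 Only the worker threads perform transfers; the database is written
 from the calling thread, which keeps the storage logic simple.  A
 real collector would pass a file path rather than ":memory:" so that
 results persist between runs.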

RFC References

 Libes, D., "Choosing a Name for Your Computer", RFC 1178, Integrated
 Systems Group/NIST, August 1990.  (Also FYI 5.)

 Mockapetris, P., "Domain Names - Implementation and Specification",
 RFC 1035, USC/Information Sciences Institute, November 1987.

 Mockapetris, P., "Domain Names - Concepts and Facilities", RFC 1034,
 USC/Information Sciences Institute, November 1987.

 Lazear, W., "MILNET Name Domain Transition", RFC 1031, Mitre,
 November 1987.

 Harrenstien, K., Stahl, M., and J. Feinler, "DoD Internet Host Table
 Specification", RFC 952, SRI, October 1985.

 Postel, J., "Domain Name System Implementation Schedule - Revised",
 RFC 921, USC/Information Sciences Institute, October 1984.

Security Considerations

 Security issues are not discussed in this memo.

Author's Address

 Mark K. Lottor
 SRI International
 Network Information Systems Center
 333 Ravenswood Avenue, EJ282
 Menlo Park, CA  94025
 EMail: mkl@nisc.sri.com
