GENWiki

Premier IT Outsourcing and Support Services within the UK

User Tools

Site Tools


rfc:rfc2781

Network Working Group P. Hoffman Request for Comments: 2781 Internet Mail Consortium Category: Informational F. Yergeau

                                                    Alis Technologies
                                                        February 2000
                  UTF-16, an encoding of ISO 10646

Status of this Memo

 This memo provides information for the Internet community.  It does
 not specify an Internet standard of any kind.  Distribution of this
 memo is unlimited.

Copyright Notice

 Copyright (C) The Internet Society (2000).  All Rights Reserved.

1. Introduction

 This document describes the UTF-16 encoding of Unicode/ISO-10646,
 addresses the issues of serializing UTF-16 as an octet stream for
 transmission over the Internet, discusses MIME charset naming as
 described in [CHARSET-REG], and contains the registration for three
 MIME charset parameter values: UTF-16BE (big-endian), UTF-16LE
 (little-endian), and UTF-16.

1.1 Background and motivation

 The Unicode Standard [UNICODE] and ISO/IEC 10646 [ISO-10646] jointly
 define a coded character set (CCS), hereafter referred to as Unicode,
 which encompasses most of the world's writing systems [WORKSHOP].
 UTF-16, the object of this specification, is one of the standard ways
 of encoding Unicode character data; it has the characteristics of
 encoding all currently defined characters (in plane 0, the BMP) in
 exactly two octets and of being able to encode all other characters
 likely to be defined (the next 16 planes) in exactly four octets.
 The Unicode Standard further defines additional character properties
 and other application details of great interest to implementors. Up
 to the present time, changes in Unicode and amendments to ISO/IEC
 10646 have tracked each other, so that the character repertoires and
 code point assignments have remained in sync. The relevant
 standardization committees have committed to maintain this very
 useful synchronism, as well as not to assign characters outside of
 the 17 planes accessible to UTF-16.

Hoffman & Yergeau Informational [Page 1] RFC 2781 UTF-16, an encoding of ISO 10646 February 2000

 The IETF policy on character sets and languages [CHARPOLICY] says
 that IETF protocols MUST be able to use the UTF-8 character encoding
 scheme [UTF-8]. Some products and network standards already specify
 UTF-16, making it an important encoding for the Internet. This
 document is not an update to the [CHARPOLICY] document, only a
 description of the UTF-16 encoding.

1.2 Terminology

 The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
 "SHOULD", "SHOULD NOT", "RECOMMENDED",  "MAY", and "OPTIONAL" in this
 document are to be interpreted as described in RFC 2119 [MUSTSHOULD].
 Throughout this document, character values are shown in hexadecimal
 notation. For example, "0x013C" is the character whose value is the
 character assigned the integer value 316 (decimal) in the CCS.

2. UTF-16 definition

 UTF-16 is described in the Unicode Standard, version 3.0 [UNICODE].
 The definitive reference is Annex Q of ISO/IEC 10646-1 [ISO-10646].
 The rest of this section summarizes the definition is simple terms.
 In ISO 10646, each character is assigned a number, which Unicode
 calls the Unicode scalar value. This number is the same as the UCS-4
 value of the character, and this document will refer to it as the
 "character value" for brevity. In the UTF-16 encoding, characters are
 represented using either one or two unsigned 16-bit integers,
 depending on the character value. Serialization of these integers for
 transmission as a byte stream is discussed in Section 3.
 The rules for how characters are encoded in UTF-16 are:
  1. Characters with values less than 0x10000 are represented as a

single 16-bit integer with a value equal to that of the character

    number.
  1. Characters with values between 0x10000 and 0x10FFFF are

represented by a 16-bit integer with a value between 0xD800 and

    0xDBFF (within the so-called high-half zone or high surrogate
    area) followed by a 16-bit integer with a value between 0xDC00 and
    0xDFFF (within the so-called low-half zone or low surrogate area).
  1. Characters with values greater than 0x10FFFF cannot be encoded in

UTF-16.

 Note: Values between 0xD800 and 0xDFFF are specifically reserved for
 use with UTF-16, and don't have any characters assigned to them.

Hoffman & Yergeau Informational [Page 2] RFC 2781 UTF-16, an encoding of ISO 10646 February 2000

2.1 Encoding UTF-16

 Encoding of a single character from an ISO 10646 character value to
 UTF-16 proceeds as follows. Let U be the character number, no greater
 than 0x10FFFF.
 1) If U < 0x10000, encode U as a 16-bit unsigned integer and
    terminate.
 2) Let U' = U - 0x10000. Because U is less than or equal to 0x10FFFF,
    U' must be less than or equal to 0xFFFFF. That is, U' can be
    represented in 20 bits.
 3) Initialize two 16-bit unsigned integers, W1 and W2, to 0xD800 and
    0xDC00, respectively. These integers each have 10 bits free to
    encode the character value, for a total of 20 bits.
 4) Assign the 10 high-order bits of the 20-bit U' to the 10 low-order
    bits of W1 and the 10 low-order bits of U' to the 10 low-order
    bits of W2. Terminate.
 Graphically, steps 2 through 4 look like:
 U' = yyyyyyyyyyxxxxxxxxxx
 W1 = 110110yyyyyyyyyy
 W2 = 110111xxxxxxxxxx

2.2 Decoding UTF-16

 Decoding of a single character from UTF-16 to an ISO 10646 character
 value proceeds as follows. Let W1 be the next 16-bit integer in the
 sequence of integers representing the text. Let W2 be the (eventual)
 next integer following W1.
 1) If W1 < 0xD800 or W1 > 0xDFFF, the character value U is the value
    of W1. Terminate.
 2) Determine if W1 is between 0xD800 and 0xDBFF. If not, the sequence
    is in error and no valid character can be obtained using W1.
    Terminate.
 3) If there is no W2 (that is, the sequence ends with W1), or if W2
    is not between 0xDC00 and 0xDFFF, the sequence is in error.
    Terminate.
 4) Construct a 20-bit unsigned integer U', taking the 10 low-order
    bits of W1 as its 10 high-order bits and the 10 low-order bits of
    W2 as its 10 low-order bits.

Hoffman & Yergeau Informational [Page 3] RFC 2781 UTF-16, an encoding of ISO 10646 February 2000

 5) Add 0x10000 to U' to obtain the character value U. Terminate.
 Note that steps 2 and 3 indicate errors. Error recovery is not
 specified by this document. When terminating with an error in steps 2
 and 3, it may be wise to set U to the value of W1 to help the caller
 diagnose the error and not lose information. Also note that a string
 decoding algorithm, as opposed to the single-character decoding
 described above, need not terminate upon detection of an error, if
 proper error reporting and/or recovery is provided.

3. Labelling UTF-16 text

 Appendix A of this specification contains registrations for three
 MIME charsets: "UTF-16BE", "UTF-16LE", and "UTF-16". MIME charsets
 represent the combination of a CCS (a coded character set) and a CES
 (a character encoding scheme). Here the CCS is Unicode/ISO 10646 and
 the CES is the same in all three cases, except for the serialization
 order of the octets in each character, and the external determination
 of which serialization is used.
 This section describes which of the three labels to apply to a stream
 of text. Section 4 describes how to interpret the labels on a stream
 of text.

3.1 Definition of big-endian and little-endian

 Historically, computer hardware has processed two-octet entities such
 as 16-bit integers in one of two ways. So-called "big-endian"
 hardware handles two-octet entities with the higher-order octet
 first, that is at the lower address in memory; when written out to
 disk or to a network interface (serializing), the high-order octet
 thus appears first in the data stream. On the other hand, "Little-
 endian" hardware handles two-octet entities with the lower-order
 octet first. Hardware of both kinds is common today.
 For example, the unsigned 16-bit integer that represents the decimal
 number 258 is 0x0102. The big-endian serialization of that number is
 the octet 0x01 followed by the octet 0x02. The little-endian
 serialization of that number is the octet 0x02 followed by the octet
 0x01. The following C code fragment demonstrates a way to write 16-
 bit quantities to a file in big-endian order, irrespective of the
 hardware's native byte order.
void write_be(unsigned short u, FILE f)  /* assume short is 16 bits */
{
  putc(u >> 8,   f);                     /* output high-order byte */
  putc(u & 0xFF, f);                     /* then low-order */
}

Hoffman & Yergeau Informational [Page 4] RFC 2781 UTF-16, an encoding of ISO 10646 February 2000

 The term "network byte order" has been used in many RFCs to indicate
 big-endian serialization, although that term has yet to be formally
 defined in a standards-track document. Although ISO 10646 prefers
 big-endian serialization (section 6.3 of [ISO-10646]), little-endian
 order is also sometimes used on the Internet.

3.2 Byte order mark (BOM)

 The Unicode Standard and ISO 10646 define the character "ZERO WIDTH
 NON-BREAKING SPACE" (0xFEFF), which is also known informally as "BYTE
 ORDER MARK" (abbreviated "BOM"). The latter name hints at a second
 possible usage of the character, in addition to its normal use as a
 genuine "ZERO WIDTH NON-BREAKING SPACE" within text. This usage,
 suggested by Unicode section 2.4 and ISO 10646 Annex F (informative),
 is to prepend a 0xFEFF character to a stream of Unicode characters as
 a "signature"; a receiver of such a serialized stream may then use
 the initial character both as a hint that the stream consists of
 Unicode characters and as a way to recognize the serialization order.
 In serialized UTF-16 prepended with such a signature, the order is
 big-endian if the first two octets are 0xFE followed by 0xFF; if they
 are 0xFF followed by 0xFE, the order is little-endian. Note that
 0xFFFE is not a Unicode character, precisely to preserve the
 usefulness of 0xFEFF as a byte-order mark.
 It is important to understand that the character 0xFEFF appearing at
 any position other than the beginning of a stream MUST be interpreted
 with the semantics for the zero-width non-breaking space, and MUST
 NOT be interpreted as a byte-order mark. The contrapositive of that
 statement is not always true: the character 0xFEFF in the first
 position of a stream MAY be interpreted as a zero-width non-breaking
 space, and is not always a byte-order mark. For example, if a process
 splits a UTF-16 string into many parts, a part might begin with
 0xFEFF because there was a zero-width non-breaking space at the
 beginning of that substring.
 The Unicode standard further suggests than an initial 0xFEFF
 character may be stripped before processing the text, the rationale
 being that such a character in initial position may be an artifact of
 the encoding (an encoding signature), not a genuine intended "ZERO
 WIDTH NON-BREAKING SPACE". Note that such stripping might affect an
 external process at a different layer (such as a digital signature or
 a count of the characters) that is relying on the presence of all
 characters in the stream.
 In particular, in UTF-16 plain text it is likely, but not certain,
 that an initial 0xFEFF is a signature. When concatenating two
 strings, it is important to strip out those signatures, because
 otherwise the resulting string may contain an unintended "ZERO WIDTH

Hoffman & Yergeau Informational [Page 5] RFC 2781 UTF-16, an encoding of ISO 10646 February 2000

 NON-BREAKING SPACE" at the connection point. Also, some
 specifications mandate an initial 0xFEFF character in objects
 labelled as UTF-16 and specify that this signature is not part of the
 object.

3.3 Choosing a label for UTF-16 text

 Any labelling application that uses UTF-16 character encoding, and
 explicitly labels the text, and knows the serialization order of the
 characters in text, SHOULD label the text as either "UTF-16BE" or
 "UTF-16LE", whichever is appropriate based on the endianness of the
 text. This allows applications processing the text, but unable to
 look inside the text, to know the serialization definitively.
 Text in the "UTF-16BE" charset MUST be serialized with the octets
 which make up a single 16-bit UTF-16 value in big-endian order.
 Systems labelling UTF-16BE text MUST NOT prepend a BOM to the text.
 Text in the "UTF-16LE" charset MUST be serialized with the octets
 which make up a single 16-bit UTF-16 value in little-endian order.
 Systems labelling UTF-16LE text MUST NOT prepend a BOM to the text.
 Any labelling application that uses UTF-16 character encoding, and
 puts an explicit charset label on the text, and does not know the
 serialization order of the characters in text, MUST label the text as
 "UTF-16", and SHOULD make sure the text starts with 0xFEFF.
 An exception to the "SHOULD" rule of using "UTF-16BE" or "UTF-16LE"
 would occur with document formats that mandate a BOM in UTF-16 text,
 thereby requiring the use of the "UTF-16" tag only.

4. Interpreting text labels

 When a program sees text labelled as "UTF-16BE", "UTF-16LE", or
 "UTF-16", it can make some assumptions, based on the labelling rules
 given in the previous section. These assumptions allow the program to
 then process the text.

4.1 Interpreting text labelled as UTF-16BE

 Text labelled "UTF-16BE" can always be interpreted as being big-
 endian.  The detection of an initial BOM does not affect de-
 serialization of text labelled as UTF-16BE. Finding 0xFF followed by
 0xFE is an error since there is no Unicode character 0xFFFE.

Hoffman & Yergeau Informational [Page 6] RFC 2781 UTF-16, an encoding of ISO 10646 February 2000

4.2 Interpreting text labelled as UTF-16LE

 Text labelled "UTF-16LE" can always be interpreted as being little-
 endian. The detection of an initial BOM does not affect de-
 serialization of text labelled as UTF-16LE. Finding 0xFE followed by
 0xFF is an error since there is no Unicode character 0xFFFE, which
 would be the interpretation of those octets under little-endian
 order.

4.3 Interpreting text labelled as UTF-16

 Text labelled with the "UTF-16" charset might be serialized in either
 big-endian or little-endian order. If the first two octets of the
 text is 0xFE followed by 0xFF, then the text can be interpreted as
 being big-endian. If the first two octets of the text is 0xFF
 followed by 0xFE, then the text can be interpreted as being little-
 endian. If the first two octets of the text is not 0xFE followed by
 0xFF, and is not 0xFF followed by 0xFE, then the text SHOULD be
 interpreted as being big-endian.
 All applications that process text with the "UTF-16" charset label
 MUST be able to read at least the first two octets of the text and be
 able to process those octets in order to determine the serialization
 order of the text. Applications that process text with the "UTF-16"
 charset label MUST NOT assume the serialization without first
 checking the first two octets to see if they are a big-endian BOM, a
 little-endian BOM, or not a BOM. All applications that process text
 with the "UTF-16" charset label MUST be able to interpret both big-
 endian and little-endian text.

5. Examples

 For the sake of example, let's suppose that there is a hieroglyphic
 character representing the Egyptian god Ra with character value
 0x12345 (this character does not exist at present in Unicode).
 The examples here all evaluate to the phrase:
  • =Ra
 where the "*" represents the Ra hieroglyph (0x12345).
 Text labelled with UTF-16BE, without a BOM:
 D8 08 DF 45 00 3D 00 52 00 61
 Text labelled with UTF-16LE, without a BOM:
 08 D8 45 DF 3D 00 52 00 61 00

Hoffman & Yergeau Informational [Page 7] RFC 2781 UTF-16, an encoding of ISO 10646 February 2000

 Big-endian text labelled with UTF-16, with a BOM:
 FE FF D8 08 DF 45 00 3D 00 52 00 61
 Little-endian text labelled with UTF-16, with a BOM:
 FF FE 08 D8 45 DF 3D 00 52 00 61 00

6. Versions of the standards

 ISO/IEC 10646 is updated from time to time by published amendments;
 similarly, different versions of the Unicode standard exist: 1.0,
 1.1, 2.0, 2.1, and 3.0 as of this writing. Each new version replaces
 the previous one, but implementations, and more significantly data,
 are not updated instantly.
 In general, the changes amount to adding new characters, which does
 not pose particular problems with old data. Amendment 5 to ISO/IEC
 10646, however, has moved and expanded the Korean Hangul block,
 thereby making any previous data containing Hangul characters invalid
 under the new version. Unicode 2.0 has the same difference from
 Unicode 1.1. The official justification for allowing such an
 incompatible change was that no significant implementations and data
 containing Hangul existed, a statement that is likely to be true but
 remains unprovable. The incident has been dubbed the "Korean mess",
 and the relevant committees have pledged to never, ever again make
 such an incompatible change.
 New versions, and in particular any incompatible changes, have
 consequences regarding MIME character encoding labels, to be
 discussed in Appendix A.

7. IANA Considerations

 IANA is to register the character sets found in Appendixes A.1, A.2,
 and A.3 according to RFC 2278, using registration templates found in
 those appendixes.

8. Security Considerations

 UTF-16 is based on the ISO 10646 character set, which is frequently
 being added to, as described in Section 6 and Appendix A of this
 document. Processors must be able to handle characters that are not
 defined at the time that the processor was created in such a way as
 to not allow an attacker to harm a recipient by including unknown
 characters.
 Processors that handle any type of text, including text encoded as
 UTF-16, must be vigilant in checking for control characters that
 might reprogram a display terminal or keyboard. Similarly, processors

Hoffman & Yergeau Informational [Page 8] RFC 2781 UTF-16, an encoding of ISO 10646 February 2000

 that interpret text entities (such as looking for embedded
 programming code), must be careful not to execute the code without
 first alerting the recipient.
 Text in UTF-16 may contain special characters, such as the OBJECT
 REPLACEMENT CHARACTER (0xFFFC), that might cause external processing,
 depending on the interpretation of the processing program and the
 availability of an external data stream that would be executed. This
 external processing may have side-effects that allow the sender of a
 message to attack the receiving system.
 Implementors of UTF-16 need to consider the security aspects of how
 they handle illegal UTF-16 sequences (that is, sequences involving
 surrogate pairs that have illegal values or unpaired surrogates). It
 is conceivable that in some circumstances an attacker would be able
 to exploit an incautious UTF-16 parser by sending it an octet
 sequence that is not permitted by the UTF-16 syntax, causing it to
 behave in some anomalous fashion.

9. References

 [CHARPOLICY]  Alvestrand, H., "IETF Policy on Character Sets and
               Languages", BCP 18, RFC 2277, January 1998.
 [CHARSET-REG] Freed, N. and J. Postel, "IANA Charset Registration
               Procedures", BCP 19, RFC 2278, January 1998.
 [HTTP-1.1]    Fielding, R., Gettys, J., Mogul, J., Frystyk, H.,
               Masinter, L., Leach, P. and T. Berners-Lee, "Hypertext
               Transfer Protocol -- HTTP/1.1", RFC 2616, June 1999.
 [ISO-10646]   ISO/IEC 10646-1:1993. International Standard --
               Information technology -- Universal Multiple-Octet
               Coded Character Set (UCS) -- Part 1: Architecture and
               Basic Multilingual Plane. 22 amendments and two
               technical corrigenda have been published up to now.
               UTF-16 is described in Annex Q, published as Amendment
               1. Many other amendments are currently at various
               stages of standardization. A second edition is in
               preparation, probably to be published in 2000; in this
               new edition, UTF-16 will probably be described in Annex
               C.
 [MUSTSHOULD]  Bradner, S., "Key words for use in RFCs to Indicate
               Requirement Levels", BCP 14, RFC 2119, March 1997.
 [UNICODE]     The Unicode Consortium, "The Unicode Standard --
               Version 3.0", ISBN 0-201-61633-5. Described at

Hoffman & Yergeau Informational [Page 9] RFC 2781 UTF-16, an encoding of ISO 10646 February 2000

 <http://www.unicode.org/unicode/standard/versions/Unicode3.0.html>.
 [UTF-8]       Yergeau, F., "UTF-8, a transformation format of ISO
               10646", RFC 2279, January 1998.
 [WORKSHOP]    Weider, C., Preston, C., Simonsen, K., Alvestrand, H.,
               Atkinson, R., Crispin., M. and P. Svanberg, "Report of
               the IAB Character Set Workshop", RFC 2130, April 1997.

10. Acknowledgments

 Deborah Goldsmith wrote a great deal of the initial wording for this
 specification. Martin Duerst proposed numerous significant changes.
 Other significant contributors include:
 Mati Allouche
 Walt Daniels
 Mark Davis
 Ned Freed
 Asmus Freytag
 Lloyd Honomichl
 Dan Kegel
 Murata Makoto
 Larry Masinter
 Markus Scherer
 Keld Simonsen
 Ken Whistler
 Some of the text in this specification was copied from [UTF-8], and
 that document was worked on by many people. Please see the
 acknowledgments section in that document for more people who may have
 contributed indirectly to this document.

Hoffman & Yergeau Informational [Page 10] RFC 2781 UTF-16, an encoding of ISO 10646 February 2000

A. Charset registrations

 This memo is meant to serve as the basis for registration of three
 MIME charsets [CHARSET-REG]. The proposed charsets are "UTF-16BE",
 "UTF-16LE", and "UTF-16". These strings label objects containing text
 consisting of characters from the repertoire of ISO/IEC 10646
 including all amendments at least up to amendment 5 (Korean block),
 encoded to a sequence of octets using the encoding and serialization
 schemes outlined above.
 Note that "UTF-16BE", "UTF-16LE", and "UTF-16" are NOT suitable for
 use in media types under the "text" top-level type, because they do
 not encode line endings in the way required for MIME "text" media
 types. An exception to this is HTTP, which uses a MIME-like
 mechanism, but is exempt from the restrictions on the text top-level
 type (see section 19.4.2 of HTTP 1.1 [HTTP-1.1]).
 It is noteworthy that the labels described here do not contain a
 version identification, referring generically to ISO/IEC 10646. This
 is intentional, the rationale being as follows:
 A MIME charset is designed to give just the information needed to
 interpret a sequence of bytes received on the wire into a sequence of
 characters, nothing more (see RFC 2045, section 2.2, in [MIME]). As
 long as a character set standard does not change incompatibly,
 version numbers serve no purpose, because one gains nothing by
 learning from the tag that newly assigned characters may be received
 that one doesn't know about. The tag itself doesn't teach anything
 about the new characters, which are going to be received anyway.
 Hence, as long as the standards evolve compatibly, the apparent
 advantage of having labels that identify the versions is only that,
 apparent. But there is a disadvantage to such version-dependent
 labels: when an older application receives data accompanied by a
 newer, unknown label, it may fail to recognize the label and be
 completely unable to deal with the data, whereas a generic, known
 label would have triggered mostly correct processing of the data,
 which may well not contain any new characters.
 The "Korean mess" (ISO/IEC 10646 amendment 5) is an incompatible
 change, in principle contradicting the appropriateness of a version
 independent MIME charset as described above. But the compatibility
 problem can only appear with data containing Korean Hangul characters
 encoded according to Unicode 1.1 (or equivalently ISO/IEC 10646
 before amendment 5), and there is arguably no such data to worry
 about, this being the very reason the incompatible change was deemed
 acceptable.

Hoffman & Yergeau Informational [Page 11] RFC 2781 UTF-16, an encoding of ISO 10646 February 2000

 In practice, then, a version-independent label is warranted, provided
 the label is understood to refer to all versions after Amendment 5,
 and provided no incompatible change actually occurs. Should
 incompatible changes occur in a later version of ISO/IEC 10646, the
 MIME charsets defined here will stay aligned with the previous
 version until and unless the IETF specifically decides otherwise.

A.1 Registration for UTF-16BE

 To: ietf-charsets@iana.org
 Subject: Registration of new charset
 Charset name(s): UTF-16BE
 Published specification(s): This specification
 Suitable for use in MIME content types under the
 "text" top-level type: No
 Person & email address to contact for further information:
 Paul Hoffman <phoffman@imc.org>
 Francois Yergeau <fyergeau@alis.com>

A.2 Registration for UTF-16LE

 To: ietf-charsets@iana.org
 Subject: Registration of new charset
 Charset name(s): UTF-16LE
 Published specification(s): This specification
 Suitable for use in MIME content types under the
 "text" top-level type: No
 Person & email address to contact for further information:
 Paul Hoffman <phoffman@imc.org>
 Francois Yergeau <fyergeau@alis.com>

A.3 Registration for UTF-16

 To: ietf-charsets@iana.org
 Subject: Registration of new charset
 Charset name(s): UTF-16
 Published specification(s): This specification

Hoffman & Yergeau Informational [Page 12] RFC 2781 UTF-16, an encoding of ISO 10646 February 2000

 Suitable for use in MIME content types under the
 "text" top-level type: No
 Person & email address to contact for further information:
 Paul Hoffman <phoffman@imc.org>
 Francois Yergeau <fyergeau@alis.com>

Authors' Addresses

 Paul Hoffman
 Internet Mail Consortium
 127 Segre Place
 Santa Cruz, CA  95060 USA
 EMail: phoffman@imc.org
 Francois Yergeau
 Alis Technologies
 100, boul. Alexis-Nihon, Suite 600
 Montreal  QC  H4M 2P2 Canada
 EMail: fyergeau@alis.com

Hoffman & Yergeau Informational [Page 13] RFC 2781 UTF-16, an encoding of ISO 10646 February 2000

Full Copyright Statement

 Copyright (C) The Internet Society (2000).  All Rights Reserved.
 This document and translations of it may be copied and furnished to
 others, and derivative works that comment on or otherwise explain it
 or assist in its implementation may be prepared, copied, published
 and distributed, in whole or in part, without restriction of any
 kind, provided that the above copyright notice and this paragraph are
 included on all such copies and derivative works.  However, this
 document itself may not be modified in any way, such as by removing
 the copyright notice or references to the Internet Society or other
 Internet organizations, except as needed for the purpose of
 developing Internet standards in which case the procedures for
 copyrights defined in the Internet Standards process must be
 followed, or as required to translate it into languages other than
 English.
 The limited permissions granted above are perpetual and will not be
 revoked by the Internet Society or its successors or assigns.
 This document and the information contained herein is provided on an
 "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
 TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
 BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
 HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
 MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.

Acknowledgement

 Funding for the RFC Editor function is currently provided by the
 Internet Society.

Hoffman & Yergeau Informational [Page 14]

/data/webs/external/dokuwiki/data/pages/rfc/rfc2781.txt · Last modified: 2000/02/23 19:34 by 127.0.0.1

Donate Powered by PHP Valid HTML5 Valid CSS Driven by DokuWiki