--- /dev/null
+rfc1939.txt POP3
+rfc2821.txt SMTP
+rfc2822.txt Internet Message Format
+rfc977.txt NNTP
+rfc2045.txt MIME 1
+rfc2046.txt MIME 2
+rfc2047.txt MIME 3
+rfc2048.txt MIME 4
+rfc2049.txt MIME 5
+rfc2060.txt IMAP4
--- /dev/null
+\r
+\r
+\r
+\r
+\r
+\r
+Network Working Group J. Myers\r
+Request for Comments: 1939 Carnegie Mellon\r
+STD: 53 M. Rose\r
+Obsoletes: 1725 Dover Beach Consulting, Inc.\r
+Category: Standards Track May 1996\r
+\r
+\r
+ Post Office Protocol - Version 3\r
+\r
+Status of this Memo\r
+\r
+ This document specifies an Internet standards track protocol for the\r
+ Internet community, and requests discussion and suggestions for\r
+ improvements. Please refer to the current edition of the "Internet\r
+ Official Protocol Standards" (STD 1) for the standardization state\r
+ and status of this protocol. Distribution of this memo is unlimited.\r
+\r
+Table of Contents\r
+\r
+ 1. Introduction ................................................ 2\r
+ 2. A Short Digression .......................................... 2\r
+ 3. Basic Operation ............................................. 3\r
+ 4. The AUTHORIZATION State ..................................... 4\r
+ QUIT Command ................................................ 5\r
+ 5. The TRANSACTION State ....................................... 5\r
+ STAT Command ................................................ 6\r
+ LIST Command ................................................ 6\r
+ RETR Command ................................................ 8\r
+ DELE Command ................................................ 8\r
+ NOOP Command ................................................ 9\r
+ RSET Command ................................................ 9\r
+ 6. The UPDATE State ............................................ 10\r
+ QUIT Command ................................................ 10\r
+ 7. Optional POP3 Commands ...................................... 11\r
+ TOP Command ................................................. 11\r
+ UIDL Command ................................................ 12\r
+ USER Command ................................................ 13\r
+ PASS Command ................................................ 14\r
+ APOP Command ................................................ 15\r
+ 8. Scaling and Operational Considerations ...................... 16\r
+ 9. POP3 Command Summary ........................................ 18\r
+ 10. Example POP3 Session ....................................... 19\r
+ 11. Message Format ............................................. 19\r
+ 12. References ................................................. 20\r
+ 13. Security Considerations .................................... 20\r
+ 14. Acknowledgements ........................................... 20\r
+ 15. Authors' Addresses ......................................... 21\r
+ Appendix A. Differences from RFC 1725 .......................... 22\r
+\r
+\r
+\r
+Myers & Rose Standards Track [Page 1]\r
+\f\r
+RFC 1939 POP3 May 1996\r
+\r
+\r
+ Appendix B. Command Index ...................................... 23\r
+\r
+1. Introduction\r
+\r
+ On certain types of smaller nodes in the Internet it is often\r
+ impractical to maintain a message transport system (MTS). For\r
+ example, a workstation may not have sufficient resources (cycles,\r
+ disk space) in order to permit an SMTP server [RFC821] and associated\r
+ local mail delivery system to be kept resident and continuously\r
+ running. Similarly, it may be expensive (or impossible) to keep a\r
+ personal computer interconnected to an IP-style network for long\r
+ amounts of time (the node is lacking the resource known as\r
+ "connectivity").\r
+\r
+ Despite this, it is often very useful to be able to manage mail on\r
+ these smaller nodes, and they often support a user agent (UA) to aid\r
+ the tasks of mail handling. To solve this problem, a node which can\r
+ support an MTS entity offers a maildrop service to these less endowed\r
+ nodes. The Post Office Protocol - Version 3 (POP3) is intended to\r
+ permit a workstation to dynamically access a maildrop on a server\r
+ host in a useful fashion. Usually, this means that the POP3 protocol\r
+ is used to allow a workstation to retrieve mail that the server is\r
+ holding for it.\r
+\r
+ POP3 is not intended to provide extensive manipulation operations of\r
+ mail on the server; normally, mail is downloaded and then deleted. A\r
+ more advanced (and complex) protocol, IMAP4, is discussed in\r
+ [RFC1730].\r
+\r
+ For the remainder of this memo, the term "client host" refers to a\r
+ host making use of the POP3 service, while the term "server host"\r
+ refers to a host which offers the POP3 service.\r
+\r
+2. A Short Digression\r
+\r
+ This memo does not specify how a client host enters mail into the\r
+ transport system, although a method consistent with the philosophy of\r
+ this memo is presented here:\r
+\r
+ When the user agent on a client host wishes to enter a message\r
+ into the transport system, it establishes an SMTP connection to\r
+ its relay host and sends all mail to it. This relay host could\r
+ be, but need not be, the POP3 server host for the client host. Of\r
+ course, the relay host must accept mail for delivery to arbitrary\r
+ recipient addresses; that functionality is not required of all\r
+ SMTP servers.\r
+\r
+3. Basic Operation\r
+\r
+ Initially, the server host starts the POP3 service by listening on\r
+ TCP port 110. When a client host wishes to make use of the service,\r
+ it establishes a TCP connection with the server host. When the\r
+ connection is established, the POP3 server sends a greeting. The\r
+ client and POP3 server then exchange commands and responses\r
+ (respectively) until the connection is closed or aborted.\r
+\r
+ Commands in the POP3 consist of a case-insensitive keyword, possibly\r
+ followed by one or more arguments. All commands are terminated by a\r
+ CRLF pair. Keywords and arguments consist of printable ASCII\r
+ characters. Keywords and arguments are each separated by a single\r
+ SPACE character. Keywords are three or four characters long. Each\r
+ argument may be up to 40 characters long.\r
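The command syntax above can be sketched as a small serializer. This is a minimal illustration, not part of the memo: the helper name `build_command` is hypothetical, and upper-casing the keyword is only a convention chosen here (keywords are case-insensitive).

```python
import re

CRLF = b"\r\n"


def build_command(keyword: str, *args: str) -> bytes:
    """Serialize one POP3 command line (hypothetical helper)."""
    # Keywords are three or four printable ASCII characters.
    if not re.fullmatch(r"[\x21-\x7e]{3,4}", keyword):
        raise ValueError("keyword must be three or four printable characters")
    for arg in args:
        # Each argument is printable ASCII, at most 40 characters long.
        if not re.fullmatch(r"[\x21-\x7e]{1,40}", arg):
            raise ValueError(f"invalid argument: {arg!r}")
    # Keyword and arguments are separated by single spaces, ended by CRLF.
    return " ".join((keyword.upper(), *args)).encode("ascii") + CRLF
```

For instance, `build_command("top", "1", "10")` yields the octets of `TOP 1 10` followed by CRLF. The sketch rejects arguments containing spaces, so it does not cover servers that accept spaces inside a PASS argument.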
+\r
+ Responses in the POP3 consist of a status indicator and a keyword\r
+ possibly followed by additional information. All responses are\r
+ terminated by a CRLF pair. Responses may be up to 512 characters\r
+ long, including the terminating CRLF. There are currently two status\r
+ indicators: positive ("+OK") and negative ("-ERR"). Servers MUST\r
+ send the "+OK" and "-ERR" in upper case.\r
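A client-side check of the status indicator might look like the following sketch. The function name `parse_status` is an assumption made for illustration; the 512-octet limit and the two indicators come from the paragraph above.

```python
def parse_status(line: bytes) -> tuple[bool, str]:
    """Split one response line into (is_positive, remaining_text)."""
    if len(line) > 512:
        raise ValueError("response longer than 512 octets")
    if not line.endswith(b"\r\n"):
        raise ValueError("response not terminated by CRLF")
    text = line[:-2].decode("ascii", errors="replace")
    # The indicator is everything up to the first space, if any.
    indicator, _, rest = text.partition(" ")
    if indicator == "+OK":
        return True, rest
    if indicator == "-ERR":
        return False, rest
    raise ValueError(f"unknown status indicator: {indicator!r}")
```

The comparison is deliberately case-sensitive, since servers are required to send the indicators in upper case.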
+\r
+ Responses to certain commands are multi-line. In these cases, which\r
+ are clearly indicated below, after sending the first line of the\r
+ response and a CRLF, any additional lines are sent, each terminated\r
+ by a CRLF pair. When all lines of the response have been sent, a\r
+ final line is sent, consisting of a termination octet (decimal code\r
+ 46, ".") and a CRLF pair. If any line of the multi-line response\r
+ begins with the termination octet, the line is "byte-stuffed" by\r
+ pre-pending the termination octet to that line of the response.\r
+ Hence a multi-line response is terminated with the five octets\r
+ "CRLF.CRLF". When examining a multi-line response, the client checks\r
+ to see if the line begins with the termination octet. If so and if\r
+ octets other than CRLF follow, the first octet of the line (the\r
+ termination octet) is stripped away. If so and if CRLF immediately\r
+ follows the termination character, then the response from the POP\r
+ server is ended and the line containing ".CRLF" is not considered\r
+ part of the multi-line response.\r
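The byte-stuffing rules above can be sketched as a pair of functions. Both names (`encode_multiline`, `decode_multiline`) are hypothetical; the input to the decoder is assumed to be everything that follows the first (status) line, up to and including the terminating line.

```python
def encode_multiline(lines: list[bytes]) -> bytes:
    """Server side: byte-stuff the lines and append the termination line."""
    out = b""
    for line in lines:
        if line.startswith(b"."):
            line = b"." + line  # pre-pend the termination octet
        out += line + b"\r\n"
    return out + b".\r\n"


def decode_multiline(raw: bytes) -> list[bytes]:
    """Client side: undo byte-stuffing and stop at the termination line."""
    lines = []
    for line in raw.split(b"\r\n"):
        if line == b".":
            return lines  # ".CRLF" ends the response and is not part of it
        if line.startswith(b"."):
            line = line[1:]  # strip the stuffed termination octet
        lines.append(line)
    raise ValueError("missing termination line")
```

A round trip such as `decode_multiline(encode_multiline([b".", b"..x"]))` should give back the original lines, including the one that consists of a single dot.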
+\r
+ A POP3 session progresses through a number of states during its\r
+ lifetime. Once the TCP connection has been opened and the POP3\r
+ server has sent the greeting, the session enters the AUTHORIZATION\r
+ state. In this state, the client must identify itself to the POP3\r
+ server. Once the client has successfully done this, the server\r
+ acquires resources associated with the client's maildrop, and the\r
+ session enters the TRANSACTION state. In this state, the client\r
+ requests actions on the part of the POP3 server. When the client has\r
+ issued the QUIT command, the session enters the UPDATE state. In\r
+ this state, the POP3 server releases any resources acquired during\r
+ the TRANSACTION state and says goodbye. The TCP connection is then\r
+ closed.\r
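The session lifetime described above can be pictured as a small state machine. This is only a sketch: the `CLOSED` state and the event labels (`auth_ok`, `quit`, `done`, `timeout`, `connection_lost`) are names invented here, not protocol elements.

```python
from enum import Enum, auto


class State(Enum):
    AUTHORIZATION = auto()
    TRANSACTION = auto()
    UPDATE = auto()
    CLOSED = auto()  # hypothetical marker for "TCP connection closed"


def next_state(state: State, event: str) -> State:
    """Return the state that follows `event` (event names are illustrative)."""
    if event in ("timeout", "connection_lost"):
        # An aborted session never enters the UPDATE state.
        return State.CLOSED
    transitions = {
        (State.AUTHORIZATION, "auth_ok"): State.TRANSACTION,
        (State.AUTHORIZATION, "quit"): State.CLOSED,  # QUIT here skips UPDATE
        (State.TRANSACTION, "quit"): State.UPDATE,
        (State.UPDATE, "done"): State.CLOSED,
    }
    # Anything else (for example a failed login) leaves the state unchanged.
    return transitions.get((state, event), state)
```

Keeping the table explicit makes it easy to see that UPDATE is reachable only through a QUIT issued in the TRANSACTION state.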
+\r
+ A server MUST respond to an unrecognized, unimplemented, or\r
+ syntactically invalid command by responding with a negative status\r
+ indicator. A server MUST respond to a command issued when the\r
+ session is in an incorrect state by responding with a negative status\r
+ indicator. There is no general method for a client to distinguish\r
+ between a server which does not implement an optional command and a\r
+ server which is unwilling or unable to process the command.\r
+\r
+ A POP3 server MAY have an inactivity autologout timer. Such a timer\r
+ MUST be of at least 10 minutes' duration. The receipt of any command\r
+ from the client during that interval should suffice to reset the\r
+ autologout timer. When the timer expires, the session does NOT enter\r
+ the UPDATE state--the server should close the TCP connection without\r
+ removing any messages or sending any response to the client.\r
+\r
+4. The AUTHORIZATION State\r
+\r
+ Once the TCP connection has been opened by a POP3 client, the POP3\r
+ server issues a one line greeting. This can be any positive\r
+ response. An example might be:\r
+\r
+ S: +OK POP3 server ready\r
+\r
+ The POP3 session is now in the AUTHORIZATION state. The client must\r
+ now identify and authenticate itself to the POP3 server. Two\r
+ possible mechanisms for doing this are described in this document:\r
+ the USER and PASS command combination and the APOP command. Both\r
+ mechanisms are described later in this document. Additional\r
+ authentication mechanisms are described in [RFC1734]. While there is\r
+ no single authentication mechanism that is required of all POP3\r
+ servers, a POP3 server must of course support at least one\r
+ authentication mechanism.\r
+\r
+ Once the POP3 server has determined through the use of any\r
+ authentication command that the client should be given access to the\r
+ appropriate maildrop, the POP3 server then acquires an exclusive-\r
+ access lock on the maildrop, as necessary to prevent messages from\r
+ being modified or removed before the session enters the UPDATE state.\r
+ If the lock is successfully acquired, the POP3 server responds with a\r
+ positive status indicator. The POP3 session now enters the\r
+ TRANSACTION state, with no messages marked as deleted. If the\r
+ maildrop cannot be opened for some reason (for example, a lock\r
+ cannot be acquired, the client is denied access to the appropriate\r
+ maildrop, or the maildrop cannot be parsed), the POP3 server responds\r
+ with a negative status indicator. (If a lock was acquired but the\r
+ POP3 server intends to respond with a negative status indicator, the\r
+ POP3 server must release the lock prior to rejecting the command.)\r
+ After returning a negative status indicator, the server may close the\r
+ connection. If the server does not close the connection, the client\r
+ may either issue a new authentication command and start again, or the\r
+ client may issue the QUIT command.\r
+\r
+ After the POP3 server has opened the maildrop, it assigns a message-\r
+ number to each message, and notes the size of each message in octets.\r
+ The first message in the maildrop is assigned a message-number of\r
+ "1", the second is assigned "2", and so on, so that the nth message\r
+ in a maildrop is assigned a message-number of "n". In POP3 commands\r
+ and responses, all message-numbers and message sizes are expressed in\r
+ base-10 (i.e., decimal).\r
+\r
+ Here is the summary for the QUIT command when used in the\r
+ AUTHORIZATION state:\r
+\r
+ QUIT\r
+\r
+ Arguments: none\r
+\r
+ Restrictions: none\r
+\r
+ Possible Responses:\r
+ +OK\r
+\r
+ Examples:\r
+ C: QUIT\r
+ S: +OK dewey POP3 server signing off\r
+\r
+5. The TRANSACTION State\r
+\r
+ Once the client has successfully identified itself to the POP3 server\r
+ and the POP3 server has locked and opened the appropriate maildrop,\r
+ the POP3 session is now in the TRANSACTION state. The client may now\r
+ issue any of the following POP3 commands repeatedly. After each\r
+ command, the POP3 server issues a response. Eventually, the client\r
+ issues the QUIT command and the POP3 session enters the UPDATE state.\r
+\r
+ Here are the POP3 commands valid in the TRANSACTION state:\r
+\r
+ STAT\r
+\r
+ Arguments: none\r
+\r
+ Restrictions:\r
+ may only be given in the TRANSACTION state\r
+\r
+ Discussion:\r
+ The POP3 server issues a positive response with a line\r
+ containing information for the maildrop. This line is\r
+ called a "drop listing" for that maildrop.\r
+\r
+ In order to simplify parsing, all POP3 servers are\r
+ required to use a certain format for drop listings. The\r
+ positive response consists of "+OK" followed by a single\r
+ space, the number of messages in the maildrop, a single\r
+ space, and the size of the maildrop in octets. This memo\r
+ makes no requirement on what follows the maildrop size.\r
+ Minimal implementations should just end that line of the\r
+ response with a CRLF pair. More advanced implementations\r
+ may include other information.\r
+\r
+ NOTE: This memo STRONGLY discourages implementations\r
+ from supplying additional information in the drop\r
+ listing. Other, optional, facilities are discussed\r
+ later on which permit the client to parse the messages\r
+ in the maildrop.\r
+\r
+ Note that messages marked as deleted are not counted in\r
+ either total.\r
+\r
+ Possible Responses:\r
+ +OK nn mm\r
+\r
+ Examples:\r
+ C: STAT\r
+ S: +OK 2 320\r
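Because the drop listing has a fixed prefix, a client can parse it with a plain split. The sketch below uses a hypothetical helper name, `parse_stat`, and ignores anything that follows the maildrop size, as the text permits.

```python
def parse_stat(line: str) -> tuple[int, int]:
    """Parse a drop listing (CRLF already removed) into (count, octets)."""
    fields = line.split(" ")
    if len(fields) < 3 or fields[0] != "+OK":
        raise ValueError(f"not a drop listing: {line!r}")
    # Anything after the size is optional extra information; skip it.
    return int(fields[1]), int(fields[2])
```

With the example above, `parse_stat("+OK 2 320")` returns the pair `(2, 320)`.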
+\r
+\r
+ LIST [msg]\r
+\r
+ Arguments:\r
+ a message-number (optional), which, if present, may NOT\r
+ refer to a message marked as deleted\r
+\r
+ Restrictions:\r
+ may only be given in the TRANSACTION state\r
+\r
+ Discussion:\r
+ If an argument was given and the POP3 server issues a\r
+ positive response, then the response is a single line\r
+ containing information for that message. This line is\r
+ called a "scan listing" for that message.\r
+\r
+ If no argument was given and the POP3 server issues a\r
+ positive response, then the response given is multi-line.\r
+ After the initial +OK, for each message in the maildrop,\r
+ the POP3 server responds with a line containing\r
+ information for that message. This line is also called a\r
+ "scan listing" for that message. If there are no\r
+ messages in the maildrop, then the POP3 server responds\r
+ with no scan listings--it issues a positive response\r
+ followed by a line containing a termination octet and a\r
+ CRLF pair.\r
+\r
+ In order to simplify parsing, all POP3 servers are\r
+ required to use a certain format for scan listings. A\r
+ scan listing consists of the message-number of the\r
+ message, followed by a single space and the exact size of\r
+ the message in octets. Methods for calculating the exact\r
+ size of the message are described in the "Message Format"\r
+ section below. This memo makes no requirement on what\r
+ follows the message size in the scan listing. Minimal\r
+ implementations should just end that line of the response\r
+ with a CRLF pair. More advanced implementations may\r
+ include other information, as parsed from the message.\r
+\r
+ NOTE: This memo STRONGLY discourages implementations\r
+ from supplying additional information in the scan\r
+ listing. Other, optional, facilities are discussed\r
+ later on which permit the client to parse the messages\r
+ in the maildrop.\r
+\r
+ Note that messages marked as deleted are not listed.\r
+\r
+ Possible Responses:\r
+ +OK scan listing follows\r
+ -ERR no such message\r
+\r
+ Examples:\r
+ C: LIST\r
+ S: +OK 2 messages (320 octets)\r
+ S: 1 120\r
+ S: 2 200\r
+ S: .\r
+ ...\r
+ C: LIST 2\r
+ S: +OK 2 200\r
+ ...\r
+ C: LIST 3\r
+ S: -ERR no such message, only 2 messages in maildrop\r
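Scan listings can be parsed the same way as the drop listing. In this sketch the helper name `parse_scan_listing` is an assumption, and the input is one listing line with its CRLF (and, for the single-message form, the leading "+OK ") already removed.

```python
def parse_scan_listing(line: str) -> tuple[int, int]:
    """Parse one scan listing into (message_number, size_in_octets)."""
    fields = line.split(" ")
    if len(fields) < 2:
        raise ValueError(f"not a scan listing: {line!r}")
    # Text after the size, if any, is optional and ignored here.
    return int(fields[0]), int(fields[1])


# The two listings from the example session above.
sizes = dict(parse_scan_listing(item) for item in ["1 120", "2 200"])
total_octets = sum(sizes.values())
```

For the example session, the sizes add up to the 320 octets reported in the first line of the LIST response.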
+\r
+\r
+ RETR msg\r
+\r
+ Arguments:\r
+ a message-number (required) which may NOT refer to a\r
+ message marked as deleted\r
+\r
+ Restrictions:\r
+ may only be given in the TRANSACTION state\r
+\r
+ Discussion:\r
+ If the POP3 server issues a positive response, then the\r
+ response given is multi-line. After the initial +OK, the\r
+ POP3 server sends the message corresponding to the given\r
+ message-number, being careful to byte-stuff the termination\r
+ character (as with all multi-line responses).\r
+\r
+ Possible Responses:\r
+ +OK message follows\r
+ -ERR no such message\r
+\r
+ Examples:\r
+ C: RETR 1\r
+ S: +OK 120 octets\r
+ S: <the POP3 server sends the entire message here>\r
+ S: .\r
+\r
+\r
+ DELE msg\r
+\r
+ Arguments:\r
+ a message-number (required) which may NOT refer to a\r
+ message marked as deleted\r
+\r
+ Restrictions:\r
+ may only be given in the TRANSACTION state\r
+\r
+ Discussion:\r
+ The POP3 server marks the message as deleted. Any future\r
+ reference to the message-number associated with the message\r
+ in a POP3 command generates an error. The POP3 server does\r
+ not actually delete the message until the POP3 session\r
+ enters the UPDATE state.\r
+\r
+ Possible Responses:\r
+ +OK message deleted\r
+ -ERR no such message\r
+\r
+ Examples:\r
+ C: DELE 1\r
+ S: +OK message 1 deleted\r
+ ...\r
+ C: DELE 2\r
+ S: -ERR message 2 already deleted\r
+\r
+\r
+ NOOP\r
+\r
+ Arguments: none\r
+\r
+ Restrictions:\r
+ may only be given in the TRANSACTION state\r
+\r
+ Discussion:\r
+ The POP3 server does nothing; it merely replies with a\r
+ positive response.\r
+\r
+ Possible Responses:\r
+ +OK\r
+\r
+ Examples:\r
+ C: NOOP\r
+ S: +OK\r
+\r
+\r
+ RSET\r
+\r
+ Arguments: none\r
+\r
+ Restrictions:\r
+ may only be given in the TRANSACTION state\r
+\r
+ Discussion:\r
+ If any messages have been marked as deleted by the POP3\r
+ server, they are unmarked. The POP3 server then replies\r
+ with a positive response.\r
+\r
+ Possible Responses:\r
+ +OK\r
+\r
+ Examples:\r
+ C: RSET\r
+ S: +OK maildrop has 2 messages (320 octets)\r
+\r
+6. The UPDATE State\r
+\r
+ When the client issues the QUIT command from the TRANSACTION state,\r
+ the POP3 session enters the UPDATE state. (Note that if the client\r
+ issues the QUIT command from the AUTHORIZATION state, the POP3\r
+ session terminates but does NOT enter the UPDATE state.)\r
+\r
+ If a session terminates for some reason other than a client-issued\r
+ QUIT command, the POP3 session does NOT enter the UPDATE state and\r
+ MUST NOT remove any messages from the maildrop.\r
+\r
+ QUIT\r
+\r
+ Arguments: none\r
+\r
+ Restrictions: none\r
+\r
+ Discussion:\r
+ The POP3 server removes all messages marked as deleted\r
+ from the maildrop and replies as to the status of this\r
+ operation. If an error, such as a resource shortage, is\r
+ encountered while removing messages, some or none of the\r
+ messages marked as deleted may have been removed from the\r
+ maildrop. In no case may the server remove any messages\r
+ not marked as deleted.\r
+\r
+ Whether the removal was successful or not, the server\r
+ then releases any exclusive-access lock on the maildrop\r
+ and closes the TCP connection.\r
+\r
+ Possible Responses:\r
+ +OK\r
+ -ERR some deleted messages not removed\r
+\r
+ Examples:\r
+ C: QUIT\r
+ S: +OK dewey POP3 server signing off (maildrop empty)\r
+ ...\r
+ C: QUIT\r
+ S: +OK dewey POP3 server signing off (2 messages left)\r
+ ...\r
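The interplay of DELE, RSET, and the UPDATE state can be modelled with a toy in-memory maildrop. Everything in this sketch (the `Maildrop` class and its method names) is hypothetical and ignores locking and error recovery.

```python
class Maildrop:
    """Toy in-memory maildrop (illustrative model only)."""

    def __init__(self, messages: list[bytes]) -> None:
        # Message-numbers start at 1 and stay fixed for the whole session.
        self.messages = dict(enumerate(messages, start=1))
        self.deleted: set[int] = set()

    def stat(self) -> tuple[int, int]:
        # Messages marked as deleted are not counted in either total.
        live = [m for n, m in self.messages.items() if n not in self.deleted]
        return len(live), sum(len(m) for m in live)

    def dele(self, msg: int) -> None:
        # Referring to an unknown or already deleted message is an error.
        if msg not in self.messages or msg in self.deleted:
            raise LookupError("no such message")
        self.deleted.add(msg)  # only a mark; nothing is removed yet

    def rset(self) -> None:
        self.deleted.clear()  # unmark everything

    def update(self) -> list[bytes]:
        """UPDATE state: return the messages that survive removal."""
        return [m for n, m in self.messages.items() if n not in self.deleted]
```

If the session ends without QUIT, a server following this model would simply never call `update()`, so the marks are discarded and no message is removed.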
+\r
+7. Optional POP3 Commands\r
+\r
+ The POP3 commands discussed above must be supported by all minimal\r
+ implementations of POP3 servers.\r
+\r
+ The optional POP3 commands described below permit a POP3 client\r
+ greater freedom in message handling, while preserving a simple POP3\r
+ server implementation.\r
+\r
+ NOTE: This memo STRONGLY encourages implementations to support\r
+ these commands in lieu of developing augmented drop and scan\r
+ listings. In short, the philosophy of this memo is to put\r
+ intelligence on the part of the POP3 client and not the POP3\r
+ server.\r
+\r
+ TOP msg n\r
+\r
+ Arguments:\r
+ a message-number (required) which may NOT refer to a\r
+ message marked as deleted, and a non-negative number\r
+ of lines (required)\r
+\r
+ Restrictions:\r
+ may only be given in the TRANSACTION state\r
+\r
+ Discussion:\r
+ If the POP3 server issues a positive response, then the\r
+ response given is multi-line. After the initial +OK, the\r
+ POP3 server sends the headers of the message, the blank\r
+ line separating the headers from the body, and then the\r
+ number of lines of the indicated message's body, being\r
+ careful to byte-stuff the termination character (as with\r
+ all multi-line responses).\r
+\r
+ Note that if the number of lines requested by the POP3\r
+ client is greater than the number of lines in the\r
+ body, then the POP3 server sends the entire message.\r
+\r
+ Possible Responses:\r
+ +OK top of message follows\r
+ -ERR no such message\r
+\r
+ Examples:\r
+ C: TOP 1 10\r
+ S: +OK\r
+ S: <the POP3 server sends the headers of the\r
+ message, a blank line, and the first 10 lines\r
+ of the body of the message>\r
+ S: .\r
+ ...\r
+ C: TOP 100 3\r
+ S: -ERR no such message\r
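A server could compute the TOP payload roughly as follows. This is a sketch under the assumption that the stored message uses CRLF line endings; the function name `top` is chosen here for illustration, and byte-stuffing is left to the multi-line response layer.

```python
def top(message: bytes, n: int) -> bytes:
    """Return the headers, the blank line, and the first n body lines."""
    if n < 0:
        raise ValueError("number of lines must be non-negative")
    # The first empty line separates the headers from the body.
    head, sep, body = message.partition(b"\r\n\r\n")
    if not sep:
        return message  # no body present: send what there is
    kept = body.splitlines(keepends=True)[:n]
    # Asking for more lines than exist simply yields the entire message.
    return head + sep + b"".join(kept)
```

Slicing with `[:n]` covers the "more lines requested than available" case without a special branch.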
+\r
+\r
+ UIDL [msg]\r
+\r
+ Arguments:\r
+ a message-number (optional), which, if present, may NOT\r
+ refer to a message marked as deleted\r
+\r
+ Restrictions:\r
+ may only be given in the TRANSACTION state\r
+\r
+ Discussion:\r
+ If an argument was given and the POP3 server issues a positive\r
+ response, then the response is a single line containing\r
+ information for that message. This line is called a\r
+ "unique-id listing" for that message.\r
+\r
+ If no argument was given and the POP3 server issues a positive\r
+ response, then the response given is multi-line. After the\r
+ initial +OK, for each message in the maildrop, the POP3 server\r
+ responds with a line containing information for that message.\r
+ This line is called a "unique-id listing" for that message.\r
+\r
+ In order to simplify parsing, all POP3 servers are required to\r
+ use a certain format for unique-id listings. A unique-id\r
+ listing consists of the message-number of the message,\r
+ followed by a single space and the unique-id of the message.\r
+ No information follows the unique-id in the unique-id listing.\r
+\r
+ The unique-id of a message is an arbitrary server-determined\r
+ string, consisting of one to 70 characters in the range 0x21\r
+ to 0x7E, which uniquely identifies a message within a\r
+ maildrop and which persists across sessions. This\r
+ persistence is required even if a session ends without\r
+ entering the UPDATE state. The server should never reuse a\r
+ unique-id in a given maildrop, for as long as the entity\r
+ using the unique-id exists.\r
+\r
+ Note that messages marked as deleted are not listed.\r
+\r
+ While it is generally preferable for server implementations\r
+ to store arbitrarily assigned unique-ids in the maildrop,\r
+ this specification is intended to permit unique-ids to be\r
+ calculated as a hash of the message. Clients should be able\r
+ to handle a situation where two identical copies of a\r
+ message in a maildrop have the same unique-id.\r
+\r
+ Possible Responses:\r
+ +OK unique-id listing follows\r
+ -ERR no such message\r
+\r
+ Examples:\r
+ C: UIDL\r
+ S: +OK\r
+ S: 1 whqtswO00WBw418f9t5JxYwZ\r
+ S: 2 QhdPYR:00WBw1Ph7x7\r
+ S: .\r
+ ...\r
+ C: UIDL 2\r
+ S: +OK 2 QhdPYR:00WBw1Ph7x7\r
+ ...\r
+ C: UIDL 3\r
+ S: -ERR no such message, only 2 messages in maildrop\r
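One common client use of UIDL is to work out which messages have not been fetched in an earlier session. The sketch below assumes the client keeps a set of unique-ids it has already seen; the names `parse_uidl` and `unseen` are hypothetical.

```python
import re

# One to 70 characters in the range 0x21 to 0x7E, as required above.
UID_RE = re.compile(r"[\x21-\x7e]{1,70}")


def parse_uidl(lines: list[str]) -> dict[int, str]:
    """Map message-number to unique-id for each unique-id listing."""
    listing = {}
    for line in lines:
        number, _, uid = line.partition(" ")
        if not UID_RE.fullmatch(uid):
            raise ValueError(f"invalid unique-id listing: {line!r}")
        listing[int(number)] = uid
    return listing


def unseen(listing: dict[int, str], seen: set[str]) -> list[int]:
    """Message-numbers whose unique-id was not recorded previously."""
    return sorted(n for n, uid in listing.items() if uid not in seen)
```

Keying the result by message-number rather than by unique-id keeps the sketch usable when two identical messages share one unique-id.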
+\r
+\r
+ USER name\r
+\r
+ Arguments:\r
+ a string identifying a mailbox (required), which is of\r
+ significance ONLY to the server\r
+\r
+ Restrictions:\r
+ may only be given in the AUTHORIZATION state after the POP3\r
+ greeting or after an unsuccessful USER or PASS command\r
+\r
+ Discussion:\r
+ To authenticate using the USER and PASS command\r
+ combination, the client must first issue the USER\r
+ command. If the POP3 server responds with a positive\r
+ status indicator ("+OK"), then the client may issue\r
+ either the PASS command to complete the authentication,\r
+ or the QUIT command to terminate the POP3 session. If\r
+ the POP3 server responds with a negative status indicator\r
+ ("-ERR") to the USER command, then the client may either\r
+ issue a new authentication command or may issue the QUIT\r
+ command.\r
+\r
+ The server may return a positive response even though no\r
+ such mailbox exists. The server may return a negative\r
+ response if the mailbox exists but does not permit plaintext\r
+ password authentication.\r
+\r
+ Possible Responses:\r
+ +OK name is a valid mailbox\r
+ -ERR never heard of mailbox name\r
+\r
+ Examples:\r
+ C: USER frated\r
+ S: -ERR sorry, no mailbox for frated here\r
+ ...\r
+ C: USER mrose\r
+ S: +OK mrose is a real hoopy frood\r
+\r
+\r
+ PASS string\r
+\r
+ Arguments:\r
+ a server/mailbox-specific password (required)\r
+\r
+ Restrictions:\r
+ may only be given in the AUTHORIZATION state immediately\r
+ after a successful USER command\r
+\r
+ Discussion:\r
+ When the client issues the PASS command, the POP3 server\r
+ uses the argument pair from the USER and PASS commands to\r
+ determine if the client should be given access to the\r
+ appropriate maildrop.\r
+\r
+ Since the PASS command has exactly one argument, a POP3\r
+ server may treat spaces in the argument as part of the\r
+ password, instead of as argument separators.\r
+\r
+ Possible Responses:\r
+ +OK maildrop locked and ready\r
+ -ERR invalid password\r
+ -ERR unable to lock maildrop\r
+\r
+ Examples:\r
+ C: USER mrose\r
+ S: +OK mrose is a real hoopy frood\r
+ C: PASS secret\r
+ S: -ERR maildrop already locked\r
+ ...\r
+ C: USER mrose\r
+ S: +OK mrose is a real hoopy frood\r
+ C: PASS secret\r
+ S: +OK mrose's maildrop has 2 messages (320 octets)\r
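A server that chooses to treat spaces as part of the password can split the command line only once. This is a sketch of that one design choice, with a hypothetical helper name; the input is the command line with its CRLF already removed.

```python
def split_pass(line: str) -> str:
    """Return the password from a PASS command line, spaces included."""
    # Split at the first space only, so later spaces stay in the password.
    keyword, sep, password = line.partition(" ")
    if keyword.upper() != "PASS" or not sep or not password:
        raise ValueError("malformed PASS command")
    return password
```

With this approach `PASS correct horse` carries the password `correct horse` instead of being rejected for having two arguments.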
+\r
+\r
+ APOP name digest\r
+\r
+ Arguments:\r
+ a string identifying a mailbox and an MD5 digest string\r
+ (both required)\r
+\r
+ Restrictions:\r
+ may only be given in the AUTHORIZATION state after the POP3\r
+ greeting or after an unsuccessful USER or PASS command\r
+\r
+ Discussion:\r
+ Normally, each POP3 session starts with a USER/PASS\r
+ exchange. This results in a server/user-id specific\r
+ password being sent in the clear on the network. For\r
+ intermittent use of POP3, this may not introduce a sizable\r
+ risk. However, many POP3 client implementations connect to\r
+ the POP3 server on a regular basis -- to check for new\r
+ mail. Further, the interval of session initiation may be on\r
+ the order of five minutes. Hence, the risk of password\r
+ capture is greatly increased.\r
+\r
+ An alternate method of authentication is required which\r
+ provides for both origin authentication and replay\r
+ protection, but which does not involve sending a password\r
+ in the clear over the network. The APOP command provides\r
+ this functionality.\r
+\r
+ A POP3 server which implements the APOP command will\r
+ include a timestamp in its banner greeting. The syntax of\r
+ the timestamp corresponds to the `msg-id' in [RFC822], and\r
+ MUST be different each time the POP3 server issues a banner\r
+ greeting. For example, on a UNIX implementation in which a\r
+ separate UNIX process is used for each instance of a POP3\r
+ server, the syntax of the timestamp might be:\r
+\r
+ <process-ID.clock@hostname>\r
+\r
+ where `process-ID' is the decimal value of the process's\r
+ PID, `clock' is the decimal value of the system clock, and\r
+ `hostname' is the fully-qualified domain-name corresponding\r
+ to the host where the POP3 server is running.\r
+\r
+ The POP3 client makes note of this timestamp, and then\r
+ issues the APOP command. The `name' parameter has\r
+ identical semantics to the `name' parameter of the USER\r
+ command. The `digest' parameter is calculated by applying\r
+ the MD5 algorithm [RFC1321] to a string consisting of the\r
+ timestamp (including angle-brackets) followed by a shared\r
+ secret. This shared secret is a string known only to the\r
+ POP3 client and server. Great care should be taken to\r
+ prevent unauthorized disclosure of the secret, as knowledge\r
+ of the secret will allow any entity to successfully\r
+ masquerade as the named user. The `digest' parameter\r
+ itself is a 16-octet value which is sent in hexadecimal\r
+ format, using lower-case ASCII characters.\r
+\r
+ When the POP3 server receives the APOP command, it verifies\r
+ the digest provided. If the digest is correct, the POP3\r
+ server issues a positive response, and the POP3 session\r
+ enters the TRANSACTION state. Otherwise, a negative\r
+ response is issued and the POP3 session remains in the\r
+ AUTHORIZATION state.\r
+\r
+ Note that as the length of the shared secret increases, so\r
+ does the difficulty of deriving it. As such, shared\r
+ secrets should be long strings (considerably longer than\r
+ the 8-character example shown below).\r
+\r
+ Possible Responses:\r
+ +OK maildrop locked and ready\r
+ -ERR permission denied\r
+\r
+ Examples:\r
+ S: +OK POP3 server ready <1896.697170952@dbc.mtview.ca.us>\r
+ C: APOP mrose c4c9334bac560ecc979e58001b3e22fb\r
+ S: +OK maildrop has 1 message (369 octets)\r
+\r
+ In this example, the shared secret is the string\r
+ `tanstaaf'. Hence, the MD5 algorithm is applied to the string\r
+\r
+ <1896.697170952@dbc.mtview.ca.us>tanstaaf\r
+\r
+ which produces a digest value of\r
+\r
+ c4c9334bac560ecc979e58001b3e22fb\r
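The digest calculation can be sketched with Python's standard `hashlib`. The two helper names are hypothetical, and the regular expression used to pick the timestamp out of the greeting is a simplification of the `msg-id' syntax assumed for this example.

```python
import hashlib
import re


def extract_timestamp(greeting: str) -> str:
    """Return the <...> timestamp from a banner greeting."""
    match = re.search(r"<[^<>]+>", greeting)
    if match is None:
        raise ValueError("greeting carries no timestamp; APOP not offered")
    return match.group(0)  # angle-brackets are part of the timestamp


def apop_digest(timestamp: str, secret: str) -> str:
    """MD5 over timestamp followed by the shared secret, as lower-case hex."""
    data = (timestamp + secret).encode("ascii")
    return hashlib.md5(data).hexdigest()


greeting = "+OK POP3 server ready <1896.697170952@dbc.mtview.ca.us>"
digest = apop_digest(extract_timestamp(greeting), "tanstaaf")
command = f"APOP mrose {digest}"
```

Applied to the greeting and secret from the example, `digest` should reproduce the 32-character value shown above, and a different timestamp gives a different digest, which is what provides the replay protection.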
+\r
+8. Scaling and Operational Considerations\r
+\r
+ Since some of the optional features described above were added to the\r
+ POP3 protocol, experience has accumulated in using them in large-\r
+ scale commercial post office operations where most of the users are\r
+ unrelated to each other. In these situations and others, users and\r
+ vendors of POP3 clients have discovered that the combination of using\r
+ the UIDL command and not issuing the DELE command can provide a weak\r
+ version of the "maildrop as semi-permanent repository" functionality\r
+ normally associated with IMAP. Of course the other capabilities of\r
+ IMAP, such as polling an existing connection for newly arrived\r
+ messages and supporting multiple folders on the server, are not\r
+ present in POP3.\r
+\r
+ When these facilities are used in this way by casual users, there has\r
+ been a tendency for already-read messages to accumulate on the server\r
+ without bound. This is clearly an undesirable behavior pattern from\r
+ the standpoint of the server operator. This situation is aggravated\r
+ by the fact that the limited capabilities of POP3 do not permit\r
+ efficient handling of maildrops which have hundreds or thousands of\r
+ messages.\r
+\r
+ Consequently, it is recommended that operators of large-scale multi-\r
+ user servers, especially ones in which the user's only access to the\r
+ maildrop is via POP3, consider such options as:\r
+\r
+ * Imposing a per-user maildrop storage quota or the like.\r
+\r
+ A disadvantage to this option is that accumulation of messages may\r
+ result in the user's inability to receive new ones into the\r
+ maildrop. Sites which choose this option should be sure to inform\r
+ users of impending or current exhaustion of quota, perhaps by\r
+ inserting an appropriate message into the user's maildrop.\r
+\r
+ * Enforcing a site policy regarding mail retention on the server.\r
+\r
+ Sites are free to establish local policy regarding the storage and\r
+ retention of messages on the server, both read and unread. For\r
+ example, a site might delete unread messages from the server after\r
+ 60 days and delete read messages after 7 days. Such message\r
+ deletions are outside the scope of the POP3 protocol and are not\r
+ considered a protocol violation.\r
+\r
+ Server operators enforcing message deletion policies should take\r
+ care to make all users aware of the policies in force.\r
+\r
+ Clients must not assume that a site policy will automate message\r
+ deletions, and should continue to explicitly delete messages using\r
+ the DELE command when appropriate.\r
+\r
+ It should be noted that enforcing site message deletion policies\r
+ may be confusing to the user community, since their POP3 client\r
+ may contain configuration options to leave mail on the server\r
+ which will not in fact be supported by the server.\r
+\r
+ One special case of a site policy is that messages may only be\r
+ downloaded once from the server, and are deleted after this has\r
+ been accomplished. This could be implemented in POP3 server\r
+\r
+\r
+\r
+Myers & Rose Standards Track [Page 17]\r
+\f\r
+RFC 1939 POP3 May 1996\r
+\r
+\r
+ software by the following mechanism: "following a POP3 login by a\r
+ client which was ended by a QUIT, delete all messages downloaded\r
+ during the session with the RETR command". It is important not to\r
+ delete messages in the event of abnormal connection termination\r
+ (i.e., if no QUIT was received from the client) because the client\r
+ may not have successfully received or stored the messages.\r
+ Servers implementing a download-and-delete policy may also wish to\r
+ disable or limit the optional TOP command, since it could be used\r
+ as an alternate mechanism to download entire messages.\r
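
The download-and-delete mechanism quoted above might be sketched as
follows. This is a minimal illustration under stated assumptions: the
class DownloadAndDeleteSession and its methods are hypothetical names,
and the maildrop is modelled as a plain set of message numbers rather
than real storage.

```python
class DownloadAndDeleteSession:
    """Hypothetical per-session bookkeeping for a server that deletes
    messages once they have been downloaded with RETR."""

    def __init__(self, message_numbers):
        self.maildrop = set(message_numbers)  # messages present at login
        self.retrieved = set()                # messages sent via RETR

    def retr(self, msg: int) -> None:
        """Record that message `msg` was sent in response to RETR."""
        if msg not in self.maildrop:
            raise KeyError(f"no such message: {msg}")
        self.retrieved.add(msg)

    def close(self, quit_received: bool) -> set:
        """End the session and return the messages left in the maildrop.

        Retrieved messages are removed only when the client ended the
        session with QUIT; after an abnormal termination nothing is
        deleted, because the client may not have stored the messages."""
        if quit_received:
            self.maildrop -= self.retrieved
        return self.maildrop
```

Keeping the retrieved set separate from the maildrop until the
session ends is what makes the abnormal-termination case safe.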
+\r
+9. POP3 Command Summary\r
+\r
+ Minimal POP3 Commands:\r
+\r
+ USER name valid in the AUTHORIZATION state\r
+ PASS string\r
+ QUIT\r
+\r
+ STAT valid in the TRANSACTION state\r
+ LIST [msg]\r
+ RETR msg\r
+ DELE msg\r
+ NOOP\r
+ RSET\r
+ QUIT\r
+\r
+ Optional POP3 Commands:\r
+\r
+ APOP name digest valid in the AUTHORIZATION state\r
+\r
+ TOP msg n valid in the TRANSACTION state\r
+ UIDL [msg]\r
+\r
+ POP3 Replies:\r
+\r
+ +OK\r
+ -ERR\r
+\r
+ Note that with the exception of the STAT, LIST, and UIDL commands,\r
+ the only significant part of the reply given by the POP3 server to\r
+ any command is the "+OK" or "-ERR" status indicator. Any text\r
+ occurring after the status indicator may be ignored by the client.\r
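
A client-side reading of this rule could look like the sketch below.
The helper names parse_status and parse_stat are assumptions made for
illustration. parse_status keeps only the status indicator as
significant, while parse_stat shows one of the exceptions, where the
text after "+OK" carries the message count and maildrop size.

```python
def parse_status(line: str):
    """Split a reply line into (ok, text); `text` is informational
    for most commands and may be ignored by the caller."""
    indicator, _, text = line.rstrip("\r\n").partition(" ")
    if indicator == "+OK":
        return True, text
    if indicator == "-ERR":
        return False, text
    raise ValueError(f"not a POP3 status line: {line!r}")


def parse_stat(line: str):
    """Parse a positive STAT reply into (message_count, octets)."""
    ok, text = parse_status(line)
    if not ok:
        raise ValueError("STAT failed: " + text)
    count, octets = text.split()[:2]
    return int(count), int(octets)
```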
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+Myers & Rose Standards Track [Page 18]\r
+\f\r
+RFC 1939 POP3 May 1996\r
+\r
+\r
+10. Example POP3 Session\r
+\r
+ S: <wait for connection on TCP port 110>\r
+ C: <open connection>\r
+ S: +OK POP3 server ready <1896.697170952@dbc.mtview.ca.us>\r
+ C: APOP mrose c4c9334bac560ecc979e58001b3e22fb\r
+ S: +OK mrose's maildrop has 2 messages (320 octets)\r
+ C: STAT\r
+ S: +OK 2 320\r
+ C: LIST\r
+ S: +OK 2 messages (320 octets)\r
+ S: 1 120\r
+ S: 2 200\r
+ S: .\r
+ C: RETR 1\r
+ S: +OK 120 octets\r
+ S: <the POP3 server sends message 1>\r
+ S: .\r
+ C: DELE 1\r
+ S: +OK message 1 deleted\r
+ C: RETR 2\r
+ S: +OK 200 octets\r
+ S: <the POP3 server sends message 2>\r
+ S: .\r
+ C: DELE 2\r
+ S: +OK message 2 deleted\r
+ C: QUIT\r
+ S: +OK dewey POP3 server signing off (maildrop empty)\r
+ C: <close connection>\r
+ S: <wait for next connection>\r
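
The multi-line LIST reply in the session above can be parsed with a
few lines of Python. This is a sketch only: parse_list_response is a
hypothetical helper, and it assumes the caller has already split the
reply into lines and removed the trailing CRLF from each.

```python
def parse_list_response(lines):
    """Return {message_number: octets} from a multi-line LIST reply.

    `lines` holds the status line followed by the scan listings and
    the terminating "." line."""
    status, *body = lines
    if not status.startswith("+OK"):
        raise ValueError("LIST failed: " + status)
    sizes = {}
    for line in body:
        if line == ".":            # termination line ends the reply
            break
        if line.startswith("."):   # undo byte-stuffing, if any
            line = line[1:]
        number, octets = line.split()[:2]
        sizes[int(number)] = int(octets)
    else:
        raise ValueError("missing termination line")
    return sizes
```

For the example session, the per-message sizes (120 and 200) add up
to the 320 octets reported by STAT.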
+\r
+11. Message Format\r
+\r
+ All messages transmitted during a POP3 session are assumed to conform\r
+ to the standard for the format of Internet text messages [RFC822].\r
+\r
+ It is important to note that the octet count for a message on the\r
+ server host may differ from the octet count assigned to that message\r
+ due to local conventions for designating end-of-line. Usually,\r
+ during the AUTHORIZATION state of the POP3 session, the POP3 server\r
+ can calculate the size of each message in octets when it opens the\r
+ maildrop. For example, if the POP3 server host internally represents\r
+ end-of-line as a single character, then the POP3 server simply counts\r
+ each occurrence of this character in a message as two octets. Note\r
+ that lines in the message which start with the termination octet need\r
+ not (and must not) be counted twice, since the POP3 client will\r
+ remove all byte-stuffed termination characters when it receives a\r
+ multi-line response.\r
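
The size calculation described above might be sketched as follows,
assuming a server host that stores messages with a bare LF as
end-of-line and no CR characters. The function name pop3_octets is
illustrative only.

```python
def pop3_octets(stored: bytes) -> int:
    """Octet count to report for a message stored with LF line ends.

    Each LF is sent as CRLF on the wire, so it counts as two octets.
    Byte-stuffed termination octets are deliberately not counted,
    since the client removes them when it reads the response."""
    return len(stored) + stored.count(b"\n")
```

Note that a stored line beginning with "." contributes only its own
length here, even though the server sends an extra "." on the wire.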
+\r
+\r
+\r
+Myers & Rose Standards Track [Page 19]\r
+\f\r
+RFC 1939 POP3 May 1996\r
+\r
+\r
+12. References\r
+\r
+ [RFC821] Postel, J., "Simple Mail Transfer Protocol", STD 10, RFC\r
+ 821, USC/Information Sciences Institute, August 1982.\r
+\r
+ [RFC822] Crocker, D., "Standard for the Format of ARPA-Internet Text\r
+ Messages", STD 11, RFC 822, University of Delaware, August 1982.\r
+\r
+ [RFC1321] Rivest, R., "The MD5 Message-Digest Algorithm", RFC 1321,\r
+ MIT Laboratory for Computer Science, April 1992.\r
+\r
+ [RFC1730] Crispin, M., "Internet Message Access Protocol - Version\r
+ 4", RFC 1730, University of Washington, December 1994.\r
+\r
+ [RFC1734] Myers, J., "POP3 AUTHentication command", RFC 1734,\r
+ Carnegie Mellon, December 1994.\r
+\r
+13. Security Considerations\r
+\r
+ It is conjectured that use of the APOP command provides origin\r
+ identification and replay protection for a POP3 session.\r
+ Accordingly, a POP3 server which implements both the PASS and APOP\r
+ commands should not allow both methods of access for a given user;\r
+ that is, for a given mailbox name, either the USER/PASS command\r
+ sequence or the APOP command is allowed, but not both.\r
+\r
+ Further, note that as the length of the shared secret increases, so\r
+ does the difficulty of deriving it.\r
+\r
+ Servers that answer -ERR to the USER command are giving potential\r
+ attackers clues about which names are valid.\r
+\r
+ Use of the PASS command sends passwords in the clear over the\r
+ network.\r
+\r
+ Use of the RETR and TOP commands sends mail in the clear over the\r
+ network.\r
+\r
+ Otherwise, security issues are not discussed in this memo.\r
+\r
+14. Acknowledgements\r
+\r
+ The POP family has a long and checkered history. Although primarily\r
+ a minor revision to RFC 1460, POP3 is based on the ideas presented in\r
+ RFCs 918, 937, and 1081.\r
+\r
+ In addition, Alfred Grimstad, Keith McCloghrie, and Neil Ostroff\r
+ provided significant comments on the APOP command.\r
+\r
+\r
+\r
+Myers & Rose Standards Track [Page 20]\r
+\f\r
+RFC 1939 POP3 May 1996\r
+\r
+\r
+15. Authors' Addresses\r
+\r
+ John G. Myers\r
+ Carnegie-Mellon University\r
+ 5000 Forbes Ave\r
+ Pittsburgh, PA 15213\r
+\r
+ EMail: jgm+@cmu.edu\r
+\r
+\r
+ Marshall T. Rose\r
+ Dover Beach Consulting, Inc.\r
+ 420 Whisman Court\r
+ Mountain View, CA 94043-2186\r
+\r
+ EMail: mrose@dbc.mtview.ca.us\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+Myers & Rose Standards Track [Page 21]\r
+\f\r
+RFC 1939 POP3 May 1996\r
+\r
+\r
+Appendix A. Differences from RFC 1725\r
+\r
+ This memo is a revision to RFC 1725, a Draft Standard. It makes the\r
+ following changes from that document:\r
+\r
+ - clarifies that command keywords are case insensitive.\r
+\r
+ - specifies that servers must send "+OK" and "-ERR" in\r
+ upper case.\r
+\r
+ - specifies that the initial greeting is a positive response,\r
+ instead of any string which should be a positive response.\r
+\r
+ - clarifies behavior for unimplemented commands.\r
+\r
+ - makes the USER and PASS commands optional.\r
+\r
+ - clarifies the set of possible responses to the USER command.\r
+\r
+ - reverses the order of the examples in the USER and PASS\r
+ commands, to reduce confusion.\r
+\r
+ - clarifies that the PASS command may only be given immediately\r
+ after a successful USER command.\r
+\r
+ - clarifies the persistence requirements of UIDs and adds some\r
+ implementation notes.\r
+\r
+ - specifies a UID length limitation of one to 70 octets.\r
+\r
+ - specifies a status indicator length limitation\r
+ of 512 octets, including the CRLF.\r
+\r
+ - clarifies that LIST with no arguments on an empty mailbox\r
+ returns success.\r
+\r
+ - adds a reference from the LIST command to the Message Format\r
+ section\r
+\r
+ - clarifies the behavior of QUIT upon failure\r
+\r
+ - clarifies the security section to not imply the use of the\r
+ USER command with the APOP command.\r
+\r
+ - adds references to RFCs 1730 and 1734\r
+\r
+ - clarifies the method by which a UA may enter mail into the\r
+ transport system.\r
+\r
+\r
+\r
+Myers & Rose Standards Track [Page 22]\r
+\f\r
+RFC 1939 POP3 May 1996\r
+\r
+\r
+ - clarifies that the second argument to the TOP command is a\r
+ number of lines.\r
+\r
+ - changes the suggestion in the Security Considerations section\r
+ for a server to not accept both PASS and APOP for a given user\r
+ from a "must" to a "should".\r
+\r
+ - adds a section on scaling and operational considerations\r
+\r
+Appendix B. Command Index\r
+\r
+ APOP ....................................................... 15\r
+ DELE ....................................................... 8\r
+ LIST ....................................................... 6\r
+ NOOP ....................................................... 9\r
+ PASS ....................................................... 14\r
+ QUIT ....................................................... 5\r
+ QUIT ....................................................... 10\r
+ RETR ....................................................... 8\r
+ RSET ....................................................... 9\r
+ STAT ....................................................... 6\r
+ TOP ........................................................ 11\r
+ UIDL ....................................................... 12\r
+ USER ....................................................... 13\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+\r
+Myers & Rose Standards Track [Page 23]\r
+\f\r
--- /dev/null
+
+
+
+
+
+
+Network Working Group N. Freed
+Request for Comments: 2045 Innosoft
+Obsoletes: 1521, 1522, 1590 N. Borenstein
+Category: Standards Track First Virtual
+ November 1996
+
+
+ Multipurpose Internet Mail Extensions
+ (MIME) Part One:
+ Format of Internet Message Bodies
+
+Status of this Memo
+
+ This document specifies an Internet standards track protocol for the
+ Internet community, and requests discussion and suggestions for
+ improvements. Please refer to the current edition of the "Internet
+ Official Protocol Standards" (STD 1) for the standardization state
+ and status of this protocol. Distribution of this memo is unlimited.
+
+Abstract
+
+ STD 11, RFC 822, defines a message representation protocol specifying
+ considerable detail about US-ASCII message headers, and leaves the
+ message content, or message body, as flat US-ASCII text. This set of
+ documents, collectively called the Multipurpose Internet Mail
+ Extensions, or MIME, redefines the format of messages to allow for
+
+ (1) textual message bodies in character sets other than
+ US-ASCII,
+
+ (2) an extensible set of different formats for non-textual
+ message bodies,
+
+ (3) multi-part message bodies, and
+
+ (4) textual header information in character sets other than
+ US-ASCII.
+
+ These documents are based on earlier work documented in RFC 934, STD
+ 11, and RFC 1049, but extend and revise them. Because RFC 822 said
+ so little about message bodies, these documents are largely
+ orthogonal to (rather than a revision of) RFC 822.
+
+ This initial document specifies the various headers used to describe
+ the structure of MIME messages. The second document, RFC 2046,
+ defines the general structure of the MIME media typing system and
+ defines an initial set of media types. The third document, RFC 2047,
+ describes extensions to RFC 822 to allow non-US-ASCII text data in
+
+
+
+Freed & Borenstein Standards Track [Page 1]
+\f
+RFC 2045 Internet Message Bodies November 1996
+
+
+ Internet mail header fields. The fourth document, RFC 2048, specifies
+ various IANA registration procedures for MIME-related facilities. The
+ fifth and final document, RFC 2049, describes MIME conformance
+ criteria as well as providing some illustrative examples of MIME
+ message formats, acknowledgements, and the bibliography.
+
+ These documents are revisions of RFCs 1521, 1522, and 1590, which
+ themselves were revisions of RFCs 1341 and 1342. An appendix in RFC
+ 2049 describes differences and changes from previous versions.
+
+Table of Contents
+
+ 1. Introduction ......................................... 3
+ 2. Definitions, Conventions, and Generic BNF Grammar .... 5
+ 2.1 CRLF ................................................ 5
+ 2.2 Character Set ....................................... 6
+ 2.3 Message ............................................. 6
+ 2.4 Entity .............................................. 6
+ 2.5 Body Part ........................................... 7
+ 2.6 Body ................................................ 7
+ 2.7 7bit Data ........................................... 7
+ 2.8 8bit Data ........................................... 7
+ 2.9 Binary Data ......................................... 7
+ 2.10 Lines .............................................. 7
+ 3. MIME Header Fields ................................... 8
+ 4. MIME-Version Header Field ............................ 8
+ 5. Content-Type Header Field ............................ 10
+ 5.1 Syntax of the Content-Type Header Field ............. 12
+ 5.2 Content-Type Defaults ............................... 14
+ 6. Content-Transfer-Encoding Header Field ............... 14
+ 6.1 Content-Transfer-Encoding Syntax .................... 14
+ 6.2 Content-Transfer-Encodings Semantics ................ 15
+ 6.3 New Content-Transfer-Encodings ...................... 16
+ 6.4 Interpretation and Use .............................. 16
+ 6.5 Translating Encodings ............................... 18
+ 6.6 Canonical Encoding Model ............................ 19
+ 6.7 Quoted-Printable Content-Transfer-Encoding .......... 19
+ 6.8 Base64 Content-Transfer-Encoding .................... 24
+ 7. Content-ID Header Field .............................. 26
+ 8. Content-Description Header Field ..................... 27
+ 9. Additional MIME Header Fields ........................ 27
+ 10. Summary ............................................. 27
+ 11. Security Considerations ............................. 27
+ 12. Authors' Addresses .................................. 28
+ A. Collected Grammar .................................... 29
+
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 2]
+\f
+RFC 2045 Internet Message Bodies November 1996
+
+
+1. Introduction
+
+ Since its publication in 1982, RFC 822 has defined the standard
+ format of textual mail messages on the Internet. Its success has
+ been such that the RFC 822 format has been adopted, wholly or
+ partially, well beyond the confines of the Internet and the Internet
+ SMTP transport defined by RFC 821. As the format has seen wider use,
+ a number of limitations have proven increasingly restrictive for the
+ user community.
+
+ RFC 822 was intended to specify a format for text messages. As such,
+ non-text messages, such as multimedia messages that might include
+ audio or images, are simply not mentioned. Even in the case of text,
+ however, RFC 822 is inadequate for the needs of mail users whose
+ languages require the use of character sets richer than US-ASCII.
+ Since RFC 822 does not specify mechanisms for mail containing audio,
+ video, Asian language text, or even text in most European languages,
+ additional specifications are needed.
+
+ One of the notable limitations of RFC 821/822 based mail systems is
+ the fact that they limit the contents of electronic mail messages to
+ relatively short lines (e.g. 1000 characters or less [RFC-821]) of
+ 7bit US-ASCII. This forces users to convert any non-textual data
+ that they may wish to send into seven-bit bytes representable as
+ printable US-ASCII characters before invoking a local mail UA (User
+ Agent, a program with which human users send and receive mail).
+ Examples of such encodings currently used in the Internet include
+ pure hexadecimal, uuencode, the 3-in-4 base 64 scheme specified in
+ RFC 1421, the Andrew Toolkit Representation [ATK], and many others.
+
+ The limitations of RFC 822 mail become even more apparent as gateways
+ are designed to allow for the exchange of mail messages between RFC
+ 822 hosts and X.400 hosts. X.400 [X400] specifies mechanisms for the
+ inclusion of non-textual material within electronic mail messages.
+ The current standards for the mapping of X.400 messages to RFC 822
+ messages specify either that X.400 non-textual material must be
+ converted to (not encoded in) IA5Text format, or that they must be
+ discarded, notifying the RFC 822 user that discarding has occurred.
+ This is clearly undesirable, as information that a user may wish to
+ receive is lost. Even though a user agent may not have the
+ capability of dealing with the non-textual material, the user might
+ have some mechanism external to the UA that can extract useful
+ information from the material. Moreover, it does not allow for the
+ fact that the message may eventually be gatewayed back into an X.400
+ message handling system (i.e., the X.400 message is "tunneled"
+ through Internet mail), where the non-textual information would
+ definitely become useful again.
+
+
+
+
+Freed & Borenstein Standards Track [Page 3]
+\f
+RFC 2045 Internet Message Bodies November 1996
+
+
+ This document describes several mechanisms that combine to solve most
+ of these problems without introducing any serious incompatibilities
+ with the existing world of RFC 822 mail. In particular, it
+ describes:
+
+ (1) A MIME-Version header field, which uses a version
+ number to declare a message to be conformant with MIME
+ and allows mail processing agents to distinguish
+ between such messages and those generated by older or
+ non-conformant software, which are presumed to lack
+ such a field.
+
+ (2) A Content-Type header field, generalized from RFC 1049,
+ which can be used to specify the media type and subtype
+ of data in the body of a message and to fully specify
+ the native representation (canonical form) of such
+ data.
+
+ (3) A Content-Transfer-Encoding header field, which can be
+ used to specify both the encoding transformation that
+ was applied to the body and the domain of the result.
+ Encoding transformations other than the identity
+ transformation are usually applied to data in order to
+ allow it to pass through mail transport mechanisms
+ which may have data or character set limitations.
+
+ (4) Two additional header fields that can be used to
+ further describe the data in a body, the Content-ID and
+ Content-Description header fields.
+
+ All of the header fields defined in this document are subject to the
+ general syntactic rules for header fields specified in RFC 822. In
+ particular, all of these header fields except for Content-Disposition
+ can include RFC 822 comments, which have no semantic content and
+ should be ignored during MIME processing.
+
+ Finally, to specify and promote interoperability, RFC 2049 provides a
+ basic applicability statement for a subset of the above mechanisms
+ that defines a minimal level of "conformance" with this document.
+
+ HISTORICAL NOTE: Several of the mechanisms described in this set of
+ documents may seem somewhat strange or even baroque at first reading.
+ It is important to note that compatibility with existing standards
+ AND robustness across existing practice were two of the highest
+ priorities of the working group that developed this set of documents.
+ In particular, compatibility was always favored over elegance.
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 4]
+\f
+RFC 2045 Internet Message Bodies November 1996
+
+
+ Please refer to the current edition of the "Internet Official
+ Protocol Standards" for the standardization state and status of this
+ protocol. RFC 822 and STD 3, RFC 1123 also provide essential
+ background for MIME since no conforming implementation of MIME can
+ violate them. In addition, several other informational RFC documents
+ will be of interest to the MIME implementor, in particular RFC 1344,
+ RFC 1345, and RFC 1524.
+
+2. Definitions, Conventions, and Generic BNF Grammar
+
+ Although the mechanisms specified in this set of documents are all
+ described in prose, most are also described formally in the augmented
+ BNF notation of RFC 822. Implementors will need to be familiar with
+ this notation in order to understand this set of documents, and are
+ referred to RFC 822 for a complete explanation of the augmented BNF
+ notation.
+
+ Some of the augmented BNF in this set of documents makes named
+ references to syntax rules defined in RFC 822. A complete formal
+ grammar, then, is obtained by combining the collected grammar
+ appendices in each document in this set with the BNF of RFC 822 plus
+ the modifications to RFC 822 defined in RFC 1123 (which specifically
+ changes the syntax for `return', `date' and `mailbox').
+
+ All numeric and octet values are given in decimal notation in this
+ set of documents. All media type values, subtype values, and
+ parameter names as defined are case-insensitive. However, parameter
+ values are case-sensitive unless otherwise specified for the specific
+ parameter.
+
+ FORMATTING NOTE: Notes, such as this one, provide additional
+ nonessential information which may be skipped by the reader without
+ missing anything essential. The primary purpose of these non-
+ essential notes is to convey information about the rationale of this
+ set of documents, or to place these documents in the proper
+ historical or evolutionary context. Such information may in
+ particular be skipped by those who are focused entirely on building a
+ conformant implementation, but may be of use to those who wish to
+ understand why certain design choices were made.
+
+2.1. CRLF
+
+ The term CRLF, in this set of documents, refers to the sequence of
+ octets corresponding to the two US-ASCII characters CR (decimal value
+ 13) and LF (decimal value 10) which, taken together, in this order,
+ denote a line break in RFC 822 mail.
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 5]
+\f
+RFC 2045 Internet Message Bodies November 1996
+
+
+2.2. Character Set
+
+ The term "character set" is used in MIME to refer to a method of
+ converting a sequence of octets into a sequence of characters. Note
+ that unconditional and unambiguous conversion in the other direction
+ is not required, in that not all characters may be representable by a
+ given character set and a character set may provide more than one
+ sequence of octets to represent a particular sequence of characters.
+
+ This definition is intended to allow various kinds of character
+ encodings, from simple single-table mappings such as US-ASCII to
+ complex table switching methods such as those that use ISO 2022's
+ techniques, to be used as character sets. However, the definition
+ associated with a MIME character set name must fully specify the
+ mapping to be performed. In particular, use of external profiling
+ information to determine the exact mapping is not permitted.
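
The asymmetry in this definition (octets always map to characters,
but not every character maps back to octets) can be seen in a small
Python sketch. It assumes Python's codec names "iso-8859-1" and
"us-ascii" as stand-ins for the corresponding MIME character sets;
the helper names are illustrative.

```python
def decode(octets: bytes, charset: str) -> str:
    """Convert a sequence of octets into a sequence of characters."""
    return octets.decode(charset)


def representable(text: str, charset: str) -> bool:
    """Report whether every character of `text` has an octet
    representation in `charset`; the reverse mapping may not exist."""
    try:
        text.encode(charset)
    except UnicodeEncodeError:
        return False
    return True
```

For instance, the octets 63 61 66 E9 (hexadecimal) decode to a
four-character word under ISO-8859-1, but the euro sign has no octet
representation in that character set.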
+
+ NOTE: The term "character set" was originally used to describe such
+ straightforward schemes as US-ASCII and ISO-8859-1 which have a
+ simple one-to-one mapping from single octets to single characters.
+ Multi-octet coded character sets and switching techniques make the
+ situation more complex. For example, some communities use the term
+ "character encoding" for what MIME calls a "character set", while
+ using the phrase "coded character set" to denote an abstract mapping
+ from integers (not octets) to characters.
+
+2.3. Message
+
+ The term "message", when not further qualified, means either a
+ (complete or "top-level") RFC 822 message being transferred on a
+ network, or a message encapsulated in a body of type "message/rfc822"
+ or "message/partial".
+
+2.4. Entity
+
+ The term "entity" refers specifically to the MIME-defined header
+ fields and contents of either a message or one of the parts in the
+ body of a multipart entity. The specification of such entities is
+ the essence of MIME. Since the contents of an entity are often
+ called the "body", it makes sense to speak about the body of an
+ entity. Any sort of field may be present in the header of an entity,
+ but only those fields whose names begin with "content-" actually have
+ any MIME-related meaning. Note that this does NOT imply that they
+ have no meaning at all -- an entity that is also a message has non-
+ MIME header fields whose meanings are defined by RFC 822.
+
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 6]
+\f
+RFC 2045 Internet Message Bodies November 1996
+
+
+2.5. Body Part
+
+ The term "body part" refers to an entity inside of a multipart
+ entity.
+
+2.6. Body
+
+ The term "body", when not further qualified, means the body of an
+ entity, that is, the body of either a message or of a body part.
+
+ NOTE: The previous four definitions are clearly circular. This is
+ unavoidable, since the overall structure of a MIME message is indeed
+ recursive.
+
+2.7. 7bit Data
+
+ "7bit data" refers to data that is all represented as relatively
+ short lines with 998 octets or less between CRLF line separation
+ sequences [RFC-821]. No octets with decimal values greater than 127
+ are allowed and neither are NULs (octets with decimal value 0). CR
+ (decimal value 13) and LF (decimal value 10) octets only occur as
+ part of CRLF line separation sequences.
+
+2.8. 8bit Data
+
+ "8bit data" refers to data that is all represented as relatively
+ short lines with 998 octets or less between CRLF line separation
+ sequences [RFC-821], but octets with decimal values greater than 127
+ may be used. As with "7bit data" CR and LF octets only occur as part
+ of CRLF line separation sequences and no NULs are allowed.
+
+2.9. Binary Data
+
+ "Binary data" refers to data where any sequence of octets whatsoever
+ is allowed.
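
The three definitions above can be turned into a simple classifier.
The sketch below is one possible reading, not a normative test: the
function name classify is an assumption, and any data that fails the
7bit and 8bit constraints is simply reported as "binary".

```python
def classify(data: bytes) -> str:
    """Classify octets as "7bit", "8bit" or "binary" data."""
    # CR and LF may only occur as part of a CRLF sequence: once every
    # CRLF pair is removed, any remaining CR or LF is a bare one.
    remainder = data.replace(b"\r\n", b"")
    if b"\r" in remainder or b"\n" in remainder:
        return "binary"
    if b"\x00" in data:                       # NULs are not allowed
        return "binary"
    if any(len(line) > 998 for line in data.split(b"\r\n")):
        return "binary"                       # line too long
    if any(octet > 127 for octet in data):
        return "8bit"
    return "7bit"
```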
+
+2.10. Lines
+
+ "Lines" are defined as sequences of octets separated by CRLF
+ sequences. This is consistent with both RFC 821 and RFC 822.
+ "Lines" only refers to a unit of data in a message, which may or may
+ not correspond to something that is actually displayed by a user
+ agent.
+
+
+
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 7]
+\f
+RFC 2045 Internet Message Bodies November 1996
+
+
+3. MIME Header Fields
+
+ MIME defines a number of new RFC 822 header fields that are used to
+ describe the content of a MIME entity. These header fields occur in
+ at least two contexts:
+
+ (1) As part of a regular RFC 822 message header.
+
+ (2) In a MIME body part header within a multipart
+ construct.
+
+ The formal definition of these header fields is as follows:
+
+ entity-headers := [ content CRLF ]
+ [ encoding CRLF ]
+ [ id CRLF ]
+ [ description CRLF ]
+ *( MIME-extension-field CRLF )
+
+ MIME-message-headers := entity-headers
+ fields
+ version CRLF
+ ; The ordering of the header
+ ; fields implied by this BNF
+ ; definition should be ignored.
+
+ MIME-part-headers := entity-headers
+ [ fields ]
+ ; Any field not beginning with
+ ; "content-" can have no defined
+ ; meaning and may be ignored.
+ ; The ordering of the header
+ ; fields implied by this BNF
+ ; definition should be ignored.
+
+ The syntax of the various specific MIME header fields will be
+ described in the following sections.
+
+4. MIME-Version Header Field
+
+ Since RFC 822 was published in 1982, there has really been only one
+ format standard for Internet messages, and there has been little
+ perceived need to declare the format standard in use. This document
+ is an independent specification that complements RFC 822. Although
+ the extensions in this document have been defined in such a way as to
+ be compatible with RFC 822, there are still circumstances in which it
+ might be desirable for a mail-processing agent to know whether a
+ message was composed with the new standard in mind.
+
+
+
+Freed & Borenstein Standards Track [Page 8]
+\f
+RFC 2045 Internet Message Bodies November 1996
+
+
+ Therefore, this document defines a new header field, "MIME-Version",
+ which is to be used to declare the version of the Internet message
+ body format standard in use.
+
+ Messages composed in accordance with this document MUST include such
+ a header field, with the following verbatim text:
+
+ MIME-Version: 1.0
+
+ The presence of this header field is an assertion that the message
+ has been composed in compliance with this document.
+
+ Since it is possible that a future document might extend the message
+ format standard again, a formal BNF is given for the content of the
+ MIME-Version field:
+
+ version := "MIME-Version" ":" 1*DIGIT "." 1*DIGIT
+
+ Thus, future format specifiers, which might replace or extend "1.0",
+ are constrained to be two integer fields, separated by a period. If
+ a message is received with a MIME-version value other than "1.0", it
+ cannot be assumed to conform with this document.
+
+ Note that the MIME-Version header field is required at the top level
+ of a message. It is not required for each body part of a multipart
+ entity. It is required for the embedded headers of a body of type
+ "message/rfc822" or "message/partial" if and only if the embedded
+ message is itself claimed to be MIME-conformant.
+
+ It is not possible to fully specify how a mail reader that conforms
+ with MIME as defined in this document should treat a message that
+ might arrive in the future with some value of MIME-Version other than
+ "1.0".
+
+ It is also worth noting that version control for specific media types
+ is not accomplished using the MIME-Version mechanism. In particular,
+ some formats (such as application/postscript) have version numbering
+ conventions that are internal to the media format. Where such
+ conventions exist, MIME does nothing to supersede them. Where no
+ such conventions exist, a MIME media type might use a "version"
+ parameter in the content-type field if necessary.
+
+
+
+
+
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 9]
+\f
+RFC 2045 Internet Message Bodies November 1996
+
+
+ NOTE TO IMPLEMENTORS: When checking MIME-Version values any RFC 822
+ comment strings that are present must be ignored. In particular, the
+ following four MIME-Version fields are equivalent:
+
+ MIME-Version: 1.0
+
+ MIME-Version: 1.0 (produced by MetaSend Vx.x)
+
+ MIME-Version: (produced by MetaSend Vx.x) 1.0
+
+ MIME-Version: 1.(produced by MetaSend Vx.x)0
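
A sketch of how an implementation might treat these four fields as
equivalent follows. The helper names are assumptions; the comment
handling covers nested parentheses and backslash-quoted characters
inside comments, and is not a full RFC 822 parser.

```python
import re


def strip_comments(value: str) -> str:
    """Remove RFC 822 comments (parenthesized, possibly nested)."""
    out, depth, i = [], 0, 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and depth > 0 and i + 1 < len(value):
            i += 2                 # quoted-pair inside a comment
            continue
        if ch == "(":
            depth += 1
        elif ch == ")" and depth > 0:
            depth -= 1
        elif depth == 0:
            out.append(ch)
        i += 1
    return "".join(out)


def parse_mime_version(value: str):
    """Return (major, minor) from a MIME-Version field body, or None
    if it does not match 1*DIGIT "." 1*DIGIT once comments and white
    space have been removed."""
    text = re.sub(r"\s+", "", strip_comments(value))
    match = re.fullmatch(r"([0-9]+)\.([0-9]+)", text)
    return (int(match.group(1)), int(match.group(2))) if match else None
```

Applied to the field bodies of the four examples above, the function
yields the same (1, 0) pair each time.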
+
+ In the absence of a MIME-Version field, a receiving mail user agent
+ (whether conforming to MIME requirements or not) may optionally
+ choose to interpret the body of the message according to local
+ conventions. Many such conventions are currently in use and it
+ should be noted that in practice non-MIME messages can contain just
+ about anything.
+
+ It is impossible to be certain that a non-MIME mail message is
+ actually plain text in the US-ASCII character set since it might well
+ be a message that, using some set of nonstandard local conventions
+ that predate MIME, includes text in another character set or non-
+ textual data presented in a manner that cannot be automatically
+ recognized (e.g., a uuencoded compressed UNIX tar file).
+
+5. Content-Type Header Field
+
+ The purpose of the Content-Type field is to describe the data
+ contained in the body fully enough that the receiving user agent can
+ pick an appropriate agent or mechanism to present the data to the
+ user, or otherwise deal with the data in an appropriate manner. The
+ value in this field is called a media type.
+
+ HISTORICAL NOTE: The Content-Type header field was first defined in
+ RFC 1049. RFC 1049 used a simpler and less powerful syntax, but one
+ that is largely compatible with the mechanism given here.
+
+ The Content-Type header field specifies the nature of the data in the
+ body of an entity by giving media type and subtype identifiers, and
+ by providing auxiliary information that may be required for certain
+ media types. After the media type and subtype names, the remainder
+ of the header field is simply a set of parameters, specified in an
+ attribute=value notation. The ordering of parameters is not
+ significant.
+
+ In general, the top-level media type is used to declare the general
+ type of data, while the subtype specifies a specific format for that
+ type of data. Thus, a media type of "image/xyz" is enough to tell a
+ user agent that the data is an image, even if the user agent has no
+ knowledge of the specific image format "xyz". Such information can
+ be used, for example, to decide whether or not to show a user the raw
+ data from an unrecognized subtype -- such an action might be
+ reasonable for unrecognized subtypes of text, but not for
+ unrecognized subtypes of image or audio. For this reason, registered
+ subtypes of text, image, audio, and video should not contain embedded
+ information that is really of a different type. Such compound
+ formats should be represented using the "multipart" or "application"
+ types.
+
+ Parameters are modifiers of the media subtype, and as such do not
+ fundamentally affect the nature of the content. The set of
+ meaningful parameters depends on the media type and subtype. Most
+ parameters are associated with a single specific subtype. However, a
+ given top-level media type may define parameters which are applicable
+ to any subtype of that type. Parameters may be required by their
+ defining content type or subtype or they may be optional. MIME
+ implementations must ignore any parameters whose names they do not
+ recognize.
+
+ For example, the "charset" parameter is applicable to any subtype of
+ "text", while the "boundary" parameter is required for any subtype of
+ the "multipart" media type.
+
+ There are NO globally-meaningful parameters that apply to all media
+ types. Truly global mechanisms are best addressed, in the MIME
+ model, by the definition of additional Content-* header fields.
+
+ An initial set of seven top-level media types is defined in RFC 2046.
+ Five of these are discrete types whose content is essentially opaque
+ as far as MIME processing is concerned. The remaining two are
+ composite types whose contents require additional handling by MIME
+ processors.
+
+ This set of top-level media types is intended to be substantially
+ complete. It is expected that additions to the larger set of
+ supported types can generally be accomplished by the creation of new
+ subtypes of these initial types. In the future, more top-level types
+ may be defined only by a standards-track extension to this standard.
+ If another top-level type is to be used for any reason, it must be
+ given a name starting with "X-" to indicate its non-standard status
+ and to avoid a potential conflict with a future official name.
+
+5.1. Syntax of the Content-Type Header Field
+
+ In the Augmented BNF notation of RFC 822, a Content-Type header field
+ value is defined as follows:
+
+ content := "Content-Type" ":" type "/" subtype
+ *(";" parameter)
+ ; Matching of media type and subtype
+ ; is ALWAYS case-insensitive.
+
+ type := discrete-type / composite-type
+
+ discrete-type := "text" / "image" / "audio" / "video" /
+ "application" / extension-token
+
+ composite-type := "message" / "multipart" / extension-token
+
+ extension-token := ietf-token / x-token
+
+ ietf-token := <An extension token defined by a
+ standards-track RFC and registered
+ with IANA.>
+
+ x-token := <The two characters "X-" or "x-" followed, with
+ no intervening white space, by any token>
+
+ subtype := extension-token / iana-token
+
+ iana-token := <A publicly-defined extension token. Tokens
+ of this form must be registered with IANA
+ as specified in RFC 2048.>
+
+ parameter := attribute "=" value
+
+ attribute := token
+ ; Matching of attributes
+ ; is ALWAYS case-insensitive.
+
+ value := token / quoted-string
+
+ token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
+ or tspecials>
+
+ tspecials := "(" / ")" / "<" / ">" / "@" /
+ "," / ";" / ":" / "\" / <"> /
+ "/" / "[" / "]" / "?" / "="
+ ; Must be in quoted-string,
+ ; to use within parameter values
+
+ Note that the definition of "tspecials" is the same as the RFC 822
+ definition of "specials" with the addition of the three characters
+ "/", "?", and "=", and the removal of ".".
+
+ Note also that a subtype specification is MANDATORY -- it may not be
+ omitted from a Content-Type header field. As such, there are no
+ default subtypes.
+
+ The type, subtype, and parameter names are not case sensitive. For
+ example, TEXT, Text, and TeXt are all equivalent top-level media
+ types. Parameter values are normally case sensitive, but sometimes
+ are interpreted in a case-insensitive fashion, depending on the
+ intended use. (For example, multipart boundaries are case-sensitive,
+ but the "access-type" parameter for message/External-body is not
+ case-sensitive.)
+
+ Note that the value of a quoted string parameter does not include the
+ quotes. That is, the quotation marks in a quoted-string are not a
+ part of the value of the parameter, but are merely used to delimit
+ that parameter value. In addition, comments are allowed in
+ accordance with RFC 822 rules for structured header fields. Thus the
+ following two forms
+
+ Content-type: text/plain; charset=us-ascii (Plain text)
+
+ Content-type: text/plain; charset="us-ascii"
+
+ are completely equivalent.
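
As an illustration only, the following sketch parses a Content-Type value along the lines of the grammar above. The function names are hypothetical and the parser is deliberately simplified: it drops comments, unquotes quoted-strings, lowercases the type, subtype, and attribute names, and leaves parameter values untouched. It does not validate tokens against tspecials.

```python
def _split_fields(value: str):
    """Split on ';' outside quotes, dropping comments and unquoting strings."""
    parts, buf = [], []
    depth, in_quotes, i = 0, False, 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value) and (in_quotes or depth):
            if in_quotes:
                buf.append(value[i + 1])  # quoted pair inside a quoted-string
            i += 2
            continue
        if in_quotes:
            if ch == '"':
                in_quotes = False
            else:
                buf.append(ch)
        elif depth:
            depth += (ch == "(") - (ch == ")")  # track nested comments
        elif ch == '"':
            in_quotes = True
        elif ch == "(":
            depth = 1
        elif ch == ";":
            parts.append("".join(buf).strip())
            buf = []
        else:
            buf.append(ch)
        i += 1
    parts.append("".join(buf).strip())
    return parts


def parse_content_type(value: str):
    """Return (type, subtype, params); raise ValueError if the subtype is missing."""
    first, *rest = _split_fields(value)
    mtype, slash, subtype = first.partition("/")
    mtype, subtype = mtype.strip().lower(), subtype.strip().lower()
    if not slash or not mtype or not subtype:
        raise ValueError("a subtype is mandatory in Content-Type")
    params = {}
    for item in rest:
        name, eq, val = item.partition("=")
        if eq:
            params[name.strip().lower()] = val.strip()
    return mtype, subtype, params


a = parse_content_type("text/plain; charset=us-ascii (Plain text)")
b = parse_content_type('text/plain; charset="us-ascii"')
```

Under these assumptions a and b are identical, and "TEXT/Plain" parses to the same type and subtype as "text/plain".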
+
+ Beyond this syntax, the only syntactic constraint on the definition
+ of subtype names is the desire that their uses must not conflict.
+ That is, it would be undesirable to have two different communities
+ using "Content-Type: application/foobar" to mean two different
+ things. The process of defining new media subtypes, then, is not
+ intended to be a mechanism for imposing restrictions, but simply a
+ mechanism for publicizing their definition and usage. There are,
+ therefore, two acceptable mechanisms for defining new media subtypes:
+
+ (1) Private values (starting with "X-") may be defined
+ bilaterally between two cooperating agents without
+ outside registration or standardization. Such values
+ cannot be registered or standardized.
+
+ (2) New standard values should be registered with IANA as
+ described in RFC 2048.
+
+ The second document in this set, RFC 2046, defines the initial set of
+ media types for MIME.
+
+5.2. Content-Type Defaults
+
+ Default RFC 822 messages without a MIME Content-Type header are taken
+ by this protocol to be plain text in the US-ASCII character set,
+ which can be explicitly specified as:
+
+ Content-type: text/plain; charset=us-ascii
+
+ This default is assumed if no Content-Type header field is specified.
+ It is also recommended that this default be assumed when a
+ syntactically invalid Content-Type header field is encountered. In
+ the presence of a MIME-Version header field and the absence of any
+ Content-Type header field, a receiving User Agent can also assume
+ that plain US-ASCII text was the sender's intent. Plain US-ASCII
+ text may still be assumed in the absence of a MIME-Version or the
+ presence of a syntactically invalid Content-Type header field, but
+ the sender's intent might have been otherwise.
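
One way to observe these defaults is through Python's standard email package, whose Message.get_content_type() falls back to text/plain when the field is absent or has no valid type/subtype pair. The wrapper below is a sketch; the name effective_content_type and the choice to report "us-ascii" for a missing charset are assumptions made for illustration.

```python
from email import message_from_string


def effective_content_type(raw_message: str):
    """Return (media type, charset) with the Section 5.2 defaults filled in."""
    msg = message_from_string(raw_message)
    # Falls back to text/plain for a missing or syntactically invalid field.
    media_type = msg.get_content_type()
    # The second value is returned only when no charset parameter is present.
    charset = msg.get_content_charset("us-ascii")
    return media_type, charset


no_header = effective_content_type("Subject: hi\n\nbody\n")
invalid = effective_content_type(
    "MIME-Version: 1.0\nContent-Type: nonsense\n\nbody\n"
)
explicit = effective_content_type(
    "Content-Type: text/html; charset=ISO-8859-1\n\n<p>hi</p>\n"
)
```

Note that the fallback says nothing about the sender's intent; as the text above points out, a non-MIME message may contain almost anything.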
+
+6. Content-Transfer-Encoding Header Field
+
+ Many media types which could be usefully transported via email are
+ represented, in their "natural" format, as 8bit character or binary
+ data. Such data cannot be transmitted over some transfer protocols.
+ For example, RFC 821 (SMTP) restricts mail messages to 7bit US-ASCII
+ data with lines no longer than 1000 characters including any trailing
+ CRLF line separator.
+
+ It is necessary, therefore, to define a standard mechanism for
+ encoding such data into a 7bit short line format. Proper labelling
+ of unencoded material in less restrictive formats for direct use over
+ less restrictive transports is also desirable. This document
+ specifies that such encodings will be indicated by a new "Content-
+ Transfer-Encoding" header field. This field has not been defined by
+ any previous standard.
+
+6.1. Content-Transfer-Encoding Syntax
+
+ The Content-Transfer-Encoding field's value is a single token
+ specifying the type of encoding, as enumerated below. Formally:
+
+ encoding := "Content-Transfer-Encoding" ":" mechanism
+
+ mechanism := "7bit" / "8bit" / "binary" /
+ "quoted-printable" / "base64" /
+ ietf-token / x-token
+
+ These values are not case sensitive -- Base64 and BASE64 and bAsE64
+ are all equivalent. An encoding type of 7BIT requires that the body
+ is already in a 7bit mail-ready representation. This is the default
+ value -- that is, "Content-Transfer-Encoding: 7BIT" is assumed if the
+ Content-Transfer-Encoding header field is not present.
+
+6.2. Content-Transfer-Encodings Semantics
+
+ This single Content-Transfer-Encoding token actually provides two
+ pieces of information. It specifies what sort of encoding
+ transformation the body was subjected to and hence what decoding
+ operation must be used to restore it to its original form, and it
+ specifies what the domain of the result is.
+
+ The transformation part of any Content-Transfer-Encoding specifies,
+ either explicitly or implicitly, a single, well-defined decoding
+ algorithm, which for any sequence of encoded octets either transforms
+ it to the original sequence of octets which was encoded, or shows
+ that it is illegal as an encoded sequence. Content-Transfer-
+ Encodings transformations never depend on any additional external
+ profile information for proper operation. Note that while decoders
+ must produce a single, well-defined output for a valid encoding, no
+ such restrictions exist for encoders: Encoding a given sequence of
+ octets to different, equivalent encoded sequences is perfectly legal.
+
+ Three transformations are currently defined: identity, the "quoted-
+ printable" encoding, and the "base64" encoding. The domains are
+ "binary", "8bit" and "7bit".
+
+ The Content-Transfer-Encoding values "7bit", "8bit", and "binary" all
+ mean that the identity (i.e. NO) encoding transformation has been
+ performed. As such, they serve simply as indicators of the domain of
+ the body data, and provide useful information about the sort of
+ encoding that might be needed for transmission in a given transport
+ system. The terms "7bit data", "8bit data", and "binary data" are
+ all defined in Section 2.
+
+ The quoted-printable and base64 encodings transform their input from
+ an arbitrary domain into material in the "7bit" range, thus making it
+ safe to carry over restricted transports. The specific definitions of
+ the transformations are given below.
+
+ The proper Content-Transfer-Encoding label must always be used.
+ Labelling unencoded data containing 8bit characters as "7bit" is not
+ allowed, nor is labelling unencoded non-line-oriented data as
+ anything other than "binary" allowed.
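
A sketch of how a composer might pick the correct label for unencoded data follows. The limits used (lines of at most 998 octets between CRLF pairs, no NUL octets, CR and LF only as CRLF) restate the Section 2 definitions of "7bit data" and "8bit data"; the function name transfer_domain is hypothetical.

```python
def transfer_domain(body: bytes) -> str:
    """Classify unencoded data as '7bit', '8bit' or 'binary'."""
    lines = body.split(b"\r\n")
    line_oriented = (
        b"\x00" not in body
        and all(len(line) <= 998 for line in lines)
        # After splitting on CRLF, any remaining CR or LF is a bare one.
        and all(b"\r" not in line and b"\n" not in line for line in lines)
    )
    if not line_oriented:
        return "binary"
    return "7bit" if all(octet < 128 for octet in body) else "8bit"


labels = [
    transfer_domain(b"plain text\r\n"),
    transfer_domain("caf\u00e9\r\n".encode("iso-8859-1")),
    transfer_domain(b"bare\nnewline"),
]
```

Data classified as "8bit" or "binary" must either be labelled as such or be transformed with quoted-printable or base64 before it is labelled for a 7bit transport.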
+
+ Unlike media subtypes, a proliferation of Content-Transfer-Encoding
+ values is both undesirable and unnecessary. However, establishing
+ only a single transformation into the "7bit" domain does not seem
+ possible. There is a tradeoff between the desire for a compact and
+ efficient encoding of largely-binary data and the desire for a
+ somewhat readable encoding of data that is mostly, but not entirely,
+ 7bit. For this reason, at least two encoding mechanisms are
+ necessary: a more or less readable encoding (quoted-printable) and a
+ "dense" or "uniform" encoding (base64).
+
+ Mail transport for unencoded 8bit data is defined in RFC 1652. As of
+ the initial publication of this document, there are no standardized
+ Internet mail transports for which it is legitimate to include
+ unencoded binary data in mail bodies. Thus there are no
+ circumstances in which the "binary" Content-Transfer-Encoding is
+ actually valid in Internet mail. However, in the event that binary
+ mail transport becomes a reality in Internet mail, or when MIME is
+ used in conjunction with any other binary-capable mail transport
+ mechanism, binary bodies must be labelled as such using this
+ mechanism.
+
+ NOTE: The five values defined for the Content-Transfer-Encoding field
+ imply nothing about the media type other than the algorithm by which
+ it was encoded or the transport system requirements if unencoded.
+
+6.3. New Content-Transfer-Encodings
+
+ Implementors may, if necessary, define private Content-Transfer-
+ Encoding values, but must use an x-token, which is a name prefixed by
+ "X-", to indicate its non-standard status, e.g., "Content-Transfer-
+ Encoding: x-my-new-encoding". Additional standardized Content-
+ Transfer-Encoding values must be specified by a standards-track RFC.
+ The requirements such specifications must meet are given in RFC 2048.
+ As such, the entire content-transfer-encoding namespace except that
+ beginning with "X-" is explicitly reserved to the IETF for future
+ use.
+
+ Unlike media types and subtypes, the creation of new Content-
+ Transfer-Encoding values is STRONGLY discouraged, as it seems likely
+ to hinder interoperability with little potential benefit.
+
+6.4. Interpretation and Use
+
+ If a Content-Transfer-Encoding header field appears as part of a
+ message header, it applies to the entire body of that message. If a
+ Content-Transfer-Encoding header field appears as part of an entity's
+ headers, it applies only to the body of that entity. If an entity is
+ of type "multipart" the Content-Transfer-Encoding is not permitted to
+ have any value other than "7bit", "8bit" or "binary". Even more
+ severe restrictions apply to some subtypes of the "message" type.
+
+ It should be noted that most media types are defined in terms of
+ octets rather than bits, so that the mechanisms described here are
+ mechanisms for encoding arbitrary octet streams, not bit streams. If
+ a bit stream is to be encoded via one of these mechanisms, it must
+ first be converted to an 8bit byte stream using the network standard
+ bit order ("big-endian"), in which the earlier bits in a stream
+ become the higher-order bits in an 8bit byte. A bit stream not ending
+ at an 8bit boundary must be padded with zeroes. RFC 2046 provides a
+ mechanism for noting the addition of such padding in the case of the
+ application/octet-stream media type, which has a "padding" parameter.
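
The conversion described above can be sketched as follows. The function name pack_bits and the choice to return the padding count alongside the octets are illustrative assumptions; the count is the kind of value the "padding" parameter is meant to carry.

```python
def pack_bits(bits):
    """Pack 0/1 values into octets, earliest bit first, zero-padding the tail."""
    bits = list(bits)
    padding = -len(bits) % 8          # zero bits needed to reach an octet boundary
    bits += [0] * padding
    out = bytearray()
    for i in range(0, len(bits), 8):
        byte = 0
        for bit in bits[i:i + 8]:
            byte = (byte << 1) | bit  # earlier bits end up in higher-order positions
        out.append(byte)
    return bytes(out), padding


packed, pad = pack_bits([1, 0, 1])
```

Here the three bits 1, 0, 1 become the single octet 0xA0 with five bits of padding.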
+
+ The encoding mechanisms defined here explicitly encode all data in
+ US-ASCII. Thus, for example, suppose an entity has header fields
+ such as:
+
+ Content-Type: text/plain; charset=ISO-8859-1
+ Content-transfer-encoding: base64
+
+ This must be interpreted to mean that the body is a base64 US-ASCII
+ encoding of data that was originally in ISO-8859-1, and will be in
+ that character set again after decoding.
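
The two layers can be seen separately in a short sketch using Python's standard base64 module. The helper name decode_text_body and the sample word are illustrative assumptions.

```python
import base64


def decode_text_body(encoded: bytes, charset: str) -> str:
    """Undo the transfer encoding first, then interpret the octets as text."""
    raw = base64.b64decode(encoded)  # US-ASCII characters -> original octets
    return raw.decode(charset)       # octets -> characters in the declared charset


original = "caf\u00e9"               # contains one non-ASCII ISO-8859-1 character
body = base64.b64encode(original.encode("iso-8859-1"))
restored = decode_text_body(body, "iso-8859-1")
```

The encoded body consists solely of US-ASCII characters, while restored equals the original ISO-8859-1 text.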
+
+ Certain Content-Transfer-Encoding values may only be used on certain
+ media types. In particular, it is EXPRESSLY FORBIDDEN to use any
+ encodings other than "7bit", "8bit", or "binary" with any composite
+ media type, i.e. one that recursively includes other Content-Type
+ fields. Currently the only composite media types are "multipart" and
+ "message". All encodings that are desired for bodies of type
+ multipart or message must be done at the innermost level, by encoding
+ the actual body that needs to be encoded.
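
A validity check for this restriction might look like the sketch below. The names are hypothetical, and the additional, more severe restrictions on some "message" subtypes are not modelled.

```python
IDENTITY_ENCODINGS = {"7bit", "8bit", "binary"}
COMPOSITE_TYPES = {"multipart", "message"}


def encoding_allowed(media_type: str, encoding: str = "7bit") -> bool:
    """Return False when a composite type carries a non-identity encoding."""
    top_level = media_type.split("/", 1)[0].strip().lower()
    if top_level in COMPOSITE_TYPES:
        return encoding.strip().lower() in IDENTITY_ENCODINGS
    return True


checks = [
    encoding_allowed("multipart/mixed", "8bit"),
    encoding_allowed("Multipart/Mixed", "base64"),
    encoding_allowed("image/gif", "base64"),
]
```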
+
+ It should also be noted that, by definition, if a composite entity
+ has a transfer-encoding value such as "7bit", but one of the enclosed
+ entities has a less restrictive value such as "8bit", then either the
+ outer "7bit" labelling is in error, because 8bit data are included,
+ or the inner "8bit" labelling placed an unnecessarily high demand on
+ the transport system because the included data were actually
+ 7bit-safe.
+
+ NOTE ON ENCODING RESTRICTIONS: Though the prohibition against using
+ content-transfer-encodings on composite body data may seem overly
+ restrictive, it is necessary to prevent nested encodings, in which
+ data are passed through an encoding algorithm multiple times, and
+ must be decoded multiple times in order to be properly viewed.
+ Nested encodings add considerable complexity to user agents: Aside
+ from the obvious efficiency problems with such multiple encodings,
+ they can obscure the basic structure of a message. In particular,
+ they can imply that several decoding operations are necessary simply
+ to find out what types of bodies a message contains. Banning nested
+ encodings may complicate the job of certain mail gateways, but this
+ seems less of a problem than the effect of nested encodings on user
+ agents.
+
+ Any entity with an unrecognized Content-Transfer-Encoding must be
+ treated as if it has a Content-Type of "application/octet-stream",
+ regardless of what the Content-Type header field actually says.
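
This rule is small enough to sketch directly. The function name is hypothetical; the set of known values is the five mechanisms defined in Section 6.1, and a missing field is treated as "7bit" as that section specifies.

```python
KNOWN_ENCODINGS = {"7bit", "8bit", "binary", "quoted-printable", "base64"}


def effective_media_type(content_type: str, transfer_encoding=None) -> str:
    """Fall back to application/octet-stream for an unrecognized encoding."""
    encoding = (transfer_encoding or "7bit").strip().lower()
    if encoding not in KNOWN_ENCODINGS:
        return "application/octet-stream"
    return content_type.strip().lower()


seen = [
    effective_media_type("text/plain", "BASE64"),
    effective_media_type("text/plain", "x-my-new-encoding"),
    effective_media_type("image/gif"),
]
```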
+
+ NOTE ON THE RELATIONSHIP BETWEEN CONTENT-TYPE AND CONTENT-TRANSFER-
+ ENCODING: It may seem that the Content-Transfer-Encoding could be
+ inferred from the characteristics of the media that is to be encoded,
+ or, at the very least, that certain Content-Transfer-Encodings could
+ be mandated for use with specific media types. There are several
+ reasons why this is not the case. First, given the varying types of
+ transports used for mail, some encodings may be appropriate for some
+ combinations of media types and transports but not for others. (For
+ example, in an 8bit transport, no encoding would be required for text
+ in certain character sets, while such encodings are clearly required
+ for 7bit SMTP.)
+
+ Second, certain media types may require different types of transfer
+ encoding under different circumstances. For example, many PostScript
+ bodies might consist entirely of short lines of 7bit data and hence
+ require no encoding at all. Other PostScript bodies (especially
+ those using Level 2 PostScript's binary encoding mechanism) may only
+ be reasonably represented using a binary transport encoding.
+ Finally, since the Content-Type field is intended to be an open-ended
+ specification mechanism, strict specification of an association
+ between media types and encodings effectively couples the
+ specification of an application protocol with a specific lower-level
+ transport. This is not desirable since the developers of a media
+ type should not have to be aware of all the transports in use and
+ what their limitations are.
+
+6.5. Translating Encodings
+
+ The quoted-printable and base64 encodings are designed so that
+ conversion between them is possible. The only issue that arises in
+ such a conversion is the handling of hard line breaks in quoted-
+ printable encoding output. When converting from quoted-printable to
+ base64 a hard line break in the quoted-printable form represents a
+ CRLF sequence in the canonical form of the data. It must therefore be
+ converted to a corresponding encoded CRLF in the base64 form of the
+ data. Similarly, a CRLF sequence in the canonical form of the data
+ obtained after base64 decoding must be converted to a quoted-
+ printable hard line break, but ONLY when converting text data.
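
The quoted-printable to base64 direction can be sketched with Python's standard binascii and base64 modules. The function name qp_to_base64 is hypothetical. Note that base64.encodebytes terminates its output lines with a bare LF, so a real composer would still have to convert them to CRLF; the reverse direction, which must emit hard line breaks only for text data, is not shown.

```python
import base64
import binascii


def qp_to_base64(qp_body: bytes) -> bytes:
    """Decode quoted-printable to canonical octets, then re-encode as base64."""
    # Hard line breaks survive decoding as CRLF octets; soft breaks vanish.
    canonical = binascii.a2b_qp(qp_body)
    return base64.encodebytes(canonical)


converted = qp_to_base64(b"caf=E9\r\nsecond =\r\nline")
round_trip = base64.decodebytes(converted)
```

After the round trip the hard line break is present as an encoded CRLF, while the soft line break has left no trace.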
+
+6.6. Canonical Encoding Model
+
+ There was some confusion, in the previous versions of this RFC,
+ regarding the model for when email data was to be converted to
+ canonical form and encoded, and in particular how this process would
+ affect the treatment of CRLFs, given that the representation of
+ newlines varies greatly from system to system, and the relationship
+ between content-transfer-encodings and character sets. A canonical
+ model for encoding is presented in RFC 2049 for this reason.
+
+6.7. Quoted-Printable Content-Transfer-Encoding
+
+ The Quoted-Printable encoding is intended to represent data that
+ largely consists of octets that correspond to printable characters in
+ the US-ASCII character set. It encodes the data in such a way that
+ the resulting octets are unlikely to be modified by mail transport.
+ If the data being encoded are mostly US-ASCII text, the encoded form
+ of the data remains largely recognizable by humans. A body which is
+ entirely US-ASCII may also be encoded in Quoted-Printable to ensure
+ the integrity of the data should the message pass through a
+ character-translating, and/or line-wrapping gateway.
+
+ In this encoding, octets are to be represented as determined by the
+ following rules:
+
+ (1) (General 8bit representation) Any octet, except a CR or
+ LF that is part of a CRLF line break of the canonical
+ (standard) form of the data being encoded, may be
+ represented by an "=" followed by a two digit
+ hexadecimal representation of the octet's value. The
+ digits of the hexadecimal alphabet, for this purpose,
+ are "0123456789ABCDEF". Uppercase letters must be
+ used; lowercase letters are not allowed. Thus, for
+ example, the decimal value 12 (US-ASCII form feed) can
+ be represented by "=0C", and the decimal value 61 (US-
+ ASCII EQUAL SIGN) can be represented by "=3D". This
+ rule must be followed except when the following rules
+ allow an alternative encoding.
+
+ (2) (Literal representation) Octets with decimal values of
+ 33 through 60 inclusive, and 62 through 126, inclusive,
+ MAY be represented as the US-ASCII characters which
+ correspond to those octets (EXCLAMATION POINT through
+ LESS THAN, and GREATER THAN through TILDE,
+ respectively).
+
+ (3) (White Space) Octets with values of 9 and 32 MAY be
+ represented as US-ASCII TAB (HT) and SPACE characters,
+ respectively, but MUST NOT be so represented at the end
+ of an encoded line. Any TAB (HT) or SPACE characters
+ on an encoded line MUST thus be followed on that line
+ by a printable character. In particular, an "=" at the
+ end of an encoded line, indicating a soft line break
+ (see rule #5) may follow one or more TAB (HT) or SPACE
+ characters. It follows that an octet with decimal
+ value 9 or 32 appearing at the end of an encoded line
+ must be represented according to Rule #1. This rule is
+ necessary because some MTAs (Message Transport Agents,
+ programs which transport messages from one user to
+ another, or perform a portion of such transfers) are
+ known to pad lines of text with SPACEs, and others are
+ known to remove "white space" characters from the end
+ of a line. Therefore, when decoding a Quoted-Printable
+ body, any trailing white space on a line must be
+ deleted, as it will necessarily have been added by
+ intermediate transport agents.
+
+ (4) (Line Breaks) A line break in a text body, represented
+ as a CRLF sequence in the text canonical form, must be
+ represented by a (RFC 822) line break, which is also a
+ CRLF sequence, in the Quoted-Printable encoding. Since
+ the canonical representation of media types other than
+ text does not generally include the representation of
+ line breaks as CRLF sequences, no hard line breaks
+ (i.e. line breaks that are intended to be meaningful
+ and to be displayed to the user) can occur in the
+ quoted-printable encoding of such types. Sequences
+ like "=0D", "=0A", "=0A=0D" and "=0D=0A" will routinely
+ appear in non-text data represented in quoted-
+ printable, of course.
+
+ Note that many implementations may elect to encode the
+ local representation of various content types directly
+ rather than converting to canonical form first,
+ encoding, and then converting back to local
+ representation. In particular, this may apply to plain
+ text material on systems that use newline conventions
+ other than a CRLF terminator sequence. Such an
+ implementation optimization is permissible, but only
+ when the combined canonicalization-encoding step is
+ equivalent to performing the three steps separately.
+
+ (5) (Soft Line Breaks) The Quoted-Printable encoding
+ REQUIRES that encoded lines be no more than 76
+ characters long. If longer lines are to be encoded
+ with the Quoted-Printable encoding, "soft" line breaks
+ must be used. An equal sign as the last character on an
+ encoded line indicates such a non-significant ("soft")
+ line break in the encoded text.
+
+ Thus if the "raw" form of the line is a single unencoded line that
+ says:
+
+ Now's the time for all folk to come to the aid of their country.
+
+ This can be represented, in the Quoted-Printable encoding, as:
+
+ Now's the time =
+ for all folk to come=
+ to the aid of their country.
+
+ This provides a mechanism with which long lines are encoded in such a
+ way as to be restored by the user agent. The 76 character limit does
+ not count the trailing CRLF, but counts all other characters,
+ including any equal signs.
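
Rules 1, 2, 3 and 5 can be sketched for a single line of text as follows. This is an illustrative encoder, not a reference implementation: the name qp_encode_line is hypothetical, the input is one canonical line without its CRLF, and the soft break positions are only one of many legal choices.

```python
LIMIT = 76  # maximum encoded line length, not counting the CRLF


def qp_encode_line(line: bytes) -> list:
    """Encode one text line into a list of quoted-printable lines."""
    # Rules 1-3: pick a representation for every octet.
    units = []
    for octet in line:
        if 33 <= octet <= 60 or 62 <= octet <= 126 or octet in (9, 32):
            units.append(chr(octet))
        else:
            units.append("=%02X" % octet)
    # Rule 3: white space at the very end of the line must be hex-encoded.
    if units and units[-1] in (" ", "\t"):
        units[-1] = "=%02X" % ord(units[-1])
    # Rule 5: insert soft line breaks so no encoded line exceeds LIMIT.
    lines, current = [], ""
    for index, unit in enumerate(units):
        is_last = index == len(units) - 1
        # Every line except the final one needs a spare column for "=".
        room = LIMIT if is_last else LIMIT - 1
        if len(current) + len(unit) > room:
            lines.append(current + "=")
            current = ""
        current += unit
    lines.append(current)
    return lines


short = qp_encode_line(
    b"Now's the time for all folk to come to the aid of their country."
)
long_lines = qp_encode_line(b"x" * 200)
```

The 64-character sample sentence fits on one encoded line unchanged; the representation with soft breaks shown above is an equally valid encoding of the same data.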
+
+ Since the hyphen character ("-") may be represented as itself in the
+ Quoted-Printable encoding, care must be taken, when encapsulating a
+ quoted-printable encoded body inside one or more multipart entities,
+ to ensure that the boundary delimiter does not appear anywhere in the
+ encoded body. (A good strategy is to choose a boundary that includes
+ a character sequence such as "=_" which can never appear in a
+ quoted-printable body. See the definition of multipart messages in
+ RFC 2046.)
+
+ NOTE: The quoted-printable encoding represents something of a
+ compromise between readability and reliability in transport. Bodies
+ encoded with the quoted-printable encoding will work reliably over
+ most mail gateways, but may not work perfectly over a few gateways,
+ notably those involving translation into EBCDIC. A higher level of
+ confidence is offered by the base64 Content-Transfer-Encoding. A way
+ to get reasonably reliable transport through EBCDIC gateways is to
+ also quote the US-ASCII characters
+
+ !"#$@[\]^`{|}~
+
+ according to rule #1.
+
+ Because quoted-printable data is generally assumed to be line-
+ oriented, it is to be expected that the representation of the breaks
+ between the lines of quoted-printable data may be altered in
+ transport, in the same manner that plain text mail has always been
+ altered in Internet mail when passing between systems with differing
+ newline conventions. If such alterations are likely to constitute a
+ corruption of the data, it is probably more sensible to use the
+ base64 encoding rather than the quoted-printable encoding.
+
+ NOTE: Several kinds of substrings cannot be generated according to
+ the encoding rules for the quoted-printable content-transfer-
+ encoding, and hence are formally illegal if they appear in the output
+ of a quoted-printable encoder. This note enumerates these cases and
+ suggests ways to handle such illegal substrings if any are
+ encountered in quoted-printable data that is to be decoded.
+
+ (1) An "=" followed by two hexadecimal digits, one or both
+ of which are lowercase letters in "abcdef", is formally
+ illegal. A robust implementation might choose to
+ recognize them as the corresponding uppercase letters.
+
+ (2) An "=" followed by a character that is neither a
+ hexadecimal digit (including "abcdef") nor the CR
+ character of a CRLF pair is illegal. This case can be
+ the result of US-ASCII text having been included in a
+ quoted-printable part of a message without itself
+ having been subjected to quoted-printable encoding. A
+ reasonable approach by a robust implementation might be
+ to include the "=" character and the following
+ character in the decoded data without any
+ transformation and, if possible, indicate to the user
+ that proper decoding was not possible at this point in
+ the data.
+
+ (3) An "=" cannot be the ultimate or penultimate character
+ in an encoded object. This could be handled as in case
+ (2) above.
+
+ (4) Control characters other than TAB, or CR and LF as
+ parts of CRLF pairs, must not appear. The same is true
+ for octets with decimal values greater than 126. If
+ found in incoming quoted-printable data by a decoder, a
+ robust implementation might exclude them from the
+ decoded data and warn the user that illegal characters
+ were discovered.
+
+ (5) Encoded lines must not be longer than 76 characters,
+ not counting the trailing CRLF. If longer lines are
+ found in incoming, encoded data, a robust
+ implementation might nevertheless decode the lines, and
+ might report the erroneous encoding to the user.
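
A lenient decoder following some of the suggestions above might be sketched as below. The name qp_decode_lenient is hypothetical. The sketch covers cases (1), (2), (3) and (5) and the deletion of trailing white space; it passes the octets of case (4) through unchanged and does not report anything to the user.

```python
import re

_HEX_PAIR = re.compile(rb"=([0-9A-Fa-f]{2})")


def qp_decode_lenient(encoded: bytes) -> bytes:
    """Decode quoted-printable data, tolerating formally illegal input."""
    lines = encoded.split(b"\r\n")
    out = []
    for index, line in enumerate(lines):
        is_last = index == len(lines) - 1
        line = line.rstrip(b" \t")  # trailing white space is transport padding
        # Case (3): a final "=" with nothing after it is kept literally.
        soft = line.endswith(b"=") and not is_last
        if soft:
            line = line[:-1]
        # Cases (1) and (2): accept lowercase hex digits, and leave any
        # other "=" sequence in the output without transformation.
        out.append(_HEX_PAIR.sub(lambda m: bytes([int(m.group(1), 16)]), line))
        if not soft and not is_last:
            out.append(b"\r\n")     # hard line break
    return b"".join(out)


decoded = qp_decode_lenient(b"Now's the time =\r\nfor all folk")
```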
+
+ WARNING TO IMPLEMENTORS: If binary data is encoded in quoted-
+ printable, care must be taken to encode CR and LF characters as "=0D"
+ and "=0A", respectively. In particular, a CRLF sequence in binary
+ data should be encoded as "=0D=0A". Otherwise, if CRLF were
+ represented as a hard line break, it might be incorrectly decoded on
+ platforms with different line break conventions.
+
+ For formalists, the syntax of quoted-printable data is described by
+ the following grammar:
+
+ quoted-printable := qp-line *(CRLF qp-line)
+
+ qp-line := *(qp-segment transport-padding CRLF)
+ qp-part transport-padding
+
+ qp-part := qp-section
+ ; Maximum length of 76 characters
+
+ qp-segment := qp-section *(SPACE / TAB) "="
+ ; Maximum length of 76 characters
+
+ qp-section := [*(ptext / SPACE / TAB) ptext]
+
+ ptext := hex-octet / safe-char
+
+ safe-char := <any octet with decimal value of 33 through
+ 60 inclusive, and 62 through 126>
+ ; Characters not listed as "mail-safe" in
+ ; RFC 2049 are also not recommended.
+
+ hex-octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F")
+ ; Octet must be used for characters > 126, =,
+ ; SPACEs or TABs at the ends of lines, and is
+ ; recommended for any character not listed in
+ ; RFC 2049 as "mail-safe".
+
+ transport-padding := *LWSP-char
+ ; Composers MUST NOT generate
+ ; non-zero length transport
+ ; padding, but receivers MUST
+ ; be able to handle padding
+ ; added by message transports.
+
+ IMPORTANT: The addition of LWSP between the elements shown in this
+ BNF is NOT allowed since this BNF does not specify a structured
+ header field.
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 23]
+\f
+RFC 2045 Internet Message Bodies November 1996
+
+
+6.8. Base64 Content-Transfer-Encoding
+
+ The Base64 Content-Transfer-Encoding is designed to represent
+ arbitrary sequences of octets in a form that need not be humanly
+ readable. The encoding and decoding algorithms are simple, but the
+ encoded data are consistently only about 33 percent larger than the
+ unencoded data. This encoding is virtually identical to the one used
+ in Privacy Enhanced Mail (PEM) applications, as defined in RFC 1421.
+
+ A 65-character subset of US-ASCII is used, enabling 6 bits to be
+ represented per printable character. (The extra 65th character, "=",
+ is used to signify a special processing function.)
+
+ NOTE: This subset has the important property that it is represented
+ identically in all versions of ISO 646, including US-ASCII, and all
+ characters in the subset are also represented identically in all
+ versions of EBCDIC. Other popular encodings, such as the encoding
+ used by the uuencode utility, Macintosh binhex 4.0 [RFC-1741], and
+ the base85 encoding specified as part of Level 2 PostScript, do not
+ share these properties, and thus do not fulfill the portability
+ requirements a binary transport encoding for mail must meet.
+
+ The encoding process represents 24-bit groups of input bits as output
+ strings of 4 encoded characters. Proceeding from left to right, a
+ 24-bit input group is formed by concatenating 3 8bit input groups.
+ These 24 bits are then treated as 4 concatenated 6-bit groups, each
+ of which is translated into a single digit in the base64 alphabet.
+ When encoding a bit stream via the base64 encoding, the bit stream
+ must be presumed to be ordered with the most-significant-bit first.
+ That is, the first bit in the stream will be the high-order bit in
+ the first 8bit byte, and the eighth bit will be the low-order bit in
+ the first 8bit byte, and so on.
+
+ Each 6-bit group is used as an index into an array of 64 printable
+ characters. The character referenced by the index is placed in the
+ output string. These characters, identified in Table 1, below, are
+ selected so as to be universally representable, and the set excludes
+ characters with particular significance to SMTP (e.g., ".", CR, LF)
+ and to the multipart boundary delimiters defined in RFC 2046 (e.g.,
+ "-").
+
+
+
+
+
+
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 24]
+\f
+RFC 2045 Internet Message Bodies November 1996
+
+
+ Table 1: The Base64 Alphabet
+
+ Value Encoding Value Encoding Value Encoding Value Encoding
+ 0 A 17 R 34 i 51 z
+ 1 B 18 S 35 j 52 0
+ 2 C 19 T 36 k 53 1
+ 3 D 20 U 37 l 54 2
+ 4 E 21 V 38 m 55 3
+ 5 F 22 W 39 n 56 4
+ 6 G 23 X 40 o 57 5
+ 7 H 24 Y 41 p 58 6
+ 8 I 25 Z 42 q 59 7
+ 9 J 26 a 43 r 60 8
+ 10 K 27 b 44 s 61 9
+ 11 L 28 c 45 t 62 +
+ 12 M 29 d 46 u 63 /
+ 13 N 30 e 47 v
+ 14 O 31 f 48 w (pad) =
+ 15 P 32 g 49 x
+ 16 Q 33 h 50 y
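The encoding of one complete 24-bit group with the alphabet of Table 1 can be sketched in Python as follows. The names ALPHABET and encode_group are assumptions made for this illustration; the handling of a final group shorter than 24 bits is a separate case and is not covered here.

```python
# The 64 characters of Table 1, indexed by their 6-bit value.
ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def encode_group(three_octets: bytes) -> str:
    """Encode exactly three octets (24 bits) as four base64 characters."""
    if len(three_octets) != 3:
        raise ValueError("a full input group is exactly 3 octets")
    # Concatenate the octets, most significant bit first.
    group = (three_octets[0] << 16) | (three_octets[1] << 8) | three_octets[2]
    # Split the 24 bits into four 6-bit indexes, left to right.
    return "".join(ALPHABET[(group >> shift) & 0x3F] for shift in (18, 12, 6, 0))
```

For example, the octets of "Man" form the 6-bit values 19, 22, 5, and 46, so encode_group(b"Man") yields "TWFu".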
+
+ The encoded output stream must be represented in lines of no more
+ than 76 characters each. All line breaks or other characters not
+ found in Table 1 must be ignored by decoding software. In base64
+ data, characters other than those in Table 1, line breaks, and other
+ white space probably indicate a transmission error, about which a
+ warning message or even a message rejection might be appropriate
+ under some circumstances.
+
+ Special processing is performed if fewer than 24 bits are available
+ at the end of the data being encoded. A full encoding quantum is
+ always completed at the end of a body. When fewer than 24 input bits
+ are available in an input group, zero bits are added (on the right)
+ to form an integral number of 6-bit groups. Padding at the end of
+ the data is performed using the "=" character. Since all base64
+ input is an integral number of octets, only the following cases can
+ arise: (1) the final quantum of encoding input is an integral
+ multiple of 24 bits; here, the final unit of encoded output will be
+ an integral multiple of 4 characters with no "=" padding, (2) the
+ final quantum of encoding input is exactly 8 bits; here, the final
+ unit of encoded output will be two characters followed by two "="
+ padding characters, or (3) the final quantum of encoding input is
+ exactly 16 bits; here, the final unit of encoded output will be three
+ characters followed by one "=" padding character.
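The three padding cases can be observed with the base64 module of the Python standard library. The helper padding_length is a hypothetical name introduced only for this sketch.

```python
import base64


def padding_length(octet_count: int) -> int:
    """Number of "=" characters that end the encoding of octet_count octets."""
    return (3 - octet_count % 3) % 3


# Final quantum of 24 bits, 8 bits, and 16 bits respectively.
for sample in (b"abc", b"a", b"ab"):
    encoded = base64.b64encode(sample).decode("ascii")
    print(len(sample) * 8, "bits ->", encoded, "padding:", padding_length(len(sample)))
# prints:
# 24 bits -> YWJj padding: 0
# 8 bits -> YQ== padding: 2
# 16 bits -> YWI= padding: 1
```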
+
+ Because it is used only for padding at the end of the data, the
+ occurrence of any "=" characters may be taken as evidence that the
+ end of the data has been reached (without truncation in transit). No
+ such assurance is possible, however, when the number of octets
+ transmitted was a multiple of three and no "=" characters are
+ present.
+
+ Any characters outside of the base64 alphabet are to be ignored in
+ base64-encoded data.
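A decoder that follows this rule might first discard every octet that is neither in the base64 alphabet nor the "=" pad, and only then decode. The Python sketch below shows one possible approach; the name decode_lenient is an assumption, and a stricter implementation could instead warn about, or reject, unexpected characters as discussed earlier in this section.

```python
import base64
import re

# Anything that is neither a base64 alphabet character nor the "=" pad.
_NON_ALPHABET = re.compile(rb"[^A-Za-z0-9+/=]")


def decode_lenient(encoded: bytes) -> bytes:
    """Decode base64 data after dropping line breaks and other stray octets."""
    cleaned = _NON_ALPHABET.sub(b"", encoded)
    return base64.b64decode(cleaned)
```

With this sketch, decode_lenient(b"TW\r\nFu") returns b"Man" even though a line break splits the encoded quantum.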
+
+ Care must be taken to use the proper octets for line breaks if base64
+ encoding is applied directly to text material that has not been
+ converted to canonical form. In particular, text line breaks must be
+ converted into CRLF sequences prior to base64 encoding. The
+ important thing to note is that this may be done directly by the
+ encoder rather than in a prior canonicalization step in some
+ implementations.
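An encoder that performs the line break conversion itself might look like the following Python sketch. The name encode_text_base64 is hypothetical, and treating a bare CR or a bare LF in the local text as a line break is an assumption about the local convention, not something this document specifies.

```python
import base64
import re

# Local line break conventions this sketch assumes: CRLF, bare CR, bare LF.
_ANY_LINE_BREAK = re.compile(r"\r\n|\r|\n")


def encode_text_base64(text: str, charset: str = "us-ascii") -> bytes:
    """Convert line breaks to CRLF, then base64-encode the resulting octets."""
    canonical = _ANY_LINE_BREAK.sub("\r\n", text)
    return base64.b64encode(canonical.encode(charset))
```

Text that already uses CRLF passes through the conversion unchanged, so the function can be applied whether or not a prior canonicalization step has run.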
+
+ NOTE: There is no need to worry about quoting potential boundary
+ delimiters within base64-encoded bodies within multipart entities
+ because no hyphen characters are used in the base64 encoding.
+
+7. Content-ID Header Field
+
+ In constructing a high-level user agent, it may be desirable to allow
+ one body to make reference to another. Accordingly, bodies may be
+ labelled using the "Content-ID" header field, which is syntactically
+ identical to the "Message-ID" header field:
+
+ id := "Content-ID" ":" msg-id
+
+ Like the Message-ID values, Content-ID values must be generated to be
+ world-unique.
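One way to obtain a value that is very likely to be world-unique is to reuse a Message-ID generator. The Python sketch below relies on email.utils.make_msgid from the standard library; the function name content_id_header and the domain "example.com" are placeholders chosen for this illustration.

```python
from email.utils import make_msgid


def content_id_header(domain: str = "example.com") -> str:
    """Build a Content-ID header line with a freshly generated msg-id."""
    # make_msgid returns a value of the form <unique-part@domain>.
    return "Content-ID: " + make_msgid(domain=domain)
```

A real implementation would pass a domain name under its own control so that values generated on different hosts cannot collide.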
+
+ The Content-ID value may be used for uniquely identifying MIME
+ entities in several contexts, particularly for caching data
+ referenced by the message/external-body mechanism. Although the
+ Content-ID header is generally optional, its use is MANDATORY in
+ implementations which generate data of the optional MIME media type
+ "message/external-body". That is, each message/external-body entity
+ must have a Content-ID field to permit caching of such data.
+
+ It is also worth noting that the Content-ID value has special
+ semantics in the case of the multipart/alternative media type. This
+ is explained in the section of RFC 2046 dealing with
+ multipart/alternative.
+
+
+
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 26]
+\f
+RFC 2045 Internet Message Bodies November 1996
+
+
+8. Content-Description Header Field
+
+ The ability to associate some descriptive information with a given
+ body is often desirable. For example, it may be useful to mark an
+ "image" body as "a picture of the Space Shuttle Endeavor." Such text
+ may be placed in the Content-Description header field. This header
+ field is always optional.
+
+ description := "Content-Description" ":" *text
+
+ The description is presumed to be given in the US-ASCII character
+ set, although the mechanism specified in RFC 2047 may be used for
+ non-US-ASCII Content-Description values.
+
+9. Additional MIME Header Fields
+
+ Future documents may elect to define additional MIME header fields
+ for various purposes. Any new header field that further describes
+ the content of a message should begin with the string "Content-" to
+ allow such fields which appear in a message header to be
+ distinguished from ordinary RFC 822 message header fields.
+
+ MIME-extension-field := <Any RFC 822 header field which
+ begins with the string
+ "Content-">
+
+10. Summary
+
+ Using the MIME-Version, Content-Type, and Content-Transfer-Encoding
+ header fields, it is possible to include, in a standardized way,
+ arbitrary types of data with RFC 822 conformant mail messages. No
+ restrictions imposed by either RFC 821 or RFC 822 are violated, and
+ care has been taken to avoid problems caused by additional
+ restrictions imposed by the characteristics of some Internet mail
+ transport mechanisms (see RFC 2049).
+
+ The next document in this set, RFC 2046, specifies the initial set of
+ media types that can be labelled and transported using these headers.
+
+11. Security Considerations
+
+ Security issues are discussed in the second document in this set, RFC
+ 2046.
+
+
+
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 27]
+\f
+RFC 2045 Internet Message Bodies November 1996
+
+
+12. Authors' Addresses
+
+ For more information, the authors of this document are best contacted
+ via Internet mail:
+
+ Ned Freed
+ Innosoft International, Inc.
+ 1050 East Garvey Avenue South
+ West Covina, CA 91790
+ USA
+
+ Phone: +1 818 919 3600
+ Fax: +1 818 919 3614
+ EMail: ned@innosoft.com
+
+
+ Nathaniel S. Borenstein
+ First Virtual Holdings
+ 25 Washington Avenue
+ Morristown, NJ 07960
+ USA
+
+ Phone: +1 201 540 8967
+ Fax: +1 201 993 3032
+ EMail: nsb@nsb.fv.com
+
+
+ MIME is a result of the work of the Internet Engineering Task Force
+ Working Group on RFC 822 Extensions. The chairman of that group,
+ Greg Vaudreuil, may be reached at:
+
+ Gregory M. Vaudreuil
+ Octel Network Services
+ 17080 Dallas Parkway
+ Dallas, TX 75248-1905
+ USA
+
+ EMail: Greg.Vaudreuil@Octel.Com
+
+
+
+
+
+
+
+
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 28]
+\f
+RFC 2045 Internet Message Bodies November 1996
+
+
+Appendix A -- Collected Grammar
+
+ This appendix contains the complete BNF grammar for all the syntax
+ specified by this document.
+
+ By itself, however, this grammar is incomplete. It refers by name to
+ several syntax rules that are defined by RFC 822. Rather than
+ reproduce those definitions here, and risk unintentional differences
+ between the two, this document simply refers the reader to RFC 822
+ for the remaining definitions. Wherever a term is undefined, it
+ refers to the RFC 822 definition.
+
+ attribute := token
+ ; Matching of attributes
+ ; is ALWAYS case-insensitive.
+
+ composite-type := "message" / "multipart" / extension-token
+
+ content := "Content-Type" ":" type "/" subtype
+ *(";" parameter)
+ ; Matching of media type and subtype
+ ; is ALWAYS case-insensitive.
+
+ description := "Content-Description" ":" *text
+
+ discrete-type := "text" / "image" / "audio" / "video" /
+ "application" / extension-token
+
+ encoding := "Content-Transfer-Encoding" ":" mechanism
+
+ entity-headers := [ content CRLF ]
+ [ encoding CRLF ]
+ [ id CRLF ]
+ [ description CRLF ]
+ *( MIME-extension-field CRLF )
+
+ extension-token := ietf-token / x-token
+
+ hex-octet := "=" 2(DIGIT / "A" / "B" / "C" / "D" / "E" / "F")
+ ; Octet must be used for characters > 127, =,
+ ; SPACEs or TABs at the ends of lines, and is
+ ; recommended for any character not listed in
+ ; RFC 2049 as "mail-safe".
+
+ iana-token := <A publicly-defined extension token. Tokens
+ of this form must be registered with IANA
+ as specified in RFC 2048.>
+
+
+
+
+Freed & Borenstein Standards Track [Page 29]
+\f
+RFC 2045 Internet Message Bodies November 1996
+
+
+ ietf-token := <An extension token defined by a
+ standards-track RFC and registered
+ with IANA.>
+
+ id := "Content-ID" ":" msg-id
+
+ mechanism := "7bit" / "8bit" / "binary" /
+ "quoted-printable" / "base64" /
+ ietf-token / x-token
+
+ MIME-extension-field := <Any RFC 822 header field which
+ begins with the string
+ "Content-">
+
+ MIME-message-headers := entity-headers
+ fields
+ version CRLF
+ ; The ordering of the header
+ ; fields implied by this BNF
+ ; definition should be ignored.
+
+ MIME-part-headers := entity-headers
+ [fields]
+ ; Any field not beginning with
+ ; "content-" can have no defined
+ ; meaning and may be ignored.
+ ; The ordering of the header
+ ; fields implied by this BNF
+ ; definition should be ignored.
+
+ parameter := attribute "=" value
+
+ ptext := hex-octet / safe-char
+
+ qp-line := *(qp-segment transport-padding CRLF)
+ qp-part transport-padding
+
+ qp-part := qp-section
+ ; Maximum length of 76 characters
+
+ qp-section := [*(ptext / SPACE / TAB) ptext]
+
+ qp-segment := qp-section *(SPACE / TAB) "="
+ ; Maximum length of 76 characters
+
+ quoted-printable := qp-line *(CRLF qp-line)
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 30]
+\f
+RFC 2045 Internet Message Bodies November 1996
+
+
+ safe-char := <any octet with decimal value of 33 through
+ 60 inclusive, and 62 through 126>
+ ; Characters not listed as "mail-safe" in
+ ; RFC 2049 are also not recommended.
+
+ subtype := extension-token / iana-token
+
+ token := 1*<any (US-ASCII) CHAR except SPACE, CTLs,
+ or tspecials>
+
+ transport-padding := *LWSP-char
+ ; Composers MUST NOT generate
+ ; non-zero length transport
+ ; padding, but receivers MUST
+ ; be able to handle padding
+ ; added by message transports.
+
+ tspecials := "(" / ")" / "<" / ">" / "@" /
+ "," / ";" / ":" / "\" / <">
+ "/" / "[" / "]" / "?" / "="
+ ; Must be in quoted-string,
+ ; to use within parameter values
+
+ type := discrete-type / composite-type
+
+ value := token / quoted-string
+
+ version := "MIME-Version" ":" 1*DIGIT "." 1*DIGIT
+
+ x-token := <The two characters "X-" or "x-" followed, with
+ no intervening white space, by any token>
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 31]
+\f
--- /dev/null
+
+
+
+
+
+
+Network Working Group N. Freed
+Request for Comments: 2046 Innosoft
+Obsoletes: 1521, 1522, 1590 N. Borenstein
+Category: Standards Track First Virtual
+ November 1996
+
+
+ Multipurpose Internet Mail Extensions
+ (MIME) Part Two:
+ Media Types
+
+Status of this Memo
+
+ This document specifies an Internet standards track protocol for the
+ Internet community, and requests discussion and suggestions for
+ improvements. Please refer to the current edition of the "Internet
+ Official Protocol Standards" (STD 1) for the standardization state
+ and status of this protocol. Distribution of this memo is unlimited.
+
+Abstract
+
+ STD 11, RFC 822 defines a message representation protocol specifying
+ considerable detail about US-ASCII message headers, but which leaves
+ the message content, or message body, as flat US-ASCII text. This
+ set of documents, collectively called the Multipurpose Internet Mail
+ Extensions, or MIME, redefines the format of messages to allow for
+
+ (1) textual message bodies in character sets other than
+ US-ASCII,
+
+ (2) an extensible set of different formats for non-textual
+ message bodies,
+
+ (3) multi-part message bodies, and
+
+ (4) textual header information in character sets other than
+ US-ASCII.
+
+ These documents are based on earlier work documented in RFC 934, STD
+ 11, and RFC 1049, but extend and revise them. Because RFC 822 said
+ so little about message bodies, these documents are largely
+ orthogonal to (rather than a revision of) RFC 822.
+
+ The initial document in this set, RFC 2045, specifies the various
+ headers used to describe the structure of MIME messages. This second
+ document defines the general structure of the MIME media typing
+ system and defines an initial set of media types. The third document,
+ RFC 2047, describes extensions to RFC 822 to allow non-US-ASCII text
+
+
+
+Freed & Borenstein Standards Track [Page 1]
+\f
+RFC 2046 Media Types November 1996
+
+
+ data in Internet mail header fields. The fourth document, RFC 2048,
+ specifies various IANA registration procedures for MIME-related
+ facilities. The fifth and final document, RFC 2049, describes MIME
+ conformance criteria as well as providing some illustrative examples
+ of MIME message formats, acknowledgements, and the bibliography.
+
+ These documents are revisions of RFCs 1521 and 1522, which themselves
+ were revisions of RFCs 1341 and 1342. An appendix in RFC 2049
+ describes differences and changes from previous versions.
+
+Table of Contents
+
+ 1. Introduction ......................................... 3
+ 2. Definition of a Top-Level Media Type ................. 4
+ 3. Overview Of The Initial Top-Level Media Types ........ 4
+ 4. Discrete Media Type Values ........................... 6
+ 4.1 Text Media Type ..................................... 6
+ 4.1.1 Representation of Line Breaks ..................... 7
+ 4.1.2 Charset Parameter ................................. 7
+ 4.1.3 Plain Subtype ..................................... 11
+ 4.1.4 Unrecognized Subtypes ............................. 11
+ 4.2 Image Media Type .................................... 11
+ 4.3 Audio Media Type .................................... 11
+ 4.4 Video Media Type .................................... 12
+ 4.5 Application Media Type .............................. 12
+ 4.5.1 Octet-Stream Subtype .............................. 13
+ 4.5.2 PostScript Subtype ................................ 14
+ 4.5.3 Other Application Subtypes ........................ 17
+ 5. Composite Media Type Values .......................... 17
+ 5.1 Multipart Media Type ................................ 17
+ 5.1.1 Common Syntax ..................................... 19
+ 5.1.2 Handling Nested Messages and Multiparts ........... 24
+ 5.1.3 Mixed Subtype ..................................... 24
+ 5.1.4 Alternative Subtype ............................... 24
+ 5.1.5 Digest Subtype .................................... 26
+ 5.1.6 Parallel Subtype .................................. 27
+ 5.1.7 Other Multipart Subtypes .......................... 28
+ 5.2 Message Media Type .................................. 28
+ 5.2.1 RFC822 Subtype .................................... 28
+ 5.2.2 Partial Subtype ................................... 29
+ 5.2.2.1 Message Fragmentation and Reassembly ............ 30
+ 5.2.2.2 Fragmentation and Reassembly Example ............ 31
+ 5.2.3 External-Body Subtype ............................. 33
+ 5.2.4 Other Message Subtypes ............................ 40
+ 6. Experimental Media Type Values ....................... 40
+ 7. Summary .............................................. 41
+ 8. Security Considerations .............................. 41
+ 9. Authors' Addresses ................................... 42
+
+
+
+Freed & Borenstein Standards Track [Page 2]
+\f
+RFC 2046 Media Types November 1996
+
+
+ A. Collected Grammar .................................... 43
+
+1. Introduction
+
+ The first document in this set, RFC 2045, defines a number of header
+ fields, including Content-Type. The Content-Type field is used to
+ specify the nature of the data in the body of a MIME entity, by
+ giving media type and subtype identifiers, and by providing auxiliary
+ information that may be required for certain media types. After the
+ type and subtype names, the remainder of the header field is simply a
+ set of parameters, specified in an attribute/value notation. The
+ ordering of parameters is not significant.
+
+ In general, the top-level media type is used to declare the general
+ type of data, while the subtype specifies a specific format for that
+ type of data. Thus, a media type of "image/xyz" is enough to tell a
+ user agent that the data is an image, even if the user agent has no
+ knowledge of the specific image format "xyz". Such information can
+ be used, for example, to decide whether or not to show a user the raw
+ data from an unrecognized subtype -- such an action might be
+ reasonable for unrecognized subtypes of "text", but not for
+ unrecognized subtypes of "image" or "audio". For this reason,
+ registered subtypes of "text", "image", "audio", and "video" should
+ not contain embedded information that is really of a different type.
+ Such compound formats should be represented using the "multipart" or
+ "application" types.
+
+ Parameters are modifiers of the media subtype, and as such do not
+ fundamentally affect the nature of the content. The set of
+ meaningful parameters depends on the media type and subtype. Most
+ parameters are associated with a single specific subtype. However, a
+ given top-level media type may define parameters which are applicable
+ to any subtype of that type. Parameters may be required by their
+ defining media type or subtype or they may be optional. MIME
+ implementations must also ignore any parameters whose names they do
+ not recognize.
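The two rules above, that parameter order is not significant and that unrecognized parameters are ignored, can be sketched with the email package of the Python standard library. The function name recognized_parameters and the parameter "x-unknown" are assumptions made for this example.

```python
from email.message import Message


def recognized_parameters(content_type: str, recognized: set) -> dict:
    """Return only the Content-Type parameters whose names are recognized."""
    msg = Message()
    msg["Content-Type"] = content_type
    # The first entry returned by get_params() is the type/subtype itself.
    params = msg.get_params(failobj=[])[1:]
    return {
        name.lower(): value
        for name, value in params
        if name.lower() in recognized
    }
```

Collecting the parameters in a dictionary means that "text/plain; x-unknown=1; charset=iso-8859-1" and the same field with the two parameters swapped produce the same result.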
+
+ MIME's Content-Type header field and media type mechanism have been
+ carefully designed to be extensible, and it is expected that the set
+ of media type/subtype pairs and their associated parameters will grow
+ significantly over time. Several other MIME facilities, such as
+ transfer encodings and "message/external-body" access types, are
+ likely to have new values defined over time. In order to ensure that
+ the set of such values is developed in an orderly, well-specified,
+ and public manner, MIME sets up a registration process which uses the
+ Internet Assigned Numbers Authority (IANA) as a central registry for
+ MIME's various areas of extensibility. The registration process for
+ these areas is described in a companion document, RFC 2048.
+
+
+
+Freed & Borenstein Standards Track [Page 3]
+\f
+RFC 2046 Media Types November 1996
+
+
+ The initial seven standard top-level media types are defined and
+ described in the remainder of this document.
+
+2. Definition of a Top-Level Media Type
+
+ The definition of a top-level media type consists of:
+
+ (1) a name and a description of the type, including
+ criteria for whether a particular subtype would qualify
+ under that type,
+
+ (2) the names and definitions of parameters, if any, which
+ are defined for all subtypes of that type (including
+ whether such parameters are required or optional),
+
+ (3) how a user agent and/or gateway should handle unknown
+ subtypes of this type,
+
+ (4) general considerations on gatewaying entities of this
+ top-level type, if any, and
+
+ (5) any restrictions on content-transfer-encodings for
+ entities of this top-level type.
+
+3. Overview Of The Initial Top-Level Media Types
+
+ The five discrete top-level media types are:
+
+ (1) text -- textual information. The subtype "plain" in
+ particular indicates plain text containing no
+ formatting commands or directives of any sort. Plain
+ text is intended to be displayed "as-is". No special
+ software is required to get the full meaning of the
+ text, aside from support for the indicated character
+ set. Other subtypes are to be used for enriched text in
+ forms where application software may enhance the
+ appearance of the text, but such software must not be
+ required in order to get the general idea of the
+ content. Possible subtypes of "text" thus include any
+ word processor format that can be read without
+ resorting to software that understands the format. In
+ particular, formats that employ embedded binary
+ formatting information are not considered directly
+ readable. A very simple and portable subtype,
+ "richtext", was defined in RFC 1341, with a further
+ revision in RFC 1896 under the name "enriched".
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 4]
+\f
+RFC 2046 Media Types November 1996
+
+
+ (2) image -- image data. "Image" requires a display device
+ (such as a graphical display, a graphics printer, or a
+ FAX machine) to view the information. An initial
+ subtype is defined for the widely-used image format
+ JPEG.
+
+ (3) audio -- audio data. "Audio" requires an audio output
+ device (such as a speaker or a telephone) to "display"
+ the contents. An initial subtype "basic" is defined in
+ this document.
+
+ (4) video -- video data. "Video" requires the capability
+ to display moving images, typically including
+ specialized hardware and software. An initial subtype
+ "mpeg" is defined in this document.
+
+ (5) application -- some other kind of data, typically
+ either uninterpreted binary data or information to be
+ processed by an application. The subtype "octet-
+ stream" is to be used in the case of uninterpreted
+ binary data, in which case the simplest recommended
+ action is to offer to write the information into a file
+ for the user. The "PostScript" subtype is also defined
+ for the transport of PostScript material. Other
+ expected uses for "application" include spreadsheets,
+ data for mail-based scheduling systems, languages
+ for "active" (computational) messaging, and word
+ processing formats that are not directly readable.
+ Note that security considerations may exist for some
+ types of application data, most notably
+ "application/PostScript" and any form of active
+ messaging. These issues are discussed later in this
+ document.
+
+ The two composite top-level media types are:
+
+ (1) multipart -- data consisting of multiple entities of
+ independent data types. Four subtypes are initially
+ defined, including the basic "mixed" subtype specifying
+ a generic mixed set of parts, "alternative" for
+ representing the same data in multiple formats,
+ "parallel" for parts intended to be viewed
+ simultaneously, and "digest" for multipart entities in
+ which each part has a default type of "message/rfc822".
+
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 5]
+\f
+RFC 2046 Media Types November 1996
+
+
+ (2) message -- an encapsulated message. A body of media
+ type "message" is itself all or a portion of some kind
+ of message object. Such objects may or may not in turn
+ contain other entities. The "rfc822" subtype is used
+ when the encapsulated content is itself an RFC 822
+ message. The "partial" subtype is defined for partial
+ RFC 822 messages, to permit the fragmented transmission
+ of bodies that are thought to be too large to be passed
+ through transport facilities in one piece. Another
+ subtype, "external-body", is defined for specifying
+ large bodies by reference to an external data source.
+
+ It should be noted that the list of media type values given here may
+ be augmented in time, via the mechanisms described above, and that
+ the set of subtypes is expected to grow substantially.
+
+4. Discrete Media Type Values
+
+ Five of the seven initial media type values refer to discrete bodies.
+ The content of these types must be handled by non-MIME mechanisms;
+ they are opaque to MIME processors.
+
+4.1. Text Media Type
+
+ The "text" media type is intended for sending material which is
+ principally textual in form. A "charset" parameter may be used to
+ indicate the character set of the body text for "text" subtypes,
+ notably including the subtype "text/plain", which is a generic
+ subtype for plain text. Plain text does not provide for or allow
+ formatting commands, font attribute specifications, processing
+ instructions, interpretation directives, or content markup. Plain
+ text is seen simply as a linear sequence of characters, possibly
+ interrupted by line breaks or page breaks. Plain text may allow the
+ stacking of several characters in the same position in the text.
+ Plain text in scripts like Arabic and Hebrew may also include
+ facilities that allow the arbitrary mixing of text segments with
+ opposite writing directions.
+
+ Beyond plain text, there are many formats for representing what might
+ be known as "rich text". An interesting characteristic of many such
+ representations is that they are to some extent readable even without
+ the software that interprets them. It is useful, then, to
+ distinguish them, at the highest level, from such unreadable data as
+ images, audio, or text represented in an unreadable form. In the
+ absence of appropriate interpretation software, it is reasonable to
+ show subtypes of "text" to the user, while it is not reasonable to do
+ so with most nontextual data. Such formatted textual data should be
+ represented using subtypes of "text".
+
+
+
+Freed & Borenstein Standards Track [Page 6]
+\f
+RFC 2046 Media Types November 1996
+
+
+4.1.1. Representation of Line Breaks
+
+ The canonical form of any MIME "text" subtype MUST always represent a
+ line break as a CRLF sequence. Similarly, any occurrence of CRLF in
+ MIME "text" MUST represent a line break. Use of CR and LF outside of
+ line break sequences is also forbidden.
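A check for this rule can be sketched in Python as follows. The name has_only_crlf_line_breaks is hypothetical; the function only reports whether a body in canonical form contains a CR or LF outside of a CRLF pair and makes no attempt to repair the data.

```python
import re

# A CR not followed by LF, or an LF not preceded by CR.
_BARE_CR_OR_LF = re.compile(rb"\r(?!\n)|(?<!\r)\n")


def has_only_crlf_line_breaks(data: bytes) -> bool:
    """True if every CR and LF in the data is part of a CRLF sequence."""
    return _BARE_CR_OR_LF.search(data) is None
```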
+
+ This rule applies regardless of format or character set or sets
+ involved.
+
+ NOTE: The proper interpretation of line breaks when a body is
+ displayed depends on the media type. In particular, while it is
+ appropriate to treat a line break as a transition to a new line when
+ displaying a "text/plain" body, this treatment is actually incorrect
+ for other subtypes of "text" like "text/enriched" [RFC-1896].
+ Similarly, whether or not line breaks should be added during display
+ operations is also a function of the media type. It should not be
+ necessary to add any line breaks to display "text/plain" correctly,
+ whereas proper display of "text/enriched" requires the appropriate
+ addition of line breaks.
+
+ NOTE: Some protocols define a maximum line length. For example, SMTP
+ [RFC-821] allows a maximum of 998 octets before the next CRLF
+ sequence. To be transported by such protocols, data that includes
+ overly long segments without CRLF sequences must be encoded with a suitable
+ content-transfer-encoding.
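A sender could test for this condition before choosing a content-transfer-encoding. In the Python sketch below, the name exceeds_line_limit is an assumption, and the default of 998 octets is the SMTP figure quoted in the note above; other transports may impose a different limit.

```python
def exceeds_line_limit(body: bytes, limit: int = 998) -> bool:
    """True if any run of octets between CRLF sequences is longer than limit."""
    return any(len(segment) > limit for segment in body.split(b"\r\n"))
```

A body for which this returns True would need an encoding such as quoted-printable or base64, both of which keep encoded lines within 76 characters.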
+
+4.1.2. Charset Parameter
+
+ A critical parameter that may be specified in the Content-Type field
+ for "text/plain" data is the character set. This is specified with a
+ "charset" parameter, as in:
+
+ Content-type: text/plain; charset=iso-8859-1
+
+ Unlike some other parameter values, the values of the charset
+ parameter are NOT case sensitive. The default character set, which
+ must be assumed in the absence of a charset parameter, is US-ASCII.
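The default and the case-insensitive matching can be illustrated with the email package of the Python standard library. The function name effective_charset is an assumption made for this sketch; Message.get_content_charset returns the parameter value in lower case, which gives a convenient form for comparison.

```python
from email.message import Message


def effective_charset(content_type: str) -> str:
    """Return the charset of a Content-Type value, defaulting to US-ASCII."""
    msg = Message()
    msg["Content-Type"] = content_type
    # failobj supplies the default used when no charset parameter is present.
    return msg.get_content_charset(failobj="us-ascii")
```

Under this sketch, "text/plain" yields "us-ascii", and both "charset=ISO-8859-1" and "charset=iso-8859-1" yield "iso-8859-1".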
+
+ The specification for any future subtypes of "text" must specify
+ whether or not they will also utilize a "charset" parameter, and may
+ possibly restrict its values as well. For other subtypes of "text"
+ than "text/plain", the semantics of the "charset" parameter should be
+ defined to be identical to those specified here for "text/plain",
+ i.e., the body consists entirely of characters in the given charset.
+ In particular, definers of future "text" subtypes should pay close
+ attention to the implications of multioctet character sets for their
+ subtype definitions.
+
+
+
+Freed & Borenstein Standards Track [Page 7]
+\f
+RFC 2046 Media Types November 1996
+
+
+ The charset parameter for subtypes of "text" gives a name of a
+ character set, as "character set" is defined in RFC 2045. The rules
+ regarding line breaks detailed in the previous section must also be
+ observed -- a character set whose definition does not conform to
+ these rules cannot be used in a MIME "text" subtype.
+
+ An initial list of predefined character set names can be found at the
+ end of this section. Additional character sets may be registered
+ with IANA.
+
+ Other media types than subtypes of "text" might choose to employ the
+ charset parameter as defined here, but with the CRLF/line break
+ restriction removed. Therefore, all character sets that conform to
+ the general definition of "character set" in RFC 2045 can be
+ registered for MIME use.
+
+ Note that if the specified character set includes 8-bit characters
+ and such characters are used in the body, a Content-Transfer-Encoding
+ header field and a corresponding encoding on the data are required in
+ order to transmit the body via some mail transfer protocols, such as
+ SMTP [RFC-821].
+
+ The default character set, US-ASCII, has been the subject of some
+ confusion and ambiguity in the past. Not only were there some
+ ambiguities in the definition, there have been wide variations in
+ practice. In order to eliminate such ambiguity and variations in the
+ future, it is strongly recommended that new user agents explicitly
+ specify a character set as a media type parameter in the Content-Type
+ header field. "US-ASCII" does not indicate an arbitrary 7-bit
+ character set, but specifies that all octets in the body must be
+ interpreted as characters according to the US-ASCII character set.
+ National and application-oriented versions of ISO 646 [ISO-646] are
+ usually NOT identical to US-ASCII, and in that case their use in
+ Internet mail is explicitly discouraged. The omission of the ISO 646
+ character set from this document is deliberate in this regard. The
+ character set name of "US-ASCII" explicitly refers to the character
+ set defined in ANSI X3.4-1986 [US-ASCII]. The new international
+ reference version (IRV) of the 1991 edition of ISO 646 is identical
+ to US-ASCII. The character set name "ASCII" is reserved and must not
+ be used for any purpose.
+
+ NOTE: RFC 821 explicitly specifies "ASCII", and references an earlier
+ version of the American Standard. Insofar as one of the purposes of
+ specifying a media type and character set is to permit the receiver
+ to unambiguously determine how the sender intended the coded message
+ to be interpreted, assuming anything other than "strict ASCII" as the
+ default would risk unintentional and incompatible changes to the
+ semantics of messages now being transmitted. This also implies that
+ messages containing characters coded according to other versions of
+ ISO 646 than US-ASCII and the 1991 IRV, or using code-switching
+ procedures (e.g., those of ISO 2022), as well as 8bit or multiple
+ octet character encodings MUST use an appropriate character set
+ specification to be consistent with MIME.
+
+ The complete US-ASCII character set is listed in ANSI X3.4-1986.
+ Note that the control characters including DEL (0-31, 127) have no
+ defined meaning apart from the combination CRLF (US-ASCII values
+ 13 and 10) indicating a new line. Two of the characters have de
+ facto meanings in wide use: FF (12) often means "start subsequent
+ text on the beginning of a new page"; and TAB or HT (9) often (though
+ not always) means "move the cursor to the next available column after
+ the current position where the column number is a multiple of 8
+ (counting the first column as column 0)." Aside from these
+ conventions, any use of the control characters or DEL in a body must
+ either occur
+
+ (1) because a subtype of text other than "plain"
+ specifically assigns some additional meaning, or
+
+ (2) within the context of a private agreement between the
+ sender and recipient. Such private agreements are
+ discouraged and should be replaced by the other
+ capabilities of this document.
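+
+ As a non-normative illustration of the TAB convention described
+ above, the following Python sketch computes the column reached by a
+ TAB, counting the first column as column 0. The names next_tab_stop
+ and expand_tabs are illustrative only and are not defined by this
+ document.
+

```python
TAB_WIDTH = 8  # the de facto convention: stops at multiples of 8


def next_tab_stop(column: int, width: int = TAB_WIDTH) -> int:
    """Return the next column after `column` that is a multiple of `width`."""
    if column < 0:
        raise ValueError("column must be non-negative")
    return (column // width + 1) * width


def expand_tabs(line: str, width: int = TAB_WIDTH) -> str:
    """Replace each TAB with spaces up to the next tab stop."""
    out = []
    column = 0
    for ch in line:
        if ch == "\t":
            stop = next_tab_stop(column, width)
            out.append(" " * (stop - column))
            column = stop
        else:
            out.append(ch)
            column += 1
    return "".join(out)
```
+
+ A TAB at column 8 moves to column 16, not to column 8, because the
+ stop must lie after the current position.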
+
+ NOTE: An enormous proliferation of character sets exists beyond
+ US-ASCII. A large number of partially or totally overlapping
+ character sets is NOT a good thing. A SINGLE character set that can
+ be used universally for representing all of the world's languages in
+ Internet mail would be preferable. Unfortunately, existing practice in
+ several communities seems to point to the continued use of multiple
+ character sets in the near future. A small number of standard
+ character sets are, therefore, defined for Internet use in this
+ document.
+
+ The defined charset values are:
+
+ (1) US-ASCII -- as defined in ANSI X3.4-1986 [US-ASCII].
+
+ (2) ISO-8859-X -- where "X" is to be replaced, as
+ necessary, for the parts of ISO-8859 [ISO-8859]. Note
+ that the ISO 646 character sets have deliberately been
+ omitted in favor of their 8859 replacements, which are
+ the designated character sets for Internet mail. As of
+ the publication of this document, the legitimate values
+ for "X" are the digits 1 through 10.
+
+ Characters in the range 128-159 have no assigned meaning in
+ ISO-8859-X. Characters with values below 128 in ISO-8859-X have the
+ same assigned meaning as they do in US-ASCII.
+
+ Part 6 of ISO 8859 (Latin/Arabic alphabet) and part 8 (Latin/Hebrew
+ alphabet) include both characters for which the normal writing
+ direction is right to left and characters for which it is left to
+ right, but do not define a canonical ordering method for representing
+ bi-directional text. The charset values "ISO-8859-6" and "ISO-8859-
+ 8", however, specify that the visual method is used [RFC-1556].
+
+ All of these character sets are used as pure 7bit or 8bit sets
+ without any shift or escape functions. The meaning of shift and
+ escape sequences in these character sets is not defined.
+
+ The character sets specified above are the ones that were relatively
+ uncontroversial during the drafting of MIME. This document does not
+ endorse the use of any particular character set other than US-ASCII,
+ and recognizes that the future evolution of world character sets
+ remains unclear.
+
+ Note that the character set used, if anything other than US-ASCII,
+ must always be explicitly specified in the Content-Type field.
+
+ No character set name other than those defined above may be used in
+ Internet mail without the publication of a formal specification and
+ its registration with IANA, or by private agreement, in which case
+ the character set name must begin with "X-".
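+
+ The naming rules above can be illustrated with a short,
+ non-normative Python sketch. The helper classify_charset and its
+ return labels are assumptions made for this example; a real agent
+ would also consult the IANA registry for names registered after this
+ document was published.
+

```python
import re

# ISO-8859-X, where X is one of the digits 1 through 10 (per this section).
_ISO_8859_PART = re.compile(r"iso-8859-(?:[1-9]|10)")


def classify_charset(name: str) -> str:
    """Classify a charset parameter value; comparison is case-insensitive."""
    lowered = name.strip().lower()
    if lowered == "ascii":
        return "reserved"   # "ASCII" is reserved and must not be used
    if lowered == "us-ascii" or _ISO_8859_PART.fullmatch(lowered):
        return "defined"    # one of the values defined in this section
    if lowered.startswith("x-") and len(lowered) > 2:
        return "private"    # private agreement; the name begins with "X-"
    return "unlisted"       # needs a formal specification and registration
```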
+
+ Implementors are discouraged from defining new character sets unless
+ absolutely necessary.
+
+ The "charset" parameter has been defined primarily for the purpose of
+ textual data, and is described in this section for that reason.
+ However, it is conceivable that non-textual data might also wish to
+ specify a charset value for some purpose, in which case the same
+ syntax and values should be used.
+
+ In general, composition software should always use the "lowest common
+ denominator" character set possible. For example, if a body contains
+ only US-ASCII characters, it SHOULD be marked as being in the US-
+ ASCII character set, not ISO-8859-1, which, like all the ISO-8859
+ family of character sets, is a superset of US-ASCII. More generally,
+ if a widely-used character set is a subset of another character set,
+ and a body contains only characters in the widely-used subset, it
+ should be labelled as being in that subset. This will increase the
+ chances that the recipient will be able to view the resulting entity
+ correctly.
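+
+ One way a composition agent might apply the "lowest common
+ denominator" rule is sketched below in Python. This is a
+ non-normative illustration: the function name and the candidate
+ list are assumptions, and the Python codecs stand in for the
+ character sets named in this section.
+

```python
def lowest_common_charset(text, candidates=("us-ascii", "iso-8859-1")):
    """Return the first candidate charset able to represent all of `text`.

    Candidates are ordered from most widely supported to least, so a body
    containing only US-ASCII characters is labelled us-ascii rather than
    one of its supersets.
    """
    for charset in candidates:
        try:
            text.encode(charset)
        except UnicodeEncodeError:
            continue  # this charset cannot represent the text; try the next
        return charset
    raise ValueError("no candidate charset can represent the text")
```
+
+ Note that the Python "iso-8859-1" codec also accepts code points
+ 128-159, which have no assigned meaning in ISO-8859-X; a stricter
+ implementation might reject them.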
+
+4.1.3. Plain Subtype
+
+ The simplest and most important subtype of "text" is "plain". This
+ indicates plain text that does not contain any formatting commands or
+ directives. Plain text is intended to be displayed "as-is", that is,
+ no interpretation of embedded formatting commands, font attribute
+ specifications, processing instructions, interpretation directives,
+ or content markup should be necessary for proper display. The
+ default media type of "text/plain; charset=us-ascii" for Internet
+ mail describes existing Internet practice. That is, it is the type
+ of body defined by RFC 822.
+
+ No other "text" subtype is defined by this document.
+
+4.1.4. Unrecognized Subtypes
+
+ Unrecognized subtypes of "text" should be treated as subtype "plain"
+ as long as the MIME implementation knows how to handle the charset.
+ Unrecognized subtypes which also specify an unrecognized charset
+ should be treated as "application/octet-stream".
+
+4.2. Image Media Type
+
+ A media type of "image" indicates that the body contains an image.
+ The subtype names the specific image format. These names are not
+ case sensitive. An initial subtype is "jpeg" for the JPEG format
+ using JFIF encoding [JPEG].
+
+ The list of "image" subtypes given here is neither exclusive nor
+ exhaustive, and is expected to grow as more types are registered with
+ IANA, as described in RFC 2048.
+
+ Unrecognized subtypes of "image" should at a minimum be treated as
+ "application/octet-stream". Implementations may optionally elect to
+ pass subtypes of "image" that they do not specifically recognize to a
+ secure and robust general-purpose image viewing application, if such
+ an application is available.
+
+ NOTE: Using a general-purpose image viewing application in this way
+ inherits the security problems of the most dangerous type supported
+ by the application.
+
+4.3. Audio Media Type
+
+ A media type of "audio" indicates that the body contains audio data.
+ Although there is not yet a consensus on an "ideal" audio format for
+ use with computers, there is a pressing need for a format capable of
+ providing interoperable behavior.
+
+ The initial subtype of "basic" is specified to meet this requirement
+ by providing an absolutely minimal lowest common denominator audio
+ format. It is expected that richer formats for higher quality and/or
+ lower bandwidth audio will be defined by a later document.
+
+ The content of the "audio/basic" subtype is single channel audio
+ encoded using 8bit ISDN mu-law [PCM] at a sample rate of 8000 Hz.
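+
+ The parameters above fix the data rate of "audio/basic". The
+ following non-normative Python sketch works through the arithmetic;
+ the constant and function names are illustrative only.
+

```python
SAMPLE_RATE_HZ = 8000   # samples per second
BITS_PER_SAMPLE = 8     # 8bit mu-law
CHANNELS = 1            # single channel


def audio_basic_bit_rate() -> int:
    """Bits per second of an audio/basic stream."""
    return SAMPLE_RATE_HZ * BITS_PER_SAMPLE * CHANNELS


def audio_basic_duration_seconds(num_octets: int) -> float:
    """Playing time of an audio/basic body of `num_octets` octets."""
    octets_per_second = audio_basic_bit_rate() // 8
    return num_octets / octets_per_second
```
+
+ That is, one second of audio occupies 8000 octets before any
+ Content-Transfer-Encoding is applied.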
+
+ Unrecognized subtypes of "audio" should at a minimum be treated as
+ "application/octet-stream". Implementations may optionally elect to
+ pass subtypes of "audio" that they do not specifically recognize to a
+ robust general-purpose audio playing application, if such an
+ application is available.
+
+4.4. Video Media Type
+
+ A media type of "video" indicates that the body contains a time-
+ varying-picture image, possibly with color and coordinated sound.
+ The term 'video' is used in its most generic sense, rather than with
+ reference to any particular technology or format, and is not meant to
+ preclude subtypes such as animated drawings encoded compactly. The
+ subtype "mpeg" refers to video coded according to the MPEG standard
+ [MPEG].
+
+ Note that although in general this document strongly discourages the
+ mixing of multiple media in a single body, it is recognized that many
+ so-called video formats include a representation for synchronized
+ audio, and this is explicitly permitted for subtypes of "video".
+
+ Unrecognized subtypes of "video" should at a minimum be treated as
+ "application/octet-stream". Implementations may optionally elect to
+ pass subtypes of "video" that they do not specifically recognize to a
+ robust general-purpose video display application, if such an
+ application is available.
+
+4.5. Application Media Type
+
+ The "application" media type is to be used for discrete data which do
+ not fit in any of the other categories, and particularly for data to
+ be processed by some type of application program. This is
+ information which must be processed by an application before it is
+ viewable or usable by a user. Expected uses for the "application"
+ media type include file transfer, spreadsheets, data for mail-based
+ scheduling systems, and languages for "active" (computational)
+ material. (The latter, in particular, can pose security problems
+ which must be understood by implementors, and are considered in
+ detail in the discussion of the "application/PostScript" media type.)
+
+ For example, a meeting scheduler might define a standard
+ representation for information about proposed meeting dates. An
+ intelligent user agent would use this information to conduct a dialog
+ with the user, and might then send additional material based on that
+ dialog. More generally, there have been several "active" messaging
+ languages developed in which programs in a suitably specialized
+ language are transported to a remote location and automatically run
+ in the recipient's environment.
+
+ Such applications may be defined as subtypes of the "application"
+ media type. This document defines two subtypes:
+
+ octet-stream, and PostScript.
+
+ The subtype of "application" will often either be the name of the
+ application for which the data are intended or include part of that
+ name. This does not mean, however, that any application program name
+ may be used freely as a subtype of "application".
+
+4.5.1. Octet-Stream Subtype
+
+ The "octet-stream" subtype is used to indicate that a body contains
+ arbitrary binary data. The set of currently defined parameters is:
+
+ (1) TYPE -- the general type or category of binary data.
+ This is intended as information for the human recipient
+ rather than for any automatic processing.
+
+ (2) PADDING -- the number of bits of padding that were
+ appended to the bit-stream comprising the actual
+ contents to produce the enclosed 8bit byte-oriented
+ data. This is useful for enclosing a bit-stream in a
+ body when the total number of bits is not a multiple of
+ 8.
+
+ Both of these parameters are optional.
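+
+ As a non-normative illustration of the PADDING parameter, the
+ Python sketch below packs a bit-stream into octets and reports how
+ many padding bits were appended. This document does not specify the
+ bit order within an octet or the value of the padding bits; the
+ sketch assumes most-significant-bit-first packing and zero-valued
+ padding, and the function names are hypothetical.
+

```python
def pack_bits(bits):
    """Pack a sequence of 0/1 values into octets.

    Returns (data, padding), where `padding` is the number of bits
    appended to reach a multiple of 8.
    """
    padding = (-len(bits)) % 8
    padded = list(bits) + [0] * padding  # assumption: pad with zero bits
    data = bytearray()
    for i in range(0, len(padded), 8):
        octet = 0
        for bit in padded[i:i + 8]:      # assumption: MSB first
            octet = (octet << 1) | bit
        data.append(octet)
    return bytes(data), padding


def unpack_bits(data, padding):
    """Recover the original bit sequence, discarding the padding bits."""
    bits = [(octet >> shift) & 1 for octet in data for shift in range(7, -1, -1)]
    return bits[:len(bits) - padding] if padding else bits
```
+
+ An 11-bit stream, for example, occupies two octets and would be
+ labelled "application/octet-stream; padding=5".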
+
+ An additional parameter, "CONVERSIONS", was defined in RFC 1341 but
+ has since been removed. RFC 1341 also defined the use of a "NAME"
+ parameter which gave a suggested file name to be used if the data
+ were to be written to a file. This has been deprecated in
+ anticipation of a separate Content-Disposition header field, to be
+ defined in a subsequent RFC.
+
+ The recommended action for an implementation that receives an
+ "application/octet-stream" entity is to simply offer to put the data
+ in a file, with any Content-Transfer-Encoding undone, or perhaps to
+ use it as input to a user-specified process.
+
+ To reduce the danger of transmitting rogue programs, it is strongly
+ recommended that implementations NOT implement a path-search
+ mechanism whereby an arbitrary program named in the Content-Type
+ parameter (e.g., an "interpreter=" parameter) is found and executed
+ using the message body as input.
+
+4.5.2. PostScript Subtype
+
+ A media type of "application/postscript" indicates a PostScript
+ program. Currently two variants of the PostScript language are
+ allowed; the original level 1 variant is described in [POSTSCRIPT]
+ and the more recent level 2 variant is described in [POSTSCRIPT2].
+
+ PostScript is a registered trademark of Adobe Systems, Inc. Use of
+ the MIME media type "application/postscript" implies recognition of
+ that trademark and all the rights it entails.
+
+ The PostScript language definition provides facilities for internal
+ labelling of the specific language features a given program uses.
+ This labelling, called the PostScript document structuring
+ conventions, or DSC, is very general and provides substantially more
+ information than just the language level. The use of document
+ structuring conventions, while not required, is strongly recommended
+ as an aid to interoperability. Documents which lack proper
+ structuring conventions cannot be tested to see whether or not they
+ will work in a given environment. As such, some systems may assume
+ the worst and refuse to process unstructured documents.
+
+ The execution of general-purpose PostScript interpreters entails
+ serious security risks, and implementors are discouraged from simply
+ sending PostScript bodies to "off-the-shelf" interpreters. While it
+ is usually safe to send PostScript to a printer, where the potential
+ for harm is greatly constrained by typical printer environments,
+ implementors should consider all of the following before they add
+ interactive display of PostScript bodies to their MIME readers.
+
+ The remainder of this section outlines some, though probably not all,
+ of the possible problems with the transport of PostScript entities.
+
+ (1) Dangerous operations in the PostScript language
+ include, but may not be limited to, the PostScript
+ operators "deletefile", "renamefile", "filenameforall",
+ and "file". "File" is only dangerous when applied to
+ something other than standard input or output.
+ Implementations may also define additional nonstandard
+ file operators; these may also pose a threat to
+ security. "Filenameforall", the wildcard file search
+ operator, may appear at first glance to be harmless.
+ Note, however, that this operator has the potential to
+ reveal information about what files the recipient has
+ access to, and this information may itself be
+ sensitive. Message senders should avoid the use of
+ potentially dangerous file operators, since these
+ operators are quite likely to be unavailable in secure
+ PostScript implementations. Message receiving and
+ displaying software should either completely disable
+ all potentially dangerous file operators or take
+ special care not to delegate any special authority to
+ their operation. These operators should be viewed as
+ being done by an outside agency when interpreting
+ PostScript documents. Such disabling and/or checking
+ should be done completely outside of the reach of the
+ PostScript language itself; care should be taken to
+ ensure that no method exists for re-enabling full-function
+ versions of these operators.
+
+ (2) The PostScript language provides facilities for exiting
+ the normal interpreter, or server, loop. Changes made
+ in this "outer" environment are customarily retained
+ across documents, and may in some cases be retained
+ semipermanently in nonvolatile memory. The operators
+ associated with exiting the interpreter loop have the
+ potential to interfere with subsequent document
+ processing. As such, their unrestrained use
+ constitutes a threat of service denial. PostScript
+ operators that exit the interpreter loop include, but
+ may not be limited to, the exitserver and startjob
+ operators. Message sending software should not
+ generate PostScript that depends on exiting the
+ interpreter loop to operate, since the ability to exit
+ will probably be unavailable in secure PostScript
+ implementations. Message receiving and displaying
+ software should completely disable the ability to make
+ retained changes to the PostScript environment by
+ eliminating or disabling the "startjob" and
+ "exitserver" operations. If these operations cannot be
+ eliminated or completely disabled, the password
+ associated with them should at least be set to a hard-
+ to-guess value.
+
+ (3) PostScript provides operators for setting system-wide
+ and device-specific parameters. These parameter
+ settings may be retained across jobs and may
+ potentially pose a threat to the correct operation of
+ the interpreter. The PostScript operators that set
+ system and device parameters include, but may not be
+ limited to, the "setsystemparams" and "setdevparams"
+ operators. Message sending software should not
+ generate PostScript that depends on the setting of
+ system or device parameters to operate correctly. The
+ ability to set these parameters will probably be
+ unavailable in secure PostScript implementations.
+ Message receiving and displaying software should
+ disable the ability to change system and device
+ parameters. If these operators cannot be completely
+ disabled, the password associated with them should at
+ least be set to a hard-to-guess value.
+
+ (4) Some PostScript implementations provide nonstandard
+ facilities for the direct loading and execution of
+ machine code. Such facilities are quite obviously open
+ to substantial abuse. Message sending software should
+ not make use of such features. Besides being totally
+ hardware-specific, they are also likely to be
+ unavailable in secure implementations of PostScript.
+ Message receiving and displaying software should not
+ allow such operators to be used if they exist.
+
+ (5) PostScript is an extensible language, and many, if not
+ most, implementations of it provide a number of their
+ own extensions. This document does not deal with such
+ extensions explicitly since they constitute an unknown
+ factor. Message sending software should not make use
+ of nonstandard extensions; they are likely to be
+ missing from some implementations. Message receiving
+ and displaying software should make sure that any
+ nonstandard PostScript operators are secure and don't
+ present any kind of threat.
+
+ (6) It is possible to write PostScript that consumes huge
+ amounts of various system resources. It is also
+ possible to write PostScript programs that loop
+ indefinitely. Both types of programs have the
+ potential to cause damage if sent to unsuspecting
+ recipients. Message-sending software should avoid the
+ construction and dissemination of such programs, which
+ is antisocial. Message receiving and displaying
+ software should provide appropriate mechanisms to abort
+ processing after a reasonable amount of time has
+ elapsed. In addition, PostScript interpreters should be
+ limited to the consumption of only a reasonable amount
+ of any given system resource.
+
+ (7) It is possible to include raw binary information inside
+ PostScript in various forms. This is not recommended
+ for use in Internet mail, both because it is not
+ supported by all PostScript interpreters and because it
+ significantly complicates the use of a MIME Content-
+ Transfer-Encoding. (Without such binary, PostScript
+ may typically be viewed as line-oriented data. The
+ treatment of CRLF sequences becomes extremely
+ problematic if binary and line-oriented data are mixed
+ in a single PostScript data stream.)
+
+ (8) Finally, bugs may exist in some PostScript interpreters
+ which could possibly be exploited to gain unauthorized
+ access to a recipient's system. Beyond noting this
+ possibility, the only specific preventive action is the
+ timely correction of such bugs if any are found.
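+
+ The operators named in items (1) through (3) can be looked for with
+ a simple textual scan, sketched below in Python as a non-normative
+ illustration. The function name and token pattern are assumptions.
+ Such a scan is only a triage aid: PostScript can construct operator
+ names at run time, so it is NOT a substitute for disabling the
+ operators outside the reach of the language, as required above.
+

```python
import re

# Operators singled out in items (1) through (3) above.
RISKY_OPERATORS = {
    "deletefile", "renamefile", "filenameforall", "file",
    "exitserver", "startjob",
    "setsystemparams", "setdevparams",
}

# Runs of characters that are neither whitespace nor PostScript delimiters.
_TOKEN = re.compile(r"[^\s()<>\[\]{}/%]+")


def flag_risky_operators(source: str) -> set:
    """Return the risky operator names that appear anywhere in `source`.

    Comments and strings are deliberately scanned too: the result may
    contain false positives, but text hidden behind a "%" inside a
    string cannot be used to skip the check.
    """
    return {tok for tok in _TOKEN.findall(source) if tok in RISKY_OPERATORS}
```
+
+ A non-empty result might be used to refuse interactive display or to
+ warn the user before the body is passed to a restricted interpreter.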
+
+4.5.3. Other Application Subtypes
+
+ It is expected that many other subtypes of "application" will be
+ defined in the future. MIME implementations must at a minimum treat
+ any unrecognized subtypes as being equivalent to "application/octet-
+ stream".
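+
+ The fallback rules stated in sections 4.1.4 through 4.5.3 for
+ unrecognized subtypes of the discrete media types can be summarized
+ by the following non-normative Python sketch. The tables of known
+ subtypes and charsets list only the values defined in this document
+ and are assumptions about what a minimal implementation supports;
+ the function name is hypothetical.
+

```python
KNOWN_SUBTYPES = {
    "text": {"plain"},
    "image": {"jpeg"},
    "audio": {"basic"},
    "video": {"mpeg"},
    "application": {"octet-stream", "postscript"},
}
KNOWN_CHARSETS = {"us-ascii"} | {f"iso-8859-{n}" for n in range(1, 11)}


def effective_type(maintype, subtype, charset="us-ascii"):
    """Return the media type a minimal implementation should act on."""
    maintype, subtype = maintype.lower(), subtype.lower()
    if maintype not in KNOWN_SUBTYPES:
        raise ValueError("not one of the five discrete media types")
    if subtype in KNOWN_SUBTYPES[maintype]:
        return f"{maintype}/{subtype}"
    # Unrecognized "text" subtype: treat as plain if the charset is known.
    if maintype == "text" and charset.lower() in KNOWN_CHARSETS:
        return "text/plain"
    # Everything else falls back to arbitrary binary data.
    return "application/octet-stream"
```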
+
+5. Composite Media Type Values
+
+ The remaining two of the seven initial Content-Type values refer to
+ composite entities. Composite entities are handled using MIME
+ mechanisms -- a MIME processor typically handles the body directly.
+
+5.1. Multipart Media Type
+
+ In the case of multipart entities, in which one or more different
+ sets of data are combined in a single body, a "multipart" media type
+ field must appear in the entity's header. The body must then contain
+ one or more body parts, each preceded by a boundary delimiter line,
+ and the last one followed by a closing boundary delimiter line.
+ After its boundary delimiter line, each body part then consists of a
+ header area, a blank line, and a body area. Thus a body part is
+ similar to an RFC 822 message in syntax, but different in meaning.
+
+ A body part is an entity and hence is NOT to be interpreted as
+ actually being an RFC 822 message. To begin with, NO header fields
+ are actually required in body parts. A body part that starts with a
+ blank line, therefore, is allowed and is a body part for which all
+ default values are to be assumed. In such a case, the absence of a
+ Content-Type header usually indicates that the corresponding body has
+ a content-type of "text/plain; charset=US-ASCII".
+
+ The only header fields that have defined meaning for body parts are
+ those the names of which begin with "Content-". All other header
+ fields may be ignored in body parts. Although they should generally
+ be retained if at all possible, they may be discarded by gateways if
+ necessary. Such other fields are permitted to appear in body parts
+ but must not be depended on. "X-" fields may be created for
+ experimental or private purposes, with the recognition that the
+ information they contain may be lost at some gateways.
+
+ NOTE: The distinction between an RFC 822 message and a body part is
+ subtle, but important. A gateway between Internet and X.400 mail,
+ for example, must be able to tell the difference between a body part
+ that contains an image and a body part that contains an encapsulated
+ message, the body of which is a JPEG image. In order to represent
+ the latter, the body part must have "Content-Type: message/rfc822",
+ and its body (after the blank line) must be the encapsulated message,
+ with its own "Content-Type: image/jpeg" header field. The use of
+ similar syntax facilitates the conversion of messages to body parts,
+ and vice versa, but the distinction between the two must be
+ understood by implementors. (For the special case in which parts
+ actually are messages, a "digest" subtype is also defined.)
+
+ As stated previously, each body part is preceded by a boundary
+ delimiter line that contains the boundary delimiter. The boundary
+ delimiter MUST NOT appear inside any of the encapsulated parts, on a
+ line by itself or as the prefix of any line. This implies that it is
+ crucial that the composing agent be able to choose and specify a
+ unique boundary parameter value that does not contain the boundary
+ parameter value of an enclosing multipart as a prefix.
+
+ All present and future subtypes of the "multipart" type must use an
+ identical syntax. Subtypes may differ in their semantics, and may
+ impose additional restrictions on syntax, but must conform to the
+ required syntax for the "multipart" type. This requirement ensures
+ that all conformant user agents will at least be able to recognize
+ and separate the parts of any multipart entity, even those of an
+ unrecognized subtype.
+
+ As stated in the definition of the Content-Transfer-Encoding field
+ [RFC 2045], no encoding other than "7bit", "8bit", or "binary" is
+ permitted for entities of type "multipart". The "multipart" boundary
+ delimiters and header fields are always represented as 7bit US-ASCII
+ in any case (though the header fields may encode non-US-ASCII header
+ text as per RFC 2047) and data within the body parts can be encoded
+ on a part-by-part basis, with Content-Transfer-Encoding fields for
+ each appropriate body part.
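+
+ A composing or validating agent might enforce the encoding
+ restriction above along the lines of this non-normative Python
+ sketch. The function name is hypothetical; the default of "7bit"
+ for an absent Content-Transfer-Encoding field follows RFC 2045.
+

```python
MULTIPART_ENCODINGS = {"7bit", "8bit", "binary"}


def check_transfer_encoding(content_type: str, encoding: str = "7bit") -> str:
    """Return the normalized encoding, or raise if it is not permitted.

    Only the top-level type is examined; parameters after ";" and the
    subtype after "/" are ignored.
    """
    maintype = content_type.split(";", 1)[0].split("/", 1)[0].strip().lower()
    normalized = encoding.strip().lower()
    if maintype == "multipart" and normalized not in MULTIPART_ENCODINGS:
        raise ValueError(
            f"{normalized!r} is not permitted for entities of type multipart"
        )
    return normalized
```
+
+ Individual body parts remain free to use other encodings, such as
+ "base64", each with its own Content-Transfer-Encoding field.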
+
+5.1.1. Common Syntax
+
+ This section defines a common syntax for subtypes of "multipart".
+ All subtypes of "multipart" must use this syntax. A simple example
+ of a multipart message also appears in this section. An example of a
+ more complex multipart message is given in RFC 2049.
+
+ The Content-Type field for multipart entities requires one parameter,
+ "boundary". The boundary delimiter line is then defined as a line
+ consisting entirely of two hyphen characters ("-", decimal value 45)
+ followed by the boundary parameter value from the Content-Type header
+ field, optional linear whitespace, and a terminating CRLF.
+
+ NOTE: The hyphens are for rough compatibility with the earlier RFC
+ 934 method of message encapsulation, and for ease of searching for
+ the boundaries in some implementations. However, it should be noted
+ that multipart messages are NOT completely compatible with RFC 934
+ encapsulations; in particular, they do not obey RFC 934 quoting
+ conventions for embedded lines that begin with hyphens. This
+ mechanism was chosen over the RFC 934 mechanism because the latter
+ causes lines to grow with each level of quoting. The combination of
+ this growth with the fact that SMTP implementations sometimes wrap
+ long lines made the RFC 934 mechanism unsuitable for use in the event
+ that deeply-nested multipart structuring is ever desired.
+
+ WARNING TO IMPLEMENTORS: The grammar for parameters on the Content-
+ type field is such that it is often necessary to enclose the boundary
+ parameter values in quotes on the Content-type line. This is not
+ always necessary, but never hurts. Implementors should be sure to
+ study the grammar carefully in order to avoid producing invalid
+ Content-type fields. Thus, a typical "multipart" Content-Type header
+ field might look like this:
+
+ Content-Type: multipart/mixed; boundary=gc0p4Jq0M2Yt08j34c0p
+
+ But the following is not valid:
+
+ Content-Type: multipart/mixed; boundary=gc0pJq0M:08jU534c0p
+
+ (because of the colon) and must instead be represented as
+
+ Content-Type: multipart/mixed; boundary="gc0pJq0M:08jU534c0p"
+
+ This Content-Type value indicates that the content consists of one or
+ more parts, each with a structure that is syntactically identical to
+ an RFC 822 message, except that the header area is allowed to be
+ completely empty, and that the parts are each preceded by the line
+
+ --gc0pJq0M:08jU534c0p
+
+ The boundary delimiter MUST occur at the beginning of a line, i.e.,
+ following a CRLF, and the initial CRLF is considered to be attached
+ to the boundary delimiter line rather than part of the preceding
+ part. The boundary may be followed by zero or more characters of
+ linear whitespace. It is then terminated by either another CRLF and
+ the header fields for the next part, or by two CRLFs, in which case
+ there are no header fields for the next part. If no Content-Type
+ field is present it is assumed to be "message/rfc822" in a
+ "multipart/digest" and "text/plain" otherwise.
+
+ NOTE: The CRLF preceding the boundary delimiter line is conceptually
+ attached to the boundary so that it is possible to have a part that
+ does not end with a CRLF (line break). Body parts that must be
+ considered to end with line breaks, therefore, must have two CRLFs
+ preceding the boundary delimiter line, the first of which is part of
+ the preceding body part, and the second of which is part of the
+ encapsulation boundary.
+
+ Boundary delimiters must not appear within the encapsulated material,
+ and must be no longer than 70 characters, not counting the two
+ leading hyphens.
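+
+ The length limit and the quoting advice given earlier in this
+ section can be combined as in the following non-normative Python
+ sketch. The permitted boundary characters are taken from the
+ boundary grammar of this specification and the "tspecials" set from
+ RFC 2045; the function names are hypothetical.
+

```python
import string

# Characters permitted in a boundary, other than space.
BCHARS_NOSPACE = set(string.ascii_letters + string.digits + "'()+_,-./:=?")
# Characters that force a parameter value to be quoted (RFC 2045).
TSPECIALS = set('()<>@,;:\\"/[]?=')


def check_boundary(boundary: str) -> None:
    """Raise ValueError unless `boundary` is a legal boundary value."""
    if not 1 <= len(boundary) <= 70:
        raise ValueError("boundary must be 1 to 70 characters long")
    if boundary.endswith(" "):
        raise ValueError("boundary must not end with a space")
    illegal = set(boundary) - BCHARS_NOSPACE - {" "}
    if illegal:
        raise ValueError(f"illegal boundary characters: {sorted(illegal)}")


def boundary_parameter(boundary: str) -> str:
    """Format the boundary parameter, quoting the value only when needed."""
    check_boundary(boundary)
    needs_quotes = any(ch == " " or ch in TSPECIALS for ch in boundary)
    return f'boundary="{boundary}"' if needs_quotes else f"boundary={boundary}"
```
+
+ Always quoting the value would also be valid, since quoting is never
+ harmful.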
+
+ The boundary delimiter line following the last body part is a
+ distinguished delimiter that indicates that no further body parts
+ will follow. Such a delimiter line is identical to the previous
+ delimiter lines, with the addition of two more hyphens after the
+ boundary parameter value.
+
+ --gc0pJq0M:08jU534c0p--
+
+ NOTE TO IMPLEMENTORS: Boundary string comparisons must compare the
+ boundary value with the beginning of each candidate line. An exact
+ match of the entire candidate line is not required; it is sufficient
+ that the boundary appear in its entirety following the CRLF.
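+
+ The comparison rule in the note above might be implemented as in
+ this non-normative Python sketch, which tests each candidate line
+ for the two hyphens and the boundary value at its beginning. The
+ function name and its return labels are assumptions.
+

```python
def classify_line(line: bytes, boundary: str):
    """Classify one line of a multipart body.

    Returns "close" for the closing delimiter line, "delimiter" for any
    other boundary delimiter line, and None for ordinary content. Only
    the beginning of the line is compared, so trailing whitespace or
    other characters after the boundary do not prevent a match.
    """
    marker = b"--" + boundary.encode("ascii")
    if not line.startswith(marker):
        return None
    rest = line[len(marker):]
    return "close" if rest.startswith(b"--") else "delimiter"
```
+
+ Because only a prefix is compared, a nested multipart whose boundary
+ began with the enclosing boundary value would be misread, which is
+ why such boundary values are prohibited.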
+
+ There appears to be room for additional information prior to the
+ first boundary delimiter line and following the final boundary
+ delimiter line. These areas should generally be left blank, and
+ implementations must ignore anything that appears before the first
+ boundary delimiter line or after the last one.
+
+ NOTE: These "preamble" and "epilogue" areas are generally not used
+ because of the lack of proper typing of these parts and the lack of
+ clear semantics for handling these areas at gateways, particularly
+ X.400 gateways. However, rather than leaving the preamble area
+ blank, many MIME implementations have found this to be a convenient
+ place to insert an explanatory note for recipients who read the
+ message with pre-MIME software, since such notes will be ignored by
+ MIME-compliant software.
+
+ NOTE: Because boundary delimiters must not appear in the body parts
+ being encapsulated, a user agent must exercise care to choose a
+ unique boundary parameter value. The boundary parameter value in the
+ example above could have been the result of an algorithm designed to
+ produce boundary delimiters with a very low probability of already
+ existing in the data to be encapsulated without having to prescan the
+ data. Alternate algorithms might result in more "readable" boundary
+ delimiters for a recipient with an old user agent, but would require
+ more attention to the possibility that the boundary delimiter might
+ appear at the beginning of some line in the encapsulated part. The
+ simplest boundary delimiter line possible is something like "---",
+ with a closing boundary delimiter line of "-----".
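+
+ One possible way to produce such a boundary value is sketched below.
+ The "=_" prefix, the use of random hex digits, and the function name
+ are assumptions made for illustration. The sketch also scans the
+ data as an extra safety check, which the algorithm described above
+ does not require.
+
```python
import secrets


def make_boundary(data: bytes, attempts: int = 5) -> str:
    """Return a random boundary value that does not occur in data."""
    for _ in range(attempts):
        # "=_" plus 32 hex digits: 34 characters, all permitted in a
        # boundary and well under the 70-character limit.
        candidate = "=_" + secrets.token_hex(16)
        if ("--" + candidate).encode("ascii") not in data:
            return candidate
    raise RuntimeError("could not find an unused boundary value")
```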
+
+ As a very simple example, the following multipart message has two
+ parts, both of them plain text, one of them explicitly typed and one
+ of them implicitly typed:
+
+ From: Nathaniel Borenstein <nsb@bellcore.com>
+ To: Ned Freed <ned@innosoft.com>
+ Date: Sun, 21 Mar 1993 23:56:48 -0800 (PST)
+ Subject: Sample message
+ MIME-Version: 1.0
+ Content-type: multipart/mixed; boundary="simple boundary"
+
+ This is the preamble. It is to be ignored, though it
+ is a handy place for composition agents to include an
+ explanatory note to non-MIME conformant readers.
+
+ --simple boundary
+
+ This is implicitly typed plain US-ASCII text.
+ It does NOT end with a linebreak.
+ --simple boundary
+ Content-type: text/plain; charset=us-ascii
+
+ This is explicitly typed plain US-ASCII text.
+ It DOES end with a linebreak.
+
+ --simple boundary--
+
+ This is the epilogue. It is also to be ignored.
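+
+ As a rough sketch, a receiver could separate the body parts of such
+ a message as shown below. The function name is hypothetical, the
+ input is assumed to use CRLF line endings, and the sketch requires
+ the delimiter line to contain nothing but optional trailing white
+ space, which is stricter than the prefix comparison noted earlier.
+
```python
import re


def split_multipart(body: bytes, boundary: bytes) -> list:
    """Return the raw body parts, discarding preamble and epilogue."""
    # The CRLF before a delimiter belongs to the delimiter, not to the
    # preceding part. Prepending a CRLF lets the same pattern match a
    # delimiter that sits on the very first line.
    pattern = re.compile(
        rb"\r\n--" + re.escape(boundary) + rb"(--)?[ \t]*(?=\r\n|\Z)"
    )
    data = b"\r\n" + body
    parts = []
    start = None
    for match in pattern.finditer(data):
        if start is not None:
            parts.append(data[start:match.start()])
        if match.group(1):  # close-delimiter: no further parts follow
            break
        start = match.end() + 2  # skip the CRLF ending the delimiter line
    return parts
```
+
+ Applied to the example above, the first part returned does not end
+ with a CRLF and the second part does.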
+
+ The use of a media type of "multipart" in a body part within another
+ "multipart" entity is explicitly allowed. In such cases, for obvious
+ reasons, care must be taken to ensure that each nested "multipart"
+ entity uses a different boundary delimiter. See RFC 2049 for an
+ example of nested "multipart" entities.
+
+ The use of the "multipart" media type with only a single body part
+ may be useful in certain contexts, and is explicitly permitted.
+
+ NOTE: Experience has shown that a "multipart" media type with a
+ single body part is useful for sending non-text media types. It has
+ the advantage of providing the preamble as a place to include
+ decoding instructions. In addition, a number of SMTP gateways move
+ or remove the MIME headers, and a clever MIME decoder can take a good
+ guess at multipart boundaries even in the absence of the Content-Type
+ header and thereby successfully decode the message.
+
+ The only mandatory global parameter for the "multipart" media type is
+ the boundary parameter, which consists of 1 to 70 characters from a
+ set of characters known to be very robust through mail gateways, and
+ NOT ending with white space. (If a boundary delimiter line appears to
+ end with white space, the white space must be presumed to have been
+ added by a gateway, and must be deleted.) It is formally specified
+ by the following BNF:
+
+ boundary := 0*69<bchars> bcharsnospace
+
+ bchars := bcharsnospace / " "
+
+ bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" /
+ "+" / "_" / "," / "-" / "." /
+ "/" / ":" / "=" / "?"
+
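+ A boundary value can be checked against this grammar with a regular
+ expression. The following is a minimal sketch; the names are
+ hypothetical and the pattern is a direct transcription of the BNF
+ above.
+
```python
import re

# Characters allowed by "bcharsnospace"; "bchars" adds the space.
_BCHARS_NOSPACE = r"0-9A-Za-z'()+_,\-./:=?"
BOUNDARY_RE = re.compile(
    rf"[{_BCHARS_NOSPACE} ]{{0,69}}[{_BCHARS_NOSPACE}]"
)


def is_valid_boundary(value: str) -> bool:
    """True if value has 1 to 70 allowed characters and no trailing space."""
    return BOUNDARY_RE.fullmatch(value) is not None
```
+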
+ Overall, the body of a "multipart" entity may be specified as
+ follows:
+
+ dash-boundary := "--" boundary
+ ; boundary taken from the value of
+ ; boundary parameter of the
+ ; Content-Type field.
+
+ multipart-body := [preamble CRLF]
+ dash-boundary transport-padding CRLF
+ body-part *encapsulation
+ close-delimiter transport-padding
+ [CRLF epilogue]
+
+ transport-padding := *LWSP-char
+ ; Composers MUST NOT generate
+ ; non-zero length transport
+ ; padding, but receivers MUST
+ ; be able to handle padding
+ ; added by message transports.
+
+ encapsulation := delimiter transport-padding
+ CRLF body-part
+
+ delimiter := CRLF dash-boundary
+
+ close-delimiter := delimiter "--"
+
+ preamble := discard-text
+
+ epilogue := discard-text
+
+ discard-text := *(*text CRLF) *text
+ ; May be ignored or discarded.
+
+ body-part := MIME-part-headers [CRLF *OCTET]
+ ; Lines in a body-part must not start
+ ; with the specified dash-boundary and
+ ; the delimiter must not appear anywhere
+ ; in the body part. Note that the
+ ; semantics of a body-part differ from
+ ; the semantics of a message, as
+ ; described in the text.
+
+ OCTET := <any 0-255 octet value>
+
+ IMPORTANT: The free insertion of linear-white-space and RFC 822
+ comments between the elements shown in this BNF is NOT allowed since
+ this BNF does not specify a structured header field.
+
+ NOTE: In certain transport enclaves, RFC 822 restrictions such as
+ the one that limits bodies to printable US-ASCII characters may not
+ be in force. (That is, transport domains may exist that resemble
+ standard Internet mail transport as specified in RFC 821 and assumed
+ by RFC 822, but without certain restrictions.) The relaxation of
+ these restrictions should be construed as locally extending the
+ definition of bodies, for example to include octets outside of the
+ US-ASCII range, as long as these extensions are supported by the
+ transport and adequately documented in the Content-Transfer-Encoding
+ header field. However, in no event are headers (either message
+ headers or body part headers) allowed to contain anything other than
+ US-ASCII characters.
+
+ NOTE: Conspicuously missing from the "multipart" type is a notion of
+ structured, related body parts. It is recommended that those wishing
+ to provide more structured or integrated multipart messaging
+ facilities should define subtypes of multipart that are syntactically
+ identical but define relationships between the various parts. For
+ example, subtypes of multipart could be defined that include a
+ distinguished part which in turn is used to specify the relationships
+ between the other parts, probably referring to them by their
+ Content-ID field. Old implementations will not recognize the new
+ subtype if this approach is used, but will treat it as
+ multipart/mixed and will thus be able to show the user the parts that
+ are recognized.
+
+5.1.2. Handling Nested Messages and Multiparts
+
+ The "message/rfc822" subtype defined in a subsequent section of this
+ document has no terminating condition other than running out of data.
+ Similarly, an improperly truncated "multipart" entity may not have
+ any terminating boundary marker, and can turn up operationally due to
+ mail system malfunctions.
+
+ It is essential that such entities be handled correctly when they are
+ themselves imbedded inside of another "multipart" structure. MIME
+ implementations are therefore required to recognize outer level
+ boundary markers at ANY level of inner nesting. It is not sufficient
+ to only check for the next expected marker or other terminating
+ condition.
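+
+ One way to meet this requirement is to keep a stack of all active
+ boundary values and test every line against each of them. The
+ sketch below assumes such a stack, ordered outermost first; the
+ function name and return shape are hypothetical.
+
```python
from typing import Optional


def match_boundary(line: bytes, boundary_stack: list) -> Optional[tuple]:
    """Return (depth, is_close) for the boundary this line matches.

    boundary_stack holds the active boundary values, outermost first.
    A match at a depth below the innermost level means the inner
    entity was truncated and must be treated as ended.
    """
    candidate = line.rstrip(b" \t\r\n")
    # Check the innermost boundary first, then every enclosing level.
    for depth in range(len(boundary_stack) - 1, -1, -1):
        dash_boundary = b"--" + boundary_stack[depth]
        if candidate == dash_boundary:
            return depth, False
        if candidate == dash_boundary + b"--":
            return depth, True
    return None
```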
+
+5.1.3. Mixed Subtype
+
+ The "mixed" subtype of "multipart" is intended for use when the body
+ parts are independent and need to be bundled in a particular order.
+ Any "multipart" subtypes that an implementation does not recognize
+ must be treated as being of subtype "mixed".
+
+5.1.4. Alternative Subtype
+
+ The "multipart/alternative" type is syntactically identical to
+ "multipart/mixed", but the semantics are different. In particular,
+ each of the body parts is an "alternative" version of the same
+ information.
+
+ Systems should recognize that the contents of the various parts are
+ interchangeable. Systems should choose the "best" type based on the
+ local environment and preferences, in some cases even through user
+ interaction. As with "multipart/mixed", the order of body parts is
+ significant. In this case, the alternatives appear in an order of
+ increasing faithfulness to the original content. In general, the
+ best choice is the LAST part of a type supported by the recipient
+ system's local environment.
+
+ "Multipart/alternative" may be used, for example, to send a message
+ in a fancy text format in such a way that it can easily be displayed
+ anywhere:
+
+ From: Nathaniel Borenstein <nsb@bellcore.com>
+ To: Ned Freed <ned@innosoft.com>
+ Date: Mon, 22 Mar 1993 09:41:09 -0800 (PST)
+ Subject: Formatted text mail
+ MIME-Version: 1.0
+ Content-Type: multipart/alternative; boundary=boundary42
+
+ --boundary42
+ Content-Type: text/plain; charset=us-ascii
+
+ ... plain text version of message goes here ...
+
+ --boundary42
+ Content-Type: text/enriched
+
+ ... RFC 1896 text/enriched version of same message
+ goes here ...
+
+ --boundary42
+ Content-Type: application/x-whatever
+
+ ... fanciest version of same message goes here ...
+
+ --boundary42--
+
+ In this example, users whose mail systems understood the
+ "application/x-whatever" format would see only the fancy version,
+ while other users would see only the enriched or plain text version,
+ depending on the capabilities of their system.
+
+ In general, user agents that compose "multipart/alternative" entities
+ must place the body parts in increasing order of preference, that is,
+ with the preferred format last. For fancy text, the sending user
+ agent should put the plainest format first and the richest format
+ last. Receiving user agents should pick and display the last format
+ they are capable of displaying. In the case where one of the
+ alternatives is itself of type "multipart" and contains unrecognized
+ sub-parts, the user agent may choose either to show that alternative,
+ an earlier alternative, or both.
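+
+ The selection rule for receiving user agents can be sketched as
+ follows. The function name and the representation of the parts as a
+ list of media type strings are assumptions for illustration.
+
```python
from typing import Optional


def pick_alternative(part_types: list, supported: set) -> Optional[int]:
    """Return the index of the LAST part whose type is supported."""
    # Parts are ordered from plainest to richest, so scan backwards.
    for index in range(len(part_types) - 1, -1, -1):
        if part_types[index].lower() in supported:
            return index
    return None
```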
+
+ NOTE: From an implementor's perspective, it might seem more sensible
+ to reverse this ordering, and have the plainest alternative last.
+ However, placing the plainest alternative first is the friendliest
+ possible option when "multipart/alternative" entities are viewed
+ using a non-MIME-conformant viewer. While this approach does impose
+ some burden on conformant MIME viewers, interoperability with older
+ mail readers was deemed to be more important in this case.
+
+ It may be the case that some user agents, if they can recognize more
+ than one of the formats, will prefer to offer the user the choice of
+ which format to view. This makes sense, for example, if a message
+ includes both a nicely-formatted image version and an easily-edited
+ text version. What is most critical, however, is that the user not
+ automatically be shown multiple versions of the same data. Either
+ the user should be shown the last recognized version or should be
+ given the choice.
+
+ THE SEMANTICS OF CONTENT-ID IN MULTIPART/ALTERNATIVE: Each part of a
+ "multipart/alternative" entity represents the same data, but the
+ mappings between the two are not necessarily without information
+ loss. For example, information is lost when translating ODA to
+ PostScript or plain text. It is recommended that each part should
+ have a different Content-ID value in the case where the information
+ content of the two parts is not identical. And when the information
+ content is identical -- for example, where several parts of type
+ "message/external-body" specify alternate ways to access the
+ identical data -- the same Content-ID field value should be used, to
+ optimize any caching mechanisms that might be present on the
+ recipient's end. However, the Content-ID values used by the parts
+ should NOT be the same Content-ID value that describes the
+ "multipart/alternative" as a whole, if there is any such Content-ID
+ field. That is, one Content-ID value will refer to the
+ "multipart/alternative" entity, while one or more other Content-ID
+ values will refer to the parts inside it.
+
+5.1.5. Digest Subtype
+
+ This document defines a "digest" subtype of the "multipart" Content-
+ Type. This type is syntactically identical to "multipart/mixed", but
+ the semantics are different. In particular, in a digest, the default
+ Content-Type value for a body part is changed from "text/plain" to
+ "message/rfc822". This is done to allow a more readable digest
+ format that is largely compatible (except for the quoting convention)
+ with RFC 934.
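+
+ The changed default can be expressed as a small helper. This is an
+ illustrative sketch only; the function name is hypothetical, and the
+ declared value is assumed to be the raw Content-Type field body of
+ the part, or None when the part has no such field.
+
```python
from typing import Optional


def effective_type(declared: Optional[str], parent_type: str) -> str:
    """Return the media type of a body part, applying the default."""
    if declared:
        # Drop any parameters such as "; charset=us-ascii".
        return declared.split(";", 1)[0].strip().lower()
    if parent_type.lower() == "multipart/digest":
        return "message/rfc822"
    return "text/plain"
```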
+
+ Note: Though it is possible to specify a Content-Type value for a
+ body part in a digest which is other than "message/rfc822", such as a
+ "text/plain" part containing a description of the material in the
+ digest, actually doing so is undesirable. The "multipart/digest"
+ Content-Type is intended to be used to send collections of messages.
+ If a "text/plain" part is needed, it should be included as a separate
+ part of a "multipart/mixed" message.
+
+ A digest in this format might, then, look something like this:
+
+ From: Moderator-Address
+ To: Recipient-List
+ Date: Mon, 22 Mar 1994 13:34:51 +0000
+ Subject: Internet Digest, volume 42
+ MIME-Version: 1.0
+ Content-Type: multipart/mixed;
+ boundary="---- main boundary ----"
+
+ ------ main boundary ----
+
+ ...Introductory text or table of contents...
+
+ ------ main boundary ----
+ Content-Type: multipart/digest;
+ boundary="---- next message ----"
+
+ ------ next message ----
+
+ From: someone-else
+ Date: Fri, 26 Mar 1993 11:13:32 +0200
+ Subject: my opinion
+
+ ...body goes here ...
+
+ ------ next message ----
+
+ From: someone-else-again
+ Date: Fri, 26 Mar 1993 10:07:13 -0500
+ Subject: my different opinion
+
+ ... another body goes here ...
+
+ ------ next message ------
+
+ ------ main boundary ------
+
+5.1.6. Parallel Subtype
+
+ This document defines a "parallel" subtype of the "multipart"
+ Content-Type. This type is syntactically identical to
+ "multipart/mixed", but the semantics are different. In particular,
+ in a parallel entity, the order of body parts is not significant.
+
+ A common presentation of this type is to display all of the parts
+ simultaneously on hardware and software that are capable of doing so.
+ However, composing agents should be aware that many mail readers will
+ lack this capability and will show the parts serially in any event.
+
+5.1.7. Other Multipart Subtypes
+
+ Other "multipart" subtypes are expected in the future. MIME
+ implementations must in general treat unrecognized subtypes of
+ "multipart" as being equivalent to "multipart/mixed".
+
+5.2. Message Media Type
+
+ It is frequently desirable, in sending mail, to encapsulate another
+ mail message. A special media type, "message", is defined to
+ facilitate this. In particular, the "rfc822" subtype of "message" is
+ used to encapsulate RFC 822 messages.
+
+ NOTE: It has been suggested that subtypes of "message" might be
+ defined for forwarded or rejected messages. However, forwarded and
+ rejected messages can be handled as multipart messages in which the
+ first part contains any control or descriptive information, and a
+ second part, of type "message/rfc822", is the forwarded or rejected
+ message. Composing rejection and forwarding messages in this manner
+ will preserve the type information on the original message and allow
+ it to be correctly presented to the recipient, and hence is strongly
+ encouraged.
+
+ Subtypes of "message" often impose restrictions on what encodings are
+ allowed. These restrictions are described in conjunction with each
+ specific subtype.
+
+ Mail gateways, relays, and other mail handling agents are commonly
+ known to alter the top-level header of an RFC 822 message. In
+ particular, they frequently add, remove, or reorder header fields.
+ These operations are explicitly forbidden for the encapsulated
+ headers embedded in the bodies of messages of type "message."
+
+5.2.1. RFC822 Subtype
+
+ A media type of "message/rfc822" indicates that the body contains an
+ encapsulated message, with the syntax of an RFC 822 message.
+ However, unlike top-level RFC 822 messages, the restriction that each
+ "message/rfc822" body must include a "From", "Date", and at least one
+ destination header is removed and replaced with the requirement that
+ at least one of "From", "Subject", or "Date" must be present.
+
+ It should be noted that, despite the use of the numbers "822", a
+ "message/rfc822" entity isn't restricted to material in strict
+ conformance to RFC822, nor are the semantics of "message/rfc822"
+ objects restricted to the semantics defined in RFC822. More
+ specifically, a "message/rfc822" message could well be a News article
+ or a MIME message.
+
+ No encoding other than "7bit", "8bit", or "binary" is permitted for
+ the body of a "message/rfc822" entity. The message header fields are
+ always US-ASCII in any case, and data within the body can still be
+ encoded, in which case the Content-Transfer-Encoding header field in
+ the encapsulated message will reflect this. Non-US-ASCII text in the
+ headers of an encapsulated message can be specified using the
+ mechanisms described in RFC 2047.
+
+5.2.2. Partial Subtype
+
+ The "partial" subtype is defined to allow large entities to be
+ delivered as several separate pieces of mail and automatically
+ reassembled by a receiving user agent. (The concept is similar to IP
+ fragmentation and reassembly in the basic Internet Protocols.) This
+ mechanism can be used when intermediate transport agents limit the
+ size of individual messages that can be sent. The media type
+ "message/partial" thus indicates that the body contains a fragment of
+ a larger entity.
+
+ Because data of type "message" may never be encoded in base64 or
+ quoted-printable, a problem might arise if "message/partial" entities
+ are constructed in an environment that supports binary or 8bit
+ transport. The problem is that the binary data would be split into
+ multiple "message/partial" messages, each of them requiring binary
+ transport. If such messages were encountered at a gateway into a
+ 7bit transport environment, there would be no way to properly encode
+ them for the 7bit world, aside from waiting for all of the fragments,
+ reassembling the inner message, and then encoding the reassembled
+ data in base64 or quoted-printable. Since it is possible that
+ different fragments might go through different gateways, even this is
+ not an acceptable solution. For this reason, it is specified that
+ entities of type "message/partial" must always have a content-
+ transfer-encoding of 7bit (the default). In particular, even in
+ environments that support binary or 8bit transport, the use of a
+ content-transfer-encoding of "8bit" or "binary" is explicitly
+ prohibited for MIME entities of type "message/partial". This in turn
+ implies that the inner message must not use "8bit" or "binary"
+ encoding.
+
+ Because some message transfer agents may choose to automatically
+ fragment large messages, and because such agents may use very
+ different fragmentation thresholds, it is possible that the pieces of
+ a partial message, upon reassembly, may prove themselves to comprise
+ a partial message. This is explicitly permitted.
+
+ Three parameters must be specified in the Content-Type field of type
+ "message/partial": The first, "id", is a unique identifier, as close
+ to a world-unique identifier as possible, to be used to match the
+ fragments together. (In general, the identifier is essentially a
+ message-id; if placed in double quotes, it can be ANY message-id, in
+ accordance with the BNF for "parameter" given in RFC 2045.) The
+ second, "number", an integer, is the fragment number, which indicates
+ where this fragment fits into the sequence of fragments. The third,
+ "total", another integer, is the total number of fragments. This
+ third subfield is required on the final fragment, and is optional
+ (though encouraged) on the earlier fragments. Note also that these
+ parameters may be given in any order.
+
+ Thus, the second piece of a 3-piece message may have either of the
+ following header fields:
+
+ Content-Type: Message/Partial; number=2; total=3;
+ id="oc=jpbe0M2Yt4s@thumper.bellcore.com"
+
+ Content-Type: Message/Partial;
+ id="oc=jpbe0M2Yt4s@thumper.bellcore.com";
+ number=2
+
+ But the third piece MUST specify the total number of fragments:
+
+ Content-Type: Message/Partial; number=3; total=3;
+ id="oc=jpbe0M2Yt4s@thumper.bellcore.com"
+
+ Note that fragment numbering begins with 1, not 0.
+
+ When the fragments of an entity broken up in this manner are put
+ together, the result is always a complete MIME entity, which may have
+ its own Content-Type header field, and thus may contain any other
+ data type.
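+
+ A receiver might decide whether all fragments have arrived along the
+ following lines. The sketch assumes the Content-Type parameters of
+ each fragment sharing one "id" have already been parsed into a
+ dictionary; the function name and data layout are hypothetical.
+
```python
def fragments_complete(params_list: list) -> bool:
    """True if fragments 1..total are all present for one id.

    Each entry maps parameter names ("number", optionally "total") to
    their string values, as taken from a message/partial Content-Type.
    """
    totals = {int(p["total"]) for p in params_list if "total" in p}
    if len(totals) != 1:
        # Either no fragment carried "total" yet (the final piece is
        # still missing) or the fragments disagree.
        return False
    total = totals.pop()
    numbers = {int(p["number"]) for p in params_list}
    # Fragment numbering begins with 1, not 0.
    return numbers == set(range(1, total + 1))
```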
+
+5.2.2.1. Message Fragmentation and Reassembly
+
+ The semantics of a reassembled partial message must be those of the
+ "inner" message, rather than of a message containing the inner
+ message. This makes it possible, for example, to send a large audio
+ message as several partial messages, and still have it appear to the
+ recipient as a simple audio message rather than as an encapsulated
+ message containing an audio message. That is, the encapsulation of
+ the message is considered to be "transparent".
+
+ When generating and reassembling the pieces of a "message/partial"
+ message, the headers of the encapsulated message must be merged with
+ the headers of the enclosing entities. In this process the following
+ rules must be observed:
+
+ (1) Fragmentation agents must split messages at line
+ boundaries only. This restriction is imposed because
+ splits at points other than the ends of lines in turn
+ depend on message transports being able to preserve
+ the semantics of messages that don't end with a CRLF
+ sequence. Many transports are incapable of preserving
+ such semantics.
+
+ (2) All of the header fields from the initial enclosing
+ message, except those that start with "Content-" and
+ the specific header fields "Subject", "Message-ID",
+ "Encrypted", and "MIME-Version", must be copied, in
+ order, to the new message.
+
+ (3) The header fields in the enclosed message which start
+ with "Content-", plus the "Subject", "Message-ID",
+ "Encrypted", and "MIME-Version" fields, must be
+ appended, in order, to the header fields of the new
+ message. Any header fields in the enclosed message
+ which do not start with "Content-" (except for the
+ "Subject", "Message-ID", "Encrypted", and "MIME-
+ Version" fields) will be ignored and dropped.
+
+ (4) All of the header fields from the second and any
+ subsequent enclosing messages are discarded by the
+ reassembly process.
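+
+ Rules (2) through (4) can be sketched as a header merge. The sketch
+ assumes headers are held as ordered lists of (name, value) pairs and
+ that only the first enclosing message is passed in; the names used
+ are hypothetical.
+
```python
# Fields that are always taken from the enclosed (inner) message.
INNER_FIELDS = {"subject", "message-id", "encrypted", "mime-version"}


def _from_inner(name: str) -> bool:
    lowered = name.lower()
    return lowered.startswith("content-") or lowered in INNER_FIELDS


def merge_headers(first_outer: list, inner: list) -> list:
    """Build the header of the reassembled message."""
    # Rule (2): copy the outer fields, in order, except the listed ones.
    kept = [(n, v) for n, v in first_outer if not _from_inner(n)]
    # Rule (3): append the listed inner fields, in order; drop the rest.
    taken = [(n, v) for n, v in inner if _from_inner(n)]
    # Rule (4): headers of later fragments are never consulted.
    return kept + taken
```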
+
+5.2.2.2. Fragmentation and Reassembly Example
+
+ If an audio message is broken into two pieces, the first piece might
+ look something like this:
+
+ X-Weird-Header-1: Foo
+ From: Bill@host.com
+ To: joe@otherhost.com
+ Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
+ Subject: Audio mail (part 1 of 2)
+ Message-ID: <id1@host.com>
+ MIME-Version: 1.0
+ Content-type: message/partial; id="ABC@host.com";
+ number=1; total=2
+
+ X-Weird-Header-1: Bar
+ X-Weird-Header-2: Hello
+ Message-ID: <anotherid@foo.com>
+ Subject: Audio mail
+ MIME-Version: 1.0
+ Content-type: audio/basic
+ Content-transfer-encoding: base64
+
+ ... first half of encoded audio data goes here ...
+
+ and the second half might look something like this:
+
+ From: Bill@host.com
+ To: joe@otherhost.com
+ Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
+ Subject: Audio mail (part 2 of 2)
+ MIME-Version: 1.0
+ Message-ID: <id2@host.com>
+ Content-type: message/partial;
+ id="ABC@host.com"; number=2; total=2
+
+ ... second half of encoded audio data goes here ...
+
+ Then, when the fragmented message is reassembled, the resulting
+ message to be displayed to the user should look something like this:
+
+ X-Weird-Header-1: Foo
+ From: Bill@host.com
+ To: joe@otherhost.com
+ Date: Fri, 26 Mar 1993 12:59:38 -0500 (EST)
+ Subject: Audio mail
+ Message-ID: <anotherid@foo.com>
+ MIME-Version: 1.0
+ Content-type: audio/basic
+ Content-transfer-encoding: base64
+
+ ... first half of encoded audio data goes here ...
+ ... second half of encoded audio data goes here ...
+
+ The inclusion of a "References" field in the headers of the second
+ and subsequent pieces of a fragmented message that references the
+ Message-Id on the previous piece may be of benefit to mail readers
+ that understand and track references. However, the generation of
+ such "References" fields is entirely optional.
+
+ Finally, it should be noted that the "Encrypted" header field has
+ been made obsolete by Privacy Enhanced Messaging (PEM) [RFC-1421,
+ RFC-1422, RFC-1423, RFC-1424], but the rules above are nevertheless
+ believed to describe the correct way to treat it if it is encountered
+ in the context of conversion to and from "message/partial" fragments.
+
+5.2.3. External-Body Subtype
+
+ The external-body subtype indicates that the actual body data are not
+ included, but merely referenced. In this case, the parameters
+ describe a mechanism for accessing the external data.
+
+ When a MIME entity is of type "message/external-body", it consists of
+ a header, two consecutive CRLFs, and the message header for the
+ encapsulated message. If another pair of consecutive CRLFs appears,
+ this of course ends the message header for the encapsulated message.
+ However, since the encapsulated message's body is itself external, it
+ does NOT appear in the area that follows. For example, consider the
+ following message:
+
+ Content-type: message/external-body;
+ access-type=local-file;
+ name="/u/nsb/Me.jpeg"
+
+ Content-type: image/jpeg
+ Content-ID: <id42@guppylake.bellcore.com>
+ Content-Transfer-Encoding: binary
+
+ THIS IS NOT REALLY THE BODY!
+
+ The area at the end, which might be called the "phantom body", is
+ ignored for most external-body messages. However, it may be used to
+ contain auxiliary information for some such messages, as indeed it is
+ when the access-type is "mail-server". The only access-type defined
+ in this document that uses the phantom body is "mail-server", but
+ other access-types may be defined in the future in other
+ specifications that use this area.
+
+ The encapsulated headers in ALL "message/external-body" entities MUST
+ include a Content-ID header field to give a unique identifier by
+ which to reference the data. This identifier may be used for caching
+ mechanisms, and for recognizing the receipt of the data when the
+ access-type is "mail-server".
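+
+ The three areas of a "message/external-body" entity, and the check
+ for the mandatory Content-ID field, can be sketched as follows. The
+ function names are hypothetical, CRLF line endings are assumed, and
+ the header check is a simple line scan rather than a full header
+ parser.
+
```python
def split_external_body(entity: bytes) -> tuple:
    """Split into (entity header, encapsulated header, phantom body)."""
    # Two consecutive CRLFs end each header block.
    outer, _, rest = entity.partition(b"\r\n\r\n")
    inner, _, phantom = rest.partition(b"\r\n\r\n")
    return outer, inner, phantom


def has_content_id(encapsulated_header: bytes) -> bool:
    """True if the encapsulated header includes a Content-ID field."""
    return any(
        line.lower().startswith(b"content-id:")
        for line in encapsulated_header.split(b"\r\n")
    )
```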
+
+ Note that, as specified here, the tokens that describe external-body
+ data, such as file names and mail server commands, are required to be
+ in the US-ASCII character set.
+
+ If this proves problematic in practice, a new mechanism may be
+ required as a future extension to MIME, either as newly defined
+ access-types for "message/external-body" or by some other mechanism.
+
+ As with "message/partial", MIME entities of type "message/external-
+ body" MUST have a content-transfer-encoding of 7bit (the default).
+ In particular, even in environments that support binary or 8bit
+ transport, the use of a content-transfer-encoding of "8bit" or
+ "binary" is explicitly prohibited for entities of type
+ "message/external-body".
+
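+ The encoding restrictions stated for the three "message" subtypes
+ can be collected in a small table. This is a sketch under the
+ assumption that media types outside the table are simply not
+ checked; the names are hypothetical.
+
```python
# Content-Transfer-Encoding values permitted on the entity itself.
ALLOWED_ENCODINGS = {
    "message/rfc822": {"7bit", "8bit", "binary"},
    "message/partial": {"7bit"},
    "message/external-body": {"7bit"},
}


def encoding_allowed(media_type: str, encoding: str = "7bit") -> bool:
    """True if encoding may be used for an entity of media_type."""
    allowed = ALLOWED_ENCODINGS.get(media_type.lower())
    # Types not listed above are outside the scope of this check.
    return allowed is None or encoding.lower() in allowed
```
+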
+5.2.3.1. General External-Body Parameters
+
+ The parameters that may be used with any "message/external-body"
+ are:
+
+ (1) ACCESS-TYPE -- A word indicating the supported access
+ mechanism by which the file or data may be obtained.
+ This word is not case sensitive. Values include, but
+ are not limited to, "FTP", "ANON-FTP", "TFTP", "LOCAL-
+ FILE", and "MAIL-SERVER". Future values, except for
+ experimental values beginning with "X-", must be
+ registered with IANA, as described in RFC 2048.
+ This parameter is unconditionally mandatory and MUST be
+ present on EVERY "message/external-body".
+
+ (2) EXPIRATION -- The date (in the RFC 822 "date-time"
+ syntax, as extended by RFC 1123 to permit 4 digits in
+ the year field) after which the existence of the
+ external data is not guaranteed. This parameter may be
+ used with ANY access-type and is ALWAYS optional.
+
+ (3) SIZE -- The size (in octets) of the data. The intent
+ of this parameter is to help the recipient decide
+ whether or not to expend the necessary resources to
+ retrieve the external data. Note that this describes
+ the size of the data in its canonical form, that is,
+ before any Content-Transfer-Encoding has been applied
+ or after the data have been decoded. This parameter
+ may be used with ANY access-type and is ALWAYS
+ optional.
+
+ (4) PERMISSION -- A case-insensitive field that indicates
+ whether or not it is expected that clients might also
+ attempt to overwrite the data. By default, or if
+ permission is "read", the assumption is that they are
+ not, and that if the data is retrieved once, it is
+ never needed again. If PERMISSION is "read-write",
+ this assumption is invalid, and any local copy must be
+ considered no more than a cache. "Read" and "Read-
+ write" are the only defined values of permission. This
+ parameter may be used with ANY access-type and is
+ ALWAYS optional.
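+
+ A basic consistency check of these parameters might look like the
+ sketch below. It assumes the parameters were already parsed into a
+ dictionary keyed by lowercase name, and it leaves EXPIRATION
+ unchecked; the function name and messages are hypothetical.
+
```python
def check_general_params(params: dict) -> list:
    """Return a list of problems found; an empty list means none."""
    problems = []
    if not params.get("access-type"):
        problems.append("access-type is mandatory")
    size = params.get("size")
    if size is not None and not (size.isascii() and size.isdigit()):
        problems.append("size must be a count of octets")
    # PERMISSION is optional and defaults to "read".
    permission = params.get("permission", "read")
    if permission.lower() not in {"read", "read-write"}:
        problems.append("permission must be read or read-write")
    return problems
```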
+
+ The precise semantics of the access-types defined here are described
+ in the sections that follow.
+
+5.2.3.2. The 'ftp' and 'tftp' Access-Types
+
+ An access-type of FTP or TFTP indicates that the message body is
+ accessible as a file using the FTP [RFC-959] or TFTP [RFC-783]
+ protocols, respectively. For these access-types, the following
+ additional parameters are mandatory:
+
+ (1) NAME -- The name of the file that contains the actual
+ body data.
+
+ (2) SITE -- A machine from which the file may be obtained,
+ using the given protocol. This must be a fully
+ qualified domain name, not a nickname.
+
+ (3) Before any data are retrieved, using FTP, the user will
+ generally need to be asked to provide a login id and a
+ password for the machine named by the site parameter.
+ For security reasons, such an id and password are not
+ specified as content-type parameters, but must be
+ obtained from the user.
+
+ In addition, the following parameters are optional:
+
+ (1) DIRECTORY -- A directory from which the data named by
+ NAME should be retrieved.
+
+ (2) MODE -- A case-insensitive string indicating the mode
+ to be used when retrieving the information. The valid
+ values for access-type "TFTP" are "NETASCII", "OCTET",
+ and "MAIL", as specified by the TFTP protocol [RFC-
+ 783]. The valid values for access-type "FTP" are
+ "ASCII", "EBCDIC", "IMAGE", and "LOCALn" where "n" is a
+ decimal integer, typically 8. These correspond to the
+ representation types "A" "E" "I" and "L n" as specified
+ by the FTP protocol [RFC-959]. Note that "BINARY" and
+ "TENEX" are not valid values for MODE and that "OCTET"
+ or "IMAGE" or "LOCAL8" should be used instead. If MODE
+ is not specified, the default value is "NETASCII" for
+ TFTP and "ASCII" otherwise.
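+
+ The MODE rules, including the defaults, can be sketched as follows.
+ The function name is hypothetical, and any access-type other than
+ "TFTP" is assumed to follow the FTP rules.
+
```python
import re
from typing import Optional

TFTP_MODES = {"NETASCII", "OCTET", "MAIL"}
FTP_MODES = {"ASCII", "EBCDIC", "IMAGE"}


def resolve_mode(access_type: str, mode: Optional[str]) -> str:
    """Return the transfer mode to use, applying the default if unset."""
    is_tftp = access_type.upper() == "TFTP"
    if mode is None:
        return "NETASCII" if is_tftp else "ASCII"
    value = mode.upper()  # MODE is case-insensitive
    if is_tftp:
        valid = value in TFTP_MODES
    else:
        # "LOCALn" where n is a decimal integer, typically 8.
        valid = value in FTP_MODES or re.fullmatch(r"LOCAL[0-9]+", value) is not None
    if not valid:
        raise ValueError(f"invalid MODE {mode!r} for {access_type}")
    return value
```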
+
+5.2.3.3. The 'anon-ftp' Access-Type
+
+ The "anon-ftp" access-type is identical to the "ftp" access type,
+ except that the user need not be asked to provide a name and password
+ for the specified site. Instead, the ftp protocol will be used with
+ login "anonymous" and a password that corresponds to the user's mail
+ address.
+
+5.2.3.4. The 'local-file' Access-Type
+
+ An access-type of "local-file" indicates that the actual body is
+ accessible as a file on the local machine. Two additional parameters
+ are defined for this access type:
+
+ (1) NAME -- The name of the file that contains the actual
+ body data. This parameter is mandatory for the
+ "local-file" access-type.
+
+ (2) SITE -- A domain specifier for a machine or set of
+ machines that are known to have access to the data
+ file. This optional parameter is used to describe the
+ locality of reference for the data, that is, the site
+ or sites at which the file is expected to be visible.
+ Asterisks may be used for wildcard matching to a part
+ of a domain name, such as "*.bellcore.com", to indicate
+ a set of machines on which the data should be directly
+ visible, while a single asterisk may be used to
+ indicate a file that is expected to be universally
+ available, e.g., via a global file system.
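+
+ The SITE wildcard rule can be illustrated with a short matcher.
+ This is a sketch under stated assumptions: the name site_matches
+ is hypothetical, only "*" is treated as a wildcard, and host names
+ are compared case-insensitively.
+

```python
import re


def site_matches(site_pattern, hostname):
    """Return True if hostname falls within a local-file SITE pattern."""
    # Escape each literal piece and let every "*" match any run of characters.
    pieces = [re.escape(piece) for piece in site_pattern.lower().split("*")]
    return re.fullmatch(".*".join(pieces), hostname.lower()) is not None
```

+ With this sketch, "*.bellcore.com" matches "thumper.bellcore.com",
+ and a single "*" matches any host name.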
+
+5.2.3.5. The 'mail-server' Access-Type
+
+ The "mail-server" access-type indicates that the actual body is
+ available from a mail server. Two additional parameters are defined
+ for this access-type:
+
+ (1) SERVER -- The addr-spec of the mail server from which
+ the actual body data can be obtained. This parameter
+ is mandatory for the "mail-server" access-type.
+
+ (2) SUBJECT -- The subject that is to be used in the mail
+ that is sent to obtain the data. Note that keying mail
+ servers on Subject lines is NOT recommended, but such
+ mail servers are known to exist. This is an optional
+ parameter.
+
+ Because mail servers accept a variety of syntaxes, some of which are
+ multiline, the full command to be sent to a mail server is not
+ included as a parameter in the content-type header field. Instead,
+ it is provided as the "phantom body" when the media type is
+ "message/external-body" and the access-type is mail-server.
+
+ Note that MIME does not define a mail server syntax. Rather, it
+ allows the inclusion of arbitrary mail server commands in the phantom
+ body. An implementation must include the phantom body in the body of
+ the message it sends to the mail server address to retrieve the
+ relevant data.
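+
+ As a rough illustration, a user agent might build the request to
+ the mail server as follows. This is a sketch, not a prescribed
+ implementation: the function name and the sender address are
+ hypothetical, and the Comments field wording is only one possible
+ way to mark the request as automatically generated.
+

```python
from email.message import EmailMessage


def build_mail_server_request(server, phantom_body, sender, subject=None):
    """Build the request sent to a mail server for a mail-server access-type."""
    request = EmailMessage()
    request["From"] = sender
    request["To"] = server              # SERVER parameter (an addr-spec)
    if subject is not None:
        request["Subject"] = subject    # optional SUBJECT parameter
    # Flag the message as machine-generated (wording is an assumption).
    request["Comments"] = (
        "Generated automatically to resolve a MIME "
        "message/external-body reference"
    )
    # The phantom body is carried unchanged as the body of the request.
    request.set_content(phantom_body)
    return request
```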
+
+ Unlike other access-types, mail-server access is asynchronous and
+ will happen at an unpredictable time in the future. For this reason,
+ it is important that there be a mechanism by which the returned data
+ can be matched up with the original "message/external-body" entity.
+ MIME mail servers must use the same Content-ID field on the returned
+ message that was used in the original "message/external-body"
+ entities, to facilitate such matching.
+
+5.2.3.6. External-Body Security Issues
+
+ "Message/external-body" entities give rise to two important security
+ issues:
+
+ (1) Accessing data via a "message/external-body" reference
+ effectively results in the message recipient performing
+ an operation that was specified by the message
+ originator. It is therefore possible for the message
+ originator to trick a recipient into doing something
+ they would not have done otherwise. For example, an
+ originator could specify an action that attempts
+ retrieval of material that the recipient is not
+ authorized to obtain, causing the recipient to
+ unwittingly violate some security policy. For this
+ reason, user agents capable of resolving external
+ references must always take steps to describe the
+ action they are to take to the recipient and ask for
+ explicit permission prior to performing it.
+
+ The 'mail-server' access-type is particularly
+ vulnerable, in that it causes the recipient to send a
+ new message whose contents are specified by the
+ original message's originator. Given the potential for
+ abuse, any such request messages that are constructed
+ should contain a clear indication that they were
+ generated automatically (e.g. in a Comments: header
+ field) in an attempt to resolve a MIME
+ "message/external-body" reference.
+
+ (2) MIME will sometimes be used in environments that
+ provide some guarantee of message integrity and
+ authenticity. If present, such guarantees may apply
+ only to the actual direct content of messages -- they
+ may or may not apply to data accessed through MIME's
+ "message/external-body" mechanism. In particular, it
+ may be possible to subvert certain access mechanisms
+ even when the messaging system itself is secure.
+
+ It should be noted that this problem exists either with
+ or without the availability of MIME mechanisms. A
+ casual reference to an FTP site containing a document
+ in the text of a secure message brings up similar
+ issues -- the only difference is that MIME provides for
+ automatic retrieval of such material, and users may
+ place unwarranted trust in such automatic retrieval
+ mechanisms.
+
+5.2.3.7. Examples and Further Explanations
+
+ When the external-body mechanism is used in conjunction with the
+ "multipart/alternative" media type it extends the functionality of
+ "multipart/alternative" to include the case where the same entity is
+ provided in the same format but via different access mechanisms.
+ When this is done the originator of the message must order the parts
+ first in terms of preferred formats and then by preferred access
+ mechanisms. The recipient's viewer should then evaluate the list
+ both in terms of format and access mechanisms.
+
+ With the emerging possibility of very wide-area file systems, it
+ becomes very hard to know in advance the set of machines where a file
+ will and will not be accessible directly from the file system.
+ Therefore it may make sense to provide both a file name, to be tried
+ directly, and the name of one or more sites from which the file is
+ known to be accessible. An implementation can try to retrieve remote
+ files using FTP or any other protocol, using anonymous file retrieval
+ or prompting the user for the necessary name and password. If an
+ external body is accessible via multiple mechanisms, the sender may
+ include multiple entities of type "message/external-body" within the
+ body parts of an enclosing "multipart/alternative" entity.
+
+ However, the external-body mechanism is not intended to be limited to
+ file retrieval, as shown by the mail-server access-type. Beyond
+ this, one can imagine, for example, using a video server for external
+ references to video clips.
+
+ The embedded message header fields which appear in the body of the
+ "message/external-body" data must be used to declare the media type
+ of the external body if it is anything other than plain US-ASCII
+ text, since the external body does not have a header section to
+ declare its type. Similarly, any Content-transfer-encoding other
+ than "7bit" must also be declared here. Thus a complete
+ "message/external-body" message, referring to an object in PostScript
+ format, might look like this:
+
+ From: Whomever
+ To: Someone
+ Date: Whenever
+ Subject: whatever
+ MIME-Version: 1.0
+ Message-ID: <id1@host.com>
+ Content-Type: multipart/alternative; boundary=42
+ Content-ID: <id001@guppylake.bellcore.com>
+
+ --42
+ Content-Type: message/external-body; name="BodyFormats.ps";
+ site="thumper.bellcore.com"; mode="image";
+ access-type=ANON-FTP; directory="pub";
+ expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"
+
+ Content-type: application/postscript
+ Content-ID: <id42@guppylake.bellcore.com>
+
+ --42
+ Content-Type: message/external-body; access-type=local-file;
+ name="/u/nsb/writing/rfcs/RFC-MIME.ps";
+ site="thumper.bellcore.com";
+ expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"
+
+ Content-type: application/postscript
+ Content-ID: <id42@guppylake.bellcore.com>
+
+ --42
+ Content-Type: message/external-body;
+ access-type=mail-server;
+ server="listserv@bogus.bitnet";
+ expiration="Fri, 14 Jun 1991 19:13:14 -0400 (EDT)"
+
+ Content-type: application/postscript
+ Content-ID: <id42@guppylake.bellcore.com>
+
+ get RFC-MIME.DOC
+
+ --42--
+
+ Note that in the above examples, the default Content-transfer-
+ encoding of "7bit" is assumed for the external postscript data.
+
+ Like the "message/partial" type, the "message/external-body" media
+ type is intended to be transparent, that is, to convey the data type
+ in the external body rather than to convey a message with a body of
+ that type. Thus the headers on the outer and inner parts must be
+ merged using the same rules as for "message/partial". In particular,
+ this means that the Content-type and Subject fields are overridden,
+ but the From field is preserved.
+
+ Note that since the external bodies are not transported along with
+ the external body reference, they need not conform to transport
+ limitations that apply to the reference itself. In particular,
+ Internet mail transports may impose 7bit and line length limits, but
+ these do not automatically apply to binary external body references.
+ Thus a Content-Transfer-Encoding is not generally necessary, though
+ it is permitted.
+
+ Note that the body of a message of type "message/external-body" is
+ governed by the basic syntax for an RFC 822 message. In particular,
+ anything before the first consecutive pair of CRLFs is header
+ information, while anything after it is body information, which is
+ ignored for most access-types.
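+
+ The split between the outer parameters, the embedded header fields,
+ and the phantom body can be observed with a general-purpose parser.
+ The sketch below uses Python's standard email package, which
+ parses the body of any "message/*" entity as a nested message. The
+ sample text is adapted from the local-file example above.
+

```python
from email import message_from_string

# A single message/external-body entity: outer header, blank line,
# embedded header fields, blank line, then an (empty) phantom body.
RAW = (
    "Content-Type: message/external-body; access-type=local-file;\n"
    '        name="/u/nsb/writing/rfcs/RFC-MIME.ps";\n'
    '        site="thumper.bellcore.com"\n'
    "\n"
    "Content-type: application/postscript\n"
    "Content-ID: <id42@guppylake.bellcore.com>\n"
    "\n"
)

outer = message_from_string(RAW)
# The embedded header fields are exposed as a nested message object.
inner = outer.get_payload(0)

access_type = outer.get_param("access-type")    # how to fetch the data
file_name = outer.get_param("name")             # where the data lives
external_type = inner.get_content_type()        # type of the external data
```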
+
+5.2.4. Other Message Subtypes
+
+ MIME implementations must in general treat unrecognized subtypes of
+ "message" as being equivalent to "application/octet-stream".
+
+ Future subtypes of "message" intended for use with email should be
+ restricted to "7bit" encoding. A type other than "message" should be
+ used if restriction to "7bit" is not possible.
+
+6. Experimental Media Type Values
+
+ A media type value beginning with the characters "X-" is a private
+ value, to be used by consenting systems by mutual agreement. Any
+ format without a rigorous and public definition must be named with an
+ "X-" prefix, and publicly specified values shall never begin with
+ "X-". (Older versions of the widely used Andrew system use the "X-
+ BE2" name, so new systems should probably choose a different name.)
+
+ In general, the use of "X-" top-level types is strongly discouraged.
+ Implementors should invent subtypes of the existing types whenever
+ possible. In many cases, a subtype of "application" will be more
+ appropriate than a new top-level type.
+
+7. Summary
+
+ The five discrete media types provide a standardized
+ mechanism for tagging entities as "audio", "image", or several other
+ kinds of data. The composite "multipart" and "message" media types
+ allow mixing and hierarchical structuring of entities of different
+ types in a single message. A distinguished parameter syntax allows
+ further specification of data format details, particularly the
+ specification of alternate character sets. Additional optional
+ header fields provide mechanisms for certain extensions deemed
+ desirable by many implementors. Finally, a number of useful media
+ types are defined for general use by consenting user agents, notably
+ "message/partial" and "message/external-body".
+
+8. Security Considerations
+
+ Security issues are discussed in the context of the
+ "application/postscript" type, the "message/external-body" type, and
+ in RFC 2048. Implementors should pay special attention to the
+ security implications of any media types that can cause the remote
+ execution of any actions in the recipient's environment. In such
+ cases, the discussion of the "application/postscript" type may serve
+ as a model for considering other media types with remote execution
+ capabilities.
+
+9. Authors' Addresses
+
+ For more information, the authors of this document are best contacted
+ via Internet mail:
+
+ Ned Freed
+ Innosoft International, Inc.
+ 1050 East Garvey Avenue South
+ West Covina, CA 91790
+ USA
+
+ Phone: +1 818 919 3600
+ Fax: +1 818 919 3614
+ EMail: ned@innosoft.com
+
+
+ Nathaniel S. Borenstein
+ First Virtual Holdings
+ 25 Washington Avenue
+ Morristown, NJ 07960
+ USA
+
+ Phone: +1 201 540 8967
+ Fax: +1 201 993 3032
+ EMail: nsb@nsb.fv.com
+
+
+ MIME is a result of the work of the Internet Engineering Task Force
+ Working Group on RFC 822 Extensions. The chairman of that group,
+ Greg Vaudreuil, may be reached at:
+
+ Gregory M. Vaudreuil
+ Octel Network Services
+ 17080 Dallas Parkway
+ Dallas, TX 75248-1905
+ USA
+
+ EMail: Greg.Vaudreuil@Octel.Com
+
+Appendix A -- Collected Grammar
+
+ This appendix contains the complete BNF grammar for all the syntax
+ specified by this document.
+
+ By itself, however, this grammar is incomplete. It refers by name to
+ several syntax rules that are defined by RFC 822. Rather than
+ reproduce those definitions here, and risk unintentional differences
+ between the two, this document simply refers the reader to RFC 822
+ for the remaining definitions. Wherever a term is undefined, it
+ refers to the RFC 822 definition.
+
+ boundary := 0*69<bchars> bcharsnospace
+
+ bchars := bcharsnospace / " "
+
+ bcharsnospace := DIGIT / ALPHA / "'" / "(" / ")" /
+ "+" / "_" / "," / "-" / "." /
+ "/" / ":" / "=" / "?"
+
+ body-part := <"message" as defined in RFC 822, with all
+ header fields optional, not starting with the
+ specified dash-boundary, and with the
+ delimiter not occurring anywhere in the
+ body part. Note that the semantics of a
+ part differ from the semantics of a message,
+ as described in the text.>
+
+ close-delimiter := delimiter "--"
+
+ dash-boundary := "--" boundary
+ ; boundary taken from the value of
+ ; boundary parameter of the
+ ; Content-Type field.
+
+ delimiter := CRLF dash-boundary
+
+ discard-text := *(*text CRLF)
+ ; May be ignored or discarded.
+
+ encapsulation := delimiter transport-padding
+ CRLF body-part
+
+ epilogue := discard-text
+
+ multipart-body := [preamble CRLF]
+ dash-boundary transport-padding CRLF
+ body-part *encapsulation
+ close-delimiter transport-padding
+ [CRLF epilogue]
+
+ preamble := discard-text
+
+ transport-padding := *LWSP-char
+ ; Composers MUST NOT generate
+ ; non-zero length transport
+ ; padding, but receivers MUST
+ ; be able to handle padding
+ ; added by message transports.
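+
+ As an illustration of the "boundary" rule in this grammar, the
+ following sketch checks a candidate boundary parameter value. The
+ name is_valid_boundary is hypothetical; the character class is
+ transcribed from "bcharsnospace" above.
+

```python
import re

# bcharsnospace: DIGIT / ALPHA / "'" / "(" / ")" / "+" / "_" / "," /
#                "-" / "." / "/" / ":" / "=" / "?"
BCHARSNOSPACE = r"0-9A-Za-z'()+_,\-./:=?"

# boundary := 0*69<bchars> bcharsnospace
# i.e. 1 to 70 characters, the last of which is not a space.
BOUNDARY = re.compile("[" + BCHARSNOSPACE + " ]{0,69}[" + BCHARSNOSPACE + "]")


def is_valid_boundary(value):
    """Return True if value satisfies the boundary production."""
    return BOUNDARY.fullmatch(value) is not None
```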
+
--- /dev/null
+
+
+
+
+
+
+Network Working Group K. Moore
+Request for Comments: 2047 University of Tennessee
+Obsoletes: 1521, 1522, 1590 November 1996
+Category: Standards Track
+
+
+ MIME (Multipurpose Internet Mail Extensions) Part Three:
+ Message Header Extensions for Non-ASCII Text
+
+Status of this Memo
+
+ This document specifies an Internet standards track protocol for the
+ Internet community, and requests discussion and suggestions for
+ improvements. Please refer to the current edition of the "Internet
+ Official Protocol Standards" (STD 1) for the standardization state
+ and status of this protocol. Distribution of this memo is unlimited.
+
+Abstract
+
+ STD 11, RFC 822, defines a message representation protocol specifying
+ considerable detail about US-ASCII message headers, and leaves the
+ message content, or message body, as flat US-ASCII text. This set of
+ documents, collectively called the Multipurpose Internet Mail
+ Extensions, or MIME, redefines the format of messages to allow for
+
+ (1) textual message bodies in character sets other than US-ASCII,
+
+ (2) an extensible set of different formats for non-textual message
+ bodies,
+
+ (3) multi-part message bodies, and
+
+ (4) textual header information in character sets other than US-ASCII.
+
+ These documents are based on earlier work documented in RFC 934, STD
+ 11, and RFC 1049, but extend and revise them. Because RFC 822 said
+ so little about message bodies, these documents are largely
+ orthogonal to (rather than a revision of) RFC 822.
+
+ This particular document is the third document in the series. It
+ describes extensions to RFC 822 to allow non-US-ASCII text data in
+ Internet mail header fields.
+
+
+
+
+
+
+
+
+
+Moore Standards Track [Page 1]
+\f
+RFC 2047 Message Header Extensions November 1996
+
+
+ Other documents in this series include:
+
+ + RFC 2045, which specifies the various headers used to describe
+ the structure of MIME messages.
+
+ + RFC 2046, which defines the general structure of the MIME media
+ typing system and defines an initial set of media types,
+
+ + RFC 2048, which specifies various IANA registration procedures
+ for MIME-related facilities, and
+
+ + RFC 2049, which describes MIME conformance criteria and
+ provides some illustrative examples of MIME message formats,
+ acknowledgements, and the bibliography.
+
+ These documents are revisions of RFCs 1521, 1522, and 1590, which
+ themselves were revisions of RFCs 1341 and 1342. An appendix in RFC
+ 2049 describes differences and changes from previous versions.
+
+1. Introduction
+
+ RFC 2045 describes a mechanism for denoting textual body parts which
+ are coded in various character sets, as well as methods for encoding
+ such body parts as sequences of printable US-ASCII characters. This
+ memo describes similar techniques to allow the encoding of non-ASCII
+ text in various portions of an RFC 822 [2] message header, in a manner
+ which is unlikely to confuse existing message handling software.
+
+ Like the encoding techniques described in RFC 2045, the techniques
+ outlined here were designed to allow the use of non-ASCII characters
+ in message headers in a way which is unlikely to be disturbed by the
+ quirks of existing Internet mail handling programs. In particular,
+ some mail relaying programs are known to (a) delete some message
+ header fields while retaining others, (b) rearrange the order of
+ addresses in To or Cc fields, (c) rearrange the (vertical) order of
+ header fields, and/or (d) "wrap" message headers at different places
+ than those in the original message. In addition, some mail reading
+ programs are known to have difficulty correctly parsing message
+ headers which, while legal according to RFC 822, make use of
+ backslash-quoting to "hide" special characters such as "<", ",", or
+ ":", or which exploit other infrequently-used features of that
+ specification.
+
+ While it is unfortunate that these programs do not correctly
+ interpret RFC 822 headers, to "break" these programs would cause
+ severe operational problems for the Internet mail system. The
+ extensions described in this memo therefore do not rely on little-
+ used features of RFC 822.
+
+ Instead, certain sequences of "ordinary" printable ASCII characters
+ (known as "encoded-words") are reserved for use as encoded data. The
+ syntax of encoded-words is such that they are unlikely to
+ "accidentally" appear as normal text in message headers.
+ Furthermore, the characters used in encoded-words are restricted to
+ those which do not have special meanings in the context in which the
+ encoded-word appears.
+
+ Generally, an "encoded-word" is a sequence of printable ASCII
+ characters that begins with "=?", ends with "?=", and has two "?"s in
+ between. It specifies a character set and an encoding method, and
+ also includes the original text encoded as graphic ASCII characters,
+ according to the rules for that encoding method.
+
+ A mail composer that implements this specification will provide a
+ means of inputting non-ASCII text in header fields, but will
+ translate these fields (or appropriate portions of these fields) into
+ encoded-words before inserting them into the message header.
+
+ A mail reader that implements this specification will recognize
+ encoded-words when they appear in certain portions of the message
+ header. Instead of displaying the encoded-word "as is", it will
+ reverse the encoding and display the original text in the designated
+ character set.
+
+NOTES
+
+ This memo relies heavily on notation and terms defined in RFC 822 and
+ RFC 2045. In particular, the syntax for the ABNF used in this memo
+ is defined in RFC 822, and many of the terminal or nonterminal
+ symbols from RFC 822 are used in the grammar for the header
+ extensions defined here. Among the symbols defined in RFC 822 and
+ referenced in this memo are: 'addr-spec', 'atom', 'CHAR', 'comment',
+ 'CTLs', 'ctext', 'linear-white-space', 'phrase', 'quoted-pair',
+ 'quoted-string', 'SPACE', and 'word'. Successful implementation of
+ this protocol extension requires careful attention to the RFC 822
+ definitions of these terms.
+
+ When the term "ASCII" appears in this memo, it refers to the "7-Bit
+ American Standard Code for Information Interchange", ANSI X3.4-1986.
+ The MIME charset name for this character set is "US-ASCII". When not
+ specifically referring to the MIME charset name, this document uses
+ the term "ASCII", both for brevity and for consistency with RFC 822.
+ However, implementors are warned that the character set name must be
+ spelled "US-ASCII" in MIME message and body part headers.
+
+ This memo specifies a protocol for the representation of non-ASCII
+ text in message headers. It specifically DOES NOT define any
+ translation between "8-bit headers" and pure ASCII headers, nor is
+ any such translation assumed to be possible.
+
+2. Syntax of encoded-words
+
+ An 'encoded-word' is defined by the following ABNF grammar. The
+ notation of RFC 822 is used, with the exception that white space
+ characters MUST NOT appear between components of an 'encoded-word'.
+
+ encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
+
+ charset = token ; see section 3
+
+ encoding = token ; see section 4
+
+ token = 1*<Any CHAR except SPACE, CTLs, and especials>
+
+ especials = "(" / ")" / "<" / ">" / "@" / "," / ";" / ":" / "\" /
+ <"> / "/" / "[" / "]" / "?" / "." / "="
+
+ encoded-text = 1*<Any printable ASCII character other than "?"
+ or SPACE>
+ ; (but see "Use of encoded-words in message
+ ; headers", section 5)
+
+ Both 'encoding' and 'charset' names are case-independent. Thus the
+ charset name "ISO-8859-1" is equivalent to "iso-8859-1", and the
+ encoding named "Q" may be spelled either "Q" or "q".
+
+ An 'encoded-word' may not be more than 75 characters long, including
+ 'charset', 'encoding', 'encoded-text', and delimiters. If it is
+ desirable to encode more text than will fit in an 'encoded-word' of
+ 75 characters, multiple 'encoded-word's (separated by CRLF SPACE) may
+ be used.
+
+ While there is no limit to the length of a multiple-line header
+ field, each line of a header field that contains one or more
+ 'encoded-word's is limited to 76 characters.
+
+ The length restrictions are included both to ease interoperability
+ through internetwork mail gateways, and to impose a limit on the
+ amount of lookahead a header parser must employ (while looking for a
+ final ?= delimiter) before it can decide whether a token is an
+ "encoded-word" or something else.
+
+ IMPORTANT: 'encoded-word's are designed to be recognized as 'atom's
+ by an RFC 822 parser. As a consequence, unencoded white space
+ characters (such as SPACE and HTAB) are FORBIDDEN within an
+ 'encoded-word'. For example, the character sequence
+
+ =?iso-8859-1?q?this is some text?=
+
+ would be parsed as four 'atom's, rather than as a single 'atom' (by
+ an RFC 822 parser) or 'encoded-word' (by a parser which understands
+ 'encoded-words'). The correct way to encode the string "this is some
+ text" is to encode the SPACE characters as well, e.g.
+
+ =?iso-8859-1?q?this=20is=20some=20text?=
+
+ The characters which may appear in 'encoded-text' are further
+ restricted by the rules in section 5.
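+
+ The grammar and length limit of section 2 can be sketched as a
+ regular expression. This is an illustrative recognizer only: the
+ name parse_encoded_word is hypothetical, the backslash is counted
+ among the "especials" as in the RFC 822 specials, and the context
+ restrictions of section 5 are not applied.
+

```python
import re

# Characters excluded from 'token' (the especials, including backslash).
ESPECIALS = '()<>@,;:\\"/[]?.='

# token: printable ASCII (33..126) except the especials.
TOKEN_CHARS = "".join(chr(c) for c in range(33, 127) if chr(c) not in ESPECIALS)
TOKEN = "[" + re.escape(TOKEN_CHARS) + "]+"

# encoded-text: printable ASCII other than "?" (63); SPACE is excluded too.
ENCODED_TEXT = "[!->@-~]+"

ENCODED_WORD = re.compile(
    r"=\?(" + TOKEN + r")\?(" + TOKEN + r")\?(" + ENCODED_TEXT + r")\?="
)


def parse_encoded_word(word):
    """Return (charset, encoding, encoded_text), or None if not an encoded-word."""
    if len(word) > 75:          # total length limit, delimiters included
        return None
    match = ENCODED_WORD.fullmatch(word)
    if match is None:
        return None
    charset, encoding, text = match.groups()
    # Charset and encoding names are case-independent; normalize them.
    return charset.lower(), encoding.upper(), text
```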
+
+3. Character sets
+
+ The 'charset' portion of an 'encoded-word' specifies the character
+ set associated with the unencoded text. A 'charset' can be any of
+ the character set names allowed in a MIME "charset" parameter of a
+ "text/plain" body part, or any character set name registered with
+ IANA for use with the MIME text/plain content-type.
+
+ Some character sets use code-switching techniques to switch between
+ "ASCII mode" and other modes. If unencoded text in an 'encoded-word'
+ contains a sequence which causes the charset interpreter to switch
+ out of ASCII mode, it MUST contain additional control codes such that
+ ASCII mode is again selected at the end of the 'encoded-word'. (This
+ rule applies separately to each 'encoded-word', including adjacent
+ 'encoded-word's within a single header field.)
+
+ When there is a possibility of using more than one character set to
+ represent the text in an 'encoded-word', and in the absence of
+ private agreements between sender and recipients of a message, it is
+ recommended that members of the ISO-8859-* series be used in
+ preference to other character sets.
+
+4. Encodings
+
+ Initially, the legal values for "encoding" are "Q" and "B". These
+ encodings are described below. The "Q" encoding is recommended for
+ use when most of the characters to be encoded are in the ASCII
+ character set; otherwise, the "B" encoding should be used.
+ Nevertheless, a mail reader which claims to recognize 'encoded-word's
+ MUST be able to accept either encoding for any character set which it
+ supports.
+
+ Only a subset of the printable ASCII characters may be used in
+ 'encoded-text'. Space and tab characters are not allowed, so that
+ the beginning and end of an 'encoded-word' are obvious. The "?"
+ character is used within an 'encoded-word' to separate the various
+ portions of the 'encoded-word' from one another, and thus cannot
+ appear in the 'encoded-text' portion. Other characters are also
+ illegal in certain contexts. For example, an 'encoded-word' in a
+ 'phrase' preceding an address in a From header field may not contain
+ any of the "specials" defined in RFC 822. Finally, certain other
+ characters are disallowed in some contexts, to ensure reliability for
+ messages that pass through internetwork mail gateways.
+
+ The "B" encoding automatically meets these requirements. The "Q"
+ encoding allows a wide range of printable characters to be used in
+ non-critical locations in the message header (e.g., Subject), with
+ fewer characters available for use in other locations.
+
+4.1. The "B" encoding
+
+ The "B" encoding is identical to the "BASE64" encoding defined by RFC
+ 2045.
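+
+ A minimal sketch of the "B" encoding, using the base64 module of
+ the Python standard library. The helper names are hypothetical, and
+ no length limit or splitting is applied here.
+

```python
import base64


def b_encode_word(text, charset):
    """Wrap text in a single "B" encoded-word (no length check)."""
    encoded = base64.b64encode(text.encode(charset)).decode("ascii")
    return "=?" + charset + "?B?" + encoded + "?="


def b_decode_text(encoded_text, charset):
    """Decode the encoded-text portion of a "B" encoded-word."""
    return base64.b64decode(encoded_text).decode(charset)
```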
+
+4.2. The "Q" encoding
+
+ The "Q" encoding is similar to the "Quoted-Printable" content-
+ transfer-encoding defined in RFC 2045. It is designed to allow text
+ containing mostly ASCII characters to be decipherable on an ASCII
+ terminal without decoding.
+
+ (1) Any 8-bit value may be represented by a "=" followed by two
+ hexadecimal digits. For example, if the character set in use
+ were ISO-8859-1, the "=" character would thus be encoded as
+ "=3D", and a SPACE by "=20". (Upper case should be used for
+ hexadecimal digits "A" through "F".)
+
+ (2) The 8-bit hexadecimal value 20 (e.g., ISO-8859-1 SPACE) may be
+ represented as "_" (underscore, ASCII 95.). (This character may
+ not pass through some internetwork mail gateways, but its use
+ will greatly enhance readability of "Q" encoded data with mail
+ readers that do not support this encoding.) Note that the "_"
+ always represents hexadecimal 20, even if the SPACE character
+ occupies a different code position in the character set in use.
+
+ (3) 8-bit values which correspond to printable ASCII characters other
+ than "=", "?", and "_" (underscore), MAY be represented as those
+ characters. (But see section 5 for restrictions.) In
+ particular, SPACE and TAB MUST NOT be represented as themselves
+ within encoded words.
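+
+ The three rules above translate into a short decoder. This is a
+ sketch only: the name q_decode is hypothetical, and it returns raw
+ octets, leaving interpretation in the designated character set to
+ the caller.
+

```python
HEX_DIGITS = set("0123456789ABCDEFabcdef")


def q_decode(encoded_text):
    """Decode the encoded-text of a "Q" encoded-word into octets."""
    out = bytearray()
    i = 0
    while i < len(encoded_text):
        ch = encoded_text[i]
        if ch == "_":
            # Rule (2): underscore always stands for hexadecimal 20.
            out.append(0x20)
            i += 1
        elif ch == "=":
            # Rule (1): "=" followed by exactly two hexadecimal digits.
            pair = encoded_text[i + 1:i + 3]
            if len(pair) != 2 or not set(pair) <= HEX_DIGITS:
                raise ValueError("malformed '=' escape in encoded-text")
            out.append(int(pair, 16))
            i += 3
        else:
            # Rule (3): other printable ASCII characters stand for themselves.
            out.append(ord(ch))
            i += 1
    return bytes(out)
```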
+
+5. Use of encoded-words in message headers
+
+ An 'encoded-word' may appear in a message header or body part header
+ according to the following rules:
+
+(1) An 'encoded-word' may replace a 'text' token (as defined by RFC 822)
+ in any Subject or Comments header field, any extension message
+ header field, or any MIME body part field for which the field body
+ is defined as '*text'. An 'encoded-word' may also appear in any
+ user-defined ("X-") message or body part header field.
+
+ Ordinary ASCII text and 'encoded-word's may appear together in the
+ same header field. However, an 'encoded-word' that appears in a
+ header field defined as '*text' MUST be separated from any adjacent
+ 'encoded-word' or 'text' by 'linear-white-space'.
+
+(2) An 'encoded-word' may appear within a 'comment' delimited by "(" and
+ ")", i.e., wherever a 'ctext' is allowed. More precisely, the RFC
+ 822 ABNF definition for 'comment' is amended as follows:
+
+ comment = "(" *(ctext / quoted-pair / comment / encoded-word) ")"
+
+ A "Q"-encoded 'encoded-word' which appears in a 'comment' MUST NOT
+ contain the characters "(", ")" or "\". An 'encoded-word' that
+ appears in a 'comment' MUST be separated from any adjacent
+ 'encoded-word' or 'ctext' by 'linear-white-space'.
+
+ It is important to note that 'comment's are only recognized inside
+ "structured" field bodies. In fields whose bodies are defined as
+ '*text', "(" and ")" are treated as ordinary characters rather than
+ comment delimiters, and rule (1) of this section applies. (See RFC
+ 822, sections 3.1.2 and 3.1.3)
+
+(3) An 'encoded-word' may replace a 'word' entity within a 'phrase', for
+ example, one that precedes an address in a From, To, or Cc header. The ABNF
+ definition for 'phrase' from RFC 822 thus becomes:
+
+ phrase = 1*( encoded-word / word )
+
+ In this case the set of characters that may be used in a "Q"-encoded
+ 'encoded-word' is restricted to: <upper and lower case ASCII
+ letters, decimal digits, "!", "*", "+", "-", "/", "=", and "_"
+ (underscore, ASCII 95.)>. An 'encoded-word' that appears within a
+ 'phrase' MUST be separated from any adjacent 'word', 'text' or
+ 'special' by 'linear-white-space'.
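+
+ The restricted character set of rule (3) can be sketched as a "Q"
+ encoder for use in a 'phrase'. The function name is hypothetical.
+ The sketch assumes a charset in which SPACE is octet 20
+ hexadecimal, and it does not enforce the 75-character limit.
+

```python
import string

# Octets that may appear unencoded in a 'phrase': letters, digits,
# and "!", "*", "+", "-", "/". ("=" and "_" are used by the encoding itself.)
PHRASE_SAFE = set((string.ascii_letters + string.digits + "!*+-/").encode("ascii"))


def q_encode_for_phrase(text, charset):
    """Return text as one "Q" encoded-word that is safe inside a 'phrase'."""
    pieces = []
    for octet in text.encode(charset):
        if octet == 0x20:
            pieces.append("_")              # readable stand-in for SPACE
        elif octet in PHRASE_SAFE:
            pieces.append(chr(octet))
        else:
            pieces.append("=%02X" % octet)  # upper-case hexadecimal digits
    return "=?" + charset + "?Q?" + "".join(pieces) + "?="
```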
+
+ These are the ONLY locations where an 'encoded-word' may appear. In
+ particular:
+
+ + An 'encoded-word' MUST NOT appear in any portion of an 'addr-spec'.
+
+ + An 'encoded-word' MUST NOT appear within a 'quoted-string'.
+
+ + An 'encoded-word' MUST NOT be used in a Received header field.
+
+ + An 'encoded-word' MUST NOT be used in a parameter of a MIME
+ Content-Type or Content-Disposition field, or in any structured
+ field body except within a 'comment' or 'phrase'.
+
+ The 'encoded-text' in an 'encoded-word' must be self-contained;
+ 'encoded-text' MUST NOT be continued from one 'encoded-word' to
+ another. This implies that the 'encoded-text' portion of a "B"
+ 'encoded-word' will be a multiple of 4 characters long; for a "Q"
+ 'encoded-word', any "=" character that appears in the 'encoded-text'
+ portion will be followed by two hexadecimal characters.
+
+ Each 'encoded-word' MUST encode an integral number of octets. The
+ 'encoded-text' in each 'encoded-word' must be well-formed according
+ to the encoding specified; the 'encoded-text' may not be continued in
+ the next 'encoded-word'. (For example, "=?charset?Q?=?=
+ =?charset?Q?AB?=" would be illegal, because the two hex digits "AB"
+ must follow the "=" in the same 'encoded-word'.)
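+
+ The two consequences stated above (a "B" 'encoded-text' is a multiple
+ of 4 characters long, and every "=" in a "Q" 'encoded-text' is
+ followed by two hexadecimal characters) can be sketched as a
+ well-formedness test. This is a minimal Python illustration under
+ stated assumptions: the function name is hypothetical, and lower case
+ hexadecimal digits are accepted here as a leniency that this memo
+ does not itself require.

```python
import re

# "B": zero or more complete groups of 4 base64 characters, where only
# the final group may carry "=" padding.
_B_TEXT = re.compile(
    r"(?:[A-Za-z0-9+/]{4})*(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?"
)

# "Q": printable ASCII other than "=" and "?", or "=" followed by exactly
# two hexadecimal digits.
_Q_TEXT = re.compile(r"(?:[!-<>@-~]|=[0-9A-Fa-f]{2})*")


def is_self_contained(encoding: str, encoded_text: str) -> bool:
    """Return True if encoded_text is complete for the given encoding."""
    enc = encoding.upper()
    if enc == "B":
        return _B_TEXT.fullmatch(encoded_text) is not None
    if enc == "Q":
        return _Q_TEXT.fullmatch(encoded_text) is not None
    return False  # unknown encoding: nothing can be said
```

+ Applied to the illegal example above, the first 'encoded-text' ("=")
+ fails the "Q" test because no hexadecimal digits follow the "=".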
+
+ Each 'encoded-word' MUST represent an integral number of characters.
+ A multi-octet character may not be split across adjacent 'encoded-
+ word's.
+
+ Only printable and white space character data should be encoded using
+ this scheme. However, since these encoding schemes allow the
+ encoding of arbitrary octet values, mail readers that implement this
+ decoding should also ensure that display of the decoded data on the
+ recipient's terminal will not cause unwanted side-effects.
+
+ Use of these methods to encode non-textual data (e.g., pictures or
+ sounds) is not defined by this memo. Use of 'encoded-word's to
+ represent strings of purely ASCII characters is allowed, but
+ discouraged. In rare cases it may be necessary to encode ordinary
+ text that looks like an 'encoded-word'.
+
+
+
+
+
+
+
+
+
+Moore Standards Track [Page 8]
+\f
+RFC 2047 Message Header Extensions November 1996
+
+
+6. Support of 'encoded-word's by mail readers
+
+6.1. Recognition of 'encoded-word's in message headers
+
+ A mail reader must parse the message and body part headers according
+ to the rules in RFC 822 to correctly recognize 'encoded-word's.
+
+ 'encoded-word's are to be recognized as follows:
+
+ (1) Any message or body part header field defined as '*text', or any
+ user-defined header field, should be parsed as follows: Beginning
+ at the start of the field-body and immediately following each
+ occurrence of 'linear-white-space', each sequence of up to 75
+ printable characters (not containing any 'linear-white-space')
+ should be examined to see if it is an 'encoded-word' according to
+ the syntax rules in section 2. Any other sequence of printable
+ characters should be treated as ordinary ASCII text.
+
+ (2) Any header field not defined as '*text' should be parsed
+ according to the syntax rules for that header field. However,
+ any 'word' that appears within a 'phrase' should be treated as an
+ 'encoded-word' if it meets the syntax rules in section 2.
+ Otherwise it should be treated as an ordinary 'word'.
+
+ (3) Within a 'comment', any sequence of up to 75 printable characters
+ (not containing 'linear-white-space'), that meets the syntax
+ rules in section 2, should be treated as an 'encoded-word'.
+ Otherwise it should be treated as normal comment text.
+
+ (4) A MIME-Version header field is NOT required to be present for
+ 'encoded-word's to be interpreted according to this
+ specification. One reason for this is that the mail reader is
+ not expected to parse the entire message header before displaying
+ lines that may contain 'encoded-word's.
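+
+ Rule (1) above can be sketched as follows for a field defined as
+ '*text'. This Python fragment is an illustration, not a normative
+ algorithm: the regular expression is a simplified reading of the
+ syntax in section 2, the names are hypothetical, and any run of
+ SPACE, HTAB, CR or LF is treated as 'linear-white-space'.

```python
import re
from typing import List

# 'token' characters assumed for charset and encoding: printable ASCII
# without SPACE and without the 'especials' of section 2.
_TOKEN = r"[!#$%&'*+\-0-9A-Z^_`a-z{|}~]+"

# encoded-word = "=?" charset "?" encoding "?" encoded-text "?="
# 'encoded-text' is printable ASCII other than "?" or SPACE.
_ENCODED_WORD = re.compile(
    r"=\?(" + _TOKEN + r")\?(" + _TOKEN + r")\?([!->@-~]+)\?="
)


def find_encoded_words(field_body: str) -> List[str]:
    """Return the whitespace-delimited tokens that look like encoded-words."""
    words = []
    for chunk in field_body.split():
        # Only sequences of up to 75 printable characters are examined.
        if len(chunk) <= 75 and _ENCODED_WORD.fullmatch(chunk):
            words.append(chunk)
    return words
```

+ Because each candidate must be delimited by white space, a token such
+ as "(=?ISO-8859-1?Q?a?=)" is not recognized in a '*text' field.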
+
+6.2. Display of 'encoded-word's
+
+ Any 'encoded-word's so recognized are decoded, and if possible, the
+ resulting unencoded text is displayed in the original character set.
+
+ NOTE: Decoding and display of encoded-words occurs *after* a
+ structured field body is parsed into tokens. It is therefore
+ possible to hide 'special' characters in encoded-words which, when
+ displayed, will be indistinguishable from 'special' characters in the
+ surrounding text. For this and other reasons, it is NOT generally
+ possible to translate a message header containing 'encoded-word's to
+ an unencoded form which can be parsed by an RFC 822 mail reader.
+
+
+
+
+Moore Standards Track [Page 9]
+\f
+RFC 2047 Message Header Extensions November 1996
+
+
+ When displaying a particular header field that contains multiple
+ 'encoded-word's, any 'linear-white-space' that separates a pair of
+ adjacent 'encoded-word's is ignored. (This is to allow the use of
+ multiple 'encoded-word's to represent long strings of unencoded text,
+ without having to separate 'encoded-word's where spaces occur in the
+ unencoded text.)
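+
+ The display rule above can be sketched for a '*text' field body as
+ follows. This is a hedged Python illustration with hypothetical
+ names; it uses a loose pattern for 'encoded-word', supports only the
+ "B" and "Q" encodings, and does not guard against malformed input or
+ unknown character sets.

```python
import base64
import re

_EW = re.compile(r"=\?([^?\s]+)\?([BbQq])\?([^?\s]+)\?=")


def _decode_word(charset: str, encoding: str, text: str) -> str:
    if encoding.upper() == "B":
        raw = base64.b64decode(text)
    else:
        # "Q": "_" stands for SPACE, "=XX" for the octet with hex value XX.
        raw = re.sub(
            rb"=([0-9A-Fa-f]{2})",
            lambda m: bytes([int(m.group(1), 16)]),
            text.replace("_", " ").encode("ascii"),
        )
    return raw.decode(charset)


def display_text(field_body: str) -> str:
    """Decode encoded-words, dropping white space between adjacent ones."""
    out = []
    pending_ws = ""
    prev_encoded = False
    for part in re.split(r"([ \t\r\n]+)", field_body):
        if part == "":
            continue
        if part.strip(" \t\r\n") == "":
            pending_ws = part  # hold white space until the next token is known
            continue
        m = _EW.fullmatch(part)
        if m:
            if not prev_encoded:
                out.append(pending_ws)
            out.append(_decode_word(*m.groups()))
        else:
            out.append(pending_ws)
            out.append(part)
        prev_encoded = m is not None
        pending_ws = ""
    out.append(pending_ws)
    return "".join(out)
```

+ White space is emitted only when at least one of its neighbours is
+ ordinary text, so a long string split across several 'encoded-word's
+ is displayed without extra spaces.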
+
+ In the event other encodings are defined in the future, and the mail
+ reader does not support the encoding used, it may either (a) display
+ the 'encoded-word' as ordinary text, or (b) substitute an appropriate
+ message indicating that the text could not be decoded.
+
+ If the mail reader does not support the character set used, it may
+ (a) display the 'encoded-word' as ordinary text (i.e., as it appears
+ in the header), (b) make a "best effort" to display using such
+ characters as are available, or (c) substitute an appropriate message
+ indicating that the decoded text could not be displayed.
+
+ If the character set being used employs code-switching techniques,
+ display of the encoded text implicitly begins in "ASCII mode". In
+ addition, the mail reader must ensure that the output device is once
+ again in "ASCII mode" after the 'encoded-word' is displayed.
+
+6.3. Mail reader handling of incorrectly formed 'encoded-word's
+
+ It is possible that an 'encoded-word' that is legal according to the
+ syntax defined in section 2, is incorrectly formed according to the
+ rules for the encoding being used. For example:
+
+ (1) An 'encoded-word' which contains characters which are not legal
+ for a particular encoding (for example, a "-" in the "B"
+ encoding, or a SPACE or HTAB in either the "B" or "Q" encoding),
+ is incorrectly formed.
+
+ (2) Any 'encoded-word' which encodes a non-integral number of
+ characters or octets is incorrectly formed.
+
+ A mail reader need not attempt to display the text associated with an
+ 'encoded-word' that is incorrectly formed. However, a mail reader
+ MUST NOT prevent the display or handling of a message because an
+ 'encoded-word' is incorrectly formed.
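+
+ One way to meet this requirement is to fall back to the undecoded
+ text whenever decoding fails. The Python sketch below handles only
+ the "B" encoding and uses a hypothetical function name; displaying
+ the raw 'encoded-word' is one permissible choice, not the only one.

```python
import base64


def display_b_word(raw_word: str) -> str:
    """Decode a "B" encoded-word, or return it unchanged if malformed."""
    try:
        _, charset, encoding, text, _ = raw_word.split("?")
        if encoding.upper() != "B":
            return raw_word
        # validate=True rejects characters outside the base64 alphabet
        # (for example "-") instead of silently skipping them.
        return base64.b64decode(text, validate=True).decode(charset)
    except (ValueError, LookupError):
        # ValueError covers bad base64 and undecodable octets;
        # LookupError covers an unknown character set.
        return raw_word
```

+ In every failure case the function returns normally, so the rest of
+ the message can still be displayed and handled.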
+
+7. Conformance
+
+ A mail composing program claiming compliance with this specification
+ MUST ensure that any string of non-white-space printable ASCII
+ characters within a '*text' or '*ctext' that begins with "=?" and
+ ends with "?=" be a valid 'encoded-word'. ("begins" means: at the
+
+
+
+Moore Standards Track [Page 10]
+\f
+RFC 2047 Message Header Extensions November 1996
+
+
+ start of the field-body, immediately following 'linear-white-space',
+ or immediately following a "(" for an 'encoded-word' within '*ctext';
+ "ends" means: at the end of the field-body, immediately preceding
+ 'linear-white-space', or immediately preceding a ")" for an
+ 'encoded-word' within '*ctext'.) In addition, any 'word' within a
+ 'phrase' that begins with "=?" and ends with "?=" must be a valid
+ 'encoded-word'.
+
+ A mail reading program claiming compliance with this specification
+ must be able to distinguish 'encoded-word's from 'text', 'ctext', or
+ 'word's, according to the rules in section 6, anytime they appear in
+ appropriate places in message headers. It must support both the "B"
+ and "Q" encodings for any character set which it supports. The
+ program must be able to display the unencoded text if the character
+ set is "US-ASCII". For the ISO-8859-* character sets, the mail
+ reading program must at least be able to display the characters which
+ are also in the ASCII set.
+
+8. Examples
+
+ The following are examples of message headers containing 'encoded-
+ word's:
+
+ From: =?US-ASCII?Q?Keith_Moore?= <moore@cs.utk.edu>
+ To: =?ISO-8859-1?Q?Keld_J=F8rn_Simonsen?= <keld@dkuug.dk>
+ CC: =?ISO-8859-1?Q?Andr=E9?= Pirard <PIRARD@vm1.ulg.ac.be>
+ Subject: =?ISO-8859-1?B?SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=?=
+ =?ISO-8859-2?B?dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==?=
+
+ Note: In the first 'encoded-word' of the Subject field above, the
+ last "=" at the end of the 'encoded-text' is necessary because each
+ 'encoded-word' must be self-contained (the "=" character completes a
+ group of 4 base64 characters representing 2 octets). An additional
+ octet could have been encoded in the first 'encoded-word' (so that
+ the encoded-word would contain an exact multiple of 3 encoded
+ octets), except that the second 'encoded-word' uses a different
+ 'charset' than the first one.
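+
+ The padding described in the note can be observed by decoding the two
+ 'encoded-text' portions of the Subject field separately. This short
+ Python illustration uses only the standard base64 module; the
+ variable names are arbitrary.

```python
import base64

# 'encoded-text' of the first and second 'encoded-word' in the Subject field.
first = base64.b64decode("SWYgeW91IGNhbiByZWFkIHRoaXMgeW8=")
second = base64.b64decode("dSB1bmRlcnN0YW5kIHRoZSBleGFtcGxlLg==")

print(first)   # b'If you can read this yo'
print(second)  # b'u understand the example.'

# The first word carries 23 octets: seven full groups of 3 plus 2 left
# over, so its last base64 group needs exactly one "=" of padding.
print(len(first) % 3)  # 2
```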
+
+ From: =?ISO-8859-1?Q?Olle_J=E4rnefors?= <ojarnef@admin.kth.se>
+ To: ietf-822@dimacs.rutgers.edu, ojarnef@admin.kth.se
+ Subject: Time for ISO 10646?
+
+ To: Dave Crocker <dcrocker@mordor.stanford.edu>
+ Cc: ietf-822@dimacs.rutgers.edu, paf@comsol.se
+ From: =?ISO-8859-1?Q?Patrik_F=E4ltstr=F6m?= <paf@nada.kth.se>
+ Subject: Re: RFC-HDR care and feeding
+
+
+
+
+
+Moore Standards Track [Page 11]
+\f
+RFC 2047 Message Header Extensions November 1996
+
+
+ From: Nathaniel Borenstein <nsb@thumper.bellcore.com>
+ (=?iso-8859-8?b?7eXs+SDv4SDp7Oj08A==?=)
+ To: Greg Vaudreuil <gvaudre@NRI.Reston.VA.US>, Ned Freed
+ <ned@innosoft.com>, Keith Moore <moore@cs.utk.edu>
+ Subject: Test of new header generator
+ MIME-Version: 1.0
+ Content-type: text/plain; charset=ISO-8859-1
+
+ The following examples illustrate how text containing 'encoded-word's
+ that appears in a structured field body is displayed. The rules are slightly
+ different for fields defined as '*text' because "(" and ")" are not
+ recognized as 'comment' delimiters. [Section 5, paragraph (1)].
+
+ In each of the following examples, if the same sequence were to occur
+ in a '*text' field, the "displayed as" form would NOT be treated as
+ encoded words, but be identical to the "encoded form". This is
+ because each of the encoded-words in the following examples is
+ adjacent to a "(" or ")" character.
+
+ encoded form displayed as
+ ---------------------------------------------------------------------
+ (=?ISO-8859-1?Q?a?=) (a)
+
+ (=?ISO-8859-1?Q?a?= b) (a b)
+
+ Within a 'comment', white space MUST appear between an
+ 'encoded-word' and surrounding text. [Section 5,
+ paragraph (2)]. However, white space is not needed between
+ the initial "(" that begins the 'comment', and the
+ 'encoded-word'.
+
+
+ (=?ISO-8859-1?Q?a?= =?ISO-8859-1?Q?b?=) (ab)
+
+ White space between adjacent 'encoded-word's is not
+ displayed.
+
+ (=?ISO-8859-1?Q?a?=  =?ISO-8859-1?Q?b?=) (ab)
+
+ Even multiple SPACEs between 'encoded-word's are ignored
+ for the purpose of display.
+
+ (=?ISO-8859-1?Q?a?=
+ =?ISO-8859-1?Q?b?=) (ab)
+
+ Any amount of linear-white-space between 'encoded-word's,
+ even if it includes a CRLF followed by one or more SPACEs,
+ is ignored for the purposes of display.
+
+
+
+Moore Standards Track [Page 12]
+\f
+RFC 2047 Message Header Extensions November 1996
+
+
+ (=?ISO-8859-1?Q?a_b?=) (a b)
+
+ In order to cause a SPACE to be displayed within a portion
+ of encoded text, the SPACE MUST be encoded as part of the
+ 'encoded-word'.
+
+ (=?ISO-8859-1?Q?a?= =?ISO-8859-2?Q?_b?=) (a b)
+
+ In order to cause a SPACE to be displayed between two strings
+ of encoded text, the SPACE MAY be encoded as part of one of
+ the 'encoded-word's.
+
+9. References
+
+ [RFC 822] Crocker, D., "Standard for the Format of ARPA Internet Text
+ Messages", STD 11, RFC 822, UDEL, August 1982.
+
+ [RFC 2049] Borenstein, N., and N. Freed, "Multipurpose Internet Mail
+ Extensions (MIME) Part Five: Conformance Criteria and Examples",
+ RFC 2049, November 1996.
+
+ [RFC 2045] Borenstein, N., and N. Freed, "Multipurpose Internet Mail
+ Extensions (MIME) Part One: Format of Internet Message Bodies",
+ RFC 2045, November 1996.
+
+ [RFC 2046] Borenstein N., and N. Freed, "Multipurpose Internet Mail
+ Extensions (MIME) Part Two: Media Types", RFC 2046,
+ November 1996.
+
+ [RFC 2048] Freed, N., Klensin, J., and J. Postel, "Multipurpose
+ Internet Mail Extensions (MIME) Part Four: Registration
+ Procedures", RFC 2048, November 1996.
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Moore Standards Track [Page 13]
+\f
+RFC 2047 Message Header Extensions November 1996
+
+
+10. Security Considerations
+
+ Security issues are not discussed in this memo.
+
+11. Acknowledgements
+
+ The author wishes to thank Nathaniel Borenstein, Issac Chan, Lutz
+ Donnerhacke, Paul Eggert, Ned Freed, Andreas M. Kirchwitz, Olle
+ Jarnefors, Mike Rosin, Yutaka Sato, Bart Schaefer, and Kazuhiko
+ Yamamoto, for their helpful advice, insightful comments, and
+ illuminating questions in response to earlier versions of this
+ specification.
+
+12. Author's Address
+
+ Keith Moore
+ University of Tennessee
+ 107 Ayres Hall
+ Knoxville TN 37996-1301
+
+ EMail: moore@cs.utk.edu
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Moore Standards Track [Page 14]
+\f
+RFC 2047 Message Header Extensions November 1996
+
+
+Appendix - changes since RFC 1522 (in no particular order)
+
+ + explicitly state that the MIME-Version is not required to use
+ 'encoded-word's.
+
+ + add explicit note that SPACEs and TABs are not allowed within
+ 'encoded-word's, explaining that an 'encoded-word' must look like an
+ 'atom' to an RFC822 parser.
+
+ + add examples from Olle Jarnefors (thanks!) which illustrate how
+ encoded-words with adjacent linear-white-space are displayed.
+
+ + explicitly list terms defined in RFC822 and referenced in this memo
+
+ + fix transcription typos that caused one or two lines and a couple of
+ characters to disappear in the resulting text, due to nroff quirks.
+
+ + clarify that encoded-words are allowed in '*text' fields in both
+ RFC822 headers and MIME body part headers, but NOT as parameter
+ values.
+
+ + clarify the requirement to switch back to ASCII within the encoded
+ portion of an 'encoded-word', for any charset that uses code switching
+ sequences.
+
+ + add a note about 'encoded-word's being delimited by "(" and ")"
+ within a comment, but not in a *text (how bizarre!).
+
+ + fix the Andre Pirard example to get rid of the trailing "_" after
+ the =E9. (no longer needed post-1342).
+
+ + clarification: an 'encoded-word' may appear immediately following
+ the initial "(" or immediately before the final ")" that delimits a
+ comment, not just adjacent to "(" and ")" *within* *ctext.
+
+ + add a note to explain that a "B" 'encoded-word' will always have a
+ multiple of 4 characters in the 'encoded-text' portion.
+
+ + add note about the "=" in the examples
+
+ + note that processing of 'encoded-word's occurs *after* parsing, and
+ some of the implications thereof.
+
+ + explicitly state that you can't expect to translate between
+ 1522 and either vanilla 822 or so-called "8-bit headers".
+
+ + explicitly state that 'encoded-word's are not valid within a
+ 'quoted-string'.
+
+
+
+Moore Standards Track [Page 15]
+\f
--- /dev/null
+
+
+
+
+
+
+Network Working Group N. Freed
+Request for Comments: 2048 Innosoft
+BCP: 13 J. Klensin
+Obsoletes: 1521, 1522, 1590 MCI
+Category: Best Current Practice J. Postel
+ ISI
+ November 1996
+
+
+ Multipurpose Internet Mail Extensions
+ (MIME) Part Four:
+ Registration Procedures
+
+Status of this Memo
+
+ This document specifies an Internet Best Current Practices for the
+ Internet Community, and requests discussion and suggestions for
+ improvements. Distribution of this memo is unlimited.
+
+Abstract
+
+ STD 11, RFC 822, defines a message representation protocol specifying
+ considerable detail about US-ASCII message headers, and leaves the
+ message content, or message body, as flat US-ASCII text. This set of
+ documents, collectively called the Multipurpose Internet Mail
+ Extensions, or MIME, redefines the format of messages to allow for
+
+ (1) textual message bodies in character sets other than
+ US-ASCII,
+
+ (2) an extensible set of different formats for non-textual
+ message bodies,
+
+ (3) multi-part message bodies, and
+
+ (4) textual header information in character sets other than
+ US-ASCII.
+
+ These documents are based on earlier work documented in RFC 934, STD
+ 11, and RFC 1049, but extend and revise them. Because RFC 822 said
+ so little about message bodies, these documents are largely
+ orthogonal to (rather than a revision of) RFC 822.
+
+
+
+
+
+
+
+
+
+Freed, et. al. Best Current Practice [Page 1]
+\f
+RFC 2048 MIME Registration Procedures November 1996
+
+
+ This fourth document, RFC 2048, specifies various IANA registration
+ procedures for the following MIME facilities:
+
+ (1) media types,
+
+ (2) external body access types,
+
+ (3) content-transfer-encodings.
+
+ Registration of character sets for use in MIME is covered elsewhere
+ and is no longer addressed by this document.
+
+ These documents are revisions of RFCs 1521 and 1522, which themselves
+ were revisions of RFCs 1341 and 1342. An appendix in RFC 2049
+ describes differences and changes from previous versions.
+
+Table of Contents
+
+ 1. Introduction ......................................... 3
+ 2. Media Type Registration .............................. 4
+ 2.1 Registration Trees and Subtype Names ................ 4
+ 2.1.1 IETF Tree ......................................... 4
+ 2.1.2 Vendor Tree ....................................... 4
+ 2.1.3 Personal or Vanity Tree ........................... 5
+ 2.1.4 Special `x.' Tree ................................. 5
+ 2.1.5 Additional Registration Trees ..................... 6
+ 2.2 Registration Requirements ........................... 6
+ 2.2.1 Functionality Requirement ......................... 6
+ 2.2.2 Naming Requirements ............................... 6
+ 2.2.3 Parameter Requirements ............................ 7
+ 2.2.4 Canonicalization and Format Requirements .......... 7
+ 2.2.5 Interchange Recommendations ....................... 8
+ 2.2.6 Security Requirements ............................. 8
+ 2.2.7 Usage and Implementation Non-requirements ......... 9
+ 2.2.8 Publication Requirements .......................... 10
+ 2.2.9 Additional Information ............................ 10
+ 2.3 Registration Procedure .............................. 11
+ 2.3.1 Present the Media Type to the Community for Review 11
+ 2.3.2 IESG Approval ..................................... 12
+ 2.3.3 IANA Registration ................................. 12
+ 2.4 Comments on Media Type Registrations ................ 12
+ 2.5 Location of Registered Media Type List .............. 12
+ 2.6 IANA Procedures for Registering Media Types ......... 12
+ 2.7 Change Control ...................................... 13
+ 2.8 Registration Template ............................... 14
+ 3. External Body Access Types ........................... 14
+ 3.1 Registration Requirements ........................... 15
+ 3.1.1 Naming Requirements ............................... 15
+
+
+
+Freed, et. al. Best Current Practice [Page 2]
+\f
+RFC 2048 MIME Registration Procedures November 1996
+
+
+ 3.1.2 Mechanism Specification Requirements .............. 15
+ 3.1.3 Publication Requirements .......................... 15
+ 3.1.4 Security Requirements ............................. 15
+ 3.2 Registration Procedure .............................. 15
+ 3.2.1 Present the Access Type to the Community .......... 16
+ 3.2.2 Access Type Reviewer .............................. 16
+ 3.2.3 IANA Registration ................................. 16
+ 3.3 Location of Registered Access Type List ............. 16
+ 3.4 IANA Procedures for Registering Access Types ........ 16
+ 4. Transfer Encodings ................................... 17
+ 4.1 Transfer Encoding Requirements ...................... 17
+ 4.1.1 Naming Requirements ............................... 17
+ 4.1.2 Algorithm Specification Requirements .............. 18
+ 4.1.3 Input Domain Requirements ......................... 18
+ 4.1.4 Output Range Requirements ......................... 18
+ 4.1.5 Data Integrity and Generality Requirements ........ 18
+ 4.1.6 New Functionality Requirements .................... 18
+ 4.2 Transfer Encoding Definition Procedure .............. 19
+ 4.3 IANA Procedures for Transfer Encoding Registration... 19
+ 4.4 Location of Registered Transfer Encodings List ...... 19
+ 5. Authors' Addresses ................................... 20
+ A. Grandfathered Media Types ............................ 21
+
+1. Introduction
+
+ Recent Internet protocols have been carefully designed to be easily
+ extensible in certain areas. In particular, MIME [RFC 2045] is an
+ open-ended framework and can accommodate additional object types,
+ character sets, and access methods without any changes to the basic
+ protocol. A registration process is needed, however, to ensure that
+ the set of such values is developed in an orderly, well-specified,
+ and public manner.
+
+ This document defines registration procedures which use the Internet
+ Assigned Numbers Authority (IANA) as a central registry for such
+ values.
+
+ Historical Note: The registration process for media types was
+ initially defined in the context of the asynchronous Internet mail
+ environment. In this mail environment there is a need to limit the
+ number of possible media types to increase the likelihood of
+ interoperability when the capabilities of the remote mail system are
+ not known. As media types are used in new environments, where the
+ proliferation of media types is not a hindrance to interoperability,
+ the original procedure was excessively restrictive and had to be
+ generalized.
+
+
+
+
+
+Freed, et. al. Best Current Practice [Page 3]
+\f
+RFC 2048 MIME Registration Procedures November 1996
+
+
+2. Media Type Registration
+
+ Registration of a new media type or types starts with the
+ construction of a registration proposal. Registration may occur in
+ several different registration trees, which have different
+ requirements as discussed below. In general, the new registration
+ proposal is circulated and reviewed in a fashion appropriate to the
+ tree involved. The media type is then registered if the proposal is
+ acceptable. The following sections describe the requirements and
+ procedures used for each of the different registration trees.
+
+2.1. Registration Trees and Subtype Names
+
+ In order to increase the efficiency and flexibility of the
+ registration process, different structures of subtype names may be
+ registered to accommodate the different natural requirements for,
+ e.g., a subtype that will be recommended for wide support and
+ implementation by the Internet Community or a subtype that is used to
+ move files associated with proprietary software. The following
+ subsections define registration "trees", distinguished by the use of
+ faceted names (e.g., names of the form "tree.subtree...type"). Note
+ that some media types defined prior to this document do not conform
+ to the naming conventions described below. See Appendix A for a
+ discussion of them.
+
+2.1.1. IETF Tree
+
+ The IETF tree is intended for types of general interest to the
+ Internet Community. Registration in the IETF tree requires approval
+ by the IESG and publication of the media type registration as some
+ form of RFC.
+
+ Media types in the IETF tree are normally denoted by names that are
+ not explicitly faceted, i.e., do not contain period (".", full stop)
+ characters.
+
+ The "owner" of a media type registration in the IETF tree is assumed
+ to be the IETF itself. Modification or alteration of the
+ specification requires the same level of processing (e.g. standards
+ track) required for the initial registration.
+
+2.1.2. Vendor Tree
+
+ The vendor tree is used for media types associated with commercially
+ available products. "Vendor" or "producer" are construed as
+ equivalent and very broadly in this context.
+
+
+
+
+
+Freed, et. al. Best Current Practice [Page 4]
+\f
+RFC 2048 MIME Registration Procedures November 1996
+
+
+ A registration may be placed in the vendor tree by anyone who has
+ need to interchange files associated with the particular product.
+ However, the registration formally belongs to the vendor or
+ organization producing the software or file format. Changes to the
+ specification will be made at their request, as discussed in
+ subsequent sections.
+
+ Registrations in the vendor tree will be distinguished by the leading
+ facet "vnd.". That may be followed, at the discretion of the
+ registration, by either a media type name from a well-known producer
+ (e.g., "vnd.mudpie") or by an IANA-approved designation of the
+ producer's name which is then followed by a media type or product
+ designation (e.g., vnd.bigcompany.funnypictures).
+
+ While public exposure and review of media types to be registered in
+ the vendor tree is not required, using the ietf-types list for review
+ is strongly encouraged to improve the quality of those
+ specifications. Registrations in the vendor tree may be submitted
+ directly to the IANA.
+
+2.1.3. Personal or Vanity Tree
+
+ Registrations for media types created experimentally or as part of
+ products that are not distributed commercially may be registered in
+ the personal or vanity tree. The registrations are distinguished by
+ the leading facet "prs.".
+
+ The owner of "personal" registrations and associated specifications
+ is the person or entity making the registration, or one to whom
+ responsibility has been transferred as described below.
+
+ While public exposure and review of media types to be registered in
+ the personal tree is not required, using the ietf-types list for
+ review is strongly encouraged to improve the quality of those
+ specifications. Registrations in the personal tree may be submitted
+ directly to the IANA.
+
+2.1.4. Special `x.' Tree
+
+ For convenience and symmetry with this registration scheme, media
+ type names with "x." as the first facet may be used for the same
+ purposes for which names starting in "x-" are normally used. These
+ types are unregistered, experimental, and should be used only with
+ the active agreement of the parties exchanging them.
+
+
+
+
+
+
+
+Freed, et. al. Best Current Practice [Page 5]
+\f
+RFC 2048 MIME Registration Procedures November 1996
+
+
+ However, with the simplified registration procedures described above
+ for vendor and personal trees, it should rarely, if ever, be
+ necessary to use unregistered experimental types, and as such use of
+ both "x-" and "x." forms is discouraged.
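+
+ The naming conventions of sections 2.1.1 through 2.1.4 can be
+ summarized as a small classifier. The Python sketch below is
+ illustrative only: the function name and the returned labels are
+ hypothetical, subtype names are compared without regard to case, and
+ types defined before this document (see Appendix A) may not follow
+ these conventions.

```python
def registration_tree(subtype: str) -> str:
    """Guess the registration tree from the leading facet of a subtype name."""
    name = subtype.lower()
    if name.startswith("vnd."):
        return "vendor"
    if name.startswith("prs."):
        return "personal"
    if name.startswith("x.") or name.startswith("x-"):
        return "unregistered experimental"
    if "." not in name:
        # Names without a facet are normally in the IETF tree.
        return "ietf"
    # Some other leading facet, for example an additional tree (2.1.5).
    return "other"
```

+ For instance, "vnd.bigcompany.funnypictures" is classified as a
+ vendor tree name, while an unfaceted name such as "jpeg" falls under
+ the IETF tree.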
+
+2.1.5. Additional Registration Trees
+
+ From time to time and as required by the community, the IANA may,
+ with the advice and consent of the IESG, create new top-level
+ registration trees. It is explicitly assumed that these trees may be
+ created for external registration and management by well-known
+ permanent bodies, such as scientific societies for media types
+ specific to the sciences they cover. In general, the quality of
+ review of specifications for one of these additional registration
+ trees is expected to be equivalent to that which IETF would give to
+ registrations in its own tree. Establishment of these new trees will
+ be announced through RFC publication approved by the IESG.
+
+2.2. Registration Requirements
+
+ Media type registration proposals are all expected to conform to
+ various requirements laid out in the following sections. Note that
+ requirement specifics sometimes vary depending on the registration
+ tree, again as detailed in the following sections.
+
+2.2.1. Functionality Requirement
+
+ Media types must function as an actual media format: Registration of
+ things that are better thought of as a transfer encoding, as a
+ character set, or as a collection of separate entities of another
+ type, is not allowed. For example, although applications exist to
+ decode the base64 transfer encoding [RFC 2045], base64 cannot be
+ registered as a media type.
+
+ This requirement applies regardless of the registration tree
+ involved.
+
+2.2.2. Naming Requirements
+
+ All registered media types must be assigned MIME type and subtype
+ names. The combination of these names then serves to uniquely
+ identify the media type and the format of the subtype name identifies
+ the registration tree.
+
+ The choice of top-level type name must take the nature of media type
+ involved into account. For example, media normally used for
+ representing still images should be a subtype of the image content
+ type, whereas media capable of representing audio information belongs
+
+
+
+Freed, et. al. Best Current Practice [Page 6]
+\f
+RFC 2048 MIME Registration Procedures November 1996
+
+
+ under the audio content type. See RFC 2046 for additional information
+ on the basic set of top-level types and their characteristics.
+
+ New subtypes of top-level types must conform to the restrictions of
+ the top-level type, if any. For example, all subtypes of the
+ multipart content type must use the same encapsulation syntax.
+
+ In some cases a new media type may not "fit" under any currently
+ defined top-level content type. Such cases are expected to be quite
+ rare. However, if such a case arises a new top-level type can be
+ defined to accommodate it. Such a definition must be done via
+ standards-track RFC; no other mechanism can be used to define
+ additional top-level content types.
+
+ These requirements apply regardless of the registration tree
+ involved.
+
+2.2.3. Parameter Requirements
+
+ Media types may elect to use one or more MIME content type
+ parameters, or some parameters may be automatically made available to
+ the media type by virtue of being a subtype of a content type that
+ defines a set of parameters applicable to any of its subtypes. In
+ either case, the names, values, and meanings of any parameters must
+ be fully specified when a media type is registered in the IETF tree,
+ and should be specified as completely as possible when media types
+ are registered in the vendor or personal trees.
+
+ New parameters must not be defined as a way to introduce new
+ functionality in types registered in the IETF tree, although new
+ parameters may be added to convey additional information that does
+ not otherwise change existing functionality. An example of this
+ would be a "revision" parameter to indicate a revision level of an
+ external specification such as JPEG. Similar behavior is encouraged
+ for media types registered in the vendor or personal trees but is not
+ required.
+
+2.2.4. Canonicalization and Format Requirements
+
+ All registered media types must employ a single, canonical data
+ format, regardless of registration tree.
+
+ A precise and openly available specification of the format of each
+ media type is required for all types registered in the IETF tree and
+ must at a minimum be referenced by, if it isn't actually included in,
+ the media type registration proposal itself.
+
+
+
+
+
+Freed, et. al. Best Current Practice [Page 7]
+\f
+RFC 2048 MIME Registration Procedures November 1996
+
+
+ The specifications of format and processing particulars may or may
+ not be publicly available for media types registered in the vendor
+ tree, and such registration proposals are explicitly permitted to
+ include only a specification of which software and version produce or
+ process such media types. References to or inclusion of format
+ specifications in registration proposals is encouraged but not
+ required.
+
+ Format specifications are still required for registration in the
+ personal tree, but may be either published as RFCs or otherwise
+ deposited with IANA. The deposited specifications will meet the same
+ criteria as those required to register a well-known TCP port and, in
+ particular, need not be made public.
+
+ Some media types involve the use of patented technology. The
+ registration of media types involving patented technology is
+ specifically permitted. However, the restrictions set forth in RFC
+ 1602 on the use of patented technology in standards-track protocols
+ must be respected when the specification of a media type is part of a
+ standards-track protocol.
+
+2.2.5. Interchange Recommendations
+
+ Media types should, whenever possible, interoperate across as many
+ systems and applications as possible. However, some media types will
+ inevitably have problems interoperating across different platforms.
+ Problems with different versions, byte ordering, and specifics of
+ gateway handling can and will arise.
+
+ Universal interoperability of media types is not required, but known
+ interoperability issues should be identified whenever possible.
+ Publication of a media type does not require an exhaustive review of
+ interoperability, and the interoperability considerations section is
+ subject to continuing evaluation.
+
+ These recommendations apply regardless of the registration tree
+ involved.
+
+2.2.6. Security Requirements
+
+ An analysis of security issues is required for all types
+ registered in the IETF Tree. (This is in accordance with the basic
+ requirements for all IETF protocols.) A similar analysis for media
+ types registered in the vendor or personal trees is encouraged but
+ not required. However, regardless of what security analysis has or
+ has not been done, all descriptions of security issues must be as
+ accurate as possible regardless of registration tree. In particular,
+ a statement that there are "no security issues associated with this
+ type" must not be confused with "the security issues associated with
+ this type have not been assessed".
+
+ There is absolutely no requirement that media types registered in any
+ tree be secure or completely free from risks. Nevertheless, all
+ known security risks must be identified in the registration of a
+ media type, again regardless of registration tree.
+
+ The security considerations section of all registrations is subject
+ to continuing evaluation and modification, and in particular may be
+ extended by use of the "comments on media types" mechanism described
+ in subsequent sections.
+
+ Some of the issues that should be looked at in a security analysis of
+ a media type are:
+
+ (1) Complex media types may include provisions for
+ directives that institute actions on a recipient's
+ files or other resources. In many cases provision is
+ made for originators to specify arbitrary actions in an
+ unrestricted fashion which may then have devastating
+ effects. See the registration of the
+ application/postscript media type in RFC 2046 for
+ an example of such directives and how to handle them.
+
+ (2) Complex media types may include provisions for
+ directives that institute actions which, while not
+ directly harmful to the recipient, may result in
+ disclosure of information that either facilitates a
+ subsequent attack or else violates a recipient's
+ privacy in some way. Again, the registration of the
+ application/postscript media type illustrates how such
+ directives can be handled.
+
+ (3) A media type might be targeted for applications that
+ require some sort of security assurance but not provide
+ the necessary security mechanisms themselves. For
+ example, a media type could be defined for storage of
+ confidential medical information which in turn requires
+ an external confidentiality service.
+
+2.2.7. Usage and Implementation Non-requirements
+
+ In the asynchronous mail environment, where information on the
+ capabilities of the remote mail agent is frequently not available to
+ the sender, maximum interoperability is attained by restricting the
+ number of media types used to those "common" formats expected to be
+ widely implemented. This was asserted in the past as a reason to
+ limit the number of possible media types and resulted in a
+ registration process with a significant hurdle and delay for those
+ registering media types.
+
+ However, the need for "common" media types does not require limiting
+ the registration of new media types. If a limited set of media types
+ is recommended for a particular application, that should be asserted
+ by a separate applicability statement specific for the application
+ and/or environment.
+
+ As such, universal support and implementation of a media type is NOT
+ a requirement for registration. If, however, a media type is
+ explicitly intended for limited use, this should be noted in its
+ registration.
+
+2.2.8. Publication Requirements
+
+ Proposals for media types registered in the IETF tree must be
+ published as RFCs. RFC publication of vendor and personal media type
+ proposals is encouraged but not required. In all cases IANA will
+ retain copies of all media type proposals and "publish" them as part
+ of the media types registration tree itself.
+
+ Other than in the IETF tree, the registration of a data type does not
+ imply endorsement, approval, or recommendation by IANA or IETF or
+ even certification that the specification is adequate. To become
+ Internet Standards, protocols, data objects, or whatever must go
+ through the IETF standards process. This is too difficult and too
+ lengthy a process for the convenient registration of media types.
+
+ The IETF tree exists for media types that do require a substantive
+ review and approval process, while the vendor and personal trees
+ exist for those that do not. It is expected that applicability
+ statements for particular applications will be published from time to
+ time that recommend implementation of, and support for, media types
+ that have proven particularly useful in those contexts.
+
+ As discussed above, registration of a top-level type requires
+ standards-track processing and, hence, RFC publication.
+
+2.2.9. Additional Information
+
+ Various sorts of optional information may be included in the
+ specification of a media type if it is available:
+
+ (1) Magic number(s) (length, octet values). Magic numbers
+ are byte sequences that are always present and thus can
+ be used to identify entities as being of a given media
+ type.
+
+ (2) File extension(s) commonly used on one or more
+ platforms to indicate that some file contains a given
+ type of media.
+
+ (3) Macintosh File Type code(s) (4 octets) used to label
+ files containing a given type of media.
+
+ Such information is often quite useful to implementors and if
+ available should be provided.
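+
+ As a rough illustration of how an implementor might use registered
+ magic numbers, the Python sketch below matches the leading octets of
+ an entity against a small signature table. The table entries and the
+ guess_media_type name are assumptions made for this example; they are
+ not taken from any registration quoted in this document.
+

```python
# Illustrative signature table: (leading octets, media type).
# These entries are examples chosen for this sketch only.
MAGIC_NUMBERS = [
    (b"\x89PNG\r\n\x1a\n", "image/png"),
    (b"GIF87a", "image/gif"),
    (b"GIF89a", "image/gif"),
    (b"\xff\xd8\xff", "image/jpeg"),
    (b"%PDF-", "application/pdf"),
]

# Fallback used when no signature matches (an assumption of this sketch).
DEFAULT_TYPE = "application/octet-stream"


def guess_media_type(data: bytes) -> str:
    """Return the media type whose magic number prefixes `data`."""
    for magic, media_type in MAGIC_NUMBERS:
        if data.startswith(magic):
            return media_type
    return DEFAULT_TYPE
```

+
+ Unmatched data falls back to "application/octet-stream" here, which
+ is one reasonable default rather than a requirement of this document.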
+
+2.3. Registration Procedure
+
+ The following procedure has been implemented by the IANA for review
+ and approval of new media types. This is not a formal standards
+ process, but rather an administrative procedure intended to allow
+ community comment and sanity checking without excessive time delay.
+ For registration in the IETF tree, the normal IETF processes should
+ be followed, treating posting of an internet-draft and announcement
+ on the ietf-types list (as described in the next subsection) as a
+ first step. For registrations in the vendor or personal tree, the
+ initial review step described below may be omitted and the type
+ registered directly by submitting the template and an explanation
+ directly to IANA (at iana@iana.org). However, authors of vendor or
+ personal media type specifications are encouraged to seek community
+ review and comment whenever that is feasible.
+
+2.3.1. Present the Media Type to the Community for Review
+
+ Send a proposed media type registration to the "ietf-types@iana.org"
+ mailing list for a two week review period. This mailing list has
+ been established for the purpose of reviewing proposed media and
+ access types. Proposed media types are not formally registered and
+ must not be used; the "x-" prefix specified in RFC 2045 can be used
+ until registration is complete.
+
+ The intent of the public posting is to solicit comments and feedback
+ on the choice of type/subtype name, the unambiguity of the references
+ with respect to versions and external profiling information, and a
+ review of any interoperability or security considerations. The
+ submitter may submit a revised registration, or withdraw the
+ registration completely, at any time.
+
+2.3.2. IESG Approval
+
+ Media types registered in the IETF tree must be submitted to the IESG
+ for approval.
+
+2.3.3. IANA Registration
+
+ Provided that the media type meets the requirements for media types
+ and has obtained approval that is necessary, the author may submit
+ the registration request to the IANA, which will register the media
+ type and make the media type registration available to the community.
+
+2.4. Comments on Media Type Registrations
+
+ Comments on registered media types may be submitted by members of the
+ community to IANA. These comments will be passed on to the "owner"
+ of the media type if possible. Submitters of comments may request
+ that their comment be attached to the media type registration itself,
+ and if IANA approves of this the comment will be made accessible in
+ conjunction with the type registration itself.
+
+2.5. Location of Registered Media Type List
+
+ Media type registrations will be posted in the anonymous FTP
+ directory "ftp://ftp.isi.edu/in-notes/iana/assignments/media-types/"
+ and all registered media types will be listed in the periodically
+ issued "Assigned Numbers" RFC [currently STD 2, RFC 1700]. The media
+ type description and other supporting material may also be published
+ as an Informational RFC by sending it to "rfc-editor@isi.edu" (please
+ follow the instructions to RFC authors [RFC-1543]).
+
+2.6. IANA Procedures for Registering Media Types
+
+ The IANA will only register media types in the IETF tree in response
+ to a communication from the IESG stating that a given registration
+ has been approved. Vendor and personal types will be registered by
+ the IANA automatically and without any formal review as long as the
+ following minimal conditions are met:
+
+ (1) Media types must function as an actual media format.
+ In particular, character sets and transfer encodings
+ may not be registered as media types.
+
+ (2) All media types must have properly formed type and
+ subtype names. All type names must be defined by a
+ standards-track RFC. All subtype names must be unique,
+ must conform to the MIME grammar for such names, and
+ must contain the proper tree prefix.
+
+ (3) Types registered in the personal tree must either
+ provide a format specification or a pointer to one.
+
+ (4) Any security considerations given must not be obviously
+ bogus. (It is neither possible nor necessary for the
+ IANA to conduct a comprehensive security review of
+ media type registrations. Nevertheless, IANA has the
+ authority to identify obviously incompetent material
+ and exclude it.)
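+
+ The naming portion of these conditions lends itself to a mechanical
+ check. The Python sketch below is one possible reading, not IANA's
+ actual procedure: it tests the type name against the top-level types
+ of RFC 2046, the subtype name against the RFC 2045 "token" grammar,
+ and the tree prefix against the "vnd." and "prs." facets this document
+ assigns to the vendor and personal trees. The check_name function and
+ its "registered" argument are hypothetical names.
+

```python
import re

# Top-level media types defined by RFC 2046.
TOP_LEVEL_TYPES = {"text", "image", "audio", "video",
                   "application", "multipart", "message"}

# RFC 2045 "token": one or more US-ASCII characters other than SPACE,
# control characters, and the tspecials  ( ) < > @ , ; : \ " / [ ] ? =
TOKEN_RE = re.compile(r"[!#$%&'*+\-.0-9A-Z^_`a-z{|}~]+")

# Subtype prefixes for the trees that IANA registers without review.
TREE_PREFIXES = {"vendor": "vnd.", "personal": "prs."}


def check_name(type_name, subtype_name, tree, registered=frozenset()):
    """Return a list of problems; an empty list means the name looks valid.

    `registered` is a hypothetical set of lower-case "type/subtype"
    strings standing in for the list of existing registrations.
    """
    problems = []
    if type_name.lower() not in TOP_LEVEL_TYPES:
        problems.append("type name is not a standards-track top-level type")
    if not TOKEN_RE.fullmatch(subtype_name):
        problems.append("subtype name is not a valid MIME token")
    prefix = TREE_PREFIXES.get(tree)
    if prefix is None:
        problems.append("unknown registration tree")
    elif not subtype_name.lower().startswith(prefix):
        problems.append("subtype name lacks the %r tree prefix" % prefix)
    if ("%s/%s" % (type_name, subtype_name)).lower() in registered:
        problems.append("media type is already registered")
    return problems
```

+
+ The remaining conditions (that the type is an actual media format,
+ and that security considerations are not obviously bogus) call for
+ human judgment and are deliberately left out of the sketch.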
+
+2.7. Change Control
+
+ Once a media type has been published by IANA, the author may request
+ a change to its definition. The descriptions of the different
+ registration trees above designate the "owners" of each type of
+ registration. The change request follows the same procedure as the
+ registration request:
+
+ (1) Publish the revised template on the ietf-types list.
+
+ (2) Leave at least two weeks for comments.
+
+ (3) Publish using IANA after formal review if required.
+
+ Changes should be requested only when there are serious omissions or
+ errors in the published specification. When review is required, a
+ change request may be denied if it renders entities that were valid
+ under the previous definition invalid under the new definition.
+
+ The owner of a content type may pass responsibility for the content
+ type to another person or agency by informing IANA and the ietf-types
+ list; this can be done without discussion or review.
+
+ The IESG may reassign responsibility for a media type. The most
+ common case of this will be to enable changes to be made to types
+ where the author of the registration has died, moved out of contact
+ or is otherwise unable to make changes that are important to the
+ community.
+
+ Media type registrations may not be deleted; media types which are no
+ longer believed appropriate for use can be declared OBSOLETE by a
+ change to their "intended use" field; such media types will be
+ clearly marked in the lists published by IANA.
+
+2.8. Registration Template
+
+ To: ietf-types@iana.org
+ Subject: Registration of MIME media type XXX/YYY
+
+ MIME media type name:
+
+ MIME subtype name:
+
+ Required parameters:
+
+ Optional parameters:
+
+ Encoding considerations:
+
+ Security considerations:
+
+ Interoperability considerations:
+
+ Published specification:
+
+ Applications which use this media type:
+
+ Additional information:
+
+ Magic number(s):
+ File extension(s):
+ Macintosh File Type Code(s):
+
+ Person & email address to contact for further information:
+
+ Intended usage:
+
+ (One of COMMON, LIMITED USE or OBSOLETE)
+
+ Author/Change controller:
+
+ (Any other information that the author deems interesting may be
+ added below this line.)
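+
+ For submitters who generate registrations programmatically, the
+ template above can be filled in from a simple mapping. The Python
+ sketch below is only an illustration: render_registration is a
+ hypothetical helper, the sub-items under "Additional information" are
+ omitted for brevity, and the only validation performed is that the
+ intended usage is one of the three values the template allows.
+

```python
# Field labels in the order the template lists them.
FIELDS = [
    "MIME media type name",
    "MIME subtype name",
    "Required parameters",
    "Optional parameters",
    "Encoding considerations",
    "Security considerations",
    "Interoperability considerations",
    "Published specification",
    "Applications which use this media type",
    "Additional information",
    "Person & email address to contact for further information",
    "Intended usage",
    "Author/Change controller",
]

INTENDED_USAGE = {"COMMON", "LIMITED USE", "OBSOLETE"}


def render_registration(values):
    """Render a registration message from a {field label: text} mapping."""
    usage = values.get("Intended usage", "")
    if usage not in INTENDED_USAGE:
        raise ValueError("Intended usage must be one of %s"
                         % sorted(INTENDED_USAGE))
    media_type = "%s/%s" % (values["MIME media type name"],
                            values["MIME subtype name"])
    lines = [
        "To: ietf-types@iana.org",
        "Subject: Registration of MIME media type " + media_type,
        "",
    ]
    for field in FIELDS:
        # Fields without a value are emitted as a bare label.
        lines.append(("%s: %s" % (field, values.get(field, ""))).rstrip())
        lines.append("")
    return "\n".join(lines)
```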
+
+3. External Body Access Types
+
+ RFC 2046 defines the message/external-body media type, whereby a MIME
+ entity can act as a pointer to the actual body data in lieu of
+ including the data directly in the entity body. Each
+ message/external-body reference specifies an access type, which
+ determines the mechanism used to retrieve the actual body data. RFC
+ 2046 defines an initial set of access types, but allows for the
+ registration of additional access types to accommodate new retrieval
+ mechanisms.
+
+3.1. Registration Requirements
+
+ New access type specifications must conform to a number of
+ requirements as described below.
+
+3.1.1. Naming Requirements
+
+ Each access type must have a unique name. This name appears in the
+ access-type parameter in the message/external-body content-type
+ header field, and must conform to MIME content type parameter syntax.
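+
+ To show where the access type name ends up on the wire, the Python
+ sketch below builds a message/external-body header with the standard
+ library's email package. The external_body_reference helper and the
+ host, directory, and file names are placeholders invented for the
+ example; "anon-ftp" is one of the access types defined in RFC 2046.
+

```python
from email.message import Message


def external_body_reference(access_type, **params):
    """Build an entity whose Content-Type names an external body.

    Keyword names are passed through Message.add_header, which turns
    underscores into hyphens, so `access_type` becomes `access-type`.
    """
    msg = Message()
    msg.add_header("Content-Type", "message/external-body",
                   access_type=access_type, **params)
    return msg


# Placeholder values for illustration only.
ref = external_body_reference("anon-ftp",
                              site="ftp.example.org",
                              directory="pub",
                              name="report.ps")
print(ref["Content-Type"])
print(ref.get_param("access-type"))
```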
+
+3.1.2. Mechanism Specification Requirements
+
+ All of the protocols, transports, and procedures used by a given
+ access type must be described, either in the specification of the
+ access type itself or in some other publicly available specification,
+ in sufficient detail for the access type to be implemented by any
+ competent implementor. Use of secret and/or proprietary methods in
+ access types is expressly prohibited. The restrictions imposed by
+ RFC 1602 on the standardization of patented algorithms must be
+ respected as well.
+
+3.1.3. Publication Requirements
+
+ All access types must be described by an RFC. The RFC may be
+ informational rather than standards-track, although standards-track
+ review and approval are encouraged for all access types.
+
+3.1.4. Security Requirements
+
+ Any known security issues that arise from the use of the access type
+ must be completely and fully described. It is not required that the
+ access type be secure or that it be free from risks, but that the
+ known risks be identified. Publication of a new access type does not
+ require an exhaustive security review, and the security
+ considerations section is subject to continuing evaluation.
+ Additional security considerations should be addressed by publishing
+ revised versions of the access type specification.
+
+3.2. Registration Procedure
+
+ Registration of a new access type starts with the construction of a
+ draft of an RFC.
+
+3.2.1. Present the Access Type to the Community
+
+ Send a proposed access type specification to the
+ "ietf-types@iana.org" mailing list for a two week review period.
+ This mailing list has been established for the purpose of reviewing
+ proposed access and media types. Proposed access types are not
+ formally registered and must not be used.
+
+ The intent of the public posting is to solicit comments and feedback
+ on the access type specification and a review of any security
+ considerations.
+
+3.2.2. Access Type Reviewer
+
+ When the two week period has passed, the access type reviewer, who is
+ appointed by the IETF Applications Area Director, either forwards the
+ request to iana@isi.edu, or rejects it because of significant
+ objections raised on the list.
+
+ Decisions made by the reviewer must be posted to the ietf-types
+ mailing list within 14 days. Decisions made by the reviewer may be
+ appealed to the IESG.
+
+3.2.3. IANA Registration
+
+ Provided that the access type has either passed review or has been
+ successfully appealed to the IESG, the IANA will register the access
+ type and make the registration available to the community. The
+ specification of the access type must also be published as an RFC.
+ Informational RFCs are published by sending them to
+ "rfc-editor@isi.edu" (please follow the instructions to RFC authors
+ [RFC-1543]).
+
+3.3. Location of Registered Access Type List
+
+ Access type registrations will be posted in the anonymous FTP
+ directory "ftp://ftp.isi.edu/in-notes/iana/assignments/access-types/"
+ and all registered access types will be listed in the periodically
+ issued "Assigned Numbers" RFC [currently RFC-1700].
+
+3.4. IANA Procedures for Registering Access Types
+
+ The identity of the access type reviewer is communicated to the IANA
+ by the IESG. The IANA then only acts in response to access type
+ definitions that either are approved by the access type reviewer and
+ forwarded by the reviewer to the IANA for registration, or in
+ response to a communication from the IESG that an access type
+ definition appeal has overturned the access type reviewer's ruling.
+
+4. Transfer Encodings
+
+ Transfer encodings are transformations applied to MIME media types
+ after conversion to the media type's canonical form. Transfer
+ encodings are used for several purposes:
+
+ (1) Many transports, especially message transports, can
+ only handle data consisting of relatively short lines
+ of text. There can also be severe restrictions on what
+ characters can be used in these lines of text -- some
+ transports are restricted to a small subset of US-ASCII
+ and others cannot handle certain character sequences.
+ Transfer encodings are used to transform binary data
+ into textual form that can survive such transports.
+ Examples of this sort of transfer encoding include the
+ base64 and quoted-printable transfer encodings defined
+ in RFC 2045.
+
+ (2) Image, audio, video, and even application entities are
+ sometimes quite large. Compression algorithms are often
+ quite effective in reducing the size of large entities.
+ Transfer encodings can be used to apply general-purpose
+ non-lossy compression algorithms to MIME entities.
+
+ (3) Transfer encodings can be defined as a means of
+ representing existing encoding formats in a MIME
+ context.
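+
+ The first purpose can be seen with the two encodings RFC 2045
+ defines. The Python sketch below uses the standard base64 and quopri
+ modules; the fits_restricted_transport helper, the 76-character line
+ limit it checks, and the sample data are assumptions of this example
+ rather than anything this document specifies.
+

```python
import base64
import quopri


def fits_restricted_transport(encoded: bytes, max_line: int = 76) -> bool:
    """True if every line is short and contains only 7-bit octets."""
    return all(
        len(line) <= max_line and all(octet < 128 for octet in line)
        for line in encoded.splitlines()
    )


# Arbitrary binary data: every octet value, repeated a few times.
binary = bytes(range(256)) * 4
# Mostly-ASCII text with a few ISO-8859-1 characters.
text = "Gr\u00fc\u00dfe aus K\u00f6ln\n".encode("iso-8859-1")

b64 = base64.encodebytes(binary)   # base64 suits arbitrary binary data
qp = quopri.encodestring(text)     # quoted-printable suits mostly-ASCII text

print(fits_restricted_transport(binary))  # raw data has 8-bit octets
print(fits_restricted_transport(b64))
print(fits_restricted_transport(qp))
```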
+
+ IMPORTANT: The standardization of a large number of different
+ transfer encodings is seen as a significant barrier to widespread
+ interoperability and is expressly discouraged. Nevertheless, the
+ following procedure has been defined to provide a means of defining
+ additional transfer encodings, should standardization actually be
+ justified.
+
+4.1. Transfer Encoding Requirements
+
+ Transfer encoding specifications must conform to a number of
+ requirements as described below.
+
+4.1.1. Naming Requirements
+
+ Each transfer encoding must have a unique name. This name appears in
+ the Content-Transfer-Encoding header field and must conform to the
+ syntax of that field.
+
+4.1.2. Algorithm Specification Requirements
+
+ All of the algorithms used in a transfer encoding (e.g. conversion
+ to printable form, compression) must be described in their entirety
+ in the transfer encoding specification. Use of secret and/or
+ proprietary algorithms in standardized transfer encodings is
+ expressly prohibited. The restrictions imposed by RFC 1602 on the
+ standardization of patented algorithms must be respected as well.
+
+4.1.3. Input Domain Requirements
+
+ All transfer encodings must be applicable to an arbitrary sequence of
+ octets of any length. Dependence on particular input forms is not
+ allowed.
+
+ It should be noted that the 7bit and 8bit encodings do not conform to
+ this requirement. Aside from the undesirability of having
+ specialized encodings, the intent here is to forbid the addition of
+ further encodings along the lines of 7bit and 8bit.
+
+4.1.4. Output Range Requirements
+
+ There is no requirement that a particular transfer encoding produce a
+ particular form of encoded output. However, the output format for
+ each transfer encoding must be fully and completely documented. In
+ particular, each specification must clearly state whether the output
+ format always lies within the confines of 7bit data, 8bit data, or is
+ simply pure binary data.
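+
+ One way to check which of the three domains a sample of encoded
+ output falls into is sketched below in Python. It follows the
+ definitions of 7bit, 8bit, and binary data given in RFC 2045 (lines of
+ at most 998 octets separated by CRLF, no NULs, and no bare CR or LF
+ for the two line-oriented domains). The classify_domain name is
+ hypothetical, and a real specification would have to argue the
+ property for all inputs rather than test samples.
+

```python
def classify_domain(data: bytes, max_line: int = 998) -> str:
    """Return "7bit", "8bit", or "binary" for `data`."""
    # In line-oriented data CR and LF may occur only as CRLF pairs,
    # and NUL octets are not allowed at all.
    without_crlf = data.replace(b"\r\n", b"")
    if b"\r" in without_crlf or b"\n" in without_crlf or b"\x00" in data:
        return "binary"
    # Lines between CRLF separators must stay within the length limit.
    if any(len(line) > max_line for line in data.split(b"\r\n")):
        return "binary"
    # Any octet above 127 rules out 7bit.
    return "8bit" if any(octet > 127 for octet in data) else "7bit"
```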
+
+4.1.5. Data Integrity and Generality Requirements
+
+ All transfer encodings must be fully invertible on any platform; it
+ must be possible for anyone to recover the original data by
+ performing the corresponding decoding operation. Note that this
+ requirement effectively excludes all forms of lossy compression as
+ well as all forms of encryption from use as a transfer encoding.
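+
+ The invertibility requirement, together with the input domain
+ requirement of section 4.1.3, can be exercised with a simple
+ round-trip test over octet sequences of various lengths. The Python
+ sketch below is a spot check under assumed sample inputs, not a proof;
+ round_trips and strip_high_bit are hypothetical names, and
+ strip_high_bit is included only as an example of a lossy
+ transformation that fails the test.
+

```python
import base64
import os


def round_trips(encode, decode, samples):
    """True if decode(encode(x)) == x for every sample."""
    return all(decode(encode(sample)) == sample for sample in samples)


def strip_high_bit(data: bytes) -> bytes:
    """A lossy transformation: clears the top bit of every octet."""
    return bytes(octet & 0x7F for octet in data)


# Empty input, awkward control octets, every octet value, and random
# data of several lengths.
samples = [b"", b"\x00", b"\r\n", bytes(range(256))]
samples += [os.urandom(n) for n in (1, 2, 3, 57, 1000)]

print(round_trips(base64.encodebytes, base64.decodebytes, samples))
print(round_trips(strip_high_bit, lambda data: data, samples))
```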
+
+4.1.6. New Functionality Requirements
+
+ All transfer encodings must provide some sort of new functionality.
+ Some degree of functionality overlap with previously defined transfer
+ encodings is acceptable, but any new transfer encoding must also
+ offer something no other transfer encoding provides.
+
+4.2. Transfer Encoding Definition Procedure
+
+ Definition of a new transfer encoding starts with the construction of
+ a draft of a standards-track RFC. The RFC must define the transfer
+ encoding precisely and completely, and must also provide substantial
+ justification for defining and standardizing a new transfer encoding.
+ This specification must then be presented to the IESG for
+ consideration. The IESG can
+
+ (1) reject the specification outright as being
+ inappropriate for standardization,
+
+ (2) approve the formation of an IETF working group to work
+ on the specification in accordance with IETF
+ procedures, or,
+
+ (3) accept the specification as-is and put it directly on
+ the standards track.
+
+ Transfer encoding specifications on the standards track follow normal
+ IETF rules for standards track documents. A transfer encoding is
+ considered to be defined and available for use once it is on the
+ standards track.
+
+4.3. IANA Procedures for Transfer Encoding Registration
+
+ There is no need for a special procedure for registering Transfer
+ Encodings with the IANA. All legitimate transfer encoding
+ registrations must appear as a standards-track RFC, so it is the
+ IESG's responsibility to notify the IANA when a new transfer encoding
+ has been approved.
+
+4.4. Location of Registered Transfer Encodings List
+
+ Transfer encoding registrations will be posted in the anonymous FTP
+ directory
+ "ftp://ftp.isi.edu/in-notes/iana/assignments/transfer-encodings/"
+ and all registered transfer encodings will be listed in the
+ periodically issued "Assigned Numbers" RFC [currently RFC-1700].
+
+5. Authors' Addresses
+
+ For more information, the authors of this document are best
+ contacted via Internet mail:
+
+ Ned Freed
+ Innosoft International, Inc.
+ 1050 East Garvey Avenue South
+ West Covina, CA 91790
+ USA
+
+ Phone: +1 818 919 3600
+ Fax: +1 818 919 3614
+ EMail: ned@innosoft.com
+
+
+ John Klensin
+ MCI
+ 2100 Reston Parkway
+ Reston, VA 22091
+
+ Phone: +1 703 715-7361
+ Fax: +1 703 715-7436
+ EMail: klensin@mci.net
+
+
+ Jon Postel
+ USC/Information Sciences Institute
+ 4676 Admiralty Way
+ Marina del Rey, CA 90292
+ USA
+
+
+ Phone: +1 310 822 1511
+ Fax: +1 310 823 6714
+ EMail: Postel@ISI.EDU
+
+Appendix A -- Grandfathered Media Types
+
+ A number of media types, registered prior to 1996, would, if
+ registered under the guidelines in this document, be placed into
+ either the vendor or personal trees. Reregistration of those types
+ to reflect the appropriate trees is encouraged, but not required.
+ Ownership and change control principles outlined in this document
+ apply to those types as if they had been registered in the trees
+ described above.
--- /dev/null
+
+
+
+
+
+
+Network Working Group N. Freed
+Request for Comments: 2049 Innosoft
+Obsoletes: 1521, 1522, 1590 N. Borenstein
+Category: Standards Track First Virtual
+ November 1996
+
+
+ Multipurpose Internet Mail Extensions
+ (MIME) Part Five:
+ Conformance Criteria and Examples
+
+Status of this Memo
+
+ This document specifies an Internet standards track protocol for the
+ Internet community, and requests discussion and suggestions for
+ improvements. Please refer to the current edition of the "Internet
+ Official Protocol Standards" (STD 1) for the standardization state
+ and status of this protocol. Distribution of this memo is unlimited.
+
+Abstract
+
+ STD 11, RFC 822, defines a message representation protocol specifying
+ considerable detail about US-ASCII message headers, and leaves the
+ message content, or message body, as flat US-ASCII text. This set of
+ documents, collectively called the Multipurpose Internet Mail
+ Extensions, or MIME, redefines the format of messages to allow for
+
+ (1) textual message bodies in character sets other than
+ US-ASCII,
+
+ (2) an extensible set of different formats for non-textual
+ message bodies,
+
+ (3) multi-part message bodies, and
+
+ (4) textual header information in character sets other than
+ US-ASCII.
+
+ These documents are based on earlier work documented in RFC 934, STD
+ 11, and RFC 1049, but extend and revise them. Because RFC 822 said
+ so little about message bodies, these documents are largely
+ orthogonal to (rather than a revision of) RFC 822.
+
+ The initial document in this set, RFC 2045, specifies the various
+ headers used to describe the structure of MIME messages. The second
+ document defines the general structure of the MIME media typing
+ system and defines an initial set of media types. The third
+ document, RFC 2047, describes extensions to RFC 822 to allow non-US-
+
+
+
+Freed & Borenstein Standards Track [Page 1]
+\f
+RFC 2049 MIME Conformance November 1996
+
+
+ ASCII text data in Internet mail header fields. The fourth document,
+ RFC 2048, specifies various IANA registration procedures for MIME-
+ related facilities. This fifth and final document describes MIME
+ conformance criteria and provides some illustrative examples
+ of MIME message formats, acknowledgements, and the bibliography.
+
+ These documents are revisions of RFCs 1521, 1522, and 1590, which
+ themselves were revisions of RFCs 1341 and 1342. Appendix B of this
+ document describes differences and changes from previous versions.
+
+Table of Contents
+
+ 1. Introduction .......................................... 2
+ 2. MIME Conformance ...................................... 2
+ 3. Guidelines for Sending Email Data ..................... 6
+ 4. Canonical Encoding Model .............................. 9
+ 5. Summary ............................................... 12
+ 6. Security Considerations ............................... 12
+ 7. Authors' Addresses .................................... 12
+ 8. Acknowledgements ...................................... 13
+ A. A Complex Multipart Example ........................... 15
+ B. Changes from RFC 1521, 1522, and 1590 ................. 16
+ C. References ............................................ 20
+
+1. Introduction
+
+ The first and second documents in this set define MIME header fields
+ and the initial set of MIME media types. The third document
+ describes extensions to RFC822 formats to allow for character sets
+ other than US-ASCII. This document describes what portions of MIME
+ must be supported by a conformant MIME implementation. It also
+ describes various pitfalls of contemporary messaging systems as well
+ as the canonical encoding model MIME is based on.
+
+2. MIME Conformance
+
+ The mechanisms described in these documents are open-ended. It is
+ definitely not expected that all implementations will support all
+ available media types, nor that they will all share the same
+ extensions. In order to promote interoperability, however, it is
+ useful to define the concept of "MIME-conformance" to define a
+ certain level of implementation that allows the useful interworking
+ of messages with content that differs from US-ASCII text. In this
+ section, we specify the requirements for such conformance.
+
+ A mail user agent that is MIME-conformant MUST:
+
+ (1) Always generate a "MIME-Version: 1.0" header field in
+ any message it creates.
+
+ (2) Recognize the Content-Transfer-Encoding header field
+ and decode all received data encoded by either quoted-
+ printable or base64 implementations. The identity
+ transformations 7bit, 8bit, and binary must also be
+ recognized.
+
+ Any non-7bit data that is sent without encoding must be
+ properly labelled with a content-transfer-encoding of
+ 8bit or binary, as appropriate. If the underlying
+ transport does not support 8bit or binary (as SMTP
+ [RFC-821] does not), the sender is required to both
+ encode and label data using an appropriate Content-
+ Transfer-Encoding such as quoted-printable or base64.
+
+ (3) Treat any entity with an unrecognized
+ Content-Transfer-Encoding as if it had a Content-Type of
+ "application/octet-stream", regardless of whether or
+ not the actual Content-Type is recognized.
+
+ (4) Recognize and interpret the Content-Type header field,
+ and avoid showing users raw data with a Content-Type
+ field other than text. Implementations must be able
+ to send at least text/plain messages, with the
+ character set specified with the charset parameter if
+ it is not US-ASCII.
+
+ (5) Ignore any content type parameters whose names they do
+ not recognize.
+
+ (6) Explicitly handle the following media type values, to
+ at least the following extents:
+
+ Text:
+
+ -- Recognize and display "text" mail with the
+ character set "US-ASCII."
+
+ -- Recognize other character sets at least to the
+ extent of being able to inform the user about what
+ character set the message uses.
+
+ -- Recognize the "ISO-8859-*" character sets to the
+ extent of being able to display those characters that
+ are common to ISO-8859-* and US-ASCII, namely all
+ characters represented by octet values 1-127.
+
+ -- For unrecognized subtypes in a known character
+ set, show or offer to show the user the "raw" version
+ of the data after conversion of the content from
+ canonical form to local form.
+
+ -- Treat material in an unknown character set as if
+ it were "application/octet-stream".
+
+ Image, audio, and video:
+
+ -- At a minimum provide facilities to treat any
+ unrecognized subtypes as if they were
+ "application/octet-stream".
+
+ Application:
+
+ -- Offer the ability to remove either of the quoted-
+ printable or base64 encodings defined in this
+ document if they were used and put the resulting
+ information in a user file.
+
+ Multipart:
+
+ -- Recognize the mixed subtype. Display all relevant
+ information on the message level and the body part
+ header level and then display or offer to display
+ each of the body parts individually.
+
+ -- Recognize the "alternative" subtype, and avoid
+ showing the user redundant parts of
+ multipart/alternative mail.
+
+ -- Recognize the "multipart/digest" subtype,
+ specifically using "message/rfc822" rather than
+ "text/plain" as the default media type for body parts
+ inside "multipart/digest" entities.
+
+ -- Treat any unrecognized subtypes as if they were
+ "mixed".
+
+ Message:
+
+ -- Recognize and display at least the RFC822 message
+ encapsulation (message/rfc822) in such a way as to
+ preserve any recursive structure, that is, displaying
+ or offering to display the encapsulated data in
+ accordance with its media type.
+
+ -- Treat any unrecognized subtypes as if they were
+ "application/octet-stream".
+
+ (7) Upon encountering any unrecognized Content-Type field,
+ an implementation must treat it as if it had a media
+ type of "application/octet-stream" with no parameter
+ sub-arguments. How such data are handled is up to an
+ implementation, but likely options for handling such
+ unrecognized data include offering the user to write it
+ into a file (decoded from its mail transport format) or
+ offering the user to name a program to which the
+ decoded data should be passed as input.
+
+ (8) Conformant user agents are required, if they provide
+ non-standard support for non-MIME messages employing
+ character sets other than US-ASCII, to do so on
+ received messages only. Conforming user agents must not
+ send non-MIME messages containing anything other than
+ US-ASCII text.
+
+ In particular, the use of non-US-ASCII text in mail
+ messages without a MIME-Version field is strongly
+ discouraged as it impedes interoperability when sending
+ messages between regions with different localization
+ conventions. Conforming user agents MUST include proper
+ MIME labelling when sending anything other than plain
+ text in the US-ASCII character set.
+
+ In addition, non-MIME user agents should be upgraded if
+ at all possible to include appropriate MIME header
+ information in the messages they send even if nothing
+ else in MIME is supported. This upgrade will have
+ little, if any, effect on non-MIME recipients and will
+ aid MIME in correctly displaying such messages. It
+ also provides a smooth transition path to eventual
+ adoption of other MIME capabilities.
+
+ (9) Conforming user agents must ensure that any string of
+ non-white-space printable US-ASCII characters within a
+ "*text" or "*ctext" that begins with "=?" and ends with
+ "?=" be a valid encoded-word. ("begins" means: At the
+ start of the field-body or immediately following
+ linear-white-space; "ends" means: At the end of the
+ field-body or immediately preceding linear-white-
+ space.) In addition, any "word" within a "phrase" that
+ begins with "=?" and ends with "?=" must be a valid
+ encoded-word.
+
+ (10) Conforming user agents must be able to distinguish
+ encoded-words from "text", "ctext", or "word"s,
+ according to the rules in section 4, anytime they
+ appear in appropriate places in message headers. They
+ must support both the "B" and "Q" encodings for any
+ character set which they support. The program must be
+ able to display the unencoded text if the character set
+ is "US-ASCII". For the ISO-8859-* character sets, the
+ mail reading program must at least be able to display
+ the characters which are also in the US-ASCII set.
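
Requirement (10) can be illustrated with a short Python sketch. It assumes the standard library's email.header module is an acceptable decoder; that choice, and the helper name decode_header_value, belong to this example and are not prescribed by the RFC. The sample encoded-words carry the same ISO-8859-1 text once in the "Q" encoding and once in the "B" encoding.

```python
from email.header import decode_header, make_header


def decode_header_value(raw: str) -> str:
    """Decode any encoded-words in a header value into a Python str.

    Hypothetical helper: decode_header() splits the value into
    (bytes, charset) chunks and make_header() reassembles them.
    """
    return str(make_header(decode_header(raw)))


# The same five ISO-8859-1 octets ("Andr" followed by 0xE9), once per encoding.
q_form = "=?ISO-8859-1?Q?Andr=E9?="
b_form = "=?ISO-8859-1?B?QW5kcuk=?="

# In the "Q" encoding an underscore stands for a space.
ascii_form = "=?US-ASCII?Q?Keith_Moore?="

if __name__ == "__main__":
    for value in (q_form, b_form, ascii_form):
        print(value, "->", decode_header_value(value))
```

Both forms decode to the same string, which is the behaviour a conformant reader needs for every character set it claims to support.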
+
+ A user agent that meets the above conditions is said to be MIME-
+ conformant. The meaning of this phrase is that it is assumed to be
+ "safe" to send virtually any kind of properly-marked data to users of
+ such mail systems, because such systems will at least be able to
+ treat the data as undifferentiated binary, and will not simply splash
+ it onto the screen of unsuspecting users.
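
The fallback rules in requirements (3), (6), and (7) can be summarized as a small decision function. The following Python sketch is illustrative only: the names effective_media_type and KNOWN_SUBTYPES are hypothetical, the set of "known" subtypes stands in for whatever a particular agent can actually render, and mapping an unrecognized text subtype to text/plain is a simplification of "show or offer to show the raw version".

```python
OCTET_STREAM = "application/octet-stream"

# The five transfer encodings named in requirement (2).
KNOWN_ENCODINGS = {"7bit", "8bit", "binary", "quoted-printable", "base64"}

# Hypothetical: the subtypes this imaginary user agent handles natively.
KNOWN_SUBTYPES = {
    "text": {"plain"},
    "image": set(),
    "audio": set(),
    "video": set(),
    "application": {"octet-stream"},
    "multipart": {"mixed", "alternative", "digest"},
    "message": {"rfc822"},
}


def effective_media_type(content_type, encoding="7bit", charset="us-ascii"):
    """Return the media type a minimal agent should act on."""
    if encoding.lower() not in KNOWN_ENCODINGS:
        return OCTET_STREAM                    # requirement (3)
    maintype, _, subtype = content_type.lower().partition("/")
    if maintype not in KNOWN_SUBTYPES:
        return OCTET_STREAM                    # requirement (7)
    if maintype == "text":
        cs = charset.lower()
        if cs != "us-ascii" and not cs.startswith("iso-8859-"):
            return OCTET_STREAM                # unknown character set
        if subtype not in KNOWN_SUBTYPES["text"]:
            return "text/plain"                # show the "raw" text
    elif subtype not in KNOWN_SUBTYPES[maintype]:
        if maintype == "multipart":
            return "multipart/mixed"           # unrecognized multipart
        return OCTET_STREAM                    # image, audio, video, ...
    return f"{maintype}/{subtype}"


if __name__ == "__main__":
    print(effective_media_type("multipart/related"))
    print(effective_media_type("image/x-unknown"))
    print(effective_media_type("text/plain", encoding="x-rot13"))
```

The point of the table-driven form is that every path ends in a type the agent knows how to handle, so no unlabelled binary reaches the screen.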
+
+ There is another sense in which it is always "safe" to send data in a
+ format that is MIME-conformant, which is that such data will not
+ break or be broken by any known systems that are conformant with RFC
+ 821 and RFC 822. User agents that are MIME-conformant have the
+ additional guarantee that the user will not be shown data that were
+ never intended to be viewed as text.
+
+3. Guidelines for Sending Email Data
+
+ Internet email is not a perfect, homogeneous system. Mail may become
+ corrupted at several stages in its travel to a final destination.
+ Specifically, email sent throughout the Internet may travel across
+ many networking technologies. Many networking and mail technologies
+ do not support the full functionality possible in the SMTP transport
+ environment. Mail traversing these systems is likely to be modified
+ in order that it can be transported.
+
+ There exist many widely-deployed non-conformant MTAs in the Internet.
+ These MTAs, speaking the SMTP protocol, alter messages on the fly to
+ take advantage of the internal data structure of the hosts they are
+ implemented on, or are just plain broken.
+
+ The following guidelines may be useful to anyone devising a data
+ format (media type) that is supposed to survive the widest range of
+ networking technologies and known broken MTAs unscathed. Note that
+ anything encoded in the base64 encoding will satisfy these rules, but
+ that some well-known mechanisms, notably the UNIX uuencode facility,
+ will not. Note also that anything encoded in the Quoted-Printable
+ encoding will survive most gateways intact, but possibly not some
+ gateways to systems that use the EBCDIC character set.
+
+ (1) Under some circumstances the encoding used for data may
+ change as part of normal gateway or user agent
+ operation. In particular, conversion from base64 to
+ quoted-printable and vice versa may be necessary. This
+ may result in the confusion of CRLF sequences with line
+ breaks in text bodies. As such, the persistence of
+ CRLF as something other than a line break must not be
+ relied on.
+
+ (2) Many systems may elect to represent and store text data
+ using local newline conventions. Local newline
+ conventions may not match the RFC822 CRLF convention --
+ systems are known that use plain CR, plain LF, CRLF, or
+ counted records. The result is that isolated CR and LF
+ characters are not well tolerated in general; they may
+ be lost or converted to delimiters on some systems, and
+ hence must not be relied on.
+
+ (3) The transmission of NULs (US-ASCII value 0) is
+ problematic in Internet mail. (This is largely the
+ result of NULs being used as a termination character by
+ many of the standard runtime library routines in the C
+ programming language.) The practice of using NULs as
+ termination characters is so entrenched now that
+ messages should not rely on them being preserved.
+
+ (4) TAB (HT) characters may be misinterpreted or may be
+ automatically converted to variable numbers of spaces.
+ This is unavoidable in some environments, notably those
+ not based on the US-ASCII character set. Such
+ conversion is STRONGLY DISCOURAGED, but it may occur,
+ and mail formats must not rely on the persistence of
+ TAB (HT) characters.
+
+ (5) Lines longer than 76 characters may be wrapped or
+ truncated in some environments. Line wrapping or line
+ truncation imposed by mail transports is STRONGLY
+ DISCOURAGED, but unavoidable in some cases.
+ Applications which require long lines must somehow
+ differentiate between soft and hard line breaks. (A
+ simple way to do this is to use the quoted-printable
+ encoding.)
+
+ (6) Trailing "white space" characters (SPACE, TAB (HT)) on
+ a line may be discarded by some transport agents, while
+ other transport agents may pad lines with these
+ characters so that all lines in a mail file are of
+ equal length. The persistence of trailing white space,
+ therefore, must not be relied on.
+
+ (7) Many mail domains use variations on the US-ASCII
+ character set, or use character sets such as EBCDIC
+ which contain most but not all of the US-ASCII
+ characters. The correct translation of characters not
+ in the "invariant" set cannot be depended on across
+ character converting gateways. For example, this
+ situation is a problem when sending uuencoded
+ information across BITNET, an EBCDIC system. Similar
+ problems can occur without crossing a gateway, since
+ many Internet hosts use character sets other than US-
+ ASCII internally. The definition of Printable Strings
+ in X.400 adds further restrictions in certain special
+ cases. In particular, the only characters that are
+ known to be consistent across all gateways are the 73
+ characters that correspond to the upper and lower case
+ letters A-Z and a-z, the 10 digits 0-9, and the
+ following eleven special characters:
+
+ "'" (US-ASCII decimal value 39)
+ "(" (US-ASCII decimal value 40)
+ ")" (US-ASCII decimal value 41)
+ "+" (US-ASCII decimal value 43)
+ "," (US-ASCII decimal value 44)
+ "-" (US-ASCII decimal value 45)
+ "." (US-ASCII decimal value 46)
+ "/" (US-ASCII decimal value 47)
+ ":" (US-ASCII decimal value 58)
+ "=" (US-ASCII decimal value 61)
+ "?" (US-ASCII decimal value 63)
+
+ A maximally portable mail representation will confine
+ itself to relatively short lines of text in which the
+ only meaningful characters are taken from this set of
+ 73 characters. The base64 encoding follows this rule.
+
+ (8) Some mail transport agents will corrupt data that
+ includes certain literal strings. In particular, a
+ period (".") alone on a line is known to be corrupted
+ by some (incorrect) SMTP implementations, and a line
+ that starts with the five characters "From " (the fifth
+ character is a SPACE) is commonly corrupted as well.
+ A careful composition agent can prevent these
+ corruptions by encoding the data (e.g., in the quoted-
+ printable encoding using "=46rom " in place of "From "
+ at the start of a line, and "=2E" in place of "." alone
+ on a line).
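
The 73-character set from guideline (7) is easy to check mechanically. The Python sketch below builds that set and reports which characters of a given encoding fall outside it. The names PORTABLE and non_portable_chars are hypothetical, and the uuencoded sample is produced with binascii.b2a_uu purely for comparison.

```python
import base64
import binascii
import string

# Letters, digits, and the eleven special characters listed in guideline (7).
PORTABLE = frozenset(string.ascii_letters + string.digits + "'()+,-./:=?")


def non_portable_chars(text: str) -> list:
    """Return the sorted characters of text that lie outside the portable set.

    CR and LF are ignored here because they only delimit lines.
    """
    return sorted(set(text) - PORTABLE - {"\r", "\n"})


# base64 output for every possible octet value.
b64_sample = base64.encodebytes(bytes(range(256))).decode("ascii")

# A one-line uuencoded sample.
uu_sample = binascii.b2a_uu(b"Cat").decode("ascii")

if __name__ == "__main__":
    print("portable set size:", len(PORTABLE))
    print("base64 offenders:", non_portable_chars(b64_sample))
    print("uuencode offenders:", non_portable_chars(uu_sample))
```

The base64 sample produces no offenders, while even a three-octet uuencoded line already uses characters outside the set, which matches the remark that uuencode does not satisfy these rules.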
+
+ Please note that the above list is NOT a list of recommended
+ practices for MTAs. RFC 821 MTAs are prohibited from altering the
+ character of white space or wrapping long lines. These BAD and
+ invalid practices are known to occur on established networks, and
+ implementations should be robust in dealing with the bad effects they
+ can cause.
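
The protective encoding suggested in guideline (8) can be sketched in a few lines of Python. This is not a full quoted-printable encoder: the hypothetical helper protect_line only rewrites the two fragile line forms and assumes the rest of the body is already valid quoted-printable and labelled as such. The standard quopri module is used only to show that a decoder restores the original text.

```python
import quopri


def protect_line(line: str) -> str:
    """Rewrite the line forms that guideline (8) calls out as fragile."""
    if line.startswith("From "):
        return "=46rom " + line[5:]   # "=46" is the letter "F"
    if line == ".":
        return "=2E"                  # "=2E" is the period
    return line


def restore_line(line: str) -> str:
    """Undo protect_line by running a quoted-printable decoder."""
    return quopri.decodestring(line.encode("ascii")).decode("ascii")


if __name__ == "__main__":
    for original in ("From here on", ".", "Fromage is unaffected"):
        protected = protect_line(original)
        print(repr(original), "->", repr(protected), "->",
              repr(restore_line(protected)))
```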
+
+4. Canonical Encoding Model
+
+ There was some confusion, in earlier versions of these documents,
+ regarding the model for when email data was to be converted to
+ canonical form and encoded, and in particular how this process would
+ affect the treatment of CRLFs, given that the representation of
+ newlines varies greatly from system to system. For this reason, a
+ canonical model for encoding is presented below.
+
+ The process of composing a MIME entity can be modeled as being done
+ in a number of steps. Note that these steps are roughly similar to
+ those steps used in PEM [RFC-1421] and are performed for each
+ "innermost level" body:
+
+ (1) Creation of local form.
+
+ The body to be transmitted is created in the system's
+ native format. The native character set is used and,
+ where appropriate, local end of line conventions are
+ used as well. The body may be a UNIX-style text file,
+ or a Sun raster image, or a VMS indexed file, or audio
+ data in a system-dependent format stored only in
+ memory, or anything else that corresponds to the local
+ model for the representation of some form of
+ information. Fundamentally, the data is created in the
+ "native" form that corresponds to the type specified by
+ the media type.
+
+ (2) Conversion to canonical form.
+
+ The entire body, including "out-of-band" information
+ such as record lengths and possibly file attribute
+ information, is converted to a universal canonical
+ form. The specific media type of the body as well as
+ its associated attributes dictate the nature of the
+ canonical form that is used. Conversion to the proper
+ canonical form may involve character set conversion,
+ transformation of audio data, compression, or various
+ other operations specific to the various media types.
+ If character set conversion is involved, however, care
+ must be taken to understand the semantics of the media
+ type, which may have strong implications for any
+ character set conversion, e.g. with regard to
+ syntactically meaningful characters in a text subtype
+ other than "plain".
+
+ For example, in the case of text/plain data, the text
+ must be converted to a supported character set and
+ lines must be delimited with CRLF delimiters in
+ accordance with RFC 822. Note that the restriction on
+ line lengths implied by RFC 822 is eliminated if the
+ next step employs either quoted-printable or base64
+ encoding.
+
+ (3) Apply transfer encoding.
+
+ A Content-Transfer-Encoding appropriate for this body
+ is applied. Note that there is no fixed relationship
+ between the media type and the transfer encoding. In
+ particular, it may be appropriate to base the choice of
+ base64 or quoted-printable on character frequency
+ counts which are specific to a given instance of a
+ body.
+
+ (4) Insertion into entity.
+
+ The encoded body is inserted into a MIME entity with
+ appropriate headers. The entity is then inserted into
+ the body of a higher-level entity (message or
+ multipart) as needed.
+
+ Conversion from entity form to local form is accomplished by
+ reversing these steps. Note that reversal of these steps may produce
+ differing results since there is no guarantee that the original and
+ final local forms are the same.
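
The four composition steps and their reversal can be sketched as follows. This Python example rests on several assumptions that are not part of the model itself: the local form is a str using bare LF newlines, the media type is text/plain, and base64 is always chosen as the transfer encoding. The function names compose_entity and extract_local_form are hypothetical.

```python
import base64


def compose_entity(local_text: str, charset: str = "iso-8859-1") -> bytes:
    """Build a single-part MIME entity from LF-delimited local text."""
    # Step 1: local form. Assumed here: a str with UNIX-style LF newlines.
    # Step 2: canonical form. CRLF line breaks, then the target charset.
    canonical = local_text.replace("\n", "\r\n").encode(charset)
    # Step 3: transfer encoding. encodebytes() emits LF, so fix up to CRLF.
    encoded = base64.encodebytes(canonical).replace(b"\n", b"\r\n")
    # Step 4: insertion into an entity with matching header fields.
    headers = (
        f"Content-Type: text/plain; charset={charset}\r\n"
        "Content-Transfer-Encoding: base64\r\n"
        "\r\n"
    ).encode("ascii")
    return headers + encoded


def extract_local_form(entity: bytes, charset: str = "iso-8859-1") -> str:
    """Reverse the steps: entity, transfer decoding, canonical, local."""
    _headers, _sep, body = entity.partition(b"\r\n\r\n")
    canonical = base64.decodebytes(body)
    return canonical.decode(charset).replace("\r\n", "\n")


if __name__ == "__main__":
    entity = compose_entity("Gr\u00fc\u00dfe\nsecond line\n")
    print(entity.decode("ascii"))
    print(repr(extract_local_form(entity)))
```

The round trip is exact here only because both ends share the same local conventions; a receiver with a different newline convention or character set would end up with a different local form, as noted above.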
+
+ It is vital to note that these steps are only a model; they are
+ specifically NOT a blueprint for how an actual system would be built.
+ In particular, the model fails to account for two common designs:
+
+ (1) In many cases the conversion to a canonical form prior
+ to encoding will be subsumed into the encoder itself,
+ which understands local formats directly. For example,
+ the local newline convention for text bodies might be
+ carried through to the encoder itself along with
+ knowledge of what that format is.
+
+ (2) The output of the encoders may have to pass through one
+ or more additional steps prior to being transmitted as
+ a message. As such, the output of the encoder may not
+ be conformant with the formats specified by RFC 822.
+ In particular, once again it may be appropriate for the
+ converter's output to be expressed using local newline
+ conventions rather than using the standard RFC 822 CRLF
+ delimiters.
+
+ Other implementation variations are conceivable as well. The vital
+ aspect of this discussion is that, in spite of any optimizations,
+ collapsings of required steps, or insertion of additional processing,
+ the resulting messages must be consistent with those produced by the
+ model described here. For example, a message with the following
+ header fields:
+
+ Content-type: text/foo; charset=bar
+ Content-Transfer-Encoding: base64
+
+ must be first represented in the text/foo form, then (if necessary)
+ represented in the "bar" character set, and finally transformed via
+ the base64 algorithm into a mail-safe form.
+
+ NOTE: Some confusion has been caused by systems that represent
+ messages in a format which uses local newline conventions which
+ differ from the RFC822 CRLF convention. It is important to note that
+ these formats are not canonical RFC822/MIME. These formats are
+ instead *encodings* of RFC822, where CRLF sequences in the canonical
+ representation of the message are encoded as the local newline
+ convention. Note that formats which encode CRLF sequences as, for
+ example, LF are not capable of representing MIME messages containing
+ binary data which contains LF octets not part of CRLF line separation
+ sequences.
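
The limitation described in the NOTE can be demonstrated directly. In the Python sketch below, store_with_lf and restore_crlf are hypothetical names for a storage format that encodes every CRLF as LF and for its inverse. Text survives the round trip, while binary data containing an LF that was never part of a CRLF does not.

```python
def store_with_lf(canonical: bytes) -> bytes:
    """Encode canonical CRLF line breaks using a local LF convention."""
    return canonical.replace(b"\r\n", b"\n")


def restore_crlf(stored: bytes) -> bytes:
    """Map the local LF convention back to canonical CRLF."""
    return stored.replace(b"\n", b"\r\n")


# Canonical text: every LF is part of a CRLF pair.
text = b"line one\r\nline two\r\n"

# Binary data: one CRLF pair plus one LF that stands alone.
binary = b"\x89PNG\r\n\x1a\n\x00"

if __name__ == "__main__":
    print(restore_crlf(store_with_lf(text)) == text)
    print(restore_crlf(store_with_lf(binary)) == binary)
    print(restore_crlf(store_with_lf(binary)))
```

After the round trip the lone LF comes back as CRLF, so the binary body has grown by one octet and no longer matches the original.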
+
+5. Summary
+
+ This document defines what is meant by MIME Conformance. It also
+ details various problems known to exist in the Internet email system
+ and how to use MIME to overcome them. Finally, it describes MIME's
+ canonical encoding model.
+
+6. Security Considerations
+
+ Security issues are discussed in the second document in this set, RFC
+ 2046.
+
+7. Authors' Addresses
+
+ For more information, the authors of this document are best contacted
+ via Internet mail:
+
+ Ned Freed
+ Innosoft International, Inc.
+ 1050 East Garvey Avenue South
+ West Covina, CA 91790
+ USA
+
+ Phone: +1 818 919 3600
+ Fax: +1 818 919 3614
+ EMail: ned@innosoft.com
+
+ Nathaniel S. Borenstein
+ First Virtual Holdings
+ 25 Washington Avenue
+ Morristown, NJ 07960
+ USA
+
+ Phone: +1 201 540 8967
+ Fax: +1 201 993 3032
+ EMail: nsb@nsb.fv.com
+
+ MIME is a result of the work of the Internet Engineering Task Force
+ Working Group on RFC 822 Extensions. The chairman of that group,
+ Greg Vaudreuil, may be reached at:
+
+ Gregory M. Vaudreuil
+ Octel Network Services
+ 17080 Dallas Parkway
+ Dallas, TX 75248-1905
+ USA
+
+ EMail: Greg.Vaudreuil@Octel.Com
+
+8. Acknowledgements
+
+ This document is the result of the collective effort of a large
+ number of people, at several IETF meetings, on the IETF-SMTP and
+ IETF-822 mailing lists, and elsewhere. Although any enumeration
+ seems doomed to suffer from egregious omissions, the following are
+ among the many contributors to this effort:
+
+ Harald Tveit Alvestrand Marc Andreessen
+ Randall Atkinson Bob Braden
+ Philippe Brandon Brian Capouch
+ Kevin Carosso Uhhyung Choi
+ Peter Clitherow Dave Collier-Brown
+ Cristian Constantinof John Coonrod
+ Mark Crispin Dave Crocker
+ Stephen Crocker Terry Crowley
+ Walt Daniels Jim Davis
+ Frank Dawson Axel Deininger
+ Hitoshi Doi Kevin Donnelly
+ Steve Dorner Keith Edwards
+ Chris Eich Dana S. Emery
+ Johnny Eriksson Craig Everhart
+ Patrik Faltstrom Erik E. Fair
+ Roger Fajman Alain Fontaine
+ Martin Forssen James M. Galvin
+ Stephen Gildea Philip Gladstone
+ Thomas Gordon Keld Simonsen
+ Terry Gray Phill Gross
+ James Hamilton David Herron
+ Mark Horton Bruce Howard
+ Bill Janssen Olle Jarnefors
+ Risto Kankkunen Phil Karn
+ Alan Katz Tim Kehres
+ Neil Katin Steve Kille
+ Kyuho Kim Anders Klemets
+ John Klensin Valdis Kletniek
+ Jim Knowles Stev Knowles
+ Bob Kummerfeld Pekka Kytolaakso
+ Stellan Lagerstrom Vincent Lau
+ Timo Lehtinen Donald Lindsay
+ Warner Losh Carlyn Lowery
+ Laurence Lundblade Charles Lynn
+ John R. MacMillan Larry Masinter
+ Rick McGowan Michael J. McInerny
+ Leo Mclaughlin Goli Montaser-Kohsari
+ Tom Moore John Gardiner Myers
+ Erik Naggum Mark Needleman
+ Chris Newman John Noerenberg
+ Mats Ohrman Julian Onions
+ Michael Patton David J. Pepper
+ Erik van der Poel Blake C. Ramsdell
+ Christer Romson Luc Rooijakkers
+ Marshall T. Rose Jonathan Rosenberg
+ Guido van Rossum Jan Rynning
+ Harri Salminen Michael Sanderson
+ Yutaka Sato Markku Savela
+ Richard Alan Schafer Masahiro Sekiguchi
+ Mark Sherman Bob Smart
+ Peter Speck Henry Spencer
+ Einar Stefferud Michael Stein
+ Klaus Steinberger Peter Svanberg
+ James Thompson Steve Uhler
+ Stuart Vance Peter Vanderbilt
+ Greg Vaudreuil Ed Vielmetti
+ Larry W. Virden Ryan Waldron
+ Rhys Weatherly Jay Weber
+ Dave Wecker Wally Wedel
+ Sven-Ove Westberg Brian Wideen
+ John Wobus Glenn Wright
+ Rayan Zachariassen David Zimmerman
+
+ The authors apologize for any omissions from this list, which are
+ certainly unintentional.
+
+Appendix A -- A Complex Multipart Example
+
+ What follows is the outline of a complex multipart message. This
+ message contains five parts that are to be displayed serially: two
+ introductory plain text objects, an embedded multipart message, a
+ text/enriched object, and a closing encapsulated text message in a
+ non-ASCII character set. The embedded multipart message itself
+ contains two objects to be displayed in parallel, a picture and an
+ audio fragment.
+
+ MIME-Version: 1.0
+ From: Nathaniel Borenstein <nsb@nsb.fv.com>
+ To: Ned Freed <ned@innosoft.com>
+ Date: Fri, 07 Oct 1994 16:15:05 -0700 (PDT)
+ Subject: A multipart example
+ Content-Type: multipart/mixed;
+ boundary=unique-boundary-1
+
+ This is the preamble area of a multipart message.
+ Mail readers that understand multipart format
+ should ignore this preamble.
+
+ If you are reading this text, you might want to
+ consider changing to a mail reader that understands
+ how to properly display multipart messages.
+
+ --unique-boundary-1
+
+ ... Some text appears here ...
+
+ [Note that the blank line between the boundary and the
+ start of the text in this part means no header fields were
+ given and this is text in the US-ASCII character set.
+ It could have been done with explicit typing as in the
+ next part.]
+
+ --unique-boundary-1
+ Content-type: text/plain; charset=US-ASCII
+
+ This could have been part of the previous part, but
+ illustrates explicit versus implicit typing of body
+ parts.
+
+ --unique-boundary-1
+ Content-Type: multipart/parallel; boundary=unique-boundary-2
+
+ --unique-boundary-2
+ Content-Type: audio/basic
+ Content-Transfer-Encoding: base64
+
+ ... base64-encoded 8000 Hz single-channel
+ mu-law-format audio data goes here ...
+
+ --unique-boundary-2
+ Content-Type: image/jpeg
+ Content-Transfer-Encoding: base64
+
+ ... base64-encoded image data goes here ...
+
+ --unique-boundary-2--
+
+ --unique-boundary-1
+ Content-type: text/enriched
+
+ This is <bold><italic>enriched.</italic></bold>
+ <smaller>as defined in RFC 1896</smaller>
+
+ Isn't it
+ <bigger><bigger>cool?</bigger></bigger>
+
+ --unique-boundary-1
+ Content-Type: message/rfc822
+
+ From: (mailbox in US-ASCII)
+ To: (address in US-ASCII)
+ Subject: (subject in US-ASCII)
+ Content-Type: Text/plain; charset=ISO-8859-1
+ Content-Transfer-Encoding: Quoted-printable
+
+ ... Additional text in ISO-8859-1 goes here ...
+
+ --unique-boundary-1--
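
The structure of this example can be checked with any MIME parser. The Python sketch below uses the standard library's email package on an abbreviated copy of the message: the bodies are shortened, the base64 payloads are placeholders, and the helper name outline is hypothetical. It prints the nesting depth and media type of every entity.

```python
import email

# Abbreviated copy of the Appendix A message; payloads are placeholders.
RAW = "\n".join([
    "MIME-Version: 1.0",
    "Subject: A multipart example",
    "Content-Type: multipart/mixed; boundary=unique-boundary-1",
    "",
    "This is the preamble area of a multipart message.",
    "--unique-boundary-1",
    "",
    "... Some text appears here ...",
    "--unique-boundary-1",
    "Content-type: text/plain; charset=US-ASCII",
    "",
    "Explicitly typed text.",
    "--unique-boundary-1",
    "Content-Type: multipart/parallel; boundary=unique-boundary-2",
    "",
    "--unique-boundary-2",
    "Content-Type: audio/basic",
    "Content-Transfer-Encoding: base64",
    "",
    "AAAA",
    "--unique-boundary-2",
    "Content-Type: image/jpeg",
    "Content-Transfer-Encoding: base64",
    "",
    "AAAA",
    "--unique-boundary-2--",
    "--unique-boundary-1",
    "Content-type: text/enriched",
    "",
    "This is <bold>enriched.</bold>",
    "--unique-boundary-1",
    "Content-Type: message/rfc822",
    "",
    "Subject: (subject in US-ASCII)",
    "Content-Type: Text/plain; charset=ISO-8859-1",
    "Content-Transfer-Encoding: Quoted-printable",
    "",
    "Additional text in ISO-8859-1: caf=E9",
    "--unique-boundary-1--",
    "",
])


def outline(entity, depth=0):
    """Return (depth, media type) pairs for an entity and all it contains."""
    rows = [(depth, entity.get_content_type())]
    if entity.is_multipart():  # also true for message/rfc822 containers
        for part in entity.get_payload():
            rows.extend(outline(part, depth + 1))
    return rows


msg = email.message_from_string(RAW)

if __name__ == "__main__":
    for depth, media_type in outline(msg):
        print("  " * depth + media_type)
```

The first body part has no header fields, so the parser reports the default type text/plain for it, which is the implicit typing the bracketed note in the example describes.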
+
+Appendix B -- Changes from RFC 1521, 1522, and 1590
+
+ These documents are a revision of RFC 1521, 1522, and 1590. For the
+ convenience of those familiar with the earlier documents, the changes
+ from those documents are summarized in this appendix. For further
+ history, note that Appendix H in RFC 1521 specified how that document
+ differed from its predecessor, RFC 1341.
+
+ (1) This document has been completely reformatted and split
+ into multiple documents. This was done to improve the
+ quality of the plain text version of this document,
+ which is required to be the reference copy.
+
+ (2) BNF describing the overall structure of MIME object
+ headers has been added. This is a documentation change
+ only -- the underlying syntax has not changed in any
+ way.
+
+ (3) The specific BNF for the seven media types in MIME has
+ been removed. This BNF was incorrect, incomplete, and
+ inconsistent with the type-independent BNF. And
+ since the type-independent BNF already fully specifies
+ the syntax of the various MIME headers, the type-
+ specific BNF was, in the final analysis, completely
+ unnecessary and caused more problems than it solved.
+
+ (4) The more specific "US-ASCII" character set name has
+ replaced the use of the informal term ASCII in many
+ parts of these documents.
+
+ (5) The informal concept of a primary subtype has been
+ removed.
+
+ (6) The term "object" was being used inconsistently. The
+ definition of this term has been clarified, along with
+ the related terms "body", "body part", and "entity",
+ and usage has been corrected where appropriate.
+
+ (7) The BNF for the multipart media type has been
+ rearranged to make it clear that the CRLF preceding
+ the boundary marker is actually part of the marker
+ itself rather than the preceding body part.
+
+ (8) The prose and BNF describing the multipart media type
+ have been changed to make it clear that the body parts
+ within a multipart object MUST NOT contain any lines
+ beginning with the boundary parameter string.
+
+ (9) In the rules on reassembling "message/partial" MIME
+ entities, "Subject" is added to the list of headers to
+ take from the inner message, and the example is
+ modified to clarify this point.
+
+ (10) "Message/partial" fragmenters are restricted to
+ splitting MIME objects only at line boundaries.
+
+ (11) In the discussion of the application/postscript type,
+ an additional paragraph has been added warning about
+ possible interoperability problems caused by embedding
+ of binary data inside a PostScript MIME entity.
+
+ (12) Added a clarifying note to the basic syntax rules for
+ the Content-Type header field to make it clear that the
+ following two forms:
+
+ Content-type: text/plain; charset=us-ascii (comment)
+
+ Content-type: text/plain; charset="us-ascii"
+
+ are completely equivalent.
+
+ (13) The following sentence has been removed from the
+ discussion of the MIME-Version header: "However,
+ conformant software is encouraged to check the version
+ number and at least warn the user if an unrecognized
+ MIME-version is encountered."
+
+ (14) A typo was fixed that said "application/external-body"
+ instead of "message/external-body".
+
+ (15) The definition of a character set has been reorganized
+ to make the requirements clearer.
+
+ (16) The definition of the "image/gif" media type has been
+ moved to a separate document. This change was made
+ because of potential conflicts with IETF rules
+ governing the standardization of patented technology.
+
+ (17) The definitions of "7bit" and "8bit" have been
+ tightened so that CR and LF can only be used as part of
+ CRLF end-of-line sequences. The document also no longer
+ requires that NUL characters be preserved, which brings
+ MIME into alignment with real-world implementations.
+
+ (18) The definition of canonical text in MIME has been
+ tightened so that line breaks must be represented by a
+ CRLF sequence. CR and LF characters are not allowed
+ outside of this usage. The definition of quoted-
+ printable encoding has been altered accordingly.
+
+ (19) The definition of the quoted-printable encoding now
+ includes a number of suggestions for how quoted-
+ printable encoders might best handle improperly encoded
+ material.
+
+ (20) Prose was added to clarify the use of the "7bit",
+ "8bit", and "binary" transfer-encodings on multipart or
+ message entities encapsulating "8bit" or "binary" data.
+
+ (21) In the section on MIME Conformance, "multipart/digest"
+ support was added to the list of requirements for
+ minimal MIME conformance. Also, the requirements for
+ "message/rfc822" support were strengthened to clarify
+ the importance of recognizing recursive structure.
+
+ (22) The various restrictions on subtypes of "message" are
+ now specified entirely on a subtype by subtype basis.
+
+ (23) The definition of "message/rfc822" was changed to
+ indicate that at least one of the "From", "Subject", or
+ "Date" headers must be present.
+
+ (24) The required handling of unrecognized subtypes as
+ "application/octet-stream" has been made more explicit
+ in both the type definitions sections and the
+ conformance guidelines.
+
+ (25) Examples using text/richtext were changed to
+ text/enriched.
+
+ (26) The BNF definition of subtype has been changed to make
+ it clear that either an IANA registered subtype or a
+ nonstandard "X-" subtype must be used in a Content-Type
+ header field.
+
+ (27) MIME media types that are simply registered for use and
+ those that are standardized by the IETF are now
+ distinguished in the MIME BNF.
+
+ (28) All of the various MIME registration procedures have
+ been extensively revised. IANA registration procedures
+ for character sets have been moved to a separate
+ document that is not included in this set of documents.
+
+ (29) The use of escape and shift mechanisms in the US-ASCII
+ and ISO-8859-X character sets these documents define
+ has been clarified: Such mechanisms should never be
+ used in conjunction with these character sets and their
+ effect if they are used is undefined.
+
+ (30) The definition of the AFS access-type for
+ message/external-body has been removed.
+
+ (31) The handling of the combination of
+ multipart/alternative and message/external-body is now
+ specifically addressed.
+
+ (32) Security issues specific to message/external-body are
+ now discussed in some detail.
+
+Appendix C -- References
+
+ [ATK]
+ Borenstein, Nathaniel S., Multimedia Applications
+ Development with the Andrew Toolkit, Prentice-Hall, 1990.
+
+ [ISO-2022]
+ International Standard -- Information Processing --
+ Character Code Structure and Extension Techniques,
+ ISO/IEC 2022:1994, 4th ed.
+
+ [ISO-8859]
+ International Standard -- Information Processing -- 8-bit
+ Single-Byte Coded Graphic Character Sets
+ - Part 1: Latin Alphabet No. 1, ISO 8859-1:1987, 1st ed.
+ - Part 2: Latin Alphabet No. 2, ISO 8859-2:1987, 1st ed.
+ - Part 3: Latin Alphabet No. 3, ISO 8859-3:1988, 1st ed.
+ - Part 4: Latin Alphabet No. 4, ISO 8859-4:1988, 1st ed.
+ - Part 5: Latin/Cyrillic Alphabet, ISO 8859-5:1988, 1st
+ ed.
+ - Part 6: Latin/Arabic Alphabet, ISO 8859-6:1987, 1st ed.
+ - Part 7: Latin/Greek Alphabet, ISO 8859-7:1987, 1st ed.
+ - Part 8: Latin/Hebrew Alphabet, ISO 8859-8:1988, 1st ed.
+ - Part 9: Latin Alphabet No. 5, ISO/IEC 8859-9:1989, 1st
+ ed.
+ International Standard -- Information Technology -- 8-bit
+ Single-Byte Coded Graphic Character Sets
+ - Part 10: Latin Alphabet No. 6, ISO/IEC 8859-10:1992,
+ 1st ed.
+
+ [ISO-646]
+ International Standard -- Information Technology -- ISO
+ 7-bit Coded Character Set for Information Interchange,
+ ISO 646:1991, 3rd ed.
+
+ [JPEG]
+ JPEG Draft Standard ISO 10918-1 CD.
+
+ [MPEG]
+ Video Coding Draft Standard ISO 11172 CD, ISO
+ IEC/JTC1/SC2/WG11 (Motion Picture Experts Group), May,
+ 1991.
+
+ [PCM]
+ CCITT, Fascicle III.4 - Recommendation G.711, "Pulse Code
+ Modulation (PCM) of Voice Frequencies", Geneva, 1972.
+
+ [POSTSCRIPT]
+ Adobe Systems, Inc., PostScript Language Reference
+ Manual, Addison-Wesley, 1985.
+
+ [POSTSCRIPT2]
+ Adobe Systems, Inc., PostScript Language Reference
+ Manual, Addison-Wesley, Second Ed., 1990.
+
+ [RFC-783]
+ Sollins, K.R., "TFTP Protocol (revision 2)", RFC-783,
+ MIT, June 1981.
+
+ [RFC-821]
+ Postel, J.B., "Simple Mail Transfer Protocol", STD 10,
+ RFC 821, USC/Information Sciences Institute, August 1982.
+
+ [RFC-822]
+ Crocker, D., "Standard for the Format of ARPA Internet
+ Text Messages", STD 11, RFC 822, UDEL, August 1982.
+
+ [RFC-934]
+ Rose, M. and E. Stefferud, "Proposed Standard for Message
+ Encapsulation", RFC 934, Delaware and NMA, January 1985.
+
+ [RFC-959]
+ Postel, J. and J. Reynolds, "File Transfer Protocol", STD
+ 9, RFC 959, USC/Information Sciences Institute, October
+ 1985.
+
+ [RFC-1049]
+ Sirbu, M., "Content-Type Header Field for Internet
+ Messages", RFC 1049, CMU, March 1988.
+
+ [RFC-1154]
+ Robinson, D., and R. Ullmann, "Encoding Header Field for
+ Internet Messages", RFC 1154, Prime Computer, Inc., April
+ 1990.
+
+ [RFC-1341]
+ Borenstein, N., and N. Freed, "MIME (Multipurpose
+ Internet Mail Extensions): Mechanisms for Specifying and
+ Describing the Format of Internet Message Bodies", RFC
+ 1341, Bellcore, Innosoft, June 1992.
+
+
+
+
+Freed & Borenstein Standards Track [Page 21]
+\f
+RFC 2049 MIME Conformance November 1996
+
+
+ [RFC-1342]
+ Moore, K., "Representation of Non-Ascii Text in Internet
+ Message Headers", RFC 1342, University of Tennessee, June
+ 1992.
+
+ [RFC-1344]
+ Borenstein, N., "Implications of MIME for Internet Mail
+ Gateways", RFC 1344, Bellcore, June 1992.
+
+ [RFC-1345]
+ Simonsen, K., "Character Mnemonics & Character Sets", RFC
+ 1345, Rationel Almen Planlaegning, June 1992.
+
+ [RFC-1421]
+ Linn, J., "Privacy Enhancement for Internet Electronic
+ Mail: Part I -- Message Encryption and Authentication
+ Procedures", RFC 1421, IAB IRTF PSRG, IETF PEM WG,
+ February 1993.
+
+ [RFC-1422]
+ Kent, S., "Privacy Enhancement for Internet Electronic
+ Mail: Part II -- Certificate-Based Key Management", RFC
+ 1422, IAB IRTF PSRG, IETF PEM WG, February 1993.
+
+ [RFC-1423]
+ Balenson, D., "Privacy Enhancement for Internet
+ Electronic Mail: Part III -- Algorithms, Modes, and
+ Identifiers", RFC 1423, IAB IRTF PSRG, IETF PEM WG, February 1993.
+
+ [RFC-1424]
+ Kaliski, B., "Privacy Enhancement for Internet Electronic
+ Mail: Part IV -- Key Certification and Related
+ Services", RFC 1424, IAB IRTF PSRG, IETF PEM WG, February 1993.
+
+ [RFC-1521]
+ Borenstein, N., and Freed, N., "MIME (Multipurpose
+ Internet Mail Extensions): Mechanisms for Specifying and
+ Describing the Format of Internet Message Bodies", RFC
+ 1521, Bellcore, Innosoft, September, 1993.
+
+ [RFC-1522]
+ Moore, K., "Representation of Non-ASCII Text in Internet
+ Message Headers", RFC 1522, University of Tennessee,
+ September 1993.
+
+
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 22]
+\f
+RFC 2049 MIME Conformance November 1996
+
+
+ [RFC-1524]
+ Borenstein, N., "A User Agent Configuration Mechanism for
+ Multimedia Mail Format Information", RFC 1524, Bellcore,
+ September 1993.
+
+ [RFC-1543]
+ Postel, J., "Instructions to RFC Authors", RFC 1543,
+ USC/Information Sciences Institute, October 1993.
+
+ [RFC-1556]
+ Nussbacher, H., "Handling of Bi-directional Texts in
+ MIME", RFC 1556, Israeli Inter-University Computer
+ Center, December 1993.
+
+ [RFC-1590]
+ Postel, J., "Media Type Registration Procedure", RFC
+ 1590, USC/Information Sciences Institute, March 1994.
+
+ [RFC-1602]
+ Internet Architecture Board, Internet Engineering
+ Steering Group, Huitema, C., Gross, P., "The Internet
+ Standards Process -- Revision 2", RFC 1602, March 1994.
+
+ [RFC-1652]
+ Klensin, J., (WG Chair), Freed, N., (Editor), Rose, M.,
+ Stefferud, E., and Crocker, D., "SMTP Service Extension
+ for 8bit-MIME transport", RFC 1652, United Nations
+ University, Innosoft, Dover Beach Consulting, Inc.,
+ Network Management Associates, Inc., The Branch Office,
+ March 1994.
+
+ [RFC-1700]
+ Reynolds, J. and J. Postel, "Assigned Numbers", STD 2,
+ RFC 1700, USC/Information Sciences Institute, October
+ 1994.
+
+ [RFC-1741]
+ Faltstrom, P., Crocker, D., and Fair, E., "MIME Content
+ Type for BinHex Encoded Files", RFC 1741, December 1994.
+
+ [RFC-1896]
+ Resnick, P., and A. Walker, "The text/enriched MIME
+ Content-type", RFC 1896, February, 1996.
+
+
+
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 23]
+\f
+RFC 2049 MIME Conformance November 1996
+
+
+ [RFC-2045]
+ Freed, N., and N. Borenstein, "Multipurpose Internet Mail
+ Extensions (MIME) Part One: Format of Internet Message
+ Bodies", RFC 2045, Innosoft, First Virtual Holdings,
+ November 1996.
+
+ [RFC-2046]
+ Freed, N., and N. Borenstein, "Multipurpose Internet Mail
+ Extensions (MIME) Part Two: Media Types", RFC 2046,
+ Innosoft, First Virtual Holdings, November 1996.
+
+ [RFC-2047]
+ Moore, K., "Multipurpose Internet Mail Extensions (MIME)
+ Part Three: Representation of Non-ASCII Text in Internet
+ Message Headers", RFC 2047, University of
+ Tennessee, November 1996.
+
+ [RFC-2048]
+ Freed, N., Klensin, J., and J. Postel, "Multipurpose
+ Internet Mail Extensions (MIME) Part Four: MIME
+ Registration Procedures", RFC 2048, Innosoft, MCI,
+ ISI, November 1996.
+
+ [RFC-2049]
+ Freed, N. and N. Borenstein, "Multipurpose Internet Mail
+ Extensions (MIME) Part Five: Conformance Criteria and
+ Examples", RFC 2049 (this document), Innosoft, First
+ Virtual Holdings, November 1996.
+
+ [US-ASCII]
+ Coded Character Set -- 7-Bit American Standard Code for
+ Information Interchange, ANSI X3.4-1986.
+
+ [X400]
+ Schicker, Pietro, "Message Handling Systems, X.400",
+ Message Handling Systems and Distributed Applications, E.
+ Stefferud, O-j. Jacobsen, and P. Schicker, eds., North-
+ Holland, 1989, pp. 3-41.
+
+
+
+
+
+
+
+
+
+
+
+
+
+Freed & Borenstein Standards Track [Page 24]
+\f
--- /dev/null
+
+
+
+
+
+
+Network Working Group M. Crispin
+Request for Comments: 2060 University of Washington
+Obsoletes: 1730 December 1996
+Category: Standards Track
+
+
+ INTERNET MESSAGE ACCESS PROTOCOL - VERSION 4rev1
+
+Status of this Memo
+
+ This document specifies an Internet standards track protocol for the
+ Internet community, and requests discussion and suggestions for
+ improvements. Please refer to the current edition of the "Internet
+ Official Protocol Standards" (STD 1) for the standardization state
+ and status of this protocol. Distribution of this memo is unlimited.
+
+Abstract
+
+ The Internet Message Access Protocol, Version 4rev1 (IMAP4rev1)
+ allows a client to access and manipulate electronic mail messages on
+ a server. IMAP4rev1 permits manipulation of remote message folders,
+ called "mailboxes", in a way that is functionally equivalent to local
+ mailboxes. IMAP4rev1 also provides the capability for an offline
+ client to resynchronize with the server (see also [IMAP-DISC]).
+
+ IMAP4rev1 includes operations for creating, deleting, and renaming
+ mailboxes; checking for new messages; permanently removing messages;
+ setting and clearing flags; [RFC-822] and [MIME-IMB] parsing;
+ searching; and selective fetching of message attributes, texts, and
+ portions thereof. Messages in IMAP4rev1 are accessed by the use of
+ numbers. These numbers are either message sequence numbers or unique
+ identifiers.
+
+ IMAP4rev1 supports a single server. A mechanism for accessing
+ configuration information to support multiple IMAP4rev1 servers is
+ discussed in [ACAP].
+
+ IMAP4rev1 does not specify a means of posting mail; this function is
+ handled by a mail transfer protocol such as [SMTP].
+
+ IMAP4rev1 is designed to be upwards compatible from the [IMAP2] and
+ unpublished IMAP2bis protocols. In the course of the evolution of
+ IMAP4rev1, some aspects in the earlier protocol have become obsolete.
+ Obsolete commands, responses, and data formats which an IMAP4rev1
+ implementation may encounter when used with an earlier implementation
+ are described in [IMAP-OBSOLETE].
+
+
+
+
+
+Crispin Standards Track [Page 1]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+ Other compatibility issues with IMAP2bis, the most common variant of
+ the earlier protocol, are discussed in [IMAP-COMPAT]. A full
+ discussion of compatibility issues with rare (and presumed extinct)
+ variants of [IMAP2] is in [IMAP-HISTORICAL]; this document is
+ primarily of historical interest.
+
+Table of Contents
+
+IMAP4rev1 Protocol Specification .................................. 4
+1. How to Read This Document ................................. 4
+1.1. Organization of This Document ............................. 4
+1.2. Conventions Used in This Document ......................... 4
+2. Protocol Overview ......................................... 5
+2.1. Link Level ................................................ 5
+2.2. Commands and Responses .................................... 6
+2.2.1. Client Protocol Sender and Server Protocol Receiver ....... 6
+2.2.2. Server Protocol Sender and Client Protocol Receiver ....... 7
+2.3. Message Attributes ........................................ 7
+2.3.1. Message Numbers ........................................... 7
+2.3.1.1. Unique Identifier (UID) Message Attribute ......... 7
+2.3.1.2. Message Sequence Number Message Attribute ......... 9
+2.3.2. Flags Message Attribute .................................... 9
+2.3.3. Internal Date Message Attribute ........................... 10
+2.3.4. [RFC-822] Size Message Attribute .......................... 11
+2.3.5. Envelope Structure Message Attribute ...................... 11
+2.3.6. Body Structure Message Attribute .......................... 11
+2.4. Message Texts ............................................. 11
+3. State and Flow Diagram .................................... 11
+3.1. Non-Authenticated State ................................... 11
+3.2. Authenticated State ....................................... 11
+3.3. Selected State ............................................ 12
+3.4. Logout State .............................................. 12
+4. Data Formats .............................................. 12
+4.1. Atom ...................................................... 13
+4.2. Number .................................................... 13
+4.3. String ..................................................... 13
+4.3.1. 8-bit and Binary Strings .................................. 13
+4.4. Parenthesized List ........................................ 14
+4.5. NIL ....................................................... 14
+5. Operational Considerations ................................ 14
+5.1. Mailbox Naming ............................................ 14
+5.1.1. Mailbox Hierarchy Naming .................................. 14
+5.1.2. Mailbox Namespace Naming Convention ....................... 14
+5.1.3. Mailbox International Naming Convention ................... 15
+5.2. Mailbox Size and Message Status Updates ................... 16
+5.3. Response when no Command in Progress ...................... 16
+5.4. Autologout Timer .......................................... 16
+5.5. Multiple Commands in Progress ............................. 17
+
+
+
+Crispin Standards Track [Page 2]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+6. Client Commands ........................................... 17
+6.1. Client Commands - Any State ............................... 18
+6.1.1. CAPABILITY Command ........................................ 18
+6.1.2. NOOP Command .............................................. 19
+6.1.3. LOGOUT Command ............................................ 20
+6.2. Client Commands - Non-Authenticated State ................. 20
+6.2.1. AUTHENTICATE Command ...................................... 21
+6.2.2. LOGIN Command ............................................. 22
+6.3. Client Commands - Authenticated State ..................... 22
+6.3.1. SELECT Command ............................................ 23
+6.3.2. EXAMINE Command ........................................... 24
+6.3.3. CREATE Command ............................................ 25
+6.3.4. DELETE Command ............................................ 26
+6.3.5. RENAME Command ............................................ 27
+6.3.6. SUBSCRIBE Command ......................................... 29
+6.3.7. UNSUBSCRIBE Command ....................................... 30
+6.3.8. LIST Command .............................................. 30
+6.3.9. LSUB Command .............................................. 32
+6.3.10. STATUS Command ............................................ 33
+6.3.11. APPEND Command ............................................ 34
+6.4. Client Commands - Selected State .......................... 35
+6.4.1. CHECK Command ............................................. 36
+6.4.2. CLOSE Command ............................................. 36
+6.4.3. EXPUNGE Command ........................................... 37
+6.4.4. SEARCH Command ............................................ 37
+6.4.5. FETCH Command ............................................. 41
+6.4.6. STORE Command ............................................. 45
+6.4.7. COPY Command .............................................. 46
+6.4.8. UID Command ............................................... 47
+6.5. Client Commands - Experimental/Expansion .................. 48
+6.5.1. X<atom> Command ........................................... 48
+7. Server Responses .......................................... 48
+7.1. Server Responses - Status Responses ....................... 49
+7.1.1. OK Response ............................................... 51
+7.1.2. NO Response ............................................... 51
+7.1.3. BAD Response .............................................. 52
+7.1.4. PREAUTH Response .......................................... 52
+7.1.5. BYE Response .............................................. 52
+7.2. Server Responses - Server and Mailbox Status .............. 53
+7.2.1. CAPABILITY Response ....................................... 53
+7.2.2. LIST Response .............................................. 54
+7.2.3. LSUB Response ............................................. 55
+7.2.4. STATUS Response ........................................... 55
+7.2.5. SEARCH Response ........................................... 55
+7.2.6. FLAGS Response ............................................ 56
+7.3. Server Responses - Mailbox Size ........................... 56
+7.3.1. EXISTS Response ........................................... 56
+7.3.2. RECENT Response ........................................... 57
+
+
+
+Crispin Standards Track [Page 3]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+7.4. Server Responses - Message Status ......................... 57
+7.4.1. EXPUNGE Response .......................................... 57
+7.4.2. FETCH Response ............................................ 58
+7.5. Server Responses - Command Continuation Request ........... 63
+8. Sample IMAP4rev1 connection ............................... 63
+9. Formal Syntax ............................................. 64
+10. Author's Note ............................................. 74
+11. Security Considerations ................................... 74
+12. Author's Address .......................................... 75
+Appendices ........................................................ 76
+A. References ................................................ 76
+B. Changes from RFC 1730 ..................................... 77
+C. Key Word Index ............................................ 79
+
+
+IMAP4rev1 Protocol Specification
+
+1. How to Read This Document
+
+1.1. Organization of This Document
+
+ This document is written from the point of view of the implementor of
+ an IMAP4rev1 client or server. Beyond the protocol overview in
+ section 2, it is not optimized for someone trying to understand the
+ operation of the protocol. The material in sections 3 through 5
+ provides the general context and definitions with which IMAP4rev1
+ operates.
+
+ Sections 6, 7, and 9 describe the IMAP commands, responses, and
+ syntax, respectively. The relationships among these are such that it
+ is almost impossible to understand any of them separately. In
+ particular, do not attempt to deduce command syntax from the command
+ section alone; instead refer to the Formal Syntax section.
+
+1.2. Conventions Used in This Document
+
+ In examples, "C:" and "S:" indicate lines sent by the client and
+ server respectively.
+
+ The following terms are used in this document to signify the
+ requirements of this specification.
+
+ 1) MUST, or the adjective REQUIRED, means that the definition is
+ an absolute requirement of the specification.
+
+ 2) MUST NOT means that the definition is an absolute prohibition of the
+ specification.
+
+
+
+
+Crispin Standards Track [Page 4]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+ 3) SHOULD means that there may exist valid reasons in particular
+ circumstances to ignore a particular item, but the full
+ implications MUST be understood and carefully weighed before
+ choosing a different course.
+
+ 4) SHOULD NOT means that there may exist valid reasons in
+ particular circumstances when the particular behavior is
+ acceptable or even useful, but the full implications SHOULD be
+ understood and the case carefully weighed before implementing
+ any behavior described with this label.
+
+ 5) MAY, or the adjective OPTIONAL, means that an item is truly
+ optional. One vendor may choose to include the item because a
+ particular marketplace requires it or because the vendor feels
+ that it enhances the product while another vendor may omit the
+ same item. An implementation which does not include a
+ particular option MUST be prepared to interoperate with another
+ implementation which does include the option.
+
+ "Can" is used instead of "may" when referring to a possible
+ circumstance or situation, as opposed to an optional facility of
+ the protocol.
+
+ "User" is used to refer to a human user, whereas "client" refers
+ to the software being run by the user.
+
+ "Connection" refers to the entire sequence of client/server
+ interaction from the initial establishment of the network
+ connection until its termination. "Session" refers to the
+ sequence of client/server interaction from the time that a mailbox
+ is selected (SELECT or EXAMINE command) until the time that
+ selection ends (SELECT or EXAMINE of another mailbox, CLOSE
+ command, or connection termination).
+
+ Characters are 7-bit US-ASCII unless otherwise specified. Other
+ character sets are indicated using a "CHARSET", as described in
+ [MIME-IMT] and defined in [CHARSET]. CHARSETs have important
+ semantics beyond defining a character set; refer
+ to these documents for more detail.
+
+2. Protocol Overview
+
+2.1. Link Level
+
+ The IMAP4rev1 protocol assumes a reliable data stream such as
+ provided by TCP. When TCP is used, an IMAP4rev1 server listens on
+ port 143.
+
+
+
+
+Crispin Standards Track [Page 5]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+2.2. Commands and Responses
+
+ An IMAP4rev1 connection consists of the establishment of a
+ client/server network connection, an initial greeting from the
+ server, and client/server interactions. These client/server
+ interactions consist of a client command, server data, and a server
+ completion result response.
+
+ All interactions transmitted by client and server are in the form of
+ lines; that is, strings that end with a CRLF. The protocol receiver
+ of an IMAP4rev1 client or server is either reading a line, or is
+ reading a sequence of octets with a known count followed by a line.
+
+2.2.1. Client Protocol Sender and Server Protocol Receiver
+
+ The client command begins an operation. Each client command is
+ prefixed with an identifier (typically a short alphanumeric string,
+ e.g. A0001, A0002, etc.) called a "tag". A different tag is
+ generated by the client for each command.
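+
+ Illustration (not part of the protocol): a minimal Python sketch of
+ one way a client might produce a different tag for each command and
+ frame the command as a CRLF-terminated line. The helper names
+ tag_generator and tagged_command, the "A" prefix, and the four-digit
+ width are assumptions made for this example only; the text above
+ requires only that each command carry a different tag.
+

```python
import itertools


def tag_generator(prefix="A"):
    """Yield a fresh tag for every command: A0001, A0002, ..."""
    for n in itertools.count(1):
        yield f"{prefix}{n:04d}"


def tagged_command(tags, command):
    """Prefix a command with the next tag and end the line with CRLF."""
    return f"{next(tags)} {command}\r\n".encode("ascii")


tags = tag_generator()
first = tagged_command(tags, "NOOP")
second = tagged_command(tags, "CAPABILITY")
```

+
+ Keeping the generator separate from the framing helper lets a client
+ remember which tag belongs to which outstanding command.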
+
+ There are two cases in which a line from the client does not
+ represent a complete command. In one case, a command argument is
+ quoted with an octet count (see the description of literal in String
+ under Data Formats); in the other case, the command arguments require
+ server feedback (see the AUTHENTICATE command). In either case, the
+ server sends a command continuation request response if it is ready
+ for the octets (if appropriate) and the remainder of the command.
+ This response is prefixed with the token "+".
+
+ Note: If, instead, the server detected an error in the command, it
+ sends a BAD completion response with tag matching the command (as
+ described below) to reject the command and prevent the client from
+ sending any more of the command.
+
+ It is also possible for the server to send a completion response
+ for some other command (if multiple commands are in progress), or
+ untagged data. In either case, the command continuation request
+ is still pending; the client takes the appropriate action for the
+ response, and reads another response from the server. In all
+ cases, the client MUST send a complete command (including
+ receiving all command continuation request responses and command
+ continuations for the command) before initiating a new command.
+
+ The protocol receiver of an IMAP4rev1 server reads a command line
+ from the client, parses the command and its arguments, and transmits
+ server data and a server command completion result response.
+
+
+
+
+
+Crispin Standards Track [Page 6]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+2.2.2. Server Protocol Sender and Client Protocol Receiver
+
+ Data transmitted by the server to the client and status responses
+ that do not indicate command completion are prefixed with the token
+ "*", and are called untagged responses.
+
+ Server data MAY be sent as a result of a client command, or MAY be
+ sent unilaterally by the server. There is no syntactic difference
+ between server data that resulted from a specific command and server
+ data that were sent unilaterally.
+
+ The server completion result response indicates the success or
+ failure of the operation. It is tagged with the same tag as the
+ client command which began the operation. Thus, if more than one
+ command is in progress, the tag in a server completion response
+ identifies the command to which the response applies. There are
+ three possible server completion responses: OK (indicating success),
+ NO (indicating failure), or BAD (indicating protocol error such as
+ unrecognized command or command syntax error).
+
+ The protocol receiver of an IMAP4rev1 client reads a response line
+ from the server. It then takes action on the response based upon the
+ first token of the response, which can be a tag, a "*", or a "+".
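+
+ Illustration (not part of the protocol): a minimal Python sketch of
+ the dispatch described above, in which the first token of a response
+ line decides how the line is handled. The function names and the
+ tuple layout are assumptions made for this example; literals (octet
+ counts followed by data) are deliberately not handled here.
+

```python
def classify_response(line: bytes):
    """Split one server line into (kind, tag, rest).

    kind is "untagged" for "*", "continuation" for "+",
    and "tagged" for anything else (the token is then the tag).
    """
    text = line.rstrip(b"\r\n").decode("ascii", errors="replace")
    token, _, rest = text.partition(" ")
    if token == "*":
        return ("untagged", None, rest)
    if token == "+":
        return ("continuation", None, rest)
    return ("tagged", token, rest)


def completion_status(rest: str):
    """Return OK, NO, or BAD for a tagged completion line, else None."""
    status = rest.split(" ", 1)[0].upper()
    return status if status in ("OK", "NO", "BAD") else None
```

+
+ A client with several commands in progress would look up the
+ returned tag to find the command that the completion result
+ applies to.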
+
+ A client MUST be prepared to accept any server response at all times.
+ This includes server data that was not requested. Server data SHOULD
+ be recorded, so that the client can reference its recorded copy
+ rather than sending a command to the server to request the data. In
+ the case of certain server data, the data MUST be recorded.
+
+ This topic is discussed in greater detail in the Server Responses
+ section.
+
+2.3. Message Attributes
+
+ In addition to message text, each message has several attributes
+ associated with it. These attributes may be retrieved individually
+ or in conjunction with other attributes or message texts.
+
+2.3.1. Message Numbers
+
+ Messages in IMAP4rev1 are accessed by one of two numbers: the unique
+ identifier or the message sequence number.
+
+2.3.1.1. Unique Identifier (UID) Message Attribute
+
+ A 32-bit value assigned to each message, which when used with the
+ unique identifier validity value (see below) forms a 64-bit value
+
+
+
+Crispin Standards Track [Page 7]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+ that is permanently guaranteed not to refer to any other message in
+ the mailbox. Unique identifiers are assigned in a strictly ascending
+ fashion in the mailbox; as each message is added to the mailbox it is
+ assigned a higher UID than the message(s) which were added
+ previously.
+
+ Unlike message sequence numbers, unique identifiers are not
+ necessarily contiguous. Unique identifiers also persist across
+ sessions. This permits a client to resynchronize its state from a
+ previous session with the server (e.g. disconnected or offline access
+ clients); this is discussed further in [IMAP-DISC].
+
+ Associated with every mailbox is a unique identifier validity value,
+ which is sent in a UIDVALIDITY response code in an OK untagged
+ response at mailbox selection time. If unique identifiers from an
+ earlier session fail to persist to this session, the unique
+ identifier validity value MUST be greater than the one used in the
+ earlier session.
+
+ Note: Unique identifiers MUST be strictly ascending in the mailbox
+ at all times. If the physical message store is re-ordered by a
+ non-IMAP agent, this requires that the unique identifiers in the
+ mailbox be regenerated, since the former unique identifiers are no
+ longer strictly ascending as a result of the re-ordering. Another
+ instance in which unique identifiers are regenerated is if the
+ message store has no mechanism to store unique identifiers.
+ Although this specification recognizes that this may be
+ unavoidable in certain server environments, it STRONGLY ENCOURAGES
+ message store implementation techniques that avoid this problem.
+
+ Another cause of non-persistence is if the mailbox is deleted and
+ a new mailbox with the same name is created at a later date. Since
+ the name is the same, a client may not know that this is a new
+ mailbox unless the unique identifier validity is different. A
+ good value to use for the unique identifier validity value is a
+ 32-bit representation of the creation date/time of the mailbox.
+ It is acceptable to use a constant such as 1, but only if it is
+ guaranteed that unique identifiers will never be reused, even in
+ the case of a mailbox being deleted (or renamed) and a new mailbox
+ by the same name created at some future time.
+
+ The unique identifier of a message MUST NOT change during the
+ session, and SHOULD NOT change between sessions. However, if it is
+ not possible to preserve the unique identifier of a message in a
+ subsequent session, each subsequent session MUST have a new unique
+ identifier validity value that is larger than any that was used
+ previously.
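+
+ Illustration (not part of the protocol): a minimal Python sketch of
+ how a client might combine the two 32-bit values into one 64-bit key
+ and decide whether UIDs cached from an earlier session are still
+ usable. Placing the validity value in the high 32 bits is an
+ assumption of this example; the text above says only that the two
+ values together form a 64-bit value. The function names are likewise
+ hypothetical.
+

```python
def global_uid(uidvalidity: int, uid: int) -> int:
    """Pack UIDVALIDITY (high half) and UID (low half) into 64 bits."""
    for value in (uidvalidity, uid):
        if not 0 <= value < 2**32:
            raise ValueError("expected an unsigned 32-bit value")
    return (uidvalidity << 32) | uid


def cached_uids_usable(cached_validity: int, current_validity: int) -> bool:
    """Cached UIDs may be reused only if the validity value is unchanged."""
    return cached_validity == current_validity
```

+
+ When cached_uids_usable returns False, a client would discard its
+ cached UIDs for that mailbox and resynchronize from scratch.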
+
+
+
+
+Crispin Standards Track [Page 8]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+2.3.1.2. Message Sequence Number Message Attribute
+
+ A relative position from 1 to the number of messages in the mailbox.
+ This position MUST be ordered by ascending unique identifier. As
+ each new message is added, it is assigned a message sequence number
+ that is 1 higher than the number of messages in the mailbox before
+ that new message was added.
+
+ Message sequence numbers can be reassigned during the session. For
+ example, when a message is permanently removed (expunged) from the
+ mailbox, the message sequence number for all subsequent messages is
+ decremented. Similarly, a new message can be assigned a message
+ sequence number that was once held by some other message prior to an
+ expunge.
+
+ In addition to accessing messages by relative position in the
+ mailbox, message sequence numbers can be used in mathematical
+ calculations. For example, if an untagged "11 EXISTS" is received,
+ and previously an untagged "8 EXISTS" was received, three new
+ messages have arrived with message sequence numbers of 9, 10, and 11.
+ Another example: if message 287 in a 523 message mailbox has UID
+ 12345, there are exactly 286 messages which have lesser UIDs and 236
+ messages which have greater UIDs.
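+
+ Illustration (not part of the protocol): a minimal Python sketch of
+ the arithmetic described in this section. All three function names
+ are hypothetical; the numbers in the usage note come from the
+ examples in the text.
+

```python
def new_message_numbers(previous_exists: int, current_exists: int):
    """Sequence numbers of messages that arrived between two EXISTS counts."""
    return list(range(previous_exists + 1, current_exists + 1))


def renumber_after_expunge(seq: int, expunged: int):
    """New sequence number of message `seq` once `expunged` is removed.

    Returns None when `seq` is the expunged message itself.
    """
    if seq == expunged:
        return None
    return seq - 1 if seq > expunged else seq


def uid_neighbours(seq: int, total: int):
    """Counts of messages with lesser and greater UIDs than message `seq`."""
    return (seq - 1, total - seq)
```

+
+ With these helpers, new_message_numbers(8, 11) yields 9, 10, and 11,
+ and uid_neighbours(287, 523) yields 286 and 236.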
+
+2.3.2. Flags Message Attribute
+
+ A list of zero or more named tokens associated with the message. A
+ flag is set by its addition to this list, and is cleared by its
+ removal. There are two types of flags in IMAP4rev1. A flag of
+ either type may be permanent or session-only.
+
+ A system flag is a flag name that is pre-defined in this
+ specification. All system flags begin with "\". Certain system
+ flags (\Deleted and \Seen) have special semantics described
+ elsewhere. The currently-defined system flags are:
+
+ \Seen Message has been read
+
+ \Answered Message has been answered
+
+ \Flagged Message is "flagged" for urgent/special attention
+
+ \Deleted Message is "deleted" for removal by later EXPUNGE
+
+ \Draft Message has not completed composition (marked as a
+ draft).
+
+
+
+
+
+Crispin Standards Track [Page 9]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+ \Recent Message is "recently" arrived in this mailbox. This
+ session is the first session to have been notified
+ about this message; subsequent sessions will not see
+ \Recent set for this message. This flag can not be
+ altered by the client.
+
+ If it is not possible to determine whether or not
+ this session is the first session to be notified
+ about a message, then that message SHOULD be
+ considered recent.
+
+ If multiple connections have the same mailbox
+ selected simultaneously, it is undefined which of
+ these connections will see newly-arrived messages
+ with \Recent set and which will see them without
+ \Recent set.
+
+ A keyword is defined by the server implementation. Keywords do
+ not begin with "\". Servers MAY permit the client to define new
+ keywords in the mailbox (see the description of the
+ PERMANENTFLAGS response code for more information).
+
+ A flag may be permanent or session-only on a per-flag basis.
+ Permanent flags are those which the client can add or remove
+ from the message flags permanently; that is, subsequent sessions
+ will see any change in permanent flags. Changes to session
+ flags are valid only in that session.
+
+ Note: The \Recent system flag is a special case of a
+ session flag. \Recent can not be used as an argument in a
+ STORE command, and thus can not be changed at all.
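+
+ Illustration (not part of the protocol): a minimal Python sketch
+ that separates system flags from keywords by the leading "\" and
+ refuses \Recent as a STORE argument. The names SYSTEM_FLAGS,
+ flag_kind, and client_may_store are hypothetical, and the
+ comparison is exact-match for brevity; case handling is left out of
+ this example.
+

```python
# System flags defined in this section.
SYSTEM_FLAGS = {
    "\\Seen", "\\Answered", "\\Flagged",
    "\\Deleted", "\\Draft", "\\Recent",
}


def flag_kind(name: str) -> str:
    """Classify a flag name as a system flag or a keyword."""
    if name.startswith("\\"):
        return "system" if name in SYSTEM_FLAGS else "unknown-system"
    return "keyword"


def client_may_store(name: str) -> bool:
    """\\Recent cannot be used as an argument in a STORE command."""
    return name != "\\Recent"
```

+
+ Whether a given flag is permanent or session-only is a per-mailbox
+ property reported by the server and is not modelled here.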
+
+2.3.3. Internal Date Message Attribute
+
+ The internal date and time of the message on the server. This is not
+ the date and time in the [RFC-822] header, but rather a date and time
+ which reflects when the message was received. In the case of
+ messages delivered via [SMTP], this SHOULD be the date and time of
+ final delivery of the message as defined by [SMTP]. In the case of
+ messages delivered by the IMAP4rev1 COPY command, this SHOULD be the
+ internal date and time of the source message. In the case of
+ messages delivered by the IMAP4rev1 APPEND command, this SHOULD be
+ the date and time as specified in the APPEND command description.
+ All other cases are implementation defined.
+
+
+
+
+
+
+
+Crispin Standards Track [Page 10]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+2.3.4. [RFC-822] Size Message Attribute
+
+ The number of octets in the message, as expressed in [RFC-822]
+ format.
+
+2.3.5. Envelope Structure Message Attribute
+
+ A parsed representation of the [RFC-822] envelope information (not to
+ be confused with an [SMTP] envelope) of the message.
+
+2.3.6. Body Structure Message Attribute
+
+ A parsed representation of the [MIME-IMB] body structure information
+ of the message.
+
+2.4. Message Texts
+
+ In addition to being able to fetch the full [RFC-822] text of a
+ message, IMAP4rev1 permits the fetching of portions of the full
+ message text. Specifically, it is possible to fetch the [RFC-822]
+ message header, [RFC-822] message body, a [MIME-IMB] body part, or a
+ [MIME-IMB] header.
+
+3. State and Flow Diagram
+
+ An IMAP4rev1 server is in one of four states. Most commands are
+ valid in only certain states. It is a protocol error for the client
+ to attempt a command while the connection is in an inappropriate state.
+ In this case, a server will respond with a BAD or NO (depending upon
+ server implementation) command completion result.
+
+3.1. Non-Authenticated State
+
+ In non-authenticated state, the client MUST supply authentication
+ credentials before most commands will be permitted. This state is
+ entered when a connection starts unless the connection has been pre-
+ authenticated.
+
+3.2. Authenticated State
+
+ In authenticated state, the client is authenticated and MUST select a
+ mailbox to access before commands that affect messages will be
+ permitted. This state is entered when a pre-authenticated connection
+ starts, when acceptable authentication credentials have been
+ provided, or after an error in selecting a mailbox.
+
+3.3. Selected State
+
+ In selected state, a mailbox has been selected to access. This state
+ is entered when a mailbox has been successfully selected.
+
+3.4. Logout State
+
+ In logout state, the connection is being terminated, and the server
+ will close the connection. This state can be entered as a result of
+ a client request or by unilateral server decision.
+
+ +--------------------------------------+
+ |initial connection and server greeting|
+ +--------------------------------------+
+ || (1) || (2) || (3)
+ VV || ||
+ +-----------------+ || ||
+ |non-authenticated| || ||
+ +-----------------+ || ||
+ || (7) || (4) || ||
+ || VV VV ||
+ || +----------------+ ||
+ || | authenticated |<=++ ||
+ || +----------------+ || ||
+ || || (7) || (5) || (6) ||
+ || || VV || ||
+ || || +--------+ || ||
+ || || |selected|==++ ||
+ || || +--------+ ||
+ || || || (7) ||
+ VV VV VV VV
+ +--------------------------------------+
+ | logout and close connection |
+ +--------------------------------------+
+
+ (1) connection without pre-authentication (OK greeting)
+ (2) pre-authenticated connection (PREAUTH greeting)
+ (3) rejected connection (BYE greeting)
+ (4) successful LOGIN or AUTHENTICATE command
+ (5) successful SELECT or EXAMINE command
+ (6) CLOSE command, or failed SELECT or EXAMINE command
+ (7) LOGOUT command, server shutdown, or connection closed
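+
+ As an illustration only (not part of the specification), the
+ diagram above can be sketched as a transition table. The names
+ State, INITIAL_STATE, TRANSITIONS, next_state, and the event labels
+ are hypothetical and chosen for this sketch; events not listed are
+ assumed to leave the state unchanged.
+
```python
from enum import Enum


class State(Enum):
    NON_AUTHENTICATED = "non-authenticated"
    AUTHENTICATED = "authenticated"
    SELECTED = "selected"
    LOGOUT = "logout"


# Server greeting -> initial state: arrows (1), (2), and (3).
INITIAL_STATE = {
    "OK": State.NON_AUTHENTICATED,
    "PREAUTH": State.AUTHENTICATED,
    "BYE": State.LOGOUT,
}

# (current state, event) -> next state: arrows (4), (5), and (6).
# The event labels are assumptions made for this sketch.
TRANSITIONS = {
    (State.NON_AUTHENTICATED, "auth_ok"): State.AUTHENTICATED,  # (4)
    (State.AUTHENTICATED, "select_ok"): State.SELECTED,         # (5)
    (State.SELECTED, "close"): State.AUTHENTICATED,             # (6)
    (State.SELECTED, "select_failed"): State.AUTHENTICATED,     # (6)
}

# Arrow (7): LOGOUT, server shutdown, or connection closed.
for _state in (State.NON_AUTHENTICATED, State.AUTHENTICATED, State.SELECTED):
    TRANSITIONS[(_state, "logout")] = State.LOGOUT


def next_state(state: State, event: str) -> State:
    """Return the state after `event`; unlisted events change nothing."""
    return TRANSITIONS.get((state, event), state)
```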
+
+4. Data Formats
+
+ IMAP4rev1 uses textual commands and responses. Data in IMAP4rev1 can
+ be in one of several forms: atom, number, string, parenthesized list,
+ or NIL.
+
+4.1. Atom
+
+ An atom consists of one or more non-special characters.
+
+4.2. Number
+
+ A number consists of one or more digit characters, and represents a
+ numeric value.
+
+4.3. String
+
+ A string is in one of two forms: literal and quoted string. The
+ literal form is the general form of string. The quoted string form
+ is an alternative that avoids the overhead of processing a literal at
+ the cost of limitations of characters that can be used in a quoted
+ string.
+
+ A literal is a sequence of zero or more octets (including CR and LF),
+ prefix-quoted with an octet count in the form of an open brace ("{"),
+ the number of octets, close brace ("}"), and CRLF. In the case of
+ literals transmitted from server to client, the CRLF is immediately
+ followed by the octet data. In the case of literals transmitted from
+ client to server, the client MUST wait to receive a command
+ continuation request (described later in this document) before
+ sending the octet data (and the remainder of the command).
+
+ A quoted string is a sequence of zero or more 7-bit characters,
+ excluding CR and LF, with double quote (<">) characters at each end.
+
+ The empty string is represented as either "" (a quoted string with
+ zero characters between double quotes) or as {0} followed by CRLF (a
+ literal with an octet count of 0).
+
+ Note: Even if the octet count is 0, a client transmitting a
+ literal MUST wait to receive a command continuation request.
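+
+ A minimal sketch of choosing between the two string forms follows.
+ The helper name encode_string is hypothetical. As a simplification,
+ the sketch falls back to the literal form whenever the data contains
+ a double quote or backslash, and it shows only the octets as they
+ appear on the wire; a client would still have to wait for the
+ command continuation request after the CRLF that follows the count.
+
```python
def encode_string(data: bytes) -> bytes:
    """Return the wire form of a string (hypothetical helper)."""
    if 0 in data:
        # Strings containing NUL are binary and must be encoded
        # (for example with BASE64) before transmission.
        raise ValueError("NUL octets are not permitted in strings")
    needs_literal = any(b >= 0x80 or b in b'\r\n"\\' for b in data)
    if needs_literal:
        # Literal: {octet count} CRLF, then the raw octets.
        return b"{%d}\r\n" % len(data) + data
    # Quoted string: 7-bit characters between double quotes.
    return b'"' + data + b'"'
```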
+
+4.3.1. 8-bit and Binary Strings
+
+ 8-bit textual and binary mail is supported through the use of a
+ [MIME-IMB] content transfer encoding. IMAP4rev1 implementations MAY
+ transmit 8-bit or multi-octet characters in literals, but SHOULD do
+ so only when the [CHARSET] is identified.
+
+ Although a BINARY body encoding is defined, unencoded binary strings
+ are not permitted. A "binary string" is any string with NUL
+ characters. Implementations MUST encode binary data into a textual
+ form such as BASE64 before transmitting the data. A string with an
+ excessive amount of CTL characters MAY also be considered to be
+ binary.
+
+4.4. Parenthesized List
+
+ Data structures are represented as a "parenthesized list"; a sequence
+ of data items, delimited by space, and bounded at each end by
+ parentheses. A parenthesized list can contain other parenthesized
+ lists, using multiple levels of parentheses to indicate nesting.
+
+ The empty list is represented as () -- a parenthesized list with no
+ members.
+
+4.5. NIL
+
+ The special atom "NIL" represents the non-existence of a particular
+ data item that is represented as a string or parenthesized list, as
+ distinct from the empty string "" or the empty parenthesized list ().
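+
+ The data forms above can be illustrated with a small parser. This
+ is a sketch under stated assumptions, not a complete IMAP4rev1
+ parser: the name parse_data is hypothetical, literals are not
+ handled, and atoms are taken to be any run of characters other than
+ whitespace, parentheses, and double quotes. NIL maps to None, so it
+ stays distinct from the empty string and the empty list.
+
```python
import re

# One token: "(" or ")" or a quoted string or an atom/number.
TOKEN = re.compile(r'\(|\)|"(?:[^"\\]|\\.)*"|[^\s()"]+')


def parse_data(text: str) -> list:
    """Parse space-delimited data items into nested Python lists."""
    stack = [[]]
    for tok in TOKEN.findall(text):
        if tok == "(":
            stack.append([])            # open a nested list
        elif tok == ")":
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)  # close it into its parent
        elif tok.startswith('"'):
            # Quoted string: drop the quotes, undo backslash escapes.
            stack[-1].append(re.sub(r"\\(.)", r"\1", tok[1:-1]))
        elif tok == "NIL":
            stack[-1].append(None)
        elif tok.isascii() and tok.isdigit():
            stack[-1].append(int(tok))
        else:
            stack[-1].append(tok)       # atom
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]
```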
+
+5. Operational Considerations
+
+5.1. Mailbox Naming
+
+ The interpretation of mailbox names is implementation-dependent.
+ However, the case-insensitive mailbox name INBOX is a special name
+ reserved to mean "the primary mailbox for this user on this server".
+
+5.1.1. Mailbox Hierarchy Naming
+
+ If it is desired to export hierarchical mailbox names, mailbox names
+ MUST be left-to-right hierarchical using a single character to
+ separate levels of hierarchy. The same hierarchy separator character
+ is used for all levels of hierarchy within a single name.
+
+5.1.2. Mailbox Namespace Naming Convention
+
+ By convention, the first hierarchical element of any mailbox name
+ which begins with "#" identifies the "namespace" of the remainder of
+ the name. This makes it possible to disambiguate between different
+ types of mailbox stores, each of which has its own namespace.
+
+ For example, implementations which offer access to USENET
+ newsgroups MAY use the "#news" namespace to partition the USENET
+ newsgroup namespace from that of other mailboxes. Thus, the
+ comp.mail.misc newsgroup would have a mailbox name of
+ "#news.comp.mail.misc", and the name "comp.mail.misc" could refer
+ to a different object (e.g. a user's private mailbox).
+
+5.1.3. Mailbox International Naming Convention
+
+ By convention, international mailbox names are specified using a
+ modified version of the UTF-7 encoding described in [UTF-7]. The
+ purpose of these modifications is to correct the following problems
+ with UTF-7:
+
+ 1) UTF-7 uses the "+" character for shifting; this conflicts with
+ the common use of "+" in mailbox names, in particular USENET
+ newsgroup names.
+
+ 2) UTF-7's encoding is BASE64 which uses the "/" character; this
+ conflicts with the use of "/" as a popular hierarchy delimiter.
+
+ 3) UTF-7 prohibits the unencoded usage of "\"; this conflicts with
+ the use of "\" as a popular hierarchy delimiter.
+
+ 4) UTF-7 prohibits the unencoded usage of "~"; this conflicts with
+ the use of "~" in some servers as a home directory indicator.
+
+ 5) UTF-7 permits multiple alternate forms to represent the same
+ string; in particular, printable US-ASCII characters can be
+ represented in encoded form.
+
+ In modified UTF-7, printable US-ASCII characters except for "&"
+ represent themselves; that is, characters with octet values 0x20-0x25
+ and 0x27-0x7e. The character "&" (0x26) is represented by the two-
+ octet sequence "&-".
+
+ All other characters (octet values 0x00-0x1f, 0x7f-0xff, and all
+ Unicode 16-bit octets) are represented in modified BASE64, with a
+ further modification from [UTF-7] that "," is used instead of "/".
+ Modified BASE64 MUST NOT be used to represent any printing US-ASCII
+ character which can represent itself.
+
+ "&" is used to shift to modified BASE64 and "-" to shift back to US-
+ ASCII. All names start in US-ASCII, and MUST end in US-ASCII (that
+ is, a name that ends with a Unicode 16-bit octet MUST end with a
+ "-").
+
+ For example, here is a mailbox name which mixes English, Japanese,
+ and Chinese text: ~peter/mail/&ZeVnLIqe-/&U,BTFw-
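+
+ The rules above can be sketched as a pair of conversion functions.
+ This is an illustration only; the names encode_mutf7 and
+ decode_mutf7 are hypothetical, and the decoder assumes well-formed
+ input (every "&" has a closing "-"). Modified BASE64 is taken here
+ to be BASE64 of the UTF-16 big-endian form, without "=" padding and
+ with "," in place of "/".
+
```python
import base64


def _shifted(chars: str) -> str:
    """Encode a run of non-printable/non-ASCII characters as &...-"""
    b64 = base64.b64encode(chars.encode("utf-16-be")).decode("ascii")
    return "&" + b64.rstrip("=").replace("/", ",") + "-"


def encode_mutf7(name: str) -> str:
    out, pending = [], []
    for ch in name:
        if 0x20 <= ord(ch) <= 0x7E:
            if pending:                      # leave modified BASE64
                out.append(_shifted("".join(pending)))
                pending = []
            out.append("&-" if ch == "&" else ch)
        else:
            pending.append(ch)
    if pending:                              # names MUST end in US-ASCII
        out.append(_shifted("".join(pending)))
    return "".join(out)


def decode_mutf7(encoded: str) -> str:
    out, i = [], 0
    while i < len(encoded):
        if encoded[i] != "&":
            out.append(encoded[i])
            i += 1
            continue
        end = encoded.index("-", i)          # "-" shifts back to US-ASCII
        chunk = encoded[i + 1:end]
        if chunk == "":
            out.append("&")                  # "&-" is a literal "&"
        else:
            b64 = chunk.replace(",", "/")
            b64 += "=" * (-len(b64) % 4)     # restore stripped padding
            out.append(base64.b64decode(b64).decode("utf-16-be"))
        i = end + 1
    return "".join(out)
```
+
+ With these helpers, decoding the example name above and encoding
+ the result again should reproduce the same mailbox name.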
+
+5.2. Mailbox Size and Message Status Updates
+
+ At any time, a server can send data that the client did not request.
+ Sometimes, such behavior is REQUIRED. For example, agents other than
+ the server MAY add messages to the mailbox (e.g. new mail delivery),
+ change the flags of messages in the mailbox (e.g. simultaneous access
+ to the same mailbox by multiple agents), or even remove messages from
+ the mailbox. A server MUST send mailbox size updates automatically
+ if a mailbox size change is observed during the processing of a
+ command. A server SHOULD send message flag updates automatically,
+ without requiring the client to request such updates explicitly.
+ Special rules exist for server notification of a client about the
+ removal of messages to prevent synchronization errors; see the
+ description of the EXPUNGE response for more detail.
+
+ Regardless of what implementation decisions a client makes on
+ remembering data from the server, a client implementation MUST record
+ mailbox size updates. It MUST NOT assume that any command after
+ initial mailbox selection will return the size of the mailbox.
+
+5.3. Response when no Command in Progress
+
+ Server implementations are permitted to send an untagged response
+ (except for EXPUNGE) while there is no command in progress. Server
+ implementations that send such responses MUST deal with flow control
+ considerations. Specifically, they MUST either (1) verify that the
+ size of the data does not exceed the underlying transport's available
+ window size, or (2) use non-blocking writes.
+
+5.4. Autologout Timer
+
+ If a server has an inactivity autologout timer, that timer MUST be of
+ at least 30 minutes' duration. The receipt of ANY command from the
+ client during that interval SHOULD suffice to reset the autologout
+ timer.
+
+5.5. Multiple Commands in Progress
+
+ The client MAY send another command without waiting for the
+ completion result response of a command, subject to ambiguity rules
+ (see below) and flow control constraints on the underlying data
+ stream. Similarly, a server MAY begin processing another command
+ before processing the current command to completion, subject to
+ ambiguity rules. However, any command continuation request responses
+ and command continuations MUST be negotiated before any subsequent
+ command is initiated.
+
+ The exception is if an ambiguity would result because of a command
+ that would affect the results of other commands. Clients MUST NOT
+ send multiple commands without waiting if an ambiguity would result.
+ If the server detects a possible ambiguity, it MUST execute commands
+ to completion in the order given by the client.
+
+ The most obvious example of ambiguity is when a command would affect
+ the results of another command; for example, a FETCH of a message's
+ flags and a STORE of that same message's flags.
+
+ A non-obvious ambiguity occurs with commands that permit an untagged
+ EXPUNGE response (commands other than FETCH, STORE, and SEARCH),
+ since an untagged EXPUNGE response can invalidate sequence numbers in
+ a subsequent command. This is not a problem for FETCH, STORE, or
+ SEARCH commands because servers are prohibited from sending EXPUNGE
+ responses while any of those commands are in progress. Therefore, if
+ the client sends any command other than FETCH, STORE, or SEARCH, it
+ MUST wait for a response before sending a command with message
+ sequence numbers.
+
+ For example, the following non-waiting command sequences are invalid:
+
+ FETCH + NOOP + STORE
+ STORE + COPY + FETCH
+ COPY + COPY
+ CHECK + FETCH
+
+ The following are examples of valid non-waiting command sequences:
+
+ FETCH + STORE + SEARCH + CHECK
+ STORE + COPY + EXPUNGE
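+
+ The untagged EXPUNGE rule can be sketched as a check over a list
+ of command names. This is an illustration only: the function name
+ pipeline_is_safe is hypothetical, the sketch covers only the
+ EXPUNGE rule (not ambiguities such as a FETCH and a STORE of the
+ same message's flags), and it assumes that FETCH, STORE, SEARCH,
+ and COPY are the commands that carry message sequence numbers.
+
```python
# Commands during which the server may not send untagged EXPUNGE.
NO_EXPUNGE = {"FETCH", "STORE", "SEARCH"}

# Commands assumed to take message sequence numbers (non-UID forms).
USES_SEQUENCE_NUMBERS = {"FETCH", "STORE", "SEARCH", "COPY"}


def pipeline_is_safe(commands) -> bool:
    """True if the commands may be sent without waiting in between."""
    expunge_possible = False
    for name in commands:
        name = name.upper()
        if expunge_possible and name in USES_SEQUENCE_NUMBERS:
            # An earlier command may have renumbered the messages.
            return False
        if name not in NO_EXPUNGE:
            expunge_possible = True
    return True
```
+
+ Applied to the sequences listed above, this check rejects the four
+ invalid examples and accepts the two valid ones.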
+
+6. Client Commands
+
+ IMAP4rev1 commands are described in this section. Commands are
+ organized by the state in which the command is permitted. Commands
+ which are permitted in multiple states are listed in the minimum
+ permitted state (for example, commands valid in authenticated and
+ selected state are listed in the authenticated state commands).
+
+ Command arguments, identified by "Arguments:" in the command
+ descriptions below, are described by function, not by syntax. The
+ precise syntax of command arguments is described in the Formal Syntax
+ section.
+
+ Some commands cause specific server responses to be returned; these
+ are identified by "Responses:" in the command descriptions below.
+ See the response descriptions in the Responses section for
+ information on these responses, and the Formal Syntax section for the
+ precise syntax of these responses. It is possible for server data to
+ be transmitted as a result of any command; thus, commands that do not
+ specifically require server data specify "no specific responses for
+ this command" instead of "none".
+
+ The "Result:" in the command description refers to the possible
+ tagged status responses to a command, and any special interpretation
+ of these status responses.
+
+6.1. Client Commands - Any State
+
+ The following commands are valid in any state: CAPABILITY, NOOP, and
+ LOGOUT.
+
+6.1.1. CAPABILITY Command
+
+ Arguments: none
+
+ Responses: REQUIRED untagged response: CAPABILITY
+
+ Result: OK - capability completed
+ BAD - command unknown or arguments invalid
+
+ The CAPABILITY command requests a listing of capabilities that the
+ server supports. The server MUST send a single untagged
+ CAPABILITY response with "IMAP4rev1" as one of the listed
+ capabilities before the (tagged) OK response. This listing of
+ capabilities is not dependent upon connection state or user. It
+ is therefore not necessary to issue a CAPABILITY command more than
+ once in a connection.
+
+ A capability name which begins with "AUTH=" indicates that the
+ server supports that particular authentication mechanism. All
+ such names are, by definition, part of this specification. For
+ example, the authorization capability for an experimental
+ "blurdybloop" authenticator would be "AUTH=XBLURDYBLOOP" and not
+ "XAUTH=BLURDYBLOOP" or "XAUTH=XBLURDYBLOOP".
+
+ Other capability names refer to extensions, revisions, or
+ amendments to this specification. See the documentation of the
+ CAPABILITY response for additional information. No capabilities,
+ beyond the base IMAP4rev1 set defined in this specification, are
+ enabled without explicit client action to invoke the capability.
+
+ See the section entitled "Client Commands -
+ Experimental/Expansion" for information about the form of site or
+ implementation-specific capabilities.
+
+ Example: C: abcd CAPABILITY
+ S: * CAPABILITY IMAP4rev1 AUTH=KERBEROS_V4
+ S: abcd OK CAPABILITY completed
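+
+ A client might pick out the advertised authentication mechanisms
+ from the untagged response roughly as follows. The function name
+ parse_capability is hypothetical, and the sketch assumes the
+ response arrives as a single line of text without its CRLF.
+
```python
def parse_capability(line: str):
    """Return (capabilities, auth mechanisms) from a CAPABILITY line."""
    parts = line.split()
    if parts[:2] != ["*", "CAPABILITY"]:
        raise ValueError("not an untagged CAPABILITY response")
    capabilities = parts[2:]
    # Names beginning with "AUTH=" advertise authentication mechanisms.
    mechanisms = [
        cap[len("AUTH="):]
        for cap in capabilities
        if cap.upper().startswith("AUTH=")
    ]
    return capabilities, mechanisms
```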
+
+6.1.2. NOOP Command
+
+ Arguments: none
+
+ Responses: no specific responses for this command (but see below)
+
+ Result: OK - noop completed
+ BAD - command unknown or arguments invalid
+
+ The NOOP command always succeeds. It does nothing.
+
+ Since any command can return a status update as untagged data, the
+ NOOP command can be used as a periodic poll for new messages or
+ message status updates during a period of inactivity. The NOOP
+ command can also be used to reset any inactivity autologout timer
+ on the server.
+
+ Example: C: a002 NOOP
+ S: a002 OK NOOP completed
+ . . .
+ C: a047 NOOP
+ S: * 22 EXPUNGE
+ S: * 23 EXISTS
+ S: * 3 RECENT
+ S: * 14 FETCH (FLAGS (\Seen \Deleted))
+ S: a047 OK NOOP completed
+
+6.1.3. LOGOUT Command
+
+ Arguments: none
+
+ Responses: REQUIRED untagged response: BYE
+
+ Result: OK - logout completed
+ BAD - command unknown or arguments invalid
+
+ The LOGOUT command informs the server that the client is done with
+ the connection. The server MUST send a BYE untagged response
+ before the (tagged) OK response, and then close the network
+ connection.
+
+ Example: C: A023 LOGOUT
+ S: * BYE IMAP4rev1 Server logging out
+ S: A023 OK LOGOUT completed
+ (Server and client then close the connection)
+
+6.2. Client Commands - Non-Authenticated State
+
+ In non-authenticated state, the AUTHENTICATE or LOGIN command
+ establishes authentication and enters authenticated state. The
+ AUTHENTICATE command provides a general mechanism for a variety of
+ authentication techniques, whereas the LOGIN command uses the
+ traditional user name and plaintext password pair.
+
+ Server implementations MAY allow non-authenticated access to certain
+ mailboxes. The convention is to use a LOGIN command with the userid
+ "anonymous". A password is REQUIRED. It is implementation-dependent
+ what requirements, if any, are placed on the password and what access
+ restrictions are placed on anonymous users.
+
+ Once authenticated (including as anonymous), it is not possible to
+ re-enter non-authenticated state.
+
+ In addition to the universal commands (CAPABILITY, NOOP, and LOGOUT),
+ the following commands are valid in non-authenticated state:
+ AUTHENTICATE and LOGIN.
+
+6.2.1. AUTHENTICATE Command
+
+ Arguments: authentication mechanism name
+
+ Responses: continuation data can be requested
+
+ Result: OK - authenticate completed, now in authenticated state
+ NO - authenticate failure: unsupported authentication
+ mechanism, credentials rejected
+ BAD - command unknown or arguments invalid,
+ authentication exchange cancelled
+
+ The AUTHENTICATE command indicates an authentication mechanism,
+ such as described in [IMAP-AUTH], to the server. If the server
+ supports the requested authentication mechanism, it performs an
+ authentication protocol exchange to authenticate and identify the
+ client. It MAY also negotiate an OPTIONAL protection mechanism
+ for subsequent protocol interactions. If the requested
+ authentication mechanism is not supported, the server SHOULD
+ reject the AUTHENTICATE command by sending a tagged NO response.
+
+ The authentication protocol exchange consists of a series of
+ server challenges and client answers that are specific to the
+ authentication mechanism. A server challenge consists of a
+ command continuation request response with the "+" token followed
+ by a BASE64 encoded string. The client answer consists of a line
+ consisting of a BASE64 encoded string. If the client wishes to
+ cancel an authentication exchange, it issues a line with a single
+ "*". If the server receives such an answer, it MUST reject the
+ AUTHENTICATE command by sending a tagged BAD response.
+
+ A protection mechanism provides integrity and privacy protection
+ to the connection. If a protection mechanism is negotiated, it is
+ applied to all subsequent data sent over the connection. The
+ protection mechanism takes effect immediately following the CRLF
+ that concludes the authentication exchange for the client, and the
+ CRLF of the tagged OK response for the server. Once the
+ protection mechanism is in effect, the stream of command and
+ response octets is processed into buffers of ciphertext. Each
+ buffer is transferred over the connection as a stream of octets
+ prepended with a four octet field in network byte order that
+ represents the length of the following data. The maximum
+ ciphertext buffer length is defined by the protection mechanism.
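+
+ The buffer framing described above can be sketched as follows.
+ This is an illustration only: the names frame and unframe are
+ hypothetical, the ciphertext is treated as opaque octets, and no
+ maximum buffer length is enforced, since that limit is defined by
+ the protection mechanism.
+
```python
import struct


def frame(ciphertext: bytes) -> bytes:
    """Prepend a four-octet length in network byte order."""
    return struct.pack("!I", len(ciphertext)) + ciphertext


def unframe(stream: bytes):
    """Split received octets into (complete buffers, leftover octets)."""
    buffers = []
    pos = 0
    while len(stream) - pos >= 4:
        (length,) = struct.unpack_from("!I", stream, pos)
        if len(stream) - pos - 4 < length:
            break                       # buffer not fully received yet
        buffers.append(stream[pos + 4:pos + 4 + length])
        pos += 4 + length
    return buffers, stream[pos:]
```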
+
+ Authentication mechanisms are OPTIONAL. Protection mechanisms are
+ also OPTIONAL; an authentication mechanism MAY be implemented
+ without any protection mechanism. If an AUTHENTICATE command
+ fails with a NO response, the client MAY try another
+ authentication mechanism by issuing another AUTHENTICATE command,
+ or MAY attempt to authenticate by using the LOGIN command. In
+ other words, the client MAY request authentication types in
+ decreasing order of preference, with the LOGIN command as a last
+ resort.
+
+ Example: S: * OK KerberosV4 IMAP4rev1 Server
+ C: A001 AUTHENTICATE KERBEROS_V4
+ S: + AmFYig==
+ C: BAcAQU5EUkVXLkNNVS5FRFUAOCAsho84kLN3/IJmrMG+25a4DT
+ +nZImJjnTNHJUtxAA+o0KPKfHEcAFs9a3CL5Oebe/ydHJUwYFd
+ WwuQ1MWiy6IesKvjL5rL9WjXUb9MwT9bpObYLGOKi1Qh
+ S: + or//EoAADZI=
+ C: DiAF5A4gA+oOIALuBkAAmw==
+ S: A001 OK Kerberos V4 authentication successful
+
+ Note: the line breaks in the first client answer are for editorial
+ clarity and are not in real authenticators.
+
+6.2.2. LOGIN Command
+
+ Arguments: user name
+ password
+
+ Responses: no specific responses for this command
+
+ Result: OK - login completed, now in authenticated state
+ NO - login failure: user name or password rejected
+ BAD - command unknown or arguments invalid
+
+ The LOGIN command identifies the client to the server and carries
+ the plaintext password authenticating this user.
+
+ Example: C: a001 LOGIN SMITH SESAME
+ S: a001 OK LOGIN completed
+
+6.3. Client Commands - Authenticated State
+
+ In authenticated state, commands that manipulate mailboxes as atomic
+ entities are permitted. Of these commands, the SELECT and EXAMINE
+ commands will select a mailbox for access and enter selected state.
+
+ In addition to the universal commands (CAPABILITY, NOOP, and LOGOUT),
+ the following commands are valid in authenticated state: SELECT,
+ EXAMINE, CREATE, DELETE, RENAME, SUBSCRIBE, UNSUBSCRIBE, LIST, LSUB,
+ STATUS, and APPEND.
+
+6.3.1. SELECT Command
+
+ Arguments: mailbox name
+
+ Responses: REQUIRED untagged responses: FLAGS, EXISTS, RECENT
+ OPTIONAL OK untagged responses: UNSEEN, PERMANENTFLAGS
+
+ Result: OK - select completed, now in selected state
+ NO - select failure, now in authenticated state: no
+ such mailbox, can't access mailbox
+ BAD - command unknown or arguments invalid
+
+ The SELECT command selects a mailbox so that messages in the
+ mailbox can be accessed. Before returning an OK to the client,
+ the server MUST send the following untagged data to the client:
+
+ FLAGS Defined flags in the mailbox. See the description
+ of the FLAGS response for more detail.
+
+ <n> EXISTS The number of messages in the mailbox. See the
+ description of the EXISTS response for more detail.
+
+ <n> RECENT The number of messages with the \Recent flag set.
+ See the description of the RECENT response for more
+ detail.
+
+ OK [UIDVALIDITY <n>]
+ The unique identifier validity value. See the
+ description of the UID command for more detail.
+
+ to define the initial state of the mailbox at the client.
+
+ The server SHOULD also send an UNSEEN response code in an OK
+ untagged response, indicating the message sequence number of the
+ first unseen message in the mailbox.
+
+ If the client can not change the permanent state of one or more of
+ the flags listed in the FLAGS untagged response, the server SHOULD
+ send a PERMANENTFLAGS response code in an OK untagged response,
+ listing the flags that the client can change permanently.
+
+ Only one mailbox can be selected at a time in a connection;
+ simultaneous access to multiple mailboxes requires multiple
+ connections. The SELECT command automatically deselects any
+ currently selected mailbox before attempting the new selection.
+ Consequently, if a mailbox is selected and a SELECT command that
+ fails is attempted, no mailbox is selected.
+
+ If the client is permitted to modify the mailbox, the server
+ SHOULD prefix the text of the tagged OK response with the
+ "[READ-WRITE]" response code.
+
+ If the client is not permitted to modify the mailbox but is
+ permitted read access, the mailbox is selected as read-only, and
+ the server MUST prefix the text of the tagged OK response to
+ SELECT with the "[READ-ONLY]" response code. Read-only access
+ through SELECT differs from the EXAMINE command in that certain
+ read-only mailboxes MAY permit the change of permanent state on a
+ per-user (as opposed to global) basis. Netnews messages marked in
+ a server-based .newsrc file are an example of such per-user
+ permanent state that can be modified with read-only mailboxes.
+
+ Example: C: A142 SELECT INBOX
+ S: * 172 EXISTS
+ S: * 1 RECENT
+ S: * OK [UNSEEN 12] Message 12 is first unseen
+ S: * OK [UIDVALIDITY 3857529045] UIDs valid
+ S: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
+ S: * OK [PERMANENTFLAGS (\Deleted \Seen \*)] Limited
+ S: A142 OK [READ-WRITE] SELECT completed
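+
+ A client could collect the responses in the example above into a
+ record of the mailbox's initial state, roughly as sketched below.
+ The function name parse_select and the dictionary keys are
+ hypothetical, each response is assumed to be one line without the
+ "S: " prefix, and responses the sketch does not recognize are
+ ignored.
+
```python
import re


def parse_select(lines) -> dict:
    """Collect SELECT/EXAMINE responses into a dict (illustrative)."""
    box = {}
    for line in lines:
        m = re.match(r"\* (\d+) (EXISTS|RECENT)$", line)
        if m:
            box[m.group(2).lower()] = int(m.group(1))
            continue
        m = re.match(r"\* OK \[(UNSEEN|UIDVALIDITY) (\d+)\]", line)
        if m:
            box[m.group(1).lower()] = int(m.group(2))
            continue
        m = re.match(r"\* FLAGS \(([^)]*)\)", line)
        if m:
            box["flags"] = m.group(1).split()
            continue
        m = re.match(r"\* OK \[PERMANENTFLAGS \(([^)]*)\)\]", line)
        if m:
            box["permanentflags"] = m.group(1).split()
            continue
        # Tagged completion result carrying the access response code.
        m = re.match(r"\S+ OK \[(READ-WRITE|READ-ONLY)\]", line)
        if m:
            box["access"] = m.group(1)
    return box
```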
+
+6.3.2. EXAMINE Command
+
+ Arguments: mailbox name
+
+ Responses: REQUIRED untagged responses: FLAGS, EXISTS, RECENT
+ OPTIONAL OK untagged responses: UNSEEN, PERMANENTFLAGS
+
+ Result: OK - examine completed, now in selected state
+ NO - examine failure, now in authenticated state: no
+ such mailbox, can't access mailbox
+ BAD - command unknown or arguments invalid
+
+ The EXAMINE command is identical to SELECT and returns the same
+ output; however, the selected mailbox is identified as read-only.
+ No changes to the permanent state of the mailbox, including
+ per-user state, are permitted.
+
+ The text of the tagged OK response to the EXAMINE command MUST
+ begin with the "[READ-ONLY]" response code.
+
+ Example: C: A932 EXAMINE blurdybloop
+ S: * 17 EXISTS
+ S: * 2 RECENT
+ S: * OK [UNSEEN 8] Message 8 is first unseen
+ S: * OK [UIDVALIDITY 3857529045] UIDs valid
+ S: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
+ S: * OK [PERMANENTFLAGS ()] No permanent flags permitted
+ S: A932 OK [READ-ONLY] EXAMINE completed
+
+6.3.3. CREATE Command
+
+ Arguments: mailbox name
+
+ Responses: no specific responses for this command
+
+ Result: OK - create completed
+ NO - create failure: can't create mailbox with that name
+ BAD - command unknown or arguments invalid
+
+ The CREATE command creates a mailbox with the given name. An OK
+ response is returned only if a new mailbox with that name has been
+ created. It is an error to attempt to create INBOX or a mailbox
+ with a name that refers to an extant mailbox. Any error in
+ creation will return a tagged NO response.
+
+ If the mailbox name is suffixed with the server's hierarchy
+ separator character (as returned from the server by a LIST
+ command), this is a declaration that the client intends to create
+ mailbox names under this name in the hierarchy. Server
+ implementations that do not require this declaration MUST ignore
+ it.
+
+ If the server's hierarchy separator character appears elsewhere in
+ the name, the server SHOULD create any superior hierarchical names
+ that are needed for the CREATE command to complete successfully.
+ In other words, an attempt to create "foo/bar/zap" on a server in
+ which "/" is the hierarchy separator character SHOULD create foo/
+ and foo/bar/ if they do not already exist.
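+
+ As an illustration, the superior hierarchical names implied by a
+ mailbox name can be computed as below. The function name
+ superior_names is hypothetical, and the separator is assumed to be
+ the single, non-empty character returned by LIST.
+
```python
def superior_names(name: str, separator: str) -> list:
    """List the superior names a CREATE of `name` may need."""
    # A trailing separator only declares intent; it adds no level.
    parts = name.rstrip(separator).split(separator)
    return [
        separator.join(parts[:depth]) + separator
        for depth in range(1, len(parts))
    ]
```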
+
+ If a new mailbox is created with the same name as a mailbox which
+ was deleted, its unique identifiers MUST be greater than any
+ unique identifiers used in the previous incarnation of the mailbox
+ UNLESS the new incarnation has a different unique identifier
+ validity value. See the description of the UID command for more
+ detail.
+
+ Example: C: A003 CREATE owatagusiam/
+ S: A003 OK CREATE completed
+ C: A004 CREATE owatagusiam/blurdybloop
+ S: A004 OK CREATE completed
+
+ Note: the interpretation of this example depends on whether "/"
+ was returned as the hierarchy separator from LIST. If "/" is the
+ hierarchy separator, a new level of hierarchy named "owatagusiam"
+ with a member called "blurdybloop" is created. Otherwise, two
+ mailboxes at the same hierarchy level are created.
+
+6.3.4. DELETE Command
+
+ Arguments: mailbox name
+
+ Responses: no specific responses for this command
+
+ Result: OK - delete completed
+ NO - delete failure: can't delete mailbox with that name
+ BAD - command unknown or arguments invalid
+
+ The DELETE command permanently removes the mailbox with the given
+ name. A tagged OK response is returned only if the mailbox has
+ been deleted. It is an error to attempt to delete INBOX or a
+ mailbox name that does not exist.
+
+ The DELETE command MUST NOT remove inferior hierarchical names.
+ For example, if a mailbox "foo" has an inferior "foo.bar"
+ (assuming "." is the hierarchy delimiter character), removing
+ "foo" MUST NOT remove "foo.bar". It is an error to attempt to
+ delete a name that has inferior hierarchical names and also has
+ the \Noselect mailbox name attribute (see the description of the
+ LIST response for more details).
+
+ It is permitted to delete a name that has inferior hierarchical
+ names and does not have the \Noselect mailbox name attribute. In
+ this case, all messages in that mailbox are removed, and the name
+ will acquire the \Noselect mailbox name attribute.
+
+ The value of the highest-used unique identifier of the deleted
+ mailbox MUST be preserved so that a new mailbox created with the
+ same name will not reuse the identifiers of the former
+ incarnation, UNLESS the new incarnation has a different unique
+ identifier validity value. See the description of the UID command
+ for more detail.
+
+ Examples: C: A682 LIST "" *
+ S: * LIST () "/" blurdybloop
+ S: * LIST (\Noselect) "/" foo
+ S: * LIST () "/" foo/bar
+ S: A682 OK LIST completed
+ C: A683 DELETE blurdybloop
+ S: A683 OK DELETE completed
+ C: A684 DELETE foo
+ S: A684 NO Name "foo" has inferior hierarchical names
+ C: A685 DELETE foo/bar
+ S: A685 OK DELETE Completed
+ C: A686 LIST "" *
+ S: * LIST (\Noselect) "/" foo
+ S: A686 OK LIST completed
+ C: A687 DELETE foo
+ S: A687 OK DELETE Completed
+
+
+ C: A82 LIST "" *
+ S: * LIST () "." blurdybloop
+ S: * LIST () "." foo
+ S: * LIST () "." foo.bar
+ S: A82 OK LIST completed
+ C: A83 DELETE blurdybloop
+ S: A83 OK DELETE completed
+ C: A84 DELETE foo
+ S: A84 OK DELETE Completed
+ C: A85 LIST "" *
+ S: * LIST () "." foo.bar
+ S: A85 OK LIST completed
+ C: A86 LIST "" %
+ S: * LIST (\Noselect) "." foo
+ S: A86 OK LIST completed
+
+6.3.5. RENAME Command
+
+ Arguments: existing mailbox name
+ new mailbox name
+
+ Responses: no specific responses for this command
+
+ Result: OK - rename completed
+ NO - rename failure: can't rename mailbox with that name,
+ can't rename to mailbox with that name
+ BAD - command unknown or arguments invalid
+
+ The RENAME command changes the name of a mailbox. A tagged OK
+ response is returned only if the mailbox has been renamed. It is
+ an error to attempt to rename from a mailbox name that does not
+ exist or to a mailbox name that already exists. Any error in
+ renaming will return a tagged NO response.
+
+ If the name has inferior hierarchical names, then the inferior
+ hierarchical names MUST also be renamed. For example, a rename of
+ "foo" to "zap" will rename "foo/bar" (assuming "/" is the
+ hierarchy delimiter character) to "zap/bar".
+
+ The value of the highest-used unique identifier of the old mailbox
+ name MUST be preserved so that a new mailbox created with the same
+ name will not reuse the identifiers of the former incarnation,
+ UNLESS the new incarnation has a different unique identifier
+ validity value. See the description of the UID command for more
+ detail.
+
+ Renaming INBOX is permitted, and has special behavior. It moves
+ all messages in INBOX to a new mailbox with the given name,
+ leaving INBOX empty. If the server implementation supports
+ inferior hierarchical names of INBOX, these are unaffected by a
+ rename of INBOX.
+
+ Examples: C: A682 LIST "" *
+ S: * LIST () "/" blurdybloop
+ S: * LIST (\Noselect) "/" foo
+ S: * LIST () "/" foo/bar
+ S: A682 OK LIST completed
+ C: A683 RENAME blurdybloop sarasoop
+ S: A683 OK RENAME completed
+ C: A684 RENAME foo zowie
+ S: A684 OK RENAME Completed
+ C: A685 LIST "" *
+ S: * LIST () "/" sarasoop
+ S: * LIST (\Noselect) "/" zowie
+ S: * LIST () "/" zowie/bar
+ S: A685 OK LIST completed
+
+ C: Z432 LIST "" *
+ S: * LIST () "." INBOX
+ S: * LIST () "." INBOX.bar
+ S: Z432 OK LIST completed
+ C: Z433 RENAME INBOX old-mail
+ S: Z433 OK RENAME completed
+ C: Z434 LIST "" *
+ S: * LIST () "." INBOX
+ S: * LIST () "." INBOX.bar
+ S: * LIST () "." old-mail
+ S: Z434 OK LIST completed
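+
+ Informative sketch: the renaming rules above can be modelled with an
+ in-memory set of mailbox names. The function name rename_mailbox
+ and the set-based store are assumptions made for illustration; they
+ are not defined by this document, and the INBOX case is reduced to
+ "INBOX and its inferiors stay, the new name appears".

```python
def rename_mailbox(names, old, new, delimiter="/"):
    """Return the set of mailbox names after RENAME old new (hypothetical model)."""
    if old not in names:
        raise ValueError("NO: source mailbox does not exist")
    if new in names:
        raise ValueError("NO: destination mailbox already exists")
    if old.upper() == "INBOX":
        # INBOX remains (emptied); inferior names of INBOX are unaffected.
        return set(names) | {new}
    prefix = old + delimiter
    renamed = set()
    for name in names:
        if name == old:
            renamed.add(new)
        elif name.startswith(prefix):
            # Inferior hierarchical names are renamed along with the parent.
            renamed.add(new + delimiter + name[len(prefix):])
        else:
            renamed.add(name)
    return renamed
```

+ With the names from the first example, renaming "foo" to "zowie"
+ also turns "foo/bar" into "zowie/bar".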
+
+6.3.6. SUBSCRIBE Command
+
+ Arguments: mailbox
+
+ Responses: no specific responses for this command
+
+ Result: OK - subscribe completed
+ NO - subscribe failure: can't subscribe to that name
+ BAD - command unknown or arguments invalid
+
+ The SUBSCRIBE command adds the specified mailbox name to the
+ server's set of "active" or "subscribed" mailboxes as returned by
+ the LSUB command. This command returns a tagged OK response only
+ if the subscription is successful.
+
+ A server MAY validate the mailbox argument to SUBSCRIBE to verify
+ that it exists. However, it MUST NOT unilaterally remove an
+ existing mailbox name from the subscription list even if a mailbox
+ by that name no longer exists.
+
+ Note: this requirement exists because some server sites may routinely
+ remove a mailbox with a well-known name (e.g. "system-alerts")
+ after its contents expire, with the intention of recreating it
+ when new contents are appropriate.
+
+ Example: C: A002 SUBSCRIBE #news.comp.mail.mime
+ S: A002 OK SUBSCRIBE completed
+
+6.3.7. UNSUBSCRIBE Command
+
+ Arguments: mailbox name
+
+ Responses: no specific responses for this command
+
+ Result: OK - unsubscribe completed
+ NO - unsubscribe failure: can't unsubscribe that name
+ BAD - command unknown or arguments invalid
+
+ The UNSUBSCRIBE command removes the specified mailbox name from
+ the server's set of "active" or "subscribed" mailboxes as returned
+ by the LSUB command. This command returns a tagged OK response
+ only if the unsubscription is successful.
+
+ Example: C: A002 UNSUBSCRIBE #news.comp.mail.mime
+ S: A002 OK UNSUBSCRIBE completed
+
+6.3.8. LIST Command
+
+ Arguments: reference name
+ mailbox name with possible wildcards
+
+ Responses: untagged responses: LIST
+
+ Result: OK - list completed
+ NO - list failure: can't list that reference or name
+ BAD - command unknown or arguments invalid
+
+ The LIST command returns a subset of names from the complete set
+ of all names available to the client. Zero or more untagged LIST
+ replies are returned, containing the name attributes, hierarchy
+ delimiter, and name; see the description of the LIST reply for
+ more detail.
+
+ The LIST command SHOULD return its data quickly, without undue
+ delay. For example, it SHOULD NOT go to excess trouble to
+ calculate \Marked or \Unmarked status or perform other processing;
+ if each name requires 1 second of processing, then a list of 1200
+ names would take 20 minutes!
+
+ An empty ("" string) reference name argument indicates that the
+ mailbox name is interpreted as by SELECT. The returned mailbox
+ names MUST match the supplied mailbox name pattern. A non-empty
+ reference name argument is the name of a mailbox or a level of
+ mailbox hierarchy, and indicates a context in which the mailbox
+ name is interpreted in an implementation-defined manner.
+
+ An empty ("" string) mailbox name argument is a special request to
+ return the hierarchy delimiter and the root name of the name given
+ in the reference. The value returned as the root MAY be null if
+ the reference is non-rooted or is null. In all cases, the
+ hierarchy delimiter is returned. This permits a client to get the
+ hierarchy delimiter even when no mailboxes by that name currently
+ exist.
+
+ The reference and mailbox name arguments are interpreted, in an
+ implementation-dependent fashion, into a canonical form that
+ represents an unambiguous left-to-right hierarchy. The returned
+ mailbox names will be in the interpreted form.
+
+ Any part of the reference argument that is included in the
+ interpreted form SHOULD prefix the interpreted form. It SHOULD
+ also be in the same form as the reference name argument. This
+ rule permits the client to determine if the returned mailbox name
+ is in the context of the reference argument, or if something about
+ the mailbox argument overrode the reference argument. Without
+ this rule, the client would have to have knowledge of the server's
+ naming semantics including what characters are "breakouts" that
+ override a naming context.
+
+ Here are some examples of how references and mailbox names might be
+ interpreted on a UNIX-based server:
+
+ Reference Mailbox Name Interpretation
+ ------------ ------------ --------------
+ ~smith/Mail/ foo.* ~smith/Mail/foo.*
+ archive/ % archive/%
+ #news. comp.mail.* #news.comp.mail.*
+ ~smith/Mail/ /usr/doc/foo /usr/doc/foo
+ archive/ ~fred/Mail/* ~fred/Mail/*
+
+ The first three examples demonstrate interpretations in the
+ context of the reference argument. Note that "~smith/Mail" SHOULD
+ NOT be transformed into something like "/u2/users/smith/Mail", or
+ it would be impossible for the client to determine that the
+ interpretation was in the context of the reference.
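+
+ Informative sketch: one way a server could produce the
+ interpretations in the table above. The set of "breakout" prefixes
+ is an assumption chosen to reproduce the table; real servers define
+ their own naming semantics, and the name interpret is hypothetical.

```python
# Assumption: on this hypothetical UNIX-like server, a mailbox name that
# begins with one of these characters overrides the reference argument.
BREAKOUT_PREFIXES = ("/", "~", "#")


def interpret(reference, mailbox):
    """Combine a LIST reference and mailbox name into the interpreted form."""
    if mailbox.startswith(BREAKOUT_PREFIXES):
        # The mailbox argument overrides the naming context.
        return mailbox
    # Otherwise the reference is kept verbatim as the prefix, so the
    # client can recognise that the result is in the reference's context.
    return reference + mailbox
```

+ Note that the reference is prepended unchanged; expanding "~smith"
+ to an absolute path would defeat the rule described above.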
+
+ The character "*" is a wildcard, and matches zero or more
+ characters at this position. The character "%" is similar to "*",
+ but it does not match a hierarchy delimiter. If the "%" wildcard
+ is the last character of a mailbox name argument, matching levels
+ of hierarchy are also returned. If these levels of hierarchy are
+ not also selectable mailboxes, they are returned with the
+ \Noselect mailbox name attribute (see the description of the LIST
+ response for more details).
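+
+ Informative sketch: the two wildcards can be translated into a
+ regular expression. The helper names below are assumptions; the
+ sketch covers only name matching and does not model the \Noselect
+ hierarchy levels returned for a trailing "%".

```python
import re


def list_pattern_to_regex(pattern, delimiter="/"):
    """Translate a LIST mailbox pattern into a compiled regular expression."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append(".*")  # zero or more of any character
        elif ch == "%":
            # zero or more characters, but never the hierarchy delimiter
            parts.append("[^" + re.escape(delimiter) + "]*")
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.DOTALL)


def matches(pattern, name, delimiter="/"):
    """True if the whole mailbox name matches the pattern."""
    return list_pattern_to_regex(pattern, delimiter).fullmatch(name) is not None
```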
+
+ Server implementations are permitted to "hide" otherwise
+ accessible mailboxes from the wildcard characters, by preventing
+ certain characters or names from matching a wildcard in certain
+ situations. For example, a UNIX-based server might restrict the
+ interpretation of "*" so that an initial "/" character does not
+ match.
+
+ The special name INBOX is included in the output from LIST, if
+ INBOX is supported by this server for this user and if the
+ uppercase string "INBOX" matches the interpreted reference and
+ mailbox name arguments with wildcards as described above. The
+ criterion for omitting INBOX is whether SELECT INBOX will return
+ failure; it is not relevant whether the user's real INBOX resides
+ on this or some other server.
+
+ Example: C: A101 LIST "" ""
+ S: * LIST (\Noselect) "/" ""
+ S: A101 OK LIST Completed
+ C: A102 LIST #news.comp.mail.misc ""
+ S: * LIST (\Noselect) "." #news.
+ S: A102 OK LIST Completed
+ C: A103 LIST /usr/staff/jones ""
+ S: * LIST (\Noselect) "/" /
+ S: A103 OK LIST Completed
+ C: A202 LIST ~/Mail/ %
+ S: * LIST (\Noselect) "/" ~/Mail/foo
+ S: * LIST () "/" ~/Mail/meetings
+ S: A202 OK LIST completed
+
+6.3.9. LSUB Command
+
+ Arguments: reference name
+ mailbox name with possible wildcards
+
+ Responses: untagged responses: LSUB
+
+ Result: OK - lsub completed
+ NO - lsub failure: can't list that reference or name
+ BAD - command unknown or arguments invalid
+
+ The LSUB command returns a subset of names from the set of names
+ that the user has declared as being "active" or "subscribed".
+ Zero or more untagged LSUB replies are returned. The arguments to
+ LSUB are in the same form as those for LIST.
+
+ A server MAY validate the subscribed names to see if they still
+ exist. If a name does not exist, it SHOULD be flagged with the
+ \Noselect attribute in the LSUB response. The server MUST NOT
+ unilaterally remove an existing mailbox name from the subscription
+ list even if a mailbox by that name no longer exists.
+
+ Example: C: A002 LSUB "#news." "comp.mail.*"
+ S: * LSUB () "." #news.comp.mail.mime
+ S: * LSUB () "." #news.comp.mail.misc
+ S: A002 OK LSUB completed
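+
+ Informative sketch: a server that validates subscribed names might
+ build its LSUB replies as follows. The function lsub_responses and
+ its two set arguments are assumptions for illustration; pattern
+ matching against the LSUB arguments is omitted.

```python
def lsub_responses(subscribed, existing, delimiter="."):
    """Build untagged LSUB lines; names that no longer exist get \\Noselect."""
    lines = []
    for name in sorted(subscribed):
        attrs = "" if name in existing else "\\Noselect"
        lines.append(f'* LSUB ({attrs}) "{delimiter}" {name}')
    # The subscription set itself is never modified here: a vanished
    # mailbox stays subscribed and is only flagged in the reply.
    return lines
```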
+
+6.3.10. STATUS Command
+
+ Arguments: mailbox name
+ status data item names
+
+ Responses: untagged responses: STATUS
+
+ Result: OK - status completed
+ NO - status failure: no status for that name
+ BAD - command unknown or arguments invalid
+
+ The STATUS command requests the status of the indicated mailbox.
+ It does not change the currently selected mailbox, nor does it
+ affect the state of any messages in the queried mailbox (in
+ particular, STATUS MUST NOT cause messages to lose the \Recent
+ flag).
+
+ The STATUS command provides an alternative to opening a second
+ IMAP4rev1 connection and doing an EXAMINE command on a mailbox to
+ query that mailbox's status without deselecting the current
+ mailbox in the first IMAP4rev1 connection.
+
+ Unlike the LIST command, the STATUS command is not guaranteed to
+ be fast in its response. In some implementations, the server is
+ obliged to open the mailbox read-only internally to obtain certain
+ status information. Also unlike the LIST command, the STATUS
+ command does not accept wildcards.
+
+ The currently defined status data items that can be requested are:
+
+ MESSAGES The number of messages in the mailbox.
+
+ RECENT The number of messages with the \Recent flag set.
+
+ UIDNEXT The next UID value that will be assigned to a new
+ message in the mailbox. It is guaranteed that this
+ value will not change unless new messages are added
+ to the mailbox; and that it will change when new
+ messages are added even if those new messages are
+ subsequently expunged.
+
+ UIDVALIDITY The unique identifier validity value of the
+ mailbox.
+
+ UNSEEN The number of messages which do not have the \Seen
+ flag set.
+
+
+ Example: C: A042 STATUS blurdybloop (UIDNEXT MESSAGES)
+ S: * STATUS blurdybloop (MESSAGES 231 UIDNEXT 44292)
+ S: A042 OK STATUS completed
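+
+ Informative sketch: a client could decode the untagged STATUS
+ response above as shown here. The parser is deliberately minimal
+ and assumes the mailbox name is sent as an atom (no quoted string
+ or literal); parse_status is a hypothetical helper name.

```python
import re

STATUS_RE = re.compile(r"^\* STATUS (\S+) \(([^)]*)\)$")


def parse_status(line):
    """Return (mailbox, {item: value}) for an untagged STATUS line."""
    m = STATUS_RE.match(line)
    if m is None:
        raise ValueError("not an untagged STATUS response")
    tokens = m.group(2).split()
    # Items arrive as alternating name/number pairs, in server-chosen order.
    items = {tokens[i].upper(): int(tokens[i + 1]) for i in range(0, len(tokens), 2)}
    return m.group(1), items
```

+ The server may return the items in a different order than they were
+ requested, as in the example, so the result is keyed by item name.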
+
+6.3.11. APPEND Command
+
+ Arguments: mailbox name
+ OPTIONAL flag parenthesized list
+ OPTIONAL date/time string
+ message literal
+
+ Responses: no specific responses for this command
+
+ Result: OK - append completed
+ NO - append error: can't append to that mailbox, error
+ in flags or date/time or message text
+ BAD - command unknown or arguments invalid
+
+ The APPEND command appends the literal argument as a new message
+ to the end of the specified destination mailbox. This argument
+ SHOULD be in the format of an [RFC-822] message. 8-bit characters
+ are permitted in the message. A server implementation that is
+ unable to preserve 8-bit data properly MUST be able to reversibly
+ convert 8-bit APPEND data to 7-bit using a [MIME-IMB] content
+ transfer encoding.
+
+ Note: There MAY be exceptions, e.g. draft messages, in which
+ required [RFC-822] header lines are omitted in the message literal
+ argument to APPEND. The full implications of doing so MUST be
+ understood and carefully weighed.
+
+ If a flag parenthesized list is specified, the flags SHOULD be set in
+ the resulting message; otherwise, the flag list of the resulting
+ message is set empty by default.
+
+ If a date_time is specified, the internal date SHOULD be set in the
+ resulting message; otherwise, the internal date of the resulting
+ message is set to the current date and time by default.
+
+ If the append is unsuccessful for any reason, the mailbox MUST be
+ restored to its state before the APPEND attempt; no partial appending
+ is permitted.
+
+ If the destination mailbox does not exist, a server MUST return an
+ error, and MUST NOT automatically create the mailbox. Unless it is
+ certain that the destination mailbox can not be created, the server
+ MUST send the response code "[TRYCREATE]" as the prefix of the text
+ of the tagged NO response. This gives a hint to the client that it
+ can attempt a CREATE command and retry the APPEND if the CREATE is
+ successful.
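+
+ Informative sketch: a client can detect the hint described above by
+ inspecting the tagged response. The helper name should_try_create
+ and the sample response text in the comment are assumptions, not
+ wording required of any server.

```python
def should_try_create(tagged_line):
    """True if a tagged NO response starts its text with [TRYCREATE]."""
    # e.g. "A003 NO [TRYCREATE] Mailbox does not exist" (hypothetical text)
    parts = tagged_line.split(" ", 2)
    if len(parts) < 3 or parts[1].upper() != "NO":
        return False
    return parts[2].upper().startswith("[TRYCREATE]")
```

+ When this returns True, the client may issue CREATE for the
+ destination mailbox and, if that succeeds, retry the APPEND.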
+
+ If the mailbox is currently selected, the normal new mail actions
+ SHOULD occur. Specifically, the server SHOULD notify the client
+ immediately via an untagged EXISTS response. If the server does not
+ do so, the client MAY issue a NOOP command (or failing that, a CHECK
+ command) after one or more APPEND commands.
+
+ Example: C: A003 APPEND saved-messages (\Seen) {310}
+ C: Date: Mon, 7 Feb 1994 21:52:25 -0800 (PST)
+ C: From: Fred Foobar <foobar@Blurdybloop.COM>
+ C: Subject: afternoon meeting
+ C: To: mooch@owatagu.siam.edu
+ C: Message-Id: <B27397-0100000@Blurdybloop.COM>
+ C: MIME-Version: 1.0
+ C: Content-Type: TEXT/PLAIN; CHARSET=US-ASCII
+ C:
+ C: Hello Joe, do you think we can meet at 3:30 tomorrow?
+ C:
+ S: A003 OK APPEND completed
+
+ Note: the APPEND command is not used for message delivery, because
+ it does not provide a mechanism to transfer [SMTP] envelope
+ information.
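+
+ Informative sketch: the octet count in braces is the length of the
+ message literal that follows. The helper build_append is a
+ hypothetical name; it assumes the mailbox name needs no quoting and
+ leaves the optional date/time argument out.

```python
def build_append(tag, mailbox, message, flags=()):
    """Return (command_line, literal) for an APPEND of `message` (bytes)."""
    flag_list = f" ({' '.join(flags)})" if flags else ""
    # The literal prefix {n} announces exactly len(message) octets.
    command = f"{tag} APPEND {mailbox}{flag_list} {{{len(message)}}}\r\n"
    return command.encode("ascii"), message
```

+ A client would send the command line, wait for the server's command
+ continuation request, and only then send the literal octets.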
+
+6.4. Client Commands - Selected State
+
+ In selected state, commands that manipulate messages in a mailbox are
+ permitted.
+
+ In addition to the universal commands (CAPABILITY, NOOP, and LOGOUT),
+ and the authenticated state commands (SELECT, EXAMINE, CREATE,
+ DELETE, RENAME, SUBSCRIBE, UNSUBSCRIBE, LIST, LSUB, STATUS, and
+ APPEND), the following commands are valid in the selected state:
+ CHECK, CLOSE, EXPUNGE, SEARCH, FETCH, STORE, COPY, and UID.
+
+6.4.1. CHECK Command
+
+ Arguments: none
+
+ Responses: no specific responses for this command
+
+ Result: OK - check completed
+ BAD - command unknown or arguments invalid
+
+ The CHECK command requests a checkpoint of the currently selected
+ mailbox. A checkpoint refers to any implementation-dependent
+ housekeeping associated with the mailbox (e.g. resolving the
+ server's in-memory state of the mailbox with the state on its
+ disk) that is not normally executed as part of each command. A
+ checkpoint MAY take a non-instantaneous amount of real time to
+ complete. If a server implementation has no such housekeeping
+ considerations, CHECK is equivalent to NOOP.
+
+ There is no guarantee that an EXISTS untagged response will happen
+ as a result of CHECK. NOOP, not CHECK, SHOULD be used for new
+ mail polling.
+
+ Example: C: FXXZ CHECK
+ S: FXXZ OK CHECK Completed
+
+6.4.2. CLOSE Command
+
+ Arguments: none
+
+ Responses: no specific responses for this command
+
+ Result: OK - close completed, now in authenticated state
+ NO - close failure: no mailbox selected
+ BAD - command unknown or arguments invalid
+
+ The CLOSE command permanently removes from the currently selected
+ mailbox all messages that have the \Deleted flag set, and returns
+ to authenticated state from selected state. No untagged EXPUNGE
+ responses are sent.
+
+ No messages are removed, and no error is given, if the mailbox is
+ selected by an EXAMINE command or is otherwise selected read-only.
+
+ Even if a mailbox is selected, a SELECT, EXAMINE, or LOGOUT
+ command MAY be issued without previously issuing a CLOSE command.
+ The SELECT, EXAMINE, and LOGOUT commands implicitly close the
+ currently selected mailbox without doing an expunge. However,
+ when many messages are deleted, a CLOSE-LOGOUT or CLOSE-SELECT
+ sequence is considerably faster than an EXPUNGE-LOGOUT or
+ EXPUNGE-SELECT because no untagged EXPUNGE responses (which the
+ client would probably ignore) are sent.
+
+ Example: C: A341 CLOSE
+ S: A341 OK CLOSE completed
+
+6.4.3. EXPUNGE Command
+
+ Arguments: none
+
+ Responses: untagged responses: EXPUNGE
+
+ Result: OK - expunge completed
+ NO - expunge failure: can't expunge (e.g. permission
+ denied)
+ BAD - command unknown or arguments invalid
+
+ The EXPUNGE command permanently removes from the currently
+ selected mailbox all messages that have the \Deleted flag set.
+ Before returning an OK to the client, an untagged EXPUNGE response
+ is sent for each message that is removed.
+
+ Example: C: A202 EXPUNGE
+ S: * 3 EXPUNGE
+ S: * 3 EXPUNGE
+ S: * 5 EXPUNGE
+ S: * 8 EXPUNGE
+ S: A202 OK EXPUNGE completed
+
+ Note: in this example, messages 3, 4, 7, and 11 had the
+ \Deleted flag set. See the description of the EXPUNGE
+ response for further explanation.
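+
+ Informative sketch: the numbers in the example follow from the fact
+ that each removal lowers the sequence numbers of all later
+ messages. The function below assumes the server removes messages
+ in ascending order, as the example does; expunge_responses is a
+ hypothetical name.

```python
def expunge_responses(deleted_seqnums):
    """Sequence numbers reported in successive untagged EXPUNGE responses."""
    responses = []
    for removed_so_far, seqnum in enumerate(sorted(deleted_seqnums)):
        # Every earlier removal shifted this message down by one.
        responses.append(seqnum - removed_so_far)
    return responses
```

+ For messages 3, 4, 7, and 11 this yields 3, 3, 5, 8, matching the
+ responses shown above.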
+
+6.4.4. SEARCH Command
+
+ Arguments: OPTIONAL [CHARSET] specification
+ searching criteria (one or more)
+
+ Responses: REQUIRED untagged response: SEARCH
+
+ Result: OK - search completed
+ NO - search error: can't search that [CHARSET] or
+ criteria
+ BAD - command unknown or arguments invalid
+
+ The SEARCH command searches the mailbox for messages that match
+ the given searching criteria. Searching criteria consist of one
+ or more search keys. The untagged SEARCH response from the server
+ contains a listing of message sequence numbers corresponding to
+ those messages that match the searching criteria.
+
+ When multiple keys are specified, the result is the intersection
+ (AND function) of all the messages that match those keys. For
+ example, the criteria DELETED FROM "SMITH" SINCE 1-Feb-1994 refers
+ to all deleted messages from Smith that were placed in the mailbox
+ since February 1, 1994. A search key can also be a parenthesized
+ list of one or more search keys (e.g. for use with the OR and NOT
+ keys).
+
+ Server implementations MAY exclude [MIME-IMB] body parts with
+ terminal content media types other than TEXT and MESSAGE from
+ consideration in SEARCH matching.
+
+ The OPTIONAL [CHARSET] specification consists of the word
+ "CHARSET" followed by a registered [CHARSET]. It indicates the
+ [CHARSET] of the strings that appear in the search criteria.
+ [MIME-IMB] content transfer encodings, and [MIME-HDRS] strings in
+ [RFC-822]/[MIME-IMB] headers, MUST be decoded before comparing
+ text in a [CHARSET] other than US-ASCII. US-ASCII MUST be
+ supported; other [CHARSET]s MAY be supported. If the server does
+ not support the specified [CHARSET], it MUST return a tagged NO
+ response (not a BAD).
+
+ In all search keys that use strings, a message matches the key if
+ the string is a substring of the field. The matching is case-
+ insensitive.
+
+ The defined search keys are as follows. Refer to the Formal
+ Syntax section for the precise syntactic definitions of the
+ arguments.
+
+ <message set> Messages with message sequence numbers
+ corresponding to the specified message sequence
+ number set
+
+ ALL All messages in the mailbox; the default initial
+ key for ANDing.
+
+ ANSWERED Messages with the \Answered flag set.
+
+ BCC <string> Messages that contain the specified string in the
+ envelope structure's BCC field.
+
+ BEFORE <date> Messages whose internal date is earlier than the
+ specified date.
+
+ BODY <string> Messages that contain the specified string in the
+ body of the message.
+
+ CC <string> Messages that contain the specified string in the
+ envelope structure's CC field.
+
+ DELETED Messages with the \Deleted flag set.
+
+ DRAFT Messages with the \Draft flag set.
+
+ FLAGGED Messages with the \Flagged flag set.
+
+ FROM <string> Messages that contain the specified string in the
+ envelope structure's FROM field.
+
+ HEADER <field-name> <string>
+ Messages that have a header with the specified
+ field-name (as defined in [RFC-822]) and that
+ contains the specified string in the [RFC-822]
+ field-body.
+
+ KEYWORD <flag> Messages with the specified keyword set.
+
+ LARGER <n> Messages with an [RFC-822] size larger than the
+ specified number of octets.
+
+ NEW Messages that have the \Recent flag set but not the
+ \Seen flag. This is functionally equivalent to
+ "(RECENT UNSEEN)".
+
+ NOT <search-key>
+ Messages that do not match the specified search
+ key.
+
+ OLD Messages that do not have the \Recent flag set.
+ This is functionally equivalent to "NOT RECENT" (as
+ opposed to "NOT NEW").
+
+ ON <date> Messages whose internal date is within the
+ specified date.
+
+ OR <search-key1> <search-key2>
+ Messages that match either search key.
+
+ RECENT Messages that have the \Recent flag set.
+
+ SEEN Messages that have the \Seen flag set.
+
+ SENTBEFORE <date>
+ Messages whose [RFC-822] Date: header is earlier
+ than the specified date.
+
+ SENTON <date> Messages whose [RFC-822] Date: header is within the
+ specified date.
+
+ SENTSINCE <date>
+ Messages whose [RFC-822] Date: header is within or
+ later than the specified date.
+
+ SINCE <date> Messages whose internal date is within or later
+ than the specified date.
+
+ SMALLER <n> Messages with an [RFC-822] size smaller than the
+ specified number of octets.
+
+ SUBJECT <string>
+ Messages that contain the specified string in the
+ envelope structure's SUBJECT field.
+
+ TEXT <string> Messages that contain the specified string in the
+ header or body of the message.
+
+ TO <string> Messages that contain the specified string in the
+ envelope structure's TO field.
+
+ UID <message set>
+ Messages with unique identifiers corresponding to
+ the specified unique identifier set.
+
+ UNANSWERED Messages that do not have the \Answered flag set.
+
+ UNDELETED Messages that do not have the \Deleted flag set.
+
+ UNDRAFT Messages that do not have the \Draft flag set.
+
+ UNFLAGGED Messages that do not have the \Flagged flag set.
+
+ UNKEYWORD <flag>
+ Messages that do not have the specified keyword
+ set.
+
+ UNSEEN Messages that do not have the \Seen flag set.
+
+ Example: C: A282 SEARCH FLAGGED SINCE 1-Feb-1994 NOT FROM "Smith"
+ S: * SEARCH 2 84 882
+ S: A282 OK SEARCH completed
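+
+ Informative sketch: multiple search keys are ANDed, and string keys
+ match case-insensitive substrings. The code below models a few
+ keys as predicates over hypothetical message dictionaries (the
+ field names "flags", "internal_date" and "from" are assumptions).

```python
from datetime import date


def search(messages, *keys):
    """Message sequence numbers of the messages that satisfy every key."""
    return [seq for seq, msg in enumerate(messages, start=1)
            if all(key(msg) for key in keys)]


def flagged(msg):
    return "\\Flagged" in msg["flags"]


def since(day):
    # Internal date within or later than the given date.
    return lambda msg: msg["internal_date"] >= day


def from_contains(text):
    # Case-insensitive substring match, as for all string keys.
    return lambda msg: text.lower() in msg["from"].lower()


def negate(key):
    return lambda msg: not key(msg)
```

+ The example command then corresponds to
+ search(messages, flagged, since(date(1994, 2, 1)),
+ negate(from_contains("Smith"))). With no keys, every message
+ matches, which mirrors ALL as the default initial key.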
+
+6.4.5. FETCH Command
+
+ Arguments: message set
+ message data item names
+
+ Responses: untagged responses: FETCH
+
+ Result: OK - fetch completed
+ NO - fetch error: can't fetch that data
+ BAD - command unknown or arguments invalid
+
+ The FETCH command retrieves data associated with a message in the
+ mailbox. The data items to be fetched can be either a single atom
+ or a parenthesized list.
+
+ The currently defined data items that can be fetched are:
+
+ ALL Macro equivalent to: (FLAGS INTERNALDATE
+ RFC822.SIZE ENVELOPE)
+
+ BODY Non-extensible form of BODYSTRUCTURE.
+
+ BODY[<section>]<<partial>>
+ The text of a particular body section. The section
+ specification is a set of zero or more part
+ specifiers delimited by periods. A part specifier
+ is either a part number or one of the following:
+ HEADER, HEADER.FIELDS, HEADER.FIELDS.NOT, MIME, and
+ TEXT. An empty section specification refers to the
+ entire message, including the header.
+
+ Every message has at least one part number.
+ Non-[MIME-IMB] messages, and non-multipart
+ [MIME-IMB] messages with no encapsulated message,
+ only have a part 1.
+
+ Multipart messages are assigned consecutive part
+ numbers, as they occur in the message. If a
+ particular part is of type message or multipart,
+ its parts MUST be indicated by a period followed by
+ the part number within that nested multipart part.
+
+ A part of type MESSAGE/RFC822 also has nested part
+ numbers, referring to parts of the MESSAGE part's
+ body.
+
+ The HEADER, HEADER.FIELDS, HEADER.FIELDS.NOT, and
+ TEXT part specifiers can be the sole part specifier
+ or can be prefixed by one or more numeric part
+ specifiers, provided that the numeric part
+ specifier refers to a part of type MESSAGE/RFC822.
+ The MIME part specifier MUST be prefixed by one or
+ more numeric part specifiers.
+
+ The HEADER, HEADER.FIELDS, and HEADER.FIELDS.NOT
+ part specifiers refer to the [RFC-822] header of
+ the message or of an encapsulated [MIME-IMT]
+ MESSAGE/RFC822 message. HEADER.FIELDS and
+ HEADER.FIELDS.NOT are followed by a list of
+ field-name (as defined in [RFC-822]) names, and
+ return a subset of the header. The subset returned
+ by HEADER.FIELDS contains only those header fields
+ with a field-name that matches one of the names in
+ the list; similarly, the subset returned by
+ HEADER.FIELDS.NOT contains only the header fields
+ with a non-matching field-name. The field-matching
+ is case-insensitive but otherwise exact. The delimiting blank
+ line between the header and the body is always included.
+
+ The MIME part specifier refers to the [MIME-IMB]
+ header for this part.
+
+ The TEXT part specifier refers to the text body of
+ the message, omitting the [RFC-822] header.
+
+ Here is an example of a complex message
+ with some of its part specifiers:
+
+ HEADER ([RFC-822] header of the message)
+ TEXT MULTIPART/MIXED
+ 1 TEXT/PLAIN
+ 2 APPLICATION/OCTET-STREAM
+ 3 MESSAGE/RFC822
+ 3.HEADER ([RFC-822] header of the message)
+ 3.TEXT ([RFC-822] text body of the message)
+ 3.1 TEXT/PLAIN
+ 3.2 APPLICATION/OCTET-STREAM
+ 4 MULTIPART/MIXED
+ 4.1 IMAGE/GIF
+ 4.1.MIME ([MIME-IMB] header for the IMAGE/GIF)
+ 4.2 MESSAGE/RFC822
+ 4.2.HEADER ([RFC-822] header of the message)
+ 4.2.TEXT ([RFC-822] text body of the message)
+ 4.2.1 TEXT/PLAIN
+ 4.2.2 MULTIPART/ALTERNATIVE
+ 4.2.2.1 TEXT/PLAIN
+ 4.2.2.2 TEXT/RICHTEXT
+
+
+ It is possible to fetch a substring of the
+ designated text. This is done by appending an open
+ angle bracket ("<"), the octet position of the
+ first desired octet, a period, the maximum number
+ of octets desired, and a close angle bracket (">")
+ to the part specifier. If the starting octet is
+ beyond the end of the text, an empty string is
+ returned.
+
+ Any partial fetch that attempts to read beyond the
+ end of the text is truncated as appropriate. A
+ partial fetch that starts at octet 0 is returned as
+ a partial fetch, even if this truncation happened.
+
+ Note: this means that BODY[]<0.2048> of a
+ 1500-octet message will return BODY[]<0>
+ with a literal of size 1500, not BODY[].
+
+ Note: a substring fetch of a
+ HEADER.FIELDS or HEADER.FIELDS.NOT part
+ specifier is calculated after subsetting
+ the header.
+
+ The \Seen flag is implicitly set; if this causes
+ the flags to change they SHOULD be included as part
+ of the FETCH responses.
+
+ BODY.PEEK[<section>]<<partial>>
+ An alternate form of BODY[<section>] that does not
+ implicitly set the \Seen flag.
+
+ BODYSTRUCTURE The [MIME-IMB] body structure of the message. This
+ is computed by the server by parsing the [MIME-IMB]
+ header fields in the [RFC-822] header and
+ [MIME-IMB] headers.
+
+ ENVELOPE The envelope structure of the message. This is
+ computed by the server by parsing the [RFC-822]
+ header into the component parts, defaulting various
+ fields as necessary.
+
+ FAST Macro equivalent to: (FLAGS INTERNALDATE
+ RFC822.SIZE)
+
+ FLAGS The flags that are set for this message.
+
+ FULL Macro equivalent to: (FLAGS INTERNALDATE
+ RFC822.SIZE ENVELOPE BODY)
+
+ INTERNALDATE The internal date of the message.
+
+ RFC822 Functionally equivalent to BODY[], differing in the
+ syntax of the resulting untagged FETCH data (RFC822
+ is returned).
+
+ RFC822.HEADER Functionally equivalent to BODY.PEEK[HEADER],
+ differing in the syntax of the resulting untagged
+ FETCH data (RFC822.HEADER is returned).
+
+ RFC822.SIZE The [RFC-822] size of the message.
+
+ RFC822.TEXT Functionally equivalent to BODY[TEXT], differing in
+ the syntax of the resulting untagged FETCH data
+ (RFC822.TEXT is returned).
+
+ UID The unique identifier for the message.
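+
+ Informative sketch: the part numbering rules given under
+ BODY[<section>] can be expressed as a recursive walk. Each node is
+ a hypothetical (media type, children) pair. The structure below
+ mirrors the complex message example, with the assumption that the
+ bodies of the encapsulated messages are MULTIPART/MIXED, which the
+ example does not state.

```python
def number_parts(node, prefix=""):
    """Yield (part number, media type) for every numbered part inside node."""
    media_type, children = node
    if media_type == "MESSAGE/RFC822":
        body = children[0]
        if body[0].startswith("MULTIPART/"):
            # A multipart body lends its children to the message's numbering.
            yield from number_parts(body, prefix)
        else:
            # A non-multipart message has only a part 1.
            number = prefix + "1"
            yield number, body[0]
            yield from number_parts(body, number + ".")
    elif media_type.startswith("MULTIPART/"):
        for index, child in enumerate(children, start=1):
            number = prefix + str(index)
            yield number, child[0]
            yield from number_parts(child, number + ".")


EXAMPLE = ("MESSAGE/RFC822", [("MULTIPART/MIXED", [
    ("TEXT/PLAIN", []),
    ("APPLICATION/OCTET-STREAM", []),
    ("MESSAGE/RFC822", [("MULTIPART/MIXED", [
        ("TEXT/PLAIN", []),
        ("APPLICATION/OCTET-STREAM", []),
    ])]),
    ("MULTIPART/MIXED", [
        ("IMAGE/GIF", []),
        ("MESSAGE/RFC822", [("MULTIPART/MIXED", [
            ("TEXT/PLAIN", []),
            ("MULTIPART/ALTERNATIVE", [
                ("TEXT/PLAIN", []),
                ("TEXT/RICHTEXT", []),
            ]),
        ])]),
    ]),
])])
```

+ Iterating number_parts(EXAMPLE) produces the numeric part
+ specifiers 1, 2, 3, 3.1, 3.2, 4, 4.1, 4.2, 4.2.1, 4.2.2, 4.2.2.1,
+ and 4.2.2.2 in that order.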
+
+ Example: C: A654 FETCH 2:4 (FLAGS BODY[HEADER.FIELDS (DATE FROM)])
+ S: * 2 FETCH ....
+ S: * 3 FETCH ....
+ S: * 4 FETCH ....
+ S: A654 OK FETCH completed
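+
+ Informative sketch: the truncation rule for partial fetches
+ (BODY[<section>]<<partial>>) amounts to an ordinary slice. The
+ helper partial_fetch is a hypothetical name and handles only the
+ empty section specification.

```python
def partial_fetch(text, start, count):
    """Return (data item label, octets) for BODY[]<start.count>."""
    # Slicing truncates at the end of the text and yields an empty
    # string when the starting octet is beyond the end.
    chunk = text[start:start + count]
    # The reply names only the origin octet, even if truncation happened.
    return f"BODY[]<{start}>", chunk
```

+ For a 1500-octet message, partial_fetch(message, 0, 2048) returns
+ the label BODY[]<0> together with all 1500 octets.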
+
+6.4.6. STORE Command
+
+ Arguments: message set
+ message data item name
+ value for message data item
+
+ Responses: untagged responses: FETCH
+
+ Result: OK - store completed
+ NO - store error: can't store that data
+ BAD - command unknown or arguments invalid
+
+ The STORE command alters data associated with a message in the
+ mailbox. Normally, STORE will return the updated value of the
+ data with an untagged FETCH response. A suffix of ".SILENT" in
+ the data item name prevents the untagged FETCH, and the server
+ SHOULD assume that the client has determined the updated value
+ itself or does not care about the updated value.
+
+ Note: regardless of whether or not the ".SILENT" suffix was
+ used, the server SHOULD send an untagged FETCH response if a
+ change to a message's flags from an external source is
+ observed. The intent is that the status of the flags is
+ determinate without a race condition.
+
+ The currently defined data items that can be stored are:
+
+ FLAGS <flag list>
+ Replace the flags for the message with the
+ argument. The new value of the flags is returned
+ as if a FETCH of those flags was done.
+
+ FLAGS.SILENT <flag list>
+ Equivalent to FLAGS, but without returning a new
+ value.
+
+ +FLAGS <flag list>
+ Add the argument to the flags for the message. The
+ new value of the flags is returned as if a FETCH
+ of those flags was done.
+
+ +FLAGS.SILENT <flag list>
+ Equivalent to +FLAGS, but without returning a new
+ value.
+
+ -FLAGS <flag list>
+ Remove the argument from the flags for the message.
+ The new value of the flags is returned as if a
+ FETCH of those flags was done.
+
+ -FLAGS.SILENT <flag list>
+ Equivalent to -FLAGS, but without returning a new
+ value.
+
+ Example: C: A003 STORE 2:4 +FLAGS (\Deleted)
+ S: * 2 FETCH FLAGS (\Deleted \Seen)
+ S: * 3 FETCH FLAGS (\Deleted)
+ S: * 4 FETCH FLAGS (\Deleted \Flagged \Seen)
+ S: A003 OK STORE completed
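+
+ Informative sketch: the six STORE data items reduce to three set
+ operations plus a ".SILENT" switch. The function apply_store is a
+ hypothetical name; flags are compared as exact strings here, which
+ is a simplification.

```python
def apply_store(current, item, flags):
    """Return (new flag set, whether an untagged FETCH should be sent)."""
    name = item.upper()
    silent = name.endswith(".SILENT")
    if silent:
        name = name[:-len(".SILENT")]
    if name == "FLAGS":
        updated = set(flags)                 # replace
    elif name == "+FLAGS":
        updated = set(current) | set(flags)  # add
    elif name == "-FLAGS":
        updated = set(current) - set(flags)  # remove
    else:
        raise ValueError("BAD: unknown data item " + item)
    return updated, not silent
```

+ Applying "+FLAGS" with (\Deleted) to a message that already has
+ \Seen gives the flag set reported for message 2 in the example.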
+
+6.4.7. COPY Command
+
+ Arguments: message set
+ mailbox name
+
+ Responses: no specific responses for this command
+
+ Result: OK - copy completed
+ NO - copy error: can't copy those messages or to that
+ name
+ BAD - command unknown or arguments invalid
+
+ The COPY command copies the specified message(s) to the end of the
+ specified destination mailbox. The flags and internal date of the
+ message(s) SHOULD be preserved in the copy.
+
+ If the destination mailbox does not exist, a server SHOULD return
+ an error. It SHOULD NOT automatically create the mailbox. Unless
+ it is certain that the destination mailbox can not be created, the
+ server MUST send the response code "[TRYCREATE]" as the prefix of
+ the text of the tagged NO response. This gives a hint to the
+ client that it can attempt a CREATE command and retry the COPY if
+ the CREATE is successful.
+
+ If the COPY command is unsuccessful for any reason, server
+ implementations MUST restore the destination mailbox to its state
+ before the COPY attempt.
+
+ Example: C: A003 COPY 2:4 MEETING
+ S: A003 OK COPY completed
+
+6.4.8. UID Command
+
+ Arguments: command name
+ command arguments
+
+ Responses: untagged responses: FETCH, SEARCH
+
+ Result: OK - UID command completed
+ NO - UID command error
+ BAD - command unknown or arguments invalid
+
+ The UID command has two forms. In the first form, it takes as its
+ arguments a COPY, FETCH, or STORE command with arguments
+ appropriate for the associated command. However, the numbers in
+ the message set argument are unique identifiers instead of message
+ sequence numbers.
+
+ In the second form, the UID command takes a SEARCH command with
+ SEARCH command arguments. The interpretation of the arguments is
+ the same as with SEARCH; however, the numbers returned in a SEARCH
+ response for a UID SEARCH command are unique identifiers instead
+ of message sequence numbers. For example, the command UID SEARCH
+ 1:100 UID 443:557 returns the unique identifiers corresponding to
+ the intersection of the message sequence number set 1:100 and the
+ UID set 443:557.
+
+ Message set ranges are permitted; however, there is no guarantee
+ that unique identifiers will be contiguous. A non-existent unique
+ identifier within a message set range is ignored without any error
+ message generated.
+
+ The number after the "*" in an untagged FETCH response is always a
+ message sequence number, not a unique identifier, even for a UID
+ command response. However, server implementations MUST implicitly
+ include the UID message data item as part of any FETCH response
+ caused by a UID command, regardless of whether a UID was specified
+ as a message data item to the FETCH.
+
+ Example: C: A999 UID FETCH 4827313:4828442 FLAGS
+ S: * 23 FETCH (FLAGS (\Seen) UID 4827313)
+ S: * 24 FETCH (FLAGS (\Seen) UID 4827943)
+ S: * 25 FETCH (FLAGS (\Seen) UID 4828442)
+ S: A999 OK UID FETCH completed
+
+6.5. Client Commands - Experimental/Expansion
+
+6.5.1. X<atom> Command
+
+ Arguments: implementation defined
+
+ Responses: implementation defined
+
+ Result: OK - command completed
+ NO - failure
+ BAD - command unknown or arguments invalid
+
+ Any command prefixed with an X is an experimental command.
+ Commands which are not part of this specification, a standard or
+ standards-track revision of this specification, or an IESG-
+ approved experimental protocol, MUST use the X prefix.
+
+ Any added untagged responses issued by an experimental command
+ MUST also be prefixed with an X. Server implementations MUST NOT
+ send any such untagged responses, unless the client requested them
+ by issuing the associated experimental command.
+
+ Example: C: a441 CAPABILITY
+ S: * CAPABILITY IMAP4rev1 AUTH=KERBEROS_V4 XPIG-LATIN
+ S: a441 OK CAPABILITY completed
+ C: A442 XPIG-LATIN
+ S: * XPIG-LATIN ow-nay eaking-spay ig-pay atin-lay
+ S: A442 OK XPIG-LATIN ompleted-cay
+
+7. Server Responses
+
+ Server responses are in three forms: status responses, server data,
+ and command continuation request. The information contained in a
+ server response, identified by "Contents:" in the response
+ descriptions below, is described by function, not by syntax. The
+ precise syntax of server responses is described in the Formal Syntax
+ section.
+
+ The client MUST be prepared to accept any response at all times.
+
+ Status responses can be tagged or untagged. Tagged status responses
+ indicate the completion result (OK, NO, or BAD status) of a client
+ command, and have a tag matching the command.
+
+ Some status responses, and all server data, are untagged. An
+ untagged response is indicated by the token "*" instead of a tag.
+ Untagged status responses indicate server greeting, or server status
+ that does not indicate the completion of a command (for example, an
+ impending system shutdown alert). For historical reasons, untagged
+ server data responses are also called "unsolicited data", although
+ strictly speaking only unilateral server data is truly "unsolicited".
+
+ Certain server data MUST be recorded by the client when it is
+ received; this is noted in the description of that data. Such data
+ conveys critical information which affects the interpretation of all
+ subsequent commands and responses (e.g. updates reflecting the
+ creation or destruction of messages).
+
+ Other server data SHOULD be recorded for later reference; if the
+ client does not need to record the data, or if recording the data has
+ no obvious purpose (e.g. a SEARCH response when no SEARCH command is
+ in progress), the data SHOULD be ignored.
+
+ An example of unilateral untagged server data occurs when the IMAP
+ connection is in selected state. In selected state, the server
+ checks the mailbox for new messages as part of command execution.
+ Normally, this is part of the execution of every command; hence, a
+ NOOP command suffices to check for new messages. If new messages are
+ found, the server sends untagged EXISTS and RECENT responses
+ reflecting the new size of the mailbox. Server implementations that
+ offer multiple simultaneous access to the same mailbox SHOULD also
+ send appropriate unilateral untagged FETCH and EXPUNGE responses if
+ another agent changes the state of any message flags or expunges any
+ messages.
+
+ Command continuation request responses use the token "+" instead of a
+ tag. These responses are sent by the server to indicate acceptance
+ of an incomplete client command and readiness for the remainder of
+ the command.
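
The three forms can be told apart by the first token of a response line. The following Python sketch shows one way to do that; the function name `classify_response` and the tuple it returns are hypothetical choices made for this illustration, and the line is assumed to have had its trailing CRLF removed already.

```python
def classify_response(line):
    """Classify one server response line by its leading token.

    Returns (form, tag, rest) where form is "continuation", "untagged",
    or "tagged"; tag is None unless the response is tagged.
    """
    token, _, rest = line.partition(" ")
    if token == "+":
        return ("continuation", None, rest)
    if token == "*":
        return ("untagged", None, rest)
    # Anything else is taken to be the tag of a tagged status response.
    return ("tagged", token, rest)


for example in ("* 23 EXISTS", "A001 OK LOGIN completed", "+ Ready"):
    print(classify_response(example))
```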
+
+7.1. Server Responses - Status Responses
+
+ Status responses are OK, NO, BAD, PREAUTH and BYE. OK, NO, and BAD
+ may be tagged or untagged. PREAUTH and BYE are always untagged.
+
+ Status responses MAY include an OPTIONAL "response code". A response
+ code consists of data inside square brackets in the form of an atom,
+ possibly followed by a space and arguments. The response code
+ contains additional information or status codes for client software
+ beyond the OK/NO/BAD condition. Response codes are defined when
+ there is a specific action that a client can take based upon the
+ additional information.
+
+ The currently defined response codes are:
+
+ ALERT The human-readable text contains a special alert
+ that MUST be presented to the user in a fashion
+ that calls the user's attention to the message.
+
+ NEWNAME Followed by a mailbox name and a new mailbox name.
+ A SELECT or EXAMINE is failing because the target
+ mailbox name no longer exists because it was
+ renamed to the new mailbox name. This is a hint to
+ the client that the operation can succeed if the
+ SELECT or EXAMINE is reissued with the new mailbox
+ name.
+
+ PARSE The human-readable text represents an error in
+ parsing the [RFC-822] header or [MIME-IMB] headers
+ of a message in the mailbox.
+
+ PERMANENTFLAGS Followed by a parenthesized list of flags,
+ indicates which of the known flags the client
+ can change permanently. Any flags that are in the
+ FLAGS untagged response, but not the PERMANENTFLAGS
+ list, can not be set permanently. If the client
+ attempts to STORE a flag that is not in the
+ PERMANENTFLAGS list, the server will either reject
+ it with a NO reply or store the state for the
+ remainder of the current session only. The
+ PERMANENTFLAGS list can also include the special
+ flag \*, which indicates that it is possible to
+ create new keywords by attempting to store those
+ flags in the mailbox.
+
+ READ-ONLY The mailbox is selected read-only, or its access
+ while selected has changed from read-write to
+ read-only.
+
+ READ-WRITE The mailbox is selected read-write, or its access
+ while selected has changed from read-only to
+ read-write.
+
+ TRYCREATE An APPEND or COPY attempt is failing because the
+ target mailbox does not exist (as opposed to some
+ other reason). This is a hint to the client that
+ the operation can succeed if the mailbox is first
+ created by the CREATE command.
+
+ UIDVALIDITY Followed by a decimal number, indicates the unique
+ identifier validity value.
+
+ UNSEEN Followed by a decimal number, indicates the number
+ of the first message without the \Seen flag set.
+
+ Additional response codes defined by particular client or server
+ implementations SHOULD be prefixed with an "X" until they are
+ added to a revision of this protocol. Client implementations
+ SHOULD ignore response codes that they do not recognize.
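
As an illustration of the layout described above, the Python sketch below separates the status condition, the optional bracketed response code with its arguments, and the human-readable text. The name `parse_status` and the returned dictionary keys are hypothetical, the input is assumed to be the part of the line after the tag or "*" token, and the regular expression is a simplification rather than a rendering of the formal syntax.

```python
import re

_STATUS = re.compile(
    r"^(?P<cond>OK|NO|BAD|PREAUTH|BYE)"                     # status condition
    r"(?: \[(?P<code>[^\]\s]+)(?: (?P<args>[^\]]*))?\])?"   # optional [CODE args]
    r"(?: (?P<text>.*))?$",                                  # human-readable text
    re.IGNORECASE,
)


def parse_status(rest):
    """Split a status response into condition, response code, args, and text.

    Returns None if `rest` does not start with a status condition.
    """
    m = _STATUS.match(rest)
    if m is None:
        return None
    code = m.group("code")
    return {
        "condition": m.group("cond").upper(),
        "code": code.upper() if code else None,
        "args": m.group("args"),
        "text": m.group("text") or "",
    }


print(parse_status("OK [UNSEEN 17] Message 17 is the first unseen message"))
```

A client built along these lines can look up `code` in a table of the response codes it understands and ignore the rest, which matches the advice to ignore unrecognized codes.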
+
+7.1.1. OK Response
+
+ Contents: OPTIONAL response code
+ human-readable text
+
+ The OK response indicates an information message from the server.
+ When tagged, it indicates successful completion of the associated
+ command. The human-readable text MAY be presented to the user as
+ an information message. The untagged form indicates an
+ information-only message; the nature of the information MAY be
+ indicated by a response code.
+
+ The untagged form is also used as one of three possible greetings
+ at connection startup. It indicates that the connection is not
+ yet authenticated and that a LOGIN command is needed.
+
+ Example: S: * OK IMAP4rev1 server ready
+ C: A001 LOGIN fred blurdybloop
+ S: * OK [ALERT] System shutdown in 10 minutes
+ S: A001 OK LOGIN Completed
+
+7.1.2. NO Response
+
+ Contents: OPTIONAL response code
+ human-readable text
+
+ The NO response indicates an operational error message from the
+ server. When tagged, it indicates unsuccessful completion of the
+ associated command. The untagged form indicates a warning; the
+ command can still complete successfully. The human-readable text
+ describes the condition.
+
+ Example: C: A222 COPY 1:2 owatagusiam
+ S: * NO Disk is 98% full, please delete unnecessary data
+ S: A222 OK COPY completed
+ C: A223 COPY 3:200 blurdybloop
+ S: * NO Disk is 98% full, please delete unnecessary data
+ S: * NO Disk is 99% full, please delete unnecessary data
+ S: A223 NO COPY failed: disk is full
+
+7.1.3. BAD Response
+
+ Contents: OPTIONAL response code
+ human-readable text
+
+ The BAD response indicates an error message from the server. When
+ tagged, it reports a protocol-level error in the client's command;
+ the tag indicates the command that caused the error. The untagged
+ form indicates a protocol-level error for which the associated
+ command can not be determined; it can also indicate an internal
+ server failure. The human-readable text describes the condition.
+
+ Example: C: ...very long command line...
+ S: * BAD Command line too long
+ C: ...empty line...
+ S: * BAD Empty command line
+ C: A443 EXPUNGE
+ S: * BAD Disk crash, attempting salvage to a new disk!
+ S: * OK Salvage successful, no data lost
+ S: A443 OK Expunge completed
+
+7.1.4. PREAUTH Response
+
+ Contents: OPTIONAL response code
+ human-readable text
+
+ The PREAUTH response is always untagged, and is one of three
+ possible greetings at connection startup. It indicates that the
+ connection has already been authenticated by external means and
+ thus no LOGIN command is needed.
+
+ Example: S: * PREAUTH IMAP4rev1 server logged in as Smith
+
+7.1.5. BYE Response
+
+ Contents: OPTIONAL response code
+ human-readable text
+
+ The BYE response is always untagged, and indicates that the server
+ is about to close the connection. The human-readable text MAY be
+ displayed to the user in a status report by the client. The BYE
+ response is sent under one of four conditions:
+
+ 1) as part of a normal logout sequence. The server will close
+ the connection after sending the tagged OK response to the
+ LOGOUT command.
+
+ 2) as a panic shutdown announcement. The server closes the
+ connection immediately.
+
+ 3) as an announcement of an inactivity autologout. The server
+ closes the connection immediately.
+
+ 4) as one of three possible greetings at connection startup,
+ indicating that the server is not willing to accept a
+ connection from this client. The server closes the
+ connection immediately.
+
+ The difference between a BYE that occurs as part of a normal
+ LOGOUT sequence (the first case) and a BYE that occurs because of
+ a failure (the other three cases) is that the connection closes
+ immediately in the failure case.
+
+ Example: S: * BYE Autologout; idle for too long
+
+7.2. Server Responses - Server and Mailbox Status
+
+ These responses are always untagged. This is how server and mailbox
+ status data are transmitted from the server to the client. Many of
+ these responses typically result from a command with the same name.
+
+7.2.1. CAPABILITY Response
+
+ Contents: capability listing
+
+ The CAPABILITY response occurs as a result of a CAPABILITY
+ command. The capability listing contains a space-separated
+ listing of capability names that the server supports. The
+ capability listing MUST include the atom "IMAP4rev1".
+
+ A capability name which begins with "AUTH=" indicates that the
+ server supports that particular authentication mechanism.
+
+ Other capability names indicate that the server supports an
+ extension, revision, or amendment to the IMAP4rev1 protocol.
+ Server responses MUST conform to this document until the client
+ issues a command that uses the associated capability.
+
+ Capability names MUST either begin with "X" or be standard or
+ standards-track IMAP4rev1 extensions, revisions, or amendments
+ registered with IANA. A server MUST NOT offer unregistered or
+ non-standard capability names, unless such names are prefixed with
+ an "X".
+
+ Client implementations SHOULD NOT require any capability name
+ other than "IMAP4rev1", and MUST ignore any unknown capability
+ names.
+
+ Example: S: * CAPABILITY IMAP4rev1 AUTH=KERBEROS_V4 XPIG-LATIN
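
A client might split the capability listing into authentication mechanisms and other capability names roughly as follows. This is a sketch only: `parse_capability` is a hypothetical helper, and its input is assumed to be the response text after the "*" token.

```python
def parse_capability(rest):
    """Split a CAPABILITY response into (capability names, AUTH mechanisms)."""
    words = rest.split()
    if not words or words[0].upper() != "CAPABILITY":
        raise ValueError("not a CAPABILITY response")
    names, auth = [], []
    for word in words[1:]:
        if word.upper().startswith("AUTH="):
            auth.append(word[len("AUTH="):])  # keep only the mechanism name
        else:
            names.append(word)
    return names, auth


names, auth = parse_capability("CAPABILITY IMAP4rev1 AUTH=KERBEROS_V4 XPIG-LATIN")
print(names, auth)
```

Unknown names simply end up in the first list, where a client can leave them unused, as required above.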
+
+7.2.2. LIST Response
+
+ Contents: name attributes
+ hierarchy delimiter
+ name
+
+ The LIST response occurs as a result of a LIST command. It
+ returns a single name that matches the LIST specification. There
+ can be multiple LIST responses for a single LIST command.
+
+ Four name attributes are defined:
+
+ \Noinferiors It is not possible for any child levels of
+ hierarchy to exist under this name; no child levels
+ exist now and none can be created in the future.
+
+ \Noselect It is not possible to use this name as a selectable
+ mailbox.
+
+ \Marked The mailbox has been marked "interesting" by the
+ server; the mailbox probably contains messages that
+ have been added since the last time the mailbox was
+ selected.
+
+ \Unmarked The mailbox does not contain any additional
+ messages since the last time the mailbox was
+ selected.
+
+ If it is not feasible for the server to determine whether the
+ mailbox is "interesting" or not, or if the name is a \Noselect
+ name, the server SHOULD NOT send either \Marked or \Unmarked.
+
+ The hierarchy delimiter is a character used to delimit levels of
+ hierarchy in a mailbox name. A client can use it to create child
+ mailboxes, and to search higher or lower levels of naming
+ hierarchy. All children of a top-level hierarchy node MUST use
+ the same separator character. A NIL hierarchy delimiter means
+ that no hierarchy exists; the name is a "flat" name.
+
+ The name represents an unambiguous left-to-right hierarchy, and
+ MUST be valid for use as a reference in LIST and LSUB commands.
+ Unless \Noselect is indicated, the name MUST also be valid as an
+ argument for commands, such as SELECT, that accept mailbox
+ names.
+
+ Example: S: * LIST (\Noselect) "/" ~/Mail/foo
+
+7.2.3. LSUB Response
+
+ Contents: name attributes
+ hierarchy delimiter
+ name
+
+ The LSUB response occurs as a result of an LSUB command. It
+ returns a single name that matches the LSUB specification. There
+ can be multiple LSUB responses for a single LSUB command. The
+ data is identical in format to the LIST response.
+
+ Example: S: * LSUB () "." #news.comp.mail.misc
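
Because LIST and LSUB responses share one format, a single routine can read both. The Python sketch below handles the shapes shown in the two examples (an attribute list, a quoted or NIL delimiter, and a name given as an atom or a simple quoted string). The name `parse_list` is hypothetical, and names sent as literals are deliberately not handled.

```python
import re

_LIST = re.compile(
    r'^(?P<kind>LIST|LSUB) \((?P<attrs>[^)]*)\) '
    r'(?P<delim>NIL|"(?:\\.|[^"\\])") (?P<name>.+)$',
    re.IGNORECASE,
)


def parse_list(rest):
    """Parse the text after "*" in a LIST or LSUB response."""
    m = _LIST.match(rest)
    if m is None:
        raise ValueError("not a LIST or LSUB response")
    delim = m.group("delim")
    if delim.upper() == "NIL":
        delim = None                      # flat name, no hierarchy
    else:
        delim = delim[1:-1]               # drop the surrounding quotes
        if delim.startswith("\\"):
            delim = delim[1:]             # drop the quoting backslash
    name = m.group("name")
    if len(name) >= 2 and name[0] == '"' and name[-1] == '"':
        name = re.sub(r"\\(.)", r"\1", name[1:-1])
    return {
        "attributes": m.group("attrs").split(),
        "delimiter": delim,
        "name": name,
    }


print(parse_list('LIST (\\Noselect) "/" ~/Mail/foo'))
```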
+
+7.2.4. STATUS Response
+
+ Contents: name
+ status parenthesized list
+
+ The STATUS response occurs as a result of a STATUS command. It
+ returns the mailbox name that matches the STATUS specification and
+ the requested mailbox status information.
+
+ Example: S: * STATUS blurdybloop (MESSAGES 231 UIDNEXT 44292)
+
+7.2.5. SEARCH Response
+
+ Contents: zero or more numbers
+
+ The SEARCH response occurs as a result of a SEARCH or UID SEARCH
+ command. The number(s) refer to those messages that match the
+ search criteria. For SEARCH, these are message sequence numbers;
+ for UID SEARCH, these are unique identifiers. Each number is
+ delimited by a space.
+
+ Example: S: * SEARCH 2 3 6
+
+7.2.6. FLAGS Response
+
+ Contents: flag parenthesized list
+
+ The FLAGS response occurs as a result of a SELECT or EXAMINE
+ command. The flag parenthesized list identifies the flags (at a
+ minimum, the system-defined flags) that are applicable for this
+ mailbox. Flags other than the system flags can also exist,
+ depending on server implementation.
+
+ The update from the FLAGS response MUST be recorded by the client.
+
+ Example: S: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
+
+7.3. Server Responses - Mailbox Size
+
+ These responses are always untagged. This is how changes in the size
+ of the mailbox are transmitted from the server to the client.
+ Immediately following the "*" token is a number that represents a
+ message count.
+
+7.3.1. EXISTS Response
+
+ Contents: none
+
+ The EXISTS response reports the number of messages in the mailbox.
+ This response occurs as a result of a SELECT or EXAMINE command,
+ and if the size of the mailbox changes (e.g. new mail).
+
+ The update from the EXISTS response MUST be recorded by the
+ client.
+
+ Example: S: * 23 EXISTS
+
+7.3.2. RECENT Response
+
+ Contents: none
+
+ The RECENT response reports the number of messages with the
+ \Recent flag set. This response occurs as a result of a SELECT or
+ EXAMINE command, and if the size of the mailbox changes (e.g. new
+ mail).
+
+ Note: It is not guaranteed that the message sequence numbers of
+ recent messages will be a contiguous range of the highest n
+ messages in the mailbox (where n is the value reported by the
+ RECENT response). Examples of situations in which this is not
+ the case are: multiple clients having the same mailbox open
+ (the first session to be notified will see it as recent, others
+ will probably see it as non-recent), and when the mailbox is
+ re-ordered by a non-IMAP agent.
+
+ The only reliable way to identify recent messages is to look at
+ message flags to see which have the \Recent flag set, or to do
+ a SEARCH RECENT.
+
+ The update from the RECENT response MUST be recorded by the
+ client.
+
+ Example: S: * 5 RECENT
+
+7.4. Server Responses - Message Status
+
+ These responses are always untagged. This is how message data are
+ transmitted from the server to the client, often as a result of a
+ command with the same name. Immediately following the "*" token is a
+ number that represents a message sequence number.
+
+7.4.1. EXPUNGE Response
+
+ Contents: none
+
+ The EXPUNGE response reports that the specified message sequence
+ number has been permanently removed from the mailbox. The message
+ sequence number for each successive message in the mailbox is
+ immediately decremented by 1, and this decrement is reflected in
+ message sequence numbers in subsequent responses (including other
+ untagged EXPUNGE responses).
+
+ As a result of the immediate decrement rule, message sequence
+ numbers that appear in a set of successive EXPUNGE responses
+ depend upon whether the messages are removed starting from lower
+ numbers to higher numbers, or from higher numbers to lower
+ numbers. For example, if the last 5 messages in a 9-message
+ mailbox are expunged, a "lower to higher" server will send five
+ untagged EXPUNGE responses for message sequence number 5, whereas
+ a "higher to lower" server will send successive untagged EXPUNGE
+ responses for message sequence numbers 9, 8, 7, 6, and 5.
+
+ An EXPUNGE response MUST NOT be sent when no command is in
+ progress; nor while responding to a FETCH, STORE, or SEARCH
+ command. This rule is necessary to prevent a loss of
+ synchronization of message sequence numbers between client and
+ server.
+
+ The update from the EXPUNGE response MUST be recorded by the
+ client.
+
+ Example: S: * 44 EXPUNGE
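
The immediate decrement rule is easy to check with a small simulation. In the Python sketch below, `expunge_responses` models what a server would report and `apply_expunges` models how a client replays those reports; both names are hypothetical and the mailbox is represented as a plain list, which is an assumption of the sketch.

```python
def expunge_responses(doomed, low_to_high=True):
    """Return the sequence numbers reported, in order, when the messages
    whose current sequence numbers are in `doomed` are expunged."""
    if not low_to_high:
        # Removing from the top never renumbers a message still to be removed.
        return sorted(doomed, reverse=True)
    reported = []
    removed = 0
    for seq in sorted(doomed):
        reported.append(seq - removed)  # earlier removals shifted it down
        removed += 1
    return reported


def apply_expunges(messages, reported):
    """Client side: replay EXPUNGE responses against a list of messages."""
    messages = list(messages)
    for seq in reported:
        del messages[seq - 1]  # sequence numbers are 1-based
    return messages


mailbox = list("abcdefghi")  # nine messages
for order in (True, False):
    reported = expunge_responses({5, 6, 7, 8, 9}, low_to_high=order)
    print(reported, apply_expunges(mailbox, reported))
```

Either order leaves the client with the same four messages, provided each EXPUNGE is applied as soon as it is received.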
+
+7.4.2. FETCH Response
+
+ Contents: message data
+
+ The FETCH response returns data about a message to the client.
+ The data are pairs of data item names and their values in
+ parentheses. This response occurs as the result of a FETCH or
+ STORE command, as well as by unilateral server decision (e.g. flag
+ updates).
+
+ The current data items are:
+
+ BODY A form of BODYSTRUCTURE without extension data.
+
+ BODY[<section>]<<origin_octet>>
+ A string expressing the body contents of the
+ specified section. The string SHOULD be
+ interpreted by the client according to the content
+ transfer encoding, body type, and subtype.
+
+ If the origin octet is specified, this string is a
+ substring of the entire body contents, starting at
+ that origin octet. This means that BODY[]<0> MAY
+ be truncated, but BODY[] is NEVER truncated.
+
+ 8-bit textual data is permitted if a [CHARSET]
+ identifier is part of the body parameter
+ parenthesized list for this section. Note that
+ headers (part specifiers HEADER or MIME, or the
+ header portion of a MESSAGE/RFC822 part), MUST be
+ 7-bit; 8-bit characters are not permitted in
+ headers. Note also that the blank line at the end
+ of the header is always included in header data.
+
+ Non-textual data such as binary data MUST be
+ transfer encoded into a textual form such as BASE64
+ prior to being sent to the client. To derive the
+ original binary data, the client MUST decode the
+ transfer encoded string.
+
+ BODYSTRUCTURE A parenthesized list that describes the [MIME-IMB]
+ body structure of a message. This is computed by
+ the server by parsing the [MIME-IMB] header fields,
+ defaulting various fields as necessary.
+
+ For example, a simple text message of 48 lines and
+ 2279 octets can have a body structure of: ("TEXT"
+ "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 2279
+ 48)
+
+ Multiple parts are indicated by parenthesis
+ nesting. Instead of a body type as the first
+ element of the parenthesized list there is a nested
+ body. The second element of the parenthesized list
+ is the multipart subtype (mixed, digest, parallel,
+ alternative, etc.).
+
+ For example, a two part message consisting of a
+ text and a BASE64-encoded text attachment can have
+ a body structure of: (("TEXT" "PLAIN" ("CHARSET"
+ "US-ASCII") NIL NIL "7BIT" 1152 23)("TEXT" "PLAIN"
+ ("CHARSET" "US-ASCII" "NAME" "cc.diff")
+ "<960723163407.20117h@cac.washington.edu>"
+ "Compiler diff" "BASE64" 4554 73) "MIXED"))
+
+ Extension data follows the multipart subtype.
+ Extension data is never returned with the BODY
+ fetch, but can be returned with a BODYSTRUCTURE
+ fetch. Extension data, if present, MUST be in the
+ defined order.
+
+ The extension data of a multipart body part are in
+ the following order:
+
+ body parameter parenthesized list
+ A parenthesized list of attribute/value pairs
+ [e.g. ("foo" "bar" "baz" "rag") where "bar" is
+ the value of "foo" and "rag" is the value of
+ "baz"] as defined in [MIME-IMB].
+
+ body disposition
+ A parenthesized list, consisting of a
+ disposition type string followed by a
+ parenthesized list of disposition
+ attribute/value pairs. The disposition type and
+ attribute names will be defined in a future
+ standards-track revision to [DISPOSITION].
+
+ body language
+ A string or parenthesized list giving the body
+ language value as defined in [LANGUAGE-TAGS].
+
+ Any following extension data are not yet defined in
+ this version of the protocol. Such extension data
+ can consist of zero or more NILs, strings, numbers,
+ or potentially nested parenthesized lists of such
+ data. Client implementations that do a
+ BODYSTRUCTURE fetch MUST be prepared to accept such
+ extension data. Server implementations MUST NOT
+ send such extension data until it has been defined
+ by a revision of this protocol.
+
+ The basic fields of a non-multipart body part are
+ in the following order:
+
+ body type
+ A string giving the content media type name as
+ defined in [MIME-IMB].
+
+ body subtype
+ A string giving the content subtype name as
+ defined in [MIME-IMB].
+
+ body parameter parenthesized list
+ A parenthesized list of attribute/value pairs
+ [e.g. ("foo" "bar" "baz" "rag") where "bar" is
+ the value of "foo" and "rag" is the value of
+ "baz"] as defined in [MIME-IMB].
+
+ body id
+ A string giving the content id as defined in
+ [MIME-IMB].
+
+ body description
+ A string giving the content description as
+ defined in [MIME-IMB].
+
+ body encoding
+ A string giving the content transfer encoding as
+ defined in [MIME-IMB].
+
+ body size
+ A number giving the size of the body in octets.
+ Note that this size is the size in its transfer
+ encoding and not the resulting size after any
+ decoding.
+
+ A body type of type MESSAGE and subtype RFC822
+ contains, immediately after the basic fields, the
+ envelope structure, body structure, and size in
+ text lines of the encapsulated message.
+
+ A body type of type TEXT contains, immediately
+ after the basic fields, the size of the body in
+ text lines. Note that this size is the size in its
+ content transfer encoding and not the resulting
+ size after any decoding.
+
+ Extension data follows the basic fields and the
+ type-specific fields listed above. Extension data
+ is never returned with the BODY fetch, but can be
+ returned with a BODYSTRUCTURE fetch. Extension
+ data, if present, MUST be in the defined order.
+
+ The extension data of a non-multipart body part are
+ in the following order:
+
+ body MD5
+ A string giving the body MD5 value as defined in
+ [MD5].
+
+ body disposition
+ A parenthesized list with the same content and
+ function as the body disposition for a multipart
+ body part.
+
+ body language
+ A string or parenthesized list giving the body
+ language value as defined in [LANGUAGE-TAGS].
+
+ Any following extension data are not yet defined in
+ this version of the protocol, and would be as
+ described above under multipart extension data.
+
+ ENVELOPE A parenthesized list that describes the envelope
+ structure of a message. This is computed by the
+ server by parsing the [RFC-822] header into the
+ component parts, defaulting various fields as
+ necessary.
+
+ The fields of the envelope structure are in the
+ following order: date, subject, from, sender,
+ reply-to, to, cc, bcc, in-reply-to, and message-id.
+ The date, subject, in-reply-to, and message-id
+ fields are strings. The from, sender, reply-to,
+ to, cc, and bcc fields are parenthesized lists of
+ address structures.
+
+ An address structure is a parenthesized list that
+ describes an electronic mail address. The fields
+ of an address structure are in the following order:
+ personal name, [SMTP] at-domain-list (source
+ route), mailbox name, and host name.
+
+ [RFC-822] group syntax is indicated by a special
+ form of address structure in which the host name
+ field is NIL. If the mailbox name field is also
+ NIL, this is an end of group marker (semi-colon in
+ RFC 822 syntax). If the mailbox name field is
+ non-NIL, this is a start of group marker, and the
+ mailbox name field holds the group name phrase.
+
+ Any field of an envelope or address structure that
+ is not applicable is presented as NIL. Note that
+ the server MUST default the reply-to and sender
+ fields from the from field; a client is not
+ expected to know to do this.
+
+ FLAGS A parenthesized list of flags that are set for this
+ message.
+
+ INTERNALDATE A string representing the internal date of the
+ message.
+
+ RFC822 Equivalent to BODY[].
+
+ RFC822.HEADER Equivalent to BODY.PEEK[HEADER].
+
+ RFC822.SIZE A number expressing the [RFC-822] size of the
+ message.
+
+ RFC822.TEXT Equivalent to BODY[TEXT].
+
+ UID A number expressing the unique identifier of the
+ message.
+
+
+ Example: S: * 23 FETCH (FLAGS (\Seen) RFC822.SIZE 44827)
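
FETCH data items such as BODYSTRUCTURE and ENVELOPE are nested parenthesized lists, so a client typically needs a small recursive-structure reader. The Python sketch below turns such text into nested Python lists, mapping NIL to None and numbers to int. The function name `parse_parenthesized` is hypothetical, and literals ("{n}" followed by octets) are not handled, so this is a simplified illustration rather than a complete parser.

```python
import re

# One token: "(", ")", a quoted string, or an atom (anything else non-blank).
_TOKEN = re.compile(r'\s*(?:(\()|(\))|"((?:\\.|[^"\\])*)"|([^\s()"]+))')


def parse_parenthesized(text):
    """Parse IMAP-style nested parenthesized data into Python lists."""
    text = text.strip()
    stack = [[]]
    pos = 0
    while pos < len(text):
        m = _TOKEN.match(text, pos)
        if m is None:
            raise ValueError(f"unexpected input at offset {pos}")
        pos = m.end()
        open_paren, close_paren, quoted, atom = m.groups()
        if open_paren:
            stack.append([])
        elif close_paren:
            if len(stack) == 1:
                raise ValueError("unbalanced ')'")
            finished = stack.pop()
            stack[-1].append(finished)
        elif quoted is not None:
            stack[-1].append(re.sub(r"\\(.)", r"\1", quoted))
        elif atom.upper() == "NIL":
            stack[-1].append(None)
        elif atom.isascii() and atom.isdigit():
            stack[-1].append(int(atom))
        else:
            stack[-1].append(atom)
    if len(stack) != 1:
        raise ValueError("unbalanced '('")
    return stack[0]


body = '("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 2279 48)'
print(parse_parenthesized(body))
```

With the simple text body above, the positions of the resulting list line up with the basic fields in the order given earlier: type, subtype, parameter list, id, description, encoding, size, and size in lines.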
+
+7.5. Server Responses - Command Continuation Request
+
+ The command continuation request response is indicated by a "+" token
+ instead of a tag. This form of response indicates that the server is
+ ready to accept the continuation of a command from the client. The
+ remainder of this response is a line of text.
+
+ This response is used in the AUTHENTICATE command to transmit server
+ data to the client, and request additional client data. This
+ response is also used if an argument to any command is a literal.
+
+ The client is not permitted to send the octets of the literal unless
+ the server indicates that it expects it. This permits the server to
+ process commands and reject errors on a line-by-line basis. The
+ remainder of the command, including the CRLF that terminates a
+ command, follows the octets of the literal. If there are any
+ additional command arguments the literal octets are followed by a
+ space and those arguments.
+
+ Example: C: A001 LOGIN {11}
+ S: + Ready for additional command text
+ C: FRED FOOBAR {7}
+ S: + Ready for additional command text
+ C: fat man
+ S: A001 OK LOGIN completed
+ C: A044 BLURDYBLOOP {102856}
+ S: A044 BAD No such command as "BLURDYBLOOP"
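
The LOGIN exchange in the example can be reproduced by computing the octet counts and splitting the command at each literal. The Python sketch below returns the chunks a client would send; it is an illustration only, the helper name `login_with_literals` is hypothetical, and ASCII-only arguments are assumed. A real client must wait for a "+" response after every chunk except the last before sending the next one.

```python
def login_with_literals(tag, user, password):
    """Return the chunks of a LOGIN command whose arguments are literals.

    Each chunk except the last ends with a "{n}" octet count and CRLF;
    the client must not send the next chunk until the server replies "+".
    """
    tag_b = tag.encode("ascii")
    user_b = user.encode("ascii")
    pass_b = password.encode("ascii")
    return [
        b"%s LOGIN {%d}\r\n" % (tag_b, len(user_b)),  # announces the user literal
        b"%s {%d}\r\n" % (user_b, len(pass_b)),       # user octets, then next count
        b"%s\r\n" % pass_b,                           # password octets, end of command
    ]


for chunk in login_with_literals("A001", "FRED FOOBAR", "fat man"):
    print(chunk)
```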
+
+8. Sample IMAP4rev1 connection
+
+ The following is a transcript of an IMAP4rev1 connection. A long
+ line in this sample is broken for editorial clarity.
+
+S: * OK IMAP4rev1 Service Ready
+C: a001 login mrc secret
+S: a001 OK LOGIN completed
+C: a002 select inbox
+S: * 18 EXISTS
+S: * FLAGS (\Answered \Flagged \Deleted \Seen \Draft)
+S: * 2 RECENT
+S: * OK [UNSEEN 17] Message 17 is the first unseen message
+S: * OK [UIDVALIDITY 3857529045] UIDs valid
+S: a002 OK [READ-WRITE] SELECT completed
+C: a003 fetch 12 full
+S: * 12 FETCH (FLAGS (\Seen) INTERNALDATE "17-Jul-1996 02:44:25 -0700"
+ RFC822.SIZE 4286 ENVELOPE ("Wed, 17 Jul 1996 02:23:25 -0700 (PDT)"
+ "IMAP4rev1 WG mtg summary and minutes"
+ (("Terry Gray" NIL "gray" "cac.washington.edu"))
+ (("Terry Gray" NIL "gray" "cac.washington.edu"))
+ (("Terry Gray" NIL "gray" "cac.washington.edu"))
+ ((NIL NIL "imap" "cac.washington.edu"))
+ ((NIL NIL "minutes" "CNRI.Reston.VA.US")
+ ("John Klensin" NIL "KLENSIN" "INFOODS.MIT.EDU")) NIL NIL
+ "<B27397-0100000@cac.washington.edu>")
+ BODY ("TEXT" "PLAIN" ("CHARSET" "US-ASCII") NIL NIL "7BIT" 3028 92))
+S: a003 OK FETCH completed
+C: a004 fetch 12 body[header]
+S: * 12 FETCH (BODY[HEADER] {350}
+S: Date: Wed, 17 Jul 1996 02:23:25 -0700 (PDT)
+S: From: Terry Gray <gray@cac.washington.edu>
+S: Subject: IMAP4rev1 WG mtg summary and minutes
+S: To: imap@cac.washington.edu
+S: cc: minutes@CNRI.Reston.VA.US, John Klensin <KLENSIN@INFOODS.MIT.EDU>
+S: Message-Id: <B27397-0100000@cac.washington.edu>
+S: MIME-Version: 1.0
+S: Content-Type: TEXT/PLAIN; CHARSET=US-ASCII
+S:
+S: )
+S: a004 OK FETCH completed
+C: a005 store 12 +flags \deleted
+S: * 12 FETCH (FLAGS (\Seen \Deleted))
+S: a005 OK +FLAGS completed
+C: a006 logout
+S: * BYE IMAP4rev1 server terminating connection
+S: a006 OK LOGOUT completed
+
+9. Formal Syntax
+
+ The following syntax specification uses the augmented Backus-Naur
+ Form (BNF) notation as specified in [RFC-822] with one exception: the
+ delimiter used with the "#" construct is a single space (SPACE) and
+ not one or more commas.
+
+ In the case of alternative or optional rules in which a later rule
+ overlaps an earlier rule, the rule which is listed earlier MUST take
+ priority. For example, "\Seen" when parsed as a flag is the \Seen
+ flag name and not a flag_extension, even though "\Seen" could be
+ parsed as a flag_extension. Some, but not all, instances of this
+ rule are noted below.
+
+
+
+
+Crispin Standards Track [Page 64]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+ Except as noted otherwise, all alphabetic characters are case-
+ insensitive. The use of upper or lower case characters to define
+ token strings is for editorial clarity only. Implementations MUST
+ accept these strings in a case-insensitive fashion.
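
The two rules above (earlier alternatives take priority, and token
strings are matched case-insensitively) can be illustrated with the
"\Seen" example. This is only a sketch; classify_flag and the category
labels are illustrative names, not part of the protocol.

```python
SYSTEM_FLAGS = ("\\Answered", "\\Flagged", "\\Deleted", "\\Seen", "\\Draft")
_BY_LOWER = {name.lower(): name for name in SYSTEM_FLAGS}


def classify_flag(token: str):
    """Classify a flag token, trying the alternatives in grammar order."""
    # The named system flags are listed first in the "flag" rule, so they
    # win over flag_extension even though both start with a backslash.
    canonical = _BY_LOWER.get(token.lower())
    if canonical is not None:
        return ("system", canonical)
    if token.startswith("\\"):
        return ("extension", token)
    return ("keyword", token)
```

Under these assumptions "\SEEN" is reported as the system flag "\Seen",
while an unknown backslash-prefixed name falls through to the extension
case.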
+
+address ::= "(" addr_name SPACE addr_adl SPACE addr_mailbox
+ SPACE addr_host ")"
+
+addr_adl ::= nstring
+ ;; Holds route from [RFC-822] route-addr if
+ ;; non-NIL
+
+addr_host ::= nstring
+ ;; NIL indicates [RFC-822] group syntax.
+ ;; Otherwise, holds [RFC-822] domain name
+
+addr_mailbox ::= nstring
+ ;; NIL indicates end of [RFC-822] group; if
+ ;; non-NIL and addr_host is NIL, holds
+ ;; [RFC-822] group name.
+ ;; Otherwise, holds [RFC-822] local-part
+
+addr_name ::= nstring
+ ;; Holds phrase from [RFC-822] mailbox if
+ ;; non-NIL
+
+alpha ::= "A" / "B" / "C" / "D" / "E" / "F" / "G" / "H" /
+ "I" / "J" / "K" / "L" / "M" / "N" / "O" / "P" /
+ "Q" / "R" / "S" / "T" / "U" / "V" / "W" / "X" /
+ "Y" / "Z" /
+ "a" / "b" / "c" / "d" / "e" / "f" / "g" / "h" /
+ "i" / "j" / "k" / "l" / "m" / "n" / "o" / "p" /
+ "q" / "r" / "s" / "t" / "u" / "v" / "w" / "x" /
+ "y" / "z"
+ ;; Case-sensitive
+
+append ::= "APPEND" SPACE mailbox [SPACE flag_list]
+ [SPACE date_time] SPACE literal
+
+astring ::= atom / string
+
+atom ::= 1*ATOM_CHAR
+
+ATOM_CHAR ::= <any CHAR except atom_specials>
+
+atom_specials ::= "(" / ")" / "{" / SPACE / CTL / list_wildcards /
+ quoted_specials
+
+
+
+
+Crispin Standards Track [Page 65]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+authenticate ::= "AUTHENTICATE" SPACE auth_type *(CRLF base64)
+
+auth_type ::= atom
+ ;; Defined by [IMAP-AUTH]
+
+base64 ::= *(4base64_char) [base64_terminal]
+
+base64_char ::= alpha / digit / "+" / "/"
+
+base64_terminal ::= (2base64_char "==") / (3base64_char "=")
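
The base64, base64_char, and base64_terminal rules translate directly
into a regular expression. The sketch below checks syntax only and does
not decode; the name is_base64_line is hypothetical.

```python
import re

# *(4base64_char) [base64_terminal]
_BASE64 = re.compile(
    r"(?:[A-Za-z0-9+/]{4})*"
    r"(?:[A-Za-z0-9+/]{2}==|[A-Za-z0-9+/]{3}=)?"
)


def is_base64_line(text: str) -> bool:
    """Return True if `text` matches the base64 production."""
    return _BASE64.fullmatch(text) is not None
```

Note that the empty string matches, because both parts of the
production are optional.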
+
+body ::= "(" (body_type_1part / body_type_mpart) ")"
+
+body_extension ::= nstring / number / "(" 1#body_extension ")"
+ ;; Future expansion. Client implementations
+ ;; MUST accept body_extension fields. Server
+ ;; implementations MUST NOT generate
+ ;; body_extension fields except as defined by
+ ;; future standard or standards-track
+ ;; revisions of this specification.
+
+body_ext_1part ::= body_fld_md5 [SPACE body_fld_dsp
+ [SPACE body_fld_lang
+ [SPACE 1#body_extension]]]
+ ;; MUST NOT be returned on non-extensible
+ ;; "BODY" fetch
+
+body_ext_mpart ::= body_fld_param
+ [SPACE body_fld_dsp SPACE body_fld_lang
+ [SPACE 1#body_extension]]
+ ;; MUST NOT be returned on non-extensible
+ ;; "BODY" fetch
+
+body_fields ::= body_fld_param SPACE body_fld_id SPACE
+ body_fld_desc SPACE body_fld_enc SPACE
+ body_fld_octets
+
+body_fld_desc ::= nstring
+
+body_fld_dsp ::= "(" string SPACE body_fld_param ")" / nil
+
+body_fld_enc ::= (<"> ("7BIT" / "8BIT" / "BINARY" / "BASE64" /
+ "QUOTED-PRINTABLE") <">) / string
+
+body_fld_id ::= nstring
+
+body_fld_lang ::= nstring / "(" 1#string ")"
+
+
+
+
+Crispin Standards Track [Page 66]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+body_fld_lines ::= number
+
+body_fld_md5 ::= nstring
+
+body_fld_octets ::= number
+
+body_fld_param ::= "(" 1#(string SPACE string) ")" / nil
+
+body_type_1part ::= (body_type_basic / body_type_msg / body_type_text)
+ [SPACE body_ext_1part]
+
+body_type_basic ::= media_basic SPACE body_fields
+ ;; MESSAGE subtype MUST NOT be "RFC822"
+
+body_type_mpart ::= 1*body SPACE media_subtype
+ [SPACE body_ext_mpart]
+
+body_type_msg ::= media_message SPACE body_fields SPACE envelope
+ SPACE body SPACE body_fld_lines
+
+body_type_text ::= media_text SPACE body_fields SPACE body_fld_lines
+
+capability ::= "AUTH=" auth_type / atom
+ ;; New capabilities MUST begin with "X" or be
+ ;; registered with IANA as standard or
+ ;; standards-track
+
+capability_data ::= "CAPABILITY" SPACE [1#capability SPACE] "IMAP4rev1"
+ [SPACE 1#capability]
+ ;; IMAP4rev1 servers which offer RFC 1730
+ ;; compatibility MUST list "IMAP4" as the first
+ ;; capability.
+
+CHAR ::= <any 7-bit US-ASCII character except NUL,
+ 0x01 - 0x7f>
+
+CHAR8 ::= <any 8-bit octet except NUL, 0x01 - 0xff>
+
+command ::= tag SPACE (command_any / command_auth /
+ command_nonauth / command_select) CRLF
+ ;; Modal based on state
+
+command_any ::= "CAPABILITY" / "LOGOUT" / "NOOP" / x_command
+ ;; Valid in all states
+
+command_auth ::= append / create / delete / examine / list / lsub /
+ rename / select / status / subscribe / unsubscribe
+ ;; Valid only in Authenticated or Selected state
+
+
+
+Crispin Standards Track [Page 67]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+command_nonauth ::= login / authenticate
+ ;; Valid only when in Non-Authenticated state
+
+command_select ::= "CHECK" / "CLOSE" / "EXPUNGE" /
+ copy / fetch / store / uid / search
+ ;; Valid only when in Selected state
+
+continue_req ::= "+" SPACE (resp_text / base64)
+
+copy ::= "COPY" SPACE set SPACE mailbox
+
+CR ::= <ASCII CR, carriage return, 0x0D>
+
+create ::= "CREATE" SPACE mailbox
+ ;; Use of INBOX gives a NO error
+
+CRLF ::= CR LF
+
+CTL ::= <any ASCII control character and DEL,
+ 0x00 - 0x1f, 0x7f>
+
+date ::= date_text / <"> date_text <">
+
+date_day ::= 1*2digit
+ ;; Day of month
+
+date_day_fixed ::= (SPACE digit) / 2digit
+ ;; Fixed-format version of date_day
+
+date_month ::= "Jan" / "Feb" / "Mar" / "Apr" / "May" / "Jun" /
+ "Jul" / "Aug" / "Sep" / "Oct" / "Nov" / "Dec"
+
+date_text ::= date_day "-" date_month "-" date_year
+
+date_year ::= 4digit
+
+date_time ::= <"> date_day_fixed "-" date_month "-" date_year
+ SPACE time SPACE zone <">
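
As an illustration of the date_time production (together with the time
and zone rules defined later in this section), the following Python
sketch converts a quoted date_time into an aware datetime. It assumes
the zone is applied so that subtracting it from the given time yields
Universal Time, which matches "-0700" for Pacific Daylight Time in the
sample session. parse_date_time is a hypothetical name.

```python
import re
from datetime import datetime, timedelta, timezone

_MONTHS = {
    name: index
    for index, name in enumerate(
        ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
         "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"],
        start=1,
    )
}

# <"> date_day_fixed "-" date_month "-" date_year SPACE time SPACE zone <">
_DATE_TIME = re.compile(
    r'"( \d|\d\d)-([A-Za-z]{3})-(\d{4}) '
    r'(\d\d):(\d\d):(\d\d) ([+-])(\d\d)(\d\d)"',
    re.ASCII,
)


def parse_date_time(text: str) -> datetime:
    """Parse an IMAP date_time token; raise ValueError if malformed."""
    match = _DATE_TIME.fullmatch(text)
    if match is None:
        raise ValueError("not a date_time: %r" % text)
    day, mon, year, hh, mm, ss, sign, zh, zm = match.groups()
    month = _MONTHS.get(mon.capitalize())  # month names are case-insensitive
    if month is None:
        raise ValueError("unknown month: %r" % mon)
    offset = timedelta(hours=int(zh), minutes=int(zm))
    if sign == "-":
        offset = -offset
    return datetime(int(year), month, int(day), int(hh), int(mm), int(ss),
                    tzinfo=timezone(offset))
```

The fixed-width day (a leading space for days 1 to 9) is accepted
because int() ignores surrounding whitespace.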
+
+delete ::= "DELETE" SPACE mailbox
+ ;; Use of INBOX gives a NO error
+
+digit ::= "0" / digit_nz
+
+digit_nz ::= "1" / "2" / "3" / "4" / "5" / "6" / "7" / "8" /
+ "9"
+
+
+
+
+
+Crispin Standards Track [Page 68]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+envelope ::= "(" env_date SPACE env_subject SPACE env_from
+ SPACE env_sender SPACE env_reply_to SPACE env_to
+ SPACE env_cc SPACE env_bcc SPACE env_in_reply_to
+ SPACE env_message_id ")"
+
+env_bcc ::= "(" 1*address ")" / nil
+
+env_cc ::= "(" 1*address ")" / nil
+
+env_date ::= nstring
+
+env_from ::= "(" 1*address ")" / nil
+
+env_in_reply_to ::= nstring
+
+env_message_id ::= nstring
+
+env_reply_to ::= "(" 1*address ")" / nil
+
+env_sender ::= "(" 1*address ")" / nil
+
+env_subject ::= nstring
+
+env_to ::= "(" 1*address ")" / nil
+
+examine ::= "EXAMINE" SPACE mailbox
+
+fetch ::= "FETCH" SPACE set SPACE ("ALL" / "FULL" /
+ "FAST" / fetch_att / "(" 1#fetch_att ")")
+
+fetch_att ::= "ENVELOPE" / "FLAGS" / "INTERNALDATE" /
+ "RFC822" [".HEADER" / ".SIZE" / ".TEXT"] /
+ "BODY" ["STRUCTURE"] / "UID" /
+ "BODY" [".PEEK"] section
+ ["<" number "." nz_number ">"]
+
+flag ::= "\Answered" / "\Flagged" / "\Deleted" /
+ "\Seen" / "\Draft" / flag_keyword / flag_extension
+
+flag_extension ::= "\" atom
+ ;; Future expansion. Client implementations
+ ;; MUST accept flag_extension flags. Server
+ ;; implementations MUST NOT generate
+ ;; flag_extension flags except as defined by
+ ;; future standard or standards-track
+ ;; revisions of this specification.
+
+flag_keyword ::= atom
+
+
+
+Crispin Standards Track [Page 69]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+flag_list ::= "(" #flag ")"
+
+greeting ::= "*" SPACE (resp_cond_auth / resp_cond_bye) CRLF
+
+header_fld_name ::= astring
+
+header_list ::= "(" 1#header_fld_name ")"
+
+LF ::= <ASCII LF, line feed, 0x0A>
+
+list ::= "LIST" SPACE mailbox SPACE list_mailbox
+
+list_mailbox ::= 1*(ATOM_CHAR / list_wildcards) / string
+
+list_wildcards ::= "%" / "*"
+
+literal ::= "{" number "}" CRLF *CHAR8
+ ;; Number represents the number of CHAR8 octets
+
+login ::= "LOGIN" SPACE userid SPACE password
+
+lsub ::= "LSUB" SPACE mailbox SPACE list_mailbox
+
+mailbox ::= "INBOX" / astring
+ ;; INBOX is case-insensitive. All case variants of
+ ;; INBOX (e.g. "iNbOx") MUST be interpreted as INBOX
+ ;; not as an astring. Refer to section 5.1 for
+ ;; further semantic details of mailbox names.
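
The INBOX rule above is easy to get wrong when mailbox names are
compared as plain strings. A small Python sketch of the normalization,
with the hypothetical helper name canonical_mailbox:

```python
def canonical_mailbox(name: str) -> str:
    """Map every case variant of INBOX to "INBOX"; leave other names alone."""
    # Restrict the comparison to ASCII so that non-ASCII characters whose
    # uppercase form is an ASCII letter are not mistaken for INBOX.
    if name.isascii() and name.upper() == "INBOX":
        return "INBOX"
    return name
```

Only the exact name is special here; a name such as "Inbox/Sub" is
returned unchanged.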
+
+mailbox_data ::= "FLAGS" SPACE flag_list /
+ "LIST" SPACE mailbox_list /
+ "LSUB" SPACE mailbox_list /
+ "MAILBOX" SPACE text /
+ "SEARCH" [SPACE 1#nz_number] /
+ "STATUS" SPACE mailbox SPACE
+ "(" #(status_att SPACE number) ")" /
+ number SPACE "EXISTS" / number SPACE "RECENT"
+
+mailbox_list ::= "(" #("\Marked" / "\Noinferiors" /
+ "\Noselect" / "\Unmarked" / flag_extension) ")"
+ SPACE (<"> QUOTED_CHAR <"> / nil) SPACE mailbox
+
+media_basic ::= ((<"> ("APPLICATION" / "AUDIO" / "IMAGE" /
+ "MESSAGE" / "VIDEO") <">) / string)
+ SPACE media_subtype
+ ;; Defined in [MIME-IMT]
+
+media_message ::= <"> "MESSAGE" <"> SPACE <"> "RFC822" <">
+ ;; Defined in [MIME-IMT]
+
+
+
+Crispin Standards Track [Page 70]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+media_subtype ::= string
+ ;; Defined in [MIME-IMT]
+
+media_text ::= <"> "TEXT" <"> SPACE media_subtype
+ ;; Defined in [MIME-IMT]
+
+message_data ::= nz_number SPACE ("EXPUNGE" /
+ ("FETCH" SPACE msg_att))
+
+msg_att ::= "(" 1#("ENVELOPE" SPACE envelope /
+ "FLAGS" SPACE "(" #(flag / "\Recent") ")" /
+ "INTERNALDATE" SPACE date_time /
+ "RFC822" [".HEADER" / ".TEXT"] SPACE nstring /
+ "RFC822.SIZE" SPACE number /
+ "BODY" ["STRUCTURE"] SPACE body /
+ "BODY" section ["<" number ">"] SPACE nstring /
+ "UID" SPACE uniqueid) ")"
+
+nil ::= "NIL"
+
+nstring ::= string / nil
+
+number ::= 1*digit
+ ;; Unsigned 32-bit integer
+ ;; (0 <= n < 4,294,967,296)
+
+nz_number ::= digit_nz *digit
+ ;; Non-zero unsigned 32-bit integer
+ ;; (0 < n < 4,294,967,296)
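
The number and nz_number rules carry a 32-bit range limit in their
comments that the BNF alone does not enforce. A Python sketch of a
parser that applies both the syntax and the range, using the
hypothetical name parse_number:

```python
def parse_number(token: str, nonzero: bool = False) -> int:
    """Parse `number`, or `nz_number` when nonzero=True.

    Raises ValueError on bad syntax or a value outside 32 bits.
    """
    if not token or any(ch not in "0123456789" for ch in token):
        raise ValueError("not a number: %r" % token)
    # nz_number ::= digit_nz *digit, so it cannot start with "0".
    if nonzero and token[0] == "0":
        raise ValueError("not an nz_number: %r" % token)
    value = int(token)
    if value >= 2 ** 32:
        raise ValueError("number out of range: %r" % token)
    return value
```

A consequence of the grammar is that "012" is a valid number but not a
valid nz_number.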
+
+password ::= astring
+
+quoted ::= <"> *QUOTED_CHAR <">
+
+QUOTED_CHAR ::= <any TEXT_CHAR except quoted_specials> /
+ "\" quoted_specials
+
+quoted_specials ::= <"> / "\"
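
The quoted, QUOTED_CHAR, and quoted_specials rules describe a simple
backslash escape for the double quote and the backslash itself. The
following Python sketch encodes and decodes such strings; quote and
unquote are hypothetical helper names, and text that cannot be
represented as a quoted string (CR, LF, NUL, or 8-bit characters) is
rejected so that a caller could fall back to a literal.

```python
def quote(text: str) -> str:
    """Encode `text` as an IMAP quoted string."""
    for ch in text:
        # TEXT_CHAR is any 7-bit character except NUL, CR, and LF.
        if ch in "\r\n" or not 0 < ord(ch) < 0x80:
            raise ValueError("character not allowed in quoted: %r" % ch)
    return '"' + text.replace("\\", "\\\\").replace('"', '\\"') + '"'


def unquote(token: str) -> str:
    """Decode an IMAP quoted string, validating the escapes."""
    if len(token) < 2 or token[0] != '"' or token[-1] != '"':
        raise ValueError("not a quoted string: %r" % token)
    out = []
    i, end = 1, len(token) - 1
    while i < end:
        ch = token[i]
        if ch == "\\":
            i += 1
            # Only quoted_specials may follow a backslash.
            if i >= end or token[i] not in '"\\':
                raise ValueError("bad escape in %r" % token)
            ch = token[i]
        elif ch == '"':
            raise ValueError("unescaped quote in %r" % token)
        out.append(ch)
        i += 1
    return "".join(out)
```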
+
+rename ::= "RENAME" SPACE mailbox SPACE mailbox
+ ;; Use of INBOX as a destination gives a NO error
+
+response ::= *(continue_req / response_data) response_done
+
+response_data ::= "*" SPACE (resp_cond_state / resp_cond_bye /
+ mailbox_data / message_data / capability_data)
+ CRLF
+
+
+
+Crispin Standards Track [Page 71]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+response_done ::= response_tagged / response_fatal
+
+response_fatal ::= "*" SPACE resp_cond_bye CRLF
+ ;; Server closes connection immediately
+
+response_tagged ::= tag SPACE resp_cond_state CRLF
+
+resp_cond_auth ::= ("OK" / "PREAUTH") SPACE resp_text
+ ;; Authentication condition
+
+resp_cond_bye ::= "BYE" SPACE resp_text
+
+resp_cond_state ::= ("OK" / "NO" / "BAD") SPACE resp_text
+ ;; Status condition
+
+resp_text ::= ["[" resp_text_code "]" SPACE] (text_mime2 / text)
+ ;; text SHOULD NOT begin with "[" or "="
+
+resp_text_code ::= "ALERT" / "PARSE" /
+ "PERMANENTFLAGS" SPACE "(" #(flag / "\*") ")" /
+ "READ-ONLY" / "READ-WRITE" / "TRYCREATE" /
+ "UIDVALIDITY" SPACE nz_number /
+ "UNSEEN" SPACE nz_number /
+ atom [SPACE 1*<any TEXT_CHAR except "]">]
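
A client usually needs to separate the optional bracketed response code
from the human-readable text described by resp_text. The Python sketch
below does only that split and does not validate individual codes;
split_resp_text is a hypothetical name.

```python
import re

# ["[" resp_text_code "]" SPACE] text
_RESP_TEXT = re.compile(r"(?:\[([^\]\s]+)(?: ([^\]]*))?\] )?(.*)")


def split_resp_text(text: str):
    """Return (code, code_arguments, human_text) for a resp_text string.

    `code` and `code_arguments` are None when absent.
    """
    match = _RESP_TEXT.fullmatch(text)
    if match is None:  # only possible if `text` contains a line break
        raise ValueError("not a resp_text: %r" % text)
    return match.group(1), match.group(2), match.group(3)
```

For the sample session, "[UNSEEN 17] Message 17 is the first unseen
message" splits into the code "UNSEEN", the argument "17", and the
remaining text.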
+
+search ::= "SEARCH" SPACE ["CHARSET" SPACE astring SPACE]
+ 1#search_key
+ ;; [CHARSET] MUST be registered with IANA
+
+search_key ::= "ALL" / "ANSWERED" / "BCC" SPACE astring /
+ "BEFORE" SPACE date / "BODY" SPACE astring /
+ "CC" SPACE astring / "DELETED" / "FLAGGED" /
+ "FROM" SPACE astring /
+ "KEYWORD" SPACE flag_keyword / "NEW" / "OLD" /
+ "ON" SPACE date / "RECENT" / "SEEN" /
+ "SINCE" SPACE date / "SUBJECT" SPACE astring /
+ "TEXT" SPACE astring / "TO" SPACE astring /
+ "UNANSWERED" / "UNDELETED" / "UNFLAGGED" /
+ "UNKEYWORD" SPACE flag_keyword / "UNSEEN" /
+ ;; Above this line were in [IMAP2]
+ "DRAFT" /
+ "HEADER" SPACE header_fld_name SPACE astring /
+ "LARGER" SPACE number / "NOT" SPACE search_key /
+ "OR" SPACE search_key SPACE search_key /
+ "SENTBEFORE" SPACE date / "SENTON" SPACE date /
+ "SENTSINCE" SPACE date / "SMALLER" SPACE number /
+
+
+
+Crispin Standards Track [Page 72]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+ "UID" SPACE set / "UNDRAFT" / set /
+ "(" 1#search_key ")"
+
+section ::= "[" [section_text / (nz_number *["." nz_number]
+ ["." (section_text / "MIME")])] "]"
+
+section_text ::= "HEADER" / "HEADER.FIELDS" [".NOT"]
+ SPACE header_list / "TEXT"
+
+select ::= "SELECT" SPACE mailbox
+
+sequence_num ::= nz_number / "*"
+ ;; * is the largest number in use. For message
+ ;; sequence numbers, it is the number of messages
+ ;; in the mailbox. For unique identifiers, it is
+ ;; the unique identifier of the last message in
+ ;; the mailbox.
+
+set ::= sequence_num / (sequence_num ":" sequence_num) /
+ (set "," set)
+ ;; Identifies a set of messages. For message
+ ;; sequence numbers, these are consecutive
+ ;; numbers from 1 to the number of messages in
+ ;; the mailbox
+ ;; Comma delimits individual numbers, colon
+ ;; delimits between two numbers inclusive.
+ ;; Example: 2,4:7,9,12:* is 2,4,5,6,7,9,12,13,
+ ;; 14,15 for a mailbox with 15 messages.
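
The set example in the comment above can be reproduced with a short
Python sketch. expand_set is a hypothetical name. Two points are
assumptions of this sketch rather than statements of the grammar: a
range written with its larger number first is treated the same as the
ascending form, and the result is returned sorted with duplicates
removed.

```python
from typing import List


def expand_set(spec: str, largest: int) -> List[int]:
    """Expand a message set such as "2,4:7,9,12:*".

    `largest` is the value that "*" stands for.
    """
    def value(token: str) -> int:
        if token == "*":
            return largest
        # sequence_num is an nz_number: ASCII digits, no leading zero.
        if not (token.isascii() and token.isdigit()) or token[0] == "0":
            raise ValueError("bad sequence number: %r" % token)
        return int(token)

    numbers = set()
    for part in spec.split(","):
        if ":" in part:
            first, last = part.split(":")  # ValueError on a second colon
            low, high = sorted((value(first), value(last)))
            numbers.update(range(low, high + 1))
        else:
            numbers.add(value(part))
    return sorted(numbers)
```

For a mailbox with 15 messages, expand_set("2,4:7,9,12:*", 15) gives
the list from the comment: 2, 4, 5, 6, 7, 9, 12, 13, 14, 15.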
+
+SPACE ::= <ASCII SP, space, 0x20>
+
+status ::= "STATUS" SPACE mailbox SPACE "(" 1#status_att ")"
+
+status_att ::= "MESSAGES" / "RECENT" / "UIDNEXT" / "UIDVALIDITY" /
+ "UNSEEN"
+
+store ::= "STORE" SPACE set SPACE store_att_flags
+
+store_att_flags ::= (["+" / "-"] "FLAGS" [".SILENT"]) SPACE
+ (flag_list / #flag)
+
+string ::= quoted / literal
+
+subscribe ::= "SUBSCRIBE" SPACE mailbox
+
+tag ::= 1*<any ATOM_CHAR except "+">
+
+text ::= 1*TEXT_CHAR
+
+
+
+Crispin Standards Track [Page 73]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+text_mime2 ::= "=?" <charset> "?" <encoding> "?"
+ <encoded-text> "?="
+ ;; Syntax defined in [MIME-HDRS]
+
+TEXT_CHAR ::= <any CHAR except CR and LF>
+
+time ::= 2digit ":" 2digit ":" 2digit
+ ;; Hours minutes seconds
+
+uid ::= "UID" SPACE (copy / fetch / search / store)
+ ;; Unique identifiers used instead of message
+ ;; sequence numbers
+
+uniqueid ::= nz_number
+ ;; Strictly ascending
+
+unsubscribe ::= "UNSUBSCRIBE" SPACE mailbox
+
+userid ::= astring
+
+x_command ::= "X" atom <experimental command arguments>
+
+zone ::= ("+" / "-") 4digit
+ ;; Signed four-digit value of hhmm representing
+ ;; hours and minutes east of Greenwich (that is,
+ ;; the amount that the given time differs from
+ ;; Universal Time). Subtracting the timezone
+ ;; from the given time will give the UT form.
+ ;; The Universal Time zone is "+0000".
+
+10. Author's Note
+
+ This document is a revision or rewrite of earlier documents, and
+ supersedes the protocol specification in those documents: RFC 1730,
+ the unpublished IMAP2bis.TXT document, RFC 1176, and RFC 1064.
+
+11. Security Considerations
+
+ IMAP4rev1 protocol transactions, including electronic mail data, are
+ sent in the clear over the network unless privacy protection is
+ negotiated in the AUTHENTICATE command.
+
+ A server error message for an AUTHENTICATE command which fails due to
+ invalid credentials SHOULD NOT detail why the credentials are
+ invalid.
+
+ Use of the LOGIN command sends passwords in the clear. This can be
+ avoided by using the AUTHENTICATE command instead.
+
+
+
+Crispin Standards Track [Page 74]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+ A server error message for a failing LOGIN command SHOULD NOT specify
+ that the user name, as opposed to the password, is invalid.
+
+ Additional security considerations are discussed in the sections
+ describing the AUTHENTICATE and LOGIN commands.
+
+12. Author's Address
+
+ Mark R. Crispin
+ Networks and Distributed Computing
+ University of Washington
+ 4545 15th Avenue NE
+ Seattle, WA 98105-4527
+
+ Phone: (206) 543-5762
+
+ EMail: MRC@CAC.Washington.EDU
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+
+Crispin Standards Track [Page 75]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+Appendices
+
+A. References
+
+[ACAP] Myers, J., "ACAP -- Application Configuration Access Protocol",
+Work in Progress.
+
+[CHARSET] Reynolds, J., and J. Postel, "Assigned Numbers", STD 2,
+RFC 1700, USC/Information Sciences Institute, October 1994.
+
+[DISPOSITION] Troost, R., and S. Dorner, "Communicating Presentation
+Information in Internet Messages: The Content-Disposition Header",
+RFC 1806, June 1995.
+
+[IMAP-AUTH] Myers, J., "IMAP4 Authentication Mechanisms", RFC 1731,
+Carnegie-Mellon University, December 1994.
+
+[IMAP-COMPAT] Crispin, M., "IMAP4 Compatibility with IMAP2bis", RFC
+2061, University of Washington, November 1996.
+
+[IMAP-DISC] Austein, R., "Synchronization Operations for Disconnected
+IMAP4 Clients", Work in Progress.
+
+[IMAP-HISTORICAL] Crispin, M., "IMAP4 Compatibility with IMAP2 and
+IMAP2bis", RFC 1732, University of Washington, December 1994.
+
+[IMAP-MODEL] Crispin, M., "Distributed Electronic Mail Models in
+IMAP4", RFC 1733, University of Washington, December 1994.
+
+[IMAP-OBSOLETE] Crispin, M., "Internet Message Access Protocol -
+Obsolete Syntax", RFC 2062, University of Washington, November 1996.
+
+[IMAP2] Crispin, M., "Interactive Mail Access Protocol - Version 2",
+RFC 1176, University of Washington, August 1990.
+
+[LANGUAGE-TAGS] Alvestrand, H., "Tags for the Identification of
+Languages", RFC 1766, March 1995.
+
+[MD5] Myers, J., and M. Rose, "The Content-MD5 Header Field", RFC
+1864, October 1995.
+
+[MIME-IMB] Freed, N., and N. Borenstein, "MIME (Multipurpose Internet
+Mail Extensions) Part One: Format of Internet Message Bodies", RFC
+2045, November 1996.
+
+[MIME-IMT] Freed, N., and N. Borenstein, "MIME (Multipurpose
+Internet Mail Extensions) Part Two: Media Types", RFC 2046,
+November 1996.
+
+
+
+Crispin Standards Track [Page 76]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+[MIME-HDRS] Moore, K., "MIME (Multipurpose Internet Mail Extensions)
+Part Three: Message Header Extensions for Non-ASCII Text", RFC
+2047, November 1996.
+
+[RFC-822] Crocker, D., "Standard for the Format of ARPA Internet Text
+Messages", STD 11, RFC 822, University of Delaware, August 1982.
+
+[SMTP] Postel, J., "Simple Mail Transfer Protocol", STD 10,
+RFC 821, USC/Information Sciences Institute, August 1982.
+
+[UTF-7] Goldsmith, D., and M. Davis, "UTF-7: A Mail-Safe
+Transformation Format of Unicode", RFC 1642, July 1994.
+
+B. Changes from RFC 1730
+
+1) The STATUS command has been added.
+
+2) Clarify in the formal syntax that the "#" construct can never
+refer to multiple spaces.
+
+3) Obsolete syntax has been moved to a separate document.
+
+4) The PARTIAL command has been obsoleted.
+
+5) The RFC822.HEADER.LINES, RFC822.HEADER.LINES.NOT, RFC822.PEEK, and
+RFC822.TEXT.PEEK fetch attributes have been obsoleted.
+
+6) The "<" origin "." size ">" suffix for BODY text attributes has
+been added.
+
+7) The HEADER, HEADER.FIELDS, HEADER.FIELDS.NOT, MIME, and TEXT part
+specifiers have been added.
+
+8) Support for Content-Disposition and Content-Language has been
+added.
+
+9) The restriction on fetching nested MULTIPART parts has been
+removed.
+
+10) Body part number 0 has been obsoleted.
+
+11) Server-supported authenticators are now identified by
+capabilities.
+
+
+
+
+
+
+
+
+Crispin Standards Track [Page 77]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+12) The capability that identifies this protocol is now called
+"IMAP4rev1". A server that provides backwards support for RFC 1730
+SHOULD emit the "IMAP4" capability in addition to "IMAP4rev1" in its
+CAPABILITY response. Because RFC-1730 required "IMAP4" to appear as
+the first capability, it MUST be listed first in the response.
+
+13) A description of the mailbox name namespace convention has been
+added.
+
+14) A description of the international mailbox name convention has
+been added.
+
+15) The UID-NEXT and UID-VALIDITY status items are now called UIDNEXT
+and UIDVALIDITY. This is a change from the IMAP STATUS
+Work in Progress and not from RFC-1730.
+
+16) Add a clarification that a null mailbox name argument to the LIST
+command returns an untagged LIST response with the hierarchy
+delimiter and root of the reference argument.
+
+17) Define terms such as "MUST", "SHOULD", and "MUST NOT".
+
+18) Add a section which defines message attributes and more
+thoroughly details the semantics of message sequence numbers, UIDs,
+and flags.
+
+19) Add a clarification detailing the circumstances when a client may
+send multiple commands without waiting for a response, and the
+circumstances in which ambiguities may result.
+
+20) Add a recommendation on server behavior for DELETE and RENAME
+when inferior hierarchical names of the given name exist.
+
+21) Add a clarification that a mailbox name may not be unilaterally
+unsubscribed by the server, even if that mailbox name no longer
+exists.
+
+22) Add a clarification that LIST should return its results quickly,
+without undue delay.
+
+23) Add a clarification that the date_time argument to APPEND sets
+the internal date of the message.
+
+24) Add a clarification on APPEND behavior when the target mailbox is
+the currently selected mailbox.
+
+
+
+
+
+
+Crispin Standards Track [Page 78]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+25) Add a clarification that external changes to flags should
+always be announced via an untagged FETCH even if the current command is
+a STORE with the ".SILENT" suffix.
+
+26) Add a clarification that COPY appends to the target mailbox.
+
+27) Add the NEWNAME response code.
+
+28) Rewrite the description of the untagged BYE response to clarify
+its semantics.
+
+29) Change the reference for the body MD5 to refer to the proper RFC.
+
+30) Clarify that the formal syntax contains rules which may overlap,
+and that in the event of such an overlap the rule which occurs first
+takes precedence.
+
+31) Correct the definition of body_fld_param.
+
+32) More formal syntax for capability_data.
+
+33) Clarify that any case variant of "INBOX" must be interpreted as
+INBOX.
+
+34) Clarify that the human-readable text in resp_text should not
+begin with "[" or "=".
+
+35) Change MIME references to Draft Standard documents.
+
+36) Clarify \Recent semantics.
+
+37) Additional examples.
+
+C. Key Word Index
+
+ +FLAGS <flag list> (store command data item) ............... 45
+ +FLAGS.SILENT <flag list> (store command data item) ........ 46
+ -FLAGS <flag list> (store command data item) ............... 46
+ -FLAGS.SILENT <flag list> (store command data item) ........ 46
+ ALERT (response code) ...................................... 50
+ ALL (fetch item) ........................................... 41
+ ALL (search key) ........................................... 38
+ ANSWERED (search key) ...................................... 38
+ APPEND (command) ........................................... 34
+ AUTHENTICATE (command) ..................................... 20
+ BAD (response) ............................................. 52
+ BCC <string> (search key) .................................. 38
+ BEFORE <date> (search key) ................................. 39
+
+
+
+Crispin Standards Track [Page 79]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+ BODY (fetch item) .......................................... 41
+ BODY (fetch result) ........................................ 58
+ BODY <string> (search key) ................................. 39
+ BODY.PEEK[<section>]<<partial>> (fetch item) ............... 44
+ BODYSTRUCTURE (fetch item) ................................. 44
+ BODYSTRUCTURE (fetch result) ............................... 59
+ BODY[<section>]<<origin_octet>> (fetch result) ............. 58
+ BODY[<section>]<<partial>> (fetch item) .................... 41
+ BYE (response) ............................................. 52
+ Body Structure (message attribute) ......................... 11
+ CAPABILITY (command) ....................................... 18
+ CAPABILITY (response) ...................................... 53
+ CC <string> (search key) ................................... 39
+ CHECK (command) ............................................ 36
+ CLOSE (command) ............................................ 36
+ COPY (command) ............................................. 46
+ CREATE (command) ........................................... 25
+ DELETE (command) ........................................... 26
+ DELETED (search key) ....................................... 39
+ DRAFT (search key) ......................................... 39
+ ENVELOPE (fetch item) ...................................... 44
+ ENVELOPE (fetch result) .................................... 62
+ EXAMINE (command) .......................................... 24
+ EXISTS (response) .......................................... 56
+ EXPUNGE (command) .......................................... 37
+ EXPUNGE (response) ......................................... 57
+ Envelope Structure (message attribute) ..................... 11
+ FAST (fetch item) .......................................... 44
+ FETCH (command) ............................................ 41
+ FETCH (response) ........................................... 58
+ FLAGGED (search key) ....................................... 39
+ FLAGS (fetch item) ......................................... 44
+ FLAGS (fetch result) ....................................... 62
+ FLAGS (response) ........................................... 56
+ FLAGS <flag list> (store command data item) ................ 45
+ FLAGS.SILENT <flag list> (store command data item) ......... 45
+ FROM <string> (search key) ................................. 39
+ FULL (fetch item) .......................................... 44
+ Flags (message attribute) .................................. 9
+ HEADER (part specifier) .................................... 41
+ HEADER <field-name> <string> (search key) .................. 39
+ HEADER.FIELDS <header_list> (part specifier) ............... 41
+ HEADER.FIELDS.NOT <header_list> (part specifier) ........... 41
+ INTERNALDATE (fetch item) .................................. 44
+ INTERNALDATE (fetch result) ................................ 62
+ Internal Date (message attribute) .......................... 10
+ KEYWORD <flag> (search key) ................................ 39
+ Keyword (type of flag) ..................................... 10
+
+
+
+Crispin Standards Track [Page 80]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+ LARGER <n> (search key) .................................... 39
+ LIST (command) ............................................. 30
+ LIST (response) ............................................ 54
+ LOGIN (command) ............................................ 22
+ LOGOUT (command) ........................................... 20
+ LSUB (command) ............................................. 32
+ LSUB (response) ............................................ 55
+ MAY (specification requirement term) ....................... 5
+ MESSAGES (status item) ..................................... 33
+ MIME (part specifier) ...................................... 42
+ MUST (specification requirement term) ...................... 4
+ MUST NOT (specification requirement term) .................. 4
+ Message Sequence Number (message attribute) ................ 9
+ NEW (search key) ........................................... 39
+ NEWNAME (response code) .................................... 50
+ NO (response) .............................................. 51
+ NOOP (command) ............................................. 19
+ NOT <search-key> (search key) .............................. 39
+ OK (response) .............................................. 51
+ OLD (search key) ........................................... 39
+ ON <date> (search key) ..................................... 39
+ OPTIONAL (specification requirement term) .................. 5
+ OR <search-key1> <search-key2> (search key) ................ 39
+ PARSE (response code) ...................................... 50
+ PERMANENTFLAGS (response code) ............................. 50
+ PREAUTH (response) ......................................... 52
+ Permanent Flag (class of flag) ............................. 10
+ READ-ONLY (response code) .................................. 50
+ READ-WRITE (response code) ................................. 50
+ RECENT (response) .......................................... 57
+ RECENT (search key) ........................................ 39
+ RECENT (status item) ....................................... 33
+ RENAME (command) ........................................... 27
+ REQUIRED (specification requirement term) .................. 4
+ RFC822 (fetch item) ........................................ 44
+ RFC822 (fetch result) ...................................... 63
+ RFC822.HEADER (fetch item) ................................. 44
+ RFC822.HEADER (fetch result) ............................... 62
+ RFC822.SIZE (fetch item) ................................... 44
+ RFC822.SIZE (fetch result) ................................. 62
+ RFC822.TEXT (fetch item) ................................... 44
+ RFC822.TEXT (fetch result) ................................. 62
+ SEARCH (command) ........................................... 37
+ SEARCH (response) .......................................... 55
+ SEEN (search key) .......................................... 40
+ SELECT (command) ........................................... 23
+ SENTBEFORE <date> (search key) ............................. 40
+ SENTON <date> (search key) ................................. 40
+
+
+
+Crispin Standards Track [Page 81]
+\f
+RFC 2060 IMAP4rev1 December 1996
+
+
+ SENTSINCE <date> (search key) .............................. 40
+ SHOULD (specification requirement term) .................... 5
+ SHOULD NOT (specification requirement term) ................ 5
+ SINCE <date> (search key) .................................. 40
+ SMALLER <n> (search key) ................................... 40
+ STATUS (command) ........................................... 33
+ STATUS (response) .......................................... 55
+ STORE (command) ............................................ 45
+ SUBJECT <string> (search key) .............................. 40
+ SUBSCRIBE (command) ........................................ 29
+ Session Flag (class of flag) ............................... 10
+ System Flag (type of flag) ................................. 9
+ TEXT (part specifier) ...................................... 42
+ TEXT <string> (search key) ................................. 40
+ TO <string> (search key) ................................... 40
+ TRYCREATE (response code) .................................. 51
+ UID (command) .............................................. 47
+ UID (fetch item) ........................................... 44
+ UID (fetch result) ......................................... 63
+ UID <message set> (search key) ............................. 40
+ UIDNEXT (status item) ...................................... 33
+ UIDVALIDITY (response code) ................................ 51
+ UIDVALIDITY (status item) .................................. 34
+ UNANSWERED (search key) .................................... 40
+ UNDELETED (search key) ..................................... 40
+ UNDRAFT (search key) ....................................... 40
+ UNFLAGGED (search key) ..................................... 40
+ UNKEYWORD <flag> (search key) .............................. 40
+ UNSEEN (response code) ..................................... 51
+ UNSEEN (search key) ........................................ 40
+ UNSEEN (status item) ....................................... 34
+ UNSUBSCRIBE (command) ...................................... 30
+ Unique Identifier (UID) (message attribute) ................ 7
+ X<atom> (command) .......................................... 48
+ [RFC-822] Size (message attribute) ......................... 11
+ \Answered (system flag) .................................... 9
+ \Deleted (system flag) ..................................... 9
+ \Draft (system flag) ....................................... 9
+ \Flagged (system flag) ..................................... 9
+ \Marked (mailbox name attribute) ........................... 54
+ \Noinferiors (mailbox name attribute) ...................... 54
+ \Noselect (mailbox name attribute) ......................... 54
+ \Recent (system flag) ...................................... 10
+ \Seen (system flag) ........................................ 9
+ \Unmarked (mailbox name attribute) ......................... 54
+
+
+
+
+
+
+Crispin Standards Track [Page 82]
+\f
--- /dev/null
+
+
+
+
+
+
+Network Working Group                                 J. Klensin, Editor
+Request for Comments: 2821                             AT&T Laboratories
+Obsoletes: 821, 974, 1869                                     April 2001
+Updates: 1123
+Category: Standards Track
+
+
+ Simple Mail Transfer Protocol
+
+Status of this Memo
+
+ This document specifies an Internet standards track protocol for the
+ Internet community, and requests discussion and suggestions for
+ improvements. Please refer to the current edition of the "Internet
+ Official Protocol Standards" (STD 1) for the standardization state
+ and status of this protocol. Distribution of this memo is unlimited.
+
+Copyright Notice
+
+ Copyright (C) The Internet Society (2001). All Rights Reserved.
+
+Abstract
+
+ This document is a self-contained specification of the basic protocol
+ for the Internet electronic mail transport. It consolidates, updates
+ and clarifies, but doesn't add new or change existing functionality
+ of the following:
+
+ - the original SMTP (Simple Mail Transfer Protocol) specification of
+ RFC 821 [30],
+
+ - domain name system requirements and implications for mail
+ transport from RFC 1035 [22] and RFC 974 [27],
+
+ - the clarifications and applicability statements in RFC 1123 [2],
+ and
+
+ - material drawn from the SMTP Extension mechanisms [19].
+
+ It obsoletes RFC 821, RFC 974, and updates RFC 1123 (replaces the
+ mail transport materials of RFC 1123). However, RFC 821 specifies
+ some features that were not in significant use in the Internet by the
+ mid-1990s and (in appendices) some additional transport models.
+ Those sections are omitted here in the interest of clarity and
+ brevity; readers needing them should refer to RFC 821.
+
+
+
+
+
+
+Klensin Standards Track [Page 1]
+\f
+RFC 2821 Simple Mail Transfer Protocol April 2001
+
+
+ It also includes some additional material from RFC 1123 that required
+ amplification. This material has been identified in multiple ways,
+ mostly by tracking flaming on various lists and newsgroups and
+ problems of unusual readings or interpretations that have appeared as
+ the SMTP extensions have been deployed. Where this specification
+ moves beyond consolidation and actually differs from earlier
+ documents, it supersedes them technically as well as textually.
+
+ Although SMTP was designed as a mail transport and delivery protocol,
+ this specification also contains information that is important to its
+ use as a 'mail submission' protocol, as recommended for POP [3, 26]
+ and IMAP [6]. Additional submission issues are discussed in RFC 2476
+ [15].
+
+ Section 2.3 provides definitions of terms specific to this document.
+ Except when the historical terminology is necessary for clarity, this
+ document uses the current 'client' and 'server' terminology to
+ identify the sending and receiving SMTP processes, respectively.
+
+ A companion document [32] discusses message headers, message bodies
+ and formats and structures for them, and their relationship.
+
+Table of Contents
+
+ 1. Introduction .................................................. 4
+ 2. The SMTP Model ................................................ 5
+ 2.1 Basic Structure .............................................. 5
+ 2.2 The Extension Model .......................................... 7
+ 2.2.1 Background ................................................. 7
+ 2.2.2 Definition and Registration of Extensions .................. 8
+ 2.3 Terminology .................................................. 9
+ 2.3.1 Mail Objects ............................................... 10
+ 2.3.2 Senders and Receivers ...................................... 10
+ 2.3.3 Mail Agents and Message Stores ............................. 10
+ 2.3.4 Host ....................................................... 11
+ 2.3.5 Domain ..................................................... 11
+ 2.3.6 Buffer and State Table ..................................... 11
+ 2.3.7 Lines ...................................................... 12
+ 2.3.8 Originator, Delivery, Relay, and Gateway Systems ........... 12
+ 2.3.9 Message Content and Mail Data .............................. 13
+ 2.3.10 Mailbox and Address ....................................... 13
+ 2.3.11 Reply ..................................................... 13
+ 2.4 General Syntax Principles and Transaction Model .............. 13
+ 3. The SMTP Procedures: An Overview .............................. 15
+ 3.1 Session Initiation ........................................... 15
+ 3.2 Client Initiation ............................................ 16
+ 3.3 Mail Transactions ............................................ 16
+ 3.4 Forwarding for Address Correction or Updating ................ 19
+ 3.5 Commands for Debugging Addresses ............................. 20
+ 3.5.1 Overview ................................................... 20
+ 3.5.2 VRFY Normal Response ....................................... 22
+ 3.5.3 Meaning of VRFY or EXPN Success Response ................... 22
+ 3.5.4 Semantics and Applications of EXPN ......................... 23
+ 3.6 Domains ...................................................... 23
+ 3.7 Relaying ..................................................... 24
+ 3.8 Mail Gatewaying .............................................. 25
+ 3.8.1 Header Fields in Gatewaying ................................ 26
+ 3.8.2 Received Lines in Gatewaying ............................... 26
+ 3.8.3 Addresses in Gatewaying .................................... 26
+ 3.8.4 Other Header Fields in Gatewaying .......................... 27
+ 3.8.5 Envelopes in Gatewaying .................................... 27
+ 3.9 Terminating Sessions and Connections ......................... 27
+ 3.10 Mailing Lists and Aliases ................................... 28
+ 3.10.1 Alias ..................................................... 28
+ 3.10.2 List ...................................................... 28
+ 4. The SMTP Specifications ....................................... 29
+ 4.1 SMTP Commands ................................................ 29
+ 4.1.1 Command Semantics and Syntax ............................... 29
+ 4.1.1.1 Extended HELLO (EHLO) or HELLO (HELO) ................... 29
+ 4.1.1.2 MAIL (MAIL) .............................................. 31
+ 4.1.1.3 RECIPIENT (RCPT) ......................................... 31
+ 4.1.1.4 DATA (DATA) .............................................. 33
+ 4.1.1.5 RESET (RSET) ............................................. 34
+ 4.1.1.6 VERIFY (VRFY) ............................................ 35
+ 4.1.1.7 EXPAND (EXPN) ............................................ 35
+ 4.1.1.8 HELP (HELP) .............................................. 35
+ 4.1.1.9 NOOP (NOOP) .............................................. 35
+ 4.1.1.10 QUIT (QUIT) ............................................. 36
+ 4.1.2 Command Argument Syntax .................................... 36
+ 4.1.3 Address Literals ........................................... 38
+ 4.1.4 Order of Commands .......................................... 39
+ 4.1.5 Private-use Commands ....................................... 40
+ 4.2 SMTP Replies ................................................ 40
+ 4.2.1 Reply Code Severities and Theory ........................... 42
+ 4.2.2 Reply Codes by Function Groups ............................. 44
+ 4.2.3 Reply Codes in Numeric Order .............................. 45
+ 4.2.4 Reply Code 502 ............................................. 46
+ 4.2.5 Reply Codes After DATA and the Subsequent <CRLF>.<CRLF> .... 46
+ 4.3 Sequencing of Commands and Replies ........................... 47
+ 4.3.1 Sequencing Overview ........................................ 47
+ 4.3.2 Command-Reply Sequences .................................... 48
+ 4.4 Trace Information ............................................ 49
+ 4.5 Additional Implementation Issues ............................. 53
+ 4.5.1 Minimum Implementation ..................................... 53
+ 4.5.2 Transparency ............................................... 53
+ 4.5.3 Sizes and Timeouts ......................................... 54
+ 4.5.3.1 Size limits and minimums ................................. 54
+ 4.5.3.2 Timeouts ................................................. 56
+ 4.5.4 Retry Strategies ........................................... 57
+ 4.5.4.1 Sending Strategy ......................................... 58
+ 4.5.4.2 Receiving Strategy ....................................... 59
+ 4.5.5 Messages with a null reverse-path .......................... 59
+ 5. Address Resolution and Mail Handling .......................... 60
+ 6. Problem Detection and Handling ................................ 62
+ 6.1 Reliable Delivery and Replies by Email ....................... 62
+ 6.2 Loop Detection ............................................... 63
+ 6.3 Compensating for Irregularities .............................. 63
+ 7. Security Considerations ....................................... 64
+ 7.1 Mail Security and Spoofing ................................... 64
+ 7.2 "Blind" Copies ............................................... 65
+ 7.3 VRFY, EXPN, and Security ..................................... 65
+ 7.4 Information Disclosure in Announcements ...................... 66
+ 7.5 Information Disclosure in Trace Fields ....................... 66
+ 7.6 Information Disclosure in Message Forwarding ................. 67
+ 7.7 Scope of Operation of SMTP Servers ........................... 67
+ 8. IANA Considerations ........................................... 67
+ 9. References .................................................... 68
+ 10. Editor's Address ............................................. 70
+ 11. Acknowledgments .............................................. 70
+ Appendices ....................................................... 71
+ A. TCP Transport Service ......................................... 71
+ B. Generating SMTP Commands from RFC 822 Headers ................. 71
+ C. Source Routes ................................................. 72
+ D. Scenarios ..................................................... 73
+ E. Other Gateway Issues .......................................... 76
+ F. Deprecated Features of RFC 821 ................................ 76
+ Full Copyright Statement ......................................... 79
+
+1. Introduction
+
+ The objective of the Simple Mail Transfer Protocol (SMTP) is to
+ transfer mail reliably and efficiently.
+
+ SMTP is independent of the particular transmission subsystem and
+ requires only a reliable ordered data stream channel. While this
+ document specifically discusses transport over TCP, other transports
+ are possible. Appendices to RFC 821 describe some of them.
+
+ An important feature of SMTP is its capability to transport mail
+ across networks, usually referred to as "SMTP mail relaying" (see
+ section 3.7). A network consists of the mutually-TCP-accessible
+ hosts on the public Internet, the mutually-TCP-accessible hosts on a
+ firewall-isolated TCP/IP Intranet, or hosts in some other LAN or WAN
+ environment utilizing a non-TCP transport-level protocol. Using
+ SMTP, a process can transfer mail to another process on the same
+ network or to some other network via a relay or gateway process
+ accessible to both networks.
+
+ In this way, a mail message may pass through a number of intermediate
+ relay or gateway hosts on its path from sender to ultimate recipient.
+ The Mail eXchanger mechanisms of the domain name system [22, 27] (and
+ section 5 of this document) are used to identify the appropriate
+ next-hop destination for a message being transported.
+
+2. The SMTP Model
+
+2.1 Basic Structure
+
+ The SMTP design can be pictured as:
+
+                  +----------+                +----------+
+      +------+    |          |                |          |
+      | User |<-->|          |      SMTP      |          |
+      +------+    |  Client- |Commands/Replies| Server-  |
+      +------+    |   SMTP   |<-------------->|    SMTP  |    +------+
+      | File |<-->|          |    and Mail    |          |<-->| File |
+      |System|    |          |                |          |    |System|
+      +------+    +----------+                +----------+    +------+
+                   SMTP client                SMTP server
+
+ When an SMTP client has a message to transmit, it establishes a two-
+ way transmission channel to an SMTP server. The responsibility of an
+ SMTP client is to transfer mail messages to one or more SMTP servers,
+ or report its failure to do so.
+
+ The means by which a mail message is presented to an SMTP client, and
+ how that client determines the domain name(s) to which mail messages
+ are to be transferred is a local matter, and is not addressed by this
+ document. In some cases, the domain name(s) transferred to, or
+ determined by, an SMTP client will identify the final destination(s)
+ of the mail message. In other cases, common with SMTP clients
+ associated with implementations of the POP [3, 26] or IMAP [6]
+ protocols, or when the SMTP client is inside an isolated transport
+ service environment, the domain name determined will identify an
+ intermediate destination through which all mail messages are to be
+ relayed. SMTP clients that transfer all traffic, regardless of the
+ target domain names associated with the individual messages, or that
+ do not maintain queues for retrying message transmissions that
+ initially cannot be completed, may otherwise conform to this
+ specification but are not considered fully-capable. Fully-capable
+ SMTP implementations, including the relays used by these less capable
+ ones, and their destinations, are expected to support all of the
+ queuing, retrying, and alternate address functions discussed in this
+ specification.
+
+ The means by which an SMTP client, once it has determined a target
+ domain name, determines the identity of an SMTP server to which a
+ copy of a message is to be transferred, and then performs that
+ transfer, is covered by this document. To effect a mail transfer to
+ an SMTP server, an SMTP client establishes a two-way transmission
+ channel to that SMTP server. An SMTP client determines the address
+ of an appropriate host running an SMTP server by resolving a
+ destination domain name to either an intermediate Mail eXchanger host
+ or a final target host.
+
+ An SMTP server may be either the ultimate destination or an
+ intermediate "relay" (that is, it may assume the role of an SMTP
+ client after receiving the message) or "gateway" (that is, it may
+ transport the message further using some protocol other than SMTP).
+ SMTP commands are generated by the SMTP client and sent to the SMTP
+ server. SMTP replies are sent from the SMTP server to the SMTP
+ client in response to the commands.
+
+ In other words, message transfer can occur in a single connection
+ between the original SMTP-sender and the final SMTP-recipient, or can
+ occur in a series of hops through intermediary systems. In either
+ case, a formal handoff of responsibility for the message occurs: the
+ protocol requires that a server accept responsibility for either
+ delivering a message or properly reporting the failure to do so.
+
+ Once the transmission channel is established and initial handshaking
+ completed, the SMTP client normally initiates a mail transaction.
+ Such a transaction consists of a series of commands to specify the
+ originator and destination of the mail and transmission of the
+ message content (including any headers or other structure) itself.
+ When the same message is sent to multiple recipients, this protocol
+ encourages the transmission of only one copy of the data for all
+ recipients at the same destination (or intermediate relay) host.
+
+ The server responds to each command with a reply; replies may
+ indicate that the command was accepted, that additional commands are
+ expected, or that a temporary or permanent error condition exists.
+ Commands specifying the sender or recipients may include server-
+ permitted SMTP service extension requests as discussed in section
+ 2.2. The dialog is purposely lock-step, one-at-a-time, although this
+ can be modified by mutually-agreed extension requests such as command
+ pipelining [13].
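+
+   As a non-normative illustration, the sketch below shows how a client
+   might sort replies into the broad outcomes listed above.  It assumes
+   the first-digit convention for reply codes that section 4.2.1
+   describes; the function name classify_reply and the outcome labels
+   are illustrative only and are not defined by this specification.
+

```python
def classify_reply(code: int) -> str:
    """Map an SMTP reply code to a broad outcome using its first digit."""
    if not 100 <= code <= 599:
        raise ValueError(f"not an SMTP reply code: {code}")
    outcomes = {
        1: "preliminary",          # positive preliminary reply
        2: "accepted",             # command completed successfully
        3: "more input expected",  # e.g. the reply to DATA
        4: "temporary error",      # the client may retry later
        5: "permanent error",      # the client should not retry unchanged
    }
    return outcomes[code // 100]
```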
+
+ Once a given mail message has been transmitted, the client may either
+ request that the connection be shut down or may initiate other mail
+ transactions. In addition, an SMTP client may use a connection to an
+ SMTP server for ancillary services such as verification of email
+ addresses or retrieval of mailing list subscriber addresses.
+
+ As suggested above, this protocol provides mechanisms for the
+ transmission of mail. This transmission normally occurs directly
+ from the sending user's host to the receiving user's host when the
+ two hosts are connected to the same transport service. When they are
+ not connected to the same transport service, transmission occurs via
+ one or more relay SMTP servers. An intermediate host that acts as
+ either an SMTP relay or as a gateway into some other transmission
+ environment is usually selected through the use of the domain name
+ service (DNS) Mail eXchanger mechanism.
+
+ Usually, intermediate hosts are determined via the DNS MX record, not
+ by explicit "source" routing (see section 5 and appendices C and
+ F.2).
+
+2.2 The Extension Model
+
+2.2.1 Background
+
+ In an effort that started in 1990, approximately a decade after RFC
+ 821 was completed, the protocol was modified with a "service
+ extensions" model that permits the client and server to agree to
+ utilize shared functionality beyond the original SMTP requirements.
+ The SMTP extension mechanism defines a means whereby an extended SMTP
+ client and server may recognize each other, and the server can inform
+ the client as to the service extensions that it supports.
+
+ Contemporary SMTP implementations MUST support the basic extension
+ mechanisms. For instance, servers MUST support the EHLO command even
+ if they do not implement any specific extensions and clients SHOULD
+ preferentially utilize EHLO rather than HELO. (However, for
+ compatibility with older conforming implementations, SMTP clients and
+ servers MUST support the original HELO mechanisms as a fallback.)
+ Unless the different characteristics of HELO must be identified for
+ interoperability purposes, this document discusses only EHLO.
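+
+   The preference for EHLO with HELO as a fallback might be sketched as
+   follows.  This is a minimal, non-normative example: send_command is
+   a hypothetical callable that transmits one command line and returns
+   the (code, text) of the reply, and treating only 500 and 502 as
+   "EHLO not supported" is an assumption of the sketch rather than a
+   rule taken from this section.
+

```python
from typing import Callable, Tuple

Reply = Tuple[int, str]

# Assumption: these replies mean the server does not implement EHLO.
EHLO_UNSUPPORTED = {500, 502}


def greet(send_command: Callable[[str], Reply],
          client_domain: str) -> Tuple[str, Reply]:
    """Prefer EHLO; fall back to HELO when the server does not know EHLO."""
    reply = send_command(f"EHLO {client_domain}")
    if 200 <= reply[0] < 300:
        return "EHLO", reply
    if reply[0] in EHLO_UNSUPPORTED:
        reply = send_command(f"HELO {client_domain}")
        if 200 <= reply[0] < 300:
            return "HELO", reply
    raise ConnectionError(f"greeting refused: {reply[0]} {reply[1]}")
```
+
+   Passing the transport in as a callable keeps the sketch independent
+   of any particular socket or library interface.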
+
+ SMTP is widely deployed and high-quality implementations have proven
+ to be very robust. However, the Internet community now considers
+ some services to be important that were not anticipated when the
+ protocol was first designed. If support for those services is to be
+ added, it must be done in a way that permits older implementations to
+ continue working acceptably. The extension framework consists of:
+
+ - The SMTP command EHLO, superseding the earlier HELO,
+
+ - a registry of SMTP service extensions,
+
+ - additional parameters to the SMTP MAIL and RCPT commands, and
+
+ - optional replacements for commands defined in this protocol, such
+ as for DATA in non-ASCII transmissions [33].
+
+ SMTP's strength comes primarily from its simplicity. Experience with
+ many protocols has shown that protocols with few options tend towards
+ ubiquity, whereas protocols with many options tend towards obscurity.
+
+ Each and every extension, regardless of its benefits, must be
+ carefully scrutinized with respect to its implementation, deployment,
+ and interoperability costs. In many cases, the cost of extending the
+ SMTP service will likely outweigh the benefit.
+
+2.2.2 Definition and Registration of Extensions
+
+ The IANA maintains a registry of SMTP service extensions. A
+ corresponding EHLO keyword value is associated with each extension.
+ Each service extension registered with the IANA must be defined in a
+ formal standards-track or IESG-approved experimental protocol
+ document. The definition must include:
+
+ - the textual name of the SMTP service extension;
+
+ - the EHLO keyword value associated with the extension;
+
+ - the syntax and possible values of parameters associated with the
+ EHLO keyword value;
+
+ - any additional SMTP verbs associated with the extension
+ (additional verbs will usually be, but are not required to be, the
+ same as the EHLO keyword value);
+
+ - any new parameters the extension associates with the MAIL or RCPT
+ verbs;
+
+ - a description of how support for the extension affects the
+ behavior of a server and client SMTP; and,
+
+ - the increment by which the extension is increasing the maximum
+ length of the commands MAIL and/or RCPT, over that specified in
+ this standard.
+
+ In addition, any EHLO keyword value starting with an upper or lower
+ case "X" refers to a local SMTP service extension used exclusively
+ through bilateral agreement. Keywords beginning with "X" MUST NOT be
+ used in a registered service extension. Conversely, keyword values
+ presented in the EHLO response that do not begin with "X" MUST
+ correspond to a standard, standards-track, or IESG-approved
+ experimental SMTP service extension registered with IANA. A
+ conforming server MUST NOT offer non-"X"-prefixed keyword values that
+ are not described in a registered extension.
+
+ Additional verbs and parameter names are bound by the same rules as
+ EHLO keywords; specifically, verbs beginning with "X" are local
+ extensions that may not be registered or standardized. Conversely,
+ verbs not beginning with "X" must always be registered.
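+
+   The "X" rule above lends itself to a simple check.  The following
+   non-normative sketch assumes the caller supplies the set of
+   registered keywords (the set used here is hypothetical and is not a
+   copy of the IANA registry) and that keywords are compared without
+   regard to case.
+

```python
from typing import Set


def is_local_extension(keyword: str) -> bool:
    """True if a keyword or verb starts with an upper or lower case "X"."""
    if not keyword:
        raise ValueError("empty keyword")
    return keyword[0] in ("X", "x")


def may_offer(keyword: str, registered: Set[str]) -> bool:
    """True if a server may offer this keyword in its EHLO response.

    Local ("X") keywords are allowed by bilateral agreement; any other
    keyword must appear in the caller-supplied registered set.
    """
    if is_local_extension(keyword):
        return True
    return keyword.upper() in {name.upper() for name in registered}
```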
+
+2.3 Terminology
+
+ The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
+ "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
+ document are to be interpreted as described below.
+
+ 1. MUST This word, or the terms "REQUIRED" or "SHALL", mean that
+ the definition is an absolute requirement of the specification.
+
+ 2. MUST NOT This phrase, or the phrase "SHALL NOT", mean that the
+ definition is an absolute prohibition of the specification.
+
+ 3. SHOULD This word, or the adjective "RECOMMENDED", mean that
+ there may exist valid reasons in particular circumstances to
+ ignore a particular item, but the full implications must be
+ understood and carefully weighed before choosing a different
+ course.
+
+ 4. SHOULD NOT This phrase, or the phrase "NOT RECOMMENDED", mean
+ that there may exist valid reasons in particular circumstances
+ when the particular behavior is acceptable or even useful, but the
+ full implications should be understood and the case carefully
+ weighed before implementing any behavior described with this
+ label.
+
+ 5. MAY This word, or the adjective "OPTIONAL", mean that an item is
+ truly optional. One vendor may choose to include the item because
+ a particular marketplace requires it or because the vendor feels
+ that it enhances the product while another vendor may omit the
+ same item. An implementation which does not include a particular
+ option MUST be prepared to interoperate with another
+ implementation which does include the option, though perhaps with
+ reduced functionality. In the same vein an implementation which
+ does include a particular option MUST be prepared to interoperate
+ with another implementation which does not include the option
+ (except, of course, for the feature the option provides).
+
+2.3.1 Mail Objects
+
+ SMTP transports a mail object. A mail object contains an envelope
+ and content.
+
+ The SMTP envelope is sent as a series of SMTP protocol units
+ (described in section 3). It consists of an originator address (to
+ which error reports should be directed); one or more recipient
+ addresses; and optional protocol extension material. Historically,
+ variations on the recipient address specification command (RCPT TO)
+ could be used to specify alternate delivery modes, such as immediate
+ display; those variations have now been deprecated (see appendix F,
+ section F.6).
+
+ The SMTP content is sent in the SMTP DATA protocol unit and has two
+ parts: the headers and the body. If the content conforms to other
+ contemporary standards, the headers form a collection of field/value
+ pairs structured as in the message format specification [32]; the
+ body, if structured, is defined according to MIME [12]. The content
+ is textual in nature, expressed using the US-ASCII repertoire [1].
+ Although SMTP extensions (such as "8BITMIME" [20]) may relax this
+ restriction for the content body, the content headers are always
+ encoded using the US-ASCII repertoire. A MIME extension [23] defines
+ an algorithm for representing header values outside the US-ASCII
+ repertoire, while still encoding them using the US-ASCII repertoire.
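+
+   One way to picture the mail object is as a small data structure.
+   The sketch below is non-normative; the class and field names are
+   illustrative assumptions, and the only rules it enforces are the two
+   stated above: an envelope carries one or more recipient addresses,
+   and content headers are always US-ASCII.
+

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Envelope:
    originator: str                 # where error reports are directed
    recipients: List[str]           # one or more recipient addresses
    extension_params: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        if not self.recipients:
            raise ValueError("an envelope needs one or more recipients")


@dataclass
class Content:
    headers: List[Tuple[str, str]]  # (field name, value) pairs, in order
    body: bytes                     # bytes, since extensions may allow 8-bit

    def __post_init__(self) -> None:
        for name, value in self.headers:
            try:
                f"{name}: {value}".encode("ascii")
            except UnicodeEncodeError as exc:
                raise ValueError(f"header {name!r} is not US-ASCII") from exc


@dataclass
class MailObject:
    envelope: Envelope
    content: Content
```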
+
+2.3.2 Senders and Receivers
+
+ In RFC 821, the two hosts participating in an SMTP transaction were
+ described as the "SMTP-sender" and "SMTP-receiver". This document
+ has been changed to reflect current industry terminology and hence
+ refers to them as the "SMTP client" (or sometimes just "the client")
+ and "SMTP server" (or just "the server"), respectively. Since a
+ given host may act both as server and client in a relay situation,
+ "receiver" and "sender" terminology is still used where needed for
+ clarity.
+
+2.3.3 Mail Agents and Message Stores
+
+ Additional mail system terminology became common after RFC 821 was
+ published and, where convenient, is used in this specification. In
+ particular, SMTP servers and clients provide a mail transport service
+ and therefore act as "Mail Transfer Agents" (MTAs). "Mail User
+ Agents" (MUAs or UAs) are normally thought of as the sources and
+ targets of mail. At the source, an MUA might collect mail to be
+ transmitted from a user and hand it off to an MTA; the final
+ ("delivery") MTA would be thought of as handing the mail off to an
+ MUA (or at least transferring responsibility to it, e.g., by
+ depositing the message in a "message store"). However, while these
+ terms are used with at least the appearance of great precision in
+ other environments, the implied boundaries between MUAs and MTAs
+ often do not accurately match common, and conforming, practices with
+ Internet mail. Hence, the reader should be cautious about inferring
+ the strong relationships and responsibilities that might be implied
+ if these terms were used elsewhere.
+
+2.3.4 Host
+
+ For the purposes of this specification, a host is a computer system
+ attached to the Internet (or, in some cases, to a private TCP/IP
+ network) and supporting the SMTP protocol. Hosts are known by names
+ (see "domain"); identifying them by numerical address is discouraged.
+
+2.3.5 Domain
+
+ A domain (or domain name) consists of one or more dot-separated
+ components. These components ("labels" in DNS terminology [22]) are
+ restricted for SMTP purposes to consist of a sequence of letters,
+ digits, and hyphens drawn from the ASCII character set [1]. Domain
+ names are used as names of hosts and of other entities in the domain
+ name hierarchy. For example, a domain may refer to an alias (label
+ of a CNAME RR) or the label of Mail eXchanger records to be used to
+ deliver mail instead of representing a host name. See [22] and
+ section 5 of this specification.
+
+ The domain name, as described in this document and in [22], is the
+ entire, fully-qualified name (often referred to as an "FQDN"). A
+ domain name that is not in FQDN form is no more than a local alias.
+ Local aliases MUST NOT appear in any SMTP transaction.
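+
+   The label restriction described in this section can be checked
+   mechanically.  The sketch below is non-normative and deliberately
+   narrow: it tests only that a name consists of one or more
+   dot-separated components drawn from ASCII letters, digits, and
+   hyphens.  It does not determine whether a name is fully qualified,
+   which cannot be decided from syntax alone, and the function name
+   has_valid_labels is illustrative.
+

```python
import re

# One label: a non-empty run of ASCII letters, digits, and hyphens.
_LABEL = re.compile(r"[A-Za-z0-9-]+")


def has_valid_labels(domain: str) -> bool:
    """True if every dot-separated component uses only the allowed characters."""
    return all(_LABEL.fullmatch(label) is not None
               for label in domain.split("."))
```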
+
+2.3.6 Buffer and State Table
+
+ SMTP sessions are stateful, with both parties carefully maintaining a
+ common view of the current state. In this document we model this
+ state by a virtual "buffer" and a "state table" on the server which
+ may be used by the client to, for example, "clear the buffer" or
+ "reset the state table," causing the information in the buffer to be
+ discarded and the state to be returned to some previous state.
+
+2.3.7 Lines
+
+ SMTP commands and, unless altered by a service extension, message
+ data, are transmitted in "lines". Lines consist of zero or more data
+ characters terminated by the sequence ASCII character "CR" (hex value
+ 0D) followed immediately by ASCII character "LF" (hex value 0A).
+ This termination sequence is denoted as <CRLF> in this document.
+ Conforming implementations MUST NOT recognize or generate any other
+ character or character sequence as a line terminator. Limits MAY be
+ imposed on line lengths by servers (see section 4.5.3).
+
+ In addition, the appearance of "bare" "CR" or "LF" characters in text
+ (i.e., either without the other) has a long history of causing
+ problems in mail implementations and applications that use the mail
+ system as a tool. SMTP client implementations MUST NOT transmit
+ these characters except when they are intended as line terminators
+ and then MUST, as indicated above, transmit them only as a <CRLF>
+ sequence.
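+
+   A strict line splitter consistent with the rules above might look
+   like the following.  This is a non-normative sketch: the function
+   name split_lines is illustrative, and rejecting input that contains
+   a bare CR or LF is a policy assumed here for clarity, not a
+   behavior this section mandates for receivers.
+

```python
from typing import List


def split_lines(data: bytes) -> List[bytes]:
    """Split a byte stream into lines, recognizing only CRLF as terminator."""
    lines = data.split(b"\r\n")
    # Every line must be terminated, so the piece after the last CRLF is empty.
    if lines.pop() != b"":
        raise ValueError("stream does not end with CRLF")
    for line in lines:
        if b"\r" in line or b"\n" in line:
            raise ValueError("bare CR or LF inside a line")
    return lines
```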
+
+2.3.8 Originator, Delivery, Relay, and Gateway Systems
+
+ This specification makes a distinction among four types of SMTP
+ systems, based on the role those systems play in transmitting
+ electronic mail. An "originating" system (sometimes called an SMTP
+ originator) introduces mail into the Internet or, more generally,
+ into a transport service environment. A "delivery" SMTP system is
+ one that receives mail from a transport service environment and
+ passes it to a mail user agent or deposits it in a message store
+ which a mail user agent is expected to subsequently access. A
+ "relay" SMTP system (usually referred to just as a "relay") receives
+ mail from an SMTP client and transmits it, without modification to
+ the message data other than adding trace information, to another SMTP
+ server for further relaying or for delivery.
+
+ A "gateway" SMTP system (usually referred to just as a "gateway")
+ receives mail from a client system in one transport environment and
+ transmits it to a server system in another transport environment.
+ Differences in protocols or message semantics between the transport
+ environments on either side of a gateway may require that the gateway
+ system perform transformations to the message that are not permitted
+ to SMTP relay systems. For the purposes of this specification,
+ firewalls that rewrite addresses should be considered as gateways,
+ even if SMTP is used on both sides of them (see [11]).
+
+2.3.9 Message Content and Mail Data
+
+ The terms "message content" and "mail data" are used interchangeably
+ in this document to describe the material transmitted after the DATA
+ command is accepted and before the end of data indication is
+ transmitted. Message content includes message headers and the
+ possibly-structured message body. The MIME specification [12]
+ provides the standard mechanisms for structured message bodies.
+
+2.3.10 Mailbox and Address
+
+ As used in this specification, an "address" is a character string
+ that identifies a user to whom mail will be sent or a location into
+ which mail will be deposited. The term "mailbox" refers to that
+ depository. The two terms are typically used interchangeably unless
+ the distinction between the location in which mail is placed (the
+ mailbox) and a reference to it (the address) is important. An
+ address normally consists of user and domain specifications. The
+ standard mailbox naming convention is defined to be "local-
+ part@domain": contemporary usage permits a much broader set of
+ applications than simple "user names". Consequently, and due to a
+ long history of problems when intermediate hosts have attempted to
+ optimize transport by modifying them, the local-part MUST be
+ interpreted and assigned semantics only by the host specified in the
+ domain part of the address.
+
+2.3.11 Reply
+
+ An SMTP reply is an acknowledgment (positive or negative) sent from
+ receiver to sender via the transmission channel in response to a
+ command. The general form of a reply is a numeric completion code
+ (indicating failure or success) usually followed by a text string.
+ The codes are for use by programs and the text is usually intended
+ for human users. Recent work [34] has specified further structuring
+ of the reply strings, including the use of supplemental and more
+ specific completion codes.
+
+2.4 General Syntax Principles and Transaction Model
+
+ SMTP commands and replies have a rigid syntax. All commands begin
+ with a command verb. All replies begin with a three-digit numeric
+ code. In some commands and replies, arguments MUST follow the verb
+ or reply code. Some commands do not accept arguments (after the
+ verb), and some reply codes are followed, sometimes optionally, by
+ free form text. In both cases, where text appears, it is separated
+ from the verb or reply code by a space character. Complete
+ definitions of commands and replies appear in section 4.
+
+ Verbs and argument values (e.g., "TO:" or "to:" in the RCPT command
+ and extension name keywords) are not case sensitive, with the sole
+ exception in this specification of a mailbox local-part (SMTP
+ Extensions may explicitly specify case-sensitive elements). That is,
+ a command verb, an argument value other than a mailbox local-part,
+ and free form text MAY be encoded in upper case, lower case, or any
+ mixture of upper and lower case with no impact on its meaning. This
+ is NOT true of a mailbox local-part. The local-part of a mailbox
+ MUST be treated as case sensitive. Therefore, SMTP implementations
+ MUST take care to preserve the case of mailbox local-parts. Mailbox
+ domains are not case sensitive. In particular, for some hosts the
+ user "smith" is different from the user "Smith". However, exploiting
+ the case sensitivity of mailbox local-parts impedes interoperability
+ and is discouraged.
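
The case rules above might be applied when comparing two addresses as in the following sketch. The helper name `same_mailbox` is hypothetical, and the sketch assumes plain "local-part@domain" strings with no quoting or address literals.

```python
def same_mailbox(a: str, b: str) -> bool:
    """Compare two "local-part@domain" strings under the stated case rules.

    The local-part is compared exactly (case sensitive); the domain is
    compared without regard to case. Hypothetical helper for illustration.
    """
    if "@" not in a or "@" not in b:
        raise ValueError("expected local-part@domain")
    # Split on the last "@" so the domain is whatever follows it.
    local_a, _, domain_a = a.rpartition("@")
    local_b, _, domain_b = b.rpartition("@")
    return local_a == local_b and domain_a.lower() == domain_b.lower()
```

Command verbs, by contrast, could be normalized with `verb.upper()` before dispatch, since their case carries no meaning.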
+
+ A few SMTP servers, in violation of this specification (and RFC 821)
+ require that command verbs be encoded by clients in upper case.
+ Implementations MAY wish to employ this encoding to accommodate those
+ servers.
+
+ The argument field consists of a variable length character string
+ ending with the end of the line, i.e., with the character sequence
+ <CRLF>. The receiver will take no action until this sequence is
+ received.
+
+ The syntax for each command is shown with the discussion of that
+ command. Common elements and parameters are shown in section 4.1.2.
+
+ Commands and replies are composed of characters from the ASCII
+ character set [1]. When the transport service provides an 8-bit byte
+ (octet) transmission channel, each 7-bit character is transmitted
+ right justified in an octet with the high order bit cleared to zero.
+ More specifically, the unextended SMTP service provides seven bit
+ transport only. An originating SMTP client which has not
+ successfully negotiated an appropriate extension with a particular
+ server MUST NOT transmit messages with information in the high-order
+ bit of octets. If such messages are transmitted in violation of this
+ rule, receiving SMTP servers MAY clear the high-order bit or reject
+ the message as invalid. In general, a relay SMTP SHOULD assume that
+ the message content it has received is valid and, assuming that the
+ envelope permits doing so, relay it without inspecting that content.
+ Of course, if the content is mislabeled and the data path cannot
+ accept the actual content, this may result in ultimate delivery of a
+ severely garbled message to the recipient. Delivery SMTP systems MAY
+ reject ("bounce") such messages rather than deliver them. No sending
+ SMTP system is permitted to send envelope commands in any character
+ set other than US-ASCII; receiving systems SHOULD reject such
+ commands, normally using "500 syntax error - invalid character"
+ replies.
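
A minimal sketch of the seven-bit rule follows. Both function names are assumptions for the example; a receiving server that chooses rejection instead of clearing the bit would use only the first function.

```python
def is_seven_bit(data: bytes) -> bool:
    """Return True when no octet has its high-order bit set."""
    return all(octet < 0x80 for octet in data)


def clear_high_bits(data: bytes) -> bytes:
    """Clear the high-order bit of every octet.

    This is one of the two options the text allows a receiving server
    (the other is rejecting the message as invalid).
    """
    return bytes(octet & 0x7F for octet in data)
```

Clearing the bit silently changes the content (0xE9 becomes 0x69, the letter "i"), which illustrates why the text warns about severely garbled messages.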
+
+ Eight-bit message content transmission MAY be requested of the server
+ by a client using extended SMTP facilities, notably the "8BITMIME"
+ extension [20]. 8BITMIME SHOULD be supported by SMTP servers.
+ However, it MUST NOT be construed as authorization to transmit
+ unrestricted eight bit material. 8BITMIME MUST NOT be requested by
+ senders for material with the high bit on that is not in MIME format
+ with an appropriate content-transfer encoding; servers MAY reject
+ such messages.
+
+ The metalinguistic notation used in this document corresponds to the
+ "Augmented BNF" used in other Internet mail system documents. The
+ reader who is not familiar with that syntax should consult the ABNF
+ specification [8]. Metalanguage terms used in running text are
+ surrounded by pointed brackets (e.g., <CRLF>) for clarity.
+
+3. The SMTP Procedures: An Overview
+
+ This section contains descriptions of the procedures used in SMTP:
+ session initiation, the mail transaction, forwarding mail, verifying
+ mailbox names and expanding mailing lists, and the opening and
+ closing exchanges. Comments on relaying, a note on mail domains, and
+ a discussion of changing roles are included at the end of this
+ section. Several complete scenarios are presented in appendix D.
+
+3.1 Session Initiation
+
+ An SMTP session is initiated when a client opens a connection to a
+ server and the server responds with an opening message.
+
+ SMTP server implementations MAY include identification of their
+ software and version information in the connection greeting reply
+ after the 220 code, a practice that permits more efficient isolation
+ and repair of any problems. Implementations MAY make provision for
+ SMTP servers to disable the software and version announcement where
+ it causes security concerns. While some systems also identify their
+ contact point for mail problems, this is not a substitute for
+ maintaining the required "postmaster" address (see section 4.5.1).
+
+ The SMTP protocol allows a server to formally reject a transaction
+ while still allowing the initial connection as follows: a 554
+ response MAY be given in the initial connection opening message
+ instead of the 220. A server taking this approach MUST still wait
+ for the client to send a QUIT (see section 4.1.1.10) before closing
+ the connection and SHOULD respond to any intervening commands with
+ "503 bad sequence of commands". Since an attempt to make an SMTP
+ connection to such a system is probably in error, a server returning
+ a 554 response on connection opening SHOULD provide enough
+ information in the reply text to facilitate debugging of the sending
+ system.
+
+3.2 Client Initiation
+
+ Once the server has sent the welcoming message and the client has
+ received it, the client normally sends the EHLO command to the
+ server, indicating the client's identity. In addition to opening the
+ session, use of EHLO indicates that the client is able to process
+ service extensions and requests that the server provide a list of the
+ extensions it supports. Older SMTP systems which are unable to
+ support service extensions and contemporary clients which do not
+ require service extensions in the mail session being initiated, MAY
+ use HELO instead of EHLO. Servers MUST NOT return the extended
+ EHLO-style response to a HELO command. For a particular connection
+ attempt, if the server returns a "command not recognized" response to
+ EHLO, the client SHOULD be able to fall back and send HELO.
+
+ In the EHLO command the host sending the command identifies itself;
+ the command may be interpreted as saying "Hello, I am <domain>" (and,
+ in the case of EHLO, "and I support service extension requests").
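
The EHLO-then-HELO fallback can be sketched as below. The `send` callable is a stand-in for whatever transport a client uses and is purely hypothetical; treating reply codes 500 and 502 as "command not recognized" is an assumption of this sketch.

```python
from typing import Callable, Tuple

Reply = Tuple[int, str]

# Assumption: these codes indicate that the server does not know EHLO.
NOT_RECOGNIZED = (500, 502)


def greet(send: Callable[[str], Reply], domain: str) -> Tuple[str, Reply]:
    """Send EHLO and fall back to HELO if the server does not recognize it.

    `send` takes one command line (without <CRLF>) and returns the reply
    as (code, text). Returns the verb that was finally used and its reply.
    """
    reply = send(f"EHLO {domain}")
    if reply[0] in NOT_RECOGNIZED:
        return "HELO", send(f"HELO {domain}")
    return "EHLO", reply
```

A client that falls back to HELO should not expect any service extensions for the rest of that session.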
+
+3.3 Mail Transactions
+
+ There are three steps to SMTP mail transactions. The transaction
+ starts with a MAIL command which gives the sender identification.
+ (In general, the MAIL command may be sent only when no mail
+ transaction is in progress; see section 4.1.4.) A series of one or
+ more RCPT commands follows giving the receiver information. Then a
+ DATA command initiates transfer of the mail data and is terminated by
+ the "end of mail" data indicator, which also confirms the
+ transaction.
+
+ The first step in the procedure is the MAIL command.
+
+ MAIL FROM:<reverse-path> [SP <mail-parameters> ] <CRLF>
+
+ This command tells the SMTP-receiver that a new mail transaction is
+ starting and to reset all its state tables and buffers, including any
+ recipients or mail data. The <reverse-path> portion of the first or
+ only argument contains the source mailbox (between "<" and ">"
+ brackets), which can be used to report errors (see section 4.2 for a
+ discussion of error reporting). If accepted, the SMTP server returns
+ a 250 OK reply. If the mailbox specification is not acceptable for
+ some reason, the server MUST return a reply indicating whether the
+ failure is permanent (i.e., will occur again if the client tries to
+ send the same address again) or temporary (i.e., the address might be
+ accepted if the client tries again later). Despite the apparent
+ scope of this requirement, there are circumstances in which the
+ acceptability of the reverse-path may not be determined until one or
+ more forward-paths (in RCPT commands) can be examined. In those
+ cases, the server MAY reasonably accept the reverse-path (with a 250
+ reply) and then report problems after the forward-paths are received
+ and examined. Normally, failures produce 550 or 553 replies.
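
A client typically decides between retrying and giving up by looking at the first digit of the reply code. The sketch below assumes the usual reading of that digit (2 success, 3 intermediate, 4 temporary failure, 5 permanent failure), which is defined elsewhere in this specification rather than in this section; the function name is hypothetical.

```python
def classify_reply(code: int) -> str:
    """Map a three-digit reply code to a coarse outcome by its first digit."""
    outcomes = {
        2: "success",
        3: "intermediate",
        4: "temporary failure",  # the same command might succeed later
        5: "permanent failure",  # repeating the same command will fail again
    }
    return outcomes.get(code // 100, "unknown")
```

Under this reading, the 550 and 553 replies mentioned above are permanent failures, so the client should not retry the same reverse-path unchanged.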
+
+ Historically, the <reverse-path> can contain more than just a
+ mailbox; however, contemporary systems SHOULD NOT use source routing
+ (see appendix C).
+
+ The optional <mail-parameters> are associated with negotiated SMTP
+ service extensions (see section 2.2).
+
+ The second step in the procedure is the RCPT command.
+
+ RCPT TO:<forward-path> [ SP <rcpt-parameters> ] <CRLF>
+
+ The first or only argument to this command includes a forward-path
+ (normally a mailbox and domain, always surrounded by "<" and ">"
+ brackets) identifying one recipient. If accepted, the SMTP server
+ returns a 250 OK reply and stores the forward-path. If the recipient
+ is known not to be a deliverable address, the SMTP server returns a
+ 550 reply, typically with a string such as "no such user - " and the
+ mailbox name (other circumstances and reply codes are possible).
+ This step of the procedure can be repeated any number of times.
+
+ The <forward-path> can contain more than just a mailbox.
+ Historically, the <forward-path> can be a source routing list of
+ hosts and the destination mailbox; however, contemporary SMTP clients
+ SHOULD NOT utilize source routes (see appendix C). Servers MUST be
+ prepared to encounter a list of source routes in the forward path,
+ but SHOULD ignore the routes or MAY decline to support the relaying
+ they imply. Similarly, servers MAY decline to accept mail that is
+ destined for other hosts or systems. These restrictions make a
+ server useless as a relay for clients that do not support full SMTP
+ functionality. Consequently, restricted-capability clients MUST NOT
+ assume that any SMTP server on the Internet can be used as their mail
+ processing (relaying) site. If a RCPT command appears without a
+ previous MAIL command, the server MUST return a 503 "Bad sequence of
+ commands" response. The optional <rcpt-parameters> are associated
+ with negotiated SMTP service extensions (see section 2.2).
+
+ The third step in the procedure is the DATA command (or some
+ alternative specified in a service extension).
+
+ DATA <CRLF>
+
+ If accepted, the SMTP server returns a 354 Intermediate reply and
+ considers all succeeding lines up to but not including the end of
+ mail data indicator to be the message text. When the end of text is
+ successfully received and stored the SMTP-receiver sends a 250 OK
+ reply.
+
+ Since the mail data is sent on the transmission channel, the end of
+ mail data must be indicated so that the command and reply dialog can
+ be resumed. SMTP indicates the end of the mail data by sending a
+ line containing only a "." (period or full stop). A transparency
+ procedure is used to prevent this from interfering with the user's
+ text (see section 4.5.2).
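
The end-of-data indicator and the transparency procedure it depends on can be sketched as follows. The sketch assumes the usual dot-stuffing rule (a sender prefixes one extra "." to any line that begins with ".", and the receiver removes it); the authoritative description is in section 4.5.2. Function names are hypothetical.

```python
from typing import List


def encode_data(lines: List[str]) -> str:
    """Encode message lines for transmission after a 354 reply.

    Lines starting with "." get one extra leading ".", every line is
    terminated with <CRLF>, and a line holding only "." ends the data.
    """
    stuffed = ["." + line if line.startswith(".") else line for line in lines]
    return "".join(line + "\r\n" for line in stuffed) + ".\r\n"


def decode_data(wire: str) -> List[str]:
    """Reverse encode_data: stop at the lone ".", strip one leading "."."""
    message = []
    for line in wire.split("\r\n"):
        if line == ".":
            break
        message.append(line[1:] if line.startswith(".") else line)
    return message
```

Because a user's own line consisting of "." travels as "..", it can never be mistaken for the end of mail data.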
+
+ The end of mail data indicator also confirms the mail transaction and
+ tells the SMTP server to now process the stored recipients and mail
+ data. If accepted, the SMTP server returns a 250 OK reply. The DATA
+ command can fail at only two points in the protocol exchange:
+
+ - If there was no MAIL, or no RCPT, command, or all such commands
+ were rejected, the server MAY return a "command out of sequence"
+ (503) or "no valid recipients" (554) reply in response to the DATA
+ command. If one of those replies (or any other 5yz reply) is
+ received, the client MUST NOT send the message data; more
+ generally, message data MUST NOT be sent unless a 354 reply is
+ received.
+
+ - If the verb is initially accepted and the 354 reply issued, the
+ DATA command should fail only if the mail transaction was
+ incomplete (for example, no recipients), or if resources were
+ unavailable (including, of course, the server unexpectedly
+ becoming unavailable), or if the server determines that the
+ message should be rejected for policy or other reasons.
+
+ However, in practice, some servers do not perform recipient
+ verification until after the message text is received. These servers
+ SHOULD treat a failure for one or more recipients as a "subsequent
+ failure" and return a mail message as discussed in section 6. Using
+ a "550 mailbox not found" (or equivalent) reply code after the data
+ are accepted makes it difficult or impossible for the client to
+ determine which recipients failed.
+
+ When RFC 822 format [7, 32] is being used, the mail data include the
+ memo header items such as Date, Subject, To, Cc, From. Server SMTP
+ systems SHOULD NOT reject messages based on perceived defects in the
+ RFC 822 or MIME [12] message header or message body. In particular,
+ they MUST NOT reject messages in which the numbers of Resent-fields
+ do not match or Resent-to appears without Resent-from and/or Resent-
+ date.
+
+ Mail transaction commands MUST be used in the order discussed above.
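
The required ordering of MAIL, RCPT, and DATA can be sketched as a small server-side state holder. The class and method names are hypothetical, address validation is omitted entirely, and returning 554 for DATA with no accepted recipients is one of the two replies the text permits.

```python
from typing import List, Optional


class Transaction:
    """Tracks one mail transaction and returns the reply code to send."""

    def __init__(self) -> None:
        self.reverse_path: Optional[str] = None
        self.forward_paths: List[str] = []

    def mail(self, reverse_path: str) -> int:
        # MAIL starts a new transaction and resets recipients and buffers.
        self.reverse_path = reverse_path
        self.forward_paths = []
        return 250

    def rcpt(self, forward_path: str) -> int:
        if self.reverse_path is None:
            return 503  # bad sequence of commands: RCPT before MAIL
        self.forward_paths.append(forward_path)
        return 250

    def data(self) -> int:
        if self.reverse_path is None:
            return 503  # command out of sequence: no MAIL yet
        if not self.forward_paths:
            return 554  # no valid recipients
        return 354  # client may now send the message text
```

The client side of the same rule is that message data is sent only after a 354 reply has actually been received.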
+
+3.4 Forwarding for Address Correction or Updating
+
+ Forwarding support is most often required to consolidate and simplify
+ addresses within, or relative to, some enterprise and less frequently
+ to establish addresses to link a person's prior address with a
+ current one. Silent forwarding of messages (without server
+ notification to the sender), for security or non-disclosure purposes,
+ is common in the contemporary Internet.
+
+ In both the enterprise and the "new address" cases, information
+ hiding (and sometimes security) considerations argue against exposure
+ of the "final" address through the SMTP protocol as a side-effect of
+ the forwarding activity. This may be especially important when the
+ final address may not even be reachable by the sender. Consequently,
+ the "forwarding" mechanisms described in section 3.2 of RFC 821, and
+ especially the 251 (corrected destination) and 551 reply codes from
+ RCPT must be evaluated carefully by implementers and, when they are
+ available, by those configuring systems.
+
+ In particular:
+
+ * Servers MAY forward messages when they are aware of an address
+ change. When they do so, they MAY either provide address-updating
+ information with a 251 code, or may forward "silently" and return
+ a 250 code. But, if a 251 code is used, they MUST NOT assume that
+ the client will actually update address information or even return
+ that information to the user.
+
+ Alternatively,
+
+ * Servers MAY reject or bounce messages when they are not
+ deliverable when addressed. When they do so, they MAY either
+ provide address-updating information with a 551 code, or may
+ reject the message as undeliverable with a 550 code and no
+ address-specific information. But, if a 551 code is used, they
+ MUST NOT assume that the client will actually update address
+ information or even return that information to the user.
+
+ SMTP server implementations that support the 251 and/or 551 reply
+ codes are strongly encouraged to provide configuration mechanisms so
+ that sites which conclude that they would undesirably disclose
+ information can disable or restrict their use.
+
+3.5 Commands for Debugging Addresses
+
+3.5.1 Overview
+
+ SMTP provides commands to verify a user name or obtain the content of
+ a mailing list. This is done with the VRFY and EXPN commands, which
+ have character string arguments. Implementations SHOULD support VRFY
+ and EXPN (however, see sections 3.5.2 and 7.3).
+
+ For the VRFY command, the string is a user name or a user name and
+ domain (see below). If a normal (i.e., 250) response is returned,
+ the response MAY include the full name of the user and MUST include
+ the mailbox of the user. It MUST be in either of the following
+ forms:
+
+ User Name <local-part@domain>
+ local-part@domain
+
+ When a name that is the argument to VRFY could identify more than one
+ mailbox, the server MAY either note the ambiguity or identify the
+ alternatives. In other words, any of the following are legitimate
+ responses to VRFY:
+
+ 553 User ambiguous
+
+ or
+
+ 553- Ambiguous; Possibilities are
+ 553-Joe Smith <jsmith@foo.com>
+ 553-Harry Smith <hsmith@foo.com>
+ 553 Melvin Smith <dweep@foo.com>
+
+ or
+
+ 553-Ambiguous; Possibilities
+ 553- <jsmith@foo.com>
+ 553- <hsmith@foo.com>
+ 553 <dweep@foo.com>
+
+ Under normal circumstances, a client receiving a 553 reply would be
+ expected to expose the result to the user. Use of exactly the forms
+ given, and the "user ambiguous" or "ambiguous" keywords, possibly
+ supplemented by extended reply codes such as those described in [34],
+ will facilitate automated translation into other languages as needed.
+ Of course, a client that was highly automated or that was operating
+ in another language than English, might choose to try to translate
+ the response, to return some other indication to the user than the
+ literal text of the reply, or to take some automated action such as
+ consulting a directory service for additional information before
+ reporting to the user.
+
+ For the EXPN command, the string identifies a mailing list, and the
+ successful (i.e., 250) multiline response MAY include the full name
+ of the users and MUST give the mailboxes on the mailing list.
+
+ In some hosts the distinction between a mailing list and an alias for
+ a single mailbox is a bit fuzzy, since a common data structure may
+ hold both types of entries, and it is possible to have mailing lists
+ containing only one mailbox. If a request is made to apply VRFY to a
+ mailing list, a positive response MAY be given if a message so
+ addressed would be delivered to everyone on the list, otherwise an
+ error SHOULD be reported (e.g., "550 That is a mailing list, not a
+ user" or "252 Unable to verify members of mailing list"). If a
+ request is made to expand a user name, the server MAY return a
+ positive response consisting of a list containing one name, or an
+ error MAY be reported (e.g., "550 That is a user name, not a mailing
+ list").
+
+ In the case of a successful multiline reply (normal for EXPN) exactly
+ one mailbox is to be specified on each line of the reply. The case
+ of an ambiguous request is discussed above.
+
+ "User name" is a fuzzy term and has been used deliberately. An
+ implementation of the VRFY or EXPN commands MUST include at least
+ recognition of local mailboxes as "user names". However, since
+ current Internet practice often results in a single host handling
+ mail for multiple domains, hosts, especially hosts that provide this
+ functionality, SHOULD accept the "local-part@domain" form as a "user
+ name"; hosts MAY also choose to recognize other strings as "user
+ names".
+
+ The case of expanding a mailbox list requires a multiline reply, such
+ as:
+
+ C: EXPN Example-People
+ S: 250-Jon Postel <Postel@isi.edu>
+ S: 250-Fred Fonebone <Fonebone@physics.foo-u.edu>
+ S: 250 Sam Q. Smith <SQSmith@specific.generic.com>
+
+ or
+
+ C: EXPN Executive-Washroom-List
+ S: 550 Access Denied to You.
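
The multiline replies shown above follow a simple pattern: every line carries the same code, continuation lines have "-" after the code, and the last line has a space. A minimal parser might look like the sketch below; the function name is hypothetical and the input is assumed to be the lines of one complete reply with <CRLF> already removed.

```python
from typing import List, Tuple


def parse_reply(lines: List[str]) -> Tuple[int, List[str]]:
    """Parse one complete (possibly multiline) reply into (code, texts)."""
    if not lines:
        raise ValueError("empty reply")
    code = None
    texts = []
    for index, line in enumerate(lines):
        if len(line) < 3 or not line[:3].isdigit():
            raise ValueError(f"no reply code in {line!r}")
        if code is None:
            code = int(line[:3])
        elif int(line[:3]) != code:
            raise ValueError("reply code changed within one reply")
        is_last = index == len(lines) - 1
        separator = line[3:4]
        # "-" marks a continuation line; the final line uses a space
        # (or nothing at all when the reply has no text).
        if (separator == "-") == is_last or separator not in ("-", " ", ""):
            raise ValueError(f"unexpected separator in {line!r}")
        texts.append(line[4:])
    return code, texts
```

Applied to the first EXPN example, this yields code 250 and one text entry per mailbox, matching the rule that each line of a successful reply names exactly one mailbox.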
+
+ The character string arguments of the VRFY and EXPN commands cannot
+ be further restricted due to the variety of implementations of the
+ user name and mailbox list concepts. On some systems it may be
+ appropriate for the argument of the EXPN command to be a file name
+ for a file containing a mailing list, but again there are a variety
+ of file naming conventions in the Internet. Similarly, historical
+ variations in what is returned by these commands are such that the
+ response SHOULD be interpreted very carefully, if at all, and SHOULD
+ generally only be used for diagnostic purposes.
+
+3.5.2 VRFY Normal Response
+
+ When normal (2yz or 551) responses are returned from a VRFY or EXPN
+ request, the reply normally includes the mailbox name; that is, the
+ form "<local-part@domain>", where "domain" is a fully qualified domain
+ name, MUST appear in the syntax. In circumstances exceptional enough
+ to justify violating the intent of this specification, free-form text
+ MAY be returned. In order to facilitate parsing by both computers
+ and people, addresses SHOULD appear in pointed brackets. When
+ addresses, rather than free-form debugging information, are returned,
+ EXPN and VRFY MUST return only valid domain addresses that are usable
+ in SMTP RCPT commands. Consequently, if an address implies delivery
+ to a program or other system, the mailbox name used to reach that
+ target MUST be given. Paths (explicit source routes) MUST NOT be
+ returned by VRFY or EXPN.
+
+ Server implementations SHOULD support both VRFY and EXPN. For
+ security reasons, implementations MAY provide local installations a
+ way to disable either or both of these commands through configuration
+ options or the equivalent. When these commands are supported, they
+ are not required to work across relays when relaying is supported.
+ Since they were both optional in RFC 821, they MUST be listed as
+ service extensions in an EHLO response, if they are supported.
+
+3.5.3 Meaning of VRFY or EXPN Success Response
+
+ A server MUST NOT return a 250 code in response to a VRFY or EXPN
+ command unless it has actually verified the address. In particular,
+ a server MUST NOT return 250 if all it has done is to verify that the
+ syntax given is valid. In that case, 502 (Command not implemented)
+ or 500 (Syntax error, command unrecognized) SHOULD be returned. As
+ stated elsewhere, implementation (in the sense of actually validating
+ addresses and returning information) of VRFY and EXPN is strongly
+ recommended. Hence, implementations that return 500 or 502 for VRFY
+ are not in full compliance with this specification.
+
+ There may be circumstances where an address appears to be valid but
+ cannot reasonably be verified in real time, particularly when a
+ server is acting as a mail exchanger for another server or domain.
+ "Apparent validity" in this case would normally involve at least
+ syntax checking and might involve verification that any domains
+ specified were ones to which the host expected to be able to relay
+ mail. In these situations, reply code 252 SHOULD be returned. These
+ cases parallel the discussion of RCPT verification in section 2.1.
+ Similarly, the discussion in section 3.4 applies to the
+ use of reply codes 251 and 551 with VRFY (and EXPN) to indicate
+ addresses that are recognized but that would be forwarded or bounced
+ were mail received for them. Implementations generally SHOULD be
+ more aggressive about address verification in the case of VRFY than
+ in the case of RCPT, even if it takes a little longer to do so.
+
+3.5.4 Semantics and Applications of EXPN
+
+ EXPN is often very useful in debugging and understanding problems
+ with mailing lists and multiple-target-address aliases. Some systems
+ have attempted to use source expansion of mailing lists as a means of
+ eliminating duplicates. The propagation of aliasing systems with
+ mail on the Internet, for hosts (typically with MX and CNAME DNS
+ records), for mailboxes (various types of local host aliases), and in
+ various proxying arrangements, has made it nearly impossible for
+ these strategies to work consistently, and mail systems SHOULD NOT
+ attempt them.
+
+3.6 Domains
+
+ Only resolvable, fully-qualified, domain names (FQDNs) are permitted
+ when domain names are used in SMTP. In other words, names that can
+ be resolved to MX RRs or A RRs (as discussed in section 5) are
+ permitted, as are CNAME RRs whose targets can be resolved, in turn,
+ to MX or A RRs. Local nicknames or unqualified names MUST NOT be
+ used. There are two exceptions to the rule requiring FQDNs:
+
+ - The domain name given in the EHLO command MUST be either a primary
+ host name (a domain name that resolves to an A RR) or, if the host
+ has no name, an address literal as described in section 4.1.1.1.
+
+ - The reserved mailbox name "postmaster" may be used in a RCPT
+ command without domain qualification (see section 4.1.1.3) and
+ MUST be accepted if so used.
+
+3.7 Relaying
+
+ In general, the availability of Mail eXchanger records in the domain
+ name system [22, 27] makes the use of explicit source routes in the
+ Internet mail system unnecessary. Many historical problems with
+ their interpretation have made their use undesirable. SMTP clients
+ SHOULD NOT generate explicit source routes except under unusual
+ circumstances. SMTP servers MAY decline to act as mail relays or to
+ accept addresses that specify source routes. When route information
+ is encountered, SMTP servers are also permitted to ignore the route
+ information and simply send to the final destination specified as the
+ last element in the route and SHOULD do so. There has been an
+ invalid practice of using names that do not appear in the DNS as
+ destination names, with the senders counting on the intermediate
+ hosts specified in source routing to resolve any problems. If source
+ routes are stripped, this practice will cause failures. This is one
+ of several reasons why SMTP clients MUST NOT generate invalid source
+ routes or depend on serial resolution of names.
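
Ignoring route information and keeping only the final destination can be sketched as below. The sketch assumes the historical source-route form "<@host1,@host2:local-part@domain>" described in appendix C; the function name is hypothetical, and routes whose hosts are address literals containing ":" are not handled.

```python
def strip_source_route(path: str) -> str:
    """Return the path with any leading source route removed.

    "<@a.example,@b.example:user@c.example>" becomes "<user@c.example>";
    a path without a route is returned unchanged.
    """
    if not (path.startswith("<") and path.endswith(">")):
        raise ValueError("path must be enclosed in < and >")
    inner = path[1:-1]
    # A route starts with "@" and is separated from the mailbox by ":".
    if inner.startswith("@") and ":" in inner:
        inner = inner.split(":", 1)[1]
    return f"<{inner}>"
```

After stripping, the remaining domain has to be resolvable on its own, which is why destination names that rely on intermediate hosts for resolution fail.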
+
+ When source routes are not used, the process described in RFC 821 for
+ constructing a reverse-path from the forward-path is not applicable
+ and the reverse-path at the time of delivery will simply be the
+ address that appeared in the MAIL command.
+
+ A relay SMTP server is usually the target of a DNS MX record that
+ designates it, rather than the final delivery system. The relay
+ server may accept or reject the task of relaying the mail in the same
+ way it accepts or rejects mail for a local user. If it accepts the
+ task, it then becomes an SMTP client, establishes a transmission
+ channel to the next SMTP server specified in the DNS (according to
+ the rules in section 5), and sends it the mail. If it declines to
+ relay mail to a particular address for policy reasons, a 550 response
+ SHOULD be returned.
+
+ Many mail-sending clients exist, especially in conjunction with
+ facilities that receive mail via POP3 or IMAP, that have limited
+ capability to support some of the requirements of this specification,
+ such as the ability to queue messages for subsequent delivery
+ attempts. For these clients, it is common practice to make private
+ arrangements to send all messages to a single server for processing
+ and subsequent distribution. SMTP, as specified here, is not ideally
+ suited for this role, and work is underway on standardized mail
+ submission protocols that might eventually supersede the current
+ practices. In any event, because these arrangements are private and
+ fall outside the scope of this specification, they are not described
+ here.
+
+ It is important to note that MX records can point to SMTP servers
+ which act as gateways into other environments, not just SMTP relays
+ and final delivery systems; see sections 3.8 and 5.
+
+ If an SMTP server has accepted the task of relaying the mail and
+ later finds that the destination is incorrect or that the mail cannot
+ be delivered for some other reason, then it MUST construct an
+ "undeliverable mail" notification message and send it to the
+ originator of the undeliverable mail (as indicated by the reverse-
+ path). Formats specified for non-delivery reports by other standards
+ (see, for example, [24, 25]) SHOULD be used if possible.
+
+ This notification message must be from the SMTP server at the relay
+ host or the host that first determines that delivery cannot be
+ accomplished. Of course, SMTP servers MUST NOT send notification
+ messages about problems transporting notification messages. One way
+ to prevent loops in error reporting is to specify a null reverse-path
+ in the MAIL command of a notification message. When such a message
+ is transmitted the reverse-path MUST be set to null (see section
+ 4.5.5 for additional discussion). A MAIL command with a null
+ reverse-path appears as follows:
+
+ MAIL FROM:<>
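+
+ As a rough Python sketch of the loop-prevention rule above (the
+ function names are hypothetical, and treating a null reverse-path as
+ the marker of a notification is an assumption of this sketch):

```python
def mail_command(reverse_path=""):
    """Build a MAIL command line; an empty reverse-path yields MAIL FROM:<>."""
    return "MAIL FROM:<" + reverse_path + ">\r\n"


def may_send_notification(reverse_path):
    """Never generate a notification for a message whose own reverse-path is null."""
    return reverse_path != ""
```

+ A relay that cannot deliver a message would call
+ may_send_notification() with the stored reverse-path and, if it
+ returns True, send the report using mail_command() with no argument.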
+
+ As discussed in section 2.4.1, a relay SMTP has no need to inspect or
+ act upon the headers or body of the message data and MUST NOT do so
+ except to add its own "Received:" header (section 4.4) and,
+ optionally, to attempt to detect looping in the mail system (see
+ section 6.2).
+
+3.8 Mail Gatewaying
+
+ While the relay function discussed above operates within the Internet
+ SMTP transport service environment, MX records or various forms of
+ explicit routing may require that an intermediate SMTP server perform
+ a translation function between one transport service and another. As
+ discussed in section 2.3.8, when such a system is at the boundary
+ between two transport service environments, we refer to it as a
+ "gateway" or "gateway SMTP".
+
+ Gatewaying mail between different mail environments, such as
+ different mail formats and protocols, is complex and does not easily
+ yield to standardization. However, some general requirements may be
+ given for a gateway between the Internet and another mail
+ environment.
+
+3.8.1 Header Fields in Gatewaying
+
+ Header fields MAY be rewritten when necessary as messages are
+ gatewayed across mail environment boundaries. This may involve
+ inspecting the message body or interpreting the local-part of the
+ destination address in spite of the prohibitions in section 2.4.1.
+
+ Other mail systems gatewayed to the Internet often use a subset of
+ RFC 822 headers or provide similar functionality with a different
+ syntax, but some of these mail systems do not have an equivalent to
+ the SMTP envelope. Therefore, when a message leaves the Internet
+ environment, it may be necessary to fold the SMTP envelope
+ information into the message header. A possible solution would be to
+ create new header fields to carry the envelope information (e.g.,
+ "X-SMTP-MAIL:" and "X-SMTP-RCPT:"); however, this would require
+ changes in mail programs in foreign environments and might risk
+ disclosure of private information (see section 7.2).
+
+3.8.2 Received Lines in Gatewaying
+
+ When forwarding a message into or out of the Internet environment, a
+ gateway MUST prepend a Received: line, but it MUST NOT alter in any
+ way a Received: line that is already in the header.
+
+ "Received:" fields of messages originating from other environments
+ may not conform exactly to this specification. However, the most
+ important use of Received: lines is for debugging mail faults, and
+ this debugging can be severely hampered by well-meaning gateways that
+ try to "fix" a Received: line. As another consequence of trace
+ fields arising in non-SMTP environments, receiving systems MUST NOT
+ reject mail based on the format of a trace field and SHOULD be
+ extremely robust in the light of unexpected information or formats in
+ those fields.
+
+ The gateway SHOULD indicate the environment and protocol in the "via"
+ clauses of Received field(s) that it supplies.
+
+3.8.3 Addresses in Gatewaying
+
+ From the Internet side, the gateway SHOULD accept all valid address
+ formats in SMTP commands and in RFC 822 headers, and all valid RFC
+ 822 messages. Addresses and headers generated by gateways MUST
+ conform to applicable Internet standards (including this one and RFC
+ 822). Gateways are, of course, subject to the same rules for
+ handling source routes as those described for other SMTP systems in
+ section 3.3.
+
+3.8.4 Other Header Fields in Gatewaying
+
+ The gateway MUST ensure that all header fields of a message that it
+ forwards into the Internet mail environment meet the requirements for
+ Internet mail. In particular, all addresses in "From:", "To:",
+ "Cc:", etc., fields MUST be transformed (if necessary) to satisfy RFC
+ 822 syntax, MUST reference only fully-qualified domain names, and
+ MUST be effective and useful for sending replies. The translation
+ algorithm used to convert mail from the Internet protocols to another
+ environment's protocol SHOULD ensure that error messages from the
+ foreign mail environment are delivered to the return path from the
+ SMTP envelope, not to the sender listed in the "From:" field (or
+ other fields) of the RFC 822 message.
+
+3.8.5 Envelopes in Gatewaying
+
+ Similarly, when forwarding a message from another environment into
+ the Internet, the gateway SHOULD set the envelope return path in
+ accordance with an error message return address, if supplied by the
+ foreign environment. If the foreign environment has no equivalent
+ concept, the gateway must select and use a best approximation, with
+ the message originator's address as the default of last resort.
+
+3.9 Terminating Sessions and Connections
+
+ An SMTP connection is terminated when the client sends a QUIT
+ command. The server responds with a positive reply code, after which
+ it closes the connection.
+
+ An SMTP server MUST NOT intentionally close the connection except:
+
+ - After receiving a QUIT command and responding with a 221 reply.
+
+ - After detecting the need to shut down the SMTP service and
+ returning a 421 response code. This response code can be issued
+ after the server receives any command or, if necessary,
+ asynchronously from command receipt (on the assumption that the
+ client will receive it after the next command is issued).
+
+ In particular, a server that closes connections in response to
+ commands that are not understood is in violation of this
+ specification. Servers are expected to be tolerant of unknown
+ commands, issuing a 500 reply and awaiting further instructions from
+ the client.
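+
+ The closing rules above might be sketched in Python as follows. This
+ is an illustration only: the helper names are hypothetical, and the
+ verb list is simply the set of commands described in section 4.1.1.

```python
# Verbs described in section 4.1.1; anything else is treated as unknown.
KNOWN_VERBS = {"EHLO", "HELO", "MAIL", "RCPT", "DATA", "RSET",
               "VRFY", "EXPN", "HELP", "NOOP", "QUIT"}


def reply_code_for_unknown(line):
    """Return 500 for an unrecognized command verb, else None."""
    verb = line.strip().split(" ", 1)[0].upper()
    return None if verb in KNOWN_VERBS else 500


def server_may_close(verb, reply_code):
    """True only in the two cases where the server may close intentionally."""
    return (verb.upper() == "QUIT" and reply_code == 221) or reply_code == 421
```

+ Note that an unknown command produces a 500 reply but never makes
+ server_may_close() return True; the server keeps waiting for the
+ next command.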
+
+ An SMTP server which is forcibly shut down via external means SHOULD
+ attempt to send a line containing a 421 response code to the SMTP
+ client before exiting. The SMTP client will normally read the 421
+ response code after sending its next command.
+
+ SMTP clients that experience a connection close, reset, or other
+ communications failure due to circumstances not under their control
+ (in violation of the intent of this specification but sometimes
+ unavoidable) SHOULD, to maintain the robustness of the mail system,
+ treat the mail transaction as if a 451 response had been received and
+ act accordingly.
+
+3.10 Mailing Lists and Aliases
+
+ An SMTP-capable host SHOULD support both the alias and the list
+ models of address expansion for multiple delivery. When a message is
+ delivered or forwarded to each address of an expanded list form, the
+ return address in the envelope ("MAIL FROM:") MUST be changed to be
+ the address of a person or other entity who administers the list.
+ However, in this case, the message header [32] MUST be left
+ unchanged; in particular, the "From" field of the message header is
+ unaffected.
+
+ An important mail facility is a mechanism for multi-destination
+ delivery of a single message, by transforming (or "expanding" or
+ "exploding") a pseudo-mailbox address into a list of destination
+ mailbox addresses. When a message is sent to such a pseudo-mailbox
+ (sometimes called an "exploder"), copies are forwarded or
+ redistributed to each mailbox in the expanded list. Servers SHOULD
+ simply utilize the addresses on the list; application of heuristics
+ or other matching rules to eliminate some addresses, such as that of
+ the originator, is strongly discouraged. We classify such a pseudo-
+ mailbox as an "alias" or a "list", depending upon the expansion
+ rules.
+
+3.10.1 Alias
+
+ To expand an alias, the recipient mailer simply replaces the pseudo-
+ mailbox address in the envelope with each of the expanded addresses
+ in turn; the rest of the envelope and the message body are left
+ unchanged. The message is then delivered or forwarded to each
+ expanded address.
+
+3.10.2 List
+
+ A mailing list may be said to operate by "redistribution" rather than
+ by "forwarding". To expand a list, the recipient mailer replaces the
+ pseudo-mailbox address in the envelope with all of the expanded
+ addresses. The return address in the envelope is changed so that all
+ error messages generated by the final deliveries will be returned to
+ a list administrator, not to the message originator, who generally
+ has no control over the contents of the list and will typically find
+ error messages annoying.
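+
+ The difference between the two expansion models can be sketched in
+ Python. The Envelope type and both function names are hypothetical,
+ and the sketch assumes the incoming envelope names only the pseudo-
+ mailbox as its recipient.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Envelope:
    reverse_path: str
    forward_paths: tuple


def expand_alias(envelope, members):
    """Alias: one envelope per expanded address, reverse-path unchanged."""
    return [replace(envelope, forward_paths=(m,)) for m in members]


def expand_list(envelope, members, list_admin):
    """List: all expanded addresses, reverse-path set to the list administrator."""
    return Envelope(reverse_path=list_admin, forward_paths=tuple(members))
```

+ In both cases the message header, including the "From" field, is
+ left untouched; only envelope information changes.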
+
+4. The SMTP Specifications
+
+4.1 SMTP Commands
+
+4.1.1 Command Semantics and Syntax
+
+ The SMTP commands define the mail transfer or the mail system
+ function requested by the user. SMTP commands are character strings
+ terminated by <CRLF>. The commands themselves are alphabetic
+ characters terminated by <SP> if parameters follow and <CRLF>
+ otherwise. (In the interest of improved interoperability, SMTP
+ receivers are encouraged to tolerate trailing white space before the
+ terminating <CRLF>.) The syntax of the local part of a mailbox must
+ conform to receiver site conventions and the syntax specified in
+ section 4.1.2. The SMTP commands are discussed below. The SMTP
+ replies are discussed in section 4.2.
+
+ A mail transaction involves several data objects which are
+ communicated as arguments to different commands. The reverse-path is
+ the argument of the MAIL command, the forward-path is the argument of
+ the RCPT command, and the mail data is the argument of the DATA
+ command. These arguments or data objects must be transmitted and
+ held pending the confirmation communicated by the end of mail data
+ indication which finalizes the transaction. The model for this is
+ that distinct buffers are provided to hold the types of data objects,
+ that is, there is a reverse-path buffer, a forward-path buffer, and a
+ mail data buffer. Specific commands cause information to be appended
+ to a specific buffer, or cause one or more buffers to be cleared.
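+
+ A minimal Python sketch of this buffer model follows. The class and
+ method names are hypothetical; the behavior of MAIL, RSET, and the
+ end of mail data indication follows sections 4.1.1.2, 4.1.1.4, and
+ 4.1.1.5.

```python
class TransactionBuffers:
    """The three buffers of the SMTP transaction model."""

    def __init__(self):
        self.clear()

    def clear(self):
        # Also the effect of RSET or a repeated EHLO.
        self.reverse_path = None
        self.forward_paths = []
        self.mail_data = []

    def mail(self, reverse_path):
        # MAIL clears all three buffers, then stores the reverse-path.
        self.clear()
        self.reverse_path = reverse_path

    def rcpt(self, forward_path):
        self.forward_paths.append(forward_path)

    def data_line(self, line):
        self.mail_data.append(line)

    def end_of_data(self):
        # Processing consumes the buffers; afterwards they are cleared.
        message = (self.reverse_path, tuple(self.forward_paths),
                   tuple(self.mail_data))
        self.clear()
        return message
```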
+
+ Several commands (RSET, DATA, QUIT) are specified as not permitting
+ parameters. In the absence of specific extensions offered by the
+ server and accepted by the client, clients MUST NOT send such
+ parameters and servers SHOULD reject commands containing them as
+ having invalid syntax.
+
+4.1.1.1 Extended HELLO (EHLO) or HELLO (HELO)
+
+ These commands are used to identify the SMTP client to the SMTP
+ server. The argument field contains the fully-qualified domain name
+ of the SMTP client if one is available. In situations in which the
+ SMTP client system does not have a meaningful domain name (e.g., when
+ its address is dynamically allocated and no reverse mapping record is
+ available), the client SHOULD send an address literal (see section
+ 4.1.3), optionally followed by information that will help to identify
+ the client system. The SMTP server identifies itself to the SMTP
+ client in the connection greeting reply and in the response to this
+ command.
+
+ A client SMTP SHOULD start an SMTP session by issuing the EHLO
+ command. If the SMTP server supports the SMTP service extensions it
+ will give a successful response, a failure response, or an error
+ response. If the SMTP server, in violation of this specification,
+ does not support any SMTP service extensions it will generate an
+ error response. Older client SMTP systems MAY, as discussed above,
+ use HELO (as specified in RFC 821) instead of EHLO, and servers MUST
+ support the HELO command and reply properly to it. In any event, a
+ client MUST issue HELO or EHLO before starting a mail transaction.
+
+ These commands, and a "250 OK" reply to one of them, confirm that
+ both the SMTP client and the SMTP server are in the initial state,
+ that is, there is no transaction in progress and all state tables and
+ buffers are cleared.
+
+ Syntax:
+
+ ehlo = "EHLO" SP Domain CRLF
+ helo = "HELO" SP Domain CRLF
+
+ Normally, the response to EHLO will be a multiline reply. Each line
+ of the response contains a keyword and, optionally, one or more
+ parameters. Following the normal syntax for multiline replies, these
+ keywords follow the code (250) and a hyphen for all but the last
+ line, and the code and a space for the last line. The syntax for a
+ positive response, using the ABNF notation and terminal symbols of
+ [8], is:
+
+ ehlo-ok-rsp = ( "250" SP domain [ SP ehlo-greet ] CRLF )
+ / ( "250-" domain [ SP ehlo-greet ] CRLF
+ *( "250-" ehlo-line CRLF )
+ "250" SP ehlo-line CRLF )
+
+ ehlo-greet = 1*(%d0-9 / %d11-12 / %d14-127)
+ ; string of any characters other than CR or LF
+
+ ehlo-line = ehlo-keyword *( SP ehlo-param )
+
+ ehlo-keyword = (ALPHA / DIGIT) *(ALPHA / DIGIT / "-")
+ ; additional syntax of ehlo-params depends on
+ ; ehlo-keyword
+
+ ehlo-param = 1*(%d33-127)
+ ; any CHAR excluding <SP> and all
+ ; control characters (US-ASCII 0-31 inclusive)
+
+ Although EHLO keywords may be specified in upper, lower, or mixed
+ case, they MUST always be recognized and processed in a case-
+ insensitive manner. This is simply an extension of practices
+ specified in RFC 821 and section 2.4.1.
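+
+ A client might parse a positive EHLO response roughly as in the
+ following Python sketch. The function name is hypothetical, the
+ input is assumed to be the reply lines with their CRLF already
+ removed, and the sketch assumes a space follows the code on the
+ final (or only) line, as the prose above describes. Keywords are
+ folded to upper case so that later lookups are case-insensitive.

```python
def parse_ehlo_response(lines):
    """Return (domain, greeting, extensions) from a 250 EHLO reply."""
    if not lines:
        raise ValueError("empty reply")
    for i, line in enumerate(lines):
        # "250-" on every line but the last, "250 " on the last one.
        expected = "250 " if i == len(lines) - 1 else "250-"
        if not line.startswith(expected):
            raise ValueError("malformed reply line: %r" % line)
    domain, _, greeting = lines[0][4:].partition(" ")
    extensions = {}
    for line in lines[1:]:
        keyword, *params = line[4:].split(" ")
        extensions[keyword.upper()] = params
    return domain, greeting, extensions
```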
+
+4.1.1.2 MAIL (MAIL)
+
+ This command is used to initiate a mail transaction in which the mail
+ data is delivered to an SMTP server which may, in turn, deliver it to
+ one or more mailboxes or pass it on to another system (possibly using
+ SMTP). The argument field contains a reverse-path and may contain
+ optional parameters. In general, the MAIL command may be sent only
+ when no mail transaction is in progress, see section 4.1.4.
+
+ The reverse-path consists of the sender mailbox. Historically, that
+ mailbox might optionally have been preceded by a list of hosts, but
+ that behavior is now deprecated (see appendix C). In some types of
+ reporting messages for which a reply is likely to cause a mail loop
+ (for example, mail delivery and nondelivery notifications), the
+ reverse-path may be null (see section 3.7).
+
+ This command clears the reverse-path buffer, the forward-path buffer,
+ and the mail data buffer; and inserts the reverse-path information
+ from this command into the reverse-path buffer.
+
+ If service extensions were negotiated, the MAIL command may also
+ carry parameters associated with a particular service extension.
+
+ Syntax:
+
+ "MAIL FROM:" ("<>" / Reverse-Path)
+ [SP Mail-parameters] CRLF
+
+4.1.1.3 RECIPIENT (RCPT)
+
+ This command is used to identify an individual recipient of the mail
+ data; multiple recipients are specified by multiple use of this
+ command. The argument field contains a forward-path and may contain
+ optional parameters.
+
+ The forward-path normally consists of the required destination
+ mailbox. Sending systems SHOULD NOT generate the optional list of
+ hosts known as a source route. Receiving systems MUST recognize
+ source route syntax but SHOULD strip off the source route
+ specification and utilize the domain name associated with the mailbox
+ as if the source route had not been provided.
+
+ Similarly, relay hosts SHOULD strip or ignore source routes, and
+ names MUST NOT be copied into the reverse-path. When mail reaches
+ its ultimate destination (the forward-path contains only a
+ destination mailbox), the SMTP server inserts it into the destination
+ mailbox in accordance with its host mail conventions.
+
+ For example, mail received at relay host xyz.com with envelope
+ commands
+
+ MAIL FROM:<userx@y.foo.org>
+ RCPT TO:<@hosta.int,@jkl.org:userc@d.bar.org>
+
+ will normally be sent directly on to host d.bar.org with envelope
+ commands
+
+ MAIL FROM:<userx@y.foo.org>
+ RCPT TO:<userc@d.bar.org>
+
+ As provided in appendix C, xyz.com MAY also choose to relay the
+ message to hosta.int, using the envelope commands
+
+ MAIL FROM:<userx@y.foo.org>
+ RCPT TO:<@hosta.int,@jkl.org:userc@d.bar.org>
+
+ or to jkl.org, using the envelope commands
+
+ MAIL FROM:<userx@y.foo.org>
+ RCPT TO:<@jkl.org:userc@d.bar.org>
+
+ Of course, since hosts are not required to relay mail at all, xyz.com
+ may also reject the message entirely when the RCPT command is
+ received, using a 550 code (since this is a "policy reason").
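+
+ The usual handling, stripping the source route and using only the
+ mailbox, can be sketched in Python. The function name is
+ hypothetical, and the sketch assumes the route hosts are domain
+ names rather than address literals, so that the first colon ends
+ the route.

```python
def strip_source_route(path):
    """Turn '<@a,@b:user@host>' into '<user@host>'; leave plain paths alone."""
    if not (path.startswith("<") and path.endswith(">")):
        raise ValueError("path must be enclosed in angle brackets")
    inner = path[1:-1]
    if inner.startswith("@"):
        # Everything up to the first colon is the source route.
        _route, sep, mailbox = inner.partition(":")
        if not sep or not mailbox:
            raise ValueError("source route without a mailbox")
        inner = mailbox
    return "<" + inner + ">"
```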
+
+ If service extensions were negotiated, the RCPT command may also
+ carry parameters associated with a particular service extension
+ offered by the server. The client MUST NOT transmit parameters other
+ than those associated with a service extension offered by the server
+ in its EHLO response.
+
+ Syntax:
+ "RCPT TO:" ("<Postmaster@" domain ">" / "<Postmaster>" / Forward-Path)
+ [SP Rcpt-parameters] CRLF
+
+4.1.1.4 DATA (DATA)
+
+ The receiver normally sends a 354 response to DATA, and then treats
+ the lines (strings ending in <CRLF> sequences, as described in
+ section 2.3.7) following the command as mail data from the sender.
+ This command causes the mail data to be appended to the mail data
+ buffer. The mail data may contain any of the 128 ASCII character
+ codes, although experience has indicated that use of control
+ characters other than SP, HT, CR, and LF may cause problems and
+ SHOULD be avoided when possible.
+
+ The mail data is terminated by a line containing only a period, that
+ is, the character sequence "<CRLF>.<CRLF>" (see section 4.5.2). This
+ is the end of mail data indication. Note that the first <CRLF> of
+ this terminating sequence is also the <CRLF> that ends the final line
+ of the data (message text) or, if there was no data, ends the DATA
+ command itself. An extra <CRLF> MUST NOT be added, as that would
+ cause an empty line to be added to the message. The only exception
+ to this rule would arise if the message body were passed to the
+ originating SMTP-sender with a final "line" that did not end in
+ <CRLF>; in that case, the originating SMTP system MUST either reject
+ the message as invalid or add <CRLF> in order to have the receiving
+ SMTP server recognize the "end of data" condition.
+
+ The custom of accepting lines ending only in <LF>, as a concession to
+ non-conforming behavior on the part of some UNIX systems, has proven
+ to cause more interoperability problems than it solves, and SMTP
+ server systems MUST NOT do this, even in the name of improved
+ robustness. In particular, the sequence "<LF>.<LF>" (bare line
+ feeds, without carriage returns) MUST NOT be treated as equivalent to
+ <CRLF>.<CRLF> as the end of mail data indication.
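+
+ As an illustration, a receiver could locate the end of mail data
+ indication as in the Python sketch below. The function name is
+ hypothetical, the input is assumed to be the octets received after
+ the CRLF that ended the DATA command, and the transparency
+ procedure of section 4.5.2 is not handled here.

```python
def split_mail_data(stream):
    """Return (mail_data, remainder), or None if no terminator has arrived."""
    if stream.startswith(b".\r\n"):
        # No data at all: the DATA command's own CRLF opens the terminator.
        return b"", stream[3:]
    idx = stream.find(b"\r\n.\r\n")
    if idx < 0:
        return None
    # The first CRLF of the terminator also ends the final data line.
    return stream[:idx + 2], stream[idx + 5:]
```

+ Because only the exact sequence CR LF "." CR LF is searched for, a
+ bare-line-feed sequence is never taken as the end of the data.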
+
+ Receipt of the end of mail data indication requires the server to
+ process the stored mail transaction information. This processing
+ consumes the information in the reverse-path buffer, the forward-path
+ buffer, and the mail data buffer, and on the completion of this
+ command these buffers are cleared. If the processing is successful,
+ the receiver MUST send an OK reply. If the processing fails the
+ receiver MUST send a failure reply. The SMTP model does not allow
+ for partial failures at this point: either the message is accepted by
+ the server for delivery and a positive response is returned or it is
+ not accepted and a failure reply is returned. In sending a positive
+ completion reply to the end of data indication, the receiver takes
+ full responsibility for the message (see section 6.1). Errors that
+ are diagnosed subsequently MUST be reported in a mail message, as
+ discussed in section 4.4.
+
+ When the SMTP server accepts a message either for relaying or for
+ final delivery, it inserts a trace record (also referred to
+ interchangeably as a "time stamp line" or "Received" line) at the top
+ of the mail data. This trace record indicates the identity of the
+ host that sent the message, the identity of the host that received
+ the message (and is inserting this time stamp), and the date and time
+ the message was received. Relayed messages will have multiple time
+ stamp lines. Details for formation of these lines, including their
+ syntax, are specified in section 4.4.
+
+ Additional discussion about the operation of the DATA command appears
+ in section 3.3.
+
+ Syntax:
+ "DATA" CRLF
+
+4.1.1.5 RESET (RSET)
+
+ This command specifies that the current mail transaction will be
+ aborted. Any stored sender, recipients, and mail data MUST be
+ discarded, and all buffers and state tables cleared. The receiver
+ MUST send a "250 OK" reply to a RSET command with no arguments. A
+ reset command may be issued by the client at any time. It is
+ effectively equivalent to a NOOP (i.e., it has no effect) if issued
+ immediately after EHLO, before EHLO is issued in the session, after
+ an end-of-data indicator has been sent and acknowledged, or
+ immediately before a QUIT. An SMTP server MUST NOT close the
+ connection as the result of receiving a RSET; that action is reserved
+ for QUIT (see section 4.1.1.10).
+
+ Since EHLO implies some additional processing and response by the
+ server, RSET will normally be more efficient than reissuing that
+ command, even though the formal semantics are the same.
+
+ There are circumstances, contrary to the intent of this
+ specification, in which an SMTP server may receive an indication that
+ the underlying TCP connection has been closed or reset. To preserve
+ the robustness of the mail system, SMTP servers SHOULD be prepared
+ for this condition and SHOULD treat it as if a QUIT had been received
+ before the connection disappeared.
+
+ Syntax:
+ "RSET" CRLF
+
+4.1.1.6 VERIFY (VRFY)
+
+ This command asks the receiver to confirm that the argument
+ identifies a user or mailbox. If it is a user name, information is
+ returned as specified in section 3.5.
+
+ This command has no effect on the reverse-path buffer, the forward-
+ path buffer, or the mail data buffer.
+
+ Syntax:
+ "VRFY" SP String CRLF
+
+4.1.1.7 EXPAND (EXPN)
+
+ This command asks the receiver to confirm that the argument
+ identifies a mailing list, and if so, to return the membership of
+ that list. If the command is successful, a reply is returned
+ containing information as described in section 3.5. This reply will
+ have multiple lines except in the trivial case of a one-member list.
+
+ This command has no effect on the reverse-path buffer, the forward-
+ path buffer, or the mail data buffer and may be issued at any time.
+
+ Syntax:
+ "EXPN" SP String CRLF
+
+4.1.1.8 HELP (HELP)
+
+ This command causes the server to send helpful information to the
+ client. The command MAY take an argument (e.g., any command name)
+ and return more specific information as a response.
+
+ This command has no effect on the reverse-path buffer, the forward-
+ path buffer, or the mail data buffer and may be issued at any time.
+
+ SMTP servers SHOULD support HELP without arguments and MAY support it
+ with arguments.
+
+ Syntax:
+ "HELP" [ SP String ] CRLF
+
+4.1.1.9 NOOP (NOOP)
+
+ This command does not affect any parameters or previously entered
+ commands. It specifies no action other than that the receiver send
+ an OK reply.
+
+ This command has no effect on the reverse-path buffer, the forward-
+ path buffer, or the mail data buffer and may be issued at any time.
+ If a parameter string is specified, servers SHOULD ignore it.
+
+ Syntax:
+ "NOOP" [ SP String ] CRLF
+
+4.1.1.10 QUIT (QUIT)
+
+ This command specifies that the receiver MUST send an OK reply, and
+ then close the transmission channel.
+
+ The receiver MUST NOT intentionally close the transmission channel
+ until it receives and replies to a QUIT command (even if there was an
+ error). The sender MUST NOT intentionally close the transmission
+ channel until it sends a QUIT command and SHOULD wait until it
+ receives the reply (even if there was an error response to a previous
+ command). If the connection is closed prematurely due to violations
+ of the above or system or network failure, the server MUST cancel any
+ pending transaction, but not undo any previously completed
+ transaction, and generally MUST act as if the command or transaction
+ in progress had received a temporary error (i.e., a 4yz response).
+
+ The QUIT command may be issued at any time.
+
+ Syntax:
+ "QUIT" CRLF
+
+4.1.2 Command Argument Syntax
+
+ The syntax of the argument fields of the above commands (using the
+ syntax specified in [8] where applicable) is given below. Some of
+ the productions given below are used only in conjunction with source
+ routes as described in appendix C. Terminals not defined in this
+ document, such as ALPHA, DIGIT, SP, CR, LF, CRLF, are as defined in
+ the "core" syntax [8 (section 6)] or in the message format syntax
+ [32].
+
+ Reverse-path = Path
+ Forward-path = Path
+ Path = "<" [ A-d-l ":" ] Mailbox ">"
+ A-d-l = At-domain *( "," A-d-l )
+ ; Note that this form, the so-called "source route",
+ ; MUST be accepted, SHOULD NOT be generated, and SHOULD be
+ ; ignored.
+ At-domain = "@" domain
+ Mail-parameters = esmtp-param *(SP esmtp-param)
+ Rcpt-parameters = esmtp-param *(SP esmtp-param)
+
+ esmtp-param = esmtp-keyword ["=" esmtp-value]
+ esmtp-keyword = (ALPHA / DIGIT) *(ALPHA / DIGIT / "-")
+ esmtp-value = 1*(%d33-60 / %d62-127)
+ ; any CHAR excluding "=", SP, and control characters
+ Keyword = Ldh-str
+ Argument = Atom
+ Domain = (sub-domain 1*("." sub-domain)) / address-literal
+ sub-domain = Let-dig [Ldh-str]
+
+ address-literal = "[" ( IPv4-address-literal /
+ IPv6-address-literal /
+ General-address-literal ) "]"
+ ; See section 4.1.3
+
+ Mailbox = Local-part "@" Domain
+
+ Local-part = Dot-string / Quoted-string
+ ; MAY be case-sensitive
+
+ Dot-string = Atom *("." Atom)
+
+ Atom = 1*atext
+
+ Quoted-string = DQUOTE *qcontent DQUOTE
+
+ String = Atom / Quoted-string
+
+ While the above definition for Local-part is relatively permissive,
+ for maximum interoperability, a host that expects to receive mail
+ SHOULD avoid defining mailboxes where the Local-part requires (or
+ uses) the Quoted-string form or where the Local-part is case-
+ sensitive. For any purposes that require generating or comparing
+ Local-parts (e.g., to specific mailbox names), all quoted forms MUST
+ be treated as equivalent and the sending system SHOULD transmit the
+ form that uses the minimum quoting possible.
+
+ Systems MUST NOT define mailboxes in such a way as to require the use
+ in SMTP of non-ASCII characters (octets with the high order bit set
+ to one) or ASCII "control characters" (decimal value 0-31 and 127).
+ These characters MUST NOT be used in MAIL or RCPT commands or other
+ commands that require mailbox names.
+
+ Note that the backslash, "\", is a quote character, which is used to
+ indicate that the next character is to be used literally (instead of
+ its normal interpretation). For example, "Joe\,Smith" indicates a
+ single nine character user field with the comma being the fourth
+ character of the field.
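+
+ The quoting rule can be checked with a short Python sketch (the
+ function name is hypothetical):

```python
def unquote(text):
    """Remove backslash quoting: each backslash makes the next character literal."""
    out = []
    chars = iter(text)
    for ch in chars:
        if ch == "\\":
            ch = next(chars, None)
            if ch is None:
                raise ValueError("dangling backslash")
        out.append(ch)
    return "".join(out)
```

+ Applied to the example above, the result has nine characters and
+ the comma is the fourth.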
+
+ To promote interoperability and consistent with long-standing
+ guidance about conservative use of the DNS in naming and applications
+ (e.g., see section 2.3.1 of the base DNS document, RFC 1035 [22]),
+ characters outside the set of alphas, digits, and hyphen MUST NOT
+ appear in domain name labels for SMTP clients or servers. In
+ particular, the underscore character is not permitted. SMTP servers
+ that receive a command in which invalid character codes have been
+ employed, and for which there are no other reasons for rejection,
+ MUST reject that command with a 501 response.
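+
+ A server might test a domain name against this rule as in the
+ following Python sketch. The function name is hypothetical; the
+ pattern follows the sub-domain production of section 4.1.2, and
+ address literals are deliberately out of scope. A False result
+ would correspond to the 501 response described above.

```python
import re

# sub-domain = Let-dig [Ldh-str]: letters, digits, and hyphens only,
# beginning and ending with a letter or digit.
_LABEL = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]*[A-Za-z0-9])?")


def is_valid_domain(domain):
    """True if every label uses only letters, digits, and hyphens."""
    labels = domain.split(".")
    # Domain = sub-domain 1*("." sub-domain): at least two labels.
    return len(labels) >= 2 and all(
        _LABEL.fullmatch(label) is not None for label in labels)
```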
+
+4.1.3 Address Literals
+
+ Sometimes a host is not known to the domain name system and
+ communication (and, in particular, communication to report and repair
+ the error) is blocked. To bypass this barrier a special literal form
+ of the address is allowed as an alternative to a domain name. For
+ IPv4 addresses, this form uses four small decimal integers separated
+ by dots and enclosed by brackets such as [123.255.37.2], which
+ indicates an (IPv4) Internet Address in sequence-of-octets form. For
+ IPv6 and other forms of addressing that might eventually be
+ standardized, the form consists of a standardized "tag" that
+ identifies the address syntax, a colon, and the address itself, in a
+ format specified as part of the IPv6 standards [17].
+
+ Specifically:
+
+ IPv4-address-literal = Snum 3("." Snum)
+ IPv6-address-literal = "IPv6:" IPv6-addr
+ General-address-literal = Standardized-tag ":" 1*dcontent
+ Standardized-tag = Ldh-str
+ ; MUST be specified in a standards-track RFC
+ ; and registered with IANA
+
+ Snum = 1*3DIGIT ; representing a decimal integer
+ ; value in the range 0 through 255
+ Let-dig = ALPHA / DIGIT
+ Ldh-str = *( ALPHA / DIGIT / "-" ) Let-dig
+
+ IPv6-addr = IPv6-full / IPv6-comp / IPv6v4-full / IPv6v4-comp
+ IPv6-hex = 1*4HEXDIG
+ IPv6-full = IPv6-hex 7(":" IPv6-hex)
+ IPv6-comp = [IPv6-hex *5(":" IPv6-hex)] "::" [IPv6-hex *5(":"
+ IPv6-hex)]
+ ; The "::" represents at least 2 16-bit groups of zeros
+ ; No more than 6 groups in addition to the "::" may be
+ ; present
+ IPv6v4-full = IPv6-hex 5(":" IPv6-hex) ":" IPv4-address-literal
+ IPv6v4-comp = [IPv6-hex *3(":" IPv6-hex)] "::"
+ [IPv6-hex *3(":" IPv6-hex) ":"] IPv4-address-literal
+ ; The "::" represents at least 2 16-bit groups of zeros
+ ; No more than 4 groups in addition to the "::" and
+ ; IPv4-address-literal may be present
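+
+ For illustration, the IPv4 and IPv6 forms can be parsed with
+ Python's standard ipaddress module, as sketched below. The function
+ name is hypothetical. This is an approximation rather than an
+ implementation of the grammar: the module's accepted syntax differs
+ from the ABNF above in some edge cases, and general address
+ literals with other tags are simply rejected.

```python
import ipaddress


def parse_address_literal(literal):
    """Parse '[a.b.c.d]' or '[IPv6:...]' into an ipaddress object."""
    if not (literal.startswith("[") and literal.endswith("]")):
        raise ValueError("address literal must be enclosed in brackets")
    inner = literal[1:-1]
    # ABNF literal strings such as "IPv6:" match case-insensitively.
    if inner[:5].upper() == "IPV6:":
        return ipaddress.IPv6Address(inner[5:])
    return ipaddress.IPv4Address(inner)
```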
+
+4.1.4 Order of Commands
+
+ There are restrictions on the order in which these commands may be
+ used.
+
+ A session that will contain mail transactions MUST first be
+ initialized by the use of the EHLO command. An SMTP server SHOULD
+ accept commands for non-mail transactions (e.g., VRFY or EXPN)
+ without this initialization.
+
+ An EHLO command MAY be issued by a client later in the session. If
+ it is issued after the session begins, the SMTP server MUST clear all
+ buffers and reset the state exactly as if a RSET command had been
+ issued. In other words, the sequence of RSET followed immediately by
+ EHLO is redundant, but not harmful other than in the performance cost
+ of executing unnecessary commands.
+
+ If the EHLO command is not acceptable to the SMTP server, 501, 500,
+ or 502 failure replies MUST be returned as appropriate. The SMTP
+ server MUST stay in the same state after transmitting these replies
+ that it was in before the EHLO was received.
+
+ The SMTP client MUST, if possible, ensure that the domain parameter
+ to the EHLO command is a valid principal host name (not a CNAME or MX
+ name) for its host. If this is not possible (e.g., when the client's
+ address is dynamically assigned and the client does not have an
+ obvious name), an address literal SHOULD be substituted for the
+ domain name and supplemental information provided that will assist in
+ identifying the client.
+
+ An SMTP server MAY verify that the domain name parameter in the EHLO
+ command actually corresponds to the IP address of the client.
+ However, the server MUST NOT refuse to accept a message for this
+ reason if the verification fails: the information about verification
+ failure is for logging and tracing only.
+
+ The NOOP, HELP, EXPN, VRFY, and RSET commands can be used at any time
+ during a session, or without previously initializing a session. SMTP
+ servers SHOULD process these normally (that is, not return a 503
+ code) even if no EHLO command has yet been received; clients SHOULD
+ open a session with EHLO before sending these commands.
+
+ If these rules are followed, the example in RFC 821 that shows "550
+ access denied to you" in response to an EXPN command is incorrect
+ unless an EHLO command precedes the EXPN or the denial of access is
+ based on the client's IP address or other authentication or
+ authorization-determining mechanisms.
+
+ The MAIL command (or the obsolete SEND, SOML, or SAML commands)
+ begins a mail transaction. Once started, a mail transaction consists
+ of a transaction beginning command, one or more RCPT commands, and a
+ DATA command, in that order. A mail transaction may be aborted by
+ the RSET (or a new EHLO) command. There may be zero or more
+ transactions in a session. MAIL (or SEND, SOML, or SAML) MUST NOT be
+ sent if a mail transaction is already open, i.e., it should be sent
+ only if no mail transaction had been started in the session, or if
+ the previous one successfully concluded with a successful DATA
+ command, or if the previous one was aborted with a RSET.
+
+ If the transaction beginning command argument is not acceptable, a
+ 501 failure reply MUST be returned and the SMTP server MUST stay in
+ the same state. If the commands in a transaction are out of order to
+ the degree that they cannot be processed by the server, a 503 failure
+ reply MUST be returned and the SMTP server MUST stay in the same
+ state.
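+
+ The ordering rules above can be sketched as a small state machine.
+ This is a minimal illustration, not a server: the class name
+ SmtpSession, its attributes, and the end_of_data() helper are
+ hypothetical, arguments are never parsed (so 501 is never produced),
+ and accepting HELO as a session opener is an assumption.
+

```python
class SmtpSession:
    """Tracks command ordering only; no parsing, I/O, or delivery."""

    def __init__(self):
        self.greeted = False         # EHLO (or HELO) has been accepted
        self.in_transaction = False  # MAIL has been accepted
        self.recipients = 0          # number of accepted RCPT commands

    def _clear_transaction(self):
        self.in_transaction = False
        self.recipients = 0

    def handle(self, verb):
        """Return the reply code for a command verb, updating state."""
        verb = verb.upper()
        if verb in ("EHLO", "HELO"):
            # A later EHLO resets state exactly as RSET would.
            self.greeted = True
            self._clear_transaction()
            return 250
        if verb == "RSET":
            self._clear_transaction()
            return 250
        if verb == "NOOP":
            return 250
        if verb == "QUIT":
            return 221
        if verb == "MAIL":
            if not self.greeted or self.in_transaction:
                return 503  # out of order; state is left unchanged
            self.in_transaction = True
            return 250
        if verb == "RCPT":
            if not self.in_transaction:
                return 503
            self.recipients += 1
            return 250
        if verb == "DATA":
            if not self.in_transaction or self.recipients == 0:
                return 503
            return 354
        raise ValueError(f"verb not covered by this sketch: {verb}")

    def end_of_data(self):
        """Call after <CRLF>.<CRLF>; assumes the message was accepted."""
        self._clear_transaction()
        return 250
```

+
+ A second MAIL sent while a transaction is open returns 503 and leaves
+ the open transaction untouched, which matches the requirement that
+ the server stay in the same state after a failure reply.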
+
+ The last command in a session MUST be the QUIT command. The QUIT
+ command cannot be used at any other time in a session, but SHOULD be
+ used by the client SMTP to request connection closure, even when no
+ session opening command was sent and accepted.
+
+4.1.5 Private-use Commands
+
+ As specified in section 2.2.2, commands starting in "X" may be used
+ by bilateral agreement between the client (sending) and server
+ (receiving) SMTP agents. An SMTP server that does not recognize such
+ a command is expected to reply with "500 Command not recognized". An
+ extended SMTP server MAY list the feature names associated with these
+ private commands in the response to the EHLO command.
+
+ Commands sent or accepted by SMTP systems that do not start with "X"
+ MUST conform to the requirements of section 2.2.2.
+
+4.2 SMTP Replies
+
+ Replies to SMTP commands serve to ensure the synchronization of
+ requests and actions in the process of mail transfer and to guarantee
+ that the SMTP client always knows the state of the SMTP server.
+ Every command MUST generate exactly one reply.
+ The details of the command-reply sequence are described in section
+ 4.3.
+
+ An SMTP reply consists of a three digit number (transmitted as three
+ numeric characters) followed by some text unless specified otherwise
+ in this document. The number is for use by automata to determine
+ what state to enter next; the text is for the human user. The three
+ digits contain enough encoded information that the SMTP client need
+ not examine the text and may either discard it or pass it on to the
+ user, as appropriate. Exceptions are as noted elsewhere in this
+ document. In particular, the 220, 221, 251, 421, and 551 reply codes
+ are associated with message text that must be parsed and interpreted
+ by machines. In the general case, the text may be receiver dependent
+ and context dependent, so there are likely to be varying texts for
+ each reply code. A discussion of the theory of reply codes is given
+ in section 4.2.1. Formally, a reply is defined to be the sequence: a
+ three-digit code, <SP>, one line of text, and <CRLF>, or a multiline
+ reply (as defined in section 4.2.1). Since, in violation of this
+ specification, the text is sometimes not sent, clients which do not
+ receive it SHOULD be prepared to process the code alone (with or
+ without a trailing space character). Only the EHLO, EXPN, and HELP
+ commands are expected to result in multiline replies in normal
+ circumstances; however, multiline replies are allowed for any
+ command.
+
+ In ABNF, server responses are:
+
+ Greeting = "220 " Domain [ SP text ] CRLF
+ Reply-line = Reply-code [ SP text ] CRLF
+
+ where "Greeting" appears only in the 220 response that announces that
+ the server is opening its part of the connection.
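+
+ As an illustration of the Greeting rule, a client might split a
+ single-line 220 greeting as sketched here. The function name
+ parse_greeting is hypothetical, multiline ("220-") greetings are not
+ handled, and the first word is returned without checking it against
+ the Domain syntax.
+

```python
def parse_greeting(line):
    """Split a one-line 220 greeting into (domain, text)."""
    if not line.startswith("220 "):
        raise ValueError("not a single-line 220 greeting")
    rest = line[4:].rstrip("\r\n")
    # The first word is the server's name; the text after it is optional.
    domain, _, text = rest.partition(" ")
    if not domain:
        raise ValueError("greeting carries no domain")
    return domain, text
```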
+
+ An SMTP server SHOULD send only the reply codes listed in this
+ document. An SMTP server SHOULD use the text shown in the examples
+ whenever appropriate.
+
+ An SMTP client MUST determine its actions only by the reply code, not
+ by the text (except for the "change of address" 251 and 551 and, if
+ necessary, 220, 221, and 421 replies); in the general case, any text,
+ including no text at all (although senders SHOULD NOT send bare
+ codes), MUST be acceptable. The space (blank) following the reply
+ code is considered part of the text. Whenever possible, a
+ sender-SMTP SHOULD test the first digit (severity indication) of the
+ reply code.
+
+ The list of codes that appears below MUST NOT be construed as
+ permanent. While the addition of new codes should be a rare and
+ significant activity, with supplemental information in the textual
+ part of the response being preferred, new codes may be added as the
+ result of new Standards or Standards-track specifications.
+ Consequently, a sender-SMTP MUST be prepared to handle codes not
+ specified in this document and MUST do so by interpreting the first
+ digit only.
+
+4.2.1 Reply Code Severities and Theory
+
+ The three digits of the reply each have a special significance. The
+ first digit denotes whether the response is good, bad or incomplete.
+ An unsophisticated SMTP client, or one that receives an unexpected
+ code, will be able to determine its next action (proceed as planned,
+ redo, retrench, etc.) by examining this first digit. An SMTP client
+ that wants to know approximately what kind of error occurred (e.g.,
+ mail system error, command syntax error) may examine the second
+ digit. The third digit and any supplemental information that may be
+ present is reserved for the finest gradation of information.
+
+ There are five values for the first digit of the reply code:
+
+ 1yz Positive Preliminary reply
+ The command has been accepted, but the requested action is being
+ held in abeyance, pending confirmation of the information in this
+ reply. The SMTP client should send another command specifying
+ whether to continue or abort the action. Note: unextended SMTP
+ does not have any commands that allow this type of reply, and so
+ does not have continue or abort commands.
+
+ 2yz Positive Completion reply
+ The requested action has been successfully completed. A new
+ request may be initiated.
+
+ 3yz Positive Intermediate reply
+ The command has been accepted, but the requested action is being
+ held in abeyance, pending receipt of further information. The
+ SMTP client should send another command specifying this
+ information. This reply is used in command sequence groups (i.e.,
+ in DATA).
+
+ 4yz Transient Negative Completion reply
+ The command was not accepted, and the requested action did not
+ occur. However, the error condition is temporary and the action
+ may be requested again. The sender should return to the beginning
+ of the command sequence (if any). It is difficult to assign a
+ meaning to "transient" when two different sites (receiver- and
+ sender-SMTP agents) must agree on the interpretation. Each reply
+ in this category might have a different time value, but the SMTP
+ client is encouraged to try again. A rule of thumb to determine
+ whether a reply fits into the 4yz or the 5yz category (see below)
+ is that replies are 4yz if they can be successful if repeated
+ without any change in command form or in properties of the sender
+ or receiver (that is, the command is repeated identically and the
+ receiver does not put up a new implementation.)
+
+ 5yz Permanent Negative Completion reply
+ The command was not accepted and the requested action did not
+ occur. The SMTP client is discouraged from repeating the exact
+ request (in the same sequence). Even some "permanent" error
+ conditions can be corrected, so the human user may want to direct
+ the SMTP client to reinitiate the command sequence by direct
+ action at some point in the future (e.g., after the spelling has
+ been changed, or the user has altered the account status).
+
+ The second digit encodes responses in specific categories:
+
+ x0z Syntax: These replies refer to syntax errors, syntactically
+ correct commands that do not fit any functional category, and
+ unimplemented or superfluous commands.
+
+ x1z Information: These are replies to requests for information,
+ such as status or help.
+
+ x2z Connections: These are replies referring to the transmission
+ channel.
+
+ x3z Unspecified.
+
+ x4z Unspecified.
+
+ x5z Mail system: These replies indicate the status of the receiver
+ mail system vis-a-vis the requested transfer or other mail system
+ action.
+
+ The third digit gives a finer gradation of meaning in each category
+ specified by the second digit. The list of replies illustrates this.
+ Each reply text is recommended rather than mandatory, and may even
+ change according to the command with which it is associated. On the
+ other hand, the reply codes must strictly follow the specifications
+ in this section. Receiver implementations should not invent new
+ codes for slightly different situations from the ones described here,
+ but rather adapt codes already defined.
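+
+ The first-digit and second-digit tables above translate directly
+ into a lookup. The sketch below is only an illustration; the names
+ SEVERITY, CATEGORY, and classify are hypothetical, and the "unknown"
+ label for second digits 6 through 9 is an assumption, since the
+ tables above do not assign them.
+

```python
SEVERITY = {
    "1": "positive preliminary",
    "2": "positive completion",
    "3": "positive intermediate",
    "4": "transient negative completion",
    "5": "permanent negative completion",
}

CATEGORY = {
    "0": "syntax",
    "1": "information",
    "2": "connections",
    "3": "unspecified",
    "4": "unspecified",
    "5": "mail system",
}


def classify(code):
    """Return (severity, category) for a three-digit reply code."""
    text = str(code)
    if len(text) != 3 or not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a three-digit reply code: {code!r}")
    if text[0] not in SEVERITY:
        raise ValueError(f"first digit outside 1-5: {code!r}")
    # An unrecognized code is still usable: only the first digit is needed.
    return SEVERITY[text[0]], CATEGORY.get(text[1], "unknown")
```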
+
+ For example, a command such as NOOP, whose successful execution does
+ not offer the SMTP client any new information, will return a 250
+ reply. The reply is 502 when the command requests an unimplemented
+ non-site-specific action. A refinement of that is the 504 reply for
+ a command that is implemented, but that requests an unimplemented
+ parameter.
+
+ The reply text may be longer than a single line; in these cases the
+ complete text must be marked so the SMTP client knows when it can
+ stop reading the reply. This requires a special format to indicate a
+ multiple line reply.
+
+ The format for multiline replies requires that every line, except the
+ last, begin with the reply code, followed immediately by a hyphen,
+ "-" (also known as minus), followed by text. The last line will
+ begin with the reply code, followed immediately by <SP>, optionally
+ some text, and <CRLF>. As noted above, servers SHOULD send the <SP>
+ if subsequent text is not sent, but clients MUST be prepared for it
+ to be omitted.
+
+ For example:
+
+ 123-First line
+ 123-Second line
+ 123-234 text beginning with numbers
+ 123 The last line
+
+ In many cases the SMTP client then simply needs to search for a line
+ beginning with the reply code followed by <SP> or <CRLF> and ignore
+ all preceding lines. In a few cases, there is important data for the
+ client in the reply "text". The client will be able to identify
+ these cases from the current context.
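+
+ A client-side reader for this format can be sketched as follows. It
+ is a minimal illustration that assumes the lines have already been
+ split on <CRLF>; the name parse_reply is hypothetical. A bare code
+ with no <SP> is accepted as a final line, as the text above requires
+ of clients.
+

```python
import re

# Reply code, then optionally "-" (more lines follow) or " " (last line).
_LINE = re.compile(r"([0-9]{3})(?:([ -])(.*))?")


def parse_reply(lines):
    """Return (code, texts) for one reply given as lines without CRLF."""
    code = None
    texts = []
    for line in lines:
        match = _LINE.fullmatch(line)
        if match is None:
            raise ValueError(f"malformed reply line: {line!r}")
        line_code, separator, text = match.groups()
        if code is None:
            code = line_code
        elif line_code != code:
            raise ValueError("reply code changed within a multiline reply")
        texts.append(text or "")
        if separator != "-":
            # <SP> or a bare code marks the last line of the reply.
            return int(code), texts
    raise ValueError("reply ended without a final line")
```

+
+ Any lines supplied after the final line are ignored by this sketch;
+ a real client would leave them in its input buffer for the next
+ reply.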
+
+4.2.2 Reply Codes by Function Groups
+
+ 500 Syntax error, command unrecognized
+ (This may include errors such as command line too long)
+ 501 Syntax error in parameters or arguments
+ 502 Command not implemented (see section 4.2.4)
+ 503 Bad sequence of commands
+ 504 Command parameter not implemented
+
+ 211 System status, or system help reply
+ 214 Help message
+ (Information on how to use the receiver or the meaning of a
+ particular non-standard command; this reply is useful only
+ to the human user)
+
+ 220 <domain> Service ready
+ 221 <domain> Service closing transmission channel
+ 421 <domain> Service not available, closing transmission channel
+ (This may be a reply to any command if the service knows it
+ must shut down)
+
+ 250 Requested mail action okay, completed
+ 251 User not local; will forward to <forward-path>
+ (See section 3.4)
+ 252 Cannot VRFY user, but will accept message and attempt
+ delivery
+ (See section 3.5.3)
+ 450 Requested mail action not taken: mailbox unavailable
+ (e.g., mailbox busy)
+ 550 Requested action not taken: mailbox unavailable
+ (e.g., mailbox not found, no access, or command rejected
+ for policy reasons)
+ 451 Requested action aborted: local error in processing
+ 551 User not local; please try <forward-path>
+ (See section 3.4)
+ 452 Requested action not taken: insufficient system storage
+ 552 Requested mail action aborted: exceeded storage allocation
+ 553 Requested action not taken: mailbox name not allowed
+ (e.g., mailbox syntax incorrect)
+ 354 Start mail input; end with <CRLF>.<CRLF>
+ 554 Transaction failed (Or, in the case of a connection-opening
+ response, "No SMTP service here")
+
+4.2.3 Reply Codes in Numeric Order
+
+ 211 System status, or system help reply
+ 214 Help message
+ (Information on how to use the receiver or the meaning of a
+ particular non-standard command; this reply is useful only
+ to the human user)
+ 220 <domain> Service ready
+ 221 <domain> Service closing transmission channel
+ 250 Requested mail action okay, completed
+ 251 User not local; will forward to <forward-path>
+ (See section 3.4)
+ 252 Cannot VRFY user, but will accept message and attempt
+ delivery
+ (See section 3.5.3)
+
+ 354 Start mail input; end with <CRLF>.<CRLF>
+
+ 421 <domain> Service not available, closing transmission channel
+ (This may be a reply to any command if the service knows it
+ must shut down)
+ 450 Requested mail action not taken: mailbox unavailable
+ (e.g., mailbox busy)
+ 451 Requested action aborted: local error in processing
+ 452 Requested action not taken: insufficient system storage
+ 500 Syntax error, command unrecognized
+ (This may include errors such as command line too long)
+ 501 Syntax error in parameters or arguments
+ 502 Command not implemented (see section 4.2.4)
+ 503 Bad sequence of commands
+ 504 Command parameter not implemented
+ 550 Requested action not taken: mailbox unavailable
+ (e.g., mailbox not found, no access, or command rejected
+ for policy reasons)
+ 551 User not local; please try <forward-path>
+ (See section 3.4)
+ 552 Requested mail action aborted: exceeded storage allocation
+ 553 Requested action not taken: mailbox name not allowed
+ (e.g., mailbox syntax incorrect)
+ 554 Transaction failed (Or, in the case of a connection-opening
+ response, "No SMTP service here")
+
+4.2.4 Reply Code 502
+
+ Questions have been raised as to when reply code 502 (Command not
+ implemented) SHOULD be returned in preference to other codes. 502
+ SHOULD be used when the command is actually recognized by the SMTP
+ server, but not implemented. If the command is not recognized, code
+ 500 SHOULD be returned. Extended SMTP systems MUST NOT list
+ capabilities in response to EHLO for which they will return 502 (or
+ 500) replies.
+
+4.2.5 Reply Codes After DATA and the Subsequent <CRLF>.<CRLF>
+
+ When an SMTP server returns a positive completion status (2yz code)
+ after the DATA command is completed with <CRLF>.<CRLF>, it accepts
+ responsibility for:
+
+ - delivering the message (if the recipient mailbox exists), or
+
+ - if attempts to deliver the message fail due to transient
+ conditions, retrying delivery some reasonable number of times at
+ intervals as specified in section 4.5.4.
+
+ - if attempts to deliver the message fail due to permanent
+ conditions, or if repeated attempts to deliver the message fail
+ due to transient conditions, returning appropriate notification to
+ the sender of the original message (using the address in the SMTP
+ MAIL command).
+
+ When an SMTP server returns a transient error status (4yz) code after
+ the DATA command is completed with <CRLF>.<CRLF>, it MUST NOT make
+ any subsequent attempt to deliver that message. The SMTP client
+ retains responsibility for delivery of that message and may either
+ return it to the user or requeue it for a subsequent attempt (see
+ section 4.5.4.1).
+
+ The user who originated the message SHOULD be able to interpret the
+ return of a transient failure status (by mail message or otherwise)
+ as a non-delivery indication, just as a permanent failure would be
+ interpreted. I.e., if the client SMTP successfully handles these
+ conditions, the user will not receive such a reply.
+
+ When an SMTP server returns a permanent error status (5yz) code after
+ the DATA command is completed with <CRLF>.<CRLF>, it MUST NOT make
+ any subsequent attempt to deliver the message. As with temporary
+ error status codes, the SMTP client retains responsibility for the
+ message, but SHOULD NOT again attempt delivery to the same server
+ without user review of the message and appropriate intervention.
+
+4.3 Sequencing of Commands and Replies
+
+4.3.1 Sequencing Overview
+
+ The communication between the sender and receiver is an alternating
+ dialogue, controlled by the sender. As such, the sender issues a
+ command and the receiver responds with a reply. Unless other
+ arrangements are negotiated through service extensions, the sender
+ MUST wait for this response before sending further commands.
+
+ One important reply is the connection greeting. Normally, a receiver
+ will send a 220 "Service ready" reply when the connection is
+ completed. The sender SHOULD wait for this greeting message before
+ sending any commands.
+
+ Note: all the greeting-type replies have the official name (the
+ fully-qualified primary domain name) of the server host as the first
+ word following the reply code. Sometimes the host will have no
+ meaningful name. See 4.1.3 for a discussion of alternatives in these
+ situations.
+
+ For example,
+
+ 220 ISIF.USC.EDU Service ready
+ or
+ 220 mail.foo.com SuperSMTP v 6.1.2 Service ready
+ or
+ 220 [10.0.0.1] Clueless host service ready
+
+ The table below lists alternative success and failure replies for
+ each command. These SHOULD be strictly adhered to: a receiver may
+ substitute text in the replies, but the meaning and action implied by
+ the code numbers and by the specific command reply sequence cannot be
+ altered.
+
+4.3.2 Command-Reply Sequences
+
+ Each command is listed with its usual possible replies. The prefixes
+ used before the possible replies are "I" for intermediate, "S" for
+ success, and "E" for error. Since some servers may generate other
+ replies under special circumstances, and to allow for future
+ extension, SMTP clients SHOULD, when possible, interpret only the
+ first digit of the reply and MUST be prepared to deal with
+ unrecognized reply codes by interpreting the first digit only.
+ Unless extended using the mechanisms described in section 2.2, SMTP
+ servers MUST NOT transmit reply codes to an SMTP client that are
+ other than three digits or that do not start in a digit between 2 and
+ 5 inclusive.
+
+ These sequencing rules and, in principle, the codes themselves, can
+ be extended or modified by SMTP extensions offered by the server and
+ accepted (requested) by the client.
+
+ In addition to the codes listed below, any SMTP command can return
+ any of the following codes if the corresponding unusual circumstances
+ are encountered:
+
+ 500 For the "command line too long" case or if the command name was
+ not recognized. Note that producing a "command not recognized"
+ error in response to the required subset of these commands is a
+ violation of this specification.
+
+ 501 Syntax error in command or arguments. In order to provide for
+ future extensions, commands that are specified in this document as
+ not accepting arguments (DATA, RSET, QUIT) SHOULD return a 501
+ message if arguments are supplied in the absence of EHLO-
+ advertised extensions.
+
+ 421 Service shutting down and closing transmission channel
+
+ Specific sequences are:
+
+ CONNECTION ESTABLISHMENT
+ S: 220
+ E: 554
+ EHLO or HELO
+ S: 250
+ E: 504, 550
+ MAIL
+ S: 250
+ E: 552, 451, 452, 550, 553, 503
+ RCPT
+ S: 250, 251 (but see section 3.4 for discussion of 251 and 551)
+ E: 550, 551, 552, 553, 450, 451, 452, 503
+ DATA
+ I: 354 -> data -> S: 250
+                   E: 552, 554, 451, 452
+ E: 451, 554, 503
+ RSET
+ S: 250
+ VRFY
+ S: 250, 251, 252
+ E: 550, 551, 553, 502, 504
+ EXPN
+ S: 250, 252
+ E: 550, 500, 502, 504
+ HELP
+ S: 211, 214
+ E: 502, 504
+ NOOP
+ S: 250
+ QUIT
+ S: 221
+
+4.4 Trace Information
+
+ When an SMTP server receives a message for delivery or further
+ processing, it MUST insert trace ("time stamp" or "Received")
+ information at the beginning of the message content, as discussed in
+ section 4.1.1.4.
+
+ This line MUST be structured as follows:
+
+ - The FROM field, which MUST be supplied in an SMTP environment,
+ SHOULD contain both (1) the name of the source host as presented
+ in the EHLO command and (2) an address literal containing the IP
+ address of the source, determined from the TCP connection.
+
+ - The ID field MAY contain an "@" as suggested in RFC 822, but this
+ is not required.
+
+ - The FOR field MAY contain a list of <path> entries when multiple
+ RCPT commands have been given. This may raise some security
+ issues and is usually not desirable; see section 7.2.
+
+ An Internet mail program MUST NOT change a Received: line that was
+ previously added to the message header. SMTP servers MUST prepend
+ Received lines to messages; they MUST NOT change the order of
+ existing lines or insert Received lines in any other location.
+
+ As the Internet grows, comparability of Received fields is important
+ for detecting problems, especially slow relays. SMTP servers that
+ create Received fields SHOULD use explicit offsets in the dates
+ (e.g., -0800), rather than time zone names of any type. Local time
+ (with an offset) is preferred to UT when feasible. This formulation
+ allows slightly more information about local circumstances to be
+ specified. If UT is needed, the receiver need merely do some simple
+ arithmetic to convert the values. Use of UT loses information about
+ the time zone-location of the server. If it is desired to supply a
+ time zone name, it SHOULD be included in a comment.
+
+ When the delivery SMTP server makes the "final delivery" of a
+ message, it inserts a return-path line at the beginning of the mail
+ data. This use of return-path is required; mail systems MUST support
+ it. The return-path line preserves the information in the <reverse-
+ path> from the MAIL command. Here, final delivery means the message
+ has left the SMTP environment. Normally, this would mean it had been
+ delivered to the destination user or an associated mail drop, but in
+ some cases it may be further processed and transmitted by another
+ mail system.
+
+ It is possible for the mailbox in the return path to be different
+ from the actual sender's mailbox, for example, if error responses are
+ to be delivered to a special error handling mailbox rather than to
+ the message sender. When mailing lists are involved, this
+ arrangement is common and useful as a means of directing errors to
+ the list maintainer rather than the message originator.
+
+ The text above implies that the final mail data will begin with a
+ return path line, followed by one or more time stamp lines. These
+ lines will be followed by the mail data headers and body [32].
+
+ It is sometimes difficult for an SMTP server to determine whether or
+ not it is making final delivery since forwarding or other operations
+ may occur after the message is accepted for delivery. Consequently,
+ any further (forwarding, gateway, or relay) systems MAY remove the
+ return path and rebuild the MAIL command as needed to ensure that
+ exactly one such line appears in a delivered message.
+
+ A message-originating SMTP system SHOULD NOT send a message that
+ already contains a Return-path header. SMTP servers performing a
+ relay function MUST NOT inspect the message data, and especially not
+ to the extent needed to determine if Return-path headers are present.
+ SMTP servers making final delivery MAY remove Return-path headers
+ before adding their own.
+
+ The primary purpose of the Return-path is to designate the address to
+ which messages indicating non-delivery or other mail system failures
+ are to be sent. For this to be unambiguous, exactly one return path
+ SHOULD be present when the message is delivered. Systems using RFC
+ 822 syntax with non-SMTP transports SHOULD designate an unambiguous
+ address, associated with the transport envelope, to which error
+ reports (e.g., non-delivery messages) should be sent.
+
+ Historical note: Text in RFC 822 that appears to contradict the use
+ of the Return-path header (or the envelope reverse path address from
+ the MAIL command) as the destination for error messages is not
+ applicable on the Internet. The reverse path address (as copied into
+ the Return-path) MUST be used as the target of any mail containing
+ delivery error messages.
+
+ In particular:
+
+ - a gateway from SMTP->elsewhere SHOULD insert a return-path header,
+ unless it is known that the "elsewhere" transport also uses
+ Internet domain addresses and maintains the envelope sender
+ address separately.
+
+ - a gateway from elsewhere->SMTP SHOULD delete any return-path
+ header present in the message, and either copy that information to
+ the SMTP envelope or combine it with information present in the
+ envelope of the other transport system to construct the reverse
+ path argument to the MAIL command in the SMTP envelope.
+
+ The server must give special treatment to cases in which the
+ processing following the end of mail data indication is only
+ partially successful. This could happen if, after accepting several
+ recipients and the mail data, the SMTP server finds that the mail
+ data could be successfully delivered to some, but not all, of the
+ recipients. In such cases, the response to the DATA command MUST be
+ an OK reply. However, the SMTP server MUST compose and send an
+ "undeliverable mail" notification message to the originator of the
+ message.
+ Either a single notification listing all of the failed recipients or
+ a separate notification message for each failed recipient MUST be
+ sent. For economy of processing by the sender, the former is
+ preferred when possible. All undeliverable mail notification
+ messages are sent using the MAIL command (even if they result from
+ processing the obsolete SEND, SOML, or SAML commands) and use a null
+ return path as discussed in section 3.7.
+
+ The time stamp line and the return path line are formally defined as
+ follows:
+
+Return-path-line = "Return-Path:" FWS Reverse-path <CRLF>
+
+Time-stamp-line = "Received:" FWS Stamp <CRLF>
+
+Stamp = From-domain By-domain Opt-info ";" FWS date-time
+
+ ; where "date-time" is as defined in [32]
+ ; but the "obs-" forms, especially two-digit
+ ; years, are prohibited in SMTP and MUST NOT be used.
+
+From-domain = "FROM" FWS Extended-Domain CFWS
+
+By-domain = "BY" FWS Extended-Domain CFWS
+
+Extended-Domain = Domain /
+ ( Domain FWS "(" TCP-info ")" ) /
+ ( Address-literal FWS "(" TCP-info ")" )
+
+TCP-info = Address-literal / ( Domain FWS Address-literal )
+ ; Information derived by server from TCP connection
+ ; not client EHLO.
+
+Opt-info = [Via] [With] [ID] [For]
+
+Via = "VIA" FWS Link CFWS
+
+With = "WITH" FWS Protocol CFWS
+
+ID = "ID" FWS String / msg-id CFWS
+
+For = "FOR" FWS 1*( Path / Mailbox ) CFWS
+
+Link = "TCP" / Addtl-Link
+Addtl-Link = Atom
+ ; Additional standard names for links are registered with the
+ ; Internet Assigned Numbers Authority (IANA). "Via" is
+ ; primarily of value with non-Internet transports. SMTP
+ ; servers SHOULD NOT use unregistered names.
+Protocol = "ESMTP" / "SMTP" / Addtl-Protocol
+Addtl-Protocol = Atom
+ ; Additional standard names for protocols are registered with the
+ ; Internet Assigned Numbers Authority (IANA). SMTP servers
+ ; SHOULD NOT use unregistered names.
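+
+ As an illustration of the Stamp syntax and of the advice to use
+ explicit numeric offsets, a time stamp line might be built as in the
+ sketch below. The function name received_line and the host names in
+ the usage are hypothetical. It assumes an IPv4 client address, emits
+ only the From, By, and With clauses, and does no line folding.
+

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime


def received_line(helo_name, client_ip, by_domain, when, protocol="ESMTP"):
    """Build an unfolded Received: line with a numeric zone offset."""
    if when.tzinfo is None:
        raise ValueError("an aware datetime is needed for a numeric offset")
    # ABNF literals are case-insensitive, so "from" and "by" match the
    # "FROM" and "BY" tokens of the grammar.
    return (
        f"Received: from {helo_name} ([{client_ip}]) "
        f"by {by_domain} with {protocol}; {format_datetime(when)}"
    )


pacific = timezone(timedelta(hours=-8))
stamp = received_line(
    "client.example", "192.0.2.1", "mx.example.net",
    datetime(2001, 4, 1, 12, 0, 0, tzinfo=pacific),
)
```

+
+ Passing local time with its offset, rather than converting to UT
+ first, keeps the time zone information that the text above says
+ would otherwise be lost.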
+
+4.5 Additional Implementation Issues
+
+4.5.1 Minimum Implementation
+
+ In order to make SMTP workable, the following minimum implementation
+ is required for all receivers. The following commands MUST be
+ supported to conform to this specification:
+
+ EHLO
+ HELO
+ MAIL
+ RCPT
+ DATA
+ RSET
+ NOOP
+ QUIT
+ VRFY
+
+ Any system that includes an SMTP server supporting mail relaying or
+ delivery MUST support the reserved mailbox "postmaster" as a case-
+ insensitive local name. This postmaster address is not strictly
+ necessary if the server always returns 554 on connection opening (as
+ described in section 3.1). The requirement to accept mail for
+ postmaster implies that RCPT commands which specify a mailbox for
+ postmaster at any of the domains for which the SMTP server provides
+ mail service, as well as the special case of "RCPT TO:<Postmaster>"
+ (with no domain specification), MUST be supported.
+
+ SMTP systems are expected to make every reasonable effort to accept
+ mail directed to Postmaster from any other system on the Internet.
+ In extreme cases --such as to contain a denial of service attack or
+ other breach of security-- an SMTP server may block mail directed to
+ Postmaster. However, such arrangements SHOULD be narrowly tailored
+ so as to avoid blocking messages which are not part of such attacks.
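+
+ The postmaster requirement amounts to a case-insensitive match on
+ the local part, with or without a domain. The following is a minimal
+ sketch; the name is_postmaster and the local_domains parameter are
+ hypothetical, and source routes and quoted local parts are not
+ handled.
+

```python
def is_postmaster(path, local_domains):
    """True if a RCPT path names postmaster at a domain served here."""
    mailbox = path.strip()
    if mailbox.startswith("<") and mailbox.endswith(">"):
        mailbox = mailbox[1:-1]
    local, at, domain = mailbox.rpartition("@")
    if not at:
        # Special case: "RCPT TO:<Postmaster>" with no domain at all.
        return mailbox.lower() == "postmaster"
    served = {name.lower() for name in local_domains}
    return local.lower() == "postmaster" and domain.lower() in served
```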
+
+4.5.2 Transparency
+
+ Without some provision for data transparency, the character sequence
+ "<CRLF>.<CRLF>" ends the mail text and cannot be sent by the user.
+ In general, users are not aware of such "forbidden" sequences. To
+ allow all user composed text to be transmitted transparently, the
+ following procedures are used:
+
+ - Before sending a line of mail text, the SMTP client checks the
+ first character of the line. If it is a period, one additional
+ period is inserted at the beginning of the line.
+
+ - When a line of mail text is received by the SMTP server, it checks
+ the line. If the line is composed of a single period, it is
+ treated as the end of mail indicator. If the first character is a
+ period and there are other characters on the line, the first
+ character is deleted.
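+
+ The two procedures above can be sketched as follows. This is a
+ minimal illustration, assuming the mail text has already been split
+ into lines without their <CRLF> terminators; the names dot_stuff and
+ dot_unstuff are hypothetical.
+
```python
def dot_stuff(lines):
    """Client side: insert one extra period before any line starting with one."""
    return [b"." + line if line.startswith(b".") else line for line in lines]


def dot_unstuff(lines):
    """Server side: stop at a lone period, otherwise delete one leading period."""
    out = []
    for line in lines:
        if line == b".":
            break  # end of mail indicator
        out.append(line[1:] if line.startswith(b".") else line)
    return out
```
+
+ A body line consisting of a single period is sent as ".." and so is
+ never mistaken for the end of mail indicator.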
+
+ The mail data may contain any of the 128 ASCII characters. All
+ characters are to be delivered to the recipient's mailbox, including
+ spaces, vertical and horizontal tabs, and other control characters.
+ If the transmission channel provides an 8-bit byte (octet) data
+ stream, the 7-bit ASCII codes are transmitted right justified in the
+ octets, with the high order bits cleared to zero. See 3.7 for
+ special treatment of these conditions in SMTP systems serving a relay
+ function.
+
+ In some systems it may be necessary to transform the data as it is
+ received and stored. This may be necessary for hosts that use a
+ different character set than ASCII as their local character set, that
+ store data in records rather than strings, or which use special
+ character sequences as delimiters inside mailboxes. If such
+ transformations are necessary, they MUST be reversible, especially if
+ they are applied to mail being relayed.
+
+4.5.3 Sizes and Timeouts
+
+4.5.3.1 Size limits and minimums
+
+ There are several objects that have required minimum/maximum sizes.
+ Every implementation MUST be able to receive objects of at least
+ these sizes. Objects larger than these sizes SHOULD be avoided when
+ possible. However, some Internet mail constructs such as encoded
+ X.400 addresses [16] will often require larger objects: clients MAY
+ attempt to transmit these, but MUST be prepared for a server to
+ reject them if they cannot be handled by it. To the maximum extent
+ possible, implementation techniques which impose no limits on the
+ length of these objects should be used.
+
+ local-part
+ The maximum total length of a user name or other local-part is 64
+ characters.
+
+ domain
+ The maximum total length of a domain name or number is 255
+ characters.
+
+ path
+ The maximum total length of a reverse-path or forward-path is 256
+ characters (including the punctuation and element separators).
+
+ command line
+ The maximum total length of a command line including the command
+ word and the <CRLF> is 512 characters. SMTP extensions may be
+ used to increase this limit.
+
+ reply line
+ The maximum total length of a reply line including the reply code
+ and the <CRLF> is 512 characters. More information may be
+ conveyed through multiple-line replies.
+
+ text line
+ The maximum total length of a text line including the <CRLF> is
+ 1000 characters (not counting the leading dot duplicated for
+ transparency). This number may be increased by the use of SMTP
+ Service Extensions.
+
+ message content
+ The maximum total length of a message content (including any
+ message headers as well as the message body) MUST BE at least 64K
+ octets. Since the introduction of Internet standards for
+ multimedia mail [12], message lengths on the Internet have grown
+ dramatically, and message size restrictions should be avoided if
+ at all possible. SMTP server systems that must impose
+ restrictions SHOULD implement the "SIZE" service extension [18],
+ and SMTP client systems that will send large messages SHOULD
+ utilize it when possible.
+
+ recipients buffer
+ The minimum total number of recipients that must be buffered is
+ 100 recipients. Rejection of messages (for excessive recipients)
+ with fewer than 100 RCPT commands is a violation of this
+ specification. The general principle that relaying SMTP servers
+ MUST NOT, and delivery SMTP servers SHOULD NOT, perform validation
+ tests on message headers suggests that rejecting a message based
+ on the total number of recipients shown in header fields is to be
+ discouraged. A server which imposes a limit on the number of
+ recipients MUST behave in an orderly fashion, such as to reject
+ additional addresses over its limit rather than silently
+ discarding addresses previously accepted. A client that needs to
+ deliver a message containing over 100 RCPT commands SHOULD be
+ prepared to transmit in 100-recipient "chunks" if the server
+ declines to accept more than 100 recipients in a single message.
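+
+ For illustration, the fixed limits listed above could be collected in
+ a table and checked as shown below. The names LIMITS, oversize and
+ check_mailbox are hypothetical, and the sketch assumes the mailbox is
+ given as a plain "local-part@domain" string. It only reports objects
+ that exceed the sizes every implementation must accept; a receiver
+ remains free to accept larger ones.
+
```python
# Sizes that every implementation must be able to receive, in characters.
LIMITS = {
    "local-part": 64,
    "domain": 255,
    "path": 256,
    "command line": 512,   # including the command word and <CRLF>
    "reply line": 512,     # including the reply code and <CRLF>
    "text line": 1000,     # including <CRLF>, not counting a stuffed dot
}


def oversize(kind, value):
    """Return True if value is longer than the listed size for kind."""
    return len(value) > LIMITS[kind]


def check_mailbox(mailbox):
    """Return the names of the mailbox parts that exceed their listed sizes."""
    local, _, domain = mailbox.rpartition("@")
    problems = []
    if oversize("local-part", local):
        problems.append("local-part")
    if oversize("domain", domain):
        problems.append("domain")
    return problems
```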
+
+ Errors due to exceeding these limits may be reported by using the
+ reply codes. Some examples of reply codes are:
+
+ 500 Line too long.
+ or
+ 501 Path too long
+ or
+ 452 Too many recipients (see below)
+ or
+ 552 Too much mail data.
+
+ RFC 821 [30] incorrectly listed the error where an SMTP server
+ exhausts its implementation limit on the number of RCPT commands
+ ("too many recipients") as having reply code 552. The correct reply
+ code for this condition is 452. Clients SHOULD treat a 552 code in
+ this case as a temporary, rather than permanent, failure so the logic
+ below works.
+
+ When a conforming SMTP server encounters this condition, it has at
+ least 100 successful RCPT commands in its recipients buffer. If the
+ server is able to accept the message, then at least these 100
+ addresses will be removed from the SMTP client's queue. When the
+ client attempts retransmission of those addresses which received 452
+ responses, at least 100 of these will be able to fit in the SMTP
+ server's recipients buffer. Each retransmission attempt which is
+ able to deliver anything will be able to dispose of at least 100 of
+ these recipients.
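+
+ A client-side sketch of this logic is given below, under the
+ assumption that the client sees only the numeric reply code of each
+ RCPT command. The names chunk_recipients and rcpt_outcome are
+ hypothetical, and a real client would also examine the reply text
+ and any enhanced status codes.
+
```python
def chunk_recipients(recipients, size=100):
    """Split a recipient list into batches of at most `size` addresses."""
    return [recipients[i:i + size] for i in range(0, len(recipients), size)]


def rcpt_outcome(code):
    """Classify the reply to a RCPT command as accepted, retry or failed."""
    if 200 <= code < 300:
        return "accepted"
    if code == 552 or 400 <= code < 500:
        # 552 after RCPT is treated as temporary, like 452.
        return "retry"
    return "failed"
```
+
+ Recipients classified as "retry" stay in the client's queue and are
+ offered again in a later mail transaction.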
+
+ If an SMTP server has an implementation limit on the number of RCPT
+ commands and this limit is exhausted, it MUST use a response code of
+ 452 (but the client SHOULD also be prepared for a 552, as noted
+ above). If the server has a configured site-policy limitation on the
+ number of RCPT commands, it MAY instead use a 5XX response code.
+ This would be most appropriate if the policy limitation was intended
+ to apply to the total recipient count for a particular message body
+ and to be enforced even if that message body was sent in multiple
+ mail transactions.
+
+4.5.3.2 Timeouts
+
+ An SMTP client MUST provide a timeout mechanism. It MUST use per-
+ command timeouts rather than somehow trying to time the entire mail
+ transaction. Timeouts SHOULD be easily reconfigurable, preferably
+ without recompiling the SMTP code. To implement this, a timer is set
+ for each SMTP command and for each buffer of the data transfer. The
+ latter means that the overall timeout is inherently proportional to
+ the size of the message.
+
+ Based on extensive experience with busy mail-relay hosts, the minimum
+ per-command timeout values SHOULD be as follows:
+
+ Initial 220 Message: 5 minutes
+ An SMTP client process needs to distinguish between a failed TCP
+ connection and a delay in receiving the initial 220 greeting
+ message. Many SMTP servers accept a TCP connection but delay
+ delivery of the 220 message until their system load permits more
+ mail to be processed.
+
+ MAIL Command: 5 minutes
+
+ RCPT Command: 5 minutes
+ A longer timeout is required if processing of mailing lists and
+ aliases is not deferred until after the message was accepted.
+
+ DATA Initiation: 2 minutes
+ This is while awaiting the "354 Start Input" reply to a DATA
+ command.
+
+ Data Block: 3 minutes
+ This is while awaiting the completion of each TCP SEND call
+ transmitting a chunk of data.
+
+ DATA Termination: 10 minutes
+ This is while awaiting the "250 OK" reply. When the receiver gets
+ the final period terminating the message data, it typically
+ performs processing to deliver the message to a user mailbox. A
+ spurious timeout at this point would be very wasteful and would
+ typically result in delivery of multiple copies of the message,
+ since it has been successfully sent and the server has accepted
+ responsibility for delivery. See section 6.1 for additional
+ discussion.
+
+ An SMTP server SHOULD have a timeout of at least 5 minutes while it
+ is awaiting the next command from the sender.
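+
+ The client minimums above lend themselves to a small configuration
+ table. The following sketch is illustrative only; the key names and
+ the helper too_short are hypothetical, and it assumes every key in
+ the configuration also appears in the table.
+
```python
from datetime import timedelta

# Minimum per-command client timeouts listed in this section.
MIN_TIMEOUTS = {
    "initial_220": timedelta(minutes=5),
    "mail": timedelta(minutes=5),
    "rcpt": timedelta(minutes=5),
    "data_initiation": timedelta(minutes=2),
    "data_block": timedelta(minutes=3),
    "data_termination": timedelta(minutes=10),
}


def too_short(configured):
    """Return the names of configured timeouts below the listed minimum."""
    return sorted(name for name, value in configured.items()
                  if value < MIN_TIMEOUTS[name])
```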
+
+4.5.4 Retry Strategies
+
+ The common structure of a host SMTP implementation includes user
+ mailboxes, one or more areas for queuing messages in transit, and one
+ or more daemon processes for sending and receiving mail. The exact
+ structure will vary depending on the needs of the users on the host
+ and the number and size of mailing lists supported by the host. We
+ describe several optimizations that have proved helpful, particularly
+ for mailers supporting high traffic levels.
+
+ Any queuing strategy MUST include timeouts on all activities on a
+ per-command basis. A queuing strategy MUST NOT send error messages
+ in response to error messages under any circumstances.
+
+4.5.4.1 Sending Strategy
+
+ The general model for an SMTP client is one or more processes that
+ periodically attempt to transmit outgoing mail. In a typical system,
+ the program that composes a message has some method for requesting
+ immediate attention for a new piece of outgoing mail, while mail that
+ cannot be transmitted immediately MUST be queued and periodically
+ retried by the sender. A mail queue entry will include not only the
+ message itself but also the envelope information.
+
+ The sender MUST delay retrying a particular destination after one
+ attempt has failed. In general, the retry interval SHOULD be at
+ least 30 minutes; however, more sophisticated and variable strategies
+ will be beneficial when the SMTP client can determine the reason for
+ non-delivery.
+
+ Retries continue until the message is transmitted or the sender gives
+ up; the give-up time generally needs to be at least 4-5 days. The
+ parameters to the retry algorithm MUST be configurable.
+
+ A client SHOULD keep a list of hosts it cannot reach and
+ corresponding connection timeouts, rather than just retrying queued
+ mail items.
+
+ Experience suggests that failures are typically transient (the target
+ system or its connection has crashed), favoring a policy of two
+ connection attempts in the first hour the message is in the queue,
+ and then backing off to one every two or three hours.
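+
+ One possible reading of this policy is sketched below: the initial
+ attempt is made when the message is queued, a second attempt follows
+ 30 minutes later, and further attempts follow every two hours until
+ the give-up time. The function name retry_offsets and its default
+ values are assumptions chosen for illustration; the parameters would
+ be configurable in practice.
+
```python
from datetime import timedelta


def retry_offsets(first=timedelta(minutes=30),
                  later=timedelta(hours=2),
                  give_up=timedelta(days=5)):
    """Return retry times as offsets from the moment the message was queued.

    The initial attempt at offset zero is not included in the result.
    """
    offsets = []
    t = first
    while t <= give_up:
        offsets.append(t)
        t += later
    return offsets
```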
+
+ The SMTP client can shorten the queuing delay in cooperation with the
+ SMTP server. For example, if mail is received from a particular
+ address, it is likely that mail queued for that host can now be sent.
+ Application of this principle may, in many cases, eliminate the
+ requirement for an explicit "send queues now" function such as ETRN
+ [9].
+
+ The strategy may be further modified as a result of multiple
+ addresses per host (see below) to optimize delivery time vs. resource
+ usage.
+
+ An SMTP client may have a large queue of messages for each
+ unavailable destination host. If all of these messages were retried
+ in every retry cycle, there would be excessive Internet overhead and
+ the sending system would be blocked for a long period. Note that an
+ SMTP client can generally determine that a delivery attempt has
+ failed only after a timeout of several minutes and even a one-minute
+ timeout per connection will result in a very large delay if retries
+ are repeated for dozens, or even hundreds, of queued messages to the
+ same host.
+
+ At the same time, SMTP clients SHOULD use great care in caching
+ negative responses from servers. In an extreme case, if EHLO is
+ issued multiple times during the same SMTP connection, different
+ answers may be returned by the server. More significantly, 5yz
+ responses to the MAIL command MUST NOT be cached.
+
+ When a mail message is to be delivered to multiple recipients, and
+ the SMTP server to which a copy of the message is to be sent is the
+ same for multiple recipients, then only one copy of the message
+ SHOULD be transmitted. That is, the SMTP client SHOULD use the
+ command sequence: MAIL, RCPT, RCPT,... RCPT, DATA instead of the
+ sequence: MAIL, RCPT, DATA, ..., MAIL, RCPT, DATA. However, if there
+ are very many addresses, a limit on the number of RCPT commands per
+ MAIL command MAY be imposed. Implementation of this efficiency
+ feature is strongly encouraged.
+
+ Similarly, to achieve timely delivery, the SMTP client MAY support
+ multiple concurrent outgoing mail transactions. However, some limit
+ may be appropriate to protect the host from devoting all its
+ resources to mail.
+
+4.5.4.2 Receiving Strategy
+
+ The SMTP server SHOULD attempt to keep a pending listen on the SMTP
+ port at all times. This requires the support of multiple incoming
+ TCP connections for SMTP. Some limit MAY be imposed but servers that
+ cannot handle more than one SMTP transaction at a time are not in
+ conformance with the intent of this specification.
+
+ As discussed above, when the SMTP server receives mail from a
+ particular host address, it could activate its own SMTP queuing
+ mechanisms to retry any mail pending for that host address.
+
+4.5.5 Messages with a null reverse-path
+
+ There are several types of notification messages which are required
+ by existing and proposed standards to be sent with a null reverse
+ path, namely non-delivery notifications as discussed in section 3.7,
+ other kinds of Delivery Status Notifications (DSNs) [24], and also
+ Message Disposition Notifications (MDNs) [10]. All of these kinds of
+ messages are notifications about a previous message, and they are
+ sent to the reverse-path of the previous mail message. (If the
+ delivery of such a notification message fails, that usually indicates
+ a problem with the mail system of the host to which the notification
+ message is addressed. For this reason, at some hosts the MTA is set
+ up to forward such failed notification messages to someone who is
+ able to fix problems with the mail system, e.g., via the postmaster
+ alias.)
+
+ All other types of messages (i.e., any message which is not required
+ by a standards-track RFC to have a null reverse-path) SHOULD be sent
+ with a valid, non-null reverse-path.
+
+ Implementors of automated email processors should be careful to make
+ sure that the various kinds of messages with null reverse-path are
+ handled correctly; in particular, such systems SHOULD NOT reply to
+ messages with null reverse-path.
+
+5. Address Resolution and Mail Handling
+
+ Once an SMTP client lexically identifies a domain to which mail will
+ be delivered for processing (as described in sections 3.6 and 3.7), a
+ DNS lookup MUST be performed to resolve the domain name [22]. The
+ names are expected to be fully-qualified domain names (FQDNs):
+ mechanisms for inferring FQDNs from partial names or local aliases
+ are outside of this specification and, due to a history of problems,
+ are generally discouraged. The lookup first attempts to locate an MX
+ record associated with the name. If a CNAME record is found instead,
+ the resulting name is processed as if it were the initial name. If
+ no MX records are found, but an A RR is found, the A RR is treated as
+ if it was associated with an implicit MX RR, with a preference of 0,
+ pointing to that host. If one or more MX RRs are found for a given
+ name, SMTP systems MUST NOT utilize any A RRs associated with that
+ name unless they are located using the MX RRs; the "implicit MX" rule
+ above applies only if there are no MX records present. If MX records
+ are present, but none of them are usable, this situation MUST be
+ reported as an error.
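+
+ The MX and "implicit MX" rules above can be sketched as follows. The
+ sketch assumes the DNS lookups have already been done by the caller
+ and that any CNAME has already been followed; delivery_hosts and its
+ parameters are hypothetical names, and MX records are represented as
+ (preference, host) tuples.
+
```python
def delivery_hosts(domain, mx_records, has_a_record):
    """Return (preference, host) pairs to try for `domain`.

    mx_records is the list of MX RRs found for the name (possibly empty);
    has_a_record says whether an A RR exists for the name itself.
    """
    if mx_records:
        # MX RRs exist: A RRs for the name itself must not be used directly.
        return sorted(mx_records)
    if has_a_record:
        # No MX RRs: treat the A RR as an implicit MX with preference 0.
        return [(0, domain)]
    raise LookupError("no MX or A records for " + domain)
```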
+
+ When the lookup succeeds, the mapping can result in a list of
+ alternative delivery addresses rather than a single address, because
+ of multiple MX records, multihoming, or both. To provide reliable
+ mail transmission, the SMTP client MUST be able to try (and retry)
+ each of the relevant addresses in this list in order, until a
+ delivery attempt succeeds. However, there MAY also be a configurable
+ limit on the number of alternate addresses that can be tried. In any
+ case, the SMTP client SHOULD try at least two addresses.
+
+ Two types of information are used to rank the host addresses: multiple
+ MX records, and multihomed hosts.
+
+ Multiple MX records contain a preference indication that MUST be used
+ in sorting (see below). Lower numbers are more preferred than higher
+ ones. If there are multiple destinations with the same preference
+ and there is no clear reason to favor one (e.g., by recognition of an
+ easily-reached address), then the sender-SMTP MUST randomize them to
+ spread the load across multiple mail exchangers for a specific
+ organization.
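+
+ A minimal sketch of this ordering, assuming MX records are held as
+ (preference, host) tuples and that there is no reason to favor any
+ of the equally preferred hosts. The name order_mx and the rng
+ parameter are hypothetical.
+
```python
import random
from itertools import groupby


def order_mx(mx_records, rng=random):
    """Sort MX records by preference, shuffling records that tie."""
    ordered = []
    for _, group in groupby(sorted(mx_records), key=lambda rec: rec[0]):
        tied = list(group)
        rng.shuffle(tied)  # spread load across equally preferred exchangers
        ordered.extend(tied)
    return ordered
```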
+
+ The destination host (perhaps taken from the preferred MX record) may
+ be multihomed, in which case the domain name resolver will return a
+ list of alternative IP addresses. It is the responsibility of the
+ domain name resolver interface to have ordered this list by
+ decreasing preference if necessary, and SMTP MUST try them in the
+ order presented.
+
+ Although the capability to try multiple alternative addresses is
+ required, specific installations may want to limit or disable the use
+ of alternative addresses. The question of whether a sender should
+ attempt retries using the different addresses of a multihomed host
+ has been controversial. The main argument for using the multiple
+ addresses is that it maximizes the probability of timely delivery,
+ and indeed sometimes the probability of any delivery; the counter-
+ argument is that it may result in unnecessary resource use. Note
+ that resource use is also strongly determined by the sending strategy
+ discussed in section 4.5.4.1.
+
+ If an SMTP server receives a message with a destination for which it
+ is a designated Mail eXchanger, it MAY relay the message (potentially
+ after having rewritten the MAIL FROM and/or RCPT TO addresses), make
+ final delivery of the message, or hand it off using some mechanism
+ outside the SMTP-provided transport environment. Of course, neither
+ of the latter requires that the list of MX records be examined
+ further.
+
+ If it determines that it should relay the message without rewriting
+ the address, it MUST sort the MX records to determine candidates for
+ delivery. The records are first ordered by preference, with the
+ lowest-numbered records being most preferred. The relay host MUST
+ then inspect the list for any of the names or addresses by which it
+ might be known in mail transactions. If a matching record is found,
+ all records at that preference level and higher-numbered ones MUST be
+ discarded from consideration. If there are no records left at that
+ point, it is an error condition, and the message MUST be returned as
+ undeliverable. If records do remain, they SHOULD be tried, best
+ preference first, as described above.
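+
+ The pruning step described above might look like the sketch below.
+ It assumes MX records are (preference, host) tuples and that the
+ relay knows the set of names by which it may be known; matching by
+ IP address is omitted. The names relay_candidates and own_names are
+ hypothetical.
+
```python
def relay_candidates(mx_records, own_names):
    """Return the MX records a relay may still try, best preference first."""
    own = {name.lower().rstrip(".") for name in own_names}
    ordered = sorted(mx_records)
    mine = [pref for pref, host in ordered
            if host.lower().rstrip(".") in own]
    if mine:
        # Discard our own preference level and all higher-numbered ones.
        cutoff = min(mine)
        ordered = [(pref, host) for pref, host in ordered if pref < cutoff]
    if not ordered:
        raise LookupError("no usable MX records; message is undeliverable")
    return ordered
```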
+
+6. Problem Detection and Handling
+
+6.1 Reliable Delivery and Replies by Email
+
+ When the receiver-SMTP accepts a piece of mail (by sending a "250 OK"
+ message in response to DATA), it is accepting responsibility for
+ delivering or relaying the message. It must take this responsibility
+ seriously. It MUST NOT lose the message for frivolous reasons, such
+ as because the host later crashes or because of a predictable
+ resource shortage.
+
+ If there is a delivery failure after acceptance of a message, the
+ receiver-SMTP MUST formulate and mail a notification message. This
+ notification MUST be sent using a null ("<>") reverse path in the
+ envelope. The recipient of this notification MUST be the address
+ from the envelope return path (or the Return-Path: line). However,
+ if this address is null ("<>"), the receiver-SMTP MUST NOT send a
+ notification. Obviously, nothing in this section can or should
+ prohibit local decisions (i.e., as part of the same system
+ environment as the receiver-SMTP) to log or otherwise transmit
+ information about null address events locally if that is desired. If
+ the address is an explicit source route, it MUST be stripped down to
+ its final hop.
+
+ For example, suppose that an error notification must be sent for a
+ message that arrived with:
+
+ MAIL FROM:<@a,@b:user@d>
+
+ The notification message MUST be sent using:
+
+ RCPT TO:<user@d>
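+
+ The rules for addressing a notification can be sketched as shown
+ below. The function names are hypothetical. The sketch splits the
+ source route at the first colon, which is a simplification: it does
+ not handle address literals in the route that themselves contain
+ colons.
+
```python
def strip_source_route(path):
    """Reduce '<@a,@b:user@d>' to '<user@d>'; other paths are unchanged."""
    inner = path.strip()
    if inner.startswith("<") and inner.endswith(">"):
        inner = inner[1:-1]
    if inner.startswith("@") and ":" in inner:
        inner = inner.split(":", 1)[1]  # keep only the final hop
    return "<" + inner + ">"


def notification_envelope(reverse_path):
    """Return (MAIL FROM path, RCPT TO path) for a notification, or None."""
    recipient = strip_source_route(reverse_path)
    if recipient == "<>":
        return None  # never send a notification to a null reverse-path
    return ("<>", recipient)
```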
+
+ Some delivery failures after the message is accepted by SMTP will be
+ unavoidable. For example, it may be impossible for the receiving
+ SMTP server to validate all the delivery addresses in RCPT command(s)
+ due to a "soft" domain system error, because the target is a mailing
+ list (see earlier discussion of RCPT), or because the server is
+ acting as a relay and has no immediate access to the delivering
+ system.
+
+ To avoid receiving duplicate messages as the result of timeouts, a
+ receiver-SMTP MUST seek to minimize the time required to respond to
+ the final <CRLF>.<CRLF> end of data indicator. See RFC 1047 [28] for
+ a discussion of this problem.
+
+6.2 Loop Detection
+
+ Simple counting of the number of "Received:" headers in a message has
+ proven to be an effective, although rarely optimal, method of
+ detecting loops in mail systems. SMTP servers using this technique
+ SHOULD use a large rejection threshold, normally at least 100
+ Received entries. Whatever mechanisms are used, servers MUST contain
+ provisions for detecting and stopping trivial loops.
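+
+ Counting "Received:" headers is straightforward with a standard
+ message parser. The sketch below uses Python's email package; the
+ name looks_like_loop is hypothetical, and treating a count equal to
+ the threshold as a loop is an assumption made for illustration.
+
```python
from email import message_from_string

MAX_RECEIVED = 100  # rejection threshold suggested in this section


def looks_like_loop(raw_message, threshold=MAX_RECEIVED):
    """Return True if the message carries at least `threshold` Received headers."""
    msg = message_from_string(raw_message)
    return len(msg.get_all("Received", [])) >= threshold
```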
+
+6.3 Compensating for Irregularities
+
+ Unfortunately, variations, creative interpretations, and outright
+ violations of Internet mail protocols do occur; some would suggest
+ that they occur quite frequently. The debate as to whether a well-
+ behaved SMTP receiver or relay should reject a malformed message,
+ attempt to pass it on unchanged, or attempt to repair it to increase
+ the odds of successful delivery (or subsequent reply) began almost
+ with the dawn of structured network mail and shows no signs of
+ abating. Advocates of rejection claim that attempted repairs are
+ rarely completely adequate and that rejection of bad messages is the
+ only way to get the offending software repaired. Advocates of
+ "repair" or "deliver no matter what" argue that users prefer that
+ mail go through if at all possible and that there are significant
+ market pressures in that direction. In practice, these market
+ pressures may be more important to particular vendors than strict
+ conformance to the standards, regardless of the preference of the
+ actual developers.
+
+ The problems associated with ill-formed messages were exacerbated by
+ the introduction of the split-UA mail reading protocols [3, 26, 5,
+ 21]. These protocols have encouraged the use of SMTP as a posting
+ protocol, and SMTP servers as relay systems for these client hosts
+ (which are often only intermittently connected to the Internet).
+ Historically, many of those client machines lacked some of the
+ mechanisms and information assumed by SMTP (and indeed, by the mail
+ format protocol [7]). Some could not keep adequate track of time;
+ others had no concept of time zones; still others could not identify
+ their own names or addresses; and, of course, none could satisfy the
+ assumptions that underlay RFC 822's conception of authenticated
+ addresses.
+
+ In response to these weak SMTP clients, many SMTP systems now
+ complete messages that are delivered to them in incomplete or
+ incorrect form. This strategy is generally considered appropriate
+ when the server can identify or authenticate the client, and there
+ are prior agreements between them. By contrast, there is at best
+ great concern about fixes applied by a relay or delivery SMTP server
+ that has little or no knowledge of the user or client machine.
+
+ The following changes to a message being processed MAY be applied
+ when necessary by an originating SMTP server, or one used as the
+ target of SMTP as an initial posting protocol:
+
+ - Addition of a message-id field when none appears
+
+ - Addition of a date, time or time zone when none appears
+
+ - Correction of addresses to proper FQDN format
+
+ The less information the server has about the client, the less likely
+ these changes are to be correct and the more caution and conservatism
+ should be applied when considering whether or not to perform fixes
+ and how. These changes MUST NOT be applied by an SMTP server that
+ provides an intermediate relay function.
+
+ In all cases, properly-operating clients supplying correct
+ information are preferred to corrections by the SMTP server. In all
+ cases, documentation of actions performed by the servers (in trace
+ fields and/or header comments) is strongly encouraged.
+
+7. Security Considerations
+
+7.1 Mail Security and Spoofing
+
+ SMTP mail is inherently insecure in that it is feasible for even
+ fairly casual users to negotiate directly with receiving and relaying
+ SMTP servers and create messages that will trick a naive recipient
+ into believing that they came from somewhere else. Constructing such
+ a message so that the "spoofed" behavior cannot be detected by an
+ expert is somewhat more difficult, but not sufficiently so as to be a
+ deterrent to someone who is determined and knowledgeable.
+ Consequently, as knowledge of Internet mail increases, so does the
+ knowledge that SMTP mail inherently cannot be authenticated, or
+ integrity checks provided, at the transport level. Real mail
+ security lies only in end-to-end methods involving the message
+ bodies, such as those which use digital signatures (see [14] and,
+ e.g., PGP [4] or S/MIME [31]).
+
+ Various protocol extensions and configuration options that provide
+ authentication at the transport level (e.g., from an SMTP client to
+ an SMTP server) improve somewhat on the traditional situation
+ described above. However, unless they are accompanied by careful
+ handoffs of responsibility in a carefully-designed trust environment,
+ they remain inherently weaker than end-to-end mechanisms which use
+ digitally signed messages rather than depending on the integrity of
+ the transport system.
+
+ Efforts to make it more difficult for users to set envelope return
+ path and header "From" fields to point to valid addresses other than
+ their own are largely misguided: they frustrate legitimate
+ applications in which mail is sent by one user on behalf of another
+ or in which error (or normal) replies should be directed to a special
+ address. (Systems that provide convenient ways for users to alter
+ these fields on a per-message basis should attempt to establish a
+ primary and permanent mailbox address for the user so that Sender
+ fields within the message data can be generated sensibly.)
+
+ This specification does not further address the authentication issues
+ associated with SMTP other than to advocate that useful functionality
+ not be disabled in the hope of providing some small margin of
+ protection against an ignorant user who is trying to fake mail.
+
+7.2 "Blind" Copies
+
+ Addresses that do not appear in the message headers may appear in the
+ RCPT commands to an SMTP server for a number of reasons. The two
+ most common involve the use of a mailing address as a "list exploder"
+ (a single address that resolves into multiple addresses) and the
+ appearance of "blind copies". Especially when more than one RCPT
+ command is present, and in order to avoid defeating some of the
+ purpose of these mechanisms, SMTP clients and servers SHOULD NOT copy
+ the full set of RCPT command arguments into the headers, either as
+ part of trace headers or as informational or private-extension
+ headers. Since this rule is often violated in practice, and cannot
+ be enforced, sending SMTP systems that are aware of "bcc" use MAY
+ find it helpful to send each blind copy as a separate message
+ transaction containing only a single RCPT command.
+
+ There is no inherent relationship between either "reverse" (from
+ MAIL, SAML, etc., commands) or "forward" (RCPT) addresses in the SMTP
+ transaction ("envelope") and the addresses in the headers. Receiving
+ systems SHOULD NOT attempt to deduce such relationships and use them
+ to alter the headers of the message for delivery. The popular
+ "Apparently-to" header is a violation of this principle as well as a
+ common source of unintended information disclosure and SHOULD NOT be
+ used.
+
+7.3 VRFY, EXPN, and Security
+
+ As discussed in section 3.5, individual sites may want to disable
+ either or both of VRFY or EXPN for security reasons. As a corollary
+ to the above, implementations that permit this MUST NOT appear to
+ have verified addresses that are not, in fact, verified. If a site
+ disables these commands for security reasons, the SMTP server MUST
+ return a 252 response, rather than a code that could be confused with
+ successful or unsuccessful verification.
+
+ Returning a 250 reply code with the address listed in the VRFY
+ command after having checked it only for syntax violates this rule.
+ Of course, an implementation that "supports" VRFY by always returning
+ 550 whether or not the address is valid is equally not in
+ conformance.
+
+ Within the last few years, the contents of mailing lists have become
+ popular as an address information source for so-called "spammers."
+ The use of EXPN to "harvest" addresses has increased as list
+ administrators have installed protections against inappropriate uses
+ of the lists themselves. Implementations SHOULD still provide
+ support for EXPN, but sites SHOULD carefully evaluate the tradeoffs.
+ As authentication mechanisms are introduced into SMTP, some sites may
+ choose to make EXPN available only to authenticated requestors.
+
+7.4 Information Disclosure in Announcements
+
+ There has been an ongoing debate about the tradeoffs between the
+ debugging advantages of announcing server type and version (and,
+ sometimes, even server domain name) in the greeting response or in
+ response to the HELP command and the disadvantages of exposing
+ information that might be useful in a potential hostile attack. The
+ utility of the debugging information is beyond doubt. Those who
+ argue for making it available point out that it is far better to
+ actually secure an SMTP server rather than hope that trying to
+ conceal known vulnerabilities by hiding the server's precise identity
+ will provide more protection. Sites are encouraged to evaluate the
+ tradeoff with that issue in mind; implementations are strongly
+ encouraged to minimally provide for making type and version
+ information available in some way to other network hosts.
+
+7.5 Information Disclosure in Trace Fields
+
+ In some circumstances, such as when mail originates from within a LAN
+ whose hosts are not directly on the public Internet, trace
+ ("Received") fields produced in conformance with this specification
+ may disclose host names and similar information that would not
+ normally be available. This ordinarily does not pose a problem, but
+ sites with special concerns about name disclosure should be aware of
+ it. Also, the optional FOR clause should be supplied with caution or
+ not at all when multiple recipients are involved lest it
+ inadvertently disclose the identities of "blind copy" recipients to
+ others.
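+
+ The caution about the FOR clause can be sketched as follows. This is
+ a minimal illustration, not part of the specification: the helper
+ name received_field() and its parameters are assumptions, and the
+ field layout simply follows the trace field shown in Appendix D.3.

```python
def received_field(from_host, by_host, recipients, timestamp):
    """Build a trace field, adding FOR only when it cannot leak other recipients."""
    parts = [f"from {from_host}", f"by {by_host}"]
    # With more than one recipient, a FOR clause could reveal a
    # "blind copy" recipient to the others, so it is left out.
    if len(recipients) == 1:
        parts.append(f"for <{recipients[0]}>")
    return "Received: " + " ".join(parts) + "; " + timestamp


single = received_field("bar.com", "foo.com", ["Jones@xyz.com"],
                        "Thu, 21 May 1998 05:33:29 -0700")
several = received_field("bar.com", "foo.com",
                         ["Jones@xyz.com", "Brown@xyz.com"],
                         "Thu, 21 May 1998 05:33:29 -0700")
```

+ Omitting the clause entirely is the conservative choice; a site could
+ instead generate one message copy per recipient and keep FOR.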
+
+7.6 Information Disclosure in Message Forwarding
+
+ As discussed in section 3.4, use of the 251 or 551 reply codes to
+ identify the replacement address associated with a mailbox may
+ inadvertently disclose sensitive information. Sites that are
+ concerned about those issues should ensure that they select and
+ configure servers appropriately.
+
+7.7 Scope of Operation of SMTP Servers
+
+ It is a well-established principle that an SMTP server may refuse to
+ accept mail for any operational or technical reason that makes sense
+ to the site providing the server. However, cooperation among sites
+ and installations makes the Internet possible. If sites take
+ excessive advantage of the right to reject traffic, the ubiquity of
+ email availability (one of the strengths of the Internet) will be
+ threatened; considerable care should be taken and balance maintained
+ if a site decides to be selective about the traffic it will accept
+ and process.
+
+ In recent years, the relay function of arbitrary sites has been used
+ as part of hostile efforts to hide the actual origins
+ of mail. Some sites have decided to limit the use of the relay
+ function to known or identifiable sources, and implementations SHOULD
+ provide the capability to perform this type of filtering. When mail
+ is rejected for these or other policy reasons, a 550 code SHOULD be
+ used in response to EHLO, MAIL, or RCPT as appropriate.
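+
+ One way such relay filtering might look is sketched below. The
+ domain set, the network list, the function name rcpt_reply(), and
+ the reply text are all hypothetical; only the use of a 550 code for
+ policy rejection comes from the text above.

```python
import ipaddress

# Hypothetical site policy: domains we accept mail for, and the
# networks whose clients are allowed to relay through this server.
LOCAL_DOMAINS = {"foo.com"}
RELAY_NETWORKS = [ipaddress.ip_network("192.0.2.0/24")]


def rcpt_reply(client_ip, recipient):
    """Return the reply line for one RCPT command under the policy above."""
    domain = recipient.rpartition("@")[2].lower()
    if domain in LOCAL_DOMAINS:
        return "250 OK"                      # final delivery, always accepted
    addr = ipaddress.ip_address(client_ip)
    if any(addr in net for net in RELAY_NETWORKS):
        return "250 OK"                      # known source, relaying allowed
    return "550 Relaying denied"             # policy rejection
```

+ A real server would usually combine an address check like this with
+ authentication, which this sketch leaves out.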
+
+8. IANA Considerations
+
+ IANA will maintain three registries in support of this specification.
+ The first consists of SMTP service extensions with the associated
+ keywords, and, as needed, parameters and verbs. As specified in
+ section 2.2.2, no entry may be made in this registry that starts with
+ an "X". Entries may be made only for service extensions (and
+ associated keywords, parameters, or verbs) that are defined in
+ standards-track or experimental RFCs specifically approved by the
+ IESG for this purpose.
+
+ The second registry consists of "tags" that identify forms of domain
+ literals other than those for IPv4 addresses (specified in RFC 821
+ and in this document) and IPv6 addresses (specified in this
+ document). Additional literal types require standardization before
+ being used; none are anticipated at this time.
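+
+ As an illustration of how a receiver might tell the two defined
+ literal forms apart, the sketch below assumes the bracketed forms
+ "[192.0.2.1]" and "[IPv6:2001:db8::1]". The function name
+ parse_address_literal() is hypothetical, and any other tag is simply
+ rejected because no other tag is registered.

```python
import ipaddress


def parse_address_literal(literal):
    """Return (kind, address) for an IPv4 or tagged IPv6 address literal."""
    if not (literal.startswith("[") and literal.endswith("]")):
        raise ValueError("address literal must be enclosed in brackets")
    inner = literal[1:-1]
    tag, sep, rest = inner.partition(":")
    if not sep:
        # No tag at all: the untagged form is reserved for IPv4.
        return "IPv4", ipaddress.IPv4Address(inner)
    if tag.lower() == "ipv6":
        return "IPv6", ipaddress.IPv6Address(rest)
    raise ValueError(f"unregistered literal tag: {tag!r}")
```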
+
+ The third, established by RFC 821 and renewed by this specification,
+ is a registry of link and protocol identifiers to be used with the
+ "via" and "with" subclauses of the time stamp ("Received:" header)
+ described in section 4.4. Link and protocol identifiers in addition
+ to those specified in this document may be registered only by
+ standardization or by way of an RFC-documented, IESG-approved,
+ Experimental protocol extension.
+
+9. References
+
+ [1] American National Standards Institute (formerly United States of
+ America Standards Institute), X3.4, 1968, "USA Code for
+ Information Interchange". ANSI X3.4-1968 has been replaced by
+ newer versions with slight modifications, but the 1968 version
+ remains definitive for the Internet.
+
+ [2] Braden, R., "Requirements for Internet hosts - application and
+ support", STD 3, RFC 1123, October 1989.
+
+ [3] Butler, M., Chase, D., Goldberger, J., Postel, J. and J.
+ Reynolds, "Post Office Protocol - version 2", RFC 937, February
+ 1985.
+
+ [4] Callas, J., Donnerhacke, L., Finney, H. and R. Thayer, "OpenPGP
+ Message Format", RFC 2440, November 1998.
+
+ [5] Crispin, M., "Interactive Mail Access Protocol - Version 2", RFC
+ 1176, August 1990.
+
+ [6] Crispin, M., "Internet Message Access Protocol - Version 4", RFC
+ 2060, December 1996.
+
+ [7] Crocker, D., "Standard for the Format of ARPA Internet Text
+ Messages", RFC 822, August 1982.
+
+ [8] Crocker, D. and P. Overell, Eds., "Augmented BNF for Syntax
+ Specifications: ABNF", RFC 2234, November 1997.
+
+ [9] De Winter, J., "SMTP Service Extension for Remote Message Queue
+ Starting", RFC 1985, August 1996.
+
+ [10] Fajman, R., "An Extensible Message Format for Message
+ Disposition Notifications", RFC 2298, March 1998.
+
+ [11] Freed, N., "Behavior of and Requirements for Internet Firewalls",
+ RFC 2979, October 2000.
+
+ [12] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
+ Extensions (MIME) Part One: Format of Internet Message Bodies",
+ RFC 2045, December 1996.
+
+ [13] Freed, N., "SMTP Service Extension for Command Pipelining", RFC
+ 2920, September 2000.
+
+ [14] Galvin, J., Murphy, S., Crocker, S. and N. Freed, "Security
+ Multiparts for MIME: Multipart/Signed and Multipart/Encrypted",
+ RFC 1847, October 1995.
+
+ [15] Gellens, R. and J. Klensin, "Message Submission", RFC 2476,
+ December 1998.
+
+ [16] Kille, S., "Mapping between X.400 and RFC822/MIME", RFC 2156,
+ January 1998.
+
+ [17] Hinden, R. and S. Deering, Eds., "IP Version 6 Addressing
+ Architecture", RFC 2373, July 1998.
+
+ [18] Klensin, J., Freed, N. and K. Moore, "SMTP Service Extension for
+ Message Size Declaration", STD 10, RFC 1870, November 1995.
+
+ [19] Klensin, J., Freed, N., Rose, M., Stefferud, E. and D. Crocker,
+ "SMTP Service Extensions", STD 10, RFC 1869, November 1995.
+
+ [20] Klensin, J., Freed, N., Rose, M., Stefferud, E. and D. Crocker,
+ "SMTP Service Extension for 8bit-MIMEtransport", RFC 1652, July
+ 1994.
+
+ [21] Lambert, M., "PCMAIL: A distributed mail system for personal
+ computers", RFC 1056, July 1988.
+
+ [22] Mockapetris, P., "Domain names - implementation and
+ specification", STD 13, RFC 1035, November 1987.
+
+ Mockapetris, P., "Domain names - concepts and facilities", STD
+ 13, RFC 1034, November 1987.
+
+ [23] Moore, K., "MIME (Multipurpose Internet Mail Extensions) Part
+ Three: Message Header Extensions for Non-ASCII Text", RFC 2047,
+ December 1996.
+
+ [24] Moore, K., "SMTP Service Extension for Delivery Status
+ Notifications", RFC 1891, January 1996.
+
+ [25] Moore, K., and G. Vaudreuil, "An Extensible Message Format for
+ Delivery Status Notifications", RFC 1894, January 1996.
+
+ [26] Myers, J. and M. Rose, "Post Office Protocol - Version 3", STD
+ 53, RFC 1939, May 1996.
+
+ [27] Partridge, C., "Mail routing and the domain system", RFC 974,
+ January 1986.
+
+ [28] Partridge, C., "Duplicate messages and SMTP", RFC 1047, February
+ 1988.
+
+ [29] Postel, J., ed., "Transmission Control Protocol - DARPA Internet
+ Program Protocol Specification", STD 7, RFC 793, September 1981.
+
+ [30] Postel, J., "Simple Mail Transfer Protocol", RFC 821, August
+ 1982.
+
+ [31] Ramsdell, B., Ed., "S/MIME Version 3 Message Specification", RFC
+ 2633, June 1999.
+
+ [32] Resnick, P., Ed., "Internet Message Format", RFC 2822, April
+ 2001.
+
+ [33] Vaudreuil, G., "SMTP Service Extensions for Transmission of
+ Large and Binary MIME Messages", RFC 1830, August 1995.
+
+ [34] Vaudreuil, G., "Enhanced Mail System Status Codes", RFC 1893,
+ January 1996.
+
+10. Editor's Address
+
+ John C. Klensin
+ AT&T Laboratories
+ 99 Bedford St
+ Boston, MA 02111 USA
+
+ Phone: 617-574-3076
+ EMail: klensin@research.att.com
+
+11. Acknowledgments
+
+ Many people worked long and hard on the many iterations of this
+ document. There was wide-ranging debate in the IETF DRUMS Working
+ Group, both on its mailing list and in face to face discussions,
+ about many technical issues and the role of a revised standard for
+ Internet mail transport, and many contributors helped form the
+ wording in this specification. The hundreds of participants in the
+ many discussions since RFC 821 was produced are too numerous to
+ mention, but they all helped this document become what it is.
+
+APPENDICES
+
+A. TCP Transport Service
+
+ The TCP connection supports the transmission of 8-bit bytes. The
+ SMTP data is 7-bit ASCII characters. Each character is transmitted
+ as an 8-bit byte with the high-order bit cleared to zero. Service
+ extensions may modify this rule to permit transmission of full 8-bit
+ data bytes as part of the message body, but not in SMTP commands or
+ responses.
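+
+ A small sketch of the rule for commands, assuming a hypothetical
+ helper encode_command(): the command text is encoded as ASCII, so
+ every transmitted byte has its high-order bit cleared, and anything
+ outside the 7-bit range is refused before it reaches the connection.

```python
def encode_command(command: str) -> bytes:
    """Encode one SMTP command line as 7-bit ASCII bytes ending in CRLF."""
    # str.encode("ascii") raises UnicodeEncodeError for any character
    # that does not fit in seven bits.
    data = (command + "\r\n").encode("ascii")
    # Every byte therefore has the high-order bit set to zero.
    assert all(byte < 0x80 for byte in data)
    return data
```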
+
+B. Generating SMTP Commands from RFC 822 Headers
+
+ Some systems use RFC 822 headers (only) in a mail submission
+ protocol, or otherwise generate SMTP commands from RFC 822 headers
+ when such a message is handed to an MTA from a UA. While the MTA-UA
+ protocol is a private matter, not covered by any Internet Standard,
+ there are problems with this approach. For example, there have been
+ repeated problems with proper handling of "bcc" copies and
+ redistribution lists when information that conceptually belongs to
+ the mail envelope is not separated early in processing from header
+ information (and kept separate).
+
+ It is recommended that the UA provide its initial ("submission
+ client") MTA with an envelope separate from the message itself.
+ However, if the envelope is not supplied, SMTP commands SHOULD be
+ generated as follows:
+
+ 1. Each recipient address from a TO, CC, or BCC header field SHOULD
+ be copied to an RCPT command (generating multiple message copies if
+ that is required for queuing or delivery). This includes any
+ addresses listed in an RFC 822 "group". Any BCC fields SHOULD then
+ be removed from the headers. Once this process is completed, the
+ remaining headers SHOULD be checked to verify that at least one
+ To:, Cc:, or Bcc: header remains. If none do, then a Bcc: header
+ with no additional information SHOULD be inserted as specified in
+ [32].
+
+ 2. The return address in the MAIL command SHOULD, if possible, be
+ derived from the system's identity for the submitting (local)
+ user, and the "From:" header field otherwise. If there is a
+ system identity available, it SHOULD also be copied to the Sender
+ header field if it is different from the address in the From
+ header field. (Any Sender field that was already there SHOULD be
+ removed.) Systems may provide a way for submitters to override
+ the envelope return address, but may want to restrict its use to
+ privileged users. This will not prevent mail forgery, but may
+ lessen its incidence; see section 7.1.
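+
+ The two steps above can be sketched with the Python standard
+ library's email package. This is an illustration under assumptions,
+ not a reference implementation: the function name
+ envelope_from_headers() and the system_identity parameter are
+ hypothetical, and the privileged-user override is not modelled.

```python
from email import message_from_string
from email.utils import getaddresses


def envelope_from_headers(raw_message, system_identity=None):
    """Derive (return_path, recipients, message) from header fields alone."""
    msg = message_from_string(raw_message)

    # Step 1: each To, Cc, and Bcc address becomes one RCPT argument.
    recipients = []
    for field in ("To", "Cc", "Bcc"):
        for _name, addr in getaddresses(msg.get_all(field, [])):
            if addr and addr not in recipients:
                recipients.append(addr)

    # Bcc fields are removed before the message travels further.
    del msg["Bcc"]
    # If no destination field is left, insert an empty Bcc field.
    if "To" not in msg and "Cc" not in msg:
        msg["Bcc"] = ""

    # Step 2: prefer the system's identity, fall back to the From field.
    from_pairs = getaddresses(msg.get_all("From", []))
    from_addr = from_pairs[0][1] if from_pairs else ""
    return_path = system_identity or from_addr
    if system_identity:
        del msg["Sender"]                    # drop any Sender already there
        if system_identity != from_addr:
            msg["Sender"] = system_identity
    return return_path, recipients, msg
```

+ The returned return path and recipient list would feed the MAIL and
+ RCPT commands; the returned message is what follows DATA.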
+
+ When an MTA is being used in this way, it bears responsibility for
+ ensuring that the message being transmitted is valid. The mechanisms
+ for checking that validity, and for handling (or returning) messages
+ that are not valid at the time of arrival, are part of the MUA-MTA
+ interface and not covered by this specification.
+
+ A submission protocol based on Standard RFC 822 information alone
+ MUST NOT be used to gateway a message from a foreign (non-SMTP) mail
+ system into an SMTP environment. Additional information to construct
+ an envelope must come from some source in the other environment,
+ whether supplemental headers or the foreign system's envelope.
+
+ Attempts to gateway messages using only their header "to" and "cc"
+ fields have repeatedly caused mail loops and other behavior adverse
+ to the proper functioning of the Internet mail environment. These
+ problems have been especially common when the message originates from
+ an Internet mailing list and is distributed into the foreign
+ environment using envelope information. When these messages are then
+ processed by a header-only remailer, loops back to the Internet
+ environment (and the mailing list) are almost inevitable.
+
+C. Source Routes
+
+ Historically, the <reverse-path> was a reverse source routing list of
+ hosts and a source mailbox. The first host in the <reverse-path>
+ SHOULD be the host sending the MAIL command. Similarly, the
+ <forward-path> may be a source routing list of hosts and a
+ destination mailbox. However, in general, the <forward-path> SHOULD
+ contain only a mailbox and domain name, relying on the domain name
+ system to supply routing information if required. The use of source
+ routes is deprecated; while servers MUST be prepared to receive and
+ handle them as discussed in sections 3.3 and F.2, clients SHOULD NOT
+ transmit them and this section was included only to provide context.
+
+ For relay purposes, the forward-path may be a source route of the
+ form "@ONE,@TWO:JOE@THREE", where ONE, TWO, and THREE MUST be
+ fully-qualified domain names. This form is used to emphasize the
+ distinction between an address and a route. The mailbox is an
+ absolute address, and the route is information about how to get
+ there. The two concepts should not be confused.
+
+ If source routes are used, RFC 821 and the text below should be
+ consulted for the mechanisms for constructing and updating the
+ forward- and reverse-paths.
+
+ The SMTP server transforms the command arguments by moving its own
+ identifier (its domain name or that of any domain for which it is
+ acting as a mail exchanger), if it appears, from the forward-path to
+ the beginning of the reverse-path.
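+
+ The transformation can be sketched as below. The function names are
+ hypothetical, the paths are taken without their angle brackets, and
+ the route hosts are assumed to be plain domain names (an address
+ literal containing ":" would need a fuller parser).

```python
def _parse(path):
    """Split '@ONE,@TWO:JOE@THREE' into (['ONE', 'TWO'], 'JOE@THREE')."""
    if path.startswith("@"):
        route, _, mailbox = path.partition(":")
        return [hop.lstrip("@") for hop in route.split(",")], mailbox
    return [], path


def _format(hops, mailbox):
    """Inverse of _parse: rebuild the path text from hops and mailbox."""
    if not hops:
        return mailbox
    return ",".join("@" + hop for hop in hops) + ":" + mailbox


def relay_paths(own_domain, reverse_path, forward_path):
    """Return (reverse_path, forward_path) as the next hop should see them."""
    rev_hops, rev_box = _parse(reverse_path)
    fwd_hops, fwd_box = _parse(forward_path)
    # Move our own identifier, if it leads the forward-path, to the
    # beginning of the reverse-path.
    if fwd_hops and fwd_hops[0].lower() == own_domain.lower():
        fwd_hops = fwd_hops[1:]
        rev_hops = [own_domain] + rev_hops
    return _format(rev_hops, rev_box), _format(fwd_hops, fwd_box)
```

+ With own_domain "foo.com", the paths of Appendix D.3 Step 1 turn
+ into those of Step 2.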
+
+ Notice that the forward-path and reverse-path appear in the SMTP
+ commands and replies, but not necessarily in the message. That is,
+ there is no need for these paths and especially this syntax to appear
+ in the "To:", "From:", "CC:", etc. fields of the message header.
+ Conversely, SMTP servers MUST NOT derive final message delivery
+ information from message header fields.
+
+ When the list of hosts is present, it is a "reverse" source route and
+ indicates that the mail was relayed through each host on the list
+ (the first host in the list was the most recent relay). This list is
+ used as a source route to return non-delivery notices to the sender.
+ As each relay host adds itself to the beginning of the list, it MUST
+ use its name as known in the transport environment to which it is
+ relaying the mail rather than that of the transport environment from
+ which the mail came (if they are different).
+
+D. Scenarios
+
+ This section presents complete scenarios of several types of SMTP
+ sessions. In the examples, "C:" indicates what is said by the SMTP
+ client, and "S:" indicates what is said by the SMTP server.
+
+D.1 A Typical SMTP Transaction Scenario
+
+ This SMTP example shows mail sent by Smith at host bar.com, to Jones,
+ Green, and Brown at host foo.com. Here we assume that host bar.com
+ contacts host foo.com directly. The mail is accepted for Jones and
+ Brown. Green does not have a mailbox at host foo.com.
+
+ S: 220 foo.com Simple Mail Transfer Service Ready
+ C: EHLO bar.com
+ S: 250-foo.com greets bar.com
+ S: 250-8BITMIME
+ S: 250-SIZE
+ S: 250-DSN
+ S: 250 HELP
+ C: MAIL FROM:<Smith@bar.com>
+ S: 250 OK
+ C: RCPT TO:<Jones@foo.com>
+ S: 250 OK
+ C: RCPT TO:<Green@foo.com>
+ S: 550 No such user here
+ C: RCPT TO:<Brown@foo.com>
+ S: 250 OK
+ C: DATA
+ S: 354 Start mail input; end with <CRLF>.<CRLF>
+ C: Blah blah blah...
+ C: ...etc. etc. etc.
+ C: .
+ S: 250 OK
+ C: QUIT
+ S: 221 foo.com Service closing transmission channel
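+
+ A client reading the "S:" lines of such a session has to join
+ multiline replies such as the EHLO response, where every line but
+ the last carries a hyphen after the code. The sketch below assumes
+ a hypothetical read_reply() that is handed an iterator of lines
+ already received from the server.

```python
def read_reply(lines):
    """Collect one SMTP reply from an iterator of lines; return (code, texts)."""
    texts = []
    for line in lines:
        line = line.rstrip("\r\n")
        code, sep, text = line[:3], line[3:4], line[4:]
        if len(code) != 3 or not code.isdigit() or sep not in ("-", " ", ""):
            raise ValueError(f"malformed reply line: {line!r}")
        texts.append(text)
        if sep != "-":            # no hyphen: this is the final line
            return int(code), texts
    raise ValueError("reply ended without a final line")


ehlo_reply = [
    "250-foo.com greets bar.com\r\n",
    "250-8BITMIME\r\n",
    "250-SIZE\r\n",
    "250-DSN\r\n",
    "250 HELP\r\n",
]
code, texts = read_reply(iter(ehlo_reply))
```

+ The lines after the first name the extension keywords the server
+ offers, here 8BITMIME, SIZE, DSN, and HELP.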
+
+D.2 Aborted SMTP Transaction Scenario
+
+ S: 220 foo.com Simple Mail Transfer Service Ready
+ C: EHLO bar.com
+ S: 250-foo.com greets bar.com
+ S: 250-8BITMIME
+ S: 250-SIZE
+ S: 250-DSN
+ S: 250 HELP
+ C: MAIL FROM:<Smith@bar.com>
+ S: 250 OK
+ C: RCPT TO:<Jones@foo.com>
+ S: 250 OK
+ C: RCPT TO:<Green@foo.com>
+ S: 550 No such user here
+ C: RSET
+ S: 250 OK
+ C: QUIT
+ S: 221 foo.com Service closing transmission channel
+
+D.3 Relayed Mail Scenario
+
+ Step 1 -- Source Host to Relay Host
+
+ S: 220 foo.com Simple Mail Transfer Service Ready
+ C: EHLO bar.com
+ S: 250-foo.com greets bar.com
+ S: 250-8BITMIME
+ S: 250-SIZE
+ S: 250-DSN
+ S: 250 HELP
+ C: MAIL FROM:<JQP@bar.com>
+ S: 250 OK
+ C: RCPT TO:<@foo.com:Jones@XYZ.COM>
+ S: 250 OK
+ C: DATA
+ S: 354 Start mail input; end with <CRLF>.<CRLF>
+ C: Date: Thu, 21 May 1998 05:33:22 -0700
+ C: From: John Q. Public <JQP@bar.com>
+ C: Subject: The Next Meeting of the Board
+ C: To: Jones@xyz.com
+ C:
+ C: Bill:
+ C: The next meeting of the board of directors will be
+ C: on Tuesday.
+ C: John.
+ C: .
+ S: 250 OK
+ C: QUIT
+ S: 221 foo.com Service closing transmission channel
+
+ Step 2 -- Relay Host to Destination Host
+
+ S: 220 xyz.com Simple Mail Transfer Service Ready
+ C: EHLO foo.com
+ S: 250 xyz.com is on the air
+ C: MAIL FROM:<@foo.com:JQP@bar.com>
+ S: 250 OK
+ C: RCPT TO:<Jones@XYZ.COM>
+ S: 250 OK
+ C: DATA
+ S: 354 Start mail input; end with <CRLF>.<CRLF>
+ C: Received: from bar.com by foo.com ; Thu, 21 May 1998
+ C: 05:33:29 -0700
+ C: Date: Thu, 21 May 1998 05:33:22 -0700
+ C: From: John Q. Public <JQP@bar.com>
+ C: Subject: The Next Meeting of the Board
+ C: To: Jones@xyz.com
+ C:
+ C: Bill:
+ C: The next meeting of the board of directors will be
+ C: on Tuesday.
+ C: John.
+ C: .
+ S: 250 OK
+ C: QUIT
+ S: 221 xyz.com Service closing transmission channel
+
+D.4 Verifying and Sending Scenario
+
+ S: 220 foo.com Simple Mail Transfer Service Ready
+ C: EHLO bar.com
+ S: 250-foo.com greets bar.com
+ S: 250-8BITMIME
+ S: 250-SIZE
+ S: 250-DSN
+ S: 250-VRFY
+ S: 250 HELP
+ C: VRFY Crispin
+ S: 250 Mark Crispin <Admin.MRC@foo.com>
+ C: SEND FROM:<EAK@bar.com>
+ S: 250 OK
+ C: RCPT TO:<Admin.MRC@foo.com>
+ S: 250 OK
+ C: DATA
+ S: 354 Start mail input; end with <CRLF>.<CRLF>
+ C: Blah blah blah...
+ C: ...etc. etc. etc.
+ C: .
+ S: 250 OK
+ C: QUIT
+ S: 221 foo.com Service closing transmission channel
+
+E. Other Gateway Issues
+
+ In general, gateways between the Internet and other mail systems
+ SHOULD attempt to preserve any layering semantics across the
+ boundaries between the two mail systems involved. Gateway-translation
+ approaches that attempt to take shortcuts by mapping (such as mapping
+ envelope information from one system to the message headers or body
+ of another) have generally proven to be inadequate in important
+ ways. Systems translating between environments that do not
+ support both envelopes and headers and Internet mail must be written
+ with the understanding that some information loss is almost
+ inevitable.
+
+F. Deprecated Features of RFC 821
+
+ A few features of RFC 821 have proven to be problematic and SHOULD
+ NOT be used in Internet mail.
+
+F.1 TURN
+
+ This command, described in RFC 821, raises important security issues
+ since, in the absence of strong authentication of the host requesting
+ that the client and server switch roles, it can easily be used to
+ divert mail from its correct destination. Its use is deprecated;
+ SMTP systems SHOULD NOT use it unless the server can authenticate the
+ client.
+
+F.2 Source Routing
+
+ RFC 821 utilized the concept of explicit source routing to get mail
+ from one host to another via a series of relays. The requirement to
+ utilize source routes in regular mail traffic was eliminated by the
+ introduction of the domain name system "MX" record and the last
+ significant justification for them was eliminated by the
+ introduction, in RFC 1123, of a clear requirement that addresses
+ following an "@" must all be fully-qualified domain names.
+ Consequently, the only remaining justifications for the use of source
+ routes are support for very old SMTP clients or MUAs and in mail
+ system debugging. They can, however, still be useful in the latter
+ circumstance and for routing mail around serious, but temporary,
+ problems such as problems with the relevant DNS records.
+
+ SMTP servers MUST continue to accept source route syntax as specified
+ in the main body of this document and in RFC 1123. They MAY, if
+ necessary, ignore the routes and utilize only the target domain in
+ the address. If they do utilize the source route, the message MUST
+ be sent to the first domain shown in the address. In particular, a
+ server MUST NOT guess at shortcuts within the source route.
+
+ Clients SHOULD NOT utilize explicit source routing except under
+ unusual circumstances, such as debugging or potentially relaying
+ around firewall or mail system configuration errors.
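+
+ The server-side choice described above, between honoring a source
+ route and ignoring it, might be sketched as follows. The function
+ name next_hop() and its honor_route flag are hypothetical, and route
+ hosts are assumed to be plain domain names.

```python
def next_hop(forward_path, honor_route=False):
    """Pick the domain to which a message for forward_path is sent next."""
    path = forward_path.strip("<>")
    hops = []
    if path.startswith("@"):
        route, _, path = path.partition(":")
        hops = [hop.lstrip("@") for hop in route.split(",")]
    target_domain = path.rpartition("@")[2]
    if honor_route and hops:
        # When the route is used, the message goes to the first
        # domain shown; no shortcut within the route is guessed.
        return hops[0]
    # Otherwise the route is ignored and only the target domain counts.
    return target_domain
```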
+
+F.3 HELO
+
+ As discussed in sections 3.1 and 4.1.1, EHLO is strongly preferred to
+ HELO when the server will accept the former. Servers must continue
+ to accept and process HELO in order to support older clients.
+
+F.4 #-literals
+
+ RFC 821 provided for specifying an Internet address as a decimal
+ integer host number prefixed by a pound sign, "#". In practice, that
+ form has been obsolete since the introduction of TCP/IP. It is
+ deprecated and MUST NOT be used.
+
+F.5 Dates and Years
+
+ When dates are inserted into messages by SMTP clients or servers
+ (e.g., in trace fields), four-digit years MUST be used. Two-digit
+ years are deprecated; three-digit years were never permitted in the
+ Internet mail system.
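+
+ As a sketch of generating such a date, the Python standard library's
+ email.utils.format_datetime() always emits a four-digit year. The
+ timestamp below is taken from the examples in Appendix D.3; the
+ variable names are illustrative only.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

# 21 May 1998, 05:33:29, seven hours behind UTC.
stamp = datetime(1998, 5, 21, 5, 33, 29,
                 tzinfo=timezone(timedelta(hours=-7)))
date_field = format_datetime(stamp)   # "Thu, 21 May 1998 05:33:29 -0700"
```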
+
+F.6 Sending versus Mailing
+
+ In addition to specifying a mechanism for delivering messages to
+ users' mailboxes, RFC 821 provided additional, optional, commands to
+ deliver messages directly to the user's terminal screen. These
+ commands (SEND, SAML, SOML) were rarely implemented, and changes in
+ workstation technology and the introduction of other protocols may
+ have rendered them obsolete even where they are implemented.
+
+ Clients SHOULD NOT provide SEND, SAML, or SOML as services. Servers
+ MAY implement them. If they are implemented by servers, the
+ implementation model specified in RFC 821 MUST be used and the
+ command names MUST be published in the response to the EHLO command.
+
+Full Copyright Statement
+
+ Copyright (C) The Internet Society (2001). All Rights Reserved.
+
+ This document and translations of it may be copied and furnished to
+ others, and derivative works that comment on or otherwise explain it
+ or assist in its implementation may be prepared, copied, published
+ and distributed, in whole or in part, without restriction of any
+ kind, provided that the above copyright notice and this paragraph are
+ included on all such copies and derivative works. However, this
+ document itself may not be modified in any way, such as by removing
+ the copyright notice or references to the Internet Society or other
+ Internet organizations, except as needed for the purpose of
+ developing Internet standards in which case the procedures for
+ copyrights defined in the Internet Standards process must be
+ followed, or as required to translate it into languages other than
+ English.
+
+ The limited permissions granted above are perpetual and will not be
+ revoked by the Internet Society or its successors or assigns.
+
+ This document and the information contained herein is provided on an
+ "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
+ TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
+ BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
+ HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
+ MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+Acknowledgement
+
+ Funding for the RFC Editor function is currently provided by the
+ Internet Society.
--- /dev/null
+
+
+
+
+
+
+Network Working Group P. Resnick, Editor
+Request for Comments: 2822 QUALCOMM Incorporated
+Obsoletes: 822 April 2001
+Category: Standards Track
+
+
+ Internet Message Format
+
+Status of this Memo
+
+ This document specifies an Internet standards track protocol for the
+ Internet community, and requests discussion and suggestions for
+ improvements. Please refer to the current edition of the "Internet
+ Official Protocol Standards" (STD 1) for the standardization state
+ and status of this protocol. Distribution of this memo is unlimited.
+
+Copyright Notice
+
+ Copyright (C) The Internet Society (2001). All Rights Reserved.
+
+Abstract
+
+ This standard specifies a syntax for text messages that are sent
+ between computer users, within the framework of "electronic mail"
+ messages. This standard supersedes the one specified in Request For
+ Comments (RFC) 822, "Standard for the Format of ARPA Internet Text
+ Messages", updating it to reflect current practice and incorporating
+ incremental changes that were specified in other RFCs.
+
+Table of Contents
+
+ 1. Introduction ............................................... 3
+ 1.1. Scope .................................................... 3
+ 1.2. Notational conventions ................................... 4
+ 1.2.1. Requirements notation .................................. 4
+ 1.2.2. Syntactic notation ..................................... 4
+ 1.3. Structure of this document ............................... 4
+ 2. Lexical Analysis of Messages ............................... 5
+ 2.1. General Description ...................................... 5
+ 2.1.1. Line Length Limits ..................................... 6
+ 2.2. Header Fields ............................................ 7
+ 2.2.1. Unstructured Header Field Bodies ....................... 7
+ 2.2.2. Structured Header Field Bodies ......................... 7
+ 2.2.3. Long Header Fields ..................................... 7
+ 2.3. Body ..................................................... 8
+ 3. Syntax ..................................................... 9
+ 3.1. Introduction ............................................. 9
+ 3.2. Lexical Tokens ........................................... 9
+
+
+
+Resnick Standards Track [Page 1]
+\f
+RFC 2822 Internet Message Format April 2001
+
+
+ 3.2.1. Primitive Tokens ....................................... 9
+ 3.2.2. Quoted characters ......................................10
+ 3.2.3. Folding white space and comments .......................11
+ 3.2.4. Atom ...................................................12
+ 3.2.5. Quoted strings .........................................13
+ 3.2.6. Miscellaneous tokens ...................................13
+ 3.3. Date and Time Specification ..............................14
+ 3.4. Address Specification ....................................15
+ 3.4.1. Addr-spec specification ................................16
+ 3.5 Overall message syntax ....................................17
+ 3.6. Field definitions ........................................18
+ 3.6.1. The origination date field .............................20
+ 3.6.2. Originator fields ......................................21
+ 3.6.3. Destination address fields .............................22
+ 3.6.4. Identification fields ..................................23
+ 3.6.5. Informational fields ...................................26
+ 3.6.6. Resent fields ..........................................26
+ 3.6.7. Trace fields ...........................................28
+ 3.6.8. Optional fields ........................................29
+ 4. Obsolete Syntax ............................................29
+ 4.1. Miscellaneous obsolete tokens ............................30
+ 4.2. Obsolete folding white space .............................31
+ 4.3. Obsolete Date and Time ...................................31
+ 4.4. Obsolete Addressing ......................................33
+ 4.5. Obsolete header fields ...................................33
+ 4.5.1. Obsolete origination date field ........................34
+ 4.5.2. Obsolete originator fields .............................34
+ 4.5.3. Obsolete destination address fields ....................34
+ 4.5.4. Obsolete identification fields .........................35
+ 4.5.5. Obsolete informational fields ..........................35
+ 4.5.6. Obsolete resent fields .................................35
+ 4.5.7. Obsolete trace fields ..................................36
+ 4.5.8. Obsolete optional fields ...............................36
+ 5. Security Considerations ....................................36
+ 6. Bibliography ...............................................37
+ 7. Editor's Address ...........................................38
+ 8. Acknowledgements ...........................................39
+ Appendix A. Example messages ..................................41
+ A.1. Addressing examples ......................................41
+ A.1.1. A message from one person to another with simple
+ addressing .............................................41
+ A.1.2. Different types of mailboxes ...........................42
+ A.1.3. Group addresses ........................................43
+ A.2. Reply messages ...........................................43
+ A.3. Resent messages ..........................................44
+ A.4. Messages with trace fields ...............................46
+ A.5. White space, comments, and other oddities ................47
+ A.6. Obsoleted forms ..........................................47
+ A.6.1. Obsolete addressing ....................................48
+ A.6.2. Obsolete dates .........................................48
+ A.6.3. Obsolete white space and comments ......................48
+ Appendix B. Differences from earlier standards ................49
+ Appendix C. Notices ...........................................50
+ Full Copyright Statement ......................................51
+
+1. Introduction
+
+1.1. Scope
+
+ This standard specifies a syntax for text messages that are sent
+ between computer users, within the framework of "electronic mail"
+ messages. This standard supersedes the one specified in Request For
+ Comments (RFC) 822, "Standard for the Format of ARPA Internet Text
+ Messages" [RFC822], updating it to reflect current practice and
+ incorporating incremental changes that were specified in other RFCs
+ [STD3].
+
+ This standard specifies a syntax only for text messages. In
+ particular, it makes no provision for the transmission of images,
+ audio, or other sorts of structured data in electronic mail messages.
+ There are several extensions published, such as the MIME document
+ series [RFC2045, RFC2046, RFC2049], which describe mechanisms for the
+ transmission of such data through electronic mail, either by
+ extending the syntax provided here or by structuring such messages to
+ conform to this syntax. Those mechanisms are outside of the scope of
+ this standard.
+
+ In the context of electronic mail, messages are viewed as having an
+ envelope and contents. The envelope contains whatever information is
+ needed to accomplish transmission and delivery. (See [RFC2821] for a
+ discussion of the envelope.) The contents comprise the object to be
+ delivered to the recipient. This standard applies only to the format
+ and some of the semantics of message contents. It contains no
+ specification of the information in the envelope.
+
+ However, some message systems may use information from the contents
+ to create the envelope. It is intended that this standard facilitate
+ the acquisition of such information by programs.
+
+ This specification is intended as a definition of what message
+ content format is to be passed between systems. Though some message
+ systems locally store messages in this format (which eliminates the
+ need for translation between formats) and others use formats that
+ differ from the one specified in this standard, local storage is
+ outside of the scope of this standard.
+
+Resnick Standards Track [Page 3]
+
+ Note: This standard is not intended to dictate the internal formats
+ used by sites, the specific message system features that they are
+ expected to support, or any of the characteristics of user interface
+ programs that create or read messages. In addition, this standard
+ does not specify an encoding of the characters for either transport
+ or storage; that is, it does not specify the number of bits used or
+ how those bits are specifically transferred over the wire or stored
+ on disk.
+
+1.2. Notational conventions
+
+1.2.1. Requirements notation
+
+ This document occasionally uses terms that appear in capital letters.
+ When the terms "MUST", "SHOULD", "RECOMMENDED", "MUST NOT", "SHOULD
+ NOT", and "MAY" appear capitalized, they are being used to indicate
+ particular requirements of this specification. A discussion of the
+ meanings of these terms appears in [RFC2119].
+
+1.2.2. Syntactic notation
+
+ This standard uses the Augmented Backus-Naur Form (ABNF) notation
+ specified in [RFC2234] for the formal definitions of the syntax of
+ messages. Characters will be specified either by a decimal value
+ (e.g., the value %d65 for uppercase A and %d97 for lowercase A) or by
+ a case-insensitive literal value enclosed in quotation marks (e.g.,
+ "A" for either uppercase or lowercase A). See [RFC2234] for the full
+ description of the notation.
+
+1.3. Structure of this document
+
+ This document is divided into several sections.
+
+ This section, section 1, is a short introduction to the document.
+
+ Section 2 lays out the general description of a message and its
+ constituent parts. This is an overview to help the reader understand
+ some of the general principles used in the later portions of this
+ document. Any examples in this section MUST NOT be taken as
+ specification of the formal syntax of any part of a message.
+
+ Section 3 specifies formal ABNF rules for the structure of each part
+ of a message (the syntax) and describes how those parts relate to one
+ another and how they ought to be interpreted in the context of a
+ message (the semantics). This includes analysis of the syntax and semantics of
+ subparts of messages that have specific structure. The syntax
+ included in section 3 represents messages as they MUST be created.
+ There are also notes in section 3 to indicate if any of the options
+ specified in the syntax SHOULD be used over any of the others.
+
+ Both sections 2 and 3 describe messages that are legal to generate
+ for purposes of this standard.
+
+ Section 4 of this document specifies an "obsolete" syntax. There are
+ references in section 3 to these obsolete syntactic elements. The
+ rules of the obsolete syntax are elements that have appeared in
+ earlier revisions of this standard or have previously been widely
+ used in Internet messages. As such, these elements MUST be
+ interpreted by parsers of messages in order to be conformant to this
+ standard. However, since items in this syntax have been determined
+ to be non-interoperable or to cause significant problems for
+ recipients of messages, they MUST NOT be generated by creators of
+ conformant messages.
+
+ Section 5 details security considerations to take into account when
+ implementing this standard.
+
+ Section 6 is a bibliography of references in this document.
+
+ Section 7 contains the editor's address.
+
+ Section 8 contains acknowledgements.
+
+ Appendix A lists examples of different sorts of messages. These
+ examples are not exhaustive of the types of messages that appear on
+ the Internet, but give a broad overview of certain syntactic forms.
+
+ Appendix B lists the differences between this standard and earlier
+ standards for Internet messages.
+
+ Appendix C has copyright and intellectual property notices.
+
+2. Lexical Analysis of Messages
+
+2.1. General Description
+
+ At the most basic level, a message is a series of characters. A
+ message that is conformant with this standard is comprised of
+ characters with values in the range 1 through 127 and interpreted as
+ US-ASCII characters [ASCII]. For brevity, this document sometimes
+ refers to this range of characters as simply "US-ASCII characters".
+
+ Note: This standard specifies that messages are made up of characters
+ in the US-ASCII range of 1 through 127. There are other documents,
+ specifically the MIME document series [RFC2045, RFC2046, RFC2047,
+ RFC2048, RFC2049], that extend this standard to allow for values
+ outside of that range. Discussion of those mechanisms is not within
+ the scope of this standard.
+
+ Messages are divided into lines of characters. A line is a series of
+ characters that is delimited with the two characters carriage-return
+ and line-feed; that is, the carriage return (CR) character (ASCII
+ value 13) followed immediately by the line feed (LF) character (ASCII
+ value 10). (The carriage-return/line-feed pair is usually written in
+ this document as "CRLF".)
+
+ A message consists of header fields (collectively called "the header
+ of the message") followed, optionally, by a body. The header is a
+ sequence of lines of characters with special syntax as defined in
+ this standard. The body is simply a sequence of characters that
+ follows the header and is separated from the header by an empty line
+ (i.e., a line with nothing preceding the CRLF).
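The header/body split described above can be sketched in a few lines of Python. This is an illustration only, not part of the standard: the function name `split_message` is hypothetical, and the input is assumed to be a string that already uses CRLF line endings.

```python
from typing import Optional, Tuple


def split_message(raw: str) -> Tuple[str, Optional[str]]:
    """Split a CRLF-delimited message into (header, body).

    The header ends at the first empty line, i.e. the first CRLF that is
    immediately followed by another CRLF. The body is None when the
    message has no empty line and therefore no body.
    """
    header, separator, body = raw.partition("\r\n\r\n")
    if not separator:
        # No empty line: the whole message is header.
        return raw, None
    # Give the last header line its terminating CRLF back.
    return header + "\r\n", body
```

For example, `split_message("Subject: hi\r\n\r\nHello\r\n")` yields the header `"Subject: hi\r\n"` and the body `"Hello\r\n"`.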
+
+2.1.1. Line Length Limits
+
+ There are two limits that this standard places on the number of
+ characters in a line. Each line of characters MUST be no more than
+ 998 characters, and SHOULD be no more than 78 characters, excluding
+ the CRLF.
+
+ The 998 character limit is due to limitations in many implementations
+ that send, receive, or store Internet Message Format messages and
+ simply cannot handle more than 998 characters on a line. Receiving
+ implementations would do well to handle an arbitrarily large number
+ of characters in a line for robustness' sake. However, so many
+ implementations (in compliance with the transport requirements of
+ [RFC2821]) do not accept messages containing more than 1000
+ characters per line, including the CR and LF, that it is important
+ for implementations not to create such messages.
+
+ The more conservative 78 character recommendation is to accommodate
+ the many implementations of user interfaces that display these
+ messages which may truncate, or disastrously wrap, the display of
+ more than 78 characters per line, in spite of the fact that such
+ implementations are non-conformant to the intent of this
+ specification (and that of [RFC2821] if they actually cause
+ information to be lost). Again, even though this limitation is put on
+ messages, it is incumbent upon implementations which display messages
+ to handle an arbitrarily large number of characters in a line
+ (certainly at least up to the 998 character limit) for the sake of
+ robustness.
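A generator of messages might check both limits before sending. The sketch below is an assumption-laden illustration: the names `check_line_lengths`, `MAX_LINE`, and `SOFT_LINE` are hypothetical, and the message is assumed to be a string with CRLF line endings.

```python
MAX_LINE = 998   # MUST limit, excluding the CRLF
SOFT_LINE = 78   # SHOULD limit, excluding the CRLF


def check_line_lengths(message: str):
    """Return (violations, warnings) as lists of 1-based line numbers.

    A line longer than 998 characters is a violation; a line longer than
    78 characters (but within 998) is only a warning.
    """
    violations, warnings = [], []
    for number, line in enumerate(message.split("\r\n"), start=1):
        if len(line) > MAX_LINE:
            violations.append(number)
        elif len(line) > SOFT_LINE:
            warnings.append(number)
    return violations, warnings
```

Splitting on CRLF means the counted length already excludes the line terminator, which matches how the limits are stated.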
+
+2.2. Header Fields
+
+ Header fields are lines composed of a field name, followed by a colon
+ (":"), followed by a field body, and terminated by CRLF. A field
+ name MUST be composed of printable US-ASCII characters (i.e.,
+ characters that have values between 33 and 126, inclusive), except
+ colon. A field body may be composed of any US-ASCII characters,
+ except for CR and LF. However, a field body may contain CRLF when
+ used in header "folding" and "unfolding" as described in section
+ 2.2.3. All field bodies MUST conform to the syntax described in
+ sections 3 and 4 of this standard.
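As a rough illustration of the field-name rule, the following sketch splits one header field at its first colon and checks that the name uses only characters 33 through 126. The function name `split_field` is hypothetical, and the input is assumed to be a single unfolded field with its trailing CRLF already removed.

```python
def split_field(line: str):
    """Split an unfolded header field into (name, body).

    Raises ValueError if there is no colon, or if the field name is
    empty or contains a character outside the printable US-ASCII range
    33..126 (the colon itself cannot appear in the name because the
    split happens at the first colon).
    """
    name, colon, body = line.partition(":")
    if not colon:
        raise ValueError("header field has no colon")
    if not name or any(not (33 <= ord(ch) <= 126) for ch in name):
        raise ValueError("invalid field name: %r" % name)
    return name, body
```

Note that the body is returned exactly as written, including any leading space after the colon; interpreting it is left to the field-specific syntax.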
+
+2.2.1. Unstructured Header Field Bodies
+
+ Some field bodies in this standard are defined simply as
+ "unstructured" (which is specified below as any US-ASCII characters,
+ except for CR and LF) with no further restrictions. These are
+ referred to as unstructured field bodies. Semantically, unstructured
+ field bodies are simply to be treated as a single line of characters
+ with no further processing (except for header "folding" and
+ "unfolding" as described in section 2.2.3).
+
+2.2.2. Structured Header Field Bodies
+
+ Some field bodies in this standard have specific syntactical
+ structure more restrictive than the unstructured field bodies
+ described above. These are referred to as "structured" field bodies.
+ Structured field bodies are sequences of specific lexical tokens as
+ described in sections 3 and 4 of this standard. Many of these tokens
+ are allowed (according to their syntax) to begin or end with
+ comments (as described in section 3.2.3) as well as the space (SP,
+ ASCII value 32) and horizontal tab (HTAB, ASCII value 9) characters
+ (together known as the white space characters, WSP), and those WSP
+ characters are subject to header "folding" and "unfolding" as
+ described in section 2.2.3. Semantic analysis of structured field
+ bodies is given along with their syntax.
+
+2.2.3. Long Header Fields
+
+ Each header field is logically a single line of characters comprising
+ the field name, the colon, and the field body. For convenience,
+ however, and to deal with the 998/78 character limitations per line,
+ the field body portion of a header field can be split into a
+ multiple-line representation; this is called "folding". The general rule is
+ that wherever this standard allows for folding white space (not
+ simply WSP characters), a CRLF may be inserted before any WSP. For
+ example, the header field:
+
+ Subject: This is a test
+
+ can be represented as:
+
+ Subject: This
+ is a test
+
+ Note: Though structured field bodies are defined in such a way that
+ folding can take place between many of the lexical tokens (and even
+ within some of the lexical tokens), folding SHOULD be limited to
+ placing the CRLF at higher-level syntactic breaks. For instance, if
+ a field body is defined as comma-separated values, it is recommended
+ that folding occur after the comma separating the structured items in
+ preference to other places where the field could be folded, even if
+ it is allowed elsewhere.
+
+ The process of moving from this folded multiple-line representation
+ of a header field to its single line representation is called
+ "unfolding". Unfolding is accomplished by simply removing any CRLF
+ that is immediately followed by WSP. Each header field should be
+ treated in its unfolded form for further syntactic and semantic
+ evaluation.
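Unfolding as described above amounts to a single substitution. A minimal sketch, assuming the field is held in a Python string with CRLF line endings (the name `unfold` is hypothetical):

```python
import re

# A CRLF that is immediately followed by WSP (space or horizontal tab).
# The lookahead keeps the WSP itself in the output.
_FOLD = re.compile(r"\r\n(?=[ \t])")


def unfold(field: str) -> str:
    """Remove every CRLF that is immediately followed by WSP."""
    return _FOLD.sub("", field)
```

The CRLF that terminates the field is not followed by WSP, so it is left in place.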
+
+2.3. Body
+
+ The body of a message is simply lines of US-ASCII characters. The
+ only two limitations on the body are as follows:
+
+ - CR and LF MUST only occur together as CRLF; they MUST NOT appear
+ independently in the body.
+
+ - Lines of characters in the body MUST be limited to 998 characters,
+ and SHOULD be limited to 78 characters, excluding the CRLF.
+
+ Note: As was stated earlier, there are other standards documents,
+ specifically the MIME documents [RFC2045, RFC2046, RFC2048, RFC2049]
+ that extend this standard to allow for different sorts of message
+ bodies. Again, these mechanisms are beyond the scope of this
+ document.
+
+3. Syntax
+
+3.1. Introduction
+
+ The syntax as given in this section defines the legal syntax of
+ Internet messages. Messages that are conformant to this standard
+ MUST conform to the syntax in this section. If there are options in
+ this section where one option SHOULD be generated, that is indicated
+ either in the prose or in a comment next to the syntax.
+
+ For the defined expressions, a short description of the syntax and
+ use is given, followed by the syntax in ABNF, followed by a semantic
+ analysis. Primitive tokens that are used but otherwise unspecified
+ come from [RFC2234].
+
+ In some of the definitions, there will be nonterminals whose names
+ start with "obs-". These "obs-" elements refer to tokens defined in
+ the obsolete syntax in section 4. In all cases, these productions
+ are to be ignored for the purposes of generating legal Internet
+ messages and MUST NOT be used as part of such a message. However,
+ when interpreting messages, these tokens MUST be honored as part of
+ the legal syntax. In this sense, section 3 defines a grammar for
+ generation of messages, with "obs-" elements that are to be ignored,
+ while section 4 adds grammar for interpretation of messages.
+
+3.2. Lexical Tokens
+
+ The following rules are used to define an underlying lexical
+ analyzer, which feeds tokens to the higher-level parsers. This
+ section defines the tokens used in structured header field bodies.
+
+ Note: Readers of this standard need to pay special attention to how
+ these lexical tokens are used in both the lower-level and
+ higher-level syntax later in the document. Particularly, the white
+ space tokens and the comment tokens defined in section 3.2.3 get used
+ in the lower-level tokens defined here, and those lower-level tokens
+ are in turn used as parts of the higher-level tokens defined later.
+ Therefore, the white space and comments may be allowed in the
+ higher-level tokens even though they may not explicitly appear in a
+ particular definition.
+
+3.2.1. Primitive Tokens
+
+ The following are primitive tokens referred to elsewhere in this
+ standard, but not otherwise defined in [RFC2234]. Some of them will
+ not appear anywhere else in the syntax, but they are convenient to
+ refer to in other parts of this document.
+
+ Note: The "specials" below are just such an example. Though the
+ specials token does not appear anywhere else in this standard, it is
+ useful for implementers who use tools that lexically analyze
+ messages. Each of the characters in specials can be used to indicate
+ a tokenization point in lexical analysis.
+
+NO-WS-CTL = %d1-8 / ; US-ASCII control characters
+ %d11 / ; that do not include the
+ %d12 / ; carriage return, line feed,
+ %d14-31 / ; and white space characters
+ %d127
+
+text = %d1-9 / ; Characters excluding CR and LF
+ %d11 /
+ %d12 /
+ %d14-127 /
+ obs-text
+
+specials = "(" / ")" / ; Special characters used in
+ "<" / ">" / ; other parts of the syntax
+ "[" / "]" /
+ ":" / ";" /
+ "@" / "\" /
+ "," / "." /
+ DQUOTE
+
+ No special semantics are attached to these tokens. They are simply
+ single characters.
+
+3.2.2. Quoted characters
+
+ Some characters are reserved for special interpretation, such as
+ delimiting lexical tokens. To permit use of these characters as
+ uninterpreted data, a quoting mechanism is provided.
+
+quoted-pair = ("\" text) / obs-qp
+
+ Where any quoted-pair appears, it is to be interpreted as the text
+ character alone. That is to say, the "\" character that appears as
+ part of a quoted-pair is semantically "invisible".
+
+ Note: The "\" character may appear in a message where it is not part
+ of a quoted-pair. A "\" character that does not appear in a
+ quoted-pair is not semantically invisible. The only places in this
+ standard where quoted-pair currently appears are ccontent, qcontent,
+ dcontent, no-fold-quote, and no-fold-literal.
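The "invisible" backslash can be illustrated with a small helper. This sketch covers only the non-obsolete form ("\" followed by a text character) and ignores obs-qp; the name `unquote_pairs` is hypothetical, and the caller is assumed to apply it only where quoted-pair is permitted by the syntax.

```python
import re

# "\" followed by one character from the text rule:
# %d1-9 / %d11 / %d12 / %d14-127
_QUOTED_PAIR = re.compile(r"\\([\x01-\x09\x0b\x0c\x0e-\x7f])")


def unquote_pairs(content: str) -> str:
    """Replace each quoted-pair with the character it quotes."""
    return _QUOTED_PAIR.sub(r"\1", content)
```

A backslash that is not followed by a text character (for instance, one at the very end of the string) is left untouched.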
+
+
+
+
+
+Resnick Standards Track [Page 10]
+\f
+RFC 2822 Internet Message Format April 2001
+
+
+3.2.3. Folding white space and comments
+
+ White space characters, including white space used in folding
+ (described in section 2.2.3), may appear between many elements in
+ header field bodies. Also, strings of characters that are treated as
+ comments may be included in structured field bodies as characters
+ enclosed in parentheses. The following defines the folding white
+ space (FWS) and comment constructs.
+
+ Strings of characters enclosed in parentheses are considered comments
+ so long as they do not appear within a "quoted-string", as defined in
+ section 3.2.5. Comments may nest.
+
+ There are several places in this standard where comments and FWS may
+ be freely inserted. To accommodate that syntax, an additional token
+ for "CFWS" is defined for places where comments and/or FWS can occur.
+ However, where CFWS occurs in this standard, it MUST NOT be inserted
+ in such a way that any line of a folded header field is made up
+ entirely of WSP characters and nothing else.
+
+FWS = ([*WSP CRLF] 1*WSP) / ; Folding white space
+ obs-FWS
+
+ctext = NO-WS-CTL / ; Non white space controls
+ %d33-39 / ; The rest of the US-ASCII
+ %d42-91 / ; characters not including "(",
+ %d93-126 ; ")", or "\"
+
+ccontent = ctext / quoted-pair / comment
+
+comment = "(" *([FWS] ccontent) [FWS] ")"
+
+CFWS = *([FWS] comment) (([FWS] comment) / FWS)
+
+ Throughout this standard, where FWS (the folding white space token)
+ appears, it indicates a place where header folding, as discussed in
+ section 2.2.3, may take place. Wherever header folding appears in a
+ message (that is, a header field body containing a CRLF followed by
+ any WSP), header unfolding (removal of the CRLF) is performed before
+ any further lexical analysis is performed on that header field
+ according to this standard. That is to say, any CRLF that appears in
+ FWS is semantically "invisible."
+
+ A comment is normally used in a structured field body to provide some
+ human readable informational text. Since a comment is allowed to
+ contain FWS, folding is permitted within the comment. Also note that
+ since quoted-pair is allowed in a comment, the parentheses and
+ backslash characters may appear in a comment so long as they appear
+ as a quoted-pair. Semantically, the enclosing parentheses are not
+ part of the comment; the comment is what is contained between the two
+ parentheses. As stated earlier, the "\" in any quoted-pair and the
+ CRLF in any FWS that appears within the comment are semantically
+ "invisible" and therefore not part of the comment either.
+
+ Runs of FWS, comment or CFWS that occur between lexical tokens in a
+ structured field header are semantically interpreted as a single
+ space character.
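Because comments may nest and may contain quoted-pairs, removing them needs a small state machine rather than a single regular expression. The sketch below is one possible approach, not a complete lexer: the name `strip_comments` is hypothetical, the input is assumed to be an already unfolded structured field body, and each top-level comment is replaced by one space without collapsing neighbouring white space.

```python
def strip_comments(body: str) -> str:
    """Replace each top-level comment in a structured field body with a space.

    Parentheses inside a quoted-string do not start a comment, and a
    quoted-pair inside a comment cannot open or close one.
    """
    out = []
    depth = 0          # current comment nesting level
    in_quotes = False  # inside a quoted-string (only outside comments)
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "\\" and i + 1 < len(body) and (depth or in_quotes):
            # quoted-pair: dropped inside a comment, kept inside a quoted-string
            if not depth:
                out.append(body[i:i + 2])
            i += 2
            continue
        if depth:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
                if depth == 0:
                    out.append(" ")
        elif in_quotes:
            out.append(ch)
            if ch == '"':
                in_quotes = False
        elif ch == "(":
            depth = 1
        else:
            out.append(ch)
            if ch == '"':
                in_quotes = True
        i += 1
    return "".join(out)
```

A caller that wants the "single space" interpretation of a whole run of comments and FWS would still need to collapse adjacent white space outside quoted-strings afterwards.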
+
+3.2.4. Atom
+
+ Several productions in structured header field bodies are simply
+ strings of certain basic characters. Such productions are called
+ atoms.
+
+ Some of the structured header field bodies also allow the period
+ character (".", ASCII value 46) within runs of atext. An additional
+ "dot-atom" token is defined for those purposes.
+
+atext = ALPHA / DIGIT / ; Any character except controls,
+ "!" / "#" / ; SP, and specials.
+ "$" / "%" / ; Used for atoms
+ "&" / "'" /
+ "*" / "+" /
+ "-" / "/" /
+ "=" / "?" /
+ "^" / "_" /
+ "`" / "{" /
+ "|" / "}" /
+ "~"
+
+atom = [CFWS] 1*atext [CFWS]
+
+dot-atom = [CFWS] dot-atom-text [CFWS]
+
+dot-atom-text = 1*atext *("." 1*atext)
+
+ Both atom and dot-atom are interpreted as a single unit, comprised of
+ the string of characters that make it up. Semantically, the optional
+ comments and FWS surrounding the rest of the characters are not part
+ of the atom; the atom is only the run of atext characters in an atom,
+ or the atext and "." characters in a dot-atom.
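The atext and dot-atom-text rules translate directly into a regular expression. A minimal sketch, with hypothetical names (`ATEXT`, `DOT_ATOM_TEXT`, `is_dot_atom_text`) and without the optional surrounding CFWS:

```python
import re

# One atext character: ALPHA / DIGIT / the listed punctuation.
ATEXT = r"[A-Za-z0-9!#$%&'*+\-/=?^_`{|}~]"

# dot-atom-text = 1*atext *("." 1*atext)
DOT_ATOM_TEXT = re.compile(rf"{ATEXT}+(?:\.{ATEXT}+)*")


def is_dot_atom_text(value: str) -> bool:
    """True if the whole string matches dot-atom-text."""
    return DOT_ATOM_TEXT.fullmatch(value) is not None
```

The pattern rejects leading, trailing, and doubled periods, since every "." must sit between runs of atext.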
+
+3.2.5. Quoted strings
+
+ Strings of characters that include characters other than those
+ allowed in atoms may be represented in a quoted string format, where
+ the characters are surrounded by quote (DQUOTE, ASCII value 34)
+ characters.
+
+qtext = NO-WS-CTL / ; Non white space controls
+ %d33 / ; The rest of the US-ASCII
+ %d35-91 / ; characters not including "\"
+ %d93-126 ; or the quote character
+
+qcontent = qtext / quoted-pair
+
+quoted-string = [CFWS]
+ DQUOTE *([FWS] qcontent) [FWS] DQUOTE
+ [CFWS]
+
+ A quoted-string is treated as a unit. That is, quoted-string is
+ identical to atom, semantically. Since a quoted-string is allowed to
+ contain FWS, folding is permitted. Also note that since quoted-pair
+ is allowed in a quoted-string, the quote and backslash characters may
+ appear in a quoted-string so long as they appear as a quoted-pair.
+
+ Semantically, neither the optional CFWS outside of the quote
+ characters nor the quote characters themselves are part of the
+ quoted-string; the quoted-string is what is contained between the two
+ quote characters. As stated earlier, the "\" in any quoted-pair and
+ the CRLF in any FWS/CFWS that appears within the quoted-string are
+ semantically "invisible" and therefore not part of the quoted-string
+ either.
+
+3.2.6. Miscellaneous tokens
+
+ Three additional tokens are defined: word and phrase, for
+ combinations of atoms and/or quoted-strings, and unstructured, for
+ use in unstructured header fields and in some places within
+ structured header fields.
+
+word = atom / quoted-string
+
+phrase = 1*word / obs-phrase
+
+utext = NO-WS-CTL / ; Non white space controls
+ %d33-126 / ; The rest of US-ASCII
+ obs-utext
+
+unstructured = *([FWS] utext) [FWS]
+
+3.3. Date and Time Specification
+
+ Date and time occur in several header fields. This section specifies
+ the syntax for a full date and time specification. Though folding
+ white space is permitted throughout the date-time specification, it
+ is RECOMMENDED that a single space be used in each place that FWS
+ appears (whether it is required or optional); some older
+ implementations may not interpret other occurrences of folding white
+ space correctly.
+
+date-time = [ day-of-week "," ] date FWS time [CFWS]
+
+day-of-week = ([FWS] day-name) / obs-day-of-week
+
+day-name = "Mon" / "Tue" / "Wed" / "Thu" /
+ "Fri" / "Sat" / "Sun"
+
+date = day month year
+
+year = 4*DIGIT / obs-year
+
+month = (FWS month-name FWS) / obs-month
+
+month-name = "Jan" / "Feb" / "Mar" / "Apr" /
+ "May" / "Jun" / "Jul" / "Aug" /
+ "Sep" / "Oct" / "Nov" / "Dec"
+
+day = ([FWS] 1*2DIGIT) / obs-day
+
+time = time-of-day FWS zone
+
+time-of-day = hour ":" minute [ ":" second ]
+
+hour = 2DIGIT / obs-hour
+
+minute = 2DIGIT / obs-minute
+
+second = 2DIGIT / obs-second
+
+zone = (( "+" / "-" ) 4DIGIT) / obs-zone
+
+ The day is the numeric day of the month. The year is any numeric
+ year 1900 or later.
+
+ The time-of-day specifies the number of hours, minutes, and
+ optionally seconds since midnight of the date indicated.
+
+ The date and time-of-day SHOULD express local time.
+
+ The zone specifies the offset from Coordinated Universal Time (UTC,
+ formerly referred to as "Greenwich Mean Time") that the date and
+ time-of-day represent. The "+" or "-" indicates whether the
+ time-of-day is ahead of (i.e., east of) or behind (i.e., west of)
+ Universal Time. The first two digits indicate the number of hours
+ difference from Universal Time, and the last two digits indicate the
+ number of minutes difference from Universal Time. (Hence, +hhmm
+ means +(hh * 60 + mm) minutes, and -hhmm means -(hh * 60 + mm)
+ minutes). The form "+0000" SHOULD be used to indicate a time zone at
+ Universal Time. Though "-0000" also indicates Universal Time, it is
+ used to indicate that the time was generated on a system that may be
+ in a local time zone other than Universal Time and therefore
+ indicates that the date-time contains no information about the local
+ time zone.
+
+ A date-time specification MUST be semantically valid. That is, the
+ day-of-the-week (if included) MUST be the day implied by the date,
+ the numeric day-of-month MUST be between 1 and the number of days
+ allowed for the specified month (in the specified year), the
+ time-of-day MUST be in the range 00:00:00 through 23:59:60 (the
+ number of seconds allowing for a leap second; see [STD12]), and the
+ zone MUST be within the range -9959 through +9959.
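The zone arithmetic and the day-of-week check can be sketched as follows. The helper names `zone_to_minutes` and `day_name_matches` are hypothetical; `parsedate_to_datetime` is the Python standard library parser in `email.utils`. The sketch assumes a well-formed "+hhmm" or "-hhmm" zone and a date-time that includes the optional day-of-week.

```python
from email.utils import parsedate_to_datetime

DAY_NAMES = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")


def zone_to_minutes(zone: str) -> int:
    """Convert "+hhmm" / "-hhmm" to a signed offset in minutes."""
    sign = -1 if zone[0] == "-" else 1
    hours, minutes = int(zone[1:3]), int(zone[3:5])
    return sign * (hours * 60 + minutes)


def day_name_matches(date_time: str) -> bool:
    """True if the leading day-of-week is the day implied by the date."""
    parsed = parsedate_to_datetime(date_time)
    stated = date_time.split(",", 1)[0].strip()
    # datetime.weekday() numbers Monday as 0, matching DAY_NAMES.
    return DAY_NAMES[parsed.weekday()] == stated
```

For instance, "+0530" corresponds to +(5 * 60 + 30) = 330 minutes and "-0800" to -(8 * 60 + 0) = -480 minutes.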
+
+3.4. Address Specification
+
+ Addresses occur in several message header fields to indicate senders
+ and recipients of messages. An address may either be an individual
+ mailbox, or a group of mailboxes.
+
+address = mailbox / group
+
+mailbox = name-addr / addr-spec
+
+name-addr = [display-name] angle-addr
+
+angle-addr = [CFWS] "<" addr-spec ">" [CFWS] / obs-angle-addr
+
+group = display-name ":" [mailbox-list / CFWS] ";"
+ [CFWS]
+
+display-name = phrase
+
+mailbox-list = (mailbox *("," mailbox)) / obs-mbox-list
+
+address-list = (address *("," address)) / obs-addr-list
+
+ A mailbox receives mail. It is a conceptual entity which does not
+ necessarily pertain to file storage. For example, some sites may
+ choose to print mail on a printer and deliver the output to the
+ addressee's desk. Normally, a mailbox is comprised of two parts: (1)
+ an optional display name that indicates the name of the recipient
+ (which could be a person or a system) that could be displayed to the
+ user of a mail application, and (2) an addr-spec address enclosed in
+ angle brackets ("<" and ">"). There is also an alternate simple form
+ of a mailbox where the addr-spec address appears alone, without the
+ recipient's name or the angle brackets. The Internet addr-spec
+ address is described in section 3.4.1.
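The two mailbox forms can be told apart with the Python standard library's `email.utils.parseaddr`, which returns a (display name, addr-spec) pair. The wrapper `describe_mailbox` below is a hypothetical helper added for illustration; it is not a full implementation of the mailbox grammar.

```python
from email.utils import parseaddr


def describe_mailbox(mailbox: str) -> dict:
    """Return the display name (or None) and addr-spec of a single mailbox."""
    display_name, addr_spec = parseaddr(mailbox)
    return {"display_name": display_name or None, "addr_spec": addr_spec}
```

With the name-addr form `John Doe <jdoe@machine.example>` the display name is "John Doe"; with the bare addr-spec form `jdoe@machine.example` there is no display name.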
+
+ Note: Some legacy implementations used the simple form where the
+ addr-spec appears without the angle brackets, but included the name
+ of the recipient in parentheses as a comment following the addr-spec.
+ Since the meaning of the information in a comment is unspecified,
+ implementations SHOULD use the full name-addr form of the mailbox,
+ instead of the legacy form, to specify the display name associated
+ with a mailbox. Also, because some legacy implementations interpret
+ the comment, comments generally SHOULD NOT be used in address fields
+ to avoid confusing such implementations.
+
+ When it is desirable to treat several mailboxes as a single unit
+ (e.g., in a distribution list), the group construct can be used. The
+ group construct allows the sender to indicate a named group of
+ recipients. This is done by giving a display name for the group,
+ followed by a colon, followed by a comma separated list of any number
+ of mailboxes (including zero and one), and ending with a semicolon.
+ Because the list of mailboxes can be empty, using the group construct
+ is also a simple way to communicate to recipients that the message
+ was sent to one or more named sets of recipients, without actually
+ providing the individual mailbox address for each of those
+ recipients.
+
+3.4.1. Addr-spec specification
+
+ An addr-spec is a specific Internet identifier that contains a
+ locally interpreted string followed by the at-sign character ("@",
+ ASCII value 64) followed by an Internet domain. The locally
+ interpreted string is either a quoted-string or a dot-atom. If the
+ string can be represented as a dot-atom (that is, it contains no
+ characters other than atext characters or "." surrounded by atext
+ characters), then the dot-atom form SHOULD be used and the
+ quoted-string form SHOULD NOT be used. Comments and folding white
+ space SHOULD NOT be used around the "@" in the addr-spec.
+
+addr-spec = local-part "@" domain
+
+local-part = dot-atom / quoted-string / obs-local-part
+
+domain = dot-atom / domain-literal / obs-domain
+
+domain-literal = [CFWS] "[" *([FWS] dcontent) [FWS] "]" [CFWS]
+
+dcontent = dtext / quoted-pair
+
+dtext = NO-WS-CTL / ; Non white space controls
+ %d33-90 / ; The rest of the US-ASCII
+ %d94-126 ; characters not including "[",
+ ; "]", or "\"
+
+ The domain portion identifies the point to which the mail is
+ delivered. In the dot-atom form, this is interpreted as an Internet
+ domain name (either a host name or a mail exchanger name) as
+ described in [STD3, STD13, STD14]. In the domain-literal form, the
+ domain is interpreted as the literal Internet address of the
+ particular host. In both cases, how addressing is used and how
+ messages are transported to a particular host is covered in the mail
+ transport document [RFC2821]. These mechanisms are outside of the
+ scope of this document.
+
+ The local-part portion is a domain dependent string. In addresses,
+ it is simply interpreted on the particular host as a name of a
+ particular mailbox.
+
+3.5. Overall message syntax
+
+ A message consists of header fields, optionally followed by a message
+ body. Lines in a message MUST be a maximum of 998 characters
+ excluding the CRLF, but it is RECOMMENDED that lines be limited to 78
+ characters excluding the CRLF. (See section 2.1.1 for explanation.)
+ In a message body, though all of the characters listed in the text
+ rule MAY be used, the use of US-ASCII control characters (values 1
+ through 8, 11, 12, and 14 through 31) is discouraged since their
+ interpretation by receivers for display is not guaranteed.
+
+message = (fields / obs-fields)
+ [CRLF body]
+
+body = *(*998text CRLF) *998text
+
+ The header fields carry most of the semantic information and are
+ defined in section 3.6. The body is simply a series of lines of text
+ which are uninterpreted for the purposes of this standard.
+
+3.6. Field definitions
+
+ The header fields of a message are defined here. All header fields
+ have the same general syntactic structure: A field name, followed by
+ a colon, followed by the field body. The specific syntax for each
+ header field is defined in the subsequent sections.
+
+ Note: In the ABNF syntax for each field in subsequent sections, each
+ field name is followed by the required colon. However, for brevity
+ sometimes the colon is not referred to in the textual description of
+ the syntax. It is, nonetheless, required.
+
+ It is important to note that the header fields are not guaranteed to
+ be in a particular order. They may appear in any order, and they
+ have been known to be reordered occasionally when transported over
+ the Internet. However, for the purposes of this standard, header
+ fields SHOULD NOT be reordered when a message is transported or
+ transformed. More importantly, the trace header fields and resent
+ header fields MUST NOT be reordered, and SHOULD be kept in blocks
+ prepended to the message. See sections 3.6.6 and 3.6.7 for more
+ information.
+
+ The only required header fields are the origination date field and
+ the originator address field(s). All other header fields are
+ syntactically optional. More information is contained in the table
+ following this definition.
+
+fields = *(trace
+ *(resent-date /
+ resent-from /
+ resent-sender /
+ resent-to /
+ resent-cc /
+ resent-bcc /
+ resent-msg-id))
+ *(orig-date /
+ from /
+ sender /
+ reply-to /
+ to /
+ cc /
+ bcc /
+ message-id /
+ in-reply-to /
+ references /
+ subject /
+ comments /
+ keywords /
+ optional-field)
+
+ The following table indicates limits on the number of times each
+ field may occur in a message header as well as any special
+ limitations on the use of those fields. An asterisk next to a value
+ in the minimum or maximum column indicates that a special restriction
+ appears in the Notes column.
+
+Field           Min number      Max number      Notes
+
+trace           0               unlimited       Block prepended - see
+                                                3.6.7
+
+resent-date     0*              unlimited*      One per block, required
+                                                if other resent fields
+                                                present - see 3.6.6
+
+resent-from     0               unlimited*      One per block - see
+                                                3.6.6
+
+resent-sender   0*              unlimited*      One per block, MUST
+                                                occur with multi-address
+                                                resent-from - see 3.6.6
+
+resent-to       0               unlimited*      One per block - see
+                                                3.6.6
+
+resent-cc       0               unlimited*      One per block - see
+                                                3.6.6
+
+resent-bcc      0               unlimited*      One per block - see
+                                                3.6.6
+
+resent-msg-id   0               unlimited*      One per block - see
+                                                3.6.6
+
+orig-date       1               1
+
+from            1               1               See sender and 3.6.2
+
+
+
+Resnick Standards Track [Page 19]
+\f
+RFC 2822 Internet Message Format April 2001
+
+
+sender          0*              1               MUST occur with multi-
+                                                address from - see 3.6.2
+
+reply-to        0               1
+
+to              0               1
+
+cc              0               1
+
+bcc             0               1
+
+message-id      0*              1               SHOULD be present - see
+                                                3.6.4
+
+in-reply-to     0*              1               SHOULD occur in some
+                                                replies - see 3.6.4
+
+references      0*              1               SHOULD occur in some
+                                                replies - see 3.6.4
+
+subject         0               1
+
+comments        0               unlimited
+
+keywords        0               unlimited
+
+optional-field  0               unlimited
+
+ The exact interpretation of each field is described in subsequent
+ sections.
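The occurrence limits in the table can be checked mechanically. The sketch below is a minimal, hypothetical checker (the names `LIMITS` and `count_violations` are not part of the standard). It covers only the non-resent, non-trace fields, and it compares field names case-insensitively because ABNF string literals are case-insensitive.

```python
from collections import Counter

# (minimum, maximum) occurrences per message; None stands for "unlimited".
LIMITS = {
    "date": (1, 1), "from": (1, 1), "sender": (0, 1), "reply-to": (0, 1),
    "to": (0, 1), "cc": (0, 1), "bcc": (0, 1), "message-id": (0, 1),
    "in-reply-to": (0, 1), "references": (0, 1), "subject": (0, 1),
    "comments": (0, None), "keywords": (0, None),
}

def count_violations(field_names):
    """Return a list of human-readable problems for the given field names."""
    counts = Counter(name.lower() for name in field_names)
    problems = []
    for name, (low, high) in LIMITS.items():
        n = counts[name]
        if n < low:
            problems.append(f"{name}: missing")
        elif high is not None and n > high:
            problems.append(f"{name}: {n} occurrences, at most {high} allowed")
    return problems
```

The starred entries in the table (for example, sender being required with a multi-address from) need field contents, not just counts, so they are outside this sketch.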
+
+3.6.1. The origination date field
+
+ The origination date field consists of the field name "Date" followed
+ by a date-time specification.
+
+orig-date = "Date:" date-time CRLF
+
+ The origination date specifies the date and time at which the creator
+ of the message indicated that the message was complete and ready to
+ enter the mail delivery system. For instance, this might be the time
+ that a user pushes the "send" or "submit" button in an application
+ program. In any case, it is specifically not intended to convey the
+ time that the message is actually transported, but rather the time at
+ which the human or other creator of the message has put the message
+ into its final form, ready for transport. (For example, a portable
+ computer user who is not connected to a network might queue a message
+
+
+
+
+Resnick Standards Track [Page 20]
+\f
+RFC 2822 Internet Message Format April 2001
+
+
+ for delivery. The origination date is intended to contain the date
+ and time that the user queued the message, not the time when the user
+ connected to the network to send the message.)
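The distinction between queueing time and transport time might be sketched as follows. The helper `queue_message` is hypothetical; it assumes the caller passes a timezone-aware `datetime` taken at the moment the user queues the message, and it relies on the standard library's `email.utils.format_datetime` to render the date-time.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

def queue_message(fields, queued_at):
    """Stamp "Date" when the message is queued, not when it is later sent."""
    return [("Date", format_datetime(queued_at))] + list(fields)

# Example: a message queued offline at noon, five hours west of UT.
queued_at = datetime(2001, 4, 1, 12, 0, 0, tzinfo=timezone(timedelta(hours=-5)))
stamped = queue_message([("Subject", "Hello")], queued_at)
```

Whatever later delivers the message would leave the "Date" value alone, however long the message sat in the queue.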
+
+3.6.2. Originator fields
+
+ The originator fields of a message consist of the from field, the
+ sender field (when applicable), and optionally the reply-to field.
+ The from field consists of the field name "From" and a
+ comma-separated list of one or more mailbox specifications. If the
+ from field contains more than one mailbox specification in the
+ mailbox-list, then the sender field, containing the field name
+ "Sender" and a single mailbox specification, MUST appear in the
+ message. In either case, an optional reply-to field MAY also be
+ included, which contains the field name "Reply-To" and a
+ comma-separated list of one or more addresses.
+
+from = "From:" mailbox-list CRLF
+
+sender = "Sender:" mailbox CRLF
+
+reply-to = "Reply-To:" address-list CRLF
+
+ The originator fields indicate the mailbox(es) of the source of the
+ message. The "From:" field specifies the author(s) of the message,
+ that is, the mailbox(es) of the person(s) or system(s) responsible
+ for the writing of the message. The "Sender:" field specifies the
+ mailbox of the agent responsible for the actual transmission of the
+ message. For example, if a secretary were to send a message for
+ another person, the mailbox of the secretary would appear in the
+ "Sender:" field and the mailbox of the actual author would appear in
+ the "From:" field. If the originator of the message can be indicated
+ by a single mailbox and the author and transmitter are identical, the
+ "Sender:" field SHOULD NOT be used. Otherwise, both fields SHOULD
+ appear.
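The relationship between "From:" and "Sender:" described above can be expressed as a small check. This is only a sketch: `sender_problem` is a hypothetical name, and the mailboxes are assumed to be already parsed into plain strings.

```python
def sender_problem(from_mailboxes, sender=None):
    """Return a description of the problem, or None if the fields are consistent."""
    if not from_mailboxes:
        return "From must contain at least one mailbox"
    if len(from_mailboxes) > 1 and sender is None:
        return "Sender MUST appear when From contains more than one mailbox"
    if len(from_mailboxes) == 1 and sender == from_mailboxes[0]:
        return "Sender SHOULD NOT be used when identical to the single From mailbox"
    return None
```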
+
+ The originator fields also provide the information required when
+ replying to a message. When the "Reply-To:" field is present, it
+ indicates the mailbox(es) to which the author of the message suggests
+ that replies be sent. In the absence of the "Reply-To:" field,
+ replies SHOULD by default be sent to the mailbox(es) specified in the
+ "From:" field unless otherwise specified by the person composing the
+ reply.
+
+ In all cases, the "From:" field SHOULD NOT contain any mailbox that
+ does not belong to the author(s) of the message. See also section
+ 3.6.3 for more information on forming the destination addresses for a
+ reply.
+
+
+
+Resnick Standards Track [Page 21]
+\f
+RFC 2822 Internet Message Format April 2001
+
+
+3.6.3. Destination address fields
+
+ The destination fields of a message consist of three possible fields,
+ each of the same form: The field name, which is either "To", "Cc", or
+ "Bcc", followed by a comma-separated list of one or more addresses
+ (either mailbox or group syntax).
+
+to = "To:" address-list CRLF
+
+cc = "Cc:" address-list CRLF
+
+bcc = "Bcc:" (address-list / [CFWS]) CRLF
+
+ The destination fields specify the recipients of the message. Each
+ destination field may have one or more addresses, and each address
+ indicates an intended recipient of the message. The only
+ difference between the three fields is how each is used.
+
+ The "To:" field contains the address(es) of the primary recipient(s)
+ of the message.
+
+ The "Cc:" field (where the "Cc" means "Carbon Copy" in the sense of
+ making a copy on a typewriter using carbon paper) contains the
+ addresses of others who are to receive the message, though the
+ content of the message may not be directed at them.
+
+ The "Bcc:" field (where the "Bcc" means "Blind Carbon Copy") contains
+ addresses of recipients of the message whose addresses are not to be
+ revealed to other recipients of the message. There are three ways in
+ which the "Bcc:" field is used. In the first case, when a message
+ containing a "Bcc:" field is prepared to be sent, the "Bcc:" line is
+ removed even though all of the recipients (including those specified
+ in the "Bcc:" field) are sent a copy of the message. In the second
+ case, recipients specified in the "To:" and "Cc:" lines each are sent
+ a copy of the message with the "Bcc:" line removed as above, but the
+ recipients on the "Bcc:" line get a separate copy of the message
+ containing a "Bcc:" line. (When there are multiple recipient
+ addresses in the "Bcc:" field, some implementations actually send a
+ separate copy of the message to each recipient with a "Bcc:"
+ containing only the address of that particular recipient.) Finally,
+ since a "Bcc:" field may contain no addresses, a "Bcc:" field can be
+ sent without any addresses indicating to the recipients that blind
+ copies were sent to someone. Which method to use with "Bcc:" fields
+ is implementation dependent, but refer to the "Security
+ Considerations" section of this document for a discussion of each.
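The three "Bcc:" handling methods can be compared side by side in a sketch like the one below. All names are hypothetical. The header is assumed to be a list of (name, value) pairs with no "Bcc:" field in it, and the second method is shown in its per-recipient variant mentioned in the parenthetical above.

```python
def bcc_copies(header, to_cc, bcc, method):
    """Return a list of (envelope_recipients, header_fields) pairs to send."""
    if method == "strip":
        # First case: one copy for everyone, with no "Bcc:" line at all.
        return [(to_cc + bcc, header)]
    if method == "separate":
        # Second case: To/Cc recipients get the stripped copy; each blind
        # recipient gets a copy whose "Bcc:" names only that recipient.
        visible = [(to_cc, header)] if to_cc else []
        blind = [([addr], header + [("Bcc", addr)]) for addr in bcc]
        return visible + blind
    if method == "empty":
        # Third case: an empty "Bcc:" tells recipients blind copies exist.
        return [(to_cc + bcc, header + [("Bcc", "")])]
    raise ValueError(f"unknown method: {method}")
```

Which method to pick remains implementation dependent, as the text says; the sketch only makes the differences in what each recipient sees explicit.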
+
+
+
+
+
+
+Resnick Standards Track [Page 22]
+\f
+RFC 2822 Internet Message Format April 2001
+
+
+ When a message is a reply to another message, the mailboxes of the
+ authors of the original message (the mailboxes in the "From:" field)
+ or mailboxes specified in the "Reply-To:" field (if it exists) MAY
+ appear in the "To:" field of the reply since these would normally be
+ the primary recipients of the reply. If a reply is sent to a message
+ that has destination fields, it is often desirable to send a copy of
+ the reply to all of the recipients of the message, in addition to the
+ author. When such a reply is formed, addresses in the "To:" and
+ "Cc:" fields of the original message MAY appear in the "Cc:" field of
+ the reply, since these are normally secondary recipients of the
+ reply. If a "Bcc:" field is present in the original message,
+ addresses in that field MAY appear in the "Bcc:" field of the reply,
+ but SHOULD NOT appear in the "To:" or "Cc:" fields.
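One possible way to form the destination fields of a reply, following the MAY-level guidance above, is sketched here. The function `reply_recipients` and the dictionary layout are assumptions for illustration. Leaving the replier's own address and duplicates out of "Cc:" is a common convention, not something this section requires.

```python
def reply_recipients(original, replier=None, reply_all=False):
    """Build "To" and "Cc" lists for a reply.

    `original` maps field names to lists of already-parsed addresses.
    """
    # Reply-To, when present, takes the place of From as the primary target.
    to = list(original.get("Reply-To") or original.get("From", []))
    cc = []
    if reply_all:
        for addr in original.get("To", []) + original.get("Cc", []):
            # Assumption: skip duplicates and the replier's own address.
            if addr not in to and addr not in cc and addr != replier:
                cc.append(addr)
    return {"To": to, "Cc": cc}
```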
+
+ Note: Some mail applications have automatic reply commands that
+ include the destination addresses of the original message in the
+ destination addresses of the reply. How those reply commands behave
+ is implementation dependent and is beyond the scope of this document.
+ In particular, whether or not to include the original destination
+ addresses when the original message had a "Reply-To:" field is not
+ addressed here.
+
+3.6.4. Identification fields
+
+ Though optional, every message SHOULD have a "Message-ID:" field.
+ Furthermore, reply messages SHOULD have "In-Reply-To:" and
+ "References:" fields as appropriate, as described below.
+
+ The "Message-ID:" field contains a single unique message identifier.
+ The "References:" and "In-Reply-To:" fields each contain one or more
+ unique message identifiers, optionally separated by CFWS.
+
+ The message identifier (msg-id) is similar in syntax to an angle-addr
+ construct without the internal CFWS.
+
+message-id = "Message-ID:" msg-id CRLF
+
+in-reply-to = "In-Reply-To:" 1*msg-id CRLF
+
+references = "References:" 1*msg-id CRLF
+
+msg-id = [CFWS] "<" id-left "@" id-right ">" [CFWS]
+
+id-left = dot-atom-text / no-fold-quote / obs-id-left
+
+id-right = dot-atom-text / no-fold-literal / obs-id-right
+
+no-fold-quote = DQUOTE *(qtext / quoted-pair) DQUOTE
+
+
+
+Resnick Standards Track [Page 23]
+\f
+RFC 2822 Internet Message Format April 2001
+
+
+no-fold-literal = "[" *(dtext / quoted-pair) "]"
+
+ The "Message-ID:" field provides a unique message identifier that
+ refers to a particular version of a particular message. The
+ uniqueness of the message identifier is guaranteed by the host that
+ generates it (see below). This message identifier is intended to be
+ machine readable and not necessarily meaningful to humans. A message
+ identifier pertains to exactly one instantiation of a particular
+ message; subsequent revisions to the message each receive new message
+ identifiers.
+
+ Note: There are many instances when messages are "changed", but those
+ changes do not constitute a new instantiation of that message, and
+ therefore the message would not get a new message identifier. For
+ example, when messages are introduced into the transport system, they
+ are often prepended with additional header fields such as trace
+ fields (described in section 3.6.7) and resent fields (described in
+ section 3.6.6). The addition of such header fields does not change
+ the identity of the message and therefore the original "Message-ID:"
+ field is retained. In all cases, it is the meaning that the sender
+ of the message wishes to convey (i.e., whether this is the same
+ message or a different message) that determines whether or not the
+ "Message-ID:" field changes, not any particular syntactic difference
+ that appears (or does not appear) in the message.
+
+ The "In-Reply-To:" and "References:" fields are used when creating a
+ reply to a message. They hold the message identifier of the original
+ message and the message identifiers of other messages (for example,
+ in the case of a reply to a message which was itself a reply). The
+ "In-Reply-To:" field may be used to identify the message (or
+ messages) to which the new message is a reply, while the
+ "References:" field may be used to identify a "thread" of
+ conversation.
+
+ When creating a reply to a message, the "In-Reply-To:" and
+ "References:" fields of the resultant message are constructed as
+ follows:
+
+ The "In-Reply-To:" field will contain the contents of the "Message-
+ ID:" field of the message to which this one is a reply (the "parent
+ message"). If there is more than one parent message, then the "In-
+ Reply-To:" field will contain the contents of all of the parents'
+ "Message-ID:" fields. If there is no "Message-ID:" field in any of
+ the parent messages, then the new message will have no "In-Reply-To:"
+ field.
+
+
+
+
+
+
+Resnick Standards Track [Page 24]
+\f
+RFC 2822 Internet Message Format April 2001
+
+
+ The "References:" field will contain the contents of the parent's
+ "References:" field (if any) followed by the contents of the parent's
+ "Message-ID:" field (if any). If the parent message does not contain
+ a "References:" field but does have an "In-Reply-To:" field
+ containing a single message identifier, then the "References:" field
+ will contain the contents of the parent's "In-Reply-To:" field
+ followed by the contents of the parent's "Message-ID:" field (if
+ any). If the parent has none of the "References:", "In-Reply-To:",
+ or "Message-ID:" fields, then the new message will have no
+ "References:" field.
+
+ Note: Some implementations parse the "References:" field to display
+ the "thread of the discussion". These implementations assume that
+ each new message is a reply to a single parent and hence that they
+ can walk backwards through the "References:" field to find the parent
+ of each message listed there. Therefore, trying to form a
+ "References:" field for a reply that has multiple parents is
+ discouraged and how to do so is not defined in this document.
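The construction rules for "In-Reply-To:" and "References:" in the single-parent case can be sketched as below. The name `threading_fields` and the dictionary representation (a string for "Message-ID", lists of msg-id strings for the other two fields) are assumptions made for the example.

```python
def threading_fields(parent):
    """Derive the reply's In-Reply-To and References from one parent message."""
    msg_id = parent.get("Message-ID")
    refs = list(parent.get("References") or [])
    in_reply_to = parent.get("In-Reply-To") or []
    if not refs and len(in_reply_to) == 1:
        # No References in the parent: fall back to its single In-Reply-To.
        refs = list(in_reply_to)
    if msg_id:
        refs.append(msg_id)
    reply = {}
    if msg_id:
        reply["In-Reply-To"] = [msg_id]
    if refs:
        reply["References"] = refs
    return reply
```

A parent with none of the three fields yields an empty result, so the reply carries neither field.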
+
+ The message identifier (msg-id) itself MUST be a globally unique
+ identifier for a message. The generator of the message identifier
+ MUST guarantee that the msg-id is unique. There are several
+ algorithms that can be used to accomplish this. Since the msg-id has
+ a similar syntax to angle-addr (identical except that comments and
+ folding white space are not allowed), a good method is to put the
+ domain name (or a domain literal IP address) of the host on which the
+ message identifier was created on the right hand side of the "@", and
+ put a combination of the current absolute date and time along with
+ some other currently unique (perhaps sequential) identifier available
+ on the system (for example, a process id number) on the left hand
+ side. Using a date on the left hand side and a domain name or domain
+ literal on the right hand side makes it possible to guarantee
+ uniqueness since no two hosts use the same domain name or IP address
+ at the same time. Though other algorithms will work, it is
+ RECOMMENDED that the right hand side contain some domain identifier
+ (either of the host itself or otherwise) such that the generator of
+ the message identifier can guarantee the uniqueness of the left hand
+ side within the scope of that domain.
+
+ Semantically, the angle bracket characters are not part of the
+ msg-id; the msg-id is what is contained between the two angle bracket
+ characters.
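The suggested generation method (date and time plus another locally unique value on the left, a domain on the right) might look like this in Python. It is a sketch under assumptions: `make_msg_id` is a hypothetical helper, the process id plus an in-process counter stand in for the "other currently unique identifier", and the caller supplies a domain it actually controls.

```python
import itertools
import os
import time

_counter = itertools.count()  # distinguishes ids made within the same second

def make_msg_id(domain, now=None, pid=None):
    """Build a msg-id of the form <date-time.pid.counter@domain>."""
    now = time.time() if now is None else now
    pid = os.getpid() if pid is None else pid
    stamp = time.strftime("%Y%m%d%H%M%S", time.gmtime(now))
    # Digits and dots only, so the left-hand side is valid dot-atom-text.
    return f"<{stamp}.{pid}.{next(_counter)}@{domain}>"
```

Python's standard library also provides `email.utils.make_msgid`, which serves the same purpose with a different left-hand side.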
+
+
+
+
+
+
+
+
+
+Resnick Standards Track [Page 25]
+\f
+RFC 2822 Internet Message Format April 2001
+
+
+3.6.5. Informational fields
+
+ The informational fields are all optional. The "Keywords:" field
+ contains a comma-separated list of one or more words or
+ quoted-strings. The "Subject:" and "Comments:" fields are
+ unstructured fields as defined in section 2.2.1, and therefore may
+ contain text or folding white space.
+
+subject = "Subject:" unstructured CRLF
+
+comments = "Comments:" unstructured CRLF
+
+keywords = "Keywords:" phrase *("," phrase) CRLF
+
+ These three fields are intended to have only human-readable content
+ with information about the message. The "Subject:" field is the most
+ common and contains a short string identifying the topic of the
+ message. When used in a reply, the field body MAY start with the
+ string "Re: " (from the Latin "res", in the matter of) followed by
+ the contents of the "Subject:" field body of the original message.
+ If this is done, only one instance of the literal string "Re: " ought
+ to be used since use of other strings or more than one instance can
+ lead to undesirable consequences. The "Comments:" field contains any
+ additional comments on the text of the body of the message. The
+ "Keywords:" field contains a comma-separated list of important words
+ and phrases that might be useful for the recipient.
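The "only one instance" guidance for the reply prefix is easy to get wrong, so a minimal sketch may help. The function name is hypothetical, and the check is deliberately strict: it recognizes only the literal string "Re: " and makes no attempt to normalize other prefixes.

```python
def reply_subject(original_subject):
    """Prefix "Re: " to a subject unless it already starts with it."""
    if original_subject.startswith("Re: "):
        return original_subject
    return "Re: " + original_subject
```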
+
+3.6.6. Resent fields
+
+ Resent fields SHOULD be added to any message that is reintroduced by
+ a user into the transport system. A separate set of resent fields
+ SHOULD be added each time this is done. All of the resent fields
+ corresponding to a particular resending of the message SHOULD be
+ together. Each new set of resent fields is prepended to the message;
+ that is, the most recent set of resent fields appears earlier in the
+ message. No other fields in the message are changed when resent
+ fields are added.
+
+ Each of the resent fields corresponds to a particular field elsewhere
+ in the syntax. For instance, the "Resent-Date:" field corresponds to
+ the "Date:" field and the "Resent-To:" field corresponds to the "To:"
+ field. In each case, the syntax for the field body is identical to
+ the syntax given previously for the corresponding field.
+
+ When resent fields are used, the "Resent-From:" and "Resent-Date:"
+ fields MUST be sent. The "Resent-Message-ID:" field SHOULD be sent.
+ "Resent-Sender:" SHOULD NOT be used if "Resent-Sender:" would be
+ identical to "Resent-From:".
+
+
+
+Resnick Standards Track [Page 26]
+\f
+RFC 2822 Internet Message Format April 2001
+
+
+resent-date = "Resent-Date:" date-time CRLF
+
+resent-from = "Resent-From:" mailbox-list CRLF
+
+resent-sender = "Resent-Sender:" mailbox CRLF
+
+resent-to = "Resent-To:" address-list CRLF
+
+resent-cc = "Resent-Cc:" address-list CRLF
+
+resent-bcc = "Resent-Bcc:" (address-list / [CFWS]) CRLF
+
+resent-msg-id = "Resent-Message-ID:" msg-id CRLF
+
+ Resent fields are used to identify a message as having been
+ reintroduced into the transport system by a user. The purpose of
+ using resent fields is to have the message appear to the final
+ recipient as if it were sent directly by the original sender, with
+ all of the original fields remaining the same. Each set of resent
+ fields corresponds to a particular resending event. That is, if a
+ message is resent multiple times, each set of resent fields gives
+ identifying information for each individual time. Resent fields are
+ strictly informational. They MUST NOT be used in the normal
+ processing of replies or other such automatic actions on messages.
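Prepending one block of resent fields per resending event could be sketched as follows. The helper `resend` is hypothetical. Header fields are modeled as a list of (name, value) pairs, and only the two required fields, one destination field, and the optional "Resent-Message-ID:" are shown.

```python
def resend(fields, resent_from, resent_date, resent_to, resent_msg_id=None):
    """Return a new field list with one resent block placed in front."""
    block = [
        ("Resent-Date", resent_date),   # MUST be sent
        ("Resent-From", resent_from),   # MUST be sent
        ("Resent-To", resent_to),
    ]
    if resent_msg_id is not None:
        block.append(("Resent-Message-ID", resent_msg_id))  # SHOULD be sent
    # The original fields follow unchanged; the newest block comes first.
    return block + list(fields)
```

Calling the function twice leaves two adjacent blocks at the top of the header, the most recent one first, with the original fields untouched below them.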
+
+ Note: Reintroducing a message into the transport system and using
+ resent fields is a different operation from "forwarding".
+ "Forwarding" has two meanings: One sense of forwarding is that a mail
+ reading program can be told by a user to forward a copy of a message
+ to another person, making the forwarded message the body of the new
+ message. A forwarded message in this sense does not appear to have
+ come from the original sender, but is an entirely new message from
+ the forwarder of the message. On the other hand, forwarding is also
+ used to mean when a mail transport program gets a message and
+ forwards it on to a different destination for final delivery. Resent
+ header fields are not intended for use with either type of
+ forwarding.
+
+ The resent originator fields indicate the mailbox of the person(s) or
+ system(s) that resent the message. As with the regular originator
+ fields, there are two forms: a simple "Resent-From:" form which
+ contains the mailbox of the individual doing the resending, and the
+ more complex form, when one individual (identified in the
+ "Resent-Sender:" field) resends a message on behalf of one or more
+ others (identified in the "Resent-From:" field).
+
+ Note: When replying to a resent message, replies behave just as they
+ would with any other message, using the original "From:",
+
+
+
+Resnick Standards Track [Page 27]
+\f
+RFC 2822 Internet Message Format April 2001
+
+
+ "Reply-To:", "Message-ID:", and other fields. The resent fields are
+ only informational and MUST NOT be used in the normal processing of
+ replies.
+
+ The "Resent-Date:" indicates the date and time at which the resent
+ message is dispatched by the resender of the message. Like the
+ "Date:" field, it is not the date and time that the message was
+ actually transported.
+
+ The "Resent-To:", "Resent-Cc:", and "Resent-Bcc:" fields function
+ identically to the "To:", "Cc:", and "Bcc:" fields respectively,
+ except that they indicate the recipients of the resent message, not
+ the recipients of the original message.
+
+ The "Resent-Message-ID:" field provides a unique identifier for the
+ resent message.
+
+3.6.7. Trace fields
+
+ The trace fields are a group of header fields consisting of an
+ optional "Return-Path:" field, and one or more "Received:" fields.
+ The "Return-Path:" header field contains a pair of angle brackets
+ that enclose an optional addr-spec. The "Received:" field contains a
+ (possibly empty) list of name/value pairs followed by a semicolon and
+ a date-time specification. The first item of the name/value pair is
+ defined by item-name, and the second item is either an addr-spec, an
+ atom, a domain, or a msg-id. Further restrictions may be applied to
+ the syntax of the trace fields by standards that provide for their
+ use, such as [RFC2821].
+
+trace = [return]
+ 1*received
+
+return = "Return-Path:" path CRLF
+
+path = ([CFWS] "<" ([CFWS] / addr-spec) ">" [CFWS]) /
+ obs-path
+
+received = "Received:" name-val-list ";" date-time CRLF
+
+name-val-list = [CFWS] [name-val-pair *(CFWS name-val-pair)]
+
+name-val-pair = item-name CFWS item-value
+
+item-name = ALPHA *(["-"] (ALPHA / DIGIT))
+
+item-value = 1*angle-addr / addr-spec /
+ atom / domain / msg-id
+
+
+
+Resnick Standards Track [Page 28]
+\f
+RFC 2822 Internet Message Format April 2001
+
+
+ A full discussion of the Internet mail use of trace fields is
+ contained in [RFC2821]. For the purposes of this standard, the trace
+ fields are strictly informational, and any formal interpretation of
+ them is outside of the scope of this document.
+
+3.6.8. Optional fields
+
+ Fields may appear in messages that are otherwise unspecified in this
+ standard. They MUST conform to the syntax of an optional-field.
+ This is a field name, made up of the printable US-ASCII characters
+ except SP and colon, followed by a colon, followed by any text which
+ conforms to unstructured.
+
+ The field names of any optional-field MUST NOT be identical to any
+ field name specified elsewhere in this standard.
+
+optional-field = field-name ":" unstructured CRLF
+
+field-name = 1*ftext
+
+ftext = %d33-57 / ; Any character except
+ %d59-126 ; controls, SP, and
+ ; ":".
+
+ For the purposes of this standard, any optional field is
+ uninterpreted.
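The ftext rule translates directly into a character-range test. The sketch below uses a hypothetical function name and checks only the syntax of the field name; it does not check that the name avoids those already specified elsewhere in the standard.

```python
def is_field_name(name):
    """True if `name` is one or more ftext characters (%d33-57 / %d59-126)."""
    return len(name) > 0 and all(
        33 <= ord(ch) <= 126 and ch != ":"   # %d58 is the colon
        for ch in name
    )
```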
+
+4. Obsolete Syntax
+
+ Earlier versions of this standard allowed for different (usually more
+ liberal) syntax than is allowed in this version. Also, there have
+ been syntactic elements used in messages on the Internet whose
+ interpretation has never been documented. Though some of these
+ syntactic forms MUST NOT be generated according to the grammar in
+ section 3, they MUST be accepted and parsed by a conformant receiver.
+ This section documents many of these syntactic elements. Taking the
+ grammar in section 3 and adding the definitions presented in this
+ section will result in the grammar to use for interpretation of
+ messages.
+
+ Note: This section identifies syntactic forms that any implementation
+ MUST reasonably interpret. However, there are certainly Internet
+ messages which do not conform to even the additional syntax given in
+ this section. The fact that a particular form does not appear in any
+ section of this document is not justification for computer programs
+ to crash or for malformed data to be irretrievably lost by any
+ implementation. To repeat an example, though this document requires
+ lines in messages to be no longer than 998 characters, silently
+
+
+
+Resnick Standards Track [Page 29]
+\f
+RFC 2822 Internet Message Format April 2001
+
+
+ discarding the 999th and subsequent characters in a line without
+ warning would still be bad behavior for an implementation. It is up
+ to the implementation to deal with messages robustly.
+
+ One important difference between the obsolete (interpreting) and the
+ current (generating) syntax is that in structured header field bodies
+ (i.e., between the colon and the CRLF of any structured header
+ field), white space characters, including folding white space, and
+ comments can be freely inserted between any syntactic tokens. This
+ allows many complex forms that have proven difficult for some
+ implementations to parse.
+
+ Another key difference between the obsolete and the current syntax is
+ that the rule in section 3.2.3 regarding lines composed entirely of
+ white space in comments and folding white space does not apply. See
+ the discussion of folding white space in section 4.2 below.
+
+ Finally, certain characters that were formerly allowed in messages
+ appear in this section. The NUL character (ASCII value 0) was once
+ allowed, but is no longer for compatibility reasons. CR and LF were
+ allowed to appear in messages other than as CRLF; this use is also
+ shown here.
+
+ Other differences in syntax and semantics are noted in the following
+ sections.
+
+4.1. Miscellaneous obsolete tokens
+
+ These syntactic elements are used elsewhere in the obsolete syntax or
+ in the main syntax. The obs-char and obs-qp elements each add ASCII
+ value 0. Bare CR and bare LF are added to obs-text and obs-utext.
+ The period character is added to obs-phrase. The obs-phrase-list
+ provides for "empty" elements in a comma-separated list of phrases.
+
+ Note: The "period" (or "full stop") character (".") in obs-phrase is
+ not a form that was allowed in earlier versions of this or any other
+ standard. Neither period nor any other character from specials was
+ allowed in phrase because it introduced a parsing difficulty
+ distinguishing between phrases and portions of an addr-spec (see
+ section 4.4). It appears here because the period character is
+ currently used in many messages in the display-name portion of
+ addresses, especially for initials in names, and therefore must be
+ interpreted properly. In the future, period may appear in the
+ regular syntax of phrase.
+
+obs-qp = "\" (%d0-127)
+
+obs-text = *LF *CR *(obs-char *LF *CR)
+
+
+
+Resnick Standards Track [Page 30]
+\f
+RFC 2822 Internet Message Format April 2001
+
+
+obs-char = %d0-9 / %d11 / ; %d0-127 except CR and
+ %d12 / %d14-127 ; LF
+
+obs-utext = obs-text
+
+obs-phrase = word *(word / "." / CFWS)
+
+obs-phrase-list = phrase / 1*([phrase] [CFWS] "," [CFWS]) [phrase]
+
+ Bare CR and bare LF appear in messages with two different meanings.
+ In many cases, bare CR or bare LF are used improperly instead of CRLF
+ to indicate line separators. In other cases, bare CR and bare LF are
+ used simply as ASCII control characters with their traditional ASCII
+ meanings.
+
+4.2. Obsolete folding white space
+
+ In the obsolete syntax, any amount of folding white space MAY be
+ inserted where the obs-FWS rule is allowed. This creates the
+ possibility of having two consecutive "folds" in a line, and
+ therefore the possibility that a line which makes up a folded header
+ field could be composed entirely of white space.
+
+ obs-FWS = 1*WSP *(CRLF 1*WSP)
+
+4.3. Obsolete Date and Time
+
+ The syntax for the obsolete date format allows a 2 digit year in the
+ date field and allows for a list of alphabetic time zone
+ specifications that were used in earlier versions of this standard.
+ It also permits comments and folding white space between many of the
+ tokens.
+
+obs-day-of-week = [CFWS] day-name [CFWS]
+
+obs-year = [CFWS] 2*DIGIT [CFWS]
+
+obs-month = CFWS month-name CFWS
+
+obs-day = [CFWS] 1*2DIGIT [CFWS]
+
+obs-hour = [CFWS] 2DIGIT [CFWS]
+
+obs-minute = [CFWS] 2DIGIT [CFWS]
+
+obs-second = [CFWS] 2DIGIT [CFWS]
+
+obs-zone = "UT" / "GMT" / ; Universal Time
+
+
+
+Resnick Standards Track [Page 31]
+\f
+RFC 2822 Internet Message Format April 2001
+
+
+ ; North American UT
+ ; offsets
+ "EST" / "EDT" / ; Eastern: - 5/ - 4
+ "CST" / "CDT" / ; Central: - 6/ - 5
+ "MST" / "MDT" / ; Mountain: - 7/ - 6
+ "PST" / "PDT" / ; Pacific: - 8/ - 7
+
+ %d65-73 / ; Military zones - "A"
+ %d75-90 / ; through "I" and "K"
+ %d97-105 / ; through "Z", both
+ %d107-122 ; upper and lower case
+
+ Where a two or three digit year occurs in a date, the year is to be
+ interpreted as follows: If a two digit year is encountered whose
+ value is between 00 and 49, the year is interpreted by adding 2000,
+ ending up with a value between 2000 and 2049. If a two digit year is
+ encountered with a value between 50 and 99, or any three digit year
+ is encountered, the year is interpreted by adding 1900.
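The year interpretation rule above can be written out as a short function. The name `interpret_year` is hypothetical, and the input is assumed to be the digit string exactly as it appeared in the date (so that a leading zero still counts toward the digit count).

```python
def interpret_year(digits):
    """Map a 2-, 3-, or 4+-digit year string to a full year number."""
    year = int(digits)
    if len(digits) == 2:
        # 00-49 -> 2000-2049, 50-99 -> 1950-1999
        return year + 2000 if year <= 49 else year + 1900
    if len(digits) == 3:
        return year + 1900
    return year
```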
+
+ In the obsolete time zone, "UT" and "GMT" are indications of
+ "Universal Time" and "Greenwich Mean Time" respectively and are both
+ semantically identical to "+0000".
+
+ The remaining three character zones are the US time zones. The first
+ letter, "E", "C", "M", or "P" stands for "Eastern", "Central",
+ "Mountain" and "Pacific". The second letter is either "S" for
+ "Standard" time, or "D" for "Daylight" (or summer) time. Their
+ interpretations are as follows:
+
+ EDT is semantically equivalent to -0400
+ EST is semantically equivalent to -0500
+ CDT is semantically equivalent to -0500
+ CST is semantically equivalent to -0600
+ MDT is semantically equivalent to -0600
+ MST is semantically equivalent to -0700
+ PDT is semantically equivalent to -0700
+ PST is semantically equivalent to -0800
+
+ The 1 character military time zones were defined in a non-standard
+ way in [RFC822] and are therefore unpredictable in their meaning.
+ The original definitions of the military zones "A" through "I" are
+ equivalent to "+0100" through "+0900" respectively; "K", "L", and "M"
+ are equivalent to "+1000", "+1100", and "+1200" respectively; "N"
+ through "Y" are equivalent to "-0100" through "-1200" respectively;
+ and "Z" is equivalent to "+0000". However, because of the error in
+ [RFC822], they SHOULD all be considered equivalent to "-0000" unless
+ there is out-of-band information confirming their meaning.
+
+
+
+
+Resnick Standards Track [Page 32]
+\f
+RFC 2822 Internet Message Format April 2001
+
+
+ Other multi-character (usually between 3 and 5) alphabetic time zones
+ have been used in Internet messages. Any such time zone whose
+ meaning is not known SHOULD be considered equivalent to "-0000"
+ unless there is out-of-band information confirming its meaning.
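The obsolete zone interpretations can be collected into a lookup table. In this sketch (all names hypothetical), the military single-letter zones and any unrecognized alphabetic zone fall through to "-0000", which is what the text recommends when no out-of-band information is available. Matching is case-insensitive because ABNF string literals are.

```python
_OBS_ZONES = {
    "UT": "+0000", "GMT": "+0000",
    "EDT": "-0400", "EST": "-0500",
    "CDT": "-0500", "CST": "-0600",
    "MDT": "-0600", "MST": "-0700",
    "PDT": "-0700", "PST": "-0800",
}

def obs_zone_offset(zone):
    """Return the numeric offset for an obsolete alphabetic zone."""
    # Military and unknown zones are deliberately mapped to "-0000".
    return _OBS_ZONES.get(zone.upper(), "-0000")
```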
+
+4.4. Obsolete Addressing
+
+ There are three primary differences in addressing. First, mailbox
+ addresses were allowed to have a route portion before the addr-spec
+ when enclosed in "<" and ">". The route is simply a comma-separated
+ list of domain names, each preceded by "@", and the list terminated
+ by a colon. Second, CFWS were allowed between the period-separated
+ elements of local-part and domain (i.e., dot-atom was not used).
+ Also, local-part was allowed to contain quoted-string as well as
+ atom. Finally, mailbox-list and address-list were allowed to
+ have "null" members. That is, there could be two or more commas in
+ such a list with nothing in between them.
+
+obs-angle-addr = [CFWS] "<" [obs-route] addr-spec ">" [CFWS]
+
+obs-route = [CFWS] obs-domain-list ":" [CFWS]
+
+obs-domain-list = "@" domain *(*(CFWS / "," ) [CFWS] "@" domain)
+
+obs-local-part = word *("." word)
+
+obs-domain = atom *("." atom)
+
+obs-mbox-list = 1*([mailbox] [CFWS] "," [CFWS]) [mailbox]
+
+obs-addr-list = 1*([address] [CFWS] "," [CFWS]) [address]
+
+ When interpreting addresses, the route portion SHOULD be ignored.
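Ignoring the route portion might be sketched as follows. This is a simplified illustration, not a full parser: the hypothetical `strip_route` assumes the angle-addr has no surrounding comments and that the route contains no domain literals with colons in them.

```python
def strip_route(angle_addr):
    """Return the addr-spec from "<[route:]addr-spec>", dropping any route."""
    inner = angle_addr.strip()[1:-1].strip()   # remove "<" and ">"
    if inner.startswith("@"):
        # An obs-route starts with "@" and is terminated by the first colon.
        inner = inner.split(":", 1)[1]
    return inner.strip()
```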
+
+4.5. Obsolete header fields
+
+ Syntactically, the primary difference in the obsolete field syntax is
+ that it allows multiple occurrences of any of the fields and they may
+ occur in any order. Also, any amount of white space is allowed
+ before the ":" at the end of the field name.
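+
+ A parser that accepts white space before the colon could split an
+ unfolded field as in the following non-normative sketch. The
+ function name is hypothetical.
+
```python
def split_obs_field(line):
    """Split one unfolded header line into (field_name, field_body).

    White space between the field name and the colon is accepted on
    input and dropped from the returned name.
    """
    name, sep, body = line.partition(":")
    if not sep:
        raise ValueError("not a header field: no colon found")
    return name.rstrip(" \t"), body.strip()
```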
+
+obs-fields = *(obs-return /
+ obs-received /
+ obs-orig-date /
+ obs-from /
+ obs-sender /
+ obs-reply-to /
+ obs-to /
+ obs-cc /
+ obs-bcc /
+ obs-message-id /
+ obs-in-reply-to /
+ obs-references /
+ obs-subject /
+ obs-comments /
+ obs-keywords /
+ obs-resent-date /
+ obs-resent-from /
+ obs-resent-send /
+ obs-resent-rply /
+ obs-resent-to /
+ obs-resent-cc /
+ obs-resent-bcc /
+ obs-resent-mid /
+ obs-optional)
+
+ Except for destination address fields (described in section 4.5.3),
+ the interpretation of multiple occurrences of fields is unspecified.
+ Also, the interpretation of trace fields and resent fields which do
+ not occur in blocks prepended to the message is unspecified as well.
+ Unless otherwise noted in the following sections, interpretation of
+ other fields is identical to the interpretation of their non-obsolete
+ counterparts in section 3.
+
+4.5.1. Obsolete origination date field
+
+obs-orig-date = "Date" *WSP ":" date-time CRLF
+
+4.5.2. Obsolete originator fields
+
+obs-from = "From" *WSP ":" mailbox-list CRLF
+
+obs-sender = "Sender" *WSP ":" mailbox CRLF
+
+obs-reply-to = "Reply-To" *WSP ":" mailbox-list CRLF
+
+4.5.3. Obsolete destination address fields
+
+obs-to = "To" *WSP ":" address-list CRLF
+
+obs-cc = "Cc" *WSP ":" address-list CRLF
+
+obs-bcc = "Bcc" *WSP ":" (address-list / [CFWS]) CRLF
+
+ When multiple occurrences of destination address fields occur in a
+ message, they SHOULD be treated as if the address-list in the first
+ occurrence of the field is combined with the address lists of the
+ subsequent occurrences by adding a comma and concatenating.
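+
+ The combining rule above amounts to the following non-normative
+ sketch. The function name is hypothetical, and skipping empty field
+ bodies (which obs-bcc permits) is an assumption of the sketch.
+
```python
def combine_address_fields(field_bodies):
    """Join the bodies of repeated "To:", "Cc:" or "Bcc:" fields.

    The bodies are taken in order of occurrence and joined with commas.
    """
    parts = [body.strip() for body in field_bodies if body.strip()]
    return ", ".join(parts)
```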
+
+4.5.4. Obsolete identification fields
+
+ The obsolete "In-Reply-To:" and "References:" fields differ from the
+ current syntax in that they allow phrase (words or quoted strings) to
+ appear. The obsolete forms of the left and right sides of msg-id
+ allow interspersed CFWS, making them syntactically identical to
+ local-part and domain respectively.
+
+obs-message-id = "Message-ID" *WSP ":" msg-id CRLF
+
+obs-in-reply-to = "In-Reply-To" *WSP ":" *(phrase / msg-id) CRLF
+
+obs-references = "References" *WSP ":" *(phrase / msg-id) CRLF
+
+obs-id-left = local-part
+
+obs-id-right = domain
+
+ For purposes of interpretation, the phrases in the "In-Reply-To:" and
+ "References:" fields are ignored.
+
+ Semantically, none of the optional CFWS surrounding the local-part
+ and the domain are part of the obs-id-left and obs-id-right
+ respectively.
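+
+ Both interpretation rules (phrases ignored, CFWS not part of the
+ identifier) can be sketched as follows. This is non-normative and
+ deliberately simplified: the function name is hypothetical, comments
+ are assumed not to nest, and no phrase, comment, or quoted string is
+ assumed to contain "<" or ">".
+
```python
import re

_ANGLE = re.compile(r"<([^<>]*)>")
_COMMENT_OR_WSP = re.compile(r"\([^()]*\)|\s+")


def extract_msg_ids(field_body):
    """Return the msg-ids of an "In-Reply-To:" or "References:" body.

    Text outside angle brackets (the obsolete phrases) is ignored, and
    comments and white space inside the brackets are removed.
    """
    ids = []
    for inner in _ANGLE.findall(field_body):
        ids.append("<" + _COMMENT_OR_WSP.sub("", inner) + ">")
    return ids
```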
+
+4.5.5. Obsolete informational fields
+
+obs-subject = "Subject" *WSP ":" unstructured CRLF
+
+obs-comments = "Comments" *WSP ":" unstructured CRLF
+
+obs-keywords = "Keywords" *WSP ":" obs-phrase-list CRLF
+
+4.5.6. Obsolete resent fields
+
+ The obsolete syntax adds a "Resent-Reply-To:" field, which consists
+ of the field name, the optional comments and folding white space, the
+ colon, and a comma-separated list of addresses.
+
+obs-resent-from = "Resent-From" *WSP ":" mailbox-list CRLF
+
+obs-resent-send = "Resent-Sender" *WSP ":" mailbox CRLF
+
+obs-resent-date = "Resent-Date" *WSP ":" date-time CRLF
+
+obs-resent-to = "Resent-To" *WSP ":" address-list CRLF
+
+obs-resent-cc = "Resent-Cc" *WSP ":" address-list CRLF
+
+obs-resent-bcc = "Resent-Bcc" *WSP ":"
+ (address-list / [CFWS]) CRLF
+
+obs-resent-mid = "Resent-Message-ID" *WSP ":" msg-id CRLF
+
+obs-resent-rply = "Resent-Reply-To" *WSP ":" address-list CRLF
+
+ As with other resent fields, the "Resent-Reply-To:" field is to be
+ treated as trace information only.
+
+4.5.7. Obsolete trace fields
+
+ The obs-return and obs-received are again given here as template
+ definitions, just as return and received are in section 3. Their
+ full syntax is given in [RFC2821].
+
+obs-return = "Return-Path" *WSP ":" path CRLF
+
+obs-received = "Received" *WSP ":" name-val-list CRLF
+
+obs-path = obs-angle-addr
+
+4.5.8. Obsolete optional fields
+
+obs-optional = field-name *WSP ":" unstructured CRLF
+
+5. Security Considerations
+
+ Care needs to be taken when displaying messages on a terminal or
+ terminal emulator. Powerful terminals may act on escape sequences
+ and other combinations of ASCII control characters with a variety of
+ consequences. They can remap the keyboard or permit other
+ modifications to the terminal which could lead to denial of service
+ or even damaged data. They can trigger (sometimes programmable)
+ answerback messages which can allow a message to cause commands to be
+ issued on the recipient's behalf. They can also affect the operation
+ of terminal attached devices such as printers. Message viewers may
+ wish to strip potentially dangerous terminal escape sequences from
+ the message prior to display. However, other escape sequences appear
+ in messages for useful purposes (cf. [RFC2045, RFC2046, RFC2047,
+ RFC2048, RFC2049, ISO2022]) and therefore should not be stripped
+ indiscriminately.
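+
+ One possible display policy, shown here only as a non-normative
+ sketch, is to leave the stored message untouched and make control
+ characters visible at display time. The function name and the
+ choice of visible form are assumptions of this sketch; a viewer that
+ understands ISO 2022 sequences would need a more selective rule.
+
```python
def make_display_safe(text):
    """Return text with control characters replaced by a visible form.

    C0 controls other than tab, LF and CR, and the DEL character, are
    shown as a literal backslash escape such as \\x1b, so a terminal
    never receives them. The input string itself is not modified.
    """
    safe = []
    for ch in text:
        code = ord(ch)
        if (code < 0x20 and ch not in "\t\n\r") or code == 0x7F:
            safe.append("\\x%02x" % code)
        else:
            safe.append(ch)
    return "".join(safe)
```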
+
+ Transmission of non-text objects in messages raises additional
+ security issues. These issues are discussed in [RFC2045, RFC2046,
+ RFC2047, RFC2048, RFC2049].
+
+ Many implementations use the "Bcc:" (blind carbon copy) field
+ described in section 3.6.3 to facilitate sending messages to
+ recipients without revealing the addresses of one or more of the
+ addressees to the other recipients. Mishandling this use of "Bcc:"
+ has implications for confidential information that might be revealed,
+ which could eventually lead to security problems through knowledge of
+ even the existence of a particular mail address. For example, if
+ using the first method described in section 3.6.3, where the "Bcc:"
+ line is removed from the message, blind recipients have no explicit
+ indication that they have been sent a blind copy, except insofar as
+ their address does not appear in the message header. Because of
+ this, one of the blind addressees could potentially send a reply to
+ all of the shown recipients and accidentally reveal that the message
+ went to the blind recipient. When the second method from section
+ 3.6.3 is used, the blind recipient's address appears in the "Bcc:"
+ field of a separate copy of the message. If the "Bcc:" field sent
+ contains all of the blind addressees, all of the "Bcc:" recipients
+ will be seen by each "Bcc:" recipient. Even if a separate message is
+ sent to each "Bcc:" recipient with only the individual's address,
+ implementations still need to be careful to process replies to the
+ message as per section 3.6.3 so as not to accidentally reveal the
+ blind recipient to other recipients.
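+
+ The first method of section 3.6.3 (removing the "Bcc:" line from
+ the copy that is sent) might look like the following non-normative
+ sketch, which uses the Python standard library "email" package. The
+ function name and the sample addresses are hypothetical.
+
```python
import copy
from email import message_from_string


def without_bcc(msg):
    """Return a copy of msg with every "Bcc:" field removed."""
    visible = copy.deepcopy(msg)
    del visible["Bcc"]  # removes all occurrences; no error if absent
    return visible


original = message_from_string(
    "From: a@one.example\nTo: b@two.example\nBcc: c@three.example\n\nHi.\n"
)
shown_copy = without_bcc(original)
```
+
+ The envelope recipients still have to be collected from the
+ original message before the field is dropped.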
+
+6. Bibliography
+
+ [ASCII] American National Standards Institute (ANSI), Coded
+ Character Set - 7-Bit American National Standard Code for
+ Information Interchange, ANSI X3.4, 1986.
+
+ [ISO2022] International Organization for Standardization (ISO),
+ Information processing - ISO 7-bit and 8-bit coded
+ character sets - Code extension techniques, Third edition
+ - 1986-05-01, ISO 2022, 1986.
+
+ [RFC822] Crocker, D., "Standard for the Format of ARPA Internet
+ Text Messages", RFC 822, August 1982.
+
+ [RFC2045] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
+ Extensions (MIME) Part One: Format of Internet Message
+ Bodies", RFC 2045, November 1996.
+
+ [RFC2046] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
+ Extensions (MIME) Part Two: Media Types", RFC 2046,
+ November 1996.
+
+ [RFC2047] Moore, K., "Multipurpose Internet Mail Extensions (MIME)
+ Part Three: Message Header Extensions for Non-ASCII Text",
+ RFC 2047, November 1996.
+
+ [RFC2048] Freed, N., Klensin, J. and J. Postel, "Multipurpose
+ Internet Mail Extensions (MIME) Part Four: Registration
+ Procedures", BCP 13, RFC 2048, November 1996.
+
+ [RFC2049] Freed, N. and N. Borenstein, "Multipurpose Internet Mail
+ Extensions (MIME) Part Five: Conformance Criteria and
+ Examples", RFC 2049, November 1996.
+
+ [RFC2119] Bradner, S., "Key words for use in RFCs to Indicate
+ Requirement Levels", BCP 14, RFC 2119, March 1997.
+
+ [RFC2234] Crocker, D., Editor, and P. Overell, "Augmented BNF for
+ Syntax Specifications: ABNF", RFC 2234, November 1997.
+
+ [RFC2821] Klensin, J., Editor, "Simple Mail Transfer Protocol", RFC
+ 2821, March 2001.
+
+ [STD3] Braden, R., "Host Requirements", STD 3, RFC 1122 and RFC
+ 1123, October 1989.
+
+ [STD12] Mills, D., "Network Time Protocol", STD 12, RFC 1119,
+ September 1989.
+
+ [STD13] Mockapetris, P., "Domain Name System", STD 13, RFC 1034
+ and RFC 1035, November 1987.
+
+ [STD14] Partridge, C., "Mail Routing and the Domain System", STD
+ 14, RFC 974, January 1986.
+
+7. Editor's Address
+
+ Peter W. Resnick
+ QUALCOMM Incorporated
+ 5775 Morehouse Drive
+ San Diego, CA 92121-1714
+ USA
+
+ Phone: +1 858 651 4478
+ Fax: +1 858 651 1102
+ EMail: presnick@qualcomm.com
+
+8. Acknowledgements
+
+ Many people contributed to this document. They included folks who
+ participated in the Detailed Revision and Update of Messaging
+ Standards (DRUMS) Working Group of the Internet Engineering Task
+ Force (IETF), the chair of DRUMS, the Area Directors of the IETF, and
+ people who simply sent their comments in via e-mail. The editor is
+ deeply indebted to them all and thanks them sincerely. The below
+ list includes everyone who sent e-mail concerning this document.
+ Hopefully, everyone who contributed is named here:
+
+ Matti Aarnio Barry Finkel Larry Masinter
+ Tanaka Akira Erik Forsberg Denis McKeon
+ Russ Allbery Chuck Foster William P McQuillan
+ Eric Allman Paul Fox Alexey Melnikov
+ Harald Tveit Alvestrand Klaus M. Frank Perry E. Metzger
+ Ran Atkinson Ned Freed Steven Miller
+ Jos Backus Jochen Friedrich Keith Moore
+ Bruce Balden Randall C. Gellens John Gardiner Myers
+ Dave Barr Sukvinder Singh Gill Chris Newman
+ Alan Barrett Tim Goodwin John W. Noerenberg
+ John Beck Philip Guenther Eric Norman
+ J. Robert von Behren Tony Hansen Mike O'Dell
+ Jos den Bekker John Hawkinson Larry Osterman
+ D. J. Bernstein Philip Hazel Paul Overell
+ James Berriman Kai Henningsen Jacob Palme
+ Norbert Bollow Robert Herriot Michael A. Patton
+ Raj Bose Paul Hethmon Uzi Paz
+ Antony Bowesman Jim Hill Michael A. Quinlan
+ Scott Bradner Paul E. Hoffman Eric S. Raymond
+ Randy Bush Steve Hole Sam Roberts
+ Tom Byrer Kari Hurtta Hugh Sasse
+ Bruce Campbell Marco S. Hyman Bart Schaefer
+ Larry Campbell Ofer Inbar Tom Scola
+ W. J. Carpenter Olle Jarnefors Wolfgang Segmuller
+ Michael Chapman Kevin Johnson Nick Shelness
+ Richard Clayton Sudish Joseph John Stanley
+ Maurizio Codogno Maynard Kang Einar Stefferud
+ Jim Conklin Prabhat Keni Jeff Stephenson
+ R. Kelley Cook John C. Klensin Bernard Stern
+ Steve Coya Graham Klyne Peter Sylvester
+ Mark Crispin Brad Knowles Mark Symons
+ Dave Crocker Shuhei Kobayashi Eric Thomas
+ Matt Curtin Peter Koch Lee Thompson
+ Michael D'Errico Dan Kohn Karel De Vriendt
+ Cyrus Daboo Christian Kuhtz Matthew Wall
+ Jutta Degener Anand Kumria Rolf Weber
+ Mark Delany Steen Larsen Brent B. Welch
+ Steve Dorner Eliot Lear Dan Wing
+ Harold A. Driscoll Barry Leiba Jack De Winter
+ Michael Elkins Jay Levitt Gregory J. Woodhouse
+ Robert Elz Lars-Johan Liman Greg A. Woods
+ Johnny Eriksson Charles Lindsey Kazu Yamamoto
+ Erik E. Fair Pete Loshin Alain Zahm
+ Roger Fajman Simon Lyall Jamie Zawinski
+ Patrik Faltstrom Bill Manning Timothy S. Zurcher
+ Claus Andre Farber John Martin
+
+Appendix A. Example messages
+
+ This section presents a selection of messages. These are intended to
+ assist in the implementation of this standard, but should not be
+ taken as normative; that is to say, although the examples in this
+ section were carefully reviewed, if there happens to be a conflict
+ between these examples and the syntax described in sections 3 and 4
+ of this document, the syntax in those sections is to be taken as
+ correct.
+
+ Messages are delimited in this section between lines of "----". The
+ "----" lines are not part of the message itself.
+
+A.1. Addressing examples
+
+ The following are examples of messages that might be sent between two
+ individuals.
+
+A.1.1. A message from one person to another with simple addressing
+
+ This could be called a canonical message. It has a single author,
+ John Doe, a single recipient, Mary Smith, a subject, the date, a
+ message identifier, and a textual message in the body.
+
+----
+From: John Doe <jdoe@machine.example>
+To: Mary Smith <mary@example.net>
+Subject: Saying Hello
+Date: Fri, 21 Nov 1997 09:55:06 -0600
+Message-ID: <1234@local.machine.example>
+
+This is a message just to say hello.
+So, "Hello".
+----
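+
+ As a non-normative aid, the message above can be read with the
+ Python standard library "email" package. The variable names are
+ assumptions of this sketch, and bare LF line endings are used for
+ brevity where the message format itself uses CRLF.
+
```python
from datetime import timedelta
from email import message_from_string, policy

RAW_MESSAGE = "\n".join([
    "From: John Doe <jdoe@machine.example>",
    "To: Mary Smith <mary@example.net>",
    "Subject: Saying Hello",
    "Date: Fri, 21 Nov 1997 09:55:06 -0600",
    "Message-ID: <1234@local.machine.example>",
    "",
    "This is a message just to say hello.",
    'So, "Hello".',
    "",
])

# policy.default yields structured header objects instead of strings.
msg = message_from_string(RAW_MESSAGE, policy=policy.default)
author = msg["From"].addresses[0]
recipient = msg["To"].addresses[0]
sent_at = msg["Date"].datetime
```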
+
+ If John's secretary Michael actually sent the message, though John
+ was the author and replies to this message should go back to him, the
+ sender field would be used:
+
+----
+From: John Doe <jdoe@machine.example>
+Sender: Michael Jones <mjones@machine.example>
+To: Mary Smith <mary@example.net>
+Subject: Saying Hello
+Date: Fri, 21 Nov 1997 09:55:06 -0600
+Message-ID: <1234@local.machine.example>
+
+This is a message just to say hello.
+So, "Hello".
+----
+
+A.1.2. Different types of mailboxes
+
+ This message includes multiple addresses in the destination fields
+ and also uses several different forms of addresses.
+
+----
+From: "Joe Q. Public" <john.q.public@example.com>
+To: Mary Smith <mary@x.test>, jdoe@example.org, Who? <one@y.test>
+Cc: <boss@nil.test>, "Giant; \"Big\" Box" <sysservices@example.net>
+Date: Tue, 1 Jul 2003 10:52:37 +0200
+Message-ID: <5678.21-Nov-1997@example.com>
+
+Hi everyone.
+----
+
+ Note that the display names for Joe Q. Public and Giant; "Big" Box
+ needed to be enclosed in double-quotes because the former contains
+ the period and the latter contains both semicolon and double-quote
+ characters (the double-quote characters appearing as quoted-pair
+ construct). Conversely, the display name for Who? could appear
+ without them because the question mark is legal in an atom. Notice
+ also that jdoe@example.org and boss@nil.test have no display names
+ associated with them at all, and jdoe@example.org uses the simpler
+ address form without the angle brackets.
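+
+ The address forms discussed above can be inspected with the Python
+ standard library, as in this non-normative sketch. The variable
+ names are assumptions, and only the address fields of the example
+ are reproduced.
+
```python
from email import message_from_string, policy

HEADER = "\n".join([
    'From: "Joe Q. Public" <john.q.public@example.com>',
    "To: Mary Smith <mary@x.test>, jdoe@example.org, Who? <one@y.test>",
    'Cc: <boss@nil.test>, "Giant; \\"Big\\" Box" <sysservices@example.net>',
    "",
    "Hi everyone.",
    "",
])

msg = message_from_string(HEADER, policy=policy.default)
from_name = msg["From"].addresses[0].display_name
to_names = [a.display_name for a in msg["To"].addresses]
to_specs = [a.addr_spec for a in msg["To"].addresses]
cc_specs = [a.addr_spec for a in msg["Cc"].addresses]
```
+
+ An address without a display name is reported with an empty string
+ in that position.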
+
+A.1.3. Group addresses
+
+----
+From: Pete <pete@silly.example>
+To: A Group:Chris Jones <c@a.test>,joe@where.test,John <jdoe@one.test>;
+Cc: Undisclosed recipients:;
+Date: Thu, 13 Feb 1969 23:32:54 -0330
+Message-ID: <testabcd.1234@silly.example>
+
+Testing.
+----
+
+ In this message, the "To:" field has a single group recipient named A
+ Group which contains 3 addresses, and the "Cc:" field has an empty
+ group recipient named Undisclosed recipients.
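+
+ Group syntax can be examined the same way. The following
+ non-normative sketch relies on the "groups" attribute of address
+ headers in the Python standard library; the variable names are
+ assumptions.
+
```python
from email import message_from_string, policy

HEADER = "\n".join([
    "From: Pete <pete@silly.example>",
    "To: A Group:Chris Jones <c@a.test>,joe@where.test,John <jdoe@one.test>;",
    "Cc: Undisclosed recipients:;",
    "",
    "Testing.",
    "",
])

msg = message_from_string(HEADER, policy=policy.default)
to_group = msg["To"].groups[0]
cc_group = msg["Cc"].groups[0]
to_members = [a.addr_spec for a in to_group.addresses]
```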
+
+A.2. Reply messages
+
+ The following is a series of three messages that make up a
+ conversation thread between John and Mary. John first sends a
+ message to Mary, Mary then replies to John's message, and then John
+ replies to Mary's reply message.
+
+ Note especially the "Message-ID:", "References:", and "In-Reply-To:"
+ fields in each message.
+
+----
+From: John Doe <jdoe@machine.example>
+To: Mary Smith <mary@example.net>
+Subject: Saying Hello
+Date: Fri, 21 Nov 1997 09:55:06 -0600
+Message-ID: <1234@local.machine.example>
+
+This is a message just to say hello.
+So, "Hello".
+----
+
+ When sending replies, the Subject field is often retained, though
+ prepended with "Re: " as described in section 3.6.5.
+
+----
+From: Mary Smith <mary@example.net>
+To: John Doe <jdoe@machine.example>
+Reply-To: "Mary Smith: Personal Account" <smith@home.example>
+Subject: Re: Saying Hello
+Date: Fri, 21 Nov 1997 10:01:10 -0600
+Message-ID: <3456@example.net>
+In-Reply-To: <1234@local.machine.example>
+References: <1234@local.machine.example>
+
+This is a reply to your hello.
+----
+
+ Note the "Reply-To:" field in the above message. When John replies
+ to Mary's message above, the reply should go to the address in the
+ "Reply-To:" field instead of the address in the "From:" field.
+
+----
+To: "Mary Smith: Personal Account" <smith@home.example>
+From: John Doe <jdoe@machine.example>
+Subject: Re: Saying Hello
+Date: Fri, 21 Nov 1997 11:00:00 -0600
+Message-ID: <abcd.1234@local.machine.tld>
+In-Reply-To: <3456@example.net>
+References: <1234@local.machine.example> <3456@example.net>
+
+This is a reply to your reply.
+----
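+
+ The identification fields in this thread follow the single-parent
+ rule of section 3.6.4, which can be sketched non-normatively as
+ below. The function name is hypothetical, and only the case where
+ the parent carries a "Message-ID:" field is covered.
+
```python
def reply_identification(parent_message_id, parent_references=""):
    """Return (in_reply_to, references) for a reply to one parent.

    "In-Reply-To:" is the parent's message identifier; "References:" is
    the parent's "References:" body followed by that same identifier.
    """
    in_reply_to = parent_message_id
    references = (parent_references + " " + parent_message_id).strip()
    return in_reply_to, references
```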
+
+A.3. Resent messages
+
+ Start with the message that has been used as an example several
+ times:
+
+----
+From: John Doe <jdoe@machine.example>
+To: Mary Smith <mary@example.net>
+Subject: Saying Hello
+Date: Fri, 21 Nov 1997 09:55:06 -0600
+Message-ID: <1234@local.machine.example>
+
+This is a message just to say hello.
+So, "Hello".
+----
+
+ Say that Mary, upon receiving this message, wishes to send a copy of
+ the message to Jane such that (a) the message would appear to have
+ come straight from John; (b) if Jane replies to the message, the
+ reply should go back to John; and (c) all of the original
+ information, like the date the message was originally sent to Mary,
+ the message identifier, and the original addressee, is preserved. In
+ this case, resent fields are prepended to the message:
+
+----
+Resent-From: Mary Smith <mary@example.net>
+Resent-To: Jane Brown <j-brown@other.example>
+Resent-Date: Mon, 24 Nov 1997 14:22:01 -0800
+Resent-Message-ID: <78910@example.net>
+From: John Doe <jdoe@machine.example>
+To: Mary Smith <mary@example.net>
+Subject: Saying Hello
+Date: Fri, 21 Nov 1997 09:55:06 -0600
+Message-ID: <1234@local.machine.example>
+
+This is a message just to say hello.
+So, "Hello".
+----
+
+ If Jane, in turn, wished to resend this message to another person,
+ she would prepend her own set of resent header fields to the above
+ and send that.
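+
+ Prepending a block of resent fields is plain text concatenation, as
+ in this non-normative sketch. The function and parameter names are
+ hypothetical; the values are assumed to be already well-formed
+ field bodies.
+
```python
def prepend_resent_block(message_text, resent_from, resent_to,
                         resent_date, resent_message_id):
    """Return message_text with one block of resent fields prepended.

    The original header fields and body are left untouched, so each
    further resend simply adds another block on top.
    """
    block = "".join([
        "Resent-From: " + resent_from + "\r\n",
        "Resent-To: " + resent_to + "\r\n",
        "Resent-Date: " + resent_date + "\r\n",
        "Resent-Message-ID: " + resent_message_id + "\r\n",
    ])
    return block + message_text
```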
+
+A.4. Messages with trace fields
+
+ As messages are sent through the transport system as described in
+ [RFC2821], trace fields are prepended to the message. The following
+ is an example of what those trace fields might look like. Note that
+ there is some folding white space in the first one since these lines
+ can be long.
+
+----
+Received: from x.y.test
+ by example.net
+ via TCP
+ with ESMTP
+ id ABC12345
+ for <mary@example.net>; 21 Nov 1997 10:05:43 -0600
+Received: from machine.example by x.y.test; 21 Nov 1997 10:01:22 -0600
+From: John Doe <jdoe@machine.example>
+To: Mary Smith <mary@example.net>
+Subject: Saying Hello
+Date: Fri, 21 Nov 1997 09:55:06 -0600
+Message-ID: <1234@local.machine.example>
+
+This is a message just to say hello.
+So, "Hello".
+----
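+
+ The folded "Received:" field above is unfolded by removing each
+ CRLF that is immediately followed by white space, as described in
+ section 2.2.3. A non-normative sketch, with a hypothetical function
+ name:
+
```python
import re


def unfold(header_text):
    """Remove every CRLF that is directly followed by a space or tab.

    The white space that follows the removed CRLF is kept, so it still
    separates the tokens of the unfolded field.
    """
    return re.sub(r"\r\n(?=[ \t])", "", header_text)
```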
+
+A.5. White space, comments, and other oddities
+
+ White space, including folding white space, and comments can be
+ inserted between many of the tokens of fields. Taking the example
+ from A.1.3, white space and comments can be inserted into all of the
+ fields.
+
+----
+From: Pete(A wonderful \) chap) <pete(his account)@silly.test(his host)>
+To:A Group(Some people)
+ :Chris Jones <c@(Chris's host.)public.example>,
+ joe@example.org,
+ John <jdoe@one.test> (my dear friend); (the end of the group)
+Cc:(Empty list)(start)Undisclosed recipients :(nobody(that I know)) ;
+Date: Thu,
+ 13
+ Feb
+ 1969
+ 23:32
+ -0330 (Newfoundland Time)
+Message-ID: <testabcd.1234@silly.test>
+
+Testing.
+----
+
+ The above example is aesthetically displeasing, but perfectly legal.
+ Note particularly (1) the comments in the "From:" field (including
+ one that has a ")" character appearing as part of a quoted-pair); (2)
+ the white space absent after the ":" in the "To:" field as well as
+ the comment and folding white space after the group name, the special
+ character (".") in the comment in Chris Jones's address, and the
+ folding white space before and after "joe@example.org,"; (3) the
+ multiple and nested comments in the "Cc:" field as well as the
+ comment immediately following the ":" after "Cc"; (4) the folding
+ white space (but no comments except at the end) and the missing
+ seconds in the time of the date field; and (5) the white space before
+ (but not within) the identifier in the "Message-ID:" field.
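+
+ A reader that wants the field content without its comments has to
+ track nesting, quoted-pairs, and quoted strings. The following
+ non-normative sketch does so for an already unfolded field body; the
+ function name is hypothetical.
+
```python
def strip_comments(value):
    """Remove parenthesized comments from an unfolded field body.

    Nested comments and quoted-pairs inside comments are handled, and
    parentheses inside a quoted string are left alone.
    """
    out = []
    depth = 0          # current comment nesting level
    in_quotes = False  # inside a quoted string (outside comments only)
    i = 0
    while i < len(value):
        ch = value[i]
        if ch == "\\" and i + 1 < len(value) and (depth or in_quotes):
            # Quoted-pair: keep it only when it is outside any comment.
            if not depth:
                out.append(value[i:i + 2])
            i += 2
            continue
        if depth:
            if ch == "(":
                depth += 1
            elif ch == ")":
                depth -= 1
        elif in_quotes:
            out.append(ch)
            if ch == '"':
                in_quotes = False
        elif ch == "(":
            depth = 1
        else:
            out.append(ch)
            if ch == '"':
                in_quotes = True
        i += 1
    return "".join(out)
```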
+
+A.6. Obsoleted forms
+
+ The following are examples of obsolete (that is, the "MUST NOT
+ generate") syntactic elements described in section 4 of this
+ document.
+
+A.6.1. Obsolete addressing
+
+ Note in the below example the lack of quotes around Joe Q. Public,
+ the route that appears in the address for Mary Smith, the two commas
+ that appear in the "To:" field, and the spaces that appear around the
+ "." in the jdoe address.
+
+----
+From: Joe Q. Public <john.q.public@example.com>
+To: Mary Smith <@machine.tld:mary@example.net>, , jdoe@test . example
+Date: Tue, 1 Jul 2003 10:52:37 +0200
+Message-ID: <5678.21-Nov-1997@example.com>
+
+Hi everyone.
+----
+
+A.6.2. Obsolete dates
+
+ The following message uses an obsolete date format, including a non-
+ numeric time zone and a two digit year. Note that although the
+ day-of-week is missing, that is not specific to the obsolete syntax;
+ it is optional in the current syntax as well.
+
+----
+From: John Doe <jdoe@machine.example>
+To: Mary Smith <mary@example.net>
+Subject: Saying Hello
+Date: 21 Nov 97 09:55:06 GMT
+Message-ID: <1234@local.machine.example>
+
+This is a message just to say hello.
+So, "Hello".
+----
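+
+ The two digit year in this example is interpreted by the rule of
+ section 4.3, sketched non-normatively below. The function name is
+ hypothetical; the input is the year exactly as written in the date.
+
```python
def interpret_obs_year(digits):
    """Return the full year for a year written with 2, 3 or 4+ digits.

    Two digit years 00-49 are taken as 2000-2049 and 50-99 as
    1950-1999; three digit years have 1900 added.
    """
    year = int(digits)
    if len(digits) == 2:
        return year + 2000 if year <= 49 else year + 1900
    if len(digits) == 3:
        return year + 1900
    return year
```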
+
+A.6.3. Obsolete white space and comments
+
+ White space and comments can appear between many more elements than
+ in the current syntax. Also, folding lines that are made up entirely
+ of white space are legal.
+
+----
+From : John Doe <jdoe@machine(comment). example>
+To : Mary Smith
+__
+ <mary@example.net>
+Subject : Saying Hello
+Date : Fri, 21 Nov 1997 09(comment): 55 : 06 -0600
+Message-ID : <1234 @ local(blah) .machine .example>
+
+This is a message just to say hello.
+So, "Hello".
+----
+
+ Note especially the second line of the "To:" field. It starts with
+ two space characters. (Note that "__" represents blank spaces.)
+ Therefore, it is considered part of the folding as described in
+ section 4.2. Also, the comments and white space throughout
+ addresses, dates, and message identifiers are all part of the
+ obsolete syntax.
+
+Appendix B. Differences from earlier standards
+
+ This appendix contains a list of changes that have been made in the
+ Internet Message Format from earlier standards, specifically [RFC822]
+ and [STD3]. Items marked with an asterisk (*) below are items which
+ appear in section 4 of this document and therefore can no longer be
+ generated.
+
+ 1. Period allowed in obsolete form of phrase.
+ 2. ABNF moved out of document to [RFC2234].
+ 3. Four or more digits allowed for year.
+ 4. Header field ordering (and lack thereof) made explicit.
+ 5. Encrypted header field removed.
+ 6. Received syntax loosened to allow any token/value pair.
+ 7. Specifically allow and give meaning to "-0000" time zone.
+ 8. Folding white space is not allowed between every token.
+ 9. Requirement for destinations removed.
+ 10. Forwarding and resending redefined.
+ 11. Extension header fields no longer specifically called out.
+ 12. ASCII 0 (null) removed.*
+ 13. Folding continuation lines cannot contain only white space.*
+ 14. Free insertion of comments not allowed in date.*
+ 15. Non-numeric time zones not allowed.*
+ 16. Two digit years not allowed.*
+ 17. Three digit years interpreted, but not allowed for generation.
+ 18. Routes in addresses not allowed.*
+ 19. CFWS within local-parts and domains not allowed.*
+ 20. Empty members of address lists not allowed.*
+ 21. Folding white space between field name and colon not allowed.*
+ 22. Comments between field name and colon not allowed.
+ 23. Tightened syntax of in-reply-to and references.*
+ 24. CFWS within msg-id not allowed.*
+ 25. Tightened semantics of resent fields as informational only.
+ 26. Resent-Reply-To not allowed.*
+ 27. No multiple occurrences of fields (except resent and received).*
+ 28. Free CR and LF not allowed.*
+ 29. Routes in return path not allowed.*
+ 30. Line length limits specified.
+ 31. Bcc more clearly specified.
+
+Appendix C. Notices
+
+ Intellectual Property
+
+ The IETF takes no position regarding the validity or scope of any
+ intellectual property or other rights that might be claimed to
+ pertain to the implementation or use of the technology described in
+ this document or the extent to which any license under such rights
+ might or might not be available; neither does it represent that it
+ has made any effort to identify any such rights. Information on the
+ IETF's procedures with respect to rights in standards-track and
+ standards-related documentation can be found in BCP-11. Copies of
+ claims of rights made available for publication and any assurances of
+ licenses to be made available, or the result of an attempt made to
+ obtain a general license or permission for the use of such
+ proprietary rights by implementors or users of this specification can
+ be obtained from the IETF Secretariat.
+
+Full Copyright Statement
+
+ Copyright (C) The Internet Society (2001). All Rights Reserved.
+
+ This document and translations of it may be copied and furnished to
+ others, and derivative works that comment on or otherwise explain it
+ or assist in its implementation may be prepared, copied, published
+ and distributed, in whole or in part, without restriction of any
+ kind, provided that the above copyright notice and this paragraph are
+ included on all such copies and derivative works. However, this
+ document itself may not be modified in any way, such as by removing
+ the copyright notice or references to the Internet Society or other
+ Internet organizations, except as needed for the purpose of
+ developing Internet standards in which case the procedures for
+ copyrights defined in the Internet Standards process must be
+ followed, or as required to translate it into languages other than
+ English.
+
+ The limited permissions granted above are perpetual and will not be
+ revoked by the Internet Society or its successors or assigns.
+
+ This document and the information contained herein is provided on an
+ "AS IS" basis and THE INTERNET SOCIETY AND THE INTERNET ENGINEERING
+ TASK FORCE DISCLAIMS ALL WARRANTIES, EXPRESS OR IMPLIED, INCLUDING
+ BUT NOT LIMITED TO ANY WARRANTY THAT THE USE OF THE INFORMATION
+ HEREIN WILL NOT INFRINGE ANY RIGHTS OR ANY IMPLIED WARRANTIES OF
+ MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE.
+
+Acknowledgement
+
+ Funding for the RFC Editor function is currently provided by the
+ Internet Society.
+
--- /dev/null
+
+
+Network Working Group Brian Kantor (U.C. San Diego)
+Request for Comments: 977 Phil Lapsley (U.C. Berkeley)
+ February 1986
+
+ Network News Transfer Protocol
+
+ A Proposed Standard for the Stream-Based
+ Transmission of News
+
+Status of This Memo
+
+ NNTP specifies a protocol for the distribution, inquiry, retrieval,
+ and posting of news articles using a reliable stream-based
+ transmission of news among the ARPA-Internet community. NNTP is
+ designed so that news articles are stored in a central database
+ allowing a subscriber to select only those items he wishes to read.
+ Indexing, cross-referencing, and expiration of aged messages are also
+ provided. This RFC suggests a proposed protocol for the ARPA-Internet
+ community, and requests discussion and suggestions for improvements.
+ Distribution of this memo is unlimited.
+
+1. Introduction
+
+ For many years, the ARPA-Internet community has supported the
+ distribution of bulletins, information, and data in a timely fashion
+ to thousands of participants. We collectively refer to such items of
+ information as "news". Such news provides for the rapid
+ dissemination of items of interest such as software bug fixes, new
+ product reviews, technical tips, and programming pointers, as well as
+ rapid-fire discussions of matters of concern to the working computer
+ professional. News is very popular among its readers.
+
+ There are two popular methods of distributing such news: the
+ Internet method of direct mailing, and the USENET news system.
+
+1.1. Internet Mailing Lists
+
+ The Internet community distributes news by the use of mailing lists.
+ These are lists of subscribers' mailbox addresses and remailing
+ sublists of all intended recipients. These mailing lists operate by
+ remailing a copy of the information to be distributed to each
+ subscriber on the mailing list. Such remailing is inefficient when a
+ mailing list grows beyond a dozen or so people, since sending a
+ separate copy to each of the subscribers occupies large quantities of
+ network bandwidth, CPU resources, and significant amounts of disk
+ storage at the destination host. There is also a significant problem
+ in maintenance of the list itself: as subscribers move from one job
+ to another; as new subscribers join and old ones leave; and as hosts
+ come in and out of service.
+
+
+
+
+Kantor & Lapsley [Page 1]
+\f
+
+
+RFC 977 February 1986
+Network News Transfer Protocol
+
+
+1.2. The USENET News System
+
+ Clearly, a worthwhile reduction of the amount of these resources used
+ can be achieved if articles are stored in a central database on the
+ receiving host instead of in each subscriber's mailbox. The USENET
+ news system provides a method of doing just this. There is a central
+ repository of the news articles in one place (customarily a spool
+ directory of some sort), and a set of programs that allow a
+ subscriber to select those items he wishes to read. Indexing,
+ cross-referencing, and expiration of aged messages are also provided.
+
+1.3. Central Storage of News
+
+ For clusters of hosts connected together by fast local area networks
+ (such as Ethernet), it makes even more sense to consolidate news
+ distribution onto one (or a very few) hosts, and to allow access to
+ these news articles using a server and client model. Subscribers may
+ then request only the articles they wish to see, without having to
+ wastefully duplicate the storage of a copy of each item on each host.
+
+1.4. A Central News Server
+
+ A way to achieve these economies is to have a central computer system
+ that can provide news service to the other systems on the local area
+ network. Such a server would manage the collection of news articles
+ and index files, with each person who desires to read news bulletins
+ doing so over the LAN. For a large cluster of computer systems, the
+ savings in total disk space is clearly worthwhile. Also, this allows
+ workstations with limited disk storage space to participate in the
+ news without incoming items consuming oppressive amounts of the
+ workstation's disk storage.
+
+ We have heard rumors of somewhat successful attempts to provide
+ centralized news service using IBIS and other shared or distributed
+ file systems. While it is possible that such a distributed file
+ system implementation might work well with a group of similar
+ computers running nearly identical operating systems, such a scheme
+ is not general enough to offer service to a wide range of client
+ systems, especially when many diverse operating systems may be in use
+ among a group of clients. There are few (if any) shared or networked
+ file systems that can offer the generality of service that stream
+ connections using Internet TCP provide, particularly when a wide
+ range of host hardware and operating systems are considered.
+
+ NNTP specifies a protocol for the distribution, inquiry, retrieval,
+ and posting of news articles using a reliable stream (such as TCP)
+ server-client model. NNTP is designed so that news articles need only
+ be stored on one (presumably central) host, and subscribers on other
+ hosts attached to the LAN may read news articles using stream
+ connections to the news host.
+
+ NNTP is modelled upon the news article specifications in RFC 850,
+ which describes the USENET news system. However, NNTP makes few
+ demands upon the structure, content, or storage of news articles, and
+ thus we believe it easily can be adapted to other non-USENET news
+ systems.
+
+ Typically, the NNTP server runs as a background process on one host,
+ and would accept connections from other hosts on the LAN. This works
+ well when there are a number of small computer systems (such as
+ workstations, with only one or at most a few users each), and a large
+ central server.
+
+1.5. Intermediate News Servers
+
+ For clusters of machines with many users (as might be the case in a
+ university or large industrial environment), an intermediate server
+ might be used. This intermediate or "slave" server runs on each
+ computer system, and is responsible for mediating news reading
+ requests and performing local caching of recently-retrieved news
+ articles.
+
+ Typically, a client attempting to obtain news service would first
+ attempt to connect to the news service port on the local machine. If
+ this attempt were unsuccessful, indicating a failed server, an
+ installation might choose to either deny news access, or to permit
+ connection to the central "master" news server.
+
+ For workstations or other small systems, direct connection to the
+ master server would probably be the normal manner of operation.
+
+ This specification does not cover the operation of slave NNTP
+ servers. We merely suggest that slave servers are a logical addition
+ to NNTP server usage which would enhance operation on large local
+ area networks.
+
+1.6. News Distribution
+
+ NNTP has commands which provide a straightforward method of
+ exchanging articles between cooperating hosts. Hosts which are well
+ connected on a local area or other fast network and who wish to
+ actually obtain copies of news articles for local storage might well
+ find NNTP to be a more efficient way to distribute news than more
+ traditional transfer methods (such as UUCP).
+
+ In the traditional method of distributing news articles, news is
+ propagated from host to host by flooding - that is, each host will
+ send all its new news articles on to each host that it feeds. These
+ hosts will then in turn send these new articles on to other hosts
+ that they feed. Clearly, sending articles that a host already has
+ obtained a copy of from another feed (many hosts that receive news
+ are redundantly fed) again is a waste of time and communications
+ resources, but for transport mechanisms that are single-transaction
+ based rather than interactive (such as UUCP in the UNIX-world <1>),
+ distribution time is diminished by sending all articles and having
+ the receiving host simply discard the duplicates. This is
+ especially true when communications sessions are limited to once a
+ day.
+
+ Using NNTP, hosts exchanging news articles have an interactive
+ mechanism for deciding which articles are to be transmitted. A host
+ desiring new news, or which has new news to send, will typically
+ contact one or more of its neighbors using NNTP. First it will
+ inquire if any new news groups have been created on the serving host
+ by means of the NEWGROUPS command. If so, and those are appropriate
+ or desired (as established by local site-dependent rules), those new
+ newsgroups can be created.
+
+ The client host will then inquire as to which new articles have
+ arrived in all or some of the newsgroups that it desires to receive,
+ using the NEWNEWS command. It will receive a list of new articles
+ from the server, and can request transmission of those articles that
+ it desires and does not already have.
+
+ Finally, the client can advise the server of those new articles which
+ the client has recently received. The server will indicate those
+ articles that it has already obtained copies of, and which articles
+ should be sent to add to its collection.
+
+ In this manner, only those articles which are not duplicates and
+ which are desired are transferred.
+
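The selection step in this exchange amounts to filtering the offered message-ids against what the receiving host already holds. The following is a minimal sketch of that idea only; the function name `articles_to_request` and its arguments are hypothetical and are not defined by this specification.

```python
def articles_to_request(offered_ids, local_ids):
    """Return the offered message-ids that are not already held locally.

    offered_ids: message-ids as listed by the server (e.g. via NEWNEWS).
    local_ids:   message-ids the client already has on file.
    The order of first appearance is preserved and repeats are dropped.
    """
    have = set(local_ids)
    seen = set()
    wanted = []
    for message_id in offered_ids:
        if message_id in have or message_id in seen:
            continue  # duplicate: do not transfer it again
        seen.add(message_id)
        wanted.append(message_id)
    return wanted
```

A client would then issue one ARTICLE request per returned message-id.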
+2. The NNTP Specification
+
+2.1. Overview
+
+ The news server specified by this document uses a stream connection
+ (such as TCP) and SMTP-like commands and responses. It is designed
+ to accept connections from hosts, and to provide a simple interface
+ to the news database.
+
+ This server is only an interface between programs and the news
+ databases. It does not perform any user interaction or presentation-
+ level functions. These "user-friendly" functions are better left to
+ the client programs, which have a better understanding of the
+ environment in which they are operating.
+
+ When used via Internet TCP, the contact port assigned for this
+ service is 119.
+
+2.2. Character Codes
+
+ Commands and replies are composed of characters from the ASCII
+ character set. When the transport service provides an 8-bit byte
+ (octet) transmission channel, each 7-bit character is transmitted
+ right justified in an octet with the high order bit cleared to zero.
+
+2.3. Commands
+
+ Commands consist of a command word, which in some cases may be
+ followed by a parameter. Commands with parameters must separate the
+ parameters from each other and from the command by one or more space
+ or tab characters. Command lines must be complete with all required
+ parameters, and may not contain more than one command.
+
+ Commands and command parameters are not case sensitive. That is, a
+ command or parameter word may be upper case, lower case, or any
+ mixture of upper and lower case.
+
+ Each command line must be terminated by a CR-LF (Carriage Return -
+ Line Feed) pair.
+
+ Command lines shall not exceed 512 characters in length, counting all
+ characters including spaces, separators, punctuation, and the
+ trailing CR-LF (thus there are 510 characters maximum allowed for the
+ command and its parameters). There is no provision for continuation
+ command lines.
+
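The command-line rules above can be checked mechanically. The sketch below is one possible way to do so, assuming the line arrives as raw bytes including its CR-LF; the names `parse_command` and `MAX_COMMAND_LINE` are illustrative and not part of the protocol.

```python
MAX_COMMAND_LINE = 512  # total length, including the trailing CR-LF


def parse_command(line: bytes):
    """Split one NNTP command line into (COMMAND, [parameters])."""
    if len(line) > MAX_COMMAND_LINE:
        raise ValueError("command line exceeds 512 characters")
    if not line.endswith(b"\r\n"):
        raise ValueError("command line must end with CR-LF")
    # Commands are ASCII; decoding raises a ValueError subclass otherwise.
    text = line[:-2].decode("ascii")
    if "\r" in text or "\n" in text:
        raise ValueError("only one command per line is allowed")
    # split() with no argument treats any run of spaces or tabs
    # as a single separator.
    words = text.split()
    if not words:
        raise ValueError("empty command line")
    # Commands are not case sensitive; upper case is used here as a
    # canonical form. Parameters are returned as sent.
    return words[0].upper(), words[1:]
```

Parameters are left in their original case so that the caller can decide how to compare them.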
+2.4. Responses
+
+ Responses are of two kinds, textual and status.
+
+2.4.1. Text Responses
+
+ Text is sent only after a numeric status response line has been sent
+ that indicates that text will follow. Text is sent as a series of
+ successive lines of textual matter, each terminated with a CR-LF pair.
+ A single line containing only a period (.) is sent to indicate the
+ end of the text (i.e., the server will send a CR-LF pair at the end
+ of the last line of text, a period, and another CR-LF pair).
+
+ If the text contained a period as the first character of the text
+ line in the original, that first period is doubled. Therefore, the
+ client must examine the first character of each line received, and
+ for those beginning with a period, determine either that this is the
+ end of the text or whether to collapse the doubled period to a single
+ one.
+
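The client-side handling of the terminating period and of doubled periods might look like the sketch below. It assumes the lines have already been split on CR-LF; `decode_text_response` is a hypothetical helper name.

```python
def decode_text_response(lines):
    """Return the text lines of a textual response, up to the terminator.

    `lines` is an iterable of received lines with CR-LF already removed.
    """
    text = []
    for line in lines:
        if line == ".":
            return text       # a lone period marks the end of the text
        if line.startswith(".."):
            line = line[1:]   # collapse the doubled leading period
        text.append(line)
    raise ValueError("text ended without a terminating period line")
```

Anything after the terminating line is left unread, since it belongs to the next response.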
+ The intention is that text messages will usually be displayed on the
+ user's terminal whereas command/status responses will be interpreted
+ by the client program before any possible display is done.
+
+2.4.2. Status Responses
+
+ These are status reports from the server and indicate the response to
+ the last command received from the client.
+
+ Status response lines begin with a 3 digit numeric code which is
+ sufficient to distinguish all responses. Some of these may herald
+ the subsequent transmission of text.
+
+ The first digit of the response broadly indicates the success,
+ failure, or progress of the previous command.
+
+ 1xx - Informative message
+ 2xx - Command ok
+ 3xx - Command ok so far, send the rest of it.
+ 4xx - Command was correct, but couldn't be performed for
+ some reason.
+ 5xx - Command unimplemented, or incorrect, or a serious
+ program error occurred.
+
+ The next digit in the code indicates the function response category.
+
+ x0x - Connection, setup, and miscellaneous messages
+ x1x - Newsgroup selection
+ x2x - Article selection
+ x3x - Distribution functions
+ x4x - Posting
+ x8x - Nonstandard (private implementation) extensions
+ x9x - Debugging output
+
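The two tables above can be turned into a small lookup. This is only an illustrative sketch; the names `STATUS_CLASS`, `FUNCTION_CATEGORY`, and `classify` are assumptions, and the description strings paraphrase the tables.

```python
STATUS_CLASS = {
    "1": "informative message",
    "2": "command ok",
    "3": "command ok so far, send the rest of it",
    "4": "command was correct, but could not be performed",
    "5": "command unimplemented, incorrect, or serious program error",
}

FUNCTION_CATEGORY = {
    "0": "connection, setup, and miscellaneous messages",
    "1": "newsgroup selection",
    "2": "article selection",
    "3": "distribution functions",
    "4": "posting",
    "8": "nonstandard (private implementation) extensions",
    "9": "debugging output",
}


def classify(code: str):
    """Return (status class, function category) for a 3-digit code."""
    if len(code) != 3 or not (code.isascii() and code.isdigit()):
        raise ValueError("status code must be exactly three digits")
    return (
        STATUS_CLASS.get(code[0], "unknown"),
        FUNCTION_CATEGORY.get(code[1], "unknown"),
    )
```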
+ The exact response codes that should be expected from each command
+ are detailed in the description of that command. In addition, below
+ is listed a general set of response codes that may be received at any
+ time.
+
+ Certain status responses contain parameters such as numbers and
+ names. The number and type of such parameters is fixed for each
+ response code to simplify interpretation of the response.
+
+ Parameters are separated from the numeric response code and from each
+ other by a single space. All numeric parameters are decimal, and may
+ have leading zeros. All string parameters begin after the separating
+ space, and end before the following separating space or the CR-LF
+ pair at the end of the line. (String parameters may not, therefore,
+ contain spaces.) All text, if any, in the response which is not a
+ parameter of the response must follow and be separated from the last
+ parameter by a space. Also, note that the text following a response
+ number may vary in different implementations of the server. The
+ 3-digit numeric code should be used to determine what response was
+ sent.
+
+ Response codes not specified in this standard may be used for any
+ installation-specific additional commands also not specified. These
+ should be chosen to fit the pattern of x8x specified above. (Note
+ that debugging is provided for explicitly in the x9x response codes.)
+ The use of unspecified response codes for standard commands is
+ prohibited.
+
+ We have provided a response pattern x9x for debugging. Since much
+ debugging output may be classed as "informative messages", we would
+ expect, therefore, that responses 190 through 199 would be used for
+ various debugging outputs. There is no requirement in this
+ specification for debugging output, but if such is provided over the
+ connected stream, it must use these response codes. If appropriate
+ to a specific implementation, other x9x codes may be used for
+ debugging. (For example, 290 might be used to acknowledge a remote
+ debugging request.)
+
+2.4.3. General Responses
+
+ The following is a list of general response codes that may be sent by
+ the NNTP server. These are not specific to any one command, but may
+ be returned as the result of a connection, a failure, or some unusual
+ condition.
+
+ In general, 1xx codes may be ignored or displayed as desired; code
+ 200 or 201 is sent upon initial connection to the NNTP server
+ depending upon posting permission; code 400 will be sent when the
+ NNTP server discontinues service (by operator request, for example);
+ and 5xx codes indicate that the command could not be performed for
+ some unusual reason.
+
+ 100 help text
+ 190
+ through
+ 199 debug output
+
+ 200 server ready - posting allowed
+ 201 server ready - no posting allowed
+
+ 400 service discontinued
+
+ 500 command not recognized
+ 501 command syntax error
+ 502 access restriction or permission denied
+ 503 program fault - command not performed
+
+3. Command and Response Details
+
+ On the following pages are descriptions of each command recognized by
+ the NNTP server and the responses which will be returned by those
+ commands.
+
+ Each command is shown in upper case for clarity, although case is
+ ignored in the interpretation of commands by the NNTP server. Any
+ parameters are shown in lower case. A parameter shown in [square
+ brackets] is optional. For example, [GMT] indicates that the
+ triglyph GMT may be present or omitted.
+
+ Every command described in this section must be implemented by all
+ NNTP servers.
+
+ There is no prohibition against additional commands being added;
+ however, it is recommended that any such unspecified command begin
+ with the letter "X" to avoid conflict with later revisions of this
+ specification.
+
+ Implementors are reminded that such additional commands may not
+ redefine specified status response codes. Using additional
+ unspecified responses for standard commands is also prohibited.
+
+3.1. The ARTICLE, BODY, HEAD, and STAT commands
+
+ There are two forms to the ARTICLE command (and the related BODY,
+ HEAD, and STAT commands), each using a different method of specifying
+ which article is to be retrieved. When the ARTICLE command is
+ followed by a message-id in angle brackets ("<" and ">"), the first
+ form of the command is used; when a numeric parameter or no parameter
+ is supplied, the second form is invoked.
+
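A server might distinguish the two forms as sketched below. The helper name `article_selector` and the shape of its return value are assumptions made for illustration; the third element records whether the form is one that sets the "current article pointer".

```python
def article_selector(parameter=None):
    """Classify the parameter of ARTICLE, BODY, HEAD, or STAT.

    Returns (form, value, sets_pointer).
    """
    if parameter is None:
        # Second form, no parameter: the current article is assumed.
        return ("current", None, False)
    if len(parameter) > 2 and parameter.startswith("<") and parameter.endswith(">"):
        # First form: selection by message-id never alters the pointer.
        return ("message-id", parameter, False)
    if parameter.isascii() and parameter.isdigit():
        # Second form: a valid article number sets the pointer.
        return ("number", int(parameter), True)
    raise ValueError("parameter is neither a message-id nor a number")
```

Whether the number is actually valid within the selected newsgroup still has to be checked against the news database.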
+ The text of the article is returned as a textual response, as
+ described earlier in this document.
+
+ The HEAD and BODY commands are identical to the ARTICLE command
+ except that they respectively return only the header lines or text
+ body of the article.
+
+ The STAT command is similar to the ARTICLE command except that no
+ text is returned. When selecting by message number within a group,
+ the STAT command serves to set the current article pointer without
+ sending text. The returned acknowledgement response will contain the
+ message-id, which may be of some value. Using the STAT command to
+ select by message-id is valid but of questionable value, since a
+ selection by message-id does NOT alter the "current article pointer".
+
+3.1.1. ARTICLE (selection by message-id)
+
+ ARTICLE <message-id>
+
+ Display the header, a blank line, then the body (text) of the
+ specified article. Message-id is the message id of an article as
+ shown in that article's header. It is anticipated that the client
+ will obtain the message-id from a list provided by the NEWNEWS
+ command, from references contained within another article, or from
+ the message-id provided in the response to some other commands.
+
+ Please note that the internally-maintained "current article pointer"
+ is NOT ALTERED by this command. This is both to facilitate the
+ presentation of articles that may be referenced within an article
+ being read, and because of the semantic difficulties of determining
+ the proper sequence and membership of an article which may have been
+ posted to more than one newsgroup.
+
+3.1.2. ARTICLE (selection by number)
+
+ ARTICLE [nnn]
+
+ Displays the header, a blank line, then the body (text) of the
+ current or specified article. The optional parameter nnn is the
+ numeric id of an article in the current newsgroup and must be chosen
+ from the range of articles provided when the newsgroup was selected.
+ If it is omitted, the current article is assumed.
+
+ The internally-maintained "current article pointer" is set by this
+ command if a valid article number is specified.
+
+ [The following applies to both forms of the ARTICLE command.] A
+ response indicating the current article number, a message-id string,
+ and that text is to follow will be returned.
+
+ The message-id string returned is an identification string contained
+ within angle brackets ("<" and ">"), which is derived from the header
+ of the article itself. The Message-ID header line (required by
+ RFC850) from the article must be used to supply this information. If
+ the message-id header line is missing from the article, a single
+ digit "0" (zero) should be supplied within the angle brackets.
+
+ Since the message-id field is unique with each article, it may be
+ used by a news reading program to skip duplicate displays of articles
+ that have been posted more than once, or to more than one newsgroup.
+
+3.1.3. Responses
+
+ 220 n <a> article retrieved - head and body follow
+ (n = article number, <a> = message-id)
+ 221 n <a> article retrieved - head follows
+ 222 n <a> article retrieved - body follows
+ 223 n <a> article retrieved - request text separately
+ 412 no newsgroup has been selected
+ 420 no current article has been selected
+ 423 no such article number in this group
+ 430 no such article found
+
+3.2. The GROUP command
+
+3.2.1. GROUP
+
+ GROUP ggg
+
+ The required parameter ggg is the name of the newsgroup to be
+ selected (e.g. "net.news"). A list of valid newsgroups may be
+ obtained from the LIST command.
+
+ The successful selection response will return the article numbers of
+ the first and last articles in the group, and an estimate of the
+ number of articles on file in the group. It is not necessary that
+ the estimate be correct, although that is helpful; it must only be
+ equal to or larger than the actual number of articles on file. (Some
+ implementations will actually count the number of articles on file.
+ Others will just subtract first article number from last to get an
+ estimate.)
+
+ When a valid group is selected by means of this command, the
+ internally maintained "current article pointer" is set to the first
+ article in the group. If an invalid group is specified, the
+ previously selected group and article remain selected. If an empty
+ newsgroup is selected, the "current article pointer" is in an
+ indeterminate state and should not be used.
+
+ Note that the name of the newsgroup is not case-dependent. It must
+ otherwise match a newsgroup obtained from the LIST command or an
+ error will result.
+
+3.2.2. Responses
+
+ 211 n f l s group selected
+ (n = estimated number of articles in group,
+ f = first article number in the group,
+ l = last article number in the group,
+ s = name of the group.)
+ 411 no such news group
+
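A client could read the 211 response as sketched below, relying on the rule that parameters are separated by single spaces and that numeric parameters may carry leading zeros. The names `GroupStatus` and `parse_group_response` are hypothetical.

```python
from typing import NamedTuple


class GroupStatus(NamedTuple):
    estimate: int  # n: estimated number of articles in the group
    first: int     # f: first article number in the group
    last: int      # l: last article number in the group
    name: str      # s: name of the group


def parse_group_response(line: str) -> GroupStatus:
    """Parse the status line returned by the GROUP command."""
    fields = line.rstrip("\r\n").split(" ")
    if fields[0] == "411":
        raise LookupError("no such news group")
    if fields[0] != "211" or len(fields) < 5:
        raise ValueError("unexpected response to GROUP: %r" % line)
    # int() accepts leading zeros in decimal strings.
    estimate, first, last = (int(field) for field in fields[1:4])
    # Any text after the fourth parameter is free-form and ignored.
    return GroupStatus(estimate, first, last, fields[4])
```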
+3.3. The HELP command
+
+3.3.1. HELP
+
+ HELP
+
+ Provides a short summary of commands that are understood by this
+ implementation of the server. The help text will be presented as a
+ textual response, terminated by a single period on a line by itself.
+
+3.3.2. Responses
+
+ 100 help text follows
+
+3.4. The IHAVE command
+
+3.4.1. IHAVE
+
+ IHAVE <messageid>
+
+ The IHAVE command informs the server that the client has an article
+ whose id is <messageid>. If the server desires a copy of that
+ article, it will return a response instructing the client to send the
+ entire article. If the server does not want the article (if, for
+ example, the server already has a copy of it), a response indicating
+ that the article is not wanted will be returned.
+
+ If transmission of the article is requested, the client should send
+ the entire article, including header and body, in the manner
+ specified for text transmission from the server. A response code
+ indicating success or failure of the transferral of the article will
+ be returned.
+
+ This function differs from the POST command in that it is intended
+ for use in transferring already-posted articles between hosts.
+ Normally it will not be used when the client is a personal
+ newsreading program. In particular, this function will invoke the
+ server's news posting program with the appropriate settings (flags,
+ options, etc) to indicate that the forthcoming article is being
+ forwarded from another host.
+
+ The server may, however, elect not to post or forward the article if
+ after further examination of the article it deems it inappropriate to
+ do so. The 436 or 437 error codes may be returned as appropriate to
+ the situation.
+
+ Reasons for such subsequent rejection of an article may include such
+ problems as inappropriate newsgroups or distributions, disk space
+ limitations, article lengths, garbled headers, and the like. These
+ are typically restrictions enforced by the server host's news
+ software and not necessarily the NNTP server itself.
+
+3.4.2. Responses
+
+ 235 article transferred ok
+ 335 send article to be transferred. End with <CR-LF>.<CR-LF>
+ 435 article not wanted - do not send it
+ 436 transfer failed - try again later
+ 437 article rejected - do not try again
+
+ An implementation note:
+
+ Because some host news posting software may not be able to decide
+ immediately that an article is inappropriate for posting or
+ forwarding, it is acceptable to acknowledge the successful transfer
+ of the article and to later silently discard it. Thus it is
+ permitted to return the 235 acknowledgement code and later discard
+ the received article. This is not a fully satisfactory solution to
+ the problem. Perhaps some implementations will wish to send mail to
+ the author of the article in certain of these cases.
+
+3.5. The LAST command
+
+3.5.1. LAST
+
+ LAST
+
+ The internally maintained "current article pointer" is set to the
+ previous article in the current newsgroup. If already positioned at
+ the first article of the newsgroup, an error message is returned and
+ the current article remains selected.
+
+ The internally-maintained "current article pointer" is set by this
+ command.
+
+ A response indicating the current article number, and a message-id
+ string will be returned. No text is sent in response to this
+ command.
+
+3.5.2. Responses
+
+ 223 n a article retrieved - request text separately
+ (n = article number, a = unique article id)
+ 412 no newsgroup selected
+ 420 no current article has been selected
+ 422 no previous article in this group
+
+3.6. The LIST command
+
+3.6.1. LIST
+
+ LIST
+
+ Returns a list of valid newsgroups and associated information. Each
+ newsgroup is sent as a line of text in the following format:
+
+ group last first p
+
+ where <group> is the name of the newsgroup, <last> is the number of
+ the last known article currently in that newsgroup, <first> is the
+ number of the first article currently in the newsgroup, and <p> is
+ either 'y' or 'n' indicating whether posting to this newsgroup is
+ allowed ('y') or prohibited ('n').
+
+ The <first> and <last> fields will always be numeric. They may have
+ leading zeros. If the <last> field evaluates to less than the
+ <first> field, there are no articles currently on file in the
+ newsgroup.
+
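One line of the LIST output could be interpreted as in the following sketch. The function name `parse_list_line` and the dictionary keys are illustrative assumptions.

```python
def parse_list_line(line: str) -> dict:
    """Parse one 'group last first p' line from a LIST response."""
    group, last, first, flag = line.split()
    flag = flag.lower()
    if flag not in ("y", "n"):
        raise ValueError("posting flag must be 'y' or 'n'")
    last_number = int(last)    # leading zeros are accepted
    first_number = int(first)
    return {
        "group": group,
        "last": last_number,
        "first": first_number,
        "posting": flag == "y",
        # last < first means no articles are currently on file.
        "empty": last_number < first_number,
    }
```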
+ Note that posting may still be prohibited to a client even though the
+ LIST command indicates that posting is permitted to a particular
+ newsgroup. See the POST command for an explanation of client
+ prohibitions. The posting flag exists for each newsgroup because
+ some newsgroups are moderated or are digests, and therefore cannot be
+ posted to; that is, articles posted to them must be mailed to a
+ moderator who will post them for the submitter. This is independent
+ of the posting permission granted to a client by the NNTP server.
+
+ Please note that an empty list (i.e., the text body returned by this
+ command consists only of the terminating period) is a possible valid
+ response, and indicates that there are currently no valid newsgroups.
+
+3.6.2. Responses
+
+ 215 list of newsgroups follows
+
+3.7. The NEWGROUPS command
+
+3.7.1. NEWGROUPS
+
+ NEWGROUPS date time [GMT] [<distributions>]
+
+ A list of newsgroups created since <date and time> will be listed in
+ the same format as the LIST command.
+
+ The date is sent as 6 digits in the format YYMMDD, where YY is the
+ last two digits of the year, MM is the two digits of the month (with
+ leading zero, if appropriate), and DD is the day of the month (with
+ leading zero, if appropriate). The closest century is assumed as
+ part of the year (i.e., 86 specifies 1986, 30 specifies 2030, 99 is
+ 1999, 00 is 2000).
+
+ Time must also be specified. It must be as 6 digits HHMMSS with HH
+ being hours on the 24-hour clock, MM minutes 00-59, and SS seconds
+ 00-59. The time is assumed to be in the server's timezone unless the
+ token "GMT" appears, in which case both time and date are evaluated
+ at the 0 meridian.
+
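The "closest century" rule can be made concrete as follows. This is a sketch under one stated assumption: the century is chosen relative to a caller-supplied `reference_year` (for instance the server's current year), which the specification does not spell out. The function name `parse_nntp_datetime` is hypothetical, and the result carries no timezone; applying the optional GMT token is left to the caller.

```python
from datetime import datetime


def parse_nntp_datetime(date: str, time: str, reference_year: int) -> datetime:
    """Convert YYMMDD and HHMMSS strings into a naive datetime."""
    digits = date + time
    if len(date) != 6 or len(time) != 6 or not (digits.isascii() and digits.isdigit()):
        raise ValueError("date and time must each be six digits")
    yy = int(date[:2])
    base = reference_year - reference_year % 100
    # Try the previous, current, and next century; keep the nearest year.
    candidates = (base - 100 + yy, base + yy, base + 100 + yy)
    year = min(candidates, key=lambda y: abs(y - reference_year))
    # datetime() itself rejects out-of-range months, days, hours, etc.
    return datetime(year, int(date[2:4]), int(date[4:6]),
                    int(time[:2]), int(time[2:4]), int(time[4:6]))
```

With a reference year of 1986 this reproduces the examples given above (86, 30, 99, and 00).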
+ The optional parameter "distributions" is a list of distribution
+ groups, enclosed in angle brackets. If specified, the distribution
+ portion of a new newsgroup (e.g., 'net' in 'net.wombat') will be
+ examined for a match with the distribution categories listed, and
+ only those new newsgroups which match will be listed. If more than
+ one distribution group is to be listed, they must be separated by
+ commas within the angle brackets.
+
+ Please note that an empty list (i.e., the text body returned by this
+ command consists only of the terminating period) is a possible valid
+ response, and indicates that there are currently no new newsgroups.
+
+3.7.2. Responses
+
+ 231 list of new newsgroups follows
+
+3.8. The NEWNEWS command
+
+3.8.1. NEWNEWS
+
+ NEWNEWS newsgroups date time [GMT] [<distributions>]
+
+ A list of message-ids of articles posted to or received in the
+ specified newsgroups since the given date and time will be listed.
+ The format of the listing will be one message-id per line, as though
+ text were being sent. A single line consisting solely of one period
+ followed by CR-LF will terminate the list.
+
+ Date and time are in the same format as the NEWGROUPS command.
+
+ A newsgroup name containing a "*" (an asterisk) may be specified to
+ broaden the article search to some or all newsgroups. The asterisk
+ will be extended to match any part of a newsgroup name (e.g.,
+ net.micro* will match net.micro.wombat, net.micro.apple, etc). Thus
+ if only an asterisk is given as the newsgroup name, all newsgroups
+ will be searched for new news.
+
+ (Please note that the asterisk "*" expansion is a general
+ replacement; in particular, the specification of e.g., net.*.unix
+ should be correctly expanded to embrace names such as net.wombat.unix
+ and net.whocares.unix.)
+
+ Conversely, if no asterisk appears in a given newsgroup name, only
+ the specified newsgroup will be searched for new articles. Newsgroup
+ names must be chosen from those returned in the listing of available
+ groups. Multiple newsgroup names (including a "*") may be specified
+ in this command, separated by a comma. No comma shall appear after
+ the last newsgroup in the list. [Implementors are cautioned to keep
+ the 512 character command length limit in mind.]
+
+ The exclamation point ("!") may be used to negate a match. This can
+ be used to selectively omit certain newsgroups from an otherwise
+ larger list. For example, a newsgroups specification of
+ "net.*,mod.*,!mod.map.*" would specify that all net.<anything> and
+ all mod.<anything> EXCEPT mod.map.<anything> newsgroup names would be
+ matched. If used, the exclamation point must appear as the first
+ character of the given newsgroup name or pattern.
+
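The asterisk and exclamation-point rules can be sketched as below. Two points are assumptions rather than statements of the specification: a newsgroup is taken to match when it matches at least one plain pattern and no negated pattern, regardless of the order of the list, and comparison ignores case. The names `matches_newsgroups` and `_pattern_to_regex` are hypothetical.

```python
import re


def _pattern_to_regex(pattern: str):
    """Translate a newsgroup pattern, where '*' matches any text."""
    body = ".*".join(re.escape(part) for part in pattern.split("*"))
    return re.compile(body, re.IGNORECASE)


def matches_newsgroups(spec: str, group: str) -> bool:
    """Test a newsgroup name against a comma-separated NEWNEWS list."""
    wanted = False
    for pattern in spec.split(","):
        negated = pattern.startswith("!")
        if negated:
            pattern = pattern[1:]
        if _pattern_to_regex(pattern).fullmatch(group):
            if negated:
                return False  # an explicit exclusion always wins here
            wanted = True
    return wanted
```

For the example above, "net.*,mod.*,!mod.map.*" accepts net.micro.apple and rejects mod.map.usa.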
+ The optional parameter "distributions" is a list of distribution
+ groups, enclosed in angle brackets. If specified, the distribution
+ portion of an article's newsgroup (e.g., 'net' in 'net.wombat') will
+ be examined for a match with the distribution categories listed, and
+ only those articles which have at least one newsgroup belonging to
+
+ the list of distributions will be listed. If more than one
+ distribution group is to be supplied, they must be separated by
+ commas within the angle brackets.
+
+ The use of the IHAVE, NEWNEWS, and NEWGROUPS commands to distribute
+ news is discussed in an earlier part of this document.
+
+ Please note that an empty list (i.e., the text body returned by this
+ command consists only of the terminating period) is a possible valid
+ response, and indicates that there is currently no new news.
+
+3.8.2. Responses
+
+ 230 list of new articles by message-id follows
+
+3.9. The NEXT command
+
+3.9.1. NEXT
+
+ NEXT
+
+ The internally maintained "current article pointer" is advanced to
+ the next article in the current newsgroup. If no more articles
+ remain in the current group, an error message is returned and the
+ current article remains selected.
+
+ The internally-maintained "current article pointer" is set by this
+ command.
+
+ A response indicating the current article number, and the message-id
+ string will be returned. No text is sent in response to this
+ command.
+
+3.9.2. Responses
+
+ 223 n a article retrieved - request text separately
+ (n = article number, a = unique article id)
+ 412 no newsgroup selected
+ 420 no current article has been selected
+ 421 no next article in this group
+
+3.10. The POST command
+
+3.10.1. POST
+
+ POST
+
+ If posting is allowed, response code 340 is returned to indicate that
+ the article to be posted should be sent. Response code 440 indicates
+ that posting is prohibited for some installation-dependent reason.
+
+ If posting is permitted, the article should be presented in the
+ format specified by RFC850, and should include all required header
+ lines. After the article's header and body have been completely sent
+ by the client to the server, a further response code will be returned
+ to indicate success or failure of the posting attempt.
+
+ The text forming the header and body of the message to be posted
+ should be sent by the client using the conventions for text received
+ from the news server: A single period (".") on a line indicates the
+ end of the text, with lines starting with a period in the original
+ text having that period doubled during transmission.
+
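On the sending side, the period-doubling convention might be applied as in this sketch. It assumes the article is held as a list of lines without line terminators; `encode_article` is an illustrative name.

```python
def encode_article(lines) -> str:
    """Prepare article lines for transmission after a 340 or 335 response."""
    out = []
    for line in lines:
        if line.startswith("."):
            line = "." + line   # double a leading period
        out.append(line + "\r\n")
    out.append(".\r\n")         # a lone period ends the text
    return "".join(out)
```

The same encoding serves for articles sent in response to IHAVE, which uses the same text transmission rules.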
+ No attempt shall be made by the server to filter characters, fold or
+ limit lines, or otherwise process incoming text. It is our intent
+ that the server just pass the incoming message to be posted to the
+ server installation's news posting software, which is separate from
+ this specification. See RFC850 for more details.
+
+ Since most installations will want the client news program to allow
+ the user to prepare his message using some sort of text editor, and
+ transmit it to the server for posting only after it is composed, the
+ client program should take note of the herald message that greeted it
+ when the connection was first established. This message indicates
+ whether postings from that client are permitted or not, and can be
+ used to caution the user that his access is read-only if that is the
+ case. This will prevent the user from wasting a good deal of time
+ composing a message only to find posting of the message was denied.
+ The method and determination of which clients and hosts may post is
+ installation dependent and is not covered by this specification.
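
A minimal sketch of the greeting check described above, assuming the client has already read the first line sent by the server into a string. `posting_allowed` is a hypothetical name.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper (not defined by RFC 977 or by this patch).
 * Inspects the greeting line sent on connection.
 * Returns 1 for code 200 (posting allowed), 0 for code 201 (no posting),
 * and -1 for anything else. */
static int posting_allowed(const char *greeting)
{
    char *end;
    long code = strtol(greeting, &end, 10);

    if (end == greeting)   /* no digits at the start of the line */
        return -1;
    if (code == 200)
        return 1;
    if (code == 201)
        return 0;
    return -1;
}
```

A news reader could call this once after connecting and disable its compose function when the result is 0.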
+
+3.10.2. Responses
+
+ 240 article posted ok
+ 340 send article to be posted. End with <CR-LF>.<CR-LF>
+ 440 posting not allowed
+ 441 posting failed
+
+ (for reference, one of the following codes will be sent upon initial
+ connection; the client program should determine whether posting is
+ generally permitted from these:)
+
+ 200 server ready - posting allowed
+ 201 server ready - no posting allowed
+
+3.11. The QUIT command
+
+3.11.1. QUIT
+
+ QUIT
+
+ The server process acknowledges the QUIT command and then closes the
+ connection to the client. This is the preferred method for a client
+ to indicate that it has finished all its transactions with the NNTP
+ server.
+
+ If a client simply disconnects (or the connection times out, or some
+ other fault occurs), the server should gracefully cease its attempts
+ to service the client.
+
+3.11.2. Responses
+
+ 205 closing connection - goodbye!
+
+3.12. The SLAVE command
+
+3.12.1. SLAVE
+
+ SLAVE
+
+ Indicates to the server that this client connection is to a slave
+ server, rather than a user.
+
+ This command is intended for use in separating connections to single
+ users from those to subsidiary ("slave") servers. It may be used to
+ indicate that priority should therefore be given to requests from
+ this client, as it is presumably serving more than one person. It
+ might also be used to determine which connections to close when
+ system load levels are exceeded, perhaps giving preference to slave
+ servers. The actual use this command is put to is entirely
+ implementation dependent, and may vary from one host to another. In
+ NNTP servers which do not give priority to slave servers, this
+ command must nonetheless be recognized and acknowledged.
+
+3.12.2. Responses
+
+ 202 slave status noted
+
+4. Sample Conversations
+
+ These are samples of the conversations that might be expected with
+ the news server in hypothetical sessions. The notation C: indicates
+ commands sent to the news server from the client program; S: indicates
+ responses received from the server by the client.
+
+4.1. Example 1 - relative access with NEXT
+
+ S: (listens at TCP port 119)
+
+ C: (requests connection on TCP port 119)
+ S: 200 wombatvax news server ready - posting ok
+
+ (client asks for a current newsgroup list)
+ C: LIST
+ S: 215 list of newsgroups follows
+ S: net.wombats 00543 00501 y
+ S: net.unix-wizards 10125 10011 y
+ (more information here)
+ S: net.idiots 00100 00001 n
+ S: .
+
+ (client selects a newsgroup)
+ C: GROUP net.unix-wizards
+ S: 211 104 10011 10125 net.unix-wizards group selected
+ (there are 104 articles on file, from 10011 to 10125)
+
+ (client selects an article to read)
+ C: STAT 10110
+ S: 223 10110 <23445@sdcsvax.ARPA> article retrieved - statistics only
+ (article 10110 selected, its message-id is <23445@sdcsvax.ARPA>)
+
+ (client examines the header)
+ C: HEAD
+ S: 221 10110 <23445@sdcsvax.ARPA> article retrieved - head follows
+ S: (text of the header appears here)
+ S: .
+
+ (client wants to see the text body of the article)
+ C: BODY
+ S: 222 10110 <23445@sdcsvax.ARPA> article retrieved - body follows
+ S: (body text here)
+ S: .
+
+ (client selects next article in group)
+ C: NEXT
+ S: 223 10113 <21495@nudebch.uucp> article retrieved - statistics only
+ (article 10113 was next in group)
+
+ (client finishes session)
+ C: QUIT
+ S: 205 goodbye.
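
The 211 line in Example 1 carries the article count, the first and last article numbers, and the group name. A minimal sketch of reading those fields follows. `struct group_info` and `parse_211` are hypothetical names, and the 127-character limit on the group name is an assumption of this sketch, not a limit taken from the specification.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical container for the fields of a 211 response. */
struct group_info {
    long count;      /* number of articles in the group */
    long first;      /* first article number */
    long last;       /* last article number */
    char name[128];  /* newsgroup name (length limit is an assumption) */
};

/* Hypothetical helper (not defined by RFC 977 or by this patch).
 * Parses "211 n f l s free text" into *g.
 * Returns 0 on success, -1 if the line is not a 211 response. */
static int parse_211(const char *line, struct group_info *g)
{
    int code;

    if (sscanf(line, "%3d %ld %ld %ld %127s",
               &code, &g->count, &g->first, &g->last, g->name) != 5)
        return -1;
    return code == 211 ? 0 : -1;
}
```

As Example 3 notes, when the count is 0 the first and last article numbers should be ignored.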
+
+4.2. Example 2 - absolute article access with ARTICLE
+
+ S: (listens at TCP port 119)
+
+ C: (requests connection on TCP port 119)
+ S: 201 UCB-VAX netnews server ready -- no posting allowed
+
+ C: GROUP msgs
+ S: 211 103 402 504 msgs Your new group is msgs
+ (there are 103 articles, from 402 to 504)
+
+ C: ARTICLE 401
+ S: 423 No such article in this newsgroup
+
+ C: ARTICLE 402
+ S: 220 402 <4105@ucbvax.ARPA> Article retrieved, text follows
+ S: (article header and body follow)
+ S: .
+
+ C: HEAD 403
+ S: 221 403 <3108@mcvax.UUCP> Article retrieved, header follows
+ S: (article header follows)
+ S: .
+
+ C: QUIT
+ S: 205 UCB-VAX news server closing connection. Goodbye.
+
+4.3. Example 3 - NEWGROUPS command
+
+ S: (listens at TCP port 119)
+
+ C: (requests connection on TCP port 119)
+ S: 200 Imaginary Institute News Server ready (posting ok)
+
+ (client asks for new newsgroups since April 3, 1985)
+ C: NEWGROUPS 850403 020000
+
+ S: 231 New newsgroups since 03/04/85 02:00:00 follow
+ S: net.music.gdead
+ S: net.games.sources
+ S: .
+
+ C: GROUP net.music.gdead
+ S: 211 0 1 1 net.music.gdead Newsgroup selected
+ (there are no articles in that newsgroup, and
+ the first and last article numbers should be ignored)
+
+ C: QUIT
+ S: 205 Imaginary Institute news server ceasing service. Bye!
+
+4.4. Example 4 - posting a news article
+
+ S: (listens at TCP port 119)
+
+ C: (requests connection on TCP port 119)
+ S: 200 BANZAIVAX news server ready, posting allowed.
+
+ C: POST
+ S: 340 Continue posting; Period on a line by itself to end
+ C: (transmits news article in RFC850 format)
+ C: .
+ S: 240 Article posted successfully.
+
+ C: QUIT
+ S: 205 BANZAIVAX closing connection. Goodbye.
+
+4.5. Example 5 - interruption due to operator request
+
+ S: (listens at TCP port 119)
+
+ C: (requests connection on TCP port 119)
+ S: 201 genericvax news server ready, no posting allowed.
+
+ (assume normal conversation for some time, and
+ that a newsgroup has been selected)
+
+ C: NEXT
+ S: 223 1013 <5734@mcvax.UUCP> Article retrieved; text separate.
+
+ C: HEAD
+ S: 221 1013 <5734@mcvax.UUCP> Article retrieved; head follows.
+
+ S: (sends head of article, but halfway through is
+ interrupted by an operator request. The following
+ then occurs, without client intervention.)
+
+ S: (ends current line with a CR-LF pair)
+ S: .
+ S: 400 Connection closed by operator. Goodbye.
+ S: (closes connection)
+
+4.6. Example 6 - Using the news server to distribute news between
+ systems.
+
+ S: (listens at TCP port 119)
+
+ C: (requests connection on TCP port 119)
+ S: 201 Foobar NNTP server ready (no posting)
+
+ (client asks for new newsgroups since 2 am, May 15, 1985)
+ C: NEWGROUPS 850515 020000
+ S: 231 New newsgroups since 850515 follow
+ S: net.fluff
+ S: net.lint
+ S: .
+
+ (client asks for new news articles since 2 am, May 15, 1985)
+ C: NEWNEWS * 850515 020000
+ S: 230 New news since 850515 020000 follows
+ S: <1772@foo.UUCP>
+ S: <87623@baz.UUCP>
+ S: <17872@GOLD.CSNET>
+ S: .
+
+ (client asks for article <1772@foo.UUCP>)
+ C: ARTICLE <1772@foo.UUCP>
+ S: 220 <1772@foo.UUCP> All of article follows
+ S: (sends entire message)
+ S: .
+
+ (client asks for article <87623@baz.UUCP>)
+ C: ARTICLE <87623@baz.UUCP>
+ S: 220 <87623@baz.UUCP> All of article follows
+ S: (sends entire message)
+ S: .
+
+ (client asks for article <17872@GOLD.CSNET>)
+ C: ARTICLE <17872@GOLD.CSNET>
+ S: 220 <17872@GOLD.CSNET> All of article follows
+ S: (sends entire message)
+ S: .
+
+ (client offers an article it has received recently)
+ C: IHAVE <4105@ucbvax.ARPA>
+ S: 435 Already seen that one, where you been?
+
+ (client offers another article)
+ C: IHAVE <4106@ucbvax.ARPA>
+ S: 335 News to me! <CRLF.CRLF> to end.
+ C: (sends article)
+ C: .
+ S: 235 Article transferred successfully. Thanks.
+
+ (or)
+
+ S: 436 Transfer failed.
+
+ (client is all through with the session)
+ C: QUIT
+ S: 205 Foobar NNTP server bids you farewell.
+
+4.7. Summary of commands and responses.
+
+ The following are the commands recognized and responses returned by
+ the NNTP server.
+
+4.7.1. Commands
+
+ ARTICLE
+ BODY
+ GROUP
+ HEAD
+ HELP
+ IHAVE
+ LAST
+ LIST
+ NEWGROUPS
+ NEWNEWS
+ NEXT
+ POST
+ QUIT
+ SLAVE
+ STAT
+
+4.7.2. Responses
+
+ 100 help text follows
+ 199 debug output
+
+ 200 server ready - posting allowed
+ 201 server ready - no posting allowed
+ 202 slave status noted
+ 205 closing connection - goodbye!
+ 211 n f l s group selected
+ 215 list of newsgroups follows
+ 220 n <a> article retrieved - head and body follow
+ 221 n <a> article retrieved - head follows
+ 222 n <a> article retrieved - body follows
+ 223 n <a> article retrieved - request text separately
+ 230 list of new articles by message-id follows
+ 231 list of new newsgroups follows
+ 235 article transferred ok
+ 240 article posted ok
+
+ 335 send article to be transferred. End with <CR-LF>.<CR-LF>
+ 340 send article to be posted. End with <CR-LF>.<CR-LF>
+
+ 400 service discontinued
+ 411 no such news group
+ 412 no newsgroup has been selected
+ 420 no current article has been selected
+ 421 no next article in this group
+ 422 no previous article in this group
+ 423 no such article number in this group
+ 430 no such article found
+ 435 article not wanted - do not send it
+ 436 transfer failed - try again later
+ 437 article rejected - do not try again.
+ 440 posting not allowed
+ 441 posting failed
+
+ 500 command not recognized
+ 501 command syntax error
+ 502 access restriction or permission denied
+ 503 program fault - command not performed
+
+4.8. A Brief Word about the USENET News System
+
+ In the UNIX world, which traditionally has been linked by 1200 baud
+ dial-up telephone lines, the USENET News system has evolved to handle
+ central storage, indexing, retrieval, and distribution of news. With
+ the exception of its underlying transport mechanism (UUCP), USENET
+ News is an efficient means of providing news and bulletin service to
+ subscribers on UNIX and other hosts worldwide. The USENET News
+ system is discussed in detail in RFC 850. It runs on most versions
+ of UNIX and on many other operating systems, and is customarily
+ distributed without charge.
+
+ USENET uses a spooling area on the UNIX host to store news articles,
+ one per file. Each article consists of a series of header lines,
+ which contain the sender's identification and organizational
+ affiliation, timestamps, electronic mail reply paths, subject,
+ newsgroup (subject category), and the like. A complete news article
+ is reproduced in its entirety below. Please consult RFC 850 for more
+ details.
+
+ Relay-Version: version B 2.10.3 4.3bsd-beta 6/6/85; site
+ sdcsvax.UUCP
+ Posting-Version: version B 2.10.1 6/24/83 SMI; site unitek.uucp
+ Path:sdcsvax!sdcrdcf!hplabs!qantel!ihnp4!alberta!ubc-vision!unitek
+ !honman
+ From: honman@unitek.uucp (Man Wong)
+ Newsgroups: net.unix-wizards
+ Subject: foreground -> background ?
+ Message-ID: <167@unitek.uucp>
+ Date: 25 Sep 85 23:51:52 GMT
+ Date-Received: 29 Sep 85 09:54:48 GMT
+ Reply-To: honman@unitek.UUCP (Hon-Man Wong)
+ Distribution: net.all
+ Organization: Unitek Technologies Corporation
+ Lines: 12
+
+ I have a process (C program) which generates a child and waits for
+ it to return. What I would like to do is to be able to run the
+ child process interactively for a while before kicking itself into
+ the background so I can return to the parent process (while the
+ child process is RUNNING in the background). Can it be done? And
+ if it can, how?
+
+ Please reply by E-mail. Thanks in advance.
+
+ Hon-Man Wong
+
+5. References
+
+ [1] Crocker, D., "Standard for the Format of ARPA Internet Text
+ Messages", RFC-822, Department of Electrical Engineering,
+ University of Delaware, August, 1982.
+
+ [2] Horton, M., "Standard for Interchange of USENET Messages",
+ RFC-850, USENET Project, June, 1983.
+
+ [3] Postel, J., "Transmission Control Protocol - DARPA Internet
+ Program Protocol Specification", RFC-793, USC/Information
+ Sciences Institute, September, 1981.
+
+ [4] Postel, J., "Simple Mail Transfer Protocol", RFC-821,
+ USC/Information Sciences Institute, August, 1982.
+
+6. Acknowledgements
+
+ The authors wish to express their heartfelt thanks to those many
+ people who contributed to this specification, and especially to Erik
+ Fair and Chuq von Rospach, without whose inspiration this whole thing
+ would not have been necessary.
+
+7. Notes
+
+ <1> UNIX is a trademark of Bell Laboratories.
prefs_filtering.c prefs_filtering.h \
mbox_folder.c mbox_folder.h \
quote_fmt_lex.l quote_fmt_lex.h \
- quote_fmt_parse.y quote_fmt.h
+ quote_fmt_parse.y quote_fmt.h \
+ matcher_parser_lex.l matcher_parser_lex.h \
+ matcher_parser_parse.y matcher_parser.h
EXTRA_DIST = \
quote_fmt_parse.h \
}
return processed_cmd;
}
+
+
+/* ************************************************************ */
+
+/*
+static void matcher_parse (gchar * str)
+{
+ matcher_parser_scan_string(str);
+ matcher_parserparse();
+}
+*/
{
RecvProtocol protocol;
gboolean active;
+ gint auth;
protocol = GPOINTER_TO_INT
(gtk_object_get_user_data(GTK_OBJECT(menuitem)));
5, 0);
gtk_widget_show(basic.uid_label);
gtk_widget_show(basic.pass_label);
+ gtk_widget_show(basic.uid_entry);
+ gtk_widget_show(basic.pass_entry);
gtk_table_set_row_spacing (GTK_TABLE (basic.serv_table),
7, VSPACING_NARROW);
gtk_widget_set_sensitive(basic.uid_label, TRUE);
gtk_widget_set_sensitive(basic.pass_label, TRUE);
+ gtk_widget_set_sensitive(basic.uid_entry, TRUE);
+ gtk_widget_set_sensitive(basic.pass_entry, TRUE);
/* update userid/passwd sensitive state */
prefs_account_nntpauth_toggled
5, VSPACING_NARROW);
gtk_widget_hide(basic.uid_label);
gtk_widget_hide(basic.pass_label);
+ gtk_widget_hide(basic.uid_entry);
+ gtk_widget_hide(basic.pass_entry);
gtk_table_set_row_spacing (GTK_TABLE (basic.serv_table),
7, 0);
gtk_widget_set_sensitive(basic.uid_label, TRUE);
gtk_widget_set_sensitive(basic.pass_label, TRUE);
+ gtk_widget_set_sensitive(basic.uid_entry, TRUE);
+ gtk_widget_set_sensitive(basic.pass_entry, TRUE);
gtk_widget_set_sensitive(receive.pop3_frame, FALSE);
prefs_account_mailcmd_toggled
(GTK_TOGGLE_BUTTON(basic.mailcmd_chkbtn), NULL);
5, 0);
gtk_widget_show(basic.uid_label);
gtk_widget_show(basic.pass_label);
+ gtk_widget_show(basic.uid_entry);
+ gtk_widget_show(basic.pass_entry);
gtk_table_set_row_spacing (GTK_TABLE (basic.serv_table),
7, VSPACING_NARROW);
gtk_widget_set_sensitive(basic.uid_label, TRUE);
gtk_widget_set_sensitive(basic.pass_label, TRUE);
+ gtk_widget_set_sensitive(basic.uid_entry, TRUE);
+ gtk_widget_set_sensitive(basic.pass_entry, TRUE);
gtk_widget_set_sensitive(receive.pop3_frame, FALSE);
gtk_widget_set_sensitive(basic.smtpserv_entry, TRUE);
gtk_widget_set_sensitive(basic.smtpserv_label, TRUE);
5, 0);
gtk_widget_show(basic.uid_label);
gtk_widget_show(basic.pass_label);
+ gtk_widget_show(basic.uid_entry);
+ gtk_widget_show(basic.pass_entry);
gtk_table_set_row_spacing (GTK_TABLE (basic.serv_table),
7, VSPACING_NARROW);
gtk_widget_set_sensitive(basic.uid_label, TRUE);
gtk_widget_set_sensitive(basic.pass_label, TRUE);
+ gtk_widget_set_sensitive(basic.uid_entry, TRUE);
+ gtk_widget_set_sensitive(basic.pass_entry, TRUE);
gtk_widget_set_sensitive(receive.pop3_frame, TRUE);
gtk_widget_set_sensitive(basic.smtpserv_entry, TRUE);
gtk_widget_set_sensitive(basic.smtpserv_label, TRUE);