The twtxt Protocol

Document ID: 2be39d96c164423883fe234f4774d067
Last Update: 2020-10-22

Abstract

This file documents twtxt, an HTTP-based protocol and file format that can be used for federated social media communications.

1. Introduction

twtxt is a simple federated microblogging protocol built on HTTP. It works with a plain text file, usually named twtxt.txt.

2. File Format

The file used to exchange twtxt posts is a newline-delimited plain text file. The file MUST be encoded as UTF-8.

When generating a file, you MUST use a line feed (\n) to separate the lines. When consuming files generated by others, an implementation MAY accept other delimiters such as CRLF.

When generating the file, the lines SHOULD NOT have trailing whitespace. A consumer SHOULD ignore any trailing whitespace it encounters.

The file SHOULD be generated with ordered lines, going from least recent to most recent. Consumers MUST handle files with arbitrary orderings.

2.1. Comments

If a line starts with a # sign, it is considered a comment and MUST be ignored. These lines can be purely informational (and useless to machines), or they can include human- or machine-readable metadata. If the client can parse the metadata, it CAN use it.

When generating the file, comments SHOULD include one ASCII space after the # sign. If the comment is empty, the space can be omitted.

2.2. Posts

A non-comment line is a post. Post lines MUST be separated into two parts by a horizontal ASCII TAB.

The first part is the date and time at which the message was posted. It must be in the format specified by RFC 3339.

The second part, everything after the TAB up to the newline, is the message content.

3. Protocol

A client implementing twtxt MUST support HTTP. It MAY choose to support additional file-transfer protocols (such as IPFS or Gopher), but those SHOULD be open specifications, ideally with multiple implementations.

3.1. HTTP

3.1.1. Server

When serving requests, the server SHOULD NOT depend on any headers being present other than Host. It MUST be able to serve clients via HTTP/1.1, and SHOULD try to serve them over HTTP/1.0 and HTTP/2 as well.

3.1.2. Client

The client MUST send a User-Agent header. This header MUST include the name of the program and SHOULD include a version number.

If the user also publishes their posts, the client SHOULD include the nickname and twtxt URL of the user in the User-Agent header.

The User-Agent header SHOULD be in the format `$NAME/$VERSION (+$URL; @$NICK)`. Here's an example User-Agent header:

    twtxt/1.2.3 (+https://example.com/twtxt.txt; @leo)

3.2. HTTPS

3.2.1. Client

A client SHOULD support fetching twtxt feeds over HTTPS. It MAY pick its own method of accepting and rejecting certificates and ciphers. A client CAN use system certificate stores, or it CAN choose another method such as TOFU (Trust On First Use).

3.3. Gemini

Gemini is a new network protocol, similar to Gopher. The Gemini ecosystem and community are likely to overlap with the twtxt ones.

A server CAN support serving twtxt files over Gemini. The file contents MUST NOT differ from those served over the other transport protocols.

3.4. Other protocols

* Gopher
* IPFS

4. Feed Metadata

A feed CAN include metadata in the form of comments. The basic format of a metadata comment is

    # key = value

When generating the file, the server SHOULD format it exactly as above. A client MAY consider extra whitespace insignificant and ignore it.

Various pieces of metadata are commonly found on existing twtxt feeds. Some of these have gained enough usage to be considered official.

4.1. Official Metadata

Below is a list of official metadata. It is recommended that every server produce them and every client consume them.

4.1.1. nick

nick is used to show the preferred nickname of the user. Since one user mentioning another can write their nick as they wish, this piece of metadata provides the opportunity to publish a correct one.

A twtxt feed MAY include this field. It MUST NOT include multiple instances. If a client nevertheless finds more than one while consuming a feed, it SHOULD use the last one.

4.1.2. url

url is used to announce the canonical URL of a twtxt feed. If the client fetches a feed and finds that the announced URL differs from the one it used, it SHOULD update to the canonical URL.

A twtxt feed MAY include this field. If there are multiple instances, all of them can be considered valid and the client may choose one arbitrarily.

5. Discovery Methods

There are various methods to discover users on the twtxt network.

5.1. Mentions on posts

When a user that you follow mentions another user in their post, they will include the nickname and feed URL of that user in the message. Using this, it is possible to extend your network and find more people to follow.

5.2. User-Agent strings

twtxt clients CAN include the nickname and feed URL of their user in the User-Agent header of HTTP requests. If you have access to your server logs, you can use these headers to discover the users who follow you.

5.3. Third-party registries

Aside from the previously mentioned methods, there are also third-party registries that can be utilized for discovery. While those registries might be useful, they usually employ some anti-patterns that end up centralizing the otherwise decentralized twtxt network.
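
As an illustration of the file format rules in Sections 2 and 4, the following is a minimal consumer-side sketch in Python. It is not part of the protocol, and the names parse_timestamp and parse_feed are hypothetical. It assumes that malformed lines are skipped silently, that a metadata key contains no whitespace, and that only the timestamp forms accepted by Python's datetime.fromisoformat (plus a trailing "Z") need to be handled. The specification itself does not mandate any of these choices.

```python
from datetime import datetime
from typing import Dict, List, Tuple

Post = Tuple[datetime, str]


def parse_timestamp(text: str) -> datetime:
    """Parse a common RFC 3339 timestamp; raise ValueError otherwise."""
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so normalize it to an explicit UTC offset first.
    if text.endswith(("Z", "z")):
        text = text[:-1] + "+00:00"
    stamp = datetime.fromisoformat(text)
    if stamp.tzinfo is None:
        # RFC 3339 requires an offset; treat a missing one as invalid.
        raise ValueError("timestamp has no UTC offset")
    return stamp


def parse_feed(text: str) -> Tuple[Dict[str, List[str]], List[Post]]:
    """Split feed text into metadata (key -> all values) and sorted posts."""
    metadata: Dict[str, List[str]] = {}
    posts: List[Post] = []
    # Accept CRLF as well as LF line endings (Section 2).
    for raw in text.replace("\r\n", "\n").split("\n"):
        line = raw.rstrip()  # trailing whitespace is ignored (Section 2)
        if not line:
            continue
        if line.startswith("#"):
            # Metadata comments look like "# key = value" (Section 4).
            key, sep, value = line[1:].partition("=")
            key = key.strip()
            # Assumption: a key is a single word; other comments are skipped.
            if sep and key and not any(ch.isspace() for ch in key):
                metadata.setdefault(key, []).append(value.strip())
            continue
        # A post is "<timestamp> TAB <content>" (Section 2.2).
        stamp_text, tab, content = line.partition("\t")
        if not tab:
            continue  # assumption: lines without a TAB are dropped
        try:
            stamp = parse_timestamp(stamp_text)
        except ValueError:
            continue  # assumption: unparseable timestamps are dropped
        posts.append((stamp, content))
    # Feeds may arrive in any order, so sort from least to most recent.
    posts.sort(key=lambda post: post[0])
    return metadata, posts
```

Because every value is kept per key, a caller can take the last nick with metadata["nick"][-1], as Section 4.1.1 suggests, and pick any entry of metadata["url"], as Section 4.1.2 allows.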