UU Encoder & Decoder
Encode and decode UU-format strings, the classic Unix-to-Unix encoding for email attachments and
An Encoding Format From the Pre-Internet Era
UUencoding (Unix-to-Unix encoding) was developed in 1980 by Mary Ann Horton at UC Berkeley to transmit binary files over UUCP (Unix-to-Unix Copy Protocol), a store-and-forward network system that predated the modern internet by over a decade. UUCP connected Unix machines via dial-up modems and leased lines, and the mail transport software could only handle 7-bit ASCII text. Binary files (executables, compressed archives, images) contained byte values outside the printable ASCII range that would be corrupted or stripped during transit. UUencoding solved this by splitting every 3 bytes of binary data into four 6-bit groups and mapping each group to a printable ASCII character, so 3 bytes become 4 characters and any file is safe for text-only transmission.
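The 3-byte to 4-character step can be sketched in a few lines of Python. This is a minimal illustration, not a full encoder: the helper name uu_encode_triplet is hypothetical, and the result is compared against the standard library's binascii.b2a_uu, which also prepends a length character and appends a newline.

```python
import binascii


def uu_encode_triplet(chunk: bytes) -> str:
    """Encode exactly 3 bytes as 4 printable characters (classic UU mapping)."""
    if len(chunk) != 3:
        raise ValueError("expected exactly 3 bytes")
    # Pack the three bytes into one 24-bit integer.
    n = (chunk[0] << 16) | (chunk[1] << 8) | chunk[2]
    # Split it into four 6-bit groups, most significant first, and add 32
    # so every value lands in the printable range that starts at the space.
    return "".join(chr(32 + ((n >> shift) & 0x3F)) for shift in (18, 12, 6, 0))


encoded = uu_encode_triplet(b"Cat")
print(encoded)  # 0V%T

# The standard library produces the same four characters, preceded by the
# length character "#" (3 + 32) and followed by a newline.
line = binascii.b2a_uu(b"Cat")
print(line)  # b'#0V%T\n'
```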
UUencode vs Base64: Why Base64 Won
Both UUencode and Base64 serve the same fundamental purpose (binary-to-text encoding) with similar overhead (a size increase of roughly 33%). Base64 won the adoption war for several practical reasons. UUencode's character set starts from the space character (ASCII 32), and the space character caused problems with mail transfer agents that stripped trailing whitespace from lines. Later implementations substituted the backtick (ASCII 96) for the space to work around this, but the alphabet still relied on punctuation characters that did not survive every mail gateway or character-set translation. Base64's character set (A-Z, a-z, 0-9, +, /) contains no whitespace and far less punctuation, which avoids these problems. When MIME (Multipurpose Internet Mail Extensions) was standardized in the 1990s as the format for email attachments, it adopted Base64 as its binary encoding, effectively retiring UUencode from mainstream use. Base64 became the default binary encoding for the modern internet, while UUencode remained an artifact of the pre-MIME era.
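The trailing-whitespace problem is easy to reproduce. The sketch below encodes three zero bytes, whose 6-bit groups all have the value 0, using Python's binascii and base64 modules. It assumes Python 3.7 or later for the backtick keyword of binascii.b2a_uu, and it uses rstrip() as a stand-in for a mail agent that trims line ends.

```python
import base64
import binascii

data = bytes(3)  # three zero bytes: every 6-bit group has the value 0

classic = binascii.b2a_uu(data)                 # value 0 -> space (ASCII 32)
graved = binascii.b2a_uu(data, backtick=True)   # value 0 -> backtick (ASCII 96)
b64 = base64.b64encode(data)                    # value 0 -> "A"

print(classic)  # b'#    \n'
print(graved)   # b'#````\n'
print(b64)      # b'AAAA'

# Stand-in for a mail agent that strips trailing whitespace: the classic
# line loses all four data characters, the backtick variant keeps them.
print(classic.rstrip())  # b'#'
print(graved.rstrip())   # b'#````'
```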
The UUencode Format Structure
A UUencoded file has a self-describing structure that was remarkably practical for its era. It begins with a header line: the word "begin" followed by a three-digit Unix file permission mode (like 644 or 755) and the original filename. The encoded data follows in lines of up to 61 characters: one length character indicating how many decoded bytes the line represents (at most 45), followed by up to 60 encoded characters. The data ends with a line containing only a single backtick (or space), which stands for a length of zero, followed by a line with the word "end". This self-describing format meant the recipient could decode the file with the uudecode command and automatically have the correct filename and Unix permissions applied, a genuinely useful feature for system administrators transferring configuration files and executables between Unix machines via email or newsgroups.
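As a rough sketch of this layout, the following Python builds and parses a complete UUencoded file using binascii for the per-line work. The function names uu_encode_file and uu_decode_file are hypothetical, the backtick keyword assumes Python 3.7 or later, and the parser assumes a clean input with nothing before the begin line.

```python
import binascii


def uu_encode_file(data: bytes, filename: str, mode: int = 0o644) -> str:
    """Wrap data in a begin/end envelope, 45 input bytes per line."""
    lines = [f"begin {mode:03o} {filename}"]
    for i in range(0, len(data), 45):
        chunk = binascii.b2a_uu(data[i:i + 45], backtick=True)
        lines.append(chunk.decode("ascii").rstrip("\n"))
    lines.append("`")   # zero-length line marks the end of the data
    lines.append("end")
    return "\n".join(lines) + "\n"


def uu_decode_file(text: str):
    """Return (filename, mode, data) from a single UUencoded file."""
    lines = text.splitlines()
    header = lines[0].split(" ", 2)
    if len(header) != 3 or header[0] != "begin":
        raise ValueError("missing begin header")
    mode, filename = int(header[1], 8), header[2]
    out = bytearray()
    for line in lines[1:]:
        if line == "end":
            return filename, mode, bytes(out)
        if line:  # skip blank lines rather than passing them to a2b_uu
            out += binascii.a2b_uu(line.encode("ascii"))
    raise ValueError("missing end line")


payload = bytes(range(100))
text = uu_encode_file(payload, "demo.bin", 0o755)
name, mode, decoded = uu_decode_file(text)
print(text.splitlines()[0])        # begin 755 demo.bin
print(name, oct(mode), decoded == payload)
```

A full 45-byte line starts with "M" (45 + 32 = 77) and is 61 characters long, which matches the line length described above.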
Where You Might Still Encounter UUencoded Data
UUencoded data survives in several niches. Usenet newsgroup archives from the 1990s and early 2000s contain millions of UUencoded binary posts, particularly in the alt.binaries.* hierarchy where users shared files by posting UUencoded segments across multiple messages. Decoding these archives requires handling multi-part UUencoded files that were split across sequential newsgroup posts, each with its own begin/end markers. Some legacy industrial control systems and SCADA environments use UUencoding in communication protocols that were designed in the 1980s and never updated. Certain mainframe systems and legacy batch processing pipelines still use UUencoding for file transfers between systems that predate MIME support. Understanding UUencoding is also relevant for computer forensics and digital archaeology, where investigators encounter encoded data in old backups, archived communications, and legacy storage systems that used this format during its period of dominance.
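A minimal Python sketch of the multi-part case follows, under the assumption stated above that every post carries its own begin/end markers and that the posts are already in order. The helper name extract_uu_blocks is hypothetical, and real archives vary enough (missing markers, interleaved text, reordered parts) that this should be read as a starting point only.

```python
import binascii
import re

# "begin", an octal mode, and a filename, as in the header line of each part.
BEGIN_RE = re.compile(r"^begin ([0-7]{3,4}) (.+)$")


def extract_uu_blocks(text: str):
    """Yield (filename, payload) for each begin/end block found in text."""
    filename, payload = None, None
    for line in text.splitlines():
        if payload is None:
            match = BEGIN_RE.match(line)
            if match:
                filename, payload = match.group(2), bytearray()
        elif line == "end":
            yield filename, bytes(payload)
            filename, payload = None, None
        elif line:
            # Inside a block: every non-blank line is assumed to be UU data.
            payload += binascii.a2b_uu(line.encode("ascii"))


def make_post(subject: str, chunk: bytes) -> str:
    """Build a toy newsgroup post around one encoded chunk (test data only)."""
    body = binascii.b2a_uu(chunk, backtick=True).decode("ascii")
    return f"Subject: {subject}\n\nbegin 644 demo.bin\n{body}`\nend\n-- \nsignature\n"


archive = make_post("demo (1/2)", b"Hello, ") + make_post("demo (2/2)", b"world")
blocks = list(extract_uu_blocks(archive))
joined = b"".join(payload for _, payload in blocks)
print(len(blocks), joined)  # 2 b'Hello, world'
```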
Encoding Comparison: UUencode, Base64, and Alternatives
UUencode and Base64 both expand data by roughly a third (3 bytes become 4 characters, plus a small amount of per-line framing) and use similar algorithmic approaches. Quoted-Printable encoding is designed for text that is mostly ASCII with occasional special characters (like European language text with accented characters), encoding only the non-ASCII bytes and leaving printable characters unchanged, resulting in minimal size increase for mostly-text content. Ascii85 (also called Base85) encodes 4 bytes into 5 characters rather than 3 bytes into 4, achieving only a 25% size increase compared to Base64's 33%. Ascii85 is used in PDF and PostScript file formats. yEnc, developed specifically for Usenet binary posts, encodes most bytes as themselves (only escaping a few control characters), achieving a mere 1-2% overhead. Each encoding makes different tradeoffs between size efficiency, character set safety, and compatibility with different transport mechanisms. For modern applications, Base64 is the default choice unless a specific format or protocol requires something different.
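The overhead figures can be checked with Python's standard library, which covers UUencode, Base64, and Ascii85 (Quoted-Printable depends heavily on the input text, and yEnc has no standard-library module, so both are left out). This is a sketch with an arbitrary 4,500-byte test pattern; the helper name uu_body is hypothetical, and the UUencode size counts the length character and newline on every line but not the begin/end envelope.

```python
import base64
import binascii


def uu_body(data: bytes) -> bytes:
    """UUencode data as 45-byte lines, without the begin/end envelope."""
    return b"".join(
        binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)
    )


# Deterministic test pattern with no run of four zero bytes, so Ascii85's
# "z" shortcut for all-zero groups never applies.
data = bytes(i % 251 for i in range(4500))

sizes = {
    "uuencode": len(uu_body(data)),
    "base64": len(base64.b64encode(data)),
    "ascii85": len(base64.a85encode(data)),
}

for name, size in sizes.items():
    overhead = 100 * (size - len(data)) / len(data)
    print(f"{name:9s} {size:5d} bytes  +{overhead:.1f}%")
```

With per-line framing included, UUencode comes out a few points above the 33% of its raw 3-to-4 mapping, while Base64 without line breaks and Ascii85 land on 33.3% and 25%.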