Electronics Circuits & Tutorials - Electronics Hobby Projects - A Complete Electronic Resource Centre
Electronics Circuits & Tutorials

Home About us Electronic Tutorials Engineering Hobby Projects Online Dictionaries Contact us
Tutorials
  • Basic/Beginners
  • Intermediate/Advance
  • Microcontrollers
  • Microprocessors
  • Electronics Symbols
  • Electronics Formulas
  • Dictionary of Units

     more....

Dictionaries
  • Electronics Terms
  • Abbreviations
  • Computer Terms
  • Physics Glossary
  • Science Glossary
  • Space & Solar Terms
  • Semiconductor Symbols / Abbreviation
  • Radio Terminology Bibliography

     more....

Projects
  • Engineering Projects
Home > Electronics Tutorials > Online Computer Terms Dictionary > U

Online Computer Terms Dictionary - U

UTF-8

<character> (UCS transformation format 8) An ASCII-compatible multibyte Unicode and UCS encoding, used by Java and Plan 9.

The Unicode character set occupies a 16-bit code space. The most obvious Unicode encoding (known as UCS-2) consists of a sequence of 16-bit words. Such strings can contain bytes like '\0' or '/' which have a special meaning in filenames and other C library function parameters. In addition, the majority of Unix tools expects ASCII files and can't read 16-bit words as characters without major modifications. For these reasons, UCS-2 is not a suitable external encoding of Unicode in filenames, text files, environment variables, etc.

The ISO 10646 Universal Character Set (UCS), a superset of Unicode, occupies a 31-bit code space and the obvious UCS-4 encoding for it (a sequence of 32-bit words) has the same problems.

The UTF-8 encoding of Unicode and UCS avoids the problems of fixed-length Unicode encodings because an ASCII file encoded in UTF is exactly same as the original ASCII file and all non-ASCII characters are guaranteed to have the most significant bit set (bit 0x80). This means that normal tools for text searching etc. work as expected.

UTF-8 is defined in RFC 2279.

["File System Safe UCS Transformation Format (FSS_UTF)", X/Open Preliminary Specification, X/Open Company Ltd., Document Number: P316. This information also appears in ISO/IEC 10646, Annex P].

Plan 9 UTF manual entry.

(1998-07-29)

 


Nearby terms: USSA « UTC « UTF « UTF-8 » utility-coder » UTOPIST » UTP
 

Discover
  • C/C++ Language Programming Library
  • Electronic Conversions
  • History of Electronics
  • History of Computers
  • Elec. Power Standards
  • Online Calculator and Conversions
  • Electrical Hazards - Health & Safety
  • Datasheets
  • Quick Reference links
  • Electronics Magazines
  • Career in Electronics
  • EMS Post Tracking

     more......

Home Electronic Tutorials Engineering Hobby Projects Resources Links Sitemap Disclaimer/T&C

Copyright © 1999-2020 www.hobbyprojects.com  (All rights reserved)