Wenguang's Introduction to Universal Disk Format (UDF)

Outline

1. What is UDF?
2. Why UDF?
3. History of UDF Revisions
4. Structure of the UDF Standard
5. UDF Specification Tutorial
5.1 Highlight of the UDF Format
5.2 UDF Volume Structure and Mount Procedure
5.3 UDF Partition Structure
5.4 UDF File and Directory Structure
5.5 Some UDF Terminologies
5.6 Tips and Notes That Are Not in the UDF Standards

1. What is UDF?

Universal Disk Format (UDF) is a file system specification defined by OSTA. One objective of UDF is to replace the ISO9660 file system on optical media (CDs, DVDs, etc.). It is also a good file system to replace FAT on removable media.

2. Why UDF?

Any removable media (CD, DVD, flash drive, external hard drive, etc.) needs a file system format. Ideally, this format should have these characteristics:

3. History of UDF Revisions

UDF is an evolving standard. The major features of each revision are summarized in the following table.

4. Structure of the UDF Standard

The UDF Standard contains two sets of specifications, ECMA-167 and UDF.
Because UDF consists of ECMA-167 and UDF, you need to have both standards in hand and read them side by side. To be clear, both standards are written in the style of a reference book. Learning from a reference book is not fun. It is like learning a language by reading its dictionary from A to Z and piecing the grammar together from the fragments in the dictionary. Reading two standards is twice as bad: you need to learn two new languages, A and B, from two dictionaries, while dictionary B is written in language A. Another side effect of reading a standard is that it makes people fall asleep fairly quickly :-)

The learning process should be iterative. Start by reading ECMA-167 to get a feel for it, read the corresponding sections of UDF to get a better feel, and then go back. At some point, grab a UDF disc, dump its structures, and read what is on it to verify your understanding. You don't need to finish reading ECMA-167 and fully understand it before reading UDF, because many details in ECMA-167 are not used in UDF.

The UDF tutorial in the following section explains what UDF looks like. I don't assume you have knowledge of other file systems such as the Unix file system, but if you do, UDF will be easier to understand.

5. UDF Specification Tutorial

5.1 Highlight of the UDF Format

Compared with the Unix file system (UFS/FFS/ext2), UDF's main structures are highlighted below.

5.2 UDF Volume Structure and Mount Procedure

The volume structure is transparent to the file and directory structure. It provides a framework so that different formats may co-exist on the same media. This part of the standard is the most abstract and dry to read. It defines many terms that a UDF file system implementer rarely needs to care about in file system operations.

This part is most interesting for writers of the volume mount module (which identifies a UDF volume, e.g., the command mount_udf) and the media formatting module (which lets other systems identify the volume as UDF, e.g., the command newfs_udf).

To explain the volume structure, we step through the mount procedure to see how a UDF volume is recognized. A mount always happens before the UDF media can be used by the host. It usually happens automatically when removable UDF media is attached to the system, or it can be triggered manually by a command (say, the mount command on Unix-type systems).

The mount procedure can be separated into two parts: volume recognition and file system verification. Volume recognition is the first step, which makes sure this is a UDF volume. It only tells that this is UDF media; it does not tell where the file system metadata is. A quick format utility can simply erase the UDF volume by erasing the recognition sequence of the volume.

The volume recognition procedure looks for the Volume Recognition Sequence (VRS) from a base address (UDF's term is Volume Recognition Space). For most media, the base address is the start of the media. For multi-session optical media (CD-R, DVD-R, DVD+R, BD-R, etc.), the base address is the start of the last data session. The VRS consists of the following three contiguous sectors, which are stored after the first 32KB from the base address:
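
As a rough illustration of the recognition step, the sketch below scans the sectors that follow the first 32KB for a VRS. It is a minimal sketch, not a complete mounter. It assumes 2KB sectors, and the identifier strings (BEA01, NSR02 or NSR03, TEA01, plus CD001, BOOT2 and CDW02 for other formats sharing the space) are taken from ECMA-167 rather than from the text above. The names recognize_udf and make_vsd are hypothetical.

```python
import io

SECTOR_SIZE = 2048        # assumption: 2KB sectors, as on most optical media
VRS_OFFSET = 32 * 1024    # the VRS is stored after the first 32KB

# Identifiers of other formats that may legally appear in the same space.
OTHER_FORMATS = (b"CD001", b"BOOT2", b"CDW02")

def recognize_udf(stream, base=0, max_descriptors=64):
    """Return True if a BEA01 / NSR02|NSR03 / TEA01 sequence is found."""
    seen_bea = seen_nsr = False
    for i in range(max_descriptors):
        stream.seek(base + VRS_OFFSET + i * SECTOR_SIZE)
        sector = stream.read(SECTOR_SIZE)
        if len(sector) < 7:
            return False              # ran off the end of the media
        ident = sector[1:6]           # byte 0 is the type, bytes 1-5 the identifier
        if ident == b"BEA01":
            seen_bea = True
        elif ident in (b"NSR02", b"NSR03") and seen_bea:
            seen_nsr = True
        elif ident == b"TEA01":
            return seen_bea and seen_nsr
        elif ident not in OTHER_FORMATS:
            return False              # not a volume structure descriptor
    return False

def make_vsd(ident):
    """Build a fake descriptor sector for the demo (type 0, version 1)."""
    return bytes([0]) + ident + bytes([1]) + bytes(SECTOR_SIZE - 7)

# Demo: a synthetic image with 32KB of zeros followed by a VRS.
image = io.BytesIO(bytes(VRS_OFFSET) + make_vsd(b"BEA01")
                   + make_vsd(b"NSR02") + make_vsd(b"TEA01"))
print(recognize_udf(image))
```

For multi-session media, the base argument would be the byte offset of the last data session instead of 0.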

After volume recognition, the mounter must find the UDF metadata to make sure this UDF volume is valid and that its revision can be handled by the system. UDF metadata structures are called Descriptors in the standard. The start addresses of all descriptors are sector-aligned. Most descriptors are smaller than a sector (the bitmap and the sparing table are two exceptions). Some descriptors contain pointers (i.e., addresses) to other descriptors, so the descriptors are chained together in a certain order. A mounter may perform the following steps to make sure the UDF media is mountable:

5.3 UDF Partition Structure

UDF defines five different types of partitions. A partition provides a uniform interface to the file system layer while hiding the different underlying physical properties. Each partition has a partition reference number, which is the zero-based index of its entry in the Partition Map of the Logical Volume Descriptor (LVD). Blocks in a partition are addressed by a block number ranging from 0 to N-1, where N is the size of the partition. The size of a partition may not be fixed. It may increase (for Virtual Partition, Metadata Partition, and Pseudo-Overwrite Partition) or decrease (for Metadata Partition).

5.3.1 Type 1 Partition

This is the simplest partition. A type 1 partition has a start address S and size N. A logical block number A in the partition is converted to the media physical address (in UDF's terms, the logical sector address) S+A. On certain optical media, the start and size of the partition must be aligned to the packet size (such as 32KB). These special requirements are defined in the appendixes of the UDF standard.

Free space in the partition is managed by the Unallocated Space Bitmap Descriptor. It contains one bit for each block of the partition. If the bit is set (1), the corresponding block is free. If it is clear (0), the corresponding block is allocated. This is the opposite of how FFS/UFS uses its bitmap, because the bitmap in UDF is an Unallocated Space Bitmap.

5.3.2 Sparable Partition

Sparable partitions are used on overwritable media that fail after a certain number of overwrites (several thousand), such as CD-RW. In a file system, the places that are overwritten frequently are often important metadata areas, e.g., bitmaps. A sparable partition allows a failed area to be remapped to another good part of the media, so the failed area appears good to the upper level.

A sparable partition is similar to a type 1 partition in the sense that it has a start address and size.
In addition, it defines 2 to 4 sparing tables which point to reserved spare areas on the media. Each sparing table points to the same reserved spare area. If one sparing table fails, another sparing table can be used instead.

The unit of overwrite on such media is the packet. For example, the packet size for CD-RW is 32 2K-sectors. If one sector in a packet fails, the whole packet fails. When this happens, the content of the packet is written to a spare area, and its new address is written to the sparing table.

When translating a logical address in the sparable partition to a physical address, the sparing table is always consulted. If the logical address is not found in the sparing table, the address translation is the same as for a type 1 partition. Otherwise, the new address in the spare area recorded in the sparing table is returned. Thus, the sparing table acts as an exception table in the address translation. This mechanism guarantees that a logical address does not change when its original packet fails.

We use an example to explain how the sparing table and address translation work. To make it more intuitive, we assume the packet size is 10 sectors, although on real optical media the packet size is always a power of two. Assume the partition starts at physical address 1000 and has 8000 sectors. We have two spare areas starting at 500 and 9000, respectively. The size of each spare area is 50 sectors, so each spare area holds 5 packets. Since each sparing table has the same content, we only show the content of the first sparing table. Before the media has any defects, the sparing table looks like below:

Original location (logical)    Mapped location (physical)
0xFFFFFFFF                     500
0xFFFFFFFF                     510
0xFFFFFFFF                     520
0xFFFFFFFF                     530
0xFFFFFFFF                     540
0xFFFFFFFF                     9000
0xFFFFFFFF                     9010
0xFFFFFFFF                     9020
0xFFFFFFFF                     9030
0xFFFFFFFF                     9040
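
UDF uses 0xFFFFFFFF to indicate that a spare packet is available. Since there is no defect, address translation is the same as for a type 1 partition, so logical address 67 is translated to physical address 1067. Assume that after some use, the system finds that the packet containing block 93 fails when writing to it. It then writes this packet to the spare packet at physical address 500 and updates the sparing table:

Original location (logical)    Mapped location (physical)
90                             500
0xFFFFFFFF                     510
0xFFFFFFFF                     520
0xFFFFFFFF                     530
0xFFFFFFFF                     540
0xFFFFFFFF                     9000
0xFFFFFFFF                     9010
0xFFFFFFFF                     9020
0xFFFFFFFF                     9030
0xFFFFFFFF                     9040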
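
The lookup and the sparing step in this example can be sketched as follows. This is only an illustration using the numbers of the example (packet size 10, partition start 1000, spare areas at 500 and 9000). The function names to_physical and spare_packet are hypothetical, and real implementations also have to keep all copies of the sparing table in sync on the media.

```python
PACKET_SIZE = 10          # sectors per packet (illustrative, not a power of two)
PARTITION_START = 1000    # physical address of logical block 0
UNUSED = 0xFFFFFFFF       # marks a spare packet that is still available

# Each entry is [original packet start (logical), spare packet start (physical)].
sparing_table = [[UNUSED, addr] for addr in (500, 510, 520, 530, 540,
                                             9000, 9010, 9020, 9030, 9040)]

def to_physical(lbn):
    """Translate a logical block number, using the table as an exception list."""
    packet_start = lbn - lbn % PACKET_SIZE
    for original, mapped in sparing_table:
        if original == packet_start:
            return mapped + lbn % PACKET_SIZE   # packet was remapped
    return PARTITION_START + lbn                # same as a type 1 partition

def spare_packet(lbn):
    """Remap the packet containing lbn to the first available spare packet."""
    packet_start = lbn - lbn % PACKET_SIZE
    for entry in sparing_table:
        if entry[0] == UNUSED:
            entry[0] = packet_start
            return entry[1]
    raise RuntimeError("no spare packets left")

print(to_physical(67), to_physical(97))   # before the defect
spare_packet(93)                          # the packet holding block 93 fails
print(to_physical(67), to_physical(97))   # after sparing packet 90-99
```

The first print shows 1067 and 1097; after the packet is spared, the second shows 1067 and 507.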

Now the logical address translation is the same as before except for logical addresses 90-99. For example, logical address 67 still has physical address 1067, but logical address 97 now has physical address 507.

As the example shows, the unit of sparing is a packet. The sparing table records the address of the first block of the packet. In this example, the spare area is outside the partition. It can also be inside the partition, in which case the partition must mark the space occupied by the spare area as unavailable for regular space allocation. For sparable partitions, the partition must start on a packet boundary, and its size must be an integral multiple of the packet size.

5.3.3 Virtual Partition

The virtual partition is used on write-once media. Only three types of metadata are stored in the virtual partition: File Set Descriptor, File Entry (including Extended File Entry), and Allocation Extent Descriptor. If file data is embedded in the file entry, that data is also stored in the virtual partition. The virtual partition makes write-once media appear as overwritable media.

The virtual partition is layered on top of a type 1 partition. A Virtual Allocation Table (VAT) is used to map logical addresses of the virtual partition to logical addresses in the underlying type 1 partition.

We use a simple example to explain how it works. Assume the type 1 partition starts at physical block address 100. The File Set Descriptor has virtual address 0 and resides on physical block 100 (i.e., logical address 0), and the File Entry (FE) of the root directory has virtual address 1 and resides on physical block 101 (i.e., logical address 1). The VAT is an array of integers. The logical address of virtual address x is VAT[x]. Currently the VAT has two entries:

VAT[0] = 0
VAT[1] = 1

The above table shows that each virtual address is currently the same as its logical address. Assume the last written address on the media is 101. If an empty file foo is added to the root directory, the root directory FE is updated and written to physical address 102 (i.e., logical address 2), and the FE of file foo is written to physical address 103 (i.e., logical address 3). Thus the VAT becomes:

VAT[0] = 0
VAT[1] = 2
VAT[2] = 3
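
The bookkeeping behind these two VAT states can be sketched in a few lines of Python. This is a toy model under the assumptions of the example (type 1 partition starting at physical block 100, blocks always appended at the next unwritten address). The class WriteOnceVolume and its methods are hypothetical names and do not write any real UDF structures.

```python
PARTITION_START = 100   # physical address of logical block 0 in the type 1 partition

class WriteOnceVolume:
    def __init__(self):
        self.vat = []        # vat[virtual address] -> logical address
        self.next_lbn = 0    # next unwritten logical block

    def _append_block(self):
        """Simulate writing one block at the end of the written area."""
        lbn = self.next_lbn
        self.next_lbn += 1
        return lbn

    def create(self):
        """Write a new metadata block and return its new virtual address."""
        self.vat.append(self._append_block())
        return len(self.vat) - 1

    def rewrite(self, vaddr):
        """Write an updated copy of a block; its virtual address is unchanged."""
        self.vat[vaddr] = self._append_block()

    def physical(self, vaddr):
        return PARTITION_START + self.vat[vaddr]

vol = WriteOnceVolume()
fsd = vol.create()       # File Set Descriptor, virtual address 0
root = vol.create()      # root directory FE, virtual address 1
print(vol.vat)           # initial VAT

vol.rewrite(root)        # adding foo updates the root directory FE ...
foo = vol.create()       # ... and writes a new FE for foo
print(vol.vat, vol.physical(root), vol.physical(foo))
```

The second print shows the VAT [0, 2, 3] with the root directory FE at physical address 102 and the FE of foo at 103. Pointers stored in other structures keep using the virtual addresses, which is why they never need to be rewritten.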

The root directory FE still has virtual address 1, but now its logical address is 2. The virtual address of the FE of file foo is 2, and it is mapped to logical address 3, i.e., physical address 103. When the root directory is updated again, the VAT is changed again.

The VAT is stored as a special file in the type 1 partition. No other UDF data structure points to the VAT FE (called the VAT ICB by UDF), and the latest VAT FE is always the last sector written on the media. When ejecting media with a virtual partition, the VAT and VAT FE are written after flushing all data. The UDF reader first finds the last written sector, and from it gets the VAT for address translation.

5.3.4 Metadata partition

The metadata partition is used to cluster the metadata of the media together to get better performance. Metadata includes File Entries, allocation descriptors, and directories, but does not include named streams or extended attributes.

The metadata partition lies on top of an underlying partition, which can be a type 1 partition, a sparable partition, or a pseudo-overwrite partition. The metadata partition consists of 3 files: the Metadata File, the Metadata Mirror File, and the Metadata Bitmap File. The Metadata File and Metadata Mirror File have duplicated metadata: File Entries and Allocation Extent Descriptors. They may optionally have duplicated data, i.e., each piece of metadata has two copies on the media. To simplify the following discussion, we assume that the Metadata Mirror File does not duplicate the Metadata File content.

All data in the metadata partition is stored in the Metadata File. The logical block number in the metadata partition is the file offset in the Metadata File. Since some space in the Metadata File may be unused, the Metadata Bitmap File is used to keep track of the free space in the Metadata File. The metadata for the Metadata File, Metadata Mirror File, and Metadata Bitmap File is stored on the underlying type 1 (or sparable or pseudo-overwrite) partition.
These are the only metadata that are not stored in the metadata partition. The data of the Metadata File and Metadata Mirror File must be aligned to the media ECC block size or the packet size, whichever is bigger, and its size must be a multiple of the media ECC block size or the packet size, whichever is bigger.

We use an example to explain how the metadata partition works. We assume the ECC block size and the packet size are both 10, although UDF requires it to be larger than 32, and the size on real media is always a power of two. Assume the underlying partition is a type 1 partition that starts at physical address 1000 and has 8000 sectors. The content of the Metadata File (i.e., the metadata partition) has two extents: the first starts at logical address 100 and has 300 sectors, and the second starts at logical address 2000 and has 500 sectors. Therefore, the size of the metadata partition is 800 sectors. Logical address 5 in the metadata partition means block offset 5 in the Metadata File, which is translated to logical address 105 in the type 1 partition, or physical address 1105 on the physical media. We put more examples of address translation in the following table.
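
The translation in this example amounts to walking the extents of the Metadata File. The following is a minimal sketch using the numbers of the example. The function names are hypothetical, and it assumes the underlying partition is type 1, so the last step is a simple addition.

```python
TYPE1_START = 1000                                  # physical start of the type 1 partition
METADATA_FILE_EXTENTS = [(100, 300), (2000, 500)]   # (start LBN in type 1 partition, length)

def metadata_to_type1(block):
    """Map a block number in the metadata partition to an LBN in the type 1 partition."""
    if block < 0:
        raise ValueError("negative block number")
    offset = block
    for start, length in METADATA_FILE_EXTENTS:
        if offset < length:
            return start + offset     # the block falls inside this extent
        offset -= length              # skip this extent and keep looking
    raise ValueError("block is beyond the end of the Metadata File")

def metadata_to_physical(block):
    """Map a block number in the metadata partition to a physical address."""
    return TYPE1_START + metadata_to_type1(block)

for block in (5, 299, 300, 799):
    print(block, metadata_to_type1(block), metadata_to_physical(block))
```

Block 5 maps to logical address 105 and physical address 1105, as in the text. Block 300 is the first block of the second extent, so it maps to logical address 2000 and physical address 3000.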

The content of the Metadata Bitmap File is an Unallocated Space Bitmap Descriptor. Similar to the bitmap in a type 1 or sparable partition, the bitmap has one bit for each block in the partition.

5.3.5 Pseudo-overwrite partition

The pseudo-overwrite (POW) partition is used for next-generation write-once media (e.g., Blu-ray Disc recordable, or BD-R) on next-generation intelligent drives. These drives manage the address translation within the drive (the job the virtual partition did before) to make the partition appear overwritable although the physical media is write-once. When a POW partition is used, the metadata partition shall also be used for metadata, in the hope that clustered metadata achieves better performance. However, on write-once media, even when data is logically clustered in one partition, it may be physically far apart on the media. Because a longer physical distance often implies poorer performance, it is questionable whether the metadata partition can improve performance here.

On media that supports a POW partition, the media can be separated into several tracks. Each track has a Next Writable Address (NWA). A new block can be written to the NWA of any track, and an existing block can be overwritten. The NWA of any track can change at any time, so the NWA must be queried before any new block is written.

5.3.6 Partition Descriptor and Partition Map

There are two ways to address a block on the media: the physical address (Logical Sector Number, or LSN) and the logical address (Logical Block Number, or LBN). The physical address is used to address metadata outside of partitions (such as the Logical Volume Descriptor). The logical address is used to address any block within a partition. Since there can be more than one partition in a UDF volume, a Partition Reference number (PartRef) is needed in addition to the LBN to address a block.
We first explain how partitions are described before explaining how PartRef is determined.

Partitions on a UDF volume are described by one or more Partition Descriptors (PD) and a partition map with one or more entries. The partition map is stored in the Logical Volume Descriptor. A PD describes the physical properties of a partition. The most relevant information in a PD is its partition number, partition start location, and length. It also tells whether the partition is read-only, write-once, rewritable, or overwritable. The following table illustrates the basic information of two PDs. To reduce confusion, the partition numbers are intentionally chosen so that they do not overlap with PartRef, although in real UDF volumes partition numbers often start from 0.

A partition map has a number of entries describing the logical properties of the partitions. Each partition map entry has a partition number indicating which PD the entry refers to. There are two types of partition map entries: type 1 and type 2. A type 1 partition is simply the partition described by the PD with the corresponding partition number. A type 2 partition can be a sparable partition, a virtual partition, a metadata partition, or a pseudo-overwrite partition. The following table gives a possible partition map defined for the above two PDs. The 0-based index of each map entry is called the Partition Reference Number. The UDF file system can write the partition map entries in any order, which changes PartRef accordingly.

This partition map indicates that there are three partitions. The first partition (whose PartRef is 0) is a sparable partition backed by the overwritable partition described by the second PD. The second partition (whose PartRef is 1) is a type 1 read-only partition backed by the read-only partition described by the first PD. The third partition is the metadata partition residing in the sparable partition whose PartRef is 0, because both partition map entries have the same partition number 4.

In this multi-partition scenario, each logical block is identified by the PartRef and the logical block number. For example, the 10th block in the metadata partition is identified by (PartRef=2, LBN=9), and the first block in the read-only partition is identified by (PartRef=1, LBN=0).

5.4 UDF File and Directory Structure

5.4.1 File Entry and Extended File Entry

No matter how the underlying partition structures are defined, the file and directory structures of all UDF volumes are the same. The main metadata structure describing files and directories is called the Information Control Block (ICB). Its size is at most one block, and its data structure is either a File Entry (FE) or an Extended File Entry (EFE). Besides FE/EFE, the Allocation Extent Descriptor (AED) is used to represent very fragmented files.

EFE was introduced in UDF 2.00 to represent files with named streams. EFE is very similar to FE, with a few additional fields. The word "File" in FE/EFE is broader than the regular file in a conventional file system. It represents a stream of bytes with some attributes. The file offset in the stream of bytes starts from 0 and runs to the end of the file. A FE/EFE may represent a file, a directory, a logical space holding extended attributes, a stream directory, a named stream, a symbolic link, a special device node, or even the whole metadata partition and the metadata bitmap. We still use the word "file" to refer to the stream of bytes.

5.4.2 Extent-based Space Allocation

FE/EFE uses extent-based space allocation to indicate which blocks belong to the file. There are four formats of extents; the format in use is indicated by the lowest 3 bits of the flags field in the ICB tag of the FE/EFE. Only three of them are used in UDF. The Extended Allocation Descriptors are not used in UDF. The three formats are:

Each extent has one of three types, indicated by the highest 2 bits of the extent length. The normal type is recorded-and-allocated. The type not-recorded-not-allocated is used to represent holes in sparse files. The type not-recorded-allocated is used to represent pre-allocated space that has not been initialized yet. The size of each extent must be an integral multiple of the block size, except for the last extent of the file.

5.4.3 Directory Structure

A directory is like a file, but its file type in the ICB marks it as a directory. The directory entries are variable-size entries stored linearly in the file. Each entry is described by a File Identification Descriptor (FID). The first FID must have an empty name, and its file characteristics must have the parent bit set. This is equivalent to the ".." entry in an FFS/UFS directory. But unlike FFS, UDF does not have a "." entry to represent the directory itself. The FID has the file name (called the File Identifier in UDF), the address of the FE/EFE of the file, and an optional variable-size space for implementation use. When a file is deleted from a directory, the file characteristics of its FID are marked as deleted. The space left by deleted FIDs can be freely reused by new entries if applicable.

5.4.4 Free Space Management (Space Bitmap)

UDF uses a space bitmap to manage free space, similar to many file systems. The bitmap is described by the Space Bitmap Descriptor (SBD), one of the few descriptors that can be larger than a block. The SBD has a UDF tag and two length fields indicating the number of bits and the number of bytes of the bitmap, followed by the free space bitmap. The bitmap must be stored contiguously on the volume. Since the bitmap is a free space bitmap, a bit of 1 means the block is available for allocation, and a bit of 0 means the block has been allocated. This is different from the bitmaps used in regular file systems. The file content of the Metadata Bitmap File is also an SBD.
Its allocation on the underlying type 1 or sparable partition may be fragmented.

5.4.5 Extended Attributes

Extended Attributes (EAs) are used to store additional file attributes, such as Finder Info and the resource fork on Mac, and ACLs on NTFS. EAs can be stored in two different places: the embedded EA space and the external EA space. The embedded EA space is the spare space in the file entry block after the fixed fields of a FE/EFE and before the allocation descriptors. It is fast to access, but only small EAs can be stored there. The external EA space is a special file entry (EA File Entry) that is pointed to by the main file or directory file entry. The EAs are stored in the logical space described by the EA File Entry.

Each EA has a header and a variable-size body. In the external EA space, all EAs are concatenated together. Each EA header in the external EA space must start at a block boundary. There are three types of EAs: standard EAs, implementation use EAs, and application use EAs. Examples of standard EAs are the file times EA (backup time, creation time, etc.) and the device specification EA for device nodes. Examples of implementation use EAs are Macintosh Finder Info and Resource Forks. Application use EAs are defined and used by applications. Every EA type has a special EA called the unallocated space EA, used to occupy unused space left by other EAs or for padding.

In both the embedded and the external EA space, EAs are always grouped by type. The standard EAs are stored first, followed by the implementation use EAs, if any, and then by the application use EAs, if any. Because all EAs are stored together in one logical space, if an EA in the middle of the external EA space grows, all EAs after it must be shifted. This makes space allocation in the external EA space complex. It is not a problem on read-only media, but it makes supporting EAs difficult on writable media.
Fortunately, named streams were introduced in UDF 2.x, and they do not have the problems of EAs. Most implementation use EAs defined in UDF 1.x are stored in named streams in UDF 2.x and later.

5.4.6 Named Streams

Named streams were introduced in UDF 2.x. The concept of a stream is similar to the concept of a fork on Macintosh and a stream in NTFS. Every file or directory stores its data in the main stream. An arbitrary number of named streams can be stored in a file or directory. Each stream has a name.

If a file or directory has named streams, the Extended File Entry (EFE) must be used for it. The EFE contains the address of the file entry of a stream directory. The content of a stream directory is the same as that of a normal directory. It starts with a parent entry without a name, followed by variable-size directory entries. Each directory entry in a stream directory points to a named stream file entry, which describes the logical space storing the named stream.

5.5 Some UDF Terminologies

UDF defines far more terms than most conventional hard-drive based file systems. One reason is that ECMA-167 attempts to define a framework for many possible file systems instead of a specific one. Another reason is that UDF defines not only the file system structure, but also the volume and partition structure. Some UDF (or ECMA-167) terms are very abstract for conventional file system implementers. Fortunately, many of them are not meaningful in UDF. Some terms that may cause confusion are briefly discussed below.

5.6 Tips and Notes That Are Not in the UDF Standards

5.6.1 Character Set: Precomposed or Decomposed

UDF stores its file names using the Unicode standard. This standard defines two ways (we are simplifying here, because there are more than two ways) of storing the same characters: precomposed form (NFC) and decomposed form (NFD). Macintosh uses the decomposed form while Windows uses the precomposed form. A recent Document Change Notice (DCN) passed by the UDF committee requires the precomposed form on UDF media.

5.6.2 Assumptions for UDF Reader

Philips provides an open source UDF Verifier program to verify whether a media follows the UDF standard. This verifier runs on both Windows and Mac OS X. To get the UDF Verifier, please follow these steps:

Sadly, not all vendors verify their mastering software with this tool before they ship their products. Because there are so many "bad" UDF media on the market, the engineers who implement UDF readers need to be able to read these media as well. It is therefore very important to test a UDF reader implementation against as many UDF media (mostly DVD movies and games) as possible.

Last modified: Feb 1, 2009