File Organization Structure and Access Methods
The organizational structure and access method of files determine the layout of files on the storage medium and how the operating system accesses and processes these file data.
File access method
Based on the order of accessing file information, file access methods can be divided into the following types:
- Sequential access:
- Read and write the information in the file in sequence.
- Typical device: tape, suitable for processing large batches of sequentially read data.
- Advantages: simple and low operating cost.
-
Disadvantages: can only be accessed sequentially, low efficiency, and does not support random access.
-
Random access (direct access):
- Information in the file can be read and written randomly in any order without sequential access.
- Typical device: disk, which can directly locate the physical address of the file according to the logical position for operation.
- Advantages: fast access speed and high flexibility.
-
Disadvantages: complex implementation, suitable for small files or files that need to be read frequently randomly.
-
Key access:
- Access files according to a given key (primary key or record name). - The logical location of the corresponding record must be retrieved first, and then converted into a physical address on the storage medium for access.
-
Commonly used for file operations in database systems.
-
Factors for selecting access methods:
- Nature of the file: It is divided into streaming files and record files.
- Characteristics of storage devices: For example, tapes are suitable for sequential access, and disks are suitable for random access.
Logical structure of files
The logical structure of a file describes the logical organization of the file, which is mainly divided into the following two categories:
- Streaming file (unstructured file):
- Logically, it is a continuous string of characters without internal structure.
-
Commonly seen in text files and binary files.
-
Record file (structured file):
- It consists of several logical records, each of which is an independent basic unit of information.
- Different logical records are distinguished by primary keys (such as ID).
- Includes the following types:
- Sequential file: Records are stored and accessed in sequence.
- Index file: Fast retrieval and access are achieved by creating indexes.
- Indexed sequential file: Combines the advantages of sequential files and indexed files.
- Direct file: Random access is achieved by directly accessing the physical address of the record.
- Hash file: Use hash functions to quickly locate records, suitable for fast search operations.
Physical structure of files
The physical structure of a file describes the actual organization of the file on the storage medium, which mainly includes the following:
- Sequential structure:
- Files are stored continuously on the storage medium, and logical records are stored in adjacent physical blocks in sequence.
- Advantages: High efficiency of sequential reading and writing.
-
Disadvantages: Easy to generate fragmentation and cannot support random access.
-
Link structure:
- Records in the file are linked together by pointers and do not need to be stored continuously on the storage medium.
- Advantages: Reduce fragmentation problems and support dynamic expansion.
-
Disadvantages: Low random access efficiency and need to traverse pointers one by one.
-
Index structure:
- Create an index table for the file and find the physical address of the file record through the index table.
- Advantages: fast retrieval speed, suitable for large files and complex search operations.
-
Disadvantages: additional storage space is required to store index tables.
-
Direct file:
- For record-based files, the corresponding physical address is directly located through the primary key.
- Advantages: fast access, suitable for frequent random access scenarios.
- Disadvantages: need to design efficient mapping functions to avoid address conflicts.
Grouping and decomposition of records
- Grouping of records:
- Multiple logical records are combined into one physical block for storage.
- Block factor: The number of logical records contained in a physical block.
-
Advantages: Reduce the number of I/O operations and improve data transmission efficiency.
-
Decomposition of records:
- The operation of separating a single logical record from a physical block.
- Commonly used to read and process physical blocks containing multiple logical records.