File Organisation

Access Methods
Access Options
Operations on Files
Organising Files

Each base relation in the database design becomes a physical file. The physical file is made up of a set of physical records which are in turn made up of a set of fields. The fields can be of fixed length or variable length, and the record can consist of fixed length fields, variable length fields or a mixture of both fixed and variable length fields.

The physical file can be then defined as "a named portion of secondary memory allocated for the purpose of storing physical records" (Mc Fadden, Hoffer, Prescott, 1999, p267). 

Access Methods top of page

Data can be put into files or retrieved from files through algorithms called access methods. These algorithms are determined by the operating system. An access method provides a group of operations that can be applied to a file. Generally several access methods can be applied to a file, however, choice is restricted to some extent by the file organisation. Some access methods can only be applied to certain file organisations (Elmasri and Navathe, 2000, p134).

There are two basic access methods

Relative finds the "nth" record address from the current address or the beginning of the file. Sequential access is a form of relative access where the absolute address of the "next" record is contained in a pointer in the current record.
Direct calculates the address of the beginning of the record. The address is calculated via a table lookup, indexing or a mathematical computation. The data must contain additional information such as a primary key, index or hashing function

Access Options top of page

The access method applied to a file operation affects the speed of retrieval and the speed of write operations. While the database designer may not be able to change the access methods offered, the designer may make decisions about files that will affect access methods applied.  The access method (or path) may be affected by:

deciding the type of file for each relation (or table)
selecting the attributes on which indexes should be defined
defining the primary Index
creating any number of additional indexes

Operations on Files top of page

Operations on files can be viewed as either a retrieval operation, where the record is selected from the file according to specific search conditions but not changed, or an update operation which  also involves a search and changes the file. The search may be for a specific record, a group of records or a specific location in the file. 

Retrieval operations include for example: find, read, find next, open, reset
Update operations include for example: delete, modify, insert

How these operations are carried out depends on the access method employed and the organisation of the file at the physical level. The operations listed are higher level operations and depend on the programming language being employed to manipulate the file.

Record at a time operations Open Prepares the file for reading and writing and sets the file pointer to the beginning of the file.
Reset Sets the file pointer to the beginning of the file.
Find (or Locate) Searches for the first record that satisfies a search condition and copies the record block into main memory buffer.
Read (or Get) Makes the record data available to the requesting program by copying the data into a variable.
Find Next Searches for the next record that satisfies the search conditions and copies the record block into the main memory buffer.
Delete How the delete operation is carried out depends on the file organisation. The record may not be deleted from disk immediately, however pointers to the beginning of that file would be broken so that search operations no longer find the record.
Modify Changes are made to the copy of the record. This copy will overwrite the copy held on disk at some point in the operation. 
Insert Writes the record to the buffer and then at some point in the operation writes the contents of the buffer to disk.
Close Writes an end-of-file marker to the end of the file and clears the buffers.
Scan which includes find, find next, and read operations as apparently one operation. 
Set at a time operations Find all Searches for all records which satisfy a particular condition.
Find ordered Searches for all records and copies them to the calling program in a specific order. 
Reorganise Indexed files require reorganisation periodically. 

Organising Files top of page

File organisation refers to the organisation of data into records, blocks and access structures (Elmasri and Navathe, 2000, p134). Records and blocks have been introduced (although briefly) in this section on physical design. Access structures link one element of a record with another and also one record with another. They include address pointers, primary keys, hashing functions, and indexes.

There are three basic file organisations. Several variations exist on each of these.

Sequential File Organisation

Records are organised sequentially according to primary key value in an ordered sequential file. Ordered sequential files have some advantages. It is easy to locate and read from the file in order of key value. A search is on key values is efficient and relatively fast if a binary search algorithm is used.  However inserting and deleting from ordered sequential files requires several manipulations and is not very efficient. 

Sequential files can also be unordered. These are called Heap files. Unordered files are very easy to insert new records. The new record is added to the end of the file. Locating a record however, requires a linear search which is not efficient particularly with large files.

Indexed File Organisation

In this type of file organisation an index table or file  is created. The index file contains key value(s) that can be matched with key values in one or more records. The index also contains the disk address of the record. To locate a record the index is accessed first using some search algorithm. When the key value is found a pointer in the index file contains the address of the matching record(s). New records can be easily added to or deleted  from the record file. The index file periodically needs to be reorganised to reflect this. Indexes are commonly maintained on secondary keys to facilitate searching for groups of records with common fields.
Hashed File Organisation

In hashed file organisation the address of each record is determined using a hashing algorithm. The algorithm converts the primary key into a value that can then be looked up in an index table which contains the record address. Hash algorithms employ a number of techniques prevent "collisions", where two records have the same hash number. 
Use the Library Reference section to find more information about File Organisation. The above descriptions are introductory only. The Library may be accessed from the top menu.

top of page