THE INDEXED FILE ORGANIZATION

In this file organization, the records of the file are stored one after another in the order they are added to the file.

In contrast to RELATIVE files, records of an INDEXED SEQUENTIAL file can be accessed by specifying an ALPHANUMERIC key (the KEY) for the READ statement.

It is the programmer's responsibility to take care of the record sizes in files. You must be careful when declaring record structures for files. Any mistake you make in record sizes will cause an error and abnormal termination.

Please note that the record structure must include a field allocated to contain the KEY.

INDEXED files can have RANDOM or SEQUENTIAL ACCESS MODEs. If the ACCESS MODE is declared to be RANDOM in the corresponding SELECT statement, the program can read the record with key value "ABC", then the record with key "A12", then the record with key "D2X", and so on. In short, one can access any record of an indexed file, in any order, provided that the KEY value is specified before the READ statement is executed.

If the ACCESS MODE is SEQUENTIAL, the records of the file will be accessed one after another in the order of their key values (much as SEQUENTIAL and LINE SEQUENTIAL files are read one record after another), and no specific KEY value needs to be given for the READ statements. Instead, the NEXT clause appears in the READ statements, meaning "Go get the record with the next consecutive key value."

        SELECT MYFILE ASSIGN TO DISK "C:\DATADIR\MYFILE.DAT"
                    ORGANIZATION IS INDEXED
                    ACCESS MODE IS RANDOM
                    RECORD KEY IS M-IDNO.


        SELECT MYFILE-2 ASSIGN TO DISK "C:\DATADIR\MYFILE2.TXT"
                    ORGANIZATION IS INDEXED
                    ACCESS MODE IS SEQUENTIAL
                    RECORD KEY IS M-IDNO.

In the FILE SECTION, you must provide a field for the KEY (M-IDNO in this example) in the FD block of the INDEXED file:

        FD   MYFILE.
        01   MYFILE-REC.
             02  M-IDNO          PIC XXXX.
             02  M-NAME          PIC X(16).
             02  M-SURNAME       PIC X(16).
             02  M-BIRTHDATE.
                 03 M-BD-YEAR    PIC 9999.
                 03 M-BD-MONTH   PIC 99.
                 03 M-BD-DAY     PIC 99.

Note : Since keys need not be numeric, you can use the X picture for key fields.

PROCEDURE DIVISION Considerations

Like all files, indexed files must be OPENed before they can be processed and CLOSEd when they are not needed anymore.

Indexed files can be opened for :

  1. OUTPUT
  2. INPUT
  3. I-O (Input-Output)

Opening an indexed file for OUTPUT means that the program will only issue WRITE statements to a NON-EXISTING and JUST CREATED file. Therefore, when you open a file for OUTPUT, COBOL assumes that the file does not exist and tries to create a new one. IF THE FILE EXISTS, ITS CONTENTS WILL BE CLEARED WHEN OPENED AND ASSUMED TO BE A BRAND NEW FILE INTO WHICH RECORDS WILL BE ADDED. You should be very careful when opening a file for OUTPUT. One small mistake and all your valuable records are lost forever.

Once you open an indexed file in OUTPUT mode, you are expected to write records in the INCREASING order of keys. That is, before writing the record with key "A98" you should first write all the records with key values alphanumerically less than "A98". Please note that indexed file keys are case sensitive; that is, the key value "a89" is NOT EQUAL TO "A89". In fact it is alphanumerically greater than "A89" because the ASCII value of "a" (97) is greater than the ASCII value of "A" (65).

Opening an indexed file for INPUT means that the program will only issue READ statements to an EXISTING file. Therefore, when you open a file for INPUT, COBOL assumes that the file exists and tries to access it. IF THE FILE DOES NOT EXIST, AN ERROR MESSAGE WILL BE ISSUED INDICATING THAT THE MENTIONED FILE COULD NOT BE FOUND.

If you have declared the ACCESS MODE to be RANDOM, before each READ statement, you are supposed to move a valid KEY value to the record field variable that is declared as the RECORD KEY.

For instance:

     OPEN INPUT MYFILE.
     ....     
     MOVE "X23"  TO M-IDNO.
     READ MYFILE.

Good programmers should take precautions in their programs to avoid error messages and subsequent abnormal terminations if an INVALID value for the record key is specified before the READ statement.

The INVALID KEY clause handles this in COBOL.

     MOVE "2300"  TO M-IDNO.
     READ MYFILE INVALID KEY PERFORM OUT-OF-RANGE.

The INVALID KEY condition is raised if there is no record in the file with a key value equal to the specified key value.

If you have declared the ACCESS MODE as SEQUENTIAL, you should use the NEXT clause in the READ statement. For example:

     READ MYFILE NEXT AT END PERFORM NO-MORE-RECORDS.
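
As an illustration, a sequential pass over the whole file could be sketched as follows. This is only a minimal sketch: the program name LISTALL, the EOF-FLAG variable and the READ-ONE paragraph are made up for the example, and the SELECT clause follows the form used in this document, which may need adjusting for your compiler.

        IDENTIFICATION DIVISION.
        PROGRAM-ID. LISTALL.
        ENVIRONMENT DIVISION.
        INPUT-OUTPUT SECTION.
        FILE-CONTROL.
            SELECT MYFILE ASSIGN TO DISK "MYFILE.DAT"
                   ORGANIZATION IS INDEXED
                   ACCESS MODE IS SEQUENTIAL
                   RECORD KEY IS M-IDNO.
        DATA DIVISION.
        FILE SECTION.
        FD  MYFILE.
        01  MYFILE-REC.
            02  M-IDNO          PIC XXXX.
            02  M-NAME          PIC X(16).
            02  M-SURNAME       PIC X(16).
            02  M-BIRTHDATE.
                03 M-BD-YEAR    PIC 9999.
                03 M-BD-MONTH   PIC 99.
                03 M-BD-DAY     PIC 99.
        WORKING-STORAGE SECTION.
      * EOF-FLAG IS A HYPOTHETICAL SWITCH SET WHEN NO RECORDS REMAIN
        01  EOF-FLAG            PIC X VALUE "N".
        PROCEDURE DIVISION.
        MAIN-PARA.
            OPEN INPUT MYFILE.
            PERFORM READ-ONE UNTIL EOF-FLAG = "Y".
            CLOSE MYFILE.
            STOP RUN.
        READ-ONE.
      * EACH READ NEXT BRINGS IN THE RECORD WITH THE NEXT KEY VALUE
            READ MYFILE NEXT AT END MOVE "Y" TO EOF-FLAG.
            IF EOF-FLAG = "N" DISPLAY M-IDNO " " M-SURNAME.

The flag is tested before the record fields are displayed because the contents of the record area should not be relied upon once the AT END condition has occurred.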

When you are finished with a file you must CLOSE it.

       CLOSE NEW-FILE.
       CLOSE OLD-FILE NEW-FILE.

You can close more than one file with a single CLOSE statement. When a COBOL program terminates, all files mentioned in the program are automatically closed; therefore you will not get an error message for those files you forget to close or do not close at all. Please note that relying on COBOL to close your files is not a proper programming style. Although not absolutely necessary, you should close your files when you are done with them.

REWINDING an indexed file with ACCESS MODE RANDOM is not meaningful. If the ACCESS MODE is SEQUENTIAL, you can CLOSE and then OPEN INPUT again to rewind a sequentially accessed indexed file.

Simple INPUT-OUTPUT Statements for INDEXED Files.

Once you open an indexed file, you can use READ or WRITE statements to read or add records to this file. You cannot use a READ statement for a file opened for OUTPUT and similarly you cannot use a WRITE statement for a file you have opened for INPUT.

A typical READ statement for RANDOM access mode looks like :

       MOVE "1X3" TO M-IDNO.
       READ MYFILE.

A good programmer must check whether the READ statement could find a record to read successfully. That is, the programmer must check whether there was a record with the indicated key value when the statement was executed. The situation can be checked with the following construct :

       READ MYFILE INVALID KEY PERFORM NO-RECORDS-FOUND.

The construct tells the COBOL compiler to execute the paragraph labeled NO-RECORDS-FOUND when the READ statement cannot find any record with the key value stored in the key field variable.

Once your program EXECUTES a successful READ statement, the information in the data record that was just brought into memory will be available in the corresponding variables mentioned in your record description declaration (the FD block).

When you want to create an indexed file, you have to open it for OUTPUT. The COBOL statement that is used to put records into a new file is the WRITE statement. The important points that you should be careful with are that,

  1. the file should be opened
  2. new values for the field variables of the FD record description must be moved to proper variables of the record
  3. the KEY VARIABLE must hold a valid value, and this value should be alphanumerically greater than the key value of the record written previously,
  4. the RECORD NAME is specified in the WRITE statement and NOT the FILE NAME.
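
The points above could be put together as in the following sketch, which creates MYFILE.DAT and writes two records in increasing key order. The program name MAKEFILE, the BAD-KEY paragraph and all the sample field values are invented for the illustration; the SELECT clause follows the form used in this document.

        IDENTIFICATION DIVISION.
        PROGRAM-ID. MAKEFILE.
        ENVIRONMENT DIVISION.
        INPUT-OUTPUT SECTION.
        FILE-CONTROL.
            SELECT MYFILE ASSIGN TO DISK "MYFILE.DAT"
                   ORGANIZATION IS INDEXED
                   ACCESS MODE IS SEQUENTIAL
                   RECORD KEY IS M-IDNO.
        DATA DIVISION.
        FILE SECTION.
        FD  MYFILE.
        01  MYFILE-REC.
            02  M-IDNO          PIC XXXX.
            02  M-NAME          PIC X(16).
            02  M-SURNAME       PIC X(16).
            02  M-BIRTHDATE.
                03 M-BD-YEAR    PIC 9999.
                03 M-BD-MONTH   PIC 99.
                03 M-BD-DAY     PIC 99.
        PROCEDURE DIVISION.
        MAIN-PARA.
      * OPEN OUTPUT CREATES THE FILE AND CLEARS AN EXISTING ONE
            OPEN OUTPUT MYFILE.
      * SAMPLE VALUES BELOW ARE MADE UP FOR THE EXAMPLE
            MOVE "A001"   TO M-IDNO.
            MOVE "AYSE"   TO M-NAME.
            MOVE "AYFER"  TO M-SURNAME.
            MOVE 1970     TO M-BD-YEAR.
            MOVE 5        TO M-BD-MONTH.
            MOVE 17       TO M-BD-DAY.
      * WRITE NAMES THE RECORD, NOT THE FILE
            WRITE MYFILE-REC INVALID KEY PERFORM BAD-KEY.
      * THE SECOND KEY IS GREATER THAN THE FIRST ONE
            MOVE "A002"   TO M-IDNO.
            MOVE "MEHMET" TO M-NAME.
            MOVE "DEMIR"  TO M-SURNAME.
            MOVE 1982     TO M-BD-YEAR.
            MOVE 3        TO M-BD-MONTH.
            MOVE 9        TO M-BD-DAY.
            WRITE MYFILE-REC INVALID KEY PERFORM BAD-KEY.
            CLOSE MYFILE.
            STOP RUN.
        BAD-KEY.
            DISPLAY "KEY OUT OF ORDER OR DUPLICATE: " M-IDNO.

All fields are moved again before the second WRITE, since it is safer not to assume that the previous record is still in the record area after it has been written.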

UPDATING RECORDS IN AN INDEXED FILE

Sometimes you need to make some small changes in the contents of a file. Of course it is possible to create a brand new file with the new, modified contents, but this is not practical. COBOL provides you with an OPEN I-O mode with which you can modify only the required record in a file.

In other words, there is another file opening mode, the I-O mode, and a special REWRITE statement. Suppose that you want to change the surname field of a record in an indexed file (originally "AYFER") into "AYFEROGLU".

Such a program could look like this:

     SELECT MYFILE ASSIGN TO DISK "MYFILE.DAT" 
                   ORGANIZATION IS INDEXED
                   ACCESS MODE IS RANDOM
                   RECORD KEY IS M-IDNO.
     ....
     FD   MYFILE.
     01   MYFILE-REC.
          02  M-IDNO      PIC XXXX.
          02  M-NAME      PIC X(16).
          02  M-SURNAME   PIC X(16).
          02  M-BIRTHDATE.
              03 M-BD-YEAR  PIC 9999.
              03 M-BD-MONTH PIC 99.
              03 M-BD-DAY   PIC 99.
     ...
     PROCEDURE DIVISION.
         ....
         OPEN I-O MYFILE.
         ....
         MOVE "X20" TO M-IDNO.
         READ MYFILE INVALID KEY PERFORM KEY-ERROR.
    *
    * TO MAKE SURE THAT WE ARE ON THE CORRECT RECORD
    *
         IF M-SURNAME NOT = "AYFER" DISPLAY "WRONG RECORD"
                                    DISPLAY "RECORD NOT UPDATED"
                                    CLOSE MYFILE
                                    STOP RUN.
         MOVE "AYFEROGLU" TO M-SURNAME.
         REWRITE MYFILE-REC.
         ....
         CLOSE MYFILE.
         .....

Important : The key field of a record in an indexed file CANNOT be updated. If the programmer has to update the key field, the only solution is to delete the record with the old key value and add the same record to the file, this time with the new key value.
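
One possible way to carry out such a key change is sketched below. The program name NEWKEY, the CLOSE-UP paragraph and the key values "X20" and "X21" are assumptions made for the example. Note that the sketch writes the record under the new key first and deletes the old one afterwards, so that the record is not lost if the new key turns out to be in use already.

        IDENTIFICATION DIVISION.
        PROGRAM-ID. NEWKEY.
        ENVIRONMENT DIVISION.
        INPUT-OUTPUT SECTION.
        FILE-CONTROL.
            SELECT MYFILE ASSIGN TO DISK "MYFILE.DAT"
                   ORGANIZATION IS INDEXED
                   ACCESS MODE IS RANDOM
                   RECORD KEY IS M-IDNO.
        DATA DIVISION.
        FILE SECTION.
        FD  MYFILE.
        01  MYFILE-REC.
            02  M-IDNO          PIC XXXX.
            02  M-NAME          PIC X(16).
            02  M-SURNAME       PIC X(16).
            02  M-BIRTHDATE.
                03 M-BD-YEAR    PIC 9999.
                03 M-BD-MONTH   PIC 99.
                03 M-BD-DAY     PIC 99.
        PROCEDURE DIVISION.
        MAIN-PARA.
            OPEN I-O MYFILE.
      * BRING THE RECORD WITH THE OLD KEY INTO THE RECORD AREA
            MOVE "X20" TO M-IDNO.
            READ MYFILE INVALID KEY
                DISPLAY "OLD KEY NOT FOUND"
                GO TO CLOSE-UP.
      * ADD THE SAME RECORD UNDER THE NEW KEY
            MOVE "X21" TO M-IDNO.
            WRITE MYFILE-REC INVALID KEY
                DISPLAY "NEW KEY ALREADY IN USE"
                GO TO CLOSE-UP.
      * REMOVE THE RECORD THAT STILL CARRIES THE OLD KEY
            MOVE "X20" TO M-IDNO.
            DELETE MYFILE RECORD INVALID KEY
                DISPLAY "OLD RECORD COULD NOT BE DELETED".
        CLOSE-UP.
            CLOSE MYFILE.
            STOP RUN.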


ADDING RECORDS TO AN INDEXED FILE

When you need to add records to an indexed file, you should open the file in I-O mode and just write new records.

Note : Keys in an indexed file must be unique. That is to say, no two records can have the same key. There is, however, a technique to have duplicate keys in an indexed file, but this will be covered later.
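
A minimal sketch of adding one record to an existing file is given below. The program name ADDREC and the sample field values are invented for the example. The INVALID KEY clause is used here to catch an attempt to add a record whose key is already present in the file.

        IDENTIFICATION DIVISION.
        PROGRAM-ID. ADDREC.
        ENVIRONMENT DIVISION.
        INPUT-OUTPUT SECTION.
        FILE-CONTROL.
            SELECT MYFILE ASSIGN TO DISK "MYFILE.DAT"
                   ORGANIZATION IS INDEXED
                   ACCESS MODE IS RANDOM
                   RECORD KEY IS M-IDNO.
        DATA DIVISION.
        FILE SECTION.
        FD  MYFILE.
        01  MYFILE-REC.
            02  M-IDNO          PIC XXXX.
            02  M-NAME          PIC X(16).
            02  M-SURNAME       PIC X(16).
            02  M-BIRTHDATE.
                03 M-BD-YEAR    PIC 9999.
                03 M-BD-MONTH   PIC 99.
                03 M-BD-DAY     PIC 99.
        PROCEDURE DIVISION.
        MAIN-PARA.
      * I-O MODE KEEPS THE EXISTING RECORDS OF THE FILE
            OPEN I-O MYFILE.
      * SAMPLE VALUES BELOW ARE MADE UP FOR THE EXAMPLE
            MOVE "B150"   TO M-IDNO.
            MOVE "ZEYNEP" TO M-NAME.
            MOVE "KAYA"   TO M-SURNAME.
            MOVE 1990     TO M-BD-YEAR.
            MOVE 11       TO M-BD-MONTH.
            MOVE 2        TO M-BD-DAY.
      * WITH RANDOM ACCESS THE NEW KEY CAN FALL ANYWHERE IN THE FILE
            WRITE MYFILE-REC INVALID KEY
                DISPLAY "KEY ALREADY EXISTS: " M-IDNO.
            CLOSE MYFILE.
            STOP RUN.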

DELETING RECORDS OF AN INDEXED FILE

If you do not want a specific record to be kept in an indexed file any more, you can use the DELETE statement to make the record inaccessible.

The DELETE statement is used more or less like the WRITE statement, except that the FILE NAME is specified in the DELETE statement and not the RECORD NAME. You must move a correct key value to the record key field and issue the DELETE statement.

         MOVE "X12" TO M-IDNO.
         DELETE MYFILE RECORD.

Please note that, when you delete a record, the physical record is NOT deleted. The record is only rendered inaccessible. Suppose, for instance, that an indexed file is 20 Mbytes in total length and you delete half of the records in this file. After all the deletions are completed, you will see that the file is still 20 Mbytes in length. The only way you can recover the disk space used by deleted records is REORGANIZING the indexed file. (See section "REORGANIZING INDEXED FILES").

ADVANTAGES of INDEXED FILES

  1. Quite easy to process,
  2. With proper selection of a key field, records in a large file can be searched and accessed very quickly.
  3. Any field of the records can be used as the key. The key field can be numerical or alphanumerical.

DISADVANTAGES of INDEXED FILES

  1. Extra data structures have to be maintained (the COBOL run-time modules take care of these and it is not the programmers' concern). These extra data structures maintained on the disk can use up much disk space, especially for long key values.
  2. Indexed files have to be reorganized from time to time to get rid of deleted records and to restore performance, which gradually degrades as new records are added.

ADVANCED FEATURES OF INDEXED SEQUENTIAL FILES

Reorganizing Indexed Files : This operation is usually done by a utility program supplied by the manufacturer of the COBOL compiler that you use. Refer to the manuals of your compiler for details and instructions.

MULTIPLE KEYS (PRIMARY & ALTERNATE KEYS)

Sometimes programmers need to assign more than one key field to their records. An example could be a database that keeps records for the vehicles registered in a certain country. The Turkish vehicle plate numbering system is NOT suitable for relative files because plate numbers contain alphabetic parts (e.g. 06 AHT 08). As you might guess, this type of key value is very well suited to an INDEXED file.

Having declared the "PLATE" field as the key field, now, given a plate number, a program which has declared an indexed file with RANDOM access mode can find the corresponding record very quickly (if the record does not exist, this condition will also be detected very quickly).

Knowing that no two vehicles have the same plate number (unique), the plate number is certainly a very good choice for the key.

However, suppose that the program you have to write is also expected to find records for vehicles of a certain brand very quickly. Since the key is the plate number, the only way you can find "OPEL" brand vehicles is by going through all the records (declare the access mode to be sequential and READ NEXT all records until the AT END condition is raised), testing the BRAND field against the value given by the user ("OPEL" for instance).

With indexed files, there is a better solution : You can declare the BRAND field also to be a key, but as an ALTERNATE key. The PRIMARY key will still be the plate number. Since there are many vehicles manufactured by OPEL in your database, you know that the ALTERNATE KEY values will not be unique. THIS IS ALLOWED IN COBOL. ONLY THE PRIMARY KEY HAS TO BE UNIQUE. ALTERNATE KEYS CAN HAVE DUPLICATE VALUES.

You can declare as many alternate keys as your program requires. The more alternate keys you have, the more extra data structures will be required and the slower your program will be. Therefore, you should think carefully before assigning alternate keys and avoid unnecessary ones.

When you have alternate keys, you declare them in the SELECT statement together with the PRIMARY key.

         SELECT VEHICLES ASSIGN TO DISK "VEHICLES.DAT"
                   ORGANIZATION IS INDEXED
                   ACCESS MODE IS RANDOM
                   RECORD KEY IS PLATE
                   ALTERNATE RECORD KEY IS BRAND WITH DUPLICATES.
         ....
         FD   VEHICLES.
         01   VEHICLE-REC.
              02 PLATE           PIC XXXXXXXXXX.
              02 BRAND           PIC X(20).
              02 OWNER           PIC X(32).
              02 DATE-REGISTERED PIC 99999999.
              02 ENGINE-NO       PIC X(20).
              02 CHASSIS-NO      PIC X(20).
              ...

When you need to access the file with the primary key, you just move the key value to the primary key field of the record and issue a READ statement.

              MOVE "06 AHT 08" TO PLATE.
              READ VEHICLES INVALID KEY PERFORM NOT-FOUND.

When you need to access the file using an alternate key, you move an appropriate value to the alternate key you want to use and issue a READ statement in which you specify which key you want to use:

              MOVE "CHEVROLET" TO BRAND.
              READ VEHICLES KEY IS BRAND INVALID KEY PERFORM NOT-FOUND.

The START Statement

Sometimes, it is necessary to find a set of records for which you want to specify a criterion. For example, in our "vehicles" example, you might want to find the records of "TOYOTA" brand vehicles. If the criterion is related to one of the keys, you will not have to go through the whole file, testing the fields against a certain value. Instead, you can use the START statement to find the first vehicle with "TOYOTA" recorded as the brand.

An example is:

    *
    * THE VALUE TO LOOK FOR IS MOVED TO THE KEY FIELD FIRST
    *
              MOVE "TOYOTA" TO BRAND.
              START VEHICLES KEY IS EQUAL TO BRAND
                             INVALID KEY GO TO NO-SUCH-BRAND.
          LOOP1.
              READ VEHICLES NEXT AT END GO TO END-OF-BRAND.
              IF BRAND NOT = "TOYOTA" GO TO END-OF-BRAND.
              ...
              GO TO LOOP1.
          END-OF-BRAND.
              ...

IMPORTANT : Please note that when you use the START statement, the file is accessed as a RANDOM file, whereas the subsequent READ NEXT statements access the file in SEQUENTIAL mode. This dilemma is solved by the ACCESS MODE IS DYNAMIC phrase in the SELECT statement. When you declare the access mode to be DYNAMIC, you can access the file both sequentially (the NEXT option of the READ statement) and randomly by specifying a certain key value. When you want to use the START statement, you must open an indexed file either as INPUT or as I-O. The START statement is not allowed, and in fact not meaningful, when the file is opened for OUTPUT.
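
To show how these pieces might fit together, here is a sketch of a complete program that lists all "TOYOTA" vehicles using DYNAMIC access. The program name BRANDLST, the DONE-FLAG variable and the LIST-ONE paragraph are made up for the example. The sketch also assumes that the record consists of exactly the six fields shown, and that the alternate key is declared with the standard ALTERNATE RECORD KEY ... WITH DUPLICATES phrase.

        IDENTIFICATION DIVISION.
        PROGRAM-ID. BRANDLST.
        ENVIRONMENT DIVISION.
        INPUT-OUTPUT SECTION.
        FILE-CONTROL.
            SELECT VEHICLES ASSIGN TO DISK "VEHICLES.DAT"
                   ORGANIZATION IS INDEXED
                   ACCESS MODE IS DYNAMIC
                   RECORD KEY IS PLATE
                   ALTERNATE RECORD KEY IS BRAND WITH DUPLICATES.
        DATA DIVISION.
        FILE SECTION.
        FD  VEHICLES.
        01  VEHICLE-REC.
            02 PLATE            PIC X(10).
            02 BRAND            PIC X(20).
            02 OWNER            PIC X(32).
            02 DATE-REGISTERED  PIC 9(8).
            02 ENGINE-NO        PIC X(20).
            02 CHASSIS-NO       PIC X(20).
        WORKING-STORAGE SECTION.
      * DONE-FLAG IS A HYPOTHETICAL SWITCH THAT ENDS THE LOOP
        01  DONE-FLAG           PIC X VALUE "N".
        PROCEDURE DIVISION.
        MAIN-PARA.
            OPEN INPUT VEHICLES.
      * POSITION THE FILE ON THE FIRST RECORD WITH THIS BRAND
            MOVE "TOYOTA" TO BRAND.
            START VEHICLES KEY IS EQUAL TO BRAND
                  INVALID KEY MOVE "Y" TO DONE-FLAG.
            PERFORM LIST-ONE UNTIL DONE-FLAG = "Y".
            CLOSE VEHICLES.
            STOP RUN.
        LIST-ONE.
      * READ NEXT FOLLOWS THE ORDER OF THE KEY USED IN THE START
            READ VEHICLES NEXT AT END MOVE "Y" TO DONE-FLAG.
      * STOP AS SOON AS A DIFFERENT BRAND IS REACHED
            IF DONE-FLAG = "N" AND BRAND NOT = "TOYOTA"
                MOVE "Y" TO DONE-FLAG.
            IF DONE-FLAG = "N" DISPLAY PLATE " " OWNER.

The test on BRAND is needed because READ NEXT does not stop at the last "TOYOTA" record by itself; it simply continues with the records of the next brand.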




