SORTING FILES IN COBOL (RM-Cobol)

Line sequential and sequential files are used very frequently in data processing applications.

The records in these files usually need to be put into ascending or descending order for proper and easy handling. (Remember that searching is faster in an ordered file, that is, a file sorted on the field used as the search key: you can stop reading as soon as you find a record whose key field is greater than the value you are searching for.)

Sorting a set of data which fits into arrays (and therefore into memory) is quite easy. (Remember the internal sort techniques you have learned in previous courses.) Sorting large files which do not fit into arrays in memory, on the other hand, is not easy. It requires special algorithms, generally called "EXTERNAL SORT TECHNIQUES".

Many COBOL compilers provide facilities to solve this "External Sort" problem. RM-COBOL, for instance, has a very powerful SORT statement which lets the user SORT and MERGE sequential files very easily.

SORTING does not need much explanation here. The MERGE process, however, needs a little explanation.

MERGING DATA is simply combining two or more SORTED sequential files into a single file so that the resulting file is also sorted. We shall not cover the MERGE facility of the COBOL SORT statement in this course.

The RM-COBOL SORT Statement

The general syntax of the RM-COBOL SORT statement is :

SORT sort-work-file  ON {ASCENDING,DESCENDING} KEY data-name1, data-name2, ...
                     ON {ASCENDING,DESCENDING} KEY data-name1, data-name2, ...
                     USING file-to-be-sorted
                     GIVING sorted-file

In order to use this SORT statement, you will have to declare 3 files:

The WORK FILE for the SORTING Process
The file to be sorted (input file)
The file which will contain the sorted records (output file)

The WORK FILE is not an actual file; it is a special declarative file structure with which you tell the compiler that you will perform an external sort on a file and also indicate the fields on which you shall set the sort criteria. The concept will become clearer with an example :

Suppose you have a sequential file with personnel records (PERSONEL.DAT) and you want to sort this file on the persons' names and surnames. In order to use the SORT statement, you must declare a SORT WORK file which describes the structure of the records to be sorted and indicates the lengths and positions of the fields which contain the names and surnames.

Suppose that the structure of the PERSONEL.DAT file is :

       01  PERS-REC.
           02 ID-NO              PIC X(8).
           02 NAME               PIC X(16).
           02 SURNAME            PIC X(16).
           02 GENDER             PIC X.
           02 DEPT-CODE          PIC X.
           02 NCHILDREN          PIC 99.
           02 HOME-ADR1          PIC X(25).
           02 HOME-ADR2          PIC X(25).
           02 HOME-ADR3          PIC X(25).
           02 HOME-TEL           PIC X(12).
           02 EMPLOYMENT-DATE.
              03 R-DAY             PIC 99.
              03 R-MONTH           PIC 99.
              03 R-YEAR            PIC 9999.
           02 LEAVE-DATE.
              03 R-DAY             PIC 99.
              03 R-MONTH           PIC 99.
              03 R-YEAR            PIC 9999.
           02 LEAVE-REASON       PIC X.

The corresponding SORT WORK file declaration might look something like :

       SD  SORT-WORK-FILE.
       01  PERS-REC.
           02 ID-NO              PIC X(8).
           02 NAME               PIC X(16).
           02 SURNAME            PIC X(16).
           02 FILLER             PIC X(108).

This declaration enables the user to issue a SORT statement that uses the NAME and SURNAME fields as keys, in either ascending or descending order.

Please note that a 108-byte filler is used in the SORT-WORK-FILE record description so that the work file's record length (8 + 16 + 16 + 108 = 148 bytes) matches the 148-byte record length of the PERSONEL.DAT file.

Please also note the "SD" indicator used in place of "FD" in the FILE SECTION.

You can sort on more than one field at a time, and you can sort into ascending (increasing) order on one field and into descending (decreasing) order on another. For example, while sorting the PERSONEL.DAT file, you can declare the NAME field to be the primary sort field and the SURNAME field to be the secondary sort field, so that records with identical names are sorted by surname among themselves. (Just like the entries in a telephone directory.)

Referring to our PERSONEL.DAT example, typical file declarations and a SORT statement would look like :

In the ENVIRONMENT DIVISION :

       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SORT-FILE ASSIGN TO DISK "SORTWORK".
           SELECT UNSORTED-PERSON ASSIGN TO DISK "PERSONEL.DAT"
                  ORGANIZATION IS LINE SEQUENTIAL.
           SELECT SORTED-PERSON   ASSIGN TO DISK "SPERSONL.DAT"
                  ORGANIZATION IS LINE SEQUENTIAL.

In the DATA DIVISION :

       FILE SECTION.
       SD  SORT-FILE.
       01  SORT-RECORD.
           02  FILLER                     PIC X(8).
           02  S-NAME                     PIC X(16).
           02  S-SURNAME                  PIC X(16).
           02  FILLER                     PIC X(91).
           02  S-EMP-DATE                 PIC X(8).
           02  FILLER                     PIC X(9).
      
       FD  UNSORTED-PERSON.
       01  FILLER                         PIC X(148).

       FD  SORTED-PERSON.
       01  FILLER                         PIC X(148).

In the PROCEDURE DIVISION :

           SORT SORT-FILE
                   ON ASCENDING KEY S-NAME S-SURNAME
                   ON DESCENDING KEY S-EMP-DATE
                   USING UNSORTED-PERSON
                   GIVING SORTED-PERSON.

Notes :

The lengths of records in the work, input and output files should match.
You do not need to declare the details of input and output file records if you do not need these details in your program.
You should not OPEN or CLOSE the work file yourself, and the input and output files must not be open when the SORT statement executes; the SORT statement opens and closes them automatically.
You can specify more than one sort field and the fields may have different ordering (ascending/descending).
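
One point about key ordering deserves a short sketch. A group item used as a sort key is compared as a single alphanumeric string, from left to right. Because the employment date is stored as day, month, year, a key on the whole date group orders records by day first, not chronologically. If chronological order is what you want, you can name the year, month and day as separate keys. The fragment below is only a sketch under that assumption; the subfield names S-EMP-DATE-DD, S-EMP-DATE-MM and S-EMP-DATE-YY are chosen here for illustration:

      *    FILE SECTION. Date subfield names are assumed for
      *    this sketch. Record length is still 148 bytes.
       SD  SORT-FILE.
       01  SORT-RECORD.
           02  FILLER                     PIC X(8).
           02  S-NAME                     PIC X(16).
           02  S-SURNAME                  PIC X(16).
           02  FILLER                     PIC X(91).
           02  S-EMP-DATE.
               03 S-EMP-DATE-DD           PIC 99.
               03 S-EMP-DATE-MM           PIC 99.
               03 S-EMP-DATE-YY           PIC 9999.
           02  FILLER                     PIC X(9).

      *    PROCEDURE DIVISION. Most significant date part first.
           SORT SORT-FILE
                   ON ASCENDING KEY S-NAME S-SURNAME
                   ON DESCENDING KEY S-EMP-DATE-YY
                                     S-EMP-DATE-MM
                                     S-EMP-DATE-DD
                   USING UNSORTED-PERSON
                   GIVING SORTED-PERSON.

With these keys, records with the same name and surname should appear with the most recent employment date first.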

A complete RM-COBOL program which sorts the PERSONEL.DAT file, creating the sorted file SPERSONL.DAT, might look like the one below. Please study the SORT statement and its relevant file declarations carefully!

       IDENTIFICATION DIVISION.
       PROGRAM-ID.  "SORT DEMO".
       ENVIRONMENT DIVISION.
       CONFIGURATION SECTION.
       SOURCE-COMPUTER.  RMCOBOL-85.
       OBJECT-COMPUTER.  RMCOBOL-85.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SORT-FILE ASSIGN TO DISK "SORTWORK".
           SELECT UNSORTED-PERSON ASSIGN TO DISK "PERSONEL.DAT"
                  ORGANIZATION IS LINE SEQUENTIAL.
           SELECT SORTED-PERSON   ASSIGN TO DISK "SPERSONL.DAT"
                  ORGANIZATION IS LINE SEQUENTIAL.
      *
       DATA DIVISION.
       FILE SECTION.
       SD  SORT-FILE.
       01  SORT-RECORD.
           02  S-ID-NO                    PIC X(8).
           02  S-NAME                     PIC X(16).
           02  S-SURNAME                  PIC X(16).
           02  S-GENDER                   PIC X.
           02  S-DEPT-CODE                PIC X.
           02  FILLER                     PIC X(89).
           02  S-EMP-DATE.
               03 S-EMP-DATE-DD           PIC 99.
               03 S-EMP-DATE-MM           PIC 99.
               03 S-EMP-DATE-YY           PIC 9999.
           02  S-LEAVE-DATE.
               03 S-LEAVE-DATE-DD         PIC 99.
               03 S-LEAVE-DATE-MM         PIC 99.
               03 S-LEAVE-DATE-YY         PIC 9999.
           02  S-LEAVE-REASON             PIC X.
      

       FD  UNSORTED-PERSON.
       01  FILLER                         PIC X(148).

       FD  SORTED-PERSON.
       01  FILLER                         PIC X(148).
      *
       PROCEDURE DIVISION.

       MAIN-PGM.

           SORT SORT-FILE
                   ON ASCENDING KEY S-NAME S-SURNAME
                   ON DESCENDING KEY S-EMP-DATE
                   USING UNSORTED-PERSON
                   GIVING SORTED-PERSON.

           STOP RUN.
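
Once SPERSONL.DAT exists, the early-stop search mentioned at the beginning of this page can be tried on it. The program below is a minimal sketch, not part of the original example: the program name FINDNAME, the data names P-RECORD, P-NAME, WANTED-NAME and SEARCH-STATE, and the searched value "MEHMET" are all assumptions made for illustration. It relies on the file being sorted into ascending order on the name field, and it reports only the first matching record.

       IDENTIFICATION DIVISION.
       PROGRAM-ID.  FINDNAME.
       ENVIRONMENT DIVISION.
       INPUT-OUTPUT SECTION.
       FILE-CONTROL.
           SELECT SORTED-PERSON ASSIGN TO DISK "SPERSONL.DAT"
                  ORGANIZATION IS LINE SEQUENTIAL.
      *
       DATA DIVISION.
       FILE SECTION.
       FD  SORTED-PERSON.
       01  P-RECORD.
           02  FILLER                     PIC X(8).
           02  P-NAME                     PIC X(16).
           02  FILLER                     PIC X(124).
       WORKING-STORAGE SECTION.
      *    The name searched for is an illustrative value only.
       01  WANTED-NAME                    PIC X(16) VALUE "MEHMET".
       01  SEARCH-STATE                   PIC X     VALUE "S".
           88  STILL-SEARCHING            VALUE "S".
           88  NAME-FOUND                 VALUE "F".
           88  NAME-ABSENT                VALUE "A".
      *
       PROCEDURE DIVISION.
       MAIN-PGM.
           OPEN INPUT SORTED-PERSON.
           PERFORM READ-NEXT UNTIL NOT STILL-SEARCHING.
           IF NAME-ABSENT
               DISPLAY "NOT IN FILE: " WANTED-NAME.
           CLOSE SORTED-PERSON.
           STOP RUN.
      *
       READ-NEXT.
           READ SORTED-PERSON
               AT END MOVE "A" TO SEARCH-STATE.
           IF STILL-SEARCHING
               IF P-NAME = WANTED-NAME
                   DISPLAY "FOUND: " P-RECORD
                   MOVE "F" TO SEARCH-STATE
               ELSE
      *            A greater key means the name cannot occur later.
                   IF P-NAME > WANTED-NAME
                       MOVE "A" TO SEARCH-STATE.

Note that this program opens SORTED-PERSON itself; that is fine here because the file is opened in a separate run, after the SORT statement has finished with it.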
