Disassembling 1900 semicompiled format

All the original ICL compilers (Algol, Cobol, Fortran and PLAN) compile to semicompiled (link-ready) format, which is then consolidated (linked) to produce the final executable. The various libraries of subroutines also hold semicompiled format.

For the profiling project we need to modify some routines in the Fortran run time library, SUBGROUPSRF4. Unfortunately the original source of the library has been lost. However, there is a nearly one-to-one correspondence between the contents of semicompiled format and PLAN, so if we can disassemble the semicompiled code back to PLAN then we can perform our modifications.

Luckily (due to the heroic efforts of Brian Spoor who seems to have re-typed it from the original) we have a partial copy of the document SCM which defines the semicompiled format, so the project seems possible.

Semicompiled format

In memory an ICL 1900 program is split into three regions: "lower", addresses less than 4096, which can be addressed with 12 bit immediate addresses; "program", addresses less than 32768, which can be reached by the 15 bit addresses in branch instructions; and "upper", the rest of memory. In extended branch mode "program" can be larger, but space in the first 16384 words is needed for "replacers", which hold the targets of "replaced" or indirect branches.

Region      Below     Note
Lower       4096
Replacers   16384     Only present in extended branch mode
Program     32768     Or up to 4194304 in extended branch mode
Upper       32768     Or up to 4194304 in 22 bit addressing mode
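
The 4096, 32768 and 4194304 limits are the powers of two that 12, 15 and 22 bit address fields can express. A minimal Python sketch of that arithmetic follows; the function and constant names are made up for illustration and are not part of any ICL software.

```python
def address_limit(bits):
    """Number of words reachable with an address field of the given width."""
    return 1 << bits

# 12 bit immediate addresses reach the lower region.
LOWER_LIMIT = address_limit(12)
# 15 bit branch addresses reach the program region.
PROGRAM_LIMIT = address_limit(15)
# 22 bit addressing gives the larger limit quoted in the table.
EXTENDED_LIMIT = address_limit(22)

def directly_addressable(address):
    """True if a 12 bit immediate address can hold this address."""
    return 0 <= address < LOWER_LIMIT

def directly_branchable(address):
    """True if a 15 bit branch address can hold this address."""
    return 0 <= address < PROGRAM_LIMIT

print(LOWER_LIMIT, PROGRAM_LIMIT, EXTENDED_LIMIT)
```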

The job of the consolidator is to read semicompiled data and lay it out in store, filling in the values from the compiler and putting the regions of the same type (lower, program and upper) together.

The input to the consolidator is a collection of segments (separately compiled subroutines) taken from the compiler output or libraries. Each segment defines the amount of upper, lower and program it needs, and the contents of the regions.

A segment consists of a segment leader, which contains the data needed to consolidate the segment with other programs (for example the entry points of the segment), and the segment data, which holds the contents of the segment.

Here's an example of semicompiled, showing some of these features:

Given the following simple PLAN segment:

#PROGRAM       /TESTSEG(15AM,22AM,DBM,EBM)
#UPPER         COMMON/TESTCOM/
               DATA(2)
#LOWER
PDATA          /DATA
#PROGRAM

      LDX   1  PDATA
      LDX   0  0(1)
      CALL  3  EXTERN
      BVSR     *+1
      EXIT  2  0
#END
#FINISH

We get semicompiled that decodes as:

SEGMENT DATA TITLE RECORD
        #PROGRAM /TESTSEG
        DBM, EBM
        15AM, 22AM
A fairly obvious translation of the #PROGRAM directive.
SEGMENT DATA RECORD
        TA = PR +       0
        TA = LP+       0
        R2 =  TESTCOM
        [TA] =        0+R2 ; TA = TA + 1
        TA = PR +       0
        [TA] =  2097152+LP ; TA = TA + 1
        [TA] =     4096 ; TA = TA + 1
        [TA] =  7208960 ; TA = TA + 1
        IF EBM NEXT OP IS RELATIVE
        NOOP
        [TA] =  5177348+PR ; TA = TA + 1
        [TA] =  5144576 ; TA = TA + 1
        TA = PR +       2
        END

The PLAN compiler tells the consolidator to write to the PRogram area, but quickly backtracks and says the next output is for Lower Preset (TA = LP + 0). Then the R2 relativiser is pointed at the common block TESTCOM and R2 is written to the word pointed to by the current transfer address.

The transfer address is set to the beginning of the PRogram area, and 2097152+LP is written. (2097152 is "LDX 1 0", LP is pointing at our #LOWER).
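
The decimal words in the listing are easier to read once split into instruction fields. The sketch below is in Python and assumes the usual 24 bit layout: X in the top 3 bits, then a 7 bit function code F, a 2 bit modifier M and a 12 bit operand N, with branch instructions instead using an even function code followed by a 15 bit address. Only the 12 and 15 bit address widths come from the text above; the other field widths and the function names are assumptions made for illustration.

```python
def decode_operand_format(word):
    """Split a 24 bit word as X (3 bits), F (7 bits), M (2 bits), N (12 bits)."""
    return {
        "X": (word >> 21) & 0o7,
        "F": (word >> 14) & 0o177,
        "M": (word >> 12) & 0o3,
        "N": word & 0o7777,
    }

def decode_branch_format(word):
    """Split a 24 bit word as X (3 bits), even function code F, N (15 bits)."""
    return {
        "X": (word >> 21) & 0o7,
        # The lowest bit of the 7 bit function field belongs to the address.
        "F": (word >> 14) & 0o176,
        "N": word & 0o77777,
    }

# First word of the program area: "LDX 1 0" before LP is added.
ldx = decode_operand_format(2097152)
# Second word: accumulator 0, modifier 1, operand 0, matching "LDX 0 0(1)".
ldx_modified = decode_operand_format(4096)
# Third word (PR+2), the CALL instruction with a zero address.
call = decode_branch_format(7208960)
# Fourth word: the branch to *+1, offset 4 from the start of the program area.
bvsr = decode_branch_format(5177348)

print(ldx, ldx_modified, call, bvsr)
```

Under these assumptions the fourth word decodes with N = 4, which agrees with a branch at offset 3 targeting "*+1".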

More words are written to the region, then the transfer address is set back to PR+2 (pointing at the CALL instruction).
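
The transfer address mechanism can be sketched as a tiny interpreter. This is a Python illustration only, not the real consolidator: the base addresses are hypothetical, the R2 relativiser is modelled by naming its target directly, and the "IF EBM NEXT OP IS RELATIVE" and NOOP entries are left out.

```python
# Hypothetical base addresses; a real consolidator chooses these itself.
BASES = {"PR": 1024, "LP": 64, "TESTCOM": 5000}

WORD_MASK = 0o77777777  # 24 bit word

def run(ops, bases):
    """Apply the operations and return the store and the final transfer address."""
    store = {}
    ta = 0
    for op in ops:
        if op[0] == "set_ta":      # TA = region + offset
            _, region, offset = op
            ta = bases[region] + offset
        elif op[0] == "store":     # [TA] = value (+ base) ; TA = TA + 1
            _, value, relative_to = op
            base = bases[relative_to] if relative_to else 0
            store[ta] = (value + base) & WORD_MASK
            ta += 1
    return store, ta

# The first SEGMENT DATA RECORD of the example, in order.
RECORD = [
    ("set_ta", "PR", 0),
    ("set_ta", "LP", 0),
    ("store", 0, "TESTCOM"),     # R2 points at TESTCOM
    ("set_ta", "PR", 0),
    ("store", 2097152, "LP"),
    ("store", 4096, None),
    ("store", 7208960, None),
    ("store", 5177348, "PR"),
    ("store", 5144576, None),
    ("set_ta", "PR", 2),
]

store, ta = run(RECORD, BASES)
for address in sorted(store):
    print(address, store[address])
print("TA =", ta)
```

With these made-up bases, one word lands in the lower area, five in the program area, and the transfer address ends at the third program word, ready for the CALL to be patched.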

SEGMENT DATA RECORD
        R2 =  EXTERN
        IF EBM NEXT OP IS REPLACED
        NOOP
        MASK =    32767
        [TA] =        0+R2 ; TA = TA + 1
        NOOP
        END

R2 is pointed at the external routine, the address of which is written to the call instruction. (By setting the MASK before the write, the rest of the instruction, "CALL 3", is left untouched.) The special "IF EBM NEXT OP IS REPLACED" instruction to the consolidator will force creation of a replaced branch in EBM mode. (A "replacer" will be created to hold the address of EXTERN and the branch instruction will point at this replacer instead of directly at EXTERN.)
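
The effect of MASK can be sketched in Python as follows. It assumes the mask selects the bits of the target word that the write may overwrite, which is how the description above reads; the function name and the address chosen for EXTERN are hypothetical.

```python
WORD_MASK = 0o77777777  # 24 bit word

def masked_store(old_word, value, mask):
    """Replace only the bits of old_word selected by mask with bits of value."""
    return (old_word & ~mask & WORD_MASK) | (value & mask)

call_3_0 = 7208960       # word already at PR+2 in the example
extern_address = 1500    # hypothetical address of EXTERN

# MASK = 32767 covers the 15 bit address field of the branch instruction.
patched = masked_store(call_3_0, extern_address, 32767)
print(patched)
```

The upper nine bits, which carry the "CALL 3" part of the word, come through unchanged, and only the address field is filled in.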

Why is this strange backtracking done? Remember that PLAN is a one pass compiler and has no special declaration for external symbols. When PLAN compiles the "CALL 3 EXTERN" it doesn't know whether EXTERN is a forward branch or an external reference, so it just assembles "CALL 3 0". Later on it will either ask the consolidator to fill in an external reference, as in this example, or, if it finds the missing symbol in the same segment, it will ask the consolidator to fill in an internal branch, for example:

    TA=PR+2
    IF EBM NEXT OP IS RELATIVE
    MASK=32767
    [TA] = xxx+PR; TA = TA+1
Where "xxx" is the location in the current segment of the forward label.

The semicompiled continues:

SEGMENT DATA TERMINATOR RECORD
        LP =        1
SEGMENT LEADER TITLE RECORD
        #PROGRAM /TESTSEG
        DBM, EBM
        15AM, 22AM
SEGMENT LEADER CUE RECORD
        CUE TYPE 41 VAL       5 DATA       0 NAME TESTSEG
SEGMENT LEADER CUE RECORD
        CUE TYPE  3 VAL       2 DATA       0 NAME TESTCOM
SEGMENT LEADER CUE RECORD
        CUE TYPE  0 VAL       0 DATA       0 NAME EXTERN
SEGMENT LEADER TERMINATOR RECORD
        LP =        1

Finally a segment leader is written that shows the external references used by or contained in the segment. Yes, the leader comes after the data.

(This is the output from the READSEMI program I wrote to investigate the format of ICL semicompiled; it is available on the disassembler tape.)

Availability

The disassembler is available as a newcopyin tape (5250004,SEMICOMPTAPE). To read the tape, use NEWCOPYIN to read the file NEWCOPYINPAR from the tape, then copy in all the files listed in NEWCOPYINPAR:
10.46.03← NEWCOPYIN (5250004),TRAPOPEN
← NEWCOPYINPAR
← ****
...
10.46.04← NEWCOPYIN (5250004),TRAPOPEN,*CR NEWCOPYINPAR
...
10.46.05←
The compiled disassembler is on the tape as PROGRAM DISS. Run it by:
        LO PROGRAM DISS
        AS *LP
        AS *FW,OUTPUT(GRAP)
        AS *DA,SEMICOMPILED
        EN
Unintelligible debugging output is produced by
        EN ,PARAM(DEBUG)
Assuming you have ratfor and gfio installed, you can compile it by:
        RATFOR *CR DISASSEMBLE(/RAT),BIN,SEMI :LIB.SUBGROUPGFIO