For the profiling project we need to modify some routines in the Fortran run time library, SUBGROUPSRF4. Unfortunately the original source of the library has been lost. However there is a near 1 to 1 correspondence between the contents of semicompiled format and PLAN, so if we can disassemble the semicompiled code back to PLAN then we can perform our modifications.
Luckily (due to the heroic efforts of Brian Spoor who seems to have re-typed it from the original) we have a partial copy of the document SCM which defines the semicompiled format, so the project seems possible.
In memory an ICL 1900 program is split into three regions: "lower", addresses less than 4096, which can be addressed with 12 bit immediate addresses; "program", addresses less than 32768, which can be reached by the 15 bit addresses in branch instructions (in extended branch mode "program" can be larger, but space in the first 16384 words is needed for "replacers", holding the targets of "replaced" or indirect branches); and "upper", the rest of memory.
|Replacers||16384||Only present in extended branch mode|
|Program||32768||Or up to 4194304 in extended branch mode|
|Upper||32768||Or up to 4194304 in 22 bit addressing mode|
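The limits quoted above are simply the number of words an unsigned address field of a given width can reach. A minimal sketch to check the arithmetic (the helper names `reach` and `fits` are made up for this illustration):

```python
def reach(bits: int) -> int:
    """Number of words addressable by an unsigned field of `bits` bits."""
    return 1 << bits

def fits(address: int, bits: int) -> bool:
    """True if `address` can be expressed in an unsigned `bits`-bit field."""
    return 0 <= address < reach(bits)

LOWER_LIMIT = reach(12)     # 12 bit immediate addresses
REPLACER_LIMIT = reach(14)  # the first 16384 words, where replacers live
PROGRAM_LIMIT = reach(15)   # 15 bit branch addresses
EXTENDED_LIMIT = reach(22)  # 22 bit addressing mode

print(LOWER_LIMIT, REPLACER_LIMIT, PROGRAM_LIMIT, EXTENDED_LIMIT)
print(fits(4095, 12), fits(4096, 12))
```

So a word at address 4096 is already out of reach of a 12 bit immediate address, which is why data that instructions refer to directly has to be collected in "lower".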
The job of the consolidator is to read semicompiled data and lay it out in store, filling in the values from the compiler and putting the regions of the same type (lower, program and upper) together.
The input to the consolidator is a collection of segments (separately compiled subroutines) taken from the compiler output or libraries. Each segment defines the amount of upper, lower and program it needs, and the contents of the regions.
A segment consists of a segment leader, which contains the data needed to consolidate the segment with other programs (for example the entry points of the segment), and the segment data, which holds the contents of the segment.
Here's an example of semicompiled, showing some of these features. Given the following simple PLAN segment:
    #PROGRAM /TESTSEG(15AM,22AM,DBM,EBM)
    #UPPER
    COMMON/TESTCOM/ DATA(2)
    #LOWER
    PDATA /DATA
    #PROGRAM
    LDX 1 PDATA
    LDX 0 0(1)
    CALL 3 EXTERN
    BVSR *+1
    EXIT 2 0
    #END
    #FINISH
We get semicompiled that decodes as:
    SEGMENT DATA TITLE RECORD
    #PROGRAM /TESTSEG DBM, EBM 15AM, 22AM

A fairly obvious translation of the #PROGRAM directive.
    SEGMENT DATA RECORD
    TA = PR + 0
    TA = LP+ 0
    R2 = TESTCOM
    [TA] = 0+R2 ; TA = TA + 1
    TA = PR + 0
    [TA] = 2097152+LP ; TA = TA + 1
    [TA] = 4096 ; TA = TA + 1
    [TA] = 7208960 ; TA = TA + 1
    IF EBM NEXT OP IS RELATIVE
    NOOP
    [TA] = 5177348+PR ; TA = TA + 1
    [TA] = 5144576 ; TA = TA + 1
    TA = PR + 2
    END
The PLAN compiler tells the consolidator to write to the PRogram area, but quickly backtracks and says the next output is for Lower Preset (TA = LP + 0). Then the R2 relativiser is pointed at the common block TESTCOM and R2 is written to the word pointed to by the current transfer address.
The transfer address is set to the beginning of the PRogram area, and 2097152+LP is written. (2097152 is "LDX 1 0", LP is pointing at our #LOWER).
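To see why 2097152 reads as "LDX 1 0", it helps to split the word into its fields. The sketch below assumes the usual ICL 1900 24 bit instruction layout, which is not spelled out in this document: X in the top 3 bits, then a 7 bit function code F, a 2 bit modifier M and a 12 bit address N; for branches, a 6 bit function code followed by a 15 bit address. The function names are made up for this illustration.

```python
def decode_direct(word: int) -> dict:
    """Split a 24 bit word as X(3) F(7) M(2) N(12). Assumed layout."""
    return {
        "x": (word >> 21) & 0o7,
        "f": (word >> 14) & 0o177,
        "m": (word >> 12) & 0o3,
        "n": word & 0o7777,
    }

def decode_branch(word: int) -> dict:
    """Split a 24 bit word as X(3) F(6) N(15). Assumed branch layout."""
    return {
        "x": (word >> 21) & 0o7,
        "f": (word >> 15) & 0o77,
        "n": word & 0o77777,
    }

ldx_1 = decode_direct(2097152)   # the word written at PR+0, before LP is added
ldx_0 = decode_direct(4096)      # the word written at PR+1
call_3 = decode_branch(7208960)  # the word written at PR+2

print(ldx_1)
print(ldx_0)
print(call_3)
```

Under these assumptions the first word has X = 1 with every other field zero, the second has M = 1 (the "(1)" modifier of "LDX 0 0(1)"), and the third has X = 3 with a zero address, matching "CALL 3" with its target still to be filled in.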
More words are written to the region, then the transfer address is set back to PR+2 (pointing at the CALL instruction).
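The effect of this record can be mimicked with a toy consolidator. This is only a sketch of the idea: the class, its method names and the base addresses chosen for PR, LP and TESTCOM are all invented for the illustration, and the "IF EBM NEXT OP IS RELATIVE" and NOOP operations are ignored.

```python
class ToyConsolidator:
    """Keeps a transfer address (TA) and writes relocated words to store."""

    def __init__(self, bases):
        self.bases = dict(bases)  # start address of each region / relativiser
        self.store = {}           # address -> word
        self.ta = 0               # current transfer address

    def set_ta(self, region, offset):
        """TA = region + offset"""
        self.ta = self.bases[region] + offset

    def write(self, value, relative_to=None):
        """[TA] = value (+ base of relative_to) ; TA = TA + 1"""
        base = self.bases[relative_to] if relative_to else 0
        self.store[self.ta] = value + base
        self.ta += 1

# Hypothetical layout: program at 100, lower preset at 10, TESTCOM at 5000.
c = ToyConsolidator({"PR": 100, "LP": 10})
c.set_ta("PR", 0)
c.set_ta("LP", 0)
c.bases["R2"] = 5000          # R2 = TESTCOM
c.write(0, "R2")              # PDATA holds the address of DATA
c.set_ta("PR", 0)
c.write(2097152, "LP")        # LDX 1 PDATA
c.write(4096)                 # LDX 0 0(1)
c.write(7208960)              # CALL 3 0, target filled in later
c.write(5177348, "PR")        # BVSR *+1
c.write(5144576)              # EXIT 2 0
c.set_ta("PR", 2)             # back to the CALL instruction

for address in sorted(c.store):
    print(address, c.store[address])
```

With these made-up bases the transfer address ends up at 102, the word holding "CALL 3 0", ready for the next record to patch it.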
    SEGMENT DATA RECORD
    R2 = EXTERN
    IF EBM NEXT OP IS REPLACED
    NOOP
    MASK = 32767
    [TA] = 0+R2 ; TA = TA + 1
    NOOP
    END
R2 is pointed at the external routine, the address of which is written to the call instruction. (Because the MASK is set before the write, the rest of the instruction, "CALL 3", is left untouched.) The special "IF EBM NEXT OP IS REPLACED" instruction to the consolidator will force creation of a replaced branch in EBM mode. (A "replacer" will be created to hold the address of EXTERN and the branch instruction will point at this replacer instead of directly at EXTERN.)
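A masked write of this kind can be sketched as follows. This assumes the consolidator replaces only the bits selected by the mask and keeps the others, which is how the description above reads; the function name and the address given to EXTERN are hypothetical.

```python
WORD_MASK = 0o77777777  # an ICL 1900 word is 24 bits

def masked_write(old: int, value: int, mask: int) -> int:
    """Replace the bits of `old` selected by `mask` with those of `value`."""
    return (old & ~mask & WORD_MASK) | (value & mask)

call_3_0 = 7208960       # "CALL 3 0" as first planted by PLAN
extern_address = 0o1234  # made-up address assigned to EXTERN
patched = masked_write(call_3_0, extern_address, 32767)

print(patched)
```

The top 9 bits of the word (the "CALL 3" part) are unchanged, while the low 15 bits now hold the address.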
Why is this strange backtracking done? Remember that PLAN is a one pass compiler and has no special declaration for external symbols. When PLAN compiles the "CALL 3 EXTERN" it doesn't know whether EXTERN is a forward branch or an external reference, so it just assembles "CALL 3 0". Later on it will either ask the consolidator to fill in an external reference, as in this example, or, if it finds the missing symbol in the same segment, it will ask the consolidator to fill in an internal branch, for example:
    TA=PR+2
    IF EBM NEXT OP IS RELATIVE
    MASK=32767
    [TA] = xxx+PR; TA = TA+1

Where "xxx" is the location in the current segment of the forward label.
The semicompiled continues:
    SEGMENT DATA TERMINATOR RECORD
    LP = 1
    SEGMENT LEADER TITLE RECORD
    #PROGRAM /TESTSEG DBM, EBM 15AM, 22AM
    SEGMENT LEADER CUE RECORD
    CUE TYPE 41 VAL 5 DATA 0 NAME TESTSEG
    SEGMENT LEADER CUE RECORD
    CUE TYPE 3 VAL 2 DATA 0 NAME TESTCOM
    SEGMENT LEADER CUE RECORD
    CUE TYPE 0 VAL 0 DATA 0 NAME EXTERN
    SEGMENT LEADER TERMINATOR RECORD
    LP = 1
Finally a segment leader is written that shows the external references used by or contained in the segment. Yes, the leader comes after the data.
(This is the output from the READSEMI program I wrote to investigate the format of ICL semicompiled; it is available on the disassembler tape.)
    10.46.03← NEWCOPYIN (5250004),TRAPOPEN
    ← NEWCOPYINPAR
    ← ****
    ...
    10.46.04← NEWCOPYIN (5250004),TRAPOPEN,*CR NEWCOPYINPAR
    ...
    10.46.05←

The compiled disassembler is on the tape as PROGRAM DISS. Run it by:
    LO PROGRAM DISS
    AS *LP
    AS *FW,OUTPUT(GRAP)
    AS *DA,SEMICOMPILED
    EN

Unintelligible debugging output is produced by
    EN ,PARAM(DEBUG)

You can compile it by:
    RATFOR *CR DISASSEMBLE(/RAT),BIN,SEMI :LIB.SUBGROUPGFIO

Assuming you have ratfor and gfio installed.