Hi Oliver,

At 2024-06-12T22:12:34+0200, Oliver Corff via wrote:
> Absolutely reasonable.
[...]
> Do you know whether your scan is scaled 1:1? In this case only, direct
> measures could be taken from the image, assuming that the paper size
> is letter.

I attached the scan I used to my earlier email.  I think a quick glance
is enough to reveal that such hopes are much too high for this document.
The pages aren't even scanned _straight_.  This made it tedious to
repair the OCR generated from it.  The OCR engine appears to have
imposed resolutely horizontal baselines on the page images and quantized
the word position to them, which scrambled the word order with high
reliability.  I've attached the OCR text output; you may find it
amusing.  (There _was_ an "unskew" option in the OCR UI.  I clicked it.
It seems to have done little or nothing.)

The pages also appear not to be cropped consistently, which annoys me
for another reason: that makes it impossible for me to judge the sizes
of the page margins used by the formatter/macro package.

With these problems I think the geometry of the page scans is pretty far
from a rectangle with a consistent aspect ratio.

I think it's more likely than not that the paper format was U.S. letter.
If this had been a journal article, that bet would be off.

> Which, in return, makes visual identity an ideal tool to check for
> undiscovered glitches which have the potential to cause inconsistent
> line breaks.
>
> When working in a negotiation team quite a few years ago, we would
> take pages of claimed-to-be identical copies or transcripts of text,
> superimpose them and check them against the light of a strong lamp for
> gray areas --- mismatches in print. This helped us discover a good few
> issues like altered digits etc., and the method was *much* faster than
> reading side by side.

Right.  I think I've mentioned this on the list before, but this is the
principle behind the blink comparator, the tool that helped Clyde
Tombaugh discover the dwarf planet Pluto.

And I do in fact use that technique in groff development (including
today while comparing nroff mode output for various memorandum types
between DWB 3.3 and groff).  Since I run terminals maximized to the
screen geometry anyway, such comparison is always a keyboard chord away.

I simply don't have any hope of applying the technique here.  Not unless
a much superior scan of an authenticated original document turns up.

Regards,
Branden

Bell Laboratories

Subject: A UNIX™ Operating System for the DEC VAX-11/780 Computer
Case-39394-21
date: July 7, 1978
from: Thomas B. London
      John F. Reiser
TM: 78-1353-4

MEMORANDUM FOR FILE

1. Introduction

The VAX-11/780 [1] is a new, general-purpose, stored-program electronic
digital computer manufactured by Digital Equipment Corporation.  At
minicomputer prices it provides addresses and data which are 32 bits
wide; the traditional minicomputer address space bound of 64K is gone.
This memorandum describes the VAX-11/780 and the implementation of a
UNIX operating system and complete user environment for it.  Section 2
contains an overview suitable for general consumption; details normally
of interest only to devotees of computer system architecture appear in
Section 3.  The authors comment on software portability in Section 4.

2. Overview

Environment.  A user of UNIX and C software on the PDP-11 will find that
the VAX-11/780 provides a very similar environment.  There are no
apparent differences in the command language or the vast majority of
programs which are customarily invoked directly from the shell.  A
casual user probably will not be able to distinguish the hardware,
except by issuing the command "who am i" (which identifies the hardware
and the current user) or by noting that one of the columns printed by
the process status command ps is in hexadecimal rather than octal.  The
C language programmer will find that int, long, and pointer data types
all occupy 4 bytes (a short still occupies 2 bytes), and that a long has
its two halves stored in a different order on the PDP-11 than on the
VAX-11.  Characters still suffer sign extension when converted to longer
integer types, but one may use the declaration unsigned char.

Hardware.  The VAX-11 is a follow-on computer to the PDP-11.  The
architecture seen by the user-mode assembly-language programmer of a
VAX-11 is "culturally compatible" with the PDP-11.  Specific details
differ, but a programmer familiar with the PDP-11 can quickly understand
the differences.  The VAX-11 provides UNIBUS and MASSBUS interfaces and
uses the same input/output peripheral devices as a PDP-11.

Significant new features of the VAX-11 include an extended virtual
address space, intelligent console, and dramatically improved physical
packaging.  The address space of a process is divided into a few
gigantic segments.  Each segment is further divided into a large number
of small pages.  Sufficient hardware exists to make demand paging a
viable memory management strategy.  All console functions are handled by
an LSI-11 microcomputer through a standard ASCII terminal.  The terminal
may be remotely located from the processor and can still halt, boot, or
diagnose the VAX-11.  The mechanical and physical design of the
VAX-11/780 is well done.  The processor contains no sliding drawers or
moving cables.  All parts are easily accessible for servicing.  Adequate
airflow is maintained even under maintenance conditions.

Configuration.  The actual configuration purchased by Department 1353
is:

    VAX-11/780 cpu
    0.5 megabytes memory with battery backup
    floating-point accelerator
    12Kbyte user-writeable control store
    UNIBUS adaptor with DZ11 (8 RS-232C lines)
    MASSBUS adaptor with TE16 tape drive (800/1600 bpi)
    MASSBUS adaptor with two RP06 disk spindles (176M bytes per spindle)
    additional BA11KE UNIBUS box

The list price of the above configuration in February 1978 was $241,255;
the price including a DEC discount to a Bell Labs purchaser was
$200,242.

Software.  We have implemented a UNIX operating system [2] and complete
user software environment on the VAX-11/780.  The operating system is
Research version 7 as of April 15, 1978.  The environment includes the
Bourne shell, C compiler, code improver c2, assembler, loader, debugger,
standard I/O subroutine library libS, C subroutine library libc, source
code control system SCCS, nroff/troff, and more than 130 commands.
Maintenance programs for file system checking, bootstrapping, and
physical disk pack handling have also been implemented.

We began with the C language code of Research version 7 of the UNIX
operating system, and a PDP-11/45 running UNIX as a bootstrap machine.
Creating a C compiler which produced VAX-11 native-mode assembly code
was the first task.  The code generator portion of the portable C
compiler was rewritten to do this.  An assembler and loader, based on
similar code for the Interdata 8/32, completed the basic support
software.  Existing PDP-11/70 device drivers for disk, tape, and
terminal communication lines were adapted to the VAX-11/780.  Assembly
language interfaces (trap handlers, hardware initialization, etc.) were
completely rewritten.  We then created magnetic tapes in the proper
format for an initial file system and for deadstart load, and physically
carried these tapes from the PDP-11/45 to the VAX-11/780.

Work on the C compiler began in mid-December 1977.  The hardware arrived
on March 3.  We held a party on May 19 to celebrate successful multiuser
operation of the system.

Performance.  Identical documents were formatted by nroff on our
VAX-11/780 and on a PDP-11/70 running Research version 7 UNIX; both
systems used RP06 disks.  Identical C programs were compiled and
assembled on the VAX-11/780 and on the PDP-11/70.  As reported by the
time command, the results (converted to seconds) were:

    nroff -ms -e -T450-12 ios.c >/dev/null

                                                        real    user     sys
    VAX-11/780                                          47.0    28.6     8.7
    PDP-11/70                                           54.0    36.9     7.9

    cc -c -O pftn.c

                                                        real    user     sys
    PDP-11/70 (Ritchie compiler)                        86.0    43.5    11.8
    VAX-11/780 (portable compiler)                      82.0    64.0    10.5
    PDP-11/70 (portable compiler for Interdata 8/32)   153.0   114.6    16.6

From the statistics on nroff one should conclude that, based on
user-mode CPU time, the VAX-11/780 can execute the code produced by the
VAX-11 C compiler approximately 22% faster than the PDP-11/70 can
execute the code produced by the PDP-11 C compiler.  This is a measure
of the combined power of the hardware and efficiency of the code
generated by the compiler.  Except as an upper limit, the figures give
no indication as to the throughput, response time, or efficiency of the
operating system.  The differences in real time and system time between
the VAX-11/780 and the PDP-11/70 are not significant.

The times given for compilation of the file pftn.c are an attempt at a
"black box" comparison of apples and oranges.  The black box is any
program (compiler) which takes C language input and produces executable
instructions.  The black-box comparison is that the current VAX-11 C
compiler running on the VAX-11/780 and compiling code for the VAX-11
requires 49% more user-mode CPU time than the current PDP-11 C compiler
running on the PDP-11/70 and compiling code for the PDP-11.  The apples
and oranges aspect arises because the two compilers, while equivalent
from the black box viewpoint, are (on the inside) totally different
pieces of software.  The PDP-11 compiler is a production compiler
written by D. M. Ritchie; the VAX-11 compiler is a portable compiler
based on work by S. C. Johnson.  The figures for the portable compiler
running on the PDP-11/70 and compiling for the Interdata 8/32 are
included for those who wish to compare two portable compilers.  We have
no VAX-11 equivalent to the Ritchie compiler, and thus cannot run the
tests which would enable comparison of two production compilers.

The loaded size in bytes of the operating system and seven other
programs appears in Table 1.  One should note the general similarity
between the text (instructions) sizes on the PDP-11 and on the VAX-11,
and between the bss (uninitialized data) sizes on the VAX-11 and on the
Interdata 8/32.  The particular PDP-11 UNIX system chosen has several
more input/output device drivers and experimental multiplexing software
not in the VAX-11 system, which accounts for its larger text size.  If
many global integer variables (or large arrays) are used, there is a
tendency for the data and bss portions to double in size when going from
a PDP-11 to a VAX-11 or an Interdata 8/32 because an int occupies two
bytes on the PDP-11 and four bytes on the other machines.  However,
character arrays occupy the same amount of space on all machines.  An
unusually large number of references to global variables in the nroff
program accounts for its increase in text size on the VAX-11 compared
with the PDP-11.  A program can be written to automatically change the
addressing modes used in the VAX-11 code so that most references to
global data become shorter than at present, but this has not been done.

Evaluation.  We believe that the VAX-11/780 provides an excellent
hardware environment for running UNIX and C software.  With the software
in its current state, we view the system as operationally equivalent to
a PDP-11/70 running UNIX software, except that the 64K limit on process
address space is gone and programs run faster.  We believe that the
advanced memory management and user/system communication capabilities of
the VAX-11/780 offer an opportunity to construct future UNIX-like
systems with substantially higher throughput than provided by today's
UNIX on a PDP-11/70.

3. Details

Hardware

Four main subsystems -- the central processor, console, main memory, and
input/output -- constitute the VAX-11/780 computer system.  The central
processor, memory, and input/output subsystems are connected by the
Synchronous Backplane Interconnect (SBI), an internal synchronous bus
with a maximum data throughput of 13.3 megabytes per second.  The SBI
deals in physical addresses which are 30 bits wide.  Half of the SBI
address space is reserved for memory addresses, and half for
input/output device registers.  Arbitration for bus cycles on the SBI is
distributed; each subsystem decides if it will use the next bus cycle.

The central processor is a microprogrammed 32-bit general-register
computer.  The architecture seen by the user-mode assembly-language
programmer is "culturally compatible" with the PDP-11; an expert
programmer familiar with the PDP-11 can learn and understand the
differences in one day or less.  The processor handles binary integers
of 8, 16, and 32 bits, single precision (32 bit) and double precision
(64 bit) floating-point numbers, character strings up to 65535 bytes
long, bit fields up to 32 bits wide, and IBM-style packed decimal
strings up to 31 digits long.  Bit fields have no alignment restrictions
whatsoever; all other data types require alignment only to a byte (8
bit) boundary.  The central processor provides sixteen 32-bit general
registers.  Register 15 is the program counter pc.  Software operating
in one of the privileged access modes (see below) must use register 14
as a stack pointer sp.  The instructions which implement high-level
procedure call and return (pushl, calls, callg, ret) assume a convention
about the use of sp, register 13 (fp, the frame pointer) and register 12
(ap, the argument pointer).  The instructions which handle character and
packed decimal strings use registers 0 through 5 to hold pointers and
counters, so as to be interruptible.  Floating-point operations may use
the general registers; there are no separate floating-point registers.
Instructions take from zero to six operands.  The operation code
occupies one byte and is followed by the operands, which require from
one to nine bytes each.  Nine addressing modes (including all the PDP-11
modes except *-(r)) are allowed, and the addressing modes are
independent of the operation code.  When the central processor is
executing in the context of a process, there are four access privilege
modes (user, supervisor, executive, kernel), each with its own stack
pointer; software which desires a per-process kernel stack is easy to
implement.  A fifth stack pointer is used when executing in a special
system-wide interrupt context.  The VAX-11/780 processor includes an
eight kilobyte, two-way set associative, write-through, memory data
cache; an eight-byte instruction stream buffer; and a 128-address
virtual address translation buffer.  Most of the processor is
implemented in Schottky TTL MSI logic.  A programmable realtime clock
and a time-of-year clock (battery operated during loss of line voltage)
are standard equipment.  Options include a hardwired floating-point
accelerator and user-writeable control store.

The console subsystem consists of an LSI-11 computer, local memory,
floppy disk, DECwriter terminal, and remote-access communications port.
The console is connected directly to the central processor and performs
all the functions of a conventional "lights and switches" front panel.
The floppy disk serves as the initial bootstrap device for normal
operation and holds special microcode for diagnostic operation.  When
activated by a key switch on the central processor, the remote-access
port becomes the console.  A terminal connected through the
remote-access port can halt the central processor, boot it, diagnose it,
etc.

The virtual address space of a process running on the VAX-11/780
consists of 2**32 8-bit bytes.  The two high-order bits of a 32-bit
address determine one of four segments.  Two of these segments are
system segments common to the address space of all processes.  One of
the system segments is reserved for future use.  The other two segments
are separately defined for each process and are automatically managed by
the context switching instructions.  One of the per-process segments is
designed for a stack which grows towards lower-numbered memory
addresses.  Segments are divided into pages of 512 bytes.  Memory
mapping hardware translates virtual addresses into physical addresses
using page tables.  A page table contains one four-byte entry for each
page mapped; the entry contains a valid bit, a four-bit field which
encodes access privileges, a modify bit, and the physical page-frame
number where the page is mapped.  (There is no reference bit which is
maintained by hardware!)  A base register and a limit register describe
the page table of each segment.  The base register of a per-process
segment contains a virtual address within the system segment; the base
register for the system segment contains a physical memory address.  The
VAX-11/780 central processor contains a virtual address translation
buffer holding 128 virtual address-page frame number pairs which
eliminates the need for extra memory references during address
translation for (typically) 98% of all memory references.

The memory is implemented using MOS semiconductor RAMs with an error
correcting code which corrects all single-bit errors and detects all
double-bit errors and 70% of all greater-than-double bit errors.  A
memory controller can handle 8 memory boards; using 4K chips each board
can hold 128K bytes.  There can be two memory controllers, thus the
maximum amount of physical memory is currently 2 megabytes.  When 16K
chips are used (forecasted for late 1978), each board will hold 512K,
and physical memory can be 8 megabytes.  There is a battery backup
option for maintaining data in the event of a power failure.  Each
optional battery will maintain 1 megabyte for 10 minutes.

The input/output subsystem consists of UNIBUS adaptors and MASSBUS
adaptors.  A UNIBUS adaptor (UBA) is an interface between a standard
UNIBUS and the SBI.  The UBA does the bus arbitration and everything
else necessary to administer the UNIBUS.  It also contains a set of
registers for mapping UNIBUS addresses to and from SBI addresses.  The
maximum throughput on a UBA is 1.5 megabytes per second.  A MASSBUS
adaptor (MBA) is an interface between the SBI and MASSBUS devices (RP06
disk, TE16 tape, etc.).  An MBA would be more properly called an RH-780
controller, analogous to the RH-11 controller on a PDP-11/70 MASSBUS;
only one unit may transfer data at a time, although several similar
units connected to the same MBA can execute control functions
simultaneously.  The MBA contains the device control registers normally
found in an RH controller.  The registers lie in the I/O section of SBI
addresses.  An MBA also contains a set of mapping registers which
translate device byte addresses to and from SBI addresses.  The maximum
throughput on a MBA is 2.0 megabytes per second.  The published limits
are 1 UBA and 4 MBAs per system.  Theoretically one could have any
number of either kind as long as the sum of the number of central
processors, memory controllers, MBAs, and twice the number of UBAs were
15 or less, since the SBI has 15 "ports".

The physical packaging of the system has been dramatically improved
compared with the PDP-11.  The VAX-11/780 processor cabinet contains no
drawers or moving cables.  The SBI is fixed and rigid.  Three one-third
horsepower squirrel-cage blowers provide sufficient air flow even while
servicing the CPU.  Any logic card, power supply, or blower can be
replaced within twenty minutes by one person using only a screwdriver.
The CPU stands 1.53m x 1.17m x 0.77m (HWD); cabinets housing the CPU,
UNIBUS devices, and tape drive are usually bolted together to form a
single unit 1.53m x 2.51m x 0.77m.  Our configuration (see section 2)
weighs 3452 pounds and requires 42050 BTU/hr cooling.

C Compiler

A VAX-11 "native mode" C compiler was constructed using S. C. Johnson's
portable compiler as a base.  After one month, a reasonable version
began to evolve: it produced code which was good enough to exercise the
assembler, loader, and debugger (on the bootstrap PDP-11/45).  This
initial version did not make use of VAX-11 indexed addressing (which
does single-level array subscripting including appropriate index
shifts), bit field instructions, or autoincrement/decrement addressing.
It contained its share of bugs, particularly since the hardware had not
arrived and could not be used to actually run the generated code.

Substantial effort has been subsequently directed towards improving all
aspects of the compiler: bugs have been corrected, routines have been
made to execute more efficiently, and the quality of the generated code
has been improved.  All addressing modes are supported, bit-field
instructions are used for programmer-defined bit fields, and
autoincrement and autodecrement addressing as well as three-address
instructions are used.

Overall, our experience with the compiler has been very favorable.  When
the VAX-11/780 was delivered, the compiler worked well enough to compile
itself, the UNIX kernel, and many user-level commands.  In fact, since
the delivery of the machine, only about a half-dozen serious bugs have
been detected.  Additionally, the framework of the compiler has proven
itself to be flexible: a compiler for the Interdata 8/32 was transformed
into a compiler for the VAX-11/780, some improvements and extensions
were easily added, and, in general, a quickly evolving compiler has
remained stable and productive.  The authors feel that, with few
extensions to the model of the compiler and a certain amount of tuning,
the current VAX-11 compiler could easily remain as the production VAX-11
compiler.

There are still some deficiencies in the current version of the
compiler, as well as in the basic "product" itself.  The compiler is
slow and quite large; see the statistics in section 2 and Table 1.  Some
of the blame for the size and lethargy of the first pass can be
attributed to the use of lex for the scanner and yacc for the parser,
and to the use of ASCII to communicate information between passes.  Both
lex and yacc produce large routines: the scanner is 17K bytes in length
(over 4.5K bytes of instructions), and the parser is 16K bytes long
(over 5.5K bytes of instructions).  On the average, the first pass
spends 20% of its time in the lexical scanner yylook, and 9% of its time
in the parser yyparse.  Using ASCII to communicate between the two
passes causes an additional speed penalty for character conversion.  On
typical programs, the first pass (parser) spends roughly 30% of its time
performing output services (i.e., calls to _doprnt (18%), _strout (8%),
and printf (4%)), while the second pass (code generator) spends roughly
21% of its time reading it back in (i.e., calls to read (18%) and rdin
(3%)).  (Additionally, the routine used to convert from ASCII to binary
contained a bug which caused -2147483648 (which is -(2**31)) to be read
as zero on our PDP-11/45.)

The above problems are not inherent to the compiler model.  To speed up
compilation, the scanner can be hand-coded (as in the standard PDP-11
compiler), and the interpass data can be formatted in binary (or the two
passes can be combined).  With these simple modifications (some are
already in progress), it should be possible to produce a compiler almost
twice as fast as the current one.

Two features of the VAX-11 architecture -- three-address instructions
and indexed addressing mode -- were difficult to model within the basic
structure of the compiler.  The full implementation of three-address
instructions proved to be so difficult that it was not really attempted.
Instead, c2, the assembly language code improver, tries to merge several
instructions into an appropriate three-address instruction.  For
example, the statement a = b+c compiles into

    addl3   _b,_c,r0
    movl    r0,_a

which the improver can change to:

    addl3   _b,_c,_a

for a savings of three bytes and over 400 nanoseconds.  However, c2 will
not always succeed in this shortening.  It cannot tell the difference
between

    a = b + c;
    return;

and

    return(a = b + c);

since register r0 must be considered "live" (i.e., contains a value
which may be required later) across the return statement.

The VAX-11 has six indexed addressing modes which yield the address of
an element of a one-dimensional array of a base type (char, short, int,
long, pointer, float, or double).  The statement

    a[i] = b[j] * c[k];

where i, j, and k are declared register int and a, b, and c are double
arrays (either external or local), can be compiled into the single
instruction:

    muld3   b[j],c[k],a[i]

Although the index specifier (e.g. i in the above example) must be a
register, the base address specifier can be any addressing mode except
register, literal, or another indexed mode.  For example, the C-language
constructs a[i], (*p)[i], (--p)[i], (*p++)[i], and (p++)[i] (or their
equivalents *(a+i), *(*p+i), *(--p+i), *(*p++ +i), and *(p++ +i),
respectively) all can be done with a single VAX-11 address (where a is
an array of base type, p is a pointer to the same type, and i is of type
register int).  It is usually difficult to recognize or conveniently
represent such constructs (e.g., (*p++)[i] is fun), or generate the
possible cases (e.g., a[i] where a is not readily addressable).

The fact that the code generator can easily recognize only expression
trees of height one (two if OREG and UNARY MUL nodes are taken into
account) causes substantial difficulty in making use of indexed mode,
three address instructions, and indirect addressing.  Expression trees
of non-trivial height occur not infrequently (e.g. as a worst case, the
statement

    a = b + (*p++)[i];

has an expression tree of height six, but can be compiled into the
single instruction

    addl3   _b,*(p)+[i],_a

if p and i are register variables).  The complexity of the code
generator is raised by forcing the compression of subtrees into single
nodes which are then treated with special checks, special code, etc.

The size and alignment attributes of data objects are logically
independent, even though previous hardware architectures (IBM 360,
PDP-11, Interdata 8/32, ...) have imposed alignment restrictions based
on size.  The VAX-11/780 has no such restrictions, although programs run
faster with data aligned on natural boundaries.  The C language has
little notion of alignment; because of run-time penalties, the VAX-11 C
compiler aligns all the basic data types on address boundaries which are
a multiple of sizeof the basic type.  Due to questions about alignment,
both the language and the compiler have difficulty with the declaration
char c:10;.

The decision to naturally align most data items has undesirable side
effects which cannot be ignored.  Consider the structure declaration

    struct foo {
        char c;
        float f;
    } bar;

On the PDP-11, sizeof (foo) is 6 bytes while on the VAX-11, sizeof (foo)
is currently 8 bytes (the offset of f within bar is 2 and 4
respectively).  sizeof (foo) could be 5 bytes in each case.  Although
both machines use the same data formats for chars and floats, the
differing alignment imposed by the VAX-11 C compiler means that the two
machines cannot speak directly to one another using media which record
structures containing binary information.  Since alignment is important,
we feel that it ought to be specifiable in the C language.

Operating system conversion

A UNIX system running on a PDP-11/45 was used as the base for
transporting software to the VAX-11/780.  The software itself originated
with the code produced by members of Center 127, Computing Science
Research, for the Interdata 8/32.  Programs were cross-compiled,
assembled, loaded, and put on magnetic tape in tp format; absolute
bit-string files were put on tape in dd format.  Tapes were then carried
across the room to the VAX-11/780.  An absolute tape boot (in machine
language), tp boot and primary disk boot (in assembly language),
secondary disk boot (in C), and stand-alone utilities (disk formatter,
disk verifier, tape-to-disk, disk-to-tape, disk-to-disk, and
disk-to-console, all in C) were then used to bring up the system.

Establishing an initial file system on the disk took longer than
expected.  The PDP-11/45 was running USG issue 3 of the UNIX operating
system with a "16-bit" file system and the VAX-11/780 was to have a
Research version 7 "32-bit" file system.  Also, C-language code on the
VAX-11 expects the bytes of a 32-bit integer to be stored in a different
order than C-language code on the PDP-11.  We swallowed these two red
herrings hard, and suffered.  We now know that the proper way to create
an initial file system is to modify the program mkfs so that its output
(on the bootstrap machine) is a file containing the proper bits, put
that file on tape, and use the tape-to-disk utility on the target
machine.

Mapping the software architecture of the UNIX operating system onto the
hardware architecture of the VAX-11 required a number of decisions.
Commentary on these decisions follows.

The SCB (system context base) processor register contains a page-aligned
physical memory address which is the base of the hardware interrupt
vector.  The UNIX system puts this vector at physical memory address
zero.  Operating system code, data, kernel stacks, and interrupt stack
occupy the VAX-11/780 system segment (virtual addresses 80000000 to
bfffffff).  User code and data are loaded into segment zero (0 to
3fffffff) and the user stack is initialized in segment one (7fffffff to
40000000).

User processes pass arguments to system service code using the ordinary
calls subroutine calling sequence.  The chmk instruction is then used to
gain kernel privileges.  The chmk instruction switches the stack pointer
sp from the user stack to the kernel stack, but does not change the
argument pointer ap or the frame pointer fp.  The kernel uses the value
in ap to copy the arguments into u.u_arg.  The VAX-11 hardware allows
the values to be directly addressed, but the kernel software requires
the copy.

The u area is a per-process data structure in which the operating system
keeps swappable information about a process.  The kernel virtual address
of the u area must be a constant across all processes.  The PDP-11
implementation puts the u area at kernel address 0160000; when process
switching occurs the u area is switched by changing a kernel data space
segmentation register.  Since the operating system can address user
memory on a VAX-11, the u area could be placed in (protected) user
memory, say at address 0 or at 7fffe000.  However, it was desirable for
the first implementation to make the page tables for user segments part
of the u area, which creates timing problems unless the u area lies in
system space.  The base of the u area was assigned kernel virtual
address 80020000.  When process switching occurs, the u area is changed
by changing the system-space page table and invalidating the page-table
translation cache for the appropriate pages.

Since the operating system can directly address the memory of the
current user process, the procedures fubyte, subyte, fuword, etc., are
unnecessary and could be made into macros which would merely do the
appropriate load or store.  However, these procedures (along with copyin
and copyout) were kept to ensure that each access to user space is
valid.

A VAX-11/780 internal processor register called the PCB (process context
base) points to an area in which the VAX-11/780 saves the hardware state
of the machine (96 bytes) when switching context.  This save area was
put in the u area as u_rsav.

The implementation of context switching required major effort.  The
VAX-11 has two very nice instructions (svpctx, save process context, and
ldpctx, load process context) which facilitate context switching.
Unfortunately, they do not implement the mechanism which the UNIX system
expects.  (The mechanism used by UNIX is so dispersed and intricately
detailed that it is hard to imagine any hardware which implements it
directly.)  The temptation to drastically change the UNIX code has been
resisted so far.  The savu/retu/aretu tar pit was VAX-inated, but it
took more than a week.  The newer save/restore primitive does make the
C-language code prettier, but the assembly-language side (at least for
the VAX-11) is just as dirty as ever.

The
UNIX
context
switching
mechanism
requires
three state save
areas,
W.u_rsav,
also used for abnormal returns. The
u.u_ssav, and u.u_qsav because the seme mechanism is
of the
ctions use only a single state save area. To make use
VAX-11 context ‘switching instru
deal of microcode and bastardizes call
VAX-11 instructions, the software simulates a great
is certainly high on the list of things to
frames in a most ugly manner. Context switching
the PDP-11!).
rewrite in the second implementation (even for
The procedures sureg and estabur were also tricky to implement. They were
designed with the assumption that only a small number (16 or fewer) of
registers would be needed to map the address space of a user process,
while on the VAX-11 a 32K process requires 64 page table entries.
Furthermore, the memory map of a process is diddled in tricky ways,
particularly in expand and getxfile.

Handling DMA I/O hardware was the other major implementation bottleneck.
The UBA and MBA mapping registers contain physical memory page numbers,
and physical addresses are hard to handle. It is not pleasant to deal with
the hardware which implements the mapping registers. If an I/O transfer is
in progress then the mapping registers may be neither read nor written;
this applies even to registers which would not be used by the transfer. As
a result, the map for the next I/O operation cannot be set up during the
current I/O operation. Furthermore, a single transfer is limited to 64K
bytes because the byte counter is only 16 bits wide. Thus swapping a
process to the disk can require multiple I/O operations. The solution to
these problems involved permanently reserving the last 129 registers in
each map to service both swap and physical I/O operations. The remaining
map registers are available to map the system buffers, and are loaded at
system initialization time. Disk ECC error correction is currently done
only for I/O involving the system buffers. Disk errors on raw I/O cause
process termination; the swap area on disk had better be error-free.

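The arithmetic behind the two limits above can be sketched as follows.
This is an illustration, not code from the system. The 512-byte page is
the VAX-11 hardware page size; the function names, the 65536-byte transfer
limit, and the reading of "129" as 128 full pages plus one extra register
for a transfer that does not begin on a page boundary are assumptions.

```c
#include <assert.h>

#define PAGE_SIZE 512u      /* VAX-11 hardware page size in bytes */
#define MAX_XFER  65536u    /* assumed per-transfer limit ("64K bytes") */

/* Map registers needed for a transfer of len bytes whose first byte
 * lies at byte offset off within its page (0 <= off < PAGE_SIZE). */
static unsigned map_regs_needed(unsigned off, unsigned len)
{
    assert(off < PAGE_SIZE);
    return (off + len + PAGE_SIZE - 1) / PAGE_SIZE;
}

/* Number of separate I/O operations needed to move nbytes when each
 * operation can move at most MAX_XFER bytes. */
static unsigned transfers_needed(unsigned long nbytes)
{
    return (unsigned)((nbytes + MAX_XFER - 1) / MAX_XFER);
}
```

Under these assumptions a full-length transfer that starts in the middle
of a page needs 129 map registers, and a 192K-byte process image needs
three transfers.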
Like the UNIX system for the PDP-11, the current implementation for the
VAX-11/780 maintains each process in contiguous physical memory and swaps
processes to disk when there is not enough physical memory to contain them
all. Reducing external memory fragmentation to zero by utilizing the
VAX-11/780 memory mapping hardware for scatter loading is high on the list
of things to do in the second implementation pass. To simplify kernel
memory allocation, the size of the user-segment memory map is an assembly
parameter which currently allows three pages of page table or 192K bytes
total for text, data, and stack. This also deserves to be rewritten, both
to allow varying process size, and to allow processes larger than physical
memory through demand paging. Dynamic page table size would mean dynamic
u area size if the page table remained part of the u area.

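The page-table figures quoted in this section follow from the VAX-11 page
size of 512 bytes and a page table entry of 4 bytes, so that one page of
page table holds 128 entries. A small sketch of the arithmetic (the
function names are illustrative and are not taken from the kernel):

```c
#define PAGE_SIZE 512ul   /* VAX-11 page size in bytes */
#define PTE_SIZE  4ul     /* size of one page table entry in bytes */

/* Page table entries needed to map a process of nbytes bytes. */
static unsigned long ptes_for(unsigned long nbytes)
{
    return (nbytes + PAGE_SIZE - 1) / PAGE_SIZE;
}

/* Bytes of address space mapped by npages pages of page table. */
static unsigned long bytes_mapped(unsigned long npages)
{
    return npages * (PAGE_SIZE / PTE_SIZE) * PAGE_SIZE;
}
```

A 32K process thus needs 64 entries, and three pages of page table map
3 x 128 x 512 = 192K bytes.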
The code in sendsig for sending a signal to a process involves a tedious
simulation of the calls instruction due to the problem of "inward return"
across privilege modes upon termination of the routine which handles the
signal. Making a portion of the kernel code readable by a user-mode
process would simplify sendsig. Motivated by a problem with the Bourne
shell, the signal number is passed as a parameter to the signalled
routine.

Interprocess communication via signals (signal and kill) uses the
low-order bit of a machine address for something other than addressing.
This implies that a procedure which handles signals must start on an even
byte boundary, which means that every procedure must start on an even byte
boundary. The C compiler thus issues a pseudo-op to the assembler to align
the beginning of each procedure. This can waste memory on a VAX-11. It
also imposes a nontrivial requirement on the assembler, since if the
resolution of conditional jump instructions can change the parity of the
length of a procedure then the alignment directive must also be handled
like a conditional jump. In hindsight, it would have been better if a
distinct value (say +1 or -1) were used for ignore, rather than
multiplexing the bottom bit.

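The alignment constraint can be illustrated with a small model. It is only
a sketch: addresses are modeled as plain integers, and SIG_DFL_ADDR,
SIG_IGN_ADDR, and the helper functions are hypothetical names, not the
kernel's code. The one fact taken from the text is that the low-order bit
of the stored value is what marks "ignore".

```c
#include <stdint.h>

#define SIG_DFL_ADDR ((uintptr_t)0)  /* hypothetical: default action */
#define SIG_IGN_ADDR ((uintptr_t)1)  /* hypothetical: ignore the signal */

/* The low-order bit of the stored "address" means "ignore". */
static int is_ignored(uintptr_t action)
{
    return (action & 1) != 0;
}

/* A handler whose entry point is odd would be mistaken for "ignore",
 * so every procedure that might handle a signal must start on an
 * even byte boundary. */
static int is_safe_handler(uintptr_t entry)
{
    return entry != SIG_DFL_ADDR && !is_ignored(entry);
}

/* Round an address up to the next even boundary, which is what the
 * alignment pseudo-op has to achieve for the start of a procedure. */
static uintptr_t align_even(uintptr_t addr)
{
    return (addr + 1) & ~(uintptr_t)1;
}
```

In this model a handler at address 0x2001 is indistinguishable from
"ignore", while the same procedure aligned to 0x2002 is handled correctly.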
The VAX-11/780 provides a (non-maskable) trap for integer division by
zero. The system would like to turn this into a signal to the process. A
similar situation exists for subscript range trap. Integer overflow,
floating overflow, floating underflow, and reserved operand also need
signal numbers. Perhaps only one "error" signal is needed with some other
means for determining the true fault. The whole business of interrupts,
signals, asynchronous I/O, and the use of the hardware AST mechanism
deserves more attention.

A bug was discovered in the UNIX code for process termination involving
the proc and xproc structures. (The problem also existed on the PDP-11,
but it would only be noticed if a process had accumulated more than 65535
ticks of system time, which is highly unlikely.) When a process dies its
resource utilization statistics (currently only exit status, system, and
process CPU time) are temporarily saved so that they can be added to the
totals for the descendents of the parent process. The actual accumulation
is done by the kernel when the parent process issues a wait system call;
the child process is then completely erased. The kernel was overlaying the
statistics in a part of the proc structure normally used by the scheduler
to contain the pointer p_textp. Ordinarily the exit was processed
immediately, causing no harm. But if the system was loaded so that
swapping was necessary, then the scheduler could sneak in after the child
exited and before the parent read the statistics, and would interpret the
timing data in the zombie xproc structure as a pointer. This invariably
caused an illegal memory reference from kernel mode on the VAX-11/780.

One of the greatest disappointments with the current system stems from a
design quirk in the FP-11 floating-point processor for the PDP-11. When
converting between floating-point and 32-bit integer, the FP-11 expects
the high-order 16 bits of the integer to be stored at the lower memory
address; this is not in line with the general "right to left" design of
the PDP-11, which would place the low-order 16 bits in the lower memory
address. C code for the PDP-11 uses the FP-11 convention for storing long
integers. The VAX-11 hardware stores the least significant bit of any
integer data type in the lowest addressed byte. C code for the VAX-11 uses
the hardware convention. This means that files containing long integers
represented in the local convention are not binary compatible between a
UNIX system on the VAX-11 and a UNIX system on the PDP-11. This is the
only exception for data types common to both machines: char, short, float,
and double all have a common representation. Except for this (and the
structure alignment problem noted earlier), disk packs containing 32-bit
file systems, tapes, etc., would have been interchangeable. The fact that
DEC's Fortran-IV Plus for the PDP-11 avoided the FP-11 convention, and
that RSX-11 files are binary compatible between the VAX-11 and the PDP-11,
is only salt on an open wound!

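The two layouts can be compared byte by byte. The following sketch is
illustrative only: the function names are hypothetical, and memory is
modeled as a byte array so that the result does not depend on the byte
order of the machine running the example. It assumes that each 16-bit
word is stored low byte first on both machines.

```c
#include <stdint.h>

/* Store v the way PDP-11 C does (FP-11 convention): high-order 16-bit
 * word at the lower address, each word stored low byte first. */
static void store_long_pdp11(uint8_t out[4], uint32_t v)
{
    uint16_t hi = (uint16_t)(v >> 16), lo = (uint16_t)(v & 0xFFFF);
    out[0] = (uint8_t)(hi & 0xFF);
    out[1] = (uint8_t)(hi >> 8);
    out[2] = (uint8_t)(lo & 0xFF);
    out[3] = (uint8_t)(lo >> 8);
}

/* Store v the way the VAX-11 hardware does: least significant byte at
 * the lowest address. */
static void store_long_vax(uint8_t out[4], uint32_t v)
{
    out[0] = (uint8_t)(v & 0xFF);
    out[1] = (uint8_t)((v >> 8) & 0xFF);
    out[2] = (uint8_t)((v >> 16) & 0xFF);
    out[3] = (uint8_t)(v >> 24);
}

/* Convert a stored long between the two layouts by exchanging its two
 * 16-bit halves; the same operation works in both directions. */
static void swap_halves(uint8_t b[4])
{
    uint8_t t0 = b[0], t1 = b[1];
    b[0] = b[2]; b[1] = b[3];
    b[2] = t0;   b[3] = t1;
}
```

For the value 0x01020304 the PDP-11 layout is the byte sequence
02 01 04 03 and the VAX-11 layout is 04 03 02 01, so a file of long
integers written on one machine needs its halves exchanged before it can
be read on the other.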
Subroutine libraries

libc. Conversion of the system-call interface routines was straightforward
but tedious. Most routines are merely

            .word   0x0000
            chmk    $nn
            bcc     L1
            jmp     cerror
    L1:     ret

The routines printf, ecvt, and fcvt were left to libS and were not
implemented in libc.

libS. Conversion of the standard input/output library libS posed no
problems except for __doprnt, the routine which constructs character
representations of other datatypes for the printing routines printf,
fprintf, and sprintf. Since many programs spend 15% to 20% of their
execution time within __doprnt, it pays to code the routine for speed in
assembly language. Packed-decimal instructions handle decimal, unsigned,
and floating-point conversions. The algorithm chosen for converting from
floating-point to character string revealed a microcode bug in the
VAX-11/780's ashp (arithmetic shift and round packed) instruction. Under
certain conditions a carry from the rounded digit propagated both to the
adjacent digit and to the digit eight places further left. This usually
caused an overflow, since the destination packed-decimal string was
typically not long enough to represent the spurious carry. DEC claims to
have a fix for the bug, but the FCO has not arrived. In the meantime a
five-instruction patch detects and corrects the spurious overflow.

Commands

as, ld. Code developed by Center 127 for the Interdata 8/32 was the model
for an interpretation by a VAX-11/780 artist. The assembler uses an
algorithm described in [3] with heuristic improvement of [4] to resolve
conditional jump pseudoinstructions. Variable-length, unaligned
instructions and address constants forced the relocation information in
object files to include the explicit segment-relative address for each
relocatable datum, rather than deducing the address from a one-to-one
correspondence between the position in the segment and the corresponding
position in the relocation table. This caused a slight change in the
header information within object files.

c2. The code improver for the assembly language generated by the VAX-11 C
compiler is based on a similar program for the PDP-11. A "backwards"
register usage pass, performed once and before anything else, was a major
addition. Knowing that no temporary register is live across a backwards
jump, the register usage pass introduces three-address instructions
wherever possible. It also recognizes situations where jump on bit (jbc,
jbs, jlbc, jlbs), extract field (extzv, movzbl), and move address (moval,
movab, pushal, pushab) instructions can be used. The code for insertion of
fancy loop control instructions sob, aob, acb was also extended.

adb. The most significant change to the symbolic debugging routine was the
writing of a disassembler for VAX-11 native-mode instructions.
Additionally, the character input and output routines were modified to use
a default radix for all numeric values. The radix is initialized to
sixteen.

sh. The (Bourne) shell is the standard user command interpreter. It
required by far the largest conversion effort of any supposedly portable
program, for the simple reason that it is not portable. Critical portions
are coded in assembly language and had to be painstakingly rewritten. The
shell uses its own sbrk which is functionally different from the standard
routine in libc. The shell wants the routine which fields a signal to be
passed a parameter giving the number of the signal being caught; signal
was also a private routine. This was handled by having the operating
system provide the parameter in the first place, doing away with the
private code for signal. The code in fixargs (for constructing the
argument list to an exec system call) had to be diddled.

ps, iostat. The process and input/output status commands consistently
referenced /dev/mem (physical memory) when they should have referred to
/dev/kmem (kernel virtual memory). iostat also assumed that certain
variables maintained by the kernel were allocated contiguously, even
though they were not declared as part of a structure.

pr. The command which formats and prints files had a bug that caused a
division by zero when it was asked to print several files and the first
file in the list did not exist. On a PDP-11 division by zero returns the
dividend, but on a VAX-11 it gives an unmaskable trap.

cat, du. These two commands did not count their arguments using the first
parameter argc, but rather assumed that an additional argument
(argv[argc], initialized as -1) could be used as a pointer. On the PDP-11
the resulting address references the fixed end of the stack; on the
VAX-11, -1 is an illegal address.

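A portable way to walk the argument list relies only on argc. The sketch
below is illustrative (count_operands is a hypothetical name, not a
routine from cat or du); it never examines argv[argc] or anything beyond
it, so it does not matter what value the system stores there.

```c
#include <stddef.h>

/* Count the non-empty arguments after the program name, using argc as
 * the loop bound instead of looking for a sentinel in argv[argc]. */
static int count_operands(int argc, char *argv[])
{
    int i, n = 0;

    for (i = 1; i < argc; i++)
        if (argv[i] != NULL && argv[i][0] != '\0')
            n++;
    return n;
}
```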
nroff, troff. The source code for the document preparation and
phototypesetter commands is not portable; several weeks were required to
produce properly running versions of these commands. Use of the explicit
(or worse, implicit) constant "2" instead of sizeof(int) was quite common.
The code assumes that variables which are adjacent in external
declarations occupy contiguous memory at execution time. Several tables
are initialized by assembly-language programs. Converting the tables was
merely tedious; changing the code which thought it knew the format of an
a.out file required some effort. This memorandum was created using the
converted nroff/troff programs on the VAX-11/780.

SCCS. Version 4 of the Source Code Control System [5] is used to provide
version backup for software in case disastrous bugs are introduced. The
source for SCCS itself had not quite been converted to version 7 UNIX, and
the header files required some massaging. The PWB routines logname and
pexec had to be simulated. The utility procedures for dynamic storage
allocation required some work to integrate them with libS and to remove
PDP-11 dialect. The exit status of the diff command changed in version 7,
causing delta to bomb. The code implicitly assumed that all checksums were
computed modulo 65536. The documentation is incorrect: everywhere "99999"
appears it should really say "65535". The procedure satoi returns two
values, storing one of them indirectly through a pointer parameter.
Naturally, satoi and its callers did not agree on sizeof the stored value;
this took a day to track down.

4. Software portability

We thank the members of Center 127, Computing Science Research, for their
efforts in producing the basic software and for their recent efforts
towards making the software portable. The fact that people other than the
original developers can quickly create a running system for a new machine
is a tribute to how well the original work was done.

Yet in our effort to transport a complete UNIX system to the VAX-11/780 we
stumbled across a large number of nonportable constructions and were
dismayed by the seeming lack of appropriate facilities to detect and
prevent them. Based on our experience, we strongly recommend that the C
language and its compiler be enhanced so that:

1. The actual arguments in a procedure call are type checked against the
   procedure declaration, and a "dummy" declaration which specifies types
   is permitted even if the called procedure is not actually declared in
   the same compilation.

2. The '->' operator is checked to insure that the structure element on
   the right is a member of a structure to which the pointer on the left
   may point.

3. A structure element may be declared with any name as long as the name
   is unique within the immediately surrounding structure. (The current
   requirement that a structure element name must uniquely correspond to
   an offset from the beginning of the structure, across all structures in
   a compilation, creates naming problems and frequently leads to errors
   of the type noted in item 2 above.)

4. The issue of alignment to an even-byte (or other) boundary is brought
   into the open, so that arbitrary data structures can be accurately
   described.

There is a program called lint [6] which, if conscientiously used
throughout the life of a piece of software, provides type checking which
partially addresses the first two points in the above list. The problem is
that lint is big, noisy, relatively recent and unknown, and (partially as
a result) infrequently used. There is little incentive for the average
programmer to use lint as a matter of course. The authors believe that
type checking belongs in the everyday compiler as the default, where it is
very inexpensive to implement. Those who wish to do "dirty" work may
request that type checking be disabled; those who wish to bless their
dirty work may use type casts.

We believe that these four enhancements would go a long way towards making
C language software portable as a rule rather than as an exception, thus
preserving Bell Laboratories' investment in present and future C software.

Acknowledgments. Thank you, D. M. Ritchie and S. C. Johnson, for answering
questions at key moments; G. K. Swanson, for assistance with boot
procedures and stand-alone utilities; J. F. Jarvis, for the mathematical
function library; and D. K. Sharma, for help in bringing up user-level
commands. Additional thanks go to many other members of Centers 127 and
135, and Department 8234, for helpful comments and suggestions.

Thomas B. London
John F. Reiser

HO-1353-tbl/jfr

Att:
References
Table 1

References

1. VAX-11/780 Architecture Handbook. Digital Equipment Corporation,
   Maynard, Massachusetts, 1977.
2. D. M. Ritchie and K. Thompson, The UNIX Time-Sharing System, CACM 17, 7
   (July 1974), 365-375. See also BSTJ 57, 6 (July-August 1978),
   1905-1929.
3. W. Wulf, R. K. Johnsson, C. B. Weinstock, S. O. Hobbs, and C. M.
   Geschke, The Design of an Optimizing Compiler. American Elsevier, New
   York, 1975.
4. J. F. Reiser, Common Instances of Pathological Span-dependent
   Instructions, TM 78-1353-3.
5. SCCS/PWB User's Manual, The Source Code Control System.
6. S. C. Johnson, lint, a C Program Checker. Computing Science Technical
   Report #65, Bell Laboratories, December 1977.

Table 1. Loaded Program Sizes (in bytes)

[Table: columns Program, System, Text, Data, Bss, Total; rows for /unix,
C pass1, C pass2, grep, ls, nroff, and sort on the PDP-11, VAX-11, and
Interdata 8/32. The numeric entries cannot be matched to their rows in
this OCR output.]
