Talk:VAX


Lede

Guy Harris: Hi Guy. I want to quibble about the lede sentence. The most significant things about the VAX were its 32-bit architecture and virtual memory. RISC was still mostly a research project in the mid-1970s when the VAX was developed. While it could physically handle larger memory, the PDP-11 was limited to 16-bit addressing and that was becoming a major problem. (I got to argue with Gordon Bell about this; he said programmers should write modular code.) Competitors were starting to introduce larger address spaces, e.g. Interdata. VAX was somewhat late to the game. I think the lede should say something like "VAX is a 32-bit computer with virtual memory developed by the Digital Equipment Corporation (DEC) in the mid-1970s." and later "Like most computers of this era, the VAX was a complex instruction set computer (CISC). Its instruction set architecture (ISA) was designed for backwards compatibility..."--agr (talk) 13:21, 24 December 2021 (UTC)

@ArnoldReinhold: I think it needs to be stated explicitly, somewhere, that it's CISC, using the word or phrase; I agree it doesn't need to be in the first sentence. IBM System/360 and IBM System/360 architecture both state it only in the infobox, so that might be sufficient. (Motorola 68000 and Motorola 68000 series, however, do state it in the first sentence.)
(I wouldn't call it "a 32-bit computer"; it's a line of 32-bit computers - or a 32-bit line of computers.) Guy Harris (talk) 20:36, 24 December 2021 (UTC)
Well, 32-bit computer architecture. The architecture can exist even if no implementations are ever built. Gah4 (talk) 22:48, 24 December 2021 (UTC)
+1 for "32-bit instruction set architecture" or similar Vt320 (talk) 23:01, 24 December 2021 (UTC)
As far as I know, the name is used both for the ISA and the line of machines that implement it, as is the case for the various flavors of System/3x0 and for the PDP-11 (three architectures for which implementations were built). Guy Harris (talk) 23:19, 24 December 2021 (UTC)
Yes, architecture names are commonly used for the line of machines, at least until clones appear and there is ambiguity. It could also refer to a specific machine inside a computer center, with only one machine of a specific architecture. Outside of a computer center, it should be ... architecture, or as noted, ... line of machines. Gah4 (talk) 12:40, 25 December 2021 (UTC)
VAX was a trademark of DEC and was used to refer to the family of computers. The underlying architecture was referred to simply as the VAX architecture, as was the case for the System/360, PDP-11, DG Nova, etc. Clones may have used the architecture but could not legally call themselves VAX. I've edited the lede to reflect that and moved the CISC mention lower down. I used "series of computers" to match the PDP-11 article, but I have no objection to "line" or "family".--agr (talk) 23:12, 27 December 2021 (UTC)

Early Adopter of Virtual Memory?

By what standard was the VAX an early adopter of virtual memory? Many PDP-11 models had virtual memory, although it was not demand paged. The KL10, used in DECsystem-10 and DECSYSTEM-20 computers, had a more feature-rich demand-paged VM implementation introduced in 1975. The KI10 had demand-paged VM under TOPS-10 in 1972. The older PDP-10s and ancestor PDP-6s all had VM, but again not demand paged. The IBM System/370 had demand-paged VM (IBM called it Dynamic Address Translation) from 1972. One model of System/360, the 360/67, had DAT and 32-bit demand-paged VM in 1966. BTW, it was also the first major system to offer system virtualization (like VMware, but in 1966!). The VAX, and the Prime computer that someone oddly mentions without context, were designed in a period when demand-paged VM was mainstream. The VAX's implementation was considered mature in the industry. Although it is enjoyable to remember that UNIX took until 1984 to figure out demand-paged VM. 149.32.192.36 (talk) 17:37, 27 July 2022 (UTC)

In my opinion, you don't have virtual memory unless you have demand paging. The IBM System/360 Model 67 was, as far as I know, the first commercial computer with demand paging, though there were academic computers that preceded it. IBM's first attempt to write an operating system for the Model 67, TSS/360, was not successful, but later work succeeded in making virtual memory a commercial success.
However, that work, and work done at DEC with TOPS-10 and TOPS-20, was considered "mainframe". The VAX-11/780 was an early example of virtual memory in the department-level class of computers. John Sauter (talk) 23:15, 27 July 2022 (UTC)
Demand paging or demand-segment-loading (or both).
There's also the GE 645, but I don't know which one was first used by anybody other than the developers (IBM for the 67, GE/Honeywell and MIT for the 645).
The 16-bit Nord-1 is claimed to have offered virtual memory by 1969, but its page doesn't give details; the page for the Nord-10 says that the Nord-10/S added demand paging. Guy Harris (talk) 00:51, 28 July 2022 (UTC)
IBM always calls it VS, Virtual Storage, maybe to not confuse with VM, Virtual Machine. Though it might be that IBM always called it storage and not memory. For the 360/67, there is CP/67, the predecessor of VM/370. There is OS/VS1 and OS/VS2 for the larger S/370 models, and DOS/VS for smaller ones. The B5000 from 1961 has virtual memory, though it might be segmented instead of paged. Gah4 (talk) 05:30, 28 July 2022 (UTC)
The B5000's virtual memory was segmented (the "demand-segment-loading" I referred to above), not paged. Later Burroughs machines also, I think, supported paged segments (which used paging to load chunks of large segments) as well as unpaged segments (which used demand loading of the entire segment). Guy Harris (talk) 10:32, 28 July 2022 (UTC)
I tried to find a source explaining it, but didn't find one. The difference between page and segment, presumably, depends on the sizes of the pages and segments. Gah4 (talk) 02:20, 29 July 2022 (UTC)
"The difference between page and segment" Oh, dear. The term Burroughs used for a paged area appears to be - wait for it - a "segmented array". A "segment" of a "segmented array" is 256 words. It may be that it's up to the compiler to decide whether to make an array "segmented" or not; this version of the A Series Algol Reference Manual says, on page 44, that "Normally, an array row longer than 1024 words is automatically paged (segmented) at run time into segments of 256 words each." The B 6900 System Reference Manual discusses how the INDX syllable works with "segmented arrays"; there seems to be no single place in the document that completely describes segmented arrays, and the description is scattered all over the document. (I am really not impressed by the latter document; yes, the system is a bit complicated, but they not only didn't have the A team doing this document, they didn't even have the B team doing it - maybe the C team, or maybe the C team was too good so they went with a lower team.)
(And as for the "segment" term, the B6900 document also says "The B 6900 system utilizes the same dynamic storage allocation concept that was utilized in former Information Processing Systems. This concept utilizes a descriptor method of segmentation which allows variable length segments of data to be used. This method is more efficient than “fixed-size” paging concepts.", which appears to be talking about items referred to by descriptors as "segments", even though an item referred to by a descriptor may itself be "segmented", so I guess a "segmented array" is a "segmented segment", or maybe it's composed of (fixed-size) segments, or....) Guy Harris (talk) 07:53, 29 July 2022 (UTC)
As far as I know, in the early years "segment" was used for variable-sized units, and "page" for fixed-size units. What I meant above was, how well each one works depends on the sizes of segments and pages. To make it more confusing, in IBM's description for S/370 (and I presume the 360/67), there are two levels of addressing, which they call segments (usually 64K) and pages (either 2K or 4K, selectable). The two-level system is needed so that page tables don't get too big to fit in memory. When VAX was new, I wrote a paper for a CS assignment on similarities between S/370 and VAX addressing. Instead of segment and page, VAX uses pageable page tables. The high two bits of 32-bit addresses select an address space, one of which is where first-level page tables go, and those point to pageable page tables somewhere else. For 64-bit addressing, many systems now use five levels of table addressing, though with cheats to simplify things until those are really needed.
There are also systems, the DEC KA-10 as one example, that can swap a user's whole memory, which I believe is also called a segment, in and out. But it is all or nothing. I believe that many low-end time-sharing minicomputers, such as HP TSB-2000, also did that. If you have enough segments, and they are a reasonable size, then segmentation should work pretty well. Gah4 (talk) 20:25, 29 July 2022 (UTC)
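To make the VAX address split mentioned above concrete (the high two bits of a 32-bit address selecting an address space), here is a minimal Python sketch. The region names P0, P1 and S0 and the 512-byte page size come from the VAX architecture rather than from this thread; the function name is made up for illustration.

```python
# Illustrative sketch only; split_vax_address is a hypothetical helper name.
PAGE_BITS = 9                                   # 512-byte pages -> 9 offset bits
REGIONS = {0: "P0", 1: "P1", 2: "S0", 3: "reserved"}

def split_vax_address(va):
    """Split a 32-bit virtual address into (region, page number, byte offset)."""
    va &= 0xFFFFFFFF
    region = REGIONS[va >> 30]                  # high two bits select the region
    vpn = (va >> PAGE_BITS) & ((1 << 21) - 1)   # bits 29..9: page within region
    offset = va & ((1 << PAGE_BITS) - 1)        # bits 8..0: byte within page
    return region, vpn, offset
```

With 21 bits of page number per region, a flat page table for one region can itself be large, which is one reason the per-process page tables are kept in pageable system space.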
Sigh. Here we go again. Virtual memory and demand paging are two different concepts. The PDP-11 has virtual memory, but no OS implemented demand paging. Not because it couldn't, but because it wasn't deemed needed.
Virtual memory is basically just the idea that each process has its own memory space, covering the full addressing range. Several programs can refer to the same virtual address, but they have different actual content, and they do not affect each other. The memory is *virtual*. The "eXtension" in VAX is not (and never was) that VAX added virtual memory, but that it extended it from the PDP-11, going from 16 bits to 32 bits. Look up DEC's own definitions, which really should be what the statements here are based on. If DEC called it VAX with that meaning, it's completely wrong to force some other, incorrect interpretation on it here.
See the Wikipedia article on virtual memory for further details. Sillbit (talk) 14:13, 9 September 2023 (UTC)
To quote the VAX-11/780 processor handbook introduction:
"The goals of the VAX architecture were to provide a significant enhancement to the virtual addressing capability of the PDP-11 series." Sillbit (talk) 14:42, 9 September 2023 (UTC)
"Virtual memory is basically just the idea that each process has its own memory space" Do you have reliable sources to indicate that this is the primary or only meaning assigned to the term "virtual memory"?
For example, IBM used the term "virtual storage" (at least in the past, they referred to "storage" rather than "memory", perhaps to avoid anthropomorphizing computers) even on systems that had only one address space in which all processes (or whatever term was used for the IBM OS in question) ran:
  • OS/VS1 had a single "virtual storage", in which all programs ran; it behaved as OS/360 MFT, but with memory partitions in demand-paged virtual storage rather than real storage. See the first section of OS/Virtual Storage 1 Features Supplement.
  • Release 1 of OS/VS2 also had a single "virtual storage"; it behaved as OS/360 MVT, but with memory regions in demand-paged virtual storage rather than real storage. See the first section of OS/Virtual Storage 2 Features Supplement. It was not until Release 2 that OS/VS2 supported "multiple address spaces" (that OS later became known just as "MVS").
  • The "Translating a large virtual address" chapter of IBM System/38 Technical Developments says, at the beginning of the first paragraph, "The System/38 supports a large virtual address space structure, large enough to contain all programs and data required by the system." All processes on System/38 ran within that single, demand-paged virtual address space, and all objects within the system, whether persistent (living across reboots, at a permanently-assigned starting virtual address) or temporary, existed in that address space.
What do, for example, various textbooks on operating systems - preferably textbooks that acknowledge the existence of single address space operating systems - refer to with the term "virtual memory"? Guy Harris (talk) 18:45, 9 September 2023 (UTC)
It is used in different ways when you have a memory management unit between the addresses as seen by programs, and by the memory box. In most cases, including VAX and OS/VS2 1.x, data can be written to disk, allowing a larger address space than available physical memory. The PDP-11 is a little unusual, in that the 16 bit "virtual" address space is smaller than the physical memory. Gah4 (talk) 19:34, 9 September 2023 (UTC)
"It is used in different ways when you have a memory management unit between the addresses as seen by programs, and by the memory box." If you don't have a memory management unit (or something descriptor-based with descriptors having presence bits, as the Burroughs B5xxx and B6xxx/7xxx/A-series do), what do you have that would count as "virtual" in any sense? And what does "by the memory box" mean here?
"The PDP-11 is a little unusual, in that the 16 bit "virtual" address space is smaller than the physical memory." There's more to virtual memory than "an address space can be larger than the total physical memory". If you have a system that supports multiple programs running at the same time (even if only one can be running at a given instant, as on a system with one non-threaded CPU core), and each program gets a separate address space, virtual memory would allow those multiple programs to be schedulable (meaning that at least some of their code and data is in physical memory) even if not all of their address spaces will fit in physical memory, in their entirety, simultaneously.
Systems other than the PDP-11 also supported maximum address space sizes less than maximum physical memory sizes, e.g. SPARC systems with the SPARC Reference MMU (32-bit virtual addresses, 36-bit physical addresses) and IA-32 systems with Physical Address Extension (again, 32-bit virtual addresses, 36-bit physical addresses). However, demand-paged OSes for SPARC and IA-32 systems already existed, so a lot less work was necessary to support that, and those were 32-bit systems, so there was more address space for paging code and data structures to support it, and the total number of pages in the address space was larger, so I suspect the working set of a process was typically a smaller fraction of the total address space of the process than would be the case on a PDP-11. That's probably why DEC didn't have any demand-paged OSes for the PDP-11 models that could support it (the KB-11, used in the 11/45, 11/70, etc., had registers saving enough state to allow register updates done in the middle of instructions before a page fault to be rolled back, allowing the instruction to be restarted once a page is brought into memory and the MMU updated to map that page, but the KD-11, used in the 11/40, didn't, even though it had an MMU). Guy Harris (talk) 20:08, 9 September 2023 (UTC)
I have used RSX-11, but don't know all that much about it. In any case, for the PDP-11 programs use virtual address space, and that is mapped into the sometimes larger physical address space. But there isn't demand paging like in many systems. Some systems use swapping, where the whole image is copied to/from disk, and that still counts. But okay, in general it can be called virtual memory or virtual storage. If it is demand pageable, then you say that: demand pageable virtual memory. There are, then, bank switched systems, which can also swap to/from disk, which may or may not be demand. EMS for IBM PC type machines required the program to keep track of which it was using and when, but still counts. The 80286 allows swapping of variable sized segments, and OS/2 1.x did that pretty well.
As to the original question, I don't know that VAX did anything so special. Well, maybe bring down the cost and make it more available to more people. VAX has too small a page size, and they should have known that. Gah4 (talk)

Memory mapping hardware can provide various capabilities:

  • dynamic relocation, where a given instruction or data address doesn't directly refer to physical memory, and, if the beginning physical address for a given range of instruction or data addresses is initially set at program start time to a value that may not be the same every time the program is started, or is changed after it's initially set, the mapping hardware's map can be changed and the program will be unaware of the physical memory addresses;
  • bounds checking, where references to addresses outside any of the ranges in question will not be allowed and, typically, will cause a trap;
  • demand loading, where given ranges might be in secondary storage rather than main physical memory, and a reference to them can cause hardware, firmware, or software to find memory into which to load that range, load it in, and change the mapping hardware's map to mark that range as being in memory and restart the program, which can repeat the reference, which will then succeed.

"Memory mapping hardware" can here cover segmentation, paging, or paging with segmentation - and base-and-bounds registers, which could be, I guess, thought of as a primitive form of segmentation. All those forms support dynamic relocation and bounds checking. If they have "presence bits", they can also support demand loading.

Demand loading requires that hardware, firmware, or software - and it's usually software - handle a "that's not present" condition. "Not present" fault handlers require a fair bit of software work.
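As a rough illustration of the three capabilities listed above, here is a minimal Python sketch of a single mapped range with a base, a length, and a presence bit. All names (Range, translate, access, the load callback) are invented for this sketch and don't correspond to any particular machine or OS.

```python
from dataclasses import dataclass

class BoundsTrap(Exception):
    """Reference outside the mapped range."""

class NotPresentFault(Exception):
    """Range is mapped but currently in secondary storage."""

@dataclass
class Range:
    base: int              # physical start address (dynamic relocation)
    length: int            # size of the range (bounds checking)
    present: bool = True   # presence bit (demand loading)

def translate(r, addr):
    """Map a program address to a physical address, or raise a trap/fault."""
    if not 0 <= addr < r.length:
        raise BoundsTrap(addr)
    if not r.present:
        raise NotPresentFault(addr)
    return r.base + addr

def access(r, addr, load):
    """Translate, demand-loading the range on a not-present fault."""
    try:
        return translate(r, addr)
    except NotPresentFault:
        r.base = load(r)             # handler finds memory and loads the range
        r.present = True             # update the map
        return translate(r, addr)    # repeat the reference, which now succeeds
```

Changing `base` without the program noticing is the relocation part; the `present` flag is the only thing that distinguishes plain mapping from something that can demand load.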

It also requires that, if an instruction has modified some system state before the fault, either those changes can be rolled back before the program is restarted after the data is loaded or information sufficient to restart the instruction in the middle is made available to whatever restarts the program. The PDP-11/45, 11/55, and 11/70 provided information allowing the changes to be rolled back, but the PDP-11/40 and PDP-11/34 had memory mapping hardware but lacked that capability. Similarly, the Motorola 68000 lacked that capability (requiring workarounds such as Apollo's dual-processor scheme), but the 68010 would, on a bus error, dump what Henry Spencer called "stack puke" on the stack, and the "return from exception" would load that information and restart the instruction from the middle. The "first part done" bit in the VAX PSL (processor status longword) served a purpose similar to the "stack puke". Some processors don't modify any system state before the fault, and don't have this problem. Not having addressing modes that modify registers really helps here, so S/3x0 (including z/Architecture) and most if not all RISC processors have that advantage (although, on S/3x0, some SS instructions, for example, may have to do presence checks before doing any work, or be somewhat "stateless" like MVCL).
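The roll-back option can be sketched as follows. This is a hypothetical Python model, not the actual PDP-11 mechanism: it assumes an undo log of register changes, an 8 KB page size, and a made-up autoincrement move that bumps its source register before the destination reference faults.

```python
PAGE_SIZE = 8192    # assumed page size for this sketch

class PageFault(Exception):
    pass

def run_restartable(regs, instruction):
    """Run one instruction; undo its register side effects if it faults."""
    log = []                                  # (register, old value) pairs

    def set_reg(name, value):
        log.append((name, regs[name]))        # remember the old value first
        regs[name] = value

    try:
        instruction(regs, set_reg)
        return True
    except PageFault:
        for name, old in reversed(log):       # roll back, newest change first
            regs[name] = old
        return False

def mov_autoinc(present_pages):
    """Hypothetical MOV (R1)+,(R2)+ that faults if the destination page is absent."""
    def instruction(regs, set_reg):
        set_reg("R1", regs["R1"] + 2)         # source autoincrement happens first
        dst = regs["R2"]
        if dst // PAGE_SIZE not in present_pages:
            raise PageFault(dst)              # R1 has already been modified here
        set_reg("R2", dst + 2)
    return instruction
```

Without the log, restarting the instruction after the page is loaded would increment R1 a second time, which is the problem hardware lacking this information runs into.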

So, even if the memory mapping hardware could support demand loading, it may be that no system that includes that hardware actually ends up supporting it.

The issue here is which of those capabilities "virtual memory" or "virtual storage" involves. Is dynamic relocation, or dynamic relocation plus bounds checking, sufficient, or is demand loading (so that a program can run even if not all of the code and data is currently in main memory) required? This may be equivalent to "does 'virtual' mean that the code doesn't directly refer to physical addresses, or does it mean that code and data can be accessed as if it were in physical memory even if it isn't?"

And the answer, for a Wikipedia article, isn't "what does one particular editor think", or even "what do most editors think", it's "what, according to reliable sources, does the term refer to in the real world".

If we look at vendor documentation:

  • As per my earlier comment, IBM, at least, appears mainly to use "virtual storage" to refer to demand loading.
  • The Atlas papers don't appear to use the term.
  • A quick look at some early Burroughs B5xxx manuals (for a system that, as I understand it, had presence bits in descriptors and did demand loading of code and data accessed through descriptors) didn't show the term, either.
  • The Motorola 68010/68012 data sheet uses it, in section 1.3.1 "Virtual Memory", to refer to demand loading, not just memory mapping.
  • As for DEC:
    • As noted above, DEC's VAX-11/780 handbook says, on page 1-1, that "The goals of the VAX architecture were to provide a significant enhancement to the virtual addressing capability of the PDP-11 series...", although that speaks of virtual addressing rather than virtual memory. It's not clear whether "virtual addressing" refers just to memory mapping or to memory mapping plus demand loading.
    • If we look at the PDP-11/45 handbook, chapter 6 "Memory Management" speaks of "virtual addressing", which is just memory mapping; the handbook doesn't seem to mention "virtual memory", unlike the 11/780 handbook. Section 6.5.2.1 "Access Control Field (ACF)" does speak of "missing page faults" and of the "non-resident" access control mode in the Page Descriptor Registers. The 11/45 MMU appears to have been designed to support a variety of uses, including but not limited to demand loading.
    • Book 1, Programming with the PDP-10 Instruction Set of the PDP-10 Reference Handbook just speaks, in section 2.15 "Time Sharing", of the KA10's "protection and relocation registers" (base/bounds registers) as doing "relocation" of "user address"es, with the "user address"es not being virtual addresses. The KI10, according to the DECsystem-10 Technical Summary, supports "virtual memory" with "virtual address"es; according to this internal DEC document, the hardware and OS support demand paging.

So it's somewhat mixed, with "virtual" sometimes used to refer to just mapping and sometimes used to refer to demand loading.

If somebody wants to look at various operating system papers and textbooks, to see how "virtual" is used, that might provide more sources. Guy Harris (talk) 20:55, 10 September 2023 (UTC)