Networks, Maps, and Happy Accidents—a Q&A with BU MET’s John Day

Lecturer, Computer Science
MSEE, BSEE, University of Illinois

Please describe your specific area of expertise.

This is a difficult question to answer. One of the problems of being an “old far— . . . guy,” is that I came through computer science at the tail-end of when one could know everything that was going on in the field— because there wasn’t that much going on! Then, there was only one computer conference a year. So, one got a good foundation and a chance to really work on almost everything.

Backing up quickly, I should probably admit that I am the first in my family to go to college. I grew up in a rural town of 900. The largest town in my county was 14,000. We had the largest graduating class they had ever had (for two towns): 40. Mine was one of those towns where my children can show up and people will know who they are because of who they look like.

I was also quite lucky to do my education at the University of Illinois, where I was probably the greenest freshman ever. As an undergrad in 1966, my student job was as an operator of a 300 MeV electron accelerator called the Betatron on the south end of campus across the street from the university power plant. It was an interesting glimpse into not only particle physics, but also into BIG electricity. (For those for whom it means something, it had a .01 farad capacitor bank at 6,000 volts that discharged 18,000 amps 6 times a second over a copper conductor that was 4 inches square in cross section, and ignitron tube rectifiers, only these “tubes” were tanks 2.5 feet across with gallons of mercury and diffusion pumps to maintain the vacuum. No wonder it was across the street from the power plant!) To bring it down to earth, a Betatron was basically a 400-ton transformer, where the secondary winding was an electron beam 10 feet in diameter! Even though it was two miles away on the north end of campus, the Betatron had a bad influence on ILLIAC II (Illinois Automatic Computer), the second computer Illinois built in 1963 and where I ended up spending much of my time. ILLIAC II used asynchronous logic (no clock circuits), which made it very fast. (Every time they turned on the Betatron, it would cause a six-cycle spike on the university power, which until they figured it out really screwed up the ILLIAC II logic!) Two semesters of logic design for asynchronous logic was very interesting!

In 1969, I had the good fortune to join the Operating System (OS) team building ILLIAC IV, the first supercomputer, a parallel processor with 64 processors.[1] This was at the height of the demonstrations against the Vietnam War, and our project was by far the largest military (Defense Advanced Research Projects Agency, or DARPA) contract on campus. The machine itself was being built by Burroughs Corporation outside Philadelphia. Part of the reason for building it was to push large-scale integration (LSI) technology. Texas Instruments was to make the chips. They could, but not for the price and they got out of the contract. So instead of the machine being about the size of a mainframe computer at that time, it was 10 feet high, 8 feet deep, and 50 feet (!) long (there could be 10 or 12 instructions on the wires to the processors) and the cost went up . . . a lot. (Hence, 64.) ILLIAC IV was to be a “peripheral processor” to a Burroughs mainframe. Our job was to figure out how to keep it busy. So, we did a lot of research on staging big data through the hierarchy of storage for the machine, parallel programming languages, and new parallel algorithms. In those days, one could tell hardware engineers from software engineers by looking at them. (On one trip to Philadelphia to work on the machine, we were repeatedly asked, “Are you guys in a band?”) That trip got a lot more interesting. The student radicals, after failing at one reason to demonstrate against ILLIAC IV (we demonstrate over lunch and then come back to work), argued that classified research should not be done on campus. We agreed with them. Not only were we against it, but given where the machine was to be on campus, classified research couldn’t be done. At one point our office was fire-bombed but the bomb didn’t go off. Good thing: that little two-story frame building was a tinder box. However, the demonstrations proved to be a useful ploy for the head of the project to get out from under cost overruns and ILLIAC IV went to NASA–Ames Research Center in California where it could, in fact, do classified research. A good introduction to not only supercomputing and operating systems but also politics!

One of the best “accidents” in building ILLIAC IV was that we had a B5500 mainframe to play with. The 5500 came on the market in 1964. In Elliott Organick’s book on it, he says, “it appears they got everything right.” And they did! It was the epitome of a simple, elegant design! This machine was a life-changing experience. If you knew how A and B worked, you could guess how C would work, and be right. (On an IBM machine, if you knew how A and B worked, you had no idea how C would work.) Everything made sense. For those who would want to know, this was a zero-address stack machine, tagged architecture (the stack was tagged non-executable), virtual memory, descriptor-based, call-by-name in hardware, the lowest-level language was Algol (no assembler), the Algol compiler was written in itself, and the operating system was written in an extension of Algol, called Espol. This was easily a decade ahead of everything else. (Looking back on it, we now realize that this was a secure OS.) This experience convinced us that hodge-podge designs of special cases (the only kind most of you have seen) were not the norm. Solutions could be simple and elegant, and “simple and elegant” was more efficient. We tried to learn from the experience and we got pretty good at it. I cannot stress enough the impact this experience had on our concept of design. It helped that the rest of our team were very smart. I learned a lot from them. The head of our group was the best software designer I have ever worked with and his only degree was in music composition. Another of our group (who I am still working with) who taught me a lot had worked for John Cage for the two years he was at Illinois.

The core of the group were also lucky because we were all devoted to the most charismatic, brilliant professor I have ever known, Heinz von Foerster, head of the Biological Computer Lab (BCL), one of the four founders of cybernetics, and the nephew of Ludwig Wittgenstein. He had also brought to BCL Ross Ashby, another cybernetics founder as well as a psychologist and expert on finite state machines (I had two courses with him and many discussions); Humberto Maturana, who founded modern neurophysiology with his ground-breaking work on “What the Frog’s Eye Tells the Frog’s Brain”; and several others. BCL was interdisciplinary, for real, 50 years before the current fad. But also, Heinz put us through the wringer. He taught us how to be critical, precise thinkers. And then we took that critical thinking and applied it to our work, holding each other to the same standard in our work on operating systems and networks.

Our group moved on to writing three operating systems from scratch for the PDP-11 to give us access to the ARPANET. (There were no OS courses then, or OS textbooks, so we had to figure it out on our own with the B5500 as inspiration and what Heinz had taught us as discipline.) In 1970, we were the twelfth node (computer) on the ’Net. Our first OS was good, but had limitations; our second was elegant (very close to what you would do today) but very slow and needed lots of memory (today machines are 10,000 times faster and memory costs almost nothing). The third was very small (4K of code) and very, very fast. It became a proven secure OS. Our group put the first Unix system on the ’Net in the summer of 1975. We then stripped down Unix to fit on a one-board PDP-11, to use for an intelligent terminal with a plasma screen[2] and touch for a land-use management system for the six counties around Chicago that used the intelligent terminal for accessing databases on both coasts over the ’Net. I wrote the mapping software for that effort. (Yes, we were doing GIS then.) We couldn’t do color but we could do different patterns for land-use. The user could select all of this and display information without using the keyboard.

At the same time our group was also collaborating with the rest of the ARPANET community on the development of protocols, such as NCP, Telnet, and FTP. We started experimenting with TCP in 1977. In 1976, I moved to Houston, Texas, so my wife could do her post-doc at the Texas Medical Center. I continued to work at Illinois and commute over the ’Net. While I was in Houston, the National Commission on Libraries and Information Science asked me to join their task force to develop protocols for libraries. (Being a devoted bibliophile, I had to do it!) The various library participants—NYPL, LC, OCLC, Ballots, and so forth—were a bit suspicious of each other, but after working together for a year they were much more comfortable and the group produced the precursor to what became the Z39 library standard.

We continued to work on networking research problems, developing protocols, doing research on distributed databases (one of my papers is cited in our database textbooks), and participating in the International Network Working Group (INWG), where we met the team from CYCLADES in France who invented the foundation technology of the Internet, i.e., datagrams and an end-to-end Transport Protocol. The team that Louis Pouzin assembled was one of the best and smartest I have ever worked with. That started a working relationship and friendship that lasts to this day.

[An aside: Useful learning isn’t always defined by the curriculum your advisor recommends. At one point, I audited “Invertebrate Zoology” because I wanted to survey the structure of nervous systems, which computers had some relation to. My first semester in computer science grad school, the only course I took was the “Social Ecology of the Amazon Basin.” I have no real interest in the Amazon, but this was going to be an interesting course: there were visiting lectures from anthropology (physical and cultural), sociology, history, biology, ecology, geology, Spanish and Portuguese literature, the World Bank, and the IMF. The specifics of the course could have been most anything. It was the method, the process, that was important. I learned things in that course that I still use.]

In 1978, Hubert Zimmermann, of CYCLADES, recently appointed the Open System Interconnection (OSI) architecture chair, asked me to take on the task of developing Formal Description Techniques for the new OSI effort (one of the problems I had been working on in INWG). We had to select from over twenty proposed methods, which we finally narrowed down to two that used very different approaches: a finite state machine technique called ESTELLE and a temporal ordering language called LOTOS. Both became International Standards and were used to specify and find problems in many protocols. Consequently, I have a reasonable background in formal methods.

In the middle of all of this, I eventually assumed the role of Rapporteur of the OSI Reference Model, the seven-layer model one always reads about. I also became chair of the American National Standards Institute (ANSI) committee on OSI architecture. This put me in a close working relation with Charlie Bachman, who had proposed the seven-layer model, but was better known as a Turing Award winner for basically inventing databases. Charlie and I worked together off and on until his death two years ago. The OSI work was highly political and very contentious. It was a war between the computer companies and the phone companies, and everyone against IBM. As Reference Model Rapporteur, I not only had its political problems to deal with, but also got involved in the problems facing virtually every layer of the Model. As Rapporteur and U.S. chair, I assembled a group to do Part 2 on Security and chaired the group to do Part 3 of the Model on Naming and Addressing.

Meanwhile, the company I was working for needed a new approach to network management and I got the job. In the spring of 1984, I worked out the new network management architecture modeled on nervous systems. That fall, General Motors visited. They were starting a major effort with Boeing and the National Bureau of Standards (now known as the National Institute of Standards and Technology, or NIST) on factory automation and expansion of the relatively new IEEE 802 committee. The piece they were missing was network management. We showed them what we were doing and they were enthusiastic. (A short lesson in electro-political engineering, sometimes called standards. IBM had been stonewalling OSI because their network architecture, SNA, was wholly incompatible. In 1982, IBM shifted gears, running full-page ads adopting OSI, but arguing that while OSI handled data it didn’t do management. Guess what they were stonewalling now? Right—network management. Given my prominence as Rapporteur of the OSI Reference Model, I sent some of my staff to IEEE meetings to work with GM. They used the network management architecture we had developed and produced a protocol for it. IEEE then brought it into OSI fully defined and ready to go. IBM had not seen it coming. That broke the logjam, as I suspected it would.) Of course, at the same time, we were architecting a new line of LAN products and developed and deployed a network management system on Apollo workstations that was 10 years ahead of its time.

In the mid-1980s, I attended a talk at the Kendall Whaling Museum in the town where I lived. The director was presenting his forthcoming book, Herman Melville’s Picture Gallery. There are three chapters in Moby Dick that talk about all of the extant pictures of whales, and many of the images were in the collection of the Kendall. One passage referred to Chinese pictures. After a huge search, the director had found a manuscript panel of what they thought was a nineteenth-century copy of a panel of a manuscript map. I loaned him some books I had (another long story) and we realized it was a seventeenth-century original. Since I was traveling a lot for OSI, I volunteered to do the backup research. Over the next few years, I was able to determine that this was part of the very rare Matteo Ricci (Li Madou in Chinese) world map of 1602, a woodblock print intended as a six-panel folding screen, six feet high and 12 feet long. Ricci was the first Jesuit into China, and is at the crux of the study of early East–West contact, made all the more interesting because Western science was one of the few things that Europeans could use to impress upon the Chinese that they weren’t just barbarians. This was a panel of the even more rare manuscript copy to which had been added pictures of sea monsters, animals, and ships. The research took me on many adventures: to junk rooms beneath the Vatican Library, to Seoul National University during anti-American week (there were demonstrations over lunch, just as in my earlier days), and others. I have now published on this work several times, consulted for auction houses on new copies that have come on the market, and am recognized as an expert in the history of cartography and science in seventeenth-century China. Believe it or not, there are parallels with Internet research.

You are credited with developing the principles behind Recursive InterNetwork Architecture (RINA). In common language, what are the implications of RINA? Are there specific advantages over current Internet architecture?

First, let’s get something out of the way. If the Internet were an operating system, it would be DOS; not even close to being UNIX, let alone the elegance of the B5500. The Internet is fundamentally flawed and can’t be fixed, either for technical reasons or because the political will is not there. Yes, it works (so does DOS). But at every major decision point (and there have been seven or eight of them), with the right answer and the wrong answer well-known, the Internet consistently chose the wrong one. It is surprising that it works at all, a testament to Moore’s Law.

In broad strokes, the problems with the Internet include a flawed understanding of naming and addressing (missing two-thirds of the minimal), no security, and congestion control so badly flawed that it is predatory and can’t be fixed, to which has been added a large number of patches and kludges to try to get around these problems, which only makes it worse. Of the four protocols we could have chosen, TCP was the worst; IP addresses name the wrong thing; IP fragmentation has never worked; and just to cap it off the Internet is a network, not an internet. There is not a single technical innovation created by the Internet that can be used as an exemplary solution. Worse, this is not a case of “if we knew then what we know now.” This is a case of “we knew then what we know now” (and now we know more).

That said, I didn’t embark on what became RINA to fix the Internet. In fact, quite the opposite: I have been trying to figure out what the fundamental principles are. We can build better stuff, if we know what the basic principles are. And I firmly believed, then and now, that the nature of the problem would tell us what those principles are; we just had to listen to the problem. In the early 1970s, we were on the right track; we knew, for example, that networking had more in common with operating systems than telecom. Bob Metcalfe, the inventor of Ethernet, captured what many of us thought at the time: networking was InterProcess Communication (IPC). We quickly realized that naming and addressing paralleled OSs with three levels of analogous naming: application names, logical addresses, and physical addresses.

Being the Rapporteur of the OSI Reference Model was the perfect place to see the whole picture and determine what patterns there were. However, chairing a standards committee doesn’t put you in charge. As is often said, it is more like herding cats. During the OSI work, I had seen some interesting patterns turn up that would simplify things immensely. But suggestions were always met by the often-heard excuse “If we simplify here, it will only get more complicated elsewhere.” I had seen the B5500, and I knew that this fatalistic excuse wasn’t always true, but I couldn’t prove it. I needed a chance to work out the solution across the whole problem to show it actually got simpler everywhere else.

(Contrary to what is taught, I despise the idea that one starts with “requirements.” In computing, starting with requirements is the best way I know of to get a lousy design. The tendency is to try to implement one “thing” for each requirement, so you can point to a piece of code and say, “that code does this requirement.” One should start with the principles, the invariants. Follow those, and then ask what that tells you about the requirements. It is often the case that the inherent structure satisfies the requirements without doing anything else; there is no single thing one can point to that says this does that requirement. Just doing requirements is the refuge of the mediocre. I don’t teach that. For example, two major problems in the Internet today are multihoming and mobility. RINA solves both of them and both are free. They are inherent to the structure. There is nothing one can point to that does multihoming or that does mobility. Nothing special is required.)[3]

As I am constantly telling our collaborators around the world when they confront a new area, “Don’t try to solve the problem. First ask, what does the problem tell us? Then do what the problem says—it is smarter than we are.” And, so far, we have found that problems do tell us the solutions. Which brings up another pet peeve. Ever hear someone say “the Devil is in the details” or “the more I look at a problem the more complex it gets”? These are sure signs of someone with no sense of design. We have found quite the opposite: we don’t find devils in the details, we find angels! That is because we have teased out the principles and the invariants, and then constructed our model not to break the invariances. When we look at something new and ask ourselves what the problem (and our model) tells us, most of the answer is already there. As Charles Kettering said, “A problem well-defined is a problem half-solved.”

So, what did RINA find?
• Networking is IPC and only IPC.
• There aren’t five or seven layers, but one layer that repeats.
• A layer is a distributed application that does IPC.
• Layers are not different functions, but different ranges of allocation of data rate, Quality of Service, and scale.
• That there is only one data transfer protocol with different policies.
• And there is only one application protocol with different object models.
• This leads to a collapse in complexity by orders of magnitude.
• That firewalls are unnecessary; the layer is a securable container enhancing security and greatly lowering its cost.
• That a network can be renumbered in seconds, rather than months.
• That no special protocols or other mechanisms are required for multihoming, mobility, or multicasting.
• That a global address space is not necessary and yet all applications can be reached. New layers can be created dynamically.

All of this, and more, derives from three simple results:
1) Constructing networking by starting with InterProcess Communication in a single computer.
2) Analyzing the functions and separating mechanism and policy.
3) Richard Watson’s discovery that the necessary and sufficient condition for synchronization is to bound three times. (The three-way handshake taught in all of the textbooks has nothing to do with it.)
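
As an illustration of result 2, here is a minimal Python sketch of separating mechanism from policy. It is a toy, not RINA's actual data transfer protocol: the `Policy` fields, the `transfer` function, and the simulated lossy channel (`lost_once`) are hypothetical names invented for this example. What it tries to show is that a single unchanged mechanism can behave as a reliable or an unreliable transfer depending only on the policy handed to it.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Policy:
    """Hypothetical policy bundle: the only part that varies per environment."""
    window: int        # how many PDUs may be sent per round
    retransmit: bool   # whether a lost PDU is sent again


def transfer(data: List[str], policy: Policy, lost_once: Set[int]) -> List[str]:
    """The single mechanism: number the PDUs, send them window by window,
    and consult the policy to decide what happens after a loss.

    `lost_once` simulates the channel: these sequence numbers are dropped
    the first time they are sent. Returns what the receiver delivers,
    in sequence order.
    """
    if policy.window < 1:
        raise ValueError("window must be at least 1")
    received: Dict[int, str] = {}
    pending = list(range(len(data)))
    dropped = set(lost_once)
    while pending:
        batch, pending = pending[:policy.window], pending[policy.window:]
        for seq in batch:
            if seq in dropped:
                dropped.discard(seq)      # lost on this attempt only
                if policy.retransmit:
                    pending.append(seq)   # same mechanism, policy says resend
                continue
            received[seq] = data[seq]
    return [received[seq] for seq in sorted(received)]


reliable = Policy(window=2, retransmit=True)
best_effort = Policy(window=4, retransmit=False)
message = ["a", "b", "c", "d"]

print(transfer(message, reliable, lost_once={1}))     # ['a', 'b', 'c', 'd']
print(transfer(message, best_effort, lost_once={1}))  # ['a', 'c', 'd']
```

The body of `transfer` does not change between the two calls; only the `Policy` value differs, which is the sense in which one protocol can serve different environments.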

Because RINA defines a layer as a distributed application, we have now found that we are working to a unified model that includes distributed applications, operating systems, and networks, which also means that it covers data centers, clouds, and IoT, among other things.
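
The idea of one layer that repeats can be sketched in a few lines of Python. This is a toy under stated assumptions, not an implementation of RINA: the `Layer` class, the `build_stack` helper, and the bracketed text framing are all hypothetical, and real layers would differ in scope and policy rather than only in name. The sketch only shows that the same layer code, instantiated several times, is enough to build a stack of any depth.

```python
from typing import List, Optional


class Layer:
    """One layer type, instantiated as many times as the network needs.

    Each instance offers the same IPC service (send/receive) to whatever
    sits above it and uses the same service from the instance below it.
    """

    def __init__(self, name: str, lower: Optional["Layer"] = None) -> None:
        self.name = name
        self.lower = lower

    def send(self, payload: str) -> str:
        """Wrap the payload in this layer's framing and pass it down."""
        pdu = f"{self.name}[{payload}]"
        if self.lower is None:
            return pdu                  # bottom layer: this goes on the wire
        return self.lower.send(pdu)

    def receive(self, frame: str) -> str:
        """Let the lower layers unwrap first, then strip this layer's framing."""
        inner = frame if self.lower is None else self.lower.receive(frame)
        prefix = f"{self.name}["
        if not (inner.startswith(prefix) and inner.endswith("]")):
            raise ValueError(f"{self.name}: malformed PDU {inner!r}")
        return inner[len(prefix):-1]


def build_stack(names: List[str]) -> Layer:
    """Stack the same Layer class repeatedly, bottom layer first."""
    if not names:
        raise ValueError("need at least one layer")
    layer: Optional[Layer] = None
    for name in names:
        layer = Layer(name, lower=layer)
    return layer


top = build_stack(["link", "net", "inter"])
frame = top.send("hi")
print(frame)                # link[net[inter[hi]]]
print(top.receive(frame))   # hi
```

Adding a fourth or fifth layer means passing a longer list to `build_stack`; no new class or protocol is written.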

For the past 10 years, the National Science Foundation funded an effort to find a new architecture for the Internet. It was unsuccessful. Why? Because they focused on what to build (requirements), rather than what they didn’t understand. RINA shows that understanding answered the questions of what to do for the future and what to build. They needed to go back to go forward.

What are some challenges to implementing architecture such as RINA?

There are basically two challenges:
1) Money, to build what we have developed. We have gotten some funding from the European Commission, but it had to be directed at their goals, not necessarily the goals that a “new”[4] paradigm needs to investigate.
2) Research, where there are several challenges. RINA separates mechanism and policy. Hence, experimentation is necessary to determine the correct policies for specific environments. The great degree of commonality should enable capabilities we can’t even imagine with the current complexity. So, there is much work to be done to explore not only how the model works, but what it enables. Some properties are totally new, so there is much to explore. For example, RINA does not require a global address space and can create new layers as needed. No one expected this result and there were many reasons to believe it was not even possible! But that was one of the model’s implications that we discovered later, as was the discovery that a layer is a securable container. We didn’t consider security at all when we did the model, but it turned out to be more secure![5]

There is much to explore. It is exciting to look at something we haven’t considered and find that the solution is already there. One of the more exciting things for me is how often the most innocuous thing we explore turns out to generate a novel, deeper insight, or shows one of my long-held beliefs to be wrong!

In what ways can BU students participate in your work on RINA? Have MET students made specific contributions to the research?

Students have made many contributions to RINA:

• One showed that the Error and Flow Control Protocol we use (developed in 1980) does not have the security flaws found in TCP. This contributed to our growing suspicion that strong design may have as much to do with security as security does!
• Another showed that a layer in RINA is a securable container, and also quantified the huge difference in the complexity of security in RINA.
• Another team used RINA concepts to unify WiFi and VLANs. They are the same thing: both are multiple layers of the same rank over a common medium.
• Yet another showed that the large number of IoT protocols are unnecessary; their work can be done by a single protocol.
• One student showed that, in RINA, the Link Aggregation Protocol in IEEE 802.1 was unnecessary. The inherent structure provided the necessary capability.
• Another showed that RINA could be a secure approach to anonymity.

And there is much more to discover.

To join our research is quite simple: take a course with me! In essence, the course is the interview. Before one can do the research, one needs to learn the fundamental principles of networking, which are not found in any of the textbooks in use today. (Networking textbooks are more vocational than university level.) Taking a course gives me the opportunity to see how the student works.

You teach Computer Networks (MET CS 535) and Network Media Technologies (MET CS 635) in Spring 2019. Please highlight a particular project within one of these courses that most interests your students. What “real-life” exercises or problems do you bring to class?

Since CS 535 is an introductory course, I generally let the students choose a project and get their feet wet. Some of the contributions listed above came from CS 535. In CS 635 and CS 775 (Advanced Networking), we usually try to mount a class-wide project. I am eyeing several ideas now that we might explore—problems that make the student think, such as what RINA says about congestion management, creating layers dynamically, the properties of mobility, et cetera. I consider my job to be as much about teaching students how to think as it is about teaching networking.

___________________________

[1] It was supposed to be 256, but . . . keep reading!

[2] Plasma screens in 1976? One of the advantages of being at Illinois was that I had second-semester circuit theory from the person who invented plasma screens.

[3] That said, in other fields this is not the case. For example, in electrical or mechanical engineering, one starts with the requirements. Why? Because the principles already exist—with Maxwell’s Equations, the principles of thermodynamics, and Newton’s Laws.

[4] “New” in quotes because, to a large extent, we have picked up where Pouzin and CYCLADES left off in the early 1970s. I am sure that they would have gotten there eventually, but politics intervened, making the last 30 years of networking a Dark Age. It is now time for the RINAssance.

[5] When new results and simplifications “fall out” of a model, it is a good sign you are on the right track. Before that the model is just natural history: descriptive. It is science when it makes predictions and tells us things we didn’t know.