Robert E. Kahn

Sigma Xi promotes companionship among researchers, so we highlight our members through the Meet Your Fellow Companion series. Dr. Robert E. Kahn is an Internet pioneer who, among other things, co-invented the TCP/IP protocols. He was recently elected to the National Academy of Sciences. Dr. Kahn is the chairman, CEO, and president of the Corporation for National Research Initiatives in Reston, Virginia.









Transcript from Video:

Heather Thorstensen: Hello and welcome! This interview is part of Sigma Xi, The Scientific Research Society’s Meet Your Fellow Companion series, where we talk to our members about their work. My name is Heather Thorstensen, and I’m Sigma Xi’s manager of communications. Today I’m speaking with Dr. Robert E. Kahn. He is an Internet pioneer as well as the president and CEO of the Corporation for National Research Initiatives in Reston, Virginia. He was recently elected to the National Academy of Sciences. He joined Sigma Xi in 1966 and he is a life member.

Dr. Kahn, thanks for talking with me.

Robert Kahn: Thanks so much, happy to be here.  


You were part of some exciting work that helped make the Internet what it is today, such as co-inventing the TCP/IP protocols that help ensure information can be transmitted reliably. You also conceived the idea of open architecture networking, which allows components of the Internet to work together. What do you see as the greatest opportunity the Internet can still provide in the future?
Well, I think we’re still discovering what the Internet is capable of. It’s grown from a small system that consisted of three nets and maybe a few hundred computers back in the early '80s (or actually going back to the early '70s, when my colleague Vint Cerf and I first started to work on the protocols) to today, when there are arguably several billion machines; applications probably take it up to a much bigger number. And we’re now moving toward the Internet of Things, where people are prognosticating maybe 50 billion, 100 billion things on the Internet. Of course, we’ve had things from day one, but this is a whole expansion of things you carry in your pocket, things that exist in your kitchen, in your car, in your ceilings, everywhere. So that really is a big unknown as to what exactly is going to transpire there.

And virtually everything that has happened with the Internet over the years has been somewhat driven by serendipity, new ideas that came out, or logical extensions of things that came before.

We know there’s going to be higher bandwidths going forward, faster processors, more storage. We all know that wireless is a critical part of the Internet. In fact, it’s now, I believe, the dominant part of the Internet and growing rapidly. And, you know, the whole business of managing information has taken front and center stage, starting with the work on the web that began more than 20 years ago to some of the newer technologies like we’ve been developing at CNRI. So those are things that you can logically predict.

But it’s the unknowns, the serendipitous occurrences, the new ideas that people have that nobody ever thought about that’s really going to dictate much of the Internet going forward.

 

What do you think the biggest threat is to the open Internet right now?
Well, it’s still not widely agreed on exactly what that term means. To some it means that they can get access to your information, to others it means that there’s no charge for anything. But what is pretty much understood is the Internet is a grand collaboration including governmental bodies around the world on a scale that’s never really been achieved in the digital world before. There are probably very few counterparts in any context whatsoever. Even the United Nations wasn’t on the scale that we’re talking about for the Internet. So it really is kind of a definitional term that I think we’re going to have to live with and parse and try to understand. I don’t think it’s a widely understood term so it’s really hard to answer the question with any real specificity.


Ok, what about the open Internet in terms of security: the threat to our security on the Internet?

The thing I can relate to here is that in an open Internet, the presumption is that almost anybody can participate. While it’s true that any nation can control what happens within its boundaries, the Internet is larger than any one nation. So the fact that you can have so many actors involved basically means you’re going to get good guys and bad guys and cyber security has been one of the biggest concerns that we’ve had in recent years. When we were starting with hundreds of computers and friendly researchers, none of those things tended to happen. Well now with 3 billion devices or more applications out there, it’s really something that has sort of come out of control to some extent. So cyber security is really a very big issue today. It’s probably going to be an even bigger issue in the future when we have 50 or 100 billion things out there, all capable of attacking each other.


The not-for-profit organization that you started in 1986, the Corporation for National Research Initiatives, is developing something right now called digital object architecture. Could you explain what that is and why it’s important?
Ok, it’s a little hard to give an elevator pitch on it. That’s what I’ll try to do. The really short version. The Internet itself, it’s well known that it works based on the use of IP addresses, these are addresses to the actual machines you’re trying to get data to, and applications on those machines. When I started working on networking in the earliest days, I was involved with the ARPANET directly. We addressed the computers by the wires they were on. We addressed in the Internet by the IP addresses. When the web came along, we had to address it by virtual files on given machines. But if you’re trying to deal with information, particularly with the information you might want to archive over very long periods of time, it doesn’t help if you’re retrieving, let’s say, a Sigma Xi journal, 100 years from now, to say “this used to be in the file name such-and-such on a machine that had the following IP address 100 years ago.” Companies don’t last very long. Maybe the societies will because non-profits don’t have the same dynamics as for-profit companies but if it’s government information you might want it to last in perpetuity, well at least as long as the country is around. And so, identifying things about the technology is really not the right long-term solution.

And so the digital object architecture basically coined the notion of digital objects, which is basically structured data that’s interpretable by any of the machines if you’ve got the right applications. But more importantly, the information itself has a unique persistent identifier. So if in a hundred years, you come back and say “give me the information that’s got that identifier,” no matter what the technology of the day is, you should be able to find it if that information has been managed properly. So, that was the essential idea, to be able to identify the information itself, structured information in digital form, and that you would be able to manage it over the long haul.

There were three components of that architecture. One of them is something we call a repository, which basically provides a means of accessing these digital objects. So if you want to access one a hundred years from now, it needs to be accessible from somewhere. So that’s what the repository does: it stores them and makes them available, with the appropriate security controls and the like, but by virtue of its identifier. So you’ll say “give me the object whose identifier is x” and presumably you will get it no matter what storage technology is used behind the scenes. In today’s parlance, it could be cloud services, it could be RAID arrays, discs, thumb drives, whatever you’re using, but in the future we just don’t know what that would look like.

Second, if you don’t know the identifier, you need a way of acquiring it, and so registries store metadata about the objects. Those registries can be searched: just like you might use a web search engine today to find an appropriate URL, you can search a registry to find an identifier, using whatever the search criteria of the day are. Today, it might be keywords. Tomorrow, it might be voice, singing, music, pictures, whatever, and that would call back identifiers, maybe more than one. But algorithms will be developed, I believe, in the research community that make searching based on more than just keywords viable.

And then if you’ve got these identifiers, you need to then understand how you go from the identifier to wherever the repository is where you can access the information. Today, there are thousands of organizations around the globe that are running what we call local handle services. We call these identifiers “handles.” Other groups have branded them in different ways. For example, the publishers call handles “DOIs.” Other people just call them by their generic name: digital object identifiers. But you need to know which of these local handle services to go to to find out what this identifier resolves to because there’s no central place that contains all of that information. So there is a thing called a Global Handle Registry that knows where the local handle services are and if you provide an identifier to it, it will say “basically, what I can tell you is who would have created that identifier, if anybody did. Go and ask them.”

And so, that’s the way the system is currently constructed. It’s got this Global Handle Registry that lets you identify where the local handle services are, the local services will resolve identifiers into state information about the objects, like where they’re located— it could be in many places, you can have public keys, it can have terms and conditions for use, authentication information, things like that— and repositories and registries that help you find the identifiers to actually get the information.
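The resolution path described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions, not CNRI's implementation or any real handle software API: every class, function, and field name is hypothetical, the "prefix/suffix" identifier shape and the sample handle are assumed for illustration, and security controls are left out entirely.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Repository:
    """Stores digital objects and returns them by identifier (hypothetical)."""
    objects: dict[str, bytes] = field(default_factory=dict)

    def get(self, handle: str) -> bytes:
        return self.objects[handle]


@dataclass
class Registry:
    """Holds searchable metadata; a search returns matching identifiers."""
    keywords: dict[str, set[str]] = field(default_factory=dict)

    def search(self, keyword: str) -> list[str]:
        return sorted(h for h, words in self.keywords.items() if keyword in words)


@dataclass
class LocalHandleService:
    """Resolves a handle into state information about the object."""
    records: dict[str, dict] = field(default_factory=dict)

    def resolve(self, handle: str) -> dict:
        return self.records[handle]


@dataclass
class GlobalHandleRegistry:
    """Knows which local handle service is responsible for which prefix."""
    services: dict[str, LocalHandleService] = field(default_factory=dict)

    def service_for(self, handle: str) -> LocalHandleService:
        # Assumption for this sketch: the text before "/" names the creator.
        prefix = handle.partition("/")[0]
        return self.services[prefix]


def fetch(handle: str, global_registry: GlobalHandleRegistry,
          repositories: dict[str, Repository]) -> bytes:
    """Global registry -> local handle service -> repository -> object."""
    service = global_registry.service_for(handle)
    record = service.resolve(handle)
    repository = repositories[record["locations"][0]]
    return repository.get(handle)


# Hypothetical sample data.
HANDLE = "20.5000/example-1"
repositories = {"repo-a": Repository({HANDLE: b"archived journal issue"})}
registry = Registry({HANDLE: {"journal", "archive"}})
local_service = LocalHandleService({HANDLE: {"locations": ["repo-a"]}})
global_registry = GlobalHandleRegistry({"20.5000": local_service})

found = registry.search("journal")  # keyword -> identifiers
document = fetch(found[0], global_registry, repositories)
print(found, document)
```

The point of the sketch is the indirection: the caller only ever holds the identifier, so the repository behind `"repo-a"` could be swapped for different storage without the identifier changing.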


Do you see this architecture with digital objects going across all information on the Internet? Would we be having these digital objects with these unique identifiers on every social media post and everything or is it just supposed to be about government documents and that kind of thing?
No, it’s very general. Many people thought it was only about publishing originally but it’s really about transactions of any kind that might want to take place. Remember, it’s not about the information so much as a digital representation on the Internet. If we’re talking about digital information in all forms, they could be movies, they could be publications, they could be semiconductor chip designs, they could be contracts, they could be government information, private company information. And security is important and I didn’t mention it before but it’s built into the architecture so it’s not an add-on that would come later but it’s pervasive and potentially could apply to everything but it will take time to get from here to there because just like when the automobile was introduced we had an economy that involved other forms of transportation. Horses didn’t disappear on day one. We still have railroads in the country, we still have ships at sea. When the airplane was introduced, it didn’t erase any of those others, in any event. So the transition to this will probably take time. It will require buy-in from different constituencies who see it as part of a longer term global infrastructure. We had all kinds of issues with the Internet in the early days, when people thought it might be proprietary, maybe government, until they finally realized it was a public capability that was being made available. Government funded it initially in the public interest and the same thing is true here. This is intended to be in the public interest and my hope is that everybody who’s interested in managing information effectively over the long term will find this of interest. If they find the current technology is perfectly good for what they’re doing, my guess is they will stick with that until they see some reason to do something different.
So it’s going to have partial adoption in some places, slower adoption in others, and eventually could transform things in fundamental ways.


You mentioned the transition time. How long do you think it will be before this might be something that is a global opportunity?
It is being used globally right now. Virtually every major scientific publication that I’m aware of anyway uses this to identify the articles that are there. In the United States, the entertainment industry uses it to manage their assets on the net. There’s a site they created called EIDR.org that is based on the registry technology and the handle technology. They have worked jointly with the publishers because sometimes the publishing industry does multimedia kinds of things where print and video may want to work together. So it’s all the same technology base but it is being widely adopted so far in selected groups and there is some governmental buy-in. The Library of Congress has been using this for quite a while in the United States. There are just quite a number of areas. I think this may find more uptake in some of the big data research initiatives around the globe.


Why do you think it’s so important that this become widely adopted?

Well, it was an attempt on our part to show an alternative. So whether people adopt it or not, it’s really their choice. We’re trying to show them what the option is.

This was a similar thing that I encountered in the early days when we were developing packet-switching technology. And it was really intended to show folks who were in the business that there was a better way to do communications for dealing with computers because computers were not like people having long-distance phone calls. They wanted to send a little version of data, have it reliably delivered like a rapid postcard through a network and we didn’t have any nets that were like that. So we showed them an alternative and it wasn’t adopted initially. It was built up in the research community and eventually people said, “hmm, that’s actually interesting. Let’s adopt it.” Today, a large part of the communications infrastructure of the world is based on that kind of technology. They call it different things but it took a while and the same was true of the Internet when it first emerged. Not everybody jumped in initially and eventually the folks who were most concerned about it, because it potentially threatened their revenue streams and businesses, said “hey, wait. Maybe it actually is another market for us. Let’s jump in and see if we can be major players” and they did. They were very successful. So it may take a while and that’s fine.

My goal here wasn’t to make them somehow adopt it so much as it was to show them what the potential was and leave it up to them to make their own decisions.


What does your election to the National Academy of Sciences mean to you?
Well, it was certainly a very big surprise. I really never expected that would happen. I was very pleased. It’s kind of wonderful to be recognized by some of your peers that way. Networking has had a very unique history in some ways. Even in the computing field, networking wasn’t viewed as a first class citizen for a while because either it was electrical engineering and not part of the discipline or it was viewed as kind of a wiring exercise to make the things work together. But eventually, I think people began to recognize that it has actually a serious intellectual component, a serious engineering component to it and I think the scientific community writ large is now coming around to that recognition as well. So, personally it’s very rewarding to know that your peers and colleagues are recognizing your own contributions. I hope I’m not the only one who ever gets there, other people I think are worthy of being recognized as well, including some of my close colleagues. But I think this is just a very good thing for the field as a whole that I represent.


Could you give me an example of other projects that CNRI is working on?
I think at the moment one that we’re working on may seem a little unusual for a place that’s known for working on networking and information technology and the like, but we have for the past, roughly, 15 years been providing a service, a national service, to help people in the country who want to design and have implemented micro-electro-mechanical devices, and more recently nanotechnology, to get them fabricated. So this is an effort that started with support from the U.S. government, DARPA [the U.S. Defense Advanced Research Projects Agency] in particular. We have supported, over the last 15 years, some two or three thousand individual projects. Some of these are very sophisticated.

It builds on technology that we funded when I was in DARPA before called MOSIS, which stands for MOS Implementation Service. And this was a service that we provided for researchers who wanted to design VLSI [Very Large Scale Integration] chips to be able to do it: ship the designs over originally the ARPANET and eventually the Internet, have those designs be merged and have a single wafer, or set of wafers, be fabricated through a central fabrication facility that we organized and produced chips for the research community at very low cost in very short time periods, without the researchers themselves having to know about foundries and mask making and all the details of producing actual, working chips.

So we did that, starting back in the early ’80s. It’s still going on today, being managed out of the University of Southern California. And we set up something that was not exactly the same but very similar that would allow people to come in over the Internet, do designs of processes to run mechanical devices. Unlike the electronics world where one process…like you probably would have to dig into the technology to know what CMOS [complementary metal-oxide semiconductor] and NMOS [negative-channel metal-oxide semiconductor] or Silicon on Sapphire really meant but if you can advertise one of those processes, there are lots of electronic devices that you can make with one technology from processors to memory to who knows what. In the MEMS [micro-electro-mechanical systems] world, every device sort of requires its own separate process. It’s like anything mechanical, you wouldn’t expect the same process to be used to build an automobile, an airplane, or rifle, or refrigerator, or submarine. They’re all different, they have different sizes, they have different steps in the manufacturing. And so the process steps need to be unique to the device.

Now, we are doing research on how to move beyond that so that you can get more reuse of the technology and therefore reusable processes could be more in vogue but for the moment, it’s very process dependent. So people can come in, over the Internet, define a process or ask us to help them define a process. Once the process is specified then we can take a design for a particular device that they want fabricated, actually help them get it fabricated, and mail them back working micromechanical chips in a matter of weeks to months at very low cost. 


Wow, great. Thank you. Well, thank you for doing this interview, I really appreciate it.
Again, my pleasure. Thank you very much.