ICANN Not Deal With DNS Anymore
Saturday, June 25 2011 @ 08:44 AM PDT
Contributed by: Richard Pitt
Take the fact that today our internet systems are largely "always on", couple that with the general speed of the net and the huge computing and storage capacity of our computers, and contrast all of that with the technology of the era the Domain Name System (DNS) grew up in - connect-on-demand, slow networks, and scarce/expensive local processing and storage - and you come up with my premise: DNS has outlived its usefulness. Is there an alternative to a system that is a money-tree for ICANN, a lever against perceived badness for the US government (at least, in seizing domain names), and a perceived threat to national sovereignty in the eyes of most of the other countries of the world?
Maybe - in fact probably.
According to an item I read yesterday about ICANN opening up the Top Level Domain (TLD) system to anything someone will pay huge amounts of money for, most people use a search engine to find a company's site rather than trying the "obvious" domain names in hopes of hitting the right one. They no longer play the guessing game; they search instead. The point the article made is that all this expansion of the TLD system does is put money into ICANN's pockets - it really does nothing to help you or me find the company, and it "taxes" companies by making them protect their brands at huge expense: The Coca-Cola Company would have to register the TLD "coke" to keep the Mafia or some other criminal drug cartel from registering it and using it to market the drug (trite example, but you get the picture).
The other problem, of course, is that the net is now so ubiquitous, and has been around long enough, that DNS suffers from a couple of major pollutants that affect such "intuitive" stabs in the dark:
- many companies, from different parts of the world, have the same or similar names - and all have tried to get into the Dot-COM root domain for historic reasons.
- purveyors of general nastiness love to register "near miss" domain names - misspellings, hyphenations, etc. - and load them with spam, malware, or plain misinformation
The interesting thing is that, somehow, the search engines generally get us to the point where the company or product we seek is on the first page (and most times the top line) of the options they give us, no matter what the domain name really is. They do this using several techniques that they're constantly tuning:
- geographic profiling - both of you and of the various companies that might share similar names - so you'll find the "Joe's Cleaning" from your town and I'll find the one in mine.
- popularity and feedback profiling - sites that are the "correct" one, where only one such site should exist (international brands such as Ford, Coke, etc.), accumulate long-lasting in-bound links from legitimate, generally well-accepted sites - and people who click through to them don't go back and search again.
So I asked myself, "why do we need the Domain Name System in general, and ICANN, the guardian of DNS top level domains, in particular?"
Is there some other way to do this that works as well or better, and does not subject us to the US FBI seizures and excess (IMHO) fees and charges of ICANN in particular and the Domain Registrars in general?
Just remember, DNS is simply a crutch for the human mind, which does not deal well with remembering numbers. At heart, it translates a human-readable (and rememberable) string into the set of unique numbers that describes a host to the technology you're using; www.digital-rag.com becomes 188.8.131.52
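The translation itself is a one-line affair on any modern system. Here's a minimal sketch in Python, using my own domain as the example name:

    import socket

    # DNS in a single call: turn the human-readable name into the numeric
    # addresses the transport layer actually uses.
    for *_, sockaddr in socket.getaddrinfo("www.digital-rag.com", 80):
        print(sockaddr[0])   # a dotted quad (IPv4) or hex groups (IPv6)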
I started thinking about what it would take to do things in a completely peer-to-peer fashion, or at least in an "open source" fashion, with the search engines (everyone's favorite, not one in particular) providing the glue that binds the information sources of the world together. The whole reason for the DNS system is that people don't easily deal with remembering numbers: a written-out IPv4 address runs from as few as 4 digits to as many as 12, split into 4 segments by dots. IPv6 is worse - it has more sections, and uses base-16 digits (including a-f) instead of just the base-10 ones that go into the typical IPv4 representation. Somehow, there has to be a way to marry the concept of searching with the underlying technical address structure of the TCP/IP system the content lives on.
A new way of finding web information sources about some company, person, entity...
The system would have to be "reputation" based - it would have to deal with the fact that the unscrupulous will try to steal a domain's identity, or at least its viewers, for some nefarious reason. This is, as noted above, one of the areas the search engines work on daily.
The system would have to allow for multiple "domains" (or subjects) on a single IP address. Today's web servers look at the content of the request sent by your browser and "switch" between virtual sites (name-based virtual hosting) based on the first part of the request - the domain name. The fact is, the underlying web servers could switch just as easily on almost anything in the request - the hard part is getting the request to the right web server in the first place, and ITS address is numeric.
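To make that concrete, here is roughly what an HTTP/1.1 request looks like on the wire (the name and path are just examples): the numeric address got the packet to the machine, and the Host: line is the only thing that says which "domain" the browser thinks it is talking to.

    GET /index.html HTTP/1.1
    Host: somestoredomain.com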
The system would have to allow for changes in where some company or individual's information resided; no longer at this IP address, now at this new one.
There would likely still need to be some form of formal registration, but governments in their various physical domains could provide that infrastructure as an add-on to their current corporate registration facilities - part of what is becoming government's portion of the semantic web of information (yet another reason the DNS system is irrelevant today.)
Thoughts on what it would take
Web Servers Already Figure Much Out from the content of the URL they receive from your browser
Once HTTP version 1.1 came out - with its mandatory Host: header - name-based virtual hosting became available and web servers were no longer restricted to one domain name per IP address: they could handle as many as the underlying hardware could cope with, thousands and more in some cases.
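Here's a minimal sketch of that switching in Python (hypothetical site names, standard library only) - the handler picks its content purely from the Host: header, which is exactly the trick name-based virtual hosting uses:

    from http.server import BaseHTTPRequestHandler, HTTPServer

    # One IP address, many "sites": the numeric address gets the request
    # here; the text of the request decides what it means.
    SITES = {
        "pacdat.pacdat.net": b"Pacific Data Capture",
        "p-zip.com": b"P-Zip Marketing",
    }

    class NameSwitchingHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            host = (self.headers.get("Host") or "").split(":")[0]
            body = SITES.get(host, b"404 - no such site here")
            self.send_response(200 if host in SITES else 404)
            self.send_header("Content-Type", "text/plain")
            self.end_headers()
            self.wfile.write(body)

    HTTPServer(("", 8080), NameSwitchingHandler).serve_forever()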
With the advent of database-backed URL re-writing and analysis, the rest of the URL can also be viewed and acted upon by the server, so things like re-direction and URL re-writing now happen without slowing the response to the viewer by more than microseconds. In fact, your friendly search engines use this facility to act on your inquiries - they build each and every page you see "on the fly" from what you type into the query line they give you, which translates directly into a URL with parameters built into it when you push "Search".
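You can see the mechanism in a few lines of Python (the engine's URL here is made up; real engines differ only in detail):

    from urllib.parse import urlparse, parse_qs

    # What "Search" really sends: your words become URL parameters that
    # the server parses and acts on while building the page on the fly.
    url = "http://search.example.com/search?q=red+carpet+tiles&hl=en"
    params = parse_qs(urlparse(url).query)
    print(params["q"][0])   # -> "red carpet tiles"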
The web server today can be configured to show any page it has, based on any URL it receives, and can create pages to suit. It doesn't care about DNS really.
I've long thought that domain owners were missing a bet by not using the "*" option in the current DNS system to accept "any" sub-domain of their main domain - and then simply letting the web server key on the sub-domain "words" to run a local search of the main web site. This would let you request something like "http://red.carpet.tiles.somestoredomain.com" and have the web server for somestoredomain.com look up "red carpet tiles" and give you the appropriate page(s). This is somewhat akin to what happens now with the strings that follow a "?" on a site with active pages - these "arguments" are passed into the active page and allow it to show more specific content.
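A toy version of that key-on-the-subdomain trick, using the made-up domain from above:

    # Hypothetical: a wildcard (*) DNS record sends every subdomain of
    # somestoredomain.com to one server, which treats the extra labels
    # as search terms for the site's own local search.
    def host_to_query(host, base="somestoredomain.com"):
        labels = host[: -len(base)].rstrip(".").split(".")
        return " ".join(label for label in labels if label)

    print(host_to_query("red.carpet.tiles.somestoredomain.com"))
    # -> "red carpet tiles"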
The problem is/was that the technical community really frowns on the use of "*" in DNS records - it can (and does) cause all manner of problems with services other than HTTP/web (i.e. port 80) itself. It is the services on the other 65,534 TCP ports, which mostly can't cope with the hostname carrying arbitrary extra information, that screw up this concept.
So, as it turns out, a web server can receive an inquiry directly to its IP address that contains more than enough information to determine which of several/many/millions of pages of information it holds is most relevant to you, even if there is no DNS record tying its IP address to any particular domain. I can as easily and successfully send my next query to 184.108.40.206 (one of the 6 addresses I got when I looked up www.google.com just now) as I could by using www.google.com as the first part of the URL. Both would work.
In fact, as the administrator of many different systems, I get to see their logs, and I sometimes marvel that a browser is asking my web server for a page from a domain I've never heard of. One reason is that I have a block of addresses almost the same as one of the "non-routable" address blocks, and people sometimes mix my addresses up when setting up non-routable "test" systems (mine are 198.162... and the non-routables are 192.168...). The web server answers the inquiry - with a "404 - page not found" error - but I could just as easily tell it to answer any such "wrong" query with one of my own pages, or do something else completely, including looking up the domain myself and punting the request somewhere else. Once the query finds its way to my computer, I can do anything with it to answer it, and the browser at the other end typically does not care. The viewer might - but the browser does not.
Local Storage - no need for separate caching DNS servers
Another technical feature today is the size of available local storage (and of available RAM for programs). Your own recently purchased computer probably has more storage in it than you'll use before you decide to get a new computer, and adding to it is fairly trivial in any case. If your computer needed a few Gigabytes to store information about the computers you most frequently visit (or might visit), you would not likely notice. The CPU you have is typically idle most of the time - and the amount of RAM necessary to run your own local "DNS" server is fairly small.
The caching DNS server system was designed to offload the lookup load and storage needed to cache the addresses that local users were interested in. A company might have a couple, and all queries would go via them, so each machine did not have to "bother" the root name servers for each and every lookup. This saved space on each workstation, network bandwidth to the internet (from the business) and computing resources (to some extent) at the root DNS servers.
More recently, your ISP runs caching DNS servers that they "strongly suggest" you use, by putting their information in all the support documentation and having their DHCP servers set your system to use them. This saves them bandwidth to the rest of the internet (nowhere near necessary today, but...) and it also gives them an interesting facility: their DNS servers can "lie" to you. They can intercept your query for (for example) Google's search engine and redirect you to their own, or note that you're looking for a domain that does not exist and push you over to their own page, try to sell you the domain, or other strange behaviors.
In light of some of these things - and just because I know how - I run my own caching DNS server on all my systems. It does not impact the speed of my systems; in fact, for many things it speeds them up by some small amount on second and subsequent lookups of the same domain, because the information is local - no network traffic at all. With today's fast computers and huge amounts of memory, there is no longer any reason not to run your own caching DNS server locally on each system.
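If you want to try it, a caching resolver such as dnsmasq needs only a few lines of configuration - one possible setup, with the upstream server chosen purely as an example:

    # /etc/dnsmasq.conf - minimal local caching resolver
    listen-address=127.0.0.1   # answer only this machine
    cache-size=10000           # far more entries than most people ever need
    server=8.8.8.8             # upstream resolver to ask on a cache miss

Point /etc/resolv.conf at "nameserver 127.0.0.1" and every repeat lookup is answered from the local cache.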
That being said, what if your local "DNS" system can figure out the IP address of where a domain is from some other information, not from the root DNS systems? Would you care?
We also have the potential for the average consumer computer to store a significant set of information about the computers and web pages/sites the owner might typically visit. In fact, as of today (June 25, 2011) there are only something like 131 million registered domains across the main top-level domains (COM, NET, ORG, INFO, BIZ, US); at, say, 50 bytes per name-plus-address record, that's under 7 Gigabytes - a truly trivial amount in the face of today's local storage.
This gives a LOT of flexibility to do things like providing peer-to-peer information on what a particular server hosts in general and in specific, in a format that is easily digested by the search engines and/or cached by your own local system. This takes the DNS system's caching to a whole new level - every desktop machine (and of course any/all servers) could cache information on pretty much the entire web - no longer needing to go to specific "caching DNS" servers (and hence, to the root servers that ICANN controls) to ask directions.
But what about updating things if/when they change? This is one of the built-in functions of the current DNS system - records carry count-down times (TTLs) that trigger things like local cache expiry and, eventually, the whole domain's expiry out of secondary domain servers if the master for that domain "disappears" (i.e. the owner forgot to pay the registrar, and the registrar took the main pointer out of the root servers.)
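You can watch those count-downs yourself; a quick sketch with the third-party dnspython package (pip install dnspython), again using my own domain as the example:

    import dns.resolver

    # Every answer carries a TTL: how many seconds caches may reuse it
    # before they must go ask again.
    answer = dns.resolver.resolve("www.digital-rag.com", "A")
    print(answer.rrset.ttl, [r.address for r in answer])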
Hmmm... if we don't use registrars anymore, I'll bet a lot of those DNS timeouts (or the equivalent loss of how to find a company's data repository) would stop happening. The only other reasons for changes would be moving to a different server at a different IP address, or discontinuing the server content altogether.
It turns out that today's internet web of users and machines is so connected that peer-to-peer update of such information within a typical DNS timeout period of a few hours is also pretty trivial. Our news clients already "listen" to (poll) various RSS syndication sites for us, periodically and frequently, so with minor modification the sites we're most interested in would be able to tell us about updates if and when they happen. If a site is no longer where our system thinks it should be, our system could start inquiring of various search engines, or of other designated sites - our friends' computers, places like Facebook, etc. - where such information might accumulate through those sites polling various servers themselves, looking for "anything and everything" and categorizing it. It only takes one such system to "notice" that some new server says it now hosts a particular mix of content (AKA one or more domains) that used to be at a different hardware address for the information to propagate quickly.
Better, when you query a search engine, have the browser do its DNS lookups against that search engine's DNS server - then what comes back is the IP address the search engine itself believes is correct for the URL.
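That part needs no new technology at all: Google already runs a public resolver, and pointing a lookup at a specific server is a few lines with dnspython (8.8.8.8 used purely as an example):

    import dns.resolver

    # Ask the search engine's own resolver, so the answer reflects what
    # that engine believes is the right address for the name.
    resolver = dns.resolver.Resolver(configure=False)
    resolver.nameservers = ["8.8.8.8"]
    answer = resolver.resolve("www.digital-rag.com", "A")
    print([r.address for r in answer])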
A Technical Challenge - Maybe
I know that many people think the number of IP addresses under IPv4 (about 4.3 billion) is a lot - and of course the number under IPv6 is truly staggering (2^128, or roughly 3.4 x 10^38) - so how would the search engines ever find all of the web servers out there without DNS to point the way?
Well, there are a lot of "hackers" out there who have set their computers to "ping" each and every one of the IPv4 addresses over a period of time - so I'm sure the search engines could too if they needed to, but they really don't need to. Any server with content its owner wants the world to know about need only ensure there is one link pointing to its IP address from any other already-known server in the world - and if its owner can't find any friends to do this, they can simply arrange for the server to visit the front page of any/all of the search engines they are trying to attract: those search engines will then know that the address asking them a question "exists", and can add it to their list to spider.
The real challenge is to somehow inform the world of what content the server at any particular IP address has that might be interesting. The easiest way would be to host a page at the default location on the server at that IP address, with links to the "real" content URLs of the server, including formal (or informal) designations that the host web software would recognize in URLs accessing that information. The format probably should be formalized - maybe some XML format - but my bet is that the search engines of today could deal with almost any human-readable format that made sense, so things like:
- http://192.168.100.1/pacdat.pacdat.net/index.html Pacific Data Capture
- http://192.168.100.1/p-zip.com/index.html P-Zip Marketing
and other similar constructs would lead the search engine to try http://pacdat.pacdat.net/index.html at the IP address 192.168.100.1 (note this is non-routable, don't try this at home), and if it got reasonable information back, it would then start telling anyone who asked that someone calling themselves Pacific Data Capture has a web site at 192.168.100.1 that will answer if your browser goes there - based on the content the search engine found at that address.
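If it were formalized, the announcement page might be as simple as something like this - a made-up format, purely for illustration:

    <hosted-content server-ip="192.168.100.1">
      <site name="pacdat.pacdat.net"
            url="http://192.168.100.1/pacdat.pacdat.net/index.html"
            title="Pacific Data Capture"/>
      <site name="p-zip.com"
            url="http://192.168.100.1/p-zip.com/index.html"
            title="P-Zip Marketing"/>
    </hosted-content>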
How do you know you have the right "domain" if there are no registrations at ICANN?
What it comes down to is, how do you know which of the several/many links you see on your search engine's results page is the "right" one? How do they (the search engine people) know what to put on the page?
It comes down to reputation - "correct" pages have more points, and the points are made up of things that are hard to fake. Only one of those points today is the actual registered domain name. A suitable replacement might be a cryptographic binding between the physical (government, local, business) registrar and a facility on the web server purporting to represent that company. The registrar would be given a public key which it would publish (or the registrar would generate a key pair, publish the public one, and give the private one to the registered company/individual), and to verify, anyone (especially the search engines) would simply encrypt something with the public key and have the web site decrypt it for them - a match, and you know you're dealing with the right entity.
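A sketch of that challenge-response in Python, using the third-party cryptography package (key handling simplified; a real scheme would need key distribution, expiry, and so on):

    import os
    from cryptography.hazmat.primitives import hashes
    from cryptography.hazmat.primitives.asymmetric import rsa, padding

    # Registrar side: generate a key pair; the public half gets published,
    # the private half goes to the registered company's web server.
    private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
    public_key = private_key.public_key()

    oaep = padding.OAEP(mgf=padding.MGF1(algorithm=hashes.SHA256()),
                        algorithm=hashes.SHA256(), label=None)

    # Verifier (a search engine, say): encrypt a random challenge with
    # the published public key...
    challenge = os.urandom(32)
    ciphertext = public_key.encrypt(challenge, oaep)

    # ...web server: prove identity by decrypting with the private key.
    response = private_key.decrypt(ciphertext, oaep)
    assert response == challenge   # match - it's the right entity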
So, What Are The Key Elements Necessary to Eliminate ICANN?
- Search engines provide the IP address of each server they find matching information on, and give you a suitable URL to use to query the web server at that IP address. As an example, Google already has its own DNS servers; all they need to do is ensure that the "domain" of any URL they hand out either has the IP address in it, or resolves in their own DNS server. (That perverts the normal nature of such a server and is not likely to happen - but if ICANN and the hierarchical DNS disappeared tomorrow, this one minor change would keep the world working fairly well.) The "proper" way would be to establish a facility separate from traditional DNS that worked much the same way from the client's point of view.
- Some method of dealing with legacy services such as email and the fact that many/most email receiving systems check to see that a (sending) domain exists before allowing receipt. Of course the number of people who use services like Hotmail or Gmail instead of their own domains is pretty high, so maybe this is a non-issue.
This is put out as a very simple thought experiment. If the concepts here were put into practice, there are all manner of details to be addressed - things like how a server at a particular IP address announces to the world that it carries information about some set of things (RSS feed, sitemap, etc.), and some interesting details on securing the information about who holds title to a particular name and/or corporate name in each geo-political region. But just think about it: simply start using your registered name as a network identifier, and let the search engines figure out who really wants to find you. Maybe good, maybe bad - but definitely different from today's FBI-trumped, costly ICANN bureaucracy.