So where did it all start? In the first of three posts on the history of search engines, I discuss the history of the Internet itself and the precursors to modern web-based search engines (in the period immediately before the first web browser became widely used). On this whirlwind tour, I take in now-forgotten tools like Archie, Veronica, and WAIS.

A Brief History of the Internet

The Internet is arguably the greatest invention of the 20th century, allowing almost unlimited connection of people with each other and with the resources they seek. While the invention of the telegraph, telephone, radio, and computer laid the foundation for this communications revolution, it was a series of rapid technological developments during the 1960s that paved the way for the creation of the Internet.

Arguably the grandfathers of the Internet were J.C.R. Licklider and Leonard Kleinrock, both from the Massachusetts Institute of Technology (MIT). Licklider was the first head of the computer research program at the Defense Advanced Research Projects Agency (DARPA) and in August 1962 wrote a paper about a “Galactic Network” of globally interconnected computers, through which everyone could quickly access data and programs from anywhere. Kleinrock made this dream come true through his work on packet switching theory (1961-1964) and the creation of the first (albeit small) wide area network (or WAN) in 1965, connecting a TX-2 computer at MIT to a Q-32 in California.

Kleinrock worked closely with Lawrence G. Roberts in creating the WAN, and it was Roberts who wrote the design for the ARPANET (Advanced Research Projects Agency Network) in late 1966, increasingly collaborating with teams from the National Physical Laboratory (NPL) in the UK and the RAND Corporation (both of which had developed packet switching technologies independently, without knowledge of each other’s work).

During 1968 Bolt Beranek and Newman (BBN) was selected to build the ARPANET, and in September 1969 the first node was installed at the University of California, Los Angeles (UCLA). A month later, the second node was added (at the Stanford Research Institute) and the first host-to-host message ever sent over the network was launched from UCLA. The month I was born!

During the period from 1970 to 1972, many computers were added to the ARPANET, protocols were developed, and software was written. In 1972, BBN’s Ray Tomlinson developed the first email system and sent out the first email (“qwertyuiop”). The following year, the first ARPANET connections were made outside the US, to NORSAR in Norway and University College London (UCL) in the UK. For a great 1972 video documentary on the ARPANET, visit my blog.

Although the original ARPANET grew rapidly during the 1970s, it remained primarily an academic preserve. The next key step in the development of the modern web began in 1982, with the adoption by many players of the TCP/IP protocol, which was faster, easier to use, and less expensive to implement than earlier protocols. This in turn made it much easier for small networks to connect to the network and for those links to branch in all directions. From this point on, all networks using TCP/IP referred to themselves as part of the Internet (rather than the ARPANET), and standardization on TCP/IP allowed the number of Internet sites and users to grow exponentially.

To use an analogy, these developments created the easel, but there were still precious few paints for the artist to use. Most of the early mass-market Internet tools were overly technical in nature and difficult to use. Do any of you remember terms like WAIS (wide area search), Archie (archive search), Gopher (data retrieval), Newsnet and more?

Two key tools were to change all this forever. In 1989, Tim Berners-Lee and the team at CERN (European Laboratory for Particle Physics) invented the hypertext-based World Wide Web. Four years later, in 1993, Marc Andreessen of the US National Center for Supercomputing Applications (NCSA) launched Mosaic, the first web browser to be widely used. Tim’s original specifications for URIs, HTTP, and HTML were further refined in the following years, and Andreessen went on to develop the Netscape web browser, building on the ideas behind Mosaic.

The rest, as they say, is history! From this point, the Internet has grown exponentially. According to internetworldstats.com, in December 1995 there were only 16 million Internet users (0.4% of the world’s population), but this had increased to 361 million in December 2000 (an increase of more than 2,100%) and to 1,018 million in December 2005.

The world’s first search engines

The father and mother of the modern search engine were Archie and Veronica. Archie, developed in 1990 by Emtage, Heelan and Deutsch (students at McGill University in Montreal) was, in a sense, the world’s first search engine. Archie was a tool that indexed the files held on FTP servers and allowed users to search for and find specific files. Users had to have a pretty good idea of the filename they were looking for, since Archie only indexed filenames (although wildcards were supported, which helped).

In early versions of Archie, the system worked by simply running a job once a month to log in to each of the member FTP servers and request a listing. These listings were stored in local files for searching with the Unix grep command. Once users had found a file in Archie’s index, they had to connect to the FTP host and poke around until they found the file they were looking for (much like the early days of Napster’s music file sharing almost 10 years later). This was not for the faint-hearted, and the system was only widely used by technical or academic users!
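
To make the idea concrete, here is a minimal sketch in Python of a filename-only search over stored listings. It is not Archie's actual code (which, as noted, relied on grep over listing files); the host names, paths, and the search_filenames function are all invented for illustration, and Python's fnmatch module stands in for the wildcard support.

```python
import fnmatch
import posixpath

# Hypothetical listings, standing in for the files gathered from each FTP server.
LISTINGS = {
    "ftp.example.edu": ["/pub/gnu/emacs-18.59.tar.Z", "/pub/docs/README"],
    "ftp.example.org": ["/mirrors/x11/xterm.tar.Z", "/pub/gnu/gcc-1.40.tar.Z"],
}

def search_filenames(listings, pattern):
    """Return sorted (host, path) pairs whose file name matches a wildcard pattern."""
    hits = []
    for host, paths in listings.items():
        for path in paths:
            # Only the file name is compared, never the file's contents.
            name = posixpath.basename(path)
            if fnmatch.fnmatchcase(name, pattern):
                hits.append((host, path))
    return sorted(hits)

for host, path in search_filenames(LISTINGS, "*.tar.Z"):
    print(host, path)
```

Note that the result only says where a matching file lives. Fetching it is a separate step, which mirrors the "connect to the FTP host and poke around" experience described above.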

The name Archie derives from the word “archive”, but users associated it with the comic book series of the same name, created by Bob Montana (with fictional teenage characters Archie Andrews, Betty Cooper, Veronica Lodge, Reggie Mantle and Forsythe “Jughead” Jones). As such, when Gopher began to take off in 1992, Foster and Barrie (at the University of Nevada) named their newly developed Gopher search engine Veronica, after Archie’s comic book girlfriend. Officially, Veronica stood for “Very Easy Rodent-Oriented Net-wide Index to Computer Archives”.

Veronica was a constantly updated database of the names of nearly every menu item on thousands of Gopher servers and was searchable directly from most major Gopher menus. Veronica was, technically, an improvement over Archie in that it (a) indexed the full title of a document instead of just the file name, and (b) connected the user directly to the source file with a single click. What neither Archie nor Veronica did, however, was fully index the target document. This meant that both lacked so-called “semantic capacity”, that is, the ability to connect documents with different titles but similar content.

In 1991, Brewster Kahle (at Thinking Machines) launched the Wide Area Information Server (WAIS) at Xerox PARC. WAIS only enjoyed a brief presence on the Internet history stage. However, it could certainly be described as the first genuine forerunner of modern search engines, in that it was the first to fully index all the text in Gopher and other Internet documents. As Kahle said at the time, he wanted users to be able to “jump to the middle of the scroll.” WAIS complemented Veronica, which only searched the menu titles of Gopher sites, but it quickly became obsolete due to the rapid growth of the World Wide Web (which came to replace, or act as a front end to, FTP, Archie, Gopher and WAIS).
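
The difference between indexing titles only and indexing full text can be illustrated with a small Python sketch. This is not how Veronica or WAIS were actually implemented; the two documents, the tokenize and build_index functions, and the simple word-to-documents index are all assumptions made purely for illustration.

```python
import re

# Hypothetical documents: (title, body) pairs invented for illustration.
DOCS = {
    1: ("Campus phone directory", "Staff phone numbers and office locations."),
    2: ("Library opening hours", "The library is open late during exams; phone the desk for details."),
}

def tokenize(text):
    """Split text into lowercase words, dropping punctuation."""
    return re.findall(r"[a-z0-9]+", text.lower())

def build_index(docs, full_text):
    """Map each word to the set of document ids in which it appears."""
    index = {}
    for doc_id, (title, body) in docs.items():
        text = title + " " + body if full_text else title
        for word in tokenize(text):
            index.setdefault(word, set()).add(doc_id)
    return index

title_index = build_index(DOCS, full_text=False)  # titles only
full_index = build_index(DOCS, full_text=True)    # titles plus body text

print(sorted(title_index.get("phone", set())))
print(sorted(full_index.get("phone", set())))
```

A search for “phone” against the title-only index can find only the first document, while the full-text index also finds the second, whose title never mentions the word. That is the sense in which full-text indexing lets a user “jump to the middle of the scroll.”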

At the time of writing, internet sales account for approximately 15% of all sales in the UK (having increased by 50% over the last year). The numbers are even higher in North America. U-Switch predicts that 40% of all sales will be over the Internet by 2020, and Google is now the number one brand in the world; not bad for a business less than ten years old! Sometimes it seems incredible to me that so much has happened so quickly. My reason for writing this series of articles is, in part, to bear witness to those early pioneers of the Internet and search, so that we don’t forget their vital contributions.

In the second part of the series, on my blog, I review web search before Google came to dominate, looking at the first web crawler, the World Wide Web Wanderer, and early pioneers like AltaVista and Northern Light.