8.4 Networking


This section under major construction.

Timeline of the Internet.

Networking. Client-server model, peer-to-peer networking.

TCP/IP. Created by Bob Kahn and Vint Cerf.

World Wide Web. Vannevar Bush was a visionary who anticipated what would become the World Wide Web in his famous 1945 essay As We May Think. The essay describes a theoretical model for storing information and accessing it by following links from one piece of data to another. Ted Nelson and Doug Engelbart developed this idea into what we now know as hypertext. In 1989-1990, Tim Berners-Lee made Bush's dream a reality. He formatted hypertext using HTML (Hypertext Markup Language), wrote a browser that he called WorldWideWeb, and set up the first web server, info.cern.ch. Use of the WWW became popular in the mid 1990s and is now indispensable to everyday student life.

Protocols.

telnet www.nytimes.com 80
Trying 199.239.136.200...
Connected to www.nytimes.com.
Escape character is '^]'.
GET /2003/12/23/technology/23linux.html HTTP/1.0
Host: www.nytimes.com
Referer: http://news.google.com
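
The telnet session above can be reproduced from Java with a plain socket. The following is a minimal sketch, not code from this booksite: the class name HttpGetSketch and its two methods are made up for illustration, and the request only succeeds if the server still accepts unencrypted HTTP on port 80 (many sites now redirect to HTTPS).

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStream;
import java.net.Socket;
import java.nio.charset.StandardCharsets;

// Hypothetical helper class; the name and methods are illustrative only.
class HttpGetSketch {

    // Build the text of a minimal HTTP/1.0 GET request.
    // Each line ends with CR LF; an empty line ends the request.
    static String buildRequest(String host, String path) {
        return "GET " + path + " HTTP/1.0\r\n"
             + "Host: " + host + "\r\n"
             + "\r\n";
    }

    // Open a TCP connection to port 80, send the request, and print the reply.
    static void fetch(String host, String path) throws IOException {
        try (Socket socket = new Socket(host, 80)) {
            OutputStream out = socket.getOutputStream();
            out.write(buildRequest(host, path).getBytes(StandardCharsets.US_ASCII));
            out.flush();
            BufferedReader in = new BufferedReader(
                new InputStreamReader(socket.getInputStream(), StandardCharsets.ISO_8859_1));
            String line;
            while ((line = in.readLine()) != null) {   // null means the server closed the connection
                System.out.println(line);
            }
        }
    }

    public static void main(String[] args) throws IOException {
        // usage: java HttpGetSketch host path
        fetch(args[0], args[1]);
    }
}
```

The reply begins with a status line and headers, followed by an empty line and then the body of the page.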

Web server. Java makes it easy to communicate with a web server. The data type URL represents a uniform resource locator. A resource can be a file or a website. To read the contents of a web page, use our In class.

In in = new In("http://www.cnn.com");
while (in.hasNextLine()) {
    String line = in.readLine();
    System.out.println(line);
}    
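
For readers who want to see roughly what such a read involves without the In class, here is a sketch that uses only the standard library. It is an assumption-laden illustration rather than the implementation of In: the class UrlReaderSketch and its methods readAll() and readUrl() are hypothetical names, and the text is assumed to be UTF-8.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.net.URI;
import java.nio.charset.StandardCharsets;

// Hypothetical class for illustration; not part of the booksite library.
class UrlReaderSketch {

    // Read every line from the reader and return the text, one '\n' per line.
    static String readAll(Reader source) {
        StringBuilder sb = new StringBuilder();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                sb.append(line).append('\n');
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    // Open a stream to the resource named by the URL and read it as UTF-8 text.
    static String readUrl(String address) {
        try {
            return readAll(new InputStreamReader(
                URI.create(address).toURL().openStream(), StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // usage: java UrlReaderSketch http://www.example.com
        System.out.print(readUrl(args[0]));
    }
}
```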

Traceroute. How do my packets get from A to B, and how long does it take for them to get there?

% traceroute cornell.edu
traceroute to cornell.edu (132.236.56.6), 30 hops max, 40 byte packets
 1  ignition (128.112.139.1)  0.860 ms  0.599 ms  0.653 ms
 2  fw-mgmt (128.112.138.2)  1.209 ms  0.493 ms  0.532 ms
 3  csgate-subnet193-192 (128.112.139.193)  1.017 ms  0.957 ms  0.838 ms
 4  gigagate1.Princeton.EDU (128.112.128.114)  1.070 ms  0.956 ms  0.905 ms
 5  vgate1.Princeton.EDU (128.112.12.22)  2.498 ms  0.998 ms  1.000 ms
 6  local.princeton.magpi.net (198.32.42.65)  2.657 ms  2.823 ms  3.699 ms
 7  remote1.abilene.magpi.net (198.32.42.210)  9.740 ms  5.872 ms  8.518 ms
 8  nycmng-washng.abilene.ucaid.edu (198.32.8.84)  10.191 ms  9.677 ms  10.253 ms
 9  nyc-gsr-abilene-nycm.nysernet.net (199.109.4.129)  9.892 ms  9.575 ms  9.620 ms
10  nyc-m20-nyc-gsr.nysernet.net (199.109.4.2)  9.997 ms  11.565 ms  11.049 ms
11  cornell-nyc-m20.nysernet.net (199.109.5.29)  16.958 ms  16.989 ms  17.246 ms
12  core1-msfc-dmz1.cit.cornell.edu (128.253.222.5)  17.504 ms  17.054 ms  17.038 ms
13  bb3-msfc-0000-07-vl7.cit.cornell.edu (128.253.222.167)  17.463 ms  18.548 ms  17.245 ms
14  cornell.edu (132.236.56.6)  17.142 ms *  17.352 ms

Mail. The program Mail.java uses sockets to create an SMTP (Simple Mail Transfer Protocol) client that connects to a mail server on port 25. It is a crude program for sending email.
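
To give a feel for the protocol, the sketch below lists the lines that a simple SMTP client sends to deliver one message. It is not Mail.java: the class SmtpDialogueSketch, the method commands(), and the example.com addresses are all made up for illustration, and the sketch only builds the lines rather than sending them.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch; this is not the booksite's Mail.java.
class SmtpDialogueSketch {

    // Return the lines a client sends, in order, to deliver one message.
    // A real client writes each line followed by CR LF to the socket and
    // reads the server's numeric reply before sending the next command.
    static List<String> commands(String clientHost, String from, String to,
                                 String subject, String body) {
        List<String> lines = new ArrayList<>();
        lines.add("HELO " + clientHost);          // identify the client
        lines.add("MAIL FROM:<" + from + ">");    // envelope sender
        lines.add("RCPT TO:<" + to + ">");        // envelope recipient
        lines.add("DATA");                        // message text follows
        lines.add("From: " + from);               // header displayed to the reader
        lines.add("To: " + to);
        lines.add("Subject: " + subject);
        lines.add("");                            // empty line ends the headers
        lines.add(body);
        lines.add(".");                           // a lone period ends the message
        lines.add("QUIT");
        return lines;
    }

    public static void main(String[] args) {
        for (String line : commands("client.example.com", "alice@example.com",
                                    "bob@example.com", "Hello", "This is a test.")) {
            System.out.println(line);
        }
    }
}
```

Note that the From: header inside the message is just text supplied by the client; it is separate from the MAIL FROM envelope address.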

What happens if you change the originating and reply addresses in email? You may be surprised to learn that the basic SMTP protocol has no built-in authentication mechanism, so you can make an email look like it came from anywhere. This is called email spoofing. Spoofing has some legitimate uses (e.g., a whistle-blower who wishes to remain anonymous), but it is mostly used by spammers to mask their identities. If you carefully examine the header of such a forged email, you can see the IP address of the machine that connected to port 25. However, a casual Internet user will be duped. Of course, you should never use this deceptive technique without prior consent from the recipient. It is illegal in some jurisdictions.

An open relay is an SMTP email server that processes mail that is neither to nor from a local user. If smtp.princeton.edu were an open relay, then you could run Mail.java from any computer, even if it were outside the princeton.edu domain. Spammers regularly exploit such open relays to "launder" their email. Open relays enable a spammer to anonymously send vast amounts of email, using someone else's resources. If you run a mail server, be sure that it is not an open relay.

Echo client and server. The program EchoClient.java establishes a connection with a server (on port 4444), reads lines from standard input, sends them to the server, and prints the server's responses. It uses In.java and Out.java. The program EchoServer.java is the companion server program. It listens for connection requests from clients on port 4444. (You can use any port from 1024 to 65,535; ports 0-1023 are reserved for "well-known" services, e.g., 80 for http and 21 for ftp.) Upon receiving a request, it establishes a connection, reads lines from the client, and echoes them back to the client. The statement

ServerSocket serverSocket = new ServerSocket(4444);

creates a ServerSocket that listens for connection requests on port 4444. The key line

Socket clientSocket = serverSocket.accept();

makes the server wait until a connection request arrives, and then creates a Socket connection with the client. This is a blocking statement: the program comes to a standstill until the accept method returns. The other key part of the server code is:

String s;
while ((s = in.readLine()) != null) {
    out.println(s);
}

In this context, in is the input stream coming from the client and out is the output stream going to the client. This loop repeatedly reads strings from the client and echoes them back to the client. The call to readLine() is blocking so the program comes to a stop until it returns a String. When the client finally disconnects, readLine() returns null and the server can continue. To execute the server and the client, start the server, then execute the client program:

OS X, Linux
-----------
% java EchoServer &
% java EchoClient username localhost

Windows
-----------
> start java EchoServer
> java EchoClient username localhost
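
The pieces described above can be combined into one small program that runs both ends on the same machine. This is a sketch under stated assumptions, not the booksite's EchoServer.java or EchoClient.java: the class EchoSketch and its methods are hypothetical, it uses standard library readers and writers instead of In and Out, and it asks the operating system for any free port (port 0) instead of using 4444.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.io.UncheckedIOException;
import java.net.InetAddress;
import java.net.ServerSocket;
import java.net.Socket;

// Hypothetical sketch; not the booksite's EchoServer.java or EchoClient.java.
class EchoSketch {

    // Accept one client and echo each line back until the client disconnects.
    static void serveOneClient(ServerSocket serverSocket) throws IOException {
        try (Socket clientSocket = serverSocket.accept();   // blocks until a client connects
             BufferedReader in = new BufferedReader(
                 new InputStreamReader(clientSocket.getInputStream()));
             PrintWriter out = new PrintWriter(clientSocket.getOutputStream(), true)) {
            String s;
            while ((s = in.readLine()) != null) {            // null means the client closed
                out.println(s);
            }
        }
    }

    // Start a server on a free port, send one line to it, and return the echo.
    static String roundTrip(String message) {
        try (ServerSocket serverSocket = new ServerSocket(0)) {   // 0 = any free port
            Thread server = new Thread(() -> {
                try { serveOneClient(serverSocket); }
                catch (IOException e) { e.printStackTrace(); }
            });
            server.start();
            try (Socket socket = new Socket(InetAddress.getLoopbackAddress(),
                                            serverSocket.getLocalPort());
                 BufferedReader in = new BufferedReader(
                     new InputStreamReader(socket.getInputStream()));
                 PrintWriter out = new PrintWriter(socket.getOutputStream(), true)) {
                out.println(message);    // autoflush sends the line immediately
                return in.readLine();    // blocks until the server echoes it
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(roundTrip("hello, world"));
    }
}
```

The server runs in its own thread here only so that one program can play both roles; with two separate programs, as in the commands above, no extra thread is needed.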

The program ChatClient.java is a simple GUI version of EchoClient.java. The user types messages into a JTextField and hits enter when they wish to send the message to the server. The results appear in a JTextArea.

Threads, deadlocking, and synchronization. Threads let two or more parts of a program run "in parallel." A race condition occurs when two (or more) threads access shared data and the resulting behavior depends on how the threads are scheduled. Avoid race conditions by locking an object so that it can't be accessed by another thread until the object is unlocked.

A thread might need to wait for another thread to be done with an object. This requires careful coordination, and deadlock can result. Unsynchronized blocks of code accessing the same value can corrupt state if you're not careful. For example, if two threads each execute count++ on a shared variable, one of the two updates can be lost. On the other hand, synchronized blocks of code can deadlock if you're not careful. For example, if one thread holds the lock on object a and waits for the lock on object b while another thread holds b and waits for a, neither can ever proceed. Even if methods set() and get() are synchronized, the code fragment a.set(a.get() + 1) can result in unpredictable behavior if another thread accesses a in between the calls to a.get() and a.set(). Avoid threaded programming when you can; concurrency errors are notoriously hard to debug.
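
The a.set(a.get() + 1) hazard can be observed with a small experiment. The classes Counter and RaceSketch below are hypothetical and exist only for this illustration. Two threads each update a shared counter n times, either with the get-then-set fragment or with a single synchronized increment() method that reads and writes under one lock.

```java
// Hypothetical class for illustration.
class Counter {
    private int value = 0;

    public synchronized int  get()      { return value; }
    public synchronized void set(int v) { value = v; }

    // The read and the write happen under one lock, so no update is lost.
    public synchronized void increment() { value++; }
}

class RaceSketch {

    // Run two threads that each perform n updates; return the final count.
    static int run(int n, boolean atomic) {
        Counter a = new Counter();
        Runnable work = () -> {
            for (int i = 0; i < n; i++) {
                if (atomic) a.increment();
                else        a.set(a.get() + 1);   // another thread can run between get and set
            }
        };
        Thread t1 = new Thread(work);
        Thread t2 = new Thread(work);
        t1.start();
        t2.start();
        try {
            t1.join();    // wait for both threads to finish
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return a.get();
    }

    public static void main(String[] args) {
        int n = 1000000;
        System.out.println("expected:            " + 2 * n);
        System.out.println("a.set(a.get() + 1):  " + run(n, false));  // may be less than expected
        System.out.println("a.increment():       " + run(n, true));
    }
}
```

The result of the get-then-set version varies from run to run and from machine to machine, which is exactly what makes such bugs hard to track down.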

Chat server. The echo client and server demonstrate two programs communicating over sockets. However, only one echo client can communicate with the server at a time. The program ChatServer.java uses threads to allow an arbitrary number of clients to connect at one time. Furthermore, it broadcasts each message it receives to all of the connected clients. This is a bare-bones chatroom. It uses the helper classes Connection.java and ConnectionListener.java. For the client program, we can reuse ChatClient.java exactly as is since it already behaves exactly as we wish: it sends messages to the server and echoes back everything that the server transmits.
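
One way to get a feel for the thread-per-client idea is the sketch below. It does not reproduce ChatServer.java, Connection.java, or ConnectionListener.java; the class BroadcastSketch and its methods are hypothetical, and it swaps in a simpler design in which each client thread broadcasts directly to a shared, thread-safe list of output streams.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.PrintWriter;
import java.net.ServerSocket;
import java.net.Socket;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical sketch of a broadcasting server; not the booksite's design.
class BroadcastSketch {
    // Thread-safe list: client threads add, remove, and iterate concurrently.
    private final List<PrintWriter> clients = new CopyOnWriteArrayList<>();

    void register(PrintWriter out)   { clients.add(out); }
    void unregister(PrintWriter out) { clients.remove(out); }

    // Send one message to every connected client.
    void broadcast(String message) {
        for (PrintWriter out : clients) {
            out.println(message);
        }
    }

    // Serve one client: relay each line it sends to all clients.
    private void handle(Socket socket) {
        try (Socket s = socket;
             BufferedReader in = new BufferedReader(new InputStreamReader(s.getInputStream()));
             PrintWriter out = new PrintWriter(s.getOutputStream(), true)) {
            register(out);
            try {
                String line;
                while ((line = in.readLine()) != null) broadcast(line);
            } finally {
                unregister(out);   // stop sending to a client that has left
            }
        } catch (IOException e) {
            e.printStackTrace();
        }
    }

    // Accept clients forever, giving each one its own thread.
    void serve(int port) throws IOException {
        try (ServerSocket serverSocket = new ServerSocket(port)) {
            while (true) {
                Socket socket = serverSocket.accept();
                new Thread(() -> handle(socket)).start();
            }
        }
    }

    public static void main(String[] args) throws IOException {
        new BroadcastSketch().serve(4444);
    }
}
```

Because accept() returns to the loop as soon as a new thread is started, the server can keep accepting clients while earlier ones are still connected.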

Threads. Java makes dealing with threads as easy as possible, but it is still a difficult task because the flow of execution is no longer as clear.

Synchronization. Connection.java is an example of a producer/consumer relationship. Each Connection reads in messages from its client. The ConnectionListener extracts messages from the Connection and broadcasts them to all of the clients. We must be careful to synchronize this activity so that each message is broadcast once and only once. When a message arrives from the client, setMessage() is called to store it in the variable message. When the message is ready to be broadcast to all clients, getMessage() is called to retrieve the string; it then sets message to null to indicate that it is done with the message. To ensure that setMessage() is never called twice consecutively without an intervening getMessage(), we coordinate the two methods using wait() and notifyAll(). If setMessage() is called before getMessage() has retrieved the previous message, then message is not null, so setMessage() executes the wait() statement. This blocks setMessage() from further execution until another method invokes notifyAll(). When getMessage() is done processing a message, it sets message to null and calls notifyAll() to unblock setMessage(). The synchronized keyword ensures that at most one thread executes getMessage() or setMessage() on a given object at any instant.

public synchronized String getMessage() {
    if (message == null) return null;
    String temp = message;
    message = null;
    notifyAll();
    return temp;
}

public synchronized void setMessage(String s) {
    // wait in a loop so that the condition is rechecked after every wakeup
    while (message != null) {
        try                             { wait();               }
        catch (InterruptedException ex) { ex.printStackTrace(); }
    }
    message = s;
}
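
To see the two methods cooperate, the sketch below places them in a small stand-alone class and drives them with one producer thread and one consumer. The classes MessageSlot and SlotDemo are hypothetical and are not part of Connection.java. The sketch wraps wait() in a while loop, the usual guard against a thread waking up while the slot is still full, and assumes that messages are never null.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical one-slot mailbox modeled on getMessage() and setMessage().
class MessageSlot {
    private String message;   // null means the slot is empty

    public synchronized String getMessage() {
        if (message == null) return null;
        String temp = message;
        message = null;
        notifyAll();              // wake a producer waiting for the slot to empty
        return temp;
    }

    public synchronized void setMessage(String s) throws InterruptedException {
        while (message != null) wait();   // slot is full: release the lock and wait
        message = s;
    }
}

class SlotDemo {

    // One producer thread sends the messages; the calling thread consumes them.
    static List<String> transfer(List<String> toSend) {
        MessageSlot slot = new MessageSlot();
        Thread producer = new Thread(() -> {
            try {
                for (String s : toSend) slot.setMessage(s);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        producer.start();

        List<String> received = new ArrayList<>();
        while (received.size() < toSend.size()) {
            String s = slot.getMessage();      // null if nothing is waiting
            if (s != null) received.add(s);
            else           Thread.yield();     // let the producer run
        }
        return received;
    }

    public static void main(String[] args) {
        System.out.println(transfer(Arrays.asList("a", "b", "c")));   // prints [a, b, c]
    }
}
```

Because the slot holds at most one message and there is a single producer, each message is delivered exactly once and in order.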

Q + A

Q. Can a thread call a synchronized method on an object for which it already holds the lock?

A. Yes. Java locks are reentrant.
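
A tiny illustration of reentrancy, using a hypothetical class ReentrantSketch: outer() holds the lock on the object when it calls inner(), which is synchronized on the same object. If locks were not reentrant, the call would block forever.

```java
// Hypothetical class for illustration.
class ReentrantSketch {
    private int depth = 0;

    // Holds the lock on this object while running.
    public synchronized int outer() {
        depth++;
        return inner();   // acquires the same lock again; does not block
    }

    public synchronized int inner() {
        depth++;
        return depth;
    }

    public static void main(String[] args) {
        System.out.println(new ReentrantSketch().outer());   // prints 2
    }
}
```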

Creative Exercises

  1. Stock quote. Write a program that takes the three-letter symbol of a stock as a command-line argument, queries the web (say, cbs.marketwatch.com), and prints the current price of the stock.
  2. Curl. Curl is a command-line program that takes the URL of a web page as an argument and prints its contents. Write a Java program that does the same.
  3. Dead link checker. Write a program that takes the URL of a web page as a command line argument and checks all of the hyperlinks in the page to see if they are valid. Use regular expressions to identify the hyperlinks. To start, only check completely specified URLs, e.g., that start with http://. Then, allow relative hyperlinks.