3.1 Using Data Types
In the first two chapters of this booksite our programs were confined to operations on numbers, booleans, and strings. Of course, the reason is that the Python data types that we've encountered so far —
str — manipulate numbers, booleans, and strings, using familiar operations. In this chapter, we begin to consider other data types.
In this section, we focus on client programs that use existing data types, to give you some concrete reference points for understanding these new concepts and to illustrate their broad reach. We introduce constructors to create objects from a data type and methods to operate on their values. We consider programs that manipulate electrical charges, colors, images, files, and web pages — quite a leap from the built-in data types that we used in previous chapters.
A method is a function associated with a specified object (and, by extension, with the type of that object). That is, a method corresponds to a data-type operation.
We can call (or invoke) a method by using a variable name, followed by the dot operator (.), followed by the method name, followed by a list of arguments delimited by commas and enclosed in parentheses. As a simple example, Python's built-in
int type has a method named
bit_length(), so you can determine the number of bits in the binary representation of an
int value as follows:
x = 3 ** 100 bits = x.bit_length() stdio.writeln(bits)
This code writes 159 to standard output, telling you that 3100 (a huge integer) has 159 bits when expressed in binary.
The syntax and behavior of method calls is nearly the same as the syntax and behavior of function calls. For example, a method can take any number of arguments, those arguments are passed by object reference, and the method returns a value to its caller. Similarly, like a function call, a method call is an expression. The main difference is syntactic: you invoke a method using a specified object and the dot operator. In object-oriented programming we usually prefer method-call syntax to function-call syntax because it emphasizes the role of the object.
The key difference between a function and a method is that a method is associated with a specified object. You can think of this specified object as an extra argument that gets passed to a function, in addition to the ordinary methods argument.
Your experience with using
str demonstrates that you do not need to know how a data type is implemented to be able to use it. You know that
str values are sequences of characters and that you can perform the operation of concatenating two
str values to produce a
str data type includes many other operations, as documented in the API shown below.
The operations in the
str API can be divided into three categories:
- Built-in operators
not in, and the comparison operators, which are characterized by special symbols and syntax
- A built-in function
len()with standard function-call syntax
find(), and so forth, which are distinguished in the API with a variable name followed by the dot operator
From this point forward, any API that we might consider will have these kinds of operations. Next, we consider each of them in turn.
Built-in operators.An operator (or function) that you can apply to more than one type of data is said to be polymorphic. You have already been using the
+operator, familiar for numbers, for string concatenation. The API tells you that you can use the
operator, familiar for arrays, to extract a single character from a string and the
[:]operator to extract a substring from a string.
Built-in functions.Python also builds in a number of polymorphic functions, such as the
len()function. Polymorphic functions are like polymorphic operators, but without the special syntax.
Methods.We include built-in operators and built-in functions for convenience (and to conform to Python norms), but most of our effort in creating data types goes into developing methods that operate on object values, such as
find(), and the other methods in the
The table below gives several examples of simple string-processing applications that illustrate the utility of the various operations in Python's
str data type.
The program potentialgene.py is a more substantial example of string processing. The program accepts a DNA sequence as a command-line argument, and determines whether it corresponds to a potential gene. The textbook provides details.
A User-Defined Data Type
As a running example of a user-defined data type, we will consider a data type
Charge for charged particles. In particular, we are interested in a two-dimensional model that uses Coulomb's law, which tells us that the electric potential at a point due to a given charged particle is represented by V = kq/r, where q is the charge value, r is the distance from the point to the charge, and k = 8.99 × 109 N m2/C2 is a constant known as the electrostatic constant, or Coulomb's constant. For consistency, we use SI (Systeme International d'Unites): in this formula, N designates newtons (force), m designates meters (distance), and C represent coulombs (electric charge). When there are multiple charged particles, the electric potential at any point is the sum of the potentials due to each charge.
Application programming interface.We specify the behavior of the
Chargedata type by listing its operations in an API, deferring the discussion of the implementation until Section 3.2.
The first entry in the API, which has the same name as the data type, is known as a
constructor. Each call to the
Charge constructor creates exactly one new
Charge object. The other two entries define the data-type operations. The first is a method
potentialAt(), which computes and returns the potential due to the charge at the given point (
y). The second is the built-in function
str(), which returns a string representation of the charged particle.
File conventions.The code that defines a user-defined data type resides in a
.pyfile. By convention, we define each data type in a distinct
.pyfile, with the same name as the data type (but not capitalized). Thus, the
Chargedata type is found in a file named
charge.py. To compose a client program that uses the
Chargedata type, we put the following
importstatement at the top of the client .py file:
from charge import Charge
Note that the format of the
import statement that we use with user-defined data types differs from the format we use with functions.
Creating objects.To create an object from a user-defined data type, you call its constructor, which directs Python to create a new individual object. You call a constructor just as if it were a function, using the name of the data type, followed by the constructor's arguments enclosed in parentheses and separated by commas. For example,
Charge(x0, y0, q0)creates a new
Chargeobject with position (
y0) and charge value
q0and returns a reference to the new object.
You can create any number of objects of the same data type. Recall from Section 1.2 that each object has its own identity, type, and value. So, while any two objects reside at distinct places in computer memory, they may be of the same type and store the same value.
Calling a method.As discussed at the beginning of this section, we typically use a variable name to identify the object to be associated with the method we intend to call. For our example, the method call
c1.potentialAt(.20, .50)returns a float that represents the potential at the query point (0.20, 0.50) due to the
Chargeobject referenced by
c1. The distance between the query point and the charge location is 0.34, so this potential is 8.99 × 109 × 21.3 / 0.34 = 5.63 × 1011.
String representation.In any data-type implementation, it is typically worthwhile to include an operation that converts an object's value to a string. Python has a built-in function
str()for this purpose, which you been using from the beginning to convert integers and floats to strings for output. Since our
ChargeAPI has a
str()implementation, any client can call
str()to get a string representation of a
Chargeobject. For our example, the call
str(c1)returns the string
'21.3 at (0.51, 0.63)'.
These mechanisms are summarized in the client chargeclient.py, which creates two
Charge objects and computes the total potential due to the two charges at a query point taken from the command line.
Next we consider several more examples of user-defined data types.
Color is a sensation in the eye attributable to electromagnetic radiation. Since we often want to view and manipulate color images on our computers, color is a widely used abstraction in computer graphics.To represent color values, we define a
Color uses the RGB color model, in which a color is defined by three integers, each between 0 and 255, that represent the intensity of the red, green, and blue (respectively) components of the color. Other color values are obtained by mixing the red, green, and blue components.
Color has a constructor that takes three integer arguments, so that you can compose the code
red = color.Color(255, 0, 0) blue = color.Color( 0, 0, 255)
to create objects whose values represent pure red and pure blue, respectively. We have been using colors in
stddraw since Section 1.5, but have been limited to a set of predefined colors, such as
stddraw.PINK. Now you have millions of colors available for your use.
The program alberssquares.py is a
stddraw client that allows you to experiment with colors. The program accepts two colors from the command line, and displays the colors in the format developed in the 1960s by the color theorist Josef Albers, who revolutionized the way that people think about color.
% python alberssquares.py 9 90 166 100 100 100
% python alberssquares.py 0 174 239 147 149 252
Luminance.The quality of the images on modern displays such as LCD monitors, LED TVs, and cellphone screens depends on an understanding of a color property known as monochrome luminance, or effective brightness. A standard formula for luminance is derived from the eye's sensitivity to red, green, and blue. It is a linear combination of the three intensities: if a color's red, green, and blue values are r, g, and b, respectively, then its luminance is defined by this formula:
Y = 0.299r + 0.587g + 0.114b
Grayscale.The RGB color model has the property that when all three color intensities are the same, the resulting color is on a grayscale that ranges from black (all 0s) to white (all 255s). To print a color photograph in a black-and-white newspaper (or a book), we need a function to convert from color to grayscale. A simple way to convert a color to grayscale is to replace the color with a new one whose red, green, and blue values equal its monochrome luminance.
Color compatibility.The luminance value is also crucial in determining whether two colors are compatible, in the sense that printing text in one of the colors on a background in the other color will be readable. A widely used rule of thumb is that the difference between the luminance of the foreground and background colors should be at least 128. For example, black text on a white background has a luminance difference of 255, but black text on a (book) blue background has a luminance difference of only 74.
Program luminance.py is a module that we can use to convert a color to grayscale and to test whether two colors are compatible, for example, when we use colors in
stddraw applications. The functions in luminance.py illustrate the utility of using data types to organize information. Using the
Color data type and passing objects as arguments makes these implementations substantially simpler than the alternative of having to pass around the three intensity values. Returning multiple values from a function also would be awkward and more error prone without the
Color data type.
Digital Image Processing
We have been using
stddraw to plot geometric objects (points, lines, circles, squares) in a window on the computer screen. The basic abstraction for computer displays is the same one that is used for digital photographs and is very simple: A digital image is a rectangular grid of pixels (picture elements), where the color of each pixel is individually defined. Digital images are sometimes referred to as raster or bitmapped images.
Picture data type, defined in the module
picture.py, implements the digital image abstraction. The set of values is nothing more than a two-dimensional array of
Color values, and the operations are described by this API:
By convention, (0, 0) is the upper leftmost pixel, so the image is laid out as in the customary order for arrays (by contrast, the convention for
stddraw is to have the point (0,0) at the lower-left corner, so that drawings are oriented as in the customary manner for Cartesian coordinates).
Picture constructor you can create a
Picture object by reading an image from a
.jpg file. Using the
save() method, you can save the images that you create in either
.jpg format; subsequently can view the images that you create in the same way that you view photographs or other images. In addition, the
stddraw module supports a
picture() function that allows you to draw a given
Picture object in the standard drawing window along with lines, rectangles, circles, and so forth.
Grayscale.The program grayscale.py is a filter that takes a file name from the command line and produces a grayscale version of that image. It creates a new
Pictureobject initialized with the color image, then sets the color of each pixel to a new
Colorhaving a grayscale value computed by applying the
toGray()function in luminance.py to the color of the corresponding pixel in the source. Try running it on the files mandrill.jpg, mandrill.png, darwin.jpg, and darwin.png.
% python grayscale.py mandrill.jpg
% python grayscale.py darwin.jpg
Scaling.One of the most common image-processing tasks is to make an image smaller or larger.
The program scale.py takes a filename and the width and height of the target image as command-line arguments, and rescales the image to the specified size. Try running it on mandrill.jpg, mandrill.png, darwin.jpg, and darwin.png.
% python scale.py mandrill.jpg 200 200
% python scale.py mandrill.jpg 200 100
% python scale.py mandrill.jpg 100 200
Fade effect.The program fade.py is a
stddrawclient that uses a linear interpolation strategy to implement a fade effect. It computes n - 1 intermediate images, with each pixel in the tth image being a weighted average of the corresponding pixels in the source and target. The function
blend()implements the interpolation: the source color is weighted by a factor of 1-t/n and the target color by a factor of t/n (when t is 0, we have the source color; when t is n, we have the target color). Note that fade.py assumes that the images have the same width and height; if you have images for which this is not the case, you can use scale.py to create a scaled version of one or both of them for fade.py. Try running it on various permutations of mandrill.jpg, mandrill.png, darwin.jpg, and darwin.png.
% python fade.py mandrill.png darwin.png 5
Potential value visualization.Image processing is also helpful in scientific visualization. Program potential.py visualizes the potential values created by a set of charges. It relies on the data file charges.txt. The calculation at the heart of the approach is very simple: for each pixel, we compute corresponding (x, y) values in the unit square, then call
potentialAt()for each charge to find the potential at that point due to all of the charges, summing the values returned.
% python potential.py < charges.txt
Input and Output Revisited
In Section 1.5, you learned how to read and write numbers and text using our
stdio module. When using
stdio, we are dependent upon the operating system's piping and redirection mechanism for access to files, and any one program can work with just one input file and one output file.
In this section we define the data types
OutStream for input streams and output streams, respectively. Rather than being restricted to just one input stream and one output stream, we can readily create multiple objects of each data type, connecting the streams to various sources and destinations. We also get the flexibility to set variables to reference such objects, pass them as arguments to or return values from functions or methods, and create arrays of them, manipulating them just as we manipulate objects of any data type.
Input stream data type.Our data type
InStream, defined in the module
instream.py, is a more general version of the reading aspects of
stdiothat supports reading numbers and text from files and websites as well as the standard input stream. This is its API:
When you call the
InStream constructor with a string argument, the constructor first tries to find a file on your local computer with that name. If it cannot do so, it assumes the argument is a website name and tries to connect to that website. (If no such website exists, it raises an
IOError at run time.) In either case, the specified file or website becomes the source of the input for the
InStream object thus created, and the
read*() methods will read input from that stream.
Output stream data type.Similarly, our data type
OutStream, defined in the module
outstream.py, is a more general version of the writing aspect of
stdiothat supports writing strings to a variety of output streams, including standard output and files. Again, the API specifies the same methods as its
You specify the file that you want to use for output by using the one-argument constructor with the file's name as argument.
OutStream interprets this string as the name of a new file on your local computer, and sends its output there.
File concatenation and filtering.The program cat.py is a sample client of
OutStreamthat uses multiple input streams to concatenate several input files into a single output file. For example, this command concatenates files in1.txt and in2.txt to create the file
python cat.py in1.txt in2.txt out.txt
Screen scraping.The program stockquote.py is a client of the
InStreamdata types. It queries a web page, extracts some information, and reports the results — a process known as screen scraping. Specifically, the program accepts as a command-line argument the symbol of a New York Stock Exchange stock and writes its current trading price to standard output. For example, if the command-line argument is
goog(the NYSE symbol for Google), the program reads the Web page
InStreamconstructor, identifies the stock price of Google using
strmethods, and writes the stock price to standard output. The program depends on the format of the Yahoo web page; if Yahoo changes their web page format, we would need to change our program. Nevertheless, this is likely more convenient than maintaining the data ourselves.
Extracting data.The program split.py uses one
InStreamobject and multiple
OutStreamobjects to split a CSV (comma-separated value) file into separate files, one for each comma-delimited field. For example, the command
python split.py djia 3
splits the file djia.csv into files
In Python, we create objects by calling a constructor. Each time we create an object, Python reserves computer memory for that object. But when and how do we destroy objects so that the memory in which they reside can be freed for reuse? We will briefly address this question.
Orphaned objects.The ability to bind a variable to a different object creates the possibility that a program may have created an object that it can no longer reference. For example, consider the three assignment statements in the figure at right. After the third assignment statement, not only do
c2refer to the same
Chargeobject (the one at (.51, .63) with charge value 21.3), but there is also no longer a reference to the
Chargeobject that was created and used to initialize
c2. Such an object is said to be an orphan. Objects can also become orphans when variables go out of scope.
Memory management of objects.Programs tend to create huge numbers of objects but have a need for only a small number of them at any given time. Accordingly, programming languages and systems need mechanisms to create objects (and allocate memory), and mechanisms to destroy objects (and free memory) when the objects become orphans. Most programming systems take care of allocating memory for variables when they come into existence and freeing that memory when they go out of scope. Memory management for objects is more complicated: Python knows to allocate memory for an object when it is created, but cannot know precisely when to free the memory associated with an object, because the dynamics of a program in execution determine when an object becomes an orphan and so should be destroyed. There is no way for the system to predict what a program will do, so it has to monitor what a program is doing and take action accordingly.
In many languages (such as C and C++), the programmer is responsible for both allocating and deallocating memory. Doing so is tedious and notoriously error-prone. One of Python's most significant features is its ability to automatically manage memory. The idea is to free the programmers from the responsibility of managing memory by keeping track of orphaned objects and returning the memory they use to a pool of free memory. Reclaiming memory in this way is known as garbage collection, and Python's type system enables it to perform this operation this efficiently and automatically.
Q & A
Q. What happens if I call a method that is not defined for a given object?
A. Python raises an
AttributeError at run time.
Q. Why can we use
stdio.writeln(x) instead of
stdio.writeln(str(x)) to write an object
x that is not a string?
A. For convenience, the
stdio.writeln() function calls the built-in
str() function automatically whenever a string object is needed.
Q. I noticed that potential.py calls
stdarray.create1D() to create an array of
Charge objects, but provides only one argument (the desired number of elements). Doesn't
stdarray.create1D() require that I provide two arguments: the desired number of elements and the initial value of the elements?
A. If no initial value is specified, both
stdarray.create2D() use the special value
None, which refers to no object. Immediately after calling
stdarray.create1D(), potential.py changes each array element so it refers to a new
Q. Can I call a method with a literal or other expression?
A. Yes, from the client's point of view, you can use any expression to invoke a method. When Python executes the method call, it evaluates the expression and calls the method on the resulting object. For example,
(3 ** 100).bit_length() returns 159. However, you need to be careful with integer literals — for example,
1023.bit_length() raises a
SyntaxError because Python interprets
1023. as a floating-point literal; instead, you can use
Q. Can I chain together several string method calls in one expression?
A. Yes. For example, the expression
s.strip().lower() works as expected. That is, it evaluates to a new string that is a copy of
s with leading and trailing whitespace removed, and with all remaining characters converted to lowercase. It works because (1) each of the methods returns its result as a string and (2) the dot operator is left associative, so Python calls the methods from left to right.
Q. Why red, green, and blue instead of red, yellow, and blue?
A. In theory, any three colors that contain some amount of each primary color would work, but two different color models have evolved: RGB produces good colors on television screens, computer monitors, and digital cameras, and CMYK is typically used for the printed page (see the "color conversion exercise in Section 1.2). CMYK does include yellow (cyan, magenta, yellow, and black). Two different schemes are appropriate because printed inks absorb color; thus, where there are two different inks, there are more colors absorbed and fewer reflected. In contrast, video displays emit color; thus, where there are two different colored pixels, there are more colors emitted.
Q. Is there any problem with creating thousands of
Color objects, as in grayscale.py? It seems wasteful.
A. All programming-language constructs come at some cost. In this case the cost is reasonable, since the time required to create the
Color objects is tiny compared to the time required to actually draw the picture.
Q. Can a data type have two methods (or constructors) with the same name but with a different number of arguments?
A. No, just as with functions, you cannot have two methods (or constructors) with the same name. As with functions, methods (and constructors) can use optional arguments with default values. This is how the
Picture data type creates the illusion of having two constructors.
Q. When using a linear filter, each pixel becomes a weighted average of its 8 neighbors. What do I do when the pixel has less than 8 neighbors because it is near the border?
A. You could assume the image is toroidal (periodic boundary conditions) and make the left boundary wrap around to the right boundary, and the top boundary wrap around to the bottom boundary.
Q. Where can I download some test files for image processing?
Compose a program that takes a float command-line argument
w, creates four
Chargeobjects (each with charge value 1.0) that are each distance
win each of the four cardinal directions from
(.5, .5), and writes the potential at (0.25, 0.5).
Solution: See fourchargeclient.py.
Compose a program that takes from the command line three integers between 0 and 255 that represent red, green, and blue values of a color and then creates and shows a 256-by-256
Pictureof that color.
- Modify alberssquares.py to take nine command-line arguments that specify three colors and then draws the six squares showing all the Albers squares with the large square in each color and the small square in each different color.
Compose a program that takes the name of a grayscale picture file as a command-line argument and uses
stddrawto plot a histogram of the frequency of occurrence of each of the 256 grayscale intensities.
Compose a program that takes the name of a picture file as a command-line argument and flips the image horizontally.
Compose a program that takes the name of an picture file as a command-line input, and creates three images — on with only the red components, one with only the green components, and one with only the blue components.
Compose a program that takes the name of an picture file as a command-line argument and writes the pixel coordinates of the lower left corner and the upper right corner of the smallest bounding box (rectangle parallel to the x- and y-axes) that contains all of the non-white pixels.
Compose a program that takes as command line arguments the name of a picture file and the pixel coordinates of a rectangle within the image; reads from standard input a list of
Colorvalues (represented as triples of integers); and serves as a filter, printing out those
Colorvalues for which all pixels in the rectangle are background/foreground compatible. (Such a filter can be used to pick a color for text to label an image.)
Compose a function
isValidDNA()that takes a string as input and returns
Trueif and only if it is comprised entirely of the characters A, C, T, and G.
Compose a function
complementWC()that takes a DNA string as its argument and returns its Watson-Crick complement: replace A with T, C with G, and vice versa.
Compose a function
palindromeWC()that takes a DNA string as its argument and returns
Trueif the string is a Watson-Crick complemented palindrome, and
Falseotherwise. A Watson-Crick complemented palindrome is a DNA string that is equal to the reverse of its Watson-Crick complement.
Compose a program to check whether an ISBN number is valid (see the "Checksum" exercise in Section 1.3), taking into account that an ISBN number can have hyphens inserted at arbitrary places.
What does the following code fragment write?
s = 'Hello World' s.upper() s[6:11] stdio.writeln(s)
Solution: 'Hello World'. String objects are immutable — a string method returns a new
strobject with the appropriate value, but does not change the value of the object that was used to invoke it. This code ignores the objects returned and just writes the original string. To update
s = s.upper()and
s = s[6:11].
sis a circular shift of a string
tif it matches when the characters are circularly shifted by any number of positions. For example, ACTGACG is a circular shift of TGACGAC, and vice versa. Detecting this condition is important in the study of genomic sequences. Compose a function that checks whether two given strings
tare circular shifts of one another. Hint: The solution is a one-liner with the
inoperator and string concatenation.
Given a string site that represents a website URL, compose a code fragment to determine its domain type. For example, the domain type for the string
Compose a function that takes a domain name as argument and returns the reverse domain (reverse the order of the strings between periods). For example, the reverse domain of
edu.princeton.cs.introcs. This computation is useful for web log analysis (see the "Reverse domain" creative exercise in Section 4.2).
What does the following recursive function return?
def mystery(s): n = len(s) if n <= 1: return s a = s[0, n//2] b = s[n//2, n] return mystery(b) + mystery(a)
Compose a version of potentialgene.py that finds all potential genes contained in a long DNA string. Add a command-line argument to allow the user to specify a minimum length of a potential gene.
Compose a program that takes a start string and a stop string as command line arguments and writes all substrings of a given string that start with the first, end with the second, and otherwise contain neither. Note: Be especially careful of overlaps!
Compose a filter that reads text from an input stream and prints it to an output stream, removing any lines that consist only of whitespace.
Modify potential.py to take an integer n from the command line and generate n random
Chargeobjects in the unit square, with potential values drawn randomly from a Gaussian distribution with mean 50 and standard deviation 10.
Modify stockquote.py to take multiple symbols on the command line.
The example file
djia.csvused for split.py lists the date, high price, volume, and low price of the Dow Jones stock market average for every day since records have been kept. Download this file from the booksite and compose a program that plots the prices and volumes, at a rate taken from the command line.
Compose a program
merge.pythat takes a delimiter string followed by an arbitrary number of file names as command line arguments, concatenates the corresponding lines of each file, separated by the delimiter, and then writes the result to standard output, thus performing the opposite operation from split.py.
Find a website that publishes the current temperature in your area, and compose a screen-scraper program
weather.pyso that typing
python weather.pyfollowed by your zip code will give you a weather forecast.
Picture filtering. Compose a module
write()functions for use with standard input and standard output. The
write()function takes a
Pictureas argument and writes the picture to standard output, using the following format: if the picture is w-by-h, write w, then h, then w*h triples of integers representing the pixel color values, in row major order. The
read()function takes no arguments and returns a
Picture, which it creates by reading a picture from standard input, in the format just described. Note: The picture filtering will use up much more disk space than the picture — the standard formats compress this information so that it will not take up so much space.
Kama Sutra cipher. Compose a filter that takes two strings as command-line arguments (the key strings), then reads standard input, substitutes for each letter as specified by the key strings, and writes the result to standard output. This operation is the basis for one of the earliest known cryptographic systems. The condition on the key strings is that they must be of equal length and that any letter in standard input must be in one of them. For example, if input is all capital letters and the keys are THEQUICKBROWN and FXJMPSVRLZYDG, then we make the table.
T H E Q U I C K B R O W N F X J M P S V L A Z Y D G
which tells us that we should substitute F for T, T for F, H for X, X for H, and so forth when copying the input to the output. The message is encoded by replacing each letter with its pair. For example, the message MEET AT ELEVEN is encoded as QJJF BF JKJCJG. Someone receiving the message can use the same keys to get the message back.
Solution: See kamasutra.py.
Safe password verification. Compose a function that takes a string as argument and returns
Trueif it meets the following conditions,
- At least eight characters long
- Contains at least one digit (0-9)
- Contains at least one uppercase letter
- Contains at least one lowercase letter
- Contains at least one character that is neither a letter nor a number
Such checks are commonly used for passwords on the web.
Color study. Compose a program that displays the color study shown below, which gives Albers squares corresponding to each of the 256 levels of blue (blue-to-white in row-major form) and gray (black-to-white in column-major form) that were used to print the textbook.
Solution: See colorstudy.py.
Entropy. The Shannon entropy measures the information content of an input string and plays a cornerstone role in information theory and data compression. Given a string of n characters, let fc be the frequency of occurrence of character c. The quantity pc = fc / n is an estimate of the probability that c would be in the string if it were a random string, and the entropy is defined to be the sum of the quantity -pc log2 pc, over all characters that appear in the string. The entropy is said to measure the information content of a string: if each character appears the same number times, the entropy is at its minimum value. Compose a program that computes and writes to standard output the entropy of the string from standard input. Run your program on a web page that you read regularly and on a recent paper that you wrote.
Minimize potential. Compose a function that takes an array of
Chargeobjects with positive potential as its argument and finds a point such that the potential at that point is within 1% of the minimum potential anywhere in the unit square. Use a
Chargeobject to return this information. Compose a test client that callsyour function to write the point coordinates and charge value for the data given in the text and for the random charges described in a previous exercise in this section.
Slide show. Compose a program that takes the names of several image files as command-line arguments and displays them in a slide show (one every two seconds), using a fade effect to black and a fade from black between pictures.
Tile. Compose a program that takes the name of an image file and two integers
nas command-line arguments and creates an
ntiling of the picture.
Rotation filter. Compose a program that takes two command line arguments (the name of an image file and a real number θ) and rotates the image θ degrees counterclockwise. To rotate, copy the color of each pixel (si, sj) in the source image to a target pixel (ti, tj) whose coordinates are given by the following formulas:
ti = (si - ci)cos θ + (sj - cj)sin θ + ci
tj = (si - ci)sin θ + (sj - cj)cos θ + cj
where (ci, cj) is the center of the image. For example, these images show a 30 degree rotation:
Solution: See rotation.py.
Swirl filter. Creating a swirl effect is similar to rotation, except that the angle changes as a function of distance to the center. Use the same formulas as in the previous exercise, but compute θ as a function of (si, sj), specifically π/256 times the distance to the center. For example:
Solution: See swirl.py.
Wave filter. Compose a filter like those in the previous two exercises that creates a wave effect, by copying the color of each pixel (si, sj) in the source image to a target pixel (ti, tj), where ti = si and tj = sj + 20 sin(2 π sj/64). Add code to take the amplitude (20 in the accompanying figure) and the frequency (64 in the accompanying figure) as command-line arguments. Experiment with various values of these parameters.
Solution: See wave.py.
Glass filter. Compose a program that takes the name of an image file as a command-line argument and applies a glass filter: set each pixel p to the color of a random neighboring pixel (whose pixel coordinates are each within 5 pixels of p's coordinates). For example:
Solution: See glass.py.
Morph. The example images shown in this section for fade.py do not quite line up in the vertical direction (the mandrill's mouth is much lower than Darwin's). Modify fade.py to add a transformation in the vertical dimension that makes a smoother transition.
Clusters. Compose a program that take the name of a picture file from the command line and produces and displays to stddraw a picture with filled circles that cover compatible areas. First, scan the image to determine the background color (a dominant color that is found in more than half the pixels). Use depth first search, as described in Section 2.4, to find contiguous sets of pixels that are foreground-compatible with the background. A scientist might use this program to study natural scenarios such as birds in flight or particles in motion. Take a photo of balls on a billiards table and try to get your program to identify the balls and positions.
Digital zoom. Compose a program
zoom.pythat takes the name of an image file and three numbers s, x, and y as command-line arguments and displays a picture that zooms in on a portion of the input picture. The numbers are all between 0 and 1, with s to be interpreted as a scale factor and (x, y) as the relative coordinates of the point that is to be at the center of the output image. Use this program to zoom in on your dog or a friend in some digital image on your computer.