Think Python: How to Think Like a Computer Scientist
14.11 Glossary

persistent: Pertaining to a program that runs indefinitely and keeps at least some of its data in permanent storage.

format operator: An operator, %, that takes a format string and a tuple and generates a string that includes the elements of the tuple formatted as specified by the format string.

format string: A string, used with the format operator, that contains format sequences.

format sequence: A sequence of characters in a format string, like %d, that specifies how a value should be formatted.

text file: A sequence of characters stored in permanent storage like a hard drive.

directory: A named collection of files, also called a folder.

path: A string that identifies a file.

relative path: A path that starts from the current directory.

absolute path: A path that starts from the topmost directory in the file system.

catch: To prevent an exception from terminating a program using the try and except statements.

database: A file whose contents are organized like a dictionary with keys that correspond to values.

14.12 Exercises

Exercise 14.5 The urllib module provides methods for manipulating URLs and downloading information from the web. The following example downloads and prints a secret message from thinkpython.com:

    import urllib
    conn = urllib.urlopen('http://thinkpython.com/secret.html')
    for line in conn.fp:
        print line.strip()

Run this code and follow the instructions you see there.

Exercise 14.6 In a large collection of MP3 files, there may be more than one copy of the same song, stored in different directories or with different file names. The goal of this exercise is to search for these duplicates.

1. Write a program that searches a directory and all of its subdirectories, recursively, and returns a list of complete paths for all files with a given suffix (like .mp3). Hint: os.path provides several useful functions for manipulating file and path names.

2. To recognize duplicates, you can use a hash function that reads the file and generates a short summary of the contents. For example, MD5 (Message-Digest algorithm 5) takes an arbitrarily-long “message” and returns a 128-bit “checksum.” The probability is very small that two files with different contents will return the same checksum. You can read about MD5 at wikipedia.org/wiki/Md5. On a Unix system you can use the program md5sum and a pipe to compute checksums from Python.
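One possible way to approach the two steps of Exercise 14.6 is sketched below. It is an illustration under several assumptions, not the book's solution: it is written for Python 3 (the listing in Exercise 14.5 uses Python 2 syntax), it uses os.walk for the directory traversal rather than an explicitly recursive function, and it computes checksums with the standard hashlib module instead of the md5sum program and a pipe that the exercise mentions. The function names find_files, md5_checksum and find_duplicates are made up for this example.

```python
import hashlib
import os


def find_files(root, suffix):
    """Return complete paths of all files under root whose names end with suffix."""
    matches = []
    # os.walk visits root and every subdirectory beneath it.
    for dirpath, dirnames, filenames in os.walk(root):
        for name in filenames:
            if name.endswith(suffix):
                matches.append(os.path.join(dirpath, name))
    return matches


def md5_checksum(path, chunk_size=65536):
    """Return the MD5 checksum of the file at path as a hexadecimal string.

    Assumption: hashlib stands in for the md5sum program named in the exercise.
    """
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        # Read in chunks so that large files need not fit in memory at once.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def find_duplicates(root, suffix):
    """Map each checksum shared by two or more files to the list of their paths."""
    by_checksum = {}
    for path in find_files(root, suffix):
        by_checksum.setdefault(md5_checksum(path), []).append(path)
    # Keep only the checksums that occur more than once.
    return {c: paths for c, paths in by_checksum.items() if len(paths) > 1}
```

Calling find_duplicates('.', '.mp3') would return a dictionary whose values are lists of paths with identical checksums. Because two different files can, with very small probability, share a checksum, a careful program might compare the contents of each candidate group directly before treating the files as duplicates.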