The Self-Taught Computer Scientist
Chapter 3 Search Algorithms 33
Download 1.48 Mb. Pdf ko'rish
|
books-library.net-11301817Az7X6
- Bu sahifa navigatsiya:
- Unicode
- Dec Bin Char
- Dec Bin Char Dec Bin Char Dec Bin Char Figure 3.7
- Chapter 3
- Sorting data
Chapter 3 Search Algorithms
33 for the numbers 0 through 9 because 0 through 9 in the ASCII table are not numeric values. They are characters for nonmathematical purposes, such as expressing the numerals in the street address 26 Broadway Street, New York. Because ASCII maps each character to a 7- bit binary number, it can only represent a maximum of 128 different characters (2 7 is 128). Most computers, however, extend ASCII to 8 bits so that it can represent 256 characters. While you can represent the 256 characters in the Latin- script alphabet with ASCII, it does not support enough characters to deal with the text from other writing systems, such as Japanese or Mandarin. To deal with this problem, computer scientists developed the Unicode character set. Character encoding means assigning a number to characters for digital representation. UTF- 8 is one of the character encoding methods computer scientists use to implement the Unicode character set. Instead of using 7 or 8 bits like ASCII, UTF- 8 uses up to 32 bits to encode each character, which allows you to represent more than a million characters. UTF- 8 is compatible with ASCII because it uses the same bit representation for the Latin- script alphabet. For example, both ASCII and UTF- 8 represent an uppercase A with 1000001. Dec Bin Char 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100 101 102 103 104 105 106 107 108 109 110 111 112 113 114 115 116 117 118 119 120 121 122 123 124 125 126 127 @ A B C D E F G H I J K L M N O P Q R S T U V W X Y Z [ \ ] ^ _ ` a b c d e f g h i j k l m n o p q r s t u v w x y z { I } ~ [DEL] space ! " # $ % & ‘ ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? 0000 0000 0000 0001 0000 0010 0000 0011 0000 0100 0000 0101 0000 0110 0000 0111 0000 1000 0000 1001 0000 1010 0000 1011 0000 1100 0000 1101 0000 1110 0000 1111 0001 0000 0001 0001 0001 0010 0001 0011 0001 0100 0001 0101 0001 0110 0001 0111 0001 1000 0001 1001 0001 1010 0001 1011 0001 1100 0001 1101 0001 1110 0001 1111 0010 0000 0010 0001 0010 0010 0010 0011 0010 0100 0010 0101 0010 0110 0010 0111 0010 1000 0010 1001 0010 1010 0010 1011 0010 1100 0010 1101 0010 1110 0010 1111 0011 0000 0011 0001 0011 0010 0011 0011 0011 0100 0011 0101 0011 0110 0011 0111 0011 1000 0011 1001 0011 1010 0011 1011 0011 1100 0011 1101 0011 1110 0011 1111 0100 0000 0100 0001 0100 0010 0100 0011 0100 0100 0100 0101 0100 0110 0100 0111 0100 1000 0100 1001 0100 1010 0100 1011 0100 1100 0100 1101 0100 1110 0100 1111 0101 0000 0101 0001 0101 0010 0101 0011 0101 0100 0101 0101 0101 0110 0101 0111 0101 1000 0101 1001 0101 1010 0101 1011 0101 1100 0101 1101 0101 1110 0101 1111 0110 0000 0110 0001 0110 0010 0110 0011 0110 0100 0110 0101 0110 0110 0110 0111 0110 1000 0110 1001 0110 1010 0110 1011 0110 1100 0110 1101 0110 1110 0110 1111 0111 0000 0111 0001 0111 0010 0111 0011 0111 0100 0111 0101 0111 0110 0111 0111 0111 1000 0111 1001 0111 1010 0111 1011 0111 1100 0111 1101 0111 1110 0111 1111 [NUL] [SOH] [STX] [ETX] [EOT] [ENQ] [ACK] [BEL] [BS] [TAB] [LF] [VT] [FF] [CR] [SO] [SI] [DLE] [DC1] [DC2] [DC3] [DC4] [NAK] [SYN] [ETB] [CAN] [EM] [SUB] [ESC] [FS] [GS] [RS] [US] Dec Bin Char Dec Bin Char Dec Bin Char Figure 3.7: ASCII chart Introduction to Algorithms 34 You can use Python’s built- in ord() function to get a character’s ASCII value: print(ord('a')) >> 97 As you can see, the ASCII value for a is 97 (in base 10). The ord() function is useful when you need to work directly with the underlying ASCII codes of different characters. To modify the binary search you wrote earlier to search for characters, you have to get and compare the ASCII values of characters. Each time through your loop, you check to see whether the ASCII code of each character is higher than, lower than, or equal to the character’s ASCII code you are looking for. Instead of showing you the solution here, I challenge you to code it yourself at the end of this chapter. You now know how linear and binary searches work and when to use them when you are searching for data. While a binary search is very efficient, it is not the fastest way to search for data. In Part II, you will learn how to search for data using a hash table and why it is the most efficient kind of search. Vocabulary search algorithm: An algorithm that looks for data in a data set. data set: A collection of data. linear search: A search algorithm that iterates through each element in a set of data and compares it to the target number. sorted data: Data arranged in a meaningful way. binary search: Another algorithm for searching a list for a number, and it is faster than a linear search. floor division operator: An operator that returns the whole value of division, rounded down. exponentiation: A mathematical operation you write as b n (or b**n in Python), where you multiply a number b times itself n times. base: The b in the equation for exponentiation (b n ). exponent: The n in the equation for exponentiation (b n ). logarithm: The power you must raise a number to in order to produce another number. character set: A map between characters and binary numbers. Chapter 3 Search Algorithms 35 ASCII: The American Standard Code for Information Interchange character set. Unicode: A character set that can store more characters than ASCII. character encoding: Assigning a number to characters for digital representation. UTF- 8: One of the methods of character encoding computer scientists use to implement the Unicode character set. Challenge 1. Given a list of words in alphabetical order, write a function that performs a binary search for a word and returns whether it is in the list. 4 Sorting Algorithms I think the bubble sort would be the wrong way to go. Barack Obama As a computer scientist, in addition to searching for data, you will often have to sort data. Sorting data means arranging it in a meaningful way. For example, if you have a list of numbers, you could sort them from the smallest to the largest (ascending). Or, imagine you are building an app that keeps track of the books each user has read. In an app like this, you might want to allow the user to see their books sorted in different ways. For example, you could give them the option to see their books sorted from the shortest book to the longest, oldest to newest, or newest to oldest. There are many different sorting algorithms to help you sort data, each with strengths and weaknesses. For example, some sort algorithms work best in specific situations, like if an iterable is nearly sorted. In this chapter, you will learn about bubble sort, insertion sort, and merge sort. Other popular sorts include quicksort, shell sort, and heap sort. There are many sorting algorithms you will have to use in only rare circumstances, so after teaching you a few sorts, I spend the remainder of the chapter showing you how to use Python’s built- in functions to sort data, something you will often use in real- world programming. When you are building your programs in the real world, you should almost always use your programming language’s built- in sorting function. You should not implement the classic sorting algo- rithms discussed here (except in rare circumstances) because modern programming languages like Python have built- in sorting functions that are faster. However, learning a few of the classic sorting algorithms will help you better understand time complexity and teach you concepts you can use in situations other than sorting, such as the merge step in a merge sort. Bubble Sort Bubble sort is a sorting algorithm where you iterate through a list of numbers, compare each number to the next number, and swap them if they are out of order. Computer scientists call it a bubble sort because the numbers with the highest values “bubble up” to the end of the list, and the numbers with the smallest values move to the beginning of the list as the algorithm progresses. |
Ma'lumotlar bazasi mualliflik huquqi bilan himoyalangan ©fayllar.org 2024
ma'muriyatiga murojaat qiling
ma'muriyatiga murojaat qiling