Sorting Algorithms
Working with Data Structures and manipulating data.
import random
numbers = []
for i in range(10):
numbers.append(random.randint(0,100))
print("Random List")
print(numbers)
Warm Up
Discuss with a partner... What are some strategies you would use to sort this list? (Don't worry about writing code for now)
- I would sort the list from least to greatest and do this by implementing either a quicksort method or another type of loop that would sort it based on the previous value in the list and comparing with the current value it is checking.
Explore
Get into groups of 3
We will be focusing on 4 algorithms today.
We will look at the first one together, Bubble Sort
What is happening with this sort?
In this sort, it checks two values and swaps the value if the 2nd value is smaller than the first and s
In your groups you will each choose to be an expert on a sorting algorithm. Merge, Selection, and Insertion. Take about 5 minutes to read about your algorithm and be ready to explain it to your other group members.
Splits the lists in halfs until each set is one individual value and then you sort them and then you combine them
Finds the smallest value and moves it to the front of the list and then repeats it to the unsorted list
Compare 1st value with 2nd and if 2nd smaller, swap the two values. Repeat until whole list is sorted.
Practice
[75, 17, 46, 80, 67, 45, 69, 79, 40, 0]
How would you sort this list with...
- Bubble Sort
Takes the first two values, compares them and then swaps then if 2<1 and repeats this for each value in the list
- Selection Sort
Finds the smallest value and moves it to the front of the list and then repeats it to the unsorted list
Explain.
[88, 39, 53, 39, 58, 43, 74, 81, 71, 51]
How would you sort this list with...
- Merge Sort
Splits all the numbers into individual values and then sorts them, then combining those sorted values into one big list
- Insertion Sort
Compare 1st value with 2nd and if 2nd smaller, swap the two values. Repeat until whole list is sorted.
Explain.
def insertionSort(arr):
# Traverse through 1 to len(arr)
for i in range(1, len(arr)):
key = arr[i]
# Move elements of arr[0..i-1], that are
# greater than key, to one position ahead
# of their current position
j = i-1
while j >= 0 and key < arr[j] :
arr[j + 1] = arr[j]
j -= 1
arr[j + 1] = key
# Driver code to test above
arr = [88, 39, 53, 39, 58, 43, 74, 81, 71, 51]
insertionSort(arr)
for i in range(len(arr)):
print ("% d" % arr[i])
You can use all of the methods to sort this list and you would do it the same way as you would with integers but sort it by the first letter of each word
import nltk
import random
nltk.download('words') # Download the word list (only required once)
from nltk.corpus import words
english_words = words.words()
#print(len(english_words)) # Prints the number of words in the list
# You can now use the 'english_words' list in your code
words = []
for i in range(10):
words.append(english_words[random.randint(0,len(english_words))])
print("Random List:")
print(words)
# sorted_words = sorted(words)
# print("Sorted List:")
# print(sorted_words)
Discuss
Answer the following with your group.
- When should you use each algorithm? What makes an algorithm the right choice?
If the list is very small and can be done with a little number of iterations, you should use either, merge sort, insertion, or bubble however, if the list is very long such as the
- Given the following lists...
- [0, 2, 6, 4, 8, 10]
Bubble or Insertion: It wouldn't change anything but the unsorted 2 values - taking the least amount of time
- [Elephant, Banana, Cat, Dog, Apple]
Selection or Insertion: They work the best for really small datasets
- [29, 13, 83, 47, 32, 78, 100, 60, 65, 15, 24, 9, 40, 68, 53, 8, 90, 58, 39, 32, 34, 91, 74, 94, 49, 87, 34, 87, 23, 17, 27, 2, 38, 58, 84, 15, 9, 46, 74, 40, 44, 8, 55, 28, 81, 92, 81, 88, 53, 38, 19, 21, 9, 54, 21, 67, 3, 41, 3, 74, 13, 71, 70, 45, 5, 36, 80, 64, 97, 86, 73, 74, 94, 79, 49, 32, 20, 68, 64, 69, 1, 77, 31, 56, 100, 80, 48, 75, 85, 93, 67, 57, 26, 56, 43, 53, 59, 28, 67, 50]
Merge: The best since its really good for large datasets and would be the most optimal one to use to sort this big list
Select the algorithm you believe is best for each, explain.
HACKS
Provided below is a Bubble Sort Algorithm sorting a list of dictionaries based off of selected key.
-
Now it's time to do some coding...
-
Run code and then research and answer these questions...
-
Is a list and/or dictionary in python considered a primitive or collection type? Why?
In Python, a list and a dictionary are considered collection types. This is because they can hold multiple values of different types in a single variable. Primitive types, on the other hand, can only hold a single value of a specific type.
-
Is the list passed into bubble sort "pass-by-value" or "pass-by-reference? Describe why in relation to output.
In Python, the list passed into a function, such as the bubble sort algorithm, is actually passed by reference. This means that when you pass a list as an argument to a function, any modifications made to the list within the function will affect the original list outside of the function.
-
- Implement new cell(s) and/or organize cells to do the following.
- Create your own list
- Use your expertise sorting algorithm (selection, insertion, merge). Note, I got my bubble sort from Geek for Geeks and made modifications. Each student in a group should have a unique algorithm.
- Test your list with my bubble sort
- Test my list with your new sort, do NOT make a copy my list when doing this
- Research analysis on sorting:comparisons, swaps, time. Build this into your hacks. - Find a better way to print the data, key first, then other elements in viewable form.
Use the code below to help guide your adventure
"""
* Creator: Nighthawk Coding Society
Bubble Sort of a List with optimizations
"""
# bubble sorts a list of dictionaries, base off of provided key
def bubbleSort(list, key):
n = len(list) - 1 # list are indexed 0 to n-1, len is n
# Traverse through list with i index
for i in range(n):
swapped = False # optimize code, so it exits if now swaps on inner loop
# Inner traversal using j index
for j in range(n-i): # n-i as positions on right are in order in bubble
# Swap if the element KeyN is greater KeyN1
keyN = list[j].get(key)
keyN1 = list[j+1].get(key)
if keyN > keyN1:
swapped = True
list[j], list[j + 1] = list[j + 1], list[j] # single line swap
if not swapped: # if no swaps on inner pass, list is sorted
return # exit function
if __name__ == "__main__":
# list/dictionary sample
list_of_people = [
{"name": "Risa", "age": 18, "city": "New York"},
{"name": "John", "age": 63, "city": "Eugene"},
{"name": "Shekar", "age": 18, "city": "San Francisco"},
{"name": "Ryan", "age": 21, "city": "Los Angeles"}
]
cookie_prices = [
{"name": "Chocolate Chip", "price": 2.50},
{"name": "Oatmeal Raisin", "price": 2.00},
{"name": "Peanut Butter", "price": 2.75},
{"name": "Snickerdoodle", "price": 1.75},
{"name": "Sugar", "price": 1.50}
]
# assuming uniform keys, pick 1st row as source of keys
key_row = cookie_prices[0]
# print list as defined
print("Original")
print(cookie_prices)
for key in key_row: # finds each key in the row
print(key)
bubbleSort(cookie_prices, key) # sort list of people
print(cookie_prices)
import time
def merge_sort(arr, key):
"""
This function sorts the given array of dictionaries using the merge sort algorithm.
The sorting is done based on the value of the given key in each dictionary.
Args:
- arr: list of dictionaries to be sorted
- key: key in each dictionary to be used for sorting
Returns:
- sorted_arr: sorted list of dictionaries
"""
# If the length of the array is greater than 1, split it into two halves
if len(arr) > 1:
mid = len(arr) // 2
left_half = arr[:mid]
right_half = arr[mid:]
# Recursively sort the left and right halves
merge_sort(left_half, key)
merge_sort(right_half, key)
# Merge the sorted halves
i = j = k = 0
while i < len(left_half) and j < len(right_half):
# Compare the values of the given key in the left and right halves
if left_half[i][key] < right_half[j][key]:
# If the value in the left half is smaller, add it to the sorted array
arr[k] = left_half[i]
i += 1
else:
# If the value in the right half is smaller, add it to the sorted array
arr[k] = right_half[j]
j += 1
k += 1
# Add any remaining elements from the left half to the sorted array
while i < len(left_half):
arr[k] = left_half[i]
i += 1
k += 1
# Add any remaining elements from the right half to the sorted array
while j < len(right_half):
arr[k] = right_half[j]
j += 1
k += 1
# Return the sorted array
return arr
# Example usage
cookie_prices = [
{"name": "Chocolate Chip", "price": 2.50},
{"name": "Oatmeal Raisin", "price": 2.00},
{"name": "Peanut Butter", "price": 2.75},
{"name": "Snickerdoodle", "price": 1.75},
{"name": "Sugar", "price": 1.50}
]
# Record the start time
start_time = time.time()
# Sort the list of people based on their age using the merge_sort function
sorted_arr = merge_sort(cookie_prices, "price")
# Record the end time
end_time = time.time()
# Print the sorted array and the time taken to sort it
print("Sorted array:", sorted_arr)
print("Time taken:", end_time - start_time, "seconds")
import time
def merge_sort(arr, key):
"""
This function sorts the given array of dictionaries using the merge sort algorithm.
The sorting is done based on the value of the given key in each dictionary.
Args:
- arr: list of dictionaries to be sorted
- key: key in each dictionary to be used for sorting
Returns:
- sorted_arr: sorted list of dictionaries
"""
# If the length of the array is greater than 1, split it into two halves
if len(arr) > 1:
mid = len(arr) // 2
left_half = arr[:mid]
right_half = arr[mid:]
# Recursively sort the left and right halves
merge_sort(left_half, key)
merge_sort(right_half, key)
# Merge the sorted halves
i = j = k = 0
while i < len(left_half) and j < len(right_half):
# Compare the values of the given key in the left and right halves
if left_half[i][key] < right_half[j][key]:
# If the value in the left half is smaller, add it to the sorted array
arr[k] = left_half[i]
i += 1
else:
# If the value in the right half is smaller, add it to the sorted array
arr[k] = right_half[j]
j += 1
k += 1
# Add any remaining elements from the left half to the sorted array
while i < len(left_half):
arr[k] = left_half[i]
i += 1
k += 1
# Add any remaining elements from the right half to the sorted array
while j < len(right_half):
arr[k] = right_half[j]
j += 1
k += 1
# Return the sorted array
return arr
# Example usage
list_of_people = [
{"name": "Risa", "age": 18, "city": "New York"},
{"name": "John", "age": 63, "city": "Eugene"},
{"name": "Shekar", "age": 18, "city": "San Francisco"},
{"name": "Ryan", "age": 21, "city": "Los Angeles"}
]
# Record the start time
start_time = time.time()
# Sort the list of people based on their age using the merge_sort function
sorted_arr = merge_sort(list_of_people, "age")
# Record the end time
end_time = time.time()
# Print the sorted array and the time taken to sort it
print("Sorted array:", sorted_arr)
print("Time taken:", end_time - start_time, "seconds")