DH Bridge

Encouraging Computational Thinking and Digital Skills in the Humanities

Functions and Loops

In this module, we will learn to:

  1. to create and call functions
  2. to create a "for" loop and "append" (add) items to a list

We are now ready to start writing a function to gather all of the information we want from the DPLA.

Our file currently looks like this:

1
2
3
4
5
6
7
8
9
10
11
12
# Load Libraries
from dpla.api import DPLA

# Set Variables
dpla = DPLA('Your-Key-Here')
# result = dpla.search('cooking', fields=['sourceResource'], page_size = 50)
all_records = []

# Define Functions

# Make Function Calls
# print result.items[1]

We will now add a function that handles the query for any given page number.

1. Creating a "Pull Records" Function

Functions are little blocks of code that do particular tasks. They involve variables and logic, they take what is given to them, and return a result.

To write a function, we start with the keyword "def", then give our function a name, and finally end with '():' Inside the parentheses, we can indicate how many pieces of information are going into the function.

In your "my_first_script.py" file, add the line

1
2
# Define Functions
def pull_records(pages, end, size):

In this line, you have declared a 'pull_records' function, and told Python that this function will involve three variables (pages, end, and size). These three variables names can be anything you like (you could name them "snap", "crackle", and "pop"), but should be meaningful to the their role in the function. In our case, they will hold information for the first page and last page, and the number of items per page.

We are now ready to add the steps involved in getting the search records.

On the next line, tab in once and type:

1
2
3
# Define Functions
def pull_records(pages, end, size):
	paged_search = dpla.search('cooking', fields=['sourceResource'], page_size=size, page=pages)

Your file should now look like this:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
# Load Libraries
from dpla.api import DPLA

# Set Variables
dpla = DPLA('Your-Key-Here')
# result = dpla.search('cooking', fields=['sourceResource'], page_size = 50)
all_records = []

# Define Functions
def pull_records(pages, end, size):
	paged_search = dpla.search('cooking', fields=['sourceResource'], page_size=size, page=pages)

# Make Function Calls
# print result.items[1]

It is important to note that in Python, white-space has meaning: when writing functions in Python, white space is used to designate what is in a function or within a loop and what is outside of it.

Let's add a "print" statement to our function and test out the first stage of this function. On the next line, tab in and add the following "print" statement after "paged_search…":

1
2
3
4
# Define Functions
def pull_records(pages, end, size):
	paged_search = dpla.search('cooking', fields=['sourceResource'], page_size = size, page = pages)
	print paged_search.items[2]

To execute the function, we will call the function name and give it values. On a new line in the 'Make Function Calls' section, add the line:

1
2
3
# Make Function Calls
# print result.items[1]
pull_records(2, 3, 50)

Your file should now look like:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
# Load Libraries
from dpla.api import DPLA

# Set Variables
dpla = DPLA('Your-Key-Here')
# result = dpla.search('cooking', fields=['sourceResource'], page_size = 50)
all_records = []

# Define Functions
def pull_records(pages, end, size):
	paged_search = dpla.search('cooking', fields=['sourceResource'], page_size = size, page = pages)
	print paged_search.items[2]

# Make Function Calls
# print result.items[1]
pull_records(2, 3, 50)

Save the file and go to Terminal to run it (python my_first_script.py). Then come back to the function and work with your table to understand how the function works.

2. Using a 'for loop' to save items to 'all_records'

You have written and executed your first function. Well done!

Now we need to add another function to store those results to the empty 'all_records' list we set up in the last module. While this is not necessary when you only have one page of results, it becomes necessary when you are trying to save from multiple pages.

To create a "Save Each" function, we need to define a new function in the my_first_script.py file. A best practice in programming is to group functions together toward the top of the file. Let's put our "save_each" function after "pull_records" but before we call the "pull_records" function:

1
def save_each(x):

The 'x' here is again arbitrary. We are telling the function that there is one attribute that we will be passing in, and to take that variable and plug it in for 'x' throughout the function.

We now need to add our first loop. With our current search, there are 50 items in our paged_search variable. We want to save each of those items separately to the "all_records" list. This means the computer needs to move through each individual item, grab the item, and add it to "all_records".

Tabbing in one space on the next line under def save_each(n):s add:

1
2
3
def save_each(x):
	for each in x.items:
		# do something to every item in the list of items

This is called a "for loop". It tells Python to iterate through each item in the list "x".

Because this logic is contained within the function, we need to add another tab. Each time we have a new loop or new function, we tab in all the lines associated with that block of code. To show that we're done listing steps for a particular bit of logic, we align the code with the appropriate amount of whitespace.

To add the item to the "all_records" list, we use the "append" method:

1
2
3
4
def save_each(x):
	for each in x.items:
		# do something to every item in the list of items
		all_records.append(each)

"append" grabs the value of "each" and adds it to the series we are saving as a list.

Now we can use this function in our "pull_records" function. Currently, our "pull_records" function looks as follows:

1
2
3
def pull_records (pages, end, size):
	paged_search = dpla.search('cooking', fields=['sourceResource'],  page_size=size, page=pages)
	print paged_search.items[2]

Let's comment out the "print paged_search.item[2]" line, because that was just there to check that the first bit worked.

Now add a call to the "save_each" function, that passes in our search results.

1
2
3
4
def pull_records (pages, end, size):
	paged_search = dpla.search('cooking', fields=['sourceResource'],  page_size=size, page=pages)
	# print paged_search.items[2]
	save_each(paged_search)

Our file should now look like:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
# Load Libraries
from dpla.api import DPLA

# Set Variables
dpla = DPLA('Your-Key-Here')
# result = dpla.search('cooking', fields=['sourceResource'], page_size = 50)
all_records = []

# Define Functions
def pull_records (pages, end, size):
	paged_search = dpla.search('cooking', fields=['sourceResource'],  page_size=size, page=pages)
	# print paged_search.items[2]
	save_each(paged_search)

def save_each(x):
	for each in x.items:
		# do something to every item in the list of items
		all_records.append(each)

# Make Function Calls
# print result.items[1]
pull_records(2, 3, 50)

To test this, let's now add a "print" statement to the end of the file, after the "pull_records" function has been run, to make sure that the items are going into the "all_records" variable.

After "pull_records()" add:

1
print all_records[30]

Save and run your script.

We have made great progress! We now have two functions to handle making the query and saving the results, but we are still only working with one "page" of search results at a time. In the next module, we will add yet another kind of loop in order to move through the different pages.

Group Challenge

Break out the paper and pencils. Work as a group to create a diagram or metaphor for a "for loop". When would a "for loop" be useful?

Previous Module Next Module