webSearch.py:
def sitegram(url): Given a url, return a dictionary with each word on the web page denoted by the url as its key, and a count of the occurrences of the word as its value. I recommend that you implement the function as follows:
Use flatten() from Assignment 6 to create a list of words from the list of lists of words.
Count the occurrences of each word, using the frequency() function on page 90 of the textbook as a guide.
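The flattening and counting steps might look roughly like the sketch below. Neither the Assignment 6 flatten() nor the textbook's frequency() is reproduced in this handout, so both bodies here are assumptions: a common list-flattening pattern and a common dictionary-counting pattern. The sample lines are made up, and downloading the page is left out.

```python
def flatten(nested):
    """Assumed stand-in for the Assignment 6 helper: merge a list of lists into one list."""
    return [item for inner in nested for item in inner]


def frequency(words):
    """Assumed counting pattern: map each word to the number of times it occurs."""
    counts = {}
    for word in words:
        counts[word] = counts.get(word, 0) + 1
    return counts


# Hypothetical page text, already split into lines for illustration.
lines = ["the course syllabus", "the course schedule"]
words = flatten([line.split() for line in lines])
histogram = frequency(words)
```

Here histogram maps 'the' and 'course' to 2, and 'syllabus' and 'schedule' to 1.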
def makeUrlData(urlList): Given a list of urls, return a dictionary mapping each url to the histogram returned by sitegram() above. Using a dictionary comprehension gives a very concise solution.
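As a rough illustration of the dictionary comprehension, the sketch below replaces sitegram() with a stand-in that returns canned histograms, so it runs without network access. The example.com urls and their word counts are invented for the example.

```python
def sitegram(url):
    """Stand-in only: the real sitegram() would download and count the page at url."""
    canned = {
        "http://example.com/a": {"course": 2, "python": 1},
        "http://example.com/b": {"python": 4},
    }
    return canned[url]


def makeUrlData(urlList):
    # One entry per url: the url is the key, its histogram is the value.
    return {url: sitegram(url) for url in urlList}


urlData = makeUrlData(["http://example.com/a", "http://example.com/b"])
```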
def rankedPages(urlData, term): The urlData parameter is a dictionary along the lines of what makeUrlData() returns. The term is a search term. The function returns a list of tuples: the first element is the url, and the second is the number of times that term is present in that url's page. Urls that lack the term do not appear. The list is ordered from the url with the highest count of the term downwards. I recommend the following implementation:
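Build a list of tuples from urlData. The first tuple element will be the url, and the second element the number of occurrences of term in the page. Only include the pages for which the term count is greater than zero.
Sort the list from the highest count to the lowest, using the following function as the sort key:
def valueFrom(duple): return duple[1]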
>>> urlData = makeUrlData(['http://ozark.hendrix.edu/~ferrer/','http://ozark.hendrix.edu/~ferrer/courses/150/','http://ozark.hendrix.edu/~ferrer/courses/150/f11/syllabus.html'])
>>> rankedPages(urlData, 'course')
[('http://ozark.hendrix.edu/~ferrer/courses/150/f11/syllabus.html', 6)]
>>> rankedPages(urlData, 'courses')
[('http://ozark.hendrix.edu/~ferrer/', 5), ('http://ozark.hendrix.edu/~ferrer/courses/150/', 4), ('http://ozark.hendrix.edu/~ferrer/courses/150/f11/syllabus.html', 2)]
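One possible shape for rankedPages() is sketched below, assuming the built-in sorted() with valueFrom() as the key and reverse=True for descending order. The urlData dictionary is hand-written with invented example.com urls so the sketch runs offline; it is not the data from the session above.

```python
def valueFrom(duple):
    return duple[1]


def rankedPages(urlData, term):
    # Keep only the pages where the term occurs at least once.
    matches = [(url, counts[term])
               for url, counts in urlData.items()
               if counts.get(term, 0) > 0]
    # Highest count first.
    return sorted(matches, key=valueFrom, reverse=True)


# Hypothetical histograms for illustration.
urlData = {
    "http://example.com/a": {"course": 2, "python": 1},
    "http://example.com/b": {"python": 4},
    "http://example.com/c": {"course": 6},
}
ranked = rankedPages(urlData, "course")
```

With this data, ranked lists http://example.com/c (count 6) ahead of http://example.com/a (count 2), and http://example.com/b is left out because it lacks the term.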
points.py:
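def interp(i, n, bound): Given integers i and n (0 <= i <= n) and a two-element tuple bound, return the ith of the n + 1 evenly spaced values obtained by subdividing bound into n equal pieces. The result is a float; i = 0 gives the lower bound and i = n gives the upper bound.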
>>> interp(0, 10, (1,40))
1.0
>>> interp(1, 10, (1,40))
4.9
>>> interp(5, 10, (1,40))
20.5
>>> interp(10, 10, (1,40))
40.0
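The examples above are consistent with plain linear interpolation: start at the lower bound and take i steps of size (upper - lower) / n. The sketch below assumes Python 3, where / is true division; the names low and high are chosen here for readability.

```python
def interp(i, n, bound):
    low, high = bound
    # i steps of width (high - low) / n, starting from the lower bound.
    return low + i * (high - low) / n


first = interp(0, 10, (1, 40))
middle = interp(5, 10, (1, 40))
last = interp(10, 10, (1, 40))
```

Because the arithmetic is floating point, intermediate values such as interp(1, 10, (1, 40)) may differ from 4.9 in the last digit depending on the order of operations.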
def makePoints(f, bound, n): Given a function f, a bound expressed as a lower and upper value in a tuple, and a number of intervals n, returns a dictionary mapping x coordinates to y coordinates for the given function. This function should use the interp function defined above. As there are n intervals, it will return n + 1 points:
>>> def parabola(x):
...     return x**2
...
>>> makePoints(parabola, (1,10), 5)
{2.8: 7.839999999999999, 1.0: 1.0, 6.4: 40.96000000000001, 10.0: 100.0, 8.2: 67.24, 4.6: 21.159999999999997}
>>> makePoints(parabola, (1,10), 2)
{5.5: 30.25, 1.0: 1.0, 10.0: 100.0}
>>>
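A minimal sketch of makePoints() follows, assuming the linear interp() shown in the examples (repeated here so the block runs on its own) and Python 3 division. It evaluates f at each of the n + 1 interpolated x values.

```python
def interp(i, n, bound):
    low, high = bound
    return low + i * (high - low) / n


def makePoints(f, bound, n):
    # n intervals have n + 1 endpoints, so i runs from 0 through n.
    xs = [interp(i, n, bound) for i in range(n + 1)]
    return {x: f(x) for x in xs}


def parabola(x):
    return x**2


points = makePoints(parabola, (1, 10), 2)
```

With two intervals over (1, 10), points holds the three pairs from the second example above: 1.0, 5.5, and 10.0 mapped to their squares.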
def d2File(d, filename): Given a dictionary d and a filename, write each key/value pair to the file on its own line, with the key and value separated by a space.
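One way d2File() could be written is sketched below. The handout does not say whether the file should be overwritten or what order the lines take; this sketch assumes mode "w" (overwrite) and the dictionary's own iteration order. The temporary path is only there so the demo can run anywhere.

```python
import os
import tempfile


def d2File(d, filename):
    # Mode "w" replaces any existing file of the same name (an assumption).
    with open(filename, "w") as out:
        for key, value in d.items():
            out.write(f"{key} {value}\n")


# Demo: write two points to a temporary file and read them back.
path = os.path.join(tempfile.mkdtemp(), "points.txt")
d2File({1.0: 1.0, 5.5: 30.25}, path)
with open(path) as source:
    written = source.read()
```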