CSCI 151 Lab 12: Text prediction
Overview
In this lab, we will implement a trie for storing strings and use it to predict words as they are being typed.
Setup
- Download the skeleton for this project.
Description
Modern smartphones include keyboard assistants that will suggest words while you are typing. There are many data structures that can be used to assist in this interaction, to predict the remainder of words, and autocorrect typed words. One such data structure is a trie, which divides the words into their component characters, and then stores the words in such a way that words with a shared prefix share a part of the tree.
There are two main components to the model found in the skeleton project: a Trie
to store the tree of strings, and a SortedArrayMap
to map characters to child Trie
s. SortedArrayMap
in turn is implemented in terms of a SortedArray
class.
You will only have to write code in the Trie
class, but you will need to also look at the other classes to see what methods are available. Methods that remain to be implemented in the Trie
class have been marked with TODO
for easy identification.
Step 1: size (D)
Implement the size
method. It should essentially count how many Trie
nodes have isMember
set to true
. This can be done recursively:
- If
isMember
istrue
, count 1. - Add up the result of calling
size
recursively on all children.
To iterate over all children, you can use a foreach loop, which depends on the iterator()
method from SortedArrayMap
.
Step 2: getChildWith and find (C)
Implement getChildWith
and find
.
getChildWith
is not recursive. It just gets the child corresponding to a given letter one level down, or returnsnull
if no child corresponds to that letter. This method is just for convenience and will be reused in other methods.find
is another helper method you may find useful later. It follows an entireString
through the trie, taking one step down per character, and returns theTrie
it ends up at, ornull
if none is found. It does not matter whetherisMember
is set or not.You probably want to implement
find
recursively. The recursive structure should be something like this:- Is the input string
.equal("")
? This is the base case. - Otherwise, get the first character with
charAt(0)
and use it to look for a child indexed by that character. If the child exists, make a recursive call with the remainder of the string (substring(1)
).
- Is the input string
Step 3: contains, add, remove (B)
Implement
contains
, which tests whether a given string is contained in the trie (note "contained" means "isMember
is set").Implement
add
, which adds a new word to the trie.Implement a simple version of
remove
, which simply finds the end of the word to remove and then setsisMember
tofalse
. This will pass all the tests but leaves lots of uselessTrie
nodes lying around; see Step 5.
Step 4: inorder, successorsTo (A)
Implement inorder
and successorsTo
. See the comments in the code for descriptions of what they should do. Coming up with a good way to organize inorder
is tricky. Feel free to ask me for hints. Once you have written inorder
, it should be possible to write successorsTo
in a concise way that reuses several previous pieces.
Step 5: better remove (100)
For a grade of 100 on the lab, implement a better version of remove
, which actually deletes any unneeded Trie
nodes.