CSCI 151 Lab 12: Text prediction


Overview

In this lab, we will implement a trie for storing strings and use it to predict words as they are being typed.

Setup

Description

Modern smartphones include keyboard assistants that will suggest words while you are typing. There are many data structures that can be used to assist in this interaction, to predict the remainder of words, and autocorrect typed words. One such data structure is a trie, which divides the words into their component characters, and then stores the words in such a way that words with a shared prefix share a part of the tree.

There are two main components to the model found in the skeleton project: a Trie to store the tree of strings, and a SortedArrayMap to map characters to child Tries. SortedArrayMap in turn is implemented in terms of a SortedArray class.

You will only have to write code in the Trie class, but you will need to also look at the other classes to see what methods are available. Methods that remain to be implemented in the Trie class have been marked with TODO for easy identification.

Step 1 (D)

Implement the size method. It should essentially count how many Trie nodes have isMember set to true. This can be done recursively:

To iterate over all children, you can use a foreach loop, which depends on the iterator() method from SortedArrayMap.

We actually wrote this method together in class:

public int size() {
    int count = 0;
    if (isMember) {
        count++;
    }

    for (Association<Character, Trie> assoc : children) {
        count += assoc.getValue().size();
    }

    return count;
}

Step 2 (C)

Step 3 (B)

Step 4 (A)

Implement inorder and successorsTo. See the comments in the code for descriptions of what they should do. Coming up with a good way to organize inorder is tricky. Feel free to ask me for hints. Once you have written inorder, it is possible to write successorsTo in a concise way that reuses several previous pieces.

Step 5 (100)

For a grade of 100 on the lab, implement a better version of remove, which actually deletes any unneeded Trie nodes.