Autocomplete is a feature to suggest possible words to a partially typed string. This feature is widely used by most applications such as – Google, Bing, IDEs etc. To support auto complete feature, Trie data structure is used to maintain a dictionary and based on those suggestions can be made.
A trie is a N-ary tree data structure for storing strings in order to support fast string search. Tries can insert or search the string in O(L) time where L is length of string. Trie is mainly used in cases where dictionary kind of data structure needs to be maintained.
Assuming the dictionary contains all the strings were made of all small alphabets “a” to “z” then each node of the trie will have 26 pointers, one for each character.
Let’s have a look into Trie class definition which we are going to use it here.
#define LETTER_LENGTH 26
class Trie
{
public:
Trie* m_ptr[LETTER_LENGTH];
bool is_word_complete;
Trie ();
};
Trie::Trie ()
{
for (int i = 0; i < LETTER_LENGTH; i++)
{
m_ptr[i] = NULL;
}
}
Let’s look into the below diagram to understand how autocomplete/suggestions would work in Tries.
Important Points
- For autocomplete to work, words need to be pushed into Trie before.
- Suggestions will be made based on prefix string typed by the user.
- Suggestions can be made based on multiple conditions, such as
- Most used based suggestions etc.
Implementation
In order to support autocomplete, Trie needs to be inserted with words initially and based on those suggestions will be made. Let’s define an algorithm to support auto complete feature using Trie.
Algorithm
- Start with Trie root node and iterate through string characters.
- For every character in string.
- Calculate the character index position (for eg: int index = key[i] – ‘a’)
- If trie node is present at the index location, then check for next character and set trie node to this node present at index.
- Else string is not in Trie. Return false.
- At this point we will have Trie node which correspond to string typed by user.
- Call a recursive function “print_trie” by passing Trie node and string constructed till this point.
- If “is_word_complete” is set to true then print the string received as suggestion.
- Iterate through all Trie node pointers and if pointer is present then
- Create a new string by adding pointer character to received string.
- Call “print_trie” with this Trie node and string constructed in previous step.
Before going ahead let’s have a look into Trie basics operations article. Let’s have a look into autocomplete function.
void print_trie (Trie* root, char* str)
{
if (root->is_word_complete) {
std::cout << str << std::endl;
}
for (int i = 0; i < LETTER_LENGTH; i++) {
if (root->m_ptr[i]) {
char t[40] = {'\0'};
snprintf (t, sizeof(str)+1, "%s%c", str, (char)('a' + i));
print_trie (root->m_ptr[i], t);
}
}
}
void auto_suggestion (Trie* root, const char* key)
{
Trie* temp = root;
for (int i = 0; i < strlen (key); i++) {
if (!temp->m_ptr[(key[i] - 'a')]) {
std::cout << "No suggestion found" << std::endl;
return;
}
temp = temp->m_ptr[(key[i] - 'a')];
}
if (temp) {
printf ("Suggestion for %s are: ", key);
char str[40] = {'\0'};
snprintf (str, sizeof(key), "%s", key);
print_trie (temp, str);
} else {
std::cout << "No suggestion found" << std::endl;
}
}
Before going ahead let’s have a look into Trie basics operations article. Let’s define a main function which can use above functions to show how autocomplete works.
int main ()
{
char keys[][20] = {"this", "is", "sparta", "these", "thief", "there"};
Trie* root = new Trie ();
int entries = sizeof (keys)/ sizeof (keys[0]);
for (int i = 0; i < entries; i++) {
insert (root, keys[i]);
}
auto_suggestion (root, "th");
}
Let’s look into the output of above program.
Suggestion for th are: there
these
thief
this