Google’s SMITH Algorithm – Everything You Need to Know

By Elinsys, 23rd Mar 2021

Google recently published a research paper on a new algorithm called SMITH. According to the paper, SMITH outperforms BERT at understanding long queries and long documents. Where BERT understands words in the context of sentences, SMITH understands passages in the context of the whole document, which lets it interpret long documents more accurately.

What is the SMITH Algorithm?

The SMITH algorithm is a new model designed to understand an entire document, whereas BERT is better suited to understanding words within the context of sentences. BERT is trained to predict randomly hidden words from the surrounding context within sentences. SMITH is also trained to predict hidden blocks of sentences from the rest of the document. As a result, it understands longer documents better than BERT does.
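To picture what a "block of sentences" is, here is a minimal sketch of splitting a document into blocks by packing whole sentences greedily up to a word budget. The sentence splitter is deliberately naive, and the function names and the max_words parameter are hypothetical choices for this illustration rather than details taken from Google's implementation.

```python
import re


def split_sentences(text):
    """Naive sentence splitter: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text)
    return [p.strip() for p in parts if p.strip()]


def fill_blocks(sentences, max_words):
    """Greedily pack whole sentences into blocks of at most max_words words.

    A single sentence longer than max_words still gets a block of its own.
    """
    blocks, current, count = [], [], 0
    for sentence in sentences:
        n_words = len(sentence.split())
        # Start a new block when adding this sentence would exceed the budget.
        if current and count + n_words > max_words:
            blocks.append(" ".join(current))
            current, count = [], 0
        current.append(sentence)
        count += n_words
    if current:
        blocks.append(" ".join(current))
    return blocks


if __name__ == "__main__":
    text = "One two three. Four five. Six seven eight nine. Ten."
    for block in fill_blocks(split_sentences(text), max_words=5):
        print(block)
```

Keeping sentences whole means each block stays a readable unit, which is the property a block-level model relies on.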

BERT Algorithm has some limitations

It is limited to short text, such as a few sentences or a paragraph, because the computational cost of self-attention grows quadratically with the input text length.
Matching very long texts requires a more thorough understanding of semantic relations, including matching patterns between text fragments that are far apart.
Long documents contain internal structure such as sections, passages and sentences. This structure plays a key role in understanding content, so a model needs to take it into account for better document matching performance.
Processing long texts is more likely to cause practical problems, such as running out of TPU/GPU memory, unless the model is carefully designed.
BERT is limited in how long an input document can be.
SMITH performs better as the documents get longer.
The SMITH model doesn’t replace BERT; it supplements BERT.
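To make the quadratic-cost point concrete, here is a small counting sketch. It assumes a single attention head in a single layer. The block size of 32 tokens and both function names are hypothetical values chosen for illustration, not figures from the SMITH paper. The second function counts the scores when attention is applied within blocks first and then across one summary position per block, which is roughly the idea behind a hierarchical model.

```python
def full_attention_scores(num_tokens: int) -> int:
    """Pairwise attention scores when every token attends to every token."""
    return num_tokens * num_tokens


def two_level_attention_scores(num_tokens: int, block_size: int) -> int:
    """Scores when tokens attend only within their block, then blocks attend to each other."""
    full_blocks, remainder = divmod(num_tokens, block_size)
    block_lengths = [block_size] * full_blocks + ([remainder] if remainder else [])
    within_blocks = sum(n * n for n in block_lengths)
    across_blocks = len(block_lengths) ** 2  # one summary position per block
    return within_blocks + across_blocks


if __name__ == "__main__":
    for n in (512, 2048, 8192):
        print(n, full_attention_scores(n), two_level_attention_scores(n, 32))
```

Going from 512 to 2048 tokens is a 4x increase in length but a 16x increase in full attention scores, which is why long inputs run into memory limits so quickly.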

First, the algorithm undergoes pre-training on a data set. In typical pre-training, engineers mask random words within sentences and the algorithm tries to predict the masked words. As the algorithm learns, it becomes optimized to make fewer mistakes on the training data. Then, relations between sentence blocks in a document are used to understand what the document is about. After testing, the researchers noted that SMITH does better with longer text documents.
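The masking step described above can be sketched in a few lines. This is only an illustration of how training examples might be prepared, assuming whitespace-separated words and a block-level objective that hides one whole block. The placeholder strings, the function names and the mask_prob parameter are all hypothetical, and no model is trained here.

```python
import random

MASK = "[MASK]"                # placeholder for a hidden word (hypothetical token)
MASKED_BLOCK = "[MASKED_BLOCK]"  # placeholder for a hidden sentence block (hypothetical token)


def mask_words(tokens, mask_prob, rng):
    """Hide each word with probability mask_prob; return the masked list and the answers."""
    masked, targets = [], {}
    for i, token in enumerate(tokens):
        if rng.random() < mask_prob:
            masked.append(MASK)
            targets[i] = token  # what the model would be asked to predict
        else:
            masked.append(token)
    return masked, targets


def mask_sentence_block(blocks, rng):
    """Hide one randomly chosen block; return the masked list, its index and the answer."""
    index = rng.randrange(len(blocks))
    masked = list(blocks)
    target = masked[index]
    masked[index] = MASKED_BLOCK
    return masked, index, target


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs give the same masks
    words = "the model learns to fill in hidden words".split()
    print(mask_words(words, 0.25, rng))
    print(mask_sentence_block(["Block one.", "Block two.", "Block three."], rng))
```

During training, the model's predictions for the hidden positions are compared with the stored answers, and the model is adjusted to reduce the number of mistakes.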