Documentation

Quick Start

Up and running in under a minute

Preprocessing

Utilities for cleaning up text.

TextNormalizer

A class for processing text data.

Tokenizer - Method

Method to split sentences into tokens.
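
As a rough illustration of what tokenization involves, the sketch below splits a sentence into word and punctuation tokens with a regular expression. The function name `tokenize` is hypothetical and is not this library's API; the actual method may use different rules.

```python
import re


def tokenize(sentence: str) -> list:
    """Split a sentence into word and punctuation tokens (hypothetical helper)."""
    # \w+ matches runs of letters, digits, and underscores;
    # [^\w\s] matches a single punctuation character.
    return re.findall(r"\w+|[^\w\s]", sentence)


tokens = tokenize("Hello, world!")
print(tokens)
```

A regex tokenizer like this is language-agnostic but crude: it splits contractions and does not handle scripts without whitespace word boundaries.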

Stem - Method

Method to perform stemming on words.
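
To make the idea of stemming concrete, here is a deliberately naive suffix-stripping sketch. The suffix list and the `stem` function are assumptions for illustration only; the library's own method may implement a proper stemming algorithm for its target language.

```python
# Hypothetical suffix list, checked in order (longest first).
SUFFIXES = ("ing", "ed", "es", "s")


def stem(word: str) -> str:
    """Strip one common English suffix, keeping at least three characters."""
    for suffix in SUFFIXES:
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    # No suffix matched, or the remaining stem would be too short.
    return word


print([stem(w) for w in ["jumping", "jumped", "cats", "sing"]])
```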

Remove Stop Words - Method

Method to perform stopword removal on a list of words.
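
Stopword removal amounts to filtering a token list against a fixed vocabulary. The sketch below assumes a tiny hand-written stopword set and a hypothetical `remove_stop_words` function; the library presumably ships its own list.

```python
# Hypothetical, deliberately small stopword set for illustration.
STOP_WORDS = {"a", "an", "the", "is", "of", "on", "and", "to"}


def remove_stop_words(words, stop_words=STOP_WORDS):
    """Return the words that are not in the stopword set (case-insensitive)."""
    return [w for w in words if w.lower() not in stop_words]


print(remove_stop_words(["The", "cat", "is", "on", "the", "mat"]))
```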

Embedding

A class to perform word embedding.

Get Vector - Method

Method to get the embedding vector for a word.

Get Closest - Method

Method to get the words closest to a given word.
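
The two embedding lookups above can be sketched with a toy vocabulary and cosine similarity. The vectors, the `EMBEDDINGS` table, and the function names `get_vector` and `get_closest` are all made up for illustration; the library's signatures and similarity measure may differ.

```python
import math

# Toy 3-dimensional embeddings (invented values, not from any trained model).
EMBEDDINGS = {
    "king": [0.90, 0.80, 0.10],
    "queen": [0.85, 0.82, 0.15],
    "apple": [0.10, 0.20, 0.90],
    "pear": [0.12, 0.18, 0.88],
}


def get_vector(word):
    """Look up the embedding vector for a word."""
    return EMBEDDINGS[word]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def get_closest(word, k=1):
    """Return the k other words with the highest cosine similarity."""
    target = get_vector(word)
    others = [w for w in EMBEDDINGS if w != word]
    others.sort(key=lambda w: cosine(target, get_vector(w)), reverse=True)
    return others[:k]


print(get_closest("king"))
```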

Convert - Method

Method to transliterate a word from one script to another.

Revert - Method

Method to transliterate a word back to its original script.
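
A minimal sketch of a reversible convert/revert pair, assuming a one-to-one character mapping between two scripts. The Cyrillic-to-Latin table and the function names are hypothetical; the source does not say which scripts the library supports, and real transliteration usually needs multi-character and context-dependent rules.

```python
# Hypothetical one-to-one mapping covering only a few letters.
TO_LATIN = {"к": "k", "о": "o", "т": "t", "м": "m", "а": "a"}
# Inverting the table gives the reverse direction.
TO_CYRILLIC = {latin: cyr for cyr, latin in TO_LATIN.items()}


def convert(word: str) -> str:
    """Map each character to the target script; unknown characters pass through."""
    return "".join(TO_LATIN.get(ch, ch) for ch in word)


def revert(word: str) -> str:
    """Map each character back to the original script."""
    return "".join(TO_CYRILLIC.get(ch, ch) for ch in word)


print(convert("кот"), revert(convert("кот")))
```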

BahdanauAttention

PyTorch attention module (additive, Bahdanau-style), built on torch.nn.
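
For orientation, Bahdanau-style (additive) attention scores each encoder state with a small feed-forward network. The sketch below is one common formulation, not this library's implementation: the class name `AdditiveAttention`, the constructor argument, and the tensor shapes are assumptions.

```python
import torch
import torch.nn as nn


class AdditiveAttention(nn.Module):
    """Additive attention: score = v^T tanh(W_q q + W_k k)."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.w_query = nn.Linear(hidden_size, hidden_size, bias=False)
        self.w_keys = nn.Linear(hidden_size, hidden_size, bias=False)
        self.v = nn.Linear(hidden_size, 1, bias=False)

    def forward(self, query, keys):
        # query: (batch, hidden); keys: (batch, seq_len, hidden)
        projected = self.w_query(query).unsqueeze(1) + self.w_keys(keys)
        scores = self.v(torch.tanh(projected)).squeeze(-1)  # (batch, seq_len)
        weights = torch.softmax(scores, dim=-1)  # sums to 1 over seq_len
        # Weighted sum of the keys gives the context vector: (batch, hidden)
        context = torch.bmm(weights.unsqueeze(1), keys).squeeze(1)
        return context, weights


attention = AdditiveAttention(hidden_size=8)
context, weights = attention(torch.randn(2, 8), torch.randn(2, 5, 8))
print(context.shape, weights.shape)
```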

LuongAttention

PyTorch attention module (multiplicative, Luong-style), built on torch.nn.
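
Luong-style (multiplicative) attention replaces the feed-forward scorer with a bilinear product. The sketch below uses the "general" score, query^T W key; the class name `MultiplicativeAttention` and the tensor shapes are assumptions and may not match this library's module.

```python
import torch
import torch.nn as nn


class MultiplicativeAttention(nn.Module):
    """Multiplicative attention with the 'general' score: q^T W k."""

    def __init__(self, hidden_size: int):
        super().__init__()
        self.w = nn.Linear(hidden_size, hidden_size, bias=False)

    def forward(self, query, keys):
        # query: (batch, hidden); keys: (batch, seq_len, hidden)
        # (batch, seq_len, hidden) @ (batch, hidden, 1) -> (batch, seq_len, 1)
        scores = torch.bmm(self.w(keys), query.unsqueeze(-1)).squeeze(-1)
        weights = torch.softmax(scores, dim=-1)  # sums to 1 over seq_len
        # Weighted sum of the keys gives the context vector: (batch, hidden)
        context = torch.bmm(weights.unsqueeze(1), keys).squeeze(1)
        return context, weights


attention = MultiplicativeAttention(hidden_size=8)
context, weights = attention(torch.randn(2, 8), torch.randn(2, 5, 8))
print(context.shape, weights.shape)
```

Compared with the additive form, this variant needs only one weight matrix and a batched matrix product, which makes it cheaper per step.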

Load Dataset - Method

Method to download and load corpora and datasets.