indicLP

indicLP is a Python package developed specifically to perform NLP tasks on Indic languages such as Hindi and Tamil, with the aim of making AI a more inclusive space.

indicLP, from Aakash and Adityan, is a Python package designed for NLP tasks in Indic languages such as Hindi and Tamil.

Unicode

The package is built to work with Unicode text, the representation used for the characters of Indic scripts.
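To make the Unicode point concrete, here is a minimal sketch that uses only Python's standard `unicodedata` module; it does not call indicLP, and the helper names `script_of` and `describe` are hypothetical, chosen for illustration. It shows that a Hindi or Tamil word is a sequence of code points (base letters plus dependent vowel signs and marks), each of which has a named place in the Devanagari or Tamil block.

```python
import unicodedata

def script_of(ch: str) -> str:
    """Return the first word of a character's Unicode name, e.g. 'DEVANAGARI'."""
    return unicodedata.name(ch, "UNKNOWN").split(" ")[0]

def describe(text: str) -> list[tuple[str, str]]:
    """Pair each code point (formatted as U+XXXX) with its Unicode name."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch, "UNKNOWN")) for ch in text]

# Written with escapes so the exact code points are unambiguous.
hindi = "\u0939\u093f\u0902\u0926\u0940"  # हिंदी
tamil = "\u0ba4\u0bae\u0bbf\u0bb4\u0bcd"  # தமிழ்

for word in (hindi, tamil):
    scripts = {script_of(ch) for ch in word}
    print(word, "->", len(word), "code points,",
          len(word.encode("utf-8")), "UTF-8 bytes,", scripts)
    for code, name in describe(word):
        print("   ", code, name)
```

Note that `len()` counts code points rather than the characters a reader perceives: the vowel signs and the anusvara are separate code points attached to a base consonant. This is one reason Indic text generally benefits from Unicode-aware tooling rather than byte-level or naive character-level processing.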

Tokenization

A SentencePiece tokenizer, built with Google's sentencepiece library, is available for all supported languages.

Intuitive

An intuitive set of functions that makes the development process easier.

Word Embedding

Gensim word embeddings are built in for all supported languages.

Constantly Developing

We are constantly working to improve the library by adding more languages and functionality.

License

indicLP is licensed under the MIT License.
The indicLP library is free for personal or commercial use.