Load Dataset - Method

Download and load corpora and datasets directly in your program using the load_dataset method.

Getting Started

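Create a Dataset object and call its load_dataset method with the name of a supported dataset. The example below loads the ponniyin-selvan corpus and prints the first five items produced by a SentenceIterator.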

Example

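from indicLP.datasets import Dataset
from indicLP.utils import SentenceIterator

# Load the corpus with combine set to True
dt = Dataset()
data = dt.load_dataset("ponniyin-selvan", True)

# Wrap the first element of the loaded data in a SentenceIterator
sentenceIterator = SentenceIterator(data[0], "\\n")

# Print the first five items, then stop
count = 0
for i in sentenceIterator:
    print(i)
    count += 1
    if count == 5:
        break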
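
The counter-and-break loop above can also be written with itertools.islice from the Python standard library. The sketch below is only an illustration: first_n is a hypothetical helper, not part of indicLP, and it uses a plain list as a stand-in so it runs without downloading anything. It assumes SentenceIterator behaves like a regular Python iterable, as the for loop in the example suggests.

```python
from itertools import islice


def first_n(iterable, n):
    """Return the first n items of any iterable as a list (hypothetical helper)."""
    return list(islice(iterable, n))


# Stand-in data; in real use, pass the SentenceIterator from the example instead.
sentences = ["sentence one", "sentence two", "sentence three"]

preview = first_n(sentences, 2)
for sentence in preview:
    print(sentence)
```

Because islice stops after n items, the rest of the iterable is never consumed, which matters when the corpus is large.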

Input Arguments

The load_dataset method takes the following input arguments:

dataset_name (string): Name of the dataset to be loaded. The list of supported datasets is given below.
combine (bool): In the case of a corpus, setting this parameter to True combines all the text files present in the dataset into one.

Supported Datasets

The following datasets are supported:

bbc_hindi
tamil-news-classification
hindi-nli
ponniyin-selvan
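
Since the names mix underscores and hyphens, a typo is easy to make. The sketch below shows one way to check a name against the list above before passing it to load_dataset. SUPPORTED_DATASETS and check_dataset_name are hypothetical names introduced for this illustration; they are not part of indicLP, and the library may perform its own validation.

```python
# Names copied from the list of supported datasets above.
SUPPORTED_DATASETS = (
    "bbc_hindi",
    "tamil-news-classification",
    "hindi-nli",
    "ponniyin-selvan",
)


def check_dataset_name(dataset_name):
    """Return dataset_name unchanged if it is supported, else raise ValueError."""
    if dataset_name not in SUPPORTED_DATASETS:
        supported = ", ".join(SUPPORTED_DATASETS)
        raise ValueError(
            f"Unknown dataset {dataset_name!r}; supported datasets: {supported}"
        )
    return dataset_name


name = check_dataset_name("ponniyin-selvan")
print(name)
```

The validated name can then be passed on, for example as dt.load_dataset(name, True).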