Getting Started
Download and load corpora and datasets directly in your program using the Dataset class from indicLP.datasets.
Example
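from indicLP.datasets import Dataset
from indicLP.utils import SentenceIterator

dt = Dataset()
# Load the dataset; True combines its text files into one
data = dt.load_dataset("ponniyin-selvan", True)

# Iterate over the first loaded text, split on newlines
sentenceIterator = SentenceIterator(data[0], "\n")

# Print only the first five sentences
count = 0
for i in sentenceIterator:
    print(i)
    count += 1
    if count == 5:
        break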
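The manual counter in the example can also be written with itertools.islice from the Python standard library, which works on any iterable. The sketch below uses a hypothetical stand-in generator, split_on_delimiter, in place of SentenceIterator so that it runs without downloading a dataset. It is not indicLP's implementation, and the sample text is made up.

```python
from itertools import islice


def split_on_delimiter(text, delimiter):
    """Hypothetical stand-in for SentenceIterator: yield non-empty pieces of text."""
    for piece in text.split(delimiter):
        piece = piece.strip()
        if piece:
            yield piece


# Made-up sample text; a real run would pass data[0] from load_dataset instead.
sample_text = "line one\nline two\nline three\nline four\nline five\nline six\n"

# islice stops after five items, replacing the count/break bookkeeping.
first_five = list(islice(split_on_delimiter(sample_text, "\n"), 5))
for sentence in first_five:
    print(sentence)
```

Because islice only needs an iterable, the same pattern should apply to any object that can be used in a for loop, as SentenceIterator is in the example above.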
Input Arguments
The load_dataset method takes the following input arguments:
- dataset_name (string): Name of the dataset to be loaded. The list of supported datasets is given below.
- combine (bool): In the case of a corpus, setting this parameter to True combines all the text files in the dataset into one.
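As a rough illustration of what the combine flag describes, the sketch below merges several texts into a single entry. The helper combine_text_files, the file names, and the return shape (a list of texts) are all assumptions made for this example. How indicLP actually stores and merges files is not shown in this document.

```python
def combine_text_files(files, combine):
    """Hypothetical helper: return one merged text when combine is True,
    otherwise return the individual texts as a list."""
    texts = list(files.values())
    if combine:
        # Join every file's text into a single entry, separated by newlines.
        return ["\n".join(texts)]
    return texts


# Made-up file contents standing in for a multi-file corpus.
corpus_files = {
    "part1.txt": "first file text",
    "part2.txt": "second file text",
}

separate = combine_text_files(corpus_files, combine=False)
merged = combine_text_files(corpus_files, combine=True)

print(len(separate))  # number of separate texts
print(len(merged))    # number of texts after combining
```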
Supported Datasets
The following datasets are supported: