The Bangla dataset is a text-dependent dataset with 40 speakers. Each speaker uttered a single phrase, 'Ami vat khai' ('I eat rice'), 10 times at each of 3 speaking speeds. All speakers were from Noakhali, Bangladesh, and were aged 8 to 27 years. Each speaker was informed of the objective of the recording and gave verbal consent to the publication and use of this dataset for research purposes. The recordings were made in a quiet environment with an Android mobile phone at a sampling rate of 16 kHz and saved in .wav format. The dataset is divided into 3 top-level directories, one for each speaking speed. Each of these contains 40 numbered directories, one per speaker, and each speaker directory holds that speaker's 10 utterances. Please cite the paper for this dataset: M. A. Islam and A.-N. Sakib, "Bangla dataset and MMFCC in text-dependent speaker identification," Engineering and Applied Science Research, vol. 46, no. 1, pp. 56-63, 2019.
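
The directory layout described above (speaking speed, then speaker, then utterance) can be indexed with a few lines of Python. This is a minimal sketch, not part of the dataset itself: the function names `index_dataset` and `summarize` are hypothetical, and the code assumes only the three-level nesting and the .wav extension stated in the description, without relying on any particular directory or file names.

```python
from collections import defaultdict
from pathlib import Path


def index_dataset(root):
    """Map (speed_dir, speaker_dir) -> sorted list of .wav paths.

    Assumed layout (from the description): <root>/<speed>/<speaker>/<file>.wav
    The actual directory names are not assumed; they are read from disk.
    """
    index = defaultdict(list)
    for wav_path in sorted(Path(root).glob("*/*/*.wav")):
        speaker_dir = wav_path.parent
        speed_dir = speaker_dir.parent
        index[(speed_dir.name, speaker_dir.name)].append(wav_path)
    return dict(index)


def summarize(index):
    """Return (number of speeds, number of speakers, number of recordings)."""
    speeds = {speed for speed, _ in index}
    speakers = {speaker for _, speaker in index}
    total = sum(len(paths) for paths in index.values())
    return len(speeds), len(speakers), total
```

If a local copy matches the description exactly, `summarize(index_dataset(path_to_dataset))` should give 3 speeds, 40 speakers, and 3 x 40 x 10 = 1200 recordings; any other result suggests missing or extra files.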