Dataset Text | Albanian (Albania) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 12,000 words | Add Dataset to Quote | sqi_ALB_PHON | Appen Global | Pronunciation Dictionary | Albanian | Albania | N/A | N/A | N/A | N/A | 12,000 | N/A | text | Albanian (Albania) Pronunciation Dictionary | ||
Dataset Text | Amharic (Ethiopia) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 49,000 words | Add Dataset to Quote | amh_ETH_PHON | Appen Global | Pronunciation Dictionary | Amharic | Ethiopia | N/A | N/A | N/A | N/A | 49,000 | N/A | text | Amharic (Ethiopia) Pronunciation Dictionary | ||
Dataset Text | Arabic (Algeria) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 11,000 words | Add Dataset to Quote | ara_DZA_PHON | Appen Global | Pronunciation Dictionary | Arabic | Algeria | N/A | N/A | N/A | N/A | 11,000 | N/A | text | Arabic (Algeria) Pronunciation Dictionary | ||
Dataset Audio | Arabic (Eastern Algeria) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 29 hours | Add Dataset to Quote | EAR_ASR001 | Appen Global | Conversational Speech | Arabic | Algeria | Low background noise (home/office) | 496 | 2 | Available on request | 11,327 | 8 | alaw | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. For the majority of calls, both speakers (in-line/out-line) were collected and transcribed; for a smaller number of calls, only one half of the conversation was collected and transcribed. 8% landline, 92% mobile. | Arabic (Eastern Algeria) conversational telephony | |
Dataset Text | Arabic (Egypt) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 40,000 words | Add Dataset to Quote | ara_EGY_PHON | Appen Global | Pronunciation Dictionary | Arabic | Egypt | N/A | N/A | N/A | N/A | 40,000 | N/A | text | Arabic (Egypt) Pronunciation Dictionary | ||
Dataset Audio | Arabic (Egypt) scripted smartphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Mobile phone | Unit: 352 hours | Add Dataset to Quote | ARE_ASR001_CN | Appen China | Scripted Speech | Arabic | Egypt | Low background noise (home/office) | 627 | 1 | 128,908 | 207,576 | 16 | wav | Dataset contains audio with corresponding text prompts. Text prompts are not vowelised. | Arabic (Egypt) scripted smartphone | |
Dataset Text | Arabic (Iraq) Part of Speech Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 13,000 words | Add Dataset to Quote | ara_IRQ_POS | Appen Global | Part of Speech Dictionary | Arabic | Iraq | N/A | N/A | N/A | N/A | 13,000 | N/A | text | Arabic (Iraq) Part of Speech Dictionary | ||
Dataset Text | Arabic (Iraq) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 19,000 words | Add Dataset to Quote | ara_IRQ_PHON | Appen Global | Pronunciation Dictionary | Arabic | Iraq | N/A | N/A | N/A | N/A | 19,000 | N/A | text | Person names | Arabic (Iraq) Pronunciation Dictionary | |
Dataset Text | Arabic (Libya) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 48,000 words | Add Dataset to Quote | ara_LBY_PHON | Appen Global | Pronunciation Dictionary | Arabic | Libya | N/A | N/A | N/A | N/A | 48,000 | N/A | text | Arabic (Libya) Pronunciation Dictionary | ||
Dataset Audio | Arabic (Modern Standard Arabic) scripted microphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Microphone | Unit: 12 hours | Add Dataset to Quote | MSA_ASR001 | Global Phone | Scripted Speech | Arabic | Tunisia | Low background noise (home/office) | 78 | 1 | 4,908 | Available on request | 16 | wav | Dataset is fully transcribed, and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with a large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Arabic (Modern Standard Arabic) scripted microphone | |
Dataset Audio | Arabic (Morocco) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 33 hours | Add Dataset to Quote | ARY_ASR001 | Appen Global | Conversational Speech | Arabic | Morocco | Low background noise | 180 | 2 | 80,544 | 23,836 | 8 | alaw | Each speaker participated in 1 to 4 conversations. Speakers are identified by a unique 4-digit speaker ID, which is recorded in the demographic file. Transcription is available in original script and in a fully reversible Romanised version with an accompanying pronunciation lexicon. English translation of the product transcription is available (ARY_MT001, ARY_ASRMT001). | Arabic (Morocco) conversational telephony | |
Dataset Text | Arabic (Morocco) conversational telephony translation | Common Use Cases: MT, Chatbot, Conversational AI | Recording Device: N/A | Unit: 80,544 utterances | Add Dataset to Quote | ARY_MT001 | Appen Global | Conversational Translation | Arabic | Morocco | N/A | 180 | N/A | 80,430 | 23,844 | N/A | text | Corresponding audio, transcription, fully reversible Romanised transcription and pronunciation lexicon data are available (ARY_ASR001, ARY_ASRMT001). | Arabic (Morocco) conversational telephony translation | |
Dataset Text | Arabic (Morocco) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 60,000 words | Add Dataset to Quote | ara_MAR_PHON | Appen Global | Pronunciation Dictionary | Arabic | Morocco | N/A | N/A | N/A | N/A | 60,000 | N/A | text | Arabic (Morocco) Pronunciation Dictionary | ||
Dataset Text | Arabic (MSA) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 40,000 words | Add Dataset to Quote | arb_MSA_PHON | Appen Global | Pronunciation Dictionary | Standard Arabic | N/A | N/A | N/A | N/A | N/A | 40,000 | N/A | text | Arabic (MSA) Pronunciation Dictionary | ||
Dataset Audio | Arabic (Saudi Arabia) scripted smartphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Mobile phone | Unit: 322 hours | Add Dataset to Quote | ARS_ASR001_CN | Appen China | Scripted Speech | Arabic | Saudi Arabia | Low background noise (home/office) | 227 | 1 | 104,574 | 156,282 | 16 | wav | Dataset contains audio with corresponding text prompts. Text prompts are not vowelised. 300-1000 prompts per speaker covering general content including education, sports, entertainment, travel, culture and technology. | Arabic (Saudi Arabia) scripted smartphone | |
Dataset Text | Arabic (Sudan) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 17,000 words | Add Dataset to Quote | ara_SDN_PHON | Appen Global | Pronunciation Dictionary | Arabic | Sudan | N/A | N/A | N/A | N/A | 17,000 | N/A | text | Arabic (Sudan) Pronunciation Dictionary | ||
Dataset Text | Arabic (United Arab Emirates (UAE)) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 75,000 words | Add Dataset to Quote | ara_ARE_PHON | Appen Global | Pronunciation Dictionary | Arabic | United Arab Emirates (UAE) | N/A | N/A | N/A | N/A | 75,000 | N/A | text | Arabic (United Arab Emirates (UAE)) Pronunciation Dictionary | ||
Dataset Audio | Arabic (United Arab Emirates (UAE)) scripted smartphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Mobile phone | Unit: 170 hours | Add Dataset to Quote | ARU_ASR001_CN | Appen China | Scripted Speech | Arabic | United Arab Emirates (UAE) | Low background noise (home/office) | 133 | 1 | 42,352 | 85,775 | 16 | wav | Dataset contains audio with corresponding text prompts. Text prompts are not vowelised. | Arabic (United Arab Emirates (UAE)) scripted smartphone | |
Dataset Audio | Arabic (United Arab Emirates (UAE)) scripted telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Mobile phone and landline | Unit: 48 hours | Add Dataset to Quote | OrienTel United Arab Emirates MCA (Modern Colloquial Arabic) | Nuance | Scripted Speech | Arabic | United Arab Emirates (UAE) | Low background noise | 880 | 1 | 43,000 | Available on request | 8 | alaw | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 49 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words, and spontaneous items for control. | Arabic (United Arab Emirates (UAE)) scripted telephony | |
Dataset Audio | Arabic (United Arab Emirates (UAE)) scripted telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Mobile phone and landline | Unit: 31 hours | Add Dataset to Quote | OrienTel United Arab Emirates MSA (Modern Standard Arabic) | Nuance | Scripted Speech | Arabic | United Arab Emirates (UAE) | Low background noise | 500 | 1 | 24,500 | Available on request | 8 | alaw | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 49 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words, and spontaneous items for control. | Arabic (United Arab Emirates (UAE)) scripted telephony | |
Dataset Audio | Arabic (United Arab Emirates (UAE)/ Saudi Arabia) scripted microphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Microphone | Unit: 86 hours | Add Dataset to Quote | CGA_ASR001 | Appen Global | Scripted Speech | Arabic | United Arab Emirates (UAE) - Saudi Arabia | Low background noise (home/office) | 150 | 4 | 42,000 | 19,245 | 16 | raw PCM | Fully transcribed with acoustic event tagging derived from the SpeechDAT conventions. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. All transcriptions fully vowelized. 280 prompts per speaker including 30 Person names (first name and family name) from a set of 15, 10 single isolated digits 0-10, 8-digit sequences (randomly generated), 200 phonetically balanced sentences, 30 x 10-word phonetically balanced word strings. | Arabic (United Arab Emirates (UAE)/ Saudi Arabia) scripted microphone | |
Dataset Text | Arabic NER news text | Common Use Cases: NER, Content Classification, Search Engines | Recording Device: N/A | Unit: 20,774 sentences | Add Dataset to Quote | ARB_NER001 | Appen Global | News NER | Standard Arabic | N/A | N/A | N/A | N/A | 20,774 | Available on request | N/A | text | Arabic NER news text | ||
Dataset Text | Assamese (India) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 40,000 words | Add Dataset to Quote | asm_IND_PHON | Appen Global | Pronunciation Dictionary | Assamese | India | N/A | N/A | N/A | N/A | 40,000 | N/A | text | Assamese (India) Pronunciation Dictionary | ||
Dataset Audio | Baby crying audio | Common Use Cases: Baby Monitor, Security & Other Consumer Applications | Recording Device: Mobile phone | Unit: 3 hours | Add Dataset to Quote | CRY_ASR001 | Appen China | Human Sound | N/A | China | Low background noise (home/office) | 100 | 1 | N/A | N/A | 16 | wav | Crying sound of babies 0-3 years old, each lasting around 2 minutes. | Baby crying audio | |
Dataset Audio | Bahasa Indonesia conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 31 hours | Add Dataset to Quote | BAH_ASR001 | Appen Global | Conversational Speech | Indonesian | Indonesia | Low background noise | 1,002 | 2 | 30,695 | 11,480 | 8 | wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. For a large proportion of calls, only one half of the conversation was collected and transcribed. 28% landline, 72% mobile. | Bahasa Indonesia conversational telephony | |
Dataset Text | Basque (Spain) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 10,000 words | Add Dataset to Quote | eus_ESP_PHON | Appen Global | Pronunciation Dictionary | Basque | Spain | N/A | N/A | N/A | N/A | 10,000 | N/A | text | Basque (Spain) Pronunciation Dictionary | ||
Dataset Audio | Bengali (Bangladesh) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 47 hours | Add Dataset to Quote | BEN_ASR001 | Appen Global | Conversational Speech | Bengali | Bangladesh | Mixed (in-car, roadside, home/office) | 1,000 | 2 | 108,923 | 17,922 | 8 | wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. | Bengali (Bangladesh) conversational telephony | |
Dataset Text | Bengali (India) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 29,000 words | Add Dataset to Quote | ben_IND_PHON | Appen Global | Pronunciation Dictionary | Bengali | India | N/A | N/A | N/A | N/A | 29,000 | N/A | text | Bengali (India) Pronunciation Dictionary | ||
Dataset Audio | Bulgarian (Bulgaria) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 38 hours | Add Dataset to Quote | BUL_ASR001 | Appen Global | Conversational Speech | Bulgarian | Bulgaria | Low background noise (home/office) | 217 | 2 | 86,453 | 22,342 | 8 | alaw or wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 200 telephony conversations are recorded for this project: 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers. 49% landline, 51% mobile. Conversations cover a range of topics including Holiday/Leisure, Movies/TV Shows and Work. | Bulgarian (Bulgaria) conversational telephony | |
Dataset Text | Bulgarian (Bulgaria) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 55,000 words | Add Dataset to Quote | bul_BGR_PHON | Appen Global | Pronunciation Dictionary | Bulgarian | Bulgaria | N/A | N/A | N/A | N/A | 55,000 | N/A | text | Bulgarian (Bulgaria) Pronunciation Dictionary | ||
Dataset Audio | Bulgarian (Bulgaria) scripted microphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Microphone | Unit: 22 hours | Add Dataset to Quote | BUL_ASR002 | Global Phone | Scripted Speech | Bulgarian | Bulgaria | Low background noise (home/office) | 77 | 1 | 8,674 | Available on request | 16 | wav | Dataset is fully transcribed, and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with a large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Bulgarian (Bulgaria) scripted microphone | |
Dataset Image | Business-to-business printed text document OCR | Common Use Cases: Document Processing, Document Search | Recording Device: Camera, scan | Unit: 5,832 documents | Add Dataset to Quote | IMG_OCR_B2B | Appen Global | Document OCR | N/A | N/A | Mixed lighting conditions | N/A | N/A | N/A | N/A | N/A | png | Scans and photographs of business-to-business documents containing printed text. 38% Premium Quality images including Purchase Order, Payment Advice or Remittance Advice, Order Confirmation and Delivery note; 64% Standard Quality images in various challenging conditions in a wider range of categories including Complaints or Return, Delivery advice, Delivery note, Dunning, Goods receipt, Invoice, Offer, Order confirmation, Pay slip, Payment Advice or Remittance Advice, Purchase Order, Receipt, and Supplier load | Business-to-business printed text document OCR | |
Dataset Image | Business-to-consumer/other text document OCR | Common Use Cases: Document Processing, Document Search | Recording Device: Camera, scan | Unit: 22,626 documents | Add Dataset to Quote | IMG_OCR_B2C_Other | Appen Global | Document OCR | N/A | N/A | Mixed lighting conditions | N/A | N/A | N/A | N/A | N/A | png | Scans and photographs of business-to-consumer and miscellaneous other category documents containing text: 37% invoices, 42% receipts, 1% documents with tables, 2% handwritten forms and documents, 2% menus, 11% product labels, 2% posters, 3% street signs. 6 Languages collected in 23+ locales: 11% Arabic, 43% English, 4% French, 4% German, 24% Spanish, 14% Russian | Business-to-consumer/other text document OCR | |
Dataset Text | Cantonese (China) Part of Speech Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 10,000 words | Add Dataset to Quote | yue_HKG_POS | Appen Global | Part of Speech Dictionary | Cantonese | China | N/A | N/A | N/A | N/A | 10,000 | N/A | text | Traditional | Cantonese (China) Part of Speech Dictionary | |
Dataset Text | Cantonese (China) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 37,000 words | Add Dataset to Quote | yue_CHN_PHON | Appen Global | Pronunciation Dictionary | Cantonese | China | N/A | N/A | N/A | N/A | 37,000 | N/A | text | Simplified | Cantonese (China) Pronunciation Dictionary | |
Dataset Text | Cantonese (China) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 40,000 words | Add Dataset to Quote | yue_CHN_PHON | Appen Global | Pronunciation Dictionary | Cantonese | China | N/A | N/A | N/A | N/A | 40,000 | N/A | text | Traditional | Cantonese (China) Pronunciation Dictionary | |
Dataset Text | Catalan (Spain) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 10,000 words | Add Dataset to Quote | cat_ESP_PHON | Appen Global | Pronunciation Dictionary | Catalan | Spain | N/A | N/A | N/A | N/A | 10,000 | N/A | text | Catalan (Spain) Pronunciation Dictionary | ||
Dataset Text | Cebuano (Philippines) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 21,000 words | Add Dataset to Quote | ceb_PHL_PHON | Appen Global | Pronunciation Dictionary | Cebuano | Philippines | N/A | N/A | N/A | N/A | 21,000 | N/A | text | Cebuano (Philippines) Pronunciation Dictionary | ||
Dataset Audio | Chinese (multinational foreigner) scripted smartphone | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone | Unit: 200 hours | Add Dataset to Quote | FOREIGNER_ASR001_CN | Appen China | Scripted Speech | Mandarin Chinese | China | Low background noise | 309 | 1 | 16 | wav | This database contains 200 hours of foreigners speaking Chinese from the following countries: Argentina, Egypt, Australia, Russia, the Philippines, Kazakhstan, Korea, Kyrgyzstan, Canada, Kuala Lumpur, Kenya, Laos, Malaysia, Mauritius, the United States, Mongolia, South Africa, Japan, Tajikistan, Thailand, Turkey, Hong Kong, Singapore, India, Indonesia, Vietnam. There is no data from South Korea, Brazil, or data recorded by minors. Each session lasts about an hour; sentence duration ranges between 3-10 seconds. The content is in the form of an individual reading while being recorded on a mobile phone in a home/office environment. Sensitive data and personal information have been scrubbed. | Chinese (multinational foreigner) scripted smartphone | |||
Dataset Audio | Croatian (Croatia) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 39 hours | Add Dataset to Quote | CRO_ASR001 | Appen Global | Conversational Speech | Croatian | Croatia | Low background noise (home/office) | 200 | 2 | Available on request | 23,919 | 8 | alaw | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 200 telephony conversations are recorded for this project: 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers. 53% landline, 47% mobile. Conversations cover a range of topics including News & Current Affairs, Health and Sport. | Croatian (Croatia) conversational telephony | |
Dataset Text | Croatian (Croatia) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 19,000 words | Add Dataset to Quote | hrv_HRV_PHON | Appen Global | Pronunciation Dictionary | Croatian | Croatia | N/A | N/A | N/A | N/A | 19,000 | N/A | text | Croatian (Croatia) Pronunciation Dictionary | ||
Dataset Audio | Croatian (Croatia) scripted microphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Microphone | Unit: 11 hours | Add Dataset to Quote | CRO_ASR002 | Global Phone | Scripted Speech | Croatian | Croatia | Low background noise (home/office) | 94 | 1 | 4,499 | 23,929 | 16 | wav | Dataset is fully transcribed, and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with a large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Croatian (Croatia) scripted microphone | |
Dataset Audio | Croatian (Croatia) scripted smartphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Mobile phone | Unit: 263 hours | Add Dataset to Quote | CRO_ASR003_CN | Appen China | Scripted Speech | Croatian | Croatia | Low background noise (home/office) | 243 | 1 | 73,467 | 136,140 | 16 | wav | Dataset contains audio with corresponding text prompts | Croatian (Croatia) scripted smartphone | |
Dataset Text | Czech (Czech Republic) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 50,000 words | Add Dataset to Quote | ces_CZE_PHON | Appen Global | Pronunciation Dictionary | Czech | Czech Republic | N/A | N/A | N/A | N/A | 50,000 | N/A | text | Czech (Czech Republic) Pronunciation Dictionary | ||
Dataset Audio | Czech (Czech Republic) scripted microphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Microphone | Unit: 31 hours | Add Dataset to Quote | CZE_ASR001 | Global Phone | Scripted Speech | Czech | Czech Republic | Low background noise (home/office) | 102 | 1 | 12,425 | Available on request | 16 | wav | Dataset is fully transcribed, and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with a large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). | Czech (Czech Republic) scripted microphone | |
Dataset Audio | Czech (Czech Republic) scripted telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Landline only | Unit: 93 hours | Add Dataset to Quote | Czech SpeechDat(E) Dataset | Nuance | Scripted Speech | Czech | Czech Republic | Low background noise | 1,000 | 1 | 52,000 | Available on request | 8 | alaw | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 52 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, and phonetically rich words and sentences. | Czech (Czech Republic) scripted telephony | |
Dataset Text | Danish (Denmark) Part of Speech Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 100,000 words | Add Dataset to Quote | dan_DNK_POS | Appen Global | Part of Speech Dictionary | Danish | Denmark | N/A | N/A | N/A | N/A | 100,000 | N/A | text | Danish (Denmark) Part of Speech Dictionary | ||
Dataset Text | Danish (Denmark) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 107,000 words | Add Dataset to Quote | dan_DNK_PHON | Appen Global | Pronunciation Dictionary | Danish | Denmark | N/A | N/A | N/A | N/A | 107,000 | N/A | text | Danish (Denmark) Pronunciation Dictionary | ||
Dataset Audio | Danish (Denmark) scripted microphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Microphone | Unit: 53 hours | Add Dataset to Quote | Speecon Danish | Nuance | Scripted Speech | Danish | Denmark | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170,000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences, and free and elicited spontaneous responses for adult speakers. | Danish (Denmark) scripted microphone | |
Dataset Audio | Dari (Afghanistan) broadcast | Common Use Cases: ASR, Automatic Captioning, Keyword Spotting | Recording Device: Microphone | Unit: 51 hours | Add Dataset to Quote | DAR_BRC001 | Appen Global | Broadcast Speech | Dari | Afghanistan | Low background noise (studio) | N/A | 1 | Available on request | Available on request | N/A | wav | Dataset is fully transcribed and timestamped. Pronunciation lexicon not currently available but can be developed upon request. Dataset is largely speech only and does not include music or advertisements. Data types include: talk shows, interviews, news broadcasts (excluding news reading by anchors). 13% landline, 87% mobile. | Dari (Afghanistan) broadcast | |
Dataset Audio | Dari (Afghanistan) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 40 hours | Add Dataset to Quote | DAR_ASR001 | Appen Global | Conversational Speech | Dari | Afghanistan | Low background noise | 500 | 2 | Available on request | 11,168 | 8 | alaw | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. Dataset is largely speech only and does not include music or advertisements. 13% landline, 87% mobile. | Dari (Afghanistan) conversational telephony | |
Dataset Text | Dari (Afghanistan) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 31,000 words | Add Dataset to Quote | prs_AFG_PHON | Appen Global | Pronunciation Dictionary | Dari | Afghanistan | N/A | N/A | N/A | N/A | 31,000 | N/A | text | Dari (Afghanistan) Pronunciation Dictionary | ||
Dataset Text | Dholuo (Kenya) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 23,000 words | Add Dataset to Quote | luo_KEN_PHON | Appen Global | Pronunciation Dictionary | Dholuo | Kenya | N/A | N/A | N/A | N/A | 23,000 | N/A | text | Dholuo (Kenya) Pronunciation Dictionary | ||
Dataset Audio | Dongbei dialect (China) Conversational Speech | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Recording pen/microphone | Unit: 84.6 hours | Add Dataset to Quote | DONGBEI_ASR001_CN | Appen China | Conversational Speech | Dongbei dialect | China | Low background noise | 268 | 1 | 16 | wav | Audio only; transcription not included. Audio recordings cover 19 districts: Shenyang Heping District, Shenhe District, Huanggu District, Dadong District, Tiexi District, Lvyuan District, Chaoyang District, Kuancheng District, Erdao District, Nanguan District, Daoli District, Nangang District, Daowai District, Pingfang District, Songbei District, Xiangfang District, Hulan District, Acheng District and Shuangcheng District. Northeast suburb accents not included, and no minors were recorded. Each recording session contains 20-30 minutes of free dialogue between 2-5 people. Sensitive data and personal information have been scrubbed. | Dongbei dialect (China) Conversational Speech | |||
Dataset Audio | Dongbei dialect (China) Conversational Speech | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone | Unit: 75.2 hours | Add Dataset to Quote | DONGBEI_ASR002_CN | Appen China | Conversational Speech | Dongbei dialect | China | Low background noise | 185 | 1 | 8 | wav | Audio only; transcription not included. Audio recordings cover 19 districts: Shenyang Heping District, Shenhe District, Huanggu District, Dadong District, Tiexi District, Lvyuan District, Chaoyang District, Kuancheng District, Erdao District, Nanguan District, Daoli District, Nangang District, Daowai District, Pingfang District, Songbei District, Xiangfang District, Hulan District, Acheng District and Shuangcheng District. Northeast suburb accents not included, and no minors were recorded. Each recording session contains 20-30 minutes of free dialogue between 2-5 people. Sensitive data and personal information have been scrubbed. | Dongbei dialect (China) Conversational Speech | |||
Dataset Audio | Dutch (Belgium) scripted microphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Microphone | Unit: 47 hours | Add Dataset to Quote | Speecon Dutch from Belgium | Nuance | Scripted Speech | Dutch | Belgium | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170,000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences, and free and elicited spontaneous responses for adult speakers. | Dutch (Belgium) scripted microphone | |
Dataset Audio | Dutch (Belgium) scripted telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Microphone | Unit: 80 hours | Add Dataset to Quote | Flemish SpeechDat(II) FDB-1000 (FIXED1FL) | Nuance | Scripted Speech | Dutch | Belgium | Low background noise | 1,000 | 1 | 52,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 52 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words, and spontaneous items for control. | Dutch (Belgium) scripted telephony | |
Dataset Audio | Dutch (Netherlands & Belgium) scripted in-car | Common Use Cases: ASR, Virtual Assistant, In Car HMI & Entertainment | Recording Device: Microphone and mobile phone | Unit: 27 hours | Add Dataset to Quote | Dutch and Flemish SpeechDat-Car | Nuance | Scripted Speech | Dutch | Netherlands - Belgium | Mixed (in-car) | 302 | 5 | 15,100 | Available on request | 16 and 8 | Available on request | Dataset is fully transcribed and is accompanied by a pronunciation lexicon and validation report. 125 prompts per adult speaker including digits, natural numbers, letter strings, personal, place and business names (some spontaneous), generic command and control items, phonetically rich words and sentences, and prompts for spontaneous speech. | Dutch (Netherlands & Belgium) scripted in-car | |
Dataset Audio | Dutch (Netherlands) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 36 hours | Add Dataset to Quote | NLD_ASR001 | Appen Global | Conversational Speech | Dutch | Netherlands | Low background noise | 200 | 2 | Available on request | 14,964 | 8 | alaw | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words 200 telephony conversations are recorded for this project - 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers 51% landline, 49% mobile Conversations cover a range of topics including: Holiday/Leisure, Work and Sport. |
Dutch (Netherlands) conversational telephony | |
Dataset Text | Dutch (Netherlands) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 45,000 words | Add Dataset to Quote | nld_NLD_PHON | Appen Global | Pronunciation Dictionary | Dutch | Netherlands | N/A | N/A | N/A | N/A | 45,000 | N/A | text | Dutch (Netherlands) Pronunciation Dictionary | ||
Dataset Audio | Dutch (Netherlands) scripted microphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Microphone | Unit: 68 hours | Add Dataset to Quote | Speecon Dutch from the Netherlands | Nuance | Scripted Speech | Dutch | Netherlands | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170,000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences and free and elicited spontaneous responses for adult speakers |
Dutch (Netherlands) scripted microphone | |
Dataset Image | East African facial images | Common Use Cases: Facial Recognition | Recording Device: Camera | Unit: 13,500 images | Add Dataset to Quote | IMG_FACE_KEN_CN | Appen China | Human Face | N/A | Kenya | Mixed background and lighting conditions | 99 | N/A | N/A | N/A | N/A | jpg | Images of 99 participants containing all combinations of 9 different lighting conditions, 2 different distances between the participant's face and the smartphone, and 7 different camera angles. All combinations of these 3 requirements were completed per participant. A random 32 images per person include occlusions such as sunglasses, masks, wigs or hats. A random 36 shots include different facial expressions including stare, open mouth, pout mouth, smile and frown. Lighting conditions: indoor normal light, outdoor normal light, indoor backlight, outdoor backlight, indoor ordinary dark light, full black screen fill light, point light source (white light, street light), neon light (monochromatic red, green and blue, multi-color mixed light), side glare. Distances: 30cm and 50cm. Camera angles: front, left 45°, right 45°, left 15°, right 15°, top 30°, bottom 30°. |
East African facial images | |
Dataset Audio | English (Arabic - Levant/Egypt) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 28 hours | Add Dataset to Quote | ENA_ASR001 | Appen Global | Conversational Speech | English | Egypt | Low background noise | 250 | 2 | Available on request | 5,619 | 8 | alaw or wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words Average length of calls: 10-15 mins |
English (Arabic - Levant/Egypt) conversational telephony | |
Dataset Text | English (Australia) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 157,000 words | Add Dataset to Quote | eng_AUS_PHON | Appen Global | Pronunciation Dictionary | English | Australia | N/A | N/A | N/A | N/A | 157,000 | N/A | text | English (Australia) Pronunciation Dictionary | ||
Dataset Audio | English (Australia) scripted telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Mobile phone and landline | Unit: 92 hours | Add Dataset to Quote | AUS_ASR001 | Appen Global | Scripted Speech | English | Australia | Low background noise (home/office) | 500 | 1 | 82,500 | 35,137 | 8 | alaw | Fully transcribed to SpeechDAT type conventions Dataset is accompanied by a pronunciation lexicon containing all transcribed words 162 prompts (read speech) per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items (from a set of 215), phonetically rich sentences and words |
English (Australia) scripted telephony | |
Dataset Audio | English (Australia) scripted telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Mobile phone and landline | Unit: 118 hours | Add Dataset to Quote | AUS_ASR002 | Appen Global | Scripted Speech | English | Australia | Mixed | 1,000 | 1 | 75,000 | 18,952 | 8 | alaw | Fully transcribed to SpeechDAT type conventions Dataset is accompanied by a pronunciation lexicon containing all transcribed words 75 prompts per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words The prompts are a mixture of 'read' and 'elicited' items where 5 prompts per script are 'spontaneous free speech' |
English (Australia) scripted telephony | |
Dataset Text | English (Canada) Part of Speech Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 3,000 words | Add Dataset to Quote | eng_CAN_POS | Appen Global | Part of Speech Dictionary | English | Canada | N/A | N/A | N/A | N/A | 3,000 | N/A | text | English (Canada) Part of Speech Dictionary | ||
Dataset Text | English (Canada) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 50,000 words | Add Dataset to Quote | eng_CAN_PHON | Appen Global | Pronunciation Dictionary | English | Canada | N/A | N/A | N/A | N/A | 50,000 | N/A | text | English (Canada) Pronunciation Dictionary | ||
Dataset Audio | English (Canada) scripted telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Mobile phone and landline | Unit: 144 hours | Add Dataset to Quote | ENC_ASR001 | Appen Global | Scripted Speech | English | Canada | Mixed | 1,000 | 1 | 99,000 | 12,483 | 8 | alaw or wav | Fully transcribed to SALA II/SpeechDAT type conventions Dataset is accompanied by a pronunciation lexicon containing all transcribed words 99 prompts per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words |
English (Canada) scripted telephony | |
Dataset Text | English (Hong Kong) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 18,000 words | Add Dataset to Quote | eng_HKG_PHON | Appen Global | Pronunciation Dictionary | English | Hong Kong | N/A | N/A | N/A | N/A | 18,000 | N/A | text | English (Hong Kong) Pronunciation Dictionary | ||
Dataset Audio | English (India) conversational smartphone | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone | Unit: 143 hours | Add Dataset to Quote | ENI_ASR003 | Appen Global | Conversational Speech | English | India | Mixed (home, car, public place, outdoor) | 272 | 1 | Available on request | Available on request | 48 | wav | Two person conversations covering a broad range of generic topics including clothing, culture, education, finance, food, health, history, hospitality, insurance, media/entertainment, sports, travel/holiday, weather and work. Each speaker participates in up to 12 conversations that are 5-15 minutes long. Pronunciation lexicon not currently available but can be developed upon request |
English (India) conversational smartphone | |
Dataset Audio | English (India) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 67 hours | Add Dataset to Quote | ENI_ASR002 | Appen Global | Conversational Speech | English | India | Low background noise | 540 | 2 | 77,565 | 11,646 | 8 | alaw or wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words 271 telephony conversations are recorded for this project |
English (India) conversational telephony | |
Dataset Text | English (India) Part of Speech Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 13,000 words | Add Dataset to Quote | eng_IND_POS | Appen Global | Part of Speech Dictionary | English | India | N/A | N/A | N/A | N/A | 13,000 | N/A | text | English (India) Part of Speech Dictionary | ||
Dataset Text | English (India) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 60,000 words | Add Dataset to Quote | eng_IND_PHON | Appen Global | Pronunciation Dictionary | English | India | N/A | N/A | N/A | N/A | 60,000 | N/A | text | English (India) Pronunciation Dictionary | ||
Dataset Audio | English (India) scripted telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Mobile phone and landline | Unit: 217 hours | Add Dataset to Quote | ENI_ASR001 | Appen Global | Scripted Speech | English | India | Mixed | 2,358 | 1 | 115,541 | 9,190 | 8 | alaw or wav | Fully transcribed to SpeechDAT type conventions. Dataset is accompanied by a pronunciation lexicon [SAMPA] containing all transcribed words 49 prompts per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words |
English (India) scripted telephony | |
Dataset Text | English (Ireland) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 12,000 words | Add Dataset to Quote | eng_IRL_PHON | Appen Global | Pronunciation Dictionary | English | Ireland | N/A | N/A | N/A | N/A | 12,000 | N/A | text | English (Ireland) Pronunciation Dictionary | ||
Dataset Text | English (NZ) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 28,000 words | Add Dataset to Quote | eng_NZL_PHON | Appen Global | Pronunciation Dictionary | English | NZ | N/A | N/A | N/A | N/A | 28,000 | N/A | text | English (NZ) Pronunciation Dictionary | ||
Dataset Audio | English (Philippines) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 53 hours | Add Dataset to Quote | ENF_ASR001 | Appen Global | Conversational Speech | English | Philippines | Low background noise | 450 | 2 | 41,602 | 7,272 | 8 | alaw or wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. Average length of calls: 10-15 mins |
English (Philippines) conversational telephony | |
Dataset Text | English (Philippines) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 7,000 words | Add Dataset to Quote | eng_PHL_PHON | Appen Global | Pronunciation Dictionary | English | Philippines | N/A | N/A | N/A | N/A | 7,000 | N/A | text | English (Philippines) Pronunciation Dictionary | ||
Dataset Text | English (United Arab Emirates (UAE)) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 5,000 words | Add Dataset to Quote | eng_ARE_PHON | Appen Global | Pronunciation Dictionary | English | United Arab Emirates (UAE) | N/A | N/A | N/A | N/A | 5,000 | N/A | text | English (United Arab Emirates (UAE)) Pronunciation Dictionary | ||
Dataset Audio | English (United Arab Emirates (UAE)) scripted telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Mobile phone and landline | Unit: 33 hours | Add Dataset to Quote | OrienTel English as spoken in the United Arab Emirates | Nuance | Scripted Speech | English | United Arab Emirates (UAE) | Low background noise | 500 | 1 | 25,500 | Available on request | 8 | alaw | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 51 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words and spontaneous items for control |
English (United Arab Emirates (UAE)) scripted telephony | |
Dataset Audio | English (United Kingdom) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 150 hours | Add Dataset to Quote | UKE_ASR001 | Appen Global | Conversational Speech | English | United Kingdom | Low background noise | 1,175 | 2 | 298,562 | 24,193 | 8 | wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words This version contains full 15-minute calls - there is a reduced version with 5 min calls named UKE_ASR001B. |
English (United Kingdom) conversational telephony | |
Dataset Audio | English (United Kingdom) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 50 hours | Add Dataset to Quote | UKE_ASR001B | Appen Global | Conversational Speech | English | United Kingdom | Low background noise | 1,150 | 2 | Available on request | 13,192 | 8 | wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words This version contains full 5-minute calls - there is an expanded version with 15 min calls named UKE_ASR001. |
English (United Kingdom) conversational telephony | |
Dataset Text | English (United Kingdom) Part of Speech Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 155,000 words | Add Dataset to Quote | eng_GBR_POS | Appen Global | Part of Speech Dictionary | English | United Kingdom | N/A | N/A | N/A | N/A | 155,000 | N/A | text | English (United Kingdom) Part of Speech Dictionary | ||
Dataset Text | English (United Kingdom) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 195,000 words | Add Dataset to Quote | eng_GBR_PHON | Appen Global | Pronunciation Dictionary | English | United Kingdom | N/A | N/A | N/A | N/A | 195,000 | N/A | text | English (United Kingdom) Pronunciation Dictionary | ||
Dataset Audio | English (United Kingdom) scripted microphone - single female | Common Use Cases: TTS | Recording Device: Headset microphone | Unit: 11 hours | Add Dataset to Quote | TC-STAR female baseline voice Laura | Nuance | Scripted Speech | English | United Kingdom | Low background noise (studio) | 1 | 1 | Available on request | Available on request | 96 | Available on request | Dataset includes manual orthographic transcription, automatic segmentation into phonemes, automatic generation of pitch marks (where a certain percentage of phonetic segments and pitch marks has been manually checked) Dataset is accompanied by a pronunciation lexicon with POS, lemma and phonetic transcription |
English (United Kingdom) scripted microphone - single female | |
Dataset Audio | English (United Kingdom) scripted microphone - single male | Common Use Cases: TTS | Recording Device: Headset microphone | Unit: 7 hours | Add Dataset to Quote | TC-STAR male baseline voice Ian | Nuance | Scripted Speech | English | United Kingdom | Low background noise (studio) | 1 | 1 | Available on request | Available on request | 96 | Available on request | Dataset includes manual orthographic transcription, automatic segmentation into phonemes, automatic generation of pitch marks (where a certain percentage of phonetic segments and pitch marks has been manually checked) Dataset is accompanied by a pronunciation lexicon with POS, lemma and phonetic transcription |
English (United Kingdom) scripted microphone - single male | |
Dataset Audio | English (United States - African American) conversational smartphone | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone | Unit: 50 hours | Add Dataset to Quote | USE_ASR004 | Appen Global | Conversational Speech | English | United States | Mixed (home, car, public place, outdoor) | 94 | 1 | Available on request | Available on request | 48 | wav | Two person conversations recorded on a smartphone covering a broad range of generic topics including clothing, culture, education, finance, food, health, history, hospitality, insurance, media/entertainment, sports, travel/holiday, weather and work. Each speaker participates in up to 12 conversations that are 5-15 minutes long. Pronunciation lexicon not currently available but can be developed upon request |
English (United States - African American) conversational smartphone | |
Dataset Text | English (United States) Conversation SMS - Threaded | Common Use Cases: Virtual Assistant, Chatbot | Recording Device: N/A | Unit: 952,677 messages | Add Dataset to Quote | ENG_SMS001 | Appen Global | SMS text messages | English | United States | N/A | Available on request | N/A | 952,677 | Available on request | N/A | text | This dataset contains threaded SMS conversations between 2 participants, using iMessage and Android SMS. All messages are in US English. Contains timestamps and text message exchanges, with metadata including gender, age range and relationship between participants. Consent is obtained from all participants and the dataset does not contain PII. | English (United States) Conversation SMS - Threaded | |
Dataset Text | English (United States) Conversation SMS - Threaded | Common Use Cases: Virtual Assistant, Chatbot | Recording Device: N/A | Unit: 106,649 messages | Add Dataset to Quote | ENG_SMS001A | Appen Global | SMS text messages | English | United States | N/A | 390 | N/A | 106,649 | Available on request | N/A | text | This is a subset of ENG_SMS001. This dataset contains threaded SMS conversations between 2 participants, using iMessage and Android SMS. All messages are in US English. Contains timestamps and text message exchanges, with metadata including gender, age range and relationship between participants. Consent is obtained from all participants and the dataset does not contain PII. | English (United States) Conversation SMS - Threaded | |
Dataset Text | English (United States) Conversation WhatsApp - Threaded | Common Use Cases: Virtual Assistant, Chatbot | Recording Device: N/A | Unit: 351,826 messages | Add Dataset to Quote | ENG_SMS002 | Appen Global | WhatsApp text messages | English | United States | N/A | Available on request | N/A | 351,826 | Available on request | N/A | text | This dataset contains threaded text message conversations between 2 participants, using WhatsApp. All messages are in US English. Contains timestamps and text message exchanges, with metadata including gender, age range and relationship between participants. Consent is obtained from all participants and the dataset does not contain PII. | English (United States) Conversation WhatsApp - Threaded | |
Dataset Audio | English (United States) conversational smartphone | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone | Unit: 1,000 hours | Add Dataset to Quote | USE_ASR003 | Appen Global | Conversational Speech | English | United States | Low background noise | 1,856 | 1 | 500,000 | 52,586 | 16 | wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. Conversations cover a wide variety of topics including: study/major/work, hometown, living arrangements, weather and seasons, punctuality, TV programs/film |
English (United States) conversational smartphone | |
Dataset Text | English (United States) Medical Terms Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 8,000 words | Add Dataset to Quote | eng_USA_Med_PHON | Appen Global | Pronunciation Dictionary | English | United States | N/A | N/A | N/A | N/A | 8,000 | N/A | text | Pronunciation dictionary of medical terms with their associated transcriptions and domain tagging. Data is comprised of medical words extracted from PubMed abstracts, as well as pharmaceutical drug names collected by Appen through web-spidering. Pronunciations were processed by native speakers of US English and domain tagging done by a team of US English native speakers with medical transcription or other medical qualifications and experience. Domains include: Anatomy, Biochem/biological, Condition, General, Organisation, Person, Pharmaceutical, Procedure. |
English (United States) Medical Terms Pronunciation Dictionary | |
Dataset Text | English (United States) Part of Speech Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 263,000 words | Add Dataset to Quote | eng_USA_POS | Appen Global | Part of Speech Dictionary | English | United States | N/A | N/A | N/A | N/A | 263,000 | N/A | text | English (United States) Part of Speech Dictionary | ||
Dataset Text | English (United States) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 358,000 words | Add Dataset to Quote | eng_USA_PHON | Appen Global | Pronunciation Dictionary | English | United States | N/A | N/A | N/A | N/A | 358,000 | N/A | text | English (United States) Pronunciation Dictionary | ||
Dataset Audio | English (United States) scripted microphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Microphone | Unit: 53 hours | Add Dataset to Quote | Speecon English (USA) database | Nuance | Scripted Speech | English | United States | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170,000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences and free and elicited spontaneous responses for adult speakers |
English (United States) scripted microphone | |
Dataset Audio | English (United States) scripted microphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Microphone | Unit: 62 hours | Add Dataset to Quote | USE_ASR001 | Appen Global | Scripted Speech | English | United States | Low background noise (studio) | 200 | 2 | 80,000 | 18,318 | 48 | raw PCM or wav PCM | Dataset is fully transcribed and timestamped Dataset is formatted according to SALA II/SpeechDAT style conventions Dataset is accompanied by a pronunciation lexicon containing all transcribed words Each speaker read 400 prompts including digits, natural numbers, personal and city names, telephone numbers, generic command and control items, phonetically rich sentences and words |
English (United States) scripted microphone | |
Dataset Audio | English (United States) Ultra High-Volume labeled speech | Common Use Cases: ASR, Conversational AI, Speech Analytics, Automatic Captioning, In Car HMI & Entertainment, Virtual Assistant | Recording Device: N/A | Unit: 1,196 hours | Add Dataset to Quote | USE_UHV001 | Appen Global | Broadcast Speech | English | United States | Low background noise | 20,472 | 1 | 423,371 | Available on request | 16 | wav | Customised packaging available. High quality labelled speech datasets of web-sourced licensable broadcast audio data, curated to ensure representative speaker demographic distributions, and filtered through human quality checks. Utterance-level labelling includes: speech transcription, accent identification, speaker identification, verification, gender and age-group detection, domain classification. Domains include: Agriculture & plants, Animals & Pets, Art & Culture, Beauty & Fashion, Career, Clothing, Education, Entertainment, Family & Relationships, Finance & Insurance, Food, Health, History, Hospitality, Legal, Leisure, News & Politics, Religion & Spirituality, Retail, Science & Technology, Social Networks, Sports, Telecom, Travel, Weather, Others |
English (United States) Ultra High-Volume labeled speech | |
Dataset Text | English NER news text | Common Use Cases: NER, Content Classification, Search Engines | Recording Device: N/A | Unit: 22,768 sentences | Add Dataset to Quote | ENG_NER001 | Appen Global | News NER | English | N/A | N/A | N/A | N/A | 22,768 | Available on request | N/A | text | English NER news text | ||
Dataset Text | Farsi/Persian NER news text | Common Use Cases: NER, Content Classification, Search Engines | Recording Device: N/A | Unit: 19,584 sentences | Add Dataset to Quote | FAR_NER001 | Appen Global | News NER | Iranian Persian | Iran | N/A | N/A | N/A | 19,584 | Available on request | N/A | text | Farsi/Persian NER news text | ||
Dataset Text | Finnish (Finland) Part of Speech Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 10,000 words | Add Dataset to Quote | fin_FIN_POS | Appen Global | Part of Speech Dictionary | Finnish | Finland | N/A | N/A | N/A | N/A | 10,000 | N/A | text | Finnish (Finland) Part of Speech Dictionary | ||
Dataset Image | Finnish (Finland) printed text OCR | Common Use Cases: Document Processing, Document Search | Recording Device: Camera | Unit: 7,293 images | Add Dataset to Quote | IMG_OCR_FIN_CN | Appen China | Document OCR | Finnish | Finland | Mixed lighting conditions | 4 | N/A | N/A | N/A | N/A | jpg | Images containing text, such as billboards / outer packaging / signage / magazines / menus, etc. | Finnish (Finland) printed text OCR | |
Dataset Text | Finnish (Finland) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 86,000 words | Add Dataset to Quote | fin_FIN_PHON | Appen Global | Pronunciation Dictionary | Finnish | Finland | N/A | N/A | N/A | N/A | 86,000 | N/A | text | Finnish (Finland) Pronunciation Dictionary | ||
Dataset Text | French (Algeria) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 4,000 words | Add Dataset to Quote | fra_DZA_PHON | Appen Global | Pronunciation Dictionary | French | Algeria | N/A | N/A | N/A | N/A | 4,000 | N/A | text | Arabic script | French (Algeria) Pronunciation Dictionary | |
Dataset Audio | French (Belgium) scripted telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Landline only | Unit: 76 hours | Add Dataset to Quote | Belgian French SpeechDat(II) FDB-1000 (FIXED1BF) | Nuance | Scripted Speech | French | Belgium | Low background noise | 1,000 | 1 | 53,000 | Available on request | 8 | alaw | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report 53 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words and spontaneous items for control |
French (Belgium) scripted telephony | |
Dataset Audio | French (Canada) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 9 hours | Add Dataset to Quote | FRC_ASR003 | Appen Global | Conversational Speech | French | Canada | Mixed | 68 | 2 | Available on request | 6,022 | 8 | alaw | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. Average length of calls: 10-15 mins. For the majority of calls, only one half of the conversation was collected and transcribed; however, for a smaller number of calls, both speakers (in-line/out-line) were collected and transcribed |
French (Canada) conversational telephony | |
Dataset Text | French (Canada) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 67,000 words | Add Dataset to Quote | fra_CAN_PHON | Appen Global | Pronunciation Dictionary | French | Canada | N/A | N/A | N/A | N/A | 67,000 | N/A | text | French (Canada) Pronunciation Dictionary | ||
Dataset Audio | French (Canada) scripted microphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Microphone | Unit: 46 hours | Add Dataset to Quote | FRC_ASR002 | Appen Global | Scripted Speech | French | Canada | Low background noise (home/office) | 150 | 1 | 22,500 | 10,755 | 16 | wav | Dataset is fully transcribed and timestamped Dataset is accompanied by a pronunciation lexicon containing all transcribed words 150 prompts per speaker including digits, digit strings (randomly generated), addresses and phonetically rich sentences and words |
French (Canada) scripted microphone | |
Dataset Audio | French (Canada) scripted telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Mobile phone | Unit: 131 hours | Add Dataset to Quote | FRC_ASR001 | Appen Global | Scripted Speech | French | Canada | Mixed | 1,000 | 1 | 100,000 | 11,697 | 8 | alaw | Fully transcribed to SpeechDAT type conventions Dataset is accompanied by a pronunciation lexicon [SAMPA] containing all transcribed words 100 prompts per speaker including digits, natural numbers, letter strings, personal, place, and business names, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words |
French (Canada) scripted telephony | |
Dataset Audio | French (France) conversational smartphone | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone | Unit: 159 hours | Add Dataset to Quote | FRF_ASR004 | Appen Global | Conversational Speech | French | France | Mixed (home, car, public place, outdoor) | 298 | 1 | Available on request | Available on request | 48 | wav | Two person conversations covering a broad range of generic topics including clothing, culture, education, finance, food, health, history, hospitality, insurance, media/entertainment, sports, travel/holiday, weather and work. Each speaker participates in up to 12 conversations that are 5-15 minutes long. Pronunciation lexicon not currently available but can be developed upon request |
French (France) conversational smartphone | |
Dataset Audio | French (France) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 25 hours | Add Dataset to Quote | FRF_ASR001 | Appen Global | Conversational Speech | French | France | Low background noise | 563 | 2 | Available on request | 11,922 | 8 | alaw | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. For the majority of calls, both speakers (in-line/out-line) were collected and transcribed; however, for a smaller number of calls, only one half of the conversation was collected and transcribed |
French (France) conversational telephony | |
Dataset Audio | French (France) In-Car | Common Use Cases: ASR, Virtual Assistant, In Car HMI & Entertainment | Recording Device: Microphone and mobile phone | Unit: 113 hours | Add Dataset to Quote | French SpeechDat-Car | Nuance | Scripted Speech | French | France | Mixed (in-car) | 300 | 5 | 37,500 | Available on request | 16 and 8 | Available on request | Dataset is fully transcribed and is accompanied by a pronunciation lexicon and validation report. Approximately 125 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names (some spontaneous), generic command and control items, phonetically rich words and sentences and prompts for spontaneous speech. 113.7 hours. |
French (France) In-Car | |
Dataset Text | French (France) Part of Speech Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 95,000 words | Add Dataset to Quote | fra_FRA_POS | Appen Global | Part of Speech Dictionary | French | France | N/A | N/A | N/A | N/A | 95,000 | N/A | text | French (France) Part of Speech Dictionary | ||
Dataset Text | French (France) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 112,000 words | Add Dataset to Quote | fra_FRA_PHON | Appen Global | Pronunciation Dictionary | French | France | N/A | N/A | N/A | N/A | 112,000 | N/A | text | French (France) Pronunciation Dictionary | ||
Dataset Audio | French (France) scripted microphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Microphone | Unit: 26 hours | Add Dataset to Quote | FRF_ASR003 | Global Phone | Scripted Speech | French | France | Low background noise (home/office) | 98 | 1 | 10,273 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). |
French (France) scripted microphone | |
Dataset Audio | French (France) scripted telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Landline only | Unit: 41 hours | Add Dataset to Quote | French SpeechDat(II) FDB-1000 | Nuance | Scripted Speech | French | France | Low background noise (home/office) | 1,017 | 1 | 48,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 48 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words. |
French (France) scripted telephony | |
Dataset Audio | French (France) scripted telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Landline only | Unit: 305 hours | Add Dataset to Quote | French SpeechDat(II) FDB-5000 | Nuance | Scripted Speech | French | France | Low background noise | 5,040 | 1 | 237,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 47 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words. |
French (France) scripted telephony | |
Dataset Audio | French (Luxembourg) telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Landline only | Unit: 45 hours | Add Dataset to Quote | Luxembourgish French SpeechDat(II) FDB-500 (FIXED1LF) | Nuance | Scripted Speech | French | Luxembourg | Low background noise | 614 | 1 | 32,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 53 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words. |
French (Luxembourg) telephony | |
Dataset Text | Georgian (Georgia) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 67,000 words | Add Dataset to Quote | kat_GEO_PHON | Appen Global | Pronunciation Dictionary | Georgian | Georgia | N/A | N/A | N/A | N/A | 67,000 | N/A | text | Georgian (Georgia) Pronunciation Dictionary | ||
Dataset Audio | German (Germany) conversational smartphone | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone | Unit: 104 hours | Add Dataset to Quote | DEU_ASR004 | Appen Global | Conversational Speech | German | Germany | Mixed (home, car, public place, outdoor) | 198 | 1 | Available on request | Available on request | 48 | wav | Two person conversations covering a broad range of generic topics including clothing, culture, education, finance, food, health, history, hospitality, insurance, media/entertainment, sports, travel/holiday, weather and work. Each speaker participates in up to 12 conversations that are 5-15 minutes long. Pronunciation lexicon not currently available but can be developed upon request |
German (Germany) conversational smartphone | |
Dataset Text | German (Germany) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 146,000 words | Add Dataset to Quote | deu_DEU_PHON | Appen Global | Pronunciation Dictionary | German | Germany | N/A | N/A | N/A | N/A | 146,000 | N/A | text | German (Germany) Pronunciation Dictionary | ||
Dataset Audio | German (Germany) scripted microphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Microphone | Unit: 16 hours | Add Dataset to Quote | DEU_ASR001 | Appen Global | Scripted Speech | German | Germany | Low background noise (studio) | 127 | 2 | 12,700 | 6,826 | 48 | raw PCM | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. Each speaker read 100 prompts including digits, natural numbers, personal and city names, telephone numbers, generic command and control items, phonetically rich sentences and words. |
German (Germany) scripted microphone | |
Dataset Audio | German (Germany) scripted microphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Microphone | Unit: 25 hours | Add Dataset to Quote | DEU_ASR003 | Global Phone | Scripted Speech | German | Germany | Low background noise (home/office) | 77 | 1 | 10,085 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). |
German (Germany) scripted microphone | |
Dataset Audio | German (Germany) telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Landline only | Unit: 31 hours | Add Dataset to Quote | German SpeechDat (II) FDB-1000 | Nuance | Scripted Speech | German | Germany | Low background noise (home/office) | 988 | 1 | 43,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 44 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words. |
German (Germany) telephony | |
Dataset Audio | German (Germany) telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Landline only | Unit: 268 hours | Add Dataset to Quote | German SpeechDat(II) FDB-4000 | Nuance | Scripted Speech | German | Germany | Low background noise (home/office) | 4,000 | 1 | 160,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 40 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words. |
German (Germany) telephony | |
Dataset Audio | German (Luxembourg) telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Landline only | Unit: 33 hours | Add Dataset to Quote | Luxembourgish German SpeechDat(II) FDB-500 (FIXED1LG) | Nuance | Scripted Speech | German | Luxembourg | Low background noise | 500 | 1 | 26,500 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 53 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words. |
German (Luxembourg) telephony | |
Dataset Text | German (Switzerland) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 27,000 words | Add Dataset to Quote | deu_CHE_PHON | Appen Global | Pronunciation Dictionary | German | Switzerland | N/A | N/A | N/A | N/A | 27,000 | N/A | text | German (Switzerland) Pronunciation Dictionary | ||
Dataset Audio | German (Switzerland) scripted microphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Microphone | Unit: 53 hours | Add Dataset to Quote | Speecon German (Switzerland) database | Nuance | Scripted Speech | German | Switzerland | Mixed (office, entertainment, car, public place) | 600 (550 adult speakers and 50 child speakers) | 4 | 170,000 | Available on request | 16 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 290 prompts per adult speaker and 210 prompts per child speaker including digits, natural numbers, letter strings, personal, place and business names, application words for adult speakers, command (toy, phone and general) for child speakers, phonetically rich words and sentences and free and elicited spontaneous responses for adult speakers. |
German (Switzerland) scripted microphone | |
Dataset Audio | German (Turkey) telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Mobile phone and landline | Unit: 31 hours | Add Dataset to Quote | OrienTel German Spoken by Turkish | Nuance | Scripted Speech | German | Turkey | Low background noise | 300 | 1 | 15,600 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 52 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words. |
German (Turkey) telephony | |
Dataset Text | Greek (Greece) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 5,000 words | Add Dataset to Quote | ell_GRC_PHON | Appen Global | Pronunciation Dictionary | Greek | Greece | N/A | N/A | N/A | N/A | 5,000 | N/A | text | Greek (Greece) Pronunciation Dictionary | ||
Dataset Audio | Greek (Greece) scripted smartphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Mobile phone | Unit: 191 hours | Add Dataset to Quote | GRE_ASR001_CN | Appen China | Scripted Speech | Greek | Greece | Low background noise (home/office) | 287 | 1 | 54,113 | 68,271 | 16 | wav | Dataset contains audio with corresponding text prompts | Greek (Greece) scripted smartphone | |
Dataset Text | Guarani (Paraguay) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 36,000 words | Add Dataset to Quote | grn_PRY_PHON | Appen Global | Pronunciation Dictionary | Guarani | Paraguay | N/A | N/A | N/A | N/A | 36,000 | N/A | text | Guarani (Paraguay) Pronunciation Dictionary | ||
Dataset Text | Haitian Creole (Haiti) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 18,000 words | Add Dataset to Quote | hat_HTI_PHON | Appen Global | Pronunciation Dictionary | Haitian Creole | Haiti | N/A | N/A | N/A | N/A | 18,000 | N/A | text | Haitian Creole (Haiti) Pronunciation Dictionary | ||
Dataset Image | Handwritten text document OCR | Common Use Cases: Document Processing, Document Search | Recording Device: Camera, scan | Unit: 964 images | Add Dataset to Quote | IMG_OCR_Handwritten | Appen Global | Document OCR | N/A | N/A | Mixed lighting conditions | N/A | N/A | N/A | N/A | N/A | png | This is a subset of IMG_OCR_B2C_Other. Scans and photographs of handwritten forms and handwritten documents. 6 Languages collected in 23+ locales: 8% Arabic, 41% English, 7% French, 2% German, 20% Russian, 22% Spanish | Handwritten text document OCR | |
Dataset Audio | Hausa (Nigeria) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone | Unit: 33 hours | Add Dataset to Quote | HAU_ASR002 | Appen Global | Conversational Speech | Hausa | Nigeria | Low background noise | 200 | 2 | Available on request | 7,949 | 8 | alaw | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 200 telephony conversations are recorded for this project - 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers. |
Hausa (Nigeria) conversational telephony | |
Dataset Text | Hausa (Nigeria) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 11,000 words | Add Dataset to Quote | hau_NGA_PHON | Appen Global | Pronunciation Dictionary | Hausa | Nigeria | N/A | N/A | N/A | N/A | 11,000 | N/A | text | Hausa (Nigeria) Pronunciation Dictionary | ||
Dataset Audio | Hausa scripted microphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Microphone | Unit: 20 hours | Add Dataset to Quote | HAU_ASR001 | Global Phone | Scripted Speech | Hausa | Cameroon | Low background noise (home/office) | 103 | 1 | 7,895 | Available on request | 16 | wav | Dataset is fully transcribed and the transcription is available both in original script and in Romanized form. Each speaker reads a number of phonetically rich sentences selected from national newspaper articles available from the web to cover a wide domain with large vocabulary. Developed in collaboration with the Karlsruhe Institute of Technology (KIT). |
Hausa scripted microphone | |
Dataset Audio | Hebrew (Israel) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 34 hours | Add Dataset to Quote | HEB_ASR001 | Appen Global | Conversational Speech | Hebrew | Israel | Low background noise | 200 | 2 | Available on request | 19,250 | 8 | alaw or wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. 200 telephony conversations are recorded for this project - 100 speakers make 2 calls each (1 from landline, 1 from mobile) to a pool of 100 call receivers. 50% landline, 50% mobile. Conversations cover a range of topics including friends, family and studies. |
Hebrew (Israel) conversational telephony | |
Dataset Text | Hebrew (Israel) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 31,000 words | Add Dataset to Quote | heb_ISR_PHON | Appen Global | Pronunciation Dictionary | Hebrew | Israel | N/A | N/A | N/A | N/A | 31,000 | N/A | text | Hebrew (Israel) Pronunciation Dictionary | ||
Dataset Audio | Hindi (India) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics, TTS | Recording Device: Mobile phone and landline | Unit: 32 hours | Add Dataset to Quote | HIN_ASR002 | Appen Global | Conversational Speech | Hindi | India | Mixed | 996 | 2 | Available on request | 12,266 | 8 | wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. For the majority of calls, both speakers (in-line/out-line) were collected and transcribed; however, for a smaller number of calls, only one half of the conversation was collected and transcribed. 29% landline, 71% mobile. |
Hindi (India) conversational telephony | |
Dataset Text | Hindi (India) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 35,000 words | Add Dataset to Quote | hin_IND_PHON | Appen Global | Pronunciation Dictionary | Hindi | India | N/A | N/A | N/A | N/A | 35,000 | N/A | text | Hindi (India) Pronunciation Dictionary | ||
Dataset Audio | Hindi (India) scripted telephony | Common Use Cases: ASR, Virtual Assistant, TTS | Recording Device: Mobile phone | Unit: 224 hours | Add Dataset to Quote | HIN_ASR001 | Appen Global | Scripted Speech | Hindi | India | Low background noise | 1,920 | 1 | 96,000 | 9,853 | 8 | alaw or wav | Fully transcribed to SpeechDAT type conventions. Dataset is accompanied by a pronunciation lexicon [SAMPA] containing all transcribed words. 50 prompts per speaker including digits, natural numbers, personal, business and place names, web addresses, confirmation items (yes, no + fuzzy), generic command and control items, phonetically rich sentences and words. |
Hindi (India) scripted telephony | |
Dataset Video | Human body movement | Common Use Cases: Fitness Applications, Action Classification, Gesture Recognition | Recording Device: Mobile phone | Unit: 2,000 videos | Add Dataset to Quote | VED_HUMAN_BODY_CN | Appen China | Human Body | N/A | China | Mixed background and lighting conditions | 1,000 | N/A | N/A | N/A | N/A | mp4 | Video clips are approximately 10-20 seconds long | Human body movement | |
Dataset Text | Hungarian (Hungary) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 500 words | Add Dataset to Quote | hun_HUN_PHON | Appen Global | Pronunciation Dictionary | Hungarian | Hungary | N/A | N/A | N/A | N/A | 500 | N/A | text | Hungarian (Hungary) Pronunciation Dictionary | ||
Dataset Audio | Hungarian (Hungary) scripted smartphone | Common Use Cases: ASR, Virtual Assistant, Chatbot | Recording Device: Mobile phone | Unit: 286 hours | Add Dataset to Quote | HUN_ASR001_CN | Appen China | Scripted Speech | Hungarian | Hungary | Low background noise (home/office) | 254 | 1 | 94,031 | 201,921 | 16 | wav | Dataset contains audio with corresponding text prompts | Hungarian (Hungary) scripted smartphone | |
Dataset Audio | Hungarian (Hungary) scripted telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Landline only | Unit: 65 hours | Add Dataset to Quote | Hungarian SpeechDat(E) | Nuance | Scripted Speech | Hungarian | Hungary | Low background noise | 1,000 | 1 | 48,000 | Available on request | 8 | Available on request | Dataset is fully transcribed to SpeechDAT type conventions and is accompanied by a pronunciation lexicon and validation report. 48 prompts per speaker including digits, natural numbers, letter strings, personal, place and business names, confirmation items (yes, no + fuzzy), generic command and control items and phonetically rich sentences and words. |
Hungarian (Hungary) scripted telephony | |
Dataset Text | Icelandic (Iceland) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 3,000 words | Add Dataset to Quote | isl_ISL_PHON | Appen Global | Pronunciation Dictionary | Icelandic | Iceland | N/A | N/A | N/A | N/A | 3,000 | N/A | text | Icelandic (Iceland) Pronunciation Dictionary | ||
Dataset Text | Igbo (Nigeria) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 32,000 words | Add Dataset to Quote | ibo_NGA_PHON | Appen Global | Pronunciation Dictionary | Igbo | Nigeria | N/A | N/A | N/A | N/A | 32,000 | N/A | text | Igbo (Nigeria) Pronunciation Dictionary | ||
Dataset Text | Indonesian (Indonesia) Part of Speech Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 10,000 words | Add Dataset to Quote | ind_IDN_POS | Appen Global | Part of Speech Dictionary | Indonesian | Indonesia | N/A | N/A | N/A | N/A | 10,000 | N/A | text | Indonesian (Indonesia) Part of Speech Dictionary | ||
Dataset Text | Indonesian (Indonesia) Pronunciation Dictionary | Common Use Cases: ASR, TTS, Language Modelling | Recording Device: N/A | Unit: 95,000 words | Add Dataset to Quote | ind_IDN_PHON | Appen Global | Pronunciation Dictionary | Indonesian | Indonesia | N/A | N/A | N/A | N/A | 95,000 | N/A | text | Indonesian (Indonesia) Pronunciation Dictionary | ||
Dataset Audio | Inner Mongolian (China) Conversational Speech | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone | Unit: 100 hours | Add Dataset to Quote | NMG_ASR001_CN | Appen China | Conversational Speech | Inner Mongolian | China | Low background noise | 200 | 1 | 16 | wav | Audio only; transcription not included. Audio recordings cover the following areas: Xilingol League, Tongliao, Hohhot. Each recording session contains about 30 minutes of free dialogue between 2 people. |
Inner Mongolian (China) Conversational Speech | |||
Dataset Audio | Iranian Persian (Farsi) (Iran) conversational telephony | Common Use Cases: ASR, Conversational AI, Speech Analytics | Recording Device: Mobile phone and landline | Unit: 30 hours | Add Dataset to Quote | FAR_ASR002 | Appen Global | Conversational Speech | Iranian Persian (Farsi) | Iran | Mixed | 1,000 | 2 | Available on request | 12,358 | 8 | wav | Dataset is fully transcribed and timestamped. Dataset is accompanied by a pronunciation lexicon containing all transcribed words. |
Iranian Persian (Farsi) (Iran) conversational telephony | |
Dataset Audio | Iranian Persian (Farsi) (Iran) scripted telephony | Common Use Cases: ASR, Virtual Assistant | Recording Device: Mobile phone and landline | Unit: 85 hours | Add Dataset to Quote | FAR_ASR001 | Appen Global | Scripted Speech | Iranian Persian (Farsi) | Iran | Mixed | 789 | 1 | 38,400 |