equivalent language data
[ɪˈkwɪvələnt ˈlæŋɡwɪdʒ ˈdeɪtə]
noun; pl: equivalent language data sets / equivalent language data collections
dados linguísticos equivalentes
1. Information or text in one language that corresponds to and conveys the same meaning as information or text in another language, used for comparative linguistic analysis, translation studies, or machine learning training
The researchers used equivalent language data in English and Spanish to train their translation algorithm.
Os pesquisadores utilizaram dados linguísticos equivalentes em inglês e espanhol para treinar seu algoritmo de tradução.
2. Parallel corpora or multilingual datasets where the same content appears in multiple languages side by side for linguistic comparison
Equivalent language data from the European Parliament corpus helped linguists understand regional language variations.
Os dados linguísticos equivalentes do corpus do Parlamento Europeu ajudaram linguistas a entender variações linguísticas regionais.
3. In natural language processing, aligned text or utterances across different languages that represent identical or semantically equivalent information
Machine translation systems require vast amounts of equivalent language data to achieve high accuracy.
Sistemas de tradução automática requerem grandes volumes de dados linguísticos equivalentes para alcançar alta precisão.
This is specialized terminology used primarily in academia, computational linguistics, and the technology sector. It gained prominence with the rise of machine learning and artificial intelligence in the 2010s and 2020s. Brazil and Portugal use the same term in technical contexts, where it is encountered mainly in research institutions, tech companies, and universities rather than in everyday conversation. In the USA, it is standard terminology in NLP research communities and tech industry discussions.