Month: October 2022

Misc

BLOOM: a large language model that’s open source and built for the scientific community

Reading Time: < 1 minute

Since the Transformer architecture was invented, plenty of large language models have appeared as Machine Learning has taken off in the last few years: GPT-3, RoBERTa, PaLM, and many more. Most of them were trained by large companies like OpenAI, Facebook, or Google, often on open data sources like Wikipedia. The problem with these models is that they retain the same biases that already exist in that data, since humans are not free of bias. The companies may take precautions when deploying the final models, but the training data were not examined for bias. In addition, because the dominant money-making language of the business world is still English, little effort has gone into including low-resource languages.

BLOOM, the BigScience Large Open-science Open-access Multilingual Language Model, was made by scientists for research purposes. They specifically focused on eliminating bias in the training data and included many more Asian and African languages. These improvements target the research and education sector. So far, its direct usage is language generation, which generally still suffers from repetition, but the encoding itself can be used for summarization and question answering.
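As a rough sketch of how you can try it through the Hugging Face transformers library: the example below assumes the smaller bigscience/bloom-560m checkpoint, since the full bigscience/bloom model needs far more memory than a typical workstation has, and the prompt is just a placeholder.

from transformers import pipeline

# bigscience/bloom-560m is one of the smaller published checkpoints;
# the full bigscience/bloom model requires far more GPU memory.
generator = pipeline("text-generation", model="bigscience/bloom-560m")

# Example prompt only; output may still show the repetition mentioned above.
output = generator("Low-resource languages deserve better tools because", max_new_tokens=40)
print(output[0]["generated_text"])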

Huggingface model card: Link here

Research site: Link here

AI

Whisper – OpenAI’s latest speech transcription package

Reading Time: < 1 minute

Speech transcription is the process of converting speech audio into text. The text becomes searchable, and there is a variety of Natural Language Processing (NLP) tools that can make sense of it. Traditionally this was done by humans. Early technology was less accurate (<70%), so the NLP tools did not work effectively. Machine Learning has made great strides and increased the accuracy to more than 90%. However, this technology is largely inaccessible to an average person or app developer: training your own model requires technical knowledge, and cloud solutions like Google, AWS, or Microsoft Azure are relatively expensive at large volumes.

With Whisper, developers can use their own GPU-capable hardware to produce large amounts of speech transcription. Theoretically, this will enable more exciting solutions that use speech transcription technology. I would personally like to see some competition in the personal assistant field, especially on wearable technology.
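To give a sense of how little code this takes, here is a minimal sketch using the openai-whisper Python package; the model size ("base") and the audio file name are placeholders, so adjust them for your own hardware and data.

import whisper

# Smaller checkpoints ("tiny", "base") fit on modest GPUs; larger ones trade speed for accuracy.
model = whisper.load_model("base")

# "meeting.mp3" is a hypothetical file; any audio format ffmpeg can decode should work.
result = model.transcribe("meeting.mp3")
print(result["text"])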

Here is a tutorial on how to set it up.

https://www.assemblyai.com/blog/how-to-run-openais-whisper-speech-recognition-model/

And the original paper and code if you are interested.

https://openai.com/blog/whisper/