Today, one measure of a person’s success is how well they communicate and share information with others. That’s where the concept of language comes into the picture. However, there are many languages in the world. Each has its own alphabet and vocabulary, and words arranged in a meaningful way form a sentence. Each language also has its own rules for building these sentences, and this set of rules is known as grammar.
In my previous blog, I explained how to convert speech into text using the Speech Recognition library with the help of the Google speech recognition API. In this blog, we will see how to convert speech into text using Facebook’s Wav2Vec 2.0 model.
Facebook recently introduced and open-sourced Wav2Vec 2.0, its new framework for self-supervised learning of representations from raw audio data. Facebook researchers claim this framework makes it possible to build automatic speech recognition models with just 10 minutes of transcribed speech data.
As everyone knows, Transformers are playing a major role in Natural Language Processing. The latest version of Hugging Face transformers…
spaCy is an open-source, advanced Natural Language Processing (NLP) library in Python. The library was developed by Matthew Honnibal and Ines Montani, the founders of the company Explosion.ai. In my previous article, I explained Natural Language Processing using the NLTK library. spaCy was designed particularly for production use, and it helps to process and understand large volumes of text. It provides a crisp and user-friendly API.
To know more about NLP, I invite you to check out my previous article. In this article, we will see how to use the spaCy library for various NLP-related tasks.
I have worked predominantly in NLP for the last three months. It has been a long time since I worked with image data. Hence, I decided to build a unique image classifier model as part of my personal project and learning.
One thing I am really missing in the current pandemic is traveling. These days I see a lot of travel vlogs and travel pictures on Instagram and wonder when we will go back to the normal world.
This inspired me to create an image classifier model with five classes: Mountain, Beach, Desert, Lake, and Museum. However, I…
In my previous article, I wrote about a content-based recommendation engine using TF-IDF for Goodreads data. In this article, I use the same Goodreads data and build the recommendation engine using word2vec.
Like the previous article, I am going to use the book descriptions to recommend books. The algorithms we use struggle to handle raw text and only understand data in numeric form. To make the text usable, we need to convert it into numeric form. …
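As a rough illustration of turning a description into numbers, the sketch below averages per-word vectors into a single vector per description. The tiny `TOY_VECTORS` table is made up for this example; a trained word2vec model would supply the real vectors. The function names are assumptions and not the article's own code.

```python
# Hypothetical 3-dimensional word vectors, invented for illustration.
# A trained word2vec model would normally provide these.
TOY_VECTORS = {
    "magic": [0.9, 0.1, 0.0],
    "wizard": [0.8, 0.2, 0.0],
    "school": [0.1, 0.9, 0.0],
    "murder": [0.0, 0.1, 0.9],
    "detective": [0.1, 0.0, 0.8],
}


def tokenize(text):
    """Split on whitespace, strip basic punctuation, and lowercase."""
    return [word.strip(".,!?;:").lower() for word in text.split()]


def description_vector(text, vectors=TOY_VECTORS):
    """Average the vectors of the known words in a description."""
    known = [vectors[w] for w in tokenize(text) if w in vectors]
    if not known:
        # No known words: fall back to a zero vector of the right size.
        size = len(next(iter(vectors.values())))
        return [0.0] * size
    # zip(*known) groups the values dimension by dimension.
    return [sum(dim) / len(known) for dim in zip(*known)]


print(description_vector("A wizard goes to magic school."))
```

Words missing from the table are simply skipped here, which is one common way to handle out-of-vocabulary words; other choices are possible.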
Data scientists come from different backgrounds. In today’s agile environment, it is highly essential to respond quickly to customer needs and deliver value. Faster value provides more wins for the customer and hence more wins for the organization.
Information Technology is always under immense pressure to increase agility and speed up delivery of new functionality to the business. A particular point of pressure is the deployment of new or enhanced application code at the frequency and immediacy demanded by typical digital transformation. Under the covers, this problem is not simple, and it is compounded by infrastructure challenges. Challenges like how…
Speech is the most common means of communication, and the majority of the world’s population relies on speech to communicate with one another. A speech recognition system basically translates spoken language into text. There are various real-life examples of speech recognition systems. For example, Apple’s Siri recognizes speech and transcribes it into text.
If we plan to buy a new product, we normally ask our friends, research the product features, compare the product with similar products, read the product reviews on the internet, and then make our decision. How convenient would it be if this whole process were taken care of automatically and the right product recommended efficiently? A recommendation engine, or recommender system, is the answer to this question.
Content-based filtering and collaborative filtering are the two popular kinds of recommendation systems. In this blog, we will see how we can build a simple content-based recommender system using Goodreads.com data.
A content-based recommendation system recommends items…
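To make the idea of a content-based recommender concrete, here is a minimal sketch in plain Python that scores books by the TF-IDF similarity of their descriptions. The three descriptions in `BOOKS` are invented placeholders, not Goodreads data, and the helper names are assumptions for illustration only.

```python
import math
from collections import Counter

# Hypothetical book descriptions standing in for the Goodreads data.
BOOKS = {
    "Book A": "a young wizard learns magic at a school",
    "Book B": "a detective solves a murder in london",
    "Book C": "a wizard fights a dark wizard with magic",
}


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document, using plain TF-IDF."""
    tokenized = {title: text.lower().split() for title, text in docs.items()}
    n_docs = len(tokenized)
    # Document frequency: in how many descriptions each term appears.
    doc_freq = Counter(term for words in tokenized.values() for term in set(words))
    vectors = {}
    for title, words in tokenized.items():
        counts = Counter(words)
        vectors[title] = {
            term: (count / len(words)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        }
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(weight * v.get(term, 0.0) for term, weight in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(title, docs, top_n=1):
    """Rank the other books by how similar their descriptions are to `title`."""
    vectors = tfidf_vectors(docs)
    scores = {
        other: cosine(vectors[title], vec)
        for other, vec in vectors.items()
        if other != title
    }
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


print(recommend("Book A", BOOKS))
```

With these toy descriptions, "Book A" shares the informative words "wizard" and "magic" only with "Book C", so that is the book recommended. Words that appear in every description, such as "a", get a weight of zero and do not influence the ranking.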
Python is the best friend of most data scientists, and libraries make their lives simpler. I came across five cool Python libraries while working on my NLP project. They helped me a lot, and I would like to share them in this article.
An amazing library for converting numbers written as text into int and float values. It is useful for NLP projects. For more details, please check PyPI and this GitHub repo.
!pip install numerizer
# import the numerize function
from numerizer import numerize

# examples
print(numerize('Eight fifty million'))
print(numerize('one two three'))
print(numerize('Three hundred and Forty five'))
print(numerize('Six and one quarter'))
print(numerize('Jack is having fifty million'))
Data science, machine learning and artificial intelligence have been hot domains for a few years now. Many people want to work as data scientists and are putting immense effort into upgrading their skills through university, online courses or self-study. However, working on and solving a business problem in the real world brings a lot of challenges. Non-technical skills are just as important as technical ones when working as a data scientist. In this blog, I share what I have learned from my own experience working as a data scientist.
There are a lot of…