Empowering change empowering Tech

Data science is a vast topic that cannot be covered in one go, but let’s try to understand it in a simple way.

Every corner of the world today is packed with raw data. When you go shopping, have a medical exam, watch a movie or show, use the Internet, or take an exam, you generate lots and lots of data. But why is this data so important?

Science is the attempt to understand something using scientific tools, and data is a set of qualitative and quantitative variables about a subject. Putting the two definitions together, data science is a field where data is used as raw material and processed with scientific tools to extract an end result. This end result helps increase business value and customer satisfaction.


You see the products of data science every day. They are the result of combining large amounts of unstructured data and using it to solve business and customer-related problems. Some of them are:

  • Digital ads: Two people browsing at the same time can see different ads on their screens. The reason is data science, which recognizes each person’s preferences and displays ads relevant to them.

  • Image and voice recognition: Facebook’s auto-tagging option recognizing faces in photos, or Alexa and Siri recognizing your voice and doing what you told them to do, is data science again.

  • Recommendation systems: When you shop on a website or search for a show in an entertainment app, you get suggestions. These suggestions are created using data science by tracking your past activity and likes.

  • Fraud and risk detection: Many financial institutions use it to track the financial and credit position of customers, so they can decide in time whether to lend to them. This reduces credit risk and bad loans.

  • Search engines: Search engines handle huge amounts of data, and finding what you asked for within a second would be impossible without algorithms to help with this gigantic task.
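To make the recommendation idea concrete, here is a minimal sketch in Python. It assumes a tiny, made-up purchase history (the user names, the items, and the `recommend` function are all hypothetical) and suggests items liked by users whose past activity overlaps with yours. Real recommendation systems are far more sophisticated; this only illustrates the principle of tracking past activity and likes.

```python
from collections import Counter

# Hypothetical purchase history: user -> set of items they liked or bought.
HISTORY = {
    "ana": {"laptop", "mouse", "keyboard"},
    "ben": {"laptop", "mouse", "monitor"},
    "cara": {"novel", "bookmark"},
}

def recommend(user, history, top_n=2):
    """Suggest items liked by users who share at least one item with `user`."""
    seen = history[user]
    scores = Counter()
    for other, items in history.items():
        if other == user:
            continue
        overlap = len(seen & items)  # number of shared items = similarity weight
        if overlap == 0:
            continue  # no shared taste, so ignore this user's items
        for item in items - seen:  # only items the user has not seen yet
            scores[item] += overlap
    # Sort by score (highest first), then by name, for a stable result.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]

suggestions = recommend("ana", HISTORY)
```

Here "ana" shares two items with "ben", so the one item ben has that ana lacks becomes her suggestion, while "cara", who shares nothing with anyone, gets no suggestions at all.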


Data science is a big topic that comprises several stages before one can reach a final conclusion. They are:

  • Obtain data from various sources.

  • Store the data in organized categories.

  • Clean the data for inconsistencies.

  • Explore the data and find trends and patterns in it.

  • Use machine learning algorithms to model the patterns found.

  • Finally, interpret the results of the models and communicate them.
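The stages above can be sketched end to end in a few lines of Python. This is only an illustration under stated assumptions: the records in `RAW` are made up, the function names are hypothetical, the storage stage is skipped because the data lives in memory, and a simple least-squares line stands in for a machine learning model.

```python
from statistics import mean

# Obtain: hypothetical raw records of (x, y); None marks a missing value.
RAW = [(1.0, 2.1), (2.0, 3.9), (None, 5.0), (3.0, 6.2), (4.0, None), (5.0, 9.8)]

def clean(rows):
    """Clean: drop records that contain a missing value."""
    return [(x, y) for x, y in rows if x is not None and y is not None]

def explore(rows):
    """Explore: summarize each column with its mean."""
    xs, ys = zip(*rows)
    return {"mean_x": mean(xs), "mean_y": mean(ys)}

def fit_line(rows):
    """Model: ordinary least-squares fit of y = slope * x + intercept."""
    xs, ys = zip(*rows)
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in rows)
    slope = sxy / sxx
    return slope, my - slope * mx

def report(slope, intercept):
    """Communicate: turn the fitted model into a plain-language sentence."""
    return f"Each extra unit of x adds about {slope:.2f} to y (baseline {intercept:.2f})."

data = clean(RAW)
summary = explore(data)
slope, intercept = fit_line(data)
message = report(slope, intercept)
```

In practice each stage is handled by dedicated tools, but the order of the work stays the same: get the data, clean it, look at it, model it, and explain what the model says.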


Various tools and techniques are used, and an aspiring data scientist should learn all of them.

  • SQL or NoSQL for database management.

  • Hadoop for distributed storage, alongside Apache Flink and Spark for working with large-scale data.

  • Python, R, SAS, Hadoop, Flink, and Spark for data wrangling, coding, and processing.

  • Python libraries, R libraries, statistics, and experimental design to explore the data and draw the necessary inferences.

  • Machine learning, multivariate calculus, and linear algebra to model the data.

  • Communication and presentation skills, along with business acumen, to make the inferences useful for strategic decisions.
