As a data science learner, you will find it helpful to be familiar with the concept of data types in computer science. Just as important is figuring out the nature of the data we need to analyse.
Data can be quantitative or qualitative. Let us examine each type with some examples.
You guessed it, “quantitative” means something related to numbers. For example, the heights of some people in a room, or the number of students in a class.
Quantitative data can be categorized into:
- Continuous: as in the heights example. In computer science, this roughly corresponds to the floating-point data type.
- Discrete: as in the number of students in a class; we can’t have 25.5 students in a class! In computer science, this roughly corresponds to the integer data type. Integers can be negative, and so can discrete quantitative data, but in real-life data the values are often non-negative counts.
Other examples of quantitative data are weight, time, and length. Quantitative data is measured.
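To make the continuous vs. discrete distinction concrete, here is a minimal Python sketch. The sample values and the variable names (`heights_m`, `class_sizes`) are made up for illustration; they are not from any real data set.

```python
from statistics import mean

# Hypothetical sample values, made up for illustration.
heights_m = [1.62, 1.75, 1.80, 1.68]   # continuous: any value within a range
class_sizes = [25, 30, 28]             # discrete: whole counts only

# Continuous measurements are naturally stored as floats.
all_floats = all(isinstance(h, float) for h in heights_m)

# Discrete counts are naturally stored as integers.
all_ints = all(isinstance(n, int) for n in class_sizes)

# Arithmetic is meaningful for both kinds of quantitative data.
average_height = mean(heights_m)
total_students = sum(class_sizes)
```

Both lists support calculations such as sums and averages, which is what makes them quantitative. Note that the average of a discrete variable need not itself be a whole number.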
The word “qualitative” comes from the Latin “qualis”, which means “of what kind”; this type of data is also called “categorical”. For example, in a system that predicts the probability of a tumor being benign or malignant, we ask the patient whether they smoke, and the answer ends up in our data sheet as a Yes/No column. While pre-processing the data to prepare it as input for our machine learning algorithm, we might encode that column as binary (1 vs. 0), but this doesn’t change the fact that the column is qualitative.
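The binary encoding step described above could look like the following sketch. The answers in `smokes` and the `encoding` mapping are hypothetical, chosen only to illustrate the idea.

```python
# Hypothetical answers to "Do you smoke?", made up for illustration.
smokes = ["Yes", "No", "No", "Yes"]

# Map each category to a binary code for a machine learning algorithm.
encoding = {"No": 0, "Yes": 1}
smokes_encoded = [encoding[answer] for answer in smokes]

# The codes are labels, not measurements: decoding recovers the categories.
decoding = {code: label for label, code in encoding.items()}
smokes_decoded = [decoding[code] for code in smokes_encoded]
```

Because the 1s and 0s can be mapped straight back to Yes and No, the encoding changes only the representation of the column, not its qualitative nature.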
Other examples of qualitative data are the models of cars, colors, ethnicity, poll options or gender. Qualitative data is observed.
Don’t be fooled by the numbers! Although some data, such as phone numbers and zip codes, are written with digits, they are considered qualitative, not quantitative! Why? Because it doesn’t make sense to perform calculations such as summing or averaging on them. Still not convinced? Find the maximum phone number among your friends and let me know what it tells you about the world we live in.
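One way to respect this in code is to store such identifiers as strings rather than numbers. The sketch below uses made-up zip codes and, as an extra practical reason not mentioned above, shows that converting them to integers can drop a leading zero.

```python
from collections import Counter

# Hypothetical zip codes, made up for illustration; stored as strings.
zip_codes = ["02139", "10001", "02139", "94105"]

# Converting to an integer silently drops the leading zero.
as_int = int("02139")

# Meaningful operations on qualitative data: counting and finding the mode.
counts = Counter(zip_codes)
most_common_zip, frequency = counts.most_common(1)[0]
```

Counting how often each value occurs is a sensible question to ask of qualitative data, whereas a sum or an average of zip codes is not.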