Introduction
In economics, we often deal with facts and figures to understand and solve problems. For example, to understand the changes in food grain production over the years, we need numbers showing the output for each year. These numerical facts are called data.
Data helps us see patterns and provides the evidence needed to find clear solutions to economic problems. A quantity whose value changes over time or between different items (like the production of food grains each year) is called a variable. Each individual value of a variable is an observation.
Example
If we look at the production of food grains in India, the year is one variable (let's call it X) and the amount of production in million tonnes is another variable (Y). The production in 1970-71 was 108 million tonnes, while in 2016-17 it was 272 million tonnes. The years and the production figures are the data that help us analyze this trend.
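As a minimal sketch, the two figures quoted above can be stored as (X, Y) observations and compared. The variable names are illustrative and are not part of the original example.

```python
# Each observation pairs a value of X (the year) with a value of Y
# (food grain production in million tonnes), using the two figures quoted above.
observations = [("1970-71", 108), ("2016-17", 272)]

(first_year, first_output), (last_year, last_output) = observations

# Absolute and percentage change in Y between the two observations.
increase = last_output - first_output
percent_increase = increase / first_output * 100

print(f"{first_year} to {last_year}: +{increase} million tonnes "
      f"({percent_increase:.1f}% increase)")
```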
This chapter explores where this data comes from and the different methods we use to collect it.
What are the Sources of Data?
Statistical data can be gathered from two main sources: primary and secondary.
- Primary Data: This is data collected for the first time by the researcher for a specific purpose. It is based on first-hand information.
Example
If you want to find out which film star is most popular among students at your school, you would have to go and ask the students directly. The information you collect would be primary data because you gathered it yourself.
- Secondary Data: This is data that has already been collected and processed by another person or agency. This information can be found in published sources like government reports, newspapers, websites, or books.
Example
If you publish a report on the film star's popularity and someone else uses your report for their own study, your data becomes secondary data for them. Using secondary data is often cheaper and saves a lot of time.
Note
Data is considered primary for the person or agency that collects it first. It becomes secondary for anyone who uses that same data later.
How do we Collect the Data?
The most common way to collect primary data is by conducting a survey. A survey is a method of gathering information from individuals by asking them questions. Manufacturers use surveys to get feedback on their products, and political parties use them to gauge public opinion about candidates.
Preparation of Instrument
The main tool used in surveys is the questionnaire or interview schedule. This is a list of questions that the respondent (the person being surveyed) answers. A well-designed questionnaire is crucial for collecting accurate data.
Here are some key points to remember when preparing a questionnaire:
- Keep it short: Ask as few questions as possible. A long questionnaire can tire the respondent.
- Make it easy to understand: Avoid using ambiguous or difficult words.
- Arrange questions logically: Questions should be ordered in a way that feels natural to the person answering.
- Move from general to specific: Start with broader questions and then move to more detailed ones. For instance, ask about the regularity of electricity supply before asking if a price increase is justified.
- Be precise and clear: Avoid vague questions. Instead of asking, "What percentage of your income do you spend on clothing in order to look presentable?", a better question is simply, "What percentage of your income do you spend on clothing?"
- Avoid ambiguity: Questions should be framed so they can be answered quickly and correctly. Instead of "Do you spend a lot of money on books in a month?", provide clear options like "Less than Rs 200," "Rs 200-300," etc.
- Do not use double negatives: Questions starting with "Don't you..." or "Wouldn't you..." can lead to biased answers. Ask "Do you think smoking should be prohibited?" instead of "Don't you think smoking should be prohibited?"
- Do not ask leading questions: A leading question suggests a particular answer. For example, "How do you like the flavour of this high-quality tea?" is a poor question because it implies the tea is high-quality. A better question is, "How do you like the flavour of this tea?"
- Do not indicate alternatives: Avoid limiting the respondent's choices in the question itself. Instead of "Would you like to do a job after college or be a housewife?", ask "What would you like to do after college?"
Types of Questions
- Open-ended (Unstructured) Questions: These allow for individualised, detailed responses. However, they can be difficult to interpret and score because answers vary widely. An example is, "What is your view about globalisation?"
- Closed-ended (Structured) Questions: These provide a set of options for the respondent to choose from. They are easier to analyze but may not capture the respondent's true feelings if the right option isn't available. They come in two forms:
  - Two-way Question: This type has only two possible answers, usually 'yes' or 'no'.
  - Multiple Choice Question: This type offers more than two options. It's good practice to include an "Any other (please specify)" option to capture responses the researcher didn't anticipate.
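To illustrate why closed-ended questions are easier to analyze, here is a small sketch that tallies answers to a multiple-choice question. The options, the sample answers, and the `tally` helper are all hypothetical and exist only for illustration.

```python
from collections import Counter

# Hypothetical options for a question on monthly spending on books.
OPTIONS = ["Less than Rs 200", "Rs 200-300", "More than Rs 300"]
OTHER = "Any other (please specify)"

def tally(responses):
    """Count answers per listed option; anything not listed goes under OTHER."""
    counts = Counter({option: 0 for option in OPTIONS + [OTHER]})
    for answer in responses:
        key = answer if answer in OPTIONS else OTHER
        counts[key] += 1
    return dict(counts)

# Made-up answers; the third one does not match any listed option.
responses = ["Rs 200-300", "Less than Rs 200", "I borrow books", "Rs 200-300"]
result = tally(responses)
print(result)
```

Answers that match a listed option can be counted directly, while the free-text answer lands under "Any other" and would still need to be read and interpreted by the researcher.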
Mode of Data Collection
There are three basic ways to conduct a survey and collect data:
Personal Interviews
In this method, the researcher (or a trained investigator) conducts a face-to-face interview with the respondent.
- Advantages:
  - It has the highest response rate.
  - The interviewer can explain questions and clarify any doubts, avoiding misunderstanding.
  - It allows for asking all types of questions, including open-ended ones.
  - The interviewer can observe the respondent's reactions, which can provide extra information.
- Disadvantages:
  - It is the most expensive method as it requires trained interviewers.
  - It is more time-consuming.
  - The presence of the interviewer might influence the respondent, preventing them from saying what they truly think.
Mailing Questionnaire
Here, the questionnaire is sent to individuals by mail (or through online surveys and SMS) with a request to complete and return it.
- Advantages:
  - It is less expensive than personal interviews.
  - It can reach people in remote areas.
  - The interviewer cannot influence the respondent's answers.
  - Respondents can take their time to give thoughtful answers.
- Disadvantages:
  - The response rate is often low, as people may not return the questionnaire.
  - There is no opportunity to clarify questions, which can lead to misunderstanding.
Telephone Interviews
The investigator asks questions over the telephone.
- Advantages:
  - It is cheaper than personal interviews and can be done in a shorter time.
  - The interviewer can clarify questions for the respondent.
  - Respondents may be more willing to answer sensitive questions over the phone than in person.
- Disadvantages:
  - Its use is limited because not everyone owns a telephone.
  - The interviewer cannot watch the respondent's reactions.
Pilot Survey
Before launching a full-scale survey, it is wise to conduct a Pilot Survey, which is a try-out of the questionnaire with a small group. This pre-testing helps to:
- Identify shortcomings in the questions (e.g., if they are confusing).
- Assess the clarity of instructions.
- Check the performance of the investigators (enumerators).
- Estimate the cost and time required for the actual survey.
Census and Sample Surveys
Census or Complete Enumeration
A survey that includes every single element of the group being studied is known as a Census or the Method of Complete Enumeration.
Example
The Census of India, carried out every ten years, is a perfect example. To study the country's population, officials collect information from every single household in both rural and urban India. The last Census was held in 2011. It collects data on population size, literacy, employment, birth and death rates, and more.
Population and Sample
- Population or Universe: In statistics, this refers to the entire group of items or individuals being studied. If your research is about agricultural labourers in a district, then all agricultural labourers in that district make up the population.
- Sample: This is a smaller group or section selected from the population, from which information is obtained. A good sample should be representative of the entire population.
Most surveys are sample surveys because they are more practical than a census.
- Benefits of a Sample Survey:
  - It provides reliable information at a much lower cost and in a shorter time.
  - Since the group is smaller, more detailed and in-depth information can be collected.
  - It requires a smaller team of enumerators, making them easier to train and supervise effectively.
Random Sampling
This is a method where individual units are selected from the population at random, meaning every individual has an equal chance of being chosen. This is also called the lottery method.
Example
Imagine a government wants to study the impact of a petrol price hike on household budgets in a locality of 300 households, using a sample of 30 households. For a random sample, they could write the names of all 300 households on slips of paper, mix them up, and draw 30 slips. This ensures the selection is unbiased. Today, computer programs are often used to generate random samples.
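The lottery draw described above can be sketched in a few lines of Python. The household labels and the fixed seed are assumptions made for illustration; the example itself only specifies 300 households and a sample of 30.

```python
import random

# Hypothetical household labels; only the counts (300 and 30) come from the example.
households = [f"Household {i}" for i in range(1, 301)]

# A fixed seed is an assumption made only so the draw can be reproduced.
rng = random.Random(42)

# sample() draws without replacement, so no household is picked twice
# and every household has the same chance of being selected.
sample = rng.sample(households, k=30)

print(len(sample), "households selected, for example:", sample[:3])
```

Drawing without replacement mirrors pulling slips out of a bowl: once a slip is drawn it cannot be drawn again.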
Non-Random Sampling
In this method, not all units of the population have an equal chance of being selected. The investigator uses judgment, convenience, or a specific quota to choose the sample.
Example
If you need to select 10 out of 100 households in an area, and you choose the ones that are closest to you or ones where you know the residents, you are using non-random sampling. Because the choice depends on the investigator, this method can introduce personal bias.
Sampling and Non-Sampling Errors
When we use a sample to estimate something about a population, errors can occur. These are divided into two types.
Sampling Errors
A sampling error is the difference between the result obtained from a sample (the sample estimate) and the actual value of the entire population (the population parameter).
Example
Suppose the average income of 5 farmers (the entire population) is Rs 600. If we take a sample of two farmers whose incomes are Rs 500 and Rs 600, the sample average is Rs 550. The sampling error is the difference between the true value (Rs 600) and the sample estimate (Rs 550), which is Rs 50.
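The arithmetic in this example can be checked with a short script. The example gives only the population average (Rs 600) and the two sampled incomes, so the other three incomes below are assumed values chosen to make the five incomes average Rs 600.

```python
from statistics import mean

# Assumed incomes (Rs) of the five farmers. Only 500 and 600 appear in the
# example; 550, 650 and 700 are hypothetical values that give a mean of Rs 600.
population = [500, 550, 600, 650, 700]
sample = [500, 600]

population_mean = mean(population)   # the population parameter
sample_mean = mean(sample)           # the sample estimate
sampling_error = population_mean - sample_mean

print(population_mean, sample_mean, sampling_error)
```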
Note
It is possible to reduce sampling error by taking a larger sample. The larger the sample, the closer its results are likely to be to the population's actual values.
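One way to see this effect is to list every possible sample of a given size and average the resulting errors. The sketch below does this for a hypothetical five-farmer population with a mean income of Rs 600. The income figures and the `average_abs_error` helper are assumptions made for illustration.

```python
from itertools import combinations
from statistics import mean

# Hypothetical incomes (Rs) for a five-farmer population with mean Rs 600.
population = [500, 550, 600, 650, 700]
true_mean = mean(population)

def average_abs_error(sample_size):
    """Mean absolute sampling error over every possible sample of this size."""
    errors = [abs(mean(s) - true_mean)
              for s in combinations(population, sample_size)]
    return mean(errors)

for size in (2, 3, 4):
    print(size, round(average_abs_error(size), 2))
```

With these assumed figures, the average error falls from Rs 35 for samples of two farmers to Rs 15 for samples of four.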
Non-Sampling Errors
These errors are more serious than sampling errors because taking a larger sample does not reduce them. They can occur even in a full census, and they are very difficult to minimize.
Types of non-sampling errors include:
- Sampling Bias: This happens when the sampling plan is designed in such a way that some members of the population cannot be included.
- Non-Response Errors: These occur when an interviewer cannot contact a person in the sample, or the person refuses to answer. This can make the sample unrepresentative.
- Errors in Data Acquisition: This type of error comes from recording incorrect responses. This can happen due to carelessness, differences in measurement tools, or mistakes made by the enumerator when writing down the data (e.g., recording 13 instead of 31).
Census of India and NSSO
In India, several national and state-level agencies collect, process, and publish statistical data. Two of the most important are:
- Census of India: Conducted by the Registrar General of India, it is the most complete record of our population. It has been conducted regularly every ten years since 1881. The data is used to understand many economic and social issues in the country.
- National Sample Survey (NSS): Established by the Government of India, the NSS (now known as the National Sample Survey Office or NSSO) conducts regular nationwide surveys on various socio-economic issues. It provides data on literacy, employment, unemployment, consumer expenditure, healthcare, and more. The data is released through reports and a journal called Sarvekshana and is used by the government for planning purposes.
Conclusion
Data is a crucial tool in economics for understanding and analyzing problems. We can collect it ourselves as primary data through surveys, or we can use secondary data that has already been collected by others. The choice of which data to use and how to collect it depends on the goals of the study. Whether using a complete census or a representative sample, it is important to be aware of potential errors to ensure the conclusions drawn are sound.