Statistics

Spellbound Centre for Professional Studies, Hyderabad

Faculty, K.Veerendra Patil

CHAPTER-IV
Sampling and Estimation Theory

Meaning of population or universe: In statistics, a population is the whole of the information that comes under the purview of a statistical investigation. A set of units or items under study is known as a population. It is the totality of all the observations of a statistical experiment or enquiry. In other words, an aggregate of objects, animate or inanimate, under study is the population; it is also known as the universe. Examples are the population of the heights of students in a school, or the population of the sums of points obtained in throwing two dice.

Classification of population:
1. Homogeneous and heterogeneous population.
2. Finite and infinite population.
3. Real and hypothetical population.

1. Homogeneous and heterogeneous population: A homogeneous population is one in which all the units are similar in relation to the variables under study, for example, grains in a pot of cooked rice. A heterogeneous population is one in which the units are not similar to one another in relation to the variables under study, for example, fish in the ocean.

2. Finite and infinite population: A population with a finite, countable number of observations or units is known as a finite population. A population whose units cannot be counted is known as an infinite population. The number of schools in a city and the number of students in a college are examples of finite populations, while the stars in the sky and the drops of water in the ocean are so numerous that they are treated as infinite populations.

3. Real and hypothetical population: A population whose units exist physically is known as a real population, for example, the trees in a garden. A population whose units do not exist physically but are imaginary is known as a hypothetical population, for example, the set of outcomes obtained by tossing two dice simultaneously.

Sample: A part of the population selected for study is called a sample. In other words, a sample is a group of individuals or items selected from the population in such a way that the group represents the population. For example, a cook tests a small quantity of rice to see whether it has been cooked; this small quantity is the sample and represents the entire quantity of rice cooked. A well-drawn sample provides valuable information about its parent population and gives a fairly accurate and reliable picture of the total observations under investigation. It is used to measure and estimate the corresponding characteristics of its parent population. The more representative the sample, the more closely it resembles its parent population in every respect except size.

Quantitative Aptitude


Sampling Theory.

Population and sample size: The number of units in a population is known as the population size and is denoted by capital ‘N’. The number of individuals or items included in a sample is called the sample size and is denoted by small ‘n’. Choosing an appropriate sample size is essential for drawing sound conclusions about the population.

Parameter and statistic: Statistical measures such as the mean, median, mode, standard deviation, variance and coefficient of variation can be computed both from population (or universe) data and from sample data.

Parameter: Any statistical measure computed from population data is known as a parameter. Thus the population mean, population median, population variance, population coefficient of variation etc. are all parameters.

Statistic: A statistic is a statistical measure that relates to the sample and is based on sample data. Statistics computed from a sample drawn from the parent population, such as the sample mean and sample variance, play an important role in (i) the theory of estimation and (ii) testing of hypotheses.

Formally, a parameter is any function of the population values, while a statistic is a function of the sample values. Very often the values of the parameters are unknown and are estimated by the corresponding statistics. For example, the sample mean x̄ (or sample standard deviation s) is used as an estimator of the population mean µ (or population standard deviation σ). The usual notation for parameters (population data) and statistics (sample data) is given below.

Notations:

Statistical measure    Population    Sample
Mean                   µ             x̄
Median                 M             m
Standard deviation     σ             s
Variance               σ²            s²
Proportion             P             p
Size                   N             n
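The distinction between a parameter and a statistic can be illustrated with a short Python sketch. The population of heights below is hypothetical, and the variable names are chosen only to mirror the notation above. It is assumed here that s is computed with the divisor n - 1 (Python's `statistics.stdev`), which the text does not specify.

```python
import random
import statistics

# Hypothetical population: heights (cm) of N = 10 students.
population = [150, 152, 155, 158, 160, 162, 165, 168, 170, 180]
N = len(population)

# Parameters: computed from every unit of the population.
mu = statistics.mean(population)        # population mean
sigma = statistics.pstdev(population)   # population standard deviation (divisor N)

# Statistics: computed from a sample of n units only.
rng = random.Random(1)                  # fixed seed, only so the run is repeatable
sample = rng.sample(population, 4)      # simple random sample without replacement
n = len(sample)
x_bar = statistics.mean(sample)         # sample mean, used to estimate mu
s = statistics.stdev(sample)            # sample standard deviation (divisor n - 1, an assumption)

print(f"Population: N = {N}, mu = {mu}, sigma = {sigma:.2f}")
print(f"Sample:     n = {n}, x_bar = {x_bar}, s = {s:.2f}")
```

The parameters stay fixed for a given population, whereas x_bar and s change whenever a different sample is drawn.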

Census survey and sample survey:

Census survey or complete enumeration method: A survey or inquiry in which data are collected by inspecting every unit of the population is known as a population survey or census survey. In other words, it is a study of all the units of the population.

Merits of census method:
1. The data are collected from each and every unit of the population, so complete data about the population are available, which makes drawing conclusions easy and meaningful.
2. The results are more accurate and reliable, because every item of the universe is examined.
3. Intensive study is possible in a census survey.
4. The data collected may be used for various survey analyses and serve as very good secondary data for other users.
5. This method is free from sampling error, although non-sampling errors (for example, errors of measurement or recording) can still occur.

Demerits of census method:
1. It requires a large number of enumerators and is a costly method. In practice, therefore, it is mostly governments that use it, for example for a population census or a production census.
2. It requires more labour, time and energy to complete the survey successfully.
3. It is not possible when the population or universe is infinite.
4. It is not suitable where the nature of the investigation is destructive.

Sample survey: A survey or inquiry in which data are collected by inspecting only the units of a sample is known as a sample survey.

Merits of sample survey:
1. It saves time, because fewer items are collected and processed. When results are urgently required, this method is very helpful.
2. Only selected items are studied, so cost is reduced as well.
3. More reliable results can be obtained because:
   a. With fewer units to handle there is less chance of non-sampling errors, and the sampling error can be estimated and controlled.
   b. Experts can be employed to process and analyse the sample data, giving accurate and reliable results.
4. Sampling is sometimes the only method possible. If the population under study is infinite, sampling is the only method.
5. It is highly suitable where the nature of the investigation is destructive.
6. The organization and administration of a sample survey are easy and convenient.

Demerits of sample survey:
1. If a sample survey is not properly planned and carefully executed, the results obtained may be misleading and inaccurate, and the conclusions illusory.
2. The sample may not be representative when collected by amateurs.
3. Personal bias may spoil the quality of the sample.
4. The choice of sample size is difficult and tricky.
5. Ensuring adequate coverage of the population may require considerable expertise.
6. This method is not free from errors. It may give rise to both sampling and non-sampling errors in the data, which makes analysis difficult.

Meaning of sampling: Sampling is the procedure or process of selecting a sample from a population. More fully, it can be defined as the process of drawing a sample from a population and computing a suitable statistic from that sample in order to estimate the parameter of the parent population and to test the significance of the statistic so computed.

Essentials of good sampling:
1. The sample must be representative, that is, it should reflect the characteristics of the population.
2. An adequate sample size should be determined and collected.
3. Each and every item of the sample should be selected independently.

Sampling design: A systematic procedure or technique for selecting sample units from the population units is called a sampling method or sampling design. Sampling methods can be classified into two categories:

Methods of sampling:
1. Random sampling method (probability sampling)
   a. Simple or unrestricted random sampling.
   b. Restricted random sampling.
      1. Stratified sampling.
      2. Systematic sampling.
      3. Cluster sampling.
      4. Multi-stage sampling.
   c. Discovery or exploratory sampling.
2. Non-random sampling (non-probability sampling)
   a. Judgment, deliberate or purposive sampling.
   b. Quota sampling.
   c. Convenience sampling.
   d. Sequential sampling.
   e. Snowball sampling.

1. Random sampling: Random or probability sampling is the scientific technique of drawing samples from the population according to some laws of chance, in which each unit in the universe or population has a definite, pre-assigned probability of being selected in the sample.

a) Simple random sampling: Simple random sampling is a method in which each item in the universe has an equal chance of being selected. According to Harper, "A random sample is a sample selected in such a way that every item in the population has an equal chance of being included."

Simple random sampling may be without replacement or with replacement. When the item selected in the first draw is not returned to the population before the second draw is made, it is known as simple random sampling without replacement (SRSWOR). When the item selected in the first draw is returned to the population before the second draw is made, it is known as simple random sampling with replacement (SRSWR). The two methods become almost identical when the population is infinite.
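The difference between SRSWOR and SRSWR can be sketched with Python's standard `random` module. The population of 20 numbered units and the sample size of 5 are hypothetical values chosen for illustration.

```python
import random

population = list(range(1, 21))   # hypothetical population: units numbered 1..20
rng = random.Random(42)           # fixed seed, only so the run is repeatable

# SRSWOR: a drawn unit is not returned, so no unit can appear twice.
srswor = rng.sample(population, k=5)

# SRSWR: every draw is made from the full population, so repeats are possible.
srswr = rng.choices(population, k=5)

print("SRSWOR:", srswor)
print("SRSWR: ", srswr)
```

`random.sample` plays the role of drawing slips without putting them back, while `random.choices` corresponds to returning each slip to the drum before the next draw.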

Simple random sampling can be carried out in two ways:
1. Lottery method.
2. Table of random numbers method.

1. Lottery method: Under this method, all units of the population are serially numbered on slips that are similar in all respects. The slips are mixed up in a big drum and a slip is drawn at random. The population unit whose number is written on the drawn slip is included in the sample. The process is repeated until the required sample size is reached.

2. Table of random numbers method: In this method serial numbers are first assigned to the population units, and the sample units are then selected according to an available table of random numbers. The popular tables are Tippett's table, Kendall and Babington Smith's table, Fisher and Yates' table, the Rand Corporation's table etc.

Merits:
1. There is less chance of personal bias.
2. As the size of the sample increases, the law of statistical regularity and the law of inertia of large numbers begin to operate.
3. The theory of probability is applicable only if the sample is randomly collected.
4. This method is economical, as it saves time, money and labour.
5. This method is highly suitable for a homogeneous population.

Demerits:
1. It requires a complete list of the population, but such up-to-date lists are not available in many enquiries.
2. If the size of the sample is small, it will not be representative of the population.
3. In the case of a heterogeneous population this method may not give a representative sample. To overcome this, the sample size has to be increased, which in turn increases the cost of sampling.
4. This method is not suitable for an infinite population.

b) Restricted random sampling: Restricted random sampling is of four types.
1. Stratified sampling.
2. Systematic sampling.
3. Cluster sampling.
4. Multi-stage sampling.

1. Stratified sampling: In stratified random sampling, the population is divided into strata (groups) before the sample is drawn. The strata are so designed that they do not overlap. Elementary units are drawn at random from each stratum, and the units so drawn constitute the sample. Stratified sampling is suitable where the population is heterogeneous but there is homogeneity within each of the strata. Stratified random sampling can be of two types: proportional stratified random sampling, where the size of the sample from each stratum is proportional to the size of the stratum, and disproportional stratified random sampling, where it is not.
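Proportional stratified random sampling can be sketched as follows. The three strata, their sizes (50, 30 and 20) and the total sample size of 10 are hypothetical. The allocation rule n_h = n × N_h / N is the usual way of making each stratum's share of the sample proportional to its size; it is an assumption here, since the text does not give a formula.

```python
import random

# Hypothetical strata: unit labels grouped by stratum (sizes 50, 30 and 20).
strata = {
    "first_year": [f"F{i}" for i in range(50)],
    "second_year": [f"S{i}" for i in range(30)],
    "third_year": [f"T{i}" for i in range(20)],
}
total_sample_size = 10
population_size = sum(len(units) for units in strata.values())

rng = random.Random(7)            # fixed seed, only so the run is repeatable
sample = {}
for name, units in strata.items():
    # Proportional allocation: n_h = n * N_h / N, rounded to a whole unit.
    n_h = round(total_sample_size * len(units) / population_size)
    # Within each stratum, units are drawn by simple random sampling.
    sample[name] = rng.sample(units, n_h)

for name, drawn in sample.items():
    print(name, len(drawn), drawn)
```

With these sizes the allocation is exactly 5, 3 and 2 units. For strata sizes that do not divide evenly, the rounded allocations may not add up to the intended total and would need adjusting.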

Merits:
1. It is more representative.
2. It ensures greater accuracy.
3. It is easy to administer, as the universe is sub-divided.
4. For a non-homogeneous population, it may yield more reliable results.

Demerits:
1. Dividing the population into homogeneous strata requires more money, time and statistical experience.
2. If proper stratification is not done, the sample will be biased. If different strata of the population overlap, the sample will not be representative.

2. Systematic sampling: In this method the elementary units of the population are arranged in order and the sample units are taken at equal and regular intervals. In other words, a sample of suitable size is obtained from the ordered population by taking, say, every tenth unit. One of the first ten units in the ordered arrangement is chosen at random, and the sample is completed by selecting every tenth unit thereafter. If the first unit selected is the 4th, the other units constituting the sample will be the 14th, 24th, 34th, 44th and so on.

Merits:
1. It is a simple and convenient method for a finite population.
2. The time and labour required are reasonable.

Demerits:
1. It may not represent the whole population.
2. There is a chance of personal bias on the part of the investigator.
3. Information on each unit is necessary.

3. Cluster sampling: Cluster sampling involves arranging the population units into internally heterogeneous subgroups or clusters, selecting clusters by simple random sampling, and then examining each and every unit of the selected clusters. The difference between stratified and cluster sampling is that in stratified sampling there is small variation within each group but wide variation between the groups, while in cluster sampling there is considerable variation within each group but the groups are essentially similar to one another. The clusters may or may not have equal numbers of items or observations.

Merits:
1. It is an easy method.
2. It is helpful in large-scale surveys where the preparation of a list is difficult, time-consuming or expensive.
3. It is valuable in underdeveloped countries, where no detailed and accurate sampling frame is available.

Demerits:
1. It is less accurate than other methods.
2. It is not a comprehensive method.
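The systematic and cluster procedures described above can be sketched in Python. The function names `systematic_sample` and `cluster_sample`, the population of 100 ordered units and the four clusters are all hypothetical, introduced only for illustration.

```python
import random

def systematic_sample(units, k, rng):
    """Choose a random start among the first k units, then take every k-th unit."""
    start = rng.randrange(k)              # index 0 .. k-1 of the first selected unit
    return units[start::k]

def cluster_sample(clusters, n_clusters, rng):
    """Choose whole clusters at random and keep every unit of the chosen clusters."""
    chosen = rng.sample(sorted(clusters), n_clusters)
    units = [unit for name in chosen for unit in clusters[name]]
    return chosen, units

rng = random.Random(3)                    # fixed seed, only so the run is repeatable

# Hypothetical ordered population of 100 units, sampling interval k = 10.
population = list(range(1, 101))
print(systematic_sample(population, 10, rng))
print(population[3::10])                  # start fixed at unit 4: 4, 14, 24, 34, ...

# Hypothetical clusters (for example, villages) of unequal size.
clusters = {"A": [1, 2, 3], "B": [4, 5], "C": [6, 7, 8, 9], "D": [10]}
chosen, units = cluster_sample(clusters, 2, rng)
print(chosen, units)
```

In the systematic case only the starting point is random; in the cluster case the randomness lies in which clusters are chosen, after which every unit in them is examined.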

4. Multistage sampling: In this sampling method, elementary units are selected in stages. First, samples of general groups are selected, and then samples of elementary units are selected from among them at successive stages. It is suitable where the population is very large and contains a great number of units.

Merits:
1. It gives a good representative sample.
2. It introduces flexibility into the sampling method.
3. It is helpful in large-scale surveys.

Demerits:
1. It is a difficult and complex method of sampling.
2. It is less accurate than other methods.

c) Discovery or exploratory sampling: Under this method no predetermined plan for sampling is available. The sample depends upon the spontaneous finding or discovery of sample units in the process of searching for clues to solve the problem under consideration.

Merits:
1. It is completely free from the personal bias of the investigator.
2. It is suitable where no information is available about the population.

Demerits:
1. It is not suitable when highly accurate and reliable data are required.
2. It involves uncertainty about the nature and size of the sample.
3. It is time-consuming and costly.

2. Non-random sampling method: Sampling in which the elementary units are selected on the basis of personal judgment is called non-probability sampling. It is of five types:

1. Purposive sampling: Purposive sampling is the method of sampling by which a sample is drawn from a population based entirely on the personal judgment of the investigator. It is also known as judgment sampling or deliberate sampling.

Merits:
1. The knowledge of the investigator can be put to best use in this technique.
2. It is an economical method.
3. It allows better control of significant variables.

Demerits:
1. Knowledge of the population is essential for using this method.
2. Inferential statistics cannot be used.
3. It is highly affected by personal bias.
4. It is a difficult method in the case of a heterogeneous population.

2. Quota sampling: In quota sampling, quotas are fixed according to basic parameters of the population determined earlier, and each field investigator is assigned a quota of elementary units to be interviewed.

Merits:
1. It is an easy method.
2. It is suitable for a homogeneous population.

Demerits:
1. It does not give a representative sample.
2. It is influenced by regional, geographical and social factors.

3. Convenience sampling: In convenience sampling, a sample is obtained by selecting the most convenient elements from the population.

Merits:
1. It is an easy and economical method.
2. It is suitable for a homogeneous population.

Demerits:
1. It does not give a representative sample.
2. It is a crude method of collecting a sample.
3. It is not suitable for a heterogeneous population.
4. This method is not good for important situations.

4. Sequential sampling: In sequential sampling a number of samples are drawn one after another from the population, depending on the results of the earlier samples drawn from the same population. Sequential sampling is very useful in statistical quality control. If the first sample is acceptable, no further sample is drawn. If the first sample is completely unacceptable, the lot is rejected straightaway. But if the first sample is of doubtful or marginal character, falling between the acceptance and rejection limits, a second sample is drawn and, if need be, a third sample of bigger size, in order to arrive at a final decision on acceptance or rejection of the lot. Such sampling can be based on any of the random or non-random methods of selection.

Merits:
1. It is a simple method.
2. It is used to obtain a more representative sample.
3. It is very helpful in making public policies and decisions; executives and public officials use this method for urgent problems.

Demerits:
1. Due to individual bias the sample may not be a representative one.
2. It is difficult to get correct sampling errors.
3. The estimates are not accurate.
4. Its results cannot be compared with other sampling studies.
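The accept, reject or draw-again logic of sequential sampling in quality control can be sketched as below. The function name `sequential_decision` and the cumulative acceptance and rejection limits are hypothetical; the text does not specify any particular limits, so the numbers serve only to show the flow of decisions.

```python
def sequential_decision(defects_per_sample, accept_at, reject_at):
    """Inspect samples one by one until the lot can be accepted or rejected.

    accept_at[i] and reject_at[i] are hypothetical limits on the cumulative
    number of defectives found after sample i + 1 has been inspected.
    """
    cumulative = 0
    for i, defects in enumerate(defects_per_sample):
        cumulative += defects
        if cumulative <= accept_at[i]:
            return "accept", i + 1        # clearly acceptable: stop sampling
        if cumulative >= reject_at[i]:
            return "reject", i + 1        # clearly unacceptable: stop sampling
        # Otherwise the result is marginal, so another sample is drawn.
    return "undecided", len(defects_per_sample)

accept_at = [1, 4]   # accept if cumulative defectives are at or below these
reject_at = [5, 5]   # reject if cumulative defectives are at or above these

print(sequential_decision([0], accept_at, reject_at))      # first sample acceptable
print(sequential_decision([6], accept_at, reject_at))      # first sample unacceptable
print(sequential_decision([3, 1], accept_at, reject_at))   # marginal, then accepted
print(sequential_decision([3, 3], accept_at, reject_at))   # marginal, then rejected
```

Only the marginal cases require a second sample, which is why the method can economise on inspection when lots are clearly good or clearly bad.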

5. Snowball sampling: Snowball sampling is used when the investigators do not know where the respondents are to be found. Under this method, the investigator begins by finding one or two respondents and collecting data from them. The investigator then asks these respondents for information about other respondents known to them, so that data can be collected from those respondents as well.

Merits:
1. This method helps to find unknown respondents.
2. This method can be used with the help of the internet.

Demerits:
1. The size of the sample is not under the control of the investigator.
2. It is a time-consuming method.

Test of reliability of samples:
1. A number of samples may be taken from the same universe and their results compared; if there is not much variation, the sample is reliable.
2. Sub-samples may be taken from the main sample and studied. If the results of the sample and the sub-samples are similar, the sample is reliable.
3. The measurements of the sample and the measurements of the universe may be compared. If they are similar, the sample is reliable.

Sampling theory: The study of the relationship existing between a population and the samples drawn from it is called sampling theory. It deals with statistical inferences drawn from sampling results. Statistical inferences made on the basis of sampling results are of the following three types.

1) Statistical estimation. It helps in estimating an unknown population parameter (such as the population mean, median, mode, standard deviation, kurtosis etc.) on the basis of a suitable statistic (such as the sample mean, mode, median, variance etc.) computed from a sample drawn from the parent population.

2) Tests of significance. Sampling theory helps in testing the significance of population characteristics on the basis of a suitable statistic computed from a sample drawn from the parent population. In other words, it helps in determining whether observed differences are real (significant) or merely due to chance. Answers to such questions help us decide, for example, whether one production process is better than another. The test of significance plays an important role in decision theory.

3) Statistical inference. Statistical inference means drawing conclusions about some matter on the basis of certain statistical results. These results are obtained by drawing samples from the population and computing statistics from these samples. The inferences so made enable us to draw conclusions about some measures of a population on the basis of such statistics.

Sampling distribution of a statistic: The sampling distribution of a statistic is the frequency distribution formed from the various values of the statistic computed from different samples of the same size drawn from the same population. We can draw a large number of samples of the same size from a population, each sample containing different population members. Any statistic (a statistical measure of the sample) such as the mean, median, variance or standard deviation may be computed for each of these samples. As a result, a series of values of that statistic is obtained. These values can be arranged into a frequency distribution table, which is known as the sampling distribution of the statistic.

Sampling may be done with replacement or without replacement. Sampling with replacement means that the same unit of the population may be included in a sample more than once. Sampling without replacement means that the same unit may not be included in a sample more than once. With replacement, the total number of possible samples of size ‘n’ drawn from a population of size N is Nⁿ. Without replacement, the total number of possible samples is C(N, n) = m (say).

For each of these samples a value of a statistic, say the sample mean x̄, is computed. As the samples are formed from different sample units, the values of the sample means will in general differ. The sample mean can therefore be regarded as a random variable x̄, and each computed sample mean is an observed value of this random variable. Let these values be x̄1, x̄2, ..., x̄m. These mean values of the various samples can be used to form a frequency distribution, and this frequency distribution of the statistic x̄ is known as the sampling distribution of the sample mean. Similarly, the sampling distribution of the standard deviation, coefficient of variation or variance may be constructed from the various values of the standard deviation, coefficient of variation or variance respectively.

If the population is infinitely large, or so large that listing every possible sample is impractical, the total number of possible samples of the same size cannot be enumerated. In such a case a large number of repeated random samples of fixed size can be drawn from the population and the value of the statistic computed for each. These values of the statistic can be used to form a frequency distribution, which is known as the sampling distribution of the statistic.

The main characteristic of the sampling distribution of the sample mean is that it approaches the normal distribution even when the population distribution is not normal, provided the sample size is sufficiently large (conventionally, greater than 30). Another important feature is that the mean and the standard deviation of the sampling distribution of the sample mean bear a definite relation to the corresponding parameters, i.e. the mean and standard deviation of the parent population.
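For a very small population the sampling distribution of the sample mean can be built by listing every possible sample. The sketch below uses a hypothetical population of N = 4 values and samples of size n = 2, and checks the counts C(N, n) and Nⁿ as well as the fact that the mean of all sample means equals the population mean.

```python
from collections import Counter
from fractions import Fraction
from itertools import combinations, product
from math import comb

population = [2, 4, 6, 8]     # hypothetical population, N = 4
N, n = len(population), 2

def mean(values):
    """Exact arithmetic mean as a Fraction."""
    return Fraction(sum(values), len(values))

# Without replacement: C(N, n) unordered samples.
wor_samples = list(combinations(population, n))
# With replacement: N**n ordered samples.
wr_samples = list(product(population, repeat=n))

# Frequency distribution of the sample mean, i.e. its sampling distribution.
wor_dist = Counter(mean(s) for s in wor_samples)
wr_dist = Counter(mean(s) for s in wr_samples)

mu = mean(population)
mean_of_means_wor = sum(m * f for m, f in wor_dist.items()) / len(wor_samples)
mean_of_means_wr = sum(m * f for m, f in wr_dist.items()) / len(wr_samples)

print("samples without replacement:", len(wor_samples), "= C(N, n) =", comb(N, n))
print("samples with replacement:   ", len(wr_samples), "= N**n =", N ** n)
print("sampling distribution (WOR):", sorted(wor_dist.items()))
print("mu =", mu, " mean of sample means:", mean_of_means_wor, mean_of_means_wr)
```

Exact fractions are used instead of floating-point numbers so that the equality between the population mean and the mean of the sample means can be checked without rounding error.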

Sampling distribution helps us: 1. To estimate the unknown population parameter from the known statistic. 2. To set the confidence limits of the parameter within which the parameter values are expected to lie. 3. To test a hypothesis and to draw a statistical inference from it.

Sampling distribution of proportion: Suppose that the population under study is classified into only two mutually exclusive and exhaustive classes, according to whether a given unit possesses or does not possess the attribute under consideration, say smoking, drinking, honesty, beauty etc. Consider a population consisting of N units, and let the number of units possessing the attribute be ‘a’ (say). Then:

P = a/N, the proportion of units in the population possessing the given attribute.
Q = 1 - P, the proportion of units in the population which do not possess the given attribute.

Let us take a simple random sample without replacement (SRSWOR) of size n from this population. If x is the number of units in the sample possessing the given attribute, then we define:

p = x/n, the proportion of sampled units possessing the given attribute.
q = 1 - p, the proportion of sampled units which do not possess the given attribute.
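The behaviour of the sample proportion can be checked by listing every SRSWOR sample from a small hypothetical population (N = 10, a = 4, n = 3). The variance formula used for comparison, Var(p) = (PQ/n) × (N - n)/(N - 1), is the standard result for sampling without replacement; it is not stated in the text above and is brought in here as an assumption for illustration.

```python
from fractions import Fraction
from itertools import combinations

# Hypothetical population of N = 10 units; 1 marks a unit that has the attribute.
population = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
N, n = len(population), 3
P = Fraction(sum(population), N)      # population proportion a/N
Q = 1 - P

# Sample proportion p = x/n for every possible SRSWOR sample of size n.
# Combinations are taken over positions so that equal values stay distinct units.
props = [Fraction(sum(population[i] for i in idx), n)
         for idx in combinations(range(N), n)]

mean_p = sum(props) / len(props)
var_p = sum((p - mean_p) ** 2 for p in props) / len(props)
formula_var = P * Q / n * Fraction(N - n, N - 1)

print("P =", P, " mean of all sample proportions =", mean_p)
print("Var(p) by enumeration =", var_p, " by formula =", formula_var)
```

The mean of the sample proportions equals P, which is why p is used to estimate P, and the square root of Var(p) is the standard error of the proportion.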

Standard error of a statistic: The standard deviation may be computed both from the observations of a population and from the observations of a sample. It is a measure of the average amount of variability of the observations of a variable about their mean. When this variability is computed for the observations of a population, it is called the standard deviation. When it is computed for the observations of the sampling distribution of a statistic, it is known as the standard error. Thus the standard deviation of the sampling distribution of a statistic is called the standard error of the statistic. In other words, the standard deviation used to measure the variability of the values of a statistic from sample to sample is called the standard error of the statistic. The standard deviation and the standard error are therefore the same kind of measure, used in different circumstances. The standard error measures the variability of the values of a statistic computed from samples of the same size drawn from the population, whereas the standard deviation measures the variability of the observations of the population itself.

Utility of standard error of a statistic: The standard error of the sample mean and the standard error of the proportion are used in sampling for the following purposes.
1. To determine the precision of the sample statistic.
2. To set up the confidence limits within which the population parameter may lie.
3. To test a hypothesis and to draw a statistical conclusion from it.
4. To test the randomness of the sample.
5. To measure the variability of the values of a statistic about its mean.
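The link between the standard deviation and the standard error can be sketched numerically. For sampling with replacement, the standard error of the sample mean equals σ/√n; this formula is a standard result that the text does not state, so it is treated here as an assumption and verified by listing every sample from a hypothetical population.

```python
from fractions import Fraction
from itertools import product
from math import sqrt

population = [2, 4, 6, 8]     # hypothetical population
n = 2                         # sample size

def mean(values):
    """Exact arithmetic mean as a Fraction."""
    return Fraction(sum(values), len(values))

def variance(values):
    """Average squared deviation from the mean (divisor = number of values)."""
    m = mean(values)
    return sum((v - m) ** 2 for v in values) / len(values)

# Standard deviation: variability of the population observations themselves.
sigma_sq = variance(population)

# Standard error: variability of the sample means over all N**n samples
# drawn with replacement.
sample_means = [mean(s) for s in product(population, repeat=n)]
se_sq = variance(sample_means)

print("standard deviation  =", round(sqrt(sigma_sq), 4))
print("standard error      =", round(sqrt(se_sq), 4))
print("sigma / sqrt(n)     =", round(sqrt(sigma_sq / n), 4))
```

The last two printed values agree, showing that the standard error is simply the standard deviation of the sampling distribution and that it is smaller than the population standard deviation.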

Basic statistical laws: A sample survey is the study of an unknown population on the basis of a proper sample drawn from it. The four popular and well-defined statistical laws and principles that bear on sample design, the first two of which show the utility of a large sample for reducing sampling errors, are:
1. Law of statistical regularity.
2. Law of inertia of large numbers.
3. Principle of optimization.
4. Principle of validity.

1. Law of statistical regularity: It states that a reasonably large number of items selected at random from a large group of items will, on the average, represent the characteristics of the group. In the words of the statistician W.I. King, "The law of statistical regularity lays down that a moderately large number of items chosen at random from a large group, are almost sure (on the average) to possess the characteristics of the large group." This law explains that if a reasonably large sample is selected at random without bias (i.e. by probability sampling), it is almost certain that, on average, the sample so chosen will have the same characteristics as the parent population from which its units have been drawn. It is on this basis that the law of statistical regularity tells us that a random selection is very likely to give a representative sample.

2. Law of inertia of large numbers: It states that "large groups or aggregates of data show a high degree of stability because there is a greater possibility that the extremes on one side are compensated by the extremes on the other side." The law of inertia of large numbers is a corollary to the law of statistical regularity. It emphasizes the fact that large numbers are relatively more stable and more reliable than small ones. In a large number of observations, it is unlikely that the data would all move in one direction. Thus, the greater the size of the sample, the greater will be the tendency of deviations to neutralize one another, and consequently the more stable the result. For example, the birth rate, death rate etc. may vary from place to place, but for India as a whole they will be found somewhat stable over a number of years. The simplest method of increasing the accuracy of a sample is to increase its size. The larger the size of the sample, the more reliable is the result. Other things remaining unchanged, the sampling error is inversely proportional to the square root of the number of items in the sample. The laws of statistical regularity and inertia of large numbers have great importance in the theory of sampling, as the sampling error is reduced to a minimum if these laws are correctly applied.
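The inverse square-root relationship can be seen with a few numbers. The sketch below assumes that the sampling error is measured by the standard error of the mean, σ/√n, and uses a hypothetical population standard deviation of 12; the function name `standard_error` is introduced only for this illustration.

```python
from math import sqrt

def standard_error(sigma, n):
    """Standard error of the sample mean, sigma / sqrt(n)."""
    return sigma / sqrt(n)

sigma = 12.0    # hypothetical population standard deviation

for n in (25, 100, 400):
    print(f"n = {n:3d}  standard error = {standard_error(sigma, n):.2f}")
```

Each fourfold increase in the sample size only halves the standard error (2.40, 1.20, 0.60 here), so gains in accuracy become progressively more expensive as the sample grows.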

3. Law of optimization. The principle of optimization ensures that an optimum level of efficiency at a minimum cost or maximum efficiency at a given level of cost can be achieved with the selection of an appropriate sampling design.

4. Law of validity. The principle of validity states that a sampling design is valid only if it is possible to obtain valid estimates and valid tests about population parameters. Only probability sampling ensures this validity.


Theory of Estimation: It is possible to draw valid conclusions about population parameters from the sampling distribution. Suppose we have a sample from a population involving an unknown parameter, such as the population mean µ. The problem is to construct a sample quantity that will serve to estimate the unknown parameter. Such a sample quantity is called the estimator, and the actual numerical value obtained by evaluating an estimator in a given instance is the estimate. Note that an estimator must be a statistic: it must depend only on the sample and not on the parameter to be estimated. An estimator is thus a statistic which, for all practical purposes, can be used in place of the unknown parameter of the population. Estimates are bound to differ from the true values of the population parameters, but the tolerable divergence between the estimated and the true value of the population parameter may be specified beforehand.

Characteristics of a Good Estimator: A good estimator is one which is as close to the true value of the parameter as possible. A good estimator must possess the following characteristics.
1. Unbiasedness: An estimator T is said to be an unbiased estimator of the parameter θ if E(T) = θ, i.e. on an average the sample statistic assumes the value of the parameter.
2. Consistency: An estimator T is said to be a consistent estimator of the parameter θ if E(T) → θ and V(T) → 0 as n → ∞, i.e. for a large sample size the value of the sample statistic is near the value of the population parameter and its variance is near zero.
3. Efficiency: An estimator T is said to be efficient if it has the smallest variance among all possible estimators of the parameter.
4. Sufficiency: An estimator T is said to be sufficient for the parameter θ if it contains all the information about θ that is contained in the sample.
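Unbiasedness can be illustrated by averaging an estimator over many repeated samples and comparing the result with the parameter. The sketch below is an illustration only: the simulated population, the sample size n = 5 and the helper `average_estimate` are hypothetical choices, not taken from the notes. It compares the sample mean with the population mean, and the sample variance computed with divisor n and with divisor n − 1 against the population variance.

```python
import random
import statistics

rng = random.Random(42)
# Hypothetical population (an assumption for this example): 10,000 values
# drawn from a normal distribution with mean 50 and standard deviation 10.
population = [rng.gauss(50, 10) for _ in range(10_000)]
mu = statistics.fmean(population)
var = statistics.pvariance(population)

def average_estimate(estimator, n, trials=5000):
    """Average value of `estimator` over many random samples of size n."""
    return statistics.fmean(
        estimator(rng.choices(population, k=n)) for _ in range(trials)
    )

n = 5
print("population mean           :", round(mu, 2))
print("average sample mean       :", round(average_estimate(statistics.fmean, n), 2))
print("population variance       :", round(var, 2))
print("average variance, / n     :", round(average_estimate(statistics.pvariance, n), 2))
print("average variance, / (n-1) :", round(average_estimate(statistics.variance, n), 2))
```

The average of the sample means should lie close to the population mean (unbiased), while the variance computed with divisor n should fall short of the population variance by a factor of about (n − 1)/n, which is why a divisor of n − 1 appears when the population standard deviation has to be estimated from the sample.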

Types of Estimation: Estimation is divided into two types: 1. Point Estimation. 2. Interval estimation.

1) Point Estimation: In point estimation a single statistic is used to provide an estimate of the population parameter. In other words, the estimate of a population parameter given by a single number is called the point estimate of the parameter. In point estimation, we find a statistic which may be used to replace an unknown parameter of the population for all practical purposes.


2) Interval Estimation: There are situations where point estimation is not desirable and we are interested in finding limits within which the value of the population parameter is expected to lie with a known probability or to a known degree of reliability. Such a process of estimation is called interval estimation. In other words, an interval estimate is the range of values used in estimating a population parameter. Thus the interval estimation of a population parameter is the estimation of that parameter by an interval around a point estimate.

Interval Estimate of Mean and Proportion: 1. Confidence interval of the Mean: Let µ be the population mean and $\bar{X}$ be the sample mean. The sampling distribution of the sample mean is assumed to be normal if the sample is large. The interval estimate of the population mean µ is then constructed around the sample mean $\bar{X}$, using the standard error of the sampling distribution of means.

Steps to Determine Confidence Interval of Mean:
Step 1: Calculate the arithmetic mean of the sample, i.e. $\bar{X}$.
Step 2: Select the confidence level and, corresponding to that level of confidence, find from the table the critical value of Z or t. Whether Z or t is used depends on the information available about the population standard deviation and on the sample size. The following table summarizes when to use which value (Z or t).

Sample size      | When population standard deviation is known  | When population standard deviation is not known
Large samples    | Value of Z                                   | Value of Z
Small samples    | Value of Z (the population must be normal)   | Value of t

Note:
1. Confidence coefficients of Z, from the table of areas under the standard normal probability distribution, at various confidence levels are given below.

Confidence level                               | Value of confidence coefficient
90%                                            | 1.64
95%                                            | 1.96
98%                                            | 2.33
99%                                            | 2.58
Without any reference to the confidence level  | 3.00

2. Confidence coefficients of t have to be ascertained from the t tables. The t values depend on the degrees of freedom. The degrees of freedom are n − 1.
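The Z coefficients in the note above can be reproduced from the standard normal distribution. The snippet below is a minimal sketch using Python's standard `statistics.NormalDist`; the function name `z_coefficient` is an illustrative choice. It assumes a two-sided interval, so the area outside the interval is split equally between the two tails.

```python
from statistics import NormalDist

def z_coefficient(confidence_level):
    """Two-sided confidence coefficient Z for a level such as 0.95."""
    tail = (1 - confidence_level) / 2          # area left in each tail
    return NormalDist().inv_cdf(1 - tail)

for level in (0.90, 0.95, 0.98, 0.99):
    print(f"{level:.0%}  Z = {z_coefficient(level):.3f}")
```

To three decimals the 90% coefficient is 1.645, which tables commonly show as 1.64 or 1.65. The t coefficients are not covered by this sketch and still have to be read from a t table for n − 1 degrees of freedom.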


Step 3: Calculate the standard error of the mean, $S.E.(\bar{X})$, with the help of the following results. They apply whether the sample size is small or large.

Population size | When population standard deviation is known | When population standard deviation is not known
Infinite population, or finite population from which samples are drawn with replacement | $S.E.(\bar{X}) = \dfrac{\sigma}{\sqrt{n}}$ | $S.E.(\bar{X}) = \dfrac{s}{\sqrt{n-1}}$
Finite population (generally whenever the ratio of sample size (n) to population size (N), i.e. the sampling fraction n/N, is 0.05 or more) | $S.E.(\bar{X}) = \dfrac{\sigma}{\sqrt{n}} \sqrt{\dfrac{N-n}{N-1}}$ | $S.E.(\bar{X}) = \dfrac{s}{\sqrt{n-1}} \sqrt{\dfrac{N-n}{N-1}}$

Step 4: Construct the confidence interval as follows:

Sample size        | When population standard deviation is known | When population standard deviation is not known
For large samples  | $\bar{X} \pm Z \times S.E.(\bar{X})$        | $\bar{X} \pm Z \times S.E.(\bar{X})$
For small samples  | $\bar{X} \pm Z \times S.E.(\bar{X})$        | $\bar{X} \pm t \times S.E.(\bar{X})$

Confidence Limits: Lower Limit: $\bar{X} - Z \times S.E.(\bar{X})$ and Upper Limit: $\bar{X} + Z \times S.E.(\bar{X})$.
Note:
1. When both the population standard deviation (σ) and the sample standard deviation (s) are given, always use the population standard deviation σ while calculating the standard error.
2. When the population standard deviation (σ) is not known, use the sample standard deviation (s) as an estimate of the population standard deviation (σ) in the numerator and $\sqrt{n-1}$ in the denominator of the formula for $S.E.(\bar{X})$.
3. The Finite Population Multiplier $\sqrt{\dfrac{N-n}{N-1}}$ is used when the population is finite (i.e. when its size is given) and the sampling fraction (n/N) is 0.05 or more, irrespective of whether the sample size is small or large.
4. The value of t is to be used when the sample size is small and the population standard deviation (σ) is not known. In other cases, the value of Z is to be used.
5. Confidence coefficients Z, from the table of areas under the standard normal probability distribution, at various confidence levels are given below.

Confidence level          | 90%  | 95%  | 98%  | 99%  | Almost sure level
Confidence coefficient Z  | 1.64 | 1.96 | 2.33 | 2.58 | 3.00
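Steps 3 and 4 can be sketched in a few lines of code. This is an illustrative example under stated assumptions, not a prescribed method: the function names and the data (n = 100 items from a population of N = 1000, sample mean 50, σ = 10) are hypothetical. The standard error follows the rules given in Step 3, including the divisor $\sqrt{n-1}$ when σ is unknown and the finite population multiplier when n/N is 0.05 or more.

```python
import math
from statistics import NormalDist

def standard_error_mean(sd, n, sigma_known=True, N=None):
    """S.E. of the mean as described in Step 3.

    sd : sigma if sigma_known is True, otherwise the sample s
    N  : population size, or None for an infinite population
    """
    se = sd / math.sqrt(n if sigma_known else n - 1)
    if N is not None and n / N >= 0.05:
        se *= math.sqrt((N - n) / (N - 1))     # finite population multiplier
    return se

def confidence_interval(mean, se, coefficient):
    """Lower and upper confidence limits: mean -/+ coefficient * S.E."""
    margin = coefficient * se
    return mean - margin, mean + margin

# Hypothetical data, chosen only to exercise the formulas.
z = NormalDist().inv_cdf(0.975)                # about 1.96 for 95% confidence
se = standard_error_mean(sd=10, n=100, sigma_known=True, N=1000)
low, high = confidence_interval(50, se, z)
print(f"S.E. = {se:.4f}, 95% interval = ({low:.2f}, {high:.2f})")
```

For a small sample with σ unknown, the same `confidence_interval` call would be made with a t value looked up for n − 1 degrees of freedom in place of `z`.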


Interval Estimate of the Proportion: Population Proportion: The Population proportion (P) is the ratio of the number of elements possessing a characteristic to the total number of elements in the Population (N).

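Population proportion = (Number of elements possessing the characteristic) / (Total number of elements in the population), or $P = \dfrac{X}{N}$, where X is the number of elements in the population possessing the characteristic.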

If we multiply the population proportion by 100, we get the percentage, and we may make use of the percentage for the proportion and vice versa.

Sample Proportion: The sample proportion (p) is the ratio of the number of elements possessing a characteristic to the total number of elements (n) in the sample.

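Sample proportion = (Number of elements in the sample possessing the characteristic) / (Total number of elements in the sample), or $p = \dfrac{x}{n}$, where x is the number of elements in the sample possessing the characteristic.

It is important to note that the mean of the sampling distribution of the sample proportion p equals the population proportion, i.e., E(p) = P.

The standard error of the proportion, S.E.(p), is calculated as follows, where Q = 1 − P and q = 1 − p.

Population size | When population proportion (P) is known | When population proportion (P) is not known
Infinite population, or finite population from which samples are drawn with replacement | $S.E.(p) = \sqrt{\dfrac{PQ}{n}}$ | $S.E.(p) = \sqrt{\dfrac{pq}{n}}$
Finite population (generally whenever the ratio of sample size (n) to population size (N), i.e. the sampling fraction n/N, is 0.05 or more) | $S.E.(p) = \sqrt{\dfrac{PQ}{n}} \sqrt{\dfrac{N-n}{N-1}}$ | $S.E.(p) = \sqrt{\dfrac{pq}{n}} \sqrt{\dfrac{N-n}{N-1}}$

Step 3: Construct a confidence interval as follows:

P or p $\pm\, Z \times S.E.(p)$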

Confidence Limits: Lower Limit: P or p $-\, Z \times S.E.(p)$ and Upper Limit: P or p $+\, Z \times S.E.(p)$.
Note:
(i) When both the population proportion (P) and the sample proportion (p) are given, always use the population proportion P while calculating the standard error.
(ii) When the population proportion (P) is not known, use the sample proportion (p) as an unbiased estimate of the population proportion (P).
(iii) The Finite Population Multiplier $\sqrt{\dfrac{N-n}{N-1}}$ is used when the population is finite (i.e. when its size is given) and the sampling fraction (n/N) is 0.05 or more.
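The interval estimate of a proportion can be sketched in the same way as that of the mean. The example below is illustrative only: the survey figures (120 of 400 sampled items possessing the characteristic, population treated as infinite) and the function name are assumptions made for the sketch.

```python
import math
from statistics import NormalDist

def standard_error_proportion(p, n, N=None):
    """S.E. of a proportion; pass P if it is known, otherwise the sample p."""
    se = math.sqrt(p * (1 - p) / n)
    if N is not None and n / N >= 0.05:
        se *= math.sqrt((N - n) / (N - 1))     # finite population multiplier
    return se

# Hypothetical survey: x = 120 of n = 400 sampled items have the characteristic.
x, n = 120, 400
p = x / n
z = NormalDist().inv_cdf(0.975)                # about 1.96 for 95% confidence
se = standard_error_proportion(p, n)
low, high = p - z * se, p + z * se
print(f"p = {p:.2f}, S.E. = {se:.4f}, 95% interval = ({low:.3f}, {high:.3f})")
```

Passing a population size, for instance `standard_error_proportion(p, n, N=2000)`, applies the finite population multiplier because the sampling fraction 400/2000 exceeds 0.05, and so gives a smaller standard error.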

Determination of sample size: The determination of the sample size for estimating a mean or a proportion is a crucial question. A sample size lower than the correct size may affect reliability, while a higher one will mean more cost and time. The size of the sample is therefore one of the most important factors in estimating the value of the population parameters.


Sample size for estimating a mean: In order to determine the sample size for estimating a population mean, the following factors must be known. 1. The desired confidence level. 2. The permissible sampling error $E = \bar{X} - \mu$. 3. The standard deviation of the population (σ).

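Sample Size for Estimating Population Mean:

$n = \left( \dfrac{z\sigma}{E} \right)^{2}$

where z = confidence coefficient (for example, 1.96 at the 5% level of significance), σ = standard deviation of the population, and E = permissible sampling error = $\bar{X} - \mu$.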
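As a worked illustration of the formula n = (zσ/E)², the short sketch below computes a sample size from assumed inputs. The figures (σ = 15, E = 2, 95% confidence) and the function name are hypothetical; the result is rounded up because a sample cannot contain a fraction of an item.

```python
import math

def sample_size_for_mean(z, sigma, error):
    """n = (z * sigma / E)^2, rounded up to the next whole item."""
    return math.ceil((z * sigma / error) ** 2)

# Hypothetical inputs: z = 1.96 (95% confidence), sigma = 15, permissible error E = 2.
print(sample_size_for_mean(1.96, 15, 2))
```

Here (1.96 × 15 / 2)² = 216.09, so at least 217 items are needed. Halving the permissible error roughly quadruples the required sample size.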

Sample size for estimating a proportion: In this case, we must know the following three factors. 1) The desired confidence level. 2) The permissible sampling error. 3) The estimated true proportion of success.

Sample Size for Estimating Population Proportion:

$n = \dfrac{z^{2} p q}{E^{2}}$

where z = confidence coefficient at the given level of significance, p = estimated true proportion of success, q = 1 − p, and E = permissible sampling error.
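The formula n = z²pq/E² can be illustrated in the same way. The inputs below (p = 0.5, E = 0.05, 95% confidence) and the function name are assumptions chosen for the example, not values from the notes.

```python
import math

def sample_size_for_proportion(z, p, error):
    """n = z^2 * p * q / E^2 with q = 1 - p, rounded up to a whole item."""
    q = 1 - p
    return math.ceil(z ** 2 * p * q / error ** 2)

# Hypothetical inputs: z = 1.96 (95% confidence), estimated p = 0.5, E = 0.05.
print(sample_size_for_proportion(1.96, 0.5, 0.05))
```

With these inputs z²pq/E² = 384.16, so 385 items are required. Since pq is largest when p = 0.5, using p = 0.5 gives the most cautious (largest) sample size when no estimate of the true proportion is available.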

Errors in sample survey: A sample is a part of the whole population. A sample drawn from the population depends on chance, and as such all the characteristics of the population may not be present in the sample. Any statistical measure of the sample, say its mean, may not be equal to the corresponding statistical measure (mean) of the population from which the sample has been drawn. Thus there can be discrepancies between the statistical measure of the population, i.e. the parameter, and the statistical measure of a sample drawn from the same population, i.e. the statistic. These discrepancies are known as errors in sampling. Errors in sampling are of two types. 1) Sampling Errors. 2) Non-sampling Errors or Bias.

1. Sampling Errors. Sampling errors are inherent in the method of sampling. Sampling depends on chance, and it is this element of chance that gives rise to sampling errors. Errors in sampling arise primarily due to the following reasons.
1. Faulty selection of the sample: This may be due to the selection of a defective sampling technique which introduces an element of bias, e.g. purposive or judgment sampling, in which the investigator deliberately selects a non-representative sample.
2. Substitution: Sometimes an investigator, while collecting information from a particular sampling unit included in the random selection, substitutes a convenient member of the population, and this may lead to some bias, as the characteristics possessed by the substituted unit may be different from those possessed by the original unit included in the sample.
3. Variability of the population: Sampling error may also depend on the variability or heterogeneity of the population from which the samples are drawn.

2. Non-sampling Errors or Bias: Non-sampling errors or bias creep in due to human factors, which vary from one investigator to another. Bias may arise in the following ways.
1. Due to negligence and carelessness on the part of the investigator.
2. Due to faulty planning of sampling.
3. Due to faulty selection of sample units.
4. Due to incomplete investigation and sample survey.
5. Due to framing of a wrong questionnaire.
6. Due to negligence and non-response on the part of the respondents.
7. Due to substitution of a selected unit by another unit.
8. Due to errors in compilation of data.
9. Due to applying a wrong statistical measure.
