Editing of Secondary Data
Secondary data, whether published or unpublished, should be used with great caution, because they were originally collected by others, for their own purposes, at different times and under different conditions, and so may not suit the present investigation in all respects. Such data may contain errors of omission, commission, compensation, and duplication. Therefore, before being used they must be carefully edited, or scrutinized, to ensure that they are free from inaccuracy, inconsistency, inadequacy and unsuitability.
In the words of Prof. A.L. Bowley, “It is never safe to take published statistics at their face value without knowing their meanings and limitations, and it is always necessary to criticize arguments that can be based on them.”
In the words of L.R. Connor, “Statistics, especially other people’s statistics, are full of pitfalls for the user.”
Further, in the words of Simon Kuznets, “The degree of reliability of secondary source is to be assessed from the source, the compiler, and his capacity to produce correct statistics and the users also, for the most part, tend to accept a series, particularly one issued by a government agency, at its face value, without enquiring into its reliability.”
Therefore, before making use of any secondary data they should be strictly edited in the light of the following tests:
1. Test of Reliability
While editing the secondary data, the editor must see that the data obtained are accurate and reliable.
For testing the reliability of the data, the editor should make the following queries:
(i) Who collected the data?
(ii) From where were the data collected?
(iii) Is the compiler dependable in regard to honesty, integrity, experience and training?
(iv) Is the source of the data dependable in regard to accuracy, adequacy and consistency?
(v) What methods were employed in the primary collection of the data?
(vi) Are the methods of collection proper and dependable?
(vii) At what time were the data collected, and was it a normal time?
(viii) Was there any possibility of bias or prejudice creeping into the minds of the compilers?
(ix) What degree of accuracy was fixed by the investigator, and was it achieved?
(x) Was the size of the sample adequate?
(xi) Was the sample drawn at random, and was it adequate?
(xii) For what purpose were the data collected?
(xiii) What period is covered by the data, and how far is it relevant to the present study?
(xiv) What units of collection and measurement were employed? Were they clearly defined? Are they suitable for the present purpose?
(xv) Were the editing, tabulation, and analysis of the data carefully and conscientiously done?
2. Test of Adequacy
While editing the secondary data it must be seen that they are adequate, or sufficient, for the purpose of the enquiry. As pointed out earlier, too much data may prove confusing and irrelevant. Similarly, too little data will not serve the purpose or give a true picture of the problem under study. Therefore, the data must be adequate for the purpose. Whether the data collected from the secondary sources are adequate or not can be tested in the light of the following queries:
(i) What was the geographical area from which the data were collected?
(ii) Is the area of collection wider or narrower than the area covered under the present study?
For example, if the object is to measure the change in the general price level of India through the construction of a wholesale price index number, but the data collected relate only to the cost of living of the people in a particular locality, the data would not serve the purpose on the ground of inadequacy.
(iii) What is the period covered by the data?
(iv) Is the period covered by the data commensurate with the period of the problem under study?
(v) What was the degree of accuracy achieved with the collected data?
(vi) Is the degree of accuracy achieved with the data commensurate with the degree of accuracy desired in the present enquiry?
For example, if the collected data achieved 95% accuracy but the present study requires 99% accuracy, the data will not suit the purpose of the present enquiry.
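The adequacy queries above compare the secondary data with the present study on three points: area, period, and degree of accuracy. A minimal sketch of that comparison follows. The names Coverage and adequacy_problems, the locality names, and the years are hypothetical and chosen only for illustration; the 95% and 99% figures come from the example above. The sketch flags only shortfalls (data narrower than the study needs), not surplus data.

```python
from dataclasses import dataclass


@dataclass
class Coverage:
    """Hypothetical summary of what a data set (or a study) covers."""
    regions: frozenset      # geographical units covered
    start_year: int         # first year covered
    end_year: int           # last year covered
    accuracy_pct: float     # accuracy achieved (data) or desired (study)


def adequacy_problems(data: Coverage, study: Coverage) -> list:
    """Return the reasons the secondary data fall short of the study's needs."""
    problems = []
    # Queries (i)-(ii): does the data's area include the study's area?
    missing = study.regions - data.regions
    if missing:
        problems.append("area not covered: " + ", ".join(sorted(missing)))
    # Queries (iii)-(iv): does the data's period span the study's period?
    if data.start_year > study.start_year or data.end_year < study.end_year:
        problems.append("period not fully covered")
    # Queries (v)-(vi): is the achieved accuracy at least the desired accuracy?
    if data.accuracy_pct < study.accuracy_pct:
        problems.append("accuracy below the level desired")
    return problems


# Illustrative values only: one locality against a two-locality study,
# and 95% achieved accuracy against 99% desired accuracy.
secondary = Coverage(frozenset({"Locality A"}), 2001, 2010, 95.0)
present = Coverage(frozenset({"Locality A", "Locality B"}), 2001, 2010, 99.0)

print(adequacy_problems(secondary, present))
# ['area not covered: Locality B', 'accuracy below the level desired']
```

An empty list means none of the three checks failed; it does not by itself establish that the data are reliable or suitable, which are separate tests.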
3. Test of Suitability
While editing the secondary data, it must be seen that the data collected are suitable for the present study. If the data collected are not suitable, they will vitiate the whole purpose of the enquiry and lead to erroneous conclusions. The suitability of the data can be tested in the light of the following queries:
(i) What was the nature of the problem for which the data were collected? If the nature of the problem under study does not resemble that of the problem for which the data were originally collected, the same data will not be suitable for the investigation.
(ii) What was the object of the enquiry?
If the object of the past enquiry is completely different from that of the present enquiry, the collected data will not suit the present purpose. Thus, if the object of the present investigation is to study the trend in wholesale prices, but the data were collected for studying retail prices, such data would be unsuitable for the purpose.
(iii) What was the scope of the enquiry?
If the geographical area covered by the collected data differs from the area of the present enquiry, the data will not be suitable. For example, if the data were collected to study the functioning of non-banking financial institutions in New York, they will not be suitable for studying the same problem with reference to the state of Washington.
(iv) Do the definitions given to the various terms and units used in the earlier investigation remain the same as those under the present enquiry? If the definitions of the various terms and units used in the two enquiries are completely different, the data collected originally will not suit the present enquiry. For instance, if the term ‘wage’ under the present enquiry relates to unskilled labour, but was defined differently in the earlier enquiry, the collected data will not suit the present enquiry.
(v) What was the time covered by the data in the earlier enquiry?
If the time covered by the data was radically different from the time required to be covered by the data under the present enquiry, the data concerned will not be suitable for the problem under study.
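The suitability queries amount to comparing the earlier enquiry with the present one, aspect by aspect. The sketch below illustrates that comparison under simplifying assumptions: the function suitability_problems and the dictionary keys are hypothetical, the earlier definition of ‘wage’ and the periods are invented for illustration, and exact equality stands in for what is in practice a matter of judgement.

```python
def suitability_problems(earlier: dict, present: dict) -> list:
    """Compare two enquiry descriptions and list the aspects that differ."""
    # Aspects taken from the queries above: object, scope (area),
    # definitions of terms and units, and period covered.
    aspects = ("object", "area", "definitions", "period")
    return [a for a in aspects if earlier.get(a) != present.get(a)]


earlier_enquiry = {
    "object": "retail prices",
    "area": "New York",
    "definitions": {"wage": "all labour"},   # invented for illustration
    "period": (1990, 2000),                  # invented for illustration
}
present_enquiry = {
    "object": "wholesale prices",
    "area": "Washington",
    "definitions": {"wage": "unskilled labour"},
    "period": (1990, 2000),
}

print(suitability_problems(earlier_enquiry, present_enquiry))
# ['object', 'area', 'definitions']
```

Any aspect returned marks a point on which the earlier data may not suit the present enquiry and should be examined before the data are used.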