Stages of statistical research

  • 12.10.2019

Introduction

1. Methodology for obtaining initial data

2. Statistical summary and grouping of primary data

2.1 Grouping

2.2 Determination of arithmetic and structural averages

2.3 Histogram and cumulate

2.4 Cost of fixed assets

2.5 Production volume

3. Correlation analysis

3.1 Study of the relationship between factor and performance characteristics. Building a correlation table

3.2 Determination of the closeness of the relationship

4. Regression analysis

4.1 Simulation

4.2 Prediction

Conclusion

References and software used

Introduction

The statistical study of the phenomena of public life begins with the stage of statistical observation. During this stage, in accordance with the goals and objectives of the study, an array of initial data about the object under study is formed. This is the information base on which accounting and control, planning, statistical analysis and management rely. Methods of mass observation based on the "law of large numbers" are used at this stage, because the quantitative patterns of mass phenomena show up clearly only when a sufficiently large number of socio-economic phenomena and processes is studied.

Any statistical observation should be prepared and carried out according to a clearly developed plan that covers the methodology, organization and technique of data collection and the control of data quality and reliability. Thus, statistical observation must have a program and an organizational plan. Questions about the method, form, type, means, timing and place of the observation also have to be resolved in advance; this is what makes the observation planned.

Statistical observation should not be carried out spontaneously, from time to time, but systematically: either continuously or periodically - at regular intervals. This is due to the spatio-temporal variation of the studied socio-economic phenomena and processes.

Statistical observation can be carried out by state statistics bodies, research institutes, and the economic and analytical services of various organizations.

The second stage of statistical research is the statistical summary and grouping of the observation data. Statistical observation yields information about each unit of the population, and each unit has numerous features that change in time and space. The results therefore need to be systematized and generalized in order to obtain, using generalizing indicators, a summary description of the entire object. This makes it possible to identify the characteristic and specific features of the statistical population as a whole and of its individual components, and to discover patterns in the socio-economic phenomena and processes studied. It follows that a summary of the primary statistical material is necessary.

The statistical summary is carried out according to a specially developed program that ensures the completeness and reliability of the results obtained. This program contains a list of groups into which the set of observation units can be divided according to individual characteristics, as well as a system of indicators characterizing the studied set of phenomena as a whole and its individual parts.

The third stage of statistical research is the analysis of statistical information. At this stage, conclusions useful for practical action are drawn from the results of the statistical study, and forecasts are made for the phenomenon or process under study.

1. Methodology for obtaining initial data

To study the dependence of production volume on the value of fixed assets in 2006-2007, the territorial body of state statistics for the Chelyabinsk region organized a statistical study of instrument-making enterprises.

A 20 percent typical sample was drawn.

The object of statistical observation is a set of instrument-making enterprises in the city of Chelyabinsk and the Chelyabinsk region. The reporting unit of statistical observation is an instrument-making enterprise.
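The 20 percent typical sample mentioned above can be illustrated with a short sketch. It assumes that "typical sample" means proportional selection from predefined typical groups (strata); the group names, enterprise IDs and the helper `typical_sample` are hypothetical and are not taken from the study.

```python
import random


def typical_sample(strata, fraction=0.2, seed=0):
    """Draw the same share of units from every typical group (stratum)."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    sample = {}
    for name, units in strata.items():
        # Proportional allocation: each group contributes `fraction` of its units.
        k = max(1, round(len(units) * fraction))
        sample[name] = rng.sample(units, k)
    return sample


# Hypothetical population: enterprise IDs split into two typical groups.
strata = {
    "city": list(range(1, 61)),      # 60 enterprises
    "region": list(range(61, 101)),  # 40 enterprises
}
sample = typical_sample(strata)
print({name: len(units) for name, units in sample.items()})
```

Proportional allocation keeps the structure of the sample the same as the structure of the population, which is the usual reason for sampling within typical groups rather than from the population as a whole.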

In order to improve the system of sample surveys of instrument-making enterprises, the Goskomstat of the Russian Federation has developed a target Program.

According to the Program, in order to save resources, 20% of the enterprises in the Chelyabinsk region operating on the survey date are to be examined. The Program's activities include organizational, methodological, software and technological work that supports the preparation and conduct of sample surveys of instrument-making enterprises; the surveys address the dependence of production volume on the value of fixed assets. To ensure that the sample surveys are properly prepared, the Program also provides for the training of survey personnel and for outreach work. The Program is expected to be implemented during 2008-2009. The results of the sample observation of instrument-making enterprises in the city of Chelyabinsk and the Chelyabinsk region for two indicators (production volume and cost of fixed assets) are shown in Table 1.

Table 1. Main performance indicators of instrument-making enterprises in the city of Chelyabinsk and the Chelyabinsk region for 2006-2007

Factory No. | Cost of fixed assets | Production volume, million rubles

2. Statistical summary and grouping of primary data

2.1 Grouping

The statistical observation data show that the characteristics vary within relatively narrow limits and that the distribution is fairly uniform. In this case, the grouping is built with equal intervals. The number of groups depends primarily on how much the characteristic fluctuates: the greater the fluctuation (range of variation), the more groups can be formed. The formulas for building the statistical grouping are given below.

Since the sample size is not large, we use the formula to determine the number of groups:

The interval width h is found by the formula:

$$h = \frac{x_{\max} - x_{\min}}{n}, \qquad (1.2)$$

where $x_{\max}$ and $x_{\min}$ are the largest and smallest values of the characteristic and $n$ is the number of groups.

The value obtained by formula (1.2), which will be the interval step, is rounded (the rounded value should not differ from the original by more than 10-15%). For the first interval, the lower boundary is then $x_{\min}$ and the upper boundary is $x_{\min} + h$, and so on. Thus, the lower limit of the i-th interval is equal to the upper limit of the (i-1)-th interval.
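The construction of an equal-interval grouping can be sketched in a few lines. The formula for the number of groups is not preserved in this text, so the sketch assumes Sturges' rule, n = 1 + 3.322 lg N, which is a common choice for small samples; the function name and the data are hypothetical.

```python
import math


def equal_interval_groups(values, n_groups=None):
    """Split values into groups of equal width and count units per group."""
    if n_groups is None:
        # Assumption: Sturges' rule, n = 1 + 3.322 * lg(N).
        n_groups = round(1 + 3.322 * math.log10(len(values)))
    x_min, x_max = min(values), max(values)
    h = (x_max - x_min) / n_groups  # interval width (range / number of groups)
    if h == 0:
        raise ValueError("all values are equal; no grouping is possible")
    # The upper bound of interval i is the lower bound of interval i + 1.
    bounds = [x_min + i * h for i in range(n_groups + 1)]
    counts = [0] * n_groups
    for v in values:
        # The maximum value is assigned to the last interval.
        i = min(int((v - x_min) / h), n_groups - 1)
        counts[i] += 1
    return bounds, counts


# Hypothetical values of a characteristic for 20 enterprises.
values = list(range(1, 21))
bounds, counts = equal_interval_groups(values)
print(bounds)
print(counts)
```

In practice the step h would also be rounded, as described above, before the boundaries are written out.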


  • At the heart of any statistical research are three interrelated stages of work:

    1) statistical observation;

    2) summary and grouping of observational data;

    3) scientific processing and analysis of the summary results.

    Each subsequent stage of statistical research can be carried out only once the preceding stages have been completed.

    Statistical observation is the first stage of statistical research.

    Statistical observation is a systematic, scientifically organized collection of information about a particular set of social and, in particular, economic phenomena or processes.

    Statistical observations are very diverse and differ in the nature of the phenomena studied, the form of organization, the time of observation, and the completeness of coverage of the phenomena studied. For this reason, statistical observations are classified according to several characteristics.

    1. By form of organization, statistical observations are divided into reporting and specially organized statistical observations.

    Reporting is the main organizational form of statistical observation. It consists in collecting information from enterprises, institutions and organizations about various aspects of their activities on special forms called reports. Reporting is mandatory. Depending on the length of the period it covers, reporting is divided into basic and current.

    Basic reporting, also called annual reporting, contains the widest range of indicators, covering all aspects of the enterprise's activity.

    Current reporting is submitted during the year for time intervals of various lengths.

    However, there are data that are fundamentally impossible to obtain on the basis of reporting and data that are inappropriate to include in it. It is to obtain these two types of data that specially organized statistical observations are used - various surveys and censuses.

    Statistical surveys are specially organized observations in which the set of phenomena studied is observed over a certain period of time.

    A census is a form of specially organized statistical observation in which the set of phenomena studied is observed as of a certain date (at a certain moment).

    2. By time, all statistical observations are divided into continuous and discontinuous.

    Continuous (current) statistical observation is carried out without interruption in time. With this type of observation, individual phenomena, facts and events are recorded as they occur.

    Discontinuous statistical observation is an observation in which the observed phenomena, facts and events are recorded not continuously but at intervals of equal or unequal duration. There are two types of discontinuous observation: periodic and one-time. Periodic observation is discontinuous observation carried out at intervals of equal duration. One-time observation is carried out at intervals of unequal duration or only once.

    3. By completeness of coverage of the mass phenomena, facts and events studied, statistical observations are divided into complete and incomplete, or partial.

    Complete observation aims to take into account all phenomena, facts and events forming the population under study, without exception.

    Partial observation aims to take into account only a certain part of the phenomena, facts and events that form the population under study.

    Statistical observation consists in the collection of primary statistical material, in the scientifically organized registration of all significant facts related to the object under consideration. This is the first stage of any statistical research.

    The grouping method makes it possible to systematize and classify all the facts collected as a result of mass statistical observation. This is the second stage of the statistical study.

    The method of generalizing indicators makes it possible to characterize the studied phenomena and processes with the help of statistical values: absolute, relative and average. At this stage of the statistical study, the interrelations and scales of phenomena are revealed, the patterns of their development are determined, and predictive estimates are given.

    At the first stage of statistical research, the primary statistical data, or initial statistical information, are formed; they are the foundation of the future statistical building. For the building to be durable, solid and of high quality, its foundation must be the same. If an error is made in collecting the primary data, or the material turns out to be of poor quality, this will affect the correctness and reliability of both the theoretical and the practical conclusions. Therefore, statistical observation must be carefully thought out and clearly organized from the initial stage to the final one, the obtaining of the final materials.

    Statistical observation provides the raw material for generalization, which begins with a summary. Whereas statistical observation yields information about each unit of the population, characterizing it from many sides, the summary data characterize the entire statistical population and its individual parts. At this stage, the population is divided according to features of difference and combined according to features of similarity, and total indicators are calculated for the groups and for the population as a whole. Using the grouping method, the phenomena studied are divided into the most important types, characteristic groups and subgroups according to essential features. Groupings delimit populations that are qualitatively homogeneous in an essential respect, which is a prerequisite for defining and applying generalizing indicators.

    At the final stage of the analysis, generalizing indicators are used: relative and average values are calculated, a summary assessment of the variation of the characteristics is given, the dynamics of phenomena are characterized, indices and balance constructions are applied, and indicators characterizing the closeness of relationships between changing characteristics are calculated. For the most rational and clear presentation, the numerical material is presented in tables and graphs.

    Statistical observation - the first stage of statistical research

    Statistical observation is the first stage of any statistical research, which is a scientifically organized accounting of facts characterizing the phenomena and processes of social life, and the collection of mass data obtained on the basis of this accounting.

    However, not every collection of information is a statistical observation. One can talk about statistical observation only when statistical regularities are studied, i.e. those that appear only in a mass process, in a large number of units of some aggregate. Therefore, statistical observation should be planned, massive and systematic.

    The planned nature of statistical observation means that it is prepared and carried out according to a developed plan, which covers questions of methodology, organization, information collection techniques, control over the quality and reliability of the collected material, and presentation of the final results. The mass nature of statistical observation means that it covers a large number of cases in which the process manifests itself, sufficient to obtain reliable statistical data characterizing not only individual units but the population as a whole.

    Finally, the systematic nature of statistical observation is determined by the fact that it must be carried out either systematically, or continuously, or regularly. The study of trends and patterns of socio-economic processes characterized by quantitative and qualitative changes is possible only on this basis. From the foregoing, it follows that the following requirements are imposed on statistical observation:

    • 1) completeness of statistical data (completeness of coverage of units of the studied population, aspects of a particular phenomenon, as well as completeness of coverage over time);
    • 2) reliability and accuracy of data;
    • 3) their uniformity and comparability.

    Program-methodological and organizational issues of statistical observation

    Any statistical research must begin with a precise formulation of its purpose and specific tasks, and thus the information that can be obtained in the process of observation. After that, the object and unit of observation are determined, a program is developed, and the type and method of observation are selected.

    2.1 Scheme for conducting a statistical study

    Statistical data analysis systems are a modern and effective tool for statistical research. Both specialized statistical analysis systems and general-purpose tools such as Excel, Matlab and Mathcad offer extensive capabilities for processing statistical data.

    But even the most perfect tool cannot replace the researcher, who must formulate the purpose of the study, collect data, select methods, approaches, models and tools for data processing and analysis, and interpret the results.

    Figure 2.1 shows the scheme for conducting a statistical study.

    Fig. 2.1 - Scheme for conducting a statistical study

    The starting point of statistical research is the formulation of the problem. When determining it, the purpose of the study is taken into account, it is determined what information is needed and how it will be used in making a decision.

    The statistical study itself begins with a preparatory stage. During the preparatory stage, analysts study the terms of reference, a document compiled by the customer of the study. The terms of reference should clearly state the objectives of the study. In particular, the document:

      defines the object of study;

      lists the assumptions and hypotheses that must be confirmed or refuted during the study;

      describes how the results of the study will be used;

      sets the timeframe in which the study is to be conducted and the budget for the study.

    Based on the terms of reference, the structure of the analytical report is developed, that is, the form in which the results of the research are to be presented, together with the statistical observation program. The program is a list of features to be recorded during the observation process (or questions to which reliable answers must be obtained for each surveyed unit of observation). The content of the program is determined both by the characteristics of the observed object and the objectives of the study, and by the methods chosen by analysts for further processing of the collected information.

    The main stage of statistical research includes the collection of the necessary data and their analysis.

    The final stage of the study is the preparation of an analytical report and its provision to the customer.

    Fig. 2.2 shows a diagram of statistical data analysis.

    Fig.2.2 - The main stages of statistical analysis

    2.2 Collection of statistical information

    The collection of materials involves analyzing the terms of reference of the study, identifying sources of the necessary information and, if necessary, developing questionnaires. In studying the information sources, all the required data are divided into primary (data that are not yet available and must be collected directly for this study) and secondary (data previously collected for other purposes).

    The collection of secondary data is often referred to as "desk" or "library" research.

    Examples of primary data collection: observations of store visitors, surveys of hospital patients, discussion of a problem at a meeting.

    Secondary data is divided into internal and external.

    Examples of internal secondary data sources:

      information system of the organization (including the accounting subsystem, the sales management subsystem, the CRM system (short for Customer Relationship Management, application software that automates an organization's customer interaction strategies) and others);

      previous studies;

      written reports from employees.

    Examples of external secondary data sources:

      reports of statistical bodies and other state institutions;

      reports from marketing agencies, professional associations, etc.;

      electronic databases (address directories, GIS, etc.);

      libraries;

      mass media.

    The main outputs of the data collection phase are:

      planned sample size;

      sample structure (presence and size of quotas);

      type of statistical observation (survey, questionnaire, measurement, experiment, expert examination, etc.);

      information about the parameters of the survey (for example, whether any questionnaires may have been falsified);

      coding scheme for variables in the database of the program selected for processing;

      data transformation plan;

      plan of the statistical procedures to be used.

    This stage also includes the questioning procedure itself. Questionnaires, of course, are developed only to obtain primary information.

    The data received should be edited and prepared appropriately. Each questionnaire or observation form is checked and, if necessary, corrected. Each answer is assigned a numeric or alphabetic code, that is, the information is coded. Data preparation includes editing, transcription and validation of the data, coding, and any necessary transformations.
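The coding step described above, in which each answer receives a numeric code, can be sketched as follows. The answers, the helper names `build_codebook` and `encode`, and the choice of 0 as the code for answers missing from the codebook are all assumptions made for illustration.

```python
def normalize(answer):
    """Light editing before coding: trim spaces and ignore letter case."""
    return answer.strip().lower()


def build_codebook(answers):
    """Assign codes 1, 2, 3, ... to distinct answers in order of first appearance."""
    codebook = {}
    for answer in answers:
        key = normalize(answer)
        if key not in codebook:
            codebook[key] = len(codebook) + 1
    return codebook


def encode(answers, codebook, missing=0):
    """Replace each answer with its code; unknown answers get the `missing` code."""
    return [codebook.get(normalize(a), missing) for a in answers]


# Hypothetical raw answers to one questionnaire item.
raw = ["Yes", "no", " yes ", "No", "Don't know"]
codebook = build_codebook(raw)
coded = encode(raw, codebook)
print(codebook)
print(coded)
```

Keeping the codebook as a separate object matters because the same scheme has to be applied to every questionnaire and documented alongside the database.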

    2.3 Characterization of the sample

    As a rule, the data collected through statistical observation for statistical analysis constitute a sample. The sequence of data transformations in the course of statistical research can be represented schematically as follows (Fig. 2.3).

    Figure 2.3 Statistical Data Conversion Scheme

    Analyzing the sample, it is possible to draw conclusions about the general population represented by the sample.

    The final determination of the overall sample parameters is made once all questionnaires have been collected. It includes:

      determination of the real number of respondents,

      determination of the sample structure,

      distribution according to the place of the survey,

      establishing a confidence level of the statistical reliability of the sample,

      calculation of statistical error and determination of sample representativeness.

    The actual number of respondents may be larger or smaller than planned. The first case is better for the analysis but disadvantageous for the customer of the study. The second can adversely affect the quality of the study and is therefore unfavorable for both analysts and customers.

    The sample structure can be random or non-random (respondents selected on the basis of a previously known criterion, for example by the quota method). Random samples are considered representative a priori. Non-random samples may be deliberately unrepresentative of the general population yet still provide information important for the research. In this case, the filtering questions of the questionnaire, which are designed specifically to screen out unsuitable respondents, should also be considered carefully.

    To determine the accuracy of the estimate, the confidence level must first be set (95% or 99%). The maximum statistical error of the sample is then calculated as

    $$\Delta = t\sqrt{\frac{pq}{n}}$$

    or

    $$\Delta = t\sqrt{\frac{\sigma^2}{n}},$$

    where $n$ is the sample size, $p$ is the probability of the occurrence of the event under study (the respondent getting into the sample), $q$ is the probability of the reverse event (the respondent not being included in the sample), $t$ is the confidence coefficient, and $\sigma^2$ is the variance of the feature.

    Table 2.4 lists the most commonly used values of the confidence probability and the corresponding confidence coefficients.

    Table 2.4
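The calculation of the maximum sample error can be sketched as follows. The sketch assumes simple random sampling, the normal approximation, and no finite population correction; the error is taken as the confidence coefficient t times the square root of pq/n for a share, or of the variance divided by n for a mean. The function names and the example numbers are hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def confidence_coefficient(confidence):
    """Two-sided coefficient t such that P(|Z| <= t) equals the confidence level."""
    return NormalDist().inv_cdf(0.5 + confidence / 2)


def max_error_share(p, n, confidence=0.95):
    """Maximum error for a share: t * sqrt(p * q / n), with q = 1 - p."""
    return confidence_coefficient(confidence) * sqrt(p * (1 - p) / n)


def max_error_mean(variance, n, confidence=0.95):
    """Maximum error for a mean: t * sqrt(variance / n)."""
    return confidence_coefficient(confidence) * sqrt(variance / n)


# Hypothetical survey: 400 respondents, observed share 0.5.
for level in (0.95, 0.99):
    t = confidence_coefficient(level)
    print(level, round(t, 2), round(max_error_share(0.5, 400, level), 4))
```

Taking p = 0.5 gives the largest possible value of pq, so it is a conservative choice when the share is not known in advance.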

    2.5 Computer data processing

    Data analysis using a computer involves a number of necessary steps.

    1. Determination of the structure of the initial data.

    2. Entering data into a computer in accordance with their structure and program requirements. Editing and transformation of data.

    3. Setting the method of data processing in accordance with the objectives of the study.

    4. Obtaining the result of data processing. Editing and saving it in the desired format.

    5. Interpretation of the processing result.

    Steps 1 (preparatory) and 5 (final) cannot be performed by any computer program; the researcher carries them out himself. Steps 2-4 are performed by the researcher using the program, but it is the researcher who determines the necessary data editing and transformation procedures, the data processing methods, and the format for presenting the results. Ultimately, the help of the computer (steps 2-4) lies in the transition from a long sequence of numbers to a more compact one. At the “input”, the researcher supplies an array of initial data that is too large to comprehend directly but is suitable for computer processing (step 2). The researcher then instructs the program to process the data in accordance with the task and the data structure (step 3). At the “output”, he receives the result of processing (step 4): also an array of data, only a smaller one, accessible to comprehension and meaningful interpretation. An exhaustive analysis of the data usually requires processing them repeatedly with different methods.

    2.6 Choosing a data analysis strategy

    The choice of a strategy for analyzing the collected data is based on knowledge of the theoretical and practical aspects of the subject area under study, the specifics and known characteristics of information, the properties of specific statistical methods, as well as the experience and views of the researcher.

    It must be remembered that data analysis is not the ultimate goal of the study. Its purpose is to obtain information that will help solve a specific problem and make adequate management decisions. The choice of analysis strategy should begin with an examination of the results of the previous stages of the process: defining the problem and developing a research plan. As a "draft", a preliminary data analysis plan is used, developed as one of the elements of the study plan. Then, as additional information becomes available at subsequent stages of the research process, certain changes may need to be made.

    Statistical methods are divided into univariate and multivariate. Univariate techniques are used when all elements of the sample are evaluated by a single indicator, or when there are several indicators for each element but each variable is analyzed separately from the others.

    Multivariate techniques are appropriate when two or more indicators are used to evaluate each sample element and these variables are analyzed simultaneously. Such methods are used to determine dependencies between phenomena.

    Multivariate methods differ from univariate methods primarily in that they shift the focus from the levels (averages) and distributions (variances) of phenomena to the degree of relationship (correlation or covariance) between these phenomena.
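The difference in focus can be shown on a small example: univariate summaries describe each variable on its own (mean, variance), while the simplest multivariate summaries describe how two variables move together (covariance, correlation). The data below are hypothetical, and population (not sample) formulas are used for simplicity.

```python
from math import sqrt
from statistics import mean, pvariance


def covariance(x, y):
    """Population covariance of two equally long series."""
    mx, my = mean(x), mean(y)
    return sum((a - mx) * (b - my) for a, b in zip(x, y)) / len(x)


def correlation(x, y):
    """Pearson correlation: covariance scaled by both standard deviations."""
    return covariance(x, y) / sqrt(pvariance(x) * pvariance(y))


# Hypothetical data for five enterprises.
fixed_assets = [10, 20, 30, 40, 50]
production = [12, 25, 31, 46, 52]

# Univariate view: each variable separately.
print(mean(fixed_assets), pvariance(fixed_assets))
print(mean(production), pvariance(production))

# Multivariate view: the relationship between the variables.
print(covariance(fixed_assets, production))
print(correlation(fixed_assets, production))
```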

    Univariate methods can be classified according to whether the data being analyzed are metric or non-metric (Figure 3). Metric data are measured on an interval scale or a ratio scale. Non-metric data are measured on a nominal or ordinal scale.

    In addition, these methods are divided into classes based on how many samples - one, two or more - are analyzed during the studies.

    The classification of univariate statistical methods is presented in Figure 2.4.

    Fig. 2.4 - Classification of univariate statistical methods depending on the analyzed data

    The number of samples is determined by how the data are handled in a particular analysis, not by how the data were collected. For example, data on men and women can be obtained within the same sample, but if the analysis aims to reveal a difference in perception between the sexes, the researcher will have to work with two different samples. Samples are considered independent if they are not experimentally related to each other: measurements made in one sample do not affect the values of variables in the other. Data relating to different groups of respondents, such as those collected from women and men, are usually treated as independent samples.

    On the other hand, if the data for two samples refer to the same group of respondents, the samples are considered to be paired - dependent.

    If there is only one sample of metric data, the z-test and t-test can be used. If there are two or more independent samples, the two-sample z-test and t-test can be used for two samples, and one-way ANOVA for more than two. For two related samples, a paired t-test is used. For non-metric data from a single sample, the researcher can use frequency distribution tests, the chi-square test, the Kolmogorov-Smirnov (K-S) test, the runs test and the binomial test. For two independent samples with non-metric data, the following methods are available: the chi-square test, the Mann-Whitney test, the median test, the K-S test and Kruskal-Wallis one-way analysis of variance (K-W ANOVA). If there are two or more related samples, the sign, McNemar and Wilcoxon tests should be used.
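The selection rules in this paragraph can be written down as a small lookup, which makes the gaps visible as well. The function `suggest_tests` is a hypothetical helper that only restates the cases named in the text; for combinations the text does not cover it returns an empty list rather than guessing.

```python
def suggest_tests(metric, n_samples, related=False):
    """Return the univariate tests named in the text for a given situation."""
    if n_samples < 1:
        raise ValueError("n_samples must be at least 1")
    if metric:
        if n_samples == 1:
            return ["z-test", "t-test"]
        if related:
            # The text names only the two-sample related case for metric data.
            return ["paired t-test"] if n_samples == 2 else []
        if n_samples == 2:
            return ["two-sample z-test", "two-sample t-test"]
        return ["one-way ANOVA"]
    if n_samples == 1:
        return ["frequency distribution", "chi-square",
                "Kolmogorov-Smirnov", "runs test", "binomial test"]
    if related:
        return ["sign test", "McNemar test", "Wilcoxon test"]
    if n_samples == 2:
        return ["chi-square", "Mann-Whitney", "median test",
                "Kolmogorov-Smirnov", "Kruskal-Wallis one-way ANOVA"]
    return []  # more than two independent non-metric samples: not covered


print(suggest_tests(metric=True, n_samples=2, related=True))
print(suggest_tests(metric=False, n_samples=1))
```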

    Multivariate statistical methods are aimed at identifying existing patterns: the interdependence of variables, the relationship or sequence of events, interobject similarity.

    Quite conventionally, five standard types of patterns can be distinguished, the study of which is of significant interest: association, sequence, classification, clustering and forecasting.

    An association occurs when several events are related to each other. For example, a study conducted at a supermarket might show that 65% of those who buy corn chips also take Coca-Cola, and when there is a discount for such a set, they buy Coke in 85% of cases. Having information about such an association, it is easy for managers to assess how effective the discount provided is.
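A share such as the 65% in this example is what association analysis calls the confidence of a rule: among the purchases that contain the first item, the fraction that also contain the second. A minimal sketch on hypothetical baskets (the data are constructed so that the share comes out at 65%; they are not from any real study):

```python
def support(transactions, items):
    """Share of all transactions that contain every item in `items`."""
    hits = sum(1 for t in transactions if items <= t)
    return hits / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Among transactions with `antecedent`, the share that also has `consequent`."""
    with_antecedent = [t for t in transactions if antecedent in t]
    if not with_antecedent:
        return None  # the rule cannot be evaluated
    both = sum(1 for t in with_antecedent if consequent in t)
    return both / len(with_antecedent)


# Hypothetical baskets: 20 contain chips, 13 of those also contain cola.
baskets = [{"chips", "cola"}] * 13 + [{"chips"}] * 7 + [{"bread"}] * 5

print(confidence(baskets, "chips", "cola"))
print(support(baskets, {"chips", "cola"}))
```

Comparing the confidence computed on purchases with and without the discount is one simple way to judge the effect the paragraph describes.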

    If there is a chain of events connected in time, then one speaks of a sequence. So, for example, after buying a house in 45% of cases, a new stove is also purchased within a month, and within two weeks, 60% of newcomers acquire a refrigerator.

    With the help of classification, signs are revealed that characterize the group to which this or that object belongs. This is done by analyzing already classified objects and formulating a certain set of rules.

    Clustering differs from classification in that the groups themselves are not predetermined. With the help of clustering, various homogeneous groups of data are distinguished.

    The basis for all kinds of forecasting systems is historical information stored in the form of time series. If it is possible to find patterns that adequately reflect the dynamics of the behavior of target indicators, it is likely that with their help it is possible to predict the behavior of the system in the future.

    Multivariate statistical methods can be divided into relationship analysis methods and classification analysis (Fig. 2.5).

    Fig.2.5 - Classification of multivariate statistical methods

    1. STAGES OF STATISTICAL RESEARCH

    The process of studying socio-economic phenomena through a system of statistical methods and quantitative characteristics - a system of indicators, is called statistical research.

    The main stages of statistical research are:

    1) statistical observation;

    2) summary of received data;

    3) statistical analysis.

    If necessary, a statistical study may contain an additional stage - a statistical forecast.

    Statistical observation is a scientifically organized collection of data on the phenomena and processes of social life by registering their essential features according to a pre-developed program of observation. Observation data are primary statistical information about the observed objects, which is the basis for obtaining their general characteristics. Observation is both one of the main methods of statistics and one of the most important stages of statistical research.

    Conducting a statistical study is impossible without a high-quality information base obtained in the course of statistical observation. Therefore, ever since statistics ceased to be regarded as a purely descriptive science, special rules for conducting observation and special requirements for its results, the statistical data, have been developed.

    Observation is the first stage of statistical research, the quality of which determines the achievement of the final objectives of the study.

    1.1. Observation is carried out according to a specially prepared program.

    The program includes a list of characteristics of the object of study, data on which must be obtained as a result of observation.

    When preparing an observation, it is necessary to determine in advance:

    1. An observation program in which:

    a) the object of observation is defined, i.e. the set of units of the phenomenon to be investigated. The unit of observation must be distinguished from the reporting unit. The reporting unit is the unit that supplies the statistical data; it may consist of several population units or coincide with a population unit. For example, in a population survey the unit of observation might be a household member and the reporting unit the household.

    b) the boundaries of the object of observation are determined.

    c) the characteristics of the object of observation are determined, information about which must be obtained as a result of observation.

    2. Time of observation of an object - the time at which or for which information about the object under study is recorded.

    3. Timing of the observation. That is, the time period for data collection and the end date of the observation are determined. The terms of observation affect the time of completion of the statistical study as a whole and the timeliness of its conclusions.

    4. Means and resources needed for monitoring: the number of qualified specialists; material resources; means of processing the results of observation.

    5. Requirements for statistical data. The main requirements are: a) reliability, i.e. information about the object of study should reflect its real state at the time of observation; b) comparability of data, i.e. information obtained as a result of observation should be comparable, which is ensured by a unified methodology for collecting and analyzing data, by units of measurement, etc.

    1.2. There are several types of statistical observation.

    1. By coverage of population units:

    a) complete (covering all units of the population);

    b) incomplete (sample, monographic, main-array method).

    2. By the time of registration of facts: a) current (continuous); b) discontinuous (periodic, one-time)

    3. According to the method of collecting information: a) direct observation; b) documentary observation; c) survey (questionnaire, correspondent, etc.)

    Summary is the process of bringing the collected data into a system, processing them, calculating intermediate and overall totals, and calculating interrelated analytical values.

    The next stage of the statistical study is the preparation of the information obtained during the observation for analysis. This stage is called summary.

    The summary includes:

    - systematization of the information obtained during observation;

    - grouping of this information;

    - development of a system of indicators characterizing the groups formed;

    - creation of working tables for the grouped data;

    - calculation of derived values from the working tables.

    In the literature on the theory of statistics, summary and grouping are often treated as independent stages of research. However, the concept of a summary includes the grouping of statistical data, so here "summary" is adopted as the name of the research stage.

    Statistical analysis is the study of the characteristic features of the structure and interrelations of phenomena, and of the trends and patterns in the development of socio-economic phenomena, using specific economic-statistical and mathematical-statistical methods. Statistical analysis concludes with the interpretation of the results obtained.

    Statistical forecast - scientific identification of the state and probable ways of development of phenomena and processes, based on a system of established cause-and-effect relationships and patterns.

    TASK 1

    A sample survey of the wages of 60 employees of an industrial enterprise produced the following data (Table 1).

    Build an interval series of distribution according to the resultant attribute, forming five groups with equal intervals.

    Determine the main indicators of variation (variance, standard deviation, coefficient of variation), the power mean (the mean value of the attribute) and the structural means. Depict the series graphically as: a) a histogram; b) a cumulate; c) an ogive. Draw a conclusion.

    SOLUTION

    1. Determine the range of variation of the attribute (length of service) by the formula:

    $$R = x_{max} - x_{min} = 36 - 5 = 31$$

    where $x_{max}$ is the maximum value of the attribute;

    $x_{min}$ is the minimum value of the attribute.

    2. Determine the width of the interval for $n = 5$ groups:

    $$i = \frac{R}{n} = \frac{31}{5} = 6.2$$

    Using this interval width, we group the observations into five intervals.

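    3. Build an auxiliary table (f_i is the frequency, ω_i the share of the total, S_i the accumulated share, x_i the interval midpoint, x̄ = 14.9 the mean).

    Group | Interval | Values in the group | f_i | ω_i, % | S_i, %
    I | 5 - 11.2 | 6, 8, 7, 5, 8, 6, 10, 9, 9, 7, 6, 6, 9, 10, 7, 9, 10, 10, 11, 8, 9, 8, 7, 6, 9, 10 | 26 | 43.3 | 43.3
    II | 11.2 - 17.4 | 16, 15, 13, 12, 14, 14, 12, 14, 17, 13, 15, 17, 14 | 13 | 21.7 | 65.0
    III | 17.4 - 23.6 | 18, 21, 20, 20, 21, 18, 19, 22, 21, 21, 21, 18, 19 | 13 | 21.7 | 86.7
    IV | 23.6 - 29.8 | 28, 29, 25, 28, 24 | 5 | 8.3 | 95.0
    V | 29.8 - 36 | 36, 35, 33 | 3 | 5.0 | 100.0
    TOTAL | | | 60 | 100.0 |

    Group | x_i | x_i * f_i | x_i * ω_i | x_i - x̄ | (x_i - x̄)^2 | (x_i - x̄)^2 * f_i
    I | 8.1 | 210.6 | 350.73 | -6.8 | 46.24 | 1202.24
    II | 14.3 | 185.9 | 310.31 | -0.6 | 0.36 | 4.68
    III | 20.5 | 266.5 | 444.85 | 5.6 | 31.36 | 407.68
    IV | 26.7 | 133.5 | 221.61 | 11.8 | 139.24 | 696.2
    V | 32.9 | 98.7 | 164.5 | 18.0 | 324.00 | 972.00
    TOTAL | | 895.2 | 1492 | | 541.2 | 3282.8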
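
    The grouping in the auxiliary table can be reproduced with a short script. The following is a minimal Python sketch, assuming the 60 length-of-service values are exactly those listed group by group in the table; the function name interval_series is hypothetical and not part of the original solution.

```python
# Length-of-service values, copied group by group from the auxiliary table.
values = [
    6, 8, 7, 5, 8, 6, 10, 9, 9, 7, 6, 6, 9, 10, 7, 9, 10, 10, 11, 8, 9, 8, 7, 6, 9, 10,
    16, 15, 13, 12, 14, 14, 12, 14, 17, 13, 15, 17, 14,
    18, 21, 20, 20, 21, 18, 19, 22, 21, 21, 21, 18, 19,
    28, 29, 25, 28, 24,
    36, 35, 33,
]


def interval_series(data, groups):
    """Return (lower, upper, frequency) tuples for equal-width intervals."""
    x_min, x_max = min(data), max(data)
    width = (x_max - x_min) / groups  # i = R / n
    counts = [0] * groups
    for x in data:
        # The maximum value is assigned to the last interval.
        k = min(int((x - x_min) / width), groups - 1)
        counts[k] += 1
    return [
        (x_min + k * width, x_min + (k + 1) * width, counts[k])
        for k in range(groups)
    ]


series = interval_series(values, 5)
for lower, upper, f in series:
    print(f"{lower:5.1f} - {upper:5.1f}: {f}")
```

    With five groups the interval width is 6.2, and the frequencies come out as 26, 13, 13, 5 and 3, which sum to the 60 surveyed employees.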

    4. The mean value of the attribute in the studied population is determined by the weighted arithmetic mean formula:

    $$\bar{x} = \frac{\sum x_i f_i}{\sum f_i} = \frac{895.2}{60} \approx 14.9 \text{ years}$$

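    5. The variance and the standard deviation of the attribute are determined by the formulas:

    $$\sigma^2 = \frac{\sum (x_i - \bar{x})^2 f_i}{\sum f_i} = \frac{3282.8}{60} \approx 54.713$$

    $$\sigma = \sqrt{\sigma^2} = \sqrt{54.713} \approx 7.397$$

    The coefficient of variation is:

    $$V = \frac{\sigma}{\bar{x}} \cdot 100\% = \frac{7.397}{14.9} \cdot 100\% \approx 49.6\%$$

    Thus V > 33.3%, so the population is heterogeneous.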
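
    The calculations in steps 4 and 5 can be checked numerically. The sketch below assumes the interval midpoints and frequencies implied by the auxiliary table (the frequencies are the counts of the values listed in each group); the variable names are illustrative only. It uses the unrounded mean, so the last digits can differ slightly from a hand calculation based on 14.9.

```python
from math import sqrt

midpoints = [8.1, 14.3, 20.5, 26.7, 32.9]  # interval midpoints x_i
freqs = [26, 13, 13, 5, 3]                 # frequencies f_i

n = sum(freqs)

# Weighted arithmetic mean: sum(x_i * f_i) / sum(f_i)
mean = sum(x * f for x, f in zip(midpoints, freqs)) / n

# Weighted variance: sum((x_i - mean)^2 * f_i) / sum(f_i)
variance = sum((x - mean) ** 2 * f for x, f in zip(midpoints, freqs)) / n

std_dev = sqrt(variance)

# Coefficient of variation, in percent
cv = std_dev / mean * 100

print(f"mean = {mean:.2f}, variance = {variance:.3f}, "
      f"std = {std_dev:.3f}, V = {cv:.1f}%")
```

    A coefficient of variation above 33.3% is the threshold the text uses to call the population heterogeneous.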

    6. Determine the mode.

    The mode is the value of the attribute that occurs most frequently in the studied population. In an interval variational series the mode is calculated by the formula:

    $$Mo = x_{Mo} + i_{Mo} \cdot \frac{f_{Mo} - f_{Mo-1}}{(f_{Mo} - f_{Mo-1}) + (f_{Mo} - f_{Mo+1})}$$

    where $x_{Mo}$ is the lower limit of the modal interval;

    $i_{Mo}$ is the width of the modal interval;

    $f_{Mo}$, $f_{Mo-1}$, $f_{Mo+1}$ are the frequencies of the modal, premodal and postmodal intervals, respectively.

    The modal interval is the interval with the highest frequency. In our problem this is the first interval (5 - 11.2), so there is no premodal interval and $f_{Mo-1} = 0$:

    $$Mo = 5 + 6.2 \cdot \frac{26 - 0}{(26 - 0) + (26 - 13)} \approx 9.13$$
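
    As a check, the mode of the interval series can be computed directly. This is a minimal sketch assuming the standard textbook formula for the mode of a grouped series and the frequencies 26 and 13 counted from the first two groups of the auxiliary table; interval_mode is a hypothetical helper name.

```python
def interval_mode(lower, width, f_mode, f_prev, f_next):
    """Mode of an interval series with equal-width intervals."""
    return lower + width * (f_mode - f_prev) / (
        (f_mode - f_prev) + (f_mode - f_next)
    )


# The first interval (5 - 11.2) is modal, so there is no preceding
# interval and its frequency is taken as 0.
mode = interval_mode(lower=5, width=6.2, f_mode=26, f_prev=0, f_next=13)
print(round(mode, 2))  # 9.13
```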


    7. Calculate the median.

    The median is the value located in the middle of an ordered variational series; it divides the series into two equal parts, so that half of the population units have attribute values below the median and half above it.

    In an interval series the median is determined by the formula:

    $$Me = x_{Me} + i_{Me} \cdot \frac{\frac{1}{2}\sum f_i - S_{Me-1}}{f_{Me}}$$

    where $x_{Me}$ is the lower limit of the median interval;

    $i_{Me}$ is the width of the median interval;

    $f_{Me}$ is the frequency of the median interval;

    $S_{Me-1}$ is the accumulated frequency of the interval preceding the median interval.

    The median interval is the interval that contains the ordinal number of the median. To find it, accumulate the frequencies until the sum exceeds half of the population.

    From the accumulated-frequency column of the auxiliary table we find the first interval whose accumulated frequency exceeds 50%. This is the second interval, from 11.2 to 17.4, and it is the median interval.

    Then

    $$Me = 11.2 + 6.2 \cdot \frac{30 - 26}{13} \approx 13.108$$

    Consequently, half of the employees have less than 13.108 years of work experience, and half have more.
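
    The median can be verified the same way. The sketch below assumes the standard grouped-data median formula and takes the second interval of the auxiliary table (11.2 to 17.4, frequency 13, with 26 observations accumulated before it) as the median interval; interval_median is a hypothetical helper name. The result agrees with the value 13.108 quoted in the conclusion of this task.

```python
def interval_median(lower, width, n, cum_before, f_median):
    """Median of an interval series.

    lower      - lower limit of the median interval
    width      - width of the median interval
    n          - total number of observations
    cum_before - accumulated frequency before the median interval
    f_median   - frequency of the median interval
    """
    return lower + width * (n / 2 - cum_before) / f_median


median = interval_median(lower=11.2, width=6.2, n=60, cum_before=26, f_median=13)
print(round(median, 3))  # 13.108
```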

    8. Depict the series as a polygon, a histogram, a cumulate and an ogive.

    Graphical representation plays an important role in the study of variational series, as it allows in a simple and visual form to analyze statistical data.

    There are several ways to graphically represent series (histogram, polygon, cumulate, ogive), the choice of which depends on the purpose of the study and on the type of variation series.

    The distribution polygon is mainly used to display a discrete series, but a polygon can also be built for an interval series if it is first reduced to a discrete one. The distribution polygon is a closed broken line in a rectangular coordinate system with vertices (x_i, q_i), where x_i is the i-th value of the attribute and q_i is the frequency or relative frequency of that value.

    A distribution histogram is used to display an interval series. To build a histogram, segments equal to the intervals of the attribute are laid off successively on the horizontal axis, and on these segments, as bases, rectangles are built whose heights equal the frequencies or relative frequencies for a series with equal intervals, or the distribution densities for a series with unequal intervals.


    The cumulate is a graphical representation of a variational series in which the accumulated frequencies or relative frequencies are plotted on the vertical axis and the values of the attribute on the horizontal axis. The cumulate is used to represent both discrete and interval variational series.
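
    The points of the cumulate follow directly from the frequencies. The sketch below is a minimal illustration, assuming the interval limits and frequencies of the auxiliary table in Task 1; it only computes the coordinates and leaves the actual plotting to any charting tool.

```python
from itertools import accumulate

lower_limit = 5.0
upper_limits = [11.2, 17.4, 23.6, 29.8, 36.0]  # upper limits of the intervals
freqs = [26, 13, 13, 5, 3]                      # frequencies f_i

# Accumulated frequencies S_i
cumulative = list(accumulate(freqs))

# Accumulated shares of the total, in percent
total = cumulative[-1]
cumulative_share = [round(s / total * 100, 1) for s in cumulative]

# Cumulate: accumulated frequency plotted against the upper limit of each
# interval, starting from zero at the lower limit of the first interval.
cumulate_points = [(lower_limit, 0)] + list(zip(upper_limits, cumulative))

for x, s in cumulate_points:
    print(f"x = {x:5.1f}, S = {s}")
```

    The accumulated shares reproduce the S_i column of the auxiliary table: 43.3, 65.0, 86.7, 95.0 and 100.0%.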


    Conclusion: the main indicators of variation of the studied series were calculated. The mean value of the attribute (work experience) is 14.9 years, the variance is 54.713 and the standard deviation is 7.397. The mode is 9.13; the modal interval is the first interval of the series. The median is 13.108, which means that half of the employees of the organization have less than 13.108 years of work experience and half have more.

    TASK 2

    We have the following initial data characterizing the dynamics for 1997-2001. (table 2).

    Table 2 Initial data

    Year | 1997 | 1998 | 1999 | 2000 | 2001
    Production of granulated sugar, thousand tons | 1620 | 1660 | 1700 | 1680 | 1700

    Determine the main indicators of a series of dynamics. Present the calculation in the form of a table. Calculate the average annual values ​​of indicators. In the form of a graphic image - a polygon, indicate the dynamics of the analyzed indicator. Make a conclusion.

    SOLUTION

    Given:

    Year | 1997 | 1998 | 1999 | 2000 | 2001
    Production of granulated sugar, thousand tons | 1620 | 1660 | 1700 | 1680 | 1700

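    1) The average level of the series is calculated by the formula:

    $$\bar{y} = \frac{\sum y_i}{n} = \frac{1620 + 1660 + 1700 + 1680 + 1700}{5} = \frac{8360}{5} = 1672 \text{ thousand tons}$$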


    2) We calculate the chain and base indicators as follows:

    1. Absolute growth (base and chain):

    $$A_{ib} = y_i - y_0$$

    $$A_{ic} = y_i - y_{i-1}$$

    2. Growth rate, %:

    $$T_{rb} = \frac{y_i}{y_0} \cdot 100$$

    $$T_{rc} = \frac{y_i}{y_{i-1}} \cdot 100$$

    3. Rate of increase, %:

    $$T_{nrb} = T_{rb} - 100$$

    $$T_{nrc} = T_{rc} - 100$$

    4. Average absolute growth:

    $$\bar{A} = \frac{y_n - y_0}{n_c} = \frac{1700 - 1620}{4} = 20 \text{ thousand tons}$$

    where $y_n$ is the final level of the dynamic series;

    $y_0$ is the initial level of the dynamic series;

    $n_c$ is the number of chain absolute increments.

    5. Average annual growth rate:

    $$\bar{T}_r = \sqrt[n_c]{\frac{y_n}{y_0}} \cdot 100 = \sqrt[4]{\frac{1700}{1620}} \cdot 100 \approx 101.2\%$$

    6. Average annual rate of increase:

    $$\bar{T}_{nr} = \bar{T}_r - 100 \approx 1.2\%$$

    3) Absolute value of a 1% increase:

    $$A_{1\%} = \frac{y_{i-1}}{100}$$
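
    The indicators defined above are straightforward to compute for the sugar-production series. The following is a minimal sketch under two assumptions: the average level of the series is the simple arithmetic mean of the annual levels, and the average annual growth rate is the geometric mean of the chain growth rates (the standard textbook form). All variable names are illustrative.

```python
levels = {1997: 1620, 1998: 1660, 1999: 1700, 2000: 1680, 2001: 1700}

years = sorted(levels)
y = [levels[t] for t in years]
y0 = y[0]  # base (initial) level

rows = []
for i in range(1, len(y)):
    rate_base = round(y[i] / y0 * 100, 1)
    rate_chain = round(y[i] / y[i - 1] * 100, 1)
    rows.append({
        "year": years[i],
        "abs_base": y[i] - y0,                       # base absolute growth
        "abs_chain": y[i] - y[i - 1],                # chain absolute growth
        "rate_base": rate_base,                      # base growth rate, %
        "rate_chain": rate_chain,                    # chain growth rate, %
        "incr_base": round(rate_base - 100, 1),      # base rate of increase, %
        "incr_chain": round(rate_chain - 100, 1),    # chain rate of increase, %
        "one_percent": y[i - 1] / 100,               # value of a 1% increase
    })

n_chain = len(y) - 1                                   # number of chain increments
avg_level = sum(y) / len(y)                            # average level of the series
avg_abs_growth = (y[-1] - y0) / n_chain                # average absolute growth
avg_growth_rate = (y[-1] / y0) ** (1 / n_chain) * 100  # average growth rate, %
avg_increase_rate = avg_growth_rate - 100              # average rate of increase, %

for row in rows:
    print(row)
print(avg_level, avg_abs_growth, round(avg_growth_rate, 2))
```

    Storing one dictionary per year mirrors the layout of the summary table, where each column is a year and each row an indicator.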

    All calculated indicators are summarized in a table.

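    Indicator | 1997 | 1998 | 1999 | 2000 | 2001
    1. Production of granulated sugar, thousand tons | 1620 | 1660 | 1700 | 1680 | 1700
    2. Absolute growth, thousand tons: Aib | - | 40 | 80 | 60 | 80
    Aic | - | 40 | 40 | -20 | 20
    3. Growth rate, %: Trb | - | 102.5 | 104.9 | 103.7 | 104.9
    Trc | - | 102.5 | 102.4 | 98.8 | 101.2
    4. Rate of increase, %: Tnrb | - | 2.5 | 4.9 | 3.7 | 4.9
    Tnrc | - | 2.5 | 2.4 | -1.2 | 1.2
    5. Absolute value of a 1% increase, thousand tons | - | 16.2 | 16.6 | 17.0 | 16.8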

    5) Average annual value


    7. Draw graphically as a polygon.


    Thus, the following is obtained. The largest absolute and relative increase in sugar production relative to the base year was reached in 1999, when production amounted to 1700 thousand tons: the absolute increase compared to the base year 1997 was 80 thousand tons, the base growth rate was 104.9% and the base rate of increase was 4.9%. The largest chain absolute increases were in 1998 and 1999, 40 thousand tons each. The highest chain growth rate was observed in 1998 (102.5%) and the lowest in 2000 (98.8%).

    TASK 3

    There is data on the sale of goods (see table 3)

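    Table 3 Initial data on the sale of goods

    Product | Base year: quantity | Base year: price | Reporting year: quantity | Reporting year: price
    N | … | … | … | …
    O | 1100 | … | 1000 | …
    P | … | … | … | …
    R | … | 1350 | … | 1300
    S | … | … | … | …
    T | 1650 | … | 1700 | …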

    Determine: a) the individual indices (i_p, i_q); b) the general indices (I_p, I_q, I_pq); c) the absolute change in turnover due to: 1) the quantity of goods; 2) prices.

    Make a conclusion based on the calculated indicators.

    SOLUTION

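    Let's build an auxiliary table (q0, p0 are the base-year quantity and price; q1, p1 are the reporting-year quantity and price).

    Product | q0 | p0 | q1 | p1 | q0 * p0 | q1 * p1 | i_q = q1 / q0 | i_p = p1 / p0 | q1 * p0
    N | … | … | … | … | 44000 | 35000 | 0.875 | 0.909 | 38500
    O | 1100 | … | 1000 | … | 41800 | 40000 | 0.909 | 1.053 | 38000
    P | … | … | … | … | 7500 | 8400 | 1.200 | 0.933 | 9000
    R | … | 1350 | … | 1300 | 40500 | 26000 | 0.667 | 0.963 | 27000
    S | … | … | … | … | 45000 | 44000 | 1.100 | 0.889 | 49500
    T | 1650 | … | 1700 | … | 26400 | 25500 | 1.030 | 0.938 | 27200
    TOTAL | | | | | 205200 | 178900 | | | 189200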
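
    The general indices and the decomposition of the change in turnover follow from the three column totals of the auxiliary table. The sketch below assumes the usual aggregate forms (turnover index, physical volume index at base prices, price index at reporting-year quantities); the variable names are illustrative. The resulting absolute changes match the figures -26300, -16000 and -10300 quoted in the conclusion.

```python
# Turnover products, one entry per product in the order of the auxiliary table.
q0p0 = [44000, 41800, 7500, 40500, 45000, 26400]  # base quantity * base price
q1p1 = [35000, 40000, 8400, 26000, 44000, 25500]  # reporting quantity * reporting price
q1p0 = [38500, 38000, 9000, 27000, 49500, 27200]  # reporting quantity * base price

sum_q0p0 = sum(q0p0)
sum_q1p1 = sum(q1p1)
sum_q1p0 = sum(q1p0)

# General indices
I_pq = sum_q1p1 / sum_q0p0  # turnover index
I_q = sum_q1p0 / sum_q0p0   # physical volume index (base prices)
I_p = sum_q1p1 / sum_q1p0   # price index (reporting-year quantities)

# Absolute change in turnover and its decomposition
delta_total = sum_q1p1 - sum_q0p0  # total change
delta_q = sum_q1p0 - sum_q0p0      # due to quantities
delta_p = sum_q1p1 - sum_q1p0      # due to prices

print(f"I_pq = {I_pq:.3f}, I_q = {I_q:.3f}, I_p = {I_p:.3f}")
print(delta_total, delta_q, delta_p)
```

    Two identities serve as a check: the turnover index equals the product of the volume and price indices, and the total change equals the sum of the two partial changes.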


    Conclusion: the total change in turnover over the year amounted to -26300 conventional units, of which -16000 is due to the change in the quantity of goods sold and -10300 to the change in prices. The general turnover index is 87.2%. According to the individual quantity indices, sales increased for product P (index 120%), for product S (110%) and, slightly, for product T (103%). Sales of product R fell most, to 66.7% of the base-year level; sales of product N amounted to 87.5% and sales of product O to 90.9% of the base-year level. The individual price indices show that the price rose only for product O (index 105.3%); for the other products N, P, R, S and T prices fell, with indices of 90.9%, 93.3%, 96.3%, 88.9% and 93.8%, respectively.

    The general index of the physical volume of sales is 92.2%, i.e. the volume of sales decreased by 7.8%; the general price index is 94.6%, i.e. prices fell by 5.4% on average; and the general turnover index is 87.2%, i.e. turnover decreased by 12.8%.

    TASK 4

    From the initial data of Table 1 (rows 14 to 23), carry out a correlation and regression analysis of two attributes, length of service and wages, and determine the correlation and determination coefficients. Plot the relationship between the two attributes (resultant and factor). Draw a conclusion.

    SOLUTION

    Initial data

    Production experience | Salary
    … | 1800
    … | 2500
    … | 1750
    … | 1580
    … | 1750
    … | 1560
    … | 1210
    … | 1860
    … | 1355
    … | 1480

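    Linear dependence:

    $$\hat{y}_x = a_0 + a_1 x$$

    The parameters of the equation are determined by the least squares method from the system of normal equations:

    $$\begin{cases} n a_0 + a_1 \sum x = \sum y \\ a_0 \sum x + a_1 \sum x^2 = \sum xy \end{cases}$$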


    To solve the system, we use the method of determinants.
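
    Solving the normal equations by determinants (Cramer's rule) can be sketched as follows. This is an illustration only: the symbols a0 and a1 for the intercept and slope, the helper name fit_line and the small data set are assumptions made for the example and are not taken from the task.

```python
def fit_line(xs, ys):
    """Least-squares line y = a0 + a1 * x via Cramer's rule.

    Solves the normal equations
        n * a0      + a1 * sum(x)   = sum(y)
        a0 * sum(x) + a1 * sum(x^2) = sum(x * y)
    """
    n = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))

    det = n * sxx - sx * sx  # main determinant of the system
    if det == 0:
        raise ValueError("all x values are equal; the line is not determined")

    a0 = (sy * sxx - sx * sxy) / det  # determinant with the first column replaced
    a1 = (n * sxy - sx * sy) / det    # determinant with the second column replaced
    return a0, a1


# Hypothetical data lying exactly on the line y = 1 + 2x.
a0, a1 = fit_line([1, 2, 3, 4], [3, 5, 7, 9])
print(a0, a1)  # 1.0 2.0
```

    For a two-parameter line the determinants can be written out by hand, which is why the method is convenient in manual calculations.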

    Parameters are calculated by formulas