Finally, smart data strategies always start by analyzing internal data sets before integrating them with public or external sources. Storing and processing data for its own sake is not advantageous: given the amount of data generated daily, noise grows more quickly than signal. Pareto's 80/20 rule applies: roughly 80% of a phenomenon can probably be explained by 20% of the data an organization already owns (COREA, 2019).
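As a rough illustration of the 80/20 idea, the following Python sketch ranks data sources by how much each one contributes to explaining a phenomenon and stops once a target share is reached. The function name `pareto_cutoff`, the source names, and the contribution scores are all hypothetical values chosen for illustration; in practice the scores would have to come from an actual analysis (for example, feature importance or explained variance).

```python
def pareto_cutoff(contributions, threshold=0.8):
    """Return the top-ranked sources whose combined share reaches `threshold`.

    `contributions` maps a source name to a non-negative score
    (hypothetical here; it would come from a real analysis).
    """
    total = sum(contributions.values())
    if total <= 0:
        raise ValueError("contributions must sum to a positive value")

    # Rank sources from the largest contribution to the smallest.
    ranked = sorted(contributions.items(), key=lambda kv: kv[1], reverse=True)

    selected, running = [], 0
    for name, value in ranked:
        selected.append(name)
        running += value
        if running / total >= threshold:
            break
    return selected, running / total


# Hypothetical scores for ten data sources (they sum to 100).
scores = {
    "internal_sales": 50, "internal_crm": 30, "web_logs": 4, "social_media": 4,
    "open_data": 3, "weather": 3, "surveys": 2, "partners": 2,
    "news": 1, "sensors": 1,
}

selected, share = pareto_cutoff(scores)
print(selected, share)
```

With these made-up numbers, two of the ten sources (20%) account for 80% of the total, which is the pattern the rule describes. Real data will rarely split this cleanly.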
The general objective of this study was to present the data acquisition aspect of data science. An analysis of the aspects relevant to data management shows that several factors can influence how insights are obtained from the data.
Among the factors to consider when managing data in data science, more data may imply higher costs without necessarily yielding greater precision in the models built from it. In addition, a large number of variables can increase the complexity of a model without necessarily increasing its accuracy or efficiency. Data quality is also of fundamental importance, since low-quality data can invalidate the results obtained.
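To make the point about data quality more concrete, here is a minimal Python sketch that counts missing values and duplicate rows before any modeling is attempted. The function `quality_report`, the field names, and the sample records are assumptions introduced for illustration only; a real project would add checks specific to its own data.

```python
def quality_report(records, required_fields):
    """Count rows, missing values per field, and duplicate rows.

    `records` is a list of dicts. A value counts as missing when it is
    None or an empty string. Two rows are duplicates when they agree on
    every field in `required_fields`.
    """
    missing = {field: 0 for field in required_fields}
    seen = set()
    duplicates = 0

    for record in records:
        for field in required_fields:
            value = record.get(field)
            if value is None or value == "":
                missing[field] += 1

        # Use the tuple of required values as the row's identity.
        key = tuple(record.get(field) for field in required_fields)
        if key in seen:
            duplicates += 1
        else:
            seen.add(key)

    return {"rows": len(records), "missing": missing, "duplicates": duplicates}


# Hypothetical sample data with one empty region, one missing revenue,
# and one repeated row.
records = [
    {"id": 1, "region": "north", "revenue": 120.0},
    {"id": 2, "region": "", "revenue": 95.5},
    {"id": 3, "region": "south", "revenue": None},
    {"id": 1, "region": "north", "revenue": 120.0},
]

report = quality_report(records, ["id", "region", "revenue"])
print(report)
```

A report like this does not decide whether the data is fit for purpose; it only surfaces problems that a person then has to judge.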
Therefore, it is important to keep in mind that asking the right question and interpreting the results remain tasks for the human brain. Neglecting this can harm organizations, rendering useless the investments made in technology to obtain insights from large quantities of data. Human judgment is also essential to assess factors such as the adequacy of the data, the nature of the data, and the time and cost required to acquire it. Techniques for storing and processing big data advance constantly, so the aspects considered important here may change over time. Future work can therefore present new advances and innovations in data management in the context of data science.