As a data developer, you may already have a clear picture of the concept and process of data analysis. However, although data analysis and statistical analysis have much in common, there are also important differences between them that you should understand.
This article is taken from the book Statistics for Data Science by James D. Miller. This book takes you through an entire journey of statistics, from knowing very little to becoming comfortable in using various statistical methods for data science tasks.
In this article, we’ve broken things into the following topics:
Statistical analysis is often described as the part of a statistical project that involves collecting and scrutinizing a data source in an effort to identify trends within the data.
With data analysis, the goal is to validate that the data is appropriate for a need, and with statistical analysis, the goal is to make sense of, and draw some inferences from, the data.
There is a wide range of possible statistical analysis techniques or approaches that can be considered.
It is worth mentioning some key points for ensuring a successful (or at least productive) statistical analysis effort.
Some interesting advice on ensuring success with statistical projects includes the following quote:
When asked about the objectives of statistical analysis, one often refers to the process of describing or establishing the nature of a data source.
Establishing the nature of something implies gaining an understanding of it. This understanding can be simple or complex. For example, can we determine the type of each of the variables or components found within our data source: is it quantitative, comparative, or qualitative?
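To make the question of variable types concrete, here is a minimal Python sketch. It is not taken from the book: the `classify_variable` helper, the sample `records`, and the simple rule it applies (all-numeric values count as quantitative, everything else as qualitative) are assumptions made purely for illustration, and it does not attempt to detect comparative variables.

```python
from numbers import Number


def classify_variable(values):
    """Label a column 'quantitative' if every non-missing value is numeric,
    otherwise 'qualitative'. This is a deliberately simple, illustrative rule."""
    present = [v for v in values if v is not None]
    if not present:
        return "unknown"
    # bool is a subclass of int in Python, so treat it as a category,
    # not as a measurement.
    if all(isinstance(v, Number) and not isinstance(v, bool) for v in present):
        return "quantitative"
    return "qualitative"


# Hypothetical data source: each key is a variable, each list its observations.
records = {
    "age": [34, 51, None, 27],
    "income": [52000.0, 61000.5, 48000.0, 75000.0],
    "region": ["north", "south", "south", "east"],
}

variable_types = {name: classify_variable(col) for name, col in records.items()}
print(variable_types)
```

In practice the rule would need refinement (for example, numeric codes that really stand for categories), but even a rough pass like this is a first step toward understanding what the data contains.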
A more advanced statistical analysis aims to identify patterns in data; for example, whether there is a relationship between the variables or whether certain groups are more likely to show certain attributes than others.
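As a rough illustration of both kinds of pattern, the following Python sketch computes a Pearson correlation between two variables and the share of each group that shows an attribute. The function names and the sample data are hypothetical, and Pearson correlation is only one of several ways to measure a relationship (it captures linear association only).

```python
from collections import defaultdict
from math import sqrt
from statistics import mean


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / (sx * sy)


def attribute_rate_by_group(rows):
    """Given (group, has_attribute) pairs, return the share of each group
    that shows the attribute."""
    counts = defaultdict(lambda: [0, 0])  # group -> [hits, total]
    for group, has_attribute in rows:
        counts[group][0] += int(has_attribute)
        counts[group][1] += 1
    return {group: hits / total for group, (hits, total) in counts.items()}


# Hypothetical observations.
hours_studied = [1, 2, 3, 4, 5]
test_scores = [52, 55, 61, 64, 70]
r = pearson_r(hours_studied, test_scores)
print(f"correlation: {r:.3f}")

rows = [("A", True), ("A", False), ("A", True), ("A", True),
        ("B", False), ("B", True), ("B", False), ("B", False)]
rates = attribute_rate_by_group(rows)
print(rates)
```

A correlation close to 1 or -1 suggests a strong linear relationship, and a large gap between group rates suggests that one group is more likely to show the attribute. Neither result on its own establishes cause, and with small samples such differences can easily arise by chance.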
Further, establishing the nature of a data source is also, really, a process of modeling that data source. During modeling, the process always involves asking questions such as the following (in an effort to establish the nature of the data):
Another way to describe establishing the nature of your data is adding context to it or profiling it. In any case, the objective is to allow the data consumer to better understand the data through visualization.
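Before any chart is drawn, profiling often starts with a handful of summary numbers. The sketch below is one possible way to do that with Python's standard `statistics` module. The `profile_numeric` helper and the sample `ages` list are assumptions for illustration, not something prescribed by the book.

```python
from statistics import mean, median, stdev


def profile_numeric(values):
    """Return basic summary statistics for a numeric variable,
    treating None as a missing value."""
    present = [v for v in values if v is not None]
    summary = {"count": len(values), "missing": len(values) - len(present)}
    if present:
        summary.update(
            min=min(present),
            max=max(present),
            mean=mean(present),
            median=median(present),
        )
    # The sample standard deviation needs at least two observations.
    if len(present) >= 2:
        summary["stdev"] = stdev(present)
    return summary


# Hypothetical variable with one missing observation.
ages = [34, 51, None, 27, 45]
profile = profile_numeric(ages)
for key, value in profile.items():
    print(f"{key}: {value}")
```

The same numbers (range, center, spread, and missing values) are the ones a histogram or box plot would later present visually to the data consumer.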
Another motive for adding context or establishing the nature of your data can be to gain a new perspective on the data.
In this article, we explored the purpose and process of statistical analysis and listed the steps involved in a successful statistical analysis.
Next, to learn about statistical regression and why it is important to data science, read our book Statistics for Data Science.