4 min read

The answer is yes, and no. This is a question that could’ve easily been applied to textile manufacturing in the 1890s, and could’ve received a similar answer. By this I mean, textile manufacturing improved leaps and bounds throughout the industrial revolution, however, despite their productivity, textile mills were some of the most dangerous places to work. Before I further explain my answer, let’s agree on a definition for data science. Wikipedia defines data science as, “an interdisciplinary field about scientific methods, processes, and systems to extract knowledge or insights from data in various forms, either structured or unstructured.” I see this as the process of acquiring, managing, and analyzing data.

Advances in data science

First, let’s discuss why data science is definitely getting easier. Advances in technology and data collection have made data science easier. For one thing, data science as we know it wasn’t even possible 40 years ago, but due to advanced technology we can now analyze, gather, and manage data in completely new ways. Scripting languages like R and Python have mostly replaced more convoluted languages like Haskell and Fortran in the realm of data analysis. Tools like Hadoop bring together a lot of different functionality to expedite every element of data science. Smartphones and wearable tech collect data more effectively and efficiently than older data collection methods, which gives data scientists more data of higher quality to work with.

Perhaps most importantly, the utility of data science has become more and more recognized throughout the broader world. This helps provide data scientists the support they need to be truly effective. These are just some of the reasons why data science is getting easier.

Unintended consequences

While many of these tools make data science easier in some respects, there are also some unintended consequences that actually might make data science harder. Improved data collection has been a boon for the data science industry, but using the data that is streaming in is similar to drinking out of a firehose. Data scientists are continually required to come up with more complicated ways of taking data in, because the stream of data has become incredibly strong. While R and Python are definitely easier to learn than older alternatives, neither language is usually accused of being parsimonious.

What a skilled Haskell programming might be able to do in 100 lines, might take a less skilled Python scripter 500 lines. Hadoop, and tools like it, simplify the data science process, but it seems like there are 50 new tools like Hadoop a day. While these tools are powerful and useful, sometimes data scientists spend more time learning about tools and less time doing data science, just to keep up with the industry’s landscape. So, like many other fields related to computer science and programming, new tech is simultaneously making things easier and harder.

Golden age of data science

Let me rephrase the title question in an effort to provide even more illumination: is now the best time to be a data scientist or to become one? The answer to this question is a resounding yes. While all of the current drawbacks I brought up remain true, I believe that we are in a golden age of data science, for all of the reasons already mentioned, and more. We have more data than ever before and our data collection abilities are improving at an exponential rate. The current situation has gone so far as to create the necessity for a whole new field of data analysis, Big Data. Data science is one of the most vast and quickly expanding human frontiers at present. Part of the reason for this is what data science can be used for. Data science can effectively answer questions that were previously unanswered. Of course this makes for an attractive field of study from a research standpoint.

One final note on whether or not data science is getting easier. If you are a person who actually creates new methods or techniques in data science, especially if you need to support these methods and techniques with formal mathematical and scientific reasoning, data science is definitely not getting easier for you. As I just mentioned, Big Data is a whole new field of data science created to deal with new problems caused by the efficacy of new data collection techniques. If you are a researcher or academic, all of this means a lot of work. Bootstrapped standard errors were used in data analysis before a formal proof of their legitimacy was created. Data science techniques might move at the speed of light, but formalizing and proving these techniques can literally take lifetimes. So if you are a researcher or academic, things will only get harder. If you are more of a practical data scientist, it may be slightly easier for now, but there’s always something!

About the Author

Erik Kappelman wears many hats including blogger, developer, data consultant, economist, and transportation planner. He lives in Helena, Montana and works for theDepartment of Transportation as a transportation demand modeler.

Erik Kappelman wears many hats including blogger, developer, data consultant, economist, and transportation planner. He lives in Helena, Montana and works for the Department of Transportation as a transportation demand modeler.

LEAVE A REPLY

Please enter your comment!
Please enter your name here