So, you probably believe in the power of Big Data and its potential to change the world. Your company might have already invested in a big data project, or be planning to. That’s great! But what if I were to tell you that only 15% of businesses successfully deploy their Big Data projects to production? That can’t be a good sign, surely!
Now, don’t just go scrapping your Big Data budget. Not yet.
Big Data’s Big Challenges
For all the hype around Big Data, research suggests that many organizations are failing to leverage its opportunities properly. A recent survey by NewVantage Partners, for example, explored the challenges facing organizations currently running their own Big Data projects or trying to adopt them. Here’s what they had to say:
“In spite of the successes, executives still see lingering cultural impediments as a barrier to realizing the full value and full business adoption of Big Data in the corporate world. 52.5% of executives report that organizational impediments prevent realization of broad business adoption of Big Data initiatives. Impediments include lack of organizational alignment, business and/or technology resistance, and lack of middle management adoption as the most common factors. 18% cite lack of a coherent data strategy.”
Clearly, even some of the most successful organizations are struggling to get a handle on Big Data. Interestingly, it’s not so much gaps in technology or even skills, but rather a lack of cultural and organizational alignment that’s making life difficult. This isn’t actually that surprising. The problem of managing the effects of technological change goes far beyond Big Data – it’s impacting the modern workplace in just about every department, from how people work together to how you communicate and sell to customers.
It’s out of this scenario that we’ve seen the irresistible rise of DevOps. DevOps, for the uninitiated, is an agile methodology that aims to improve the relationship between development and operations. It aims to ensure fluid collaboration between teams, with a focus on automating and streamlining monotonous and repetitive tasks within a given development lifecycle, thus reducing friction and saving time.
We can perhaps begin to see, then, that this approach – usually used in typical software development scenarios – might actually offer a solution to some of the problems faced when it comes to Big Data.
A typical Big Data project
Like a software development project, a Big Data project will have multiple different teams working on it in isolation.
For example, a Big Data architect will look into the project requirements and design a strategy and roadmap for implementation, while the data storage and admin team will be dedicated to setting up a data cluster and provisioning infrastructure. Finally, you’ll probably then find data analysts who process, analyze and visualize data to gain insights. Depending on the scope and complexity of your project, it is possible that more teams are brought in – say, data scientists are roped in to train and build custom machine learning models.
DevOps for Big Data: A match made in heaven
Clearly, there are a lot of moving parts in a typical Big Data project – each role performing considerably complex tasks. By adopting DevOps, you’ll reduce any silos that exist between these roles, breaking down internal barriers and embedding Big Data within a cross-functional team.
It’s also worth noting that this move doesn’t just give you a purely operational efficiency advantage – it also gives you much more control and oversight over strategy. By building a cross-functional team, rather than asking teams to collaborate across functions (which sounds good in theory, but often proves challenging), there is a much more acute sense of a shared vision or goal. Problems can be solved together, and discussions can take place constantly and effectively. With the operational problems minimized, everyone can focus on the interesting stuff.
By bringing DevOps thinking into Big Data, you also set the foundation for what’s called continuous analytics. It takes its cue from continuous integration, a principle fundamental to effective DevOps practice, whereby code is integrated into a shared repository after every task or change to ensure complete alignment. In the same way, continuous analytics streamlines the data science lifecycle by ensuring a fully integrated approach to analytics, where as much as possible is automated. This takes away the boring stuff – once again ensuring that everyone within the project team can focus on what’s important.
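To make the idea a little more concrete, here’s a minimal sketch of what an automated, repeatable analytics step might look like – the sort of thing a continuous analytics pipeline would run end to end every time new data lands. The function names, the toy schema (records with a `region` and a `sales` value), and the sample data are purely illustrative assumptions, not a reference implementation:

```python
# Toy "continuous analytics" step: every new batch of data runs through
# the same automated pipeline - validate, then aggregate - with no
# manual intervention in between.

def validate(records):
    """Drop records that fail basic quality checks (hypothetical schema:
    dicts carrying a 'region' string and a non-negative numeric 'sales')."""
    return [
        r for r in records
        if isinstance(r.get("sales"), (int, float)) and r["sales"] >= 0
    ]

def aggregate(records):
    """Total sales per region - the kind of repetitive task that gets
    automated rather than done by hand for each new data drop."""
    totals = {}
    for r in records:
        totals[r["region"]] = totals.get(r["region"], 0) + r["sales"]
    return totals

def pipeline(records):
    """The full repeatable pipeline: validation feeding aggregation."""
    return aggregate(validate(records))

# Illustrative run on made-up data:
raw = [
    {"region": "EMEA", "sales": 120},
    {"region": "EMEA", "sales": 80},
    {"region": "APAC", "sales": None},  # bad record, dropped by validate()
    {"region": "APAC", "sales": 50},
]
print(pipeline(raw))  # {'EMEA': 200, 'APAC': 50}
```

The point isn’t the code itself, but that the whole path from raw data to insight is scripted and versioned, so it can run automatically on every change – exactly as continuous integration does for application code.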
We’ve come a long way from Big Data being a buzzword – today, it’s the new normal. If you’ve got a lot of data to work with, to analyze and to understand, you’d better make sure you have the right environment set up to make the most of it. That means there’s no longer an excuse for Big Data projects to fail, and certainly no excuse not to get one up and running.
If it takes DevOps to make Big Data work for businesses, then it’s a mindset worth cultivating and running with.