Does your organization have key Oracle DBA skills that can help you to derive value from big data? And what is “big data” anyhow? Does it just relate to big (like petabyte-sized) databases? Or is there more to it?
Besides being simply “big,” big data has two other key attributes:
- It can also include a heterogeneous mix of structured and unstructured data types, and
- It tends to come at you hard and fast!
The business challenge with big data is to figure out which data elements within the big data “deluge” are of value to you, and how to most effectively capture and analyze those elements. As the volume of available data grows and the number and types of questions the business wants to ask expand, new “big data administration” skills come into play.
An expert Oracle DBA can help with harnessing big data by creating efficient ways to make critical data available to business processes. As I see it, three emerging “key skills” are especially useful in this regard: integration, reporting and Hadoop.
Perhaps the most vital “big data DBA” skill set involves integrating data from a wide range of sources. In the case of big data, successful integrations look beyond mechanics to understand what the business needs, what problem(s) it is looking to solve, and why the data has potential value. Effective communication with business stakeholders is key to success in this area. Oracle Big Data Connectors can help by making it easier to acquire and pre-process data with Apache Hadoop and perform integrated analysis within Oracle Database.
“Big data DBAs” also face new challenges with reporting and data visualization. Finding the nuggets of gold within the big data rubble pile means developing reports that business stakeholders can analyze efficiently and effectively, often across multiple parameters and with different goals in mind. Points of focus here include creating the appropriate data structures and maintaining satisfactory performance. Tuning queries for specific reporting functions means going beyond typical DBA activities to deeply understand the business context and match business needs to the different visualization and reporting tools you have available.
Apache Hadoop is an open-source software framework for storing and processing large-scale datasets across clusters of inexpensive servers running in parallel. Hadoop is a robust and highly scalable platform that enables businesses to run applications against petabytes of data. Hadoop also facilitates cost-effective storage and fast, flexible querying of both structured and unstructured data sources. It’s designed to enable companies to store “all” their data for later processing. Big data DBAs are getting into Hadoop in a big way, because there are multiple ways for Oracle shops to leverage Hadoop-based data and integrate it with Oracle Database for maximum benefit.
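To make the "processing in parallel" idea a bit more concrete, here is a minimal sketch of the map, shuffle and reduce pattern that Hadoop's MapReduce model is built on. It is plain Python running on one machine, with threads standing in for cluster nodes; it does not use any Hadoop or Oracle API, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) and the sample text are made up for illustration.

```python
from collections import defaultdict
from concurrent.futures import ThreadPoolExecutor


def map_phase(chunk):
    """Map step: emit a (word, 1) pair for every word in one chunk of lines."""
    return [(word.lower(), 1) for line in chunk for word in line.split()]


def shuffle(mapped_chunks):
    """Shuffle step: group all emitted values by key across every chunk."""
    groups = defaultdict(list)
    for pairs in mapped_chunks:
        for key, value in pairs:
            groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks, workers=2):
    """Run the map step on each chunk in parallel, then shuffle and reduce."""
    # Threads stand in for cluster nodes here (an assumption for illustration).
    with ThreadPoolExecutor(max_workers=workers) as pool:
        mapped = list(pool.map(map_phase, chunks))
    return reduce_phase(shuffle(mapped))


# Hypothetical input: each inner list plays the role of one data block.
chunks = [
    ["big data is big", "data moves fast"],
    ["oracle and hadoop", "hadoop stores big data"],
]
counts = word_count(chunks)
print(counts)
```

In a real cluster the chunks would be blocks of files spread across many servers, and the framework would handle the scheduling, the data movement and the recovery from failed nodes. The division of labor is the same, though: independent map work on each block, followed by a grouped reduce.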
Of course, each organization will have unique challenges and opportunities when working with big data. Proper project planning and database design will be central to success and rapid time-to-value. Contact Buda Consulting to talk about how we can support your business as it moves to embrace big data.