(Word Cloud Designed by Haoning Richter using R Language on June 7, 2018)

Big Data, Hadoop, Audit and Risk Considerations

For the past 10 years, Big Data has been one of the most discussed phenomena and business challenges in organizations around the world. However, it has not been discussed much in the internal audit profession. I recently had the opportunity to study big data in depth and thought it might be useful to apply what I’ve learned to the considerations that internal auditors and compliance professionals may face in the big data world: ethics, responsibilities, social and legal obligations, and compliance.

We cannot stop the ever-increasing complexity and volume of data and big data tools, so we should embrace them. By learning how big data works, we can become more effective auditors who help businesses and organizations solve problems and achieve their objectives.

Big Data – What is it?

The dictionary defines big data as “extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions.”

Big data includes structured data, semi-structured data, and unstructured data. There are four characteristics of Big Data [1]:

  • Volume: The amount of data or data intensity
  • Velocity: The speed at which data is produced, changed, received, and processed
  • Variety: The different data sources, both inside and outside an entity
  • Veracity: The quality and provenance of received data

According to SAS Insights, big data has two additional dimensions [2]:

  • Variability: In addition to the increasing velocities and varieties of data, data flows can be highly inconsistent with periodic peaks. Is something trending in social media? Daily, seasonal and event-triggered peak data loads can be challenging to manage, even more so with unstructured data.
  • Complexity: Today’s data comes from multiple sources, which makes it difficult to link, match,