What is big data, and how does it play into business? The term “big data” was coined in the late 1990s by two men who would later become important figures in the field of information systems. Jim Wright and John Norton distinguished themselves as the fathers of modern big data theory. They conceptualized big data as a kind of language: a powerful tool that would enable people to process very large quantities of diverse types of data. These data sets would be collected and evaluated over time to yield new insights and allow businesspeople to make informed decisions about their customers and products.
In the early days of the computer age, a handful of different technologies provided the tools for analyzing large amounts of data, each with its own way of crunching numbers and extracting useful information from files and other sources. Some systems used what is now known as the relational database management system (RDBMS), and some used what became known as the object-oriented database management system (ODBMS). Another primary technology of that period was the spreadsheet. Spreadsheet software was initially designed to handle basic data analysis: users entered data into cells and then manipulated it within the spreadsheet to produce a visual representation of the data set’s key characteristics.
The ability to process large amounts of unstructured data and to perform complex analytical functions quickly and reliably has led to what is known today as big data analytics. Many pieces of hardware and specialized software are available for its various aspects. One popular area of application is social media marketing. Businesses use social media to build in-depth customer profiles and to interact with customers through various forms of messaging. By combining structured customer data with social media data, companies have been able to build personalized connections and campaigns that help strengthen customer relationships and drive sales.
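To make the idea of combining customer data with social media data concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration: the record layouts, the `handle` field used as the join key, and the `build_profiles` helper are hypothetical and do not come from any particular marketing platform.

```python
from collections import defaultdict

# Hypothetical structured customer records (for example, rows from a CRM table).
customers = [
    {"customer_id": 1, "name": "Ana", "handle": "@ana"},
    {"customer_id": 2, "name": "Ben", "handle": "@ben"},
]

# Hypothetical social media posts; the free text is the unstructured part.
posts = [
    {"handle": "@ana", "text": "Love the new running shoes!"},
    {"handle": "@ana", "text": "Any discount on trail shoes?"},
    {"handle": "@zed", "text": "Unrelated post"},
]


def build_profiles(customers, posts):
    """Attach each customer's posts to their record, matching on the handle."""
    posts_by_handle = defaultdict(list)
    for post in posts:
        posts_by_handle[post["handle"]].append(post["text"])
    # Customers without any posts get an empty list rather than being dropped.
    return [
        {**customer, "posts": posts_by_handle.get(customer["handle"], [])}
        for customer in customers
    ]


profiles = build_profiles(customers, posts)
for profile in profiles:
    print(profile["name"], len(profile["posts"]))
```

In practice the matching step is usually the hard part, since customers rarely share a clean key with their social media accounts; the exact-match join above is the simplest possible case.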
Another application of the big data concept is e-commerce. Companies now use unstructured data to produce detailed business intelligence reports covering topics such as product and service demand, competitor analysis, segmentation, and market saturation. These reports are often presented as charts and tables, but sometimes the important information has to be extracted from text. This can be done with several different techniques, for example with a specialized application such as the IBM text mining tool or a web-based tool such as Google Sheets. Extracting text information is particularly difficult if the original data is unstructured.
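As a rough illustration of what extracting information from unstructured text can look like, the sketch below counts the most frequent terms across a few free-text reviews. It is not how any of the tools named above work internally; the `top_terms` function, the tiny stop-word list, and the sample reviews are all assumptions made for the example.

```python
import re
from collections import Counter

# A deliberately tiny, hypothetical stop-word list; real lists are much longer.
STOPWORDS = {"the", "a", "is", "and", "but", "was", "it", "to", "of"}


def top_terms(documents, n=3):
    """Return the n most frequent non-stop-word terms across the documents."""
    counts = Counter()
    for doc in documents:
        # Lowercase the text and keep runs of letters and apostrophes as words.
        words = re.findall(r"[a-z']+", doc.lower())
        counts.update(word for word in words if word not in STOPWORDS)
    return counts.most_common(n)


# Hypothetical customer reviews standing in for unstructured e-commerce data.
reviews = [
    "The delivery was slow but the shoes fit well.",
    "Slow delivery, great shoes.",
    "Shoes arrived late; delivery is a problem.",
]

for term, count in top_terms(reviews):
    print(term, count)
```

Even this simple count surfaces the topics customers mention most often, which is the kind of signal a demand or segmentation report would then build on.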
A more challenging task is extracting valuable information from unstructured big data. Traditional approaches to this problem have relied heavily on people who possess the relevant skills, experience, and expertise. Such experts are in short supply, especially in today’s job market. Practitioners must therefore build their skills through training and develop new tools to overcome the challenges of big data.
The biggest challenge for the IT industry, and therefore for businesses looking to implement big data analytics, is the development of large and powerful database management systems. A database management system, or DMS, must store data and make it possible to extract useful information from it. Today’s large databases, built on technologies such as HDFS, conventional file systems, and MySQL or Oracle, are difficult to manage. File system technology, which relies on mechanisms such as NFS and IPC, is also quite limited.
This points to two important requirements for a successful big data analytics system. First, it must let us take advantage of multiple streams of data. Data sets obtained using different technologies, such as traditional relational and object-oriented databases, are valuable because each one serves a distinct purpose. Second, it must let us leverage the power of artificial intelligence. Because humans are still better at organizing and managing sets of information, a good DMS must also let us manipulate big data sets using state-of-the-art tools.
Although many companies are pursuing big data and aiming to implement a DMS, some of them are not clear about the idea of a data warehouse, that is, a structured way of managing big data sets. Companies developing DMS products must therefore carefully explain the benefits of a data warehouse. They must also explain the advantages of using an unstructured data warehouse and how it differs from a traditional relational or object-oriented database management system. Finally, they should allow users to experiment with various warehouse models, as well as with file system virtualization and other options.
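For readers who want to experiment with a warehouse model, the sketch below builds a very small one in an in-memory SQLite database using Python's standard `sqlite3` module. It assumes a star-style layout, one common warehouse design, with a single fact table and a single dimension table; the table names, columns, and sample rows are hypothetical.

```python
import sqlite3

# An in-memory database keeps the experiment self-contained.
conn = sqlite3.connect(":memory:")

# One dimension table (what was sold) and one fact table (each sale).
conn.executescript("""
    CREATE TABLE dim_product (
        product_id INTEGER PRIMARY KEY,
        category   TEXT
    );
    CREATE TABLE fact_sales (
        sale_id    INTEGER PRIMARY KEY,
        product_id INTEGER REFERENCES dim_product(product_id),
        amount     REAL
    );
    INSERT INTO dim_product VALUES (1, 'shoes'), (2, 'bags');
    INSERT INTO fact_sales (product_id, amount)
        VALUES (1, 50.0), (1, 70.0), (2, 30.0);
""")

# A typical warehouse query: total sales per product category.
rows = conn.execute("""
    SELECT p.category, SUM(s.amount)
    FROM fact_sales AS s
    JOIN dim_product AS p ON p.product_id = s.product_id
    GROUP BY p.category
    ORDER BY p.category
""").fetchall()

for category, total in rows:
    print(category, total)
```

The point of the structured layout is that questions like "sales per category" become a single query, which is much harder to guarantee when the same records sit in loosely organized files.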