What is big data, and why should you care? Big data is quickly becoming an industry buzzword, and everyone from CEOs to marketing managers is scrambling to understand the concept. The term has emerged as a near synonym for “data analytics”. In recent years there has been dramatic growth in the number of software companies offering big data analytics services. Although not all of these companies offer the same level of expertise, the potential for leveraging huge volumes of unstructured data for investment purposes is enormous.
Data insight, or “big data”, comprises the full range of activities in which computers are used to deliver insights that were not available before. Data mining refers to the process of finding previously undiscovered, potentially profitable trends in massive amounts of unstructured data. These discoveries are made possible in part by what is called “stream processing”. Stream processing involves handling data continuously as it arrives, finding relationships between unstructured data sources, and then using mathematical algorithms to mine these relationships for new insights.
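To make the idea of stream processing a little more concrete, here is a minimal sketch in Python. It assumes the "unstructured data" is a feed of short text posts and that the "relationship" of interest is simply which words tend to appear together. The sample posts, the function names and the crude tokenizer are all invented for illustration; a real system would use a dedicated streaming platform and far more careful text handling.

```python
from collections import Counter
from itertools import combinations

# Hypothetical sample posts, invented for illustration only.
SAMPLE_POSTS = [
    "Great match tonight, the football final was amazing!",
    "Tickets for the football final sold out fast.",
    "New phone review: battery life is amazing.",
]


def tokenize(post):
    """Return the distinct lower-cased words of a post that are longer than three letters."""
    words = (w.strip(".,:;!?\"'()") for w in post.lower().split())
    return {w for w in words if len(w) > 3}


def count_cooccurrences(posts):
    """Consume posts one at a time and keep running counts of word pairs.

    Only the running counts are kept in memory, never the full data set,
    which is the essential idea behind processing a stream.
    """
    counts = Counter()
    for post in posts:
        terms = sorted(tokenize(post))  # sorted so each pair has one canonical order
        counts.update(combinations(terms, 2))
    return counts


if __name__ == "__main__":
    pair_counts = count_cooccurrences(SAMPLE_POSTS)
    for pair, seen in pair_counts.most_common(3):
        print(pair, seen)
```

Because `count_cooccurrences` accepts any iterable, the same function would work on a generator that reads posts from a file or a network feed as they arrive, rather than on a list held in memory.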
The primary reason for the growth of what is sometimes referred to as social media analytics is that people are sharing ever-increasing amounts of data online. There are two main ways in which social media feeds this kind of analysis. The first is through the large volume of public posts on blogs and forums. These posts make it clear that there is widespread interest in topics of all sorts – sports, news, entertainment, and lifestyle.
The second way in which social media is making it easier to analyze consumer data is via “behavioural recognition”. This can be applied in a variety of ways. Many companies have developed sophisticated algorithm-driven systems that predict how users will respond in certain situations. By understanding how users are likely to behave – for example, whether they will click a link in a particular setting, fill out a form, or make a purchase – advertisers can use these behavioural cues to target their advertising.
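As a rough illustration of what predicting a user response can look like, the sketch below trains a very small logistic regression model to estimate the probability that a visitor clicks a link. Logistic regression is just one common choice, swapped in here for illustration, and not necessarily what any particular company uses. The two features (share of pages viewed and whether the visitor is returning), the training rows and every function name are assumptions invented for this example.

```python
import math

# Hypothetical training data, invented for illustration only.
# Each row: ((share of pages viewed scaled to 0-1, returning visitor 0/1), clicked 0/1)
SESSIONS = [
    ((0.9, 1.0), 1),
    ((0.8, 1.0), 1),
    ((0.7, 0.0), 1),
    ((0.6, 1.0), 1),
    ((0.1, 0.0), 0),
    ((0.2, 0.0), 0),
    ((0.3, 1.0), 0),
    ((0.2, 0.0), 0),
]


def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Estimated probability that a session with these features ends in a click."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(sessions, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by plain batch gradient descent on the log loss."""
    n_features = len(sessions[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * n_features
        grad_b = 0.0
        for features, clicked in sessions:
            error = predict(weights, bias, features) - clicked
            for i, x in enumerate(features):
                grad_w[i] += error * x
            grad_b += error
        # Average the gradients over all sessions before taking a step.
        weights = [w - learning_rate * g / len(sessions) for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / len(sessions)
    return weights, bias


if __name__ == "__main__":
    weights, bias = train(SESSIONS)
    print("engaged returning visitor:", round(predict(weights, bias, (0.9, 1.0)), 2))
    print("brief first-time visitor:", round(predict(weights, bias, (0.1, 0.0)), 2))
```

An advertiser could then show a promotion only to sessions whose estimated click probability exceeds some threshold. Real systems rely on many more signals and far larger data sets than this toy example.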
It is not just the financial markets that can benefit from what is known as descriptive analytics. One of the primary driving forces behind the commercial success of YouTube and other social media outlets is the fact that people share large amounts of data with each other. A simple example: someone posts a video of themselves doing something funny. Others can then identify aspects of the video, such as the effect it has on the audience, which parts are funny and which parts are disturbing, and use this information to market their products or services more effectively to that audience.
This analytical process differs from traditional machine learning in two main ways. First, in traditional machine learning, computer scientists create the models, and a trained engineer or technician then implements them in a production environment. What is new is that data mining techniques enable analysts not only to analyze the raw data from the internet, but also to extract insights from the structure of that data. This is done using what is called an artificial intelligence (AI) system – typically an artificial neural network (ANN), often a deep one, or a similar self-learning system.
Second, unlike traditional machine learning and statistical analysis, where the data sets are highly structured, what is called unstructured data analytics works with large amounts of unstructured data. Traditional approaches cannot deal with this because they require data analysts to sort through large amounts of data by hand to find relevant relationships. Another limitation is that, in some cases, even very carefully selected machine learning methods can fail to identify relationships when the data is very unstructured. Tools that can work directly with large amounts of unstructured data make the job much easier and more efficient.
The last point to note is that what is often termed “big data analytics” actually covers a wide range of uses for this emerging technology. For instance, financial institutions use it to detect potential fraud, a process called fraud management. Law enforcement agencies use it to track criminal activity across large volumes of data, and counter-terrorism agencies use it to detect potential terrorist activity. The applications of this technology are only just beginning to develop and are likely to keep growing in the coming years.