How Metadata Makes Big Data Manageable – Explained by the Digital Photo Example
The “Big” in Big Data suggests that especially large amounts of data are what matters in Big Data analysis. To derive real and meaningful insights from data, however, it is just as important to understand what the data consist of and which types of data are involved.
Big Data can consist of virtually any kind of data from our daily digital interactions: the data may be internal or external, and they may be text and image data from a mail server or from social media. Obviously, mail or social media data are fundamentally different from the sensor data collected by production monitoring systems. Metadata, that is, data about data, play an important role in filtering among these various types of data.
Let’s use the metadata of digital photos as an example. Every photo recorded with a digital camera consists of data such as the number of pixels that form the image and the colour and brightness of each of them. In addition, each image file carries further useful information in its metadata:
- When was the picture taken?
- Which shutter speed, which aperture and which program were used?
- Which ISO value was set?
Many cameras now have integrated GPS chips that record where a photo was taken. These data are a typical example of image metadata. People can investigate their photos in many ways using the metadata. For example, at the end of a year, they can use the data to evaluate their own picture-taking habits:
- At what time of day did I take most of my photos?
- Where in the world did I take my pictures?
- Did I often try to photograph in too little light, or do I almost always use the default program?
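Once the metadata are available as records, questions like these can be answered without looking at a single pixel. The following is only a minimal sketch in Python. It assumes the metadata have already been extracted into dictionaries; the field names (`taken_at`, `country`, `program`, `iso`), the sample values and the ISO threshold for "too little light" are made up for illustration and are not tied to any particular camera or file format.

```python
from collections import Counter
from datetime import datetime

# Hypothetical metadata records: field names and values are illustrative only.
photos = [
    {"file": "img_001.jpg", "taken_at": datetime(2023, 3, 4, 18, 12),
     "country": "Thailand", "program": "auto", "iso": 200},
    {"file": "img_002.jpg", "taken_at": datetime(2023, 3, 5, 18, 40),
     "country": "Thailand", "program": "auto", "iso": 3200},
    {"file": "img_003.jpg", "taken_at": datetime(2023, 7, 1, 9, 5),
     "country": "Germany", "program": "manual", "iso": 100},
    {"file": "img_004.jpg", "taken_at": datetime(2023, 11, 20, 22, 30),
     "country": "Japan", "program": "auto", "iso": 1600},
]

# Assumed cut-off: treat ISO 1600 and above as a sign of shooting in low light.
LOW_LIGHT_ISO = 1600


def summarize(records):
    """Answer the year-end questions using metadata only."""
    hours = Counter(r["taken_at"].hour for r in records)
    countries = Counter(r["country"] for r in records)
    programs = Counter(r["program"] for r in records)
    low_light = sum(1 for r in records if r["iso"] >= LOW_LIGHT_ISO)
    return {
        "busiest_hour": hours.most_common(1)[0][0],     # hour with most photos
        "countries": dict(countries),                   # photos per country
        "auto_share": programs["auto"] / len(records),  # share of default program
        "low_light_shots": low_light,                   # shots at high ISO
    }


summary = summarize(photos)
print(summary)
```

The summary is computed from a handful of small fields per photo, which is why this kind of evaluation stays cheap even for a large collection.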
Generally speaking, metadata provide information about data. They act as a kind of keyword and help to identify data in a search.
With the metadata of digital photos, we can easily search for the specific images we are interested in. Suppose you need an image of a sunset in Thailand for an advertisement: you only need two main criteria, recording time and location, to quickly find the right shots among thousands of images. These search criteria from the metadata are sufficient to exclude a huge number of irrelevant images.
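The sunset search can be sketched as a simple filter over metadata records. This is an illustrative Python example, not a real photo-library API: the catalogue entries, the field names `taken_at` and `country`, and the choice of 17:00 to 19:00 as a "sunset window" are all assumptions made for the sake of the example.

```python
from datetime import datetime

# Hypothetical catalogue of metadata records (no image data involved).
catalogue = [
    {"file": "beach.jpg", "taken_at": datetime(2023, 3, 4, 18, 12), "country": "Thailand"},
    {"file": "market.jpg", "taken_at": datetime(2023, 3, 5, 10, 30), "country": "Thailand"},
    {"file": "alps.jpg", "taken_at": datetime(2023, 7, 1, 18, 45), "country": "Austria"},
    {"file": "temple.jpg", "taken_at": datetime(2023, 3, 6, 18, 40), "country": "Thailand"},
]


def find_candidates(records, country, start_hour, end_hour):
    """Keep only photos taken in `country` with start_hour <= hour < end_hour."""
    return [
        r["file"]
        for r in records
        if r["country"] == country and start_hour <= r["taken_at"].hour < end_hour
    ]


# Assumed sunset window: 17:00 up to (but not including) 19:00 local time.
candidates = find_candidates(catalogue, "Thailand", 17, 19)
print(candidates)  # ['beach.jpg', 'temple.jpg']
```

Only the files that pass both criteria would then need to be opened and inspected by a person; everything else is excluded on the basis of the metadata alone.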
This simple photo example shows why metadata are so critical in Big Data analysis. They make it possible to reduce the amount of data involved in the analysis and enable people to focus on the data they are interested in.
Regardless of the computing power available, metadata allow us to make large amounts of data manageable. The less data there is to process, the more likely it is that results can be made available in real time.