Introduction: Since Big Data was introduced, it has gone through multiple phases of evolution. Hadoop was introduced in 2005 with some initial features such as the MapReduce processing engine, which allowed large-scale data processing workloads distributed…
Big Data and Education Industry
Overview: Big data has been driving revolutionary changes in education. There is hardly an area of education that big data has not affected. You can see the changes in the ways educational institutions are governed, course quality is managed and student…
Big Data characteristics and pain points
Overview: Big data is defined by three key characteristics: volume, velocity and variety. It comes in different forms and structures. Big data analytics has a significant impact on business decisions. But it comes with some pain points.…
How does the Open Data Platform simplify Hadoop adoption?
Overview: The Open Data Platform (ODP) is an industry initiative focused on simplifying the adoption of Apache Hadoop by the Enterprise and enabling Big Data solutions to thrive with better ecosystem interoperability. It builds on the strengths of the Apache…
Measuring the ROI in Hadoop adoption
Overview: Many people are misinformed about Hadoop, mainly because of the half-truths circulating about it in the market. Such half-truths are to be expected, however, as Hadoop is said to be one of the best…
Hadoop Basic concepts – Learn it now
Introduction: In this series, we will discuss some of the basic concepts in Hadoop and big data. We have tried to cover the basic concepts and explain them so that they are easy to learn and implement. We will keep adding…
Hadoop installation modes – Let’s explore
Overview: Apache Hadoop can be installed in different modes, depending on the requirement. The mode is configured during installation. By default, Hadoop is installed in standalone mode. The other modes are pseudo-distributed mode and fully distributed mode. The purpose…
What is Spring for Apache Hadoop?
Overview: Spring is one of the most widely used frameworks in enterprise application development. It has different components, such as Spring ORM and Spring JDBC, to support different features. Spring for Apache Hadoop is the framework that supports building applications with Hadoop components…
What are the latest trends in big data and analytics?
Overview: New best practices and trends in big data technology emerge every day. Big data is gradually entering mainstream projects and gaining momentum. Along with big data, analytics is also growing in importance, as it is…
What is Hadoop distributed file system (HDFS)?
Overview: In this article I will discuss HDFS, the underlying file system of the Apache Hadoop framework. The Hadoop Distributed File System (HDFS) is a distributed storage layer that can span thousands of commodity machines. This file system provides…
How does Hadoop Streaming work?
Overview: Hadoop Streaming is one of the most important utilities in the Hadoop distribution. The Streaming interface allows you to write MapReduce programs in any language of your choice that can read from STDIN and write to STDOUT. So, Streaming can…
What Are The Advanced Hadoop MapReduce Features?
Basic MapReduce programming explains the workflow, but it does not cover what actually happens inside the MapReduce programming framework. This article will explain how data moves through the MapReduce architecture and the API calls used to…
Hadoop key terms, Simplified
Overview: In the current technology landscape, big data and analytics are the two areas attracting the most interest. The obvious reason behind this traction is that enterprises are getting business benefits out of these big…