Introduction: Since Big Data was introduced, it has gone through multiple phases of evolution. Hadoop was introduced in 2005 with some initial features such as the MapReduce processing engine, which allowed large-scale data processing workloads distributed…
What is Apache Kudu?
Overview: Kudu is a new open source project that provides updatable storage. It is a complement to HDFS/HBase, which provides sequential and read-only storage. Kudu is more suitable for fast analytics on fast data, which is the demand of…
Big data and AI need each other
Overview: To anyone applying AI in any form, the response to the heading above might be “Duh!” That’s an obvious statement to those engrossed at the coalface, but for many others (especially on the client side), they have yet to…
How Open Data Platform simplifies Hadoop adoption?
Overview: The Open Data Platform (ODP) is an industry initiative focused on simplifying the adoption of Apache Hadoop by the enterprise and enabling Big Data solutions to thrive with better ecosystem interoperability. It builds on the strengths of the Apache…
SQL on Hadoop – How does it work?
Overview: SQL on Hadoop is a group of analytical application tools that combine SQL-style querying and processing of data with the most recent Hadoop data framework elements. The emergence of SQL on Hadoop is an important development for big…
Hadoop Basic concepts – Learn it now
Introduction: In this series, we will discuss some of the basic concepts in Hadoop and big data. We have tried to cover the basic concepts and explain them so that they are easy to learn and implement. We will keep on adding…
What is HDFS federation?
Overview: We are well aware of the features of Hadoop and HDFS. In this document we will talk about HDFS federation, which helps us enhance an existing HDFS architecture. It provides a clear separation between namespace and storage…
What is Hadoop distributed file system (HDFS)?
Overview: In this article I will discuss HDFS, which is the underlying file system of the Apache Hadoop framework. Hadoop Distributed File System (HDFS) is a distributed storage space that spans thousands of commodity hardware nodes. This file system provides…