HADOOP Training In Chennai
Hadoop was created by Doug Cutting, who named it after his child's stuffed elephant. It was built to support the Lucene and Nutch search engine projects.
Hadoop is an open source project administered by the Apache Software Foundation.
Hadoop consists of two key services: HDFS and MapReduce.
Hadoop is a software framework for data-intensive computing applications.
1. Software platform that lets one easily write and run applications that process vast amounts of data. It includes:
– MapReduce – offline computing engine
– HDFS – Hadoop distributed file system
– HBase (pre-alpha) – online data access
2. Yahoo! is the biggest contributor
3. Hadoop implements Google’s MapReduce, using HDFS
4. MapReduce divides applications into many small blocks of work.
5. HDFS creates multiple replicas of data blocks for reliability, placing them on compute nodes around the cluster.
6. MapReduce can then process the data where it is located.
7. Hadoop's target is to run on clusters on the order of 10,000 nodes.
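The MapReduce flow described in the list above can be sketched in a few lines of plain Python. This is only an illustration of the programming model, not Hadoop's actual API: it runs in a single process, and the function names (`split_into_blocks`, `map_block`, `shuffle`, `reduce_counts`, `word_count`) and the `block_size` parameter are made up for this example. It splits the input into small blocks of work, maps each block to (word, 1) pairs, groups the pairs by word, and reduces each group to a total.

```python
from collections import defaultdict


def split_into_blocks(lines, block_size):
    """Divide the input into small blocks of work (here: lists of lines)."""
    return [lines[i:i + block_size] for i in range(0, len(lines), block_size)]


def map_block(block):
    """Map phase: emit a (word, 1) pair for every word in one block."""
    return [(word.lower(), 1) for line in block for word in line.split()]


def shuffle(mapped_pairs):
    """Group the emitted values by key, as happens between map and reduce."""
    groups = defaultdict(list)
    for key, value in mapped_pairs:
        groups[key].append(value)
    return groups


def reduce_counts(groups):
    """Reduce phase: sum the values collected for each word."""
    return {word: sum(values) for word, values in groups.items()}


def word_count(lines, block_size=2):
    """Run the whole toy pipeline on a list of text lines."""
    blocks = split_into_blocks(lines, block_size)
    # On a real cluster each block would be mapped on the node that stores it;
    # here the blocks are simply processed one after another.
    mapped = [pair for block in blocks for pair in map_block(block)]
    return reduce_counts(shuffle(mapped))


if __name__ == "__main__":
    sample = [
        "Hadoop stores data in HDFS",
        "MapReduce processes data",
        "Hadoop scales out",
    ]
    print(word_count(sample))
```

In a real Hadoop job the framework handles the splitting, the grouping, and the scheduling of map tasks close to the data; the programmer mainly supplies the map and reduce steps.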
Example Applications and Organizations using Hadoop
a. Amazon
b. Yahoo
c. AOL
d. Facebook
e. FOX Interactive Media
Why do We Need Hadoop?
a. Hadoop provides storage for Big Data at reasonable cost
b. Hadoop lets you capture new or more data
c. With Hadoop, you can store data longer
d. Hadoop provides scalable analytics
e. Hadoop provides rich analytics
What qualities/skills help trainees?
a. Good understanding of data warehouse concepts and design patterns
b. Strong experience with Core Java
c. Good experience with Hadoop, including HDFS, MapReduce and other tools in the Hadoop ecosystem
d. Strong knowledge of and hands-on experience with the MapReduce programming model and high-level languages like Pig or Hive
e. Experience with NoSQL data stores like HBase and Cassandra
f. Understanding of the various configuration parameters and the ability to help arrive at values for optimal cluster performance
g. Knowledge of configuration management / deployment tools like Puppet / Chef
h. Experience setting up cluster monitoring and alerting tools like Ganglia, Nagios, etc.
i. Experience in setting up cross-data center replication
j. Understanding of how a security model using Kerberos and an enterprise LDAP product works, and the ability to help implement it