Apache Hadoop – Muaj kev sib piv ntawm cheebtsam txawv

Apache Hadoop yog lub moj khaum software qhib-qhov chaw sau Java. Nws yog neeg muab rau lub cia thiab ua rau poob lawm ntau ntawm cov ntaub ntawv, Nws yog ib qho kuj zoo li cov ntaub ntawv loj. Tam sim no, Apache Hadoop comprises ntawm tej tej cheebtsam uas tso cai rau lub cia thiab ua los ntawm tagnrho cov ntaub ntawv loj hauv ib cheeb tsam clustered. Txawm li cas los, lub ntsiab ob Cheebtsam tau cov Hadoop muab theej thiab faib cov ntaub ntawv kaw lus thiab MapReduce programming.

Nyob rau cov tshooj no, peb yuav ua ntej los saib lub Cheebtsam ntau uas tsim Apache Hadoop thiab ces lub tshuab integrated thiab databases.

Cheebtsam uas Apache Hadoop

Hadoop, ua ib tug lej, muaj feem hauv qab no.

Hadoop muab theej thiab faib cov ntaub ntawv kaw lus -Abbreviated ua HDFS, Nws yog ib feem ntau ib cov ntaub ntawv uas zoo sib xws rau ntau tus uas twb muaj lawm twb sawv daws yuav. Txawm li cas los, li no kuj yog ib qhov system tej ntaub ntawv virtual.

Yog ib tug notable txawv uas lwm cov ntaub ntawv nrov lub nruab nrog, Nws yog ib qho, Thaum peb tsiv ib cov ntaub ntawv nyob rau hauv HDFS, Nws tseem yuav phua rau hauv ntaub ntawv me me. Cov ntaub ntawv me me thiab no ces replicated ntawm tsawg kawg yog peb nyias servers, kom lawv yuav tsum siv ib lwm txoj rau yam unforeseen. Tseem, no replication suav tsis tas txhais, decided as per yuav tsum tau.

Hadoop MapReduce -MapReduce no mas cov programming nam cov Hadoop uas tso cai rau ntawm tej loj tagnrho ntawm cov ntaub ntawv ua.

Muaj tseem yog ib tug muab lov los thov hauv me me thiab thov, twg ces muab xa mus rau ntau lub servers. Qhov no tso cai rau siv cov scalable zog cov CPU.

HBASE -HBASE zoo li muaj ib txheej bacteria uas sits atop lub HDFS thiab tau tsim by means of Java programming lus. HBASE uas muaj qhov chaw hauv qab no –

  • Tsis paub
  • Scalable mas
  • Kam rau ua txhaum

Txhua leej nkaus xwb uas tau tshwm sim nyob rau hauv HBASE qaib by means of ib qhov tseem ceeb. Tus nab npawb ntawm txhua kuj tsis txhais, tab sis, theej grouped rau hauv kem tsev neeg.

Zookeeper -Qhov no yeej muaj centralized zog uas koom tes –

  • Configuration lus
  • Naming lus
  • Synchronization lus

Dhau li ntawm no, Zookeeper tseem yog lub luag hauj lwm rau kev pab pawg thiab ntaub ntawm HBASE. Nws tau los siv rau cov kev pab cuam MapReduce.

Solr/Lucene – Qhov no yog tsis muaj dab tsi tab sis ib lub cav nrhiav. Nws qiv yog tsim los ntawm Apache thiab yuav tsum tau nyob 10 xyoo mas koj thiaj yuav tsim tau nyob rau hauv nws daim ntawv robust tam sim no.

Programming lug -Muaj ntau yeej ob tug programming lug ntawd yog muab raws li daim tseem Hadoop programming lug. Cov no yog –

  • Nas muv
  • NPUA

Dhau li ntawm no, muaj ob peb lwm programming haiv lus uas yuav siv tau rau kev sau ntawv pab, namely C, JAQL thiab Java. Peb kuj ua lawv pab ntawm SQL kev sis raug zoo nrog cov database, Txawm hais tias qhov uas yuav tsum tau siv cov txheem cov tsav tsheb JDBC los yog ODBC.

Rau integrated haujlwm Hadoop

Cov neeg muag khoom enterprise tau lawv tus kheej heev Hadoop khoom uas hais tias tseem comprise ntawm lawv database, li zoo li analytical offerings. Cov offerings kuj tsis tas yuav muab koj rau qhov Hadoop los ntawm elsewhere, tab sis, theej muab nws ua ib tug ntxhais nam ntawm lawv cov ntsiab.

Tej no yog –

EMC Greenplum

Greenplum zoo li muaj ib tug tuaj tshiab zoo nkauj nyob hauv koj lub lag luam enterprise thiab muaj ib lub koob npe nrov rau yog tus muaj zog pab analytics. Nws tuaj cuag li ib koog Analytics Platform, uas muaj nqaij –

  • Greenplum database meant siv rau cov ntaub ntawv xyaum
  • Nws cov Hadoop tis, peb paub los lawm raws li tus Greenplum HD
  • Ib txheej productivity nrog cov ntaub ntawv kev kawm hu ua paab chorus tseem.

IBM

IBM tus enterprise tis rau Hadoop yog hu ua Infosphere BigInsights. Nws implements ib array ntawm nta rau Hadoop, xws li-

  • Cov cuab yeej rau kev tswj
  • Cov cuab yeej rau cov koom haum saib xyuas
  • Comprises nws kuj muaj ib cov ntawv luam ua ntaub ntawv tsom xam lwm yam cuab yeej uas nyob rau hauv lub rooj ntawm cov chaw pab, xws li paub cov neeg, xov tooj, chaw nyob thiab tshaj.

Yog koj siv cov JAQL lus nug lus, ib tug tau integrate lub Hadoop ntau IBM khoom ib yam li DB2, los tseem Netezza. BigSheets, lub spreadsheet zoo nkaus li daim ntawv ua hauj lwm los ntawm cov ntaub ntawv loj loj no kuj pab. Saib tam sim no, BigInsights yuav tsuas siv li huab by means of Amazon, Rackspace, Rightscale, yam.

Microsoft

Hadoop tas tub ntxhais ib sab ntawm Microsoft lub loj cov ntaub ntawv ❑ ❑. Pursuing muaj ib qho kev integrated, nws npaj rau kev ua ntaub ntawv muaj ntau dua nws cov tuam suite rau analytics.

Microsoft loj cov ntsiab muaj tau coj mus ua tus neeg rau zaub mov qhov rais platform thiab los rau lub qhov rais Azure platform, Nws yog ib qho raws li huab. Kev nrog rau lub Center ntawm qhov rais thiab dhia Directory, lub tuam txhab muaj nws tis hom ntawm Hadoop. Ntxiv, nws integrates Hadoop nrog nws cov SQL neeg rau zaub mov, Nrig txog kev pom Studio, thiab .NET.

Oracle

Nkag mus rau hauv lub ntiaj teb los ntawm cov ntaub ntawv loj nrog ib tug kws oracle raws li kev sau rau daim ntawv loj kws. Qhov no saib kuas yooj yim xws li Hadoop, thiab los nrog tus tshiab NoSQL database, uas tso cai rau kev analytics thiab muaj kev sib txuas rau Oracle databases thiab cov Exadata warehousing lineup. NoSQL tseem paub ua ib scalable tseem ceeb raws li tus nqi database Self.

Oracle cas tseem muaj cov R analytical platform kev nrog Hadoop, thiab cov uas ua kom yooj yim rau China. Oracle tus R Enterprise khoom tseem yog ib qho uas ua rau database yooj yim xws li, thiab tseem muaj Hadoop.

Databases rau cov analytics uas Hadoop connectivity

Databases Massively thaum uas tig mus xyuas txog kev txhawb (MPP) lom zem ntau tus meant rau ntaub ntawv xyaum ntaub ntawv loj loj no, tsis zoo li cov Hadoop lub specialization ntawm unstructured ntaub ntawv. Greenplum, thiab cov loj npaum li cas ntawv Aster thiab Vertica, Cov piv txwv zoo rau pioneers thaum ntxov hauv no regard.

Cov MPP databases paub tias mus daws tshwj workloads ntawd tej analytics, thiab kuj xws li cov ntaub ntawv. Cov no muaj connectors Hadoop thiab lwm yam ntaub ntawv warehousing platforms.

Lig lawm no database dlaws no mas yog ib co lwm players rau qhov kev lag luam, xws li-

  • Aster tej ntaub ntawv tau lawm mas yuav kis tau Teradata
  • HP lawm mas Vertica
  • Greenplum yog tam sim no nyob rau hauv EMC

Hadoop-koom tuam txhab uas muag

Yuav kom tau raws li tus tsim tawm uas tau tsav zoo tagnrho lub ntiaj teb cov ntaub ntawv loj, distributions ntawm Hadoop ntau zaus koj nyob rau hauv daim ntawv ntawm zej zog cov khoom. Tej yam khoom tsis muaj ib tug enterprise tswj kev, tab sis theej txhua tus functionalities uas yuav tsum tau kho thiab tshuaj xyuas.

Cloudera

Cloudera cas yuav hiob lub tsev lag luam uas muab Hadoop distributions. Raug muab ib qho ntsiab enterprise, nrog rau kev kawm, kev xaiv cov kev pab thiab kev them nyiaj yug. Tseem, Cloudera lawm txoj kev koom tes heev heev rau lub Hadoop by means of qhov qhib txoj kev koom tes.

Hortonworks

Hortonworks tau ua ntev nrog Hadoop. Nws yog ib qhov khoom mas hauv Yahoo, thiab raws li tus originator ntawm Hadoop, nws los txhawb tub ntxhais siv tshuab Hadoop aims. Nws kuj muaj partnered Microsoft kom zoo zog rau lawv xws li Hadoop.

Xaus

Tsab xov xwm saum toj no kom meej meej piav tus modules muaj ntau yam uas tsim Hadoop, nrog rau cov heev heev enterprise thiab cov zej zog nyob khoom uas muaj kev siv tam sim no. Uas nkag mus rau hauv ntau dua prominence Hadoop, Nws yog tus tsim kom tsuas muaj sij hawm yuav ntau dua los yog ntxiv rau daim ntawv zuj zus.

Tagged:
============================================= ============================================== Yuav zoo TechAlpine phau ntawv rau Amazon
============================================== ---------------------------------------------------------------- electrician ct chestnutelectric
error

Txaus siab rau qhov blog? Tshaj tawm lus thov :)

Follow by Email
LinkedIn
LinkedIn
Share