Hadoop cov lus tseem ceeb, Yooj yim zog

Txheej txheem cej luam:

Nyob hauv txoj kev siv tshuab toj roob hauv pes, cov ntaub ntawv loj loj thiab analytics yog tus ob uas tseem ceeb tshaj tej qhov chaw uas neeg noj ntau txaus siab. Cuab kev tos qab no traction yog – qhauj tau txais kev pab ua hauj lwm hauv tej ntaub ntawv loj loj thiab BI daim ntaub ntawv. Hadoop yog tam sim no muaj ib tug loj cub tshuab, yog li txoj kev pab thiab kev sib sab laj no kuj kis tus kab mob tshaj tawm tech. Tab sis npog, peb muaj cai dab tsi yog – neeg tseem pom tias nws nyuaj los nkag siab txog cov ntsiab lus tseeb, feem ntau ua ib lub tswv yim vague txog Hadoop thiab lwm yam yees.

Nyob rau cov tshooj no, peb siv zog as yog Hadoop tseem ceeb lus piav nyob rau hauv ib txoj kev yooj yim heev, yog li ntawd txoj kev thiab non-technical tuaj nkag siab.

Hadoop eco-mob – txhais li cas xyov?

Hadoop yog lub platform tswj los qhib qhov haib heev Apache Foundation. Hadoop platform yog ua rau Java yees thiab cuab kav ntawm cov zauv loj loj ntim ntawm cov ntaub ntawv hauv lub cheeb tsam clustered distributed heterogeneous. Nws cov scaling muaj rab peev xwm ua kom lub zoo meej haum rau faib xam.

Hadoop eco-lawv cov pawg tub ntxhais lub Cheebtsam Hadoop thiab lwm yam kev mob. Nyob rau hauv cov hoob Cheebtsam, Hadoop muab theej thiab faib cov ntaub ntawv kaw lus (HDFS) thiab cov MapReduce programming qauv yog lub ntsiab lus tseem ceeb tshaj ob. Cov muaj tus mob caj, Nas muv rau SQL, Tus npua rau dataflow, Zookeeper kev tswj cov kev pab thiab lwm yam tseem ceeb. Peb yuav piav txog cov ntsiab lus uas nyob rau hauv cov lus.

Hadoop ecosystem

Hadoop ecosystem

Image1: Hadoop eco-lawv

Vim li cas koj yuav tsum paub txog cov ntsiab lus tseem ceeb uas?

Peb tau qhia txog tias Hadoop yog ib lub npe nrov heev nta, thiab sawv daws yog tham txog tej no, txhob txwm los yog tsis. Ces yog qhov teeb meem- Yog hais tias koj tawm rau tswv yim txog yog ib yam dab tsi los yog mloog ib yam dab tsi, tab sis tsis paub txhais li cas xyov, ces koj yuav tsis tau txuas lub dots los yog zom nws. Qhov teeb meem yog pom dua thaum cov neeg uas yog los ntawm tus sau ntau, zoo li neeg ua lag ua luam, li hais mav, sab saum toj los xyuas dua thiab lwm yam. Vim hais tias cov neeg no yuav tsis tau paub ' Cas Hadoop xwb?‘, theej yog xav paub txog 'cas yuav nqa nyiaj ua lag ua luam '. Kom paub cov kev pab ua hauj lwm, ib ntsis ntawm kev nkag siab txog cov ntsiab lus uas Hadoop yog ib qho tseem ceeb heev nyob rau tag nrho cov khaubncaws sab nraud povtseg. Tab sis tib lub sij hawm, cov ntsiab lus uas yuav tsum tau piav txoj kev yooj yim tsis jargons txoj, nyiam kom tus nyeem.

Peb to taub lub ntsiab lus tseem ceeb uas

Ntawm ntu no peb yuav tshawb cov ntsiab lus uas txawv rau hauv Hadoop thiab nws cov eco-suab, muaj ib txhia piav. Rau clarity hauv siab, peb yuav muab ob sab pawg, yog ib tus luj module lwm qhov ib yog cov tej pob khoom software ntxiv thiab lwm yam cuab yeej uas yuav muab ntsia tau nyias los tsaws rau saum qaum Hadoop. Hadoop hais txog tas nrho cov chaw.

Ua ntej, peb tau saib cov nqe lus uas tau los nyob hauv lub txoos module.

  • Apache Hadoop: Apache Hadoop yog ib qhov chaw qhib moj khaum zauv loj ntim ntawm cov ntaub ntawv hauv lub cheeb tsam clustered. Nws siv yooj yim MapReduce programming cov qauv kev txhim khu kev qha, scalable thiab distributed xam. Qhov cia tuaj le caag ob yog faib hauv lub moj khaum no.
  • Hadoop khaub: Raws li qhia lub npe, nws muaj hlauv taws xob uas txhawb ntau Hadoop modules. Nws yuav yeej tau ib lub tsev qiv ntawv hom cuab yeej thiab hlauv taws xob. Hadoop uas yog mas siv developers thaum txoj kev loj hlob daim ntawv.
  • HDFS: HDFS (Hadoop muab theej thiab faib cov ntaub ntawv kaw lus) yog ib qhov system distributed ntaub ntawv nyob hauv radon spans. Nws scales heev heev thiab muaj siab throughput. Ntaub ntawv blocks yog replicated thiab nyob rau hauv ib txoj kev distributed ntawm ib cheeb tsam clustered.
  • MapReduce: MapReduce yog ib tug qauv rau thaum uas tig mus ua kev loj ntim ntawm cov ntaub ntawv nyob rau hauv ib cheeb tsam distributed programming. Qhov kev pab cuam MapReduce muaj ob lub ntsiab Cheebtsam, hauv daim ntawv qhia no yog ib qho () txujci, uas co filtering thiab sorting. Lwm qhov ib yog tus txo tej () ib feem, tsim los ua txoj kev ntawm qhov zis los ntawm daim ntawv qhia ib sab.
  • Tsis tau ib qho Negotiator (XOV PAJ): Nws yuav yeej tau ib qho chaw tus thawj tswj muaj nyob rau hauv Hadoop 2. Lub luag hauj lwm ntawm xov PAJ yog los tswj thiab teem lub sij hawm xam kev pab nyob rau hauv ib cheeb tsam clustered.

Tam sim no, Peb cia saib lwm yam lus hauv Hadoop

  • HBase: HBase yog ib qhov chaw qhib, scalable, distributed thiab cov-paub database. Nws yog Java sau thiab raws li tus Google Rooj loj. Lwm cia cov ntaub ntawv uas yog HDFS.
  • Nas muv: Nas muv yog ntaub ntawv warehouse software, uas txhawb kev nyeem ntawv, sau ntawv thiab tswj loj ntim ntawm cov ntaub ntawv nyob rau hauv lub distributed cia lawv. Nws muab SQL li lus nug lus hu ua HiveQL (HQL), rau querying tus dataset. Nas muv txhawb cia nyob rau hauv HDFS thiab lwm yam ntaub ntawv uas tau tshaj lub Amazon S3 thiab lwm yam zoo ib yam li.
  • Tus npua Apache: Tus npua ntawd ib theem high platform rau ntau cov teeb tsom. Cov lus sau ntawv rau tus npua scripts yog hu ua tus npua Latin. Nws yeej abstracts lub MapReduce lwm pab thiab ua kom yooj yim rau developers mus ua hauj lwm rau hom MapReduce tsis tau sau ntawv rau lub chaws txoos.
  • Apache txim: Txim (qhib tau qhov twg los) sawv yog xam moj khaum thiab hais laij cav txog cov ntaub ntawv Hadoop (loj teev cov ntaub ntawv txheej). Nws yuav luag co 100 lub sij hawm sai dua MapReduce nco qas ntsoov. Thiab, rau disk, Nws tseem yuav luag 10 lub sij hawm sai. Txim tau khiav rau nyias tej kev kawm/hom zoo li hom stand-alone, nyob Hadoop, nyob rau EC2 thiab lwm yam. Nws nkag tau rau tej ntaub ntawv los ntawm HDFS, HBase, Nas muv los yog muaj lwm yam Hadoop tej ntaub ntawv tau qhov twg los.
  • Sqoop: Sqoop yog ib tug uas hais kom ua kab rho tej ntaub ntawv los ntawm RDBMS thiab Hadoop tej ntaub ntawv bases. Nws mas siv rau cov ntaub ntawv ntshuam/export sib paub thiab cov-paub databases. Lub npe ' Sqoop’ yuav ua combining pib thiab hnub ib sab ntawm ob tug lwm yam lus,Sql+Muajoop'.
  • Oozie: Oozie yeej yog ib tug Hadoop num txaus cav. Nws lub sij hawm ua hauj lwm nyob tsev uas tswj hauj lwm Hadoop.
  • ZooKeeper: Apache ZooKeeper yog ib lub platform qhib tau qhov twg los, uas muaj kev kawm siab pab ua kom sib haum Hadoop faib kev siv. Nws yog ib tug centralized qhov kev pab tuav ntaub ntawv configuration, naming npe, distributed synchronization thiab pab pawg.
  • Flume: Apache Flume yog ib qhov kev pab cuam distributed, mas siv rau tej ntaub ntawv, aggregation thiab zog. Nws ua haujlwm heev nraaj nrog loj npaum li cas cav thiab tej ntaub ntawv.
  • Hawj txawm: Hawj txawm yeej yog ib tug interface Web site rau cov ntsiab lus Hadoop. Nws yog ib qhov qhib tau qhov twg los, txhawb Hadoop thiab nws cov eco-suab. Lub hom phiaj tseem ceeb yog muab ua zoo dua tus neeg siv kev. Nws kev luag thiab chaw nco thiab ua tau editors rau txim, Nas muv thiab HBase thiab lwm yam.
  • Mahout: Mahut yog qhib tau qhov twg los software tsev kawm tshuab scalable thiab kev tsuas siv cov ntaub ntawv ceev ceev.
  • Ambari: Ambari yeej yog ib tug uas web los xyuas thiab tswj tej pawg ua ke Hadoop. Nws yog cov eco lawv cov kev pab thiab lwm yam cuab yeej xws li HDFS, MapReduce, HBase, ZooKeeper, Npua, Sqoop thiab lwm yam. Qhov tseem ceeb peb functionalities provisioning, tswj thiab xyuas tej pawg ua ke Hadoop.

Raws li Hadoop eco-uas yog tujtaws evolving, tshiab software, cov kev pab thiab lwm yam cuab yeej uas tseem hnubpoob. Yog li ntawd, yuav muaj cov lus tshiab thiab jargons nyob hauv lub ntiaj teb ntaub ntawv loj. Peb xav tau ib tug saib zoo zoo thiab nkag siab txog cov neeg nyob rau lub sij hawm.

Xaus

Nyob rau cov tshooj no peb twb sim qhia hais tias tseem ceeb tshaj tseem ceeb lus hauv Hadoop eco-lawv. Peb muaj tseem tham me ntsis txog cov eco-lawv thiab vim li cas peb yuav tsum paub txog cov ntsiab lus uas. Hadoop yog tam sim no muaj ib tug loj cub tshuab, kom tus neeg tau txais ntau koom tes nrog rau nws. Li ntawd, Nws yog lub sij hawm txoj kev to taub tej ntsiab lus yooj yim thiab ntsiab lus uas siv nyob rau hauv lub ntiaj teb Hadoop. Nyob rau lub neej tom ntej, yuav cov tshiab lub ntsiab lus muaj ntau, thiab peb yuav tsum hloov peb tus kheej kom haum.

Tagged: ,
============================================= ============================================== Yuav zoo TechAlpine phau ntawv rau Amazon
============================================== ---------------------------------------------------------------- electrician ct chestnutelectric
error

Txaus siab rau qhov blog? Tshaj tawm lus thov :)

Follow by Email
LinkedIn
LinkedIn
Share