What is HDFS federation?

Overview: Most readers are already familiar with the features of Hadoop and HDFS. This article discusses HDFS federation, which enhances the existing HDFS architecture. It provides a clear separation between the namespace and storage, which enables scalability and isolation at the cluster level.

Introduction: Hadoop federation separates the namespace layer from the storage layer, which makes block storage a layer in its own right. It also extends the architecture of an existing HDFS cluster to allow new implementations and use cases. The current HDFS architecture has two layers:

  • Namespace – This layer manages files, directories and blocks. It supports the basic file system operations, e.g. listing files, creating files, modifying files, and deleting files and folders.
  • Block Storage – This layer has two parts:
    • Block Management – This maintains the datanodes in the cluster and provides block operations such as creation, deletion, modification and lookup. It also takes care of replication management.
    • Physical Storage – This stores the blocks and provides read and write access to them.

Figure 1: An HDFS cluster

In the current HDFS architecture there is only one namespace for the entire cluster, and it is managed by a single namenode. This approach makes an HDFS cluster easy to implement. The layering works fine for small setups, but for large organizations that must handle large volumes of data at a rapid pace, e.g. Yahoo and Facebook, it has some limitations, which Hadoop federation addresses. Hadoop federation can therefore be defined as an advanced architecture that overcomes the limitations of the current HDFS implementation.

Let us look at these limitations:

  • Tightly coupled Block Storage and Namespace – In the current architecture the block storage and the namespace are tightly coupled, which makes alternative namenode implementations difficult and prevents other services from using the block storage directly.
  • Namespace Scalability – An HDFS cluster scales horizontally by adding datanodes, but namespaces cannot be added to an existing cluster horizontally. The namespace can only be scaled vertically, on a single namenode. The namenode keeps the entire file system metadata in memory, so the number of blocks, files and directories is limited to what fits in the memory of that single namenode.
  • Performance – File system operations are limited to the throughput of a single namenode, which currently supports 60,000 concurrent tasks. The next-generation Apache release is expected to support more than 100,000 concurrent tasks, which will require multiple namenodes.
  • Isolation – Most HDFS deployments run in a multi-tenant environment, where one cluster is shared by several organizations. In such a setup a single namespace cannot isolate one application or organization from another.
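
To make the namespace scalability limit more concrete, here is a rough back-of-the-envelope sketch in Python. The per-object cost of 150 bytes is an assumed rule of thumb and does not come from this article. The function names and heap sizes are hypothetical, and real memory usage depends on the Hadoop version and the workload.

```python
# Rough estimate of how many namespace objects (files, directories, blocks)
# fit in a namenode's heap. BYTES_PER_OBJECT is an assumption, not a
# measured value.
BYTES_PER_OBJECT = 150


def max_namespace_objects(heap_bytes, bytes_per_object=BYTES_PER_OBJECT):
    """Return the number of metadata objects that fit in the given heap."""
    if heap_bytes < 0 or bytes_per_object <= 0:
        raise ValueError("heap_bytes must be >= 0 and bytes_per_object > 0")
    return heap_bytes // bytes_per_object


def federated_capacity(heap_bytes_per_namenode):
    """With federation each namenode holds its own namespace,
    so the individual capacities simply add up."""
    return sum(max_namespace_objects(h) for h in heap_bytes_per_namenode)


if __name__ == "__main__":
    one_gib = 1024 ** 3
    # Hypothetical cluster: one namenode versus three, 64 GiB heap each.
    print("single namenode:", max_namespace_objects(64 * one_gib))
    print("three namenodes:", federated_capacity([64 * one_gib] * 3))
```

The point of the sketch is only the shape of the limit: with one namenode the ceiling is set by one machine's memory, while adding namenodes raises the total.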

HDFS Federation:

Hadoop federation allows the name service to scale horizontally. It uses several namenodes, or namespaces, which are independent of each other. These namenodes are federated, i.e. they do not require coordination with one another. The datanodes are used as common storage by all the namenodes. Each datanode registers with all the namenodes in the cluster. The datanodes send periodic reports (heartbeats and block reports) and respond to commands from the namenodes. A block pool is a set of blocks that belong to a single namespace. In a cluster, the datanodes store blocks for all the block pools. Each block pool is managed independently, which lets a namespace generate block IDs for new blocks without informing the other namespaces. If one namenode fails for any reason, the datanodes keep serving the other namenodes.
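
The relationship between namenodes, block pools and datanodes can be illustrated with a small toy model in Python. It is a sketch for intuition only, not how HDFS is implemented. The class names, the `BP-<nameservice>` pool naming and the integer block IDs are all assumptions made for this example.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class NameNode:
    """A namenode that owns one namespace and one block pool."""
    nameservice_id: str
    alive: bool = True
    _next_block: count = field(default_factory=lambda: count(1), repr=False)

    @property
    def block_pool_id(self):
        # Hypothetical naming scheme, used only in this sketch.
        return f"BP-{self.nameservice_id}"

    def allocate_block(self):
        # The ID is generated locally; no other namenode is consulted.
        return (self.block_pool_id, next(self._next_block))


@dataclass
class DataNode:
    """A datanode that stores blocks for every block pool in the cluster."""
    host: str
    pools: dict = field(default_factory=dict)  # pool ID -> set of block IDs

    def register(self, namenodes):
        """Register with every namenode by creating storage for its pool."""
        for nn in namenodes:
            self.pools.setdefault(nn.block_pool_id, set())

    def store(self, block):
        pool_id, block_id = block
        self.pools[pool_id].add(block_id)

    def serving_pools(self, namenodes):
        """Pools that are still served, i.e. whose namenode is alive."""
        return sorted(
            nn.block_pool_id
            for nn in namenodes
            if nn.alive and nn.block_pool_id in self.pools
        )


if __name__ == "__main__":
    nn1, nn2 = NameNode("ns1"), NameNode("ns2")
    dn = DataNode("dn-host1")
    dn.register([nn1, nn2])
    dn.store(nn1.allocate_block())
    dn.store(nn2.allocate_block())
    nn1.alive = False  # one namenode fails
    print(dn.serving_pools([nn1, nn2]))
```

Because every block is identified by its pool together with its ID, two namenodes can hand out the same numeric ID without clashing, which is why no coordination is needed in this model.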

A namespace and its block pool are together called a Namespace Volume. When a namespace or a namenode is deleted, the corresponding block pool on the datanodes is deleted as well. During a cluster upgrade, each namespace volume is upgraded as a unit.

Figure 2: An HDFS federation architecture

Benefits of Hadoop Federation:

Hadoop federation comes with several advantages, listed below:

  • Scalability and isolation – Multiple namenodes scale the file system namespace horizontally. This separates namespace volumes by user and by category of application, which adds to the isolation.
  • Generic storage service – The block pool abstraction allows new file systems to be built on top of block storage. New applications can be built directly on the block storage layer without using the file system interface. Custom categories of block pools, different from the default block pool, can also be built.
  • Simple design – Namenodes and namespaces are independent of each other, so there is hardly any scenario that requires changing the existing namenodes, and each namenode remains robust. Federation is also backward compatible: it integrates easily with existing single-namenode deployments, which keep working without any configuration changes.

Configuring an HDFS Federation:

The configuration of Hadoop federation is designed so that all the nodes in the cluster have the same configuration. It is carried out in the following steps:

  • Step 1 – The following parameter must be added to the existing configuration:
    • dfs.nameservices – This is configured with a comma-separated list of NameServiceIDs. Datanodes use this parameter to determine all the namenodes in the cluster.
  • Step 2 – The configuration keys for the following components must be suffixed with the corresponding NameServiceID in the common configuration file:
    • Namenode
    • Secondary NameNode
    • BackupNode
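
The suffixing rule in step 2 can be sketched in a few lines of Python. This is only an illustration: the helper names are hypothetical, and the list of base keys is limited to three namenode and secondary namenode address keys, so it is not a complete list of suffixable Hadoop properties.

```python
# Base keys that receive a NameServiceID suffix in this sketch.
# This is a deliberately short, illustrative list.
SUFFIXED_KEYS = (
    "dfs.namenode.rpc-address",
    "dfs.namenode.http-address",
    "dfs.namenode.secondary.http-address",
)


def parse_nameservices(value):
    """Split the comma-separated dfs.nameservices value into trimmed IDs."""
    return [part.strip() for part in value.split(",") if part.strip()]


def suffixed_keys(nameservices_value):
    """Map each NameServiceID to the property names it needs."""
    return {
        ns: [f"{key}.{ns}" for key in SUFFIXED_KEYS]
        for ns in parse_nameservices(nameservices_value)
    }


if __name__ == "__main__":
    for ns, keys in suffixed_keys("ns1,ns2").items():
        print(ns)
        for key in keys:
            print("  ", key)
```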

A sample configuration for two namenodes is shown below:

Listing 1: A sample configuration for two namenodes

[Code]
<configuration>
  <property>
    <name>dfs.nameservices</name>
    <value>ns1,ns2</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns1</name>
    <value>nn-host1:6600</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns1</name>
    <value>nn-host1:8080</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address.ns1</name>
    <value>snn-host1:8080</value>
  </property>
  <property>
    <name>dfs.namenode.rpc-address.ns2</name>
    <value>nn-host2:6600</value>
  </property>
  <property>
    <name>dfs.namenode.http-address.ns2</name>
    <value>nn-host2:8080</value>
  </property>
  <property>
    <name>dfs.namenode.secondary.http-address.ns2</name>
    <value>snn-host2:8080</value>
  </property>
</configuration>
[/Code]
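
A configuration like this one is easy to get subtly wrong, for example by listing a NameServiceID without giving it an RPC address. The following Python sketch shows one possible sanity check. The function names are hypothetical and the embedded sample is a shortened stand-in for a real configuration file; the script is not part of Hadoop.

```python
import xml.etree.ElementTree as ET

# Shortened sample in the standard Hadoop configuration layout.
SAMPLE = """
<configuration>
  <property><name>dfs.nameservices</name><value>ns1,ns2</value></property>
  <property><name>dfs.namenode.rpc-address.ns1</name><value>nn-host1:6600</value></property>
  <property><name>dfs.namenode.rpc-address.ns2</name><value>nn-host2:6600</value></property>
</configuration>
"""


def load_properties(xml_text):
    """Parse <property> elements into a {name: value} dictionary."""
    root = ET.fromstring(xml_text.strip())
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        if name:
            props[name.strip()] = (prop.findtext("value") or "").strip()
    return props


def missing_rpc_addresses(props):
    """Return the NameServiceIDs that have no RPC address configured."""
    raw = props.get("dfs.nameservices", "")
    ids = [s.strip() for s in raw.split(",") if s.strip()]
    return [ns for ns in ids if f"dfs.namenode.rpc-address.{ns}" not in props]


if __name__ == "__main__":
    props = load_properties(SAMPLE)
    print("missing:", missing_rpc_addresses(props))
```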

Formatting the Namenodes: Let us look at the commands for formatting the namenodes.

  • Step 1 – One namenode must be formatted using the following command:

$HADOOP_PREFIX_HOME/bin/hdfs namenode -format [-clusterId <cluster_id>]

The cluster ID must be unique and must not conflict with that of any other existing cluster. If it is not provided, a cluster ID is generated at the time of formatting.

  • Step 2 – Each additional namenode must be formatted using the following command:

$HADOOP_PREFIX_HOME/bin/hdfs namenode -format -clusterId <cluster_id>

It is important that the cluster ID given here is the same as the one used in step 1. If the two differ, the additional namenode will not be part of the federated cluster.
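
Since a mismatched cluster ID silently leaves a namenode outside the federation, it can help to generate the format commands from a single value. The Python sketch below only builds the command lines and does not run them. The function name, host names and cluster ID are hypothetical, and the sketch passes the cluster ID explicitly to the first namenode as well, which the optional flag in step 1 allows.

```python
def format_commands(namenode_hosts, cluster_id, hadoop_home="$HADOOP_PREFIX_HOME"):
    """Return (host, argv) pairs for formatting each namenode.

    Every command carries the same cluster ID, so all namenodes
    end up in the same federated cluster.
    """
    if not cluster_id:
        raise ValueError("cluster_id must be set so that all namenodes share it")
    hdfs = f"{hadoop_home}/bin/hdfs"
    return [
        (host, [hdfs, "namenode", "-format", "-clusterId", cluster_id])
        for host in namenode_hosts
    ]


if __name__ == "__main__":
    # Hypothetical hosts and cluster ID, for illustration only.
    for host, argv in format_commands(["nn-host1", "nn-host2"], "CID-example"):
        print(f"{host}: {' '.join(argv)}")
```

The default `hadoop_home` is the literal placeholder used in this article, which is fine for printing; pass a real installation path before handing the argument list to something that executes it on the target host.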

Starting and stopping the cluster: Let us look at the commands to start and stop the cluster.

  • Start the cluster – The cluster can be started by executing the following command:

$HADOOP_PREFIX_HOME/bin/start-dfs.sh

  • Stop the cluster – The cluster can be stopped by executing the following command:

$HADOOP_PREFIX_HOME/bin/stop-dfs.sh

Adding a new namenode to an existing cluster: We have already seen that multiple namenodes are at the core of Hadoop federation. It is therefore important to understand the steps for adding new namenodes and scaling horizontally.
The following steps are needed to add new namenodes:

  • The configuration parameter dfs.nameservices must be added to the configuration.
  • The configuration must be updated with the NameServiceID suffix.
  • The configuration for the new namenode must be added to the configuration files.
  • The configuration files must be propagated to all the nodes in the cluster.
  • Start the new namenode and its secondary or backup node.
  • Refresh the datanodes so that they pick up the newly added namenode by running the following command:

$HADOOP_PREFIX_HOME/bin/hdfs dfsadmin -refreshNamenodes <datanode_host_name>:<datanode_rpc_port>

  • The above command must be run against every datanode in the cluster.
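
Because the refresh has to be repeated for every datanode, it is a natural candidate for a small script. The Python sketch below builds one `hdfs dfsadmin -refreshNamenodes` command line per datanode without executing anything. The function name, host names and port number are hypothetical, so use the RPC port your datanodes are actually configured with.

```python
def refresh_commands(datanode_hosts, rpc_port, hadoop_home="$HADOOP_PREFIX_HOME"):
    """Return one refresh command (as an argument list) per datanode."""
    hdfs = f"{hadoop_home}/bin/hdfs"
    return [
        [hdfs, "dfsadmin", "-refreshNamenodes", f"{host}:{rpc_port}"]
        for host in datanode_hosts
    ]


if __name__ == "__main__":
    # Hypothetical datanode hosts and port, for illustration only.
    for argv in refresh_commands(["dn-host1", "dn-host2", "dn-host3"], 9867):
        print(" ".join(argv))
```

Keeping the commands as argument lists makes it straightforward to review them first and then run them, for instance with `subprocess.run(argv, check=True)`, once `hadoop_home` points at a real installation.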

Summary: HDFS federation was introduced to overcome the limitations of the earlier HDFS implementation. Adding scalability at the namespace layer is the most important feature of the HDFS federation architecture. HDFS federation is also backward compatible, so a single-namenode configuration keeps working without any changes.
Let us summarize the discussion in the following points:

  • HDFS federation separates the namespace layer from the storage layer.
  • HDFS federation is designed to overcome the limitation of the existing HDFS architecture, in which storage can scale horizontally but the namespace cannot.
  • HDFS federation offers the following advantages:
    • Isolation
    • Scalability
    • Simple design
  • HDFS federation is easy to configure and easy to manage.

 
