How do you write your first Hive script?

Overview:

Apache Hive is part of the Hadoop ecosystem. Hive can be described as data-warehouse-like software that supports querying and managing large data sets stored on HDFS (the Hadoop Distributed File System). Keep in mind that Hive is not a complete data warehouse in itself; rather, it provides a mechanism for managing data in a distributed environment and querying it with an SQL-like language called HiveQL (Hive Query Language, also written HQL). A Hive script is a group of Hive commands bundled into one file so that they can be run together instead of one at a time. In this article I discuss Hive scripts and how to execute them.

Introduction:

HDFS, the Hadoop Distributed File System, provides scalable, fault-tolerant data storage. Hive adds a simple SQL-like query language on top of it: HiveQL. HiveQL also lets traditional MapReduce developers plug in their own custom mappers and reducers for more sophisticated analysis.

Limitations of Hive:

Hive queries usually have high latency because of the substantial overhead of submitting and scheduling jobs. Hive does not offer real-time queries or row-level updates. It is best suited to batch work such as log analysis.

Hive data units:

Hive data is organized into the following four categories:

  • Databases: Namespaces that separate tables and other data units from one another to avoid naming conflicts.
  • Tables: Homogeneous units of data that share a common schema. A commonly used example is a page view table in which each row has the following columns:
    • USERID
    • IPADDRESS
    • LAST ACCESSED
    • PAGE URL

This example records how individual users use a website or an application.

  • Partitions: Partitions determine how the data is stored. Each table can have one or more partitions. Partitions also help users efficiently find the rows that satisfy a given selection criterion.
  • Buckets or clusters: The data in each partition may be further subdivided into buckets (also called clusters or blocks). The data in the example above could be clustered on the user ID, the IP address, or the page URL column.
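
The table, partition, and bucket ideas above can be sketched as a small Hive script. This is an illustration only: the file name page_view.hql, the column types, the partition column access_date, the sample date, and the bucket count of 32 are all assumptions, not details from this article. The shell snippet writes the script and runs it only if a hive command is available.

```shell
#!/bin/sh
# Write a hypothetical Hive script that declares a partitioned, bucketed table.
# access_date is the (assumed) partition column: each distinct value is stored
# as its own partition. Within a partition, rows are hashed on userid into
# 32 buckets (the bucket count is an arbitrary choice for illustration).
cat > page_view.hql <<'EOF'
CREATE TABLE IF NOT EXISTS page_view (
  userid        BIGINT,
  ipaddress     STRING,
  last_accessed STRING,
  page_url      STRING
)
PARTITIONED BY (access_date STRING)
CLUSTERED BY (userid) INTO 32 BUCKETS
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';

SELECT page_url FROM page_view WHERE access_date = '2014-01-01';
EOF

# Run the script only when the hive CLI is installed.
if command -v hive >/dev/null 2>&1; then
  hive -f page_view.hql
else
  echo "hive not found; wrote page_view.hql without running it"
fi
```

Because the query filters on the partition column, Hive only needs to read the partition for that date rather than the whole table.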

Hive data types:

Depending on the need, Hive supports primitive and complex data types, as described below:

  • Primitive types:
    • INTEGERS
      • TINYINT 1-byte integer
      • SMALLINT 2-byte integer
      • INT 4-byte integer
      • BIGINT 8-byte integer
    • BOOLEAN
      • BOOLEAN TRUE or FALSE
    • FLOATING POINT numbers
      • FLOAT single precision
      • DOUBLE double precision
    • STRING type
      • STRING sequence of characters
  • Complex types: Complex types can be built from primitive data types and other composite types with the help of:
    • Structs
    • Maps, or key-value pairs
    • Arrays, or indexed lists
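
As a rough illustration of the types listed above, the following sketch writes a Hive script that declares one column per type. The file name type_demo.hql, the table name, and every column name are hypothetical; the script is run only if a hive command is found.

```shell
#!/bin/sh
# Write a hypothetical Hive script with one column per data type listed above.
# The first eight columns are primitive types; the last three are complex types.
cat > type_demo.hql <<'EOF'
CREATE TABLE IF NOT EXISTS type_demo (
  tiny_col   TINYINT,
  small_col  SMALLINT,
  int_col    INT,
  big_col    BIGINT,
  flag_col   BOOLEAN,
  float_col  FLOAT,
  double_col DOUBLE,
  text_col   STRING,
  address    STRUCT<city:STRING, zip:STRING>,
  attributes MAP<STRING, STRING>,
  tags       ARRAY<STRING>
);
DESCRIBE type_demo;
EOF

# Run the script only when the hive CLI is installed.
if command -v hive >/dev/null 2>&1; then
  hive -f type_demo.hql
else
  echo "hive not found; wrote type_demo.hql without running it"
fi
```

In a query, a struct field is read as address.city, a map value as attributes['key'], and an array element as tags[0].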

Hive scripting:

As in other scripting languages, a Hive script is used to execute a set of Hive commands together. Scripting reduces the time and effort spent writing and executing individual commands by hand. Hive scripting is supported in Hive 0.10.0 and later versions. To write and execute a Hive script here, we need the Cloudera distribution for Hadoop, CDH4, installed.

Writing Hive scripts:

First, open a terminal in your Cloudera CDH4 distribution and run the command below to create a Hive script.

Command: gedit sample.sql

As with other query languages, save the Hive script with a .sql extension by convention, so that it is easy to recognize as a file of query statements. Now open the file in edit mode and write the Hive commands that this script should execute. In this sample script we perform the following tasks in sequence: create a table, describe it, load data into it, and then retrieve the data from it.

· Create a table 'product_dtl' in Hive:

Command: create table product_dtl (product_id int, product_name string, product_price float, product_category string) row format delimited fields terminated by ',';

Here product_id, product_name, product_price, and product_category are the names of the columns in the 'product_dtl' table. The clause "fields terminated by ','" states that the columns in each line of the input file are separated by the ',' delimiter. You can use other delimiters as required. Rows, for example, are separated in the input file by the newline ('\n') character.
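
To illustrate the point about other delimiters, here is a sketch of the same table declared for tab-separated input, with the row delimiter written out explicitly. The table name product_dtl_tab and the file name product_tab.hql are hypothetical, chosen so that the table used in this walkthrough is left untouched.

```shell
#!/bin/sh
# Write a hypothetical variant of the table that expects tab-separated fields.
# The quoted 'EOF' keeps the shell from touching the backslash sequences,
# so Hive receives the literal text '\t' and '\n'.
cat > product_tab.hql <<'EOF'
CREATE TABLE IF NOT EXISTS product_dtl_tab (
  product_id INT,
  product_name STRING,
  product_price FLOAT,
  product_category STRING
)
ROW FORMAT DELIMITED
  FIELDS TERMINATED BY '\t'
  LINES TERMINATED BY '\n';
EOF

# Run the script only when the hive CLI is installed.
if command -v hive >/dev/null 2>&1; then
  hive -f product_tab.hql
else
  echo "hive not found; wrote product_tab.hql without running it"
fi
```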

· Describe the table:

Command: describe product_dtl;

· Load the data into the table:

Now let us look at loading the data. Create an input file that contains the records to be inserted into the table.

Command: sudo gedit input.txt

Next, add a few records to the input text file, as shown in the figure below.

Figure 1: The input file.

So our input file looks like this:

1,Laptop,45000,Computers
2,Pen,2,Stationery
3,Rice,64.45,Grocery
4,Furniture,65000,Interiors

To load the data from this file, we need to execute the following:

Command: load data local inpath '/home/cloudera/input.txt' into table product_dtl;
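
The input file and load statement above can also be prepared from the terminal without an editor. The sketch below makes a few assumptions: it works in the current directory rather than /home/cloudera, it writes the rows without spaces after the commas so that stray spaces do not become part of the values, and load_products.hql is a hypothetical file name. It only writes the two files; the load itself needs the product_dtl table to exist first.

```shell
#!/bin/sh
# Create the comma-delimited input file, one record per line.
cat > input.txt <<'EOF'
1,Laptop,45000,Computers
2,Pen,2,Stationery
3,Rice,64.45,Grocery
4,Furniture,65000,Interiors
EOF

# Write the LOAD statement with the absolute path of input.txt.
# The heredoc delimiter is unquoted here so that the shell expands $PWD.
cat > load_products.hql <<EOF
LOAD DATA LOCAL INPATH '$PWD/input.txt' INTO TABLE product_dtl;
EOF

# Show the generated statement for review.
cat load_products.hql
```

The LOCAL keyword tells Hive to read the file from the local file system rather than from HDFS.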

· Retrieve the data:

To retrieve the data, we use a simple select statement, as follows:

Command: select * from product_dtl;

This command fetches all the records from the table 'product_dtl'.

The script should look like the following figure:

Figure 2: The sample SQL file

Save this file as sample.sql and run the following command:

Command: hive -f /home/cloudera/sample.sql

When executing the script, give the full path of the script's location, as in the command above. If the script is in the current directory, the file name alone is enough.
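
Putting the steps together, the whole script can be written and run from the terminal roughly as follows. This is a sketch under some assumptions: it uses the current directory instead of /home/cloudera, it adds IF NOT EXISTS so that a second run does not fail on the create step, and it runs only when both the hive command and an input.txt file are present.

```shell
#!/bin/sh
# Assemble the four steps of the walkthrough into one script file.
# The heredoc delimiter is unquoted so that the shell expands $PWD.
cat > sample.sql <<EOF
CREATE TABLE IF NOT EXISTS product_dtl (
  product_id INT,
  product_name STRING,
  product_price FLOAT,
  product_category STRING
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
DESCRIBE product_dtl;
LOAD DATA LOCAL INPATH '$PWD/input.txt' INTO TABLE product_dtl;
SELECT * FROM product_dtl;
EOF

# Pass the script to hive by its full path, as the article recommends.
if command -v hive >/dev/null 2>&1 && [ -f input.txt ]; then
  hive -f "$PWD/sample.sql"
else
  echo "skipping run: hive or input.txt is missing"
fi
```

Note that running the script more than once loads the input file again each time, so the select will return duplicate rows.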

The following image shows that all the commands were executed successfully.

The output below shows that the table was created and that the data from our sample input file was stored in the database.

 

1 Laptop 45000 Computers

2 Pen 2 Stationery

3 Rice 64.45 Grocery

4 Furniture 65000 Interiors

Summary:

Before concluding our discussion, note the following points:

  • Apache Hive is part of the Hadoop ecosystem and works on data stored in HDFS
  • Hive provides an SQL-like query language, HiveQL
  • Hive scripts are easy to understand and use
  • Hive supports several primitive data types and complex data types.