What is Hadoop Streaming?

Overview: Hadoop streaming is a utility that ships with the Hadoop distribution. The basic idea of the Hadoop framework is to split a job, process the pieces in parallel, and then join the results to produce the final output. Two main components are involved in this framework:
a) Map application
b) Reduce application

The Hadoop streaming utility allows you to write Map/Reduce applications in any language that is capable of working with STDIN and STDOUT.
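As a rough illustration of that STDIN/STDOUT contract, here is a minimal word-count sketch in Python. It is not taken from the Hadoop distribution: the function names `map_lines` and `reduce_lines`, and the convention of passing `map` or `reduce` as the first argument, are assumptions made for this example. The mapper writes one tab-separated `word<TAB>1` record per word, and the reducer assumes its input arrives sorted by key, which is how the framework delivers it between the two phases.

```python
import sys
from itertools import groupby


def map_lines(lines):
    """Mapper: emit one 'word<TAB>1' record for every word in the input."""
    for line in lines:
        for word in line.split():
            yield f"{word}\t1"


def reduce_lines(lines):
    """Reducer: sum the counts per word. Input must already be sorted by key."""
    # Skip blank lines, then split each record into (word, count).
    pairs = (line.rstrip("\n").split("\t", 1) for line in lines if line.strip())
    for word, group in groupby(pairs, key=lambda kv: kv[0]):
        yield f"{word}\t{sum(int(count) for _, count in group)}"


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical convention: the first argument selects the phase.
    step = map_lines if sys.argv[1] == "map" else reduce_lines
    for record in step(sys.stdin):
        print(record)
```

Saved under a hypothetical name such as wordcount.py, the script can be tried locally by piping a text file through `python wordcount.py map | sort | python wordcount.py reduce`, where `sort` stands in for the shuffle step. On a cluster the same two commands would be handed to the streaming jar through its `-mapper` and `-reducer` options, with the script shipped alongside the job.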

One thought on “What is Hadoop Streaming?”

  1. Dingcheng Li

    I read the introductory article on Hadoop streaming and found it very useful. But I have several questions about how to use it.

    One of the main questions I would like to ask is: if my Perl script needs more than one argument, how can I pass them on the command line?

    For example, I used the following command, where I give multiple inputs to handle the multiple arguments. In reality, though, only the first one is input data. All the others are just resources that the Perl script needs to read in to help process the first input.

    hadoop jar /usr/lib/hadoop-mapreduce/hadoop-streaming.jar -D mapred.reduce.tasks=0 -D mapred.map.tasks.speculative.execution=false -D mapred.task.timeout=12000000 -input nlp_research/edt_nlp_data/3000001.txt -input shift.txt -input lists -input dict -input nlp_research/deid-1.1/deid.config -inputformat org.apache.hadoop.mapred.lib.NLineInputFormat -output perl_output -mapper deid_mapper.pl -file deid_mapper.pl

    If you could give me some pointers, that would be great!
