DOM and SAX Parsers in Java

Java Parser

How do you write a Java parser?

Overview:

A parser is an important component of any programming language. Many open source parsers are available, so a developer can usually pick one that fits the requirement. Sometimes, though, no suitable parser is freely available, and developers have to write custom parsers in languages such as Java or C. The usual reasons for writing a custom parser are problems with the existing ones: performance, complexity, flaws in parsing, or a mismatch with the requirement.

In this article we will look at how parsing is done in Java and review some popular Java parsers.







What is parsing and what is a parser?

Before going into the details, we need to know what the terms 'parsing' and 'parser' mean. Let us have a look.

In simple words, parsing is a mechanism for breaking a block of data into smaller pieces according to a set of defined rules. The pieces can then be interpreted, modified, or managed as needed.

A parser is the software component that breaks the data into those smaller chunks. A parser can be written in any language, depending on the requirement.
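To make the definition concrete, here is a minimal sketch of a hand-written text parser. The input format (`key=value` pairs separated by `;`) and the class name `KeyValueParserSketch` are assumptions made for illustration only; they do not come from any library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper for illustration: parses "key=value" pairs separated by ';'.
final class KeyValueParserSketch {

    // Breaks the input into pieces by rule: ';' separates pairs, '=' separates key from value.
    static Map<String, String> parse(String input) {
        Map<String, String> result = new LinkedHashMap<>();
        for (String pair : input.split(";")) {
            String trimmed = pair.trim();
            if (trimmed.isEmpty()) {
                continue; // ignore empty chunks, for example after a trailing ';'
            }
            int eq = trimmed.indexOf('=');
            if (eq < 0) {
                throw new IllegalArgumentException("Missing '=' in: " + trimmed);
            }
            String key = trimmed.substring(0, eq).trim();
            String value = trimmed.substring(eq + 1).trim();
            result.put(key, value);
        }
        return result;
    }

    public static void main(String[] args) {
        // prints {firstname=Kaushik, salary=85000}
        System.out.println(parse("firstname=Kaushik; salary=85000"));
    }
}
```

The rules here are deliberately tiny. A real custom parser would also have to decide how to handle escaping, duplicate keys, and malformed input.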

What are the different types of parsers in Java?

Parsers can be categorized in several ways. In one classification, a parser is either sequential or random. In a sequential parser, only the data currently being parsed is accessible; the parser cannot move back and forth. In a random parser, the parsed data is cached, so any part of it can be accessed at any time. SAX and StAX parsers are examples of sequential parsers, and the XML DOM parser is an example of a random parser.
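The difference between the two access styles can be sketched with the standard JDK APIs. The class name `AccessStyleSketch` and the tiny `<items>` document are assumptions for illustration. The first method walks the document once with a StAX cursor; the second loads a DOM tree whose nodes can then be read in any order.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.stream.XMLInputFactory;
import javax.xml.stream.XMLStreamConstants;
import javax.xml.stream.XMLStreamReader;
import org.w3c.dom.Document;
import org.xml.sax.InputSource;

// Hypothetical demo class: contrasts sequential (StAX) and random (DOM) access.
final class AccessStyleSketch {

    static final String XML = "<items><item>a</item><item>b</item><item>c</item></items>";

    // Sequential: the cursor only moves forward; once an event is passed it cannot be revisited.
    static List<String> readSequentially(String xml) {
        List<String> texts = new ArrayList<>();
        try {
            XMLStreamReader reader =
                    XMLInputFactory.newInstance().createXMLStreamReader(new StringReader(xml));
            while (reader.hasNext()) {
                int event = reader.next();
                if (event == XMLStreamConstants.START_ELEMENT
                        && "item".equals(reader.getLocalName())) {
                    texts.add(reader.getElementText());
                }
            }
            reader.close();
        } catch (Exception e) {
            throw new IllegalStateException("Could not read XML", e);
        }
        return texts;
    }

    // Random: the whole tree is kept in memory, so nodes can be fetched in any order.
    static Document loadTree(String xml) {
        try {
            return DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
        } catch (Exception e) {
            throw new IllegalStateException("Could not build DOM tree", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(readSequentially(XML)); // prints [a, b, c]
        Document doc = loadTree(XML);
        // Jump to the last item first, then back to the first one.
        System.out.println(doc.getElementsByTagName("item").item(2).getTextContent()); // prints c
        System.out.println(doc.getElementsByTagName("item").item(0).getTextContent()); // prints a
    }
}
```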

In another classification, parsers are divided into text parsers and XML parsers. A text parser parses plain text, whereas an XML parser parses XML documents (JSON data is handled by analogous JSON parsers). In this article we will concentrate on the popular Java DOM and SAX parsers and their examples.


DOM parser and SAX parser

DOM (Document Object Model) is a standard interface for working with XML data. XML parsers are written by implementing this interface. DOM parsers are random parsers, and they are the right choice when

  • Information about the structure of the document is very important
  • You need to move freely around that structure

A DOM parser provides several Java interfaces and methods for working with XML data. It returns a tree structure containing every element of the XML document, and that tree can be traversed to work with the data.
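As a rough illustration of traversing the tree a DOM parser returns, the sketch below walks the document recursively and records each element name indented by its depth. The class name `DomTreeWalkSketch` and the inline sample document are assumptions for illustration; the DOM calls themselves are the standard `org.w3c.dom` API.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical demo class: prints an indented outline of an XML document.
final class DomTreeWalkSketch {

    // Parses the XML into a DOM tree and returns one outline line per element.
    static List<String> outline(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new InputSource(new StringReader(xml)));
            List<String> lines = new ArrayList<>();
            walk(doc.getDocumentElement(), 0, lines);
            return lines;
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
    }

    // Depth-first traversal: record the element, then visit its children.
    private static void walk(Node node, int depth, List<String> lines) {
        if (node.getNodeType() != Node.ELEMENT_NODE) {
            return; // skip text, comment, and other non-element nodes
        }
        StringBuilder line = new StringBuilder();
        for (int i = 0; i < depth; i++) {
            line.append("  ");
        }
        lines.add(line.append(node.getNodeName()).toString());
        NodeList children = node.getChildNodes();
        for (int i = 0; i < children.getLength(); i++) {
            walk(children.item(i), depth + 1, lines);
        }
    }

    public static void main(String[] args) {
        String xml = "<company><employee empid=\"3931\">"
                + "<firstname>Kaushik</firstname><salary>85000</salary>"
                + "</employee></company>";
        for (String line : outline(xml)) {
            System.out.println(line);
        }
    }
}
```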

SAX (Simple API for XML) is a sequential, event-based parser. It parses the XML document in order, from beginning to end. It does not build a tree structure for the parse; instead it sends event notifications while parsing. SAX is suitable when

  • Processing has to be done sequentially
  • The XML document is very large
  • The XML does not contain complex nesting
  • Only a part of the XML document needs to be processed

A SAX parser provides interfaces with callback methods that are invoked to report events during parsing.
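The callback style can be sketched as follows. The class name `SaxEventsSketch` and the event-log format are assumptions for illustration; `DefaultHandler`, `startElement`, `characters`, and `endElement` are the standard SAX callbacks shipped with the JDK.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: records the SAX events in the order the parser reports them.
final class SaxEventsSketch {

    static List<String> events(String xml) {
        final List<String> log = new ArrayList<>();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                log.add("start:" + qName);
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                // Note: a parser may deliver one text node in several chunks.
                String text = new String(ch, start, length).trim();
                if (!text.isEmpty()) {
                    log.add("text:" + text);
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                log.add("end:" + qName);
            }
        };
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return log;
    }

    public static void main(String[] args) {
        String xml = "<employee><firstname>Kaushik</firstname></employee>";
        for (String event : events(xml)) {
            System.out.println(event);
        }
    }
}
```

No tree is kept here: once a callback returns, the parser moves on, and anything the program wants to remember has to be stored by the handler itself.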







How do you implement a DOM parser in Java?

In this section we will work with an XML document and a DOM parser. The sample XML file below contains the employee data of a company and is the input file for the parser.

The root element is 'company', at the top of the document. Each 'employee' element below it holds the details of one employee, such as name and salary. Parsing starts from the root element and works downward.

Listing 1: Sample XML file

[code]
<?xml version="1.0"?>
<company>
  <employee empid="3931">
    <firstname>Kaushik</firstname>
    <lastname>Npawg</lastname>
    <nickname>Kaushik</nickname>
    <salary>85000</salary>
  </employee>
  <employee empid="4932">
    <firstname>E-mail</firstname>
    <lastname>saparoff</lastname>
    <nickname>E-mail</nickname>
    <salary>95000</salary>
  </employee>
  <employee empid="5935">
    <firstname>Siav</firstname>
    <lastname>Maum muas lwj</lastname>
    <nickname>Siav</nickname>
    <salary>90000</salary>
  </employee>
</company>
[/code]







Now we will create a Java parser that uses the DOM parsing model. The program follows these steps to extract the data.

  • In the import section, include all the XML-related packages
  • Access the input data file and create the document builder
  • Extract the root element
  • Build a list of the 'employee' elements
  • Iterate through the list and extract the values

Listing 2: Implementing the DOM parser

[code]
// Declare the package
package com.eduonix.xml;

// Import all required packages
import java.io.File;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;

// Create the public parser class
public class TestDomParser {

    public static void main(String[] args) {
        try {
            // Access the input data file and create the document builder
            File inputDataFile = new File("InputData.txt");
            DocumentBuilderFactory dbldrFactory = DocumentBuilderFactory.newInstance();
            DocumentBuilder docBuilder = dbldrFactory.newDocumentBuilder();
            Document docmt = docBuilder.parse(inputDataFile);
            docmt.getDocumentElement().normalize();

            // Extract the root element
            System.out.println("Name of the root element: "
                    + docmt.getDocumentElement().getNodeName());

            // Build the list of employee elements
            NodeList ndList = docmt.getElementsByTagName("employee");
            System.out.println("*****************************");

            // Iterate through the list and extract the values
            for (int tempval = 0; tempval < ndList.getLength(); tempval++) {
                Node nd = ndList.item(tempval);
                System.out.println("\nCurrent element name: " + nd.getNodeName());
                if (nd.getNodeType() == Node.ELEMENT_NODE) {
                    Element elemnt = (Element) nd;
                    System.out.println("Employee ID: " + elemnt.getAttribute("empid"));
                    System.out.println("Employee first name: "
                            + elemnt.getElementsByTagName("firstname").item(0).getTextContent());
                    System.out.println("Employee last name: "
                            + elemnt.getElementsByTagName("lastname").item(0).getTextContent());
                    System.out.println("Employee nickname: "
                            + elemnt.getElementsByTagName("nickname").item(0).getTextContent());
                    System.out.println("Employee salary: "
                            + elemnt.getElementsByTagName("salary").item(0).getTextContent());
                }
            }
        } catch (Exception e) {
            // Catch and print the exception, if any
            e.printStackTrace();
        }
    }
}
[/code]







Now compile and run the Java program, keeping the XML file in the location the program expects (here, a file named InputData.txt in the working directory). The output should look like the following. It shows all the employee details present in the XML.

Compiling the source code….

$javac com/eduonix/xml/TestDomParser.java 2>&1

Executing the program….

$java -Xmx128M -Xms16M com/eduonix/xml/TestDomParser

Name of the root element: company

*****************************

Current element name: employee

Employee ID: 3931

Employee first name: Kaushik

Employee last name: Npawg

Employee nickname: Kaushik

Employee salary: 85000

Current element name: employee

Employee ID: 4932

Employee first name: E-mail

Employee last name: saparoff

Employee nickname: E-mail

Employee salary: 95000

Current element name: employee

Employee ID: 5935

Employee first name: Siav

Employee last name: Maum muas lwj

Employee nickname: Siav

Employee salary: 90000

Parser guidelines

With parsers, the best approach depends on the volume of data and on the requirement. A text parser is suitable when you are parsing text input, tokenizing or splitting it, and then using the resulting data. XML parsers are the right tool when the input arrives as XML. The following guidelines are commonly followed in XML parsing.

The DOM parser is well suited when the number of elements is within about 1,000 and the elements need to be accessed freely. However, because DOM builds the whole tree structure before any processing starts, performance is an important consideration. For that reason, DOM is not recommended when only a part of an XML document has to be processed.

SAX is well suited to large XML documents, whatever their structure and elements. It is lightweight and designed for sequential parsing of XML documents. Because it does not build a tree structure, its performance is better than that of the DOM parser.
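As a hedged sketch of why SAX suits large documents, the example below adds up every salary value in a single pass and keeps only a running total in memory. The class name `SaxSalaryTotalSketch` is hypothetical, and the `salary` element name is assumed to match the sample employee document used in this article.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical demo class: sums <salary> values while streaming, without building a tree.
final class SaxSalaryTotalSketch {

    static long totalSalary(String xml) {
        final long[] total = {0};
        DefaultHandler handler = new DefaultHandler() {
            private final StringBuilder text = new StringBuilder();
            private boolean inSalary;

            @Override
            public void startElement(String uri, String localName, String qName, Attributes attrs) {
                if ("salary".equals(qName)) {
                    inSalary = true;
                    text.setLength(0); // start collecting a fresh value
                }
            }

            @Override
            public void characters(char[] ch, int start, int length) {
                if (inSalary) {
                    text.append(ch, start, length); // text may arrive in several chunks
                }
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                if ("salary".equals(qName)) {
                    total[0] += Long.parseLong(text.toString().trim());
                    inSalary = false;
                }
            }
        };
        try {
            SAXParserFactory.newInstance()
                    .newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XML", e);
        }
        return total[0];
    }

    public static void main(String[] args) {
        String xml = "<company>"
                + "<employee><salary>85000</salary></employee>"
                + "<employee><salary>95000</salary></employee>"
                + "</company>";
        System.out.println(totalSalary(xml)); // prints 180000
    }
}
```

The same total could be computed with DOM, but only after the entire document has been loaded into memory, which is the cost the guidelines above warn about.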







Conclusion:

Parsing is a part of any programming language. Java has its own support for parsing text and XML data. In this article we covered parsing as a generic concept and then discussed specific parsers such as DOM and SAX. In the example section we walked through a DOM parser and its implementation. We closed with the guidelines commonly followed in the industry.
