Language tags in HTML and XML


Terminology

In this article we refer to the value of a language attribute such as fr-CA as a language tag. The fr and CA parts are referred to as subtags when described as parts of a tag. When described as members of an ISO list of languages or countries, fr and CA are referred to as codes.

Language tags can be (and should be) used to indicate the language of text in HTML and XML documents. For HTML 4, language tags are specified with the lang attribute. For XML, language tags are given in the xml:lang attribute. In both cases, language information is inherited along the document hierarchy, i.e. it has to be given only once if the whole document is in one language, and language information nests, i.e. inner attributes overwrite outer attributes.

Language tags are defined in RFC 3066, which obsoletes the older RFC 1766. XML has been updated to use RFC 3066 by an erratum. RFC 3066 is based on ISO 639 two-letter and three-letter language codes, and on ISO 3166 two-letter country codes. RFC 1766 did not include three-letter language codes.
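The inheritance and nesting behaviour described above can be sketched in a few lines of Python. This is only an illustration: the function name effective_languages and the sample document are assumptions made for this example, and the sketch relies on the standard xml.etree.ElementTree module, which exposes xml:lang under the XML namespace URI.

```python
import xml.etree.ElementTree as ET

# ElementTree reports xml:lang using the reserved XML namespace URI.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def effective_languages(element, inherited=None):
    """Yield (element name, language tag) pairs for element and its descendants.

    An element without its own xml:lang inherits the value of its nearest
    ancestor; an element with xml:lang overrides the inherited value.
    """
    lang = element.get(XML_LANG, inherited)
    yield element.tag, lang
    for child in element:
        yield from effective_languages(child, lang)


# Hypothetical sample document: English overall, one French (Canada) paragraph.
doc = ET.fromstring(
    '<doc xml:lang="en">'
    "<p>Hello</p>"
    '<p xml:lang="fr-CA">Bonjour <em>tous</em></p>'
    "</doc>"
)
result = list(effective_languages(doc))
```

In this sketch the first p element inherits en from the root, while the em element inherits fr-CA from the paragraph that contains it.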
Examples include:

| Code | Language | Explanation |
| en | English | ISO 639 two-letter language code |
| mas | Masai | ISO 639 three-letter language code |
| fr-CA | French as used in Canada | ISO 639 two-letter code with ISO 3166 two-letter country code |
| en-scouse | English Liverpudlian dialect known as 'Scouse' | ISO 639 two-letter language code with addition, IANA-registered |
| i-klingon | Klingon | IANA-registered language code |
| x-pig-latin | Pig Latin | Unregistered/Experimental |

Language tags starting with i- are defined in the IANA registry of language tags. Language tags starting with x- denote experimental tags with no guarantee of uniqueness. The list of ISO 639 two-letter and three-letter language codes is provided by the ISO 639-2 Registration Authority (Library of Congress, USA).

According to RFC 3066, for languages with both a two-letter and a three-letter code, the two-letter code must be used. This also solves the problem of those languages that have two different three-letter codes, because all of them also have a two-letter code.

XML now also provides a means to prevent inheritance of language using the empty string, i.e. xml:lang="". Essentially, this says: I do not want to associate any language with this information.

The remainder of this article provides additional detail on how to use language tags.

RFC 3066 rules

RFC 3066 is the standard that defines how to use language tags to identify languages.

A language tag is composed of a primary subtag, followed by zero or more additional subtags, separated by hyphens.

The primary subtag represents a language (there are two possible exceptions, i- and x-, which are described below), and any following subtags serve to qualify the dialect or usage of the language. These latter subtags typically represent countries, dialects or scripts.

For example, the tag en-GB indicates that a document is written not just in English but in British English, as opposed to, say, US English.

Subtags are case insensitive; they can include the letters and digits A to Z, a to z and 0 to 9; and they must be 8 characters or fewer in length.
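The subtag syntax just described (hyphen-separated subtags, each made of 1 to 8 ASCII letters or digits) can be checked mechanically. The following is a minimal sketch; the function name split_tag is an assumption for this example, and it checks only the syntax stated above, not whether the codes are actually registered.

```python
import re

# One subtag: 1 to 8 ASCII letters or digits, as described in the text.
SUBTAG = re.compile(r"[A-Za-z0-9]{1,8}")


def split_tag(tag):
    """Split a language tag into subtags, rejecting malformed ones.

    Raises ValueError if any subtag is empty, longer than 8 characters,
    or contains characters other than ASCII letters and digits.
    """
    subtags = tag.split("-")
    for subtag in subtags:
        if not SUBTAG.fullmatch(subtag):
            raise ValueError(f"malformed subtag {subtag!r} in {tag!r}")
    return subtags


def is_well_formed(tag):
    """Return True if the tag passes the syntax check, False otherwise."""
    try:
        split_tag(tag)
    except ValueError:
        return False
    return True
```

For instance, is_well_formed("x-pig-latin") holds under these rules, while a tag with a nine-character subtag or a doubled hyphen is rejected.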
Note that the HTML specification still recommends the use of RFC 1766 for identifying language. RFC 3066 is an update of RFC 1766 that supersedes it, and there is a planned erratum in place for the HTML specification, so you should use RFC 3066 despite what the HTML specification currently says. RFC 3066 merely expands and clarifies the possibilities for specifying languages. If you have been using RFC 1766 you should not need to make any changes to your tag in order to start using RFC 3066.

A proposed successor to RFC 3066 is currently being developed, but it aims to retain backwards compatibility with tags created using RFC 3066.

The primary subtag

All subtags in initial position must be 1, 2 or 3 letters in length. All 2 and 3 letter subtags in this position must be language codes from ISO 639 part 2, which defines codes to represent languages. 1 letter subtags must be one of the prefixes i- or x- we will describe later.

Although the codes are case insensitive, they are commonly written lowercased, but this is merely a convention.

Note also that, where ISO offers a choice between 2-letter and 3-letter codes, you should choose the 2-letter one. This ensures that for each language, as far as possible, a unique code is used. Older data using two-letter codes (based on RFC 1766, which did not allow three-letter codes) does not need to be changed. Also, the question of which three-letter code to use is avoided, since the few languages that have two different three-letter codes all have a two-letter code.

Additional subtags

Subtags can be added to indicate geographic, dialectal, script, or other refinements to the primary (language) tag. Any number of subtags can follow the primary tag, although it is unusual to see more than one.

RFC 3066 specifies that any 2-letter tags in the second subtag must be ISO 3166 country codes. There are no rules for any third and subsequent subtags that are used.

Two-letter ISO subtags indicating country are commonly written uppercase, but this is only a convention.

Special primary subtags

RFC 3066 defines a couple of instances where the language tag might not begin with an ISO language code.
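The rules for the primary subtag and the casing conventions above can be illustrated with a short Python sketch. The function names classify_primary and conventional_case, and the three category labels, are assumptions made for this example. The sketch only inspects the shape of the subtags; it does not look anything up in the ISO or IANA lists.

```python
def classify_primary(tag):
    """Classify a tag by its primary subtag, following the rules above.

    Returns "iso-639" for a 2- or 3-letter primary subtag, "iana-registered"
    for the i- prefix and "user-defined" for the x- prefix. Any other primary
    subtag raises ValueError.
    """
    primary = tag.split("-")[0].lower()
    if not (primary.isascii() and primary.isalpha()):
        raise ValueError(f"primary subtag must be letters only: {tag!r}")
    if primary == "i":
        return "iana-registered"
    if primary == "x":
        return "user-defined"
    if len(primary) in (2, 3):
        return "iso-639"
    raise ValueError(f"unexpected primary subtag in {tag!r}")


def conventional_case(tag):
    """Rewrite a tag using the common (but optional) casing conventions.

    The primary subtag is lowercased and a 2-letter second subtag, which
    must be a country code, is uppercased. Other subtags are left alone.
    """
    subtags = tag.split("-")
    subtags[0] = subtags[0].lower()
    if len(subtags) > 1:
        second = subtags[1]
        if len(second) == 2 and second.isascii() and second.isalpha():
            subtags[1] = second.upper()
    return "-".join(subtags)
```

Because tags are case insensitive, conventional_case changes only how a tag looks, never what it means; comparing tags should still be done without regard to case.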
A language tag that begins with i- is reserved for IANA-registered language tags. Examples include:

i-mingo
i-klingon
i-tao

A language tag that begins with x- provides a mechanism for user-defined language tags. The second subtag must be more than one letter long, and must not be one of the following reserved subtags: AA, QM-QZ, XA-XZ, and ZZ. For example:

x-mylanguage

Of course, neither of these approaches should be used to identify a language if the approach based on initial two- or three-letter ISO codes is available. These methods restrict or prevent interoperable language tag recognition.

IANA-registered language tags

It is possible to register language tags with IANA using the submission process described in RFC 3066. These tags can have 3- to 8-letter subtags in the second position.

While the i- prefix is reserved specifically for IANA tags, not all IANA tags begin with it. For example, a number of Chinese dialects have been registered with IANA. These include zh-guoyu, zh-hakka, zh-min, zh-min-nan, zh-wuu, etc.

Registering tags with IANA is better than using user-defined tags because it maximizes the likelihood of interoperability, since the IANA tags are visible to others. On the other hand, IANA tags may be deprecated as new codes are added to the ISO standard. For this reason, there may be some risk to long-term interoperability when using certain IANA-registered tags. This is particularly likely to apply to tags beginning with the i- prefix.

IANA tags that have been deprecated at the time this tutorial was published include no-bok (Norwegian "Book language" - use ISO 639 nb), i-navajo (Navajo - use ISO 639 nv), i-lux (Luxembourgish - use ISO 639 lb), and others.
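One way to cope with deprecated tags in existing content is a small lookup table. The sketch below, in Python, contains only the three replacements named above; the names DEPRECATED_TAGS and preferred_tag are assumptions for this example, and a real table would have to be built from the IANA registry itself.

```python
# Deprecated IANA tags mentioned in the text, mapped to their ISO 639 codes.
# This is deliberately incomplete: the text lists these three "and others".
DEPRECATED_TAGS = {
    "no-bok": "nb",    # Norwegian "Book language"
    "i-navajo": "nv",  # Navajo
    "i-lux": "lb",     # Luxembourgish
}


def preferred_tag(tag):
    """Return the ISO 639 replacement for a deprecated tag, if one is known.

    The lookup ignores case because language tags are case insensitive.
    Tags that are not in the table are returned unchanged.
    """
    return DEPRECATED_TAGS.get(tag.lower(), tag)
```

For example, preferred_tag("i-navajo") gives nv, while a tag such as fr-CA passes through untouched.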
Some particularly useful tags registered with IANA allow you to specify Traditional vs. Simplified Chinese. In the past it was necessary to distinguish the two by using something like zh-CN (Mainland China) for Simplified Chinese and zh-TW (Taiwan) for Traditional Chinese. Apart from the fact that this is mislabelled, you could not guarantee that others would recognize these conventions, or even follow them. For example, some people used zh-HK to represent Traditional Chinese. Now IANA makes available the tags zh-Hans and zh-Hant for Simplified and Traditional Chinese, respectively. The following two paragraphs illustrate the use of these tags.

当世界需要沟通时,请用Unicode!

當世界需要溝通時,請用統一碼(Unicode)

It is expected that these tags will persist for the foreseeable future, so it would be good to use them as soon as possible in order to improve future interoperability sooner rather than later.

Matching language tags

According to RFC 3066, a tag such as 'en-GB' should also match the more general 'en'. For example, the following CSS code colors all English text red in browsers that support the pseudo-class :lang.

:lang(en){color:red;}

In the following code, the text described as lang="en-GB" will be red.

<p lang="fr">En janvier, toutes les boutiques de Londres affichent des panneaux <span lang="en-GB">SALE</span>, mais en fait ces magasins sont bien propres!</p>

On the other hand, given the following CSS declaration,

:lang(en-GB){color:red;}

the word 'SALE' should not be red in the following code.

<p lang="fr">En janvier, toutes les boutiques de Londres affichent des panneaux <span lang="en">SALE</span>, mais en fait ces magasins sont bien propres!</p>
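The matching behaviour in these two CSS examples amounts to a case-insensitive prefix test on whole subtags. A minimal Python sketch follows; the function name lang_matches and its parameter names are assumptions for this example, and it models only the simple prefix rule described here.

```python
def lang_matches(tag, wanted):
    """Return True if a content tag matches the wanted language.

    The content tag matches when it equals the wanted tag, or when it
    begins with the wanted tag followed by a hyphen. The comparison
    ignores case because language tags are case insensitive.
    """
    tag = tag.lower()
    wanted = wanted.lower()
    return tag == wanted or tag.startswith(wanted + "-")
```

With this rule, text tagged en-GB matches a request for en, but text tagged en does not match a request for en-GB. Requiring the hyphen after the prefix keeps a three-letter code such as eng from being mistaken for a match on en.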

Note, however, that this is not the case for language negotiation on an Apache server. If you want to be automatically directed to a page example.fr.html and your browser settings only state a preference for 'fr-CA', you will need to add 'fr' to your settings. (See Setting language preferences in a browser.)

Issues with language tags

Although RFC 3066 language tags work well much of the time, there are still some issues:

- Many more codes are needed than those provided by ISO to cover the approximately 6,000 languages of the world.
- They don't cover the need to express general regions; for example, there is still no tag for the generalized Latin-American Spanish that many organizations use to create Spanish content.
- There is some lack of clarity between the use of language tag values for designating language vs. locale. 'Locales' are combinations of language plus geographical region typically used to set such things as date and time defaults in software.
- There is a need, sometimes, to distinguish the script used, in addition to the language. For example, Mongolian might be written in Mongolian script or Cyrillic; Croatian might be written in Latin or Cyrillic; ...

People are currently working on solutions to these issues, including people from ISO TC37, SIL, and W3C, etc. The proposed successor to RFC 3066 is also targeting these issues.

By the way...

- Language tags for HTML were first formally defined in RFC 2070, F. Yergeau, et al., Internationalization of the Hypertext Markup Language. RFC 2070 was incorporated into HTML 4, and has been reclassified as historic.
- Note changes to ISO language codes, in particular those in 1989 (withdrawing iw, in, and ji, replacing them by he, id, and yi, and adding se, iu, ug, and za).
- Unicode provides cross-references to Microsoft and Apple codes.
- Many other W3C and Web-related specifications use language tags:
  - XHTML 1.0, reformulating HTML in terms of XML, which advises using both the HTML lang attribute and the XML xml:lang attribute, with the latter taking precedence in case there should be any differences.
  - HTTP uses language tags in the Accept-Language and Content-Language headers.
  - SMIL and SVG can use language tags in the switch statement.
  - CSS and XSL use language tags for detailed style control.
- Note also that language information can be attached to objects such as images and included audio files.

Further reading

- RFC 3066: Tags for the Identification of Languages, http://www.ietf.org/rfc/rfc3066.txt
- ISO 639: Codes for the Representation of Names of Languages, http://www.loc.gov/standards/iso639-2/langcodes.html
- ISO 3166: Codes for Country Names, http://www.iso.org/iso/en/prods-services/iso3166ma/02iso-3166-code-lists/list-en1.html
- IANA language tag registry, http://www.iana.org/assignments/language-tags
- Authoring Techniques for XHTML & HTML Internationalization: Specifying the language of content 1.0, http://www.w3.org/TR/i18n-html-tech-lang/
- W3C I18N resource index: Language declarations and language negotiation, http://www.w3.org/International/resource-index.html#lang

Authors: Martin Dürst & Richard Ishida (W3C). Last update 2005-05-13 20:06 GMT.

Copyright © 2005 W3C® (MIT, ERCIM, Keio), All Rights Reserved.