How can I subset a data set? | R FAQ - UCLA

The R program (as a text file) for all the code on this page.

Subsetting is a very important component of data management, and there are several ways to subset data in R. This page aims to give a fairly exhaustive list of the ways in which it is possible to subset a data set in R.

First we will create the data frame that will be used in all the examples. We will call this data frame x.df, and it will be composed of 5 variables (V1–V5) whose values come from a normal distribution with a mean of 0 and a standard deviation of 1, as well as one variable (y) containing integers from 1 to 5.

    set.seed(1234)
    ...
    x.sub <- subset(x.df, y > 2)
    x.sub

             V1          V2         V3        V4          V5 y
    4 -1.345698 0.109962171 0.88971451 0.5093141 -0.02365572 3
    5  1.429125 0.522807300 0.48899049 0.5594521  0.98486170 4
    6  1.506056 0.001613555 0.08880458 1.4595894  0.06405140 5

Subsetting rows using multiple conditional statements

There is no limit to how many logical statements may be combined to achieve the desired subset. The data frame x.sub1 contains only the observations for which the value of the variable y is greater than 2 and the value of the variable V1 is greater than 0.6.

    x.sub1 <- subset(x.df, y > 2 & V1 > 0.6)
    x.sub1

            V1          V2         V3        V4        V5 y
    5 1.429125 0.522807300 0.48899049 0.5594521 0.9848617 4
    6 1.506056 0.001613555 0.08880458 1.4595894 0.0640514 5

Subsetting both rows and columns

It is possible to subset both rows and columns using the subset function. The select argument lets you subset variables (columns). The data frame x.sub2 contains only the variables V1 and V4, and only those observations of these two variables for which the value of variable y is greater than 2 and the value of variable V2 is greater than 0.4.

    x.sub2 <- subset(x.df, y > 2 & V2 > 0.4, select = c(V1, V4))
    x.sub2

            V1        V4
    5 1.429125 0.5594521

The data frame x.sub3 contains only the observations in variables V2 through V5 for which the value of variable y is greater than 3.
    x.sub3 <- subset(x.df, y > 3, select = V2:V5)
    x.sub3

               V2         V3        V4        V5
    5 0.522807300 0.48899049 0.5594521 0.9848617
    6 0.001613555 0.08880458 1.4595894 0.0640514

Subsetting rows using indices

Another method for subsetting data sets is to use bracket notation, which designates the indices of the data set. The first index is for the rows and the second is for the columns. The x.sub4 data frame contains only the observations for which the value of variable y is equal to 1. Note that leaving the index for the columns blank indicates that we want x.sub4 to contain all the variables (columns) of the original data frame.

    x.sub4
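
The bracket notation described above can be illustrated with a minimal sketch. The data frame `df` and the result names below are hypothetical and serve only to show the row and column positions of the index. They are not the page's x.df or its x.sub4 code.

```r
# Hypothetical toy data frame, used only for illustration (not the page's x.df).
df <- data.frame(
  V1 = c(0.5, -1.2, 1.4, 1.5),
  V2 = c(0.3, 0.1, 0.5, 0.0),
  y  = c(1, 2, 4, 5)
)

# First index selects rows: keep rows where y equals 1.
# Leaving the second index blank keeps every column.
rows_y1 <- df[df$y == 1, ]

# Second index selects columns: rows where y > 2, columns V1 and V2 only.
rows_cols <- df[df$y > 2, c("V1", "V2")]

# The same selection written with subset(), for comparison.
same_with_subset <- subset(df, y > 2, select = c(V1, V2))

print(rows_y1)
print(rows_cols)
print(same_with_subset)
```

One difference worth keeping in mind: if the condition evaluates to NA for some row, subset() drops that row, whereas logical bracket indexing returns a row filled with NA for it.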


