Subsetting Data | R Learning Modules - IDRE UCLA


Version info: Code for this page was tested in R version 3.0.2 (2013-09-25)
On: 2013-11-19
With: lattice 0.20-24; foreign 0.8-57; knitr 1.5

1. Subsetting variables

To manipulate data frames in R we can use the bracket notation to access the indices for the observations and the variables. It is easiest to think of the data frame as a rectangle of data where the rows are the observations and the columns are the variables. Just like in matrix algebra, the indices for a rectangle of data follow the RxC principle; in other words, the first index is for Rows and the second index is for Columns [R, C].

When we only want to subset variables (or columns) we use the second index and leave the first index blank. Leaving an index blank indicates that you want to keep all the elements in that dimension. In the first example we create the data frame hsb3 containing only the variables id, read and write, but all the observations from the original data frame hsb2.small. In order to know which variables correspond to which number in the index we use the names function, which lists the names of the variables in the order in which they appear in the data frame. From this list we see that id is variable 1, read is variable 7 and write is variable 8. We cannot refer to the variables by their names alone until we have attached the data.

hsb2.small […] 50))

##     id female race ses schtyp prog read write math science socst
## 1   70      0    4   1      1    1   57    52   41      47    57
## 2  121      1    4   2      1    3   68    59   53      63    61
## 5  172      0    4   2      1    2   47    52   57      53    61
## 6  113      0    4   2      1    2   44    52   51      63    61
## 7   50      0    3   2      1    1   50    59   42      53    61
## 9   84      0    4   2      1    1   63    57   54      58    51
## 10  48      0    3   2      1    2   57    55   52      50    51
## 12  60      0    4   2      1    2   57    65   51      63    61
## 13  95      0    4   3      1    2   73    60   71      61    71
## 14 104      0    4   3      1    2   54    63   57      55    46
## 15  38      0    3   1      1    2   45    57   50      31    56
## 17  76      0    4   3      1    2   47    52   51      50    56
## 18 195      0    4   2      2    1   57    57   60      58    56
## 19 114      0    4   3      1    2   68    65   62      55    61
## 22 143      0    4   2      1    3   63    63   75      72    66
## 24  20      0    1   3      1    2   60    52   57      61    61

There is no limit to how many logical statements may be combined to achieve the desired subsetting. The data frame write.1 contains only the observations for which the value of the variable write is greater than 50 and the value of the variable read is greater than 60.
(write.1 <- subset(hsb2.small, write > 50 & read > 60))

##     id female race ses schtyp prog read write math science socst
## 2  121      1    4   2      1    3   68    59   53      63    61
## 9   84      0    4   2      1    1   63    57   54      58    51
## 13  95      0    4   3      1    2   73    60   71      61    71
## 19 114      0    4   3      1    2   68    65   62      55    61
## 22 143      0    4   2      1    3   63    63   75      72    66

It is possible to subset both rows and columns using the subset function. The select argument lets you subset variables (columns). The data frame write.2 contains only the variables write and read, and only the observations for which the value of write is greater than 50 and the value of read is greater than 60.

(write.2 <- subset(hsb2.small, write > 50 & read > 60, select = c(write, read)))

##    write read
## 2     59   68
## 9     57   63
## 13    60   73
## 19    65   68
## 22    63   63

The data frame write.3 contains only the observations in variables read through science for which the value of the variable science is less than 55.

(write.3 […]
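
The two approaches described on this page, bracket indexing with [R, C] and the subset function with its select argument, can be sketched on a small made-up data frame. This is only an illustration under stated assumptions: the data frame scores and its values are hypothetical and are not the hsb2.small data, and the object names (by_position, by_name, high_write, both, low_science) are introduced here for the example.

```r
# Hypothetical toy data (made-up values, NOT the hsb2.small data)
scores <- data.frame(
  id      = 1:6,
  read    = c(57, 68, 44, 63, 47, 73),
  write   = c(52, 59, 33, 57, 52, 60),
  science = c(47, 63, 58, 58, 53, 61)
)

# names() lists the variables in the order they appear in the data frame
names(scores)

# Columns only: leave the first (row) index blank
by_position <- scores[, c(1, 3)]           # id and write, by position
by_name     <- scores[, c("id", "write")]  # same columns, by quoted name

# Rows only: a logical condition in the first index, second index blank
high_write <- scores[scores$write > 50, ]

# Rows and columns together: subset() with a combined condition and select
both <- subset(scores, write > 50 & read > 60, select = c(write, read))

# select also accepts a range of adjacent columns, such as read:science
low_science <- subset(scores, science < 55, select = read:science)
```

Inside subset the variables can be written as bare names (write, read) because the function evaluates the condition within the data frame; with bracket indexing the column has to be reached explicitly, for example as scores$write.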


