R - Creating a subset array
Hi all.
Long story short:
I have a dataset of 60000 entries.
One variable is based on the individual's location, with 10 different categories. Another is overall satisfaction, a score from 1 to 10.
However, the categories contain either the word rural or the word urban.
What I want is to compare the overall mean of the cases that contain the word rural in the location variable with the cases that contain the word urban in the location variable.
I have used a work-around: I create an additional column in the initial dataset in Excel that finds the word rural or urban in the location column and returns either rural or urban depending on which is found. But I'm sure there must be a way of doing this strictly using R.
Is this possible? Thank you!
Create some dummy data:
set.seed(1)
foo <- data.frame(
  loc=sample(c(paste0("rural",letters[1:5]),paste0(letters[10:14],"urban")),
             100,replace=TRUE),
  xx=rnorm(100))
Now, it sounds like you want grepl() to grep for your keywords, and by() to calculate the means by keyword:
> with(foo,by(xx,grepl("urban",loc),mean))
grepl("urban", loc): FALSE
[1] -0.07220176
-------------------------------
grepl("urban", loc): TRUE
[1] 0.04159463
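The Excel work-around from the question (a helper column that says rural or urban) can also be built directly in R. The following is only a minimal sketch on the same dummy data; the column name `area` and the object `area_means` are made up for illustration, and `ignore.case=TRUE` is an assumption in case the real location labels are capitalised differently.

```r
# Same dummy data as above
set.seed(1)
foo <- data.frame(
  loc = sample(c(paste0("rural", letters[1:5]),
                 paste0(letters[10:14], "urban")),
               100, replace = TRUE),
  xx = rnorm(100)
)

# Hypothetical helper column: "rural", "urban", or NA if neither word is found
foo$area <- NA_character_
foo$area[grepl("rural", foo$loc, ignore.case = TRUE)] <- "rural"
foo$area[grepl("urban", foo$loc, ignore.case = TRUE)] <- "urban"

# Mean of xx per label; rows with an NA label are dropped by tapply()
area_means <- tapply(foo$xx, foo$area, mean)
print(area_means)
```

Keeping the label as a column makes it easy to reuse in later steps, and it leaves any location that matches neither word as NA instead of silently lumping it in with the other group.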
Or maybe you want a t-test:
> with(foo,t.test(xx~grepl("urban",loc)))

        Welch Two Sample t-test

data:  xx by grepl("urban", loc)
t = -0.60245, df = 97.076, p-value = 0.5483
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -0.4886860  0.2610932
sample estimates:
mean in group FALSE  mean in group TRUE 
        -0.07220176          0.04159463 
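If the numbers are needed for further work rather than just printed, the test result can be stored and its components pulled out. This is a rough sketch on the same dummy data; the column name `is_urban` and the object `res` are illustrative names, not part of the original answer.

```r
# Same dummy data as above
set.seed(1)
foo <- data.frame(
  loc = sample(c(paste0("rural", letters[1:5]),
                 paste0(letters[10:14], "urban")),
               100, replace = TRUE),
  xx = rnorm(100)
)

# Hypothetical logical column: TRUE where the location contains "urban"
foo$is_urban <- grepl("urban", foo$loc)

# Welch two-sample t-test (the default for t.test with a formula)
res <- t.test(xx ~ is_urban, data = foo)

res$estimate  # the two group means
res$conf.int  # confidence interval for the difference in means
res$p.value   # p-value of the test
```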