[R functions] How to use aggregate and apply

googleVis

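The googleVis package provides an interface between R and the Google Charts API. It also ships the Fruits example dataset used throughout this post.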

install.packages('googleVis')
library('googleVis')
Fruits

aggregate(x, by, FUN, ...)


> banana <- Fruits[Fruits$Fruit == 'Bananas',]

As shown above, you can get the 'Bananas' rows by hard-coding the string 'Bananas'. But what if you want to look at 'Oranges' or 'Apples' instead? You would have to edit the code every time. Hard-coding like this should be avoided.
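One way to avoid the hard-coded string is to pass the fruit name in as a parameter, or to build every per-fruit subset at once with split(). The snippet below is only a sketch: `fruits` is a hand-typed stand-in for a few columns of Fruits (an assumption, so that it runs without googleVis installed), and `rows_for` is a hypothetical helper name.

```r
# Hand-typed stand-in for part of the Fruits data (assumption).
fruits <- data.frame(
  Fruit  = c("Apples", "Apples", "Apples", "Oranges", "Bananas",
             "Oranges", "Bananas", "Oranges", "Bananas"),
  Year   = c(2008, 2009, 2010, 2008, 2008, 2009, 2009, 2010, 2010),
  Sales  = c(98, 111, 89, 96, 85, 93, 94, 98, 81),
  Profit = c(20, 32, 13, 15, 9, 13, 16, 7, 10)
)

# Hypothetical helper: the fruit name is an argument, not a literal in the code.
rows_for <- function(data, fruit_name) {
  data[data$Fruit == fruit_name, ]
}

bananas <- rows_for(fruits, "Bananas")

# Or build all per-fruit subsets in one call, as a list keyed by fruit name.
by_fruit <- split(fruits, fruits$Fruit)
names(by_fruit)
```

With the list from split(), `by_fruit$Oranges` or `by_fruit[["Apples"]]` picks a subset without touching the filtering code.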

aggregate

aggregate(target the function is applied to ~ grouping criterion, data, function)

Splits the data into subsets according to the grouping criterion, computes a summary statistic for each subset with the given function, and returns the result in a convenient form.
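To make the split, compute, and combine steps concrete, here is a rough sketch that does by hand what aggregate does in one call. The `fruits` data frame is a hand-typed stand-in for two columns of Fruits (an assumption), and the variable names are only for illustration.

```r
# Hand-typed stand-in for two columns of the Fruits data (assumption).
fruits <- data.frame(
  Fruit = c("Apples", "Apples", "Apples", "Oranges", "Bananas",
            "Oranges", "Bananas", "Oranges", "Bananas"),
  Sales = c(98, 111, 89, 96, 85, 93, 94, 98, 81)
)

# 1. Split: one vector of Sales values per fruit.
groups <- split(fruits$Sales, fruits$Fruit)

# 2. Compute: apply the summary function to each group.
totals <- sapply(groups, sum)

# 3. Combine: put the results back into a data frame.
manual <- data.frame(Fruit = names(totals), Sales = unname(totals))
manual
```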

## Default S3 method:

aggregate(x, ...)

## S3 method for class 'data.frame'
aggregate(x, by, FUN, ..., simplify = TRUE, drop = TRUE)

## S3 method for class 'formula'
aggregate(formula, data, FUN, ...,
          subset, na.action = na.omit)

## S3 method for class 'ts'
aggregate(x, nfrequency = 1, FUN = sum, ndeltat = 1,
          ts.eps = getOption("ts.eps"), ...)

# Total sales per fruit
> aggregate(Sales~Fruit, Fruits, sum)
    Fruit Sales
1  Apples   298
2 Bananas   260
3 Oranges   287


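Sales~Fruit : Fruit is the grouping criterion, so the rows are grouped by kind of fruit.
              Sales is the value within each fruit group that the function is applied to.
Fruits : the dataset that ships with the googleVis package loaded above.
sum : because sum is passed as the last argument, the Sales values are summed per fruit.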
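The formula form is one of two common ways to call aggregate. As a sketch, the same per-fruit total can also be written with the data.frame method and an explicit `by` list. `fruits` is again a hand-typed stand-in for two columns of Fruits (an assumption).

```r
# Hand-typed stand-in for two columns of the Fruits data (assumption).
fruits <- data.frame(
  Fruit = c("Apples", "Apples", "Apples", "Oranges", "Bananas",
            "Oranges", "Bananas", "Oranges", "Bananas"),
  Sales = c(98, 111, 89, 96, 85, 93, 94, 98, 81)
)

# Formula interface: value ~ grouping variable.
by_formula <- aggregate(Sales ~ Fruit, data = fruits, FUN = sum)

# data.frame interface: the values to summarise plus a named list of groupings.
by_list <- aggregate(fruits["Sales"], by = list(Fruit = fruits$Fruit), FUN = sum)

by_formula
by_list
```

The formula form is usually shorter to read. The `by` form is handy when the grouping vector is not a column of the data frame.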

# Total profit per year
> aggregate(Profit~Year, Fruits, sum)
  Year Profit
1 2008     44
2 2009     61
3 2010     30

# Total profit per fruit and per year
> aggregate(Profit~Year+Fruit, Fruits, sum)
  Year   Fruit Profit
1 2008  Apples     20
2 2009  Apples     32
3 2010  Apples     13
4 2008 Bananas      9
5 2009 Bananas     16
6 2010 Bananas     10
7 2008 Oranges     15
8 2009 Oranges     13
9 2010 Oranges      7

# Compute the fee (tax) per month and per year: (total Sales - total Profit) * 0.1
> dfFee <- aggregate((Sales-Profit)*0.1~substr(Date,6,7)+Year,Fruits,sum)
> names(dfFee) <- c('Month','Year','Fee')
> dfFee
  Month Year  Fee
1    12 2008 23.5
2    12 2009 23.7
3    12 2010 23.8
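Putting expressions directly into the formula works, but the resulting column names have to be repaired afterwards. A possibly more readable sketch is to build the month and fee columns first and then aggregate on them. `fruits` is a hand-typed stand-in for a few columns of Fruits, and the `Month` and `Fee` column names are my own choice (both are assumptions).

```r
# Hand-typed stand-in for part of the Fruits data (assumption).
fruits <- data.frame(
  Year   = c(2008, 2009, 2010, 2008, 2008, 2009, 2009, 2010, 2010),
  Sales  = c(98, 111, 89, 96, 85, 93, 94, 98, 81),
  Profit = c(20, 32, 13, 15, 9, 13, 16, 7, 10)
)
fruits$Date <- as.Date(paste0(fruits$Year, "-12-31"))

# Derive the grouping and value columns up front so they get proper names.
fruits$Month <- format(fruits$Date, "%m")               # "12" for every row here
fruits$Fee   <- (fruits$Sales - fruits$Profit) * 0.1

fee_by_month <- aggregate(Fee ~ Month + Year, data = fruits, FUN = sum)
fee_by_month
```

format(Date, "%m") reads the month from the Date value itself, so it does not depend on character positions the way substr(Date, 6, 7) does.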

apply(X, MARGIN, FUN, ...)

MARGIN can be set to 1 (apply over rows), 2 (apply over columns), or c(1, 2) (apply over both rows and columns, that is, to each cell).
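The three MARGIN settings are easier to see on a tiny matrix. This is a minimal sketch with made-up numbers, not data from Fruits.

```r
# A 2 x 3 matrix, filled column by column:
#      [,1] [,2] [,3]
# [1,]    1    3    5
# [2,]    2    4    6
m <- matrix(1:6, nrow = 2)

row_tot <- apply(m, 1, sum)                        # one value per row: 9 12
col_tot <- apply(m, 2, sum)                        # one value per column: 3 7 11
cells   <- apply(m, c(1, 2), function(v) v * 10)   # one value per cell, same shape as m

row_tot
col_tot
cells
```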

> df2 <- Fruits[, c(4:6)]


> rowSums(df2)
196 222 178 192 170 186 188 196 162

# Sum df2 row by row (apply converts the data frame to a matrix first)
> apply(df2,1,sum)
196 222 178 192 170 186 188 196 162
