BIS_LAB: January 2013

Wednesday, January 23, 2013

Gaurav Bhattacharya :- BA Session 3

Session3 - Business Application Lab

ASSIGNMENT 1a:

Fit ‘lm’ and comment on the applicability of ‘lm’

Plot 1: Residuals vs. independent variable

Plot 2: Standardized residuals vs. independent variable

> file<-read.csv(file.choose(),header=T)

> file

mileage groove

1 0 394.33

2 4 329.50

3 8 291.00

4 12 255.17

5 16 229.33

6 20 204.83

7 24 179.00

8 28 163.83

9 32 150.33

> x<-file$groove

> x

[1] 394.33 329.50 291.00 255.17 229.33 204.83 179.00 163.83 150.33

> y<-file$mileage

> y

[1] 0 4 8 12 16 20 24 28 32

> reg1<-lm(y~x)

> res<-resid(reg1)

> res

1 2 3 4 5 6 7 8 9

3.6502499 -0.8322206 -1.8696280 -2.5576878 -1.9386386 -1.1442614 -0.5239038 1.4912269 3.7248633

> plot(x,res)

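The residual plot is parabolic (U-shaped) rather than randomly scattered around zero, so a straight-line ‘lm’ fit is not appropriate for this data; the relationship between groove depth and mileage is nonlinear.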
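Plot 2 of the assignment (standardized residuals against the independent variable) does not appear in the transcript above. A minimal sketch of how it could be produced follows. It re-enters the nine rows printed above by hand instead of calling file.choose(), and the data frame name `tyre` is only an illustrative choice. `rstandard()` is the base R function that returns standardized residuals.

```r
# Tyre data re-entered from the listing above (the name `tyre` is illustrative)
tyre <- data.frame(
  mileage = c(0, 4, 8, 12, 16, 20, 24, 28, 32),
  groove  = c(394.33, 329.50, 291.00, 255.17, 229.33,
              204.83, 179.00, 163.83, 150.33)
)

x <- tyre$groove   # independent variable, as chosen in the session
y <- tyre$mileage  # dependent variable, as chosen in the session

reg1 <- lm(y ~ x)
res  <- resid(reg1)      # raw residuals (Plot 1)
sres <- rstandard(reg1)  # standardized residuals (Plot 2)

# Plot 1: residuals vs independent variable
plot(x, res, xlab = "groove", ylab = "Residual")
abline(h = 0, lty = 2)

# Plot 2: standardized residuals vs independent variable
plot(x, sres, xlab = "groove", ylab = "Standardized residual")
abline(h = 0, lty = 2)
```

Both plots should show the same curved pattern, with positive residuals at the two ends of the groove range and negative residuals in the middle.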

Assignment 1 (b) -Alpha-Pluto Data

Fit ‘lm’ and comment on the applicability of ‘lm’

Plot 1: Residuals vs. independent variable

Plot 2: Standardized residuals vs. independent variable

Also do:

Q-Q plot (qqnorm)

Q-Q line (qqline)

> file<-read.csv(file.choose(),header=T)

> file

alpha pluto

1 0.150 20

2 0.004 0

3 0.069 10

4 0.030 5

5 0.011 0

6 0.004 0

7 0.041 5

8 0.109 20

9 0.068 10

10 0.009 0

11 0.009 0

12 0.048 10

13 0.006 0

14 0.083 20

15 0.037 5

16 0.039 5

17 0.132 20

18 0.004 0

19 0.006 0

20 0.059 10

21 0.051 10

22 0.002 0

23 0.049 5

> x<-file$alpha

> y<-file$pluto

> x

[1] 0.150 0.004 0.069 0.030 0.011 0.004 0.041 0.109 0.068 0.009 0.009 0.048

[13] 0.006 0.083 0.037 0.039 0.132 0.004 0.006 0.059 0.051 0.002 0.049

> y

[1] 20 0 10 5 0 0 5 20 10 0 0 10 0 20 5 5 20 0 0 10 10 0 5

> reg1<-lm(y~x)

> res<-resid(reg1)

> res

1 2 3 4 5 6 7

-4.2173758 -0.0643108 -0.8173877 0.6344584 -1.2223345 -0.0643108 -1.1852930

8 9 10 11 12 13 14

2.5653342 -0.6519557 -0.8914706 -0.8914706 2.6566833 -0.3951747 6.8665650

15 16 17 18 19 20 21

-0.5235652 -0.8544291 -1.2396007 -0.0643108 -0.3951747 0.8369318 2.1603874

22 23

0.2665531 -2.5087486

> plot(x,res)

> qqnorm(res)

> qqline(res)
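The standardized-residual plot requested as Plot 2 is not part of the transcript above. The sketch below shows one way it might be drawn, together with a Q-Q plot of the standardized residuals. The data are re-entered by hand from the listing above, and the data frame name `ap` is an illustrative choice, not something from the original session.

```r
# Alpha-Pluto data re-entered from the listing above (the name `ap` is illustrative)
ap <- data.frame(
  alpha = c(0.150, 0.004, 0.069, 0.030, 0.011, 0.004, 0.041, 0.109,
            0.068, 0.009, 0.009, 0.048, 0.006, 0.083, 0.037, 0.039,
            0.132, 0.004, 0.006, 0.059, 0.051, 0.002, 0.049),
  pluto = c(20, 0, 10, 5, 0, 0, 5, 20, 10, 0, 0, 10,
            0, 20, 5, 5, 20, 0, 0, 10, 10, 0, 5)
)

# Same model as in the session: pluto (y) regressed on alpha (x)
reg1 <- lm(pluto ~ alpha, data = ap)
res  <- resid(reg1)
sres <- rstandard(reg1)  # standardized residuals

# Plot 2: standardized residuals vs independent variable
plot(ap$alpha, sres, xlab = "alpha", ylab = "Standardized residual")
abline(h = 0, lty = 2)

# Q-Q plot of the standardized residuals with a reference line
qqnorm(sres)
qqline(sres)

# Index of the observation with the largest raw residual in absolute value
worst <- unname(which.max(abs(res)))
worst
```

In the residuals printed above, observation 14 has the largest residual (about 6.87), so it is the first point to inspect when judging whether ‘lm’ is applicable here.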

Assignment 2: Justify Null Hypothesis using ANOVA

> file<-read.csv(file.choose(),header=T)

> file

Chair Comfort.Level Chair1

1 I 2 a

2 I 3 a

3 I 5 a

4 I 3 a

5 I 2 a

6 I 3 a

7 II 5 b

8 II 4 b

9 II 5 b

10 II 4 b

11 II 1 b

12 II 3 b

13 III 3 c

14 III 4 c

15 III 4 c

16 III 5 c

17 III 1 c

18 III 2 c

> file.anova<-aov(file$Comfort.Level~file$Chair1)

> summary(file.anova)

Df Sum Sq Mean Sq F value Pr(>F)

file$Chair1 2 1.444 0.7222 0.385 0.687
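The transcript stops at the ANOVA table without stating a conclusion. The sketch below re-enters the chair data from the listing above, extracts the F value and p-value from the table, and compares the p-value with a significance level. The 0.05 level and the names `chairs`, `fit` and `alpha_level` are assumptions made for illustration. With Pr(>F) = 0.687, as printed above, the null hypothesis that the mean comfort level is the same for all three chairs is not rejected.

```r
# Chair data re-entered from the listing above (the name `chairs` is illustrative)
chairs <- data.frame(
  Comfort.Level = c(2, 3, 5, 3, 2, 3,    # chair a
                    5, 4, 5, 4, 1, 3,    # chair b
                    3, 4, 4, 5, 1, 2),   # chair c
  Chair1 = factor(rep(c("a", "b", "c"), each = 6))
)

# One-way ANOVA: does mean comfort level differ between chairs?
fit <- aov(Comfort.Level ~ Chair1, data = chairs)
tab <- summary(fit)[[1]]   # the ANOVA table as a data frame

f_value <- tab[["F value"]][1]
p_value <- tab[["Pr(>F)"]][1]

alpha_level <- 0.05  # assumed significance level, not stated in the session
if (p_value < alpha_level) {
  message("Reject H0: mean comfort differs between chairs")
} else {
  message("Fail to reject H0: no evidence that mean comfort differs")
}
```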

Tuesday, January 15, 2013

IT BA lab Assignment #2

Session 2:

Today we learnt about the creation, inverse, transpose and multiplication of matrices. We then moved on to regression and residual analysis using NSE historical data for the NIFTY index over a given period. Finally, we had an introduction to plotting a normal distribution curve.

Assignment 1:

Create two matrices of size, say, 3 x 3, and select column 1 from the first matrix and column 3 from the second. After storing the selected columns in objects, say x1 and x2, merge the two columns using cbind to create a new matrix.

Solution:

To create a matrix:

x <- c(1:9)

dim(x) <- c(3,3)

y <- c(10:18)

dim(y) <- c(3,3)

To select the columns (column 1 of x and column 3 of y) and merge them:

z1 <- x[ ,1]

z2 <- y[ ,3]

z3 <- cbind(z1,z2)
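As a cross-check, the same task can be sketched with `matrix()`, which builds the 3 x 3 matrices in one step. The object names x1, x2 and `merged` follow the assignment text or are illustrative; this is one possible arrangement, not the only one.

```r
# Build two 3 x 3 matrices; R fills them column by column
x <- matrix(1:9,   nrow = 3, ncol = 3)
y <- matrix(10:18, nrow = 3, ncol = 3)

# Column 1 of the first matrix and column 3 of the second
x1 <- x[, 1]
x2 <- y[, 3]

# cbind() places the two vectors side by side as the columns of a new matrix
merged <- cbind(x1, x2)
merged
dim(merged)
```

Because cbind() is given named objects, the columns of the result carry the names x1 and x2.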

Output:

Assignment 2:

Multiply both the matrices.

Solution:

z <- x %*% y
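The session also covered the transpose and inverse of a matrix. The sketch below contrasts the matrix product `%*%` with element-wise `*`, and shows `t()` and `solve()`. The 2 x 2 matrix `m` is a hypothetical example added here, because the matrix built from 1:9 is singular and is therefore expected to have no inverse.

```r
x <- matrix(1:9,   nrow = 3)
y <- matrix(10:18, nrow = 3)

z  <- x %*% y   # matrix product (rows of x times columns of y)
ew <- x * y     # element-wise product, shown only for contrast
tx <- t(x)      # transpose of x

# Hypothetical invertible matrix, used because x itself is singular
m     <- matrix(c(2, 1, 1, 3), nrow = 2)
m_inv <- solve(m)          # inverse of m
check <- m_inv %*% m       # should be the 2 x 2 identity up to rounding
check
```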

Output:

Assignment 3:

Read historical data of NIFTY indices from NSE for the period 1st Dec 2012 to 31st Dec 2012. Find regression and residuals

Solution:

To read the csv file:

nse <- read.csv(file.choose(),header=T)

To fit the regression and obtain the residuals, the following commands are used:

reg <- lm(High ~ Open , data = nse)

residuals(reg)
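Since the NSE file itself is not reproduced here, the following sketch runs the same two commands on made-up data. The values in `nse` are simulated stand-ins, not real NIFTY quotes; only the column names Open and High are taken from the commands above.

```r
set.seed(1)  # make the simulated data reproducible

# Hypothetical stand-in for the NSE csv: 20 trading days of Open and High
open_px <- 5850 + cumsum(rnorm(20, mean = 0, sd = 25))
nse <- data.frame(
  Open = open_px,
  High = open_px + abs(rnorm(20, mean = 20, sd = 10))
)

# Regress High on Open and extract the residuals
reg <- lm(High ~ Open, data = nse)
coef(reg)               # intercept and slope
res <- residuals(reg)

# Residuals against the independent variable, with a zero reference line
plot(nse$Open, res, xlab = "Open", ylab = "Residual")
abline(h = 0, lty = 2)
```

With an intercept in the model, the least-squares residuals sum to zero up to rounding error, which is a quick sanity check on the fit.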

Output:

Assignment 4:

Generate a normal distribution data and plot it.

Solution:

To generate normally distributed data, the following commands are used:

x<-rnorm(40,0,1)

y<-dnorm(x)

To plot the data:

plot(x,y)
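A small variation on the commands above may make the bell shape easier to see. The values are sorted before plotting so that the points can be joined in order, and the exact density is overlaid with `curve()`. The seed value and the plotting range of -4 to 4 are arbitrary choices for this sketch.

```r
set.seed(42)  # arbitrary seed so the sample is reproducible

x  <- rnorm(40, mean = 0, sd = 1)  # 40 draws from the standard normal
xs <- sort(x)                      # sort so points can be joined left to right
ys <- dnorm(xs)                    # density of the standard normal at each point

# Sampled points joined by lines
plot(xs, ys, type = "b", xlim = c(-4, 4),
     xlab = "x", ylab = "Density", main = "Normal distribution")

# Exact standard normal density for comparison
curve(dnorm(x), from = -4, to = 4, add = TRUE, lty = 2)
```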

Output:

Tuesday, January 8, 2013

Session # 1 - 8 Jan 2013

An intro to the world of R:

R is an open source programming language and software environment for statistical computing and graphics. The R language is widely used among statisticians and data miners for developing statistical software and for data analysis. R provides a wide variety of statistical and graphical techniques, including linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, and others. R is easily extensible through functions and extensions, and the R community is noted for its active contributions in terms of packages. R is an implementation of the S programming language; there are some important differences, but much code written for S runs unaltered. Many of R's standard functions are written in R itself, which makes it easy for users to follow the algorithmic choices made.

Assignment 1:

Draw a histogram after concatenating 3 data points.

Soln :

The commands used are as follows:

> x<-c(1,2,3)

> plot(x, type = "h")

Histogram
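Note that plot() with type = "h" draws one vertical line per value, which looks like a histogram but does not bin the data. A minimal sketch contrasting it with the base R `hist()` function, which counts the values falling into each bin, is given below; the title and axis label are illustrative.

```r
x <- c(1, 2, 3)

# Histogram-like vertical lines: one line per data point, height = value
plot(x, type = "h")

# A histogram in the usual sense: values are grouped into bins and counted
h <- hist(x, main = "Histogram of x", xlab = "x")
h$breaks  # bin boundaries chosen by hist()
h$counts  # number of values in each bin
```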

Assignment 2: Drawing a line graph with points and naming the graph and the axis.

Soln : Let z be the variable that holds the data from the selected .csv file.

The csv file is read as follows:

> z<-read.csv(file.choose(), header=T)

This command prompts the user to select the data file from the saved location.

Let zcol1 be the variable that holds the contents of column 3 (the HIGH values) of the data.
The following commands were used:
> zcol1<-z[,3]
> plot(zcol1 , type="b" , main="NSE Graph" , xlab="Time" , ylab="indices")

Assignment 3:

Create a scatter plot by using share HIGH and LOW values from the NSE Historical data as obtained from the .csv file.

Soln :

The HIGH values, as obtained in the previous question:

> zcol1<-z[,3]

LOW values are in column 4 from the csv file

> zcol2<-z[,4]

To plot the scatter plot

> plot(zcol1,zcol2)
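Selecting columns by position works only while the file layout stays the same. The sketch below shows the same line graph and scatter plot with the columns selected by name. The data frame `z` here is a made-up stand-in for the NSE file, and the column names Date, Open, High and Low are assumptions about its header.

```r
# Hypothetical stand-in for the NSE csv; the values are invented for illustration
z <- data.frame(
  Date = c("03-Dec-2012", "04-Dec-2012", "05-Dec-2012", "06-Dec-2012"),
  Open = c(5878.25, 5866.80, 5906.60, 5926.30),
  High = c(5899.15, 5894.95, 5917.80, 5942.55),
  Low  = c(5854.60, 5859.00, 5891.35, 5838.90)
)

zcol1 <- z$High  # same as z[, 3] in this layout
zcol2 <- z$Low   # same as z[, 4] in this layout

# Line graph with points, a title and axis labels
plot(zcol1, type = "b", main = "NSE Graph", xlab = "Time", ylab = "indices")

# Scatter plot of HIGH against LOW
plot(zcol1, zcol2, xlab = "High", ylab = "Low")
```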

Assignment 4 :

Find the volatility of the share values in the NSE historical data and obtain the corresponding range.

Soln :-

To obtain the volatility, we would require the maximum value among the HIGH values and the minimum value among the LOW values.

Both columns are merged into one vector variable 'y' so that the HIGH and LOW values are together.

> y<-c(zcol1,zcol2)

> summary(y)

This gives the minimum and maximum values:

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
   4888    5660    5723    5758    5884    6021

> range(y)

This gives the desired range of volatility:

[1] 4888.20 6020.75
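The width of that range can be computed directly with `diff(range(y))`. In the sketch below only the two extremes, 4888.20 and 6020.75, come from the output above; the other values in `high` and `low` are invented placeholders, and the name `spread` is illustrative.

```r
# Only 6020.75 and 4888.20 are taken from the output above; the rest is made up
high <- c(5965.15, 6020.75, 5930.40)
low  <- c(4888.20, 5823.15, 5841.65)

y <- c(high, low)   # HIGH and LOW values merged into one vector

r      <- range(y)  # c(minimum, maximum)
spread <- diff(r)   # maximum minus minimum

r
spread
```

The same number is obtained from max(high) - min(low), which is the quantity described in the solution text.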

Assignment 5:

To create a matrix.

Soln:
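A minimal sketch of one way to create a matrix in R, using the base `matrix()` function. The values 1 to 6 and the names `m` and `m_byrow` are illustrative choices.

```r
# 2 x 3 matrix filled column by column (the default)
m <- matrix(1:6, nrow = 2, ncol = 3)
m

# The same values filled row by row instead
m_byrow <- matrix(1:6, nrow = 2, ncol = 3, byrow = TRUE)
m_byrow

dim(m)  # number of rows and columns
```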
