Monday, September 15, 2008

R project

The R project is a free software environment for statistical computing and graphics.

You can install it on Ubuntu Linux with:
sudo apt-get install r-base
Then type R (capital letter) and, as in Python, you get an interactive console for commands. You can also write a program, save it in a file, and run it with:
R -f myfile.r

a <- read.table("compofdistances.dat", header=TRUE, sep="\t")
#This reads the data from a file. header=TRUE means the first line of the file holds the column names, and sep="\t" means the columns are separated by tabs.
plot(a[,1],a[,2])
#This plots the first column against the second: the first column goes on the X axis and the second on the Y axis. For example, if the first column is the vector (1,2,3) and the second column is the vector (4,5,6), then the points in the graph are (1,4), (2,5) and (3,6).
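As a small self-contained illustration of that pairing, the sketch below builds a data frame from the two example vectors instead of reading a file. The names d, x, y and points_xy are made up for this example.

```r
# Hypothetical data frame standing in for the contents of a file
d <- data.frame(x = c(1, 2, 3), y = c(4, 5, 6))

# Each row gives one point: (1,4), (2,5), (3,6)
points_xy <- cbind(d[, 1], d[, 2])
print(points_xy)

# First column on the X axis, second column on the Y axis
plot(d[, 1], d[, 2], xlab = "first column", ylab = "second column")
```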

b <- read.table("compofdistances.dat", header=TRUE, sep="\t")
plot(a[,1],b[,2],col=c("red","blue"))
#This will plot the points alternately in red and blue, because the two colors are reused in turn along the points.
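To see how the two colors are assigned, here is a minimal sketch with five made-up points. The vectors x, y and cols are assumptions for the example. rep() is used only to show explicitly the sequence of colors that results when a short col vector is reused along the points.

```r
# Hypothetical data: five points
x <- 1:5
y <- c(2, 4, 6, 8, 10)

# The two colors repeated to the number of points: red, blue, red, blue, red
cols <- rep(c("red", "blue"), length.out = length(x))
print(cols)

# Passing the short vector directly colors the points the same way
plot(x, y, col = c("red", "blue"))
```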
#Another example: suppose we have the following file, called triangle_areas.txt:
0 36267.2574608
1 12386.1343796
2 6417203.85229
3 6616673.58732
4 6417203.85229
5 6417203.85229
6 63352.669897
7 12386.1343796
8 188188.032427
9 6417203.85229
.
.
.
1920 39008.7753129
1921 12386.1343796
1922 156563.806249
1923 83012.3642354
1924 29033.6398076
1925 67316.811118
1926 36147.5333869
1927 6616673.58732
1928 14443.5575429
1929 12386.1343796

#Then in R we run the following commands:
a<-read.table("triangle_areas.txt")
b<-a[a[,2]<10000,2]
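#This uses logical indexing. Inside the square brackets, the first part (a[,2]<10000) selects only the rows whose second-column value is less than 10000, and the second part (the number 2) says that we take the second column of those rows. The result is stored in b. Note that the values printed further down go up to about 48636, so that output seems to come from a run with a larger threshold.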
plot(b)
#This plots the values in b against their position in the vector.
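The filtering step can be tried on a few values without the file. This is only a sketch: the data frame holds five of the areas from the listing above with made-up row numbers, the threshold of 50000 is chosen for illustration, and the intermediate name keep is hypothetical.

```r
# Small stand-in for the table read from triangle_areas.txt
a <- data.frame(V1 = 0:4,
                V2 = c(36267.2574608, 12386.1343796, 6417203.85229,
                       6616673.58732, 63352.669897))

# Logical vector: TRUE for rows whose second column is below the threshold
keep <- a[, 2] < 50000
print(keep)

# Keep only those rows, and only the second column
b <- a[keep, 2]
print(b)
print(length(b))
```

Writing a[a[,2]<50000,2] in one step does the same thing; the intermediate vector only makes the selection visible.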
> length(b)
[1] 972
#length() gives the number of elements in "b".
> b
[1] 36267.257 12386.134 12386.134 36540.916 19763.353 36267.257 19763.353
[8] 15872.262 30528.117 36540.916 2292.506 2292.506 14199.754 12386.134
[15] 12386.134 32088.702 36540.916 15378.449 36540.916 29033.640 1167.037
[22] 12386.134 15378.449 20265.898 1167.037 33405.568 36540.916 9910.127
[29] 15378.449 45617.156 1167.037 28488.625 1167.037 17056.617 12386.134
[36] 29033.640 32088.702 29033.640 32088.702 12365.535 19763.353 19763.353
[43] 20265.898 14443.558 12386.134 36540.916 28581.609 28488.625 19763.353
[50] 32088.702 12386.134 32088.702 1799.776 25689.556 1167.037 12386.134
[57] 36540.916 30528.117 1062.624 9021.118 25325.056 29033.640 19763.353
[64] 19763.353 19763.353 25689.556 15378.449 12386.134 19763.353 32088.702
[71] 36267.257 9910.127 15378.449 1062.624 36540.916 25689.556 1167.037
[78] 33217.250 12386.134 29033.640 32088.702 45417.858 8887.827 48636.615
[85] 29033.640 48636.615 9910.127 12386.134 32088.702 1167.037 1167.037
[92] 32088.702 12386.134 12386.134 36267.257 36540.916 36540.916 12386.134
[99] 19763.353 8887.827 32088.702 12386.134 36267.257 32088.702 36540.916

> a<-read.table("triangle_areas.txt", header=FALSE)
#Here header=FALSE says that the file does not have a header line, so the first line is read as data. Arguments are set with = inside the call; writing header<-F would also create a variable called header.
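A quick way to check what happens without a header is to read a couple of lines from a string through the text argument of read.table. The string txt below is made up from the first two lines of the listing.

```r
# Hypothetical two-line input with no header row, passed via text=
txt <- "0 36267.2574608\n1 12386.1343796"

a <- read.table(text = txt, header = FALSE)

# With no header, R names the columns V1, V2, ...
print(names(a))

# Both lines are kept as data rows
print(nrow(a))
```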

Link to some examples.
http://www.harding.edu/fmccown/R/
In 4shared.