[R-sig-Geo] Problem with data and representation of a complex timeseries ( R + Postgress + Rpy)

Roger Bivand Roger.Bivand at nhh.no
Fri Jun 26 08:40:42 CEST 2009


On Thu, 25 Jun 2009, reyman wrote:

> Hey guys,
>
> I have a problem with the storage of time series data. Let me try to
> explain...

Please try reshape() after studying its help page carefully. Try a small 
example first. It will let you switch between your long representation 
where pop (and rank) vary by year, and a wide representation, with 
pop_year and rank_year with cities as rows. This question could have been 
sent to R-help, and it would be (much) easier to follow if you dropped 
PostgreSQL and rpy: they are not your problem. Your problem is reshaping 
the data to run T years' analyses on data which are now in the long 
representation.
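
A minimal sketch of that switch, on a small made-up example. The city
identifier column (here called city) is an assumption: the query in the
question returns only v_pop and v_date, and reshape() needs some id to put
cities in rows. All values are invented for illustration.

```r
# Toy long-format data: one row per city and year (values are made up).
# The "city" column is hypothetical; the original query does not return one.
long <- data.frame(
  city   = rep(c("A", "B", "C"), times = 2),
  v_date = rep(c(1650, 1700), each = 3),
  v_pop  = c(15000, 10000, 5000, 12000, 18000, 9500)
)

# Rank within each year; the minus sign makes rank 1 the largest city.
long$rank <- ave(-long$v_pop, long$v_date, FUN = rank)

# Long -> wide: one row per city, columns v_pop_<year> and rank_<year>.
wide <- reshape(long, idvar = "city", timevar = "v_date",
                v.names = c("v_pop", "rank"), sep = "_",
                direction = "wide")
print(wide)
```

Calling reshape() on the wide object with direction = "long" should take
you back again, which is worth checking on the small example first.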

Alternatively, select each time slice separately and analyse just that.
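
As a sketch of the slice-by-slice route, assuming all dates have already
been fetched into one data frame (the name dat, the helper rank_slice and
the toy values are made up for illustration): split by date, rank each
slice, and stack the results. Note that rank() ranks in ascending order, so
the population is negated to give the largest city rank 1.

```r
# Toy long-format data standing in for the query results (values made up).
dat <- data.frame(
  v_date = rep(c(1650, 1700, 2000), each = 3),
  v_pop  = c(5000, 15000, 10000, 18000, 9500, 12000, 48000, 50000, 18000)
)

top_n <- 2  # would be 1000 for the real data

# Hypothetical helper: keep the n largest cities of one slice and rank them.
rank_slice <- function(slice, n) {
  slice <- slice[order(-slice$v_pop), ]
  slice <- head(slice, n)
  slice$rank <- rank(-slice$v_pop, ties.method = "first")
  slice
}

# Split by date, treat each slice separately, then stack the results again.
slices <- lapply(split(dat, dat$v_date), rank_slice, n = top_n)
ranked <- do.call(rbind, slices)
rownames(ranked) <- NULL
print(ranked)
```

The list slices keeps one data frame per date, so a single time slice can
also be analysed on its own, for example slices[["1650"]].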

Roger

>
> *In the end, I want a rank-size distribution by date (1600 - 2000) for the
> 1000 biggest cities of a simulation*
>
> First, my db has two columns which interest me: population and date.
>
> With "Rpy" and "rdbiPgSql", I run this query (for example):
>> Select v_date, v_pop from data where date = 1650 order by v_pop desc limit 1000
> So, as a first step, I want the 1000 biggest cities for one date.
>
> As a result, I have a dataframe like this:
>
> v_pop   | v_date
> 15000   | 1650
> 10000   | 1650
> 5000    | 1650
> .....etc
>
> For the rank-size analysis, I need to create a rank, so I use the
> rank( MyDataframe$v_pop ) function in R:
>
> v_pop   | v_date | rank
> 15000   | 1650   | 1
> 10000   | 1650   | 2
> 5000    | 1650   | 3
>
> Next, I have several problems to solve (for a beginner like me ... ouch)
>
> a) I need to create a time series from this dataframe, in this order ...
> but v_date already exists.
> b) I need to complete this dataframe with the 1000 biggest cities and
> their ranks for each date (by iterating the query, with the date advancing
> in steps of 10 years while date < 2000), like this:
>
> v_pop   | v_date | rank
> 15000   | 1650   | 1
> 10000   | 1650   | 2
> 5000    | 1650   | 3
> 18000   | 1700   | 1
> 12000   | 1700   | 2
> 9500    | 1700   | 3
> ...etc...
> 50000   | 2000   | 1
> 48000   | 2000   | 2
> 18000   | 2000   | 3
>
> For each date, I run the same query and rank function and want to
> concatenate the results, but how?
> Do I need another dataframe (one per date) or not? And how can I represent
> this data? Do I need to use time series or not?
>
> c) Representation of the data
>
> I need log population on y, log rank on x, and one line/plot per time
> series... a lot of them (1650 to 2000 in steps of 10 years: 1650 1660 1670
> ... etc).
> Do I need to use matplot, or is there another tool to draw each series on
> top of the others? *So, in fact, I want only one graphic with all the time
> series present ...*
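
One possible way to get a single graphic, sketched with base R's matplot()
on made-up values: put ranks in rows and dates in columns, then draw one
line per column on log-log axes. The names ranked and pop_by_rank are
hypothetical, and the sketch assumes every date has the same number of
ranked cities.

```r
# Toy data: already ranked within each date (values are made up).
ranked <- data.frame(
  v_date = rep(c(1650, 1700, 2000), each = 3),
  v_pop  = c(15000, 10000, 5000, 18000, 12000, 9500, 50000, 48000, 18000),
  rank   = rep(1:3, times = 3)
)

years <- sort(unique(ranked$v_date))

# Rows = ranks, columns = dates: the matrix layout matplot() works on.
pop_by_rank <- sapply(years, function(y) {
  s <- ranked[ranked$v_date == y, ]
  s$v_pop[order(s$rank)]
})
colnames(pop_by_rank) <- years

# One line per date, with both axes on a log scale.
matplot(seq_len(nrow(pop_by_rank)), pop_by_rank, type = "l", log = "xy",
        lty = 1, col = seq_along(years),
        xlab = "rank (log scale)", ylab = "population (log scale)")
legend("topright", legend = years, lty = 1, col = seq_along(years))
```

With 36 dates the legend will get crowded, so a colour ramp over the years
may read better than one legend entry per line.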
>
> Thanks a lot in advance for your help!!
> Sebastien.

-- 
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no


