[R-sig-Geo] Problem with data and representation of a complex timeseries ( R + Postgress + Rpy)
Roger Bivand
Roger.Bivand at nhh.no
Fri Jun 26 08:40:42 CEST 2009
On Thu, 25 Jun 2009, reyman wrote:
> Hey guys,
>
> I have a problem with storing time series data. I'll try to explain
> it...
Please try reshape() after studying its help page carefully. Try a small
example first. It will let you switch between your long representation
where pop (and rank) vary by year, and a wide representation, with
pop_year and rank_year with cities as rows. This question could have been
sent to R-help, and it would be (much) easier to follow if you dropped
PostgreSQL and rpy. They are not your problem; your problem is reshaping
the data to run T years' analyses on data which are now in the long
representation.
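
A minimal sketch of the reshape() step, on made-up data. The query in the
original post returns only v_pop and v_date, so the city identifier column
used here is an assumption: reshape() needs some id to line rows up across
years. The values and the sep = "_" naming are illustrative only.

```r
# Hypothetical long-format data: one row per city per year.
# The 'city' column is an assumption; the original query does not return it.
long <- data.frame(
  city   = rep(c("A", "B", "C"), times = 2),
  v_date = rep(c(1650, 1700), each = 3),
  v_pop  = c(15000, 10000, 5000, 12000, 18000, 9500)
)

# Rank within each year; negating v_pop makes rank 1 the largest city.
long$rank <- ave(-long$v_pop, long$v_date, FUN = rank)

# Long -> wide: cities as rows, columns v_pop_<year> and rank_<year>.
wide <- reshape(long, idvar = "city", timevar = "v_date",
                v.names = c("v_pop", "rank"),
                direction = "wide", sep = "_")
print(wide)
```

Calling reshape() on the wide result with direction = "long" should take you
back the other way; the help page describes how the reshaping attributes are
reused for that.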
Alternatively, select each time slice separately and analyse just that.
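
The slice-by-slice route might look like the sketch below, assuming the
results for all years are already in one data frame. The data values, the
helper name rank_slice and the cut-off top_n are all hypothetical; in the
real case top_n would be 1000.

```r
# Hypothetical long-format data standing in for the query results.
dat <- data.frame(
  v_date = rep(c(1650, 1700, 2000), each = 3),
  v_pop  = c(15000, 10000, 5000, 18000, 12000, 9500, 50000, 48000, 18000)
)

top_n <- 2  # stands in for the 1000 largest cities

# Analyse one time slice: sort by decreasing population, keep the
# largest top_n rows, and number them 1..n (ties keep their sort order).
rank_slice <- function(slice, n = top_n) {
  slice <- slice[order(-slice$v_pop), ]
  slice <- head(slice, n)
  slice$rank <- seq_len(nrow(slice))
  slice
}

# Apply to each year separately, then stack the pieces into one data frame.
slices <- lapply(split(dat, dat$v_date), rank_slice)
ranked <- do.call(rbind, slices)
rownames(ranked) <- NULL
print(ranked)
```

Each element of slices can be analysed on its own, and the stacked ranked
data frame keeps v_date, so log(v_pop) against log(rank) can later be drawn
year by year on a single plot.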
Roger
>
> *In the end, I want a rank-size distribution by date (1600 - 2000) for the
> 1000 biggest cities of a simulation*
>
> First, my db has two columns which interest me: population, date
>
> With "Rpy" and "rdbiPgSql", I run this query (for example):
>> Select v_date, v_pop from data where date = 1650 order by v_pop desc limit 1000
> So first I want the 1000 biggest cities for one date.
>
> As a result, I have a data frame like this:
>
> v_pop | v_date
> 15000 1650
> 10000 1650
> 5000 1650
> .....etc
>
> For the rank-size, I need to create a rank, so I use the
> rank(MyDataframe$v_pop) function in R
>
> v_pop | v_date | rank
> 15000 1650 1
> 10000 1650 2
> 5000 1650 3
>
> Next, I have many problems to resolve (for a beginner like me ... ouch)
>
> a) I need to create a time series from this data frame, in this order ...
> but v_date already exists ..
> b) I need to fill this data frame with the 1000 biggest cities and their
> rank for each date (iterating the query, with the date advancing in steps
> of 10 years, while date < 2000), like this:
>
> v_pop | v_date | rank
> 15000 1650 1
> 10000 1650 2
> 5000 1650 3
> 18000 1700 1
> 12000 1700 2
> 9500 1700 3
> ...etc...
> 50000 2000 1
> 48000 2000 2
> 18000 2000 3
>
> For each date, the same query and rank function, with the results
> concatenated, but how?
> Do I need other data frames (one per date) or not? And how can I represent
> these data? Do I need to use time series or not?
>
> c) Representation of the data
>
> I need log population on y, log rank on x, and one line/plot per time
> series... a lot ... (1650 to 2000 by 10 years: 1650 1660 1670 ... etc)
> Do I need to use maplot, or are there other tools to draw each series
> on top of the others? *So, in fact, I want only one graphic with all the
> time series present ...*
>
> Thanks a lot in advance for your help!!
> Sebastien.
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
--
Roger Bivand
Economic Geography Section, Department of Economics, Norwegian School of
Economics and Business Administration, Helleveien 30, N-5045 Bergen,
Norway. voice: +47 55 95 93 55; fax +47 55 95 95 43
e-mail: Roger.Bivand at nhh.no