[R-sig-DB] Data Frame from a Teradata table

MacQueen, Don m@cqueen1 @end|ng |rom ||n|@gov
Thu Sep 24 17:44:16 CEST 2015


The output from
  str(tdf)
  class(tdf)
would be helpful.

It may be that "td.data.frame" objects, whatever they are, do not use the
same syntax as "data.frame" objects. Perhaps they only support numeric
indexing, not logical indexing.
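To make the distinction concrete, here is a minimal sketch using an ordinary data.frame with made-up values as a stand-in. Whether td.data.frame accepts numeric row indexes at all is an assumption I have not checked; this only shows what the two kinds of indexing look like in base R.

```r
# Ordinary data.frame standing in for the Teradata-backed object (made-up data)
d <- data.frame(dias_mora = c(0, 45, 120, NA), tipo_id = c(3, 3, 9, 3))

# Logical vector, one element per row; NA in the data gives NA here
keep <- d$dias_mora > 30 & d$tipo_id == 3

# Logical indexing, treating NA as "do not select"
by_logical <- d[!is.na(keep) & keep, ]

# Numeric indexing: which() returns the positions of TRUE and drops NA
idx <- which(keep)
by_numeric <- d[idx, ]
```

On a plain data.frame both forms select the same rows, so which() is a cheap thing to try if only numeric subscripts are accepted.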


Assuming logical indexes are valid for td.data.frame objects,
  tdf$Dias_Mora
should be
  tdf$dias_mora
[R names are case-sensitive; see the variable names shown in the output of summary()]
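A small sketch of why the capitalization matters, again on an ordinary data.frame with made-up values (a td.data.frame may behave differently, which I cannot confirm):

```r
# Ordinary data.frame with a lower-case column name (made-up data)
d <- data.frame(dias_mora = c(0, 45, 120))

names(d)                              # column names exactly as stored
wrong_case <- d$Dias_Mora             # NULL: the wrong case matches no column
has_column <- "Dias_Mora" %in% names(d)

# Comparing NULL with a number gives a zero-length logical vector,
# not one value per row
cmp <- wrong_case > 30
```

With a plain data.frame the misspelled name fails silently, so checking names() first is the quickest test.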

If that doesn't fix it, then try this:

tmp <- tdf$dias_mora > Dias_Mora &
  tdf$periodo >= Fecha_Inicio_YM &
  tdf$periodo <= Fecha_Final_YM &
  tdf$tipo_id == 3

Then, one or more of these should help reveal the problem:
  class(tmp)
  str(tmp)
  table(tmp)
tmp should be logical, and should not have any NA (missing) values.
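As a sketch of what those checks look like, here is a made-up logical vector with one missing value standing in for the real tmp:

```r
# Made-up stand-in for tmp, with one NA
tmp <- c(TRUE, FALSE, NA, TRUE)

is_lgl <- is.logical(tmp)             # type check
has_na <- anyNA(tmp)                  # TRUE if any element is NA
counts <- table(tmp, useNA = "ifany") # counts of FALSE, TRUE, and NA

# One possible cleanup: treat NA as "do not select"
tmp_clean <- !is.na(tmp) & tmp
```

If NA values do turn up, the last line is one way to handle them; whether dropping those rows is appropriate depends on the data.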

All of this assumes that
  Dias_Mora
  Fecha_Inicio_YM
  Fecha_Final_YM
all exist and are of the correct type (apparently numeric)
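A quick way to check those assumptions might look like the sketch below. The three values assigned here are hypothetical placeholders; in your session they would already be defined.

```r
# Hypothetical values, for illustration only
Dias_Mora <- 30
Fecha_Inicio_YM <- 201201
Fecha_Final_YM <- 201312

# Building the list fails with "object not found" if any variable is missing
params <- list(Dias_Mora = Dias_Mora,
               Fecha_Inicio_YM = Fecha_Inicio_YM,
               Fecha_Final_YM = Fecha_Final_YM)

is_num <- vapply(params, is.numeric, logical(1))   # each should be TRUE
lens   <- vapply(params, length, integer(1))       # each should be 1
```

A character value such as "201201" would show up as FALSE in is_num, which is worth ruling out before comparing against periodo.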

Also, please do not post in HTML. And you should identify what package
td.data.frame comes from, since it is not part of base R.

-- 
Don MacQueen

Lawrence Livermore National Laboratory
7000 East Ave., L-627
Livermore, CA 94550
925-423-1062





On 9/22/15, 2:41 PM, "R-sig-DB on behalf of Marco Cetraro"
<r-sig-db-bounces using r-project.org on behalf of marco.cetraro using gmail.com>
wrote:

>Hi all,
>
>I am new in R language.  I have created a data frame in R using a Teradata
>table, the statement:
>
>tdf <- td.data.frame("base_08092015_v2")
>
>where base_08092015_v2 is a Teradata table.
>
>summary(tdf)
>   numero_id           dias_mora          periodo           saldo
> Min.   :      626   Min.   :  -2.00   Min.   :201002   Min.   :        1
> 1st Qu.:   446602   1st Qu.:   0.00   1st Qu.:201201   1st Qu.: 11196611
> Median :  1038866   Median :   0.00   Median :201212   Median : 17477384
> Mean   :  2251666   Mean   :  54.84   Mean   :201244   Mean   : 20259955
> 3rd Qu.:  1589698   3rd Qu.:   0.00   3rd Qu.:201310   3rd Qu.: 25689429
> Max    :178371212   Max    :7334.00   Max    :201409   Max    :200000000
>                                                        NA's   :    37762
>    tipo_id
> Min.   :3.000
> 1st Qu.:3.000
> Median :3.000
> Mean   :3.021
> 3rd Qu.:3.000
> Max    :9.000
>
>
>My problem is that I get an error when I try to filter the td.data.frame
>tdf:
>
>new_tdf <- tdf[tdf$Dias_Mora > Dias_Mora & tdf$periodo >= Fecha_Inicio_YM
>&
>tdf$periodo <= Fecha_Final_YM & tdf$tipo_id == 3, ]
>*Error in `[.td.data.frame`(tdf, tdf$Dias_Mora > Dias_Mora & tdf$periodo
>>=
> : *
>*  Invalid subscript type 'logical'*
>
>Also, when I executed the statement:
>
>dups <- tdf5[duplicated(tdf5$periodo), ]
>*Error in `[.td.data.frame`(tdf5, duplicated(tdf5$periodo), ) : *
>*  Invalid subscript type 'logical'*
>
>
>*I don't understand the error.  I searched on the internet as well
>as specialize R websites and I couldn't find any information.*
>
>*THANK YOU VERY MUCH.*
>
>
>-- 
>Regards,
>
>Marco Cetraro
>
>	[[alternative HTML version deleted]]
>
>_______________________________________________
>R-sig-DB mailing list -- R Special Interest Group
>R-sig-DB using r-project.org
>https://stat.ethz.ch/mailman/listinfo/r-sig-db



