R for Stata users

Formulas

The table below shows the correspondence between regression model specifications in Stata and formulas in R.

Stata              R
y x1 x2            y ~ x1 + x2
y x1, nocons       y ~ 0 + x1
y i.x1             y ~ as.factor(x1)
y c.x1#c.x2        y ~ x1:x2
y c.x1##c.x2       y ~ x1*x2
y c.x1##i.x2       y ~ x1*as.factor(x2)
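To check how R expands each formula in the table, one option is to inspect the column names of the design matrix that base R's model.matrix builds. The sketch below uses a small made-up data frame (toy, with columns y, x1, x2); the data and variable names are assumptions for illustration only.

```r
# Toy data, invented purely for illustration
toy <- data.frame(
  y  = c(1.0, 2.5, 3.1, 4.8, 5.2, 6.9),
  x1 = c(1, 2, 3, 4, 5, 6),
  x2 = c(1, 1, 2, 2, 3, 3)
)

# y ~ x1 + x2: intercept plus two continuous regressors
cols_additive <- colnames(model.matrix(y ~ x1 + x2, toy))

# y ~ 0 + x1: the 0 drops the intercept (Stata: nocons)
cols_nocons <- colnames(model.matrix(y ~ 0 + x1, toy))

# y ~ x1:x2: the interaction term only (Stata: c.x1#c.x2)
cols_interaction <- colnames(model.matrix(y ~ x1:x2, toy))

# y ~ x1*x2: main effects and their interaction (Stata: c.x1##c.x2)
cols_full <- colnames(model.matrix(y ~ x1 * x2, toy))

# y ~ as.factor(x2): one dummy per level except the first (Stata: i.x2)
cols_factor <- colnames(model.matrix(y ~ as.factor(x2), toy))

print(cols_full)
#> [1] "(Intercept)" "x1"          "x2"          "x1:x2"
print(cols_factor)
#> [1] "(Intercept)"    "as.factor(x2)2" "as.factor(x2)3"
```

Note that x1:x2 alone still keeps the intercept; only 0 + (or - 1) removes it.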

Estimation commands


Post-estimation commands

An estimation function returns a list containing the estimates, the covariance matrix, and often the residuals, the predicted values, or the original variables used in the estimation. Apply the names function to inspect its contents:

library(lfe)  # provides felm()
result <- felm(y ~ x2, df)
names(result)
#>  [1] "coefficients"  "badconv"       "Pp"            "N"             "p"            
#>  [6] "inv"           "beta"          "response"      "fitted.values" "residuals"    
#> [11] "r.residuals"   "terms"         "cfactor"       "numrefs"       "df"           
#> [16] "df.residual"   "rank"          "exactDOF"      "vcv"           "robustvcv"    
#> [21] "clustervcv"    "cse"           "ctval"         "cpval"         "clustervar"   
#> [26] "se"            "tval"          "pval"          "rse"           "rtval"        
#> [31] "rpval"         "xp"            "call"   
pryr::object_size(result)
#> [1] 88 MB
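Individual pieces of a fitted model can be pulled out with the standard accessor functions. The sketch below uses base R's lm on the built-in mtcars data as a stand-in, so that it runs without extra packages; the model and variable names are assumptions for illustration, and the same accessors are expected, but not guaranteed here, to work on other model classes.

```r
# Stand-in model on a built-in dataset (not the felm model above)
fit <- lm(mpg ~ wt, data = mtcars)

b     <- coef(fit)       # named vector of point estimates
V     <- vcov(fit)       # covariance matrix of the estimates
e     <- residuals(fit)  # residuals, one per observation
y_hat <- fitted(fit)     # predicted values, one per observation

# Standard errors are the square roots of the diagonal of V
se <- sqrt(diag(V))

print(names(b))
#> [1] "(Intercept)" "wt"
```

Because the result is an ordinary list, elements can also be read directly, for example fit$coefficients.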

Applying summary prints a table similar to Stata's regression output:

summary(result)
#> Call:
#>    felm(formula = y ~ x2, data = df) 
#> 
#> Residuals:
#>     Min      1Q  Median      3Q     Max 
#> -48.834 -23.175  -5.028  25.222  50.939 
#> 
#> Coefficients:
#>              Estimate Std. Error t value Pr(>|t|)    
#> (Intercept) 48.746112   0.064228 758.949   <2e-16 ***
#> x2           0.001997   0.001059   1.886   0.0593 .  
#> ---
#> Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
#> 
#> Residual standard error: 29.91 on 999998 degrees of freedom
#> Multiple R-squared: 3.556e-06   Adjusted R-squared: 1.556e-06 
#> F-statistic:3.556 on 1 and 999998 DF, p-value: 0.05934 
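The printed table can also be used programmatically: the object returned by summary holds the coefficient table as a numeric matrix. The sketch below again uses lm on the built-in mtcars data as an assumed stand-in model; the names fit, s, and coef_table are illustrative.

```r
# Stand-in model on a built-in dataset
fit <- lm(mpg ~ wt, data = mtcars)
s   <- summary(fit)

# Coefficient table: one row per term, four numeric columns
coef_table <- coef(s)
print(colnames(coef_table))
#> [1] "Estimate"   "Std. Error" "t value"    "Pr(>|t|)"

# Pull out single numbers by row and column name
p_wt <- coef_table["wt", "Pr(>|t|)"]  # p-value for wt
r2   <- s$r.squared                   # R-squared (element of summary.lm)
```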

The stargazer package combines several regression results into a single table:

library(stargazer)  # provides stargazer()
stargazer(result, type = "text")
#> ===============================================
#>                         Dependent variable:    
#>                     ---------------------------
#>                                  y             
#> -----------------------------------------------
#> x2                            -0.0004          
#>                               (0.001)          
#>                                                
#> Constant                     50.315***         
#>                               (0.064)          
#>                                                
#> -----------------------------------------------
#> Observations                 1,000,000         
#> R2                            0.00000          
#> Adjusted R2                  -0.00000          
#> Residual Std. Error    29.707 (df = 999998)    
#> ===============================================
#> Note:               *p<0.1; **p<0.05; ***p<0.01
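To place several models side by side, pass them to stargazer as separate arguments; each model gets its own column. A minimal sketch, assuming two lm models fitted on the built-in mtcars data (the models m1 and m2 are illustrative and unrelated to the example above). The call is guarded so the script still runs if stargazer is not installed.

```r
# Two nested stand-in models on a built-in dataset
m1 <- lm(mpg ~ wt, data = mtcars)
m2 <- lm(mpg ~ wt + hp, data = mtcars)

# One column per model; a term missing from a model is left blank
if (requireNamespace("stargazer", quietly = TRUE)) {
  stargazer::stargazer(m1, m2, type = "text")
}
```

Other values of type, such as "latex" or "html", produce output suited to papers and web pages.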