Chapter 4 Missing values

4.1 EIA Data

4.1.1 Missing State/Fuel Type Power Generation

The EIA.gov data set was one of the data sets used in our analysis that contained missing values. Most of these values are missing because most states did not generate electricity from every fuel type. The missing data therefore shows which fuel types are widely adopted across U.S. states and which are less common. For example, geothermal had the most missing values, which makes it one of the least widely used fuel types.

The table above shows that every state used solar energy at some point. Interestingly, some states did not use coal at any point during this time period: the table below shows that Vermont and Rhode Island are the two states that did not use coal for electricity generation. The same table shows that California was the only state to use all fuel types.
state all_solar coal conventional_hydroelectric natural_gas other other_biomass other_gases other_renewables_(total) petroleum_liquids small_scale_solar_photovoltaic wind wood_and_wood_derived_fuels all_utility_scale_solar nuclear utility_scale_photovoltaic hydro_electric_pumped_storage petroleum_coke utility_scale_thermal geothermal
RI 1545.78451 NA 90.54041 140623.69762 587.38 3002.36653 NA 4520.65219 829.22405 960.12196 916.06594 NA 587.66915 NA 587.66915 NA NA NA NA
VT 1684.51429 NA 25420.24971 59.94506 -1.41701 388.45171 NA 12711.54855 159.16609 874.49978 3069.33903 8410.28802 830.96858 65159.442 830.96858 NA NA NA NA
ID 2896.76559 1377.5923 200738.45136 47981.8643 1427.72107 1312.4956 NA 41311.7864 5.39752 296.66977 26780.39738 9573.82691 2600.09582 NA 2600.09582 NA NA NA 1016.55094
ME 565.58343 4108.98727 73834.47994 126702.42834 7814.63875 4931.44991 0.17447 89613.91163 12731.37149 416.76863 19141.63292 65365.23826 168.74774 NA 168.74774 NA NA NA NA
AK 26.38682 13030.92762 30793.78971 70595.78698 -22.16802 473.0772 5.39677 1845.77939 17954.04243 26.38682 1338.21114 3.10034 NA NA NA NA NA NA NA
CA 267831.66499 29127.68279 618822.28708 2129739.69811 13146.6396 52140.16 32925.9688 751218.99583 3516.27601 89082.10476 180860.38923 72449.91284 190711.00956 540330.483 166033.04439 -1122.36599 19548.66688 23581.32219 255057.52497
HI 8869.15441 30439.81552 1839.57599 NA 5932.25169 6328.47489 782.0006 19104.16844 160786.68716 7160.67493 6818.8049 0.00916 1739.32444 NA 1739.32444 NA NA NA 3873.90104
CT 4280.65312 41146.99533 8662.60686 279472.58019 14040.91211 13931.95732 29.67647 16498.26611 20459.41144 3520.95153 70.70303 1545.98283 759.70159 337873.449 759.70159 25.684 NA NA NA
NH 582.00275 45999.11115 28921.52503 95930.5315 1173.79507 2938.11728 NA 28240.62979 8805.41674 710.9275 4183.32978 20845.12707 NA 202125.158 NA NA NA NA NA
DE 1180.3357 49191.98863 NA 78563.26664 16.81783 1282.66562 5906.66587 1821.39827 8202.78606 771.84785 54.67906 NA 484.05366 NA 484.05366 NA 143.539 NA NA
SD 10.56905 57746.82459 102941.98802 10129.70107 61.96046 7.43935 NA 39031.63483 292.94667 7.61476 39014.95476 NA 5.92104 NA 5.92104 NA NA NA NA
OR 5131.43772 62240.43928 702882.8772 295338.95395 854.11813 4478.44034 NA 113327.9236 409.73103 1414.03884 89786.94429 13787.47825 3744.39389 NA 3743.80891 NA NA 0.58499 1493.30099
NJ 22304.47549 111013.88373 464.68909 542276.92395 11675.23234 18292.77718 3584.17472 27281.9166 9083.56476 14453.32033 286.94427 NA 8696.22749 646616.702 8696.22749 -3956.668 539.56673 NA NA
MA 19868.91052 125349.72803 20620.13733 420664.39289 17100.10678 22269.9357 NA 34057.41236 49863.77042 12698.37239 2128.51567 2215.31674 7312.4242 98555.949 7308.85315 -10675.135 NA 3.57101 NA
WA 1269.29777 146701.0413 1571941.74236 202915.97348 1643.5455 4289.67765 6517.56803 128935.0278 1050.94228 1134.34764 95400.71839 29107.28595 137.34613 176113.702 137.34613 429.564 NA NA NA
NV 33179.13055 161215.91874 43470.51335 466394.93881 227.58015 412.78079 141.43264 82187.23607 1278.05502 3852.31663 2884.71215 0.89038 31428.04287 NA 29344.94287 NA NA 2083.1 47460.81011
NY 14030.75486 231014.57831 552589.54463 956076.05277 18610.66959 31340.63778 116.21746 95527.19309 123370.90597 10618.21748 49709.77689 10938.07296 3538.71055 864706.032 3538.71055 -11978.199 3462.73945 NA NA
MS 1545.31959 235184.6115 NA 641218.64863 88.609 185.05253 283.49253 30893.68741 12179.86891 72.013 NA 29226.63742 1473.30659 196913.867 1473.30659 NA NA NA NA
MT 291.33129 318809.89867 207100.42207 5616.14615 3742.01502 NA 197.65333 26129.76753 308.54095 155.34924 24994.379 999.40649 135.98205 NA 135.98205 NA 9047.59509 NA NA
LA 1561.1192 386082.30632 21055.78992 1110469.40107 12549.27508 1613.86026 37060.1244 55276.54305 8294.21747 1559.73691 NA 53458.99311 201.26799 341972.739 201.26799 NA 64033.03348 NA NA
MD 8138.47137 410153.73008 39634.41419 99844.88968 6161.16777 8062.93168 3822.56858 18481.22638 20577.2108 5455.18233 4714.13353 2932.12813 2772.03423 296154.316 2772.03423 NA NA NA NA
NE 260.01279 456174.09375 22654.60661 14023.3529 0.058 1328.34659 0.455 49375.74286 464.57267 75.06445 47862.45104 NA 184.94834 175542.64 174.9754 NA NA 0.465 NA
VA 6531.45188 475985.00487 27403.27484 497461.57572 9425.9349 18017.80697 2.959 66258.93608 37309.13266 803.17482 32.771 42480.07906 5728.27706 581830.162 5728.27706 -28559.521 NA NA NA
NM 10571.88264 485144.08273 3599.22513 163650.52088 2.20802 337.27488 NA 64975.07069 1241.74519 1855.19671 54854.69599 NA 9575.25613 NA 9575.25613 NA NA NA 204.32802
AR 1332.28719 527849.63495 67317.52244 248337.9462 585.02454 1411.34091 NA 32714.04228 2996.82878 258.6885 NA 30229.10587 1073.59869 295601.959 1073.59869 746.002 6.08246 NA NA
MN 6669.7018 562456.61504 17215.49909 108368.78689 7006.48047 12319.14932 76.4754 165858.5385 1504.3265 586.14578 129864.79835 17588.32904 6086.26202 270134.863 6086.26202 NA 3832.24137 NA NA
OK 394.53682 569918.64239 51546.60373 664389.12545 237.11556 1393.47504 251.58508 222201.85569 974.41619 107.76632 215740.00131 4781.60909 286.7705 NA 286.7705 -2782.475 4.52707 NA NA
ND 3.95326 577009.03413 43125.16295 7857.04458 594.08987 114.70323 918.4035 102519.74213 782.09783 3.95326 102403.01865 NA NA NA NA NA NA NA NA
KS 309.43823 593642.92435 365.03397 43421.42441 23.02298 697.35703 NA 165309.82524 4743.16426 190.64058 164479.77627 NA 128.68774 186783.672 128.68774 NA 409.8712 NA NA
SC 6341.83099 620145.55522 47334.60961 240440.59162 1253.84686 2613.56125 6.26658 45539.39845 4137.51122 1436.59252 NA 38020.48375 4905.35347 1093161.238 4905.35347 -20409.311 1297.08295 NA NA
CO 11311.73993 664849.60521 34528.85534 258380.84231 785.75524 1200.98123 15.49846 117353.18201 464.99774 4341.80185 108010.12943 564.95157 7577.12001 NA 7577.12001 -4368.985 NA NA NA
UT 15104.34631 665184.02267 15435.81324 121708.212 2281.83223 985.41086 252.15668 28415.56531 824.30038 2398.50377 8366.04847 NA 12709.56153 NA 12709.56153 NA NA NA 6325.33915
IA 1080.03489 668398.13277 18377.48586 56457.33542 83.95136 3539.72075 NA 255398.05122 1866.79639 890.04137 251602.43475 29.62454 221.82216 93138.763 221.82216 NA 1073.62205 NA NA
AZ 52434.27495 750190.25638 141540.10285 656739.71056 2838.10947 834.06966 NA 48859.93524 1343.82378 16052.24419 6265.63319 1981.29382 39617.85648 624097.736 33970.87471 1963.21503 NA 5646.03374 NA
WI 1044.37124 767108.12568 44079.46405 193605.79998 1668.51052 9831.40252 0.01843 48188.96527 1576.46753 586.37286 20220.38471 17679.18007 457.99838 234869.527 457.99838 NA 8057.2525 NA NA
WY 530.11849 856119.2437 18291.76518 13245.24721 1424.39821 NA 5334.82066 59308.38827 955.45887 49.38749 58827.65727 NA 480.731 NA 480.731 NA NA NA NA
TN 1991.62408 856218.3585 184952.05394 116874.61301 154.43055 1336.98459 261.50256 20419.45431 3985.92341 641.80763 682.87929 16943.63475 1370.21045 599837.351 1370.21045 -12698.521 NA NA NA
FL 24774.7416 1037854.05087 4132.72798 2584653.2338 64320.72198 48264.90776 152.95182 113078.39593 216455.11963 4131.35384 1.422 43550.94652 21262.54175 591171.39556 20584.41377 NA 64587.82146 678.12799 NA
AL 45.20489 1125526.51801 196067.79013 774497.87114 1594.98921 550.30187 2681.65652 70815.93177 3338.19003 13.94389 NA 68665.40914 1600.22085 781903.679 1600.22085 NA NA NA NA
NC 42184.77426 1164205.99636 99712.96707 395234.11512 8648.0477 6919.34227 1.871 88934.00133 8100.73511 1669.15498 2417.11 38562.69455 41034.85761 843140.745 41034.85761 637.658 NA NA NA
MI 1516.2678 1172135.16912 30834.45985 378045.0067 6541.86592 17586.62515 14766.05777 98339.87453 7001.91906 679.84548 46555.77013 33360.17106 837.31255 623127.761 837.31255 -18247.318 8759.58824 NA NA
GA 7448.43252 1185176.03652 65448.32175 631581.86733 1324.50687 4199.39263 NA 98043.73035 5423.42635 865.55397 NA 79202.67584 14639.32809 682021.878 14639.32809 -11219.83 7002.13368 NA NA
MO 2308.71695 1436631.64653 27123.00771 101326.56243 744.62634 1091.97273 31.03444 25571.74032 1852.99079 1816.48294 23644.70808 342.82578 492.23401 184653.11 492.23401 3431.65499 1229.66152 NA NA
WV 74.13589 1581867.50519 30169.56341 16124.85338 7.22683 49.55801 1256.98443 19017.67701 3720.98422 74.13589 18956.95031 1.57405 NA NA NA NA NA NA NA
KY 434.90704 1610321.13974 70178.56006 98964.1018 572.80741 1665.28802 26.08363 8268.89356 2239.89494 239.60219 NA 6408.30087 195.30485 NA 195.30485 NA 33348.21879 NA NA
IL 2221.071 1615090.33 2722.20677 205219.06859 3794.80356 12341.67282 4697.46191 146594.1206 4975.72028 1540.28256 133458.91101 1.74842 791.78849 1982880.396 776.2115 NA 235.41958 15.577 NA
PA 3963.358 1810885.66868 57540.48067 926761.91579 16782.68328 32949.98662 10877.39419 85177.40242 25816.98211 3157.96605 39505.98486 11785.99744 935.43471 1613850.278 935.26012 -12646.336 3889.5396 0.17459 NA
IN 2890.21313 2020734.57321 8090.71721 242636.17366 7491.38208 6377.64439 49316.59525 64302.2994 3449.27652 624.46551 55628.19657 NA 2296.28952 NA 2296.28952 NA 7365.66929 NA NA
OH 2393.47135 2052208.92151 9470.57018 378511.15255 319.45896 5545.837 10644.70962 29028.76502 5726.45544 1206.05839 15237.73873 6946.48792 1298.70171 327734.202 1289.35271 NA 17611.94245 148.1 NA
TX 34090.65723 2765273.52803 22496.22729 4319682.39565 13559.97617 9934.25974 63920.24054 774165.1792 6042.87739 6082.99391 716669.86844 19235.273 28325.78326 818976.511 28322.73926 NA 15285.04875 3.049 NA
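
The observations drawn from this table can be checked with a few lines of code. The following is a minimal Python sketch, not the code used in the analysis. It uses only four columns and four rows copied from the table above, with `None` standing in for NA, and all function names are hypothetical.

```python
# Excerpt of the table above: four states, four fuel types.
# None stands in for NA (no value for that state and fuel type).
generation = {
    "RI": {"all_solar": 1545.78451, "coal": None,
           "nuclear": None, "geothermal": None},
    "VT": {"all_solar": 1684.51429, "coal": None,
           "nuclear": 65159.442, "geothermal": None},
    "ID": {"all_solar": 2896.76559, "coal": 1377.5923,
           "nuclear": None, "geothermal": 1016.55094},
    "CA": {"all_solar": 267831.66499, "coal": 29127.68279,
           "nuclear": 540330.483, "geothermal": 255057.52497},
}


def missing_by_fuel(table):
    """Count, for each fuel type, how many states have no value."""
    counts = {}
    for fuels in table.values():
        for fuel, value in fuels.items():
            counts[fuel] = counts.get(fuel, 0) + (value is None)
    return counts


def states_missing(table, fuel):
    """Return the states with no value for the given fuel type."""
    return sorted(s for s, fuels in table.items() if fuels[fuel] is None)


def states_with_all_fuels(table):
    """Return the states that have a value for every fuel type."""
    return sorted(
        s for s, fuels in table.items()
        if all(value is not None for value in fuels.values())
    )


print(missing_by_fuel(generation))
print(states_missing(generation, "coal"))
print(states_with_all_fuels(generation))
```

Because this excerpt covers only four states and four fuel types, the counts it prints are not the counts for the full table; the same functions would need the complete table as input to reproduce the statements above.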

The plot below shows the time series of the different types of solar energy production. The total solar series, which aggregates the utility-scale solar and small-scale photovoltaic series, has no data before 2014. The aggregation apparently did not include the utility-scale solar values before that date, perhaps because the small-scale solar photovoltaic series contained NA values during this period. This was corrected, and the series aggregated properly, in the data transformation section.
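
The suspected cause can be illustrated with a small sketch. It assumes, as the paragraph above speculates, that the total was computed as a plain sum in which any missing component makes the result missing. The months, values, and function names are hypothetical, and the sketch is written in Python rather than in the tooling used for the analysis.

```python
def strict_total(utility, small):
    """Sum that propagates missing values: any None input gives None."""
    if utility is None or small is None:
        return None
    return utility + small


def lenient_total(utility, small):
    """Sum that skips missing values; None only if both inputs are missing."""
    present = [v for v in (utility, small) if v is not None]
    return sum(present) if present else None


# Hypothetical monthly values: (utility-scale solar, small-scale solar PV).
series = {
    "2013-06": (10.0, None),  # small-scale value is missing
    "2014-06": (20.0, 5.0),
}

for month, (utility, small) in series.items():
    print(month, strict_total(utility, small), lenient_total(utility, small))
```

Under the strict sum the 2013 total is missing even though a utility-scale value exists, which matches the gap seen in the plot. Whether skipping a missing component is the right correction depends on what NA means for the small-scale series.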

4.2 WB Data

4.2.1 Missing values analysis of the WB data

We examined missing data in the WB data set from two perspectives: by development indicator (rows) and by year (columns). Four indicators (Pov Rt, Ext Debt, Debt Sv %, and ODA) are completely missing from the data set. In addition, the earliest years (1960-1969) are each missing 29 to 36 indicators; the only later year with a comparable count is the most recent one, 2020, which is missing 31. The 2020 data may be unavailable because of a time lag in publishing such development indicators. The 15 indicators and the 15 years with the most missing values are listed below:

##     Pov Rt   Ext Debt  Debt Sv %        ODA  Prim Comp   Start up    Water P 
##         59         59         59         59         56         52         51 
##        HIV   Under Wt  Migration   HT Exp %   Contr Pr   Prim Sch     Agri % 
##         49         48         47         45         40         36         36 
## Industry % 
##         36
## 1960 1961 1963 1964 1966 1968 1969 2020 1962 1967 1965 1971 1970 1972 1974 
##   36   33   31   31   31   31   31   31   30   30   29   25   24   23   23
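
The two tabulations above amount to counting missing values per row and per column and keeping the largest counts. The following Python sketch shows one way to do that, assuming a layout with indicators as rows and years as columns. The values are placeholders, the excerpt is hypothetical, and the function names are not taken from the analysis.

```python
# Hypothetical excerpt: indicators as rows, years as columns, None for NA.
wb = {
    "Pov Rt":  {1960: None, 1990: None, 2020: None},
    "GNI PPP": {1960: None, 1990: 1.0, 2020: None},
    "GDP %":   {1960: 1.0, 1990: 2.0, 2020: None},
}


def missing_by_indicator(data):
    """Count missing years for each indicator (row-wise count)."""
    return {
        name: sum(value is None for value in years.values())
        for name, years in data.items()
    }


def missing_by_year(data):
    """Count missing indicators for each year (column-wise count)."""
    counts = {}
    for years in data.values():
        for year, value in years.items():
            counts[year] = counts.get(year, 0) + (value is None)
    return counts


def top_missing(counts, n=15):
    """Return the n (name, count) pairs with the most missing values."""
    # sorted() is stable, so ties keep their original order.
    return sorted(counts.items(), key=lambda item: item[1], reverse=True)[:n]


print(top_missing(missing_by_indicator(wb)))
print(top_missing(missing_by_year(wb)))
```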

We further analyzed the missing value patterns in the development indicators using the mi::missing_data.frame function, which excluded the 4 indicators listed above that are completely missing from the WB data set. The heatmap of missing values by data field revealed these patterns; for example, GNI PPP and GNI PC PPP are both missing in the same years (pre-1990). It also confirmed that the indicators required for our analysis, such as CO2 Emis PC, GNI PC, Energy Use PC, GDP %, and Power Cons PC, are largely present in the historical data, which allows us to explore important trends for the U.S. clean energy industry.

## NOTE: The following pairs of variables appear to have the same missingness pattern.
##  Please verify whether they are in fact logically distinct variables.
##      [,1]           [,2]          
## [1,] "Life Exp"     "Fertility Rt"
## [2,] "Life Exp"     "Ado Ft"      
## [3,] "Life Exp"     "Mort Rt"     
## [4,] "Fertility Rt" "Ado Ft"      
## [5,] "Fertility Rt" "Mort Rt"     
## [6,] "Ado Ft"       "Mort Rt"
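
The note above flags pairs of indicators whose values are missing in exactly the same positions. The idea behind such a check can be sketched as follows. This is an illustration in Python, not the implementation used by the mi package; the values are placeholders and the function name is hypothetical.

```python
from itertools import combinations


def same_missingness_pairs(data):
    """Return pairs of variables that are missing in exactly the same positions."""
    # Reduce every variable to a tuple of True/False "is missing" flags.
    patterns = {
        name: tuple(value is None for value in values)
        for name, values in data.items()
    }
    return [
        (a, b)
        for a, b in combinations(patterns, 2)
        if patterns[a] == patterns[b]
    ]


# Hypothetical excerpt: one list of yearly values per indicator, None for NA.
indicators = {
    "Life Exp":     [70.0, 71.0, None],
    "Fertility Rt": [3.0, 2.5, None],
    "Mort Rt":      [25.0, 20.0, None],
    "GNI PPP":      [None, 10.0, 12.0],
}

for pair in same_missingness_pairs(indicators):
    print(pair)
```

A shared pattern only says that the indicators are missing together; it does not by itself show that they are the same variable, which is why the note asks for the pairs to be verified.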