Impact of Opioid-Related Deaths on Healthcare Expenditure: STATA Analysis

November 07, 2023
Olivia Beecham
Olivia Beecham, a STATA Homework Expert, holds a Master's in Statistics from the University of Queensland, Australia. With over 8 years of experience in data analysis and econometrics, she provides comprehensive assistance in STATA-related assignments.

In this comprehensive STATA analysis, we dive into the complex relationship between opioid-related deaths and various healthcare expenditure and regional factors in the United States. This investigation encompasses multiple econometric models, each offering unique insights into the opioid crisis, healthcare costs, and socio-economic indicators. Explore the detailed findings of this study and gain a deeper understanding of the intricate dynamics surrounding this critical issue.

Problem Statement:

This STATA homework explores the relationships among variables related to opioid-related deaths and healthcare expenditure in the United States. The variables are drawn from different sources and span the years 2000-2020. The key variables and their sources are summarized in Table 1.

Solution

Data Description

Table 1 lists the name, description, and data source of each variable.

Table 1

Variables and Data Sources

| Name | Variable | Description | Source |
|---|---|---|---|
| medicareinmillion | Medicare Spending | Total Medicare spending by state | CMS GOV |
| medicaid | Medicaid Spending | Total Medicaid spending by state | Medicaid GOV (Years: 2000-2020) |
| tcmcaremcaidml | Medicare/Medicaid | Medicare/Medicaid costs in millions | |
| tcmcaremcaidmladj | Medicare/Medicaid adjusted | Medicare/Medicaid costs in millions, adjusted | |
| oddeaths | Overdoses | Opioid Overdose Deaths | Kaiser Family Foundation |
| population | Population | State Population Estimates | US Census Bureau |
| medinc | Median Household Income | Median Family Income | US Census Bureau |
| medhhincadj | Adjusted Median Household Income | Median household income in millions, adjusted for 2020 | |
| stategdp | State GDP | Annual Gross Domestic Product by state | US Bureau of Economic Analysis |
| unemprate | Unemployment Rate | Average annual unemployment rates by state | US Bureau of Labor Statistics |
| lfpr | Labor Force Participation Rate | Percent of civilian noninstitutional population 16+ of age working or actively seeking work | US Bureau of Labor Statistics |
| prctinsured | Percent Insured | Percent of population with Health Insurance Coverage | US Census Bureau for years 1999-2012 and 2008-2020 |
| pchsgrad | Percent high school graduates | Percent of population with a high school degree or higher | US Census Bureau Educational Attainment |
| pcgdpmanu | Manufacturing in GDP, % | % GDP in the manufacturing sector in each state | Data file given by professor |
| pcempmanu | Manufacturing Employment, % | % of employment in the manufacturing sector in each state | Data file given by professor |
| Phase2 | Dummy Variable | Dummy variable = 1, if year is later than 2009; = 0, if year is 2009 or earlier | Created in STATA |
| prescript | Prescription Rate | Opioid prescriptions dispensed per 100 persons per year | Centers for Disease Control and Prevention |
| cpi | Consumer Price Index | Consumer Price Index, measure of inflation rate | |
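
Table 1 lists several variables as "adjusted" (for example gdpadj and tcmcaremcaidmladj) alongside cpi, but it does not spell out the adjustment formula. The sketch below assumes the usual rescaling by the ratio of a base-period CPI to the CPI of the observation's year. The function name and the choice of base period are hypothetical, not taken from the original analysis.

```python
def adjust_to_base(value, cpi_year, cpi_base):
    """Rescale a nominal value to base-period prices.

    Assumption: real = nominal * (CPI in base period / CPI in observation year).
    """
    return value * cpi_base / cpi_year


# Illustration using the smallest stategdp value and the CPI range
# reported in the summary statistics (168.8 to 257.971).
adjusted = adjust_to_base(17152.5, cpi_year=168.8, cpi_base=257.971)
print(round(adjusted, 2))
```

With these inputs the result is roughly 26,213.55, which agrees with the minimum reported for gdpadj in the summary statistics. The assumption therefore appears consistent with the data, although the source does not confirm it.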

Having described the variables and their sources, we can now look at the data. Table 2 gives summary statistics for the numerical variables in the dataset.

Table 2

Summary Statistics of Numerical Variables

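| Variable | Obs | Mean | Std. Dev. | Min | Max |
|---|---|---|---|---|---|
| stateid | 0 | | | | |
| year | 1,071 | 2010 | 6.058132 | 2000 | 2020 |
| medicarein~n | 1,071 | 9405.177 | 11727.76 | 216 | 86833 |
| medicaid | 1,071 | 7.93e+09 | 1.15e+10 | 1.05e+08 | 9.78e+10 |
| oddeaths | 1,071 | 519.6671 | 675.8681 | 10 | 5508 |
| population | 1,071 | 6037820 | 6782164 | 494300 | 3.94e+07 |
| medinc | 1,071 | 52509.63 | 11683.31 | 29359 | 95572 |
| stategdp | 1,071 | 303082.5 | 388213.1 | 17152.5 | 3042694 |
| lfpr | 1,071 | 65.4732 | 4.263374 | 53.3 | 75.3 |
| unemprate | 1,071 | 5.600187 | 2.039548 | 2.1 | 13.8 |
| prctinsured | 1,071 | 88.18487 | 4.382695 | 74.5 | 97.5 |
| pchsgrad | 1,071 | 87.22642 | 3.660632 | 77.1 | 94 |
| disprate | 1,071 | 52.52577 | 39.2242 | 0 | 146.9 |
| cpi | 1,071 | 214.301 | 26.59992 | 168.8 | 257.971 |
| tcmcaremca~l | 1,071 | 17333.14 | 22642.78 | 515.2392 | 184633.3 |
| tcmcaremca~j | 1,071 | 20303.47 | 25272.16 | 787.4217 | 184633.3 |
| medhhincadj | 1,071 | 63040.36 | 10263.05 | 36226.62 | 97948.47 |
| gdpadj | 1,071 | 360602 | 445754 | 26213.55 | 3118353 |
| pcgdpmanu | 1,071 | .1197542 | .0558243 | .0017988 | .3006745 |
| pcempmanu | 1,071 | .0433397 | .0192875 | .0015069 | .1108672 |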

Econometric Models

First Model:

The first question investigates the major factors associated with opioid-related deaths in the United States. To address it, we construct our first model using only supply side variables.

logod = β_0 + β_1×phase2 + β_2×pcempmanu + β_3×phase2pcempmanu + β_4×medhhincadj + β_5×gdpadj + β_6×prctinsured + β_7×pchsgrad + β_8×unemprate + β_9×lfpr + Σ_j β_i×d_j (1)

where i = 9 + j, so that i = 10, 11, …, 29 for j = 1, 2, …, 20

  • The dependent variable, y, is logod, the logarithm of overdose deaths (named oddeaths in the dataset).
  • The explanatory variables, x, are described in Table 1.
  • β_0 is the intercept or constant term,
  • β_1 is the coefficient of the phase2 variable,
  • β_2 is the coefficient of the pcempmanu variable,
  • β_3 is the coefficient of the interaction term phase2pcempmanu,
  • β_4 is the coefficient of the medhhincadj variable,
  • β_5 is the coefficient of the gdpadj variable,
  • β_6 is the coefficient of the prctinsured variable,
  • β_7 is the coefficient of the pchsgrad variable,
  • β_8 is the coefficient of the unemprate variable,
  • β_9 is the coefficient of the lfpr variable,
  • β_i for i = 10, …, 29 are the coefficients of the year dummy variables d_j for j = 1, …, 20.
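
The derived regressors of the first model (the log of deaths, the phase2 dummy, the interaction term, and the year dummies) can be sketched as follows. This is only an illustration in Python, not the original STATA code. It assumes that d_j flags the year 2000 + j, with 2000 as the base year, and the helper name `build_model1_row` is hypothetical.

```python
import math


def build_model1_row(row):
    """Return a copy of `row` with the derived Model 1 regressors added.

    `row` needs the keys "year", "oddeaths" and "pcempmanu".
    """
    out = dict(row)
    # Dependent variable: natural log of overdose deaths.
    out["logod"] = math.log(row["oddeaths"])
    # phase2 = 1 if the year is later than 2009, else 0 (as in Table 1).
    out["phase2"] = 1 if row["year"] > 2009 else 0
    # Interaction of the period dummy with manufacturing employment share.
    out["phase2pcempmanu"] = out["phase2"] * row["pcempmanu"]
    # Year dummies d1..d20; assumed mapping: d_j = 1 when year == 2000 + j.
    for j in range(1, 21):
        out[f"d{j}"] = 1 if row["year"] == 2000 + j else 0
    return out


sample = {"year": 2012, "oddeaths": 500, "pcempmanu": 0.05}
built = build_model1_row(sample)
print(built["phase2"], built["d12"], built["phase2pcempmanu"])
```

For a 2012 observation, phase2 and d12 are switched on and the interaction term equals the manufacturing employment share itself.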

Second Model:

The second question asked in this report is whether the rising number of deaths attributed to opioids affects Medicare or Medicaid expenditures in the states. The second model again uses the supply side variables. Because some data are missing before 2006 in this dataset, this analysis uses the 2007-2020 data.

logod = γ_0 + γ_1×phase2 + γ_2×pcempmanu + γ_3×phase2pcempmanu + γ_4×medhhincadj + γ_5×gdpadj + γ_6×prctinsured + γ_7×pchsgrad + γ_8×unemprate + γ_9×lfpr + γ_10×totprescripml + Σ_m γ_k×d_m (2)

where k = 11, …, 24 and m = 7, …, 20

  • The dependent variable, y, in this model is again logod, the logarithm of overdose deaths.
  • The explanatory variables, x, are described in Table 1.
  • γ_0 is the intercept or constant term of the model,
  • γ_1 is the coefficient of the phase2 variable,
  • γ_2 is the coefficient of the pcempmanu variable,
  • γ_3 is the coefficient of the interaction term phase2pcempmanu,
  • γ_4 is the coefficient of the medhhincadj variable,
  • γ_5 is the coefficient of the gdpadj variable,
  • γ_6 is the coefficient of the prctinsured variable,
  • γ_7 is the coefficient of the pchsgrad variable,
  • γ_8 is the coefficient of the unemprate variable,
  • γ_9 is the coefficient of the lfpr variable,
  • γ_10 is the coefficient of the totprescripml variable, the total number of prescriptions dispensed in the state, in millions. It is calculated from the prescript variable in the main dataset.
  • γ_k for k = 11, …, 24 are the coefficients of the year dummy variables d_m for m = 7, …, 20.
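
The text says only that totprescripml is calculated from prescript. One plausible construction is sketched below, assuming the per-100-persons rate is scaled by the state population and then expressed in millions. Both the formula and the function name are assumptions, not confirmed by the source.

```python
def total_prescriptions_millions(prescript_per_100, population):
    """Convert a prescriptions-per-100-persons rate into a state total in millions.

    Assumption: total = (rate / 100) * population, then divided by 1,000,000.
    """
    total = prescript_per_100 / 100.0 * population
    return total / 1_000_000.0


# Hypothetical state: 52.5 prescriptions per 100 persons, 6 million residents.
example = total_prescriptions_millions(52.5, 6_000_000)
print(example)
```

Under this assumption, a state with 6 million residents and 52.5 prescriptions per 100 persons would have about 3.15 million prescriptions in that year.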

Third and Fourth Models

The final question asked in the report investigates which states have suffered the most deaths due to opioid overdoses over the past two decades. To investigate this, two panel models are examined: the first is a fixed effects model and the second is a random effects model. The fixed effects model is given in equation (3).

logtcmcaremcaidmladj_it = α_0i + α_1×logod_it + α_2×gdpadj_it + α_3×prctinsured_it + α_4×unemprate_it (3)

where i denotes the state id, i = 1, …, 51, and t denotes the year, t = 2000, …, 2020

  • The dependent variable is logtcmcaremcaidmladj, the logarithm of tcmcaremcaidmladj, which represents the adjusted Medicare/Medicaid costs in millions.
  • The explanatory variables are described in Table 1.
  • α_0i is the unobserved state-level effect, which is fixed over time,
  • α_1 is the coefficient of logod, the logarithm of overdose deaths,
  • α_2 is the coefficient of the gdpadj variable,
  • α_3 is the coefficient of the prctinsured variable,
  • α_4 is the coefficient of the unemprate variable.

The random effects model (4) has the same form as (3); however, this time the intercept term α_0i is treated as a random variable rather than as a fixed parameter representing each state's individual effect.

logtcmcaremcaidmladj_it = α_0i + α_1×logod_it + α_2×gdpadj_it + α_3×prctinsured_it + α_4×unemprate_it (4)
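
To illustrate what the fixed effects specification does with the state-level term, the sketch below applies the within transformation (demeaning by group) to a toy panel with a single regressor. It is a simplified stand-in written in Python, not the STATA estimation used in the report. The data and the function name `within_slope` are made up for the example.

```python
from collections import defaultdict


def within_slope(panel):
    """Fixed-effects (within) slope for one regressor.

    `panel` is a list of (group, x, y) tuples. Each x and y is demeaned by its
    group mean, which removes any group-specific intercept, and the slope is
    then the OLS slope on the demeaned data.
    """
    sums = defaultdict(lambda: [0.0, 0.0, 0])  # group -> [sum_x, sum_y, count]
    for group, x, y in panel:
        entry = sums[group]
        entry[0] += x
        entry[1] += y
        entry[2] += 1

    num = 0.0
    den = 0.0
    for group, x, y in panel:
        sum_x, sum_y, count = sums[group]
        x_dev = x - sum_x / count
        y_dev = y - sum_y / count
        num += x_dev * y_dev
        den += x_dev * x_dev
    return num / den


# Toy panel: y = group intercept + 0.5 * x, with intercepts 10 (A) and 2 (B).
toy = [("A", 1.0, 10.5), ("A", 2.0, 11.0), ("A", 3.0, 11.5),
       ("B", 1.0, 2.5), ("B", 2.0, 3.0), ("B", 4.0, 4.0)]
slope = within_slope(toy)
print(slope)
```

Because the two groups have very different intercepts, demeaning within each group is what allows the common slope of 0.5 to be recovered.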

Estimation Results

The estimation results for equation (1) are shown in Table 3 below. The coefficient of pchsgrad is not significant at the 5% level (p = 0.063), and neither are the year dummies for 2001 to 2003 and 2014 to 2020; the dummy for 2019 is omitted. All other variables in the regression are significant at the 5% level, including phase2, whose coefficient is negative (-0.55, p = 0.018). The variables with the largest positive and significant effects on overdose deaths are the unemployment rate (unemprate) and the percentage of employment in manufacturing (pcempmanu). If the unemployment rate increases by 1 point, the logarithm of overdose deaths increases by 0.131. More importantly, if pcempmanu increases by one unit, the dependent variable increases by 23.89, which is a very large effect; note that pcempmanu is recorded as a fraction with a sample mean of about 0.043 (Table 2), so a one-unit change lies far outside the observed range. The variables with significantly negative effects are the interaction term (phase2pcempmanu), with a coefficient of -12.26, the percentage of people with health insurance (prctinsured), with -0.036, and the labor force participation rate (lfpr), with -0.14. Additionally, since Prob > F = 0.000, the model as a whole is statistically significant. The explanatory variables explain about 66% of the variation in the dependent variable, the logarithm of overdose deaths.
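
Because the dependent variable is in logs while the regressors are in levels, each coefficient can also be read as an approximate percentage change in overdose deaths. The sketch below converts two of the Table 3 coefficients into exact percentage changes. The helper name is hypothetical, and the 0.01 step for pcempmanu is an assumption chosen to represent a one-percentage-point change in a variable stored as a fraction.

```python
import math


def pct_change_log_level(coef, delta_x=1.0):
    """Exact % change in y implied by a log-level coefficient for a change delta_x in x."""
    return 100.0 * (math.exp(coef * delta_x) - 1.0)


# One-point rise in the unemployment rate (coefficient .1310926).
unemp_effect = pct_change_log_level(0.1310926)
# 0.01 rise in the manufacturing employment share (coefficient 23.88793).
manu_effect = pct_change_log_level(23.88793, delta_x=0.01)
print(round(unemp_effect, 1), round(manu_effect, 1))
```

This gives roughly a 14% increase in overdose deaths for a one-point rise in unemployment, and roughly a 27% increase for a one-percentage-point rise in the manufacturing employment share.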

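Table 3

Estimation Results for the First Model

Number of obs = 1,071
F(28, 1042) = 56.71
Prob > F = 0.0000
R-squared = 0.6628
Root MSE = .76804

| logod | Coef. | Robust Std. Err. | t | P>\|t\| | [95% Conf. | Interval] |
|---|---|---|---|---|---|---|
| phase2 | -.5517621 | .2336824 | -2.36 | 0.018 | -1.010304 | -.0932203 |
| pcempmanu | 23.88793 | 1.878098 | 12.72 | 0.000 | 20.20264 | 27.57321 |
| phase2pcempmanu | -12.26063 | 2.659244 | -4.61 | 0.000 | -17.47872 | -7.042548 |
| medhhincadj | .000044 | 3.87e-06 | 11.37 | 0.000 | .0000364 | .0000516 |
| gdpadj | 1.40e-06 | 9.96e-08 | 14.00 | 0.000 | 1.20e-06 | 1.59e-06 |
| prctinsured | -.0357383 | .0092943 | -3.85 | 0.000 | -.053976 | -.0175005 |
| pchsgrad | .0212274 | .0113967 | 1.86 | 0.063 | -.0011356 | .0435984 |
| unemprate | .1310926 | .0234552 | 5.59 | 0.000 | .0850679 | .1771174 |
| lfpr | -.1407951 | .0094262 | -14.94 | 0.000 | -.1592916 | -.1222986 |
| d1 | .2208556 | .1594017 | 1.39 | 0.166 | -.0919294 | .5336405 |
| d2 | .2766283 | .1583993 | 1.75 | 0.081 | -.0341898 | .5874461 |
| d3 | .3121025 | .1672956 | 1.87 | 0.062 | -.0161722 | .6403771 |
| d4 | .3914203 | .1644104 | 2.38 | 0.017 | .0688069 | .7140333 |
| d5 | .5011539 | .1634934 | 3.07 | 0.002 | .1803401 | .8219677 |
| d6 | .7034302 | .1598889 | 4.40 | 0.000 | .3896893 | 1.017171 |
| d7 | .7156327 | .1618174 | 4.42 | 0.000 | .3981076 | 1.033158 |
| d8 | .8164113 | .1591719 | 5.13 | 0.000 | .5040774 | 1.128745 |
| d9 | -.6420131 | .2041274 | -3.15 | 0.002 | -1.042561 | -.2414662 |
| d10 | -.6604882 | .1994405 | -3.31 | 0.001 | -1.051839 | -.2691373 |
| d11 | -.5837644 | .1906185 | -3.06 | 0.002 | -.9578043 | -.2097246 |
| d12 | -.5328342 | .1823659 | -2.92 | 0.004 | -.8906806 | -.1749879 |
| d13 | -.5066247 | .181447 | -2.79 | 0.005 | -.8626678 | -.1505816 |
| d14 | -.2113538 | .1694441 | -1.25 | 0.213 | -.5438444 | .1211368 |
| d15 | -.082926 | .1664983 | -0.50 | 0.619 | -.4096205 | .2437685 |
| d16 | .1136385 | .1737925 | 0.65 | 0.513 | -.2273846 | .4546616 |
| d17 | .1904761 | .1772868 | 1.07 | 0.283 | -.1574037 | .5383558 |
| d18 | .105022 | .1760213 | 0.60 | 0.551 | -.2403747 | .4504186 |
| d19 | (omitted) | | | | | |
| d20 | -.1612533 | .1839959 | -0.88 | 0.381 | -.5222983 | .1997915 |
| _cons | 11.45015 | .9314502 | 12.29 | 0.000 | 9.622415 | 13.27788 |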

The regression results for equation (2) are given in Table 4. In this model, the following variables are not statistically significant at the 5% level: gdpadj, prctinsured, pchsgrad, and the time dummies for 2007-2013. The variables with a positive and significant effect at the 5% level are phase2, pcempmanu, adjusted median household income (medhhincadj), the unemployment rate (unemprate), and the total prescriptions in the state (totprescripml). The dummies for 2014 to 2020 are also statistically significant, with positive coefficients. On the other hand, the variables with a significant negative effect are the interaction term (phase2pcempmanu) and the labor force participation rate (lfpr). Finally, Prob > F is equal to 0.000, so the model is globally significant at the 5% significance level, and the explanatory variables explain 68.7% of the variation in the dependent variable.

Table 4

Estimation Results for the Second Model

Number of obs = 765
F(23, 741) = 60.42
Prob > F = 0.0000
R-squared = 0.6874
Root MSE = .7058

| logod | Coef. | Robust Std. Err. | t | P>\|t\| | [95% Conf. | Interval] |
|---|---|---|---|---|---|---|
| phase2 | .6487628 | .2722743 | 2.38 | 0.017 | .1142419 | 1.183284 |
| pcempmanu | 14.64522 | 2.009178 | 7.29 | 0.000 | 10.70086 | 18.58958 |
| phase2pcempmanu | -7.922746 | 3.770275 | -2.10 | 0.036 | -15.32444 | -.5210528 |
| medhhincadj | .0000422 | 4.52e-06 | 9.33 | 0.000 | .0000333 | .0000511 |
| gdpadj | 5.83e-08 | 1.38e-07 | 0.42 | 0.674 | -2.13e-07 | 3.30e-07 |
| prctinsured | .0124905 | .0102062 | 1.22 | 0.221 | -.007546 | .0325271 |
| pchsgrad | .0074972 | .0139509 | 0.54 | 0.591 | -.0198907 | .0348851 |
| unemprate | .1067042 | .0266062 | 4.01 | 0.000 | .0544716 | .1589368 |
| lfpr | -.1117171 | .0118902 | -9.40 | 0.000 | -.1350596 | -.0883746 |
| totprescripml | .183295 | .0140375 | 13.06 | 0.000 | .155737 | .210853 |
| d7 | -.0599654 | .1370085 | -0.44 | 0.662 | -.3289364 | .2090056 |
| d8 | .0104087 | .1295042 | 0.08 | 0.936 | -.2438301 | .2646474 |
| d9 | (omitted) | | | | | |
| d10 | .0038718 | .1290327 | 0.03 | 0.976 | -.2494414 | .2571851 |
| d11 | .067174 | .1348288 | 0.50 | 0.618 | -.197518 | .3318659 |
| d12 | .1067875 | .1407195 | 0.76 | 0.448 | -.1694689 | .3830439 |
| d13 | .141715 | .148507 | 0.95 | 0.340 | -.1498295 | .4332595 |
| d14 | .3397481 | .1577609 | 2.15 | 0.032 | .0300366 | .6494597 |
| d15 | .4411479 | .1750195 | 2.52 | 0.012 | .0975548 | .784741 |
| d16 | .6411327 | .1879076 | 3.41 | 0.001 | .2722381 | 1.010027 |
| d17 | .8124088 | .1973202 | 4.12 | 0.000 | .4250356 | 1.199782 |
| d18 | .8432053 | .2068232 | 4.08 | 0.000 | .437176 | 1.249235 |
| d19 | .8231075 | .2096602 | 3.93 | 0.000 | .4115088 | 1.234706 |
| d20 | .7635703 | .1759632 | 4.34 | 0.000 | .4181246 | 1.109016 |
| _cons | 6.195021 | 1.214295 | 5.10 | 0.000 | 3.811153 | 8.578889 |

For the last question, two different types of panel data regression were conducted. Table 5 shows the estimation results for the fixed effects model, while Table 6 shows the estimation results for the random effects model.

The fixed effects model is preferred when we want to analyze only the impact of variables that vary over time. In our fixed effects model, the individual effects are correlated with the regressors: corr(u_i, Xb) = 0.7136. Since Prob > F = 0.000 is smaller than 0.05, we can conclude that the model is globally significant at the 5% significance level. According to the p-values of the t-tests, all the variables in this model are significant at the 5% level, and all of them have a positive effect on the dependent variable. Thus, the logarithm of the Medicare/Medicaid cost increases with overdose deaths, adjusted GDP, the percentage of insured people, and the unemployment rate.

Table 5

Estimation Results of The Fixed Effect Model

Number of obs = 1,071
Number of groups = 51
Obs per group: min = 21, avg = 21.0, max = 21
F(4, 50) = 158.98
Prob > F = 0.0000

(Std. Err. adjusted for 51 clusters in stateid)

| logtcmcare~j | Coef. | Robust Std. Err. | t | P>\|t\| | [95% Conf. | Interval] |
|---|---|---|---|---|---|---|
| logod | .2346927 | .0146422 | 16.03 | 0.000 | .2052829 | .2641025 |
| gdpadj | 5.90e-07 | 2.05e-07 | 2.88 | 0.006 | 1.79e-07 | 1.00e-06 |
| prctinsured | .0261885 | .0045677 | 5.73 | 0.000 | .017014 | .035363 |
| unemprate | .0296832 | .0034667 | 8.56 | 0.000 | .0227202 | .0366462 |
| _cons | 5.356807 | .3997706 | 13.40 | 0.000 | 4.553844 | 6.15977 |
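
Since both the dependent variable and logod are in logarithms, the logod coefficient in the fixed effects output can be read as an elasticity. As a worked illustration (the 10% change is an arbitrary choice for the example, not a figure from the report), a 10% increase in overdose deaths corresponds to the following proportional change in adjusted Medicare/Medicaid costs, holding the other regressors fixed:

```latex
\[
\frac{\text{cost}_{\text{new}}}{\text{cost}_{\text{old}}} - 1
  = 1.10^{\,0.2347} - 1
  = e^{0.2347 \times \ln 1.10} - 1
  \approx e^{0.0224} - 1
  \approx 0.0226
\]
```

That is, a 10% rise in overdose deaths is associated with roughly a 2.3% rise in adjusted Medicare/Medicaid costs.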

In the random effects model, we assume that the individual effects are not correlated with the regressors, so corr(u_i, X) = 0 by assumption. This assumption allows time-invariant variables to serve as explanatory variables. All the variables are statistically significant, which means they have a significant influence on the dependent variable, the Medicare/Medicaid cost. All variables have a positive effect on the dependent variable, with overdose deaths having the largest effect.

Table 6

Estimation Results of The Random Effect Model

Number of obs = 1,071
Number of groups = 51
Obs per group: min = 21, avg = 21.0, max = 21
F(4, 50) = 158.98
Prob > F = 0.0000

(Std. Err. adjusted for 51 clusters in stateid)


Simulation

The first simulation addresses the following question: over the last two decades, has the growth rate of overdose deaths differed between manufacturing and non-manufacturing states, and if so, by how much? To answer it, I grouped states as “Manufacturing States” if their percentage of manufacturing in GDP is higher than the overall average, and as “Non-Manufacturing States” if it is lower than the overall mean. I then took the average values for each year. Table 7 and Figure 1 display the results of the simulation.

Table 7

Growth Rate of Overdose Deaths: A Comparison of Manufacturing and Other States

| Year | Manufacturing States | Non-Manufacturing States | Difference |
|---|---|---|---|
| 2000 | 0 | 0 | 0 |
| 2001 | 0.129032258 | 0.128228067 | 0.00080419 |
| 2002 | 0.253196527 | 0.253317122 | -0.00012059 |
| 2003 | 0.087080657 | 0.08899574 | -0.00191508 |
| 2004 | 0.061681665 | 0.059142492 | 0.00253917 |
| 2005 | 0.085309953 | 0.085826081 | -0.00051613 |
| 2006 | 0.1761137 | 0.174249823 | 0.00186388 |
| 2007 | 0.054008607 | 0.05262856 | 0.00138005 |
| 2008 | 0.056838462 | 0.056999619 | -0.00016116 |
| 2009 | 0.043393716 | 0.042223024 | 0.00117069 |
| 2010 | 0.031976459 | 0.033933518 | -0.00195706 |
| 2011 | 0.080743275 | 0.081618984 | -0.00087571 |
| 2012 | 0.017039708 | 0.017714968 | -0.00067526 |
| 2013 | 0.08109043 | 0.081600278 | -0.00050985 |
| 2014 | 0.143537034 | 0.140279675 | 0.00325736 |
| 2015 | 0.155702445 | 0.156182824 | -0.00048038 |
| 2016 | 0.27701619 | 0.27727148 | -0.00025529 |
| 2017 | 0.126875044 | 0.125805374 | 0.00106967 |
| 2018 | -0.016634071 | -0.016045614 | -0.00058846 |
| 2019 | 0.065245285 | 0.065164473 | 8.0812E-05 |
| 2020 | 0.376548291 | 0.375621891 | 0.0009264 |

Figure 1

Growth Rate of Overdose Deaths

As Table 7 shows, the differences between the two types of states are very small: the largest absolute difference is about 0.0033, in 2014.
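
The grouping and growth-rate steps described for this simulation can be sketched as follows. This is an illustrative Python version, not the original STATA code, and it makes several assumptions: each state is classified by its own average pcgdpmanu across years, the yearly group value is the mean of oddeaths, and the first year's growth is set to 0, as in the 2000 row of Table 7. The function name and the toy data are hypothetical.

```python
from collections import defaultdict


def group_growth_rates(rows):
    """Yearly growth of mean overdose deaths by manufacturing group.

    `rows` is a list of dicts with keys "state", "year", "oddeaths", "pcgdpmanu".
    Returns {group_name: {year: growth_rate}}.
    """
    # Overall mean manufacturing share, used as the classification threshold.
    overall = sum(r["pcgdpmanu"] for r in rows) / len(rows)

    # Assumption: classify each state by its own average share across years.
    shares = defaultdict(list)
    for r in rows:
        shares[r["state"]].append(r["pcgdpmanu"])
    group_of = {
        state: "manufacturing" if sum(v) / len(v) > overall else "non-manufacturing"
        for state, v in shares.items()
    }

    # Mean deaths per group and year.
    deaths = defaultdict(list)
    for r in rows:
        deaths[(group_of[r["state"]], r["year"])].append(r["oddeaths"])
    avg = {key: sum(v) / len(v) for key, v in deaths.items()}

    # Year-over-year growth; the first available year is set to 0.
    growth = defaultdict(dict)
    for (group, year), value in sorted(avg.items()):
        prev = avg.get((group, year - 1))
        growth[group][year] = 0.0 if prev is None else value / prev - 1.0
    return dict(growth)


toy = [
    {"state": "X", "year": 2000, "oddeaths": 100, "pcgdpmanu": 0.20},
    {"state": "X", "year": 2001, "oddeaths": 110, "pcgdpmanu": 0.20},
    {"state": "Y", "year": 2000, "oddeaths": 200, "pcgdpmanu": 0.05},
    {"state": "Y", "year": 2001, "oddeaths": 250, "pcgdpmanu": 0.05},
]
rates = group_growth_rates(toy)
print(rates)
```

In the toy data, state X falls in the manufacturing group with a growth rate of about 0.10 in 2001, and state Y falls in the other group with a growth rate of about 0.25. The difference column of Table 7 corresponds to subtracting the second group's rate from the first.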
