Changes in Inequality in Houston, TX

Victoria M. Vazquez, Ph.D.

Introduction

I recently read an article in Fortune Magazine about income inequality, Gilded Age 2.0: U.S. Income Inequality Increases to pre-Great Depression Levels$^{1}$. The article contrasts two groups of people: the top 1% of the population, which holds 40% of the wealth in the country, and the bottom 90% of the population, whose share of the country's wealth has been shrinking. It was just a briefing in the February 13, 2019 issue of the magazine, but it isn't the first time I have read about the ever-growing chasm between the wealthiest 1% of people in the U.S. and everyone else. The Occupy Wall Street movement started in 2011, spurring numerous similar activities throughout the country in protest of the "richest 1% of people that are writing the rules of an unfair global economy that is foreclosing on our future."$^{2}$ And in the 2016 presidential election, the candidates highlighted the issue of growing income inequality and the need for solutions.$^{3}$

Methods

All of the attention paid to inequality made me want to investigate the issue. As I am not an economist, I decided to explore some simple metrics related to wealth. This study is a small look at the issue of inequality, but at least it is a start. I chose to examine the share of people living below the poverty line and the share receiving an income of $200,000 or more annually.

Inequality is a complex, multifaceted social issue, so I focused my initial exploration on the city of Houston, Texas (Fig. 1). Houston is the fourth-largest city in the United States, with an estimated 2,267,336 people in 2017.$^{4}$ The city is culturally diverse, with 29.2% of the population born outside of the United States, compared to the national average of 13.4%.$^{5}$

Figure 1: Map of Houston, TX, study site.

In [1]:
#Import libraries necessary for a suite of data analyses.
import pandas as pd
from pandas import ExcelWriter
from pandas import ExcelFile
import matplotlib.pyplot as plt
from matplotlib.ticker import FuncFormatter
from matplotlib import rcParams
%matplotlib inline
import seaborn as sns
#Note: Matplotlib 3.6 and later name this style 'seaborn-v0_8-whitegrid'.
plt.style.use('seaborn-whitegrid')
import numpy as np
import scipy as sp
from scipy import stats
from scipy.stats import ttest_ind
import os
#Import only the datetime class; a separate 'import datetime' would be shadowed by it.
from datetime import datetime

I obtained income and education data from the 2013-2017 American Community Survey 5-Year Estimates table$^{6}$ (Table 1). This survey is conducted every year by the U.S. Census Bureau, and the data are used to determine allocation of federal and state funding, among other things. Annual sample sizes ranged from 127,757 to 150,406, with response rates of 88.4% or higher (Table 2).

In [2]:
#Load data from local computer.
inequality = pd.read_csv('/home/vanellope/Documents/Thinkful/Inequality Project/data/inequality_data.csv')
display(inequality)
   year  per_poverty  pov_error  per_200k  200k_error  pov_lessHS  pov_lessHS_error  pov_HS  pov_HS_error  pov_assoc  ...  med_earn_less_HS  earn_less_HS_error  med_earn_HS  earn_HS_error  med_earn_assoc  earn_assoc_error  med_earn_college  earn_college_error  med_earn_grad  earn_grad_error
0  2010         21.0        0.4       4.8         0.2        29.1               0.7    18.1           0.5       11.6  ...             18192                 252        24000            421           31274               316             50835                 566          66852             1372
1  2011         21.5        0.4       5.3         0.2        29.8               0.8    18.5           0.5       12.4  ...             18490                 311        24800            462           31650               336             52060                 480          70423             1570
2  2012         22.2        0.4       5.5         0.2        30.9               0.9    19.3           0.7       13.7  ...             18300                 356        20052            399           31120               336             52803                 671          71865             1368
3  2013         22.9        0.3       5.8         0.2        31.4               0.7    20.7           0.7       14.6  ...             18407                 345        24452            447           31053               328             52945                 678          72290             1408
4  2014         22.9        0.5       6.2         0.2        31.4               0.7    21.1           0.8       14.6  ...             18901                 299        24417            426           31271               367             53627                 834          74144             1502
5  2015         22.5        0.3       6.5         0.2        30.4               0.8    21.1           0.6       14.3  ...             19181                 299        24080            465           31224               297             54469                 902          74340             2148
6  2016         21.9        0.4       6.8         0.2        29.6               0.8    20.9           0.7       14.1  ...             19725                 300        24269            401           31186               297             55089                 908          75485             1234
7  2017         21.2        0.4       7.4         0.2        28.8               0.8    20.3           0.6       13.6  ...             20631                 342        25172            324           31907               298             57030                 887          76971             2026

8 rows × 23 columns

Table 1: Poverty, income, and education data obtained from the American Community Survey.

In [3]:
#Table describing survey data.
data_info = pd.DataFrame({'Year': ['2010', '2011', '2012', '2013*', '2014', '2015', '2016', '2017'], 
                   'Sample Size': [127757, 130965, 150406, 140249, 146897, 146469, 141647, 136667], 
                   'Response Rate (%)': [95.8, 96.6, 97.0, 88.4, 95.1, 94.3, 93.4, 91.5]})
#Format sample sizes with thousands separators for display.
data_info['Sample Size'] = data_info['Sample Size'].map('{:,}'.format)
display(data_info)
   Year   Sample Size  Response Rate (%)
0  2010       127,757               95.8
1  2011       130,965               96.6
2  2012       150,406               97.0
3  2013*      140,249               88.4
4  2014       146,897               95.1
5  2015       146,469               94.3
6  2016       141,647               93.4
7  2017       136,667               91.5

Table 2: Sample size and response rate for the 2013-2017 American Community Survey.

Data cleaning and analysis were conducted using Python 3.7.3$^{7}$ in Jupyter Notebook 5.7.8.$^{8}$ Pandas 0.24.2 was used for dataframe manipulation.$^{9}$ Plots were created using Matplotlib.$^{10}$

*Note: As a result of the 2013 government shutdown, the ACS did not have a second mailing, a telephone followup, nor a person followup operation for the 2013 October panel. Only respondents from the first mailing (Internet in the United States, paper questionnaire in Puerto Rico) contribute to the overall response for this panel. This caused a drop in the number of Final Interviews (housing units) for the 2013 sample year.

How many people live in poverty?

The percent of people in Houston living in poverty remained fairly constant at about 21-23%, with a margin of error of 0.3-0.5 percentage points. The percent of people earning $200,000 or more per year rose modestly from 4.8% to 7.4%, with a margin of error of 0.2 percentage points (Fig. 2).
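
One way to judge whether the 2010 and 2017 values really differ is to compare the change with the margins of error. The sketch below is not part of the original analysis. It assumes that the error columns in Table 1 are ACS margins of error at the 90% confidence level (so standard error = margin of error / 1.645) and that the two years can be treated as independent estimates, which may not hold if the values come from overlapping multi-year periods. The function name `change_z_score` is hypothetical.

```python
import math

# Assumption: the *_error columns are margins of error at the 90% level,
# so the standard error is MOE / 1.645.
Z_90 = 1.645

def change_z_score(est_start, moe_start, est_end, moe_end):
    """Return the z-score for the difference between two independent estimates."""
    se_start = moe_start / Z_90
    se_end = moe_end / Z_90
    return (est_end - est_start) / math.sqrt(se_start ** 2 + se_end ** 2)

# 2010 and 2017 values copied from Table 1.
z_poverty = change_z_score(21.0, 0.4, 21.2, 0.4)
z_200k = change_z_score(4.8, 0.2, 7.4, 0.2)

for label, z in [("In poverty", z_poverty), ("Income of $200k or more", z_200k)]:
    verdict = "significant" if abs(z) > Z_90 else "not significant"
    print(f"{label}: z = {z:.2f} ({verdict} at the 90% level)")
```

Under these assumptions, the change in the poverty rate falls within sampling error, while the increase in the share earning $200,000 or more does not.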

In [10]:
#Plot percent of people in poverty and percent of people with income greater than $200k per year.
plt.figure(figsize=(10,10))
plt.errorbar(inequality['year'], inequality['per_poverty'], yerr = inequality['pov_error'], label = 'In Poverty',
             fmt='o', color='c', ecolor='slategrey', elinewidth=3, capsize=0)
plt.errorbar(inequality['year'], inequality['per_200k'], yerr = inequality['200k_error'], 
             label = 'Income Greater than $200k', fmt='o', color='b', ecolor='slategrey', elinewidth=3, capsize=0)
plt.ylim(0,25)
plt.title("Fig. 2 Changes in Poverty and Wealth in Houston, Texas", fontsize = 18)
plt.xlabel("Year", fontsize = 16)
plt.ylabel("Percent of People", fontsize = 16)
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)

What is the relationship between education and income?

As expected, the more education a person has received, the greater their median income. Median incomes of those with college degrees and with graduate or professional degrees rose steadily between 2010 and 2017 (from \$50,835 to \$57,030 and from \$66,852 to \$76,971, respectively). Median incomes in the other three educational attainment categories changed much less in dollar terms, each rising by less than \$2,500 over the same period (Fig. 3).
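
The size of these changes can be tabulated directly from the 2010 and 2017 rows of Table 1. This is a small sketch rather than part of the original analysis; the column names `change_usd` and `change_pct` are illustrative.

```python
import pandas as pd

# Median earnings (US$) for 2010 and 2017, copied from Table 1.
earnings = pd.DataFrame(
    {
        "2010": [18192, 24000, 31274, 50835, 66852],
        "2017": [20631, 25172, 31907, 57030, 76971],
    },
    index=[
        "Less than High School",
        "High School",
        "Some College or Associate Degree",
        "College Degree",
        "Graduate Degree",
    ],
)

# Absolute change in dollars and relative change in percent.
earnings["change_usd"] = earnings["2017"] - earnings["2010"]
earnings["change_pct"] = (100 * earnings["change_usd"] / earnings["2010"]).round(1)
print(earnings)
```

Looking at both columns matters here: the less-than-high-school group gains only \$2,439, but from a low base that is a 13.4% increase, comparable to the 12.2% increase for college graduates.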

In [11]:
#Plot median annual income at different educational attainment levels.
plt.figure(figsize=(10,10))
plt.errorbar(inequality['year'], inequality['med_earn_less_HS'], yerr = inequality['earn_less_HS_error'], 
             label = 'Less than High School', fmt='o', color='b',
             ecolor='slategrey', elinewidth=3, capsize=0)
plt.errorbar(inequality['year'], inequality['med_earn_HS'], yerr = inequality['earn_HS_error'], 
             label = 'High School', fmt='o', color='g',
             ecolor='slategrey', elinewidth=3, capsize=0)
plt.errorbar(inequality['year'], inequality['med_earn_assoc'], yerr = inequality['earn_assoc_error'], 
             label = 'Some College or Associate Degree', fmt='o', color='c',
             ecolor='slategrey', elinewidth=3, capsize=0)
plt.errorbar(inequality['year'], inequality['med_earn_college'], yerr = inequality['earn_college_error'], 
             label = 'College Degree', fmt='o', color='m',
             ecolor='slategrey', elinewidth=3, capsize=0)
plt.errorbar(inequality['year'], inequality['med_earn_grad'], yerr = inequality['earn_grad_error'], 
             label = 'Graduate Degree', fmt='o', color='y',
             ecolor='slategrey', elinewidth=3, capsize=0)
plt.ylim(0,80000)
plt.title("Fig. 3 Educational Attainment and Income in Houston, Texas", fontsize = 18)
plt.xlabel("Year", fontsize = 16)
plt.ylabel("Annual Median Income (US$)", fontsize = 16)
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=0.)

What is the relationship between education and poverty?

If we focus on the lower end of the income range, those in poverty, we see a similarly expected outcome. People with less than a high school diploma had the highest poverty rate (about 30%), and the proportion of people in poverty decreased as educational attainment increased (Fig. 4).

In [12]:
#Plot percent of people in poverty at different educational attainment levels.
plt.figure(figsize=(10,10))
plt.errorbar(inequality['year'], inequality['pov_lessHS'], yerr = inequality['pov_lessHS_error'], 
             label = 'Less Than High School', fmt='o', color='b',
             ecolor='slategrey', elinewidth=3, capsize=0)
plt.errorbar(inequality['year'], inequality['pov_HS'], yerr = inequality['pov_HS_error'], 
             label = 'High School Degree', fmt='o', color='g',
             ecolor='slategrey', elinewidth=3, capsize=0)
plt.errorbar(inequality['year'], inequality['pov_assoc'], yerr = inequality['pov_assoc_error'], 
             label = 'Some College or Associate Degree', fmt='o', color='c',
             ecolor='slategrey', elinewidth=3, capsize=0)
plt.errorbar(inequality['year'], inequality['pov_college'], yerr = inequality['pov_college_error'], 
             label = 'College Degree', fmt='o', color='m',
             ecolor='slategrey', elinewidth=3, capsize=0)
plt.ylim(0,35)
plt.title("Fig. 4 Poverty and Educational Attainment in Houston, Texas", fontsize = 18)
plt.xlabel("Year", fontsize = 16)
plt.ylabel("Percent of People", fontsize = 16)
plt.legend(bbox_to_anchor=(1.05, 1), loc=2, borderaxespad=1.0)

What is the data telling us?

The poverty rate changed little between 2010 and 2017 (21-23%), while the share of people with an annual income of $200,000 or more rose modestly (from 4.8% to 7.4%).

People with more education receive greater income.

People with more education are less likely to live in poverty.

Houston: A Hub of the Oil and Gas Industry

Houston is dominated by the oil and gas industry: seven of the 11 wealthiest people in the city in 2018 made their fortunes in this industry, which concentrates great wealth in a few people. Because each of the 11 wealthiest people in Houston in 2018 had a net worth above \$2 billion (Table 3), an income category of \$200,000+ misses the extreme nature of the income gap. Income measured as payment from an employer is also too restrictive a measure of wealth, since it excludes wealth from investment accounts and real estate, among other sources.

In [9]:
#Table of the eleven wealthiest people in Houston in 2018.
wealthiest = pd.DataFrame({
    'Name': ['Richard Kinder', 'Dannine Avara', 'Scott Duncan', 'Milane Frantz', 'Randa Williams',
             'Tillman Fertitta', 'Dan Friedkin', 'Jeffery Hildebrand', 'Robert McNair',
             'John Arnold', 'Leslie Alexander'],
    'Source of Wealth': ['Pipelines', 'Pipelines', 'Pipelines', 'Pipelines', 'Pipelines',
                         'Entertainment', 'Automobiles', 'Energy', 'Energy',
                         'Investment Banking', 'Investment Banking'],
    'Net Worth': ['$6.6B', '$6.2B', '$6.2B', '$6.2B', '$6.2B', '$4.5B', '$4B', '$4B',
                  '$3.8B', '$3.3B', '$2.1B']})
display(wealthiest)
    Name                Source of Wealth    Net Worth
0   Richard Kinder      Pipelines           $6.6B
1   Dannine Avara       Pipelines           $6.2B
2   Scott Duncan        Pipelines           $6.2B
3   Milane Frantz       Pipelines           $6.2B
4   Randa Williams      Pipelines           $6.2B
5   Tillman Fertitta    Entertainment       $4.5B
6   Dan Friedkin        Automobiles         $4B
7   Jeffery Hildebrand  Energy              $4B
8   Robert McNair       Energy              $3.8B
9   John Arnold         Investment Banking  $3.3B
10  Leslie Alexander    Investment Banking  $2.1B

Table 3: Eleven wealthiest people in Houston, TX during 2018.$^{11}$

Other Factors Involved

I performed a brief analysis of income inequality in Houston, TX, which showed that education is associated with annual income. A more thorough assessment would include other factors potentially related to income, such as race, gender identity, quality of and opportunities in K-12 education, and annual income of previous generations in the family.

Including data from the 1980s up to 2010 would strengthen the study. The economic assessment that triggered the Fortune Magazine article described the 1980s as the beginning of the most recent widening of the income divide.$^{12}$

It would also be interesting to include data from cities with many technology companies, since many computer programming jobs offer high compensation without requiring advanced formal education.

Access to the raw data would allow me to look at the distribution of income within the population and perform statistical tests, such as bootstrapping, to determine if the medians are significantly different between groups in a given year and between years.
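
As a rough illustration of what such a test could look like, the sketch below computes a percentile bootstrap confidence interval for the difference in medians between two groups. The incomes are synthetic values drawn from log-normal distributions, not ACS microdata, and the names `group_a`, `group_b`, and `bootstrap_median_diff` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(seed=0)

# Hypothetical individual incomes for two groups; NOT real ACS microdata.
group_a = rng.lognormal(mean=10.1, sigma=0.6, size=500)
group_b = rng.lognormal(mean=10.9, sigma=0.6, size=500)

def bootstrap_median_diff(a, b, n_boot=5000, ci=95, rng=rng):
    """Percentile bootstrap confidence interval for median(b) - median(a)."""
    diffs = np.empty(n_boot)
    for i in range(n_boot):
        # Resample each group with replacement, keeping its original size.
        resample_a = rng.choice(a, size=a.size, replace=True)
        resample_b = rng.choice(b, size=b.size, replace=True)
        diffs[i] = np.median(resample_b) - np.median(resample_a)
    tail = (100 - ci) / 2
    low, high = np.percentile(diffs, [tail, 100 - tail])
    return low, high

low, high = bootstrap_median_diff(group_a, group_b)
observed = np.median(group_b) - np.median(group_a)
print(f"Observed difference in medians: {observed:,.0f}")
print(f"95% bootstrap CI: ({low:,.0f}, {high:,.0f})")
print("Medians differ" if low > 0 or high < 0 else "No clear difference")
```

If the interval excludes zero, the medians of the two groups can be considered different at roughly the chosen confidence level. The same function could be applied to two years of data for a single group.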

References

  1. Kelleher, Kevin. 2019. Gilded Age 2.0: U.S. Income Inequality Hasn't Been This Bad Since Just Before the Great Depression. http://fortune.com/2019/02/13/us-income-inequality-bad-great-depression/ Accessed June 1, 2019.
  2. Occupy Wall Street: We Are the 99 Percent. http://occupywallst.org/ Accessed June 12, 2019.
  3. Lauter, David. 2015. Income inequality emerges as key issue in 2016 presidential election. https://www.latimes.com/nation/la-na-campaign-income-20150205-story.html Accessed June 10, 2019.
  4. City of Houston Planning and Development Department. 2019. City of Houston Demographics. https://www.houstontx.gov/planning/Demographics/Infographics/2019/Infographic_demographics_Jan2019.pdf Accessed June 11, 2019.
  5. U.S. Census Bureau. QuickFacts: Houston city, Texas; United States. https://www.census.gov/quickfacts/fact/table/houstoncitytexas,US/PST045218 Accessed June 5, 2019.
  6. U.S. Census Bureau. 2018. American Community Survey. https://www.census.gov/programs-surveys/acs Accessed June 10, 2019.
  7. Python Software Foundation. 2019. https://docs.python.org/3/ Accessed June 10, 2019.
  8. Jupyter Team. 2015. https://jupyter-notebook.readthedocs.io/en/stable/ Accessed June 10, 2019.
  9. Pandas: Powerful Python Data Analysis Toolkit. 2019. https://pandas.pydata.org/pandas-docs/stable/index.html Accessed June 14, 2019.
  10. Hunter, J.D. 2007. Matplotlib: A 2D Graphics Environment, Computing in Science & Engineering, vol. 9, no. 3, pp. 90-95.
  11. Helman, Christopher. 2018. The Richest People in Texas, 2018. https://www.forbes.com/sites/christopherhelman/2018/11/13/forbes-400-the-richest-people-in-texas-2018/#78b5eb794847 Accessed June 13, 2019.
  12. Zucman, Gabriel. 2019. Global Wealth Inequality. NBER Working Paper No. w25462. Available at SSRN: https://ssrn.com/abstract=3319688