A Story of New Yorkers’ Mental Health

One subject I am very interested in learning more about is mental health. For this project, I want to look into mental health data in NYC and see whether I can find any significant patterns among people who have been diagnosed with depression and/or reported having it. What percentage of the population actually has mental health problems? What does the demographic picture look like within this subgroup? Are there any factors that may lead to depression? Smoking, drinking, exercise, and eating habits will be examined as well.

Experts have already done many projects on this topic, so my findings will not necessarily add to ongoing research by psychologists and mental health experts. However, within the scope of this class, it might be helpful for my classmates to see what might be contributing to depression, along with general descriptive statistics on the topic for the place where they live (NYC). Beyond that, I hope whoever comes across my project on Tableau Public gets some valuable information about mental health.

The data I used is the NYC Community Health Survey 2017. It is available on nyc.gov and can easily be found by googling “chs 2017.” PDFs of the codebook and questionnaire are on the same site. The data consist of responses to health and biographical questions from randomly selected people. It is also worth noting that the survey depends entirely on households with landline phones, which I suspect are a rarity these days. The data file is in SAS format, which I had never seen before. After some research, I was able to open the file in R using the package “haven,” and then used “rio” to export it as a csv file. I loaded the file into Tableau and realized that I could not do much with the data without some manipulation in R. I went back to R and first created a subset containing the variables I wanted to use: currdepress, phq8score, smoker, heavydrink17, age18_64, sex, sexialid17, education, emp3, imputed_pov200, newrace. After that, I filtered out all the NAs and any responses other than yes or no. The resulting subset contains 6642 respondents.
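The subsetting step can be sketched roughly as follows. This is a standard-library Python illustration, not the R code used for the project. The response codes (1 = yes, 2 = no), the choice of which variables count as yes/no questions, and the helper names `load_rows`, `subset_rows`, and `make_row` are all assumptions made for the example; the real codes are in the codebook.

```python
import csv

# Variables kept in the subset, as listed in the text.
KEEP = ["currdepress", "phq8score", "smoker", "heavydrink17", "age18_64",
        "sex", "sexialid17", "education", "emp3", "imputed_pov200", "newrace"]

# ASSUMPTION: yes/no questions are coded "1" = yes and "2" = no.
YES_NO_CODES = {"1", "2"}
# ASSUMPTION: hypothetical choice of which variables are yes/no questions.
YES_NO_VARS = ["currdepress", "heavydrink17"]


def load_rows(path):
    """Read a csv export (one row per respondent) into a list of dicts."""
    with open(path, newline="") as handle:
        return list(csv.DictReader(handle))


def subset_rows(rows):
    """Keep only the selected columns; drop rows with missing values or
    with a yes/no answer outside the assumed codes."""
    kept = []
    for row in rows:
        small = {name: (row.get(name) or "").strip() for name in KEEP}
        if any(value in ("", "NA") for value in small.values()):
            continue  # missing value somewhere in the row
        if any(small[name] not in YES_NO_CODES for name in YES_NO_VARS):
            continue  # e.g. a "don't know" or "refused" code
        kept.append(small)
    return kept


def make_row(**overrides):
    """Build a hypothetical complete row (every variable coded "1")."""
    row = {name: "1" for name in KEEP}
    row.update(overrides)
    return row


demo = [
    make_row(),                  # complete "yes" answer: kept
    make_row(currdepress="2"),   # complete "no" answer: kept
    make_row(currdepress="NA"),  # missing value: dropped
    make_row(heavydrink17="7"),  # assumed non-yes/no code: dropped
]
print(len(subset_rows(demo)))  # 2
```

With a real export, `subset_rows(load_rows(path))` would be the equivalent call, where `path` points at the csv file written from R.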

Then I duplicated these variables as factors, since the responses to each type of question in the health survey are coded as 1s, 2s, 3s, and so on. These new variables are the dimensions I used in Tableau.

I also duplicated the original variables as dummies using the “as.numeric” function. These dummies were used for calculations and served as measures in Tableau. All of these conversions were done using the package “tidyverse.” Finally, I exported the subset to a new csv file and worked in Tableau from there.
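The idea of keeping two copies of each variable, a labeled one for dimensions and a numeric one for measures, can be illustrated with a small Python sketch. It is not the tidyverse code used in the project. The label mapping and the helper names `add_dimension_and_measure` and `share_yes` are hypothetical, and the actual labels should be taken from the codebook.

```python
# ASSUMPTION: hypothetical label mapping for one variable.
CURRDEPRESS_LABELS = {1: "Yes", 2: "No"}


def add_dimension_and_measure(row, name, labels):
    """Return a copy of the row with a labeled copy (dimension) and a
    numeric copy (measure) of the variable `name`."""
    code = int(float(row[name]))  # accepts "1" as well as "1.0"
    out = dict(row)
    out[f"{name}_label"] = labels[code]  # plays the role of the factor
    out[f"{name}_num"] = code            # plays the role of as.numeric
    return out


def share_yes(rows, name, yes_code=1):
    """Fraction of rows whose numeric copy equals the assumed 'yes' code."""
    codes = [row[f"{name}_num"] for row in rows]
    return sum(1 for code in codes if code == yes_code) / len(codes)


raw = [{"currdepress": "1"}, {"currdepress": "2"},
       {"currdepress": "2"}, {"currdepress": "2.0"}]
prepared = [add_dimension_and_measure(r, "currdepress", CURRDEPRESS_LABELS)
            for r in raw]
print(prepared[0]["currdepress_label"])    # Yes
print(share_yes(prepared, "currdepress"))  # 0.25
```

The labeled copy is what a chart would group by, while the numeric copy is what a calculation such as a share or an average would use.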

Moving on to the visualizations, the first tab is a dashboard giving an overall picture of depression in the dataset. The two measures of mental well-being I selected are the eight-item Patient Health Questionnaire (PHQ-8) score and the response to the question “Have you been feeling depressed in the last 2 weeks?” The treemap shows that more than half of the respondents scored 0 on the PHQ-8 scale, while 358 scored 10 or higher, the range counted as depression. The color scale shifts from dark blue to dark red as the score increases. The waffle chart shows that 9% of the respondents reported feeling depressed in the last two weeks. Responses were put into 100 squares, and 9 of the squares represent the answer “yes” to having depression in the last 2 weeks. But who are these respondents? The next tab attempts to answer that.
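The arithmetic behind the waffle chart can be checked with a short Python sketch. The count of 615 “yes” responses is not stated on this tab; it is an assumption taken from the drinking counts reported later (562 + 53), and `waffle_squares` is a hypothetical helper name.

```python
def waffle_squares(yes_count, total, squares=100):
    """Return the share of 'yes' answers and how many squares to fill."""
    share = yes_count / total
    return share, round(share * squares)


TOTAL = 6642        # size of the subset
DEPRESSED = 615     # ASSUMPTION: 562 + 53 from the drinking counts
PHQ8_10_PLUS = 358  # respondents scoring 10 or higher on the PHQ-8

share, filled = waffle_squares(DEPRESSED, TOTAL)
print(f"{share:.1%} -> {filled} of 100 squares")  # 9.3% -> 9 of 100 squares
print(f"{PHQ8_10_PLUS / TOTAL:.1%}")              # 5.4%
```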

Looking at the demographic picture, Hispanic respondents make up the largest group of those who feel depressed, with 236 compared to 162 for white respondents, the second-largest group. Among the various sexual identities, it comes as no surprise that the majority of respondents are straight. However, the rate of depression is 12.6% among those who do not identify as straight. Across age groups, the ranking by count is older adults, then adults, then young adults. However, when each count is compared to the total size of its age group, the percentages come out approximately the same, at around 10%.

The next tab shows descriptive statistics on income and education among those who feel depressed. On the left are two bars comparing depression in poor and not-poor households. Poor households earn below 200% of the federal poverty line, and not-poor households earn 200% of the federal poverty line or more. The height of each bar represents the total count of people who feel depressed, and fewer people in the not-poor category have depression than in the poor category. Among poor households with mental health issues, about 50% are Hispanic, followed by Black at 23.59%, white at 15.28%, Asian at 8.04%, and other races at 3.49%. Things look very different among not-poor households: almost half of the group is white (43.39%), while Black and Hispanic respondents, in second and third place, have approximately the same share. On the right, the treemap describes depression across employment status and education level. The higher the depression count, the darker the color. To the surprise of many people who viewed this visual, employed college graduates have the most cases of depression, at 15.93% of all cases, followed by employed individuals who attended some college and high school graduates who are not in the labor force. The dark blue block for employed college graduates is perhaps the most noticeable feature, and most viewers commented on it.

Does people’s behavior affect whether or not they feel depressed? The last tab attempts to answer that question. Respondents’ smoking habits are shown in the treemap. Among those who have depression, 332, which is more than half, have never smoked. This seems logical, since I assume non-smokers outnumber smokers. Subtracting 49 from that number gives 283, the number of depressed respondents who have quit smoking or currently smoke. In percentage terms, 46% of depressed respondents have had or currently have exposure to tobacco use. Of these, 182 are current smokers and 101 are former smokers. The chart below the treemap shows depression among heavy drinkers and non-heavy drinkers. A person is considered a heavy drinker if that person is a man who has more than 2 drinks per day or a woman who has more than 1 drink per day. 562 out of 6232 non-heavy drinkers have depression, while 53 out of 410 heavy drinkers do. The disparity looks massive when comparing the sizes of the two bars, but as a percentage of the respondents in each category the difference is much smaller, although the depression rate among heavy drinkers is higher (12.9% vs. 9%). Overall, these numbers do not seem to show a clear relationship between drinking habits and depression.
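The percentages in this tab follow directly from the counts quoted above, as the short Python sketch below shows. It only reproduces the arithmetic and does not test whether the difference between the two drinking groups is statistically meaningful. The helper name `rate` is an assumption made for the example.

```python
def rate(cases, total):
    """Share of a group that reports depression."""
    return cases / total


# Drinking: counts quoted in the text.
heavy = rate(53, 410)
non_heavy = rate(562, 6232)
gap_points = (heavy - non_heavy) * 100  # difference in percentage points

# Smoking status among respondents with depression, as quoted in the text.
smoking = {"never": 332, "current": 182, "former": 101}
depressed_total = sum(smoking.values())
ever_smoked = smoking["current"] + smoking["former"]

print(f"heavy drinkers:     {heavy:.1%}")        # 12.9%
print(f"non-heavy drinkers: {non_heavy:.1%}")    # 9.0%
print(f"gap: {gap_points:.1f} points")           # 3.9 points
print(f"ever smoked: {ever_smoked} of {depressed_total} "
      f"({ever_smoked / depressed_total:.0%})")  # 283 of 615 (46%)
```

Comparing rates rather than raw counts is what makes the two drinking groups comparable, since one group is roughly fifteen times larger than the other.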