Category: Public Safety
Who Gets Stopped By San Diego Police?
2 Answers
mac0428 answered 3 weeks ago

We start with two datasets: SD Police Vehicle Stops, covering 2014 to early 2017, which records each police stop and its cause, and SD Police Beat and Service Area.
The vehicle stops in San Diego can be analyzed from three perspectives.

  1. Driver Age vs. Stop Cause

age_stop_cause

The heatmap shows the number of stops by driver age for the 8 most common stop causes. The two leading reasons for a stop by San Diego PD are:
a. Moving Violation
b. Equipment Violation
We also have the stop counts by age and by sex.
Drivers aged 20 to 29 are the most likely to be pulled over for moving violations, and the counts fall as age increases. This may also be one reason insurance companies charge higher premiums for this age range. Male drivers are pulled over more often than female drivers.
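The table behind a heatmap like this one can be built with a cross-tabulation. Below is a minimal sketch, not the code from the linked notebook. The column names `subject_age` and `stop_cause`, the age bins, and the toy rows are all assumptions for illustration.

```python
import pandas as pd

def age_by_cause(stops, top_n=8):
    """Count stops per (age group, stop cause) for the top_n most common causes."""
    df = stops.copy()
    # Ages may contain non-numeric entries; coerce those to NaN so they drop out.
    df["subject_age"] = pd.to_numeric(df["subject_age"], errors="coerce")
    top_causes = df["stop_cause"].value_counts().nlargest(top_n).index
    df = df[df["stop_cause"].isin(top_causes)]
    bins = [10, 20, 30, 40, 50, 60, 70, 120]
    labels = ["10-19", "20-29", "30-39", "40-49", "50-59", "60-69", "70+"]
    # right=False makes each bin include its lower edge, e.g. [20, 30).
    age_group = pd.cut(df["subject_age"], bins=bins, labels=labels, right=False)
    return pd.crosstab(age_group.rename("age_group"), df["stop_cause"])

# Toy rows for illustration only; these are NOT taken from the real dataset.
toy_stops = pd.DataFrame({
    "subject_age": ["22", "27", "35", "41", "19", "63", "28", "N/A"],
    "stop_cause": [
        "Moving Violation", "Moving Violation", "Equipment Violation",
        "Moving Violation", "Equipment Violation", "Moving Violation",
        "Equipment Violation", "Moving Violation",
    ],
})
counts = age_by_cause(toy_stops)
print(counts)
```

The resulting table can be passed to a plotting call such as `seaborn.heatmap(counts, annot=True, fmt="d")` to draw the heatmap.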

  2. Race vs. Stop Cause

race_stop_cause

The heatmap shows the number of stops by driver race for the 8 most common stop causes. We also look at the stop counts by race and by sex. Three racial groups account for most of the stops for moving and equipment violations:
a. WHITE
b. HISPANIC
c. BLACK
Based on the data, a significant number of White drivers are pulled over for moving violations, while Hispanic drivers tend to be pulled over for equipment violations.

  3. Race vs. PD Service Area (for the major stop causes)

The stop data in SD Police Vehicle Stops from 2014 to early 2017 is recorded by PD service area code rather than PD beat number. We can still analyze the trend across PD service areas.

service_area_race_moving
This heatmap shows the number of stops for moving violations by driver race in the 8 service areas with the most such stops.
White drivers are the group most often pulled over for moving violations in 7 of these service areas. The exception is service area 710, TIJUANA RIVER VALLEY, which is known for its large Hispanic population, so Hispanic drivers are the largest group in that column.
service_area_race_equipment
Another heatmap shows the number of stops for equipment violations by driver race in the 8 service areas with the most such stops. It suggests that Hispanic, White and Black drivers are the groups most often pulled over for equipment violations. In certain service areas, more Hispanic or Black drivers than White drivers are pulled over.
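Both service-area heatmaps follow the same recipe: filter to one stop cause, keep the busiest service areas, and count by race. Here is a rough sketch of that recipe. The column names `service_area`, `subject_race` and `stop_cause`, the race labels, and the toy rows are assumptions, not values confirmed from the dataset.

```python
import pandas as pd

def area_by_race(stops, cause, top_n=8, normalize=False):
    """Race x service-area table for one stop cause, limited to the top_n areas."""
    one_cause = stops[stops["stop_cause"] == cause]
    top_areas = one_cause["service_area"].value_counts().nlargest(top_n).index
    one_cause = one_cause[one_cause["service_area"].isin(top_areas)]
    # normalize="columns" turns counts into each race's share within an area.
    return pd.crosstab(
        one_cause["subject_race"],
        one_cause["service_area"],
        normalize="columns" if normalize else False,
    )

# Toy rows for illustration only; these are NOT taken from the real dataset.
toy_stops = pd.DataFrame({
    "service_area": ["710", "710", "710", "120", "120", "120", "520"],
    "subject_race": ["HISPANIC", "HISPANIC", "WHITE", "WHITE", "WHITE", "BLACK", "WHITE"],
    "stop_cause": ["Moving Violation"] * 6 + ["Equipment Violation"],
})
moving = area_by_race(toy_stops, "Moving Violation")
moving_share = area_by_race(toy_stops, "Moving Violation", normalize=True)
print(moving)
print(moving_share)
```

The column shares make it easier to compare areas with very different stop totals, although they still say nothing about the population living in each area.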
The detailed Python code, including further investigations and choropleth maps, is in SD_Police_Vehicle_Stop_Investigation

Attachments
eric Staff replied 3 weeks ago

Mac, excellent answer, thanks. I edited the image links because they were not displaying, at least for me. The original had the link to the GitHub page for the image, which I replaced with the download link for the image.

mac0428 replied 3 weeks ago

Thanks, Eric, for fixing the links.

I was having trouble linking the figures.
BTW, would you mind replacing the 3rd plot with the following one:
https://github.com/MacYeh/Help-San-Diego-Community-/blob/master/police_pull_over/figure/service_area_race_moving.png

Thanks for the modification again,
Mac

eric Staff replied 3 weeks ago

For race, the values should be normalized by the population of the race, to get a per-capita rate. The absolute numbers will always be dominated by Whites and Hispanics, because those are the largest racial groups. Unfortunately, doing that is a bit of a challenge. Not too hard for the county, but populations for Police Beats would require a geographic merge between the beats shape file and census data.

I’ve updated my Police Beats dataset ( https://data.sandiegodata.org/dataset/sandiego-gov-police_regions) to include a new file, beat_demographics, which has the population, by race, for each of the beats.

eric Staff replied 3 weeks ago

Actually, I replaced all of the image URLs with the images in the attachments, since that's probably more correct.

mac0428 replied 3 weeks ago

Sounds good. A per-capita rate makes more sense than the absolute numbers. I will revisit the data and redo the analysis with it.

mac0428 answered 2 weeks ago

For the race analysis, I added three heatmaps that show the per-capita rate of moving and equipment violations in different beat areas. The census data comes from the Police Beats dataset, which includes the population of the major racial groups [White, Black, Asian, Hispanic, Hawaiian, American Indian] in each beat area. The data cleaning is a little complicated, but it can be done with Pandas and a dictionary mapping.
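As an illustration of the dictionary-mapping step, here is a minimal sketch that aligns the race labels in the stop data with the group names used in the census file. The mapping itself, the column name `subject_race`, and the helper `harmonize_race` are hypothetical. The actual labels in both files need to be checked before using anything like this.

```python
import pandas as pd

# Hypothetical mapping from stop-data labels to census group names.
# Verify both sides against the real files before relying on it.
RACE_TO_CENSUS = {
    "WHITE": "White",
    "BLACK": "Black",
    "HISPANIC": "Hispanic",
    "ASIAN": "Asian",
    "HAWAIIAN": "Hawaiian",
    "AMERICAN INDIAN": "American Indian",
}

def harmonize_race(stops, column="subject_race"):
    """Add a race_group column; labels missing from the mapping become NaN."""
    out = stops.copy()
    cleaned = out[column].astype(str).str.strip().str.upper()
    out["race_group"] = cleaned.map(RACE_TO_CENSUS)
    return out

# Toy rows for illustration only.
toy = pd.DataFrame({"subject_race": [" white", "Hispanic ", "OTHER"]})
harmonized = harmonize_race(toy)
print(harmonized)
```

Rows left as NaN have no matching census group, so it is worth counting them before they are dropped from a per-capita calculation.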
The total population of each racial group, summed over the beats:
American Indian: 3467
Asian: 231342
Black: 88761
Hawaiian: 5368
Hispanic: 461066
White: 603401
Race Stop Cause per-capita
We plot the per-capita rate for each racial group and stop cause, using each group's total population in San Diego. The two groups with the highest per-capita rates for moving and equipment violations are American Indian and Black. The American Indian per-capita rate is even larger than 0.5 in the heatmap. However, the stop data does not indicate whether the same driver appears in more than one stop. This has to be considered before interpreting the per-capita figures.
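The arithmetic behind the per-capita rate is a division of stop counts by population, aligned on the racial group. Below is a small sketch that uses the population totals listed above. The stop counts are placeholders chosen only to show the calculation, not results from the data, and the helper `per_capita` is hypothetical.

```python
import pandas as pd

# Population totals as listed above.
population = pd.Series({
    "American Indian": 3467,
    "Asian": 231342,
    "Black": 88761,
    "Hawaiian": 5368,
    "Hispanic": 461066,
    "White": 603401,
})

def per_capita(stop_counts, population):
    """Stops divided by population; groups without a stop count become NaN."""
    return (stop_counts / population).rename("stops_per_resident")

# Placeholder counts, NOT real results: they only illustrate the arithmetic.
placeholder_counts = pd.Series({"American Indian": 350, "White": 6000})
rates = per_capita(placeholder_counts, population)
print(rates)
```

Because the numerator counts stops rather than distinct drivers, the same number of stops produces a much larger rate for a small group than for a large one.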
Race per-capita and Beat Region on Moving Violation
We compute the per-capita rate for each racial group in the 8 beat areas with the most moving-violation stops. The rate uses each group's population within the beat area. In certain beat areas, such as 120 (mainly the Mission Bay/La Jolla region), Black drivers have the highest rate. In others, such as 240 (mainly the Mira Mesa region), American Indian drivers have the highest rate. The heatmap also shows that the per-capita rate is high for all racial groups in Beat 520 (the Downtown region). The American Indian per-capita rate is even larger than 1 in some regions. However, the stop data does not reveal whether the same driver appears in more than one stop. This has to be considered before interpreting the per-capita figures.
Race per-capita and Beat Region on Equipment Violation
We compute the per-capita rate for each racial group in the 8 beat areas with the most equipment-violation stops. The rate uses each group's population within the beat area. In certain beat areas, such as 240 and 930 (mainly the Mira Mesa and Del Mar/Carmel Valley regions), American Indian drivers have the highest rate, which is even larger than 1. Note, however, that the stop data does not reveal whether the same driver appears in more than one stop. This has to be considered before interpreting the per-capita figures. Beat area 930 (Del Mar and Carmel Valley regions) also shows higher per-capita rates for some of the groups.
Attention: the per-capita rate is high for certain beat areas and racial groups. Keep in mind that the stop data does not identify whether the same person was stopped more than once.
Per-capita rates should therefore be interpreted with care, especially for groups with a small population, such as American Indian.
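One way to keep this caveat visible in the tables is to flag cells where the population is small. The sketch below is an assumption-laden illustration: the column names `area`, `race_group`, `stops` and `population`, the helper `area_rates`, the threshold of 100 residents, and the toy rows are all hypothetical.

```python
import pandas as pd

def area_rates(stop_counts, area_pop, min_population=100):
    """Join stop counts to population per (area, race_group) and flag small groups."""
    merged = stop_counts.merge(area_pop, on=["area", "race_group"], how="left")
    # Treat a zero population as missing so the division cannot produce infinity.
    pop = merged["population"].where(merged["population"] > 0)
    merged["rate"] = merged["stops"] / pop
    merged["small_population"] = pop.fillna(0) < min_population
    return merged

# Toy rows for illustration only; these are NOT taken from the real datasets.
toy_counts = pd.DataFrame({
    "area": [240, 240],
    "race_group": ["American Indian", "White"],
    "stops": [30, 500],
})
toy_pop = pd.DataFrame({
    "area": [240, 240],
    "race_group": ["American Indian", "White"],
    "population": [20, 10000],
})
result = area_rates(toy_counts, toy_pop)
print(result)
```

In the toy rows, 30 stops against a population of 20 give a rate of 1.5, which is only possible if some stops involve repeat drivers or drivers counted outside that population. Flagged cells could be masked or annotated in the heatmap rather than compared directly with the other cells.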
The detailed Python code, including further investigations and choropleth maps, is in
SD_Police_Vehicle_Stop_Investigation