Analyzing California Graduate Students

Hundreds of thousands of students graduate from high schools in California every year. The California Department of Education (CDE) oversees one of the largest student populations in the country. You can find more information about the CDE.

The CDE has a Data & Statistics section on its website, where a variety of datasets covering the early 1990s to the present are shared with educators, researchers and the public.

This project uses the dataset for the 2014 academic year. The following questions will be answered:

  • What are the maximum, minimum and total numbers of graduates in the 2014 academic year?
  • What are the mean and median numbers of graduates?
  • How many male and female graduates are there? What is the difference between the two?
  • What is the number of graduates by ethnicity?
  • What is the relationship between the number of graduates and the number of UC graduates?

The initial dataset can be downloaded as a .txt file here.

Information about the columns of the graduates file


This 14-digit code is the official, unique identification of a school within California. The first two digits identify the county, the next five digits identify the school district, and the last seven digits identify the school.
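The 2 + 5 + 7 digit layout described above can be sketched in a few lines of Python. The function name `split_cds_code` and the example code are made up for illustration. The code is kept as a string on purpose, since reading it as an integer would drop any leading zero in the county part.

```python
def split_cds_code(cds_code: str) -> dict:
    """Split a 14-digit school code into county, district and school parts."""
    code = str(cds_code).strip()
    if len(code) != 14 or not code.isdigit():
        raise ValueError(f"expected a 14-digit code, got {cds_code!r}")
    return {
        "county": code[:2],     # first two digits
        "district": code[2:7],  # next five digits
        "school": code[7:],     # last seven digits
    }


# Made-up code, used purely for illustration.
parts = split_cds_code("01234567890123")
print(parts)
```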


This is a coded field for race/ethnic designation. The ethnic designations are coded as follows:

  • Code 0 = Not reported
  • Code 1 = American Indian or Alaska Native, Not Hispanic
  • Code 2 = Asian, Not Hispanic
  • Code 3 = Pacific Islander, Not Hispanic
  • Code 4 = Filipino, Not Hispanic
  • Code 5 = Hispanic or Latino
  • Code 6 = African American, not Hispanic (formerly known as Black, not Hispanic)
  • Code 7 = White, not Hispanic
  • Code 9 = Two or More Races, Not Hispanic (See Glossary for complete definitions of the ethnic groups.)
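Since the ethnicity field is stored as a number, a small lookup table makes results easier to read. The sketch below copies the labels from the list above. The names `ETHNIC_LABELS` and `ethnic_label` are hypothetical helpers, not part of the CDE file.

```python
# Labels copied from the code list above; there is no code 8 in the list.
ETHNIC_LABELS = {
    0: "Not reported",
    1: "American Indian or Alaska Native, Not Hispanic",
    2: "Asian, Not Hispanic",
    3: "Pacific Islander, Not Hispanic",
    4: "Filipino, Not Hispanic",
    5: "Hispanic or Latino",
    6: "African American, not Hispanic",
    7: "White, not Hispanic",
    9: "Two or More Races, Not Hispanic",
}


def ethnic_label(code) -> str:
    """Return the readable label for a code, or a marker for unlisted codes."""
    return ETHNIC_LABELS.get(int(code), f"Unknown code {code}")


print(ethnic_label(5))
```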

This field is a coded field identifying gender. The gender is coded as follows:

  • M = Male
  • F = Female

Number of twelfth-grade graduates. This data includes summer graduates and does not include students with high school equivalencies (i.e., General Educational Development (GED) test or California High School Proficiency Examination (CHSPE)).


Number of twelfth-grade graduates who also completed all courses required for entry into the University of California (UC) and/or California State University (CSU) with a grade “C” or better. This data includes summer graduates and does not include students with high school equivalencies (i.e., GED or CHSPE).


Year of data.

Exploratory Data Analysis

This is how the data looks once it is loaded.

The dataset has 21,013 rows. It needs no cleaning: there are no stray spaces or letters to remove and no missing values.
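As a rough idea of the loading step, here is a minimal sketch using only the Python standard library. It assumes the .txt file is tab-delimited with a header row, and the column names (`CDS_CODE`, `ETHNIC`, `GENDER`, `GRADS`, `UC_GRADS`, `YEAR`) are assumptions as well. The sample rows are made up so that the example runs without the real file, and the notebook itself may load the data differently.

```python
import csv
import io

# Made-up sample rows; the delimiter and column names are assumptions.
SAMPLE = (
    "CDS_CODE\tETHNIC\tGENDER\tGRADS\tUC_GRADS\tYEAR\n"
    "01234567890123\t5\tF\t40\t18\t2014\n"
    "01234567890123\t5\tM\t35\t12\t2014\n"
    "01234567890123\t7\tF\t22\t15\t2014\n"
)


def load_rows(handle):
    """Read tab-delimited rows and convert the numeric columns to int."""
    rows = []
    for record in csv.DictReader(handle, delimiter="\t"):
        for column in ("ETHNIC", "GRADS", "UC_GRADS"):
            record[column] = int(record[column])
        rows.append(record)  # CDS_CODE stays a string to keep leading zeros
    return rows


rows = load_rows(io.StringIO(SAMPLE))
print(len(rows), "rows loaded")
```

With the real file, `io.StringIO(SAMPLE)` would be replaced by an opened file handle.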

  • The largest number of graduates in a school in California is 508. The smallest number is 1.
  • The total number of graduates in California schools is 426,950. Of these, 185,179 completed the courses required for entry into UC and/or CSU.
  • That is about 43% of all graduates.
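The figures above come down to a handful of aggregate functions and one division. The sketch below shows them on a made-up list of graduate counts, then reproduces the share from the two totals quoted in the text. The helper names `summarize` and `share_percent` are hypothetical.

```python
from statistics import mean, median


def summarize(grads):
    """Return max, min, total, mean and median of a list of counts."""
    return {
        "max": max(grads),
        "min": min(grads),
        "total": sum(grads),
        "mean": mean(grads),
        "median": median(grads),
    }


def share_percent(part, whole):
    """Return part as a percentage of whole."""
    return 100 * part / whole


# Made-up counts, not taken from the dataset.
sample = [1, 12, 40, 75, 508]
print(summarize(sample))

# Totals quoted in the text: UC-eligible graduates out of all graduates.
print(round(share_percent(185_179, 426_950), 1))
```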

The gender breakdown is interesting. There were 210,946 male graduates versus 216,003 female graduates. That means there were 5,057 more female graduates.
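A per-gender total like this is a group-and-sum over the gender field. The sketch below uses made-up `(gender, graduates)` pairs rather than the real rows, and `grads_by_gender` is a hypothetical helper name.

```python
from collections import Counter


def grads_by_gender(rows):
    """Sum graduate counts per gender code from (gender, grads) pairs."""
    totals = Counter()
    for gender, grads in rows:
        totals[gender] += grads
    return totals


# Made-up pairs, not taken from the dataset.
rows = [("M", 30), ("F", 42), ("M", 18), ("F", 25)]
totals = grads_by_gender(rows)
gap = totals["F"] - totals["M"]  # positive when there are more female graduates
print(dict(totals), gap)
```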


Ethnicities have been coded with integers.

Hispanic or Latino graduates form the largest group, with over 200,000 graduates. White graduates are the second largest group, followed by Asian graduates.
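A ranking like this can be produced by summing graduates per ethnicity code and sorting the totals in descending order. The sketch below works on made-up `(code, graduates)` pairs, and `rank_by_ethnicity` is a hypothetical helper name.

```python
from collections import Counter


def rank_by_ethnicity(rows):
    """Sum graduates per ethnicity code and return totals, largest first."""
    totals = Counter()
    for code, grads in rows:
        totals[code] += grads
    return totals.most_common()


# Made-up pairs, not taken from the dataset.
rows = [(5, 50), (7, 30), (2, 20), (5, 25), (7, 10)]
ranking = rank_by_ethnicity(rows)
for code, total in ranking:
    print(code, total)
```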

The number of graduates varies widely. Most values fall between 0 and 150 for graduates and between 0 and 75 for UC graduates. Some schools had zero UC graduates.

Source code and an nbviewer version of the notebook are here.
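One simple way to put a number on the relation between graduates and UC graduates is a Pearson correlation, together with a count of rows that have no UC graduates. This is only a sketch on made-up values. The notebook may describe the relation differently, for example with a plot, and the helper names are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def count_zero(values):
    """Count how many entries are exactly zero."""
    return sum(1 for value in values if value == 0)


# Made-up values, not taken from the dataset.
grads = [10, 20, 30, 40]
uc_grads = [0, 8, 12, 20]
r = pearson(grads, uc_grads)
print(round(r, 3), count_zero(uc_grads))
```

A value of `r` close to 1 would mean that rows with more graduates also tend to have more UC graduates.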
