
Structure of student email addresses


 

One of the more involved discussions we tend to have with schools is around the structure and generation of student account names. It’s an increasingly important topic as more and more educational systems are linked. In working with the Manaiakalani cluster schools we arrived at a standard "Hapara code" now used by Teacher Dashboard. This document highlights the design criteria and the resulting code structure.

We are looking to open source our implementation of Hapara codes to allow other schools to easily generate safe and robust student email addresses outside of Teacher Dashboard.

 
What do we want from a student email address?
 
A student email address must:

  1. Protect the student’s identity, i.e. not disclose or allow disclosure of the full student name or other identifying details (like student ID). For this reason we cannot use the student’s first and last name (e.g. james.neutron@myschool) or the school/national ID (e.g. 38292747@myschool), or direct derivations thereof, so that an attacker with full knowledge of a single student’s details cannot use them to “reverse engineer” the personal details of another student.
  2. Be personalized: fully “synthetic” codes (e.g. S828273@myschool) are impersonal and impede collaboration.
  3. Be respectful: the ID should not arbitrarily cut the student’s name. In many cultures names have actual meaning, and mechanical abbreviations can have unintended consequences and be perceived as culturally insensitive. To avoid mechanical abbreviations, a separate PREFERRED name field needs to be maintained and used to store a shorter (or simply preferred) alternative.
  4. Be automatically generated from student info held by the SIMS/SMS: it’s impractical to ask students to pick a unique “handle” to personalize their account (e.g. seargent_pepper@myschool).
  5. Be unique over the tenure and scope of the account (tenure may be 1 to 14 years; scope may be just one school, a school district, or nation-wide; see point 9 below). This means that the generated ID has to include a synthetic element derived from a unique stable identifier: some aspect of the student’s identity that is unlikely to change over their tenure (like a national student ID or school student ID number). For this reason, using academic years or class names to distinguish students isn’t a good idea (e.g. jamesY11R15@myschool). Student IDs cannot be used directly, owing to points 1 and 2.
  6. Be as short as possible: students will increasingly be expected to use this login, potentially multiple times a day. For this reason, the “synthetic” component of the ID must be kept as short as possible.
  7. Be memorable and easy to communicate to the student. It should therefore avoid special characters (_+.={}() etc.) and, in the synthetic component, avoid hard-to-read character combinations (ill1, 0O, etc.).
  8. Be appropriate: the ID should not spell out anything questionable. This is generally not an issue with names, but can become a problem when names are combined with synthetic codes (e.g. jamesISSY@myschool).
  9. OPTIONALLY: be portable from school to school, to allow students to retain their identity while switching domains. If the scope of the ID must span schools, the synthetic component of the ID must be derived from a SIMS/SMS data element.
 
Hapara Student ID Codes
 

The Hapara codes are designed to be personalized yet protect the student’s identity, be short yet highly unique, and reduce the likelihood of inappropriate text being generated:

anne8f4ht@

anne8f4ht (name prefix: "anne")

The name prefix consists of the first name, or the preferred name if provided, to ensure the account is personal to each student. Preferred names can be used to shorten compound first names or where the legal name isn’t commonly used.

anne8f4ht (synthetic code: "8f4ht")

The synthetic code is drawn from a subset of digits and lowercase letters. It excludes vowels and starts with a digit to reduce the likelihood of inappropriate text being generated, and it excludes characters that are easily misread or confused. The algorithm factors in the student’s last name and uses non-reversible mathematical constructs to make it more difficult to reverse-engineer national or school ID numbers from the email address.
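As a rough illustration only (this is not the Teacher Dashboard implementation), the structure described above could be sketched in Python as follows. The character sets, the use of HMAC-SHA256 as the one-way step, the `secret` parameter and the function names are all assumptions made for the example; the post does not specify any of them.

```python
import hashlib
import hmac

# Assumed character sets; the post does not list the real ones.
DIGITS = "23456789"               # 0 and 1 dropped as easily confused with O and l
LETTERS = "bcdfghjkmnpqrstvwxz"   # no vowels, no "l", and "y" treated as a vowel
ALPHABET = DIGITS + LETTERS


def _normalize(text):
    """Lowercase and keep only letters and digits."""
    return "".join(ch for ch in str(text).lower() if ch.isalnum())


def synthetic_code(last_name, stable_id, length=5, secret=b"change-me"):
    """Derive a short code from last name + stable ID via a keyed one-way hash."""
    if length < 1:
        raise ValueError("length must be at least 1")
    message = f"{_normalize(last_name)}|{_normalize(stable_id)}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    number = int.from_bytes(digest, "big")
    # The first character is always a digit.
    number, index = divmod(number, len(DIGITS))
    chars = [DIGITS[index]]
    # Remaining characters come from the full restricted alphabet.
    for _ in range(length - 1):
        number, index = divmod(number, len(ALPHABET))
        chars.append(ALPHABET[index])
    return "".join(chars)


def student_account_name(first_name, last_name, stable_id,
                         preferred_name="", length=5, secret=b"change-me"):
    """Name prefix (preferred name if given, else first name) + synthetic code."""
    prefix = _normalize(preferred_name) or _normalize(first_name)
    return prefix + synthetic_code(last_name, stable_id, length, secret)


# Made-up student details, for illustration only.
print(student_account_name("Anne", "Example", "123456"))
```

In this sketch the same inputs always give the same code, so an ID stays stable as long as the last name and the stable identifier do. For IDs that must be portable between schools, every school would have to use the same secret.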

 
The length of the synthetic code can be adjusted to reduce the risk of two students receiving the same ID:
 
Synthetic code length   Collision risk
4 characters            Possible on matching first/preferred name, last name, and last 4-5 digits of the school/national ID number
5 characters            Possible on matching first/preferred name, last name, and last 5-6 digits of the school/national ID number
6 characters            Possible on matching first/preferred name, last name, and last 6-7 digits of the school/national ID number
  (Note: for school/national IDs not structured as sequential numbers, the most significant digits apply.)
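Whatever code length is chosen, a batch of generated IDs can be checked for collisions before any accounts are created. The sketch below is a minimal example of such a check; the function name `find_collisions` is hypothetical, the sample IDs are made up, and comparing IDs case-insensitively is an assumption.

```python
from collections import Counter


def find_collisions(account_names):
    """Return the account names that occur more than once, ignoring letter case."""
    counts = Counter(name.lower() for name in account_names)
    return sorted(name for name, count in counts.items() if count > 1)


# Made-up IDs, for illustration only.
batch = ["anne8f4ht", "james3k7pq", "Anne8f4ht"]
print(find_collisions(batch))
```

If the list comes back non-empty, increasing the synthetic code length is the adjustment the table above describes.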
 
What data is needed to generate Hapara codes
 

Data needs to be complete, accurate, consistent and stable. Our algorithm requires the following data fields:

  • Legal last name
  • Legal first name
  • Preferred name
  • Stable ID: national student identifier, or school ID

Note that other name-related fields (prefix, suffix, middle initial) are not needed. These may still be needed to associate a full name with the account ID (if desired), but are not used in generating a student ID.

Completeness of student data is usually mandated by regulatory requirements. Records missing the above data elements (with the exception of preferred name, which is only used when populated) cannot be used to generate a Hapara code.
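A completeness check along these lines could be run before generating any codes. This is a minimal sketch: the dictionary keys (`last_name`, `first_name`, `stable_id`, `preferred_name`) and the function name are assumptions for the example, not field names from any particular SIMS/SMS export.

```python
# Assumed field names; preferred_name is optional and therefore not listed.
REQUIRED_FIELDS = ("last_name", "first_name", "stable_id")


def missing_fields(record):
    """Return the required fields that are absent or blank in a student record."""
    missing = []
    for field in REQUIRED_FIELDS:
        value = record.get(field)
        if value is None or not str(value).strip():
            missing.append(field)
    return missing


# Made-up record, for illustration only.
record = {"last_name": "Example", "first_name": "Anne", "stable_id": ""}
print(missing_fields(record))
```

A record with a non-empty result would be set aside rather than given a code.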

Accuracy is often a reflection of how the data is sourced and verified, and of the administrative maturity of the school. Best practice, and increasingly a legal requirement, is to populate name fields only from specific forms of legal documents. If students (and parents) do not actually use the legal name, the preferred name field should be used. Experience suggests that students and parents are not a very reliable source of accurate legal name data :)

Consistency in name data relates largely to capitalization and the use of punctuation. Legal names are rarely all uppercase, and automated conversion requires considerable caution. Punctuation, especially hyphenation of compound names, is also fraught, and correcting it may require sighting the original legal documents. The Hapara code generator ignores punctuation and letter case to reduce the likelihood of an ID changing because of a cosmetic change.
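One way to picture "ignoring punctuation and letter case" is a normalization step applied to names before they feed into code generation. The sketch below is an assumption about what such a step might look like, not the generator's actual code; the function name `normalize_name` is hypothetical.

```python
import unicodedata


def normalize_name(name):
    """Fold case and drop everything except letters and digits."""
    # NFKC makes visually identical spellings (e.g. an accented letter stored
    # as one character or as letter + combining mark) compare equal.
    folded = unicodedata.normalize("NFKC", name).casefold()
    return "".join(ch for ch in folded if ch.isalnum())


# Cosmetic variants of the same name normalize to the same string.
print(normalize_name("SMITH-JONES") == normalize_name("Smith Jones"))
```

Because hyphens, apostrophes, spaces and capitalization are discarded, a correction such as "SMITH-JONES" to "Smith Jones" would leave the normalized name unchanged.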

Stability means that student names are only changed for a valid reason. Changes to a student’s name have an increasingly significant ripple effect as more and more systems are linked to SIMS/SMS data; many online systems may not cope gracefully with an account rename, possibly resulting in loss of student content. At a minimum, admin staff responsible for the day-to-day management of the SIMS/SMS should be made aware of the implications of even seemingly "cosmetic" updates; realistically, some formal policies should be put in place around how student identity data is accepted and verified by the school.

 
Teacher Dashboard account ID generation
 

The Account Creation page (Teacher Dashboard Console, Configuration, Account Creation) can be used to configure how Teacher Dashboard deals with student IDs. You can choose to use your own IDs, or have TD generate Hapara codes of a selected length for students.

Please note that if you choose to have Hapara codes generated, you must provide the school or national identifier in the student data. Records without the appropriate identifier will generate warnings on load and will not be processed.

 
Posted April 8, 2011.