|
One of the more involved discussions we tend to have with schools is around the structure and generation of student account names. It’s an increasingly important topic as more and more educational systems are linked. In working with the Manaiakalani cluster schools we arrived at a standard "Hapara code" now used by Teacher Dashboard. This document highlights the design criteria and the resulting code structure.
We are looking to open source our implementation of Hapara codes to allow other schools to easily generate safe and robust student email addresses outside of Teacher Dashboard.
|
| |
| What do we want from a student email address? |
| |
A student email address must:
- Protect the student’s identity, i.e. not disclose or allow disclosure of the full student name or other identifying details (like student ID). For this reason we cannot use the student’s first and last name (e.g. james.neutron@myschool), the school/national ID (e.g. 38292747@myschool), or direct derivations thereof. An attacker with full knowledge of a single student’s details should not be able to use them to “reverse engineer” personal details of another student
- Be personalized: fully “synthetic” codes (e.g. S828273@myschool) are impersonal and impede collaboration
- Be respectful: the ID should not arbitrarily cut the student’s name. In many cultures names have actual meaning, and mechanical abbreviations can have unintended consequences and be perceived as culturally insensitive. To avoid mechanical abbreviations, a separate PREFERRED name field needs to be maintained and used to store a shorter (or simply preferred) alternative
- Be automatically generated from student info held by the SIMS/SMS: it’s impractical to ask students to pick a unique “handle” to personalize their account (e.g. seargent_pepper@myschool)
- Be unique over the tenure and scope of the account (tenure may be 1 to 14 years; scope may be just one school, a school district, or nation-wide; see the optional portability requirement below). This means that the generated ID has to include a synthetic element derived from a unique stable identifier: some aspect of the student’s identity that is unlikely to change over their tenure (like the national student ID or school student ID #). For this reason, using academic years or class names to distinguish students isn’t a good idea (e.g. jamesY11R15@myschool). Student IDs cannot be used directly owing to the first two requirements
- Be as short as possible: students will be increasingly expected to use this login, potentially multiple times over a day. For this reason, the “synthetic” component of the ID must be kept as short as possible.
- Be memorable and easy to communicate to the student. It should therefore avoid special characters (_+.={}() etc.) and, in the synthetic component, easily confused characters (i, l, 1, 0, O, etc.)
- Be appropriate: the ID should not spell out anything questionable. This is generally not an issue with names, but can become a problem when names are combined with synthetic codes (e.g. jamesISSY@myschool)
OPTIONALLY:
- Be portable from school to school to allow students to retain their identity while switching domains. If the scope of the ID must span schools, the synthetic component of the ID must be derived from a SIMS/SMS data element
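Some of the requirements above can be checked mechanically. The following is a minimal sketch, not part of Teacher Dashboard: the function name `check_candidate` and the wording of the messages are hypothetical, and it only covers the checks that need no human judgement (disclosure of the last name or student ID, and special characters).

```python
import re


def check_candidate(local_part, last_name, student_id):
    """Return a list of requirement violations for a candidate ID.

    Illustrative only: covers disclosure of last name / student ID and
    the ban on special characters. Requirements such as "respectful" or
    "appropriate" still need human review.
    """
    problems = []
    lowered = local_part.lower()
    if last_name and last_name.lower() in lowered:
        problems.append("discloses last name")
    if student_id and str(student_id) in lowered:
        problems.append("discloses student ID")
    if not re.fullmatch(r"[a-z0-9]+", lowered):
        problems.append("contains special characters")
    return problems


# The part before the @ of some of the counter-examples listed above:
print(check_candidate("james.neutron", "Neutron", "38292747"))
print(check_candidate("38292747", "Neutron", "38292747"))
print(check_candidate("seargent_pepper", "Neutron", "38292747"))
```

An empty list only means that none of these specific checks failed; it does not show that an ID is personal, respectful, or appropriate.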
|
| |
| Hapara Student ID Codes |
| |
|
The Hapara codes are designed to be personalized yet protect the student’s identity, be short yet highly unique, and reduce the likelihood of inappropriate text being generated:
anne8f4ht@
anne8f4ht
Name prefix (“anne” in the example) consists of the first name, or the preferred name if provided, to ensure the account is personal to each student. Preferred names can be used to shorten compound first names or where the legal name isn’t commonly used.
anne8f4ht
Synthetic code (“8f4ht” in the example) contains a subset of digits and lowercase letters. The code excludes vowels and starts with a digit to reduce the likelihood of inappropriate text being generated, and excludes characters that could be easily misread or confused. The algorithm factors in the student’s last name and uses non-reversible mathematical constructs to make it more difficult to reverse-engineer national or school ID numbers from the email address.
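To make the structure concrete, here is a minimal sketch of a generator with the same shape: a name prefix followed by a short code that starts with a digit and avoids vowels and easily confused characters. It is not the Hapara algorithm, which is not specified in this document. The alphabet, the use of a keyed SHA-256 hash (HMAC), the `secret` parameter, and the function names are all assumptions made for illustration.

```python
import hashlib
import hmac

# Assumed alphabet: no vowels (and no "y"), no "l", no 0 or 1.
DIGITS = "23456789"
ALPHABET = DIGITS + "bcdfghjkmnpqrstvwxz"


def _normalize(text):
    """Lowercase and drop punctuation/whitespace so cosmetic edits do not change the ID."""
    return "".join(ch for ch in text.casefold() if ch.isalnum())


def synthetic_code(last_name, stable_id, secret, length=5):
    """Derive a short code from the last name and stable ID (hypothetical scheme)."""
    message = f"{_normalize(last_name)}|{stable_id}".encode("utf-8")
    digest = hmac.new(secret, message, hashlib.sha256).digest()
    value = int.from_bytes(digest, "big")
    # First character is always a digit, so the code cannot start a word.
    value, index = divmod(value, len(DIGITS))
    chars = [DIGITS[index]]
    for _ in range(length - 1):
        value, index = divmod(value, len(ALPHABET))
        chars.append(ALPHABET[index])
    return "".join(chars)


def account_id(first_name, last_name, stable_id, secret, preferred_name="", length=5):
    """Name prefix (preferred name if populated) followed by the synthetic code."""
    prefix = _normalize(preferred_name or first_name)
    return prefix + synthetic_code(last_name, stable_id, secret, length)


print(account_id("Anne", "Smith", "38292747", b"example-secret"))
```

A keyed hash is used here because student ID numbers are short enough to be guessed exhaustively; with an unkeyed hash, anyone who knows the scheme could try every ID. Unlike a purpose-built encoding, a truncated hash can also collide for unrelated students, so a real deployment would still need a uniqueness check.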
|
| |
The length of the synthetic code can be adjusted to make the resulting ID more unique:
| Synthetic code length | Collision risk |
| 4 characters | Possible on matching first/preferred name, last name, and last 4-5 digits of the school/national ID # |
| 5 characters | Possible on matching first/preferred name, last name, and last 5-6 digits of the school/national ID # |
| 6 characters | Possible on matching first/preferred name, last name, and last 6-7 digits of the school/national ID # |
Note: for school/national IDs not structured as sequential numbers, the most significant digits apply.
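The reason a longer code tolerates more matching ID digits is that each extra character multiplies the number of distinct codes. The sketch below counts the codes available for a given length. The alphabet sizes (8 possible leading digits, 27 symbols in the remaining positions) are assumptions, not the real Hapara alphabet, so the figures will not necessarily line up with the table above.

```python
def code_space(length, first_symbols=8, other_symbols=27):
    """Number of distinct codes: one leading digit, then (length - 1) symbols."""
    if length < 1:
        raise ValueError("length must be at least 1")
    return first_symbols * other_symbols ** (length - 1)


def id_digits_covered(length, **alphabet):
    """Largest n such that 10**n distinct ID suffixes fit in the code space."""
    return len(str(code_space(length, **alphabet))) - 1


for length in (4, 5, 6):
    print(length, code_space(length), id_digits_covered(length))
```

Passing different values for `first_symbols` and `other_symbols` shows how quickly the space shrinks when more characters are excluded as confusing or inappropriate.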
|
| |
| What data is needed to generate Hapara codes |
| |
|
Data needs to be complete, accurate, consistent and stable. Our algorithm requires the following data fields:
- Legal last name
- Legal first name
- Preferred name
- Stable ID: national student identifier, or school ID
Note that other name-related fields (prefix, suffix, middle initial) are not needed. These may still be needed to associate a full name with the account ID (if desired), but are not used in generating a student ID.
Completeness of student data is usually mandated by regulatory requirements. Records missing the above data elements (with the exception of preferred name, which is only used when populated) cannot be used to generate a Hapara code.
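A completeness check along these lines can be run before any codes are generated. This is a minimal sketch: the field names (`last_name`, `first_name`, `stable_id`, `preferred_name`) and the warning format are hypothetical and would need to match the actual SIMS/SMS export.

```python
# Hypothetical field names; preferred_name is optional and therefore not listed.
REQUIRED_FIELDS = ("last_name", "first_name", "stable_id")


def split_by_completeness(records):
    """Separate usable records from incomplete ones, with one warning per reject."""
    usable, warnings = [], []
    for index, record in enumerate(records):
        missing = [
            field for field in REQUIRED_FIELDS
            if not str(record.get(field) or "").strip()
        ]
        if missing:
            warnings.append(f"record {index}: missing {', '.join(missing)}")
        else:
            usable.append(record)
    return usable, warnings


records = [
    {"last_name": "Smith", "first_name": "Anne", "stable_id": "38292747"},
    {"last_name": "Neutron", "first_name": "James", "stable_id": " "},
]
usable, warnings = split_by_completeness(records)
print(len(usable), warnings)
```

Whitespace-only values are treated as missing, since they are a common artifact of spreadsheet exports.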
Accuracy is often a reflection of how the data is sourced and verified, and of the administrative maturity of the school. Best practice, and increasingly a legal requirement, is to use only specific forms of legal documents to populate name fields. If students (and parents) do not actually use the legal name, the preferred name field should be used. Experience suggests that students and parents are not a very reliable source of accurate legal name data :)
Consistency in name data relates largely to capitalization and use of punctuation. Legal names are rarely all uppercase, and automated conversion requires considerable caution. Punctuation, especially hyphenation of compound names, is also fraught, and correcting it may require sighting the original legal documents. The Hapara code generator ignores punctuation and letter case to reduce the likelihood of an ID changing due to a cosmetic edit.
Stability means that student names are only changed for a valid reason. Changes to a student name have an increasing and significant ripple effect as more and more systems are linked to SIMS/SMS data; many online systems may not cope gracefully with an account rename operation, possibly resulting in loss of student content. At minimum, admin staff responsible for day-to-day management of SIMS/SMS should be made aware of the implications of even seemingly "cosmetic" updates; realistically, some formal policies should be set in place around how student identity data is accepted and verified by the school.
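Ignoring punctuation and letter case can be as simple as the sketch below. The function name `normalize_name` is hypothetical and the exact rules used by the Hapara generator are not given here; this version keeps only letters and digits after case folding.

```python
def normalize_name(name):
    """Case-fold and keep only letters and digits (an assumed normalization rule)."""
    return "".join(ch for ch in name.casefold() if ch.isalnum())


# Cosmetic variants of the same name normalize to the same value.
print(normalize_name("Smith-Jones") == normalize_name("SMITH JONES"))
print(normalize_name("O'Neil"))
```

Note that this does not fold accented or macronized letters to their plain forms, so adding or removing a diacritic would still count as a change under this rule.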
|
| |
| Teacher Dashboard account ID generation |
| |
|
The Teacher Dashboard Console, Configuration, Account Creation page can be used to configure how Teacher Dashboard deals with student IDs. You can choose to use your own IDs, or have TD generate Hapara codes of a selected length for students.
Please note that if you choose to have Hapara codes generated, you must provide the school or national identifier in the student data. Records without the appropriate identifier will generate warnings on load and will not be processed.
|