ICANN: Label Generation Rules for the Root Zone Version 4

Purpose: To determine valid top-level Internationalized Domain Name (IDN) labels and their variant labels, the community finalized the Procedure to Develop and Maintain the Label Generation Rules for the Root Zone in Respect of IDNA Labels (the Procedure). The Procedure requires community-based Generation Panels (GPs), organized by script, to convene and propose rules for their scripts. The Integration Panel evaluates these proposals and then integrates them into the Root Zone Label Generation Rules (RZ-LGR).

Current Status: The Integration Panel (IP) has successfully evaluated the Root Zone Label Generation Rules (LGR) proposals for the Bangla and Chinese scripts, as well as the updated proposal for the Malayalam script. These proposals were finalized and submitted by the respective GPs after being released for Public Comment. The IP has integrated these proposals, along with the scripts already integrated into the third version of the Root Zone LGR (RZ-LGR-3), to develop the fourth version of the Root Zone LGR (RZ-LGR-4).

Next Steps: As per the Procedure, RZ-LGR-4 is being released for Public Comment to gather community feedback for its finalization. Proposals for additional scripts will be integrated in future versions of the RZ-LGR.

Section I: Description and Explanation

Under the Procedure that guides this work, each GP starts its analysis from the current version of the Maximal Starting Repertoire (MSR) and develops a proposal for its script(s) based on the principles and additional considerations presented in the Procedure. RZ-LGR-4 is the fourth edition of an RZ-LGR designed to meet the requirement for a conservative set of label generation rules for stable and secure operation of the Internet’s Root Zone. RZ-LGR-4 contains rules for 18 scripts: Arabic, Bangla, Chinese, Devanagari, Ethiopic, Georgian, Gujarati, Gurmukhi, Hebrew, Kannada, Khmer, Lao, Malayalam, Oriya, Sinhala, Tamil, Telugu and Thai, based on the proposals submitted by the respective GPs. The Integration Panel also considered the Armenian and Cyrillic script proposals, but because these proposals interact with the LGRs for the Greek and Latin scripts, which are still being developed, it was deemed prudent to delay their integration.

RZ-LGR provides a specification to mechanically determine valid IDN Top-Level Domains (TLDs). The RZ-LGR also determines the corresponding set of blocked and allocatable variant labels. Additional mechanisms need to be developed to determine which, if any, of the allocatable variant labels generated by the RZ-LGR will be allocated to the applicants.
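The mechanical nature of this determination can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the three-letter repertoire, the variant mappings, and the simplified rule that a variant label is blocked if any mapping used to produce it is blocked and allocatable otherwise. The actual RZ-LGR files define their own repertoires and disposition rules.

```python
from itertools import product

# Hypothetical toy repertoire, not taken from any RZ-LGR file.
# Each allowed code point maps to its variants and the type of each mapping.
REPERTOIRE = {
    "a": {"b": "allocatable"},
    "b": {"a": "blocked"},
    "c": {},
}


def is_valid(label):
    """A label is valid here if it is non-empty and every code point is in the repertoire."""
    return bool(label) and all(ch in REPERTOIRE for ch in label)


def variants(label):
    """Return {variant_label: disposition} for every variant of a valid label."""
    if not is_valid(label):
        raise ValueError("label contains code points outside the repertoire")

    # For each position, offer the original code point (type None) and its variants.
    choices = []
    for ch in label:
        options = [(ch, None)] + list(REPERTOIRE[ch].items())
        choices.append(options)

    result = {}
    for combo in product(*choices):
        types = [t for _, t in combo if t is not None]
        if not types:
            continue  # every position kept its original code point: the label itself
        candidate = "".join(ch for ch, _ in combo)
        # Simplified assumption: one blocked mapping blocks the whole variant label.
        result[candidate] = "blocked" if "blocked" in types else "allocatable"
    return result


if __name__ == "__main__":
    for variant, disposition in sorted(variants("ab").items()):
        print(variant, disposition)
```

Note that the sketch stops at computing dispositions. Deciding which allocatable variant labels are actually allocated to an applicant is outside its scope, as it is outside the scope of the RZ-LGR.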

The current version of the RZ-LGR will be followed by future versions that support additional scripts and writing systems as proposals from more GPs become available. These future additions must be upwardly compatible with the current version. In addition to the panels that have already completed their work, the Greek, Japanese, Korean, Latin and Myanmar panels have work underway. GPs for additional scripts, including Thaana and Tibetan, are being formed.

Section II: Background

The Root Zone LGR development procedure requires three steps. Initially, the Integration Panel creates the Maximal Starting Repertoire (MSR) for the GPs to initiate their work. Based on the latest version of the MSR, the community-based GPs organize and develop proposals for the RZ-LGR for their respective scripts or writing systems. After Public Comment, these proposals are submitted to the Integration Panel for evaluation. Finally, the successfully evaluated proposals are integrated into the next version of RZ-LGR.

The current MSR-4 covers the following 28 scripts: Arabic, Armenian, Bengali, Cyrillic, Devanagari, Ethiopic, Georgian, Greek, Gujarati, Gurmukhi, Han, Hangul, Hebrew, Hiragana, Kannada, Katakana, Khmer, Lao, Latin, Malayalam, Myanmar, Oriya, Sinhala, Tamil, Telugu, Thaana, Thai, and Tibetan, and is based on Unicode version 6.3.

Successful development of the RZ-LGR depends on having a community-based GP for each script or writing system. A GP develops an LGR proposal that is used to generate valid TLD labels and their variant labels for the relevant script or writing system. Each proposal contains the valid code points, their variant code points and Whole Label Evaluation (WLE) rules. In doing so, the GP may need to coordinate with other GPs whenever their repertoires overlap or are closely related. Each proposal is reviewed by the community through the Public Comment process before it is submitted to the IP for further consideration.
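The three parts of a proposal described above can be sketched as a small Python data structure. This is only an illustration under stated assumptions: the class name `ScriptProposal`, its fields, and the sample rule that rejects a label starting with a combining mark are hypothetical and are not taken from any GP proposal.

```python
import unicodedata
from dataclasses import dataclass, field


@dataclass
class ScriptProposal:
    """Hypothetical container for the three parts of a proposal."""
    code_points: set                                 # valid code points
    variants: dict = field(default_factory=dict)     # code point -> variant code points
    wle_rules: list = field(default_factory=list)    # functions: label -> bool

    def accepts(self, label):
        # First check the repertoire, then apply every whole-label rule.
        if not label or any(ch not in self.code_points for ch in label):
            return False
        return all(rule(label) for rule in self.wle_rules)


def no_leading_combining_mark(label):
    """Illustrative WLE rule: the first code point must not be a combining mark."""
    return not unicodedata.category(label[0]).startswith("M")


if __name__ == "__main__":
    # U+0915 DEVANAGARI LETTER KA and U+093E DEVANAGARI VOWEL SIGN AA,
    # chosen only as an example of a letter followed by a combining mark.
    proposal = ScriptProposal(
        code_points={"\u0915", "\u093e"},
        wle_rules=[no_leading_combining_mark],
    )
    print(proposal.accepts("\u0915\u093e"))
    print(proposal.accepts("\u093e\u0915"))
```

The point of the sketch is that a WLE rule looks at the label as a whole, so a label can fail even when every one of its code points is in the repertoire.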

The Procedure states that the Integration Panel creates a set of recommended label generation rules that integrates all the approved proposals from the GPs. When the IP has created such a set, it is posted for Public Comment using the prevailing ICANN procedures. At the end of the Public Comment period, the IP reviews the comments received and finalizes the LGR. The resulting label generation rules become the next version of the RZ-LGR.

Section III: Relevant Resources

The following Root Zone Label Generation Rules version 4 (RZ-LGR-4) files are published for Public Comment.

Summary Documents:

  1. Overview and Summary: https://www.icann.org/sites/default/files/lgr/lgr-4-overview-29jun20-en.pdf
  2. Repertoire Tables, non-CJK: https://www.icann.org/sites/default/files/lgr/lgr-4-non-cjk-29jun20-en.pdf
  3. Repertoire Tables, Han: https://www.icann.org/sites/default/files/lgr/lgr-4-han-29jun20-en.pdf

XML versions (normative):

  1. Common: https://www.icann.org/sites/default/files/lgr/lgr-4-common-29jun20-en.xml
  2. Arabic: https://www.icann.org/sites/default/files/lgr/lgr-4-arabic-script-29jun20-en.xml
  3. Bangla: https://www.icann.org/sites/default/files/lgr/lgr-4-bengali-script-29jun20-en.xml
  4. Chinese: https://www.icann.org/sites/default/files/lgr/lgr-4-chinese-script-29jun20-en.xml
  5. Devanagari: https://www.icann.org/sites/default/files/lgr/lgr-4-devanagari-script-29jun20-en.xml
  6. Ethiopic: https://www.icann.org/sites/default/files/lgr/lgr-4-ethiopic-script-29jun20-en.xml
  7. Georgian: https://www.icann.org/sites/default/files/lgr/lgr-4-georgian-script-29jun20-en.xml
  8. Gujarati: https://www.icann.org/sites/default/files/lgr/lgr-4-gujarati-script-29jun20-en.xml
  9. Gurmukhi: https://www.icann.org/sites/default/files/lgr/lgr-4-gurmukhi-script-29jun20-en.xml
  10. Hebrew: https://www.icann.org/sites/default/files/lgr/lgr-4-hebrew-script-29jun20-en.xml
  11. Kannada: https://www.icann.org/sites/default/files/lgr/lgr-4-kannada-script-29jun20-en.xml
  12. Khmer: https://www.icann.org/sites/default/files/lgr/lgr-4-khmer-script-29jun20-en.xml
  13. Lao: https://www.icann.org/sites/default/files/lgr/lgr-4-lao-script-29jun20-en.xml
  14. Malayalam: https://www.icann.org/sites/default/files/lgr/lgr-4-malayalam-script-29jun20-en.xml
  15. Oriya: https://www.icann.org/sites/default/files/lgr/lgr-4-oriya-script-29jun20-en.xml
  16. Sinhala: https://www.icann.org/sites/default/files/lgr/lgr-4-sinhala-script-29jun20-en.xml
  17. Tamil: https://www.icann.org/sites/default/files/lgr/lgr-4-tamil-script-29jun20-en.xml
  18. Telugu: https://www.icann.org/sites/default/files/lgr/lgr-4-telugu-script-29jun20-en.xml
  19. Thai: https://www.icann.org/sites/default/files/lgr/lgr-4-thai-script-29jun20-en.xml
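As a rough idea of how such XML files might be read programmatically, the Python sketch below extracts code points and variants from a small inline sample. It assumes the files follow the LGR XML format defined in RFC 7940 (namespace `urn:ietf:params:xml:ns:lgr-1.0`, with `char` elements containing `var` elements). The sample data is made up for illustration and is not an excerpt from any of the files listed above, and the sketch ignores `range` elements, rules and metadata.

```python
import xml.etree.ElementTree as ET

# Namespace of the RFC 7940 LGR format (assumed to apply to the files above).
LGR_NS = {"lgr": "urn:ietf:params:xml:ns:lgr-1.0"}

# Made-up sample data for illustration only.
SAMPLE = """
<lgr xmlns="urn:ietf:params:xml:ns:lgr-1.0">
  <data>
    <char cp="0061">
      <var cp="0062" type="blocked"/>
    </char>
    <char cp="0062">
      <var cp="0061" type="blocked"/>
    </char>
    <char cp="0063"/>
  </data>
</lgr>
"""


def cp_to_str(cp_attr):
    """Convert a 'cp' attribute (space-separated hex code points) to a string."""
    return "".join(chr(int(h, 16)) for h in cp_attr.split())


def read_repertoire(xml_text):
    """Return {code point string: {variant string: variant type}}."""
    root = ET.fromstring(xml_text.strip())
    repertoire = {}
    for char in root.findall("lgr:data/lgr:char", LGR_NS):
        variants = {
            cp_to_str(var.get("cp")): var.get("type")
            for var in char.findall("lgr:var", LGR_NS)
        }
        repertoire[cp_to_str(char.get("cp"))] = variants
    return repertoire


if __name__ == "__main__":
    for cp, variants in read_repertoire(SAMPLE).items():
        print(repr(cp), variants)
```

For real use, the LGR Toolset linked under Additional Information is the more appropriate starting point, since a complete reader also has to handle ranges, rules and actions.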

HTML versions of the XML files (non-normative, for easier readability):

  1. Common: https://www.icann.org/sites/default/files/lgr/lgr-4-common-29jun20-en.html
  2. Arabic: https://www.icann.org/sites/default/files/lgr/lgr-4-arabic-script-29jun20-en.html
  3. Bangla: https://www.icann.org/sites/default/files/lgr/lgr-4-bengali-script-29jun20-en.html
  4. Chinese: https://www.icann.org/sites/default/files/lgr/lgr-4-chinese-script-29jun20-en.html
  5. Devanagari: https://www.icann.org/sites/default/files/lgr/lgr-4-devanagari-script-29jun20-en.html
  6. Ethiopic: https://www.icann.org/sites/default/files/lgr/lgr-4-ethiopic-script-29jun20-en.html
  7. Georgian: https://www.icann.org/sites/default/files/lgr/lgr-4-georgian-script-29jun20-en.html
  8. Gujarati: https://www.icann.org/sites/default/files/lgr/lgr-4-gujarati-script-29jun20-en.html
  9. Gurmukhi: https://www.icann.org/sites/default/files/lgr/lgr-4-gurmukhi-script-29jun20-en.html
  10. Hebrew: https://www.icann.org/sites/default/files/lgr/lgr-4-hebrew-script-29jun20-en.html
  11. Kannada: https://www.icann.org/sites/default/files/lgr/lgr-4-kannada-script-29jun20-en.html
  12. Khmer: https://www.icann.org/sites/default/files/lgr/lgr-4-khmer-script-29jun20-en.html
  13. Lao: https://www.icann.org/sites/default/files/lgr/lgr-4-lao-script-29jun20-en.html
  14. Malayalam: https://www.icann.org/sites/default/files/lgr/lgr-4-malayalam-script-29jun20-en.html
  15. Oriya: https://www.icann.org/sites/default/files/lgr/lgr-4-oriya-script-29jun20-en.html
  16. Sinhala: https://www.icann.org/sites/default/files/lgr/lgr-4-sinhala-script-29jun20-en.html
  17. Tamil: https://www.icann.org/sites/default/files/lgr/lgr-4-tamil-script-29jun20-en.html
  18. Telugu: https://www.icann.org/sites/default/files/lgr/lgr-4-telugu-script-29jun20-en.html
  19. Thai: https://www.icann.org/sites/default/files/lgr/lgr-4-thai-script-29jun20-en.html

Section IV: Additional Information

Finalized Proposals for Root Zone Label Generation Ruleset (RZ-LGR) by the Generation Panels: https://www.icann.org/resources/pages/lgr-proposals-2015-12-01-en

Maximal Starting Repertoire version 4: https://www.icann.org/resources/pages/msr-2015-06-21-en

RZ-LGR-3: https://www.icann.org/resources/pages/root-zone-lgr-2015-06-21-en

The Procedure: Procedure to Develop and Maintain the Label Generation Rules for the Root Zone in Respect of IDNA Labels

Call for Generation Panels: Call for Generation Panels to develop Root Zone Label Generation Rules

LGR Toolset: https://www.icann.org/resources/pages/lgr-toolset-2015-06-21-en

  • Open Date: 29 Jun 2020 23:59 UTC
  • Close Date: 11 Aug 2020 23:59 UTC
  • Staff Report Due: 25 Aug 2020 23:59 UTC
