Add RTL/LTR Markdown linter for mixed-direction text consistency and PR annotation (#11877)

* Add RTL/LTR Markdown linter for mixed-direction text consistency and PR annotation

Introduce a Python-based linter (scripts/rtl_ltr_linter.py) to automatically detect and annotate issues related to mixed Right-To-Left (RTL) and Left-To-Right (LTR) text in Markdown files. The linter analyzes list items, book entries, and metadata for potential bidirectional text rendering problems, such as missing Unicode directionality markers (RLM/LRM) and improper handling of LTR keywords or symbols in RTL contexts.

Key features:
- Scans all Markdown files in the repository, with full logs saved as workflow artifacts.
- Annotates only changed or added lines in pull requests, providing targeted feedback in the GitHub Actions Job Summary.
- Detects common RTL/LTR issues, including:
  - Missing directionality markers after LTR keywords (e.g., "HTML") or symbols (e.g., "C#") in RTL text.
  - BIDI (bidirectional) mismatches that may affect text display.
  - Incorrect ordering of author names and metadata in RTL contexts.
- Configurable via rtl_linter_config.yml for keywords, symbols, and severity levels.
- Includes a GitHub Actions workflow (rtl-ltr-linter.yml) for automated checks on PRs.
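
The "missing directionality marker" check listed above can be pictured roughly as follows. This is only a minimal sketch, not the code in scripts/rtl_ltr_linter.py: the keyword list, the `RTL_CHAR` range, and the function name `find_unmarked_keywords` are assumptions made for illustration, and the real linter reads its keywords from rtl_linter_config.yml.

```python
import re

# Hypothetical keyword list; the real linter loads its own from rtl_linter_config.yml.
LTR_KEYWORDS = ["HTML", "JavaScript"]

# Hebrew and Arabic Unicode blocks (a simplification of the full BIDI character classes).
RTL_CHAR = re.compile(r"[\u0590-\u05FF\u0600-\u06FF]")

# Accepted markers: the HTML entities and the raw RLM/LRM code points.
MARKERS = ("&rlm;", "&lrm;", "\u200f", "\u200e")


def find_unmarked_keywords(line: str) -> list[str]:
    """Return LTR keywords in an RTL line that are not followed by a marker."""
    if not RTL_CHAR.search(line):
        return []  # no RTL text on this line, nothing to check
    missing = []
    for keyword in LTR_KEYWORDS:
        for match in re.finditer(re.escape(keyword), line):
            rest = line[match.end():]
            if not rest.startswith(MARKERS):
                missing.append(keyword)
    return missing
```

A line with no RTL characters is skipped entirely, so purely English entries never produce findings in this sketch.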

* Add test cases for RTL/LTR linter in English and Arabic book lists

Add sample entries to free-programming-books-en.md and free-programming-books-ar.md to test the RTL/LTR Markdown linter.
These test cases include various combinations of RTL and LTR text, keywords, symbols, and metadata to verify that the linter correctly detects directionality issues and outputs the expected logs and annotations.

* Restore original book lists after RTL/LTR linter test cases

Revert test entries in free-programming-books-en.md and free-programming-books-ar.md, restoring the original book lists. This commit removes temporary test data used for validating the RTL/LTR Markdown linter, preparing the repository for merging the PR with a clean state.

No functional changes to the linter or configuration files; only test content has been removed.

* Update RTL/LTR linter workflow and script: run only on RTL file changes or "RTL" label, fail only on errors

The GitHub Actions workflow for the RTL/LTR Markdown linter now runs only if:
- The PR modifies .md files related to RTL languages (ar, he, fa, ur), or
- The PR has the "RTL" label.
The linter script has been updated to fail the check only if errors are found on changed lines, not for warnings.
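
The two rules above (when to run, and when to fail) can be sketched in Python as shown here. This is an illustration under stated assumptions, not the actual workflow or script: the filename pattern (including the optional region suffix such as `_IR`), the function names, and the `(line_number, severity)` issue shape are all hypothetical.

```python
import re

# Assumed filename pattern for RTL language files: ar, he, fa, ur,
# with an optional region suffix (e.g. "-fa_IR.md").
RTL_FILE = re.compile(r"-(ar|he|fa|ur)(_[A-Z]{2})?\.md$")


def should_run(changed_files: list[str], labels: list[str]) -> bool:
    """Run the linter if the PR has the "RTL" label or touches an RTL file."""
    return "RTL" in labels or any(RTL_FILE.search(path) for path in changed_files)


def exit_code(issues: list[tuple[int, str]], changed_lines: set[int]) -> int:
    """Return 1 only if an error (not a warning) sits on a changed line."""
    for line_number, severity in issues:
        if severity == "error" and line_number in changed_lines:
            return 1
    return 0
```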

* Only upload linter artifact if linter step runs

Prevent warning about missing artifact by uploading the linter output log only if the linter step was executed (success or failure). This avoids unnecessary warnings when the linter is skipped because no RTL files were changed and no RTL label is present.

* Test workflow: modify non-RTL markdown file

Modified free-programming-books-en.md to verify that the RTL/LTR linter workflow does not run when only non-RTL markdown files are changed and the "RTL" label is not present.

* Test workflow: modify RTL markdown file to trigger linter

Modified free-programming-books-ar.md to verify that the RTL/LTR linter workflow runs as expected when an RTL markdown file is changed.

* Fix RTL/LTR BIDI issues in some markdown files

Applied directional markers (&lrm;, &rlm;) and other formatting fixes to resolve BIDI (bidirectional text) errors and warnings reported by the linter in several .md files.

* Fix workflow: upload linter log only if linter step has not been skipped

Updated the workflow to upload the linter output artifact only when the linter step was actually executed (not skipped).

* Add debug step to check linter outcome in workflow

Added a debug step after the linter execution in the workflow to print the outcome and conclusion of the run_linter step.

* Set continue-on-error for linter step to allow artifact upload and debug

* Remove workflow debug step and update markdown file

Removed the debug step from the RTL/LTR linter workflow and applied further changes to a markdown file.

* Fix RTL/LTR BIDI issues in some markdown files

Applied directional markers (&lrm;, &rlm;) and other formatting fixes to resolve BIDI (bidirectional text) errors and warnings reported by the linter in several .md files. This commit is a second batch of corrections to improve RTL/LTR rendering and pass the linter checks.

* Fix RTL/LTR BIDI issues in some markdown files

Applied directional markers (&lrm;, &rlm;) and other formatting fixes to resolve BIDI (bidirectional text) errors and warnings reported by the linter in several .md files. This commit is a third batch of corrections to improve RTL/LTR rendering and pass the linter checks.

* Do not produce log file if no issues found

Updated the linter script to avoid creating the log file when no errors, warnings, or notices are found. In that case, the script now prints a "::notice ::No issues found" annotation instead.
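
A minimal sketch of that behavior, assuming the findings are collected as a list of strings. The function name `write_log` and its signature are hypothetical and do not come from the linter script.

```python
from pathlib import Path


def write_log(issues: list[str], log_path: Path) -> bool:
    """Write the log only when there is something to report.

    Returns True if a log file was written, False otherwise.
    """
    if not issues:
        # GitHub Actions workflow command for a notice annotation.
        print("::notice ::No issues found")
        return False
    log_path.write_text("\n".join(issues) + "\n", encoding="utf-8")
    return True
```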

* Always print annotation with number of errors and warnings found

Updated the linter script to always print an annotation indicating how many errors and warnings were found, even if there are none.
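
One way such a summary could be built is sketched here. The `::notice ::`, `::warning ::`, and `::error ::` prefixes are GitHub Actions workflow commands; the message wording, the choice of level, and the function name `summary_annotation` are assumptions for illustration only.

```python
def summary_annotation(errors: int, warnings: int) -> str:
    """Build a one-line summary annotation, emitted even when both counts are zero."""
    if errors:
        level = "error"
    elif warnings:
        level = "warning"
    else:
        level = "notice"
    return f"::{level} ::RTL/LTR linter found {errors} error(s) and {warnings} warning(s)"


if __name__ == "__main__":
    print(summary_annotation(0, 0))
```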

* Fix: always print summary annotation with number of issues found

* Add a missing newline character at end of file free-courses-he.md

* Update linter configuration and revert markdown files to pre-fix state

Updated the organization of keywords and symbols in the linter configuration file. Reverted all markdown files to their original state prior to the fixes.

* Update free-programming-books-he.md with fixes

* Update free-programming-books-he.md with further fixes

* Update free-programming-books-he.md with fixes

* Update free-programming-books-fa_IR.md with fixes

* Update free-programming-books-he.md with further fixes

* Update free-programming-books-ar.md with fixes

* Update free-programming-books-ar.md with further fixes

* Update free-podcasts-screencasts-ar.md with fixes

* Update free-podcasts-screencasts-fa_IR.md with fixes

* Update free-courses-he.md with fixes

* Update free-courses-he.md with further fixes

* Update free-courses-fa_IR.md with fixes

* Update free-courses-fa_IR.md with further fixes

* Update free-courses-ar.md with fixes

* Update free-courses-ar.md with further fixes

* Update free-courses-ar.md with further fixes

* Update free-courses-ur.md with fixes

* Update some markdown files with further improvements

* Fix alignment of nested lists in free-programming-books-fa_IR.md

* Update CONTRIBUTING.md and CONTRIBUTING-it.md with RTL/LTR linter error fixing guidelines

Added a section to CONTRIBUTING.md and CONTRIBUTING-it.md explaining how to fix RTL/LTR Markdown linter errors, including when to use &rlm; and &lrm;, with practical examples for contributors working on files with mixed RTL and LTR text.
Gabriele Ciccotelli
2025-05-28 16:46:25 +02:00
committed by GitHub
parent 1be7c48c60
commit caa05be694
14 changed files with 1425 additions and 465 deletions


@@ -262,3 +262,52 @@ Se riesci a stamparlo e conservarne l'essenza, non è un tutorial interattivo.
- È possibile specificare più di un file da controllare, utilizzando un singolo spazio per separare ogni voce.
- Se specifichi più di un file, i risultati della build si basano sul risultato dell'ultimo file controllato. Dovresti essere consapevole che potresti ottenere il passaggio di build verdi a causa di ciò, quindi assicurati di ispezionare il registro di build alla fine della Pull Request facendo clic su "Show all checks" -> "Details".
### Come risolvere gli errori del linter RTL/LTR
Se viene eseguito il linter RTL/LTR Markdown Linter (sui file `*-ar.md`, `*-he.md`, `*-fa.md`, `*-ur.md`) e si vedono errori o warning:
- **Parole LTR** (ad esempio "HTML", "JavaScript") in testo RTL: aggiungi `&rlm;` immediatamente dopo ogni segmento LTR;
- **Simboli LTR** (ad esempio "C#", "C++"): aggiungi `&lrm;` immediatamente dopo ogni simbolo LTR;
#### Esempi
**SCORRETTO**
```html
<div dir="rtl" markdown="1">
* [كتاب الأمثلة في R](URL) - John Doe (PDF)
</div>
```
**CORRETTO**
```html
<div dir="rtl" markdown="1">
* [كتاب الأمثلة في R&rlm;](URL) - John Doe&rlm; (PDF)
</div>
```
---
**SCORRETTO**
```html
<div dir="rtl" markdown="1">
* [Tech Podcast - بودكاست المثال](URL) Ahmad Hasan, محمد علي
</div>
```
**CORRETTO**
```html
<div dir="rtl" markdown="1">
* [Tech Podcast - بودكاست المثال](URL) Ahmad Hasan,&rlm; محمد علي
</div>
```
---
**SCORRETTO**
```html
<div dir="rtl" markdown="1">
* [أساسيات C#](URL)
</div>
```
**CORRETTO**
```html
<div dir="rtl" markdown="1">
* [أساسيات C#&lrm;](URL)
</div>
```


@@ -286,3 +286,52 @@ If you can print it out and retain its essence, it's not an Interactive Tutorial
- You may specify more than one file to check, using a single space to separate each entry.
- If you specify more than one file, results of the build are based on the result of the last file checked. You should be aware that you may get passing green builds due to this so be sure to inspect the build log at the end of the Pull Request by clicking on "Show all checks" -> "Details".
### Fixing RTL/LTR linter errors
If you run the RTL/LTR Markdown Linter (on `*-ar.md`, `*-he.md`, `*-fa.md`, `*-ur.md` files) and see errors or warnings:
- **LTR words** (e.g. “HTML”, “JavaScript”) in RTL text: append `&rlm;` immediately after each LTR segment;
- **LTR symbols** (e.g. “C#”, “C++”): append `&lrm;` immediately after each LTR symbol;
#### Examples
**BAD**
```html
<div dir="rtl" markdown="1">
* [كتاب الأمثلة في R](URL) - John Doe (PDF)
</div>
```
**GOOD**
```html
<div dir="rtl" markdown="1">
* [كتاب الأمثلة في R&rlm;](URL) - John Doe&rlm; (PDF)
</div>
```
---
**BAD**
```html
<div dir="rtl" markdown="1">
* [Tech Podcast - بودكاست المثال](URL) Ahmad Hasan, محمد علي
</div>
```
**GOOD**
```html
<div dir="rtl" markdown="1">
* [Tech Podcast - بودكاست المثال](URL) Ahmad Hasan,&rlm; محمد علي
</div>
```
---
**BAD**
```html
<div dir="rtl" markdown="1">
* [أساسيات C#](URL)
</div>
```
**GOOD**
```html
<div dir="rtl" markdown="1">
* [أساسيات C#&lrm;](URL)
</div>
```
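
The two rules in this section could also be applied mechanically. The following is a rough sketch, assuming the input is already known to be RTL text and that the word and symbol lists are the examples named above; the helper `add_markers` is hypothetical and is not part of the linter. It does not cover free-form LTR segments such as author names, which still need a manual `&rlm;`.

```python
import re

# Example lists taken from the guideline text; a real list would be longer.
LTR_WORDS = ["HTML", "JavaScript"]
LTR_SYMBOLS = ["C#", "C++"]


def add_markers(text: str) -> str:
    """Append &lrm; after LTR symbols and &rlm; after LTR words, if not already present."""
    for symbol in LTR_SYMBOLS:
        pattern = re.escape(symbol) + r"(?!&lrm;)"
        text = re.sub(pattern, lambda m: m.group(0) + "&lrm;", text)
    for word in LTR_WORDS:
        pattern = r"\b" + re.escape(word) + r"\b(?!&rlm;)"
        text = re.sub(pattern, lambda m: m.group(0) + "&rlm;", text)
    return text
```

The negative lookaheads keep the helper idempotent, so running it twice on the same line does not stack markers.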