
Patterns

This module contains additional parsing patterns and utilities that are used in RTN.

Functions:
- normalize_title: Normalize the title string to remove unwanted characters and patterns.
- check_pattern: Check if a pattern is found in the input string.

Arguments:
- patterns (list[regex.Pattern]): A list of compiled regex patterns to check.
- raw_title (str): The raw title string to check.

For more information on each function, refer to the respective docstrings.

check_pattern(patterns, raw_title)

Check if a pattern is found in the input string.

Source code in RTN/patterns.py
def check_pattern(patterns: list[regex.Pattern], raw_title: str) -> bool:
    """Check if a pattern is found in the input string."""
    return any(pattern.search(raw_title) for pattern in patterns)
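A minimal usage sketch of `check_pattern`. It assumes the standard-library `re` module in place of the third-party `regex` package, since compiled patterns from both expose `.search`. The function body is copied from the source above. The names `sample_patterns`, `has_match`, `no_match`, and `empty` and the patterns themselves are hypothetical and not taken from RTN.

```python
import re


def check_pattern(patterns: list[re.Pattern], raw_title: str) -> bool:
    """Check if a pattern is found in the input string."""
    return any(pattern.search(raw_title) for pattern in patterns)


# Hypothetical patterns for illustration only; RTN defines its own.
sample_patterns = [
    re.compile(r"\bcam\b", re.IGNORECASE),
    re.compile(r"\bsample\b", re.IGNORECASE),
]

# True as soon as any one pattern matches somewhere in the title.
has_match = check_pattern(sample_patterns, "Some.Movie.2020.SAMPLE.1080p.mkv")

# False when none of the patterns match.
no_match = check_pattern(sample_patterns, "Some.Movie.2020.1080p.BluRay.mkv")

# An empty pattern list never matches, because any() of nothing is False.
empty = check_pattern([], "anything")
```

Because the function uses `search` rather than `match`, a pattern can hit anywhere in the title, so anchors such as `\b` or `^` have to be written into the patterns themselves when position matters.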

normalize_title(raw_title, lower=True)

Normalize the title to remove special characters and accents.

Source code in RTN/patterns.py
def normalize_title(raw_title: str, lower: bool = True) -> str:
    """Normalize the title to remove special characters and accents."""
    import unicodedata
    # translationTable is a module-level character mapping defined outside this excerpt
    translation_table = str.maketrans(translationTable)
    lowered = raw_title.lower() if lower else raw_title
    # Apply Unicode NFKC normalization (folds compatibility forms such as
    # full-width letters; it does not strip accents by itself)
    normalized = unicodedata.normalize("NFKC", lowered)
    # Apply specific translations
    translated = normalized.translate(translation_table)
    # Remove punctuation
    cleaned_title = "".join(c for c in translated if c.isalnum() or c.isspace())
    return cleaned_title.strip()
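The steps in `normalize_title` can be tried end to end with the sketch below. The real `translationTable` is not shown in this excerpt, so a small hypothetical stand-in is used here. Its three entries are assumptions for illustration, and the outputs depend on them.

```python
import unicodedata

# Hypothetical stand-in for RTN's translationTable (the real mapping is not
# shown in this excerpt). Keys must be single characters for str.maketrans.
translationTable = {
    "\u00e9": "e",   # e with acute accent
    "\u00f6": "o",   # o with diaeresis
    "&": "and",
}


def normalize_title(raw_title: str, lower: bool = True) -> str:
    """Normalize the title to remove special characters and accents."""
    translation_table = str.maketrans(translationTable)
    lowered = raw_title.lower() if lower else raw_title
    # NFKC folds compatibility forms (e.g. full-width letters) into plain ones
    normalized = unicodedata.normalize("NFKC", lowered)
    # Apply the per-character replacements from the table
    translated = normalized.translate(translation_table)
    # Keep only letters, digits, and whitespace
    cleaned_title = "".join(c for c in translated if c.isalnum() or c.isspace())
    return cleaned_title.strip()


# Accent handled by the table, parentheses dropped as punctuation.
accented = normalize_title("Am\u00e9lie (2001)")

# "&" expanded by the table, colon dropped.
ampersand = normalize_title("Tom & Jerry: The Movie")

# Full-width "Full" folded by NFKC; lower=False preserves the case.
fullwidth = normalize_title("\uff26\uff55\uff4c\uff4c Metal", lower=False)

# Separators such as "." are removed, not replaced with spaces.
dotted = normalize_title("Some.Movie.2020")
```

Note that punctuation is deleted outright, so dot-separated release names collapse into one token. Callers that need word boundaries would have to replace separators with spaces before calling the function.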