Parser
Parser module for parsing torrent titles, extracting metadata using RTN patterns, and ranking torrents based on user preferences.
Functions:
- parse: Parse a torrent title and enrich it with additional metadata.
Classes:
- Torrent: Represents a torrent with metadata parsed from its title and additional computed properties.
- RTN: Rank Torrent Name class for parsing and ranking torrent titles based on user preferences.
Methods:
- rank: Parses a torrent title, computes its rank, and returns a Torrent object with metadata and ranking.
For more information on each function or class, refer to the respective docstrings.
RTN
RTN (Rank Torrent Name) class for parsing and ranking torrent titles based on user preferences.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`settings` | SettingsModel | The settings model with user preferences for parsing and ranking torrents. | required |
`ranking_model` | BaseRankingModel | The model defining the ranking logic and score computation. | required |
Notes
- `settings` and `ranking_model` must be provided and must be valid instances of `SettingsModel` and `BaseRankingModel`.
- `lev_threshold` is calculated from `settings.options["title_similarity"]` and is used to determine whether a torrent title matches a correct title.
Example
from RTN import RTN
from RTN.models import SettingsModel, DefaultRanking
settings_model = SettingsModel()
ranking_model = DefaultRanking()
rtn = RTN(settings_model, ranking_model)
Source code in RTN/parser.py
__init__(settings, ranking_model)
Initializes the RTN class with settings and a ranking model.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`settings` | SettingsModel | The settings model with user preferences for parsing and ranking torrents. | required |
`ranking_model` | BaseRankingModel | The model defining the ranking logic and score computation. | required |
Raises:
Type | Description |
---|---|
ValueError | If settings or a ranking model is not provided. |
TypeError | If settings is not an instance of SettingsModel or the ranking model is not an instance of BaseRankingModel. |
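The two error conditions above can be illustrated with a minimal, self-contained sketch. The classes below are local stand-ins rather than the real RTN models, `validate_init_args` and `error_name` are hypothetical helpers, and the order of the checks (presence first, then type) is an assumption, not something the table states.

```python
class SettingsModel:
    """Local stand-in for RTN's SettingsModel (illustration only)."""


class BaseRankingModel:
    """Local stand-in for RTN's BaseRankingModel (illustration only)."""


def validate_init_args(settings, ranking_model):
    """Hypothetical helper mirroring the documented checks.

    Assumption: missing arguments are reported before type mismatches.
    """
    if settings is None or ranking_model is None:
        raise ValueError("Both settings and a ranking model must be provided.")
    if not isinstance(settings, SettingsModel):
        raise TypeError("settings must be an instance of SettingsModel.")
    if not isinstance(ranking_model, BaseRankingModel):
        raise TypeError("ranking_model must be an instance of BaseRankingModel.")


def error_name(settings, ranking_model):
    """Return the name of the exception raised, or None if validation passes."""
    try:
        validate_init_args(settings, ranking_model)
    except (ValueError, TypeError) as exc:
        return type(exc).__name__
    return None


print(error_name(None, BaseRankingModel()))             # missing settings
print(error_name(object(), BaseRankingModel()))         # wrong settings type
print(error_name(SettingsModel(), BaseRankingModel()))  # valid arguments
```

Callers that build settings from user input may want to catch both exception types around the constructor call.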
Example
from RTN import RTN
from RTN.models import SettingsModel, DefaultRanking
settings_model = SettingsModel()
ranking_model = DefaultRanking()
rtn = RTN(settings_model, ranking_model)
Source code in RTN/parser.py
rank(raw_title, infohash, correct_title='', remove_trash=False, speed_mode=True, **kwargs)
Parses a torrent title, computes its rank, and returns a Torrent object with metadata and ranking.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`raw_title` | str | The original title of the torrent to parse. | required |
`infohash` | str | The SHA-1 hash identifier of the torrent. | required |
`correct_title` | str | The correct title to compare against for similarity. Defaults to an empty string. | `''` |
`remove_trash` | bool | Whether to check for trash patterns and raise an error if found. Defaults to False. | `False` |
`speed_mode` | bool | Whether to use speed mode for fetching. Defaults to True. | `True` |
Returns:
Name | Type | Description |
---|---|---|
Torrent | Torrent | A Torrent object with metadata and ranking information. |
Raises:
Type | Description |
---|---|
ValueError | If the title or infohash is not provided. |
TypeError | If the title or infohash is not a string. |
GarbageTorrent | If the title is identified as trash and should be ignored by the scraper, or if an invalid SHA-1 infohash is given. |
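As a rough illustration of the input checks listed above, the sketch below validates a title and an infohash before any parsing happens. It is not RTN's implementation: `GarbageTorrent` is redefined locally as a stand-in, `check_rank_inputs` is a hypothetical helper, and the sketch assumes that a valid infohash is the 40-character hexadecimal form of a SHA-1 digest.

```python
import re


class GarbageTorrent(Exception):
    """Local stand-in for the exception named above (not imported from RTN)."""


# A SHA-1 digest is 160 bits, which is 40 hexadecimal characters.
SHA1_HEX = re.compile(r"[0-9a-fA-F]{40}")


def check_rank_inputs(raw_title, infohash):
    """Hypothetical helper: raise the documented error for bad inputs."""
    if not raw_title or not infohash:
        raise ValueError("Both the title and the infohash must be provided.")
    if not isinstance(raw_title, str) or not isinstance(infohash, str):
        raise TypeError("The title and the infohash must be strings.")
    if not SHA1_HEX.fullmatch(infohash):
        raise GarbageTorrent("The infohash is not a 40-character SHA-1 hex string.")


# Passes silently for a well-formed pair.
check_rank_inputs(
    "The Walking Dead S05E03 720p HDTV x264-ASAP[ettv]",
    "c08a9ee8ce3a5c2c08865e2b05406273cabc97e7",
)
```

Whether the library also accepts other infohash encodings is not stated here, so the sketch deliberately accepts only the hexadecimal form.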
Notes
- If `correct_title` is provided, the Levenshtein ratio will be calculated between the parsed title and the correct title.
- If the ratio is below the threshold, a `GarbageTorrent` error will be raised.
- If no correct title is provided, the Levenshtein ratio will be set to 0.0.
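The three rules above can be sketched as a small gate function. This is a simplified illustration, not the library's code: it swaps in `difflib.SequenceMatcher` from the standard library as the similarity measure (which is not a true Levenshtein ratio), the default `threshold` of 0.9 is an arbitrary illustrative value, and `GarbageTorrent` and `title_ratio` are local stand-ins.

```python
from difflib import SequenceMatcher


class GarbageTorrent(Exception):
    """Local stand-in for the exception named above (not imported from RTN)."""


def title_ratio(parsed_title, correct_title="", threshold=0.9):
    """Hypothetical helper following the three documented rules.

    Assumptions: comparison is case-insensitive, and SequenceMatcher
    stands in for a Levenshtein ratio.
    """
    if not correct_title:
        # No reference title: the ratio is reported as 0.0.
        return 0.0
    ratio = SequenceMatcher(None, parsed_title.lower(), correct_title.lower()).ratio()
    if ratio < threshold:
        raise GarbageTorrent(f"Title similarity {ratio:.2f} is below {threshold}.")
    return ratio


print(title_ratio("The Walking Dead"))                      # no correct title given
print(title_ratio("The Walking Dead", "The Walking Dead"))  # identical titles
```

One consequence of the third rule is that a ratio of 0.0 does not by itself mean a poor match; it can also mean that no comparison was requested.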
Example
from RTN import RTN
from RTN.models import SettingsModel, DefaultRanking, ParsedData  # ParsedData is assumed to live in RTN.models
from RTN.parser import Torrent
settings_model = SettingsModel()
ranking_model = DefaultRanking()
rtn = RTN(settings_model, ranking_model)
# correct_title is passed so that a Levenshtein ratio is actually computed
torrent = rtn.rank("The Walking Dead S05E03 720p HDTV x264-ASAP[ettv]", "c08a9ee8ce3a5c2c08865e2b05406273cabc97e7", correct_title="The Walking Dead")
assert isinstance(torrent, Torrent)
assert isinstance(torrent.data, ParsedData)
assert torrent.fetch
assert torrent.rank > 0
assert torrent.lev_ratio > 0.0
Source code in RTN/parser.py
parse(raw_title, translate_langs=False, json=False)
Parses a torrent title using PTN and enriches it with additional metadata extracted from patterns.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
`raw_title` | str | The original torrent title to parse. | required |
`translate_langs` | bool | Whether to translate the language codes in the parsed title. Defaults to False. | `False` |
`json` | bool | Whether to return the parsed data as a dictionary. Defaults to False. | `False` |
Returns:
Type | Description |
---|---|
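ParsedData \| Dict[str, Any] | The parsed data, returned as a dictionary when `json` is True and as a ParsedData object otherwise. |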
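Because the return type depends on the `json` flag, callers need to know which shape they asked for. The sketch below illustrates that pattern only: `ParsedDataSketch` is a hypothetical dataclass stand-in with two fields borrowed from the example on this page, and `as_output` is a hypothetical helper, not part of RTN.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Union


@dataclass
class ParsedDataSketch:
    """Hypothetical stand-in for ParsedData with two illustrative fields."""
    parsed_title: str
    resolution: str


def as_output(
    data: ParsedDataSketch, json: bool = False
) -> Union[ParsedDataSketch, Dict[str, Any]]:
    """Return the object itself, or a plain dictionary when json is True."""
    return asdict(data) if json else data


data = ParsedDataSketch(parsed_title="Game of Thrones", resolution="1080p")
obj = as_output(data)             # attribute access: obj.parsed_title
dct = as_output(data, json=True)  # key access: dct["parsed_title"]
print(obj.parsed_title, dct["resolution"])
```

Attribute access suits in-process use, while the dictionary form is convenient when the result is serialized.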
Example
from RTN.parser import parse
parsed_data = parse("Game of Thrones S08E06 1080p WEB-DL DD5.1 H264-GoT")
print(parsed_data.parsed_title) # 'Game of Thrones'
print(parsed_data.normalized_title) # 'game of thrones'
print(parsed_data.type) # 'show'
print(parsed_data.seasons) # [8]
print(parsed_data.episodes) # [6]
print(parsed_data.resolution) # '1080p'
print(parsed_data.audio) # ['DD5.1']
print(parsed_data.codec) # 'H264'
Source code in RTN/parser.py