Developers’ Reference#
The library installs a couple of commands to your system. The documentation for these commands can be found
below or by executing ms3 -h.
When using ms3 as a module, we are dealing with four main object types:
- MSCX objects hold the information of a single parsed MuseScore file;
- Annotations objects hold a set of annotation labels which can be either attached to a score (i.e., contained in its XML structure), or detached.
- Both types of objects are contained within a Score object. For example, a set of Annotations read from a TSV file can be attached to the XML of an MSCX object, which can then be output as a MuseScore file.
- To manipulate many Score objects at once, for example those of an entire corpus, we use Parse objects.
Since MSCX and Annotations
objects are always attached to a Score, the documentation
starts with this central class.
The Parse class#
- class ms3.parse.Parse(directory: str | Collection[str] | None = None, recursive: bool = True, only_metadata_pieces: bool = True, include_convertible: bool = False, include_tsv: bool = True, exclude_review: bool = True, file_re: Pattern | str | None = None, folder_re: Pattern | str | None = None, exclude_re: Pattern | str | None = None, file_paths: Collection[str] | None = None, labels_cfg: dict = {}, ms=None, **logger_cfg)[source]#
Class for creating one or several Corpus objects and performing actions on all of them.
- __init__(directory: str | Collection[str] | None = None, recursive: bool = True, only_metadata_pieces: bool = True, include_convertible: bool = False, include_tsv: bool = True, exclude_review: bool = True, file_re: Pattern | str | None = None, folder_re: Pattern | str | None = None, exclude_re: Pattern | str | None = None, file_paths: Collection[str] | None = None, labels_cfg: dict = {}, ms=None, **logger_cfg)[source]#
Initialize a Parse object and try to create corpora if directories and/or file paths are specified.
- Parameters:
directory – Path to scan for corpora.
recursive – Pass False if you don’t want to scan directory for subcorpora, but force making it a corpus instead.
only_metadata_pieces – The default view excludes piece names that are not listed in the corpus’ metadata.tsv file (e.g. when none was found). Pass False to include all pieces regardless. This might be needed when setting recursive to False.
include_convertible – The default view excludes scores that would need conversion to MuseScore format prior to parsing. Pass True to include convertible scores in .musicxml, .midi, .cap or any other format that MuseScore 3 can open. For on-the-fly conversion, however, the parameter ms needs to be set.
include_tsv – The default view includes TSV files. Pass False to disregard them and parse only scores.
exclude_review – The default view excludes files and folders whose name contains ‘review’. Pass False to include these as well.
file_re – Pass a regular expression if you want to create a view filtering out all files that do not contain it.
folder_re – Pass a regular expression if you want to create a view filtering out all folders that do not contain it.
exclude_re – Pass a regular expression if you want to create a view filtering out all files or folders that contain it.
file_paths – If directory is specified, the file names of these paths are used to create a filtering view excluding all other files. Otherwise, all paths are expected to be part of the same parent corpus, which will be inferred from the first path by looking for the first parent directory that either contains a ‘metadata.tsv’ file or is a git repository. This parameter is deprecated and file_re should be used instead.
labels_cfg – Pass a configuration dict to detect only certain labels or change their output format.
ms – If you pass the path to your local MuseScore 3 installation, ms3 will attempt to parse musicXML, MuseScore 2, and other formats by temporarily converting them. If you’re using the standard path, you may try ‘auto’, or ‘win’ for Windows, ‘mac’ for MacOS, or ‘mscore’ for Linux. In case you do not pass ‘file_re’ and the MuseScore executable is detected, all convertible files are automatically selected, otherwise only those that can be parsed without conversion.
**logger_cfg – Keyword arguments for changing the logger configuration. E.g. level='d' to see all debug messages.
- corpus_paths: Dict[str, str]#
{corpus_name -> path} dictionary with each corpus’s base directory. Generally speaking, each corpus path is expected to contain a metadata.tsv and, possibly, to be a git repository.
- corpus_objects: Dict[str, Corpus]#
{corpus_name -> Corpus} dictionary with one object per
corpus_path.
- labels_cfg#
(dict) Configuration dictionary to determine the output format of labels and expanded tables. The dictionary is passed to Score upon parsing.
- property ms: str#
Path or command of the local MuseScore 3 installation if specified by the user and recognized.
- property n_detected: int#
Number of detected files aggregated from all Corpus objects without taking views into account. Excludes metadata files.
- property n_orphans: int#
Number of files that are always disregarded because they could not be attributed to any of the pieces.
- property n_parsed: int#
Number of parsed files aggregated from all Corpus objects without taking views into account. Excludes metadata files.
- property n_parsed_scores: int#
Number of parsed scores aggregated from all Corpus objects without taking views into account. Excludes metadata files.
- property n_parsed_tsvs: int#
Number of parsed TSV files aggregated from all Corpus objects without taking views into account. Excludes metadata files.
- property n_unparsed_scores: int#
Number of all detected but not yet parsed scores, aggregated from all Corpus objects without taking views into account. Excludes metadata files.
- property n_unparsed_tsvs: int#
Number of all detected but not yet parsed TSV files, aggregated from all Corpus objects without taking views into account. Excludes metadata files.
- property view: View#
Retrieve the current View object. Shorthand for
get_view().
- add_corpus(directory: str, corpus_name: str | None = None, only_metadata_pieces: bool | None = None, include_convertible: bool | None = None, include_tsv: bool | None = None, exclude_review: bool | None = None, file_re: Pattern | str | None = None, folder_re: Pattern | str | None = None, exclude_re: Pattern | str | None = None, paths: Collection[str] | None = None, **logger_cfg) None[source]#
This method creates a Corpus object which scans the directory directory for parseable files. It inherits all Views from the Parse object.
- Parameters:
directory – Directory to scan for files.
corpus_name – By default, the folder name of directory is used as name for this corpus. Pass a string to use a different identifier.
**logger_cfg – Keyword arguments for configuring the logger of the new Corpus object. E.g. level='d' to see all debug messages. Note that the logger is a child logger of this Parse object’s logger and propagates, so it might filter debug messages. You can use _.change_logger_cfg(level='d') to change the level post hoc.
- add_dir(directory: str, recursive: bool = True, only_metadata_pieces: bool | None = None, include_convertible: bool | None = None, include_tsv: bool | None = None, exclude_review: bool | None = None, file_re: Pattern | str | None = None, folder_re: Pattern | str | None = None, exclude_re: Pattern | str | None = None, paths: Collection[str] | None = None, **logger_cfg) None[source]#
This method decides if the directory directory contains several corpora or if it is a corpus itself, and calls add_corpus() for each corpus.
- Parameters:
directory – Directory to scan for corpora.
recursive – By default, if any of the first-level subdirectories contains a ‘metadata.tsv’ or is a git repository, all first-level subdirectories of directory are treated as corpora, i.e. one Corpus object per folder is created. Pass False to prevent this, which is equivalent to calling add_corpus(directory).
**logger_cfg – Keyword arguments for configuring the logger of the new Corpus objects. E.g. level='d' to see all debug messages. Note that the loggers are child loggers of this Parse object’s logger and propagate, so it might filter debug messages. You can use _.change_logger_cfg(level='d') to change the level post hoc.
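The decision rule described for recursive can be illustrated with a small standard-library sketch. It is not the library's implementation: the helper names looks_like_corpus and find_corpus_dirs are hypothetical, and skipping hidden folders is an assumption made only for this example.

```python
import tempfile
from pathlib import Path
from typing import List


def looks_like_corpus(folder: Path) -> bool:
    """Heuristic from the text: contains 'metadata.tsv' or is a git repository."""
    return (folder / "metadata.tsv").is_file() or (folder / ".git").exists()


def find_corpus_dirs(directory: Path, recursive: bool = True) -> List[Path]:
    """Return the folders that would each become one corpus (hypothetical helper)."""
    # Assumption for this sketch: hidden folders such as '.git' are not candidates.
    subdirs = sorted(
        p for p in directory.iterdir() if p.is_dir() and not p.name.startswith(".")
    )
    if recursive and any(looks_like_corpus(p) for p in subdirs):
        # One corpus per first-level subdirectory.
        return subdirs
    # Otherwise the directory itself is treated as a single corpus.
    return [directory]


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "corpus_a").mkdir()
    (root / "corpus_a" / "metadata.tsv").write_text("piece\n")
    (root / "corpus_b").mkdir()
    detected = [p.name for p in find_corpus_dirs(root)]
    forced = find_corpus_dirs(root, recursive=False)
    forced_is_root = forced == [root]
```

Note that a single subdirectory with a metadata.tsv is enough for all first-level subdirectories to be treated as corpora, which is why corpus_b is included above.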
- add_files(file_paths: str | Collection[str], corpus_name: str | None = None) None[source]#
Deprecated: To deal with particular files only, use add_corpus() passing the directory containing them and configure the View accordingly. This method here does it for you but easily leads to unexpected behaviour. It expects the file paths to point to files located in a shared corpus folder on some higher level or in folders for which Corpus objects have already been created.
- Parameters:
file_paths – Collection of file paths. Only existing files can be added.
corpus_name –
By default, I will try to attribute the files to existing Corpus objects based on their paths. This makes sense only when new files have been created after the directories were scanned.
For paths that do not contain an existing corpus_path, I will try to detect the parent directory that is a corpus (based on it being a git repository or containing a metadata.tsv). If this is without success for the first path, I will raise an error. Otherwise, all subsequent paths will be considered to be part of that same corpus (watch out for meaningless relative paths!).
You can pass a folder name contained in the first path to create a new corpus, assuming that all other paths are contained in it (watch out for meaningless relative paths!).
Pass an existing corpus_name to add the files to a particular corpus. Note that all parseable files under the corpus_path are detected anyway, and if you add files from other directories, it will lead to invalid relative paths that work only on your system. If you’re adding files that have been created after the Corpus object was created, you can leave this parameter empty; paths will be attributed to the existing corpora automatically.
- change_labels_cfg(labels_cfg=(), staff=None, voice=None, harmony_layer=None, positioning=None, decode=None, column_name=None, color_format=None)[source]#
Update Parse.labels_cfg and retrieve new ‘labels’ tables accordingly.
- Parameters:
labels_cfg (dict) – Using an entire dictionary or, to change only particular options, choose from:
staff – Arguments as they will be passed to get_labels()
voice – Arguments as they will be passed to get_labels()
harmony_layer – Arguments as they will be passed to get_labels()
positioning – Arguments as they will be passed to get_labels()
decode – Arguments as they will be passed to get_labels()
column_name – Arguments as they will be passed to get_labels()
- compare_labels(key: str = 'detached', new_color: str = 'ms3_darkgreen', old_color: str = 'ms3_darkred', detached_is_newer: bool = False, add_to_rna: bool = True, view_name: str | None = None, metadata_update: dict | None = None, force_metadata_update: bool = False) Tuple[int, int][source]#
Compare detached labels key to the ones attached to the Score to create a diff. By default, the attached labels are considered as the reviewed version and labels that have changed or been added in comparison to the detached labels are colored in green, whereas the previous versions of changed labels are attached to the Score in red, just like any deleted label.
- Parameters:
key – Key of the detached labels you want to compare to the ones in the score.
new_color – The colors by which new and old labels are differentiated. Identical labels remain unchanged. Colors can be CSS colors or MuseScore colors (see utils.MS3_COLORS).
old_color – The colors by which new and old labels are differentiated. Identical labels remain unchanged. Colors can be CSS colors or MuseScore colors (see utils.MS3_COLORS).
detached_is_newer – Pass True if the detached labels are to be added with new_color whereas the attached changed labels will turn old_color, as opposed to the default.
add_to_rna – By default, new labels are attached to the Roman Numeral layer. Pass False to attach them to the chord layer instead.
metadata_update – Dictionary containing metadata that is to be included in the comparison score. Notably, ms3 uses the key ‘compared_against’ when the comparison is performed against a given git_revision.
force_metadata_update – By default, the metadata is only updated if the comparison yields at least one difference to avoid outputting comparison scores not displaying any changes. Pass True to force the metadata update, which results in the property changed being set to True.
- Returns:
- Number of scores in which labels have changed.
- Number of scores in which no label has changed.
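To make the coloring rule concrete, here is a minimal sketch of the diff for a single score. It is an illustration under simplifying assumptions, not the library's code: labels are modelled as a hypothetical {position: label} dict with at most one label per position, and diff_labels is a made-up helper name. Only the default color names are taken from the signature above.

```python
from typing import Dict, List, Optional, Tuple


def diff_labels(
    attached: Dict[int, str],
    detached: Dict[int, str],
    new_color: str = "ms3_darkgreen",
    old_color: str = "ms3_darkred",
) -> List[Tuple[int, str, Optional[str]]]:
    """Return (position, label, color) triples; color is None for unchanged labels."""
    colored = []
    for pos in sorted(set(attached) | set(detached)):
        new, old = attached.get(pos), detached.get(pos)
        if new == old:
            # Identical labels remain uncolored.
            colored.append((pos, new, None))
            continue
        if new is not None:
            # Added or changed label in the reviewed (attached) version.
            colored.append((pos, new, new_color))
        if old is not None:
            # Previous version of a changed label, or a deleted label.
            colored.append((pos, old, old_color))
    return colored


attached = {1: "I", 2: "V7", 4: "I"}
detached = {1: "I", 2: "V", 3: "IV"}
result = diff_labels(attached, detached)
```

Passing the two dictionaries in swapped order mimics what detached_is_newer=True describes: the detached labels then receive new_color.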
- count_extensions(view_name: str | None = None, per_piece: bool = False, include_metadata: bool = False)[source]#
Count file extensions.
- Parameters:
keys (str or Collection, optional) – Key(s) for which to count file extensions. By default, all keys are selected.
ids (Collection) – If you pass a collection of IDs, keys is ignored and only the selected extensions are counted.
per_key (bool, optional) – If set to True, the results are returned as a dict {key: Counter}, otherwise the counts are summed up in one Counter.
per_subdir (bool, optional) – If set to True, the results are returned as {key: {subdir: Counter} }. per_key=True is therefore implied.
- Returns:
By default, the function returns a Counter of file extensions (Counters are converted to dicts). If per_key is set to True, a dictionary {key: Counter} is returned, separating the counts. If per_subdir is set to True, a dictionary {key: {subdir: Counter} } is returned.
- Return type:
- count_pieces(view_name: str | None = None) int[source]#
Number of selected pieces under the given view.
- disambiguate_facet(facet: Literal['scores', 'measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords', 'unknown'], view_name: str | None = None, ask_for_input=True) None[source]#
Calls the method on every selected corpus.
- get_dataframes(notes: bool = False, rests: bool = False, notes_and_rests: bool = False, measures: bool = False, events: bool = False, labels: bool = False, chords: bool = False, expanded: bool = False, form_labels: bool = False, cadences: bool = False, view_name: str | None = None, force: bool = False, choose: Literal['all', 'auto', 'ask'] = 'all', unfold: bool = False, interval_index: bool = False, flat=False, include_empty: bool = False) DataFrame | Dict[Tuple[str, str], Dict[str, List[Tuple[File, DataFrame]]] | List[Tuple[File, DataFrame]]][source]#
Renamed to
get_facets().
- get_facet(facet: Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords'], view_name: str | None = None, choose: Literal['auto', 'ask'] = 'auto', unfold: bool = False, interval_index: bool = False, concatenate: bool = True) Dict[str, Tuple[File, DataFrame]] | DataFrame[source]#
Retrieves exactly one DataFrame per piece, if available.
- get_view(view_name: str | None = None, **config) View[source]#
Retrieve an existing or create a new View object, potentially while updating the config.
- insert_detached_labels(view_name: str | None = None, key: str = 'detached', staff: int = None, voice: Literal[1, 2, 3, 4] = None, harmony_layer: Literal[0, 1, 2] | None = None, check_for_clashes: bool = True)[source]#
Attach all Annotations objects that are reachable via Score.key to their respective Score, altering the XML in memory. Calling store_parsed_scores() will output MuseScore files where the annotations show in the score.
- Parameters:
key – Key under which the Annotations objects to be attached are stored in the Score objects. Defaults to ‘detached’.
staff (int, optional) – If you pass a staff ID, the labels will be attached to that staff where 1 is the upper staff. By default, the staves indicated in the ‘staff’ column of ms3.annotations.Annotations.df will be used.
voice ({1, 2, 3, 4}, optional) – If you pass the ID of a notational layer (where 1 is the upper voice, blue in MuseScore), the labels will be attached to that one. By default, the notational layers indicated in the ‘voice’ column of ms3.annotations.Annotations.df will be used.
harmony_layer (int, optional) –
By default, the labels are written to the layer specified as an integer in the column harmony_layer. Pass an integer to select a particular layer:
* 0 to attach them as absolute (‘guitar’) chords, meaning that when opened next time, MuseScore will split and encode those beginning with a note name (resulting in ms3-internal harmony_layer 3).
* 1 to write the labels into the staff’s layer for Roman Numeral Analysis.
* 2 to have MuseScore interpret them as Nashville Numbers.
check_for_clashes (bool, optional) – By default, warnings are thrown when there already exists a label at a position (and in a notational layer) where a new one is attached. Pass False to deactivate these warnings.
- iter_corpora(view_name: str | None = None) Iterator[Tuple[str, Corpus]][source]#
Iterate through corpora under the current or specified view.
- iter_independent_corpora(view_name: str | None = None) Iterator[Tuple[str, Corpus]][source]#
Like iter_corpora() but creating new Corpus objects that are not stored in this Parse object to avoid filling up memory when parsing many files.
- load_ignored_warnings(path: str) None[source]#
Adds filters to all loggers included in an IGNORED_WARNINGS file.
- Parameters:
path – Path of the IGNORED_WARNINGS file.
- update_metadata_tsv_from_parsed_scores(root_dir: str | None = None, suffix: str = '', markdown_file: str | None = 'README.md', view_name: str | None = None) List[str][source]#
Gathers the metadata from parsed and currently selected scores and updates ‘metadata.tsv’ with the information.
- Parameters:
root_dir – In case you want to output the metadata to a folder different from corpus_path.
suffix – Added to the filename: ‘metadata{suffix}.tsv’. Defaults to ‘’. Metadata files with suffix may be used to store views with particular subselections of pieces.
markdown_file – By default, a subset of metadata columns will be written to ‘README.md’ in the same folder as the TSV file. If the file exists, it will be scanned for a line containing the string ‘# Overview’ and overwritten from that line onwards.
view_name – The view under which you want to update metadata from the selected parsed files. Defaults to None, i.e. the active view.
- Returns:
The file paths to which metadata was written.
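The README behaviour described for markdown_file (overwrite from the ‘# Overview’ line onwards) can be sketched as follows. This is a plain-Python illustration, not ms3's code; replace_overview is a hypothetical helper, and appending the table when no marker line exists is an assumption of this sketch.

```python
def replace_overview(readme_text: str, new_overview: str, marker: str = "# Overview") -> str:
    """Keep everything before the first line containing `marker`, then add the new table."""
    lines = readme_text.splitlines()
    kept = lines
    for i, line in enumerate(lines):
        if marker in line:
            # Everything from the marker line onwards is discarded.
            kept = lines[:i]
            break
    # Assumption: if no marker line is found, the overview is simply appended.
    return "\n".join(kept + [new_overview.rstrip("\n")]) + "\n"


old_readme = "# My corpus\n\nSome intro.\n\n# Overview\nold table\n"
new_table = "# Overview\n|file_name|\n|---|\n|piece01|"
updated = replace_overview(old_readme, new_table)
```

Any hand-written text placed below the ‘# Overview’ heading would be lost under this rule, so notes are safer above it.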
- update_score_metadata_from_tsv(view_name: str | None = None, force: bool = False, choose: Literal['all', 'auto', 'ask'] = 'all', write_empty_values: bool = False, remove_unused_fields: bool = False, write_text_fields: bool = False, update_instrumentation: bool = False) List[File][source]#
Update metadata fields of parsed scores with the values from the corresponding row in metadata.tsv.
- Parameters:
view_name
force
choose
write_empty_values – If set to True, existing values are overwritten even if the new value is empty, in which case the field will be set to ‘’.
remove_unused_fields – If set to True, all non-default fields that are not among the columns of metadata.tsv (anymore) are removed.
write_text_fields – If set to True, ms3 will write updated values from the columns title_text, subtitle_text, composer_text, lyricist_text, and part_name_text into the score headers.
update_instrumentation – Set to True to update the score’s instrumentation based on changed values from ‘staff_<i>_instrument’ columns.
- Returns:
List of File objects of those scores of which the XML structure has been modified.
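The interplay of write_empty_values and remove_unused_fields can be pictured with plain dictionaries. The following is a sketch of the documented semantics only: merge_metadata and the default_fields argument are hypothetical, and empty TSV cells are assumed to arrive as empty strings.

```python
from typing import Dict, Set


def merge_metadata(
    score_fields: Dict[str, str],
    tsv_row: Dict[str, str],
    default_fields: Set[str],
    write_empty_values: bool = False,
    remove_unused_fields: bool = False,
) -> Dict[str, str]:
    """Return the score's metadata fields after applying one metadata.tsv row."""
    updated = dict(score_fields)
    for field, value in tsv_row.items():
        if value == "" and not write_empty_values:
            # Empty cells do not overwrite existing values by default.
            continue
        updated[field] = value
    if remove_unused_fields:
        for field in list(updated):
            # Non-default fields without a corresponding TSV column are dropped.
            if field not in tsv_row and field not in default_fields:
                del updated[field]
    return updated


score = {"composer": "Bach", "workTitle": "Old title", "custom": "x"}
row = {"composer": "", "workTitle": "New title"}
defaults = {"composer", "workTitle"}

plain = merge_metadata(score, row, defaults)
emptied = merge_metadata(score, row, defaults, write_empty_values=True)
pruned = merge_metadata(score, row, defaults, remove_unused_fields=True)
```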
- update_scores(root_dir: str | None = None, folder: str = '.', suffix: str = '', overwrite: bool = False) List[str][source]#
Update scores created with an older MuseScore version to the latest MuseScore 3 version.
- Parameters:
root_dir – In case you want to create output paths for the updated MuseScore files based on a folder different from corpus_path.
folder –
The default ‘.’ has the updated scores written to the same directory as the old ones, effectively overwriting them if root_dir is None.
If folder is None, the files will be written to {root_dir}/scores/.
If folder is an absolute path, root_dir will be ignored.
If folder is a relative path starting with a dot ., the relative path is appended to the file’s subdir. For example, ../scores will resolve to a sibling directory of the one where the file is located.
If folder is a relative path that does not begin with a dot ., it will be appended to the root_dir.
suffix – String to append to the file names of the updated files, e.g. ‘_updated’.
overwrite – By default, existing files are not overwritten. Pass True to allow this.
- Returns:
A list of all up-to-date paths, whether they had to be converted or were already in the latest version.
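The four cases for folder can be summarized in a short sketch. It only mirrors the rules as documented and is not taken from the ms3 source: resolve_output_dir, file_subdir and default_subfolder are hypothetical names, and posixpath is used so that the results do not depend on the operating system.

```python
import posixpath
from typing import Optional


def resolve_output_dir(
    folder: Optional[str],
    root_dir: str,
    file_subdir: str,
    default_subfolder: str = "scores",
) -> str:
    """Resolve the output directory for one file (sketch of the documented rules)."""
    if folder is None:
        # Written to {root_dir}/scores/.
        return posixpath.join(root_dir, default_subfolder)
    if posixpath.isabs(folder):
        # Absolute paths make root_dir irrelevant.
        return folder
    if folder.startswith("."):
        # Relative to the subdirectory in which the file is located.
        return posixpath.normpath(posixpath.join(root_dir, file_subdir, folder))
    # Any other relative path is appended to root_dir.
    return posixpath.join(root_dir, folder)


same_dir = resolve_output_dir(".", "/corpus", "MS3")
sibling = resolve_output_dir("../scores", "/corpus", "MS3")
under_root = resolve_output_dir("updated", "/corpus", "MS3")
absolute = resolve_output_dir("/tmp/out", "/corpus", "MS3")
fallback = resolve_output_dir(None, "/corpus", "MS3")
```

With the default ‘.’, the resolved directory equals the file's own directory, which is why existing files are only replaced when overwrite is True.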
- update_tsvs_on_disk(facets: Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords'] | Collection[Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords']] = 'tsv', view_name: str | None = None, force: bool = False, choose: Literal['auto', 'ask'] = 'auto') List[str][source]#
Update existing TSV files corresponding to one or several facets with information freshly extracted from a parsed score, but only if the contents are identical. Otherwise, the existing TSV file is not overwritten and the differences are displayed in a log warning. The purpose is to safely update the format of existing TSV files, (for instance with respect to column order) making sure that the content doesn’t change.
- Parameters:
facets
view_name
force – By default, only TSV files that have already been parsed are updated. Set to True in order to force-parse for each facet one of the TSV files included in the given view, if necessary.
choose
- Returns:
List of paths that have been overwritten.
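The safety check described above (overwrite only when the content is identical, e.g. when merely the column order differs) could look roughly like this. It is a simplified sketch using the csv module, not the comparison ms3 performs; read_rows and same_content are hypothetical helpers that compare all cells as strings.

```python
import csv
import io
from typing import Dict, List


def read_rows(tsv_text: str) -> List[Dict[str, str]]:
    """Parse TSV text into a list of {column: value} dicts, one per row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [dict(row) for row in reader]


def same_content(old_tsv: str, new_tsv: str) -> bool:
    """True if both tables hold the same rows, regardless of column order."""
    return read_rows(old_tsv) == read_rows(new_tsv)


on_disk = "mc\tlabel\n1\tI\n2\tV\n"
reordered = "label\tmc\nI\t1\nV\t2\n"
changed = "label\tmc\nI\t1\nV7\t2\n"

safe_to_overwrite = same_content(on_disk, reordered)
must_keep_old = not same_content(on_disk, changed)
```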
- metadata_tsv(view_name: str | None = None) DataFrame[source]#
Concatenates the ‘metadata.tsv’ files of all corpora (as they come) into one DataFrame with a [corpus, piece] MultiIndex. If you need metadata that filters out pieces according to the current view, use metadata().
- store_extracted_facets(view_name: str | None = None, root_dir: str | None = None, measures_folder: str | None = None, notes_folder: str | None = None, rests_folder: str | None = None, notes_and_rests_folder: str | None = None, labels_folder: str | None = None, expanded_folder: str | None = None, form_labels_folder: str | None = None, cadences_folder: str | None = None, events_folder: str | None = None, chords_folder: str | None = None, metadata_suffix: str | None = None, markdown: bool = True, simulate: bool = False, unfold: bool = False, interval_index: bool = False)[source]#
Store facets extracted from parsed scores as TSV files.
- Parameters:
view_name
root_dir –
- (‘measures’, ‘notes’, ‘rests’, ‘notes_and_rests’, ‘labels’, ‘expanded’, ‘form_labels’, ‘cadences’,
’events’, ‘chords’)
measures_folder
notes_folder
rests_folder
notes_and_rests_folder
labels_folder
expanded_folder
form_labels_folder – Specify directory where to store the corresponding TSV files.
cadences_folder – Specify directory where to store the corresponding TSV files.
events_folder – Specify directory where to store the corresponding TSV files.
chords_folder – Specify directory where to store the corresponding TSV files.
metadata_suffix – Specify a suffix to update the ‘metadata{suffix}.tsv’ file for each corpus. For the main file, pass ‘’.
markdown – By default, when metadata_path is specified, a markdown file called README.md containing the columns [file_name, measures, labels, standard, annotators, reviewers] is created. If it exists already, this table will be appended or overwritten after the heading # Overview.
simulate
unfold – By default, repetitions are not unfolded. Pass True to duplicate values so that they correspond to a full playthrough, including correct positioning of first and second endings.
interval_index
Returns:
- store_parsed_scores(view_name: str | None = None, only_changed: bool = True, root_dir: str | None = None, folder: str = '.', suffix: str = '', overwrite: bool = False, simulate=False) Dict[str, List[str]][source]#
Stores all parsed scores under this view as MuseScore 3 files.
- Parameters:
view_name – Name of another view if another than the current one is to be used.
only_changed – By default, only scores that have been modified since parsing are written. Set to False to store all scores regardless.
root_dir – Directory where to re-build the sub-directory tree of the Corpus in question.
folder –
Different behaviours are available. Note that only the third option ensures that file paths are distinct for files that have identical pieces but are located in different subdirectories of the same corpus.
If folder is None (default), the files’ type will be appended to the root_dir.
If folder is an absolute path, root_dir will be ignored.
If folder is a relative path that does not begin with a dot ., it will be appended to the root_dir.
If folder is a relative path starting with a dot ., the relative path is appended to the file’s subdir. For example, ../notes will resolve to a sibling directory of the one where the file is located.
suffix – Suffix to append to the original file name.
overwrite – Pass True to overwrite existing files.
simulate – Set to True if no files are to be written.
- Returns:
Paths of the stored files.
- parse(view_name=None, level=None, parallel=True, only_new=True, labels_cfg={}, cols={}, infer_types=None, **kwargs)[source]#
Shorthand for executing parse_scores() and parse_tsv() in one go.
- parse_scores(level: str = None, parallel: bool = True, only_new: bool = True, labels_cfg: dict = {}, view_name: str = None, choose: Literal['all', 'auto', 'ask'] = 'all')[source]#
Parse MuseScore 3 files (MSCX or MSCZ) and store the resulting read-only Score objects. If they need to be writeable, e.g. for removing or adding labels, pass parallel=False, which takes longer but prevents having to re-parse at a later point.
- Parameters:
keys (str or Collection, optional) – For which key(s) to parse all MSCX files.
ids (Collection) – To parse only particular files, pass their IDs. keys and fexts are ignored in this case.
level ({'W', 'D', 'I', 'E', 'C', 'WARNING', 'DEBUG', 'INFO', 'ERROR', 'CRITICAL'}, optional) – Pass a level name for which (and above which) you want to see log records.
parallel (bool, optional) – Defaults to True, meaning that all CPU cores are used simultaneously to speed up the parsing. It implies that the resulting Score objects are in read-only mode and that you might not be able to use the computer during parsing. Pass False to parse one score after the other, which uses more memory but will allow making changes to the scores.
only_new (bool, optional) – By default, scores which have already been parsed are not parsed again. Pass False to parse them, too.
- Return type:
None
- parse_tsv(view_name=None, level=None, cols={}, infer_types=None, only_new=True, choose: Literal['all', 'auto', 'ask'] = 'all', **kwargs)[source]#
Parse TSV files (or other value-separated files such as CSV) to be able to do something with them.
- Parameters:
keys (str or Collection, optional) – Key(s) for which to parse all non-MSCX files. By default, all keys are selected.
ids (Collection) – To parse only particular files, pass their IDs. keys and fexts are ignored in this case.
fexts (str or Collection, optional) – If you want to parse only files with one or several particular file extension(s), pass the extension(s).
cols (dict, optional) – By default, if a column called 'label' is found, the TSV is treated as an annotation table and turned into an Annotations object. Pass one or several column name(s) to treat them as label columns instead. If you pass {} or no label column is found, the TSV is parsed as a “normal” table, i.e. a DataFrame.
infer_types (dict, optional) – To recognize one or several custom label type(s), pass {name: regEx}.
level ({'W', 'D', 'I', 'E', 'C', 'WARNING', 'DEBUG', 'INFO', 'ERROR', 'CRITICAL'}, optional) – Pass a level name for which (and above which) you want to see log records.
**kwargs – Arguments for pandas.DataFrame.to_csv(). Defaults to {'sep': '\t', 'index': False}. In particular, you might want to update the default dictionaries for dtypes and converters used in load_tsv().
- Returns:
None
Args – only_new: view_name:
- __iter__() Iterator[Tuple[str, Corpus]][source]#
Iterate through all (corpus_name, Corpus) tuples, regardless of any Views.
Yields: (corpus_name, Corpus) tuples
- property parsed_mscx: DataFrame#
Deprecated property. Replaced by
n_parsed_scores
- property parsed_tsv: DataFrame#
Deprecated property. Replaced by
n_parsed_tsvs
- add_detached_annotations(*args, **kwargs)[source]#
Deprecated method. Replaced by
insert_detached_labels().
- get_lists(*args, **kwargs)[source]#
Deprecated method. Replaced by
get_facets().
- iter(*args, **kwargs)[source]#
Deprecated method. Replaced by
ms3.corpus.Corpus.iter_facets().
- parse_mscx(*args, **kwargs)[source]#
Deprecated method. Replaced by
parse_scores().
- store_scores(*args, **kwargs)[source]#
Deprecated method. Replaced by
store_parsed_scores().
- update_metadata(*args, **kwargs)[source]#
Deprecated method. Replaced by
update_score_metadata_from_tsv().
The Corpus class#
- class ms3.corpus.Corpus(directory: str, view: View = None, only_metadata_pieces: bool = True, include_convertible: bool = False, include_tsv: bool = True, exclude_review: bool = True, file_re: Pattern | str | None = None, folder_re: Pattern | str | None = None, exclude_re: Pattern | str | None = None, paths: Collection[str] | None = None, labels_cfg={}, ms=None, **logger_cfg)[source]#
Collection of scores and TSV files that can be matched to each other based on their file names.
- name#
Folder name of the corpus.
- repo: Repo | None#
If the corpus is part of a git repository, this attribute holds the corresponding git.Repo object.
- files: List[File]#
[File] list of File data objects containing information on the file location etc. for all detected files.
- labels_cfg#
(dict) Configuration dictionary to determine the output format of labels and expanded tables. The dictionary is passed to Score upon parsing.
- ix2pname: Dict[int, str]#
{ix -> piece name} dict for associating files with the piece they have been matched to. None for indices that could not be matched, e.g. metadata.
- property pnames: List[str]#
All piece names including those of scores that are not listed in metadata.tsv
- add_dir(directory: str, filter_other_pieces: bool = False, file_re: str = '.*', folder_re: str = '.*', exclude_re: str = '^(\\.|_)') List[File][source]#
Add additional files pertaining to the already existing pieces of the corpus.
If you want to use a directory with other pieces, create another Corpus object or combine several corpora in a Parse object.
- Parameters:
directory – Directory to scan for parseable (score or TSV) files. Only those that begin with one of the corpus’s pieces will be matched and registered, the others will be kept under ix2orphan_file.
filter_other_pieces – Set to True if you want to filter out all pieces that were not matched up with one of the added files. This can be useful if you’re loading TSV files with labels and want to parse only the scores for which you have added labels.
file_re – Regular expressions for filtering certain file names or folder names. The regEx are checked with search(), not match(), allowing for fuzzy search.
folder_re – Regular expressions for filtering certain file names or folder names. The regEx are checked with search(), not match(), allowing for fuzzy search.
exclude_re – Exclude files and folders containing this regular expression.
- Returns:
List of File objects pertaining to the matched, newly added paths.
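The difference between search() and match() matters for how file_re and exclude_re behave. The sketch below uses only Python's re module; is_selected is a hypothetical helper, the default patterns are copied from the signature above, and applying the exclusion before the inclusion is an assumption of this example.

```python
import re


def is_selected(name: str, file_re: str = ".*", exclude_re: str = r"^(\.|_)") -> bool:
    """Keep names that contain file_re and do not contain exclude_re."""
    if re.search(exclude_re, name):
        # Default pattern: names starting with '.' or '_' are excluded.
        return False
    return re.search(file_re, name) is not None


names = ["K279-1.mscx", "_backup_K279.mscx", ".hidden.tsv", "K280-1.tsv"]
selected = [n for n in names if is_selected(n, file_re="279")]

# search() finds the pattern anywhere in the name; match() only at its beginning.
found_by_search = re.search("279", "K279-1.mscx") is not None
found_by_match = re.match("279", "K279-1.mscx") is not None
```

Because search() is used, a pattern such as 279 selects K279-1.mscx without needing a leading .* in the expression.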
- add_file_paths(paths: Collection[str]) List[File][source]#
Iterates through the given paths, converts those that correspond to parseable files to File objects (trying to infer their type from the path), and appends those to files.
- Parameters:
paths – File paths that are to be registered with this Corpus object.
- Returns:
A list of File objects corresponding to parseable files (based on their extensions).
- collect_fnames_from_scores() None[source]#
Construct sorted list of pieces from all detected scores.
- create_metadata_tsv(suffix='', view_name: str | None = None, overwrite: bool = False, force: bool = True) str | None[source]#
Creates a ‘metadata.tsv’ file for the current view.
- create_pieces(pnames: Collection[str] | str = None) None[source]#
Creates and stores one Piece object per piece.
- detect_parseable_files() None[source]#
Walks through the corpus_path and collects information on all parseable files.
- disambiguate_facet(facet: Literal['scores', 'measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords', 'unknown'], view_name: str | None = None, ask_for_input=True) None[source]#
Make sure that, for a given facet, the current view includes only one or zero files. If at least one piece has more than one file, the user will be asked which ones to use. The others will be excluded from the view.
- Parameters:
facet – Which facet to disambiguate.
ask_for_input – By default, if there is anything to disambiguate, the user is asked to select a group of files. Pass False to see only the questions and choices without actually disambiguating.
- extract_facets(facets: Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords'] | Collection[Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords']] = None, view_name: str | None = None, force: bool = False, choose: Literal['auto', 'ask'] = 'auto', unfold: bool = False, interval_index: bool = False, flat=False) Dict[str, Dict[str, List[Tuple[File, DataFrame]]] | List[Tuple[File, DataFrame]]][source]#
Retrieve a dictionary with the selected feature matrices extracted from the parsed scores. If you want to retrieve parsed TSV files, use get_all_parsed().
- find_and_load_metadata() None[source]#
Checks if a ‘metadata.tsv’ is present at the default path and parses it.
- pieces_in_metadata(metadata_ix: int | None = None) List[str][source]#
Piece names (file names without extension and suffix) serve as IDs for pieces. Retrieve those that are listed in the ‘metadata.tsv’ file for this corpus. The argument is simply self.metadata_ix and serves to cache the results for multiple metadata.tsv files.
- pieces_not_in_metadata() List[str][source]#
Piece names (file names without extension and suffix) serve as IDs for pieces. Retrieve those that are not listed in the ‘metadata.tsv’ file for this corpus.
- get_dataframes(notes: bool = False, rests: bool = False, notes_and_rests: bool = False, measures: bool = False, events: bool = False, labels: bool = False, chords: bool = False, expanded: bool = False, form_labels: bool = False, cadences: bool = False, view_name: str | None = None, force: bool = False, choose: Literal['all', 'auto', 'ask'] = 'all', unfold: bool = False, interval_index: bool = False, flat=False, include_empty: bool = False) Dict[str, Dict[str, Tuple[File, DataFrame]] | List[Tuple[File, DataFrame]]][source]#
Renamed to get_facets().
- get_facet(facet: Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords'], view_name: str | None = None, choose: Literal['auto', 'ask'] = 'auto', unfold: bool = False, interval_index: bool = False, concatenate: bool = True) Dict[str, Tuple[File, DataFrame]] | DataFrame[source]#
Retrieves exactly one DataFrame per piece, if available.
- get_facets(facets: Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords'] | Collection[Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords']] = None, view_name: str | None = None, force: bool = False, choose: Literal['all', 'auto', 'ask'] = 'all', unfold: bool = False, interval_index: bool = False, flat=False, include_empty: bool = False) Dict[str, Dict[str, Tuple[File, DataFrame]] | List[Tuple[File, DataFrame]]][source]#
- Parameters:
facets
view_name
force – Only relevant when choose='all'. By default, only scores and TSV files that have already been parsed are taken into account. Set force=True to force-parse all scores and TSV files selected under the given view.
choose
unfold
interval_index
flat
include_empty
Returns:
- get_all_pnames(pieces_in_metadata: bool = True, pieces_not_in_metadata: bool = True) List[str][source]#
Piece names (file names without extension and suffix) serve as IDs for pieces. Use this function to retrieve the comprehensive list, ignoring views.
- Parameters:
pieces_in_metadata – pieces that are listed in the ‘metadata.tsv’ file for this corpus, if present
pieces_not_in_metadata – pieces that are not listed in the ‘metadata.tsv’ file for this corpus
- Returns:
The file names included in ‘metadata.tsv’ and/or those of all other scores.
- get_pieces(view_name: str | None = None) List[str][source]#
Retrieve pieces included in the current or selected view.
- get_view(view_name: str | None = None, **config) View[source]#
Retrieve an existing or create a new View object, potentially while updating the config.
- iter_facets(facets: Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords'] | Collection[Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords']] = None, view_name: str | None = None, choose: Literal['auto', 'ask'] = 'auto', unfold: bool = False, interval_index: bool = False, include_files: bool = False) Iterator[source]#
Iterate through (piece, *DataFrame) tuples containing exactly one or zero DataFrames per requested facet.
- Parameters:
facets
view_name
choose
unfold
interval_index
include_files
- Returns:
(piece, *DataFrame) tuples containing exactly one or zero DataFrames per requested facet per piece.
- iter_pieces(view_name: str | None = None) Iterator[Tuple[str, Piece]][source]#
Iterate through (name, Piece) tuples under the current or specified view.
- load_facet_into_scores(facet: Literal['expanded', 'labels'], view_name: str | None = None, force: bool = False, choose: Literal['auto', 'ask'] = 'auto', git_revision: str | None = None, key: str = 'detached', infer: bool = True, **cols) int[source]#
Loads annotations from at most one TSV file into at most one score per piece. Each score will contain the annotations as a ‘detached’ annotation object accessible via the indicated key (defaults to ‘detached’).
- look_for_ignored_warnings(directory: str | None = None)[source]#
Looks for a text file called IGNORED_WARNINGS and, if it exists, loads it, configuring loggers as indicated.
- load_ignored_warnings(path: str) Tuple[List[Logger], List[str]][source]#
Loads in a text file containing warnings that are to be ignored, i.e., wrapped in DEBUG messages. The purpose is to mark certain warnings as OK, warranted by a human, to allow checks to pass regardless.
- load_metadata_file(file: File, allow_prefixed: bool = False) None[source]#
Loads the TSV file at the given path and stores it as metadata. If the file is called ‘metadata.tsv’ it will be treated as the corpus’ main file for determining pieces. Otherwise it is expected to be named ‘metadata{suffix}.tsv’ and the suffix will be used as name for an additionally created view.
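The naming convention ‘metadata{suffix}.tsv’ can be sketched with a regular expression. The helper metadata_suffix below is hypothetical and only illustrates how a suffix (and thus a view name) could be derived from a file name; it does not model the allow_prefixed option.

```python
import re
from typing import Optional


def metadata_suffix(file_name: str) -> Optional[str]:
    """Hypothetical helper: return the suffix of a 'metadata{suffix}.tsv'
    file name, '' for the main file, or None if the name does not fit."""
    m = re.fullmatch(r"metadata(.*)\.tsv", file_name)
    return None if m is None else m.group(1)


print(repr(metadata_suffix("metadata.tsv")))          # ''
print(repr(metadata_suffix("metadata_sonatas.tsv")))  # '_sonatas'
print(metadata_suffix("notes.tsv"))                   # None
```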
- parse(view_name=None, level=None, parallel=True, only_new=True, labels_cfg={}, cols={}, infer_types=None, **kwargs)[source]#
Shorthand for executing parse_scores and parse_tsv in one go.
- parse_mscx(*args, **kwargs)[source]#
Renamed to parse_scores().
- parse_scores(level: str = None, parallel: bool = True, only_new: bool = True, labels_cfg: dict = {}, view_name: str = None, choose: Literal['all', 'auto', 'ask'] = 'all')[source]#
Parse MuseScore 3 files (MSCX or MSCZ) and store the resulting read-only Score objects. If they need to be writeable, e.g. for removing or adding labels, pass parallel=False, which takes longer but prevents having to re-parse at a later point.
- Parameters:
level ({'W', 'D', 'I', 'E', 'C', 'WARNING', 'DEBUG', 'INFO', 'ERROR', 'CRITICAL'}, optional) – Pass a level name for which (and above which) you want to see log records.
parallel (bool, optional) – Defaults to True, meaning that all CPU cores are used simultaneously to speed up the parsing. It implies that the resulting Score objects are in read-only mode and that you might not be able to use the computer during parsing. Set to False to parse one score after the other, which uses more memory but will allow making changes to the scores.
only_new (bool, optional) – By default, scores that have already been parsed are not parsed again. Pass False to parse them, too.
- Return type:
None
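The phrase "for which (and above which)" follows the usual semantics of Python's logging levels. The following sketch, which assumes that the one-letter names are plain abbreviations of the standard level names, shows how such a value could be resolved and what it implies for log records; resolve_level and the logger name are hypothetical.

```python
import logging

# Assumption: the one-letter names abbreviate the standard logging levels.
ABBREVIATIONS = {"D": "DEBUG", "I": "INFO", "W": "WARNING", "E": "ERROR", "C": "CRITICAL"}
LEVELS = {
    "DEBUG": logging.DEBUG,
    "INFO": logging.INFO,
    "WARNING": logging.WARNING,
    "ERROR": logging.ERROR,
    "CRITICAL": logging.CRITICAL,
}


def resolve_level(level: str) -> int:
    """Hypothetical helper: turn 'W' or 'WARNING' into logging.WARNING."""
    name = level.upper()
    name = ABBREVIATIONS.get(name, name)
    return LEVELS[name]


logger = logging.getLogger("level_sketch")
logger.setLevel(resolve_level("W"))
print(logger.isEnabledFor(logging.ERROR))  # True: ERROR is above WARNING
print(logger.isEnabledFor(logging.INFO))   # False: INFO is below WARNING
```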
- parse_tsv(view_name: str | None = None, cols={}, infer_types=None, level=None, only_new: bool = True, choose: Literal['all', 'auto', 'ask'] = 'all', **kwargs)[source]#
Parse TSV files so that their contents become available as DataFrames or Annotations objects.
- Parameters:
keys (str or Collection, optional) – Key(s) for which to parse all non-MSCX files. By default, all keys are selected.
ids (Collection) – To parse only particular files, pass their IDs. keys and fexts are ignored in this case.
fexts (str or Collection, optional) – If you want to parse only files with one or several particular file extension(s), pass the extension(s).
cols (dict, optional) – By default, if a column called 'label' is found, the TSV is treated as an annotation table and turned into an Annotations object. Pass one or several column name(s) to treat them as label columns instead. If you pass {} or no label column is found, the TSV is parsed as a “normal” table, i.e. a DataFrame.
infer_types (dict, optional) – To recognize one or several custom label type(s), pass {name: regEx}.
level ({'W', 'D', 'I', 'E', 'C', 'WARNING', 'DEBUG', 'INFO', 'ERROR', 'CRITICAL'}, optional) – Pass a level name for which (and above which) you want to see log records.
**kwargs – Arguments for pandas.DataFrame.to_csv(). Defaults to {'sep': '\t', 'index': False}. In particular, you might want to update the default dictionaries for dtypes and converters used in load_tsv(). Passing kwargs prevents ms3 from parsing TSVs in parallel, so it will be a bit slower.
- Return type:
None
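The {name: regEx} mapping accepted by infer_types can be pictured as follows. This is only a sketch: the function infer_label_type, the two patterns, and the choice of re.match (anchoring at the start of the label) are assumptions made for illustration, not ms3's actual inference logic.

```python
import re


def infer_label_type(label, infer_types):
    """Hypothetical helper: return the name of the first pattern matching
    the label, or None if no pattern matches."""
    for name, regex in infer_types.items():
        if re.match(regex, label):
            return name
    return None


# Hypothetical custom label types
custom_types = {
    "cadence": r"(PAC|IAC|HC)$",
    "roman": r"[ivIV]+",
}

for label in ["PAC", "V7", "IAC", "Cmaj7"]:
    print(label, "->", infer_label_type(label, custom_types))
```

Because dictionaries preserve insertion order, the more specific pattern is listed first so that a label such as IAC is not claimed by the more general one.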
- register_files_with_pieces(files: List[File] | None = None, pnames: str | Collection[str] | None = None) None[source]#
Iterates through the files and tries to match each of them with the pieces, registering matched File objects with the corresponding Piece objects (unless already registered).
By default, the method uses this object’s files and pieces. To match with a Piece, the file name (without extension) needs to start with the Piece’s name; otherwise, it will be stored under ix2orphan_file.
- Parameters:
files – File objects to register with the corresponding Piece objects based on their file names.
pnames – Names of the pieces that the files are to be matched to. Those that don’t match any will be stored under ix2orphan_file.
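The prefix matching described above can be sketched in a few lines. The function match_files_to_pieces and the file names are hypothetical; trying longer piece names first is a design choice of this sketch (so that a name is not claimed by a shorter name that happens to be its prefix) and is not confirmed to be what ms3 does.

```python
def match_files_to_pieces(file_names, piece_names):
    """Hypothetical sketch: assign each file to the piece whose name its
    stem starts with; files without a match are collected as orphans."""
    matched = {piece: [] for piece in piece_names}
    orphans = []
    # Assumption: prefer the longest matching piece name.
    candidates = sorted(piece_names, key=len, reverse=True)
    for file_name in file_names:
        stem = file_name.rsplit(".", 1)[0]  # file name without extension
        for piece in candidates:
            if stem.startswith(piece):
                matched[piece].append(file_name)
                break
        else:
            orphans.append(file_name)
    return matched, orphans


files = ["op01n1.mscx", "op01n10.mscx", "op01n1_reviewed.tsv", "readme.md"]
matched, orphans = match_files_to_pieces(files, ["op01n1", "op01n10"])
print(matched)
print(orphans)  # ['readme.md']
```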
- metadata(view_name: str | None = None, choose: Literal['auto', 'ask'] | None = None) DataFrame[source]#
Returns metadata.tsv but only for pieces included in the current or indicated view. If no TSV file is present, get metadata from the current scores.
- update_scores(root_dir: str | None = None, folder: str | None = '.', suffix: str = '', overwrite: bool = False) List[str][source]#
Update scores created with an older MuseScore version to the latest MuseScore 3 version.
- Parameters:
root_dir – In case you want to create output paths for the updated MuseScore files based on a folder different from corpus_path.
folder –
The default ‘.’ has the updated scores written to the same directory as the old ones, effectively overwriting them if root_dir is None.
If folder is None, the files will be written to {root_dir}/scores/.
If folder is an absolute path, root_dir will be ignored.
If folder is a relative path starting with a dot (.), the relative path is appended to the file’s subdir. For example, ../scores will resolve to a sibling directory of the one where the file is located.
If folder is a relative path that does not begin with a dot (.), it will be appended to the root_dir.
suffix – String to append to the file names of the updated files, e.g. ‘_updated’.
overwrite – By default, existing files are not overwritten. Pass True to allow this.
- Returns:
A list of all up-to-date paths, whether they had to be converted or were already in the latest version.
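The rules for folder can be summarized in a small path-resolution sketch. It assumes that root_dir falls back to corpus_path when it is None and that a file's subdir is its directory relative to that root; resolve_output_dir and the example paths are hypothetical, and posixpath is used only to keep the output platform-independent.

```python
import posixpath


def resolve_output_dir(corpus_path, subdir, root_dir=None, folder="."):
    """Hypothetical sketch of the rules described for `folder`."""
    root = corpus_path if root_dir is None else root_dir
    if folder is None:
        return posixpath.join(root, "scores")
    if posixpath.isabs(folder):
        return folder  # root_dir is ignored
    if folder.startswith("."):
        # relative to the file's own subdirectory
        return posixpath.normpath(posixpath.join(root, subdir, folder))
    return posixpath.join(root, folder)


print(resolve_output_dir("/corpus", "MS3"))                     # /corpus/MS3
print(resolve_output_dir("/corpus", "MS3", None, "../scores"))  # /corpus/scores
print(resolve_output_dir("/corpus", "MS3", "/out", None))       # /out/scores
print(resolve_output_dir("/corpus", "MS3", "/out", "/abs/dir")) # /abs/dir
print(resolve_output_dir("/corpus", "MS3", "/out", "updated"))  # /out/updated
```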
- update_tsvs_on_disk(facets: Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords'] | Collection[Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords']] = 'tsv', view_name: str | None = None, force: bool = False, choose: Literal['auto', 'ask'] = 'auto') List[str][source]#
Update existing TSV files corresponding to one or several facets with information freshly extracted from a parsed score, but only if the contents are identical. Otherwise, the existing TSV file is not overwritten and the differences are displayed in a log warning. The purpose is to safely update the format of existing TSV files (for instance, with respect to column order), making sure that the content doesn’t change.
- Parameters:
facets
view_name
force – By default, only TSV files that have already been parsed are updated. Set to True in order to force-parse for each facet one of the TSV files included in the given view, if necessary.
choose
- Returns:
List of paths that have been overwritten.
- insert_detached_labels(view_name: str | None = None, key: str = 'detached', staff: int = None, voice: Literal[1, 2, 3, 4] = None, harmony_layer: Literal[0, 1, 2] | None = None, check_for_clashes: bool = True) Tuple[int, int][source]#
Attach all Annotations objects that are reachable via Score.key to their respective Score, altering the XML in memory. Calling store_scores() will output MuseScore files where the annotations show in the score.
- Parameters:
key – Key under which the Annotations objects to be attached are stored in the Score objects. Defaults to ‘detached’.
staff (int, optional) – If you pass a staff ID, the labels will be attached to that staff, where 1 is the upper staff. By default, the staves indicated in the ‘staff’ column of ms3.annotations.Annotations.df will be used.
voice ({1, 2, 3, 4}, optional) – If you pass the ID of a notational layer (where 1 is the upper voice, blue in MuseScore), the labels will be attached to that one. By default, the notational layers indicated in the ‘voice’ column of ms3.annotations.Annotations.df will be used.
harmony_layer (int, optional) – By default, the labels are written to the layer specified as an integer in the column harmony_layer. Pass an integer to select a particular layer:
* 0 to attach them as absolute (‘guitar’) chords, meaning that when opened next time, MuseScore will split and encode those beginning with a note name (resulting in ms3-internal harmony_layer 3).
* 1 to write the labels into the staff’s layer for Roman Numeral Analysis.
* 2 to have MuseScore interpret them as Nashville Numbers.
check_for_clashes (bool, optional) – By default, warnings are thrown when there already exists a label at a position (and in a notational layer) where a new one is attached. Pass False to deactivate these warnings.
- change_labels_cfg(labels_cfg=(), staff=None, voice=None, harmony_layer=None, positioning=None, decode=None, column_name=None, color_format=None)[source]#
Update Corpus.labels_cfg and retrieve new ‘labels’ tables accordingly.
- Parameters:
labels_cfg (dict) – Using an entire dictionary or, to change only particular options, choose from:
staff – Arguments as they will be passed to get_labels()
voice – Arguments as they will be passed to get_labels()
harmony_layer – Arguments as they will be passed to get_labels()
positioning – Arguments as they will be passed to get_labels()
decode – Arguments as they will be passed to get_labels()
column_name – Arguments as they will be passed to get_labels()
- compare_labels(key: str = 'detached', new_color: str = 'ms3_darkgreen', old_color: str = 'ms3_darkred', detached_is_newer: bool = False, add_to_rna: bool = True, view_name: str | None = None, metadata_update: dict | None = None, force_metadata_update: bool = False) Tuple[int, int][source]#
Compare detached labels key to the ones attached to the Score to create a diff. By default, the attached labels are considered as the reviewed version, and labels that have changed or been added in comparison to the detached labels are colored in green, whereas the previous versions of changed labels are attached to the Score in red, just like any deleted label.
- Parameters:
key – Key of the detached labels you want to compare to the ones in the score.
new_color – The colors by which new and old labels are differentiated. Identical labels remain unchanged. Colors can be CSS colors or MuseScore colors (see utils.MS3_COLORS).
old_color – The colors by which new and old labels are differentiated. Identical labels remain unchanged. Colors can be CSS colors or MuseScore colors (see utils.MS3_COLORS).
detached_is_newer – Pass True if the detached labels are to be added with new_color whereas the attached changed labels will turn old_color, as opposed to the default.
add_to_rna – By default, new labels are attached to the Roman Numeral layer. Pass False to attach them to the chord layer instead.
metadata_update – Dictionary containing metadata that is to be included in the comparison score. Notably, ms3 uses the key ‘compared_against’ when the comparison is performed against a given git_revision.
force_metadata_update – By default, the metadata is only updated if the comparison yields at least one difference to avoid outputting comparison scores not displaying any changes. Pass True to force the metadata update, which results in the property changed being set to True.
- Returns:
Number of scores in which labels have changed. Number of scores in which no label has changed.
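The coloring logic of such a diff can be approximated with set operations. The sketch below is not ms3's algorithm: it represents each label as a hypothetical (position, label) tuple and only shows which labels would receive new_color, which old_color, and which stay unchanged, including the effect of detached_is_newer.

```python
def diff_labels(attached, detached, detached_is_newer=False):
    """Hypothetical sketch: classify labels as new, old, or unchanged."""
    if detached_is_newer:
        new, old = detached, attached
    else:
        new, old = attached, detached
    new_set, old_set = set(new), set(old)
    return {
        "new_color": sorted(new_set - old_set),  # changed or added labels
        "old_color": sorted(old_set - new_set),  # previous versions and deleted labels
        "unchanged": sorted(new_set & old_set),  # identical labels keep their color
    }


attached = [(1.0, "I"), (2.0, "V7"), (3.0, "I")]
detached = [(1.0, "I"), (2.0, "V"), (4.0, "IV")]
result = diff_labels(attached, detached)
print(result["new_color"])  # [(2.0, 'V7'), (3.0, 'I')]
print(result["old_color"])  # [(2.0, 'V'), (4.0, 'IV')]
print(result["unchanged"])  # [(1.0, 'I')]
```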
- count_annotation_layers(keys=None, which='attached', per_key=False)[source]#
Counts the labels for each annotation layer defined as (staff, voice, harmony_layer). By default, only labels attached to a score are counted.
- Parameters:
keys (str or Collection, optional) – Key(s) for which to count annotation layers. By default, all keys are selected.
which ({'attached', 'detached', 'tsv'}, optional) – ‘attached’: Counts layers from annotations attached to a score. ‘detached’: Counts layers from annotations that are in a Score object, but detached from the score. ‘tsv’: Counts layers from Annotation objects that have been loaded from or into annotation tables.
per_key (bool, optional) – If set to True, the results are returned as a dict {key: Counter}, otherwise the counts are summed up in one Counter. If which='detached', the keys are keys from Score objects, otherwise they are keys from this Corpus object.
- Returns:
By default, the function returns a Counter of labels for every annotation layer (staff, voice, harmony_layer). If per_key is set to True, a dictionary {key: Counter} is returned, separating the counts.
- Return type:
- count_pieces(view_name: str | None = None) int[source]#
Number of selected pieces under the given view.
- count_labels(keys=None, per_key=False)[source]#
Count label types.
- Parameters:
keys (str or Collection, optional) – Key(s) for which to count label types. By default, all keys are selected.
per_key (bool, optional) – If set to True, the results are returned as a dict {key: Counter}, otherwise the counts are summed up in one Counter.
- Returns:
By default, the function returns a Counter of label types. If per_key is set to True, a dictionary {key: Counter} is returned, separating the counts.
- Return type:
- count_tsv_types(keys=None, per_key=False)[source]#
Count inferred TSV types.
- Parameters:
keys (str or Collection, optional) – Key(s) for which to count inferred TSV types. By default, all keys are selected.
per_key (bool, optional) – If set to True, the results are returned as a dict {key: Counter}, otherwise the counts are summed up in one Counter.
- Returns:
By default, the function returns a Counter of inferred TSV types. If per_key is set to True, a dictionary {key: Counter} is returned, separating the counts.
- Return type:
- detach_labels(view_name: str | None = None, force: bool = False, choose: Literal['auto', 'ask'] = 'auto', key: str = 'removed', staff: int = None, voice: Literal[1, 2, 3, 4] = None, harmony_layer: Literal[0, 1, 2, 3] | None = None, delete: bool = True)[source]#
Calls Score.detach_labels() on every parsed score under the current or selected view.
- store_extracted_facets(view_name: str | None = None, root_dir: str | None = None, measures_folder: str | None = None, notes_folder: str | None = None, rests_folder: str | None = None, notes_and_rests_folder: str | None = None, labels_folder: str | None = None, expanded_folder: str | None = None, form_labels_folder: str | None = None, cadences_folder: str | None = None, events_folder: str | None = None, chords_folder: str | None = None, metadata_suffix: str | None = None, markdown: bool = True, simulate: bool = False, unfold: bool = False, interval_index: bool = False, frictionless: bool = True) List[str][source]#
Store facets extracted from parsed scores as TSV files.
- Parameters:
view_name
root_dir
measures_folder
notes_folder
rests_folder
notes_and_rests_folder
labels_folder
expanded_folder
form_labels_folder – Specify directory where to store the corresponding TSV files.
cadences_folder – Specify directory where to store the corresponding TSV files.
events_folder – Specify directory where to store the corresponding TSV files.
chords_folder – Specify directory where to store the corresponding TSV files.
metadata_suffix – Specify a suffix to update the ‘metadata{suffix}.tsv’ file for this corpus. For the main file, pass ‘’.
markdown – By default, when metadata_path is specified, a markdown file called README.md containing the columns [file_name, measures, labels, standard, annotators, reviewers] is created. If it exists already, this table will be appended or overwritten after the heading # Overview.
simulate
unfold – By default, repetitions are not unfolded. Pass True to duplicate values so that they correspond to a full playthrough, including correct positioning of first and second endings.
interval_index
frictionless – If True (default), the file is written together with a frictionless resource descriptor JSON file whose column schema is used to validate the stored TSV file.
- Returns:
A list of files stored to disk. If frictionless=True (default), it will be the list of descriptor file paths describing the stored TSV files (i.e., the list contains one file for every two files written to disk). Otherwise, it will be the list of TSV file paths.
- update_metadata_tsv_from_parsed_scores(root_dir: str | None = None, suffix: str = '', markdown_file: str | None = 'README.md', view_name: str | None = None) List[str][source]#
Gathers the metadata from parsed and currently selected scores and updates ‘metadata.tsv’ with the information.
- Parameters:
root_dir – In case you want to output the metadata to a folder different from corpus_path.
suffix – Added to the filename: ‘metadata{suffix}.tsv’. Defaults to ‘’. Metadata files with suffix may be used to store views with particular subselections of pieces.
markdown_file – By default, a subset of metadata columns will be written to ‘README.md’ in the same folder as the TSV file. If the file exists, it will be scanned for a line containing the string ‘# Overview’ and overwritten from that line onwards.
view_name – The view under which you want to update metadata from the selected parsed files. Defaults to None, i.e. the active view.
- Returns:
The file paths to which metadata was written.
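The README behavior described for markdown_file (scan for a line containing ‘# Overview’ and overwrite from there) can be sketched as a pure string operation. The function replace_overview and the sample texts are hypothetical; appending the new section when no such line exists is an assumption of this sketch.

```python
def replace_overview(readme_text, new_overview):
    """Hypothetical sketch: replace everything from the first line that
    contains '# Overview' onwards with new_overview."""
    lines = readme_text.splitlines()
    kept = lines  # assumption: append if the heading is not found
    for i, line in enumerate(lines):
        if "# Overview" in line:
            kept = lines[:i]
            break
    return "\n".join(kept + new_overview.splitlines()) + "\n"


readme = "# Corpus\n\nSome text.\n\n# Overview\nold table\n"
overview = "# Overview\n|file_name|measures|\n"
print(replace_overview(readme, overview))
```

Everything above the heading is preserved, which is why hand-written introductions in the README survive a metadata update.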
- update_score_metadata_from_tsv(view_name: str | None = None, force: bool = False, choose: Literal['all', 'auto', 'ask'] = 'all', write_empty_values: bool = False, remove_unused_fields: bool = False, write_text_fields: bool = False, update_instrumentation: bool = False) List[File][source]#
Update metadata fields of parsed scores with the values from the corresponding row in metadata.tsv.
- Parameters:
view_name
force
choose
write_empty_values – If set to True, existing values are overwritten even if the new value is empty, in which case the field will be set to ‘’.
remove_unused_fields – If set to True, all non-default fields that are not among the columns of metadata.tsv (anymore) are removed.
write_text_fields – If set to True, ms3 will write updated values from the columns title_text, subtitle_text, composer_text, lyricist_text, and part_name_text into the score headers.
update_instrumentation – Set to True to update the score’s instrumentation based on changed values from ‘staff_<i>_instrument’ columns.
- Returns:
List of File objects of those scores of which the XML structure has been modified.
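The interplay of write_empty_values and remove_unused_fields can be illustrated with plain dictionaries. This is a sketch under assumptions: merge_metadata, the field names, and the set of default fields are hypothetical and do not reflect how ms3 stores metadata internally.

```python
def merge_metadata(score_fields, tsv_row, default_fields,
                   write_empty_values=False, remove_unused_fields=False):
    """Hypothetical sketch of updating score metadata from a TSV row."""
    updated = dict(score_fields)
    for field, value in tsv_row.items():
        if value == "" and not write_empty_values:
            continue  # keep the existing value
        updated[field] = value
    if remove_unused_fields:
        # drop non-default fields that are not among the TSV columns
        updated = {k: v for k, v in updated.items()
                   if k in default_fields or k in tsv_row}
    return updated


score = {"composer": "Mozart", "workTitle": "Sonata", "old_tag": "x"}
row = {"composer": "", "workTitle": "Sonata K. 279"}
defaults = {"composer", "workTitle"}

print(merge_metadata(score, row, defaults))
print(merge_metadata(score, row, defaults, write_empty_values=True))
print(merge_metadata(score, row, defaults, remove_unused_fields=True))
```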
- store_parsed_scores(view_name: str | None = None, only_changed: bool = True, root_dir: str | None = None, folder: str = '.', suffix: str = '', overwrite: bool = False, simulate=False) List[str][source]#
Stores all parsed scores under this view as MuseScore 3 files.
- Parameters:
view_name
only_changed – By default, only scores that have been modified since parsing are written. Set to False to store all scores regardless.
root_dir
folder
suffix – Suffix to append to the original file name.
overwrite – Pass True to overwrite existing files.
simulate – Set to True if no files are to be written.
- Returns:
Paths of the stored files.
- ms3.corpus.parse_musescore_file(file: File, logger: Logger, logger_parent: Logger, labels_cfg: dict = {}, logger_cfg: dict = {}, read_only: bool = False, ms: str | None = None) Score[source]#
Performs a single parse and returns the resulting Score object or None.
- Parameters:
file – File object with path information of a score that can be opened (or converted) with MuseScore 3.
logger – Logger to be used within this function (not for the parsing itself).
logger_cfg – Logger config for the new Score object (and therefore for the parsing itself).
read_only – Pass True to return smaller objects that do not keep a copy of the original XML structure in memory. In order to make changes to the score after parsing, this needs to be False (default).
ms – MuseScore executable in case the file needs to be converted.
- Returns:
The parsed score.
The Piece class#
- class ms3.piece.Piece(pname: str, view: View | None = None, labels_cfg: dict | None = None, ms=None, **logger_cfg)[source]#
Wrapper around Score for associating it with parsed TSV files.
- facet2files: Dict[str, List[File]]#
{typ -> [File]} dict storing file information for associated types.
- ix2file: Dict[int, File]#
{ix -> File} dict storing the registered file information for access via index.
- facet2parsed: Dict[str, Dict[int, Score | DataFrame]]#
{typ -> {ix -> pandas.DataFrame | Score}} dict storing parsed files for associated types.
- ix2parsed: Dict[int, Score | DataFrame]#
{ix -> pandas.DataFrame | Score} dict storing the parsed files for access via index.
- ix2annotations: Dict[int, Annotations]#
{ix -> Annotations} dict storing Annotations objects for the parsed labels and expanded labels.
- labels_cfg#
dict: Configuration dictionary to determine the output format of labels and expanded tables. The dictionary is passed to Score upon parsing.
- all_facets_present(view_name: str | None = None, selected_facets: Literal['scores', 'measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords', 'unknown'] | Collection[Literal['scores', 'measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords', 'unknown']] | None = None) bool[source]#
Checks if parsed TSV files have been detected for all selected facets under the active or indicated view.
- Parameters:
view_name – Name of the view to check.
selected_facets – If passed, needs to be a subset of the facets selected by the view, otherwise the result will be False. If no selected_facets are passed, check for those selected by the active or indicated view.
- Returns:
True if for each selected facet at least one file has been registered.
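The subset rule for selected_facets can be sketched as follows. The function below is a hypothetical stand-alone re-statement of the documented behavior using a plain {facet: [files]} dictionary; it does not use ms3's View or File objects.

```python
def all_facets_present(detected, view_facets, selected_facets=None):
    """Hypothetical sketch: True if every selected facet has at least one
    registered file; False if selected_facets is not a subset of the
    facets selected by the view."""
    if selected_facets is None:
        selected = set(view_facets)
    else:
        selected = set(selected_facets)
        if not selected <= set(view_facets):
            return False
    return all(len(detected.get(facet, [])) > 0 for facet in selected)


detected = {"scores": ["K279-1.mscx"], "notes": ["K279-1.notes.tsv"], "labels": []}
view_facets = ["scores", "notes", "labels"]

print(all_facets_present(detected, view_facets))                       # False
print(all_facets_present(detected, view_facets, ["scores", "notes"]))  # True
print(all_facets_present(detected, view_facets, ["chords"]))           # False
```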
- score_metadata(view_name: str | None, choose: Literal['auto', 'ask'], as_dict: Literal[False]) Series[source]#
- score_metadata(view_name: str | None, choose: Literal['auto', 'ask'], as_dict: Literal[True]) dict
- Parameters:
choose
as_dict – Set to True to change the return type from pandas.Series to dict.
Returns:
- property tsv_metadata: Dict[str, str] | None#
If the Corpus has metadata_tsv, this field will contain the {column: value} pairs of the row pertaining to this piece.
- metadata(view_name: str | None = None) Series | None[source]#
If a row of ‘metadata.tsv’ has been stored, return that, otherwise extract from a (force-)parsed score.
- get_view(view_name: str | None = None, **config) View[source]#
Retrieve an existing or create a new View object, potentially while updating the config.
- change_labels_cfg(labels_cfg=(), staff=None, voice=None, harmony_layer=None, positioning=None, decode=None, column_name=None, color_format=None)[source]#
Update Piece.labels_cfg and retrieve new ‘labels’ tables accordingly.
- Parameters:
labels_cfg (dict) – Using an entire dictionary or, to change only particular options, choose from:
staff – Arguments as they will be passed to get_labels()
voice – Arguments as they will be passed to get_labels()
harmony_layer – Arguments as they will be passed to get_labels()
positioning – Arguments as they will be passed to get_labels()
decode – Arguments as they will be passed to get_labels()
column_name – Arguments as they will be passed to get_labels()
- compare_labels(key: str = 'detached', new_color: str = 'ms3_darkgreen', old_color: str = 'ms3_darkred', detached_is_newer: bool = False, add_to_rna: bool = True, view_name: str | None = None, metadata_update: dict | None = None, force_metadata_update: bool = False) Tuple[int, int][source]#
Compare detached labels
keyto the ones attached to the Score to create a diff. By default, the attached labels are considered as the reviewed version and labels that have changed or been added in comparison to the detached labels are colored in green; whereas the previous versions of changed labels are attached to the Score in red, just like any deleted label.- Parameters:
key – Key of the detached labels you want to compare to the ones in the score.
new_color – The colors by which new and old labels are differentiated. Identical labels remain unchanged. Colors can be CSS colors or MuseScore colors (see
utils.MS3_COLORS).old_color – The colors by which new and old labels are differentiated. Identical labels remain unchanged. Colors can be CSS colors or MuseScore colors (see
utils.MS3_COLORS).detached_is_newer – Pass True if the detached labels are to be added with
new_colorwhereas the attached changed labels will turnold_color, as opposed to the default.add_to_rna – By default, new labels are attached to the Roman Numeral layer. Pass False to attach them to the chord layer instead.
metadata_update –
- Dictionary containing metadata that is to be included in the comparison score. Notably, ms3 uses the key
’compared_against’ when the comparison is performed against a given git_revision.
- force_metadata_update:
By default, the metadata is only updated if the comparison yields at least one difference to avoid outputting comparison scores not displaying any changes. Pass True to force the metadata update, which results in the properts
changedbeing set to True.
- Returns:
Number of scores in which labels have changed. Number of scores in which no label has changed.
- count_detected(include_empty: bool = False, view_name: str | None = None, prefix: bool = False) Dict[str, int][source]#
Count how many files per facet have been detected.
- Parameters:
include_empty – By default, facets without files are not included in the dict. Pass True to include zero counts.
view_name
prefix – Pass True if you want the facets prefixed with ‘detected_’.
- Returns:
{facet -> count of detected files}
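The effect of include_empty and prefix on the returned dict can be pictured with a small stand-alone sketch. The function name count_per_facet and the plain dict of file lists are assumptions made for illustration; this is not the ms3 implementation.

```python
from typing import Dict, List


def count_per_facet(
    facet2files: Dict[str, List[str]],
    include_empty: bool = False,
    prefix: bool = False,
) -> Dict[str, int]:
    """Hypothetical stand-in that mimics the documented return value."""
    counts = {}
    for facet, files in facet2files.items():
        if not files and not include_empty:
            continue  # facets without files are left out by default
        name = f"detected_{facet}" if prefix else facet
        counts[name] = len(files)
    return counts


# Made-up example data: two scores, one notes TSV, no cadence files.
detected = {"scores": ["a.mscx", "b.mscx"], "notes": ["a.tsv"], "cadences": []}
```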
- extract_facets(facets: Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords'] | Collection[Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords']] = None, view_name: str | None = None, force: bool = False, choose: Literal['all', 'auto', 'ask'] = 'all', unfold: bool = False, interval_index: bool = False, flat=False) Dict[str, List[Tuple[File, DataFrame]]] | List[Tuple[File, DataFrame]][source]#
Retrieve a dictionary with the selected feature matrices extracted from the parsed scores. If you want to retrieve parsed TSV files, use
get_all_parsed().
- get_facets(facets: Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords'] | Collection[Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords']] = None, view_name: str | None = None, force: bool = False, choose: Literal['all', 'auto', 'ask'] = 'all', unfold: bool = False, interval_index: bool = False, flat=False) Dict[str, Tuple[File, DataFrame]] | List[Tuple[File, DataFrame]][source]#
Retrieve score facets both freshly extracted from parsed scores and from parsed TSV files, depending on the parameters and the view in question.
If choose != ‘all’, the goal will be to return one DataFrame per facet. Preference is given to a DataFrame freshly extracted from an already parsed score; otherwise, from an already parsed TSV file. If neither is available, preference will be given to a force-parsed TSV, then to a force-parsed score.
- Parameters:
facets
view_name
force – Only relevant when choose='all'. By default, only scores and TSV files that have already been parsed are taken into account. Set force=True to force-parse all scores and TSV files selected under the given view.
choose
unfold
interval_index
flat
Returns:
- get_facet(facet: Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords'], view_name: str | None = None, force: bool = False, choose: Literal['auto', 'ask'] = 'auto', unfold: bool = False, interval_index: bool = False) Tuple[File | None, DataFrame | None][source]#
Retrieve a DataFrame from a parsed score or, if unavailable, from a parsed TSV. If none have been parsed, first force-parse a TSV and, if not included in the given view, force-parse a score.
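The fallback order described for get_facets() and get_facet() can be written down as a simple priority list. The sketch below is an assumption-laden illustration: the tuple representation and the name pick_source are hypothetical and do not reflect how ms3 tracks files internally.

```python
from typing import List, Optional, Tuple

# Each candidate is (kind, already_parsed); kind is "score" or "tsv".
Candidate = Tuple[str, bool]

# Documented preference: parsed score, parsed TSV,
# then a TSV to be force-parsed, then a score to be force-parsed.
PREFERENCE: List[Candidate] = [
    ("score", True),
    ("tsv", True),
    ("tsv", False),
    ("score", False),
]


def pick_source(candidates: List[Candidate]) -> Optional[Candidate]:
    """Return the most preferred available candidate, or None."""
    for wanted in PREFERENCE:
        if wanted in candidates:
            return wanted
    return None
```

A candidate with already_parsed=False would first have to be force-parsed before a DataFrame could be returned.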
- get_file(facet: Literal['scores', 'measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords', 'unknown'], view_name: str | None = None, parsed: bool = True, unparsed: bool = True, choose: Literal['auto', 'ask'] = 'auto') File | None[source]#
- Parameters:
facet
choose
- Returns:
A {file_type -> [File]} dict containing the selected Files or, if flat=True, just a list.
- get_files(facets: Literal['scores', 'measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords', 'unknown'] | Literal['tsv', 'tsvs'] | Collection[Literal['scores', 'measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords', 'unknown']] = None, view_name: str | None = None, parsed: bool = True, unparsed: bool = True, choose: Literal['all', 'auto', 'ask'] = 'all', flat: bool = False, include_empty: bool = False) Dict[str, List[File]] | List[File][source]#
- Parameters:
facets
- Returns:
A {file_type -> [File]} dict containing the selected Files or, if flat=True, just a list.
- get_parsed(facet: Literal['scores', 'measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords', 'unknown'], view_name: str | None = None, choose: Literal['auto', 'ask'] = 'auto', git_revision: str | None = None, unfold: bool = False, interval_index: bool = False) Tuple[File | None, Score | DataFrame | None][source]#
Retrieve exactly one parsed score or TSV file. If none has been parsed, parse one automatically.
- Parameters:
facet
view_name
choose
git_revision
Returns:
- get_all_parsed(facets: Literal['scores', 'measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords', 'unknown'] | Literal['tsv', 'tsvs'] | Collection[Literal['scores', 'measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords', 'unknown']] = None, view_name: str | None = None, force: bool = False, choose: Literal['all', 'auto', 'ask'] = 'all', flat: bool = False, include_empty: bool = False, unfold: bool = False, interval_index: bool = False) Dict[Literal['scores', 'measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords', 'unknown'], List[Tuple[File, Score | DataFrame]]] | List[Tuple[File, Score | DataFrame]][source]#
Return multiple parsed files.
- iter_extracted_facet(facet: Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords'], view_name: str | None = None, force: bool = False, unfold: bool = False, interval_index: bool = False) Iterator[Tuple[File | None, DataFrame | None]][source]#
Iterate through the selected facet extracted from all parsed or yet-to-parse scores.
- iter_extracted_facets(facets: Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords'] | Collection[Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords']], view_name: str | None = None, force: bool = False, unfold: bool = False, interval_index: bool = False) Iterator[Tuple[File, Dict[str, DataFrame]]][source]#
Iterate through the selected facets extracted from all parsed or yet-to-parse scores.
- iter_facet2files(view_name: str | None = None, include_empty: bool = False) Iterator[Tuple[str, List[File]]][source]#
Iterating through facet2files under the current or specified view.
- iter_facet2parsed(view_name: str | None = None, include_empty: bool = False) Iterator[Dict[str, List[File]]][source]#
Iterating through facet2parsed under the current or specified view and selecting only parsed files.
- iter_files(facets: Literal['scores', 'measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords', 'unknown'] | Literal['tsv', 'tsvs'] | Collection[Literal['scores', 'measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords', 'unknown']] = None, view_name: str | None = None, parsed: bool = True, unparsed: bool = True, choose: Literal['all', 'auto', 'ask'] = 'all', flat: bool = False, include_empty: bool = False) Iterator[Dict[str, File]] | Iterator[List[File]][source]#
Equivalent to iterating through the result of
get_files().
- load_annotation_table_into_score(ix: int | None = None, df: DataFrame | None = None, view_name: str | None = None, choose: Literal['auto', 'ask'] = 'auto', key: str = 'detached', infer: bool = True, **cols) None[source]#
Attach an Annotations object to the score and make it available as Score.{key}. It can be an existing object or one newly created from the TSV file tsv_path.
- Parameters:
ix – Either pass the index of a TSV file containing annotations, or
df – A DataFrame containing annotations.
key – Specify a new key for accessing the set of annotations. The string needs to be usable as an identifier, e.g. not start with a number, not contain special characters etc. In return you may use it as a property: For example, passing
'chords'lets you access theAnnotationsasScore.chords. The key ‘annotations’ is reserved for all annotations attached to the score.infer – By default, the label types are inferred in the currently configured order (see
name2regex). Pass False to not add and not change any label types.**cols – If the columns in the specified TSV file diverge from the standard column names, pass them as standard_name=’custom name’ keywords.
- store_extracted_facet(facet: Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords'], root_dir: str | None = None, folder: str | None = None, view_name: str | None = None, force: bool = False, choose: Literal['all', 'auto', 'ask'] = 'all', unfold: bool = False, interval_index: bool = False, frictionless: bool = True, raise_exception: bool = True, write_or_remove_errors_file: bool = True)[source]#
Extract a facet from one or several available scores and store the results as TSV files, the paths of which are computed from the respective score’s location.
- Args:
facet:
- root_dir:
Defaults to None, meaning that the path is constructed based on the corpus_path. Pass a directory to construct the path relative to it instead. If folder is an absolute path, root_dir is ignored.
- folder:
If folder is None (default), the files’ type will be appended to the root_dir.
If folder is an absolute path, root_dir will be ignored.
If folder is a relative path starting with a dot (.), the relative path is appended to the file’s subdir. For example, ..\notes will resolve to a sibling directory of the one where the file is located.
If folder is a relative path that does not begin with a dot (.), it will be appended to the root_dir.
view_name: force: choose: unfold: interval_index:
- frictionless:
If True (default), the file is written together with a frictionless resource descriptor JSON file whose column schema is used to validate the stored TSV file.
- raise_exception:
If True (default) raise if the resource is not valid. Only relevant when frictionless=True (i.e., by default).
- write_or_remove_errors_file:
If True (default) write a .errors file if the resource is not valid, otherwise remove it if it exists. Only relevant when frictionless=True (i.e., by default).
Returns:
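The four folder rules listed for store_extracted_facet() can be sketched as a small path-resolution helper. This is a hypothetical illustration, not ms3 code: the name resolve_folder, the file_subdir argument, and the assumption that the file's subdir is taken relative to root_dir are all made up for the example. posixpath is used so that the results do not depend on the operating system.

```python
import posixpath
from typing import Optional


def resolve_folder(
    root_dir: str, file_subdir: str, facet: str, folder: Optional[str] = None
) -> str:
    """Hypothetical helper mirroring the documented folder rules."""
    if folder is None:
        # Rule 1: the file type (facet) is appended to root_dir.
        return posixpath.join(root_dir, facet)
    if posixpath.isabs(folder):
        # Rule 2: an absolute folder makes root_dir irrelevant.
        return folder
    if folder.startswith("."):
        # Rule 3: dot-relative paths are resolved against the file's subdir
        # (assumed here to live below root_dir).
        return posixpath.normpath(posixpath.join(root_dir, file_subdir, folder))
    # Rule 4: other relative paths are appended to root_dir.
    return posixpath.join(root_dir, folder)
```

With root_dir="/corpus" and a score located in the subdir "MS3", passing folder="../notes" yields the sibling directory "/corpus/notes".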
- store_parsed_score_at_ix(ix, root_dir: str | None = None, folder: str = '.', suffix: str = '', overwrite: bool = False, simulate=False) str | None[source]#
Creates a MuseScore file from the Score object at the given index.
- Parameters:
ix
folder
suffix – Suffix to append to the original file name.
root_dir
overwrite – Pass True to overwrite existing files.
simulate – Set to True if no files are to be written.
- Returns:
Path of the stored file.
- update_score_metadata_from_tsv(view_name: str | None = None, force: bool = False, choose: Literal['all', 'auto', 'ask'] = 'all', write_empty_values: bool = False, remove_unused_fields: bool = False, write_text_fields: bool = False, update_instrumentation: bool = False) List[File][source]#
Update metadata fields of parsed scores with the values from the corresponding row in metadata.tsv.
- Parameters:
view_name
force
choose
write_empty_values – If set to True, existing values are overwritten even if the new value is empty, in which case the field will be set to ‘’.
remove_unused_fields – If set to True, all non-default fields that are not among the columns of metadata.tsv (anymore) are removed.
write_text_fields – If set to True, ms3 will write updated values from the columns
title_text,subtitle_text,composer_text,lyricist_text, andpart_name_textinto the score header.update_instrumentation – Set to True to update the score’s instrumentation based on changed values from ‘staff_<i>_instrument’ columns.
- Returns:
List of File objects of those scores of which the XML structure has been modified.
- update_tsvs_on_disk(facets: Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords'] | Collection[Literal['measures', 'notes', 'rests', 'notes_and_rests', 'labels', 'expanded', 'form_labels', 'cadences', 'events', 'chords']] = 'tsv', view_name: str | None = None, force: bool = False, choose: Literal['auto', 'ask'] = 'auto') List[str][source]#
Update existing TSV files corresponding to one or several facets with information freshly extracted from a parsed score, but only if the contents are identical. Otherwise, the existing TSV file is not overwritten and the differences are displayed in a log warning. The purpose is to safely update the format of existing TSV files, (for instance with respect to column order) making sure that the content doesn’t change.
- Parameters:
facets
view_name
force – By default, only TSV files that have already been parsed are updated. Set to True in order to force-parse for each facet one of the TSV files included in the given view, if necessary.
choose
- Returns:
List of paths that have been overwritten.
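The "overwrite only if the content is identical" safeguard can be illustrated with the standard library alone. The sketch below is an assumption: read_tsv and safe_update are hypothetical names, rows are compared as plain string dicts (so column order is ignored), and ms3's actual comparison of DataFrames may differ.

```python
import csv
from typing import Dict, List


def read_tsv(path: str) -> List[Dict[str, str]]:
    """Read a TSV file into a list of {column: value} rows."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f, delimiter="\t"))


def safe_update(path: str, columns: List[str], rows: List[Dict[str, str]]) -> bool:
    """Rewrite `path` using the new column order only if the content is unchanged.

    Returns True if the file was overwritten, False if it was left untouched.
    """
    if read_tsv(path) != rows:
        return False  # content differs: keep the existing file as it is
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.DictWriter(f, fieldnames=columns, delimiter="\t")
        writer.writeheader()
        writer.writerows(rows)
    return True
```

Because dict equality ignores key order, a file whose columns are merely ordered differently counts as identical and is rewritten in the new order.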
- get_dataframe(*args, **kwargs) None[source]#
Deprecated method. Replaced by get_parsed(), extract_facet(), and get_facet().
The View class#
- class ms3.view.View(view_name: str | None = 'all', only_metadata_pieces: bool = False, include_convertible: bool = True, include_tsv: bool = True, exclude_review: bool = False, **logger_cfg)[source]#
Object storing regular expressions and filter lists, storing and keeping track of things filtered out.
- is_default(relax_for_cli: bool = False) bool[source]#
Checks includes and excludes that may influence the selection of pieces. Returns True if the settings do not filter out any pieces. Only if relax_for_cli is set to True, the filters include_convertible and exclude_review are permitted, too.
- copy(new_name: str | None = None) View[source]#
Returns a copy of this view, i.e., a new View object.
- update_config(view_name: str | None = None, only_metadata_pieces: bool | None = None, include_convertible: bool | None = None, include_tsv: bool | None = None, exclude_review: bool | None = None, file_paths: str | Collection[str] | None = None, file_re: str | None = None, folder_re: str | None = None, exclude_re: str | None = None, folder_paths: str | Collection[str] | None = None, **logger_cfg)[source]#
Update the configuration of the View. This is a shorthand for issuing several calls to include() and exclude() at once.
- Parameters:
view_name – New name of the view.
only_metadata_pieces – Whether or not pieces that are not included in a metadata.tsv should be excluded.
include_convertible – Whether or not scores that need conversion via MuseScore before parsing should be included.
include_tsv – Whether or not TSV files should be included.
exclude_review – Whether or not files and folders that include ‘review’ should be excluded.
file_paths – The exact file names will be extracted and used as an exclusive filter, that is, all files that do not have one of these file names will be excluded. This applies regardless of any relative or absolute paths included in the argument.
file_re – Include only files whose file name includes this regular expression.
folder_re – Include only files from folders whose name includes this regular expression.
exclude_re – Exclude all files and folders whose name includes this regular expression.
folder_paths – Include only files from these folders.
**logger_cfg
Returns:
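How the include and exclude settings above combine can be sketched with plain regular expressions. This is a simplified, hypothetical stand-in (the function passes_filters is not part of ms3); it assumes that "includes this regular expression" corresponds to re.search and that exclude_review is a substring test for 'review'.

```python
import re
from typing import Optional


def passes_filters(
    file_name: str,
    folder: str,
    file_re: Optional[str] = None,
    folder_re: Optional[str] = None,
    exclude_re: Optional[str] = None,
    exclude_review: bool = False,
) -> bool:
    """Return True if the file would survive all configured filters."""
    if file_re is not None and not re.search(file_re, file_name):
        return False  # file name does not include the required pattern
    if folder_re is not None and not re.search(folder_re, folder):
        return False  # folder name does not include the required pattern
    for name in (file_name, folder):
        if exclude_re is not None and re.search(exclude_re, name):
            return False  # file or folder explicitly excluded
        if exclude_review and "review" in name:
            return False  # review files and folders are filtered out
    return True
```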
- check_token(category: Literal['corpora', 'folders', 'pieces', 'files', 'suffixes', 'facets', 'paths'], token: str) bool[source]#
Checks if a string pertaining to a certain category should be included in the view or not.
- check_file(file: File) Tuple[bool, str][source]#
Check if an individual File passes all filters w.r.t. its subdirectories, file name and suffix.
- Parameters:
file
- Returns:
False if file is to be discarded from this view. The criterion based on which the file is being excluded.
- filter_by_token(category: Literal['corpora', 'folders', 'pieces', 'files', 'suffixes', 'facets', 'paths'], tuples: Iterable[tuple]) Iterator[tuple][source]#
Filters out those tuples where the token (first element) does not pass _.check_token(category, token).
- filtered_tokens(category: Literal['corpora', 'folders', 'pieces', 'files', 'suffixes', 'facets', 'paths'], tokens: Collection[str]) List[str][source]#
Applies filter_by_token() to a collection of tokens.
- class ms3.view.DefaultView(view_name: str | None = 'default', only_metadata_pieces: bool = True, include_convertible: bool = False, include_tsv: bool = True, exclude_review: bool = True, **logger_cfg)[source]#
- ms3.view.create_view_from_parameters(only_metadata_pieces: bool = True, include_convertible: bool = False, include_tsv: bool = True, exclude_review: bool = True, file_paths=None, file_re=None, folder_re=None, exclude_re=None, level=None) View[source]#
From the arguments of an __init__ method, create either a DefaultView or a custom view.
The Score class#
- class ms3.score.Score(musescore_file=None, match_regex=['dcml', 'form_labels'], read_only=False, labels_cfg={}, parser='bs4', ms=None, **logger_cfg)[source]#
Object representing a score.
- ABS_REGEX = '^\\(?[A-G|a-g](b*|#*).*?(/[A-G|a-g](b*|#*))?$'#
strClass variable with a regular expression that recognizes absolute chord symbols in their decoded (string) form; they start with a note name.
- NASHVILLE_REGEX = '^(b*|#*)(\\d).*$'#
strClass variable with a regular expression that recognizes labels representing a Nashville numeral, which MuseScore is able to encode.
- RN_REGEX = '^$'#
str Class variable with a regular expression for Roman numerals that currently matches nothing because ms3 tries interpreting Roman numerals as DCML harmony annotations.
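To see what the three class-level patterns accept, they can be tried out directly with the re module. The regular expressions below are copied from the class variables above; the helper matching_patterns and the sample labels are made up for illustration.

```python
import re

# Patterns as given for Score.ABS_REGEX, Score.NASHVILLE_REGEX and Score.RN_REGEX.
ABS_REGEX = r"^\(?[A-G|a-g](b*|#*).*?(/[A-G|a-g](b*|#*))?$"
NASHVILLE_REGEX = r"^(b*|#*)(\d).*$"
RN_REGEX = r"^$"


def matching_patterns(label: str):
    """Return the names of all patterns that match the given label."""
    patterns = {
        "absolute": ABS_REGEX,
        "nashville": NASHVILLE_REGEX,
        "roman": RN_REGEX,
    }
    return [name for name, regex in patterns.items() if re.match(regex, label)]
```

A label such as "b3" matches both the absolute and the Nashville pattern (the leading "b" counts as a note name), which is one reason why the order of inference matters. "V7" matches none of the three, consistent with RN_REGEX matching nothing but the empty string.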
- convertible_formats = ('cap', 'capx', 'midi', 'mid', 'musicxml', 'mxl', 'xml')#
tupleFormats that have to be converted before parsing.
- parseable_formats = ('mscx', 'mscz', 'cap', 'capx', 'midi', 'mid', 'musicxml', 'mxl', 'xml')#
tupleFormats that ms3 can parse.
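The relation between the two tuples can be checked with a short sketch: every convertible format is parseable, but only 'mscx' and 'mscz' are read without conversion. The helper classify_file is hypothetical, and lowercasing the extension is an assumption made for the example.

```python
import os

# Values as listed for Score.convertible_formats and Score.parseable_formats.
CONVERTIBLE = ("cap", "capx", "midi", "mid", "musicxml", "mxl", "xml")
PARSEABLE = ("mscx", "mscz", "cap", "capx", "midi", "mid", "musicxml", "mxl", "xml")


def classify_file(path: str) -> str:
    """Classify a file by extension as 'native', 'needs conversion' or 'unsupported'."""
    ext = os.path.splitext(path)[1].lstrip(".").lower()
    if ext not in PARSEABLE:
        return "unsupported"
    return "needs conversion" if ext in CONVERTIBLE else "native"
```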
- read_only#
bool, optional Defaults toFalse, meaning that the parsing is slower and uses more memory in order to allow for manipulations of the score, such as adding and deleting labels. Set toTrueif you’re only extracting information.
- full_paths#
dict{KEY: {i: full_path}}dictionary holding the full paths of all parsed MuseScore and TSV files, including file names. Handled internally by_handle_path().
- paths#
dict{KEY: {i: file path}}dictionary holding the paths of all parsed MuseScore and TSV files, excluding file names. Handled internally by_handle_path().
- files#
dict{KEY: {i: file name with extension}}dictionary holding the complete file name of each parsed file, including the extension. Handled internally by_handle_path().
- fnames#
dict{KEY: {i: file name without extension}}dictionary holding the file name of each parsed file, without its extension. Handled internally by_handle_path().
- fexts#
dict{KEY: {i: file extension}}dictionary holding the file extension of each parsed file. Handled internally by_handle_path().
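The five path-related dictionaries store different slices of the same file path. The decomposition can be sketched as follows; split_path is a hypothetical helper, and whether ms3 stores the extension with or without the leading dot is not stated here, so the dot kept by splitext is an assumption.

```python
import posixpath


def split_path(full_path: str) -> dict:
    """Split a path into the components held by full_paths, paths, files, fnames, fexts."""
    path, file = posixpath.split(full_path)    # directory vs. file name
    fname, fext = posixpath.splitext(file)     # name vs. extension (with dot)
    return {
        "full_path": full_path,
        "path": path,
        "file": file,
        "fname": fname,
        "fext": fext,
    }
```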
- _detached_annotations#
dict{(key, i): Annotations object}dictionary for accessing all detachedAnnotationsobjects.
- _name2regex#
dictMapping names to their corresponding regex. Managed via the propertyname2regex. ‘dcml’: utils.DCML_REGEX,
- labels_cfg#
dictConfiguration dictionary to determine the output format of theAnnotationsobjects contained in the current object, especially when callingScore.mscx.labels(). The default options correspond to the default parameters ofAnnotations.get_labels().
- parser#
{‘bs4’} Currently only one XML parser has been implemented which uses BeautifulSoup 4.
- review_report#
pandas.DataFrameAfter callingcolor_non_chord_tones(), this DataFrame contains the expanded chord labels plus the six additional columns [‘n_colored’, ‘n_untouched’, ‘count_ratio’, ‘dur_colored’, ‘dur_untouched’, ‘dur_ratio’] representing the statistics of chord (untouched) vs. non-chord (colored) notes.
- comparison_report#
pandas.DataFrameDataFrame showing the labels modified (‘new’) and added (‘old’) bycompare_labels().
- property name2regex#
list or dict, optional The order in which label types are to be inferred. Assigning a new value results in a call to infer_types(). Passing a {label type: regex} dictionary is a shortcut to update type regexes or to add new ones. The inference will take place in the order in which they appear in the dictionary. To reuse an existing regex while updating others, you can refer to it as None, e.g. {'dcml': None, 'my_own': r'^(PAC|HC)$'}.
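The ordered inference and the None shortcut can be pictured with a stand-alone sketch. Everything here is hypothetical: resolve and infer_type are illustration helpers, and the 'dcml' pattern is a deliberately simplified placeholder rather than utils.DCML_REGEX.

```python
import re
from typing import Dict, Optional

# Placeholder for the registered types; the real DCML regex is far more elaborate.
registered = {"dcml": r"^(b|#)*[ivIV]+.*$"}


def resolve(update: Dict[str, Optional[str]], known: Dict[str, str]) -> Dict[str, str]:
    """Replace None values by the regex already registered under that name."""
    return {
        name: known[name] if regex is None else regex
        for name, regex in update.items()
    }


def infer_type(label: str, name2regex: Dict[str, str]) -> Optional[str]:
    """Return the first type whose regex matches; dicts keep insertion order."""
    for name, regex in name2regex.items():
        if re.match(regex, label):
            return name
    return None


# The example dictionary from the description above.
order = resolve({"dcml": None, "my_own": r"^(PAC|HC)$"}, registered)
```

With this order, "V7" is typed as 'dcml' and "PAC" falls through to 'my_own'; a label matching neither stays untyped.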
- property has_detached_annotations#
bool Is True as long as the score contains Annotations objects that are not attached to the MSCX object.
- attach_labels(key, staff=None, voice=None, harmony_layer=None, check_for_clashes=True, remove_detached=True)[source]#
Insert detached labels key into this score’s MSCX object.
- Parameters:
key (str) – Key of the detached labels you want to insert into the score.
staff (int, optional) – By default, labels are added to staves as specified in the TSV or to -1 (lowest). Pass an integer to specify a staff.
voice (int, optional) – By default, labels are added to voices (notational layers) as specified in the TSV or to 1 (main voice). Pass an integer to specify a voice.
harmony_layer (int, optional) – By default, the labels are written to the layer specified as an integer in the column harmony_layer. Pass an integer to select a particular layer:
* 0 to attach them as absolute (‘guitar’) chords, meaning that when opened next time, MuseScore will split and encode those beginning with a note name (resulting in ms3-internal harmony_layer 3).
* 1 to write the labels into the staff’s layer for Roman Numeral Analysis.
* 2 to have MuseScore interpret them as Nashville Numbers.
check_for_clashes (bool, optional) – Defaults to True, meaning that the positions where the labels will be inserted will be checked for existing labels.
remove_detached (bool, optional) – By default, the detached Annotations object is removed after successfully attaching it. Pass False to have it remain in detached state.
- Returns:
- change_labels_cfg(labels_cfg={}, staff=None, voice=None, harmony_layer=None, positioning=None, decode=None, column_name=None, color_format=None)[source]#
Update
Score.labels_cfgandMSCX.labels_cfg.- Parameters:
labels_cfg (
dict) – Using an entire dictionary or, to change only particular options, choose from:staff – Arguments as they will be passed to
get_labels()voice – Arguments as they will be passed to
get_labels()harmony_layer – Arguments as they will be passed to
get_labels()positioning – Arguments as they will be passed to
get_labels()decode – Arguments as they will be passed to
get_labels()
- check_labels(keys='annotations', regex=None, regex_name='dcml', **kwargs)[source]#
Tries to match the labels
keysagainst the givenregexor the one of the registeredregex_name. Returns wrong labels.- Parameters:
keys (
strorCollection, optional) – The key(s) of the Annotation objects you want to check. Defaults to ‘annotations’, the attached labels.regex (
str, optional) – Pass a regular expression against which to check the labels if you don’t want to use the one of an existingregex_nameor in order to register a new one on the fly by passing the new name asregex_name.regex_name (
str, optional) – To use the regular expression of a registered type, pass its name, defaults to ‘dcml’. Pass a new name and aregexto register a new label type on the fly.kwargs – Parameters passed to
check_labels().
- Returns:
Labels not matching the regex.
- Return type:
- color_non_chord_tones(color_name: str = 'red') DataFrame | None[source]#
Iterates through the attached labels, tries to interpret them as DCML harmony labels, colors the notes in the parsed score that are not expressed by the respective label for a score segment, and stores a report under
review_report.- Parameters:
color_name – Name the color that the non-chord tones should get, defaults to ‘red’. Name can be a CSS color or a MuseScore color (see
utils.MS3_COLORS).- Returns:
A coloring report which is the original df with the appended columns ‘n_colored’, ‘n_untouched’, ‘count_ratio’, ‘dur_colored’, ‘dur_untouched’, ‘dur_ratio’. They contain the counts and durations of the colored vs. untouched notes as well as the ratio of each pair. Note that the report does not take into account notes that reach into a segment, nor does it correct the duration of notes that reach into the subsequent segment.
- compare_labels(key: str = 'detached', new_color: str = 'ms3_darkgreen', old_color: str = 'ms3_darkred', detached_is_newer: bool = False, add_to_rna: bool = True, metadata_update: dict | None = None, force_metadata_update: bool = False) Tuple[int, int][source]#
Compare detached labels
keyto the ones attached to the Score to create a diff. By default, the attached labels are considered as the reviewed version and labels that have changed or been added in comparison to the detached labels are colored in green; whereas the previous versions of changed labels are attached to the Score in red, just like any deleted label.- Parameters:
key – Key of the detached labels you want to compare to the ones in the score.
new_color – The colors by which new and old labels are differentiated. Identical labels remain unchanged. Colors can be CSS colors or MuseScore colors (see
utils.MS3_COLORS).old_color – The colors by which new and old labels are differentiated. Identical labels remain unchanged. Colors can be CSS colors or MuseScore colors (see
utils.MS3_COLORS).detached_is_newer – Pass True if the detached labels are to be added with
new_colorwhereas the attached changed labels will turnold_color, as opposed to the default.add_to_rna – By default, new labels are attached to the Roman Numeral layer. Pass False to attach them to the chord layer instead.
metadata_update – Dictionary containing metadata that is to be included in the comparison score. Notably, ms3 uses the key ‘compared_against’ when the comparison is performed against a given git_revision.
force_metadata_update – By default, the metadata is only updated if the comparison yields at least one difference, to avoid outputting comparison scores that do not display any changes. Pass True to force the metadata update, which results in the property changed being set to True.
- Returns:
Number of attached labels that were not present in the old version and whose color has been changed. Number of added labels that are not present in the current version any more and which have been added as a consequence.
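The diff logic and the effect of detached_is_newer can be illustrated with set operations. This is a simplified sketch under stated assumptions: labels are reduced to (position, text) tuples, the name diff_labels is hypothetical, and the real method works on the score's XML and DataFrames rather than on sets.

```python
from typing import Set, Tuple

# Simplified stand-in for a label: (measure count, label text).
Label = Tuple[int, str]


def diff_labels(
    attached: Set[Label], detached: Set[Label], detached_is_newer: bool = False
) -> Tuple[Set[Label], Set[Label]]:
    """Return (labels that would get new_color, labels that would get old_color)."""
    newer, older = (detached, attached) if detached_is_newer else (attached, detached)
    new_labels = newer - older  # changed or added in the newer version
    old_labels = older - newer  # previous versions and deleted labels
    return new_labels, old_labels


# Made-up example: the label in measure 2 was changed, one was added, one deleted.
attached = {(1, "I"), (2, "V7"), (3, "I")}
detached = {(1, "I"), (2, "V"), (4, "IV")}
```

Identical labels, here (1, "I"), appear in neither set and would remain unchanged.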
- detach_labels(key, staff=None, voice=None, harmony_layer=None, delete=True, inverse=False, regex=None)[source]#
Detach all annotation labels from this score’s MSCX object or just a selection of them, without taking labels_cfg into account (don’t decode the labels). The extracted labels are stored as a new Annotations object that is accessible via Score.{key}. By default, delete is set to True, meaning that if you call store_scores() afterwards, the created MuseScore file will not contain the detached labels.
- Parameters:
key (
str) – Specify a new key for accessing the detached set of annotations. The string needs to be usable as an identifier, e.g. not start with a number, not contain special characters etc. In return you may use it as a property: For example, passing'chords'lets you access the detached labels asScore.chords. The key ‘annotations’ is reserved for all annotations attached to the score.staff (
int, optional) – Pass a staff ID to select only labels from this staff. The upper staff has ID 1.voice ({1, 2, 3, 4}, optional) – Can be used to select only labels from one of the four notational layers. Layer 1 is MuseScore’s main, ‘upper voice’ layer, coloured in blue.
harmony_layer (
intorstr, optional) – Select one of the harmony layers {0,1,2,3} to select only these.delete (
bool, optional) – By default, the labels are removed from the XML structure inMSCX. Pass False if you want them to remain. This could be useful if you only want to extract a subset of the annotations for storing them separately but without removing the labels from the score.
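The requirement that key be usable as an identifier can be checked up front with Python's own rules. The helper below is a hypothetical sketch, not part of ms3; rejecting Python keywords is an extra assumption that goes beyond what is stated above.

```python
import keyword

# 'annotations' is reserved for the labels attached to the score.
RESERVED = {"annotations"}


def is_usable_key(key: str) -> bool:
    """Check whether `key` could serve as an attribute name such as Score.chords."""
    return key.isidentifier() and not keyword.iskeyword(key) and key not in RESERVED
```

For example, 'chords' passes, whereas '2nd_pass' (starts with a number) and 'my-labels' (contains a special character) do not.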
- get_infer_regex()[source]#
- Returns:
Mapping of label types to the corresponding regular expressions in the order in which they are currently set to be inferred.
- Return type:
- get_labels(key: str | None = None, interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
DataFrame representing all Labels, i.e., all <Harmony> tags, of the score or another set of annotations. Corresponds to calling
get_labels()on the selected object (by default, the one representing labels attached to the score) with the current_labels_cfg. Comes with the columns quarterbeats, duration_qb, mc, mn, mc_onset, mn_onset, timesig, staff, voice, volta, harmony_layer, label, offset_x, offset_y, regex_match- Parameters:
key
interval_index – Pass True to replace the default
RangeIndexby anIntervalIndex.
- Returns:
DataFrame representing all Labels, i.e., all <Harmony> tags in the score.
- new_type(name, regex, description='', infer=True)[source]#
Declare a custom label type. A type consists of a name, a regular expression and, optionally, a description.
- Parameters:
regex (
str) – Regular expression that matches all labels of the custom type.description (
str, optional) – Human readable description that appears when calling the propertyScore.types.infer (
bool, optional) – By default, the labels of allAnnotationsobjects are matched against the new type. Pass False to not change any label’s type.
- load_annotations(tsv_path: str | None = None, anno_obj: Annotations | None = None, df: DataFrame | None = None, key: str = 'detached', infer: bool = True, **cols) None[source]#
Attach an
Annotationsobject to the score and make it available asScore.{key}. It can be an existing object or one newly created from the TSV filetsv_path.- Parameters:
tsv_path – If you want to create a new
Annotationsobject from a TSV file, pass its path.anno_obj – Instead, you can pass an existing object.
df – Or you can automatically create one from a given DataFrame.
key – Specify a new key for accessing the set of annotations. The string needs to be usable as an identifier, e.g. not start with a number, not contain special characters etc. In return you may use it as a property: For example, passing
'chords'lets you access theAnnotationsasScore.chords. The key ‘annotations’ is reserved for all annotations attached to the score.infer – By default, the label types are inferred in the currently configured order (see
name2regex). Pass False to not add and not change any label types.**cols – If the columns in the specified TSV file diverge from the standard column names, pass them as standard_name=’custom name’ keywords.
- store_annotations(key='annotations', tsv_path=None, **kwargs)[source]#
Save a set of annotations as a TSV file. While store_list stores attached labels only, this method can also store detached labels by passing a key.
- Parameters:
key (
str, optional) – Key of theAnnotationsobject which you want to output as TSV file. By default, the annotations attached to the score (key=’annotations’) are stored.tsv_path (
str, optional) – Path of the newly created TSV file including the file name. By default, the TSV file is stored next to tkwargs – Additional keyword arguments will be passed to the function
pandas.DataFrame.to_csv()to customise the format of the created file (e.g. to change the separator to commas instead of tabs, you would passsep=',').
- write_score_to_handler(file_handler)[source]#
Write the current
MSCXobject to a file handler. Just a shortcut forScore.mscx.write_score_to_handler().- Parameters:
file_handler – File handler to write to.
- store_score(filepath)[source]#
Store the current
MSCXobject attached to this score as uncompressed MuseScore file. Just a shortcut forScore.mscx.store_scores().- Parameters:
filepath – Path of the newly created MuseScore file, including the file name ending in ‘.mscx’. Compressed files (‘.mscz’) are not supported.
- _handle_path(path, key=None)[source]#
Puts the path into
paths, files, fnames, fextsdicts with the given key.
- parse_mscx(musescore_file=None, read_only=None, parser=None, labels_cfg={})[source]#
This method is called by
__init__()to parse the score. It checks the file extension and in the case of a compressed MuseScore file (.mscz), a temporary uncompressed file is generated which is removed after the parsing process. Essentially, parsing means to initiate aMSCXobject and to make it available asScore.mscxand, if the score includes annotations, to initiate anAnnotationsobject that can be accessed asScore.annotations. The method doesn’t systematically clean up data from a hypothetical previous parse.- Parameters:
musescore_file (
str, optional) – Path to the MuseScore file to be parsed.read_only (
bool, optional) – Defaults toFalse, meaning that the parsing is slower and uses more memory in order to allow for manipulations of the score, such as adding and deleting labels. Set toTrueif you’re only extracting information.parser ('bs4', optional) – The only XML parser currently implemented is BeautifulSoup 4.
labels_cfg (
dict, optional) – Store a configuration dictionary to determine the output format of theAnnotationsobject representing the currently attached annotations. SeeMSCX.labels_cfg.
- output_mscx(**kwargs) None[source]#
Deprecated method. Replaced by
store_score().
The MSCX class#
This class defines the user interface for accessing score information via Score.mscx.
It consists mainly of shortcuts for interacting with the parser in use, currently
BeautifulSoup exclusively.
- class ms3.score.MSCX(mscx_src, read_only=False, parser='bs4', labels_cfg={}, parent_score=None, **logger_cfg)[source]#
Object for interacting with the XML structure of a MuseScore 3 file. Is usually attached to a
Scoreobject and exposed asScore.mscx. An object is only created if a score was successfully parsed.- changed#
boolSwitches to True as soon as the original XML structure is changed. Does not automatically switch back to False.
- read_only#
bool, optional Shortcut forMSCX.parsed.read_only. Defaults toFalse, meaning that the parsing is slower and uses more memory in order to allow for manipulations of the score, such as adding and deleting labels. Set toTrueif you’re only extracting information.
- parser#
{‘bs4’} The currently used parser.
- labels_cfg#
dict Configuration dictionary to determine the output format of the Annotations object representing the labels that are attached to a score (stored as _annotations). The options correspond to the parameters of Annotations.get_labels().
- cadences(interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
pandas.DataFrameDataFrame representing all cadence annotations in the score.
- chords(mode: Literal['auto', 'strict'] = 'auto', interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
DataFrame of Chords representing all <Chord> tags contained in the MuseScore file (all <note> tags come within one) and attached score information and performance marks, e.g. lyrics, dynamics, articulations, slurs (see the explanation for the
modeparameter for more details). Comes with the columns quarterbeats, duration_qb, mc, mn, mc_onset, mn_onset, timesig, staff, voice, duration, gracenote, tremolo, nominal_duration, scalar, volta, chord_id, dynamics, articulation, staff_text, slur, Ottava:8va, Ottava:8vb, pedal, TextLine, decrescendo_hairpin, diminuendo_line, crescendo_line, crescendo_hairpin, tempo, qpm, lyrics:1, Ottava:15mb- Parameters:
mode – Defaults to ‘auto’, meaning that additional performance markers available in the score are to be included, namely lyrics, dynamics, fermatas, articulations, slurs, staff_text, system_text, tempo, and spanners (e.g. slurs, 8va lines, pedal lines). This results in NaN values in the column ‘chord_id’ for those markers that are not part of a <Chord> tag, e.g. <Dynamic>, <StaffText>, or <Tempo>. To prevent that, pass ‘strict’, meaning that only <Chord> tags are included, i.e. the column ‘chord_id’ will have no empty values.
interval_index – Pass True to replace the default
RangeIndexby anIntervalIndex.
- Returns:
DataFrame of Chords representing all <Chord> tags contained in the MuseScore file.
- events(interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
DataFrame representing a raw skeleton of the score’s XML structure, containing all events encoded in it. It is the original tabular representation of the MuseScore file’s source code from which all other tables, except
measuresare generated.- Parameters:
interval_index – Pass True to replace the default
RangeIndexby anIntervalIndex.- Returns:
DataFrame containing the original tabular representation of all events encoded in the MuseScore file.
- expanded(interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
DataFrame representing Expanded labels, i.e., all annotations encoded in <Harmony> tags which could be matched against one of the registered regular expressions and split into feature columns. Currently this method is hard-coded to return expanded DCML harmony labels only but it takes into account the current
_labels_cfg. Comes with the columns quarterbeats, duration_qb, mc, mn, mc_onset, mn_onset, timesig, staff, voice, volta, label, alt_label, offset_x, offset_y, regex_match, globalkey, localkey, pedal, chord, numeral, form, figbass, changes, relativeroot, cadence, phraseend, chord_type, globalkey_is_minor, localkey_is_minor, chord_tones, added_tones, root, bass_note- Parameters:
interval_index – Pass True to replace the default
RangeIndexby anIntervalIndex.- Returns:
DataFrame representing all Labels, i.e., all <Harmony> tags in the score.
- property has_annotations#
boolShortcut forMSCX.parsed.has_annotations. Is True as long as at least one label is attached to the current XML.
- property n_form_labels#
intShortcut forMSCX.parsed.n_form_labels. Is True if at least one StaffText seems to constitute a form label.
- form_labels(detection_regex: str = None, exclude_harmony_layer: bool = False, interval_index: bool = False, expand: bool = True, unfold: bool = False) DataFrame | None[source]#
DataFrame representing form labels (or other) that have been encoded as <StaffText>s rather than in the <Harmony> layer. This function essentially filters all StaffTexts matching the
detection_regexand adds the standard position columns.- Parameters:
detection_regex – By default, detects all labels starting with one or two digits followed by a colon (see
the regex). Pass another regex to retrieve only StaffTexts matching this one.exclude_harmony_layer – By default, form labels are detected even if they have been encoded as Harmony labels (rather than as StaffText). Pass True in order to retrieve only StaffText form labels.
interval_index – Pass True to replace the default
RangeIndexby anIntervalIndex.
- Returns:
DataFrame containing all StaffTexts matching the
detection_regex
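As a rough illustration of the default detection described above, the following sketch filters a list of StaffText strings with a regular expression that matches one or two leading digits followed by a colon. The pattern `ASSUMED_FORM_REGEX` and the helper `filter_form_labels` are hypothetical stand-ins and not the expression that ms3 actually registers, which may well be more permissive.

```python
import re

# Assumption: a literal reading of "one or two digits followed by a colon".
# This is NOT taken from the ms3 source code.
ASSUMED_FORM_REGEX = re.compile(r"^\d{1,2}:")


def filter_form_labels(staff_texts, detection_regex=ASSUMED_FORM_REGEX):
    """Return only those strings that match the detection regex at their start."""
    pattern = re.compile(detection_regex)
    return [text for text in staff_texts if pattern.match(text)]
```

Passing a different `detection_regex` to this helper mirrors the documented option of retrieving only StaffTexts that match a custom expression.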
- labels(interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
DataFrame representing all Labels, i.e., all <Harmony> tags in the score, as returned by calling
get_labels()on the object at_annotationswith the current_labels_cfg. Comes with the columns quarterbeats, duration_qb, mc, mn, mc_onset, mn_onset, timesig, staff, voice, volta, harmony_layer, label, offset_x, offset_y, regex_match- Parameters:
interval_index – Pass True to replace the default
RangeIndexby anIntervalIndex.- Returns:
DataFrame representing all Labels, i.e., all <Harmony> tags in the score.
- measures(interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
DataFrame representing the Measures of the MuseScore file (which can be incomplete measures). Comes with the columns mc, mn, quarterbeats, duration_qb, keysig, timesig, act_dur, mc_offset, volta, numbering_offset, dont_count, barline, breaks, repeats, next
- Parameters:
interval_index – Pass True to replace the default
RangeIndexby anIntervalIndex.- Returns:
DataFrame representing the measures of the MuseScore file (which can be incomplete measures).
- offset_dict(all_endings: bool = False, unfold: bool = False) dict[source]#
{mc -> offset} dictionary measuring each MC’s distance from the piece’s beginning (0) in quarter notes.
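To make the idea of an {mc -> offset} mapping concrete, here is a minimal sketch that accumulates measure lengths into quarter-note offsets. It assumes that each measure’s actual duration is given as a fraction of a whole note and that the piece contains no repeats or voltas. The function name `simple_offset_dict` is hypothetical and the code is not the library’s implementation.

```python
from fractions import Fraction


def simple_offset_dict(act_durs):
    """Map each MC (starting at 1) to its distance from the beginning in quarter notes.

    `act_durs` holds the actual duration of each measure as a fraction of a
    whole note, in score order. Repeats and voltas are ignored in this sketch.
    """
    offsets = {}
    position = Fraction(0)
    for mc, act_dur in enumerate(act_durs, start=1):
        offsets[mc] = position
        position += Fraction(act_dur) * 4  # whole-note fractions -> quarter notes
    return offsets
```

For a piece in 3/4 that opens with a quarter-note upbeat, the second MC would thus start at offset 1 and the third at offset 4.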
- property metadata#
dictShortcut forMSCX.parsed.metadata. Metadata from and about the MuseScore file.
- notes(interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
DataFrame representing the Notes of the MuseScore file. Comes with the columns quarterbeats, duration_qb, mc, mn, mc_onset, mn_onset, timesig, staff, voice, duration, gracenote, tremolo, nominal_duration, scalar, tied, tpc, midi, volta, chord_id
- Parameters:
interval_index – Pass True to replace the default
RangeIndexby anIntervalIndex.- Returns:
DataFrame representing the Notes of the MuseScore file.
- notes_and_rests(interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
DataFrame representing the Notes and Rests of the MuseScore file. Comes with the columns quarterbeats, duration_qb, mc, mn, mc_onset, mn_onset, timesig, staff, voice, duration, gracenote, tremolo, nominal_duration, scalar, tied, tpc, midi, volta, chord_id
- Parameters:
interval_index – Pass True to replace the default
RangeIndexby anIntervalIndex.- Returns:
DataFrame representing the Notes and Rests of the MuseScore file.
- property parsed: _MSCX_bs4#
_MSCX_bs4Standard way of accessing the object exposed by the current parser.MSCXuses this object’s interface for requesting manipulations of and information from the source XML.
- rests(interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
DataFrame representing the Rests of the MuseScore file. Comes with the columns quarterbeats, duration_qb, mc, mn, mc_onset, mn_onset, timesig, staff, voice, duration, nominal_duration, scalar, volta
- Parameters:
interval_index – Pass True to replace the default
RangeIndexby anIntervalIndex.- Returns:
DataFrame representing the Rests of the MuseScore file.
- property staff_ids#
listofintThe staff IDs contained in the score, usually just a list of increasing numbers starting at 1.
- property style#
StyleCan be used like a dictionary to change the information within the score’s <Style> tag.
- add_labels(annotations_object)[source]#
Receives the labels from an
Annotationsobject and adds them to the XML structure representing the MuseScore file that might be written to a file afterwards.- Parameters:
annotations_object (
Annotations) – Object of labels to be added.- Returns:
Number of actually added labels.
- Return type:
- change_label_color(mc, mc_onset, staff, voice, label, color_name=None, color_html=None, color_r=None, color_g=None, color_b=None, color_a=None)[source]#
Shortcut for
MSCX.parsed.change_label_color- Parameters:
mc (
int) – Measure count of the labelmc_onset (
fractions.Fraction) – Onset position to which the label is attached.staff (
int) – Staff to which the label is attached.voice (
int) – Notational layer to which the label is attached.label (
str) – (Decoded) label.color_name (
str, optional) – Two ways of specifying the color.color_html (
str, optional) – Two ways of specifying the color.color_r (
intorstr, optional) – To specify a RGB color instead, pass at least, the first three.color_a(alpha = opacity) defaults to 255.color_g (
intorstr, optional) – To specify a RGB color instead, pass at least, the first three.color_a(alpha = opacity) defaults to 255.color_b (
intorstr, optional) – To specify a RGB color instead, pass at least, the first three.color_a(alpha = opacity) defaults to 255.color_a (
intorstr, optional) – To specify a RGB color instead, pass at least, the first three.color_a(alpha = opacity) defaults to 255.
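The RGB variant can be pictured with the following sketch, which normalizes the four components and lets the alpha channel fall back to 255 when it is omitted. The helper `resolve_rgba` is hypothetical, and the 0..255 range check is an assumption for illustration rather than documented behaviour.

```python
def resolve_rgba(color_r, color_g, color_b, color_a=None):
    """Return an (r, g, b, a) tuple of ints; alpha falls back to 255 (opaque)."""
    if color_a is None:
        color_a = 255
    # Components may arrive as int or str, so convert all of them to int.
    rgba = tuple(int(value) for value in (color_r, color_g, color_b, color_a))
    # Assumption: components are 8-bit values.
    if not all(0 <= value <= 255 for value in rgba):
        raise ValueError(f"Color components must lie in 0..255, got {rgba}")
    return rgba
```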
- change_labels_cfg(labels_cfg={}, staff=None, voice=None, harmony_layer=None, positioning=None, decode=None, column_name=None, color_format=None)[source]#
Update
MSCX.labels_cfg.- Parameters:
labels_cfg (
dict) – Using an entire dictionary or, to change only particular options, choose from:staff – Arguments as they will be passed to
get_labels()voice – Arguments as they will be passed to
get_labels()harmony_layer – Arguments as they will be passed to
get_labels()positioning – Arguments as they will be passed to
get_labels()decode – Arguments as they will be passed to
get_labels()
- color_non_chord_tones(df: DataFrame, color_name: str = 'red', chord_tone_cols: Collection[str] = ['chord_tones', 'added_tones'], color_nan: bool = True) DataFrame[source]#
Iterates backwards through the rows of the given DataFrame, interpreting each row as a score segment, and colors all notes that do not correspond to one of the tonal pitch classes (TPC) indicated in one of the tuples contained in the
chord_tone_cols. The columns ‘mc’ and ‘mc_onset’ are taken to indicate each score segment’s start, which reaches to the subsequent one (the last segment reaching to the end of the score). Only notes whose onsets lie within the respective segment are colored, meaning that those whose durations reach into a segment are not taken into account.- Parameters:
df – A DataFrame with the columns [‘mc’, ‘mc_onset’] +
chord_tone_colscolor_name – Name the color that the non-chord tones should get, defaults to ‘red’. Name can be a CSS color or a MuseScore color (see
utils.MS3_COLORS).
chord_tone_cols – Names of the columns containing tuples of chord tones, expressed as TPC. Note that in the expanded tables extracted by default, these columns correspond to intervals relative to the local tonic. The absolute representation required here can be obtained using
Annotations.expand_dcmlwithabsolute=True.color_nan – By default, if all of the
chord_tone_colscontain a NaN value, all notes in the segment will be colored. Pass False to add the segment to the previous one instead.
- Returns:
A coloring report which is the original
df with the appended columns ‘n_colored’, ‘n_untouched’, ‘count_ratio’, ‘dur_colored’, ‘dur_untouched’, ‘dur_ratio’. They contain the counts and durations of the colored vs. untouched notes as well as the ratio of each pair. Note that the report does not take into account notes that reach into a segment, nor does it correct the duration of notes that reach into the subsequent segment.
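The segment logic described for this method can be sketched without pandas as follows. Each segment starts at its (mc, mc_onset) position and reaches to the next one, and a note is flagged when its onset falls into a segment whose chord tones do not include the note’s TPC. The function `find_non_chord_tones` and its tuple-based inputs are hypothetical simplifications. The sketch neither colors anything nor handles NaN values, and it is not the library’s implementation.

```python
from fractions import Fraction


def find_non_chord_tones(segments, notes):
    """Return the notes whose TPC is not among the chord tones of their segment.

    segments: list of (mc, mc_onset, chord_tones), sorted by position.
    notes: list of (mc, mc_onset, tpc).
    A note belongs to the last segment starting at or before its onset; notes
    before the first segment are left untouched.
    """
    non_chord_tones = []
    for mc, mc_onset, tpc in notes:
        current = None
        for seg_mc, seg_onset, chord_tones in segments:
            if (seg_mc, seg_onset) <= (mc, mc_onset):
                current = chord_tones
            else:
                break
        if current is not None and tpc not in current:
            non_chord_tones.append((mc, mc_onset, tpc))
    return non_chord_tones
```

Because only onsets are compared, a note that starts before a segment and sounds into it is never evaluated against that segment, which matches the limitation stated above.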
- delete_labels(df)[source]#
Delete a set of labels from the current XML.
- Parameters:
df (
pandas.DataFrame) – A DataFrame with the columns [‘mc’, ‘mc_onset’, ‘staff’, ‘voice’]
- replace_labels(annotations_object)[source]#
- Parameters:
annotations_object (
Annotations) – Object of labels to be added.
- get_chords(staff=None, voice=None, mode='auto', lyrics=False, staff_text=False, dynamics=False, articulation=False, spanners=False, **kwargs)[source]#
Retrieve a customized chord list, e.g. one including less of the processed features or additional, unprocessed ones compared to the standard chord list.
- Parameters:
staff (
int) – Get information from a particular staff only (1 = upper staff)voice (
int) – Get information from a particular voice only (1 = only the first layer of every staff)mode ({'auto', 'all', 'strict'}, optional) –
‘auto’ (default), meaning that those aspects are automatically included that occur in the score; the resulting DataFrame has no empty columns except for those parameters that are set to True.
’all’: Columns for all aspects are created, even if they don’t occur in the score (e.g. lyrics).
’strict’: Create columns for exactly those parameters that are set to True, regardless which aspects occur in the score.
lyrics (
bool, optional) – Include lyrics.staff_text (
bool, optional) – Include staff text such as tempo markings.dynamics (
bool, optional) – Include dynamic markings such as f or p.articulation (
bool, optional) – Include articulation such as arpeggios.spanners (
bool, optional) – Include spanners such as slurs, 8va lines, pedal lines etc.**kwargs (
bool, optional) – Set a particular keyword to True in order to include all columns from the _events DataFrame whose names include that keyword. Column names include the tag names from the MSCX source code.
- Returns:
DataFrame representing all <Chord> tags in the score with the selected features.
- Return type:
- get_raw_labels()[source]#
Shortcut for
MSCX.parsed.get_raw_labels(). Retrieve a “raw” list of labels, meaning that label types reflect only those defined within <Harmony> tags which can be 1 (MuseScore’s Roman Numeral display), 2 (Nashville) or undefined (in the case of ‘normal’ chord labels, defaulting to 0).- Returns:
DataFrame with raw label features (i.e. as encoded in XML)
- Return type:
- infer_mc(mn, mn_onset=0, volta=None)[source]#
Shortcut for
MSCX.parsed.infer_mc(). Tries to convert a(mn, mn_onset)into a(mc, mc_onset)tuple on the basis of this MuseScore file. In other words, a human readable score position such as “measure number 32b (i.e., a second ending), beat 3” needs to be converted to(32, 1/2, 2)if “beat” has length 1/4, or–if the meter is, say 9/8 and “beat” has a length of 3/8– to(32, 6/8, 2). The resulting(mc, mc_onset)tuples are required for attaching a label to a score. This is only necessary for labels that were not originally extracted by ms3.- Parameters:
mn (
intorstr) – Measure number as in a reference print edition.mn_onset (
fractions.Fraction, optional) – Distance of the requested position from beat 1 of the complete measure (MN), expressed as fraction of a whole note. Defaults to 0, i.e. the position of beat 1.volta (
int, optional) – In the case of first and second endings, which bear the same measure number, a MN might have to be disambiguated by passing 1 for first ending, 2 for second, and so on. Alternatively, the MN can be disambiguated traditionally by passing it as string with a letter attached. In other words,infer_mc(mn=32, volta=1)is equivalent toinfer_mc(mn='32a').
- Returns:
int– Measure count (MC), denoting particular <Measure> tags in the score.
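The arithmetic behind the example above can be sketched with two small helpers that only prepare the arguments for a call. They turn a 1-based beat into an `mn_onset` and split a measure number such as '32b' into number and volta. The names `beat_to_mn_onset` and `split_mn` are hypothetical and not part of ms3. The actual MN-to-MC conversion depends on the parsed score and is not reproduced here.

```python
from fractions import Fraction
import re


def beat_to_mn_onset(beat, beat_length):
    """Distance of a 1-based beat from beat 1, as a fraction of a whole note."""
    return (beat - 1) * Fraction(beat_length)


def split_mn(mn):
    """Split a measure number such as '32b' into (32, 2); plain numbers give volta None."""
    match = re.fullmatch(r"(\d+)([a-z]?)", str(mn))
    if match is None:
        raise ValueError(f"Cannot interpret measure number {mn!r}")
    number, letter = match.groups()
    volta = ord(letter) - ord("a") + 1 if letter else None
    return int(number), volta
```

With a beat length of 1/4, beat 3 yields an onset of 1/2. With a beat length of 3/8 it yields 6/8, as in the description.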
- write_score_to_handler(file_handler: IO) bool[source]#
Shortcut for
MSCX.parsed.write_score_to_handler(). Write the current XML structure to a file handler.- Parameters:
file_handler – File handler to write to.
- Returns:
Whether the file was successfully created.
- store_score(filepath: str) bool[source]#
Shortcut for
MSCX.parsed.store_scores(). Store the current XML structure as uncompressed MuseScore file.- Parameters:
filepath – Path of the newly created MuseScore file, including the file name ending in ‘.mscx’. Compressed files (‘.mscz’) are not supported.
- Returns:
Whether the file was successfully created.
- store_excerpt(start_mc: int | None = None, start_mn: int | None = None, start_mc_onset: Fraction | float | None = None, end_mc: int | None = None, end_mn: int | None = None, end_mc_onset: Fraction | float | None = None, exclude_start: bool | None = False, exclude_end: bool | None = False, metronome_tempo: float | None = None, metronome_beat_unit: Fraction | None = Fraction(1, 4), directory: str | None = None, suffix: str | None = None)[source]#
Store an excerpt of the current score as a new .mscx file by defining start and end measure. If no end measure is specified, the excerpt will include everything following the start measure. The original score header and metadata are kept. Start and end measure both can be specified either as MC (the number in MuseScore’s status bar) or as MN (the number as displayed in the score).
- Parameters:
start_mc – Measure count of the first measure to be included in the excerpt. If
start_mcis given,start_mnmust be None.start_mn – Measure number of the first measure to be included in the excerpt. If
start_mnis given,start_mcmust be None.start_mc_onset – The starting onset value in the first measure. Every note with onset value strictly smaller than
start_mc_onsetwill be removed from the excerpt.end_mc – Measure count of the last measure to be included in the excerpt. If
end_mcis given,end_mnmust be None.end_mn – Measure number of the last measure to be included in the excerpt. If
end_mn is given, end_mc must be None.
end_mc_onset – The ending onset value in the last measure. Every note with onset value strictly greater than
end_mc_onsetwill be removed from the excerpt.exclude_start – If set to True, the note corresponding to
start_mc_onsetwill be removed as well.exclude_end – If set to True, the note corresponding to
end_mc_onset will be removed as well.
metronome_tempo – Optional[float], optional Setting this value will override the tempo at the beginning of the excerpt which, otherwise, is created automatically according to the tempo in effect at that moment in the score. This is achieved by inserting a hidden metronome marking with a value that depends on the specified “beats per minute”, where “beat” depends on the value of the
metronome_beat_unitparameter.metronome_beat_unit – Optional[Fraction | float], optional Defaults to 1/4, which stands for a quarter note. Please note that for now, the combination of beat unit and tempo is converted and expressed as quarter notes per minute in the (invisible) metronome marking. For example, specifying 1/8=100 will effectively result in 1/4=50 (which is equivalent).
directory – Path to the folder where the excerpts are to be stored.
suffix – String to be inserted in the excerpt’s filename: [suffix]_[start_mc]-[end_mc]
- Returns:
None if it was impossible to find a quarterbeat value for the given start measure.
In this case the function will not produce an excerpt.
- Return type:
Optional[None]
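The conversion mentioned for `metronome_tempo` and `metronome_beat_unit` (1/8=100 becoming 1/4=50) amounts to rescaling the tempo by the ratio of the beat unit to a quarter note. A minimal sketch, with the hypothetical helper name `to_quarters_per_minute`:

```python
from fractions import Fraction


def to_quarters_per_minute(metronome_tempo, metronome_beat_unit=Fraction(1, 4)):
    """Convert 'beat unit = tempo' into quarter notes per minute."""
    return metronome_tempo * Fraction(metronome_beat_unit) / Fraction(1, 4)
```

A dotted-quarter beat (3/8) at 60 beats per minute would accordingly correspond to 90 quarter notes per minute.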
- get_phrase_boundaries()[source]#
This method uses the expanded and unfolded labels to find all the phrase boundaries where a beginning is defined by an opening bracket { and the end is defined by a cadence. This cadence can either come with a closing bracket } or after the end of a phrase and before the beginning of the next one. The start and end point are also associated with onset values to precisely know the position of the labels within the measure in order to be able to trim “unrelated” notes later on.
- Returns:
A list of all unique maps that identify all phrases in the score. Each map has three keys: “mcs”, “start_onset”, “end_onset”. The first one corresponds to a tuple containing all the measure counts included in the phrase, the second is the onset value of the starting label, and the last is the onset value of the ending label.
- store_phrase_excerpts(metronome_tempo: float | None = None, metronome_beat_unit: Fraction | float | None = Fraction(1, 4), directory: str | None = None, suffix: str | None = 'phrase', random_skip: bool | None = False)[source]#
Store excerpts based on the phrase annotations contained in the score, if any. For this purpose, the self.find_phrases() method is called; for each pair of start and end MC an excerpt will be stored. The resulting excerpts will be named
[original_filename]_phrase_[start_mc]-[end_mc].mscxby default or[original_filename]_[suffix]_[start_mc]-[end_mc].mscxifsuffixis specified.- Parameters:
metronome_tempo – Optional[float], optional Setting this value will override the tempo at the beginning of the excerpt which, otherwise, is created automatically according to the tempo in effect at that moment in the score. This is achieved by inserting a hidden metronome marking with a value that depends on the specified “beats per minute”, where “beat” depends on the value of the
metronome_beat_unitparameter.metronome_beat_unit – Optional[Fraction | float], optional Defaults to 1/4, which stands for a quarter note. Please note that for now, the combination of beat unit and tempo is converted and expressed as quarter notes per minute in the (invisible) metronome marking. For example, specifying 1/8=100 will effectively result in 1/4=50 (which is equivalent).
directory – Optional[str], optional name of the directory you want the excerpt saved to, by default None
suffix – Optional[str], optional It is the string “category identifier” of your excerpts. For instance the name of the output files will in general be
[original_filename]_[suffix]_[start_mc]-[end_mc].mscx
random_skip – Optional[bool], optional If True, the method randomly skips some of the extracted excerpts instead of storing them. Defaults to False.
- store_measures(included_mcs: Tuple[int, ...], start_mc_onset: Fraction | float | None = None, end_mc_onset: Fraction | float | None = None, exclude_start: bool | None = False, exclude_end: bool | None = False, metronome_tempo: float | None = None, metronome_beat_unit: Fraction | float | None = Fraction(1, 4), directory: str | None = None, suffix: str | None = None)[source]#
This method takes a tuple containing the MCs of the measures to be included in the excerpt. The method will infer the active global and local keys, relative to the excerpt, from the annotations. It will then store the excerpt in the given (or default) directory with the name
[original_filename]_[suffix]_[start_mc]-[end_mc].mscx.- Parameters:
included_mcs – Tuple[int] The mc values of the measures to be included in the excerpt
start_mc_onset – Optional[Fraction | float], optional The value of the chosen onset for the true start of the excerpt. If onset is
None or 0, the excerpt begins at the onset of the first included measure. If the value is different, for example 1/2 or .5, all notes with an onset strictly smaller than this value are removed from the first measure.
end_mc_onset – Optional[Fraction | float], optional Behaves analogously to the previous parameter: if it is set to None or to the value of the last onset in the measure, the excerpt finishes at the end of the last included measure. If the value is different, for example
1/2 or .5, all notes with an onset strictly greater than this value are removed from the last measure.
exclude_start – Optional[bool], optional If set to True, the note (in the first measure) with onset value equal to
start_mc_onset will also be removed, thus excluding the first onset (i.e. the start).
exclude_end – Optional[bool], optional If set to True, the note (in the last measure) with onset value equal to
end_mc_onset will also be removed, thus excluding the last onset (i.e. the end).
metronome_tempo – Optional[float], optional Setting this value will override the tempo at the beginning of the excerpt which, otherwise, is created automatically according to the tempo in effect at that moment in the score. This is achieved by inserting a hidden metronome marking with a value that depends on the specified “beats per minute”, where “beat” depends on the value of the
metronome_beat_unitparameter.metronome_beat_unit – Optional[Fraction | float], optional Defaults to 1/4, which stands for a quarter note. Please note that for now, the combination of beat unit and tempo is converted and expressed as quarter notes per minute in the (invisible) metronome marking. For example, specifying 1/8=100 will effectively result in 1/4=50 (which is equivalent).
directory – Optional[str], optional name of the directory you want the excerpt saved to, by default None
suffix – Optional[str], optional It is the string “category identifier” of your excerpts. For instance the name of the output files will in general be
[original_filename]_[suffix]_[start_mc]-[end_mc].mscx
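How the onset parameters and the two exclusion flags interact can be sketched on plain lists of onsets. The helpers `trim_first_measure` and `trim_last_measure` are hypothetical and operate on onset values only, whereas the actual method manipulates the score’s XML.

```python
from fractions import Fraction


def trim_first_measure(onsets, start_mc_onset=None, exclude_start=False):
    """Drop onsets before start_mc_onset (and the onset itself if exclude_start)."""
    if start_mc_onset is None:
        return list(onsets)
    start = Fraction(start_mc_onset)
    if exclude_start:
        return [onset for onset in onsets if onset > start]
    return [onset for onset in onsets if onset >= start]


def trim_last_measure(onsets, end_mc_onset=None, exclude_end=False):
    """Drop onsets after end_mc_onset (and the onset itself if exclude_end)."""
    if end_mc_onset is None:
        return list(onsets)
    end = Fraction(end_mc_onset)
    if exclude_end:
        return [onset for onset in onsets if onset < end]
    return [onset for onset in onsets if onset <= end]
```

In a 4/4 measure with quarter-note onsets 0, 1/4, 1/2 and 3/4, a `start_mc_onset` of 1/2 keeps the last two onsets, and adding `exclude_start=True` keeps only 3/4.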
- store_within_phrase_excerpts(metronome_tempo: float | None = None, metronome_beat_unit: Fraction | float | None = Fraction(1, 4), directory: str | None = None, suffix: str | None = 'within_phrase', random_skip: bool | None = False)[source]#
Extract random snippets from the given score. The snippets have the constraint that they must strictly lie within a phrase. This means that within this type of excerpt neither phrase beginnings nor phrase endings will be considered. Not even cadences. By default, it extracts all possible snippets and stores them at the optional directory path. The resulting excerpts will be named
[original_filename]_within_phrase_[start_mc]-[end_mc].mscx.- Parameters:
metronome_tempo – Optional[float], optional The value that the user wants to set as the tempo of the excerpts. The tag will be added to XML tree of the excerpt’s file and will have the desired tempo
metronome_beat_unit – Optional[Fraction | float], optional To obtain the correct value for the tempo it is important to specify the beat unit that corresponds to the given tempo value. Since MuseScore works in quarter-beats, the convention is that 1 indicates that the unit is the quarter beat and all other values are relative to this one (i.e. 1/2 would be the eighth note etc.)
directory – Optional[str], optional name of the directory you want the excerpt saved to, by default None
suffix – Optional[str], optional It is the string “category identifier” of your excerpts. For instance the name of the output files will in general be
[original_filename]_[suffix]_[start_mc]-[end_mc].mscx
random_skip – Optional[bool], optional If True, the method randomly skips some of the extracted excerpts instead of storing them. Defaults to False.
- store_phrase_endings(metronome_tempo: float | None = None, metronome_beat_unit: Fraction | float | None = Fraction(1, 4), max_excerpt_length: int | None = 2, directory: str | None = None, suffix: str | None = 'phrase_end', random_skip: bool | None = False)[source]#
Calls the self.find_phrase_endings() method to find all phrase endings contained in the score, then stores all corresponding excerpts. A phrase ending is specified to finish on a cadence and to start 2 MCs before the corresponding closing bracket that indicates the “end” of the phrase. The resulting excerpts will be named
[original_filename]_[suffix]_[start_mc]-[end_mc].mscx.- Parameters:
metronome_tempo – Optional[float], optional The value that the user wants to set as the tempo of the excerpts. The tag will be added to XML tree of the excerpt’s file and will have the desired tempo
metronome_beat_unit – Optional[Fraction | float], optional To obtain the correct value for the tempo it is important to specify the beat unit that corresponds to the given tempo value. Since MuseScore works in quarter-beats, the convention is that 1 indicates that the unit is the quarter beat and all other values are relative to this one (i.e. 1/2 would be the eighth note etc.)
max_excerpt_length – Optional[int], optional This parameter specifies the maximum number of measures to be included in the excerpt. For example, if max_excerpt_length is set to 3, all phrase endings excerpts will contain a max. of 3 measures.
directory – Optional[str], optional name of the directory you want the excerpt saved to, by default None
suffix – Optional[str], optional It is the string “category identifier” of your excerpts. For instance the name of the output files will in general be
[original_filename]_[suffix]_[start_mc]-[end_mc].mscx
random_skip – Optional[bool], optional If True, the method randomly skips some of the extracted excerpts instead of storing them. Defaults to False.
- store_random_excerpts(n_excerpts: int | None = None, mc_length: int | None = 2, metronome_tempo: float | None = None, metronome_beat_unit: Fraction | float | None = Fraction(1, 4), directory: str | None = None, suffix: str | None = 'random')[source]#
Method that stores
n_excerpts random excerpts, each mc_length measures long. If n_excerpts is not specified, the method will create the maximum possible number of different excerpts containing mc_length measures each.
- Parameters:
n_excerpts – The number of random excerpts to be created
mc_length – The allowed number of measures for each excerpt
metronome_tempo – The tempo value that the user might specify to overwrite the original piece tempo
metronome_beat_unit – Beat unit value that goes with the specified tempo value. Might be 1/4 if the unit is the quarter note, 1/8 if the unit is the eighth note and so on.
directory – Name of the directory into which the excerpts need to be stored
suffix – Suffix to be added to the name of the generated excerpts
- update_metadata(composer: str | None = None, workTitle: str | None = None, movementNumber: str | None = None, movementTitle: str | None = None, workNumber: str | None = None, poet: str | None = None, lyricist: str | None = None, arranger: str | None = None, copyright: str | None = None, creationDate: str | None = None, mscVersion: str | None = None, platform: str | None = None, source: str | None = None, translator: str | None = None, compared_against: str | None = None, **kwargs)[source]#
Update the metadata tags of the parsed score.
The Annotations class#
- class ms3.annotations.Annotations(tsv_path=None, df=None, cols={}, index_col=None, sep='\t', mscx_obj=None, infer_types=None, read_only=False, **logger_cfg)[source]#
Class for storing, converting and manipulating annotation labels.
- property harmony_layer_counts#
Returns the counts of the harmony_layers as dict.
- get_labels(staff: int | None = None, voice: Literal[1, 2, 3, 4] | None = None, harmony_layer: Literal[0, 1, 2, 3] | None = None, positioning: bool = False, decode: bool = True, drop: bool = False, inverse: bool = False, column_name: str | None = None, color_format: Literal['html', 'rgb', 'rgba', 'name'] | None = None, regex=None)[source]#
Returns a DataFrame of annotation labels.
- Parameters:
staff (int, optional) – Select harmonies from a given staff only. Pass staff=1 for the upper staff.
harmony_layer ({0, 1, 2, 3, 'dcml', ...}, optional) – If MuseScore’s harmony feature has been used, you can filter harmony types by passing
- 0 for unrecognized strings
- 1 for Roman Numeral Analysis
- 2 for Nashville Numbers
- 3 for encoded absolute chords
- ‘dcml’ for labels from the DCML harmonic annotation standard
- … self-defined types that have been added to self.regex_dict through the use of self.infer_types()
positioning (bool, optional) – Set to True if you want to include information about how labels have been manually positioned.
decode (bool, optional) – Set to False if you want to keep labels in harmony_layer 0, 2, and 3 in their original form as encoded by MuseScore (e.g., with root and bass as TPC (tonal pitch class) where C = 14 for layer 0).
drop (bool, optional) – Set to True to delete the returned labels from this object.
column_name (str, optional) – Can be used to rename the columns holding the labels.
color_format ({'html', 'rgb', 'rgba', 'name', None}) – If label colors are encoded, determine how they are displayed.
- expand_dcml(drop_others=True, warn_about_others=True, drop_empty_cols=False, chord_tones=True, relative_to_global=False, absolute=False, all_in_c=False, **kwargs)[source]#
Expands all labels where the regex_match has been inferred as ‘dcml’ and stores the DataFrame in self._expanded.
- Parameters:
drop_others (bool, optional) – Set to False if you want to keep labels in the expanded DataFrame whose regex_match is not ‘dcml’.
warn_about_others (bool, optional) – Set to False to suppress warnings about labels whose regex_match is not ‘dcml’. Is automatically set to False if drop_others is set to False.
drop_empty_cols (bool, optional) – Return without unused columns.
chord_tones (bool, optional) – Pass True if you want to add four columns that contain information about each label’s chord, added, root, and bass tones. The pitches are expressed as intervals relative to the respective chord’s local key or, if relative_to_global=True, to the globalkey. The intervals are represented as integers that represent stacks of fifths over the tonic, such that 0 = tonic, 1 = dominant, -1 = subdominant, 2 = supertonic etc.
relative_to_global (bool, optional) – Pass True if you want all labels expressed with respect to the global key. This levels and eliminates the features localkey and relativeroot.
absolute (bool, optional) – Pass True if you want to transpose the relative chord_tones to the global key, which makes them absolute so they can be expressed as actual note names. This implies prior conversion of the chord_tones (but not of the labels) to the global tonic.
all_in_c (bool, optional) – Pass True to transpose chord_tones to C major/minor. This performs the same transposition of chord tones as relative_to_global but without transposing the labels, too. This option clashes with absolute=True.
kwargs – Additional arguments are passed to get_labels() to define the original representation.
- Returns:
Expanded DCML labels
- Return type:
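The stack-of-fifths encoding of chord_tones can be illustrated with a small standalone sketch. It assumes the convention C = 0, G = 1, F = -1 for absolute pitch classes; the two helper names are hypothetical and do not belong to ms3. Relative tones (0 = tonic, 1 = dominant, -1 = subdominant) become absolute by adding the tonic's own position on the line of fifths, which is roughly what the absolute option describes.

```python
from typing import Iterable, List


def fifths_to_name(fifths: int) -> str:
    """Spell a position on the line of fifths (C = 0, G = 1, F = -1) as a note name."""
    shifted = fifths + 1  # make F the origin so every block of seven shares its accidentals
    letter = "FCGDAEB"[shifted % 7]
    accidentals = shifted // 7  # positive = sharps, negative = flats
    return letter + ("#" * accidentals if accidentals >= 0 else "b" * -accidentals)


def absolute_chord_tones(relative_tones: Iterable[int], tonic_fifths: int) -> List[str]:
    """Transpose tones given relative to a tonic into absolute note names."""
    return [fifths_to_name(tone + tonic_fifths) for tone in relative_tones]


# A major triad on the tonic is root 0, third 4, fifth 1 in stacked fifths.
# With D as tonic (D = 2 on the line of fifths) this spells D, F#, A.
print(absolute_chord_tones([0, 4, 1], tonic_fifths=2))
```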
The BeautifulSoup parser#
- class ms3.bs4_parser._MSCX_bs4(soup: BeautifulSoup, read_only: bool = False, logger_cfg: dict | None = None)[source]#
This sister class implements MSCX’s methods for a score parsed with beautifulsoup4.
- measure_nodes#
{staff -> {MC -> tag} }
- tags#
Nested dictionary allowing to access the score’s XML elements in a convenient and structured manner:
- {MC ->
- {staff ->
- {voice ->
- {mc_onset ->
- [{“name” -> str,
“duration” -> Fraction, “tag” -> bs4.Tag }, …
]
}
}
}
}
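To make the nesting of tags concrete, here is a minimal sketch that builds and walks a dictionary of the same shape. It uses None in place of the bs4.Tag objects and invented events, so it only illustrates the access pattern; add_event and iter_events are hypothetical helpers, not ms3 functions.

```python
from fractions import Fraction
from typing import Any, Dict, Iterator, Tuple


def add_event(tags: Dict, mc: int, staff: int, voice: int, mc_onset: Fraction,
              name: str, duration: Fraction, tag: Any = None) -> None:
    """Append one event at tags[mc][staff][voice][mc_onset], creating levels as needed."""
    events = (tags.setdefault(mc, {})
                  .setdefault(staff, {})
                  .setdefault(voice, {})
                  .setdefault(mc_onset, []))
    events.append({"name": name, "duration": duration, "tag": tag})


def iter_events(tags: Dict) -> Iterator[Tuple[int, int, int, Fraction, str]]:
    """Yield (mc, staff, voice, mc_onset, name) in sorted order of the four keys."""
    for mc, staves in sorted(tags.items()):
        for staff, voices in sorted(staves.items()):
            for voice, onsets in sorted(voices.items()):
                for mc_onset, events in sorted(onsets.items()):
                    for event in events:
                        yield mc, staff, voice, mc_onset, event["name"]


tags: Dict = {}
add_event(tags, 1, 1, 1, Fraction(0), "Chord", Fraction(1, 4))
add_event(tags, 1, 1, 1, Fraction(1, 4), "Rest", Fraction(1, 4))
add_event(tags, 1, 2, 1, Fraction(0), "Chord", Fraction(1, 2))
for row in iter_events(tags):
    print(row)
```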
- staff2drum_map: Dict[int, DataFrame]#
For each staff that is to be treated as a drumset score, keep a mapping from MIDI pitch (DataFrame index) to note and instrument features. The columns typically include [‘head’, ‘line’, ‘voice’, ‘name’, ‘stem’, ‘shortcut’]. When creating note tables, the ‘name’ column will be populated with the names here rather than note names.
- property has_voltas: bool#
Return True if the score includes first and second endings. Otherwise, no ‘volta’ columns will be added to facets.
- add_label(label, mc, mc_onset, staff=1, voice=1, **kwargs)[source]#
Adds a single label to the current XML in form of a new <Harmony> (and maybe also <location>) tag.
- Parameters:
label
mc
mc_onset
staff
voice
kwargs
- add_standard_cols(df: DataFrame) DataFrame[source]#
Ensures that the DataFrame’s first columns are [‘mc’, ‘mn’, (‘volta’), ‘timesig’, ‘mc_offset’]
- change_label_color(mc, mc_onset, staff, voice, label, color_name=None, color_html=None, color_r=None, color_g=None, color_b=None, color_a=None)[source]#
Change the color of an existing label.
- Parameters:
mc (int) – Measure count of the label.
mc_onset (fractions.Fraction) – Onset position to which the label is attached.
staff (int) – Staff to which the label is attached.
voice (int) – Notational layer to which the label is attached.
label (str) – (Decoded) label.
color_name (str, optional) – Two ways of specifying the color.
color_html (str, optional) – Two ways of specifying the color.
color_r (int or str, optional) – To specify an RGB color instead, pass at least the first three. color_a (alpha = opacity) defaults to 255.
color_g (int or str, optional) – To specify an RGB color instead, pass at least the first three. color_a (alpha = opacity) defaults to 255.
color_b (int or str, optional) – To specify an RGB color instead, pass at least the first three. color_a (alpha = opacity) defaults to 255.
color_a (int or str, optional) – To specify an RGB color instead, pass at least the first three. color_a (alpha = opacity) defaults to 255.
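The three ways of specifying a color (name, HTML string, or RGB(A) components) can be sketched as one normalization step. This is not the library's implementation: resolve_rgba and the two-entry name table are hypothetical, and applying the alpha default of 255 to names and HTML colors as well is an assumption.

```python
from typing import Optional, Tuple, Union

# Hypothetical lookup table for illustration; ms3's own color table is not reproduced here.
EXAMPLE_COLOR_NAMES = {"red": "ff0000", "blue": "0000ff"}

Channel = Union[int, str, None]


def resolve_rgba(color_name: Optional[str] = None, color_html: Optional[str] = None,
                 color_r: Channel = None, color_g: Channel = None,
                 color_b: Channel = None, color_a: Channel = None) -> Tuple[int, int, int, int]:
    """Normalize a name, an HTML string, or RGB(A) components to an (r, g, b, a) tuple."""
    if color_name is not None:
        color_html = EXAMPLE_COLOR_NAMES[color_name]
    if color_html is not None:
        html = color_html.lstrip("#")
        if len(html) != 6:
            raise ValueError("an HTML color needs to be a string of length 6")
        red, green, blue = (int(html[i:i + 2], 16) for i in (0, 2, 4))
    elif None not in (color_r, color_g, color_b):
        # Components may arrive as int or str, so convert them explicitly.
        red, green, blue = int(color_r), int(color_g), int(color_b)
    else:
        raise ValueError("pass a name, an HTML color, or at least color_r, color_g and color_b")
    alpha = 255 if color_a is None else int(color_a)  # opacity defaults to 255
    return red, green, blue, alpha


print(resolve_rgba(color_html="ff8000"))
print(resolve_rgba(color_r="10", color_g=20, color_b=30, color_a=128))
```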
- chords(mode: Literal['auto', 'strict'] = 'auto', interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
DataFrame of Chords representing all <Chord> tags contained in the MuseScore file (all <note> tags come within one) and attached score information and performance marks, e.g. lyrics, dynamics, articulations, slurs (see the explanation for the mode parameter for more details). Comes with the columns quarterbeats, duration_qb, mc, mn, mc_onset, mn_onset, timesig, staff, voice, duration, gracenote, tremolo, nominal_duration, scalar, volta, chord_id, dynamics, articulation, staff_text, slur, Ottava:8va, Ottava:8vb, pedal, TextLine, decrescendo_hairpin, diminuendo_line, crescendo_line, crescendo_hairpin, tempo, qpm, metronome_base, metronome_number, tempo_visible, lyrics:1, Ottava:15mb
- Parameters:
mode – Defaults to ‘auto’, meaning that additional performance markers available in the score are to be included, namely lyrics, dynamics, fermatas, articulations, slurs, staff_text, system_text, tempo, and spanners (e.g. slurs, 8va lines, pedal lines). This results in NaN values in the column ‘chord_id’ for those markers that are not part of a <Chord> tag, e.g. <Dynamic>, <StaffText>, or <Tempo>. To prevent that, pass ‘strict’, meaning that only <Chord> tags are included, i.e. the column ‘chord_id’ will have no empty values.
interval_index – Pass True to replace the default RangeIndex by an IntervalIndex.
- Returns:
DataFrame of Chords representing all <Chord> tags contained in the MuseScore file.
- cl(recompute: bool = False) DataFrame[source]#
Get the raw Chords without adding quarterbeat columns.
- color_notes(from_mc: int, from_mc_onset: Fraction, to_mc: int | None = None, to_mc_onset: Fraction | None = None, midi: List[int] = [], tpc: List[int] = [], inverse: bool = False, color_name: str | None = None, color_html: str | None = None, color_r: int | None = None, color_g: int | None = None, color_b: int | None = None, color_a: int | None = None) Tuple[List[Fraction], List[Fraction]][source]#
Colors all notes occurring in a particular score segment in one particular color, or only those (not) pertaining to a collection of MIDI pitches or Tonal Pitch Classes (TPC).
- Parameters:
from_mc – MC in which the score segment starts.
from_mc_onset – mc_onset where the score segment starts.
to_mc – MC in which the score segment ends. If not specified, the segment ends at the end of the score.
to_mc_onset – If to_mc is defined, the mc_onset where the score segment ends.
midi – Collection of MIDI numbers to use as a filter or an inverse filter (depending on inverse).
tpc – Collection of Tonal Pitch Classes (C=0, G=1, F=-1 etc.) to use as a filter or an inverse filter (depending on inverse).
inverse – By default, only notes where all specified filters (midi and/or tpc) apply are colored. Set to True to color only those notes where none of the specified filters match.
color_name – Specify the color either as a name, or as HTML color, or as RGB(A). Name can be a CSS color or a MuseScore color (see utils.MS3_COLORS).
color_html – Specify the color either as a name, or as HTML color, or as RGB(A). An HTML color needs to be a string of length 6.
color_r – If you specify the color as RGB(A), you also need to specify color_g and color_b.
color_g – If you specify the color as RGB(A), you also need to specify color_r and color_b.
color_b – If you specify the color as RGB(A), you also need to specify color_r and color_g.
color_a – If you have specified an RGB color, the alpha value defaults to 255 unless specified otherwise.
- Returns:
List of durations (in fractions) of all notes that have been colored. List of durations (in fractions) of all notes that have not been colored.
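The interplay of the midi, tpc and inverse arguments can be read as a small predicate applied to each note in the segment. The sketch below is only one reading of the parameter descriptions, with a hypothetical function name; in particular, treating a call without any filter as "color every note" is an assumption.

```python
from typing import Collection


def should_color(note_midi: int, note_tpc: int,
                 midi: Collection[int] = (), tpc: Collection[int] = (),
                 inverse: bool = False) -> bool:
    """Decide whether a note is colored, given optional MIDI and TPC filters."""
    checks = []
    if midi:
        checks.append(note_midi in midi)
    if tpc:
        checks.append(note_tpc in tpc)
    if inverse:
        # Color only notes where none of the specified filters match.
        return not any(checks)
    # Color only notes where all specified filters apply (vacuously true without filters).
    return all(checks)


# Middle C (MIDI 60, TPC 0) against a MIDI filter and a TPC filter for G (TPC 1).
print(should_color(60, 0, midi=[60]))
print(should_color(60, 0, midi=[60], tpc=[1]))
print(should_color(62, 2, midi=[60], tpc=[1], inverse=True))
```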
- delete_label(mc, staff, voice, mc_onset, empty_only=False)[source]#
Delete a label from a particular position (if there is one).
- Parameters:
mc (int) – Measure count.
staff – Staff in which to delete the label.
voice – Notational layer in which to delete the label.
mc_onset (fractions.Fraction) – Onset position of the label within the measure.
empty_only (bool, optional) – Set to True if you want to delete only empty harmonies. Since normally all labels at the defined position are deleted, this flag is needed to prevent deleting non-empty <Harmony> tags.
- Returns:
Whether a label was deleted or not.
- Return type:
- events(interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
DataFrame representing a raw skeleton of the score’s XML structure, containing all events encoded in it. It is the original tabular representation of the MuseScore file’s source code from which all other tables, except measures, are generated.
- Parameters:
interval_index – Pass True to replace the default RangeIndex by an IntervalIndex.
- Returns:
DataFrame containing the original tabular representation of all events encoded in the MuseScore file.
- form_labels(detection_regex: str = None, exclude_harmony_layer: bool = False, interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
DataFrame representing form labels (or other) that have been encoded as <StaffText>s rather than in the <Harmony> layer (see argument exclude_harmony_layer). This function essentially filters all StaffTexts matching the detection_regex and adds the standard position columns.
- Parameters:
detection_regex – By default, detects all labels starting with one or two digits followed by a colon (see the regex). Pass another regex to retrieve only StaffTexts matching this one.
exclude_harmony_layer – By default, form labels are detected even if they have been encoded as Harmony labels (rather than as StaffText). Pass True in order to retrieve only StaffText form labels.
interval_index – Pass True to replace the default RangeIndex by an IntervalIndex.
unfold – Pass True to retrieve a Dat
- Returns:
DataFrame containing all StaffTexts matching the
detection_regex
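The default detection rule is described only in prose here, so the following sketch uses an illustrative pattern that follows that prose literally (one or two digits at the start, immediately followed by a colon). The pattern, the helper name and the sample texts are assumptions; the regex actually shipped with ms3 may be more permissive.

```python
import re
from typing import Iterable, List, Pattern, Union

# Illustrative approximation of the described default, not the library's own constant.
FORM_LABEL_SKETCH = re.compile(r"^\d{1,2}:")


def filter_form_labels(staff_texts: Iterable[str],
                       detection_regex: Union[str, Pattern] = FORM_LABEL_SKETCH) -> List[str]:
    """Keep only those staff texts that match the detection regex."""
    pattern = re.compile(detection_regex)  # accepts a string or a compiled pattern
    return [text for text in staff_texts if pattern.search(text)]


texts = ["1: exposition", "dolce", "12: coda", "123: not a form label"]
print(filter_form_labels(texts))
print(filter_form_labels(["A: theme", "1: x"], r"^[A-Z]:"))
```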
- fl(detection_regex: str = None, exclude_harmony_layer=False) DataFrame[source]#
Get the raw Form labels (or other) that match the detection_regex, but without adding quarterbeat columns.
- Parameters:
detection_regex – By default, detects all labels starting with one or two digits followed by a colon (see the regex). Pass another regex to retrieve only StaffTexts matching this one.
- Returns:
DataFrame containing all StaffTexts matching the detection_regex, or None.
- get_chords(staff: int | None = None, voice: Literal[1, 2, 3, 4] | None = None, mode: Literal['auto', 'strict'] = 'auto', lyrics: bool = False, dynamics: bool = False, articulation: bool = False, staff_text: bool = False, system_text: bool = False, tempo: bool = False, spanners: bool = False, thoroughbass: bool = False, **kwargs) DataFrame[source]#
Retrieve a customized chord list, e.g. one including fewer of the processed features or additional, unprocessed ones.
- Parameters:
staff – Get information from a particular staff only (1 = upper staff)
voice – Get information from a particular voice only (1 = only the first layer of every staff)
mode –
- ‘auto’ (default): Those aspects that occur in the score are included automatically; the resulting DataFrame has no empty columns except for those parameters that are set to True.
- ‘strict’: Create columns for exactly those parameters that are set to True, regardless of whether they occur in the score or not (in which case the column will be empty).
lyrics – Include lyrics.
dynamics – Include dynamic markings such as f or p.
articulation – Include articulation such as arpeggios.
staff_text – Include expression text such as ‘dolce’ and free-hand staff text such as ‘div.’.
system_text – Include system text such as movement titles.
tempo – Include tempo markings.
spanners – Include spanners such as slurs, 8va lines, pedal lines etc.
thoroughbass – Include thoroughbass figures’ levels and durations.
**kwargs
- Returns:
DataFrame representing all <Chord> tags in the score with the selected features.
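The difference between the two mode values comes down to which feature columns end up in the result. The sketch below models that choice on plain sets; the function name and the feature sets are hypothetical, and no DataFrame is involved.

```python
from typing import Iterable, List


def select_feature_columns(present_in_score: Iterable[str], requested: Iterable[str],
                           mode: str = "auto") -> List[str]:
    """Return the feature columns that would be created under 'auto' or 'strict'."""
    if mode == "strict":
        # Exactly the requested features, whether or not they occur in the score.
        return sorted(set(requested))
    if mode == "auto":
        # Everything that occurs in the score, plus whatever was requested explicitly.
        return sorted(set(present_in_score) | set(requested))
    raise ValueError("mode needs to be 'auto' or 'strict'")


present = {"dynamics", "slur"}  # features found in a hypothetical score
print(select_feature_columns(present, {"lyrics"}, mode="auto"))
print(select_feature_columns(present, {"lyrics"}, mode="strict"))
```

In this toy example the lyrics column would be empty in both cases because the score contains no lyrics; only the explicit request causes it to be created.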
- get_texts(only_header: bool = True) Dict[str, str][source]#
Process <Text> nodes (normally attached to <Staff id=”1”>).
- get_instrumentation() Dict[str, str][source]#
Returns a {staff_<i>_instrument -> instrument_name} dict.
- make_excerpt(included_mcs: Tuple[int] | int, globalkey: str | None = None, localkey: str | None = None, start_mc_onset: Fraction | float | None = None, end_mc_onset: Fraction | float | None = None, exclude_start: bool | None = False, exclude_end: bool | None = False, metronome_tempo: float | None = None, metronome_beat_unit: Fraction | None = Fraction(1, 4), decompose_repeat_tags: bool | None = True) Excerpt[source]#
Create an excerpt by removing all <Measure> tags that are not selected in included_mcs. The order of the given integers is inconsequential because measures are always printed in the order in which they appear in the score. Also, it is assumed that the MCs are consecutive, i.e. there are no gaps between them; otherwise the excerpt will not show correct measure numbers and might be incoherent in terms of missing key and time signatures.
- Parameters:
included_mcs – List of measure counts to be included in the excerpt. Pass a single integer to get an excerpt from that MC to the end of the piece.
globalkey – If the excerpt has chord labels, make sure the first label starts with the given global key, e.g. ‘F#’ for F sharp major or ‘ab’ for A flat minor.
localkey – If the excerpt has chord labels, make sure the first label starts with the given local key, e.g. ‘I’ for the major tonic key or ‘#iv’ for the raised subdominant minor key or ‘bVII’ for the lowered subtonic major key.
start_mc_onset – Onset value (either Fraction or float) specified as the “true” start of the first measure. Every note with a strictly smaller onset value will be “removed” (i.e. mutated into a rest).
end_mc_onset – Onset value (either Fraction or float) specified as the “true” end of the last measure. Every note with a strictly greater onset value will be “removed” (i.e. mutated into a rest).
exclude_start – If set to True, the first note corresponding to start_mc_onset will also be “removed”.
exclude_end – If set to True, the last note corresponding to end_mc_onset will also be “removed”.
metronome_tempo – Optional[float], optional Setting this value will override the tempo at the beginning of the excerpt which, otherwise, is created automatically according to the tempo in force at that moment in the score. This is achieved by inserting a hidden metronome marking with a value that depends on the specified “beats per minute”, where “beat” depends on the value of the metronome_beat_unit parameter.
metronome_beat_unit – Optional[Fraction | float], optional Defaults to 1/4, which stands for a quarter note. Please note that for now, the combination of beat unit and tempo is converted and expressed as quarter notes per minute in the (invisible) metronome marking. For example, specifying 1/8=100 will effectively result in 1/4=50 (which is equivalent).
decompose_repeat_tags – If set to True, all tags referring to repeat-like structures will be removed from the XML tree to avoid possible “broken” structures within the excerpt.
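The conversion mentioned for metronome_beat_unit is plain arithmetic: the tempo is scaled by the ratio between the beat unit and a quarter note. A minimal sketch, with a hypothetical function name and beat units expressed as fractions of a whole note (1/4 = quarter, 1/8 = eighth):

```python
from fractions import Fraction
from typing import Union


def quarter_notes_per_minute(metronome_tempo: Union[int, float, Fraction],
                             metronome_beat_unit: Union[float, Fraction] = Fraction(1, 4)) -> Fraction:
    """Express 'beat unit = tempo' as quarter notes per minute."""
    return Fraction(metronome_tempo) * Fraction(metronome_beat_unit) / Fraction(1, 4)


# The example from the parameter description: 1/8 = 100 is equivalent to 1/4 = 50.
print(quarter_notes_per_minute(100, Fraction(1, 8)))
# A dotted quarter (3/8) at 60 beats per minute corresponds to 90 quarter notes per minute.
print(quarter_notes_per_minute(60, Fraction(3, 8)))
```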
- _make_measure_list(sections=True, secure=True, reset_index=True)[source]#
Regenerate the measure list from the parsed score with advanced options.
- measures(interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
DataFrame representing the Measures of the MuseScore file (which can be incomplete measures). Comes with the columns mc, mn, quarterbeats, duration_qb, keysig, timesig, act_dur, mc_offset, volta, numbering_offset, dont_count, barline, breaks, repeats, next
- Parameters:
interval_index – Pass True to replace the default RangeIndex by an IntervalIndex.
- Returns:
DataFrame representing the measures of the MuseScore file (which can be incomplete measures).
- ml(recompute: bool = False) DataFrame[source]#
Get the raw Measures without adding quarterbeat columns.
- Parameters:
recompute – By default, the measures are cached. Pass True to enforce recomputing anew.
- new_tag(name: str, value: str | None = None, attributes: dict | None = None, after: Tag | None = None, before: Tag | None = None, append_within: Tag | None = None, prepend_within: Tag | None = None) Tag[source]#
Create a new tag with the given name, value and attributes and insert it into the score relative to a given tag. Only one of after, before, append_within and prepend_within can be specified.
- Parameters:
name – <name></name>
value – <name>value</name> (if specified)
attributes – <name key=value, …></name>
after – Insert the tag as sibling following the given tag.
before – Insert the tag as sibling preceding the given tag.
append_within – Insert the tag as last child of the given tag.
prepend_within – Insert the tag as first child of the given tag.
- Returns:
The new tag.
- notes(interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
DataFrame representing the Notes of the MuseScore file. Comes with the columns quarterbeats, duration_qb, mc, mn, mc_onset, mn_onset, timesig, staff, voice, duration, gracenote, tremolo, nominal_duration, scalar, tied, tpc, midi, volta, chord_id
- Parameters:
interval_index – Pass True to replace the default RangeIndex by an IntervalIndex.
- Returns:
DataFrame representing the Notes of the MuseScore file.
- nl(recompute: bool = False) DataFrame[source]#
Get the raw Notes without adding quarterbeat columns.
- Parameters:
recompute – By default, the notes are cached. Pass True to enforce recomputing anew.
- notes_and_rests(interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
DataFrame representing the Notes and Rests of the MuseScore file. Comes with the columns quarterbeats, duration_qb, mc, mn, mc_onset, mn_onset, timesig, staff, voice, duration, gracenote, tremolo, nominal_duration, scalar, tied, tpc, midi, volta, chord_id
- Parameters:
interval_index – Pass True to replace the default RangeIndex by an IntervalIndex.
- Returns:
DataFrame representing the Notes and Rests of the MuseScore file.
- nrl(recompute: bool = False) DataFrame[source]#
Get the raw Notes and Rests without adding quarterbeat columns.
- Parameters:
recompute – By default, the notes and rests are cached. Pass True to enforce recomputing anew.
- offset_dict(all_endings: bool = False, unfold: bool = False) dict[source]#
Dictionary mapping MCs (measure counts) to their quarterbeat offset from the piece’s beginning. Used for computing quarterbeats for other facets.
- Parameters:
all_endings – If a piece has alternative endings, by default, only the second ending is taken into account for computing quarterbeats in order to make the timeline correspond to a rendition without performing repeats. Events in other endings, notably the first, receive value NA so that they can be filtered out. For score addressability, one might want to apply a continuous timeline to all measures, in which case one would pass True to use the column ‘quarterbeats_all_endings’ of the measures table if it has one. If not, falls back to the default ‘quarterbeats’.
unfold – Pass True to compute quarterbeats for a mc_playthrough column resulting from unfolding repeats. The parameter all_endings is ignored in this case because the unfolded version brings each ending in its correct place.
- Returns:
{MC -> quarterbeat_offset}. Offsets are Fractions. If all_endings is not set to True, values for MCs that are part of a first ending (or third or larger) are NA.
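How such a mapping could be derived from a measures table is sketched below on plain tuples. This is not the library's code: the helper name and input format are hypothetical, measure lengths are assumed to be given as fractions of a whole note, and NA is represented by None.

```python
from fractions import Fraction
from typing import Dict, List, Optional, Tuple

# (mc, actual duration as fraction of a whole note, volta number or None)
MeasureRow = Tuple[int, Fraction, Optional[int]]


def offset_dict_sketch(measures: List[MeasureRow],
                       all_endings: bool = False) -> Dict[int, Optional[Fraction]]:
    """Map each MC to its quarterbeat offset from the beginning of the piece."""
    offsets: Dict[int, Optional[Fraction]] = {}
    position = Fraction(0)
    for mc, act_dur, volta in measures:
        # By default only measures outside endings and those in a second ending count.
        counted = all_endings or volta is None or volta == 2
        offsets[mc] = position if counted else None
        if counted:
            position += act_dur * 4  # whole-note fraction -> quarterbeats
    return offsets


# Four 4/4 measures; MC 2 is a first ending, MC 3 the second ending.
rows = [(1, Fraction(1), None), (2, Fraction(1), 1), (3, Fraction(1), 2), (4, Fraction(1), None)]
print(offset_dict_sketch(rows))
print(offset_dict_sketch(rows, all_endings=True))
```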
- rests(interval_index: bool = False, unfold: bool = False) DataFrame | None[source]#
DataFrame representing the Rests of the MuseScore file. Comes with the columns quarterbeats, duration_qb, mc, mn, mc_onset, mn_onset, timesig, staff, voice, duration, nominal_duration, scalar, volta
- Parameters:
interval_index – Pass True to replace the default RangeIndex by an IntervalIndex.
- Returns:
DataFrame representing the Rests of the MuseScore file.
- rl(recompute: bool = False) DataFrame[source]#
Get the raw Rests without adding quarterbeat columns.
- Parameters:
recompute – By default, the rests are cached. Pass True to enforce recomputing anew.
- parse_soup() None[source]#
First step of parsing the MuseScore source. Involves discovering the <staff> tags and storing the <Measure> tags of each in the measure_nodes dictionary. Also stores the drum_map for each Drumset staff.
- parse_measures()[source]#
Converts the score into the three DataFrames self._measures, self._events, and self._notes.
- ms3.bs4_parser.replace_chord_tag_with_rest(target_tag)[source]#
This function takes a chord tag from the XML tree and mutates it into a rest tag of the same duration. This is useful for trimming excerpts because it gives more control over which musical elements are extracted. It also has the advantage of not changing the relative positions of notes from the original score.
- Parameters:
target_tag – bs4.Tag The chord tag that needs to be mutated into a rest tag of the same duration
- class ms3.bs4_parser.Excerpt(soup: BeautifulSoup, measures: Tuple[int] | int, read_only: bool = False, logger_cfg: dict | None = None, first_mn: int | None = None, first_timesig: str | None = None, first_keysig: int | None = None, first_harmony_values: Dict[str, str] | None = None, first_tempo_tag: Tag | None = None, staff2clef: Dict[int, Dict[str, str]] | None = None, final_barline: bool = False, globalkey: str | None = None, localkey: str | None = None, start_mc_onset: Fraction | None = None, end_mc_onset: Fraction | None = None, exclude_start: bool | None = False, exclude_end: bool | None = False, metronome_tempo: float | None = None, metronome_beat_unit: Fraction | None = Fraction(1, 1), decompose_repeat_tags: bool | None = True)[source]#
Takes a copy of _MSCX_bs4.soup and eliminates all <Measure> tags that do not correspond to the given list of MCs.
- set_tempo(first_tempo_tag, metronome_tempo, metronome_beat_unit)[source]#
This method enforces the tempo at the beginning of the excerpt. If a metronome mark was found in the piece from which the excerpt was taken and was still active, and no tempo was specified by the user, it will be set again in the first measure of the excerpt. Otherwise, if the user specified a tempo along with a beat unit, a custom metronome mark will be added to the beginning of the excerpt, overwriting any pre-existing metronome mark.
- Parameters:
first_tempo_tag – The last active metronome mark found in the original piece (if any was found)
metronome_tempo – Optional[float], optional Setting this value will override the tempo at the beginning of the excerpt which, otherwise, is created automatically according to the tempo in force at that moment in the score. This is achieved by inserting a hidden metronome marking with a value that depends on the specified “beats per minute”, where “beat” depends on the value of the metronome_beat_unit parameter.
metronome_beat_unit – Optional[Fraction | float], optional Defaults to 1/4, which stands for a quarter note. Please note that for now, the combination of beat unit and tempo is converted and expressed as quarter notes per minute in the (invisible) metronome marking. For example, specifying 1/8=100 will effectively result in 1/4=50 (which is equivalent).
- trim(start_mc_onset: Fraction | None = None, end_mc_onset: Fraction | None = None, exclude_start: bool | None = False, exclude_end: bool | None = False)[source]#
This method handles the trimming of the excerpt where notes outside of the set onset boundaries are mutated into rests (to not change the relative positions of the notes in the whole excerpt).
- Parameters:
start_mc_onset – The onset value before which we want to mutate all other notes (associated with first measure)
end_mc_onset – The onset value after which we want to mutate all other notes (associated with last measure)
exclude_start – If set to True, the note corresponding to the start_mc_onset in the first measure will also be removed
exclude_end – If set to True, the note corresponding to the end_mc_onset in the last measure will also be removed
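The trimming rule can be sketched on a flat list of events instead of XML tags. The representation as (mc, mc_onset, name) tuples and the helper name are assumptions made for illustration; the boundary logic follows the parameter descriptions: strictly earlier onsets in the first measure and strictly later onsets in the last measure become rests, and the boundary notes themselves only when the exclude flags are set.

```python
from fractions import Fraction
from typing import List, Optional, Tuple

Event = Tuple[int, Fraction, str]  # (mc, mc_onset, name)


def trim_sketch(events: List[Event],
                start_mc_onset: Optional[Fraction] = None,
                end_mc_onset: Optional[Fraction] = None,
                exclude_start: bool = False,
                exclude_end: bool = False) -> List[Event]:
    """Replace events outside the onset bounds by rests without moving anything."""
    first_mc = min(mc for mc, _, _ in events)
    last_mc = max(mc for mc, _, _ in events)
    trimmed = []
    for mc, onset, name in events:
        silence = False
        if start_mc_onset is not None and mc == first_mc:
            silence |= onset < start_mc_onset or (exclude_start and onset == start_mc_onset)
        if end_mc_onset is not None and mc == last_mc:
            silence |= onset > end_mc_onset or (exclude_end and onset == end_mc_onset)
        trimmed.append((mc, onset, "rest" if silence else name))
    return trimmed


notes = [(5, Fraction(0), "C4"), (5, Fraction(1, 4), "D4"), (5, Fraction(1, 2), "E4"),
         (6, Fraction(0), "F4"), (6, Fraction(1, 2), "G4")]
for event in trim_sketch(notes, start_mc_onset=Fraction(1, 4), end_mc_onset=Fraction(0)):
    print(event)
```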
- get_onset_zero_harmony(return_layer: Literal[False]) Tag | None[source]#
- get_onset_zero_harmony(return_layer: Literal[True]) Tuple[Tag | None, int, int]
Iterate through all tags at mc_onset 0 for all notational (staff, voice) layers and return the first <Harmony> tag or None.
- set_clefs(staff2clef: Dict[int, Dict[str, str]])[source]#
Set the initial clefs for the given staves.
- set_first_keysig(first_keysig: int)[source]#
Set the key signature of the first measure to the given value.
- set_first_mn(first_mn: int)[source]#
Set the measure number of the first measure to the given value.
- replace_chords_with_rests(start_onset: Fraction | float | None = None, end_onset: Fraction | float | None = None, exclude_start: bool | None = False, exclude_end: bool | None = False)[source]#
Given the specified onset values, this method silences all notes that are not within the onset bounds. More specifically, notes that appear before the start_onset in the start_mc will be mutated to rests (i.e. silenced). The same goes for the end_mc: all notes found after the end_onset will also be mutated to rests.
- Parameters:
start_onset – Onset value set for the first measure. Everything before this will be silenced.
end_onset – Onset value set for the last measure. Everything after this will be silenced.
exclude_start – If set to True, the note corresponding to start_onset in the first measure will also be silenced.
exclude_end – If set to True, the note corresponding to end_onset in the last measure will also be silenced.
- enforce_tempo(piece_tempo_tag: Tag | None = None, metronome_tempo: float | None = None, metronome_beat_unit: Fraction | float | None = Fraction(1, 4), user_call: bool | None = True)[source]#
Creates the artificial hidden metronome mark that either comes from the last active metronome mark of the original piece or from some specified tempo and beat unit values specified by the user.
- Parameters:
piece_tempo_tag
metronome_tempo – Optional[float], optional Setting this value will override the tempo at the beginning of the excerpt which, otherwise, is created automatically according to the tempo in force at that moment in the score. This is achieved by inserting a hidden metronome marking with a value that depends on the specified “beats per minute”, where “beat” depends on the value of the metronome_beat_unit parameter.
metronome_beat_unit – Optional[Fraction | float], optional Defaults to 1/4, which stands for a quarter note. Please note that for now, the combination of beat unit and tempo is converted and expressed as quarter notes per minute in the (invisible) metronome marking. For example, specifying 1/8=100 will effectively result in 1/4=50 (which is equivalent).
user_call
Returns:
- class ms3.bs4_parser.ParsedParts(soup: BeautifulSoup, **logger_cfg)[source]#
Stores the parts found in a BeautifulSoup object.
- Parameters:
soup – bs4.BeautifulSoup, BeautifulSoup object to parse
**logger_cfg (dict, optional) – The following options are available:
- ‘name’: LOGGER_NAME -> by default the logger name is based on the parsed file(s)
- ‘level’: {‘W’, ‘D’, ‘I’, ‘E’, ‘C’, ‘WARNING’, ‘DEBUG’, ‘INFO’, ‘ERROR’, ‘CRITICAL’}
- ‘file’: PATH_TO_LOGFILE to store all log messages under the given path.
- ms3.bs4_parser.get_enlarged_default_dict() Dict[str, dict][source]#
Allows users to point to an instrument not only with a ‘trackName’, but also with ‘id’, ‘longName’, ‘shortName’, ‘instrumentId’, ‘part_trackName’.
- Returns:
Dictionary mapping any of the possible fields (‘id’, ‘longName’, ‘shortName’, ‘trackName’, ‘instrumentId’, ‘part_trackName’) corresponding to an instrument to the complete information about the instrument (‘id’, ‘longName’, ‘shortName’, ‘trackName’, ‘instrumentId’, ‘part_trackName’, ‘ChannelName’, ‘ChannelValue’)
- Return type:
Dict[str, dict]
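The idea of an "enlarged" dictionary can be sketched as follows: starting from instrument records keyed by one field, every value of the six identifying fields becomes a key pointing to the same record. The helper name and the single piano record with all its values are invented for illustration and are not taken from ms3's defaults.

```python
from typing import Dict

IDENTIFYING_FIELDS = ("id", "longName", "shortName", "trackName", "instrumentId", "part_trackName")


def enlarge(default_by_trackname: Dict[str, dict]) -> Dict[str, dict]:
    """Make each instrument record reachable through any of its identifying field values."""
    enlarged: Dict[str, dict] = {}
    for info in default_by_trackname.values():
        for field in IDENTIFYING_FIELDS:
            value = info.get(field)
            if value is not None:
                # Keep the first record if two instruments share a value.
                enlarged.setdefault(value, info)
    return enlarged


# Hypothetical record; the real default values may differ.
defaults = {
    "Piano": {"id": "piano", "longName": "Piano", "shortName": "Pno.", "trackName": "Piano",
              "instrumentId": "keyboard.piano", "part_trackName": "Piano",
              "ChannelName": "normal", "ChannelValue": "0"},
}
lookup = enlarge(defaults)
print(lookup["Pno."]["trackName"])
print(sorted(lookup))
```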
- class ms3.bs4_parser.Instrumentation(soup: BeautifulSoup, **logger_cfg)[source]#
Easy way to read and write the instrumentation of a score, that is ‘id’, ‘longName’, ‘shortName’, ‘trackName’, ‘instrumentId’, ‘part_trackName’,
‘ChannelName’, ‘ChannelValue’.
- soup_references() dict[str, dict[str, Tag]][source]#
Stores tags references for each staff
Returns: the dictionary in the format {‘staff_1’: {‘id’: None, ‘longName’: None, ‘shortName’: None, ‘trackName’: None, ‘instrumentId’: None, ‘part_trackName’: None, ‘ChannelName’, ‘ChannelValue’}, ‘staff_2’: {…}, …} containing the BeautifulSoup tags
- property fields#
Extracts information from the tags and stores it for each staff.
Returns: the dictionary in the format {‘staff_1’: {‘id’: None, ‘longName’: None, ‘shortName’: None, ‘trackName’: None, ‘instrumentId’: None, ‘part_trackName’: None, ‘ChannelName’: None, ‘ChannelValue’: None}, ‘staff_2’: {…}, …} containing the information extracted from tags
- get_instrument_name(staff_name: str | int)[source]#
Allows users to access the instrument trackName attributed to the staff staff_name.
- Parameters:
staff_name – a number or a string in the format ‘staff_1’ defining the staff of interest
- Returns:
trackName extracted from tag for the staff staff_name
- Return type:
- add_suffix(new_values, suffix)[source]#
Adds a suffix to the instrument names.
- Parameters:
new_values – the dictionary of fields to update
suffix – the string containing the version
- Returns:
the dictionary with updated names with versions
- modify_drumset_tags(staff_type, value, changed_part, field_to_change)[source]#
Sets tags specific to Drumset instruments.
- Parameters:
staff_type – the tags containing info of the field
value – new value of the field
changed_part – the index of the part to update
field_to_change – the name of the field to update
- modify_list_tags(changed_part, found, value)[source]#
Sets instruments if there is a list of values to update.
- Parameters:
changed_part – number of the part of the soup file where to find and update in the original file
found – parts of the soup containing channel info in the original file
value – new values to set
- Returns:
corrected list of parts of the same length as the value list
- set_instrument(staff_id: str | int, trackname)[source]#
Modifies the instrument and all its corresponding information in the soup source file
- Parameters:
staff_id – an integer number i or a string in the format ‘staff_i’ defining the staff of interest
trackname – key defining the new value of the instrument, can be one of (‘id’, ‘longName’, ‘shortName’, ‘trackName’, ‘instrumentId’, ‘part_trackName’)
- class ms3.bs4_parser.Metatags(soup)[source]#
Easy way to read and write any style information in a parsed MSCX score.
- class ms3.bs4_parser.Style(soup)[source]#
Easy way to read and write any style information in a parsed MSCX score.
- class ms3.bs4_parser.Prelims(soup: BeautifulSoup, **logger_cfg)[source]#
Easy way to read and write the preliminaries of a score, that is Title, Subtitle, Composer, Lyricist, and ‘Instrument Name (Part)’.
- ms3.bs4_parser.get_duration_event(elements)[source]#
Receives a list of dicts representing the events for a given mc_onset and returns the index and name of the first event that has a duration, so either a Chord or a Rest.
- ms3.bs4_parser.get_vbox(soup: BeautifulSoup, logger=None) Tag | None[source]#
Returns the first <VBox> tag contained in the first staff, if any, which usually corresponds to the vertical box at the top of a MuseScore file which contains the prelims (title, composer, etc.)
- ms3.bs4_parser.get_part_info(part_tag, start_staff_id=1)[source]#
Instrument names come in different forms in different places. This function extracts the information from a <Part> tag and returns it as a dictionary.
start_staff_id is used as the base for staff numbering when the inner <Staff> tags lack an id attribute (MuseScore 4 format), where the canonical IDs live on the top-level <Staff id="N"> siblings instead. MuseScore numbers staves sequentially across parts, so callers should pass a running counter.
- ms3.bs4_parser.make_spanner_cols(df: DataFrame, spanner_types: Collection[str] | None = None, logger=None) DataFrame[source]#
From a raw chord list as returned by get_chords(spanners=True), create a DataFrame with Spanner IDs for all chords for all spanner types they are associated with.
- Parameters:
spanner_types – If this parameter is passed, only the listed spanner types [‘Slur’, ‘HairPin’, ‘Pedal’, ‘Ottava’] are included.
History of this algorithm#
At first, spanner IDs were written to Chords of the same layer until a prev/location was found. This initially caused some spanners to continue until the end of the piece because endings were missing when selecting based on the subtype column (endings don’t specify subtype). After fixing this, there were still mistakes, particularly for slurs, because:
1. endings can be missing,
2. endings can occur in a different voice than they should,
3. endings can be expressed with different values than the beginning
(all three cases found in ms3/tests/test_local_files/MS3/stabat_03_coloured.mscx).
Therefore, the new algorithm ends spanners simply after their given duration.
- ms3.bs4_parser.recurse_node(node, prepend=None, exclude_children=None)[source]#
The heart of the XML -> DataFrame conversion. Changes may have ample repercussions!
- Returns:
Keys are combinations of tag (& attribute) names, values are value strings.
- Return type:
- ms3.bs4_parser.text_tag2str(tag: Tag) str[source]#
Transforms a <text> tag into a string that potentially includes written-out HTML tags.
- ms3.bs4_parser.text_tag2str_components(tag: Tag) List[str][source]#
Recursively traverses a <text> tag and returns all string components, effectively removing all HTML markup.
- ms3.bs4_parser.text_tag2str_recursive(tag: Tag, join_char: str = '') str[source]#
Gets all string components from a <text> tag and joins them with join_char.
- ms3.bs4_parser.tag2text(tag: Tag) Tuple[str, str][source]#
Takes the <Text> from a MuseScore file’s header and returns its style and string.
- ms3.bs4_parser.get_thoroughbass_symbols(item_tag: Tag) Tuple[str, str][source]#
Returns the prefix and suffix of a <FiguredBassItem> tag if present, empty strings otherwise.
- ms3.bs4_parser.thoroughbass_item(item_tag: Tag) str[source]#
Turns a <FiguredBassItem> tag into a string by concatenating brackets, prefix, digit and suffix.
- ms3.bs4_parser.process_thoroughbass(thoroughbass_tag: Tag) Tuple[List[str], Fraction | None][source]#
Turns a <FiguredBass> tag into a list of components strings, one per level, and duration.
- ms3.bs4_parser.get_row_at_quarterbeat(df: DataFrame, quarterbeat: Literal[None]) DataFrame[source]#
- ms3.bs4_parser.get_row_at_quarterbeat(df: DataFrame, quarterbeat: float) Series | None
Returns the row of a DataFrame that is active at a given quarterbeat by interpreting subsequent intervals of the given dataframe’s “quarterbeat” column as activation intervals. That is, the rows are interpreted as consecutive, non-overlapping events and the duration_qb column is not taken into account for computing the activation intervals. The last interval’s right boundary is np.inf, so that all values higher than the latest event resolve to the latest event without needing to know the end of the piece.
- Parameters:
df – DataFrame in which the column “quarterbeat” is monotonically increasing.
quarterbeat – The position for which the active row will be returned. If the position does not exist because it’s before the first event, None is returned. If None is passed (default), the whole dataframe is returned.
- Returns:
The row of the dataframe.
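The activation-interval lookup described above can be sketched in plain Python with the standard-library bisect module. This is an illustration only, not the ms3 implementation: the function name active_row and the (quarterbeat, label) tuples are hypothetical, and left-closed intervals are assumed.

```python
from bisect import bisect_right
from typing import List, Optional, Tuple


def active_row(
    events: List[Tuple[float, str]], quarterbeat: float
) -> Optional[Tuple[float, str]]:
    """Return the event active at `quarterbeat`, or None if it precedes the first event.

    `events` must be sorted by onset. Each event stays active until the next
    onset; the last one stays active indefinitely (right boundary = infinity).
    """
    onsets = [onset for onset, _ in events]
    i = bisect_right(onsets, quarterbeat)  # number of onsets <= quarterbeat
    if i == 0:
        return None
    return events[i - 1]


# Hypothetical events: (quarterbeat, label)
events = [(0.0, "I"), (4.0, "V"), (6.0, "I")]
print(active_row(events, 5.0))
```

Because only onsets are consulted, the durations of the events play no role, which mirrors the statement that duration_qb is not taken into account.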
The expand_dcml module#
This is the same code as in the corpora repo as copied on September 24, 2020 and then adapted.
- class ms3.expand_dcml.SliceMaker[source]#
This class serves for storing slice notation such as :3 as a variable or passing it as function argument.
Examples
SM = SliceMaker()
some_function( slice_this, SM[3:8] )
select_all = SM[:]
df.loc[select_all]
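The idea of storing slice notation in a variable can be sketched with a class whose __getitem__ simply returns whatever was placed between the square brackets. The class name SliceSketch is a hypothetical stand-in; whether SliceMaker is implemented exactly this way is an assumption.

```python
class SliceSketch:
    """Hypothetical stand-in: indexing returns the slice object itself."""

    def __getitem__(self, item):
        return item


SM = SliceSketch()
first_three = SM[:3]       # a plain slice object
letters = list("abcdef")
print(letters[first_three])  # the stored slice can be reused on any sequence
```

The returned object is an ordinary Python slice (or a tuple of slices for multi-axis notation), so it can be passed anywhere a slice is accepted.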
- ms3.expand_dcml.expand_labels(df, column='label', regex=None, rename={}, dropna=False, propagate=True, volta_structure=None, relative_to_global=False, chord_tones=True, absolute=False, all_in_c=False, skip_checks=False, logger=None)[source]#
Split harmony labels complying with the DCML syntax into columns holding their various features and allows for additional computations and transformations.
Uses: compute_chord_tones(), features2type(), labels2global_tonic(), propagate_keys(), propagate_pedal(), replace_special(), roman_numeral2fifths(), split_alternatives(), split_labels(), transform(), transpose()
- Parameters:
df (pandas.DataFrame) – Dataframe where one column contains DCML chord labels.
column (str) – Name of the column that holds the harmony labels.
regex (re.Pattern) – Compiled regular expression used to split the labels. It needs to have named groups. The group names are used as column names unless replaced via rename.
rename (dict, optional) – Dictionary to map the regex’s group names to deviating column names of your choice.
dropna (bool, optional) – Pass True if you want to drop rows where column is NaN/<NA>.
propagate (bool, optional) – By default, information about global and local keys and about pedal points is spread throughout the DataFrame. Pass False if you only want to split the labels into their features. This ignores all following parameters because their expansions depend on information about keys.
volta_structure (dict, optional) – {first_mc -> {volta_number -> [mc1, mc2…]} } dictionary as you can get it from Score.mscx.volta_structure. This allows for correct propagation into second and other voltas.
relative_to_global (bool, optional) – Pass True if you want all labels expressed with respect to the global key. This levels and eliminates the features localkey and relativeroot.
chord_tones (bool, optional) – Pass True if you want to add four columns that contain information about each label’s chord, added, root, and bass tones. The pitches are expressed as intervals relative to the respective chord’s local key or, if relative_to_global=True, to the globalkey. The intervals are represented as integers that represent stacks of fifths over the tonic, such that 0 = tonic, 1 = dominant, -1 = subdominant, 2 = supertonic etc.
absolute (bool, optional) – Pass True if you want to transpose the relative chord_tones to the global key, which makes them absolute so they can be expressed as actual note names. This implies prior conversion of the chord_tones (but not of the labels) to the global tonic.
all_in_c (bool, optional) – Pass True to transpose chord_tones to C major/minor. This performs the same transposition of chord tones as relative_to_global but without transposing the labels, too. This option clashes with absolute=True.
- Returns:
Original DataFrame plus additional columns with split features.
- Return type:
- ms3.expand_dcml.extract_features_from_labels(S: Series, regex: Pattern | str | None = None) DataFrame[source]#
Applies .str.extract(regex) on the Series and returns a DataFrame with all named capturing groups.
- ms3.expand_dcml.split_labels(df, label_column='label', regex=None, rename={}, dropna=False, inplace=False, skip_checks=False, logger=None)[source]#
Split harmony labels complying with the DCML syntax into columns holding their various features.
- Parameters:
df (pandas.DataFrame) – Dataframe where one column contains DCML chord labels.
label_column (str) – Name of the column that holds the harmony labels.
regex (re.Pattern) – Compiled regular expression used to split the labels. It needs to have named groups. The group names are used as column names unless replaced via rename.
rename (dict) – Dictionary to map the regex’s group names to deviating column names.
dropna (bool, optional) – Pass True if you want to drop rows where the label column is NaN/<NA>.
inplace (bool, optional) – Pass True if you want to mutate df.
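The requirement that the regex have named groups can be illustrated with the standard-library re module: each group name becomes a feature name. The pattern below is a hypothetical, heavily simplified stand-in and not the DCML regex used by ms3; the helper split_label is likewise an assumption made for illustration.

```python
import re

# Hypothetical, heavily simplified pattern; the real DCML syntax covers far more.
SIMPLE_LABEL = re.compile(
    r"(?P<numeral>[b#]*(?:VII|VI|IV|V|III|II|I|vii|vi|iv|v|iii|ii|i))"
    r"(?P<form>[%o+M])?"
    r"(?P<figbass>7|65|43|42|2|64|6)?"
    r"(?:/(?P<relativeroot>[b#]*[IViv]+))?"
)


def split_label(label: str) -> dict:
    """Return {group name -> matched substring or None}, or {} if nothing matches."""
    match = SIMPLE_LABEL.fullmatch(label)
    return match.groupdict() if match else {}


print(split_label("viio65/V"))
```

Groups that do not participate in a match come back as None, which corresponds to missing values in the resulting feature columns.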
- ms3.expand_dcml.values_into_df(df: DataFrame, new_values: DataFrame) DataFrame[source]#
Updates the given DataFrame with the values from the other DataFrame by updating existing columns and concatenating new columns. The returned DataFrame has the columns of new_values on the right-hand side as if they had been concatenated.
- ms3.expand_dcml.features2type(numeral, form=None, figbass=None, logger=None)[source]#
Turns a combination of the three chord features into a chord type.
- Returns:
‘M’ (Major triad)
‘m’ (Minor triad)
‘o’ (Diminished triad)
‘+’ (Augmented triad)
‘mm7’ (Minor seventh chord)
‘Mm7’ (Dominant seventh chord)
‘MM7’ (Major seventh chord)
‘mM7’ (Minor major seventh chord)
‘o7’ (Diminished seventh chord)
‘%7’ (Half-diminished seventh chord)
‘+7’ (Augmented (minor) seventh chord)
‘+M7’ (Augmented major seventh chord)
- ms3.expand_dcml.replace_special(df, regex, merge=False, inplace=False, cols={}, special_map={}, logger=None)[source]#
Move special symbols in the numeral column to a separate column and replace them by the explicit chords they stand for. In particular, this function replaces the symbols It, Ger, and Fr.
Uses: merge_changes()
- Parameters:
df (pandas.DataFrame) – Dataframe containing DCML chord labels that have been split by split_labels().
regex (re.Pattern) – Compiled regular expression used to split the labels replacing the special symbols. It needs to have named groups. The group names are used as column names unless replaced by cols.
merge (bool, optional) – False: By default, existing values, except figbass, are overwritten. True: Merge existing with new values (for changes and relativeroot).
cols (dict, optional) – The special symbols appear in the column numeral and are moved to the column special. In case the column names for ['numeral', 'form', 'figbass', 'changes', 'relativeroot', 'special'] deviate, pass a dict, such as
{'numeral': 'numeral_col_name', 'form': 'form_col_name', 'figbass': 'figbass_col_name', 'changes': 'changes_col_name', 'relativeroot': 'relativeroot_col_name', 'special': 'special_col_name'}
special_map (dict, optional) – In case you want to add or alter special symbols to be replaced, pass a replacement map, e.g. {‘N’: ‘bII6’}. The column ‘figbass’ is only altered if it’s None to allow for inversions of special chords.
inplace (bool, optional) – Pass True if you want to mutate df.
- ms3.expand_dcml.merge_changes(left, right, *args)[source]#
Merge two changes into one, e.g. b3 and +#7 to +#7b3.
Uses:
changes2list()
- ms3.expand_dcml.propagate_keys(df, volta_structure=None, globalkey='globalkey', localkey='localkey', add_bool=True, logger=None)[source]#
Propagate information about global keys and local keys throughout the dataframe. Pass split harmonies for one piece at a time. For concatenated pieces, use apply().
Uses: series_is_minor()
- Parameters:
df (pandas.DataFrame) – Dataframe containing DCML chord labels that have been split by split_labels().
volta_structure (dict, optional) – {first_mc -> {volta_number -> [mc1, mc2…]} } dictionary as you can get it from Score.mscx.volta_structure. This allows for correct propagation into second and other voltas.
globalkey (str, optional) – In case you renamed the columns, pass column names.
localkey (str, optional) – In case you renamed the columns, pass column names.
add_bool (bool, optional) – Pass True if you want to add two boolean columns which are true if the respective key is a minor key.
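Propagation can be pictured as a forward fill: a key stated once remains in force until the next key is stated. The sketch below shows that idea on a plain list. It is not the ms3 implementation, it ignores volta_structure entirely, and it assumes (as a labeled simplification) that lowercase keys such as ‘vi’ denote minor; forward_fill and is_minor are hypothetical helper names.

```python
from typing import List, Optional


def forward_fill(values: List[Optional[str]]) -> List[Optional[str]]:
    """Replace every None with the most recent non-None value."""
    filled, current = [], None
    for value in values:
        if value is not None:
            current = value
        filled.append(current)
    return filled


def is_minor(key: Optional[str]) -> Optional[bool]:
    # Assumption: lowercase keys/numerals (e.g. 'a', 'vi') denote minor.
    return None if key is None else key.islower()


# Hypothetical localkey column: a value appears only where the key changes.
localkey = ["I", None, None, "vi", None, "I"]
propagated = forward_fill(localkey)
localkey_is_minor = [is_minor(key) for key in propagated]
print(propagated)
print(localkey_is_minor)
```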
- ms3.expand_dcml.propagate_pedal(df, relative=True, drop_pedalend=True, cols={}, logger=None)[source]#
Propagate the pedal note for all chords within square brackets. By default, the note is expressed in relation to each label’s localkey.
Uses: rel2abs_key(), abs2rel_key()
- Parameters:
df (pandas.DataFrame) – Dataframe containing DCML chord labels that have been split by split_labels() and where the keys have been propagated using propagate_keys().
relative (bool, optional) – Pass False if you want the pedal note to stay the same even if the localkey changes.
drop_pedalend (bool, optional) – Pass False if you don’t want the column with the ending brackets to be dropped.
cols (dict, optional) – In case the column names for ['pedal', 'pedalend', 'globalkey', 'localkey'] deviate, pass a dict, such as
{'pedal': 'pedal_col_name', 'pedalend': 'pedalend_col_name', 'globalkey': 'globalkey_col_name', 'localkey': 'localkey_col_name'}
Utils#
Transformations#
Functions for transforming DataFrames as output by ms3.
- ms3.transformations.make_note_name_and_octave_columns(notes: DataFrame, staff2drums: Dict[int, dict | DataFrame | Series] | None = None) Tuple[Series, Series][source]#
Takes a notelist and maybe a {staff -> {midi_pitch -> ‘instrument_name’}} mapping and returns two columns named ‘name’ and ‘octave’.
- ms3.transformations.add_quarterbeats_col(df: DataFrame, offset_dict: Series | dict, offset_dict_all_endings: Series | dict | None = None, interval_index: bool = False, name: str | None = None, logger=None) DataFrame[source]#
Insert a column measuring the distance of events from MC 1 in quarter notes. If no ‘mc_onset’ column is present, the column corresponds to the values given in the offset_dict.
- Parameters:
df – DataFrame with an mc or mc_playthrough column, and an mc_onset column.
offset_dict – If unfolded: {mc_playthrough -> offset}. Otherwise: {mc -> offset}. You can create the dict using the functions make_continuous_offset_series() or make_offset_dict_from_measures(). It is not required if the column ‘quarterbeats’ exists already.
offset_dict_all_endings – Argument added later as a straightforward way to add two quarterbeats columns, the second one being the ‘quarterbeats_all_endings’ which is so important that with ms3 v2.2.0 it is included by default. It is independent from unfolding because its main purpose is score addressability.
interval_index – Defaults to False. Pass True to replace the index with a pandas.IntervalIndex (depends on the successful creation of the column duration_qb).
name – If specified, name of the added column. Defaults to ‘quarterbeats’ for normal, and ‘quarterbeats_playthrough’ for unfolded dataframes.
logger
- Returns:
The DataFrame with quarterbeats and duration_qb columns added.
- ms3.transformations.make_quarterbeats_column(mc_column: Series, mc_onset_column: Series | None, offset_dict: Series | dict, name: str = 'quarterbeats') Series[source]#
Turn each combination of mc and mc_onset into a quarterbeat value using the offset_dict that maps mc to the measure’s quarterbeat position (distance from the beginning of the piece).
- Parameters:
mc_column – A sequence of MC values, each of which will be mapped to its quarterbeats value in offset_dict.
mc_onset_column – If specified, these values will be added to the mapped quarterbeats values.
offset_dict – {mc -> quarterbeats}, can be a Series.
name – Name of the returned Series.
- Returns:
Quarterbeats column.
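The mapping from (mc, mc_onset) to a quarterbeat position can be sketched with exact fractions. This is an illustration under stated assumptions, not the library code: the offsets are hypothetical values for a piece in 4/4, the helper name quarterbeat is made up, and mc_onset is assumed to be expressed as a fraction of a whole note (hence the factor 4).

```python
from fractions import Fraction

# Hypothetical offsets: quarterbeat position at which each MC begins (4/4 throughout).
offset_dict = {1: Fraction(0), 2: Fraction(4), 3: Fraction(8)}


def quarterbeat(mc, mc_onset=None):
    """Map an MC (and optionally an onset within it) to a quarterbeat position."""
    position = offset_dict[mc]
    if mc_onset is not None:
        # Assumption: mc_onset is a fraction of a whole note.
        position += Fraction(mc_onset) * 4
    return position


print(quarterbeat(2, Fraction(1, 4)))  # second quarter of MC 2
```

Without an mc_onset, the result is simply the measure's own offset, in line with the note that the column then corresponds to the values in offset_dict.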
- ms3.transformations.add_weighted_grace_durations(notes, weight=0.5, logger=None)[source]#
For a given notes table, change the ‘duration’ value of all grace notes, weighting it by weight.
- Parameters:
notes (pandas.DataFrame) – Notes table containing the columns ‘duration’, ‘nominal_duration’, ‘scalar’
weight (Fraction or float) – Value by which to weight duration of all grace notes. Defaults to a half.
- Returns:
Copy of notes with altered duration values.
- Return type:
- ms3.transformations.compute_chord_tones(df: DataFrame, bass_only: bool = False, expand: bool = False, cols: dict | None = None, logger: Logger | None = None) DataFrame | Series[source]#
Compute the chord tones for DCML harmony labels. They are returned as lists of tonal pitch classes in close position, starting with the bass note. The tonal pitch classes represent intervals relative to the local tonic:
-2: second below tonic
-1: fifth below tonic
0: tonic
1: fifth above tonic
2: second above tonic, etc.
The labels need to have undergone split_labels() and propagate_keys(). Pedal points are not taken into account.
Uses: features2tpcs()
- Parameters:
df – Dataframe containing DCML chord labels that have been split by split_labels() and where the keys have been propagated using propagate_keys(add_bool=True).
bass_only – Pass True if you need only the bass note.
expand – Pass True if you need chord tones and added tones in separate columns. Otherwise a Series is returned.
cols – In case the column names for ['mc', 'numeral', 'form', 'figbass', 'changes', 'relativeroot', 'localkey', 'globalkey'] deviate, pass a dict.
logger
- Returns:
For every row of df one tuple with chord tones, expressed as tonal pitch classes. If expand is True, the function returns a DataFrame with four columns: Two with tuples for chord tones and added tones, one with the chord root, and one with the bass note.
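To make the stack-of-fifths encoding concrete, the following sketch converts such integers into note names, using the convention that the tonal pitch class 0 is C, 1 is G and -1 is F. The function fifths2name_sketch is a hypothetical helper written for this illustration and is not claimed to be part of ms3.

```python
def fifths2name_sketch(fifths: int, tonic_tpc: int = 0) -> str:
    """Convert an interval in fifths above a tonic into a spelled note name.

    `tonic_tpc` is the tonal pitch class of the tonic (0 = C, 1 = G, -1 = F).
    """
    tpc = fifths + tonic_tpc
    letter = "FCGDAEB"[(tpc + 1) % 7]       # line of fifths, starting from F
    accidentals = (tpc + 1) // 7            # >0: sharps, <0: flats
    sign = "#" if accidentals >= 0 else "b"
    return letter + sign * abs(accidentals)


# A major triad on the tonic, as tonal pitch classes: root, third, fifth.
major_triad = (0, 4, 1)
print([fifths2name_sketch(f) for f in major_triad])               # tonic C
print([fifths2name_sketch(f, tonic_tpc=1) for f in major_triad])  # tonic G
```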
- ms3.transformations.dfs2quarterbeats(dfs: DataFrame | List[DataFrame], measures: DataFrame, unfold=False, quarterbeats=True, interval_index=True, logger=None) List[DataFrame][source]#
Pass one or several DataFrames and one measures table to unfold repeats and/or add quarterbeats columns and/or index.
- Parameters:
dfs – DataFrame(s) that are to be unfolded and/or receive quarterbeats.
measures
unfold
quarterbeats
interval_index
- Returns:
Altered copies of dfs.
- ms3.transformations.get_chord_sequences(at, major_minor=True, level=None, column='chord', logger=None)[source]#
Transforms an annotation table into lists of chord symbols for n-gram analysis. If your table represents several pieces, make sure to pass the groupby parameter level to avoid including nonexistent transitions.
- Parameters:
at (pandas.DataFrame) – Annotation table.
major_minor (bool, optional) – Defaults to True: the length of the chord sequences corresponds to localkey segments. The result comes as dict of dicts. If you pass False, chord sequences are returned as they are, potentially including incorrect transitions, e.g., when the localkey changes. The result comes as list of lists, where the sublists result from the groupby if you specified level.
level (int or list) – Argument passed to pandas.DataFrame.groupby(). Defaults to -1, resulting in a GroupBy by all levels except the last. Conversely, you can pass, for instance, 2 to group by the first two levels.
column (str) – Name of the column containing the chord symbols that compose the sequences.
- Returns:
‘localkey_is_minor’ -> bool, ‘sequence’ -> list} }. If False, the sequences are returned as a list of lists.
- Return type:
- ms3.transformations.group_annotations_by_features(at, features='numeral', logger=None)[source]#
Drop exact repetitions of one or several feature columns when occurring under the same localkey (and pedal point). For example, pass features = ['numeral', 'form', 'figbass'] to drop rows where all three features are identical with the previous row _and_ the localkey stays the same. If the column duration_qb is present, it is updated with the new durations, as would be the IntervalIndex if there is one.
Uses: nan_eq()
- Parameters:
at (pandas.DataFrame) – Annotation table
features (str or list) – Feature or feature combination for which to remove immediate repetitions
dropna (bool) – Also subsumes rows for which all features are NaN, rather than treating them as a new value.
- Return type:
Example
>>> df
+--------------+--------------+-------------+----------+---------------+---------+------+---------+---------+--------------+
|              | quarterbeats | duration_qb | localkey | chord         | numeral | form | figbass | changes | relativeroot |
+==============+==============+=============+==========+===============+=========+======+=========+=========+==============+
| [37.5, 38.5) | 75/2         | 1.0         | I        | viio65(6b3)/V | vii     | o    | 65      | 6b3     | V            |
+--------------+--------------+-------------+----------+---------------+---------+------+---------+---------+--------------+
| [38.5, 40.5) | 77/2         | 2.0         | I        | Ger           | vii     | o    | 65      | b3      | V            |
+--------------+--------------+-------------+----------+---------------+---------+------+---------+---------+--------------+
| [40.5, 41.5) | 81/2         | 1.0         | I        | V(7v4)        | V       |      |         | 7v4     |              |
+--------------+--------------+-------------+----------+---------------+---------+------+---------+---------+--------------+
| [41.5, 43.5) | 83/2         | 2.0         | I        | V(64)         | V       |      |         | 64      |              |
+--------------+--------------+-------------+----------+---------------+---------+------+---------+---------+--------------+
| [43.5, 44.5) | 87/2         | 1.0         | I        | V7(9)         | V       |      | 7       | 9       |              |
+--------------+--------------+-------------+----------+---------------+---------+------+---------+---------+--------------+
| [44.5, 46.5) | 89/2         | 2.0         | I        | V7            | V       |      | 7       |         |              |
+--------------+--------------+-------------+----------+---------------+---------+------+---------+---------+--------------+
| [46.5, 48.0) | 93/2         | 1.5         | I        | I             | I       |      |         |         |              |
+--------------+--------------+-------------+----------+---------------+---------+------+---------+---------+--------------+
>>> group_annotations_by_features(df)
+--------------+--------------+-------------+----------+--------------+---------+-------+
|              | quarterbeats | duration_qb | localkey | relativeroot | numeral | chord |
+==============+==============+=============+==========+==============+=========+=======+
| [37.5, 40.5) | 75/2         | 3.0         | I        | V            | vii     | vii/V |
+--------------+--------------+-------------+----------+--------------+---------+-------+
| [40.5, 46.5) | 81/2         | 6.0         | I        | NaN          | V       | V     |
+--------------+--------------+-------------+----------+--------------+---------+-------+
| [46.5, 48.0) | 93/2         | 1.5         | I        | NaN          | I       | I     |
+--------------+--------------+-------------+----------+--------------+---------+-------+
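The durations in the example result can be reproduced with a small standard-library sketch that merges adjacent rows sharing the same numeral and sums their duration_qb values. It only illustrates the grouping idea on the numeral and duration columns from the example; it is not the ms3 implementation and it ignores localkey and pedal points (the localkey is constant in the example). The name merge_adjacent is hypothetical.

```python
from itertools import groupby

# (numeral, duration_qb) pairs taken from the example table above.
rows = [
    ("vii", 1.0), ("vii", 2.0),
    ("V", 1.0), ("V", 2.0), ("V", 1.0), ("V", 2.0),
    ("I", 1.5),
]


def merge_adjacent(rows):
    """Collapse runs of identical adjacent features, summing their durations."""
    merged = []
    for feature, group in groupby(rows, key=lambda row: row[0]):
        merged.append((feature, sum(duration for _, duration in group)))
    return merged


print(merge_adjacent(rows))
```

Note that itertools.groupby only merges neighbouring rows, so a numeral that returns later in the piece starts a new segment, which is the behaviour the description calls dropping immediate repetitions.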
- ms3.transformations.labels2global_tonic(df, cols={}, inplace=False, logger=None)[source]#
Transposes all numerals to their position in the global major or minor scale. This eliminates localkeys and relativeroots. The resulting chords are defined by [numeral, figbass, changes, globalkey_is_minor] (and pedal).
Uses: transform(), rel2abs_key(), resolve_relative_keys() -> str_is_minor(), transpose_changes(), series_is_minor()
- Parameters:
df (pandas.DataFrame) – Dataframe containing DCML chord labels that have been split by split_labels() and where the keys have been propagated using propagate_keys(add_bool=True).
cols (dict, optional) – In case the column names for ['numeral', 'form', 'figbass', 'changes', 'relativeroot', 'localkey', 'globalkey'] deviate, pass a dict, such as
{'chord': 'chord_col_name', 'pedal': 'pedal_col_name', 'numeral': 'numeral_col_name', 'form': 'form_col_name', 'figbass': 'figbass_col_name', 'changes': 'changes_col_name', 'relativeroot': 'relativeroot_col_name', 'localkey': 'localkey_col_name', 'globalkey': 'globalkey_col_name'}
inplace (bool, optional) – Pass True if you want to mutate the input.
- Returns:
If inplace=False, the relevant features of the transposed chords are returned. Otherwise, the original DataFrame is mutated.
- Return type:
- ms3.transformations.make_chord_col(df, cols=None)[source]#
The ‘chord’ column contains the chord part of a DCML label, i.e. without indications of key, pedal, cadence, or phrase. This function can re-create this column, e.g. if the feature columns were changed. To that aim, the function takes a DataFrame and the column names that it adds together, creating new strings. Column names ‘changes’ and ‘relativeroot’, if present, are treated specially (see the code).
- ms3.transformations.make_gantt_data(at, last_mn=None, relativeroots=True, mode_agnostic_adjacency=True, logger=None)[source]#
Takes an expanded DCML annotation table and returns a DataFrame with timings of the included key segments, based on the column localkey. The column names are suited for the plotly library.
Uses: rel2abs_key, resolve_relative_keys, roman_numeral2fifths, roman_numerals2semitones, labels2global_tonic
- Parameters:
at (pandas.DataFrame) – Expanded DCML annotation table.
last_mn (int, optional) – By default, the column quarterbeats is used for computing Start and Finish unless the column is not present, in which case a continuous version of measure numbers (MN) is used. In the latter case you should pass the last measure number of the piece in order to calculate the correct duration of the last key segment; otherwise it will go until the end of the last label’s MN. As soon as you pass a value, the column quarterbeats is ignored even if present. If you want to ignore it but don’t know the last MN, pass -1.
relativeroots (bool, optional) – By default, additional rows are added based on the column relativeroot. Pass False to prevent that.
mode_agnostic_adjacency (bool, optional) – By default (if relativeroots is True), additional rows are added for labels adjacent to temporarily tonicized roots, no matter if the mode is identical or not. For example, before and after a V/V, all V _and_ v labels will be grouped as adjacent segments. Pass False to group only labels with the same mode (only V labels in the example), or None to include no adjacency at all.
- ms3.transformations.notes2pcvs(notes, pitch_class_format='tpc', normalize=False, long=False, fillna=True, additional_group_cols=None, ensure_columns=None, logger=None)[source]#
- Parameters:
notes (pandas.DataFrame) – Note table to be transformed into a wide or long table of Pitch Class Vectors by grouping via the first (or only) index level. The DataFrame needs to contain at least the columns ‘duration_qb’ and ‘tpc’ or ‘midi’, depending on pitch_class_format.
pitch_class_format (str, optional) – Defines the type of pitch classes to use for the vectors.
‘tpc’ (default): tonal pitch class, such that -1=F, 0=C, 1=G etc.
‘name’: tonal pitch class as spelled pitch, e.g. ‘C’, ‘F#’, ‘Abb’ etc.
‘pc’: chromatic pitch classes where 0=C, 1=C#/Db, … 11=B/Cb.
‘midi’: original MIDI numbers; the result are pitch vectors, not pitch class vectors.
normalize (bool, optional) – By default, the PCVs contain absolute durations in quarter notes. Pass True to normalize the PCV for each group.
long (bool, optional) – By default, the resulting DataFrames have wide format, i.e. each row contains the PCV for one slice. Pass True if you need long format instead, i.e. with a non-unique pandas.IntervalIndex and two columns, [('tpc'|'midi'), 'duration_qb'] where the first column’s name depends on pitch_class_format.
fillna (bool, optional) – By default, if a Pitch class does not appear in a PCV, its value will be 0. Pass False if you want NA instead.
additional_group_cols ((list of) str) – If you would like to maintain some information from other columns of notes in additional index levels, pass their names.
ensure_columns (Iterable, optional) – By default, pitch classes that don’t appear don’t get a column. Pass a value if you want to ensure the presence of particular columns, even if empty. For example, if pitch_class_format='pc' you could pass ensure_columns=range(12).
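What a pitch class vector contains can be shown with a minimal standard-library sketch: durations are summed per chromatic pitch class (the ‘pc’ format, midi modulo 12) and optionally normalized. The note data and the function name pitch_class_vector are hypothetical, and the sketch handles a single group only rather than grouping by index level.

```python
from collections import defaultdict

# Hypothetical notes as (midi, duration_qb) pairs.
notes = [(60, 1.0), (64, 1.0), (67, 2.0), (72, 0.5)]


def pitch_class_vector(notes, normalize=False):
    """Sum durations per chromatic pitch class; optionally scale them to sum to 1."""
    pcv = defaultdict(float)
    for midi, duration_qb in notes:
        pcv[midi % 12] += duration_qb
    total = sum(pcv.values())
    if normalize and total:
        return {pc: duration / total for pc, duration in pcv.items()}
    return dict(pcv)


print(pitch_class_vector(notes))
print(pitch_class_vector(notes, normalize=True))
```

Pitch classes that never occur are simply absent here; the fillna and ensure_columns options described above address exactly that situation in the tabular result.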
- ms3.transformations.resolve_all_relative_numerals(at, additional_columns=None, inplace=False)[source]#
Resolves Roman numerals that include slash notation such as ‘#vii/ii’ => ‘#i’ or ‘V/V/V’ => ‘VI’ in a major and ‘#VI’ in a minor key. The function expects the columns [‘globalkey_is_minor’, ‘localkey_is_minor’] to be present. The former is necessary only if the column ‘localkey’ is present and needs resolving. Execution will be slightly faster if performed on the entire DataFrame rather than using transform_multiple().
- Parameters:
at (pandas.DataFrame) – Annotation table.
additional_columns (str or list) – By default, the function resolves, if present, the columns [‘relativeroot’, ‘pedal’] but here you can name other columns, too. They will be resolved based on the localkey’s mode.
inplace (bool, optional) – By default, a manipulated copy of at is returned. Pass True to mutate instead.
- ms3.transformations.segment_by_adjacency_groups(df, cols, na_values='group', group_keys=False, logger=None)[source]#
Drop exact adjacent repetitions within one or a combination of several feature columns and adapt the IntervalIndex and the column ‘duration_qb’ accordingly. Uses: adjacency_groups(), reduce_dataframe_duration_to_first_row()
- Parameters:
df (pandas.DataFrame) – DataFrame to be reduced, expected to contain the column duration_qb. In order to use the result as a segmentation, it should have a pandas.IntervalIndex.
cols (list) – Feature columns whose exact, adjacent repetitions should be grouped into a segment, keeping only the first row.
na_values ((list of) str or Any, optional) – Either pass a list of equal length as cols or a single value that is passed to adjacency_groups() for each. Not dealing with NA values will lead to wrongly grouped segments. The default option is the safest.
- ‘group’ creates individual groups for NA values
- ‘backfill’ or ‘bfill’ groups NA values with the subsequent group
- ‘pad’, ‘ffill’ groups NA values with the preceding group
- Any other value works like ‘group’, with the difference that the created groups will be named with this value.
group_keys (bool, optional) – By default, the grouped values will be returned as an appended MultiIndex, differentiating groups via ascending integers. If you want to duplicate the columns’ values, e.g. to account for a custom filling value for na_values, pass True. Beware that this most often results in non-unique index levels.
- Returns:
Reduced DataFrame with updated ‘duration_qb’ column and pandas.IntervalIndex on the first level (if present).
- Return type:
pandas.DataFrame
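To make the reduction concrete, here is a minimal sketch on plain dictionaries rather than a DataFrame. It only illustrates the idea of keeping the first row of each run and summing ‘duration_qb’; the function merge_adjacent_repetitions and the ‘chord’ column are hypothetical, and NA handling and the IntervalIndex are left out.

```python
from itertools import groupby


def merge_adjacent_repetitions(rows, key):
    """Keep the first row of each run of equal adjacent `key` values and
    extend its 'duration_qb' to span the whole run."""
    merged = []
    for _, run in groupby(rows, key=lambda row: row[key]):
        run = list(run)
        first = dict(run[0])  # copy, so the input rows stay untouched
        first["duration_qb"] = sum(row["duration_qb"] for row in run)
        merged.append(first)
    return merged


rows = [
    {"quarterbeats": 0.0, "duration_qb": 1.0, "chord": "I"},
    {"quarterbeats": 1.0, "duration_qb": 1.0, "chord": "I"},
    {"quarterbeats": 2.0, "duration_qb": 2.0, "chord": "V"},
    {"quarterbeats": 4.0, "duration_qb": 1.0, "chord": "I"},
]
segments = merge_adjacent_repetitions(rows, "chord")
print([(s["chord"], s["duration_qb"]) for s in segments])
# [('I', 2.0), ('V', 2.0), ('I', 1.0)]
```

Note that the final ‘I’ is not merged with the first two because it is not adjacent to them. The sketch also assumes that the rows follow each other without gaps.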
- ms3.transformations.segment_by_criterion(df: DataFrame, boolean_mask: Series | array, warn_na: bool = False, logger=None) → DataFrame[source]#
Drop all rows where the boolean mask does not match, and adapt the IntervalIndex and the column ‘duration_qb’ accordingly.
- Parameters:
df – DataFrame to be reduced, expected to come with the column duration_qb and a pandas.IntervalIndex.
boolean_mask – Boolean mask where every True value starts a new segment.
warn_na – If the boolean mask starts with any number of False values, this first group will be missing from the result. Set warn_na to True if you want the logger to emit a warning in this case.
- Returns:
Reduced DataFrame with updated ‘duration_qb’ column and pandas.IntervalIndex on the first level.
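The behaviour of the mask, including the dropped leading rows, can be sketched on plain dictionaries. This is an illustration under simplifying assumptions, not the library code; segment_by_mask is a hypothetical name and the rows are assumed to be contiguous.

```python
def segment_by_mask(rows, mask):
    """Start a new segment at every True; rows before the first True are dropped."""
    segments = []
    for row, starts_segment in zip(rows, mask):
        if starts_segment:
            segments.append(dict(row))  # copy the row that opens the segment
        elif segments:
            # A False row is absorbed into the segment that is currently open.
            segments[-1]["duration_qb"] += row["duration_qb"]
    return segments


rows = [
    {"quarterbeats": 0.0, "duration_qb": 1.0},
    {"quarterbeats": 1.0, "duration_qb": 1.0},
    {"quarterbeats": 2.0, "duration_qb": 2.0},
    {"quarterbeats": 4.0, "duration_qb": 1.0},
]
result = segment_by_mask(rows, [False, True, False, True])
print(result)
```

Because the mask starts with False, the first row belongs to no segment and is missing from the result, which is the situation warn_na is meant to flag.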
- ms3.transformations.segment_by_interval_index(df, idx, truncate=True)[source]#
Segment a DataFrame into chunks based on a given IntervalIndex.
- Parameters:
df (pandas.DataFrame) – DataFrame that has a pandas.IntervalIndex to allow for its segmentation.
idx (pandas.IntervalIndex or pandas.MultiIndex) – Intervals by which to segment df. The index will be prepended to differentiate between segments. If idx is a pandas.MultiIndex, the first level is expected to be a pandas.IntervalIndex.
truncate (bool, optional) – By default, the intervals of the segmented DataFrame will be cut off at segment boundaries and the events’ ‘duration_qb’ will be adapted accordingly. Pass False to prevent that and duplicate overlapping events without adapting their intervals and ‘duration_qb’.
- Returns:
A copy of df where the index levels of idx have been prepended and only rows of df with overlapping intervals are included.
- Return type:
pandas.DataFrame
- ms3.transformations.slice_df(df: DataFrame, quarters_per_slice: float | None = None) → Dict[Interval, DataFrame][source]#
Returns a sliced version of the DataFrame. Slices appear in the IntervalIndex and the contained events’ durations within the slice are shown in the column ‘duration_qb’.
- Parameters:
df (pandas.DataFrame) – The DataFrame is expected to come with an IntervalIndex and contain the columns ‘quarterbeats’ and ‘duration_qb’. Those can be obtained through Parse.get_lists(interval_index=True) or Parse.iter_transformed(interval_index=True).
quarters_per_slice (float, optional) – By default, the slices have variable size, from onset to onset. If you pass a value, the slices will have that constant size, measured in quarter notes. For example, pass 1.0 for all slices to have size 1 quarter.
- Return type:
Dict[Interval, DataFrame]
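For the constant-size case, the slicing amounts to clipping each event to the slices it overlaps. Below is a rough sketch of that idea using tuples instead of a DataFrame; slice_events and the (onset, duration, label) layout are assumptions for illustration, and the default onset-to-onset slicing is not covered.

```python
def slice_events(events, quarters_per_slice):
    """Map (start, end) slices to the events sounding in them, with each
    event's duration clipped to the slice."""
    end_of_piece = max(onset + duration for onset, duration, _ in events)
    slices = {}
    start = 0.0
    while start < end_of_piece:
        end = start + quarters_per_slice
        clipped = []
        for onset, duration, label in events:
            # Length of the intersection of [onset, onset + duration) and [start, end).
            overlap = min(end, onset + duration) - max(start, onset)
            if overlap > 0:
                clipped.append((label, overlap))
        slices[(start, end)] = clipped
        start = end
    return slices


events = [(0.0, 1.5, "C4"), (1.5, 0.5, "E4"), (2.0, 1.0, "G4")]
slices = slice_events(events, quarters_per_slice=1.0)
for interval, content in slices.items():
    print(interval, content)
```

With a slice size of 1.0, the first note is split across the first two slices, contributing 1.0 and 0.5 quarters respectively.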
- ms3.transformations.transform_multiple(df, func, level=-1, logger=None, **kwargs)[source]#
Applies transformation(s) separately to concatenated pieces that can be differentiated by index level(s).
- Parameters:
df (pandas.DataFrame) – Concatenated tables with pandas.MultiIndex.
func (Callable or str) – Function to be applied to the individual tables. For convenience, you can pass strings to call the standard transformers for a particular table type. For example, pass ‘annotations’ to call transform_annotations().
level (int or list) – Argument passed to pandas.DataFrame.groupby(). Defaults to -1, resulting in a GroupBy by all levels except the last. Conversely, you can pass, for instance, 2 to group by the first two levels.
kwargs – Keyword arguments passed to func.
- Return type:
- ms3.transformations.transform_annotations(at, groupby_features=None, resolve_relative=False)[source]#
Wrapper for applying several transformations to an annotation table.
- Parameters:
at (pandas.DataFrame) – Annotation table corresponding to a single piece.
groupby_features (str or list) – Argument features passed to group_annotations_by_features().
resolve_relative (bool) – Resolves slash notation (e.g. ‘vii/V’) from Roman numerals in the columns [‘localkey’, ‘relativeroot’, ‘pedal’].
- Return type:
- ms3.transformations.transpose_notes_to_localkey(notes)[source]#
Transpose the columns ‘tpc’ and ‘midi’ such that they reflect the local key as if it were C major/minor. This operation is typically required for creating pitch class profiles. Uses: transform(), name2fifths(), roman_numeral2fifths()
- Parameters:
notes (pandas.DataFrame) – DataFrame that has at least the columns [‘globalkey’, ‘localkey’, ‘tpc’, ‘midi’].
- Returns:
A copy of notes where the columns ‘tpc’ and ‘midi’ are shifted in such a way that tpc=0 and midi=60 match the local tonic (e.g. for the local key A major/minor, each pitch A will have tpc=0 and midi % 12 = 0).
- Return type:
pandas.DataFrame
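The arithmetic behind this transposition can be sketched for a single note. This is an assumption-laden illustration, not the library's code: transpose_to_local_tonic is a hypothetical helper, the local tonic is given directly as a tonal pitch class on the line of fifths (C = 0, G = 1, A = 3, ...), and the MIDI pitch is simply shifted downward by the tonic's pitch class.

```python
def transpose_to_local_tonic(tpc, midi, tonic_tpc):
    """Shift one note so that the local tonic lands on tpc=0 and on pitch class 0."""
    tonic_pc = (7 * tonic_tpc) % 12  # each fifth adds 7 semitones
    return tpc - tonic_tpc, midi - tonic_pc


# Local key A major/minor: the tonic A has tpc=3 and pitch class 9.
print(transpose_to_local_tonic(tpc=3, midi=69, tonic_tpc=3))  # A4
print(transpose_to_local_tonic(tpc=7, midi=73, tonic_tpc=3))  # C#5
```

Here A4 becomes (0, 60) and C#5 becomes (4, 64), i.e. the major third above the tonic reads like an E over C.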
- ms3.transformations.transform_columns(df, func, columns=None, param2col=None, inplace=False, **kwargs)[source]#
Wrapper function to use transform() on df[columns], leaving the other columns untouched.
- Parameters:
df (pandas.DataFrame) – DataFrame where columns (or column combinations) work as function arguments.
func (callable) – Function you want to apply to all elements in columns.
columns (list) – Columns to which you want to apply func.
param2col (dict or list, optional) – Mapping from parameter names of func to column names. If you pass a list of column names, the columns’ values are passed as positional arguments. Pass None if you want to use all columns as positional arguments.
inplace (bool, optional) – Pass True if you want to mutate df rather than getting an altered copy.
**kwargs – Keyword arguments for transform().
- ms3.transformations.transform_note_columns(df, to, note_cols=['chord_tones', 'added_tones', 'bass_note', 'root'], minor_col='localkey_is_minor', inplace=False, logger=None, **kwargs)[source]#
Turns columns with line-of-fifth tonal pitch classes into another representation.
Uses: transform_columns()
- Parameters:
df (pandas.DataFrame) – DataFrame where columns (or column combinations) work as function arguments.
to ({'name', 'iv', 'pc', 'sd', 'rn'}) – The tone representation that you want to get from the note_cols.
- ‘name’: Note names. Should only be used if the stacked fifths actually represent absolute tonal pitch classes rather than intervals over the local tonic. In other words, make sure to use ‘name’ only if 0 means C rather than I.
- ‘iv’: Intervals such that 0 = ‘P1’, 1 = ‘P5’, 4 = ‘M3’, -3 = ‘m3’, 6 = ‘A4’, -6 = ‘D5’ etc.
- ‘pc’: (Relative) chromatic pitch class, or distance from tonic in semitones.
- ‘sd’: Scale degrees such that 0 = ‘1’, -1 = ‘4’, -2 = ‘b7’ in major, ‘7’ in minor etc. This representation requires a boolean column minor_col which is True in those rows where the stacks of fifths occur in a local minor context and False for the others. Alternatively, if all pitches are in the same mode or you simply want to express them as degrees of a particular mode, you can pass the boolean keyword argument minor.
- ‘rn’: Roman numerals such that 0 = ‘I’, -2 = ‘bVII’ in major, ‘VII’ in minor etc. Requires boolean ‘minor’ values, see ‘sd’.
note_cols (list, optional) – List of columns that hold integers or collections of integers that represent stacks of fifths (0 = tonal center, 1 = fifth above, -1 = fourth above, etc).
minor_col (str, optional) – If to is ‘sd’ or ‘rn’, specify a boolean column where the value is True in those rows where the stacks of fifths occur in a local minor context and False for the others.
**kwargs – Keyword arguments for transform().
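The target representations listed for the parameter to are all simple functions of a position on the line of fifths. As a minimal sketch (the function names are made up for illustration and are not part of the library), three of them could be computed like this:

```python
def fifths_to_pc(fifths: int) -> int:
    """'pc': distance from the tonic in semitones (each fifth adds 7)."""
    return (7 * fifths) % 12


def fifths_to_name(fifths: int) -> str:
    """'name': note name, valid only if 0 stands for C."""
    n_sharps, index = divmod(fifths + 1, 7)
    accidentals = "#" * n_sharps if n_sharps >= 0 else "b" * -n_sharps
    return "FCGDAEB"[index] + accidentals


def fifths_to_sd(fifths: int, minor: bool = False) -> str:
    """'sd': scale degree relative to a major or minor scale."""
    # Natural degrees span -1..5 fifths in major and -4..2 fifths in minor.
    order, offset = ("6374152", 4) if minor else ("4152637", 1)
    n_sharps, index = divmod(fifths + offset, 7)
    accidentals = "#" * n_sharps if n_sharps >= 0 else "b" * -n_sharps
    return accidentals + order[index]


for fifths in (0, 1, -1, -2, 6):
    print(fifths, fifths_to_name(fifths), fifths_to_pc(fifths),
          fifths_to_sd(fifths), fifths_to_sd(fifths, minor=True))
```

The value -2, for instance, comes out as ‘Bb’, pitch class 10, ‘b7’ in major and ‘7’ in minor, in line with the examples given for ‘sd’ above.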
- ms3.transformations.transpose_chord_tones_by_localkey(df, by_global=False)[source]#
Returns a copy of the expanded table where the scale degrees in the chord tone columns have been transposed by localkey (i.e. they express all chord tones as scale degrees of the globalkey) or, if by_global is set to True, additionally by globalkey (i.e., chord tones as tonal pitch classes TPC).
- Parameters:
df (pandas.DataFrame) – Expanded labels with chord tone columns.
by_global (bool) – By default, the transformed chord tone columns express chord tones as scale degrees (or intervals) of the global tonic. If set to True, they correspond to tonal pitch classes and can be further transformed to note names using transform_note_columns().
- Return type:
pandas.DataFrame
The commandline interface#
The library offers you the following commands. Add the flag -h to one of them to learn about its parameters.
usage: ms3 [-h] [--version]
{add,check,compare,convert,empty,extract,metadata,review,transform,update,precommit}
...
Positional Arguments#
- action
Possible choices: add, check, compare, convert, empty, extract, metadata, review, transform, update, precommit
The action that you want to perform.
Named Arguments#
- --version
show program’s version number and exit
Sub-commands#
add#
Add labels from annotation tables to scores.
ms3 add [-h] [--ask] [--use {expanded,labels}] [-d DIR] [-o OUT_DIR] [-n] [-a]
[-i REGEX] [-e REGEX] [-f REGEX] [-m [PATH]] [--reviewed]
[--files PATHs [PATHs ...]] [--iterative] [-l {c, e, w, i, d}]
[--log [LOG]] [-t] [-v] [-s SUFFIX] [--replace]
Named Arguments#
- --ask
If several files are available for the selected facet (default: ‘expanded’, see --use), I will pick one automatically. Add --ask if you want to select yourself which ones to compare with the scores.
Default: False
- --use
Possible choices: expanded, labels
Which type of labels you want to compare with the ones in the score. Defaults to ‘expanded’, i.e., DCML labels. Set --use labels to use other labels available as TSV and set --ask if several sets of labels are available that you want to choose from.
Default: 'expanded'
- -d, --dir
Folder(s) that will be scanned for input files. Defaults to current working directory if no individual files are passed via -f.
Default: the current working directory
- -o, --out
Output directory. For conversion, an absolute path will result in a copy of the original sub-folder structure, whereas a relative path will contain all converted files next to each other.
- -n, --nonrecursive
Treat DIR as single corpus even if it contains corpus directories itself.
Default: False
- -a, --all
By default, only files listed in the ‘piece’ column of a ‘metadata.tsv’ file are parsed. With this option, all files will be parsed.
Default: False
- -i, --include
Select only files whose names include this string or regular expression.
- -e, --exclude
Any files or folders (and their subfolders) including this regex will be disregarded. By default, files including ‘_reviewed’ or starting with . or _ or ‘concatenated’ are excluded.
- -f, --folders
Select only folders whose names include this string or regular expression.
- -m, --musescore
Command or path of your MuseScore 3 executable. -m by itself will set ‘auto’ (attempt to use standard path for your system). Other shortcuts are -m win, -m mac, and -m mscore (for Linux).
- --reviewed
By default, review files and folders are excluded from parsing. With this option, they will be included, too.
Default: False
- --files
(Deprecated) The paths are expected to be within DIR. They will be converted into a view that includes only the indicated files. This is equivalent to specifying the file names as a regex via --include (assuming that file names are unique amongst corpora).
- --iterative
Do not use all available CPU cores in parallel to speed up batch jobs.
Default: False
- -l, --level
Choose how many log messages you want to see: c (none), e, w, i, d (maximum)
Default: 'i'
- --log
Can be a file path or directory path. Relative paths are interpreted relative to the current directory.
- -t, --test
No data is written to disk.
Default: False
- -v, --verbose
Show more output such as files discarded from parsing.
Default: False
- -s, --suffix
Suffix of the new scores with inserted labels. Defaults to _annotated.
Default: '_annotated'
- --replace
Remove existing labels from the scores prior to adding. Like calling ms3 empty first.
Default: False
check#
Parse MSCX files and look for errors. In particular, check DCML harmony labels for syntactic correctness.
ms3 check [-h] [--ignore_scores] [--ignore_labels] [--fail]
[--ignore_metronome] [-d DIR] [-o OUT_DIR] [-n] [-a] [-i REGEX]
[-e REGEX] [-f REGEX] [-m [PATH]] [--reviewed]
[--files PATHs [PATHs ...]] [--iterative] [-l {c, e, w, i, d}]
[--log [LOG]] [-t] [-v]
Named Arguments#
- --ignore_scores
Don’t check scores for encoding errors.
Default: False
- --ignore_labels
Don’t check DCML labels for syntactic correctness.
Default: False
- --fail
If you pass this argument, the process will deliberately fail with an AssertionError when there are any mistakes.
Default: False
- --ignore_metronome
Pass this flag if you want the check to pass (not fail) even if there is a warning about a missing metronome mark in the first bar of the score.
Default: False
- -d, --dir
Folder(s) that will be scanned for input files. Defaults to current working directory if no individual files are passed via -f.
Default: the current working directory
- -o, --out
Output directory. For conversion, an absolute path will result in a copy of the original sub-folder structure, whereas a relative path will contain all converted files next to each other.
- -n, --nonrecursive
Treat DIR as single corpus even if it contains corpus directories itself.
Default: False
- -a, --all
By default, only files listed in the ‘piece’ column of a ‘metadata.tsv’ file are parsed. With this option, all files will be parsed.
Default: False
- -i, --include
Select only files whose names include this string or regular expression.
- -e, --exclude
Any files or folders (and their subfolders) including this regex will be disregarded. By default, files including ‘_reviewed’ or starting with . or _ or ‘concatenated’ are excluded.
- -f, --folders
Select only folders whose names include this string or regular expression.
- -m, --musescore
Command or path of your MuseScore 3 executable. -m by itself will set ‘auto’ (attempt to use standard path for your system). Other shortcuts are -m win, -m mac, and -m mscore (for Linux).
- --reviewed
By default, review files and folders are excluded from parsing. With this option, they will be included, too.
Default: False
- --files
(Deprecated) The paths are expected to be within DIR. They will be converted into a view that includes only the indicated files. This is equivalent to specifying the file names as a regex via --include (assuming that file names are unique amongst corpora).
- --iterative
Do not use all available CPU cores in parallel to speed up batch jobs.
Default: False
- -l, --level
Choose how many log messages you want to see: c (none), e, w, i, d (maximum)
Default: 'i'
- --log
Can be a file path or directory path. Relative paths are interpreted relative to the current directory.
- -t, --test
No data is written to disk.
Default: False
- -v, --verbose
Show more output such as files discarded from parsing.
Default: False
compare#
For MSCX files for which annotation tables exist, create another MSCX file with a coloured label comparison if differences are found.
ms3 compare [-h] [--ask] [--use {expanded,labels}] [--flip] [--safe] [--force]
[-d DIR] [-o OUT_DIR] [-n] [-a] [-i REGEX] [-e REGEX] [-f REGEX]
[-m [PATH]] [--reviewed] [--files PATHs [PATHs ...]] [--iterative]
[-l {c, e, w, i, d}] [--log [LOG]] [-t] [-v] [-c GIT_REVISION]
[-s SUFFIX]
Named Arguments#
- --ask
If several files are available for the selected facet (default: ‘expanded’, see --use), I will pick one automatically. Add --ask if you want to select yourself which ones to compare with the scores.
Default: False
- --use
Possible choices: expanded, labels
Which type of labels you want to compare with the ones in the score. Defaults to ‘expanded’, i.e., DCML labels. Set --use labels to use other labels available as TSV and set --ask if several sets of labels are available that you want to choose from.
Default: 'expanded'
- --flip
Pass this flag to treat the annotation tables as if updating the scores instead of the other way around, effectively resulting in a swap of the colors in the output files.
Default: False
- --safe
Don’t overwrite existing files.
Default: True
- --force
Output comparison files even when no differences are found.
Default: False
- -d, --dir
Folder(s) that will be scanned for input files. Defaults to current working directory if no individual files are passed via -f.
Default: the current working directory
- -o, --out
Output directory. For conversion, an absolute path will result in a copy of the original sub-folder structure, whereas a relative path will contain all converted files next to each other.
- -n, --nonrecursive
Treat DIR as single corpus even if it contains corpus directories itself.
Default: False
- -a, --all
By default, only files listed in the ‘piece’ column of a ‘metadata.tsv’ file are parsed. With this option, all files will be parsed.
Default: False
- -i, --include
Select only files whose names include this string or regular expression.
- -e, --exclude
Any files or folders (and their subfolders) including this regex will be disregarded. By default, files including ‘_reviewed’ or starting with . or _ or ‘concatenated’ are excluded.
- -f, --folders
Select only folders whose names include this string or regular expression.
- -m, --musescore
Command or path of your MuseScore 3 executable. -m by itself will set ‘auto’ (attempt to use standard path for your system). Other shortcuts are -m win, -m mac, and -m mscore (for Linux).
- --reviewed
By default, review files and folders are excluded from parsing. With this option, they will be included, too.
Default: False
- --files
(Deprecated) The paths are expected to be within DIR. They will be converted into a view that includes only the indicated files. This is equivalent to specifying the file names as a regex via --include (assuming that file names are unique amongst corpora).
- --iterative
Do not use all available CPU cores in parallel to speed up batch jobs.
Default: False
- -l, --level
Choose how many log messages you want to see: c (none), e, w, i, d (maximum)
Default: 'i'
- --log
Can be a file path or directory path. Relative paths are interpreted relative to the current directory.
- -t, --test
No data is written to disk.
Default: False
- -v, --verbose
Show more output such as files discarded from parsing.
Default: False
- -c, --compare
By default, the _reviewed file displays removed labels in red and added labels in green, compared to the version currently represented in the present TSV files, if any. If instead you want a comparison with the TSV files from another Git commit, additionally pass its specifier, e.g. ‘HEAD~3’, <branch-name>, <commit SHA> etc. LATEST_VERSION is accepted as a revision specifier and will result in a comparison with the TSV files at the tag with the highest version number (falling back to HEAD if no tags have been assigned to the repository).
Default: ''
- -s, --suffix
Suffix of the newly created comparison files. Defaults to _compared.
Default: '_compared'
convert#
Use your local install of MuseScore to convert MuseScore files.
ms3 convert [-h] [-d DIR] [-o OUT_DIR] [-n] [-a] [-i REGEX] [-e REGEX]
[-f REGEX] [-m [PATH]] [--reviewed] [--files PATHs [PATHs ...]]
[--iterative] [-l {c, e, w, i, d}] [--log [LOG]] [-t] [-v]
[-s SUFFIX] [--format FORMAT]
[--extensions EXTENSIONS [EXTENSIONS ...]] [--safe]
Named Arguments#
- -d, --dir
Folder(s) that will be scanned for input files. Defaults to current working directory if no individual files are passed via -f.
Default: the current working directory
- -o, --out
Output directory. For conversion, an absolute path will result in a copy of the original sub-folder structure, whereas a relative path will contain all converted files next to each other.
- -n, --nonrecursive
Treat DIR as single corpus even if it contains corpus directories itself.
Default: False
- -a, --all
By default, only files listed in the ‘piece’ column of a ‘metadata.tsv’ file are parsed. With this option, all files will be parsed.
Default: False
- -i, --include
Select only files whose names include this string or regular expression.
- -e, --exclude
Any files or folders (and their subfolders) including this regex will be disregarded. By default, files including ‘_reviewed’ or starting with . or _ or ‘concatenated’ are excluded.
- -f, --folders
Select only folders whose names include this string or regular expression.
- -m, --musescore
Command or path of your MuseScore 3 executable. -m by itself will set ‘auto’ (attempt to use standard path for your system). Other shortcuts are -m win, -m mac, and -m mscore (for Linux).
- --reviewed
By default, review files and folders are excluded from parsing. With this option, they will be included, too.
Default: False
- --files
(Deprecated) The paths are expected to be within DIR. They will be converted into a view that includes only the indicated files. This is equivalent to specifying the file names as a regex via --include (assuming that file names are unique amongst corpora).
- --iterative
Do not use all available CPU cores in parallel to speed up batch jobs.
Default: False
- -l, --level
Choose how many log messages you want to see: c (none), e, w, i, d (maximum)
Default: 'i'
- --log
Can be a file path or directory path. Relative paths are interpreted relative to the current directory.
- -t, --test
No data is written to disk.
Default: False
- -v, --verbose
Show more output such as files discarded from parsing.
Default: False
- -s, --suffix
Suffix of the converted files. Defaults to an empty string, i.e. no suffix.
Default: ''
- --format
Output format of converted files. Defaults to mscx. Other options are {png, svg, pdf, mscz, wav, mp3, flac, ogg, musicxml, mxl, mid}
Default: 'mscx'
- --extensions
Those file extensions that you want to be converted, separated by spaces. Defaults to mscx mscz
Default: ['mscx', 'mscz']
- --safe
Don’t overwrite existing files.
Default: True
empty#
Remove harmony annotations and store the MuseScore files without them.
ms3 empty [-h] [-d DIR] [-o OUT_DIR] [-n] [-a] [-i REGEX] [-e REGEX]
[-f REGEX] [-m [PATH]] [--reviewed] [--files PATHs [PATHs ...]]
[--iterative] [-l {c, e, w, i, d}] [--log [LOG]] [-t] [-v]
[-s SUFFIX]
Named Arguments#
- -d, --dir
Folder(s) that will be scanned for input files. Defaults to current working directory if no individual files are passed via -f.
Default: the current working directory
- -o, --out
Output directory. For conversion, an absolute path will result in a copy of the original sub-folder structure, whereas a relative path will contain all converted files next to each other.
- -n, --nonrecursive
Treat DIR as single corpus even if it contains corpus directories itself.
Default: False
- -a, --all
By default, only files listed in the ‘piece’ column of a ‘metadata.tsv’ file are parsed. With this option, all files will be parsed.
Default: False
- -i, --include
Select only files whose names include this string or regular expression.
- -e, --exclude
Any files or folders (and their subfolders) including this regex will be disregarded. By default, files including ‘_reviewed’ or starting with . or _ or ‘concatenated’ are excluded.
- -f, --folders
Select only folders whose names include this string or regular expression.
- -m, --musescore
Command or path of your MuseScore 3 executable. -m by itself will set ‘auto’ (attempt to use standard path for your system). Other shortcuts are -m win, -m mac, and -m mscore (for Linux).
- --reviewed
By default, review files and folders are excluded from parsing. With this option, they will be included, too.
Default: False
- --files
(Deprecated) The paths are expected to be within DIR. They will be converted into a view that includes only the indicated files. This is equivalent to specifying the file names as a regex via --include (assuming that file names are unique amongst corpora).
- --iterative
Do not use all available CPU cores in parallel to speed up batch jobs.
Default: False
- -l, --level
Choose how many log messages you want to see: c (none), e, w, i, d (maximum)
Default: 'i'
- --log
Can be a file path or directory path. Relative paths are interpreted relative to the current directory.
- -t, --test
No data is written to disk.
Default: False
- -v, --verbose
Show more output such as files discarded from parsing.
Default: False
- -s, --suffix
Suffix of the new scores with removed labels. Defaults to _clean.
Default: '_clean'
extract#
Extract selected information from MuseScore files and store it in TSV files.
ms3 extract [-h] [-M [folder]] [-N [folder]] [-R [folder]] [-L [folder]]
[-X [folder]] [-F [folder]] [-E [folder]] [-C [folder]]
[-J [folder]] [-D [suffix]] [-p] [--raw] [-u] [--interval_index]
[--corpuswise] [-d DIR] [-o OUT_DIR] [-n] [-a] [-i REGEX]
[-e REGEX] [-f REGEX] [-m [PATH]] [--reviewed]
[--files PATHs [PATHs ...]] [--iterative] [-l {c, e, w, i, d}]
[--log [LOG]] [-t] [-v]
Named Arguments#
- -M, --measures
Folder where to store TSV files with measure information needed for tasks such as unfolding repetitions.
- -N, --notes
Folder where to store TSV files with information on all notes.
- -R, --rests
Folder where to store TSV files with information on all rests.
- -L, --labels
Folder where to store TSV files with information on all annotation labels.
- -X, --expanded
Folder where to store TSV files with expanded DCML labels.
- -F, --form_labels
Folder where to store TSV files with all form labels.
- -E, --events
Folder where to store TSV files with all events (chords, rests, articulation, etc.) without further processing.
- -C, --chords
Folder where to store TSV files with <chord> tags, i.e. groups of notes in the same voice with identical onset and duration. The tables include lyrics, dynamics, articulation, staff- and system texts, tempo marking, spanners, and thoroughbass figures.
- -J, --joined_chords
Like -C except that all Chords are substituted with the actual Notes they contain. This is useful, for example, for relating slurs to the notes they group, or bass figures to their bass notes.
- -D, --metadata
Set -D to update the ‘metadata.tsv’ files of the respective corpora with the parsed scores. Add a suffix if you want to update ‘metadata{suffix}.tsv’ instead.
- -p, --positioning
When extracting labels, include manually shifted position coordinates in order to restore them when re-inserting.
Default: False
- --raw
When extracting labels, leave chord symbols encoded instead of turning them into a single column of strings.
Default: True
- -u, --unfold
Unfold the repeats for all stored DataFrames.
Default: False
- --interval_index
Prepend a column with [start, end) intervals to the TSV files.
Default: False
- --corpuswise
Parse one corpus after the other rather than all at once.
Default: False
- -d, --dir
Folder(s) that will be scanned for input files. Defaults to current working directory if no individual files are passed via -f.
Default: the current working directory
- -o, --out
Output directory. For conversion, an absolute path will result in a copy of the original sub-folder structure, whereas a relative path will contain all converted files next to each other.
- -n, --nonrecursive
Treat DIR as single corpus even if it contains corpus directories itself.
Default: False
- -a, --all
By default, only files listed in the ‘piece’ column of a ‘metadata.tsv’ file are parsed. With this option, all files will be parsed.
Default: False
- -i, --include
Select only files whose names include this string or regular expression.
- -e, --exclude
Any files or folders (and their subfolders) including this regex will be disregarded. By default, files including ‘_reviewed’ or starting with . or _ or ‘concatenated’ are excluded.
- -f, --folders
Select only folders whose names include this string or regular expression.
- -m, --musescore
Command or path of your MuseScore 3 executable. -m by itself will set ‘auto’ (attempt to use standard path for your system). Other shortcuts are -m win, -m mac, and -m mscore (for Linux).
- --reviewed
By default, review files and folders are excluded from parsing. With this option, they will be included, too.
Default: False
- --files
(Deprecated) The paths are expected to be within DIR. They will be converted into a view that includes only the indicated files. This is equivalent to specifying the file names as a regex via --include (assuming that file names are unique amongst corpora).
- --iterative
Do not use all available CPU cores in parallel to speed up batch jobs.
Default: False
- -l, --level
Choose how many log messages you want to see: c (none), e, w, i, d (maximum)
Default: 'i'
- --log
Can be a file path or directory path. Relative paths are interpreted relative to the current directory.
- -t, --test
No data is written to disk.
Default: False
- -v, --verbose
Show more output such as files discarded from parsing.
Default: False
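When the extraction is triggered from a script rather than typed by hand, the command line can be assembled from the flags documented above. The following sketch only builds and prints the command; the helper build_extract_command and the folder name corpora are hypothetical, and nothing is executed.

```python
import shlex


def build_extract_command(directory, notes=False, expanded=False, unfold=False, test=False):
    """Assemble an `ms3 extract` call from a few of the documented flags."""
    command = ["ms3", "extract", "-d", directory]
    if notes:
        command.append("-N")  # TSV files with information on all notes
    if expanded:
        command.append("-X")  # TSV files with expanded DCML labels
    if unfold:
        command.append("-u")  # unfold the repeats
    if test:
        command.append("-t")  # no data is written to disk
    return command


command = build_extract_command("corpora", notes=True, expanded=True, test=True)
print(shlex.join(command))
# ms3 extract -d corpora -N -X -t
```

The resulting list could be handed to subprocess.run() as is. Keeping -t in place while experimenting is a cheap way to see what would be parsed without writing any files.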
metadata#
Update MSCX files with changes made to metadata.tsv (created via ms3 extract -D [-a]).
ms3 metadata [-h] [-d DIR] [-o OUT_DIR] [-n] [-a] [-i REGEX] [-e REGEX]
[-f REGEX] [-m [PATH]] [--reviewed] [--files PATHs [PATHs ...]]
[--iterative] [-l {c, e, w, i, d}] [--log [LOG]] [-t] [-v]
[-s SUFFIX] [-p] [--instrumentation] [--empty] [--remove]
Named Arguments#
- -d, --dir
Folder(s) that will be scanned for input files. Defaults to current working directory if no individual files are passed via -f.
Default:
/home/docs/checkouts/readthedocs.org/user_builds/ms3/checkouts/stable/docs- -o, --out
Output directory. For conversion, an absolute path will result in a copy of the original sub-folder structure, whereas a relative path will contain all converted files next to each other.
- -n, --nonrecursive
Treat DIR as single corpus even if it contains corpus directories itself.
Default:
False- -a, --all
By default, only files listed in the ‘piece’ column of a ‘metadata.tsv’ file are parsed. With this option, all files will be parsed.
Default:
False- -i, --include
Select only files whose names include this string or regular expression.
- -e, --exclude
Any files or folders (and their subfolders) including this regex will be disregarded.By default, files including ‘_reviewed’ or starting with . or _ or ‘concatenated’ are excluded.
- -f, --folders
Select only folders whose names include this string or regular expression.
- -m, --musescore
- Command or path of your MuseScore 3 executable. -m by itself will set ‘auto’ (attempt to use standard
path for your system). Other shortcuts are -m win, -m mac, and -m mscore (for Linux).
- --reviewed
By default, review files and folder are excluded from parsing. With this option, they will be included, too.
Default:
False- --files
(Deprecated) The paths are expected to be within DIR. They will be converted into a view that includes only the indicated files. This is equivalent to specifying the file names as a regex via –include (assuming that file names are unique amongst corpora.
- --iterative
Do not use all available CPU cores in parallel to speed up batch jobs.
Default:
False
- -l, --level
Choose how many log messages you want to see: c (none), e, w, i, d (maximum)
Default:
'i'
- --log
Can be a file path or directory path. Relative paths are interpreted relative to the current directory.
- -t, --test
No data is written to disk.
Default:
False
- -v, --verbose
Show more output such as files discarded from parsing.
Default:
False
- -s, --suffix
Suffix of the new scores with updated metadata fields.
- -p, --prelims
Pass this flag if, in addition to updating metadata fields, you also want score headers to be updated from the columns title_text, subtitle_text, composer_text, lyricist_text, part_name_text.
Default:
False
- --instrumentation
Pass this flag to update the score’s instrumentation based on changed values from ‘staff_<i>_instrument’ columns.
Default:
False
- --empty
Set this flag to also allow empty values to be used for overwriting existing ones.
Default:
False
- --remove
Set this flag to remove non-default metadata fields that are not columns in the metadata.tsv file anymore.
Default:
False
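How the -i/--include and -e/--exclude filters combine can be pictured with a small sketch. It is an illustration only, not ms3's implementation: the function name is_selected and the pattern DEFAULT_EXCLUDE are hypothetical, and the pattern encodes one reading of the default described above (names containing ‘_reviewed’, or starting with ‘.’, ‘_’ or ‘concatenated’).

```python
import re

# Hypothetical pattern approximating the documented default exclusion:
# names containing "_reviewed", or starting with ".", "_" or "concatenated".
DEFAULT_EXCLUDE = r"_reviewed|^(\.|_|concatenated)"


def is_selected(name, include=None, exclude=DEFAULT_EXCLUDE):
    """Return True if a file or folder name passes both filters."""
    if exclude is not None and re.search(exclude, name):
        return False  # names matching the exclude regex are dropped first
    if include is not None and not re.search(include, name):
        return False  # with an include regex, the name has to match it
    return True


names = ["K279-1.mscx", "K279-1_reviewed.mscx", ".git", "concatenated_notes.tsv"]
print([n for n in names if is_selected(n)])
```

Passing exclude=None in this sketch switches the default exclusion off, which is roughly the effect that --reviewed has for review files.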
review#
Extract facets, check labels, and create _reviewed files.
ms3 review [-h] [--ignore_scores] [--ignore_labels] [--fail]
[--ignore_metronome] [--ask] [--use {expanded,labels}] [--flip]
[--safe] [--force] [-M [folder]] [-N [folder]] [-R [folder]]
[-L [folder]] [-X [folder]] [-F [folder]] [-E [folder]]
[-C [folder]] [-J [folder]] [-D [suffix]] [-p] [--raw] [-u]
[--interval_index] [--corpuswise] [-d DIR] [-o OUT_DIR] [-n] [-a]
[-i REGEX] [-e REGEX] [-f REGEX] [-m [PATH]] [--reviewed]
[--files PATHs [PATHs ...]] [--iterative] [-l {c, e, w, i, d}]
[--log [LOG]] [-t] [-v] [-c [GIT_REVISION]] [--threshold THRESHOLD]
Named Arguments#
- --ignore_scores
Don’t check scores for encoding errors.
Default:
False
- --ignore_labels
Don’t check DCML labels for syntactic correctness.
Default:
False
- --fail
If you pass this argument, the process will deliberately fail with an AssertionError when there are any mistakes.
Default:
False
- --ignore_metronome
Pass this flag if you want the check to pass (not fail) even if there is a warning about a missing metronome mark in the first bar of the score.
Default:
False
- --ask
If several files are available for the selected facet (default: ‘expanded’, see --use), one will be picked automatically. Add --ask if you want to select which ones to compare with the scores.
Default:
False
- --use
Possible choices: expanded, labels
Which type of labels you want to compare with the ones in the score. Defaults to ‘expanded’, i.e., DCML labels. Set --use labels to use other labels available as TSV and set --ask if several sets of labels are available that you want to choose from.
Default:
'expanded'
- --flip
Pass this flag to treat the annotation tables as if updating the scores instead of the other way around, effectively resulting in a swap of the colors in the output files.
Default:
False
- --safe
Don’t overwrite existing files.
Default:
True
- --force
Output comparison files even when no differences are found.
Default:
False
- -M, --measures
Folder where to store TSV files with measure information needed for tasks such as unfolding repetitions.
- -N, --notes
Folder where to store TSV files with information on all notes.
- -R, --rests
Folder where to store TSV files with information on all rests.
- -L, --labels
Folder where to store TSV files with information on all annotation labels.
- -X, --expanded
Folder where to store TSV files with expanded DCML labels.
- -F, --form_labels
Folder where to store TSV files with all form labels.
- -E, --events
Folder where to store TSV files with all events (chords, rests, articulation, etc.) without further processing.
- -C, --chords
Folder where to store TSV files with <chord> tags, i.e. groups of notes in the same voice with identical onset and duration. The tables include lyrics, dynamics, articulation, staff- and system texts, tempo marking, spanners, and thoroughbass figures.
- -J, --joined_chords
Like -C except that all Chords are substituted with the actual Notes they contain. This is useful, for example, for relating slurs to the notes they group, or bass figures to their bass notes.
- -D, --metadata
Set -D to update the ‘metadata.tsv’ files of the respective corpora with the parsed scores. Add a suffix if you want to update ‘metadata{suffix}.tsv’ instead.
- -p, --positioning
When extracting labels, include manually shifted position coordinates in order to restore them when re-inserting.
Default:
False
- --raw
When extracting labels, leave chord symbols encoded instead of turning them into a single column of strings.
Default:
True
- -u, --unfold
Unfold the repeats for all stored DataFrames.
Default:
False
- --interval_index
Prepend a column with [start, end) intervals to the TSV files.
Default:
False
- --corpuswise
Parse one corpus after the other rather than all at once.
Default:
False
- -d, --dir
Folder(s) that will be scanned for input files. Defaults to the current working directory if no individual files are passed via --files.
Default:
current working directory
- -o, --out
Output directory. For conversion, an absolute path will result in a copy of the original sub-folder structure, whereas a relative path will contain all converted files next to each other.
- -n, --nonrecursive
Treat DIR as single corpus even if it contains corpus directories itself.
Default:
False
- -a, --all
By default, only files listed in the ‘piece’ column of a ‘metadata.tsv’ file are parsed. With this option, all files will be parsed.
Default:
False
- -i, --include
Select only files whose names include this string or regular expression.
- -e, --exclude
Any files or folders (and their subfolders) including this regex will be disregarded. By default, files including ‘_reviewed’ or starting with . or _ or ‘concatenated’ are excluded.
- -f, --folders
Select only folders whose names include this string or regular expression.
- -m, --musescore
Command or path of your MuseScore 3 executable. -m by itself will set ‘auto’ (attempt to use standard path for your system). Other shortcuts are -m win, -m mac, and -m mscore (for Linux).
- --reviewed
By default, review files and folders are excluded from parsing. With this option, they will be included, too.
Default:
False
- --files
(Deprecated) The paths are expected to be within DIR. They will be converted into a view that includes only the indicated files. This is equivalent to specifying the file names as a regex via --include (assuming that file names are unique amongst corpora).
- --iterative
Do not use all available CPU cores in parallel to speed up batch jobs.
Default:
False
- -l, --level
Choose how many log messages you want to see: c (none), e, w, i, d (maximum)
Default:
'i'
- --log
Can be a file path or directory path. Relative paths are interpreted relative to the current directory.
- -t, --test
No data is written to disk.
Default:
False
- -v, --verbose
Show more output such as files discarded from parsing.
Default:
False
- -c, --compare
Pass -c if you want the _reviewed file to display removed labels in red and added labels in green, compared to the version currently represented in the present TSV files, if any. If instead you want a comparison with the TSV files from another Git commit, additionally pass its specifier, e.g. ‘HEAD~3’, <branch-name>, <commit SHA> etc. LATEST_VERSION is accepted as a revision specifier and will result in a comparison with the TSV files at the tag with the highest version number (falling back to HEAD if no tags have been assigned to the repository).
- --threshold
Harmony segments where the ratio of non-chord tones vs. chord tones lies above this threshold will be printed in a warning and will cause the check to fail if the --fail flag is set. Defaults to 0.6 (3:2).
Default:
0.6
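The --threshold description equates 0.6 with a ratio of 3:2, which suggests that the value is the share of non-chord tones among all tones of a segment (3 non-chord tones against 2 chord tones gives 3/5 = 0.6). The following sketch spells out that reading. It is an interpretation, not ms3's code, and the function name exceeds_threshold is hypothetical.

```python
def exceeds_threshold(n_non_chord_tones, n_chord_tones, threshold=0.6):
    """Return True if the share of non-chord tones lies above the threshold.

    Assumption: the threshold is compared against
    non-chord tones / (non-chord tones + chord tones).
    """
    total = n_non_chord_tones + n_chord_tones
    if total == 0:
        return False  # an empty segment cannot exceed anything
    return n_non_chord_tones / total > threshold


# 3 non-chord tones vs. 2 chord tones sits exactly on the default threshold,
# 4 vs. 2 lies above it.
print(exceeds_threshold(3, 2), exceeds_threshold(4, 2))
```

Under this reading, a segment is reported only when it lies strictly above the threshold, matching the wording "lies above this threshold".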
transform#
Concatenate and transform TSV data from one or several corpora. Available transformations are unfolding repeats and adding an interval index.
ms3 transform [-h] [-d DIR] [-o OUT_DIR] [-n] [-a] [-i REGEX] [-e REGEX]
[-f REGEX] [-m [PATH]] [--reviewed] [--files PATHs [PATHs ...]]
[--iterative] [-l {c, e, w, i, d}] [--log [LOG]] [-t] [-v] [-M]
[-N] [-R] [-L] [-X] [-F [folder]] [-E] [-C] [-D]
[-s [SUFFIX ...]] [-u] [--interval_index] [--resources] [--safe]
[--uncompressed] [--dirty]
Named Arguments#
- -d, --dir
Folder(s) that will be scanned for input files. Defaults to the current working directory if no individual files are passed via --files.
Default:
current working directory
- -o, --out
Output directory. For conversion, an absolute path will result in a copy of the original sub-folder structure, whereas a relative path will contain all converted files next to each other.
- -n, --nonrecursive
Treat DIR as single corpus even if it contains corpus directories itself.
Default:
False
- -a, --all
By default, only files listed in the ‘piece’ column of a ‘metadata.tsv’ file are parsed. With this option, all files will be parsed.
Default:
False
- -i, --include
Select only files whose names include this string or regular expression.
- -e, --exclude
Any files or folders (and their subfolders) including this regex will be disregarded. By default, files including ‘_reviewed’ or starting with . or _ or ‘concatenated’ are excluded.
- -f, --folders
Select only folders whose names include this string or regular expression.
- -m, --musescore
Command or path of your MuseScore 3 executable. -m by itself will set ‘auto’ (attempt to use standard path for your system). Other shortcuts are -m win, -m mac, and -m mscore (for Linux).
- --reviewed
By default, review files and folders are excluded from parsing. With this option, they will be included, too.
Default:
False
- --files
(Deprecated) The paths are expected to be within DIR. They will be converted into a view that includes only the indicated files. This is equivalent to specifying the file names as a regex via --include (assuming that file names are unique amongst corpora).
- --iterative
Do not use all available CPU cores in parallel to speed up batch jobs.
Default:
False
- -l, --level
Choose how many log messages you want to see: c (none), e, w, i, d (maximum)
Default:
'i'
- --log
Can be a file path or directory path. Relative paths are interpreted relative to the current directory.
- -t, --test
No data is written to disk.
Default:
False
- -v, --verbose
Show more output such as files discarded from parsing.
Default:
False
- -M, --measures
Concatenate measures TSVs for all selected pieces.
Default:
False
- -N, --notes
Concatenate notes TSVs for all selected pieces.
Default:
False
- -R, --rests
Concatenate rests TSVs for all selected pieces (use ms3 extract -R to create those).
Default:
False
- -L, --labels
Concatenate raw harmony label TSVs for all selected pieces (use ms3 extract -L to create those).
Default:
False
- -X, --expanded
Concatenate expanded DCML label TSVs for all selected pieces.
Default:
False
- -F, --form_labels
Concatenate form label TSVs for all selected pieces.
- -E, --events
Concatenate events TSVs (notes, rests, articulation, etc.) for all selected pieces (use ms3 extract -E to create those).
Default:
False
- -C, --chords
Concatenate chords TSVs (<chord> tags group notes in the same voice with identical onset and duration) including lyrics, dynamics, articulation, staff- and system texts, tempo marking, spanners, and thoroughbass figures, for all selected pieces (use ms3 extract -C to create those).
Default:
False
- -D, --metadata
Output ‘concatenated_metadata.tsv’ with one row per selected piece.
Default:
False
- -s, --suffix
Pass -s to use standard suffixes or -s SUFFIX to choose your own. In the latter case they will be assigned to the extracted aspects in the order in which they are listed above (capital letter arguments).
- -u, --unfold
Unfold the repeats for all concatenated DataFrames.
Default:
False
- --interval_index
Prepend a column with [start, end) intervals to the TSV files.
Default:
False
- --resources
Store the concatenated DataFrames as TSV files with resource descriptors rather than in a ZIP with a package descriptor.
Default:
False
- --safe
Don’t overwrite existing files.
Default:
True
- --uncompressed
Store the transformed files as uncompressed TSVs rather than writing them into a ZIP file.
Default:
False
- --dirty
Allows overriding the ‘This repository is dirty’ blocker.
Default:
False
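The [start, end) notation used for --interval_index denotes half-open intervals: the start position belongs to the interval, the end position does not. The sketch below illustrates this with plain Python. The helper names and the onset/duration values are made up for the example, and it does not reproduce how ms3 builds the column.

```python
from fractions import Fraction


def half_open_intervals(onsets, durations):
    """Pair each onset with its end point, giving [start, end) intervals."""
    return [(onset, onset + duration) for onset, duration in zip(onsets, durations)]


def contains(interval, position):
    """Half-open membership test: start is included, end is excluded."""
    start, end = interval
    return start <= position < end


# Three consecutive events, expressed in fractions of a whole note.
onsets = [Fraction(0), Fraction(1, 4), Fraction(1, 2)]
durations = [Fraction(1, 4), Fraction(1, 4), Fraction(1, 2)]
intervals = half_open_intervals(onsets, durations)

# The boundary 1/4 belongs to the second interval only.
print([contains(iv, Fraction(1, 4)) for iv in intervals])
```

Because the end is excluded, adjacent events never overlap, so every position falls into at most one interval per voice.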
update#
Convert MSCX files to the latest MuseScore version and move all chord annotations to the Roman Numeral Analysis layer. This command overwrites existing files!!!
ms3 update [-h] [-d DIR] [-o OUT_DIR] [-n] [-a] [-i REGEX] [-e REGEX]
[-f REGEX] [-m [PATH]] [--reviewed] [--files PATHs [PATHs ...]]
[--iterative] [-l {c, e, w, i, d}] [--log [LOG]] [-t] [-v]
[-s SUFFIX] [--above] [--safe] [--staff STAFF] [--type TYPE]
Named Arguments#
- -d, --dir
Folder(s) that will be scanned for input files. Defaults to the current working directory if no individual files are passed via --files.
Default:
current working directory
- -o, --out
Output directory. For conversion, an absolute path will result in a copy of the original sub-folder structure, whereas a relative path will contain all converted files next to each other.
- -n, --nonrecursive
Treat DIR as single corpus even if it contains corpus directories itself.
Default:
False
- -a, --all
By default, only files listed in the ‘piece’ column of a ‘metadata.tsv’ file are parsed. With this option, all files will be parsed.
Default:
False
- -i, --include
Select only files whose names include this string or regular expression.
- -e, --exclude
Any files or folders (and their subfolders) including this regex will be disregarded. By default, files including ‘_reviewed’ or starting with . or _ or ‘concatenated’ are excluded.
- -f, --folders
Select only folders whose names include this string or regular expression.
- -m, --musescore
Command or path of your MuseScore 3 executable. -m by itself will set ‘auto’ (attempt to use standard path for your system). Other shortcuts are -m win, -m mac, and -m mscore (for Linux).
- --reviewed
By default, review files and folders are excluded from parsing. With this option, they will be included, too.
Default:
False
- --files
(Deprecated) The paths are expected to be within DIR. They will be converted into a view that includes only the indicated files. This is equivalent to specifying the file names as a regex via --include (assuming that file names are unique amongst corpora).
- --iterative
Do not use all available CPU cores in parallel to speed up batch jobs.
Default:
False
- -l, --level
Choose how many log messages you want to see: c (none), e, w, i, d (maximum)
Default:
'i'
- --log
Can be a file path or directory path. Relative paths are interpreted relative to the current directory.
- -t, --test
No data is written to disk.
Default:
False
- -v, --verbose
Show more output such as files discarded from parsing.
Default:
False
- -s, --suffix
Add this suffix to the filename of every new file.
- --above
Display Roman Numerals above the system.
Default:
False
- --safe
Only moves labels if their temporal positions stay intact.
Default:
False
- --staff
Which staff you want to move the annotations to. 1=upper staff; -1=lowest staff (default)
Default:
-1
- --type
Defaults to 1, i.e., moves labels to the Roman Numeral layer. Other types have not been tested!
Default:
1
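The --staff values 1 (upper staff) and -1 (lowest staff) resemble counting from the top and from the bottom, respectively. The sketch below shows how such a value could be resolved for a score with a known number of staves. The helper resolve_staff is hypothetical, and treating other negative values like Python's negative indices is an assumption, since the text only defines 1 and -1.

```python
def resolve_staff(staff, n_staves):
    """Translate a --staff value into a 1-based staff number (hypothetical helper).

    Positive values count from the top (1 = upper staff); negative values are
    assumed to count from the bottom (-1 = lowest staff).
    """
    if staff == 0 or abs(staff) > n_staves:
        raise ValueError(f"staff {staff} does not exist in a score with {n_staves} staves")
    return staff if staff > 0 else n_staves + staff + 1


# In a piano score with two staves, the default -1 points to the second staff.
print(resolve_staff(-1, 2))
```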
precommit#
Like ms3 review but also adds the resulting files to the Git index.
ms3 precommit [-h] [--ignore_scores] [--ignore_labels] [--fail]
[--ignore_metronome] [--ask] [--use {expanded,labels}] [--flip]
[--safe] [--force] [-M [folder]] [-N [folder]] [-R [folder]]
[-L [folder]] [-X [folder]] [-F [folder]] [-E [folder]]
[-C [folder]] [-J [folder]] [-D [suffix]] [-p] [--raw] [-u]
[--interval_index] [--corpuswise] [-d DIR] [-o OUT_DIR] [-n]
[-a] [-i REGEX] [-e REGEX] [-f REGEX] [-m [PATH]] [--reviewed]
[--files PATHs [PATHs ...]] [--iterative] [-l {c, e, w, i, d}]
[--log [LOG]] [-t] [-v] [-c [GIT_REVISION]]
[--threshold THRESHOLD]
FILE [FILE ...]
Positional Arguments#
- FILE
Shadows the --files argument because pre-commit passes files as positional arguments.
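The idea of a positional argument standing in for --files can be sketched with argparse. This is a minimal illustration of the technique, not the actual ms3 parser: the program name is made up, and storing the positional values under the attribute name files is an assumption about how downstream code might read them.

```python
import argparse

# Hypothetical parser, reduced to the one argument discussed here.
parser = argparse.ArgumentParser(prog="precommit-sketch")

# pre-commit appends the staged file paths to the hook command, so they
# arrive as positional arguments. Storing them as `files` lets code that
# previously read the value of --files keep working unchanged.
parser.add_argument("files", metavar="FILE", nargs="+")

args = parser.parse_args(["MS3/K279-1.mscx", "MS3/K279-2.mscx"])
print(args.files)
```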
Named Arguments#
- --ignore_scores
Don’t check scores for encoding errors.
Default:
False
- --ignore_labels
Don’t check DCML labels for syntactic correctness.
Default:
False
- --fail
If you pass this argument, the process will deliberately fail with an AssertionError when there are any mistakes.
Default:
False
- --ignore_metronome
Pass this flag if you want the check to pass (not fail) even if there is a warning about a missing metronome mark in the first bar of the score.
Default:
False
- --ask
If several files are available for the selected facet (default: ‘expanded’, see --use), one will be picked automatically. Add --ask if you want to select which ones to compare with the scores.
Default:
False
- --use
Possible choices: expanded, labels
Which type of labels you want to compare with the ones in the score. Defaults to ‘expanded’, i.e., DCML labels. Set --use labels to use other labels available as TSV and set --ask if several sets of labels are available that you want to choose from.
Default:
'expanded'
- --flip
Pass this flag to treat the annotation tables as if updating the scores instead of the other way around, effectively resulting in a swap of the colors in the output files.
Default:
False
- --safe
Don’t overwrite existing files.
Default:
True
- --force
Output comparison files even when no differences are found.
Default:
False
- -M, --measures
Folder where to store TSV files with measure information needed for tasks such as unfolding repetitions.
- -N, --notes
Folder where to store TSV files with information on all notes.
- -R, --rests
Folder where to store TSV files with information on all rests.
- -L, --labels
Folder where to store TSV files with information on all annotation labels.
- -X, --expanded
Folder where to store TSV files with expanded DCML labels.
- -F, --form_labels
Folder where to store TSV files with all form labels.
- -E, --events
Folder where to store TSV files with all events (chords, rests, articulation, etc.) without further processing.
- -C, --chords
Folder where to store TSV files with <chord> tags, i.e. groups of notes in the same voice with identical onset and duration. The tables include lyrics, dynamics, articulation, staff- and system texts, tempo marking, spanners, and thoroughbass figures.
- -J, --joined_chords
Like -C except that all Chords are substituted with the actual Notes they contain. This is useful, for example, for relating slurs to the notes they group, or bass figures to their bass notes.
- -D, --metadata
Set -D to update the ‘metadata.tsv’ files of the respective corpora with the parsed scores. Add a suffix if you want to update ‘metadata{suffix}.tsv’ instead.
- -p, --positioning
When extracting labels, include manually shifted position coordinates in order to restore them when re-inserting.
Default:
False
- --raw
When extracting labels, leave chord symbols encoded instead of turning them into a single column of strings.
Default:
True
- -u, --unfold
Unfold the repeats for all stored DataFrames.
Default:
False
- --interval_index
Prepend a column with [start, end) intervals to the TSV files.
Default:
False
- --corpuswise
Parse one corpus after the other rather than all at once.
Default:
False
- -d, --dir
Folder(s) that will be scanned for input files. Defaults to the current working directory if no individual files are passed via --files.
Default:
current working directory
- -o, --out
Output directory. For conversion, an absolute path will result in a copy of the original sub-folder structure, whereas a relative path will contain all converted files next to each other.
- -n, --nonrecursive
Treat DIR as single corpus even if it contains corpus directories itself.
Default:
False
- -a, --all
By default, only files listed in the ‘piece’ column of a ‘metadata.tsv’ file are parsed. With this option, all files will be parsed.
Default:
False
- -i, --include
Select only files whose names include this string or regular expression.
- -e, --exclude
Any files or folders (and their subfolders) including this regex will be disregarded. By default, files including ‘_reviewed’ or starting with . or _ or ‘concatenated’ are excluded.
- -f, --folders
Select only folders whose names include this string or regular expression.
- -m, --musescore
Command or path of your MuseScore 3 executable. -m by itself will set ‘auto’ (attempt to use standard path for your system). Other shortcuts are -m win, -m mac, and -m mscore (for Linux).
- --reviewed
By default, review files and folders are excluded from parsing. With this option, they will be included, too.
Default:
False
- --files
(Deprecated) The paths are expected to be within DIR. They will be converted into a view that includes only the indicated files. This is equivalent to specifying the file names as a regex via --include (assuming that file names are unique amongst corpora).
- --iterative
Do not use all available CPU cores in parallel to speed up batch jobs.
Default:
False
- -l, --level
Choose how many log messages you want to see: c (none), e, w, i, d (maximum)
Default:
'i'
- --log
Can be a file path or directory path. Relative paths are interpreted relative to the current directory.
- -t, --test
No data is written to disk.
Default:
False
- -v, --verbose
Show more output such as files discarded from parsing.
Default:
False
- -c, --compare
Pass -c if you want the _reviewed file to display removed labels in red and added labels in green, compared to the version currently represented in the present TSV files, if any. If instead you want a comparison with the TSV files from another Git commit, additionally pass its specifier, e.g. ‘HEAD~3’, <branch-name>, <commit SHA> etc. LATEST_VERSION is accepted as a revision specifier and will result in a comparison with the TSV files at the tag with the highest version number (falling back to HEAD if no tags have been assigned to the repository).
- --threshold
Harmony segments where the ratio of non-chord tones vs. chord tones lies above this threshold will be printed in a warning and will cause the check to fail if the --fail flag is set. Defaults to 0.6 (3:2).
Default:
0.6
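The LATEST_VERSION behaviour described for -c/--compare (use the tag with the highest version number, fall back to HEAD when there are no tags) can be sketched as follows. The function name and the assumed tag format (an optional "v" followed by dot-separated numbers) are illustrative and not taken from ms3.

```python
import re


def latest_version_tag(tags):
    """Return the tag with the highest version number, or "HEAD" if none qualifies.

    Assumption: version tags look like "v1.2.3" or "1.2.3"; other tags are ignored.
    """
    versioned = []
    for tag in tags:
        match = re.fullmatch(r"v?(\d+(?:\.\d+)*)", tag)
        if match:
            # Compare numerically so that 1.10 ranks above 1.9.
            version = tuple(int(part) for part in match.group(1).split("."))
            versioned.append((version, tag))
    if not versioned:
        return "HEAD"
    return max(versioned, key=lambda item: item[0])[1]


print(latest_version_tag(["v1.2.0", "v1.10.2", "v1.9.9", "draft"]))
print(latest_version_tag([]))
```

Comparing version components as integers rather than strings is the relevant design choice here, because a plain string sort would rank "v1.9.9" above "v1.10.2".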
Unittests#
ms3 has a test suite that uses the pytest library.
Install dependencies#
Install the library via pip install ms3[testing].
Configuring the tests#
In order to run the tests you need to
- clone the unittest_metacorpus including submodules (ask for permission)
- in the configuration file new_tests/conftest.py, change the value of CORPUS_DIR to the path containing your clone of the metacorpus (defaults to the user’s home directory)
- in the line below, copy the commit SHA of TEST_COMMIT, e.g. 51e4cb5, and check out your metacorpus to that commit (e.g., git checkout 51e4cb5).
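The configuration steps above can be summarised in a short sketch. CORPUS_DIR and TEST_COMMIT are the names mentioned in the text; their values here are placeholders, and the assumption that the clone keeps the folder name unittest_metacorpus inside CORPUS_DIR is not confirmed by the source. The script only assembles the checkout command and does not run it.

```python
import os

# Values as they might appear in new_tests/conftest.py (illustrative only).
CORPUS_DIR = "~"         # documented default: the user's home directory
TEST_COMMIT = "51e4cb5"  # example SHA from the text; use the one in your conftest.py

# Assumption: the clone keeps the repository's name inside CORPUS_DIR.
metacorpus_path = os.path.join(os.path.expanduser(CORPUS_DIR), "unittest_metacorpus")

# Command that checks the clone out at the commit the tests expect.
checkout_command = ["git", "-C", metacorpus_path, "checkout", TEST_COMMIT]
print(" ".join(checkout_command))
```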
Running the tests#
On the command line, go to your ms3 folder and call pytest new_tests. Alternatively, some IDEs allow
you to right-click on the folder new_tests and select something like Run pytest in new_tests.