CoordinateField, ComputedField, and custom loader configuration
How to Load Tabular Data
The fastest way to get music data into TimeToAlign! is through tabular loaders. If your data is in CSV or TSV format, you’re just 3 lines of code away from analysis.
What you’ll learn:
- Load music annotations from TSV/CSV files
- Access event counts, coordinate ranges, and metadata
- Create timelines from loaded data
- Create custom loaders with different extra_columns strategies
- Use Field for nested JSON column access
Time: 15 minutes
TL;DR
from timetoalign.loader.tabular import Ms3Loader

loader = Ms3Loader()
loader.load("beethoven.notes.tsv")
df = loader.events.to_pandas()  # Get as DataFrame
timeline = loader.create_timeline()  # Create Timeline
Setup
from timetoalign.testdata import ensure_data

BEETHOVEN = ensure_data("score") / "beethoven_woo71"
THORESEN = ensure_data("thoresen")

# Available files
{
    "Beethoven files": [f.name for f in BEETHOVEN.glob("WoO71.*.tsv")],
    "Thoresen files": [f.name for f in THORESEN.glob("*.tsv")],
}
Note: Both timelines represent the same 11 events, but in different coordinate systems:
- Physical: 0 - 142.5 seconds (audio time)
- Graphical: 10 - 760 pixels (image coordinates)
TimeToAlign! uses these dual representations to align graphical annotations with audio.
Advanced Features
The following sections cover advanced features for complex data loading scenarios.
Sometimes your data contains coordinates in multiple systems (e.g., seconds AND pixels). Use CoordinateField to parse any column as a proper coordinate struct with unit metadata, enabling:
- Multiple coordinate columns in one EventData
- C-Map creation from loaded coordinate pairs
- Full precision preservation with the Fraction number type
- Proper unit tracking per column (not just the primary unit)
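To see why an exact number type matters for coordinates, here is a small standard-library sketch. It uses Python's built-in `fractions.Fraction` only and makes no claim about how TimeToAlign! stores its Fraction coordinates internally; the duration values are made up for illustration.

```python
from fractions import Fraction

# Two made-up durations added as binary floats: the sum picks up a rounding error.
float_sum = 0.1 + 0.2

# The same durations as exact rationals: the sum is exactly 3/10.
exact_sum = Fraction(1, 10) + Fraction(2, 10)

print(float_sum == 0.3)              # False
print(exact_sum == Fraction(3, 10))  # True
```

With floats, small errors like this can accumulate over many events; exact rationals avoid that at the cost of slower arithmetic.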
The Thoresen data has both time coordinates (seconds) and pixel coordinates (in JSON):
from timetoalign.core import NumberType, TimeUnit  # noqa: E402, F811
from timetoalign.loader import CoordinateField, Field  # noqa: E402, F811
from timetoalign.loader.tabular import TsvLoader  # noqa: E402, F811


class MultiCoordinateLoader(TsvLoader):
    """Loader that extracts multiple coordinate columns.

    Primary coordinates in seconds, with additional x_pixels column.
    This enables creating C-Maps between coordinate systems.
    """

    # Primary coordinates: seconds
    id_column = "event_id"
    start_column = "start_time_sec"
    duration_column = "duration_sec"
    event_type_column = "event_type"
    _default_unit = TimeUnit.seconds
    coordinate_type = NumberType.float

    # Extra columns - mix of regular and coordinate columns
    extra_columns = [
        "image_filename",  # Regular string column
        # CoordinateField extracts x as a proper coordinate struct
        CoordinateField(
            "x_pixels",
            source=Field("rect_coords_json", "x"),  # Nested JSON access
            unit=TimeUnit.pixels,
        ),
    ]


multi = MultiCoordinateLoader()
multi.load(THORESEN / "thoresen_test.tsv")

# The x_pixels column is now a proper coordinate with unit metadata
multi.events.to_pandas()[["id", "start", "end", "x_pixels", "image_filename"]]
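Conceptually, nested JSON access such as Field("rect_coords_json", "x") amounts to parsing the JSON string in one column and walking down a key path. The sketch below shows that idea with the standard library only. The helper `extract_nested`, the sample row, and the JSON keys other than "x" are hypothetical and are not part of the TimeToAlign! API.

```python
import json

# Hypothetical row shaped like the tutorial's data; the values are made up.
row = {
    "event_id": "e1",
    "start_time_sec": "0.0",
    "rect_coords_json": '{"x": 10, "y": 40, "width": 75, "height": 20}',
}


def extract_nested(row, column, *keys):
    """Parse the JSON string stored in `column` and follow `keys` into it."""
    value = json.loads(row[column])
    for key in keys:
        value = value[key]
    return value


x_pixels = extract_nested(row, "rect_coords_json", "x")
print(x_pixels)
```

A loader still has to attach a unit and a number type to the extracted value, which is the part CoordinateField handles in the example above.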
With dual coordinates loaded, you can create Conversion Maps (C-Maps) to convert between coordinate systems. The loader’s create_cmap() method supports:
- TableMap (default): Point-to-point mapping with interpolation
- LinearMap: Fits a linear function y = ax + b
- ScalarMap: Fits a pure scaling y = ax
from timetoalign.maps import LinearMap  # noqa: E402

# Create a TableMap from start (seconds) -> x_pixels
table_cmap = multi.create_cmap("start", "x_pixels")

# Create a LinearMap (fits y = ax + b)
linear_cmap = multi.create_cmap("start", "x_pixels", map_type=LinearMap)

# Compare the two map types
{
    "TableMap": str(table_cmap),
    "LinearMap": str(linear_cmap),
    "5.0 seconds (TableMap)": f"{table_cmap(5.0):.1f} pixels",
    "5.0 seconds (LinearMap)": f"{linear_cmap(5.0):.1f} pixels",
}
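To make the difference between the three map types concrete, the following plain-Python sketch computes each kind of mapping by hand. It illustrates the underlying math under stated assumptions (piecewise-linear interpolation for the table, ordinary least squares for the two fits) and is not TimeToAlign!'s implementation. The function names and the sample points are hypothetical.

```python
from bisect import bisect_right

# Made-up (seconds, pixels) pairs, not the Thoresen values.
seconds = [0.0, 10.0, 20.0, 30.0]
pixels = [10.0, 110.0, 190.0, 310.0]


def table_lookup(x, xs, ys):
    """Piecewise-linear interpolation between neighbouring table points.

    Values outside the table are extrapolated from the first or last segment.
    """
    i = min(max(bisect_right(xs, x) - 1, 0), len(xs) - 2)
    t = (x - xs[i]) / (xs[i + 1] - xs[i])
    return ys[i] + t * (ys[i + 1] - ys[i])


def fit_linear(xs, ys):
    """Least-squares fit of y = a*x + b; returns (a, b)."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    a = sxy / sxx
    return a, mean_y - a * mean_x


def fit_scalar(xs, ys):
    """Least-squares fit of y = a*x with no offset; returns a."""
    return sum(x * y for x, y in zip(xs, ys)) / sum(x * x for x in xs)


a, b = fit_linear(seconds, pixels)
scale = fit_scalar(seconds, pixels)

print(f"table:  {table_lookup(5.0, seconds, pixels):.1f} pixels")
print(f"linear: {a * 5.0 + b:.1f} pixels")
print(f"scalar: {scale * 5.0:.1f} pixels")
```

The table lookup reproduces every input point exactly, while the two fits smooth over local deviations. Which behaviour is preferable depends on whether the coordinate pairs are trusted individually or are noisy samples of a simple relationship.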
group_by: Creating Child Timelines from Column Values
When your data contains events from multiple sources (e.g., multiple images, pages, or tracks), use group_by to automatically create child timelines for each unique value.
The Thoresen data has events from 5 different image files:
# Using the earlier 'auto' loader which has image_filename
from timetoalign.timelines import create_timeline  # noqa: E402

# Create timeline grouped by image filename
grouped_tl = create_timeline(auto, group_by="image_filename")
grouped_tl
# Each child timeline represents events from one image
{
    "parent_id": grouped_tl.id,
    "n_children": grouped_tl.n_children,
    "children": {
        child.id: len(child._events) if child._events else 0
        for _, child in grouped_tl.iter_children()
    },
}
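The grouping idea behind group_by can be sketched without the library: partition the event rows by the value of one column, so that each unique value yields one child collection. The rows and the helper `group_events` below are hypothetical and only illustrate the concept; they do not show how create_timeline builds its child timelines.

```python
from collections import defaultdict

# Made-up event rows; only the grouping column matters here.
events = [
    {"id": "e1", "image_filename": "page1.png"},
    {"id": "e2", "image_filename": "page1.png"},
    {"id": "e3", "image_filename": "page2.png"},
]


def group_events(rows, column):
    """Partition rows into one list per unique value of `column`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[column]].append(row)
    return dict(groups)


children = group_events(events, "image_filename")
counts = {name: len(rows) for name, rows in children.items()}
print(counts)
```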
Key Takeaway: Tabular loaders provide a declarative way to map CSV/TSV columns to TimeToAlign! events. Use Field for nested JSON access, CoordinateField for additional coordinate columns with unit tracking, and group_by for multi-source timelines.