GSData container

class gsply.gsdata.GSData(means, scales, quats, opacities, sh0, shN, _format=<factory>, masks=None, mask_names=None, _base=None)[source]

Bases: object

Gaussian Splatting data container.

This container holds Gaussian parameters, either as separate arrays or as zero-copy views into a single base array for maximum performance. Implemented as a mutable dataclass with direct attribute access.

Parameters:
means

(N, 3) - xyz positions

scales

(N, 3) - scale parameters
  • PLY format: log-scales (log(scale))

  • LINEAR format: linear scales (scale)

quats

(N, 4) - rotation quaternions

opacities

(N,) - opacity values
  • PLY format: logit-opacities (logit(opacity))

  • LINEAR format: linear opacities (opacity in [0, 1])

sh0

(N, 3) - DC spherical harmonics (always SH format)

shN

(N, K, 3) - Higher-order SH coefficients (K bands) (always SH format)

masks

(N,) or (N, L) - Boolean mask layers for filtering (None = no masks)

mask_names

list[str] - Names for each mask layer (None = unnamed layers)

_base

(N, P) - Private base array (keeps memory alive for views, None otherwise)

_format

FormatDict - Format tracking per attribute (type-safe TypedDict)
  • Format: {"scales": DataFormat.SCALES_PLY, "opacities": DataFormat.OPACITIES_PLY, …}

  • Scales: DataFormat.SCALES_PLY (log-scales) or DataFormat.SCALES_LINEAR (linear scales)

  • Opacities: DataFormat.OPACITIES_PLY (logit-opacities) or DataFormat.OPACITIES_LINEAR (linear opacities)

  • Colors: DataFormat.SH0_SH (sh0 as SH) or DataFormat.SH0_RGB (sh0 as RGB)

  • SH Order: DataFormat.SH_ORDER_0/1/2/3 (spherical harmonics degree for shN)

  • Positions/Rotations: DataFormat.MEANS_RAW (means) and DataFormat.QUATS_RAW (quats) - raw format

  • Always provided when creating GSData (auto-detected if not specified)

Mask Layers:
  • Single layer: masks shape (N,), mask_names = None or ["name"]

  • Multi-layer: masks shape (N, L), mask_names = ["name1", "name2", …]

  • Use add_mask_layer() to add named layers

  • Use combine_masks() to merge layers with AND/OR logic

  • Use apply_masks() to filter data using mask layers

Performance:
  • Zero-copy reads provide maximum performance

  • No memory overhead (views share memory with base)
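
The zero-copy claim can be illustrated with plain NumPy. This is a minimal sketch, not gsply's implementation: the 14-column layout and the column order are assumptions made for illustration only.

```python
import numpy as np

# Hypothetical packed layout for SH degree 0 (an assumption for illustration):
# 3 means + 3 sh0 + 1 opacity + 3 scales + 4 quats = 14 floats per Gaussian.
N, P = 5, 14
base = np.arange(N * P, dtype=np.float32).reshape(N, P)

# Basic slicing returns views, so no data is copied.
means = base[:, 0:3]
sh0 = base[:, 3:6]
opacities = base[:, 6]
scales = base[:, 7:10]
quats = base[:, 10:14]

# Every view shares its buffer with the base array.
shares = all(np.shares_memory(base, v) for v in (means, sh0, opacities, scales, quats))

# Writing through a view is visible in the base array.
means[0, 0] = -1.0
written_through = bool(base[0, 0] == -1.0)

# Column slices of a row-major array are not C-contiguous.
means_contiguous = means.flags["C_CONTIGUOUS"]
```

The last flag is the trade-off that make_contiguous() addresses: views cost no memory, but their rows are strided.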

Example

>>> data = gsply.plyread("scene.ply")
>>> print(f"Loaded {len(data)} Gaussians")
>>> # Add named mask layers (opacities read from PLY are logits, so 0.5 is a logit threshold)
>>> data.add_mask_layer("high_opacity", data.opacities > 0.5)
>>> data.add_mask_layer("foreground", data.means[:, 2] < 0)
>>> # Combine and apply
>>> filtered = data.apply_masks(mode="and")
get_sh_degree()[source]

Get SH degree from shN shape.

Return type:

int

Returns:

SH degree (0-3)
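
One way to derive the degree from the shN shape is the relation K = (d + 1)² − 1, which matches the band counts listed for the is_sh_order_* properties (0, 3, 8, 15). The helper below is a hypothetical sketch of that relation, not the library's code.

```python
def sh_degree_from_bands(k: int) -> int:
    """Map K (the middle axis of shN) to the SH degree via K = (d + 1)**2 - 1."""
    for degree in range(4):
        if (degree + 1) ** 2 - 1 == k:
            return degree
    raise ValueError(f"K={k} does not correspond to an SH degree in 0-3")


# K values for degrees 0 through 3.
degrees = [sh_degree_from_bands(k) for k in (0, 3, 8, 15)]
```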

property is_scales_ply: bool

Check if scales are in PLY format (log-scales).

Returns:

True if scales are log-scales

property is_scales_linear: bool

Check if scales are in linear format.

Returns:

True if scales are linear

property is_opacities_ply: bool

Check if opacities are in PLY format (logit-opacities).

Returns:

True if opacities are logit-opacities

property is_opacities_linear: bool

Check if opacities are in linear format [0, 1].

Returns:

True if opacities are linear

property is_sh0_sh: bool

Check if sh0 is in spherical harmonics format.

Returns:

True if sh0 is in SH format

property is_sh0_rgb: bool

Check if sh0 is in RGB color format.

Returns:

True if sh0 is in RGB format

property is_sh_order_0: bool

Check if SH degree is 0 (only sh0, no shN).

Returns:

True if SH degree is 0

property is_sh_order_1: bool

Check if SH degree is 1 (3 bands).

Returns:

True if SH degree is 1

property is_sh_order_2: bool

Check if SH degree is 2 (8 bands).

Returns:

True if SH degree is 2

property is_sh_order_3: bool

Check if SH degree is 3 (15 bands).

Returns:

True if SH degree is 3

property format_state: FormatDict

Get a read-only copy of the format state.

Returns a copy of the internal format dict for inspection. Use copy_format_from() to copy format between objects.

Returns:

Copy of the format dict (modifications won’t affect original)

Example

>>> data = gsply.plyread("scene.ply")
>>> fmt = data.format_state
>>> print(fmt)  # {'scales': DataFormat.SCALES_PLY, ...}
copy_format_from(other)[source]

Copy format tracking from another GSData object.

This is the public API for copying format state between objects. Use this instead of accessing the _format dict directly.

Parameters:

other (GSData) – Source GSData to copy format from

Return type:

None

Example

>>> # After processing that might lose format
>>> processed.copy_format_from(original)
with_format(**updates)[source]

Create a copy with updated format settings.

Returns a new GSData with the same data but updated format dict. This is useful for explicitly setting format after operations.

Parameters:

updates – Format updates (keys: scales, opacities, sh0, sh_order)

Return type:

GSData

Returns:

New GSData with updated format

Example

>>> # Mark data as having linear opacities after conversion
>>> linear_data = data.with_format(opacities=DataFormat.OPACITIES_LINEAR)
add_mask_layer(name, mask)[source]

Add a named boolean mask layer.

Parameters:
  • name (str) – Name for this mask layer

  • mask (ndarray) – Boolean array of shape (N,) where N is number of Gaussians

Raises:

ValueError – If mask shape doesn’t match data length or name already exists

Return type:

None

Example

>>> data.add_mask_layer("high_opacity", data.opacities > 0.5)
>>> data.add_mask_layer("foreground", data.means[:, 2] < 0)
>>> print(data.mask_names)  # ['high_opacity', 'foreground']
get_mask_layer(name)[source]

Get a mask layer by name.

Parameters:

name (str) – Name of the mask layer

Return type:

ndarray

Returns:

Boolean array of shape (N,)

Raises:

ValueError – If layer name not found

Example

>>> opacity_mask = data.get_mask_layer("high_opacity")
remove_mask_layer(name)[source]

Remove a mask layer by name.

Parameters:

name (str) – Name of the mask layer to remove

Raises:

ValueError – If layer name not found

Return type:

None

Example

>>> data.remove_mask_layer("foreground")
combine_masks(mode='and', layers=None)[source]

Combine mask layers using boolean logic.

Parameters:
  • mode (str) – Combination mode - "and" (every layer must pass) or "or" (at least one layer must pass)

  • layers (list[str] | None) – List of layer names to combine (None = use all layers)

Return type:

ndarray

Returns:

Combined boolean mask of shape (N,)

Raises:

ValueError – If no masks exist or invalid mode

Example

>>> # Combine all layers with AND
>>> mask = data.combine_masks(mode="and")
>>> filtered = data[mask]
>>>
>>> # Combine specific layers with OR
>>> mask = data.combine_masks(mode="or", layers=["high_opacity", "foreground"])
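
The AND/OR semantics can be sketched with NumPy alone. The arrays here are made-up sample values, and stacking the layers into an (N, L) array follows the shape given under "Mask Layers"; how gsply combines them internally is not shown in this reference.

```python
import numpy as np

opacities = np.array([0.9, 0.2, 0.7, 0.6], dtype=np.float32)
depth = np.array([-1.0, -2.0, 3.0, -0.5], dtype=np.float32)

# Two mask layers stacked column-wise into shape (N, L).
masks = np.stack([opacities > 0.5, depth < 0], axis=1)

and_mask = masks.all(axis=1)  # "and": every layer must pass
or_mask = masks.any(axis=1)   # "or": at least one layer must pass

# Boolean indexing keeps only the selected rows.
kept = opacities[and_mask]
```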
apply_masks(mode='and', layers=None, inplace=False)[source]

Apply mask layers to filter Gaussians.

Parameters:
  • mode (str) – Combination mode - "and" or "or"

  • layers (list[str] | None) – List of layer names to apply (None = all layers)

  • inplace (bool) – If True, modify self; if False, return filtered copy

Return type:

GSData

Returns:

Filtered GSData (self if inplace=True, new object if inplace=False)

Example

>>> # Filter using all mask layers (AND logic)
>>> filtered = data.apply_masks(mode="and")
>>>
>>> # Filter in-place using specific layers (OR logic)
>>> data.apply_masks(mode="or", layers=["opacity", "scale"], inplace=True)
consolidate()[source]

Consolidate separate arrays into a single base array.

This creates a _base array from separate arrays, which can improve performance for boolean masking operations and file writes.

Uses JIT-compiled parallel kernels for 2.8-5x faster interleaving compared to slice assignment.

Return type:

GSData

Returns:

New GSData with _base array, or self if already consolidated

Note

  • One-time cost: ~3ms per 400K Gaussians (JIT-optimized)

  • Benefit: 1.5x faster boolean masking, 36% faster writes

  • No benefit for slicing (actually slightly slower)

  • Use when doing many boolean mask operations or file writes
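
Conceptually, consolidation packs the per-field arrays into one (N, P) array and re-exposes each field as a view. The sketch below uses plain slice assignment, which is the baseline the JIT kernels are compared against, and an assumed column order; it is not the library's kernel.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 4
separate = {
    "means": rng.random((N, 3), dtype=np.float32),
    "scales": rng.random((N, 3), dtype=np.float32),
    "quats": rng.random((N, 4), dtype=np.float32),
    "opacities": rng.random((N, 1), dtype=np.float32),
}

# One allocation for the packed array; the column order is an assumption.
P = sum(arr.shape[1] for arr in separate.values())
base = np.empty((N, P), dtype=np.float32)

views = {}
col = 0
for name, arr in separate.items():
    width = arr.shape[1]
    base[:, col:col + width] = arr          # plain slice assignment
    views[name] = base[:, col:col + width]  # zero-copy view into base
    col += width

# A boolean mask now filters every field with a single indexing operation.
mask = np.array([True, False, True, False])
filtered_base = base[mask]
```

Filtering one packed array instead of several separate ones is a plausible reason for the boolean-masking benefit noted above.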

copy()[source]

Return a deep copy of the GSData.

Creates independent copies of all arrays, ensuring modifications to the copy won’t affect the original data.

Return type:

GSData

Returns:

A new GSData object with copied arrays

add(other)[source]

Concatenate two GSData objects along the Gaussian dimension.

Combines two GSData objects by stacking all Gaussians. Validates compatibility (same SH degree) and handles mask layer merging.

Performance: Highly optimized using pre-allocation + direct assignment
  • 1.10x faster for 10K Gaussians (412 M/s)

  • 1.56x faster for 100K Gaussians (106 M/s)

  • 1.90x faster for 500K Gaussians (99 M/s)

For GPU operations, use GSTensor.add() which is 18x faster on large datasets.

Note: For concatenating multiple arrays, use GSData.concatenate() which is 5.74x faster than repeated add() calls due to single allocation.

Parameters:

other (GSData) – Another GSData object to concatenate

Return type:

GSData

Returns:

New GSData object with combined Gaussians

Raises:

ValueError – If SH degrees don’t match or formats don’t match

Example

>>> data1 = gsply.plyread("scene1.ply")  # 100K Gaussians
>>> data2 = gsply.plyread("scene2.ply")  # 50K Gaussians
>>> combined = data1.add(data2)  # 150K Gaussians
>>> # Or use + operator
>>> combined = data1 + data2  # Same result
>>> print(len(combined))  # 150000

See also

concatenate: Bulk concatenation of multiple arrays (5.74x faster)

static concatenate(arrays)[source]

Bulk concatenate multiple GSData objects.

Significantly more efficient than repeated add() calls:
  • Single allocation instead of N-1 intermediate allocations

  • 5.74x faster for concatenating 10 arrays

  • Reduces total memory copies

Parameters:

arrays (list[GSData]) – List of GSData objects to concatenate

Return type:

GSData

Returns:

New GSData object with all Gaussians combined

Raises:

ValueError – If list is empty, SH degrees don’t match, or formats don’t match

Example

>>> scenes = [gsply.plyread(f"scene{i}.ply") for i in range(10)]
>>> combined = GSData.concatenate(scenes)  # 5.74x faster than loop!

Performance comparison (10 arrays of 10K Gaussians):

>>> # Slow: Pairwise add() - 5.990 ms
>>> result = scenes[0]
>>> for scene in scenes[1:]:
...     result = result.add(scene)
>>>
>>> # Fast: Bulk concatenate - 1.044 ms (5.74x faster!)
>>> result = GSData.concatenate(scenes)
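
The single-allocation argument can be seen in a NumPy-only sketch. The chunks are small made-up arrays standing in for one field of several GSData objects; gsply itself is not used here.

```python
import numpy as np

# Three chunks of different lengths, each filled with its own index.
chunks = [np.full((n, 3), i, dtype=np.float32) for i, n in enumerate((2, 3, 4))]

# Pairwise: each step allocates a new array and re-copies everything so far.
pairwise = chunks[0]
for chunk in chunks[1:]:
    pairwise = np.concatenate([pairwise, chunk], axis=0)

# Bulk: allocate the final size once, then copy each chunk exactly once.
total = sum(len(chunk) for chunk in chunks)
bulk = np.empty((total, 3), dtype=np.float32)
offset = 0
for chunk in chunks:
    bulk[offset:offset + len(chunk)] = chunk
    offset += len(chunk)
```

Both paths produce the same array; the bulk path skips the intermediate allocations and the repeated copying of earlier chunks.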
make_contiguous(inplace=True)[source]

Convert all arrays to contiguous memory layout for better performance.

When data is loaded from PLY files via _base arrays, all field arrays (means, scales, etc.) are non-contiguous views with poor cache locality, causing 1.5-45x performance overhead for operations.

Conversion Cost (measured):
  • 1K Gaussians: 0.02 ms

  • 10K Gaussians: 0.14 ms

  • 100K Gaussians: 2.2 ms

  • 1M Gaussians: 25 ms

Per-Operation Speedup (100K Gaussians):
  • argmax(): 45.5x faster

  • max/min(): 18-19x faster

  • sum/mean(): 6-7x faster

  • std(): 2.7x faster

  • element-wise: 2-4x faster

Break-Even Analysis:
  • < 8 operations: DON’T convert (overhead not justified)

  • >= 8 operations: CONVERT (speedup outweighs cost)

  • >= 100 operations: CRITICAL (7.9x total speedup)

Real-World Scenarios (100K Gaussians):
  • Light processing (3 ops): 2.4x slower (DON’T convert)

  • Iterative processing (10x): 2.1x faster (CONVERT!)

  • Heavy computation (100x): 7.9x faster (CONVERT!)

Memory: Zero overhead (same total memory, just reorganized)

Parameters:

inplace (bool) – If True, modify arrays in-place and clear _base (default). If False, return new GSData with contiguous arrays.

Return type:

GSData

Returns:

Self if inplace=True, new GSData if inplace=False

Example

>>> data = gsply.plyread("scene.ply")  # Non-contiguous from _base
>>>
>>> # For few operations (< 8) - don't convert
>>> total = data.means.sum()  # Just use as-is
>>>
>>> # For many operations (>= 8) - convert first!
>>> data.make_contiguous()  # Up to 45x faster per operation
>>> for i in range(100):
...     result = data.means.sum() + data.means.max()  # 7.9x faster!

See also

is_contiguous: Check if arrays are already contiguous
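
The difference between a strided view and a contiguous array can be checked with NumPy's own flags. In this sketch the 14-column packed array is an assumed stand-in for _base, and np.ascontiguousarray stands in for whatever conversion the method performs.

```python
import numpy as np

base = np.zeros((1000, 14), dtype=np.float32)
means_view = base[:, 0:3]  # strided view into the packed array

was_contiguous = means_view.flags["C_CONTIGUOUS"]

# Copy the view into a dense (N, 3) block.
means = np.ascontiguousarray(means_view)
now_contiguous = means.flags["C_CONTIGUOUS"]

# The dense copy no longer aliases the packed array.
still_shared = np.shares_memory(means, base)
```

In the view, consecutive rows are 56 bytes apart (14 float32 values); in the copy they are 12 bytes apart, which is where the cache-locality gain comes from.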

is_contiguous()[source]

Check if all arrays are C-contiguous.

Return type:

bool

Returns:

True if all arrays are contiguous, False otherwise

Example

>>> data = gsply.plyread("scene.ply")
>>> print(data.is_contiguous())  # False (from _base)
>>> data.make_contiguous()
>>> print(data.is_contiguous())  # True
unpack(include_shN=True)[source]

Unpack Gaussian data into tuple of arrays.

Convenient for standard Gaussian Splatting workflows that expect individual arrays rather than a container object.

Parameters:

include_shN (bool) – If True, include shN in output (default True)

Return type:

tuple

Returns:

  • If include_shN=True: (means, scales, quats, opacities, sh0, shN)

  • If include_shN=False: (means, scales, quats, opacities, sh0)

Example

>>> data = plyread("scene.ply")
>>> means, scales, quats, opacities, sh0, shN = data.unpack()
>>> # Use with rendering functions
>>> render(means, scales, quats, opacities, sh0)
>>>
>>> # For SH0 data, exclude shN
>>> means, scales, quats, opacities, sh0 = data.unpack(include_shN=False)
to_dict()[source]

Convert Gaussian data to dictionary.

Return type:

dict

Returns:

Dictionary with keys: means, scales, quats, opacities, sh0, shN

Example

>>> data = plyread("scene.ply")
>>> props = data.to_dict()
>>> # Access by key
>>> positions = props['means']
>>> # Unpack dict values
>>> render(**props)
normalize(inplace=True)[source]

Convert linear scales/opacities to PLY format (log-scales, logit-opacities).

Converts:
  • Linear scales → log-scales: log(scale) with clamping

  • Linear opacities → logit-opacities: logit(opacity) with clamping

This is the standard format used in Gaussian Splatting PLY files. Use this when you have linear data and need to save to PLY format.

Parameters:

inplace (bool) – If True, modify this object in-place (default). If False, return new object.

Return type:

GSData

Returns:

GSData object (self if inplace=True, new object otherwise)

Example

>>> # Data with linear scales and opacities
>>> data = GSData(scales=[0.1, 0.2, 0.3], opacities=[0.5, 0.7, 0.9], ...)
>>> # Convert to PLY format in-place (modifies data)
>>> data.normalize()  # or: data.normalize(inplace=True)
>>> # Now ready to save with plywrite()
>>> plywrite("output.ply", data)
>>>
>>> # Or create a copy if you need to keep original
>>> ply_data = data.normalize(inplace=False)
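
The two conversions can be written out in NumPy as follows. This is a sketch of the math only: the function name and the clamp bounds (min_scale, eps) are assumptions, since this reference does not state the values gsply uses.

```python
import numpy as np


def to_ply_format(scales, opacities, min_scale=1e-9, eps=1e-7):
    """Linear -> PLY: log for scales, logit for opacities (clamp bounds assumed)."""
    scales = np.maximum(np.asarray(scales, dtype=np.float64), min_scale)
    p = np.clip(np.asarray(opacities, dtype=np.float64), eps, 1.0 - eps)
    return np.log(scales), np.log(p / (1.0 - p))


# Edge values (scale 0, opacity 0 and 1) stay finite thanks to the clamping.
log_scales, logit_opacities = to_ply_format([1.0, 0.5, 0.0], [0.5, 1.0, 0.0])
```

Without the clamping, a scale of 0 or an opacity of exactly 0 or 1 would map to an infinite value.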
denormalize(inplace=True)[source]

Convert PLY format (log-scales, logit-opacities) to linear format.

Converts:
  • Log-scales → linear scales: exp(log_scale) with clamping

  • Logit-opacities → linear opacities: sigmoid(logit)

  • Quaternions → normalized quaternions

Use this when you load PLY files (which use log/logit format) and need linear values for computations or visualization.

Parameters:

inplace (bool) – If True, modify this object in-place (default). If False, return new object.

Return type:

GSData

Returns:

GSData object (self if inplace=True, new object otherwise)

Example

>>> # Load PLY file (contains log-scales and logit-opacities)
>>> data = plyread("scene.ply")
>>> # Convert to linear format in-place (modifies data)
>>> data.denormalize()  # or: data.denormalize(inplace=True)
>>> # Now scales and opacities are linear; opacities lie in [0, 1]
>>> print(f"Linear opacity range: [{data.opacities.min():.3f}, {data.opacities.max():.3f}]")
>>>
>>> # Or create a copy if you need to keep PLY format
>>> linear_data = data.denormalize(inplace=False)
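
The inverse conversions are sketched below in NumPy. The function name is hypothetical, the scale clamping mentioned above is left out for brevity, and the small guard against zero-length quaternions is an assumption.

```python
import numpy as np


def to_linear_format(log_scales, logit_opacities, quats):
    """PLY -> linear: exp for scales, sigmoid for opacities, unit-length quats."""
    scales = np.exp(log_scales)
    opacities = 1.0 / (1.0 + np.exp(-logit_opacities))  # sigmoid
    norms = np.linalg.norm(quats, axis=1, keepdims=True)
    quats = quats / np.maximum(norms, 1e-12)  # guard value is an assumption
    return scales, opacities, quats


scales, opacities, quats = to_linear_format(
    np.zeros((2, 3)),
    np.array([0.0, 2.0]),
    np.array([[2.0, 0.0, 0.0, 0.0], [1.0, 1.0, 1.0, 1.0]]),
)
```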
to_rgb(inplace=True)[source]

Convert sh0 from spherical harmonics (SH) format to RGB color format.

Converts SH DC coefficients to RGB colors in [0, 1] range. Formula: rgb = sh0 * SH_C0 + 0.5

Parameters:

inplace (bool) – If True, modify this object in-place (default). If False, return new object.

Return type:

GSData

Returns:

GSData object (self if inplace=True, new object otherwise)

Example

>>> # Load PLY file (sh0 is in SH format)
>>> data = gsply.plyread("scene.ply")
>>> # Convert to RGB format in-place
>>> data.to_rgb()  # or: data.to_rgb(inplace=True)
>>> # Now sh0 contains RGB colors [0, 1]
>>> print(f"RGB color range: [{data.sh0.min():.3f}, {data.sh0.max():.3f}]")
>>>
>>> # Or create a copy if you need to keep SH format
>>> rgb_data = data.to_rgb(inplace=False)
to_sh(inplace=True)[source]

Convert sh0 from RGB color format to spherical harmonics (SH) format.

Converts RGB colors in [0, 1] range to SH DC coefficients. Formula: sh0 = (rgb - 0.5) / SH_C0

Parameters:

inplace (bool) – If True, modify this object in-place (default). If False, return new object.

Return type:

GSData

Returns:

GSData object (self if inplace=True, new object otherwise)

Example

>>> # Create GSData with RGB colors
>>> rgb_colors = np.random.rand(1000, 3).astype(np.float32)
>>> data = GSData(means=..., scales=..., sh0=rgb_colors, ...)
>>> # Convert to SH format in-place
>>> data.to_sh()  # or: data.to_sh(inplace=True)
>>> # Now sh0 contains SH DC coefficients
>>>
>>> # Or create a copy if you need to keep RGB format
>>> sh_data = data.to_sh(inplace=False)
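
The formulas for to_rgb() and to_sh() are inverses of each other, as this round-trip sketch shows. It assumes SH_C0 is the usual degree-0 spherical harmonics constant 1 / (2·sqrt(π)); the reference names SH_C0 but does not give its value, and the helper names are hypothetical.

```python
import numpy as np

SH_C0 = 0.28209479177387814  # assumed: 1 / (2 * sqrt(pi))


def sh0_to_rgb(sh0):
    """rgb = sh0 * SH_C0 + 0.5"""
    return sh0 * SH_C0 + 0.5


def rgb_to_sh0(rgb):
    """sh0 = (rgb - 0.5) / SH_C0"""
    return (rgb - 0.5) / SH_C0


rgb = np.array([[0.0, 0.5, 1.0]])
sh0 = rgb_to_sh0(rgb)
round_trip = sh0_to_rgb(sh0)
```

Mid-gray (0.5) maps to an SH coefficient of exactly 0, which is why the formula carries the 0.5 offset.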
copy_slice(key)[source]

Efficiently slice and copy in one operation.

For slices that return views, this is more efficient than data[key].copy() as it avoids creating intermediate view objects.

For boolean masks and fancy indexing, this simply delegates to __getitem__ since those already return copies.

Parameters:

key – Slice key (slice, int, array, or boolean mask)

Return type:

GSData

Returns:

A new GSData object with copied sliced data

Examples

>>> data.copy_slice(slice(100, 200))  # Copy of elements 100-199 (avoids view)
>>> data.copy_slice(slice(None, None, 10))  # Copy of every 10th element (avoids view)
>>> data.copy_slice(mask)  # Same as data[mask] (already a copy)
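
The view-versus-copy distinction comes from NumPy indexing rules and can be checked directly. This sketch uses a bare array in place of a GSData field.

```python
import numpy as np

means = np.arange(30, dtype=np.float32).reshape(10, 3)

view = means[2:5]           # basic slice: a view into the original
copied = means[2:5].copy()  # slice followed by an explicit copy
mask = means[:, 0] > 12
picked = means[mask]        # boolean mask: already a copy

view_shares = np.shares_memory(view, means)
copy_shares = np.shares_memory(copied, means)
mask_shares = np.shares_memory(picked, means)
```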

get_gaussian(index)[source]

Get a single Gaussian as a GSData object.

Unlike direct indexing which returns a tuple for efficiency, this method returns a GSData object containing a single Gaussian.

Parameters:

index (int) – Index of the Gaussian to retrieve

Return type:

GSData

Returns:

GSData object with a single Gaussian

save(file_path, compressed=False)[source]

Save GSData to PLY file.

Convenience method that wraps plywrite() for object-oriented API.

Parameters:
  • file_path (str | Path) – Output PLY file path

  • compressed (bool) – If True, write compressed format (default False)

Return type:

None

Example

>>> data = gsply.plyread("input.ply")
>>> data.save("output.ply")  # Uncompressed
>>> data.save("output.ply", compressed=True)  # Compressed
classmethod load(file_path)[source]

Load GSData from PLY file.

Convenience classmethod that wraps plyread() for object-oriented API. Auto-detects compressed and uncompressed formats.

Parameters:

file_path (str | Path) – Path to PLY file

Return type:

GSData

Returns:

GSData container with loaded data

Example

>>> data = GSData.load("scene.ply")  # Auto-detect format
>>> print(f"Loaded {len(data)} Gaussians")
classmethod from_arrays(means, scales, quats, opacities, sh0, shN=None, format='auto', sh_degree=None, sh0_format=DataFormat.SH0_SH)[source]

Create GSData from individual arrays with format preset.

Convenient factory method for creating GSData from external arrays with automatic format detection or explicit format presets.

Parameters:
  • means (ndarray) – (N, 3) array - Gaussian centers

  • scales (ndarray) – (N, 3) array - Scale parameters

  • quats (ndarray) – (N, 4) array - Rotation quaternions

  • opacities (ndarray) – (N,) array - Opacity values

  • sh0 (ndarray) – (N, 3) array - DC spherical harmonics

  • shN (ndarray | None) – (N, K, 3) array or None - Higher-order SH coefficients

  • format (str) – Format preset - "auto" (detect), "ply" (log/logit), "linear" or "rasterizer" (linear)

  • sh_degree (int | None) – SH degree (0-3) - auto-detected from shN if None

  • sh0_format (DataFormat) – Format for sh0 (SH0_SH or SH0_RGB), default SH0_SH

Return type:

GSData

Returns:

GSData object with specified format

Example

>>> # Auto-detect format from values
>>> data = GSData.from_arrays(means, scales, quats, opacities, sh0)
>>>
>>> # Explicit PLY format (log-scales, logit-opacities)
>>> data = GSData.from_arrays(means, scales, quats, opacities, sh0, format="ply")
>>>
>>> # Explicit linear format (for rasterizer)
>>> data = GSData.from_arrays(means, scales, quats, opacities, sh0, format="linear")
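
For reference, the sketch below builds random input arrays with the shapes listed in the parameters above. The values are arbitrary test data, and the final call is left as a comment so the block runs without gsply installed.

```python
import numpy as np

rng = np.random.default_rng(0)
N, K = 100, 3  # K = 3 corresponds to SH degree 1

means = rng.normal(size=(N, 3)).astype(np.float32)
scales = rng.uniform(0.01, 0.1, size=(N, 3)).astype(np.float32)  # linear scales
quats = rng.normal(size=(N, 4)).astype(np.float32)
quats /= np.linalg.norm(quats, axis=1, keepdims=True)            # unit quaternions
opacities = rng.uniform(0.0, 1.0, size=N).astype(np.float32)     # linear opacities
sh0 = rng.normal(size=(N, 3)).astype(np.float32)
shN = rng.normal(size=(N, K, 3)).astype(np.float32)

# With gsply installed, these would be passed following the signature above:
# data = GSData.from_arrays(means, scales, quats, opacities, sh0, shN, format="linear")
```

Passing format="linear" explicitly avoids relying on auto-detection when the value ranges are ambiguous.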
classmethod from_dict(data_dict, format='auto', sh_degree=None, sh0_format=DataFormat.SH0_SH)[source]

Create GSData from dictionary with format preset.

Convenient factory method for creating GSData from a dictionary with automatic format detection or explicit format presets.

Parameters:
  • data_dict (dict) – Dictionary with keys: means, scales, quats, opacities, sh0, shN (optional)

  • format (str) – Format preset - "auto" (detect), "ply" (log/logit), "linear" or "rasterizer" (linear)

  • sh_degree (int | None) – SH degree (0-3) - auto-detected from shN if None

  • sh0_format (DataFormat) – Format for sh0 (SH0_SH or SH0_RGB), default SH0_SH

Return type:

GSData

Returns:

GSData object with specified format

Example

>>> # From dictionary with auto-detection
>>> data = GSData.from_dict({
...     "means": means, "scales": scales, "quats": quats,
...     "opacities": opacities, "sh0": sh0, "shN": shN
... })
>>>
>>> # Explicit PLY format
>>> data = GSData.from_dict(data_dict, format="ply")
>>>
>>> # Explicit linear format
>>> data = GSData.from_dict(data_dict, format="linear")